The C Memory Quiz (DRAFT, Version 0.2)

This survey explores the relationship between the C language as implemented and used in practice and the C standard. It consists of about 40 questions about the behaviour of memory, pointers, and values. For each, we're interested in answers from several different perspectives:

the language that systems programmers believe they are writing in, i.e., the assumptions they make about what behaviour they can rely on;
the idioms used in the corpus of mainstream systems code out there, especially in specific large-scale systems (Linux, FreeBSD, Xen, Apache, etc.);
the languages implemented by mainstream compilers (GCC, Clang, ICC, MSVC, etc.), including the assumptions their optimisation passes make about user code and how these change with certain flags (e.g. GCC's -fno-strict-aliasing and -fno-strict-overflow);
the C standard (C11 or earlier), which for any given issue might give a clear answer, be unclear, be contradictory, or not address it at all;
the issues that arise in making C code portable between different compilers and architectures;
the behaviour assumed by code analysis tools; and
the impact on formal semantics.

If you can speak to any of these in particular, or can give real-world examples where the idioms in question are (or must not be) relied on, please do so in the `comments' boxes below. We are especially interested in the differences between the C language as it is commonly used and the language as specified by the standard, in differing interpretations of the standard, in precisely characterising the boundary between defined and undefined behaviour, and in the ways that compilers exploit undefined behaviour in optimisations.

Recall that if a program is deemed by the standard to have undefined behaviour, then the standard imposes no requirements at all on how an implementation can treat that program. When we ask whether an idiom is free of undefined behaviour with respect to some implementation (e.g. GCC or Clang), the question is really whether that implementation assumes (for optimisation) that legal programs do not use that idiom and so may give surprising results for any programs that do. Say a `usable pointer' is one that can be written to or read from without causing undefined behaviour.

For the questions about the standard, we're most interested in what you believe it says. If you're familiar with the standard and able to justify your answers with reference to it, that would also be interesting, but it's not necessary.

We illustrate each question with an example program. We've made these as simple as possible, but to see interesting implementation behaviour one might need more complex examples, with a bigger context, to give analysis and optimisation passes something to work on. Please bear that in mind when answering the questions, rather than focussing exclusively on how the code as written would be compiled.

Preamble
Pointer abstract values and pointer representations
Pointer Comparison
Casting of pointers to and from integers
Casting of pointers: roundtrip properties
Subobject Casts
Compound Object Casts and Layout
Objects and Malloc'd Regions
Effective Types
Representation casts
Unspecified values and stability
Trap representations
Multiple representations
Representations
Any other issues

NB: the `SUBMIT' button is right at the bottom of the form; your input will be lost unless you push it (you can submit multiple responses, if you want to do the survey in parts - if you do, please write your name and email exactly the same way in each, and avoid non-identical overlapping responses to any other questions).

Preamble

Your name

Your email (optional). It will be used only to let you know our results and perhaps for any follow-up questions; it will not be publicised.

Please list the C compilers (and/or analysis tools) that you are familiar with. If you use particular flags to control optimisation to ensure some specific behaviour, please also say which and why.

If possible, please list the main systems that you contribute to.

Pointer abstract values and pointer representations

C gives access both to abstract values and to their concrete byte representations. For most types, the abstract value is determined by the concrete representation (though there might be multiple representations for the same abstract value, and some representations might not denote any abstract value). For pointer values, however, it is arguable that the provenance of a pointer should be significant in determining what behaviour is undefined, how pointers compare, and what effect reads and writes using them have, whether or not that provenance is recorded in implementation runtime data. The questions in this section probe various aspects of this. Some are based on examples from Defect Report 260.

PR.1.a (usage) Given two separately allocated objects of the same type, can a pointer to the first be turned into a usable pointer to the second, by pointer arithmetic together with a check that the result has an identical representation to that of another usable pointer to the second? Example:

#include <string.h> 
#include <stdint.h>
int  y = 2, x=1;
int main() {
  // assume for this question that the line below
  // legitimately calculates an offset from x to y,
  // or that an offset is known from some other source
  intptr_t offset = (intptr_t)&y - (intptr_t)&x;
  // now consider:
  int *p = (int *)((intptr_t)&x + offset);
  int *q = &y;
  if (memcmp(&p, &q, sizeof(p)) == 0) {
    *p = 11; // is this free of undefined behaviour?
  }
}

used in practice and supported by compilers
questionable (I would discourage this in a code review)
should not be used; compilers might well not support it
not useful, but compilers do support it
not useful, and compilers do not support it
don't know
Other:

PR.1.a (standard) Is it allowed by the C standard?

Allowed by the standard
Not allowed by the standard
The standard is unclear, contradictory, or does not address this
Don't know

PR.1.a. (comment)

PR.1.b (usage) Given two separately allocated objects of the same type, can a pointer to the first be turned into a usable pointer to the second, by pointer arithmetic together with a check that the result compares equal to that of another usable pointer to the second? Example:

#include <stdint.h>
int  y = 2, x=1;
int main() {
  // assume for this question that the line below
  // legitimately calculates an offset from x to y,
  // or that an offset is known from some other source
  intptr_t offset = (intptr_t)&y - (intptr_t)&x;
  // now consider:
  int *p = (int *)((intptr_t)&x + offset);
  int *q = &y;
  if (p==q) {
    *p = 11; // is this free of undefined behaviour?
  }
}

used in practice and supported by compilers
questionable (I would discourage this in a code review)
should not be used; compilers might well not support it
not useful, but compilers do support it
not useful, and compilers do not support it
don't know
Other:

PR.1.b (standard) Is it allowed by the C standard?

Allowed by the standard
Not allowed by the standard
The standard is unclear, contradictory, or does not address this
Don't know

PR.1.b. (comment)

PR.1.c (usage) Given two separately allocated objects of the same type, can a pointer to the first be turned into a usable pointer to the second, by pointer arithmetic alone? Example:

#include <stdint.h>
int  y = 2, x=1;
int main() {
  // assume for this question that the line below
  // legitimately calculates an offset from x to y,
  // or that an offset is known from some other source
  intptr_t offset = (intptr_t)&y - (intptr_t)&x;
  // now consider:
  int *p = (int *)((intptr_t)&x + offset);
  int *q = &y;
  *p = 11;   
  // in an implementation in which p and q 
  // happen to hold the same address, is the line
  // above free of undefined behaviour?
}

used in practice and supported by compilers
questionable (I would discourage this in a code review)
should not be used; compilers might well not support it
not useful, but compilers do support it
not useful, and compilers do not support it
don't know
Other:

PR.1.c (standard) Is it allowed by the C standard?

Allowed by the standard
Not allowed by the standard
The standard is unclear, contradictory, or does not address this
Don't know

PR.1.c. (comment)

PR.2.a (usage) Can a usable pointer be copied by copying its representation bytes, by the library memcpy? Example:

#include <stdio.h>
#include <string.h>
int  x=1;
int main() {
  int *p = &x;
  int *q;
  memcpy (&q, &p, sizeof p); 
  *q = 11; // is this free of undefined behaviour?
  printf("*p=%d  *q=%d\n",*p,*q);  
}

used in practice and supported by compilers
questionable (I would discourage this in a code review)
should not be used; compilers might well not support it
not useful, but compilers do support it
not useful, and compilers do not support it
don't know
Other:

PR.2.a (standard) Is it allowed by the C standard?

Allowed by the standard
Not allowed by the standard
The standard is unclear, contradictory, or does not address this
Don't know

PR.2.a. (comment)

PR.2.b (usage) Can a usable pointer be copied by copying its representation bytes, by arbitrary user code whose result is the identity function on those bytes? Example:

#include <stdio.h>
#include <string.h>
int  x=1;
void user_memcpy2(unsigned char* dest, unsigned char *src, 
                 size_t n) {
  while (n > 0)  {		
    *dest = ((*src) ^ 1) ^ 1;
    src += 1;
    dest += 1;
    n -= 1;
  }
}
int main() {
  int *p = &x;
  int *q;
  user_memcpy2((unsigned char*)&q, (unsigned char*)&p, 
              sizeof(p));
  *q = 11; // is this free of undefined behaviour?
  printf("*p=%d  *q=%d\n",*p,*q);
}

used in practice and supported by compilers
questionable (I would discourage this in a code review)
should not be used; compilers might well not support it
not useful, but compilers do support it
not useful, and compilers do not support it
don't know
Other:

PR.2.b (standard) Is it allowed by the C standard?

Allowed by the standard
Not allowed by the standard
The standard is unclear, contradictory, or does not address this
Don't know

PR.2.b. (comment)

PR.3 (usage) Can a pointer to a now-deallocated object be used to access a new object of the same type that happens to have been allocated in the same place, if it has an identical representation to a usable pointer to the latter? Example:

#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <assert.h>
int main() {
  int *p, *q;
  p = malloc (sizeof (int)); assert (p != NULL);
  free(p);
  q = malloc (sizeof (int)); assert (q != NULL);
  if (memcmp (&p, &q, sizeof p) == 0) {
    printf("memcmp==0\n");
    *p = 42;  // is this free of undefined behaviour?
  }
}

used in practice and supported by compilers
questionable (I would discourage this in a code review)
should not be used; compilers might well not support it
not useful, but compilers do support it
not useful, and compilers do not support it
don't know
Other:

PR.3 (standard) Is it allowed by the C standard?

Allowed by the standard
Not allowed by the standard
The standard is unclear, contradictory, or does not address this
Don't know

PR.3. (comment)

PR.4 (usage) Can a pointer representation be assumed not to change at the end of the lifetime of the object it pointed to? Example:

#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
int main() {
  int *pj = (int *)(malloc(sizeof(int))); 
  *pj=1;
  char c1 = *((char *)&pj);
  free(pj);
  char c2 = *((char *)&pj);
  int b = (c1 == c2); // is this free of undef. beh.?
  // and always true?
  printf("c1=%i  c2=%i  b=%s\n",
	 (int)c1,(int)c2,b?"true":"false");
}

used in practice and supported by compilers
questionable (I would discourage this in a code review)
should not be used; compilers might well not support it
not useful, but compilers do support it
not useful, and compilers do not support it
don't know
Other:

PR.4 (standard) According to the C standard?

Yes, by the standard
No, by the standard
The standard is unclear, contradictory, or does not address this
Don't know

PR.4. (comment)

PR.5 (usage) Can the bytes of a pointer representation be assumed not to become undefined values at the end of the lifetime of the object it pointed to? Example:

#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
int main() {
  int *pj = (int *)(malloc(sizeof(int))); 
  *pj=1;
  free(pj);
  // can the representation bytes of pj 
  // be assumed not to be undefined values?  E.g.:
  char c1 = *((char *)&pj);
  char c2 = *((char *)&pj);
  int b = (c1 == c2);  // is this always true?
  printf("c1=%i  c2=%i  b=%s\n",
	 (int)c1,(int)c2,b?"true":"false");
}

used in practice and supported by compilers
questionable (I would discourage this in a code review)
should not be used; compilers might well not support it
not useful, but compilers do support it
not useful, and compilers do not support it
don't know
Other:

PR.5 (standard) According to the C standard?

Yes, by the standard
No, by the standard
The standard is unclear, contradictory, or does not address this
Don't know

PR.5. (comment)

Pointer Comparison

In this section we look at when pointers can be compared.

PC.1 (usage) Can one compare (with <, >, <=, or >=) two pointers to separately allocated objects (of compatible object types)? Example:

#include <stdio.h>
int  y = 2, x=1;
int main() {
  int *p = &x, *q = &y;
  _Bool b = (p < q); // does this have defined behaviour?
  printf("p=%p  q=%p\n",(void*)p,(void*)q);
  printf("(p<q) = %s\n", b?"true":"false");
}

used in practice and supported by compilers
questionable (I would discourage this in a code review)
should not be used; compilers might well not support it
not useful, but compilers do support it
not useful, and compilers do not support it
don't know
Other:

PC.1 (standard) Is it allowed by the C standard?

Allowed by the standard
Not allowed by the standard
The standard is unclear, contradictory, or does not address this
Don't know

PC.1. (comment)

PC.2 (usage) Can one compare (with <, >, <=, or >=) two pointers (of compatible object types) that were created as pointers to distinct objects but now point to the same object? Example:

#include <string.h> 
int  y = 2, x=1;
int main() {
  int *p = &x+1, *q = &y;
  if (memcmp(&p, &q, sizeof(p)) == 0) {
    int b = (p < q); // is this free of undef. beh.?
  }
}

used in practice and supported by compilers
questionable (I would discourage this in a code review)
should not be used; compilers might well not support it
not useful, but compilers do support it
not useful, and compilers do not support it
don't know
Other:

PC.2 (standard) Is it allowed by the C standard?

Allowed by the standard
Not allowed by the standard
The standard is unclear, contradictory, or does not address this
Don't know

PC.2. (comment)

PC.3 (usage) Can one check equality (with == or !=) against a pointer to an object whose lifetime has ended? Example:

#include <stdio.h>
#include <stdlib.h>
int main() {
  int i=0;
  int *pj = (int *)(malloc(sizeof(int))); 
  *pj=1;
  printf("(&i==pj)=%s\n",(&i==pj)?"true":"false");
  free(pj);
  printf("(&i==pj)=%s\n",(&i==pj)?"true":"false");
  return 0;
}

used in practice and supported by compilers
questionable (I would discourage this in a code review)
should not be used; compilers might well not support it
not useful, but compilers do support it
not useful, and compilers do not support it
don't know
Other:

PC.3 (standard) Is it allowed by the C standard?

Allowed by the standard
Not allowed by the standard
The standard is unclear, contradictory, or does not address this
Don't know

PC.3. (comment)

Casting of pointers to and from integers

This section explores when one can compute on a pointer by casting it to an integer type and back.

CPI.1 (usage) Does casting a usable non-NULL pointer to a suitable integer type and back result in a usable pointer that is equivalent to the original (i.e., dereferencing it is undefined behaviour if and only if dereferencing the original is, it points to the same object as the original, and it compares equal to the original)? Example:

#include <assert.h>
#include <stdio.h>
int x=1;
int main() {
  assert(sizeof(unsigned long)==sizeof(int*)); 
  int *p = &x;
  unsigned long i = (unsigned long) p;
  int *q = (int *)i;
  // are p and q now equivalent?  For example:
  int y = *q;     // - is this not undefined behaviour?
  int b = (q==p); // - does this compare true?
  *q=2; 
  int z = *p;     // - does this give 2? 
  printf("y=%i (q==p)=%s z=%i\n",y,b?"true":"false",z);
}

used in practice and supported by compilers
questionable (I would discourage this in a code review)
should not be used; compilers might well not support it
not useful, but compilers do support it
not useful, and compilers do not support it
don't know
Other:

CPI.1 (standard) Is it allowed by the C standard?

Allowed by the standard
Not allowed by the standard
The standard is unclear, contradictory, or does not address this
Don't know

CPI.1 (comment)

CPI.2 (usage) If so, does that still hold when one does nontrivial computation on the integer, for example by exclusive-or'ing it twice with another pointer (perhaps in another compilation unit)? Example:

#include <assert.h>
#include <stdio.h>
int x=1;
int main() {
  assert(sizeof(unsigned long)==sizeof(int*));
  int *p = &x;
  unsigned long i = (unsigned long) p;
  int w=0;  
  unsigned long j = (unsigned long) &w;
  i = (i ^ j) ^ j;// some computation on the integer
  int *q = (int *)i;
  // are p and q now equivalent?  For example:
  int y = *q;     // - is this not undefined behaviour?
  int b = (q==p); // - does this compare true?
  *q=2; 
  int z = *p;     // - does this give 2? 
  printf("y=%i (q==p)=%s z=%i\n",y,b?"true":"false",z);
}

used in practice and supported by compilers
questionable (I would discourage this in a code review)
should not be used; compilers might well not support it
not useful, but compilers do support it
not useful, and compilers do not support it
don't know
Other:

CPI.2 (standard) Is it allowed by the C standard?

Allowed by the standard
Not allowed by the standard
The standard is unclear, contradictory, or does not address this
Don't know

CPI.2. (comment)

CPI.3 (usage) Can one use tagged pointers, for example storing additional data in the low-order bits of a pointer using integer bit operations, and clearing them before the pointer is used to give a usable pointer that is equivalent to the original? Example:

#include <assert.h>
#include <stdio.h>
#include <limits.h>
int x=1;
int main() {
  assert(sizeof(unsigned long)==sizeof(int*));
  int *p = &x;
  unsigned long i = (unsigned long) p;
  int *q;
  assert(_Alignof(int) >= 4);
  assert((i & 3u) == 0u);
  i = i | 1u;               
  q = (int *) i; // is this free of undefined behaviour?
  i = i & (ULONG_MAX - 3u);  
  q = (int *) i; 
  // are p and q now equivalent?  For example:
  int y = *q;     // - is this not undefined behaviour?
  int b = (q==p); // - does this compare true?
  *q=2; 
  int z = *p;     // - does this give 2? 
  printf("y=%i (q==p)=%s z=%i\n",y,b?"true":"false",z);
}

used in practice and supported by compilers
questionable (I would discourage this in a code review)
should not be used; compilers might well not support it
not useful, but compilers do support it
not useful, and compilers do not support it
don't know
Other:

CPI.3 (standard) Is it allowed by the C standard?

Allowed by the standard
Not allowed by the standard
The standard is unclear, contradictory, or does not address this
Don't know

CPI.3. (comment)

Casting of pointers: roundtrip properties

This question asks how generally one can cast a pointer to other pointer types and then back to the original.

CPR.1 (usage) Can one cast a pointer to a series of arbitrary other pointer types and back to the original type to obtain a pointer that is equivalent to the original (i.e., dereferencing it is undefined behaviour if and only if dereferencing the original is, it points to the same object as the original, and it compares equal to the original)? Example:

#include <stdio.h>
int x=1;
int main() {
  int *p = &x;
  float *q1 = (float *) p;
  char **q2 = (char **) q1;
  int *q3 = (int *) q2;    
  // are p and q3 now equivalent?  For example:
  int y = *q3;     // - is this not undefined behaviour?
  int b = (q3==p); // - does this compare true?
  *q3=2; 
  int z = *p;      // - does this give 2? 
  printf("y=%i (q3==p)=%s z=%i\n",y,b?"true":"false",z);
}

used in practice and supported by compilers
questionable (I would discourage this in a code review)
should not be used; compilers might well not support it
not useful, but compilers do support it
not useful, and compilers do not support it
don't know
Other:

CPR.1 (standard) Is it allowed by the C standard?

Allowed by the standard
Not allowed by the standard
The standard is unclear, contradictory, or does not address this
Don't know

CPR.1 (comment)

Subobject Casts

These questions explore the extent to which one can cast a pointer to a subobject (a struct or union member or array element) into a usable pointer to the whole object, and whether one can move within a struct by address arithmetic.

SC.1 (usage) Can a usable pointer to the first member of a structure be cast into a usable pointer to the structure as a whole? Example:

#include <stdio.h>
typedef struct { int  i; float f; } st;
int main() {
  st s = {.i = 1, .f = 1.0};
  int *pi = &(s.i);
  st* p = (st*) pi; // is this free of undefined behaviour?
  p->f = 2.0;       // and this?
  printf("s.f=%f  p->f=%f\n",s.f,p->f);
}

used in practice and supported by compilers
questionable (I would discourage this in a code review)
should not be used; compilers might well not support it
not useful, but compilers do support it
not useful, and compilers do not support it
don't know
Other:

SC.1 (standard) Is it allowed by the C standard?

Allowed by the standard
Not allowed by the standard
The standard is unclear, contradictory, or does not address this
Don't know

SC.1. (comment)

SC.2 (usage) Can a usable pointer to the current member of a union be cast into a usable pointer to the union as a whole? Example:

#include <stdio.h>
typedef union { int  i; float f; } un;
int main() {
  un u = {.i = 1};
  int *pi = &(u.i);
  un* p = (un*) pi; // is this free of undefined behaviour?
  p->f = 2.0;       // and this?
  printf("u.f=%f  p->f=%f\n",u.f,p->f);
}

used in practice and supported by compilers
questionable (I would discourage this in a code review)
should not be used; compilers might well not support it
not useful, but compilers do support it
not useful, and compilers do not support it
don't know
Other:

SC.2 (standard) Is it allowed by the C standard?

Allowed by the standard
Not allowed by the standard
The standard is unclear, contradictory, or does not address this
Don't know

SC.2 (comment)

SC.3 (usage) Can a usable pointer to the 0'th element of an array be cast into a usable pointer to the array as a whole? Example:

int main(void) {
  int a[2] = {0, 1};
  //int *p = &(a[0]);
  int (*q)[2] = (int(*)[2]) a;
  (*q)[1] = 11; // is this free of undefined behaviour?
  return 0;
}

used in practice and supported by compilers
questionable (I would discourage this in a code review)
should not be used; compilers might well not support it
not useful, but compilers do support it
not useful, and compilers do not support it
don't know
Other:

SC.3 (standard) Is it allowed by the C standard?

Allowed by the standard
Not allowed by the standard
The standard is unclear, contradictory, or does not address this
Don't know

SC.3. (comment)

SC.4 (usage) Can offsetof(), address arithmetic, and casts be used to build a usable pointer to one member of a structure from a usable pointer to another member of the same structure? Example:

#include <stdio.h>
#include <stddef.h>
typedef struct { int  i; float f; } st;
int main() {
  st s = {.i = 1, .f = 1.0};
  float *pf = &(s.f);
  int *pi = (int *)  (((char *)pf - offsetof(st,f)) 
                      + offsetof(st,i));
  *pi = 2; // is this free of undefined behaviour?
  printf("s.i=%i  *pi=%i\n",s.i,*pi);
}

used in practice and supported by compilers
questionable (I would discourage this in a code review)
should not be used; compilers might well not support it
not useful, but compilers do support it
not useful, and compilers do not support it
don't know
Other:

SC.4 (standard) Is it allowed by the C standard?

Allowed by the standard
Not allowed by the standard
The standard is unclear, contradictory, or does not address this
Don't know

SC.4 (comment)

SC.5 (usage) Can a pointer to one member of a union be cast into a usable pointer to another member (presuming the union holds that member currently)? Example:

#include <stdio.h>
typedef union { int  i; float f; } un;
int main() {
  un u = {.i = 1};
  int *pi = &(u.i);
  float* pf = (float*) pi; // is this not undef. beh.?
  *pf = 2.0;               // and this?
  printf("u.f=%f  *pf=%f\n",u.f,*pf);
}

used in practice and supported by compilers
questionable (I would discourage this in a code review)
should not be used; compilers might well not support it
not useful, but compilers do support it
not useful, and compilers do not support it
don't know
Other:

SC.5 (standard) Is it allowed by the C standard?

Allowed by the standard
Not allowed by the standard
The standard is unclear, contradictory, or does not address this
Don't know

SC.5. (comment)

Compound Object Casts and Layout

This section explores the assumptions one can make about the layout of structurally similar types.

COCL.1 (usage) Given two different struct types that have a prefix of members that have compatible types, and a usable pointer to an object of the first type, can it be cast to a pointer to the second type that can be used to read and write members of that prefix? Example:

#include <stdio.h>
typedef struct { int  i1; float f1; char c1; } st1;
typedef struct { int  i2; float f2; double d2; } st2;
int main() {
  st1 s1 = {.i1 = 1, .f1 = 1.0, .c1 = 'a'};
  st2 *p2 = (st2 *) (&s1);// is this free of undef.beh.?
  p2->f2=2.0;             // and this?
  printf("s1.f1=%f  p2->f2=%f\n",s1.f1,p2->f2);
}

used in practice and supported by compilers
questionable (I would discourage this in a code review)
should not be used; compilers might well not support it
not useful, but compilers do support it
not useful, and compilers do not support it
don't know
Other:

COCL.1 (standard) Is it allowed by the C standard?

Allowed by the standard
Not allowed by the standard
The standard is unclear, contradictory, or does not address this
Don't know

COCL.1. (comment)

COCL.2 (usage) Can the representation of a type be assumed to be the same in all contexts, irrespective of whether it occurs at the top level or as a struct or union member or array element? Example:

#include <stdio.h>
#include <string.h>
typedef struct { int  i1; float f1; } st1;
typedef struct { int  i2; st1 st1_2;  } st2;
int main() {
  st1 s1 = {.i1 = 1, .f1 = 1.0 };
  st2 s2 = {.i2 = 2, .st1_2 = { .i1 = 2, .f1 = 2.0 } };
  memcpy(&(s2.st1_2), &s1, sizeof(st1)); 
  int i = s2.st1_2.i1; // is this free of undef.beh.?
  printf("i=%i\n",i);
}

used in practice and supported by compilers
questionable (I would discourage this in a code review)
should not be used; compilers might well not support it
not useful, but compilers do support it
not useful, and compilers do not support it
don't know
Other:

COCL.2 (standard) According to the C standard?

Yes, by the standard
No, by the standard
The standard is unclear, contradictory, or does not address this
Don't know

COCL.2. (comment)

COCL.3 (usage) Does the representation of a struct, union, or array consist exclusively of the representations of its members plus padding? For example, can one copy a struct by using memcpy to copy the individual elements one by one (according to their offsetof() and sizeof()), and writing junk into the other bytes? Example:

#include <stdio.h>
#include <string.h>
typedef struct { int  i; float f; } st;
int main() {
  st s1 = {.i = 1, .f = 1.0 };
  st s2;
  memcpy(&(s2.i), &(s1.i), sizeof(int)); 
  memcpy(&(s2.f), &(s1.f), sizeof(float)); 
  // is s2 now a copy of s1?
}

used in practice and supported by compilers
questionable (I would discourage this in a code review)
should not be used; compilers might well not support it
not useful, but compilers do support it
not useful, and compilers do not support it
don't know
Other:

COCL.3 (standard) Is it allowed by the C standard?

Allowed by the standard
Not allowed by the standard
The standard is unclear, contradictory, or does not address this
Don't know

COCL.3. (comment)

Objects and Malloc'd Regions

These questions examine how a dynamically allocated region of memory can be used.

OM.1 (usage) Can one store multiple objects in (disjoint parts of) a malloc'd region? Example:

#include <stdlib.h>
#include <assert.h>
#define max(a, b) ((a) > (b) ? (a) : (b))
int main() {
  // manual layout for a char and float pair
  size_t i_offset = 0;
  assert(sizeof(char) <= _Alignof(float));
  size_t f_offset = _Alignof(float);
  size_t size = f_offset + sizeof(float);
  // storing a char and a float in an allocated region
  char *p = malloc(size); assert (p != NULL);
  *((char *)(p+i_offset)) = 'a';
  *((float *)(p+f_offset)) = 1.0;
  // is this free of undefined behaviour?
}

used in practice and supported by compilers
questionable (I would discourage this in a code review)
should not be used; compilers might well not support it
not useful, but compilers do support it
not useful, and compilers do not support it
don't know
Other:

OM.1 (standard) Is it allowed by the C standard?

Allowed by the standard
Not allowed by the standard
The standard is unclear, contradictory, or does not address this
Don't know

OM.1. (comment)

OM.2 (usage) Can one do type-changing updates (not merely switching to a different member of a union type) to a malloc'd region? Example:

#include <stdlib.h>
#include <assert.h>
#define max(a, b) ((a) > (b) ? (a) : (b))
int main() {
  // manual layout for a char and float sum
  size_t size = max(sizeof(char), sizeof(float));
  char *p = malloc(size); assert (p != NULL);
  *((char *)p) = 'a';
  // a type-changing update from char to float
  //  in an allocated region
  *((float *)p) = 1.0;
  // is this free of undefined behaviour?
}

used in practice and supported by compilers
questionable (I would discourage this in a code review)
should not be used; compilers might well not support it
not useful, but compilers do support it
not useful, and compilers do not support it
don't know
Other:

OM.2 (standard) Is it allowed by the C standard?

Allowed by the standard
Not allowed by the standard
The standard is unclear, contradictory, or does not address this
Don't know

OM.2. (comment)

OM.3 (usage) Can one initialise a struct in a malloc'd region incrementally? Example:

#include <stdlib.h>
#include <assert.h>
typedef struct { char c; float f; } st;
int main() {
  char *p = malloc(sizeof(st)); assert (p != NULL);
  ((st *)p)->c = 'a';
  ((st *)p)->f = 1.0;
  // is this free of undefined behaviour?
}

used in practice and supported by compilers
questionable (I would discourage this in a code review)
should not be used; compilers might well not support it
not useful, but compilers do support it
not useful, and compilers do not support it
don't know
Other:

OM.3 (standard) Is it allowed by the C standard?

Allowed by the standard
Not allowed by the standard
The standard is unclear, contradictory, or does not address this
Don't know

OM.3. (comment)

Effective Types

Effective types are introduced in 6.5p6 of the standard, and 6.5p7 uses them to give some additional constraints on the accesses made by well-defined programs, apparently to enable type-based alias analysis. This question explores what the force of this is for objects in dynamically allocated regions of memory, asking precisely when such regions acquire which effective type constraints.

ET.1 (usage) If one uses a pointer to a struct to initialise a single member of the struct in a malloc'd region, does (i) the footprint of the whole struct take on the struct type as its effective type, or (ii) the footprint of the member take on an effective type specific to that member of that struct (e.g. `member f2 of st2'), or (iii) the footprint of the member take on just the type of that member (e.g. float) as its effective type? As far as implementations are concerned, which applies? Example:

#include <stdio.h>
#include <stdlib.h>
#include <stddef.h>
#include <assert.h>
typedef struct { char c1; float f1; } st1;
typedef struct { char c2; float f2; } st2;
int f(st1 s1) {
  return ( (int)s1.c1 + (int)s1.f1);
}
int main() {
  assert(sizeof(st1)==sizeof(st2));
  assert(offsetof(st1,c1)==offsetof(st2,c2));
  assert(offsetof(st1,f1)==offsetof(st2,f2));
  char *p = malloc(sizeof(st1)); assert (p != NULL);
  ((st1 *)p)->c1 = 'A';
  ((st2 *)p)->f2 = 1.0;
  // is this free of undefined behaviour?
  // if so, what effective type does each region of 
  //  memory have after each assignment?
  int i = f( *((st1 *)p));
  // is the body of f also free of undefined behaviour?
  printf("i=%i\n",i);
}
// Given (i), the assignment to ((st2 *)p)->f2 would
// be undefined behaviour, while (ii) and (iii) would
// permit it.
// The read of s1.f1 (in the body of f) would be
// undefined behaviour for (ii) but permitted for (iii).

(i) the footprint of the whole struct takes on the struct type as its effective type
(ii) the footprint of the member takes on an effective type specific to that member of that struct
(iii) the footprint of the member takes on just the type of that struct member as its effective type
(iv) effective types are not significant
Don't know
Other:

ET.1 (standard) And according to the C standard?

(i) the footprint of the whole struct takes on the struct type as its effective type
(ii) the footprint of the member takes on an effective type specific to that member of that struct
(iii) the footprint of the member takes on just the type of that struct member as its effective type
Don't know
Other:

ET.1 (comment)

Representation Casts

The standard allows one to inspect and manipulate the representations of arbitrary types via char. These questions explore whether one can do this via other types.

RC.1 (usage) Can one inspect and manipulate representations via non-char integer types? Example:

#include <stdio.h>
#include <assert.h>
int main() {
  assert(sizeof(int)==sizeof(float));
  float f = 1.0;
  int i = *((int*) (&f)); // is this not undef. beh.?
  printf("f=%f  i=%i\n",f,i);
}

used in practice and supported by compilers
questionable (I would discourage this in a code review)
should not be used; compilers might well not support it
not useful, but compilers do support it
not useful, and compilers do not support it
don't know
Other:

RC.1 (standard) Is it allowed by the C standard?

Allowed by the standard
Not allowed by the standard
The standard is unclear, contradictory, or does not address this
Don't know

RC.1 (comment)

RC.2 (usage) Can a pointer to one member of a union be cast into a usable pointer to another member (of a different type) that it does not currently hold? Example:

#include <stdio.h>
#include <assert.h>
typedef union { int  i; float f; } un;
int main() {
  assert(sizeof(int)==sizeof(float));
  un u = {.f = 1.0};
  float *pf = &(u.f);
  int* pi = (int*) pf; // is this not undef. beh.?
  int j = *pi;         // and this?
  printf("u.f=%f  j=%i\n",u.f,j);
}

used in practice and supported by compilers
questionable (I would discourage this in a code review)
should not be used; compilers might well not support it
not useful, but compilers do support it
not useful, and compilers do not support it
don't know
Other:

RC.2 (standard) Is it allowed by the C standard?

Allowed by the standard
Not allowed by the standard
The standard is unclear, contradictory, or does not address this
Don't know

RC.2 (comment)

Unspecified values and stability

A variable with automatic storage duration that is not explicitly initialised has an `indeterminate value', which is either a `trap representation' or an `unspecified value'. These questions explore what one can assume about the behaviour of unspecified values.

UV.1 (usage) In the absence of any writes, is an unspecified value stable? I.e., will multiple reads of it always return the same value? Example:

#include <stdio.h>
int main() {
  char c; 
  int b = (c==c);  // is this always true?
  printf("(c==c)=%s\n",b?"true":"false"); 
}

used in practice and supported by compilers
questionable (I would discourage this in a code review)
should not be used; compilers might well not support it
not useful, but compilers do support it
not useful, and compilers do not support it
don't know
Other:

UV.1 (standard) According to the C standard?

Allowed by the standard
Not allowed by the standard
The standard is unclear, contradictory, or does not address this
Don't know

UV.1 (comment)

UV.2 (usage) Is computation strict with respect to unspecified values? I.e., if an argument of a unary or binary operator is an unspecified value, is the result also an unspecified value? Example:

#include <stdio.h>
int main() {
  unsigned char c; 
  unsigned char c2 = (c % 2);
  // can reading c2 give something other than 0 or 1?
  printf("c=%i  c2=%i\n",(int)c,(int)c2);
}

Yes, in practice
No, in practice
Don't know
Other:

UV.2 (standard) According to the C standard?

Yes, by the standard
No, by the standard
The standard is unclear, contradictory, or does not address this
Don't know

UV.2 (comment)

UV.3 (usage) Are the representation bytes of an unspecified value also unspecified values? Example:

#include <stdio.h>
int main() {
  int i;
  char c = * ((char*)(&i)); 
  // does c now hold an unspecified value?
  printf("i=0x%x  c=%i\n",i,(int)c);
  printf("i=0x%x  c=%i\n",i,(int)c);
}

Yes, in practice
No, in practice
Don't know
Other:

UV.3 (standard) According to the C standard?

Yes, by the standard
No, by the standard
The standard is unclear, contradictory, or does not address this
Don't know

UV.3 (comment)

UV.4 (usage) When one writes into a structure member or array element, do any padding bytes `associated with' other members or elements take on unspecified values? Example:

#include <stdio.h>
#include <stddef.h>
#include <assert.h>
typedef struct { char c; float f; } st;
int main() {
  // check there is a padding byte between c and f
  size_t offset_padding = offsetof(st,c)+sizeof(char);
  assert(offsetof(st,f)>offset_padding);
  st s = {.c = 'A', .f = 1.0};
  // cast to char* first, so the offset is counted in bytes
  char *padding = ((char*)&s) + offset_padding;
  *padding = 0; 
  s.f = 2.0;
  // does *padding now hold an unspecified value?
  printf("s.c=%c  s.f=%f  *padding=%i\n",
         s.c,s.f,(int)*padding);
}

Yes, in practice
No, in practice
Don't know
Other:

UV.4 (standard) According to the C standard?

Yes, by the standard
No, by the standard
The standard is unclear, contradictory, or does not address this
Don't know

UV.4 (comment)

Trap Representations

Trap representations do not represent values of the object type, and reading a trap representation (except by an lvalue of character type) is undefined behaviour. (Note that this does not mean that reading a trap representation must give rise to some hardware trap: trap representations might simply license some compiler optimisation.)

TR.1 (usage) Where (if at all) do any current implementations of C have trap representations? (The following example has undefined behaviour if and only if the implementation has a trap representation for type int; one can also consider similar examples for float and other types.) Example:

int main() {
  int i;
  int j=i;   // is this free of undefined behaviour?
  // note that i is read but the value is not used
}

Multiple Representations

A C implementation might have multiple distinct representations for the same abstract value. These questions explore what one can assume about this.

MR.1 (usage) Where (if at all) do any current implementations of C have multiple distinct representations for the same value (apart from padding bytes in structs, unions, or arrays, or padding in bitfields)?

MR.2 (usage) Given a value that in a particular implementation has multiple distinct representations (perhaps due to padding bits), in the absence of any writes, can one assume that its representation is stable? Example:

#include <stdio.h>
int main() {
  // assume int has multiple representations of value v
  int v=1;
  int  i=v;
  unsigned char c1 = *((unsigned char *) &i);
  unsigned char c2 = *((unsigned char *) &i);
  // do c1 and c2 not hold unspecified values?
  int b = (c1==c2); // and is this guaranteed to be true?
  printf("c1=%i  c2=%i  b=%s\n",
         (int)c1,(int)c2,b?"true":"false");
}

used in practice and supported by compilers
questionable (I would discourage this in a code review)
should not be used; compilers might well not support it
not useful, but compilers do support it
not useful, and compilers do not support it
don't know
Other:

MR.2 (standard) According to the C standard?

Allowed by the standard
Not allowed by the standard
The standard is unclear, contradictory, or does not address this
Don't know

MR.2 (comment)

MR.3 (usage) Given a value that in a particular implementation has multiple distinct representations (perhaps due to padding bits), if (in the absence of any writes) the representation is unstable, then can a user byte-by-byte copy of the representation be assumed to give the original value (not a trap representation or a different value)? Example:

#include <stdio.h>
#include <assert.h>
int main() {
  assert(sizeof(int)==4);
  // assume int has multiple representations of value v
  int v=1;
  int  i=v, j=0;
  unsigned char *pi = (unsigned char *) &i;
  unsigned char *pj = (unsigned char *) &j;
  *(pj+0) = *(pi+0);
  *(pj+1) = *(pi+1);
  *(pj+2) = *(pi+2);
  *(pj+3) = *(pi+3);
  // is j guaranteed not to be a trap representation?
  int b = (j==i); // is this guaranteed to be true?
  printf("i=%i  j=%i  b=%s\n",i,j,b?"true":"false");
}

used in practice and supported by compilers
questionable (I would discourage this in a code review)
should not be used; compilers might well not support it
not useful, but compilers do support it
not useful, and compilers do not support it
don't know
Other:

MR.3 (standard) According to the C standard?

Allowed by the standard
Not allowed by the standard
The standard is unclear, contradictory, or does not address this
Don't know

MR.3 (comment)

MR.4 (usage) Given a value that in a particular implementation has multiple distinct representations (perhaps due to padding bits), if (in the presence of an intervening write of the same value) the representation is unstable, then can a user byte-by-byte copy of the representation be assumed to give the original value (not a trap representation or a different value)? Example:

#include <stdio.h>
#include <assert.h>
int main() {
  assert(sizeof(int)==4);
  // assume int has multiple representations of value v
  int v=1;
  int  i=v, j=0;
  unsigned char *pi = (unsigned char *) &i;
  unsigned char *pj = (unsigned char *) &j;
  *(pj+0) = *(pi+0);
  *(pj+1) = *(pi+1);
  i=v;
  *(pj+2) = *(pi+2);
  *(pj+3) = *(pi+3);
  // is j guaranteed not to be a trap representation?
  int b = (j==i); // is this guaranteed to be true?
  printf("i=%i  j=%i  b=%s\n",i,j,b?"true":"false");
}

used in practice and supported by compilers
questionable (I would discourage this in a code review)
should not be used; compilers might well not support it
not useful, but compilers do support it
not useful, and compilers do not support it
don't know
Other:

MR.4 (standard) According to the C standard?

Allowed by the standard
Not allowed by the standard
The standard is unclear, contradictory, or does not address this
Don't know

MR.4 (comment)

Representations

REP.1 Can one assume a char is 8 bits, in current implementations? (if not, please say where)

Yes
Other:

REP.2 Can one assume that integers are two's complement, in current implementations? (if not, please say where)

Yes
Other:

REP.3 Can one assume that signed overflow gives well-defined values, in current implementations? (if not, please say where)

Yes
Other:

REP.4 Can one assume that character pointers and other pointers have the same representation, in current implementations? (if not, please say where)

Yes
Other:

REP.5 (comment)

Any other issues?

Can you describe any other important issues with the behaviour of C memory, especially with respect to any differences between implementations or between implementations and the standard?

Can we improve this survey in any way, e.g. by clarifying any of the questions?