Making the tests concise to illustrate semantic questions also means that most are not written to trigger interesting compiler behaviour, which might only occur in a larger context that permits some analysis or optimisation pass to take effect. Moreover, following the spirit of C, conventional implementations cannot and do not report all instances of undefined behaviour. Hence, only in some cases is there anything to be learned from the experimental compiler behaviour. For any executable semantics or analysis tool, on the other hand, all the tests should have instructive outcomes.
Some tests rely on address coincidences for the interesting execution; for these we sometimes include multiple variants, tuned to the allocation behaviour in the implementations we consider. Where this has not been done, some of the experimental data is not meaningful.
-fno-strict-aliasing, on x86_64 on Linux, e.g.
gcc -O2 -std=c11 -pedantic -Wall -Wextra -Wno-unused-variable -pthread
clang -fsanitize=address -O2 -std=c11 -pedantic -Wall -Wextra -Wno-unused-variable -pthread