https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101868
Bug ID: 101868
Summary: Incorrect reordering in -O2 with LTO
Product: gcc
Version: 11.2.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c
Assignee: unassigned at gcc dot gnu.org
Reporter: gcc at alanwu dot email
Target Milestone: ---

GCC with LTO seems to be hoisting a memory read above the null check that
guards it. It only seems to reproduce with LTO, so please excuse posting
multiple files.

Compile command:

    gcc -flto -O2 -fno-strict-aliasing one.c two.c three.c four.c

//--------------- one.c --------------------------------
typedef unsigned long VALUE;

__attribute__ ((cold)) void rb_check_type(VALUE, int);

static VALUE
repro(VALUE dummy, VALUE hash)
{
    if (hash == 0) {
        rb_check_type(hash, 1);
    }
    else if (*(long *)hash) {
        rb_check_type(hash, 1);
    }

    return *(long *)hash;
}

static VALUE (*that)(VALUE dummy, VALUE hash) = repro;

int
main(int argc, char **argv)
{
    argc--;
    that(0, argc);
    rb_check_type(argc, argc);
}
//------------ end of one.c ----------------------------

//------------ two.c -----------------------------------
typedef unsigned long VALUE;

__attribute__ ((noreturn)) void rexc_raise(VALUE mesg);

VALUE rb_donothing(VALUE klass);

static void
funexpected_type(VALUE x, int xt, int t)
{
    rexc_raise(rb_donothing(0));
}

__attribute__ ((cold)) void
rb_check_type(VALUE x, int t)
{
    int xt;
    if (x == 0) {
        funexpected_type(x, xt, t);
    }
}
//------------- end of two.c ---------------------------

//------------ three.c ---------------------------------
typedef unsigned long VALUE;

static void thing(void) {}
static void (*ptr)(void) = &thing;

VALUE
rb_donothing(VALUE klass)
{
    ptr();
    return 0;
}
//-------- end of three.c ------------------------------

//-------- four.c --------------------------------------
typedef unsigned long VALUE;

__attribute__((noreturn)) void
rexc_raise(VALUE mesg)
{
    __builtin_exit(42);
}
//------------- end of four.c --------------------------

The code generated for repro() reads from memory before doing the check
for zero:

   0x00000000004011a0 <+0>:     sub    $0x18,%rsp
=> 0x00000000004011a4 <+4>:     mov    (%rsi),%rax
   0x00000000004011a7 <+7>:     test   %rsi,%rsi
   0x00000000004011aa <+10>:    je     0x401051 <repro.cold>
   0x00000000004011b0 <+16>:    test   %rax,%rax
   0x00000000004011b3 <+19>:    jne    0x401067 <repro.cold+22>
   0x00000000004011b9 <+25>:    add    $0x18,%rsp
   0x00000000004011bd <+29>:    ret

Here is the output of gcc -v. I'm using the 11.2.0 Docker Hub image.

Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/local/libexec/gcc/x86_64-linux-gnu/11.2.0/lto-wrapper
Target: x86_64-linux-gnu
Configured with: /usr/src/gcc/configure --build=x86_64-linux-gnu --disable-multilib --enable-languages=c,c++,fortran,go
Thread model: posix
Supported LTO compression algorithms: zlib
gcc version 11.2.0 (GCC)
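
For reference, here is a rough single-file model of the order of operations that the source of repro() asks for. It is not part of the test case: trace_repro() and the enum names are hypothetical, introduced only for illustration, and the pointer casts assume an LP64 target such as x86_64-linux-gnu. Reading the four files by hand, a run with no arguments should decrement argc to 0, take the hash == 0 branch, and reach __builtin_exit(42) through rb_check_type() without ever loading from address 0. The disassembly above instead performs the load first.

```c
/* Hypothetical model of repro() from one.c, for illustration only.
   It reports which source-level path is taken and, like the original
   source, never dereferences hash when hash == 0. */
typedef unsigned long VALUE;

enum path {
    PATH_NULL_CHECK,     /* hash == 0: rb_check_type(0, 1) ends in exit(42) */
    PATH_NONZERO_CHECK,  /* *hash != 0: rb_check_type(hash, 1) returns */
    PATH_PLAIN_LOAD      /* *hash == 0: falls through to the final load */
};

enum path
trace_repro(VALUE hash)
{
    if (hash == 0) {
        /* The zero test comes first; no memory access on this path. */
        return PATH_NULL_CHECK;
    }
    else if (*(long *)hash) {
        /* The load is only reached once hash is known to be nonzero. */
        return PATH_NONZERO_CHECK;
    }
    return PATH_PLAIN_LOAD;
}
```

Calling trace_repro(0) returns PATH_NULL_CHECK without touching memory, which is the ordering the generated code for repro() should preserve.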