http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47764
Uros Bizjak <ubizjak at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Target|arm-linux-androideabi       |
                 CC|                            |ubizjak at gmail dot com
          Component|target                      |rtl-optimization
      Known to fail|                            |4.7.0, 4.8.0

--- Comment #5 from Uros Bizjak <ubizjak at gmail dot com> 2013-01-24 07:25:04 UTC ---

This is a problem with rtl-optimization, gcse2 pass. Following testcase also
fails on x86_64, with 4.8 [1] that removes (!o,F) alternative.

Following test, when compiled with -O3 hoists memory load out of the loop:

--cut here--
volatile double y;

void
test ()
{
  int z;

  for (z = 0; z < 1000; z++)
    y = 0.1;
}
--cut here--

_.210r.postreload:

   15: L15:
    8: NOTE_INSN_BASIC_BLOCK 3
   23: xmm0:DF=[`*.LC0']
   10: [`y']=xmm0:DF
      REG_DEAD xmm0:DF
   11: NOTE_INSN_DELETED
   12: {flags:CCZ=cmp(ax:SI-0x1,0);ax:SI=ax:SI-0x1;}
   13: pc={(flags:CCZ!=0)?L15:pc}
      REG_BR_PROB 0x26ab

_.211r.gcse2:

   26: xmm0:DF=[`*.LC0']
   15: L15:
    8: NOTE_INSN_BASIC_BLOCK 3
   10: [`y']=xmm0:DF
      REG_DEAD xmm0:DF
   11: NOTE_INSN_DELETED
   12: {flags:CCZ=cmp(ax:SI-0x1,0);ax:SI=ax:SI-0x1;}
   13: pc={(flags:CCZ!=0)?L15:pc}
      REG_BR_PROB 0x26ab

However, when constant is changed to 0.0 (so, we can load it directly to %xmm
register using xorpd insn):

--cut here--
volatile double y;

void
test ()
{
  int z;

  for (z = 0; z < 1000; z++)
    y = 0.0;
}
--cut here--

gcc -O3:

_.211r.gcse2:

   15: L15:
    8: NOTE_INSN_BASIC_BLOCK 3
   10: xmm0:DF=0.0
   23: [`y']=xmm0:DF
      REG_DEAD xmm0:DF
   11: NOTE_INSN_DELETED
   12: {flags:CCZ=cmp(ax:SI-0x1,0);ax:SI=ax:SI-0x1;}
   13: pc={(flags:CCZ!=0)?L15:pc}
      REG_BR_PROB 0x26ab

Constant load remains inside the loop.

It looks that gcse2 pass cares only for loads from memory, but I see no reason
why constant load should not be considered. It looks like an oversight to me.
The same happens with:

--cut here--
volatile long long y;

void
test ()
{
  int z;

  for (z = 0; z < 1000; z++)
    y = 0x123456789;
}
--cut here--

_.211r.gcse2:

   15: L15:
    8: NOTE_INSN_BASIC_BLOCK 3
   23: dx:DI=0x123456789
   24: [`y']=dx:DI
      REG_DEAD dx:DI
   11: NOTE_INSN_DELETED
   12: {flags:CCZ=cmp(ax:SI-0x1,0);ax:SI=ax:SI-0x1;}
   13: pc={(flags:CCZ!=0)?L15:pc}
      REG_BR_PROB 0x26ab

resulting in:

.L3:
        movabsq $4886718345, %rdx
        subl    $1, %eax
        movq    %rdx, y(%rip)
        jne     .L3

Reconfirmed as rtl-optimization (gcse2 pass) problem.

[1] 4.8.0 20130124 (experimental) [trunk revision 195417]
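
For side-by-side comparison, the three loops above can be put into one
translation unit. This is only a sketch: the function and variable names
(store_mem_const, store_fp_zero, store_int_imm, yd, yl) are made up here and
do not appear in the testcases, and the numeric prefix of the dump files
depends on the GCC revision.

```c
/* Sketch: the three loops from this report in one file.
   All names below are illustrative, not taken from the report.  */

volatile double yd;     /* plays the role of `y' in the double testcases */
volatile long long yl;  /* plays the role of `y' in the long long testcase */

/* 0.1 is loaded from memory ([`*.LC0'] in the dumps above);
   this is the load that gcse2 moves out of the loop.  */
void
store_mem_const (void)
{
  int z;

  for (z = 0; z < 1000; z++)
    yd = 0.1;
}

/* 0.0 can be materialized directly in an %xmm register;
   per the report, this constant load stays inside the loop.  */
void
store_fp_zero (void)
{
  int z;

  for (z = 0; z < 1000; z++)
    yd = 0.0;
}

/* 64-bit integer immediate (movabsq in the assembly above);
   per the report, this also stays inside the loop.  */
void
store_int_imm (void)
{
  int z;

  for (z = 0; z < 1000; z++)
    yl = 0x123456789;
}
```

Compiling this with something like "gcc -O3 -c -fdump-rtl-gcse2" on x86_64
should leave a *.gcse2 dump in which the three loop bodies can be compared,
assuming the affected revisions noted above.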