On 7/30/09, Steven Bosscher <stevenb....@gmail.com> wrote:
> On 7/30/09, Zoltán Kócsi <zol...@bendor.com.au> wrote:
> > On the ARM every instruction can be executed conditionally. GCC very
> > cleverly uses this feature:
> >
> > int bar ( int x, int a, int b )
> > {
> >    if ( x )
> >       return a;
> >    else
> >       return b;
> > }
> >
> > compiles to:
> >
> > bar:
> >    cmp   r0, #0     // test x
> >    movne r0, r1     // retval = 'a' if !0 ('ne')
> >    moveq r0, r2     // retval = 'b' if 0 ('eq')
> >    bx    lr
> >
> > However, the following function:
> >
> > extern unsigned array[ 128 ];
> >
> > int foo( int x )
> > {
> >    int y;
> >
> >    y = array[ x & 127 ];
> >
> >    if ( x & 128 )
> >       y = 123456789 & ( y >> 2 );
> >    else
> >       y = 123456789 & y;
> >
> >    return y;
> > }
> >
> > compiled with gcc 4.4.0, using -Os generates this:
> >
> > foo:
> >    ldr   r3, .L8
> >    tst   r0, #128
> >    and   r0, r0, #127
> >    ldr   r3, [r3, r0, asl #2]
> >    ldrne r0, .L8+4          ***
> >    ldreq r0, .L8+4          ***
> >    movne r3, r3, asr #2
> >    andne r0, r3, r0         ***
> >    andeq r0, r3, r0         ***
> >    bx    lr
> > .L8:
> >    .word array
> >    .word 123456789
> >
> > The lines marked with the *** -s do the same, one executing if the
> > condition is one way, the other if the condition is the opposite.
> > That is, together they perform one unconditional instruction, except
> > that they use two instructions (and clocks) instead of one.
> >
> > Compiling with -O2 makes things even worse, because another issue hits:
> > gcc sometimes changes a "load constant" to a "generate the constant on
> > the fly" even when the latter is both slower and larger, other times it
> > chooses to load a constant even when it can easily (and more cheaply)
> > generate it from already available values. In this particular case it
> > decides to build the constant from pieces and combines that with
> > the "generate an unconditional instruction using two complementary
> > conditional instructions" method, resulting in this:
> >
> > foo:
> >    ldr   r3, .L8
> >    tst   r0, #128
> >    and   r0, r0, #127
> >    ldr   r0, [r3, r0, asl #2]
> >    movne r0, r0, asr #2
> >    bicne r0, r0, #-134217728
> >    biceq r0, r0, #-134217728
> >    bicne r0, r0, #10747904
> >    biceq r0, r0, #10747904
> >    bicne r0, r0, #12992
> >    biceq r0, r0, #12992
> >    bicne r0, r0, #42
> >    biceq r0, r0, #42
> >    bx    lr
> > .L8:
> >    .word array
> >
> > Should I report a bug?
>
> This looks like my bug PR21803 (gcc.gnu.org/PR21803). Can you check if
> the ce3 pass creates this code? (Compile with -fdump-rtl-all and look
> at the .ce3 dump and one dump before to see if the .ce3 pass created
> your funny sequence.)
>
> If your problem is indeed caused by the ce3 pass, you should add your
> problem to PR21803, change the "Component" field to "middle-end", and
> adjust the bug summary to make it clear that this is not ia64
> specific.

Oh, and you may also want to try my patch "crossjump_abstract.diff" in
PR20070. It sometimes solves problems like yours (if the sequence is
just right) by crossjumping earlier.

Ciao!
Steven
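
For reference, a small C sketch of the two observations in the quoted
report. It is not taken from the thread: the names and_via_bic,
check_bic_equivalence, foo_original and foo_hoisted are made up for
illustration, and array is defined here (the original only declares it
extern) so that the file links on its own. The first part shows that the
four BIC immediates in the -O2 listing together clear exactly the
complement of 123456789, i.e. they amount to one AND with that constant.
The second part is foo() with the common AND hoisted out of the
if/else by hand, which is roughly what one would hope the compiler does
itself.

```c
#include <assert.h>
#include <stdint.h>

#define MASK 123456789u            /* 0x075BCD15, the constant in foo() */

/* Mirrors the four BIC steps of the -O2 listing: each one clears a
   group of bits that are zero in MASK. */
uint32_t and_via_bic(uint32_t y)
{
    y &= ~0xF8000000u;             /* bic #-134217728 */
    y &= ~0x00A40000u;             /* bic #10747904   */
    y &= ~0x000032C0u;             /* bic #12992      */
    y &= ~0x0000002Au;             /* bic #42         */
    return y;
}

/* Self-check: the BIC chain agrees with a plain AND on sample words. */
void check_bic_equivalence(void)
{
    const uint32_t samples[] = { 0u, 1u, 0xFFFFFFFFu, 0x12345678u, MASK };
    for (unsigned i = 0; i < sizeof samples / sizeof samples[0]; i++)
        assert(and_via_bic(samples[i]) == (samples[i] & MASK));
}

/* Defined here only so the sketch is self-contained. */
unsigned array[128];

/* foo() as written in the report. */
int foo_original(int x)
{
    int y = array[x & 127];
    if (x & 128)
        y = 123456789 & (y >> 2);
    else
        y = 123456789 & y;
    return y;
}

/* Hand-hoisted variant: only the shift stays conditional, the AND that
   both branches share is done once. */
int foo_hoisted(int x)
{
    int y = array[x & 127];
    if (x & 128)
        y >>= 2;
    return 123456789 & y;
}
```

Calling check_bic_equivalence() from a test harness, and comparing
foo_original(x) with foo_hoisted(x) for a few x after filling array,
should show the two forms agree; whether gcc emits the shorter sequence
for the hoisted form was not checked here.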