http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36884
--- Comment #4 from Georg-Johann Lay <gjl at gcc dot gnu.org> 2011-07-05 13:20:15 UTC ---
Created attachment 24691
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=24691
C Testcase pr36884.c

This testcase shows the dis-optimization in 4.7 trunk.

======================================================

Compiled with -O2 -S:

swap:
    mov r18,r25
    rol r18
    clr r18
    rol r18
    ldi r19,lo8(0)
    clr __tmp_reg__
    lsr r19
    ror r18
    ror __tmp_reg__
    lsr r19
    ror r18
    ror __tmp_reg__
    mov r19,r18
    mov r18,__tmp_reg__
    sbrc r25,6
    ori r19,hi8(8192)
.L3:
    sbrc r25,5
    ori r19,hi8(16384)
.L4:
    sbrc r25,4
    ori r19,hi8(-32768)
.L5:
    mov r24,r18
    mov r25,r19
    ret

======================================================

Compiled with -O2 -S -fno-if-conversion:

swap:
    sbrc r25,7
    rjmp .L6
    ldi r18,lo8(0)
    ldi r19,hi8(0)
.L2:
    sbrc r25,6
    ori r19,hi8(8192)
.L3:
    sbrc r25,5
    ori r19,hi8(16384)
.L4:
    sbrc r25,4
    ori r19,hi8(-32768)
.L5:
    mov r24,r18
    mov r25,r19
    ret
.L6:
    ldi r18,lo8(64)
    ldi r19,hi8(64)
    rjmp .L2

======================================================

The compiler does a lengthy extract and reuse of the MSB.
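
The attached testcase itself is not reproduced in this comment. Reading only the assembly above, the function appears to test the top four bits of a 16-bit argument and set one result bit for each. The following is a rough C sketch that is consistent with both listings; it is not the attachment. The name swap_sketch and the use of uint16_t are assumptions made for illustration.

```c
#include <stdint.h>

/* Hypothetical reconstruction from the assembly listings, NOT the
   attached pr36884.c.  Each test maps one input bit to one output bit. */
uint16_t swap_sketch(uint16_t x)
{
    uint16_t r = 0;

    if (x & 0x8000)     /* sbrc r25,7: the MSB, the case if-conversion expands */
        r |= 0x0040;    /* ldi r18,lo8(64) / ldi r19,hi8(64) */
    if (x & 0x4000)     /* sbrc r25,6 */
        r |= 0x2000;    /* ori r19,hi8(8192) */
    if (x & 0x2000)     /* sbrc r25,5 */
        r |= 0x4000;    /* ori r19,hi8(16384) */
    if (x & 0x1000)     /* sbrc r25,4 */
        r |= 0x8000;    /* ori r19,hi8(-32768) */

    return r;
}
```

In the first listing, the first conditional is the one that gets if-converted: the MSB is rotated out through the carry flag, shifted into position over a 24-bit register chain and copied back, where the second listing uses a single skip, a branch and two loads.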