http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46854
--- Comment #3 from joakim.tjernlund at transmode dot se <joakim.tjernlund at transmode dot se> 2010-12-09 18:21:50 UTC ---
(In reply to comment #2)
> Note, -O2 generates mostly the code you want, except that it looks the address
> of the string twice:
>
> Here is the code generated with a 4.4.4 based compiler (the compiler happens to
> be the IBM advance toolchain, version 3.0-1) using -O2 -m32 (-O1/-O3 generate
> the same code):
>
> test:
>         mr. 0,3
>         mtctr 0
>         beq 0,.L10
>         lis 3,.lanch...@ha
>         la 3,.lanch...@l(3)
>         .p2align 4,,15
> .L8:
>         lbzu 0,1(3)
>         cmpwi 7,0,0
>         bne 7,.L8
>         bdnz .L8
>         blr
> .L10:
>         lis 3,.lanch...@ha
>         la 3,.lanch...@l(3)
>         blr
>
> The SLES 11SP1 system compiler, which is based on GCC 4.3.4 generates the same
> code.
>
> However, the GCC 4.6 trunk seems to have regressed slightly with -O2 or -O3, in
> that it does not track that the lbzu updates the pointer, but maintains its
> own

I have seen more similar mistakes such as not using lwzu/stwu at all. Will add
a copy of a mail I sent earlier about that.

> copy:
>
>         mr. 0,3
>         mtctr 0
>         beq- 0,.L5
>         lis 3,.lanch...@ha
>         la 3,.lanch...@l(3)
> .L4:
>         mr 9,3
> .L3:
>         lbzu 0,1(9)
>         addi 3,3,1
>         cmpwi 7,0,0
>         bne+ 7,.L3
>         bdnz .L4
>         blr
> .L5:
>         lis 3,.lanch...@ha
>         la 3,.lanch...@l(3)
>         blr
>
> Trunk with -Os does generate the two comparisons:
>
>         mr 9,3
>         lis 3,.lanch...@ha
>         la 3,.lanch...@l(3)
>         b .L2
> .L5:
>         mr 11,3
>         addi 3,3,1
>         lbz 0,1(11)
>         cmpwi 7,0,0
>         bne+ 7,.L5
>         addi 9,9,-1
> .L2:
>         cmpwi 7,9,0
>         bne+ 7,.L5
>         blr
>
> So, there are two bugs in this. One that -Os generates larger code than -O2,
> and the code regression for GCC 4.6.

And gcc 4.4.4/4.4.5. I suspect this started much earlier though.
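
For reference, a rough C equivalent of the quoted -O2 loop, read back from the assembly rather than taken from the test case attached to this bug. The function name `skip_strings`, the array name `text`, and the string contents are made up for illustration only:

```c
/* Hypothetical stand-in for the string data that the quoted assembly
   addresses through the anchor symbol; the contents are invented. */
static const char text[] = "ab\0cd\0ef";

/* Sketch of what the quoted -O2 code computes: for each of n rounds,
   advance the pointer with a pre-increment load (the lbzu) until a zero
   byte is read, then return the final pointer. With n == 0 the start of
   the string is returned, matching the beq path in the listing. */
const char *skip_strings(unsigned n)
{
    const char *p = text;

    while (n--) {
        /* inner loop: lbzu / cmpwi / bne */
        while (*++p != 0)
            ;
    }
    return p;
}
```

Building something of this shape with -O2 and with -Os for a 32-bit PowerPC target may be a convenient way to check whether the pointer update done by lbzu is reused or whether a separate addi is emitted, as in the trunk output quoted above.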