https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62631
--- Comment #27 from dave.anglin at bell dot net ---
On 2015-02-07, at 5:24 PM, ebotcazou at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62631
>
> --- Comment #26 from Eric Botcazou <ebotcazou at gcc dot gnu.org> ---
>> The generated code on PA looks optimal to me:
>>
>>         zdep %r25,29,30,%r28
>>         b .L2
>>         ldi 99,%r19
>> .L6:
>>         zdep %r25,29,30,%r28
>> .L2:
>>         addl %r26,%r28,%r28
>>         ldo 1(%r25),%r25
>>         comb,>>= %r19,%r25,.L6
>>         stw %r0,0(%r28)
>>         bv,n %r0(%r2)
>
> For most other architectures the BIV (%r25) is eliminated to the GIV (%r28) so
> you only have one additive operation in the loop.  This happens for 64-bit PA:
>
> .L5:
>         ldo 4(%r26),%r26
>         cmpb,*>>,n %r28,%r26,.L5
>         stw %r0,0(%r26)
>         bve,n (%r2)
>
> Why couldn't such a code be generated for 32-bit PA too?

There is no reason that I can see.

Dave
--
John David Anglin    dave.ang...@bell.net
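
For readers following the thread, the difference between the two quoted loops can be sketched in C. This is only an illustration, not the test case from the bug report: the function names `clear_indexed` and `clear_pointer`, the bound `N`, and the assumption that the starting index is below `N` are all hypothetical. The first function keeps the counter (the BIV) and recomputes the address from it on every iteration, roughly like the 32-bit output; the second drops the counter and steps a pointer (the GIV) toward an end address, roughly like the 64-bit output.

```c
enum { N = 100 };  /* hypothetical bound, chosen to resemble the "ldi 99" above */

/* Counter-based loop: the index i is the basic induction variable (BIV).
 * The address a + i is derived from it each time around, so the loop body
 * needs a shift/add for the address plus an increment of i.
 * Assumes i < N on entry. */
void clear_indexed(int *a, unsigned i)
{
    do {
        a[i] = 0;
        i++;
    } while (i < N);
}

/* Pointer-based loop: the counter has been eliminated in favour of the
 * derived pointer p (the GIV). The only additive operation left in the
 * loop is the pointer increment, and the exit test compares p with a
 * precomputed end address.
 * Assumes i < N on entry. */
void clear_pointer(int *a, unsigned i)
{
    int *p = a + i;
    int *end = a + N;

    do {
        *p++ = 0;
    } while (p < end);
}
```

Both functions store zero to the same elements; whether a given compiler turns the first form into the second depends on its induction-variable optimization and target cost model, which is the point being discussed in the comments above.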