https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100622

Segher Boessenkool <segher at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
     Ever confirmed|0                           |1

--- Comment #5 from Segher Boessenkool <segher at gcc dot gnu.org> ---
(In reply to Thomas Koenig from comment #4)
> Yes, the masking should be only performed at the end.
> 
> However, the inner loop could be further simplified to
> 
> label:
>     lwzu r8,4(r10)
>     add r3,r8,r3
>     bdnz label
> 
> without the need to do anything with r9, so this is probably
> more than one topic in one test case.

Please use -O2 instead; no one will care much about -O1.  You can use
-fno-unroll-loops to make the output easier to read.

The core for foo is

.L3:
        lwzu 10,4(9)
        add 3,10,3
        rldicl 3,3,0,32
        bdnz .L3

and for foo2 is

.L10:
        lwzu 10,4(9)
        add 3,3,10
        bdnz .L10

It is already this way in GIMPLE: the IV is DImode, while it would be
better as SImode.  That is the root of the problem here.  Sinking
extensions could well help, but the IV should not be DImode in the first
place!
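
To illustrate why the masking only needs to happen once, at the end, here
is a minimal C sketch.  It is not the test case from this report; the
function names and the test data are made up for illustration.  It models a
32-bit sum kept in a 64-bit register: one loop clears the upper 32 bits
after every add (the effect of "rldicl 3,3,0,32" in the foo loop), the
other clears them once after the loop.  Since truncation modulo 2^32
commutes with addition, both give the same result.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the foo loop: the 64-bit accumulator is
   truncated to its low 32 bits after every addition. */
uint64_t sum_mask_each_iteration(const uint32_t *a, size_t n)
{
    uint64_t acc = 0;
    for (size_t i = 0; i < n; i++) {
        acc += a[i];
        acc &= 0xffffffffu;   /* clear the upper 32 bits */
    }
    return acc;
}

/* Same sum, but the truncation is sunk out of the loop. */
uint64_t sum_mask_at_end(const uint32_t *a, size_t n)
{
    uint64_t acc = 0;
    for (size_t i = 0; i < n; i++)
        acc += a[i];
    return acc & 0xffffffffu; /* clear the upper 32 bits once */
}

/* Illustrative check with values that overflow 32 bits along the way. */
void self_check(void)
{
    static const uint32_t v[] = { 0xffffffffu, 2u, 0x80000000u, 0x80000001u };
    size_t n = sizeof v / sizeof v[0];

    assert(sum_mask_each_iteration(v, n) == sum_mask_at_end(v, n));
    assert(sum_mask_at_end(v, n) == 2u);
}
```

This only shows that sinking the extension is safe for a wrapping sum.  It
says nothing about which mode the IV gets in GIMPLE, which is the actual
problem described above.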

Confirmed.
