Julian Brown <jul...@codesourcery.com> writes:
> Richard Sandiford <richard.sandif...@linaro.org> wrote:
>> One of the vectorisation discussions from last year was about the poor
>> code GCC generates for vld{2,3,4}_*() and vst{2,3,4}_*().  It forces
>> the result of the loads onto the stack, then loads the individual
>> pieces from there.  It does the same thing in reverse for stores.
>> 
>> I think there are two major problems here:
>> [...]
>
> I wanted to finish this off before making it public, but I think that
> to do so at this point is just going to result in duplicated effort.

Yeah, sorry about that. :-(  I started out by looking at the tree-level
side, but in the end decided that I needed to tackle the rtl level first.

> 1. Struct (tree) types are defined via hard-wired code in the ARM
> backend rather than in arm_neon.h. The "type mode" of those struct types
> is overridden to be an extra-wide vector, the width of the whole struct
> (so int32x2x2_t would be V4SImode, etc.).

FWIW, I was going to try to avoid this.  I think instead we should
automatically use vector modes for structures like those in arm_neon.h,
via a target hook.  It would improve the code quality for general code
that has the same sort of small-array-of-vectors structure.  (In other
words, this would help when using the generic vector extensions rather
than the Neon-specific builtins.)
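
A minimal sketch of the kind of generic code this would help, assuming
only GCC's generic vector extensions.  The names v4si, v4si_x2, add_pairs
and sum_lanes are hypothetical; only the struct-with-a-"val"-array shape
mirrors arm_neon.h types such as int32x4x2_t:

```c
#include <stdint.h>

/* Hypothetical names for illustration; these are not arm_neon.h types. */
typedef int32_t v4si __attribute__((vector_size(16)));

/* A small array of vectors wrapped in a struct, the same shape as
   int32x4x2_t.  Ideally the whole struct would get a vector mode.  */
typedef struct { v4si val[2]; } v4si_x2;

/* Element-wise add of two vector pairs, passed and returned by value. */
v4si_x2 add_pairs(v4si_x2 a, v4si_x2 b)
{
    v4si_x2 r;
    r.val[0] = a.val[0] + b.val[0];
    r.val[1] = a.val[1] + b.val[1];
    return r;
}

/* Sum all eight lanes, using vector subscripting. */
int32_t sum_lanes(v4si_x2 x)
{
    int32_t total = 0;
    int i, j;
    for (i = 0; i < 2; i++)
        for (j = 0; j < 4; j++)
            total += x.val[i][j];
    return total;
}
```

If v4si_x2 is only ever given a BLKmode-style layout, values like "r"
above tend to be built on the stack, which is the same problem as the
one described for the Neon intrinsics.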

> 2. Builtins (__builtin_neon_*) which previously used "big" integer
> modes to pass/return values, are initialised such that they
> directly pass/return the struct types above instead. The intrinsic
> wrappers in arm_neon.h no longer need to use unions to pun the types of
> arguments & return values.

Yeah, I'd wondered about that too.  However, these days, I think we
ought to be able to generate good code for this type of union, and we
seem to for the cases I've tried.  In the end I thought it was better
to keep the underlying built-in function close to the rtl pattern.
E.g. the fact that the name of the field is "val" seems more like an
arm_neon.h detail than something that should be hard-coded into GCC.
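
For reference, the union idiom in question looks roughly like the sketch
below.  This is a portable stand-in, not the real header: opaque256,
fake_builtin_ld2 and wrapper_ld2 are hypothetical names, with opaque256
playing the role of the "big" integer mode and fake_builtin_ld2 the role
of a __builtin_neon_* load:

```c
#include <stdint.h>
#include <string.h>

typedef int32_t i32x4 __attribute__((vector_size(16)));

/* User-visible struct type; the field name "val" lives only here. */
typedef struct { i32x4 val[2]; } i32x4x2;

/* Hypothetical stand-in for a wide integer mode: 32 opaque bytes. */
typedef struct { uint64_t bits[4]; } opaque256;

/* Hypothetical stand-in for the built-in: a de-interleaving load of
   eight ints that knows nothing about the struct type above.  */
opaque256 fake_builtin_ld2(const int32_t *p)
{
    int32_t tmp[8];
    opaque256 r;
    int i;
    for (i = 0; i < 4; i++) {
        tmp[i] = p[2 * i];          /* even elements -> first vector  */
        tmp[4 + i] = p[2 * i + 1];  /* odd elements  -> second vector */
    }
    memcpy(&r, tmp, sizeof r);
    return r;
}

/* Wrapper in the style of arm_neon.h: pun the opaque result into the
   struct type through a union.  */
i32x4x2 wrapper_ld2(const int32_t *p)
{
    union { i32x4x2 i; opaque256 o; } rv;
    rv.o = fake_builtin_ld2(p);
    return rv.i;
}
```

The point of the split is that only wrapper_ld2 needs to know about the
struct layout; the built-in side stays close to the rtl pattern.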

> 3. When those builtins are expanded, they now use the extra-wide
> vectors. The corresponding instruction patterns in neon.md also use wide
> vectors (rather than wide integer modes).

I was also going to try defining non-power-of-two vectors.  Glad to
hear it works!  (That was actually the main motivation for doing the
rtl side first: to see whether it really would be OK to ask the
vectoriser to treat these values as single vectors.)

I think we should keep the integer modes too, though, just like we
allow DImode for double registers.  (I'm surprised to see we don't
allow TImode for quad registers TBH -- might look into that.)
Given that there are no architectural restrictions on mode punning,
these integer modes are useful neutral ground.

>> The VMOV is a bit disappointing, and needs further investigation.
>
> But that should be fairly easy: we just need to expand to subreg
> operations for vcombine, vget_high/vget_low, etc. -- which I believe
> will work fine now (it didn't at some point in the distant past, which
> is why we have hardwired "vmov"s all over the place). Big-endian mode
> may need some care to be taken, as ever.

This VMOV is coming from a plain register-register SET that we fail to
optimise away.  The pattern is:

    (set (reg NEWX) (reg OLDX))
    (set (subreg (reg NEWX) 0) (plus (subreg (reg OLDX) 0) ...)) ; OLDX dead
    (set (subreg (reg NEWX) 8) (plus (subreg (reg NEWX) 8) ...))
    ...

where what we really want is for the second instruction to use NEWX
(or for NEWX not to exist at all, whichever way you prefer to look at it).
I think it's a general subreg optimisation problem.

> I don't think I'm going to have time to work on this in the immediate
> future: please feel free to use it as a base, or ignore it if your
> approach is simpler/better :-).

Thanks.  I might well end up "borrowing" the vector-mode stuff.

Richard

_______________________________________________
linaro-toolchain mailing list
linaro-toolchain@lists.linaro.org
http://lists.linaro.org/mailman/listinfo/linaro-toolchain
