https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117978

--- Comment #4 from ktkachov at gcc dot gnu.org ---
(In reply to Richard Sandiford from comment #3)
> I think this would be better done in expand rather than gimple.  The gimple
> representation would be a vector load in a 128-bit type, followed by a
> zeroing extension to the original SVE type.  I'm not sure how easy it is to
> represent the zeroing extension as things stand, but either way, it would be
> converting one load into one load + one other operation.  The result seems
> more complicated in gimple terms, so I think the natural gimple fold would
> be in the opposite direction.
> 
> If we do it in expand, we'll be able to see the constant if we use an
> appropriate predicate.
> 
> Also:
> 
> * We should do this for 8-bit, 16-bit, 32-bit, and 64-bit quantities, not
> just 128-bit.
> 
> * We should do the same thing for LD2/3/4 and ST2/3/4 (64-bit and 128-bit
> only).
> 
> * Except for the single-element case, the optimisation is only valid for
> little-endian targets.

Do we also need to guard this under TARGET_NON_STREAMING?
