https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92080

--- Comment #3 from rguenther at suse dot de <rguenther at suse dot de> ---
On Mon, 14 Oct 2019, jakub at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92080
> 
> Jakub Jelinek <jakub at gcc dot gnu.org> changed:
> 
>            What    |Removed                     |Added
> ----------------------------------------------------------------------------
>                  CC|                            |jakub at gcc dot gnu.org
> 
> --- Comment #2 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
> Yeah, it isn't e.g. something RTL CSE would naturally do, because there is no
> common subexpression, this needs to know that a narrower broadcast is a part of
> a wider broadcast of the same argument and know how to replace that with a
> backend instruction that takes the low bits from it (while it actually usually
> expands to no code, at least before RA it needs to be expressed some way and is
> very backend specific, we don't allow a vector mode to vector mode subreg with
> different size).  So the only place to deal with this in RTL would be some
> backend specific pass I'm afraid.

So what RTL CSE would need to do is when seeing

 (set reg:VNQI ...)

know (via a target hook?) which subregs can be accessed at zero-cost
and register the appropriate smaller vector sets with a subreg value.
That probably makes sense only after reload to not constrain RA
too much.  It could be restricted to vec_duplicate since there
it's easy to derive the lowpart expression to register.
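
To make the idea concrete, here is a toy model in Python, not GCC code.
Everything in it is hypothetical: the ZERO_COST_LOWPARTS table stands in
for the target hook, the ToyCSE class stands in for the CSE value table,
and the subreg strings are only an illustrative spelling of "take the low
part", since how that is actually expressed is backend specific.  It only
handles the case where the wider vec_duplicate is seen first.

```python
# Hypothetical stand-in for a target hook: for each vector mode, the narrower
# modes whose value can be read from the low part of the register for free.
ZERO_COST_LOWPARTS = {
    "V64QI": ["V32QI", "V16QI"],
    "V32QI": ["V16QI"],
}


class ToyCSE:
    """Toy value table restricted to vec_duplicate sets (illustration only)."""

    def __init__(self):
        # Maps (mode, operand) of a vec_duplicate to an RTL-like string
        # naming something that already holds that value.
        self.available = {}

    def process_set(self, dest, mode, operand):
        """Model (set (reg:MODE dest) (vec_duplicate:MODE operand)).

        Returns the source expression the set would use after this toy CSE.
        """
        key = (mode, operand)
        if key in self.available:
            # An equivalent value is already registered, reuse it.
            return self.available[key]

        # First time we see this value: the destination register holds it.
        self.available[key] = f"(reg:{mode} {dest})"

        # Also register the narrower broadcasts of the same operand as
        # low-part accesses of the wide register.
        for narrow in ZERO_COST_LOWPARTS.get(mode, []):
            self.available.setdefault(
                (narrow, operand),
                f"(subreg:{narrow} (reg:{mode} {dest}) 0)",
            )
        return f"(vec_duplicate:{mode} {operand})"


cse = ToyCSE()
wide = cse.process_set("r100", "V64QI", "c")    # kept as a broadcast
narrow = cse.process_set("r101", "V32QI", "c")  # becomes a low-part access
print(wide)
print(narrow)
```

In this sketch the second set no longer needs its own broadcast because the
V32QI value was registered when the V64QI set was processed.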
