On Thu, Apr 16, 2020 at 11:31:17AM -0500, Segher Boessenkool wrote:
> Hi!
> 
> On Wed, Apr 15, 2020 at 09:37:32PM -0400, Michael Meissner wrote:
> > Fix regression caused by PR target/93932 backport.
> > 
> > When I back ported the fix for PR target/93932 to the GCC 9 branch, I put 
> > in an
> > unintended regression when the GCC compiler is optimizing the vec_extract
> > built-in function, and the vector element is in memory, and the index is
> > variable.  This patch masks the vector index so that it does not go out of
> > bounds.
> > 
> > 2020-04-15  Michael Meissner  <meiss...@linux.ibm.com>
> > 
> >     PR target/94557
> >     * config/rs6000/rs6000.c (rs6000_adjust_vec_address): Fix
> >     regression caused by PR target/93932 backport.  Mask variable
> >     vector extract index so it does not go beyond the vector when
> >     extracting a vector element from memory.
> 
> Much better, thanks!
> 
> > --- /tmp/4XFFqK_rs6000.c    2020-04-13 15:28:33.514011024 -0500
> > +++ gcc/config/rs6000/rs6000.c      2020-04-13 14:24:01.296932921 -0500
> > @@ -7047,18 +7047,25 @@ rs6000_adjust_vec_address (rtx scalar_re
> >      element_offset = GEN_INT (INTVAL (element) * scalar_size);
> >    else
> >      {
> > +      /* Mask the element to make sure the element number is between 0 and 
> > the
> > +    maximum number of elements - 1 so that we don't generate an address
> > +    outside the vector.  */
> 
> Hrm, so why do you need to do this here?  It is part of the semantics of
> vec_extract, so shouldn't the RTL already have this masking somewhere
> when we get here?

Yes, as we discussed when it went into the master branch, the PowerPC
vec_extract built-in function explicitly requires the masking, rather than it
being undefined.  Currently, the masking is not done when the built-in is
created, but only when it is split into the smaller insns.

What makes this more complicated that normal is that while we have VEC_SELECT
for the case where the index is constant, VEC_SELECT does not work for a
variable index.  So, we have to have a parallel set of insns that use an
UNSPEC for the variable case.

And we need to use UNSPEC (or VEC_SELECT) before combine, so that the combiner
has a chance to build the alternate insn where the vector is in memory, rather
than only doing the extract once the vector is loaded into a register.

> Nevertheless, the patch is okay for 9, it certainly won't hurt.  Thanks!


-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.ibm.com, phone: +1 (978) 899-4797

Reply via email to