On Thu, Apr 16, 2020 at 11:31:17AM -0500, Segher Boessenkool wrote: > Hi! > > On Wed, Apr 15, 2020 at 09:37:32PM -0400, Michael Meissner wrote: > > Fix regression caused by PR target/93932 backport. > > > > When I back ported the fix for PR target/93932 to the GCC 9 branch, I put > > in an > > unintended regression when the GCC compiler is optimizing the vec_extract > > built-in function, and the vector element is in memory, and the index is > > variable. This patch masks the vector index so that it does not go out of > > bounds. > > > > 2020-04-15 Michael Meissner <meiss...@linux.ibm.com> > > > > PR target/94557 > > * config/rs6000/rs6000.c (rs6000_adjust_vec_address): Fix > > regression caused by PR target/93932 backport. Mask variable > > vector extract index so it does not go beyond the vector when > > extracting a vector element from memory. > > Much better, thanks! > > > --- /tmp/4XFFqK_rs6000.c 2020-04-13 15:28:33.514011024 -0500 > > +++ gcc/config/rs6000/rs6000.c 2020-04-13 14:24:01.296932921 -0500 > > @@ -7047,18 +7047,25 @@ rs6000_adjust_vec_address (rtx scalar_re > > element_offset = GEN_INT (INTVAL (element) * scalar_size); > > else > > { > > + /* Mask the element to make sure the element number is between 0 and > > the > > + maximum number of elements - 1 so that we don't generate an address > > + outside the vector. */ > > Hrm, so why do you need to do this here? It is part of the semantics of > vec_extract, so shouldn't the RTL already have this masking somewhere > when we get here?
Yes, as we discussed when it went into the master branch, the PowerPC vec_extract built-in function explicitly requires the masking, rather than it being undefined. Currently, the masking is not done when the built-in is created, but only when it is split into the smaller insns. What makes this more complicated that normal is that while we have VEC_SELECT for the case where the index is constant, VEC_SELECT does not work for a variable index. So, we have to have a parallel set of insns that use an UNSPEC for the variable case. And we need to use UNSPEC (or VEC_SELECT) before combine, so that the combiner has a chance to build the alternate insn where the vector is in memory, rather than only doing the extract once the vector is loaded into a register. > Nevertheless, the patch is okay for 9, it certainly won't hurt. Thanks! -- Michael Meissner, IBM IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA email: meiss...@linux.ibm.com, phone: +1 (978) 899-4797