On Wed, Mar 08, 2017 at 11:06:38AM +0100, Jakub Jelinek wrote:
> On Wed, Mar 08, 2017 at 11:03:53AM +0100, Richard Biener wrote:
> > On Wed, 8 Mar 2017, Jakub Jelinek wrote:
> > 
> > > On Wed, Mar 08, 2017 at 09:15:05AM +0100, Richard Biener wrote:
> > > > Ok.  Note that another option for the loopy case is to do
> > > > 
> > > >   for (;;)
> > > >     {
> > > >       vec >> by-one-elt;
> > > >       elt = BIT_FIELD_REF <vec, index-zero>;
> > > >     }
> > > 
> > > Indeed, that is a possibility, but I guess I'd need to construct
> > > the result similarly if resv is non-NULL.  Also, not sure about big endian
> > > vectors and whether BIT_FIELD_REF with zero or size - elt_size is
> > > more efficient there.
> > > 
> > > > In any case, the PR was about s390 without vectors enabled, so this
> > > > wouldn't apply.
> > > 
> > > > when whole-vector shifts are available (they are constructed by
> > > > VEC_PERM_EXPR if vec_perm_const_ok for that mask).  If you end up
> > > > doing variable-index array accesses you can as well spill the
> > > > vector to memory and use memory accesses on that.  Not sure how
> > > > to arrange that from this part of the expander.
> > > 
> > > > Shouldn't something else during the expansion force it into memory if
> > > > it is more efficient to expand it that way?  Apparently it is forced
> > > > into memory
> > 
> > Possibly - but it might end up spilling in the loop itself and thus be
> > rather inefficient?
> 
> Ok.  That said, this is -fsanitize=undefined which slows down code anyway,
> so making it more efficient can be done in GCC8 IMNSHO.

Indeed -- I'd push that out to GCC 8.
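
For reference, here is a rough sketch of the two strategies discussed above,
written as plain C rather than as expander code.  Everything in it is an
assumption made for illustration: the vec4 type, the LANES width and the
function names are hypothetical, and the direction of the element shift
assumes little-endian lane numbering.  It extracts lane zero before each
shift, which visits the same elements as the quoted loop.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define LANES 4  /* hypothetical vector width, for illustration only */

/* Stand-in for a value held in a vector register. */
typedef struct { int lane[LANES]; } vec4;

/* Whole-vector shift by one element: lane i receives lane i + 1 and
   the last lane is filled with zero. */
static vec4 shift_by_one_elt(vec4 v)
{
    vec4 r;
    for (size_t i = 0; i + 1 < LANES; i++)
        r.lane[i] = v.lane[i + 1];
    r.lane[LANES - 1] = 0;
    return r;
}

/* Strategy 1: only ever read lane zero (a constant index), shifting
   the whole vector by one element after each read. */
static void extract_by_shifting(vec4 v, int out[LANES])
{
    assert(out != NULL);
    for (size_t i = 0; i < LANES; i++) {
        out[i] = v.lane[0];
        v = shift_by_one_elt(v);
    }
}

/* Strategy 2: spill the vector to memory once, outside the loop, then
   use ordinary variable-index memory accesses. */
static void extract_by_spilling(vec4 v, int out[LANES])
{
    int mem[LANES];

    assert(out != NULL);
    memcpy(mem, v.lane, sizeof mem);  /* the single spill */
    for (size_t i = 0; i < LANES; i++)
        out[i] = mem[i];
}
```

Both functions produce the same elements in the same order; the difference
is only whether the loop body needs a variable lane index.  The spill in the
second variant sits outside the loop, which is the placement the concern
about spilling inside the loop is asking for.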

        Marek
