> > >
> > > No, I don't think so.  The code that eventually performs a
> > > contiguous sub-group access directly should never extend
> > > the load beyond GROUP_SIZE - or should be gated on the DR
> > > not executed speculatively.  That is, we should "fix" this
> > > elsewhere.
> > >
> >
> > It doesn't, it's just not aligned within the range of GROUP_SIZE
> > from what I remember.
> >
> > > If you have an updated patch I can look at what's wrong here if you
> > > tell me how to reproduce (after applying the patch I suppose).
> >
> > Yes, applying the patch and running:
> >
> > /work/build/gcc/xgcc -B/work/build/gcc/
> /work/gcc/gcc/testsuite/gcc.dg/vect/vect-early-break_26.c  -m64   
> -fdiagnostics-
> plain-output  -flto -ffat-lto-objects -msse2 -ftree-vectorize -fno-tree-loop-
> distribute-patterns -fno-vect-cost-model -fno-common -O2 -fdump-tree-vect-
> details -msse4.1  -lm  -o ./vect-early-break_26.exe
> 
> So it works as in executing fine.  We have a VF of 4 and
> 
> note:   recording new base alignment for &b
>   alignment:    32
>   misalignment: 0
>   based on:     _1 = b[i_32];
> /space/rguenther/src/gcc/gcc/testsuite/gcc.dg/vect/vect-early-break_26.c:35:21:
> note:   recording new base alignment for &a
>   alignment:    32
>   misalignment: 0
>   based on:     _2 = a[i_32];
> /space/rguenther/src/gcc/gcc/testsuite/gcc.dg/vect/vect-early-break_26.c:35:21:
> note:   vect_compute_data_ref_alignment:
> /space/rguenther/src/gcc/gcc/testsuite/gcc.dg/vect/vect-early-break_26.c:35:21:
> note:   alignment increased due to early break to 32 bytes.
> /space/rguenther/src/gcc/gcc/testsuite/gcc.dg/vect/vect-early-break_26.c:35:21:
> missed:   misalign = 8 bytes of ref b[i_32]
> /space/rguenther/src/gcc/gcc/testsuite/gcc.dg/vect/vect-early-break_26.c:35:21:
> note:   vect_compute_data_ref_alignment:
> /space/rguenther/src/gcc/gcc/testsuite/gcc.dg/vect/vect-early-break_26.c:35:21:
> note:   alignment increased due to early break to 32 bytes.
> 
> so no peeling necessary.  But we also have like
> 
> /space/rguenther/src/gcc/gcc/testsuite/gcc.dg/vect/vect-early-break_26.c:35:21:
> missed:   misalign = 12 bytes of ref b[_6]
> /space/rguenther/src/gcc/gcc/testsuite/gcc.dg/vect/vect-early-break_26.c:35:21:
> note:   vect_compute_data_ref_alignment:
> /space/rguenther/src/gcc/gcc/testsuite/gcc.dg/vect/vect-early-break_26.c:35:21:
> note:   alignment increased due to early break to 32 bytes.
> 
> and we are correctly saying we vectorize an unaligned access.
> 
> The "issue" is we're having SLP nodes with a load permutation, their
> expansion might not happen with the whole DR group in mind.  I'd say
> we simply refuse to do early break speculative load vectorization
> for SLP nodes with a load permutation.

This is what I was trying to say on IRC when I mentioned that the permutes
can end up creating an unaligned access wrt to the original address.

But the reason I was still trying to allow this case is because conceptually
my assumption was that the permutes still maintain the access within
the group.  After all, they're just shifting elements around.

In other words, I was assuming that the group a[i] - a[i-2] still stays within
the group alignment of 32-bytes, even if the permute can make the second
load in the group start at say, byte 28.  My assumption was though that it can't
make it start at byte 36.

Are you saying that this is the case? that it can? Then I agree the load 
permutations
on group loads are unsafe to speculate for unmasked loops...

Thanks,
Tamar
> 
> It looks like a latent issue to me which could also interfere with
> gap peeling, I have to dig a bit further what code is responsible
> for the current behavior ...
> 
> 
> 
> > Thanks,
> > Tamar
> >
> > >
> > > > Enforcing the alignment on every group member would be wrong I think 
> > > > since
> > > > that ends up higher overall alignment than they need.
> > > >
> > > > > So besides these issues in get_load_store_type the change looks good 
> > > > > now.
> > > > >
> > > >
> > > > Thanks for the reviews.
> > > >
> > > > Tamar
> > > > > Richard.
> > > > >
> > > > > > +      else
> > > > > > +   *alignment_support_scheme = dr_aligned;
> > > > > > +    }
> > > > > > +
> > > > > >    if (*alignment_support_scheme == dr_unaligned_unsupported)
> > > > > >      {
> > > > > >        if (dump_enabled_p ())
> > > > > > diff --git a/gcc/tree-vectorizer.h b/gcc/tree-vectorizer.h
> > > > > > index
> > > > >
> > >
> b0cb081cba0ae8b11fbfcfcb8c6d440ec451ccb5..97caf61b345735d297ec49fd6ca
> > > > > 64797435b46fc 100644
> > > > > > --- a/gcc/tree-vectorizer.h
> > > > > > +++ b/gcc/tree-vectorizer.h
> > > > > > @@ -1281,7 +1281,11 @@ public:
> > > > > >
> > > > > >    /* Set by early break vectorization when this DR needs peeling 
> > > > > > for
> alignment
> > > > > >       for correctness.  */
> > > > > > -  bool need_peeling_for_alignment;
> > > > > > +  bool safe_speculative_read_required;
> > > > > > +
> > > > > > +  /* Set by early break vectorization when this DR's scalar 
> > > > > > accesses are
> known
> > > > > > +     to be inbounds of a known bounds loop.  */
> > > > > > +  bool scalar_access_known_in_bounds;
> > > > > >
> > > > > >    tree base_decl;
> > > > > >
> > > > > > @@ -1997,6 +2001,35 @@ dr_target_alignment (dr_vec_info *dr_info)
> > > > > >    return dr_info->target_alignment;
> > > > > >  }
> > > > > >  #define DR_TARGET_ALIGNMENT(DR) dr_target_alignment (DR)
> > > > > > +#define DR_SCALAR_KNOWN_BOUNDS(DR) (DR)-
> > > > > >scalar_access_known_in_bounds
> > > > > > +
> > > > > > +/* Return if the stmt_vec_info requires peeling for alignment.  */
> > > > > > +inline bool
> > > > > > +dr_safe_speculative_read_required (stmt_vec_info stmt_info)
> > > > > > +{
> > > > > > +  dr_vec_info *dr_info;
> > > > > > +  if (STMT_VINFO_GROUPED_ACCESS (stmt_info))
> > > > > > +    dr_info = STMT_VINFO_DR_INFO (DR_GROUP_FIRST_ELEMENT
> > > (stmt_info));
> > > > > > +  else
> > > > > > +    dr_info = STMT_VINFO_DR_INFO (stmt_info);
> > > > > > +
> > > > > > +  return dr_info->safe_speculative_read_required;
> > > > > > +}
> > > > > > +
> > > > > > +/* Set the safe_speculative_read_required for the the 
> > > > > > stmt_vec_info, if
> group
> > > > > > +   access then set on the fist element otherwise set on DR 
> > > > > > directly.  */
> > > > > > +inline void
> > > > > > +dr_set_safe_speculative_read_required (stmt_vec_info stmt_info,
> > > > > > +                                  bool requires_alignment)
> > > > > > +{
> > > > > > +  dr_vec_info *dr_info;
> > > > > > +  if (STMT_VINFO_GROUPED_ACCESS (stmt_info))
> > > > > > +    dr_info = STMT_VINFO_DR_INFO (DR_GROUP_FIRST_ELEMENT
> > > (stmt_info));
> > > > > > +  else
> > > > > > +    dr_info = STMT_VINFO_DR_INFO (stmt_info);
> > > > > > +
> > > > > > +  dr_info->safe_speculative_read_required = requires_alignment;
> > > > > > +}
> > > > > >
> > > > > >  inline void
> > > > > >  set_dr_target_alignment (dr_vec_info *dr_info, poly_uint64 val)
> > > > > >
> > > > > >
> > > > > >
> > > > >
> > > > > --
> > > > > Richard Biener <rguent...@suse.de>
> > > > > SUSE Software Solutions Germany GmbH,
> > > > > Frankenstrasse 146, 90461 Nuernberg, Germany;
> > > > > GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG
> > > Nuernberg)
> > > >
> > >
> > > --
> > > Richard Biener <rguent...@suse.de>
> > > SUSE Software Solutions Germany GmbH,
> > > Frankenstrasse 146, 90461 Nuernberg, Germany;
> > > GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG
> Nuernberg)
> >
> 
> --
> Richard Biener <rguent...@suse.de>
> SUSE Software Solutions Germany GmbH,
> Frankenstrasse 146, 90461 Nuernberg, Germany;
> GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)

Reply via email to