RE: [PATCH v2]middle-end: delay checking for alignment to load [PR118464]

Richard Biener Thu, 13 Feb 2025 06:12:18 -0800

On Wed, 12 Feb 2025, Tamar Christina wrote:

> > -----Original Message-----
> > From: Tamar Christina <tamar.christ...@arm.com>
> > Sent: Wednesday, February 12, 2025 3:20 PM
> > To: Richard Biener <rguent...@suse.de>
> > Cc: gcc-patches@gcc.gnu.org; nd <n...@arm.com>
> > Subject: RE: [PATCH v2]middle-end: delay checking for alignment to load
> > [PR118464]
> > 
> > > -----Original Message-----
> > > From: Richard Biener <rguent...@suse.de>
> > > Sent: Wednesday, February 12, 2025 2:58 PM
> > > To: Tamar Christina <tamar.christ...@arm.com>
> > > Cc: gcc-patches@gcc.gnu.org; nd <n...@arm.com>
> > > Subject: Re: [PATCH v2]middle-end: delay checking for alignment to load
> > > [PR118464]
> > >
> > > On Tue, 11 Feb 2025, Tamar Christina wrote:
> > >
> > > > Hi All,
> > > >
> > > > > This fixes two PRs on Early break vectorization by delaying the safety
> > > > > checks to vectorizable_load when the VF, VMAT and vectype are all known.
> > > >
> > > > This patch does add two new restrictions:
> > > >
> > > > > 1. On LOAD_LANES targets, where the buffer size is known, we reject uneven
> > > > >    group sizes, as they are unaligned every n % 2 iterations and so may
> > > > >    cross a page unwittingly.
> > > > >
> > > > > 2. On LOAD_LANES targets when the buffer is unknown, we reject vectorization
> > > > >    if we cannot peel for alignment, as the alignment requirement is quite
> > > > >    large at GROUP_SIZE * vectype_size.  This is unlikely to ever be
> > > > >    beneficial so we don't support it for now.
> > > >
> > > > > There are other steps documented inside the code itself so that the
> > > > > reasoning is next to the code.
> > > >
> > > > > Note that for VLA I have still left this fully disabled when not working
> > > > > on a fixed buffer.
> > > >
> > > > > VLA targets like SVE return element alignment as the desired vector
> > > > > alignment.  This means that the loads are never misaligned and so,
> > > > > annoyingly, the vectorizer won't ever need to peel.
> > > >
> > > > > So what I think needs to happen in GCC 16 is:
> > > >
> > > > 1. during vect_compute_data_ref_alignment we need to take the max of
> > > >    POLY_VALUE_MIN and vector_alignment.
> > > >
> > > > > 2. In vect_do_peeling, define skip_vector when doing PFA for VLA, and in
> > > > >    the guard add a check that ncopies * vectype does not exceed
> > > > >    POLY_VALUE_MAX, which we use as a proxy for pagesize.
> > > > >
> > > > > 3. Force LOOP_VINFO_USING_PARTIAL_VECTORS_P to be true in
> > > > >    vect_determine_partial_vectors_and_peeling since the first iteration has
> > > > >    to be partial.  If LOOP_VINFO_CAN_USE_PARTIAL_VECTORS_P is false we have
> > > > >    to fail to vectorize.
> > > > >
> > > > > 4. Create a default mask to be used, so that
> > > > >    vect_use_loop_mask_for_alignment_p becomes true and we generate the
> > > > >    peeled check through loop control for partial loops.  From what I can
> > > > >    tell this won't work for LOOP_VINFO_FULLY_WITH_LENGTH_P since they don't
> > > > >    have any peeling support at all in the compiler.  That would need to be
> > > > >    done independently from the above.
> > >
> > > We basically need to implement peeling/versioning for alignment based
> > > on the actual POLY value with the fallback being first-fault loads.
> > >
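
As a fixed-width illustration of what peeling for alignment has to compute,
here is a sketch.  `peel_iterations` is a hypothetical helper, not a
vectorizer function, and it assumes a constant power-of-two target alignment;
for VLA the target alignment would be the runtime POLY value instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: number of scalar iterations to peel so that the
   access starting at ADDR becomes aligned to TARGET_ALIGN bytes.
   TARGET_ALIGN must be a power of two and a multiple of ELEM_SIZE, and
   ADDR must itself be a multiple of ELEM_SIZE, otherwise no number of
   whole iterations can reach the requested alignment.  */
static size_t
peel_iterations (uintptr_t addr, size_t elem_size, size_t target_align)
{
  assert (elem_size != 0 && (target_align & (target_align - 1)) == 0);
  assert (target_align % elem_size == 0 && addr % elem_size == 0);
  size_t misalign = addr & (target_align - 1);
  if (misalign == 0)
    return 0;
  return (target_align - misalign) / elem_size;
}
```

For a char loop starting one byte past a 16-byte boundary this yields 15 peeled
iterations, and 0 when the start is already aligned.
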
> > > > > In any case, not GCC 15 material so I've kept the WIP patches I have
> > > > > downstream.
> > > >
> > > > > Bootstrapped and regtested on aarch64-none-linux-gnu,
> > > > > arm-none-linux-gnueabihf, x86_64-pc-linux-gnu -m32, -m64 with no issues.
> > > >
> > > > Ok for master?
> > > >
> > > > Thanks,
> > > > Tamar
> > > >
> > > > gcc/ChangeLog:
> > > >
> > > >         PR tree-optimization/118464
> > > >         PR tree-optimization/116855
> > > > >         * doc/invoke.texi (min-pagesize): Update docs with vectorizer use.
> > > > >         * tree-vect-data-refs.cc (vect_analyze_early_break_dependences): Delay
> > > > >         checks.
> > > > >         (vect_compute_data_ref_alignment): Remove alignment checks and move to
> > > > >         get_load_store_type, increase group access alignment.
> > > > >         (vect_enhance_data_refs_alignment): Add note to comment needing
> > > > >         investigating.
> > > > >         (vect_analyze_data_refs_alignment): Likewise.
> > > > >         (vect_supportable_dr_alignment): For group loads look at first DR.
> > > > >         * tree-vect-stmts.cc (get_load_store_type):
> > > > >         Perform safety checks for early break pfa.
> > > > >         * tree-vectorizer.h (dr_peeling_alignment,
> > > > >         dr_set_peeling_alignment): New.
> > > >
> > > > gcc/testsuite/ChangeLog:
> > > >
> > > >         PR tree-optimization/118464
> > > >         PR tree-optimization/116855
> > > > >         * gcc.dg/vect/bb-slp-pr65935.c: Update, it now vectorizes because the
> > > > >         load type is relaxed later.
> > > > >         * gcc.dg/vect/vect-early-break_121-pr114081.c: Update.
> > > > >         * gcc.dg/vect/vect-early-break_22.c: Reject for load_lanes targets.
> > > >         * g++.dg/vect/vect-early-break_7-pr118464.cc: New test.
> > > >         * gcc.dg/vect/vect-early-break_132-pr118464.c: New test.
> > > >         * gcc.dg/vect/vect-early-break_133_pfa1.c: New test.
> > > >         * gcc.dg/vect/vect-early-break_133_pfa10.c: New test.
> > > >         * gcc.dg/vect/vect-early-break_133_pfa2.c: New test.
> > > >         * gcc.dg/vect/vect-early-break_133_pfa3.c: New test.
> > > >         * gcc.dg/vect/vect-early-break_133_pfa4.c: New test.
> > > >         * gcc.dg/vect/vect-early-break_133_pfa5.c: New test.
> > > >         * gcc.dg/vect/vect-early-break_133_pfa6.c: New test.
> > > >         * gcc.dg/vect/vect-early-break_133_pfa7.c: New test.
> > > >         * gcc.dg/vect/vect-early-break_133_pfa8.c: New test.
> > > >         * gcc.dg/vect/vect-early-break_133_pfa9.c: New test.
> > > >         * g++.dg/ext/pragma-unroll-lambda-lto.C: Add pragma novector.
> > > >         * gcc.dg/tree-ssa/gen-vect-2.c: Likewise.
> > > >         * gcc.dg/tree-ssa/gen-vect-25.c: Likewise.
> > > >         * gcc.dg/tree-ssa/gen-vect-32.c: Likewise.
> > > >         * gcc.dg/tree-ssa/ivopt_mult_2g.c: Likewise.
> > > >         * gcc.dg/tree-ssa/ivopts-5.c: Likewise.
> > > >         * gcc.dg/tree-ssa/ivopts-6.c: Likewise.
> > > >         * gcc.dg/tree-ssa/ivopts-7.c: Likewise.
> > > >         * gcc.dg/tree-ssa/ivopts-8.c: Likewise.
> > > >         * gcc.dg/tree-ssa/ivopts-9.c: Likewise.
> > > >         * gcc.dg/tree-ssa/predcom-dse-1.c: Likewise.
> > > >         * gcc.dg/tree-ssa/predcom-dse-10.c: Likewise.
> > > >         * gcc.dg/tree-ssa/predcom-dse-11.c: Likewise.
> > > >         * gcc.dg/tree-ssa/predcom-dse-12.c: Likewise.
> > > >         * gcc.dg/tree-ssa/predcom-dse-2.c: Likewise.
> > > >         * gcc.dg/tree-ssa/predcom-dse-3.c: Likewise.
> > > >         * gcc.dg/tree-ssa/predcom-dse-4.c: Likewise.
> > > >         * gcc.dg/tree-ssa/predcom-dse-5.c: Likewise.
> > > >         * gcc.dg/tree-ssa/predcom-dse-6.c: Likewise.
> > > >         * gcc.dg/tree-ssa/predcom-dse-7.c: Likewise.
> > > >         * gcc.dg/tree-ssa/predcom-dse-8.c: Likewise.
> > > >         * gcc.dg/tree-ssa/predcom-dse-9.c: Likewise.
> > > >         * gcc.target/i386/pr90178.c: Likewise.
> > > > >         * gcc.dg/vect/vect-early-break_39.c: Update testcase for misalignment.
> > > >
> > > > ---
> > > > > diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> > > > > index 0aef2abf05b9b2f5996de69d5ebc3a21109ee6e1..db00f8b403814b58261849d8917863dc06bbf3e2 100644
> > > > --- a/gcc/doc/invoke.texi
> > > > +++ b/gcc/doc/invoke.texi
> > > > @@ -17256,7 +17256,7 @@ Maximum number of relations the oracle will
> > > register in a basic block.
> > > >  Work bound when discovering transitive relations from existing 
> > > > relations.
> > > >
> > > >  @item min-pagesize
> > > > -Minimum page size for warning purposes.
> > > > +Minimum page size for warning and early break vectorization purposes.
> > > >
> > > >  @item openacc-kernels
> > > >  Specify mode of OpenACC `kernels' constructs handling.
> > > > > diff --git a/gcc/testsuite/g++.dg/ext/pragma-unroll-lambda-lto.C b/gcc/testsuite/g++.dg/ext/pragma-unroll-lambda-lto.C
> > > > > index 0db57c8d3a01985e1e76bb9f8a52613179060f19..5980bf316899553e16d078deee32911f31fafd94 100644
> > > > --- a/gcc/testsuite/g++.dg/ext/pragma-unroll-lambda-lto.C
> > > > +++ b/gcc/testsuite/g++.dg/ext/pragma-unroll-lambda-lto.C
> > > > @@ -10,6 +10,7 @@ inline Iter
> > > >  my_find(Iter first, Iter last, Pred pred)
> > > >  {
> > > >  #pragma GCC unroll 4
> > > > +#pragma GCC novector
> > > >      while (first != last && !pred(*first))
> > > >          ++first;
> > > >      return first;
> > > > > diff --git a/gcc/testsuite/g++.dg/vect/vect-early-break_7-pr118464.cc b/gcc/testsuite/g++.dg/vect/vect-early-break_7-pr118464.cc
> > > > > new file mode 100644
> > > > > index 0000000000000000000000000000000000000000..5e50e56ad17515e278c05c92263af120c3ab2c21
> > > > --- /dev/null
> > > > +++ b/gcc/testsuite/g++.dg/vect/vect-early-break_7-pr118464.cc
> > > > @@ -0,0 +1,23 @@
> > > > +/* { dg-do compile } */
> > > > +/* { dg-add-options vect_early_break } */
> > > > +/* { dg-require-effective-target vect_early_break } */
> > > > +/* { dg-require-effective-target vect_int } */
> > > > +/* { dg-additional-options "-O3" } */
> > > > +
> > > > +#include <cstddef>
> > > > +
> > > > +struct ts1 {
> > > > +  int spans[6][2];
> > > > +};
> > > > +struct gg {
> > > > +  int t[6];
> > > > +};
> > > > +ts1 f(size_t t, struct ts1 *s1, struct gg *s2) {
> > > > +  ts1 ret;
> > > > +  for (size_t i = 0; i != t; i++) {
> > > > +    if (!(i < t)) __builtin_abort();
> > > > +    ret.spans[i][0] = s1->spans[i][0] + s2->t[i];
> > > > +    ret.spans[i][1] = s1->spans[i][1] + s2->t[i];
> > > > +  }
> > > > +  return ret;
> > > > +}
> > > > > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/gen-vect-2.c b/gcc/testsuite/gcc.dg/tree-ssa/gen-vect-2.c
> > > > > index a35999a172ac762bb4873d10b331301750f4015b..00fc8f01991cc994737bc2088e72d85f249bf341 100644
> > > > --- a/gcc/testsuite/gcc.dg/tree-ssa/gen-vect-2.c
> > > > +++ b/gcc/testsuite/gcc.dg/tree-ssa/gen-vect-2.c
> > > > @@ -29,6 +29,7 @@ int main ()
> > > >      }
> > > >
> > > >    /* check results:  */
> > > > +#pragma GCC novector
> > > >    for (i = 0; i < N; i++)
> > > >      {
> > > >        if (ca[i] != cb[i])
> > > > > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/gen-vect-25.c b/gcc/testsuite/gcc.dg/tree-ssa/gen-vect-25.c
> > > > > index 9f14a54c413757df7230b7b6053c83a8a5a1e6c9..99d5e6231ff053089782b52dc6ce9b9ccb8c64a0 100644
> > > > --- a/gcc/testsuite/gcc.dg/tree-ssa/gen-vect-25.c
> > > > +++ b/gcc/testsuite/gcc.dg/tree-ssa/gen-vect-25.c
> > > > @@ -27,6 +27,7 @@ int main_1 (int n, int *p)
> > > >      }
> > > >
> > > >    /* check results:  */
> > > > +#pragma GCC novector
> > > >    for (i = 0; i < N; i++)
> > > >      {
> > > >        if (ia[i] != n)
> > > > @@ -40,6 +41,7 @@ int main_1 (int n, int *p)
> > > >      }
> > > >
> > > >    /* check results:  */
> > > > +#pragma GCC novector
> > > >    for (i = 0; i < N; i++)
> > > >      {
> > > >        if (ib[i] != k)
> > > > > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/gen-vect-32.c b/gcc/testsuite/gcc.dg/tree-ssa/gen-vect-32.c
> > > > > index 62d2b5049fd902047540b90a2ef79b789f903969..1202ec326c7e0020daf58af9544cdbe2b1da4914 100644
> > > > --- a/gcc/testsuite/gcc.dg/tree-ssa/gen-vect-32.c
> > > > +++ b/gcc/testsuite/gcc.dg/tree-ssa/gen-vect-32.c
> > > > @@ -23,6 +23,7 @@ int main ()
> > > >      }
> > > >
> > > >    /* check results:  */
> > > > +#pragma GCC novector
> > > >    for (i = 0; i < N; i++)
> > > >      {
> > > >        if (s.ca[i] != 5)
> > > > > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ivopt_mult_2g.c b/gcc/testsuite/gcc.dg/tree-ssa/ivopt_mult_2g.c
> > > > > index dd06e598f7f48e1a75eba41d626860404325259d..b79bd10585f501992c93648ea1a1f2d2699c07c1 100644
> > > > --- a/gcc/testsuite/gcc.dg/tree-ssa/ivopt_mult_2g.c
> > > > +++ b/gcc/testsuite/gcc.dg/tree-ssa/ivopt_mult_2g.c
> > > > @@ -1,5 +1,5 @@
> > > >  /* { dg-do compile { target {{ i?86-*-* x86_64-*-* } && lp64 } } } */
> > > > -/* { dg-options "-O2 -fgimple -m64 -fdump-tree-ivopts-details" } */
> > > > > +/* { dg-options "-O2 -fgimple -m64 -fdump-tree-ivopts-details -fno-tree-vectorize" } */
> > > >
> > > >  /* Exit tests 'i < N1' and 'p2 > p_limit2' can be replaced, so
> > > >   * two ivs i and p2 can be eliminate.  */
> > > > > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ivopts-5.c b/gcc/testsuite/gcc.dg/tree-ssa/ivopts-5.c
> > > > > index a6af497f4bf7f1ef6c64e09b87931225287d78e0..7b9615f07f3c4af3657eb7d0183c1a51de9fbc42 100644
> > > > --- a/gcc/testsuite/gcc.dg/tree-ssa/ivopts-5.c
> > > > +++ b/gcc/testsuite/gcc.dg/tree-ssa/ivopts-5.c
> > > > @@ -5,6 +5,7 @@ int*
> > > >  foo (int* mem, int sz, int val)
> > > >  {
> > > >    int i;
> > > > +#pragma GCC novector
> > > >    for (i = 0; i < sz; i++)
> > > >      if (mem[i] == val)
> > > >        return &mem[i];
> > > > > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ivopts-6.c b/gcc/testsuite/gcc.dg/tree-ssa/ivopts-6.c
> > > > > index 8383154f99f2559873ef5b3a8fa8119cf679782f..08304293140a82e5484c8399b4374a474c66b34b 100644
> > > > --- a/gcc/testsuite/gcc.dg/tree-ssa/ivopts-6.c
> > > > +++ b/gcc/testsuite/gcc.dg/tree-ssa/ivopts-6.c
> > > > @@ -5,6 +5,7 @@ int*
> > > >  foo (int* mem, int sz, int val)
> > > >  {
> > > >    int i;
> > > > +#pragma GCC novector
> > > >    for (i = 0; i != sz; i++)
> > > >      if (mem[i] == val)
> > > >        return &mem[i];
> > > > > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ivopts-7.c b/gcc/testsuite/gcc.dg/tree-ssa/ivopts-7.c
> > > > > index 44f5603d4f5b8da6c759e8732503638131b0fca8..03160f234f74319cda6d7450788da871ea0cea74 100644
> > > > --- a/gcc/testsuite/gcc.dg/tree-ssa/ivopts-7.c
> > > > +++ b/gcc/testsuite/gcc.dg/tree-ssa/ivopts-7.c
> > > > @@ -5,6 +5,7 @@ int*
> > > >  foo (int* mem, int beg, int end, int val)
> > > >  {
> > > >    int i;
> > > > +#pragma GCC novector
> > > >    for (i = beg; i < end; i++)
> > > >      if (mem[i] == val)
> > > >        return &mem[i];
> > > > > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ivopts-8.c b/gcc/testsuite/gcc.dg/tree-ssa/ivopts-8.c
> > > > > index b2556eaac0d02f65a50bbd532a47fef9c0b1dfa8..a7fd3c9de3746c116dfb73419805fd7ce6e69ffa 100644
> > > > --- a/gcc/testsuite/gcc.dg/tree-ssa/ivopts-8.c
> > > > +++ b/gcc/testsuite/gcc.dg/tree-ssa/ivopts-8.c
> > > > @@ -5,6 +5,7 @@ int*
> > > >  foo (int* mem, char sz, int val)
> > > >  {
> > > >    char i;
> > > > +#pragma GCC novector
> > > >    for (i = 0; i < sz; i++)
> > > >      if (mem[i] == val)
> > > >        return &mem[i];
> > > > > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ivopts-9.c b/gcc/testsuite/gcc.dg/tree-ssa/ivopts-9.c
> > > > > index d26d994f9bd28bc2346a6878d48b159729851ef6..fb9656b88d7bea8a9a84e2ca6ff877a2aac7e05b 100644
> > > > --- a/gcc/testsuite/gcc.dg/tree-ssa/ivopts-9.c
> > > > +++ b/gcc/testsuite/gcc.dg/tree-ssa/ivopts-9.c
> > > > @@ -5,6 +5,7 @@ int*
> > > >  foo (int* mem, unsigned char sz, int val)
> > > >  {
> > > >    unsigned char i;
> > > > +#pragma GCC novector
> > > >    for (i = 0; i < sz; i++)
> > > >      if (mem[i] == val)
> > > >        return &mem[i];
> > > > > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/predcom-dse-1.c b/gcc/testsuite/gcc.dg/tree-ssa/predcom-dse-1.c
> > > > > index a0a04a08c61d48128ad5fd1a11daaf0abc783053..b660f9d258423356a4d73d5996a5f1a8ede9ead9 100644
> > > > --- a/gcc/testsuite/gcc.dg/tree-ssa/predcom-dse-1.c
> > > > +++ b/gcc/testsuite/gcc.dg/tree-ssa/predcom-dse-1.c
> > > > @@ -32,6 +32,7 @@ void check (int *a, int *res, int len)
> > > >  {
> > > >    int i;
> > > >
> > > > +#pragma GCC novector
> > > >    for (i = 0; i < len; i++)
> > > >      if (a[i] != res[i])
> > > >        abort ();
> > > > > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/predcom-dse-10.c b/gcc/testsuite/gcc.dg/tree-ssa/predcom-dse-10.c
> > > > > index f770a8ad812aedee8f65b011134cda91cbe2bf91..8e5a3a434986a31bb635bf3bc1ecc36d463f2ee7 100644
> > > > --- a/gcc/testsuite/gcc.dg/tree-ssa/predcom-dse-10.c
> > > > +++ b/gcc/testsuite/gcc.dg/tree-ssa/predcom-dse-10.c
> > > > @@ -23,6 +23,7 @@ void check (int *a, int *res, int len)
> > > >  {
> > > >    int i;
> > > >
> > > > +#pragma GCC novector
> > > >    for (i = 0; i < len; i++)
> > > >      if (a[i] != res[i])
> > > >        abort ();
> > > > > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/predcom-dse-11.c b/gcc/testsuite/gcc.dg/tree-ssa/predcom-dse-11.c
> > > > > index ed2b96a0d1a4e0c90bf52a83b5f21e2fd1c5a5c5..fd56fd9747e3c572c93107188ede7482ad01bb99 100644
> > > > --- a/gcc/testsuite/gcc.dg/tree-ssa/predcom-dse-11.c
> > > > +++ b/gcc/testsuite/gcc.dg/tree-ssa/predcom-dse-11.c
> > > > > @@ -29,6 +29,7 @@ void check (int *a, int *res, int len, int sum, int val)
> > > >    if (sum != val)
> > > >      abort ();
> > > >
> > > > +#pragma GCC novector
> > > >    for (i = 0; i < len; i++)
> > > >      if (a[i] != res[i])
> > > >        abort ();
> > > > > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/predcom-dse-12.c b/gcc/testsuite/gcc.dg/tree-ssa/predcom-dse-12.c
> > > > > index 2487c1c8205a4f09fd16974f3599ddc8c48b92cf..5eac905aff87e6c4aa4449c689d2594b240fec4e 100644
> > > > --- a/gcc/testsuite/gcc.dg/tree-ssa/predcom-dse-12.c
> > > > +++ b/gcc/testsuite/gcc.dg/tree-ssa/predcom-dse-12.c
> > > > @@ -37,6 +37,7 @@ void check (int *a, int *res, int len, int sval)
> > > >    if (sum != sval)
> > > >      abort ();
> > > >
> > > > +#pragma GCC novector
> > > >    for (i = 0; i < len; i++)
> > > >      if (a[i] != res[i])
> > > >        abort ();
> > > > > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/predcom-dse-2.c b/gcc/testsuite/gcc.dg/tree-ssa/predcom-dse-2.c
> > > > > index 020ca705790d6ace707184c9d2804f3d690de916..801acad33e9d6b7eb17f0cde408903c4f2674acc 100644
> > > > --- a/gcc/testsuite/gcc.dg/tree-ssa/predcom-dse-2.c
> > > > +++ b/gcc/testsuite/gcc.dg/tree-ssa/predcom-dse-2.c
> > > > @@ -32,6 +32,7 @@ void check (int *a, int *res, int len)
> > > >  {
> > > >    int i;
> > > >
> > > > +#pragma GCC novector
> > > >    for (i = 0; i < len; i++)
> > > >      if (a[i] != res[i])
> > > >        abort ();
> > > > > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/predcom-dse-3.c b/gcc/testsuite/gcc.dg/tree-ssa/predcom-dse-3.c
> > > > > index 667cc333d9f2c030474e0b3115c0b86cda733c2e..8b82bdbc0c92cc579824393dc15f2f5a3e5f55e5 100644
> > > > --- a/gcc/testsuite/gcc.dg/tree-ssa/predcom-dse-3.c
> > > > +++ b/gcc/testsuite/gcc.dg/tree-ssa/predcom-dse-3.c
> > > > @@ -40,6 +40,7 @@ void check (int *a, int *res, int len)
> > > >  {
> > > >    int i;
> > > >
> > > > +#pragma GCC novector
> > > >    for (i = 0; i < len; i++)
> > > >      if (a[i] != res[i])
> > > >        abort ();
> > > > > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/predcom-dse-4.c b/gcc/testsuite/gcc.dg/tree-ssa/predcom-dse-4.c
> > > > > index 8118461af0b63d1f9b42879783ae2650a9d9b34a..0d64bc72f82341fd0518a6f59ad2a10aec7b0088 100644
> > > > --- a/gcc/testsuite/gcc.dg/tree-ssa/predcom-dse-4.c
> > > > +++ b/gcc/testsuite/gcc.dg/tree-ssa/predcom-dse-4.c
> > > > @@ -31,6 +31,7 @@ void check (int *a, int *res, int len)
> > > >  {
> > > >    int i;
> > > >
> > > > +#pragma GCC novector
> > > >    for (i = 0; i < len; i++)
> > > >      if (a[i] != res[i])
> > > >        abort ();
> > > > > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/predcom-dse-5.c b/gcc/testsuite/gcc.dg/tree-ssa/predcom-dse-5.c
> > > > > index 03fa646661e2839946e80e0b27ea1d0ea0ef9aeb..7db3bca3b2df98f3c0b3db00be18fc8054644655 100644
> > > > --- a/gcc/testsuite/gcc.dg/tree-ssa/predcom-dse-5.c
> > > > +++ b/gcc/testsuite/gcc.dg/tree-ssa/predcom-dse-5.c
> > > > @@ -33,6 +33,7 @@ void check (int *a, int *res, int len)
> > > >  {
> > > >    int i;
> > > >
> > > > +#pragma GCC novector
> > > >    for (i = 0; i < len; i++)
> > > >      if (a[i] != res[i])
> > > >        abort ();
> > > > > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/predcom-dse-6.c b/gcc/testsuite/gcc.dg/tree-ssa/predcom-dse-6.c
> > > > > index ab2fd403d3005ba06d9992580945ce28f8fb1c09..1267bae5f1c44d60d484cca7d88a5714770f147f 100644
> > > > --- a/gcc/testsuite/gcc.dg/tree-ssa/predcom-dse-6.c
> > > > +++ b/gcc/testsuite/gcc.dg/tree-ssa/predcom-dse-6.c
> > > > @@ -35,6 +35,7 @@ void check (int *a, int *res, int len)
> > > >  {
> > > >    int i;
> > > >
> > > > +#pragma GCC novector
> > > >    for (i = 0; i < len; i++)
> > > >      if (a[i] != res[i])
> > > >        abort ();
> > > > > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/predcom-dse-7.c b/gcc/testsuite/gcc.dg/tree-ssa/predcom-dse-7.c
> > > > > index c746ebd715561eb9f7192a433c321f86e0751eaa..cfe44a06ce4ada6fddc3659ddf748a16904b5d9e 100644
> > > > --- a/gcc/testsuite/gcc.dg/tree-ssa/predcom-dse-7.c
> > > > +++ b/gcc/testsuite/gcc.dg/tree-ssa/predcom-dse-7.c
> > > > @@ -33,6 +33,7 @@ void check (int *a, int *res, int len)
> > > >  {
> > > >    int i;
> > > >
> > > > +#pragma GCC novector
> > > >    for (i = 0; i < len; i++)
> > > >      if (a[i] != res[i])
> > > >        abort ();
> > > > > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/predcom-dse-8.c b/gcc/testsuite/gcc.dg/tree-ssa/predcom-dse-8.c
> > > > > index 6c4e9afa487ed33e4ab5d887640e0efa44a72c6d..646e43d9aad2b235bdae0d9d52df89a3da2dd3e4 100644
> > > > --- a/gcc/testsuite/gcc.dg/tree-ssa/predcom-dse-8.c
> > > > +++ b/gcc/testsuite/gcc.dg/tree-ssa/predcom-dse-8.c
> > > > @@ -31,6 +31,7 @@ void check (int *a, int *res, int len)
> > > >  {
> > > >    int i;
> > > >
> > > > +#pragma GCC novector
> > > >    for (i = 0; i < len; i++)
> > > >      if (a[i] != res[i])
> > > >        abort ();
> > > > > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/predcom-dse-9.c b/gcc/testsuite/gcc.dg/tree-ssa/predcom-dse-9.c
> > > > > index 9c5e8ca9a793b0405e7f448798aa1fac483d2f05..30daf82fac5cef2e26e4597aa4eb10aa33cd0af2 100644
> > > > --- a/gcc/testsuite/gcc.dg/tree-ssa/predcom-dse-9.c
> > > > +++ b/gcc/testsuite/gcc.dg/tree-ssa/predcom-dse-9.c
> > > > @@ -69,6 +69,7 @@ void check (int *a, int *res, int len)
> > > >  {
> > > >    int i;
> > > >
> > > > +#pragma GCC novector
> > > >    for (i = 0; i < len; i++)
> > > >      if (a[i] != res[i])
> > > >        abort ();
> > > > > diff --git a/gcc/testsuite/gcc.dg/vect/bb-slp-pr65935.c b/gcc/testsuite/gcc.dg/vect/bb-slp-pr65935.c
> > > > > index 9ef1330b47c817e16baaafa44c2b15108b9dd3a9..4c8255895b976653228233d93c950629f3231554 100644
> > > > --- a/gcc/testsuite/gcc.dg/vect/bb-slp-pr65935.c
> > > > +++ b/gcc/testsuite/gcc.dg/vect/bb-slp-pr65935.c
> > > > @@ -55,7 +55,9 @@ int main()
> > > >           }
> > > >      }
> > > >    rephase ();
> > > > +#pragma GCC novector
> > > >    for (i = 0; i < 32; ++i)
> > > > +#pragma GCC novector
> > > >      for (j = 0; j < 3; ++j)
> > > >  #pragma GCC novector
> > > >        for (k = 0; k < 3; ++k)
> > > > > diff --git a/gcc/testsuite/gcc.dg/vect/vect-early-break_121-pr114081.c b/gcc/testsuite/gcc.dg/vect/vect-early-break_121-pr114081.c
> > > > > index 423ff0b566b18bf04ce4f67a45b94dc1a021a4a0..5464d1d56fe97542a2dfc7afba39aabc0468737c 100644
> > > > --- a/gcc/testsuite/gcc.dg/vect/vect-early-break_121-pr114081.c
> > > > +++ b/gcc/testsuite/gcc.dg/vect/vect-early-break_121-pr114081.c
> > > > @@ -5,7 +5,8 @@
> > > >  /* { dg-additional-options "-O3" } */
> > > > >  /* { dg-additional-options "-mavx2" { target { x86_64-*-* i?86-*-* } } } */
> > > > >
> > > > > -/* { dg-final { scan-tree-dump "LOOP VECTORIZED" "vect" } } */
> > > > > +/* Arm and -m32 create a group size of 3 here, which we can't support yet.  */
> > > > > +/* { dg-final { scan-tree-dump "LOOP VECTORIZED" "vect" { target { { ! arm*-*-* } || { { x86_64-*-* i?86-*-* } && ilp64 } } } } } */
> > > >
> > > >  typedef struct filter_list_entry {
> > > >    const char *name;
> > > > > diff --git a/gcc/testsuite/gcc.dg/vect/vect-early-break_132-pr118464.c b/gcc/testsuite/gcc.dg/vect/vect-early-break_132-pr118464.c
> > > > > new file mode 100644
> > > > > index 0000000000000000000000000000000000000000..9bf0cbc8853f74de550e8ac83ab569fc9fbde126
> > > > --- /dev/null
> > > > +++ b/gcc/testsuite/gcc.dg/vect/vect-early-break_132-pr118464.c
> > > > @@ -0,0 +1,25 @@
> > > > +/* { dg-do compile } */
> > > > +/* { dg-add-options vect_early_break } */
> > > > +/* { dg-require-effective-target vect_early_break } */
> > > > +/* { dg-require-effective-target vect_int } */
> > > > +/* { dg-additional-options "-O3" } */
> > > > +
> > > > +int a, b, c, d, e, f;
> > > > +short g[1];
> > > > +int main() {
> > > > +  int h;
> > > > +  while (a) {
> > > > +    while (h)
> > > > +      ;
> > > > +    for (b = 2; b; b--) {
> > > > +      while (c)
> > > > +        ;
> > > > +      f = g[a];
> > > > +      if (d)
> > > > +        break;
> > > > +    }
> > > > +    while (e)
> > > > +      ;
> > > > +  }
> > > > +  return 0;
> > > > +}
> > > > > diff --git a/gcc/testsuite/gcc.dg/vect/vect-early-break_133_pfa1.c b/gcc/testsuite/gcc.dg/vect/vect-early-break_133_pfa1.c
> > > > > new file mode 100644
> > > > > index 0000000000000000000000000000000000000000..dc771186efafe25bb65490da7a383ad7f6ceb0a7
> > > > --- /dev/null
> > > > +++ b/gcc/testsuite/gcc.dg/vect/vect-early-break_133_pfa1.c
> > > > @@ -0,0 +1,19 @@
> > > > +/* { dg-do compile } */
> > > > +/* { dg-add-options vect_early_break } */
> > > > +/* { dg-require-effective-target vect_early_break } */
> > > > +/* { dg-require-effective-target vect_int } */
> > > > +/* { dg-additional-options "-O3" } */
> > > > +
> > > > +char string[1020];
> > > > +
> > > > +char * find(int n, char c)
> > > > +{
> > > > +    for (int i = 1; i < n; i++) {
> > > > +        if (string[i] == c)
> > > > +            return &string[i];
> > > > +    }
> > > > +    return 0;
> > > > +}
> > > > +
> > > > +/* { dg-final { scan-tree-dump "LOOP VECTORIZED" "vect" } } */
> > > > > +/* { dg-final { scan-tree-dump "Alignment of access forced using peeling" "vect" } } */
> > > > > diff --git a/gcc/testsuite/gcc.dg/vect/vect-early-break_133_pfa10.c b/gcc/testsuite/gcc.dg/vect/vect-early-break_133_pfa10.c
> > > > > new file mode 100644
> > > > > index 0000000000000000000000000000000000000000..82d473a279ce060c550289c61729d9f9b56f0d2a
> > > > --- /dev/null
> > > > +++ b/gcc/testsuite/gcc.dg/vect/vect-early-break_133_pfa10.c
> > > > @@ -0,0 +1,24 @@
> > > > +/* { dg-add-options vect_early_break } */
> > > > +/* { dg-do compile } */
> > > > +/* { dg-require-effective-target vect_early_break } */
> > > > +/* { dg-require-effective-target vect_int } */
> > > > +
> > > > +/* { dg-additional-options "-Ofast" } */
> > > > +
> > > > > +/* Alignment requirement too big, load lanes targets can't safely vectorize this.  */
> > > > > +/* { dg-final { scan-tree-dump "LOOP VECTORIZED" "vect" { target { ! vect_load_lanes } } } } */
> > > > > +/* { dg-final { scan-tree-dump-not "Alignment of access forced using peeling" "vect" { target { ! vect_load_lanes } } } } */
> > > > +
> > > > > +unsigned test4(char x, char *restrict vect_a, char *restrict vect_b, int n)
> > > > +{
> > > > + unsigned ret = 0;
> > > > + for (int i = 0; i < (n - 2); i+=2)
> > > > + {
> > > > +   if (vect_a[i] > x || vect_a[i+2] > x)
> > > > +     return 1;
> > > > +
> > > > +   vect_b[i] = x;
> > > > +   vect_b[i+1] = x+1;
> > > > + }
> > > > + return ret;
> > > > +}
> > > > > diff --git a/gcc/testsuite/gcc.dg/vect/vect-early-break_133_pfa2.c b/gcc/testsuite/gcc.dg/vect/vect-early-break_133_pfa2.c
> > > > > new file mode 100644
> > > > > index 0000000000000000000000000000000000000000..7d56772fbf380ce42ac758ca29a5f3f9d3f6e0d1
> > > > --- /dev/null
> > > > +++ b/gcc/testsuite/gcc.dg/vect/vect-early-break_133_pfa2.c
> > > > @@ -0,0 +1,19 @@
> > > > +/* { dg-do compile } */
> > > > +/* { dg-add-options vect_early_break } */
> > > > +/* { dg-require-effective-target vect_early_break } */
> > > > +/* { dg-require-effective-target vect_int } */
> > > > +/* { dg-additional-options "-O3" } */
> > > > +
> > > > +char string[1020];
> > > > +
> > > > +char * find(int n, char c)
> > > > +{
> > > > +    for (int i = 0; i < n; i++) {
> > > > +        if (string[i] == c)
> > > > +            return &string[i];
> > > > +    }
> > > > +    return 0;
> > > > +}
> > > > +
> > > > +/* { dg-final { scan-tree-dump "LOOP VECTORIZED" "vect" } } */
> > > > > +/* { dg-final { scan-tree-dump-not "Alignment of access forced using peeling" "vect" } } */
> > > > > diff --git a/gcc/testsuite/gcc.dg/vect/vect-early-break_133_pfa3.c b/gcc/testsuite/gcc.dg/vect/vect-early-break_133_pfa3.c
> > > > > new file mode 100644
> > > > > index 0000000000000000000000000000000000000000..374a051b945e97eedb9be9da423cf54b5e564d6f
> > > > --- /dev/null
> > > > +++ b/gcc/testsuite/gcc.dg/vect/vect-early-break_133_pfa3.c
> > > > @@ -0,0 +1,20 @@
> > > > +/* { dg-do compile } */
> > > > +/* { dg-add-options vect_early_break } */
> > > > +/* { dg-require-effective-target vect_early_break } */
> > > > +/* { dg-require-effective-target vect_int } */
> > > > +/* { dg-additional-options "-O3" } */
> > > > +
> > > > +char string[1020] __attribute__((aligned(1)));
> > > > +
> > > > +char * find(int n, char c)
> > > > +{
> > > > +    for (int i = 1; i < n; i++) {
> > > > +        if (string[i] == c)
> > > > +            return &string[i];
> > > > +    }
> > > > +    return 0;
> > > > +}
> > > > +
> > > > +/* { dg-final { scan-tree-dump "LOOP VECTORIZED" "vect" } } */
> > > > > +/* { dg-final { scan-tree-dump "Alignment of access forced using peeling" "vect" } } */
> > > > > +/* { dg-final { scan-tree-dump "force alignment of string" "vect" } } */
> > > > > diff --git a/gcc/testsuite/gcc.dg/vect/vect-early-break_133_pfa4.c b/gcc/testsuite/gcc.dg/vect/vect-early-break_133_pfa4.c
> > > > > new file mode 100644
> > > > > index 0000000000000000000000000000000000000000..297fb7e9b9beffa25ab8f257ceea1c065fcc6ae9
> > > > --- /dev/null
> > > > +++ b/gcc/testsuite/gcc.dg/vect/vect-early-break_133_pfa4.c
> > > > @@ -0,0 +1,20 @@
> > > > +/* { dg-do compile } */
> > > > +/* { dg-add-options vect_early_break } */
> > > > +/* { dg-require-effective-target vect_early_break } */
> > > > +/* { dg-require-effective-target vect_int } */
> > > > +/* { dg-additional-options "-O3" } */
> > > > +
> > > > +char string[1020] __attribute__((aligned(1)));
> > > > +
> > > > +char * find(int n, char c)
> > > > +{
> > > > +    for (int i = 0; i < n; i++) {
> > > > +        if (string[i] == c)
> > > > +            return &string[i];
> > > > +    }
> > > > +    return 0;
> > > > +}
> > > > +
> > > > +/* { dg-final { scan-tree-dump "LOOP VECTORIZED" "vect" } } */
> > > > > +/* { dg-final { scan-tree-dump-not "Alignment of access forced using peeling" "vect" } } */
> > > > > +/* { dg-final { scan-tree-dump "force alignment of string" "vect" } } */
> > > > > diff --git a/gcc/testsuite/gcc.dg/vect/vect-early-break_133_pfa5.c b/gcc/testsuite/gcc.dg/vect/vect-early-break_133_pfa5.c
> > > > > new file mode 100644
> > > > > index 0000000000000000000000000000000000000000..ca95be44e92e32769da1d1e9b740ae54682a3d55
> > > > --- /dev/null
> > > > +++ b/gcc/testsuite/gcc.dg/vect/vect-early-break_133_pfa5.c
> > > > @@ -0,0 +1,23 @@
> > > > +/* { dg-add-options vect_early_break } */
> > > > +/* { dg-do compile } */
> > > > +/* { dg-require-effective-target vect_early_break } */
> > > > +/* { dg-require-effective-target vect_int } */
> > > > +
> > > > +/* { dg-additional-options "-Ofast" } */
> > > > +
> > > > +/* { dg-final { scan-tree-dump "LOOP VECTORIZED" "vect" } } */
> > > > +
> > > > +unsigned test4(char x, char *vect, int n)
> > > > +{
> > > > + unsigned ret = 0;
> > > > + for (int i = 0; i < n; i++)
> > > > + {
> > > > +   if (vect[i] > x)
> > > > +     return 1;
> > > > +
> > > > +   vect[i] = x;
> > > > + }
> > > > + return ret;
> > > > +}
> > > > +
> > > > +/* { dg-final { scan-tree-dump "Alignment of access forced using 
> > > > peeling"
> > "vect"
> > > } } */
> > > > diff --git a/gcc/testsuite/gcc.dg/vect/vect-early-break_133_pfa6.c
> > > b/gcc/testsuite/gcc.dg/vect/vect-early-break_133_pfa6.c
> > > > new file mode 100644
> > > > index
> > >
> > 0000000000000000000000000000000000000000..ee123df6ed2ba97e92307c
> > > 64a61c97b1b6268743
> > > > --- /dev/null
> > > > +++ b/gcc/testsuite/gcc.dg/vect/vect-early-break_133_pfa6.c
> > > > @@ -0,0 +1,23 @@
> > > > +/* { dg-add-options vect_early_break } */
> > > > +/* { dg-do compile } */
> > > > +/* { dg-require-effective-target vect_early_break } */
> > > > +/* { dg-require-effective-target vect_int } */
> > > > +
> > > > +/* { dg-additional-options "-Ofast" } */
> > > > +
> > > > +/* { dg-final { scan-tree-dump "LOOP VECTORIZED" "vect" } } */
> > > > +
> > > > +unsigned test4(char x, char *vect_a, char *vect_b, int n)
> > > > +{
> > > > + unsigned ret = 0;
> > > > + for (int i = 1; i < n; i++)
> > > > + {
> > > > +   if (vect_a[i] > x || vect_b[i] > x)
> > > > +     return 1;
> > > > +
> > > > +   vect_a[i] = x;
> > > > + }
> > > > + return ret;
> > > > +}
> > > > +
> > > > +/* { dg-final { scan-tree-dump "Versioning for alignment will be 
> > > > applied" "vect"
> > }
> > > } */
> > > > diff --git a/gcc/testsuite/gcc.dg/vect/vect-early-break_133_pfa7.c
> > > b/gcc/testsuite/gcc.dg/vect/vect-early-break_133_pfa7.c
> > > > new file mode 100644
> > > > index
> > >
> > 0000000000000000000000000000000000000000..51bad4e745b67cfdaad20f
> > > 50776299531824ce9c
> > > > --- /dev/null
> > > > +++ b/gcc/testsuite/gcc.dg/vect/vect-early-break_133_pfa7.c
> > > > @@ -0,0 +1,23 @@
> > > > +/* { dg-add-options vect_early_break } */
> > > > +/* { dg-do compile } */
> > > > +/* { dg-require-effective-target vect_early_break } */
> > > > +/* { dg-require-effective-target vect_int } */
> > > > +
> > > > +/* { dg-additional-options "-Ofast" } */
> > > > +
> > > > +/* This should be vectorizable through load_lanes and linear targets.  
> > > > */
> > > > +/* { dg-final { scan-tree-dump "LOOP VECTORIZED" "vect" } } */
> > > > +
> > > > +unsigned test4(char x, char * restrict vect_a, char * restrict vect_b, 
> > > > int n)
> > > > +{
> > > > + unsigned ret = 0;
> > > > + for (int i = 0; i < n; i+=2)
> > > > + {
> > > > +   if (vect_a[i] > x || vect_a[i+1] > x)
> > > > +     return 1;
> > > > +
> > > > +   vect_b[i] = x;
> > > > +   vect_b[i+1] = x+1;
> > > > + }
> > > > + return ret;
> > > > +}
> > > > diff --git a/gcc/testsuite/gcc.dg/vect/vect-early-break_133_pfa8.c
> > > b/gcc/testsuite/gcc.dg/vect/vect-early-break_133_pfa8.c
> > > > new file mode 100644
> > > > index
> > >
> > 0000000000000000000000000000000000000000..dbb14ba3239c91b9bfdf56
> > > cecc60750394e10f2b
> > > > --- /dev/null
> > > > +++ b/gcc/testsuite/gcc.dg/vect/vect-early-break_133_pfa8.c
> > > > @@ -0,0 +1,25 @@
> > > > +/* { dg-add-options vect_early_break } */
> > > > +/* { dg-do compile } */
> > > > +/* { dg-require-effective-target vect_early_break } */
> > > > +/* { dg-require-effective-target vect_int } */
> > > > +
> > > > +/* { dg-additional-options "-Ofast" } */
> > > > +
> > > > +/* { dg-final { scan-tree-dump "LOOP VECTORIZED" "vect" } } */
> > > > +
> > > > +char vect_a[1025];
> > > > +char vect_b[1025];
> > > > +
> > > > +unsigned test4(char x, int n)
> > > > +{
> > > > + unsigned ret = 0;
> > > > + for (int i = 1; i < (n - 2); i+=2)
> > > > + {
> > > > +   if (vect_a[i] > x || vect_a[i+1] > x)
> > > > +     return 1;
> > > > +
> > > > +   vect_b[i] = x;
> > > > +   vect_b[i+1] = x+1;
> > > > + }
> > > > + return ret;
> > > > +}
> > > > diff --git a/gcc/testsuite/gcc.dg/vect/vect-early-break_133_pfa9.c
> > > b/gcc/testsuite/gcc.dg/vect/vect-early-break_133_pfa9.c
> > > > new file mode 100644
> > > > index
> > >
> > 0000000000000000000000000000000000000000..31e209620925353948325
> > > 3efc17499a53d112894
> > > > --- /dev/null
> > > > +++ b/gcc/testsuite/gcc.dg/vect/vect-early-break_133_pfa9.c
> > > > @@ -0,0 +1,28 @@
> > > > +/* { dg-add-options vect_early_break } */
> > > > +/* { dg-do compile } */
> > > > +/* { dg-require-effective-target vect_early_break } */
> > > > +/* { dg-require-effective-target vect_int } */
> > > > +
> > > > +/* { dg-additional-options "-Ofast" } */
> > > > +
> > > > +/* Group size is uneven, load lanes targets can't safely vectorize 
> > > > this.  */
> > > > +/* { dg-final { scan-tree-dump "LOOP VECTORIZED" "vect" } } */
> > > > +/* { dg-final { scan-tree-dump-not "Alignment of access forced using 
> > > > peeling"
> > > "vect" } } */
> > > > +
> > > > +
> > > > +char vect_a[1025];
> > > > +char vect_b[1025];
> > > > +
> > > > +unsigned test4(char x, int n)
> > > > +{
> > > > + unsigned ret = 0;
> > > > + for (int i = 1; i < (n - 2); i+=2)
> > > > + {
> > > > +   if (vect_a[i-1] > x || vect_a[i+2] > x)
> > > > +     return 1;
> > > > +
> > > > +   vect_b[i] = x;
> > > > +   vect_b[i+1] = x+1;
> > > > + }
> > > > + return ret;
> > > > +}
> > > > diff --git a/gcc/testsuite/gcc.dg/vect/vect-early-break_39.c
> > > b/gcc/testsuite/gcc.dg/vect/vect-early-break_39.c
> > > > index
> > >
> > 9d3c6a5dffe3be4a7759b150e330d18144ab5ce5..b3f40b8c9ba49e41bd283e46
> > > a462238c3b5825ef 100644
> > > > --- a/gcc/testsuite/gcc.dg/vect/vect-early-break_39.c
> > > > +++ b/gcc/testsuite/gcc.dg/vect/vect-early-break_39.c
> > > > @@ -23,4 +23,5 @@ unsigned test4(unsigned x, unsigned n)
> > > >   return ret;
> > > >  }
> > > >
> > > > -/* { dg-final { scan-tree-dump "vectorized 1 loops in function" "vect" 
> > > > } } */
> > > > +/* cannot safely vectorize this due to the group misalignment.  */
> > > > +/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 
> > > > 0 "vect" }
> > }
> > > */
> > > > diff --git a/gcc/testsuite/gcc.target/i386/pr90178.c
> > > b/gcc/testsuite/gcc.target/i386/pr90178.c
> > > > index
> > >
> > 1df36af0541c01f3624fe51efbc8cfa0ec67fe60..e9fea04fb148ed53c1ac9b2c6ed7
> > > 3e85ba982b42 100644
> > > > --- a/gcc/testsuite/gcc.target/i386/pr90178.c
> > > > +++ b/gcc/testsuite/gcc.target/i386/pr90178.c
> > > > @@ -4,6 +4,7 @@
> > > >  int*
> > > >  find_ptr (int* mem, int sz, int val)
> > > >  {
> > > > +#pragma GCC novector
> > > >    for (int i = 0; i < sz; i++)
> > > >      if (mem[i] == val)
> > > >        return &mem[i];
> > > > diff --git a/gcc/tree-vect-data-refs.cc b/gcc/tree-vect-data-refs.cc
> > > > index
> > >
> > 6d5854ac7c7a18e09ec7ad72c534abdc55cb6efa..c85df96685f64f9814251f2d4f
> > > dbcc5973f2b513 100644
> > > > --- a/gcc/tree-vect-data-refs.cc
> > > > +++ b/gcc/tree-vect-data-refs.cc
> > > > @@ -731,7 +731,8 @@ vect_analyze_early_break_dependences
> > (loop_vec_info
> > > loop_vinfo)
> > > >           if (is_gimple_debug (stmt))
> > > >             continue;
> > > >
> > > > -         stmt_vec_info stmt_vinfo = loop_vinfo->lookup_stmt (stmt);
> > > > +         stmt_vec_info stmt_vinfo
> > > > +           = vect_stmt_to_vectorize (loop_vinfo->lookup_stmt (stmt));
> > > >           auto dr_ref = STMT_VINFO_DATA_REF (stmt_vinfo);
> > > >           if (!dr_ref)
> > > >             continue;
> > > > @@ -748,26 +749,14 @@ vect_analyze_early_break_dependences
> > > (loop_vec_info loop_vinfo)
> > > >              bounded by VF so accesses are within range.  We only need 
> > > > to check
> > > >              the reads since writes are moved to a safe place where if 
> > > > we get
> > > >              there we know they are safe to perform.  */
> > > > -         if (DR_IS_READ (dr_ref)
> > > > -             && !ref_within_array_bound (stmt, DR_REF (dr_ref)))
> > > > +         if (DR_IS_READ (dr_ref))
> > > >             {
> > > > -             if (STMT_VINFO_GATHER_SCATTER_P (stmt_vinfo)
> > > > -                 || STMT_VINFO_STRIDED_P (stmt_vinfo))
> > > > -               {
> > > > -                 const char *msg
> > > > -                   = "early break not supported: cannot peel "
> > > > -                     "for alignment, vectorization would read out of "
> > > > -                     "bounds at %G";
> > > > -                 return opt_result::failure_at (stmt, msg, stmt);
> > > > -               }
> > > > -
> > > > -             dr_vec_info *dr_info = STMT_VINFO_DR_INFO (stmt_vinfo);
> > > > -             dr_info->need_peeling_for_alignment = true;
> > > > +             dr_set_peeling_alignment (stmt_vinfo, true);
> > > >
> > > >               if (dump_enabled_p ())
> > > >                 dump_printf_loc (MSG_NOTE, vect_location,
> > > > -                                "marking DR (read) as needing peeling 
> > > > for "
> > > > -                                "alignment at %G", stmt);
> > > > +                                "marking DR (read) as possibly needing 
> > > > peeling "
> > > > +                                "for alignment at %G", stmt);
> > > >             }
> > > >
> > > >           if (DR_IS_READ (dr_ref))
> > > > @@ -1326,9 +1315,6 @@ vect_record_base_alignments (vec_info *vinfo)
> > > >     Compute the misalignment of the data reference DR_INFO when 
> > > > vectorizing
> > > >     with VECTYPE.
> > > >
> > > > -   RESULT is non-NULL iff VINFO is a loop_vec_info.  In that case, 
> > > > *RESULT will
> > > > -   be set appropriately on failure (but is otherwise left unchanged).
> > > > -
> > > >     Output:
> > > >     1. initialized misalignment info for DR_INFO
> > > >
> > > > @@ -1337,7 +1323,7 @@ vect_record_base_alignments (vec_info *vinfo)
> > > >
> > > >  static void
> > > >  vect_compute_data_ref_alignment (vec_info *vinfo, dr_vec_info *dr_info,
> > > > -                                tree vectype, opt_result *result = 
> > > > nullptr)
> > > > +                                tree vectype)
> > > >  {
> > > >    stmt_vec_info stmt_info = dr_info->stmt;
> > > >    vec_base_alignments *base_alignments = &vinfo->base_alignments;
> > > > @@ -1365,63 +1351,20 @@ vect_compute_data_ref_alignment (vec_info
> > > *vinfo, dr_vec_info *dr_info,
> > > >      = exact_div (targetm.vectorize.preferred_vector_alignment 
> > > > (vectype),
> > > >                  BITS_PER_UNIT);
> > > >
> > > > -  /* If this DR needs peeling for alignment for correctness, we must
> > > > -     ensure the target alignment is a constant power-of-two multiple 
> > > > of the
> > > > -     amount read per vector iteration (overriding the above hook where
> > > > -     necessary).  */
> > > > -  if (dr_info->need_peeling_for_alignment)
> > > > +  /* If we have a grouped access we require that the alignment be VF * 
> > > > elem.
> > */
> > > > +  if (loop_vinfo
> > > > +      && dr_peeling_alignment (stmt_info)
> > > > +      && STMT_VINFO_GROUPED_ACCESS (stmt_info))
> > > >      {
> > > > -      /* Vector size in bytes.  */
> > > > -      poly_uint64 safe_align = tree_to_poly_uint64 (TYPE_SIZE_UNIT 
> > > > (vectype));
> > > > -
> > > > -      /* We can only peel for loops, of course.  */
> > > > -      gcc_checking_assert (loop_vinfo);
> > > > -
> > > > -      /* Calculate the number of vectors read per vector iteration.  If
> > > > -        it is a power of two, multiply through to get the required
> > > > -        alignment in bytes.  Otherwise, fail analysis since alignment
> > > > -        peeling wouldn't work in such a case.  */
> > > > -      poly_uint64 num_scalars = LOOP_VINFO_VECT_FACTOR (loop_vinfo);
> > > > -      if (STMT_VINFO_GROUPED_ACCESS (stmt_info))
> > > > -       num_scalars *= DR_GROUP_SIZE (stmt_info);
> > > > -
> > > > -      auto num_vectors = vect_get_num_vectors (num_scalars, vectype);
> > > > -      if (!pow2p_hwi (num_vectors))
> > > > -       {
> > > > -         *result = opt_result::failure_at (vect_location,
> > > > -                                           "non-power-of-two num 
> > > > vectors %u "
> > > > -                                           "for DR needing peeling for 
> > > > "
> > > > -                                           "alignment at %G",
> > > > -                                           num_vectors, 
> > > > stmt_info->stmt);
> > > > -         return;
> > > > -       }
> > > > -
> > > > -      safe_align *= num_vectors;
> > > > -      if (maybe_gt (safe_align, 4096U))
> > > > -       {
> > > > -         pretty_printer pp;
> > > > -         pp_wide_integer (&pp, safe_align);
> > > > -         *result = opt_result::failure_at (vect_location,
> > > > -                                           "alignment required for 
> > > > correctness"
> > > > -                                           " (%s) may exceed page 
> > > > size",
> > > > -                                           pp_formatted_text (&pp));
> > > > -         return;
> > > > -       }
> > > > -
> > > > -      unsigned HOST_WIDE_INT multiple;
> > > > -      if (!constant_multiple_p (vector_alignment, safe_align, 
> > > > &multiple)
> > > > -         || !pow2p_hwi (multiple))
> > > > +      poly_uint64 vf = LOOP_VINFO_VECT_FACTOR (loop_vinfo);
> > > > +      vector_alignment
> > > > +       = vf * TREE_INT_CST_LOW (TYPE_SIZE_UNIT (TREE_TYPE (vectype)));
> > >
> > > I think we discussed this before, also when introducing peeling
> > > for alignment support.  This is incorrect for grouped accesses where
> > > the number of scalar elements accessed is GROUP_SIZE * vf, so you
> > > miss a multiplication by GROUP_SIZE here.
> > 
> > Huh, But doesn't your VF already contain your group size? If I have an LD4 
> > on V4SI
> > My VF is 16 isn't it?  Because I still handled 16 elements per iteration.
> > 
> > So why would I need 4 * 16?
> > 
> 
> *sigh* nevermind. I was using this loop to check:
> 
> char vect_a[1025];
> char vect_b[1025];
> unsigned test4(char x, int n)
> {  
>  unsigned ret = 0;
>  for (int i = 1; i < (n - 2); i+=2)
>  {
>    if (vect_a[i] > x || vect_a[i+1] > x)
>      return 1;
>    vect_b[i] = x;
>    vect_b[i+1] = x+1;
>  }
>  return ret;
> }
> 
> And I was expecting 32-bytes alignment, but then misread:
> 
> 
>         .align  4
>         .set    .LANCHOR0,. + 0
>         .type   vect_b, %object
>         .size   vect_b, 1025
> vect_b:
> 
> so yes I need a GROUP_SIZE *.
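
The arithmetic behind that exchange can be written out as a minimal sketch.
It assumes VF = 16 with char elements and a group size of 2 for the loop
quoted above, and it reads `.align 4` as 2^4 = 16 bytes (the power-of-two
form of the directive), which is an assumption about the target.  The helper
names are hypothetical and are not vectorizer API.

```c
#include <assert.h>
#include <stdbool.h>

/* Bytes touched per vector iteration by one (possibly grouped) access.
   Hypothetical helper: a grouped access reads GROUP_SIZE scalars per
   scalar iteration, so the vector iteration covers VF * GROUP_SIZE
   elements, not just VF.  */
static unsigned long
bytes_per_vector_iteration (unsigned long vf, unsigned long group_size,
                            unsigned long elem_size)
{
  assert (vf > 0 && group_size > 0 && elem_size > 0);
  return vf * group_size * elem_size;
}

/* True if X is a nonzero power of two.  */
static bool
is_power_of_two (unsigned long x)
{
  return x != 0 && (x & (x - 1)) == 0;
}
```

With these assumptions, VF * element_size alone gives 16 bytes, which matches
the emitted `.align 4`, while VF * GROUP_SIZE * element_size gives the
expected 32.  A group size of 6 with VF = 4 gives 24, which is not a power of
two even though the group size is even.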
> 
> > >
> > > Note that this (and also your VF * element_size) can result in a
> > > non-power-of-two value.
> > >
> > > That said, I'm quite sure we don't want to have a dr->target_alignment
> > > that isn't power-of-two, so if the computation doesn't end up with a
> > > power-of-two value we should leave it as the target prefers and
> > > fixup (or fail) during vectorizable_load.
> > 
> > Ack I'll round up to power of 2.
> > 
> > >
> > > > +      if (dump_enabled_p ())
> > > >         {
> > > > -         if (dump_enabled_p ())
> > > > -           {
> > > > -             dump_printf_loc (MSG_NOTE, vect_location,
> > > > -                              "forcing alignment for DR from preferred 
> > > > (");
> > > > -             dump_dec (MSG_NOTE, vector_alignment);
> > > > -             dump_printf (MSG_NOTE, ") to safe align (");
> > > > -             dump_dec (MSG_NOTE, safe_align);
> > > > -             dump_printf (MSG_NOTE, ") for stmt: %G", stmt_info->stmt);
> > > > -           }
> > > > -         vector_alignment = safe_align;
> > > > +         dump_printf_loc (MSG_NOTE, vect_location,
> > > > +                          "alignment increased due to early break to 
> > > > ");
> > > > +         dump_dec (MSG_NOTE, vector_alignment);
> > > > +         dump_printf (MSG_NOTE, " bytes.\n");
> > > >         }
> > > >      }
> > > >
> > > > @@ -2487,6 +2430,8 @@ vect_enhance_data_refs_alignment (loop_vec_info
> > > loop_vinfo)
> > > >        || !slpeel_can_duplicate_loop_p (loop, LOOP_VINFO_IV_EXIT 
> > > > (loop_vinfo),
> > > >                                        loop_preheader_edge (loop))
> > > >        || loop->inner
> > > > +      /* We don't currently maintain the LCSSA for prologue peeled 
> > > > inversed
> > > > +        loops.  */
> > > >        || LOOP_VINFO_EARLY_BREAKS_VECT_PEELED (loop_vinfo))
> > > >      do_peeling = false;
> > > >
> > > > @@ -2950,12 +2895,9 @@ vect_analyze_data_refs_alignment (loop_vec_info
> > > loop_vinfo)
> > > >           if (STMT_VINFO_GROUPED_ACCESS (dr_info->stmt)
> > > >               && DR_GROUP_FIRST_ELEMENT (dr_info->stmt) != 
> > > > dr_info->stmt)
> > > >             continue;
> > > > -         opt_result res = opt_result::success ();
> > > > +
> > > >           vect_compute_data_ref_alignment (loop_vinfo, dr_info,
> > > > -                                          STMT_VINFO_VECTYPE 
> > > > (dr_info->stmt),
> > > > -                                          &res);
> > > > -         if (!res)
> > > > -           return res;
> > > > +                                          STMT_VINFO_VECTYPE 
> > > > (dr_info->stmt));
> > > >         }
> > > >      }
> > > >
> > > > @@ -7219,7 +7161,7 @@ vect_supportable_dr_alignment (vec_info *vinfo,
> > > dr_vec_info *dr_info,
> > > >
> > > >    if (misalignment == 0)
> > > >      return dr_aligned;
> > > > -  else if (dr_info->need_peeling_for_alignment)
> > > > +  else if (dr_peeling_alignment (stmt_info))
> > > >      return dr_unaligned_unsupported;
> > > >
> > > >    /* For now assume all conditional loads/stores support unaligned
> > > > diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
> > > > index
> > >
> > 6bbb16beff2c627fca11a7403ba5ee3a5faa21c1..436d373ae6ec06aff165a7bee3
> > > 7b3fa1dc95079b 100644
> > > > --- a/gcc/tree-vect-stmts.cc
> > > > +++ b/gcc/tree-vect-stmts.cc
> > > > @@ -2597,6 +2597,89 @@ get_load_store_type (vec_info  *vinfo,
> > > stmt_vec_info stmt_info,
> > > >        return false;
> > > >      }
> > > >
> > > > +  /* If this DR needs peeling for alignment for correctness, we must
> > > > +     ensure the target alignment is a constant power-of-two multiple 
> > > > of the
> > > > +     amount read per vector iteration (overriding the above hook where
> > > > +     necessary).  */
> > > > +  if (dr_peeling_alignment (stmt_info))
> > > > +    {
> > > > +      /* We can only peel for loops, of course.  */
> > > > +      gcc_checking_assert (loop_vinfo);
> > > > +
> > > > +      /* Check if we support the operation if early breaks are needed. 
> > > >  */
> > > > +      if (LOOP_VINFO_EARLY_BREAKS (loop_vinfo)
> > > > +         && (*memory_access_type == VMAT_GATHER_SCATTER
> > > > +             || *memory_access_type == VMAT_STRIDED_SLP))
> > > > +       {
> > > > +         if (dump_enabled_p ())
> > > > +           dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
> > > > +                            "early break not supported: cannot peel 
> > > > for "
> > > > +                            "alignment. With non-contiguous memory
> > > vectorization"
> > > > +                            " could read out of bounds at %G ",
> > > > +                            STMT_VINFO_STMT (stmt_info));
> > > > +         return false;
> > > > +       }
> > > > +
> > > > +      /* Even if uneven group sizes are aligned on the first load, the 
> > > > second
> > > > +        iteration won't be.  As such reject uneven group sizes.  */
> > > > +      if (STMT_VINFO_GROUPED_ACCESS (stmt_info)
> > > > +         && (DR_GROUP_SIZE (stmt_info) % 2) == 1)
> > >
> > > Hmm, but a group size of 6 is even, but a vector size of four doesn't
> > > make the 2nd aligned.  So we need a power-of-two GROUP_SIZE * VF
> > > and a byte alignment according to that.
> > >
> > 
> > Argg true.
> > 
> > 
> > > > +       {
> > > > +         if (dump_enabled_p ())
> > > > +           dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
> > > > +                            "early break not supported: uneven group 
> > > > size, "
> > > > +                            "vectorization could read out of bounds at 
> > > > %G ",
> > > > +                            STMT_VINFO_STMT (stmt_info));
> > > > +         return false;
> > > > +       }
> > > > +
> > > > +      /* Vector size in bytes.  */
> > > > +      poly_uint64 safe_align;
> > > > +      if (nunits.is_constant ())
> > > > +       safe_align = tree_to_poly_uint64 (TYPE_SIZE_UNIT (vectype));
> > > > +      else
> > > > +       safe_align = estimated_poly_value (LOOP_VINFO_VECT_FACTOR
> > > (loop_vinfo),
> > > > +                                          POLY_VALUE_MAX);
> > > > +
> > > > +      auto num_vectors = ncopies;
> > > > +      if (!pow2p_hwi (num_vectors))
> > > > +       {
> > > > +         if (dump_enabled_p ())
> > > > +           dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
> > > > +                            "non-power-of-two num vectors %u "
> > > > +                            "for DR needing peeling for "
> > > > +                            "alignment at %G",
> > > > +                            num_vectors, STMT_VINFO_STMT (stmt_info));
> > > > +         return false;
> > > > +       }
> > > > +
> > > > +      safe_align *= num_vectors;
> > > > +      bool inbounds
> > > > +       = ref_within_array_bound (STMT_VINFO_STMT (stmt_info),
> > > > +                                 DR_REF (STMT_VINFO_DATA_REF 
> > > > (stmt_info)));
> > >
> > > I'm again confused why you think ref_within_array_bound can be used to
> > > validize anything?
> > 
> > The goal here is that if we have a value that is aligned, we don't exceed
> > the page, and we're able to increase the target alignment, then VLA is
> > safe to use.
> > 
> > Since we can't peel for SVE, any unaligned value can't be handled later.
> > But as I mentioned, to SVE nothing is unaligned, as the target alignment
> > is the element size.
> > 
> > For non-SVE, unaligned accesses get processed later on.  For SVE you just
> > generate wrong code.  Since we know we can realign known-sized buffers
> > (since, well, they're .data), we know we can let it through.
> > 
> > I don't really know of a better way to do this.  Since again, for VLA the
> > backend never returns that the load is misaligned, so it doesn't stop
> > vectorization.
> > 
> > The inbounds is used as a proxy for that.  Which is what the comment below
> > was trying to explain.
> 
> To explain this some more.
> 
> This loop is clearly safe to vectorize:
> 
> char vect_a[1025];
> char vect_b[1025];
> unsigned test4(char x, int n)
> {  
>  unsigned ret = 0;
>  for (int i = 0; i < 1023; i+=2)
>  {
>    if (vect_a[i] > x || vect_a[i+1] > x)
>      return 1;
>    vect_b[i] = x;
>    vect_b[i+1] = x+1;
>  }
>  return ret;
> }
> 
> But I cannot distinguish between this case and cases where we've established
> it's unsafe, as you need to peel for alignment or increase the data alignment.
> 
> SVE will always return an alignment of the element type.  This means that
> after vect_enhance_data_refs_alignment the calculated misalignment is always
> zero, which means that dr_peeling_alignment is always ignored in
> vect_supportable_dr_alignment, which means the code is never marked as
> needing peeling for alignment, *nor* needing versioning.
> 
> This did not seem right to me.  As a result, since the known-size cases are
> always OK (you can never overread the size of the buffer, as the predicate
> would just make the last iteration partial), the known inbounds cases are
> safe for VLA due to predication.


Ah, so you bring up predication again.  Indeed, predication should
guarantee that accesses that were in-bounds in the scalar code
stay in-bounds with loop-predicated vector code.
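
To illustrate the claim, here is a minimal scalar model of a vector loop, not
real vector code.  It assumes a hypothetical fixed VF of 8 and tracks the
highest element index that each variant would load; the function names are
made up for the example.

```c
#include <assert.h>

#define VF 8   /* hypothetical vectorization factor, for illustration only */

/* Highest index loaded by a full-width vector loop over [0, n):
   every lane of every chunk is read, including lanes at or past n.
   Returns -1 if nothing is read.  */
static int
max_index_unpredicated (int n)
{
  assert (n >= 0);
  int max = -1;
  for (int i = 0; i < n; i += VF)
    max = i + VF - 1;
  return max;
}

/* Highest index loaded when each lane is guarded by a loop mask that is
   only active for indices below n, as loop predication would do.  */
static int
max_index_predicated (int n)
{
  assert (n >= 0);
  int max = -1;
  for (int i = 0; i < n; i += VF)
    for (int lane = 0; lane < VF; lane++)
      if (i + lane < n)
        max = i + lane;
  return max;
}
```

For n = 13 the full-width variant touches index 15, past the last element the
scalar loop could read, while the masked variant stops at index 12.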

I think we can make that argument in general, not just for SVE, and thus
instead of requiring alignment, require predication via
LOOP_VINFO_MUST_USE_PARTIAL_VECTORS_P.  That said,
we should mark DRs that we might access speculatively due to early-break
vectorization.  We now use ->need_peeling_for_alignment, which IMO is
a bad name for some of the places where your patch checks it.  I'd say
rename that to ->safe_speculative_read_required or so, and make
that an incentive for alignment peeling to try aligning it.  Then make
get_load_store_type check whether such an access is aligned according
to VF * group_size * element_size; if it is not, _and_ the scalar access
is statically known to be in-bounds, require
LOOP_VINFO_MUST_USE_PARTIAL_VECTORS_P.  If neither holds, fail.
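
As a rough sketch of that three-way decision, written as standalone C rather
than against the vectorizer's data structures: the enum, the function and its
parameters are all hypothetical, and the power-of-two requirement on the
product is carried over from the earlier comment in this thread rather than
from any existing code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical outcome of checking one speculatively read DR.  */
enum spec_read_decision
{
  SPEC_READ_OK,                    /* sufficiently aligned, nothing more needed */
  SPEC_READ_NEEDS_PARTIAL_VECTORS, /* rely on loop predication instead */
  SPEC_READ_FAIL                   /* cannot prove the speculative read safe */
};

/* KNOWN_ALIGN is the alignment in bytes that the access is known to have.
   The access counts as aligned only if VF * GROUP_SIZE * ELEM_SIZE is a
   power of two and KNOWN_ALIGN is a multiple of it.  */
static enum spec_read_decision
classify_speculative_read (unsigned long known_align, unsigned long vf,
                           unsigned long group_size, unsigned long elem_size,
                           bool scalar_in_bounds)
{
  assert (known_align > 0 && vf > 0 && group_size > 0 && elem_size > 0);
  unsigned long required = vf * group_size * elem_size;
  bool pow2 = (required & (required - 1)) == 0;

  if (pow2 && known_align % required == 0)
    return SPEC_READ_OK;
  if (scalar_in_bounds)
    return SPEC_READ_NEEDS_PARTIAL_VECTORS;
  return SPEC_READ_FAIL;
}
```

For example, with VF = 16, a group size of 2 and char elements, a 32-byte
aligned access is accepted outright, a 16-byte aligned access into a known
in-bounds array falls back to partial vectors, and the same access through an
unknown pointer is rejected.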

Note there's the case of a gap at the end (peeling for gaps), which
is not solved by loop masking (I'm not 100% sure the current
condition captures the VLA issue fully, but ...).

> Increasing alignment also doesn't really work for VLA because you'd have to
> way overalign to get a safe value (i.e. align to maximum vector size), which
> for e.g. an LD2 already doesn't make much sense.

Yes, we know that for VLA we need to read the architectural vector size
to compute the desired alignment.  And masking might not do it when
we need multiple vectors for an access(?).
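
To put numbers on the overalignment, a small sketch assuming the SVE
architectural maximum of 2048 bits (256 bytes) per vector and a 4096-byte
minimum page size as used in the checks quoted in this thread.  The macro and
function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_VECTOR_BYTES 256UL   /* assumed SVE maximum: 2048 bits */
#define MIN_PAGE_BYTES   4096UL  /* assumed minimum page size */

/* Worst-case alignment needed so that an access made of
   VECTORS_PER_ACCESS maximum-length vectors never crosses a boundary
   of that size.  */
static unsigned long
vla_worst_case_alignment (unsigned long vectors_per_access)
{
  assert (vectors_per_access > 0);
  return MAX_VECTOR_BYTES * vectors_per_access;
}

/* True if the required alignment does not exceed the minimum page size.  */
static bool
alignment_fits_in_page (unsigned long align)
{
  return align <= MIN_PAGE_BYTES;
}
```

Under these assumptions an LD2-style access needs 512-byte alignment
regardless of the vector length the hardware actually implements, which is
the overalignment being discussed.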
 
> At least, this is what I was trying to explain in the comment below its use.
> As far as I can tell, the only case where an inbounds VLA case would fault
> is an incorrect program where scalar just happened to exit before reading
> memory we were told was safe to read.
>
> 
> Thanks,
> Tamar
> > 
> > >
> > > > +      /* For VLA we have to insert a runtime check that the vector 
> > > > loads
> > > > +        per iterations don't exceed a page size.  For now we can use
> > > > +        POLY_VALUE_MAX as a proxy as we can't peel for VLA.  */
> > > > +      if (maybe_gt (safe_align, (unsigned)param_min_pagesize)
> > > > +         /* We don't support PFA for VLA at the moment.  Some targets 
> > > > like SVE
> > > > +            return a target alignment requirement of a single element. 
> > > >  For
> > > > +            early break this is potentially unsafe so we can't count on
> > > > +            alignment rejecting such loops later as it thinks loads 
> > > > are never
> > > > +            misaligned.  */
> > > > +         || (!nunits.is_constant () && !inbounds))
> > > > +       {
> > > > +         if (dump_enabled_p ())
> > > > +           {
> > > > +             dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
> > > > +                              "alignment required for correctness (");
> > > > +             dump_dec (MSG_MISSED_OPTIMIZATION, safe_align);
> > > > +             dump_printf (MSG_NOTE, ") may exceed page size\n");
> > > > +           }
> > > > +         return false;
> > > > +       }
> > > > +      *alignment_support_scheme = dr_unaligned_supported;
> > >
> > > and the only thing should be *alignment_support_scheme == dr_aligned,
> > > > and with a possibly too low target_alignment even that's not enough.
> > >
> > 
> > Fair, but you shouldn't be able to get there with a too low target 
> > alignment though.
> > 
> > > Can you split out the testsuite part that just adds #pragma GCC novector?
> > > That part is OK.
> > >
> > 
> > Ok,
> > Tamar
> > 
> > > Thanks,
> > > Richard.
> > >
> > > > +    }
> > > > +
> > > >    if (*alignment_support_scheme == dr_unaligned_unsupported)
> > > >      {
> > > >        if (dump_enabled_p ())
> > > > diff --git a/gcc/tree-vectorizer.h b/gcc/tree-vectorizer.h
> > > > index
> > >
> > b0cb081cba0ae8b11fbfcfcb8c6d440ec451ccb5..aeaf714c155bc2d87bf50e6dba
> > > 0dbfbcca027441 100644
> > > > --- a/gcc/tree-vectorizer.h
> > > > +++ b/gcc/tree-vectorizer.h
> > > > @@ -1998,6 +1998,33 @@ dr_target_alignment (dr_vec_info *dr_info)
> > > >  }
> > > >  #define DR_TARGET_ALIGNMENT(DR) dr_target_alignment (DR)
> > > >
> > > > +/* Return if the stmt_vec_info requires peeling for alignment.  */
> > > > +inline bool
> > > > +dr_peeling_alignment (stmt_vec_info stmt_info)
> > > > +{
> > > > +  dr_vec_info *dr_info;
> > > > +  if (STMT_VINFO_GROUPED_ACCESS (stmt_info))
> > > > +    dr_info = STMT_VINFO_DR_INFO (DR_GROUP_FIRST_ELEMENT
> > (stmt_info));
> > > > +  else
> > > > +    dr_info = STMT_VINFO_DR_INFO (stmt_info);
> > > > +
> > > > +  return dr_info->need_peeling_for_alignment;
> > > > +}
> > > > +
> > > > +/* Set the need_peeling_for_alignment for the stmt_vec_info, if group
> > > > +   access then set on the first element otherwise set on DR directly.  */
> > > > +inline void
> > > > +dr_set_peeling_alignment (stmt_vec_info stmt_info, bool 
> > > > requires_alignment)
> > > > +{
> > > > +  dr_vec_info *dr_info;
> > > > +  if (STMT_VINFO_GROUPED_ACCESS (stmt_info))
> > > > +    dr_info = STMT_VINFO_DR_INFO (DR_GROUP_FIRST_ELEMENT
> > (stmt_info));
> > > > +  else
> > > > +    dr_info = STMT_VINFO_DR_INFO (stmt_info);
> > > > +
> > > > +  dr_info->need_peeling_for_alignment = requires_alignment;
> > > > +}
> > > > +
> > > >  inline void
> > > >  set_dr_target_alignment (dr_vec_info *dr_info, poly_uint64 val)
> > > >  {
> > > >
> > > >
> > > >
> > > >
> > > >
> > >
> > > --
> > > Richard Biener <rguent...@suse.de>
> > > SUSE Software Solutions Germany GmbH,
> > > Frankenstrasse 146, 90461 Nuernberg, Germany;
> > > GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)
> 

-- 
Richard Biener <rguent...@suse.de>
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)
