Hi All,

This fixes two PRs on early break vectorization by delaying the safety checks
until vectorizable_load, at which point the VF, VMAT and vectype are all known.

This patch adds two new restrictions:

1. On LOAD_LANES targets where the buffer size is known, we reject uneven
   group sizes, as the accesses become unaligned on every other iteration
   and so may unwittingly cross a page boundary.

2. On LOAD_LANES targets where the buffer size is unknown, we reject
   vectorization if we cannot peel for alignment, as the alignment requirement
   is quite large at GROUP_SIZE * vectype_size.  This is unlikely to ever be
   beneficial so we don't support it for now (a sketch illustrating both
   restrictions follows).
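
To make these concrete, here is a minimal sketch of the problem case for (1)
(mine, not from the patch; the shapes are illustrative):

  /* Early-break loop with a grouped load of group size 3.  Each scalar
     iteration reads a[3*i], a[3*i+1] and a[3*i+2] before the break, so
     a vector iteration reads 3 * VF contiguous elements.  Since 3 is
     not a power of two, an aligned first access does not keep later
     accesses aligned, and a speculative vector load can cross a page.  */
  unsigned
  search3 (char *a, char x, int n)
  {
    for (int i = 0; i < n; i++)
      if (a[3 * i] > x || a[3 * i + 1] > x || a[3 * i + 2] > x)
        return 1;
    return 0;
  }

For (2), with for instance VF = 16 char elements and a group size of 4, the
required alignment is 16 * 1 * 4 = 64 bytes, which peeling can only guarantee
when the buffer's base alignment can be forced that high.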

There are other checks documented inside the code itself so that the reasoning
sits next to the code.

Note that for VLA I have still left this fully disabled when not working on a
fixed-size buffer.

For VLA targets like SVE we return the element alignment as the desired vector
alignment.  This means that the loads are never misaligned and so, annoyingly,
it won't ever need to peel.
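
A minimal sketch of what such a hook could look like (hypothetical, not the
actual SVE implementation):

  /* Hypothetical TARGET_VECTORIZE_PREFERRED_VECTOR_ALIGNMENT: for
     variable-length vector types request only the element alignment,
     so speculative early-break loads are never considered misaligned
     and no peeling is requested.  */
  static poly_uint64
  example_preferred_vector_alignment (const_tree vectype)
  {
    if (!TYPE_VECTOR_SUBPARTS (vectype).is_constant ())
      return TYPE_ALIGN (TREE_TYPE (vectype));
    return TYPE_ALIGN (vectype);
  }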

So what I think needs to happen in GCC 16 is the following:

1. During vect_compute_data_ref_alignment we need to take the max of
   POLY_VALUE_MIN and vector_alignment.

2. In vect_do_peeling, define skip_vector when peeling for alignment (PFA) for
   VLA, and in the guard add a check that ncopies * vectype does not exceed
   POLY_VALUE_MAX, which we use as a proxy for the page size.

3. Force LOOP_VINFO_USING_PARTIAL_VECTORS_P to be true in
   vect_determine_partial_vectors_and_peeling since the first iteration has to
   be partial.  If LOOP_VINFO_CAN_USE_PARTIAL_VECTORS_P is false we have to
   fail to vectorize.

4. Create a default mask to be used, so that vect_use_loop_mask_for_alignment_p
   becomes true and we generate the peeled check through loop control for
   partial loops (a rough scalar model of this is sketched after this list).
   From what I can tell this won't work for LOOP_VINFO_FULLY_WITH_LENGTH_P
   since they don't have any peeling support at all in the compiler.  That
   would need to be done independently from the above.
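
As a rough scalar model of step 4 (illustrative only; this is neither the
generated IL nor the downstream WIP patches): the first vector iteration runs
under a partial mask that covers just the lanes up to the next aligned
boundary, after which every iteration is full and starts aligned.

  #include <stddef.h>
  #include <stdint.h>

  /* Scalar model of peeling for alignment via loop masking.  The
     first iteration processes only enough lanes to reach an aligned
     address; all later iterations start aligned.  'vf' stands in for
     both the lane count and the required alignment in bytes.  */
  static void
  process (char *a, size_t n, size_t vf)
  {
    size_t head = (uintptr_t) a % vf;
    size_t first = head ? vf - head : vf;     /* lanes active in iter 0 */
    for (size_t i = 0, len = first; i < n; len = vf)
      {
        size_t m = len < n - i ? len : n - i; /* the "mask" */
        for (size_t j = 0; j < m; j++)        /* one masked vector op */
          a[i + j] += 1;
        i += m;
      }
  }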

In any case, this is not GCC 15 material, so I've kept the WIP patches I have
downstream.

Bootstrapped and regtested on aarch64-none-linux-gnu,
arm-none-linux-gnueabihf, and x86_64-pc-linux-gnu
(-m32 and -m64) with no issues.

Ok for master?

Thanks,
Tamar

gcc/ChangeLog:

        PR tree-optimization/118464
        PR tree-optimization/116855
        * doc/invoke.texi (min-pagesize): Update docs with vectorizer use.
        * tree-vect-data-refs.cc (vect_analyze_early_break_dependences): Delay
        checks.
        (vect_compute_data_ref_alignment): Remove alignment checks and move to
        get_load_store_type, increase group access alignment.
        (vect_enhance_data_refs_alignment): Add note to comment needing
        investigation.
        (vect_analyze_data_refs_alignment): Likewise.
        (vect_supportable_dr_alignment): For group loads look at first DR.
        * tree-vect-stmts.cc (get_load_store_type): Perform safety
        checks for early break pfa.
        * tree-vectorizer.h (dr_set_safe_speculative_read_required,
        dr_safe_speculative_read_required, DR_SCALAR_KNOWN_BOUNDS): New.
        (need_peeling_for_alignment): Renamed to...
        (safe_speculative_read_required): ...this.
        (class dr_vec_info): Add scalar_access_known_in_bounds.

gcc/testsuite/ChangeLog:

        PR tree-optimization/118464
        PR tree-optimization/116855
        * gcc.dg/vect/bb-slp-pr65935.c: Update, it now vectorizes because the
        load type is relaxed later.
        * gcc.dg/vect/vect-early-break_121-pr114081.c: Update.
        * gcc.dg/vect/vect-early-break_22.c: Reject for load_lanes targets.
        * g++.dg/vect/vect-early-break_7-pr118464.cc: New test.
        * gcc.dg/vect/vect-early-break_132-pr118464.c: New test.
        * gcc.dg/vect/vect-early-break_133_pfa1.c: New test.
        * gcc.dg/vect/vect-early-break_133_pfa10.c: New test.
        * gcc.dg/vect/vect-early-break_133_pfa2.c: New test.
        * gcc.dg/vect/vect-early-break_133_pfa3.c: New test.
        * gcc.dg/vect/vect-early-break_133_pfa4.c: New test.
        * gcc.dg/vect/vect-early-break_133_pfa5.c: New test.
        * gcc.dg/vect/vect-early-break_133_pfa6.c: New test.
        * gcc.dg/vect/vect-early-break_133_pfa7.c: New test.
        * gcc.dg/vect/vect-early-break_133_pfa8.c: New test.
        * gcc.dg/vect/vect-early-break_133_pfa9.c: New test.
        * gcc.dg/vect/vect-early-break_39.c: Update testcase for misalignment.
        * gcc.dg/vect/vect-early-break_53.c: Likewise.
        * gcc.dg/vect/vect-early-break_56.c: Likewise.
        * gcc.dg/vect/vect-early-break_57.c: Likewise.
        * gcc.dg/vect/vect-early-break_81.c: Likewise.

---
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index ca8e468f3f2dbf68c959f74d8fec48c79463504d..fff2874b326d605fc2656adf4ab6eb5bd5d42d71 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -17256,7 +17256,7 @@ Maximum number of relations the oracle will register in a basic block.
 Work bound when discovering transitive relations from existing relations.
 
 @item min-pagesize
-Minimum page size for warning purposes.
+Minimum page size for warning and early break vectorization purposes.
 
 @item openacc-kernels
 Specify mode of OpenACC `kernels' constructs handling.
diff --git a/gcc/testsuite/gcc.dg/vect/bb-slp-pr65935.c b/gcc/testsuite/gcc.dg/vect/bb-slp-pr65935.c
index 9ef1330b47c817e16baaafa44c2b15108b9dd3a9..4c8255895b976653228233d93c950629f3231554 100644
--- a/gcc/testsuite/gcc.dg/vect/bb-slp-pr65935.c
+++ b/gcc/testsuite/gcc.dg/vect/bb-slp-pr65935.c
@@ -55,7 +55,9 @@ int main()
          }
     }
   rephase ();
+#pragma GCC novector
   for (i = 0; i < 32; ++i)
+#pragma GCC novector
     for (j = 0; j < 3; ++j)
 #pragma GCC novector
       for (k = 0; k < 3; ++k)
diff --git a/gcc/testsuite/gcc.dg/vect/vect-early-break_121-pr114081.c b/gcc/testsuite/gcc.dg/vect/vect-early-break_121-pr114081.c
index 423ff0b566b18bf04ce4f67a45b94dc1a021a4a0..f99c57be0adc4d49035b8a75c72d4a5b04cc05c7 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-early-break_121-pr114081.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-early-break_121-pr114081.c
@@ -5,7 +5,8 @@
 /* { dg-additional-options "-O3" } */
 /* { dg-additional-options "-mavx2" { target { x86_64-*-* i?86-*-* } } } */
 
-/* { dg-final { scan-tree-dump "LOOP VECTORIZED" "vect" } } */
+/* Arm and -m32 create a group size of 3 here, which we can't support yet. AArch64 makes elementwise accesses here.  */
+/* { dg-final { scan-tree-dump "LOOP VECTORIZED" "vect" { target { aarch64*-*-* } } } } */
 
 typedef struct filter_list_entry {
   const char *name;
diff --git a/gcc/testsuite/gcc.dg/vect/vect-early-break_132-pr118464.c b/gcc/testsuite/gcc.dg/vect/vect-early-break_132-pr118464.c
new file mode 100644
index 0000000000000000000000000000000000000000..9bf0cbc8853f74de550e8ac83ab569fc9fbde126
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/vect-early-break_132-pr118464.c
@@ -0,0 +1,25 @@
+/* { dg-do compile } */
+/* { dg-add-options vect_early_break } */
+/* { dg-require-effective-target vect_early_break } */
+/* { dg-require-effective-target vect_int } */
+/* { dg-additional-options "-O3" } */
+
+int a, b, c, d, e, f;
+short g[1];
+int main() {
+  int h;
+  while (a) {
+    while (h)
+      ;
+    for (b = 2; b; b--) {
+      while (c)
+        ;
+      f = g[a];
+      if (d)
+        break;
+    }
+    while (e)
+      ;
+  }
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.dg/vect/vect-early-break_133_pfa1.c b/gcc/testsuite/gcc.dg/vect/vect-early-break_133_pfa1.c
new file mode 100644
index 0000000000000000000000000000000000000000..dc771186efafe25bb65490da7a383ad7f6ceb0a7
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/vect-early-break_133_pfa1.c
@@ -0,0 +1,19 @@
+/* { dg-do compile } */
+/* { dg-add-options vect_early_break } */
+/* { dg-require-effective-target vect_early_break } */
+/* { dg-require-effective-target vect_int } */
+/* { dg-additional-options "-O3" } */
+
+char string[1020];
+
+char * find(int n, char c)
+{
+    for (int i = 1; i < n; i++) {
+        if (string[i] == c)
+            return &string[i];
+    }
+    return 0;
+}
+
+/* { dg-final { scan-tree-dump "LOOP VECTORIZED" "vect" } } */
+/* { dg-final { scan-tree-dump "Alignment of access forced using peeling" 
"vect" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/vect-early-break_133_pfa10.c b/gcc/testsuite/gcc.dg/vect/vect-early-break_133_pfa10.c
new file mode 100644
index 0000000000000000000000000000000000000000..82d473a279ce060c550289c61729d9f9b56f0d2a
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/vect-early-break_133_pfa10.c
@@ -0,0 +1,24 @@
+/* { dg-add-options vect_early_break } */
+/* { dg-do compile } */
+/* { dg-require-effective-target vect_early_break } */
+/* { dg-require-effective-target vect_int } */
+
+/* { dg-additional-options "-Ofast" } */
+
+/* Alignment requirement too big, load lanes targets can't safely vectorize this.  */
+/* { dg-final { scan-tree-dump "LOOP VECTORIZED" "vect" { target { ! vect_load_lanes } } } } */
+/* { dg-final { scan-tree-dump-not "Alignment of access forced using peeling" "vect" { target { ! vect_load_lanes } } } } */
+
+unsigned test4(char x, char *restrict vect_a, char *restrict vect_b, int n)
+{  
+ unsigned ret = 0;
+ for (int i = 0; i < (n - 2); i+=2)
+ {
+   if (vect_a[i] > x || vect_a[i+2] > x)
+     return 1;
+
+   vect_b[i] = x;
+   vect_b[i+1] = x+1;
+ }
+ return ret;
+}
diff --git a/gcc/testsuite/gcc.dg/vect/vect-early-break_133_pfa2.c b/gcc/testsuite/gcc.dg/vect/vect-early-break_133_pfa2.c
new file mode 100644
index 0000000000000000000000000000000000000000..7d56772fbf380ce42ac758ca29a5f3f9d3f6e0d1
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/vect-early-break_133_pfa2.c
@@ -0,0 +1,19 @@
+/* { dg-do compile } */
+/* { dg-add-options vect_early_break } */
+/* { dg-require-effective-target vect_early_break } */
+/* { dg-require-effective-target vect_int } */
+/* { dg-additional-options "-O3" } */
+
+char string[1020];
+
+char * find(int n, char c)
+{
+    for (int i = 0; i < n; i++) {
+        if (string[i] == c)
+            return &string[i];
+    }
+    return 0;
+}
+
+/* { dg-final { scan-tree-dump "LOOP VECTORIZED" "vect" } } */
+/* { dg-final { scan-tree-dump-not "Alignment of access forced using peeling" 
"vect" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/vect-early-break_133_pfa3.c b/gcc/testsuite/gcc.dg/vect/vect-early-break_133_pfa3.c
new file mode 100644
index 0000000000000000000000000000000000000000..374a051b945e97eedb9be9da423cf54b5e564d6f
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/vect-early-break_133_pfa3.c
@@ -0,0 +1,20 @@
+/* { dg-do compile } */
+/* { dg-add-options vect_early_break } */
+/* { dg-require-effective-target vect_early_break } */
+/* { dg-require-effective-target vect_int } */
+/* { dg-additional-options "-O3" } */
+
+char string[1020] __attribute__((aligned(1)));
+
+char * find(int n, char c)
+{
+    for (int i = 1; i < n; i++) {
+        if (string[i] == c)
+            return &string[i];
+    }
+    return 0;
+}
+
+/* { dg-final { scan-tree-dump "LOOP VECTORIZED" "vect" } } */
+/* { dg-final { scan-tree-dump "Alignment of access forced using peeling" 
"vect" } } */
+/* { dg-final { scan-tree-dump "force alignment of string" "vect" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/vect-early-break_133_pfa4.c b/gcc/testsuite/gcc.dg/vect/vect-early-break_133_pfa4.c
new file mode 100644
index 0000000000000000000000000000000000000000..297fb7e9b9beffa25ab8f257ceea1c065fcc6ae9
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/vect-early-break_133_pfa4.c
@@ -0,0 +1,20 @@
+/* { dg-do compile } */
+/* { dg-add-options vect_early_break } */
+/* { dg-require-effective-target vect_early_break } */
+/* { dg-require-effective-target vect_int } */
+/* { dg-additional-options "-O3" } */
+
+char string[1020] __attribute__((aligned(1)));
+
+char * find(int n, char c)
+{
+    for (int i = 0; i < n; i++) {
+        if (string[i] == c)
+            return &string[i];
+    }
+    return 0;
+}
+
+/* { dg-final { scan-tree-dump "LOOP VECTORIZED" "vect" } } */
+/* { dg-final { scan-tree-dump-not "Alignment of access forced using peeling" 
"vect" } } */
+/* { dg-final { scan-tree-dump "force alignment of string" "vect" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/vect-early-break_133_pfa5.c b/gcc/testsuite/gcc.dg/vect/vect-early-break_133_pfa5.c
new file mode 100644
index 0000000000000000000000000000000000000000..ca95be44e92e32769da1d1e9b740ae54682a3d55
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/vect-early-break_133_pfa5.c
@@ -0,0 +1,23 @@
+/* { dg-add-options vect_early_break } */
+/* { dg-do compile } */
+/* { dg-require-effective-target vect_early_break } */
+/* { dg-require-effective-target vect_int } */
+
+/* { dg-additional-options "-Ofast" } */
+
+/* { dg-final { scan-tree-dump "LOOP VECTORIZED" "vect" } } */
+
+unsigned test4(char x, char *vect, int n)
+{  
+ unsigned ret = 0;
+ for (int i = 0; i < n; i++)
+ {
+   if (vect[i] > x)
+     return 1;
+
+   vect[i] = x;
+ }
+ return ret;
+}
+
+/* { dg-final { scan-tree-dump "Alignment of access forced using peeling" 
"vect" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/vect-early-break_133_pfa6.c b/gcc/testsuite/gcc.dg/vect/vect-early-break_133_pfa6.c
new file mode 100644
index 0000000000000000000000000000000000000000..ee123df6ed2ba97e92307c64a61c97b1b6268743
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/vect-early-break_133_pfa6.c
@@ -0,0 +1,23 @@
+/* { dg-add-options vect_early_break } */
+/* { dg-do compile } */
+/* { dg-require-effective-target vect_early_break } */
+/* { dg-require-effective-target vect_int } */
+
+/* { dg-additional-options "-Ofast" } */
+
+/* { dg-final { scan-tree-dump "LOOP VECTORIZED" "vect" } } */
+
+unsigned test4(char x, char *vect_a, char *vect_b, int n)
+{  
+ unsigned ret = 0;
+ for (int i = 1; i < n; i++)
+ {
+   if (vect_a[i] > x || vect_b[i] > x)
+     return 1;
+
+   vect_a[i] = x;
+ }
+ return ret;
+}
+
+/* { dg-final { scan-tree-dump "Versioning for alignment will be applied" 
"vect" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/vect-early-break_133_pfa7.c b/gcc/testsuite/gcc.dg/vect/vect-early-break_133_pfa7.c
new file mode 100644
index 0000000000000000000000000000000000000000..51bad4e745b67cfdaad20f50776299531824ce9c
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/vect-early-break_133_pfa7.c
@@ -0,0 +1,23 @@
+/* { dg-add-options vect_early_break } */
+/* { dg-do compile } */
+/* { dg-require-effective-target vect_early_break } */
+/* { dg-require-effective-target vect_int } */
+
+/* { dg-additional-options "-Ofast" } */
+
+/* This should be vectorizable through load_lanes and linear targets.  */
+/* { dg-final { scan-tree-dump "LOOP VECTORIZED" "vect" } } */
+
+unsigned test4(char x, char * restrict vect_a, char * restrict vect_b, int n)
+{  
+ unsigned ret = 0;
+ for (int i = 0; i < n; i+=2)
+ {
+   if (vect_a[i] > x || vect_a[i+1] > x)
+     return 1;
+
+   vect_b[i] = x;
+   vect_b[i+1] = x+1;
+ }
+ return ret;
+}
diff --git a/gcc/testsuite/gcc.dg/vect/vect-early-break_133_pfa8.c b/gcc/testsuite/gcc.dg/vect/vect-early-break_133_pfa8.c
new file mode 100644
index 0000000000000000000000000000000000000000..dbb14ba3239c91b9bfdf56cecc60750394e10f2b
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/vect-early-break_133_pfa8.c
@@ -0,0 +1,25 @@
+/* { dg-add-options vect_early_break } */
+/* { dg-do compile } */
+/* { dg-require-effective-target vect_early_break } */
+/* { dg-require-effective-target vect_int } */
+
+/* { dg-additional-options "-Ofast" } */
+
+/* { dg-final { scan-tree-dump "LOOP VECTORIZED" "vect" } } */
+
+char vect_a[1025];
+char vect_b[1025];
+
+unsigned test4(char x, int n)
+{  
+ unsigned ret = 0;
+ for (int i = 1; i < (n - 2); i+=2)
+ {
+   if (vect_a[i] > x || vect_a[i+1] > x)
+     return 1;
+
+   vect_b[i] = x;
+   vect_b[i+1] = x+1;
+ }
+ return ret;
+}
diff --git a/gcc/testsuite/gcc.dg/vect/vect-early-break_133_pfa9.c b/gcc/testsuite/gcc.dg/vect/vect-early-break_133_pfa9.c
new file mode 100644
index 0000000000000000000000000000000000000000..31e2096209253539483253efc17499a53d112894
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/vect-early-break_133_pfa9.c
@@ -0,0 +1,28 @@
+/* { dg-add-options vect_early_break } */
+/* { dg-do compile } */
+/* { dg-require-effective-target vect_early_break } */
+/* { dg-require-effective-target vect_int } */
+
+/* { dg-additional-options "-Ofast" } */
+
+/* Group size is uneven, load lanes targets can't safely vectorize this.  */
+/* { dg-final { scan-tree-dump "LOOP VECTORIZED" "vect" } } */
+/* { dg-final { scan-tree-dump-not "Alignment of access forced using peeling" 
"vect" } } */
+
+
+char vect_a[1025];
+char vect_b[1025];
+
+unsigned test4(char x, int n)
+{  
+ unsigned ret = 0;
+ for (int i = 1; i < (n - 2); i+=2)
+ {
+   if (vect_a[i-1] > x || vect_a[i+2] > x)
+     return 1;
+
+   vect_b[i] = x;
+   vect_b[i+1] = x+1;
+ }
+ return ret;
+}
diff --git a/gcc/testsuite/gcc.dg/vect/vect-early-break_39.c b/gcc/testsuite/gcc.dg/vect/vect-early-break_39.c
index 9d3c6a5dffe3be4a7759b150e330d18144ab5ce5..b3f40b8c9ba49e41bd283e46a462238c3b5825ef 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-early-break_39.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-early-break_39.c
@@ -23,4 +23,5 @@ unsigned test4(unsigned x, unsigned n)
  return ret;
 }
 
-/* { dg-final { scan-tree-dump "vectorized 1 loops in function" "vect" } } */
+/* Cannot safely vectorize this due to the group misalignment.  */
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 0 "vect" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/vect-early-break_53.c b/gcc/testsuite/gcc.dg/vect/vect-early-break_53.c
index a02d5986ba3cfc117b19305c5e96711299996931..d4fd0d39a25a5659e3d9452b79f3e0fabba8b3c0 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-early-break_53.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-early-break_53.c
@@ -2,6 +2,7 @@
 /* { dg-do compile } */
 /* { dg-require-effective-target vect_early_break } */
 /* { dg-require-effective-target vect_int } */
+/* { dg-require-effective-target vect_partial_vectors } */
 
 void abort ();
 int a[64], b[64];
diff --git a/gcc/testsuite/gcc.dg/vect/vect-early-break_56.c b/gcc/testsuite/gcc.dg/vect/vect-early-break_56.c
index 9096f66647c7b3cb430562d35f8ce076244f7c11..b35e737fa3b9137cd745c14f7ad915a3f81c38c4 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-early-break_56.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-early-break_56.c
@@ -4,6 +4,7 @@
 /* { dg-require-effective-target vect_int } */
 /* { dg-add-options bind_pic_locally } */
 /* { dg-require-effective-target vect_early_break_hw } */
+/* { dg-require-effective-target vect_partial_vectors } */
 
 #include <stdarg.h>
 #include "tree-vect.h"
diff --git a/gcc/testsuite/gcc.dg/vect/vect-early-break_57.c b/gcc/testsuite/gcc.dg/vect/vect-early-break_57.c
index 319bd125c3156f13c300ff2b94d269bb9ec29e97..a4886654f152b2c0568286febea2b31cb7be8499 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-early-break_57.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-early-break_57.c
@@ -5,8 +5,9 @@
 
 /* { dg-additional-options "-Ofast" } */
 
-/* { dg-final { scan-tree-dump "LOOP VECTORIZED" "vect" } } */
-/* { dg-final { scan-tree-dump "epilog loop required" "vect" } } */
+/* Multiple loads of different alignments; we can't peel this.  */
+/* { dg-final { scan-tree-dump-not "LOOP VECTORIZED" "vect" } } */
+/* { dg-final { scan-tree-dump-not "epilog loop required" "vect" } } */
 
 void abort ();
 
diff --git a/gcc/testsuite/gcc.dg/vect/vect-early-break_81.c b/gcc/testsuite/gcc.dg/vect/vect-early-break_81.c
index 8a8c076ba92ca6fef419cb23b457a23555c61c64..b58a4611d6b8d86f0247d9ea44ab4750473589a9 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-early-break_81.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-early-break_81.c
@@ -5,8 +5,9 @@
 
 /* { dg-additional-options "-Ofast" } */
 
-/* { dg-final { scan-tree-dump "LOOP VECTORIZED" "vect" } } */
-/* { dg-final { scan-tree-dump "epilog loop required" "vect" } } */
+/* Multiple loads with different misalignments.  Can't peel, need partial loop support.  */
+/* { dg-final { scan-tree-dump-not "LOOP VECTORIZED" "vect" } } */
+/* { dg-final { scan-tree-dump-not "epilog loop required" "vect" } } */
 void abort ();
 
 unsigned short sa[32];
diff --git a/gcc/tree-vect-data-refs.cc b/gcc/tree-vect-data-refs.cc
index 6d5854ac7c7a18e09ec7ad72c534abdc55cb6efa..3df48c8611c8fd843efb145c7c1a0afeee25edfa 100644
--- a/gcc/tree-vect-data-refs.cc
+++ b/gcc/tree-vect-data-refs.cc
@@ -731,7 +731,9 @@ vect_analyze_early_break_dependences (loop_vec_info loop_vinfo)
          if (is_gimple_debug (stmt))
            continue;
 
-         stmt_vec_info stmt_vinfo = loop_vinfo->lookup_stmt (stmt);
+         stmt_vec_info stmt_vinfo
+           = vect_stmt_to_vectorize (loop_vinfo->lookup_stmt (stmt));
+         stmt = STMT_VINFO_STMT (stmt_vinfo);
          auto dr_ref = STMT_VINFO_DATA_REF (stmt_vinfo);
          if (!dr_ref)
            continue;
@@ -748,26 +750,16 @@ vect_analyze_early_break_dependences (loop_vec_info loop_vinfo)
             bounded by VF so accesses are within range.  We only need to check
             the reads since writes are moved to a safe place where if we get
             there we know they are safe to perform.  */
-         if (DR_IS_READ (dr_ref)
-             && !ref_within_array_bound (stmt, DR_REF (dr_ref)))
+         if (DR_IS_READ (dr_ref))
            {
-             if (STMT_VINFO_GATHER_SCATTER_P (stmt_vinfo)
-                 || STMT_VINFO_STRIDED_P (stmt_vinfo))
-               {
-                 const char *msg
-                   = "early break not supported: cannot peel "
-                     "for alignment, vectorization would read out of "
-                     "bounds at %G";
-                 return opt_result::failure_at (stmt, msg, stmt);
-               }
-
-             dr_vec_info *dr_info = STMT_VINFO_DR_INFO (stmt_vinfo);
-             dr_info->need_peeling_for_alignment = true;
+             dr_set_safe_speculative_read_required (stmt_vinfo, true);
+             bool inbounds = ref_within_array_bound (stmt, DR_REF (dr_ref));
+             DR_SCALAR_KNOWN_BOUNDS (STMT_VINFO_DR_INFO (stmt_vinfo)) = inbounds;
 
              if (dump_enabled_p ())
                dump_printf_loc (MSG_NOTE, vect_location,
-                                "marking DR (read) as needing peeling for "
-                                "alignment at %G", stmt);
+                                "marking DR (read) as possibly needing peeling "
+                                "for alignment at %G", stmt);
            }
 
          if (DR_IS_READ (dr_ref))
@@ -1326,9 +1318,6 @@ vect_record_base_alignments (vec_info *vinfo)
    Compute the misalignment of the data reference DR_INFO when vectorizing
    with VECTYPE.
 
-   RESULT is non-NULL iff VINFO is a loop_vec_info.  In that case, *RESULT will
-   be set appropriately on failure (but is otherwise left unchanged).
-
    Output:
    1. initialized misalignment info for DR_INFO
 
@@ -1337,7 +1326,7 @@ vect_record_base_alignments (vec_info *vinfo)
 
 static void
 vect_compute_data_ref_alignment (vec_info *vinfo, dr_vec_info *dr_info,
-                                tree vectype, opt_result *result = nullptr)
+                                tree vectype)
 {
   stmt_vec_info stmt_info = dr_info->stmt;
   vec_base_alignments *base_alignments = &vinfo->base_alignments;
@@ -1365,63 +1354,29 @@ vect_compute_data_ref_alignment (vec_info *vinfo, dr_vec_info *dr_info,
     = exact_div (targetm.vectorize.preferred_vector_alignment (vectype),
                 BITS_PER_UNIT);
 
-  /* If this DR needs peeling for alignment for correctness, we must
-     ensure the target alignment is a constant power-of-two multiple of the
-     amount read per vector iteration (overriding the above hook where
-     necessary).  */
-  if (dr_info->need_peeling_for_alignment)
+  if (loop_vinfo
+      && dr_safe_speculative_read_required (stmt_info))
     {
-      /* Vector size in bytes.  */
-      poly_uint64 safe_align = tree_to_poly_uint64 (TYPE_SIZE_UNIT (vectype));
-
-      /* We can only peel for loops, of course.  */
-      gcc_checking_assert (loop_vinfo);
-
-      /* Calculate the number of vectors read per vector iteration.  If
-        it is a power of two, multiply through to get the required
-        alignment in bytes.  Otherwise, fail analysis since alignment
-        peeling wouldn't work in such a case.  */
-      poly_uint64 num_scalars = LOOP_VINFO_VECT_FACTOR (loop_vinfo);
+      poly_uint64 vf = LOOP_VINFO_VECT_FACTOR (loop_vinfo);
+      auto vectype_size
+       = TREE_INT_CST_LOW (TYPE_SIZE_UNIT (TREE_TYPE (vectype)));
+      poly_uint64 new_alignment = vf * vectype_size;
+      /* If we have a grouped access we require that the alignment be N * elem.  */
       if (STMT_VINFO_GROUPED_ACCESS (stmt_info))
-       num_scalars *= DR_GROUP_SIZE (stmt_info);
+       new_alignment *= DR_GROUP_SIZE (DR_GROUP_FIRST_ELEMENT (stmt_info));
 
-      auto num_vectors = vect_get_num_vectors (num_scalars, vectype);
-      if (!pow2p_hwi (num_vectors))
-       {
-         *result = opt_result::failure_at (vect_location,
-                                           "non-power-of-two num vectors %u "
-                                           "for DR needing peeling for "
-                                           "alignment at %G",
-                                           num_vectors, stmt_info->stmt);
-         return;
-       }
-
-      safe_align *= num_vectors;
-      if (maybe_gt (safe_align, 4096U))
-       {
-         pretty_printer pp;
-         pp_wide_integer (&pp, safe_align);
-         *result = opt_result::failure_at (vect_location,
-                                           "alignment required for correctness"
-                                           " (%s) may exceed page size",
-                                           pp_formatted_text (&pp));
-         return;
-       }
-
-      unsigned HOST_WIDE_INT multiple;
-      if (!constant_multiple_p (vector_alignment, safe_align, &multiple)
-         || !pow2p_hwi (multiple))
+      unsigned HOST_WIDE_INT target_alignment;
+      if (new_alignment.is_constant (&target_alignment)
+         && pow2p_hwi (target_alignment))
        {
          if (dump_enabled_p ())
            {
              dump_printf_loc (MSG_NOTE, vect_location,
-                              "forcing alignment for DR from preferred (");
-             dump_dec (MSG_NOTE, vector_alignment);
-             dump_printf (MSG_NOTE, ") to safe align (");
-             dump_dec (MSG_NOTE, safe_align);
-             dump_printf (MSG_NOTE, ") for stmt: %G", stmt_info->stmt);
+                              "alignment increased due to early break to ");
+             dump_dec (MSG_NOTE, new_alignment);
+             dump_printf (MSG_NOTE, " bytes.\n");
            }
-         vector_alignment = safe_align;
+         vector_alignment = target_alignment;
        }
     }
 
@@ -2487,6 +2442,8 @@ vect_enhance_data_refs_alignment (loop_vec_info loop_vinfo)
       || !slpeel_can_duplicate_loop_p (loop, LOOP_VINFO_IV_EXIT (loop_vinfo),
                                       loop_preheader_edge (loop))
       || loop->inner
+      /* We don't currently maintain the LCSSA for prologue peeled inverted
+        loops.  */
       || LOOP_VINFO_EARLY_BREAKS_VECT_PEELED (loop_vinfo))
     do_peeling = false;
 
@@ -2950,12 +2907,9 @@ vect_analyze_data_refs_alignment (loop_vec_info loop_vinfo)
          if (STMT_VINFO_GROUPED_ACCESS (dr_info->stmt)
              && DR_GROUP_FIRST_ELEMENT (dr_info->stmt) != dr_info->stmt)
            continue;
-         opt_result res = opt_result::success ();
+
          vect_compute_data_ref_alignment (loop_vinfo, dr_info,
-                                          STMT_VINFO_VECTYPE (dr_info->stmt),
-                                          &res);
-         if (!res)
-           return res;
+                                          STMT_VINFO_VECTYPE (dr_info->stmt));
        }
     }
 
@@ -7219,7 +7173,7 @@ vect_supportable_dr_alignment (vec_info *vinfo, dr_vec_info *dr_info,
 
   if (misalignment == 0)
     return dr_aligned;
-  else if (dr_info->need_peeling_for_alignment)
+  else if (dr_safe_speculative_read_required (stmt_info))
     return dr_unaligned_unsupported;
 
   /* For now assume all conditional loads/stores support unaligned
diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index 6bbb16beff2c627fca11a7403ba5ee3a5faa21c1..e0e5598babc7b119d408d83512ae8c6047d4924c 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -2597,6 +2597,165 @@ get_load_store_type (vec_info  *vinfo, stmt_vec_info stmt_info,
       return false;
     }
 
+  /* If this DR needs alignment for correctness, we must ensure the target
+     alignment is a constant power-of-two multiple of the amount read per
+     vector iteration or force masking.  */
+  if (dr_safe_speculative_read_required (stmt_info))
+    {
+      /* We can only peel for loops, of course.  */
+      gcc_checking_assert (loop_vinfo);
+
+      /* Check if we support the operation if early breaks are needed.  Here we
+        must ensure that we don't access any more than the scalar code would
+        have.  A masked operation would ensure this, so for these load types
+        force masking.  */
+      if (LOOP_VINFO_EARLY_BREAKS (loop_vinfo)
+         && (*memory_access_type == VMAT_GATHER_SCATTER
+             || *memory_access_type == VMAT_STRIDED_SLP))
+       {
+         if (dump_enabled_p ())
+           dump_printf_loc (MSG_NOTE, vect_location,
+                            "early break not supported: cannot peel for "
+                            "alignment. With non-contiguous memory vectorization"
+                            " could read out of bounds at %G ",
+                            STMT_VINFO_STMT (stmt_info));
+         LOOP_VINFO_MUST_USE_PARTIAL_VECTORS_P (loop_vinfo) = true;
+       }
+
+      /* Vector size in bytes.  */
+      poly_uint64 safe_align;
+      if (nunits.is_constant ())
+       safe_align = tree_to_poly_uint64 (TYPE_SIZE_UNIT (vectype));
+      else
+       safe_align = estimated_poly_value (LOOP_VINFO_VECT_FACTOR (loop_vinfo),
+                                          POLY_VALUE_MAX);
+
+      auto num_vectors = ncopies;
+      if (!pow2p_hwi (num_vectors))
+       {
+         if (dump_enabled_p ())
+           dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
+                            "non-power-of-two num vectors %u "
+                            "for DR needing peeling for "
+                            "alignment at %G",
+                            num_vectors, STMT_VINFO_STMT (stmt_info));
+         return false;
+       }
+
+      auto target_alignment
+       = DR_TARGET_ALIGNMENT (STMT_VINFO_DR_INFO (stmt_info));
+      unsigned HOST_WIDE_INT target_align;
+      bool inbounds
+       = DR_SCALAR_KNOWN_BOUNDS (STMT_VINFO_DR_INFO (stmt_info));
+
+      /* If the scalar loop is known to be in bounds, and we're using scalar
+        accesses then there's no need to check further.  */
+      if (inbounds
+         && *memory_access_type == VMAT_ELEMENTWISE)
+       {
+         *alignment_support_scheme = dr_aligned;
+         return true;
+       }
+
+      bool group_aligned = false;
+      if (target_alignment.is_constant (&target_align)
+         && nunits.is_constant ())
+       {
+         poly_uint64 vf = LOOP_VINFO_VECT_FACTOR (loop_vinfo);
+         auto vectype_size
+           = TREE_INT_CST_LOW (TYPE_SIZE_UNIT (TREE_TYPE (vectype)));
+         poly_uint64 required_alignment = vf * vectype_size;
+         /* If we have a grouped access we require that the alignment be N * elem.  */
+         if (STMT_VINFO_GROUPED_ACCESS (stmt_info))
+           required_alignment *=
+               DR_GROUP_SIZE (DR_GROUP_FIRST_ELEMENT (stmt_info));
+         if (maybe_ne (target_alignment, required_alignment))
+           {
+             if (dump_enabled_p ())
+               dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
+                            "desired alignment %wu not met. Instead got %wu "
+                            "needed for DR alignment at %G",
+                            required_alignment.to_constant (),
+                            target_align, STMT_VINFO_STMT (stmt_info));
+             return false;
+           }
+
+         if (!pow2p_hwi (target_align))
+           {
+             if (dump_enabled_p ())
+               dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
+                            "non-power-of-two vector alignment %wd "
+                            "needed for DR alignment at %G",
+                            target_align, STMT_VINFO_STMT (stmt_info));
+             return false;
+           }
+         group_aligned = true;
+       }
+      else if (inbounds)
+       LOOP_VINFO_MUST_USE_PARTIAL_VECTORS_P (loop_vinfo) = true;
+      else
+       return false;
+
+      /* Even if uneven group sizes are aligned on the first load, the second
+        iteration won't be.  As such reject non-power of 2 group sizes.  */
+      if (STMT_VINFO_GROUPED_ACCESS (stmt_info))
+       {
+         auto group_size = DR_GROUP_SIZE (DR_GROUP_FIRST_ELEMENT (stmt_info));
+         if (!pow2p_hwi (group_size))
+           {
+             if (group_aligned || inbounds)
+               LOOP_VINFO_MUST_USE_PARTIAL_VECTORS_P (loop_vinfo) = true;
+             else
+               {
+                 if (dump_enabled_p ())
+                   dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
+                            "early break not supported: uneven group size, "
+                            "vectorization could read out of bounds at %G ",
+                            STMT_VINFO_STMT (stmt_info));
+                 return false;
+               }
+           }
+       }
+
+      safe_align *= num_vectors;
+      /* For VLA we have to insert a runtime check that the vector loads
+        per iteration don't exceed a page size.  For now we can use
+        POLY_VALUE_MAX as a proxy as we can't peel for VLA.  */
+      if (maybe_gt (safe_align, (unsigned)param_min_pagesize))
+       {
+         if (dump_enabled_p ())
+           {
+             dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
+                              "alignment required for correctness (");
+             dump_dec (MSG_MISSED_OPTIMIZATION, safe_align);
+             dump_printf (MSG_NOTE, ") may exceed page size\n");
+           }
+         return false;
+       }
+
+      /* There are multiple loads that have a misalignment that we couldn't
+        align.  We would need LOOP_VINFO_MUST_USE_PARTIAL_VECTORS_P to
+        vectorize. */
+      if (!STMT_VINFO_GROUPED_ACCESS (stmt_info)
+         && *misalignment != 0)
+       return false;
+
+      /* When using a group access the first element may be aligned but the
+        subsequent loads may not be.  For LOAD_LANES since the loads are based
+        on the first DR then all loads in the group are aligned.  For
+        non-LOAD_LANES this is not the case. In particular a load + blend when
+        there are gaps can have the non first loads issued unaligned, even
+        there are gaps can have the non-first loads issued unaligned, even
+        the blend.  This is what the x86_64 backend does for instance.  As
+        such only the first load in the group is aligned, the rest are not.  */
+      else if (STMT_VINFO_GROUPED_ACCESS (stmt_info)
+              && *misalignment != 0
+              && *memory_access_type != VMAT_LOAD_STORE_LANES)
+       *alignment_support_scheme = dr_unaligned_supported;
+      else
+       *alignment_support_scheme = dr_aligned;
+    }
+
   if (*alignment_support_scheme == dr_unaligned_unsupported)
     {
       if (dump_enabled_p ())
diff --git a/gcc/tree-vectorizer.h b/gcc/tree-vectorizer.h
index b0cb081cba0ae8b11fbfcfcb8c6d440ec451ccb5..97caf61b345735d297ec49fd6ca64797435b46fc 100644
--- a/gcc/tree-vectorizer.h
+++ b/gcc/tree-vectorizer.h
@@ -1281,7 +1281,11 @@ public:
 
   /* Set by early break vectorization when this DR needs peeling for alignment
      for correctness.  */
-  bool need_peeling_for_alignment;
+  bool safe_speculative_read_required;
+
+  /* Set by early break vectorization when this DR's scalar accesses are known
+     to be inbounds of a known bounds loop.  */
+  bool scalar_access_known_in_bounds;
 
   tree base_decl;
 
@@ -1997,6 +2001,35 @@ dr_target_alignment (dr_vec_info *dr_info)
   return dr_info->target_alignment;
 }
 #define DR_TARGET_ALIGNMENT(DR) dr_target_alignment (DR)
+#define DR_SCALAR_KNOWN_BOUNDS(DR) (DR)->scalar_access_known_in_bounds
+
+/* Return true if the stmt_vec_info requires peeling for alignment.  */
+inline bool
+dr_safe_speculative_read_required (stmt_vec_info stmt_info)
+{
+  dr_vec_info *dr_info;
+  if (STMT_VINFO_GROUPED_ACCESS (stmt_info))
+    dr_info = STMT_VINFO_DR_INFO (DR_GROUP_FIRST_ELEMENT (stmt_info));
+  else
+    dr_info = STMT_VINFO_DR_INFO (stmt_info);
+
+  return dr_info->safe_speculative_read_required;
+}
+
+/* Set the safe_speculative_read_required for the stmt_vec_info; if a group
+   access then set it on the first element, otherwise set it on the DR.  */
+inline void
+dr_set_safe_speculative_read_required (stmt_vec_info stmt_info,
+                                      bool requires_alignment)
+{
+  dr_vec_info *dr_info;
+  if (STMT_VINFO_GROUPED_ACCESS (stmt_info))
+    dr_info = STMT_VINFO_DR_INFO (DR_GROUP_FIRST_ELEMENT (stmt_info));
+  else
+    dr_info = STMT_VINFO_DR_INFO (stmt_info);
+
+  dr_info->safe_speculative_read_required = requires_alignment;
+}
 
 inline void
 set_dr_target_alignment (dr_vec_info *dr_info, poly_uint64 val)


-- 
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index ca8e468f3f2dbf68c959f74d8fec48c79463504d..fff2874b326d605fc2656adf4ab6eb5bd5d42d71 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -17256,7 +17256,7 @@ Maximum number of relations the oracle will register in a basic block.
 Work bound when discovering transitive relations from existing relations.
 
 @item min-pagesize
-Minimum page size for warning purposes.
+Minimum page size for warning and early break vectorization purposes.
 
 @item openacc-kernels
 Specify mode of OpenACC `kernels' constructs handling.
diff --git a/gcc/testsuite/gcc.dg/vect/bb-slp-pr65935.c b/gcc/testsuite/gcc.dg/vect/bb-slp-pr65935.c
index 9ef1330b47c817e16baaafa44c2b15108b9dd3a9..4c8255895b976653228233d93c950629f3231554 100644
--- a/gcc/testsuite/gcc.dg/vect/bb-slp-pr65935.c
+++ b/gcc/testsuite/gcc.dg/vect/bb-slp-pr65935.c
@@ -55,7 +55,9 @@ int main()
 	  }
     }
   rephase ();
+#pragma GCC novector
   for (i = 0; i < 32; ++i)
+#pragma GCC novector
     for (j = 0; j < 3; ++j)
 #pragma GCC novector
       for (k = 0; k < 3; ++k)
diff --git a/gcc/testsuite/gcc.dg/vect/vect-early-break_121-pr114081.c b/gcc/testsuite/gcc.dg/vect/vect-early-break_121-pr114081.c
index 423ff0b566b18bf04ce4f67a45b94dc1a021a4a0..f99c57be0adc4d49035b8a75c72d4a5b04cc05c7 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-early-break_121-pr114081.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-early-break_121-pr114081.c
@@ -5,7 +5,8 @@
 /* { dg-additional-options "-O3" } */
 /* { dg-additional-options "-mavx2" { target { x86_64-*-* i?86-*-* } } } */
 
-/* { dg-final { scan-tree-dump "LOOP VECTORIZED" "vect" } } */
+/* Arm and -m32 create a group size of 3 here, which we can't support yet. AArch64 makes elementwise accesses here.  */
+/* { dg-final { scan-tree-dump "LOOP VECTORIZED" "vect" { target { aarch64*-*-* } } } } */
 
 typedef struct filter_list_entry {
   const char *name;
diff --git a/gcc/testsuite/gcc.dg/vect/vect-early-break_132-pr118464.c b/gcc/testsuite/gcc.dg/vect/vect-early-break_132-pr118464.c
new file mode 100644
index 0000000000000000000000000000000000000000..9bf0cbc8853f74de550e8ac83ab569fc9fbde126
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/vect-early-break_132-pr118464.c
@@ -0,0 +1,25 @@
+/* { dg-do compile } */
+/* { dg-add-options vect_early_break } */
+/* { dg-require-effective-target vect_early_break } */
+/* { dg-require-effective-target vect_int } */
+/* { dg-additional-options "-O3" } */
+
+int a, b, c, d, e, f;
+short g[1];
+int main() {
+  int h;
+  while (a) {
+    while (h)
+      ;
+    for (b = 2; b; b--) {
+      while (c)
+        ;
+      f = g[a];
+      if (d)
+        break;
+    }
+    while (e)
+      ;
+  }
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.dg/vect/vect-early-break_133_pfa1.c b/gcc/testsuite/gcc.dg/vect/vect-early-break_133_pfa1.c
new file mode 100644
index 0000000000000000000000000000000000000000..dc771186efafe25bb65490da7a383ad7f6ceb0a7
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/vect-early-break_133_pfa1.c
@@ -0,0 +1,19 @@
+/* { dg-do compile } */
+/* { dg-add-options vect_early_break } */
+/* { dg-require-effective-target vect_early_break } */
+/* { dg-require-effective-target vect_int } */
+/* { dg-additional-options "-O3" } */
+
+char string[1020];
+
+char * find(int n, char c)
+{
+    for (int i = 1; i < n; i++) {
+        if (string[i] == c)
+            return &string[i];
+    }
+    return 0;
+}
+
+/* { dg-final { scan-tree-dump "LOOP VECTORIZED" "vect" } } */
+/* { dg-final { scan-tree-dump "Alignment of access forced using peeling" "vect" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/vect-early-break_133_pfa10.c b/gcc/testsuite/gcc.dg/vect/vect-early-break_133_pfa10.c
new file mode 100644
index 0000000000000000000000000000000000000000..82d473a279ce060c550289c61729d9f9b56f0d2a
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/vect-early-break_133_pfa10.c
@@ -0,0 +1,24 @@
+/* { dg-add-options vect_early_break } */
+/* { dg-do compile } */
+/* { dg-require-effective-target vect_early_break } */
+/* { dg-require-effective-target vect_int } */
+
+/* { dg-additional-options "-Ofast" } */
+
+/* Alignment requirement too big, load lanes targets can't safely vectorize this.  */
+/* { dg-final { scan-tree-dump "LOOP VECTORIZED" "vect" { target { ! vect_load_lanes } } } } */
+/* { dg-final { scan-tree-dump-not "Alignment of access forced using peeling" "vect" { target { ! vect_load_lanes } } } } */
+
+unsigned test4(char x, char *restrict vect_a, char *restrict vect_b, int n)
+{  
+ unsigned ret = 0;
+ for (int i = 0; i < (n - 2); i+=2)
+ {
+   if (vect_a[i] > x || vect_a[i+2] > x)
+     return 1;
+
+   vect_b[i] = x;
+   vect_b[i+1] = x+1;
+ }
+ return ret;
+}
diff --git a/gcc/testsuite/gcc.dg/vect/vect-early-break_133_pfa2.c b/gcc/testsuite/gcc.dg/vect/vect-early-break_133_pfa2.c
new file mode 100644
index 0000000000000000000000000000000000000000..7d56772fbf380ce42ac758ca29a5f3f9d3f6e0d1
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/vect-early-break_133_pfa2.c
@@ -0,0 +1,19 @@
+/* { dg-do compile } */
+/* { dg-add-options vect_early_break } */
+/* { dg-require-effective-target vect_early_break } */
+/* { dg-require-effective-target vect_int } */
+/* { dg-additional-options "-O3" } */
+
+char string[1020];
+
+char * find(int n, char c)
+{
+    for (int i = 0; i < n; i++) {
+        if (string[i] == c)
+            return &string[i];
+    }
+    return 0;
+}
+
+/* { dg-final { scan-tree-dump "LOOP VECTORIZED" "vect" } } */
+/* { dg-final { scan-tree-dump-not "Alignment of access forced using peeling" "vect" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/vect-early-break_133_pfa3.c b/gcc/testsuite/gcc.dg/vect/vect-early-break_133_pfa3.c
new file mode 100644
index 0000000000000000000000000000000000000000..374a051b945e97eedb9be9da423cf54b5e564d6f
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/vect-early-break_133_pfa3.c
@@ -0,0 +1,20 @@
+/* { dg-do compile } */
+/* { dg-add-options vect_early_break } */
+/* { dg-require-effective-target vect_early_break } */
+/* { dg-require-effective-target vect_int } */
+/* { dg-additional-options "-O3" } */
+
+char string[1020] __attribute__((aligned(1)));
+
+char * find(int n, char c)
+{
+    for (int i = 1; i < n; i++) {
+        if (string[i] == c)
+            return &string[i];
+    }
+    return 0;
+}
+
+/* { dg-final { scan-tree-dump "LOOP VECTORIZED" "vect" } } */
+/* { dg-final { scan-tree-dump "Alignment of access forced using peeling" "vect" } } */
+/* { dg-final { scan-tree-dump "force alignment of string" "vect" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/vect-early-break_133_pfa4.c b/gcc/testsuite/gcc.dg/vect/vect-early-break_133_pfa4.c
new file mode 100644
index 0000000000000000000000000000000000000000..297fb7e9b9beffa25ab8f257ceea1c065fcc6ae9
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/vect-early-break_133_pfa4.c
@@ -0,0 +1,20 @@
+/* { dg-do compile } */
+/* { dg-add-options vect_early_break } */
+/* { dg-require-effective-target vect_early_break } */
+/* { dg-require-effective-target vect_int } */
+/* { dg-additional-options "-O3" } */
+
+char string[1020] __attribute__((aligned(1)));
+
+char * find(int n, char c)
+{
+    for (int i = 0; i < n; i++) {
+        if (string[i] == c)
+            return &string[i];
+    }
+    return 0;
+}
+
+/* { dg-final { scan-tree-dump "LOOP VECTORIZED" "vect" } } */
+/* { dg-final { scan-tree-dump-not "Alignment of access forced using peeling" "vect" } } */
+/* { dg-final { scan-tree-dump "force alignment of string" "vect" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/vect-early-break_133_pfa5.c b/gcc/testsuite/gcc.dg/vect/vect-early-break_133_pfa5.c
new file mode 100644
index 0000000000000000000000000000000000000000..ca95be44e92e32769da1d1e9b740ae54682a3d55
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/vect-early-break_133_pfa5.c
@@ -0,0 +1,23 @@
+/* { dg-add-options vect_early_break } */
+/* { dg-do compile } */
+/* { dg-require-effective-target vect_early_break } */
+/* { dg-require-effective-target vect_int } */
+
+/* { dg-additional-options "-Ofast" } */
+
+/* { dg-final { scan-tree-dump "LOOP VECTORIZED" "vect" } } */
+
+unsigned test4(char x, char *vect, int n)
+{  
+ unsigned ret = 0;
+ for (int i = 0; i < n; i++)
+ {
+   if (vect[i] > x)
+     return 1;
+
+   vect[i] = x;
+ }
+ return ret;
+}
+
+/* { dg-final { scan-tree-dump "Alignment of access forced using peeling" "vect" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/vect-early-break_133_pfa6.c b/gcc/testsuite/gcc.dg/vect/vect-early-break_133_pfa6.c
new file mode 100644
index 0000000000000000000000000000000000000000..ee123df6ed2ba97e92307c64a61c97b1b6268743
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/vect-early-break_133_pfa6.c
@@ -0,0 +1,23 @@
+/* { dg-add-options vect_early_break } */
+/* { dg-do compile } */
+/* { dg-require-effective-target vect_early_break } */
+/* { dg-require-effective-target vect_int } */
+
+/* { dg-additional-options "-Ofast" } */
+
+/* { dg-final { scan-tree-dump "LOOP VECTORIZED" "vect" } } */
+
+unsigned test4(char x, char *vect_a, char *vect_b, int n)
+{  
+ unsigned ret = 0;
+ for (int i = 1; i < n; i++)
+ {
+   if (vect_a[i] > x || vect_b[i] > x)
+     return 1;
+
+   vect_a[i] = x;
+ }
+ return ret;
+}
+
+/* { dg-final { scan-tree-dump "Versioning for alignment will be applied" "vect" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/vect-early-break_133_pfa7.c b/gcc/testsuite/gcc.dg/vect/vect-early-break_133_pfa7.c
new file mode 100644
index 0000000000000000000000000000000000000000..51bad4e745b67cfdaad20f50776299531824ce9c
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/vect-early-break_133_pfa7.c
@@ -0,0 +1,23 @@
+/* { dg-add-options vect_early_break } */
+/* { dg-do compile } */
+/* { dg-require-effective-target vect_early_break } */
+/* { dg-require-effective-target vect_int } */
+
+/* { dg-additional-options "-Ofast" } */
+
+/* This should be vectorizable through load_lanes and linear targets.  */
+/* { dg-final { scan-tree-dump "LOOP VECTORIZED" "vect" } } */
+
+unsigned test4(char x, char * restrict vect_a, char * restrict vect_b, int n)
+{  
+ unsigned ret = 0;
+ for (int i = 0; i < n; i+=2)
+ {
+   if (vect_a[i] > x || vect_a[i+1] > x)
+     return 1;
+
+   vect_b[i] = x;
+   vect_b[i+1] = x+1;
+ }
+ return ret;
+}
diff --git a/gcc/testsuite/gcc.dg/vect/vect-early-break_133_pfa8.c b/gcc/testsuite/gcc.dg/vect/vect-early-break_133_pfa8.c
new file mode 100644
index 0000000000000000000000000000000000000000..dbb14ba3239c91b9bfdf56cecc60750394e10f2b
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/vect-early-break_133_pfa8.c
@@ -0,0 +1,25 @@
+/* { dg-add-options vect_early_break } */
+/* { dg-do compile } */
+/* { dg-require-effective-target vect_early_break } */
+/* { dg-require-effective-target vect_int } */
+
+/* { dg-additional-options "-Ofast" } */
+
+/* { dg-final { scan-tree-dump "LOOP VECTORIZED" "vect" } } */
+
+char vect_a[1025];
+char vect_b[1025];
+
+unsigned test4(char x, int n)
+{  
+ unsigned ret = 0;
+ for (int i = 1; i < (n - 2); i+=2)
+ {
+   if (vect_a[i] > x || vect_a[i+1] > x)
+     return 1;
+
+   vect_b[i] = x;
+   vect_b[i+1] = x+1;
+ }
+ return ret;
+}
diff --git a/gcc/testsuite/gcc.dg/vect/vect-early-break_133_pfa9.c b/gcc/testsuite/gcc.dg/vect/vect-early-break_133_pfa9.c
new file mode 100644
index 0000000000000000000000000000000000000000..31e2096209253539483253efc17499a53d112894
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/vect-early-break_133_pfa9.c
@@ -0,0 +1,28 @@
+/* { dg-add-options vect_early_break } */
+/* { dg-do compile } */
+/* { dg-require-effective-target vect_early_break } */
+/* { dg-require-effective-target vect_int } */
+
+/* { dg-additional-options "-Ofast" } */
+
+/* Group size is uneven, load lanes targets can't safely vectorize this.  */
+/* { dg-final { scan-tree-dump "LOOP VECTORIZED" "vect" } } */
+/* { dg-final { scan-tree-dump-not "Alignment of access forced using peeling" "vect" } } */
+
+
+char vect_a[1025];
+char vect_b[1025];
+
+unsigned test4(char x, int n)
+{  
+ unsigned ret = 0;
+ for (int i = 1; i < (n - 2); i+=2)
+ {
+   if (vect_a[i-1] > x || vect_a[i+2] > x)
+     return 1;
+
+   vect_b[i] = x;
+   vect_b[i+1] = x+1;
+ }
+ return ret;
+}
diff --git a/gcc/testsuite/gcc.dg/vect/vect-early-break_39.c b/gcc/testsuite/gcc.dg/vect/vect-early-break_39.c
index 9d3c6a5dffe3be4a7759b150e330d18144ab5ce5..b3f40b8c9ba49e41bd283e46a462238c3b5825ef 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-early-break_39.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-early-break_39.c
@@ -23,4 +23,5 @@ unsigned test4(unsigned x, unsigned n)
  return ret;
 }
 
-/* { dg-final { scan-tree-dump "vectorized 1 loops in function" "vect" } } */
+/* cannot safely vectorize this due due to the group misalignment.  */
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 0 "vect" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/vect-early-break_53.c b/gcc/testsuite/gcc.dg/vect/vect-early-break_53.c
index a02d5986ba3cfc117b19305c5e96711299996931..d4fd0d39a25a5659e3d9452b79f3e0fabba8b3c0 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-early-break_53.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-early-break_53.c
@@ -2,6 +2,7 @@
 /* { dg-do compile } */
 /* { dg-require-effective-target vect_early_break } */
 /* { dg-require-effective-target vect_int } */
+/* { dg-require-effective-target vect_partial_vectors } */
 
 void abort ();
 int a[64], b[64];
diff --git a/gcc/testsuite/gcc.dg/vect/vect-early-break_56.c b/gcc/testsuite/gcc.dg/vect/vect-early-break_56.c
index 9096f66647c7b3cb430562d35f8ce076244f7c11..b35e737fa3b9137cd745c14f7ad915a3f81c38c4 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-early-break_56.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-early-break_56.c
@@ -4,6 +4,7 @@
 /* { dg-require-effective-target vect_int } */
 /* { dg-add-options bind_pic_locally } */
 /* { dg-require-effective-target vect_early_break_hw } */
+/* { dg-require-effective-target vect_partial_vectors } */
 
 #include <stdarg.h>
 #include "tree-vect.h"
diff --git a/gcc/testsuite/gcc.dg/vect/vect-early-break_57.c b/gcc/testsuite/gcc.dg/vect/vect-early-break_57.c
index 319bd125c3156f13c300ff2b94d269bb9ec29e97..a4886654f152b2c0568286febea2b31cb7be8499 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-early-break_57.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-early-break_57.c
@@ -5,8 +5,9 @@
 
 /* { dg-additional-options "-Ofast" } */
 
-/* { dg-final { scan-tree-dump "LOOP VECTORIZED" "vect" } } */
-/* { dg-final { scan-tree-dump "epilog loop required" "vect" } } */
+/* Multiple loads of different alignments, we can't peel this. */
+/* { dg-final { scan-tree-dump-not "LOOP VECTORIZED" "vect" } } */
+/* { dg-final { scan-tree-dump-not "epilog loop required" "vect" } } */
 
 void abort ();
 
diff --git a/gcc/testsuite/gcc.dg/vect/vect-early-break_81.c b/gcc/testsuite/gcc.dg/vect/vect-early-break_81.c
index 8a8c076ba92ca6fef419cb23b457a23555c61c64..b58a4611d6b8d86f0247d9ea44ab4750473589a9 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-early-break_81.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-early-break_81.c
@@ -5,8 +5,9 @@
 
 /* { dg-additional-options "-Ofast" } */
 
-/* { dg-final { scan-tree-dump "LOOP VECTORIZED" "vect" } } */
-/* { dg-final { scan-tree-dump "epilog loop required" "vect" } } */
+/* Multiple loads with different misalignments.  Can't peel need partial loop support.  */
+/* { dg-final { scan-tree-dump-not "LOOP VECTORIZED" "vect" } } */
+/* { dg-final { scan-tree-dump-not "epilog loop required" "vect" } } */
 void abort ();
 
 unsigned short sa[32];
diff --git a/gcc/tree-vect-data-refs.cc b/gcc/tree-vect-data-refs.cc
index 6d5854ac7c7a18e09ec7ad72c534abdc55cb6efa..3df48c8611c8fd843efb145c7c1a0afeee25edfa 100644
--- a/gcc/tree-vect-data-refs.cc
+++ b/gcc/tree-vect-data-refs.cc
@@ -731,7 +731,9 @@ vect_analyze_early_break_dependences (loop_vec_info loop_vinfo)
 	  if (is_gimple_debug (stmt))
 	    continue;
 
-	  stmt_vec_info stmt_vinfo = loop_vinfo->lookup_stmt (stmt);
+	  stmt_vec_info stmt_vinfo
+	    = vect_stmt_to_vectorize (loop_vinfo->lookup_stmt (stmt));
+	  stmt = STMT_VINFO_STMT (stmt_vinfo);
 	  auto dr_ref = STMT_VINFO_DATA_REF (stmt_vinfo);
 	  if (!dr_ref)
 	    continue;
@@ -748,26 +750,16 @@ vect_analyze_early_break_dependences (loop_vec_info loop_vinfo)
 	     bounded by VF so accesses are within range.  We only need to check
 	     the reads since writes are moved to a safe place where if we get
 	     there we know they are safe to perform.  */
-	  if (DR_IS_READ (dr_ref)
-	      && !ref_within_array_bound (stmt, DR_REF (dr_ref)))
+	  if (DR_IS_READ (dr_ref))
 	    {
-	      if (STMT_VINFO_GATHER_SCATTER_P (stmt_vinfo)
-		  || STMT_VINFO_STRIDED_P (stmt_vinfo))
-		{
-		  const char *msg
-		    = "early break not supported: cannot peel "
-		      "for alignment, vectorization would read out of "
-		      "bounds at %G";
-		  return opt_result::failure_at (stmt, msg, stmt);
-		}
-
-	      dr_vec_info *dr_info = STMT_VINFO_DR_INFO (stmt_vinfo);
-	      dr_info->need_peeling_for_alignment = true;
+	      dr_set_safe_speculative_read_required (stmt_vinfo, true);
+	      bool inbounds = ref_within_array_bound (stmt, DR_REF (dr_ref));
+	      DR_SCALAR_KNOWN_BOUNDS (STMT_VINFO_DR_INFO (stmt_vinfo)) = inbounds;
 
 	      if (dump_enabled_p ())
 		dump_printf_loc (MSG_NOTE, vect_location,
-				 "marking DR (read) as needing peeling for "
-				 "alignment at %G", stmt);
+				 "marking DR (read) as possibly needing peeling "
+				 "for alignment at %G", stmt);
 	    }
 
 	  if (DR_IS_READ (dr_ref))
@@ -1326,9 +1318,6 @@ vect_record_base_alignments (vec_info *vinfo)
    Compute the misalignment of the data reference DR_INFO when vectorizing
    with VECTYPE.
 
-   RESULT is non-NULL iff VINFO is a loop_vec_info.  In that case, *RESULT will
-   be set appropriately on failure (but is otherwise left unchanged).
-
    Output:
    1. initialized misalignment info for DR_INFO
 
@@ -1337,7 +1326,7 @@ vect_record_base_alignments (vec_info *vinfo)
 
 static void
 vect_compute_data_ref_alignment (vec_info *vinfo, dr_vec_info *dr_info,
-				 tree vectype, opt_result *result = nullptr)
+				 tree vectype)
 {
   stmt_vec_info stmt_info = dr_info->stmt;
   vec_base_alignments *base_alignments = &vinfo->base_alignments;
@@ -1365,63 +1354,29 @@ vect_compute_data_ref_alignment (vec_info *vinfo, dr_vec_info *dr_info,
     = exact_div (targetm.vectorize.preferred_vector_alignment (vectype),
 		 BITS_PER_UNIT);
 
-  /* If this DR needs peeling for alignment for correctness, we must
-     ensure the target alignment is a constant power-of-two multiple of the
-     amount read per vector iteration (overriding the above hook where
-     necessary).  */
-  if (dr_info->need_peeling_for_alignment)
+  if (loop_vinfo
+      && dr_safe_speculative_read_required (stmt_info))
     {
-      /* Vector size in bytes.  */
-      poly_uint64 safe_align = tree_to_poly_uint64 (TYPE_SIZE_UNIT (vectype));
-
-      /* We can only peel for loops, of course.  */
-      gcc_checking_assert (loop_vinfo);
-
-      /* Calculate the number of vectors read per vector iteration.  If
-	 it is a power of two, multiply through to get the required
-	 alignment in bytes.  Otherwise, fail analysis since alignment
-	 peeling wouldn't work in such a case.  */
-      poly_uint64 num_scalars = LOOP_VINFO_VECT_FACTOR (loop_vinfo);
+      poly_uint64 vf = LOOP_VINFO_VECT_FACTOR (loop_vinfo);
+      auto vectype_size
+	= TREE_INT_CST_LOW (TYPE_SIZE_UNIT (TREE_TYPE (vectype)));
+      poly_uint64 new_alignment = vf * vectype_size;
+      /* If we have a grouped access we require that the alignment be
+	 N * elem.  */
       if (STMT_VINFO_GROUPED_ACCESS (stmt_info))
-	num_scalars *= DR_GROUP_SIZE (stmt_info);
+	new_alignment *= DR_GROUP_SIZE (DR_GROUP_FIRST_ELEMENT (stmt_info));
 
-      auto num_vectors = vect_get_num_vectors (num_scalars, vectype);
-      if (!pow2p_hwi (num_vectors))
-	{
-	  *result = opt_result::failure_at (vect_location,
-					    "non-power-of-two num vectors %u "
-					    "for DR needing peeling for "
-					    "alignment at %G",
-					    num_vectors, stmt_info->stmt);
-	  return;
-	}
-
-      safe_align *= num_vectors;
-      if (maybe_gt (safe_align, 4096U))
-	{
-	  pretty_printer pp;
-	  pp_wide_integer (&pp, safe_align);
-	  *result = opt_result::failure_at (vect_location,
-					    "alignment required for correctness"
-					    " (%s) may exceed page size",
-					    pp_formatted_text (&pp));
-	  return;
-	}
-
-      unsigned HOST_WIDE_INT multiple;
-      if (!constant_multiple_p (vector_alignment, safe_align, &multiple)
-	  || !pow2p_hwi (multiple))
+      unsigned HOST_WIDE_INT target_alignment;
+      if (new_alignment.is_constant (&target_alignment)
+	  && pow2p_hwi (target_alignment))
 	{
 	  if (dump_enabled_p ())
 	    {
 	      dump_printf_loc (MSG_NOTE, vect_location,
-			       "forcing alignment for DR from preferred (");
-	      dump_dec (MSG_NOTE, vector_alignment);
-	      dump_printf (MSG_NOTE, ") to safe align (");
-	      dump_dec (MSG_NOTE, safe_align);
-	      dump_printf (MSG_NOTE, ") for stmt: %G", stmt_info->stmt);
+			       "alignment increased due to early break to ");
+	      dump_dec (MSG_NOTE, new_alignment);
+	      dump_printf (MSG_NOTE, " bytes.\n");
 	    }
-	  vector_alignment = safe_align;
+	  vector_alignment = target_alignment;
 	}
     }
 
@@ -2487,6 +2442,8 @@ vect_enhance_data_refs_alignment (loop_vec_info loop_vinfo)
       || !slpeel_can_duplicate_loop_p (loop, LOOP_VINFO_IV_EXIT (loop_vinfo),
 				       loop_preheader_edge (loop))
       || loop->inner
+      /* We don't currently maintain LCSSA form for prologue peeled inverted
+	 loops.  */
       || LOOP_VINFO_EARLY_BREAKS_VECT_PEELED (loop_vinfo))
     do_peeling = false;
 
@@ -2950,12 +2907,9 @@ vect_analyze_data_refs_alignment (loop_vec_info loop_vinfo)
 	  if (STMT_VINFO_GROUPED_ACCESS (dr_info->stmt)
 	      && DR_GROUP_FIRST_ELEMENT (dr_info->stmt) != dr_info->stmt)
 	    continue;
-	  opt_result res = opt_result::success ();
+
 	  vect_compute_data_ref_alignment (loop_vinfo, dr_info,
-					   STMT_VINFO_VECTYPE (dr_info->stmt),
-					   &res);
-	  if (!res)
-	    return res;
+					   STMT_VINFO_VECTYPE (dr_info->stmt));
 	}
     }
 
@@ -7219,7 +7173,7 @@ vect_supportable_dr_alignment (vec_info *vinfo, dr_vec_info *dr_info,
 
   if (misalignment == 0)
     return dr_aligned;
-  else if (dr_info->need_peeling_for_alignment)
+  else if (dr_safe_speculative_read_required (stmt_info))
     return dr_unaligned_unsupported;
 
   /* For now assume all conditional loads/stores support unaligned
diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index 6bbb16beff2c627fca11a7403ba5ee3a5faa21c1..e0e5598babc7b119d408d83512ae8c6047d4924c 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -2597,6 +2597,165 @@ get_load_store_type (vec_info  *vinfo, stmt_vec_info stmt_info,
       return false;
     }
 
+  /* If this DR needs alignment for correctness, we must ensure the target
+     alignment is a constant power-of-two multiple of the amount read per
+     vector iteration or force masking.  */
+  if (dr_safe_speculative_read_required (stmt_info))
+    {
+      /* We can only peel for loops, of course.  */
+      gcc_checking_assert (loop_vinfo);
+
+      /* Check if we support the operation if early breaks are needed.  Here we
+	 must ensure that we don't access any more than the scalar code would
+	 have.  A masked operation would ensure this, so for these load types
+	 force masking.  */
+      if (LOOP_VINFO_EARLY_BREAKS (loop_vinfo)
+	  && (*memory_access_type == VMAT_GATHER_SCATTER
+	      || *memory_access_type == VMAT_STRIDED_SLP))
+	{
+	  if (dump_enabled_p ())
+	    dump_printf_loc (MSG_NOTE, vect_location,
+			     "early break not supported: cannot peel for "
+			     "alignment.  With non-contiguous memory "
+			     "vectorization could read out of bounds at %G",
+			     STMT_VINFO_STMT (stmt_info));
+	  LOOP_VINFO_MUST_USE_PARTIAL_VECTORS_P (loop_vinfo) = true;
+	}
+
+      /* Vector size in bytes.  */
+      poly_uint64 safe_align;
+      if (nunits.is_constant ())
+	safe_align = tree_to_poly_uint64 (TYPE_SIZE_UNIT (vectype));
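+      /* For variable-length vectors take the maximum estimate of the
+	 vectorization factor as a conservative proxy instead.  */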
+      else
+	safe_align = estimated_poly_value (LOOP_VINFO_VECT_FACTOR (loop_vinfo),
+					   POLY_VALUE_MAX);
+
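+      /* The number of vectors read per vector iteration must be a power of
+	 two, otherwise the required alignment would not be a power of two
+	 and alignment peeling could not establish it.  */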
+      auto num_vectors = ncopies;
+      if (!pow2p_hwi (num_vectors))
+	{
+	  if (dump_enabled_p ())
+	    dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
+			     "non-power-of-two num vectors %u "
+			     "for DR needing peeling for "
+			     "alignment at %G",
+			     num_vectors, STMT_VINFO_STMT (stmt_info));
+	  return false;
+	}
+
+      auto target_alignment
+	= DR_TARGET_ALIGNMENT (STMT_VINFO_DR_INFO (stmt_info));
+      unsigned HOST_WIDE_INT target_align;
+      bool inbounds
+	= DR_SCALAR_KNOWN_BOUNDS (STMT_VINFO_DR_INFO (stmt_info));
+
+      /* If the scalar accesses are known to be in bounds and we're using
+	 scalar (elementwise) accesses then there's no need to check
+	 further.  */
+      if (inbounds
+	  && *memory_access_type == VMAT_ELEMENTWISE)
+	{
+	  *alignment_support_scheme = dr_aligned;
+	  return true;
+	}
+
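+      /* Check whether the known target alignment covers a whole vector
+	 iteration of the (group) access.  This can only be verified when
+	 both the alignment and the number of lanes are compile-time
+	 constants.  */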
+      bool group_aligned = false;
+      if (target_alignment.is_constant (&target_align)
+	  && nunits.is_constant ())
+	{
+	  poly_uint64 vf = LOOP_VINFO_VECT_FACTOR (loop_vinfo);
+	  auto vectype_size
+	    = TREE_INT_CST_LOW (TYPE_SIZE_UNIT (TREE_TYPE (vectype)));
+	  poly_uint64 required_alignment = vf * vectype_size;
+	  /* If we have a grouped access we require that the alignment be
+	     N * elem.  */
+	  if (STMT_VINFO_GROUPED_ACCESS (stmt_info))
+	    required_alignment
+	      *= DR_GROUP_SIZE (DR_GROUP_FIRST_ELEMENT (stmt_info));
+	  if (maybe_ne (target_alignment, required_alignment))
+	    {
+	      if (dump_enabled_p ())
+		dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
+			     "desired alignment %wu not met, target alignment "
+			     "is %wu for DR at %G",
+			     required_alignment.to_constant (),
+			     target_align, STMT_VINFO_STMT (stmt_info));
+	      return false;
+	    }
+
+	  if (!pow2p_hwi (target_align))
+	    {
+	      if (dump_enabled_p ())
+		dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
+			     "non-power-of-two vector alignment %wd "
+			     "needed for DR alignment at %G",
+			     target_align, STMT_VINFO_STMT (stmt_info));
+	      return false;
+	    }
+	  group_aligned = true;
+	}
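+      /* If we couldn't verify the group alignment we can still vectorize
+	 using partial vectors when the scalar accesses are known to be in
+	 bounds; otherwise vectorization would be unsafe.  */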
+      else if (inbounds)
+	LOOP_VINFO_MUST_USE_PARTIAL_VECTORS_P (loop_vinfo) = true;
+      else
+	return false;
+
+      /* Even if uneven group sizes are aligned on the first load, the second
+	 iteration won't be.  As such reject non-power of 2 group sizes.  */
+      if (STMT_VINFO_GROUPED_ACCESS (stmt_info))
+	{
+	  auto group_size = DR_GROUP_SIZE (DR_GROUP_FIRST_ELEMENT (stmt_info));
+	  if (!pow2p_hwi (group_size))
+	    {
+	      if (group_aligned || inbounds)
+		LOOP_VINFO_MUST_USE_PARTIAL_VECTORS_P (loop_vinfo) = true;
+	      else
+		{
+		  if (dump_enabled_p ())
+		    dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
+			     "early break not supported: uneven group size, "
+			     "vectorization could read out of bounds at %G",
+			     STMT_VINFO_STMT (stmt_info));
+		  return false;
+		}
+	    }
+	}
+
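+      /* Compute the total amount of data speculatively read in one vector
+	 iteration.  */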
+      safe_align *= num_vectors;
+      /* For VLA we have to insert a runtime check that the vector loads
+	 per iteration don't exceed a page size.  For now we can use
+	 POLY_VALUE_MAX as a proxy as we can't peel for VLA.  */
+      if (maybe_gt (safe_align, (unsigned) param_min_pagesize))
+	{
+	  if (dump_enabled_p ())
+	    {
+	      dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
+			       "alignment required for correctness (");
+	      dump_dec (MSG_MISSED_OPTIMIZATION, safe_align);
+	      dump_printf (MSG_MISSED_OPTIMIZATION, ") may exceed page size\n");
+	    }
+	  return false;
+	}
+
+      /* The load has a misalignment that we weren't able to remove by
+	 peeling.  We would need LOOP_VINFO_MUST_USE_PARTIAL_VECTORS_P to
+	 vectorize.  */
+      if (!STMT_VINFO_GROUPED_ACCESS (stmt_info)
+	  && *misalignment != 0)
+	return false;
+
+      /* When using a group access the first element may be aligned but the
+	 subsequent loads may not be.  For LOAD_LANES, since the loads are
+	 based on the first DR, all loads in the group are aligned.  For
+	 non-LOAD_LANES this is not the case.  In particular a load + blend
+	 when there are gaps can have the non-first loads issued unaligned,
+	 even partially overlapping the memory of the first load, in order
+	 to simplify the blend.  This is what the x86_64 backend does for
+	 instance.  As such only the first load in the group is aligned, the
+	 rest are not.  */
+      else if (STMT_VINFO_GROUPED_ACCESS (stmt_info)
+	       && *misalignment != 0
+	       && *memory_access_type != VMAT_LOAD_STORE_LANES)
+	*alignment_support_scheme = dr_unaligned_supported;
+      else
+	*alignment_support_scheme = dr_aligned;
+    }
+
   if (*alignment_support_scheme == dr_unaligned_unsupported)
     {
       if (dump_enabled_p ())
diff --git a/gcc/tree-vectorizer.h b/gcc/tree-vectorizer.h
index b0cb081cba0ae8b11fbfcfcb8c6d440ec451ccb5..97caf61b345735d297ec49fd6ca64797435b46fc 100644
--- a/gcc/tree-vectorizer.h
+++ b/gcc/tree-vectorizer.h
@@ -1281,7 +1281,11 @@ public:
 
   /* Set by early break vectorization when this DR needs peeling for alignment
      for correctness.  */
-  bool need_peeling_for_alignment;
+  bool safe_speculative_read_required;
+
+  /* Set by early break vectorization when this DR's scalar accesses are
+     known to be within the bounds of the accessed object.  */
 
   tree base_decl;
 
@@ -1997,6 +2001,35 @@ dr_target_alignment (dr_vec_info *dr_info)
   return dr_info->target_alignment;
 }
 #define DR_TARGET_ALIGNMENT(DR) dr_target_alignment (DR)
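+/* True if the scalar accesses of this DR are known to be within the bounds
+   of the accessed object.  */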
+#define DR_SCALAR_KNOWN_BOUNDS(DR) (DR)->scalar_access_known_in_bounds
+
+/* Return whether the stmt_vec_info requires a safe speculative read, e.g.
+   via peeling for alignment.  For a group access the flag is kept on the
+   first element of the group.  */
+inline bool
+dr_safe_speculative_read_required (stmt_vec_info stmt_info)
+{
+  dr_vec_info *dr_info;
+  if (STMT_VINFO_GROUPED_ACCESS (stmt_info))
+    dr_info = STMT_VINFO_DR_INFO (DR_GROUP_FIRST_ELEMENT (stmt_info));
+  else
+    dr_info = STMT_VINFO_DR_INFO (stmt_info);
+
+  return dr_info->safe_speculative_read_required;
+}
+
+/* Set safe_speculative_read_required for the stmt_vec_info; for a group
+   access set it on the first element, otherwise set it on the DR directly.  */
+inline void
+dr_set_safe_speculative_read_required (stmt_vec_info stmt_info,
+				       bool requires_alignment)
+{
+  dr_vec_info *dr_info;
+  if (STMT_VINFO_GROUPED_ACCESS (stmt_info))
+    dr_info = STMT_VINFO_DR_INFO (DR_GROUP_FIRST_ELEMENT (stmt_info));
+  else
+    dr_info = STMT_VINFO_DR_INFO (stmt_info);
+
+  dr_info->safe_speculative_read_required = requires_alignment;
+}
 
 inline void
 set_dr_target_alignment (dr_vec_info *dr_info, poly_uint64 val)
