https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120980

            Bug ID: 120980
           Summary: Vectorizer introduces out-of-bounds memory access
           Product: gcc
           Version: 16.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: kristerw at gcc dot gnu.org
  Target Milestone: ---

The vectorizer introduces an out-of-bounds memory access for the function below
when compiled for AArch64 with "-O3 -march=armv9.5-a -fno-strict-aliasing".

void foo(long *p1, long *p2) {
  for (int i = 0; i < 8; i++)
    if (p1[i] != p2[i])
      __builtin_exit(0);
}

The vectorized code checks if p1 and p2 are 32-byte aligned, and in that case
processes 4 elements at a time:

  vect__4.19_93 = MEM <vector(2) long int> [(long int *)vectp_p1.17_91];
  vectp_p1.17_94 = vectp_p1.17_91 + 16;
  vect__4.20_95 = MEM <vector(2) long int> [(long int *)vectp_p1.17_94];
  vect__6.15_85 = MEM <vector(2) long int> [(long int *)vectp_p2.13_83];
  vectp_p2.13_86 = vectp_p2.13_83 + 16;
  vect__6.16_87 = MEM <vector(2) long int> [(long int *)vectp_p2.13_86];
  mask_patt_7.21_96 = vect__6.15_85 != vect__4.19_93;
  mask_patt_7.21_97 = vect__6.16_87 != vect__4.20_95;
  vexit_reduc_98 = mask_patt_7.21_97 | mask_patt_7.21_96;
  if (vexit_reduc_98 != { 0, 0 })
    goto <bb 28>; [5.50%]
  else
    goto <bb 5>; [94.50%]

The problem occurs if one of the arrays has fewer than 8 elements, such as
  long a1[8] = {0, 0, 0, 0, 0, 0, 0, 0};
  long a2[2] = {0, 1};
  foo(a1, a2);
The scalar loop exits at i = 1 and never reads past a2[1], but the vectorized
code loads vect__6.16_87 entirely from outside a2. This is harmless in practice
as the bytes are within the same page, but fully out-of-bounds accesses are
invalid according to the discussion in
https://gcc.gnu.org/pipermail/gcc/2025-April/245873.html
The IR would have been valid if the loads were done as one 4-element load
instead of two 2-element loads, as partly out-of-bounds accesses are valid.
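
To make the distinction concrete, here is a minimal sketch of the byte-range
reasoning above. The names classify, access_kind and first_mismatch are
hypothetical helpers written for illustration only; they are not part of GCC
and do not model the vectorizer itself. first_mismatch mirrors the scalar loop
in foo but returns the index instead of calling __builtin_exit.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical classification of a load of `len` bytes starting at byte
   offset `off` within an object that is `size` bytes long. */
enum access_kind { IN_BOUNDS, PARTLY_OUT_OF_BOUNDS, FULLY_OUT_OF_BOUNDS };

static enum access_kind classify(size_t size, size_t off, size_t len)
{
  assert(len > 0);
  if (off >= size)
    return FULLY_OUT_OF_BOUNDS;   /* no byte of the load is in the object */
  if (len > size - off)
    return PARTLY_OUT_OF_BOUNDS;  /* the load starts inside, ends outside */
  return IN_BOUNDS;
}

/* Scalar reference for the loop in foo: returns the index of the first
   mismatch, or n if there is none.  It never reads past that index. */
static int first_mismatch(const long *p1, const long *p2, int n)
{
  for (int i = 0; i < n; i++)
    if (p1[i] != p2[i])
      return i;
  return n;
}
```

For the a1/a2 input above, first_mismatch(a1, a2, 8) stops at index 1, so the
scalar code only touches a2[0] and a2[1]. With a 2-element vector of
2 * sizeof(long) bytes, the second load from a2 starts at offset
2 * sizeof(long), which classify reports as FULLY_OUT_OF_BOUNDS for an object
of sizeof(a2) bytes. A single 4-element load at offset 0 would instead be
PARTLY_OUT_OF_BOUNDS.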
