https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102139

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Target Milestone|---                         |11.3
     Ever confirmed|0                           |1
            Summary|-O3 miscompile due to       |[11/12 Regression] -O3
                   |slp-vectorize on strict     |miscompile due to
                   |align target                |slp-vectorize on strict
                   |                            |align target
   Last reconfirmed|                            |2021-08-31

--- Comment #3 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
FRE is fine here; the problem is SLP.
Here is why. Take:
typedef decltype(sizeof(0)) size_t;
typedef unsigned short uint16_t;
typedef unsigned uint32_t;

void zero_two_uint16(uint16_t* ptr) {
  ptr[0] = 0;
  ptr[1] = 0;
}

#define vector __attribute__((vector_size(sizeof(uint16_t)*4)))
void f(uint16_t *a)
{
    vector uint16_t *b = (vector uint16_t *)a;
    *b = (vector uint16_t){};
}

void g(uint16_t *a)
{
    size_t t = (size_t)a;
    if ((t & 0x7)==0) {
        for(int i = 0;i < 8;i++)
            f((a + i*4));
    } else {
        for(int i = 0;i < 16;i++)
            zero_two_uint16((a + i*2));
    }
}
---- CUT ----
Compile it on aarch64 with -O3 -mgeneral-regs-only
-fno-tree-loop-distribute-patterns -mstrict-align -fno-tree-loop-vectorize
and SLP produces the same result.
There is not even a PHI for 'a' for FRE to merge, and there is no alignment
information on the pointer assignments either.
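
Both branches of g() are meant to have the same effect: the 32 uint16_t
elements starting at 'a' end up zero (8 x 4 on the aligned path, 16 x 2
otherwise). What follows is a minimal runtime check of that, as a sketch and
not part of the original report. The names v4hi and g_zeroes_all, the 0xff
fill pattern, and the noinline on g() (added so the compiler cannot see the
buffer's real alignment at the call site) are assumptions made for
illustration.

```cpp
#include <cassert>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

// Four uint16_t lanes, like the "vector" macro in the testcase (name is hypothetical).
typedef uint16_t v4hi __attribute__((vector_size(sizeof(uint16_t) * 4)));

void zero_two_uint16(uint16_t *ptr) {
  ptr[0] = 0;
  ptr[1] = 0;
}

void f(uint16_t *a) {
  v4hi *b = (v4hi *)a;       // store assumes 8-byte alignment
  v4hi zero = {0, 0, 0, 0};
  *b = zero;
}

// noinline is an addition: it hides the caller's buffer alignment from g().
__attribute__((noinline)) void g(uint16_t *a) {
  size_t t = (size_t)a;
  if ((t & 0x7) == 0) {
    for (int i = 0; i < 8; i++)
      f(a + i * 4);
  } else {
    for (int i = 0; i < 16; i++)
      zero_two_uint16(a + i * 2);
  }
}

// Calls g() on a pointer offset_elems (0..8) elements into an 8-byte-aligned
// buffer, then checks that exactly the 32 elements from that pointer are zero.
bool g_zeroes_all(size_t offset_elems) {
  alignas(8) uint16_t buf[40];
  memset(buf, 0xff, sizeof buf);
  g(buf + offset_elems);
  for (size_t i = 0; i < 40; i++) {
    bool inside = i >= offset_elems && i < offset_elems + 32;
    uint16_t want = inside ? 0 : 0xffff;
    if (buf[i] != want)
      return false;
  }
  return true;
}
```

An offset of 0 or 4 takes the aligned path; an offset of 1 or 2 takes the
unaligned path, which is the one of interest on a strict-align target. A
correct build should return true for all of them.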

As you can see from the dump (-fdump-tree-*-all):
  # PT = nonlocal null 
  uint16_tD.1724 * a_15(D) = aD.1733;
