https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102139
Andrew Pinski <pinskia at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Target Milestone|---                         |11.3
     Ever confirmed|0                           |1
            Summary|-O3 miscompile due to       |[11/12 Regression] -O3
                   |slp-vectorize on strict     |miscompile due to
                   |align target                |slp-vectorize on strict
                   |                            |align target
   Last reconfirmed|                            |2021-08-31
--- Comment #3 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
FRE is fine here; the problem is SLP.
Here is why. Take:
typedef decltype(sizeof(0)) size_t;
typedef unsigned short uint16_t;
typedef unsigned uint32_t;

void zero_two_uint16(uint16_t* ptr) {
    ptr[0] = 0;
    ptr[1] = 0;
}

#define vector __attribute__((vector_size(sizeof(uint16_t)*4)))

void f(uint16_t *a)
{
    vector uint16_t *b = (vector uint16_t *)a;
    *b = (vector uint16_t){};
}

void g(uint16_t *a)
{
    size_t t = (size_t)a;
    if ((t & 0x7) == 0) {
        for (int i = 0; i < 8; i++)
            f((a + i*4));
    } else {
        for (int i = 0; i < 16; i++)
            zero_two_uint16((a + i*2));
    }
}
---- CUT ----
Compile it on aarch64 with
-O3 -mgeneral-regs-only -fno-tree-loop-distribute-patterns -mstrict-align -fno-tree-loop-vectorize
and SLP will produce the same result.
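To illustrate why the `(t & 0x7)` check in g matters on a strict-align target, here is a minimal sketch. It is not part of the report: `is_aligned` and `aligned_pair_stores` are hypothetical helpers, and it assumes `a` is at least 2-byte aligned. It counts how many of the 16 two-element stores in the else branch would be 4-byte aligned for given low address bits of `a`.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper (illustration only): is `addr` aligned to `n` bytes?
// `n` must be a power of two.
constexpr bool is_aligned(std::uintptr_t addr, std::size_t n) {
    return (addr & (n - 1)) == 0;
}

// Counts how many of the 16 calls zero_two_uint16(a + i*2) in the else
// branch of g() write to a 4-byte-aligned address, given the low three
// bits of `a` (assumed values: 2, 4 or 6 in that branch).
constexpr int aligned_pair_stores(std::uintptr_t low_bits) {
    int count = 0;
    for (int i = 0; i < 16; i++) {
        // a + i*2 in uint16_t units is low_bits + i*4 in bytes.
        if (is_aligned(low_bits + i * 4u, 4))
            count++;
    }
    return count;
}
```

With low bits 2 or 6, none of the paired stores is 4-byte aligned, so merging them into wider stores is only valid if the compiler has alignment information that actually holds in that branch.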
There is not even a PHI for a that FRE could merge, and there is no alignment
information on the pointer assignments either.
As you can see from the dump (-fdump-tree-*-all):
# PT = nonlocal null
uint16_tD.1724 * a_15(D) = aD.1733;