https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100048
Bug ID: 100048
Summary: [10/11 Regression] Wrongful CSE'ing of SVE predicates.
Product: gcc
Version: 11.0
Status: UNCONFIRMED
Keywords: wrong-code
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: tnfchris at gcc dot gnu.org
Target Milestone: ---
Target: aarch64-*
The following testcase
#include "arm_sve.h"

void foo(svfloat16_t in, float16_t *dst) {
  const svbool_t pg_q0 = svdupq_n_b16(1, 0, 1, 0, 0, 0, 0, 0);
  const svbool_t pg_f0 = svdupq_n_b16(1, 0, 0, 0, 0, 0, 0, 0);
  dst[0] = svaddv_f16(pg_f0, in);
  dst[1] = svaddv_f16(pg_q0, in);
}
generates the right code at -O1 with -march=armv8-a+sve but generates wrong
code at -O2.
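For reference, the values the testcase should produce can be modelled in portable C without SVE hardware. This is only a sketch and not GCC or ACLE code: `ref_addv` and `ref_foo` are hypothetical names, `float` stands in for `float16_t`, and the model assumes that svdupq_n_b16 repeats its eight arguments across every 128-bit quadword and that svaddv_f16 sums the active lanes.

```c
#include <stddef.h>

/* Reference model (hypothetical helper, not part of ACLE): lane i is active
   when pattern[i % 8] is nonzero, mirroring a predicate whose eight 16-bit
   lane flags repeat in every 128-bit quadword.  Returns the sum of the
   active lanes. */
float ref_addv(const int pattern[8], const float *in, size_t lanes)
{
    float sum = 0.0f;
    for (size_t i = 0; i < lanes; i++)
        if (pattern[i % 8])
            sum += in[i];
    return sum;
}

/* Hypothetical scalar stand-in for foo() from the testcase.  `lanes` is the
   number of 16-bit lanes in the vector (8 for a 128-bit vector length). */
void ref_foo(const float *in, size_t lanes, float *dst)
{
    static const int pg_q0[8] = {1, 0, 1, 0, 0, 0, 0, 0};
    static const int pg_f0[8] = {1, 0, 0, 0, 0, 0, 0, 0};

    dst[0] = ref_addv(pg_f0, in, lanes);  /* lane 0 of each quadword */
    dst[1] = ref_addv(pg_q0, in, lanes);  /* lanes 0 and 2 of each quadword */
}
```

With a 128-bit vector holding {1, 2, 4, 8, 16, 32, 64, 128}, this model gives dst[0] = 1 and dst[1] = 5, so the two predicates must not be treated as equal.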
From this testcase, the following expands are created:
(insn 22 21 23 2 (set (reg:VNx8BI 100)
(subreg:VNx8BI (reg:VNx2BI 103) 0))
(expr_list:REG_EQUAL (const_vector:VNx8BI [
(const_int 1 [0x1])
(const_int 0 [0])
(const_int 1 [0x1])
(const_int 0 [0]) repeated x5
])
(nil)))
and
(insn 15 14 16 2 (set (reg:VNx8BI 96)
(subreg:VNx8BI (reg:VNx2BI 99) 0))
(expr_list:REG_EQUAL (const_vector:VNx8BI [
(const_int 1 [0x1])
(const_int 0 [0]) repeated x7
])
(nil)))
where the subregs are paradoxical. These incorrect paradoxical subregs cause
CSE to think these two predicates are the same.
As a result, it CSEs them away, giving
foo:
pfalse p2.b
ptrue p1.d, all
trn1 p1.d, p1.d, p2.d
faddv h1, p1, z0.h
str h1, [x0]
ptrue p0.s, all
trn1 p0.d, p0.d, p2.d
faddv h0, p0, z0.h
str h0, [x0, 2]
ret
instead of the expected
foo:
pfalse p2.b
ptrue p1.d, all
ptrue p0.s, all
trn1 p1.s, p1.s, p2.s
trn1 p0.s, p0.s, p2.s
faddv h1, p1, z0.h
faddv h0, p0, z0.h
str h1, [x0]
str h0, [x0, 2]
ret