https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121293
Bug ID: 121293
Summary: svdupq_lane produces suboptimal code for big-endian SVE
Product: gcc
Version: 16.0
Status: UNCONFIRMED
Keywords: aarch64-sve
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: rsandifo at gcc dot gnu.org
Target Milestone: ---
Target: aarch64*-*-*

Compiling:

#pragma GCC aarch64 "arm_sve.h"

svint32_t f(svint32_t x) {
  return svdupq_lane (x, 17);
}

with -march=armv8.2-a+sve -mbig-endian -O2 gives:

f(__SVInt32_t):
        ptrue   p2.b, all
        addvl   sp, sp, #-1
        st1w    z0.s, p2, [sp]
        adrp    x0, .LC0
        ld1w    z30.s, p2/z, [sp]
        add     x0, x0, :lo12:.LC0
        revw    z30.d, p2/m, z30.d
        ld1rqd  z31.d, p2/z, [x0]
        tbl     z0.d, {z30.d}, z31.d
        addvl   sp, sp, #1
        revw    z0.d, p2/m, z0.d
        ret
.LC0:
        .xword  34
        .xword  35

The REVWs are bogus, but I think they cancel out: the TBL moves whole
doublewords, and each REVW just reverses the two 32-bit words within
every doubleword, so the REVW/TBL/REVW sequence has the same effect as
the TBL alone.  This is therefore merely suboptimal code rather than
wrong code.

This comes from using lowpart subregs instead of aarch64_sve_reinterpret
to cast to VNx2DI and back.
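
For reference, here is a minimal scalar model of what the intrinsic is
expected to compute (a hypothetical illustration, not part of arm_sve.h;
it assumes the TBL-style zeroing of out-of-range lanes that the
generated code above relies on):

#include <stddef.h>
#include <stdint.h>

/* Model of svdupq_lane for 32-bit elements: broadcast the 128-bit
   quadword at INDEX (elements INDEX*4 .. INDEX*4+3) across the
   vector, reading lanes beyond the vector length as zero.  */
static void
dupq_lane_s32_model (int32_t *dst, const int32_t *src,
                     size_t nelts, uint64_t index)
{
  for (size_t i = 0; i < nelts; i++)
    {
      size_t lane = index * 4 + (i % 4);
      dst[i] = lane < nelts ? src[lane] : 0;
    }
}

With index 17 this selects elements 68..71, which is why the constant
pool entry holds doubleword indices 34 and 35.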
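
The cancellation can be sanity-checked with a scalar stand-in (again
just an illustration: four doublewords model one SVE register and an
index array models the doubleword TBL):

#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* revw: reverse the two 32-bit words within a 64-bit element.  */
static uint64_t
revw (uint64_t x)
{
  return (x << 32) | (x >> 32);
}

int
main (void)
{
  /* Four doublewords standing in for an SVE register at VL=256.  */
  uint64_t in[4] = { 0x0000000100000002ull, 0x0000000300000004ull,
                     0x0000000500000006ull, 0x0000000700000008ull };
  /* A doubleword-granular permute.  TBL zeroes out-of-range lanes,
     for which the cancellation also holds, since revw (0) == 0.  */
  unsigned sel[4] = { 2, 3, 2, 3 };

  uint64_t plain[4], wrapped[4];
  for (int i = 0; i < 4; i++)
    {
      plain[i] = in[sel[i]];                  /* bare TBL */
      wrapped[i] = revw (revw (in[sel[i]]));  /* REVW, TBL, REVW */
    }
  printf (memcmp (plain, wrapped, sizeof plain) == 0
          ? "REVWs cancel\n" : "REVWs do not cancel\n");
  return 0;
}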