https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121293

            Bug ID: 121293
           Summary: svdupq_lane produces suboptimal code for big-endian
                    SVE
           Product: gcc
           Version: 16.0
            Status: UNCONFIRMED
          Keywords: aarch64-sve
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: rsandifo at gcc dot gnu.org
  Target Milestone: ---
            Target: aarch64*-*-*

Compiling:

  #pragma GCC aarch64 "arm_sve.h"

  svint32_t f(svint32_t x) { return svdupq_lane (x, 17); }

with -march=armv8.2-a+sve -mbig-endian -O2 gives:

f(__SVInt32_t):
        ptrue   p2.b, all
        addvl   sp, sp, #-1
        st1w    z0.s, p2, [sp]
        adrp    x0, .LC0
        ld1w    z30.s, p2/z, [sp]
        add     x0, x0, :lo12:.LC0
        revw    z30.d, p2/m, z30.d
        ld1rqd  z31.d, p2/z, [x0]
        tbl     z0.d, {z30.d}, z31.d
        addvl   sp, sp, #1
        revw    z0.d, p2/m, z0.d
        ret
.LC0:
        .xword  34
        .xword  35

The REVWs are bogus, but I think they cancel out, meaning that this is merely
suboptimal code rather than wrong code.

This comes from using lowpart subregs instead of aarch64_sve_reinterpret to
cast to VNx2DI and back.
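
For comparison, the little-endian output needs neither the REVWs nor the
stack spill/reload. Presumably, with the cast done via
aarch64_sve_reinterpret, the big-endian output would reduce to something
like the following (a sketch of the expected codegen, not verified
compiler output; register allocation may differ):

  f(__SVInt32_t):
          ptrue   p0.b, all
          adrp    x0, .LC0
          add     x0, x0, :lo12:.LC0
          ld1rqd  z31.d, p0/z, [x0]
          tbl     z0.d, {z0.d}, z31.d
          ret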
