https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116145

            Bug ID: 116145
           Summary: Suboptimal SVE immediate synthesis
           Product: gcc
           Version: 15.0
            Status: UNCONFIRMED
          Keywords: aarch64-sve, missed-optimization
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: ktkachov at gcc dot gnu.org
  Target Milestone: ---
            Target: aarch64

While optimising some string-matching code I wanted to create a vector of
characters to match against using svdup and svreinterpret, but GCC generates
suboptimal code that loads the constant from the constant pool.
A minimised testcase:
#include <arm_sve.h>

svuint8_t
foo (void)
{
    return svreinterpret_u8(svdup_u32(0x0a0d5c3f));
}

With -O2 -march=armv9-a, GCC generates:
foo:
        ptrue   p3.b, all
        adrp    x0, .LC0
        add     x0, x0, :lo12:.LC0
        ld1rw   z0.s, p3/z, [x0]
        ret
.LC0:
        .word   168647743

but LLVM avoids the load by building the immediate in a general-purpose
register and broadcasting it:
foo:
        mov     w8, #23615
        movk    w8, #2573, lsl #16
        mov     z0.s, w8
        ret
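
For reference, the two immediates in the LLVM sequence are simply the low and
high 16-bit halves of 0x0a0d5c3f, which is also the value 168647743 that GCC
places in .LC0. The following is a minimal sketch of that arithmetic in plain
C; the helper names (imm_lo16, imm_hi16, mov_movk) are hypothetical and are not
part of GCC or LLVM. It only models the scalar mov/movk pair, not the final
broadcast into z0.s.

```c
#include <stdint.h>

/* Hypothetical helpers for illustration only; not GCC or LLVM code. */

/* Low 16 bits of a 32-bit immediate: the operand of "mov w8, #lo". */
static uint16_t imm_lo16(uint32_t imm)
{
    return (uint16_t)(imm & 0xffffu);
}

/* High 16 bits: the operand of "movk w8, #hi, lsl #16". */
static uint16_t imm_hi16(uint32_t imm)
{
    return (uint16_t)(imm >> 16);
}

/* Models the effect of the two-instruction sequence on w8. */
static uint32_t mov_movk(uint16_t lo, uint16_t hi)
{
    uint32_t w = lo;                              /* mov  w8, #lo           */
    w = (w & 0x0000ffffu) | ((uint32_t)hi << 16); /* movk w8, #hi, lsl #16  */
    return w;
}
```

For the testcase constant, imm_lo16 gives 23615 (0x5c3f) and imm_hi16 gives
2573 (0x0a0d), matching the operands in the LLVM output above.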
