On 7/7/25 06:19, Kyrylo Tkachov wrote:

Hi all,

Similar to the BCAX and EOR3 patterns from TARGET_SHA3 we can use the
SVE2 NBSL instruction for DImode arugments when they come in SIMD registers.

Minor nit: there is a typo in "arugments"


Again, this is accomplished with a new splitter for the GP case. I noticed
that the split has a side-effect of producing a GP EON instruction where it
wasn't getting generated before because the BSL insn-and-split got in the way.
So for the inputs:

uint64_t nbsl_gp(uint64_t a, uint64_t b, uint64_t c) { return NBSL (a, b, c); }
uint64x1_t nbsl_d (uint64x1_t a, uint64x1_t b, uint64x1_t c) { return NBSL (a, b, c); }

We now generate:
nbsl_gp:
eor x0, x0, x1
and x0, x0, x2
eon x0, x0, x1
ret

nbsl_d:
nbsl z0.d, z0.d, z1.d, z2.d
ret

instead of:
nbsl_gp:
eor x0, x1, x0
and x0, x0, x2
eor x0, x0, x1
mvn x0, x0
ret

nbsl_d:
bif v0.8b, v1.8b, v2.8b
mvn v0.8b, v0.8b
ret

Bootstrapped and tested on aarch64-none-linux-gnu.
Ok for trunk?

Looks good to me aside from the nit above.

Remi


Thanks,
Kyrill

Signed-off-by: Kyrylo Tkachov <ktkac...@nvidia.com>

gcc/

        * config/aarch64/aarch64-sve.md (*aarch64_sve2_nbsl_unpreddi): New
        define_insn_and_split.

gcc/testsuite/

        * gcc.target/aarch64/sve2/nbsl_d.c: New test.
