For svmul, if one of the operands is a constant vector in which every
element is the same power of 2, this patch folds the multiplication to a
left-shift by immediate (svlsl).
Because the shift amount in svlsl is the second operand, the operands are
swapped if the first operand contains the power of 2. However, this swap
is not valid for all predications: if the predication is _m and the
predicate is not ptrue, the result of svlsl might differ from that of
svmul, because inactive lanes take their value from the first operand.
Therefore, we do not apply the fold in this case.
The transform is also not applied to INTMIN for signed integers or to
constant vectors of 1 (the latter case is partially covered by constant
folding already, and the missing cases will be addressed by the follow-up
patch suggested in
https://gcc.gnu.org/pipermail/gcc-patches/2024-September/663275.html).
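
The _m restriction described above can be illustrated with a scalar model.
This is only a sketch under stated assumptions: LANES, model_mul_m,
model_lsl_m and swapped_fold_matches are hypothetical names that do not
appear in the patch, and plain arrays stand in for SVE vectors and
predicates.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define LANES 4  /* Hypothetical fixed lane count for the model.  */

/* Scalar model of a multiply with _m predication: active lanes compute
   op1 * op2, inactive lanes keep the value of the FIRST operand.  */
static void
model_mul_m (const bool *pg, const uint32_t *op1, const uint32_t *op2,
             uint32_t *res)
{
  for (int i = 0; i < LANES; i++)
    res[i] = pg[i] ? op1[i] * op2[i] : op1[i];
}

/* Scalar model of a left-shift by immediate with _m predication:
   inactive lanes again keep the value of the first operand.  */
static void
model_lsl_m (const bool *pg, const uint32_t *op1, unsigned shift,
             uint32_t *res)
{
  for (int i = 0; i < LANES; i++)
    res[i] = pg[i] ? op1[i] << shift : op1[i];
}

/* Return true if mul_m (pg, {4, ...}, x) and the swapped form
   lsl_m (pg, x, 2) agree in every lane.  */
static bool
swapped_fold_matches (const bool *pg, const uint32_t *x)
{
  uint32_t four[LANES] = { 4, 4, 4, 4 };
  uint32_t mul_res[LANES], lsl_res[LANES];

  model_mul_m (pg, four, x, mul_res);
  model_lsl_m (pg, x, 2, lsl_res);
  return memcmp (mul_res, lsl_res, sizeof mul_res) == 0;
}
```

With an all-true predicate the two forms agree in this model. With an
inactive lane, the multiply leaves the constant 4 in that lane while the
swapped shift leaves x, so the results differ whenever x is not 4 there.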

Tests were added to the existing test harness to check the produced assembly:
- when the first or second operand contains the power of 2
- when the second operand is a vector or scalar (_n)
- for _m, _z, _x predication
- for _m with ptrue or non-ptrue
- for intmin for signed integer types
- for the maximum power of 2 for signed and unsigned integer types.
Note that the tests use 4 rather than 2 as the power of 2, because a recent
patch optimizes left-shifts by 1 to an add instruction. A higher power of 2
makes the change to an lsl instruction visible in the assembly.
Runtime tests were also added to check correctness.
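
The boundary cases in the test list above (INTMIN, the maximum power of 2,
and a multiplier of 1) can be summarized by a small scalar helper. This is
a sketch only, not the code in svmul_impl::fold: shift_for_multiplier and
its parameters are hypothetical, and the element value is assumed to be
passed as a raw bit pattern together with the element width in bits.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: return the shift amount k if the WIDTH-bit
   element VALUE is 2^k with k >= 1 and is not the signed minimum;
   return -1 if the multiply should be left alone.  WIDTH is assumed to
   be between 8 and 64.  */
static int
shift_for_multiplier (uint64_t value, unsigned width, bool is_signed)
{
  uint64_t mask = width == 64 ? UINT64_MAX : (UINT64_C (1) << width) - 1;
  uint64_t v = value & mask;

  /* 0 is not a power of 2, and 1 is left to constant folding.  */
  if (v <= 1)
    return -1;

  /* More than one bit set: not a power of 2.  */
  if ((v & (v - 1)) != 0)
    return -1;

  /* For signed types the top bit alone is INTMIN, which is excluded.  */
  if (is_signed && v == (UINT64_C (1) << (width - 1)))
    return -1;

  /* Count the position of the single set bit.  */
  int shift = 0;
  while ((v >>= 1) != 0)
    shift++;
  return shift;
}
```

For 8-bit elements in this model, 0x80 is rejected for the signed type
(it is INTMIN) but accepted for the unsigned type, where it is the maximum
power of 2; the signed maximum is 0x40.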

The patch was bootstrapped and regtested on aarch64-linux-gnu with no
regressions.
OK for mainline?

Signed-off-by: Jennifer Schmitz <jschm...@nvidia.com>

gcc/
        * config/aarch64/aarch64-sve-builtins-base.cc (svmul_impl::fold):
        Implement fold to svlsl for power-of-2 operands.

gcc/testsuite/
        * gcc.target/aarch64/sve/acle/asm/mul_s8.c: New test.
        * gcc.target/aarch64/sve/acle/asm/mul_s16.c: Likewise.
        * gcc.target/aarch64/sve/acle/asm/mul_s32.c: Likewise.
        * gcc.target/aarch64/sve/acle/asm/mul_s64.c: Likewise.
        * gcc.target/aarch64/sve/acle/asm/mul_u8.c: Likewise.
        * gcc.target/aarch64/sve/acle/asm/mul_u16.c: Likewise.
        * gcc.target/aarch64/sve/acle/asm/mul_u32.c: Likewise.
        * gcc.target/aarch64/sve/acle/asm/mul_u64.c: Likewise.
        * gcc.target/aarch64/sve/mul_const_run.c: Likewise.

Attachment: 0001-SVE-intrinsics-Fold-svmul-with-constant-power-of-2-o.patch
Description: Binary data
