Ping.

> On 20 Sep 2024, at 11:28, Jennifer Schmitz <jschm...@nvidia.com> wrote:
>
> For svmul, if one of the operands is a constant vector with a uniform
> power of 2, this patch folds the multiplication to a left-shift by
> immediate (svlsl).
> Because the shift amount in svlsl is the second operand, the order of the
> operands is switched, if the first operand contained the powers of 2. However,
> this switching is not valid for some predications: If the predication is
> _m and the predicate not ptrue, the result of svlsl might not be the
> same as for svmul. Therefore, we do not apply the fold in this case.
> The transform is also not applied to INTMIN for signed integers and to
> constant vectors of 1 (this case is partially covered by constant folding
> already and the missing cases will be addressed by the follow-up patch
> suggested in
> https://gcc.gnu.org/pipermail/gcc-patches/2024-September/663275.html).
>
> Tests were added in the existing test harness to check the produced assembly
> - when the first or second operand contains the power of 2
> - when the second operand is a vector or scalar (_n)
> - for _m, _z, _x predication
> - for _m with ptrue or non-ptrue
> - for intmin for signed integer types
> - for the maximum power of 2 for signed and unsigned integer types.
> Note that we used 4 as a power of 2, instead of 2, because a recent
> patch optimizes left-shifts by 1 to an add instruction. But since we
> wanted to highlight the change to an lsl instruction we used a higher
> power of 2.
> To also check correctness, runtime tests were added.
>
> The patch was bootstrapped and regtested on aarch64-linux-gnu, no regression.
> OK for mainline?
>
> Signed-off-by: Jennifer Schmitz <jschm...@nvidia.com>
>
> gcc/
> 	* config/aarch64/aarch64-sve-builtins-base.cc (svmul_impl::fold):
> 	Implement fold to svlsl for power-of-2 operands.
>
> gcc/testsuite/
> 	* gcc.target/aarch64/sve/acle/asm/mul_s8.c: New test.
> 	* gcc.target/aarch64/sve/acle/asm/mul_s16.c: Likewise.
> 	* gcc.target/aarch64/sve/acle/asm/mul_s32.c: Likewise.
> 	* gcc.target/aarch64/sve/acle/asm/mul_s64.c: Likewise.
> 	* gcc.target/aarch64/sve/acle/asm/mul_u8.c: Likewise.
> 	* gcc.target/aarch64/sve/acle/asm/mul_u16.c: Likewise.
> 	* gcc.target/aarch64/sve/acle/asm/mul_u32.c: Likewise.
> 	* gcc.target/aarch64/sve/acle/asm/mul_u64.c: Likewise.
> 	* gcc.target/aarch64/sve/mul_const_run.c: Likewise.
> <0001-SVE-intrinsics-Fold-svmul-with-constant-power-of-2-o.patch>
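
The _m restriction described in the quoted message can be illustrated with a
scalar model in plain C. This is only a sketch, not code from the patch: the
names model_mul_m, model_lsl_m, fold_matches and the fixed LANES count are
hypothetical, and the model assumes the ACLE rule that with _m predication
inactive lanes take the value of the first operand.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define LANES 4 /* hypothetical fixed lane count; real SVE is length-agnostic */

/* Model of svmul with _m predication: active lanes hold op1 * op2,
   inactive lanes keep op1 (assumed ACLE merging behaviour).  */
static void model_mul_m(const bool *pg, const int32_t *op1,
                        const int32_t *op2, int32_t *out)
{
    for (size_t i = 0; i < LANES; i++)
        out[i] = pg[i] ? (int32_t)((uint32_t)op1[i] * (uint32_t)op2[i])
                       : op1[i];
}

/* Model of svlsl by immediate with _m predication: active lanes hold
   op1 << shift, inactive lanes keep op1.  */
static void model_lsl_m(const bool *pg, const int32_t *op1,
                        unsigned shift, int32_t *out)
{
    assert(shift < 32);
    for (size_t i = 0; i < LANES; i++)
        out[i] = pg[i] ? (int32_t)((uint32_t)op1[i] << shift) : op1[i];
}

/* Compare a multiply by the constant 4 against the candidate fold
   "x << 2".  const_first selects whether the constant vector is the
   first or the second operand of the multiply.  */
static bool fold_matches(const bool *pg, const int32_t *x, bool const_first)
{
    int32_t four[LANES], mul[LANES], lsl[LANES];

    for (size_t i = 0; i < LANES; i++)
        four[i] = 4;

    if (const_first)
        model_mul_m(pg, four, x, mul); /* inactive lanes become 4 */
    else
        model_mul_m(pg, x, four, mul); /* inactive lanes stay x */

    model_lsl_m(pg, x, 2, lsl);        /* inactive lanes stay x */
    return memcmp(mul, lsl, sizeof mul) == 0;
}
```

In this model, fold_matches returns true for an all-true predicate regardless
of operand order, and for a partial predicate only when the constant is the
second operand. With the constant first and a partial predicate, the inactive
lanes of the multiply hold 4 while those of the shift hold x, which is the
case the patch leaves unfolded.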