Jennifer Schmitz <jschm...@nvidia.com> writes:
>> On 18 Sep 2024, at 20:33, Richard Sandiford <richard.sandif...@arm.com>
>> wrote:
>>
>> Jennifer Schmitz <jschm...@nvidia.com> writes:
>>> From 05e010a4ad5ef8df082b3e03b253aad85e2a270c Mon Sep 17 00:00:00 2001
>>> From: Jennifer Schmitz <jschm...@nvidia.com>
>>> Date: Tue, 17 Sep 2024 00:15:38 -0700
>>> Subject: [PATCH] SVE intrinsics: Fold svmul with all-zero operands to zero
>>> vector
>>>
>>> As recently implemented for svdiv, this patch folds svmul to a zero
>>> vector if one of the operands is a zero vector. This transformation is
>>> applied if at least one of the following conditions is met:
>>> - the first operand is all zeros or
>>> - the second operand is all zeros, and the predicate is ptrue or the
>>>   predication is _x or _z.
>>>
>>> In contrast to constant folding, which was implemented in a previous
>>> patch, this transformation is applied as soon as one of the operands is
>>> a zero vector, while the other operand can be a variable.
>>>
>>> The patch was bootstrapped and regtested on aarch64-linux-gnu, no
>>> regression.
>>> OK for mainline?
>>>
>>> Signed-off-by: Jennifer Schmitz <jschm...@nvidia.com>
>>
>> OK, thanks.
>>
>> If you're planning any more work in this area, I think the next logical
>> step would be to extend the current folds to all predication types,
>> before going on to support other mul/div cases or other operations.
>>
>> In principle, the mul and div cases correspond to:
>>
>>   if (integer_zerop (op1) || integer_zerop (op2))
>>     return f.fold_active_lanes_to (build_zero_cst (TREE_TYPE (f.lhs)));
>>
>> It would then be up to fold_active_lanes_to(X) to work out how to apply
>> predication to X. The general case would be:
>>
>> - For x predication and unpredicated operations, fold to X.
>>
>> - For m and z, calculate a vector that supplies the values of inactive
>>   lanes (the first vector argument for m and a zero vector for z).
>>
>> - If X is equal to the inactive lanes vector, fold directly to X.
>>
>> - Otherwise fold to VEC_COND_EXPR <pg, X, inactive>
> Dear Richard,
> I pushed it to trunk with 08aba2dd8c9390b6131cca0aac069f97eeddc9d2.
> Thank you also for the good suggestion, I will do that. Over the last few
> days, I have been working on a patch that folds multiplication by powers
> of 2 to left-shifts (svlsl), similar to what was done for division. As I
> see it, that is independent of what you proposed, because it changes the
> function type. Can I submit it for review before starting on the patch you
> suggested?

Sure! I agree the power-of-two fold is independent. I was just worried
about building up technical debt if we added more fold-to-constant cases.

Thanks,
Richard
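
For readers following the thread, the predication rules for fold_active_lanes_to(X) quoted above can be sketched as a small standalone model. This is only an illustration under stated assumptions, not GCC code: the names Pred, Vec, Mask and Call are hypothetical, vectors are plain std::vector<int> values rather than trees, and the "X is equal to the inactive lanes" test is a value comparison here, whereas a compiler would have to decide it symbolically.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model only: these types are hypothetical and are not GCC's.
enum class Pred { None, X, M, Z };   // unpredicated, _x, _m, _z

using Vec = std::vector<int>;        // one int per lane
using Mask = std::vector<bool>;      // governing predicate pg, one bit per lane

struct Call {
  Pred pred;  // predication type of the intrinsic call
  Mask pg;    // governing predicate (unused for Pred::None and Pred::X)
  Vec op1;    // first vector argument; supplies inactive lanes for _m
};

// Return the value the call folds to when all active lanes are known to be x.
Vec fold_active_lanes_to(const Call &call, const Vec &x) {
  // For x predication and unpredicated operations, fold to X.
  if (call.pred == Pred::X || call.pred == Pred::None)
    return x;

  // For m and z, compute the vector that supplies the inactive lanes:
  // the first vector argument for m, a zero vector for z.
  Vec inactive = (call.pred == Pred::M) ? call.op1 : Vec(x.size(), 0);

  // If X equals the inactive-lanes vector, fold directly to X.
  if (inactive == x)
    return x;

  // Otherwise model VEC_COND_EXPR <pg, X, inactive> lane by lane.
  assert(call.pg.size() == x.size() && inactive.size() == x.size());
  Vec result(x.size());
  for (std::size_t i = 0; i < x.size(); ++i)
    result[i] = call.pg[i] ? x[i] : inactive[i];
  return result;
}
```

In this model, folding a _z or _x multiply to zero returns the zero vector unchanged, while an _m multiply with a partial predicate keeps the lanes of the first argument wherever pg is false.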