Jennifer Schmitz <jschm...@nvidia.com> writes:
> From 05e010a4ad5ef8df082b3e03b253aad85e2a270c Mon Sep 17 00:00:00 2001
> From: Jennifer Schmitz <jschm...@nvidia.com>
> Date: Tue, 17 Sep 2024 00:15:38 -0700
> Subject: [PATCH] SVE intrinsics: Fold svmul with all-zero operands to zero
>  vector
>
> As recently implemented for svdiv, this patch folds svmul to a zero
> vector if one of the operands is a zero vector. This transformation is
> applied if at least one of the following conditions is met:
> - the first operand is all zeros or
> - the second operand is all zeros, and the predicate is ptrue or the
> predication is _x or _z.
>
> In contrast to constant folding, which was implemented in a previous
> patch, this transformation is applied as soon as one of the operands is
> a zero vector, while the other operand can be a variable.
>
> The patch was bootstrapped and regtested on aarch64-linux-gnu, no regression.
> OK for mainline?
>
> Signed-off-by: Jennifer Schmitz <jschm...@nvidia.com>

OK, thanks.

If you're planning any more work in this area, I think the next logical
step would be to extend the current folds to all predication types,
before going on to support other mul/div cases or other operations.

In principle, the mul and div cases correspond to:

  if (integer_zerop (op1) || integer_zerop (op2))
    return f.fold_active_lanes_to (build_zero_cst (TREE_TYPE (f.lhs)));

It would then be up to fold_active_lanes_to(X) to work out how to apply
predication to X.  The general case would be:

- For x predication and unpredicated operations, fold to X.

- For m and z, calculate a vector that supplies the values of inactive
  lanes (the first vector argument for m and a zero vector for z).

  - If X is equal to the inactive lanes vector, fold directly to X.

  - Otherwise fold to VEC_COND_EXPR <pg, X, inactive>

Richard
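
For illustration only, the general case above can be modelled lane by
lane in standalone C++.  This is a sketch, not GCC code: the Pred enum,
the Vec and Mask aliases, vec_cond and the extra parameters of
fold_active_lanes_to are all hypothetical stand-ins for the tree-level
machinery, and the real helper would build trees rather than compute
values.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model types; none of these names come from GCC.
enum class Pred { None, X, M, Z };   // unpredicated, _x, _m, _z
using Vec = std::vector<int>;        // one int per lane
using Mask = std::vector<bool>;      // governing predicate pg, one bool per lane

// Lane-wise model of VEC_COND_EXPR <pg, a, b>.
Vec vec_cond(const Mask &pg, const Vec &a, const Vec &b) {
  Vec result(a.size());
  for (std::size_t i = 0; i < a.size(); ++i)
    result[i] = pg[i] ? a[i] : b[i];
  return result;
}

// Apply predication to X, the value that the active lanes fold to.
// first_arg models the first vector argument of the intrinsic call.
Vec fold_active_lanes_to(Pred pred, const Mask &pg, const Vec &x,
                         const Vec &first_arg) {
  // x predication and unpredicated operations: fold to X.
  if (pred == Pred::X || pred == Pred::None)
    return x;

  // m and z: work out the vector that supplies the inactive lanes.
  Vec inactive = (pred == Pred::M) ? first_arg : Vec(x.size(), 0);

  // If X already equals the inactive-lanes vector, fold directly to X.
  if (x == inactive)
    return x;

  // Otherwise select per lane: X where pg is true, inactive elsewhere.
  return vec_cond(pg, x, inactive);
}
```

With X a zero vector, the model returns zero for x and z predication,
and for m predication it keeps the first argument in the inactive lanes,
which matches the conditions listed in the patch description.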