Jennifer Schmitz <jschm...@nvidia.com> writes:
> From 05e010a4ad5ef8df082b3e03b253aad85e2a270c Mon Sep 17 00:00:00 2001
> From: Jennifer Schmitz <jschm...@nvidia.com>
> Date: Tue, 17 Sep 2024 00:15:38 -0700
> Subject: [PATCH] SVE intrinsics: Fold svmul with all-zero operands to zero
>  vector
>
> As recently implemented for svdiv, this patch folds svmul to a zero
> vector if one of the operands is a zero vector. This transformation is
> applied if at least one of the following conditions is met:
> - the first operand is all zeros or
> - the second operand is all zeros, and the predicate is ptrue or the
> predication is _x or _z.
>
> In contrast to constant folding, which was implemented in a previous
> patch, this transformation is applied as soon as one of the operands is
> a zero vector, while the other operand can be a variable.
>
> The patch was bootstrapped and regtested on aarch64-linux-gnu, no regression.
> OK for mainline?
>
> Signed-off-by: Jennifer Schmitz <jschm...@nvidia.com>

OK, thanks.

If you're planning any more work in this area, I think the next logical
step would be to extend the current folds to all predication types,
before going on to support other mul/div cases or other operations.

In principle, the mul and div cases correspond to:

  if (integer_zerop (op1) || integer_zerop (op2))
    return f.fold_active_lanes_to (build_zero_cst (TREE_TYPE (f.lhs)));

It would then be up to fold_active_lanes_to(X) to work out how to apply
predication to X.  The general case would be:

  - For x predication and unpredicated operations, fold to X.

  - For m and z, calculate a vector that supplies the values of inactive
    lanes (the first vector argument for m and a zero vector for z).

    - If X is equal to the inactive lanes vector, fold directly to X.

    - Otherwise fold to VEC_COND_EXPR <pg, X, inactive>
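
To make the cases above concrete, here is a rough standalone model of the
selection logic. It is only a sketch and not GCC code: Vec, Pred, Predication
and vec_cond are hypothetical stand-ins for the tree-level vector, the
governing predicate, the predication suffix and VEC_COND_EXPR, and the
fold_active_lanes_to shown here is a free function with a made-up signature.
It also compares lane values at run time, whereas a real fold would compare
trees at compile time.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-ins for illustration; none of these are GCC types.
constexpr std::size_t kLanes = 4;
using Vec = std::array<std::int32_t, kLanes>;
using Pred = std::array<bool, kLanes>;

enum class Predication { None, X, M, Z };

// Models VEC_COND_EXPR <pg, x, inactive>: active lanes take x,
// inactive lanes take the fallback vector.
Vec vec_cond(const Pred &pg, const Vec &x, const Vec &inactive) {
  Vec result{};
  for (std::size_t i = 0; i < kLanes; ++i)
    result[i] = pg[i] ? x[i] : inactive[i];
  return result;
}

// Assumed signature: first_arg is the first vector argument of the call,
// x is the value that the active lanes fold to.
Vec fold_active_lanes_to(Predication pred, const Pred &pg,
                         const Vec &first_arg, const Vec &x) {
  // x predication and unpredicated operations: fold to x.
  if (pred == Predication::None || pred == Predication::X)
    return x;

  // m takes inactive lanes from the first vector argument, z from zero.
  Vec inactive{};
  if (pred == Predication::M)
    inactive = first_arg;

  // If x already equals the inactive-lanes vector, no select is needed.
  if (x == inactive)
    return x;

  return vec_cond(pg, x, inactive);
}
```

With x set to a zero vector, the z case always takes the direct path, while
the m case only avoids the select when the first argument is itself zero.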

Richard
