Jennifer Schmitz <jschm...@nvidia.com> writes:
>> On 18 Sep 2024, at 20:33, Richard Sandiford <richard.sandif...@arm.com> 
>> wrote:
>> 
>> Jennifer Schmitz <jschm...@nvidia.com> writes:
>>> From 05e010a4ad5ef8df082b3e03b253aad85e2a270c Mon Sep 17 00:00:00 2001
>>> From: Jennifer Schmitz <jschm...@nvidia.com>
>>> Date: Tue, 17 Sep 2024 00:15:38 -0700
>>> Subject: [PATCH] SVE intrinsics: Fold svmul with all-zero operands to zero
>>> vector
>>> 
>>> As recently implemented for svdiv, this patch folds svmul to a zero
>>> vector if one of the operands is a zero vector. This transformation is
>>> applied if at least one of the following conditions is met:
>>> - the first operand is all zeros or
>>> - the second operand is all zeros, and the predicate is ptrue or the
>>> predication is _x or _z.
>>> 
>>> In contrast to constant folding, which was implemented in a previous
>>> patch, this transformation is applied as soon as one of the operands is
>>> a zero vector, while the other operand can be a variable.
>>> 
>>> The patch was bootstrapped and regtested on aarch64-linux-gnu, with no
>>> regressions.
>>> OK for mainline?
>>> 
>>> Signed-off-by: Jennifer Schmitz <jschm...@nvidia.com>
>> 
>> OK, thanks.
>> 
>> If you're planning any more work in this area, I think the next logical
>> step would be to extend the current folds to all predication types,
>> before going on to support other mul/div cases or other operations.
>> 
>> In principle, the mul and div cases correspond to:
>> 
>>  if (integer_zerop (op1) || integer_zerop (op2))
>>    return f.fold_active_lanes_to (build_zero_cst (TREE_TYPE (f.lhs)));
>> 
>> It would then be up to fold_active_lanes_to(X) to work out how to apply
>> predication to X.  The general case would be:
>> 
>>  - For x predication and unpredicated operations, fold to X.
>> 
>>  - For m and z, calculate a vector that supplies the values of inactive
>>    lanes (the first vector argument for m and a zero vector for z).
>> 
>>    - If X is equal to the inactive lanes vector, fold directly to X.
>> 
>>    - Otherwise fold to VEC_COND_EXPR <pg, X, inactive>
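
The cases listed above can be sketched as a small standalone C++ model.
This is only an illustration of the decision logic, not GCC code: the
Lanes, Pred, Predication and Folded types are hypothetical stand-ins for
trees and gimple, and the per-lane select stands in for
VEC_COND_EXPR <pg, X, inactive>.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model only: a "vector" is a list of lane values and a predicate is a
// list of booleans.  None of these types exist in GCC.
using Lanes = std::vector<int>;
using Pred = std::vector<bool>;

enum class Predication { none, x, m, z };

// Result of the fold.  is_select is true when a per-lane select (the
// analogue of VEC_COND_EXPR <pg, X, inactive>) was needed.
struct Folded
{
  bool is_select;
  Lanes value;
};

// Fold the active lanes to X and work out what the inactive lanes hold.
Folded
fold_active_lanes_to (Predication pred, const Pred &pg,
                      const Lanes &first_arg, const Lanes &x)
{
  assert (pg.size () == x.size () && first_arg.size () == x.size ());

  // x predication and unpredicated operations: fold to X.
  if (pred == Predication::none || pred == Predication::x)
    return { false, x };

  // m takes inactive lanes from the first vector argument, z from zero.
  Lanes inactive = (pred == Predication::m
                    ? first_arg
                    : Lanes (x.size (), 0));

  // If X already equals the inactive-lanes vector, fold directly to X.
  if (x == inactive)
    return { false, x };

  // Otherwise select X for active lanes and "inactive" elsewhere.
  Lanes result (x.size ());
  for (std::size_t i = 0; i < x.size (); ++i)
    result[i] = pg[i] ? x[i] : inactive[i];
  return { true, result };
}
```

In this model, folding to a zero vector with z predication never needs a
select, because the inactive lanes are zero as well.  With m predication
a select is needed unless the first argument is itself all zeros.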
> Dear Richard,
> I pushed it to trunk with 08aba2dd8c9390b6131cca0aac069f97eeddc9d2.
> Thank you also for the good suggestion, I will do that. Over the last few 
> days, I have been working on a patch that folds multiplication by powers 
> of 2 to left-shifts (svlsl), similar to what was done for division. As I 
> see it, that is independent of what you proposed, because it changes the 
> function type. Can I submit it for review before starting on the patch you 
> suggested?
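
For reference, the arithmetic behind the power-of-two fold mentioned
above can be shown on scalars.  This is a minimal sketch, not the patch
itself: exact_log2_u64 and mul_via_shift are hypothetical names, and the
code works on plain unsigned 64-bit integers rather than SVE vectors.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Return k if c == 2^k, otherwise nothing.  Hypothetical helper.
std::optional<unsigned>
exact_log2_u64 (std::uint64_t c)
{
  // A power of two has exactly one bit set.
  if (c == 0 || (c & (c - 1)) != 0)
    return std::nullopt;

  unsigned k = 0;
  while ((c >>= 1) != 0)
    ++k;
  return k;
}

// Multiply x by c, using a left shift when c is a power of two.
// For unsigned values, x * 2^k and x << k agree modulo 2^64.
std::uint64_t
mul_via_shift (std::uint64_t x, std::uint64_t c)
{
  if (auto k = exact_log2_u64 (c))
    return x << *k;
  return x * c;
}
```

The fold changes which operation is performed (a multiply becomes a
shift) rather than replacing the result with a constant, which is why it
does not interact with the fold-to-constant cases.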

Sure!  I agree the power-of-two fold is independent.  I was just worried
about building up technical debt if we added more fold-to-constant cases.

Thanks,
Richard
