> On 8 Oct 2025, at 08:49, Tamar Christina <[email protected]> wrote:
>
> External email: Use caution opening links or attachments
>
>
> Hi Jennifer,
>
>> -----Original Message-----
>> From: Jennifer Schmitz <[email protected]>
>> Sent: 24 September 2025 08:31
>> To: Alex Coplan <[email protected]>
>> Cc: GCC Patches <[email protected]>; Richard Earnshaw
>> <[email protected]>; Kyrylo Tkachov <[email protected]>;
>> Tamar Christina <[email protected]>; Andrew Pinski
>> <[email protected]>
>> Subject: Re: [PATCH 1/2] [PR121604] aarch64: Fix folding of svbrka/b +
>> svpmov_lane with pfalse arguments
>>
>>
>>
>>> On 15 Sep 2025, at 09:30, Jennifer Schmitz <[email protected]> wrote:
>>>
>>> External email: Use caution opening links or attachments
>>>
>>>
>>>> On 27 Aug 2025, at 09:14, Jennifer Schmitz <[email protected]> wrote:
>>>>
>>>> External email: Use caution opening links or attachments
>>>>
>>>>
>>>>> On 26 Aug 2025, at 16:02, Alex Coplan <[email protected]> wrote:
>>>>>
>>>>> External email: Use caution opening links or attachments
>>>>>
>>>>>
>>>>> On 26/08/2025 13:39, Jennifer Schmitz wrote:
>>>>>> The function gimple_folder::fold_pfalse that was introduced in
>>>>>> g5289540ed58e42ae66255e31f22afe4ca0a6e15e
>>>>>> inappropriately folds
>>>>>> 1) svbrka/b with _m if the inactive operand is pfalse
>>>>>> 2) svpmov_lane with _m if the non-governing predicate operand is
>> pfalse.
>>>>>> To fix this, we added extra checks in the folding logic excluding certain
>>>>>> function shapes.
>>>>>>
>>>>>> The patch was bootstrapped and tested on aarch64-linux-gnu, no
>> regression.
>>>>>> OK for trunk?
>>>>>>
>>>>>> Signed-off-by: Jennifer Schmitz <[email protected]>
>>>>>>
>>>>>> gcc/
>>>>>>   PR target/121604
>>>>>>   * config/aarch64/aarch64-sve-builtins.cc
>>>>>>   (gimple_folder::fold_pfalse): Add checks for function shape.
>>>>>>
>>>>>> gcc/testsuite/
>>>>>>   PR target/121604
>>>>>>   * gcc.target/aarch64/sve/pr121604_brk.c: New test.
>>>>>>   * gcc.target/aarch64/sve2/pr121604_pmov.c: Likewise.
>>>>>> ---
>>>>>> gcc/config/aarch64/aarch64-sve-builtins.cc    |  6 +++--
>>>>>> .../gcc.target/aarch64/sve/pr121604_brk.c     | 25
>> +++++++++++++++++++
>>>>>> .../gcc.target/aarch64/sve2/pr121604_pmov.c   | 16 ++++++++++++
>>>>>> 3 files changed, 45 insertions(+), 2 deletions(-)
>>>>>> create mode 100644
>> gcc/testsuite/gcc.target/aarch64/sve/pr121604_brk.c
>>>>>> create mode 100644
>> gcc/testsuite/gcc.target/aarch64/sve2/pr121604_pmov.c
>>>>>>
>>>>>> diff --git a/gcc/config/aarch64/aarch64-sve-builtins.cc
>> b/gcc/config/aarch64/aarch64-sve-builtins.cc
>>>>>> index 1764cf8f7e8..41b328d96fc 100644
>>>>>> --- a/gcc/config/aarch64/aarch64-sve-builtins.cc
>>>>>> +++ b/gcc/config/aarch64/aarch64-sve-builtins.cc
>>>>>> @@ -3641,9 +3641,11 @@ gimple_folder::fold_pfalse ()
>>>>>>    inactive vector (arg0), while other function shapes are folded
>>>>>>    to op1 (arg1).  */
>>>>>>    tree arg1 = gimple_call_arg (call, 1);
>>>>>> -      if (is_pfalse (arg1))
>>>>>> +      if (is_pfalse (arg1)
>>>>>> +       && shape != aarch64_sve::shapes::pmov_to_vector_lane)
>>>>>>   return fold_call_to (arg0);
>>>>>> -      if (is_pfalse (arg0))
>>>>>> +      if (is_pfalse (arg0)
>>>>>> +       && shape != aarch64_sve::shapes::unary)
>>>>>>   return fold_call_to (arg1);
>>>>>>    return nullptr;
>>>>>>  }
>>>>>
>>>>> This fix feels very fragile, like we could easily have the same bug 
>>>>> introduced
>>>>> again for other intrinsics that have non-governing predicate operands;
>> have you
>>>>> verified there are no other such SVE intrinsics?  Either way, it's
>>>>> possible that more such intrinsics will get introduced in the future,
>>>>> and we'll have the exact same bug for those.
>>>>>
>>>>> Relying on is_pfalse () to infer the position of the GP is just too
>>>>> fragile IMO.
>>>>>
>>>>> As I mentioned in the PR, I think it would be better if we explicitly
>>>>> determined:
>>>>> (1) the position of the governing predicate, if any, and
>>>>> (2) the position of the inactive argument (to fold to on pfalse), if any,
>>>>> and used that information to do the folding.
>>>>>
>>>>> Either we can do that by dispatching on the shapes (i.e. positively
>>>>> classify the shapes into groups for the two cases that we know we can
>>>>> safely optimize), or perhaps that information can be encoded in some
>>>>> other way.
>>>> Hi Alex,
>>>> I agree that it would be more robust if we positively checked for the
>> intrinsics to which the fold can be safely applied.
>>>> We could probably do this using the function shape. But I think that the 
>>>> list
>> of shapes where the fold can be applied is much longer than the cases that
>> need to be excluded. In fact, svbrka/b and svpmov_lane with _m predication
>> seem to be exceptional cases (which does not say that there will not be more
>> of those in future).
>>>> I looked around for ways to access the position of the governing predicate.
>> It seems to me that there is no direct access to the position but rather it 
>> is
>> inferred using functions like function_shape::has_merge_argument_p or like in
>> function_resolver::check_gp_argument. But I think for svpmov_lane_m it
>> would still assume that the predicate is a governing predicate, wouldn’t it?
>>>> What do you think is the best way forward here?
>
> Yeah svpmov_lane_m is an annoying one as it's a completely synthetic 
> intrinsic.
>
> As you say these two are an exception to the rule rather than the norm.  So 
> how about
> this solution.
>
> As you say, the code assumes that governing predicates are at arg0 for unary 
> functions,
> and in other cases at op1.
>
> I think we should make this overridable,
>
> That is in function_base add two virtual functions
>
>  virtual int get_gp_arg_index () const
>  {
>     return 0;
>  }
>
>  virtual int get_fold_to_arg_index () const
>  {
>     return 1;
>  }
>
> In gimple_folder::fold_pfalse () we can then use these calls,
> If int argi0 = base->get_gp_arg_index (); returns -1 for argi0 we can
> then stop folding.
>
> We then set in pmov_to_vector_lane the get_gp_arg_index to return -1,
>
>  int get_gp_arg_index () const override
>  {
>     return -1;
>  }
>
> and for svbrk_unary_impl we adjust the arguments appropriately.
>
> That should be easily backportable too.
Hi Tamar,
Thanks for the review. I adjusted the patch based on your solution, but changed 
the two new functions to be specific to the case of m predication. The reason 
is that if get_gp_arg_index returns the “normal” position of the governing 
predicate (i.e. for x or z predication), we still have to adjust the position 
in case of m predication for a few unary shapes. And that brings us back to 
needing case distinctions for specific shapes in gimple_folder::fold_pfalse, 
which Alex wanted to avoid (if I understood correctly).
What do you think?
I’m attaching both patches of the series to this thread.
Thanks,
Jennifer

>
> Thanks,
> Tamar
>
>>>> Thanks for any advice.
>>>> Jennifer
>>> Gentle ping.
>>> Thanks,
>>> Jennifer
>> Gentle ping.
>> Thanks,
>> Jennifer
>>>>>
>>>>> Thanks,
>>>>> Alex
>>>>>
>>>>>> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pr121604_brk.c
>> b/gcc/testsuite/gcc.target/aarch64/sve/pr121604_brk.c
>>>>>> new file mode 100644
>>>>>> index 00000000000..a474a20554d
>>>>>> --- /dev/null
>>>>>> +++ b/gcc/testsuite/gcc.target/aarch64/sve/pr121604_brk.c
>>>>>> @@ -0,0 +1,25 @@
>>>>>> +/* { dg-do compile } */
>>>>>> +/* { dg-options "-O2" } */
>>>>>> +/* { dg-final { check-function-bodies "**" "" "" } } */
>>>>>> +
>>>>>> +#include <arm_sve.h>
>>>>>> +
>>>>>> +/*
>>>>>> +** foo:
>>>>>> +**   ptrue   p0\.b, all
>>>>>> +**   brkb    p0\.b, p0/z, p0\.b
>>>>>> +**   ret
>>>>>> +*/
>>>>>> +svbool_t foo () {
>>>>>> +  return svbrkb_b_m (svpfalse (), svptrue_b8 (), svptrue_b8 ());
>>>>>> +}
>>>>>> +
>
> Since BRKB doesn't include the first active elements, this should have folded
> to pfalse.
>
> I guess if we want that one we need to handle it in svbrk_unary_impl::fold.
This is handled in PATCH 2/2 of the series.
>
> Thanks,
> Tamar
>
>>>>>> +/*
>>>>>> +** bar:
>>>>>> +**   ptrue   p0\.b, all
>>>>>> +**   brka    p0\.b, p0/z, p0\.b
>>>>>> +**   ret
>>>>>> +*/
>>>>>> +svbool_t bar () {
>>>>>> +  return svbrka_b_m (svpfalse (), svptrue_b8 (), svptrue_b8 ());
>>>>>> +}
>>>>>> diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/pr121604_pmov.c
>> b/gcc/testsuite/gcc.target/aarch64/sve2/pr121604_pmov.c
>>>>>> new file mode 100644
>>>>>> index 00000000000..16844ee4add
>>>>>> --- /dev/null
>>>>>> +++ b/gcc/testsuite/gcc.target/aarch64/sve2/pr121604_pmov.c
>>>>>> @@ -0,0 +1,16 @@
>>>>>> +/* { dg-do compile } */
>>>>>> +/* { dg-options "-O2 -march=armv8.2-a+sve2p1" } */
>>>>>> +/* { dg-final { check-function-bodies "**" "" "" } } */
>>>>>> +
>>>>>> +#include <arm_sve.h>
>>>>>> +
>>>>>> +/*
>>>>>> +** f:
>>>>>> +**   pfalse  p([0-7]+)\.b
>>>>>> +**   mov     z0\.b, #-1
>>>>>> +**   pmov    z0\[1\], p\1\.d
>>>>>> +**   ret
>>>>>> +*/
>>>>>> +svuint64_t f () {
>>>>>> +    return svpmov_lane_u64_m (svdup_u64 (~0UL), svpfalse (), 1);
>>>>>> +}
>>>>>> --
>>>>>> 2.34.1


Attachment: V2_0002-PR121604-aarch64-Fold-svbrka-b-with-ptrue-predicate-.patch
Description: V2_0002-PR121604-aarch64-Fold-svbrka-b-with-ptrue-predicate-.patch

Attachment: V2_0001-PR121604-aarch64-Fix-folding-of-svbrka-b-svpmov_lane.patch
Description: V2_0001-PR121604-aarch64-Fix-folding-of-svbrka-b-svpmov_lane.patch

Reply via email to