Re: [PATCH][PR121599] aarch64: Fix ICE when op2 is zero for SVE2 saturating add intrinsics.

Jennifer Schmitz Thu, 18 Sep 2025 19:29:59 -0700
> On 28 Aug 2025, at 09:19, Jennifer Schmitz <[email protected]> wrote:
> 
> External email: Use caution opening links or attachments
> 
> 
>> On 25 Aug 2025, at 15:40, Kyrylo Tkachov <[email protected]> wrote:
>> 
>> Hi Jennifer,
>> 
>>> On 25 Aug 2025, at 12:56, Jennifer Schmitz <[email protected]> wrote:
>>> 
>>> When op2 in SVE2 saturating add intrinsics (svuqadd, svsqadd) is a zero
>>> vector and predication is _z, an ICE in vregs occurs, e.g. for
>>> 
>>> svuint8_t foo (svbool_t pg, svuint8_t op1)
>>> {
>>>  return svsqadd_u8_z (pg, op1, svdup_s8 (0));
>>> }
>>> 
>>> The insn failed to match the pattern (aarch64-sve2.md):
>>> 
>>> ;; Predicated binary operations with no reverse form, merging with zero.
>>> ;; At present we don't generate these patterns via a cond_* optab,
>>> ;; so there's no correctness requirement to handle merging with an
>>> ;; independent value.
>>> (define_insn_and_rewrite "*cond_<sve_int_op><mode>_z"
>>> [(set (match_operand:SVE_FULL_I 0 "register_operand")
>>> (unspec:SVE_FULL_I
>>> [(match_operand:<VPRED> 1 "register_operand")
>>> (unspec:SVE_FULL_I
>>>   [(match_operand 5)
>>>    (unspec:SVE_FULL_I
>>> [(match_operand:SVE_FULL_I 2 "register_operand")
>>> (match_operand:SVE_FULL_I 3 "register_operand")]
>>> SVE2_COND_INT_BINARY_NOREV)]
>>>   UNSPEC_PRED_X)
>>> (match_operand:SVE_FULL_I 4 "aarch64_simd_imm_zero")]
>>> UNSPEC_SEL))]
>>> "TARGET_SVE2"
>>> {@ [ cons: =0 , 1   , 2 , 3  ]
>>>   [ &w       , Upl , 0 , w  ] movprfx\t%0.<Vetype>, %1/z, 
>>> %0.<Vetype>\;<sve_int_op>\t%0.<Vetype>, %1/m, %0.<Vetype>, %3.<Vetype>
>>>   [ &w       , Upl , w , w  ] movprfx\t%0.<Vetype>, %1/z, 
>>> %2.<Vetype>\;<sve_int_op>\t%0.<Vetype>, %1/m, %0.<Vetype>, %3.<Vetype>
>>> }
>>> "&& !CONSTANT_P (operands[5])"
>>> {
>>>  operands[5] = CONSTM1_RTX (<VPRED>mode);
>>> }
>>> [(set_attr "movprfx" "yes")]
>>> )
>>> 
>>> because operands[3] and operands[4] were both expanded into the same 
>>> register
>>> operand containing a zero vector by
>>> 
>>> ;; Predicated binary arithmetic with merging.
>>> (define_expand "@cond_<sve_int_op><mode>"
>>> [(set (match_operand:SVE_FULL_I 0 "register_operand")
>>> (unspec:SVE_FULL_I
>>> [(match_operand:<VPRED> 1 "register_operand")
>>> (unspec:SVE_FULL_I
>>>   [(match_dup 5)
>>>    (unspec:SVE_FULL_I
>>> [(match_operand:SVE_FULL_I 2 "register_operand")
>>> (match_operand:SVE_FULL_I 3 "register_operand")]
>>> SVE2_COND_INT_BINARY)]
>>>   UNSPEC_PRED_X)
>>> (match_operand:SVE_FULL_I 4 "aarch64_simd_reg_or_zero")]
>>> UNSPEC_SEL))]
>>> "TARGET_SVE2"
>>> {
>>>  operands[5] = CONSTM1_RTX (<MODE>mode);
>>> }
>>> )
>>> 
>>> This patch fixes the ICE by changing the predicate of operands[3]
>>> in both patterns to aarch64_simd_reg_or_zero. It also adds tests
>>> adjusted from the report in PR121599.
>>> 
>>> The patch was bootstrapped and tested on aarch64-linux-gnu, no regression.
>>> OK for trunk?
>>> 
>>> Alex Coplan pointed out in the bugzilla ticket that this ICE goes back
>>> to GCC 10. Shall we backport?
>>> 
>>> Signed-off-by: Jennifer Schmitz <[email protected]>
>>> 
>>> gcc/
>>> PR target/121599
>>> * config/aarch64/aarch64-sve2.md (@cond_<sve_int_op><mode>):
>>> Set predicate for operand 3 to aarch64_simd_reg_or_zero.
>>> (*cond_<sve_int_op><mode>_z): Likewise.
>>> 
>>> gcc/testsuite/
>>> PR target/121599
>>> * gcc.target/aarch64/sve2/pr121599.c: New test.
>>> ---
>>> gcc/config/aarch64/aarch64-sve2.md            |  4 +--
>>> .../gcc.target/aarch64/sve2/pr121599.c        | 31 +++++++++++++++++++
>>> 2 files changed, 33 insertions(+), 2 deletions(-)
>>> create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/pr121599.c
>>> 
>>> diff --git a/gcc/config/aarch64/aarch64-sve2.md 
>>> b/gcc/config/aarch64/aarch64-sve2.md
>>> index a3cbbce8b31..c6a84fa7f3d 100644
>>> --- a/gcc/config/aarch64/aarch64-sve2.md
>>> +++ b/gcc/config/aarch64/aarch64-sve2.md
>>> @@ -1021,7 +1021,7 @@
>>>   [(match_dup 5)
>>>    (unspec:SVE_FULL_I
>>> [(match_operand:SVE_FULL_I 2 "register_operand")
>>> - (match_operand:SVE_FULL_I 3 "register_operand")]
>>> + (match_operand:SVE_FULL_I 3 "aarch64_simd_reg_or_zero")]
>>> SVE2_COND_INT_BINARY)]
>>>   UNSPEC_PRED_X)
>>> (match_operand:SVE_FULL_I 4 "aarch64_simd_reg_or_zero")]
>>> @@ -1136,7 +1136,7 @@
>>>   [(match_operand 5)
>>>    (unspec:SVE_FULL_I
>>> [(match_operand:SVE_FULL_I 2 "register_operand")
>>> - (match_operand:SVE_FULL_I 3 "register_operand")]
>>> + (match_operand:SVE_FULL_I 3 "aarch64_simd_reg_or_zero")]
>>> SVE2_COND_INT_BINARY_NOREV)]
>>>   UNSPEC_PRED_X)
>>> (match_operand:SVE_FULL_I 4 "aarch64_simd_imm_zero")]
>> 
>> I think the ICE happens because operands[4] ends up being a register which 
>> the aarch64_simd_imm_zero can’t handle.
>> The pattern *cond_<sve_int_op><mode>_z is also not supposed to be handling 
>> register operands, it only expresses a merge with zero.
>> Allowing a zero in operands[3] here is problematic as the output code 
>> wouldn’t know how to print it if the midend used a zero vector for it.
>> 
>> I think the right solution is to adjust the @cond_<sve_int_op><mode> 
>> expander to only allow aarch64_simd_imm_zero for operands[4] for the 
>> SVE2_COND_INT_BINARY_NOREV subset of SVE2_COND_INT_BINARY.
> Hi Kyrill,
> Thanks for reviewing the patch.
> I tried allowing only aarch64_simd_imm_zero for operands[4] for 
> SVE2_COND_INT_BINARY_NOREV, but while that solved the ICE it caused all tests 
> predication other than _z to fail.
> I chose instead to add a pattern similar to the existing 
> *cond_<sve_int_op><mode>_3 for SVE2_COND_INT_BINARY_NOREV that handles the 
> case when operands[3] and operands[4] are the same register operand.
> Thanks,
> Jennifer
Gentle ping.
Thanks,
Jennifer
> 
> When op2 in SVE2 saturating add intrinsics (svuqadd, svsqadd) is a zero
> vector and predication is _z, an ICE in vregs occurs, e.g. for
> 
> svuint8_t foo (svbool_t pg, svuint8_t op1)
> {
>    return svsqadd_u8_z (pg, op1, svdup_s8 (0));
> }
> 
> The insn failed to match the pattern (aarch64-sve2.md):
> 
> ;; Predicated binary operations with no reverse form, merging with zero.
> ;; At present we don't generate these patterns via a cond_* optab,
> ;; so there's no correctness requirement to handle merging with an
> ;; independent value.
> (define_insn_and_rewrite "*cond_<sve_int_op><mode>_z"
>  [(set (match_operand:SVE_FULL_I 0 "register_operand")
>        (unspec:SVE_FULL_I
>          [(match_operand:<VPRED> 1 "register_operand")
>           (unspec:SVE_FULL_I
>             [(match_operand 5)
>              (unspec:SVE_FULL_I
>                [(match_operand:SVE_FULL_I 2 "register_operand")
>                 (match_operand:SVE_FULL_I 3 "register_operand")]
>                SVE2_COND_INT_BINARY_NOREV)]
>             UNSPEC_PRED_X)
>           (match_operand:SVE_FULL_I 4 "aarch64_simd_imm_zero")]
>          UNSPEC_SEL))]
>  "TARGET_SVE2"
>  {@ [ cons: =0 , 1   , 2 , 3  ]
>     [ &w       , Upl , 0 , w  ] movprfx\t%0.<Vetype>, %1/z, 
> %0.<Vetype>\;<sve_int_op>\t%0.<Vetype>, %1/m, %0.<Vetype>, %3.<Vetype>
>     [ &w       , Upl , w , w  ] movprfx\t%0.<Vetype>, %1/z, 
> %2.<Vetype>\;<sve_int_op>\t%0.<Vetype>, %1/m, %0.<Vetype>, %3.<Vetype>
>  }
>  "&& !CONSTANT_P (operands[5])"
>  {
>    operands[5] = CONSTM1_RTX (<VPRED>mode);
>  }
>  [(set_attr "movprfx" "yes")]
> )
> 
> because operands[3] and operands[4] were both expanded into the same register
> operand containing a zero vector by define_expand "@cond_<sve_int_op><mode>".
> 
> This patch fixes the ICE by adding the pattern
>  ;; Predicated binary operations with no reverse form, merging with
>  ;; the second input.
>  (define_insn_and_rewrite "*cond_<sve_int_op><mode>_3"
> for SVE2_COND_INT_BINARY_NOREV that allows operands[4] to be a register 
> operand
> (equal to operands[3]). It adds a vcond_mask expression such that we end
> up with an insn matched by
>  ;; Predicated binary arithmetic, merging with the first input.
>  (define_insn_and_rewrite "*cond_<sve_int_op><mode>_2"
> 
> The patch was bootstrapped and tested on aarch64-linux-gnu, no regression.
> OK for trunk?
> 
> Alex Coplan pointed out in the bugzilla ticket that this ICE goes back
> to GCC 10. Shall we backport?
> 
> Signed-off-by: Jennifer Schmitz <[email protected]>
> 
> gcc/
>        PR target/121599
>        * config/aarch64/aarch64-sve2.md (*cond_<sve_int_op><mode>_3):
>        New pattern for SVE2_COND_INT_BINARY_NOREV.
> 
> gcc/testsuite/
>        PR target/121599
>        * gcc.target/aarch64/sve2/pr121599.c: New test.
> ---
> gcc/config/aarch64/aarch64-sve2.md            | 36 +++++++++++++++++++
> .../gcc.target/aarch64/sve2/pr121599.c        | 31 ++++++++++++++++
> 2 files changed, 67 insertions(+)
> create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/pr121599.c
> 
> diff --git a/gcc/config/aarch64/aarch64-sve2.md 
> b/gcc/config/aarch64/aarch64-sve2.md
> index a3cbbce8b31..d82dc41df3e 100644
> --- a/gcc/config/aarch64/aarch64-sve2.md
> +++ b/gcc/config/aarch64/aarch64-sve2.md
> @@ -1124,6 +1124,42 @@
>   [(set_attr "movprfx" "yes")]
> )
> 
> +;; Predicated binary operations with no reverse form, merging with
> +;; the second input.
> +(define_insn_and_rewrite "*cond_<sve_int_op><mode>_3"
> +  [(set (match_operand:SVE_FULL_I 0 "register_operand")
> +       (unspec:SVE_FULL_I
> +         [(match_operand:<VPRED> 1 "register_operand")
> +          (unspec:SVE_FULL_I
> +            [(match_operand 5)
> +             (unspec:SVE_FULL_I
> +               [(match_operand:SVE_FULL_I 2 "register_operand")
> +                (match_operand:SVE_FULL_I 3 "register_operand")]
> +               SVE2_COND_INT_BINARY_NOREV)]
> +            UNSPEC_PRED_X)
> +          (match_operand:SVE_FULL_I 4 "register_operand")]
> +         UNSPEC_SEL))]
> +  "TARGET_SVE2
> +   && rtx_equal_p (operands[3], operands[4])"
> +  "#"
> +  "&& 1"
> +  {
> +    if (reload_completed
> +        && register_operand (operands[4], <MODE>mode)
> +        && !rtx_equal_p (operands[0], operands[4]))
> +      {
> +       emit_insn (gen_vcond_mask_<mode><vpred> (operands[0], operands[2],
> +                                                operands[4], operands[1]));
> +       operands[4] = operands[2] = operands[0];
> +      }
> +    else if (!CONSTANT_P (operands[5]))
> +      operands[5] = CONSTM1_RTX (<VPRED>mode);
> +    else
> +      FAIL;
> +  }
> +  [(set_attr "movprfx" "yes")]
> +)
> +
> ;; Predicated binary operations with no reverse form, merging with zero.
> ;; At present we don't generate these patterns via a cond_* optab,
> ;; so there's no correctness requirement to handle merging with an
> diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/pr121599.c 
> b/gcc/testsuite/gcc.target/aarch64/sve2/pr121599.c
> new file mode 100644
> index 00000000000..dbfa613649d
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/aarch64/sve2/pr121599.c
> @@ -0,0 +1,31 @@
> +/* PR target/121599.  */
> +/* { dg-do compile } */
> +/* { dg-options "-O2" } */
> +/* { dg-final { check-function-bodies "**" "" "" } } */
> +
> +#include <arm_sve.h>
> +
> +/*
> +** foo:
> +**     movi    d([0-9]+), #0
> +**     sel     z0\.b, p0, z0\.b, z\1\.b
> +**     usqadd  z0\.b, p0/m, z0\.b, z\1\.b
> +**     ret
> +*/
> +svuint8_t foo (svbool_t pg, svuint8_t op1)
> +{
> +    return svsqadd_u8_z (pg, op1, svdup_s8 (0));
> +}
> +
> +/*
> +** bar:
> +**     movi    d([0-9]+), #0
> +**     sel     z0\.b, p0, z0\.b, z\1\.b
> +**     suqadd  z0\.b, p0/m, z0\.b, z\1\.b
> +**     ret
> +*/
> +svint8_t bar (svbool_t pg, svint8_t op1)
> +{
> +    return svuqadd_n_s8_z (pg, op1, 0);
> +}
> +
> --
> 2.34.1
>> 
>> Thanks,
>> Kyrill
>> 
>>> diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/pr121599.c 
>>> b/gcc/testsuite/gcc.target/aarch64/sve2/pr121599.c
>>> new file mode 100644
>>> index 00000000000..cd80ef9115c
>>> --- /dev/null
>>> +++ b/gcc/testsuite/gcc.target/aarch64/sve2/pr121599.c
>>> @@ -0,0 +1,31 @@
>>> +/* PR target/121599.  */
>>> +/* { dg-do compile } */
>>> +/* { dg-options "-O2" } */
>>> +/* { dg-final { check-function-bodies "**" "" "" } } */
>>> +
>>> +#include <arm_sve.h>
>>> +
>>> +/*
>>> +** foo:
>>> +** movi d([0-9]+), #0
>>> +** movprfx z0\.b, p0/z, z0\.b
>>> +** usqadd z0\.b, p0/m, z0\.b, z\1\.b
>>> +** ret
>>> +*/
>>> +svuint8_t foo (svbool_t pg, svuint8_t op1)
>>> +{
>>> +    return svsqadd_u8_z (pg, op1, svdup_s8 (0));
>>> +}
>>> +
>>> +/*
>>> +** bar:
>>> +** movi d([0-9]+), #0
>>> +** movprfx z0\.b, p0/z, z0\.b
>>> +** suqadd z0\.b, p0/m, z0\.b, z\1\.b
>>> +** ret
>>> +*/
>>> +svint8_t bar (svbool_t pg, svint8_t op1)
>>> +{
>>> +    return svuqadd_n_s8_z (pg, op1, 0);
>>> +}
>>> +
>>> --
>>> 2.34.1
Re: [PATCH][PR121599] aarch64: Fix ICE when op2 is zero for SVE2 saturating add intrinsics.

Reply via email to