Re: [PATCH][PR121599] aarch64: Fix ICE when op2 is zero for SVE2 saturating add intrinsics.

Kyrylo Tkachov Sat, 18 Oct 2025 05:46:27 -0700

Hi Jennifer,

> On 28 Aug 2025, at 09:19, Jennifer Schmitz <[email protected]> wrote:
> 
> 
> 
>> On 25 Aug 2025, at 15:40, Kyrylo Tkachov <[email protected]> wrote:
>> 
>> Hi Jennifer,
>> 
>>> On 25 Aug 2025, at 12:56, Jennifer Schmitz <[email protected]> wrote:
>>> 
>>> When op2 in SVE2 saturating add intrinsics (svuqadd, svsqadd) is a zero
>>> vector and predication is _z, an ICE in vregs occurs, e.g. for
>>> 
>>> svuint8_t foo (svbool_t pg, svuint8_t op1)
>>> {
>>>  return svsqadd_u8_z (pg, op1, svdup_s8 (0));
>>> }
>>> 
>>> The insn failed to match the pattern (aarch64-sve2.md):
>>> 
>>> ;; Predicated binary operations with no reverse form, merging with zero.
>>> ;; At present we don't generate these patterns via a cond_* optab,
>>> ;; so there's no correctness requirement to handle merging with an
>>> ;; independent value.
>>> (define_insn_and_rewrite "*cond_<sve_int_op><mode>_z"
>>> [(set (match_operand:SVE_FULL_I 0 "register_operand")
>>> (unspec:SVE_FULL_I
>>> [(match_operand:<VPRED> 1 "register_operand")
>>> (unspec:SVE_FULL_I
>>>   [(match_operand 5)
>>>    (unspec:SVE_FULL_I
>>> [(match_operand:SVE_FULL_I 2 "register_operand")
>>> (match_operand:SVE_FULL_I 3 "register_operand")]
>>> SVE2_COND_INT_BINARY_NOREV)]
>>>   UNSPEC_PRED_X)
>>> (match_operand:SVE_FULL_I 4 "aarch64_simd_imm_zero")]
>>> UNSPEC_SEL))]
>>> "TARGET_SVE2"
>>> {@ [ cons: =0 , 1   , 2 , 3  ]
>>>   [ &w       , Upl , 0 , w  ] movprfx\t%0.<Vetype>, %1/z, 
>>> %0.<Vetype>\;<sve_int_op>\t%0.<Vetype>, %1/m, %0.<Vetype>, %3.<Vetype>
>>>   [ &w       , Upl , w , w  ] movprfx\t%0.<Vetype>, %1/z, 
>>> %2.<Vetype>\;<sve_int_op>\t%0.<Vetype>, %1/m, %0.<Vetype>, %3.<Vetype>
>>> }
>>> "&& !CONSTANT_P (operands[5])"
>>> {
>>>  operands[5] = CONSTM1_RTX (<VPRED>mode);
>>> }
>>> [(set_attr "movprfx" "yes")]
>>> )
>>> 
>>> because operands[3] and operands[4] were both expanded into the same 
>>> register
>>> operand containing a zero vector by
>>> 
>>> ;; Predicated binary arithmetic with merging.
>>> (define_expand "@cond_<sve_int_op><mode>"
>>> [(set (match_operand:SVE_FULL_I 0 "register_operand")
>>> (unspec:SVE_FULL_I
>>> [(match_operand:<VPRED> 1 "register_operand")
>>> (unspec:SVE_FULL_I
>>>   [(match_dup 5)
>>>    (unspec:SVE_FULL_I
>>> [(match_operand:SVE_FULL_I 2 "register_operand")
>>> (match_operand:SVE_FULL_I 3 "register_operand")]
>>> SVE2_COND_INT_BINARY)]
>>>   UNSPEC_PRED_X)
>>> (match_operand:SVE_FULL_I 4 "aarch64_simd_reg_or_zero")]
>>> UNSPEC_SEL))]
>>> "TARGET_SVE2"
>>> {
>>>  operands[5] = CONSTM1_RTX (<MODE>mode);
>>> }
>>> )
>>> 
>>> This patch fixes the ICE by changing the predicate of operands[3]
>>> in both patterns to aarch64_simd_reg_or_zero. It also adds tests
>>> adjusted from the report in PR121599.
>>> 
>>> The patch was bootstrapped and tested on aarch64-linux-gnu, no regression.
>>> OK for trunk?
>>> 
>>> Alex Coplan pointed out in the bugzilla ticket that this ICE goes back
>>> to GCC 10. Shall we backport?
>>> 
>>> Signed-off-by: Jennifer Schmitz <[email protected]>
>>> 
>>> gcc/
>>> PR target/121599
>>> * config/aarch64/aarch64-sve2.md (@cond_<sve_int_op><mode>):
>>> Set predicate for operand 3 to aarch64_simd_reg_or_zero.
>>> (*cond_<sve_int_op><mode>_z): Likewise.
>>> 
>>> gcc/testsuite/
>>> PR target/121599
>>> * gcc.target/aarch64/sve2/pr121599.c: New test.
>>> ---
>>> gcc/config/aarch64/aarch64-sve2.md            |  4 +--
>>> .../gcc.target/aarch64/sve2/pr121599.c        | 31 +++++++++++++++++++
>>> 2 files changed, 33 insertions(+), 2 deletions(-)
>>> create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/pr121599.c
>>> 
>>> diff --git a/gcc/config/aarch64/aarch64-sve2.md 
>>> b/gcc/config/aarch64/aarch64-sve2.md
>>> index a3cbbce8b31..c6a84fa7f3d 100644
>>> --- a/gcc/config/aarch64/aarch64-sve2.md
>>> +++ b/gcc/config/aarch64/aarch64-sve2.md
>>> @@ -1021,7 +1021,7 @@
>>>   [(match_dup 5)
>>>    (unspec:SVE_FULL_I
>>> [(match_operand:SVE_FULL_I 2 "register_operand")
>>> - (match_operand:SVE_FULL_I 3 "register_operand")]
>>> + (match_operand:SVE_FULL_I 3 "aarch64_simd_reg_or_zero")]
>>> SVE2_COND_INT_BINARY)]
>>>   UNSPEC_PRED_X)
>>> (match_operand:SVE_FULL_I 4 "aarch64_simd_reg_or_zero")]
>>> @@ -1136,7 +1136,7 @@
>>>   [(match_operand 5)
>>>    (unspec:SVE_FULL_I
>>> [(match_operand:SVE_FULL_I 2 "register_operand")
>>> - (match_operand:SVE_FULL_I 3 "register_operand")]
>>> + (match_operand:SVE_FULL_I 3 "aarch64_simd_reg_or_zero")]
>>> SVE2_COND_INT_BINARY_NOREV)]
>>>   UNSPEC_PRED_X)
>>> (match_operand:SVE_FULL_I 4 "aarch64_simd_imm_zero")]
>> 
>> I think the ICE happens because operands[4] ends up being a register which 
>> the aarch64_simd_imm_zero can’t handle.
>> The pattern *cond_<sve_int_op><mode>_z is also not supposed to be handling 
>> register operands, it only expresses a merge with zero.
>> Allowing a zero in operands[3] here is problematic as the output code 
>> wouldn’t know how to print it if the midend used a zero vector for it.
>> 
>> I think the right solution is to adjust the @cond_<sve_int_op><mode> 
>> expander to only allow aarch64_simd_imm_zero for operands[4] for the 
>> SVE2_COND_INT_BINARY_NOREV subset of SVE2_COND_INT_BINARY.
> Hi Kyrill,
> Thanks for reviewing the patch.
> I tried allowing only aarch64_simd_imm_zero for operands[4] for 
> SVE2_COND_INT_BINARY_NOREV, but while that solved the ICE it caused all tests 
> predication other than _z to fail.
> I chose instead to add a pattern similar to the existing 
> *cond_<sve_int_op><mode>_3 for SVE2_COND_INT_BINARY_NOREV that handles the 
> case when operands[3] and operands[4] are the same register operand.
> Thanks,
> Jennifer
> 
> When op2 in SVE2 saturating add intrinsics (svuqadd, svsqadd) is a zero
> vector and predication is _z, an ICE in vregs occurs, e.g. for
> 
> svuint8_t foo (svbool_t pg, svuint8_t op1)
> {
>    return svsqadd_u8_z (pg, op1, svdup_s8 (0));
> }
> 
> The insn failed to match the pattern (aarch64-sve2.md):
> 
> ;; Predicated binary operations with no reverse form, merging with zero.
> ;; At present we don't generate these patterns via a cond_* optab,
> ;; so there's no correctness requirement to handle merging with an
> ;; independent value.
> (define_insn_and_rewrite "*cond_<sve_int_op><mode>_z"
>  [(set (match_operand:SVE_FULL_I 0 "register_operand")
> (unspec:SVE_FULL_I
>   [(match_operand:<VPRED> 1 "register_operand")
>    (unspec:SVE_FULL_I
>      [(match_operand 5)
>       (unspec:SVE_FULL_I
> [(match_operand:SVE_FULL_I 2 "register_operand")
>  (match_operand:SVE_FULL_I 3 "register_operand")]
> SVE2_COND_INT_BINARY_NOREV)]
>      UNSPEC_PRED_X)
>    (match_operand:SVE_FULL_I 4 "aarch64_simd_imm_zero")]
>   UNSPEC_SEL))]
>  "TARGET_SVE2"
>  {@ [ cons: =0 , 1   , 2 , 3  ]
>     [ &w       , Upl , 0 , w  ] movprfx\t%0.<Vetype>, %1/z, 
> %0.<Vetype>\;<sve_int_op>\t%0.<Vetype>, %1/m, %0.<Vetype>, %3.<Vetype>
>     [ &w       , Upl , w , w  ] movprfx\t%0.<Vetype>, %1/z, 
> %2.<Vetype>\;<sve_int_op>\t%0.<Vetype>, %1/m, %0.<Vetype>, %3.<Vetype>
>  }
>  "&& !CONSTANT_P (operands[5])"
>  {
>    operands[5] = CONSTM1_RTX (<VPRED>mode);
>  }
>  [(set_attr "movprfx" "yes")]
> )
> 
> because operands[3] and operands[4] were both expanded into the same register
> operand containing a zero vector by define_expand "@cond_<sve_int_op><mode>".
> 
> This patch fixes the ICE by adding the pattern
>  ;; Predicated binary operations with no reverse form, merging with
>  ;; the second input.
>  (define_insn_and_rewrite "*cond_<sve_int_op><mode>_3"
> for SVE2_COND_INT_BINARY_NOREV that allows operands[4] to be a register 
> operand
> (equal to operands[3]). It adds a vcond_mask expression such that we end
> up with an insn matched by
>  ;; Predicated binary arithmetic, merging with the first input.
>  (define_insn_and_rewrite "*cond_<sve_int_op><mode>_2"
> 
> The patch was bootstrapped and tested on aarch64-linux-gnu, no regression.
> OK for trunk?
> 
> Alex Coplan pointed out in the bugzilla ticket that this ICE goes back
> to GCC 10. Shall we backport?
> 
> Signed-off-by: Jennifer Schmitz <[email protected]>
> 
> gcc/
> PR target/121599
> * config/aarch64/aarch64-sve2.md (*cond_<sve_int_op><mode>_3):
> New pattern for SVE2_COND_INT_BINARY_NOREV.
> 
> gcc/testsuite/
> PR target/121599
> * gcc.target/aarch64/sve2/pr121599.c: New test.
> ---
> gcc/config/aarch64/aarch64-sve2.md            | 36 +++++++++++++++++++
> .../gcc.target/aarch64/sve2/pr121599.c        | 31 ++++++++++++++++
> 2 files changed, 67 insertions(+)
> create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/pr121599.c
> 
> diff --git a/gcc/config/aarch64/aarch64-sve2.md 
> b/gcc/config/aarch64/aarch64-sve2.md
> index a3cbbce8b31..d82dc41df3e 100644
> --- a/gcc/config/aarch64/aarch64-sve2.md
> +++ b/gcc/config/aarch64/aarch64-sve2.md
> @@ -1124,6 +1124,42 @@
>   [(set_attr "movprfx" "yes")]
> )
> 
> +;; Predicated binary operations with no reverse form, merging with
> +;; the second input.
> +(define_insn_and_rewrite "*cond_<sve_int_op><mode>_3"
> +  [(set (match_operand:SVE_FULL_I 0 "register_operand")
> + (unspec:SVE_FULL_I
> +   [(match_operand:<VPRED> 1 "register_operand")
> +    (unspec:SVE_FULL_I
> +      [(match_operand 5)
> +       (unspec:SVE_FULL_I
> + [(match_operand:SVE_FULL_I 2 "register_operand")
> +  (match_operand:SVE_FULL_I 3 "register_operand")]
> + SVE2_COND_INT_BINARY_NOREV)]
> +      UNSPEC_PRED_X)
> +    (match_operand:SVE_FULL_I 4 "register_operand")]
> +   UNSPEC_SEL))]
> +  "TARGET_SVE2



The define_insn-related constructs should really have constraints for the 
operands.
I know they will be likely be split before reload when it matters, but it’s not 
impossible that they can reach RA, and so the pattern should have information 
for which constraints to use.
Ok with constraints added to the operands.
Sorry for the long delay.
Thanks,
Kyrill 

> +   && rtx_equal_p (operands[3], operands[4])"
> +  "#"
> +  "&& 1"
> +  {
> +    if (reload_completed
> +        && register_operand (operands[4], <MODE>mode)
> +        && !rtx_equal_p (operands[0], operands[4]))
> +      {
> + emit_insn (gen_vcond_mask_<mode><vpred> (operands[0], operands[2],
> +  operands[4], operands[1]));
> + operands[4] = operands[2] = operands[0];
> +      }
> +    else if (!CONSTANT_P (operands[5]))
> +      operands[5] = CONSTM1_RTX (<VPRED>mode);
> +    else
> +      FAIL;
> +  }
> +  [(set_attr "movprfx" "yes")]
> +)
> +
> ;; Predicated binary operations with no reverse form, merging with zero.
> ;; At present we don't generate these patterns via a cond_* optab,
> ;; so there's no correctness requirement to handle merging with an
> diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/pr121599.c 
> b/gcc/testsuite/gcc.target/aarch64/sve2/pr121599.c
> new file mode 100644
> index 00000000000..dbfa613649d
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/aarch64/sve2/pr121599.c
> @@ -0,0 +1,31 @@
> +/* PR target/121599.  */
> +/* { dg-do compile } */
> +/* { dg-options "-O2" } */
> +/* { dg-final { check-function-bodies "**" "" "" } } */
> +
> +#include <arm_sve.h>
> +
> +/*
> +** foo:
> +** movi d([0-9]+), #0
> +** sel z0\.b, p0, z0\.b, z\1\.b
> +** usqadd z0\.b, p0/m, z0\.b, z\1\.b
> +** ret
> +*/
> +svuint8_t foo (svbool_t pg, svuint8_t op1)
> +{
> +    return svsqadd_u8_z (pg, op1, svdup_s8 (0));
> +}
> +
> +/*
> +** bar:
> +** movi d([0-9]+), #0
> +** sel z0\.b, p0, z0\.b, z\1\.b
> +** suqadd z0\.b, p0/m, z0\.b, z\1\.b
> +** ret
> +*/
> +svint8_t bar (svbool_t pg, svint8_t op1)
> +{
> +    return svuqadd_n_s8_z (pg, op1, 0);
> +}
> +
> -- 
> 2.34.1
>> 
>> Thanks,
>> Kyrill
>> 
>>> diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/pr121599.c 
>>> b/gcc/testsuite/gcc.target/aarch64/sve2/pr121599.c
>>> new file mode 100644
>>> index 00000000000..cd80ef9115c
>>> --- /dev/null
>>> +++ b/gcc/testsuite/gcc.target/aarch64/sve2/pr121599.c
>>> @@ -0,0 +1,31 @@
>>> +/* PR target/121599.  */
>>> +/* { dg-do compile } */
>>> +/* { dg-options "-O2" } */
>>> +/* { dg-final { check-function-bodies "**" "" "" } } */
>>> +
>>> +#include <arm_sve.h>
>>> +
>>> +/*
>>> +** foo:
>>> +** movi d([0-9]+), #0
>>> +** movprfx z0\.b, p0/z, z0\.b
>>> +** usqadd z0\.b, p0/m, z0\.b, z\1\.b
>>> +** ret
>>> +*/
>>> +svuint8_t foo (svbool_t pg, svuint8_t op1)
>>> +{
>>> +    return svsqadd_u8_z (pg, op1, svdup_s8 (0));
>>> +}
>>> +
>>> +/*
>>> +** bar:
>>> +** movi d([0-9]+), #0
>>> +** movprfx z0\.b, p0/z, z0\.b
>>> +** suqadd z0\.b, p0/m, z0\.b, z\1\.b
>>> +** ret
>>> +*/
>>> +svint8_t bar (svbool_t pg, svint8_t op1)
>>> +{
>>> +    return svuqadd_n_s8_z (pg, op1, 0);
>>> +}
>>> +
>>> -- 
>>> 2.34.1

Re: [PATCH][PR121599] aarch64: Fix ICE when op2 is zero for SVE2 saturating add intrinsics.

Reply via email to