> On 28 Aug 2025, at 09:19, Jennifer Schmitz <[email protected]> wrote:
>
> External email: Use caution opening links or attachments
>
>
>> On 25 Aug 2025, at 15:40, Kyrylo Tkachov <[email protected]> wrote:
>>
>> Hi Jennifer,
>>
>>> On 25 Aug 2025, at 12:56, Jennifer Schmitz <[email protected]> wrote:
>>>
>>> When op2 in SVE2 saturating add intrinsics (svuqadd, svsqadd) is a zero
>>> vector and predication is _z, an ICE in vregs occurs, e.g. for
>>>
>>> svuint8_t foo (svbool_t pg, svuint8_t op1)
>>> {
>>> return svsqadd_u8_z (pg, op1, svdup_s8 (0));
>>> }
>>>
>>> The insn failed to match the pattern (aarch64-sve2.md):
>>>
>>> ;; Predicated binary operations with no reverse form, merging with zero.
>>> ;; At present we don't generate these patterns via a cond_* optab,
>>> ;; so there's no correctness requirement to handle merging with an
>>> ;; independent value.
>>> (define_insn_and_rewrite "*cond_<sve_int_op><mode>_z"
>>> [(set (match_operand:SVE_FULL_I 0 "register_operand")
>>> (unspec:SVE_FULL_I
>>> [(match_operand:<VPRED> 1 "register_operand")
>>> (unspec:SVE_FULL_I
>>> [(match_operand 5)
>>> (unspec:SVE_FULL_I
>>> [(match_operand:SVE_FULL_I 2 "register_operand")
>>> (match_operand:SVE_FULL_I 3 "register_operand")]
>>> SVE2_COND_INT_BINARY_NOREV)]
>>> UNSPEC_PRED_X)
>>> (match_operand:SVE_FULL_I 4 "aarch64_simd_imm_zero")]
>>> UNSPEC_SEL))]
>>> "TARGET_SVE2"
>>> {@ [ cons: =0 , 1 , 2 , 3 ]
>>> [ &w , Upl , 0 , w ] movprfx\t%0.<Vetype>, %1/z,
>>> %0.<Vetype>\;<sve_int_op>\t%0.<Vetype>, %1/m, %0.<Vetype>, %3.<Vetype>
>>> [ &w , Upl , w , w ] movprfx\t%0.<Vetype>, %1/z,
>>> %2.<Vetype>\;<sve_int_op>\t%0.<Vetype>, %1/m, %0.<Vetype>, %3.<Vetype>
>>> }
>>> "&& !CONSTANT_P (operands[5])"
>>> {
>>> operands[5] = CONSTM1_RTX (<VPRED>mode);
>>> }
>>> [(set_attr "movprfx" "yes")]
>>> )
>>>
>>> because operands[3] and operands[4] were both expanded into the same
>>> register
>>> operand containing a zero vector by
>>>
>>> ;; Predicated binary arithmetic with merging.
>>> (define_expand "@cond_<sve_int_op><mode>"
>>> [(set (match_operand:SVE_FULL_I 0 "register_operand")
>>> (unspec:SVE_FULL_I
>>> [(match_operand:<VPRED> 1 "register_operand")
>>> (unspec:SVE_FULL_I
>>> [(match_dup 5)
>>> (unspec:SVE_FULL_I
>>> [(match_operand:SVE_FULL_I 2 "register_operand")
>>> (match_operand:SVE_FULL_I 3 "register_operand")]
>>> SVE2_COND_INT_BINARY)]
>>> UNSPEC_PRED_X)
>>> (match_operand:SVE_FULL_I 4 "aarch64_simd_reg_or_zero")]
>>> UNSPEC_SEL))]
>>> "TARGET_SVE2"
>>> {
>>> operands[5] = CONSTM1_RTX (<MODE>mode);
>>> }
>>> )
>>>
>>> This patch fixes the ICE by changing the predicate of operands[3]
>>> in both patterns to aarch64_simd_reg_or_zero. It also adds tests
>>> adjusted from the report in PR121599.
>>>
>>> The patch was bootstrapped and tested on aarch64-linux-gnu, no regression.
>>> OK for trunk?
>>>
>>> Alex Coplan pointed out in the bugzilla ticket that this ICE goes back
>>> to GCC 10. Shall we backport?
>>>
>>> Signed-off-by: Jennifer Schmitz <[email protected]>
>>>
>>> gcc/
>>> PR target/121599
>>> * config/aarch64/aarch64-sve2.md (@cond_<sve_int_op><mode>):
>>> Set predicate for operand 3 to aarch64_simd_reg_or_zero.
>>> (*cond_<sve_int_op><mode>_z): Likewise.
>>>
>>> gcc/testsuite/
>>> PR target/121599
>>> * gcc.target/aarch64/sve2/pr121599.c: New test.
>>> ---
>>> gcc/config/aarch64/aarch64-sve2.md | 4 +--
>>> .../gcc.target/aarch64/sve2/pr121599.c | 31 +++++++++++++++++++
>>> 2 files changed, 33 insertions(+), 2 deletions(-)
>>> create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/pr121599.c
>>>
>>> diff --git a/gcc/config/aarch64/aarch64-sve2.md
>>> b/gcc/config/aarch64/aarch64-sve2.md
>>> index a3cbbce8b31..c6a84fa7f3d 100644
>>> --- a/gcc/config/aarch64/aarch64-sve2.md
>>> +++ b/gcc/config/aarch64/aarch64-sve2.md
>>> @@ -1021,7 +1021,7 @@
>>> [(match_dup 5)
>>> (unspec:SVE_FULL_I
>>> [(match_operand:SVE_FULL_I 2 "register_operand")
>>> - (match_operand:SVE_FULL_I 3 "register_operand")]
>>> + (match_operand:SVE_FULL_I 3 "aarch64_simd_reg_or_zero")]
>>> SVE2_COND_INT_BINARY)]
>>> UNSPEC_PRED_X)
>>> (match_operand:SVE_FULL_I 4 "aarch64_simd_reg_or_zero")]
>>> @@ -1136,7 +1136,7 @@
>>> [(match_operand 5)
>>> (unspec:SVE_FULL_I
>>> [(match_operand:SVE_FULL_I 2 "register_operand")
>>> - (match_operand:SVE_FULL_I 3 "register_operand")]
>>> + (match_operand:SVE_FULL_I 3 "aarch64_simd_reg_or_zero")]
>>> SVE2_COND_INT_BINARY_NOREV)]
>>> UNSPEC_PRED_X)
>>> (match_operand:SVE_FULL_I 4 "aarch64_simd_imm_zero")]
>>
>> I think the ICE happens because operands[4] ends up being a register which
>> the aarch64_simd_imm_zero can’t handle.
>> The pattern *cond_<sve_int_op><mode>_z is also not supposed to be handling
>> register operands, it only expresses a merge with zero.
>> Allowing a zero in operands[3] here is problematic as the output code
>> wouldn’t know how to print it if the midend used a zero vector for it.
>>
>> I think the right solution is to adjust the @cond_<sve_int_op><mode>
>> expander to only allow aarch64_simd_imm_zero for operands[4] for the
>> SVE2_COND_INT_BINARY_NOREV subset of SVE2_COND_INT_BINARY.
> Hi Kyrill,
> Thanks for reviewing the patch.
> I tried allowing only aarch64_simd_imm_zero for operands[4] for
> SVE2_COND_INT_BINARY_NOREV, but while that solved the ICE it caused all tests
> predication other than _z to fail.
> I chose instead to add a pattern similar to the existing
> *cond_<sve_int_op><mode>_3 for SVE2_COND_INT_BINARY_NOREV that handles the
> case when operands[3] and operands[4] are the same register operand.
> Thanks,
> Jennifer
Gentle ping.
Thanks,
Jennifer
>
> When op2 in SVE2 saturating add intrinsics (svuqadd, svsqadd) is a zero
> vector and predication is _z, an ICE in vregs occurs, e.g. for
>
> svuint8_t foo (svbool_t pg, svuint8_t op1)
> {
> return svsqadd_u8_z (pg, op1, svdup_s8 (0));
> }
>
> The insn failed to match the pattern (aarch64-sve2.md):
>
> ;; Predicated binary operations with no reverse form, merging with zero.
> ;; At present we don't generate these patterns via a cond_* optab,
> ;; so there's no correctness requirement to handle merging with an
> ;; independent value.
> (define_insn_and_rewrite "*cond_<sve_int_op><mode>_z"
> [(set (match_operand:SVE_FULL_I 0 "register_operand")
> (unspec:SVE_FULL_I
> [(match_operand:<VPRED> 1 "register_operand")
> (unspec:SVE_FULL_I
> [(match_operand 5)
> (unspec:SVE_FULL_I
> [(match_operand:SVE_FULL_I 2 "register_operand")
> (match_operand:SVE_FULL_I 3 "register_operand")]
> SVE2_COND_INT_BINARY_NOREV)]
> UNSPEC_PRED_X)
> (match_operand:SVE_FULL_I 4 "aarch64_simd_imm_zero")]
> UNSPEC_SEL))]
> "TARGET_SVE2"
> {@ [ cons: =0 , 1 , 2 , 3 ]
> [ &w , Upl , 0 , w ] movprfx\t%0.<Vetype>, %1/z,
> %0.<Vetype>\;<sve_int_op>\t%0.<Vetype>, %1/m, %0.<Vetype>, %3.<Vetype>
> [ &w , Upl , w , w ] movprfx\t%0.<Vetype>, %1/z,
> %2.<Vetype>\;<sve_int_op>\t%0.<Vetype>, %1/m, %0.<Vetype>, %3.<Vetype>
> }
> "&& !CONSTANT_P (operands[5])"
> {
> operands[5] = CONSTM1_RTX (<VPRED>mode);
> }
> [(set_attr "movprfx" "yes")]
> )
>
> because operands[3] and operands[4] were both expanded into the same register
> operand containing a zero vector by define_expand "@cond_<sve_int_op><mode>".
>
> This patch fixes the ICE by adding the pattern
> ;; Predicated binary operations with no reverse form, merging with
> ;; the second input.
> (define_insn_and_rewrite "*cond_<sve_int_op><mode>_3"
> for SVE2_COND_INT_BINARY_NOREV that allows operands[4] to be a register
> operand
> (equal to operands[3]). It adds a vcond_mask expression such that we end
> up with an insn matched by
> ;; Predicated binary arithmetic, merging with the first input.
> (define_insn_and_rewrite "*cond_<sve_int_op><mode>_2"
>
> The patch was bootstrapped and tested on aarch64-linux-gnu, no regression.
> OK for trunk?
>
> Alex Coplan pointed out in the bugzilla ticket that this ICE goes back
> to GCC 10. Shall we backport?
>
> Signed-off-by: Jennifer Schmitz <[email protected]>
>
> gcc/
> PR target/121599
> * config/aarch64/aarch64-sve2.md (*cond_<sve_int_op><mode>_3):
> New pattern for SVE2_COND_INT_BINARY_NOREV.
>
> gcc/testsuite/
> PR target/121599
> * gcc.target/aarch64/sve2/pr121599.c: New test.
> ---
> gcc/config/aarch64/aarch64-sve2.md | 36 +++++++++++++++++++
> .../gcc.target/aarch64/sve2/pr121599.c | 31 ++++++++++++++++
> 2 files changed, 67 insertions(+)
> create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/pr121599.c
>
> diff --git a/gcc/config/aarch64/aarch64-sve2.md
> b/gcc/config/aarch64/aarch64-sve2.md
> index a3cbbce8b31..d82dc41df3e 100644
> --- a/gcc/config/aarch64/aarch64-sve2.md
> +++ b/gcc/config/aarch64/aarch64-sve2.md
> @@ -1124,6 +1124,42 @@
> [(set_attr "movprfx" "yes")]
> )
>
> +;; Predicated binary operations with no reverse form, merging with
> +;; the second input.
> +(define_insn_and_rewrite "*cond_<sve_int_op><mode>_3"
> + [(set (match_operand:SVE_FULL_I 0 "register_operand")
> + (unspec:SVE_FULL_I
> + [(match_operand:<VPRED> 1 "register_operand")
> + (unspec:SVE_FULL_I
> + [(match_operand 5)
> + (unspec:SVE_FULL_I
> + [(match_operand:SVE_FULL_I 2 "register_operand")
> + (match_operand:SVE_FULL_I 3 "register_operand")]
> + SVE2_COND_INT_BINARY_NOREV)]
> + UNSPEC_PRED_X)
> + (match_operand:SVE_FULL_I 4 "register_operand")]
> + UNSPEC_SEL))]
> + "TARGET_SVE2
> + && rtx_equal_p (operands[3], operands[4])"
> + "#"
> + "&& 1"
> + {
> + if (reload_completed
> + && register_operand (operands[4], <MODE>mode)
> + && !rtx_equal_p (operands[0], operands[4]))
> + {
> + emit_insn (gen_vcond_mask_<mode><vpred> (operands[0], operands[2],
> + operands[4], operands[1]));
> + operands[4] = operands[2] = operands[0];
> + }
> + else if (!CONSTANT_P (operands[5]))
> + operands[5] = CONSTM1_RTX (<VPRED>mode);
> + else
> + FAIL;
> + }
> + [(set_attr "movprfx" "yes")]
> +)
> +
> ;; Predicated binary operations with no reverse form, merging with zero.
> ;; At present we don't generate these patterns via a cond_* optab,
> ;; so there's no correctness requirement to handle merging with an
> diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/pr121599.c
> b/gcc/testsuite/gcc.target/aarch64/sve2/pr121599.c
> new file mode 100644
> index 00000000000..dbfa613649d
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/aarch64/sve2/pr121599.c
> @@ -0,0 +1,31 @@
> +/* PR target/121599. */
> +/* { dg-do compile } */
> +/* { dg-options "-O2" } */
> +/* { dg-final { check-function-bodies "**" "" "" } } */
> +
> +#include <arm_sve.h>
> +
> +/*
> +** foo:
> +** movi d([0-9]+), #0
> +** sel z0\.b, p0, z0\.b, z\1\.b
> +** usqadd z0\.b, p0/m, z0\.b, z\1\.b
> +** ret
> +*/
> +svuint8_t foo (svbool_t pg, svuint8_t op1)
> +{
> + return svsqadd_u8_z (pg, op1, svdup_s8 (0));
> +}
> +
> +/*
> +** bar:
> +** movi d([0-9]+), #0
> +** sel z0\.b, p0, z0\.b, z\1\.b
> +** suqadd z0\.b, p0/m, z0\.b, z\1\.b
> +** ret
> +*/
> +svint8_t bar (svbool_t pg, svint8_t op1)
> +{
> + return svuqadd_n_s8_z (pg, op1, 0);
> +}
> +
> --
> 2.34.1
>>
>> Thanks,
>> Kyrill
>>
>>> diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/pr121599.c
>>> b/gcc/testsuite/gcc.target/aarch64/sve2/pr121599.c
>>> new file mode 100644
>>> index 00000000000..cd80ef9115c
>>> --- /dev/null
>>> +++ b/gcc/testsuite/gcc.target/aarch64/sve2/pr121599.c
>>> @@ -0,0 +1,31 @@
>>> +/* PR target/121599. */
>>> +/* { dg-do compile } */
>>> +/* { dg-options "-O2" } */
>>> +/* { dg-final { check-function-bodies "**" "" "" } } */
>>> +
>>> +#include <arm_sve.h>
>>> +
>>> +/*
>>> +** foo:
>>> +** movi d([0-9]+), #0
>>> +** movprfx z0\.b, p0/z, z0\.b
>>> +** usqadd z0\.b, p0/m, z0\.b, z\1\.b
>>> +** ret
>>> +*/
>>> +svuint8_t foo (svbool_t pg, svuint8_t op1)
>>> +{
>>> + return svsqadd_u8_z (pg, op1, svdup_s8 (0));
>>> +}
>>> +
>>> +/*
>>> +** bar:
>>> +** movi d([0-9]+), #0
>>> +** movprfx z0\.b, p0/z, z0\.b
>>> +** suqadd z0\.b, p0/m, z0\.b, z\1\.b
>>> +** ret
>>> +*/
>>> +svint8_t bar (svbool_t pg, svint8_t op1)
>>> +{
>>> + return svuqadd_n_s8_z (pg, op1, 0);
>>> +}
>>> +
>>> --
>>> 2.34.1