> On 2 Oct 2025, at 00:51, Richard Sandiford <[email protected]> wrote:
>
> External email: Use caution opening links or attachments
>
>
> Jennifer Schmitz <[email protected]> writes:
>>> On 30 Sep 2025, at 11:54, Richard Sandiford
>>> <[email protected]> wrote:
>>>
>>> External email: Use caution opening links or attachments
>>>
>>>
>>> Kyrylo Tkachov <[email protected]> writes:
>>>>> diff --git a/gcc/config/aarch64/aarch64-sve2.md
>>>>> b/gcc/config/aarch64/aarch64-sve2.md
>>>>> index a3cbbce8b31..d82dc41df3e 100644
>>>>> --- a/gcc/config/aarch64/aarch64-sve2.md
>>>>> +++ b/gcc/config/aarch64/aarch64-sve2.md
>>>>> @@ -1124,6 +1124,42 @@
>>>>> [(set_attr "movprfx" "yes")]
>>>>> )
>>>>>
>>>>> +;; Predicated binary operations with no reverse form, merging with
>>>>> +;; the second input.
>>>>> +(define_insn_and_rewrite "*cond_<sve_int_op><mode>_3"
>>>>> + [(set (match_operand:SVE_FULL_I 0 "register_operand")
>>>>> + (unspec:SVE_FULL_I
>>>>> + [(match_operand:<VPRED> 1 "register_operand")
>>>>> + (unspec:SVE_FULL_I
>>>>> + [(match_operand 5)
>>>>> + (unspec:SVE_FULL_I
>>>>> + [(match_operand:SVE_FULL_I 2 "register_operand")
>>>>> + (match_operand:SVE_FULL_I 3 "register_operand")]
>>>>> + SVE2_COND_INT_BINARY_NOREV)]
>>>>> + UNSPEC_PRED_X)
>>>>> + (match_operand:SVE_FULL_I 4 "register_operand")]
>>>>> + UNSPEC_SEL))]
>>>>> + "TARGET_SVE2
>>>>
>>>>
>>>> The define_insn-related constructs should really have constraints for
>>>> the operands.
>>>> I know they will be likely be split before reload when it matters, but
>>>> it’s not impossible that they can reach RA, and so the pattern should
>>>> have information for which constraints to use.
>>>
>>> Also, this is a define_insn_and_rewrite, so any "split" is from this
>>> define_insn to itself. Once the define_insn is matched, it will be used
>>> by the RA in some form.
>>>
>>> (For avoidance of doubt, operand 5 can stay as-is. Only the
>>> register_operands need constraints.)
>>>
>> Hi Kyrill and Richard,
>> Thanks for the review. Below is the updated patch with constraints for
>> operands 0-4. I repeated bootstrap and testing, no regression. I will push
>> it to trunk tomorrow if there are no objections.
>> As pointed out by Alex Coplan in the bugzilla ticket, this ICE goes back to
>> GCC 10. Shall we backport? If so, to which release(s)?
>> Thanks,
>> Jennifer
>>
>>
>> When op2 in SVE2 saturating add intrinsics (svuqadd, svsqadd) is a zero
>> vector and predication is _z, an ICE in vregs occurs, e.g. for
>>
>> svuint8_t foo (svbool_t pg, svuint8_t op1)
>> {
>> return svsqadd_u8_z (pg, op1, svdup_s8 (0));
>> }
>>
>> The insn failed to match the pattern (aarch64-sve2.md):
>>
>> ;; Predicated binary operations with no reverse form, merging with zero.
>> ;; At present we don't generate these patterns via a cond_* optab,
>> ;; so there's no correctness requirement to handle merging with an
>> ;; independent value.
>> (define_insn_and_rewrite "*cond_<sve_int_op><mode>_z"
>> [(set (match_operand:SVE_FULL_I 0 "register_operand")
>> (unspec:SVE_FULL_I
>> [(match_operand:<VPRED> 1 "register_operand")
>> (unspec:SVE_FULL_I
>> [(match_operand 5)
>> (unspec:SVE_FULL_I
>> [(match_operand:SVE_FULL_I 2 "register_operand")
>> (match_operand:SVE_FULL_I 3 "register_operand")]
>> SVE2_COND_INT_BINARY_NOREV)]
>> UNSPEC_PRED_X)
>> (match_operand:SVE_FULL_I 4 "aarch64_simd_imm_zero")]
>> UNSPEC_SEL))]
>> "TARGET_SVE2"
>> {@ [ cons: =0 , 1 , 2 , 3 ]
>> [ &w , Upl , 0 , w ] movprfx\t%0.<Vetype>, %1/z,
>> %0.<Vetype>\;<sve_int_op>\t%0.<Vetype>, %1/m, %0.<Vetype>, %3.<Vetype>
>> [ &w , Upl , w , w ] movprfx\t%0.<Vetype>, %1/z,
>> %2.<Vetype>\;<sve_int_op>\t%0.<Vetype>, %1/m, %0.<Vetype>, %3.<Vetype>
>> }
>> "&& !CONSTANT_P (operands[5])"
>> {
>> operands[5] = CONSTM1_RTX (<VPRED>mode);
>> }
>> [(set_attr "movprfx" "yes")]
>> )
>>
>> because operands[3] and operands[4] were both expanded into the same register
>> operand containing a zero vector by define_expand "@cond_<sve_int_op><mode>".
>>
>> This patch fixes the ICE by adding the pattern
>> ;; Predicated binary operations with no reverse form, merging with
>> ;; the second input.
>> (define_insn_and_rewrite "*cond_<sve_int_op><mode>_3"
>> for SVE2_COND_INT_BINARY_NOREV that allows operands[4] to be a register
>> operand
>> (equal to operands[3]). It adds a vcond_mask expression such that we end
>> up with an insn matched by
>> ;; Predicated binary arithmetic, merging with the first input.
>> (define_insn_and_rewrite "*cond_<sve_int_op><mode>_2"
>>
>> The patch was bootstrapped and tested on aarch64-linux-gnu, no regression.
>> OK for trunk?
>>
>> Alex Coplan pointed out in the bugzilla ticket that this ICE goes back
>> to GCC 10. Shall we backport?
>>
>> Signed-off-by: Jennifer Schmitz <[email protected]>
>>
>> gcc/
>> PR target/121599
>> * config/aarch64/aarch64-sve2.md (*cond_<sve_int_op><mode>_3):
>> New pattern for SVE2_COND_INT_BINARY_NOREV.
>>
>> gcc/testsuite/
>> PR target/121599
>> * gcc.target/aarch64/sve2/pr121599.c: New test.
>> ---
>> gcc/config/aarch64/aarch64-sve2.md | 36 +++++++++++++++++++
>> .../gcc.target/aarch64/sve2/pr121599.c | 31 ++++++++++++++++
>> 2 files changed, 67 insertions(+)
>> create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/pr121599.c
>>
>> diff --git a/gcc/config/aarch64/aarch64-sve2.md
>> b/gcc/config/aarch64/aarch64-sve2.md
>> index 69a376706fa..2ed9bee335b 100644
>> --- a/gcc/config/aarch64/aarch64-sve2.md
>> +++ b/gcc/config/aarch64/aarch64-sve2.md
>> @@ -1164,6 +1164,42 @@
>> (set_attr "sve_type" "sve_int_general")]
>> )
>>
>> +;; Predicated binary operations with no reverse form, merging with
>> +;; the second input.
>> +(define_insn_and_rewrite "*cond_<sve_int_op><mode>_3"
>> + [(set (match_operand:SVE_FULL_I 0 "register_operand" "=&w")
>> + (unspec:SVE_FULL_I
>> + [(match_operand:<VPRED> 1 "register_operand" "Upl")
>> + (unspec:SVE_FULL_I
>> + [(match_operand 5)
>> + (unspec:SVE_FULL_I
>> + [(match_operand:SVE_FULL_I 2 "register_operand" "w")
>> + (match_operand:SVE_FULL_I 3 "register_operand" "w")]
>> + SVE2_COND_INT_BINARY_NOREV)]
>> + UNSPEC_PRED_X)
>> + (match_operand:SVE_FULL_I 4 "register_operand" "w")]
>> + UNSPEC_SEL))]
>> + "TARGET_SVE2
>> + && rtx_equal_p (operands[3], operands[4])"
>
> Sorry for not noticing last time (I was just concentrating on the
> constraints thing) but: it's better to use (match_dup 3) instead
> of using rtx_equal_p in the C++ condition. I don't think there's
> a requirement for all instances of a pseudo to be replaced by the
> same hard register. Something that starts off rtx_equal_p might
> not stay that way after RA.
>
> But the more fundamental question is: why is addition of zero special?
> As the comment above *cond_<sve_int_op><mode>_z says:
>
> ;; Predicated binary operations with no reverse form, merging with zero.
> ;; At present we don't generate these patterns via a cond_* optab,
> ;; so there's no correctness requirement to handle merging with an
> ;; independent value.
> (define_insn_and_rewrite "*cond_<sve_int_op><mode>_z"
>
> So why does the testcse end up with a register "else" value instead of a zero
> else value, and why does it only happen for zero inputs and not (for example)
> for an input of 1?
>
> Debugging, it appears to be because of:
>
> /* First try reusing the result of an earlier legitimization.
> This avoids duplicate rtl and ensures that tied operands
> remain tied.
>
> This search is linear, but NOPS is bounded at compile time
> to a small number (current a single digit). */
> unsigned int j = 0;
> for (; j < i; ++j)
> if (can_reuse_operands_p (icode, opno + j, opno + i, &ops[j], &ops[i])
> && rtx_equal_p (orig_values[j], orig_values[i])
> && ops[j].value
> && insn_operand_matches (icode, opno + i, ops[j].value))
> {
> ops[i].value = copy_rtx (ops[j].value);
> break;
> }
>
> in optabs.cc. Perhaps we should restrict that to cases where the
> original operand doesn't match the predicate. (Although, as the comment
> says, that could have negative effects for tied operands and so wouldn't
> be risk-free.)
>
> A backportable fix would be:
>
> diff --git
> a/gcc/config/aarch64/aarch64-sve-builtins.ccb/gcc/config/aarch64/aarch64-sve-builtins.cc
> index 909d9e524e1..c18865d9348 100644
> --- a/gcc/config/aarch64/aarch64-sve-builtins.cc
> +++ b/gcc/config/aarch64/aarch64-sve-builtins.cc
> @@ -4351,7 +4351,10 @@ function_expander::use_cond_insn (insn_code icode,
> unsigned int merge_argno)
> add_input_operand (icode, pred);
> for (unsigned int i = 0; i < nops; ++i)
> add_input_operand (icode, args[opno + i]);
> - add_input_operand (icode, fallback_arg);
> + if (fallback_arg == CONST0_RTX (mode))
> + add_fixed_operand (fallback_arg);
> + else
> + add_input_operand (icode, fallback_arg);
> return generate_insn (icode);
> }
>
> I think this is better than adding a new pattern. It encourages the use
> of MOVPRFX /Z in other cases too.
>
> I haven't tested that. If it turns out that zero isn't always allowed,
> we could add an insn_operand_matches test to the fallback_arg test.
Thanks, Richard, that seems to be a good solution. I indeed had to add a
insn_operand_matches test
to prevent ICEs that occured in many tests (e.g.
gcc.target/aarch64/sme/acle-asm/revd_*.c).
Is the current version of the patch what you had in mind?
Thanks,
Jennifer
When op2 in SVE2 saturating add intrinsics (svuqadd, svsqadd) is a zero
vector and predication is _z, an ICE in vregs occurs, e.g. for
svuint8_t foo (svbool_t pg, svuint8_t op1)
{
return svsqadd_u8_z (pg, op1, svdup_s8 (0));
}
The insn failed to match the pattern (aarch64-sve2.md):
;; Predicated binary operations with no reverse form, merging with zero.
;; At present we don't generate these patterns via a cond_* optab,
;; so there's no correctness requirement to handle merging with an
;; independent value.
(define_insn_and_rewrite "*cond_<sve_int_op><mode>_z"
[(set (match_operand:SVE_FULL_I 0 "register_operand")
(unspec:SVE_FULL_I
[(match_operand:<VPRED> 1 "register_operand")
(unspec:SVE_FULL_I
[(match_operand 5)
(unspec:SVE_FULL_I
[(match_operand:SVE_FULL_I 2 "register_operand")
(match_operand:SVE_FULL_I 3 "register_operand")]
SVE2_COND_INT_BINARY_NOREV)]
UNSPEC_PRED_X)
(match_operand:SVE_FULL_I 4 "aarch64_simd_imm_zero")]
UNSPEC_SEL))]
"TARGET_SVE2"
{@ [ cons: =0 , 1 , 2 , 3 ]
[ &w , Upl , 0 , w ] movprfx\t%0.<Vetype>, %1/z,
%0.<Vetype>\;<sve_int_op>\t%0.<Vetype>, %1/m, %0.<Vetype>, %3.<Vetype>
[ &w , Upl , w , w ] movprfx\t%0.<Vetype>, %1/z,
%2.<Vetype>\;<sve_int_op>\t%0.<Vetype>, %1/m, %0.<Vetype>, %3.<Vetype>
}
"&& !CONSTANT_P (operands[5])"
{
operands[5] = CONSTM1_RTX (<VPRED>mode);
}
[(set_attr "movprfx" "yes")]
)
because operands[3] and operands[4] were both expanded into the same register
operand containing a zero vector by define_expand "@cond_<sve_int_op><mode>".
This patch fixes the ICE by making a case distinction in
function_expander::use_cond_insn that uses add_fixed_operand if
fallback_arg == CONST0_RTX (mode), and otherwise add_input_operand (which was
previously the default and allowed the expansion of the zero-vector
fallback_arg to a register operand).
The patch was bootstrapped and tested on aarch64-linux-gnu, no regression.
OK for trunk?
Alex Coplan pointed out in the bugzilla ticket that this ICE goes back
to GCC 10. Shall we backport?
Signed-off-by: Jennifer Schmitz <[email protected]>
Co-authored by: Richard Sandiford <[email protected]>
gcc/
PR target/121599
* config/aarch64/aarch64-sve-builtins.cc
(function_expander::use_cond_insn): Use add_fixed_operand if
fallback_arg == CONST0_RTX (mode).
gcc/testsuite/
PR target/121599
* gcc.target/aarch64/sve2/pr121599.c: New test.
---
gcc/config/aarch64/aarch64-sve-builtins.cc | 6 +++-
.../gcc.target/aarch64/sve2/pr121599.c | 31 +++++++++++++++++++
2 files changed, 36 insertions(+), 1 deletion(-)
create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/pr121599.c
diff --git a/gcc/config/aarch64/aarch64-sve-builtins.cc
b/gcc/config/aarch64/aarch64-sve-builtins.cc
index 22d75197188..5efc5ba3846 100644
--- a/gcc/config/aarch64/aarch64-sve-builtins.cc
+++ b/gcc/config/aarch64/aarch64-sve-builtins.cc
@@ -4356,7 +4356,11 @@ function_expander::use_cond_insn (insn_code icode,
unsigned int merge_argno)
add_input_operand (icode, pred);
for (unsigned int i = 0; i < nops; ++i)
add_input_operand (icode, args[opno + i]);
- add_input_operand (icode, fallback_arg);
+ if (insn_operand_matches (icode, opno + nops + 1, fallback_arg)
+ && fallback_arg == CONST0_RTX (mode))
+ add_fixed_operand (fallback_arg);
+ else
+ add_input_operand (icode, fallback_arg);
return generate_insn (icode);
}
diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/pr121599.c
b/gcc/testsuite/gcc.target/aarch64/sve2/pr121599.c
new file mode 100644
index 00000000000..90c5ac97e4f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve2/pr121599.c
@@ -0,0 +1,31 @@
+/* PR target/121599. */
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" "" } } */
+
+#include <arm_sve.h>
+
+/*
+** foo:
+** movi d([0-9]+), #0
+** movprfx z0\.b, p0/z, z0\.b
+** usqadd z0\.b, p0/m, z0\.b, z\1\.b
+** ret
+*/
+svuint8_t foo (svbool_t pg, svuint8_t op1)
+{
+ return svsqadd_u8_z (pg, op1, svdup_s8 (0));
+}
+
+/*
+** bar:
+** movi d([0-9]+), #0
+** movprfx z0\.b, p0/z, z0\.b
+** suqadd z0\.b, p0/m, z0\.b, z\1\.b
+** ret
+*/
+svint8_t bar (svbool_t pg, svint8_t op1)
+{
+ return svuqadd_n_s8_z (pg, op1, 0);
+}
+
--
2.34.1
>
> Thanks,
> Richard