> On 25 Aug 2025, at 15:40, Kyrylo Tkachov <[email protected]> wrote:
>
> Hi Jennifer,
>
>> On 25 Aug 2025, at 12:56, Jennifer Schmitz <[email protected]> wrote:
>>
>> When op2 in SVE2 saturating add intrinsics (svuqadd, svsqadd) is a zero
>> vector and predication is _z, an ICE in vregs occurs, e.g. for
>>
>> svuint8_t foo (svbool_t pg, svuint8_t op1)
>> {
>> return svsqadd_u8_z (pg, op1, svdup_s8 (0));
>> }
>>
>> The insn failed to match the pattern (aarch64-sve2.md):
>>
>> ;; Predicated binary operations with no reverse form, merging with zero.
>> ;; At present we don't generate these patterns via a cond_* optab,
>> ;; so there's no correctness requirement to handle merging with an
>> ;; independent value.
>> (define_insn_and_rewrite "*cond_<sve_int_op><mode>_z"
>> [(set (match_operand:SVE_FULL_I 0 "register_operand")
>> (unspec:SVE_FULL_I
>> [(match_operand:<VPRED> 1 "register_operand")
>> (unspec:SVE_FULL_I
>> [(match_operand 5)
>> (unspec:SVE_FULL_I
>> [(match_operand:SVE_FULL_I 2 "register_operand")
>> (match_operand:SVE_FULL_I 3 "register_operand")]
>> SVE2_COND_INT_BINARY_NOREV)]
>> UNSPEC_PRED_X)
>> (match_operand:SVE_FULL_I 4 "aarch64_simd_imm_zero")]
>> UNSPEC_SEL))]
>> "TARGET_SVE2"
>> {@ [ cons: =0 , 1 , 2 , 3 ]
>> [ &w , Upl , 0 , w ] movprfx\t%0.<Vetype>, %1/z,
>> %0.<Vetype>\;<sve_int_op>\t%0.<Vetype>, %1/m, %0.<Vetype>, %3.<Vetype>
>> [ &w , Upl , w , w ] movprfx\t%0.<Vetype>, %1/z,
>> %2.<Vetype>\;<sve_int_op>\t%0.<Vetype>, %1/m, %0.<Vetype>, %3.<Vetype>
>> }
>> "&& !CONSTANT_P (operands[5])"
>> {
>> operands[5] = CONSTM1_RTX (<VPRED>mode);
>> }
>> [(set_attr "movprfx" "yes")]
>> )
>>
>> because operands[3] and operands[4] were both expanded into the same register
>> operand containing a zero vector by
>>
>> ;; Predicated binary arithmetic with merging.
>> (define_expand "@cond_<sve_int_op><mode>"
>> [(set (match_operand:SVE_FULL_I 0 "register_operand")
>> (unspec:SVE_FULL_I
>> [(match_operand:<VPRED> 1 "register_operand")
>> (unspec:SVE_FULL_I
>> [(match_dup 5)
>> (unspec:SVE_FULL_I
>> [(match_operand:SVE_FULL_I 2 "register_operand")
>> (match_operand:SVE_FULL_I 3 "register_operand")]
>> SVE2_COND_INT_BINARY)]
>> UNSPEC_PRED_X)
>> (match_operand:SVE_FULL_I 4 "aarch64_simd_reg_or_zero")]
>> UNSPEC_SEL))]
>> "TARGET_SVE2"
>> {
>> operands[5] = CONSTM1_RTX (<MODE>mode);
>> }
>> )
>>
>> This patch fixes the ICE by changing the predicate of operands[3]
>> in both patterns to aarch64_simd_reg_or_zero. It also adds tests
>> adjusted from the report in PR121599.
>>
>> The patch was bootstrapped and tested on aarch64-linux-gnu, no regression.
>> OK for trunk?
>>
>> Alex Coplan pointed out in the bugzilla ticket that this ICE goes back
>> to GCC 10. Shall we backport?
>>
>> Signed-off-by: Jennifer Schmitz <[email protected]>
>>
>> gcc/
>> PR target/121599
>> * config/aarch64/aarch64-sve2.md (@cond_<sve_int_op><mode>):
>> Set predicate for operand 3 to aarch64_simd_reg_or_zero.
>> (*cond_<sve_int_op><mode>_z): Likewise.
>>
>> gcc/testsuite/
>> PR target/121599
>> * gcc.target/aarch64/sve2/pr121599.c: New test.
>> ---
>> gcc/config/aarch64/aarch64-sve2.md | 4 +--
>> .../gcc.target/aarch64/sve2/pr121599.c | 31 +++++++++++++++++++
>> 2 files changed, 33 insertions(+), 2 deletions(-)
>> create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/pr121599.c
>>
>> diff --git a/gcc/config/aarch64/aarch64-sve2.md
>> b/gcc/config/aarch64/aarch64-sve2.md
>> index a3cbbce8b31..c6a84fa7f3d 100644
>> --- a/gcc/config/aarch64/aarch64-sve2.md
>> +++ b/gcc/config/aarch64/aarch64-sve2.md
>> @@ -1021,7 +1021,7 @@
>> [(match_dup 5)
>> (unspec:SVE_FULL_I
>> [(match_operand:SVE_FULL_I 2 "register_operand")
>> - (match_operand:SVE_FULL_I 3 "register_operand")]
>> + (match_operand:SVE_FULL_I 3 "aarch64_simd_reg_or_zero")]
>> SVE2_COND_INT_BINARY)]
>> UNSPEC_PRED_X)
>> (match_operand:SVE_FULL_I 4 "aarch64_simd_reg_or_zero")]
>> @@ -1136,7 +1136,7 @@
>> [(match_operand 5)
>> (unspec:SVE_FULL_I
>> [(match_operand:SVE_FULL_I 2 "register_operand")
>> - (match_operand:SVE_FULL_I 3 "register_operand")]
>> + (match_operand:SVE_FULL_I 3 "aarch64_simd_reg_or_zero")]
>> SVE2_COND_INT_BINARY_NOREV)]
>> UNSPEC_PRED_X)
>> (match_operand:SVE_FULL_I 4 "aarch64_simd_imm_zero")]
>
> I think the ICE happens because operands[4] ends up being a register which
> the aarch64_simd_imm_zero can’t handle.
> The pattern *cond_<sve_int_op><mode>_z is also not supposed to be handling
> register operands, it only expresses a merge with zero.
> Allowing a zero in operands[3] here is problematic as the output code
> wouldn’t know how to print it if the midend used a zero vector for it.
>
> I think the right solution is to adjust the @cond_<sve_int_op><mode> expander
> to only allow aarch64_simd_imm_zero for operands[4] for the
> SVE2_COND_INT_BINARY_NOREV subset of SVE2_COND_INT_BINARY.
Hi Kyrill,
Thanks for reviewing the patch.
I tried allowing only aarch64_simd_imm_zero for operands[4] for
SVE2_COND_INT_BINARY_NOREV, but while that solved the ICE it caused all tests
predication other than _z to fail.
I chose instead to add a pattern similar to the existing
*cond_<sve_int_op><mode>_3 for SVE2_COND_INT_BINARY_NOREV that handles the case
when operands[3] and operands[4] are the same register operand.
Thanks,
Jennifer
When op2 in SVE2 saturating add intrinsics (svuqadd, svsqadd) is a zero
vector and predication is _z, an ICE in vregs occurs, e.g. for
svuint8_t foo (svbool_t pg, svuint8_t op1)
{
return svsqadd_u8_z (pg, op1, svdup_s8 (0));
}
The insn failed to match the pattern (aarch64-sve2.md):
;; Predicated binary operations with no reverse form, merging with zero.
;; At present we don't generate these patterns via a cond_* optab,
;; so there's no correctness requirement to handle merging with an
;; independent value.
(define_insn_and_rewrite "*cond_<sve_int_op><mode>_z"
[(set (match_operand:SVE_FULL_I 0 "register_operand")
(unspec:SVE_FULL_I
[(match_operand:<VPRED> 1 "register_operand")
(unspec:SVE_FULL_I
[(match_operand 5)
(unspec:SVE_FULL_I
[(match_operand:SVE_FULL_I 2 "register_operand")
(match_operand:SVE_FULL_I 3 "register_operand")]
SVE2_COND_INT_BINARY_NOREV)]
UNSPEC_PRED_X)
(match_operand:SVE_FULL_I 4 "aarch64_simd_imm_zero")]
UNSPEC_SEL))]
"TARGET_SVE2"
{@ [ cons: =0 , 1 , 2 , 3 ]
[ &w , Upl , 0 , w ] movprfx\t%0.<Vetype>, %1/z,
%0.<Vetype>\;<sve_int_op>\t%0.<Vetype>, %1/m, %0.<Vetype>, %3.<Vetype>
[ &w , Upl , w , w ] movprfx\t%0.<Vetype>, %1/z,
%2.<Vetype>\;<sve_int_op>\t%0.<Vetype>, %1/m, %0.<Vetype>, %3.<Vetype>
}
"&& !CONSTANT_P (operands[5])"
{
operands[5] = CONSTM1_RTX (<VPRED>mode);
}
[(set_attr "movprfx" "yes")]
)
because operands[3] and operands[4] were both expanded into the same register
operand containing a zero vector by define_expand "@cond_<sve_int_op><mode>".
This patch fixes the ICE by adding the pattern
;; Predicated binary operations with no reverse form, merging with
;; the second input.
(define_insn_and_rewrite "*cond_<sve_int_op><mode>_3"
for SVE2_COND_INT_BINARY_NOREV that allows operands[4] to be a register operand
(equal to operands[3]). It adds a vcond_mask expression such that we end
up with an insn matched by
;; Predicated binary arithmetic, merging with the first input.
(define_insn_and_rewrite "*cond_<sve_int_op><mode>_2"
The patch was bootstrapped and tested on aarch64-linux-gnu, no regression.
OK for trunk?
Alex Coplan pointed out in the bugzilla ticket that this ICE goes back
to GCC 10. Shall we backport?
Signed-off-by: Jennifer Schmitz <[email protected]>
gcc/
PR target/121599
* config/aarch64/aarch64-sve2.md (*cond_<sve_int_op><mode>_3):
New pattern for SVE2_COND_INT_BINARY_NOREV.
gcc/testsuite/
PR target/121599
* gcc.target/aarch64/sve2/pr121599.c: New test.
---
gcc/config/aarch64/aarch64-sve2.md | 36 +++++++++++++++++++
.../gcc.target/aarch64/sve2/pr121599.c | 31 ++++++++++++++++
2 files changed, 67 insertions(+)
create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/pr121599.c
diff --git a/gcc/config/aarch64/aarch64-sve2.md
b/gcc/config/aarch64/aarch64-sve2.md
index a3cbbce8b31..d82dc41df3e 100644
--- a/gcc/config/aarch64/aarch64-sve2.md
+++ b/gcc/config/aarch64/aarch64-sve2.md
@@ -1124,6 +1124,42 @@
[(set_attr "movprfx" "yes")]
)
+;; Predicated binary operations with no reverse form, merging with
+;; the second input.
+(define_insn_and_rewrite "*cond_<sve_int_op><mode>_3"
+ [(set (match_operand:SVE_FULL_I 0 "register_operand")
+ (unspec:SVE_FULL_I
+ [(match_operand:<VPRED> 1 "register_operand")
+ (unspec:SVE_FULL_I
+ [(match_operand 5)
+ (unspec:SVE_FULL_I
+ [(match_operand:SVE_FULL_I 2 "register_operand")
+ (match_operand:SVE_FULL_I 3 "register_operand")]
+ SVE2_COND_INT_BINARY_NOREV)]
+ UNSPEC_PRED_X)
+ (match_operand:SVE_FULL_I 4 "register_operand")]
+ UNSPEC_SEL))]
+ "TARGET_SVE2
+ && rtx_equal_p (operands[3], operands[4])"
+ "#"
+ "&& 1"
+ {
+ if (reload_completed
+ && register_operand (operands[4], <MODE>mode)
+ && !rtx_equal_p (operands[0], operands[4]))
+ {
+ emit_insn (gen_vcond_mask_<mode><vpred> (operands[0], operands[2],
+ operands[4], operands[1]));
+ operands[4] = operands[2] = operands[0];
+ }
+ else if (!CONSTANT_P (operands[5]))
+ operands[5] = CONSTM1_RTX (<VPRED>mode);
+ else
+ FAIL;
+ }
+ [(set_attr "movprfx" "yes")]
+)
+
;; Predicated binary operations with no reverse form, merging with zero.
;; At present we don't generate these patterns via a cond_* optab,
;; so there's no correctness requirement to handle merging with an
diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/pr121599.c
b/gcc/testsuite/gcc.target/aarch64/sve2/pr121599.c
new file mode 100644
index 00000000000..dbfa613649d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve2/pr121599.c
@@ -0,0 +1,31 @@
+/* PR target/121599. */
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" "" } } */
+
+#include <arm_sve.h>
+
+/*
+** foo:
+** movi d([0-9]+), #0
+** sel z0\.b, p0, z0\.b, z\1\.b
+** usqadd z0\.b, p0/m, z0\.b, z\1\.b
+** ret
+*/
+svuint8_t foo (svbool_t pg, svuint8_t op1)
+{
+ return svsqadd_u8_z (pg, op1, svdup_s8 (0));
+}
+
+/*
+** bar:
+** movi d([0-9]+), #0
+** sel z0\.b, p0, z0\.b, z\1\.b
+** suqadd z0\.b, p0/m, z0\.b, z\1\.b
+** ret
+*/
+svint8_t bar (svbool_t pg, svint8_t op1)
+{
+ return svuqadd_n_s8_z (pg, op1, 0);
+}
+
--
2.34.1
>
> Thanks,
> Kyrill
>
>> diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/pr121599.c
>> b/gcc/testsuite/gcc.target/aarch64/sve2/pr121599.c
>> new file mode 100644
>> index 00000000000..cd80ef9115c
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.target/aarch64/sve2/pr121599.c
>> @@ -0,0 +1,31 @@
>> +/* PR target/121599. */
>> +/* { dg-do compile } */
>> +/* { dg-options "-O2" } */
>> +/* { dg-final { check-function-bodies "**" "" "" } } */
>> +
>> +#include <arm_sve.h>
>> +
>> +/*
>> +** foo:
>> +** movi d([0-9]+), #0
>> +** movprfx z0\.b, p0/z, z0\.b
>> +** usqadd z0\.b, p0/m, z0\.b, z\1\.b
>> +** ret
>> +*/
>> +svuint8_t foo (svbool_t pg, svuint8_t op1)
>> +{
>> + return svsqadd_u8_z (pg, op1, svdup_s8 (0));
>> +}
>> +
>> +/*
>> +** bar:
>> +** movi d([0-9]+), #0
>> +** movprfx z0\.b, p0/z, z0\.b
>> +** suqadd z0\.b, p0/m, z0\.b, z\1\.b
>> +** ret
>> +*/
>> +svint8_t bar (svbool_t pg, svint8_t op1)
>> +{
>> + return svuqadd_n_s8_z (pg, op1, 0);
>> +}
>> +
>> --
>> 2.34.1
>>
>