Re: [PATCH][PR121602] aarch64: Force vector when folding svmul with all-ones op1.

Jennifer Schmitz Fri, 19 Sep 2025 05:00:41 -0700

> On 15 Sep 2025, at 17:42, Tamar Christina <[email protected]> wrote:
> 
>> -----Original Message-----
>> From: Jennifer Schmitz <[email protected]>
>> Sent: Monday, September 15, 2025 4:23 PM
>> To: Tamar Christina <[email protected]>
>> Cc: GCC Patches <[email protected]>; Alex Coplan
>> <[email protected]>; Kyrylo Tkachov <[email protected]>; Richard
>> Earnshaw <[email protected]>; Andrew Pinski <[email protected]>
>> Subject: Re: [PATCH][PR121602] aarch64: Force vector when folding svmul with
>> all-ones op1.
>> 
>> 
>> 
>>> On 1 Sep 2025, at 15:46, Tamar Christina <[email protected]> wrote:
>>> 
>>>> -----Original Message-----
>>>> From: Jennifer Schmitz <[email protected]>
>>>> Sent: Friday, August 29, 2025 1:17 PM
>>>> To: GCC Patches <[email protected]>
>>>> Cc: Alex Coplan <[email protected]>; Kyrylo Tkachov
>>>> <[email protected]>; Tamar Christina <[email protected]>; Richard
>>>> Earnshaw <[email protected]>; Andrew Pinski <[email protected]>
>>>> Subject: [PATCH][PR121602] aarch64: Force vector when folding svmul with
>>>> all-ones op1.
>>>> 
>>>> An ICE was reported in the following test case:
>>>> svint8_t foo(svbool_t pg, int8_t op2) {
>>>>     return svmul_n_s8_z(pg, svdup_s8(1), op2);
>>>> }
>>>> with a type mismatch in 'vec_cond_expr':
>>>> _4 = VEC_COND_EXPR <v16_2(D), v32_3(D), { 0, ... }>;
>>>> 
>>>> The reason is that svmul_impl::fold folds calls where one of the operands
>>>> is all ones to the other operand using
>>>> gimple_folder::fold_active_lanes_to. However, we implicitly assume
>>>> that the argument that is passed to fold_active_lanes_to is a vector
>>>> type. In the given test case op2 is a scalar type, resulting in the type
>>>> mismatch in the vec_cond_expr.
>>>> 
>>>> This patch fixes the ICE by forcing a vector type before svmul_impl::fold
>>>> calls fold_active_lanes_to.
>>>> 
>>>> A more general option would be to move force_vector to 
>>>> fold_active_lanes_to.
>>>> 
>>> 
>>> I was wondering how the constant version doesn't need the fixup, e.g.
>>> 
>>> #include <arm_sve.h>
>>> 
>>> svint8_t foo(svbool_t pg, int8_t op2) {
>>>     return svmul_n_s8_x(pg, svdup_s8(1), 3);
>>> }
>>> 
>>> And it seems this is due to fold_const_binary doing the fixup from scalar to
>>> vector in vector_const_binop.
>>> 
>>> To me this suggests we should indeed fix it up in fold_active_lanes_to, to be
>>> consistent.
>>> 
>>>> The patch was bootstrapped and tested on aarch64-linux-gnu, no regression.
>>>> OK for trunk?
>>>> OK to backport to GCC 15?
>>> 
>>> OK with moving it to fold_active_lanes_to unless someone disagrees.
>> Hi Tamar,
>> thanks for the review. I moved the force_vector to fold_active_lanes_to
>> (updated patch below).
>> Thanks,
>> Jennifer
>> 
>> 
>> [PATCH][PR121602] aarch64: Force vector in SVE
>> gimple_folder::fold_active_lanes_to.
>> 
>> An ICE was reported in the following test case:
>> svint8_t foo(svbool_t pg, int8_t op2) {
>>      return svmul_n_s8_z(pg, svdup_s8(1), op2);
>> }
>> with a type mismatch in 'vec_cond_expr':
>> _4 = VEC_COND_EXPR <v16_2(D), v32_3(D), { 0, ... }>;
>> 
>> The reason is that svmul_impl::fold folds calls where one of the operands
>> is all ones to the other operand using
>> gimple_folder::fold_active_lanes_to. However, we implicitly assumed
>> that the argument that is passed to fold_active_lanes_to is a vector
>> type. In the given test case op2 is a scalar type, resulting in the type
>> mismatch in the vec_cond_expr.
>> 
>> This patch fixes the ICE by forcing a vector type of the argument
>> in fold_active_lanes_to before the statement with the vec_cond_expr.
>> 
>> In the initial version of this patch, the force_vector statement was placed in
>> svmul_impl::fold, but it was moved to fold_active_lanes_to to align it with
>> fold_const_binary which takes care of the fixup from scalar to vector
>> type using vector_const_binop.
>> 
>> The patch was bootstrapped and tested on aarch64-linux-gnu, no regression.
>> OK for trunk?
>> OK to backport to GCC 15?
>> 
> 
> OK for trunk and backport to GCC 15 after some stew.
Thanks, pushed to trunk (5690b710a1c2d36436361d6089187c5b3e4261e8)
and releases/gcc-15 (a584cd72498711d9775ab102828d185f37db7229).
Jennifer
> 
> Thanks,
> Tamar
> 
>> Signed-off-by: Jennifer Schmitz <[email protected]>
>> 
>> gcc/
>>      PR target/121602
>>      * config/aarch64/aarch64-sve-builtins.cc
>>      (gimple_folder::fold_active_lanes_to): Add force_vector
>>      statement.
>> 
>> gcc/testsuite/
>>      PR target/121602
>>      * gcc.target/aarch64/sve/acle/asm/mul_s16.c: New test.
>>      * gcc.target/aarch64/sve/acle/asm/mul_s32.c: Likewise.
>>      * gcc.target/aarch64/sve/acle/asm/mul_s64.c: Likewise.
>>      * gcc.target/aarch64/sve/acle/asm/mul_s8.c: Likewise.
>>      * gcc.target/aarch64/sve/acle/asm/mul_u16.c: Likewise.
>>      * gcc.target/aarch64/sve/acle/asm/mul_u32.c: Likewise.
>>      * gcc.target/aarch64/sve/acle/asm/mul_u64.c: Likewise.
>>      * gcc.target/aarch64/sve/acle/asm/mul_u8.c: Likewise.
>> ---
>> gcc/config/aarch64/aarch64-sve-builtins.cc             |  1 +
>> .../gcc.target/aarch64/sve/acle/asm/mul_s16.c          | 10 ++++++++++
>> .../gcc.target/aarch64/sve/acle/asm/mul_s32.c          | 10 ++++++++++
>> .../gcc.target/aarch64/sve/acle/asm/mul_s64.c          | 10 ++++++++++
>> gcc/testsuite/gcc.target/aarch64/sve/acle/asm/mul_s8.c | 10 ++++++++++
>> .../gcc.target/aarch64/sve/acle/asm/mul_u16.c          | 10 ++++++++++
>> .../gcc.target/aarch64/sve/acle/asm/mul_u32.c          | 10 ++++++++++
>> .../gcc.target/aarch64/sve/acle/asm/mul_u64.c          | 10 ++++++++++
>> gcc/testsuite/gcc.target/aarch64/sve/acle/asm/mul_u8.c | 10 ++++++++++
>> 9 files changed, 81 insertions(+)
>> 
>> diff --git a/gcc/config/aarch64/aarch64-sve-builtins.cc
>> b/gcc/config/aarch64/aarch64-sve-builtins.cc
>> index 1764cf8f7e8..22d75197188 100644
>> --- a/gcc/config/aarch64/aarch64-sve-builtins.cc
>> +++ b/gcc/config/aarch64/aarch64-sve-builtins.cc
>> @@ -3802,6 +3802,7 @@ gimple_folder::fold_active_lanes_to (tree x)
>> 
>>   gimple_seq stmts = NULL;
>>   tree pred = convert_pred (stmts, vector_type (0), 0);
>> +  x = force_vector (stmts, TREE_TYPE (lhs), x);
>>   gsi_insert_seq_before (gsi, stmts, GSI_SAME_STMT);
>>   return gimple_build_assign (lhs, VEC_COND_EXPR, pred, x, vec_inactive);
>> }
>> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/mul_s16.c
>> b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/mul_s16.c
>> index e9b6bf83b03..4148097cc63 100644
>> --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/mul_s16.c
>> +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/mul_s16.c
>> @@ -331,6 +331,16 @@ TEST_UNIFORM_Z (mul_1op1_s16_z_tied2, svint16_t,
>>              z0 = svmul_s16_z (p0, svdup_s16 (1), z0),
>>              z0 = svmul_z (p0, svdup_s16 (1), z0))
>> 
>> +/*
>> +** mul_1op1n_s16_z:
>> +**   movprfx z0\.h, p0/z, z0\.h
>> +**   mov     z0\.h, p0/m, w0
>> +**   ret
>> +*/
>> +TEST_UNIFORM_ZX (mul_1op1n_s16_z, svint16_t, int16_t,
>> +     z0 = svmul_n_s16_z (p0, svdup_s16 (1), x0),
>> +     z0 = svmul_z (p0, svdup_s16 (1), x0))
>> +
>> /*
>> ** mul_3_s16_z_tied1:
>> **   mov     (z[0-9]+\.h), #3
>> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/mul_s32.c
>> b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/mul_s32.c
>> index 71c476f48ca..2c53e3f14d6 100644
>> --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/mul_s32.c
>> +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/mul_s32.c
>> @@ -341,6 +341,16 @@ TEST_UNIFORM_Z (mul_1op1_s32_z_tied2, svint32_t,
>>              z0 = svmul_s32_z (p0, svdup_s32 (1), z0),
>>              z0 = svmul_z (p0, svdup_s32 (1), z0))
>> 
>> +/*
>> +** mul_1op1n_s32_z:
>> +**   movprfx z0\.s, p0/z, z0\.s
>> +**   mov     z0\.s, p0/m, w0
>> +**   ret
>> +*/
>> +TEST_UNIFORM_ZX (mul_1op1n_s32_z, svint32_t, int32_t,
>> +     z0 = svmul_n_s32_z (p0, svdup_s32 (1), x0),
>> +     z0 = svmul_z (p0, svdup_s32 (1), x0))
>> +
>> /*
>> ** mul_3_s32_z_tied1:
>> **   mov     (z[0-9]+\.s), #3
>> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/mul_s64.c
>> b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/mul_s64.c
>> index a34dc27740a..55342a13f8b 100644
>> --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/mul_s64.c
>> +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/mul_s64.c
>> @@ -340,6 +340,16 @@ TEST_UNIFORM_Z (mul_1op1_s64_z_tied2, svint64_t,
>>              z0 = svmul_s64_z (p0, svdup_s64 (1), z0),
>>              z0 = svmul_z (p0, svdup_s64 (1), z0))
>> 
>> +/*
>> +** mul_1op1n_s64_z:
>> +**   movprfx z0\.d, p0/z, z0\.d
>> +**   mov     z0\.d, p0/m, x0
>> +**   ret
>> +*/
>> +TEST_UNIFORM_ZX (mul_1op1n_s64_z, svint64_t, int64_t,
>> +     z0 = svmul_n_s64_z (p0, svdup_s64 (1), x0),
>> +     z0 = svmul_z (p0, svdup_s64 (1), x0))
>> +
>> /*
>> ** mul_2_s64_z_tied1:
>> **   movprfx z0.d, p0/z, z0.d
>> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/mul_s8.c
>> b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/mul_s8.c
>> index 683e15eccec..786a424eeea 100644
>> --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/mul_s8.c
>> +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/mul_s8.c
>> @@ -331,6 +331,16 @@ TEST_UNIFORM_Z (mul_1op1_s8_z_tied2, svint8_t,
>>              z0 = svmul_s8_z (p0, svdup_s8 (1), z0),
>>              z0 = svmul_z (p0, svdup_s8 (1), z0))
>> 
>> +/*
>> +** mul_1op1n_s8_z:
>> +**   movprfx z0\.b, p0/z, z0\.b
>> +**   mov     z0\.b, p0/m, w0
>> +**   ret
>> +*/
>> +TEST_UNIFORM_ZX (mul_1op1n_s8_z, svint8_t, int8_t,
>> +     z0 = svmul_n_s8_z (p0, svdup_s8 (1), x0),
>> +     z0 = svmul_z (p0, svdup_s8 (1), x0))
>> +
>> /*
>> ** mul_3_s8_z_tied1:
>> **   mov     (z[0-9]+\.b), #3
>> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/mul_u16.c
>> b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/mul_u16.c
>> index e228dc5995d..ed08635382d 100644
>> --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/mul_u16.c
>> +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/mul_u16.c
>> @@ -312,6 +312,16 @@ TEST_UNIFORM_Z (mul_1op1_u16_z_tied2, svuint16_t,
>>              z0 = svmul_u16_z (p0, svdup_u16 (1), z0),
>>              z0 = svmul_z (p0, svdup_u16 (1), z0))
>> 
>> +/*
>> +** mul_1op1n_u16_z:
>> +**   movprfx z0\.h, p0/z, z0\.h
>> +**   mov     z0\.h, p0/m, w0
>> +**   ret
>> +*/
>> +TEST_UNIFORM_ZX (mul_1op1n_u16_z, svuint16_t, uint16_t,
>> +     z0 = svmul_n_u16_z (p0, svdup_u16 (1), x0),
>> +     z0 = svmul_z (p0, svdup_u16 (1), x0))
>> +
>> /*
>> ** mul_3_u16_z_tied1:
>> **   mov     (z[0-9]+\.h), #3
>> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/mul_u32.c
>> b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/mul_u32.c
>> index e8f52c9d785..f82ac4269e8 100644
>> --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/mul_u32.c
>> +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/mul_u32.c
>> @@ -312,6 +312,16 @@ TEST_UNIFORM_Z (mul_1op1_u32_z_tied2, svuint32_t,
>>              z0 = svmul_u32_z (p0, svdup_u32 (1), z0),
>>              z0 = svmul_z (p0, svdup_u32 (1), z0))
>> 
>> +/*
>> +** mul_1op1n_u32_z:
>> +**   movprfx z0\.s, p0/z, z0\.s
>> +**   mov     z0\.s, p0/m, w0
>> +**   ret
>> +*/
>> +TEST_UNIFORM_ZX (mul_1op1n_u32_z, svuint32_t, uint32_t,
>> +     z0 = svmul_n_u32_z (p0, svdup_u32 (1), x0),
>> +     z0 = svmul_z (p0, svdup_u32 (1), x0))
>> +
>> /*
>> ** mul_3_u32_z_tied1:
>> **   mov     (z[0-9]+\.s), #3
>> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/mul_u64.c
>> b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/mul_u64.c
>> index 2ccdc3642c5..9f1bfff5fd2 100644
>> --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/mul_u64.c
>> +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/mul_u64.c
>> @@ -333,6 +333,16 @@ TEST_UNIFORM_Z (mul_1op1_u64_z_tied2, svuint64_t,
>>              z0 = svmul_u64_z (p0, svdup_u64 (1), z0),
>>              z0 = svmul_z (p0, svdup_u64 (1), z0))
>> 
>> +/*
>> +** mul_1op1n_u64_z:
>> +**   movprfx z0\.d, p0/z, z0\.d
>> +**   mov     z0\.d, p0/m, x0
>> +**   ret
>> +*/
>> +TEST_UNIFORM_ZX (mul_1op1n_u64_z, svuint64_t, uint64_t,
>> +     z0 = svmul_n_u64_z (p0, svdup_u64 (1), x0),
>> +     z0 = svmul_z (p0, svdup_u64 (1), x0))
>> +
>> /*
>> ** mul_2_u64_z_tied1:
>> **   movprfx z0.d, p0/z, z0.d
>> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/mul_u8.c
>> b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/mul_u8.c
>> index 8e53a4821f0..b2c1edf5ff8 100644
>> --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/mul_u8.c
>> +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/mul_u8.c
>> @@ -312,6 +312,16 @@ TEST_UNIFORM_Z (mul_1op1_u8_z_tied2, svuint8_t,
>>              z0 = svmul_u8_z (p0, svdup_u8 (1), z0),
>>              z0 = svmul_z (p0, svdup_u8 (1), z0))
>> 
>> +/*
>> +** mul_1op1n_u8_z:
>> +**   movprfx z0\.b, p0/z, z0\.b
>> +**   mov     z0\.b, p0/m, w0
>> +**   ret
>> +*/
>> +TEST_UNIFORM_ZX (mul_1op1n_u8_z, svuint8_t, uint8_t,
>> +     z0 = svmul_n_u8_z (p0, svdup_u8 (1), x0),
>> +     z0 = svmul_z (p0, svdup_u8 (1), x0))
>> +
>> /*
>> ** mul_3_u8_z_tied1:
>> **   mov     (z[0-9]+\.b), #3
>> --
>> 2.34.1
>> 
>>> 
>>> Thanks,
>>> Tamar
>>> 
>>>> 
>>>> Signed-off-by: Jennifer Schmitz <[email protected]>
>>>> 
>>>> gcc/
>>>>     PR target/121602
>>>>     * config/aarch64/aarch64-sve-builtins-base.cc (svmul_impl::fold):
>>>>     Force vector type before calling fold_active_lanes_to.
>>>> 
>>>> gcc/testsuite/
>>>>     PR target/121602
>>>>     * gcc.target/aarch64/sve/acle/asm/mul_s16.c: New test.
>>>>     * gcc.target/aarch64/sve/acle/asm/mul_s32.c: Likewise.
>>>>     * gcc.target/aarch64/sve/acle/asm/mul_s64.c: Likewise.
>>>>     * gcc.target/aarch64/sve/acle/asm/mul_s8.c: Likewise.
>>>>     * gcc.target/aarch64/sve/acle/asm/mul_u16.c: Likewise.
>>>>     * gcc.target/aarch64/sve/acle/asm/mul_u32.c: Likewise.
>>>>     * gcc.target/aarch64/sve/acle/asm/mul_u64.c: Likewise.
>>>>     * gcc.target/aarch64/sve/acle/asm/mul_u8.c: Likewise.
>>>> ---
>>>> gcc/config/aarch64/aarch64-sve-builtins-base.cc        |  7 ++++++-
>>>> .../gcc.target/aarch64/sve/acle/asm/mul_s16.c          | 10 ++++++++++
>>>> .../gcc.target/aarch64/sve/acle/asm/mul_s32.c          | 10 ++++++++++
>>>> .../gcc.target/aarch64/sve/acle/asm/mul_s64.c          | 10 ++++++++++
>>>> gcc/testsuite/gcc.target/aarch64/sve/acle/asm/mul_s8.c | 10 ++++++++++
>>>> .../gcc.target/aarch64/sve/acle/asm/mul_u16.c          | 10 ++++++++++
>>>> .../gcc.target/aarch64/sve/acle/asm/mul_u32.c          | 10 ++++++++++
>>>> .../gcc.target/aarch64/sve/acle/asm/mul_u64.c          | 10 ++++++++++
>>>> gcc/testsuite/gcc.target/aarch64/sve/acle/asm/mul_u8.c | 10 ++++++++++
>>>> 9 files changed, 86 insertions(+), 1 deletion(-)
>>>> 
>>>> diff --git a/gcc/config/aarch64/aarch64-sve-builtins-base.cc
>>>> b/gcc/config/aarch64/aarch64-sve-builtins-base.cc
>>>> index ecc06877cac..aaa7be0d4d1 100644
>>>> --- a/gcc/config/aarch64/aarch64-sve-builtins-base.cc
>>>> +++ b/gcc/config/aarch64/aarch64-sve-builtins-base.cc
>>>> @@ -2314,7 +2314,12 @@ public:
>>>>    tree op1 = gimple_call_arg (f.call, 1);
>>>>    tree op2 = gimple_call_arg (f.call, 2);
>>>>    if (integer_onep (op1))
>>>> -      return f.fold_active_lanes_to (op2);
>>>> +      {
>>>> +     gimple_seq stmts = NULL;
>>>> +     op2 = f.force_vector (stmts, TREE_TYPE (f.lhs), op2);
>>>> +     gsi_insert_seq_before (f.gsi, stmts, GSI_SAME_STMT);
>>>> +     return f.fold_active_lanes_to (op2);
>>>> +      }
>>>>    if (integer_onep (op2))
>>>>      return f.fold_active_lanes_to (op1);
>>>> 
>>>> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/mul_s16.c
>>>> b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/mul_s16.c
>>>> index e9b6bf83b03..4148097cc63 100644
>>>> --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/mul_s16.c
>>>> +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/mul_s16.c
>>>> @@ -331,6 +331,16 @@ TEST_UNIFORM_Z (mul_1op1_s16_z_tied2, svint16_t,
>>>>             z0 = svmul_s16_z (p0, svdup_s16 (1), z0),
>>>>             z0 = svmul_z (p0, svdup_s16 (1), z0))
>>>> 
>>>> +/*
>>>> +** mul_1op1n_s16_z:
>>>> +**   movprfx z0\.h, p0/z, z0\.h
>>>> +**   mov     z0\.h, p0/m, w0
>>>> +**   ret
>>>> +*/
>>>> +TEST_UNIFORM_ZX (mul_1op1n_s16_z, svint16_t, int16_t,
>>>> +     z0 = svmul_n_s16_z (p0, svdup_s16 (1), x0),
>>>> +     z0 = svmul_z (p0, svdup_s16 (1), x0))
>>>> +
>>>> /*
>>>> ** mul_3_s16_z_tied1:
>>>> **   mov     (z[0-9]+\.h), #3
>>>> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/mul_s32.c
>>>> b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/mul_s32.c
>>>> index 71c476f48ca..2c53e3f14d6 100644
>>>> --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/mul_s32.c
>>>> +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/mul_s32.c
>>>> @@ -341,6 +341,16 @@ TEST_UNIFORM_Z (mul_1op1_s32_z_tied2, svint32_t,
>>>>             z0 = svmul_s32_z (p0, svdup_s32 (1), z0),
>>>>             z0 = svmul_z (p0, svdup_s32 (1), z0))
>>>> 
>>>> +/*
>>>> +** mul_1op1n_s32_z:
>>>> +**   movprfx z0\.s, p0/z, z0\.s
>>>> +**   mov     z0\.s, p0/m, w0
>>>> +**   ret
>>>> +*/
>>>> +TEST_UNIFORM_ZX (mul_1op1n_s32_z, svint32_t, int32_t,
>>>> +     z0 = svmul_n_s32_z (p0, svdup_s32 (1), x0),
>>>> +     z0 = svmul_z (p0, svdup_s32 (1), x0))
>>>> +
>>>> /*
>>>> ** mul_3_s32_z_tied1:
>>>> **   mov     (z[0-9]+\.s), #3
>>>> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/mul_s64.c
>>>> b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/mul_s64.c
>>>> index a34dc27740a..55342a13f8b 100644
>>>> --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/mul_s64.c
>>>> +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/mul_s64.c
>>>> @@ -340,6 +340,16 @@ TEST_UNIFORM_Z (mul_1op1_s64_z_tied2, svint64_t,
>>>>             z0 = svmul_s64_z (p0, svdup_s64 (1), z0),
>>>>             z0 = svmul_z (p0, svdup_s64 (1), z0))
>>>> 
>>>> +/*
>>>> +** mul_1op1n_s64_z:
>>>> +**   movprfx z0\.d, p0/z, z0\.d
>>>> +**   mov     z0\.d, p0/m, x0
>>>> +**   ret
>>>> +*/
>>>> +TEST_UNIFORM_ZX (mul_1op1n_s64_z, svint64_t, int64_t,
>>>> +     z0 = svmul_n_s64_z (p0, svdup_s64 (1), x0),
>>>> +     z0 = svmul_z (p0, svdup_s64 (1), x0))
>>>> +
>>>> /*
>>>> ** mul_2_s64_z_tied1:
>>>> **   movprfx z0.d, p0/z, z0.d
>>>> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/mul_s8.c
>>>> b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/mul_s8.c
>>>> index 683e15eccec..786a424eeea 100644
>>>> --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/mul_s8.c
>>>> +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/mul_s8.c
>>>> @@ -331,6 +331,16 @@ TEST_UNIFORM_Z (mul_1op1_s8_z_tied2, svint8_t,
>>>>             z0 = svmul_s8_z (p0, svdup_s8 (1), z0),
>>>>             z0 = svmul_z (p0, svdup_s8 (1), z0))
>>>> 
>>>> +/*
>>>> +** mul_1op1n_s8_z:
>>>> +**   movprfx z0\.b, p0/z, z0\.b
>>>> +**   mov     z0\.b, p0/m, w0
>>>> +**   ret
>>>> +*/
>>>> +TEST_UNIFORM_ZX (mul_1op1n_s8_z, svint8_t, int8_t,
>>>> +     z0 = svmul_n_s8_z (p0, svdup_s8 (1), x0),
>>>> +     z0 = svmul_z (p0, svdup_s8 (1), x0))
>>>> +
>>>> /*
>>>> ** mul_3_s8_z_tied1:
>>>> **   mov     (z[0-9]+\.b), #3
>>>> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/mul_u16.c
>>>> b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/mul_u16.c
>>>> index e228dc5995d..ed08635382d 100644
>>>> --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/mul_u16.c
>>>> +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/mul_u16.c
>>>> @@ -312,6 +312,16 @@ TEST_UNIFORM_Z (mul_1op1_u16_z_tied2, svuint16_t,
>>>>             z0 = svmul_u16_z (p0, svdup_u16 (1), z0),
>>>>             z0 = svmul_z (p0, svdup_u16 (1), z0))
>>>> 
>>>> +/*
>>>> +** mul_1op1n_u16_z:
>>>> +**   movprfx z0\.h, p0/z, z0\.h
>>>> +**   mov     z0\.h, p0/m, w0
>>>> +**   ret
>>>> +*/
>>>> +TEST_UNIFORM_ZX (mul_1op1n_u16_z, svuint16_t, uint16_t,
>>>> +     z0 = svmul_n_u16_z (p0, svdup_u16 (1), x0),
>>>> +     z0 = svmul_z (p0, svdup_u16 (1), x0))
>>>> +
>>>> /*
>>>> ** mul_3_u16_z_tied1:
>>>> **   mov     (z[0-9]+\.h), #3
>>>> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/mul_u32.c
>>>> b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/mul_u32.c
>>>> index e8f52c9d785..f82ac4269e8 100644
>>>> --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/mul_u32.c
>>>> +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/mul_u32.c
>>>> @@ -312,6 +312,16 @@ TEST_UNIFORM_Z (mul_1op1_u32_z_tied2, svuint32_t,
>>>>             z0 = svmul_u32_z (p0, svdup_u32 (1), z0),
>>>>             z0 = svmul_z (p0, svdup_u32 (1), z0))
>>>> 
>>>> +/*
>>>> +** mul_1op1n_u32_z:
>>>> +**   movprfx z0\.s, p0/z, z0\.s
>>>> +**   mov     z0\.s, p0/m, w0
>>>> +**   ret
>>>> +*/
>>>> +TEST_UNIFORM_ZX (mul_1op1n_u32_z, svuint32_t, uint32_t,
>>>> +     z0 = svmul_n_u32_z (p0, svdup_u32 (1), x0),
>>>> +     z0 = svmul_z (p0, svdup_u32 (1), x0))
>>>> +
>>>> /*
>>>> ** mul_3_u32_z_tied1:
>>>> **   mov     (z[0-9]+\.s), #3
>>>> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/mul_u64.c
>>>> b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/mul_u64.c
>>>> index 2ccdc3642c5..9f1bfff5fd2 100644
>>>> --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/mul_u64.c
>>>> +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/mul_u64.c
>>>> @@ -333,6 +333,16 @@ TEST_UNIFORM_Z (mul_1op1_u64_z_tied2, svuint64_t,
>>>>             z0 = svmul_u64_z (p0, svdup_u64 (1), z0),
>>>>             z0 = svmul_z (p0, svdup_u64 (1), z0))
>>>> 
>>>> +/*
>>>> +** mul_1op1n_u64_z:
>>>> +**   movprfx z0\.d, p0/z, z0\.d
>>>> +**   mov     z0\.d, p0/m, x0
>>>> +**   ret
>>>> +*/
>>>> +TEST_UNIFORM_ZX (mul_1op1n_u64_z, svuint64_t, uint64_t,
>>>> +     z0 = svmul_n_u64_z (p0, svdup_u64 (1), x0),
>>>> +     z0 = svmul_z (p0, svdup_u64 (1), x0))
>>>> +
>>>> /*
>>>> ** mul_2_u64_z_tied1:
>>>> **   movprfx z0.d, p0/z, z0.d
>>>> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/mul_u8.c
>>>> b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/mul_u8.c
>>>> index 8e53a4821f0..b2c1edf5ff8 100644
>>>> --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/mul_u8.c
>>>> +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/mul_u8.c
>>>> @@ -312,6 +312,16 @@ TEST_UNIFORM_Z (mul_1op1_u8_z_tied2, svuint8_t,
>>>>             z0 = svmul_u8_z (p0, svdup_u8 (1), z0),
>>>>             z0 = svmul_z (p0, svdup_u8 (1), z0))
>>>> 
>>>> +/*
>>>> +** mul_1op1n_u8_z:
>>>> +**   movprfx z0\.b, p0/z, z0\.b
>>>> +**   mov     z0\.b, p0/m, w0
>>>> +**   ret
>>>> +*/
>>>> +TEST_UNIFORM_ZX (mul_1op1n_u8_z, svuint8_t, uint8_t,
>>>> +     z0 = svmul_n_u8_z (p0, svdup_u8 (1), x0),
>>>> +     z0 = svmul_z (p0, svdup_u8 (1), x0))
>>>> +
>>>> /*
>>>> ** mul_3_u8_z_tied1:
>>>> **   mov     (z[0-9]+\.b), #3
>>>> --
>>>> 2.34.1
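
For readers without an SVE target at hand, the fixup discussed in this thread
can be sketched in portable C++. This is only a model under stated
assumptions, not GCC code: Vec, Pred, kLanes, broadcast,
select_active_or_zero and mul_by_ones_z are hypothetical names, and the fixed
lane count stands in for SVE's length-agnostic vectors. It shows why the
scalar op2 has to be turned into a vector before the select that replaces the
multiply by all-ones.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical fixed-width stand-ins for svint8_t and svbool_t.
// Real SVE vectors do not have a compile-time lane count.
constexpr std::size_t kLanes = 8;
using Vec = std::array<std::int8_t, kLanes>;
using Pred = std::array<bool, kLanes>;

// Plays the role that force_vector has in the patch: a scalar operand is
// broadcast to every lane so that it has the same type as the result.
Vec broadcast(std::int8_t scalar) {
  Vec v{};
  v.fill(scalar);
  return v;
}

// Models the select emitted for zeroing (_z) predication: active lanes take
// the value from x, inactive lanes become zero. Both select operands are
// vectors, which is the property the ICE test case violated.
Vec select_active_or_zero(const Pred& pg, const Vec& x) {
  Vec out{};
  for (std::size_t i = 0; i < kLanes; ++i)
    out[i] = pg[i] ? x[i] : std::int8_t{0};
  return out;
}

// Models svmul_n_s8_z (pg, svdup_s8 (1), op2) after folding: multiplying by
// all-ones reduces to selecting op2, once op2 has been made a vector.
Vec mul_by_ones_z(const Pred& pg, std::int8_t op2) {
  return select_active_or_zero(pg, broadcast(op2));
}
```

In this model the broadcast lives inside mul_by_ones_z rather than at each
call site, which loosely mirrors the choice made in the updated patch of doing
the fixup once in fold_active_lanes_to instead of in svmul_impl::fold.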