> On 16 May 2025, at 12:35, Richard Sandiford <richard.sandif...@arm.com> wrote:
> 
> Jennifer Schmitz <jschm...@nvidia.com> writes:
>> The ICE in PR120276 resulted from a comparison of VNx4QI and V8QI using
>> partial_subreg_p in the function copy_value during the RTL pass
>> regcprop, failing the assertion in
>> 
>> inline bool
>> partial_subreg_p (machine_mode outermode, machine_mode innermode)
>> {
>>  /* Modes involved in a subreg must be ordered.  In particular, we must
>>     always know at compile time whether the subreg is paradoxical.  */
>>  poly_int64 outer_prec = GET_MODE_PRECISION (outermode);
>>  poly_int64 inner_prec = GET_MODE_PRECISION (innermode);
>>  gcc_checking_assert (ordered_p (outer_prec, inner_prec));
>>  return maybe_lt (outer_prec, inner_prec);
>> }
>> 
>> Replacing the call to partial_subreg_p by ordered_p && maybe_lt resolves
>> the ICE and passes bootstrap and testing without regression.
>> OK for mainline?
>> 
>> Signed-off-by: Jennifer Schmitz <jschm...@nvidia.com>
>> 
>> gcc/
>> PR middle-end/120276
>> * regcprop.cc (copy_value): Replace call to partial_subreg_p by
>> ordered_p && maybe_lt.
>> 
>> gcc/testsuite/
>> PR middle-end/120276
>> * gcc.target/aarch64/sve/pr120276.c: New test.
>> ---
>> gcc/regcprop.cc                               |  5 ++++-
>> .../gcc.target/aarch64/sve/pr120276.c         | 20 +++++++++++++++++++
>> 2 files changed, 24 insertions(+), 1 deletion(-)
>> create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pr120276.c
>> 
>> diff --git a/gcc/regcprop.cc b/gcc/regcprop.cc
>> index 4fa1305526c..7f4a25a4718 100644
>> --- a/gcc/regcprop.cc
>> +++ b/gcc/regcprop.cc
>> @@ -326,6 +326,9 @@ copy_value (rtx dest, rtx src, struct value_data *vd)
>>       || (sr > dr && sr < dr + dn))
>>     return;
>> 
>> +  poly_int64 sr_prec = GET_MODE_PRECISION (vd->e[sr].mode);
>> +  poly_int64 src_prec = GET_MODE_PRECISION (GET_MODE (src));
>> +
>>   /* If SRC had no assigned mode (i.e. we didn't know it was live)
>>      assign it now and assume the value came from an input argument
>>      or somesuch.  */
>> @@ -371,7 +374,7 @@ copy_value (rtx dest, rtx src, struct value_data *vd)
>>      (set (reg:DI dx) (reg:DI bp))
>>      The last set is not redundant, while the low 8 bits of dx are already
>>      equal to low 8 bits of bp, the other bits are undefined.  */
>> -  else if (partial_subreg_p (vd->e[sr].mode, GET_MODE (src)))
>> +  else if (ordered_p (sr_prec, src_prec) && maybe_lt (sr_prec, src_prec))
>>     {
>>       if (!REG_CAN_CHANGE_MODE_P (sr, GET_MODE (src), vd->e[sr].mode)
>>  || !REG_CAN_CHANGE_MODE_P (dr, vd->e[sr].mode, GET_MODE (dest)))
> 
> I think we should instead add:
> 
>   else if (!ordered_p (GET_MODE_PRECISION (vd->e[sr].mode),
> GET_MODE_PRECISION (GET_MODE (src))))
>     return;
> 
> after:
> 
>  if (vd->e[sr].mode == VOIDmode)
>    set_value_regno (sr, vd->e[dr].mode, vd);
> 
> The ICE in paradoxical_subreg is just the canary.  The register
> replacement isn't valid if we don't know at compile time how the bytes
> are distributed between the two modes.
> 
> Thanks,
> Richard
> 
>> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pr120276.c 
>> b/gcc/testsuite/gcc.target/aarch64/sve/pr120276.c
>> new file mode 100644
>> index 00000000000..1b29c90f69b
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.target/aarch64/sve/pr120276.c
>> @@ -0,0 +1,20 @@
>> +/* { dg-do compile } */
>> +/* { dg-options "-O3" } */

The test case is better-placed in a generic directory like gcc.dg/torture with 
a directive like:
/* { dg-additional-options “-march=armv8.2-a+sve" { target aarch64*-*-* } } */
to force the SVE testing.
Thanks,
Kyrill

>> +
>> +int a;
>> +char b[1];
>> +int c[18];
>> +void d(char *);
>> +void e() {
>> +  int f;
>> +  char *g;
>> +  a = 0;
>> +  for (; a < 18; a++) {
>> +    int h = f = 0;
>> +    for (; f < 4; f++) {
>> +      g[a * 4 + f] = c[a] >> h;
>> +      h += 8;
>> +    }
>> +  }
>> +  d(b);
>> +}
>> \ No newline at end of file

Reply via email to