> On 16 May 2025, at 13:11, Kyrylo Tkachov <ktkac...@nvidia.com> wrote:
> 
> 
> 
>> On 16 May 2025, at 12:35, Richard Sandiford <richard.sandif...@arm.com> 
>> wrote:
>> 
>> Jennifer Schmitz <jschm...@nvidia.com> writes:
>>> The ICE in PR120276 resulted from a comparison of VNx4QI and V8QI using
>>> partial_subreg_p in the function copy_value during the RTL pass
>>> regcprop, failing the assertion in
>>> 
>>> inline bool
>>> partial_subreg_p (machine_mode outermode, machine_mode innermode)
>>> {
>>> /* Modes involved in a subreg must be ordered.  In particular, we must
>>>    always know at compile time whether the subreg is paradoxical.  */
>>> poly_int64 outer_prec = GET_MODE_PRECISION (outermode);
>>> poly_int64 inner_prec = GET_MODE_PRECISION (innermode);
>>> gcc_checking_assert (ordered_p (outer_prec, inner_prec));
>>> return maybe_lt (outer_prec, inner_prec);
>>> }
>>> 
>>> Replacing the call to partial_subreg_p by ordered_p && maybe_lt resolves
>>> the ICE and passes bootstrap and testing without regression.
>>> OK for mainline?
>>> 
>>> Signed-off-by: Jennifer Schmitz <jschm...@nvidia.com>
>>> 
>>> gcc/
>>> PR middle-end/120276
>>> * regcprop.cc (copy_value): Replace call to partial_subreg_p by
>>> ordered_p && maybe_lt.
>>> 
>>> gcc/testsuite/
>>> PR middle-end/120276
>>> * gcc.target/aarch64/sve/pr120276.c: New test.
>>> ---
>>> gcc/regcprop.cc                               |  5 ++++-
>>> .../gcc.target/aarch64/sve/pr120276.c         | 20 +++++++++++++++++++
>>> 2 files changed, 24 insertions(+), 1 deletion(-)
>>> create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pr120276.c
>>> 
>>> diff --git a/gcc/regcprop.cc b/gcc/regcprop.cc
>>> index 4fa1305526c..7f4a25a4718 100644
>>> --- a/gcc/regcprop.cc
>>> +++ b/gcc/regcprop.cc
>>> @@ -326,6 +326,9 @@ copy_value (rtx dest, rtx src, struct value_data *vd)
>>>      || (sr > dr && sr < dr + dn))
>>>    return;
>>> 
>>> +  poly_int64 sr_prec = GET_MODE_PRECISION (vd->e[sr].mode);
>>> +  poly_int64 src_prec = GET_MODE_PRECISION (GET_MODE (src));
>>> +
>>>  /* If SRC had no assigned mode (i.e. we didn't know it was live)
>>>     assign it now and assume the value came from an input argument
>>>     or somesuch.  */
>>> @@ -371,7 +374,7 @@ copy_value (rtx dest, rtx src, struct value_data *vd)
>>>     (set (reg:DI dx) (reg:DI bp))
>>>     The last set is not redundant, while the low 8 bits of dx are already
>>>     equal to low 8 bits of bp, the other bits are undefined.  */
>>> -  else if (partial_subreg_p (vd->e[sr].mode, GET_MODE (src)))
>>> +  else if (ordered_p (sr_prec, src_prec) && maybe_lt (sr_prec, src_prec))
>>>    {
>>>      if (!REG_CAN_CHANGE_MODE_P (sr, GET_MODE (src), vd->e[sr].mode)
>>> || !REG_CAN_CHANGE_MODE_P (dr, vd->e[sr].mode, GET_MODE (dest)))
>> 
>> I think we should instead add:
>> 
>>  else if (!ordered_p (GET_MODE_PRECISION (vd->e[sr].mode),
>> GET_MODE_PRECISION (GET_MODE (src))))
>>    return;
>> 
>> after:
>> 
>> if (vd->e[sr].mode == VOIDmode)
>>   set_value_regno (sr, vd->e[dr].mode, vd);
Thanks, I made the change.
>> 
>> The ICE in paradoxical_subreg is just the canary.  The register
>> replacement isn't valid if we don't know at compile time how the bytes
>> are distributed between the two modes.
>> 
>> Thanks,
>> Richard
>> 
>>> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pr120276.c 
>>> b/gcc/testsuite/gcc.target/aarch64/sve/pr120276.c
>>> new file mode 100644
>>> index 00000000000..1b29c90f69b
>>> --- /dev/null
>>> +++ b/gcc/testsuite/gcc.target/aarch64/sve/pr120276.c
>>> @@ -0,0 +1,20 @@
>>> +/* { dg-do compile } */
>>> +/* { dg-options "-O3" } */
> 
> The test case is better-placed in a generic directory like gcc.dg/torture 
> with a directive like:
> /* { dg-additional-options “-march=armv8.2-a+sve" { target aarch64*-*-* } } */
> to force the SVE testing.
I moved the test and removed /* { dg-options “-O3” } */, because that is added 
as one of the optimization levels in gcc.torture, correct?
Thanks,
Jennifer

[PATCH] [PR120276] regcprop: Return from copy_value for unordered modes

The ICE in PR120276 resulted from a comparison of VNx4QI and V8QI using
partial_subreg_p in the function copy_value during the RTL pass
regcprop, failing the assertion in

inline bool
partial_subreg_p (machine_mode outermode, machine_mode innermode)
{
  /* Modes involved in a subreg must be ordered.  In particular, we must
     always know at compile time whether the subreg is paradoxical.  */
  poly_int64 outer_prec = GET_MODE_PRECISION (outermode);
  poly_int64 inner_prec = GET_MODE_PRECISION (innermode);
  gcc_checking_assert (ordered_p (outer_prec, inner_prec));
  return maybe_lt (outer_prec, inner_prec);
}

Returning from the function if the modes are not ordered before reaching
the call to partial_subreg_p resolves the ICE and passes bootstrap and
testing without regression.
OK for mainline?

Signed-off-by: Jennifer Schmitz <jschm...@nvidia.com>

gcc/
        PR middle-end/120276
        * regcprop.cc (copy_value): Return in case of unordered modes.

gcc/testsuite/
        PR middle-end/120276
        * gcc.dg/torture/pr120276.c: New test.
---
 gcc/regcprop.cc                         |  4 ++++
 gcc/testsuite/gcc.dg/torture/pr120276.c | 20 ++++++++++++++++++++
 2 files changed, 24 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/torture/pr120276.c

diff --git a/gcc/regcprop.cc b/gcc/regcprop.cc
index 4fa1305526c..98ab3f77e83 100644
--- a/gcc/regcprop.cc
+++ b/gcc/regcprop.cc
@@ -332,6 +332,10 @@ copy_value (rtx dest, rtx src, struct value_data *vd)
   if (vd->e[sr].mode == VOIDmode)
     set_value_regno (sr, vd->e[dr].mode, vd);
 
+  else if (!ordered_p (GET_MODE_PRECISION (vd->e[sr].mode),
+                      GET_MODE_PRECISION (GET_MODE (src))))
+    return;
+
   /* If we are narrowing the input to a smaller number of hard regs,
      and it is in big endian, we are really extracting a high part.
      Since we generally associate a low part of a value with the value itself,
diff --git a/gcc/testsuite/gcc.dg/torture/pr120276.c 
b/gcc/testsuite/gcc.dg/torture/pr120276.c
new file mode 100644
index 00000000000..9717a7103e5
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/pr120276.c
@@ -0,0 +1,20 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=armv8.2-a+sve" { target aarch64*-*-* } } */
+
+int a;
+char b[1];
+int c[18];
+void d(char *);
+void e() {
+  int f;
+  char *g;
+  a = 0;
+  for (; a < 18; a++) {
+    int h = f = 0;
+    for (; f < 4; f++) {
+      g[a * 4 + f] = c[a] >> h;
+      h += 8;
+    }
+  }
+  d(b);
+}
\ No newline at end of file
-- 
2.34.1

> Thanks,
> Kyrill
> 
>>> +
>>> +int a;
>>> +char b[1];
>>> +int c[18];
>>> +void d(char *);
>>> +void e() {
>>> +  int f;
>>> +  char *g;
>>> +  a = 0;
>>> +  for (; a < 18; a++) {
>>> +    int h = f = 0;
>>> +    for (; f < 4; f++) {
>>> +      g[a * 4 + f] = c[a] >> h;
>>> +      h += 8;
>>> +    }
>>> +  }
>>> +  d(b);
>>> +}
>>> \ No newline at end of file


Attachment: smime.p7s
Description: S/MIME cryptographic signature

Reply via email to