> On 16 May 2025, at 12:35, Richard Sandiford <richard.sandif...@arm.com> wrote: > > Jennifer Schmitz <jschm...@nvidia.com> writes: >> The ICE in PR120276 resulted from a comparison of VNx4QI and V8QI using >> partial_subreg_p in the function copy_value during the RTL pass >> regcprop, failing the assertion in >> >> inline bool >> partial_subreg_p (machine_mode outermode, machine_mode innermode) >> { >> /* Modes involved in a subreg must be ordered. In particular, we must >> always know at compile time whether the subreg is paradoxical. */ >> poly_int64 outer_prec = GET_MODE_PRECISION (outermode); >> poly_int64 inner_prec = GET_MODE_PRECISION (innermode); >> gcc_checking_assert (ordered_p (outer_prec, inner_prec)); >> return maybe_lt (outer_prec, inner_prec); >> } >> >> Replacing the call to partial_subreg_p by ordered_p && maybe_lt resolves >> the ICE and passes bootstrap and testing without regression. >> OK for mainline? >> >> Signed-off-by: Jennifer Schmitz <jschm...@nvidia.com> >> >> gcc/ >> PR middle-end/120276 >> * regcprop.cc (copy_value): Replace call to partial_subreg_p by >> ordered_p && maybe_lt. >> >> gcc/testsuite/ >> PR middle-end/120276 >> * gcc.target/aarch64/sve/pr120276.c: New test. >> --- >> gcc/regcprop.cc | 5 ++++- >> .../gcc.target/aarch64/sve/pr120276.c | 20 +++++++++++++++++++ >> 2 files changed, 24 insertions(+), 1 deletion(-) >> create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pr120276.c >> >> diff --git a/gcc/regcprop.cc b/gcc/regcprop.cc >> index 4fa1305526c..7f4a25a4718 100644 >> --- a/gcc/regcprop.cc >> +++ b/gcc/regcprop.cc >> @@ -326,6 +326,9 @@ copy_value (rtx dest, rtx src, struct value_data *vd) >> || (sr > dr && sr < dr + dn)) >> return; >> >> + poly_int64 sr_prec = GET_MODE_PRECISION (vd->e[sr].mode); >> + poly_int64 src_prec = GET_MODE_PRECISION (GET_MODE (src)); >> + >> /* If SRC had no assigned mode (i.e. we didn't know it was live) >> assign it now and assume the value came from an input argument >> or somesuch. */ >> @@ -371,7 +374,7 @@ copy_value (rtx dest, rtx src, struct value_data *vd) >> (set (reg:DI dx) (reg:DI bp)) >> The last set is not redundant, while the low 8 bits of dx are already >> equal to low 8 bits of bp, the other bits are undefined. */ >> - else if (partial_subreg_p (vd->e[sr].mode, GET_MODE (src))) >> + else if (ordered_p (sr_prec, src_prec) && maybe_lt (sr_prec, src_prec)) >> { >> if (!REG_CAN_CHANGE_MODE_P (sr, GET_MODE (src), vd->e[sr].mode) >> || !REG_CAN_CHANGE_MODE_P (dr, vd->e[sr].mode, GET_MODE (dest))) > > I think we should instead add: > > else if (!ordered_p (GET_MODE_PRECISION (vd->e[sr].mode), > GET_MODE_PRECISION (GET_MODE (src)))) > return; > > after: > > if (vd->e[sr].mode == VOIDmode) > set_value_regno (sr, vd->e[dr].mode, vd); > > The ICE in paradoxical_subreg is just the canary. The register > replacement isn't valid if we don't know at compile time how the bytes > are distributed between the two modes. > > Thanks, > Richard > >> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pr120276.c >> b/gcc/testsuite/gcc.target/aarch64/sve/pr120276.c >> new file mode 100644 >> index 00000000000..1b29c90f69b >> --- /dev/null >> +++ b/gcc/testsuite/gcc.target/aarch64/sve/pr120276.c >> @@ -0,0 +1,20 @@ >> +/* { dg-do compile } */ >> +/* { dg-options "-O3" } */
The test case is better-placed in a generic directory like gcc.dg/torture with a directive like: /* { dg-additional-options “-march=armv8.2-a+sve" { target aarch64*-*-* } } */ to force the SVE testing. Thanks, Kyrill >> + >> +int a; >> +char b[1]; >> +int c[18]; >> +void d(char *); >> +void e() { >> + int f; >> + char *g; >> + a = 0; >> + for (; a < 18; a++) { >> + int h = f = 0; >> + for (; f < 4; f++) { >> + g[a * 4 + f] = c[a] >> h; >> + h += 8; >> + } >> + } >> + d(b); >> +} >> \ No newline at end of file