> On 16 May 2025, at 13:11, Kyrylo Tkachov <ktkac...@nvidia.com> wrote: > > > >> On 16 May 2025, at 12:35, Richard Sandiford <richard.sandif...@arm.com> >> wrote: >> >> Jennifer Schmitz <jschm...@nvidia.com> writes: >>> The ICE in PR120276 resulted from a comparison of VNx4QI and V8QI using >>> partial_subreg_p in the function copy_value during the RTL pass >>> regcprop, failing the assertion in >>> >>> inline bool >>> partial_subreg_p (machine_mode outermode, machine_mode innermode) >>> { >>> /* Modes involved in a subreg must be ordered. In particular, we must >>> always know at compile time whether the subreg is paradoxical. */ >>> poly_int64 outer_prec = GET_MODE_PRECISION (outermode); >>> poly_int64 inner_prec = GET_MODE_PRECISION (innermode); >>> gcc_checking_assert (ordered_p (outer_prec, inner_prec)); >>> return maybe_lt (outer_prec, inner_prec); >>> } >>> >>> Replacing the call to partial_subreg_p by ordered_p && maybe_lt resolves >>> the ICE and passes bootstrap and testing without regression. >>> OK for mainline? >>> >>> Signed-off-by: Jennifer Schmitz <jschm...@nvidia.com> >>> >>> gcc/ >>> PR middle-end/120276 >>> * regcprop.cc (copy_value): Replace call to partial_subreg_p by >>> ordered_p && maybe_lt. >>> >>> gcc/testsuite/ >>> PR middle-end/120276 >>> * gcc.target/aarch64/sve/pr120276.c: New test. >>> --- >>> gcc/regcprop.cc | 5 ++++- >>> .../gcc.target/aarch64/sve/pr120276.c | 20 +++++++++++++++++++ >>> 2 files changed, 24 insertions(+), 1 deletion(-) >>> create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pr120276.c >>> >>> diff --git a/gcc/regcprop.cc b/gcc/regcprop.cc >>> index 4fa1305526c..7f4a25a4718 100644 >>> --- a/gcc/regcprop.cc >>> +++ b/gcc/regcprop.cc >>> @@ -326,6 +326,9 @@ copy_value (rtx dest, rtx src, struct value_data *vd) >>> || (sr > dr && sr < dr + dn)) >>> return; >>> >>> + poly_int64 sr_prec = GET_MODE_PRECISION (vd->e[sr].mode); >>> + poly_int64 src_prec = GET_MODE_PRECISION (GET_MODE (src)); >>> + >>> /* If SRC had no assigned mode (i.e. we didn't know it was live) >>> assign it now and assume the value came from an input argument >>> or somesuch. */ >>> @@ -371,7 +374,7 @@ copy_value (rtx dest, rtx src, struct value_data *vd) >>> (set (reg:DI dx) (reg:DI bp)) >>> The last set is not redundant, while the low 8 bits of dx are already >>> equal to low 8 bits of bp, the other bits are undefined. */ >>> - else if (partial_subreg_p (vd->e[sr].mode, GET_MODE (src))) >>> + else if (ordered_p (sr_prec, src_prec) && maybe_lt (sr_prec, src_prec)) >>> { >>> if (!REG_CAN_CHANGE_MODE_P (sr, GET_MODE (src), vd->e[sr].mode) >>> || !REG_CAN_CHANGE_MODE_P (dr, vd->e[sr].mode, GET_MODE (dest))) >> >> I think we should instead add: >> >> else if (!ordered_p (GET_MODE_PRECISION (vd->e[sr].mode), >> GET_MODE_PRECISION (GET_MODE (src)))) >> return; >> >> after: >> >> if (vd->e[sr].mode == VOIDmode) >> set_value_regno (sr, vd->e[dr].mode, vd); Thanks, I made the change. >> >> The ICE in paradoxical_subreg is just the canary. The register >> replacement isn't valid if we don't know at compile time how the bytes >> are distributed between the two modes. >> >> Thanks, >> Richard >> >>> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pr120276.c >>> b/gcc/testsuite/gcc.target/aarch64/sve/pr120276.c >>> new file mode 100644 >>> index 00000000000..1b29c90f69b >>> --- /dev/null >>> +++ b/gcc/testsuite/gcc.target/aarch64/sve/pr120276.c >>> @@ -0,0 +1,20 @@ >>> +/* { dg-do compile } */ >>> +/* { dg-options "-O3" } */ > > The test case is better-placed in a generic directory like gcc.dg/torture > with a directive like: > /* { dg-additional-options “-march=armv8.2-a+sve" { target aarch64*-*-* } } */ > to force the SVE testing. I moved the test and removed /* { dg-options “-O3” } */, because that is added as one of the optimization levels in gcc.torture, correct? Thanks, Jennifer
[PATCH] [PR120276] regcprop: Return from copy_value for unordered modes The ICE in PR120276 resulted from a comparison of VNx4QI and V8QI using partial_subreg_p in the function copy_value during the RTL pass regcprop, failing the assertion in inline bool partial_subreg_p (machine_mode outermode, machine_mode innermode) { /* Modes involved in a subreg must be ordered. In particular, we must always know at compile time whether the subreg is paradoxical. */ poly_int64 outer_prec = GET_MODE_PRECISION (outermode); poly_int64 inner_prec = GET_MODE_PRECISION (innermode); gcc_checking_assert (ordered_p (outer_prec, inner_prec)); return maybe_lt (outer_prec, inner_prec); } Returning from the function if the modes are not ordered before reaching the call to partial_subreg_p resolves the ICE and passes bootstrap and testing without regression. OK for mainline? Signed-off-by: Jennifer Schmitz <jschm...@nvidia.com> gcc/ PR middle-end/120276 * regcprop.cc (copy_value): Return in case of unordered modes. gcc/testsuite/ PR middle-end/120276 * gcc.dg/torture/pr120276.c: New test. --- gcc/regcprop.cc | 4 ++++ gcc/testsuite/gcc.dg/torture/pr120276.c | 20 ++++++++++++++++++++ 2 files changed, 24 insertions(+) create mode 100644 gcc/testsuite/gcc.dg/torture/pr120276.c diff --git a/gcc/regcprop.cc b/gcc/regcprop.cc index 4fa1305526c..98ab3f77e83 100644 --- a/gcc/regcprop.cc +++ b/gcc/regcprop.cc @@ -332,6 +332,10 @@ copy_value (rtx dest, rtx src, struct value_data *vd) if (vd->e[sr].mode == VOIDmode) set_value_regno (sr, vd->e[dr].mode, vd); + else if (!ordered_p (GET_MODE_PRECISION (vd->e[sr].mode), + GET_MODE_PRECISION (GET_MODE (src)))) + return; + /* If we are narrowing the input to a smaller number of hard regs, and it is in big endian, we are really extracting a high part. Since we generally associate a low part of a value with the value itself, diff --git a/gcc/testsuite/gcc.dg/torture/pr120276.c b/gcc/testsuite/gcc.dg/torture/pr120276.c new file mode 100644 index 00000000000..9717a7103e5 --- /dev/null +++ b/gcc/testsuite/gcc.dg/torture/pr120276.c @@ -0,0 +1,20 @@ +/* { dg-do compile } */ +/* { dg-additional-options "-march=armv8.2-a+sve" { target aarch64*-*-* } } */ + +int a; +char b[1]; +int c[18]; +void d(char *); +void e() { + int f; + char *g; + a = 0; + for (; a < 18; a++) { + int h = f = 0; + for (; f < 4; f++) { + g[a * 4 + f] = c[a] >> h; + h += 8; + } + } + d(b); +} \ No newline at end of file -- 2.34.1 > Thanks, > Kyrill > >>> + >>> +int a; >>> +char b[1]; >>> +int c[18]; >>> +void d(char *); >>> +void e() { >>> + int f; >>> + char *g; >>> + a = 0; >>> + for (; a < 18; a++) { >>> + int h = f = 0; >>> + for (; f < 4; f++) { >>> + g[a * 4 + f] = c[a] >> h; >>> + h += 8; >>> + } >>> + } >>> + d(b); >>> +} >>> \ No newline at end of file
smime.p7s
Description: S/MIME cryptographic signature