Hi Richard, Thanks for the feedback! We have sent a new version that uses lowpart_subreg (https://gcc.gnu.org/pipermail/gcc-patches/2025-May/682835.html). We had tried that before, but we were mishandling the case where there are multiple stores with the same offset as the load.
As for `it->offset`, that's actually the offset difference between the store and the load (we're trying to find the store with the same offset as the load), so the endianness should be irrelevant in that case. Konstantinos On Tue, Apr 29, 2025 at 8:48 PM Richard Sandiford <richard.sandif...@arm.com> wrote: > > Konstantinos Eleftheriou <konstantinos.elefther...@vrull.eu> writes: > > During the base register initialization, when we are eliminating the load > > instruction, we were calling `emit_move_insn` on registers of the same > > size but of different mode in some cases, causing an ICE. > > > > This patch fixes this, by adding a check for the modes to match before > > calling `emit_move_insn`. > > > > Bootstrapped/regtested on AArch64 and x86_64. > > > > PR rtl-optimization/119884 > > > > gcc/ChangeLog: > > > > * avoid-store-forwarding.cc (process_store_forwarding): > > Added check to ensure that the register modes match > > before calling `emit_move_insn`. > > > > gcc/testsuite/ChangeLog: > > > > * gcc.target/i386/pr119884.c: New test. > > --- > > gcc/avoid-store-forwarding.cc | 3 ++- > > gcc/testsuite/gcc.target/i386/pr119884.c | 13 +++++++++++++ > > 2 files changed, 15 insertions(+), 1 deletion(-) > > create mode 100644 gcc/testsuite/gcc.target/i386/pr119884.c > > > > diff --git a/gcc/avoid-store-forwarding.cc b/gcc/avoid-store-forwarding.cc > > index ded8d7e596e0..aec05c22ac37 100644 > > --- a/gcc/avoid-store-forwarding.cc > > +++ b/gcc/avoid-store-forwarding.cc > > @@ -244,7 +244,8 @@ process_store_forwarding (vec<store_fwd_info> &stores, > > rtx_insn *load_insn, > > GET_MODE_BITSIZE (GET_MODE (it->mov_reg)))) > > base_reg = gen_rtx_ZERO_EXTEND (dest_mode, it->mov_reg); > > > > - if (base_reg) > > + /* Generate a move instruction, only when the modes match. */ > > + if (base_reg && dest_mode == GET_MODE (base_reg)) > > { > > rtx_insn *move0 = emit_move_insn (dest, base_reg); > > if (recog_memoized (move0) >= 0) > > Is this a complete fix though? It looks like: > > rtx base_reg = it->mov_reg; > if (known_gt (GET_MODE_BITSIZE (dest_mode), > GET_MODE_BITSIZE (GET_MODE (it->mov_reg)))) > base_reg = gen_rtx_ZERO_EXTEND (dest_mode, it->mov_reg); > > implicitly assumes that dest_mode and GET_MODE (it->mov_reg) are both > integer modes. > > Given that this code only triggers if we're going to eliminate the load, > and thus presumably define all the other bytes of DEST_MODE later, > couldn't we just use lowpart_subreg? That way we can cope with different > like-sized modes and with "extensions" from one mode to another. Of course, > there might be cases where doing this for different modes would trigger > unwanted cross-file register transfers, but that seems like a general > risk with forwarding. > > Not related to this PR, but on the same piece of code: > > /* If we're eliminating the load then find the store with zero offset > and use it as the base register to avoid a bit insert if possible. > */ > if (load_elim && it->offset == 0) > > Doesn't this offset check need to take endianness into account? > Byte 0 would normally be the msb for big-endian targets. > > Thanks, > Richard > > > diff --git a/gcc/testsuite/gcc.target/i386/pr119884.c > > b/gcc/testsuite/gcc.target/i386/pr119884.c > > new file mode 100644 > > index 000000000000..34d5b689244d > > --- /dev/null > > +++ b/gcc/testsuite/gcc.target/i386/pr119884.c > > @@ -0,0 +1,13 @@ > > +/* { dg-do compile } */ > > +/* { dg-options "-O2 -fno-dse -favoid-store-forwarding" } */ > > + > > +typedef __attribute__((__vector_size__(64))) char V; > > +char c; > > +V v; > > + > > +char > > +foo() > > +{ > > + v *= c; > > + return v[0]; > > +} > > \ No newline at end of file