Jeff Law <jeffreya...@gmail.com> writes: > On 7/26/24 2:42 PM, Robin Dapp wrote: >> Hi, >> >> when the source mode is potentially larger than one vector (e.g. an >> LMUL2 mode for VLEN=128) we don't know which vector the subreg actually >> refers to. For zvl128b and LMUL=2 the subreg in (subreg:V2DI (reg:V4DI)) >> could actually be the a full (high) vector register of a two-register >> group (at VLEN=128) or the higher part of a single register (at VLEN>128). >> >> In that case we need to use a slidedown instead of moving a register >> directly. >> >> Regtested on rv64gcv_zvfh_zvbb -mrvv-max-lmul=m2 at vlen 128 and vlen 256. >> This also fixes >> gcc.dg/vect/bb-slp-cond-1.c >> gcc.dg/vect/bb-slp-pr101668.c >> gcc.dg/vect/pr66251.c >> and others from the vector test suite when ran with vlen 256. >> >> Regtested on rv64gcv_zvfh_zvbb -mrvv-max-lmul=m2 and vlen 128 as well as vlen >> 256. Still curious what the CI says. >> >> Regards >> Robin >> >> gcc/ChangeLog: >> >> PR target/116086 >> >> * config/riscv/riscv-v.cc (legitimize_move): Slide down instead >> of moving register directly. >> >> gcc/testsuite/ChangeLog: >> >> * gcc.target/riscv/rvv/autovec/pr116086-2-run.c: New test. >> * gcc.target/riscv/rvv/autovec/pr116086-2.c: New test. >> * gcc.target/riscv/rvv/autovec/pr116086.c: New test. > So the representational issues LMUL > 1 brings with GCC have been in the > back of my mind, but never bubbled up enough for me to say anything. > > This seems to start and touch on those concerns. While your patch fixes > the immediate issue in the RISC-V backend, I won't be at all surprised > if we find other cases in the generic code that assume they know how to > interpret a SUBREG and get it wrong.
Yeah, I agree. It seems like this is papering over an issue elsewhere, and that we're losing our chance to see what it is. A somewhat similar situation can happen for SVE with subregs like: (subreg:V4SI (reg:VNx8SI R) 16) IMO, what ought to happen here is that the RA should spill the inner register to memory and load the V4SI back from there. (Or vice versa, for an lvalue.) Obviously that's not very efficient, and so a patch like the above might be useful as an optimisation.[*] But it shouldn't be needed for correctness. The target-independent code should already have the information it needs to realise that it can't predict the register index at compile time (at least for SVE). Richard [*] Big-endian SVE also checks for tricky subregs in the move patterns, and tries to optimise them to something that doesn't involve a subreg. But it's only an optimisation. > > And we may currently be hiding these issues by having GCC and QEMU be > consistent in their handling in the default configurations. > > Anyway, the patch is sensible. Essentially using a slidedown is a good > way to avoid a subclass of the potential problems. > > Jeff