On Fri, 2020-09-11 at 10:46 +0100, Richard Sandiford wrote:
> Ilya Leoshkevich via Gcc <gcc@gcc.gnu.org> writes:
> > On Wed, 2020-09-09 at 16:09 -0500, Segher Boessenkool wrote:
> > > Hi Ilya,
> > >
> > > On Wed, Sep 09, 2020 at 11:50:56AM +0200, Ilya Leoshkevich via Gcc
> > > wrote:
> > > > I have a vector pseudo containing a single 128-bit value
> > > > (V1TFmode) and I need to access its last 64 bits (DFmode).
> > > > Which of the two options is better?
> > > >
> > > > (subreg:DF (reg:V1TF) 8)
> > > >
> > > > or
> > > >
> > > > (vec_select:DF (subreg:V2DF (reg:V1TF) 0)
> > > >                (parallel [(const_int 1)]))
> > > >
> > > > If I use the first one, I run into a problem with set_noop_p ():
> > > > it thinks that
> > > >
> > > > (set (subreg:DF (reg:TF %f0) 8) (subreg:DF (reg:V1TF %f0) 8))
> > > >
> > > > is a no-op, because it doesn't check the mode after stripping
> > > > the subreg:
> > > >
> > > > https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=gcc/rtlanal.c;h=5ae38b79#l1616
> > > >
> > > > However this is not correct, because SET_DEST is the second
> > > > register in a register pair, and SET_SRC is half of a vector
> > > > register that overlaps the first register in the corresponding
> > > > pair. So it looks as if mode needs to be considered there.
> > >
> > > Yes.
> > >
> > > > This helps:
> > > >
> > > > --- a/gcc/rtlanal.c
> > > > +++ b/gcc/rtlanal.c
> > > > @@ -1619,6 +1619,8 @@ set_noop_p (const_rtx set)
> > > >         return 0;
> > > >        src = SUBREG_REG (src);
> > > >        dst = SUBREG_REG (dst);
> > > > +      if (GET_MODE (src) != GET_MODE (dst))
> > > > +       return 0;
> > > >      }
> > > >
> > > > but I'm not sure whether I'm not missing something about subreg
> > > > semantics in the first place.
> > >
> > > You probably should just see if both modes are the same number of
> > > hard registers?  HARD_REGNO_NREGS.
> >
> > I've refined my patch as follows:
> >
> > --- a/gcc/rtlanal.c
> > +++ b/gcc/rtlanal.c
> > @@ -1619,6 +1619,11 @@ set_noop_p (const_rtx set)
> >         return 0;
> >        src = SUBREG_REG (src);
> >        dst = SUBREG_REG (dst);
> > +      if (REG_P (src) && HARD_REGISTER_P (src) && REG_P (dst)
> > +         && HARD_REGISTER_P (dst)
> > +         && hard_regno_nregs (REGNO (src), GET_MODE (src))
> > +                != hard_regno_nregs (REGNO (dst), GET_MODE (dst)))
> > +       return 0;
> >      }
>
> I think checking the mode would be safer.  Having the same number
> of registers doesn't mean that the bits are distributed across the
> registers in the same way.

Yeah, that's what I was trying to express with this hypothetical
machine example.

On the other hand, checking the mode is too pessimistic.  E.g. if we
talk about s390 GPRs, then considering

(set (subreg:SI (reg:DI %r0) 4) (subreg:SI (reg:DI %r0) 4))

a no-op is fine from my perspective.  So having a more restrictive
check might be desirable.  Is there a way to ask the backend how the
subreg bits are distributed?

> Out of interest, why can't the subregs in the example above get
> folded down to hard registers?

I think this is because the offsets are not 0.  I could imagine
folding (subreg:DF (reg:TF %f0) 8) to (reg:DF %f2) - but there must be
a backend hook for this.  Does anything like this exist?

Also, can (subreg:DF (reg:V1TF %f0) 8) be folded at all?  This is
simply the second doubleword of the 128-bit %v0 vector register.
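
To make the overlap concrete, here is a minimal sketch in plain C. It is
not GCC code: struct loc, enum layout, resolve () and same_loc () are
hypothetical names made up for illustration, and register numbers are
architectural (%f0, %f2, %v0), not GCC hard register numbers. It assumes
that %fN aliases the first doubleword of %vN and that a TFmode value
lives in the pair %fN/%f(N+2), which is my reading of the example above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model, not GCC code: a location is a byte position
   inside one of the 128-bit vector registers.  */
struct loc
{
  unsigned vreg; /* architectural vector register number */
  unsigned byte; /* byte offset inside that register, 0..15 */
};

/* How a 16-byte value is laid out (assumption for this sketch).  */
enum layout
{
  LAYOUT_FPR_PAIR, /* TFmode: two 8-byte halves in %fN and %f(N+2) */
  LAYOUT_VECTOR    /* V1TFmode: all 16 bytes in %vN */
};

/* Map byte OFFSET of a 16-byte value starting at register REGNO to
   the physical location it occupies under LAYOUT.  */
static struct loc
resolve (enum layout layout, unsigned regno, unsigned offset)
{
  struct loc l;

  assert (offset < 16);
  if (layout == LAYOUT_FPR_PAIR)
    {
      /* The second half lives in the next register of the pair;
         %fN is assumed to alias bytes 0..7 of %vN.  */
      l.vreg = regno + (offset / 8) * 2;
      l.byte = offset % 8;
    }
  else
    {
      /* The whole value sits in a single vector register.  */
      l.vreg = regno;
      l.byte = offset;
    }
  return l;
}

/* True if both locations name the same physical byte.  */
static bool
same_loc (struct loc a, struct loc b)
{
  return a.vreg == b.vreg && a.byte == b.byte;
}
```

Under these assumptions, resolve (LAYOUT_FPR_PAIR, 0, 8) lands in the
first doubleword of register 2, while resolve (LAYOUT_VECTOR, 0, 8)
lands in the second doubleword of register 0. Same register number and
same SUBREG_BYTE, yet different bits, so the set is not a no-op. At
offset 0 the two layouts do agree.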