On Fri, 2020-09-11 at 10:46 +0100, Richard Sandiford wrote:
> Ilya Leoshkevich via Gcc <gcc@gcc.gnu.org> writes:
> > On Wed, 2020-09-09 at 16:09 -0500, Segher Boessenkool wrote:
> > > Hi Ilya,
> > > 
> > > On Wed, Sep 09, 2020 at 11:50:56AM +0200, Ilya Leoshkevich via Gcc wrote:
> > > > I have a vector pseudo containing a single 128-bit value (V1TFmode)
> > > > and I need to access its last 64 bits (DFmode). Which of the two
> > > > options is better?
> > > > 
> > > > (subreg:DF (reg:V1TF) 8)
> > > > 
> > > > or
> > > > 
> > > > (vec_select:DF (subreg:V2DF (reg:V1TF) 0) (parallel [(const_int 1)]))
> > > > 
> > > > If I use the first one, I run into a problem with set_noop_p (): it
> > > > thinks that
> > > > 
> > > > (set (subreg:DF (reg:TF %f0) 8) (subreg:DF (reg:V1TF %f0) 8))
> > > > 
> > > > is a no-op, because it doesn't check the mode after stripping the
> > > > subreg:
> > > > 
> > > > https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=gcc/rtlanal.c;h=5ae38b79#l1616
> > > > 
> > > > However this is not correct, because SET_DEST is the second register
> > > > in a register pair, and SET_SRC is half of a vector register that
> > > > overlaps the first register in the corresponding pair. So it looks as
> > > > if mode needs to be considered there.
> > > 
> > > Yes.
> > > 
> > > > This helps:
> > > > 
> > > > --- a/gcc/rtlanal.c
> > > > +++ b/gcc/rtlanal.c
> > > > @@ -1619,6 +1619,8 @@ set_noop_p (const_rtx set)
> > > >         return 0;
> > > >        src = SUBREG_REG (src);
> > > >        dst = SUBREG_REG (dst);
> > > > +      if (GET_MODE (src) != GET_MODE (dst))
> > > > +       return 0;
> > > >      }
> > > > 
> > > > but I'm not sure whether I'm not missing something about subreg
> > > > semantics in the first place.
> > > 
> > > You probably should just see if both modes are the same number of hard
> > > registers?  HARD_REGNO_NREGS.
> > 
> > I've refined my patch as follows:
> > 
> > --- a/gcc/rtlanal.c
> > +++ b/gcc/rtlanal.c
> > @@ -1619,6 +1619,11 @@ set_noop_p (const_rtx set)
> >         return 0;
> >        src = SUBREG_REG (src);
> >        dst = SUBREG_REG (dst);
> > +      if (REG_P (src) && HARD_REGISTER_P (src) && REG_P (dst)
> > +         && HARD_REGISTER_P (dst)
> > +         && hard_regno_nregs (REGNO (src), GET_MODE (src))
> > +                != hard_regno_nregs (REGNO (dst), GET_MODE (dst)))
> > +       return 0;
> >      }
> 
> I think checking the mode would be safer.  Having the same number
> of registers doesn't mean that the bits are distributed across the
> registers in the same way.

Yeah, that's what I was trying to express with this hypothetical
machine example.  On the other hand, checking mode is too pessimistic.
E.g. if we talk about s390 GPRs, then considering

(set (subreg:SI (reg:DI %r0) 4) (subreg:SI (reg:DI %r0) 4))

a no-op is fine from my perspective.  So having a more restrictive
check might be desirable.  Is there a way to ask the backend how the
subreg bits are distributed?

> Out of interest, why can't the subregs in the example above get
> folded down to hard registers?

I think this is because the offsets are not 0.  I could imagine folding
(subreg:DF (reg:TF %f0) 8) to (reg:DF %f2), but there would have to be a
backend hook for this.  Does anything like this exist?  Also, can
(subreg:DF (reg:V1TF %f0) 8) be folded at all?  This is simply
the second doubleword of the 128-bit %v0 vector register.
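
To make the overlap concrete, here is a small standalone C model (not GCC
code) of which 64-bit chunk each of the two subregs above refers to.  It is
only a sketch under my reading of the s390 layout: FPR %fN overlays the
first doubleword of vector register %vN, and a TFmode value in %fN occupies
the pair %fN / %f(N+2).  The names chunk, subreg_df_of_tf,
subreg_df_of_v1tf and same_chunk are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model, not GCC code: a 64-bit chunk of the vector register file,
   identified by vector register number and doubleword index (0 or 1).
   Assumption: FPR %fN is doubleword 0 of vector register %vN.  */
struct chunk
{
  int vreg;
  int dword;
};

/* Chunk selected by (subreg:DF (reg:TF %fN) byte_off).
   Assumption: TFmode in %fN lives in the FPR pair %fN / %f(N+2),
   so byte offset 8 selects %f(N+2), i.e. doubleword 0 of %v(N+2).  */
struct chunk
subreg_df_of_tf (int fpr, int byte_off)
{
  struct chunk c = { fpr + (byte_off / 8) * 2, 0 };
  return c;
}

/* Chunk selected by (subreg:DF (reg:V1TF %fN) byte_off).
   V1TFmode fills the whole 128-bit %vN, so byte offset 8 selects
   doubleword 1 of that same vector register.  */
struct chunk
subreg_df_of_v1tf (int fpr, int byte_off)
{
  struct chunk c = { fpr, byte_off / 8 };
  return c;
}

/* True if both descriptions name the same 64 bits of hardware state.  */
bool
same_chunk (struct chunk a, struct chunk b)
{
  return a.vreg == b.vreg && a.dword == b.dword;
}
```

In this model, same_chunk (subreg_df_of_tf (0, 8), subreg_df_of_v1tf (0, 8))
is false, which is why treating the set above as a no-op is wrong, while the
two subregs at offset 0 do name the same chunk.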
