Hi all,

On s390 we tie only the floating-point modes SF and DF with each other,
and all remaining modes with each other:

static bool
s390_modes_tieable_p (machine_mode mode1, machine_mode mode2)
{
  return ((mode1 == SFmode || mode1 == DFmode)
          == (mode2 == SFmode || mode2 == DFmode));
}
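
To make the resulting partition concrete, here is a minimal stand-alone
sketch.  The enum toy_mode and all toy_* names are hypothetical stand-ins
for machine_mode and are not GCC code; only the comparison itself mirrors
the hook above.

```c
#include <stdbool.h>

/* Hypothetical stand-in for GCC's machine_mode; only a handful of
   modes are modeled.  */
enum toy_mode { TOY_SImode, TOY_DImode, TOY_SFmode, TOY_DFmode, TOY_TFmode };

static bool
toy_sf_or_df_p (enum toy_mode m)
{
  return m == TOY_SFmode || m == TOY_DFmode;
}

/* Same comparison as the current hook: two modes are tieable iff both
   or neither of them is SF/DF.  */
static bool
toy_modes_tieable_p (enum toy_mode mode1, enum toy_mode mode2)
{
  return toy_sf_or_df_p (mode1) == toy_sf_or_df_p (mode2);
}
```

In this model toy_modes_tieable_p (TOY_SFmode, TOY_DFmode) and
toy_modes_tieable_p (TOY_SImode, TOY_DImode) hold, whereas
toy_modes_tieable_p (TOY_SImode, TOY_SFmode) does not, which is the case
that makes the subreg below expensive.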

This in turn sometimes leads to higher costs, as e.g. in

(set (reg:SI 60 [ _2 ])
    (lshiftrt:SI (subreg:SI (reg/v:SF 62 [ xD.3041 ]) 0)
        (const_int 31 [0x1f])))
original cost = 4 (weighted: 4.000000), replacement cost = 16 (weighted: 16.000000); rejecting replacement

due to rtx_cost()

case SUBREG:
  total = 0;
  /* If we can't tie these modes, make this expensive.  The larger
     the mode, the more expensive it is.  */
  if (!targetm.modes_tieable_p (mode, GET_MODE (SUBREG_REG (x))))
    return COSTS_N_INSNS (2 + factor);

From the internals handbook I see the strong connection between
TARGET_MODES_TIEABLE_P and TARGET_HARD_REGNO_MODE_OK which I kinda read
as

forall m1, m2. forall r.
 (TARGET_HARD_REGNO_MODE_OK (r, m1) == TARGET_HARD_REGNO_MODE_OK (r, m2))
  => TARGET_MODES_TIEABLE_P (m1, m2)

Since on s390 a value stored in any GPR/FPR/VR can be retrieved
unmodified, i.e., no normalization or the like happens in FPRs/VRs,
we could actually relax this to something along the lines of:

static bool
s390_modes_tieable_p (machine_mode mode1, machine_mode mode2)
{
  if (GET_MODE_CLASS (mode1) == MODE_CC)
    return GET_MODE_CLASS (mode2) == MODE_CC;
  if (TARGET_VX && GET_MODE_SIZE (mode1) <= 8 && GET_MODE_SIZE (mode2) <= 8)
    return true;
  return ((mode1 == SFmode || mode1 == DFmode)
          == (mode2 == SFmode || mode2 == DFmode));
}

since any value of up to 8 bytes can be stored in and retrieved from any
GPR/FPR/VR if the vector extensions are available.
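
A minimal stand-alone sketch of how the proposed predicate behaves on a
few modes follows.  struct toy_mode, the toy_* objects and the target_vx
parameter are hypothetical stand-ins for machine_mode and TARGET_VX, and
the 4-byte size for the CC mode is an assumption of the model; this is
not GCC code.

```c
#include <stdbool.h>

/* Hypothetical, much simplified stand-ins for GCC's mode classes and
   machine_mode; only the class and the size in bytes are modeled.  */
enum toy_class { TOY_CLASS_INT, TOY_CLASS_FLOAT, TOY_CLASS_CC };

struct toy_mode
{
  enum toy_class mclass;
  unsigned size;
};

static const struct toy_mode toy_SI  = { TOY_CLASS_INT, 4 };
static const struct toy_mode toy_DI  = { TOY_CLASS_INT, 8 };
static const struct toy_mode toy_TI  = { TOY_CLASS_INT, 16 };
static const struct toy_mode toy_SF  = { TOY_CLASS_FLOAT, 4 };
static const struct toy_mode toy_DF  = { TOY_CLASS_FLOAT, 8 };
static const struct toy_mode toy_TF  = { TOY_CLASS_FLOAT, 16 };
static const struct toy_mode toy_CCZ = { TOY_CLASS_CC, 4 };  /* size assumed */

/* In this toy model SF and DF are the only float modes of up to 8 bytes.  */
static bool
toy_sf_or_df_p (const struct toy_mode *m)
{
  return m->mclass == TOY_CLASS_FLOAT && m->size <= 8;
}

/* Mirrors the three steps of the proposed hook; target_vx stands in
   for TARGET_VX.  */
static bool
toy_proposed_tieable_p (const struct toy_mode *m1, const struct toy_mode *m2,
                        bool target_vx)
{
  if (m1->mclass == TOY_CLASS_CC)
    return m2->mclass == TOY_CLASS_CC;
  if (target_vx && m1->size <= 8 && m2->size <= 8)
    return true;
  return toy_sf_or_df_p (m1) == toy_sf_or_df_p (m2);
}
```

With target_vx set, (SI, SF) and (DI, DF) become tieable, while (TF, DF)
stays untied.  The sketch also makes one thing visible: the CC check
inspects only the first mode, so in this model (CCZ, SI) is rejected while
(SI, CCZ) is accepted.  Whether that asymmetry matters for the real hook
is not something the model can answer.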

However, what makes me a little cautious is that this holds for a lot of
targets, yet the various TARGET_MODES_TIEABLE_P implementations are rather
involved, which makes me wonder whether I missed something here.  Especially
since riscv, in contrast to s390, does not even allow modes SF and DF to be
tied:

  Don't allow floating-point modes to be tied, since type punning of
  single-precision and double-precision is implementation defined.

Furthermore, the last sentence in the paragraph:

If TARGET_HARD_REGNO_MODE_OK (r, mode1) and TARGET_HARD_REGNO_MODE_OK (r,
mode2) are always the same for any r, then TARGET_MODES_TIEABLE_P (mode1,
mode2) should be true. If they differ for any r, you should define this hook to
return false unless some other mechanism ensures the accessibility of the value
in a narrower mode.

speaks about a narrower mode, which rather increases my concerns about the
potential new implementation of s390_modes_tieable_p().

Last but not least, looking at the uses of TARGET_MODES_TIEABLE_P, we have
e.g. in combine.cc:

    /* In general, don't install a subreg involving two
       modes not tieable.  It can worsen register
       allocation, and can even make invalid reload
       insns, since the reg inside may need to be copied
       from in the outside mode, and that may be invalid
       if it is an fp reg copied in integer mode.

Since on s390 FPRs are left-aligned and GPRs are right-aligned, I can imagine
that a (subreg:DI (reg:SF 65 [ xD.3041 ]) 0) could lead to problems as
described in the comment.  During my experiments I did see such subregs, and
they were all dealt with correctly, but maybe I was just lucky.
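
The alignment concern can be sketched with a toy model of a 64-bit
register.  It assumes that "left-aligned" means the SF value occupies the
high-order 32 bits of the FPR and that float is IEEE 754 binary32; the
toy_* helpers are hypothetical and only model the bit layout, not what
the compiler actually emits.

```c
#include <stdint.h>
#include <string.h>

/* Toy model: place the 32 bits of an SF value in the high-order half
   of a 64-bit FPR (left-aligned); the low-order half is zero here.  */
static uint64_t
toy_fpr_from_float (float f)
{
  uint32_t bits;
  memcpy (&bits, &f, sizeof bits);   /* assumes a 32-bit float */
  return (uint64_t) bits << 32;
}

/* Where a right-aligned 32-bit value would live in a GPR.  */
static uint32_t
toy_low32 (uint64_t reg)
{
  return (uint32_t) reg;
}

/* Where the left-aligned SF value actually lives in the FPR model.  */
static uint32_t
toy_high32 (uint64_t reg)
{
  return (uint32_t) (reg >> 32);
}
```

For -1.0f the sign bit is found via toy_high32 (reg) >> 31, while
toy_low32 (reg) >> 31 yields 0.  So if the 64 bits were copied verbatim
from an FPR to a GPR and then read as a right-aligned SI value, a shift
like the lshiftrt by 31 above would look at the wrong half.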

Long story short, could someone shed some light on the hook
TARGET_MODES_TIEABLE_P, or does anyone see a potential problem with tying
all modes of size less than or equal to 8 bytes as long as we have
proper mov<mode> insns?

Cheers,
Stefan
