Re: [PATCH, V3, #2 of 10], Improve rs6000_setup_addr_mask

Segher Boessenkool Wed, 28 Aug 2019 15:28:31 -0700

Hi!

Thanks for splitting things out, that makes it easier to understand.

On Mon, Aug 26, 2019 at 04:10:22PM -0400, Michael Meissner wrote:
> This code attempts to make this clearer by moving the settings for
> GPRs, FPRs, and traditional Altivec registers to separate functions.

They are called VRs.  They always were, but it really doesn't make much
sense to call them "traditional AltiVec registers" now; they are used
for so many more things.

> The second issue is SDmode indicates that it can do PRE_INCREMENT,
> PRE_DECREMENT, and PRE_MODIFY in the floating point registers.  It
> can't since you need to use the LFIWZX instruction to load SDmode, and

(Or lxsiwzx -- you can do all floating point in VRs, too!)

> that does not have an pre-increment format.  I was not able to make a
> test case that actually failed with SDmode.  I opted to make my
> comparison simpler by returning the same information that the current
> compiler uses.  If you prefer, I can change it so the address mask does
> not indicate that the mode can do pre increment, etc.

Does that result in better code?  Does it make the compiler simpler?  If
either of those is "yes", then yes please.

> +/* Return true if the mode is a type that uses the full vector register (like
> +   V2DImode or KFmode).  Do not return true for 128-bit types like TDmode or
> +   IFmode.  */

(_A_ full vector register, "the" makes no sense if the mode doesn't go in
vector registers at all).

As opposed to?  Using only (part of) dword 0?

> +static bool
> +mode_uses_full_vector_reg (machine_mode mode)
>  {
> +  if (GET_MODE_SIZE (mode) < 16)
> +    return false;

Is this needed, given the other conditions?  Or...

> +  if (TARGET_VSX)
> +    return (VECTOR_MODE_P (mode)
> +         || FLOAT128_VECTOR_P (mode)
> +         || mode == TImode);
> +
> +  if (TARGET_ALTIVEC)
> +    return ALTIVEC_VECTOR_MODE (mode);

... maybe it should be just ALTIVEC_OR_VSX_VECTOR_MODE instead?  And the
TImode test?

> +/* Figure out if we can do PRE_INC, PRE_DEC, or PRE_MODIFY addressing for a
> +   given MODE.  If we allow scalars into Altivec registers, don't allow
> +   PRE_INC, PRE_DEC, or PRE_MODIFY.
> +
> +   For VSX systems, we don't allow update addressing for DFmode/SFmode if 
> those
> +   registers can go in both the traditional floating point registers and
> +   Altivec registers.  The load/store instructions for the Altivec registers 
> do
> +   not have update forms.  If we allowed update addressing, it seems to break
> +   IV-OPT code using floating point if the index type is int instead of long
> +   (PR target/81550 and target/84042).  */

"Seems to break"...  Well, ivopts makes a different decision, which isn't
very surprising, and that leads to a loop that is *better* optimised?

Not that load/store-with-update is terribly interesting for floating
point, on modern cores anyway, or that it is ideal to have insns that
only exist for half of the allowed registers :-)

> +/* Return the address mask bits for whether we allow PRE_INCREMENT,
> +   PRE_DECREMENT, and PRE_MODIFY for a given MODE.  */

Why do we do decrement and increment separately here?  There are no insns
like that at all.  Not allowing pre_modify because you need to have an
"extra" like in other cases for bigger modes, well, just do that then?

> +static addr_mask_type
> +setup_reg_addr_masks_pre_incdec (machine_mode mode)
> +{
> +  addr_mask_type addr_mask = 0;
> +
> +  if (TARGET_UPDATE
> +      && GET_MODE_SIZE (mode) <= 8
> +      && !VECTOR_MODE_P (mode)

If it is at most 8 bytes, it cannot be a vector (all our vector modes
are 16 or 32 bytes).

> +      && !FLOAT128_VECTOR_P (mode)
> +      && !COMPLEX_MODE_P (mode)
> +      && (mode != E_DFmode || !TARGET_VSX)
> +      && (mode != E_SFmode || !TARGET_P8_VECTOR))

(Please use DFmode etc. where you can).

We probably should have some helper to say what modes can go in vector
regs.  Well, don't we have that already?  rs6000_vector_unit[mode]?

> +      bool indexed_only = (mode == SDmode && TARGET_LFIWZX);
> +
> +      if (!indexed_only
> +       && (mode_size_inner <= 8
> +           || (mode_size_inner == 16 && TARGET_P9_VECTOR
> +               && mode_uses_full_vector_reg (mode_inner))))

The P9 part here needs a comment (and/or a nicer condition).

> +      else if (mode == SFmode || mode_size == 8

Newline before the || please.

> +            || mode_uses_full_vector_reg (mode_inner)
> +            || (TARGET_LFIWAX && (mode == SImode || mode == SDmode))
> +            || (TARGET_P9_VECTOR && (mode == QImode || mode == HImode)))
> +     addr_mask |= RELOAD_REG_INDEXED;

These last two really only care about the size of the mode?  So write it
like that, too?

> +      /* It is weird that previous versions of GCC supported pre increment,
> +      etc. forms of addressing for SDmode, when you could only use an
> +      indexed instruction, but allow it for now.  Previous versions of GCC
> +      also set the indexed flag for SDmode, even though there was no direct
> +      instruction to load it.  */

Why not fix it now?  It's stage 1.

> +      /* FPR registers can do REG+OFFSET addresssing for vectors if ISA 3.0
> +      instructions are enabled.  The offset for 128-bit VSX registers is
> +      only 12-bits.  */

No, it's 16 bits; but is a DQ-form, so the low 4 bits of the offset are
zeroes.  (And it is not just FPRs, it is VRs as well).

So, I wonder if things will be a lot simpler if you do not precompute
anything, just have "mode_can_use_offset_addressing" etc. functions?

Segher

Re: [PATCH, V3, #2 of 10], Improve rs6000_setup_addr_mask

Reply via email to