On Wed, 31 Jul 2024, Uros Bizjak wrote:
> On Wed, Jul 31, 2024 at 10:24 AM Jakub Jelinek <[email protected]> wrote:
> >
> > On Wed, Jul 31, 2024 at 10:11:44AM +0200, Uros Bizjak wrote:
> > > OK. Richard, can you please mention the above in the comment why
> > > XFmode is rejected in the hook?
> > >
> > > Later, we can perhaps benchmark XFmode move vs. generic memory copy to
> > > get some hard data.
> >
> > My (limited) understanding was that the hook would be used only for cases
> > where we'd like to e.g. value number some SF/DF/XF etc. mode loads and some
> > subsequent loads from the same address with different mode but same size
> > the same and replace say int or long long later load with VIEW_CONVERT_EXPR
> > of the result of the SF/DF mode load. That is what was incorrect, because
> > the load didn't preserve all the bits. The patch would still keep doing
> > normal SF/DF/XF etc. mode copies if that is all that happens in the program,
> > load some floating point value and store it elsewhere or as part of larger
> > aggregate copy.
>
> So, the hook should allow everything besides SF/DFmode, simply:
>
>
>   switch (GET_MODE_INNER (mode))
>     {
>     case SFmode:
>     case DFmode:
>       /* These suffer from normalization upon load when not using SSE. */
>       return !(ix86_fpmath & FPMATH_387);
>     default:
>       return true;
>     }

OK, I think I'll go with this then. I'm now unsure whether the
wrapper around the hook should reject modes with padding or if
the supposed users (value-numbering and SRA) should deal with that
issue separately. I do wonder whether

  ADJUST_FLOAT_FORMAT (XF, (TARGET_128BIT_LONG_DOUBLE
                            ? &ieee_extended_intel_128_format
                            : TARGET_96_ROUND_53_LONG_DOUBLE
                            ? &ieee_extended_intel_96_round_53_format
                            : &ieee_extended_intel_96_format));
  ADJUST_BYTESIZE (XF, TARGET_128BIT_LONG_DOUBLE ? 16 : 12);
  ADJUST_ALIGNMENT (XF, TARGET_128BIT_LONG_DOUBLE ? 16 : 4);

unambiguously specifies where the padding is - m68k has

  FRACTIONAL_FLOAT_MODE (XF, 80, 12, ieee_extended_motorola_format);

It's also not clear we can model an x87 10-byte memory copy in RTL, since
a mem:XF still touches 12 or 16 bytes - IIRC a store leaves any
padding unspecified rather than "masked out", even if the actual
fstp only stores 10 bytes.
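
To make the padding concern concrete, here is a rough model in plain C.
It does not touch the x87 at all: it uses byte arrays to stand in for a
12-byte XF slot whose value occupies the first 10 bytes. The names
(store_value_only, slots_equal_full, ...) and the choice to put the
padding in the last 2 bytes are assumptions for illustration only, not
GCC code.

```c
#include <string.h>

/* Assumed layout: 10 value bytes followed by 2 padding bytes, matching
   ADJUST_BYTESIZE (XF, ...) == 12 for the non-128-bit long double case.  */
enum { XF_VALUE_BYTES = 10, XF_SLOT_BYTES = 12 };

/* Hypothetical model of an fstp-like store: only the value bytes are
   written, whatever was in the padding stays there.  */
void
store_value_only (unsigned char *slot, const unsigned char *value)
{
  memcpy (slot, value, XF_VALUE_BYTES);
}

/* Compare everything a mem:XF access would nominally cover.  */
int
slots_equal_full (const unsigned char *a, const unsigned char *b)
{
  return memcmp (a, b, XF_SLOT_BYTES) == 0;
}

/* Compare only the bytes that carry the value.  */
int
slots_equal_value (const unsigned char *a, const unsigned char *b)
{
  return memcmp (a, b, XF_VALUE_BYTES) == 0;
}

/* Returns 1 when two slots holding the same value, but with different
   stale padding, agree on the value bytes yet differ as full slots.  */
int
demo_padding_mismatch (void)
{
  unsigned char value[XF_VALUE_BYTES] = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };
  unsigned char a[XF_SLOT_BYTES], b[XF_SLOT_BYTES];

  memset (a, 0x00, sizeof a);	/* stale contents of slot a */
  memset (b, 0xff, sizeof b);	/* different stale contents of slot b */
  store_value_only (a, value);
  store_value_only (b, value);

  return slots_equal_value (a, b) && !slots_equal_full (a, b);
}
```

In this model a bitwise reinterpretation of the full 12 bytes depends on
whatever happened to be in the padding, which is why either the wrapper
or the users would have to account for it.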
Richard.