On Wed, 31 Jul 2024, Uros Bizjak wrote:

> On Wed, Jul 31, 2024 at 10:24 AM Jakub Jelinek <ja...@redhat.com> wrote:
> >
> > On Wed, Jul 31, 2024 at 10:11:44AM +0200, Uros Bizjak wrote:
> > > OK. Richard, can you please mention the above in the comment why
> > > XFmode is rejected in the hook?
> > >
> > > Later, we can perhaps benchmark XFmode move vs. generic memory copy to
> > > get some hard data.
> >
> > My (limited) understanding was that the hook would be used only for cases
> > where we'd like to e.g. value number some SF/DF/XF etc. mode loads and some
> > subsequent loads from the same address with different mode but same size
> > the same and replace say int or long long later load with VIEW_CONVERT_EXPR
> > of the result of the SF/SF mode load.  That is what was incorrect, because
> > the load didn't preserve all the bits.  The patch would still keep doing
> > normal SF/DF/XF etc. mode copies if that is all that happens in the program,
> > load some floating point value and store it elsewhere or as part of larger
> > aggregate copy.
>
> So, the hook should allow everything besides SF/DFmode, simply:
>
>
>     switch (GET_MODE_INNER (mode))
>       {
>       case SFmode:
>       case DFmode:
>         /* These suffer from normalization upon load when not using SSE.  */
>         return !(ix86_fpmath & FPMATH_387);
>       default:
>         return true;
>       }

OK, I think I'll go with this then.  I'm now unsure whether the wrapper
around the hook should reject modes with padding or if the supposed
users (value-numbering and SRA) should deal with that issue separately.
I do wonder whether

ADJUST_FLOAT_FORMAT (XF, (TARGET_128BIT_LONG_DOUBLE
                          ? &ieee_extended_intel_128_format
                          : TARGET_96_ROUND_53_LONG_DOUBLE
                          ? &ieee_extended_intel_96_round_53_format
                          : &ieee_extended_intel_96_format));
ADJUST_BYTESIZE  (XF, TARGET_128BIT_LONG_DOUBLE ? 16 : 12);
ADJUST_ALIGNMENT (XF, TARGET_128BIT_LONG_DOUBLE ? 16 : 4);

unambiguously specifies where the padding is - m68k has

FRACTIONAL_FLOAT_MODE (XF, 80, 12, ieee_extended_motorola_format);

It's also not clear we can model a x87 10 byte memory copy in RTL since
a mem:XF still touches 12 or 16 bytes - IIRC a store leaves possible
padding as unspecified and not "masked out" even if the actual fstp
will only store 10 bytes.

Richard.
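
To put numbers on the padding question, here is a small standalone C
sketch.  It is not GCC code: the helper name padding_bytes is
hypothetical, and the sizes are simply the ones from the
ADJUST_BYTESIZE and FRACTIONAL_FLOAT_MODE lines quoted above (80 value
bits stored in 12 or 16 bytes).  It only counts padding bytes; it says
nothing about where in the object they sit, which is the part the mode
definitions may leave open.

```c
/* Hypothetical helper, not part of GCC: the number of storage bytes
   that carry no value bits when a format with VALUE_BITS significant
   bits is stored in STORAGE_BYTES bytes.  */
static unsigned
padding_bytes (unsigned storage_bytes, unsigned value_bits)
{
  /* Round the value size up to whole bytes.  */
  unsigned value_bytes = (value_bits + 7) / 8;

  /* A format that fills (or overfills) its storage has no padding.  */
  return storage_bytes > value_bytes ? storage_bytes - value_bytes : 0;
}
```

With these assumptions, padding_bytes (12, 80) gives 2 and
padding_bytes (16, 80) gives 6 for XFmode, while a 4-byte SFmode value
with 32 value bits gives 0.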