On Wed, 31 Jul 2024, Uros Bizjak wrote:

> On Wed, Jul 31, 2024 at 11:33 AM Richard Biener <rguent...@suse.de> wrote:
> >
> > On Wed, 31 Jul 2024, Uros Bizjak wrote:
> >
> > > On Wed, Jul 31, 2024 at 10:48 AM Richard Biener <rguent...@suse.de> wrote:
> > > >
> > > > On Wed, 31 Jul 2024, Uros Bizjak wrote:
> > > >
> > > > > On Wed, Jul 31, 2024 at 10:24 AM Jakub Jelinek <ja...@redhat.com> wrote:
> > > > > >
> > > > > > On Wed, Jul 31, 2024 at 10:11:44AM +0200, Uros Bizjak wrote:
> > > > > > > OK. Richard, can you please mention the above in the comment why
> > > > > > > XFmode is rejected in the hook?
> > > > > > >
> > > > > > > Later, we can perhaps benchmark XFmode move vs. generic memory copy to
> > > > > > > get some hard data.
> > > > > >
> > > > > > My (limited) understanding was that the hook would be used only for cases
> > > > > > where we'd like to e.g. value number some SF/DF/XF etc. mode loads and some
> > > > > > subsequent loads from the same address with different mode but same size
> > > > > > the same and replace say int or long long later load with VIEW_CONVERT_EXPR
> > > > > > of the result of the SF/DF mode load.  That is what was incorrect, because
> > > > > > the load didn't preserve all the bits.  The patch would still keep doing
> > > > > > normal SF/DF/XF etc. mode copies if that is all that happens in the program,
> > > > > > load some floating point value and store it elsewhere or as part of larger
> > > > > > aggregate copy.
> > > > >
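
For readers following along, here is a minimal C sketch of the source-level
pattern described above. The helper names are hypothetical and are not part
of GCC or of the patch. The first helper goes through a float-typed load and
then reinterprets the bits; the second reads the same bytes directly as an
integer. The value-numbering replacement discussed here is only valid when
both always agree. My understanding is that with x87 loads they need not,
e.g. a signaling NaN may come back quieted.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumption of this sketch: float is a 32-bit type.  */
static_assert(sizeof(float) == sizeof(uint32_t), "float must be 32 bits");

/* Hypothetical helper: load the bytes at P as a float (the "SFmode load"),
   then reinterpret that float's bits as an integer, which is roughly what
   a VIEW_CONVERT_EXPR of the float load would compute.  */
static uint32_t bits_via_float_load(const void *p)
{
  float f;
  uint32_t u;
  memcpy(&f, p, sizeof f);
  memcpy(&u, &f, sizeof u);
  return u;
}

/* Hypothetical helper: load the same bytes directly as an integer.  */
static uint32_t bits_via_int_load(const void *p)
{
  uint32_t u;
  memcpy(&u, p, sizeof u);
  return u;
}
```

For ordinary values such as 1.0f the two helpers return the same bits; the
hook is about the cases where a floating-point load cannot guarantee that.
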
> > > > > So, the hook should allow everything besides SF/DFmode, simply:
> > > > >
> > > > >
> > > > >     switch (GET_MODE_INNER (mode))
> > > > >       {
> > > > >       case SFmode:
> > > > >       case DFmode:
> > > > >         /* These suffer from normalization upon load when not using
> > > > >            SSE.  */
> > > > >         return !(ix86_fpmath & FPMATH_387);
> > > > >       default:
> > > > >         return true;
> > > > >       }
> > > >
> > > > OK, I think I'll go with this then.  I'm now unsure whether the
> > > > wrapper around the hook should reject modes with padding or if
> > > > the supposed users (value-numbering and SRA) should deal with that
> > > > issue separately.  I do wonder whether
> > > >
> > > > ADJUST_FLOAT_FORMAT (XF, (TARGET_128BIT_LONG_DOUBLE
> > > >                           ? &ieee_extended_intel_128_format
> > > >                           : TARGET_96_ROUND_53_LONG_DOUBLE
> > > >                           ? &ieee_extended_intel_96_round_53_format
> > > >                           : &ieee_extended_intel_96_format));
> > > > ADJUST_BYTESIZE  (XF, TARGET_128BIT_LONG_DOUBLE ? 16 : 12);
> > > > ADJUST_ALIGNMENT (XF, TARGET_128BIT_LONG_DOUBLE ? 16 : 4);
> > > >
> > > > unambiguously specifies where the padding is - m68k has
> > > >
> > > > FRACTIONAL_FLOAT_MODE (XF, 80, 12, ieee_extended_motorola_format);
> > > >
> > > > It's also not clear we can model a x87 10 byte memory copy in RTL since
> > > > a mem:XF still touches 12 or 16 bytes - IIRC a store leaves
> > > > possible padding as unspecified and not "masked out" even if
> > > > the actual fstp will only store 10 bytes.
> > >
> > > The hardware will never touch bytes outside 10 bytes range, the
> > > padding is some artificial compiler thingy, so IMO it should be
> > > handled before the hook is called. Please find attached the source I
> > > have used to confirm that a) the copied bits will never be mangled and
> > > b) there is no access outside the 10 bytes range. (BTW: these
> > > particular values are to test the effect of leading bit 63, the
> > > non-hidden normalized bit).
> >
> > Thanks - I do wonder why GET_MODE_SIZE (XFmode) is not 10 then,
> > mode_base_align[XFmode] seems to be correctly set to ensure
> > 12 bytes / 16 bytes "effective" size.
> 
> Uh, this decision predates my involvement in GCC development by a long shot ;)
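
To make the size vs. padding question concrete, a small C sketch follows.
It assumes an x86 target where long double is the x87 80-bit extended format
(LDBL_MANT_DIG == 64); on other targets it simply reports no padding. The
function names are made up for illustration.

```c
#include <assert.h>
#include <float.h>
#include <stddef.h>

/* If long double has a 64-bit mantissa, its storage must hold 10 bytes.  */
static_assert(LDBL_MANT_DIG != 64 || sizeof(long double) >= 10,
              "80-bit extended format needs at least 10 bytes of storage");

/* Bytes that carry value bits.  Assumption: a 64-bit mantissa means the
   x87 extended format with 10 value bytes; otherwise treat the whole
   object as value bytes.  */
static size_t long_double_value_bytes(void)
{
#if LDBL_MANT_DIG == 64
  return 10;
#else
  return sizeof(long double);
#endif
}

/* Remaining bytes of the object, i.e. the padding the compiler adds.  */
static size_t long_double_padding_bytes(void)
{
  return sizeof(long double) - long_double_value_bytes();
}
```

With the byte sizes quoted above (16 with TARGET_128BIT_LONG_DOUBLE, 12
otherwise) this gives 6 or 2 bytes of padding around the 10 value bytes.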

diff --git a/gcc/config/i386/i386-modes.def b/gcc/config/i386/i386-modes.def
index 6d8f1946f3a..2cc03e30f13 100644
--- a/gcc/config/i386/i386-modes.def
+++ b/gcc/config/i386/i386-modes.def
@@ -21,7 +21,7 @@ along with GCC; see the file COPYING3.  If not see
    XFmode is __float80 is IEEE extended; TFmode is __float128
    is IEEE quad.  */
 
-FRACTIONAL_FLOAT_MODE (XF, 80, 12, ieee_extended_intel_96_format);
+FRACTIONAL_FLOAT_MODE (XF, 80, 10, ieee_extended_intel_96_format);
 FLOAT_MODE (TF, 16, ieee_quad_format);
 FLOAT_MODE (HF, 2, ieee_half_format);
 FLOAT_MODE (BF, 2, 0);

bootstraps and tests (-m64/-m32) OK on x86_64-linux.

Richard.
