On Tue, Aug 24, 2021 at 11:38 AM Hongtao Liu <[email protected]> wrote:
>
> On Tue, Aug 24, 2021 at 5:40 PM Hongtao Liu <[email protected]> wrote:
> >
> > On Tue, Aug 17, 2021 at 9:52 AM Hongtao Liu <[email protected]> wrote:
> > >
> > > On Mon, Aug 9, 2021 at 4:34 PM Hongtao Liu <[email protected]> wrote:
> > > >
> > > > On Fri, Aug 6, 2021 at 7:27 PM Richard Biener via Gcc-patches
> > > > <[email protected]> wrote:
> > > > >
> > > > > On Fri, Aug 6, 2021 at 11:05 AM Richard Sandiford
> > > > > <[email protected]> wrote:
> > > > > >
> > > > > > Richard Biener via Gcc-patches <[email protected]> writes:
> > > > > > > On Fri, Aug 6, 2021 at 5:32 AM liuhongt <[email protected]>
> > > > > > > wrote:
> > > > > > >>
> > > > > > >> Hi:
> > > > > > >> ---
> > > > > > >> OK, I think sth is amiss here upthread. insv/extv do look like
> > > > > > >> they
> > > > > > >> are designed
> > > > > > >> to work on integer modes (but docs do not say anything about
> > > > > > >> this here).
> > > > > > >> In fact the caller of extract_bit_field_using_extv is named
> > > > > > >> extract_integral_bit_field. Of course nothing seems to check
> > > > > > >> what kind of
> > > > > > >> modes we're dealing with, but we're for example happily doing
> > > > > > >> expand_shift in 'mode'. In the extract_integral_bit_field call
> > > > > > >> 'mode' is
> > > > > > >> some integer mode and op0 is HFmode? From the above I get it's
> > > > > > >> the other way around? In that case we should wrap the
> > > > > > >> call to extract_integral_bit_field, extracting in an integer
> > > > > > >> mode with the
> > > > > > >> same size as 'mode' and then converting the result as (subreg:HF
> > > > > > >> (reg:HI ...)).
> > > > > > >> ---
> > > > > > >> This is a separate patch as a follow up of upper comments.
> > > > > > >>
> > > > > > >> gcc/ChangeLog:
> > > > > > >>
> > > > > > >> * expmed.c (extract_bit_field_1): Wrap the call to
> > > > > > >> extract_integral_bit_field, extracting in an integer
> > > > > > >> mode with
> > > > > > >> the same size as 'tmode' and then converting the result
> > > > > > >> as (subreg:tmode (reg:imode)).
> > > > > > >>
> > > > > > >> gcc/testsuite/ChangeLog:
> > > > > > >> * gcc.target/i386/float16-5.c: New test.
> > > > > > >> ---
> > > > > > >> gcc/expmed.c | 19
> > > > > > >> +++++++++++++++++++
> > > > > > >> gcc/testsuite/gcc.target/i386/float16-5.c | 12 ++++++++++++
> > > > > > >> 2 files changed, 31 insertions(+)
> > > > > > >> create mode 100644 gcc/testsuite/gcc.target/i386/float16-5.c
> > > > > > >>
> > > > > > >> diff --git a/gcc/expmed.c b/gcc/expmed.c
> > > > > > >> index 3143f38e057..72790693ef0 100644
> > > > > > >> --- a/gcc/expmed.c
> > > > > > >> +++ b/gcc/expmed.c
> > > > > > >> @@ -1850,6 +1850,25 @@ extract_bit_field_1 (rtx str_rtx,
> > > > > > >> poly_uint64 bitsize, poly_uint64 bitnum,
> > > > > > >> op0_mode = opt_scalar_int_mode ();
> > > > > > >> }
> > > > > > >>
> > > > > > >> + /* Make sure we are playing with integral modes. Pun with
> > > > > > >> subregs
> > > > > > >> + if we aren't. When tmode is HFmode, op0 is SImode, there
> > > > > > >> will be ICE
> > > > > > >> + in extract_integral_bit_field. */
> > > > > > >> + if (int_mode_for_mode (tmode).exists (&imode)
> > > > > > >
> > > > > > > check !INTEGRAL_MODE_P (tmode) before, that should be slightly
> > > > > > > cheaper. Then imode should always be != tmode. Maybe
> > > > > > > even GET_MDOE_CLASS (tmode) != MODE_INT since I'm not sure
> > > > > > > how it behaves for composite modes.
> > > > > > >
> > > > > > > Of course the least surprises would happen when we restrict this
> > > > > > > to FLOAT_MODE_P (tmode).
> > > > > > >
> > > > > > > Richard - any preferences?
> > > > > >
> > > > > > If the bug is that extract_integral_bit_field is being called with
> > > > > > a non-integral mode parameter, then it looks odd that we can still
> > > > > > fall through to it without an integral mode (when exists is false).
> > > > > >
> > > > > > If calling extract_integral_bit_field without an integral mode is
> > > > > > a bug then I think we should have:
> > > > > >
> > > > > > int_mode_for_mode (mode).require ()
> > > > > >
> > > > > > whenever mode is not already
> > > > > > SCALAR_INT_MODE_P/is_a<scalar_int_mode>.
> > > > > > Ideally we'd make the mode parameter scalar_int_mode too.
> > > > > >
> > > > > > extract_integral_bit_field currently has:
> > > > > >
> > > > > > /* Find a correspondingly-sized integer field, so we can apply
> > > > > > shifts and masks to it. */
> > > > > > scalar_int_mode int_mode;
> > > > > > if (!int_mode_for_mode (tmode).exists (&int_mode))
> > > > > > /* If this fails, we should probably push op0 out to memory and
> > > > > > then
> > > > > > do a load. */
> > > > > > int_mode = int_mode_for_mode (mode).require ();
> > > > > >
> > > > > > which would seem to be redundant after this change.
> > > > >
> > > > > I'm not sure what exactly the bug is, but extract_integral_bit_field
> > > > > ends
> > > > > up creating a lowpart subreg that's not allowed and that ICEs (and I
> > > > > can't see a way to check beforehand). So it seems to me at least
> > > > > part of that function doesn't expect non-integral extraction modes.
> > > > >
> > > > > But who knows - the code is older than I am (OK, not, but older than
> > > > > my involvment in GCC ;))
> > > > >
> > > > How about attached patch w/ below changelog
> > > >
> > > > gcc/ChangeLog:
> > > >
> > > > * expmed.c (extract_bit_field_1): Make sure we're playing with
> > > > integral modes before call extract_integral_bit_field.
> > > > (extract_integral_bit_field): Add a parameter of type
> > > > scalar_int_mode which corresponds to of tmode.
> > > > And call extract_and_convert_fixed_bit_field instead of
> > > > extract_fixed_bit_field and convert_extracted_bit_field.
> > > > (extract_and_convert_fixed_bit_field): New function, it's a
> > > > combination of extract_fixed_bit_field and
> > > > convert_extracted_bit_field.
> > > >
> > > > gcc/testsuite/ChangeLog:
> > > > * gcc.target/i386/float16-5.c: New test.
> > > >
> > > I'd like to ping this patch, or maybe we can use the patch before with
> > > richi's comments.
> > > >
> >
> > Rebased and ping^2, there are plenty of avx512fp16 patches blocked by
> > this patch, i'd like someone to help review this patch.
> >
> Please ignore the former attached patch, should be the patch attached here.
I think the patch is reasonable. I'm a bit worried approving it since
my knowledge
of the code is restricted. I wonder if you can tell the change doesn't make
a difference for the majority of cases - that is, did you try for
example comparing
generated code for GCC (or parts of it)?
OK if there's no objection from others within 48 hours.
Thanks,
Richard.
> > > > > Richard.
> > > > >
> > > > > > >> + && imode != tmode
> > > > > > >> + && imode != GET_MODE (op0))
> > > > > > >> + {
> > > > > > >> + rtx ret = extract_integral_bit_field (op0, op0_mode,
> > > > > > >> + bitsize.to_constant
> > > > > > >> (),
> > > > > > >> + bitnum.to_constant
> > > > > > >> (), unsignedp,
> > > > > > >> + NULL, imode, imode,
> > > > > > >> + reverse, fallback_p);
> > > > > > >> + gcc_assert (ret);
> > > > > > >> +
> > > > > > >> + if (!REG_P (ret))
> > > > > > >> + ret = force_reg (imode, ret);
> > > > > > >> + return gen_lowpart_SUBREG (tmode, ret);
> > > > > > >> + }
> > > > > > >> +
> > > > > > >> /* It's possible we'll need to handle other cases here for
> > > > > > >> polynomial bitnum and bitsize. */
> > > > > >
> > > > > > Minor nit, but since the code is using to_constant, it should go
> > > > > > after
> > > > > > this comment rather than before it.
> > > > > >
> > > > > > Thanks,
> > > > > > Richard
> > > > > >
> > > > > > >>
> > > > > > >> diff --git a/gcc/testsuite/gcc.target/i386/float16-5.c
> > > > > > >> b/gcc/testsuite/gcc.target/i386/float16-5.c
> > > > > > >> new file mode 100644
> > > > > > >> index 00000000000..ebc0af1490b
> > > > > > >> --- /dev/null
> > > > > > >> +++ b/gcc/testsuite/gcc.target/i386/float16-5.c
> > > > > > >> @@ -0,0 +1,12 @@
> > > > > > >> +/* { dg-do compile } */
> > > > > > >> +/* { dg-options "-msse2 -O2" } */
> > > > > > >> +_Float16
> > > > > > >> +foo (int a)
> > > > > > >> +{
> > > > > > >> + union {
> > > > > > >> + int a;
> > > > > > >> + _Float16 b;
> > > > > > >> + }c;
> > > > > > >> + c.a = a;
> > > > > > >> + return c.b;
> > > > > > >> +}
> > > > > > >> --
> > > > > > >> 2.27.0
> > > > > > >>
> > > >
> > > >
> > > >
> > > > --
> > > > BR,
> > > > Hongtao
> > >
> > >
> > >
> > > --
> > > BR,
> > > Hongtao
> >
> >
> >
> > --
> > BR,
> > Hongtao
>
>
>
> --
> BR,
> Hongtao