RE: [PATCH v1 1/2] Match: Add form 5 of unsigned SAT_MUL for widen-mul

Li, Pan2 Wed, 03 Sep 2025 03:24:52 -0700

> It seems convert_mult_to_widen fails to fold the inserted conversion, clearly
> first widening to __int128 isn't necessary.  matching unfolded expressions 
> looks
> wrong to me, so I wonder if we can improve widen-mult code generation instead.


I see, let me have a try to improve that first.

Pan

-----Original Message-----
From: Richard Biener <[email protected]> 
Sent: Wednesday, September 3, 2025 2:56 PM
To: Li, Pan2 <[email protected]>
Cc: [email protected]; [email protected]; [email protected]; 
[email protected]; [email protected]; Chen, Ken <[email protected]>; 
Liu, Hongtao <[email protected]>
Subject: Re: [PATCH v1 1/2] Match: Add form 5 of unsigned SAT_MUL for widen-mul

On Wed, Sep 3, 2025 at 3:18 AM Li, Pan2 <[email protected]> wrote:
>
> Thanks Richard.
>
> >> +            (convert (widen_mult:c@3 (convert@4 (convert @0))
> >> So I added 2 times convert to match above sematics. Do you indicate that 
> >> we only need one convert here?
> > Yes, the above extra widening looks odd to me.
>
> Remove the inner convert of (convert @0)) will make the pattern fail to match.

Ah, I failed to see we match their argument to the result.

> The after_dom_children will meet MUL_EXPR first and then the 
> convert_mult_to_widen
> will introduce the convert, as well as wide_mult.

It seems convert_mult_to_widen fails to fold the inserted conversion, clearly
first widening to __int128 isn't necessary.  matching unfolded expressions looks
wrong to me, so I wonder if we can improve widen-mult code generation instead.

> The 220t.fab1 doesn't have wide_mult, so I think we need to move the sat_mul 
> before widening_mul pass to get ride of the
> Inner convert up to a point.
>
>   20   │   _1 = (__int128 unsigned) a_8(D);
>   21   │   _2 = (__int128 unsigned) b_9(D);
>   22   │   x_10 = _1 * _2;
>   23   │   _3 = x_10 >> 16;
>   24   │   hi_11 = (uint16_t) _3;
>   25   │   _4 = hi_11 != 0;
>   26   │   _14 = (signed short) _4;
>   27   │   _5 = -_14;
>   28   │   lo.0_6 = (signed short) x_10;
>   29   │   _7 = _5 | lo.0_6;
>   30   │   _12 = (uint16_t) _7;
>
> The 2 times covert will be present from 272t.optmized two.
>
> Pan
>
>
> -----Original Message-----
> From: Richard Biener <[email protected]>
> Sent: Tuesday, September 2, 2025 10:01 PM
> To: Li, Pan2 <[email protected]>
> Cc: [email protected]; [email protected]; [email protected]; 
> [email protected]; [email protected]; Chen, Ken <[email protected]>; 
> Liu, Hongtao <[email protected]>
> Subject: Re: [PATCH v1 1/2] Match: Add form 5 of unsigned SAT_MUL for 
> widen-mul
>
> On Tue, Sep 2, 2025 at 3:35 PM Li, Pan2 <[email protected]> wrote:
> >
> > Thanks Richard for comments.
> >
> > > Are the inner (convert ..) around @0 and @1 necessary?  The whole
> > > thing as written seems to match (long)(short)x * (long)(unsigned int)y
> > > for 'char' x and y?
> >
> > The gimple after wide-mul has 2 times convert similar as below.
> >
> >  21   <bb 2> [local count: 1073741824]:
> >  22   _1 = (__int128 unsigned) a_8(D);
> >  23   _2 = (__int128 unsigned) b_9(D);
> >  24   _34 = (unsigned long) _1;
> >  25   _33 = (unsigned long) _2;
> >  26   x_10 = _34 w* _33;
> >  27   _3 = x_10 >> 64;
> >  28   hi_11 = (uint64_t) _3;
> >  29   lo_12 = (uint64_t) x_10;
> >
> > +            (convert (widen_mult:c@3 (convert@4 (convert @0))
> >
> > So I added 2 times convert to match above sematics. Do you indicate that we 
> > only need one convert here?
>
> Yes, the above extra widening looks odd to me.
>
> > > I also wonder if there isn't any TYPE_UNSIGNED check
> > > missing below given the comment mentions unsigned WT.
> >
> > The outer checked the type with TYPE_UNSIGNED, and inner checked the 
> > types_match (type, @0, @1)).
> > So I think this ensure it only works for unsigned.
>
> Ah, OK.
>
> > 3184 /* Saturation add for unsigned integer.  */
> > 3185 (if (INTEGRAL_TYPE_P (type) && TYPE_UNSIGNED (type))
> >
> > Pan
> >
> > -----Original Message-----
> > From: Richard Biener <[email protected]>
> > Sent: Tuesday, September 2, 2025 8:31 PM
> > To: Li, Pan2 <[email protected]>
> > Cc: [email protected]; [email protected]; [email protected]; 
> > [email protected]; [email protected]; Chen, Ken <[email protected]>; 
> > Liu, Hongtao <[email protected]>
> > Subject: Re: [PATCH v1 1/2] Match: Add form 5 of unsigned SAT_MUL for 
> > widen-mul
> >
> > On Mon, Sep 1, 2025 at 6:57 AM <[email protected]> wrote:
> > >
> > > From: Pan Li <[email protected]>
> > >
> > > This patch would like to try to match the the unsigned
> > > SAT_MUL form 4, aka below:
> > >
> > >   #define DEF_SAT_U_MUL_FMT_5(NT, WT)             \
> > >   NT __attribute__((noinline))                    \
> > >   sat_u_mul_##NT##_from_##WT##_fmt_5 (NT a, NT b) \
> > >   {                                               \
> > >     WT x = (WT)a * (WT)b;                         \
> > >     NT hi = x >> (sizeof(NT) * 8);                \
> > >     NT lo = (NT)x;                                \
> > >     return lo | -!!hi;                            \
> > >   }
> > >
> > >   while WT is uint128_t, T is uint8_t, uint16_t, uint32_t or uint64_t.
> > >
> > > gcc/ChangeLog:
> > >
> > >         * match.pd: Add pattern for SAT_MUL form 5.
> > >         * tree-ssa-math-opts.cc 
> > > (math_opts_dom_walker::after_dom_children):
> > >         Try match pattern for IOR.
> > >
> > > Signed-off-by: Pan Li <[email protected]>
> > > ---
> > >  gcc/match.pd              | 25 +++++++++++++++++++++++++
> > >  gcc/tree-ssa-math-opts.cc |  1 +
> > >  2 files changed, 26 insertions(+)
> > >
> > > diff --git a/gcc/match.pd b/gcc/match.pd
> > > index b1d7a3a1b73..4a3587017d1 100644
> > > --- a/gcc/match.pd
> > > +++ b/gcc/match.pd
> > > @@ -3660,6 +3660,31 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
> > >        bool c2_is_type_precision_p = c2 == prec;
> > >       }
> > >       (if (widen_prec > prec && c2_is_type_precision_p && c4_is_max_p)))))
> > > +  (match (unsigned_integer_sat_mul @0 @1)
> > > +   /* SAT_U_MUL (X, Y) = {
> > > +       WT x = (WT)a * (WT)b;
> > > +       NT hi = x >> (sizeof(NT) * 8);
> > > +       NT lo = (NT)x;
> > > +       return lo | -!!hi;
> > > +      } while WT is uint128_t, T is uint8_t, uint16_t, uint32_t or 
> > > uint64_t.  */
> > > +   (convert1?
> > > +    (bit_ior (convert? (negate (convert (ne (convert (rshift @3 
> > > INTEGER_CST@2))
> > > +                                           integer_zerop))))
> > > +            (convert (widen_mult:c@3 (convert@4 (convert @0))
> > > +                                     (convert@5 (convert @1))))))
> >
> > Are the inner (convert ..) around @0 and @1 necessary?  The whole
> > thing as written seems to match (long)(short)x * (long)(unsigned int)y
> > for 'char' x and y?  I also wonder if there isn't any TYPE_UNSIGNED check
> > missing below given the comment mentions unsigned WT.
> >
> > > +   (if (types_match (type, @0, @1))
> > > +    (with
> > > +     {
> > > +      unsigned widen_prec = TYPE_PRECISION (TREE_TYPE (@3));
> > > +      unsigned cvt4_prec = TYPE_PRECISION (TREE_TYPE (@4));
> > > +      unsigned cvt5_prec = TYPE_PRECISION (TREE_TYPE (@5));
> > > +      bool widen_mult_p = cvt4_prec == cvt5_prec && widen_prec == 
> > > cvt5_prec * 2;
> > > +
> > > +      unsigned c2 = tree_to_uhwi (@2);
> > > +      unsigned prec = TYPE_PRECISION (type);
> > > +      bool c2_is_type_precision_p = c2 == prec;
> > > +     }
> > > +     (if (widen_mult_p && c2_is_type_precision_p)))))
> > >  )
> > >
> > >  /* The boundary condition for case 10: IMM = 1:
> > > diff --git a/gcc/tree-ssa-math-opts.cc b/gcc/tree-ssa-math-opts.cc
> > > index bfad4cf5c99..fcbb752a5e3 100644
> > > --- a/gcc/tree-ssa-math-opts.cc
> > > +++ b/gcc/tree-ssa-math-opts.cc
> > > @@ -6509,6 +6509,7 @@ math_opts_dom_walker::after_dom_children 
> > > (basic_block bb)
> > >               break;
> > >
> > >             case BIT_IOR_EXPR:
> > > +             match_unsigned_saturation_mul (&gsi, as_a<gassign *> 
> > > (stmt));
> > >               match_saturation_add_with_assign (&gsi, as_a<gassign *> 
> > > (stmt));
> > >               match_unsigned_saturation_trunc (&gsi, as_a<gassign *> 
> > > (stmt));
> > >               /* fall-through  */
> > > --
> > > 2.43.0
> > >

RE: [PATCH v1 1/2] Match: Add form 5 of unsigned SAT_MUL for widen-mul

Reply via email to