> It seems convert_mult_to_widen fails to fold the inserted conversion, clearly > first widening to __int128 isn't necessary. matching unfolded expressions > looks > wrong to me, so I wonder if we can improve widen-mult code generation instead.
I see, let me have a try to improve that first. Pan -----Original Message----- From: Richard Biener <[email protected]> Sent: Wednesday, September 3, 2025 2:56 PM To: Li, Pan2 <[email protected]> Cc: [email protected]; [email protected]; [email protected]; [email protected]; [email protected]; Chen, Ken <[email protected]>; Liu, Hongtao <[email protected]> Subject: Re: [PATCH v1 1/2] Match: Add form 5 of unsigned SAT_MUL for widen-mul On Wed, Sep 3, 2025 at 3:18 AM Li, Pan2 <[email protected]> wrote: > > Thanks Richard. > > >> + (convert (widen_mult:c@3 (convert@4 (convert @0)) > >> So I added 2 times convert to match above sematics. Do you indicate that > >> we only need one convert here? > > Yes, the above extra widening looks odd to me. > > Remove the inner convert of (convert @0)) will make the pattern fail to match. Ah, I failed to see we match their argument to the result. > The after_dom_children will meet MUL_EXPR first and then the > convert_mult_to_widen > will introduce the convert, as well as wide_mult. It seems convert_mult_to_widen fails to fold the inserted conversion, clearly first widening to __int128 isn't necessary. matching unfolded expressions looks wrong to me, so I wonder if we can improve widen-mult code generation instead. > The 220t.fab1 doesn't have wide_mult, so I think we need to move the sat_mul > before widening_mul pass to get ride of the > Inner convert up to a point. > > 20 │ _1 = (__int128 unsigned) a_8(D); > 21 │ _2 = (__int128 unsigned) b_9(D); > 22 │ x_10 = _1 * _2; > 23 │ _3 = x_10 >> 16; > 24 │ hi_11 = (uint16_t) _3; > 25 │ _4 = hi_11 != 0; > 26 │ _14 = (signed short) _4; > 27 │ _5 = -_14; > 28 │ lo.0_6 = (signed short) x_10; > 29 │ _7 = _5 | lo.0_6; > 30 │ _12 = (uint16_t) _7; > > The 2 times covert will be present from 272t.optmized two. > > Pan > > > -----Original Message----- > From: Richard Biener <[email protected]> > Sent: Tuesday, September 2, 2025 10:01 PM > To: Li, Pan2 <[email protected]> > Cc: [email protected]; [email protected]; [email protected]; > [email protected]; [email protected]; Chen, Ken <[email protected]>; > Liu, Hongtao <[email protected]> > Subject: Re: [PATCH v1 1/2] Match: Add form 5 of unsigned SAT_MUL for > widen-mul > > On Tue, Sep 2, 2025 at 3:35 PM Li, Pan2 <[email protected]> wrote: > > > > Thanks Richard for comments. > > > > > Are the inner (convert ..) around @0 and @1 necessary? The whole > > > thing as written seems to match (long)(short)x * (long)(unsigned int)y > > > for 'char' x and y? > > > > The gimple after wide-mul has 2 times convert similar as below. > > > > 21 <bb 2> [local count: 1073741824]: > > 22 _1 = (__int128 unsigned) a_8(D); > > 23 _2 = (__int128 unsigned) b_9(D); > > 24 _34 = (unsigned long) _1; > > 25 _33 = (unsigned long) _2; > > 26 x_10 = _34 w* _33; > > 27 _3 = x_10 >> 64; > > 28 hi_11 = (uint64_t) _3; > > 29 lo_12 = (uint64_t) x_10; > > > > + (convert (widen_mult:c@3 (convert@4 (convert @0)) > > > > So I added 2 times convert to match above sematics. Do you indicate that we > > only need one convert here? > > Yes, the above extra widening looks odd to me. > > > > I also wonder if there isn't any TYPE_UNSIGNED check > > > missing below given the comment mentions unsigned WT. > > > > The outer checked the type with TYPE_UNSIGNED, and inner checked the > > types_match (type, @0, @1)). > > So I think this ensure it only works for unsigned. > > Ah, OK. > > > 3184 /* Saturation add for unsigned integer. */ > > 3185 (if (INTEGRAL_TYPE_P (type) && TYPE_UNSIGNED (type)) > > > > Pan > > > > -----Original Message----- > > From: Richard Biener <[email protected]> > > Sent: Tuesday, September 2, 2025 8:31 PM > > To: Li, Pan2 <[email protected]> > > Cc: [email protected]; [email protected]; [email protected]; > > [email protected]; [email protected]; Chen, Ken <[email protected]>; > > Liu, Hongtao <[email protected]> > > Subject: Re: [PATCH v1 1/2] Match: Add form 5 of unsigned SAT_MUL for > > widen-mul > > > > On Mon, Sep 1, 2025 at 6:57 AM <[email protected]> wrote: > > > > > > From: Pan Li <[email protected]> > > > > > > This patch would like to try to match the the unsigned > > > SAT_MUL form 4, aka below: > > > > > > #define DEF_SAT_U_MUL_FMT_5(NT, WT) \ > > > NT __attribute__((noinline)) \ > > > sat_u_mul_##NT##_from_##WT##_fmt_5 (NT a, NT b) \ > > > { \ > > > WT x = (WT)a * (WT)b; \ > > > NT hi = x >> (sizeof(NT) * 8); \ > > > NT lo = (NT)x; \ > > > return lo | -!!hi; \ > > > } > > > > > > while WT is uint128_t, T is uint8_t, uint16_t, uint32_t or uint64_t. > > > > > > gcc/ChangeLog: > > > > > > * match.pd: Add pattern for SAT_MUL form 5. > > > * tree-ssa-math-opts.cc > > > (math_opts_dom_walker::after_dom_children): > > > Try match pattern for IOR. > > > > > > Signed-off-by: Pan Li <[email protected]> > > > --- > > > gcc/match.pd | 25 +++++++++++++++++++++++++ > > > gcc/tree-ssa-math-opts.cc | 1 + > > > 2 files changed, 26 insertions(+) > > > > > > diff --git a/gcc/match.pd b/gcc/match.pd > > > index b1d7a3a1b73..4a3587017d1 100644 > > > --- a/gcc/match.pd > > > +++ b/gcc/match.pd > > > @@ -3660,6 +3660,31 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) > > > bool c2_is_type_precision_p = c2 == prec; > > > } > > > (if (widen_prec > prec && c2_is_type_precision_p && c4_is_max_p))))) > > > + (match (unsigned_integer_sat_mul @0 @1) > > > + /* SAT_U_MUL (X, Y) = { > > > + WT x = (WT)a * (WT)b; > > > + NT hi = x >> (sizeof(NT) * 8); > > > + NT lo = (NT)x; > > > + return lo | -!!hi; > > > + } while WT is uint128_t, T is uint8_t, uint16_t, uint32_t or > > > uint64_t. */ > > > + (convert1? > > > + (bit_ior (convert? (negate (convert (ne (convert (rshift @3 > > > INTEGER_CST@2)) > > > + integer_zerop)))) > > > + (convert (widen_mult:c@3 (convert@4 (convert @0)) > > > + (convert@5 (convert @1)))))) > > > > Are the inner (convert ..) around @0 and @1 necessary? The whole > > thing as written seems to match (long)(short)x * (long)(unsigned int)y > > for 'char' x and y? I also wonder if there isn't any TYPE_UNSIGNED check > > missing below given the comment mentions unsigned WT. > > > > > + (if (types_match (type, @0, @1)) > > > + (with > > > + { > > > + unsigned widen_prec = TYPE_PRECISION (TREE_TYPE (@3)); > > > + unsigned cvt4_prec = TYPE_PRECISION (TREE_TYPE (@4)); > > > + unsigned cvt5_prec = TYPE_PRECISION (TREE_TYPE (@5)); > > > + bool widen_mult_p = cvt4_prec == cvt5_prec && widen_prec == > > > cvt5_prec * 2; > > > + > > > + unsigned c2 = tree_to_uhwi (@2); > > > + unsigned prec = TYPE_PRECISION (type); > > > + bool c2_is_type_precision_p = c2 == prec; > > > + } > > > + (if (widen_mult_p && c2_is_type_precision_p))))) > > > ) > > > > > > /* The boundary condition for case 10: IMM = 1: > > > diff --git a/gcc/tree-ssa-math-opts.cc b/gcc/tree-ssa-math-opts.cc > > > index bfad4cf5c99..fcbb752a5e3 100644 > > > --- a/gcc/tree-ssa-math-opts.cc > > > +++ b/gcc/tree-ssa-math-opts.cc > > > @@ -6509,6 +6509,7 @@ math_opts_dom_walker::after_dom_children > > > (basic_block bb) > > > break; > > > > > > case BIT_IOR_EXPR: > > > + match_unsigned_saturation_mul (&gsi, as_a<gassign *> > > > (stmt)); > > > match_saturation_add_with_assign (&gsi, as_a<gassign *> > > > (stmt)); > > > match_unsigned_saturation_trunc (&gsi, as_a<gassign *> > > > (stmt)); > > > /* fall-through */ > > > -- > > > 2.43.0 > > >
