Thanks Richard for comments.
> Are the inner (convert ..) around @0 and @1 necessary? The whole
> thing as written seems to match (long)(short)x * (long)(unsigned int)y
> for 'char' x and y?
The gimple after wide-mul has 2 times convert similar as below.
21 <bb 2> [local count: 1073741824]:
22 _1 = (__int128 unsigned) a_8(D);
23 _2 = (__int128 unsigned) b_9(D);
24 _34 = (unsigned long) _1;
25 _33 = (unsigned long) _2;
26 x_10 = _34 w* _33;
27 _3 = x_10 >> 64;
28 hi_11 = (uint64_t) _3;
29 lo_12 = (uint64_t) x_10;
+ (convert (widen_mult:c@3 (convert@4 (convert @0))
So I added 2 times convert to match above sematics. Do you indicate that we
only need one convert here?
> I also wonder if there isn't any TYPE_UNSIGNED check
> missing below given the comment mentions unsigned WT.
The outer checked the type with TYPE_UNSIGNED, and inner checked the
types_match (type, @0, @1)).
So I think this ensure it only works for unsigned.
3184 /* Saturation add for unsigned integer. */
3185 (if (INTEGRAL_TYPE_P (type) && TYPE_UNSIGNED (type))
Pan
-----Original Message-----
From: Richard Biener <[email protected]>
Sent: Tuesday, September 2, 2025 8:31 PM
To: Li, Pan2 <[email protected]>
Cc: [email protected]; [email protected]; [email protected];
[email protected]; [email protected]; Chen, Ken <[email protected]>;
Liu, Hongtao <[email protected]>
Subject: Re: [PATCH v1 1/2] Match: Add form 5 of unsigned SAT_MUL for widen-mul
On Mon, Sep 1, 2025 at 6:57 AM <[email protected]> wrote:
>
> From: Pan Li <[email protected]>
>
> This patch would like to try to match the the unsigned
> SAT_MUL form 4, aka below:
>
> #define DEF_SAT_U_MUL_FMT_5(NT, WT) \
> NT __attribute__((noinline)) \
> sat_u_mul_##NT##_from_##WT##_fmt_5 (NT a, NT b) \
> { \
> WT x = (WT)a * (WT)b; \
> NT hi = x >> (sizeof(NT) * 8); \
> NT lo = (NT)x; \
> return lo | -!!hi; \
> }
>
> while WT is uint128_t, T is uint8_t, uint16_t, uint32_t or uint64_t.
>
> gcc/ChangeLog:
>
> * match.pd: Add pattern for SAT_MUL form 5.
> * tree-ssa-math-opts.cc (math_opts_dom_walker::after_dom_children):
> Try match pattern for IOR.
>
> Signed-off-by: Pan Li <[email protected]>
> ---
> gcc/match.pd | 25 +++++++++++++++++++++++++
> gcc/tree-ssa-math-opts.cc | 1 +
> 2 files changed, 26 insertions(+)
>
> diff --git a/gcc/match.pd b/gcc/match.pd
> index b1d7a3a1b73..4a3587017d1 100644
> --- a/gcc/match.pd
> +++ b/gcc/match.pd
> @@ -3660,6 +3660,31 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
> bool c2_is_type_precision_p = c2 == prec;
> }
> (if (widen_prec > prec && c2_is_type_precision_p && c4_is_max_p)))))
> + (match (unsigned_integer_sat_mul @0 @1)
> + /* SAT_U_MUL (X, Y) = {
> + WT x = (WT)a * (WT)b;
> + NT hi = x >> (sizeof(NT) * 8);
> + NT lo = (NT)x;
> + return lo | -!!hi;
> + } while WT is uint128_t, T is uint8_t, uint16_t, uint32_t or uint64_t.
> */
> + (convert1?
> + (bit_ior (convert? (negate (convert (ne (convert (rshift @3
> INTEGER_CST@2))
> + integer_zerop))))
> + (convert (widen_mult:c@3 (convert@4 (convert @0))
> + (convert@5 (convert @1))))))
Are the inner (convert ..) around @0 and @1 necessary? The whole
thing as written seems to match (long)(short)x * (long)(unsigned int)y
for 'char' x and y? I also wonder if there isn't any TYPE_UNSIGNED check
missing below given the comment mentions unsigned WT.
> + (if (types_match (type, @0, @1))
> + (with
> + {
> + unsigned widen_prec = TYPE_PRECISION (TREE_TYPE (@3));
> + unsigned cvt4_prec = TYPE_PRECISION (TREE_TYPE (@4));
> + unsigned cvt5_prec = TYPE_PRECISION (TREE_TYPE (@5));
> + bool widen_mult_p = cvt4_prec == cvt5_prec && widen_prec == cvt5_prec
> * 2;
> +
> + unsigned c2 = tree_to_uhwi (@2);
> + unsigned prec = TYPE_PRECISION (type);
> + bool c2_is_type_precision_p = c2 == prec;
> + }
> + (if (widen_mult_p && c2_is_type_precision_p)))))
> )
>
> /* The boundary condition for case 10: IMM = 1:
> diff --git a/gcc/tree-ssa-math-opts.cc b/gcc/tree-ssa-math-opts.cc
> index bfad4cf5c99..fcbb752a5e3 100644
> --- a/gcc/tree-ssa-math-opts.cc
> +++ b/gcc/tree-ssa-math-opts.cc
> @@ -6509,6 +6509,7 @@ math_opts_dom_walker::after_dom_children (basic_block
> bb)
> break;
>
> case BIT_IOR_EXPR:
> + match_unsigned_saturation_mul (&gsi, as_a<gassign *> (stmt));
> match_saturation_add_with_assign (&gsi, as_a<gassign *> (stmt));
> match_unsigned_saturation_trunc (&gsi, as_a<gassign *> (stmt));
> /* fall-through */
> --
> 2.43.0
>