> -----Original Message-----
> From: "Andrew Pinski" <pins...@gmail.com>
> Sent: 2024-10-24 10:23:01 (Thursday)
> To: "Li Xu" <xu...@eswincomputing.com>
> Cc: gcc-patches@gcc.gnu.org, kito.ch...@gmail.com, richard.guent...@gmail.com,
> tamar.christ...@arm.com, juzhe.zh...@rivai.ai, pan2...@intel.com,
> jeffreya...@gmail.com, rdapp....@gmail.com
> Subject: Re: [PATCH 1/2 v3] Match: Simplify unsigned scalar sat_sub(x, 1) to
> (x - x != 0)
>
> On Wed, Oct 23, 2024 at 2:08 AM Li Xu <xu...@eswincomputing.com> wrote:
> >
> > From: xuli <xu...@eswincomputing.com>
> >
> > When the imm operand op1=1 in the unsigned scalar sat_sub form2 below,
> > we can simplify (x != 0 ? x + max : 0) to (x - x != 0), thereby eliminating
> > a branch instruction.
> >
> > Form2:
> > T __attribute__((noinline)) \
> > sat_u_sub_imm##IMM##_##T##_fmt_2 (T x) \
> > { \
> > return x >= (T)IMM ? x - (T)IMM : 0; \
> > }
> >
> > Take below form 2 as example:
> > DEF_SAT_U_SUB_IMM_FMT_2(uint8_t, 1)
> >
> > Before this patch:
> > __attribute__((noinline))
> > uint8_t sat_u_sub_imm1_uint8_t_fmt_2 (uint8_t x)
> > {
> > uint8_t _1;
> > uint8_t _3;
> >
> > <bb 2> [local count: 1073741824]:
> > if (x_2(D) != 0)
> > goto <bb 3>; [50.00%]
> > else
> > goto <bb 4>; [50.00%]
> >
> > <bb 3> [local count: 536870912]:
> > _3 = x_2(D) + 255;
> >
> > <bb 4> [local count: 1073741824]:
> > # _1 = PHI <x_2(D)(2), _3(3)>
> > return _1;
> >
> > }
> >
> > Assembly code:
> > sat_u_sub_imm1_uint8_t_fmt_2:
> > beq a0,zero,.L2
> > addiw a0,a0,-1
> > andi a0,a0,0xff
> > .L2:
> > ret
> >
> > After this patch:
> > __attribute__((noinline))
> > uint8_t sat_u_sub_imm1_uint8_t_fmt_2 (uint8_t x)
> > {
> > _Bool _1;
> > unsigned char _2;
> > uint8_t _4;
> >
> > <bb 2> [local count: 1073741824]:
> > _1 = x_3(D) != 0;
> > _2 = (unsigned char) _1;
> > _4 = x_3(D) - _2;
> > return _4;
> >
> > }
> >
> > Assembly code:
> > sat_u_sub_imm1_uint8_t_fmt_2:
> > snez a5,a0
> > subw a0,a0,a5
> > andi a0,a0,0xff
> > ret
> >
> > The below test suites are passed for this patch:
> > 1. The rv64gcv fully regression tests.
> > 2. The x86 bootstrap tests.
> > 3. The x86 fully regression tests.
> >
> > Signed-off-by: Li Xu <xu...@eswincomputing.com>
> > gcc/ChangeLog:
> >
> > * match.pd: Simplify (x != 0 ? x + max : 0) to (x - x != 0).
> >
> > gcc/testsuite/ChangeLog:
> >
> > * gcc.dg/tree-ssa/phi-opt-44.c: New test.
> >
> > ---
> > gcc/match.pd | 9 ++++++++
> > gcc/testsuite/gcc.dg/tree-ssa/phi-opt-44.c | 26 ++++++++++++++++++++++
> > 2 files changed, 35 insertions(+)
> > create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/phi-opt-44.c
> >
> > diff --git a/gcc/match.pd b/gcc/match.pd
> > index 0455dfa6993..6a245f8e0d3 100644
> > --- a/gcc/match.pd
> > +++ b/gcc/match.pd
> > @@ -3383,6 +3383,15 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
> > }
> > (if (wi::eq_p (sum, wi::uhwi (0, precision)))))))
> >
> > +/* The boundary condition for case 10: IMM = 1:
> > + SAT_U_SUB = X >= IMM ? (X - IMM) : 0.
> > + simplify (X != 0 ? X + max : 0) to (X - X != 0). */
> > +(simplify
> > + (cond (ne @0 integer_zerop) (plus @0 integer_all_onesp) integer_zerop)
> > + (if (INTEGRAL_TYPE_P (type) && TYPE_UNSIGNED (type)
> > + && types_match (type, @0))
>
> You don't need either TYPE_UNSIGNED any more because integer_all_onesp
> handles both sign and unsigned "all ones".
> You also don't need types_match since they will match due to the rules
> of PLUS_EXPR.
Thanks for reviewing.
This simplification only applies to unsigned saturating subtraction. For
example, with x = 3 and max = 255 (uint8_t) or 127 (int8_t):

          (X != 0 ? X + max : 0)           (X - (X != 0))
uint8_t   (3 != 0 ? 3 + 255 : 0) = 2       (3 - (3 != 0)) = 2    correct
int8_t    (3 != 0 ? 3 + 127 : 0) = -126    (3 - (3 != 0)) = 2    not correct

I think the condition should be changed to the following form:
(if (INTEGRAL_TYPE_P (type) && TYPE_UNSIGNED (type))
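
To double-check the unsigned arithmetic, here is a small sketch in plain C.
It is only an illustration and not part of the patch: the helper names
(sat_sub1_branch, sat_sub1_branchless, sat_sub1_ref, count_mismatches,
self_check) are made up for this mail, and it covers uint8_t only. It
compares the branchy form and the branchless form against a reference
saturating subtract for every possible input.

```c
#include <assert.h>
#include <stdint.h>

/* Branchy form matched by the pattern: x != 0 ? x + max : 0 (max = 255). */
static uint8_t sat_sub1_branch (uint8_t x)
{
  return x != 0 ? (uint8_t) (x + 255) : 0;
}

/* Branchless replacement: x - (x != 0). */
static uint8_t sat_sub1_branchless (uint8_t x)
{
  return (uint8_t) (x - (x != 0));
}

/* Reference: unsigned saturating subtraction of 1. */
static uint8_t sat_sub1_ref (uint8_t x)
{
  return x >= 1 ? (uint8_t) (x - 1) : 0;
}

/* Count the uint8_t inputs on which either form disagrees with the
   reference.  */
static int count_mismatches (void)
{
  int bad = 0;
  for (unsigned v = 0; v <= UINT8_MAX; v++)
    {
      uint8_t x = (uint8_t) v;
      if (sat_sub1_branch (x) != sat_sub1_ref (x)
          || sat_sub1_branchless (x) != sat_sub1_ref (x))
        bad++;
    }
  return bad;
}

/* Hypothetical helper: aborts if any input disagrees.  */
static void self_check (void)
{
  assert (count_mismatches () == 0);
}
```

Calling self_check () from a test driver exercises all 256 inputs. The
casts back to uint8_t matter because x + 255 is computed in int after
integer promotion.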
>
> Once you remove the UNSIGNED check, please add testcases for the
> signed case too.
>
> > + (minus @0 (convert (ne @0 { build_zero_cst (type); })))))
>
> Note if you capture the ne expression originally you can just reuse
> that instead of recreating it.
I am not familiar with the simplify syntax. Do you mean something like this?
(minus @0 (convert (ne @0 integer_zerop))))
integer_zerop is a predicate, so writing it this way in the result
expression does not work.
How can I reuse the original ne expression?
>
> Sorry I missed that in the original review.
>
> Other than that this looks good to me (but I can't approve it).
>
> Thanks,
> Andrew
>
> > +
> > /* Signed saturation sub, case 1:
> > T minus = (T)((UT)X - (UT)Y);
> > SAT_S_SUB = (X ^ Y) & (X ^ minus) < 0 ? (-(T)(X < 0) ^ MAX) : minus;
> > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/phi-opt-44.c
> > b/gcc/testsuite/gcc.dg/tree-ssa/phi-opt-44.c
> > new file mode 100644
> > index 00000000000..756ba065d84
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.dg/tree-ssa/phi-opt-44.c
> > @@ -0,0 +1,26 @@
> > +/* { dg-do compile } */
> > +/* { dg-options "-O2 -fdump-tree-phiopt1" } */
> > +
> > +#include <stdint.h>
> > +
> > +uint8_t sat_u_imm1_uint8_t (uint8_t x)
> > +{
> > + return x >= (uint8_t)1 ? x - (uint8_t)1 : 0;
> > +}
> > +
> > +uint16_t sat_u_imm1_uint16_t (uint16_t x)
> > +{
> > + return x >= (uint16_t)1 ? x - (uint16_t)1 : 0;
> > +}
> > +
> > +uint32_t sat_u_imm1_uint32_t (uint32_t x)
> > +{
> > + return x >= (uint32_t)1 ? x - (uint32_t)1 : 0;
> > +}
> > +
> > +uint64_t sat_u_imm1_uint64_t (uint64_t x)
> > +{
> > + return x >= (uint64_t)1 ? x - (uint64_t)1 : 0;
> > +}
> > +
> > +/* { dg-final { scan-tree-dump-not "goto" "phiopt1" } } */
> > --
> > 2.17.1
> >