> On 19.05.2023 at 10:00, Jakub Jelinek <ja...@redhat.com> wrote:
>
> Hi!
>
> As can be seen on the following testcase, we pattern recognize it on
> i?86/x86_64 as return __builtin_mul_overflow_p (x, y, 0UL) and avoid
> that way the extra division, but don't do it e.g. on aarch64 or ppc64le,
> even when return __builtin_mul_overflow_p (x, y, 0UL); actually produces
> there better code.  The reason for testing the presence of the optab
> handler is to make sure the generated code for it is short to ensure
> we don't actually pessimize code instead of optimizing it.
> But, we have one case that the internal-fn.cc .MUL_OVERFLOW expansion
> handles nicely, and that is when arguments/result is the same mode
> TYPE_UNSIGNED type, we only use IMAGPART_EXPR of it (i.e.
> __builtin_mul_overflow_p rather than __builtin_mul_overflow) and
> umul_highpart_optab supports the particular mode, in that case
> we emit comparison of the highpart umul result against zero.
>
> So, the following patch matches what we do in internal-fn.cc and
> also pattern matches __builtin_mul_overflow_p if
> 1) we only need the flag whether it overflowed (i.e. !use_seen)
> 2) it is unsigned (i.e. !cast_stmt)
> 3) umul_highpart is supported for the mode
>
> Bootstrapped/regtested on x86_64-linux, i686-linux, aarch64-linux and
> powerpc64le-linux, ok for trunk?

Ok.

Richard

> 2023-05-19  Jakub Jelinek  <ja...@redhat.com>
>
>     PR tree-optimization/101856
>     * tree-ssa-math-opts.cc (match_arith_overflow): Pattern detect
>     unsigned __builtin_mul_overflow_p even when umulv4_optab doesn't
>     support it but umul_highpart_optab does.
>
>     * gcc.dg/tree-ssa/pr101856.c: New test.
>
> --- gcc/tree-ssa-math-opts.cc.jj 2023-05-17 20:57:59.537914382 +0200
> +++ gcc/tree-ssa-math-opts.cc 2023-05-18 12:04:09.332336899 +0200
> @@ -4074,7 +4074,10 @@ match_arith_overflow (gimple_stmt_iterat
>                              TYPE_MODE (type)) == CODE_FOR_nothing)
>        || (code == MULT_EXPR
>            && optab_handler (cast_stmt ? mulv4_optab : umulv4_optab,
> -                            TYPE_MODE (type)) == CODE_FOR_nothing))
> +                            TYPE_MODE (type)) == CODE_FOR_nothing
> +          && (use_seen
> +              || cast_stmt
> +              || !can_mult_highpart_p (TYPE_MODE (type), true))))
>      {
>        if (code != PLUS_EXPR)
>          return false;
> --- gcc/testsuite/gcc.dg/tree-ssa/pr101856.c.jj 2023-05-18 11:57:17.681206745 +0200
> +++ gcc/testsuite/gcc.dg/tree-ssa/pr101856.c 2023-05-18 11:56:51.662577752 +0200
> @@ -0,0 +1,11 @@
> +/* PR tree-optimization/101856 */
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fdump-tree-optimized" } */
> +/* { dg-final { scan-tree-dump " .MUL_OVERFLOW " "optimized" { target i?86-*-* x86_64-*-* aarch64*-*-* powerpc64le-*-* } } } */
> +
> +int
> +foo (unsigned long x, unsigned long y)
> +{
> +  unsigned long z = x * y;
> +  return z / y != x;
> +}
>
>     Jakub
>
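
For readers following the thread, here is a minimal sketch of why the three forms discussed in the patch description agree for unsigned operands. It is not part of the patch. It assumes GCC on an LP64 target (64-bit unsigned long, with unsigned __int128 available), and the function names are hypothetical, chosen only for illustration. The highpart variant only models at the C level what a highpart-multiply expansion computes; it is not the code GCC emits.

```c
#include <assert.h>
#include <limits.h>

/* Hand-written check, as in the testcase from the patch.
   The caller must ensure y != 0, otherwise the division is undefined.  */
static int
overflow_by_division (unsigned long x, unsigned long y)
{
  unsigned long z = x * y;   /* wraps modulo 2^64 on overflow */
  return z / y != x;
}

/* The form the pattern is recognized as: only the overflow flag is used.
   __builtin_mul_overflow_p is a GCC builtin.  */
static int
overflow_by_builtin (unsigned long x, unsigned long y)
{
  return __builtin_mul_overflow_p (x, y, 0UL);
}

/* C-level model of the highpart check: the product overflows exactly
   when the upper half of the double-width product is non-zero.
   Assumes unsigned __int128 is twice as wide as unsigned long.  */
static int
overflow_by_highpart (unsigned long x, unsigned long y)
{
  unsigned __int128 wide = (unsigned __int128) x * y;
  unsigned long high
    = (unsigned long) (wide >> (sizeof (unsigned long) * CHAR_BIT));
  return high != 0;
}

/* Hypothetical helper: return the overflow flag after asserting that
   all three variants agree for the given operands (y must be non-zero).  */
static int
checked_overflow (unsigned long x, unsigned long y)
{
  assert (y != 0);
  int flag = overflow_by_division (x, y);
  assert (flag == overflow_by_builtin (x, y));
  assert (flag == overflow_by_highpart (x, y));
  return flag;
}
```

Calling checked_overflow (ULONG_MAX, 2UL) reports overflow, while checked_overflow (3UL, 4UL) does not. The division form needs a multiply and a divide, whereas the highpart form needs only a highpart multiply and a compare against zero, which is presumably why the patch accepts targets that provide umul_highpart even without umulv4.
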
Re: [PATCH] tree-ssa-math-opts: Pattern recognize hand written __builtin_mul_overflow_p with same unsigned types even when target just has highpart umul [PR101856]
Richard Biener via Gcc-patches Fri, 19 May 2023 03:43:40 -0700