On Thu, May 07, 2020 at 10:04:35AM +0200, Richard Biener wrote:
> On Thu, 7 May 2020, Jakub Jelinek wrote:
> > The ffs expanders on several targets (x86, ia64, aarch64 at least)
> > emit a conditional move or similar code to handle the case when the
> > argument is 0, which makes the code longer.
> > If we know from VRP that the argument will not be zero, we can (if the
> > target has also an ctz expander) just use ctz which is undefined at zero
> > and thus the expander doesn't need to deal with that.
> >
> > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
>
> can you use direct_internal_fn_supported_p (IFN_CTZ, type,
> OPTIMIZE_FOR_SPEED)?

Only if it is guarded with #if GIMPLE (because otherwise the fn isn't
declared).  Though, restricting this to GIMPLE seems like a good idea
anyway to me.

Ok for trunk if it passes bootstrap/regtest?

2020-05-07  Jakub Jelinek  <ja...@redhat.com>

	PR tree-optimization/94956
	* match.pd (FFS): Optimize __builtin_ffs* of non-zero argument into
	__builtin_ctz* + 1 if direct IFN_CTZ is supported.

	* gcc.target/i386/pr94956.c: New test.

--- gcc/match.pd.jj	2020-05-06 15:03:51.618058839 +0200
+++ gcc/match.pd	2020-05-07 16:16:48.466970168 +0200
@@ -5986,6 +5986,16 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
       && direct_internal_fn_supported_p (IFN_POPCOUNT, type,
                                          OPTIMIZE_FOR_BOTH))
     (convert (IFN_POPCOUNT:type @0)))))
+
+/* __builtin_ffs needs to deal on many targets with the possible zero
+   argument.  If we know the argument is always non-zero, __builtin_ctz + 1
+   should lead to better code.  */
+(simplify
+  (FFS tree_expr_nonzero_p@0)
+  (if (INTEGRAL_TYPE_P (TREE_TYPE (@0))
+       && direct_internal_fn_supported_p (IFN_CTZ, TREE_TYPE (@0),
+                                          OPTIMIZE_FOR_SPEED))
+   (plus (CTZ:type @0) { build_one_cst (type); })))
 #endif
 
 /* Simplify:
--- gcc/testsuite/gcc.target/i386/pr94956.c.jj	2020-05-06 16:35:47.085876237 +0200
+++ gcc/testsuite/gcc.target/i386/pr94956.c	2020-05-06 16:39:52.927140038 +0200
@@ -0,0 +1,28 @@
+/* PR tree-optimization/94956 */
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+/* { dg-final { scan-assembler-not "\tcmovne\t" } } */
+/* { dg-final { scan-assembler-not "\tsete\t" } } */
+
+int
+foo (unsigned x)
+{
+  if (x == 0) __builtin_unreachable ();
+  return __builtin_ffs (x) - 1;
+}
+
+int
+bar (unsigned long x)
+{
+  if (x == 0) __builtin_unreachable ();
+  return __builtin_ffsl (x) - 1;
+}
+
+#ifdef __x86_64__
+int
+baz (unsigned long long x)
+{
+  if (x == 0) __builtin_unreachable ();
+  return __builtin_ffsll (x) - 1;
+}
+#endif

	Jakub
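
For readers who want to convince themselves of the identity the new rule
relies on, here is a minimal sketch in plain C.  It is not part of the
patch: the helper names ffs_via_ctz and count_ffs_mismatches are made up
for illustration, and the sample values are arbitrary.  It only checks
that __builtin_ffs (x) equals __builtin_ctz (x) + 1 for a few non-zero
inputs; zero is deliberately excluded because __builtin_ctz is undefined
there.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper (not in the patch): the value the match.pd rule
   produces for a non-zero argument.  The caller must guarantee x != 0,
   since __builtin_ctz is undefined at zero.  */
int
ffs_via_ctz (unsigned x)
{
  return __builtin_ctz (x) + 1;
}

/* Hypothetical check: count how many non-zero samples disagree between
   __builtin_ffs and the ctz-based form.  */
int
count_ffs_mismatches (void)
{
  const unsigned samples[] = { 1u, 2u, 3u, 8u, 0x1000u, UINT_MAX };
  int mismatches = 0;

  for (size_t i = 0; i < sizeof samples / sizeof samples[0]; i++)
    if (__builtin_ffs ((int) samples[i]) != ffs_via_ctz (samples[i]))
      mismatches++;
  return mismatches;
}
```

Calling count_ffs_mismatches () from a small main and checking that it
returns 0 is enough to exercise the sketch with GCC or a compatible
compiler.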