On Wed, Apr 20, 2022 at 8:35 PM Roger Sayle <ro...@nextmovesoftware.com> wrote:
>
>
> This patch implements the constant folding optimization(s) described in
> PR middle-end/98865, which should help address the serious performance
> regression of Botan AES-128/XTS mentioned in PR tree-optimization/98856.
> This combines aspects of both Jakub Jelinek's patch in comment #2 and
> Andrew Pinski's patch in comment #4, so both are listed as co-authors.
>
> Alas truth_valued_p is not quite what we want (and tweaking its
> definition has undesirable side-effects), so instead this patch
> introduces a new zero_one_valued_p predicate based on tree_nonzero_bits
> that extends truth_valued_p (which is for Boolean types with single
> bit precision).  This is then used to simplify X*Y into X&Y when
> both X and Y are zero_one_valued_p, and simplify X*Y into (-X)&Y when
> X is zero_one_valued_p, in both cases replacing an integer multiplication
> with a cheaper bit-wise AND.
>
> This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
> and make -k check, both with and without --target_board=unix{-m32}, with
> no new failures, except for a tweak required to tree-ssa/vrp116.c.
> The recently proposed cmove patch ensures the i386 backend continues
> to generate identical code for vrp116.c as before.
>
> Ok, either for mainline or when stage 1 reopens?
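
For reference, the two identities quoted above can be checked with a few lines of C. This is only a sketch of the arithmetic, not code from the patch: the function names are made up for illustration, and unsigned operands are assumed so that the negation wraps instead of overflowing.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helpers, not part of the patch. */

/* X*Y when both X and Y are known to be 0 or 1. */
static unsigned mul_zero_one_both(unsigned x, unsigned y)
{
  return x & y;
}

/* X*Y when only X is known to be 0 or 1: -X is either 0 or all-ones,
   so the AND yields either 0 or Y unchanged. */
static unsigned mul_zero_one_by_any(unsigned x, unsigned y)
{
  return (0u - x) & y;
}

/* Returns 1 if both rewrites agree with plain multiplication on the
   sampled inputs, 0 otherwise. */
static int check_identities(void)
{
  static const unsigned ys[] = { 0u, 1u, 2u, 7u, 0x80000000u, 0xffffffffu };

  for (unsigned x = 0; x <= 1; x++)
    {
      for (unsigned y = 0; y <= 1; y++)
        if (x * y != mul_zero_one_both(x, y))
          return 0;
      for (size_t i = 0; i < sizeof ys / sizeof ys[0]; i++)
        if (x * ys[i] != mul_zero_one_by_any(x, ys[i]))
          return 0;
    }
  return 1;
}

/* Convenience wrapper that aborts if either identity fails. */
static void assert_identities(void)
{
  assert(check_identities());
}
```

Calling assert_identities() from a test driver is enough to exercise both rewrites. Note that the second form spends two operations (negate, AND) where the multiply spends one.
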

One issue is that on GIMPLE we consider (bit_and (negate @0) @1) to be
more costly than (mult @0 @1) because it is two operations rather than
one.  So can we at least do (bit_and (negate! @0) @1) and thus require
the negate to be simplified?

Also the

+      && (TREE_CODE (@1) != INTEGER_CST
+          || wi::popcount (wi::to_wide (@1)) > 1))

exception is odd without providing the desired canonicalization to a
shift?

In the end all this looks more like something for RTL (expansion?)
where we can query costs rather than canonicalization (to simpler
expressions) which is what we should do on GIMPLE.

Richard.

>
>
> 2022-04-20  Roger Sayle  <ro...@nextmovesoftware.com>
>             Andrew Pinski  <apin...@marvell.com>
>             Jakub Jelinek  <ja...@redhat.com>
>
> gcc/ChangeLog
>         PR middle-end/98865
>         * match.pd (match zero_one_valued_p): New predicate.
>         (mult @0 @1): Use zero_one_valued_p for transforming into (and @0
>         @1).
>         (mult zero_one_valued_p@0 @1): Convert integer multiplication into
>         a negation and a bit-wise AND, if it can't be cheaply implemented by
>         a single left shift.
>
> gcc/testsuite/ChangeLog
>         PR middle-end/98865
>         * gcc.dg/pr98865.c: New test case.
>         * gcc.dg/vrp116.c: Tweak test to confirm the integer multiplication
>         has been eliminated, not for the actual replacement implementation.
>
> Thanks,
> Roger
> --
>