On 6/24/2022 9:09 AM, Roger Sayle wrote:
This patch implements the missed optimization described in PR 94026,
where a the shift can be eliminated from the sequence of a shift,
followed by a bit-wise AND followed by an equality/inequality test.
Specifically, ((X << C1) & C2) cmp C3 into (X & (C2 >> C1)) cmp (C3 >> C1)
and likewise ((X >> C1) & C2) cmp C3 into (X & (C2 << C1)) cmp (C3 << C1)
where cmp is == or !=, and C1, C2 and C3 are integer constants.
The example in the subject line is taken from the hot function
self_atari from the Go program Leela (in SPEC CPU 2017).
This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
and make -k check, both with and without --target_board=unix{-m32}, with
no new failures, OK for mainline?
2022-06-24 Roger Sayle <ro...@nextmovesoftware.com>
gcc/ChangeLog
PR tree-optimization/94026
* match.pd (((X << C1) & C2) eq/ne C3): New simplification.
(((X >> C1) & C2) eq/ne C3): Likewise.
gcc/testsuite/ChangeLog
PR tree-optimization/94026
* gcc.dg/pr94026.c: New test case.
OK. But please check if we still need this code from fold-const.c:
/* Fold ((X >> C1) & C2) == 0 and ((X >> C1) & C2) != 0 where
C1 is a valid shift constant, and C2 is a power of two, i.e.
a single bit. */
if (TREE_CODE (arg0) == BIT_AND_EXPR
&& integer_pow2p (TREE_OPERAND (arg0, 1))
&& integer_zerop (arg1))
[ ... ]
There's a whole series of transformations that are done for equality
comparisons where one side is a constant and the other is combination of
logicals & shifting. Some (like the one noted above) are likely
redundant now. Others may fit better into the match.pd framework rather
than fold-const.
Anyway, the patch is fine, but please take a looksie at the referenced
cases in fold-const.c and see if there's any cleanup/refactoring we
ought to be doing there.
jeff