On 6/24/2022 9:09 AM, Roger Sayle wrote:
This patch implements the missed optimization described in PR 94026,
where the shift can be eliminated from a sequence consisting of a shift,
followed by a bit-wise AND, followed by an equality/inequality test.
Specifically, ((X << C1) & C2) cmp C3 is simplified to (X & (C2 >> C1)) cmp (C3 >> C1),
and likewise ((X >> C1) & C2) cmp C3 to (X & (C2 << C1)) cmp (C3 << C1),
where cmp is == or !=, and C1, C2 and C3 are integer constants.
The example in the subject line is taken from the hot function
self_atari from the Go program Leela (in SPEC CPU 2017).
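
A minimal sketch of the kind of code the new rule targets (the function
names and constants below are hypothetical, chosen only for illustration;
they are not taken from Leela or from the new testcase):

    /* Before: shift, AND, compare.  */
    int before (unsigned int x)
    {
      return ((x << 2) & 60) == 12;
    }

    /* After applying ((X << C1) & C2) cmp C3
         -> (X & (C2 >> C1)) cmp (C3 >> C1),
       the shift is gone: 60 >> 2 == 15 and 12 >> 2 == 3.  */
    int after (unsigned int x)
    {
      return (x & 15) == 3;
    }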

This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
and make -k check, both with and without --target_board=unix{-m32}, with
no new failures.  OK for mainline?


2022-06-24  Roger Sayle  <ro...@nextmovesoftware.com>

gcc/ChangeLog
        PR tree-optimization/94026
        * match.pd (((X << C1) & C2) eq/ne C3): New simplification.
        (((X >> C1) & C2) eq/ne C3): Likewise.

gcc/testsuite/ChangeLog
        PR tree-optimization/94026
        * gcc.dg/pr94026.c: New test case.

OK.  But please check if we still need this code from fold-const.c:

      /* Fold ((X >> C1) & C2) == 0 and ((X >> C1) & C2) != 0 where
         C1 is a valid shift constant, and C2 is a power of two, i.e.
         a single bit.  */
      if (TREE_CODE (arg0) == BIT_AND_EXPR
          && integer_pow2p (TREE_OPERAND (arg0, 1))
          && integer_zerop (arg1))
[ ... ]

There's a whole series of transformations that are done for equality comparisons where one side is a constant and the other is a combination of logicals & shifting.  Some (like the one noted above) are likely redundant now.  Others may fit better into the match.pd framework rather than fold-const.
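
For instance (a hypothetical example, not from the testsuite), a
single-bit test like the one below is handled by the quoted fold-const.c
code, but it is also just the C3 == 0, C2 == single-bit special case of
the new match.pd rule:

    /* ((X >> C1) & C2) != 0 with C2 a power of two -- the case the
       fold-const.c code above handles.  The new rule
       ((X >> C1) & C2) cmp C3 -> (X & (C2 << C1)) cmp (C3 << C1)
       likewise turns this into (x & 8) != 0.  */
    int bit_test (unsigned int x)
    {
      return ((x >> 3) & 1) != 0;
    }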

Anyway, the patch is fine, but please take a looksie at the referenced cases in fold-const.c and see if there's any cleanup/refactoring we ought to be doing there.

jeff
