https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103376

--- Comment #8 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Jakub Jelinek <ja...@gcc.gnu.org>:

https://gcc.gnu.org/g:04eccbbe3d9a4e9d2f8f43dba8ac4cb686029fb2

commit r12-5492-g04eccbbe3d9a4e9d2f8f43dba8ac4cb686029fb2
Author: Jakub Jelinek <ja...@redhat.com>
Date:   Wed Nov 24 09:54:44 2021 +0100

    bswap: Fix up symbolic merging for xor and plus [PR103376]

    On Mon, Nov 22, 2021 at 08:39:42AM -0000, Roger Sayle wrote:
    > This patch implements PR tree-optimization/103345 to merge adjacent
    > loads when combined with addition or bitwise xor.  The current code
    > in gimple-ssa-store-merging.c's find_bswap_or_nop already handles ior,
    > so all that's required is to treat PLUS_EXPR and BIT_XOR_EXPR in
    > the same way as BIT_IOR_EXPR.
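
    As an illustration only (not taken from the quoted patch or its
    testcase), the kind of adjacent-load pattern the pass recognizes looks
    like this; the quoted change is about treating the + and ^ spellings
    the same way as the | one:

      /* Little-endian 16-bit load assembled from two adjacent byte loads.
         Illustrative sketch, not GCC source.  */
      unsigned short
      load16_le (const unsigned char *p)
      {
        return p[0] | (p[1] << 8);    /* already merged via BIT_IOR_EXPR */
        /* return p[0] + (p[1] << 8);    meant to be merged the same way */
        /* return p[0] ^ (p[1] << 8);    likewise */
      }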

    Unfortunately they aren't exactly the same.  They behave the same only
    when at least one operand (or the corresponding byte in it) is known
    to be 0, since 0 | 0 = 0 ^ 0 = 0 + 0 = 0.  But | additionally has
    x | x = x for any other x, so perform_symbolic_merge has been accepting
    either that at least one of the bytes is 0 or that both bytes are the
    same, and that is wrong for ^ and +.
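
    A concrete byte value makes the difference visible (a standalone
    illustration, not GCC code):

      #include <stdio.h>

      int
      main (void)
      {
        unsigned char x = 0x12;
        printf ("x | x = 0x%02x\n", x | x);   /* 0x12, i.e. still x */
        printf ("x ^ x = 0x%02x\n", x ^ x);   /* 0x00, not x */
        printf ("x + x = 0x%02x\n", x + x);   /* 0x24, not x either */
        return 0;
      }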

    The following patch fixes that by passing the code of the binary
    operation through to perform_symbolic_merge and allowing the non-zero
    masked1 == masked2 case only for BIT_IOR_EXPR.
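
    The per-byte rule the new CODE argument enforces can be sketched like
    this (hypothetical helper with made-up names, not the actual
    perform_symbolic_merge code):

      enum merge_op { OP_IOR, OP_XOR, OP_PLUS };

      /* Return nonzero if a byte position whose two symbolic inputs are
         MASKED1 and MASKED2 may be merged for operation OP.  */
      int
      byte_merge_ok (enum merge_op op,
                     unsigned char masked1, unsigned char masked2)
      {
        if (masked1 == 0 || masked2 == 0)
          return 1;                     /* 0 | x = 0 ^ x = 0 + x = x */
        /* Both non-zero: only | tolerates that, and only when both sides
           carry the same source byte (x | x = x).  */
        return op == OP_IOR && masked1 == masked2;
      }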

    Thinking more about it, perhaps we could do more for BIT_XOR_EXPR.
    We could allow the masked1 == masked2 case for it, but we would need
    to do something different from the
      n->n = n1->n | n2->n;
    that we do on all the bytes together.
    In particular, for masked1 == masked2, if masked1 != 0 (well, for 0
    both variants are the same) and masked1 != 0xff, we would need to
    clear the corresponding n->n byte instead of setting it to the input,
    as x ^ x = 0 (but if we don't know what the operands are, i.e. the
    byte is 0xff, the result is unknown as well).  Now, for plus it is
    much harder, because not only do we not know the result for non-zero
    operands, the addition can modify upper bytes as well through carries.
    So perhaps, when both masked1 and masked2 of the current byte are
    non-zero, we could set the resulting byte to 0xff (unknown) only if
    the byte above it is 0 in both operands, and set that resulting upper
    byte to 0xff too.  Also, even for | we could, instead of returning
    NULL, just set the resulting byte to 0xff when the two bytes differ;
    perhaps it will be masked off later on.
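
    A sketch of the BIT_XOR_EXPR idea above (nothing like this is in the
    committed patch; the helper name is made up, and 0xff stands for the
    pass's "unknown byte" marker):

      unsigned char
      xor_merge_byte (unsigned char masked1, unsigned char masked2)
      {
        if (masked1 == 0)
          return masked2;               /* 0 ^ x = x */
        if (masked2 == 0)
          return masked1;               /* x ^ 0 = x */
        if (masked1 == masked2 && masked1 != 0xff)
          return 0;                     /* x ^ x = 0, byte becomes known zero */
        return 0xff;                    /* otherwise the result byte is unknown */
      }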

    2021-11-24  Jakub Jelinek  <ja...@redhat.com>

            PR tree-optimization/103376
            * gimple-ssa-store-merging.c (perform_symbolic_merge): Add CODE
            argument.  If CODE is not BIT_IOR_EXPR, ensure that one of masked1
            or masked2 is 0.
            (find_bswap_or_nop_1, find_bswap_or_nop,
            imm_store_chain_info::try_coalesce_bswap): Adjust
            perform_symbolic_merge callers.

            * gcc.c-torture/execute/pr103376.c: New test.
