https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98737

--- Comment #10 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Jakub Jelinek <ja...@gcc.gnu.org>:

https://gcc.gnu.org/g:6362627b27f395b054f359244fcfcb15ac0ac2ab

commit r12-6190-g6362627b27f395b054f359244fcfcb15ac0ac2ab
Author: Jakub Jelinek <ja...@redhat.com>
Date:   Mon Jan 3 14:02:23 2022 +0100

    i386, fab: Optimize __atomic_{add,sub,and,or,xor}_fetch (x, y, z) {==,!=,<,<=,>,>=} 0 [PR98737]

    On Wed, Jan 27, 2021 at 12:27:13PM +0100, Ulrich Drepper via Gcc-patches wrote:
    > On 1/27/21 11:37 AM, Jakub Jelinek wrote:
    > > Would equality comparison against 0 handle the most common cases?
    > >
    > > The user can write it as
    > > __atomic_sub_fetch (x, y, z) == 0
    > > or
    > > __atomic_fetch_sub (x, y, z) - y == 0
    > > though, so the expansion code would need to be able to cope with both.
    >
    > Please also keep !=0, <0, <=0, >0, and >=0 in mind.  They all can be
    > useful and can be handled with the flags.

    <= 0 and > 0 don't really work well with lock {add,sub,inc,dec}, x86 doesn't
    have comparisons that would look solely at both SF and ZF and not at other
    flags (and emitting two separate conditional jumps or two setcc insns and
    oring them together looks awful).

    But the rest can work.
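
A minimal sketch of the source-level patterns discussed above. The struct and
function names are hypothetical and only for illustration. The first two
comparisons are ones the patch aims to expand using the flags of the locked
instruction. The last is the <= 0 case of a subtraction, which the patch
leaves to the generic middle-end expansion. Actual code generation depends on
the compiler version, target and options.

```c
#include <stdbool.h>

/* Hypothetical reference-counted object, for illustration only.  */
struct obj { long refs; };

/* Drop one reference; true when the count reached zero (== 0 case).  */
bool
obj_put (struct obj *o)
{
  return __atomic_sub_fetch (&o->refs, 1, __ATOMIC_ACQ_REL) == 0;
}

/* Subtract n; true when the new value is negative (< 0 case).  */
bool
went_negative (long *counter, long n)
{
  return __atomic_sub_fetch (counter, n, __ATOMIC_SEQ_CST) < 0;
}

/* Subtract n; true when the new value is <= 0.  Per the text above this
   combination is not handled specially for add/sub and falls back.  */
bool
le_zero (long *counter, long n)
{
  return __atomic_sub_fetch (counter, n, __ATOMIC_SEQ_CST) <= 0;
}
```

All three functions return the same results whether or not the optimization
applies; only the generated instruction sequence is expected to differ.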

    Here is a patch that adds internal functions and optabs for these,
    recognizes them at the same spot as e.g. .ATOMIC_BIT_TEST_AND* internal
    functions (fold all builtins pass) and expands them appropriately (or for
    the <= 0 and > 0 cases of +/- FAILs and lets the middle-end fall back).

    So far I have handled just the op_fetch builtins, IMHO instead of handling
    also __atomic_fetch_sub (x, y, z) - y == 0 etc. we should canonicalize
    __atomic_fetch_sub (x, y, z) - y to __atomic_sub_fetch (x, y, z) (and vice
    versa).
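
To illustrate why the two spellings could be canonicalized into one another,
here is a small sketch showing that they compute the same new value. The
function names are hypothetical, and the sketch assumes values small enough
that the signed subtraction does not overflow.

```c
/* New value obtained directly from the op_fetch form.  */
long
via_sub_fetch (long *x, long y)
{
  return __atomic_sub_fetch (x, y, __ATOMIC_SEQ_CST);
}

/* Same new value recomputed from the fetch_op form: the builtin returns
   the old value, so subtracting y again yields the stored result.  */
long
via_fetch_sub (long *x, long y)
{
  return __atomic_fetch_sub (x, y, __ATOMIC_SEQ_CST) - y;
}
```

Both functions leave *x decremented by y and return the decremented value, so
a comparison of either result against 0 tests the same condition.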

    2022-01-03  Jakub Jelinek  <ja...@redhat.com>

            PR target/98737
            * internal-fn.def (ATOMIC_ADD_FETCH_CMP_0, ATOMIC_SUB_FETCH_CMP_0,
            ATOMIC_AND_FETCH_CMP_0, ATOMIC_OR_FETCH_CMP_0,
            ATOMIC_XOR_FETCH_CMP_0): New internal fns.
            * internal-fn.h (ATOMIC_OP_FETCH_CMP_0_EQ, ATOMIC_OP_FETCH_CMP_0_NE,
            ATOMIC_OP_FETCH_CMP_0_LT, ATOMIC_OP_FETCH_CMP_0_LE,
            ATOMIC_OP_FETCH_CMP_0_GT, ATOMIC_OP_FETCH_CMP_0_GE): New enumerators.
            * internal-fn.c (expand_ATOMIC_ADD_FETCH_CMP_0,
            expand_ATOMIC_SUB_FETCH_CMP_0, expand_ATOMIC_AND_FETCH_CMP_0,
            expand_ATOMIC_OR_FETCH_CMP_0, expand_ATOMIC_XOR_FETCH_CMP_0): New
            functions.
            * optabs.def (atomic_add_fetch_cmp_0_optab,
            atomic_sub_fetch_cmp_0_optab, atomic_and_fetch_cmp_0_optab,
            atomic_or_fetch_cmp_0_optab, atomic_xor_fetch_cmp_0_optab): New
            direct optabs.
            * builtins.h (expand_ifn_atomic_op_fetch_cmp_0): Declare.
            * builtins.c (expand_ifn_atomic_op_fetch_cmp_0): New function.
            * tree-ssa-ccp.c: Include internal-fn.h.
            (optimize_atomic_bit_test_and): Add . before internal fn call
            in function comment.  Change return type from void to bool and
            return true only if successfully replaced.
            (optimize_atomic_op_fetch_cmp_0): New function.
            (pass_fold_builtins::execute): Use optimize_atomic_op_fetch_cmp_0
            for BUILT_IN_ATOMIC_{ADD,SUB,AND,OR,XOR}_FETCH_{1,2,4,8,16} and
            BUILT_IN_SYNC_{ADD,SUB,AND,OR,XOR}_AND_FETCH_{1,2,4,8,16},
            for *XOR* ones only if optimize_atomic_bit_test_and failed.
            * config/i386/sync.md (atomic_<plusminus_mnemonic>_fetch_cmp_0<mode>,
            atomic_<logic>_fetch_cmp_0<mode>): New define_expand patterns.
            (atomic_add_fetch_cmp_0<mode>_1, atomic_sub_fetch_cmp_0<mode>_1,
            atomic_<logic>_fetch_cmp_0<mode>_1): New define_insn patterns.
            * doc/md.texi (atomic_add_fetch_cmp_0<mode>,
            atomic_sub_fetch_cmp_0<mode>, atomic_and_fetch_cmp_0<mode>,
            atomic_or_fetch_cmp_0<mode>, atomic_xor_fetch_cmp_0<mode>): Document
            new named patterns.

            * gcc.target/i386/pr98737-1.c: New test.
            * gcc.target/i386/pr98737-2.c: New test.
            * gcc.target/i386/pr98737-3.c: New test.
            * gcc.target/i386/pr98737-4.c: New test.
            * gcc.target/i386/pr98737-5.c: New test.
            * gcc.target/i386/pr98737-6.c: New test.
            * gcc.target/i386/pr98737-7.c: New test.
