[Bug tree-optimization/109154] [13/14 regression] jump threading de-optimizes nested floating point comparisons

cvs-commit at gcc dot gnu.org via Gcc-bugs Wed, 18 Oct 2023 01:55:07 -0700

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109154


--- Comment #70 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Tamar Christina <[email protected]>:

https://gcc.gnu.org/g:4b39aeef594f311e2c1715f15608f1d7ebc2d868

commit r14-4713-g4b39aeef594f311e2c1715f15608f1d7ebc2d868
Author: Tamar Christina <[email protected]>
Date:   Wed Oct 18 09:32:55 2023 +0100

    middle-end: Fold vec_cond into conditional ternary or binary operation when sharing operand [PR109154]

    When we have a vector conditional on a masked target which is doing a
    selection on the result of a conditional operation where one of the
    operands of the conditional operation is the other operand of the select,
    then we can fold the vector conditional into the operation.

    Concretely this transforms

      c = mask1 ? (masked_op mask2 a b) : b

    into

      c = masked_op (mask1 & mask2) a b

    The mask is then propagated upwards by the compiler.  In the SVE case we
    don't end up needing a mask AND here since `mask2` will end up in the
    instruction creating `mask` which gives us a natural &.
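
The equivalence behind this fold can be checked with a scalar model of a single vector lane. This is only a sketch, not the match.pd rule itself: `masked_add`, `select_of_masked_op`, `folded_masked_op` and `check_fold` are hypothetical names, an add stands in for the conditional operation, and the model assumes that inactive lanes of the masked operation yield `b`.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical one-lane model of a masked add.
   Assumption of this sketch: an inactive lane returns b. */
static int masked_add(bool mask, int a, int b)
{
    return mask ? a + b : b;
}

/* Before the fold: c = mask1 ? (masked_op mask2 a b) : b */
static int select_of_masked_op(bool mask1, bool mask2, int a, int b)
{
    return mask1 ? masked_add(mask2, a, b) : b;
}

/* After the fold: c = masked_op (mask1 & mask2) a b */
static int folded_masked_op(bool mask1, bool mask2, int a, int b)
{
    return masked_add(mask1 && mask2, a, b);
}

/* Asserts that both forms agree for all four mask combinations. */
static void check_fold(int a, int b)
{
    for (int m1 = 0; m1 <= 1; m1++)
        for (int m2 = 0; m2 <= 1; m2++)
            assert(select_of_masked_op(m1, m2, a, b)
                   == folded_masked_op(m1, m2, a, b));
}
```

Calling `check_fold(3, 4)` from a test driver exercises every mask combination. The outer select disappears in the folded form because both of its "false" paths already produce `b`.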

    Such transformations are more common now in GCC 13+ as PRE has now started
    unsharing of common code in case it can make one branch fully independent.

    e.g. in this case `b` becomes a loop invariant value after PRE.

    This transformation removes the extra select for masked architectures but
    doesn't fix the general case.

    gcc/ChangeLog:

            PR tree-optimization/109154
            * match.pd: Add new cond_op rule.

    gcc/testsuite/ChangeLog:

            PR tree-optimization/109154
            * gcc.target/aarch64/sve/pre_cond_share_1.c: New test.

--- Comment #71 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Tamar Christina <[email protected]>:

https://gcc.gnu.org/g:b0fe8f2f960d746e61debd61655f231f503bccaa

commit r14-4714-gb0fe8f2f960d746e61debd61655f231f503bccaa
Author: Tamar Christina <[email protected]>
Date:   Wed Oct 18 09:33:30 2023 +0100

    middle-end: ifcvt: Allow any const IFN in conditional blocks

    When ifcvt was initially added masking was not a thing and as such it was
    rather conservative in what it supported.

    For builtins it only allowed C99 builtin functions which it knew it could
    fold away.

    These days the vectorizer is able to deal with needing to mask IFNs itself.
    vectorizable_call is able to vectorize the IFN by emitting a VEC_PERM_EXPR
    after the operation to emulate the masking.

    This is then used by match.pd to convert the IFN into a masked variant if
    it's available.

    For these reasons the restriction in ifconvert is no longer required and we
    needlessly block vectorization when we can effectively handle the operations.
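
The reason a const function is safe inside a conditional block can be sketched in plain C. Nothing here is taken from the patch: `pure_op` is a hypothetical stand-in for a const internal function, and the second loop is a hand-written approximation of the if-converted shape, with the call executed on every iteration and a select keeping the old value where the condition is false.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a const internal function: no side effects,
   and the result depends only on the argument. */
static double pure_op(double x)
{
    return x * x + 1.0;
}

/* Source form: the call only executes when the condition holds. */
static void conditional_form(double *out, const double *in, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (in[i] > 0.0)
            out[i] = pure_op(in[i]);
}

/* Hand-written if-converted form: the call runs unconditionally and a
   select discards its result where the condition is false. */
static void if_converted_form(double *out, const double *in, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        double t = pure_op(in[i]);
        out[i] = in[i] > 0.0 ? t : out[i];
    }
}

/* Asserts that both forms leave the same values in their output arrays. */
static void check_forms_agree(const double *in, size_t n,
                              double *out_a, double *out_b)
{
    conditional_form(out_a, in, n);
    if_converted_form(out_b, in, n);
    for (size_t i = 0; i < n; i++)
        assert(out_a[i] == out_b[i]);
}
```

Because `pure_op` has no side effects, executing it on lanes where the condition is false cannot change program behaviour, which is the property the relaxed check relies on. The unconditional store in the second loop is a simplification of this sketch.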

    Bootstrapped and regtested on aarch64-none-linux-gnu with no issues.

    Note: This patch is part of a patch series and tests for it are added in the
    AArch64 patch that adds support for the optab.

    gcc/ChangeLog:

            PR tree-optimization/109154
            * tree-if-conv.cc (if_convertible_stmt_p): Allow any const IFN.

[Bug tree-optimization/109154] [13/14 regression] jump threading de-optimizes nested floating point comparisons