[Bug target/116274] [14 Regression] x86: poor code generation with 16 byte function arguments and addition

cvs-commit at gcc dot gnu.org via Gcc-bugs Wed, 18 Sep 2024 02:31:16 -0700

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116274


--- Comment #11 from GCC Commits <cvs-commit at gcc dot gnu.org> ---
The releases/gcc-14 branch has been updated by Richard Biener
<rgue...@gcc.gnu.org>:

https://gcc.gnu.org/g:d5d4f3bae5a9478dc2189e53da933175a6d7b197

commit r14-10681-gd5d4f3bae5a9478dc2189e53da933175a6d7b197
Author: Richard Biener <rguent...@suse.de>
Date:   Thu Aug 8 11:36:43 2024 +0200

    tree-optimization/116274 - overzealous SLP vectorization

    The following tries to address that the vectorizer fails to have
    precise knowledge of argument and return calling conventions and
    views some accesses as loads and stores that are not.
    This is mainly important when doing basic-block vectorization as
    otherwise loop indexing would force such arguments to memory.

    On x86 the reduction in the number of apparent loads and stores
    often dominates cost analysis so the following tries to mitigate
    this aggressively by adjusting only the scalar load and store
    cost, reducing them to the cost of a simple scalar statement,
    but not touching the vector access cost which would be much
    harder to estimate.  Thereby we err on the side of not performing
    basic-block vectorization.

            PR tree-optimization/116274
            * tree-vect-slp.cc (vect_bb_slp_scalar_cost): Cost scalar loads
            and stores as simple scalar stmts when they access a non-global,
            not address-taken variable that doesn't have BLKmode assigned.

            * gcc.target/i386/pr116274-2.c: New testcase.

    (cherry picked from commit b8ea13ebf1211714503fd72f25c04376483bfa53)
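
For readers without the bug report at hand, here is a minimal sketch of the
kind of code the commit message describes. It is an illustration only and is
not the contents of gcc.target/i386/pr116274-2.c; the names pair16, sum_pair
and add_pairs are hypothetical. It assumes the x86-64 System V ABI, where a
16-byte struct of two 64-bit integers is passed and returned in a pair of
integer registers.

```c
#include <stdint.h>

/* Hypothetical 16-byte aggregate.  Under the x86-64 System V ABI
   (assumed here) it is passed in two integer registers rather than
   in memory. */
struct pair16 {
    int64_t lo;
    int64_t hi;
};

/* The field reads of the by-value argument look like scalar loads to
   the vectorizer, although with register passing no memory access is
   needed for them. */
int64_t sum_pair(struct pair16 p)
{
    return p.lo + p.hi;
}

/* Two adjacent field-wise additions whose results are written to a
   local, non-address-taken aggregate: a candidate for basic-block SLP
   vectorization if the apparent loads and stores are costed as real
   memory accesses. */
struct pair16 add_pairs(struct pair16 a, struct pair16 b)
{
    struct pair16 r;
    r.lo = a.lo + b.lo;
    r.hi = a.hi + b.hi;
    return r;
}
```

Comparing the output of "gcc -O2 -S" for such functions across compiler
versions is one way to see whether the arguments are forced through memory
to build a vector, which is the behaviour the adjusted scalar cost is meant
to discourage.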
