https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117839

--- Comment #5 from GCC Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by H.J. Lu <h...@gcc.gnu.org>:

https://gcc.gnu.org/g:d1cada7481420a23fbec525548ef5bdf64839a34

commit r16-271-gd1cada7481420a23fbec525548ef5bdf64839a34
Author: H.J. Lu <hjl.to...@gmail.com>
Date:   Fri Nov 29 18:22:14 2024 +0800

    x86: Add a pass to remove redundant all 0s/1s vector load

    For all different modes of all 0s/1s vectors, we can use the single widest
    all 0s/1s vector register for all 0s/1s vector uses in the whole function.
    Add a pass to generate a single widest all 0s/1s vector set instruction at
    entry of the nearest common dominator for basic blocks with all 0s/1s
    vector uses.  On Linux/x86-64, in cc1plus, this patch reduces the number
    of vector xor instructions from 4803 to 4714 and pcmpeq instructions from
    144 to 142.

    NB: PR target/92080 and PR target/117839 aren't the same.  PR target/117839
    is for vectors of all 0s and all 1s with different sizes and different
    components.  PR target/92080 is for broadcast of the same component to
    different vector sizes.  This patch covers only the all 0s and all 1s
    cases of PR target/92080.

    gcc/

            PR target/92080
            PR target/117839
            * config/i386/i386-features.cc (ix86_place_single_vector_set):
            New function.
            (remove_partial_avx_dependency): Use it.
            (ix86_get_vector_load_mode): New function.
            (replace_vector_const): Likewise.
            (remove_redundant_vector_load): Likewise.
            (pass_data_remove_redundant_vector_load): Likewise.
            (pass_remove_redundant_vector_load): Likewise.
            (make_pass_remove_redundant_vector_load): Likewise.
            * config/i386/i386-passes.def: Add
            pass_remove_redundant_vector_load after
            pass_remove_partial_avx_dependency.
            * config/i386/i386-protos.h
            (make_pass_remove_redundant_vector_load): New.
            * config/i386/i386.cc (ix86_modes_tieable_p): Return true for
            narrower non-scalar-integer modes in SSE registers.

    gcc/testsuite/

            PR target/92080
            PR target/117839
            * gcc.target/i386/pr117839-1a.c: New test.
            * gcc.target/i386/pr117839-1b.c: Likewise.
            * gcc.target/i386/pr117839-2.c: Likewise.
            * gcc.target/i386/pr92080-1.c: Likewise.
            * gcc.target/i386/pr92080-2.c: Likewise.
            * gcc.target/i386/pr92080-3.c: Likewise.

    Signed-off-by: H.J. Lu <hjl.to...@gmail.com>
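
For illustration only, here is a minimal sketch of the kind of source the commit message describes: one function that needs an all-0s vector in two different modes (a 128-bit integer vector and a 256-bit float vector). It is not one of the committed tests. The names `store_zeros`, `out128`, `out256`, `all_bytes_zero` and `run_demo` are hypothetical, and the assumption is that it would be compiled with something like -O2 -mavx on x86-64. Without such a pass, each use may get its own zeroing instruction; with it, the expectation is a single widest all-0s register set at the nearest common dominator (here, the function entry), reused by both stores.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* GNU C vector extension types: 128-bit and 256-bit vectors whose
   element types differ (int vs. float).  */
typedef int v4si __attribute__ ((vector_size (16)));
typedef float v8sf __attribute__ ((vector_size (32)));

v4si out128;
v8sf out256;

/* Hypothetical example: two all-0s vector uses in different modes.
   The first store is unconditional and the second sits in a
   conditionally executed block, so the nearest common dominator of
   both uses is the entry block.  */
void
store_zeros (int wide)
{
  out128 = (v4si) { 0, 0, 0, 0 };
  if (wide)
    out256 = (v8sf) { 0.0f, 0.0f, 0.0f, 0.0f, 0.0f, 0.0f, 0.0f, 0.0f };
}

/* Return 1 if all N bytes at P are zero, 0 otherwise.  */
int
all_bytes_zero (const void *p, size_t n)
{
  const unsigned char *b = p;
  for (size_t i = 0; i < n; i++)
    if (b[i] != 0)
      return 0;
  return 1;
}

/* Poison both globals, run the function, and check the results.  */
int
run_demo (void)
{
  memset (&out128, 0xff, sizeof out128);
  memset (&out256, 0xff, sizeof out256);
  store_zeros (1);
  assert (all_bytes_zero (&out128, sizeof out128));
  return all_bytes_zero (&out256, sizeof out256);
}
```

The sketch only checks run-time behaviour, which must be the same with or without the pass. Whether the zeroing instructions are actually merged would have to be confirmed by inspecting the generated assembly.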