https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92080
--- Comment #12 from GCC Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by H.J. Lu <h...@gcc.gnu.org>:

https://gcc.gnu.org/g:d1cada7481420a23fbec525548ef5bdf64839a34

commit r16-271-gd1cada7481420a23fbec525548ef5bdf64839a34
Author: H.J. Lu <hjl.to...@gmail.com>
Date:   Fri Nov 29 18:22:14 2024 +0800

    x86: Add a pass to remove redundant all 0s/1s vector load

    For all different modes of all 0s/1s vectors, we can use the single
    widest all 0s/1s vector register for all 0s/1s vector uses in the
    whole function.  Add a pass to generate a single widest all 0s/1s
    vector set instruction at entry of the nearest common dominator for
    basic blocks with all 0s/1s vector uses.  On Linux/x86-64, in cc1plus,
    this patch reduces the number of vector xor instructions from 4803 to
    4714 and pcmpeq instructions from 144 to 142.

    NB: PR target/92080 and PR target/117839 aren't same.  PR target/117839
    is for vectors of all 0s and all 1s with different sizes and different
    components.  PR target/92080 is for broadcast of the same component to
    different vector sizes.  This patch covers only all 0s and all 1s cases
    of PR target/92080.

    gcc/

            PR target/92080
            PR target/117839
            * config/i386/i386-features.cc (ix86_place_single_vector_set):
            New function.
            (remove_partial_avx_dependency): Use it.
            (ix86_get_vector_load_mode): New function.
            (replace_vector_const): Likewise.
            (remove_redundant_vector_load): Likewise.
            (pass_data_remove_redundant_vector_load): Likewise.
            (pass_remove_redundant_vector_load): Likewise.
            (make_pass_remove_redundant_vector_load): Likewise.
            * config/i386/i386-passes.def: Add
            pass_remove_redundant_vector_load after
            pass_remove_partial_avx_dependency.
            * config/i386/i386-protos.h
            (make_pass_remove_redundant_vector_load): New.
            * config/i386/i386.cc (ix86_modes_tieable_p): Return true for
            narrower non-scalar-integer modes in SSE registers.

    gcc/testsuite/

            PR target/92080
            PR target/117839
            * gcc.target/i386/pr117839-1a.c: New test.
            * gcc.target/i386/pr117839-1b.c: Likewise.
            * gcc.target/i386/pr117839-2.c: Likewise.
            * gcc.target/i386/pr92080-1.c: Likewise.
            * gcc.target/i386/pr92080-2.c: Likewise.
            * gcc.target/i386/pr92080-3.c: Likewise.

    Signed-off-by: H.J. Lu <hjl.to...@gmail.com>
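
For illustration only, here is a minimal sketch of the kind of source the commit message describes: one function that uses all-0s and all-1s vectors at two different widths. It is not one of the testcases added by the commit, and every name in it (v4si, v8si, fill_vectors, all_bytes_equal and the globals) is hypothetical. Whether the compiler actually materializes one widest zero/ones register and reuses it for the narrower stores depends on the GCC version and on the flags used (for example -O2 -mavx2); the code behaves the same either way.

```c
#include <stddef.h>

/* GCC vector extension types: 128-bit and 256-bit vectors of int.
   These typedef names are illustrative, not taken from the commit.  */
typedef int v4si __attribute__ ((vector_size (16)));
typedef int v8si __attribute__ ((vector_size (32)));

v4si zero128, ones128;
v8si zero256, ones256;

/* All-0s and all-1s constants are used at two widths in one function.
   Without sharing, each width may get its own xor/pcmpeq; the pass
   described above aims to set the widest register once and reuse it.  */
void
fill_vectors (void)
{
  zero128 = (v4si) { 0, 0, 0, 0 };
  zero256 = (v8si) { 0, 0, 0, 0, 0, 0, 0, 0 };
  ones128 = (v4si) { -1, -1, -1, -1 };
  ones256 = (v8si) { -1, -1, -1, -1, -1, -1, -1, -1 };
}

/* Helper: return 1 if every byte of the N-byte object at P equals VALUE.  */
int
all_bytes_equal (const void *p, size_t n, unsigned char value)
{
  const unsigned char *bytes = p;
  for (size_t i = 0; i < n; i++)
    if (bytes[i] != value)
      return 0;
  return 1;
}
```

One way to look for the effect is to compile this with and without the change and count the vector xor and pcmpeq instructions in the generated assembly, which is the same metric the commit message reports for cc1plus.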