https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96906
Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jakub at gcc dot gnu.org

--- Comment #4 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Created attachment 49621
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=49621&action=edit
gcc11-pr96906.patch

Looking over Agner Fog's table, pminu[bw] and psubus[bw] seem to have the same
timing. This untested patch does the optimization in the combiner for
SSE2/SSE4.1/AVX2, but handling AVX512BW and AVX512BW+AVX512VL will need further
define_insn_and_split patterns, which I don't have cycles for right now (they
would have to match the unspec comparisons in there).
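For readers following along, here is a minimal sketch of why the two instruction families can stand in for each other. It assumes the optimization concerns the unsigned comparison x <= y, which can be written either as "saturating subtract equals zero" (psubus) or as "unsigned minimum equals x" (pminu). This is not taken from the attached patch. The helpers lane_psubusb, lane_pminub, count_mismatches and self_check are hypothetical names that model a single byte lane in portable C, not GCC internals or intrinsics.

```c
#include <assert.h>
#include <stdint.h>

/* Models one unsigned byte lane of psubusb: subtraction that clamps at 0. */
static uint8_t lane_psubusb(uint8_t x, uint8_t y) {
    return x > y ? (uint8_t)(x - y) : 0;
}

/* Models one unsigned byte lane of pminub: unsigned minimum. */
static uint8_t lane_pminub(uint8_t x, uint8_t y) {
    return x < y ? x : y;
}

/* Counts the (x, y) byte pairs for which either idiom disagrees with x <= y. */
static int count_mismatches(void) {
    int bad = 0;
    for (int x = 0; x < 256; x++) {
        for (int y = 0; y < 256; y++) {
            int le = x <= y;
            int via_sub = lane_psubusb((uint8_t)x, (uint8_t)y) == 0;
            int via_min = lane_pminub((uint8_t)x, (uint8_t)y) == (uint8_t)x;
            if (via_sub != le || via_min != le)
                bad++;
        }
    }
    return bad;
}

/* Exhaustive check over all 65536 byte pairs. */
static void self_check(void) {
    assert(count_mismatches() == 0);
}
```

Calling self_check() walks every pair of byte values. The same argument applies to 16-bit lanes, since both forms reduce to x <= y in unsigned arithmetic.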