There are several regressions after obsoleting vcond{,u,eq}<mode>.
Some regressions are due to the direct optimizations in
ix86_expand_{fp,int}_vcond, e.g. ix86_expand_sse_fp_minmax.
Other regressions are due to optimizations that rely on the
canonicalization done in ix86_expand_{fp,int}_vcond.

This series adds define_split or define_insn_and_split patterns to
restore those optimizations at pass_combine. It fixes most regressions in
the GCC testsuite, except for tests compiled without SSE4.1. Without
SSE4.1, a vector conditional move takes 3 instructions, and pass_combine
only supports combining at most 4 instructions. One possible solution is
to add a fake "ssemovcc" instruction to help combine, and split it back
into real instructions afterwards. This series doesn't handle that; it
just adjusts the affected testcases to XFAIL.
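
For readers less familiar with the non-SSE4.1 case, here is a minimal
sketch of the three-operation select (and / and-not / or) that a vector
conditional move presumably lowers to when no blend instruction is
available. It uses GNU C vector extensions; the type v4si and the helpers
select_v4si and max_v4si are illustrative names, not code from this
series.

```c
/* GNU C vector extension: four 32-bit ints in one 128-bit vector.  */
typedef int v4si __attribute__ ((vector_size (16)));

/* Hypothetical helper: per element, pick T where MASK is all-ones and
   F where MASK is zero.  This is the 3-operation form.  */
static v4si
select_v4si (v4si mask, v4si t, v4si f)
{
  v4si keep_t = mask & t;   /* and      */
  v4si keep_f = ~mask & f;  /* and-not  */
  return keep_t | keep_f;   /* or       */
}

/* Hypothetical example: a > b ? a : b per element.  The comparison
   produces -1 (all-ones) or 0 in each element.  */
static v4si
max_v4si (v4si a, v4si b)
{
  v4si mask = a > b;
  return select_v4si (mask, a, b);
}
```

Together with the comparison that produces the mask, this is already a
4-instruction chain, which leaves combine no room to also match the
consumer of the result.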

I also tested performance on SPEC2017 with different option sets:
-march=sapphirerapids -O2
-march=x86-64-v3 -O2
-march=x86-64 -O2
-march=sapphirerapids -O2
I didn't observe any obvious performance change; the binaries are mostly
the same.

Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
Any comments?

liuhongt (7):
  [x86] Add more splitters to match (unspec [op1 op2 (gt op3
    constm1_operand)] UNSPEC_BLENDV)
  Lower AVX512 kmask comparison back to AVX2 comparison when
    op_{true,false} is vector -1/0.
  [x86] Match IEEE min/max with UNSPEC_IEEE_{MIN,MAX}.
  Add more splitter for mskmov with avx512 comparison.
  Adjust testcase for the regressed testcases after obsolete of
    vcond{,u,eq}.
  [x86] Optimize a < 0 ? -1 : 0 to (signed)a >> 31.
  Remove vcond{,u,eq}<mode> expanders since they will be obsolete.
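
As an illustration of the "a < 0 ? -1 : 0 to (signed)a >> 31" item in
the list above, the following sketch shows the two equivalent forms for
32-bit elements using GNU C vector extensions. The names v4si,
sign_mask_cmp and sign_mask_shift are made up for this example, and it
assumes GCC's behavior of treating >> on signed elements as an arithmetic
shift.

```c
/* GNU C vector extension: four 32-bit ints in one 128-bit vector.  */
typedef int v4si __attribute__ ((vector_size (16)));

/* a < 0 ? -1 : 0, written as a vector comparison against zero.
   Each result element is -1 when the input is negative, else 0.  */
static v4si
sign_mask_cmp (v4si a)
{
  v4si zero = { 0, 0, 0, 0 };
  return a < zero;
}

/* The same mask via an arithmetic right shift by 31, which copies the
   sign bit into every bit of the element.  */
static v4si
sign_mask_shift (v4si a)
{
  v4si count = { 31, 31, 31, 31 };
  return a >> count;
}
```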

 gcc/config/i386/mmx.md                        | 149 ++--
 gcc/config/i386/sse.md                        | 772 +++++++++++++-----
 gcc/testsuite/g++.target/i386/avx2-pr115517.C |  60 ++
 .../g++.target/i386/avx512-pr115517.C         |  70 ++
 gcc/testsuite/g++.target/i386/pr100637-1b.C   |   4 +-
 gcc/testsuite/g++.target/i386/pr100637-1w.C   |   4 +-
 gcc/testsuite/g++.target/i386/pr103861-1.C    |   4 +-
 .../g++.target/i386/sse4_1-pr100637-1b.C      |  17 +
 .../g++.target/i386/sse4_1-pr100637-1w.C      |  17 +
 .../g++.target/i386/sse4_1-pr103861-1.C       |  17 +
 gcc/testsuite/gcc.target/i386/avx2-pr115517.c |  33 +
 .../gcc.target/i386/avx512-pr115517.c         |  70 ++
 gcc/testsuite/gcc.target/i386/pr103941-2.c    |   2 +-
 gcc/testsuite/gcc.target/i386/pr111023-2.c    |   4 +-
 gcc/testsuite/gcc.target/i386/pr88540.c       |   4 +-
 .../gcc.target/i386/sse4_1-pr88540.c          |  10 +
 gcc/testsuite/gcc.target/i386/vect-div-1.c    |   3 +-
 17 files changed, 918 insertions(+), 322 deletions(-)
 create mode 100644 gcc/testsuite/g++.target/i386/avx2-pr115517.C
 create mode 100644 gcc/testsuite/g++.target/i386/avx512-pr115517.C
 create mode 100644 gcc/testsuite/g++.target/i386/sse4_1-pr100637-1b.C
 create mode 100644 gcc/testsuite/g++.target/i386/sse4_1-pr100637-1w.C
 create mode 100644 gcc/testsuite/g++.target/i386/sse4_1-pr103861-1.C
 create mode 100644 gcc/testsuite/gcc.target/i386/avx2-pr115517.c
 create mode 100644 gcc/testsuite/gcc.target/i386/avx512-pr115517.c
 create mode 100644 gcc/testsuite/gcc.target/i386/sse4_1-pr88540.c

-- 
2.31.1
