[Bug c++/98348] GCC 10.2 AVX512 Mask regression from GCC 9

jakub at gcc dot gnu.org via Gcc-bugs Sat, 19 Dec 2020 02:22:24 -0800

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98348


--- Comment #6 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
In light of the recent discussions, I've been wondering about doing it with
combine splitters only, roughly like this:
--- sse.md.jj   2020-12-03 10:04:35.862093285 +0100
+++ sse.md      2020-12-19 11:00:14.272140859 +0100
@@ -2965,6 +2965,40 @@
    (set_attr "prefix" "vex")
    (set_attr "mode" "<MODE>")])

+(define_split
+  [(set (match_operand 0 "register_operand")
+       (vec_merge
+         (match_operand 1 "vector_all_ones_operand")
+         (match_operand 2 "const0_operand")
+         (unspec
+           [(match_operand 3 "register_operand")
+            (match_operand 4 "nonimmediate_operand")
+            (match_operand:SI 5 "const_0_to_31_operand")]
+            UNSPEC_PCMP)))]
+  "TARGET_AVX512VL
+   && GET_MODE_CLASS (GET_MODE (operands[0])) == MODE_VECTOR_INT
+   && (GET_MODE_SIZE (GET_MODE (operands[1])) == 16
+       || GET_MODE_SIZE (GET_MODE (operands[1])) == 32)
+   && GET_MODE (operands[1]) == GET_MODE (operands[0])
+   && GET_MODE (operands[2]) == GET_MODE (operands[0])
+   && GET_MODE_CLASS (GET_MODE (operands[3])) == MODE_VECTOR_FLOAT
+   && (GET_MODE_SIZE (GET_MODE (operands[3]))
+       == GET_MODE_SIZE (GET_MODE (operands[0])))
+   && (GET_MODE_UNIT_SIZE (GET_MODE (operands[3]))
+       == GET_MODE_UNIT_SIZE (GET_MODE (operands[0])))
+   && GET_MODE (operands[4]) == GET_MODE (operands[3])"
+  [(set (match_dup 6) (match_dup 7))
+   (set (match_dup 0) (match_dup 8))]
+{
+  operands[6] = gen_reg_rtx (GET_MODE (operands[3]));
+  operands[7]
+    = gen_rtx_UNSPEC (GET_MODE (operands[3]),
+                     gen_rtvec (3, operands[3], operands[4], operands[5]),
+                     UNSPEC_PCMP);
+  operands[8] = lowpart_subreg (GET_MODE (operands[0]), operands[6],
+                               GET_MODE (operands[3]));
+})
+
 (define_insn "avx_vmcmp<mode>3"
   [(set (match_operand:VF_128 0 "register_operand" "=x")
        (vec_merge:VF_128

The advantage is that a single pattern can then, in theory, handle all (or
half) of the floating-point comparison cases.
One problem is that the combiner still doesn't even try splitting when it is
combining only two insns.
Also (though I think your patch has the same issue), vector_all_ones_operand
matches only integral all-ones vectors.  I think we want another predicate
that also matches MEMs holding the floating-point equivalent (a NaN with all
bits set).  And we should have splitters not just for the -1 0 order in
VEC_MERGE, but also for the 0 -1 order, by carefully inverting the comparison.
