https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119357
Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|ra                          |
             Status|NEW                         |ASSIGNED
           Assignee|unassigned at gcc dot gnu.org      |jakub at gcc dot gnu.org

--- Comment #7 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
So, either we fix up the splitters so that they use appropriate predicate:
--- gcc/config/i386/sse.md.jj	2025-02-08 08:54:24.070260101 +0100
+++ gcc/config/i386/sse.md	2025-03-18 19:45:24.041656689 +0100
@@ -22406,7 +22406,7 @@
   [(set (reg:CCZ FLAGS_REG)
 	(compare:CCZ (unspec:SI
 			[(eq:VI1_AVX2
-			  (match_operand:VI1_AVX2 0 "vector_operand")
+			  (match_operand:VI1_AVX2 0 "register_operand")
 			  (match_operand:VI1_AVX2 1 "const0_operand"))]
 			UNSPEC_MOVMSK)
 		 (match_operand 2 "const_int_operand")))]
@@ -22443,7 +22443,7 @@
 	    (match_operand:VI1_AVX2 3 "vector_all_ones_operand")
 	    (match_operand:VI1_AVX2 4 "const0_operand")
 	    (unspec:<avx512fmaskmode>
-	      [(match_operand:VI1_AVX2 0 "vector_operand")
+	      [(match_operand:VI1_AVX2 0 "register_operand")
 	       (match_operand:VI1_AVX2 1 "const0_operand")
 	       (const_int 0)]
 	      UNSPEC_PCMP))]
because all the vptest instructions have one operand with register_operand
and another with vector_operand and the splitters use the same operand for
both,
Or perhaps better just force it into REG:
--- gcc/config/i386/sse.md.jj	2025-02-08 08:54:24.070260101 +0100
+++ gcc/config/i386/sse.md	2025-03-18 19:58:46.603529373 +0100
@@ -22414,7 +22414,8 @@
   [(set (reg:CCZ FLAGS_REG)
 	(unspec:CCZ [(match_dup 0)
 		     (match_dup 0)]
-		    UNSPEC_PTEST))])
+		    UNSPEC_PTEST))]
+  "operands[0] = force_reg (<MODE>mode, operands[0]);")
 
 (define_insn_and_split "*pmovsk_mask_cmp_<mode>_avx512"
   [(set (reg:CCZ FLAGS_REG)
@@ -22455,7 +22456,8 @@
   [(set (reg:CCZ FLAGS_REG)
 	(unspec:CCZ [(match_dup 0)
 		     (match_dup 0)]
-		    UNSPEC_PTEST))])
+		    UNSPEC_PTEST))]
+  "operands[0] = force_reg (<MODE>mode, operands[0]);")
 
 (define_expand "sse2_maskmovdqu"
   [(set (match_operand:V16QI 0 "memory_operand")

The difference on the testcase is
-	vpxor	%xmm0, %xmm0, %xmm0
-	vpcmpeqb	(%rdi), %xmm0, %xmm0
-	vpmovmskb	%xmm0, %eax
-	cmpl	$65535, %eax
+	vmovdqa	(%rdi), %xmm0
+	vptest	%xmm0, %xmm0
(first patch vs. second), so I think I'll test the latter.
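
For readers who want to see why the two sequences set ZF identically, here is a
rough scalar model in plain C. It is only an illustration under my own
assumptions, not the testcase from this PR and not code from GCC: the function
names (movmsk_eq_zero, ptest_zf, models_agree, self_check) are hypothetical,
and it models only the 16-byte case.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar model of vpcmpeqb against zero followed by
   vpmovmskb: bit i of the result is set when byte i is zero.  */
static unsigned movmsk_eq_zero(const uint8_t v[16])
{
    unsigned mask = 0;
    for (int i = 0; i < 16; i++)
        if (v[i] == 0)
            mask |= 1u << i;
    return mask;
}

/* Hypothetical scalar model of the ZF result of "vptest x, x":
   ZF is set when (x & x) is all zeros.  */
static int ptest_zf(const uint8_t v[16])
{
    uint8_t acc = 0;
    for (int i = 0; i < 16; i++)
        acc |= (uint8_t)(v[i] & v[i]);
    return acc == 0;
}

/* Nonzero when "mask == 65535" and the vptest ZF agree for this input.  */
static int models_agree(const uint8_t v[16])
{
    return (movmsk_eq_zero(v) == 0xFFFFu) == (ptest_zf(v) != 0);
}

/* Small self-check over an all-zero vector and a vector with one
   nonzero byte.  */
static void self_check(void)
{
    uint8_t zero[16] = {0};
    uint8_t one_byte[16] = {0};
    one_byte[7] = 0x10;

    assert(models_agree(zero));
    assert(models_agree(one_byte));
}
```

Both forms answer "are all 16 bytes zero?", which is why either assembly
sequence above is a valid lowering; the second one simply needs the vector in a
register first, hence the vmovdqa.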