On Wed, Mar 19, 2025 at 8:56 AM Jakub Jelinek <ja...@redhat.com> wrote:
>
> Hi!
>
> The following testcase ICEs, because the splitters into vptest
> create an invalid instruction.  The operands of all the UNSPEC_PTEST
> using instructions use register_operand and vector_operand predicate,
> these splitters use vector_operand predicate but create vptest
> instruction which has the same argument twice, so one of them needs
> to be in a register.
> The following patch keeps vector_operand predicate on the splitters
> but uses force_reg to force it into a REG if it was a MEM, that results
> in better code generation e.g. on the included testcase, as combine
> can match those even with MEM.
> The difference on the testcase is
> -       vpxor   %xmm0, %xmm0, %xmm0
> -       vpcmpeqb        (%rdi), %xmm0, %xmm0
> -       vpmovmskb       %xmm0, %eax
> -       cmpl    $65535, %eax
> +       vmovdqa (%rdi), %xmm0
> +       vptest  %xmm0, %xmm0
> (- for patch which changes the splitters to
> s/vector_operand/register_operand/ and + for this patch).
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
>
> 2025-03-19  Jakub Jelinek  <ja...@redhat.com>
>
>         PR target/119357
>         * config/i386/sse.md (pmovmskb 0xffff to ptest splitter,
>         *pmovsk_ptest_<mode>_avx512): Force operands[0] into a REG.
>
>         * gcc.target/i386/avx512vlbw-pr119357.c: New test.

OK.

Thanks,
Uros.

>
> --- gcc/config/i386/sse.md.jj   2025-02-08 08:54:24.070260101 +0100
> +++ gcc/config/i386/sse.md      2025-03-18 19:58:46.603529373 +0100
> @@ -22414,7 +22414,8 @@ (define_split
>    [(set (reg:CCZ FLAGS_REG)
>         (unspec:CCZ [(match_dup 0)
>                      (match_dup 0)]
> -                   UNSPEC_PTEST))])
> +                   UNSPEC_PTEST))]
> +  "operands[0] = force_reg (<MODE>mode, operands[0]);")
>
>  (define_insn_and_split "*pmovsk_mask_cmp_<mode>_avx512"
>    [(set (reg:CCZ FLAGS_REG)
> @@ -22455,7 +22456,8 @@ (define_insn_and_split "*pmovsk_ptest_<m
>    [(set (reg:CCZ FLAGS_REG)
>         (unspec:CCZ [(match_dup 0)
>                      (match_dup 0)]
> -                   UNSPEC_PTEST))])
> +                   UNSPEC_PTEST))]
> +  "operands[0] = force_reg (<MODE>mode, operands[0]);")
>
>  (define_expand "sse2_maskmovdqu"
>    [(set (match_operand:V16QI 0 "memory_operand")
> --- gcc/testsuite/gcc.target/i386/avx512vlbw-pr119357.c.jj      2025-03-18 
> 20:10:42.671608261 +0100
> +++ gcc/testsuite/gcc.target/i386/avx512vlbw-pr119357.c 2025-03-18 
> 20:09:58.289222946 +0100
> @@ -0,0 +1,14 @@
> +/* PR target/119357 */
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -mavx512bw -mavx512vl" } */
> +
> +#include <x86intrin.h>
> +
> +typedef char V __attribute__((vector_size (16)));
> +
> +void
> +foo (V *p)
> +{
> +  if (_mm_movemask_epi8 ((__m128i) (*p == 0)) != 65535)
> +    __builtin_abort ();
> +}
>
>         Jakub
>

Reply via email to