On Mon, Nov 06, 2017 at 11:45:37PM +0100, Uros Bizjak wrote:
> On Mon, Nov 6, 2017 at 10:23 PM, Jakub Jelinek <ja...@redhat.com> wrote:
> > Hi!
> >
> > Without the following patch we emit kmovb %k1, %eax; testb %al, %al
> > when if just testing the Zero bit we can as well do ktestb %k1, %k1.
> >
> > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
> >
> > 2017-11-06  Jakub Jelinek  <ja...@redhat.com>
> >
> >         PR target/82855
> >         * config/i386/i386.md (SWI1248_AVX512BWDQ2_64): New mode iterator.
> >         (*cmp<mode>_ccz_1): New insn with k alternative.
> >
> >         * gcc.target/i386/avx512dq-pr82855.c: New test.
> >
> > --- gcc/config/i386/i386.md.jj  2017-10-28 09:00:44.000000000 +0200
> > +++ gcc/config/i386/i386.md     2017-11-06 12:57:44.171215745 +0100
> > @@ -1275,6 +1275,26 @@ (define_expand "cmp<mode>_1"
> >         (compare:CC (match_operand:SWI48 0 "nonimmediate_operand")
> >                     (match_operand:SWI48 1 "<general_operand>")))])
> >
> > +(define_mode_iterator SWI1248_AVX512BWDQ2_64
> > +  [(QI "TARGET_AVX512DQ") (HI "TARGET_AVX512DQ")
> > +   (SI "TARGET_AVX512BW") (DI "TARGET_AVX512BW && TARGET_64BIT")])
> > +
> > +(define_insn "*cmp<mode>_ccz_1"
> > +  [(set (reg FLAGS_REG)
> > +       (compare (match_operand:SWI1248_AVX512BWDQ2_64 0
> > +                       "nonimmediate_operand" "<r>,k,?m<r>")
> 
> Please put "k" alternative as the last alternative. I suggest to
> decorate it with "*", so register allocator won't use mask registers
> instead of integer registers. This would be the same solution as in
> movdi_internal/movsi_internal patterns.

Tried that, but the testcase no longer uses the ktest (diff from without
the * decoration to with the * decoration, pseudo 93 is live just in
(insn 12 10 14 2 (set (reg:QI 93)
        (unspec:QI [
                (reg:V8SI 97)
                (reg:V8SI 98)
            ] UNSPEC_MASKED_EQ)) "include/avx512vlintrin.h":5394 3438 
{avx512vl_eqv8si3_1}
     (expr_list:REG_DEAD (reg:V8SI 98)
        (expr_list:REG_DEAD (reg:V8SI 97)
            (nil))))
(insn 14 12 15 2 (set (reg:CCZ 17 flags)
        (compare:CCZ (reg:QI 93)
            (const_int 0 [0]))) "avx512dq-pr82855.c":13 1 {*cmpqi_ccz_1}
     (expr_list:REG_DEAD (reg:QI 93)
        (nil)))
):
-    a1 (r93,l0) best MASK_EVEX_REGS, allocno MASK_EVEX_REGS
+    a1 (r93,l0) best GENERAL_REGS, allocno GENERAL_REGS

-  a1(r93,l0) costs: AREG:2000,2000 DREG:2000,2000 CREG:2000,2000 
BREG:2000,2000 SIREG:2000,2000 DIREG:2000,2000 AD_REGS:2000,2000 
CLOBBERED_REGS:2000,2000 Q_REGS:2000,2000 NON_Q_REGS:2000,2000 
TLS_GOTBASE_REGS:2000,2000 GENERAL_REGS:2000,2000 MASK_EVEX_REGS:0,0 
MASK_REGS:2000,2000 ALL_REGS:420000,420000 MEM:10000,10000
+  a1(r93,l0) costs: AREG:2000,2000 DREG:2000,2000 CREG:2000,2000 
BREG:2000,2000 SIREG:2000,2000 DIREG:2000,2000 AD_REGS:2000,2000 
CLOBBERED_REGS:2000,2000 Q_REGS:2000,2000 NON_Q_REGS:2000,2000 
TLS_GOTBASE_REGS:2000,2000 GENERAL_REGS:2000,2000 MASK_EVEX_REGS:2000,2000 
MASK_REGS:4000,4000 ALL_REGS:420000,420000 MEM:10000,10000

-    r93: preferred MASK_EVEX_REGS, alternative GENERAL_REGS, allocno ALL_REGS
+    r93: preferred GENERAL_REGS, alternative NO_REGS, allocno GENERAL_REGS

-  a1(r93,l0) costs: AREG:4000,4000 DREG:4000,4000 CREG:4000,4000 
BREG:4000,4000 SIREG:4000,4000 DIREG:4000,4000 AD_REGS:4000,4000 
CLOBBERED_REGS:4000,4000 Q_REGS:4000,4000 NON_Q_REGS:4000,4000 
TLS_GOTBASE_REGS:4000,4000 GENERAL_REGS:4000,4000 MASK_EVEX_REGS:0,0 
MASK_REGS:2000,2000 ALL_REGS:420000,420000 MEM:10000,10000
-  a2(r97,l0) costs: SSE_FIRST_REG:6000,6000 NO_REX_SSE_REGS:6000,6000 
SSE_REGS:6000,6000 EVEX_SSE_REGS:6000,6000 ALL_SSE_REGS:6000,6000 
MEM:25000,25000
-  a3(r98,l0) costs: SSE_FIRST_REG:0,0 NO_REX_SSE_REGS:0,0 SSE_REGS:0,0 
EVEX_SSE_REGS:0,0 ALL_SSE_REGS:0,0 MEM:19000,19000
+  a1(r93,l0) costs: GENERAL_REGS:4000,4000 MASK_EVEX_REGS:4000,4000 
MASK_REGS:6000,6000 ALL_REGS:422000,422000 MEM:12000,12000
+  a2(r97,l0) costs: SSE_FIRST_REG:8000,8000 NO_REX_SSE_REGS:8000,8000 
SSE_REGS:8000,8000 EVEX_SSE_REGS:8000,8000 ALL_SSE_REGS:8000,8000 
MEM:27000,27000
+  a3(r98,l0) costs: SSE_FIRST_REG:2000,2000 NO_REX_SSE_REGS:2000,2000 
SSE_REGS:2000,2000 EVEX_SSE_REGS:2000,2000 ALL_SSE_REGS:2000,2000 
MEM:21000,21000

Isn't it sufficient that moves disparage slightly the k alternatives?
Or are you worried about the case where the pseudo would need to be spilled
and LRA would choose to reload it into a %kN register?

        Jakub

Reply via email to