On Mon, Nov 06, 2017 at 11:45:37PM +0100, Uros Bizjak wrote:
> On Mon, Nov 6, 2017 at 10:23 PM, Jakub Jelinek <ja...@redhat.com> wrote:
> > Hi!
> >
> > Without the following patch we emit kmovb %k1, %eax; testb %al, %al
> > when if just testing the Zero bit we can as well do ktestb %k1, %k1.
> >
> > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
> >
> > 2017-11-06  Jakub Jelinek  <ja...@redhat.com>
> >
> >       PR target/82855
> >       * config/i386/i386.md (SWI1248_AVX512BWDQ2_64): New mode iterator.
> >       (*cmp<mode>_ccz_1): New insn with k alternative.
> >
> >       * gcc.target/i386/avx512dq-pr82855.c: New test.
> >
> > --- gcc/config/i386/i386.md.jj  2017-10-28 09:00:44.000000000 +0200
> > +++ gcc/config/i386/i386.md     2017-11-06 12:57:44.171215745 +0100
> > @@ -1275,6 +1275,26 @@ (define_expand "cmp<mode>_1"
> >         (compare:CC (match_operand:SWI48 0 "nonimmediate_operand")
> >                     (match_operand:SWI48 1 "<general_operand>")))])
> >
> > +(define_mode_iterator SWI1248_AVX512BWDQ2_64
> > +  [(QI "TARGET_AVX512DQ") (HI "TARGET_AVX512DQ")
> > +   (SI "TARGET_AVX512BW") (DI "TARGET_AVX512BW && TARGET_64BIT")])
> > +
> > +(define_insn "*cmp<mode>_ccz_1"
> > +  [(set (reg FLAGS_REG)
> > +       (compare (match_operand:SWI1248_AVX512BWDQ2_64 0
> > +                       "nonimmediate_operand" "<r>,k,?m<r>")
>
> Please put "k" alternative as the last alternative. I suggest to
> decorate it with "*", so register allocator won't use mask registers
> instead of integer registers. This would be the same solution as in
> movdi_internal/movsi_internal patterns.

Tried that, but the testcase no longer uses the ktest (diff from
without the * decoration to with the * decoration, pseudo 93 is live just in
(insn 12 10 14 2 (set (reg:QI 93)
        (unspec:QI [
                (reg:V8SI 97)
                (reg:V8SI 98)
            ] UNSPEC_MASKED_EQ)) "include/avx512vlintrin.h":5394 3438 {avx512vl_eqv8si3_1}
     (expr_list:REG_DEAD (reg:V8SI 98)
        (expr_list:REG_DEAD (reg:V8SI 97)
            (nil))))
(insn 14 12 15 2 (set (reg:CCZ 17 flags)
        (compare:CCZ (reg:QI 93)
            (const_int 0 [0]))) "avx512dq-pr82855.c":13 1 {*cmpqi_ccz_1}
     (expr_list:REG_DEAD (reg:QI 93)
        (nil)))
):
-    a1 (r93,l0) best MASK_EVEX_REGS, allocno MASK_EVEX_REGS
+    a1 (r93,l0) best GENERAL_REGS, allocno GENERAL_REGS
-  a1(r93,l0) costs: AREG:2000,2000 DREG:2000,2000 CREG:2000,2000 BREG:2000,2000 SIREG:2000,2000 DIREG:2000,2000 AD_REGS:2000,2000 CLOBBERED_REGS:2000,2000 Q_REGS:2000,2000 NON_Q_REGS:2000,2000 TLS_GOTBASE_REGS:2000,2000 GENERAL_REGS:2000,2000 MASK_EVEX_REGS:0,0 MASK_REGS:2000,2000 ALL_REGS:420000,420000 MEM:10000,10000
+  a1(r93,l0) costs: AREG:2000,2000 DREG:2000,2000 CREG:2000,2000 BREG:2000,2000 SIREG:2000,2000 DIREG:2000,2000 AD_REGS:2000,2000 CLOBBERED_REGS:2000,2000 Q_REGS:2000,2000 NON_Q_REGS:2000,2000 TLS_GOTBASE_REGS:2000,2000 GENERAL_REGS:2000,2000 MASK_EVEX_REGS:2000,2000 MASK_REGS:4000,4000 ALL_REGS:420000,420000 MEM:10000,10000
-    r93: preferred MASK_EVEX_REGS, alternative GENERAL_REGS, allocno ALL_REGS
+    r93: preferred GENERAL_REGS, alternative NO_REGS, allocno GENERAL_REGS
-  a1(r93,l0) costs: AREG:4000,4000 DREG:4000,4000 CREG:4000,4000 BREG:4000,4000 SIREG:4000,4000 DIREG:4000,4000 AD_REGS:4000,4000 CLOBBERED_REGS:4000,4000 Q_REGS:4000,4000 NON_Q_REGS:4000,4000 TLS_GOTBASE_REGS:4000,4000 GENERAL_REGS:4000,4000 MASK_EVEX_REGS:0,0 MASK_REGS:2000,2000 ALL_REGS:420000,420000 MEM:10000,10000
-  a2(r97,l0) costs: SSE_FIRST_REG:6000,6000 NO_REX_SSE_REGS:6000,6000 SSE_REGS:6000,6000 EVEX_SSE_REGS:6000,6000 ALL_SSE_REGS:6000,6000 MEM:25000,25000
-  a3(r98,l0) costs: SSE_FIRST_REG:0,0 NO_REX_SSE_REGS:0,0 SSE_REGS:0,0 EVEX_SSE_REGS:0,0 ALL_SSE_REGS:0,0 MEM:19000,19000
+  a1(r93,l0) costs: GENERAL_REGS:4000,4000 MASK_EVEX_REGS:4000,4000 MASK_REGS:6000,6000 ALL_REGS:422000,422000 MEM:12000,12000
+  a2(r97,l0) costs: SSE_FIRST_REG:8000,8000 NO_REX_SSE_REGS:8000,8000 SSE_REGS:8000,8000 EVEX_SSE_REGS:8000,8000 ALL_SSE_REGS:8000,8000 MEM:27000,27000
+  a3(r98,l0) costs: SSE_FIRST_REG:2000,2000 NO_REX_SSE_REGS:2000,2000 SSE_REGS:2000,2000 EVEX_SSE_REGS:2000,2000 ALL_SSE_REGS:2000,2000 MEM:21000,21000

Isn't it sufficient that moves disparage slightly the k alternatives?
Or are you worried about the case where the pseudo would need to be spilled
and LRA would choose to reload it into a %kN register?

        Jakub