On Tue, Nov 7, 2017 at 9:18 AM, Jakub Jelinek <ja...@redhat.com> wrote: > On Mon, Nov 06, 2017 at 11:45:37PM +0100, Uros Bizjak wrote: >> On Mon, Nov 6, 2017 at 10:23 PM, Jakub Jelinek <ja...@redhat.com> wrote: >> > Hi! >> > >> > Without the following patch we emit kmovb %k1, %eax; testb %al, %al >> > when if just testing the Zero bit we can as well do ktestb %k1, %k1. >> > >> > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? >> > >> > 2017-11-06 Jakub Jelinek <ja...@redhat.com> >> > >> > PR target/82855 >> > * config/i386/i386.md (SWI1248_AVX512BWDQ2_64): New mode iterator. >> > (*cmp<mode>_ccz_1): New insn with k alternative. >> > >> > * gcc.target/i386/avx512dq-pr82855.c: New test. >> > >> > --- gcc/config/i386/i386.md.jj 2017-10-28 09:00:44.000000000 +0200 >> > +++ gcc/config/i386/i386.md 2017-11-06 12:57:44.171215745 +0100 >> > @@ -1275,6 +1275,26 @@ (define_expand "cmp<mode>_1" >> > (compare:CC (match_operand:SWI48 0 "nonimmediate_operand") >> > (match_operand:SWI48 1 "<general_operand>")))]) >> > >> > +(define_mode_iterator SWI1248_AVX512BWDQ2_64 >> > + [(QI "TARGET_AVX512DQ") (HI "TARGET_AVX512DQ") >> > + (SI "TARGET_AVX512BW") (DI "TARGET_AVX512BW && TARGET_64BIT")]) >> > + >> > +(define_insn "*cmp<mode>_ccz_1" >> > + [(set (reg FLAGS_REG) >> > + (compare (match_operand:SWI1248_AVX512BWDQ2_64 0 >> > + "nonimmediate_operand" "<r>,k,?m<r>") >> >> Please put "k" alternative as the last alternative. I suggest to >> decorate it with "*", so register allocator won't use mask registers >> instead of integer registers. This would be the same solution as in >> movdi_internal/movsi_internal patterns. > > Tried that, but the testcase no longer uses the ktest (diff from without > the * decoration to with the * decoration, pseudo 93 is live just in > (insn 12 10 14 2 (set (reg:QI 93) > (unspec:QI [ > (reg:V8SI 97) > (reg:V8SI 98) > ] UNSPEC_MASKED_EQ)) "include/avx512vlintrin.h":5394 3438 > {avx512vl_eqv8si3_1} > (expr_list:REG_DEAD (reg:V8SI 98) > (expr_list:REG_DEAD (reg:V8SI 97) > (nil)))) > (insn 14 12 15 2 (set (reg:CCZ 17 flags) > (compare:CCZ (reg:QI 93) > (const_int 0 [0]))) "avx512dq-pr82855.c":13 1 {*cmpqi_ccz_1} > (expr_list:REG_DEAD (reg:QI 93) > (nil))) > ): > - a1 (r93,l0) best MASK_EVEX_REGS, allocno MASK_EVEX_REGS > + a1 (r93,l0) best GENERAL_REGS, allocno GENERAL_REGS > > - a1(r93,l0) costs: AREG:2000,2000 DREG:2000,2000 CREG:2000,2000 > BREG:2000,2000 SIREG:2000,2000 DIREG:2000,2000 AD_REGS:2000,2000 > CLOBBERED_REGS:2000,2000 Q_REGS:2000,2000 NON_Q_REGS:2000,2000 > TLS_GOTBASE_REGS:2000,2000 GENERAL_REGS:2000,2000 MASK_EVEX_REGS:0,0 > MASK_REGS:2000,2000 ALL_REGS:420000,420000 MEM:10000,10000 > + a1(r93,l0) costs: AREG:2000,2000 DREG:2000,2000 CREG:2000,2000 > BREG:2000,2000 SIREG:2000,2000 DIREG:2000,2000 AD_REGS:2000,2000 > CLOBBERED_REGS:2000,2000 Q_REGS:2000,2000 NON_Q_REGS:2000,2000 > TLS_GOTBASE_REGS:2000,2000 GENERAL_REGS:2000,2000 MASK_EVEX_REGS:2000,2000 > MASK_REGS:4000,4000 ALL_REGS:420000,420000 MEM:10000,10000 > > - r93: preferred MASK_EVEX_REGS, alternative GENERAL_REGS, allocno ALL_REGS > + r93: preferred GENERAL_REGS, alternative NO_REGS, allocno GENERAL_REGS > > - a1(r93,l0) costs: AREG:4000,4000 DREG:4000,4000 CREG:4000,4000 > BREG:4000,4000 SIREG:4000,4000 DIREG:4000,4000 AD_REGS:4000,4000 > CLOBBERED_REGS:4000,4000 Q_REGS:4000,4000 NON_Q_REGS:4000,4000 > TLS_GOTBASE_REGS:4000,4000 GENERAL_REGS:4000,4000 MASK_EVEX_REGS:0,0 > MASK_REGS:2000,2000 ALL_REGS:420000,420000 MEM:10000,10000 > - a2(r97,l0) costs: SSE_FIRST_REG:6000,6000 NO_REX_SSE_REGS:6000,6000 > SSE_REGS:6000,6000 EVEX_SSE_REGS:6000,6000 ALL_SSE_REGS:6000,6000 > MEM:25000,25000 > - a3(r98,l0) costs: SSE_FIRST_REG:0,0 NO_REX_SSE_REGS:0,0 SSE_REGS:0,0 > EVEX_SSE_REGS:0,0 ALL_SSE_REGS:0,0 MEM:19000,19000 > + a1(r93,l0) costs: GENERAL_REGS:4000,4000 MASK_EVEX_REGS:4000,4000 > MASK_REGS:6000,6000 ALL_REGS:422000,422000 MEM:12000,12000 > + a2(r97,l0) costs: SSE_FIRST_REG:8000,8000 NO_REX_SSE_REGS:8000,8000 > SSE_REGS:8000,8000 EVEX_SSE_REGS:8000,8000 ALL_SSE_REGS:8000,8000 > MEM:27000,27000 > + a3(r98,l0) costs: SSE_FIRST_REG:2000,2000 NO_REX_SSE_REGS:2000,2000 > SSE_REGS:2000,2000 EVEX_SSE_REGS:2000,2000 ALL_SSE_REGS:2000,2000 > MEM:21000,21000 > > Isn't it sufficient that moves disparage slightly the k alternatives? > Or are you worried about the case where the pseudo would need to be spilled > and LRA would choose to reload it into a %kN register?
The later... Perhaps it is possible to prevent unwanted reloads with ?k (or even ??k, we already have ?m<r>, IDK). I have used *k in zero-extend patterns to prevent unwanted reloads, which is perhaps too strong, but we have ree pass that eliminates unwanted moves in this case (*k in move patterns is handled by other regmove elimination passes). Uros.