https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88464
--- Comment #28 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
(In reply to Uroš Bizjak from comment #27)
> (In reply to Jakub Jelinek from comment #25)
> > Isn't ktestw and kortestw the same thing when both operands are the same
> > mask register?
>
> True, but kortestw is available with AVX512F, where ktestw is not.
>
> (In reply to Jakub Jelinek from comment #26)
> > And the TARGET_AVX512F && looks incorrect, then we wouldn't be able to test
> > or cmp without -mavx512f.
>
> No, we fall to *cmp<mode>_ccno_1, which is compatible with CCZmode.

You're right, sorry for the noise.  Your patch looks good to me.

There is another issue though (I guess not correctness, but efficiency), e.g.
on avx512vl-pr88464-{1,3}.c.
E.g. in avx512vl-pr88464-3.c we have:

  if (mask__40.16_82 == { 0, 0 })
    goto <bb 7>; [100.00%]
  else
    goto <bb 6>; [20.00%]

in *.optimized, an attempt to jump around masked stores if the mask is all
zeros.  We emit:

;; if (mask__40.16_82 == { 0, 0 })

(insn 44 43 45 (set (reg:CCZ 17 flags)
        (compare:CCZ (reg:QI 131 [ mask__40.16 ])
            (const_int 0 [0]))) -1
     (nil))

(jump_insn 45 44 0 (set (pc)
        (if_then_else (eq (reg:CCZ 17 flags)
                (const_int 0 [0]))
            (label_ref 0)
            (pc))) -1
     (int_list:REG_BR_PROB 1073741831 (nil)))

for this, i.e.

        kmovw   %k2, %r10d
        testb   %r10b, %r10b
        je      .L4

(without -mavx512dq, I guess ktestb or kortestb with -mavx512dq), but perhaps
we should emit

        kmovw   %k2, %r10d
        testb   $3, %r10b
        je      .L4

instead?  If the setter is a compare that clears the higher bit, then it makes
no difference, but if we are e.g. looking at the low 2 or 4 bits of 4 or 8 bit
mask, then it will do a masked store even if the 2 or 4 bits we care about are
clear, just some upper bits are not.
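
To illustrate the difference between the two tests, here is a minimal C model of the branch condition. It is only a sketch of the idea, not GCC code: the helper names (lane_bits, skip_store_full_byte, skip_store_low_lanes) are hypothetical, and the mask register is modeled as a plain uint8_t, assuming a QImode mask in which only the low nlanes bits are meaningful.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: bits covering the low `nlanes` lanes (1..8).
   For a two-element vector this is 3, the immediate in `testb $3`. */
static inline uint8_t lane_bits(unsigned nlanes)
{
    return (uint8_t)((1u << nlanes) - 1u);
}

/* Models `testb %r10b, %r10b; je .L4`: the masked store is skipped
   only when all eight bits of the mask byte are zero. */
static inline bool skip_store_full_byte(uint8_t kmask)
{
    return kmask == 0;
}

/* Models `testb $imm, %r10b; je .L4`: the masked store is skipped
   when the lanes that are actually in use are all zero, regardless
   of what the upper bits of the byte contain. */
static inline bool skip_store_low_lanes(uint8_t kmask, unsigned nlanes)
{
    return (kmask & lane_bits(nlanes)) == 0;
}
```

With a mask byte of 0x04 and two lanes in use, skip_store_full_byte returns false (the masked store is executed although it stores nothing), while skip_store_low_lanes returns true. When the upper bits are already known to be clear, the two tests agree.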