https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113576
--- Comment #41 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Hongtao Liu from comment #38)
> > I think we should also mask off the upper bits of variable mask?
> >
> >         notl    %esi
> >         orl     %esi, %edi
> >         notl    %edi
> >         andl    $15, %edi
> >         je      .L3
>
> with -mbmi, it's
>
>         andn    %esi, %edi, %edi
>         andl    $15, %edi
>         je      .L3

Well, yes, the discussion in this bug was whether to do this at consumers
(that's sth new) or with all mask operations (that's how we handle
bit-precision integer operations, so it might be relatively easy to
do that - specifically spot the places eventually needing adjustment).

There's do_store_flag to fixup for uses not in branches and
do_compare_and_jump for conditional jumps.  Note the AND is removed by
combine if I add it:

Successfully matched this instruction:
(set (reg:CCZ 17 flags)
    (compare:CCZ (and:HI (not:HI (subreg:HI (reg:QI 102 [ tem_3 ]) 0))
            (const_int 15 [0xf]))
        (const_int 0 [0])))

(*testhi_not)

-    9: {r103:QI=r102:QI&0xf;clobber flags:CC;}
+      REG_DEAD r99:QI
+    9: NOTE_INSN_DELETED
+   12: flags:CCZ=cmp(~r102:QI#0&0xf,0)
       REG_DEAD r102:QI
-      REG_UNUSED flags:CC
-   12: flags:CCZ=cmp(r103:QI,0xf)
-      REG_DEAD r103:QI

and we get

foo:
.LFB0:
        .cfi_startproc
        notl    %esi
        orl     %esi, %edi
        notl    %edi
        testb   $15, %dil
        je      .L6
        ret

which I'm not sure is OK?

diff --git a/gcc/dojump.cc b/gcc/dojump.cc
index e2d2b3cb111..784707c1e55 100644
--- a/gcc/dojump.cc
+++ b/gcc/dojump.cc
@@ -1266,6 +1266,7 @@ do_compare_and_jump (tree treeop0, tree treeop1, enum rtx_code signed_code,
   machine_mode mode;
   int unsignedp;
   enum rtx_code code;
+  unsigned HOST_WIDE_INT nunits;

   /* Don't crash if the comparison was erroneous.  */
   op0 = expand_normal (treeop0);
@@ -1308,6 +1309,18 @@ do_compare_and_jump (tree treeop0, tree treeop1, enum rtx_code signed_code,
       emit_insn (targetm.gen_canonicalize_funcptr_for_compare (new_op1, op1));
       op1 = new_op1;
     }
+  else if (VECTOR_BOOLEAN_TYPE_P (type)
+          && mode == QImode
+          && TYPE_VECTOR_SUBPARTS (type).is_constant (&nunits)
+          && nunits < BITS_PER_UNIT)
+    {
+      op0 = expand_binop (mode, and_optab, op0,
+                         GEN_INT ((1 << nunits) - 1), NULL_RTX,
+                         true, OPTAB_WIDEN);
+      op1 = expand_binop (mode, and_optab, op1,
+                         GEN_INT ((1 << nunits) - 1), NULL_RTX,
+                         true, OPTAB_WIDEN);
+    }

   do_compare_rtx_and_jump (op0, op1, code, unsignedp, treeop0, mode,
                           ((mode == BLKmode)
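
For reference, the effect of masking at the consumer can be modelled in plain C. This is only a sketch and not GCC code: it assumes a 4-lane boolean vector held in the low bits of an 8-bit (QImode-like) value, and the helper names (lane_mask, all_set_masked, all_set_test_not, all_set_unmasked, forms_agree) are hypothetical. It compares the "AND both operands with (1 << nunits) - 1" form from the patch against the form combine produced above, and shows how an unmasked compare against all-ones differs when the upper bits are not set.

```c
#include <stdint.h>

/* Hypothetical helper: mask covering the low NUNITS bits,
   i.e. the (1 << nunits) - 1 constant used in the patch.  */
static unsigned lane_mask (unsigned nunits)
{
  return (1u << nunits) - 1u;
}

/* "All lanes set" test with both operands masked, modelling
   cmp (m & k, all_ones & k).  */
static int all_set_masked (uint8_t m, unsigned nunits)
{
  unsigned k = lane_mask (nunits);
  return (m & k) == (0xffu & k);
}

/* The shape combine arrived at: test (~m & k) against zero.  */
static int all_set_test_not (uint8_t m, unsigned nunits)
{
  return ((~(unsigned) m) & lane_mask (nunits)) == 0;
}

/* Unmasked compare against an all-ones 8-bit value; the upper
   bits of M take part in the result.  */
static int all_set_unmasked (uint8_t m)
{
  return m == 0xff;
}

/* Check that the two masked forms agree for every 8-bit input.  */
static int forms_agree (unsigned nunits)
{
  for (unsigned v = 0; v < 256; v++)
    if (all_set_masked ((uint8_t) v, nunits)
        != all_set_test_not ((uint8_t) v, nunits))
      return 0;
  return 1;
}
```

In this model the two masked forms agree for all 256 inputs with nunits = 4, so the combine result is equivalent to the explicit AND plus compare. The unmasked variant differs: a mask of 0x0f (all four lanes set, upper bits clear) passes the masked test but fails the compare against 0xff.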