https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113576
--- Comment #34 from Richard Sandiford <rsandifo at gcc dot gnu.org> ---
(In reply to Richard Biener from comment #32)
> Btw, AVX512 knotb will invert all 8 bits and there's no knot just affecting
> the lowest 4 or 2 bits.
>
> It all feels like desaster waiting to happen ;)
Yes :)

> For example BIT_NOT_EXPR is RTL expanded like
>
>     case BIT_NOT_EXPR:
>       op0 = expand_expr (treeop0, subtarget,
>                          VOIDmode, EXPAND_NORMAL);
>       if (modifier == EXPAND_STACK_PARM)
>         target = 0;
>       /* In case we have to reduce the result to bitfield precision
>          for unsigned bitfield expand this as XOR with a proper constant
>          instead.  */
>       if (reduce_bit_field && TYPE_UNSIGNED (type))
>         {
>           int_mode = SCALAR_INT_TYPE_MODE (type);
>           wide_int mask = wi::mask (TYPE_PRECISION (type),
>                                     false, GET_MODE_PRECISION (int_mode));
>
>           temp = expand_binop (int_mode, xor_optab, op0,
>                                immed_wide_int_const (mask, int_mode),
>                                target, 1, OPTAB_LIB_WIDEN);
>
> so we could, for VECTOR_BOOLEAN_TYPE_P with integer mode and
> effective bit-precision set reduce_bit_field and fixup the fallout
> (not sure why the above is only for TYPE_UNSIGNED).
>
> At least it feels similar and doing things the opposite for vectors
> (fixing up at uses) would be odd?
Do you know why we take this approach for integers?  Is it for correctness?
Or is it supposed to be more optimal?

I can imagine that, for arithmetic types, there are going to be many more
instances where the upper bits matter (division, right shifts, MIN/MAX,
etc.).  So perhaps reducing every result is a good trade-off there.

But there's an argument that it should be rare for the padding bits in a
vector to matter, since very few things would look at the padding bits
anyway.  So perhaps the cost should be borne by the operations that need
canonical integers.  Not a strong opinion though, more just devil's
advocate.

There again, if e.g. the x86 API guarantees memcmp equality between two
masks whose significant bits are equal, then we probably have no choice.
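
For reference, here is a minimal standalone sketch (plain C, not the GCC
internals quoted above; the helper names and the 2-bit precision are
invented for illustration) of the two behaviours being contrasted: a
whole-register NOT, analogous to knotb, versus the reduce_bit_field-style
XOR with a mask of the significant bits, and why only a whole-register
comparison such as the memcmp scenario can tell the results apart:

#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Model: an 8-bit mask register in which only the low 2 bits
   (two vector lanes) are significant.  */
#define MASK_PRECISION 2
#define LOW_BITS ((uint8_t) ((1u << MASK_PRECISION) - 1))

/* Whole-register behaviour (analogous to knotb): invert all 8 bits,
   so the six padding bits become 1s.  */
static uint8_t
mask_not_full (uint8_t m)
{
  return (uint8_t) ~m;
}

/* reduce_bit_field-style behaviour for BIT_NOT_EXPR: XOR with the
   equivalent of wi::mask (TYPE_PRECISION (type), ...), so the padding
   bits stay 0.  */
static uint8_t
mask_not_reduced (uint8_t m)
{
  return (uint8_t) (m ^ LOW_BITS);
}

int
main (void)
{
  uint8_t a = 0x01;                        /* lanes {1, 0}, padding 0 */
  uint8_t full = mask_not_full (a);        /* 0xfe */
  uint8_t reduced = mask_not_reduced (a);  /* 0x02 */

  /* The significant bits agree ...  */
  printf ("low bits equal: %d\n", (full & LOW_BITS) == (reduced & LOW_BITS));

  /* ... but a whole-register comparison (the memcmp scenario above)
     tells them apart.  */
  printf ("memcmp equal:   %d\n", memcmp (&full, &reduced, 1) == 0);

  return 0;
}

Compiled and run, this prints 1 and then 0: the two results agree on the
significant lanes but differ as whole registers.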