On Tue, Jul 14, 2020 at 02:33:41PM +0200, Richard Biener wrote: > > As mentioned in the PR, we generate a useless movzbl insn before lock > > cmpxchg. > > The problem is that the builtin for the char/short cases has the arguments > > promoted to int and combine gives up, because the instructions have > > MEM_VOLATILE_P arguments and recog in that case doesn't recognize anything > > when volatile_ok is false, and nothing afterwards optimizes the > > (reg:SI a) = (zero_extend:SI (reg:QI a)) > > ... (subreg:QI (reg:SI a) 0) ... > > So the above isn't fixable? Because it would probably be the more > generic fix, right?
I'm afraid it is not, CCing Segher on that. The question is how to differentiate between "combine didn't do anything dangerous to this MEM_VOLATILE_P and just kept it in the pattern as is" vs. "combine changed it in a dangerous way unsuitable for volatile accesses". Even if we wanted to make just the case where i3 is the only insn that originally contains MEM_VOLATILE_P and checked that the MEM stayed as is, there is the difficulty that we change the insns in place, so remembering how it looked before is hard. And then before trying to recognize it we'd carefully need to check that nothing else is volatile before enabling temporarily volatile_ok. > > + if (TREE_CODE (exp) == SSA_NAME > > + && TYPE_MODE (TREE_TYPE (exp)) != mode) > > + { > > + /* Undo argument promotion if possible, as combine might not > > + be able to do it later due to MEM_VOLATILE_P uses in the > > + patterns. */ > > + gimple *g = get_gimple_for_ssa_name (exp); > > + if (g && gimple_assign_cast_p (g)) > > + { > > + tree rhs = gimple_assign_rhs1 (g); > > + if (TYPE_MODE (TREE_TYPE (rhs)) == mode) > > Is this really enough to check? What if the cast was truncating? > The gimple_assign_cast_p predicate also allows for FIX_TRUNC_EXPR > and VIEW_CONVERT_EXPR where for the latter the rhs is the > VIEW_CONVERT_EXPR itself. I don't know how could those happen. mode is the result of get_builtin_sync_mode or int_mode_for_size (BITS_PER_UNIT * size, 0).require () (the former is int_mode_for_size (...).require ()) and thus an integral mode. The checks verify that the SSA_NAME doesn't have that mode, but rhs1 of the statement does have the right mode. As mode is integral, I think that rules out FIX_TRUNC_EXPR which would need real mode on the argument, and for VCE, if VCE is in the operand itself, then it ought to have the same mode as the lhs. I believe the only possibility where exp doesn't have already the desired mode is the char/short to int promotion, otherwise all the builtins are prototyped and thus one shouldn't have some unrelated mode. But if you want some extra checks just to be sure, I can add them, whether it means checking only for CONVERT_EXPR_CODE_P instead of all 4 rhs codes, and/or checking that both types are integral scalar and the conversion is an extension (by comparing type precisions). Jakub