https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63986
Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |NEW
                 CC|                            |ppalka at gcc dot gnu.org,
                   |                            |rguenth at gcc dot gnu.org
           Assignee|rguenth at gcc dot gnu.org  |unassigned at gcc dot gnu.org

--- Comment #3 from Richard Biener <rguenth at gcc dot gnu.org> ---
Ok, the already existing forwprop code now gets fed with

  <bb 2>:
  _3 = a_2(D) == 0;
  x_4 = (char) _3;
  _7 = ~_3;
  _8 = (int) _7;
  MEM[(int *)d_5(D) + 8B] = _8;
  if (x_4 != 0)

where the first forwprop pass has already identified the opportunity to use
~_3 instead of x_4 == 0, so x_4 is no longer multi-use. This makes us
optimize

  if (x_4 != 0)

to

  if (_3 != 0)

which fold_gimple_cond now re-optimizes to '_3' and then of course to
if (_3 != 0) (err, and we return "changed"....), which means we propagate
_again_ via forward_propagate_into_gimple_cond, which now specifically allows
aggressive forwarding of compares, bypassing single-use restrictions. See

2014-11-16  Patrick Palka  <ppa...@gcc.gnu.org>

        PR middle-end/63790
        * tree-ssa-forwprop.c (forward_propagate_into_comparison_1): Always
        combine comparisons or conversions from booleans.

Thus me fixing my "mistake" does not help anymore.

I suppose RTL CSE cannot CSE flag register sets...?

Btw, my previous comment was incorrect - the code is what is now produced on
trunk, while on the 4.9 branch we create

test_0:
.LFB0:
        .cfi_startproc
        testl   %edi, %edi
        movl    %edx, %eax
        sete    %r8b
        movl    %r8d, %edi
        xorl    $1, %edi
        testb   %r8b, %r8b
        movzbl  %dil, %edi
        cmovne  %esi, %eax
        movl    %edi, 8(%rcx)
        ret

which means code generation improved for x86...
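For readers following along without the PR's testcase at hand: the GIMPLE dump and the 4.9 assembly above are consistent with a C function along the following lines. This is only a reconstruction inferred from the dump and the x86-64 argument registers, not the testcase attached to the PR. The parameter names b and c and the exact source expressions are assumptions; only test_0, a, d and x appear in the dumps.

```c
/* Hypothetical source shape inferred from the GIMPLE and assembly in this
   comment. Not the PR's actual testcase; b and c are assumed names. */
int test_0(int a, int b, int c, int *d)
{
    char x = (a == 0);   /* _3 = a_2(D) == 0;  x_4 = (char) _3;        */
    d[2] = !x;           /* _7 = ~_3;  _8 = (int) _7;  store at d + 8B */
    return x ? b : c;    /* if (x_4 != 0), emitted as cmovne on x86    */
}
```

Both the stored value and the branch condition derive from the single comparison a == 0, which is why it matters whether the comparison is forwarded into the condition a second time or its result is reused.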