https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81165
Jeffrey A. Law <law at redhat dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |law at redhat dot com

--- Comment #10 from Jeffrey A. Law <law at redhat dot com> ---
WRT c#9.  Precisely.  There's just one too many statements in the block for the
threader to think it's profitable to clone the block.

I've long wanted to have some kind of indication of how many statements are
going to be eliminated by jump threading within the duplicated block so that we
don't have to be so pessimistic.  There's at least one more BZ in the
regression list which touches on this issue.

Here's the block in question:

  # t0_36 = PHI <-1(3), 0(2)>
  # t1_37 = PHI <t1_31(3), 2(2)>
  # prephitmp_18 = PHI <_17(3), 0(2)>
  # prephitmp_19 = PHI <_9(3), 2(2)>
  # VUSE <.MEM_28>
  x0.3_41 = x0;
  _42 = (int) x0.3_41;
  _43 = 29 % _42;
  _44 = _43 & 25;
  _45 = (long unsigned int) _44;
  _46 = _45 * 10;
  _47 = 128 % _46;
  _48 = (char) _47;
  _49 = (unsigned int) _48;
  _50 = prephitmp_18 + _49;
  _51 = (int) _50;
  if (_51 < 0)
    goto <bb 4>; [85.00%]
  else
    goto <bb 7>; [15.00%]

Essentially, starting at the control statement, we could realize that _51 is a
single-use SSA_NAME.  So if we thread, its defining statement will go away, and
we don't need to count it.  Then we look at its operand(s): _50.  _50 is
single-use as well, so its defining statement will go away too.  And so on until
we hit the start of the block.

We can then use that information to get a much better estimate of the
code-size cost of cloning the block.

Alex, want to take a stab at it?
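
The backward walk described above can be sketched in Python on a toy model of the block. This is only an illustration, not GCC code: `count_removable`, `BLOCK`, and the `(lhs, operands)` tuple representation are hypothetical names made up for this sketch. It assumes that every SSA name in the block has no uses outside the block unless listed in `uses_outside` (the dump alone does not show that), and it ignores side effects such as volatile loads or trapping operations, which a real implementation would have to check.

```python
from collections import Counter

def count_removable(stmts, cond_operands, uses_outside=()):
    """Return the SSA names whose defining statements would become dead
    if the block's final control statement were removed by threading.

    stmts:         list of (lhs, [operand names]) in block order, PHIs excluded
    cond_operands: SSA names used by the control statement
    uses_outside:  names assumed to have additional uses in other blocks
    """
    defs = {lhs: ops for lhs, ops in stmts}

    # Count every use of every name we know about.
    use_count = Counter()
    for _, ops in stmts:
        use_count.update(ops)
    use_count.update(cond_operands)
    use_count.update(uses_outside)

    dead = set()
    worklist = list(cond_operands)
    while worklist:
        name = worklist.pop()
        # Skip names defined outside the block (PHI results, globals).
        if name in dead or name not in defs:
            continue
        # Only a single-use name dies along with its one (dead) user.
        if use_count[name] != 1:
            continue
        dead.add(name)
        worklist.extend(defs[name])
    return dead

# Toy transcription of the block from the comment (constants omitted).
BLOCK = [
    ("x0.3_41", ["x0"]),
    ("_42", ["x0.3_41"]),
    ("_43", ["_42"]),
    ("_44", ["_43"]),
    ("_45", ["_44"]),
    ("_46", ["_45"]),
    ("_47", ["_46"]),
    ("_48", ["_47"]),
    ("_49", ["_48"]),
    ("_50", ["prephitmp_18", "_49"]),
    ("_51", ["_50"]),
]

dead = count_removable(BLOCK, ["_51"])
print(f"{len(dead)} of {len(BLOCK)} statements need not be counted")
```

Under those assumptions the whole chain feeding `_51` is single-use, so none of the block's non-PHI statements would count toward the duplication cost. If, say, `_47` had another use elsewhere (`uses_outside=["_47"]`), the walk would stop there and only the statements defining `_48` through `_51` would be discounted.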