https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81165

--- Comment #11 from rguenther at suse dot de <rguenther at suse dot de> ---
On December 4, 2017 6:55:02 PM GMT+01:00, law at redhat dot com
<gcc-bugzi...@gcc.gnu.org> wrote:
>https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81165
>
>Jeffrey A. Law <law at redhat dot com> changed:
>
>           What    |Removed                     |Added
>----------------------------------------------------------------------------
>                 CC|                            |law at redhat dot com
>
>--- Comment #10 from Jeffrey A. Law <law at redhat dot com> ---
>WRT c#9.  Precisely.  There's just one too many statements in the block for the
>threader to think it's profitable to clone the block.
>
>I've long wanted to have some kind of indication of how many statements are
>going to be eliminated by jump threading within the duplicated block so that we
>didn't have to be so pessimistic.  There's at least one more BZ in the
>regression list which touches on this issue.  Here's the block in question:
>
>  # t0_36 = PHI <-1(3), 0(2)>
>  # t1_37 = PHI <t1_31(3), 2(2)>
>  # prephitmp_18 = PHI <_17(3), 0(2)>
>  # prephitmp_19 = PHI <_9(3), 2(2)>
>  # VUSE <.MEM_28>
>  x0.3_41 = x0;
>  _42 = (int) x0.3_41;
>  _43 = 29 % _42;
>  _44 = _43 & 25;
>  _45 = (long unsigned int) _44;
>  _46 = _45 * 10;
>  _47 = 128 % _46;
>  _48 = (char) _47;
>  _49 = (unsigned int) _48;
>  _50 = prephitmp_18 + _49;
>  _51 = (int) _50;
>  if (_51 < 0)
>    goto <bb 4>; [85.00%]
>  else
>    goto <bb 7>; [15.00%]
>
>
>Essentially, starting at the control statement, we could realize that _51 is a
>single-use SSA_NAME.  So if we thread, its defining statement will go away, so
>we don't need to count it.  Then we look at its operand(s).  _50.  _50 is
>single use as well, so its defining statement will go away.  And so on until we
>hit the start of the block.  We can then use that information to get a much
>better estimate of the codesize cost of cloning the block.
>
>Alex, want to take a stab at it?

Also PHI nodes which are all single argument after threading shouldn't really
count as we can propagate them away.
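
The backward walk described above, together with the PHI discount, can be sketched roughly as follows. This is a toy model in Python, not GCC's GIMPLE API: `Stmt`, `estimate_clone_cost`, `outside_uses`, `n_phis` and `single_pred_after_threading` are hypothetical names, use counts are derived from the toy block itself, and side effects (stores, trapping operations) are ignored. It generalizes "single use" slightly by treating a definition as removable once all of its uses belong to statements already known to go away.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Stmt:
    lhs: str        # SSA name defined by the statement
    operands: list  # SSA names the statement reads


def estimate_clone_cost(stmts, cond_operands, outside_uses=None,
                        n_phis=0, single_pred_after_threading=True):
    """Estimate how many statements survive in a cloned block.

    Assumes the control statement is removed by threading, so its
    operands lose one use each. Returns (cost, dead_names).
    """
    defs = {s.lhs: s for s in stmts}

    # Count every use: outside the block, inside it, and in the condition.
    remaining = Counter(outside_uses or {})
    for s in stmts:
        remaining.update(s.operands)
    remaining.update(cond_operands)

    dead = set()
    worklist = list(cond_operands)
    while worklist:
        name = worklist.pop()
        remaining[name] -= 1  # one user of `name` is going away
        s = defs.get(name)
        if s is None or remaining[name] > 0:
            continue  # defined elsewhere (e.g. a PHI), or still used
        dead.add(name)
        worklist.extend(s.operands)

    # PHIs with a single argument after threading can be propagated away.
    phi_cost = 0 if single_pred_after_threading else n_phis
    return len(stmts) - len(dead) + phi_cost, dead


# Tail of a block shaped like the one quoted above (use counts assumed).
block = [
    Stmt("_48", ["_47"]),
    Stmt("_49", ["_48"]),
    Stmt("_50", ["prephitmp_18", "_49"]),
    Stmt("_51", ["_50"]),
]
cost, dead = estimate_clone_cost(block, ["_51"], n_phis=4)
```

In this toy block every name feeds only the next statement, so the whole chain is discounted; giving `_49` an extra use elsewhere (`outside_uses={"_49": 1}`) stops the walk there and leaves `_48` and `_49` counted.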

Loop unrolling has crude heuristics to estimate stmts eliminated by constant
propagation, something to look at as well. And then there's the possibility to
simply run DOM on the path and, instead of modifying the original code, emit
copies in a new sequence, costing, copying and optimizing in one go. If it gets
too expensive, simply throw the sequence away.
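
A minimal sketch of the "copy, cost and optimize in one go" idea, again as a toy Python model and not GCC code. The function `copy_path`, the tuple-based statement form and the `budget` parameter are all hypothetical, and constant folding stands in for the much richer set of simplifications DOM performs. Statements are copied into a fresh sequence; anything that folds to a constant on this path costs nothing, and the copy is abandoned as soon as the emitted sequence exceeds the budget.

```python
import operator

# Toy binary operations; a real pass would handle far more than this.
OPS = {"add": operator.add, "mul": operator.mul, "and": operator.and_}


def copy_path(stmts, known, budget):
    """Copy `stmts` into a new sequence, folding constants as we go.

    stmts:  list of (lhs, op, args); args are SSA names or int constants.
    known:  dict mapping SSA names to constants valid on this path
            (e.g. PHI arguments selected by the incoming edge).
    budget: maximum number of statements we are willing to emit.

    Returns (new_sequence, env), or None if the copy got too expensive.
    The original `stmts` list is never modified.
    """
    env = dict(known)
    out = []
    for lhs, op, args in stmts:
        # Substitute known constants for names; leave unknown names as-is.
        vals = [env.get(a, a) if isinstance(a, str) else a for a in args]
        if all(isinstance(v, int) for v in vals):
            env[lhs] = OPS[op](*vals)  # folded away: nothing emitted
            continue
        out.append((lhs, op, vals))
        if len(out) > budget:
            return None  # too expensive: throw the sequence away
    return out, env


path = [
    ("t", "add", ["p", 1]),
    ("u", "mul", ["t", "x"]),
    ("v", "and", ["u", 25]),
]
result = copy_path(path, known={"p": 0}, budget=2)
```

Because the original statements are left untouched, giving up is just a matter of dropping the new sequence, which is what makes the speculative approach cheap to back out of.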

Richard.
