https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80520
Jeffrey A. Law <law at redhat dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |rguenth at gcc dot gnu.org

--- Comment #9 from Jeffrey A. Law <law at redhat dot com> ---
So AFAICT there are two issues that need to be addressed: PRE and split-paths.

First up is PRE.  Compile the sample code from c#5/c#6 with -O3
-fno-split-paths.


Prior to PRE we have:

  if (_16 != 0)
    goto <bb 5>; [50.00%]
  else
    goto <bb 4>; [50.00%]

  <bb 4> [local count: 531502203]:

  <bb 5> [local count: 1063004407]:
  # iftmp.0_19 = PHI <2567483615(3), 0(4)>
  _17 = _15 ^ iftmp.0_19;

That's actually reasonably good.  While it's not a conditional move in the
gimple, it's in a form that will be easy for the RTL optimizers to handle, and
they would generate a suitable cmov on x86_64 if we just left it alone.


PRE (correctly) identifies that it can reduce the number of expression
evaluations on the path traversing bb3->bb5 by hoisting the XOR with the
non-zero constant into BB4 resulting in:

  if (_16 != 0)
    goto <bb 4>; [50.00%]
  else
    goto <bb 5>; [50.00%]

  <bb 4> [local count: 531502203]:
  _52 = _15 ^ 2567483615;

  <bb 5> [local count: 1063004407]:
  # iftmp.0_19 = PHI <2567483615(4), 0(3)>
  # prephitmp_53 = PHI <_52(4), _15(3)>

That's correct, but far from ideal.
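
For readers who want the two shapes in source form, here is a minimal C sketch.
It is an illustration only, not the sample code from c#5/c#6: the function
names, the 64-bit unsigned type and the self-check helper are assumptions; only
the constant comes from the dumps above.

```c
#include <assert.h>
#include <stdint.h>

/* Constant taken from the GIMPLE dumps above. */
#define XOR_MASK UINT64_C(2567483615)

/* Shape before PRE: pick a constant (mask or 0) at the join, then XOR
   unconditionally.  Hypothetical name. */
uint64_t xor_select_constant(uint64_t x, uint64_t cond)
{
    uint64_t tmp = (cond != 0) ? XOR_MASK : 0;
    return x ^ tmp;
}

/* Shape after PRE: the XOR is evaluated only on the conditional arm, and
   the join picks between the XORed value and the original one.
   Hypothetical name. */
uint64_t xor_on_one_arm(uint64_t x, uint64_t cond)
{
    uint64_t result = x;
    if (cond != 0)
        result = x ^ XOR_MASK;
    return result;
}

/* Self-check: both shapes compute the same value for the given inputs. */
void check_shapes_agree(uint64_t x, uint64_t cond)
{
    assert(xor_select_constant(x, cond) == xor_on_one_arm(x, cond));
}
```

Both functions return the same value for every input; they differ only in
where the XOR sits relative to the branch, which is the difference PRE
introduces in the dumps.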


So the second issue is split-paths.  There are actually two problems to deal
with in split-paths.




As it stands today this is what we see in split-paths (as a result of the PRE
de-optimization):


  <bb 3>
  [ ... ]
  if (_20 != 0)
    goto <bb 5>; [50.00%]
  else
    goto <bb 4>; [50.00%]

  <bb 4> [local count: 531502203]:
  _18 = _25 ^ 2567483615;

  <bb 5> [local count: 1063004407]:
  # prephitmp_49 = PHI <_25(3), _18(4)>
  _2 = (void *) ivtmp.8_30;
  MEM[base: _2, offset: 0B] = prephitmp_49;
  ivtmp.8_29 = ivtmp.8_30 + 8;
  if (ivtmp.8_29 != _6)
    goto <bb 3>; [98.99%]
  else
    goto <bb 6>; [1.01%]

split-paths should try not to muck it up further.  Note that we can probably
identify this half-diamond pretty easily.  bb3 dominates bb4.  bb4 has a single
statement that feeds a PHI in bb5.  That's a very likely if-conversion
candidate so split-paths ought to leave it alone.
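
The half-diamond test described above can be sketched on a toy CFG model.
This is not GCC's basic_block or the code in the split-paths pass: the struct,
its fields and all function names are hypothetical, and the statement counts
used for bb3 and bb5 are placeholders because the check does not look at them.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy block, not GCC's basic_block. */
struct toy_block {
    const struct toy_block *idom;        /* immediate dominator, NULL at entry */
    const struct toy_block *single_succ; /* sole successor, NULL if none or several */
    int num_stmts;                       /* statements in the block */
    bool stmt_feeds_join_phi;            /* result is a PHI argument in the successor */
};

/* True if A dominates B, found by walking B's immediate-dominator chain. */
bool toy_dominates(const struct toy_block *a, const struct toy_block *b)
{
    assert(a != NULL);
    for (; b != NULL; b = b->idom)
        if (a == b)
            return true;
    return false;
}

/* The condition from the text: HEAD dominates SIDE, and SIDE holds a
   single statement whose result feeds a PHI in JOIN. */
bool looks_like_half_diamond(const struct toy_block *head,
                             const struct toy_block *side,
                             const struct toy_block *join)
{
    assert(head != NULL && side != NULL && join != NULL);
    return head != side
        && toy_dominates(head, side)
        && side->single_succ == join
        && side->num_stmts == 1
        && side->stmt_feeds_join_phi;
}

/* Builds the bb3/bb4/bb5 shape from the dump with a configurable bb4. */
bool dump_shape_is_half_diamond(int side_stmts, bool feeds_phi)
{
    struct toy_block bb3 = { NULL, NULL, 1, false };  /* placeholder count */
    struct toy_block bb5 = { &bb3, NULL, 1, false };  /* placeholder count */
    struct toy_block bb4 = { &bb3, &bb5, side_stmts, feeds_phi };
    return looks_like_half_diamond(&bb3, &bb4, &bb5);
}
```

With bb4 holding one statement that feeds the PHI, as in the dump, the check
accepts the shape; a second statement in bb4, or a statement that does not feed
the PHI, makes it reject.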

If we were to fix PRE then split-paths would be presented with something like
this:

  <bb 3>
  [ ... ]
  if (_47 != 0)
    goto <bb 4>; [50.00%]
  else
    goto <bb 5>; [50.00%]

  <bb 4> [local count: 531502203]:

  <bb 5> [local count: 1063004407]:
  # iftmp.0_48 = PHI <2567483615(3), 0(4)>
  _49 = _18 ^ iftmp.0_48;

ISTM that when either of the blocks in question (bb3, bb4) has *no* statements
and a single pred that is the other block, then split-paths definitely should
leave it alone as well.
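
The empty-block condition can be sketched the same way on a toy model.  Again
this is an illustration under assumptions, not the prototype patch: the struct,
its fields and the function names are hypothetical, and the statement count
given to bb3 is a placeholder.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy block, not GCC's basic_block. */
struct toy_bb {
    const struct toy_bb *single_pred; /* sole predecessor, NULL if none or several */
    int num_stmts;                    /* statements in the block */
};

/* True if BLK has no statements and its only predecessor is OTHER. */
bool empty_with_pred(const struct toy_bb *blk, const struct toy_bb *other)
{
    assert(blk != NULL && other != NULL);
    return blk->num_stmts == 0 && blk->single_pred == other;
}

/* The condition from the text, applied to the two blocks feeding the
   join: either one is empty and has the other as its single pred. */
bool preds_form_empty_half_diamond(const struct toy_bb *pred0,
                                   const struct toy_bb *pred1)
{
    return empty_with_pred(pred0, pred1) || empty_with_pred(pred1, pred0);
}

/* Builds the bb3/bb4 pair from the dump with a configurable bb4 size. */
bool dump_pair_is_empty_half_diamond(int bb4_stmts)
{
    struct toy_bb bb3 = { NULL, 1 };        /* placeholder count */
    struct toy_bb bb4 = { &bb3, bb4_stmts };
    return preds_form_empty_half_diamond(&bb3, &bb4);
}
```

An empty bb4 whose single pred is bb3, as in the dump above, satisfies the
check, so a pass using it would leave the join block alone.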

So, to summarize:

1. PRE mucks things up a bit.
2. split-paths makes it worse.

I've got a prototype patch that implements the two improvements to keep
split-paths from making things worse.  That will improve things, but to really
do a good job we'll have to either do something about PRE or have a pass after
PRE undo PRE's deoptimization.
