On Wed, Jun 8, 2016 at 2:35 PM, Jan Hubicka <hubi...@ucw.cz> wrote:
>> On 06/08/2016 12:21 PM, Andreas Schwab wrote:
>> > Jan Hubicka <hubi...@ucw.cz> writes:
>> >
>> >> Bootstrapped/regtested x86_64-linux, will commit it later today.
>> >
>> > FAIL: gcc.dg/tree-ssa/slsr-8.c scan-tree-dump-times optimized " w?\\\\* " 7
>> >
>> > Andreas.
>> >
>>
>> Hi.
>>
>> It's caused by a different probabilities for BB 2:
>>
>> @@ -11,11 +11,11 @@
>>  ;; 3 succs { 4 }
>>  ;; 4 succs { 1 }
>>  Predictions for bb 2
>> -  DS theory heuristics: 78.4%
>> -  first match heuristics (ignored): 85.0%
>> -  combined heuristics: 78.4%
>> -  pointer (on trees) heuristics: 85.0%
>> -  early return (on trees) heuristics: 39.0%
>> +  DS theory heuristics: 66.5%
>> +  first match heuristics (ignored): 70.0%
>> +  combined heuristics: 66.5%
>> +  pointer (on trees) heuristics: 70.0%
>> +  early return (on trees) heuristics: 46.0%
>
> I see this is because sinking is done when PARAM_SINK_FREQUENCY_THRESHOLD
> is met and that is 75% which seems quite ambitious for guessed profiles
> that tends to be flat.  (also the code should use counts where available).
> For some optimizers we have two thresholds - one for guessed profile and one
> for FDO. Perhaps it would make sense to benchmark how decreasing this 
> threshold
> affect performance & code size.
>
> What are the downsides of sinking? Increased register pressure?

Possibly.  But that depends on the whole stmt chain that is eventually be sunken
and this heuristic is for single stmts - which makes it somewhat fishy.  I'd
simply benchmark removing it ...   Eventually it tries to avoid sinking to
post-dominated blocks this way (those should have the same frequency), not sure.

Richard.

> For non-loop
> branches it is bit iffy to rely on static branch prediction to even give the
> right direction of the branch. It happens in about 65% of cases (where perfect
> predictor would do 85%) so we may try to come with heuristics that does not
> fully rely on the profile.
>
> We could probably fix the testcase by adding --param 
> sink-frequency-threshold=55
>
> Honza
>>
>> Which leads to a different decision made by tree-ssa-sink:
>>
>> +++ /tmp/sl-new/slsr-8.c.127t.sink    2016-06-08 14:07:59.747958332 +0200
>> @@ -21,6 +21,16 @@
>>   from bb 2 to bb 3
>>  Sinking a3_17 = s_11(D) * 6;
>>   from bb 2 to bb 3
>> +Sinking x2_16 = c_13(D) + _6;
>> + from bb 2 to bb 5
>> +Sinking _6 = -_5;
>> + from bb 2 to bb 5
>> +Sinking _5 = _4 * 4;
>> + from bb 2 to bb 5
>> +Sinking _4 = (long unsigned int) a2_15;
>> + from bb 2 to bb 5
>> +Sinking a2_15 = s_11(D) * 4;
>> + from bb 2 to bb 5
>>  f (int s, int * c)
>>  {
>>    int * x3;
>> @@ -46,17 +56,17 @@
>>    _2 = _1 * 4;
>>    _3 = -_2;
>>    x1_14 = c_13(D) + _3;
>> -  a2_15 = s_11(D) * 4;
>> -  _4 = (long unsigned int) a2_15;
>> -  _5 = _4 * 4;
>> -  _6 = -_5;
>> -  x2_16 = c_13(D) + _6;
>>    if (x1_14 != 0B)
>>      goto <bb 5>;
>>    else
>>      goto <bb 3>;
>>
>>    <bb 5>:
>> +  a2_15 = s_11(D) * 4;
>> +  _4 = (long unsigned int) a2_15;
>> +  _5 = _4 * 4;
>> +  _6 = -_5;
>> +  x2_16 = c_13(D) + _6;
>>    goto <bb 4>;
>>
>>    <bb 3>:
>>
>> That eventually leads to 9 occurrences of the scanned pattern. However, I'm 
>> not sure if the test-case makes
>> sense any longer?
>>
>> Thanks,
>> Martin
>
>>
>> ;; Function f (f, funcdef_no=0, decl_uid=1747, cgraph_uid=0, symbol_order=0)
>>
>> ;; 1 loops found
>> ;;
>> ;; Loop 0
>> ;;  header 0, latch 1
>> ;;  depth 0, outer -1
>> ;;  nodes: 0 1 2 5 3 4
>> ;; 2 succs { 5 3 }
>> ;; 5 succs { 4 }
>> ;; 3 succs { 4 }
>> ;; 4 succs { 1 }
>> Sinking x3_18 = c_13(D) + _9;
>>  from bb 2 to bb 3
>> Sinking _9 = -_8;
>>  from bb 2 to bb 3
>> Sinking _8 = _7 * 4;
>>  from bb 2 to bb 3
>> Sinking _7 = (long unsigned int) a3_17;
>>  from bb 2 to bb 3
>> Sinking a3_17 = s_11(D) * 6;
>>  from bb 2 to bb 3
>> Sinking x2_16 = c_13(D) + _6;
>>  from bb 2 to bb 5
>> Sinking _6 = -_5;
>>  from bb 2 to bb 5
>> Sinking _5 = _4 * 4;
>>  from bb 2 to bb 5
>> Sinking _4 = (long unsigned int) a2_15;
>>  from bb 2 to bb 5
>> Sinking a2_15 = s_11(D) * 4;
>>  from bb 2 to bb 5
>> f (int s, int * c)
>> {
>>   int * x3;
>>   int * x2;
>>   int * x1;
>>   int a3;
>>   int a2;
>>   int a1;
>>   long unsigned int _1;
>>   long unsigned int _2;
>>   sizetype _3;
>>   long unsigned int _4;
>>   long unsigned int _5;
>>   sizetype _6;
>>   long unsigned int _7;
>>   long unsigned int _8;
>>   sizetype _9;
>>   int * iftmp.0_10;
>>
>>   <bb 2>:
>>   a1_12 = s_11(D) * 2;
>>   _1 = (long unsigned int) a1_12;
>>   _2 = _1 * 4;
>>   _3 = -_2;
>>   x1_14 = c_13(D) + _3;
>>   if (x1_14 != 0B)
>>     goto <bb 5>;
>>   else
>>     goto <bb 3>;
>>
>>   <bb 5>:
>>   a2_15 = s_11(D) * 4;
>>   _4 = (long unsigned int) a2_15;
>>   _5 = _4 * 4;
>>   _6 = -_5;
>>   x2_16 = c_13(D) + _6;
>>   goto <bb 4>;
>>
>>   <bb 3>:
>>   a3_17 = s_11(D) * 6;
>>   _7 = (long unsigned int) a3_17;
>>   _8 = _7 * 4;
>>   _9 = -_8;
>>   x3_18 = c_13(D) + _9;
>>
>>   <bb 4>:
>>   # iftmp.0_10 = PHI <x2_16(5), x3_18(3)>
>>   return iftmp.0_10;
>>
>> }
>>
>>
>

Reply via email to