6 Regression] Performance degradation after r216304

law at redhat dot com Wed, 09 Mar 2016 23:36:22 -0800

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64058


Jeffrey A. Law <law at redhat dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |amacleod at redhat dot com

--- Comment #14 from Jeffrey A. Law <law at redhat dot com> ---
So working on the assumption that full stability for SSA_NAME_VERSION changes
isn't in the cards, I went back to my patch which avoids using SSA_NAME_VERSION
as the tie-breaker given two identical coalescing costs.

I can verify for this BZ that change brings stability to the order in which we
process coalescing candidates with the same cost.  Essentially the tie breaker
becomes which coalescing opportunity is seen first during the coalescer's walk
of the IL.

That's fine and good, but as we know only addresses half the problem.  It does
not address whether or not adjustments need to be made to the costing mechanism
itself or whether we can come up with a better tie breaker.

For this BZ, using the referenced trunk versions these are the key coalescings
in the chain.
r216303:
(4004) _1 <-> _316
(3276) _1 <-> _200
(3276) u1_lsm.6_253 <-> _316

Coalesce list: (1)_1 & (316)_316 [map: 0, 98] : Success -> 0
Coalesce list: (1)_1 & (200)_200 [map: 0, 46] : Success -> 0
Coalesce list: (1)_1 & (253)l1_lsm.7_159 [map: 0, 32] : Fail due to conflict


The successful coalescings and subsequent merging of conflicts cause the 3rd
coalesce to fail (and others).  This is actually what we want for this BZ.

In r216304 we have:
(4004) _315 <-> _316
(3276) u1_lsm.6_253 <-> _316
(3276) _314 <-> _315

Coalesce list: (315)_315 & (316)_316 [map: 97, 98] : Success -> 97
Coalesce list: (253)l1_lsm.7_58 & (316)_315 [map: 4, 97] : Success -> 4
Coalesce list: (314)_314 & (315)l1_lsm.7_58 [map: 96, 4] : Fail due to conflict


Note that the coalescing pairs with cost 3276 are reversed due to
SSA_NAME_VERSION churn.  That allows 253 to coalesce with 316 which in turn
prevents 314 from coalescing with 315 (and other downstream effects).  The net
result of those downstream effects is an unwanted copy.

Looking at the costing metrics, they are strictly based on the edge frequency
and flags where the copy would have to be inserted.

What jumps out to me is that the costing metrics don't look at all at the
conflicts.  When we coalesce two objects, we have to merge their conflicts into
the partition leader.  Essentially each time we coalesce, the resulting
partition leader is going to have a larger conflict list and is less likely to
participate in further coalescing.

That might argue that peeking at the union of the resulting conflicts and
giving higher priority to the one that has a smaller resulting conflict set may
be useful in the cost calculation or as a better tie-breaker.

So going back to r216303 and looking at the original partition conflicts we
have:

Partition 0 (_1 - 1 )
Partition 98 (_316 - 316 )
Partition 46 (_200 - 200 )
Partition 72 (u1_lsm.6_253 - 253 )

Conflict graph:
0: 4, 9, 28, 35, 41, 42, 44, 53, 60, 66, 75, 81, 84, 90, 93, 96, 100, 102

46: 4, 5, 6, 9, 28, 35, 41, 42, 44, 51, 63, 66, 75, 81, 84, 90, 93, 96, 100,
102

72: 4, 9, 28, 35, 44, 55, 67, 74, 75, 78, 84, 85, 90, 93, 96, 102, 108, 110,
111

98: 4, 7, 8, 9, 10, 11, 14, 15, 27, 28, 35, 41, 42, 44, 47, 48, 50, 52, 56, 57,
58, 59, 65, 75, 81, 84, 90, 93, 96, 102

Coalescing _1 and _200 will result in a node with 22 unique conflicts.  
Coalescing _253 and _316 will result in a node with 38 unique conflicts.

So if we used the conflicts sets either directly in the costing model or for
breaking ties and gave preference to coalescings with smaller resulting
conflicts, we'd get the right code for this test.

The question then becomes whether or not that is good in general.

[Bug tree-optimization/64058] [5/6 Regression] Performance degradation after r216304

Reply via email to