On 08/15/2015 11:01 AM, Ajit Kumar Agarwal wrote:
All:
Please find the updated patch with the suggestions and feedback
incorporated.
Thanks Jeff and Richard for the review comments.
The following changes were made based on the feedback on the RFC
and the review of the previous patch.
1. The tracer and path splitting passes are separate passes, so two
instances will run in the end, one doing path splitting and one
doing tracing, at different times in the optimization pipeline.
I'll have to think about this. I'm not sure I agree totally with
Richi's assertion that we should share code with the tracer pass, but
I'll give it a good looksie.
2. The transform code is shared between the tracer and path splitting
passes. The common code is extracted into a function,
transform_duplicate, which is placed in tracer.c, and the path
splitting pass uses that transform code.
OK. I'll take a good look at that.
3. The analysis that populates the basic blocks and traverses them
using the Fibonacci heap is common to both passes. It cannot be
factored out into a new function, because the tracer pass does more
analysis based on the profile, and the tracer and path splitting
passes use different heuristics.
Understood.
4. The included headers are minimal: only what is required for the
path splitting pass is present.
Thanks.
5. The earlier patch did the SSA updating with a replace function to
preserve the SSA representation when moving the loop latch node,
which is also the join block, to its predecessors, leaving the loop
latch node as just a forwarder block. Such replace functions are not
required, as suggested by Jeff. They go away with this patch, and
the transform code is factored into a function that is shared
between the tracer and path splitting passes.
Sounds good.
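To make the shape of the transformation in item 5 concrete, here is a toy
sketch of duplicating a join block into each of its predecessors. It is
only an illustration under stated assumptions: the CFG is a plain
dictionary, the block names and the split_join helper are hypothetical,
and none of this is the transform_duplicate code in the patch or any GCC
API.

```python
# Toy CFG model: block name -> list of successor block names.
# All names here are hypothetical and unrelated to GCC's data structures.

def split_join(cfg, join):
    """Return a new CFG in which `join` is duplicated into each predecessor."""
    # Predecessors are the blocks (other than the join itself) that branch to it.
    preds = [b for b, succs in cfg.items() if join in succs and b != join]
    # Start from a copy of the CFG without the original join block.
    new_cfg = {b: list(succs) for b, succs in cfg.items() if b != join}
    for pred in preds:
        copy = f"{join}.{pred}"
        # Each copy keeps the join block's original successors.
        new_cfg[copy] = list(cfg[join])
        # The predecessor now flows into its private copy of the join block.
        new_cfg[pred] = [copy if s == join else s for s in new_cfg[pred]]
    return new_cfg


# An if-then-else inside a loop, where the latch is also the join block.
cfg = {
    "header": ["then", "else"],
    "then": ["latch"],
    "else": ["latch"],
    "latch": ["header"],
}
split = split_join(cfg, "latch")
```

After the call, "then" and "else" each flow into their own copy of the
latch, and both copies branch back to "header", so each path through the
loop body can be optimized separately.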
Bootstrapping with the i386 and Microblaze targets works fine. No
regressions are seen in the DejaGnu tests for Microblaze. There are
lesser failures. The Mibench/EEMBC benchmarks were run for the
Microblaze target, and a gain of 9.3% is seen in rgbcmy_lite, one of
the EEMBC benchmarks.
What do you mean by there are "lesser failures"? Are you saying there
are cases where path splitting generates incorrect code, or cases where
path splitting produces code that is less efficient, or something else?
The SPEC 2000 benchmarks were run with the i386 target, and the
following performance numbers were obtained.
INT benchmarks with path splitting(ratio) Vs INT benchmarks without
path splitting(ratio) = 3661.225091 vs 3621.520572
That's an impressive improvement.
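For reference, the relative change implied by those two ratios can be
checked with a short calculation. This is a minimal sketch; the function
name is illustrative and not taken from the patch or from the SPEC tools.

```python
def relative_gain_percent(with_opt, without_opt):
    """Percentage change of with_opt relative to without_opt."""
    return (with_opt - without_opt) / without_opt * 100.0


# SPEC 2000 INT ratios quoted above: with vs. without path splitting.
gain = relative_gain_percent(3661.225091, 3621.520572)
print(f"{gain:.2f}%")
```

That works out to an overall INT gain of roughly 1.1%.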
Anyway, I'll start taking a close look at this momentarily.
Jeff