On Thu, 2015-02-12 at 10:09 +0000, Ajit Kumar Agarwal wrote:
> Hello All:
> 
> Loop unrolling without a good unrolling-factor heuristic becomes a
> performance bottleneck. A heuristic based on the minimum initiation
> interval (MII) is quite useful for achieving better ILP. The MII, derived
> from recurrence and resource calculations on the data dependency graph,
> can be combined with register pressure to drive the unrolling-factor
> heuristic. To achieve better ILP with a given schedule, loop unrolling and
> scheduling are interdependent; this has been widely used in the software
> pipelining literature, along with the more granular list and trace
> scheduling.
> 
> The recurrence calculation, based on loop-carried dependencies, and the
> resource calculation, based on simultaneous access to resources using
> the reservation table, give a good heuristic for calculating the
> unrolling factor. Both are taken care of in the MII calculation.
> 
> Along with MII, register pressure should also be considered when
> calculating the unrolling factor.
> 
> This enables a better heuristic for the unrolling factor. The main
> advantage of the above heuristic is that it can be implemented at the
> code generation level. Currently, loop unrolling is done well before
> code generation. Let's keep the current implementation, where loop
> unrolling happens at the loop optimizer level. After that unrolling,
> the above heuristic can be used to unroll again at the code generation
> level, with the accurate register pressure calculation done in the
> register allocator. This looks like a feasible solution, and it is what
> I am going to propose for the above unrolling heuristic.
> 
> This enables loop unrolling at the optimizer level plus at the code
> generation level. This two-level loop unrolling is quite useful and will
> overcome the shortcomings of loop unrolling at the optimizer level.
> 
> The SPEC benchmarks are better candidates for the above heuristics
> than MiBench and EEMBC.
Not taking register pressure into account when unrolling (and doing
other optimizations/choices) is an old problem.  See also:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=20969
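
To make the quoted idea concrete, here is a rough sketch of how an
MII-based unrolling factor might be chosen under a register budget. It is
not GCC code and not necessarily what the proposal has in mind. The
function names, the per-iteration register counts, and the linear pressure
model are all assumptions made for illustration; a real implementation
would take pressure from the register allocator. The sketch uses the
common observation that a fractional MII (say 3/2 cycles per iteration)
can only be reached by unrolling until the unrolled body has an integral
initiation interval.

```python
from fractions import Fraction
from math import ceil

def fractional_mii(uses, units, cycles):
    """Lower bound on cycles per iteration (hypothetical helper).

    uses/units: per-resource use count per iteration and number of units.
    cycles: (latency, distance) pairs for loop-carried dependence cycles.
    """
    # Resource bound: uses of each resource divided by available units.
    res = max((Fraction(uses[r], units[r]) for r in uses), default=Fraction(0))
    # Recurrence bound: latency around a cycle divided by its distance.
    rec = max((Fraction(lat, dist) for lat, dist in cycles), default=Fraction(0))
    return max(res, rec)

def pick_unroll_factor(mii, regs_per_iter, regs_invariant, regs_available,
                       max_factor=8):
    """Pick the smallest factor giving the best cycles per original iteration.

    Assumed pressure model: invariants plus one set of live values per copy.
    """
    best_u, best_rate = 1, Fraction(ceil(mii))
    for u in range(2, max_factor + 1):
        if regs_invariant + u * regs_per_iter > regs_available:
            break  # further unrolling would spill under this model
        rate = Fraction(ceil(u * mii), u)  # integral II spread over u copies
        if rate < best_rate:
            best_u, best_rate = u, rate
    return best_u, best_rate

mii = fractional_mii({"alu": 2, "mem": 3}, {"alu": 2, "mem": 2}, [(4, 3)])
print(mii, pick_unroll_factor(mii, 5, 4, 16), pick_unroll_factor(mii, 5, 4, 8))
```

With 16 registers the sketch unrolls by 2 to reach 3/2 cycles per
iteration; with only 8 it stays at factor 1, which is the kind of
trade-off the bug report above is about.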

Cheers,
Oleg
