On Thu, 2015-02-12 at 10:09 +0000, Ajit Kumar Agarwal wrote:
> Hello All:
>
> The Loop unrolling without good unrolling factor heuristics becomes the
> performance bottleneck. The Unrolling factor heuristics based on minimum
> Initiation interval is quite useful with respect to better ILP. The minimum
> Initiation interval based on recurrence and resource calculation on Data
> Dependency Graph along with the register pressure can be used to add the
> unrolling factor heuristics. To achieve better ILP with the given schedule,
> the Loops unrolling and the scheduling are inter dependent and has been
> widely used in Software Pipelining Literature along with the more granular
> List and Trace Scheduling.
>
> The recurrence calculation based on the Loop carried dependencies and the
> resource allocation based on the simultaneous access of the resources
> Using the reservation table will give good heuristics with respect to
> calculation of unrolling factor. This has been taken care in the
> MII interval Calculation.
>
> Along with MII, the register pressure should also be considered in the
> calculation of heuristics for unrolling factor.
>
> This enable better heuristics with respect to unrolling factor. The main
> advantage of the above heuristics for unrolling factor is that it can be
> Implemented in the Code generation Level. Currently Loop unrolling is done
> much before the code generation. Let's go by the current implementation
> Of doing Loop unrolling optimization at the Loop optimizer level and
> unrolling happens. After the Current unrolling at the optimizer level the
> above heuristics
> Can be used to do the unrolling at the Code generation Level with the
> accurate Register pressure calculation as done in the register allocator and
> the
> Unrolling is done at the code generation level. This looks feasible solution
> which I am going to propose for the above unrolling heuristics.
>
> This enables the Loop unrolling done at the Optimizer Level + at the Code
> Generation Level. This double level of Loop unrolling is quite useful.
> This will overcome the shortcomings of the Loop unrolling at the optimizer
> level.
>
> The SPEC benchmarks are the better candidates for the above heuristics
> instead of Mibench and EEMBC.
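
For readers who want something concrete, here is a rough sketch of how an
MII-driven unrolling factor with a register-pressure cap might be computed.
It is not GCC code. All names (`res_mii`, `rec_mii`, `pick_unroll_factor`) are
hypothetical, the register estimate is assumed to grow linearly with the
unroll factor, and the selection rule (minimize MII per original iteration)
is just one plausible reading of the proposal, not necessarily what the
quoted message intends.

```python
from fractions import Fraction


def ceil_div(a, b):
    """Integer ceiling division for positive integers."""
    return -(-a // b)


def res_mii(uses, units, unroll=1):
    """Resource bound: worst-case ceil(uses / units) over all resources.

    uses[r]  = how many times one original iteration uses resource r
    units[r] = how many copies of resource r the target has
    """
    return max(ceil_div(unroll * uses[r], units[r]) for r in uses)


def rec_mii(cycles, unroll=1):
    """Recurrence bound over dependence cycles.

    cycles is a list of (total_latency, total_distance) pairs, one per
    loop-carried dependence cycle in the data dependency graph.
    """
    return max((ceil_div(unroll * lat, dist) for lat, dist in cycles), default=1)


def mii(uses, units, cycles, unroll=1):
    """Minimum initiation interval of the loop body unrolled `unroll` times."""
    return max(res_mii(uses, units, unroll), rec_mii(cycles, unroll))


def pick_unroll_factor(uses, units, cycles, regs_per_iter, regs_available,
                       max_unroll=8):
    """Pick the smallest factor that minimizes MII per original iteration.

    Assumption: register demand is regs_per_iter * unroll. A real
    implementation would query the register allocator instead.
    """
    best_u = 1
    best_cost = Fraction(mii(uses, units, cycles, 1), 1)
    for u in range(2, max_unroll + 1):
        if u * regs_per_iter > regs_available:
            break  # larger factors would spill under the linear estimate
        cost = Fraction(mii(uses, units, cycles, u), u)
        if cost < best_cost:
            best_u, best_cost = u, cost
    return best_u


# Hypothetical loop: 3 ALU ops on 2 ALUs, 1 memory op on 1 port,
# and one recurrence with latency 3 spanning 2 iterations.
uses = {"alu": 3, "mem": 1}
units = {"alu": 2, "mem": 1}
cycles = [(3, 2)]
```

With these made-up numbers, the loop body has MII 2 when not unrolled and
MII 3 when unrolled twice (1.5 cycles per original iteration), so the sketch
picks a factor of 2 when registers allow it and stays at 1 when they do not.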

Not taking register pressure into account when unrolling (and doing
other optimizations/choices) is an old problem. See also:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=20969

Cheers,
Oleg