All: Loop fusion is an important optimizations that fuses the set of Loops if the following condition is valid.
1) Loops are conformant ( i.e. they have same iteration count). 2. Loops are control equivalent. The control equivalence of the loops can be identified with the dominator and post dominator properties Of the Loop. Two Loops Lj and LK are control equivalent if the Lj dominates Lk and LK post dominates the Lj. 3. Loops are adjacent. The Loops are adjacent if there is no intervening code between them. The intervening code can be determined With the dominance properties. If there is an aggregate node ax where the Loop Lj dominates ax and the Loop Lk post dominates The ax then there is a intervening code. 4. There are forward dependence between the Loops. If the properties 1. Doesn't hold i.e. the loops are not conformant and they don't have same iteration count. Then the following Transformation can be done as given in Fig(1) and Fig(2) to make the set of Loops as non-conformant. In Fig(1) the Loops i1 and i2 are non-conformant as they don't have same iteration count. The following transformation can be made given in Fig(2). The Fig(2) fuses the Loop for the Loops i1 and i2 that are non-conformant and doesn't have same iteration count. If the properties 3 is not valid and the set of Loops are not adjacent. The intervening code can be moved up the Loop or down the Loop So that the Loop become adjacent and there is no intervening code. Also partially the intervening code is moved up and partial intervening Code is moved down based on the dependence relation and the dominance properties. Such transformation can make loops adjacent and There is no intervening code. Also the Loop fusion, should be done after the Loop normalization pass and the Loop normalization pass should be present before the Loop Fusion pass. The 4 properties to be valid for loop fusion to occur is done with respect to traversing the Loops from outermost to inner most Loop. First the outer loop is traversed and the loops that are at same level of the outer loop are made into one partitions and that partition Can be fused if the above four properties is valid and otherwise the loops are taken out from the partitions. The same logic is done for the Loop traversing outer to inner loops and partitions are formed and later the partitions are fused based on the above properties. Each Loop is traverse in the forward order of dominance tree and also the reverse order of dominance tree ( i.e. post dominator) and the Forward dependence is checked. Based on the dependence the Loops are fused. And the Loops that are non-adjacent partial code is moved Up and partial code is moved down or the whole intervening code is moved up or down. I think the above transformations and properties and enhance the Loop fusion algorithm in GCC with the above Algorithm to have more Loop Fusion with respect to SPEC benchmarks. For ( I1 = 0;I1 < n;I1 ++) { If ( I1 < n-1) S1: Else S2: S3: } For ( I2 = 0 ; I2 < n-2 ;I2 ++) S4; Fig(1). For( I1 = 0; i1< n ; i1++) { If(i1 < n-2) { If(i1< n-1) S1: Else S2: S3: S4: } Else { If ( i1 < n-1) S1: Else S2: S3: } }// For Fig(2). Thanks & Regards Ajit