http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55731
Bug #: 55731
Summary: Issue with complete innermost loop unrolling (cunrolli)
Classification: Unclassified
Product: gcc
Version: 4.8.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: ysrum...@gmail.com
CC: hubi...@ucw.cz, izamya...@gmail.com

I attached two test cases extracted from an important benchmark on which clang and icc outperform gcc for the x86 target (Atom).

For the first test case (t.c), the cunrolli pass does not perform complete loop unrolling and prints the following message (the test was compiled with the -O3 -funroll-loops options):

    Loop size: 23
    Estimated size after unrolling: 33
    Not unrolling loop 1: size would grow.

However, the same loop is unrolled by the cunroll pass:

    Loop size: 24
    Estimated size after unrolling: 32
    Unrolled loop 1 completely (duplicated 2 times).

Why was this loop not unrolled by cunrolli? Because of this, we lose many optimizations on the unrolled loop, such as constant (address) propagation and dead code elimination, and end up with non-optimal binaries.

For comparison, I added another test (t2.c) in which cunrolli successfully performs complete unrolling. There, all assignments to the local array 'b' are properly propagated and deleted, but no such transformations happen for the first test case.
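The attachments t.c and t2.c are not reproduced in this message. As a rough illustration only, the kind of code described (a short constant-trip loop writing to a local array 'b' whose stores could be propagated and deleted after complete unrolling) might look like the sketch below. The function name sum_scaled, the array size and the loop bodies are hypothetical and are not taken from the real test cases, so the size estimates gcc prints for this sketch will differ from those quoted above.

```c
/* Hypothetical stand-in for the attached test cases; NOT the real t.c/t2.c. */
int sum_scaled(const int *a, int k)
{
    int b[4];   /* local array, a candidate for removal after unrolling */
    int i;
    int s = 0;

    /* Constant trip count: a candidate for complete unrolling. */
    for (i = 0; i < 4; i++)
        b[i] = a[i] * k;

    /* If both loops are fully unrolled early, the stores to b[] can be
       forwarded to these loads and then removed as dead code. */
    for (i = 0; i < 4; i++)
        s += b[i];

    return s;
}
```

Assuming the sketch is saved as sketch.c (a hypothetical file name), the decisions of both passes can be inspected with gcc -O3 -funroll-loops -fdump-tree-cunrolli-details -fdump-tree-cunroll-details -c sketch.c and then by reading the generated cunrolli and cunroll dump files.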