http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55858
Bug #: 55858
Summary: When scalarizing contiguous whole-arrays, consider folding into a single loop
Classification: Unclassified
Product: gcc
Version: 4.8.0
Status: UNCONFIRMED
Keywords: missed-optimization
Severity: normal
Priority: P3
Component: fortran
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: bur...@gcc.gnu.org

Motivated by http://gcc.gnu.org/ml/fortran/2013-01/msg00015.html

Currently, the scalarizer generates one loop per dimension, i.e. (rank) loops:

    subroutine g(x)
      integer, pointer, intent(in), contiguous :: x(:,:)
      x = 8
    end subroutine g

gives

    S.0 = x->dim[1].lbound;
    while (1)
      {
        if (S.0 > x->dim[1].ubound) goto L.2;
        {
          integer(kind=8) D.1903;
          integer(kind=8) S.1;

          D.1903 = x->dim[1].stride * S.0 + x->offset;
          S.1 = x->dim[0].lbound;
          while (1)
            {
              if (S.1 > x->dim[0].ubound) goto L.1;
              (*D.1895)[S.1 + D.1903] = 8;
              S.1 = S.1 + 1;
            }
          L.1:;
        }
        S.0 = S.0 + 1;
      }

If the memory is known to be contiguous (i.e. the array is simply contiguous), a single loop suffices. Failing that, the front end should at least emit something that allows the middle end (ME) to fold the two loops into a single loop. For example, "stride[1]" could be replaced by the extent of the first dimension ("ubound[0]-lbound[0]+1"), from which the ME can deduce that consecutive columns are adjacent in memory. When "stride" is used, the contiguity information is lost.
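The difference between the two loop shapes can be sketched in plain C. This is only an illustration, not gfortran's actual code: `dim_t`, `desc2_t`, `fill_nested` and `fill_flat` are hypothetical names, and the descriptor is a simplified stand-in that assumes the offset equals minus the sum of lbound*stride over both dimensions. `fill_nested` mirrors the dump above; `fill_flat` is the single loop that becomes possible once stride[1] is known to equal the extent of dimension 0.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a rank-2 array descriptor. */
typedef struct {
    ptrdiff_t lbound, ubound, stride;
} dim_t;

typedef struct {
    int *data;          /* base address of the array data */
    ptrdiff_t offset;   /* assumed: -(sum of lbound*stride over dims) */
    dim_t dim[2];
} desc2_t;

/* One loop per dimension, mirroring the scalarizer output in the dump. */
static void fill_nested(const desc2_t *x, int value)
{
    for (ptrdiff_t s0 = x->dim[1].lbound; s0 <= x->dim[1].ubound; s0++) {
        ptrdiff_t base = x->dim[1].stride * s0 + x->offset;
        for (ptrdiff_t s1 = x->dim[0].lbound; s1 <= x->dim[0].ubound; s1++)
            x->data[s1 + base] = value;
    }
}

/* Single loop, valid only when the array is contiguous:
   stride[0] == 1 and stride[1] == extent of dimension 0. */
static void fill_flat(const desc2_t *x, int value)
{
    ptrdiff_t ext0 = x->dim[0].ubound - x->dim[0].lbound + 1;
    ptrdiff_t ext1 = x->dim[1].ubound - x->dim[1].lbound + 1;
    if (ext0 <= 0 || ext1 <= 0)
        return;                      /* zero-sized array: nothing to do */
    assert(x->dim[0].stride == 1 && x->dim[1].stride == ext0);

    /* Linear index of the first element x(lbound0, lbound1). */
    ptrdiff_t first = x->offset + x->dim[0].lbound
                    + x->dim[1].stride * x->dim[1].lbound;
    for (ptrdiff_t i = 0; i < ext0 * ext1; i++)
        x->data[first + i] = value;
}
```

For a 3x2 array with lower bounds of 1, the assumed descriptor would be `{ data, -4, { {1, 3, 1}, {1, 2, 3} } }`, and both functions write the same six elements. The point of the report is that the nested form hides this equivalence from the optimizer, because stride[1] is an opaque value loaded from the descriptor.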