------- Comment #14 from changpeng dot fang at amd dot com 2010-06-30 00:36 -------
(In reply to comment #7)
> A good chunk of time seems to be spent in the RTL loop unroller, triggered
> by array prefetching (testing with -O3 -funroll-loops). Otherwise it might
> as well be just excessive code growth caused by prefetching.

Yes, for test_fpu.f90, more than half of the time is spent in the RTL loop unroller, and if I manually set unroll_factor to 1 (don't unroll), the time increase caused by array prefetching is negligible.

With -O3 -funroll-loops, I would not expect array prefetching to trigger a code size or compilation time increase in the RTL loop unroller.

-- 
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44576