------- Comment #3 from changpeng dot fang at amd dot com 2010-07-06 18:35 -------

Here is the impact of loop unrolling on compilation time and code size for polyhedron test_fpu.f90:
-O3 -ftree-vectorize -fno-prefetch-loop-arrays -fno-unroll-loops:
    timing: 12.62s, size: 67069 bytes
-O3 -ftree-vectorize -fprefetch-loop-arrays -funroll-loops:
    timing: 51.77s, size: 234045 bytes

I also ran an experiment on prefetching in which we don't unroll the pre- and post-loops generated by the vectorizer:

-O3 -ftree-vectorize -fprefetch-loop-arrays:
    timing: 29.32s, size: 92541 bytes
-O3 -ftree-vectorize -fprefetch-loop-arrays (don't unroll pre- and post-loops):
    timing: 18.34s, size: 78909 bytes
-O3 -ftree-vectorize -fno-prefetch-loop-arrays:
    timing: 12.62s, size: 67069 bytes

-- 
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44794