http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45422
--- Comment #39 from Richard Guenther <rguenth at gcc dot gnu.org> 2011-01-27 16:16:48 UTC ---
The size difference is likely from prefetching: it's 1.5MB with prefetching vs. 1.1MB without (-O3 -fbounds-check -ffast-math -funroll-loops). Prefetching usually creates another set of loop copies, which the RTL unroller then unrolls as well. See PR44688.
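
To illustrate why prefetching tends to grow code, here is a hand-written C sketch of roughly the shape a simple reduction loop takes once a prefetch is issued once per cache line. It is not actual GCC output: the function names, the unroll factor of 8 (assuming 64-byte cache lines and 8-byte doubles) and the prefetch distance are all hypothetical. The point is only that the loop body is duplicated and a remainder loop appears, and -funroll-loops can then unroll those copies further at the RTL level.

```c
#include <stddef.h>
#include <stdint.h>

/* The loop as written in the source: one small body. */
double sum_plain(const double *a, size_t n)
{
    double s = 0.0;
    for (size_t i = 0; i < n; i++)
        s += a[i];
    return s;
}

/* Hypothetical prefetch distance, in elements (an assumption, not a GCC default). */
#define PREFETCH_AHEAD 64

/* Hand-written approximation of the same loop with one prefetch per
   assumed 64-byte cache line: the body is replicated 8 times and a
   remainder loop handles the leftover iterations. */
double sum_prefetched(const double *a, size_t n)
{
    double s = 0.0;
    size_t i = 0;

    for (; i + 8 <= n; i += 8) {
        /* Compute the address as an integer so that no out-of-bounds
           pointer arithmetic is performed; the prefetch itself never faults. */
        uintptr_t ahead = (uintptr_t)a + (i + PREFETCH_AHEAD) * sizeof(double);
        __builtin_prefetch((const void *)ahead, 0, 3);

        s += a[i];
        s += a[i + 1];
        s += a[i + 2];
        s += a[i + 3];
        s += a[i + 4];
        s += a[i + 5];
        s += a[i + 6];
        s += a[i + 7];
    }

    /* Remainder loop: a second copy of the original body. */
    for (; i < n; i++)
        s += a[i];

    return s;
}
```

Both functions add the elements in the same order, so they return the same value; only the amount of generated code differs. To check whether prefetching accounts for the size difference in a given build, one can compare object sizes with and without -fno-prefetch-loop-arrays, keeping the other flags unchanged.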