https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100258
--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
Is it really so much more efficient? The store from reg is 4 bytes while a movl is 6 - for a larger loop keeping its size small would be important.