http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56688
Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2013-03-22
             Blocks|                            |53947
     Ever Confirmed|0                           |1

--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> 2013-03-22 13:29:57 UTC ---
The issue is that x is kept live by applying store-motion:

  <bb 3>:
  # prephitmp_26 = PHI <1(2), i.3_14(4)>
  # ivtmp_30 = PHI <700(2), ivtmp_15(4)>
  _5 = (integer(kind=8)) prephitmp_26;
  _6 = _5 + -1;
  _7 = my_data.x1[_6];
  _8 = my_data.y1[_6];
  x.1_9 = _7 - _8;
  _11 = my_data.t1[_6];
  _12 = x.1_9 * _11;
  my_data.z1[_6] = _12;
  i.3_14 = prephitmp_26 + 1;
  ivtmp_15 = ivtmp_30 - 1;
  if (ivtmp_15 == 0)
    goto <bb 5>;
  else
    goto <bb 4>;

  <bb 4>:
  goto <bb 3>;

  <bb 5>:
  # x_lsm.7_25 = PHI <x.1_9(3)>
  x = x_lsm.7_25;
  i = 701;
  return;

because it appears that 'save' makes all variables global ones.

This kind of "reduction" is not handled by the vectorizer.  It would be
handled by a pass that re-materializes x_lsm.7_25 from memory and
operations after the loop.  Or by handling the "final" value properly,
by one of:

  - vector extraction;
  - simply using the value in the epilogue loop;
  - forcing at least one iteration of the epilogue loop by adjusting
    the number of iterations of the vectorized loop.

I like the last option most ;)
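
To illustrate the last option, here is a rough C sketch of the idea; it is
not GCC's implementation and not the reporter's testcase.  The arrays and
x are static stand-ins for the SAVEd Fortran variables, and VF,
scalar_loop, blocked_loop and versions_agree are hypothetical names made
up for this example.  The inner loop over lanes only models one vector
operation.  The point is that the main loop's trip count is computed from
N - 1, so the scalar epilogue always runs at least once and produces the
final value of x without any vector extraction.

```c
#include <assert.h>
#include <stddef.h>

#define N  700  /* trip count, matching the initial ivtmp value in the dump */
#define VF 4    /* hypothetical vectorization factor */

/* Static stand-ins for the SAVEd Fortran variables (assumed layout). */
static float x1[N], y1[N], t1[N], z1[N];
static float x;

/* Scalar loop after store motion: x is stored once, after the loop. */
static void scalar_loop(void)
{
    float x_lsm = 0.0f;
    for (size_t i = 0; i < N; i++) {
        x_lsm = x1[i] - y1[i];
        z1[i] = x_lsm * t1[i];
    }
    x = x_lsm;
}

/* Model of a vectorized main loop followed by a scalar epilogue. */
static void blocked_loop(void)
{
    assert(N >= 1);

    /* Use N - 1 rather than N so that at least one iteration is
       always left for the epilogue, even when N is a multiple of VF. */
    size_t main_iters = ((N - 1) / VF) * VF;
    size_t i = 0;

    for (; i < main_iters; i += VF) {
        /* One "vector" iteration; the per-lane difference is not live
           after the loop, so no final-value extraction is needed. */
        for (size_t lane = 0; lane < VF; lane++) {
            float d = x1[i + lane] - y1[i + lane];
            z1[i + lane] = d * t1[i + lane];
        }
    }

    /* Scalar epilogue: runs between 1 and VF iterations and yields
       the value of x that the original loop would have stored. */
    float x_lsm = 0.0f;
    for (; i < N; i++) {
        x_lsm = x1[i] - y1[i];
        z1[i] = x_lsm * t1[i];
    }
    x = x_lsm;
}

/* Returns 1 if both versions leave the same x and z1 behind. */
static int versions_agree(void)
{
    static float z_ref[N];
    float x_ref;

    for (size_t i = 0; i < N; i++) {
        x1[i] = (float)i * 0.5f;
        y1[i] = 1.0f;
        t1[i] = 2.0f;
    }

    scalar_loop();
    x_ref = x;
    for (size_t i = 0; i < N; i++) {
        z_ref[i] = z1[i];
        z1[i] = 0.0f;
    }

    blocked_loop();
    if (x != x_ref)
        return 0;
    for (size_t i = 0; i < N; i++)
        if (z1[i] != z_ref[i])
            return 0;
    return 1;
}
```

With N = 700 and VF = 4, the usual (N / VF) * VF would give 700 main-loop
iterations and an empty epilogue; ((N - 1) / VF) * VF gives 696, leaving
four iterations for the epilogue.  The cost is up to one vector's worth
of work done in scalar code.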