https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83518
--- Comment #3 from Richard Biener <rguenth at gcc dot gnu.org> ---
Yes, it's not really something new but a known issue with late
value-numbering. Note that FRE wouldn't know how to simplify this either;
we'd need store-merging to effectively vectorize the earlier sets. BB
vectorization doesn't do this because after unrolling we see

  vect_cst__46 = { 5, 4, 3, 2 };
  MEM[(int *)&arr] = vect_cst__46;
  arr[4] = 1;
  t_2 = arr[0];
  arr[0] = 5;
  arr[0] = t_2;
  t_32 = arr[0];
  _65 = arr[1];
  arr[0] = _65;
  arr[1] = t_32;
  t_68 = arr[0];
  _69 = arr[2];
  arr[0] = _69;
  arr[2] = t_68;
  t_72 = arr[0];
  _73 = arr[3];
  arr[0] = _73;
  arr[3] = t_72;
  t_76 = arr[0];
  _77 = arr[4];
  arr[0] = _77;
  arr[4] = t_76;
  i_80 = 1;
  ivtmp_81 = 4;
  pretmp_82 = arr[0];
  t_87 = arr[i_80];
  arr[i_80] = pretmp_82;
  ...

and BB vectorization is confused by the dead stores (and DSE would in turn
be confused by the missed constant propagations).
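
For readers without the testcase at hand, here is a minimal C sketch of source code that could plausibly unroll into the straight-line part of the dump above. It is reconstructed from the dump only and is not the PR's testcase: the function name `swap_first_with_each`, the macro `ARR_SIZE`, and the loop shapes are assumptions, and the second loop that the dump begins at `i_80 = 1` is left out because it is elided there.

```c
#include <stddef.h>

#define ARR_SIZE 5  /* assumed: the dump touches arr[0] through arr[4] */

/* Hypothetical reconstruction from the GIMPLE dump, not the PR's testcase. */
void swap_first_with_each(int arr[ARR_SIZE])
{
    /* Initial stores; in the dump these appear as one vector store of
       { 5, 4, 3, 2 } followed by the scalar store arr[4] = 1. */
    for (size_t i = 0; i < ARR_SIZE; i++)
        arr[i] = ARR_SIZE - (int)i;

    /* Swap arr[0] with every element in turn. For i == 0 this stores
       arr[0] twice without changing it, which matches the back-to-back
       stores "arr[0] = 5; arr[0] = t_2;" in the dump. */
    for (size_t i = 0; i < ARR_SIZE; i++) {
        int t = arr[0];
        arr[0] = arr[i];
        arr[i] = t;
    }
}
```

Every value here is known at compile time, so in principle the whole sequence could fold down to storing the constants { 1, 5, 4, 3, 2 }. The comment's point is that the intermediate stores and the unpropagated loads stand in the way of that.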