https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99728
--- Comment #13 from Richard Biener <rguenth at gcc dot gnu.org> ---
Created attachment 51100
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=51100&action=edit
hack

The attached tries to rewrite the aggregate assignments into a load/store
sequence producing

  _33 = VIEW_CONVERT_EXPR<vector(32) unsigned char>(d_42(D)->lam2D.32702);
  VIEW_CONVERT_EXPR<vector(32) unsigned char>(d_42(D)->lam1D.32701) = _33;

from originally

  d_42(D)->lam1D.32701 = d_42(D)->lam2D.32702;

that's a bit ugly and still falls short of doing the full store-motion, but at
least it now moves all but the above store:

  ...
  _35 = _36 + val$v_63;
  _30 = VIEW_CONVERT_EXPR<vector(32) unsigned char>(_56);
  VIEW_CONVERT_EXPR<vector(32) unsigned char>(*d_28(D).lam1D.32701) = _30;
  *d_28(D).lam2D.32702.vD.32579 = _35;
  il_33 = il_69 + 1;
  l_34 = l_68 + 2;
  if (lmax_26(D) >= l_34)
    goto <bb 6>; [89.00%]
  else
    goto <bb 7>; [11.00%]

  <bb 6> [local count: 850510901]:
  goto <bb 3>; [100.00%]

  <bb 7> [local count: 105119324]:
  # _84 = PHI <_30(3)>
  # _85 = PHI <_35(3)>
  # d__v_lsm.37_86 = PHI <d__v_lsm.37_74(3)>
  # d__v_lsm.38_87 = PHI <d__v_lsm.38_75(3)>
  # d__v_lsm.39_88 = PHI <d__v_lsm.39_76(3)>
  # d__v_lsm.40_89 = PHI <d__v_lsm.40_77(3)>
  MEM[(struct TvsimpleD.32577 *)d_28(D) + 192B].vD.32579 = d__v_lsm.37_86;
  MEM[(struct TvsimpleD.32577 *)d_28(D) + 224B].vD.32579 = d__v_lsm.38_87;
  MEM[(struct TvsimpleD.32577 *)d_28(D) + 256B].vD.32579 = d__v_lsm.39_88;
  MEM[(struct TvsimpleD.32577 *)d_28(D) + 288B].vD.32579 = d__v_lsm.40_89;
  VIEW_CONVERT_EXPR<vector(32) unsigned char>(*d_28(D).lam1D.32701) = _84;
  *d_28(D).lam2D.32702.vD.32579 = _85;

the dependence analysis of store-motion considers the last stores (ref 14
and 15) dependent:

  Querying dependency of refs 2 and 15: dependent.
  Querying RAW dependencies of ref 2 in loop 1: dependent
  Querying dependency of refs 13 and 14: dependent.
  Querying RAW dependencies of ref 13 in loop 1: dependent
  Querying dependency of refs 14 and 13: dependent.
  Querying SM WAR dependencies of ref 14 in loop 1: dependent
  Querying dependency of refs 15 and 2: dependent.
  Querying SM WAR dependencies of ref 15 in loop 1: dependent

That's the usual issue of LIM needing to identify "identical" refs but
apparently failing to do so for:

  Memory reference 2: MEM[(const struct Tvsimple *)d_28(D) + 128B].v
  Memory reference 15: *d_28(D).lam2.v

which is because we don't factor in the offset contained in the MEM_REF.
I'll look into doing that independently of the "hack" (which I'm not sure is
an appropriate way of avoiding changing LIM to deal with aggregates ...)
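
To illustrate the point about factoring the offset, here is a minimal sketch
in C. It is not GCC's internal representation or the actual LIM code: the
struct mem_ref_desc type and the helper names are hypothetical, and the
example assumes that lam2 sits at byte offset 128 in *d, that .v is at offset
0 within Tvsimple, and that the access is 32 bytes wide. The idea is only that
two spellings of the same location compare equal once the constant offset in
the MEM_REF and the offset from the component path are folded into one number.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified description of a memory reference.
   This is NOT how GCC represents refs internally.  */
struct mem_ref_desc {
  int base;              /* id of the base pointer, e.g. 28 for d_28(D) */
  long mem_offset;       /* constant offset carried by the MEM_REF itself */
  long component_offset; /* byte offset contributed by the component path */
  long size;             /* access size in bytes (assumed) */
};

/* Compare the two offset parts separately.  This misses refs that
   split the same total offset differently between the two parts.  */
static bool
same_ref_unfactored (const struct mem_ref_desc *a,
                     const struct mem_ref_desc *b)
{
  return a->base == b->base
         && a->mem_offset == b->mem_offset
         && a->component_offset == b->component_offset
         && a->size == b->size;
}

/* Fold both offset parts into a single byte offset from the base.  */
static long
total_offset (const struct mem_ref_desc *r)
{
  assert (r->size > 0);
  return r->mem_offset + r->component_offset;
}

/* Compare after factoring in the MEM_REF offset.  */
static bool
same_ref_factored (const struct mem_ref_desc *a,
                   const struct mem_ref_desc *b)
{
  return a->base == b->base
         && total_offset (a) == total_offset (b)
         && a->size == b->size;
}
```

Under the assumptions above, reference 2 would be described as
{ 28, 128, 0, 32 } and reference 15 as { 28, 0, 128, 32 }. The unfactored
comparison treats them as different, while the factored one sees the same
location.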