https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99728

--- Comment #13 from Richard Biener <rguenth at gcc dot gnu.org> ---
Created attachment 51100
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=51100&action=edit
hack

The attached tries to rewrite the aggregate assignments into a load/store
sequence producing

  _33 = VIEW_CONVERT_EXPR<vector(32) unsigned char>(d_42(D)->lam2D.32702);
  VIEW_CONVERT_EXPR<vector(32) unsigned char>(d_42(D)->lam1D.32701) = _33;

from originally

  d_42(D)->lam1D.32701 = d_42(D)->lam2D.32702;
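
At the source level the two forms correspond roughly to the following
minimal C sketch. The struct names and layout are hypothetical stand-ins
for the types in the dump (assuming a 32-byte aggregate), and memcpy plays
the role of VIEW_CONVERT_EXPR here; this is not what the attached patch
literally emits.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for the 32-byte aggregate seen in the dump. */
struct Tv { double v[4]; };
struct D  { struct Tv lam1, lam2; };

/* GCC/Clang vector extension, mirroring "vector(32) unsigned char". */
typedef unsigned char v32qi __attribute__((vector_size(32)));

/* The reinterpretation is only valid if the sizes agree. */
static_assert(sizeof(struct Tv) == sizeof(v32qi), "aggregate must be 32 bytes");

/* Original form: a single aggregate assignment. */
static void copy_aggregate(struct D *d)
{
    d->lam1 = d->lam2;
}

/* Rewritten form: a vector-typed load followed by a vector-typed store. */
static void copy_load_store(struct D *d)
{
    v32qi tmp;
    memcpy(&tmp, &d->lam2, sizeof tmp);  /* _33 = VIEW_CONVERT_EXPR<...>(d->lam2) */
    memcpy(&d->lam1, &tmp, sizeof tmp);  /* VIEW_CONVERT_EXPR<...>(d->lam1) = _33 */
}
```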

That's a bit ugly and still falls short of doing the full store-motion, but
it at least now moves all but the above store:

...
  _35 = _36 + val$v_63;
  _30 = VIEW_CONVERT_EXPR<vector(32) unsigned char>(_56);
  VIEW_CONVERT_EXPR<vector(32) unsigned char>(*d_28(D).lam1D.32701) = _30;
  *d_28(D).lam2D.32702.vD.32579 = _35;
  il_33 = il_69 + 1;
  l_34 = l_68 + 2;
  if (lmax_26(D) >= l_34)
    goto <bb 6>; [89.00%]
  else
    goto <bb 7>; [11.00%]

  <bb 6> [local count: 850510901]:
  goto <bb 3>; [100.00%]

  <bb 7> [local count: 105119324]:
  # _84 = PHI <_30(3)>
  # _85 = PHI <_35(3)>
  # d__v_lsm.37_86 = PHI <d__v_lsm.37_74(3)>
  # d__v_lsm.38_87 = PHI <d__v_lsm.38_75(3)>
  # d__v_lsm.39_88 = PHI <d__v_lsm.39_76(3)>
  # d__v_lsm.40_89 = PHI <d__v_lsm.40_77(3)>
  MEM[(struct TvsimpleD.32577 *)d_28(D) + 192B].vD.32579 = d__v_lsm.37_86;
  MEM[(struct TvsimpleD.32577 *)d_28(D) + 224B].vD.32579 = d__v_lsm.38_87;
  MEM[(struct TvsimpleD.32577 *)d_28(D) + 256B].vD.32579 = d__v_lsm.39_88;
  MEM[(struct TvsimpleD.32577 *)d_28(D) + 288B].vD.32579 = d__v_lsm.40_89;
  VIEW_CONVERT_EXPR<vector(32) unsigned char>(*d_28(D).lam1D.32701) = _84;
  *d_28(D).lam2D.32702.vD.32579 = _85;

the dependence analysis of store-motion considers the last stores (refs 14
and 15) dependent:

Querying dependency of refs 2 and 15: dependent.
Querying RAW dependencies of ref 2 in loop 1: dependent
Querying dependency of refs 13 and 14: dependent.
Querying RAW dependencies of ref 13 in loop 1: dependent
Querying dependency of refs 14 and 13: dependent.
Querying SM WAR dependencies of ref 14 in loop 1: dependent
Querying dependency of refs 15 and 2: dependent.
Querying SM WAR dependencies of ref 15 in loop 1: dependent

That's the usual issue of LIM needing to identify "identical" refs
but apparently failing to do so for:

Memory reference 2: MEM[(const struct Tvsimple *)d_28(D) + 128B].v
Memory reference 15: *d_28(D).lam2.v

which is because we don't factor in the offset contained in the MEM_REF.
I'll look into doing that independently of the "hack" (which I'm not sure is
an appropriate way to avoid changing LIM to deal with aggregates ...)
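
To illustrate what factoring the offset would buy, here is a minimal C
sketch that reduces both spellings of the reference to a (base, byte
offset, size) triple before comparing them. The struct layout, the
`struct ref` type and the helper names are all hypothetical, chosen only
so that lam2 sits at byte offset 128 as in the dump; none of this is
LIM's actual data structure or API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct Tv { double v[4]; };
/* Hypothetical layout: padding chosen so that lam2 lands at offset 128. */
struct D  { char pad[96]; struct Tv lam1; struct Tv lam2; };

static_assert(offsetof(struct D, lam2) == 128, "sketch assumes lam2 at +128");

/* A memory reference reduced to base, constant byte offset and size. */
struct ref { const void *base; size_t offset; size_t size; };

/* MEM[(struct Tv *)base + mem_off].v: offset carried by the MEM_REF itself. */
static struct ref ref_from_mem(const void *base, size_t mem_off)
{
    struct ref r = { base, mem_off + offsetof(struct Tv, v),
                     sizeof(((struct Tv *)0)->v) };
    return r;
}

/* *base.lam2.v: offset comes purely from the component path. */
static struct ref ref_from_path(const void *base)
{
    struct ref r = { base, offsetof(struct D, lam2) + offsetof(struct Tv, v),
                     sizeof(((struct Tv *)0)->v) };
    return r;
}

/* Two refs are "identical" when base, offset and size all agree. */
static bool same_ref(struct ref a, struct ref b)
{
    return a.base == b.base && a.offset == b.offset && a.size == b.size;
}
```

With the offset folded in, `ref_from_mem(d, 128)` and `ref_from_path(d)`
compare equal, which is the property needed for refs 2 and 15 above;
comparing the two expressions only syntactically would not show that.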
