[Bug tree-optimization/88440] size optimization of memcpy-like code

marxin at gcc dot gnu.org Fri, 17 May 2019 04:46:05 -0700

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88440


--- Comment #14 from Martin Liška <marxin at gcc dot gnu.org> ---
(In reply to [email protected] from comment #13)
> On Fri, 17 May 2019, marxin at gcc dot gnu.org wrote:
> 
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88440
> > 
> > --- Comment #12 from Martin Liška <marxin at gcc dot gnu.org> ---
> > > 
> > > Can you share -fopt-report-loop differences?  From the above I would
> > > guess we split a lot of loops, meaning the memcpy/memmove/memset
> > > calls are in the "middle" and we have to split loops (how many
> > > calls are detected here?).  If that's true another way would be
> > > to only allow calls at head or tail position, thus a single
> > > non-builtin partition.
> > 
> > I newly see ~1400 lines:
> > 
> > module_configure.fppized.f90:7993:0: optimized: Loop 10 distributed: split to 0 loops and 1 library calls.
> > module_configure.fppized.f90:7995:0: optimized: Loop 11 distributed: split to 0 loops and 1 library calls.
> > module_configure.fppized.f90:8000:0: optimized: Loop 15 distributed: split to 0 loops and 1 library calls.
> > module_configure.fppized.f90:8381:0: optimized: Loop 77 distributed: split to 0 loops and 1 library calls.
> > module_configure.fppized.f90:8383:0: optimized: Loop 78 distributed: split to 0 loops and 1 library calls.
> > module_configure.fppized.f90:8498:0: optimized: Loop 105 distributed: split to 0 loops and 1 library calls.
> > module_configure.fppized.f90:9742:0: optimized: Loop 169 distributed: split to 0 loops and 1 library calls.
> > module_configure.fppized.f90:9978:0: optimized: Loop 207 distributed: split to 0 loops and 1 library calls.
> > module_configure.fppized.f90:9979:0: optimized: Loop 208 distributed: split to 0 loops and 1 library calls.
> > module_configure.fppized.f90:9980:0: optimized: Loop 209 distributed: split to 0 loops and 1 library calls.
> > module_configure.fppized.f90:9981:0: optimized: Loop 210 distributed: split to 0 loops and 1 library calls.
> > ...
> 
> All with "0 loops"?  That disputes my theory :/

Yep. All of these have the following form:

  <bb 1809> [local count: 118163158]:
  # S.1565_41079 = PHI <1(2028), S.1565_32687(3351)>
  # ivtmp_38850 = PHI <11(2028), ivtmp_38848(3351)>
  _3211 = S.1565_41079 + -1;
  _3212 = fire_ignition_start_y1[_3211];
  MEM[(real(kind=4)[11] *)&model_config_rec + 101040B][_3211] = _3212;
  S.1565_32687 = S.1565_41079 + 1;
  ivtmp_38848 = ivtmp_38850 - 1;
  if (ivtmp_38848 == 0)
    goto <bb 2027>; [9.09%]
  else
    goto <bb 3351>; [90.91%]

  <bb 3351> [local count: 107425740]:
  goto <bb 1809>; [100.00%]

  <bb 2027> [local count: 10737418]:

  <bb 1810> [local count: 118163158]:
  # S.1566_41080 = PHI <1(2027), S.1566_32689(3350)>
  # ivtmp_38856 = PHI <11(2027), ivtmp_38854(3350)>
  _3213 = S.1566_41080 + -1;
  _3214 = fire_ignition_end_x1[_3213];
  MEM[(real(kind=4)[11] *)&model_config_rec + 101084B][_3213] = _3214;
  S.1566_32689 = S.1566_41080 + 1;
  ivtmp_38854 = ivtmp_38856 - 1;
  if (ivtmp_38854 == 0)
    goto <bb 2026>; [9.09%]
  else
    goto <bb 3350>; [90.91%]

  <bb 3350> [local count: 107425740]:
  goto <bb 1810>; [100.00%]

  <bb 2026> [local count: 10737418]:

  <bb 1811> [local count: 118163158]:
  # S.1567_41081 = PHI <1(2026), S.1567_32691(3349)>
  # ivtmp_38860 = PHI <11(2026), ivtmp_38858(3349)>
  _3215 = S.1567_41081 + -1;
  _3216 = fire_ignition_end_y1[_3215];
  MEM[(real(kind=4)[11] *)&model_config_rec + 101128B][_3215] = _3216;
  S.1567_32691 = S.1567_41081 + 1;
  ivtmp_38858 = ivtmp_38860 - 1;
  if (ivtmp_38858 == 0)
    goto <bb 2025>; [9.09%]
  else
    goto <bb 3349>; [90.91%]

  <bb 3349> [local count: 107425740]:
  goto <bb 1811>; [100.00%]

  <bb 2025> [local count: 10737418]:
...


It's a configure module, so it probably contains this many loops because it
copies each of the various config arrays one by one.
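
For readers who don't want to decode the GIMPLE: each of these blocks is a
fixed 11-iteration element copy of a real(kind=4) array (note the destination
offsets step by 44 bytes, i.e. 11 * 4). A rough C sketch of one such loop and
of the single library call that loop distribution turns it into is below. The
original source is Fortran, so this is only an illustration; the function
names and the buffers are hypothetical, and only the trip count and element
type are taken from the dump.

```c
#include <string.h>

#define N 11 /* trip count, from the PHI initial value 11 in the dump */

/* The shape of the loop before distribution: copy one element per iteration. */
void copy_with_loop(float *dst, const float *src)
{
    for (int i = 0; i < N; i++)
        dst[i] = src[i];
}

/* The shape after distribution: "0 loops and 1 library calls".
   Assumes dst and src do not overlap, as memcpy requires. */
void copy_with_call(float *dst, const float *src)
{
    memcpy(dst, src, N * sizeof(float));
}

/* Hypothetical helper: returns 1 if both variants produce identical bytes. */
int copies_match(void)
{
    float src[N], a[N], b[N];

    for (int i = 0; i < N; i++)
        src[i] = (float)i * 0.5f;

    copy_with_loop(a, src);
    copy_with_call(b, src);
    return memcmp(a, b, sizeof a) == 0;
}
```

With ~1400 such loops in one module, each one becomes a separate call like the
one in copy_with_call, which is where the size difference would come from.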
