https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96252

Jan Hubicka <hubicka at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
     Ever confirmed|0                           |1
             Status|UNCONFIRMED                 |NEW
                 CC|                            |hubicka at gcc dot gnu.org
   Last reconfirmed|                            |2021-02-14

--- Comment #5 from Jan Hubicka <hubicka at gcc dot gnu.org> ---
This looks like missed memory copy propagation to me.

We inline the ICF'ed function back, but for some reason we end up with all those
extra moves, so it does not seem to be a problem with a missed tailcall.
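The shape of the testcase can be sketched roughly like this (a hypothetical reconstruction from the dump below, not the exact PR testcase: the 256-byte bound and 4-byte stride in the dump suggest std::array<int, 64>, and two identical comparators that ICF merges into a wrapper):

```cpp
#include <algorithm>
#include <array>

// struct of 256 bytes, matching the "&r + 256B" bound in the dump
struct cmp {
    std::array<int, 64> a;
};

// Two byte-identical comparators taking the structs by value.  ICF
// turns cmp_y into a wrapper calling cmp_x; the inliner later inlines
// the wrapper back, leaving redundant aggregate copies of l and r.
bool cmp_x(cmp l, cmp r) {
    return std::lexicographical_compare(l.a.begin(), l.a.end(),
                                        r.a.begin(), r.a.end());
}

bool cmp_y(cmp l, cmp r) {
    return std::lexicographical_compare(l.a.begin(), l.a.end(),
                                        r.a.begin(), r.a.end());
}
```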

IPA function summary for bool cmp_y(cmp, cmp)/767 inlinable
  global time:     92.095312
  self size:       13
  global size:     14
  min size:       11
  self stack:      0
  global stack:    0
    size:11.000000, time:90.095312
    size:3.000000, time:2.000000,  executed if:(not inlined)
  calls:
    bool cmp_x(cmp, cmp)/804 inlined
      freq:1.00
      Stack frame offset 0, callee self size 0
      __lexicographical_compare_impl.isra/803 inlined
        freq:1.00
        Stack frame offset 0, callee self size 0

The funny thing is that the inliner seems to believe it is going to reduce code size:
Considering bool cmp_x(cmp, cmp)/766 with 10 size
 to be inlined into bool cmp_y(cmp, cmp)/767 in unknown:0
 Estimated badness is -inf, frequency 1.00.
    Badness calculation for bool cmp_y(cmp, cmp)/767 -> bool cmp_x(cmp,
cmp)/766
      size growth -3, time 16.000000 unspec 18.000000  big_speedup
      -inf: Growth -3 <= 0
      Adjusted by hints -inf

The body is:
bool cmp_y (struct cmp l, struct cmp r)
{
  int * __first1;
  int * __first2;
  struct cmp l;
  struct cmp r;
  int _8;
  int _9;
  bool _17;

  <bb 2> [local count: 1073741824]:
  l = l;
  r = r;
  goto <bb 5>; [100.00%]

  <bb 3> [local count: 9416790681]:
  if (_8 > _9)
    goto <bb 6>; [3.66%]
  else
    goto <bb 4>; [96.34%]

  <bb 4> [local count: 9072136140]:
  __first1_11 = __first1_21 + 4;
  __first2_13 = __first2_2 + 4;
  if (&MEM <struct cmp> [(void *)&r + 256B] != __first2_13)
    goto <bb 5>; [95.91%]
  else
    goto <bb 6>; [4.09%]

  <bb 5> [local count: 9774538809]:
  # __first1_21 = PHI <__first1_11(4), &l._M_elems(2)>
  # __first2_2 = PHI <__first2_13(4), &r._M_elems(2)>
  _8 = MEM[(int *)__first1_21];
  _9 = MEM[(int *)__first2_2];
  if (_8 < _9)
    goto <bb 6>; [3.66%]
  else
    goto <bb 3>; [96.34%]

  <bb 6> [local count: 1073741824]:
  # _17 = PHI <0(3), 1(5), 0(4)>
  l ={v} {CLOBBER};
  r ={v} {CLOBBER};
  return _17;

}

Richi,
in any case, should we perhaps avoid creating wrappers for functions with very
large parameters?
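A minimal sketch of what such a wrapper costs (illustrative only; struct big and the names here are made up, not from the PR): each by-value argument of the forwarding call is a 256-byte aggregate copy that the merged body never needed, which is exactly the kind of move the dump above shows surviving after inlining.

```c
#include <string.h>

struct big { int a[64]; };              /* 256 bytes, like struct cmp */

/* The "real" body: compares the two aggregates (memcmp stands in for
   the lexicographical compare; it gives the same answer for the
   simple test values used here). */
static int cmp_x(struct big l, struct big r) {
    return memcmp(l.a, r.a, sizeof l.a) < 0;
}

/* The ICF-style wrapper: forwarding by value copies both 256-byte
   aggregates again, and inlining it back reintroduces those copies
   unless copy propagation cleans them up. */
int cmp_y(struct big l, struct big r) {
    return cmp_x(l, r);
}
```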
