https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96252
Jan Hubicka <hubicka at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
     Ever confirmed|0                           |1
             Status|UNCONFIRMED                 |NEW
                 CC|                            |hubicka at gcc dot gnu.org
   Last reconfirmed|                            |2021-02-14

--- Comment #5 from Jan Hubicka <hubicka at gcc dot gnu.org> ---
This looks like missed memory copy propagation to me. We inline the icfed
function back, but for some reason we end up with all those extra moves, so it
does not seem to be a problem with a missed tailcall.

IPA function summary for bool cmp_y(cmp, cmp)/767 inlinable
  global time:     92.095312
  self size:       13
  global size:     14
  min size:        11
  self stack:      0
  global stack:    0
    size:11.000000, time:90.095312
    size:3.000000, time:2.000000,  executed if:(not inlined)
  calls:
    bool cmp_x(cmp, cmp)/804 inlined
      freq:1.00
      Stack frame offset 0, callee self size 0
      __lexicographical_compare_impl.isra/803 inlined
        freq:1.00
        Stack frame offset 0, callee self size 0

The funny thing is that the inliner seems to believe it is going to reduce code
size:

Considering bool cmp_x(cmp, cmp)/766 with 10 size
 to be inlined into bool cmp_y(cmp, cmp)/767 in unknown:0
 Estimated badness is -inf, frequency 1.00.
    Badness calculation for bool cmp_y(cmp, cmp)/767 -> bool cmp_x(cmp, cmp)/766
      size growth -3, time 16.000000 unspec 18.000000  big_speedup
      -inf: Growth -3 <= 0
      Adjusted by hints -inf

The body is:

bool cmp_y (struct cmp l, struct cmp r)
{
  int * __first1;
  int * __first2;
  struct cmp l;
  struct cmp r;
  int _8;
  int _9;
  bool _17;

  <bb 2> [local count: 1073741824]:
  l = l;
  r = r;
  goto <bb 5>; [100.00%]

  <bb 3> [local count: 9416790681]:
  if (_8 > _9)
    goto <bb 6>; [3.66%]
  else
    goto <bb 4>; [96.34%]

  <bb 4> [local count: 9072136140]:
  __first1_11 = __first1_21 + 4;
  __first2_13 = __first2_2 + 4;
  if (&MEM <struct cmp> [(void *)&r + 256B] != __first2_13)
    goto <bb 5>; [95.91%]
  else
    goto <bb 6>; [4.09%]

  <bb 5> [local count: 9774538809]:
  # __first1_21 = PHI <__first1_11(4), &l._M_elems(2)>
  # __first2_2 = PHI <__first2_13(4), &r._M_elems(2)>
  _8 = MEM[(int *)__first1_21];
  _9 = MEM[(int *)__first2_2];
  if (_8 < _9)
    goto <bb 6>; [3.66%]
  else
    goto <bb 3>; [96.34%]

  <bb 6> [local count: 1073741824]:
  # _17 = PHI <0(3), 1(5), 0(4)>
  l ={v} {CLOBBER};
  r ={v} {CLOBBER};
  return _17;

}

Richi, in any case, we may want to avoid creating wrappers for functions with
very large parameters?
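
For readers without the PR's attachment at hand, here is a minimal sketch of the kind of source that could lead to a dump like the one above. It is not the original test case: the layout of `cmp` (a wrapper around 64 ints, sized to match the `&r + 256B` end pointer), the member name `elems`, and the helper `check_identical` are all assumptions made for illustration. The point is only that two by-value functions with identical bodies are candidates for identical code folding, and that each by-value `cmp` argument is a 256-byte aggregate that has to be copied if one function becomes a wrapper around the other.

```cpp
#include <algorithm>
#include <array>
#include <cassert>

// Hypothetical stand-in for the PR's type: 64 ints = 256 bytes on targets
// with 4-byte int. The real definition in the PR may differ.
struct cmp {
    std::array<int, 64> elems;
};

// Two functions with textually identical bodies. With ICF enabled, GCC may
// fold one into a wrapper (or alias) of the other; both take their
// arguments by value, so a wrapper call has to pass two large aggregates.
bool cmp_x(cmp l, cmp r) {
    return std::lexicographical_compare(l.elems.begin(), l.elems.end(),
                                        r.elems.begin(), r.elems.end());
}

bool cmp_y(cmp l, cmp r) {
    return std::lexicographical_compare(l.elems.begin(), l.elems.end(),
                                        r.elems.begin(), r.elems.end());
}

// Illustrative helper (not from the PR): both functions must agree on any
// input, whatever the optimizer decides to do with them.
void check_identical(const cmp& a, const cmp& b) {
    assert(cmp_x(a, b) == cmp_y(a, b));
}
```

Compiling something like this with `g++ -O2 -fdump-ipa-inline-details` should make it possible to look at the inliner's decisions for a given GCC version, though the exact summary and badness numbers will likely differ from the ones quoted in this comment.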