On Sun, Apr 17, 2011 at 10:35 AM, Jan Hubicka <hubi...@ucw.cz> wrote:
>> AFAICT revision 172430 fixed the original problem in pr45810:
>>
>> gfc -Ofast -fwhole-program fatigue.f90       : 6.301u 0.003s 0:06.30
>> gfc -Ofast -fwhole-program -flto fatigue.f90 : 6.263u 0.003s 0:06.26
>>
>> However if I play with --param max-inline-insns-auto=*, I get
>>
>> gfc -Ofast -fwhole-program --param max-inline-insns-auto=124 -fstack-arrays fatigue.f90 : 4.870u 0.002s 0:04.87
>> gfc -Ofast -fwhole-program --param max-inline-insns-auto=125 -fstack-arrays fatigue.f90 : 2.872u 0.002s 0:02.87
>>
>> and
>>
>> gfc -Ofast -fwhole-program -flto --param max-inline-insns-auto=515 -fstack-arrays fatigue.f90 : 4.965u 0.003s 0:04.97
>> gfc -Ofast -fwhole-program -flto --param max-inline-insns-auto=516 -fstack-arrays fatigue.f90 : 2.732u 0.002s 0:02.73
>>
>> while I get the same threshold (125) with and without -flto at revision 172429.
>> Note that I get the same thresholds without -fstack-arrays; only the run times
>> are larger.
>
> Thanks for noticing.  This was not really expected, but it seems to give some
> insight.  I just tested a new cleanup patch of mine in which I fixed a few minor
> bugs in corner cases.  One of those bugs was introduced by this patch
> (an oversight while converting the code to the new accessor).
>
> In the case of nested inlining, the stack usage got misaccounted, and
> consequently we allowed more inlining than --param large-stack-frame-growth
> would normally allow.  The vortex and wupwise improvements seem to be gone,
> so I think they were due to this issue.
>
> I never really tuned the stack frame growth heuristics, since they did not
> cause any problems in the benchmarks.  On Fortran this is quite different,
> because the large I/O blocks hit it very commonly, so I will look into making
> it more permissive.  We can definitely just bump up the limits, and/or we can
> teach it that if the call dominates the return, there is not really much stack
> usage to save by preventing inlining, since both stack frames will wind up on
> the stack anyway.

I think Micha has a fix for the I/O block issue.

Richard.

> This means adding a new bit recording whether a call edge dominates the exit,
> and using this info.  A simple noreturn IPA discovery can also be based on
> this, and I recently noticed it might be important for Mozilla.  So I will
> give it a try soonish.
>
> I will also look into the estimate_size ICE reported today.
>
> Honza
>
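
For readers following the thread, the stack-frame reasoning in the quoted
message can be illustrated with a small toy model.  This is only a sketch
under stated assumptions, not GCC's actual code: the names Func, may_inline
and call_dominates_exit are hypothetical, the two constants are illustrative
stand-ins for --param large-stack-frame and --param large-stack-frame-growth,
and letting a dominating call bypass the limit entirely is just one possible
reading of the proposal.

```python
from dataclasses import dataclass

# Illustrative stand-ins for the two --param values (assumed, not authoritative).
LARGE_STACK_FRAME = 256          # bytes: frames below this are never "large"
LARGE_STACK_FRAME_GROWTH = 1000  # percent growth allowed for the root frame


@dataclass
class Func:
    name: str
    self_stack: int  # bytes of stack used by the function's own locals


def may_inline(root, already_inlined, callee, call_dominates_exit=False):
    """Toy check: may `callee` be inlined into `root`?

    `already_inlined` lists functions previously inlined on the path from
    `root` to the call site (the nested-inlining case).  Their frames must
    be counted too; leaving them out is the kind of misaccounting that
    permits more inlining than the growth limit intends.
    """
    # Hypothetical rule: if the call runs on every path to the exit, both
    # frames end up on the stack anyway, so refusing to inline saves nothing.
    if call_dominates_exit:
        return True

    grown = root.self_stack + root.self_stack * LARGE_STACK_FRAME_GROWTH // 100
    limit = max(grown, LARGE_STACK_FRAME)
    combined = (root.self_stack
                + sum(f.self_stack for f in already_inlined)
                + callee.self_stack)
    return combined <= limit


root = Func("root", 32)
mid = Func("mid", 200)         # already inlined into root
io_leaf = Func("io_leaf", 300) # e.g. a routine with a large I/O block

print(may_inline(root, [mid], io_leaf))   # nested frames counted
print(may_inline(root, [], io_leaf))      # nested frame forgotten
print(may_inline(root, [mid], io_leaf, call_dominates_exit=True))
```

With these made-up sizes, the correctly accounted nested case is rejected,
the misaccounted one is accepted, and the dominance bit lets the inlining
through regardless of the limit.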
