On Sat, Oct 29, 2011 at 1:06 AM, Matt <m...@use.net> wrote:
> On Sat, 29 Oct 2011, Maxim Kuvyrkov wrote:
>
>>> I like this variant a lot better than the last one - still it lacks any
>>> analysis-based justification for iteration (see my reply to Matt on
>>> what I discussed with Honza).
>>
>> Yes, having a way to tell whether a function has significantly changed
>> would be awesome.  My approach here would be to make inline_parameters
>> output feedback on how much the size/time metrics have changed for a
>> function since the previous run.  If the change is above X%, then queue
>> the function's callers for more optimization.  Similarly, Martin's
>> rebuild_cgraph_edges_and_devirt (when that goes into trunk) could queue
>> new direct callees and the current function for another iteration if new
>> direct edges were resolved.
>
> Figuring out the heuristic will need decent testing on a few projects to
> figure out what the "sweet spot" is (smallest binary for time/passes spent)
> for that given codebase.  With a few data points, a reasonable stab at the
> metrics you mention can be had that would not terminate the iterations
> before the known optimal number of passes.  Without those data points, it
> seems like making sure the metrics allow those "sweet spots" to be attained
> will be difficult.
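(For illustration only: a minimal standalone C++ sketch of the change-driven
re-queueing Maxim describes above.  All names here - FuncMetrics, CallGraph,
optimize_one - are hypothetical; none of this is real GCC code or API.)

  // Sketch: after each early-optimization pass over a function, compare its
  // size/time estimates against the previous iteration; if the change exceeds
  // a threshold, push the function's callers back onto the work queue.
  #include <algorithm>
  #include <cmath>
  #include <queue>
  #include <unordered_map>
  #include <vector>

  struct FuncMetrics { double size = 0.0; double time = 0.0; };

  struct CallGraph {
    // callers[f] lists the functions that call f (hypothetical representation).
    std::unordered_map<int, std::vector<int>> callers;
  };

  // Relative change between two metric snapshots, in percent.
  static double percent_change(const FuncMetrics &prev, const FuncMetrics &cur) {
    double ds = prev.size ? std::fabs(cur.size - prev.size) / prev.size : 0.0;
    double dt = prev.time ? std::fabs(cur.time - prev.time) / prev.time : 0.0;
    return 100.0 * std::max(ds, dt);
  }

  void iterate_early_opts(CallGraph &cg,
                          std::unordered_map<int, FuncMetrics> &metrics,
                          std::queue<int> &work,
                          double threshold_percent,
                          FuncMetrics (*optimize_one)(int)) {
    while (!work.empty()) {
      int fn = work.front();
      work.pop();

      FuncMetrics before = metrics[fn];
      FuncMetrics after = optimize_one(fn);   // run the early pipeline once
      metrics[fn] = after;

      // Callers get another look only if the function changed "significantly";
      // otherwise the iteration terminates on its own.
      if (percent_change(before, after) > threshold_percent)
        for (int caller : cg.callers[fn])
          work.push(caller);
    }
  }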
Well, sure - the same as with inlining heuristics.

>>> Thus, I don't think we want to
>>> merge this in its current form or in this stage1.
>>
>> What is the benefit of pushing this to a later release?  If anything,
>> merging the support for iterative optimizations now will allow us to
>> consider adding the wonderful smartness to it later.  In the meantime,
>> substituting that smartness with a knob is still a great alternative.

The benefit?  The benefit is to not have another magic knob in there that
doesn't make much sense and papers over real conceptual/algorithmic issues.
Brute-force iterating is a hack, not a solution.  (sorry)

> I agree (of course).  Having the knob will be very useful for testing and
> determining the acceptance criteria for the later "smartness".  While
> terminating early would be a nice optimization, the feature is still
> intrinsically useful and deployable without it.  In addition, when using LTO
> on nearly all the projects/modules I tested on, 3+ passes were always
> productive.

If that is true then I'd really like to see testcases.  Because I am sure
you are just papering over (maybe even easy to fix) issues with the
brute-force iterating approach.  We also do not have a switch to run every
pass twice in succession, just because that would be as stupid as this.

> To be fair, when not using LTO, beyond 2-3 passes did not often
> produce improvements unless individual compilation units were enormous.
>
> There was also the question of whether some of the improvements seen with
> multiple passes were indicative of deficiencies in early inlining, CFG, SRA,
> etc.  If the knob is available, I'm happy to continue testing on the same
> projects I've filed recent LTO/graphite bugs against (glib, zlib, openssl,
> scummvm, binutils, etc.) and write a report on what I observe as "suspicious"
> improvements that perhaps should be caught/made in a single pass.
>
> It's worth noting again that while this is a useful feature in and of itself
> (especially when combined with LTO), it's *extremely* useful when coupled
> with the de-virtualization improvements submitted in other threads.  The
> examples submitted for inclusion in the test suite aren't academic -- they
> are reductions of real-world performance issues from a mature (and shipping)
> C++-based networking product.  Any C++ codebase that employs physical
> separation in its design via Factory patterns, Interface Segregation,
> and/or Dependency Inversion will likely see improvements.  To me, these
> enhancements combine to form one of the biggest leaps I've seen in C++ code
> optimization -- code that can be clean, OO, *and* fast.

But iterating the whole early optimization pipeline is not a sensible
approach to attacking these.

Richard.

> Richard: If there's any additional testing or information I can reasonably
> provide to help get this in for this stage1, let me know.
>
> Thanks!
>
>
> --
> tangled strands of DNA explain the way that I behave.
> http://www.clock.org/~matt
>
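(For illustration only: a minimal standalone sketch of the Factory/interface
separation Matt refers to above - the kind of code that indirect-call
promotion and devirtualization can turn into direct, inlinable calls.  The
names Codec, NullCodec, make_codec, transmit are hypothetical; this is not
one of the submitted testcases.)

  #include <cstddef>
  #include <memory>

  // Callers depend only on this interface (Dependency Inversion).
  struct Codec {
    virtual ~Codec() {}
    virtual std::size_t encode(const char *src, std::size_t len, char *dst) = 0;
  };

  struct NullCodec : Codec {
    std::size_t encode(const char *src, std::size_t len, char *dst) override {
      for (std::size_t i = 0; i < len; ++i) dst[i] = src[i];
      return len;
    }
  };

  // Factory: in a separate translation unit in real code, so only LTO (or
  // cross-unit devirtualization) can prove the concrete type to callers.
  std::unique_ptr<Codec> make_codec() {
    return std::unique_ptr<Codec>(new NullCodec());
  }

  std::size_t transmit(const char *src, std::size_t len, char *dst) {
    std::unique_ptr<Codec> c = make_codec();
    // Virtual call: once the compiler proves c points to a NullCodec, it can
    // devirtualize and inline encode(), removing the indirect-call overhead.
    return c->encode(src, len, dst);
  }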