On Tue, Mar 17, 2015 at 7:29 PM, Jeff Law <l...@redhat.com> wrote:
> On 03/17/2015 02:17 AM, Andreas Krebbel wrote:
>>
>> Just to have some numbers I did run a -j1 GCC bootstrap twice with and
>> without the patch on x86_64.
>> Best results for both are:
>>
>> clean:   21459s
>> patched: 21314s
>>
>> There rather appears to be a trend towards reduced compile time perhaps
>> due to the reduced number of
>> INSNs to be processed in the RTL passes between the two ifcvt runs (loop
>> optimization, combine,
>> fwprop, dse,...)?!
>>
>> I also tried to measure the testsuite runs but the results show a big
>> variance. So what I have right
>> now does not qualify as a benchmark.
>
> And reality is it's getting harder and harder to benchmark this kind of
> thing with turbo modes and such.  A single run isn't sufficient unless
> you've locked the box into a particular cpu frequency.

For this particular patch I wonder if you really need to change all three
if-conversion pass instances, or if changing the one before combine
(pass_rtl_ifcvt, thus rest_of_handle_if_conversion) is enough.  That
already runs an unconditional (huh...) cleanup_cfg (0) at the end, which
could be changed so that DCE is performed (CLEANUP_EXPENSIVE, which runs
delete_trivially_dead_insns).  At least that makes the patch smaller and
restricts its impact to one of the three ifcvt passes.

OTOH ifcvt performs a DCE at its start (so as not to be confused by dead
instructions, I guess), so why doesn't combine do that as well (oh, it
does!?)?  And maybe _that_ DCE can be removed, as if_convert () already
performs a DF_LR_RUN_DCE on the first pass.

Richard.

> jeff
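
To illustrate the cleanup_cfg (0) vs. CLEANUP_EXPENSIVE point above, here is
a toy model in C.  It is not GCC code: the toy_* names, the struct layout and
the flag values are all made up for illustration.  The only thing taken from
the discussion is the idea that the cleanup step deletes trivially dead insns
only when the expensive mode is requested.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical mode flags for this toy model.  These are NOT GCC's
   definitions; they only mirror the "cheap vs. expensive" distinction.  */
enum { TOY_CLEANUP_NONE = 0, TOY_CLEANUP_EXPENSIVE = 1 };

/* A toy "insn": the register it sets and whether that result is used.  */
struct toy_insn
{
  int dest_reg;
  int result_used;   /* 0 means the insn is trivially dead.  */
};

/* Compact INSNS in place, dropping every insn whose result is unused.
   Returns the number of insns kept.  */
size_t
toy_delete_trivially_dead (struct toy_insn *insns, size_t n)
{
  size_t kept = 0;
  for (size_t i = 0; i < n; i++)
    if (insns[i].result_used)
      insns[kept++] = insns[i];
  return kept;
}

/* Toy stand-in for a cleanup entry point that takes a mode argument.
   Dead insns are removed only if TOY_CLEANUP_EXPENSIVE is set in MODE.
   Returns the number of insns remaining.  */
size_t
toy_cleanup (struct toy_insn *insns, size_t n, int mode)
{
  assert (insns != NULL || n == 0);
  if (mode & TOY_CLEANUP_EXPENSIVE)
    return toy_delete_trivially_dead (insns, n);
  return n;
}
```

In this sketch, calling toy_cleanup with TOY_CLEANUP_NONE leaves the insn
array untouched, while passing TOY_CLEANUP_EXPENSIVE drops the dead entries.
That is the one-argument change suggested for the end of
rest_of_handle_if_conversion, modelled here under the assumptions stated
above.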