On Tue, Nov 5, 2013 at 4:35 PM, Vladimir Makarov <vmaka...@redhat.com> wrote:
> I'd like to add a new experimental optimization to the trunk. This
> optimization was discussed at the RA BOF of this summer's GNU Cauldron.
>
> It is a register pressure relief through live-range shrinkage. It
> is implemented on the scheduler base and uses the register-pressure insn
> scheduling infrastructure. By rearranging insns we shorten pseudo
> live ranges and increase the chance that they are assigned to a hard
> register.
>
> The code looks pretty simple but there is a lot of work behind
> this patch. I've tried about ten different versions of this code
> (different heuristics for the two currently existing register-pressure
> algorithms).
>
> I think it is *up to target maintainers* to decide whether or not to
> use this optimization for their targets. I'd recommend using it at
> least for x86/x86-64. I think any OOO processor with a small or
> moderate register file which does not use the 1st insn scheduling
> might benefit from this too.
>
> On SPEC2000 for x86/x86-64 (I use a Haswell processor, -O3 with
> general tuning), using the optimization results in smaller code size
> on average (for floating point and integer benchmarks in 32- and
> 64-bit mode). The improvement is more visible for SPECFP2000 (although
> I have the same improvement on x86-64 SPECInt2000, it might be
> attributed mostly to mcf benchmark instability). It is about 0.5% for
> 32-bit and 64-bit mode. It is understandable, as the optimization has
> more opportunities to improve the code on longer BBs. Unlike
> other heuristic optimizations, I don't see any significantly worse
> performance. It gives practically the same or better performance (a
> few benchmarks improved by 1% or more, up to 3%).
>
> The single but significant drawback is additional compilation time
> (4%-6%), as the 1st insn scheduling pass is quite expensive. So I'd
> recommend target maintainers to switch it on only for -Ofast.

Generally I'd not recommend viewing -Ofast as -O4 but as -O3 plus
generally "unsafe" optimizations. So I'd not enable it for -Ofast but
for -O3 - possibly also with -Os if indeed the main motivation is also
code-size improvements (-Os is a similar beast as -O3: spend as much
time as you can on optimizing size).

Btw, thanks for working on this. How does it relate to
-fsched-pressure? Does it treat all register classes the same? On x86
mostly the few fixed registers for some of the integer pipeline
instructions hurt, x86_64 has enough general and FP registers?

Richard.

> If
> somebody finds that the optimization works on processors which use
> the 1st insn scheduling by default (which I slightly doubt), we could
> improve the compilation time by reusing data for this optimization and
> the 1st insn scheduling.
>
> Any comments, questions, thoughts are appreciated.
>
> 2013-11-05  Vladimir Makarov  <vmaka...@redhat.com>
>
>         * tree-pass.h (make_pass_live_range_shrinkage): New external.
>         * timevar.def (TV_LIVE_RANGE_SHRINKAGE): New.
>         * sched-rgn.c (gate_handle_live_range_shrinkage): New.
>         (rest_of_handle_live_range_shrinkage): Ditto.
>         (class pass_live_range_shrinkage): Ditto.
>         (pass_data_live_range_shrinkage): Ditto.
>         (make_pass_live_range_shrinkage): Ditto.
>         * sched-int.h (sched_relief_p): New external.
>         * sched-deps.c (create_insn_reg_set): Make void return value.
>         * passes.def: Add pass_live_range_shrinkage.
>         * ira.c (update_equiv_regs): Don't move if
>         flag_live_range_shrinkage.
>         * haifa-sched.c (sched_relief_p): New.
>         (rank_for_schedule): Add code for pressure relief through live
>         range shrinkage.
>         (schedule_insn): Print more debug info.
>         (sched_init): Setup SCHED_PRESSURE_WEIGHTED for pressure relief
>         through live range shrinkage.
>         * doc/invoke.texi (-flive-range-shrinkage): New.
>         * common.opt (flive-range-shrinkage): New.
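
For readers following the thread, here is a toy sketch of the idea described in the quoted message: rearranging insns so that each pseudo is used soon after it is defined lowers the peak number of simultaneously live pseudos. This is not the code from the patch; the `max_pressure` helper, the tuple representation of insns, and the example sequences are all assumptions made purely for illustration, and the model ignores register classes and hard registers entirely.

```python
def max_pressure(insns):
    """Peak number of pseudos live across any point in a straight-line block.

    insns: list of (dest, [sources]) in program order. Each pseudo is
    assumed to be defined exactly once; sources never defined in the
    block (e.g. memory operands) are ignored.
    """
    defined_at = {}
    last_use = {}
    for i, (dest, srcs) in enumerate(insns):
        for s in srcs:
            if s in defined_at:
                last_use[s] = i          # extend the live range of s
        defined_at[dest] = i
        last_use.setdefault(dest, i)     # unused so far: range is empty
    peak = 0
    for k in range(len(insns)):
        # A pseudo is live after insn k if defined at or before k
        # and still needed by a later insn.
        live = sum(1 for p, d in defined_at.items() if d <= k < last_use[p])
        peak = max(peak, live)
    return peak


# Hypothetical block: three loads first, then all the arithmetic.
original = [
    ("a", ["x"]), ("b", ["y"]), ("c", ["z"]),
    ("d", ["a"]), ("e", ["b"]), ("f", ["c"]),
    ("g", ["d", "e"]), ("h", ["g", "f"]),
]

# Same insns, reordered so each value is consumed right after it is made.
shrunk = [
    ("a", ["x"]), ("d", ["a"]),
    ("b", ["y"]), ("e", ["b"]),
    ("g", ["d", "e"]),
    ("c", ["z"]), ("f", ["c"]),
    ("h", ["g", "f"]),
]

print(max_pressure(original), max_pressure(shrunk))
```

In this toy model the reordered sequence needs one register fewer at its peak, which is the effect the live-range shrinkage heuristics aim for; the real pass works on RTL through the scheduler's register-pressure infrastructure and has to respect dependencies that this sketch does not check.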