> -----Original Message----- > From: Vladimir Makarov <vmaka...@redhat.com> > Sent: Saturday, August 30, 2025 1:38 AM > To: Cui, Lili <lili....@intel.com>; gcc-patches@gcc.gnu.org > Cc: rdsandif...@googlemail.com > Subject: Re: [PATCH] ira: Remove the issue code in improve_allocation. > [PR117838] > > > On 8/22/25 3:25 AM, yes wrote: > > From: "Cui, Lili" <lili....@intel.com> > > > > Hi, > > > > This patch aims to remove issue code in improve_allocation that was causing > expensive allocno overflows. > > > > Bootstrapped and regtested on x86_64-pc-linux-gnu, OK for trunk? > > > Yes, you can commit it into the trunk with minor change (see ChangeLog entry > below). Thank you for the patch. > > And sorry for the delay with the review. When a RA patch changes heuristics > (in this case it also modifies code which was added to improve exchange2 > code), I usually do benchmarking too. I did it on Intel CPU (i3-13600K). I > confirm the improvements are visible. There is moderate code size increases > only on bwaves (0.37%) but on average the code size decreases (the biggest > decrease 0.41% was achieved on exchange2). > > I don't see any performance decrease (only some noise) but I see visible > performance improvements on perlbench and exchange2 (+4%) and on wrf > (+3%) which results in 1% SPECInt2017 improvement and 0.25% on > SPECFP2017. I will be not surprised if the patch can result in some failures > on > some targets on tests which expect a particular code generation > (unfortunately we have many such over constraint tests). But it is ok and if > it > happens it can be addressed separately. The most credible RA test at least > for > me is always SPEC. > > Again thank you for improving RA. >
Committed this patch, thanks a lot for using SPEC to collect the data and evaluate this patch from multiple perspectives. I really appreciate the time this process takes. Lili. > > > > The original intention of this code was to allow more allocnos to > > share the same register, but this led to expensive allocno overflows. > > Extracted a small case (a bit large, see Bugzilla > > PR117838 for details) from 548.exchange2_r to analyze this register > > allocation issue. > > > > Before improve_allocation function: > > > > a537 (cost 1896, reg42) > > a20 (cost 270, reg1) > > a13 (cost 144, spill) > > a551 (cost 70, reg40) > > a5 (cost 43, spill) > > a493 (cost 30, reg42) > > a499 (cost 12, reg40) > > > > ------------------------------ > > Dump info in improve_allocation function: > > > > Base: > > Spilling a493r125 for a5r113 > > Spilling a573r202 for a5r113 > > Spilling a499r248 for a13r106 > > Spilling a551r120 for a13r106 > > Spilling a20r237 for a551r120 > > > > With patch: > > Spilling a499r248 for a13r106 > > Spilling a551r120 for a13r106 > > Spilling a493r125 for a551r120 > > ------------------------------ > > > > After assign_hard_reg (at the end of improve_allocation): > > > > Base: > > a537 (cost 1896, reg1) > > a20 (cost 270, spill) -----> This is unreasonable > > a13 (cost 144, reg40) > > a551 (cost 70, reg1) > > a5 (cost 43, reg42) > > a493 (cost 30, spill) > > a499 (cost 12, reg1) > > > > With patch: > > a537 (cost 1896, reg42) > > a20 (cost 270, reg1) > > a13 (cost 144, reg40) > > a551 (cost 70, reg42) > > a5 (cost 43, spill) > > a493 (cost 30, spill) > > a499 (cost 12, reg42) > > ----------------------------- > > > > Collected spec2017 performance on Znver3/Graviton4/EMR/SRF for O2 and > Ofast. > > No performance regression was observed. > > > > FOR multi-copy O2 > > SRF: 548.exchange2_r increased by 7.5%, 500.perlbench_r increased by > 2.0%. > > EMR: 548.exchange2_r increased by 4.5%, 500.perlbench_r increased by > 1.7%. > > Graviton4: 548.exchange2_r Increased by 2.2%, 511.povray_r increased by > 2.8%. > > Znver3 : 500.perlbench_r increased by 2.0%. > > > > gcc/ChangeLog: > > > > PR rtl-optimization/117838 > > * ira-color.cc (improve_allocation): Remove the issue code. > It is too ambiguous. Please change it to "Remove soft conflict related code". > > --- > > gcc/ira-color.cc | 41 +++++++++++------------------------------ > > 1 file changed, 11 insertions(+), 30 deletions(-) > > > > diff --git a/gcc/ira-color.cc b/gcc/ira-color.cc index > > 4b9296029cc..fa2ea61cadf 100644 > > --- a/gcc/ira-color.cc > > +++ b/gcc/ira-color.cc > > @@ -3304,8 +3304,6 @@ improve_allocation (void) > > assigning hard register to allocno A even without spilling > > conflicting allocnos. */ > > continue; > > - auto_bitmap allocnos_to_spill; > > - HARD_REG_SET soft_conflict_regs = {}; > > mode = ALLOCNO_MODE (a); > > nwords = ALLOCNO_NUM_OBJECTS (a); > > /* Process each allocno conflicting with A and update the cost > > @@ -3331,40 +3329,24 @@ improve_allocation (void) > > ALLOCNO_COLOR_DATA (conflict_a)->temp = check; > > if ((conflict_hregno = ALLOCNO_HARD_REGNO (conflict_a)) < 0) > > continue; > > - auto spill_a = ira_soft_conflict (a, conflict_a); > > - if (spill_a) > > - { > > - if (!bitmap_set_bit (allocnos_to_spill, > > - ALLOCNO_NUM (spill_a))) > > - continue; > > - ira_loop_border_costs border_costs (spill_a); > > - spill_cost = border_costs.spill_inside_loop_cost (); > > - } > > + spill_cost = ALLOCNO_UPDATED_MEMORY_COST (conflict_a); > > + k = (ira_class_hard_reg_index > > + [ALLOCNO_CLASS (conflict_a)][conflict_hregno]); > > + ira_assert (k >= 0); > > + if ((allocno_costs = ALLOCNO_HARD_REG_COSTS (conflict_a)) > > + != NULL) > > + spill_cost -= allocno_costs[k]; > > else > > - { > > - spill_cost = ALLOCNO_UPDATED_MEMORY_COST > (conflict_a); > > - k = (ira_class_hard_reg_index > > - [ALLOCNO_CLASS (conflict_a)][conflict_hregno]); > > - ira_assert (k >= 0); > > - if ((allocno_costs = ALLOCNO_HARD_REG_COSTS > (conflict_a)) > > - != NULL) > > - spill_cost -= allocno_costs[k]; > > - else > > - spill_cost -= ALLOCNO_UPDATED_CLASS_COST (conflict_a); > > - spill_cost > > - += allocno_copy_cost_saving (conflict_a, conflict_hregno); > > - } > > + spill_cost -= ALLOCNO_UPDATED_CLASS_COST (conflict_a); > > + spill_cost > > + += allocno_copy_cost_saving (conflict_a, conflict_hregno); > > conflict_nregs = hard_regno_nregs (conflict_hregno, > > ALLOCNO_MODE > (conflict_a)); > > auto note_conflict = [&](int r) > > { > > if (check_hard_reg_p (a, r, > > conflicting_regs, > profitable_hard_regs)) > > - { > > - if (spill_a) > > - SET_HARD_REG_BIT (soft_conflict_regs, r); > > - costs[r] += spill_cost; > > - } > > + costs[r] += spill_cost; > > }; > > for (r = conflict_hregno; > > r >= 0 && (int) end_hard_regno (mode, r) > conflict_hregno; > @@ > > -3396,7 +3378,6 @@ improve_allocation (void) > > by spilling some conflicting allocnos does not improve the > > allocation cost. */ > > continue; > > - spill_soft_conflicts (a, allocnos_to_spill, soft_conflict_regs, > > best); > > nregs = hard_regno_nregs (best, mode); > > /* Now spill conflicting allocnos which contain a hard register > > of A when we assign the best chosen hard register to it. */