On Sun, Feb 2, 2025 at 4:20 PM Richard Biener
<richard.guent...@gmail.com> wrote:
>
>
>
> > Am 02.02.2025 um 08:59 schrieb H.J. Lu <hjl.to...@gmail.com>:
> >
> > On Sun, Feb 2, 2025 at 3:33 PM Richard Biener
> > <richard.guent...@gmail.com> wrote:
> >>
> >>
> >>
> >>>> Am 02.02.2025 um 08:00 schrieb H.J. Lu <hjl.to...@gmail.com>:
> >>>
> >>> Don't increase callee-saved register cost by 1000x, which leads to that
> >>> callee-saved registers aren't used to preserve local variable values
> >>> across calls, by capping the scale to 300.
> >>
> >>>   PR rtl-optimization/111673
> >>>   PR rtl-optimization/115932
> >>>   PR rtl-optimization/116028
> >>>   PR rtl-optimization/117081
> >>>   PR rtl-optimization/118497
> >>>   * ira-color.cc (assign_hard_reg): Cap callee-saved register cost
> >>>   scale to 300.
> >>>
> >>> Signed-off-by: H.J. Lu <hjl.to...@gmail.com>
> >>> ---
> >>> gcc/ira-color.cc | 16 ++++++++++++++--
> >>> 1 file changed, 14 insertions(+), 2 deletions(-)
> >>>
> >>> diff --git a/gcc/ira-color.cc b/gcc/ira-color.cc
> >>> index 0699b349a1a..707ff188250 100644
> >>> --- a/gcc/ira-color.cc
> >>> +++ b/gcc/ira-color.cc
> >>> @@ -2175,13 +2175,25 @@ assign_hard_reg (ira_allocno_t a, bool retry_p)
> >>>     /* We need to save/restore the hard register in
> >>>        epilogue/prologue.  Therefore we increase the cost.  */
> >>>     {
> >>> +        int scale;
> >>> +        if (optimize_size)
> >>> +          scale = 1;
> >>> +        else
> >>> +          {
> >>> +        scale = REG_FREQ_FROM_BB (ENTRY_BLOCK_PTR_FOR_FN (cfun));
> >>> +        /* Don't increase callee-saved register cost by 1000x,
> >>> +           which leads to that callee-saved registers aren't
> >>> +           used to preserve local variable values across calls,
> >>> +           by capping the scale to 300.  */
> >>> +        if (REG_FREQ_MAX == 1000 && scale == REG_FREQ_MAX)
> >>> +          scale = 300;
> >>
> >> That leads to 300 for 1000 but 999 for 999 which is odd.  I’d have 
> >> expected to scale this down to [0, 300] or is MAX a magic value?
> >
> > There are
> >
> > * The weights for each insn varies from 0 to REG_FREQ_BASE.
> >   This constant does not need to be high, as in infrequently executed
> >   regions we want to count instructions equivalently to optimize for
> >   size instead of speed.  */
> > #define REG_FREQ_MAX 1000
> >
> > /* Compute register frequency from the BB frequency.  When optimizing for 
> > size,
> >   or profile driven feedback is available and the function is never 
> > executed,
> >   frequency is always equivalent.  Otherwise rescale the basic block
> >   frequency.  */
> > #define REG_FREQ_FROM_BB(bb) ((optimize_function_for_size_p (cfun)          
> >   \
> >                               || !cfun->cfg->count_max.initialized_p ())    
> >  \
> >                              ? REG_FREQ_MAX                                 
> >  \
> >                              : ((bb)->count.to_frequency (cfun)             
> >  \
> >                                * REG_FREQ_MAX / BB_FREQ_MAX)                
> >  \
> >                              ? ((bb)->count.to_frequency (cfun)             
> >  \
> >                                 * REG_FREQ_MAX / BB_FREQ_MAX)               
> >  \
> >                              : 1)
> >
> > 1000 is the default.  If it isn't 1000, it isn't the default.  I only want
> > to get a more reasonable default scale, instead of 1000.   Lower
> > scale will fail the PR rtl-optimization/111673 test on powerpc64.
>
> I see.  Why not adjust the above macro then?  That would be a bit more 
> obvious.  Like use MAX/2 or so?

commit 3b9b8d6cfdf59337f4b7ce10ce92a98044b2657b
Author: Surya Kumari Jangala <jskum...@linux.ibm.com>
Date:   Tue Jun 25 08:37:49 2024 -0500

    ira: Scale save/restore costs of callee save registers with block frequency

uses REG_FREQ_FROM_BB as the cost scale.  I don't know if it is a misuse.
I don't want to change REG_FREQ_FROM_BB since it is used in other places,
not as a cost scale.  Maybe the above commit should be reverted and we add
a target hook for callee-saved register cost scale.  Each target can choose
a proper cost scale, install of increasing the cost by 1000x for everyone.

> >
> >
> >>> +          }
> >>>       rclass = REGNO_REG_CLASS (hard_regno);
> >>>       add_cost = ((ira_memory_move_cost[mode][rclass][0]
> >>>                + ira_memory_move_cost[mode][rclass][1])
> >>>               * saved_nregs / hard_regno_nregs (hard_regno,
> >>>                             mode) - 1)
> >>> -               * (optimize_size ? 1 :
> >>> -              REG_FREQ_FROM_BB (ENTRY_BLOCK_PTR_FOR_FN (cfun)));
> >>> +               * scale;
> >>>       cost += add_cost;
> >>>       full_cost += add_cost;
> >>>     }
> >>> --
> >>> 2.48.1
> >>>
> >
> >
> >
> > --
> > H.J.



-- 
H.J.

Reply via email to