On Fri, Feb 7, 2025 at 9:20 AM Richard Sandiford
<[email protected]> wrote:
>
> Richard Sandiford <[email protected]> writes:
> > Really nice analysis! Thanks for writing this up.
> >
> > Sorry for the big quote below, but:
> >
> > Jan Hubicka <[email protected]> writes:
> >> [...]
> >> PR117081 is about regression in povray. The reducted testcase:
> >>
> >> void foo (void);
> >> void bar (void);
> >>
> >> int
> >> test (int a)
> >> {
> >> int r;
> >>
> >> if (r = -a)
> >> foo ();
> >> else
> >> bar ();
> >>
> >> return r;
> >> }
> >>
> >> shows that we now use caller saved register (EAX) to hold the return value
> >> which yields longer code. The costs are
> >> Popping a0(r98,l0) -- (0=13000,13000) (3=15000,15000) (6=15000,15000)
> >> (40=15000,15000) (41=15000,15000) (42=15000,15000) (43=15000,15000)
> >>
> >> here 15000 is 11000+4000 where I think 4000 is cost of 2 reg-reg moves
> >> multiplied by REG_FREQ_MAX. This seems correct. GCC 13 uses callee
> >> saved register and produces:
> >>
> >> 0000000000000000 <test>:
> >> 0: 53 push %rbx <--- callee
> >> save
> >> 1: 89 fb mov %edi,%ebx <--- move 1
> >> 3: f7 db neg %ebx
> >> 5: 74 09 je 10 <test+0x10>
> >> 7: e8 00 00 00 00 call c <test+0xc>
> >> c: 89 d8 mov %ebx,%eax <--- callee
> >> restore
> >> e: 5b pop %rbx
> >> f: c3 ret
> >> 10: e8 00 00 00 00 call 15 <test+0x15>
> >> 15: 89 d8 mov %ebx,%eax <--- move 2
> >> 17: 5b pop %rbx <--- callee
> >> restore
> >> 18: c3 ret
> >>
> >> Mainline used EAX since it has costs 13000. It is not 100% clear to me
> >> why.
> >> - 12000 is the spilling (which is emitted twice but executed just once)
> >> - I would have expected 2000 for the move from edi to eax.
> >> However even if cost is 14000 we will choose EAX. The code is:
> >>
> >> 0: 89 f8 mov %edi,%eax <--- move1
> >> 2: 48 83 ec 18 sub $0x18,%rsp <--- stack
> >> frame creation
> >> 6: f7 d8 neg %eax
> >> 8: 89 44 24 0c mov %eax,0xc(%rsp) <--- spill out
> >> c: 85 ff test %edi,%edi
> >> e: 74 10 je 20 <test+0x20>
> >> 10: e8 00 00 00 00 call 15 <test+0x15>
> >> 15: 8b 44 24 0c mov 0xc(%rsp),%eax <--- spill in
> >> 19: 48 83 c4 18 add $0x18,%rsp <--- stack
> >> frame
> >> 1d: c3 ret
> >> 1e: 66 90 xchg %ax,%ax
> >> 20: e8 00 00 00 00 call 25 <test+0x25>
> >> 25: 8b 44 24 0c mov 0xc(%rsp),%eax <--- spill in
> >> 29: 48 83 c4 18 add $0x18,%rsp <--- stack
> >> frame
> >> 2d: c3 ret
> >>
> >> This sequence really saves one move at the expense of of stack frame
> >> allocation (which is not modelled by the cost model) and longer spill
> >> code (also no modelled).
> >> [...]
> >> PR117082 is about noreturn function:
> >> __attribute__ ((noreturn))
> >> void
> >> f3 (void)
> >> {
> >> int y0 = x0;
> >> int y1 = x1;
> >> f1 ();
> >> f2 (y0, y1);
> >> while (1);
> >> }
> >>
> >> Here the cost model is really wrong by assuming that entry and exit
> >> block have same frequencies. This can be fixed quite easilly (though it
> >> is a rare case)
> >
> > Yeah (and nice example).
> >
> >> PR118497 seems to be ixed.
> >>
> >> So overall I think
> >> 1) we can fix scaling of epilogue by exit block frequency
> >> to get noreturns right.
> >> 2) we should drop the check for optimize_size. Since with -Os
> >> REG_FREQ_FROM_BB always returns 1000 everything should be scaled
> >> same way
> >> 3) we currently have wire in "-1" to biass the cost metric for callee
> >> saved registers.
> >> It may make sense to allow targets to control this, since i.e. x86
> >> has push/pop that is shorter. -3 would solve the testcase with neg
> >> and would express that push/pop is still cheaper with extra reg-reg
> >> move.
> >> 4) cost model misses shring wrapping, the fact that if register is
> >> callee saved it may be used by multiple allocnos and also that
> >> push/pop sequence may avoid need for manual RSP adjustments.
> >>
> >> Those seems bit harder things to fit in though.
> >>
> >> So if we want to go with the target hook, I think it should adjust the
> >> cost before scalling (since targets may have special tricks for
> >> prologues) rather than the scale factor (which is target independent
> >> part of cost model).
> >
> > Like you say, one of the missing pieces appears to be the allocation/
> > dealloaction overhead for callee saves. Could we try to add a hook to
> > model that cost, based on certain parameters?
> >
> > In particular, one thing that the examples above have in common is that
> > they don't need to allocate a frame for local variables. That seems
> > like it ought to be part of the mix. If we need to allocate a frame
> > using addition anyway, then presumably one of the advantages of push/pop
> > over callee saves goes away.
> >
> > But going back to my question above, modelling the allocation and
> > deallocation would need to be done in a way that makes them more
> > expensive than moves. I think in some cases we've assumed that
> > addition and moves have equal cost, even when optimising for speed.
> >
> > In other words, rather than add a hook to tweak the -1 bias (which I
> > still agree is a more reasonable thing to do than bypassing the IRA
> > code altogether), could we add a separate hook for the allocation/
> > deallocation and leave IRA to work out when to apply it?
> >
> > I suppose for -fno-omit-frame-pointer we should also take the creation
> > of the frame link into account, if we don't already. This matters on
> > aarch64 since -fomit-frame-pointer is not the default at -O2.
> >
> > One quirk on aarch64 is that, if we're using an odd number of
> > callee-saved GPRs, adding one more is essentially free, since we would
> > allocate space for it anyway, and could save and restore it alongside
> > the odd one out. That would probably be difficult to apply in practice
> > though. And it's obviously a separate issue from the current one.
> > Just mentioning it as another thing we could model in future.
>
> FWIW, here's a very rough initial version of the kind of thing
> I was thinking about. Hopefully the hook documentation describes
> the approach. It's deliberately (overly?) flexible.
>
> I've included an aarch64 version that (a) models the fact that the
> first caller-save can also allocate the frame more-or-less for free,
> and (b) once we've saved an odd number of GPRs, saving one more is
> essentialy free. I also hacked up an x86 version locally to model
> the allocation benefits of using caller-saved registers. It seemed
> to fix the povray example above.
Note the pair allocation for aarch64 was filed PR 117477 so it would
be useful to check if it fixes that and include the bug # in the
commit if it does.
Thanks,
Andrew
>
> This still needs a lot of clean-up and testing, but I thought I might
> as well send what I have before leaving for the weekend. Does it look
> reasonable in principle?
>
> Thanks,
> Richard
>
> diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
> index c1e40200806..3f9453a74fb 100644
> --- a/gcc/config/aarch64/aarch64.cc
> +++ b/gcc/config/aarch64/aarch64.cc
> @@ -15870,6 +15870,60 @@ aarch64_memory_move_cost (machine_mode mode,
> reg_class_t rclass_i, bool in)
> : base + aarch64_tune_params.memmov_cost.store_int);
> }
>
> +/* CALLEE_SAVED_REGS is the set of callee-saved registers that the
> + RA has already decided to use. Return the total number of GPRs
> + that need to be saved and restored, including the frame link registers.
> */
> +static int
> +aarch64_count_gpr_saves (const HARD_REG_SET &callee_saved_regs)
> +{
> + auto saved_gprs = callee_saved_regs & reg_class_contents[GENERAL_REGS];
> + /* FIXME: popcount hack. */
> + auto nregs = popcount_hwi (saved_gprs.elts[0]);
> +
> + if (aarch64_needs_frame_chain ())
> + nregs += 2;
> + else if (!crtl->is_leaf || df_regs_ever_live_p (LR_REGNUM))
> + nregs += 1;
> + return nregs;
> +}
> +
> +/* Implement TARGET_CALLEE_SAVE_COST. */
> +static int
> +aarch64_callee_save_cost (spill_cost_type, unsigned int regno,
> + machine_mode, unsigned int nregs, int mem_cost,
> + const HARD_REG_SET &callee_saved_regs,
> + bool existing_spill_p)
> +{
> + auto ngprs = aarch64_count_gpr_saves (callee_saved_regs);
> +
> + /* FIXME: handle FP and Advanced SIMD registers. */
> +
> + /* If we've already committed to saving an odd number of GPRs, assume that
> + saving one more will involve turning an STR into an STP. */
> + if (GP_REGNUM_P (regno) && nregs == 1 && (ngprs & 1))
> + return 0;
> +
> + /* If this would be the first register that we save, add the cost of
> + allocating or deallocating the frame. For GPR saves, the allocation
> + and deallocation can be folded into the save and restore. */
> + if (!existing_spill_p
> + && ngprs == 0
> + && !GP_REGNUM_P (regno))
> + mem_cost += 2;
> +
> + return mem_cost;
> +}
> +
> +/* Implement TARGET_FRAME_ALLOCATION_COST. */
> +static int
> +aarch64_frame_allocation_cost (frame_cost_type,
> + const HARD_REG_SET &callee_saved_regs)
> +{
> + if (aarch64_count_gpr_saves (callee_saved_regs) == 0)
> + return 2;
> + return 0;
> +}
> +
> /* Implement TARGET_INSN_COST. We have the opportunity to do something
> much more productive here, such as using insn attributes to cost things.
> But we don't, not yet.
> @@ -31572,6 +31626,12 @@ aarch64_libgcc_floating_mode_supported_p
> #undef TARGET_MEMORY_MOVE_COST
> #define TARGET_MEMORY_MOVE_COST aarch64_memory_move_cost
>
> +#undef TARGET_CALLEE_SAVE_COST
> +#define TARGET_CALLEE_SAVE_COST aarch64_callee_save_cost
> +
> +#undef TARGET_FRAME_ALLOCATION_COST
> +#define TARGET_FRAME_ALLOCATION_COST aarch64_frame_allocation_cost
> +
> #undef TARGET_MIN_DIVISIONS_FOR_RECIP_MUL
> #define TARGET_MIN_DIVISIONS_FOR_RECIP_MUL
> aarch64_min_divisions_for_recip_mul
>
> diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
> index 0de24eda6f0..171bb8c0b77 100644
> --- a/gcc/doc/tm.texi
> +++ b/gcc/doc/tm.texi
> @@ -7003,6 +7003,74 @@ value to the result of that function. The arguments
> to that function
> are the same as to this target hook.
> @end deftypefn
>
> +@deftypefn {Target Hook} int TARGET_CALLEE_SAVE_COST (spill_cost_type
> @var{cost_type}, unsigned int @var{hard_regno}, machine_mode @var{mode},
> unsigned int @var{nregs}, int @var{mem_cost}, const HARD_REG_SET
> @var{&allocated_callee_regs}, bool @var{existing_spills_p})
> +Return the one-off cost of saving or restoring callee-saved registers
> +(also known as call-preserved register or non-volatile registers).
> +The parameters are as follows:
> +
> +@itemize
> +@item
> +@var{cost_type} is @samp{spill_cost_type::SAVE} for saving a register
> +and @samp{spill_cost_type::RESTORE} for restoring a register.
> +
> +@item
> +@var{hard_regno} and @var{mode} represent the whole register that
> +the register allocator is considering using; of these,
> +@var{nregs} registers are fully or partially callee-saved.
> +
> +@item
> +@var{mem_cost} is the normal cost for storing (for saves)
> +or loading (for restores) the @var{nregs} registers.
> +
> +@item
> +@var{allocated_callee_regs} is the set of callee-saved registers
> +that are already in use.
> +
> +@item
> +@var{existing_spills_p} is true if the register allocator has
> +already decided to spill registers to memory.
> +@end itemize
> +
> +If @var{existing_spills_p} is false, the cost of a save should account
> +for frame allocations in a way that is consistent with
> +@code{TARGET_FRAME_ALLOCATION_COST}'s handling of allocations for
> +caller-saved registers. Similarly, the cost of a restore should then
> +account for frame deallocations in a way that is consistent with
> +@code{TARGET_FRAME_ALLOCATION_COST}'s handling of deallocations.
> +
> +Note that this hook should not attempt to apply a frequency scale
> +to the cost: it is the caller's responsibility to do that where
> +appropriate.
> +
> +The default implementation returns @var{mem_cost}, plus the allocation
> +or deallocation cost returned by @code{TARGET_FRAME_ALLOCATION_COST},
> +where appropriate.
> +@end deftypefn
> +
> +@deftypefn {Target Hook} int TARGET_FRAME_ALLOCATION_COST (frame_cost_type
> @var{cost_type}, const HARD_REG_SET @var{&allocated_callee_regs})
> +Return the cost of allocating or deallocating a frame for the sake of
> +using caller-saved registers; @var{cost_type} chooses between allocation
> +and deallocation.
> +
> +This hook is only called if the register allocator has not so far
> +decided to spill any pseudo registers to memory or to use caller-saved
> +registers for values that are live across a call. It may have
> +decided to use callee-saved registers; if so, @var{allocated_callee_regs}
> +is the set of callee-saved registers it has used. There might also be
> +other reasons why a stack frame is already needed; for example,
> +@samp{get_frame_size ()} may be nonzero, or the target might already
> +require a frame for target-specific reasons.
> +
> +When the register allocator uses this hook to cost caller-saved registers,
> +it also uses @code{TARGET_CALLEE_SAVE_COST} to cost new callee-saved
> +registers, passing @samp{false} as the @var{existing_spills_p} argument.
> +The intention is to allow the target to apply an apples-for-apples
> +comparison between callee-saved and caller-saved registers when the
> +allocator has not yet committed to using registers of both types.
> +
> +The default implementation returns 0.
> +@end deftypefn
> +
> @defmac BRANCH_COST (@var{speed_p}, @var{predictable_p})
> A C expression for the cost of a branch instruction. A value of 1 is
> the default; other values are interpreted relative to that. Parameter
> diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
> index 631d04131e3..eccc4d88493 100644
> --- a/gcc/doc/tm.texi.in
> +++ b/gcc/doc/tm.texi.in
> @@ -4582,6 +4582,10 @@ These macros are obsolete, new ports should use the
> target hook
>
> @hook TARGET_MEMORY_MOVE_COST
>
> +@hook TARGET_CALLEE_SAVE_COST
> +
> +@hook TARGET_FRAME_ALLOCATION_COST
> +
> @defmac BRANCH_COST (@var{speed_p}, @var{predictable_p})
> A C expression for the cost of a branch instruction. A value of 1 is
> the default; other values are interpreted relative to that. Parameter
> diff --git a/gcc/ira-color.cc b/gcc/ira-color.cc
> index 0699b349a1a..9aaac794f35 100644
> --- a/gcc/ira-color.cc
> +++ b/gcc/ira-color.cc
> @@ -1199,6 +1199,9 @@ finish_update_cost_records (void)
> register was already allocated for an allocno. */
> static bool allocated_hardreg_p[FIRST_PSEUDO_REGISTER];
>
> +/* Which callee-saved hard registers we've decided to save. */
> +static HARD_REG_SET allocated_callee_save_regs;
> +
> /* Describes one element in a queue of allocnos whose costs need to be
> updated. Each allocno in the queue is known to have an allocno
> class. */
> @@ -1740,6 +1743,20 @@ check_hard_reg_p (ira_allocno_t a, int hard_regno,
> return j == nregs;
> }
>
> +/* Record that we have allocated NREGS registers starting at HARD_REGNO. */
> +
> +static void
> +record_allocation (int hard_regno, int nregs)
> +{
> + for (int i = 0; i < nregs; ++i)
> + if (!allocated_hardreg_p[hard_regno + i])
> + {
> + allocated_hardreg_p[hard_regno + i] = true;
> + if (!crtl->abi->clobbers_full_reg_p (hard_regno + i))
> + SET_HARD_REG_BIT (allocated_callee_save_regs, hard_regno + i);
> + }
> +}
> +
> /* Return number of registers needed to be saved and restored at
> function prologue/epilogue if we allocate HARD_REGNO to hold value
> of MODE. */
> @@ -1961,6 +1978,14 @@ assign_hard_reg (ira_allocno_t a, bool retry_p)
> #endif
> auto_bitmap allocnos_to_spill;
> HARD_REG_SET soft_conflict_regs = {};
> + int entry_freq = REG_FREQ_FROM_BB (ENTRY_BLOCK_PTR_FOR_FN (cfun));
> + int exit_freq = REG_FREQ_FROM_BB (EXIT_BLOCK_PTR_FOR_FN (cfun));
> + /* Whether we have spilled pseudos or used caller-saved registers
> + for values that are live across a call.
> +
> + FIXME: also check for pseudos that prefer NO_REGS and not equivalences.
> + Also account for regular spills. */
> + bool existing_spills_p = caller_save_needed;
>
> ira_assert (! ALLOCNO_ASSIGNED_P (a));
> get_conflict_and_start_profitable_regs (a, retry_p,
> @@ -2175,17 +2200,45 @@ assign_hard_reg (ira_allocno_t a, bool retry_p)
> /* We need to save/restore the hard register in
> epilogue/prologue. Therefore we increase the cost. */
> {
> + int nregs = hard_regno_nregs (hard_regno, mode);
> + add_cost = 0;
> rclass = REGNO_REG_CLASS (hard_regno);
> - add_cost = ((ira_memory_move_cost[mode][rclass][0]
> - + ira_memory_move_cost[mode][rclass][1])
> - * saved_nregs / hard_regno_nregs (hard_regno,
> - mode) - 1)
> - * (optimize_size ? 1 :
> - REG_FREQ_FROM_BB (ENTRY_BLOCK_PTR_FOR_FN (cfun)));
> +
> + auto entry_cost = targetm.callee_save_cost
> + (spill_cost_type::SAVE, hard_regno, mode, saved_nregs,
> + ira_memory_move_cost[mode][rclass][0] * saved_nregs / nregs,
> + allocated_callee_save_regs, existing_spills_p);
> + add_cost += entry_cost * entry_freq;
> +
> + auto exit_cost = targetm.callee_save_cost
> + (spill_cost_type::RESTORE, hard_regno, mode, saved_nregs,
> + ira_memory_move_cost[mode][rclass][1] * saved_nregs / nregs,
> + allocated_callee_save_regs, existing_spills_p);
> + add_cost += exit_cost * exit_freq;
> +
> + /* In the event of a tie between caller-save and callee-save,
> + prefer callee-save. */
> + add_cost -= 1;
> +
> cost += add_cost;
> full_cost += add_cost;
> }
> }
> + if (!existing_spills_p && ira_need_caller_save_p (a, hard_regno))
> + {
> + add_cost = 0;
> +
> + auto entry_cost = targetm.frame_allocation_cost
> + (frame_cost_type::ALLOCATION, allocated_callee_save_regs);
> + add_cost += entry_cost * entry_freq;
> +
> + auto exit_cost = targetm.frame_allocation_cost
> + (frame_cost_type::DEALLOCATION, allocated_callee_save_regs);
> + add_cost += exit_cost * exit_freq;
> +
> + cost += add_cost;
> + full_cost += add_cost;
> + }
> if (min_cost > cost)
> min_cost = cost;
> if (min_full_cost > full_cost)
> @@ -2212,8 +2265,8 @@ assign_hard_reg (ira_allocno_t a, bool retry_p)
> fail:
> if (best_hard_regno >= 0)
> {
> - for (i = hard_regno_nregs (best_hard_regno, mode) - 1; i >= 0; i--)
> - allocated_hardreg_p[best_hard_regno + i] = true;
> + record_allocation (best_hard_regno,
> + hard_regno_nregs (best_hard_regno, mode));
> spill_soft_conflicts (a, allocnos_to_spill, soft_conflict_regs,
> best_hard_regno);
> }
> @@ -3369,8 +3422,7 @@ improve_allocation (void)
> /* Assign the best chosen hard register to A. */
> ALLOCNO_HARD_REGNO (a) = best;
>
> - for (j = nregs - 1; j >= 0; j--)
> - allocated_hardreg_p[best + j] = true;
> + record_allocation (best, nregs);
>
> if (internal_flag_ira_verbose > 2 && ira_dump_file != NULL)
> fprintf (ira_dump_file, "Assigning %d to a%dr%d\n",
> diff --git a/gcc/target.def b/gcc/target.def
> index 4050b2ebdd4..fdd80c06c77 100644
> --- a/gcc/target.def
> +++ b/gcc/target.def
> @@ -3775,6 +3775,80 @@ are the same as to this target hook.",
> int, (machine_mode mode, reg_class_t rclass, bool in),
> default_memory_move_cost)
>
> +DEFHOOK
> +(callee_save_cost,
> + "Return the one-off cost of saving or restoring callee-saved registers\n\
> +(also known as call-preserved register or non-volatile registers).\n\
> +The parameters are as follows:\n\
> +\n\
> +@itemize\n\
> +@item\n\
> +@var{cost_type} is @samp{spill_cost_type::SAVE} for saving a register\n\
> +and @samp{spill_cost_type::RESTORE} for restoring a register.\n\
> +\n\
> +@item\n\
> +@var{hard_regno} and @var{mode} represent the whole register that\n\
> +the register allocator is considering using; of these,\n\
> +@var{nregs} registers are fully or partially callee-saved.\n\
> +\n\
> +@item\n\
> +@var{mem_cost} is the normal cost for storing (for saves)\n\
> +or loading (for restores) the @var{nregs} registers.\n\
> +\n\
> +@item\n\
> +@var{allocated_callee_regs} is the set of callee-saved registers\n\
> +that are already in use.\n\
> +\n\
> +@item\n\
> +@var{existing_spills_p} is true if the register allocator has\n\
> +already decided to spill registers to memory.\n\
> +@end itemize\n\
> +\n\
> +If @var{existing_spills_p} is false, the cost of a save should account\n\
> +for frame allocations in a way that is consistent with\n\
> +@code{TARGET_FRAME_ALLOCATION_COST}'s handling of allocations for\n\
> +caller-saved registers. Similarly, the cost of a restore should then\n\
> +account for frame deallocations in a way that is consistent with\n\
> +@code{TARGET_FRAME_ALLOCATION_COST}'s handling of deallocations.\n\
> +\n\
> +Note that this hook should not attempt to apply a frequency scale\n\
> +to the cost: it is the caller's responsibility to do that where\n\
> +appropriate.\n\
> +\n\
> +The default implementation returns @var{mem_cost}, plus the allocation\n\
> +or deallocation cost returned by @code{TARGET_FRAME_ALLOCATION_COST},\n\
> +where appropriate.",
> + int, (spill_cost_type cost_type, unsigned int hard_regno,
> + machine_mode mode, unsigned int nregs, int mem_cost,
> + const HARD_REG_SET &allocated_callee_regs, bool existing_spills_p),
> + default_callee_save_cost)
> +
> +DEFHOOK
> +(frame_allocation_cost,
> + "Return the cost of allocating or deallocating a frame for the sake of\n\
> +using caller-saved registers; @var{cost_type} chooses between allocation\n\
> +and deallocation.\n\
> +\n\
> +This hook is only called if the register allocator has not so far\n\
> +decided to spill any pseudo registers to memory or to use caller-saved\n\
> +registers for values that are live across a call. It may have\n\
> +decided to use callee-saved registers; if so, @var{allocated_callee_regs}\n\
> +is the set of callee-saved registers it has used. There might also be\n\
> +other reasons why a stack frame is already needed; for example,\n\
> +@samp{get_frame_size ()} may be nonzero, or the target might already\n\
> +require a frame for target-specific reasons.\n\
> +\n\
> +When the register allocator uses this hook to cost caller-saved registers,\n\
> +it also uses @code{TARGET_CALLEE_SAVE_COST} to cost new callee-saved\n\
> +registers, passing @samp{false} as the @var{existing_spills_p} argument.\n\
> +The intention is to allow the target to apply an apples-for-apples\n\
> +comparison between callee-saved and caller-saved registers when the\n\
> +allocator has not yet committed to using registers of both types.\n\
> +\n\
> +The default implementation returns 0.",
> + int, (frame_cost_type cost_type, const HARD_REG_SET &allocated_callee_regs),
> + default_frame_allocation_cost)
> +
> DEFHOOK
> (use_by_pieces_infrastructure_p,
> "GCC will attempt several strategies when asked to copy between\n\
> diff --git a/gcc/target.h b/gcc/target.h
> index 3e1ee68a341..2bf35e2d0ee 100644
> --- a/gcc/target.h
> +++ b/gcc/target.h
> @@ -284,6 +284,18 @@ enum poly_value_estimate_kind
> POLY_VALUE_LIKELY
> };
>
> +enum class spill_cost_type
> +{
> + SAVE,
> + RESTORE
> +};
> +
> +enum class frame_cost_type
> +{
> + ALLOCATION,
> + DEALLOCATION
> +};
> +
> typedef void (*emit_support_tinfos_callback) (tree);
>
> extern bool verify_type_context (location_t, type_context_kind, const_tree,
> diff --git a/gcc/targhooks.cc b/gcc/targhooks.cc
> index f80dc8b2e7e..a8331ba5576 100644
> --- a/gcc/targhooks.cc
> +++ b/gcc/targhooks.cc
> @@ -2075,6 +2075,29 @@ default_register_move_cost (machine_mode mode
> ATTRIBUTE_UNUSED,
> #endif
> }
>
> +int
> +default_callee_save_cost (spill_cost_type spill_type, unsigned int,
> + machine_mode, unsigned int, int mem_cost,
> + const HARD_REG_SET &callee_saved_regs,
> + bool existing_spills_p)
> +{
> + if (!existing_spills_p)
> + {
> + auto frame_type = (spill_type == spill_cost_type::SAVE
> + ? frame_cost_type::ALLOCATION
> + : frame_cost_type::DEALLOCATION);
> + mem_cost += targetm.frame_allocation_cost (frame_type,
> + callee_saved_regs);
> + }
> + return mem_cost;
> +}
> +
> +int
> +default_frame_allocation_cost (frame_cost_type, const HARD_REG_SET &)
> +{
> + return 0;
> +}
> +
> /* The default implementation of TARGET_SLOW_UNALIGNED_ACCESS. */
>
> bool
> diff --git a/gcc/targhooks.h b/gcc/targhooks.h
> index 7d15f632c23..0c4c27e257e 100644
> --- a/gcc/targhooks.h
> +++ b/gcc/targhooks.h
> @@ -234,6 +234,11 @@ extern tree default_builtin_tm_load_store (tree);
> extern int default_memory_move_cost (machine_mode, reg_class_t, bool);
> extern int default_register_move_cost (machine_mode, reg_class_t,
> reg_class_t);
> +extern int default_callee_save_cost (spill_cost_type, unsigned int,
> + machine_mode, unsigned int, int,
> + const HARD_REG_SET &, bool);
> +extern int default_frame_allocation_cost (frame_cost_type,
> + const HARD_REG_SET &);
> extern bool default_slow_unaligned_access (machine_mode, unsigned int);
> extern HOST_WIDE_INT default_estimated_poly_value (poly_int64,
> poly_value_estimate_kind);
> --
> 2.25.1
>