Jan Hubicka writes:
>>
>> Thanks for running these. I saw poor results for perlbench with my
>> initial aarch64 hooks because the hooks reduced the cost to zero for
>> the entry case:
>>
>> auto entry_cost = targetm.callee_save_cost
>> (spill_cost_type::SAVE, hard_regno, mod
>
> Thanks for running these. I saw poor results for perlbench with my
> initial aarch64 hooks because the hooks reduced the cost to zero for
> the entry case:
>
> auto entry_cost = targetm.callee_save_cost
> (spill_cost_type::SAVE, hard_regno, mode, saved_nregs,
>
Jan Hubicka writes:
>> On Wed, Feb 19, 2025 at 9:06 PM Jan Hubicka wrote:
>> >
>> > Hi,
>> > this is a variant of a hook I benchmarked on cpu2016 with -Ofast -flto
>> > and -O2 -flto. For non -Os and no Windows ABI it should be practically the
>> > same as your variant that was simply returning mem_
> On Wed, Feb 19, 2025 at 9:06 PM Jan Hubicka wrote:
> >
> > Hi,
> > this is a variant of a hook I benchmarked on cpu2016 with -Ofast -flto
> > and -O2 -flto. For non -Os and no Windows ABI it should be practically the
> > same as your variant that was simply returning mem_cost - 2.
> >
> I've tested
On Wed, Feb 19, 2025 at 9:06 PM Jan Hubicka wrote:
>
> Hi,
> this is a variant of a hook I benchmarked on cpu2016 with -Ofast -flto
> and -O2 -flto. For non -Os and no Windows ABI it should be practically the
> same as your variant that was simply returning mem_cost - 2.
>
I've tested O2/(Ofast march
Hi,
this is a variant of a hook I benchmarked on cpu2016 with -Ofast -flto
and -O2 -flto. For non -Os and no Windows ABI it should be practically the
same as your variant that was simply returning mem_cost - 2.
It seems mostly SPEC neutral. With -O2 -flto there is a
small 4% improvement on povray (whic
> Jan Hubicka writes:
> > Concerning x86 specifics, there is a cost for allocating a stack frame. So
> > if the function has nothing on the stack frame, push/pop becomes a slightly
> > better candidate than a spill. The hook you added does not seem to be able to
> > test this, since it does not have frame size
Jan Hubicka writes:
> Concerning x86 specifics, there is a cost for allocating a stack frame. So
> if the function has nothing on the stack frame, push/pop becomes a slightly
> better candidate than a spill. The hook you added does not seem to be able to
> test this, since it does not have frame size as an para
Hello,
I looked into updating the hook
> -/* Implement TARGET_IRA_CALLEE_SAVED_REGISTER_COST_SCALE. */
> +/* Implement TARGET_CALLEE_SAVE_COST. */
>
> static int
> -ix86_ira_callee_saved_register_cost_scale (int)
> +ix86_callee_save_cost (spill_cost_type, unsigned int, machine_mode,
> +
Jan Hubicka writes:
>> As described below, the patch also shows no change to AArch64 SPEC2017
>> scores. I'm afraid I'll need help from x86 folks to do performance
>> testing there.
>
> I will look into this over the weekend. I can write an x86 version of the
> hooks. Though in an earlier email you mention
> Jan Hubicka writes:
> >> As described below, the patch also shows no change to AArch64 SPEC2017
> >> scores. I'm afraid I'll need help from x86 folks to do performance
> >> testing there.
> >
> > I will look into this over the weekend. I can write an x86 version of the
> > hooks. Though in an earlier ema
On 2/14/25 12:27 PM, Peter Bergner wrote:
On 2/14/25 10:43 AM, Vladimir Makarov wrote:
The patch is very well described and it is OK for me to commit it into the
trunk. Thank you for working on this issue, Richard.
If we have some new failures on targets I believe the hook has enough
descrip
On 2/14/25 10:43 AM, Vladimir Makarov wrote:
> The patch is very well described and it is OK for me to commit it into the
> trunk. Thank you for working on this issue, Richard.
>
> If we have some new failures on targets I believe the hook has enough
> descriptive power to fix the failures.
Note
On 2/13/25 11:08 AM, Richard Sandiford wrote:
From 46ad583e65a1c5a27e2203a7571bba6eb0766bc6 Mon Sep 17 00:00:00 2001
From: Richard Sandiford
Date: Fri, 7 Feb 2025 15:40:21 +
Subject: [PATCH] ira: Add new hooks for callee-save vs spills [PR117477]
To: gcc-patches@gcc.gnu.org
Following on
> As described below, the patch also shows no change to AArch64 SPEC2017
> scores. I'm afraid I'll need help from x86 folks to do performance
> testing there.
I will look into this over the weekend. I can write an x86 version of the
hooks. Though in an earlier email you mentioned you hacked up something, s
Vladimir Makarov writes:
> On 2/7/25 12:18 PM, Richard Sandiford wrote:
>> FWIW, here's a very rough initial version of the kind of thing
>> I was thinking about. Hopefully the hook documentation describes
>> the approach. It's deliberately (overly?) flexible.
>>
>> I've included an aarch64 vers
On 2/7/25 12:18 PM, Richard Sandiford wrote:
FWIW, here's a very rough initial version of the kind of thing
I was thinking about. Hopefully the hook documentation describes
the approach. It's deliberately (overly?) flexible.
I've included an aarch64 version that (a) models the fact that the
On Tue, Feb 11, 2025 at 4:38 PM Hongtao Liu wrote:
>
> On Tue, Feb 11, 2025 at 4:27 PM H.J. Lu wrote:
> >
> > On Tue, Feb 11, 2025 at 4:13 PM Hongtao Liu wrote:
> > >
> > > > > PR117081 is about regression in povray. The reduced testcase:
> > > Just for clarification. PR117081 is not about regres
On Tue, Feb 11, 2025 at 4:27 PM H.J. Lu wrote:
>
> On Tue, Feb 11, 2025 at 4:13 PM Hongtao Liu wrote:
> >
> > > PR117081 is about regression in povray. The reduced testcase:
> > Just for clarification. PR117081 is not about regression in povray.
> > It's related to FAIL: gcc.target/i386/pr91384.
On Tue, Feb 11, 2025 at 4:13 PM Hongtao Liu wrote:
>
> > PR117081 is about regression in povray. The reduced testcase:
> Just for clarification. PR117081 is not about regression in povray.
> It's related to FAIL: gcc.target/i386/pr91384.c scan-assembler-not
> testl
> pr91384.c was added by r12
> PR117081 is about regression in povray. The reduced testcase:
Just for clarification. PR117081 is not about regression in povray.
It's related to FAIL: gcc.target/i386/pr91384.c scan-assembler-not
testl
pr91384.c was added by r12-7417, which is a peephole optimization
expecting some specific ins
On Sun, Feb 2, 2025 at 10:23 PM H.J. Lu wrote:
>
> commit 3b9b8d6cfdf59337f4b7ce10ce92a98044b2657b
> Author: Surya Kumari Jangala
> Date: Tue Jun 25 08:37:49 2024 -0500
>
> ira: Scale save/restore costs of callee save registers with block
> frequency
>
> scales the cost of saving/restoring
On Mon, Feb 10, 2025 at 9:56 AM Andrew Pinski wrote:
>
> On Sun, Feb 2, 2025 at 10:23 PM H.J. Lu wrote:
> >
> > commit 3b9b8d6cfdf59337f4b7ce10ce92a98044b2657b
> > Author: Surya Kumari Jangala
> > Date: Tue Jun 25 08:37:49 2024 -0500
> >
> > ira: Scale save/restore costs of callee save reg
Richard Sandiford writes:
> FWIW, here's a very rough initial version of the kind of thing
> I was thinking about. Hopefully the hook documentation describes
> the approach. It's deliberately (overly?) flexible.
Argh! I forgot:
diff --git a/gcc/ira-color.cc b/gcc/ira-color.cc
index de34be31f4
On Fri, Feb 7, 2025 at 9:20 AM Richard Sandiford
wrote:
>
> Richard Sandiford writes:
> > Really nice analysis! Thanks for writing this up.
> >
> > Sorry for the big quote below, but:
> >
> > Jan Hubicka writes:
> >> [...]
> >> PR117081 is about regression in povray. The reduced testcase:
> >>
On 2/6/25 5:35 PM, Jan Hubicka wrote:
Register 3 (first caller saved) has cost 11000. This comes from:
add_cost = ((ira_memory_move_cost[mode][rclass][0]
+ ira_memory_move_cost[mode][rclass][1])
* saved_nregs / hard_regno_nregs (
Richard Sandiford writes:
> Really nice analysis! Thanks for writing this up.
>
> Sorry for the big quote below, but:
>
> Jan Hubicka writes:
>> [...]
>> PR117081 is about regression in povray. The reduced testcase:
>>
>> void foo (void);
>> void bar (void);
>>
>> int
>> test (int a)
>> {
>>
> >
> >    0: 89 f8          mov    %edi,%eax      <--- move1
> >    2: 48 83 ec 18    sub    $0x18,%rsp     <--- stack frame creation
> >    6: f7 d8          neg    %eax
> >    8: 89 44 24 0c    mov    %eax,0xc(%rsp) <--- spil
Richard Sandiford writes:
> In particular, one thing that the examples above have in common is that
> they don't need to allocate a frame for local variables. That seems
> like it ought to be part of the mix. If we need to allocate a frame
> using addition anyway, then presumably one of the adva
Really nice analysis! Thanks for writing this up.
Sorry for the big quote below, but:
Jan Hubicka writes:
>> > +/* Implement TARGET_IRA_CALLEE_SAVED_REGISTER_COST_SCALE. */
>> > +
>> > +static int
>> > +ix86_ira_callee_saved_register_cost_scale (int)
>> > +{
>> > + return 1;
>> > +}
>> > +
>
> On Thu, Feb 6, 2025 at 11:40 PM Vladimir Makarov wrote:
> >
> >
> > On 2/6/25 4:54 PM, Richard Sandiford wrote:
> >
> > Vladimir Makarov writes:
> >
> > This is a complicated problem that has resulted in many attempts to fix it
> > in some general way.
> >
> > In general I agree with Surya's approach to
On Thu, Feb 6, 2025 at 11:40 PM Vladimir Makarov wrote:
>
>
> On 2/6/25 4:54 PM, Richard Sandiford wrote:
>
> Vladimir Makarov writes:
>
> This is a complicated problem that has resulted in many attempts to fix it
> in some general way.
>
> In general I agree with Surya's approach to scale the cost of reg
> s
On 2/6/25 4:54 PM, Richard Sandiford wrote:
Vladimir Makarov writes:
This is a complicated problem that has resulted in many attempts to fix it
in some general way.
In general I agree with Surya's approach to scale the cost of reg
saves/restores somehow. But the general approach, although it solved some
prob
> > +/* Implement TARGET_IRA_CALLEE_SAVED_REGISTER_COST_SCALE. */
> > +
> > +static int
> > +ix86_ira_callee_saved_register_cost_scale (int)
> > +{
> > + return 1;
> > +}
> > +
> > return cl;
> > }
> > +int
> > +default_ira_callee_saved_register_cost_scale (int)
> > +{
> > + return (opti
Vladimir Makarov writes:
> On 2/3/25 1:20 AM, H.J. Lu wrote:
>> commit 3b9b8d6cfdf59337f4b7ce10ce92a98044b2657b
>> Author: Surya Kumari Jangala
>> Date: Tue Jun 25 08:37:49 2024 -0500
>>
>> ira: Scale save/restore costs of callee save registers with block
>> frequency
>>
>> scales the cos
On 2/3/25 1:20 AM, H.J. Lu wrote:
commit 3b9b8d6cfdf59337f4b7ce10ce92a98044b2657b
Author: Surya Kumari Jangala
Date: Tue Jun 25 08:37:49 2024 -0500
ira: Scale save/restore costs of callee save registers with block frequency
scales the cost of saving/restoring a callee-save hard regist
> Hello,
>
> On Mon, 3 Feb 2025, H.J. Lu wrote:
>
> > Author: Surya Kumari Jangala
> > Date: Tue Jun 25 08:37:49 2024 -0500
> >
> > ira: Scale save/restore costs of callee save registers with block
> > frequency
> >
> > scales the cost of saving/restoring a callee-save hard register in
On 2/3/25 7:14 AM, Jeff Law wrote:
> On 2/3/25 2:31 AM, H.J. Lu wrote:
>> I believe the original patch should be reverted. Then my patch isn't needed.
>
> That patch had significant improvements across the board for RISC-V.
> I wouldn't want to see it reverted without a strong explanation of why i
On 2/3/25 3:44 AM, Richard Biener wrote:
> On Mon, Feb 3, 2025 at 10:32 AM H.J. Lu wrote:
>> I believe the original patch should be reverted. Then my patch isn't needed.
>
> I'm OK with that, but it's not my call. I do wonder why the contributor did
> not address any of the fallout. Maybe h
Hello,
On Mon, 3 Feb 2025, H.J. Lu wrote:
> Author: Surya Kumari Jangala
> Date: Tue Jun 25 08:37:49 2024 -0500
>
> ira: Scale save/restore costs of callee save registers with block
> frequency
>
> scales the cost of saving/restoring a callee-save hard register in epilogue
> and prologu
> > I don't think we should add a new target hook unless it's providing
> > genuinely new information about the target. Hooking into the RA to
> > brute-force a particular heuristic makes it harder to improve the RA
> > in future.
> >
> > There are already hooks that provide the costs of the relev
On 2/3/25 2:31 AM, H.J. Lu wrote:
IMO at this point a new target hook should preserve existing behavior by default
or alternatively the original patch should be reverted as causing
regressions and
a new patch introducing the target hook should be installed in next stage1.
I believe the ori
On Mon, Feb 3, 2025 at 6:29 PM Richard Sandiford
wrote:
>
> Richard Biener writes:
> > On Mon, Feb 3, 2025 at 7:23 AM H.J. Lu wrote:
> >>
> >> commit 3b9b8d6cfdf59337f4b7ce10ce92a98044b2657b
> >> Author: Surya Kumari Jangala
> >> Date: Tue Jun 25 08:37:49 2024 -0500
> >>
> >> ira: Scale s
Richard Biener writes:
> On Mon, Feb 3, 2025 at 7:23 AM H.J. Lu wrote:
>>
>> commit 3b9b8d6cfdf59337f4b7ce10ce92a98044b2657b
>> Author: Surya Kumari Jangala
>> Date: Tue Jun 25 08:37:49 2024 -0500
>>
>> ira: Scale save/restore costs of callee save registers with block
>> frequency
>>
>> s
On Mon, Feb 3, 2025 at 10:32 AM H.J. Lu wrote:
>
> On Mon, Feb 3, 2025 at 5:27 PM Richard Biener
> wrote:
> >
> > On Mon, Feb 3, 2025 at 7:23 AM H.J. Lu wrote:
> > >
> > > commit 3b9b8d6cfdf59337f4b7ce10ce92a98044b2657b
> > > Author: Surya Kumari Jangala
> > > Date: Tue Jun 25 08:37:49 2024 -
On Mon, Feb 3, 2025 at 5:27 PM Richard Biener
wrote:
>
> On Mon, Feb 3, 2025 at 7:23 AM H.J. Lu wrote:
> >
> > commit 3b9b8d6cfdf59337f4b7ce10ce92a98044b2657b
> > Author: Surya Kumari Jangala
> > Date: Tue Jun 25 08:37:49 2024 -0500
> >
> > ira: Scale save/restore costs of callee save regi
On Mon, Feb 3, 2025 at 7:23 AM H.J. Lu wrote:
>
> commit 3b9b8d6cfdf59337f4b7ce10ce92a98044b2657b
> Author: Surya Kumari Jangala
> Date: Tue Jun 25 08:37:49 2024 -0500
>
> ira: Scale save/restore costs of callee save registers with block
> frequency
>
> scales the cost of saving/restoring
commit 3b9b8d6cfdf59337f4b7ce10ce92a98044b2657b
Author: Surya Kumari Jangala
Date: Tue Jun 25 08:37:49 2024 -0500
ira: Scale save/restore costs of callee save registers with block frequency
scales the cost of saving/restoring a callee-save hard register in epilogue
and prologue with the en