https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116743
--- Comment #26 from Rama Malladi ---
Thank you, Eugene, for the fix and backports.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116743
--- Comment #23 from Rama Malladi ---
Hi Eugene,
Hope you are doing well. I am just checking in on whether this patch has been
committed to mainline GCC and backported.
Thanks
-Rama
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116743
--- Comment #21 from Rama Malladi ---
Hi Eugene,
We verified that the GCC patch restores the PGO performance gain. The HammerDB
workload shows an 11% PGO performance gain with the fix, compared to a 2% gain
without the fix.
Thanks for the fix.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116743
--- Comment #20 from Rama Malladi ---
> I propose the patch below. Rama, can you please check if this resolves your
> perf regression?
Hi Eugene,
Thanks for this investigation and proposed fix. I can give it a try and update
in a day or two.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116743
--- Comment #15 from Rama Malladi ---
(In reply to Rama Malladi from comment #14)
> Thanks Eugene. Were you able to review the repro and propose a fix?
Hi Eugene, checking again. It would be great if you can look into it.
Thanks.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116743
--- Comment #14 from Rama Malladi ---
Thanks Eugene. Were you able to review the repro and propose a fix?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116743
--- Comment #12 from Rama Malladi ---
Hi Eugene, checking in again. Can you please review this issue and propose a fix?
Thanks.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116743
--- Comment #11 from Rama Malladi ---
Hi Eugene, Could you please review the test-case attached? Thanks.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116743
--- Comment #10 from Rama Malladi ---
Created attachment 59207
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=59207&action=edit
reproducer for function inlining issue - source
This is a reproducer to show GCC 12.3.0 inlining issue w AutoFD
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116743
--- Comment #9 from Rama Malladi ---
Created attachment 59206
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=59206&action=edit
reproducer for function inlining issue
This is a reproducer to show GCC 12.3.0 inlining issue w AutoFDO due to
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116743
--- Comment #8 from Rama Malladi ---
Attached is a compilation unit extracted from the `MySQL` repo
(https://github.com/mysql/mysql-server/blob/trunk/storage/innobase/handler/ha_innodb.cc)
which shows the impact of commit `3d9e6767939e` on func
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116743
--- Comment #7 from Rama Malladi ---
I haven't been able to create a reproducer yet. A simple `test.cc` such as the
following does not show this behavior, as the compiler inlines these functions
regardless (at `-O3`).
```
#include
#define N 30
void init_arra
```
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116743
--- Comment #6 from Rama Malladi ---
I am trying to create a reproducer for this issue. In the interim, I wanted to
share some stats I got from the MySQL build to highlight this issue with GCC
12.3.0 vs. 11.5.0.
Executable Size (B)Baseline
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116743
--- Comment #5 from Rama Malladi ---
(In reply to Eugene Rozenfeld from comment #4)
> AutoFDO does work. I made a number of fixes and improvements over the last
> several years, both in GCC (including fixing autoprofiledbootstrap) and in
> googl
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116743
--- Comment #2 from Rama Malladi ---
The 10% regression was observed on an `aarch64` instance.
We saw a regression on an `x86_64` instance too. Here is the data:
GCC version Baseline AutoFDO
---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116743
Bug ID: 116743
Summary: Commit `3d9e6767939e` causes ~10% perf regression
Product: gcc
Version: 12.3.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Compone
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=26163
Bug 26163 depends on bug 114531, which changed state.
Bug 114531 Summary: Feature proposal for an `-finline-functions-aggressive`
compiler option
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114531
What|Removed
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114531
Rama Malladi changed:
What|Removed |Added
Resolution|--- |WONTFIX
Status|UNCONFIRMED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114531
--- Comment #24 from Rama Malladi ---
I am closing this bug report, as the committee reviewed the feature request and
rejected it.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114531
--- Comment #22 from Rama Malladi ---
Checking in again... Hubicka@ and rsandifo@, can we take action on this PR
(https://gcc.gnu.org/pipermail/gcc-patches/2024-June/655506.html) and either
accept or reject it? Thanks.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114531
--- Comment #21 from Rama Malladi ---
Thank you Hubicka@ and rsandifo@ for reviewing this feature request and
supporting it. Could we go ahead and resolve this as accepted or rejected, and
act accordingly on the PR
https://gcc.gnu.org/pipermail/gcc-
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114531
--- Comment #19 from Rama Malladi ---
Thank you Hubicka@ for the inputs. I see your intent and that we have to
revisit the inline parameter tuning. As Richard S and I mentioned, the intent
of this feature request or PR is to expose such an optio
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114531
--- Comment #16 from Rama Malladi ---
I had posted a patch at the URL below for this feature:
https://gcc.gnu.org/pipermail/gcc-patches/2024-June/655506.html
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114531
--- Comment #15 from Rama Malladi ---
Thanks for the comments and for giving us some history/ perspective. I agree
with this statement,
> Pushing up -O2 limits can make sense, but needs to be done carefully -
> in longer term IMO we do not want
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114531
--- Comment #13 from Rama Malladi ---
(In reply to Jan Hubicka from comment #12)
> If this is without LTO, can you also try the LTO numbers?
> Inliner behaves significantly differently with and without LTO, since LTO
> introduces many (and often to
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114531
--- Comment #11 from Rama Malladi ---
(In reply to Wilco from comment #10)
> A 1.1% overall performance gain looks good - is there a significant codesize
> hit from this? If so, are there slightly less aggressive settings that still
> get most o
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114531
--- Comment #9 from Rama Malladi ---
I would like us to review this feature implementation, given that GCC 15 Stage 1
development has started. Thank you.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114531
Rama Malladi changed:
What|Removed |Added
CC||rvmallad at amazon dot com
--- Comment #
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114531
--- Comment #7 from Rama Malladi ---
(In reply to Rama Malladi from comment #5)
> (In reply to Andrew Pinski from comment #3)
> > Also do you have numbers with lto enabled? Or is these without lto?
> >
> > Does LTO improve the situation for Env
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114531
--- Comment #5 from Rama Malladi ---
(In reply to Andrew Pinski from comment #3)
> Also do you have numbers with lto enabled? Or is these without lto?
>
> Does LTO improve the situation for Envoy too?
These numbers are without lto. I haven't t
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114531
--- Comment #4 from Rama Malladi ---
(In reply to Andrew Pinski from comment #1)
> Maybe we should figure out why the increase of the limits help and add extra
> code to get better heuristics rather than just tweaking the limits.
>
> I know tha
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114531
Bug ID: 114531
Summary: Feature proposal for an
`-finline-functions-aggressive` compiler option
Product: gcc
Version: 14.0
Status: UNCONFIRMED
Severity: normal
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97696
--- Comment #5 from Rama Malladi ---
Thank you Richard for this patch/ fix.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97696
Rama Malladi changed:
What|Removed |Added
CC||rvmallad at amazon dot com
--- Comment #2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107409
--- Comment #23 from Rama Malladi ---
(In reply to Rama Malladi from comment #22)
> I will close this issue as we were unable to reproduce the perf drop going
> from gcc-7 to gcc-8 on a Graviton2 based instance. The performance of
> 519.lbm_r bu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107409
--- Comment #22 from Rama Malladi ---
I will close this issue as we were unable to reproduce the perf drop going from
gcc-7 to gcc-8 on a Graviton2-based instance. The performance of 519.lbm_r
built with gcc-7.4 was the same as that with gcc-8.5.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107409
--- Comment #21 from Rama Malladi ---
I did another triage of the perf loss on a Graviton 2 (neoverse-n1) based
instance and found commit `a9a4edf0e71bbac9f1b5dcecdcf9250111d16889` to
be the cause. As I had indicated in my earlier re
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107409
--- Comment #20 from Rama Malladi ---
@Martin J and @Sebastian P, Let me walk you through the perf data and my
triage.
First, my triage has been on Graviton 3 (neoverse-v1) processor based
instances. Next, I was looking for perf delta going fro
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107409
--- Comment #19 from Rama Malladi ---
Thanks @Sebastian and @Martin J. I will run another bisect between GCC 7 and 8.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107409
--- Comment #15 from Rama Malladi ---
Hi, can we review this issue and suggest next steps, please? Thanks.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107409
--- Comment #14 from Rama Malladi ---
(In reply to Martin Liška from comment #13)
> Note the mentioned revision is a fix and yes, sometimes these revisions can
> end up with a regression as profile estimation is a complex guess.
Yes, possibly.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107409
--- Comment #12 from Rama Malladi ---
I found differences in the dumps at various stages of compilation between
mainline GCC and a build with `update_max_bb_count()` commented out. Here are the details:
Mainline: Commit ID: 63a42ffc0833553fbcb84b50cf0fd2d867b
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107409
--- Comment #11 from Rama Malladi ---
(In reply to Martin Liška from comment #10)
> @Honza ?
Just checking if this can be fixed/ implemented. Thanks.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107413
--- Comment #19 from Rama Malladi ---
(In reply to Wilco from comment #17)
> (In reply to Rama Malladi from comment #16)
> > (In reply to Wilco from comment #15)
> > > (In reply to Rama Malladi from comment #14)
> > > > This fix also improved pe
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107413
--- Comment #18 from Rama Malladi ---
(In reply to Wilco from comment #17)
> (In reply to Rama Malladi from comment #16)
> > (In reply to Wilco from comment #15)
> > > (In reply to Rama Malladi from comment #14)
> > > > This fix also improved pe
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107409
--- Comment #9 from Rama Malladi ---
(In reply to Martin Liška from comment #3)
> Can you please share perf-profile before and after the revision?
>
> Note I can't see it for Altra aarch64 CPU:
> https://lnt.opensuse.org/db_default/v4/SPEC/grap
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107413
--- Comment #16 from Rama Malladi ---
(In reply to Wilco from comment #15)
> (In reply to Rama Malladi from comment #14)
> > This fix also improved performance of 538.imagick_r by 15%. Did you have a
> > similar observation? Thank you.
>
> No,
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107413
--- Comment #14 from Rama Malladi ---
This fix also improved performance of 538.imagick_r by 15%. Did you have a
similar observation? Thank you.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107413
--- Comment #13 from Rama Malladi ---
(In reply to CVS Commits from comment #12)
> The master branch has been updated by Wilco Dijkstra :
>
> https://gcc.gnu.org/g:0c1b0a23f1fe7db6a2e391b7cb78cff90032
>
> commit r13-4291-g0c1b0a23f1fe7db6a
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107413
--- Comment #11 from Rama Malladi ---
(In reply to Wilco from comment #10)
> I'm seeing about 1.5% gain on Neoverse V1 and 0.5% loss on Neoverse N1. I'll
> post a patch that allows per-CPU settings for FMA reassociation, so you'll
> get good per
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107413
--- Comment #9 from Rama Malladi ---
(In reply to Rama Malladi from comment #8)
> (In reply to Wilco from comment #7)
> > The revert results in about 0.5% loss on Neoverse N1, so it looks like the
> > reassociation pass is still splitting FMAs i
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107413
--- Comment #8 from Rama Malladi ---
(In reply to Wilco from comment #7)
> The revert results in about 0.5% loss on Neoverse N1, so it looks like the
> reassociation pass is still splitting FMAs into separate MUL and ADD (which
> is bad for narr
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107413
--- Comment #6 from Rama Malladi ---
The compilation options were: `-Ofast -mcpu=native -flto`
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107413
--- Comment #5 from Rama Malladi ---
(In reply to Wilco from comment #2)
> That's interesting - if the reassociation pass has become a bit smarter in
> the last 5 years, we might no longer need this workaround. What is the
> effect on the overal
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107433
--- Comment #2 from Rama Malladi ---
(In reply to Martin Liška from comment #1)
> As mentioned slightly here:
> https://www.spec.org/cpu2017/Docs/benchmarks/510.parest_r.html
> please use -std=c++98 or something < c++17.
Thank you. I had it for
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107409
--- Comment #8 from Rama Malladi ---
(In reply to Mark Wielaard from comment #7)
> The content of attachment 53773 [details] has been deleted for the following
> reason:
>
> https://sourceware.org/pipermail/overseers/2022q4/019048.html
Thank y
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107409
--- Comment #6 from Rama Malladi ---
(In reply to Martin Liška from comment #5)
> Please try writing here: overse...@sourceware.org
I have asked for deletion. Thanks
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107433
Bug ID: 107433
Summary: 510.parest_r, call of overloaded 'back_interpolate' is
ambiguous
Product: gcc
Version: 13.0
Status: UNCONFIRMED
Severity: normal
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107409
--- Comment #4 from Rama Malladi ---
Hi Martin,
Thanks for the guidance. Can we delete the attachment from this bug report?
Regards,
Rama
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107413
--- Comment #3 from Rama Malladi ---
I will measure the effect of this revert on the overall SPEC FP score. I haven't
tried experimenting with fp_reassoc_width values; I will try that and update.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107413
--- Comment #1 from Rama Malladi ---
$ /home/ubuntu/gccfixissue2/bin/gcc -v
Using built-in specs.
COLLECT_GCC=/home/ubuntu/gccfixissue2/bin/gcc
COLLECT_LTO_WRAPPER=/home/ubuntu/gccfixissue2/libexec/gcc/aarch64-unknown-linux-gnu/13.0.0/lto-wrapp
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107409
--- Comment #1 from Rama Malladi ---
$ /home/ubuntu/gccfixissue1/bin/gcc -v
Using built-in specs.
COLLECT_GCC=/home/ubuntu/gccfixissue1/bin/gcc
COLLECT_LTO_WRAPPER=/home/ubuntu/gccfixissue1/libexec/gcc/aarch64-unknown-linux-gnu/13.0.0/lto-wrapp
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107413
Bug ID: 107413
Summary: Perf loss ~14% on 519.lbm_r SPEC cpu2017 benchmark
Product: gcc
Version: 13.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Componen
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107409
Bug ID: 107409
Summary: Perf loss ~5% on 519.lbm_r SPEC cpu2017 benchmark
Product: gcc
Version: 13.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component