https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117838
Bug ID: 117838
Summary: IRA issues: The higher cost variable a is spilled for
the lower cost variable conflict_a in
improve_allocation()
Product: gcc
Version: 1
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117192
--- Comment #14 from cuilili ---
(In reply to Uroš Bizjak from comment #12)
> Created attachment 59373 [details]
> Proposed patch
>
> Patch in testing.
Sorry, I made a mistake here, thanks!
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110148
--- Comment #7 from cuilili ---
(In reply to Martin Jambor from comment #6)
> I believe this has been fixed?
Yes.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110148
--- Comment #3 from cuilili ---
I reproduced S1244 regression on znver3.
Src code:
for (int i = 0; i < LEN_1D-1; i++)
  {
    a[i] = b[i] + c[i] * c[i] + b[i] * b[i] + c[i];
    d[i] = a[i] + a[i+1];
  }
---
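For readers who want to experiment with the S1244 loop outside the benchmark, the following is a minimal standalone sketch of it. The function name `s1244_kernel`, the `float` element type, and the explicit `len` parameter (standing in for LEN_1D) are assumptions for illustration; the comment above does not state them.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical standalone version of the S1244 loop quoted above.
   Assumptions: float elements, and `len` plays the role of LEN_1D.
   Note that a[i+1] is read before the next iteration overwrites it. */
void s1244_kernel(float *a, const float *b, const float *c, float *d,
                  size_t len)
{
    assert(len >= 1);
    for (size_t i = 0; i + 1 < len; i++) {
        a[i] = b[i] + c[i] * c[i] + b[i] * b[i] + c[i];
        d[i] = a[i] + a[i + 1];
    }
}
```

Compiling this function with different -march and -O settings and comparing the generated assembly is one way to look at the code generation difference; it is not a substitute for the benchmark measurement reported in the bug.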
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110148
cuilili changed:
What|Removed |Added
CC||lili.cui at intel dot com
--- Comment #2 from
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104271
--- Comment #14 from cuilili ---
This regression has been fixed with the commit below and we can close this
ticket.
https://gcc.gnu.org/g:1b9a5cc9ec08e9f239dd2096edcc447b7a72f64a
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110038
--- Comment #5 from cuilili ---
(In reply to Martin Jambor from comment #4)
> So is this now fixed?
Yes, the attachment case has been fixed.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110038
--- Comment #2 from cuilili ---
(In reply to Richard Biener from comment #1)
> Probably best to limit the values to reassoc-width by adding the
> appropriate IntegerRange attribute in params.opt
>
> IntegerRange(0, 256)
>
> maybe?
"rewrite_ex
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104271
--- Comment #12 from cuilili ---
This regression is caused by a store forwarding issue; inlining eliminates the
two redundant pairs of loads and stores that trigger it.
This regression has been fixed by
https://gcc.gnu.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=26163
Bug 26163 depends on bug 105493, which changed state.
Bug 105493 Summary: [12/13 Regression] x86_64 538.imagick_r 6% regressions and
2% 525.x264_r regressions on Alder Lake after r12-7319-g90d693bdc9d718
https://gcc.gnu.org/bugzilla/show_bug.cgi?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105493
cuilili changed:
What|Removed |Added
Resolution|--- |FIXED
Status|NEW
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105493
--- Comment #2 from cuilili ---
(In reply to Richard Biener from comment #1)
> Martin is currently re-benchmarking GCC 12 on AMD, so let's see if there's
> anything left on those.
AMD may not have this issue; Richard fixed the AMD regression with t
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105493
Bug ID: 105493
Summary: [12/13 Regression] x86_64 538.imagick_r 6% regressions
and 2% 525.x264_r regressions on Alder Lake after
r12-7319-g90d693bdc9d718
Product: gcc
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104723
--- Comment #11 from cuilili ---
(In reply to Jakub Jelinek from comment #10)
> And for the backend, the question is how big the penalty for the overlapping
> store is compared to doing multiple non-overlapping stores. Say for those
> 49 bytes
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104271
--- Comment #9 from cuilili ---
I really appreciate your reply. I debugged the SRA pass with the small testcase
and found that SRA does not handle this situation.
SRA cannot split callee's first parameter for "Do not decompose non-BLKmode
paramet
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104271
--- Comment #7 from cuilili ---
Created attachment 52706
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52706&action=edit
Add a heuristic to eliminate redundant loads and stores in the inline pass.
Hi Richard,
Could you help take a look? This
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104271
--- Comment #6 from cuilili ---
I created a patch to fix this regression. The patch is under performance
testing. I will send it out later.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104723
--- Comment #9 from cuilili ---
(In reply to cuilili from comment #3)
> (In reply to Hongtao.liu from comment #1)
> > STF issue here?
>
Correction to comment #3:
I used perf to collect the "ld_blocks.store_forward" event for those two test
cases, stl
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104723
--- Comment #3 from cuilili ---
(In reply to Hongtao.liu from comment #1)
> STF issue here?
Yes. Since "YMMWORD PTR [rsp-72]" crosses a cache line, there is an STLF issue
here.
vmovdqu64 YMMWORD PTR [rsp-72], ymm31 --> store 32 bytes from [rsp-7
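The cache-line claim above can be checked arithmetically for any concrete address. The sketch below is only an illustration: the helper name `crosses_cache_line` and the 64-byte line size are assumptions (64 bytes is typical for current x86-64 parts, but the comment does not state it), and whether `[rsp-72]` actually crosses a line depends on the run-time value of rsp.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed cache line size; not stated in the bug comment. */
#define CACHE_LINE_BYTES 64u

/* Hypothetical helper: returns true when the access [addr, addr + size)
   touches more than one cache line, i.e. its first and last byte fall
   into different CACHE_LINE_BYTES-sized blocks. */
bool crosses_cache_line(uintptr_t addr, size_t size)
{
    assert(size > 0);
    uintptr_t first_line = addr / CACHE_LINE_BYTES;
    uintptr_t last_line = (addr + size - 1) / CACHE_LINE_BYTES;
    return first_line != last_line;
}
```

For example, a 32-byte store starting at offset 56 within a line spans two lines, while the same store starting at offset 32 stays within one.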
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101908
--- Comment #28 from cuilili ---
(In reply to H.J. Lu from comment #25)
> Can this be mitigated by removing redundant load and store?
Yes, inlining say_sphere can remove redundant loads and stores. O3 does this
inlining, but O2 is more sensitive to c
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101908
--- Comment #24 from cuilili ---
(In reply to cuilili from comment #23)
> (In reply to Richard Biener from comment #17)
> > I do wonder though how CLX is fine with such access pattern ;) (did you
> > test
> > with just -O2?)
>
Sorry, correct
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101908
cuilili changed:
What|Removed |Added
CC||lili.cui at intel dot com
--- Comment #23 fro