[Bug target/101908] [12 regression] cray regression with -O2 -ftree-slp-vectorize compared to -O2

2022-02-25 Thread lili.cui at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101908 cuilili changed: What|Removed |Added CC||lili.cui at intel dot com --- Comment #23 fro

[Bug target/101908] [12 regression] cray regression with -O2 -ftree-slp-vectorize compared to -O2

2022-02-25 Thread lili.cui at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101908 --- Comment #24 from cuilili --- (In reply to cuilili from comment #23) > (In reply to Richard Biener from comment #17) > > I do wonder though how CLX is fine with such access pattern ;) (did you > > test > > with just -O2?) > Sorry, correct

[Bug target/101908] [12 regression] cray regression with -O2 -ftree-slp-vectorize compared to -O2

2022-02-27 Thread lili.cui at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101908 --- Comment #28 from cuilili --- (In reply to H.J. Lu from comment #25) > Can this be mitigated by removing redundant load and store? Yes, inlining say_sphere can remove redundant loads and stores, O3 does inlining, but O2 is more sensitive to c

[Bug target/104723] [12 regression] Redundant usage of stack

2022-03-01 Thread lili.cui at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104723 --- Comment #3 from cuilili --- (In reply to Hongtao.liu from comment #1) > STF issue here? Yes. Since "YMMWORD PTR [rsp-72]" crosses a cache line, it has an STLF issue here. vmovdqu64 YMMWORD PTR [rsp-72], ymm31 --> store 32 bytes from [rsp-7
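The cache-line claim in the comment above can be checked with simple address arithmetic. The following is a minimal sketch, not code from the bug report: the helper name crosses_cache_line and the 64-byte line size are assumptions made for illustration, and whether the 32-byte store at [rsp-72] really straddles a line depends on the runtime value of rsp.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed cache-line size; 64 bytes is typical for current x86-64 CPUs. */
#define CACHE_LINE_SIZE 64u

/* Hypothetical helper: return true if the access [addr, addr + size)
   touches more than one cache line.  size must be non-zero. */
static bool crosses_cache_line(uintptr_t addr, size_t size)
{
    uintptr_t first_line = addr / CACHE_LINE_SIZE;
    uintptr_t last_line = (addr + size - 1) / CACHE_LINE_SIZE;
    return first_line != last_line;
}
```

For example, if rsp happened to be 64-byte aligned (say 0x1000), crosses_cache_line(0x1000 - 72, 32) would report a crossing, while a 32-byte access at 0x1000 - 64 would not.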

[Bug target/104723] [12 regression] Redundant usage of stack

2022-03-02 Thread lili.cui at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104723 --- Comment #9 from cuilili --- (In reply to cuilili from comment #3) > (In reply to Hongtao.liu from comment #1) > > STF issue here? > correct comment #3 I used perf to collect the "ld_blocks.store_forward" event for those two test cases, stl

[Bug target/104271] [12 Regression] 538.imagick_r run-time at -Ofast -march=native regressed by 26% on Intel Cascade Lake server CPU

2022-03-24 Thread lili.cui at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104271 --- Comment #6 from cuilili --- I created a patch to fix this regression. The patch is under performance testing. I will send it out later.

[Bug target/104271] [12 Regression] 538.imagick_r run-time at -Ofast -march=native regressed by 26% on Intel Cascade Lake server CPU

2022-03-28 Thread lili.cui at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104271 --- Comment #7 from cuilili --- Created attachment 52706 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52706&action=edit Add a heuristic for eliminate redundant load and store in inline pass. Hi Richard, Could you help take a look? This

[Bug target/104271] [12 Regression] 538.imagick_r run-time at -Ofast -march=native regressed by 26% on Intel Cascade Lake server CPU

2022-04-15 Thread lili.cui at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104271 --- Comment #9 from cuilili --- I really appreciate your reply. I debugged the SRA pass with the small testcase and found that SRA does not handle this situation. SRA cannot split callee's first parameter for "Do not decompose non-BLKmode paramet

[Bug target/104723] [12 regression] Redundant usage of stack

2022-04-24 Thread lili.cui at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104723 --- Comment #11 from cuilili --- (In reply to Jakub Jelinek from comment #10) > And for the backend, the question is how big the penalty for the overlapping > store is compared to doing multiple non-overlapping stores. Say for those > 49 bytes
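The trade-off raised in the quoted question can be made concrete with a small sketch. It assumes one possible decomposition of a 49-byte copy (two overlapping 32-byte pieces versus 32 + 16 + 1 disjoint pieces); the function names are hypothetical, the memcpy calls merely stand in for the vector stores a compiler might emit, and nothing here is taken from the actual patch or from GCC's output.

```c
#include <string.h>

enum { TOTAL_BYTES = 49, WIDE_BYTES = 32 };

/* Hypothetical variant 1: two 32-byte copies.  The second one starts at
   offset TOTAL_BYTES - WIDE_BYTES = 17, so bytes 17..31 are written twice. */
static void copy49_overlapping(unsigned char *dst, const unsigned char *src)
{
    memcpy(dst, src, WIDE_BYTES);
    memcpy(dst + (TOTAL_BYTES - WIDE_BYTES),
           src + (TOTAL_BYTES - WIDE_BYTES), WIDE_BYTES);
}

/* Hypothetical variant 2: three disjoint copies of 32, 16 and 1 bytes. */
static void copy49_disjoint(unsigned char *dst, const unsigned char *src)
{
    memcpy(dst, src, 32);
    memcpy(dst + 32, src + 32, 16);
    memcpy(dst + 48, src + 48, 1);
}
```

Both variants produce the same 49 bytes; the difference is only in the number and overlap of the stores, which is what the penalty question is about.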

[Bug target/105493] [12/13 Regression] x86_64 538.imagick_r 6% regressions and 2% 525.x264_r regressions on Alder Lake after r12-7319-g90d693bdc9d718

2022-07-26 Thread lili.cui at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105493 cuilili changed: What|Removed |Added Resolution|--- |FIXED Status|NEW

[Bug middle-end/26163] [meta-bug] missed optimization in SPEC (2k17, 2k and 2k6 and 95)

2022-07-26 Thread lili.cui at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=26163 Bug 26163 depends on bug 105493, which changed state. Bug 105493 Summary: [12/13 Regression] x86_64 538.imagick_r 6% regressions and 2% 525.x264_r regressions on Alder Lake after r12-7319-g90d693bdc9d718 https://gcc.gnu.org/bugzilla/show_bug.cgi?

[Bug target/104271] [12/13 Regression] 538.imagick_r run-time at -Ofast -march=native regressed by 26% on Intel Cascade Lake server CPU

2022-11-27 Thread lili.cui at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104271 --- Comment #12 from cuilili --- This regression is caused by a store forwarding issue; by inlining, we eliminate the two redundant pairs of loads and stores that have the store forwarding issue. This regression has been fixed by https://gcc.gnu.

[Bug target/105493] New: [12/13 Regression] x86_64 538.imagick_r 6% regressions and 2% 525.x264_r regressions on Alder Lake after r12-7319-g90d693bdc9d718

2022-05-05 Thread lili.cui at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105493 Bug ID: 105493 Summary: [12/13 Regression] x86_64 538.imagick_r 6% regressions and 2% 525.x264_r regressions on Alder Lake after r12-7319-g90d693bdc9d718 Product: gcc

[Bug target/105493] [12/13 Regression] x86_64 538.imagick_r 6% regressions and 2% 525.x264_r regressions on Alder Lake after r12-7319-g90d693bdc9d718

2022-05-05 Thread lili.cui at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105493 --- Comment #2 from cuilili --- (In reply to Richard Biener from comment #1) > Martin is currently re-benchmarking GCC 12 on AMD, so let's see if there's > anything left on those. AMD may not have this issue; Richard fixed the AMD regression with t

[Bug middle-end/110148] [14 Regression] TSVC s242 regression between g:c0df96b3cda5738afbba3a65bb054183c5cd5530 and g:e4c986fde56a6248f8fbe6cf0704e1da34b055d8

2023-09-25 Thread lili.cui at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110148 --- Comment #7 from cuilili --- (In reply to Martin Jambor from comment #6) > I believe this has been fixed? Yes.

[Bug tree-optimization/110038] [14 Regression] ICE: in rewrite_expr_tree_parallel, at tree-ssa-reassoc.cc:5522 with --param=tree-reassoc-width=2147483647

2023-05-30 Thread lili.cui at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110038 --- Comment #2 from cuilili --- (In reply to Richard Biener from comment #1) > Probably best to limit the values to reassoc-width by adding the > appropriate IntegerRange attribute in params.opt > > IntegerRange(0, 256) > > maybe? "rewrite_ex

[Bug tree-optimization/110038] [14 Regression] ICE: in rewrite_expr_tree_parallel, at tree-ssa-reassoc.cc:5522 with --param=tree-reassoc-width=2147483647

2023-06-06 Thread lili.cui at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110038 --- Comment #5 from cuilili --- (In reply to Martin Jambor from comment #4) > So is this now fixed? Yes, the attachment case has been fixed.

[Bug target/104271] [12 Regression] 538.imagick_r run-time at -Ofast -march=native regressed by 26% on Intel Cascade Lake server CPU

2023-06-06 Thread lili.cui at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104271 --- Comment #14 from cuilili --- This regression has been fixed with the commit below and we can close this ticket. https://gcc.gnu.org/g:1b9a5cc9ec08e9f239dd2096edcc447b7a72f64a

[Bug middle-end/110148] [14 Regression] TSVC s242 regression between g:c0df96b3cda5738afbba3a65bb054183c5cd5530 and g:e4c986fde56a6248f8fbe6cf0704e1da34b055d8

2023-06-09 Thread lili.cui at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110148 cuilili changed: What|Removed |Added CC||lili.cui at intel dot com --- Comment #2 from

[Bug middle-end/110148] [14 Regression] TSVC s242 regression between g:c0df96b3cda5738afbba3a65bb054183c5cd5530 and g:e4c986fde56a6248f8fbe6cf0704e1da34b055d8

2023-06-24 Thread lili.cui at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110148 --- Comment #3 from cuilili --- I reproduced the S1244 regression on znver3. Source code: for (int i = 0; i < LEN_1D-1; i++) { a[i] = b[i] + c[i] * c[i] + b[i] * b[i] + c[i]; d[i] = a[i] + a[i+1]; } ---
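For readers who want to compile the loop quoted above in isolation, here is a minimal self-contained sketch. The wrapper name s1244_kernel, the float element type, and passing the length as a parameter instead of using LEN_1D are assumptions for illustration; only the loop body comes from the comment.

```c
#include <stddef.h>

/* Hypothetical wrapper around the quoted loop.  len plays the role of
   LEN_1D; float is an assumed element type.  Note that d[i] reads a[i+1]
   before iteration i+1 has overwritten it, so it sees the old value. */
static void s1244_kernel(float *a, const float *b, const float *c,
                         float *d, size_t len)
{
    for (size_t i = 0; i + 1 < len; i++) {
        a[i] = b[i] + c[i] * c[i] + b[i] * b[i] + c[i];
        d[i] = a[i] + a[i + 1];
    }
}
```

The read of a[i+1] next to the write of a[i] is the dependence pattern a vectorizer has to account for when handling this loop.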

[Bug target/117192] [15 Regression] wrong code at -O3 with "-fno-unswitch-loops" on x86_64-linux-gnu since r15-4397-g70f59d2a1c51bd

2024-10-17 Thread lili.cui at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117192 --- Comment #14 from cuilili --- (In reply to Uroš Bizjak from comment #12) > Created attachment 59373 [details] > Proposed patch > > Patch in testing. Sorry, I made a mistake here, thanks!

[Bug middle-end/117838] New: IRA issues: The higher cost variable a is spilled for the lower cost variable conflict_a in improve_allocatuion()

2024-11-28 Thread lili.cui at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117838 Bug ID: 117838 Summary: IRA issues: The higher cost variable a is spilled for the lower cost variable conflict_a in improve_allocatuion() Product: gcc Version: 1

[Bug target/120697] [16 regression] Bootstrap fails in ix86_expand_prologue since r16-1551-g2c30f828e45078

2025-06-18 Thread lili.cui at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120697 --- Comment #14 from cuilili --- Created attachment 61659 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=61659&action=edit Update testcase for the patch. Thanks for the reminder. Added #c7 as a new testcase for the patch.

[Bug target/120697] [16 regression] Bootstrap fails in ix86_expand_prologue

2025-06-18 Thread lili.cui at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120697 --- Comment #10 from cuilili --- Thanks Sam and Sergei, I created a patch to remove this assertion. However, validating this patch requires running many tests, and if all goes well, it will take 1-2 days to fix this issue.

[Bug target/120697] [16 regression] Bootstrap fails in ix86_expand_prologue since r16-1551-g2c30f828e45078

2025-06-18 Thread lili.cui at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120697 --- Comment #15 from cuilili --- (In reply to Sam James from comment #11) > Thanks Lili. I am happy to test a patch (I am not sure if just that assert > should go, or if it is that same assert in various places), or to just > workaround it local

[Bug target/120697] [16 regression] Bootstrap fails in ix86_expand_prologue since r16-1551-g2c30f828e45078

2025-06-18 Thread lili.cui at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120697 --- Comment #12 from cuilili --- Created attachment 61658 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=61658&action=edit 0001-x86-Handle-fstack-clash-protection-for-shrink-wrap-s.patch I have reproduced and solved this problem locally.

[Bug target/120741] [16 Regression] ICE on mingw-w64-12.0.0: during RTL pass: pro_and_epilogue ICE in ix86_expand_prologue, at config/i386/i386.cc:9446

2025-06-23 Thread lili.cui at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120741 --- Comment #5 from cuilili --- (In reply to H.J. Lu from comment #3) > (In reply to cuilili from comment #1) > > Created attachment 61678 [details] > > Fix-shrink-wrap-separate-ICE-for-mingw > > > > Hi Sergei, > > > > Thanks for reporting thi

[Bug target/120741] [16 Regression] ICE on mingw-w64-12.0.0: during RTL pass: pro_and_epilogue ICE in ix86_expand_prologue, at config/i386/i386.cc:9446

2025-06-21 Thread lili.cui at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120741 --- Comment #1 from cuilili --- Created attachment 61678 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=61678&action=edit Fix-shrink-wrap-separate-ICE-for-mingw Hi Sergei, Thanks for reporting this issue and providing a small testcase. I

[Bug target/120818] [16 Regression] g++.target/i386/shrink_wrap_separate.C FAILs

2025-06-27 Thread lili.cui at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120818 --- Comment #1 from cuilili --- Created attachment 61730 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=61730&action=edit fix patch Hi Rainer, Thank you for reporting this issue and giving the actual output. I have relaxed the testcase c

[Bug gcov-profile/120881] [16 Regression] -fstack-protector-all -pg doesn't call mount at function entry by r16-1550-g9244ea4bf55638

2025-07-01 Thread lili.cui at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120881 --- Comment #10 from cuilili --- (In reply to Sam James from comment #9) > Thanks both. > > H.J.'s is slightly less pessimising because it'll only affect functions > where we actually emit a call, but I don't think it really matters here, and >

[Bug gcov-profile/120881] [16 Regression] -fstack-protector-all -pg doesn't call mount at function entry by r16-1550-g9244ea4bf55638

2025-06-30 Thread lili.cui at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120881 --- Comment #2 from cuilili --- I think it is an old bug: with shrink wrapping, NOTE_INSN_PROLOGUE_END does not represent the entry bb.

[Bug gcov-profile/120881] [16 Regression] -fstack-protector-all -pg doesn't call mount at function entry by r16-1550-g9244ea4bf55638

2025-07-01 Thread lili.cui at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120881 --- Comment #6 from cuilili --- Created attachment 61781 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=61781&action=edit Disable separate shrink wrapping for profile How about changing it like this, as is done for shrink wrapping?

[Bug target/120818] [16 Regression] g++.target/i386/shrink_wrap_separate.C FAILs

2025-06-27 Thread lili.cui at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120818 --- Comment #4 from cuilili --- Thank you all!