from:"Cui, Lili"

RE: [PATCH] ira: Remove the issue code in improve_allocation. [PR117838]

2025-08-31 Thread Cui, Lili

> -Original Message- > From: Vladimir Makarov > Sent: Saturday, August 30, 2025 1:38 AM > To: Cui, Lili ; gcc-patches@gcc.gnu.org > Cc: rdsandif...@googlemail.com > Subject: Re: [PATCH] ira: Remove the issue code in improve_allocation. > [PR117838] > > >

RE: [PATCH] ira: Remove the issue code in improve_allocation. [PR117838]

2025-08-28 Thread Cui, Lili

Gentle ping for this patch. Thanks, Lili. > -Original Message- > From: yes > Sent: Friday, August 22, 2025 10:56 AM > To: gcc-patches@gcc.gnu.org > Cc: vmaka...@redhat.com; rdsandif...@googlemail.com; Cui, Lili > > Subject: [PATCH] ira: Remove the issue code

RE: [PATCH V3] x86: Enable separate shrink wrapping

2025-07-08 Thread Cui, Lili

> -Original Message- > From: Segher Boessenkool > Sent: Wednesday, July 9, 2025 1:13 AM > To: Cui, Lili > Cc: ubiz...@gmail.com; gcc-patches@gcc.gnu.org; Liu, Hongtao > ; richard.guent...@gmail.com; Michael Matz > > Subject: Re: [PATCH V3] x86: Enable separate

RE: [PATCH V3] x86: Enable separate shrink wrapping

2025-07-08 Thread Cui, Lili

> -Original Message- > From: Segher Boessenkool > Sent: Friday, July 4, 2025 9:21 PM > To: Cui, Lili > Cc: ubiz...@gmail.com; gcc-patches@gcc.gnu.org; Liu, Hongtao > ; richard.guent...@gmail.com; Michael Matz > > Subject: Re: [PATCH V3] x86: Enable separate

RE: [PATCH V3] x86: Enable separate shrink wrapping

2025-07-04 Thread Cui, Lili

> -Original Message- > From: Segher Boessenkool > Sent: Wednesday, July 2, 2025 10:22 PM > To: Cui, Lili > Cc: ubiz...@gmail.com; gcc-patches@gcc.gnu.org; Liu, Hongtao > ; richard.guent...@gmail.com; Michael Matz > > Subject: Re: [PATCH V3] x86: Enable separate

RE: [PATCH V3] x86: Enable separate shrink wrapping

2025-07-02 Thread Cui, Lili

> -Original Message- > From: Segher Boessenkool > Sent: Monday, June 30, 2025 6:23 AM > To: Cui, Lili > Cc: ubiz...@gmail.com; gcc-patches@gcc.gnu.org; Liu, Hongtao > ; richard.guent...@gmail.com; Michael Matz > > Subject: Re: [PATCH V3] x86: Enable separate

RE: [PATCH V3] x86: Enable separate shrink wrapping

2025-06-30 Thread Cui, Lili

Lili. From: H.J. Lu Sent: Monday, June 30, 2025 7:03 PM To: Uros Bizjak Cc: Cui, Lili ; gcc-patches@gcc.gnu.org; Liu, Hongtao ; richard.guent...@gmail.com; Michael Matz Subject: Re: [PATCH V3] x86: Enable separate shrink wrapping On Tue, Jun 17, 2025 at 10:33 PM Uros Bizjak mailto:ubiz.

RE: [COMMITTED, PATCH] shrink_wrap_separate_check_lea.c: Scan lea(l|q)

2025-06-29 Thread Cui, Lili

> -Original Message- > From: H.J. Lu > Sent: Monday, June 30, 2025 4:16 AM > To: Uros Bizjak > Cc: Cui, Lili ; GCC Patches ; > Liu, > Hongtao ; Richard Biener > ; Michael Matz > Subject: [COMMITTED, PATCH] shrink_wrap_separate_check_lea.c: Scan > lea(l|q

RE: [PATCH V3] x86: Enable separate shrink wrapping

2025-06-27 Thread Cui, Lili

> -Original Message- > From: Cui, Lili > Sent: Friday, June 27, 2025 5:04 PM > To: H.J. Lu > Cc: ubiz...@gmail.com; gcc-patches@gcc.gnu.org; Liu, Hongtao > ; richard.guent...@gmail.com; Michael Matz > ; Sam James ; kenjin4...@gmail.com > Subject: RE: [PATCH

RE: [PATCH V3] x86: Enable separate shrink wrapping

2025-06-27 Thread Cui, Lili

> -Original Message- > From: H.J. Lu > Sent: Friday, June 27, 2025 4:48 PM > To: Cui, Lili > Cc: ubiz...@gmail.com; gcc-patches@gcc.gnu.org; Liu, Hongtao > ; richard.guent...@gmail.com; Michael Matz > ; Sam James ; kenjin4...@gmail.com > Subject: Re: [PATCH

RE: [committed] i386: Convert LEA stack adjust insn to SUB when FLAGS_REG is dead

2025-06-24 Thread Cui, Lili

> -Original Message- > From: Uros Bizjak > Sent: Wednesday, June 25, 2025 12:53 AM > To: gcc-patches@gcc.gnu.org > Cc: Cui, Lili > Subject: [committed] i386: Convert LEA stack adjust insn to SUB when > FLAGS_REG is dead > > ADD/SUB is faster than LEA for most

RE: [PATCH] Fix shrink wrap separate ICE for mingw [PR120741]

2025-06-23 Thread Cui, Lili

> -Original Message- > From: Cui, Lili > Sent: Monday, June 23, 2025 8:38 PM > To: Uros Bizjak > Cc: gcc-patches@gcc.gnu.org; Liu, Hongtao > Subject: RE: [PATCH] Fix shrink wrap separate ICE for mingw [PR120741] > > > On Mon, Jun 23, 2025 at 1:19 PM Cui, Li

RE: [PATCH] Fix shrink wrap separate ICE for mingw [PR120741]

2025-06-23 Thread Cui, Lili

> On Mon, Jun 23, 2025 at 1:19 PM Cui, Lili wrote: > > > > From: Lili Cui > > > > Hi Uros, > > > > I need to remove another assertion in the shrink wrap separate patch. > Added two cases for changing the CHECK_STACK_LIMIT value. > > > > The

RE: [PATCH] x86: Fix shrink wrap separate ICE under -fstack-clash-protection [PR120697]

2025-06-23 Thread Cui, Lili

> > > > -Original Message- > > > > From: Uros Bizjak > > > > Sent: Wednesday, June 18, 2025 9:22 PM > > > > To: Cui, Lili > > > > Cc: gcc-patches@gcc.gnu.org; Liu, Hongtao ; > > > > hongjiu...@intel.com > > >

[PATCH] Fix shrink wrap separate ICE for mingw [PR120741]

2025-06-23 Thread Cui, Lili

From: Lili Cui Hi Uros, I need to remove another assertion in the shrink wrap separate patch. Added two cases for changing the CHECK_STACK_LIMIT value. The default values for CHECK_STACK_LIMIT for target mingw and option -mstack-arg-probe are 4000 and (-1) respectively. In this case, shrink w

RE: [PATCH] x86: Fix shrink wrap separate ICE under -fstack-clash-protection [PR120697]

2025-06-18 Thread Cui, Lili

> -Original Message- > From: Uros Bizjak > Sent: Wednesday, June 18, 2025 9:22 PM > To: Cui, Lili > Cc: gcc-patches@gcc.gnu.org; Liu, Hongtao ; > hongjiu...@intel.com > Subject: Re: [PATCH] x86: Fix shrink wrap separate ICE under -fstack-clash- > protection [PR

[PATCH] x86: Fix shrink wrap separate ICE under -fstack-clash-protection [PR120697]

2025-06-18 Thread Cui, Lili

From: Lili Cui Hi Uros, An assertion I added in shrink wrap separate V2 reports ICE when -fstack-clash-protection is enabled. The assertion should not be added here. I created a patch to remove 3 assertions and their associated code. 1. Reproduced PR120697 issue and solved the issue with thi

[PATCH V3] x86: Enable separate shrink wrapping

2025-06-17 Thread Cui, Lili

From: Lili Cui Hi Uros, This is patch v3, the main changes are as follows. 1. Added a pro_epilogue_adjust_stack_add_nocc in i386.md to add memory clobber for lea/mov. 2. Adjusted some formatting issues. 3. Added scan-rtl-dumps for ia32 in shrink_wrap_separate.C. Collected spec2017 performance

RE: [PATCH v3] x86: Update memcpy/memset inline strategies for -mtune=generic

2025-06-17 Thread Cui, Lili

> -Original Message- > From: H.J. Lu > Sent: Monday, June 16, 2025 10:08 PM > To: Jan Hubicka > Cc: Uros Bizjak ; Cui, Lili ; gcc- > patc...@gcc.gnu.org; Liu, Hongtao ; > mjgu...@gmail.com > Subject: [PATCH v3] x86: Update memcpy/memset inline strategies for -

RE: [PATCH v2] x86: Update memcpy/memset inline strategies for -mtune=generic

2025-06-13 Thread Cui, Lili

> -Original Message- > From: Jan Hubicka > Sent: Monday, April 21, 2025 6:35 PM > To: H.J. Lu > Cc: gcc-patches@gcc.gnu.org; Liu, Hongtao ; > ubiz...@gmail.com > Subject: Re: [PATCH v2] x86: Update memcpy/memset inline strategies for - > mtune=generic > > > On Mon, Apr 21, 2025 at 7:24

RE: [PATCH V2] x86: Enable separate shrink wrapping

2025-06-13 Thread Cui, Lili

> -Original Message- > From: Uros Bizjak > Sent: Thursday, June 12, 2025 5:05 PM > To: Cui, Lili > Cc: gcc-patches@gcc.gnu.org; Liu, Hongtao ; > richard.guent...@gmail.com; Michael Matz > Subject: Re: [PATCH V2] x86: Enable separate shrink wrapping > >

RE: [PATCH V2] x86: Enable separate shrink wrapping

2025-06-12 Thread Cui, Lili

> > @@ -7753,8 +7762,12 @@ pro_epilogue_adjust_stack (rtx dest, rtx src, > > rtx > offset, > > add_frame_related_expr = true; > > } > > > > + if (crtl->shrink_wrapped_separate) insn = emit_insn (gen_rtx_SET > > + (dest, gen_rtx_PLUS (Pmode, src, addend))); > > Please use ix86_expa

RE: [PATCH V2] x86: Enable separate shrink wrapping

2025-06-11 Thread Cui, Lili

> > From: Lili Cui > > > > Hi Uros, > > > > Thank you very much for providing detailed BKM to reproduce Linux kernel > > boot > failure. My patch and Matz's patch have this problem. We inserted a SUB > between > TEST and JLE, and the SUB changes the value of EFlags. The branch JLE here > went

[PATCH V2] x86: Enable separate shrink wrapping

2025-06-10 Thread Cui, Lili

From: Lili Cui Hi Uros, Thank you very much for providing detailed BKM to reproduce Linux kernel boot failure. My patch and Matz's patch have this problem. We inserted a SUB between TEST and JLE, and the SUB changes the value of EFlags. The branch JLE here went wrong, and a null pointer appe

RE: [PATCH] x86: Enable separate shrink wrapping

2025-06-10 Thread Cui, Lili

> > gcc/testsuite/ChangeLog: > > > > * gcc.target/x86_64/abi/callabi/leaf-2.c: Adjust the test. > > * gcc.target/i386/interrupt-16.c: Likewise. > > * g++.target/i386/shrink_wrap_separate.c: New test. > > This one should have .C suffix. > Done. > Some comment fixes/clarif

RE: [PATCH] x86: Enable separate shrink wrapping

2025-05-14 Thread Cui, Lili

> -Original Message- > From: Richard Biener > Sent: Tuesday, May 13, 2025 7:49 PM > To: Uros Bizjak > Cc: Cui, Lili ; gcc-patches@gcc.gnu.org; Liu, Hongtao > > Subject: Re: [PATCH] x86: Enable separate shrink wrapping > > On Tue, May 13, 2025 at 12:36 PM Uro

RE: [PATCH] x86: Enable separate shrink wrapping

2025-05-14 Thread Cui, Lili

> -Original Message- > From: Uros Bizjak > Sent: Tuesday, May 13, 2025 6:04 PM > To: Cui, Lili > Cc: gcc-patches@gcc.gnu.org; Liu, Hongtao > Subject: Re: [PATCH] x86: Enable separate shrink wrapping > > On Tue, May 13, 2025 at 8:15 AM Cui, Lili wrote

[PATCH] x86: Enable separate shrink wrapping

2025-05-12 Thread Cui, Lili

From: Lili Cui Hi, This patch is to enable separate shrink wrapping for x86. Bootstrapped & regtested on x86-64-pc-linux-gnu. Ok for trunk? This commit implements the target macros (TARGET_SHRINK_WRAP_*) that enable separate shrink wrapping for function prologues/epilogues in x86. When perfo

[PATCH] Optimize 128-bit vector permutation with pand, pandn and por.

2024-11-20 Thread Cui, Lili

Hi, all This patch aims to handle certain vector shuffle operations using pand, pandn and por more efficiently. Bootstrapped and regtested on x86_64-pc-linux-gnu, OK for trunk? Regards, Lili. This patch introduces a new subroutine in ix86_expand_vec_perm_const_1. On x86, use mixed constant pe

[PATCH] Support andn_optab for x86

2024-10-14 Thread Cui, Lili

Hi all, This patch is to add andn_optab for x86. Bootstrapped and regtested on x86-64-linux-pc, OK for trunk? Regards, Lili. Add new andn pattern to match the new optab added by r15-1890-gf379596e0ba99d. Only enable 64bit, 128bit and 256bit vector ANDN, X86-64 has mask mov instruction when avx

RE: [PATCH 0/2] Support APX zero-upper

2024-05-14 Thread Cui, Lili

Sorry, please ignore these patches. Regards, Lili. > -Original Message- > From: Cui, Lili > Sent: Wednesday, May 15, 2024 2:24 PM > To: gcc-patches@gcc.gnu.org > Cc: Liu, Hongtao ; ubiz...@gmail.com > Subject: [PATCH 0/2] Support APX zero-upper > > >

[PATCH 2/2] Support APX zero-upper

2024-05-14 Thread Cui, Lili

gas/ChangeLog: * config/tc-i386.c (build_apx_evex_prefix): Handle ZU. * testsuite/gas/i386/x86-64.exp: Added new tests for ZU. * testsuite/gas/i386/x86-64.exp: Added new tests for ZU. * testsuite/gas/i386/x86-64-apx-zu-intel.d: New test. * testsuite/gas/i386

[PATCH 1/2] Add check for 8-bit old registers in EVEX format

2024-05-14 Thread Cui, Lili

gas/ChangeLog: * config/tc-i386.c (md_assemble): Add invalid check for old byte registers in EVEX/VEX format. * testsuite/gas/i386/x86-64-apx-inval.l: Add new test. * testsuite/gas/i386/x86-64-apx-inval.s: Ditto. --- gas/config/tc-i386.c | 12 +

[PATCH 0/2] Support APX zero-upper

2024-05-14 Thread Cui, Lili

. Removed IMUL_Fixup and added a macro 'ZU' for imul and setcc. 4. Added VexWIG to EVEX format setzu/set to remove an ugly judgement. 5. Added more test cases for imulzu and setzu. *** BLURB HERE *** Cui, Lili (2): Add check for 8-bit old registers in EVEX format Support APX zero-u

RE: [PATCH] x86: Update model values for Raptorlake.

2023-08-14 Thread Cui, Lili via Gcc-patches

> Sent: Monday, August 14, 2023 10:26 PM > To: Cui, Lili > Cc: gcc-patches@gcc.gnu.org; Liu, Hongtao > Subject: Re: [PATCH] x86: Update model values for Raptorlake. > > On 14/08/23 15:19 +0100, Jonathan Wakely wrote: > >On 14/08/23 04:37 +, Pan Li via Gcc-patches wrote: > >

RE: Bootstrap fail on GCC 13 (was: Re: [PATCH] x86: Update model values for Alderlake, Rocketlake and Raptorlake.)

2023-08-14 Thread Cui, Lili via Gcc-patches

> To: gcc-patches@gcc.gnu.org; Cui, Lili > Subject: Bootstrap fail on GCC 13 (was: Re: [PATCH] x86: Update model values > for Alderlake, Rocketlake and Raptorlake.) > > Hi, > > your GCC 13 commit > https://gcc.gnu.org/r13-7720-g0fa76e35a5f9e1 x86: Update model values for > Rapto

[PATCH] x86: Update model values for Raptorlake.

2023-08-13 Thread Cui, Lili via Gcc-patches

Committed as obvious, and backported to GCC13. Lili. Update model values for Raptorlake according to SDM. gcc/ChangeLog * common/config/i386/cpuinfo.h (get_intel_cpu): Add model value 0xba to Raptorlake. --- gcc/common/config/i386/cpuinfo.h | 1 + 1 file changed, 1 insertion(+

RE: [PATCH] x86: Enable ENQCMD and UINTR for march=sierraforest.

2023-07-04 Thread Cui, Lili via Gcc-patches

> -Original Message- > From: Hongtao Liu > Sent: Tuesday, July 4, 2023 4:27 PM > To: Cui, Lili > Cc: gcc-patches@gcc.gnu.org > Subject: Re: [PATCH] x86: Enable ENQCMD and UINTR for march=sierraforest. > > On Tue, Jul 4, 2023 at 4:15 PM Cui, Lili wrote

[PATCH] x86: Enable ENQCMD and UINTR for march=sierraforest.

2023-07-04 Thread Cui, Lili via Gcc-patches

From: Lili Cui Hi Maintainer, This patch is to enable ENQCMD and UINTR for march=sierraforest according to Intel ISE. Bootstrapped and regtested. Ok for trunk? And I will backport this patch to GCC13. Thanks, Lili. Enable ENQCMD and UINTR for march=sierraforest according to Intel ISE https:

RE: [PATCH] PR gcc/110148:Avoid adding loop-carried ops to long chains

2023-06-29 Thread Cui, Lili via Gcc-patches

> -Original Message- > From: Richard Biener > Sent: Thursday, June 29, 2023 2:42 PM > To: Cui, Lili > Cc: gcc-patches@gcc.gnu.org > Subject: Re: [PATCH] PR gcc/110148:Avoid adding loop-carried ops to long > chains > > On Thu, Jun 29, 2023 at 3:49 AM Cui

RE: [PATCH] x86: Update model values for Alderlake, Rocketlake and Raptorlake.

2023-06-28 Thread Cui, Lili via Gcc-patches

I will directly commit this patch, it can be considered as an obvious patch. Thanks, Lili. > -Original Message- > From: Gcc-patches On > Behalf Of Cui, Lili via Gcc-patches > Sent: Wednesday, June 28, 2023 6:52 PM > To: gcc-patches@gcc.gnu.org > Cc: Liu, Hongtao >

[PATCH] PR gcc/110148:Avoid adding loop-carried ops to long chains

2023-06-28 Thread Cui, Lili via Gcc-patches

From: Lili Cui Hi Maintainer This patch is to fix TSVC242 regression related to loop-carried ops. Bootstrapped and regtested. Ok for trunk? Regards Lili. Avoid adding loop-carried ops to long chains, otherwise the whole chain will have dependencies across the loop iteration. Just keep loop-ca

[PATCH] x86: Update model values for Alderlake, Rocketlake and Raptorlake.

2023-06-28 Thread Cui, Lili via Gcc-patches

Hi Hongtao, This patch is to update model values for Alderlake, Rocketlake and Raptorlake according to SDM. Ok for trunk? Thanks. Lili. Update model values for Alderlake, Rocketlake and Raptorlake according to SDM. gcc/ChangeLog * common/config/i386/cpuinfo.h (get_intel_cpu): Remove

RE: [PATCH] Handle FMA friendly in reassoc pass

2023-06-07 Thread Cui, Lili via Gcc-patches

le machine. For the 527 regression, I can't reproduce it and the data seems stable. Regards, Lili. > -Original Message- > From: Di Zhao OS > Sent: Wednesday, June 7, 2023 11:48 AM > To: Cui, Lili ; gcc-patches@gcc.gnu.org > Cc: richard.guent...@gmail.com; li...@

RE: [PATCH] Fix ICE in rewrite_expr_tree_parallel

2023-05-31 Thread Cui, Lili via Gcc-patches

Committed, thanks Richard. Lili. > -Original Message- > From: Richard Biener > Sent: Wednesday, May 31, 2023 3:22 PM > To: Cui, Lili > Cc: gcc-patches@gcc.gnu.org > Subject: Re: [PATCH] Fix ICE in rewrite_expr_tree_parallel > > On Wed, May 31, 2023 at

[PATCH] Fix ICE in rewrite_expr_tree_parallel

2023-05-30 Thread Cui, Lili via Gcc-patches

Hi, This patch is to fix ICE in rewrite_expr_tree_parallel. https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110038 Bootstrapped and regtested. Ok for trunk? Regards Lili. 1. Limit the value of tree-reassoc-width to IntegerRange(0, 256). 2. Add width limit in rewrite_expr_tree_parallel. gcc/Change

RE: [PATCH] Handle FMA friendly in reassoc pass

2023-05-29 Thread Cui, Lili via Gcc-patches

I will rebase and commit this patch, thanks! Lili. > -Original Message- > From: Cui, Lili > Sent: Thursday, May 25, 2023 7:30 AM > To: gcc-patches@gcc.gnu.org > Cc: richard.guent...@gmail.com; li...@linux.ibm.com; Cui, Lili > > Subject: [PATCH] Handle FMA fri

RE: [PATCH] PR gcc/98350:Handle FMA friendly in reassoc pass

2023-05-24 Thread Cui, Lili via Gcc-patches

> > +rewrite_expr_tree_parallel (gassign *stmt, int width, bool has_fma, > > +const vec > > +&ops) > > { > >enum tree_code opcode = gimple_assign_rhs_code (stmt); > >int op_num = ops.length (); > > @@ -5483,10 +5494,11 @@ rewrite_expr_tree_parallel (

[PATCH] Handle FMA friendly in reassoc pass

2023-05-24 Thread Cui, Lili via Gcc-patches

From: Lili Cui Make some changes in reassoc pass to make it more friendly to fma pass later. Using FMA instead of mult + add reduces register pressure and instructions retired. There are mainly two changes 1. Put no-mult ops and mult ops alternately at the end of the queue, which is conducive to g

RE: [PATCH] PR gcc/98350:Handle FMA friendly in reassoc pass

2023-05-18 Thread Cui, Lili via Gcc-patches

benchmarks. On aarch64 507.cactuBSSN_r: Improved by 1.7% for multi-copy. 503.bwaves_r : Improved by 6.00% for single-copy. no measurable changes for other benchmarks. > -Original Message- > From: Cui, Lili > Sent: Wednesday, May 17, 2023 9:02 PM > To: gcc-patches@gcc.

RE: [PATCH 1/2] PR gcc/98350:Add a param to control the length of the chain with FMA in reassoc pass

2023-05-17 Thread Cui, Lili via Gcc-patches

> I think to make a difference you need to hit the number of parallel fadd/fmul > the pipeline can perform. I don't think issue width is ever a problem for > chains w/o fma and throughput of fma vs fadd + fmul should be similar. > Yes, for x86 backend, fadd , fmul and fma have the same TP meanin

[PATCH] PR gcc/98350:Handle FMA friendly in reassoc pass

2023-05-17 Thread Cui, Lili via Gcc-patches

From: Lili Cui Make some changes in reassoc pass to make it more friendly to fma pass later. Using FMA instead of mult + add reduces register pressure and instructions retired. There are mainly two changes 1. Put no-mult ops and mult ops alternately at the end of the queue, which is conducive to g

RE: [PATCH 1/2] PR gcc/98350:Add a param to control the length of the chain with FMA in reassoc pass

2023-05-12 Thread Cui, Lili via Gcc-patches

> ISTR there were no sufficient comments in the code explaining why > rewrite_expr_tree_parallel_for_fma is better by design. In fact ... > > > > > > > > > > if (!reassoc_insert_powi_p > > > > - && ops.length () > 3 > > > > + && len > 3 >

[PATCH1/2] PR gcc/98350:Add a param to control the length of the chain with FMA in reassoc pass

2023-05-11 Thread Cui, Lili via Gcc-patches

From: Lili Cui Add a param for the chain with FMA in reassoc pass to make it more friendly to the fma pass later. First detect whether this chain is able to generate more than 2 FMAs; if so, and param_reassoc_max_chain_length_with_fma is enabled, we will rearrange the ops so that they can be com

RE: [PATCH 1/2] PR gcc/98350:Add a param to control the length of the chain with FMA in reassoc pass

2023-05-11 Thread Cui, Lili via Gcc-patches

> -Original Message- > From: Richard Biener > Sent: Thursday, May 11, 2023 6:53 PM > To: Cui, Lili > Cc: gcc-patches@gcc.gnu.org > Subject: Re: [PATCH 1/2] PR gcc/98350:Add a param to control the length of > the chain with FMA in reassoc pass Hi Richard, Thanks for

[PATCH 2/2] Add a tune option to control the length of the chain with FMA

2023-05-11 Thread Cui, Lili via Gcc-patches

From: Lili Cui Set the length of the chain with FMA to 5 for icelake_cost. With this patch applied, SPR multi-copy: 508.namd_r increased by 3% ICX multi-copy: 508.namd_r increased by 3.5%, 507.cactuBSSN_r increased by 3.7% Using FMA instead of mult + add reduces register pressur

[PATCH 1/2] PR gcc/98350:Add a param to control the length of the chain with FMA in reassoc pass

2023-05-11 Thread Cui, Lili via Gcc-patches

From: Lili Cui Hi, Those two patches each add a param to control the length of the chain with FMA in reassoc pass and a tuning option in the backend. Bootstrapped and regtested. Ok for trunk? Regards Lili. Add a param for the chain with FMA in reassoc pass to make it more friendly to the fma

[PATCH] x86: Enable 256 move by pieces for ALDERLAKE and AVX2.

2022-11-11 Thread Cui,Lili via Gcc-patches

From: Lili Cui Hi Hongtao, This patch is to enable 256 move by pieces for ALDERLAKE and AVX2. Bootstrap is ok, and no regressions for i386/x86-64 testsuite. OK for master? gcc/Changelog: * config/i386/x86-tune.def (X86_TUNE_AVX256_MOVE_BY_PIECES): Add alderlake and avx2.

[PATCH] Remove AVX512_VP2INTERSECT from PTA_SAPPHIRERAPIDS

2022-11-07 Thread Cui,Lili via Gcc-patches

Hi Hongtao, I backported this patch to gcc-12 release. gcc/ChangeLog: * config/i386/driver-i386.cc (host_detect_local_cpu): Move sapphirerapids out of AVX512_VP2INTERSECT. * config/i386/i386.h: Remove AVX512_VP2INTERSECT from PTA_SAPPHIRERAPIDS * doc/invoke.tex

RE: [PATCH] ix86: Suggest unroll factor for loop vectorization

2022-11-02 Thread Cui, Lili via Gcc-patches

> > > +@item x86-vect-unroll-min-ldst-threshold > > > +The vectorizer will check with target information to determine > > > +whether unroll it. This parameter is used to limit the mininum of > > > +loads and stores in the main loop. > > > > > > It's odd to "limit" the minimum number of something.

RE: Ping^3 [PATCH V2] Add attribute hot judgement for INLINE_HINT_known_hot hint.

2022-10-30 Thread Cui, Lili via Gcc-patches

> > On 10/20/22 19:52, Cui, Lili via Gcc-patches wrote: > > Hi Honza, > > > > Gentle ping > > https://gcc.gnu.org/pipermail/gcc-patches/2022-September/601934.html > > > > gcc/ChangeLog > > > >* ipa-inline-analysis.cc (do_estimate_e

RE: [PATCH] ix86: Suggest unroll factor for loop vectorization

2022-10-26 Thread Cui, Lili via Gcc-patches

. I think 200 still work. > That said, the heuristic made me think "what the heck". Can we explain in u- > arch terms why the unrolling is beneficial instead of just defering to SPEC > CPU 2017 fotonik? > Regarding the benefits, I explained in the first answer, I checked 5

[PATCH] ix86: Suggest unroll factor for loop vectorization

2022-10-23 Thread Cui,Lili via Gcc-patches

Hi Hongtao, This patch introduces function finish_cost and determine_suggested_unroll_factor for x86 backend, so that it can suggest the unroll factor for a given loop being vectorized. Referring to aarch64, RS6000 backends and based on the analysis of SPEC2017 performance evaluation resu

Ping^3 [PATCH V2] Add attribute hot judgement for INLINE_HINT_known_hot hint.

2022-10-20 Thread Cui, Lili via Gcc-patches

Hi Honza, Gentle ping https://gcc.gnu.org/pipermail/gcc-patches/2022-September/601934.html gcc/ChangeLog * ipa-inline-analysis.cc (do_estimate_edge_time): Add function attribute judgement for INLINE_HINT_known_hot hint. gcc/testsuite/ChangeLog: * gcc.dg/ipa/inlinehint-6.c: New test. --

RE: Ping^2 [PATCH] Add attribute hot judgement for INLINE_HINT_known_hot hint.

2022-10-13 Thread Cui, Lili via Gcc-patches

Hi Honza, Gentle ping https://gcc.gnu.org/pipermail/gcc-patches/2022-September/601934.html Thanks, Lili. > -Original Message- > From: Cui, Lili > Sent: Saturday, October 8, 2022 8:33 AM > To: Cui, Lili ; Jan Hubicka > Cc: Lu, Hongjiu ; Liu, Hongtao > ; gcc-p

[PATCH] MAINTAINERS: Add myself for write after approval

2022-10-12 Thread Cui,Lili via Gcc-patches

Hi, I want to add myself in MAINTAINERS for write after approval. OK for master? ChangeLog: * MAINTAINERS (Write After Approval): Add myself. --- MAINTAINERS | 1 + 1 file changed, 1 insertion(+) diff --git a/MAINTAINERS b/MAINTAINERS index 11fa8bc6dbd..e4e7349a6d9 100644 --- a/MAINTA

[PATCH] Remove AVX512_VP2INTERSECT from PTA_SAPPHIRERAPIDS

2022-10-11 Thread Cui,Lili via Gcc-patches

Hi Hongtao, This patch is to remove AVX512_VP2INTERSECT from PTA_SAPPHIRERAPIDS. The new Intel ISE removes AVX512_VP2INTERSECT from SAPPHIRERAPIDS, AVX512_VP2INTERSECT is only supported in Tigerlake. Hi Uros, This patch is to remove AVX512_VP2INTERSECT from PTA_SAPPHIRERAPIDS. The new Intel ISE

Ping^1 [PATCH] Add attribute hot judgement for INLINE_HINT_known_hot hint.

2022-10-07 Thread Cui, Lili via Gcc-patches

Hi Honza, Gentle ping https://gcc.gnu.org/pipermail/gcc-patches/2022-September/601934.html Thanks, Lili. > -Original Message- > From: Gcc-patches On > Behalf Of Cui, Lili via Gcc-patches > Sent: Wednesday, September 21, 2022 5:22 PM > To: Jan Hubicka > Cc: Lu, Hong

RE: [PATCH] Add attribute hot judgement for INLINE_HINT_known_hot hint.

2022-09-21 Thread Cui, Lili via Gcc-patches

> Thank you. Can you please also add a testcase that tests for this. > So you modify imagemagick marking attribute hot on the specific inline? Thanks Honza. Added the testcase. I didn't modify source code of 538.imagic_r, the original source code has attribute like: #define magick_hot_spot __a

[PATCH] Add attribute hot judgement for INLINE_HINT_known_hot hint.

2022-09-20 Thread Cui,Lili via Gcc-patches

Hi Honza, This patch is to add attribute hot judgement for INLINE_HINT_known_hot hint. We set up INLINE_HINT_known_hot hint only when we have profile feedback, now add function attribute judgement for it, when both caller and callee have __attribute__((hot)), we will also set up INLINE_HINT_known

RE: [PATCH] Add a heuristic for eliminate redundant load and store in inline pass.

2022-07-18 Thread Cui, Lili via Gcc-patches

Hi Honza, Gentle ping https://gcc.gnu.org/pipermail/gcc-patches/2022-July/597891.html Thanks, Lili. > -Original Message- > From: Gcc-patches On > Behalf Of Cui, Lili via Gcc-patches > Sent: Sunday, July 10, 2022 10:05 PM > To: Jan Hubicka > Cc: Lu, Hongjiu ; Li

RE: [PATCH] Add a heuristic for eliminate redundant load and store in inline pass.

2022-07-10 Thread Cui, Lili via Gcc-patches

> -Original Message- > From: Jan Hubicka > This is interesting idea. Basically we want to guess if inlining will > make SRA and or strore->load propagation possible. I think the > solution using INLINE_HINT may be bit too trigger happy, since it is very > common that this happens and

[PATCH] Add a heuristic for eliminate redundant load and store in inline pass.

2022-07-06 Thread Cui,Lili via Gcc-patches

From: Lili Hi Hubicka, This patch is to add a heuristic inline hint to eliminate redundant load and store. Bootstrap and regtest pending on x86_64-unknown-linux-gnu. OK for trunk? Thanks, Lili. Add an INLINE_HINT_eliminate_load_and_store hint to the inline pass. We accumulate the insn number

[PATCH] testsuite: Add -mtune=generic to dg-options for two testcases.

2022-06-10 Thread Cui,Lili via Gcc-patches

This patch is to change dg-options for two testcases. Use -mtune=generic to limit these two testcases. Because configuring them with -mtune=cascadelake or znver3 will vectorize them. regtested on x86_64-linux-gnu{-m32,}. Ok for trunk? Thanks, Lili. Use -mtune=generic to limit these two test cas

RE: [PATCH] Update {skylake,icelake,alderlake}_cost to add a bit preference to vector store.

2022-06-07 Thread Cui, Lili via Gcc-patches

> -Original Message- > From: Hongtao Liu > Sent: Monday, June 6, 2022 1:25 PM > To: H.J. Lu > Cc: Cui, Lili ; Liu, Hongtao ; GCC > Patches > Subject: Re: [PATCH] Update {skylake,icelake,alderlake}_cost to add a bit > preference to vector store. > > > >

[PATCH] Update {skylake, icelake, alderlake}_cost to add a bit preference to vector store.

2022-05-31 Thread Cui,Lili via Gcc-patches

This patch is to update {skylake,icelake,alderlake}_cost to add a bit preference to vector store. Since the integer vector construction cost has changed, we need to adjust the load and store costs for Intel processors. With the patch applied 538.imagic_r gets ~6% improvement on ADL for multicop

[PATCH] x86: Correct march=sapphirerapids to base on icelake server

2022-03-17 Thread Cui,Lili via Gcc-patches

Hi Hongtao, This patch is to correct march=sapphirerapids to base on icelake server and update sapphirerapids in the documentation. OK for master and backport to GCC 11? gcc/Changelog: PR target/104963 * config/i386/i386.h (PTA_SAPPHIRERAPIDS): change it to base on ICX.

[PATCH] x86: Update Intel architectures ISA support in documentation.

2022-02-21 Thread Cui,Lili via Gcc-patches

Hi Uros, This patch is to update Intel architectures ISA support in documentation. Since the ISAs listed for Intel architectures in the documentation are inconsistent with what is actually supported, modify them all. OK for master? gcc/Changelog: * gcc/doc/invoke.texi: Update documents for Intel architectu

[PATCH] x86: Update model value for Alderlake and Rocketlake

2022-01-03 Thread Cui,Lili via Gcc-patches

Hi Uros, This patch is to update model value for Alderlake and Rocketlake. Bootstrap is ok, and no regressions for i386/x86-64 testsuite. OK for master? gcc/ChangeLog * common/config/i386/cpuinfo.h (get_intel_cpu): Add new model values to Alderlake and Rocketlake. --- gcc/comm

[PATCH] x86: Update -mtune=tremont

2021-12-08 Thread Cui,Lili via Gcc-patches

Hi Uros, This patch is to update mtune for tremont. Bootstrap is ok, and no regressions for i386/x86-64 testsuite. OK for master? Silvermont has special handling in the add_stmt_cost function, because it has an in-order SIMD pipeline. But for Tremont, its SIMD pipeline is out-of-order, so remove Tremont

[PATCH] x86: Update -mtune=alderlake

2021-11-10 Thread Cui,Lili via Gcc-patches

Hi Uros, This patch is to update mtune for alderlake. Bootstrap is ok, and no regressions for i386/x86-64 testsuite. OK for master? Update mtune for alderlake, Alder Lake Intel Hybrid Technology will not support Intel® AVX-512. ISA features such as Intel® AVX, AVX-VNNI, Intel® AVX2, and UMONITO

RE: [PATCH 3/4] [PATCH 3/4] x86: Properly handle USE_VECTOR_FP_CONVERTS/USE_VECTOR_CONVERTS

2021-09-16 Thread Cui, Lili via Gcc-patches

> -Original Message- > From: Uros Bizjak > Sent: Thursday, September 16, 2021 2:28 PM > To: Cui, Lili > Cc: gcc-patches@gcc.gnu.org; Liu, Hongtao ; H. J. Lu > > Subject: Re: [PATCH 3/4] [PATCH 3/4] x86: Properly handle > USE_VECTOR_FP_CONVERTS/USE_VECTOR_CONV

RE: [PATCH 4/4] [PATCH 4/4] x86: Add TARGET_SSE_PARTIAL_REG_[FP_]CONVERTS_DEPENDENCY

2021-09-15 Thread Cui, Lili via Gcc-patches

> -Original Message- > From: H.J. Lu > Sent: Wednesday, September 15, 2021 10:14 PM > To: Cui, Lili > Cc: Uros Bizjak ; GCC Patches patc...@gcc.gnu.org>; Liu, Hongtao > Subject: Re: [PATCH 4/4] [PATCH 4/4] x86: Add > TARGET_SSE_PARTIAL_REG_[FP_]CONVERTS_DEP

[PATCH] Synchronize Rocket Lake's processor_names and processor_cost_table with processor_type

2021-04-24 Thread Cui, Lili via Gcc-patches

Hi Uros, This patch is to synchronize Rocket Lake's processor_names and processor_cost_table with processor_type. Bootstrap is ok, and no regressions for i386/x86-64 testsuite. OK for master? [PATCH] Synchronize Rocket Lake's processor_names and processor_cost_table with processor_type gcc/

[PATCH wwwdoc] Mention Rocketlake [GCC11]

2021-04-12 Thread Cui, Lili via Gcc-patches

Updated wwwdocs for Rocketlake [GCC11], thanks. [PATCH] Mention Rocketlake --- htdocs/gcc-11/changes.html | 4 1 file changed, 4 insertions(+) diff --git a/htdocs/gcc-11/changes.html b/htdocs/gcc-11/changes.html index a7fa4e1b..38725abc 100644 --- a/htdocs/gcc-11/changes.html +++ b/htdocs

[PATCH] Add rocketlake to gcc.

2021-04-11 Thread Cui, Lili via Gcc-patches

Hi Uros, This patch adds Rocket Lake to GCC. Rocket Lake is based on Ice Lake client minus SGX. For detailed information, please refer to https://software.intel.com/content/www/us/en/develop/download/intel-architecture-instruction-set-extensions-programming-reference.html Bootst

[PATCH] Change march=alderlake ISA list and add m_ALDERLAKE to m_CORE_AVX2

2021-04-11 Thread Cui, Lili via Gcc-patches

Hi Uros, This patch changes the Alder Lake ISA list in GCC and adds m_ALDERLAKE to m_CORE_AVX2. Alder Lake Intel Hybrid Technology is based on Tremont plus ADCX/AVX/AVX2/BMI/BMI2/F16C/FMA/LZCNT/PCONFIG/PKU/VAES/VPCLMULQDQ/SERIALIZE/HRESET/KL/WIDEKL/AVX-VNNI For detailed information, pleas

RE: Enable MOVDIRI, MOVDIR64B, CLDEMOTE and WAITPKG for march=tremont

2020-11-13 Thread Cui, Lili via Gcc-patches

Hi Uros, This patch is to correct the previous patch. PREFETCHW should be in both march=broadwell and march=silvermont, but I moved PREFETCHW from march=broadwell to march=silvermont in the previous patch, sorry for that. Bootstrap is ok, and no regressions for i386/x86-64 testsuite. OK for master? [P

Enable MOVDIRI, MOVDIR64B, CLDEMOTE and WAITPKG for march=tremont

2020-11-09 Thread Cui, Lili via Gcc-patches

Hi Uros, This patch is to correct some instruction sets for march=Tremont/Broadwell/Silvermont/knl Bootstrap is ok, and no regressions for i386/x86-64 testsuite. OK for master? [PATCH] Enable MOVDIRI, MOVDIR64B, CLDEMOTE and WAITPKG for march=tremont 1. Enable MOVDIRI, MOVDIR64B, CLDEMOTE a

Initial Sapphire Rapids and Alder Lake support from ISA r40

2020-07-09 Thread Cui, Lili via Gcc-patches

Hi: This patch adds Sapphire Rapids and Alder Lake to GCC. Sapphire Rapids is based on Cooper Lake plus ISA MOVDIRI/MOVDIR64B/AVX512VP2INTERSECT/ENQCMD/CLDEMOTE/PTWRITE/WAITPKG/SERIALIZE/TSXLDTRK. Alder Lake is based on Skylake plus ISA CLDEMOTE/PTWRITE/WAITPKG/SERIALIZE. For de

[PATCH] fix bitmask conflict between PTA_AVX512VP2INTERSECT and PTA_WAITPKG

2020-06-04 Thread Cui, Lili via Gcc-patches

Hi Uros, This patch is to fix bitmask conflict between PTA_AVX512VP2INTERSECT and PTA_WAITPKG in gcc/config/i386/i386.h Bootstrap is ok, make-check ok for i386 target. Ok for trunk? gcc/ChangeLog: * config/i386/i386.h (PTA_WAITPKG): Change bitmask value. --- gcc/config/i386/i386

RE: Add TIGERLAKE and COOPERLAKE to GCC

2019-08-20 Thread Cui, Lili

> -Original Message- > From: Uros Bizjak [mailto:ubiz...@gmail.com] > Sent: Friday, August 16, 2019 11:07 PM > To: H.J. Lu > Cc: Cui, Lili ; Jeff Law ; GCC Patches > ; Zhang, Annita ; Xiao, > Wei3 ; Liu, Hongtao ; Wang, > Hongyu ; Castillo, Jason M > >

RE: Add TIGERLAKE and COOPERLAKE to GCC

2019-08-15 Thread Cui, Lili

> -Original Message- > From: H.J. Lu [mailto:hjl.to...@gmail.com] > Sent: Friday, August 16, 2019 6:02 AM > To: Jeff Law > Cc: Cui, Lili ; Uros Bizjak ; GCC > Patches ; Zhang, Annita > ; Xiao, Wei3 ; Liu, Hongtao > ; Wang, Hongyu ; > Castillo, Jason M >

RE: Add TIGERLAKE and COOPERLAKE to GCC

2019-08-14 Thread Cui, Lili

Resending this mail because GCC Patches rejected my message, thanks. -Original Message- Hi Uros and all: This patch adds TIGERLAKE and COOPERLAKE to GCC. TIGERLAKE is based on ICELAKE_CLIENT plus new ISA MOVDIRI/MOVDIR64B/AVX512VP2INTERSECT. COOPERLAKE is based on CASCADELAKE a

94 matches
