Re: [PATCH] MIPS: Fixed the problem that the nop instruction is inserted at the wrong position after enabling '-fpatchable-function-entry='

2025-05-06 Thread WANG Xuerui
On 4/30/25 14:26, Lulu Cheng wrote: Because MIPS function symbol is generated in the prologue function, this nop generation should be done in prologue. OK for trunk? PR target/99217 gcc/ChangeLog: * config/mips/mips.cc (mips_start_function_definition): Implements the fu

Re: [PATCH] LoongArch: Increase cost of vector aligned store/load.

2023-11-15 Thread WANG Xuerui
On 11/16/23 14:17, Jiahao Xu wrote: Based on SPEC2017 performance evaluation results, making them equal to the cost of unaligned store/load to avoid odd alignment peeling is better. Paraphrasing a bit to shorten the subject of the sentence: "it's better to make them equal to ... so as to avoid

Re: [PATCH] LoongArch: Use fcmp.caf.s instead of movgr2cf for zeroing a fcc

2023-10-17 Thread WANG Xuerui
On 10/17/23 22:06, Xi Ruoyao wrote: During the review of a LLVM change [1], on LA464 we found that zeroing "an" LLVM change (because the word LLVM is pronounced letter-by-letter) a fcc with fcmp.caf.s is much faster than a movgr2cf from $r0. Similarly, "an" fcc [1]: https://github.com/llvm

Re: [PATCH] LoongArch: Fix lo_sum rtx cost

2023-09-16 Thread WANG Xuerui
Hi, On 9/16/23 17:16, mengqinggang wrote: The cost of lo_sum rtx for addi.d instruction my be a very big number if computed by common function. It may cause some symbols saving to stack and loading from stack if there no enough registers during loop optimization. Thanks for the patch! It seems

Re: [PATCH 1/2] LoongArch: Optimize switch with sign-extended index.

2023-09-02 Thread WANG Xuerui
On 9/2/23 14:24, Lulu Cheng wrote: The patch refers to the submission of RISCV 7bbce9b50302959286381d9177818642bceaf301. gcc/ChangeLog: * config/loongarch/loongarch.cc (loongarch_extend_comparands): In unsigned QImode test, check for sign extended subreg and/or constant

Re: [PATCH v1] LoongArch: Remove the symbolic extension instruction due to the SLT directive.

2023-08-24 Thread WANG Xuerui
On 8/25/23 12:01, Lulu Cheng wrote: Since the slt instruction does not distinguish between 32-bit and 64-bit operations under the LoongArch 64-bit architecture, if the operands of slt are of SImode, symbol expansion is required before operation. Hint: “符号扩展” is "sign extension" (as noun) or "sig

Re: [PATCH v1 2/6] LoongArch: Added Loongson SX base instruction support.

2023-06-30 Thread WANG Xuerui
On 2023/6/30 10:16, Chenghui Pan wrote: [snip] --- gcc/config/loongarch/constraints.md| 128 +- gcc/config/loongarch/loongarch-builtins.cc | 10 + gcc/config/loongarch/loongarch-modes.def | 38 + gcc/config/loongarch/loongarch-protos.h| 31 + gcc/config/loongarch/loo

Re: [pushed][PATCH v3] LoongArch: Avoid non-returning indirect jumps through $ra [PR110136]

2023-06-18 Thread WANG Xuerui
Hi, On 6/15/23 17:03, Xi Ruoyao wrote: Xuerui: I guess this makes it sensible to show "ret" instead of "jirl $zero, $ra, 0" in objdump -d output, but I don't know how to implement it. Do you have some idea? Thanks for the suggestion! Actually I have previously made this patch series [1] whic

Re: [PATCH v2] LoongArch: Modify the register constraints for template "jumptable" and "indirect_jump" from "r" to "e" [PR110136]

2023-06-07 Thread WANG Xuerui
On 2023/6/8 10:27, Lulu Cheng wrote: Micro-architecture unconditionally treats a "jr $ra" as "return from subroutine", hence doing "jr $ra" would interfere with both subroutine return prediction and the more general indirect branch prediction. Therefore, a problem like PR110136 can cause a sign

Re: [PATCH] LoongArch: Change jumptable's register constraint to 'q' [PR110136]

2023-06-07 Thread WANG Xuerui
On 2023/6/7 11:36, Lulu Cheng wrote: On 2023/6/7 11:26 AM, WANG Xuerui wrote: Hi, On 2023/6/7 10:31, Lulu Cheng wrote: If the $ra register is modified during the jump to the jump table, the hardware branch prediction function will be broken, resulting in a significant increase in the branch

Re: [PATCH] LoongArch: Change jumptable's register constraint to 'q' [PR110136]

2023-06-06 Thread WANG Xuerui
Hi, On 2023/6/7 10:31, Lulu Cheng wrote: If the $ra register is modified during the jump to the jump table, the hardware branch prediction function will be broken, resulting in a significant increase in the branch false prediction rate and affecting performance. Thanks for the insight! This is

Re: [PATCH] LoongArch: Fix the problem of structure parameter passing in C++. This structure has empty structure members and less than three floating point members.

2023-05-24 Thread WANG Xuerui
On 2023/5/25 10:46, Lulu Cheng wrote: On 2023/5/25 4:15 AM, Jason Merrill wrote: On Wed, May 24, 2023 at 5:00 AM Jonathan Wakely via Gcc-patches <gcc-patches@gcc.gnu.org> wrote: On Wed, 24 May 2023 at 09:41, Xi Ruoyao wrote: > Wang Lei raised some concerns about Itanium C++ ABI,

Re: [PATCH] LoongArch: Enable shrink wrapping

2023-04-26 Thread WANG Xuerui
On 2023/4/26 18:14, Lulu Cheng wrote: On 2023/4/26 6:02 PM, WANG Xuerui wrote: On 2023/4/26 17:53, Lulu Cheng wrote: Hi, ruoyao: The performance of spec2006 is finished. The fixed-point 400.perlbench has about 3% performance improvement, and the other basics have not changed, and the

Re: [PATCH] LoongArch: Enable shrink wrapping

2023-04-26 Thread WANG Xuerui
On 2023/4/26 17:53, Lulu Cheng wrote: Hi, ruoyao:   The performance of spec2006 is finished. The fixed-point 400.perlbench has about 3% performance improvement, and the other basics have not changed, and the floating-point tests have basically remained the same. Nice to know!  

Re: [PATCH] LoongArch: Set 4 * (issue rate) as the default for -falign-functions and -falign-loops

2023-04-18 Thread WANG Xuerui
On 2023/4/18 20:45, Xi Ruoyao wrote: On Tue, 2023-04-18 at 20:39 +0800, WANG Xuerui wrote: Hi, Thanks for helping confirming on GCC and porting this! I'd never know even GCC lacked this adaptation without someone actually checking... Too many things are taken for granted these days. On

Re: [PATCH] LoongArch: Set 4 * (issue rate) as the default for -falign-functions and -falign-loops

2023-04-18 Thread WANG Xuerui
Hi, Thanks for helping confirming on GCC and porting this! I'd never know even GCC lacked this adaptation without someone actually checking... Too many things are taken for granted these days. On 2023/4/18 20:17, Xi Ruoyao wrote: According to Xuerui's LLVM changeset [1], doing so can make a

Re: [PATCH] gcc-13: Add changelog for LoongArch.

2023-04-18 Thread WANG Xuerui
Hi, Just some minor fixes ;-) On 2023/4/18 14:15, Lulu Cheng wrote: --- htdocs/gcc-13/changes.html | 39 ++ 1 file changed, 39 insertions(+) diff --git a/htdocs/gcc-13/changes.html b/htdocs/gcc-13/changes.html index f3b9afed..c75e341b 100644 --- a/htdocs/

Re: [PATCH] LoongArch: Control all __crc* __crcc* builtin functions with macro __loongarch64.

2023-03-13 Thread WANG Xuerui
On 2023/3/13 13:14, Xi Ruoyao wrote: On Mon, 2023-03-13 at 12:58 +0800, Lulu Cheng wrote: On 2023/3/13 12:54 PM, Xi Ruoyao wrote: On Mon, 2023-03-13 at 12:40 +0800, WANG Xuerui wrote: This is ugly. The fact all current LA32 models don't support CRC ops is just a coincidence; it's entirel

Re: [PATCH] LoongArch: Control all __crc* __crcc* builtin functions with macro __loongarch64.

2023-03-12 Thread WANG Xuerui
On 2023/3/13 11:52, Lulu Cheng wrote: LoongArch 32-bit instruction set does not support crc* and crcc* instructions. gcc/ChangeLog: * config/loongarch/larchintrin.h (__crc_w_b_w): Add macros for control. (__crc_w_h_w): Likewise. (__crc_w_w_w): Likewise. (__crcc

Re: [PATCH] LoongArch: Change the value of macro TRY_EMPTY_VM_SPACE from 0x8000000000 to 0x1000000000.

2023-02-22 Thread WANG Xuerui
On 2023/2/22 17:30, Lulu Cheng wrote: On 2023/2/21 9:56 PM, WANG Xuerui wrote: Hi, On 2023/2/21 21:03, Lulu Cheng wrote: On 2023/2/21 3:41 PM, Xi Ruoyao wrote: On Tue, 2023-02-21 at 15:20 +0800, Lulu Cheng wrote: Like la264 only has 40 effective bits of virtual address space. I'm OK with the c

Re: [PATCH] LoongArch: Change the value of macro TRY_EMPTY_VM_SPACE from 0x8000000000 to 0x1000000000.

2023-02-21 Thread WANG Xuerui
Hi, On 2023/2/21 21:03, Lulu Cheng wrote: On 2023/2/21 3:41 PM, Xi Ruoyao wrote: On Tue, 2023-02-21 at 15:20 +0800, Lulu Cheng wrote: Like la264 only has 40 effective bits of virtual address space. I'm OK with the change. But the VA length is configurable building the kernel. Is there any speci

Re: [PATCH] LoongArch: Fix multiarch tuple canonization

2023-02-15 Thread WANG Xuerui
Hi, On 2023/2/13 18:38, Xi Ruoyao wrote: Multiarch tuple will be coded in file or directory names in multiarch-aware distros, so one ABI should have only one multiarch tuple. For example, "--target=loongarch64-linux-gnu --with-abi=lp64s" and "--target=loongarch64-linux-gnusf" should both set mu

Re: [PATCH v3] LoongArch: Add prefetch instructions.

2022-11-15 Thread WANG Xuerui
On 2022/11/16 10:10, Lulu Cheng wrote: v2 -> v3: 1. Remove preldx support. --- Enable sw prefetching at -O3 and higher. Co-Authored-By: xujiahao gcc/ChangeLog: * config/loongarch/constraints.md (ZD): New constraint. * config/loongarch/loo

Re: [PATCH v3] LoongArch: Libvtv add loongarch support.

2022-10-28 Thread WANG Xuerui
Hi, The code change seems good but a few grammatical nits. Patch subject should be a verb phrase, something like "libvtv: add LoongArch support" could be better. On 2022/10/28 16:01, Lulu Cheng wrote: After several considerations, I decided to set VTV_PAGE_SIZE to 16KB under loongarch64.

Re: [PATCH] Libvtv-test: Fix the problem that scansarif.exp cannot be found in libvtv regression test.

2022-09-26 Thread WANG Xuerui
On 2022/9/27 11:16, Lulu Cheng wrote: r13-967 add ARRIF output format. However libvtv does not add support. "SARIF support was added in r13-967 but libvtv wasn't updated." (Tip: always remember that English, unlike Chinese, isn't a "topic-prominent" language, meaning you should almo

Re: Re: [PATCH v5] LoongArch: add movable attribute

2022-08-05 Thread WANG Xuerui
On 2022/8/5 15:19, Lulu Cheng wrote: On 2022/8/5 2:03 PM, Xi Ruoyao wrote: On Fri, 2022-08-05 at 12:01 +0800, Lulu Cheng wrote: On 2022/8/5 11:45 AM, Xi Ruoyao wrote: On Fri, 2022-08-05 at 11:34 +0800, Xi Ruoyao via Gcc-patches wrote: Or maybe we should just use a PC-relative addressing with 4 inst

Re: [PATCH v5] LoongArch: add movable attribute

2022-08-02 Thread WANG Xuerui
On 2022/8/3 09:36, Xi Ruoyao wrote: Is it OK for trunk or I need to change something? By the way, I'm seeking a possibility to include this into 12.2. Then we leaves only 12.1 without this attribute, and we can just say "building the kernel needs GCC 12.2 or later". On Mon, 2022-08-01 at 18:07

Re: [PATCH] LoongArch: document -m[no-]explicit-relocs

2022-07-27 Thread WANG Xuerui
On 2022/7/27 17:28, Lulu Cheng wrote: On 2022/7/27 5:15 PM, Xi Ruoyao wrote: On Wed, 2022-07-27 at 16:47 +0800, Lulu Cheng wrote: "Use or do not use assembler relocation operators when dealing with symbolic addresses. The alternative is to use assembler macros instead, which may limit optimizat

Re: [PATCH] LoongArch: document -m[no-]explicit-relocs

2022-07-27 Thread WANG Xuerui
Hi, On 2022/7/27 15:06, Xi Ruoyao wrote: Document newly introduced -m[no-]explicit-relocs options. Ok for trunk? -- >8 -- gcc/ChangeLog: * doc/invoke.texi: Document -m[no-]explicit-relocs for LoongArch. --- gcc/doc/invoke.texi | 12 1 file changed, 12 insertio

Re: [PATCH] LoongArch: Modify fp_sp_offset and gp_sp_offset's calculation method, when frame->mask or frame->fmask is zero.

2022-07-07 Thread WANG Xuerui
Hi, On 2022/7/7 16:04, Lulu Cheng wrote: gcc/ChangeLog: * config/loongarch/loongarch.cc (loongarch_compute_frame_info): Modify fp_sp_offset and gp_sp_offset's calculation method, when frame->mask or frame->fmask is zero, don't minus UNITS_PER_WORD or UNITS_PER_FP

Re: [PATCH] loongarch: ignore zero-size fields in calling convention

2022-04-25 Thread WANG Xuerui
On 4/25/22 13:57, Xi Ruoyao wrote: Ping. Normally we shouldn't ping a patch after only a few days, but we're running out of time to catch GCC 12 milestone. And once GCC 12 is released the patch will become far more complicated for a psABI warning. And please note that the ABI difference betwee