Re: [PATCH 1/2] LoongArch: Fix wrong code with _alsl_reversesi_extended

2025-01-22 Thread Lulu Cheng
On 2025/1/22 at 5:21 PM, Xi Ruoyao wrote: On Wed, 2025-01-22 at 10:53 +0800, Xi Ruoyao wrote: On Wed, 2025-01-22 at 10:37 +0800, Lulu Cheng wrote: On 2025/1/22 at 8:49 AM, Xi Ruoyao wrote: The second source register of this insn cannot be the same as the destination register. gcc/ChangeLog

Re: [PATCH v2 2/2] LoongArch: Improve reassociation for bitwise operation and left shift [PR 115921]

2025-01-20 Thread Lulu Cheng
On 2025/1/18 at 7:33 PM, Xi Ruoyao wrote: /* snip */ ;; This code iterator allows unsigned and signed division to be generated ;; from the same template. @@ -3083,39 +3084,6 @@ (define_expand "rotl3" } }); -;; The following templates were added to generate "bstrpick.d + alsl.d" -;;

Re: [PATCH v2 2/2] LoongArch: Improve reassociation for bitwise operation and left shift [PR 115921]

2025-01-21 Thread Lulu Cheng
On 2025/1/21 at 12:59 PM, Xi Ruoyao wrote: On Tue, 2025-01-21 at 11:46 +0800, Lulu Cheng wrote: On 2025/1/18 at 7:33 PM, Xi Ruoyao wrote: /* snip */ ;; This code iterator allows unsigned and signed division to be generated ;; from the same template. @@ -3083,39 +3084,6 @@ (define_expand "

[PATCH] LoongArch: Fix ICE caused by illegal calls to builtin functions [PR118561].

2025-01-22 Thread Lulu Cheng
PR target/118561 gcc/ChangeLog: * config/loongarch/loongarch-builtins.cc (loongarch_expand_builtin_lsx_test_branch): NULL_RTX will not be returned when an error is detected. (loongarch_expand_builtin): Likewise. gcc/testsuite/ChangeLog: * gcc.targ

Re: [PATCH] LoongArch: Fix invalid subregs in xorsign [PR118501]

2025-01-22 Thread Lulu Cheng
On 2025/1/23 at 11:36 AM, Xi Ruoyao wrote: On Thu, 2025-01-23 at 11:21 +0800, Lulu Cheng wrote: On 2025/1/22 at 9:26 PM, Xi Ruoyao wrote: The test case added in r15-7073 now triggers an ICE, indicating we need the same fix as AArch64. gcc/ChangeLog: PR target/118501 * config/loongarch

Re: [PATCH] LoongArch: Fix invalid subregs in xorsign [PR118501]

2025-01-22 Thread Lulu Cheng
On 2025/1/22 at 9:26 PM, Xi Ruoyao wrote: The test case added in r15-7073 now triggers an ICE, indicating we need the same fix as AArch64. gcc/ChangeLog: PR target/118501 * config/loongarch/loongarch.md (@xorsign3): Use force_lowpart_subreg. --- Bootstrapped and regtested on

Re: [PATCH 1/2] LoongArch: Fix wrong code with _alsl_reversesi_extended

2025-01-24 Thread Lulu Cheng
On 2025/1/24 at 3:58 PM, Richard Sandiford wrote: Lulu Cheng writes: On 2025/1/22 at 8:49 AM, Xi Ruoyao wrote: The second source register of this insn cannot be the same as the destination register. gcc/ChangeLog: * config/loongarch/loongarch.md (_alsl_reversesi_extended): Add '&

Re: [PATCH v2 1/2] LoongArch: Simplify using bstr{ins,pick} instructions for and

2025-01-19 Thread Lulu Cheng
LGTM! Thanks! On 2025/1/18 at 7:33 PM, Xi Ruoyao wrote: For bstrins, we can merge it into and3 instead of having a separate define_insn. For bstrpick, we can use the constraints to ensure the first source register and the destination register are the same hardware register, instead of emitting a move

[PATCH v2 0/2] Implement target attribute and pragma.

2025-01-20 Thread Lulu Cheng
n (struct gcc_options *opts, case OPT_mlasx: opts->x_la_opt_simd = val ? ISA_EXT_SIMD_LASX - : (la_opt_simd == ISA_EXT_SIMD_LSX || la_opt_simd == ISA_EXT_SIMD_LSX + : (la_opt_simd == ISA_EXT_SIMD_LASX || la_opt_simd == ISA_EXT_SIMD_LSX 2. Add example to doc. Lulu Cheng (2): Lo

[PATCH v2 1/2] LoongArch: Implement target attribute.

2025-01-20 Thread Lulu Cheng
Add function attributes support for LoongArch. Currently, the following items are supported: __attribute__ ((target ("{no-}strict-align"))) __attribute__ ((target ("cmodel="))) __attribute__ ((target ("arch="))) __attribute__ ((target ("tune="))) __attribut
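As a rough illustration of the attribute syntax listed in this patch description, the C sketch below applies `strict-align` and `no-strict-align` to two functions. The macro names and helper functions are hypothetical, and the guard assumes that the compiler predefines `__loongarch__` and that the attribute is available from GCC 15 on; on any other target the macros expand to nothing so the file still builds.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumption: __loongarch__ is predefined when targeting LoongArch and the
   target attribute is available from GCC 15 on.  Elsewhere the macros are
   empty, so the example still compiles (without any per-function effect). */
#if defined(__loongarch__) && defined(__GNUC__) && !defined(__clang__) \
    && __GNUC__ >= 15
#define LA_STRICT_ALIGN    __attribute__ ((target ("strict-align")))
#define LA_NO_STRICT_ALIGN __attribute__ ((target ("no-strict-align")))
#else
#define LA_STRICT_ALIGN
#define LA_NO_STRICT_ALIGN
#endif

/* Hypothetical helper: compiled as if strict alignment were requested, so
   the compiler should not emit an unaligned 32-bit load here. */
LA_STRICT_ALIGN
static uint32_t load_u32_strict(const unsigned char *p)
{
    uint32_t v;
    assert(p != NULL);
    memcpy(&v, p, sizeof v);
    return v;
}

/* Hypothetical helper: same source, but the compiler may use an unaligned
   load if the selected target allows it. */
LA_NO_STRICT_ALIGN
static uint32_t load_u32_relaxed(const unsigned char *p)
{
    uint32_t v;
    assert(p != NULL);
    memcpy(&v, p, sizeof v);
    return v;
}
```

Both helpers return the same value for the same input; the attribute only influences which instructions the compiler is allowed to pick for each function.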

[PATCH v2 2/2] LoongArch: Implement target pragma.

2025-01-20 Thread Lulu Cheng
The target pragmas defined correspond to the target function attributes. This implementation is derived from AArch64. gcc/ChangeLog: * config/loongarch/loongarch-protos.h (loongarch_reset_previous_fndecl): Add function declaration. (loongarch_save_restore_target_globals)

Re:[pushed] [PATCH v2 0/2] Implement target attribute and pragma.

2025-01-21 Thread Lulu Cheng
Pushed to r15-7092 and r15-7093. On 2025/1/20 at 5:54 PM, Lulu Cheng wrote: Currently, the following items are supported: __attribute__ ((target ("{no-}strict-align"))) __attribute__ ((target ("cmodel="))) __attribute__ ((target ("arch=")))

Re: [PATCH v2 2/2] LoongArch: Improve reassociation for bitwise operation and left shift [PR 115921]

2025-01-21 Thread Lulu Cheng
On 2025/1/21 at 6:05 PM, Xi Ruoyao wrote: On Tue, 2025-01-21 at 16:41 +0800, Lulu Cheng wrote: On 2025/1/21 at 12:59 PM, Xi Ruoyao wrote: On Tue, 2025-01-21 at 11:46 +0800, Lulu Cheng wrote: On 2025/1/18 at 7:33 PM, Xi Ruoyao wrote: /* snip */ ;; This code iterator allows unsigned and signed division to be

Re: [PATCH 1/2] LoongArch: Fix wrong code with _alsl_reversesi_extended

2025-01-21 Thread Lulu Cheng
On 2025/1/22 at 8:49 AM, Xi Ruoyao wrote: The second source register of this insn cannot be the same as the destination register. gcc/ChangeLog: * config/loongarch/loongarch.md (_alsl_reversesi_extended): Add '&' to the destination register constraint and append '0' to the fir

Re: [PATCH v2 2/2] LoongArch: Improve reassociation for bitwise operation and left shift [PR 115921]

2025-01-21 Thread Lulu Cheng
On 2025/1/21 at 4:41 PM, Lulu Cheng wrote: On 2025/1/21 at 12:59 PM, Xi Ruoyao wrote: On Tue, 2025-01-21 at 11:46 +0800, Lulu Cheng wrote: On 2025/1/18 at 7:33 PM, Xi Ruoyao wrote: /* snip */ ;; This code iterator allows unsigned and signed division to be generated ;; from the same template. @@ -3083,39

Re: [pushed][PATCH] LoongArch: Implement TARGET_IRA_CHANGE_PSEUDO_ALLOCNO_CLASS hook

2024-12-25 Thread Lulu Cheng
Pushed to r15-6432. On 2024/12/17 at 10:41 AM, Jiahao Xu wrote: The hook changes the allocno class to either FP_REGS or GR_REGS depending on the mode of the register. This results in better register allocation overall, fewer spills and reduced codesize - particularly in SPEC2017 lbm. gcc/ChangeLog:

Re: [PATCH] LoongArch: Add alsl.wu

2025-01-16 Thread Lulu Cheng
LGTM! Thanks! On 2025/1/15 at 6:09 PM, Xi Ruoyao wrote: On 64-bit capable LoongArch hardware, alsl.wu is similar to alsl.w but zero-extending the 32-bit result. gcc/ChangeLog: * config/loongarch/loongarch.md (alslsi3_extend): Add alsl.wu. gcc/testsuite/ChangeLog: * gcc.target/loonga
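To make the difference between the two instructions concrete, here is a small C model of their arithmetic. It is only a sketch: the function names are hypothetical, and the shift range of 1 to 4 is an assumption about the instruction encoding rather than something stated in the message above.

```c
#include <assert.h>
#include <stdint.h>

/* 32-bit "shift left and add": (rj << sa) + rk, truncated to 32 bits.
   Assumption: the shift amount accepted by the instruction is 1..4. */
static uint32_t alsl_low32(uint32_t rj, uint32_t rk, unsigned sa)
{
    assert(sa >= 1 && sa <= 4);
    return (rj << sa) + rk;
}

/* Model of alsl.w: the 32-bit result is sign-extended to 64 bits.
   The uint32_t -> int32_t conversion wraps on GCC and Clang. */
static int64_t alsl_w_model(uint32_t rj, uint32_t rk, unsigned sa)
{
    return (int64_t)(int32_t)alsl_low32(rj, rk, sa);
}

/* Model of alsl.wu: the same 32-bit result, zero-extended to 64 bits. */
static uint64_t alsl_wu_model(uint32_t rj, uint32_t rk, unsigned sa)
{
    return (uint64_t)alsl_low32(rj, rk, sa);
}
```

The two models agree whenever bit 31 of the 32-bit result is clear and differ only in the upper 32 bits otherwise.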

Re: [PATCH] LoongArch: Fix cost model for alsl

2025-01-16 Thread Lulu Cheng
On 2025/1/16 at 8:59 PM, Xi Ruoyao wrote: On Thu, 2025-01-16 at 20:52 +0800, Xi Ruoyao wrote: On Thu, 2025-01-16 at 20:30 +0800, Lulu Cheng wrote: On 2025/1/15 at 6:10 PM, Xi Ruoyao wrote: diff --git a/gcc/config/loongarch/loongarch.cc b/gcc/config/loongarch/loongarch.cc index 9d97f0216f0..3a8e1297bd3

Re: [PATCH] LoongArch: Fix cost model for alsl

2025-01-16 Thread Lulu Cheng
On 2025/1/15 at 6:10 PM, Xi Ruoyao wrote: diff --git a/gcc/config/loongarch/loongarch.cc b/gcc/config/loongarch/loongarch.cc index 9d97f0216f0..3a8e1297bd3 100644 --- a/gcc/config/loongarch/loongarch.cc +++ b/gcc/config/loongarch/loongarch.cc @@ -3929,14 +3929,31 @@ loongarch_rtx_costs (rtx x, machine

Re: [PATCH] LoongArch: Adjust the cost of ADDRESS_REG_REG [PR114978].

2025-01-09 Thread Lulu Cheng
On 2025/1/8 at 11:16 PM, Xi Ruoyao wrote: On Tue, 2025-01-07 at 10:44 +0800, Lulu Cheng wrote: After changing this cost from 1 to 3, the performance of spec2006 401 473 416 465 482 can be improved by about 2% on LA664. Would this fix https://gcc.gnu.org/PR114978 (or at least make it latent)? The

Re: [PATCH 1/2] LoongArch: Fix wrong code with _alsl_reversesi_extended

2025-02-12 Thread Lulu Cheng
On 2025/1/24 at 7:44 PM, Richard Sandiford wrote: Lulu Cheng writes: On 2025/1/24 at 3:58 PM, Richard Sandiford wrote: Lulu Cheng writes: On 2025/1/22 at 8:49 AM, Xi Ruoyao wrote: I have no problem with this patch. But, I have always been confused about the use of reload_completed. I can understand that it

Re: [PATCH v2 2/8] LoongArch: Allow moving TImode vectors

2025-02-13 Thread Lulu Cheng
Hi, If only the first and second patches are applied, the code will not compile. Otherwise LGTM. Thanks! On 2025/2/13 at 5:41 PM, Xi Ruoyao wrote: We have some vector instructions for operations on 128-bit integer, i.e. TImode, vectors. Previously they had been modeled with unspecs, but it's more natural

[PATCH v2] LoongArch: Adjust the cost of ADDRESS_REG_REG.

2025-02-13 Thread Lulu Cheng
After changing this cost from 1 to 3, the performance of spec2006 401 473 416 465 482 can be improved by about 2% on LA664. Add option '-maddr-reg-reg-cost='. gcc/ChangeLog: * config/loongarch/genopts/loongarch.opt.in: Add option '-maddr-reg-reg-cost='. * config/loongarch

[PATCH v3 2/4] LoongArch: Split the function loongarch_cpu_cpp_builtins into two functions.

2025-02-13 Thread Lulu Cheng
Split the implementation of the function loongarch_cpu_cpp_builtins into two parts: 1. Macro definitions that do not change (only considering 64-bit architecture) 2. Macro definitions that change with different compilation options. gcc/ChangeLog: * config/loongarch/loongarch-c.cc (bu

[PATCH v3 0/4] Organize the code and fix PR118828 and PR118843.

2025-02-13 Thread Lulu Cheng
v1 -> v2: 1. Move __loongarch_{arch,tune} _LOONGARCH_{ARCH,TUNE} __loongarch_{div32,am_bh,amcas,ld_seq_sa} and __loongarch_version_major/__loongarch_version_minor to update function. 2. Fixed PR118843. 3. Add testsuites. v2 -> v3: 1. Modify test cases (pr118828-3.c pr118828-4.c).

[PATCH v3 3/4] LoongArch: After setting the compilation options, update the predefined macros.

2025-02-13 Thread Lulu Cheng
PR target/118828 gcc/ChangeLog: * config/loongarch/loongarch-c.cc (loongarch_pragma_target_parse): Update the predefined macros. gcc/testsuite/ChangeLog: * gcc.target/loongarch/pr118828.c: New test. * gcc.target/loongarch/pr118828-2.c: New test. *

[PATCH v3 4/4] LoongArch: When -mfpu=none, '__loongarch_frecipe' shouldn't be defined [PR118843].

2025-02-13 Thread Lulu Cheng
PR target/118843 gcc/ChangeLog: * config/loongarch/loongarch-c.cc (loongarch_update_cpp_builtins): Fix macro definition issues. gcc/testsuite/ChangeLog: * gcc.target/loongarch/pr118843.c: New test. --- gcc/config/loongarch/loongarch-c.cc | 27

[PATCH v3 1/4] LoongArch: Move the function loongarch_register_pragmas to loongarch-c.cc.

2025-02-13 Thread Lulu Cheng
gcc/ChangeLog: * config/loongarch/loongarch-target-attr.cc (loongarch_pragma_target_parse): Move to ... (loongarch_register_pragmas): Move to ... * config/loongarch/loongarch-c.cc (loongarch_pragma_target_parse): ... here. (loongarch_register_pragmas

Re: [PATCH] LoongArch: Accept ADD, IOR or XOR when combining objects with no bits in common [PR115478]

2025-02-13 Thread Lulu Cheng
LGTM! Thanks! On 2025/2/11 at 2:34 PM, Xi Ruoyao wrote: Since r15-1120, multi-word shifts/rotates produce PLUS instead of IOR. It's generally a good thing (allowing the use of our alsl instruction or similar instructions on other architectures), but it's preventing us from using bytepick. For example, if
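The reason ADD, IOR and XOR are interchangeable here is plain arithmetic: when two operands have no set bits in common, no carries occur, so all three operations give the same result. The C sketch below illustrates this with a multi-word left shift, where the two halves being combined never overlap. The function names are hypothetical and the code illustrates the identity only, not the backend patterns themselves.

```c
#include <assert.h>
#include <stdint.h>

/* High word of a 128-bit value (hi:lo) shifted left by n bits.  The term
   (hi << n) has its low n bits clear and (lo >> (64 - n)) uses only the
   low n bits, so the two terms share no set bits. */
static uint64_t shl128_high_ior(uint64_t hi, uint64_t lo, unsigned n)
{
    assert(n > 0 && n < 64);
    return (hi << n) | (lo >> (64 - n));
}

/* Same combination written with PLUS. */
static uint64_t shl128_high_add(uint64_t hi, uint64_t lo, unsigned n)
{
    assert(n > 0 && n < 64);
    return (hi << n) + (lo >> (64 - n));
}

/* Same combination written with XOR. */
static uint64_t shl128_high_xor(uint64_t hi, uint64_t lo, unsigned n)
{
    assert(n > 0 && n < 64);
    return (hi << n) ^ (lo >> (64 - n));
}
```

Because the three forms are equivalent for disjoint operands, a pattern that matches only one of them can miss code the middle end chose to express with another.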

Re: [PATCH 6/8] LoongArch: Simplify {lsx,lasx_x}vpick description

2025-02-13 Thread Lulu Cheng
, Lulu Cheng wrote: On 2025/2/7 at 8:09 PM, Xi Ruoyao wrote: /* snip */ - -(define_insn "lasx_xvpickev_w" - [(set (match_operand:V8SI 0 "register_operand" "=f") - (vec_select:V8SI - (vec_concat:V16SI - (match_operand:V8SI 1 "register_operand"

Re:[Pushed] [PATCH] LoongArch: Fix the issue of function jump out of range caused by crtbeginS.o [PR118844].

2025-02-16 Thread Lulu Cheng
Pushed to r15-7581. On 2025/2/12 at 4:01 PM, Lulu Cheng wrote: Due to the presence of R_LARCH_B26 in /usr/lib/gcc/loongarch64-linux-gnu/14/crtbeginS.o, its addressing range is [PC-128MiB, PC+128MiB-4]. This means that when the code segment size exceeds 128MB, linking with lld will definitely fail (ld

Re: [PATCH 2/3] LoongArch: Split the function loongarch_cpu_cpp_builtins into two functions.

2025-02-11 Thread Lulu Cheng
On 2025/2/11 at 9:26 PM, Xi Ruoyao wrote: On Tue, 2025-02-11 at 20:49 +0800, Lulu Cheng wrote: Split the implementation of the function loongarch_cpu_cpp_builtins into two parts: 1. Macro definitions that do not change (only considering 64-bit architecture) 2. Macro definitions that change with

Re: [PATCH 3/8] LoongArch: Simplify {lsx_,lasx_x}v{add,sub,mul}l{ev,od} description

2025-02-11 Thread Lulu Cheng
On 2025/2/7 at 8:09 PM, Xi Ruoyao wrote: These pattern definitions are tediously long, invoking 32 UNSPECs and many hard-coded long const vectors. To simplify them, at first we use the TImode vector operations instead of the UNSPECs, then we adopt an approach in AArch64: using a special predicate to m

Re: [PATCH 6/8] LoongArch: Simplify {lsx,lasx_x}vpick description

2025-02-11 Thread Lulu Cheng
On 2025/2/12 at 3:30 AM, Xi Ruoyao wrote: On Tue, 2025-02-11 at 16:52 +0800, Lulu Cheng wrote: On 2025/2/7 at 8:09 PM, Xi Ruoyao wrote: /* snip */ - -(define_insn "lasx_xvpickev_w" - [(set (match_operand:V8SI 0 "register_operand" "=f") - (vec_select:V8

[PATCH] LoongArch: Fix the issue of function jump out of range caused by crtbeginS.o [PR118844].

2025-02-12 Thread Lulu Cheng
Due to the presence of R_LARCH_B26 in /usr/lib/gcc/loongarch64-linux-gnu/14/crtbeginS.o, its addressing range is [PC-128MiB, PC+128MiB-4]. This means that when the code segment size exceeds 128MB, linking with lld will definitely fail (ld will not fail because the order of the two is different). T
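The quoted range follows from the branch encoding: a signed 26-bit immediate counted in 4-byte instruction units. The C sketch below reproduces that arithmetic. The function names are hypothetical, and the 26-bit/4-byte parameters are an assumption that happens to match the range given in the message.

```c
#include <assert.h>
#include <stdint.h>

/* Smallest byte offset reachable with a signed imm_bits-bit immediate that
   is scaled by 2^scale_log2 bytes. */
static int64_t branch_min_offset(unsigned imm_bits, unsigned scale_log2)
{
    assert(imm_bits >= 1 && imm_bits + scale_log2 < 63);
    return -(INT64_C(1) << (imm_bits - 1)) * (INT64_C(1) << scale_log2);
}

/* Largest byte offset reachable with the same encoding. */
static int64_t branch_max_offset(unsigned imm_bits, unsigned scale_log2)
{
    assert(imm_bits >= 1 && imm_bits + scale_log2 < 63);
    return ((INT64_C(1) << (imm_bits - 1)) - 1) * (INT64_C(1) << scale_log2);
}

/* Assumption: R_LARCH_B26 covers a 26-bit immediate in units of 4 bytes,
   so a target is reachable if its offset is 4-byte aligned and in range. */
static int b26_reachable(int64_t offset)
{
    return offset % 4 == 0
           && offset >= branch_min_offset(26, 2)
           && offset <= branch_max_offset(26, 2);
}
```

With these parameters the bounds come out as -128 MiB and 128 MiB - 4, which is why a call from crtbeginS.o can fall out of range once the code segment grows past 128 MiB.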

[PATCH] LoongArch: Support Q suffix for __float128.

2025-03-22 Thread Lulu Cheng
r14-3635 added support for `__float128`, but not for the 'q/Q' suffix. PR target/119408 gcc/ChangeLog: * config/loongarch/loongarch.cc (loongarch_c_mode_for_suffix): New. (TARGET_C_MODE_FOR_SUFFIX): Define. gcc/testsuite/ChangeLog: * gcc.target/loonga

Re: [PATCH] LoongArch: Add ABI names for FPR

2025-03-18 Thread Lulu Cheng
LGTM. Thanks. On 2025/3/16 at 2:41 PM, Xi Ruoyao wrote: We already allow the ABI names for GPR in inline asm clobber list, so for consistency allow the ABI names for FPR as well. Reported-by: Yao Zi gcc/ChangeLog: * config/loongarch/loongarch.h (ADDITIONAL_REGISTER_NAMES): Add fa0-

Re: [PATCH] LoongArch: Make gen-evolution.awk compatible with FreeBSD awk

2025-04-05 Thread Lulu Cheng
On 2025/4/2 at 11:19 AM, Xi Ruoyao wrote: Avoid using gensub, which FreeBSD awk lacks; use gsub and split, which gawk, mawk, and FreeBSD awk all provide. Reported-by: mp...@vip.163.com Link: https://man.freebsd.org/cgi/man.cgi?query=awk gcc/ChangeLog: * config/loongarch/genopts/gen-evoluti

Re:[pushed] [PATCH] LoongArch: Support Q suffix for __float128.

2025-03-27 Thread Lulu Cheng
Pushed to r15-8962. On 2025/3/22 at 4:35 PM, Lulu Cheng wrote: r14-3635 added support for `__float128`, but not for the 'q/Q' suffix. PR target/119408 gcc/ChangeLog: * config/loongarch/loongarch.cc (loongarch_c_mode_for_suffix): New. (TARGET_C_MODE_

Re:[pushed] [PATCH v3] LoongArch: Add LoongArch architecture detection to __float128 support in libgfortran and libquadmath [PR119408].

2025-04-07 Thread Lulu Cheng
Pushed to r15-9245 and r14-11538. On 2025/4/7 at 3:44 PM, Lulu Cheng wrote: In GCC14, LoongArch added __float128 as an alias for _Float128. Commit r15-8962 added support for q/Q suffixes for 128-bit floating point numbers. This will cause the compiler to automatically link libquadmath when compiling

Re:[pushed] [PATCH] LoongArch: Fix awk / sed usage for compatibility

2025-04-08 Thread Lulu Cheng
Pushed to r15-9324 and r14-11545. On 2025/4/7 at 10:31 AM, Yang Yujie wrote: Tested with nawk, mawk, and gawk. gcc/ChangeLog: * config/loongarch/genopts/gen-evolution.awk: remove usage of "asort". * config/loongarch/genopts/genstr.sh: replace sed with awk. --- .../loongarch/g

[PATCH v3] LoongArch: Add LoongArch architecture detection to __float128 support in libgfortran and libquadmath [PR119408].

2025-04-07 Thread Lulu Cheng
In GCC14, LoongArch added __float128 as an alias for _Float128. Commit r15-8962 added support for q/Q suffixes for 128-bit floating point numbers. This will cause the compiler to automatically link libquadmath when compiling Fortran programs. But on LoongArch `long double` is IEEE quad, so there is

Re: [PATCH] LoongArch: Add LoongArch architecture detection to __float128 support in libgfortran and libquadmath [PR119408].

2025-04-07 Thread Lulu Cheng
On 2025/4/7 at 3:19 PM, Jakub Jelinek wrote: On Mon, Apr 07, 2025 at 03:12:22PM +0800, Lulu Cheng wrote: The above hunks clearly show that you're regenerating it with some patched autoconf or something like that. Please manually remove those hunks or use vanilla upstream autoconf 2.69. Othe

[PATCH] LoongArch: Add LoongArch architecture detection to __float128 support in libgfortran and libquadmath [PR119408].

2025-04-07 Thread Lulu Cheng
In GCC14, LoongArch added __float128 as an alias for _Float128. Commit r15-8962 added support for q/Q suffixes for 128-bit floating point numbers. This will cause the compiler to automatically link libquadmath when compiling Fortran programs. But on LoongArch `long double` is IEEE quad, so there is

[PATCH v2] LoongArch: Add LoongArch architecture detection to __float128 support in libgfortran and libquadmath [PR119408].

2025-04-07 Thread Lulu Cheng
In GCC14, LoongArch added __float128 as an alias for _Float128. Commit r15-8962 added support for q/Q suffixes for 128-bit floating point numbers. This will cause the compiler to automatically link libquadmath when compiling Fortran programs. But on LoongArch `long double` is IEEE quad, so there is

Re: [PATCH v3] LoongArch: Add LoongArch architecture detection to __float128 support in libgfortran and libquadmath [PR119408].

2025-04-07 Thread Lulu Cheng
On 2025/4/7 at 4:02 PM, Jakub Jelinek wrote: On Mon, Apr 07, 2025 at 03:44:52PM +0800, Lulu Cheng wrote: In GCC14, LoongArch added __float128 as an alias for _Float128. Commit r15-8962 added support for q/Q suffixes for 128-bit floating point numbers. This will cause the compiler to automatically link

[PATCH] gcc-15/changes: Document LoongArch changes.

2025-04-17 Thread Lulu Cheng
--- htdocs/gcc-15/changes.html | 24 1 file changed, 24 insertions(+) diff --git a/htdocs/gcc-15/changes.html b/htdocs/gcc-15/changes.html index a02ba17a..011fc5ca 100644 --- a/htdocs/gcc-15/changes.html +++ b/htdocs/gcc-15/changes.html @@ -842,6 +842,30 @@ asm (".text; %

[PATCH v2] gcc-15/changes: Document LoongArch changes.

2025-04-21 Thread Lulu Cheng
--- htdocs/gcc-15/changes.html | 20 1 file changed, 20 insertions(+) diff --git a/htdocs/gcc-15/changes.html b/htdocs/gcc-15/changes.html index a02ba17a..b94802f5 100644 --- a/htdocs/gcc-15/changes.html +++ b/htdocs/gcc-15/changes.html @@ -842,6 +842,26 @@ asm (".text; %cc0:

Re: [pushed][PATCH] LoongArch: Change {dg-do-what-default} save and restore logical

2025-04-18 Thread Lulu Cheng
The changelog format has been modified and pushed to r16-13 and r15-9556. On 2025/4/16 at 10:29 AM, Xing Li wrote: Setting {dg-do-what-default} to 'run' may cause some tests to hang during make check. gcc/testsuite *gcc.target/loongarch/vector/loongarch-vector.exp: Change {d

[PATCH] MIPS: Fixed the problem that the nop instruction is inserted at the wrong position after enabling '-fpatchable-function-entry='

2025-04-29 Thread Lulu Cheng
Because MIPS function symbol is generated in the prologue function, this nop generation should be done in prologue. OK for trunk? PR target/99217 gcc/ChangeLog: * config/mips/mips.cc (mips_start_function_definition): Implements the functionality of '-fpatchable-function-e

Re: [PATCH] MIPS: Fixed the problem that the nop instruction is inserted at the wrong position after enabling '-fpatchable-function-entry='

2025-05-06 Thread Lulu Cheng
On 2025/5/6 at 6:14 PM, WANG Xuerui wrote: On 4/30/25 14:26, Lulu Cheng wrote: Because MIPS function symbol is generated in the prologue function, this nop generation should be done in prologue. OK for trunk? PR target/99217 gcc/ChangeLog: * config/mips/mips.cc

Re: [PATCH] LoongArch: Change {dg-do-what-default} save and restore logical

2025-04-18 Thread Lulu Cheng
On 2025/4/16 at 10:29 AM, Xing Li wrote: Setting {dg-do-what-default} to 'run' may cause some tests to hang during make check. gcc/testsuite *gcc.target/loongarch/vector/loongarch-vector.exp: Change {dg-do-what-default} save and restore logical Hi, There is a problem wit

Re: [PATCH] LoongArch: Use normal RTL pattern instead of UNSPEC for {x,}vsr{a,l}ri instructions

2025-02-18 Thread Lulu Cheng
LGTM! Thanks! On 2025/2/14 at 9:37 PM, Xi Ruoyao wrote: Allow (t + (1ul << imm >> 1)) >> imm to be recognized as a rounding shift operation. gcc/ChangeLog: * config/loongarch/lasx.md (UNSPEC_LASX_XVSRARI): Remove. (UNSPEC_LASX_XVSRLRI): Remove. (lasx_xvsrari_): Remove.
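The expression quoted in this message is the usual "add half, then shift" way of rounding a right shift to the nearest integer. A scalar C sketch of that expression follows. The function name is hypothetical, and this models a single unsigned 64-bit element rather than the vector instructions.

```c
#include <assert.h>
#include <stdint.h>

/* Logical right shift of t by imm bits, rounded to nearest (ties round up).
   The bias (1 << imm) >> 1 is half of the divisor 2^imm, and is 0 when
   imm is 0, so a zero shift returns t unchanged. */
static uint64_t rounding_shift_right(uint64_t t, unsigned imm)
{
    assert(imm < 64);
    return (t + ((UINT64_C(1) << imm) >> 1)) >> imm;
}
```

For example, shifting 5 right by 1 gives 3 instead of the truncated 2, while shifting 5 right by 2 gives 1 because 1.25 rounds down. Note that the addition can wrap for values of t near the top of the range; this sketch simply inherits that behaviour from the expression as written.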

Re: Ping: [PATCH] testsuite: Fix up toplevel-asm-1.c for LoongArch

2025-02-18 Thread Lulu Cheng
On 2025/2/19 at 3:27 PM, Xi Ruoyao wrote: On Wed, 2025-02-05 at 08:57 +0800, Xi Ruoyao wrote: Like RISC-V, on LoongArch we don't really support %cN for SYMBOL_REFs even with -fno-pic. gcc/testsuite/ChangeLog: * c-c++-common/toplevel-asm-1.c: Use %cc3 %cc4 instead of %c3 %c4 on LoongA

Re: [PATCH v3 0/8] LoongArch: SIMD odd/even/horizontal widening arithmetic cleanup and optimization

2025-02-17 Thread Lulu Cheng
On 2025/2/14 at 8:21 PM, Xi Ruoyao wrote: This series is intended to fix some test failures on vect-reduc-chain-*.c by adding the [su]dot_prod* expand for LSX and LASX vector modes. But the code base of the related instructions was not readable, so clean it up first (using the approach learnt from AAr

Re: [PATCH v3 8/8] LoongArch: Implement [su]dot_prod* for LSX and LASX modes

2025-03-06 Thread Lulu Cheng
On 2025/3/7 at 2:37 PM, Lulu Cheng wrote: On 2025/2/14 at 8:21 PM, Xi Ruoyao wrote: Although it's just a special case of "a widening product of which the result used for reduction," having these standard names allows recognizing the dot product pattern earlier and it may be beneficial to opti

[PATCH 2/3] LoongArch: testsuite: Fix gcc.dg/vect/bb-slp-77.c.

2025-03-06 Thread Lulu Cheng
The issue is the same as 12383255fe4e82c31f5e42c72a8fbcb1b5dea35d. Neither is .REDUC_PLUS set for V2SImode on LoongArch, so add it to the list of targets not expecting BB vectorization. gcc/testsuite/ChangeLog: * gcc.dg/vect/bb-slp-77.c: Add loongarch*-*-* to the list of expected

[PATCH 1/3] LoongArch: testsuite: Fix pr112325.c and pr117888-1.c.

2025-03-06 Thread Lulu Cheng
By default, vectorization is not enabled on LoongArch, resulting in the failure of these two test cases. gcc/testsuite/ChangeLog: * gcc.dg/vect/pr112325.c: Add the vector compilation option '-mlsx' for LoongArch. * gcc.dg/vect/pr117888-1.c: Likewise. --- gcc/testsuite/gc

Re: [PATCH v3 8/8] LoongArch: Implement [su]dot_prod* for LSX and LASX modes

2025-03-06 Thread Lulu Cheng
On 2025/2/14 at 8:21 PM, Xi Ruoyao wrote: Although it's just a special case of "a widening product of which the result used for reduction," having these standard names allows recognizing the dot product pattern earlier and it may be beneficial to optimization. Also fix some test failures with the test

Re: [PATCH] LoongArch: Add a dedicated pattern for bitwise + alsl

2025-03-07 Thread Lulu Cheng
LGTM, but since we're now in stage 4, I believe it should be merged into GCC16 Stage 1. Thanks! On 2025/3/1 at 11:38 AM, Xi Ruoyao wrote: We've implemented the slli + bitwise => bitwise + slli reassociation in r15-7062. I'd hoped late combine could handle slli.d + bitwise + add.d => bitwise + slli.d
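The reassociation mentioned here rests on a simple identity for AND: shifting first and then masking gives the same value as masking with the shifted-down mask and then shifting. The C sketch below checks that identity on 64-bit values. The function names are hypothetical, and the actual transformation is done on RTL inside the backend, which this code does not attempt to reproduce.

```c
#include <assert.h>
#include <stdint.h>

/* Original order: shift left, then apply the bitwise AND. */
static uint64_t shift_then_and(uint64_t a, unsigned k, uint64_t mask)
{
    assert(k < 64);
    return (a << k) & mask;
}

/* Reassociated order: AND with the mask shifted right by k, then shift.
   The low k bits of mask never matter because (a << k) has zeros there. */
static uint64_t and_then_shift(uint64_t a, unsigned k, uint64_t mask)
{
    assert(k < 64);
    return (a & (mask >> k)) << k;
}
```

One plausible benefit of the second form is that the shifted-down mask is smaller and more likely to fit an immediate or a bit-field instruction, and the trailing shift can then be folded into a following add; that motivation is an interpretation, not something the snippet states in full.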

[PATCH 3/3] LoongArch: testsuite: Fix gcc.dg/vect/slp-26.c.

2025-03-09 Thread Lulu Cheng
After d34cda720988674bcf8a24267c9e1ec61335d6de, what was originally not vectorizable can now be vectorized. So adjust gcc.dg/vect/slp-26.c. gcc/testsuite/ChangeLog: * gcc.dg/vect/slp-26.c: Adjust. --- gcc/testsuite/gcc.dg/vect/slp-26.c | 4 ++-- 1 file changed, 2 insertions(+), 2 delet

Re: [PATCH] LoongArch: Fix ICE when trying to recognize bitwise + alsl.w pair [PR119127]

2025-03-10 Thread Lulu Cheng
LGTM. Thanks. On 2025/3/10 at 2:40 PM, Xi Ruoyao wrote: When we call loongarch_reassoc_shift_bitwise for _alsl_reversesi_extend, the mask is in DImode but we are trying to operate it in SImode, causing an ICE. To fix the issue sign-extend the mask into the mode we want. And also specially handle the

Re: Ping: [PATCH] LoongArch: combine related slli operations

2025-03-10 Thread Lulu Cheng
. Thanks. On 2025/1/7 at 8:45 PM, Zhou Zhao wrote: On 2025/1/7 at 7:49 PM, Lulu Cheng wrote: On 2025/1/2 at 5:46 PM, Zhou Zhao wrote: If an SImode reg is left-shifted twice in a row, combine the related instructions into one. gcc/ChangeLog: * config/loongarch/loongarch.md (extsv_ashlsi3): New template Hi

Re: [PATCH] LoongArch: Don't use C++17 feature [PR119238]

2025-03-12 Thread Lulu Cheng
On 2025/3/12 at 9:14 PM, Xi Ruoyao wrote: Structured binding is a C++17 feature but the GCC code base is in C++14. I couldn't find the description of the standards followed by GCC code in the document yesterday. Therefore, I assumed that this standard is the same as the default standard set durin

Re: [PATCH] LoongArch: Fix incorrect reorder of __lsx_vldx and __lasx_xvldx [PR119084]

2025-03-04 Thread Lulu Cheng
LGTM! Thanks. On 2025/3/3 at 3:29 PM, Xi Ruoyao wrote: They could be incorrectly reordered with store instructions like st.b because the RTL expression does not have a memory_operand or a (mem) expression. The incorrect reorder has been observed in openh264 LTO build. Expand them to a (mem) expressio

Re: [PATCH] LoongArch: Fix incorrect reorder of __lsx_vldx and __lasx_xvldx [PR119084]

2025-03-04 Thread Lulu Cheng
On 2025/3/5 at 11:03 AM, Xi Ruoyao wrote: On Wed, 2025-03-05 at 10:52 +0800, Lulu Cheng wrote: LGTM! Pushed to trunk. The draft of gcc-14 backport is attached, I'll push it if it builds & tests fine and there's no objection. Thanks a lot.

Re: [PATCH] LoongArch: Don't use C++17 feature [PR119238]

2025-03-12 Thread Lulu Cheng
On 2025/3/13 at 10:36 AM, Andrew Pinski wrote: On Wed, Mar 12, 2025 at 6:23 PM Lulu Cheng wrote: On 2025/3/12 at 9:14 PM, Xi Ruoyao wrote: Structured binding is a C++17 feature but the GCC code base is in C++14. I couldn't find the description of the standards followed by GCC code in the doc

Re: [pushed][PATCH v3 0/4] Organize the code and fix PR118828 and PR118843.

2025-02-13 Thread Lulu Cheng
Pushed to r15-7521..r15-7524 On 2025/2/13 at 8:59 PM, Lulu Cheng wrote: v1 -> v2: 1. Move __loongarch_{arch,tune} _LOONGARCH_{ARCH,TUNE} __loongarch_{div32,am_bh,amcas,ld_seq_sa} and __loongarch_version_major/__loongarch_version_minor to update function. 2. Fixed PR118843. 3. Add testsuites.

Re:[pushed] [PATCH v2] LoongArch: Adjust the cost of ADDRESS_REG_REG.

2025-02-13 Thread Lulu Cheng
Pushed to r15-7525. On 2025/2/13 at 4:40 PM, Lulu Cheng wrote: After changing this cost from 1 to 3, the performance of spec2006 401 473 416 465 482 can be improved by about 2% on LA664. Add option '-maddr-reg-reg-cost='. gcc/ChangeLog: * config/loongarch/genopts/loongarch.o

Re: [PATCH v2 3/4] LoongArch: After setting the compilation options, update the predefined macros.

2025-02-12 Thread Lulu Cheng
On 2025/2/12 at 6:19 PM, Xi Ruoyao wrote: On Wed, 2025-02-12 at 18:03 +0800, Lulu Cheng wrote: /* snip */ diff --git a/gcc/testsuite/gcc.target/loongarch/pr118828-3.c b/gcc/testsuite/gcc.target/loongarch/pr118828-3.c new file mode 100644 index 000..a682ae4a356 --- /dev/null +++ b/gcc

Re: [PATCH] LoongArch: Avoid unnecessary zero-initialization using LSX for scalar popcount

2025-02-25 Thread Lulu Cheng
On 2025/2/22 at 3:34 PM, Xi Ruoyao wrote: Now for __builtin_popcountl we are getting things like vrepli.b$vr0,0 vinsgr2vr.d $vr0,$r4,0 vpcnt.d $vr0,$vr0 vpickve2gr.du $r4,$vr0,0 slli.w $r4,$r4,0 jr $r1 The "vrepli.b" instruction is intro

[PATCH] LoongArch: doc: Put the '-mtls-dialect=opt' option description in the correct position.

2025-03-31 Thread Lulu Cheng
gcc/ChangeLog: * doc/invoke.texi: Corrected the position of '-mtls-dialect=opt' option. --- gcc/doc/invoke.texi | 16 1 file changed, 8 insertions(+), 8 deletions(-) diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi index df461090824..ecc5f61f29c 100644 ---

Re:[pushed] [PATCH 2/2] LoongArch: doc: Add same-address constraint to the description of '-mld-seq-sa'.

2025-03-28 Thread Lulu Cheng
Pushed to r15-9023. On 2025/3/27 at 3:01 PM, Lulu Cheng wrote: gcc/ChangeLog: * doc/invoke.texi: Modify the description of '-mld-seq-sa'. --- gcc/doc/invoke.texi | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.

Re: Pushed r15-9167: [PATCH] LoongArch: Make gen-evolution.awk compatible with FreeBSD awk

2025-04-02 Thread Lulu Cheng
On 2025/4/3 at 11:12 AM, Xi Ruoyao wrote: On Thu, 2025-04-03 at 10:13 +0800, Lulu Cheng wrote: On 2025/4/2 at 11:19 AM, Xi Ruoyao wrote: Avoid using gensub, which FreeBSD awk lacks; use gsub and split, which gawk, mawk, and FreeBSD awk all provide. Reported-by: mp...@vip.163.com Link: https

Re: [pushed][PATCH] LoongArch: doc: Put the '-mtls-dialect=opt' option description in the correct position.

2025-04-04 Thread Lulu Cheng
Pushed to r15-9115. On 2025/3/31 at 3:14 PM, Lulu Cheng wrote: gcc/ChangeLog: * doc/invoke.texi: Corrected the position of '-mtls-dialect=opt' option. --- gcc/doc/invoke.texi | 16 1 file changed, 8 insertions(+), 8 deletions(-) diff --git a/gcc/doc/inv

Re:[pushed] [PATCH 1/2] LoongArch: Set default alignment for functions jumps loops and labels.

2025-03-28 Thread Lulu Cheng
Pushed to r15-9022. On 2025/3/27 at 3:01 PM, Lulu Cheng wrote: Based on r15-7624, a set of align combinations with better performance was tested through spec2006. LA464: -falign-loops=8 -falign-functions=32 -falign-jumps=32 -falign-labels=8 LA664: -falign-loops=16 -falign-functions=16 -falign-jumps=32

[PATCH 1/2] LoongArch: Set default alignment for functions jumps loops and labels.

2025-03-27 Thread Lulu Cheng
Based on r15-7624, a set of align combinations with better performance was tested through spec2006. LA464: -falign-loops=8 -falign-functions=32 -falign-jumps=32 -falign-labels=8 LA664: -falign-loops=16 -falign-functions=16 -falign-jumps=32 -falign-labels=8 gcc/ChangeLog: * config/loongar

[PATCH 2/2] LoongArch: doc: Add same-address constraint to the description of '-mld-seq-sa'.

2025-03-27 Thread Lulu Cheng
gcc/ChangeLog: * doc/invoke.texi: Modify the description of '-mld-seq-sa'. --- gcc/doc/invoke.texi | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi index b3f7f0479cc..4cdef8938dd 100644 --- a/gcc/doc/invoke.texi +++ b/gcc/d

Re:[ping] [PATCH v2] MIPS: Fix the issue with the '-fpatchable-function-entry=' feature.

2025-05-12 Thread Lulu Cheng
Ping? On 2025/5/9 at 10:14 AM, Lulu Cheng wrote: From: ChengLulu PR target/99217 gcc/ChangeLog: * config/mips/mips.cc (mips_start_function_definition): Implements the functionality of '-fpatchable-function-entry='. (mips_print_patchable_function_entry): De

[PATCH v2] MIPS: Fix the issue with the '-fpatchable-function-entry=' feature.

2025-05-08 Thread Lulu Cheng
From: ChengLulu PR target/99217 gcc/ChangeLog: * config/mips/mips.cc (mips_start_function_definition): Implements the functionality of '-fpatchable-function-entry='. (mips_print_patchable_function_entry): Define empty function. (TARGET_ASM_PRINT_PATCHABLE
