Co-Authored-By: xujiahao
gcc/ChangeLog:
* config/loongarch/loongarch-def.c: Initial number of parallel prefetch.
* config/loongarch/loongarch-tune.h (struct loongarch_cache):
Define number of parallel prefetch.
* config/loongarch/loongarch.cc (loongarch_option_ove
The old stack check was performed before the stack was dropped,
which would cause the detection tool to report a memory leak.
The current stack check scheme is as follows:
'-fstack-clash-protection':
1. When the frame->total_size is smaller than the guard page size,
the stack is dropped accord
On 2022/11/15 at 5:17 PM, Xi Ruoyao wrote:
On Sat, 2022-11-12 at 17:45 +0800, Xi Ruoyao via Gcc-patches wrote:
void prefetch(char *ptr, int off)
{
__builtin_prefetch(ptr + off);
}
It's compiled to "preldx 0,$r4,$r5". I don't think it's correct because
according to the doc, rk should
v2 -> v3:
1. Remove preldx support.
---
Enable sw prefetching at -O3 and higher.
Co-Authored-By: xujiahao
gcc/ChangeLog:
* config/loongarch/constraints.md (ZD): New constraint.
* config/loongarch/loongarch-def.c: Initial number of parallel pr
On 2022/11/16 at 11:06 AM, WANG Xuerui wrote:
On 2022/11/16 10:10, Lulu Cheng wrote:
v2 -> v3:
1. Remove preldx support.
---
Enable sw prefetching at -O3 and higher.
Co-Authored-By: xujiahao
gcc/ChangeLog:
* config/loongarch/constraints.md (ZD):
v1 -> v2:
1. Change the code format.
2. Fix bugs in the code.
v2 -> v3:
Modify a code implementation that had undefined behavior.
v3 -> v4:
Move the part of the immediate number decomposition from expand pass to split
pass.
Both regression tests and spec2006 passed.
The problem mentioned in the
On 2023/4/12 at 8:16 PM, Xi Ruoyao wrote:
We'd been generating really bad block move sequences, which kernel
developers who tried __builtin_memcpy recently complained about. To improve
it:
1. Take advantage of -mno-strict-align. When it is set, set mode
size to UNITS_PER_WORD regardless of th
Pushed to master.
Thanks!
On 2023/4/19 at 2:04 PM, Gerald Pfeifer wrote:
On Tue, 18 Apr 2023, Lulu Cheng wrote:
v1 -> v2: Modify syntax errors and description information.
v2 -> v3: Modify some description information.
Thank you, and thank you to Xuerui for their feedback!
Please go ahe
On 2023/4/17 at 2:51 PM, 樊鹏 wrote:
Yes, https://wiki.musl-libc.org/guidelines-for-distributions.html,
the "Multilib/multi-arch" section of this page introduces it.
Hi, fanpeng:
I agree with Ruoyao; please add this link to the commit message.
I have no problems with the rest.
Thanks!
-Original Messages-
F
Pushed to r14-130.
On 2023/4/19 at 4:23 PM, Peng Fan wrote:
Systems based on musl have no '/lib64', so change it.
The "Multilib/multi-arch" section of
https://wiki.musl-libc.org/guidelines-for-distributions.html introduces this.
gcc/
* config/loongarch/gnu-user.h (MUSL_DYNAMIC_LINKER: Redef
Ok, I will do spec performance test comparison as soon as possible.
Thanks!
On 2023/4/23 at 9:19 PM, Xi Ruoyao wrote:
This commit implements the target macros for shrink wrapping of function
prologues/epilogues on LoongArch.
Bootstrapped and regtested on loongarch64-linux-gnu. I don't
+guojie
On 2023/4/23 at 9:19 PM, Xi Ruoyao wrote:
This commit implements the target macros for shrink wrapping of function
prologues/epilogues on LoongArch.
Bootstrapped and regtested on loongarch64-linux-gnu. I don't have
access to SPEC CPU so I hope the reviewer can perform a benc
Hi, ruoyao:
The SPEC2006 performance test is finished. The fixed-point
400.perlbench shows about a 3% performance improvement,
the other fixed-point tests are basically unchanged, and the floating-point
tests have basically remained the same.
Do you have any questions about the test cases mentioned
On 2023/4/26 at 6:02 PM, WANG Xuerui wrote:
On 2023/4/26 17:53, Lulu Cheng wrote:
Hi, ruoyao:
The SPEC2006 performance test is finished. The fixed-point
400.perlbench shows about a 3% performance improvement,
the other fixed-point tests are basically unchanged, and the floating-point tests
have basically
On 2023/2/4 at 1:50 AM, Xi Ruoyao wrote:
We can use bytepick.[wd] for
a << (8 * x) | b >> (8 * (sizeof(a) - x))
while a and b are uint32_t or uint64_t. This is useful for some cases,
for example:
https://sourceware.org/pipermail/libc-alpha/2023-February/145203.html
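As a rough illustration (not part of the patch), the expression above can be written in plain C as below. The function name combine_bytes32 is made up for this sketch, and x is restricted to 1..3 because shifting a uint32_t by 32 bits is undefined behavior in C. Whether a given GCC actually selects bytepick.w for this function is not something the sketch can show.

```c
#include <assert.h>
#include <stdint.h>

/* Combine the low bytes of a with the high bytes of b:
   a << (8 * x) | b >> (8 * (sizeof (a) - x)).
   x must be in 1..3 so that neither shift count reaches the type width.
   (Hypothetical helper name, for illustration only.)  */
static uint32_t
combine_bytes32 (uint32_t a, uint32_t b, unsigned x)
{
  assert (x >= 1 && x < sizeof (a));
  return a << (8 * x) | b >> (8 * (sizeof (a) - x));
}
```

For example, with a = 0x11223344, b = 0xAABBCCDD and x = 1, the three low bytes of a move up by one byte and the top byte of b fills the vacated low byte.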
Bootstrapped and regtested o
C++2017 and previous standard description:
The value of E1 << E2 is E1 left-shifted E2 bit positions;
vacated bits are zero-filled. If E1 has an unsigned type,
the value of the result is E1 × 2^E2, reduced modulo one more
than the maximum value representable in the result type.
Otherwise, if E1 has a
gcc/ChangeLog:
* config/loongarch/loongarch.cc (loongarch_classify_address):
Add processing for CONST_INT.
(loongarch_print_operand): Add processing of '%c'.
gcc/testsuite/ChangeLog:
* gcc.target/loongarch/tst-asm-const.c: Moved to...
* gcc.target
bootstrap-ubsan. And the compiled result of
imm-load1.c seems OK.
And it's doing the correct thing for the Glibc "improved generic string
functions" patch, producing some really tight loops now.
On Thu, 2022-11-17 at 17:59 +0800, Lulu Cheng wrote:
v1 -> v2:
1. Change the code format.
gcc/ChangeLog:
* doc/rtl.texi: Correct a clerical error in the document.
---
gcc/doc/rtl.texi | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/gcc/doc/rtl.texi b/gcc/doc/rtl.texi
index 43c9ee8bffe..44858d12892 100644
--- a/gcc/doc/rtl.texi
+++ b/gcc/doc/rtl.texi
@@ -214
On 2022/12/7 at 6:05 PM, Richard Sandiford wrote:
Lulu Cheng writes:
gcc/ChangeLog:
* doc/rtl.texi: Correct a clerical error in the document.
---
gcc/doc/rtl.texi | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/gcc/doc/rtl.texi b/gcc/doc/rtl.texi
index 43c9ee8bffe
There are descriptions of '%c', '%n', '%a' and '%l' in section 17.5 of gccint.pdf.
So I can understand that these descriptors are the ones that the common code
implementation back end has to support, right?
But I don't see the use of these descriptors in gcc.pdf. Now I want to add the
descriptor informa
/* snip */
diff --git a/gcc/testsuite/gcc.target/loongarch/add-const.c
b/gcc/testsuite/gcc.target/loongarch/add-const.c
new file mode 100644
index 000..3a9f72fe83d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/loongarch/add-const.c
@@ -0,0 +1,47 @@
+/* { dg-do compile } */
+/* { dg-opt
On 2023/4/4 at 4:38 PM, Xi Ruoyao wrote:
1. Use addu16i.d for TARGET_64BIT and suitable immediates.
2. Split one addition with immediate into two addu16i.d or addi.{d/w}
instructions if possible. This can avoid using a temp register without
increasing the count of instructions.
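To make the splitting idea concrete, here is a minimal sketch in C of one of the cases: writing an immediate as one addu16i.d part plus one addi.d part. It assumes addu16i.d adds a signed 16-bit immediate shifted left by 16 bits and addi.d adds a signed 12-bit immediate. The helper split_add_imm is hypothetical and is not the code in the patch, which also covers other combinations (two addi or two addu16i.d).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Try to write imm as hi16 * 65536 + lo12, where hi16 fits a signed
   16-bit immediate and lo12 fits a signed 12-bit immediate.
   Returns false if no such pair exists.
   (Hypothetical helper for illustration, not a GCC function.)  */
static bool
split_add_imm (int64_t imm, int64_t *hi16, int64_t *lo12)
{
  assert (hi16 && lo12);

  /* Reject values outside the reachable range first; this also keeps
     the arithmetic below free of signed overflow.  */
  if (imm < INT64_C (-32768) * 65536 - 2048
      || imm > INT64_C (32767) * 65536 + 2047)
    return false;

  int64_t low = imm & 0xffff;   /* low 16 bits, 0..65535 */
  int64_t lo;

  if (low < 0x800)
    lo = low;                   /* 0..2047 */
  else if (low >= 0xf800)
    lo = low - 0x10000;         /* -2048..-1 */
  else
    return false;               /* low part does not fit 12 signed bits */

  *hi16 = (imm - lo) / 65536;   /* exact: imm - lo is a multiple of 65536 */
  *lo12 = lo;
  return true;
}
```

With this split, x + 0x12340010 needs no temporary register: the high part 0x1234 and the low part 0x10 can each be added directly to x.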
Inspired by https://
On 2023/4/4 at 4:40 PM, Xi Ruoyao wrote:
On Tue, 2023-04-04 at 16:00 +0800, Xi Ruoyao via Gcc-patches wrote:
On Tue, 2023-04-04 at 11:01 +0800, Lulu Cheng wrote:
/* snip */
+unsigned long f10 (unsigned long x) { return x - 0x8000l * 2; }
+unsigned long f11 (unsigned long x) { return x
gcc/ChangeLog:
* doc/extend.texi: Add section for LoongArch BASE Built-in functions.
---
gcc/doc/extend.texi | 89 +
1 file changed, 89 insertions(+)
diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index 3adb67aa47a..417af6c368d 100644
-
gcc/ChangeLog:
* doc/extend.texi: Add section for LoongArch Base Built-in functions.
---
gcc/doc/extend.texi | 129
1 file changed, 129 insertions(+)
---
v1 -> v2:
(1) Does not use i8, u8, i16, u16 etc.
(2) Add the description informati
Sorry, it's my question. I still have some questions that I haven't
understood, so I haven't replied to the email yet.:-(
On 2023/4/10 at 5:04 PM, Xi Ruoyao wrote:
Ping. Or maybe I've lost some replies here because my mail server
crashed several days ago :).
On Wed, 2023-03-29 at 02:01 +0800, Xi Ruo
In some cases, setting this macro as the default can reduce the number of
conditional branch instructions.
gcc/ChangeLog:
* config/loongarch/loongarch.h (LOGICAL_OP_NON_SHORT_CIRCUIT): Remove
the macro definition.
---
gcc/config/loongarch/loongarch.h | 1 -
1 file changed, 1 de
On 2023/4/13 at 8:24 PM, Xi Ruoyao wrote:
On Thu, 2023-04-13 at 19:51 +0800, Lulu Cheng wrote:
In some cases, setting this macro as the default can reduce the number of
conditional branch instructions.
gcc/ChangeLog:
* config/loongarch/loongarch.h (LOGICAL_OP_NON_SHORT_CIRCUIT): Remove
Pushed to r14-15.
Because of a delay on my side, this modification missed the creation
of the releases/gcc-13 branch.
Can I still submit this modification to releases/gcc-13? :-(
On 2023/4/13 at 8:24 PM, Xi Ruoyao wrote:
On Thu, 2023-04-13 at 19:51 +0800, Lulu Cheng wrote:
In some cases
Pushed to r14-14.
On 2023/4/7 at 4:38 PM, Lulu Cheng wrote:
gcc/ChangeLog:
* doc/extend.texi: Add section for LoongArch Base Built-in functions.
---
gcc/doc/extend.texi | 129
1 file changed, 129 insertions(+)
---
v1 -> v2:
(1) Does
---
htdocs/gcc-13/changes.html | 39 ++
1 file changed, 39 insertions(+)
diff --git a/htdocs/gcc-13/changes.html b/htdocs/gcc-13/changes.html
index f3b9afed..c75e341b 100644
--- a/htdocs/gcc-13/changes.html
+++ b/htdocs/gcc-13/changes.html
@@ -563,6 +563,45 @@
---
htdocs/gcc-13/changes.html | 41 ++
1 file changed, 41 insertions(+)
---
v1 -> v2: Modify syntax errors and description information.
diff --git a/htdocs/gcc-13/changes.html b/htdocs/gcc-13/changes.html
index f3b9afed..4324c2d1 100644
--- a/htdocs/gcc-13/
On 2023/4/18 at 2:44 PM, Gerald Pfeifer wrote:
Here, and in the other cases, the closing (that I marked above)
should follow , since both the heading and the are part of the
same list item.
(See the RISC-V entry, for example.)
This change is fine with the changes highlighted above. (If you prefer
On 2023/4/18 at 3:29 PM, WANG Xuerui wrote:
Hi,
Just some minor fixes ;-)
On 2023/4/18 14:15, Lulu Cheng wrote:
---
htdocs/gcc-13/changes.html | 39 ++
1 file changed, 39 insertions(+)
diff --git a/htdocs/gcc-13/changes.html b/htdocs/gcc-13/changes.html
index
---
htdocs/gcc-13/changes.html | 42 ++
1 file changed, 42 insertions(+)
---
v1 -> v2: Modify syntax errors and description information.
v2 -> v3: Modify some description information.
diff --git a/htdocs/gcc-13/changes.html b/htdocs/gcc-13/changes.html
index
On 2023/4/18 at 5:27 PM, Xi Ruoyao wrote:
On Mon, 2023-04-10 at 17:45 +0800, Lulu Cheng wrote:
Sorry, it's my question. I still have some questions that I haven't
understood, so I haven't replied to the email yet.:-(
I've verified the value of cfun->va_list_gpr_size with
On 2023/4/18 at 7:48 PM, Xi Ruoyao wrote:
On Tue, 2023-04-18 at 19:21 +0800, Lulu Cheng wrote:
On 2023/4/18 at 5:27 PM, Xi Ruoyao wrote:
On Mon, 2023-04-10 at 17:45 +0800, Lulu Cheng wrote:
Sorry, it's my question. I still have some questions that I haven't
understood, so I haven't replied t
Hi, ruoyao:
Thank you so much for making this submission. However, we are testing the
impact of these two alignment parameters
(together with -falign-jumps and -falign-labels) on performance, so
this patch will not be merged into the main branch until
the results are available.
Pushed to r15-4130.
On 2024/7/11 at 7:43 PM, Xi Ruoyao wrote:
This is per the request from the kernel developers. For generating the
ORC unwind info, the objtool program needs to analyze the control flow
of a .o file. If a jump table is used, objtool has to correlate the
jump instruction with the tab
On 2024/11/2 at 1:10 AM, Xi Ruoyao wrote:
On Thu, 2024-10-31 at 23:58 +0800, Xi Ruoyao wrote:
/* snip */
---
Now running bootstrap & regtest. Posted early as a context for some
LLVM patch. I'll post the regtest result once it finishes.
Done, no regressions.
The LLVM patch is https://github.com/
[x]vldi.{b/h/w/d} is not implemented in LoongArch.
Use the macro [x]vrepli.{b/h/w/d} instead.
gcc/ChangeLog:
* config/loongarch/lasx.md: Fixed.
* config/loongarch/lsx.md: Fixed.
---
gcc/config/loongarch/lasx.md | 2 +-
gcc/config/loongarch/lsx.md | 2 +-
2 files changed, 2 in
On 2024/11/2 at 1:36 AM, Xi Ruoyao wrote:
Without optimization, GCC does not emit a jump table for the test case.
I'm not sure if the test case has been wrong in the first place or
something has changed in these months...
It was r15-4756 that turned -fjump-tables off at -O0.
I wa
Lulu Cheng (2):
LoongArch: Remove redundant code.
LoongArch: Modify the document to remove options that don't exist.
gcc/config/loongarch/loongarch-builtins.cc | 102 -
gcc/config/loongarch/loongarch-protos.h| 1 -
gcc/config/loongarch/loongarch.cc
TARGET_ASM_ALIGNED_{HI,SI,QI}_OP are defined more than once; delete the duplicate definitions.
gcc/ChangeLog:
* config/loongarch/loongarch-builtins.cc
(loongarch_builtin_vectorized_function): Delete.
(LARCH_GET_BUILTIN): Delete.
* config/loongarch/loongarch-protos.h
(loongarch_built
gcc/ChangeLog:
* doc/invoke.texi: Remove the non-existent option
'-msmall-data-limit' and add a description of '-G'.
---
gcc/doc/invoke.texi | 10 +-
1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index fd6c0c44709..
Pushed to r15-4588
On 2024/1/27 at 3:09 PM, Yang Yujie wrote:
This patch corresponds to the upstream PR:
https://github.com/libffi/libffi/pull/817
libffi/ChangeLog:
* src/loongarch64/ffi.c: Avoid defining floats
in struct call_context if the ABI is soft-float.
---
libffi/src/loongarch
Pushed to r15-5583 and r15-5584.
On 2024/11/2 at 10:48 AM, Lulu Cheng wrote:
Lulu Cheng (2):
LoongArch: Remove redundant code.
LoongArch: Modify the document to remove options that don't exist.
gcc/config/loongarch/loongarch-builtins.cc | 102 -
gcc/config/loon
gcc/ChangeLog:
* config/g.opt.urls: Regenerate.
* config/i386/nto.opt.urls: Regenerate.
* config/riscv/riscv.opt.urls: Regenerate.
* config/rx/rx.opt.urls: Regenerate.
* config/sol2.opt.urls: Regenerate.
---
gcc/config/g.opt.urls | 2 +-
gcc/confi
On 2024/11/28 at 9:26 AM, Jinyang He wrote:
For {xv,v}{srl,sll,sra}, the constraint `vector_same_uimm6` causes overflow
when emitting {w,h,b}. Since the number of bits shifted is the remainder of
the register value, it is actually unnecessary to constrain the range.
Simply mask the shift number with the
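As a rough model of the behaviour described here (the shift count is taken modulo the element width), one lane of such a shift can be sketched in C as follows. The function lane_shift_left and its parameters are invented for this illustration; the actual change is to the machine description and is not shown in this excerpt.

```c
#include <assert.h>
#include <stdint.h>

/* Model of one lane of a vector left shift whose count is reduced
   modulo the element width.  elem_bits is 8, 16 or 32 in this sketch.
   (Hypothetical helper, for illustration only.)  */
static uint32_t
lane_shift_left (uint32_t value, unsigned count, unsigned elem_bits)
{
  assert (elem_bits == 8 || elem_bits == 16 || elem_bits == 32);

  /* For a power-of-two width, count % elem_bits == count & (elem_bits - 1).  */
  unsigned masked = count & (elem_bits - 1);

  /* Keep only the bits that belong to the lane.  */
  uint32_t lane_mask = elem_bits == 32 ? UINT32_MAX
                                       : (UINT32_C (1) << elem_bits) - 1;
  return (value << masked) & lane_mask;
}
```

Under this model a count of 9 on a byte lane behaves like a count of 1, which is why an immediate larger than the element width can be masked rather than rejected.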
On 2024/11/27 at 3:10 PM, Xi Ruoyao wrote:
On Wed, 2024-11-27 at 14:24 +0800, Lulu Cheng wrote:
On 2024/11/27 at 12:06 PM, Xi Ruoyao wrote:
On Wed, 2024-11-27 at 11:58 +0800, Lulu Cheng wrote:
--- /dev/null
+++ b/gcc/testsuite/gcc.target/loongarch/vector/lasx/lasx-shift-sameimm-vec.c
@@ -0,0 +1,72
On 2024/11/27 at 10:14 AM, Xi Ruoyao wrote:
On Tue, 2024-11-26 at 18:37 +0800, Jinyang He wrote:
For {xv,v}{srl,sll,sra}, the constraint `vector_same_uimm6` causes overflow
when emitting {w,h,b}. Since the number of bits shifted is the remainder of
the register value, it is actually unnecessary to const
On 2024/11/27 at 12:06 PM, Xi Ruoyao wrote:
On Wed, 2024-11-27 at 11:58 +0800, Lulu Cheng wrote:
--- /dev/null
+++ b/gcc/testsuite/gcc.target/loongarch/vector/lasx/lasx-shift-sameimm-vec.c
@@ -0,0 +1,72 @@
+/* Test shift bits overflow in vector */
+/* { dg-do compile } */
+/* { dg-options "-mlas
Pushed to r15-5817.
On 2024/11/26 at 4:06 PM, Lulu Cheng wrote:
r15-5327 changed the default language version for C compilation from
-std=gnu17 to -std=gnu23.
ISO C99 and C11 allow ceil, floor, round and trunc, and their float and
long double variants, to raise the “inexact” exception,
but ISO/IEC
Pushed to r15-5818.
On 2024/11/26 at 4:06 PM, Lulu Cheng wrote:
Add '-fdump-tree-optimized' to these test cases.
gcc/testsuite/ChangeLog:
* gcc.target/loongarch/lasx-andn-iorn.c:
Add '-fdump-tree-optimized'.
* gcc.target/loongarch/lsx-andn-iorn.c:
Pushed to r15-5819.
On 2024/11/28 at 9:26 AM, Jinyang He wrote:
For {xv,v}{srl,sll,sra}, the constraint `vector_same_uimm6` causes overflow
when emitting {w,h,b}. Since the number of bits shifted is the remainder of
the register value, it is actually unnecessary to constrain the range.
Simply mask the sh
r15-5327 changed the default language version for C compilation from
-std=gnu17 to -std=gnu23.
ISO C99 and C11 allow ceil, floor, round and trunc, and their float and
long double variants, to raise the “inexact” exception,
but ISO/IEC TS 18661-1:2014, the C bindings to IEEE 754-2008, as
integra
Add '-fdump-tree-optimized' to these test cases.
gcc/testsuite/ChangeLog:
* gcc.target/loongarch/lasx-andn-iorn.c:
Add '-fdump-tree-optimized'.
* gcc.target/loongarch/lsx-andn-iorn.c:
Likewise.
---
gcc/testsuite/gcc.target/loongarch/lasx-andn-iorn.c | 2 +-
gcc/test
After changing this cost from 1 to 3, the performance of the SPEC2006
benchmarks 401, 473, 416, 465 and 482 improves by about 2% on LA664.
Add option '-maddr-reg-reg-cost='.
gcc/ChangeLog:
* config/loongarch/genopts/loongarch.opt.in: Add
option '-maddr-reg-reg-cost='.
* config/loongarch
Pushed to r15-6477.
On 2024/12/25 at 5:59 PM, Jiahao Xu wrote:
In order to support vectorization of loops with multiple exits, this
patch adds the implementation of the conditional branch optab for
LoongArch LSX/LASX instructions.
This patch causes the gen-vect-{2,25}.c tests to fail. This is because
Pushed to r15-6487.
On 2024/12/30 at 10:34 AM, Guo Jie wrote:
gcc/ChangeLog:
* config/loongarch/lasx.md: Remove useless code.
* config/loongarch/lsx.md: Ditto.
---
gcc/config/loongarch/lasx.md | 66
gcc/config/loongarch/lsx.md | 35 -
Pushed to r15-6492.
On 2024/12/30 at 3:12 PM, Guo Jie wrote:
gcc/ChangeLog:
* config/loongarch/lasx.md (lasx_xvabsd_s_): Remove.
(abd3): New insn pattern.
(lasx_xvabsd_u_): Remove.
* config/loongarch/loongarch-builtins.cc (CODE_FOR_lsx_vabsd_b):
Rename.
(
Pushed to r15-6490.
On 2024/12/30 at 10:38 AM, Guo Jie wrote:
For some instruction patterns with commutative operands,
the order of operands needs to be adjusted to match the rules.
gcc/ChangeLog:
* config/loongarch/loongarch.md
(bytepick_d__rev): New combiner.
(bstrpick_alsl_p
Pushed to r15-6489.
On 2024/12/30 at 10:37 AM, Guo Jie wrote:
There are two aspects that affect the matching of instruction templates:
1. vec_duplicate is redundant in the following operations.
set (match_operand:V4DI ...)
(vec_duplicate:V4DI (vec_select:V4DI ...))
2. The range of values
Pushed to r15-6491.
On 2024/12/30 at 10:38 AM, Guo Jie wrote:
gcc/ChangeLog:
* config/loongarch/lasx.md (vec_unpacks_lo_): Redefine.
(vec_unpacku_lo_): Ditto.
(lasx_vext2xv_h_b): Replaced by vec_unpack_lo_v32qi.
(vec_unpack_lo_v32qi): New insn.
(lasx_vext2xv_w_h)
Pushed to r15-6493.
On 2024/12/30 at 10:39 AM, Guo Jie wrote:
The optimization example is as follows.
From:
if (condition)
dest += 1 << 16;
To:
dest += (condition ? 1 : 0) << 16;
It does not use maskeqz and masknez, thus reducing the number of
instructions.
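The two forms in the example above can be compared directly in C. The sketch below is only an illustration of the source-level equivalence; the function names are made up here, and it says nothing about which instructions a particular GCC emits for either form.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Original form: a conditional guards the addition.  */
static uint64_t
add_with_branch (uint64_t dest, bool condition)
{
  if (condition)
    dest += 1 << 16;
  return dest;
}

/* Rewritten form: the condition becomes a 0/1 value that is shifted
   and added unconditionally.  */
static uint64_t
add_without_branch (uint64_t dest, bool condition)
{
  dest += (uint64_t) (condition ? 1 : 0) << 16;
  return dest;
}

/* Both forms must agree for either value of the condition.  */
static void
check_equivalence (uint64_t dest)
{
  assert (add_with_branch (dest, true) == add_without_branch (dest, true));
  assert (add_with_branch (dest, false) == add_without_branch (dest, false));
}
```

Since the condition is already 0 or 1, the second form needs only a shift and an add on that value, with no conditional select.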
gcc/ChangeLog:
* config
Pushed to r15-6488.
On 2024/12/30 at 10:37 AM, Guo Jie wrote:
The xvexth related instructions operate SEPARATELY according to
the high and low 128 bits, and sign/zero extend the upper half
of every 128 bits in src to the corresponding 128 bits in dest.
For xvexth.d.w, the rule for the first element of
On 2025/1/7 at 12:47 PM, chenxiaolong wrote:
When analyzing 525 on the LoongArch architecture, it was found that the
for loop of the hotspot function x264_pixel_satd_8x4 could not be vectorized
to 256 bits due to the cost setting of vec_construct. After re-adjusting
vec_construct, the performance of 525 program
__attribute__ ((target ("{no-}lsx")))
__attribute__ ((target ("{no-}lasx")))
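A minimal usage sketch of the attribute spelling listed above, assuming a GCC that already includes this series. The macro WITH_LSX and the function add_arrays are names invented for the example; the guard on __loongarch__ only exists so the sketch still builds on other targets, where the attribute is left out.

```c
#include <assert.h>

/* Apply the attribute only when compiling for LoongArch; elsewhere the
   macro expands to nothing so the sketch still builds.
   (WITH_LSX is a hypothetical convenience macro.)  */
#if defined (__loongarch__)
# define WITH_LSX __attribute__ ((target ("lsx")))
#else
# define WITH_LSX
#endif

/* Element-wise addition; with the attribute in effect the compiler is
   allowed to use LSX instructions for this function only.  */
WITH_LSX static void
add_arrays (int *dst, const int *a, const int *b, int n)
{
  assert (n >= 0);
  for (int i = 0; i < n; i++)
    dst[i] = a[i] + b[i];
}
```

The "no-lsx" and "{no-}lasx" spellings would be used the same way, for example to keep vector instructions out of one function in a file otherwise built with them enabled.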
Lulu Cheng (2):
LoongArch: Implement target attribute.
LoongArch: Implement target pragma.
gcc/attr-urls.def | 6 +
gcc/config.gcc| 2 +-
Add function attributes support for LoongArch.
Currently, the following items are supported:
__attribute__ ((target ("{no-}strict-align")))
__attribute__ ((target ("cmodel=")))
__attribute__ ((target ("arch=")))
__attribute__ ((target ("tune=")))
__attribut
Pushed to r15-6617.
On 2024/12/31 at 7:33 PM, Deng Jianbo wrote:
LoongArch currently uses the instruction movgr2fr.{d|w} to move zero
from a fixed-point register to a floating-point register when initializing
an fp register to zero. When LSX or LASX is enabled, we can use the instruction
vxor.v which has lower la
On 2025/1/2 at 5:46 PM, Zhou Zhao wrote:
If an SImode register is left-shifted twice in succession, combine the
related instructions into one.
gcc/ChangeLog:
* config/loongarch/loongarch.md (extsv_ashlsi3):
New template
Hi, zhaozhou:
The indentation here is wrong, it needs to be aligned with *.
The target pragmas defined correspond to the target function attributes.
This implementation is derived from AArch64.
gcc/ChangeLog:
* config/loongarch/loongarch-protos.h
(loongarch_reset_previous_fndecl): Add function declaration.
(loongarch_save_restore_target_globals)
Pushed to r15-6445.
On 2024/12/18 at 3:45 PM, Jiahao Xu wrote:
We can't vectorize the code into instructions like vslti.w that compare
with immediate_operand, because we miss immediate_operand support for
integer comparisons.
gcc/ChangeLog:
* config/loongarch/lasx.md (vec_cmp): Remove.
Pushed to r14-11275 and r15-7386.
On 2025/1/23 at 11:44 AM, Lulu Cheng wrote:
PR target/118561
gcc/ChangeLog:
* config/loongarch/loongarch-builtins.cc
(loongarch_expand_builtin_lsx_test_branch):
NULL_RTX will not be returned when an error is detected
On 2025/2/7 at 7:51 PM, Xi Ruoyao wrote:
Now that the C default is C23, we can no longer use LSX/LASX instructions
for these operations as the standard disallows raising INEXACT
exceptions. So LoongArch is no longer suitable for these effective
targets.
Fix the test failures on gcc.dg/vect/vect-roundi
On 2025/1/20 at 9:30 AM, Xi Ruoyao wrote:
For mask{eq,ne}z, rk is always compared with 0 in the full width, thus
the mode for rk should be X.
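The point about the full-width comparison can be modelled in C. The sketch assumes the usual semantics of the two instructions on a 64-bit register file (maskeqz yields 0 when rk is zero and rj otherwise; masknez is the opposite); the helper names simply mirror the mnemonics and are not GCC functions.

```c
#include <assert.h>
#include <stdint.h>

/* C model of the two instructions on 64-bit registers.  */
static uint64_t
maskeqz (uint64_t rj, uint64_t rk)
{
  return rk == 0 ? 0 : rj;
}

static uint64_t
masknez (uint64_t rj, uint64_t rk)
{
  return rk != 0 ? 0 : rj;
}

/* cond ? a : b built from the two masks.  */
static uint64_t
select64 (uint64_t cond, uint64_t a, uint64_t b)
{
  return maskeqz (a, cond) | masknez (b, cond);
}

/* A value whose low 32 bits are zero is still non-zero in full width,
   so treating rk as a 32-bit quantity would give the wrong answer.  */
static void
check_full_width (void)
{
  uint64_t rk = UINT64_C (1) << 32;
  assert (maskeqz (7, rk) == 7);
  assert (masknez (7, rk) == 0);
}
```

Here rk has all of its low 32 bits clear, yet both instructions treat it as non-zero, which is the behaviour a word-sized (X) mode for rk describes.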
LGTM!
I agree with your point of view.
Thank you.
I found the issue reviewing a patch fixing a similar issue for RISC-V
XTheadCondMov [1], but interestin
On 2025/2/7 at 8:09 PM, Xi Ruoyao wrote:
/* snip */
-
-(define_insn "lasx_xvpickev_w"
- [(set (match_operand:V8SI 0 "register_operand" "=f")
- (vec_select:V8SI
- (vec_concat:V16SI
- (match_operand:V8SI 1 "register_operand" "f")
- (match_operand:V8SI 2 "register_operan
On 2025/2/11 at 4:37 PM, Xi Ruoyao wrote:
On Tue, 2025-02-11 at 15:48 +0800, Lulu Cheng wrote:
Hi,
I think the "{lsx_,lasx_x}hv{add,sub}w" in the title should be
"{lsx_,lasx_x}vh{add,sub}w".
Indeed.
On 2025/2/7 at 8:09 PM, Xi Ruoyao wrote:
Like what we've done for {lsx_,las
Hi,
I think the "{lsx_,lasx_x}hv{add,sub}w" in the title should be
"{lsx_,lasx_x}vh{add,sub}w".
On 2025/2/7 at 8:09 PM, Xi Ruoyao wrote:
Like what we've done for {lsx_,lasx_x}v{add,sub,mul}l{ev,od}, use
special predicates and TImode RTL instead of hard-coded const vectors
and UNSPECs.
/* snip */
It seems that the title here is "{lsx_,lasx_x}vmaddw".
On 2025/2/7 at 8:09 PM, Xi Ruoyao wrote:
Like what we've done for {lsx_,lasx_x}v{add,sub,mul}l{ev,od}, use
special predicates and TImode RTL instead of hard-coded const vectors
and UNSPECs.
Also reorder two operands of the outer plus in the templat
PR target/118828
gcc/ChangeLog:
* config/loongarch/loongarch-c.cc (loongarch_pragma_target_parse):
Update the predefined macros.
gcc/testsuite/ChangeLog:
* gcc.target/loongarch/pr118828.c: New test.
Change-Id: I13f7b44b11bba2080db797157a0389cc1bd65ac6
---
gcc/co
Split the implementation of the function loongarch_cpu_cpp_builtins into two
parts:
1. Macro definitions that do not change (only considering 64-bit architecture)
2. Macro definitions that change with different compilation options.
gcc/ChangeLog:
* config/loongarch/loongarch-c.cc (bu
Refer to the implementation of aarch64 to fix PR118828.
Lulu Cheng (3):
LoongArch: Move the function loongarch_register_pragmas to
loongarch-c.cc.
LoongArch: Split the function loongarch_cpu_cpp_builtins into two
functions.
LoongArch: After setting the compilation options, update
gcc/ChangeLog:
* config/loongarch/loongarch-target-attr.cc
(loongarch_pragma_target_parse): Move to ...
(loongarch_register_pragmas): Move to ...
* config/loongarch/loongarch-c.cc
(loongarch_pragma_target_parse): ... here.
(loongarch_register_pragmas
PR target/118828
gcc/ChangeLog:
* config/loongarch/loongarch-c.cc (loongarch_pragma_target_parse):
Update the predefined macros.
gcc/testsuite/ChangeLog:
* gcc.target/loongarch/pr118828.c: New test.
* gcc.target/loongarch/pr118828-2.c: New test.
*
gcc/ChangeLog:
* config/loongarch/loongarch-target-attr.cc
(loongarch_pragma_target_parse): Move to ...
(loongarch_register_pragmas): Move to ...
* config/loongarch/loongarch-c.cc
(loongarch_pragma_target_parse): ... here.
(loongarch_register_pragmas
PR target/118843
gcc/ChangeLog:
* config/loongarch/loongarch-c.cc
(loongarch_update_cpp_builtins): Fix macro definition issues.
gcc/testsuite/ChangeLog:
* gcc.target/loongarch/pr118843.c: New test.
Change-Id: I777e46ccbc80bfa8948e7d416ac86853c8f4c16d
---
gcc/co
v1 -> v2:
1. Move __loongarch_{arch,tune} _LOONGARCH_{ARCH,TUNE}
__loongarch_{div32,am_bh,amcas,ld_seq_sa} and
__loongarch_version_major/__loongarch_version_minor to update function.
2. Fixed PR118843.
3. Add testsuites.
Lulu Cheng (4):
LoongArch: Move the funct
Split the implementation of the function loongarch_cpu_cpp_builtins into two
parts:
1. Macro definitions that do not change (only considering 64-bit architecture)
2. Macro definitions that change with different compilation options.
gcc/ChangeLog:
* config/loongarch/loongarch-c.cc (bu
On 2024/12/17 at 12:30 PM, Xi Ruoyao wrote:
On Tue, 2024-12-17 at 11:27 +0800, Lulu Cheng wrote:
On 2024/12/16 at 9:20 PM, Xi Ruoyao wrote:
/* snip */
+;; For HImode it's a little complicated...
+(define_expand "rbithi"
I didn't find rbithi's template description. Are there any tes
On 2024/12/16 at 9:20 PM, Xi Ruoyao wrote:
/* snip */
+;; For HImode it's a little complicated...
+(define_expand "rbithi"
I didn't find rbithi's template description. Are there any test cases?
+ [(match_operand:HI 0 "register_operand")
+ (match_operand:HI 1 "register_operand")]
+ ""
+ {
+r
On 2024/12/16 at 9:19 PM, Xi Ruoyao wrote:
A generic CRC optimization pass has been implemented in r15-5850. But
without target-specific code, it'll only optimize the CRC loop to a
table lookup. With LoongArch-specific code we can do it better: for
64-bit LoongArch and the IEEE 802.3 polynomial or th
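For context, this is the kind of bit-at-a-time CRC loop being discussed, written as a plain C sketch using the reflected IEEE 802.3 polynomial 0xEDB88320. The function name is made up, and the sketch makes no claim about whether the CRC pass recognizes this exact loop shape or what code it produces for it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bit-at-a-time CRC-32 with the reflected IEEE 802.3 polynomial.
   (Illustrative helper, not taken from the patch.)  */
static uint32_t
crc32_bitwise (const unsigned char *data, size_t len)
{
  assert (data != NULL || len == 0);

  uint32_t crc = 0xFFFFFFFFu;          /* conventional initial value */
  for (size_t i = 0; i < len; i++)
    {
      crc ^= data[i];
      for (int bit = 0; bit < 8; bit++)
        /* If the low bit is set, shift and fold in the polynomial;
           otherwise just shift.  -(crc & 1u) is all-ones or zero.  */
        crc = (crc >> 1) ^ (0xEDB88320u & -(crc & 1u));
    }
  return ~crc;                         /* conventional final inversion */
}
```

The standard check input "123456789" gives 0xCBF43926 with this variant, which is a convenient way to validate any faster replacement for the loop.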
Pushed to r15-5580.
We searched in the multimedia package and found no cases of using
__builtin_lsx_vorn_v or __builtin_lasx_xvorn_v,
so the interface type has been modified in the form of a bugfix.
Thanks!
On 2024/10/31 at 11:58 PM, Xi Ruoyao wrote:
Align them with other vector bitwise builtins.
Pushed to r14-10960.
On 2024/11/22 at 9:52 AM, Lulu Cheng wrote:
Pushed to r15-5580.
We searched in the multimedia package and found no cases of using
__builtin_lsx_vorn_v or __builtin_lasx_xvorn_v,
so the interface type has been modified in the form of a bugfix.
Thanks!
On 2024/10/31 at 11:58 PM
Pushed to r15-5581 and r14-10961.
On 2024/11/2 at 3:37 PM, Lulu Cheng wrote:
[x]vldi.{b/h/w/d} is not implemented in LoongArch.
Use the macro [x]vrepli.{b/h/w/d} instead.
gcc/ChangeLog:
* config/loongarch/lasx.md: Fixed.
* config/loongarch/lsx.md: Fixed.
---
gcc/config/loongarch
Pushed to r15-6817.
On 2025/1/10 at 10:27 AM, mengqinggang wrote:
Generate 0x1010 instead of 0x101>>12 for lu12i.w. lu32i.d and lu52i.d use
the same processing.
gcc/ChangeLog:
* config/loongarch/lasx.md: Use new loongarch_output_move.
* config/loongarch/loongarch-protos.h (loongarc
Pushed to r15-6755.
On 2025/1/6 at 4:16 PM, mengqinggang wrote:
Generate 0x1010 instead of 0x101>>12 for lu12i.w. lu32i.d and lu52i.d use
the same processing.
gcc/ChangeLog:
* config/loongarch/lasx.md: Use new loongarch_output_move.
* config/loongarch/loongarch-protos.h (loongarch_
On 2025/1/10 at 10:03 AM, Lulu Cheng wrote:
Pushed to r15-6755.
Sorry, I replied to the wrong email.
On 2025/1/6 at 4:16 PM, mengqinggang wrote:
Generate 0x1010 instead of 0x101>>12 for lu12i.w. lu32i.d and
lu52i.d use
the same processing.
gcc/ChangeLog:
* config/loongarch/lasx.md: U
Pushed to r15-6755.
On 2025/1/7 at 9:04 PM, chenxiaolong wrote:
When analyzing 525 on the LoongArch architecture, it was found that the
for loop of the hotspot function x264_pixel_satd_8x4 could not be vectorized
to 256 bits due to the cost setting of vec_construct. After re-adjusting
vec_construct, the performan