from:"Xi Ruoyao"

[PATCH] LoongArch: Replace UNSPEC_FCOPYSIGN with copysign RTL

2023-10-02 Thread Xi Ruoyao

When I added copysign support for LoongArch (r13-3702), we did not have a copysign RTL insn, so I had to use UNSPEC to represent the copysign instruction. Now the copysign RTX code has been added in r14-1586, so this patch removes those UNSPECs, and it uses the native RTL copysign insn. Inspired b

Re: [PATCH] Support g++ 4.8 as a host compiler.

2023-10-04 Thread Xi Ruoyao

ue using g++ 4.8 as a host compiler. AFAIK G++ 5.1 also has a bug (https://gcc.gnu.org/PR65801) breaking building recent GCC. I don't think it's really "maintainable" to ensure current GCC able to be built with a buggy host compiler. -- Xi Ruoyao School of Aerospace Science and Technology, Xidian University

Re: [PATCH] LoongArch: Reimplement multilib build option handling.

2023-10-07 Thread Xi Ruoyao

gt; > P.S. Currently support for "f32" is not active, and it should probably be > avoided if you want to build a working rootfs. -- Xi Ruoyao School of Aerospace Science and Technology, Xidian University

[PATCH] LoongArch: Use fcmp.caf.s instead of movgr2cf for zeroing a fcc

2023-10-17 Thread Xi Ruoyao

During the review of a LLVM change [1], on LA464 we found that zeroing a fcc with fcmp.caf.s is much faster than a movgr2cf from $r0. [1]: https://github.com/llvm/llvm-project/pull/69300 gcc/ChangeLog: * config/loongarch/loongarch.md (movfcc): Use fcmp.caf.s for zeroing a fcc. --

Pushed: [PATCH] LoongArch: Use fcmp.caf.s instead of movgr2cf for zeroing a fcc

2023-10-18 Thread Xi Ruoyao

On Wed, 2023-10-18 at 09:34 +0800, chenglulu wrote: > > 在 2023/10/17 下午10:24, WANG Xuerui 写道: > > > > On 10/17/23 22:06, Xi Ruoyao wrote: > > > During the review of a LLVM change [1], on LA464 we found that zeroing > > "an" LLVM change (because t

[PATCH 1/5] LoongArch: Add enum-style -mexplicit-relocs= option

2023-10-19 Thread Xi Ruoyao

To take a better balance between scheduling and relaxation when -flto is enabled, add three-way -mexplicit-relocs={auto,none,always} options. The old -mexplicit-relocs and -mno-explicit-relocs options are still supported, they are mapped to -mexplicit-relocs=always and -mexplicit-relocs=none. The

[PATCH 3/5] LoongArch: Use explicit relocs for TLS access with -mexplicit-relocs=auto

2023-10-19 Thread Xi Ruoyao

The linker does not know how to relax TLS access for LoongArch, so let's emit machine instructions with explicit relocs for TLS. gcc/ChangeLog: * config/loongarch/loongarch.cc (loongarch_explicit_relocs_p): Return true for TLS symbol types if -mexplicit-relocs=auto. (loong

[PATCH 5/5] LoongArch: Document -mexplicit-relocs={auto,none,always}

2023-10-19 Thread Xi Ruoyao

gcc/ChangeLog: * doc/invoke.texi (-mexplicit-relocs=style): Document. (-mexplicit-relocs): Document as an alias of -mexplicit-relocs=always. (-mno-explicit-relocs): Document as an alias of -mexplicit-relocs=none. (-mcmodel=extreme): Mention -mexplici

[PATCH 0/5] LoongArch: Better balance between relaxation and scheduling

2023-10-19 Thread Xi Ruoyao

allow the compiler to use explicit relocs for these cases, but assembler macros for other cases. Use it as the default if the assembler supports both explicit relocs and relaxation. LTO-bootstrapped and regtested on loongarch64-linux-gnu. Ok for trunk? Xi Ruoyao (5): LoongArch: Add enum-

[PATCH 2/5] LoongArch: Use explicit relocs for GOT access when -mexplicit-relocs=auto and LTO during a final link with linker plugin

2023-10-19 Thread Xi Ruoyao

If we are performing LTO for a final link and linker plugin is enabled, then we are sure any GOT access may resolve to a symbol out of the link unit (otherwise the linker plugin will tell us the symbol should be resolved locally and we'll use PC-relative access instead). Produce machine instructio

[PATCH 4/5] LoongArch: Use explicit relocs for addresses only used for one load or store with -mexplicit-relocs=auto and -mcmodel={normal, medium}

2023-10-19 Thread Xi Ruoyao

In these cases, if we use explicit relocs, we end up with 2 instructions: pcalau12it0, %pc_hi20(x) ld.d t0, t0, %pc_lo12(x) If we use la.local pseudo-op, in the best scenario (x is in +/- 2MiB range) we still have 2 instructions: pcaddi t0, %pcrel_20(x) ld.d

Re: [PATCH 2/5] LoongArch: Use explicit relocs for GOT access when -mexplicit-relocs=auto and LTO during a final link with linker plugin

2023-10-21 Thread Xi Ruoyao

ternal symbol c, the linker may relax "la.global c" to "la.local c" (if ab.o is linked together with another file c.o which contains the definition of c) or not. As we cannot exclude the possibility of a relaxation on la.global for incremental linking, just emit la.global and let the

Pushed: [PATCH 0/5] LoongArch: Better balance between relaxation and scheduling

2023-10-23 Thread Xi Ruoyao

Pushed r14-{4848..4852}. On Thu, 2023-10-19 at 22:02 +0800, Xi Ruoyao wrote: > For relaxation we are now generating assembler macros for symbolic > addresses everywhere, but this is limiting scheduling and there are > known situations where the relaxation cannot improve the code. >

[PATCH] LoongArch: Define HAVE_AS_TLS to 0 if it's undefined

2023-10-30 Thread Xi Ruoyao

Now loongarch.md uses HAVE_AS_TLS, we need this to fix the failure building a cross compiler if the cross assembler is not installed yet. gcc/ChangeLog: * config/loongarch/loongarch-opts.h (HAVE_AS_TLS): Define to 0 if not defined yet. --- Ok for trunk? gcc/config/loongarch/loo

Re: [PATCH] LoongArch: Define HAVE_AS_TLS to 0 if it's undefined

2023-10-30 Thread Xi Ruoyao

On Mon, 2023-10-30 at 19:50 +0800, chenglulu wrote: > 在 2023/10/30 下午7:42, Xi Ruoyao 写道: > > Now loongarch.md uses HAVE_AS_TLS, we need this to fix the failure > > building a cross compiler if the cross assembler is not installed yet. > > > > gcc/ChangeLog: >

Pushed: [PATCH v2] LoongArch: Define HAVE_AS_TLS to 0 if it's undefined [PR112299]

2023-10-30 Thread Xi Ruoyao

Pushed r14-5030. The subject and ChangeLog are updated to include the PR number. The code change is same as v1. On Mon, 2023-10-30 at 20:44 +0800, chenglulu wrote: > > 在 2023/10/30 下午8:26, Xi Ruoyao 写道: > > On Mon, 2023-10-30 at 19:50 +0800, chenglulu wrote: > > > 在

[PATCH] LoongArch: Disable relaxation if the assembler don't support conditional branch relaxation [PR112330]

2023-11-05 Thread Xi Ruoyao

As the commit message of r14-4674 has indicated, if the assembler does not support conditional branch relaxation, a relocation overflow may happen on conditional branches when relaxation is enabled because the number of NOP instructions inserted by the assembler will be more than the number estimat

[PATCH] LoongArch: Optimize single-used address with -mexplicit-relocs=auto for fld/fst

2023-11-05 Thread Xi Ruoyao

fld and fst have same address mode as ld.w and st.w, so the same optimization as r14-4851 should be applied for them too. gcc/ChangeLog: * config/loongarch/loongarch.md (LD_AT_LEAST_32_BIT): New mode iterator. (ST_ANY): New mode iterator. (define_peephole2): Use LD

[PATCH] LoongArch: Remove redundant barrier instructions before LL-SC loops

2023-11-06 Thread Xi Ruoyao

This is isomorphic to the LLVM changes [1-2]. On LoongArch, the LL and SC instructions has memory barrier semantics: - LL: + - SC: + But the compare and swap operation is allowed to fail, and if it fails the SC instruction is not executed, thus the guarantee of acquiring semantics cannot be

Re: [PATCH] MIPS: Fix PR target/98491 (ChangeLog)

2021-02-12 Thread Xi Ruoyao

Well, it just dislike my mail server :(. Switch to the mail server of my university. On 2021-02-12 22:54 +0800, Xi Ruoyao wrote: > Resend the mail. I had to fill in a form to send mail to Robert. > > On 2021-02-12 22:17 +0800, Xi Ruoyao wrote: > > On 2021-01-11 01:01 +0800,

[PATCH] Fix symver attribute with LTO

2019-12-16 Thread Xi Ruoyao

9.0 can be built with LTO enabled and no problem. I'll test symver patch and this patch with more packages. Bootstrapped/regtested x86_64-linux. I'm not a maintainer. gcc/ChangeLog: 2019-12-17 Xi Ruoyao * cgraph.h (symtab_node::used_from_object_file_p): Symver nodes are

Re: [PATCH] Fix symver attribute with LTO

2019-12-17 Thread Xi Ruoyao

as global in the version script). And foo (VERS_2) would be missing. With this patch foo (VERS_2) would show up. We can't mark "foo_v2" to be "global" because it should not be a part of DSO ABI. The other 1% chance would be a regression in Binutils. -- Xi Ruoyao School of Aerospace Science and Technology, Xidian University pr48200.tar.gz Description: application/compressed-tar

Re: [PATCH] Fix symver attribute with LTO

2019-12-18 Thread Xi Ruoyao

erminated. /usr/bin/ld: error: lto-wrapper failed collect2: error: ld returned 1 exit status make: *** [Makefile:4: obj/test.so] Error 1 The change to lto/lto-common.c makes sense. I tried it instead of my change to cgraph.h and everything is OK. I'll investigate the change to varasm.c a little. -- Xi Ruoyao School of Aerospace Science and Technology, Xidian University

Re: [PATCH] Fix symver attribute with LTO

2019-12-18 Thread Xi Ruoyao

snode->resolution = *res; >} I still believe we should consider symver targets to be externally visible in cgraph_externally_visible_p. There is a comment saying "if linker counts on us, we must preserve the function". That's true in our case. And, I think .globl .L

Re: [PATCH] Fix symver attribute with LTO

2019-12-18 Thread Xi Ruoyao

lly control foo_v2 > symbol (with LTO it will turn it into static symbol, it will inline > calls to it and do other things), while in the second case we need to > treat foo_v2 specially. > > So if it's safe we can force a ".globl foo_v2" before ".symver foo_v2, > > foo@@VERS_2". But I can't prove it's safe so I think it's better to > > consider > > this case in cgraph_externally_visible_p. > > It sort of makes things work, but for example it will prevent gcc from > inlining calls to foo_v2. I think we will still need to do something > about -fvisibility=hidden. > > It is sad that we do not have way to produce symbol version without a > corresponding symbol of global visiblity. If we had we could support > multiple symver aliases from one symbol and avoid the need to explicitly > hide the unnecesary symbols in the map files... Explicitly hiding the unnecessary symbols in map files is now how we handle this [with __asm__(".symver foo_v2, foo@@VERS_2")]. We can continue to do in this way and leave it as an enhancement. -- Xi Ruoyao School of Aerospace Science and Technology, Xidian University

Re: [PATCH] Fix symver attribute with LTO

2019-12-18 Thread Xi Ruoyao

ymver_export') or 2. Add some mangled symbol name in GCC (like ".LSYMVERx" or "foo_v2.symver_export"). -- Xi Ruoyao School of Aerospace Science and Technology, Xidian University

Re: [PATCH] Fix symver attribute with LTO

2019-12-19 Thread Xi Ruoyao

mbol is unused, never bring internal > - symbols that are declared by user as used or externally visible. > - This is needed for i.e. references from asm statements. */ > - if (used_from_object_file_p ()) > -return true; Are these two lines removed intentionally? >if (resoluti

Re: [PATCH] Fix symver attribute with LTO

2019-12-19 Thread Xi Ruoyao

On 2019-12-19 19:12 +0800, Xi Ruoyao wrote: > On 2019-12-19 11:06 +0100, Jan Hubicka wrote: > > - /* See if we have linker information about symbol not being used or > > - if we need to make guess based on the declaration. > > + /* Limitation of gas requires us to out

Re: [PATCH] LoongArch: Fix inconsistent description in *sge_

2024-03-04 Thread Xi Ruoyao

So allowing const_imm12_operand here makes no benefit. > "" > - "slti\t%0,%.,%1" > + "slt%i1\t%0,%.,%1" > [(set_attr "type" "slt") > (set_attr "mode" "")]) > -- Xi Ruoyao School of Aerospace Science and Technology, Xidian University

Re: [PATCH v2] LoongArch: Fix inconsistent description in *sge_

2024-03-05 Thread Xi Ruoyao

int 1)))] > "" > - "slti\t%0,%.,%1" > + "slt\t%0,%.,%1" > [(set_attr "type" "slt") > (set_attr "mode" "")]) Hmm, this define_insn seems never really used or it would generate something like "sltu

[PATCH v3] testsuite: Add a test case for negating FP vectors containing zeros

2024-03-05 Thread Xi Ruoyao

Recently I've fixed two wrong FP vector negate implementation which caused wrong sign bits in zeros in targets (r14-8786 and r14-8801). To prevent a similar issue from happening again, add a test case. Tested on x86_64 (with SSE2, AVX, AVX2, and AVX512F), AArch64, MIPS (with MSA), LoongArch (with

[PATCH v2] LoongArch: Allow s9 as a register alias

2024-03-05 Thread Xi Ruoyao

The psABI allows using s9 as an alias of r22. gcc/ChangeLog: * config/loongarch/loongarch.h (ADDITIONAL_REGISTER_NAMES): Add s9 as an alias of r22. --- v1 -> v2: Add a test case. Ok for trunk? gcc/config/loongarch/loongarch.h | 1 + gcc/testsuite/gcc.target/l

[PATCH] LoongArch: testsuite: Rewrite {x, }vfcmp-{d, f}.c to avoid named registers

2024-03-05 Thread Xi Ruoyao

Loops on named vector register are not vectorized (see comment 11 of PR113622), so the these test cases have been failing for a while. Rewrite them using check-function-bodies to remove hard coding register names. A barrier is needed to always load the first operand before the second operand. gcc

Re: [PATCH] LoongArch: Emit R_LARCH_RELAX for TLS IE with non-extreme code model to allow the IE to LE linker relaxation

2024-03-06 Thread Xi Ruoyao

{ dg-options "-O2 -mcmodel=normal -mexplicit-relocs -mno-relax" } */ > > +/* { dg-final { scan-assembler-not "R_LARCH_RELAX" { target tls_native } } > > } */ i.e. -mno-relax is used compiling this test case, and the compiled assembly code should not contain R_LARCH_RELAX. -- Xi Ruoyao School of Aerospace Science and Technology, Xidian University

Re: [PATCH v1] LoongArch: Fixed an issue with the implementation of the template atomic_compare_and_swapsi.

2024-03-07 Thread Xi Ruoyao

pect, operands[4], + operands[6])); +} rtx compare = operands[1]; if (operands[3] != const0_rtx) It produces: slli.w $r4,$r4,0 1: ll.w$r14,$r3,0 bne $r14,$r4,2f or $r15,$zero,$r12 sc.w$r15,$r3,0 beqz$r15,1b b 3f 2: dbar0b10100 3: for the test case and the compiled test case runs successfully. I've not done a full bootstrap yet though. -- Xi Ruoyao School of Aerospace Science and Technology, Xidian University

Re: [PATCH v1] LoongArch: Fixed an issue with the implementation of the template atomic_compare_and_swapsi.

2024-03-07 Thread Xi Ruoyao

On Thu, 2024-03-07 at 21:07 +0800, chenglulu wrote: > > 在 2024/3/7 下午8:52, Xi Ruoyao 写道: > > It should be better to extend the expected value before the ll/sc loop > > (like what LLVM does), instead of repeating the extending in each > > iteration. Something like: >

Re: [PATCH v4] LoongArch: Add support for TLS descriptors

2024-03-12 Thread Xi Ruoyao

> + ? "pcalau12i\t$r4,%%desc_pc_hi20(%1)\n\ > + \taddi.d\t%2,$r0,%%desc_pc_lo12(%1)\n\ > + \tlu32i.d\t%2,%%desc64_pc_lo20(%1)\n\ > + \tlu52i.d\t%2,%2,%%desc64_pc_hi12(%1)\n\ > + \tadd.d\t$r4,$r4,%2\n\ > + \tld.d\t$r1,$r4,%%desc_ld(%1)\n\ > + \tjir

Re: [PATCH v4] LoongArch: Add support for TLS descriptors

2024-03-12 Thread Xi Ruoyao

On Wed, 2024-03-13 at 06:15 +0800, Xi Ruoyao wrote: > > +(define_insn "@got_load_tls_desc" > > + [(set (match_operand:P 0 "register_operand" "=r") Hmm, and it looks like we should use (reg:P 4) instead of match_operand here, because the instruction do

Re: [PATCH v4] LoongArch: Add support for TLS descriptors

2024-03-12 Thread Xi Ruoyao

On Wed, 2024-03-13 at 06:56 +0800, Xi Ruoyao wrote: > On Wed, 2024-03-13 at 06:15 +0800, Xi Ruoyao wrote: > > > +(define_insn "@got_load_tls_desc" > > > + [(set (match_operand:P 0 "register_operand" "=r") > > Hmm, and it looks like we shou

Re: [PATCH v4] LoongArch: Add support for TLS descriptors

2024-03-13 Thread Xi Ruoyao

On Wed, 2024-03-13 at 11:06 +0800, mengqinggang wrote: > > 在 2024/3/13 上午6:15, Xi Ruoyao 写道: > > On Tue, 2024-03-12 at 17:20 +0800, mengqinggang wrote: > > > +(define_insn "@got_load_tls_desc" > > > + [(set (match_operand:P 0 &q

Re: [PATCH v4] LoongArch: Add support for TLS descriptors

2024-03-13 Thread Xi Ruoyao

On Wed, 2024-03-13 at 10:24 +0800, Xi Ruoyao wrote: > return TARGET_EXPLICIT_RELOCS > - ? "pcalau12i\t$r4,%%desc_pc_hi20(%1)\n\ > - \taddi.d\t%2,$r0,%%desc_pc_lo12(%1)\n\ > - \tlu32i.d\t%2,%%desc64_pc_lo20(%1)\n\ > - \tlu52i.d\t%2,%2,%%desc64_pc_hi12(%1)\n

Re: [PATCH] testsuite: Fix vfprintf-chk-1.c with -fhardened

2024-03-13 Thread Xi Ruoyao

/configure ... which will taint the test suite with -fhardened. -- Xi Ruoyao School of Aerospace Science and Technology, Xidian University

Re: [PATCH v1] LoongArch: Remove masking process for operand 3 of xvpermi.q.

2024-03-13 Thread Xi Ruoyao

testsuite/ChangeLog: > > * gcc.target/loongarch/vector/lasx/lasx-xvpermi_q.c: > Reposition operand 3's value into instruction's defined accept range. ^^ Remove these two white spaces. Should be OK with these ChangeLog style issues fixed. -- Xi Ruoyao School of Aerospace Science and Technology, Xidian University

[PATCH] LoongArch: Remove unused and incorrect "sge_" define_insn

2024-03-13 Thread Xi Ruoyao

If this insn is really used, we'll have something like slti $r4,$r0,$r5 in the code. The assembler will reject it because slti wants 2 register operands and 1 immediate operand. But we've not got any bug report for this, indicating this define_insn is unused at all. Note that do_store_flag

[PATCH] LoongArch: Fix C23 (...) functions returning large aggregates [PR114175]

2024-03-18 Thread Xi Ruoyao

We were assuming TYPE_NO_NAMED_ARGS_STDARG_P don't have any named arguments and there is nothing to advance, but that is not the case for (...) functions returning by hidden reference which have one such artificial argument. This is causing gcc.dg/c23-stdarg-6.c and gcc.dg/c23-stdarg-8.c to fail.

Pushed: [PATCH v2] LoongArch: Fix C23 (...) functions returning large aggregates [PR114175]

2024-03-19 Thread Xi Ruoyao

On Tue, 2024-03-19 at 11:19 +0800, chenglulu wrote: > > 在 2024/3/18 下午5:34, Xi Ruoyao 写道: > > We were assuming TYPE_NO_NAMED_ARGS_STDARG_P don't have any named > > arguments and there is nothing to advance, but that is not the case > > for (...) functions returning b

[PATCH] mips: Fix C23 (...) functions returning large aggregates [PR114175]

2024-03-20 Thread Xi Ruoyao

We were assuming TYPE_NO_NAMED_ARGS_STDARG_P don't have any named arguments and there is nothing to advance, but that is not the case for (...) functions returning by hidden reference which have one such artificial argument. This is causing gcc.dg/c23-stdarg-{6,8,9}.c to fail. Fix the issue by ch

Pushed: [PATCH] LoongArch: Fix a typo [PR 114407]

2024-03-20 Thread Xi Ruoyao

gcc/ChangeLog: PR target/114407 * config/loongarch/loongarch-opts.cc (loongarch_config_target): Fix typo in diagnostic message, enabing -> enabling. --- Pushed r14-9582 as obvious. gcc/config/loongarch/loongarch-opts.cc | 2 +- 1 file changed, 1 insertion(+), 1 deletion(

Re: [PATCH] MIPS: Add MIN/MAX.fmt instructions support for MIPS R6

2024-03-21 Thread Xi Ruoyao

-if "code quality test" { *-*-* } { "-O0" } { "" } } */ -- Xi Ruoyao School of Aerospace Science and Technology, Xidian University

TARGET_RTX_COSTS and pipeline latency vs. variable-latency instructions (was Re: [PATCH] RISC-V: Add XiangShan Nanhu microarchitecture.)

2024-03-25 Thread Xi Ruoyao

ivision may spend different number of cycles for different inputs: on LoongArch LA664 I've observed 5 cycles for some inputs and 39 cycles for other inputs. So should we use the minimal value, the maximum value, or something in- between for TARGET_RTX_COSTS and pipeline descriptions? -- Xi Ruoyao School of Aerospace Science and Technology, Xidian University

[PATCH] LoongArch: Increase division costs

2024-03-26 Thread Xi Ruoyao

The latency of LA464 and LA664 division instructions depends on the input. When I updated the costs in r14-6642, I unintentionally set the division costs to the best-case latency (when the first operand is 0). Per a recent discussion [1] we should use "something sensible" instead of it. Use the a

Re: [PATCH v2] MIPS: Add MIN/MAX.fmt instructions support for MIPS R6

2024-03-26 Thread Xi Ruoyao

On Tue, 2024-03-26 at 11:15 +0800, YunQiang Su wrote: /* snip */ > With -ffinite-math-only -fno-signed-zeros, it does work with > x >= y ? x : y > while without `-ffinite-math-only -fno-signed-zeros`, it cannot. > @Xi Ruoyao Is it expected by IEEE? When y is (quiet) NaN and x

Re: [PATCH] LoongArch: Increase division costs

2024-03-27 Thread Xi Ruoyao

On Wed, 2024-03-27 at 10:38 +0800, chenglulu wrote: > > 在 2024/3/26 下午5:48, Xi Ruoyao 写道: > > The latency of LA464 and LA664 division instructions depends on the > > input. When I updated the costs in r14-6642, I unintentionally set the > > division costs to the best-case

Re: [PATCH] LoongArch: Increase division costs

2024-03-27 Thread Xi Ruoyao

On Wed, 2024-03-27 at 08:54 +0100, Richard Biener wrote: > On Tue, Mar 26, 2024 at 10:52 AM Xi Ruoyao wrote: > > > > The latency of LA464 and LA664 division instructions depends on the > > input. When I updated the costs in r14-6642, I unintentionally set the > > divi

Re: [PATCH] LoongArch: Increase division costs

2024-03-27 Thread Xi Ruoyao

On Wed, 2024-03-27 at 18:39 +0800, Xi Ruoyao wrote: > On Wed, 2024-03-27 at 10:38 +0800, chenglulu wrote: > > > > 在 2024/3/26 下午5:48, Xi Ruoyao 写道: > > > The latency of LA464 and LA664 division instructions depends on the > > > input. When I updated the costs i

Ping: [PATCH] mips: Fix C23 (...) functions returning large aggregates [PR114175]

2024-03-28 Thread Xi Ruoyao

Ping. On Wed, 2024-03-20 at 15:10 +0800, Xi Ruoyao wrote: > We were assuming TYPE_NO_NAMED_ARGS_STDARG_P don't have any named > arguments and there is nothing to advance, but that is not the case > for (...) functions returning by hidden reference which have one such > artificia

Re: [PATCH] LoongArch: Increase division costs

2024-03-31 Thread Xi Ruoyao

t for some workloads like competitive programming. However "adapting with different modulos" is not possible w/o refactoring generic code so it must be deferred to at least GCC 15. -- Xi Ruoyao School of Aerospace Science and Technology, Xidian University

Re: [PATCH] LoongArch: Increase division costs

2024-03-31 Thread Xi Ruoyao

On Mon, 2024-04-01 at 10:22 +0800, chenglulu wrote: > > 在 2024/4/1 上午9:29, Xi Ruoyao 写道: > > On Fri, 2024-03-29 at 09:23 +0800, chenglulu wrote: > > > > > I tested spec2006. In the floating-point program, the test items with > > > large > > > &g

Re: [PATCH v5] LoongArch: Add support for TLS descriptors

2024-04-01 Thread Xi Ruoyao

> * gcc.target/loongarch/explicit-relocs-auto-tls-desc.c: New test. > * gcc.target/loongarch/explicit-relocs-extreme-tls-desc.c: New test. > * gcc.target/loongarch/explicit-relocs-tls-desc.c: New test. > > Co-authored-by: Lulu Cheng > Co-authored-by: Xi Ruoyao -- Xi Ruoyao School of Aerospace Science and Technology, Xidian University

Re: [PATCH v1] LoongArch: Set default alignment for functions jumps and loops [PR112919].

2024-04-06 Thread Xi Ruoyao

oke.texi for > the ^ ^ Better have two commas here. Otherwise it should be OK. > + format. */ -- Xi Ruoyao School of Aerospace Science and Technology, Xidian University

Re: [PATCH] LoongArch: Enable switchable target

2024-04-07 Thread Xi Ruoyao

R target/113233 Oops, so this PR isn't fixed with r14-7134 "LoongArch: Implement option save/restore"? Should I reopen it? -- Xi Ruoyao School of Aerospace Science and Technology, Xidian University

Re: [PATCH] LoongArch: Enable switchable target

2024-04-07 Thread Xi Ruoyao

On Sun, 2024-04-07 at 16:23 +0800, Yang Yujie wrote: > On Sun, Apr 07, 2024 at 04:23:53PM +0800, Xi Ruoyao wrote: > > On Sun, 2024-04-07 at 15:47 +0800, Yang Yujie wrote: > > > This patch fixes the back-end context switching in cases where functions > > > should be

Re: [PATCH] ICF&SRA: Make ICF and SRA agree on padding

2024-04-07 Thread Xi Ruoyao

farm. If that goes well, I intend to commit the patch and then start > working on backports. I've tried these two patches out on my own 24-core AArch64 machine. Bootstrapped (but no LTO or PGO) and regtested fine. -- Xi Ruoyao School of Aerospace Science and Technology, Xidian University

Re: [PATCH] ICF&SRA: Make ICF and SRA agree on padding

2024-04-07 Thread Xi Ruoyao

_padding.length (); > + if (l != p2.m_padding.length ()) > + return false; > + for (unsigned i = 0; i < l; i++) > + if (p1.m_padding[i].first != p2.m_padding[i].first > + || p1.m_padding[i].second != p2.m_padding[i].second) > + return false; > + > + return

Re: [PATCH] LoongArch: Enable switchable target

2024-04-07 Thread Xi Ruoyao

this line. > (loongarch_expand_builtin): Turn assertion of builtin > availability > into a test. and this line. -- Xi Ruoyao School of Aerospace Science and Technology, Xidian University

Re: [PATCH v2] LoongArch: Enable switchable target

2024-04-08 Thread Xi Ruoyao

when I do LTO, so I don't really rely on this patch though. -- Xi Ruoyao School of Aerospace Science and Technology, Xidian University

Re: [PATCH] Change gcc/ira-conflicts.cc build_conflict_bit_table to use size_t/%zu

2024-02-01 Thread Xi Ruoyao

" PRIu64 ")\n", Should use HOST_WIDE_INT_PRINT_UNSIGNED instead of PRIu64. >(unsigned HOST_WIDE_INT) (sizeof (IRA_INT_TYPE) > * allocated_words_num), >(unsigned HOST_WIDE_INT) (sizeof (IRA_INT_TYPE) >

Re: [PATCH] Change gcc/ira-conflicts.cc build_conflict_bit_table to use size_t/%zu

2024-02-01 Thread Xi Ruoyao

On Thu, 2024-02-01 at 14:55 +0100, Jakub Jelinek wrote: > On Thu, Feb 01, 2024 at 01:42:03PM +, Jonathan Yong wrote: > > On 2/1/24 13:06, Xi Ruoyao wrote: > > > On Thu, 2024-02-01 at 14:01 +0100, Jakub Jelinek wrote: > > > > On Thu, Feb 01, 2024 at 12:45:3

[PATCH] LoongArch: Fix an ODR violation

2024-02-01 Thread Xi Ruoyao

When bootstrapping GCC 14 --with-build-config=bootstrap-lto, an ODR violation is detected: ../../gcc/config/loongarch/loongarch-opts.cc:57: warning: 'abi_minimal_isa' violates the C++ One Definition Rule [-Wodr] 57 | abi_minimal_isa[N_ABI_BASE_TYPES][N_ABI_EXT_TYPES]; ../../gcc/con

[PATCH] LoongArch: Avoid out-of-bounds access in loongarch_symbol_insns

2024-02-02 Thread Xi Ruoyao

We call loongarch_symbol_insns with mode = MAX_MACHINE_MODE sometimes. But in loongarch_symbol_insns: if (LSX_SUPPORTED_MODE_P (mode) || LASX_SUPPORTED_MODE_P (mode)) return 0; And LSX_SUPPORTED_MODE_P is defined as: #define LSX_SUPPORTED_MODE_P(MODE) \ (ISA_HAS_LSX \

[PATCH] LoongArch: Fix wrong LSX FP vector negation

2024-02-03 Thread Xi Ruoyao

We expanded (neg x) to (minus const0 x) for LSX FP vectors, this is wrong because -0.0 is not 0 - 0.0. This causes some Python tests to fail when Python is built with LSX enabled. Use the vbitrevi.{d/w} instructions to simply reverse the sign bit instead. We are already doing this for LASX and n

Pushed: [PATCH] LoongArch: Fix an ODR violation

2024-02-03 Thread Xi Ruoyao

On Fri, 2024-02-02 at 10:42 +0800, chenglulu wrote: > LGTM! > > Thanks! Pushed r14-8773. > 在 2024/2/2 上午5:54, Xi Ruoyao 写道: > > When bootstrapping GCC 14 --with-build-config=bootstrap-lto, an ODR > > violation is detected: > > > > ../../gcc/config/loo

Pushed: [PATCH] LoongArch: Fix wrong LSX FP vector negation

2024-02-04 Thread Xi Ruoyao

On Sun, 2024-02-04 at 11:20 +0800, chenglulu wrote: > > 在 2024/2/3 下午4:58, Xi Ruoyao 写道: > > We expanded (neg x) to (minus const0 x) for LSX FP vectors, this is > > wrong because -0.0 is not 0 - 0.0. This causes some Python tests to > > fail when Python is built with L

Pushed: [PATCH] LoongArch: Avoid out-of-bounds access in loongarch_symbol_insns

2024-02-04 Thread Xi Ruoyao

On Sun, 2024-02-04 at 11:19 +0800, chenglulu wrote: > > 在 2024/2/2 下午5:55, Xi Ruoyao 写道: > > We call loongarch_symbol_insns with mode = MAX_MACHINE_MODE sometimes. > > But in loongarch_symbol_insns: > > > > if (LSX_SUPPORTED_MODE_P (mode) || LASX_SUPPORTED_MODE

[PATCH] MIPS: Fix wrong MSA FP vector negation

2024-02-04 Thread Xi Ruoyao

We expanded (neg x) to (minus const0 x) for MSA FP vectors, this is wrong because -0.0 is not 0 - 0.0. This causes some Python tests to fail when Python is built with MSA enabled. Use the bnegi.df instructions to simply reverse the sign bit instead. gcc/ChangeLog: * config/mips/mips-msa

Pushed: [PATCH] MIPS: Fix wrong MSA FP vector negation

2024-02-05 Thread Xi Ruoyao

On Mon, 2024-02-05 at 09:56 +0800, YunQiang Su wrote: > Xi Ruoyao 于2024年2月5日周一 02:01写道： > > > > We expanded (neg x) to (minus const0 x) for MSA FP vectors, this is > > wrong because -0.0 is not 0 - 0.0. This causes some Python tests to > > fail when Pytho

[PATCH] testsuite: Add a test case for negating FP vectors containing zeros

2024-02-06 Thread Xi Ruoyao

Recently I've fixed two wrong FP vector negate implementation which caused wrong sign bits in zeros in targets (r14-8786 and r14-8801). To prevent a similar issue from happening again, add a test case. Tested on x86_64 (with SSE2, AVX, AVX2, and AVX512F), AArch64, MIPS (with MSA), LoongArch (with

LoongArch: Backport r14-4674 "LoongArch: Delete macro definition ASM_OUTPUT_ALIGN_WITH_NOP."?

2024-02-06 Thread Xi Ruoyao

4-4674 into releases/gcc-12 and releases/gcc-13 then? -- Xi Ruoyao School of Aerospace Science and Technology, Xidian University

Re: [PATCH] testsuite: Add a test case for negating FP vectors containing zeros

2024-02-06 Thread Xi Ruoyao

On Tue, 2024-02-06 at 17:55 +0800, Xi Ruoyao wrote: > Recently I've fixed two wrong FP vector negate implementation which > caused wrong sign bits in zeros in targets (r14-8786 and r14-8801). To > prevent a similar issue from happening again, add a test case. > > Tested on x8

Re: LoongArch: Backport r14-4674 "LoongArch: Delete macro definition ASM_OUTPUT_ALIGN_WITH_NOP."?

2024-02-09 Thread Xi Ruoyao

On Fri, 2024-02-09 at 00:02 +0800, chenglulu wrote: > > 在 2024/2/7 上午12:23, Xi Ruoyao 写道: > > Hi Lulu, > > > > I'm proposing to backport r14-4674 "LoongArch: Delete macro definition > > ASM_OUTPUT_ALIGN_WITH_NOP." to releases/gcc-12 and releases/g

Re: LoongArch: Backport r14-4674 "LoongArch: Delete macro definition ASM_OUTPUT_ALIGN_WITH_NOP."?

2024-02-20 Thread Xi Ruoyao

test failures due to "excessive errors" if running the GCC test suite with these earlier GAS versions. Maybe we'll have to add some autoconf-based probing for the linker anyway? -- Xi Ruoyao School of Aerospace Science and Technology, Xidian University

Re: LoongArch: Backport r14-4674 "LoongArch: Delete macro definition ASM_OUTPUT_ALIGN_WITH_NOP."?

2024-02-20 Thread Xi Ruoyao

On Tue, 2024-02-20 at 19:25 +0800, Xi Ruoyao wrote: > On Tue, 2024-02-20 at 10:07 +0800, chenglulu wrote: > > > So I think that without worrying about performance and ensuring that > > there is no problem > > > > with binutils, I think we can ma

Re: LoongArch: Backport r14-4674 "LoongArch: Delete macro definition ASM_OUTPUT_ALIGN_WITH_NOP."?

2024-02-20 Thread Xi Ruoyao

On Tue, 2024-02-20 at 19:50 +0800, chenglulu wrote: > > 在 2024/2/20 下午7:31, Xi Ruoyao 写道: > > On Tue, 2024-02-20 at 19:25 +0800, Xi Ruoyao wrote: > > > On Tue, 2024-02-20 at 10:07 +0800, chenglulu wrote: > > > > > > > So I think that without wo

[PATCH] LoongArch: Don't falsely claim gold supported in toplevel configure

2024-02-22 Thread Xi Ruoyao

The gold linker has never been ported to LoongArch (and it seems unlikely to be ported in the future as the new architectures are focusing on lld and/or mold for fast linkers). ChangeLog: * configure.ac (ENABLE_GOLD): Remove loongarch*-*-* from target list. * configure: Re

[GCC 13 PATCH] LoongArch: Don't default to -mno-explicit-relocs if -mno-relax

2024-02-22 Thread Xi Ruoyao

To improve Binutils compatibility we've had to backported relaxation support. But if a user just updates to GCC 13.3 and sticks with Binutils 2.41, there is no reason to use -mno-explicit-relocs as the default because we are turning off relaxation for Binutils 2.41 (it lacks conditional branch rel

Re: [PATCH] LoongArch: Don't falsely claim gold supported in toplevel configure

2024-02-22 Thread Xi Ruoyao

On Fri, 2024-02-23 at 11:16 +0800, chenglulu wrote: > > 在 2024/2/22 下午5:17, Xi Ruoyao 写道: > > The gold linker has never been ported to LoongArch (and it seems > > unlikely to be ported in the future as the new architectures are > > focusing on lld and/or mold for fast link

Pushed: [PATCH] LoongArch: Don't falsely claim gold supported in toplevel configure

2024-02-23 Thread Xi Ruoyao

On Fri, 2024-02-23 at 11:37 +0800, chenglulu wrote: > > 在 2024/2/23 上午11:27, Xi Ruoyao 写道: > > On Fri, 2024-02-23 at 11:16 +0800, chenglulu wrote: > > > 在 2024/2/22 下午5:17, Xi Ruoyao 写道: > > > > The gold linker has never been ported to LoongArch (and it seems >

Pushed: [GCC 13 PATCH] LoongArch: Don't default to -mno-explicit-relocs if -mno-relax

2024-02-23 Thread Xi Ruoyao

On Thu, 2024-02-22 at 19:09 +0800, chenglulu wrote: > > 在 2024/2/22 下午6:20, Xi Ruoyao 写道: > > To improve Binutils compatibility we've had to backported relaxation > > support. But if a user just updates to GCC 13.3 and sticks with > > Binutils 2.41, there is no reaso

[PATCH 1/2] LoongArch: NFC: Deduplicate crc instruction defines

2024-02-25 Thread Xi Ruoyao

Introduce an iterator for UNSPEC_CRC and UNSPEC_CRCC to make the next change easier. gcc/ChangeLog: * config/loongarch/loongarch.md (CRC): New define_int_iterator. (crc): New define_int_attr. (loongarch_crc_w__w, loongarch_crcc_w__w): Unify into ... (loonga

[PATCH 2/2] LoongArch: Remove unneeded sign extension after crc/crcc instructions

2024-02-25 Thread Xi Ruoyao

The specification of crc/crcc instructions is clear that the output is sign-extended to GRLEN. Add a define_insn to tell the compiler this fact and allow it to remove the unneeded sign extension on crc/crcc output. As crc/crcc instructions are usually used in a tight loop, this should produce a s

Re: [PATCH v2] LoongArch: Add support for TLS descriptors

2024-02-28 Thread Xi Ruoyao

TARGET_EXPLICIT_RELOCS_ALWAS ? "......" : "la.tls.desc\t%0,%1"; } > + [(set_attr "got" "load") > + (set_attr "mode" "")]) We need (set_attr "length" "16") in this list as this actually expands into 16 bytes. -- Xi Ruoyao School of Aerospace Science and Technology, Xidian University

Re: [PATCH v2] LoongArch: Add support for TLS descriptors

2024-02-28 Thread Xi Ruoyao

On Thu, 2024-02-29 at 14:08 +0800, Xi Ruoyao wrote: > > + "TARGET_TLS_DESC" > > + "la.tls.desc\t%0,%1" > > With -mexplicit-relocs=always we should emit %desc_pc_lo12 etc. instead > of la.tls.desc. As we don't want to add too many code we can just

[PATCH v2] testsuite: Make pr104992.c irrelated to target vector feature [PR113418]

2024-02-28 Thread Xi Ruoyao

The vect_int_mod target selector is evaluated with the options in DEFAULT_VECTCFLAGS in effect, but these options are not automatically passed to tests out of the vect directories. So this test fails on targets where integer vector modulo operation is supported but requiring an option to enable, f

[PATCH v2] testsuite: Add a test case for negating FP vectors containing zeros

2024-02-28 Thread Xi Ruoyao

Recently I've fixed two wrong FP vector negate implementation which caused wrong sign bits in zeros in targets (r14-8786 and r14-8801). To prevent a similar issue from happening again, add a test case. Tested on x86_64 (with SSE2, AVX, AVX2, and AVX512F), AArch64, MIPS (with MSA), LoongArch (with

[PATCH] LoongArch: Emit R_LARCH_RELAX for TLS IE with non-extreme code model to allow the IE to LE linker relaxation

2024-02-28 Thread Xi Ruoyao

In Binutils we need to make IE to LE relaxation only allowed when there is an R_LARCH_RELAX after R_LARCH_TLE_IE_PC_{HI20,LO12} so an invalid "partial" relaxation won't happen with the extreme code model. So if we are emitting %ie_pc_{hi20,lo12} in a non-extreme code model, emit an R_LARCH_RELAX t

[PATCH] LoongArch: Allow s9 as a register alias

2024-02-28 Thread Xi Ruoyao

The psABI allows using s9 as an alias of r22. gcc/ChangeLog: * config/loongarch/loongarch.h (ADDITIONAL_REGISTER_NAMES): Add s9 as an alias of r22. --- Bootstrapped and regtested on loongarch64-linux-gnu. Ok for trunk? gcc/config/loongarch/loongarch.h | 1 + 1 file changed, 1

Re: [PATCH v2] testsuite: Add a test case for negating FP vectors containing zeros

2024-02-29 Thread Xi Ruoyao

On Thu, 2024-02-29 at 15:09 +0800, Xi Ruoyao wrote: > Recently I've fixed two wrong FP vector negate implementation which > caused wrong sign bits in zeros in targets (r14-8786 and r14-8801). To > prevent a similar issue from happening again, add a test case. > > Tested on x8

Re: [PATCH v2 2/2] LoongArch: When the code model is extreme, the symbol address is obtained through macro instructions regardless of the value of -mexplicit-relocs.

2024-01-12 Thread Xi Ruoyao

-shared --enable-bootstrap > --enable-checking=release > $ make BOOT_FLAGS="-mcmodel=extreme" > > What did I do wrong?:-( BOOT_CFLAGS, not BOOT_FLAGS :). -- Xi Ruoyao School of Aerospace Science and Technology, Xidian University

Re: [PATCH v2 2/2] LoongArch: When the code model is extreme, the symbol address is obtained through macro instructions regardless of the value of -mexplicit-relocs.

2024-01-13 Thread Xi Ruoyao

在 2024-01-13星期六的 15:01 +0800，chenglulu写道： > > 在 2024/1/12 下午7:42, Xi Ruoyao 写道: > > 在 2024-01-12星期五的 09:46 +0800，chenglulu写道： > > > > > > I found an issue bootstrapping GCC with -mcmodel=extreme in BOOT_CFLAGS: > > > > we n

Re: [PATCH v2] LoongArch: testsuite:Added additional vectorization "-mlsx" option.

2024-01-13 Thread Xi Ruoyao

821731 100644 > --- a/gcc/testsuite/gfortran.dg/vect/fast-math-mgrid-resid.f > +++ b/gcc/testsuite/gfortran.dg/vect/fast-math-mgrid-resid.f > @@ -2,6 +2,7 @@ > ! { dg-require-effective-target vect_double } > ! { dg-options "-O3 --param vect-max-peeling-for-alignment=0 > -fpredi

1 2 3 4 5 6 7 8 9 10 >

1 - 100 of 1155 matches

Mail list logo