On Mon, 2024-01-15 at 09:29 +0800, chenxiaolong wrote:
> At 21:13 +0800 on Saturday, 2024-01-13, Xi Ruoyao wrote:
> > At 15:28 +0800 on Saturday 2024-01-13, chenxiaolong wrote:
> > > gcc/testsuite/ChangeLog:
> > >
> > > * gcc.dg/pr104992.c: Added addition
On Mon, 2024-01-15 at 14:32 +0800, YunQiang Su wrote:
> Xi Ruoyao wrote on Mon, 2024-01-15 at 12:11:
> >
> > On Mon, 2024-01-15 at 09:29 +0800, chenxiaolong wrote:
> > > At 21:13 +0800 on Saturday, 2024-01-13, Xi Ruoyao wrote:
> > > > At 15:28 +0800 on Saturday 2024-01-1
On Mon, 2024-01-15 at 15:10 +0800, chenxiaolong wrote:
> At 14:42 +0800 on Monday, 2024-01-15, Xi Ruoyao wrote:
> > On Mon, 2024-01-15 at 14:32 +0800, YunQiang Su wrote:
> > > Xi Ruoyao wrote at 12:11pm on Monday, January
> > > 15, 2024:
> > >
On Tue, 2024-01-16 at 10:57 +0800, chenxiaolong wrote:
> At 15:50 +0800 on Monday, 2024-01-15, Xi Ruoyao wrote:
> > On Mon, 2024-01-15 at 15:10 +0800, chenxiaolong wrote:
> > > At 14:42 +0800 on Monday, 2024-01-15, Xi Ruoyao wrote:
> > > > On Mon, 2024-01-15 at
Ping.
On Fri, 2023-12-15 at 20:56 +0800, Xi Ruoyao wrote:
> We don't allow SImode in FCC, so constraint z is never really used
> here.
>
> gcc/ChangeLog:
>
> * config/loongarch/loongarch.md (movsi_internal): Remove
> constraint z.
> ---
>
> Bootst
On Tue, 2024-01-16 at 14:16 +0800, chenglulu wrote:
>
>
> On 2024/1/16 at 1:34 PM, Xi Ruoyao wrote:
> > Ping.
> >
> > On Fri, 2023-12-15 at 20:56 +0800, Xi Ruoyao wrote:
> > > We don't allow SImode in FCC, so constraint z is never really us
On Tue, 2024-01-16 at 12:58 +0800, Xi Ruoyao wrote:
> On Tue, 2024-01-16 at 10:57 +0800, chenxiaolong wrote:
> > At 15:50 +0800 on Monday, 2024-01-15, Xi Ruoyao wrote:
> > > On Mon, 2024-01-15 at 15:10 +0800, chenxiaolong wrote:
> > > > At 14:42 +0800 on Monday, 2024-01-15,
ite/lib/dg-options.exp
> +++ b/libstdc++-v3/testsuite/lib/dg-options.exp
> @@ -337,6 +337,7 @@ proc add_options_for_libatomic { flags } {
> || ([istarget powerpc*-*-*] && [check_effective_target_ilp32])
> || [istarget riscv*-*-*]
> || ([istarget sparc*-*-linux-gnu] &
On Wed, 2024-01-17 at 17:38 +0800, chenglulu wrote:
>
> On 2024/1/13 at 9:05 PM, Xi Ruoyao wrote:
> > At 15:01 +0800 on Saturday, 2024-01-13, chenglulu wrote:
> > > On 2024/1/12 at 7:42 PM, Xi Ruoyao wrote:
> > > > At 09:46 +0800 on Friday, 2024-01-12, chenglulu wrote:
> > > >
> > >
't understand the purpose of adding
> '-fno-tree-vectorize' here.
I don't think -fno-tree-vectorize will make a difference here. This
test case uses __attribute__((vector_size(...))) explicitly so the
vector operation will be used even with -fno-tree-vectorize.
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
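To illustrate the point about explicit vector types, here is a minimal sketch (not the test case under discussion; the names `v4si`, `add4` and `sum4` are made up for illustration). With GCC's `vector_size` attribute the addition below is already a vector operation in the source, so it does not depend on the auto-vectorizer that -fno-tree-vectorize turns off:

```c
/* Hypothetical example: v4si, add4 and sum4 are illustrative names.  */
typedef int v4si __attribute__ ((vector_size (16)));

/* Element-wise addition of two 4 x int vectors.  This is a vector
   operation at the source level (GNU vector extension).  */
static v4si
add4 (v4si a, v4si b)
{
  return a + b;
}

/* Sum the four lanes of a vector using subscripting, which the GNU
   vector extension allows on vector types.  */
static int
sum4 (v4si v)
{
  return v[0] + v[1] + v[2] + v[3];
}
```

Whether the target then uses SIMD instructions or lowers the operation to scalar code depends on the enabled ISA extensions, not on the vectorizer flags.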
gister_operand" "=r")
(unspec:DI [(match_dup 2) (pc)] UNSPEC_LA_PCREL_64_PART2))]
With this the buggy REG_UNUSED notes were gone. But it then prevented
the CSE when loading the address of __tls_get_addr (i.e. if we address
10 TLS_LD symbols in a function it would emit 10 instance
Binutils 2.42 supports TLS LD/GD relaxation which requires the assembler
macro.
gcc/ChangeLog:
* config/loongarch/loongarch.cc (loongarch_explicit_relocs_p):
If la_opt_explicit_relocs is EXPLICIT_RELOCS_AUTO, return false
for SYMBOL_TLS_LDM and SYMBOL_TLS_GD.
(loon
On Tue, 2024-01-23 at 10:37 +0800, chenglulu wrote:
> LGTM!
>
> Thanks!
Pushed v2 as attached. The only change is in the comment: Qinggang told
me TLS LE relaxation actually *requires* explicit relocs.
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian Univer
When building GCC with --enable-default-ssp, the stack protector is
enabled for got-load.C, causing additional GOT loads for
__stack_chk_guard. So mem/u will be matched more than 2 times and the
test will fail.
Disable stack protector to fix this issue.
gcc/testsuite:
* g++.target/loong
The vect_int_mod target selector is evaluated with the options in
DEFAULT_VECTCFLAGS in effect, but these options are not automatically
passed to tests out of the vect directories. So this test fails on
targets where integer vector modulo operation is supported but requires
an option to enable, f
k only
papers over the same issue that caused the spec2006 failure. I tried a bootstrap
with BOOT_CFLAGS=-O2 -g -mcmodel=extreme and TARGET_DELEGITIMIZE_ADDRESS
commented out, and there is no more spurious "note: non-delegitimized
UNSPEC UNSPEC_LA_PCREL_64_PART1 (42) found in variable location" things.
I feel that this hook is still written in a buggy way, so maybe removing
it will solve the spec2017 issue.
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
n __inline float
> __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
> __frecipe_s (float _1)
> {
> - __builtin_loongarch_frecipe_s ((float) _1);
> + return (float) __builtin_loongarch_frecipe_s ((float) _1);
I don't think the (float) conversion is needed.
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
On Wed, 2024-01-24 at 18:32 +0800, chenxiaolong wrote:
> On 20:09 +0800 on Tuesday, 2024-01-23, Xi Ruoyao wrote:
> > The vect_int_mod target selector is evaluated with the options in
> > DEFAULT_VECTCFLAGS in effect, but these options are not automatically
> > passed to
On Wed, 2024-01-24 at 19:08 +0800, chenxiaolong wrote:
> At 19:00 +0800 on Wednesday, 2024-01-24, Xi Ruoyao wrote:
> > On Wed, 2024-01-24 at 18:32 +0800, chenxiaolong wrote:
> > > On 20:09 +0800 on Tuesday, 2024-01-23, Xi Ruoyao wrote:
> > > > The vect_int_mod target
On Thu, 2024-01-25 at 08:48 +0800, chenglulu wrote:
>
> On 2024/1/24 at 3:36 AM, Xi Ruoyao wrote:
> > On Mon, 2024-01-22 at 15:27 +0800, chenglulu wrote:
> > > > > The failure of this test case was because the compiler believes that
> > > > > two
> > >
eme TLS GD/LD with -mexplicit-relocs=auto.
I've rebased and attached the patch to fix the bad split in -mexplicit-
relocs={always,auto} -mcmodel=extreme on top of this series. I've not
tested it seriously though (only tested the added and modified test
cases).
--
Xi Ruoyao
School of Aero
n "la.tls.le\t%0,%1";
> + case SYMBOL_TLS_IE:
> + return "la.tls.ie\t%0,%1";
> + case SYMBOL_TLSLDM:
> + return "la.tls.ld\t%0,%1";
> + case SYMBOL_TLSGD:
> + return "la.tls.gd\t%0,%1";
/* snip */
> + default:
turn "la.tls.gd\t%0,%2,%1";
> + case SYMBOL_TLSLDM:
> + return "la.tls.ld\t%0,%2,%1";
> +
> + default:
> + gcc_unreachable ();
> + }
> +}
> + "&& REG_P (operands[1]) && find_reg_note (insn, REG_UNUSED, operands[2]) !=
> 0"
> + [(set (match_dup 0) (match_dup 1))]
> + ""
> + [(set_attr "mode" "DI")
> + (set_attr "length" "5")])
Should be 20, in bytes.
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
On Fri, 2024-01-26 at 16:59 +0800, chenglulu wrote:
>
> On 2024/1/26 at 4:49 PM, Xi Ruoyao wrote:
> > On Fri, 2024-01-26 at 15:37 +0800, Lulu Cheng wrote:
> > > v3 -> v4:
> > > 1. Add macro support for TLS symbols
> > > 2. Added support for loading __get_t
On Sat, 2024-01-27 at 11:15 +0800, chenglulu wrote:
>
> On 2024/1/26 at 6:57 PM, Xi Ruoyao wrote:
> > On Fri, 2024-01-26 at 16:59 +0800, chenglulu wrote:
> > > On 2024/1/26 at 4:49 PM, Xi Ruoyao wrote:
> > > > On Fri, 2024-01-26 at 15:37 +0800, Lulu Cheng wrote:
> > >
On Sat, 2024-01-27 at 18:02 +0800, Xi Ruoyao wrote:
> On Sat, 2024-01-27 at 11:15 +0800, chenglulu wrote:
> >
> > On 2024/1/26 at 6:57 PM, Xi Ruoyao wrote:
> > > On Fri, 2024-01-26 at 16:59 +0800, chenglulu wrote:
> > > > On 2024/1/26 at 4:49 PM, Xi Ruoyao wrote:
> > >
at.
You need to wait until the PR is accepted by the libffi maintainers.
Frankly I don't know what the libffi maintainers are busy with and I'm
frustrated as well (having a MIPS patch unreviewed there for a month)
but this is the procedure :(.
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
On Wed, 2023-12-13 at 20:22 +0800, chenglulu wrote:
On 2023/12/10 at 1:03 AM, Xi Ruoyao wrote:
Replace the instruction costs in loongarch_rtx_cost_data constructor
based on micro-benchmark results on LA464 and LA664.
This allows optimizations like "x * 17" to alsl, and "x * 68" to
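As a sketch of the kind of strength reduction mentioned here (an illustration in C, not compiler output), multiplication by 17 or 68 can be rewritten with shifts and adds. The helper names are made up, and the mapping onto alsl/slli noted in the comments is an assumption based on a shift-and-add instruction computing (a << sa) + b:

```c
#include <stdint.h>

/* Model of one shift-and-add step: (a << sa) + b.  Hypothetical helper,
   meant to mirror what a single alsl-style instruction computes.  */
static uint64_t
shift_add (uint64_t a, uint64_t b, unsigned sa)
{
  return (a << sa) + b;
}

/* x * 17 == (x << 4) + x: one shift-and-add.  */
static uint64_t
mul17 (uint64_t x)
{
  return shift_add (x, x, 4);
}

/* x * 68 == ((x << 4) + x) << 2: one shift-and-add, then one shift.  */
static uint64_t
mul68 (uint64_t x)
{
  return mul17 (x) << 2;
}
```

Whether such a sequence beats a real multiply depends on the relative instruction costs, which is what the cost table in the patch is meant to capture.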
We used a branch to load floating-point comparison results into GPR.
This is very slow when the branch is not predictable.
Use the movcf2gr instruction to implement cstore4 if movcf2gr
is fast enough.
gcc/ChangeLog:
* config/loongarch/genopts/loongarch.opt.in (muse-movcf2gr): New
> 0x1206ac93f execute
> ../../gcc/gcc/ira.cc:6161
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
We used a branch to load floating-point comparison results into GPR.
This is very slow when the branch is not predictable.
Implement movfcc so we can reload FCCmode into GPRs, FPRs, and MEM.
Then implement cstore4.
gcc/ChangeLog:
* config/loongarch/loongarch-tune.h
(loongarch_rtx
We don't allow SImode in FCC, so constraint z is never really used
here.
gcc/ChangeLog:
* config/loongarch/loongarch.md (movsi_internal): Remove
constraint z.
---
Bootstrapped and regtested on loongarch64-linux-gnu. Ok for trunk?
gcc/config/loongarch/loongarch.md | 6 +++---
1
We had the following mappings between vfcmp submnemonics and RTX
codes:
(define_code_attr fcc
[(unordered "cun")
(ordered "cor")
(eq "ceq")
(ne "cne")
(uneq "cueq")
(unle "cule")
(unlt "cult")
(le "cle")
Remove a redundant sign extension.
gcc/ChangeLog:
* config/loongarch/loongarch.md (rotrsi3_extend): New
define_insn.
gcc/testsuite/ChangeLog:
* gcc.target/loongarch/rotrw.c: New test.
---
Bootstrapped and regtested on loongarch64-linux-gnu. Ok for trunk?
gcc/config/l
On Sun, 2023-12-10 at 01:03 +0800, Xi Ruoyao wrote:
> Update LoongArch instruction costs based on the micro-benchmark results
> on LA464 and LA664. In particular, this allows generating alsl/slli or
> alsl/slli + add pairs for multiplying some constants as on LA464/LA664
> a mul instr
With simplify_gen_unary we end up with a not fully expanded RTX like
(set (reg:SI 90) (and:SI (neg:SI (reg:SI 80)) (const_int 63)))
Then it will cause an ICE with unrecognizable insn.
gcc/ChangeLog:
PR middle-end/113033
* expmed.cc (expand_shift_1): When expanding rotate shi
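For context, the `(and (neg ...) (const_int 63))` shape in the RTX above corresponds to a common identity: rotating left by n is the same as rotating right by -n masked to the bit width. A minimal C sketch of that identity for 64-bit values follows; the function names are illustrative and this is not the code in expmed.cc:

```c
#include <stdint.h>

/* Rotate X right by N bits; N is reduced modulo 64.  */
static uint64_t
rotr64 (uint64_t x, unsigned n)
{
  n &= 63;
  /* (0u - n) & 63 avoids an undefined shift by 64 when n == 0.  */
  return (x >> n) | (x << ((0u - n) & 63));
}

/* Rotate X left by N bits, expressed through rotr64 with the count
   negated and masked: (-n) & 63.  */
static uint64_t
rotl64_via_rotr (uint64_t x, unsigned n)
{
  return rotr64 (x, (0u - n) & 63);
}
```

The negate-and-mask step is the part that has to be fully expanded into separate operations before it reaches a target that only provides a rotate-right pattern.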
gcc/ChangeLog:
* config/loongarch/loongarch.md (rotl3):
New define_expand.
* config/loongarch/simd.md (vrotl3): Likewise.
(rotl3): Likewise.
gcc/testsuite/ChangeLog:
* gcc.target/loongarch/rotl-with-rotr.c: New test.
* gcc.target/loongarch/rotl-wit
On Mon, 2023-12-18 at 08:39 -0700, Jeff Law wrote:
>
>
> On 12/18/23 06:42, Xi Ruoyao wrote:
> > With simplify_gen_unary we end up with a not fully expanded RTX like
> >
> > (set (reg:SI 90) (and:SI (neg:SI (reg:SI 80)) (const_int 63)))
> >
>
On Mon, 2023-12-18 at 18:45 +0100, Jakub Jelinek wrote:
> On Tue, Dec 19, 2023 at 12:48:46AM +0800, Xi Ruoyao wrote:
> > > > gcc/ChangeLog:
> > > >
> > > > PR middle-end/113033
> > > > * expmed.cc (expand_shift_1): When expa
> I've looked e.g. at i386 vec_init and that is exactly what it does,
> see the various tests + force_reg calls in ix86_expand_vector_init*.
Ok, I'm abandoning this patch and I'll rework it.
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
code clean up is separated into the 2nd patch to make reviewing
easier.
Bootstrapped and regtested on loongarch64-linux-gnu. Ok for trunk?
Xi Ruoyao (2):
LoongArch: Use force_reg instead of gen_reg_rtx + emit_move_insn in
vec_init expander [PR113033]
LoongArch: Clean up vec_init expa
Jakub says:
Then that seems like a bug in the loongarch vec_init pattern(s).
Those really don't have a predicate in any of the backends on the
input operand, so they need to force_reg it if it is something it
can't handle. I've looked e.g. at i386 vec_init and that is exactly
w
Non functional change, clean up the code.
gcc/ChangeLog:
* config/loongarch/loongarch.cc
(loongarch_expand_vector_init_same): Remove "temp2" and reuse
"temp" instead.
(loongarch_expand_vector_init): Use gcc_unreachable () instead
of gcc_assert (0), and fix
e LSX/LASX code is wrong.
> > Most seriously, the RTX code NE should be mapped to "cneq", not "cne".
>
> The "cneq" in the commit info may be "cune" according to the context?
Oops, indeed.
I'll push the patch with this typo fixed.
>
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
_r". Or we'll hit:
t.c:11:1: internal compiler error: output_operand: operand number
missing after %-letter
> + [(set_attr "type" "move")]
> +)
> +
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
Ping :).
On Tue, 2023-12-12 at 14:47 +0800, Xi Ruoyao wrote:
> The problem with peephole2 is it uses a naive sliding-window algorithm
> and misses many cases. For example:
>
> float a[1];
> float t() { return a[0] + a[8000]; }
>
> is compiled to:
>
g the peephole besides the new
define_insn_and_split produces a better result instead of solely relying
on define_insn_and_split?
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
here is a problem. My regression test has the following two fail
> items.(based on r14-6787)
> +FAIL: gcc.dg/cpp/_Pragma3.c (test for excess errors)
> +FAIL: gcc.dg/pr86617.c scan-rtl-dump-times final "mem/v" 6
Strange. I didn't see them on r14-6650 (with or without the patch)
On Sat, 2023-12-23 at 18:44 +0800, Xi Ruoyao wrote:
> On Sat, 2023-12-23 at 10:29 +0800, chenglulu wrote:
> > > The performance drop has nothing to do with this patch. I found that the
> > > h264 performance compiled
> > > by r14-6787 compared to r14-6421 dropped
ence may be caused by a different binutils version or some
other changes in GCC. I'll figure it out...
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
On Sun, 2023-12-24 at 00:56 +0800, Xi Ruoyao wrote:
> On Sat, 2023-12-23 at 15:00 +0800, chenglulu wrote:
> > Hi,
> >
> > This patch will cause the following tests to fail:
> >
> > +FAIL: gcc.dg/vect/pr97081-2.c (internal compiler error: in extract_insn,
> &
gcc/ChangeLog:
* config/loongarch/loongarch.md (rotl3):
New define_expand.
* config/loongarch/simd.md (vrotl3): Likewise.
(rotl3): Likewise.
gcc/testsuite/ChangeLog:
* gcc.target/loongarch/rotl-with-rotr.c: New test.
* gcc.target/loongarch/rotl-wit
On Sun, 2023-12-24 at 01:04 +0800, Xi Ruoyao wrote:
> On Sun, 2023-12-24 at 00:56 +0800, Xi Ruoyao wrote:
> > On Sat, 2023-12-23 at 15:00 +0800, chenglulu wrote:
> > > Hi,
> > >
> > > This patch will cause the following tests to fail:
> > >
> >
On Sat, 2023-12-23 at 18:47 +0800, Xi Ruoyao wrote:
> On Sat, 2023-12-23 at 18:44 +0800, Xi Ruoyao wrote:
> > On Sat, 2023-12-23 at 10:29 +0800, chenglulu wrote:
> > > > The performance drop has nothing to do with this patch. I found that
> > > > the h264 performa
> + "&& true"
> [(set (match_dup 0) (match_dup 1))
> (set (zero_extract:GPR (match_dup 0) (match_dup 2) (match_dup 4))
> (match_dup 3))]
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
On Mon, 2023-12-25 at 10:08 +0800, chenglulu wrote:
>
> On 2023/12/24 at 8:59 PM, Xi Ruoyao wrote:
> > On Sat, 2023-12-23 at 18:47 +0800, Xi Ruoyao wrote:
> > > On Sat, 2023-12-23 at 18:44 +0800, Xi Ruoyao wrote:
> > > > On Sat, 2023-12-23 at 10:29 +0800, chenglulu wrote:
The problem with peephole2 is it uses a naive sliding-window algorithm
and misses many cases. For example:
float a[1];
float t() { return a[0] + a[8000]; }
is compiled to:
la.local $r13,a
la.local $r12,a+32768
fld.s $f1,$r13,0
fld.s $f0,$r12,-768
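The two offsets in that output can be checked by hand: a[8000] lies 8000 * 4 = 32000 bytes past a, and 32000 = 32768 - 768. A minimal sketch of that split follows, assuming the load takes a signed 12-bit immediate (which is why the remainder falls in [-2048, 2047]); the struct and function names are made up for illustration:

```c
#include <stdint.h>

/* Hypothetical helper type: OFF == HI + LO, with LO a signed 12-bit
   value and HI a multiple of 4096.  */
struct offset_split
{
  int64_t hi;
  int64_t lo;
};

static struct offset_split
split_offset (int64_t off)
{
  /* Sign-extend the low 12 bits of OFF.  */
  int64_t lo = ((off & 0xfff) ^ 0x800) - 0x800;
  struct offset_split s = { off - lo, lo };
  return s;
}
```

For the example above, split_offset (32000) yields hi = 32768 and lo = -768, matching the a+32768 base and the -768 displacement in the second load.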
The GCC internal doc says:
X might be a pseudo-register or a 'subreg' of a pseudo-register,
which could either be in a hard register or in memory. Use
'true_regnum' to find out; it will return -1 if the pseudo is in
memory and the hard register number if it is in a register.
ymbol_ref:DI ("*.LANCHOR0") [flags 0x182])) [0 S1
> A8]))) "volatile.c":5:11 -1
> (nil))
>
> The volatile property of the mem here is gone, so the test fails.
Phew. I guess I couldn't reproduce it because I have Jeff's ext-dce
patch in my local repo, which removed the zero_extend...
I'll rework this patch.
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
; >nelt,
> +
> rperm));
> + tmp = gen_rtx_SUBREG (E_V4DImode, d->target, 0);
Likewise.
> + emit_move_insn (tmp, sel);
> + break;
> + case E_V8SFmode:
> + sel = ge
The problem with peephole2 is it uses a naive sliding-window algorithm
and misses many cases. For example:
float a[1];
float t() { return a[0] + a[8000]; }
is compiled to:
la.local $r13,a
la.local $r12,a+32768
fld.s $f1,$r13,0
fld.s $f0,$r12,-768
+ op = XEXP (op, 0);
> > + return symbolic_pcrel_operand (op, Pmode) ||
> > +symbolic_pcrel_offset_operand (op, Pmode);
> > +})
> > +
> >
> The '||' operator shouldn't be at the end of the line.
Indeed.
>
> + return symbolic_pcrel_operand (op, Pmode)
> + || symbolic_pcrel_offset_operand (op, Pmode);
>
> Others LGTM.
> Thanks!
>
> /* snip */
>
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
Pushed v4 as attached, with the format issues fixed and a minor
adjustment in the commit message ("define_insn_and_split" is changed to
"define_insn_and_rewrite" to match the actual change).
On Fri, 2023-12-29 at 19:55 +0800, Xi Ruoyao wrote:
> On Fri, 2023-12-29 at 15:57
gcc/ChangeLog:
* config/loongarch/loongarch.md (bstrins__for_ior_mask):
For the condition, remove unneeded trailing "\" and move "&&" to
follow GNU coding style. NFC.
---
Pushed as obvious.
gcc/config/loongarch/loongarch.md | 4 ++--
1 file changed, 2 insertions(+), 2 d
but not reduc_fmin_scal_*?
> If so, we probably need a new target selector for fmin/fmax reduction.
Let me check whether the [x]vf{min,max} instructions are IEEE-conformant. They've
still not released the volume 2 of the instruction manual so I can only
try...
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
On Sat, 2023-12-30 at 20:25 +0800, Xi Ruoyao wrote:
> On Sat, 2023-12-30 at 12:15 +, Richard Sandiford wrote:
> > This shouldn't be necessary. The test does:
> >
> > for (int i = 0; i < n; i += 2)
> > {
> > x0 = __builtin_fmin (x0, ptr[i
We already had smin/smax RTL pattern using vfmin/vfmax instructions.
But for smin/smax, it's unspecified what will happen if either operand
contains any NaN operands. So we would not vectorize the loop with
-fno-finite-math-only (the default for all optimization levels except
-Ofast).
But, LoongA
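To see why NaN operands matter here, compare a plain comparison-based minimum with one that follows the fmin convention of returning the non-NaN operand. This is an illustrative sketch only; `naive_min` and `nan_aware_min` are made-up names and the code says nothing about what the hardware instructions actually do:

```c
#include <math.h> /* NAN macro, for callers of the helpers below.  */

/* Comparison-based minimum: the result depends on operand order when
   one operand is NaN, because every comparison with NaN is false.  */
static double
naive_min (double a, double b)
{
  return a < b ? a : b;
}

/* Minimum in the style of fmin: if exactly one operand is NaN, return
   the other one.  x != x is true only for NaN.  */
static double
nan_aware_min (double a, double b)
{
  if (a != a)
    return b;
  if (b != b)
    return a;
  return a < b ? a : b;
}
```

With these definitions naive_min (1.0, NAN) is NaN while naive_min (NAN, 1.0) is 1.0, whereas nan_aware_min gives 1.0 in both orders. A vectorized reduction is only safe under default math flags if the instruction gives the order-independent behaviour.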
On Wed, 2024-01-03 at 16:24 +0800, chenglulu wrote:
> LGTM!
>
> Thanks!
Pushed r14-6890.
FWIW sometimes tree optimizer still fails to emit .reduc_f{max,min} or
it emits them sub-optimally. I've commented in PR112457 but maybe I
should've created a new ticket...
> On 2024
match_operand:DI 2 "register_operand" "=&r"))]
And use
gen_movdi_pcrel64 (operands[0], operands[1], gen_reg_rtx(DImode))
in expand.
> + "TARGET_64BIT"
> + "la.local %0,$r15,%1"
> + [(set_attr "mode" "DI")
> + (set_attr "length" "5")])
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
On Thu, 2024-01-04 at 11:58 +0800, chenglulu wrote:
>
> On 2024/1/4 at 11:51 AM, Xi Ruoyao wrote:
> > On Wed, 2023-12-27 at 16:46 +0800, Lulu Cheng wrote:
> > > +(define_insn "movdi_pcrel64"
> > > + [(set (match_operand:DI 0 "register_oper
's to get as much testing
> as possible. Assuming the rest is ACK'd for the trunk we'll put it into
> the list of optimizations enabled by -O2.
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
_effective_target_s390_vx])
> > +|| ([istarget riscv*-*-*]
> > + && [check_effective_target_riscv_v])
>
> Unless I'm missing something, we have copysign in the scalar
> floating-point ISAs as well. So I think this should be
>
> || ([istarget riscv*-*-*]
> && [check_effective_target_hard_float])
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
e several hours trying to implement this...
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
On Fri, 2024-01-05 at 17:57 +0800, chenglulu wrote:
>
> On 2024/1/5 at 4:37 PM, Xi Ruoyao wrote:
> > On Fri, 2024-01-05 at 11:40 +0800, Lulu Cheng wrote:
> > > bool
> > > loongarch_explicit_relocs_p (enum loongarch_symbol_type type)
> > > {
> > > +
fective_target_loongarch_sx] ||" because SIMD
requires hard float.
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
On Fri, 2024-01-05 at 18:25 +0800, Xi Ruoyao wrote:
> On Fri, 2024-01-05 at 17:57 +0800, chenglulu wrote:
> >
> > On 2024/1/5 at 4:37 PM, Xi Ruoyao wrote:
> > > On Fri, 2024-01-05 at 11:40 +0800, Lulu Cheng wrote:
> > > > bool
> > > > loongarch_ex
HAS_DIV32 etc. in the code base? It seems some of them are not
replaced.
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
On Fri, 2024-01-05 at 20:45 +0800, chenglulu wrote:
>
> On 2024/1/5 at 7:55 PM, Xi Ruoyao wrote:
> > On Fri, 2024-01-05 at 18:25 +0800, Xi Ruoyao wrote:
> > > On Fri, 2024-01-05 at 17:57 +0800, chenglulu wrote:
> > > > On 2024/1/5 at 4:37 PM, Xi Ruoyao wrote:
> > > >
_rtx (DImode);
> + emit_insn (gen_addsi3_extended (t, operands[1], operands[2]));
AFAIK if !TARGET_64BIT a DImode should be actually a pair of hardware
registers, but addsi3_extended doesn't output such a pair so this seems
invalid...
> + t = gen_lowpart (SImode, t);
> +
"")])
>
> +(define_insn "*nsi_internal"
> + [(set (match_operand:SI 0 "register_operand" "=r")
> + (neg_bitwise:SI
> + (not:SI (match_operand:SI 1 "register_operand" "r"))
> + (match_operand:SI 2 "register_operand" "r")))]
> + "TARGET_64BIT"
> + "n\t%0,%2,%1"
> + [(set_attr "type" "logical")
> + (set_attr "mode" "SI")])
>
> ;;
> ;;
> @@ -3167,7 +3210,6 @@ (define_expand "condjump"
> (label_ref (match_operand 1))
> (pc)))])
>
> -
>
> ;;
> ;;
> @@ -3967,10 +4009,13 @@ (define_insn "bytepick_w_"
> (define_insn "bytepick_w__extend"
> [(set (match_operand:DI 0 "register_operand" "=r")
> (sign_extend:DI
> - (ior:SI (lshiftrt (match_operand:SI 1 "register_operand" "r")
> - (const_int ))
> - (ashift (match_operand:SI 2 "register_operand" "r")
> - (const_int bytepick_w_ashift_amount)]
> + (subreg:SI
> + (ior:DI (subreg:DI (lshiftrt
> + (match_operand:SI 1 "register_operand" "r")
> + (const_int )) 0)
> + (subreg:DI (ashift
> + (match_operand:SI 2 "register_operand" "r")
> + (const_int bytepick_w_ashift_amount)) 0)) 0)))]
> "TARGET_64BIT"
> "bytepick.w\t%0,%1,%2,"
> [(set_attr "mode" "SI")])
> diff --git a/gcc/testsuite/gcc.target/loongarch/sign-extend-bitwise.c
> b/gcc/testsuite/gcc.target/loongarch/sign-extend-bitwise.c
> new file mode 100644
> index 000..5753ef69db2
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/loongarch/sign-extend-bitwise.c
> @@ -0,0 +1,21 @@
> +/* { dg-do compile } */
> +/* { dg-options "-mabi=lp64d -O2" } */
> +/* { dg-final { scan-assembler-not "slli.w\t\\\$r\[0-9\]+,\\\$r\[0-9\]+,0" }
> } */
> +
> +struct pmop
> +{
> + unsigned int op_pmflags;
> + unsigned int op_pmpermflags;
> +};
> +unsigned int PL_hints;
> +
> +struct pmop *pmop;
> +void
> +Perl_newPMOP (int type, int flags)
> +{
> + if (PL_hints & 0x0010)
> + pmop->op_pmpermflags |= 0x0001;
> + if (PL_hints & 0x0004)
> + pmop->op_pmpermflags |= 0x0800;
> + pmop->op_pmflags = pmop->op_pmpermflags;
> +}
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
can-assembler-times "slli.w\t\\\$r\[0-9\]+,\\\$r\[0-9\]+,0"
> 0 } } */
Use scan-assembler-not instead of scan-assembler-times ... 0.
Otherwise LGTM.
> #include
> #define my_min(x, y) ((x) < (y) ? (x) : (y))
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
Recently there are some people building GCC with srcdir == objdir and
the attempts just failed [1]. So stop saying "it should work". OTOH
objdir as a subdirectory of srcdir works: we've built GCC in LFS [2]
and BLFS [3] this way for decades and this is confirmed during the
review of a previous ve
On Thu, 2023-11-30 at 08:44 -0700, Jeff Law wrote:
>
>
> On 11/29/23 02:33, Xi Ruoyao wrote:
> > On Mon, 2023-11-27 at 23:06 -0700, Jeff Law wrote:
> > > This has (of course) been tested on rv64. It's also been bootstrapped
> > > and regression tested on x8
Ping.
On Fri, 2023-11-24 at 17:09 +0800, Xi Ruoyao wrote:
> With -fno-fp-int-builtin-inexact, trunc is not allowed to raise
> FE_INEXACT and it should produce an integral result (if the input is not
> NaN or Inf). Thus FE_INEXACT should not be raised.
>
> But (int)x may raise FE
version (D_SoftFloat)
> + return;
> + else
> + {
> + asm nothrow @nogc
> + {
> + "movgr2fcsr $r0,%0" :
> + : "r" (newState & (roundingMask |
> allExceptions));
> + }
>
CC is configured to decide the default.
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
On Fri, 2023-12-01 at 18:01 +0800, Xi Ruoyao wrote:
> On Fri, 2023-12-01 at 17:55 +0800, mengqinggang wrote:
> > Generate la.tls.desc macro instruction for TLS descriptors model.
> >
> > la.tls.desc expand to
> > pcalau12i $a0, %desc_pc_hi20(a)
> > ld.d
ult if
it's supported by the assembler and --with-glibc-version= setting is
high enough...
Currently the only architecture (AFAIK) having TLS desc as the default
is AArch64 because it has supported TLS desc since its inception.
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
> +#if !defined(IN_LIBGCC2) && !defined(IN_TARGET_LIBS) && !defined(IN_RTS)
> #include "loongarch-def.h"
> +#endif
With this change we can revert r14-5634 (remove the #if
!defined(IN_LIBGCC2) && !defined(IN_TARGET_LIBS) && !defined(IN_RTS)
guards in loongarch-def.h as they'll be unneeded).
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
t that the code can't go here, I will add a prompt
> message here.:-(
If I read the code correctly, this is indeed unreachable so we can just
put gcc_unreachable() here. But maybe I'm wrong.
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
> @end smallexample
No, this is definitely incorrect. srcdir is the path (it may be
relative or absolute) to the GCC source tree. It does not need to be
placed in the parent directory of objdir.
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
> \
> - int *temp_ref = &ref[i], *temp_res = &res[i];
> \
> + int *temp_ref = (int *)&ref[i], *temp_res = (int *)&res[i];
> \
> if (abs (*temp_ref - *temp_res) > 0)
> \
> {
> \
> printf (" error: %s at line %ld , expected " #ref
> \
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
On Mon, 2023-12-04 at 20:31 +0800, Xi Ruoyao wrote:
> On Mon, 2023-12-04 at 20:14 +0800, chenxiaolong wrote:
> > On LoongArch architecture, using the latest gcc14 in regression test,
> > it is found that the vector test cases in vector directory appear FAIL
> > entries with un
nt main()
{
float x[4] = {};
int y[4] = {};
assert_eq(x, y, __LINE__);
}
This is C++, not C. But IMO we can port the tests to C++ anyway.
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
Martin Uecker has pointed out the alignment may be different
with the different order of arguments, per C23 (N2293). With earlier
versions of the standard some people believe the alignment should not be
different, while the other people disagree (as the text is not very
clear).
--
Xi Ruoya
_save_restore_reg (word_mode, regno, offset, fn);
> +
> + offset -= UNITS_PER_WORD;
> + }
> + }
I don't like this pair of {} for the for statement. It's not necessary
and it changes the indent level, making the diff hard to review.
Otherwise LGTM. I'm not sure why I didn't notice the eh_return issue
when I learnt shrink wrapping from RISC-V...
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
On Thu, 2023-12-07 at 14:18 +0800, Yang Yujie wrote:
> On Thu, Dec 07, 2023 at 11:02:58AM +0800, Xi Ruoyao wrote:
> >
> > I don't like this pair of {} for the for statement. It's not necessary
> > and it changes the indent level, making the diff hard to review.
>
There seems no real reason to require -mexplicit-relocs=always for
-mcmodel=extreme or model attribute. As the linker does not know how to
relax a 3-operand la.local or la.global pseudo instruction, just emit
explicit relocs for SYMBOL_PCREL64, and under TARGET_CMODEL_EXTREME also
SYMBOL_GOT_DISP.
> Hi,
>
> Changes to this module should go first to github.com/dlang/phobos.
>
> I also notice that these SoftFloat static conditions in all LoongArch
> support code don't exist in upstream either. Can a pull request be
> raised to sort out the discrepancy?
It looks like this patch has been dropped in V3.
--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
We are excluding loongarch-opts.h from target libraries, but now struct
loongarch_target and gcc_options are not declared in the target
libraries, causing:
In file included from ../.././gcc/options.h:8,
from ../.././gcc/tm.h:49,
from ../../../gcc/libgcc/fixed-bit.