https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119832
--- Comment #4 from Li Pan ---
(In reply to Kito Cheng from comment #1)
> Created attachment 61135 [details]
> 0001-RISC-V-Implement-TARGET_MODE_CONFLUENCE.patch
>
> My working patch for this bug
I think TARGET_MODE_CONFLUENCE should be a bett
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119832
--- Comment #2 from Li Pan ---
More details.
7 --- mode change from 10 -> 7 // NONE => DYN <<<
8 --- restore mode is dyn and prev is not call
9 --- mode change from 7 -> 9 // DYN => CALL
10 --- mode change from 10 -> 9 // NONE =>
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119757
Li Pan changed:
What|Removed |Added
CC||pan2.li at intel dot com
--- Comment #8 from L
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119581
--- Comment #3 from Li Pan ---
(In reply to Jeffrey A. Law from comment #2)
> Thanks Pan. I've got an intern working in this space, and this may be a
> good exercise for them. So definitely reach out before you dive in to see
> if she's gotten
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119581
Li Pan changed:
What|Removed |Added
CC||pan2.li at intel dot com
--- Comment #1 from L
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119554
Li Pan changed:
What|Removed |Added
CC||jeffreyalaw at gmail dot com,
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119547
Li Pan changed:
What|Removed |Added
CC||jeffreyalaw at gmail dot com,
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119114
--- Comment #19 from Li Pan ---
> No you got it wrong.
> _121 will either be -1 or 0. _11 should be -1 or 0 too.
> So the question is what was the VEC_EXTRACT doing the right thing? Is it 0/-1
> or 0/1?
Oh, I see. Let me revisit the dump code
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119114
--- Comment #12 from Li Pan ---
(In reply to Robin Dapp from comment #9)
> I suspect the problem lies somewhere here:
>
> _11 = .VEC_EXTRACT (mask__83.22_110, 0);
> _23 = MEM[(short int *)&t + 20B];
> _24 = _23 & _132;
> _25 = _24 != 0;
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119114
--- Comment #8 from Li Pan ---
vect__81.20_52 = vect_cst__142 & _164; // {3}
mask__82.21_53 = vect__81.20_52 != { 0, 0, 0, 0, 0, 0, 0, 0 }; // 0xff
_31 = mask__82.21_53 ^ mask__57.18_81; // 0xff
mask__8
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119114
--- Comment #7 from Li Pan ---
Yes, double-checked: the result of tree.optimized looks right, details as
below.
So it should be a backend issue now; I will take a look into it.
[local count: 56478818]:
_114 = MEM[(short int
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119114
--- Comment #5 from Li Pan ---
(In reply to Robin Dapp from comment #4)
> Very weird indeed. It looks like we're not even vectorizing? I mean, sure,
> we use vector instructions but they are all broadcast from scalars?
> (VMAT_INVARIANT) And
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119114
--- Comment #2 from Li Pan ---
Tweaked the test case to make the problem easier to locate.
int b[18];
long long al;
_Bool e;
char f = 010;
short t[18];

unsigned w[8][18][18][18];
_Bool a;
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119114
--- Comment #3 from Li Pan ---
The related asm looks abnormal up to a point: there should be a reduce insn for
a, but there is none. The insns and flow look like below.
1028c: cc847057 vsetivli zero,8,e16,m1,ta,ma
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119114
Li Pan changed:
What|Removed |Added
CC||pan2.li at intel dot com
--- Comment #1 from L
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118931
--- Comment #3 from Li Pan ---
It is a bug in the interleaved_stepped handling of expand_const_vector: base +
i*step for the base1 series may overflow, and the overflowed bits are then ORed
into the base2 series part of the final result.
I will prepare a fix for this.
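To make the overflow described above concrete, here is a minimal C sketch. It assumes the interleave is built by packing one element of each series into a lane twice as wide (series 1 in the low half, series 2 in the high half); the helper names and the 8-bit/16-bit widths are hypothetical and are not taken from the GCC sources.

```c
#include <assert.h>
#include <stdint.h>

/* Element i of a stepped series, computed at full width (no wraparound). */
unsigned series(unsigned base, unsigned step, unsigned i) {
    return base + i * step;
}

/* Pack an 8-bit element of series 1 (low byte) and of series 2 (high byte)
   into one 16-bit lane WITHOUT truncating series 1 first. Any bits of s1
   above bit 7 are ORed into the byte that belongs to series 2. */
uint16_t pack_unmasked(unsigned s1, unsigned s2) {
    assert(s2 <= 0xffu);            /* series 2 is assumed to fit in 8 bits */
    return (uint16_t)((s2 << 8) | s1);
}

/* Same packing, but series 1 is truncated to its 8-bit element width. */
uint16_t pack_masked(unsigned s1, unsigned s2) {
    assert(s2 <= 0xffu);
    return (uint16_t)((s2 << 8) | (s1 & 0xffu));
}
```

With base1 = 250, step1 = 4 and i = 2, series 1 yields 258 (0x102), which no longer fits in 8 bits; without the mask, the extra bit lands in the byte that should hold only the series 2 element.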
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118931
--- Comment #2 from Li Pan ---
int main ()
{
  vector(16) unsigned char vect__3.5;
  unsigned char a_lsm.2;
  long long int _5;
  vector(16) unsigned char _13;
  unsigned char _29;
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118931
Li Pan changed:
What|Removed |Added
CC||pan2.li at intel dot com
--- Comment #1 from L
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118949
--- Comment #5 from Li Pan ---
Thanks Vineet, updated with another case that uses an explicit convert. It is
unrelated to the global_reg change.
#define T float

void func(const T * restrict a, const T * restrict b,
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118949
Li Pan changed:
What|Removed |Added
CC||pan2.li at intel dot com
--- Comment #2 from L
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116351
--- Comment #5 from Li Pan ---
(In reply to Li Pan from comment #4)
> I see. I worked out another fix that is under testing, and will send it out
> if testing shows no surprises.
That seems not quite correct; I will try yours instead.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116351
--- Comment #4 from Li Pan ---
I see. I worked out another fix that is under testing, and will send it out if
testing shows no surprises.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116351
Li Pan changed:
What|Removed |Added
CC||pan2.li at intel dot com
--- Comment #2 from L
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118540
Li Pan changed:
What|Removed |Added
CC||pan2.li at intel dot com
--- Comment #1 from L
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118832
--- Comment #9 from Li Pan ---
(In reply to Robin Dapp from comment #8)
> I think for vec_duplicate the idea is the same as for all the other splits -
> keep it in simple shape so we can combine/fwprop etc. It also helps
> converting e.g.
>
>
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118832
--- Comment #7 from Li Pan ---
Thanks Jeff and Robin, that makes much sense to me.
However, I got a little confused about the vec_duplicate with
define_insn_and_split. As I learned, define_insn_and_split equals define_insn +
define_split + defi
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118832
Li Pan changed:
What|Removed |Added
CC||juzhe.zhong at rivai dot ai
--- Comment #2 fro
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118832
Li Pan changed:
What|Removed |Added
CC||pan2.li at intel dot com
--- Comment #1 from L
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118103
--- Comment #12 from Li Pan ---
(In reply to Li Pan from comment #10)
> (In reply to Vineet Gupta from comment #8)
> > A fix for PR/118464 is posted to list [1] which also cures this issue.
> >
> > [1] https://gcc.gnu.org/pipermail/gcc-patches/
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118103
--- Comment #11 from Li Pan ---
TARGET_CONDITIONAL_REGISTER_USAGE can help to resolve this issue; let me run
the regression tests with it.
It also looks like we no longer need emit_volatile_frm here, but that is a
separate refactor for later.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118103
--- Comment #10 from Li Pan ---
(In reply to Vineet Gupta from comment #8)
> A fix for PR/118464 is posted to list [1] which also cures this issue.
>
> [1] https://gcc.gnu.org/pipermail/gcc-patches/2025-January/674498.html
Thanks Vineet, it se
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118103
--- Comment #9 from Li Pan ---
(In reply to Richard Sandiford from comment #7)
> The problem seems to be in the modelling of the FRM register.
> CALL_USED_REGISTERS says that the register is call-clobbered/caller-save,
> which means:
>
> (a) i
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118103
--- Comment #5 from Li Pan ---
bisect late-combine2 results in some invalid combine here for main function.
(insn 40 5 41 2 (set (reg:SI 11 a1 [151])
(reg:SI 69 frm)) "pr118103-simple.c":67:15 2712 {frrmsi}
(nil))
(insn 41 40 7 2 (set (reg
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118103
--- Comment #4 from Li Pan ---
gcc-14 has the correct behavior, so this is most likely caused by some middle-end change, I guess.
└─(11:39:07 on master⚑ ✭)──> riscv64-linux-gnu-gcc-14 --version
riscv64-linux-gnu-gcc-14 (Ubuntu 14.2.0-4ubuntu2~24.04) 1
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118103
--- Comment #3 from Li Pan ---
Interesting: test_example, as a separate function other than main, has the frm
restore insn, but there is no such frm restore in the main function.
test_exampe:
  frrm a2
  fsrmi
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118103
--- Comment #2 from Li Pan ---
(In reply to Li Pan from comment #1)
> Ack, let me try to reproduce this.
Reproduced. The FRM restore is deleted somewhere once the compute is inlined;
I will take a look into it.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118103
Li Pan changed:
What|Removed |Added
CC||pan2.li at intel dot com
--- Comment #1 from L
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117688
--- Comment #7 from Li Pan ---
Reproduced and will prepare a fix patch for this.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117688
--- Comment #6 from Li Pan ---
Ack, looks like a code-gen issue for the risc-v backend, let me try to
reproduce it from qemu and dev-board.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118075
--- Comment #2 from Li Pan ---
Ack and reproduced.
From a rough look, it should be a memory alias issue with the strided store,
because disabling scheduling fixes it.
I will take care of it.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117878
--- Comment #12 from Li Pan ---
(In reply to Robin Dapp from comment #11)
> I'm not really sure. For now I hope not. If we hit similar problems again
> that are not easily fixable we can reconsider.
Sure thing.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117878
--- Comment #10 from Li Pan ---
(In reply to Robin Dapp from comment #9)
> Should be fixed.
Thanks Robin for fixing this, do we still need to do something like
ix86_pre_reload_split for the risc-v backend? Which avoids the define expand
to b
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118019
--- Comment #4 from Li Pan ---
(In reply to Li Pan from comment #3)
> #include
>
> #define I_P1 16
> #define I_P2 1344
>
> #define HADAMARD4(d0, d1, d2, d3, s0, s1, s2, s3) {\
>
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118019
--- Comment #3 from Li Pan ---
#include

#define I_P1 16
#define I_P2 1344

#define HADAMARD4(d0, d1, d2, d3, s0, s1, s2, s3) {\
  int t0 = s0 + s1;\
  int t1 = s0 - s1;\
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117990
--- Comment #8 from Li Pan ---
The gather load already includes this (mem:BLK (scratch)), thus it doesn't
have this problem.
BTW, does alias analysis support complicated scenarios like strided/indexed
load (I bet we may need more info to f
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117990
--- Comment #6 from Li Pan ---
Adding (mem:BLK (scratch)) to the strided load define_insn helps to fix this issue,
as (mem:BLK (scratch)) is considered to alias all other memory.
In theory, we can do even more accurate alias analysis here, like v
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117990
--- Comment #5 from Li Pan ---
The tree optimized looks right up to a point.
int main ()
{
  vector(8) int vect__4.8;
  vector(8) char vect__3.7;
  vector(8) char D.2823;
  int _5;

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117990
--- Comment #4 from Li Pan ---
Another example to reproduce this.
#define STEP 10

char d[225];
int e[STEP];

int main() {
  for (long h = 0; h < STEP; ++h)
    d[h * STEP] =
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117990
--- Comment #2 from Li Pan ---
(In reply to Patrick O'Neill from comment #1)
> -flto can be replaced with -fwhole-program:
>
> -march=rv64gcv_zvl256b -fwhole-program -O3 -mrvv-vector-bits=zvl test.c -o
> user-config.out
Confirmed, reproduced b
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117722
--- Comment #17 from Li Pan ---
(In reply to Vineet Gupta from comment #14)
> (In reply to Li Pan from comment #7)
> > Created attachment 59661 [details]
> > with usad pattern
>
> Can you please post the patch, lest we duplicate your effort.
>
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117878
--- Comment #7 from Li Pan ---
This insn is introduced during reload, in lra_constraints. There will be a const
vector like:
(const_vector:V8QI [
(const_int 4 [0x4])
(const_int 12 [0xc])
(const_int 5 [0x5])
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117878
--- Comment #6 from Li Pan ---
526.blender_r is too large to analyze directly, so I will use the code below,
which also reproduces this issue, to investigate.
int *b;
inline void c(char *d, int e) {
  d[0] = 0;
  d[1] = e;
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117878
--- Comment #4 from Li Pan ---
(In reply to Robin Dapp from comment #3)
> Generally, yes, I guess. But I'd like to understand better what exactly is
> going wrong. Shouldn't emitting those "pre-RA" insns already be guarded
> properly? I haven
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117878
--- Comment #2 from Li Pan ---
(In reply to Robin Dapp from comment #1)
> Is this related to PR117353? Seems very similar.
Yes, very similar, but the ICE is in a different pass.
The similar approach like ix86_pre_reload_split can fix the code example i
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117878
Bug ID: 117878
Summary: RISC-V: ICE when build spec17 526.blender_r with -O3
-march=rv64gcv_zvl256b
Product: gcc
Version: 15.0
Status: UNCONFIRMED
Severity: no
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88603
--- Comment #4 from Li Pan ---
(In reply to Andrew Pinski from comment #3)
> We don't recognize saturation_add in comment #0 as a SAT_ADD still.
Yes, forms that convert to a wider type for overflow checking are not supported
for now. I will take care
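For illustration, the two shapes under discussion can be sketched in C as follows. This is a hypothetical example and is not copied from the bug report: the function names are made up here, and the first function shows what a widening form may look like, while the second shows a same-width branchless form.

```c
#include <assert.h>
#include <stdint.h>

/* Widening form: add in a 64-bit type, then clamp to the 32-bit maximum. */
uint32_t sat_add_widen(uint32_t a, uint32_t b) {
    uint64_t tmp = (uint64_t)a + b;
    return tmp > UINT32_MAX ? UINT32_MAX : (uint32_t)tmp;
}

/* Same-width branchless form: (a + b) | -((a + b) < a).
   If the addition wraps, sum < a holds and the mask is all ones. */
uint32_t sat_add_branchless(uint32_t a, uint32_t b) {
    uint32_t sum = a + b;
    return sum | (uint32_t)-(uint32_t)(sum < a);
}

/* Helper that checks both forms agree for one pair of inputs. */
void check_same(uint32_t a, uint32_t b) {
    assert(sat_add_widen(a, b) == sat_add_branchless(a, b));
}
```

Both functions compute the same unsigned saturating add; they differ only in how the overflow is detected.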
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112600
--- Comment #26 from Li Pan ---
(In reply to Uroš Bizjak from comment #25)
> (In reply to Li Pan from comment #24)
>
> > Does upstream still have the issue mentioned above? If not, I'll add some
> > test cases for i386.
> The issue looks fixed, f2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112600
--- Comment #24 from Li Pan ---
(In reply to Uroš Bizjak from comment #22)
> (In reply to Li Pan from comment #21)
>
> > Looks like f2 can be vectorized to sat_add on upstream now; this may be due to
> > recent changes. Let me add one test for th
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117722
--- Comment #12 from Li Pan ---
(In reply to Robin Dapp from comment #11)
> (In reply to Li Pan from comment #9)
> > Created attachment 59663 [details]
> > before_vs_after when outer loop is 128
>
> Ok, that's a different loop then. I'm seeing
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117722
--- Comment #9 from Li Pan ---
Created attachment 59663
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=59663&action=edit
before_vs_after when outer loop is 128
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117722
--- Comment #7 from Li Pan ---
Created attachment 59661
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=59661&action=edit
with usad pattern
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117722
--- Comment #6 from Li Pan ---
Created attachment 59660
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=59660&action=edit
upstream
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117722
--- Comment #2 from Li Pan ---
Taking x86_64 perf data for the 625 base run, x264_pixel_satd_8x4 is the hottest function.
Children Self Command Shared Object Symbol
+ 19.26% 18.96% x264_s_base.non x264_s_base.none [.]
x264_pixel
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117594
--- Comment #4 from Li Pan ---
I can reproduce this.
└─(07:29:53 on master⚑ ✭)──>
QEMU_CPU=rv64,vlen=128,rvv_ta_all_1s=true,rvv_ma_all_1s=true,v=true,vext_spec=v1.0
~/bin/qemu/bin/qemu-riscv64 test.elf
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113583
Li Pan changed:
What|Removed |Added
CC||pan2.li at intel dot com
--- Comment #20 from
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112600
--- Comment #23 from Li Pan ---
(In reply to Uroš Bizjak from comment #22)
> (In reply to Li Pan from comment #21)
>
> > Looks like f2 can be vectorized to sat_add on upstream now; this may be due to
> > recent changes. Let me add one test for th
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112600
--- Comment #21 from Li Pan ---
(In reply to Li Pan from comment #20)
> (In reply to Li Pan from comment #19)
> > Interesting, I will take a look at f2 after some more sat_* forms are supported.
>
> The RISC-V backend works well for all of the above patterns but
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116655
--- Comment #5 from Li Pan ---
(In reply to Robin Dapp from comment #4)
> Fixed.
Thanks Robin, this also fixed the spec17 build failures as below for dynamic.
Build errors for intrate: 502.gcc_r(base; CE), 525.x264_r(base; CE),
557.xz_r(base;
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116064
--- Comment #15 from Li Pan ---
(In reply to Li Pan from comment #14)
> > So you have to use one of those two.
>
> Thanks, I see, let me update the config file and have another try.
-Wno-error=template-body works, thanks a lot.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116064
--- Comment #14 from Li Pan ---
> So you have to use one of those two.
Thanks, I see, let me update the config file and have another try.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116064
Li Pan changed:
What|Removed |Added
CC||pan2.li at intel dot com
--- Comment #12 from
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112600
--- Comment #20 from Li Pan ---
(In reply to Li Pan from comment #19)
> Interesting, I will take a look at f2 after some more sat_* forms are supported.
The RISC-V backend works well for all of the above patterns, but x86 fails on f2; let me
dig more details for
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112600
--- Comment #19 from Li Pan ---
Interesting, I will take a look at f2 after some more sat_* forms are supported.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116883
--- Comment #3 from Li Pan ---
I think xuli is working on this issue. As you know, the first week of Oct is
the National Holiday.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116861
--- Comment #10 from Li Pan ---
(In reply to Andrew Pinski from comment #9)
> (In reply to Li Pan from comment #8)
> > [0] psi ptr 0x7e2f8f00c000
> > [1] psi ptr 0x7e2f8f00c400
> > [2] psi ptr 0xa5a5a5a5a5a5a5a5 <=== Invalid.
> >
> > Looks som
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116861
--- Comment #8 from Li Pan ---
[0] psi ptr 0x7e2f8f00c000
[1] psi ptr 0x7e2f8f00c400
[2] psi ptr 0xa5a5a5a5a5a5a5a5 <=== Invalid.
Looks like some gsi info is polluted during matching.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116861
--- Comment #7 from Li Pan ---
Thanks all for reducing. Reproduced on my side, and I will take a look soon.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116795
Li Pan changed:
What|Removed |Added
Status|RESOLVED|CLOSED
--- Comment #10 from Li Pan ---
Thanks
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116795
--- Comment #8 from Li Pan ---
(In reply to Sam James from comment #7)
> (In reply to Li Pan from comment #6)
> > (In reply to Sam James from comment #5)
> > > Pan Li, if you set your email on Bugzilla to pa...@gcc.gnu.org, you will
> > > get
>
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116795
--- Comment #6 from Li Pan ---
(In reply to Sam James from comment #5)
> Pan Li, if you set your email on Bugzilla to pa...@gcc.gnu.org, you will get
> permissions to modify bugs :)
Yes and Thanks. I can modify bugs, but could you please help t
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116814
--- Comment #1 from Li Pan ---
Ack, thanks for reporting this.
Should be introduced by this commit.
https://github.com/gcc-mirror/gcc/commit/f2476a2649e9975d454d179145574c21d8218aee
I am preparing a fix for this.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116795
--- Comment #2 from Li Pan ---
Ack, and thanks for reporting, will take a look soon.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116278
--- Comment #11 from Li Pan ---
Thanks for the suggestion. I will move the run test to
gcc/testsuite/gcc.c-torture/execute and leave only the asm check under riscv.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116280
--- Comment #1 from Li Pan ---
Looks like some typos in md files, let me take a look.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116278
--- Comment #8 from Li Pan ---
(In reply to Li Pan from comment #7)
> The backend takes
> rtx xmode_x = gen_lowpart (Xmode, x);
>
> for the incoming op of .SAT_ADD, thus I think we should use lbu instead of
> lb according to the ISA.
During u
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116278
--- Comment #7 from Li Pan ---
The backend takes
rtx xmode_x = gen_lowpart (Xmode, x);
for the incoming op of .SAT_ADD, thus I think we should use lbu instead of lb
according to the ISA.
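The difference between the two loads can be sketched in C as below. This is a simplified model under stated assumptions, not the backend's expansion: the 64-bit register width, the helper names, and the compare-and-clamp shape of the saturating add are all chosen here for illustration. It only shows why an unsigned 8-bit operand needs clean (zero) upper bits before the add is carried out in a full-width register.

```c
#include <assert.h>
#include <stdint.h>

/* Model of lb: load a byte and sign-extend it to the register width.
   The conversion to int8_t is implementation-defined for values above 127;
   it wraps on the usual two's complement targets. */
uint64_t load_lb(const uint8_t *p) {
    return (uint64_t)(int64_t)(int8_t)*p;
}

/* Model of lbu: load a byte and zero-extend it to the register width. */
uint64_t load_lbu(const uint8_t *p) {
    return (uint64_t)*p;
}

/* Unsigned 8-bit saturating add done in a 64-bit register. It is only
   correct when the upper 56 bits of x are zero on entry. */
uint64_t sat_add_u8_in_xreg(uint64_t x, uint64_t y) {
    assert(y <= 0xffu);             /* y is assumed to be a clean 8-bit value */
    uint64_t sum = x + y;
    return sum > 0xffu ? 0xffu : sum;
}
```

For the byte 0xD8 (216 unsigned, -40 signed) plus 9, the zero-extended input gives 225, while the sign-extended input has its upper bits set and is wrongly clamped to 255.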
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116278
--- Comment #6 from Li Pan ---
(In reply to Andrew Pinski from comment #4)
> lb a1,0(a5) // load -40
> lui a0,%hi(.LC0)
> lui a4,%hi(c)
> addi a5,a1,9 // a5 = -31
> slli a5,a5,48
>
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116278
--- Comment #5 from Li Pan ---
Reproduced from both qemu and hardware, let me take a look.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116278
--- Comment #3 from Li Pan ---
(In reply to Kito Cheng from comment #2)
> Hi Pan, could you take a look to see if it related to SAT_ADD?
Ack, thanks.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116202
--- Comment #3 from Li Pan ---
(In reply to Li Pan from comment #2)
> Confirmed, thanks and will take care of it soon.
Just prepared a fix, and will send it out if no surprise from test.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116202
--- Comment #2 from Li Pan ---
Confirmed, thanks and will take care of it soon.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116103
--- Comment #10 from Li Pan ---
(In reply to Thomas Schwinge from comment #9)
> (In reply to Li Pan from comment #7)
> > confirm with you all related failures are covered.
>
> Yes, the testing state is restored to what it was before, thanks!
>
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116103
--- Comment #7 from Li Pan ---
Hi Thomas,
Could you please help to double-check that the patch below fixes these
asm check failures?
https://gcc.gnu.org/pipermail/gcc-patches/2024-July/658519.html
I tested below cases for target=amdgcn-a
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116103
--- Comment #6 from Li Pan ---
(In reply to Thomas Schwinge from comment #5)
> (In reply to Li Pan from comment #3)
> > best practice of cross
> > compile gfx908 in x86 linux?
>
> If you only need the 'cc1' (and no assembler, linker, libc), the
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116103
--- Comment #3 from Li Pan ---
Thanks Richard for the suggestion.
Hi Thomas, could you please advise on the best practice for cross-compiling
for gfx908 on x86 Linux?
Then I can have a try following Richard's suggestion.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115961
--- Comment #5 from Li Pan ---
Thanks Andrew Pinski.
That makes a lot of sense to me, and I can reproduce this on upstream now. Let me
file a patch for it.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115863
--- Comment #14 from Li Pan ---
Hi Uroš,
> Please note two new instructions in the second asm dump. These are expanded
> from .SAT_TRUNC and are not present in the first asm dump.
> The problem here is that the presence of ustrunc{m}{n}2 optab
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115961
--- Comment #3 from Li Pan ---
Only x86 has implemented .SAT_TRUNC for scalar, so I bet it is almost the same
issue as https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115863 ?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115863
--- Comment #13 from Li Pan ---
Thanks Richard and Bizjak.
Got the point here, and let me have a try for the improvement.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115863
--- Comment #8 from Li Pan ---
(In reply to Richard Biener from comment #7)
> (In reply to Uroš Bizjak from comment #6)
> > Please note that w/o .SAT_TRUNC the compiler is able to optimize hot loop in
> > compress2 to:
> >
> >[local count: