Re: Re: [PATCH] VECT: Change flow of decrement IV

2023-05-31 Thread 钟居哲
and do my downstream regression (the regression suite is very large and includes many benchmarks). Thanks. juzhe.zh...@rivai.ai From: Richard Biener Date: 2023-05-31 18:53 To: juzhe.zh...@rivai.ai CC: richard.sandiford; gcc-patches; linkw Subject: Re: Re: [PATCH] VECT: Change flow of decrement IV On Wed, 3

Re: Re: [PATCH] VECT: Change flow of decrement IV

2023-05-31 Thread Richard Biener via Gcc-patches
On Wed, 31 May 2023, juzhe.zh...@rivai.ai wrote: > Thanks Richard. > Seems that this patch's approach is ok to trunk? > Maybe the only thing we should do is to wait for Kewen's testing feedback, am I > right? Can you repost the patch with Kewen's fix and state how you tested it? Thanks, Richard.

Re: Re: [PATCH] VECT: Change flow of decrement IV

2023-05-31 Thread juzhe.zh...@rivai.ai
Thanks Richard. Seems that this patch's approach is ok to trunk? Maybe the only thing we should do is to wait for Kewen's testing feedback, am I right? Thanks. juzhe.zh...@rivai.ai From: Richard Sandiford Date: 2023-05-31 17:01 To: Richard Biener via Gcc-patches CC: Richard Biener; juzhe.zhong\@

Re: Re: [PATCH] VECT: Change flow of decrement IV

2023-05-31 Thread Richard Biener via Gcc-patches
On Wed, 31 May 2023, juzhe.zh...@rivai.ai wrote: > Hi, Richard. > > >> I don't object though. It just feels like we're giving up easily. > >> And that's a bit frustrating, since this potential problem was flagged > >> ahead of time. > > I can take a look at it. Would you mind giving me some hin

Re: Re: [PATCH] VECT: Change flow of decrement IV

2023-05-31 Thread juzhe.zh...@rivai.ai
> *From:* Richard Biener <mailto:rguent...@suse.de> > *Date:* 2023-05-31 14:41 > *To:* juzhe.zh...@rivai.ai <mailto:juzhe.zh...@rivai.ai> > *CC:* richard.sandiford <mailto:richard.sandif...@arm.com>; gcc-patches > <mailto:gcc-patches@gcc.gnu.org>

Re: Re: [PATCH] VECT: Change flow of decrement IV

2023-05-31 Thread juzhe.zh...@rivai.ai
>> I'm just saying that to go forward the vectorizer change looks >>more promising (also considering the pace RISC-V people are working at >>...) Yeah, RVV needs a lot of middle-end support: SELECT_VL, LEN_MASK_LOAD/LEN_MASK_STORE, etc. LEN_ADD for RVV reduction support like COND_ADD for ARM

Re: Re: [PATCH] VECT: Change flow of decrement IV

2023-05-31 Thread juzhe.zh...@rivai.ai
Hi, Richard. >> I don't object though. It just feels like we're giving up easily. >> And that's a bit frustrating, since this potential problem was flagged >> ahead of time. I can take a look at it. Would you mind giving me some hints? In which pass should I do this? The "ivopts" pass? Is that righ

Re: Re: [PATCH] VECT: Change flow of decrement IV

2023-05-30 Thread juzhe.zh...@rivai.ai
a/show_bug.cgi?id=109971, Kewen is happy with this patch; it turns out this patch fixes Power's issue. So let's wait for Richard's comments. Thanks. juzhe.zh...@rivai.ai From: Richard Biener Date: 2023-05-31 14:41 To: juzhe.zh...@rivai.ai CC: richard.sandiford; gcc-patches; linkw Su

Re: Re: [PATCH] VECT: Change flow of decrement IV

2023-05-30 Thread Richard Biener via Gcc-patches
ation to iteration (but we know a lower and maybe an upper bound?) Thanks, Richard. > Thanks. > > > juzhe.zh...@rivai.ai > > From: 钟居哲 > Date: 2023-05-30 23:05 > To: rguenther > CC: richard.sandiford; gcc-patches; linkw > Subject: Re: Re: [PATCH] VECT: Change fl

Re: Re: [PATCH] VECT: Change flow of decrement IV

2023-05-30 Thread juzhe.zh...@rivai.ai
ter well reviewed) or we should extend SCEV/IVOPTS? Thanks. juzhe.zh...@rivai.ai From: 钟居哲 Date: 2023-05-30 23:05 To: rguenther CC: richard.sandiford; gcc-patches; linkw Subject: Re: Re: [PATCH] VECT: Change flow of decrement IV More information on Power's testcase: Before this patch: t

Re: Re: [PATCH] VECT: Change flow of decrement IV

2023-05-30 Thread 钟居哲
Hi, Richi. >> As I said in the PR with the proposed scheme you get a loop around copy of >> the IV since both the pre and the post decrement values are live at the same >> time. >> If the CPU has an underflow bit set from the subtraction and a branch on that >> test using that could avoid the

Re: Re: [PATCH] VECT: Change flow of decrement IV

2023-05-30 Thread 钟居哲
More information on Power's testcase:
Before this patch:
test_npeel_int16_t:
        lui a4,%hi(.LANCHOR0+130)
        lui a3,%hi(.LANCHOR1)
        addi a3,a3,%lo(.LANCHOR1)
        addi a4,a4,%lo(.LANCHOR0+130)
        li a5,58
        li a2,16
        vsetivli zero,16,e16,m1,ta,ma
        vl1re16.v v3,0(a3)
        vid.v v1
.L5:
        minu a3,a5,a2
        vsetvli zero,a3,e16,m1

Re: Re: [PATCH] VECT: Change flow of decrement IV

2023-05-30 Thread 钟居哲
Also, I have investigated Power's testcase in RVV:
#include
#define TEST_ALL(T) \
  T (int8_t) \
  T (uint8_t)

Re: Re: [PATCH] VECT: Change flow of decrement IV

2023-05-30 Thread 钟居哲
Hi, all. After several investigations, here are my experiments:
void
single_rgroup (int32_t *__restrict a, int32_t *__restrict b, int n)
{
  for (int i = 0; i < n; i++)
    a[i] = b[i] + a[i];
}
void
mutiple_rgroup (float *__restrict f, double *__restrict d, int n)
{
  for (int i = 0; i < n; ++i)

Re: Re: [PATCH] VECT: Change flow of decrement IV

2023-05-30 Thread 钟居哲
>> That's odd, you only need to adjust the IV which is used in the exit test, >> not all the others. Sorry for my incorrect information. I checked the codegen of both single-rgroup and multi-rgroup. Their codegen behaves the same: after this patch, there is one more neg instruction in prehea

Re: Re: [PATCH] VECT: Change flow of decrement IV

2023-05-30 Thread juzhe.zh...@rivai.ai
>> How does it affect RVV code quality? I thought you specifically chose >> the previous approach because code quality was better that way. Yes, the previous way is better for RVV. But as I said, we will definitely use SELECT_VL, and with SELECT_VL we will use remain - step (with step produced by SELECT_VL).
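To illustrate the "remain - step" scheme mentioned in this message, here is a minimal scalar sketch in C. It is not GCC code: `select_vl_model` is a hypothetical stand-in for SELECT_VL, and its balancing rule is only an assumption chosen to show that the step may differ from min(remain, vf), so the loop has to subtract whatever step was actually returned.

```c
#include <stddef.h>

/* Hypothetical model of SELECT_VL (an assumption, not GCC's definition):
   returns how many elements to process this iteration, never more than
   REMAIN or VF.  When fewer than two full vectors remain, it splits the
   rest roughly evenly, so the step is not simply min (remain, vf).
   VF must be greater than zero.  */
static size_t
select_vl_model (size_t remain, size_t vf)
{
  if (remain <= vf)
    return remain;
  if (remain < 2 * vf)
    return (remain + 1) / 2;
  return vf;
}

/* Decrementing IV driven by the selected step: the loop counts REMAIN
   down by whatever STEP was chosen and exits when nothing is left.
   Returns the number of iterations executed.  */
size_t
count_iterations (size_t n, size_t vf)
{
  size_t remain = n;
  size_t iters = 0;
  while (remain != 0)
    {
      size_t step = select_vl_model (remain, vf);
      remain -= step;   /* remain - step, with step chosen per iteration */
      iters++;
    }
  return iters;
}
```

In this model the step is only known at run time in each iteration, which is why the exit test is written against the remaining count rather than against a fixed stride.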

Re: Re: [PATCH] VECT: Change flow of decrement IV

2023-05-30 Thread juzhe.zh...@rivai.ai
Before this patch:
foo:
        ble a2,zero,.L5
        csrr a3,vlenb
        srli a4,a3,2
.L3:
        minu a5,a2,a4
        vsetvli zero,a5,e32,m1,ta,ma
        vle32.v v2,0(a1)
        vle32.v v1,0(a0)
        vsetvli t1,zero,e32,m1,ta,ma
        vadd.vv v1,v1,v2
        vsetvli zero,a5,e32,m1,ta,ma
        vse32.v v1,0(a0)
        add a1,a1,a3
        add a0,a0,a3
        sub a2,a2,a5
        bne a2,zero
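The loop control in the "Before this patch" listing above (minu, then sub, then bne against zero) can be read as a decrementing IV. The scalar C sketch below mirrors that control flow only. The function name `add_decrement_iv` and the fixed `VF` of 4 are assumptions for illustration; the real code derives the vector length from vlenb.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed number of int32_t lanes per vector; the assembly computes this
   at run time as vlenb >> 2.  */
enum { VF = 4 };

/* Scalar emulation of a[i] = b[i] + a[i] with a decrementing IV.  */
void
add_decrement_iv (int32_t *a, const int32_t *b, size_t n)
{
  size_t remaining = n;                           /* a2 in the listing */
  while (remaining != 0)                          /* bne a2,zero,...   */
    {
      size_t len = remaining < VF ? remaining : VF;   /* minu a5,a2,a4 */
      for (size_t i = 0; i < len; i++)            /* stands in for the
                                                     length-controlled
                                                     vle32/vadd/vse32 */
        a[i] += b[i];
      /* The assembly advances both pointers by a full vector (vlenb
         bytes); advancing by LEN differs only on the final iteration
         and keeps the pointers within the arrays in C.  */
      a += len;
      b += len;
      remaining -= len;                           /* sub a2,a2,a5      */
    }
}
```

The exit test compares the remaining count with zero, so no separate up-counting index is needed inside the loop.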