Re: Re: [PATCH V7] VECT: Add decrement IV support in Loop Vectorizer

2023-05-16 Thread juzhe.zh...@rivai.ai
Hi, Richard. Please forget about the V10 patch and go directly to the V11 patch. I am sorry that I sent V10; I originally did not notice that Case 2 and Case 3 are exactly the same. I apologize for that. I have reviewed the V11 patch twice, and it seems that this patch is much more reasonable and better understanding

Re: Re: [PATCH V7] VECT: Add decrement IV support in Loop Vectorizer

2023-05-16 Thread juzhe.zh...@rivai.ai
Hi, Richard and Richi. I am sorry for sending you garbage patches (my mistake, sending RISC-V patches to you). I finally realized that Case 2 and Case 3 are exactly the same sequence! I have combined them into a single function called "vect_adjust_loop_lens_control". I have sent the V11 patch: https:

Re: Re: [PATCH V7] VECT: Add decrement IV support in Loop Vectorizer

2023-05-16 Thread juzhe.zh...@rivai.ai
Hi, Richard. I have sent V10: https://gcc.gnu.org/pipermail/gcc-patches/2023-May/618718.html I can't combine the implementations of Case 2 and Case 3: in Case 2, every control (len) comes from the same rgc, but in Case 3, each control (len) comes from a different rgc. Can you help me with that? Also,

Re: Re: [PATCH V7] VECT: Add decrement IV support in Loop Vectorizer

2023-05-16 Thread juzhe.zh...@rivai.ai
Hi, Richard. Status of the RVV infrastructure in the RISC-V backend: 1. All RVV instruction patterns related to intrinsics are finished (they will be used not only by intrinsics but also by autovec in the future). 2. For autovec, we have finished len_load/len_store (they are temporarily used and will be

Re: Re: [PATCH V7] VECT: Add decrement IV support in Loop Vectorizer

2023-05-16 Thread juzhe.zh...@rivai.ai
Oh, I am sorry for the typos in the last email. With the typos fixed: Hi, Richard. For case 2, I came up with this idea: +Case 2 (SLP multiple rgroup): + ... + _38 = (unsigned long) n_12(D); + _39 = _38 * 2; + _40 = MAX_EXPR <_39, 16

Re: Re: [PATCH V7] VECT: Add decrement IV support in Loop Vectorizer

2023-05-16 Thread juzhe.zh...@rivai.ai
Hi, Richard. For case 2, I came up with this idea: +Case 2 (SLP multiple rgroup): + ... + _38 = (unsigned long) n_12(D); + _39 = _38 * 2; + _40 = MAX_EXPR <_39, 16>; + _41 = _40 - 16; + ... +
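
The four statements quoted in the snippet above (a widening conversion, a multiply by 2, a MAX_EXPR against 16, then a subtraction of 16) amount to an unsigned subtraction that saturates at zero instead of wrapping. The C sketch below only illustrates that arithmetic; the function names are hypothetical and do not come from the patch, the constants 2 and 16 are copied from the quoted pseudocode, and the rest of the sequence is elided in the snippet and is not reproduced here.

```c
#include <stdint.h>

/* Hypothetical helper (not from the patch): compute x - limit for
   unsigned values, clamping the result at zero.  This mirrors
     _40 = MAX_EXPR <_39, 16>;
     _41 = _40 - 16;
   with limit standing in for 16.  */
static uint64_t
saturating_sub_u64 (uint64_t x, uint64_t limit)
{
  uint64_t clamped = x > limit ? x : limit; /* MAX_EXPR <x, limit> */
  return clamped - limit;                   /* never wraps below zero */
}

/* Hypothetical wrapper following the quoted statements in order:
     _38 = (unsigned long) n;
     _39 = _38 * 2;
     _40 = MAX_EXPR <_39, 16>;
     _41 = _40 - 16;  */
static uint64_t
case2_adjusted_count (uint32_t n)
{
  uint64_t widened = (uint64_t) n; /* _38 */
  uint64_t scaled = widened * 2;   /* _39 */
  return saturating_sub_u64 (scaled, 16);
}
```

Taking the maximum first is what keeps the subtraction well defined: when the scaled count is below 16, a plain unsigned subtraction would wrap around to a very large value.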

Re: Re: [PATCH V7] VECT: Add decrement IV support in Loop Vectorizer

2023-05-16 Thread juzhe.zh...@rivai.ai
Hi, Richard. >> But we can't generate (vector) gimple that has undefined behaviour from >> (scalar) gimple that had defined behaviour. So something needs to change. >> Either we need to generate a different sequence, or we need to define >> what the behaviour of len_load/store/etc. are when the l

Re: Re: [PATCH V7] VECT: Add decrement IV support in Loop Vectorizer

2023-05-15 Thread juzhe.zh...@rivai.ai
>> The examples are good, but this one made me wonder: why is the >> adjustment made to the limit (namely 16, the gap between _39 and _41) >> different from the limits imposed by the MIN_EXPR (32)? And I think >> the answer is that: >> - _47 counts the number of elements processed by the loop in

Re: Re: [PATCH V7] VECT: Add decrement IV support in Loop Vectorizer

2023-05-15 Thread 钟居哲
Hi, Richard. >> Easier to read as: >> _41 = _40 - 16 >> (which might not be valid gimple, but pseudocode is good enough). OK. >> The difficulty with this is that the len_load* and len_store* >> optabs currently say that the behaviour is undefined if the >> length argument is greater than the