Hi, Richard. Please ignore the V10 patch and go directly to the V11 patch.
I am sorry that I sent V10; I originally did not notice that Case 2 and Case
3 are exactly the same.
I apologize for that. I have reviewed the V11 patch twice, and it seems that this
patch is much more reasonable and easier to understand
Hi, Richard and Richi.
I am so sorry for sending you garbage patches (my mistake, sending RISC-V
patches to you).
I finally realized that Case 2 and Case 3 are exactly the same sequence!
I have combined them into a single function called "vect_adjust_loop_lens_control".
I have sent the V11 patch:
https:
Hi, Richard.
I have sent V10:
https://gcc.gnu.org/pipermail/gcc-patches/2023-May/618718.html
I can't combine the implementations of Case 2 and Case 3. In Case 2, each control
(len) comes from the same rgc.
But in Case 3, each control (len) comes from a different rgc.
Can you help me with that?
Also,
Hi, Richard.
Status of the RVV infrastructure in the RISC-V backend:
1. All RVV instruction patterns related to intrinsics are finished (they
will be called not only by intrinsics but also by autovec in the future).
2. For autovec, we have finished len_load/len_store (they are used temporarily
and will be
Oh,
I am sorry for the typos in the last email. Here it is with the typos fixed:
Hi, Richard.
For Case 2, I came up with this idea:
+Case 2 (SLP multiple rgroup):
+ ...
+ _38 = (unsigned long) n_12(D);
+ _39 = _38 * 2;
+ _40 = MAX_EXPR <_39, 16
Hi, Richard.
For Case 2, I came up with this idea:
+Case 2 (SLP multiple rgroup):
+ ...
+ _38 = (unsigned long) n_12(D);
+ _39 = _38 * 2;
+ _40 = MAX_EXPR <_39, 16>;
+ _41 = _40 - 16;
+ ...
+
Hi, Richard.
>> But we can't generate (vector) gimple that has undefined behaviour from
>> (scalar) gimple that had defined behaviour. So something needs to change.
>> Either we need to generate a different sequence, or we need to define
>> what the behaviour of len_load/store/etc. are when the l
>> The examples are good, but this one made me wonder: why is the
>> adjustment made to the limit (namely 16, the gap between _39 and _41)
>> different from the limits imposed by the MIN_EXPR (32)? And I think
>> the answer is that:
>> - _47 counts the number of elements processed by the loop in
Hi, Richard.
>> Easier to read as:
>> _41 = _40 - 16
>> (which might not be valid gimple, but pseudocode is good enough).
OK.
>> The difficulty with this is that the len_load* and len_store*
>> optabs currently say that the behaviour is undefined if the
>> length argument is greater than the