Hi, Richard.

>> But we can't generate (vector) gimple that has undefined behaviour from
>> (scalar) gimple that had defined behaviour.  So something needs to change.
>> Either we need to generate a different sequence, or we need to define
>> what the behaviour of len_load/store/etc. are when the length is out of
>> range (perhaps under a target hook?).

To be safe for any targets, so we need add "MIN" to make the length never over 
length.
So for case 2 will be like this:

_44 = MIN_EXPR <ivtmp_42, 32>;_47 = MIN_EXPR <ivtmp_45, 32>;
...
_44_2 = MIN_EXPR <_44, 16>;  ->>> add this MIN
....LEN_STORE (_6, 8B, _44_2, ...);...
I suddenly realize that it's better to add a MIN for it.But I am not sure 
whether we can have a better gimple IRthan it.
>> We also need to be consistent.  If case 2 is allowed to use length
>> parameters that are greater than the vector length, then there's no
>> reason for case 1 to use the result of the MIN_EXPR as the length
>> parameter.  It could just use the loop IV directly.  (I realise the
>> select_vl patch will change case 1 for RVV anyway.  But the principle
>>  still holds.)

Oh, thanks for catching this. After thinking about your comments,
I suddenly realize that make length possible larger than VF will create
potential issues to RVV.
 
>> What does the riscv backend's implementation of the len_load and
>> len_store guarantee?  Is any length greater than the vector length
>> capped to the vector length?  Or is it more complicated than that?

For RVV, it is more complicated than that...

For RVV, we will emit vsetvli instruction first for len_load/len_store.

For example for case 1:
loop:
min a5, a4,16
vsetvli zero, a5...
load.
add.
store
a4 = a4 - a5...

Since we have min a5,a4,16 which will make the length always <= vf.
So it works fine for RVV.

However, if we do something like this (according to undefine behavior of 
len_load/len_store):
loop:
min a5, a4,16
vsetvli zero, a4...
load.
add.
store
a4 = a4 - a5...

Notice there is different here:
vsetvli zero, a4..., here we use "a4" instead of "a5", this will create an 
issue here:
Since according to RVV ISA, for vsetvli instruction:
vsetvli zero, a4...,  if a4 <= vf, then, the length = vf
if vf < a4 <= 2 *vf, it will make length any value between [a4/2, vf] depending 
on the hardward design.
Since our current data reference pointer IV is added by VF (in bytes) by 
default.
Then it will be an issue.

So, may be for case 2 like your said, we should not involve undefine behavior 
into len_load/len_store,
instead, we should well handle loop control by add "MIN (16)" ?

Thanks.



juzhe.zh...@rivai.ai
 
From: Richard Sandiford
Date: 2023-05-16 14:57
To: juzhe.zhong\@rivai.ai
CC: gcc-patches; rguenther
Subject: Re: [PATCH V7] VECT: Add decrement IV support in Loop Vectorizer
"juzhe.zh...@rivai.ai" <juzhe.zh...@rivai.ai> writes:
>>> The examples are good, but this one made me wonder: why is the
>>> adjustment made to the limit (namely 16, the gap between _39 and _41)
>>> different from the limits imposed by the MIN_EXPR (32)?  And I think
>>> the answer is that:
>
>>> - _47 counts the number of elements processed by the loop in total,
>>>   including the vectors under the control of _44
>
>>> - _44 counts the number of elements controlled by _47 in the next
>>>   iteration of the vector loop (if there is one)
>
>>> And that's needed to allow the IVs to be updated independently.
>
>>> The difficulty with this is that the len_load* and len_store*
>>> optabs currently say that the behaviour is undefined if the
>>> length argument is greater than the length of a vector.
>>> So I think using these values of _47 and _44 in the .LEN_STOREs
>>> is relying on undefined behaviour.
>
>>> Haven't had time to think about the consequences of that yet,
>>> but wanted to send something out sooner rather than later.
>
> Hi, Richard. I totally understand your concern now. I think the undefine 
> behavior is more
> appropriate for RVV since we have vsetvli instruction that gurantee this will 
> cause potential
> issues. However, for some other target, we may need to use additional 
> MIN_EXPR to guard
> the length never over VF. I think it can be addressed in the future when it 
> is needed.
 
But we can't generate (vector) gimple that has undefined behaviour from
(scalar) gimple that had defined behaviour.  So something needs to change.
Either we need to generate a different sequence, or we need to define
what the behaviour of len_load/store/etc. are when the length is out of
range (perhaps under a target hook?).
 
We also need to be consistent.  If case 2 is allowed to use length
parameters that are greater than the vector length, then there's no
reason for case 1 to use the result of the MIN_EXPR as the length
parameter.  It could just use the loop IV directly.  (I realise the
select_vl patch will change case 1 for RVV anyway.  But the principle
still holds.)
 
What does the riscv backend's implementation of the len_load and
len_store guarantee?  Is any length greater than the vector length
capped to the vector length?  Or is it more complicated than that?
 
Thanks,
Richard
 

Reply via email to