"[email protected]" <[email protected]> writes: >>> The examples are good, but this one made me wonder: why is the >>> adjustment made to the limit (namely 16, the gap between _39 and _41) >>> different from the limits imposed by the MIN_EXPR (32)? And I think >>> the answer is that: > >>> - _47 counts the number of elements processed by the loop in total, >>> including the vectors under the control of _44 > >>> - _44 counts the number of elements controlled by _47 in the next >>> iteration of the vector loop (if there is one) > >>> And that's needed to allow the IVs to be updated independently. > >>> The difficulty with this is that the len_load* and len_store* >>> optabs currently say that the behaviour is undefined if the >>> length argument is greater than the length of a vector. >>> So I think using these values of _47 and _44 in the .LEN_STOREs >>> is relying on undefined behaviour. > >>> Haven't had time to think about the consequences of that yet, >>> but wanted to send something out sooner rather than later. > > Hi, Richard. I totally understand your concern now. I think the undefine > behavior is more > appropriate for RVV since we have vsetvli instruction that gurantee this will > cause potential > issues. However, for some other target, we may need to use additional > MIN_EXPR to guard > the length never over VF. I think it can be addressed in the future when it > is needed.

But we can't generate (vector) gimple that has undefined behaviour from
(scalar) gimple that had defined behaviour.  So something needs to
change.  Either we need to generate a different sequence, or we need to
define what the behaviour of len_load/store/etc. is when the length is
out of range (perhaps under a target hook?).

We also need to be consistent.  If case 2 is allowed to use length
parameters that are greater than the vector length, then there's no
reason for case 1 to use the result of the MIN_EXPR as the length
parameter.  It could just use the loop IV directly.  (I realise the
select_vl patch will change case 1 for RVV anyway.  But the principle
still holds.)

What does the riscv backend's implementation of the len_load and
len_store guarantee?  Is any length greater than the vector length
capped to the vector length?  Or is it more complicated than that?

Thanks,
Richard
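
For illustration only, the "capped to the vector length" possibility asked about above could be modelled in scalar C roughly as follows. This is a minimal sketch under stated assumptions: VF and model_len_store_capped are hypothetical names invented here, the element type is assumed to be a byte, and nothing in it describes what the riscv backend or the current optab documentation actually guarantees.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical number of byte elements in one vector.  */
#define VF 16

/* Hypothetical scalar model of a length-controlled store in which any
   LEN greater than VF is capped to VF.  This is only one possible
   definition of the out-of-range case, not documented behaviour.
   Returns the number of elements actually stored.  */
static size_t
model_len_store_capped (unsigned char *dst, const unsigned char *vec,
                        size_t len)
{
  /* The cap plays the role that an explicit MIN_EXPR would otherwise
     play in the caller.  */
  size_t active = len < VF ? len : VF;
  memcpy (dst, vec, active);
  return active;
}
```

Under such a definition a caller could pass the remaining element count directly as the length, and elements at index VF and above would never be written.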
