BTW, after writing it down, it looks like it may not be necessary (for S1) to
explicitly introduce a special vscale. Another approach is to mark an SVE
scope and use a normal TVM variable `n` to mark the SVE extent.
```python
# note vscale = n
n = T.let(call(tvm.builtin.vscale()
```
It might also be useful to bring some of this discussion to the forums. Here
is a quick, related sketch of GPU programming models:
```python
for y in range(64):
    for x in range(64):
        C[y, x] = A[y, x] * (B[y] + 1)
```
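As a reference for what this loop nest computes, here is a minimal plain-Python sketch. It assumes `A` is 64x64 and `B` has length 64 (shapes read off the loop bounds), and it uses nested lists instead of the `C[y, x]` indexing above. The names `compute_direct` and `compute_hoisted` are illustrative only. The second version just makes explicit that `B[y] + 1` depends only on `y`, so it can be read once per row.

```python
import random

N = 64  # loop extent taken from the snippet above


def compute_direct(A, B):
    """Literal translation of the loop nest."""
    C = [[0.0] * N for _ in range(N)]
    for y in range(N):
        for x in range(N):
            C[y][x] = A[y][x] * (B[y] + 1)
    return C


def compute_hoisted(A, B):
    """Same result; B[y] + 1 is read once per row instead of once per element."""
    C = [[0.0] * N for _ in range(N)]
    for y in range(N):
        b = B[y] + 1  # invariant in x
        for x in range(N):
            C[y][x] = A[y][x] * b
    return C


random.seed(0)
A = [[random.random() for _ in range(N)] for _ in range(N)]
B = [random.random() for _ in range(N)]
assert compute_direct(A, B) == compute_hoisted(A, B)
```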
Say we are interested in the original program above. In normal GPU
programming terminology, we w
Thanks for your comments @tqchen, much appreciated! I'd like to ask for some
clarifications and expand on some of the points you made, based on my
understanding.
TL;DR:
- We need to be able to express `vscale` dependent `extent`s in the TIR `For`
nodes
- Aside from predication, SVE vectors are not muc
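
As a rough illustration of the first TL;DR point, the plain-Python sketch below simulates a loop whose step is a multiple of `vscale`, with a per-lane predicate masking the tail. It is not TIR and does not use any TVM API. The name `vla_add_one`, the parameter `lanes_per_unit`, and the choice of 4 lanes per 128-bit granule (32-bit elements) are assumptions made for illustration.

```python
def vla_add_one(data, vscale, lanes_per_unit=4):
    """Add 1 to every element using a strip-mined loop whose step depends on vscale.

    lanes_per_unit=4 models 32-bit elements in a 128-bit granule (assumption).
    """
    lanes = lanes_per_unit * vscale  # vector length, not known at compile time
    n = len(data)
    out = list(data)
    for base in range(0, n, lanes):  # trip count is ceil(n / lanes)
        # Predicate: lane i is active only while base + i is in bounds.
        pred = [base + i < n for i in range(lanes)]
        for i, active in enumerate(pred):
            if active:
                out[base + i] = data[base + i] + 1
    return out


# The result must not depend on the hardware vector length.
data = list(range(10))
expected = [x + 1 for x in data]
assert vla_add_one(data, vscale=1) == expected
assert vla_add_one(data, vscale=2) == expected
```

The loop extent here is `ceil(n / (4 * vscale))`, which is the kind of `vscale`-dependent extent the first bullet refers to.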