Hi @tqchen @kparzysz-quic @masahi @tkonolige @smeijer1234,

We are looking to revive this work. I have gone through the thread.
The summary so far is as follows:

* We want to introduce/enhance a scheduling vectorization primitive that can 
be controlled by the user/auto-tuner/auto-scheduler to use scalable vectors 
in the backend codegen.
  * The conversation has converged on extending the existing vectorize 
scheduling primitive, i.e. s[C].vectorize(..., scalable=True)

* Usage of this scheduling primitive should result in a for loop with 
Ramp nodes that carry either an additional argument "is_scalable" or a special 
number for lanes.

  * I think @tqchen was suggesting using the special lane number (-1) as 
opposed to introducing an additional argument to all TIR nodes such as Ramp and 
Broadcast, as well as to DataType (and DLDataType), to avoid ABI breakage.
  * Moreover, VectorizeLoopScalable will be modified to create a While node.

* The name of the RFC is confusing? (@kparzysz-quic) I suppose what we are 
adding is vector-length agnostic (VLA) vectorization support for TIR, while 
demonstrating the codegen of VLA-vectorized TIR using Arm(R) SVE codegen.
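
To make the two points above concrete (the special lane number and the While-based loop), here is a minimal plain-Python sketch. It does not use TVM: the `Ramp` dataclass only mirrors the (base, stride, lanes) shape of the TIR node, and `SCALABLE_LANES`, `is_scalable` and `vla_add` are hypothetical names used for illustration, not an existing or agreed API.

```python
from dataclasses import dataclass

# Assumption: -1 is the sentinel for "lane count unknown at compile time",
# following the suggestion in the thread. Not an existing TVM constant.
SCALABLE_LANES = -1


@dataclass(frozen=True)
class Ramp:
    """Toy stand-in for a TIR Ramp node: base, stride and lane count."""
    base: int
    stride: int
    lanes: int

    @property
    def is_scalable(self) -> bool:
        # Scalability is encoded in the lane count, so no extra field is needed.
        return self.lanes == SCALABLE_LANES


def vla_add(a, b, hw_lanes):
    """Emulate a vector-length agnostic loop for an elementwise add.

    hw_lanes plays the role of the hardware vector length, which is only
    known at run time. The same loop is correct for any hw_lanes >= 1.
    """
    n = len(a)
    out = [0] * n
    i = 0
    while i < n:  # the While-style loop a VLA lowering would produce
        active = min(hw_lanes, n - i)  # lanes enabled by the predicate
        for lane in range(active):
            out[i + lane] = a[i + lane] + b[i + lane]
        i += hw_lanes
    return out
```

The point of the sketch is that the loop body never hard-codes a vector width: calling `vla_add` with `hw_lanes` of 4, 8 or 16 gives the same result, which is the property the scalable Ramp encoding is meant to preserve through TIR.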

Please confirm whether this is an accurate summary of the current state.
As for next steps, I would like to propose a resolution for each of the 
outstanding discussion points and update the RFC.

-- 
Reply to this email directly or view it on GitHub:
https://github.com/apache/tvm-rfcs/pull/18#issuecomment-1162041631