Hi @JosseVanDelm,

Thanks for the post! Some thoughts:

> Right now a lot of calls to the HWlib are very inefficient, as they require a
> lot of data reformatting on the RISC-V before being accessible to the
> accelerator. It is weird/annoying that the data layout already gets specified
> from Relay, we would probably need to insert a data layout (TIR?) optimization
> pass along the computation graph at some point there.

and

> Our accelerator supports int8, but also int4 and int2. At some point we will
> probably need to look into the Bring your own datatype framework, but we also
> still need to look into quantization support in TVM. Any recommended
> reference work would be very useful here!

Tagging @jwfromm in case he knows more here.

> We have looked into using BYOC, but we felt like this was a very direct
> mapping of Relay to instructions, which bypasses a lot of
> scheduling/optimization magic (Tensor Expressions, AutoTVM) from the rest of
> the TVM stack. It also did not seem like a very scalable solution to us,
> since it seems like we would have to map a lot of Relay instructions directly
> to a HWLib function call, which we also have to develop ourselves.

Is tensorization an option here, or do you need to do more with the TIR after schedule generation?

> We have looked into VTA, but VTA is quite different from our platform. We
> don’t have a fully fledged workstation host device at hand, apart from the
> bare metal microcontroller. Also we would like to compile as much as possible
> statically and AoT, and not in a JIT-fashion. Maybe there are some accelerator
> specific parts we can reuse though. If someone can share their experience on
> reusing some of this work that would be very insightful!

This is an area I'm quite interested in, but we haven't done anything on this that I know of.

> Some functions of the HWlib require parameters that have to be set during
> compilation based on the weights. It is not clear to us how this fits in with
> the rest of the compilation stack. Could this be implemented in a TIR pass
> for example?

It seems like you could have a TIR pass that does that computation and then replaces the free variables with constants.

Also tagging @tqchen who may have some more ideas of related work here.

Andrew