Hello,

I was wondering what options exist for quantization in post-Relay TVM. 
Building support for custom accelerators, and doing research on them, requires 
integer support through quantization. In the old VTA flow, relay.quantize was 
used to quantize to int8, which the VTA hardware was able to compute with 
(floating point is typically not used on these custom hardware accelerators). 
Are there any plans to create a relax.quantize? If not, what options do people 
have if they want to explore TVM with custom accelerators? I know MLC LLM has 
some quantization built on top of Relax, but that is only for weight storage 
and does not allow for integer operations. Any help on this matter would be 
appreciated.
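
To make the distinction concrete, here is a minimal NumPy sketch of what I mean by weight-only storage versus true integer compute. It is not TVM code: the function names are hypothetical, and per-tensor symmetric int8 quantization is just an assumption I picked for illustration, not what relay.quantize or MLC LLM actually implement.

```python
import numpy as np

def quantize_symmetric(x, num_bits=8):
    """Per-tensor symmetric quantization (assumed scheme): x ~= scale * q."""
    qmax = 2 ** (num_bits - 1) - 1  # 127 for int8
    max_abs = float(np.max(np.abs(x)))
    scale = max_abs / qmax if max_abs > 0 else 1.0
    q = np.clip(np.round(x / scale), -qmax, qmax).astype(np.int8)
    return q, scale

def matmul_weight_only(x_fp32, w_q, w_scale):
    """Weight-only: int8 is just the storage format; the matmul is still float."""
    w_fp32 = w_q.astype(np.float32) * w_scale  # dequantize before compute
    return x_fp32 @ w_fp32

def matmul_integer(x_q, x_scale, w_q, w_scale):
    """Integer compute: int8 operands, int32 accumulation, one rescale at the end."""
    acc = x_q.astype(np.int32) @ w_q.astype(np.int32)  # integer-only inner loop
    return acc.astype(np.float32) * (x_scale * w_scale)

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 16)).astype(np.float32)
w = rng.standard_normal((16, 8)).astype(np.float32)

x_q, x_scale = quantize_symmetric(x)
w_q, w_scale = quantize_symmetric(w)

y_ref = x @ w
y_weight_only = matmul_weight_only(x, w_q, w_scale)
y_integer = matmul_integer(x_q, x_scale, w_q, w_scale)

print("weight-only max abs error:", np.max(np.abs(y_ref - y_weight_only)))
print("integer     max abs error:", np.max(np.abs(y_ref - y_integer)))
```

Only the second path keeps the multiply-accumulate in integers, which is what an accelerator like VTA needs. The first path still requires floating-point hardware for the matmul itself.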





---
[Visit 
Topic](https://discuss.tvm.apache.org/t/quantization-in-post-relay-tvm/18770/1) 
to respond.
