Hi Experts,
I need to auto-tune an ONNX model on an iOS device. I went through some of
the tutorials at https://tvm.apache.org/docs/tutorials/index.html#auto-tuning.
I am able to tune the model on the CUDA (NVIDIA GPU) target, but somehow I am
not able to auto-tune the model on the iOS device.
---
Hi,
I am currently exploring Relay with the BYOC infrastructure and realized that
pooling, ReLU, and a number of other supported operations are still executed
in float32. As my target accelerator supports pooling, ReLU, and activations
only in the int8 range, I want to quantize all operations. Am I
---
I was planning to implement the ScatterND operator for the TFLite frontend,
which takes indices, updates, and a shape as arguments, by:
1. Splitting the indices into separate tensors (using the split Relay op).
2. Using strided_slice to get the value(s) currently at each index/slice.
3. Adding the update value(s).
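The three steps above amount to scatter semantics: each row of `indices` addresses a location in a zero-initialized output, which then receives the matching update. A minimal pure-Python reference of that behavior (a sketch only, assuming scalar updates with full-rank indices; names are illustrative, not the frontend's API):

```python
from functools import reduce
import operator

def scatter_nd(indices, updates, shape):
    """Reference for ScatterND semantics on flat Python lists.

    indices: list of multi-indices (each a list of ints)
    updates: one scalar per multi-index (full-rank indices assumed)
    shape:   output shape; the output starts as zeros
    """
    # Row-major strides turn a multi-index into a flat offset,
    # standing in for the split + strided_slice addressing steps.
    strides = [1] * len(shape)
    for i in range(len(shape) - 2, -1, -1):
        strides[i] = strides[i + 1] * shape[i + 1]
    out = [0] * reduce(operator.mul, shape, 1)
    for idx, val in zip(indices, updates):
        # Step 3 of the plan: write the update at the addressed slot.
        out[sum(i * s for i, s in zip(idx, strides))] = val
    return out

# scatter_nd([[1], [3]], [9, 10], [4]) -> [0, 9, 0, 10]
```

A real implementation would also have to handle partial indices (where each update is a slice rather than a scalar), which is where the strided_slice step earns its keep.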
---
In the documentation, I read that TVM/VTA supports 8 bits and fewer. I need
low bit widths, such as 2 or 4 bits, to realize low-bit quantization. Could I
do it with TVM or the VTA simulator?
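For context, VTA's operand widths live in its hardware configuration file (`vta_config.json`; its path varies by TVM version, e.g. under `vta/config/` or `3rdparty/vta-hw/config/`). The `LOG_*_WIDTH` fields are log2 of the bit width, so the stock setting of 3 means 8-bit inputs and weights. A sketch of the relevant subset for a hypothetical 2-bit setup (whether the simulator and the quantization pass actually accept widths below 8 bits is exactly the open question here):

```json
{
  "TARGET": "sim",
  "LOG_INP_WIDTH": 1,
  "LOG_WGT_WIDTH": 1,
  "LOG_ACC_WIDTH": 5
}
```

Here `LOG_INP_WIDTH: 1` would mean 2-bit inputs (2^1), while the 32-bit accumulator (2^5) is left at its default; the other fields of the stock config would stay as shipped.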
---
[Visit Topic](https://discuss.tvm.ai/t/tvm-vta-low-bit-quantization/7674/1) to
respond.