> If no activation, we will clamp it to 0 / 127.

In the tflite quantized conv implementation (I posted an excerpt from their code previously), the offset is added in prior to the clamping. The tflite quantized models in their repository use uint8 asymmetric quantization with non-zero offsets for activations and weights, and int32 for biases. In that case the min and max values passed into the quantized conv are always 0 and 255.
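For reference, a minimal sketch of that requantization tail, loosely modeled on the TFLite reference kernel (the `rescale` helper here is a hypothetical stand-in for TFLite's fixed-point output-multiplier routine, and the parameter names are illustrative):

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical stand-in for TFLite's fixed-point output multiplier;
// the real kernel rescales the int32 accumulator with a quantized
// multiplier and shift derived from the input/weight/output scales.
int32_t rescale(int32_t acc, double effective_scale) {
    return static_cast<int32_t>(acc * effective_scale);
}

// Requantization tail of a uint8 asymmetric conv: the output offset
// is added in *before* clamping, and the clamp bounds arrive as
// signed int32 (0 and 255 when there is no fused activation).
uint8_t requantize_output(int32_t acc, double effective_scale,
                          int32_t output_offset,
                          int32_t output_activation_min,   // 0
                          int32_t output_activation_max) { // 255
    acc = rescale(acc, effective_scale);
    acc += output_offset;                        // offset added pre-clamp
    acc = std::max(acc, output_activation_min);  // signed int32 bound, so a
    acc = std::min(acc, output_activation_max);  // negative int8-style min
    return static_cast<uint8_t>(acc);            // would also fit here
}
```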
It appears to me, though, that whoever wrote that conv code may also have considered supporting return of signed int8 quantized values, since they provided a signed int32 min saturation value. If signed int8 quantization is a tflite conversion option, it would be worth making sure we cover that case.

The Intel quantization scheme uses uint8 with a fixed zero offset for activations, int8 with a fixed zero offset for weights, and int32 with a fixed zero offset for biases. That simplifies the terms of the convolution inner loops considerably (as has been discussed here before); see the sketch below. It also reflects the int8 capabilities/limitations of Intel's AVX512 DL Boost hardware, so it is probably a good idea to support that mode as well.
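A minimal sketch of why the zero offsets matter, assuming a flattened inner loop over `n` multiply-accumulates (function names and offset handling are illustrative, not taken from either codebase):

```cpp
#include <cstddef>
#include <cstdint>

// Asymmetric uint8 inner loop: every product carries offset terms,
// since the full expansion is
//   sum_i (a_i - a_off) * (w_i - w_off)
//     = sum_i a_i*w_i - w_off*sum_i a_i - a_off*sum_i w_i + n*a_off*w_off
int32_t dot_asymmetric(const uint8_t* a, const uint8_t* w, size_t n,
                       int32_t a_off, int32_t w_off) {
    int32_t acc = 0;
    for (size_t i = 0; i < n; ++i)
        acc += (static_cast<int32_t>(a[i]) - a_off) *
               (static_cast<int32_t>(w[i]) - w_off);
    return acc;
}

// Intel-style symmetric scheme: uint8 activations and int8 weights,
// both with zero offset, so the three correction terms vanish and the
// loop reduces to a plain uint8 x int8 dot product, which is the shape
// that AVX512 VNNI (vpdpbusd) accumulates directly in hardware.
int32_t dot_symmetric(const uint8_t* a, const int8_t* w, size_t n) {
    int32_t acc = 0;
    for (size_t i = 0; i < n; ++i)
        acc += static_cast<int32_t>(a[i]) * static_cast<int32_t>(w[i]);
    return acc;
}
```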