Re: [dmlc/tvm] [RFC][Quantization] Support quantized models from TensorflowLite (#2351)

2019-05-29 Thread ds-jnorwood
To explain a little further ... during training they determine the range of input values, and they determine the downscale multiplier that will shrink the observed range to 0..255 (for the uint8 quantization). The fp downscale multiplier is converted to integer multiply and right-shift constants, which …
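[Editor's note: a minimal sketch of that conversion, in the spirit of the gemmlowp/TFLite "quantize multiplier" trick; the helper name and example values are illustrative, not project code.]

```python
def quantize_multiplier(real_multiplier):
    """Return (quantized_multiplier, right_shift) such that
    real_multiplier ~= quantized_multiplier * 2**-31 * 2**-right_shift."""
    assert 0.0 < real_multiplier < 1.0
    right_shift = 0
    # scale the float multiplier into [0.5, 1.0), counting the shifts
    while real_multiplier < 0.5:
        real_multiplier *= 2.0
        right_shift += 1
    quantized_multiplier = int(round(real_multiplier * (1 << 31)))
    if quantized_multiplier == (1 << 31):  # rounding can push it to exactly 2**31
        quantized_multiplier //= 2
        right_shift -= 1
    return quantized_multiplier, right_shift

# e.g. a float downscale of 0.0125 becomes (1717986918, 6),
# i.e. roughly 0.8 * 2**-31-fixed-point followed by a right shift of 6
print(quantize_multiplier(0.0125))
```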

Re: [dmlc/tvm] [RFC][Quantization] Support quantized models from TensorflowLite (#2351)

2019-05-29 Thread ds-jnorwood
Similarly, for the tflite quantized Inception V3 model, all those output_activation_min, output_activation_max are 0 and 255. I'll attach a zip file with the log: [inv3.zip](https://github.com/dmlc/tvm/files/3235141/inv3.zip)

Re: [dmlc/tvm] [RFC][Quantization] Support quantized models from TensorflowLite (#2351)

2019-05-29 Thread ds-jnorwood
I want to point out that the min and max values you mentioned are not related to the activation range in the original model. They are saturation values. In the case of mobilenet, for example, which has relu_6 used everywhere, I'm printing out the min and max activation values from the tflite model …
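[Editor's note: a rough illustration of why those logged values are uint8 saturation bounds rather than a model activation range; the helper and the numbers are made up for illustration, not TFLite source.]

```python
def activation_range_uint8(scale, zero_point, fused_activation=None):
    qmin, qmax = 0, 255                      # uint8 saturation bounds
    if fused_activation in ("relu", "relu6"):
        qmin = max(qmin, zero_point)         # real value 0.0 maps to the zero point
    if fused_activation == "relu6":
        qmax = min(qmax, zero_point + int(round(6.0 / scale)))
    return qmin, qmax

print(activation_range_uint8(scale=0.02,     zero_point=0))                            # (0, 255)
print(activation_range_uint8(scale=0.1,      zero_point=0, fused_activation="relu6"))  # (0, 60)
print(activation_range_uint8(scale=6 / 255., zero_point=0, fused_activation="relu6"))  # (0, 255)
```

When the output scale is chosen as 6/255, a fused relu_6 clamp coincides with plain uint8 saturation, which is consistent with the logs showing 0 and 255 everywhere.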

Re: [dmlc/tvm] [RFC][Quantization] Support quantized models from TensorflowLite (#2351)

2019-05-29 Thread Tianqi Chen
Here are some points to discuss:
- namespace for the tflite quantize style dialect
- list of ops that might need tvm's compute declaration
- set of possible passes that lower the rest into the core ops

Some of the discussions involve fusion, and that is something where TVM might be able to help.

Re: [dmlc/tvm] [RFC][Quantization] Support quantized models from TensorflowLite (#2351)

2019-05-29 Thread Animesh Jain
> For the `q_conv2d`, we will add two more arguments.
>
> ```python
> output_min=0,
> output_max=0
> ```
>
> These will be used to restrict the output range, which could be calculated previously.

I see what you are saying, but I am not sure if this is the right approach. In my opinion, …
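[Editor's note: a purely hypothetical stub, only to make the interface being debated concrete; the argument names and defaults are illustrative, not an agreed TVM API.]

```python
def q_conv2d(data, weight,
             input_zero_point, kernel_zero_point,
             input_scale, kernel_scale,
             output_scale, output_zero_point,
             strides=(1, 1), padding=(0, 0),
             output_min=0, output_max=0):
    """Hypothetical quantized conv2d whose requantized result would be clamped
    to [output_min, output_max]; for a uint8 TFLite-style model these usually
    end up being the saturation bounds 0 and 255, or tighter values if an
    activation such as relu_6 is folded in."""
    raise NotImplementedError("interface sketch only, for the discussion above")
```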

Re: [dmlc/tvm] [RFC][Quantization] Support quantized models from TensorflowLite (#2351)

2019-05-29 Thread Zhao Wu
> > > > For the `q_conv2d`, we will add two more arguments.
> > > >
> > > > ```python
> > > > output_min=0,
> > > > output_max=0
> > > > ```
> > > >
> > > > These will be used to restrict the output range, which could be calculated previously.
> > >
> > > I see what you are …

Re: [dmlc/tvm] [RFC][Quantization] Support quantized models from TensorflowLite (#2351)

2019-05-29 Thread Animesh Jain
> > > For the `q_conv2d`, we will add two more arguments.
> > >
> > > ```python
> > > output_min=0,
> > > output_max=0
> > > ```
> > >
> > > These will be used to restrict the output range, which could be calculated previously.
> >
> > I see what you are saying, but I am not su…

Re: [dmlc/tvm] [RFC][Quantization] Support quantized models from TensorflowLite (#2351)

2019-05-29 Thread Animesh Jain
> Yes, I believe the MobilenetV2 relu_6 is effectively fused in by the downscale saturation. You might need it if you want to support their way of training, though.
>
> Yes Mobilenet has the q_add, but I suggest the Inceptionv3 for q_concatenate, since it also has concat nodes feeding into …
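[Editor's note: for context on why q_add (and likewise q_concatenate) needs its own lowering, a toy sketch using float rescaling for clarity; a real kernel would use the fixed-point multiplier/shift form instead, and the numbers below are invented.]

```python
def q_add(x_q, x_scale, x_zp, y_q, y_scale, y_zp, out_scale, out_zp):
    # bring both uint8 inputs, each with its own (scale, zero_point),
    # onto the output quantization before adding
    x_real = (x_q - x_zp) * x_scale
    y_real = (y_q - y_zp) * y_scale
    out_q = int(round((x_real + y_real) / out_scale)) + out_zp
    return max(0, min(255, out_q))           # saturate to uint8

# q_concatenate has the same flavour: every input tensor must be rescaled to
# the single output scale/zero point before the tensors are joined.
print(q_add(130, 0.02, 128, 140, 0.05, 120, out_scale=0.06, out_zp=127))
```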

Re: [dmlc/tvm] [RFC][Quantization] Support quantized models from TensorflowLite (#2351)

2019-05-29 Thread Zhao Wu
> > For the `q_conv2d`, we will add two more arguments.
> >
> > ```python
> > output_min=0,
> > output_max=0
> > ```
> >
> > These will be used to restrict the output range, which could be calculated previously.
>
> I see what you are saying, but I am not sure if this is the right app…

Re: [dmlc/tvm] [RFC][Quantization] Support quantized models from TensorflowLite (#2351)

2019-05-29 Thread Animesh Jain
> Hi @anijain2305, regarding the requantization: if it is not going to be put in the conv op, the op is presumably supposed to output FP32, otherwise the semantics are confusing. The requantization can convert FP32 to INT8. The multiplier/shift based requantization approach introduced by TFLite is also adop…

Re: [dmlc/tvm] [RFC][Quantization] Support quantized models from TensorflowLite (#2351)

2019-05-29 Thread 黎明灰烬
Hi @anijain2305, regarding the requantization: if it is not going to be put in the conv op, the op is presumably supposed to output FP32, otherwise the semantics are confusing. The requantization can convert FP32 to INT8. The multiplier/shift based requantization approach introduced by TFLite is also adopted by …
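[Editor's note: a minimal plain-Python sketch of the multiplier/shift style requantization being discussed, not the TFLite/QNNPACK kernel code; the constants are illustrative.]

```python
def requantize(acc_int32, quantized_multiplier, right_shift, output_zero_point):
    # fixed-point multiply: acc * multiplier / 2**31, with rounding
    prod = acc_int32 * quantized_multiplier
    prod = (prod + (1 << 30)) >> 31
    # rounding right shift by the remaining amount
    if right_shift > 0:
        prod = (prod + (1 << (right_shift - 1))) >> right_shift
    # add the output zero point and saturate to uint8
    return max(0, min(255, prod + output_zero_point))

# 1717986918 * 2**-31 * 2**-6 is roughly a 0.0125 downscale,
# so an int32 accumulator of 10000 maps to about 125 + zero point 128 = 253
print(requantize(acc_int32=10000, quantized_multiplier=1717986918,
                 right_shift=6, output_zero_point=128))
```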

Re: [dmlc/tvm] [RFC][Quantization] Support quantized models from TensorflowLite (#2351)

2019-05-29 Thread ds-jnorwood
> From my experience, we needn't q_relu. But we need q_add / q_concate and so on. I suggest we use MobilenetV2 quant model for example.

Yes, I believe the MobilenetV2 relu_6 is effectively fused in by the downscale saturation. You might need it if you want to support their way of training, …