to explain a little further ... during training they determine the range of
input values, and they determine the downscale multiplier that will shrink the
observed range to 0..255 (for the uint8 quantization). The fp downscale
multiplier is converted to integer multiply and right-shift constants, which
are then applied at inference time.
similarly, for the tflite quantized inception v3 model, all those
output_activation_min, output_activation_max values are 0 and 255.
I'll attach a zip file with the log.
[inv3.zip](https://github.com/dmlc/tvm/files/3235141/inv3.zip)
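To make the multiply/shift conversion concrete, here is a minimal sketch of the
decomposition (the function names are mine, not TFLite's internals, though
TFLite's `QuantizeMultiplier` does essentially this):

```python
import math

def quantize_multiplier(real_multiplier):
    """Decompose a float multiplier in (0, 1) into a Q0.31 fixed-point
    int32 multiplier plus a right-shift count (illustrative sketch)."""
    assert 0.0 < real_multiplier < 1.0
    significand, exponent = math.frexp(real_multiplier)  # significand in [0.5, 1)
    fixed = min(int(round(significand * (1 << 31))), (1 << 31) - 1)
    return fixed, -exponent                              # exponent <= 0, shift >= 0

def requantize(acc, fixed, right_shift):
    """Scale an int32 accumulator by the decomposed multiplier."""
    prod = (acc * fixed) >> 31                           # high half of the multiply
    if right_shift > 0:
        prod = (prod + (1 << (right_shift - 1))) >> right_shift  # round to nearest
    return prod

# For a conv layer the real multiplier is typically
# input_scale * weight_scale / output_scale, e.g.:
fixed, shift = quantize_multiplier(0.25)
assert requantize(1000, fixed, shift) == 250
```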
I want to point out that the min and max values you mentioned are not related
to the activation range in the original model. They are saturation values. In
the case of mobilenet, for example, which has relu_6 used everywhere, I'm
printing out the min and max activation values from the tflite model …
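To illustrate why the stored bounds collapse to pure saturation, here is the
arithmetic that a fused activation's clamp bounds come from (a sketch with
made-up names, mirroring the per-layer computation rather than any exact API):

```python
def quantized_activation_range(scale, zero_point, act_min, act_max):
    """uint8 clamp bounds for a fused activation (illustrative sketch)."""
    qmin = max(0, zero_point + round(act_min / scale))
    qmax = min(255, zero_point + round(act_max / scale))
    return qmin, qmax

# If training pins a relu_6 output range to exactly [0, 6], then
# scale = 6 / 255 and zero_point = 0, so the clamp covers the whole
# uint8 range, i.e. pure saturation:
print(quantized_activation_range(6.0 / 255.0, 0, 0.0, 6.0))  # (0, 255)
```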
Here are some points to discuss:
- namespace for the TFLite quantize-style dialect
- list of ops that might need TVM's compute declaration
- set of possible passes that lower the rest into the core ops
Some of the discussions involve fusion, and that is something where TVM might
be able to help.
> For the `q_conv2d`, we will add two more arguments.
>
> ```python
> output_min=0,
> output_max=0
> ```
>
> These will be used to restrict the output range, which can be calculated
> beforehand.
I see what you are saying, but I am not sure if this is the right approach. In
my opinion, …
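For what it's worth, here is roughly what those two arguments would amount to
in a reference lowering: a final integer clamp after the requantize step
(NumPy sketch, dense/1x1 case for brevity, all names illustrative):

```python
import numpy as np

def q_dense_reference(x_q, w_q, x_zp, w_zp, fixed, right_shift, out_zp,
                      output_min=0, output_max=255):
    """Integer matmul + fixed-point requantize + clamp (illustrative)."""
    acc = (x_q.astype(np.int32) - x_zp) @ (w_q.astype(np.int32) - w_zp)
    out = (acc.astype(np.int64) * fixed) >> 31           # fixed-point multiply
    if right_shift > 0:
        out = (out + (1 << (right_shift - 1))) >> right_shift
    return np.clip(out + out_zp, output_min, output_max).astype(np.uint8)
```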
> Yes, I believe the MobilenetV2 relu_6 is effectively fused in by the
> downscale saturation. You might need it if you want to support their way of
> training, though.
>
> Yes, Mobilenet has the q_add, but I suggest the Inceptionv3 for
> q_concatenate, since it also has concat nodes feeding into …
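Concat is interesting precisely because each input carries its own scale and
zero point, so any input whose parameters differ from the output's has to be
requantized before the byte-level concat. A float-math sketch (illustrative
names; a real kernel would use the multiplier/shift form instead):

```python
import numpy as np

def q_concatenate(inputs, in_scales, in_zps, out_scale, out_zp, axis=-1):
    """Requantize every input to the output's params, then concatenate."""
    rescaled = []
    for x_q, scale, zp in zip(inputs, in_scales, in_zps):
        real = (x_q.astype(np.float32) - zp) * scale     # dequantize
        q = np.round(real / out_scale) + out_zp          # requantize
        rescaled.append(np.clip(q, 0, 255).astype(np.uint8))
    return np.concatenate(rescaled, axis=axis)
```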
Hi @anijain2305, regarding the requantization: if it is not going to be put in
the conv op, then the conv op presumably has to output FP32, otherwise the
semantics are confusing. The requantization can convert FP32 to INT8. The
multiplier/shift based requantization approach introduced by TFLite is also
adopted by …
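As a standalone op, that conversion is just scale, round, and clamp, FP32 in
and uint8 out (a sketch, not anyone's actual signature):

```python
import numpy as np

def quantize_fp32_to_uint8(x, scale, zero_point):
    """Standalone (re)quantization step: FP32 -> uint8 (illustrative)."""
    q = np.round(x / scale).astype(np.int32) + zero_point
    return np.clip(q, 0, 255).astype(np.uint8)
```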
> From my experience, we needn't q_relu. But we need q_add / q_concatenate and
> so on. I suggest we use the MobilenetV2 quant model, for example.

Yes, I believe the MobilenetV2 relu_6 is effectively fused in by the downscale
saturation. You might need it if you want to support their way of training,
though.
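q_add is the same story as concat: both inputs and the output carry their own
scale and zero point, so the inputs are rescaled to a common scale before the
elementwise add (float-math sketch, illustrative names):

```python
import numpy as np

def q_add(a_q, a_scale, a_zp, b_q, b_scale, b_zp, out_scale, out_zp):
    """Quantized elementwise add via rescaling (illustrative sketch)."""
    a = (a_q.astype(np.float32) - a_zp) * a_scale        # dequantize lhs
    b = (b_q.astype(np.float32) - b_zp) * b_scale        # dequantize rhs
    q = np.round((a + b) / out_scale) + out_zp           # requantize the sum
    return np.clip(q, 0, 255).astype(np.uint8)
```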