> Not true. When there is activation, the range is not always 0 ~ 255. For
> example RELU,
I believe tflite extends the quantization range so it always includes 0, as
done in the gemmlowp quantization example below. I have dumped my min and max
saturation input values from the six quantized tflite models...
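For anyone who has not read it, the gemmlowp quantization example referred to
above picks the scale and zero point roughly like the sketch below. This is
paraphrased from memory rather than copied from gemmlowp, so treat the names
and details as approximate:
```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>

// Sketch of gemmlowp-style parameter choice: the float range is first
// extended so that it contains 0.0, which guarantees that zero is exactly
// representable after quantization.
void ChooseQuantParamsSketch(float rmin, float rmax, float* scale,
                             int32_t* zero_point) {
  rmin = std::min(rmin, 0.0f);  // extend the range to include zero
  rmax = std::max(rmax, 0.0f);
  const float qmin = 0.0f, qmax = 255.0f;
  *scale = (rmax - rmin) / (qmax - qmin);
  if (*scale == 0.0f) *scale = 1.0f;  // degenerate all-zero range
  // Nudge the zero point to an integer inside [qmin, qmax].
  const float initial_zp = qmin - rmin / *scale;
  *zero_point = static_cast<int32_t>(
      std::round(std::max(qmin, std::min(qmax, initial_zp))));
}
```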
In the tflite quantized Mobilenet v2 from the repository, the first conv
operation has a non-zero offset ... there is no activation. The offset is 128.
So either provide a conv which uses signed int8 and a 0 offset, or do what
tflite does and handle it as a quantized uint8 convolution with a 128 offset.
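As a minimal sketch of what that 128 offset means (the numbers here are
hypothetical, not read from the actual model file), a uint8 tensor with zero
point 128 represents exactly the same values as a signed int8 tensor with a 0
offset:
```cpp
#include <cstdint>
#include <cstdio>

int main() {
  // Asymmetric quantization: real = scale * (q - zero_point).
  const float scale = 0.02f;       // hypothetical scale
  const int32_t zero_point = 128;  // the offset seen on the first conv

  const uint8_t q_u8 = 200;  // some quantized input value
  const float real_u8 = scale * (static_cast<int32_t>(q_u8) - zero_point);

  // Reinterpreting as signed int8 with a 0 offset is just a shift by 128.
  const int8_t q_s8 = static_cast<int8_t>(static_cast<int32_t>(q_u8) - 128);
  const float real_s8 = scale * q_s8;

  std::printf("%f %f\n", real_u8, real_s8);  // prints the same value twice
  return 0;
}
```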
> In that case min and max values passed into the quantized conv are always 0
> and 255.
Not true. When there is activation, the range is not always 0 ~ 255. For
example, RELU:
```cpp
auto quantize = [scale, zero_point](float f) {
  return zero_point + static_cast<int32_t>(TfLiteRound(f / scale));
};
```
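For context, that lambda appears to come from TFLite's activation-range
helper; a rough sketch of how it is used is below. This is paraphrased from
memory, so the function name and details are mine rather than the exact
TFLite source:
```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>

// Sketch: derive the quantized saturation window from the fused activation.
// The activation can only narrow the uint8 window [0, 255], never widen it.
void CalcActivationRangeSketch(bool has_relu, bool has_relu6, float scale,
                               int32_t zero_point, int32_t* act_min,
                               int32_t* act_max) {
  const int32_t qmin = 0;    // uint8 lower limit
  const int32_t qmax = 255;  // uint8 upper limit
  auto quantize = [scale, zero_point](float f) {
    return zero_point + static_cast<int32_t>(std::round(f / scale));
  };
  if (has_relu) {
    *act_min = std::max(qmin, quantize(0.0f));
    *act_max = qmax;
  } else if (has_relu6) {
    *act_min = std::max(qmin, quantize(0.0f));
    *act_max = std::min(qmax, quantize(6.0f));
  } else {
    // No fused activation: saturate only to the storage range itself.
    *act_min = qmin;
    *act_max = qmax;
  }
}
```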
> If no activation, we will clamp it to 0 / 127.
In the tflite quantized conv implementation (I posted an excerpt from their
code previously) the offset is added in prior to the clamping. The tflite
quantized models in their repository use uint8 asymmetric quantization with
non-zero offsets.
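To make that ordering concrete (downscale, then add the output offset, then
clamp), here is a simplified sketch of the per-accumulator requantization
step. The real TFLite kernel downscales with a fixed-point multiplier
(gemmlowp-style); a float multiplier is used here only to keep the sketch
short:
```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>

// Illustrative requantization of one int32 accumulator from a quantized conv.
uint8_t RequantizeSketch(int32_t acc, float real_multiplier,
                         int32_t output_offset, int32_t act_min,
                         int32_t act_max) {
  int32_t v = static_cast<int32_t>(std::round(acc * real_multiplier));
  v += output_offset;            // the offset is added before clamping
  v = std::max(v, act_min);      // then the result is saturated to the
  v = std::min(v, act_max);      // activation / uint8 window
  return static_cast<uint8_t>(v);
}
```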
> `https://arxiv.org/pdf/1803.08607.pdf`
Qualcomm's way? Let us look at Google's TFLite model:

The quantized model doesn't remove RELU6 in the dw conv / conv. I think we
should focus...
> I guess the saturation is exactly what activations (ReLU family) mean,
> semantically. :)
In the case of the tflite quantized models I've looked at, the batch
normalization and relu6 operations from training are fused into the conv
operations used during inference. You probably need to fuse...
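For concreteness, the folding itself is the standard per-channel identity
sketched below; the variable names are illustrative, not taken from the
TFLite conversion tooling. The relu6 that follows is then expressed purely
through the quantized min/max saturation at inference time:
```cpp
#include <cmath>
#include <vector>

// Fold a batch norm (gamma, beta, mean, var) that follows a conv into the
// conv's weights and bias, one output channel at a time.
void FoldBatchNormSketch(std::vector<float>& weights,  // [out_ch * k]
                         std::vector<float>& bias,     // [out_ch]
                         const std::vector<float>& gamma,
                         const std::vector<float>& beta,
                         const std::vector<float>& mean,
                         const std::vector<float>& var, float eps) {
  const size_t out_ch = bias.size();
  const size_t k = weights.size() / out_ch;  // weights per output channel
  for (size_t c = 0; c < out_ch; ++c) {
    const float s = gamma[c] / std::sqrt(var[c] + eps);
    for (size_t i = 0; i < k; ++i) weights[c * k + i] *= s;
    bias[c] = (bias[c] - mean[c]) * s + beta[c];
  }
}
```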
> > Although the quantized conv result is held in uint8, it could be static
> > casted to signed int8, or even fewer than 8 bit quantization. That would
> > require both min and max saturations, as in the reference tflite quantized
> > conv implementation
>
> Ah, I see. That finally makes sense
> During inference, the min and max saturation values are just handling
> saturation of values seen outside the range expected from the training...
I guess the saturation is exactly what activations (ReLU family) mean,
semantically. :)
> > > It appears to me this would let them simulate smaller than 8 bit
> > > quantizations.
> >
> > If _simulating smaller than 8 bit_ is the case, 8 bits should be able to
> > hold the activation min/max value.
>
> 8 bits could hold them. But what is the value of output_min / output_max?
> I think @jnorw...
Yes, right. The scaling constant computed during training is based on the
range of values seen after the fused-in activations (at least that is true
for the tflite quantized models I've looked at). That includes being after
the relu6 positive clipping as well. During inference, the min and max
saturation values are just handling saturation of values seen outside the
range expected from the training...
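As a worked example with made-up numbers (not read from a real model): if
training observed the conv + relu6 output in roughly [0.0, 6.0], the exporter
would end up with scale ≈ 6/255 and zero_point = 0 for the uint8 output, and
output_min / output_max fall out of the same quantize lambda shown earlier:
```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>
#include <cstdio>

int main() {
  // Hypothetical output tensor parameters for a conv fused with relu6.
  const float scale = 6.0f / 255.0f;
  const int32_t zero_point = 0;
  auto quantize = [&](float f) {
    return zero_point + static_cast<int32_t>(std::round(f / scale));
  };
  const int32_t output_min = std::max<int32_t>(0, quantize(0.0f));    // 0
  const int32_t output_max = std::min<int32_t>(255, quantize(6.0f));  // 255
  std::printf("output_min=%d output_max=%d\n", output_min, output_max);
  // If training had instead seen a wider range, say [-1.0, 7.0], then
  // scale = 8/255, zero_point ~= 32, and quantize(6.0) ~= 223, i.e. the
  // relu6 bound becomes tighter than the raw uint8 range.
  return 0;
}
```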
> So, this is not about activation.
Of course it comes from the activation, and is related to the zero point and
scale. Maybe you can read the whole implementation rather than reading
secondhand messages.
For this min/max activation:
1. They are even named with `activation` when used in the compute kernel:
http...
> Although the quantized conv result is held in uint8, it could be static
> casted to signed int8, or even fewer than 8 bit quantization. That would
> require both min and max saturations, as in the reference tflite quantized
> conv implementation
Ah, I see. That finally makes sense.
So, this is not about activation.
The min and max are not conditional on the existence of an activation
operation in the original model. They are there to saturate the downscaled
and offset-adjusted 32 bit signed int accumulator to the min and max values
of the uint8 quantized bit range.
Although the quantized conv result is held in uint8, it could be static
casted to signed int8, or even fewer than 8 bit quantization. That would
require both min and max saturations, as in the reference tflite quantized
conv implementation.
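Purely to illustrate that last sentence (this is one possible reading of it,
not how any particular framework actually emulates sub-8-bit kernels):
restricting the saturation window to 2^N consecutive levels would make the
uint8 result behave like an N-bit quantization:
```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical illustration: a clamp window covering 2^bits levels,
// anchored at the zero point (so it assumes a non-negative real range,
// e.g. after a fused ReLU).
void NarrowWindowSketch(int bits, int32_t zero_point, int32_t* act_min,
                        int32_t* act_max) {
  const int32_t levels = 1 << bits;  // e.g. 16 levels for 4 bits
  *act_min = std::max<int32_t>(0, zero_point);
  *act_max = std::min<int32_t>(255, *act_min + levels - 1);
}
```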
> I think it is ok. If we do it this way, we should insert one clamp if we
> have an activation, like our tflite frontend does.
Yes, I agree with that. That's exactly what I was thinking.