One thing to be careful about when using shift and normalize: right shift
corresponds to rounding down rather than rounding to nearest, so an additional
0.5 equivalent needs to be added before the shift to get round-to-nearest
behavior.
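For concreteness, a minimal sketch of that fix (values are made up); the 0.5
equivalent for a right shift by `shift` bits is `1 << (shift - 1)`:

```python
def shift_round_down(x, shift):
    # plain right shift truncates toward negative infinity
    return x >> shift

def shift_round_nearest(x, shift):
    # add the 0.5 equivalent before shifting to round to nearest
    return (x + (1 << (shift - 1))) >> shift

x, shift = 7, 2                        # 7 / 4 = 1.75
print(shift_round_down(x, shift))      # 1 (rounded down)
print(shift_round_nearest(x, shift))   # 2 (rounded to nearest)
```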
Thanks everybody for the fruitful discussion. I think we are gradually reaching
convergence :)
I have been prototyping the qnn.conv2d and qnn.requantize at
https://github.com/dmlc/tvm/pull/3367
I still have a few loose ends to fix. I will update once I am done, and then we
can discuss whether the implementation looks good.
> And in the case when the scale is a power of two, using shift and normalize
> might be better than float scale and round
Yes, the shift and normalize can be done completely in the integer domain
instead of going to floating point (even if the scales are not powers of two).
I have been prototyping that.
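As a rough sketch of how that works even for non-power-of-two scales (the
numbers below are my own, not the PR's code), the float scale can be
approximated by an integer multiplier plus a right shift:

```python
def quantize_scale(scale, bits=31):
    # approximate a float scale as multiplier / 2**bits
    return int(round(scale * (1 << bits))), bits

scale = 0.0072                        # assumed requantize scale
mult, shift = quantize_scale(scale)

x = 1234                              # value from the int32 accumulator
y = (x * mult + (1 << (shift - 1))) >> shift   # integer-only scale + round
print(y, round(x * scale))            # both print 9
```

Production implementations typically also normalize the multiplier into a
fixed range so the intermediate product stays within 64 bits.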
Maybe scales are rarely a power of two (I assume you mean values such as
0.100b or 0.0010b in binary). In practice they tend to have long fractional
parts.
Tianqi Chen wrote on Sun, Jul 7, 2019, 11:08 AM:
> OK, given that most of the qnn ops are already in integer domain, we might
> be just fine. Minimization of requantize is still useful. And in the case
> when the scale is a power of two, using shift and normalize might be better
> than float scale and round.
OK, given that most of the qnn ops are already in integer domain, we might be
just fine. Minimization of requantize is still useful. And in the case when the
scale is a power of two, using shift and normalize might be better than float
scale and round.
Several comments :)
> Regarding @anijain2305 's [ReLU
> proposal](https://github.com/dmlc/tvm/issues/2351#issuecomment-508956237).
The symmetric and asymmetric paths may merge into one (the asymmetric path),
where the zero point for the symmetric approach is 0. Actually, this is a bit
more complicated …
> I can see that you might want the graph to represent all the operations prior
> to optimizing the implementation. I just want to point out that the qrelu
> implementation can avoid the lowered resolution and can be completely cost
> free by revising the downscale multiplier and zero point …
> To complete the picture, suppose the quantized framework graph is (fw stands
> for framework)
>
> `fw.quantize -> fw.qconv2d -> fw.qrelu -> fw.dequantize`
If you do the qconv2d and qrelu operations sequentially, using their analogous
fp operations, the output from qrelu will have the (potentially) lowered
resolution.
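A small numeric sketch of the cost-free version (the scales and zero points
below are made up): since real 0.0 maps to the quantized value `zero_point`,
ReLU can be folded into the requantize that follows qconv2d by clamping at the
output zero point instead of emitting a separate op:

```python
import numpy as np

acc = np.array([-120, -3, 0, 7, 250], dtype=np.int32)  # conv accumulator
acc_scale = 0.1                     # assumed input_scale * weight_scale
out_scale, out_zp = 0.05, 10        # assumed output quantization params

q = np.round(acc * acc_scale / out_scale).astype(np.int32) + out_zp
q_relu = np.clip(q, out_zp, 255)    # lower clamp at zero point == fused ReLU
print(q_relu.astype(np.uint8))      # [ 10  10  10  24 255]
```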
Thanks @tqchen for the detailed explanation.
Actually, my proposal is simpler. My `qnn.relu` does not convert to the three
stages that you mentioned. It only performs the `relu_int_i8`.
The frameworks (at least TFLite and MXNet) do not go back to FP32 unless the
operator is not supported in `i8`.
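For illustration, a minimal sketch of what such a pure-integer `relu_int_i8`
could look like (the zero point is an assumed value): because real 0.0
corresponds to the quantized value `zero_point`, ReLU in the i8 domain is just
an elementwise max:

```python
import numpy as np

def relu_int_i8(x, zero_point):
    # clamp everything below the quantized zero to the quantized zero
    return np.maximum(x, np.int8(zero_point))

x = np.array([-50, -5, 0, 5, 100], dtype=np.int8)
print(relu_int_i8(x, zero_point=-5))   # [ -5  -5   0   5 100]
```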
To elaborate further on the choice of domain and how it is relatively
independent of which op you would like to perform: the domain determines how
you represent the numbers of a certain layer. We can represent 2.5 by
- f32: stored as val_f32, where val_f32 = 2.5
- i8: stored as val_i8, where val_i8 * scale + offset = 2.5 for an implied
  scale and offset
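A worked example, with the quantization parameters chosen by me purely for
illustration:

```python
scale, offset = 0.1, 0.0                 # assumed quantization parameters
val_f32 = 2.5                            # f32 domain: the value itself
val_i8 = round((2.5 - offset) / scale)   # i8 domain: the integer 25
print(val_i8, val_i8 * scale + offset)   # 25 2.5
```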
Yes, you can also view the domain conversion minimization as an optimization
pass here. The resulting graph is to some extent semantically equivalent to
the original one that does the conversion to f32 and back and forth. The idea
is that we can be smarter when lowering qnn ops into the regular ops.
> In particular, referring to the current quantization pass, every value could
> sit in a domain, which could be fixed point with an implied scale, or
> floating point. Conversion between domains might be necessary and should be
> conducted in a minimal way. The default way always converts the integer
> domain …
I agree that mixed precision might make avg_pool2d's case a bit tricky.
However, assuming that the zero point won't change, we might just do
```avg_pool2d(x.astype("i32")).astype("i8")```.
max_pool2d, though, should be the same given that the maximum rule is the same
regardless of zero point.
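A quick demonstration of why the `i32` upcast matters (the window values are
made up): summing i8 values without widening silently wraps around:

```python
import numpy as np

window = np.array([120, 120, 120, 120], dtype=np.int8)  # one pooling window

bad = window.sum(dtype=np.int8) // 4          # accumulates in i8 and wraps
good = window.astype(np.int32).sum() // 4     # widen first, then average
print(bad, good)                              # -8 120
```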
> Do we allow mix of standard ops and qnn ones?
The framework-parsed graph might have a mix (as shown in the lowering of
qconv2d). But in the `relay.build` function, my first pass would be the
quantize_rewrite pass, which will convert all the `qnn` ops to existing relay
ops, resulting in a graph made up entirely of existing relay ops.
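As a toy sketch of that rewrite (plain Python over a made-up op list, not
TVM's actual pass infrastructure; the op names and attributes are
illustrative):

```python
def quantize_rewrite(graph):
    out = []
    for op in graph:
        if op["name"] == "qnn.relu":
            # lower to an elementwise max against the zero point
            out.append({"name": "maximum",
                        "args": [op["args"][0], op["attrs"]["zero_point"]]})
        else:
            out.append(op)   # standard ops pass through unchanged
    return out

g = [{"name": "qnn.conv2d", "args": ["x", "w"], "attrs": {}},
     {"name": "qnn.relu", "args": ["y"], "attrs": {"zero_point": 0}}]
print(quantize_rewrite(g))
```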
@tqchen Added the case for qrelu. (I think the asymmetric lowering can be
improved further, but that's not the point.)
Similarly for quantized avg_pool2d, as @FrozenGene mentioned, we will still
need to upcast the tensor to int32 to avoid saturation. Additionally, we would
need to handle the zero point.

It seems we agreed that a weak reference is better than gc. Closing this RFC
thread for now. Thanks to everyone who participated in the discussion.
Closed #3423.
Hey Jared,
Nice proposal!
I am mostly interested in using the node system across the C ABI.
First, I would love to understand:
1) how member methods could be generated, and
2) their usability across C ABIs.
If we wrap up the data fields of generated nodes in pure C, and if packed
functions' global registry …
Closed #3202.
Closed #1585.
https://github.com/dmlc/tvm/issues/3501
Closed #2983.
Closed #3236.
cc @jermainewang @kazimuth @junrushao1994 @icemelon9 @ajtulloch @yzhliu who
might be interested in this. Some initial thoughts:
- Naming convention for the class and file hierarchy schema
  - e.g. ```tvm.schema.expr.py -> include/IR/expr.h```
- Alternatively, allow declarations …
Okay, I will send a patch.