Re: [dmlc/tvm] [RFC][Quantization] Support quantized models from TensorflowLite (#2351)

2019-07-06 Thread Tianqi Chen
One thing to be careful about is that when using shift and normalize, right shift corresponds to rounding down as opposed to rounding to nearest; an additional term equivalent to 0.5 needs to be added to get the round-to-nearest behavior.
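A minimal sketch of the rounding point above, in plain Python with hypothetical helper names: adding `1 << (s - 1)` before an `s`-bit right shift (for `s >= 1`) is the integer equivalent of adding 0.5 before truncating.

```python
def rshift_round_down(x, s):
    # Plain arithmetic right shift: rounds toward negative infinity.
    return x >> s

def rshift_round_nearest(x, s):
    # Add half of the divisor 2**s (the "0.5 equivalent") before shifting.
    return (x + (1 << (s - 1))) >> s

assert rshift_round_down(7, 2) == 1      # 7 / 4 = 1.75, rounded down
assert rshift_round_nearest(7, 2) == 2   # 1.75, rounded to nearest
```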

Re: [dmlc/tvm] [RFC][Quantization] Support quantized models from TensorflowLite (#2351)

2019-07-06 Thread Animesh Jain
Thanks everybody for the fruitful discussion. I think we are gradually reaching convergence :) I have been prototyping qnn.conv2d and qnn.requantize at https://github.com/dmlc/tvm/pull/3367. I still have a few loose ends to fix. I will update once I am done, and then we can discuss if the im…

Re: [dmlc/tvm] [RFC][Quantization] Support quantized models from TensorflowLite (#2351)

2019-07-06 Thread Animesh Jain
> And in the case when the scale is a power of two, using shift and normalize might be better than float scale and round.

Yes, the shift and normalize can be done completely in integer arithmetic instead of going to floating point (even if the scales are not powers of two). I have been prototyping that.
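As a sketch of what that integer-only path can look like (following the fixed-point multiplier scheme used by gemmlowp/TFLite; the function names here are hypothetical): the floating-point scale is converted offline into an int32 significand plus a right-shift amount, so the runtime requantize uses only integer multiplies and shifts.

```python
import math

def quantize_multiplier(m):
    # Split a real multiplier m in (0, 1) into an int32 fixed-point
    # significand and a shift, so that m ~= m_fixed * 2**-(31 + shift).
    significand, exponent = math.frexp(m)   # m == significand * 2**exponent
    m_fixed = int(round(significand * (1 << 31)))
    return m_fixed, -exponent

def requantize(acc, m_fixed, shift):
    # Scale an int32 accumulator by the fixed-point multiplier using only
    # an integer multiply and a round-to-nearest right shift.
    total_shift = 31 + shift
    rounding = 1 << (total_shift - 1)
    return (acc * m_fixed + rounding) >> total_shift

m_fixed, shift = quantize_multiplier(0.3)   # a scale that is not a power of two
print(requantize(1000, m_fixed, shift))     # ~= round(1000 * 0.3) = 300
```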

Re: [dmlc/tvm] [RFC][Quantization] Support quantized models from TensorflowLite (#2351)

2019-07-06 Thread 黎明灰烬
Maybe scales are rarely a power of two (I assume you mean values such as 0100b, 0.0010b); they basically have long fractional parts. Tianqi Chen wrote on Sunday, July 7, 2019 at 11:08 AM:

> OK, given that most of the qnn ops are already in integer domain, we might be just fine. Minimization of requantize is still useful. …

Re: [dmlc/tvm] [RFC][Quantization] Support quantized models from TensorflowLite (#2351)

2019-07-06 Thread Tianqi Chen
OK, given that most of the qnn ops are already in integer domain, we might be just fine. Minimization of requantize is still useful. And in the case when the scale is a power of two, using shift and normalize might be better than float scale and round.

Re: [dmlc/tvm] [RFC][Quantization] Support quantized models from TensorflowLite (#2351)

2019-07-06 Thread 黎明灰烬
Several comments :)

> Regarding @anijain2305's [ReLU proposal](https://github.com/dmlc/tvm/issues/2351#issuecomment-508956237).

The symmetric and asymmetric paths may merge into one - the asymmetric one - where the zero point for the symmetric approach is 0. Actually, this is a bit more complicated…
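A small sketch of that merge, assuming the usual affine encoding `real = scale * (q - zero_point)` (names hypothetical): the asymmetric quantized ReLU clamps at the zero point, and the symmetric path falls out by setting the zero point to 0.

```python
import numpy as np

def qrelu(q, zero_point):
    # ReLU directly in the integer domain: real values <= 0 are exactly the
    # codes <= zero_point, so clamping at zero_point implements max(x, 0).
    return np.maximum(q, zero_point)

q = np.array([-20, -5, 0, 7, 100], dtype=np.int8)
print(qrelu(q, zero_point=-5))  # asymmetric: clamps at -5
print(qrelu(q, zero_point=0))   # symmetric: the same path with zero point 0
```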

Re: [dmlc/tvm] [RFC][Quantization] Support quantized models from TensorflowLite (#2351)

2019-07-06 Thread Animesh Jain
> I can see that you might want the graph to represent all the operations prior to optimizing the implementation. I just want to point out that the qrelu implementation can avoid the lowered resolution and can be completely cost-free by revising the downscale multiplier and zero point of a…

Re: [dmlc/tvm] [RFC][Quantization] Support quantized models from TensorflowLite (#2351)

2019-07-06 Thread ds-jnorwood
> To complete the picture, suppose the quantized framework graph is (fw stands for framework): `fw.quantize -> fw.qconv2d -> fw.qrelu -> fw.dequantize`

If you do the qconv2d and qrelu operations sequentially, using their analogous fp operations, the output from qrelu will have the (potentially…

Re: [dmlc/tvm] [RFC][Quantization] Support quantized models from TensorflowLite (#2351)

2019-07-06 Thread Animesh Jain
Thanks @tqchen for the detailed explanation. Actually, my proposal is simpler. My `qnn.relu` does not convert to the three stages that you mentioned. It only performs the `relu_int_i8`. The frameworks (at least TFLite and MXNet) do not go back to FP32 unless the operator is not supported in `i8`.
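A quick numeric check of why staying in `i8` works, as a sketch under the usual affine quantization assumptions (the scale, zero point, and helper names here are illustrative): clamping at the zero point in the integer domain matches the dequantize -> relu -> quantize round trip.

```python
import numpy as np

scale, zp = 0.05, 3   # illustrative quantization parameters

def dequantize(q):
    return scale * (q.astype(np.float32) - zp)

def quantize(x):
    return np.clip(np.round(x / scale) + zp, -128, 127).astype(np.int8)

q = np.array([-100, -4, 3, 10, 90], dtype=np.int8)
via_fp32 = quantize(np.maximum(dequantize(q), 0.0))  # round trip through FP32
in_int8 = np.maximum(q, zp).astype(np.int8)          # relu_int_i8: stay in i8
assert (via_fp32 == in_int8).all()
```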

Re: [dmlc/tvm] [RFC][Quantization] Support quantized models from TensorflowLite (#2351)

2019-07-06 Thread Tianqi Chen
To elaborate further on the choice of the domain and how it is relatively independent of which op you would like to perform: the domain determines how you represent the numbers of a certain layer. We can represent 2.5 as:
- f32: stored as val_f32, where val_f32 = 2.5
- i8: stored as val_i8 * scale + …
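As a concrete instance of the two encodings (the scale and offset values here are illustrative assumptions):

```python
# f32 domain: the value is stored directly.
val_f32 = 2.5

# i8 domain: the real value is recovered as val_i8 * scale + offset.
scale, offset = 0.25, 0.0   # illustrative quantization parameters
val_i8 = 10                 # 10 * 0.25 + 0.0 == 2.5
assert val_i8 * scale + offset == val_f32
```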

Re: [dmlc/tvm] [RFC][Quantization] Support quantized models from TensorflowLite (#2351)

2019-07-06 Thread Tianqi Chen
Yes, you can also view the domain conversion minimization as an optimization pass here. The resulting graph is, to some extent, semantically equivalent to the original one that does the conversion to f32 and back and forth. The idea is that we can be smarter when lowering qnn ops into the r…

Re: [dmlc/tvm] [RFC][Quantization] Support quantized models from TensorflowLite (#2351)

2019-07-06 Thread Animesh Jain
> In particular, referring to the current quantization pass, every value could sit in a domain, which could be fixed point with an implied scale, or floating point. Conversion between domains might be necessary and should be conducted in a minimal way. The default way always converts integer do…

Re: [dmlc/tvm] [RFC][Quantization] Support quantized models from TensorflowLite (#2351)

2019-07-06 Thread Tianqi Chen
I agree that mixed-precision might make avg_pool2d's case a bit tricky. However, assuming that the zero point won't change, we might just do `avg_pool2d(x.astype("i32")).astype("i8")`. max_pool2d, though, should be the same, given that the maximum rule is the same regardless of zero point. …
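A numpy stand-in for that expression, as a sketch (assuming a 2x2 window, stride 2, and an unchanged zero point): the window sum can reach 4 * 127 = 508, which overflows i8 but is safe in i32.

```python
import numpy as np

def qavg_pool2d_2x2(x_i8):
    # Accumulate in i32 to avoid saturating the intermediate sum,
    # then cast the averaged result back to i8.
    x = x_i8.astype(np.int32)
    h, w = x.shape
    pooled = (x[0:h:2, 0:w:2] + x[0:h:2, 1:w:2] +
              x[1:h:2, 0:w:2] + x[1:h:2, 1:w:2]) // 4
    return pooled.astype(np.int8)

x = np.full((4, 4), 127, dtype=np.int8)   # each window sum is 508
print(qavg_pool2d_2x2(x))                 # stays correct: all 127
```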

Re: [dmlc/tvm] [RFC][Quantization] Support quantized models from TensorflowLite (#2351)

2019-07-06 Thread Animesh Jain
> Do we allow a mix of standard ops and qnn ones?

The framework-parsed graph might have a mix (as shown in the lowering of qconv2d). But in the `relay.build` function, my first pass would be the quantize_rewrite pass, which will convert all the `qnn` ops to existing Relay ops, resulting in a whole graph…

Re: [dmlc/tvm] [RFC][Quantization] Support quantized models from TensorflowLite (#2351)

2019-07-06 Thread Animesh Jain
@tqchen Added the case for qrelu. (I think the asymmetric lowering can be improved further, but that's not the point.) Similarly for quantized avg_pool2d, as @FrozenGene mentioned, we will still need to upcast the tensor to int32 to avoid saturation. Additionally, we would need to handle the zero point…

Re: [dmlc/tvm] [RFC][Quantization] Support quantized models from TensorflowLite (#2351)

2019-07-06 Thread Animesh Jain
![q_conv2d](https://user-images.githubusercontent.com/13822661/60761466-2ef79d80-9ffe-11e9-9895-8707d4d8c100.jpg)

Re: [dmlc/tvm] [Relay][RFC] Garbage Collection (#3423)

2019-07-06 Thread Tianqi Chen
Seems we agreed that weak references are better than GC. Closing this RFC thread for now. Thanks to everyone who participated in the discussion.

Re: [dmlc/tvm] [Relay][RFC] Garbage Collection (#3423)

2019-07-06 Thread Tianqi Chen
Closed #3423.

Re: [dmlc/tvm] [RFC][Relay][HalideIR] Automatically generate the AST (#3501)

2019-07-06 Thread Junru Shao
Hey Jared, nice proposal! I am mostly interested in using the node system across the C ABI. First, I would love to understand: 1) how member methods could be generated, and 2) their usability across C ABIs. If we wrap up data fields of generated nodes in pure C, and if packed functions' global reg…

Re: [dmlc/tvm] [RFC][Relay] Pass API Discussion (#3202)

2019-07-06 Thread Tianqi Chen
Closed #3202.

Re: [dmlc/tvm] [RFC][Graph Tuner] Graph level auto-tuning (#1585)

2019-07-06 Thread Tianqi Chen
Closed #1585.

Re: [dmlc/tvm] [RFC] More PackedFunc metadata (#2983)

2019-07-06 Thread Tianqi Chen
https://github.com/dmlc/tvm/issues/3501

Re: [dmlc/tvm] [RFC] More PackedFunc metadata (#2983)

2019-07-06 Thread Tianqi Chen
Closed #2983.

Re: [dmlc/tvm] [RFC][Relay] Feature Manager (#3236)

2019-07-06 Thread Tianqi Chen
Closed #3236.

Re: [dmlc/tvm] [RFC][Relay][HalideIR] Automatically generate the AST (#3501)

2019-07-06 Thread Tianqi Chen
cc @jermainewang @kazimuth @junrushao1994 @icemelon9 @ajtulloch @yzhliu who might be interested in this. Some initial thoughts:
- Naming convention for the class and file hierarchy schema, e.g. `tvm.schema.expr.py -> include/IR/expr.h`
- Alternatively, allow declarat…

[TVM Discuss] [Development] Quantization broken due to PR #3135

2019-07-06 Thread Wuwei Lin via TVM Discuss
Okay, I will send a patch.