Re: [dmlc/tvm] [RFC][Quantization] Support quantized models from TensorflowLite (#2351)

2019-06-15 Thread Zhao Wu
> @FrozenGene For the output_min and max, isn't the out_dtype enough? If it's uint8, we can clamp at 0 and 255. If it's int8, we can clamp at -128 and 127. I don't see any reason the values will be any different, unless you want to fuse the quantized relu in the quantized convolution from the starting…

Re: [dmlc/tvm] [RFC][Quantization] Support quantized models from TensorflowLite (#2351)

2019-06-15 Thread Zhao Wu
> > It appears to me this would let them simulate smaller than 8 bit quantizations.
>
> If _simulating smaller than 8 bit_ is the case, 8 bits should be able to hold the activation min/max value.

8 bits could hold them. But what are the values of output_min / output_max? I think @jnorwood wants to express…

Re: [dmlc/tvm] [RFC][Quantization] Support quantized models from TensorflowLite (#2351)

2019-06-15 Thread Zhao Wu
> @FrozenGene a clarifying question to your above comment. If we pass in the output scale and shift, can we not compute int32 -> int8 by simply adding more nodes in the graph?

I don't fully understand your comment. Do you mean could we avoid the int32 -> int8 computation? If so, I think we cannot…

Re: [dmlc/tvm] [RFC][Quantization] Support quantized models from TensorflowLite (#2351)

2019-06-15 Thread 黎明灰烬
> It appears to me this would let them simulate smaller than 8 bit quantizations.

If *simulating smaller than 8 bit* is the case, 8 bits should be able to hold the activation min/max value.

Re: [dmlc/tvm] [RFC][Quantization] Support quantized models from TensorflowLite (#2351)

2019-06-15 Thread ds-jnorwood
The tflite quantized convolution reference implementation passes in both limits as int32 values. It appears to me this would let them simulate smaller than 8 bit quantizations, if that is something you want to support. This is from `tensorflow/lite/kernels/internal/reference/conv.h`: `acc …`
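For readers without the source at hand, the gist of the tail of that reference kernel is sketched below. This is a paraphrase, not a verbatim copy of conv.h: the fixed-point rescale (`MultiplyByQuantizedMultiplier` in TFLite) is replaced by a float stand-in, and the helper name is made up for the sketch. The point is only that output_activation_min/max arrive as int32 and are applied after requantization, so they can describe any sub-range of [0, 255].

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>

// Hypothetical helper paraphrasing the tail of the TFLite reference
// quantized conv kernel. Not the real conv.h code.
inline uint8_t RequantizeAndClamp(int32_t acc,
                                  double output_scale,  // stands in for multiplier+shift
                                  int32_t output_offset,
                                  int32_t output_activation_min,   // int32, not uint8
                                  int32_t output_activation_max) { // int32, not uint8
  // Rescale the int32 accumulator into the output quantization domain.
  int32_t scaled = static_cast<int32_t>(std::lround(acc * output_scale));
  scaled += output_offset;
  // Because the limits are int32 parameters, they can encode any sub-range of
  // [0, 255] -- e.g. [0, 15] if you wanted to mimic a 4-bit output.
  scaled = std::max(scaled, output_activation_min);
  scaled = std::min(scaled, output_activation_max);
  return static_cast<uint8_t>(scaled);
}
```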

Re: [dmlc/tvm] [RFC][Quantization] Support quantized models from TensorflowLite (#2351)

2019-06-15 Thread Animesh Jain
@FrozenGene For the output_min and max, isn't the out_dtype enough? If it's uint8, we can clamp at 0 and 255. If it's int8, we can clamp at -128 and 127. I don't see any reason the values will be any different, unless you want to fuse the quantized relu in the quantized convolution from the starting…
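A minimal sketch of this suggestion, with a hypothetical helper name (not TVM API): the clamp range would be derived purely from out_dtype, so nothing extra needs to be passed.

```cpp
#include <cstdint>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical helper: derive the clamp range from out_dtype alone,
// instead of passing output_min / output_max explicitly.
std::pair<int32_t, int32_t> ClampRangeFromDtype(const std::string& out_dtype) {
  if (out_dtype == "uint8") return {0, 255};
  if (out_dtype == "int8")  return {-128, 127};
  throw std::invalid_argument("unsupported out_dtype: " + out_dtype);
}
```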

Re: [dmlc/tvm] [RFC][Quantization] Support quantized models from TensorflowLite (#2351)

2019-06-15 Thread shoubhik
@FrozenGene a clarifying question to your above comment. If we pass in the output scale and shift, can we not compute int32 -> int8 by simply adding more nodes in the graph?
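Reading the question literally, the sketch below spells out the int32 -> uint8 conversion as a chain of elementwise steps, each of which could be its own graph node (multiply, add, clip, cast). Scalar C++ is used only to show the arithmetic; the function name and decomposition are illustrative, not TVM ops.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>

// Illustrative only: the int32 -> uint8 conversion written as a chain of
// elementwise steps, each of which could be a separate node in the graph.
uint8_t RequantizeAsSeparateNodes(int32_t acc, double output_scale,
                                  int32_t output_zero_point) {
  double scaled  = acc * output_scale;                    // node 1: multiply by scale
  double shifted = scaled + output_zero_point;            // node 2: add zero point
  double clipped = std::min(255.0,                        // node 3: clip -- the step
                            std::max(0.0, std::round(shifted)));  // that cannot be skipped
  return static_cast<uint8_t>(clipped);                   // node 4: cast to storage dtype
}
```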

Re: [dmlc/tvm] [RFC][Quantization] Support quantized models from TensorflowLite (#2351)

2019-06-15 Thread Zhao Wu
@anijain2305 I understand your thought. I agree we should make the API minimal. However, no matter which way we choose, q_conv2d’s int32 output should be clamped into the uint8 range. If you don’t pass min / max, you also need to do `output = std::max(output, 0)` and `output = std::min(output, 255)`…
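One case where the bounds are genuinely narrower than the dtype range is a fused activation such as ReLU6: the clamp then happens at quantized 0 and quantized 6, not at 0 and 255. The sketch below is modeled loosely on TFLite's activation-range helpers; the name and signature are illustrative, not the actual API.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>

// Illustrative sketch: compute the uint8 clamp bounds for a fused activation
// such as ReLU6 (real_min = 0, real_max = 6). With no fused activation the
// bounds degenerate to the full [0, 255] range, which is the clamp that is
// still required either way.
void ActivationRangeUint8(float scale, int32_t zero_point,
                          float real_min, float real_max,
                          int32_t* act_min, int32_t* act_max) {
  auto quantize = [&](float x) {
    return zero_point + static_cast<int32_t>(std::lround(x / scale));
  };
  *act_min = std::max<int32_t>(0, quantize(real_min));
  *act_max = std::min<int32_t>(255, quantize(real_max));
}
```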

Re: [dmlc/tvm] [RFC][EXPR] Formalize Integer Arithmetic Analysis (#2588)

2019-06-15 Thread Sergei Grechanik
@tqchen One thing I wanted to clarify: why isn't the Analyzer class integrated into the Node hierarchy? Instead, a separate closure-based mechanism is used for Python integration, which feels strange and seemingly makes it harder to create functions that accept Analyzer objects and work across…