[Apache TVM Discuss] [Development/RFC] [RFC][Quantization] A new quantization framework in TVM: initial RFC (1/4)

2021-04-26 Thread Animesh Jain via Apache TVM Discuss
Thanks, that makes sense. I was thinking that during calibration, you could use different attributes for the `simulated_quantize` and `simulated_dequantize` ops. In the calibration callback for an operator, one can simulate the affine space and reason about scales and zero points. But for capturing r
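The simulate-the-affine-space idea above can be illustrated with a minimal, framework-independent sketch (plain Python; this is not TVM's `qnn.simulated_quantize` implementation, just the standard affine quantization formula it models — values are rounded and clamped as if quantized, but stay in float):

```python
def simulated_quantize(x, scale, zero_point, qmin=-128, qmax=127):
    """Map a float value into the affine (quantized) space, keeping float dtype."""
    q = round(x / scale) + zero_point
    return float(min(max(q, qmin), qmax))  # clamp to the int8 range

def simulated_dequantize(q, scale, zero_point):
    """Map an affine-space value back to real values."""
    return (q - zero_point) * scale

# During calibration one can vary scale/zero_point per call and observe the
# simulated rounding/clipping error without committing tensors to int8.
x = 0.42
q = simulated_quantize(x, scale=0.02, zero_point=0)
x_hat = simulated_dequantize(q, scale=0.02, zero_point=0)
error = abs(x - x_hat)
```

Because the data never leaves float, a calibration pass can try many candidate scales and zero points on the same graph and pick the pair minimizing `error`.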

[Apache TVM Discuss] [Development/RFC] [RFC][Quantization] A new quantization framework in TVM: initial RFC (1/4)

2021-04-26 Thread Lily Orth-Smith via Apache TVM Discuss
Also, as part of the standardization of QNN, we could ensure that all QNN "compute" ops go from `int8 -> int8`. I believe `qnn.conv2d` is the only QNN op that outputs an accumulation dtype, so we could change `qnn.conv2d` to take a bias in addition to the data and weight.
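The `int8 -> int8` convention means the int32 accumulator produced inside the op must be requantized back to int8 before it is returned. A toy dense (dot-product) version of that pattern, in plain Python with illustrative names rather than QNN's actual API:

```python
def qnn_dense_int8(data, weight, bias, out_scale_ratio, out_zero_point):
    """int8 inputs -> int8 output; the int32 accumulator never escapes the op."""
    # Accumulate int8 * int8 products in a wide (int32) accumulator, add int32 bias.
    acc = sum(d * w for d, w in zip(data, weight)) + bias
    # Requantize: rescale into the output scale, shift by the output zero point,
    # and clamp back to the int8 range.
    q = round(acc * out_scale_ratio) + out_zero_point
    return min(max(q, -128), 127)

y = qnn_dense_int8([10, -3, 7], [5, 5, 5], bias=100,
                   out_scale_ratio=0.1, out_zero_point=0)
```

Folding the bias add into the op is what lets the accumulation dtype stay an internal detail instead of leaking into the graph.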

[Apache TVM Discuss] [Development] [RFC] Compute graph pipeline with new subgraph executor

2021-04-26 Thread Cody H. Yu via Apache TVM Discuss
Here are my two cents before diving into the detailed code review. 1. At first glance, most implementations, including the Relay passes, are done in Python; it would be better to implement them in C++ for performance. 2. The term and namespace "subgraph" are improper and confusing. 3.

[Apache TVM Discuss] [Development] [RFC] Compute graph pipeline with new subgraph executor

2021-04-26 Thread huajsj via Apache TVM Discuss
@tqchen @comaniac @areusch @tmoreau89 @zhiics @giuseros.

[Apache TVM Discuss] [Development/RFC] [RFC][Quantization] A new quantization framework in TVM: initial RFC (1/4)

2021-04-26 Thread M1k3 via Apache TVM Discuss
[quote="electriclilies, post:21, topic:9775, full:true"] @mikeseven Yes, the goal is to create a fully quantized graph, and we do recognize that this transformation will change the output of the graph. For this reason, we're not going to present the rewrite as a Relay pass. And I definitely agr

[Apache TVM Discuss] [Development/RFC] [RFC][Quantization] A new quantization framework in TVM: initial RFC (1/4)

2021-04-26 Thread Lily Orth-Smith via Apache TVM Discuss
[quote="anijain2305, post:20, topic:9775"] I am trying to understand why we need `qnn.conv2d*` (* represents operator along the lines of `qnn.simulated_conv2d`) during calibration. The only reason would be if you want to propagate the error from previous operators while **calibrating** current

[Apache TVM Discuss] [Development/RFC] [RFC][Quantization] A new quantization framework in TVM: initial RFC (1/4)

2021-04-26 Thread Lily Orth-Smith via Apache TVM Discuss
@mikeseven Yes, the goal is to create a fully quantized graph, and we do recognize that this transformation will change the output of the graph. For this reason, we're not going to present the rewrite as a Relay pass. And I definitely agree that we should let there be user-defined handling. A

[Apache TVM Discuss] [Development/RFC] Compute Graph Pipeline

2021-04-26 Thread huajsj via Apache TVM Discuss
Thanks @comaniac @zhiics @giuseros @JaydeepIMG @JakeStevens for the review and discussion. The compute pipeline runtime/executor PR has already been submitted (https://github.com/apache/tvm/pull/7892), and I have also proposed a new RFC (https://discuss.tvm.apache.org/t/rfc-compute-graph-pipeline-with-new-s

[Apache TVM Discuss] [Development] [RFC] Compute graph pipeline with new subgraph executor

2021-04-26 Thread huajsj via Apache TVM Discuss
> This is a follow-up RFC for https://discuss.tvm.apache.org/t/compute-graph-pipeline/8957 > PR https://github.com/apache/tvm/pull/7892 # **Split a Relay graph into subgraphs, then pipeline the subgraphs: RFC** In this RFC, we present a new framework for subgraph pipelining that first splits rela
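The execution model behind subgraph pipelining — split the graph into stages and keep all of them busy on different inputs at once — can be sketched generically with worker threads and queues. This is plain Python illustrating the concept only, not the proposed executor's API:

```python
import queue
import threading

def make_stage(fn, inq, outq):
    """Run fn on items from inq, pushing results to outq; None shuts the stage down."""
    def worker():
        while True:
            item = inq.get()
            if item is None:      # propagate shutdown downstream
                outq.put(None)
                break
            outq.put(fn(item))
    t = threading.Thread(target=worker)
    t.start()
    return t

# Two "subgraphs" chained into a pipeline: stage 1 feeds stage 2.
q0, q1, q2 = queue.Queue(), queue.Queue(), queue.Queue()
t1 = make_stage(lambda x: x * 2, q0, q1)   # subgraph 1
t2 = make_stage(lambda x: x + 1, q1, q2)   # subgraph 2

for x in [1, 2, 3]:
    q0.put(x)
q0.put(None)

results = []
while (r := q2.get()) is not None:
    results.append(r)
t1.join(); t2.join()
```

While stage 2 works on input *n*, stage 1 can already process input *n+1*, which is the throughput benefit the RFC targets when subgraphs run on different devices.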

[Apache TVM Discuss] [Development] [RFC] Preparing to Launch new RFC Process

2021-04-26 Thread Chris Hoge via Apache TVM Discuss
The new [RFC repository](https://github.com/apache/tvm-rfcs) is online, and we have a [final draft of the RFC process ready for review](https://github.com/apache/tvm-rfcs/pull/2). Please take a moment to review the process and raise any issues or questions so that we can address them before w

[Apache TVM Discuss] [Development/RFC] [RFC] Introducing a 'rolling_buffer' scheduling primitive

2021-04-26 Thread Matt Barrett via Apache TVM Discuss
### What is a rolling buffer? A rolling buffer (at least for the purposes of this RFC) is a buffer where one of the dimensions is addressed via modulo arithmetic. This gives it a 'wrap-around' behaviour, making it self-overwriting. This means rolling buffers are effectively just higher dimens
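The modulo addressing described above can be shown with a minimal sketch (plain Python, illustrative only): writes wrap around in the rolled dimension, so a buffer of size N only ever holds the most recent N elements.

```python
class RollingBuffer:
    """1-D rolling buffer: index i lives at slot i % size, so old data is overwritten."""
    def __init__(self, size):
        self.size = size
        self.data = [None] * size

    def write(self, i, value):
        self.data[i % self.size] = value   # wrap-around write

    def read(self, i):
        return self.data[i % self.size]    # valid only for the last `size` writes

buf = RollingBuffer(3)
for i in range(5):            # write logical indices 0..4 into 3 physical slots
    buf.write(i, i * 10)
live = [buf.read(i) for i in (2, 3, 4)]   # indices 0 and 1 were overwritten
```

The storage savings come from allocating `size` slots for a logically unbounded dimension, which is exactly the benefit of rolling a buffer between cascaded operations.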

[Apache TVM Discuss] [Development/RFC] [RFC][TFLite frontend] Create models for frontend testing by directly writing TFLite buffers

2021-04-26 Thread Matt Barrett via Apache TVM Discuss
@siju-samuel @FrozenGene you may also be interested in this proposal.

[Apache TVM Discuss] [Development/RFC] [RFC][TFLite frontend] Create models for frontend testing by directly writing TFLite buffers

2021-04-26 Thread Elen Kalda via Apache TVM Discuss
@anijain2305 @tqchen @dmitriy-arm

[Apache TVM Discuss] [Development/RFC] [RFC][Quantization] A new quantization framework in TVM: initial RFC (1/4)

2021-04-26 Thread Animesh Jain via Apache TVM Discuss
I apologize for the long delay. Thanks @electriclilies and team for the nicely written RFC. I support the idea. Reading through the comments, it seems that many of us are in agreement about AutoQ and its reliance on the QNN extension. The mentioned pain points mostly revolve around * The inconsi