[Apache TVM Discuss] [Development/RFC] [RFC] CSE Optimization
I suppose CSE would solve my earlier question, where two identical adds with the same tensor shapes were not simplified into a single add shared by both consumers? --- [Visit Topic](https://discuss.tvm.apache.org/t/rfc-cse-optimization/8130/7) to respond. You are receiving this because you enabled mailing list mode. To unsubscribe from these emails, [click here](https://discuss.tvm.apache.org/email/unsubscribe/41e67799778f0933d30ffc7a438e4c318c6c69578835420adcce0a2d2110d2db).
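To make the question above concrete, here is a minimal sketch of the idea behind common-subexpression elimination: structurally identical expressions are hash-consed so later occurrences reuse the first node. This is an illustration only, not TVM's actual CSE pass, and the `Expr` class is hypothetical.

```python
# Hypothetical sketch of common-subexpression elimination (CSE) by
# hash-consing: structurally identical expressions map to one node.
# This is an illustration of the idea, NOT TVM's implementation.

class Expr:
    def __init__(self, op, *args):
        self.op, self.args = op, args

    def key(self):
        # Structural identity: same op applied to the same operand nodes.
        return (self.op, tuple(id(a) for a in self.args))

def cse(exprs):
    """Deduplicate structurally identical expressions."""
    seen = {}
    out = []
    for e in exprs:
        k = e.key()
        if k not in seen:
            seen[k] = e      # first occurrence becomes the shared node
        out.append(seen[k])  # later duplicates reuse it
    return out

x = Expr("input_x")
y = Expr("input_y")
add1 = Expr("add", x, y)
add2 = Expr("add", x, y)   # identical add on the same operands
deduped = cse([add1, add2])
assert deduped[0] is deduped[1]   # both uses now share one add node
```

In this sketch, the two adds in the question would collapse into one node whose result feeds both downstream uses.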
[Apache TVM Discuss] [Development] ScatterND missing
ScatterND is used in TFLite and ONNX, notably in Yolo v5 models. @jainris asked for it in August too, but it doesn't seem to be available in the latest code. Is there any plan to support it? Thanks, --mike --- [Visit Topic](https://discuss.tvm.apache.org/t/scatternd-missing/8292/1) to respond.
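For readers unfamiliar with the operator: ScatterND, as defined in ONNX, copies `data` and writes each slice of `updates` at the position given by the corresponding index tuple in `indices`. A pure-Python sketch of the semantics for the simple scalar-update case (illustration only, not the TVM or ONNX implementation):

```python
# Pure-Python sketch of ONNX ScatterND semantics on nested-list tensors:
# output starts as a copy of data; each index tuple in `indices` selects
# a slot where the matching entry of `updates` is written.
# Illustration only, not the TVM/ONNX implementation.
import copy

def scatter_nd(data, indices, updates):
    out = copy.deepcopy(data)
    for idx, upd in zip(indices, updates):
        target = out
        for i in idx[:-1]:       # walk down to the parent of the slot
            target = target[i]
        target[idx[-1]] = upd    # write the update
    return out

data = [[0, 0, 0], [0, 0, 0]]
# write 5 at position [0, 1] and 7 at position [1, 2]
result = scatter_nd(data, [[0, 1], [1, 2]], [5, 7])
# result == [[0, 5, 0], [0, 0, 7]]
```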
[Apache TVM Discuss] [Development/RFC] [C/C++ runtime] multimodel support
In scenarios where multiple models are used back to back, with multiple inputs and outputs, TVM doesn't produce native libraries that make it easy to connect them:

- `get_num_inputs()` returns all tensors instead of only the model's inputs.
- `get_output(id)` has no support for strings, and since output names are mangled, it is unclear which output an `id` corresponds to.
- As mentioned in the topic "multithreading and TVM runtime", there seems to be an issue with sharing the module factory between threads. In a multi-model scenario, each model runs in a different thread and caching the module factory doesn't work, forcing each thread to recreate it, which incurs a performance hit.
- While a secondary goal, the names of operators in the graph can be many characters long, where a simple integer would suffice.
- Also a secondary goal: parameters saved in a library are uncompressed. When saved separately and compressed with even a simple gzip, quite a lot of space can be reclaimed.

What we need:

- `get_num_inputs()` should return only the model's inputs.
- `get_num_params()` should return only parameters/weights.
- Preserve output node names so that `get_output(name)` works.
- Make sure two models running in their own threads can cache their module factory at setup time and reuse PackedFuncs as fast as possible.
- Replace parameter names with integers.
- Provide an option to compress parameter tensors, especially when stored in the same library; even a default gzip or LZ4 saves a lot of space, and more dedicated methods could be provided by users.

These would be extensions of the existing code, as (most of) this information is already available in the graph runtime, for example. I'm not sure whether there are impacts on the rest of the codebase. What do you think? --- [Visit Topic](https://discuss.tvm.apache.org/t/c-c-runtime-multimodel-support/8518/1) to respond.
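The first two requests above amount to moving bookkeeping out of every caller and into the runtime. A hypothetical sketch of the workaround users apply today: separate real inputs from weights by subtracting the saved parameter names from the full tensor name list. The function and variable names below are illustrative, not an actual TVM API.

```python
# Hypothetical sketch of the caller-side workaround the proposal wants
# to make unnecessary: today get_num_inputs() counts parameters too, so
# callers split real inputs from weights by subtracting the saved
# parameter names. Names here are illustrative, not TVM APIs.

def split_inputs_and_params(all_tensor_names, param_names):
    """Return (model_inputs, params) given every input-slot name the
    runtime reports and the names of the saved parameters."""
    saved = set(param_names)
    inputs = [n for n in all_tensor_names if n not in saved]
    params = [n for n in all_tensor_names if n in saved]
    return inputs, params

# Example: a graph reporting one real input plus two weight tensors.
names = ["data", "conv0_weight", "dense0_bias"]
saved_params = ["conv0_weight", "dense0_bias"]
inputs, params = split_inputs_and_params(names, saved_params)
# inputs == ["data"]; params == ["conv0_weight", "dense0_bias"]
```

With separate `get_num_inputs()` / `get_num_params()` semantics as proposed, this split would live in the runtime instead of being re-implemented by every caller that chains models.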
[Apache TVM Discuss] [Development/RFC] [C/C++ runtime] multimodel support
Thanks for splitting the proposal.

About F1: yes, we can generate multiple libraries, but the issue is linking them together. Specifically, there is no way to differentiate inputs vs params/weights, and there is no way to know the names of the outputs, as they have been mangled after simplification.

About F2: yes, it may be useful for debugging, but once released I don't see any need to keep long names. At release, all one needs are the input (not param) and output names. --- [Visit Topic](https://discuss.tvm.apache.org/t/c-c-runtime-multimodel-support/8518/3) to respond.
[Apache TVM Discuss] [Development/RFC] [RFC][Quantization] A new quantization framework in TVM: initial RFC (1/4)
I'd like to make sure the end goal of this framework is to create a fully quantized graph, i.e. with all operators in affine space. Unlike the usual transformation constraint in TVM that a graph rewrite doesn't change the outcome, quantization obviously does. Statistics must be available to help answer how much.

From a BYOC point of view, some groups of operators may be replaced by efficient hardware equivalents, for example conv-add-relu. Also, math functions may be replaced by LUTs.

The transformed graph is a simulated quantized graph that allows the user or the quantization framework to always simulate the output and handle quantization error. I don't think we need to provide all combinations, but hooks should be in place to allow such custom, user-defined handling.

Finally, the proposal may be missing a definition of accumulators in affine space. While weights, inputs (constant or dynamic) and outputs will be in affine space, e.g. int8 dtype, it is important to be able to specify which dtype intermediate math operations will use, for example int32. If we allow any kind of dtype, then the simulated quantized graph should be able to answer "how many bits do I need before saturation?". Again, I view such answers as part of the statistics the user can analyze. At TIR level, such accumulators may lead to efficient, hardware-dependent transformations. --- [Visit Topic](https://discuss.tvm.apache.org/t/rfc-quantization-a-new-quantization-framework-in-tvm-initial-rfc-1-4/9775/19) to respond.
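A minimal sketch of the two statistics discussed above, quantization error and accumulator width, using int8 affine quantization with an int32 dot-product accumulator. The scale, zero point, and data are illustrative; this is not the proposed framework's code.

```python
# Illustrative sketch (not the proposed framework): quantize two vectors
# to int8 in affine space, accumulate their dot product in int32, then
# report quantization error and how many bits the accumulator needed.

def quantize(xs, scale, zp, lo=-128, hi=127):
    return [max(lo, min(hi, round(x / scale) + zp)) for x in xs]

scale, zp = 0.05, 0           # illustrative affine parameters
a = [1.00, -0.50, 0.25, 0.75]
b = [0.50, 0.25, -1.00, 0.10]
qa = quantize(a, scale, zp)
qb = quantize(b, scale, zp)

# Dot product accumulated on the de-offset int values (fits in int32).
acc = sum((x - zp) * (y - zp) for x, y in zip(qa, qb))
dot_fp32 = sum(x * y for x, y in zip(a, b))
dot_sim = acc * scale * scale          # map back to real-valued space

error = abs(dot_fp32 - dot_sim)        # quantization-error statistic
bits_needed = max(abs(acc), 1).bit_length() + 1  # incl. sign bit
assert bits_needed <= 32               # would flag saturation risk
```

A simulated quantized graph that tracks `error` and `bits_needed` per operator would let the user answer both "how much did the rewrite change the outcome" and "how many bits do I need before saturation".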
[Apache TVM Discuss] [Development/RFC] [RFC][Quantization] A new quantization framework in TVM: initial RFC (1/4)
[quote="electriclilies, post:21, topic:9775, full:true"]
@mikeseven Yes, the goal is to create a fully quantized graph, and we do recognize that this transformation will change the output of the graph. For this reason, we're not going to present the rewrite as a Relay pass. And I definitely agree that we should let there be user-defined handling.

Also, we definitely have been thinking about simulating accumulation in affine space. For int8 input datatypes with int32 accumulation, simulating int32 accumulation is probably not super important since there's a low likelihood of overflow. Therefore we're hoping to deal with it in the multi-dtype extension. One option for doing this is creating another simulated QNN op that simulates overflow for a given dtype.
[/quote]

Thanks Lily. Agree ;-) --- [Visit Topic](https://discuss.tvm.apache.org/t/rfc-quantization-a-new-quantization-framework-in-tvm-initial-rfc-1-4/9775/23) to respond.
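For context on the quoted idea of "a simulated QNN op that simulates overflow for a given dtype", the core computation such an op could model is two's-complement wrap-around into the target dtype's range. A hypothetical sketch, not the actual op:

```python
# Hypothetical sketch of what a simulated-overflow op could compute:
# wrap an arbitrary integer into the two's-complement range of a
# `bits`-wide signed accumulator dtype. Not the actual QNN op.

def simulate_overflow(value, bits=32):
    """Wrap `value` the way a signed `bits`-wide integer would."""
    mask = (1 << bits) - 1
    v = value & mask
    if v >= 1 << (bits - 1):    # sign bit set -> negative
        v -= 1 << bits
    return v

# The same accumulated value is intact in a simulated int32 but wraps
# in a simulated int16:
assert simulate_overflow(40_000, bits=32) == 40_000
assert simulate_overflow(40_000, bits=16) == -25_536
```

Running a graph with such an op for several candidate accumulator widths would expose exactly where a narrower dtype starts to diverge from the reference output.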