Yeah, a performance regression test would be very nice. There are a lot of
times when we need to do a binary search to find the commit that caused a regression.
---
[Visit Topic](https://discuss.tvm.apache.org/t/rfc-building-a-new-reproducible-benchmark-for-tvm/8496/3) to respond.
Yeah. In most cases we can do vectorization in TIR instead of relying on LLVM.
---
[Visit Topic](https://discuss.tvm.apache.org/t/role-of-the-llvm-autovectorizer-in-tvm/8388/3) to respond.
The graph of the TF OD model is much larger than that of the PT OD model.
---
[Visit Topic](https://discuss.tvm.apache.org/t/vm-slow-compilation-of-tf-object-detection-models/7479/10) to respond.
Would love to see dynamic shape supported; otherwise a large set of models can't
be backed by the new TensorIR. :D
---
[Visit Topic](https://discuss.tvm.apache.org/t/rfc-tensorir-a-schedulable-ir-for-tvm/7872/16) to respond.
Thanks for the clarification. It would be nice if we could use various methods to
create tensor programs and use the new TIR to schedule them.
---
[Visit Topic](https://discuss.tvm.apache.org/t/rfc-tensorir-a-schedulable-ir-for-tvm/7872/14) to respond.
Thanks for the explanation. The relation between TE and the new TIR is now clearer
to me.
---
[Visit Topic](https://discuss.tvm.apache.org/t/rfc-tensorir-a-schedulable-ir-for-tvm/7872/13) to respond.
What are the semantics of begin=3 and end=0 in the original framework? This Relay
node is illegal since it generates a negative slice.
---
[Visit Topic](https://discuss.tvm.apache.org/t/can-slice-from-relay-support-empty-result/5889/9) to respond.
Thank you for this proposal! This work does make scheduling much easier. I have
a concern about using this way to write a tensor expression. It looks more
complicated than tvm.compute when defining matmul: we need to define some
buffers and create a block with corresponding shape dimensions
TensorArray is supported in Relay, and TF TensorArray ops can be converted now.
Did you mean something beyond these?
---
[Visit Topic](https://discuss.tvm.ai/t/tensor-arrays-in-tir/7135/3) to respond.
Adding "-libs=cblas" in target and building tvm with MKLDNN will use mkl gemm
for dense.
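For illustration, a minimal sketch of what this looks like in practice (the tiny dense workload, shapes, and -mcpu value are placeholders):

```python
import tvm
from tvm import relay

# Tiny dense workload; "-libs=cblas" asks TVM to offload nn.dense to the
# CBLAS (MKL) implementation, assuming TVM was built with the MKL contrib.
data = relay.var("data", shape=(1, 512), dtype="float32")
weight = relay.var("weight", shape=(1024, 512), dtype="float32")
mod = tvm.IRModule.from_expr(relay.Function([data, weight], relay.nn.dense(data, weight)))

target = "llvm -mcpu=skylake-avx512 -libs=cblas"  # -mcpu is a placeholder
with tvm.transform.PassContext(opt_level=3):
    lib = relay.build(mod, target=target)
```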
---
[Visit Topic](https://discuss.tvm.ai/t/rfc-ansor-an-auto-scheduler-for-tvm-autotvm-v2-0/7005/26) to respond.
Thanks @merrymercy
The point of bringing up MKLDNN is that for the dense op these libraries have a bag
of tricks which might be difficult to achieve in TVM. @haichen has done nice
work on TVM+MKLDNN for BERT, which has become the standard way we support
BERT on cloud CPUs. It would be nice to
IMO, AutoTVM plus the schedule template system represents a methodology by which
developers can create and fully control their own kernel optimizations, which is
functionally disjoint from Ansor. If deprecating AutoTVM means we will not
discard any core functionality but just unify it under a larger p
Also, I think a benchmark covering more models on more platforms would be
necessary if we want to replace a major part of the system. In addition, we can
probably consider different methods of codegen in TVM as baselines. One example
is that currently we use TVM+MKLDNN for BERT on x86 CPUs since x86
Thanks @merrymercy for this work! I have several questions regarding this
plan:
1. In the Ansor paper there are relative performance numbers between the current
AutoTVM and Ansor; it would be great to have some benchmark data in terms of
latency. For example, for resnet50_v1 we can achieve 2.2 ms on
Looks like some op names have changed between TF 1.x and 2.x.
---
[Visit Topic](https://discuss.tvm.ai/t/tensorflow-frontend-support-for-tensor-frontend-new-operators/6971/4) to respond.
The current TF frontend parser is for TF 1.x. The community has had some initial
discussions about TF 2.x support, but I'm not sure about the status.
---
[Visit Topic](https://discuss.tvm.ai/t/tensorflow-frontend-support-for-tensor-frontend-new-operators/6971/2) to respond.
Closed #4969.
--
https://github.com/apache/incubator-tvm/issues/4969#event-3437024852
This feature is now supported in TVM.
--
https://github.com/apache/incubator-tvm/issues/4969#issuecomment-643081334
@zhanghaohit We're still investigating different options, but most likely a
static shape will fall into a bucket and the corresponding kernel will be
called.
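As a conceptual sketch only (plain Python, not TVM API; the bucket sizes and padding logic are hypothetical), the bucketing idea looks roughly like this:

```python
import numpy as np

BUCKETS = [1, 2, 4, 8, 16]  # hypothetical batch-size buckets

def pick_bucket(batch_size):
    """Return the smallest bucket that can hold the given batch."""
    for b in BUCKETS:
        if batch_size <= b:
            return b
    return BUCKETS[-1]

def run(x, kernels):
    """kernels maps a bucketed batch size to a kernel compiled for that static shape."""
    b = pick_bucket(x.shape[0])
    # Pad the batch dimension up to the bucket size so the static kernel applies,
    # then trim the padded rows from the output.
    pad = np.zeros((b - x.shape[0],) + x.shape[1:], x.dtype)
    return kernels[b](np.concatenate([x, pad]))[: x.shape[0]]
```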
--
Makes sense. For API users, we will provide a more dynamic API to support any
input case. In the backend, we separate the purely static case (which probably
requires no shape function?) from the dynamic cases to make it easier to maintain
the related passes.
---
Correct me if my understanding is wrong: is the goal of A1 to eventually merge
static and dynamic ops into a single dynamic API whose input tensors allow
dynamic input and whose attributes only allow constants (like TensorFlow)? Also,
in terms of the boundary between static and dynamic ops, we still need to
@mbrookhart Thank you for this RFC. IMHO, one advantage of A0 is that it is more
user friendly to have a single unified API handling both static and dynamic
shape cases. Though this adds complexity to each op's type inference, it
reduces 1) the number of Relay ops and 2) the complexity of the frontend logic to handle
@cloudhan Thanks for your info. @icemelon9 Do we have any work related to
dynamic axis ranges?
In terms of codegen, efficiency (and also how to limit the number of
buckets while losing as little performance as possible) is indeed one of the
difficult parts. We are working on improving some fundamental infra to see how
@zhanghaohit This feature is one part of dynamic codegen. We are working on some
significant backend features and will update this later.
--
Does this mean we will have a set of predefined tag names so that users can
directly refer to a specific hardware platform, instead of having to set up
architecture details such as "mcpu"? This would make it a lot easier for TVM
beginners.
---
Thanks for bringing this up. I have a question about the functionality of the
target tag: does it mainly serve as a name to differentiate AutoTVM logs for a
specific platform? For x86 CPUs, this name might not be very important since we
mainly care about the ISA supported by the CPU.
---
We will have general support for TensorFlow control flow and tensor array,
which allows parsing TensorFlow object detection models.
--
In the long term, we probably want to improve FoldConstant to hint that for
some ops such as shape_of, we just need the input shape to be constant. In the
short term, I'm not sure whether we need a pass specifically for it.
Another way is to just modify the frontend to infer_value for
@anijain2305 Thank you for bringing this up. Would you mind listing the current
pain points of managing different categories of conv2d in Relay, and the
pros and cons of keeping the current method vs. separating them?
---
Hi,
This mainly targets dynamic codegen for GPUs. Dynamic codegen in TVM is
still a WIP project.
---
[Visit Topic](https://discuss.tvm.ai/t/if-hoisting-in-tvm/6127/2) to respond.
[quote="heliqi, post:6, topic:5889"]
```
```
When have an ‘end_v < begin_v error’ , it still seems to throw
[/quote]
In this case, same change need to be made for relay: ```CHECK_LT(begin_v,
end_v)``` -> ```CHECK_LE(begin_v, end_v)```
---
https://github.com/apache/incubator-tvm/blob/e89b19d95d/topi/include/topi/transform.h#L620
The begin == end case is already supported.
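For illustration, a minimal sketch (hypothetical 4x4 input) of the empty-slice case and its NumPy analogue:

```python
import numpy as np
import tvm
from tvm import relay

# NumPy analogue: begin == end gives an empty result.
a = np.arange(16, dtype="float32").reshape(4, 4)
print(a[3:3].shape)  # (0, 4)

# The corresponding Relay expression.
x = relay.var("x", shape=(4, 4), dtype="float32")
y = relay.strided_slice(x, begin=[3, 0], end=[3, 4], strides=[1, 1])
mod = tvm.IRModule.from_expr(relay.Function([x], y))
print(mod)
```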
---
[Visit Topic](https://discuss.tvm.ai/t/can-slice-from-relay-support-empty-result/5889/5) to respond.
One blocking issue for dynamic NMS is that it needs to use dynamic
strided_slice. However, the current fusion pass can't handle dynamic shapes well
yet. As a result, we can't fuse dynamic strided_slice and have to temporarily
mark strided_slice as Opaque. However, this will result in other normal static
### Summary
In today's TVM TensorFlow frontend, there is only limited support for control
flow, which makes it difficult to cover TensorFlow object detection
models. In this RFC, we will discuss how to improve the current TVM TF frontend to
fully support TensorFlow control flow.
### Solution
Symbolic Shape Enhancement:
Add more shape functions commonly used in CV models:
https://github.com/apache/incubator-tvm/pull/4179
--
@comaniac Thank you for the data. Can you update the main thread?
--
https://github.com/apache/incubator-tvm/issues/4188#issuecomment-549959110
Also, after further discussion with @comaniac, we thought it's worth digging a bit
further into why full tuning for SSD and YOLO only achieves 60%-70% of the
performance of selective tuning. Maybe increasing the number of trials for
full tuning can give us a clearer picture.
--
I agree that we can think a bit more about the names "select" and "depend".
These two names have rich semantics in different contexts. Maybe we would like
an API more tightly bound to AutoTVM specifically.
--
Thanks for this proposal! For YOLO and SSD, does the performance advantage
mainly come from the larger tuning space? If so, I suggest we also do full
auto-tuning with the expanded tuning space, so that we have an apples-to-apples
comparison and a clearer picture of the tuning time vs. performance
trade-off.
@tqchen Sure. The dispatch function doesn't need to be coupled with relay::Module.
--
https://github.com/dmlc/tvm/issues/4118#issuecomment-545759382
@soiferj For the `full` op, we can change the input shape argument to be a
relay.Expr. We use hybrid script to register shape functions, since most of
them are not easy to write as tensor expressions. We only add CPU versions of the
shape functions, and rely on heterogeneous execution for GPU.
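As a rough sketch of what such a hybrid-script shape function looks like (names are illustrative; `output_tensor`, `const_range`, and `int64` are intrinsics resolved by the hybrid parser rather than ordinary Python imports, and the decorator lives under `tvm.te.hybrid` in recent TVM):

```python
from tvm.te.hybrid import script

@script
def _full_shape_func(shape):
    # The output shape of full() is simply the values held in the `shape` tensor.
    ndim = shape.shape[0]
    out = output_tensor((ndim,), "int64")
    for i in const_range(ndim):
        out[i] = int64(shape[i])
    return out

# The function would then be registered for the op with register_shape_func.
```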
--
For local log file management, how about we store the best K schedules for each
workload? Users can choose how many schedules they would like to keep.
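A minimal sketch of that idea on top of the existing autotvm log utilities (K and the file names are placeholders):

```python
from collections import defaultdict
import numpy as np
from tvm import autotvm

K = 3  # hypothetical number of schedules to keep per workload

# Group valid records by workload, keyed on their mean measured cost.
records = defaultdict(list)
for inp, res in autotvm.record.load_from_file("tuning.log"):
    if res.error_no == 0:
        records[inp.task.workload].append((np.mean(res.costs), inp, res))

# Write back only the best K records per workload.
with open("tuning_topk.log", "w") as f:
    for wkl, items in records.items():
        for _, inp, res in sorted(items, key=lambda t: t[0])[:K]:
            f.write(autotvm.record.encode(inp, res) + "\n")
```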
--
Thank you for this proposal. This is helpful for managing local log files. One
question about:
```python
with config_library:
relay.build(...)
```
What is the relationship between config_library and the autotvm dispatch context?
It seems that this design replaces the dispatch context with config_library.
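For reference, the dispatch-context usage this would presumably replace or wrap looks roughly like the following (the log file name and tiny module are placeholders):

```python
import tvm
from tvm import autotvm, relay

# Tiny placeholder module; in practice `mod`/`params` come from a frontend import.
x = relay.var("x", shape=(1, 8), dtype="float32")
mod = tvm.IRModule.from_expr(relay.Function([x], relay.nn.relu(x)))

# Today, tuned schedules are picked up by wrapping the build in the
# autotvm dispatch context that reads a tuning log file.
with autotvm.apply_history_best("tuning.log"):
    lib = relay.build(mod, target="llvm")
```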
I think you need to create a relay.Function and call it with new_args.
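A minimal sketch of what that could look like (shapes and the new arguments are hypothetical):

```python
import numpy as np
import tvm
from tvm import relay

# Wrap the body in a relay.Function, then call it with the new arguments.
x = relay.var("x", shape=(1, 4), dtype="float32")
y = relay.var("y", shape=(1, 4), dtype="float32")
func = relay.Function([x, y], relay.add(x, y))

new_args = [relay.const(np.ones((1, 4), "float32")),
            relay.const(np.zeros((1, 4), "float32"))]
call = relay.Call(func, new_args)
mod = tvm.IRModule.from_expr(call)
```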
---
## Overview
There are more and more deployment requirements regarding dynamic input graphs,
such as dynamic-batching CV models and dynamic BERT. While dynamic input graphs
are supported in eager mode (PyTorch, TensorFlow Eager, MXNet Gluon) for model
development, TVM still only supports static shape
Just curious, is this change also related to the Relay/TVM node system unification?
--
https://github.com/dmlc/tvm/issues/4116#issuecomment-541469429
Can we have a more detailed example to help clarify this issue?
--
https://github.com/dmlc/tvm/issues/4054#issuecomment-538060274
We will have an RFC regarding dynamic shape support soon. To achieve decent
performance, there is still a lot of work remaining. A rough timeline for
performant dynamic shape models is late this year or early next year.
---
To support CV models in TensorFlow/TFLite, do we need to add propagation rules for
ops to support conversion from "NHWC" to "NCHW"? If so, would it be easier to
add these rules as operator attributes?
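For context, a minimal sketch of how an NHWC graph can be converted to NCHW with the ConvertLayout pass that later landed in Relay (the conv2d workload below is hypothetical):

```python
import tvm
from tvm import relay

# Hypothetical NHWC conv2d, as imported from a TF/TFLite model.
data = relay.var("data", shape=(1, 224, 224, 3), dtype="float32")
weight = relay.var("weight", shape=(3, 3, 3, 16), dtype="float32")
conv = relay.nn.conv2d(data, weight, kernel_size=(3, 3), channels=16,
                       data_layout="NHWC", kernel_layout="HWIO")
mod = tvm.IRModule.from_expr(relay.Function([data, weight], conv))

# Convert the graph to NCHW; per-op conversion rules are registered on the ops.
seq = tvm.transform.Sequential([
    relay.transform.ConvertLayout({"nn.conv2d": ["NCHW", "default"]}),
    relay.transform.InferType(),
])
with tvm.transform.PassContext(opt_level=3):
    mod = seq(mod)
```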
--
Do we need this kind of operator for optimizers, or do we have better alternatives?
---
[Visit Topic](https://discuss.tvm.ai/t/rfc-implement-add-to-semantic-in-tvm/3618/3) to respond.
Agree that we need some benchmarking to decide the best solution.
---
[Visit Topic](https://discuss.tvm.ai/t/explore-optimizations-for-concat/2435/4) to respond.
I think for now it is okay to just support codegen with fully symbolic shapes.
Later, if we want specific optimization strategies for different programs, we
can revisit this and add more codegen strategies.
--
@FrozenGene The "apply_history_best" data is updated.
@yzhliu Updated some implementation details.
--
https://github.com/dmlc/tvm/issues/1585#issuecomment-481464375
+1
--
https://github.com/dmlc/tvm/issues/2973#issuecomment-480552097
@merrymercy The auto-scheduler will create another search space consisting of
schedule templates. For a given set of hardware parameters, it will try various
schedule templates and, for each template, do some auto-tuning on a real device.
This means that for each minor device type, we need to do all these steps
Thank you for opening this RFC! I have a question regarding the user API. Is the
hardware information needed for the autotvm.AutoSchedulerOptions(**kwargs) function
pre-defined for different hardware architectures? If so, how much more
information does a user need to provide to differentiate between d
@FrozenGene The default schedule here for x86 eliminates most layout
transformations. It should have similar performance to "apply_history_best".
I'll update the data for "apply_history_best".
--
Hi,
You might need to set extra LLVM arguments, especially the -mcpu option, depending
on your chip: https://llvm.org/docs/CommandGuide/llc.html
You can also try autotuning after the proper LLVM arguments are set.
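As an illustration, the -mcpu value is just part of the LLVM target string (the CPU names below are placeholders; `llc -mcpu=help` should list what your LLVM build supports):

```python
import tvm
from tvm import te

# Pick the -mcpu matching your chip so LLVM can use the right vector ISA.
target = "llvm -mcpu=skylake-avx512"  # placeholder; e.g. core-avx2, cortex-a72, ...

# Tiny compute definition just to show the target being used at build time.
n = te.var("n")
A = te.placeholder((n,), name="A")
B = te.compute((n,), lambda i: A[i] * 2.0, name="B")
s = te.create_schedule(B.op)
func = tvm.build(s, [A, B], target=target)
```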
---
[Visit Topic](https://discuss.tvm.ai/t/low-efficiency-on-my-own-cpu/2030/2) to respond.