Dear All,
This is a continuation of issue
https://github.com/apache/incubator-tvm/issues/6857
@giuseros, thank you for the detailed explanation. It clarified the usage of
the USE_ARM_COMPUTE_LIB_GRAPH_RUNTIME and USE_ARM_COMPUTE_LIB flags when
compiling TVM.
One thing is still unclear:
The argument order of the CUDA kernel is currently decided at the time of the
device/host split, so it is not deterministic at the moment.
To call the CUDA function, we usually invoke a related host function that has
the same argument order as the arguments passed to `build`.
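For intuition, the host/kernel relationship described above can be sketched in plain Python (a toy with invented names, not TVM code; TVM's real host stub is generated): the host-side entry point forwards arguments to the kernel in the fixed positional order chosen at build time.

```python
def make_host_wrapper(kernel, arg_names):
    """Return a host-side entry point that forwards arguments to
    `kernel` in the fixed order chosen at build time."""
    def host_entry(*args):
        assert len(args) == len(arg_names), "argument count mismatch"
        # The host function is the stable interface: callers pass
        # arguments in build order, and it forwards them unchanged.
        return kernel(*args)
    return host_entry

# A stand-in "kernel": C = A + B over plain lists.
def add_kernel(a, b, c):
    for i in range(len(a)):
        c[i] = a[i] + b[i]

entry = make_host_wrapper(add_kernel, ["A", "B", "C"])
A, B, C = [1, 2], [3, 4], [0, 0]
entry(A, B, C)
print(C)  # [4, 6]
```

The point is that callers only ever see the host entry's argument order; how the device kernel itself orders its parameters stays an internal detail.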
---
Hi @joyalbin,
Unfortunately I have never tried this scenario, and I am not an Android expert.
I think you have two options:
* Running the RPC server on Android
* Statically compiling everything together, copying the binary onto Android, and running it
### RPC on Android
In theory it should be similar to the way
Thank you for your reply. It would be great if the CUDA kernel argument list
kept the same order as the TVM `build` args; then we could deploy the CUDA
kernel directly, without the TVM runtime, and avoid the C++ ABI problem.
---
Since we can't know the CUDA kernel argument list, does TVM at least keep the
input and output arguments contiguous, and in topological order?
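For intuition only (this toy does not assert what TVM actually emits), a topological ordering over a small dataflow graph lists each tensor only after everything it depends on:

```python
from graphlib import TopologicalSorter  # stdlib, Python 3.9+

# Toy dataflow graph: node -> set of tensors it depends on.
deps = {
    "tmp": {"a", "b"},    # tmp = f(a, b)
    "out": {"tmp", "c"},  # out = g(tmp, c)
}
order = list(TopologicalSorter(deps).static_order())

# Inputs always precede the values computed from them.
assert order.index("a") < order.index("tmp") < order.index("out")
print(order)
```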
---
[Visit Topic](https://discuss.tvm.apache.org/t/whatt-the-arguments-order-of-tvm-generated-cuda-kernel/8422/5) to respond.
I would still recommend using the PackedFunc interface, as quite a few things
are needed to use the raw kernel directly; for example, the launch parameter
calculation is part of the host code, as is the data unpacking.
Depending on the schedule, we could also generat
Thank you, I will try the C interface. It's true that manually calling the CUDA
kernel can't handle the case of multiple CUDA kernels.
---
@giuseros, this was helpful. I got some ideas on how to move forward.
Dear All,
could you please share any performance comparison between TVM kernels and ACL
kernels on any Arm device (preferably Android)?
---
Hi @comaniac, I want to follow up on my post above. I removed the `if`
statement, and now it works.
Does that mean MergeCompilerRegions does not fully support `if` yet?
This is the code that works:
```
# This is a test case for graph type 1
print("Graph type 1")
```
OK, I think I know what I'm looking for and where it is. It's in
`transform_layout.h`, where there is a `LayoutRewriter` function for this
purpose. Specifically, the `memoizer`'s `Transform` function (defined in the
same file) does the job:
```
Expr Transform(Expr raw, const Layout& src_layout
```
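To make the memoizer idea concrete, here is a hedged pure-Python analogue (all names invented; the real implementation is C++ in `transform_layout.h`): each (expr, src layout, dst layout) rewrite is computed once and then served from a cache.

```python
class LayoutMemoizer:
    """Toy analogue of a transform cache: each (tensor, src, dst)
    layout rewrite is computed once and reused afterwards."""
    def __init__(self):
        self.cache = {}
        self.computed = 0  # number of actual rewrites performed

    def transform(self, name, src_layout, dst_layout):
        key = (name, src_layout, dst_layout)
        if key not in self.cache:
            self.computed += 1
            self.cache[key] = f"layout_transform({name}, {src_layout}->{dst_layout})"
        return self.cache[key]

memo = LayoutMemoizer()
memo.transform("x", "NCHW", "NHWC")
memo.transform("x", "NCHW", "NHWC")  # second call is served from the cache
print(memo.computed)  # 1
```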
Hi guys,
I have a question. I have finished writing `register_topi_schedule` and
`register_topi_compute` for a specific op in the topi/cuda directory, and my
goal is to have AutoTVM automatically extract tasks for that newly registered
schedule. How do I do that? I know tha