
[Apache TVM Discuss] [Questions] Matrix multiplication example for Cuda

2020-10-05 Thread Tristan Konolige via Apache TVM Discuss

Kernels running on the GPU require all memory accesses to be within a thread or a block. The file you are looking at does not do any thread binding. I suggest looking at this tutorial: https://tvm.apache.org/docs/tutorials/optimize/opt_conv_cuda.html --- [Visit Topic](https://discuss.tvm.ap
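
To make "thread binding" concrete, here is a minimal plain-Python sketch of the idea, not TVM code. It emulates what happens when a flattened output loop is split into an outer part and an inner part that are then mapped to a GPU block index and thread index (in TVM's `te` schedules this roughly corresponds to `split` followed by `bind` with `te.thread_axis("blockIdx.x")` and `te.thread_axis("threadIdx.x")`). The function name `matmul_bound` and the parameter `threads_per_block` are hypothetical and chosen only for illustration.

```python
# Plain-Python emulation of binding a split loop to block/thread indices.
# No TVM or GPU is involved; all names here are illustrative assumptions.

def matmul_bound(a, b, n, threads_per_block=4):
    """Multiply two n x n matrices stored as flat row-major lists.

    Each (block_idx, thread_idx) pair computes exactly one output element,
    which is how a bound GPU kernel would divide the work.
    """
    total = n * n
    c = [0.0] * total
    # Ceiling division: enough blocks to cover every output element.
    num_blocks = (total + threads_per_block - 1) // threads_per_block
    for block_idx in range(num_blocks):              # would be blockIdx.x
        for thread_idx in range(threads_per_block):  # would be threadIdx.x
            flat = block_idx * threads_per_block + thread_idx
            if flat >= total:
                continue  # guard for threads past the end of the output
            row, col = divmod(flat, n)
            acc = 0.0
            for k in range(n):  # the reduction stays inside one thread
                acc += a[row * n + k] * b[k * n + col]
            c[flat] = acc
    return c
```

On a real GPU the two outer loops do not exist as loops: every block and thread runs the body concurrently, which is why a schedule without any binding cannot be lowered to a valid CUDA kernel.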

[Apache TVM Discuss] [Questions] Matrix multiplication example for Cuda

2020-10-04 Thread Le Xu via Apache TVM Discuss

Hi! I have been studying how TVM works, and I tried out this tutorial example from the website (https://github.com/apache/incubator-tvm/blob/master/tutorials/autotvm/tune_simple_template.py). Running it with CUDA (or OpenCL) seems to produce errors like: > Cannot find config