[The debugger](https://tvm.apache.org/docs/dev/debugger.html?highlight=debug) can provide some time breakdowns for different operations.
However, I'm not sure it will give you the granularity you need. For example, I looked into the Conv2D op and wanted a breakdown of how much time was spent in padding, packing, convolution, and unpacking, but it only reported the total Conv2D time. I guess this is complicated by the fact that these stages share the same schedule. If there is a way to get a finer-grained time breakdown, it would be good to know.
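
To make the kind of breakdown I mean concrete, here is a minimal sketch in plain Python. It is not TVM's API and does not use the debugger: the helper names (`timed`, `pad`, `conv2d_valid`, `profiled_conv2d`, `breakdown`) are all hypothetical, and the pure-Python padding and convolution are only stand-ins for the real stages. The idea is simply that if each stage can be run as its own call, you can time the stages separately and turn the totals into percentages. It only covers padding and convolution; packing and unpacking would be wrapped the same way.

```python
import time
from collections import defaultdict
from contextlib import contextmanager

# Accumulated wall-clock seconds per stage name (hypothetical helper state).
stage_times = defaultdict(float)

@contextmanager
def timed(stage):
    """Add the wall-clock time of the enclosed block to stage_times[stage]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        stage_times[stage] += time.perf_counter() - start

def pad(image, p):
    """Zero-pad a 2D list-of-lists by p elements on every side."""
    h, w = len(image), len(image[0])
    out = [[0.0] * (w + 2 * p) for _ in range(h + 2 * p)]
    for i in range(h):
        for j in range(w):
            out[i + p][j + p] = image[i][j]
    return out

def conv2d_valid(image, kernel):
    """Naive sliding-window 2D convolution (no kernel flip, no padding, stride 1)."""
    kh, kw = len(kernel), len(kernel[0])
    oh, ow = len(image) - kh + 1, len(image[0]) - kw + 1
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b]
                for a in range(kh) for b in range(kw))
            for j in range(ow)
        ]
        for i in range(oh)
    ]

def profiled_conv2d(image, kernel, p):
    """Run padding and convolution as separate, individually timed stages."""
    with timed("padding"):
        padded = pad(image, p)
    with timed("convolution"):
        out = conv2d_valid(padded, kernel)
    return out

def breakdown():
    """Return each stage's share of the total measured time, in percent."""
    total = sum(stage_times.values())
    if total <= 0:
        return {}
    return {stage: 100.0 * t / total for stage, t in stage_times.items()}
```

Calling `profiled_conv2d(image, kernel, 1)` a number of times and then `breakdown()` gives a per-stage percentage. Note this measures host-side wall-clock time only; for CUDA kernels you would need to synchronize the device before reading the clock, or use a GPU profiler instead, and splitting the stages apart may change performance compared to the fused op.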