This is exactly what I needed, thanks!
I'm now able to extract the PAPI counters from standalone functions by exporting the function as an `.so` library and running it from C++ with the PAPI code above. I'll use this method to get the data I need.

Looking forward, I'm thinking about how best to expose a Python interface to this, to make it more usable for others in the short-to-medium term. Within my C++ module, I benchmark using a `PackedFunc`. I can get that `PackedFunc` on the Python side from `mod` (i.e. the output of `tvm.build`) via `mod.entry_func`.

I guess what I would need is a Python-exposed C++ interface that takes a TVM module (`tvm.runtime.Module`), the input tensors, the target device, and the PAPI counters to collect. It could then just return the JSON from the `tvm::runtime::profiling::Report`.

I'll need to think about the best place to build this. Should it be a method of `Module`, or would it be better to keep it separate somehow?
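To make the "keep it separate" option concrete, here is a rough sketch of what a free-standing Python entry point could look like. Everything in it is hypothetical: `profile_with_papi`, the `backend` callable, and `fake_backend` are names made up for illustration, and the JSON layout is invented rather than being the real `Report` format. The `backend` argument stands in for the Python-exposed C++ function, so the sketch runs without TVM or PAPI installed.

```python
import json
from typing import Any, Callable, Dict, Sequence

# Hypothetical signature for the Python-exposed C++ function:
# (module, inputs, device, counter names) -> report serialized as JSON.
ProfileBackend = Callable[[Any, Sequence[Any], str, Sequence[str]], str]


def profile_with_papi(
    mod: Any,
    inputs: Sequence[Any],
    device: str,
    counters: Sequence[str],
    backend: ProfileBackend,
) -> Dict[str, Any]:
    """Run `mod` once through `backend` and return the parsed JSON report.

    This is a free function rather than a method of `Module`, so the
    profiling code stays outside the module class.
    """
    if not counters:
        raise ValueError("at least one PAPI counter name is required")
    report_json = backend(mod, list(inputs), device, list(counters))
    return json.loads(report_json)


def fake_backend(
    mod: Any, inputs: Sequence[Any], device: str, counters: Sequence[str]
) -> str:
    """Stand-in for the C++ side: reports zero for every requested counter.

    The JSON layout here is an assumption for illustration only.
    """
    return json.dumps(
        {"device": device, "counters": {name: 0 for name in counters}}
    )


if __name__ == "__main__":
    # `object()` is a placeholder for the output of `tvm.build`.
    report = profile_with_papi(
        object(), [1, 2], "cpu", ["PAPI_TOT_CYC"], fake_backend
    )
    print(report)
```

Passing the backend in as an argument is only a convenience for the sketch. If this ended up as a method of `Module` instead, the same arguments minus `mod` would presumably carry over.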