@varunnaw Good point. In my project we use the following approach to retrieve kernel attributes, including the dynamic shared memory size and the block/grid launch information, which might be helpful to you:
https://github.com/microsoft/BitBLAS/blob/main/bitblas/builder/wrapper/tir.py#L64-L80
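Roughly speaking, that wrapper walks the lowered TIR and reads the launch parameters off the IR nodes. Below is a minimal sketch of the same idea, not the BitBLAS code itself; `extract_launch_config` is a hypothetical helper, and the exact node layout (`thread_extent` attributes, `"shared.dyn"` allocations) can vary across TVM versions:

```python
import tvm
from tvm import tir


def extract_launch_config(prim_func: tir.PrimFunc) -> dict:
    """Collect thread extents and dynamic shared memory size from a
    lowered PrimFunc (hypothetical helper, not a TVM API)."""
    config = {"dyn_shared_bytes": 0}

    def visit(node):
        # After lowering, thread bindings appear as AttrStmt("thread_extent")
        # whose node is an IterVar tagged e.g. "blockIdx.x" or "threadIdx.x".
        # This assumes static extents (IntImm values).
        if isinstance(node, tir.AttrStmt) and node.attr_key == "thread_extent":
            config[node.node.thread_tag] = int(node.value)
        # Dynamic shared memory shows up as an Allocate whose buffer var
        # points into the "shared.dyn" storage scope.
        elif isinstance(node, tir.Allocate):
            if node.buffer_var.type_annotation.storage_scope == "shared.dyn":
                elems = 1
                for extent in node.extents:
                    elems *= int(extent)
                config["dyn_shared_bytes"] += elems * tvm.DataType(node.dtype).bits // 8

    tir.stmt_functor.post_order_visit(prim_func.body, visit)
    return config
```

This assumes the PrimFunc has already been lowered (e.g. via `tvm.lower`) so that thread bindings are expressed as `thread_extent` attributes rather than schedule-level loop annotations.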
## Why is this important?
One suggestion I have for TVM is to add a cleaner exit from the stack. For example, for the OpenCL/CUDA targets, what do I do if I just want the generated kernels?
Note: there is a way to print the OpenCL source, but unfortunately I have not found a way to get the work group / threadblock sizes.
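For what it's worth, dumping the kernel text itself does work today through the imported device modules; it is the launch dimensions that lack an equally direct accessor. A minimal sketch using the legacy TE schedule API, assuming a TVM build with the OpenCL runtime enabled (swap in `target="cuda"` for CUDA):

```python
import tvm
from tvm import te

# Toy elementwise kernel, just to have something to build.
n = te.var("n")
A = te.placeholder((n,), name="A")
B = te.compute((n,), lambda i: A[i] + 1.0, name="B")
sch = te.create_schedule(B.op)
xo, xi = sch[B].split(B.op.axis[0], factor=64)
sch[B].bind(xo, te.thread_axis("blockIdx.x"))
sch[B].bind(xi, te.thread_axis("threadIdx.x"))

mod = tvm.build(sch, [A, B], target="opencl")

# The host module wraps one imported module per device target;
# get_source() returns the generated kernel text (OpenCL C here).
for dev_mod in mod.imported_modules:
    print(dev_mod.get_source())
```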
@echuraev @elvin-n
How did you get the work group sizes from TVM for the OpenCL target on Adreno GPUs?
I saw your samples here: [qualcomm/apps/OpenCLCppRunner at master · Deelvin/qualcomm · GitHub](https://github.com/Deelvin/qualcomm/tree/master/apps/OpenCLCppRunner)
I see that you obtain