I'm trying to integrate an autotuning schedule I've created for a special case of convolution into TOPI.
However, I'm having difficulty getting it integrated correctly with autotvm. When running my compute description and schedule in a standalone setup, I am able to autotune successfully, following the decorator style described in the [tune_simple_template.py](https://github.com/apache/incubator-tvm/blob/v0.6/tutorials/autotvm/tune_simple_template.py) tutorial. By default my schedule uses standard techniques like SIMD and gets good times; autotuning squeezes out the extra performance.

I have also successfully integrated it into TOPI to run without autotuning: using the decorators, it picks up my compute definition and schedule. My benchmark takes `X ms`. If I manually disable vectorisation, it takes `4X ms`. The default version in `v0.6` takes `Y ms`.

However, I am confused when I try to use autotvm. If I run my benchmark, I can confirm during autotuning that my compute definition and schedule are called (using the old reliable `printf` debugging). At the end of autotuning, though, it seems to fall back to the default implementation, with the time being around `Y ms` rather than `X ms`. To test my suspicion that the default is being used, I disabled SIMD in my schedule, and this had no effect on the time.

To debug this, I feel I need a better understanding of the flow of execution when integrated into TOPI. Taking ARM CPU conv2d as an example, this is where my traces and reasoning have brought me. I would appreciate it if anyone could point out any holes, or resources I could look at to understand this better. I'm targeting an ARM CPU, so most of the code I'm referring to is in `topi/python/topi/arm_cpu/conv2d.py`. When we autotune, here's what I think happens:

### Define the computation

1. Legalise conv2d from Relay.
2. Access the callback registered with autotvm as a conv2d compute definition.
3. Build the compute definition for the chosen layout, and define the tuning knobs.

### Call the schedule

1. Call `schedule_conv2d_nchw`, registered as an autotvm schedule.
Call the appropriate schedule function for the compute used (e.g. Winograd, spatial pack, etc.).
2. Our normal schedule is applied, e.g. loop unrolling, vectorisation, etc.

### Autotuning occurs

I'm unsure of all the steps that happen here; I'm only familiar with what happens in the context of a standalone version.

---

[Visit Topic](https://discuss.tvm.ai/t/topi-autotuning-integration/6079/1) to respond.