I'm trying to integrate an autotunable schedule that I've created for a special 
case of convolution into TOPI.

However, I'm having difficulty getting it integrated correctly with autotvm.

When running my compute description and schedule in a standalone system, I am 
successfully able to autotune.  This is following the decorator style described 
in the 
[tune_simple_template.py](https://github.com/apache/incubator-tvm/blob/v0.6/tutorials/autotvm/tune_simple_template.py)
 tutorial.  By default my schedule uses standard techniques like SIMD, and gets 
good times; autotuning squeezes out extra performance.
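For reference, my standalone setup has the same shape as the tutorial. A simplified sketch (TVM v0.6 API; the matmul compute here is the tutorial's stand-in, not my actual convolution):

```python
import tvm
from tvm import autotvm

# Decorator-style template, as in tune_simple_template.py (TVM v0.6).
@autotvm.template
def matmul(N, L, M, dtype):
    A = tvm.placeholder((N, L), name='A', dtype=dtype)
    B = tvm.placeholder((L, M), name='B', dtype=dtype)
    k = tvm.reduce_axis((0, L), name='k')
    C = tvm.compute((N, M),
                    lambda i, j: tvm.sum(A[i, k] * B[k, j], axis=k),
                    name='C')
    s = tvm.create_schedule(C.op)

    # Tuning knobs: split the spatial loops, then vectorise the innermost axis.
    cfg = autotvm.get_config()
    y, x = s[C].op.axis
    cfg.define_split("tile_y", y, num_outputs=2)
    cfg.define_split("tile_x", x, num_outputs=2)
    yo, yi = cfg["tile_y"].apply(s, C, y)
    xo, xi = cfg["tile_x"].apply(s, C, x)
    s[C].reorder(yo, xo, yi, xi)
    s[C].vectorize(xi)
    return s, [A, B, C]
```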

I have successfully integrated it into TOPI to run without autotuning.  Using 
the decorators, it picks up my compute definition and schedule.  My benchmark 
takes `X ms`.  If I manually disable vectorisation, it takes `4X ms`.  The 
default version in `v0.6` takes `Y ms`.

However, I am confused by what happens when I use autotvm.  If I run my 
benchmark through TVM, during autotuning I can confirm that my compute 
definition and schedule are called (using the old reliable `printf` debugging).

However, at the end of autotuning it seems to fall back to the default 
implementation, with the time being around `Y ms` rather than `X ms`.  To test 
my suspicion that the default is used, I disabled SIMD in my schedule, and this 
had no effect on the final time.
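One thing I've been checking is whether the tuned log is actually picked up at compile time. A sketch of what I'm doing (`conv2d_tuned.log` is a placeholder for my log file, and `mod`/`params` are assumed to be the Relay module and parameters):

```python
from tvm import autotvm, relay

# Compile inside apply_history_best so tuned configs override the defaults
# (v0.6 API). If the workload key in the log doesn't match the task that
# relay.build creates, autotvm silently uses a fallback configuration
# instead, which would explain seeing the default `Y ms` time.
with autotvm.apply_history_best("conv2d_tuned.log"):
    with relay.build_config(opt_level=3):
        graph, lib, params = relay.build(mod,
                                         target="llvm -device=arm_cpu",
                                         params=params)
```

As far as I understand, a workload mismatch normally prints a "Cannot find config for target=..., workload=..." warning, so I've been watching for that too.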

To debug, I feel I need to better understand the flow of execution when 
integrated in TOPI.  Taking ARM CPU conv2d as an example, this is where my 
traces and reasoning have led me.

I would appreciate it if anyone could point out any holes in this, or resources 
I could look at to improve my understanding.

I'm targeting an ARM CPU, so a lot of the things I'm using are in the 
`topi/python/topi/arm_cpu/conv2d.py` file.

When we autotune, here's what I think happens:

### Define the computation
1. legalise conv2d coming from Relay
2. call the conv2d compute definition registered with autotvm
3. build the compute definition for the chosen layout, and define the tuning knobs
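My reading of steps 2 and 3 in code form, abridged from what I can see in `topi/python/topi/arm_cpu/conv2d.py` in v0.6 (the body is simplified; `_decl_spatial_pack` is the helper in that file):

```python
from tvm import autotvm
from topi.nn import conv2d

# Register a conv2d compute definition for arm_cpu (v0.6 style). The cfg
# object carries the tuning knobs for this workload; knobs are defined
# while the compute/schedule template is constructed.
@autotvm.register_topi_compute(conv2d, 'arm_cpu', ['direct'])
def conv2d_arm_cpu(cfg, data, kernel, strides, padding, dilation,
                   layout, out_dtype):
    # e.g. inside the helper: cfg.define_split("tile_co", co, num_outputs=2)
    return _decl_spatial_pack(cfg, data, kernel, strides, padding,
                              dilation, layout, out_dtype, num_tile=2)
```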

### Call the schedule

1. call `schedule_conv2d_nchw`, which is registered as an autotvm schedule; it 
dispatches to the appropriate schedule function for the compute used (e.g. 
Winograd, spatial pack, etc.)
2. our normal schedule is applied, e.g. loop unrolling, vectorisation, etc.
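The schedule side, again abridged from the v0.6 `arm_cpu/conv2d.py` as I read it (the `...` arguments elide the tensors that the real code extracts from the op; `traverse_inline` comes from `topi.util`):

```python
import tvm
from tvm import autotvm
from topi.util import traverse_inline

# Register the schedule for conv2d_nchw on arm_cpu (v0.6 style).
@autotvm.register_topi_schedule(schedule_conv2d_nchw, 'arm_cpu',
                                ['direct', 'winograd'])
def schedule_conv2d_nchw_arm_cpu(cfg, outs):
    s = tvm.create_schedule([x.op for x in outs])

    def _callback(op):
        # Dispatch on the compute tag to pick the matching schedule body,
        # which applies the unrolling/vectorisation using the knobs in cfg.
        if 'spatial_conv2d_output' in op.tag:
            _schedule_spatial_pack(cfg, s, ...)
        if 'winograd_conv2d_output' in op.tag:
            _schedule_winograd(cfg, s, ...)

    traverse_inline(s, outs[0].op, _callback)
    return s
```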

### Autotuning occurs

I'm unsure of all the steps that happen here, though I am familiar with what 
happens in the context of a standalone version.
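For the standalone version, the flow I'm familiar with looks like this (v0.6 API; `matmul` stands in for my template, and the trial count and file names are placeholders):

```python
from tvm import autotvm

# Create a task from the decorated template, then let a tuner measure
# candidate configs and log the best ones for later use.
task = autotvm.task.create(matmul, args=(512, 512, 512, 'float32'),
                           target='llvm')

measure_option = autotvm.measure_option(
    builder=autotvm.LocalBuilder(),
    runner=autotvm.LocalRunner(number=5))

tuner = autotvm.tuner.XGBTuner(task)
tuner.tune(n_trial=20,
           measure_option=measure_option,
           callbacks=[autotvm.callback.log_to_file('tune.log')])
```

What I don't yet understand is how the equivalent of this loop is driven for a TOPI-registered task, and how the winning config gets wired back in when the model is compiled.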





---
[Visit Topic](https://discuss.tvm.ai/t/topi-autotuning-integration/6079/1) to 
respond.
