Thanks for the feedback :) Tiling the output computation and using `compute_at` 
is exactly what I've been doing to prototype this, and you're right that for a 
sufficiently large tile the recompute cost is small. I don't think rolling 
buffers are essential right away, but they would be a very beneficial future 
optimization.
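
As a rough illustration of why a larger tile amortises the recompute, here is a minimal sketch in plain Python (not TVM code). It assumes a cascade of stride-1, undilated 2D convolutions with square tiles and equal per-element cost for every op; the helper names `halo_tile` and `recompute_factor` are made up for this example.

```python
def halo_tile(out_tile, kernels):
    """Tile edge length of every tensor in the cascade, from input to output.

    Assumes stride-1, undilated convolutions, so each op with kernel size k
    needs (k - 1) extra rows/columns of its input.
    """
    sizes = [out_tile]
    for k in reversed(kernels):
        sizes.append(sizes[-1] + (k - 1))
    return list(reversed(sizes))


def recompute_factor(out_tile, kernels):
    """Elements computed per tile divided by the elements each tile 'owns'.

    A value of 1.0 means no recompute; the excess comes from the overlapping
    halo regions of the intermediate tensors (image borders are ignored).
    """
    sizes = halo_tile(out_tile, kernels)
    # sizes[0] is the input tile (read, not computed); the rest are op outputs.
    computed = sum(s * s for s in sizes[1:])
    ideal = len(kernels) * out_tile * out_tile
    return computed / ideal


if __name__ == "__main__":
    cascade = [3, 3, 3]  # three 3x3 convolutions, purely illustrative
    for tile in (8, 16, 64):
        print(tile, halo_tile(tile, cascade), recompute_factor(tile, cascade))
```

Under these assumptions the factor shrinks towards 1 as the tile grows, which is the effect described above; real cascades with strides, varying channel counts or non-square tiles would need a more careful cost model.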

In our testing and prototyping we have found profitable cascades of 5+ ops, 
particularly in mobilenet-type architectures and super-resolution networks. 
Deciding whether extending a cascade is still profitable would be one of the 
jobs of the cascading algorithm.

My major concern with integrating this is that convolution-type operations 
always end up on their own in primitive functions. For my experiments I'm 
currently lowering the whole graph to a single TE, but this will not work 
alongside the current TOPI integration, which expects 'master ops' to determine 
the schedule. In essence I would like to do hierarchical scheduling: first of 
the cascades, and then of the ops within them.





---
[Visit Topic](https://discuss.tvm.apache.org/t/rfc-cascade-scheduling/8119/3) 
to respond.
