> Is there a model being developed to estimate latency of the chosen 
> partitioning? 

No. As far as Collage is concerned, it just calls the abstract
CostEstimator::Estimate interface for each candidate partition and can remain
ignorant as to where those costs come from. In the prototype it is hard-coded
to tune, build and run locally, just to get us going. Here at OctoML we'll
need to instantiate the interface to connect to our production
tuning/running/caching systems. In principle the interface could also be
instantiated with an analytical model, a la the cascading schedule, though
we're not planning to build one.
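
To make that extension point concrete, here's a rough sketch of the shape I
have in mind. The type names and exact signature below are illustrative, not
the prototype's; the point is just that Collage only ever sees the abstract
Estimate call:

    // Sketch only: placeholder types standing in for the real TVM objects.
    struct IRModule {};   // a candidate partition, extracted as its own module
    struct Target {};     // the target/toolchain the candidate would run on
    using Cost = double;  // estimated latency, say in seconds

    // The only thing Collage depends on.
    class CostEstimator {
     public:
      virtual ~CostEstimator() = default;
      virtual Cost Estimate(const IRModule& candidate_mod,
                            const Target& target) = 0;
    };

    // What the prototype hard-codes: tune, build and run on the local machine.
    class LocalCostEstimator : public CostEstimator {
     public:
      Cost Estimate(const IRModule& candidate_mod,
                    const Target& target) override {
        // ... tune, build and run locally and time the result ...
        return 0.0;
      }
    };

    // What we'd plug in at OctoML: forward to the production
    // tuning/running/caching service. Collage can't tell the difference.
    class RemoteCostEstimator : public CostEstimator {
     public:
      Cost Estimate(const IRModule& candidate_mod,
                    const Target& target) override {
        // ... RPC out to the measurement service ...
        return 0.0;
      }
    };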

I'm thinking the CostEstimator object can just be passed into the 
CollagePartitioner to leave this extension point open.
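
Continuing the sketch above, the wiring would be something like the following
(the CollagePartitioner signature here is hypothetical, since that's exactly
the part left open):

    #include <memory>

    struct Pass {};  // stand-in for tvm::transform::Pass

    // Hypothetical constructor: the pass would capture whichever estimator
    // the caller supplies and invoke estimator->Estimate(...) for every
    // candidate partition during the search (body elided in this sketch).
    Pass CollagePartitioner(std::shared_ptr<CostEstimator> estimator) {
      return Pass{};
    }

    // Callers then choose the instantiation, e.g.:
    //   auto pass = CollagePartitioner(std::make_shared<RemoteCostEstimator>());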

> How can a user export the search done by collage ?

In an early draft I had support for that. Basically it just needs some
materialization of the 'optimal' partitioning as an Array<CandidatePartition>.
The method CandidatePartition::ParallelRewrite can take that array and rewrite
the whole expression, exactly as the Collage pass would have done. So splitting
search from rewrite is pretty easy.
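
Concretely, the split could look something like this. Array<CandidatePartition>
and CandidatePartition::ParallelRewrite are from the prototype; everything else
(the standalone search entry point, any save/load step) is hypothetical:

    #include <vector>

    struct Expr {};                // stand-in for a Relay expression
    struct CandidatePartition {};  // stand-in for the prototype's class

    // Phase 1: search only. Materializes the 'optimal' partitioning without
    // rewriting anything. This result is what could be serialized/exported.
    std::vector<CandidatePartition> CollageSearch(const Expr& body) {
      // ... the usual Collage search, calling CostEstimator::Estimate for
      //     each candidate partition ...
      return {};
    }

    // Phase 2: rewrite only. Given the materialized candidates (freshly
    // searched or reloaded from an export), apply them to the expression.
    // In the prototype this is CandidatePartition::ParallelRewrite.
    Expr ApplySavedPartitioning(const Expr& body,
                                const std::vector<CandidatePartition>& best) {
      return body;
    }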

But I ended up dropping it all in favor of just relying on the CostEstimator
to cache all measurements (which it needs to do anyway given all the sharing
opportunities). First, it's not yet clear whether there's any significant
compile-time advantage to bypassing the Collage search when every candidate
partition to estimate results in a cache hit. I figured I'd at least measure
that before adding a fix. Second, if someone (the service, the user) is going
to go to the trouble of caching the optimal partitioning for a particular
(model, targets) pair, why not just cache the built artifact directly and skip
all the bother?
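
For completeness, the caching I'm relying on is just memoization inside (or
around) the CostEstimator. Continuing the sketch from above; the cache key
here is deliberately hand-waved, and in practice it would be a structural hash
of the candidate module plus a canonical target string:

    #include <cstdint>
    #include <string>
    #include <unordered_map>

    // A caching wrapper usable with any underlying CostEstimator.
    class CachingCostEstimator : public CostEstimator {
     public:
      explicit CachingCostEstimator(CostEstimator* inner) : inner_(inner) {}

      Cost Estimate(const IRModule& candidate_mod,
                    const Target& target) override {
        std::string key = CacheKey(candidate_mod, target);
        auto it = cache_.find(key);
        if (it != cache_.end()) {
          // Candidates shared between partitionings (or between runs) are free.
          return it->second;
        }
        Cost cost = inner_->Estimate(candidate_mod, target);
        cache_.emplace(key, cost);
        return cost;
      }

     private:
      static std::string CacheKey(const IRModule& candidate_mod,
                                  const Target& target) {
        // Placeholder key: the real one should be a structural hash of the
        // module plus a canonical target string, so structurally equal
        // candidates hit the same entry.
        return std::to_string(reinterpret_cast<std::uintptr_t>(&candidate_mod)) +
               "/" + std::to_string(reinterpret_cast<std::uintptr_t>(&target));
      }

      CostEstimator* inner_;
      std::unordered_map<std::string, Cost> cache_;
    };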

However, let me know if I've oversimplified and I can add that part back.

> think we would need to follow the template

Ok.
