Hi @r.stahl,

Thanks for posting your implementation!
> First results are shown below. This is evaluated on RISC-V and using the 
> [code generator](https://github.com/tum-ei-eda/utvm_staticrt_codegen/). 
> There still might be some issue, because I would not expect the RAM usage to 
> rise in any example. But looks promising so far with 10% reduction for a 
> cifar10 model and 18% for resnet!

These are great results!

It would be awesome to find a way to contribute this to TVM. Here are some 
thoughts along those lines...

- We've now landed initial work towards the AOT executor, and there's been some 
parallel work to do memory planning with the AOT executor.
- The AOT planner currently uses GraphPlanMemory, but there is a [similar 
proposal](https://github.com/apache/tvm/pull/8096/files) to replace 
GraphPlanMemory with a new planner.
- It would be great if we could come to a single memory planning interface and 
use that for both Graph and AOT memory planning.
- With AOT, memory planning happens at the TIR level, which I think is slightly 
better as it allows for planning scratchpad/workspace memory alongside 
intermediate/output tensor memory. However, I think the fundamental inputs to 
any memory planning algorithms are similar.
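To make that last point concrete, the fundamental inputs any static memory planner needs are just a size and a liveness interval per buffer, regardless of whether they come from the graph level or the TIR level. Below is a toy greedy first-fit sketch of such a planner; all names are invented for illustration and none of this is TVM's actual GraphPlanMemory or the proposed interface:

```python
# Toy sketch of a static memory planner. The inputs mirror what any
# planner fundamentally needs: a size and a liveness interval per
# buffer. Invented for illustration; not a TVM API.

def plan_offsets(buffers):
    """buffers: list of (name, size, first_use, last_use).
    Returns (offsets, total_size) from greedy first-fit placement,
    visiting larger buffers first (a common heuristic)."""
    placed = []   # (offset, size, first_use, last_use)
    offsets = {}
    for name, size, start, end in sorted(buffers, key=lambda b: -b[1]):
        # Buffers whose lifetimes overlap this one cannot share space.
        conflicts = sorted(
            (off, sz) for off, sz, s, e in placed
            if not (end < s or e < start)
        )
        # First-fit: lowest offset that avoids all conflicting regions.
        offset = 0
        for off, sz in conflicts:
            if offset + size <= off:
                break
            offset = max(offset, off + sz)
        placed.append((offset, size, start, end))
        offsets[name] = offset
    total = max((off + sz for off, sz, _, _ in placed), default=0)
    return offsets, total


# Buffers "a" and "b" have disjoint lifetimes, so they can share the
# same offset; "c" overlaps both and must be placed after them.
bufs = [("a", 1024, 0, 2), ("b", 1024, 3, 5), ("c", 512, 1, 4)]
offsets, total = plan_offsets(bufs)  # total is 1536, not the naive 2560
```

The point is that whichever planning algorithm wins, it seems workable to feed it from either the Graph or the AOT/TIR path, since both can produce this (size, liveness) form.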

In [[RFC] Unified Static Memory 
Planning](https://discuss.tvm.apache.org/t/rfc-unified-static-memory-planning/10099),
 there is a proposal for a memory pool-based planner interface. Could you 
provide some input on the interface planned there, and see if it's compatible 
with your work here? It seems like we could move forward by merging that 
interface and replacing GraphPlanMemory with something that uses the new 
interface.

Previously you'd critiqued memory pools, so I also wanted to follow up:

> What we are used to from C compilers is a prioritization for performance or 
> memory (-O3/-Os). In the non-prioritized category it still tries to do the 
> best possible job while avoiding unreasonable trade-offs. In the micro-world 
> I guess a higher amount of control for this trade-off could be desired, but 
> not really for all TVM users, right?

My thought is that memory pools still work as an abstraction here: additional 
parameters can be provided to any memory planning algorithm to enable 
trade-offs such as these. If the user wants the highest performance, they can 
offer the planner as much memory as possible and see whether that improves 
things.
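As a sketch of what I mean, here is one hypothetical way a pool-based planner could expose an -O3/-Os-style knob as a parameter rather than a fixed policy. All of these names (`MemoryPool`, `pick_pool`, `optimize_for`) are invented for this example and are not real TVM APIs:

```python
# Hypothetical sketch: pool descriptors plus a planner parameter that
# expresses the speed-vs-size trade-off. Not a real TVM interface.
from dataclasses import dataclass

@dataclass
class MemoryPool:
    name: str
    size_bytes: int
    read_bandwidth: int  # relative figure of merit, higher is faster

def pick_pool(pools, request_bytes, optimize_for="speed"):
    """Choose a pool for a single allocation request.

    optimize_for="speed": prefer the fastest pool that fits.
    optimize_for="size":  prefer the largest pool that fits, keeping
    scarce fast memory free for buffers that really need it.

    (A real planner would also track remaining pool capacity; this
    sketch considers one request in isolation.)"""
    fits = [p for p in pools if p.size_bytes >= request_bytes]
    if not fits:
        return None
    if optimize_for == "speed":
        return max(fits, key=lambda p: p.read_bandwidth)
    return max(fits, key=lambda p: p.size_bytes)


pools = [
    MemoryPool("sram", 64 * 1024, read_bandwidth=10),
    MemoryPool("dram", 8 * 1024 * 1024, read_bandwidth=1),
]
pick_pool(pools, 4096, "speed")  # picks the fast sram pool
pick_pool(pools, 4096, "size")   # picks dram, conserving sram
```

A user who doesn't care about the trade-off never touches the parameter and gets the default, which is roughly the "sane defaults" situation I'd hope for.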

I think in general it's good to retain flexibility in TVM, as use cases tend to 
be quite varied. We may need to ensure there are sane defaults, though. Is 
there a use case you had in mind where the additional control is a drawback?





---
[Visit Topic](https://discuss.tvm.apache.org/t/discussion-alignment-memory-planning/9730/9) to respond.
