In the long tradition of competitive engineering, let's develop both VMs concurrently. That way we will generate actual engineering metrics on the difference between a stack-based and a register-based VM. The engineering dependencies are too complex to discuss productively without actually going through the implementation. @jroesch argued that the stack VM simplifies instruction management, and the fact that you would need to unroll to discover how operators interact was considered acceptable, given that most concurrency is expected to be in the operators themselves.
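To make the trade-off concrete, here is a minimal sketch of the same expression, `(a + b) * c`, run on a toy stack VM and a toy register VM. The opcodes, function names, and program encodings are all hypothetical and chosen for illustration; none of this is TVM's actual instruction set. It also assumes that "unroll" means symbolically replaying the stack to recover which values feed which operator.

```python
# Toy comparison of a stack VM and a register VM for (a + b) * c.
# All opcodes and names below are hypothetical, for illustration only.

STACK_PROGRAM = [("push", "a"), ("push", "b"), ("add",), ("push", "c"), ("mul",)]
REGISTER_PROGRAM = [("add", "t0", "a", "b"), ("mul", "t1", "t0", "c")]

OPS = {"add": lambda x, y: x + y, "mul": lambda x, y: x * y}


def run_stack(program, env):
    """Interpret a stack program; operands are implicit (top of stack)."""
    stack = []
    for op, *args in program:
        if op == "push":
            stack.append(env[args[0]])
        else:
            rhs, lhs = stack.pop(), stack.pop()
            stack.append(OPS[op](lhs, rhs))
    return stack.pop()


def run_register(program, env, result):
    """Interpret a register program; every operand is named explicitly."""
    regs = dict(env)
    for op, dst, lhs, rhs in program:
        regs[dst] = OPS[op](regs[lhs], regs[rhs])
    return regs[result]


def register_deps(program):
    """Dependencies can be read directly off each instruction."""
    return {dst: (lhs, rhs) for _, dst, lhs, rhs in program}


def stack_deps(program):
    """Dependencies must be recovered by symbolically replaying the stack."""
    stack, deps = [], {}
    for op, *args in program:
        if op == "push":
            stack.append(args[0])
        else:
            rhs, lhs = stack.pop(), stack.pop()
            name = f"t{len(deps)}"  # fresh name for the intermediate value
            deps[name] = (lhs, rhs)
            stack.append(name)
    return deps


env = {"a": 2, "b": 3, "c": 4}
print(run_stack(STACK_PROGRAM, env))             # 20
print(run_register(REGISTER_PROGRAM, env, "t1"))  # 20
print(stack_deps(STACK_PROGRAM) == register_deps(REGISTER_PROGRAM))  # True
```

In this sketch the stack instructions are simpler to emit, since they carry no operand names, while the register form exposes operator dependencies without the replay step. Whether that difference matters for real tensor programs is exactly what building both would measure.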
What the stack versus register organization does to the complexity of the resulting compiler analysis code, in the context of tensor compositions, is I think an open question that can only be answered by doing both implementations. The beauty of that approach is that it will generate a level set of these questions, and it will offer many opportunities to publish the results and go on to lucrative commercial engagements as one of the few people in the world who actually have the engineering know-how about the technical details.
