Thanks for raising this. If TVM is used as a library in a bigger service, it is sometimes difficult to bump up the stack limit, since individual threads can have their own stack size set programmatically when they are created. In that case, a simulated stack on the heap might actually help. I understand that recursion is easier to follow.
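
Below is a minimal sketch of the idea, replacing recursion with an explicit heap-allocated stack; the `Node` class and traversal are illustrative only, not TVM's actual `ExprVisitor` API:

```
# Post-order traversal with an explicit heap stack: depth is bounded by
# heap size rather than by the calling thread's stack size.  Node is a
# stand-in for whatever deeply nested structure overflows the thread stack.
class Node:
    def __init__(self, children=None):
        self.children = children or []

def visit_iterative(root, on_visit):
    stack = [(root, False)]                # (node, children_already_pushed)
    while stack:
        node, expanded = stack.pop()
        if expanded:
            on_visit(node)                 # all children done; do the work
        else:
            stack.append((node, True))     # revisit after the children
            for child in reversed(node.children):
                stack.append((child, False))

# A chain far deeper than a typical recursion limit is handled fine:
deep = Node()
for _ in range(100000):
    deep = Node([deep])
visit_iterative(deep, lambda n: None)
```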
I noticed that Relay tensor shape dims are converted to `int32_t` after shape inference; however, if IndexExpr arithmetic is involved, the original shape dtype is preserved, which breaks the following example:
```
fn (%X: Tensor[(int64(7), int64(16)), float32]) {
  // hypothetical newshape to complete the truncated example
  %0 = reshape(%X, newshape=[-1, 16]);
  %0
}
```
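
For what it's worth, here is a hedged Python sketch of how one might reproduce this; the int64 dimensions and `newshape=[-1, 16]` are my assumptions, chosen so that reshape's type relation has to compute the missing dimension with IndexExpr arithmetic:

```
import tvm
from tvm import relay

# Dimensions built explicitly as int64, mirroring
# Tensor[(int64(7), int64(16)), float32] in the printed IR above.
shape = [tvm.tir.IntImm("int64", 7), tvm.tir.IntImm("int64", 16)]
x = relay.var("X", relay.TensorType(shape, "float32"))

# newshape with -1 makes the reshape type relation derive the missing
# dimension arithmetically, which is where the int64 dtype survives
# instead of being converted to int32 like plain inferred shapes.
y = relay.reshape(x, newshape=[-1, 16])

mod = tvm.IRModule.from_expr(relay.Function([x], y))
mod = relay.transform.InferType()(mod)
print(mod)  # inspect the dtypes of the inferred output shape
```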