I have an implementation in my fork that introduces syntactic sugar at the parser/printer level. In the IR, local variables remain buffers.
```py
A = T.alloc_cell("int32")
A = A + 1
T.cp_async(A.buffer.data, ...)
```

The parser creates a buffer with dtype `int32` and shape `[1]`, but in the parser's value table `A` is recorded as `A[0]` (a BufferLoad). Anywhere `A` is used as a PrimExpr then works naturally. BufferStore is the one place that needs additional handling: the parser accepts an assign (or augassign) whose lhs is a BufferLoad of a shape-`[1]` buffer. If the user wants to access Buffer attributes, or hits any other case where the buffer behind `A` is needed, `A.buffer` can be used.

The primary motivations for this solution are:

- I'd rather avoid heavy solutions like introducing more nodes into the IR.
- For backend code (like CUDA), keeping local variables as arrays doesn't seem to matter. Or you can modify the code generator. It doesn't affect other parts of the system anyway.
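
To make the value-table idea concrete, here is a toy model in plain Python. It is only a sketch of the mechanism and not the code in my fork: `Buffer`, `BufferLoad`, `BufferStore`, and `ToyParser` are hypothetical stand-ins rather than TVM classes, and the tuple `("add", a, 1)` stands in for an Add expression.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Buffer:
    """Toy stand-in for a TIR buffer (hypothetical, not TVM's class)."""
    name: str
    dtype: str
    shape: tuple


@dataclass(frozen=True)
class BufferLoad:
    """Toy read of buffer[indices]; `.buffer` mirrors the `A.buffer` escape hatch."""
    buffer: Buffer
    indices: tuple


@dataclass(frozen=True)
class BufferStore:
    """Toy write of `value` into buffer[indices]."""
    buffer: Buffer
    value: object
    indices: tuple


@dataclass
class ToyParser:
    """Keeps a value table (name -> expression) and the statements emitted so far."""
    var_table: dict = field(default_factory=dict)
    stmts: list = field(default_factory=list)

    def alloc_cell(self, name, dtype):
        # Allocate a shape-[1] buffer, but bind the name to its element 0.
        buf = Buffer(name, dtype, (1,))
        self.var_table[name] = BufferLoad(buf, (0,))
        return self.var_table[name]

    def assign(self, name, value):
        # If the target is a cell, lower the assignment to a BufferStore.
        target = self.var_table.get(name)
        if isinstance(target, BufferLoad) and target.buffer.shape == (1,):
            self.stmts.append(BufferStore(target.buffer, value, target.indices))
        else:
            # Ordinary variable: just rebind the name in the value table.
            self.var_table[name] = value


parser = ToyParser()
a = parser.alloc_cell("A", "int32")
# `A = A + 1`: the right-hand side reads A[0]; the tuple stands in for an Add node.
parser.assign("A", ("add", a, 1))
store = parser.stmts[0]
```

After the assignment, `A` is still bound to the same BufferLoad in the value table, so later reads keep working, and the only emitted statement is a store into element 0 of the underlying buffer.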