TVM deals with these in the Relay IR directly. For example, the IR with NCHW16c and NCHW4c may look like:
```
%1 = nn.conv2d(...)                  // output layout: NCHW16c
%2 = layout_transform(%1, "NCHW4c")  // output layout: NCHW4c
...
```

When compiling the above IR, `layout_transform` is just an operator like `conv2d`, so `%1` and `%2` are individual tensors. As a result, the runtime only needs to execute the compiled graph/bytecode and doesn't have to worry about layout transforms. Weights can be handled the same way, but for model inference, where the weights are already constants, we usually simplify/fold the layout transform:

```
def @main(%data) {
  %1 = layout_transform(%const[0], "target_layout"); // %const[0] is the weights
  %2 = nn.conv2d(%data, %1);
  ...
}
```

becomes:

```
def @main(%data) {
  %1 = nn.conv2d(%data, %const[0]); // %const[0] is now the weights in target_layout
  ...
}
```

---

[Visit Topic](https://discuss.tvm.apache.org/t/where-does-layout-transform-data-copy-move-happen/11523/2) to respond.
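To make the data movement concrete, here is a minimal NumPy sketch of the NCHW → NCHW4c transform discussed above (the function name is mine for illustration, not a TVM API). It shows that the transform is an actual reshuffle of memory, and why folding it into constant weights at compile time means the cost is paid only once rather than on every inference:

```python
import numpy as np

def layout_transform_nchw_to_nchw4c(x):
    """Illustrative NCHW -> NCHW4c transform (not a TVM API).

    Splits the channel axis C into (C//4, 4) blocks and moves the
    inner block of 4 channels to the innermost (last) axis.
    """
    n, c, h, w = x.shape
    assert c % 4 == 0, "channel count must be divisible by the block size 4"
    # (N, C, H, W) -> (N, C//4, 4, H, W) -> (N, C//4, H, W, 4)
    return x.reshape(n, c // 4, 4, h, w).transpose(0, 1, 3, 4, 2)

# The weights are constants, so this reshuffle can be done once at
# compile time (constant folding) instead of at every inference:
weights_nchw = np.arange(2 * 8 * 3 * 3, dtype="float32").reshape(2, 8, 3, 3)
weights_nchw4c = layout_transform_nchw_to_nchw4c(weights_nchw)
print(weights_nchw4c.shape)  # (2, 2, 3, 3, 4)
```

After folding, the runtime simply loads `weights_nchw4c` as the constant and feeds it straight to the blocked `conv2d`, exactly as in the second IR snippet above.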