TVM deals with these in the Relay IR directly. For example, the IR with NCHW16c 
and NCHW4c may look like:

```
%1 = nn.conv2d(...) // output layout: NCHW16c
%2 = layout_transform(%1, "NCHW4c") // output layout: NCHW4c
...
```

When compiling the above IR, `layout_transform` is just an operator like 
`conv2d`, so `%1` and `%2` are individual tensors. As a result, the runtime only 
needs to execute the compiled graph/bytecode and doesn't have to worry about 
layout transforms.

The same applies to weights, but for model inference, where the weights are 
already constants, we usually simplify/fold the layout transform away:

```
def @main(%data) {
  %1 = layout_transform(%const[0], "target_layout"); // %const[0] is the weights
  %2 = nn.conv2d(%data, %1);
  ...
}
```

becomes:

```
def @main(%data) {
  %1 = nn.conv2d(%data, %const[0]); // %const[0] is the weights in target_layout
  ...
}
```





---
[Visit Topic](https://discuss.tvm.apache.org/t/where-does-layout-transform-data-copy-move-happen/11523/2) to respond.