Thanks for the reply, Kevin! Those two layout transforms make sense, but for the filter parameters: they're loaded from the .pth in OIHW by default (relay/frontend/pytorch.py), and I set the desired layout to HWIO. Will these filter parameters be transformed in advance, or by a CUDA kernel on each inference?
I guess they should be converted only once, since these parameters are effectively constant with respect to inference. Could someone give me a hint as to which parts of the code are responsible for this? In my model I observed the same number of layout_transform calls as conv calls per run, so something seems wrong. In comparison, the GPU trace of the TVM ResNet sample shows only two layout transforms, which is what I'd expect. I'm a complete beginner with the TVM code base; where should I start? Thanks a lot.
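For reference, here is roughly what I'm running, plus how I count the remaining layout_transform calls in the IR. This is a minimal sketch: I'm assuming `mod` and `params` come from `relay.frontend.from_pytorch`, and my guess is that `FoldConstant` is the pass that should pre-compute the weight transforms at compile time.

```python
import tvm
from tvm import relay

# mod, params come from the PyTorch frontend (weights arrive in OIHW):
# mod, params = relay.frontend.from_pytorch(scripted_model, shape_list)

# Bind the weight params into the function as constants so that
# FoldConstant can see and evaluate them.
mod["main"] = relay.build_module.bind_params_by_name(mod["main"], params)

seq = tvm.transform.Sequential([
    relay.transform.ConvertLayout({"nn.conv2d": ["NHWC", "HWIO"]}),
    # My assumption: FoldConstant evaluates the layout_transform ops on
    # the (now constant) weights at compile time, so no transform kernel
    # should run per inference.
    relay.transform.FoldConstant(),
])
with tvm.transform.PassContext(opt_level=3):
    mod = seq(mod)

# Count the layout_transform calls left in the optimized IR; I'd expect
# only the two at the data input/output boundaries to remain.
count = [0]
def fvisit(node):
    if (isinstance(node, relay.Call)
            and isinstance(node.op, tvm.ir.Op)
            and node.op.name == "layout_transform"):
        count[0] += 1
relay.analysis.post_order_visit(mod["main"], fvisit)
print("layout_transform calls:", count[0])
```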