Hi xiaocenxiaocen,
Thanks. I will follow up on this paper.
Best wishes,
Shawn Wu
---
Hi @Novice,
Yes, I agree that TVM on Tensor Core GPUs does have a lot of room for optimization.
Currently we are optimizing the data path between global memory and registers,
which we believe is a major bottleneck. We are experimenting with different
layouts for both feature maps and weights; a sketch of the idea is shown below.
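As a rough illustration of what staging the global-memory-to-register data path means, here is a minimal TVM `te` sketch that moves a tile from global memory through shared memory into registers. The shapes, the identity-copy compute, and the plain `local` scope are placeholders of ours, not code from the PR (the real Tensor Core schedule stages `wmma` fragments and uses cooperative loads):

```python
import tvm
from tvm import te

# Illustrative NHWC shape; not taken from the PR.
N, H, W, C = 16, 14, 14, 64
data = te.placeholder((N, H, W, C), dtype="float16", name="data")
# Identity copy as a stand-in for the real transform compute.
out = te.compute((N, H, W, C), lambda n, h, w, c: data[n, h, w, c], name="out")

s = te.create_schedule(out.op)
data_shared = s.cache_read(data, "shared", [out])       # global -> shared
data_local = s.cache_read(data_shared, "local", [out])  # shared -> registers

fused = s[out].fuse(*s[out].op.axis)
bx, tx = s[out].split(fused, factor=128)
s[out].bind(bx, te.thread_axis("blockIdx.x"))
s[out].bind(tx, te.thread_axis("threadIdx.x"))
# Cooperative loading is left unoptimized in this sketch; a real schedule
# would split the shared-memory copy across the threads of the block.
s[data_shared].compute_at(s[out], bx)
s[data_local].compute_at(s[out], tx)

mod = tvm.build(s, [data, out], target="cuda")
print(mod.imported_modules[0].get_source())  # inspect the generated CUDA
```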
---
**Introduction**
We optimized the Winograd conv2d algorithm for Tensor Core with the NHWC layout.
The Winograd algorithm consists of four modules: feature-map transform, kernel
transform, inverse transform, and batched GEMM (bgemm); a single-tile sketch follows.
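To make the four modules concrete, here is a minimal single-tile NumPy sketch of F(2x2, 3x3) using the standard Winograd transform matrices. This is our own illustration of the algorithm, not code from the implementation; in the real kernel the elementwise product becomes a bgemm over tiles and channels:

```python
import numpy as np

# Standard F(2x2, 3x3) transform matrices (Lavin & Gray).
B_T = np.array([[1, 0, -1, 0],
                [0, 1, 1, 0],
                [0, -1, 1, 0],
                [0, 1, 0, -1]], dtype=np.float32)
G = np.array([[1, 0, 0],
              [0.5, 0.5, 0.5],
              [0.5, -0.5, 0.5],
              [0, 0, 1]], dtype=np.float32)
A_T = np.array([[1, 1, 1, 0],
                [0, 1, -1, -1]], dtype=np.float32)

d = np.random.rand(4, 4).astype(np.float32)  # input tile
g = np.random.rand(3, 3).astype(np.float32)  # kernel

U = G @ g @ G.T      # kernel transform
V = B_T @ d @ B_T.T  # feature-map transform
M = U * V            # elementwise product (a bgemm in the batched case)
Y = A_T @ M @ A_T.T  # inverse transform -> 2x2 output tile

# Reference: direct 3x3 correlation over the 4x4 tile.
ref = np.zeros((2, 2), dtype=np.float32)
for i in range(2):
    for j in range(2):
        ref[i, j] = np.sum(d[i:i + 3, j:j + 3] * g)
np.testing.assert_allclose(Y, ref, rtol=1e-5)
```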
The following major functions were added:
1. Conv2d_nhwc_winograd_t
---
We are pleased to share the code. Please check the PR: [TOPI][Tensor Core]
Conv2d and Dense ops support on Tensor Core #5099. You can find the code in
topi/python/topi/cuda/conv2d_nhwc.py, which uses the same layout as the Tensor
Core conv2d; a usage sketch is shown below.
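For reference, a minimal sketch of how one might exercise an NHWC fp16 conv2d through Relay on a CUDA target. The shapes here are our own illustrative choices; whether the Tensor Core schedule from the PR is actually selected depends on the shape and dtype constraints it checks:

```python
import tvm
from tvm import relay

# Illustrative NHWC fp16 workload; shapes chosen by us, not from the PR.
data = relay.var("data", shape=(16, 14, 14, 64), dtype="float16")
weight = relay.var("weight", shape=(3, 3, 64, 64), dtype="float16")  # HWIO
conv = relay.nn.conv2d(
    data, weight,
    padding=(1, 1),
    data_layout="NHWC",
    kernel_layout="HWIO",
    out_dtype="float16",
)
mod = tvm.IRModule.from_expr(relay.Function([data, weight], conv))

# Compile for CUDA; on a Tensor Core GPU the NHWC schedule can be picked
# when the workload satisfies the constraints checked in the PR.
with tvm.transform.PassContext(opt_level=3):
    lib = relay.build(mod, target="cuda")
```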
For any questions, please feel free to ask.