Hi xiaocenxiaocen,
Thanks. I will follow up on this paper.
Best wishes,
Shawn Wu
---
Hi @Novice,
Yes, I agree that TVM on Tensor Core GPUs still has a lot of room for optimization. Currently we are optimizing the data path between global memory and registers, which we think is a major bottleneck, and we are experimenting with different layouts for both the feature maps and the weights (see the sketch below).
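A minimal sketch of what staging that data path looks like, using the TVM `te` schedule API; the matmul stand-in, the shapes, and the stage names are illustrative assumptions, not the schedule from this work:

```python
import tvm
from tvm import te

# Toy matmul standing in for the conv2d compute (illustrative shapes).
n = 1024
A = te.placeholder((n, n), name="A")
B = te.placeholder((n, n), name="B")
k = te.reduce_axis((0, n), name="k")
C = te.compute((n, n), lambda i, j: te.sum(A[i, k] * B[k, j], axis=k), name="C")

s = te.create_schedule(C.op)
# Stage the operands in shared memory so global loads can be coalesced...
AS = s.cache_read(A, "shared", [C])
BS = s.cache_read(B, "shared", [C])
# ...then in registers ("local") close to the compute.
AL = s.cache_read(AS, "local", [C])
BL = s.cache_read(BS, "local", [C])
```

Which of these loads end up contiguous depends on the operand layout, which is why the feature-map and weight layouts are worth experimenting with.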
---
Updated design details
# Details on legalization
Since most hardware has no native support for bf16 arithmetic, we added a pass `BF16Legalization` that uses fp32 to compute on bf16 data. It has 3 sub-passes: `Promotion`, `Elimination`, and `Lowering`.
## BF16Promotion
It adds `cast_to_fp32()` before the inputs of each op that touches bf16 data, so the computation itself runs in fp32, and casts the result back to bf16 afterwards.
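As a minimal numpy sketch of the numerics this implies, simulating bf16 by truncating fp32 to its top 16 bits (the real sub-pass rewrites the IR; `cast_to_bf16`, `cast_to_fp32`, and `promoted_add` here are illustrative helpers, not TVM APIs):

```python
import numpy as np

def cast_to_bf16(x):
    # Simulate a bf16 value by keeping only the top 16 bits of fp32.
    bits = np.asarray(x, dtype=np.float32).view(np.uint32)
    return (bits & np.uint32(0xFFFF0000)).view(np.float32)

def cast_to_fp32(x):
    # bf16 stored as truncated fp32 is already a valid fp32 value.
    return np.asarray(x, dtype=np.float32)

def promoted_add(a, b):
    # What Promotion does conceptually: cast the bf16 inputs up to
    # fp32, run the op in fp32, then cast the result back to bf16.
    return cast_to_bf16(cast_to_fp32(a) + cast_to_fp32(b))
```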
---
For a Winograd implementation with large batch sizes, you may want to refer to this paper: https://dl.acm.org/doi/pdf/10.1145/3332466.3374520. They implement an assembler for the Volta/Turing architectures and use the CHWN layout for their large-batch Winograd algorithm; a small illustration of that layout follows.
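Here is a small numpy sketch of the NCHW-to-CHWN axis order (the shapes are made up for illustration):

```python
import numpy as np

# NCHW: batch-major, as commonly produced by frameworks.
x_nchw = np.random.rand(32, 64, 56, 56).astype(np.float32)  # N, C, H, W

# CHWN puts the batch dimension innermost (fastest-varying), so with a
# large batch, adjacent threads reading adjacent batch elements get
# contiguous, coalesced loads.
x_chwn = x_nchw.transpose(1, 2, 3, 0)  # C, H, W, N
assert x_chwn.shape == (64, 56, 56, 32)
```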
---
Hi, @Hzfengsy @Shawn_Inspur :slightly_smiling_face:
Thanks for your efforts on supporting Tensor Cores in TVM.
I have tuned Tensor Core schedules on classical networks such as resnet50 and vgg16 (batch size 32), and the tensor_precision_fu_utilization metric reported by nvprof shows that I got a Mid/Low utilization of the Tensor Cores.
[quote="hcho3, post:11, topic:4341"]
Your post saved lots of time for m
[/quote]
@hcho3
I compiled LLVM from GitHub and lld-link.exe was produced, but I still get this error when running the code:
```
RuntimeError: Can not find cl.exe,please run this in Vistual Studio Command Prompt
```
I also us