Hi There,
The first conv layer in VTA runs on the CPU and is not offloaded to the FPGA.
In most cases this is a performance bottleneck and needs optimization. Below
are some ideas for optimizing it; comments are welcome.
Regards
Hua
1. Train the network so that the first conv layer supports int8 input and
weights, and add a feature to VTA that uses the 16*16 MAC array to compute the
3-input-channel convolution.
2. When running on the ARM CPU, it seems that only one CPU core is used for the
first conv computation. We could parallelize the first conv across multiple
cores to speed it up.
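
To make the two ideas concrete, here is a rough NumPy sketch, not VTA or TVM code. For idea 1 it assumes that "16*16 MAC" means a block of 16 input channels by 16 output channels, and that the 3-channel input would be zero-padded up to 16 channels so it fits that block; only 3 of the 16 input lanes would then carry real data. For idea 2 it only shows the partitioning (splitting output channels across worker threads), not how the ARM schedule would actually do it. All names here (`BLOCK_IN`, `pad_input_channels`, `conv2d_parallel`) are hypothetical.

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

BLOCK_IN = 16  # assumption: input-channel block size of the 16*16 MAC array


def conv2d_nchw(data, weight):
    """Naive stride-1, no-padding int8 convolution with int32 accumulation."""
    n, c, h, w = data.shape
    o, _, kh, kw = weight.shape
    out = np.zeros((n, o, h - kh + 1, w - kw + 1), dtype=np.int32)
    d32 = data.astype(np.int32)
    w32 = weight.astype(np.int32)
    for y in range(kh):
        for x in range(kw):
            # Input window that lines up with kernel tap (y, x).
            patch = d32[:, :, y:y + out.shape[2], x:x + out.shape[3]]
            out += np.einsum("nchw,oc->nohw", patch, w32[:, :, y, x])
    return out


def pad_input_channels(arr, block=BLOCK_IN):
    """Zero-pad axis 1 (input channels) up to the next multiple of `block`."""
    pad = (-arr.shape[1]) % block
    pad_width = [(0, 0)] * arr.ndim
    pad_width[1] = (0, pad)
    return np.pad(arr, pad_width)


def conv2d_parallel(data, weight, workers=4):
    """Split output channels into chunks and compute each chunk in a thread."""
    chunks = np.array_split(np.arange(weight.shape[0]), workers)
    chunks = [idx for idx in chunks if idx.size > 0]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = pool.map(lambda idx: conv2d_nchw(data, weight[idx]), chunks)
        return np.concatenate(list(parts), axis=1)


rng = np.random.default_rng(0)
data = rng.integers(-128, 128, size=(1, 3, 8, 8), dtype=np.int8)
weight = rng.integers(-128, 128, size=(16, 3, 3, 3), dtype=np.int8)

reference = conv2d_nchw(data, weight)

# Idea 1: pad 3 input channels to 16; the extra zero channels add nothing.
padded = conv2d_nchw(pad_input_channels(data), pad_input_channels(weight))

# Idea 2: same computation, partitioned over output channels.
parallel = conv2d_parallel(data, weight)

print(np.array_equal(reference, padded), np.array_equal(reference, parallel))
```

Both variants should match the plain convolution exactly, since the padded channels are all zero and the output channels are independent of each other. Whether either one actually pays off on the hardware would need measuring.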
---
[Visit Topic](https://discuss.tvm.ai/t/vta-first-conv-layer-optimize/6766/1) to
respond.