Thanks a lot. I've been playing around with this on a BERT model, but I'm hitting some issues when calling `relay.build` with opt level 3. The target is `cuda`. The error message looks like this:
```
unresolved intrinsic sqrt with return type float16x4
```

It comes from `codegen_c.cc`. Does this mean that `sqrt` isn't supported with float16?