jchlanda added a comment.

In D118977#3297465 <https://reviews.llvm.org/D118977#3297465>, @tra wrote:

>> They all require PTX 7.0, SM_80.
>
> According to https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#half-precision-floating-point-instructions-fma
> only `fma.relu` and `bf16*` variants require ptx70/sm80:
>
>   PTX ISA Notes
>   Introduced in PTX ISA version 4.2.
>   
>   fma.relu.{f16, f16x2} and fma{.relu}.{bf16, bf16x2} introduced in PTX ISA version 7.0.
>   
>   Target ISA Notes
>   Requires sm_53 or higher.
>   
>   fma.relu.{f16, f16x2} and fma{.relu}.{bf16, bf16x2} require sm_80 or higher.

My bad, sorry. Fixed now.
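
For reference, the requirements from the quoted PTX ISA notes can be sketched as a small lookup. This is only an illustration: `FmaRequirement` and `fmaRequirement` are hypothetical names, not code from this patch or from LLVM, and the version encoding (42 for PTX 4.2, 53 for sm_53) is an assumption made for the sketch.

```cpp
// Hypothetical helper, not part of LLVM or of this patch: returns the minimum
// PTX ISA version and SM level for a half-precision fma variant, following the
// PTX ISA notes quoted above.
struct FmaRequirement {
  unsigned PtxVersion; // assumed encoding: 42 means PTX 4.2, 70 means PTX 7.0
  unsigned SmVersion;  // assumed encoding: 53 means sm_53, 80 means sm_80
};

inline FmaRequirement fmaRequirement(bool IsBF16, bool HasRelu) {
  // fma.relu.{f16, f16x2} and fma{.relu}.{bf16, bf16x2} need PTX 7.0 / sm_80.
  if (IsBF16 || HasRelu)
    return {70, 80};
  // Plain fma.{f16, f16x2} only needs PTX 4.2 / sm_53.
  return {42, 53};
}
```

With this sketch, only `fmaRequirement(false, false)` yields the older PTX 4.2 / sm_53 pair; every `bf16` or `.relu` variant maps to PTX 7.0 / sm_80.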


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D118977/new/

https://reviews.llvm.org/D118977

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
