pengfei added a comment.

In D120395#3343799 <https://reviews.llvm.org/D120395#3343799>, @andrew.w.kaylor wrote:

> In D120395#3340953 <https://reviews.llvm.org/D120395#3340953>, @craig.topper wrote:
>
>> These intrinsics pre-date the existence of the bfloat type in LLVM. To use
>> bfloat we have to make __bf16 a legal type in C. This means we need to
>> support loads, stores, and arguments of that type. I think that would create
>> a bunch of backend complexity because we don't have good 16-bit load/store
>> support to XMM registers. I think we only have load that inserts into a
>> specific element. It's doable, but I'm not sure what we gain from it.
>
> My motivation for wanting to use 'bfloat' started with wanting to use '__bf16'
> as the front end type. It just doesn't make sense to me to define a new type
> when we have an existing built-in type that has the same semantics and binary
> representation. The argument for introducing a new IR type was made here:
> https://reviews.llvm.org/D76077 It doesn't seem like a particularly strong
> argument, but it's what was decided then. Using bfloat rather than i16 in the
> IR has the benefit that it expresses what the type actually is instead of
> just using something that has the same size. Using i16, the semantics of the
> type are known only to the front end and we have to rely on what the front
> end did for enforcement of the semantics. That's generally going to be OK,
> but it seems to me like it works for the wrong reason. That is, i16 is not a
> storage-only type and the only reason we don't notice is that the front end
> doesn't generate IR that violates the implicit semantics of the type.
>
> I think there's a minor argument to be made concerning TBAA (short and
> __bfloat16 look like compatible types).
> Perhaps a more significant argument
> is that using the __bf16 built-in type would allow us to define a type like
> __m256bh like this:
>
>   typedef __bf16 __m256bh __attribute__((__vector_size__(32),
>   __aligned__(32)));
>
> So my question would be, how much work are we talking about to make this work
> with the x86 backend?

I don't see much value in supporting `__bf16` in the front end for X86. I guess you may want something like `__fp16`. But the design of `__fp16` doesn't look great to me. GCC doesn't support `__fp16` for X86. And the existing implementation of `__fp16` has become an obstacle to supporting `_Float16`, especially when we want to support targets without `avx512fp16`. Not to mention that the functionality of `__bf16` isn't as complete as that of `__fp16`: https://godbolt.org/z/WzKPrYTYP

I think it's far too early to evaluate the backend work. I believe the right approach is to define the ABI for the type first, as was done for `_Float16`; then we can do something in the backend to support it.

Anyway, whether or not we want to support `__bf16` doesn't matter to the intrinsics we are supporting here. We are free to define and use a target-specific type for target intrinsics. Since these are mature intrinsics, our focus is on backward compatibility and cross-compiler compatibility. Both stop us from defining them with `__bf16`.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D120395/new/

https://reviews.llvm.org/D120395

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits