pengfei added a comment.

In D120395#3343799 <https://reviews.llvm.org/D120395#3343799>, @andrew.w.kaylor 
wrote:

> In D120395#3340953 <https://reviews.llvm.org/D120395#3340953>, @craig.topper 
> wrote:
>
>> These intrinsics pre-date the existence of the bfloat type in LLVM. To use 
>> bfloat we have to make __bf16 a legal type in C. This means we need to 
>> support loads, stores, and arguments of that type. I think that would create 
>> a bunch of backend complexity because we don't have good 16-bit load/store 
>> support to XMM registers. I think we only have a load that inserts into a 
>> specific element. It's doable, but I'm not sure what we gain from it.
>
> My motivation for wanting to use 'bfloat' started with wanting to use '__bf16' 
> as the front end type. It just doesn't make sense to me to define a new type 
> when we have an existing built-in type that has the same semantics and binary 
> representation. The argument for introducing a new IR type was made here: 
> https://reviews.llvm.org/D76077. It doesn't seem like a particularly strong 
> argument, but it's what was decided then. Using bfloat rather than i16 in the 
> IR has the benefit that it expresses what the type actually is instead of 
> just using something that has the same size. Using i16, the semantics of the 
> type are known only to the front end and we have to rely on what the front 
> end did for enforcement of the semantics. That's generally going to be OK, 
> but it seems to me like it works for the wrong reason. That is, i16 is not a 
> storage-only type and the only reason we don't notice is that the front end 
> doesn't generate IR that violates the implicit semantics of the type.
>
> I think there's a minor argument to be made concerning TBAA (short and 
> __bfloat16 look like compatible types). Perhaps a more significant argument 
> is that using the __bf16 built-in type would allow us to define a type like 
> __m256bh like this:
>
>   typedef __bf16 __m256bh __attribute__((__vector_size__(32), __aligned__(32)));
>
> So my question would be, how much work are we talking about to make this work 
> with the x86 backend?

I don't see much value in supporting `__bf16` in the front end for X86. I guess 
you may want something like `__fp16`, but the design of `__fp16` doesn't look 
great to me. GCC doesn't support `__fp16` for X86, and the existing 
implementation of `__fp16` has somehow become an obstacle to supporting 
`_Float16`, especially when we want to support targets without `avx512fp16`. 
Not to mention that the functionality of `__bf16` isn't even as complete as 
`__fp16`'s: https://godbolt.org/z/WzKPrYTYP (see the rough sketch below). I 
think we are far from being able to evaluate the backend work.
I believe the right approach is to first define the ABI type, as was done for 
`_Float16`; then we can do the backend work to support it.
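
As a rough sketch of the kind of gap I mean (this is only illustrative, not the 
exact contents of the godbolt link): `__fp16` operands promote to `float` in 
arithmetic, while `__bf16` is a storage-only type today, so a function like the 
second one below is rejected even on targets that accept `__bf16` at all:

  __fp16 add_fp16(__fp16 a, __fp16 b) { return a + b; } // OK: operands promote to float
  __bf16 add_bf16(__bf16 a, __bf16 b) { return a + b; } // error: arithmetic on storage-only __bf16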

Anyway, whether we want to support `__bf16` or not doesn't matter for the 
intrinsics we are supporting here. We are free to define and use a 
target-specific type for target intrinsics (see the sketch below). Since these 
are mature intrinsics, our focus is on backward compatibility and 
cross-compiler compatibility, and both stop us from defining them with `__bf16`.
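
For context, the current headers spell these types with plain integer types, 
roughly like the following (paraphrased from memory of `avx512bf16intrin.h`, so 
treat the exact spellings as approximate):

  typedef unsigned short __bfloat16;  // scalar storage type used by the BF16 intrinsics
  typedef short __m128bh __attribute__((__vector_size__(16), __aligned__(16)));
  typedef short __m256bh __attribute__((__vector_size__(32), __aligned__(32)));
  typedef short __m512bh __attribute__((__vector_size__(64), __aligned__(64)));

Changing these to `__bf16` would change the type users observe (and, I believe, 
the C++ mangling of anything taking these types), which is exactly the backward 
and cross-compiler compatibility concern above.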


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D120395/new/

https://reviews.llvm.org/D120395
