ahatanak added a comment.

In rG825235c140e7747f686bd7902cb0f9af77590841#909794 
<https://reviews.llvm.org/rG825235c140e7747f686bd7902cb0f9af77590841#909794>, 
@SjoerdMeijer wrote:

> > I'm still not sure why __fp16, which is a storage-only type, is used for 
> > the element type of float16x4_t if we want to avoid promotion to a float 
> > vector type.
>
> About the types: `__fp16` this is the older, storage-only type. So, just 
> historically, I think this element type was used to describe the neon 
> vectors, which translates to vectors with elements of i16 or f16, which all 
> seems fine.
>
> > Was there a discussion on llvm-dev or phabricator?
>
> I don't think this is necessary, because this is how it's been from let's say 
> the beginning.
>
> > Can we change the element type of the vector to _Float16?
>
> I don't see a benefit of doing this, and making this perhaps intrusive change.
>
> > It seems that the original plan discussed on llvm-dev 
> > (http://lists.llvm.org/pipermail/cfe-dev/2017-May/053768.html) was to 
> > introduce _Float16 and use it instead of __fp16 for ARMv8.2-A, but the 
> > subsequent patches that were committed are making NeonEmitter.cpp emit 
> > typedefs of vectors that have __fp16 as the element type.
>
> Then `_Float16` came along, which is a proper half type (not a storage only 
> type), is defined in a C11 extension, and should be used for the ARMv8.2-A 
> native half instructions; the semantics of `__fp16` and `_Float16` are 
> different, and as a result using `__fp16` for the v8.2 half FP instructions 
> is not an option, so it is not a matter of using `_Float16` instead of 
> `__fp16`.


OK, I see. Looking at the history, the half precision vector types defined in 
arm_neon.h (e.g., `float16x4_t`) were introduced long before `_Float16` or 
ARMv8.2-A.

> As I said, your problem wasn't clear to me, so let's look at your example:
> 
>> For example, in vneg_f16, __p0 is promoted to a vector of float before it's 
>> negated.
>> 
>> __ai float16x4_t vneg_f16(float16x4_t __p0) {
>> 
>>   float16x4_t __ret;
>>    __ret = -__p0;
>>    return __ret;
>> 
>> }
> 
> when I compile this and emit llvm ir I get this:
> 
>   define dso_local <2 x i32> @vneg_f16(<2 x i32> %__p0.coerce) 
> local_unnamed_addr #0 {
>   entry:
>     %0 = bitcast <2 x i32> %__p0.coerce to <4 x half>
>     %fneg = fneg fast <4 x half> %0
>     %1 = bitcast <4 x half> %fneg to <2 x i32>
>     ret <2 x i32> %1
>   }
>    
> 
> so far so good I think: this shows a fneg on vector of halfs.

Yes, I think this is what we want to see. We don't want to treat `__p0` as a 
strorage-only type and promote it to a vector of floats before negating it.

> Where and when do you see problems? Can you describe this, and also provide 
> your compile commands?

The problem is that, after I committed a patch (which was reverted later, see 
https://reviews.llvm.org/rGa6150b48cea00ab31e9335cc73770327acc4cb3a) to fix a 
bug in Sema, clang started to promote  `__p0` to a vector of floats because the 
element type of `float16x4_t` is `__fp16`.

  typedef __fp16 float16_t;
  typedef __attribute__((neon_vector_type(4))) float16_t float16x4_t;

The bug I'm trying to fix causes clang to generate different IR for `test0` and 
`test1` in the code below even though the only difference is that the element 
type of `half_vector1` is a typedef of `__fp16`. In `test1`, the two vectors 
are added without being promoted to vectors of floats first.

  typedef __fp16 FP16;
  typedef __fp16 half_vector0 __attribute__ ((vector_size (8)));
  typedef FP16 half_vector1 __attribute__ ((vector_size (8)));
  
  // 'a' and 'b' are promoted to vectors of floats before the addition.
  half_vector0 test0(half_vector0 a, half_vector0 b) {
    return a + b;
  }
  
  // Currently 'a' and 'b' aren't promoted to vectors of floats before the 
addition.
  half_vector1 test1(half_vector1 a, half_vector1 b) {
    return a + b;p
  }

After I fixed the bug, clang started to generate the same IR for `test0` and 
`test1`, but this had the unwanted side-effect of promoting `__p0` in 
`vneg_f16` to a vector of float as I explained before.

Just to be sure I'm understanding the semantics of the types defined in 
arm_neon.h, we don't want to treat `float16x4_t` as a storage-only type like 
`half_vector0` in the code above and promote it to vectors of floats in the 
front-end, right? If that is the case, it seems to me that `__fp16` isn't the 
right type as the element type, since that implies the vector of halves has to 
be promoted to vector of floats. But I also realize that changing the typedef 
of `float16x4_t` and using `_Float16` as the element type long after the 
typedef was introduced can break code that is currently working.

I think I'll leave the typedefs unchanged and just avoid promoting vectors of 
halves in Sema if they are neon vectors.


BRANCHES
  master, release/10.x

Users:
  ahatanak (Author)

https://reviews.llvm.org/rG825235c140e7



_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to