[PATCH] D109658: [X86][FP16] Change the order of the operands in complex FMA intrinsics to allow swap between the mul operands.

Pengfei Wang via Phabricator via cfe-commits Mon, 13 Sep 2021 07:11:17 -0700

pengfei added a comment.

In D109658#2996767 <https://reviews.llvm.org/D109658#2996767>, @craig.topper 
wrote:

> In D109658#2996714 <https://reviews.llvm.org/D109658#2996714>, @pengfei wrote:
>
>> In D109658#2996412 <https://reviews.llvm.org/D109658#2996412>, @craig.topper 
>> wrote:
>>
>>> Does gcc use the same builtin name? Our general policy is to have the same
>>> interface as gcc if we have a builtin. So if gcc has these builtins they
>>> should work the same way.
>>
>> No. We didn't sync with GCC on the builtin names during development. We
>> had a discussion and decided not to keep them aligned because 1) target
>> specific builtins are compiler-private names that need not be kept
>> compatible with other compilers; and 2) we diverged from GCC on target
>> builtin naming, masking, etc. long ago. Currently, regardless of the
>> name, GCC uses the same C, A, B order as our existing implementation.
>> https://gitlab.com/x86-gcc/gcc/-/blob/users/intel/liuhongt/independentfp16_wip/gcc/config/i386/avx512fp16intrin.h#L6672
>
> I thought we were pretty consistent on names with gcc for most of sse and avx
> and most of avx512. The names aren't completely private; occasionally users
> do try to use them. If we happen to have the same name we should have the
> same behavior to avoid confusion.

I'm not so optimistic. I gathered some coarse-grained statistics on the use of
x86 builtins in Clang and GCC. They show that Clang defines only about 2/3 as
many builtins as GCC, and that about 1/4 of Clang's builtins have names that
differ from GCC's. The commands are below:

  cat gcc/config/i386/*.h | grep -o "\b__builtin_ia32_\w\+" | sort | uniq | tee gcc.txt | wc -l
  2788

  ls clang/lib/Headers/*.h | grep -v fp16 | xargs cat | grep -o "\b__builtin_ia32_\w\+" | sort | uniq | tee clang.txt | wc -l
  1808

  comm -12 gcc.txt clang.txt |wc -l
  1347
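
The fractions quoted above can be checked against these three counts. The
following is only a minimal Python sketch of that arithmetic; the variable
names are illustrative and not part of either compiler.

```python
# Counts taken from the command output above.
gcc_total = 2788    # distinct __builtin_ia32_* names in GCC's i386 headers
clang_total = 1808  # distinct names in Clang's headers (fp16 headers excluded)
common = 1347       # names present in both lists (comm -12)

# "About 2/3": Clang's builtin count relative to GCC's.
clang_to_gcc_ratio = clang_total / gcc_total

# "About 1/4": Clang builtins whose names do not appear in GCC's list.
clang_only = clang_total - common
clang_only_share = clang_only / clang_total

print(f"Clang/GCC count ratio: {clang_to_gcc_ratio:.2f}")
print(f"Clang-only names: {clang_only} ({clang_only_share:.2f} of Clang's)")
```

Note that the first ratio compares the sizes of the two lists; it does not say
that two thirds of GCC's names exist in Clang.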

In this case we already use a different name from GCC, so I think it is
worthwhile to use a different operand order to enable the swapping
optimization. After a bit of research on AVX512IFMA, I found:

1. GCC's use of the C, A, B order is not consistent with its own AVX512IFMA
builtins. For consistency, GCC would need to change to the A, B, C order;
2. We aren't consistent with GCC on the AVX512IFMA builtins either, because we
use select.

By the way, GCC folks told me that GCC can mark arbitrary operands as
commutative. But I found that both SDNode and MI can only commute the first 2
operands, which is insufficient for instructions like CFMA. Do you know if we
have another mechanism for commutable operands?

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D109658/new/

https://reviews.llvm.org/D109658

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
