[PATCH] D105263: [X86] AVX512FP16 instructions enabling 1/6

Pengfei Wang via Phabricator via cfe-commits Fri, 09 Jul 2021 22:54:18 -0700

pengfei added inline comments.


================
Comment at: clang/lib/Headers/avx512vlfp16intrin.h:74
+static __inline__ __m256h __DEFAULT_FN_ATTRS256 _mm256_abs_ph(__m256h __A) {
+  return (__m256h)_mm256_and_epi32(_mm256_set1_epi32(0x7FFF7FFF), 
(__m256i)__A);
+}
----------------
craig.topper wrote:
> Why do we use _mm256_set1_epi32 instead of _mm256_set1_epi16?
There's no difference in assembly for immediate value. 
https://godbolt.org/z/sMbrM611d. But the latency of vpbroadcastd is better than 
vpbroadcastw in Skylake according to intrinsic guide. Here the only effect is 
consist with _mm256_and_epi32. Do you think it's better to use 
_mm256_set1_epi16?


================
Comment at: llvm/include/llvm/IR/RuntimeLibcalls.def:290
 HANDLE_LIBCALL(FPEXT_F16_F128, "__extendhftf2")
+HANDLE_LIBCALL(FPEXT_F16_F80, "__extendhfxf2")
 HANDLE_LIBCALL(FPEXT_F32_F64, "__extendsfdf2")
----------------
craig.topper wrote:
> Is this tested in this patch?
No. I'll move it to the 3rd patch and test it there.


================
Comment at: llvm/lib/Target/X86/X86FastISel.cpp:58
   bool X86ScalarSSEf32;
+  bool X86ScalarAVXf16;
 
----------------
craig.topper wrote:
> AVX here should maybe be AVX512, but maybe this is pointing out that this 
> name is bad. Would X86ScalarXMMf* be better?
Maybe we can use X86ScalarSSEf16, here SSE means SSE registers? Especially GCC 
community proposing to support FP16 since SSE2.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D105263/new/

https://reviews.llvm.org/D105263

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D105263: [X86] AVX512FP16 instructions enabling 1/6

Reply via email to