[PATCH] D132329: [X86][RFC] Using `__bf16` for AVX512_BF16 intrinsics

Phoebe Wang via Phabricator via cfe-commits Wed, 21 Sep 2022 07:33:51 -0700

pengfei added inline comments.


================
Comment at: llvm/test/CodeGen/X86/avx512bf16-intrinsics-upgrade.ll:30
 ; X64-NEXT:    kmovd %edi, %k1 # encoding: [0xc5,0xfb,0x92,0xcf]
-; X64-NEXT:    vcvtne2ps2bf16 %zmm1, %zmm0, %zmm0 {%k1} {z} # encoding: [0x62,0xf2,0x7f,0xc9,0x72,0xc1]
+; X64-NEXT:    vmovdqu16 %zmm0, %zmm0 {%k1} {z} # encoding: [0x62,0xf1,0xff,0xc9,0x6f,0xc0]
 ; X64-NEXT:    retq # encoding: [0xc3]
----------------
RKSimon wrote:
> pengfei wrote:
> > RKSimon wrote:
> > > any chance we can recover the predicated instruction?
> > It's possible, e.g., iterate over all users of the intrinsic and bitcast all
> > the select operands as well; or add patterns for i16; or make vselect peek
> > through bitcasts, etc.
> > But I think avoiding the small performance regression is less critical than
> > keeping backward compatibility for the old intrinsics. It may not be worth
> > the code complexity.
> OK - how come the mask_move_lowering_f16_bf16 refactoring in 
> X86InstrAVX512.td didn't fix this?
The `mask_move_lowering_f16_bf16` refactoring should have no effect on this. I
think the problem is that after AutoUpgrade the IR becomes:
```
  %0 = tail call <32 x bfloat> @llvm.x86.avx512bf16.cvtne2ps2bf16.512(<16 x float> %A, <16 x float> %B)
  %1 = bitcast i32 %U to <32 x i1>
  %2 = bitcast <32 x bfloat> %0 to <32 x i16>
  %3 = select <32 x i1> %1, <32 x i16> %2, <32 x i16> zeroinitializer
  %4 = bitcast <32 x i16> %3 to <8 x i64>
  ret <8 x i64> %4
```
After the refactoring of X86InstrAVX512.td, we are able to match:
```
  %0 = tail call <32 x bfloat> @llvm.x86.avx512bf16.cvtne2ps2bf16.512(<16 x float> %A, <16 x float> %B)
  ... ...
  %2 = select <32 x i1> %1, <32 x bfloat> %0, <32 x bfloat> zeroinitializer
```
So the upgraded IR, which still selects on `<32 x i16>`, fails to match the
predicated instruction.
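
To illustrate why the two forms are interchangeable at the bit level (and so why
peeking through the bitcast would be a legal way to recover the predicated
instruction), here is a small Python model. It is only a sketch: the helper
names are hypothetical, the lane values are stand-ins rather than real
`vcvtne2ps2bf16` results, and NaN lanes are not considered.

```python
import struct

LANES = 32  # <32 x bfloat> / <32 x i16>

def bf16_from_bits(bits):
    # Decode a 16-bit bf16 pattern: bf16 is the upper half of an IEEE float32.
    return struct.unpack("<f", struct.pack("<I", (bits & 0xFFFF) << 16))[0]

def bf16_to_bits(value):
    # Model of `bitcast bfloat to i16` for values that are exactly bf16.
    return struct.unpack("<I", struct.pack("<f", value))[0] >> 16

def mask_to_lanes(mask):
    # Model of `bitcast i32 %U to <32 x i1>`: bit i controls lane i.
    return [bool((mask >> i) & 1) for i in range(LANES)]

def select(cond, a, b):
    # Lane-wise select, as in the IR `select <32 x i1> ...`.
    return [x if c else y for c, x, y in zip(cond, a, b)]

def upgraded_form(mask, result):
    # AutoUpgrade output: bitcast to i16 first, then select against zero.
    as_i16 = [bf16_to_bits(v) for v in result]
    return select(mask_to_lanes(mask), as_i16, [0] * LANES)

def typed_form(mask, result):
    # Form the new patterns match: select on bfloat, then bitcast.
    selected = select(mask_to_lanes(mask), result, [0.0] * LANES)
    return [bf16_to_bits(v) for v in selected]

# Stand-in lane values (finite, non-NaN bf16 patterns).
result = [bf16_from_bits(0x3F80 + i) for i in range(LANES)]
for mask in (0x0, 0x1, 0xA5A5A5A5, 0xFFFFFFFF):
    assert upgraded_form(mask, result) == typed_form(mask, result)
```

Both orderings produce the same 16-bit lanes, so the difference is purely in
which types the select is expressed on, not in the result.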


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D132329/new/

https://reviews.llvm.org/D132329
