LuoYuanke added inline comments.
================ Comment at: llvm/lib/Target/X86/X86ISelLowering.cpp:616 + setOperationAction(ISD::FROUNDEVEN, MVT::f16, Promote); + setOperationAction(ISD::FP_ROUND, MVT::f16, Expand); + setOperationAction(ISD::FP_EXTEND, MVT::f32, Expand); ---------------- Just confused how to expand it. Will the expand fail and finally turns to libcall? ================ Comment at: llvm/lib/Target/X86/X86ISelLowering.cpp:763 + if (isTypeLegal(MVT::f16)) { + setOperationAction(ISD::FP_EXTEND, MVT::f80, Custom); + setOperationAction(ISD::STRICT_FP_EXTEND, MVT::f80, Custom); ---------------- Why f16 emulation affect f80 type? Are we checking isTypeLegal(MVT::f80)? ================ Comment at: llvm/lib/Target/X86/X86ISelLowering.cpp:22100 SDValue Res; + if (SrcVT == MVT::f16 && !Subtarget.hasFP16()) { + if (IsStrict) ---------------- Not sure if it is better to wrapper it into a readable function (e.g., isSoftF16). ================ Comment at: llvm/lib/Target/X86/X86ISelLowering.cpp:22448 + if (SrcVT == MVT::f16) + return SDValue(); + ---------------- Why we don't extent to f32 here? ================ Comment at: llvm/lib/Target/X86/X86ISelLowering.cpp:22522 + if (!isScalarFPTypeInSSEReg(SrcVT) || + (SrcVT == MVT::f16 && !Subtarget.hasFP16())) return SDValue(); ---------------- Why we don't extent to f32 here? Will it be promoted finally? ================ Comment at: llvm/lib/Target/X86/X86ISelLowering.cpp:22765 + DAG.getIntPtrConstant(0, DL)); + Res = DAG.getNode(X86ISD::STRICT_CVTPS2PH, DL, {MVT::v8i16, MVT::Other}, + {Chain, Res, DAG.getTargetConstant(4, DL, MVT::i32)}); ---------------- Should MVT::v8i16 be MVT::v8f16? ================ Comment at: llvm/lib/Target/X86/X86ISelLowering.cpp:22766 + Res = DAG.getNode(X86ISD::STRICT_CVTPS2PH, DL, {MVT::v8i16, MVT::Other}, + {Chain, Res, DAG.getTargetConstant(4, DL, MVT::i32)}); + Chain = Res.getValue(1); ---------------- Is it rounding control? Can we use a macro or add comments for what is the rounding control? ================ Comment at: llvm/lib/Target/X86/X86ISelLowering.cpp:22775 + + Res = DAG.getNode(ISD::EXTRACT_VECTOR_ELT, DL, MVT::i16, Res, + DAG.getIntPtrConstant(0, DL)); ---------------- MVT::f16 and delete the bitcast? ================ Comment at: llvm/lib/Target/X86/X86ISelLowering.cpp:44211 VT != MVT::f80 && VT != MVT::f128 && + !(VT.getScalarType() == MVT::f16 && !Subtarget.hasFP16()) && (TLI.isTypeLegal(VT) || VT == MVT::v2f32) && ---------------- Not sure if it is better to wrapper it into a readable function (e.g., isSoftF16). ================ Comment at: llvm/lib/Target/X86/X86InstrAVX512.td:1476 } -let Predicates = [HasFP16] in { +let Predicates = [HasBWI] in { def : Pat<(v32f16 (X86VBroadcastld16 addr:$src)), ---------------- If target don't have avx512bw feature. There is some other pattern to lower the node or fp16 broadcast node is invalid? ================ Comment at: llvm/lib/Target/X86/X86InstrAVX512.td:4107 _.ExeDomain>, EVEX_4V, Sched<[SchedWriteFShuffle.XMM]>; + let Predicates = [prd] in { def rrkz : AVX512PI<0x10, MRMSrcReg, (outs _.RC:$dst), ---------------- Previous prd only apply to "def rr"? Is it a bug for previous code? ================ Comment at: llvm/lib/Target/X86/X86InstrAVX512.td:4352 +defm : avx512_move_scalar_lowering<"VMOVSHZ", X86Movsh, fp16imm0, v8f16x_info>; +defm : avx512_store_scalar_lowering<"VMOVSHZ", avx512vl_f16_info, + (v32i1 (bitconvert (and GR32:$mask, (i32 1)))), GR32>; ---------------- Why previous code don't have predicates? ================ Comment at: llvm/lib/Target/X86/X86InstrAVX512.td:11657 +let Predicates = [HasBWI], AddedComplexity = -10 in { + def : Pat<(f16 (load addr:$src)), (COPY_TO_REGCLASS (VPINSRWZrm (v8i16 (IMPLICIT_DEF)), addr:$src, 0), FR16X)>; ---------------- Why set AddedComplexity to -10? There no such addtional complexity in previous code. Add comments for it? ================ Comment at: llvm/lib/Target/X86/X86InstrSSE.td:3970 +let Predicates = [UseSSE2], AddedComplexity = -10 in { + def : Pat<(f16 (load addr:$src)), (COPY_TO_REGCLASS (PINSRWrm (v8i16 (IMPLICIT_DEF)), addr:$src, 0), FR16)>; ---------------- Why AddedComplexity = -10? Add comments for it? ================ Comment at: llvm/lib/Target/X86/X86InstrSSE.td:3978 +let Predicates = [HasAVX, NoBWI], AddedComplexity = -10 in { + def : Pat<(f16 (load addr:$src)), (COPY_TO_REGCLASS (VPINSRWrm (v8i16 (IMPLICIT_DEF)), addr:$src, 0), FR16)>; + def : Pat<(i16 (bitconvert f16:$src)), (EXTRACT_SUBREG (VPEXTRWrr (v8i16 (COPY_TO_REGCLASS FR16:$src, VR128)), 0), sub_16bit)>; ---------------- Miss pattern for store? ================ Comment at: llvm/lib/Target/X86/X86InstrSSE.td:5214 + +let Predicates = [HasAVX, NoBWI] in + def : Pat<(store f16:$src, addr:$dst), (VPEXTRWmr addr:$dst, (v8i16 (COPY_TO_REGCLASS FR16:$src, VR128)), 0)>; ---------------- Why no AddedComplexity for it? ================ Comment at: llvm/lib/Target/X86/X86RegisterInfo.td:540 +def FR16 : RegisterClass<"X86", [f16], 16, (add FR32)> {let Size = 32;} + ---------------- pengfei wrote: > skan wrote: > > The alignment is not same as the size? > No. This is spill size instead of alignment. When there is avx512fp16 feature, is the spill size still 32? Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D107082/new/ https://reviews.llvm.org/D107082 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits