LuoYuanke added inline comments.
================ Comment at: llvm/test/CodeGen/X86/fpclamptosat.ll:569 ; CHECK-NEXT: cvttss2si %xmm0, %rax ; CHECK-NEXT: ucomiss {{\.?LCPI[0-9]+_[0-9]+}}(%rip), %xmm0 +; CHECK-NEXT: movabsq $-9223372036854775808, %rcx # imm = 0x8000000000000000 ---------------- I'm curious why there is 1 more compare in this patch. ================ Comment at: llvm/test/CodeGen/X86/fpclamptosat.ll:776 +; CHECK-NEXT: cmovael %eax, %ecx +; CHECK-NEXT: ucomiss {{\.?LCPI[0-9]+_[0-9]+}}(%rip), %xmm0 +; CHECK-NEXT: movl $2147483647, %edx # imm = 0x7FFFFFFF ---------------- Ditto. ================ Comment at: llvm/test/CodeGen/X86/fpclamptosat_vec.ll:605 +; CHECK-NEXT: .cfi_def_cfa_offset 80 +; CHECK-NEXT: movss %xmm2, {{[-0-9]+}}(%r{{[sb]}}p) # 4-byte Spill +; CHECK-NEXT: movss %xmm1, {{[-0-9]+}}(%r{{[sb]}}p) # 4-byte Spill ---------------- Is the vector <4 x half> split to 4 scalar and pass by xmm? What's the ABI for vector half? Is there any case that test the scenario that run out of register and pass parameter through stack? ================ Comment at: llvm/test/CodeGen/X86/fptosi-sat-scalar.ll:2138 +; X64-NEXT: ucomiss {{\.?LCPI[0-9]+_[0-9]+}}(%rip), %xmm0 +; X64-NEXT: movl $255, %eax +; X64-NEXT: cmovael %ecx, %eax ---------------- It seems less efficient than previous code on NAN, zero handling, but we can improve later. ================ Comment at: llvm/test/CodeGen/X86/half.ll:946 +; CHECK-I686-NEXT: calll __extendhfsf2 +; CHECK-I686-NEXT: fstps {{[0-9]+}}(%esp) +; CHECK-I686-NEXT: movss {{.*#+}} xmm0 = mem[0],zero,zero,zero ---------------- Why the x87 instruction is generated? Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D107082/new/ https://reviews.llvm.org/D107082 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits