tra added a comment. Nice. Thank you!
================ Comment at: clang/lib/CodeGen/CGBuiltin.cpp:18104 CodeGenFunction::EmitNVPTXBuiltinExpr(unsigned BuiltinID, const CallExpr *E) { - auto MakeLdg = [&](unsigned IntrinsicID) { + auto HasHalfSupport = [&](unsigned BuiltinID) { + auto &Context = getContext(); ---------------- I'd add a comment describing a meaning of the fields in the returned pair. ================ Comment at: clang/lib/CodeGen/CGBuiltin.cpp:18116 + case NVPTX::BI__nvvm_ldu_h: + BuiltinName = "__nvvm_ldu_h"; + break; ---------------- Can we use the standard `StringRef Name = getContext().BuiltinInfo.getName(BuiltinID);` to figure out the builtin name? ================ Comment at: clang/lib/CodeGen/CGBuiltin.cpp:18261-18263 + std::string ErrMsg{HalfSupport.second}; + CGM.Error(E->getExprLoc(), + ErrMsg.append(" requires native half type support.").c_str()); ---------------- Nit: this would be a bit more readable: ``` std::string BuiltinName{HalfSupport.second}; CGM.Error(E->getExprLoc(), BuiltinName + " requires native half type support.")` ``` You may also consider changing returned `BuiltinName` to be `std::string`, so you would not need an explicit temp var. ================ Comment at: clang/test/CodeGen/builtins-nvptx-native-half-type-err.c:3 +// +// RUN: not %clang_cc1 -DLDG -fsyntax-only -ffp-contract=off -triple nvptx-unknown-unknown -target-cpu \ +// RUN: sm_75 -target-feature +ptx70 -fcuda-is-device -x cuda -emit-llvm -o - %s 2>&1 \ ---------------- I think we could've done it all in one run if we were to do both ldg and ldu in one function. ================ Comment at: llvm/test/CodeGen/NVPTX/ldu-ldg.ll:33 +; ldu.global.u16 + %val = tail call i16 @llvm.nvvm.ldu.global.i.i16.p1(ptr addrspace(1) %ptr, i32 4) + ret i16 %val ---------------- Should alignment be 2 ? ================ Comment at: llvm/test/CodeGen/NVPTX/ldu-ldg.ll:60 +define double @test_ldu_f64(ptr addrspace(1) %ptr) { +; ldu.global.u64 + %val = tail call double @llvm.nvvm.ldu.global.f.f64.p1(ptr addrspace(1) %ptr, i32 8) ---------------- Hmm. I wonder why we end up with `u64` here and not `b64`. Not that it matters in this case, but it is a discrepancy vs. `f16`. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D145238/new/ https://reviews.llvm.org/D145238 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits