[PATCH] D114957: [AMDGPU] Change llvm.amdgcn.image.bvh.intersect.ray to take vec3 args

2021-12-02 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. In D114957#3166817 , @foad wrote: > This is a flag-day change to the signatures of the LLVM intrinsics and the > OpenCL builtins. Is that OK? This breaks users' code. If we have to do this, at least let clang emit a pre-defined

[PATCH] D114957: [AMDGPU] Change llvm.amdgcn.image.bvh.intersect.ray to take vec3 args

2021-12-02 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. In D114957#3166861 , @arsenm wrote: > In D114957#3166858 , @yaxunl wrote: > >> In D114957#3166817 , @foad wrote: >> >>> This is a flag-day change t

[PATCH] D114957: [AMDGPU] Change llvm.amdgcn.image.bvh.intersect.ray to take vec3 args

2021-12-02 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. In D114957#3166974 , @foad wrote: > In D114957#3166948 , @b-sumner > wrote: > >> In D114957#3166936 , @foad wrote: >> >>> In D114957#3166858

[PATCH] D114957: [AMDGPU] Change llvm.amdgcn.image.bvh.intersect.ray to take vec3 args

2021-12-02 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. In D114957#3166936 , @foad wrote: > In D114957#3166858 , @yaxunl wrote: > >> In D114957#3166817 , @foad wrote: >> >>> This is a flag-day change to

[PATCH] D114957: [AMDGPU] Change llvm.amdgcn.image.bvh.intersect.ray to take vec3 args

2021-12-02 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. LGTM from clang side. Thanks. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D114957/new/ https://reviews.llvm.org/D114957 ___ cfe-commits mailing list cfe-commits@lists.llvm.org ht

[PATCH] D114957: [AMDGPU] Change llvm.amdgcn.image.bvh.intersect.ray to take vec3 args

2021-12-02 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. In D114957#3167703 , @arsenm wrote: > In D114957#3167700 , @arsenm wrote: > >> I think this macro is purely terrible and should not be added (and at least >> should be all caps?). If we ca

[PATCH] D115039: [HIP] Fix -fgpu-rdc for Windows

2021-12-03 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl created this revision. yaxunl added a reviewer: tra. yaxunl requested review of this revision. This patch fixes issues for -fgpu-rdc for Windows MSVC toolchain: Fix COFF specific section flags and remove section types in llvm-mc input file for Windows. Escape fatbin path in llvm-mc input

[PATCH] D115032: [AMDGPU] Change llvm.amdgcn.image.bvh.intersect.ray to take vec3 args

2021-12-03 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl accepted this revision. yaxunl added a comment. Ideally, we could let the builtins accept both vec3 and vec4. But I am OK with this for now. I think the overhead may be minimal. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D115032/new/ http

[PATCH] D135269: [AMDGPU] Disable bool range metadata to workaround backend issue

2022-10-18 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. In D135269#3863470 , @nikic wrote: > Checking back here, have you made any progress on reducing the issue? > > cc @arsenm for awareness No. I am busy with other work and have not had time to get back to it. Repository: rG LLVM

[PATCH] D136311: [CUDA,NVPTX] Implement __bf16 support for NVPTX.

2022-10-25 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl accepted this revision. yaxunl added a comment. LGTM. Thanks. Do you plan to support arithmetic operators for bf16 or implement the FMA instruction support? Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D136311/new/ https://reviews.llvm.org

[PATCH] D136701: [LinkerWrapper] Perform device linking steps in parallel

2022-10-25 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. In general, I think parallelizing the linking workload for multiple GPUs in the linker wrapper is a useful feature. I am not sure whether the workload to be parallelized includes the LLVM passes and codegen, which is usually the bottleneck. Parallelizing this workload w

[PATCH] D136854: [HIP] add -fhiplib-add-rpath

2022-10-27 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl created this revision. yaxunl added a reviewer: tra. Herald added a project: All. yaxunl requested review of this revision. Herald added a subscriber: MaskRay. Add an option -f[no-]hiplib-add-rpath to control whether to pass -rpath to linker for HIP runtime library. By default it is off to

[PATCH] D136859: [HIP] add fmax/fmin for fp16

2022-10-27 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl created this revision. yaxunl added reviewers: tra, b-sumner. Herald added a project: All. yaxunl requested review of this revision. https://reviews.llvm.org/D136859 Files: clang/lib/Headers/__clang_hip_libdevice_declares.h Index: clang/lib/Headers/__clang_hip_libdevice_declares.h

[PATCH] D136854: [HIP] add -fhiplib-add-rpath

2022-10-27 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. In D136854#3889141 , @MaskRay wrote: > If an option does not affect compilation, I prefer `--` to `-f` I will rename it as --offload-add-rpath Comment at: clang/include/clang/Driver/Options.td:4158 + HelpText<

[PATCH] D136854: [HIP] add -fhiplib-add-rpath

2022-10-27 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl updated this revision to Diff 471289. yaxunl added a comment. rename to --offload-add-rpath CHANGES SINCE LAST ACTION https://reviews.llvm.org/D136854/new/ https://reviews.llvm.org/D136854 Files: clang/include/clang/Driver/Options.td clang/lib/Driver/ToolChains/Linux.cpp clang/t

[PATCH] D136854: [HIP] add --offload-add-rpath

2022-10-28 Thread Yaxun Liu via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rG55b43449978c: [HIP] add --offload-add-rpath (authored by yaxunl). Herald added a project: clang. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D136854/new/ h

[PATCH] D136859: [HIP] add fmax/fmin for fp16

2022-10-28 Thread Yaxun Liu via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rG319444fcf586: [HIP] add fmax/fmin for fp16 (authored by yaxunl). Herald added a project: clang. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D136859/new/ ht

[PATCH] D136981: [HIP] add float to fp16 convert functions

2022-10-28 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl created this revision. yaxunl added reviewers: tra, b-sumner. Herald added a project: All. yaxunl requested review of this revision. https://reviews.llvm.org/D136981 Files: clang/lib/Headers/__clang_hip_libdevice_declares.h Index: clang/lib/Headers/__clang_hip_libdevice_declares.h

[PATCH] D136981: [HIP] add float to fp16 convert functions

2022-10-28 Thread Yaxun Liu via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds. This revision was automatically updated to reflect the committed changes. Closed by commit rG36a025366215: [HIP] add float to fp16 convert functions (authored by yaxunl). Herald added a project: clang. Repository: rG LLVM Github Monorepo

[PATCH] D136701: [LinkerWrapper] Perform device linking steps in parallel

2022-10-31 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added inline comments. Comment at: clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp:1211-1212 +llvm::sort(Input, [](OffloadingImage &A, OffloadingImage &B) { + return A.StringData["triple"].compare(B.StringData["triple"]) == 1 || + A.StringData[
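The comparator quoted above is cut off, so the following is not a reconstruction of the patch. It is only a minimal sketch of one common way to order records by two string keys so that the comparator is a strict weak ordering, which `std::sort` and `llvm::sort` require. The `Image` struct and the choice of `Arch` as the second key are assumptions made for illustration. A comparator that simply ORs two independent key comparisons can report both `A < B` and `B < A` for the same pair; comparing the keys as a tuple avoids that.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <tuple>
#include <vector>

// Hypothetical stand-in for an offloading image; only the two keys matter.
struct Image {
  std::string Triple;
  std::string Arch; // assumed second key; the quoted snippet is truncated
};

// Lexicographic order: compare Triple first, use Arch only to break ties.
inline bool imageLess(const Image &A, const Image &B) {
  return std::tie(A.Triple, A.Arch) < std::tie(B.Triple, B.Arch);
}

// Returns a sorted copy so the caller's input is left untouched.
inline std::vector<Image> sortedImages(std::vector<Image> Input) {
  std::sort(Input.begin(), Input.end(), imageLess);
  assert(std::is_sorted(Input.begin(), Input.end(), imageLess));
  return Input;
}
```

With this ordering, images that share a triple end up adjacent, which is usually what a grouping step after the sort would rely on.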

[PATCH] D132140: [AMDGPU] Add builtin s_sendmsg_rtn_b{32|64}

2022-08-18 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl created this revision. yaxunl added reviewers: b-sumner, arsenm, foad, kzhuravl, bcahoon. Herald added subscribers: kosarev, kerbowa, t-tye, tpr, dstuttard, jvesely. Herald added a project: All. yaxunl requested review of this revision. Herald added a subscriber: wdng. https://reviews.llvm.

[PATCH] D132140: [AMDGPU] Add builtin s_sendmsg_rtn_b{32|64}

2022-08-18 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. In D132140#3732262 , @b-sumner wrote: > Following existing naming, it might make sense to rename "rtn_b32" --> "rtn" > and "rtn_b64" --> "rtnl". will modify. thanks. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D132140/

[PATCH] D132140: [AMDGPU] Add builtin s_sendmsg_rtn_b{32|64}

2022-08-18 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl updated this revision to Diff 453681. yaxunl added a comment. revised by Brian's comments CHANGES SINCE LAST ACTION https://reviews.llvm.org/D132140/new/ https://reviews.llvm.org/D132140 Files: clang/include/clang/Basic/BuiltinsAMDGPU.def clang/lib/CodeGen/CGBuiltin.cpp clang/te

[PATCH] D132248: [CUDA][OpenMP] Fix the new driver crashing on multiple device-only outputs

2022-08-19 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. In D132248#3736295 , @tra wrote: > I'm OK with that. > > @yaxunl -- what are your thoughts on whether this approach would work for > HIP? On one hand HIP already has a lot of features that the new driver is > intended to provide,

[PATCH] D132140: [AMDGPU] Add builtin s_sendmsg_rtn

2022-08-22 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added inline comments. Comment at: clang/test/CodeGenOpenCL/builtins-amdgcn-gfx11.cl:23 + +// Test mismatched argument and return types are handled. + tra wrote: > Is there a particular reason for this test? > > Argument and return value type checks shoul

[PATCH] D132140: [AMDGPU] Add builtin s_sendmsg_rtn

2022-08-22 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl updated this revision to Diff 454607. yaxunl added a comment. remove unnecessary tests CHANGES SINCE LAST ACTION https://reviews.llvm.org/D132140/new/ https://reviews.llvm.org/D132140 Files: clang/include/clang/Basic/BuiltinsAMDGPU.def clang/lib/CodeGen/CGBuiltin.cpp clang/test/

[PATCH] D132140: [AMDGPU] Add builtin s_sendmsg_rtn

2022-08-22 Thread Yaxun Liu via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rG9f6cb3e9fdb4: [AMDGPU] Add builtin s_sendmsg_rtn (authored by yaxunl). Herald added a project: clang. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D132140/ne

[PATCH] D130096: [Clang][AMDGPU] Emit AMDGPU library control constants in clang

2022-08-22 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added inline comments. Comment at: clang/lib/CodeGen/TargetInfo.cpp:9436 +CGM.getModule(), Type, true, +llvm::GlobalValue::LinkageTypes::LinkOnceODRLinkage, +llvm::ConstantInt::get(Type, Value), Name, nullptr, This does not support

[PATCH] D132248: [CUDA][OpenMP] Fix the new driver crashing on multiple device-only outputs

2022-08-23 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl accepted this revision. yaxunl added a comment. This revision is now accepted and ready to land. LGTM. Thanks. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D132248/new/ https://reviews.llvm.org/D132248 __

[PATCH] D132607: [OffloadPackager] Add ability to extract images from other file types

2022-08-26 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added inline comments. Comment at: clang/tools/clang-offload-packager/ClangOffloadPackager.cpp:17-21 +#include "llvm/IR/Constants.h" +#include "llvm/IR/Module.h" +#include "llvm/IRReader/IRReader.h" +#include "llvm/Object/Archive.h" +#include "llvm/Object/ArchiveWriter.h"

[PATCH] D133705: [HIP] Fix unbundling archive

2022-09-15 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. In D133705#3793702 , @tra wrote: > In D133705#3785470 , @yaxunl wrote: > >>> Also, using `lib*.a` as pattern to tell device libraries from the host-only >>> one will be insufficient. There

[PATCH] D133705: [HIP] Fix unbundling archive

2022-09-15 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. In D133705#3793931 , @MaskRay wrote: > I know very little about HIP, but I am concerned with relying on extensions > as well. For example, I've seen `libc++.a.1` (we use this for the real > archive while `libc++.a` is a linker sc

[PATCH] D130096: [Clang][AMDGPU] Emit AMDGPU library control constants in clang

2022-09-16 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added inline comments. Comment at: clang/lib/CodeGen/TargetInfo.cpp:9468-9472 + AddGlobal("__oclc_wavefrontsize64", Wavefront64, /*Size=*/8); + AddGlobal("__oclc_daz_opt", DenormAreZero, /*Size=*/8); + AddGlobal("__oclc_finite_only_opt", FiniteOnly || RelaxedMath, /*Siz

[PATCH] D130096: [Clang][AMDGPU] Emit AMDGPU library control constants in clang

2022-09-16 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added inline comments. Comment at: clang/lib/CodeGen/TargetInfo.cpp:9468 + // Control constants for math operations. + AddGlobal("__oclc_wavefrontsize64", Wavefront64, /*Size=*/8); + AddGlobal("__oclc_daz_opt", DenormAreZero, /*Size=*/8); jhuber6 wrote:

[PATCH] D134355: [AMDGPU] Emit module flag for all code object versions

2022-09-21 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl created this revision. yaxunl added reviewers: b-sumner, cfang. Herald added subscribers: kosarev, kerbowa, t-tye, tpr, dstuttard, jvesely, kzhuravl. Herald added a project: All. yaxunl requested review of this revision. Herald added a subscriber: wdng. https://reviews.llvm.org/D134355 Fi

[PATCH] D134314: [HIP] stop forcing the lang std in the driver

2022-09-21 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl accepted this revision. yaxunl added a comment. This revision is now accepted and ready to land. LGTM. Thanks. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D134314/new/ https://reviews.llvm.org/D134314 __

[PATCH] D134355: [AMDGPU] Emit module flag for all code object versions

2022-09-21 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl marked an inline comment as done. yaxunl added inline comments. Comment at: clang/lib/CodeGen/CodeGenModule.cpp:590-591 // times 100. -// ToDo: Enable module flag for all code object version when ROCm device -// library is ready. -if (getTarget().getTarget

[PATCH] D133705: [HIP] Fix unbundling archive

2022-09-22 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl updated this revision to Diff 462189. yaxunl added a comment. allow archive files to have unknown extension CHANGES SINCE LAST ACTION https://reviews.llvm.org/D133705/new/ https://reviews.llvm.org/D133705 Files: clang/lib/Driver/Driver.cpp clang/lib/Driver/ToolChains/CommonArgs.cp

[PATCH] D134355: [AMDGPU] Emit module flag for all code object versions

2022-09-22 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl marked an inline comment as done. yaxunl added a comment. In D134355#3807435 , @cfang wrote: > LGTM > > Should the module flag name be amdgpu_code_object_version or > amdhsa_code_object_version? Good question. @b-sumner Does code object version

[PATCH] D134355: [AMDGPU] Emit module flag for all code object versions

2022-09-22 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. In D134355#3809471 , @b-sumner wrote: > In D134355#3809294 , @yaxunl wrote: > >> In D134355#3807435 , @cfang wrote: >> >>> LGTM >>> >>> Should the

[PATCH] D134355: [AMDGPU] Emit module flag for all code object versions

2022-09-22 Thread Yaxun Liu via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds. This revision was automatically updated to reflect the committed changes. Closed by commit rG5e25284dbc94: [AMDGPU] Emit module flag for all code object versions (authored by yaxunl). Herald added a project: clang. Repository: rG LLVM Gith

[PATCH] D133705: [HIP] Fix unbundling archive

2022-09-22 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl marked 4 inline comments as done. yaxunl added inline comments. Comment at: clang/lib/Driver/Driver.cpp:2907-2908 +// which are not object files. Files with extension ".lib" is classified +// as TY_Object but they are actually archives, therefore should no

[PATCH] D133705: [HIP] Fix unbundling archive

2022-09-22 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl updated this revision to Diff 462376. yaxunl marked 4 inline comments as done. yaxunl added a comment. revised by Artem's comments CHANGES SINCE LAST ACTION https://reviews.llvm.org/D133705/new/ https://reviews.llvm.org/D133705 Files: clang/lib/Driver/Driver.cpp clang/lib/Driver/T

[PATCH] D134546: [clang-offload-bundler] extracting compatible bundle entry

2022-09-23 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl created this revision. yaxunl added reviewers: tra, saiislam, lamb-j. Herald added a project: All. yaxunl requested review of this revision. Herald added subscribers: sstefan1, MaskRay. Herald added a reviewer: jdoerfert. In HIP a library is usually compiled with default target ID e.g. gfx9

[PATCH] D133705: [HIP] Fix unbundling archive

2022-09-24 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl updated this revision to Diff 462673. yaxunl added a comment. I just found that clang-offload-bundler reports an error when trying to unbundle an archive but the input file is not an archive. This update lets clang-offload-bundler extract empty archives when the input file is not an archive

[PATCH] D133705: [HIP] Fix unbundling archive

2022-09-26 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl marked an inline comment as done. yaxunl added inline comments. Comment at: clang/test/Driver/clang-offload-bundler.c:410-412 +// Check clang-offload-bundler extracts empty archives if the input file +// is not an archive when --allow-missing-bundles is specified, otherwis

[PATCH] D133705: [HIP] Fix unbundling archive

2022-09-26 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl updated this revision to Diff 463016. yaxunl marked an inline comment as done. yaxunl added a comment. check file magic and only unbundle real archives CHANGES SINCE LAST ACTION https://reviews.llvm.org/D133705/new/ https://reviews.llvm.org/D133705 Files: clang/lib/Driver/Driver.cpp
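The update note above mentions checking the file magic so that only real archives are unbundled. The patch itself is not shown here, so the following is only a minimal sketch of what such a check can look like, using the well-known magic strings that open Unix `ar` archives. The function name `looksLikeArchive` is hypothetical, and `Buffer` stands in for the first bytes read from the candidate file.

```cpp
#include <cassert>
#include <string_view>

// Magic strings that open `ar` archives (regular and GNU thin archives).
constexpr std::string_view ArMagic = "!<arch>\n";
constexpr std::string_view ThinArMagic = "!<thin>\n";

// Returns true when Buffer starts with one of the archive magic strings.
// substr clamps the length, so buffers shorter than the magic are safe.
constexpr bool looksLikeArchive(std::string_view Buffer) {
  return Buffer.substr(0, ArMagic.size()) == ArMagic ||
         Buffer.substr(0, ThinArMagic.size()) == ThinArMagic;
}
```

Inside LLVM, `llvm::identify_magic` from `llvm/BinaryFormat/Magic.h` performs this kind of detection for many file formats; whether the patch uses it is not visible in the snippet above.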

[PATCH] D130096: [Clang][AMDGPU] Emit AMDGPU library control constants in clang

2022-09-26 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added inline comments. Comment at: clang/lib/CodeGen/CodeGenAction.cpp:299-308 + if (!LinkModules.empty() && Gen->CGM().getTriple().isAMDGCN() && + !Gen->CGM().getLangOpts().GPURelocatableDeviceCode) { +const StringRef GVS[] = {"__oclc_daz_opt", "__oc

[PATCH] D133705: [HIP] Fix unbundling archive

2022-09-27 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl closed this revision. yaxunl added a comment. committed by 1172bdecfab364579d90e6aa5ba7fc64a5b96786 CHANGES SINCE LAST ACTION https://reviews.llvm.org/D133705/new/ https://reviews.llvm.org/D133705 _

[PATCH] D134872: AMDGPU: Add __builtin_amdgcn_permlane64

2022-09-29 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl accepted this revision as: yaxunl. yaxunl added a comment. This revision is now accepted and ready to land. LGTM. Thanks. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D134872/new/ https://reviews.llvm.org/D134872 ___ cfe-commits mailin

[PATCH] D88976: [clang] Use correct address space for global variable debug info

2022-09-29 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl accepted this revision. yaxunl added a comment. LGTM. Thanks. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D88976/new/ https://reviews.llvm.org/D88976 ___ cfe-commits mailing list cfe-commits@list

[PATCH] D130096: [Clang][AMDGPU] Emit AMDGPU library control constants in clang

2022-10-03 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added inline comments. Comment at: clang/test/CodeGen/amdgcn-link-control-constants.c:2-3 +// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py UTC_ARGS: --function-signature --check-globals --include-generated-funcs --global-value-regex "__oclc_

[PATCH] D130096: [Clang][AMDGPU] Emit AMDGPU library control constants in clang

2022-10-03 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added inline comments. Comment at: clang/test/CodeGen/amdgcn-control-constants.c:8 + +// GFX90A: @__oclc_daz_opt = linkonce_odr hidden local_unnamed_addr addrspace(4) constant i8 0, align 1 +// GFX90A: @__oclc_wavefrontsize64 = linkonce_odr hidden local_unnamed_addr addr

[PATCH] D130096: [Clang][AMDGPU] Emit AMDGPU library control constants in clang

2022-10-04 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added inline comments. Comment at: clang/test/CodeGen/amdgcn-control-constants.c:8 + +// GFX90A: @__oclc_daz_opt = linkonce_odr hidden local_unnamed_addr addrspace(4) constant i8 0, align 1 +// GFX90A: @__oclc_wavefrontsize64 = linkonce_odr hidden local_unnamed_addr addr

[PATCH] D134546: [clang-offload-bundler] extracting compatible bundle entry

2022-10-04 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl marked 3 inline comments as done. yaxunl added inline comments. Comment at: clang/lib/Driver/OffloadBundler.cpp:1008 +auto Output = Worklist.begin(); +for (auto E = Worklist.end(); Output != E; Output++) { + if (isCodeObjectCompatible( tra wro

[PATCH] D134546: [clang-offload-bundler] extracting compatible bundle entry

2022-10-04 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl updated this revision to Diff 465265. yaxunl marked an inline comment as done. yaxunl added a comment. check bundle entry ID compatibility when bundling CHANGES SINCE LAST ACTION https://reviews.llvm.org/D134546/new/ https://reviews.llvm.org/D134546 Files: clang/include/clang/Basic/

[PATCH] D135269: [AMDGPU] Disable bool range metadata to workaround backend issue

2022-10-05 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl created this revision. yaxunl added reviewers: tra, ronl. Herald added subscribers: kosarev, t-tye, tpr, dstuttard, kzhuravl. Herald added a project: All. yaxunl requested review of this revision. Herald added a subscriber: wdng. Currently there is some backend issue which causes values loa

[PATCH] D135269: [AMDGPU] Disable bool range metadata to workaround backend issue

2022-10-05 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl updated this revision to Diff 465403. yaxunl added a comment. fix comments in test CHANGES SINCE LAST ACTION https://reviews.llvm.org/D135269/new/ https://reviews.llvm.org/D135269 Files: clang/lib/CodeGen/CGExpr.cpp clang/test/CodeGenCUDA/bool-range.cu Index: clang/test/CodeGenC

[PATCH] D135305: [Clang] Fix using LTO with the new driver in RDC-mode

2022-10-05 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. should we test with -ccc-print-phases instead? It is not clear what actions are produced by driver. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D135305/new/ https://reviews.llvm.org/D135305 ___

[PATCH] D135305: [Clang] Fix using LTO with the new driver in RDC-mode

2022-10-05 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. In D135305#3838435 , @jhuber6 wrote: > In D135305#3838412 , @yaxunl wrote: > >> should we test with -ccc-print-phases instead? It is not clear what actions >> are produced by driver. > > A

[PATCH] D135269: [AMDGPU] Disable bool range metadata to workaround backend issue

2022-10-05 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. In D135269#3837394 , @tra wrote: > Is there more info about the issue? What does AMDGPU currently emit for the > test case? > > AFAICT from running it on CE (https://godbolt.org/z/ccq3vnbrM) llvm optimizes > it to essentially `*y

[PATCH] D134546: [clang-offload-bundler] extracting compatible bundle entry

2022-10-05 Thread Yaxun Liu via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds. This revision was automatically updated to reflect the committed changes. Closed by commit rG844b84af20c7: [clang-offload-bundler] extracting compatible bundle entry (authored by yaxunl). Herald added a project: clang. Repository: rG LLVM

[PATCH] D135328: [CUDA] Refactored CUDA version housekeeping to use less boilerplate.

2022-10-05 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added inline comments. Comment at: clang/lib/Basic/Cuda.cpp:59 +CudaVersion ToCudaVersion(llvm::VersionTuple Version) { + int IVer = Version.getMajor() * 10 + Version.getMinor().value_or(0); + for (auto *I = CudaNameVersionMap; I->Version != CudaVersion::UNKNOWN; ++I) --
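The quoted diff folds a version tuple into a single integer (`major * 10 + minor`) and then walks a table until it hits an `UNKNOWN` sentinel. The sketch below illustrates that lookup pattern in isolation. The enum values, the table contents, and the name `toCudaVersion` are assumptions for illustration; the real table in the patch is not shown in the snippet.

```cpp
#include <cassert>
#include <optional>

// Hypothetical subset of versions, just enough to show the lookup.
enum class CudaVersion { UNKNOWN, CUDA_117, CUDA_118 };

struct VersionEntry {
  int IVer;            // major * 10 + minor, e.g. 11.8 -> 118
  CudaVersion Version; // UNKNOWN marks the end of the table
};

// Sentinel-terminated table, scanned linearly.
constexpr VersionEntry VersionMap[] = {
    {117, CudaVersion::CUDA_117},
    {118, CudaVersion::CUDA_118},
    {0, CudaVersion::UNKNOWN}, // sentinel
};

// A missing minor component is treated as 0, mirroring value_or(0).
inline CudaVersion toCudaVersion(int Major, std::optional<int> Minor) {
  const int IVer = Major * 10 + Minor.value_or(0);
  for (const VersionEntry *I = VersionMap; I->Version != CudaVersion::UNKNOWN;
       ++I)
    if (I->IVer == IVer)
      return I->Version;
  return CudaVersion::UNKNOWN;
}
```

Versions that are not in the table fall through to `UNKNOWN`, so the sentinel doubles as the "not found" result.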

[PATCH] D135305: [Clang] Fix using LTO with the new driver in RDC-mode

2022-10-05 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added inline comments. Comment at: clang/test/Driver/hip-phases.hip:553 +// +// RUN: %clang -### --target=x86_64-linux-gnu --offload-new-driver -ccc-print-phases \ +// RUN:--offload-arch=gfx90a --offload-arch=gfx908 -foffload-lto -fgpu-rdc -c %s 2>&1 \ --

[PATCH] D135306: [CUDA] Add support for CUDA-11.8 and sm_{87,89,90} GPUs.

2022-10-05 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added inline comments. Comment at: clang/include/clang/Basic/BuiltinsNVPTX.def:30-32 +#define SM_89 "sm_87|" SM_90 +#define SM_87 "sm_89|" SM_89 +#define SM_86 "sm_89|" SM_87 typo? Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https:
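In the quoted lines, the string prefix of each macro does not match the macro's own name (`SM_89` starts with `"sm_87|"`, `SM_87` and `SM_86` start with `"sm_89|"`), which is presumably what the "typo?" comment points at. The sketch below shows the chaining pattern these lines appear to be aiming for, where each macro names its own architecture and then appends every newer one. It assumes `SM_90` is the plain string `"sm_90"`, which is not visible in the snippet.

```cpp
#include <cassert>
#include <string_view>

// Adjacent string literals are concatenated by the compiler, so each macro
// expands to "its own arch|every newer arch".
#define SM_90 "sm_90" // assumed base case; not shown in the quoted diff
#define SM_89 "sm_89|" SM_90
#define SM_87 "sm_87|" SM_89
#define SM_86 "sm_86|" SM_87

constexpr std::string_view FromSm86 = SM_86;
constexpr std::string_view FromSm89 = SM_89;
```

Under these assumptions, `SM_86` expands to the full list from `sm_86` upward, while the definitions as quoted would list `sm_89` twice and never mention `sm_86`.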

[PATCH] D135269: [AMDGPU] Disable bool range metadata to workaround backend issue

2022-10-05 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl marked an inline comment as done. yaxunl added inline comments. Comment at: clang/lib/CodeGen/CGExpr.cpp:1792 // attach range metadata to the load. - } else if (CGM.getCodeGenOpts().OptimizationLevel > 0) +// TODO: Enable range metadata for AMDGCN after backend i

[PATCH] D135305: [Clang] Fix using LTO with the new driver in RDC-mode

2022-10-06 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl accepted this revision. yaxunl added a comment. This revision is now accepted and ready to land. LGTM. Thanks. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D135305/new/ https://reviews.llvm.org/D135305 __

[PATCH] D135306: [CUDA] Add support for CUDA-11.8 and sm_{87,89,90} GPUs.

2022-10-06 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl accepted this revision. yaxunl added a comment. This revision is now accepted and ready to land. LGTM. Thanks. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D135306/new/ https://reviews.llvm.org/D135306 __

[PATCH] D135269: [AMDGPU] Disable bool range metadata to workaround backend issue

2022-10-06 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl updated this revision to Diff 465731. yaxunl marked an inline comment as done. yaxunl edited the summary of this revision. yaxunl added a comment. update comments with issue link CHANGES SINCE LAST ACTION https://reviews.llvm.org/D135269/new/ https://reviews.llvm.org/D135269 Files:

[PATCH] D135269: [AMDGPU] Disable bool range metadata to workaround backend issue

2022-10-07 Thread Yaxun Liu via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rG107ee2613063: [AMDGPU] Disable bool range metadata to workaround backend issue (authored by yaxunl). Herald added a project: clang. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https

[PATCH] D135328: [CUDA] Refactored CUDA version housekeeping to use less boilerplate.

2022-10-07 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl accepted this revision. yaxunl added a comment. This revision is now accepted and ready to land. LGTM. Thanks. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D135328/new/ https://reviews.llvm.org/D135328 __

[PATCH] D135614: [OpenMP][CUDA][AMDGPU] Accept case insensitive subarchitecture names

2022-10-11 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. I am not sure whether it is a good idea to allow gfx90A in `--offload-arch`, since it is not documented in LLVM AMDGPU usage. @b-sumner @arsenm Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D135614/new/ https://reviews.llvm.

[PATCH] D135724: [HIP] Fix unbundling archive

2022-10-11 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl created this revision. yaxunl added reviewers: tra, saiislam. Herald added a project: All. yaxunl requested review of this revision. Herald added a subscriber: MaskRay. When `-lxxx` is specified, if a directory or file named `xxx` happens to exist, clang will not look up `libxxx.a`

[PATCH] D135724: [HIP] Fix unbundling archive

2022-10-12 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added inline comments. Comment at: clang/lib/Driver/ToolChains/CommonArgs.cpp:1839 auto Ext = IsMSVC ? ".lib" : ".a"; - if (!Lib.startswith(":") && llvm::sys::fs::exists(Lib)) { -ArchiveOfBundles = Lib; -FoundAOB = true; + if (!Lib.startswith(":") && !Lib.star

[PATCH] D135796: [HIP] Detect HIP for Debian/Fedora

2022-10-12 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl created this revision. yaxunl added a reviewer: tra. Herald added subscribers: kosarev, kerbowa, jvesely. Herald added a project: All. yaxunl requested review of this revision. Herald added a subscriber: MaskRay. HIP is installed at /usr or /usr/local on Debian/Fedora, and the version file i

[PATCH] D135796: [HIP] Detect HIP for Debian/Fedora

2022-10-12 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added inline comments. Comment at: clang/lib/Driver/ToolChains/AMDGPU.cpp:309 + ROCmSearchDirs.emplace_back(D.SysRoot + "/usr/local", + /*StrictChecking=*/true); tra wrote: > Should it be done for Debian/Fedora only? See >

[PATCH] D135724: [HIP] Fix unbundling archive

2022-10-12 Thread Yaxun Liu via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rG12c6a41f5249: [HIP] Fix unbundling archive (authored by yaxunl). Herald added a project: clang. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D135724/new/ ht

[PATCH] D135796: [HIP] Detect HIP for Debian/Fedora

2022-10-12 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl marked an inline comment as done. yaxunl added inline comments. Comment at: clang/lib/Driver/ToolChains/AMDGPU.cpp:309 + ROCmSearchDirs.emplace_back(D.SysRoot + "/usr/local", + /*StrictChecking=*/true); tra wrote: > yaxunl w

[PATCH] D135796: [HIP] Detect HIP for Debian/Fedora

2022-10-12 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl updated this revision to Diff 467282. yaxunl marked an inline comment as done. yaxunl added a comment. only check /usr and /usr/local for Debian and Red Hat CHANGES SINCE LAST ACTION https://reviews.llvm.org/D135796/new/ https://reviews.llvm.org/D135796 Files: clang/lib/Driver/ToolCha

[PATCH] D135796: [HIP] Detect HIP for Debian/Fedora

2022-10-12 Thread Yaxun Liu via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rG082593ff7aff: [HIP] Detect HIP for Debian/Fedora (authored by yaxunl). Herald added a project: clang. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D135796/ne

[PATCH] D136036: [Clang] Add __has_constexpr_builtin support

2022-10-17 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. need some Sema tests to verify a constexpr builtin is const evaluated. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D136036/new/ https://reviews.llvm.org/D136036 ___ cfe-commits m

[PATCH] D135832: Do not append terminating NUL to the string with embedded GPU binary.

2022-10-17 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl accepted this revision. yaxunl added a comment. This revision is now accepted and ready to land. LGTM. Thanks. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D135832/new/ https://reviews.llvm.org/D135832 __

[PATCH] D140315: [AMDGCN] Update search path for device libraries

2022-12-21 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. The files under clang/test/Driver/Inputs/rocm_resource_dir/lib/amdgcn/bitcode-no-abi-ver seem to be unused. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D140315/new/ https://reviews.llvm.org/D140315 __

[PATCH] D138392: clang/HIP: Fix broken implementations of __make_mantissa* functions

2022-12-21 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl accepted this revision. yaxunl added a comment. This revision is now accepted and ready to land. LGTM. Thanks. It seems gcc assumes the argument to nan is nonnull (https://godbolt.org/z/xzb8T6Gon), so we can assume that too. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D138392/n

[PATCH] D141008: [Clang][SPIR-V] Emit target extension types for OpenCL types on SPIR-V.

2023-01-09 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added inline comments. Comment at: clang/test/CodeGenOpenCL/cl20-device-side-enqueue.cl:6 // RUN: %clang_cc1 -no-opaque-pointers -no-enable-noundef-analysis %s -cl-std=CL3.0 -ffake-address-space-map -O0 -emit-llvm -o - -triple "spir64-unknown-unknown" | FileCheck %s --c

[PATCH] D141051: [CUDA][HIP] Add support for `--offload-arch=native` to CUDA and refactor

2023-01-09 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added inline comments. Comment at: clang/test/Driver/amdgpu-hip-system-arch.c:24 + +// case when amdgpu_arch does not return anything with successful execution +// RUN: %clang -### --target=x86_64-unknown-linux-gnu -nogpulib --offload-arch=native --amdgpu-arch-tool=%t/a

[PATCH] D140663: CUDA/HIP: Use kernel name to map to symbol

2023-01-10 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added inline comments. Comment at: clang/test/CodeGenCUDA/incomplete-func-ptr-type.cu:2 +// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -emit-llvm -x hip %s -o - \ +// RUN: | FileCheck %s + need to check `_Z19__device_stub__kern7TempValIjE` generates

[PATCH] D140315: [AMDGCN] Update search path for device libraries

2023-01-10 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl accepted this revision. yaxunl added a comment. This revision is now accepted and ready to land. LGTM. Thanks. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D140315/new/ https://reviews.llvm.org/D140315 __

[PATCH] D141051: [CUDA][HIP] Add support for `--offload-arch=native` to CUDA and refactor

2023-01-10 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl accepted this revision. yaxunl added a comment. This revision is now accepted and ready to land. LGTM. Thanks. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D141051/new/ https://reviews.llvm.org/D141051 __

[PATCH] D141437: [HIP] Use .hipi as preprocessor output extension

2023-01-10 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl created this revision. yaxunl added a reviewer: tra. Herald added a project: All. yaxunl requested review of this revision. Herald added a subscriber: MaskRay. so that clang can recognize it and handle it automatically without -x hip-cpp-output https://reviews.llvm.org/D141437 Files:

[PATCH] D141449: clang/OpenCL: Fix not setting convergent on block invoke kernels

2023-01-10 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added inline comments. Comment at: clang/lib/CodeGen/TargetInfo.cpp:12428 F->addFnAttr(llvm::Attribute::NoUnwind); + F->addFnAttr(llvm::Attribute::Convergent); How about using CGF.CGM.addDefaultFunctionDefinitionAttributes? same as below CHANGES SI

[PATCH] D141447: clang/OpenCL: Don't use a Function for the block type

2023-01-10 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. need a test CHANGES SINCE LAST ACTION https://reviews.llvm.org/D141447/new/ https://reviews.llvm.org/D141447 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D141437: [HIP] Use .hipi as preprocessor output extension

2023-01-11 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl marked an inline comment as done. yaxunl added inline comments. Comment at: clang/tools/clang-offload-bundler/ClangOffloadBundler.cpp:105 + " cui - cuda-cpp-output\n" + " hipi - hip-cpp-outpu\n" + " d- dependency\n" ---

[PATCH] D137251: [clang][cuda/hip] Allow `__noinline__` lambdas

2022-11-02 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. need a CodeGenCUDA test Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D137251/new/ https://reviews.llvm.org/D137251 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://

[PATCH] D137275: [Driver][test] Fix test by creating empty archive instead of empty file

2022-11-02 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl accepted this revision. yaxunl added a comment. This revision is now accepted and ready to land. LGTM. Thanks. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D137275/new/ https://reviews.llvm.org/D137275 __

[PATCH] D137154: Adding nvvm_reflect clang builtin

2022-11-03 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. In D137154#3904810 , @hdelan wrote: > In DPC++ for CUDA we use libclc as a wrapper around CUDA SDK's libdevice. > Like libdevice we want to precompile libclc to bc for the CUDA backend > without specializing for a particular arch

[PATCH] D137251: [clang][cuda/hip] Allow `__noinline__` lambdas

2022-11-03 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added inline comments. Comment at: clang/lib/Parse/ParseExprCXX.cpp:1300 +ParseGNUAttributes(Attr, nullptr, &D); + } else if (Tok.is(tok::kw___noinline__)) { +IdentifierInfo *AttrName = Tok.getIdentifierInfo(); Pierre-vh wrote: > aaron

[PATCH] D137251: [clang][cuda/hip] Allow `__noinline__` lambdas

2022-11-03 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. In D137251#3904618 , @Pierre-vh wrote: > Comments > > Not sure if the release note is in the right place though. > As for the test, I did something quite targeted/minimal, hope it's fine? LGTM. Thanks. Repository: rG LLVM Gith
