[PATCH] D55067: [HIP] Fix offset of kernel argument for AMDGPU target

2018-11-29 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. In D55067#1313290 , @yaxunl wrote: > In D55067#1313213 , @rjmccall wrote: > > > This seems backwards. Clang knows what the actual ABI alignment of the C > > type is, and it doesn't have to

[PATCH] D55067: [HIP] Fix offset of kernel argument for AMDGPU target

2018-11-29 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. In D55067#1313364 , @arsenm wrote: > In D55067#1313264 , @yaxunl wrote: > > > In D55067#1313213 , @rjmccall > > wrote: > > > > > This seems backwards

[PATCH] D53153: [OpenCL] Mark kernel functions with default visibility

2018-11-30 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. In D53153#1315039 , @scott.linder wrote: > In D53153#1314798 , @rjmccall wrote: > > > You still have the same linkage model for those other languages, right? > > Ultimately there's somethi

[PATCH] D55067: [HIP] Fix offset of kernel argument for AMDGPU target

2018-12-04 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. In D55067#1318810 , @arsenm wrote: > An OpenCL kernel may call another OpenCL kernel. I am wondering how do you pass arguments to the kernel callee. A simpler solution would be not letting hipSetupArgument specify the offset. S

[PATCH] D55067: [HIP] Fix offset of kernel argument for AMDGPU target

2018-12-05 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. In D55067#1320691 , @arsenm wrote: > I think if we can just declare something simple to follow that doesn't depend > on the IR type alignment, we could pack any basic type and align any > aggregates to 4 From the user point of v

[PATCH] D58658: [OpenCL] Fix assertion due to blocks

2019-02-25 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl created this revision. yaxunl added a reviewer: Anastasia. Herald added subscribers: kristof.beyls, javed.absar. A recent change caused assertion in CodeGenFunction::EmitBlockCallExpr when a block is called. There is code if (!isa(E->getCalleeDecl())) Func = CGM.getOpenCLRuntime().

[PATCH] D56411: [CUDA][HIP][Sema] Fix template kernel with function as template parameter

2019-02-26 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. In D56411#1410153 , @rjmccall wrote: > In D56411#1406212 , @yaxunl wrote: > > > I would like to fix the validation issue only and leave the overload > > resolution issue for future. > > > As

[PATCH] D58658: [OpenCL] Fix assertion due to blocks

2019-02-26 Thread Yaxun Liu via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rC354893: [OpenCL] Fix assertion due to blocks (authored by yaxunl, committed by ). Herald added a project: clang. Repository: rC Clang CHANGES SINCE LAST ACTION https://reviews.llvm.org/D58658/new/ h

[PATCH] D57716: [CUDA][HIP] Check calling convention based on function target

2019-02-26 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. ping CHANGES SINCE LAST ACTION https://reviews.llvm.org/D57716/new/ https://reviews.llvm.org/D57716 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D58623: [AMDGPU] Allow using integral non-type template parameters

2019-02-26 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl accepted this revision. yaxunl added a comment. This revision is now accepted and ready to land. LGTM. Thanks! Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D58623/new/ https://reviews.llvm.org/D58623

[PATCH] D57716: [CUDA][HIP] Check calling convention based on function target

2019-02-26 Thread Yaxun Liu via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rC354929: [CUDA][HIP] Check calling convention based on function target (authored by yaxunl, committed by ). Herald added a project: clang. Repository: rC Clang CHANGES SINCE LAST ACTION https://review

[PATCH] D58518: [HIP] change kernel stub name

2019-02-26 Thread Yaxun Liu via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rL354948: [HIP] change kernel stub name (authored by yaxunl, committed by ). Herald added a project: LLVM. Herald added a subscriber: llvm-commits. Changed prior to commit: https://reviews.llvm.org/D58518

[PATCH] D57716: [CUDA][HIP] Check calling convention based on function target

2019-02-27 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl marked an inline comment as done. yaxunl added inline comments. Comment at: lib/Sema/SemaDeclAttr.cpp:4620 const TargetInfo &TI = Context.getTargetInfo(); - TargetInfo::CallingConvCheckResult A = TI.checkCallingConvention(CC); + auto *Aux = Context.getAuxTargetInfo();

[PATCH] D57716: [CUDA][HIP] Check calling convention based on function target

2019-02-27 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl marked an inline comment as done. yaxunl added inline comments. Comment at: lib/Sema/SemaDeclAttr.cpp:4620 const TargetInfo &TI = Context.getTargetInfo(); - TargetInfo::CallingConvCheckResult A = TI.checkCallingConvention(CC); + auto *Aux = Context.getAuxTargetInfo();

[PATCH] D58917: [HIP] Do not unbundle object files for -fno-gpu-rdc

2019-03-04 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl created this revision. yaxunl added a reviewer: tra. Herald added a subscriber: jdoerfert. Herald added a project: clang. When -fno-gpu-rdc is set, device code is compiled, linked, and assembled into fat binary and embedded as string in object files. The object files are normal object fil

[PATCH] D58057: Allow bundle size to be 0 in clang-offload-bundler

2019-03-05 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. Alexey, could you please also review this patch? Thanks. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D58057/new/ https://reviews.llvm.org/D58057

[PATCH] D58917: [HIP] Do not unbundle object files for -fno-gpu-rdc

2019-03-05 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl marked an inline comment as done. yaxunl added inline comments. Comment at: lib/Driver/Driver.cpp:2298 +/// Flag for -fgpu-rdc. +bool Relocatable; public: ABataev wrote: > Set the default initializer for the field will do. thanks. Repository:

[PATCH] D58917: [HIP] Do not unbundle object files for -fno-gpu-rdc

2019-03-05 Thread Yaxun Liu via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rL355410: [HIP] Do not unbundle object files for -fno-gpu-rdc (authored by yaxunl, committed by ). Herald added a project: LLVM. Herald added a subscriber: llvm-commits. Changed prior to commit: https://r

[PATCH] D58057: Allow bundle size to be 0 in clang-offload-bundler

2019-03-05 Thread Yaxun Liu via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rC355419: Allow bundle size to be 0 in clang-offload-bundler (authored by yaxunl, committed by ). Herald added a project: clang. Repository: rC Clang CHANGES SINCE LAST ACTION https://reviews.llvm.org/

[PATCH] D56411: [CUDA][HIP][Sema] Fix template kernel with function as template parameter

2019-03-05 Thread Yaxun Liu via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rC355421: [CUDA][HIP][Sema] Fix template kernel with function as template parameter (authored by yaxunl, committed by ). Herald added a project: clang. Repository: rC Clang CHANGES SINCE LAST ACTION ht

[PATCH] D59316: [HIP-Clang] propagate -mllvm options to opt and llc

2019-03-14 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. Here we are looking at the code which emulates a "linker" for HIP toolchain. The offloading action builder requests the offloading toolchain have a linker, but amdgpu does not have a real linker (ISA level linker), so we have to emulate that. If we have an ISA level link

[PATCH] D59316: [HIP-Clang] propagate -mllvm options to opt and llc

2019-03-14 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl accepted this revision. yaxunl added a comment. This revision is now accepted and ready to land. LGTM. Thanks! Repository: rC Clang CHANGES SINCE LAST ACTION https://reviews.llvm.org/D59316/new/ https://reviews.llvm.org/D59316

[PATCH] D59316: [HIP-Clang] propagate -mllvm options to opt and llc

2019-03-15 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. In D59316#1431238 , @arsenm wrote: > In D59316#1429580 , @yaxunl wrote: > > > Here we are looking at the code which emulates a "linker" for HIP > > toolchain. The offloading action builder r

[PATCH] D59316: [HIP-Clang] propagate -mllvm options to opt and llc

2019-03-15 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. In D59316#1431276 , @arsenm wrote: > In D59316#1431253 , @yaxunl wrote: > > > In D59316#1431238 , @arsenm wrote: > > > > > In D59316#1429580

[PATCH] D59316: [HIP-Clang] propagate -mllvm options to opt and llc

2019-03-15 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. In D59316#1431302 , @arsenm wrote: > > ML workloads are extremely unlikely to use a call. We should have an > execution tests with noinline somewhere to stress this I compiled and ran a test with noinline function and I saw fu

[PATCH] D59647: [CUDA][HIP] Warn shared var initialization

2019-03-21 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl created this revision. yaxunl added a reviewer: tra. Herald added a project: clang. In many cases the default constructor of a class contains initializer of data members, which allows concise code. The class may be instantiated as global or automatic variables in device code, which is total

[PATCH] D59863: [HIP] Support gpu arch gfx906+sram-ecc

2019-03-26 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl created this revision. yaxunl added reviewers: tra, ashi1. Herald added a subscriber: jholewinski. Add a new gpu arch gfx906+sram-ecc for HIP. It is similar to gfx906 but implies target feature sram-ecc enabled, whereas gfx906 implies target sram-ecc disabled. Corresponding option -mattr=[

[PATCH] D59863: [HIP] Support gpu arch gfx906+sram-ecc

2019-03-27 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl marked an inline comment as done. yaxunl added inline comments. Comment at: lib/Basic/Cuda.cpp:113 + case CudaArch::GFX906_SRAM_ECC: // TBA +return "gfx906+sram-ecc"; case CudaArch::GFX909: // TBA tra wrote: > Wording nit: > Does it mean `+(SRAM, E

[PATCH] D60141: [HIP-Clang] Fat binary should not be produced for non GPU code

2019-04-02 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl accepted this revision. yaxunl added a comment. This revision is now accepted and ready to land. LGTM. Thanks! Repository: rC Clang CHANGES SINCE LAST ACTION https://reviews.llvm.org/D60141/new/ https://reviews.llvm.org/D60141

[PATCH] D60141: [HIP-Clang] Fat binary should not be produced for non GPU code

2019-04-02 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added inline comments. Comment at: lib/CodeGen/CGCUDANV.cpp:475-476 return nullptr; + if (IsHIP && EmittedKernels.empty() && DeviceVars.empty()) +return nullptr; // void __{cuda|hip}_register_globals(void* handle); tra wrote: > I think this wo

[PATCH] D59321: WIP: AMDGPU: Teach toolchain to link rocm device libs

2019-04-03 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added inline comments. Comment at: lib/Driver/ToolChains/AMDGPU.h:25 +/// TODO: Generalize to handle libclc. +class RocmInstallationDetector { +private: I don't think we should detect ROCm installation here. We are compiling code for amdgpu not only on RO

[PATCH] D60513: [HIP] Use -mlink-builtin-bitcode to link device library

2019-04-10 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl created this revision. yaxunl added reviewers: tra, ashi1. Herald added a subscriber: jdoerfert. Use -mlink-builtin-bitcode instead of llvm-link to link device library so that device library bitcode and user device code can be compiled in a consistent way. This is the same approach used by

[PATCH] D60620: [HIP] Support -offloading-target-id

2019-04-12 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl created this revision. yaxunl added reviewers: tra, b-sumner, ashi1, scchan, t-tye. Herald added a subscriber: mgorny. This patch introduces a new option -offloading-target-id for HIP. Offloading target id is a generalization of CUDA/HIP GPU arch. It is a device name plus optional feature

[PATCH] D60513: [HIP] Use -mlink-builtin-bitcode to link device library

2019-04-12 Thread Yaxun Liu via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rL358290: [HIP] Use -mlink-builtin-bitcode to link device library (authored by yaxunl, committed by ). Herald added a project: LLVM. Herald added a subscriber: llvm-commits. Changed prior to commit: https

[PATCH] D61112: AMDGPU: Enable _Float16

2019-04-24 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl created this revision. yaxunl added reviewers: b-sumner, rampitec, arsenm. Herald added subscribers: t-tye, tpr, dstuttard, nhaehnle, wdng, jvesely, kzhuravl. https://reviews.llvm.org/D61112 Files: lib/Basic/Targets/AMDGPU.cpp test/CodeGenCXX/amdgpu-float16.cpp Index: test/CodeGenCX

[PATCH] D61194: [HIP] Fix visibility of `__constant__` variables.

2019-04-26 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added inline comments. Comment at: clang/lib/CodeGen/TargetInfo.cpp:7851 + (isa(D) && + (D->hasAttr() || D->hasAttr())); } is format right? Comment at: clang/test/CodeGenCUDA/amdgpu-visibility.cu:1 +// RUN: %clang_cc1 -

[PATCH] D61194: [HIP] Fix visibility of `__constant__` variables.

2019-04-26 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl accepted this revision. yaxunl added a comment. This revision is now accepted and ready to land. LGTM. Thanks! Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D61194/new/ https://reviews.llvm.org/D61194

[PATCH] D61274: [Sema][AST] Explicit visibility for OpenCL/CUDA kernels/variables

2019-04-30 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added inline comments. Comment at: lib/AST/Decl.cpp:738 + (isa(D) && D->hasAttr()) || + (isa(D) && D->hasAttr())) { +Visibility Vis = LV.getVisibility(); we also need this for `__constant__` variables. Repository: rC Clang CHANGES SINCE

[PATCH] D61396: [hip] Fix ambiguity from `>>>` of CUDA.

2019-05-01 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. LGTM too. Thanks Michael for fixing this. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D61396/new/ https://reviews.llvm.org/D61396

[PATCH] D51809: [CUDA][HIP] Fix assertion in LookupSpecialMember

2018-09-20 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. ping https://reviews.llvm.org/D51809

[PATCH] D52320: AMDGPU: add __builtin_amdgcn_update_dpp

2018-09-20 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl created this revision. yaxunl added reviewers: kzhuravl, b-sumner, arsenm. Herald added subscribers: t-tye, tpr, dstuttard, nhaehnle, wdng, jvesely. https://reviews.llvm.org/D52320 Files: include/clang/Basic/BuiltinsAMDGPU.def lib/CodeGen/CGBuiltin.cpp test/CodeGenOpenCL/builtins-amd

[PATCH] D52377: [HIP] Support early finalization of device code

2018-09-21 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl created this revision. yaxunl added a reviewer: tra. This patch introduced a driver option `--hip-early-finalize`. When enabled, clang will assume the device code in each translation unit does not call external functions except those in the device library, therefore it is possible to compil

[PATCH] D52377: [HIP] Support early finalization of device code

2018-09-21 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl updated this revision to Diff 166556. yaxunl added a comment. Fix comments. https://reviews.llvm.org/D52377 Files: include/clang/Driver/Options.td include/clang/Driver/Types.def lib/CodeGen/CGCUDANV.cpp lib/Driver/Driver.cpp lib/Driver/ToolChains/Clang.cpp lib/Driver/ToolChai

[PATCH] D52673: [HIP] Remove disabled irif library

2018-10-01 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. This seems to be a duplicate of https://reviews.llvm.org/D51857. Is HIP GitHub ready for this change? Repository: rC Clang https://reviews.llvm.org/D52673

[PATCH] D52377: [HIP] Support early finalization of device code

2018-10-01 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. In https://reviews.llvm.org/D52377#1242547, @tra wrote: > Overall the patch look OK. I'll take a closer look on Monday. > > Which mode do you expect will be most commonly used for HIP by default? With > this patch we'll have two different ways to do similar things in HIP

[PATCH] D52377: [HIP] Support early finalization of device code for -fno-gpu-rdc

2018-10-01 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl updated this revision to Diff 167774. yaxunl retitled this revision from "[HIP] Support early finalization of device code" to "[HIP] Support early finalization of device code for -fno-gpu-rdc". yaxunl edited the summary of this revision. yaxunl added a comment. Uses -fno-gpu-rdc for early

[PATCH] D52377: [HIP] Support early finalization of device code for -fno-gpu-rdc

2018-10-02 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl updated this revision to Diff 167952. yaxunl edited the summary of this revision. yaxunl added a comment. Added -f{no}-cuda-rdc as alias to -f{no}-gpu-rdc. https://reviews.llvm.org/D52377 Files: include/clang/Basic/LangOptions.def include/clang/Driver/Options.td include/clang/Drive

[PATCH] D52377: [HIP] Support early finalization of device code for -fno-gpu-rdc

2018-10-02 Thread Yaxun Liu via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rC343611: [HIP] Support early finalization of device code for -fno-gpu-rdc (authored by yaxunl, committed by ). Repository: rC Clang https://reviews.llvm.org/D52377 Files: include/clang/Basic/LangOpti

[PATCH] D52658: [OpenCL] Remove PIPE_RESERVE_ID_VALID_BIT from opencl-c.h

2018-10-03 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. I am OK with the change. Repository: rC Clang https://reviews.llvm.org/D52658

[PATCH] D51809: [CUDA][HIP] Fix ShouldDeleteSpecialMember for inherited constructors

2018-10-05 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl updated this revision to Diff 168479. yaxunl retitled this revision from "[CUDA][HIP] Fix assertion in LookupSpecialMember" to "[CUDA][HIP] Fix ShouldDeleteSpecialMember for inherited constructors". yaxunl edited the summary of this revision. yaxunl added a comment. Revised by Justin's co

[PATCH] D52891: [AMDGPU] Add -fvisibility-amdgpu-non-kernel-functions

2018-10-05 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. In https://reviews.llvm.org/D52891#1256207, @arsenm wrote: > I think the name needs work, but I'm not sure what it should be. I think it > should avoid using "non" and "amdgpu" I think dropping amdgpu is fine since we can add (AMDGPU only) to the description of the opt

[PATCH] D52891: [AMDGPU] Add -fvisibility-amdgpu-non-kernel-functions

2018-10-05 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. Can you also fix HIP toolchain? It is in HIPToolChain::addClangTargetOptions. Thanks. https://reviews.llvm.org/D52891

[PATCH] D51809: [CUDA][HIP] Fix ShouldDeleteSpecialMember for inherited constructors

2018-10-05 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl updated this revision to Diff 168500. yaxunl added a comment. fix a typo. https://reviews.llvm.org/D51809 Files: lib/Sema/SemaDeclCXX.cpp test/SemaCUDA/implicit-member-target-inherited.cu test/SemaCUDA/inherited-ctor.cu Index: test/SemaCUDA/inherited-ctor.cu ==

[PATCH] D57527: Do not copy long double and 128-bit fp format from aux target for AMDGPU

2019-01-31 Thread Yaxun Liu via Phabricator via cfe-commits
This revision was not accepted when it landed; it landed in state "Needs Review". This revision was automatically updated to reflect the committed changes. Closed by commit rC352801: Do not copy long double and 128-bit fp format from aux target for AMDGPU (authored by yaxunl, committed by ). Cha

[PATCH] D57716: Let AMDGPU compile MSVC headers containing vectorcall

2019-02-04 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl created this revision. yaxunl added a reviewer: rjmccall. Herald added subscribers: t-tye, tpr, dstuttard, nhaehnle, wdng, jvesely, kzhuravl. MSVC header files use vectorcall to differentiate overloaded functions, which causes failures for the AMDGPU target. Let AMDGPU target recognize vector

[PATCH] D57716: [CUDA][HIP] Check calling convention based on function target

2019-02-05 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl updated this revision to Diff 185375. yaxunl retitled this revision from "Let AMDGPU compile MSVC headers containing vectorcall" to "[CUDA][HIP] Check calling convention based on function target". yaxunl edited the summary of this revision. yaxunl added a reviewer: tra. yaxunl added a comme

[PATCH] D57829: [HIP] Disable emitting llvm.linker.options in device compilation

2019-02-06 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl created this revision. yaxunl added reviewers: tra, rjmccall. HIP toolchain does not support llvm.linker.options in device compilation, therefore disable it. https://reviews.llvm.org/D57829 Files: lib/CodeGen/CodeGenModule.cpp test/CodeGenCUDA/linker-options.cu Index: test/CodeGen

[PATCH] D57831: AMDGPU: set wchar_t and wint_t to be unsigned short on windows

2019-02-06 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl created this revision. yaxunl added a reviewer: b-sumner. Herald added subscribers: t-tye, Anastasia, tpr, dstuttard, nhaehnle, wdng, jvesely, kzhuravl. In MSVC wchar_t and wint_t are unsigned short. There is static_assert in MSVC headers checking that. Since HIP and OpenCL share the sam

[PATCH] D57829: [HIP] Disable emitting llvm.linker.options in device compilation

2019-02-06 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. In D57829#1387412 , @tra wrote: > Could you elaborate on why you want to disable this metadata? I think the > original idea of llvm.linker.options was that it should be ignored if the > back-end does not support it. If backend d

[PATCH] D58057: Allow bundle size to be 0 in clang-offload-bundler

2019-02-11 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl created this revision. yaxunl added a reviewer: rjmccall. HIP uses clang-offload-bundler to create fat binary. The bundle for host is empty. Currently clang-offload-bundler checks if the bundle size is 0 when unbundling. If so it will exit without unbundling the remaining bundles. This cau

[PATCH] D56871: [AMDGPU] Require at least protected visibility for certain symbols

2019-02-11 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl accepted this revision. yaxunl added a comment. This revision is now accepted and ready to land. LGTM. Thanks! CHANGES SINCE LAST ACTION https://reviews.llvm.org/D56871/new/ https://reviews.llvm.org/D56871

[PATCH] D58163: [CUDA][HIP] Use device side kernel and variable names when registering them

2019-02-12 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl created this revision. yaxunl added reviewers: rjmccall, tra. Herald added subscribers: jdoerfert, tpr. `__hipRegisterFunction` and `__hipRegisterVar` need to accept device side kernel and variable names so that HIP runtime can associate kernel stub functions in host code with kernel symb

[PATCH] D58163: [CUDA][HIP] Use device side kernel and variable names when registering them

2019-02-13 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl marked an inline comment as done. yaxunl added inline comments. Comment at: lib/CodeGen/CGCUDANV.cpp:412 + for (auto &&I : EmittedKernels) { +llvm::Constant *KernelName = makeConstantString(I.DeviceSideName); llvm::Constant *NullPtr = llvm::ConstantPointerNull::g

[PATCH] D58163: [CUDA][HIP] Use device side kernel and variable names when registering them

2019-02-13 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl marked an inline comment as done. yaxunl added inline comments. Comment at: lib/CodeGen/CGCUDANV.cpp:412 + for (auto &&I : EmittedKernels) { +llvm::Constant *KernelName = makeConstantString(I.DeviceSideName); llvm::Constant *NullPtr = llvm::ConstantPointerNull::g

[PATCH] D58163: [CUDA][HIP] Use device side kernel and variable names when registering them

2019-02-13 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl updated this revision to Diff 186711. yaxunl added a comment. Revised by Artem's comments. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D58163/new/ https://reviews.llvm.org/D58163 Files: include/clang/AST/ASTContext.h lib/AST/ASTContext.cpp lib/CodeGen/CGCUDANV.cpp lib/C

[PATCH] D58163: [CUDA][HIP] Use device side kernel and variable names when registering them

2019-02-13 Thread Yaxun Liu via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rC354004: [CUDA][HIP] Use device side kernel and variable names when registering them (authored by yaxunl, committed by ). Herald added a project: clang. Changed prior to commit: https://reviews.llvm.org/

[PATCH] D56411: [CUDA][HIP][Sema] Fix template kernel with function as template parameter

2019-02-14 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. In D56411#1365878 , @yaxunl wrote: > In D56411#1365745 , @rjmccall wrote: > > > In D56411#1365727 , @yaxunl wrote: > > > > > In D56411#1360010

[PATCH] D56411: [CUDA][HIP][Sema] Fix template kernel with function as template parameter

2019-02-14 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. In D56411#1398103 , @rjmccall wrote: > In D56411#1398097 , @yaxunl wrote: > > > In D56411#1365878 , @yaxunl wrote: > > > > > In D56411#1365745

[PATCH] D56411: [CUDA][HIP][Sema] Fix template kernel with function as template parameter

2019-02-14 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. In D56411#1398291 , @tra wrote: > >> That said, does CUDA have a general rule resolving `__host__` vs. > >> `__device__` overloads based on context? And does it allow overloading > >> based solely on `__host__` vs. `__device__`?

[PATCH] D56411: [CUDA][HIP][Sema] Fix template kernel with function as template parameter

2019-02-14 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. In D56411#1398329 , @rjmccall wrote: > In D56411#1398328 , @rjmccall wrote: > > > In D56411#1398291 , @tra wrote: > > > > > >> That said, does CUDA ha

[PATCH] D56411: [CUDA][HIP][Sema] Fix template kernel with function as template parameter

2019-02-15 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. In D56411#1398586 , @rjmccall wrote: > But what we've just been talking about is not a validity rule, it's an > overload-resolution rule. It's not *invalid* to use a device function as a > template argument to a host function tem

[PATCH] D56411: [CUDA][HIP][Sema] Fix template kernel with function as template parameter

2019-02-15 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. In D56411#1400251 , @rjmccall wrote: > It is totally unreasonable, at the time you are resolving a template > argument, to investigate how the corresponding template parameter is used > within the template and use that to shape ho

[PATCH] D58518: [HIP] change kernel stub name

2019-02-21 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl created this revision. yaxunl added reviewers: t-tye, tra. Add .stub to the kernel stub function name so that it is different from the kernel name in device code. This is necessary to let the debugger find the correct symbol for the kernel. https://reviews.llvm.org/D58518 Files: lib/CodeGen/CGCUDANV.cpp

[PATCH] D58518: [HIP] change kernel stub name

2019-02-21 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl marked an inline comment as done. yaxunl added a comment. In D58518#1406124 , @tra wrote: > My guess is that this is needed because HIP debugger can see symbols from > both host and device executables at the same time. Is that so? > > If that's the

[PATCH] D58509: [CodeGen] Fix string literal address space casting.

2019-02-21 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl accepted this revision. yaxunl added a comment. This revision is now accepted and ready to land. LGTM. Thanks! Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D58509/new/ https://reviews.llvm.org/D58509

[PATCH] D56411: [CUDA][HIP][Sema] Fix template kernel with function as template parameter

2019-02-21 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl updated this revision to Diff 187832. yaxunl added a comment. I would like to fix the validation issue only and leave the overload resolution issue for future. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D56411/new/ https://reviews.llvm.org/D56411 Files: lib/Sema/SemaCUDA.cp

[PATCH] D58518: [HIP] change kernel stub name

2019-02-21 Thread Yaxun Liu via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rC354615: [HIP] change kernel stub name (authored by yaxunl, committed by ). Herald added a project: clang. Changed prior to commit: https://reviews.llvm.org/D58518?vs=187815&id=187840#toc Repository:

[PATCH] D57716: [CUDA][HIP] Check calling convention based on function target

2019-02-21 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl updated this revision to Diff 187862. yaxunl added a comment. Herald added a subscriber: jdoerfert. Revised by Artem's comments. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D57716/new/ https://reviews.llvm.org/D57716 Files: lib/Sema/SemaDeclAttr.cpp test/SemaCUDA/amdgpu-win

[PATCH] D57716: [CUDA][HIP] Check calling convention based on function target

2019-02-22 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl marked an inline comment as done. yaxunl added inline comments. Comment at: test/SemaCUDA/amdgpu-windows-vectorcall.cu:9-10 + +template struct A<_Ret (__cdecl _Arg0::*)(_Types) > { }; +template struct A<_Ret (__vectorcall _Arg0::*)(_Types) > {}; + tra wr

[PATCH] D57716: [CUDA][HIP] Check calling convention based on function target

2019-02-22 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl updated this revision to Diff 187943. yaxunl added a comment. modify test to use non-template functions. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D57716/new/ https://reviews.llvm.org/D57716 Files: lib/Sema/SemaDeclAttr.cpp test/SemaCUDA/amdgpu-windows-vectorcall.cu Ind

[PATCH] D58518: [HIP] change kernel stub name

2019-02-22 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl updated this revision to Diff 187980. yaxunl added a comment. Fixed regressions. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D58518/new/ https://reviews.llvm.org/D58518 Files: lib/CodeGen/CGCUDANV.cpp lib/CodeGen/CodeGenModule.cpp test/CodeGenCUDA/kernel-stub-name.cu In

[PATCH] D58518: [HIP] change kernel stub name

2019-02-22 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added inline comments. Comment at: lib/CodeGen/CodeGenModule.cpp:1059 +FD->hasAttr()) + MangledName = MangledName + ".stub"; + tra wrote: > Changing mangled name exposes this change to a wider scope of potential > issues. Is the mangled name

[PATCH] D34367: CodeGen: Fix address space of indirect function argument

2017-08-25 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl marked 6 inline comments as done. yaxunl added inline comments. Comment at: lib/CodeGen/CGCall.cpp:3851 + ->getType() + ->getPointerAddressSpace(); const unsigned ArgAddrSpace = --

[PATCH] D34367: CodeGen: Fix address space of indirect function argument

2017-08-25 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl marked an inline comment as done. yaxunl added inline comments. Comment at: lib/CodeGen/CGCall.cpp:3861 < Align.getQuantity()) || (ArgInfo.getIndirectByVal() && (RVAddrSpace != ArgAddrSpace))) { // Create an aligned temporary, and co

[PATCH] D34367: CodeGen: Fix address space of indirect function argument

2017-08-25 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added inline comments. Comment at: lib/CodeGen/CGCall.cpp:3861 < Align.getQuantity()) || (ArgInfo.getIndirectByVal() && (RVAddrSpace != ArgAddrSpace))) { // Create an aligned temporary, and copy to it. rjmccall wrote

[PATCH] D36410: [OpenCL] Handle taking address of block captures

2017-08-30 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. In https://reviews.llvm.org/D36410#855358, @bader wrote: > In https://reviews.llvm.org/D36410#855282, @Anastasia wrote: > > > Ok, I will update it to be implicitly generic then. Just to be sure, @bader > > do you agree on this too? > > > > > > An alternative approached co

[PATCH] D34367: CodeGen: Fix address space of indirect function argument

2017-08-31 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added inline comments. Comment at: lib/CodeGen/CGCall.cpp:3861 < Align.getQuantity()) || (ArgInfo.getIndirectByVal() && (RVAddrSpace != ArgAddrSpace))) { // Create an aligned temporary, and copy to it. yaxunl wrote:

[PATCH] D34367: CodeGen: Fix address space of indirect function argument

2017-08-31 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl updated this revision to Diff 113443. yaxunl added a comment. Revised by John's comments. https://reviews.llvm.org/D34367 Files: lib/CodeGen/CGCall.cpp lib/CodeGen/CGCall.h lib/CodeGen/CGDecl.cpp test/CodeGenCXX/amdgcn-func-arg.cpp test/CodeGenOpenCL/addr-space-struct-arg.cl I

[PATCH] D36678: [OpenCL] Do not use vararg in emitted functions for enqueue_kernel

2017-08-31 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl updated this revision to Diff 113483. yaxunl marked 3 inline comments as done. yaxunl added a comment. update tests. https://reviews.llvm.org/D36678 Files: lib/CodeGen/CGBuiltin.cpp test/CodeGenOpenCL/cl20-device-side-enqueue.cl Index: test/CodeGenOpenCL/cl20-device-side-enqueue.cl

[PATCH] D37386: [AMDGPU] Implement infrastructure to set options in AMDGPUToolChain

2017-09-01 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. Is it possible to add a test for this? Thanks. https://reviews.llvm.org/D37386 ___ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D36678: [OpenCL] Do not use vararg in emitted functions for enqueue_kernel

2017-09-03 Thread Yaxun Liu via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rL312441: [OpenCL] Do not use vararg in emitted functions for enqueue_kernel (authored by yaxunl). Changed prior to commit: https://reviews.llvm.org/D36678?vs=113483&id=113694#toc Repository: rL LLVM

[PATCH] D37386: [AMDGPU] Implement infrastructure to set options in AMDGPUToolChain

2017-09-04 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl accepted this revision. yaxunl added a comment. LGTM. Thanks. https://reviews.llvm.org/D37386

[PATCH] D36410: [OpenCL] Handle taking address of block captures

2017-09-05 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. In https://reviews.llvm.org/D36410#856907, @bader wrote: > In https://reviews.llvm.org/D36410#856716, @yaxunl wrote: > > > The captured variable is still passed by value. The address taking is on > > the duplicate of the captured variable, not on the original variable. >

[PATCH] D35082: [OpenCL] Add LangAS::opencl_private to represent private address space in AST

2017-09-05 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl marked an inline comment as done. yaxunl added a comment. ping https://reviews.llvm.org/D35082

[PATCH] D37568: [AMDGPU] Allow flexible register names in inline asm constraints

2017-09-07 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl created this revision. Herald added subscribers: eraman, t-tye, tpr, dstuttard, nhaehnle, wdng, kzhuravl. Currently AMDGPU inline asm only allows "v" and "s" as register names in constraints. This patch allows the following register names in constraints: (n and m are unsigned integers, n < m)

[PATCH] D36410: [OpenCL] Handle taking address of block captures

2017-09-07 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl accepted this revision. yaxunl added a comment. This revision is now accepted and ready to land. LGTM as a temporary measure until we have a solution for properly emitting blocks as enqueued kernel. https://reviews.llvm.org/D36410

[PATCH] D36410: [OpenCL] Handle taking address of block captures

2017-09-07 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. In https://reviews.llvm.org/D36410#863426, @Anastasia wrote: > In https://reviews.llvm.org/D36410#863409, @yaxunl wrote: > > > LGTM as a temporary measure until we have a solution for properly emitting > > blocks as enqueued kernel. > > > Should I start discussion with Kh

[PATCH] D37568: [AMDGPU] Allow flexible register names in inline asm constraints

2017-09-07 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added inline comments. Comment at: test/Sema/inline-asm-validate-amdgpu.cl:38 +__asm("v_add_f32_e32 v1, v2, v3" : "=v1"(ci) : "v2"(ai), "v3"(bi) : ); /// expected-error {{invalid output constraint '=v1' in asm}} +__asm("v_add_f32_e32 v1, v2, v3" : "=v1:2"(ci) : "v

[PATCH] D37568: [AMDGPU] Allow flexible register names in inline asm constraints

2017-09-07 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl updated this revision to Diff 114223. yaxunl marked 3 inline comments as done. yaxunl edited the summary of this revision. yaxunl added a comment. Revised by Matt's comments. https://reviews.llvm.org/D37568 Files: lib/Basic/Targets/AMDGPU.h test/CodeGenOpenCL/amdgcn-inline-asm.cl t

[PATCH] D36410: [OpenCL] Handle taking address of block captures

2017-09-07 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. In https://reviews.llvm.org/D36410#863579, @Anastasia wrote: > In https://reviews.llvm.org/D36410#863508, @yaxunl wrote: > > > In https://reviews.llvm.org/D36410#863426, @Anastasia wrote: > > > > > In https://reviews.llvm.org/D36410#863409, @yaxunl wrote: > > > > > > > LGT
