[PATCH] D34342: [OpenCL] Fix code generation of function-scope constant samplers.

2017-06-19 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added inline comments. Comment at: test/CodeGenOpenCL/sampler.cl:62 + + const sampler_t const_smp = CLK_ADDRESS_CLAMP_TO_EDGE | CLK_NORMALIZED_COORDS_TRUE | CLK_FILTER_LINEAR; + // CHECK: [[CONST_SAMP:%[0-9]+]] = call %opencl.sampler_t addrspace(2)* @__translate_sample

[PATCH] D34342: [OpenCL] Fix code generation of function-scope constant samplers.

2017-06-19 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl accepted this revision. yaxunl added a comment. This revision is now accepted and ready to land. LGTM. Thanks. https://reviews.llvm.org/D34342 ___ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listi

[PATCH] D33706: CodeGen: Cast temporary variable to proper address space

2017-06-19 Thread Yaxun Liu via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rL305711: CodeGen: Cast temporary variable to proper address space (authored by yaxunl). Changed prior to commit: https://reviews.llvm.org/D33706?vs=101227&id=103070#toc Repository: rL LLVM https://re

[PATCH] D33989: [OpenCL] Allow targets to select address space per type

2017-06-19 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added inline comments. Comment at: include/clang/Basic/TargetInfo.h:1041 +default: + return LangAS::Default; +} Anastasia wrote: > bader wrote: > > yaxunl wrote: > > > bader wrote: > > > > yaxunl wrote: > > > > > I think the default (including

[PATCH] D34367: CodeGen: Fix address space of indirect function argument

2017-06-19 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl created this revision. The byval function argument is in the alloca address space in LLVM IR. However, during Clang codegen for C++, the address space of an indirect function argument should match its address space in the source code, even for a byval argument. This is because destructor of the byva

[PATCH] D34367: CodeGen: Fix address space of indirect function argument

2017-06-20 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added inline comments. Comment at: lib/CodeGen/CGCall.cpp:1605 + ? CGM.getDataLayout().getAllocaAddrSpace() + : getContext().getTargetAddressSpace(LangAS::Default)); break; rjmccall wrote: > Everything about your reasoning

[PATCH] D33842: [AMDGPU] Fix address space of global variable

2017-06-20 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl marked 3 inline comments as done. yaxunl added inline comments. Comment at: include/clang/Basic/TargetInfo.h:959 + /// \brief Return the target address space which is read only and can be + /// casted to the generic address space. + virtual llvm::Optional getTargetConst

[PATCH] D33842: [AMDGPU] Fix address space of global variable

2017-06-22 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl updated this revision to Diff 103613. yaxunl marked 2 inline comments as done. yaxunl added a comment. Revised per John's comments. https://reviews.llvm.org/D33842 Files: include/clang/Basic/TargetInfo.h lib/Basic/Targets.cpp lib/CodeGen/CGBlocks.cpp lib/CodeGen/CGDecl.cpp lib/C

[PATCH] D34367: CodeGen: Fix address space of indirect function argument

2018-02-28 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl updated this revision to Diff 136301. yaxunl added a comment. Revised per John's comments. Added CallArg::copyInto and modified CallArg::getRValue() to return an independent r-value by default. However, some cases expect an l-value that is not copied, so I added an optional argument to Call

[PATCH] D43911: [AMDGPU] Clean up old address space mapping and fix constant address space value

2018-02-28 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl created this revision. yaxunl added reviewers: t-tye, b-sumner, arsenm. Herald added subscribers: tpr, dstuttard, nhaehnle, wdng, kzhuravl. https://reviews.llvm.org/D43911 Files: lib/Basic/Targets/AMDGPU.cpp lib/Basic/Targets/AMDGPU.h test/CodeGenCXX/cxx0x-initializer-stdinitializerl

[PATCH] D43911: [AMDGPU] Clean up old address space mapping and fix constant address space value

2018-03-02 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. Ping https://reviews.llvm.org/D43911

[PATCH] D43911: [AMDGPU] Clean up old address space mapping and fix constant address space value

2018-03-05 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. Will make recommended changes when committing. https://reviews.llvm.org/D43911

[PATCH] D43911: [AMDGPU] Clean up old address space mapping and fix constant address space value

2018-03-05 Thread Yaxun Liu via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rL326725: [AMDGPU] Clean up old address space mapping and fix constant address space value (authored by yaxunl). Herald added a subscriber: llvm-commits. Changed prior to commit: https://re

[PATCH] D43783: [OpenCL] Remove block invoke function from emitted block literal struct

2018-03-05 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. ping https://reviews.llvm.org/D43783

[PATCH] D43783: [OpenCL] Remove block invoke function from emitted block literal struct

2018-03-06 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added inline comments. Comment at: lib/CodeGen/CGBlocks.cpp:1065-1067 + llvm::Value *FuncPtr; + if (!CGM.getLangOpts().OpenCL) { bader wrote: > I think it would be more readable if we merge this if statement with the if > statement at the line #1103.

[PATCH] D34367: CodeGen: Fix address space of indirect function argument

2018-03-06 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl updated this revision to Diff 137279. yaxunl added a comment. Cleaned up CallArg::getRValue() so that it always returns an independent value. https://reviews.llvm.org/D34367 Files: lib/CodeGen/CGAtomic.cpp lib/CodeGen/CGCall.cpp lib/CodeGen/CGCall.h lib/CodeGen/CGClass.cpp lib/CodeGe

[PATCH] D34367: CodeGen: Fix address space of indirect function argument

2018-03-07 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl marked 20 inline comments as done. yaxunl added inline comments. Comment at: lib/CodeGen/CGCall.h:248 + return HasLV ? LV.getAddress() : RV.getAggregateAddress(); +} + rjmccall wrote: > Part of my thinking in suggesting this representation change

[PATCH] D34367: CodeGen: Fix address space of indirect function argument

2018-03-07 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl updated this revision to Diff 137421. yaxunl marked an inline comment as done. yaxunl added a comment. Revised per John's comments. Removed CallArg::getAggregateAddress(). https://reviews.llvm.org/D34367 Files: lib/CodeGen/CGAtomic.cpp lib/CodeGen/CGCall.cpp lib/CodeGen/CGCall.h l

[PATCH] D43783: [OpenCL] Remove block invoke function from emitted block literal struct

2018-03-07 Thread Yaxun Liu via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rC326937: [OpenCL] Remove block invoke function from emitted block literal struct (authored by yaxunl). Repository: rC Clang https://reviews.llvm.org/D43783 Files: lib/CodeGen/CGBlocks.

[PATCH] D34367: CodeGen: Fix address space of indirect function argument

2018-03-07 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl updated this revision to Diff 137460. yaxunl marked an inline comment as done. yaxunl edited the summary of this revision. yaxunl added a comment. Added a comment about emitting the non-null argument check. https://reviews.llvm.org/D34367 Files: lib/CodeGen/CGAtomic.cpp lib/CodeGen/CGCall.cpp

[PATCH] D34367: CodeGen: Fix address space of indirect function argument

2018-03-07 Thread Yaxun Liu via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rL326946: CodeGen: Fix address space of indirect function argument (authored by yaxunl). Herald added a subscriber: llvm-commits. Changed prior to commit: https://reviews.llvm.org/D34367?vs

[PATCH] D44445: CodeGen: Reduce LValue and CallArgList memory footprint before recommitting r326946

2018-03-13 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl created this revision. yaxunl added reviewers: rsmith, rjmccall. The recent change r326946 (https://reviews.llvm.org/D34367) causes a regression in Eigen due to the increased memory footprint of CallArg. This patch reduces LValue size from 112 to 96 bytes and reduces inline argument count of CallA

[PATCH] D44445: CodeGen: Reduce LValue and CallArgList memory footprint before recommitting r326946

2018-03-13 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl updated this revision to Diff 138252. yaxunl marked an inline comment as done. yaxunl added a comment. Saturate alignment when it is too large. https://reviews.llvm.org/D44445 Files: lib/CodeGen/CGCall.h lib/CodeGen/CGValue.h test/CodeGenCXX/deep-ast-tree.cpp Index: test/CodeGenCX
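
The snippet above is cut off, so how the patch actually saturates the alignment is not visible here. As a rough illustration of the general idea only, the following C++ sketch clamps an alignment that does not fit into a narrow storage field instead of letting it wrap. The names kAlignBits, kMaxStoredAlign and saturateAlignment, and the 31-bit field width, are assumptions made up for this example and are not taken from Clang.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical field width; the width used in the real patch is not
// shown in the snippet, so 31 bits is an assumption for illustration.
constexpr unsigned kAlignBits = 31;

// Largest power of two that still fits into kAlignBits bits.
constexpr std::uint64_t kMaxStoredAlign = std::uint64_t{1} << (kAlignBits - 1);

// Returns the alignment to store: the value itself when it fits,
// otherwise the largest representable power of two (saturation).
inline std::uint32_t saturateAlignment(std::uint64_t AlignInBytes) {
  // Alignments are expected to be non-zero powers of two.
  assert(AlignInBytes != 0 && (AlignInBytes & (AlignInBytes - 1)) == 0);
  return static_cast<std::uint32_t>(
      AlignInBytes > kMaxStoredAlign ? kMaxStoredAlign : AlignInBytes);
}
```

Saturating rather than truncating keeps the stored value a valid, conservative alignment: any object aligned to the original value is also aligned to the clamped one.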

[PATCH] D44445: CodeGen: Reduce LValue and CallArgList memory footprint before recommitting r326946

2018-03-14 Thread Yaxun Liu via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rC327515: CodeGen: Reduce LValue and CallArgList memory footprint before recommitting… (authored by yaxunl). Changed prior to commit: https://reviews.llvm.org/D44445?vs=138252&id=138358#toc

[PATCH] D44445: CodeGen: Reduce LValue and CallArgList memory footprint before recommitting r326946

2018-03-14 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. I have to remove the lit test since it causes a failure on atom http://lab.llvm.org:8011/builders/clang-atom-d525-fedora-rel/builds/15477 It seems when the lit test is running on atom, it is compiled with the default CPU for x86_64, therefore __atom__ is not defined and there i

[PATCH] D44533: [AMDGPU] Fix codegen for inline assembly

2018-03-15 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl created this revision. yaxunl added reviewers: arsenm, rampitec, b-sumner. Herald added subscribers: eraman, t-tye, tpr, dstuttard, nhaehnle, wdng, kzhuravl. Need to override convertConstraint to recognise amdgpu-specific register names. https://reviews.llvm.org/D44533 Files: lib/Basi

[PATCH] D44533: [AMDGPU] Fix codegen for inline assembly

2018-03-16 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl updated this revision to Diff 138768. yaxunl marked an inline comment as done. yaxunl added a comment. Fix the comment. https://reviews.llvm.org/D44533 Files: lib/Basic/Targets/AMDGPU.h lib/CodeGen/CGStmt.cpp test/CodeGenOpenCL/inline-asm-amdgcn.cl test/Sema/inline-asm-validate-a

[PATCH] D47555: [HIP] Fix unbundling

2018-05-30 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl created this revision. yaxunl added a reviewer: rjmccall. HIP uses clang-offload-bundler to bundle intermediate files for host and different gpu archs together. When a file is unbundled, clang-offload-bundler should be called only once, and the objects for host and different gpu archs shoul

[PATCH] D47694: [CUDA][HIP] Do not emit type info when compiling for device

2018-06-03 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl created this revision. yaxunl added reviewers: rjmccall, tra. CUDA/HIP does not support RTTI on the device side, so there is no point in emitting type info when compiling for device. Emitting type info for device not only clutters the IR with useless global variables, but also causes un

[PATCH] D47376: [CUDA][HIP] Do not offload for -M

2018-06-03 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. ping https://reviews.llvm.org/D47376

[PATCH] D47694: [CUDA][HIP] Do not emit type info when compiling for device

2018-06-03 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. In https://reviews.llvm.org/D47694#1120367, @rjmccall wrote: > Why not just have the driver disable RTTI in the frontend invocation? CUDA/HIP uses single source for device and host. The host code may depend on RTTI, e.g., an application may include some boost headers wh

[PATCH] D47694: [CUDA][HIP] Do not emit type info when compiling for device

2018-06-04 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. In https://reviews.llvm.org/D47694#1121037, @rjmccall wrote: > In https://reviews.llvm.org/D47694#1120375, @yaxunl wrote: > > > In https://reviews.llvm.org/D47694#1120367, @rjmccall wrote: > > > > > Why not just have the driver disable RTTI in the frontend invocation? > >

[PATCH] D47733: [CUDA][HIP] Set kernel calling convention before arrange function

2018-06-04 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl created this revision. yaxunl added reviewers: rjmccall, tra. Currently clang sets the kernel calling convention for CUDA/HIP after arranging the function, which causes an incorrect kernel function type since the type depends on the calling convention. This patch moves setting kernel convention before arranging

[PATCH] D47694: [CUDA][HIP] Do not emit type info when compiling for device

2018-06-04 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl updated this revision to Diff 149892. yaxunl added a comment. Revised per John's comments. https://reviews.llvm.org/D47694 Files: lib/CodeGen/CodeGenModule.cpp test/CodeGenCUDA/device-vtable.cu Index: test/CodeGenCUDA/device-vtable.cu

[PATCH] D47694: [CUDA][HIP] Do not emit type info when compiling for device

2018-06-05 Thread Yaxun Liu via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rC334021: [CUDA][HIP] Do not emit type info when compiling for device (authored by yaxunl). Repository: rC Clang https://reviews.llvm.org/D47694 Files: lib/CodeGen/CodeGenModule.cpp t

[PATCH] D47733: [CUDA][HIP] Set kernel calling convention before arrange function

2018-06-06 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl updated this revision to Diff 150106. yaxunl added a comment. Revised per Artem's comments. https://reviews.llvm.org/D47733 Files: lib/CodeGen/CGCall.cpp lib/CodeGen/CodeGenModule.cpp lib/CodeGen/TargetInfo.cpp lib/CodeGen/TargetInfo.h test/CodeGenCUDA/kernel-args.cu Index: tes

[PATCH] D47555: [HIP] Fix unbundling

2018-06-06 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl marked 3 inline comments as done. yaxunl added inline comments. Comment at: lib/Driver/Driver.cpp:3895 +if (UI.DependentOffloadKind == Action::OFK_Host) + Arch = StringRef(); +else tra wrote: > Should it be something more descripti

[PATCH] D47555: [HIP] Fix unbundling

2018-06-06 Thread Yaxun Liu via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. yaxunl marked 3 inline comments as done. Closed by commit rC334128: [HIP] Fix unbundling (authored by yaxunl). Changed prior to commit: https://reviews.llvm.org/D47555?vs=149191&id=150188#toc Repository:

[PATCH] D47958: [CUDA][HIP] Allow CUDA kernel to have amdgpu kernel attributes

2018-06-08 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl created this revision. yaxunl added a reviewer: kzhuravl. Herald added subscribers: t-tye, tpr, dstuttard, nhaehnle, wdng. Some HIP applications, e.g. TensorFlow 1.3, use amdgpu kernel attributes. https://reviews.llvm.org/D47958 Files: lib/Sema/SemaDeclAttr.cpp test/CodeGenCUDA/

[PATCH] D47733: [CUDA][HIP] Set kernel calling convention before arrange function

2018-06-08 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl updated this revision to Diff 150559. yaxunl marked an inline comment as done. yaxunl added a comment. Wrap long RUN lines in test. https://reviews.llvm.org/D47733 Files: lib/CodeGen/CGCall.cpp lib/CodeGen/CodeGenModule.cpp lib/CodeGen/TargetInfo.cpp lib/CodeGen/TargetInfo.h te

[PATCH] D47958: [CUDA][HIP] Allow CUDA `__global__` functions to have amdgpu kernel attributes

2018-06-08 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl marked an inline comment as done. yaxunl added a comment. In https://reviews.llvm.org/D47958#1126875, @tra wrote: > Drive-by review: > > The patch could use a better description. > Something that describes *what* the patch does (E.g. enforce that attributes > X/Y/Z are only applied to __

[PATCH] D47733: [CUDA][HIP] Set kernel calling convention before arrange function

2018-06-11 Thread Yaxun Liu via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rL334457: [CUDA][HIP] Set kernel calling convention before arrange function (authored by yaxunl). Herald added a subscriber: llvm-commits. Changed prior to commit: https://reviews.llvm.org/

[PATCH] D47958: [CUDA][HIP] Allow CUDA __global__ functions to have amdgpu kernel attributes

2018-06-11 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl updated this revision to Diff 150872. yaxunl marked 2 inline comments as done. yaxunl retitled this revision from "[CUDA][HIP] Allow CUDA `__global__` functions to have amdgpu kernel attributes" to "[CUDA][HIP] Allow CUDA __global__ functions to have amdgpu kernel attributes". yaxunl added

[PATCH] D47958: [CUDA][HIP] Allow CUDA __global__ functions to have amdgpu kernel attributes

2018-06-12 Thread Yaxun Liu via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rL334561: [CUDA][HIP] Allow CUDA __global__ functions to have amdgpu kernel attributes (authored by yaxunl). Herald added a subscriber: llvm-commits. Changed prior to commit: https://review

[PATCH] D48287: [HIP] Support -fcuda-flush-denormals-to-zero for amdgcn

2018-06-18 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl created this revision. yaxunl added reviewers: b-sumner, tra. yaxunl added a reviewer: scchan. https://reviews.llvm.org/D48287 Files: lib/Frontend/CompilerInvocation.cpp test/CodeGenCUDA/flush-denormals.cu Index: test/CodeGenCUDA/flush-denormals.cu ===

[PATCH] D48287: [HIP] Support -fcuda-flush-denormals-to-zero for amdgcn

2018-06-18 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl updated this revision to Diff 151748. yaxunl added a comment. Correct comments in test. https://reviews.llvm.org/D48287 Files: lib/Frontend/CompilerInvocation.cpp test/CodeGenCUDA/flush-denormals.cu Index: test/CodeGenCUDA/flush-denormals.cu

[PATCH] D48438: [Sema] Updated note for address spaces to print the type.

2018-06-21 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl accepted this revision. yaxunl added a comment. This revision is now accepted and ready to land. LGTM. Thanks. https://reviews.llvm.org/D48438

[PATCH] D48419: [OpenCL] Fixed parsing of address spaces for C++

2018-06-21 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. Did you notice the bug that C++ silently allows implicit casting of a pointer to a pointer type with a different address space? Do you have a fix for that? https://reviews.llvm.org/D48419

[PATCH] D48419: [OpenCL] Fixed parsing of address spaces for C++

2018-06-21 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. In https://reviews.llvm.org/D48419#1139749, @Anastasia wrote: > In https://reviews.llvm.org/D48419#1139601, @yaxunl wrote: > > > Did you notice the bug that C++ silently allows implicit casting of a > > pointer to a pointer type with different address space? > > > > Do yo

[PATCH] D48455: Remove hip.amdgcn.bc hc.amdgcn.bc from HIP Toolchains

2018-06-22 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl accepted this revision. yaxunl added a comment. This revision is now accepted and ready to land. LGTM. Thanks! Repository: rC Clang https://reviews.llvm.org/D48455

[PATCH] D48419: [OpenCL] Fixed parsing of address spaces for C++

2018-06-22 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl accepted this revision. yaxunl added a comment. This revision is now accepted and ready to land. LGTM. Thanks! https://reviews.llvm.org/D48419

[PATCH] D48493: [HIP] Support flush denorms bitcode

2018-06-22 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. I think we should have a lit test for linked bitcode files. Comment at: lib/Driver/ToolChains/HIP.cpp:78 +std::string OCLC_daz_opt; +if (Args.hasArg(options::OPT_fcuda_flush_denormals_to_zero)) + OCLC_daz_opt = "oclc_daz_opt_on.amdgcn.bc"; -

[PATCH] D51809: [CUDA][HIP] Fix ShouldDeleteSpecialMember for inherited constructors

2018-10-09 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added inline comments. Comment at: lib/Sema/SemaDeclCXX.cpp:7231 +if (ICI) + CSM = getSpecialMember(MD); + jlebar wrote: > LGTM, but perhaps we should use a new variable instead of modifying `CSM` in > case someone adds code beneath this branch?

[PATCH] D51809: [CUDA][HIP] Fix ShouldDeleteSpecialMember for inherited constructors

2018-10-09 Thread Yaxun Liu via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rL344057: [CUDA][HIP] Fix ShouldDeleteSpecialMember for inherited constructors (authored by yaxunl). Herald added a subscriber: llvm-commits. Changed prior to commit: https://reviews.llvm.o

[PATCH] D52673: [HIP] Remove disabled irif library

2018-10-09 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added inline comments. Comment at: lib/Driver/ToolChains/HIP.cpp:85 C.addTempFile(C.getArgs().MakeArgString(TmpName)); CmdArgs.push_back(OutputFileName); SmallString<128> ExecPath(C.getDriver().Dir); maybe we should put hip.amdgcn.bc at the be

[PATCH] D52891: [AMDGPU] Add -fvisibility-amdgpu-non-kernel-functions

2018-10-10 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. In https://reviews.llvm.org/D52891#1258070, @scott.linder wrote: > I will update the patch to modify the HIP toolchain and to add tests for > global variables. > > As far as the semantics are concerned, are we OK with this being AMDGPU only? > I do not see a means of det

[PATCH] D52673: [HIP] Remove disabled irif library

2018-10-10 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl accepted this revision. yaxunl added a comment. This revision is now accepted and ready to land. LGTM. Thanks! https://reviews.llvm.org/D52673

[PATCH] D53153: [OpenCL] Mark namespace scope variables and kernel functions with default visibility

2018-10-12 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. This approach is trying to make OpenCL kernels and variables exceptions to the -fvisibility option. However, it does not provide users with choices. What if a user really wants to change the visibility of kernels and variables by -fvisibility? I think it is more like a hack compar

[PATCH] D53153: [OpenCL] Mark namespace scope variables and kernel functions with default visibility

2018-10-12 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. In https://reviews.llvm.org/D53153#1263771, @scott.linder wrote: > The rationale is that `-fvisibility` only affects the default, and already > does not apply in many cases. For example, see the rest of the conditions > above the fvisibility check in `getLVForNamespaceSc

[PATCH] D53153: [OpenCL] Mark namespace scope variables and kernel functions with default visibility

2018-10-12 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. In https://reviews.llvm.org/D53153#1263810, @scott.linder wrote: > In https://reviews.llvm.org/D53153#1263804, @rsmith wrote: > > > In https://reviews.llvm.org/D53153#1263771, @scott.linder wrote: > > > > > The rationale is that `-fvisibility` only affects the default, and

[PATCH] D53295: [OpenCL] Mark load of block invoke function as invariant

2018-10-15 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl created this revision. yaxunl added reviewers: rjmccall, Anastasia. OpenCL v2.0 s6.12.5: Block variable declarations are implicitly qualified with const. Therefore all block variables must be initialized at declaration time and may not be reassigned. As such, load of block in

[PATCH] D53295: [OpenCL] Mark load of block invoke function as invariant

2018-10-16 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added inline comments. Comment at: lib/CodeGen/CGBlocks.cpp:1318 +CGM.getModule().getMDKindID("invariant.load"), +llvm::MDNode::get(getLLVMContext(), None)); + rjmccall wrote: > OpenCL blocks are still potentially function-local, right? I

[PATCH] D53325: Disable code object version 3 for HIP toolchain

2018-10-16 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl created this revision. yaxunl added a reviewer: kzhuravl. Herald added a subscriber: tpr. The AMDGPU backend will switch to code object version 3 by default. Since the HIP runtime is not ready, disable it until the runtime is ready. https://reviews.llvm.org/D53325 Files: lib/Driver/ToolChains/

[PATCH] D53325: Disable code object version 3 for HIP toolchain

2018-10-16 Thread Yaxun Liu via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rL344630: Disable code object version 3 for HIP toolchain (authored by yaxunl). Herald added a subscriber: llvm-commits. Changed prior to commit: https://reviews.llvm.org/D53325?vs=169831&i

[PATCH] D53295: Mark store and load of block invoke function as invariant.group

2018-10-16 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl updated this revision to Diff 169864. yaxunl retitled this revision from "[OpenCL] Mark load of block invoke function as invariant" to "Mark store and load of block invoke function as invariant.group". yaxunl edited the summary of this revision. yaxunl added a comment. Herald added a subsc

[PATCH] D52320: AMDGPU: add __builtin_amdgcn_update_dpp

2018-10-16 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl updated this revision to Diff 169882. yaxunl edited the summary of this revision. yaxunl added a comment. emit llvm.amdgcn.update.dpp for __builtin_amdgcn_mov_dpp. https://reviews.llvm.org/D52320 Files: include/clang/Basic/BuiltinsAMDGPU.def lib/CodeGen/CGBuiltin.cpp test/CodeGenOp

[PATCH] D52320: AMDGPU: add __builtin_amdgcn_update_dpp

2018-10-16 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. Brian confirmed that the extra argument for dpp mov should be the first one, so mov_dpp(x, ...) --> update_dpp(undef, x, ...). I will fix that when committing. https://reviews.llvm.org/D52320
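
The argument mapping described in this comment, mov_dpp(x, ...) --> update_dpp(undef, x, ...), can be modelled with a small C++ sketch. This is only an illustration and not the Clang codegen itself: operands are represented as strings, "undef" stands in for an LLVM undef value, and the function name movDppToUpdateDppArgs is made up for this example.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Illustrative model only: builds the operand list for update_dpp from
// the operand list of mov_dpp by prepending an "undef" placeholder and
// keeping all original operands in their original order.
inline std::vector<std::string>
movDppToUpdateDppArgs(const std::vector<std::string> &MovArgs) {
  assert(!MovArgs.empty() && "mov_dpp needs at least the source operand");
  std::vector<std::string> UpdateArgs;
  UpdateArgs.reserve(MovArgs.size() + 1);
  UpdateArgs.push_back("undef");  // extra first operand
  UpdateArgs.insert(UpdateArgs.end(), MovArgs.begin(), MovArgs.end());
  return UpdateArgs;
}
```

For example, the hypothetical operand list {"x", "ctrl"} becomes {"undef", "x", "ctrl"}: the original source operand moves to the second position and the remaining operands follow unchanged.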

[PATCH] D52320: AMDGPU: add __builtin_amdgcn_update_dpp

2018-10-16 Thread Yaxun Liu via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rC344665: AMDGPU: add __builtin_amdgcn_update_dpp (authored by yaxunl). Changed prior to commit: https://reviews.llvm.org/D52320?vs=169882&id=169942#toc Repository: rC Clang https://rev

[PATCH] D53472: Add gfx904 and gfx906 to GPU Arch

2018-10-20 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl created this revision. yaxunl added a reviewer: tra. Herald added a subscriber: jholewinski. https://reviews.llvm.org/D53472 Files: include/clang/Basic/Cuda.h lib/Basic/Cuda.cpp lib/Basic/Targets/NVPTX.cpp Index: lib/Basic/Targets/NVPTX.cpp =

[PATCH] D53472: Add gfx904 and gfx906 to GPU Arch

2018-10-22 Thread Yaxun Liu via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rL344996: Add gfx904 and gfx906 to GPU Arch (authored by yaxunl). Herald added a subscriber: llvm-commits. Changed prior to commit: https://reviews.llvm.org/D53472?vs=170311&id=170560#toc

[PATCH] D53558: Add gfx909 to GPU Arch

2018-10-23 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl accepted this revision. yaxunl added a comment. This revision is now accepted and ready to land. LGTM Repository: rC Clang https://reviews.llvm.org/D53558

[PATCH] D56318: [HIP] Fix size_t for MSVC environment

2019-01-11 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl updated this revision to Diff 181326. yaxunl added a comment. Herald added a subscriber: jfb. Copy type information from AuxTarget. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D56318/new/ https://reviews.llvm.org/D56318 Files: include/clang/Basic/TargetInfo.h lib/Basic/Targ

[PATCH] D56411: [CUDA][HIP][Sema] Fix template kernel with function as template parameter

2019-01-16 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. In D56411#1359275, @rjmccall wrote: > In D56411#1352602, @yaxunl wrote: > > > In D56411#1352332, @rjmccall > > wrote: > > > > > This patch still d

[PATCH] D56318: [HIP] Fix size_t for MSVC environment

2019-01-21 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl updated this revision to Diff 182815. yaxunl added a comment. Separated layout-controlling flags into a base class for TargetInfo. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D56318/new/ https://reviews.llvm.org/D56318 Files: include/clang/Basic/TargetInfo.h lib/Basic/TargetIn

[PATCH] D56318: [HIP] Fix size_t for MSVC environment

2019-01-21 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. In D56318#1355705, @rjmccall wrote: > It's pretty unfortunate that all these fields have to be individually called > out like this. Can you move all these basic layout fields into a separate > `struct` (which can be a secondary

[PATCH] D56411: [CUDA][HIP][Sema] Fix template kernel with function as template parameter

2019-01-21 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. In D56411#1360010, @rjmccall wrote: > I think the diagnostic should come during instantiation when you find an > evaluated use of a host function within a device function. It seems the body of a function template is checked only d

[PATCH] D56411: [CUDA][HIP][Sema] Fix template kernel with function as template parameter

2019-01-21 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. In D56411#1365745, @rjmccall wrote: > In D56411#1365727, @yaxunl wrote: > > > In D56411#1360010, @rjmccall > > wrote: > > > > > I think the diagno

[PATCH] D56411: [CUDA][HIP][Sema] Fix template kernel with function as template parameter

2019-01-24 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. In D56411#1365745, @rjmccall wrote: > In D56411#1365727, @yaxunl wrote: > > > In D56411#1360010, @rjmccall > > wrote: > > > > > I think the diagno

[PATCH] D57188: Disable _Float16 for non ARM/SPIR Targets

2019-01-28 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. This change causes regressions for CUDA/HIP. As a single-source language, CUDA/HIP code contains both device and host code. It has separate compilation for host and device. In host compilation, a device function is parsed but not emitted in IR. The device function may have _

[PATCH] D57369: [CUDA][HIP] Do not diagnose use of _Float16

2019-01-28 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl created this revision. yaxunl added reviewers: rjmccall, tra. r352221 caused regressions in CUDA/HIP since device function may use _Float16 whereas host does not support it. In this case host compilation should not diagnose usage of _Float16 in device functions or variables. For now just

[PATCH] D57369: [CUDA][HIP] Do not diagnose use of _Float16

2019-01-29 Thread Yaxun Liu via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rC352488: [CUDA][HIP] Do not diagnose use of _Float16 (authored by yaxunl, committed by ). Repository: rC Clang CHANGES SINCE LAST ACTION https://reviews.llvm.org/D57369/new/ https://reviews.llvm.org/

[PATCH] D56318: [HIP] Fix size_t for MSVC environment

2019-01-29 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl updated this revision to Diff 184238. yaxunl added a comment. Revised by John's comments. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D56318/new/ https://reviews.llvm.org/D56318 Files: include/clang/Basic/TargetInfo.h lib/Basic/TargetInfo.cpp lib/Basic/Targets/AMDGPU.cpp

[PATCH] D56318: [HIP] Fix size_t for MSVC environment

2019-01-29 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl updated this revision to Diff 184245. yaxunl added a comment. Use const argument. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D56318/new/ https://reviews.llvm.org/D56318 Files: include/clang/Basic/TargetInfo.h lib/Basic/TargetInfo.cpp lib/Basic/Targets/AMDGPU.cpp lib/Ba

[PATCH] D56318: [HIP] Fix size_t for MSVC environment

2019-01-30 Thread Yaxun Liu via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rC352620: [HIP] Fix size_t for MSVC environment (authored by yaxunl, committed by ). Changed prior to commit: https://reviews.llvm.org/D56318?vs=184245&id=184274#toc Repository: rC Clang CHANGES SINCE

[PATCH] D57527: Do not copy floating pointer format from aux target

2019-01-31 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl created this revision. yaxunl added a reviewer: rjmccall. rC352620 caused regressions because it copied the floating point format from the aux target. The floating point format decides whether extended long double is supported. It is x86_fp80 on x86 but IEEE double

[PATCH] D57527: Do not copy floating pointer format from aux target

2019-01-31 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. In D57527#1379065 , @rjmccall wrote: > Okay, so you silently have an incompatible ABI for anything in the system > headers that mentions `long double`. Do you have any plans to address or > work around that, or is the hope that i

[PATCH] D57527: Do not copy long double and 128-bit fp format from aux target for AMDGPU

2019-01-31 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl updated this revision to Diff 184569. yaxunl retitled this revision from "Do not copy floating pointer format from aux target" to "Do not copy long double and 128-bit fp format from aux target for AMDGPU". yaxunl edited the summary of this revision. yaxunl added a comment. Herald added sub

[PATCH] D57527: Do not copy long double and 128-bit fp format from aux target for AMDGPU

2019-01-31 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. In D57527#1379159 , @rjmccall wrote: > In D57527#1379088 , @yaxunl wrote: > > > In D57527#1379065 , @rjmccall > > wrote: > > > > > Okay, so you silen

[PATCH] D57527: Do not copy long double and 128-bit fp format from aux target for AMDGPU

2019-01-31 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. In D57527#1379287 , @rjmccall wrote: > Explanatory comment, please. Otherwise LGTM. will do when committing. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D57527/new/ https://reviews.llvm.org/D57527 _

[PATCH] D54183: [HIP] Change default optimization level to -O3

2018-11-06 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl created this revision. yaxunl added a reviewer: tra. The default optimization level of nvcc is -O3. There are HIP applications which expect the default optimization level to be -O3. Most HIP applications use -O3, therefore making it default. https://reviews.llvm.org/D54183 Files: lib

[PATCH] D53780: Fix bitcast to address space cast for coerced load/stores

2018-11-08 Thread Yaxun Liu via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rC346413: Fix bitcast to address space cast for coerced load/stores (authored by yaxunl, committed by ). Changed prior to commit: https://reviews.llvm.org/D53780?vs=172673&id=173179#toc Repository: rC

[PATCH] D54275: [HIP] Remove useless sections in linked files

2018-11-08 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl created this revision. yaxunl added a reviewer: tra. clang-offload-bundler creates `__CLANG_OFFLOAD_BUNDLE__*` sections in the bundles, which get into the linked files. These sections are useless after linking. They waste disk space and cause confusion for clang when directly linked with

[PATCH] D54275: [HIP] Remove useless sections in linked files

2018-11-09 Thread Yaxun Liu via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rC346536: [HIP] Remove useless sections in linked files (authored by yaxunl, committed by ). Repository: rC Clang https://reviews.llvm.org/D54275 Files: lib/Driver/ToolChains/CommonArgs.cpp Index: l

[PATCH] D54496: [HIP] Fix device only compilation

2018-11-13 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl created this revision. yaxunl added a reviewer: tra. Fix a bug causing host code to be compiled when `--cuda-device-only` is set. https://reviews.llvm.org/D54496 Files: lib/Driver/Driver.cpp test/Driver/cuda-phases.cu Index: test/Driver/cuda-phases.cu ===

[PATCH] D54496: [HIP] Fix device only compilation

2018-11-13 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. In https://reviews.llvm.org/D54496#1297710, @tra wrote: > Do I understand it correctly that the bug appears to affect HIP compilation > only? Yes. Only HIP. https://reviews.llvm.org/D54496 ___ cfe-commits mailing list cfe

[PATCH] D54496: [HIP] Fix device only compilation

2018-11-13 Thread Yaxun Liu via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rC346828: [HIP] Fix device only compilation (authored by yaxunl, committed by ). Repository: rC Clang https://reviews.llvm.org/D54496 Files: lib/Driver/Driver.cpp test/Driver/cuda-phases.cu Index:

[PATCH] D51341: [HEADER] Overloadable function candidates for half/double types

2018-11-16 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. In https://reviews.llvm.org/D51341#1301047, @sidorovd wrote: > @Anastasia @yaxunl > Hi, I am working on generalizing this patch and several questions have > raised during this, so I want to discuss them with you: > > 1. Should #pragma OPENCL EXTENSION ext_name : begin e

[PATCH] D55067: [HIP] Fix offset of kernel argument for AMDGPU target

2018-11-29 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl created this revision. yaxunl added reviewers: tra, arsenm, rjmccall. Herald added subscribers: t-tye, tpr, dstuttard, wdng, kzhuravl. Clang emits a call to hipSetupArgument(arg, size, offset) in host IR to set up arguments for a HIP kernel. The offset should meet the expectation of the device
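
Editorial illustration (not part of the patch): the summary above concerns the byte offset passed for each kernel argument. As a rough sketch of the general idea, assuming each argument is placed at the next offset that satisfies its alignment, the offsets could be computed as below. `ArgInfo`, `alignTo`, and `computeArgOffsets` are hypothetical names invented for this example; this is not Clang's implementation and it does not call the HIP runtime.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical description of one kernel argument (not a Clang or HIP type).
struct ArgInfo {
  std::size_t size;   // size of the argument in bytes
  std::size_t align;  // required alignment in bytes; assumed to be a power of two
};

// Round `offset` up to the next multiple of `align` (power of two assumed).
inline std::size_t alignTo(std::size_t offset, std::size_t align) {
  return (offset + align - 1) & ~(align - 1);
}

// Compute the byte offset of each argument when arguments are laid out
// in order, each one aligned to its own required alignment.
inline std::vector<std::size_t> computeArgOffsets(const std::vector<ArgInfo>& args) {
  std::vector<std::size_t> offsets;
  std::size_t cursor = 0;
  for (const ArgInfo& a : args) {
    cursor = alignTo(cursor, a.align);  // insert padding if needed
    offsets.push_back(cursor);
    cursor += a.size;                   // advance past this argument
  }
  return offsets;
}
```

With a 1-byte argument, an 8-byte argument aligned to 8, and a 4-byte argument aligned to 4, this layout yields offsets 0, 8, and 16. If the host used a different alignment for the second argument than the device expects, the two sides would disagree on where it starts, which is the kind of mismatch the summary describes.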

[PATCH] D55067: [HIP] Fix offset of kernel argument for AMDGPU target

2018-11-29 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. In D55067#1313213 , @rjmccall wrote: > This seems backwards. Clang knows what the actual ABI alignment of the C > type is, and it doesn't have to match the alignment of the IR type that IRGen > produces. It's the actual C ABI al

[PATCH] D55067: [HIP] Fix offset of kernel argument for AMDGPU target

2018-11-29 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. In D55067#1313213 , @rjmccall wrote: > This seems backwards. Clang knows what the actual ABI alignment of the C > type is, and it doesn't have to match the alignment of the IR type that IRGen > produces. It's the actual C ABI al
