[PATCH] D37568: [AMDGPU] Allow flexible register names in inline asm constraints

2017-09-07 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. In https://reviews.llvm.org/D37568#863735, @b-sumner wrote: > The assembler accepts v[N] in addition to vN. I'm not sure if that is needed > here. Then we had better allow that in constraints as well, to avoid confusing users. https://reviews.llvm.org/D37568

[PATCH] D35082: [OpenCL] Add LangAS::opencl_private to represent private address space in AST

2017-09-08 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl marked 6 inline comments as done. yaxunl added inline comments. Comment at: include/clang/AST/Type.h:332 + bool getImplicitAddressSpaceFlag() const { return Mask & IMask; } + void setImplicitAddressSpaceFlag(bool Value) { Anastasia wrote: > Could we ad

[PATCH] D35082: [OpenCL] Add LangAS::opencl_private to represent private address space in AST

2017-09-08 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl updated this revision to Diff 114337. yaxunl marked 3 inline comments as done. yaxunl edited the summary of this revision. yaxunl added a comment. Add comments for getImplicitAddressSpaceFlag and fix checking of null pointer. https://reviews.llvm.org/D35082 Files: include/clang/AST/AST

[PATCH] D37568: [AMDGPU] Allow flexible register names in inline asm constraints

2017-09-08 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl updated this revision to Diff 114367. yaxunl edited the summary of this revision. yaxunl added a comment. Allow {v[n]} and {s[n]}. Add more tests. https://reviews.llvm.org/D37568 Files: lib/Basic/Targets/AMDGPU.h test/CodeGenOpenCL/amdgcn-inline-asm.cl test/Sema/inline-asm-validate
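
As a rough illustration of the spellings discussed in this thread (vN, v[N], and the braced forms {v[n]} and {s[n]}), the sketch below normalizes a constraint string to a register kind and index. It is not the code from D37568: the function name `parseRegName` and its signature are hypothetical, the three-digit limit is an arbitrary choice for the sketch, and register ranges such as v[1:2] are deliberately rejected.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper, not taken from the patch.
// Accepts "v5", "v[5]", "{v5}" or "{v[5]}" (likewise for 's').
// On success, stores the register kind and index and returns true.
// Returns false for anything else, including ranges such as "v[1:2]".
bool parseRegName(std::string S, char &Kind, unsigned &Index) {
  // Strip one pair of surrounding braces, if present.
  if (S.size() >= 2 && S.front() == '{' && S.back() == '}')
    S = S.substr(1, S.size() - 2);
  if (S.size() < 2 || (S[0] != 'v' && S[0] != 's'))
    return false;
  std::string Digits = S.substr(1);
  // Accept the bracketed spelling by unwrapping "[N]".
  if (Digits.size() >= 3 && Digits.front() == '[' && Digits.back() == ']')
    Digits = Digits.substr(1, Digits.size() - 2);
  // Arbitrary limit for this sketch: at most three decimal digits.
  if (Digits.empty() || Digits.size() > 3)
    return false;
  unsigned Value = 0;
  for (char C : Digits) {
    if (!std::isdigit(static_cast<unsigned char>(C)))
      return false;
    Value = Value * 10 + static_cast<unsigned>(C - '0');
  }
  Kind = S[0];
  Index = Value;
  return true;
}
```

With this sketch, "v5" and "{v[5]}" both yield kind 'v' and index 5, so a caller could treat the two spellings as the same register.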

[PATCH] D35082: [OpenCL] Add LangAS::opencl_private to represent private address space in AST

2017-09-08 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl marked 2 inline comments as done. yaxunl added inline comments. Comment at: lib/Sema/SemaType.cpp:6994 + // OpenCL v1.2 s6.5: + // The generic address space name for arguments to a function in a + // program, or local variables of a function is __private. Al

[PATCH] D36327: [OpenCL] Allow targets emit optimized pipe functions for power of 2 type sizes

2017-09-08 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl abandoned this revision. yaxunl added a comment. We implemented this optimization in a target-specific LLVM pass. https://reviews.llvm.org/D36327 ___ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/ma

[PATCH] D37703: [AMDGPU] Change addr space of clk_event_t, queue_t and reserve_id_t to global

2017-09-11 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl created this revision. Herald added subscribers: t-tye, tpr, dstuttard, nhaehnle, wdng, kzhuravl. https://reviews.llvm.org/D37703 Files: lib/Basic/Targets/AMDGPU.h test/CodeGenOpenCL/opencl_types.cl Index: test/CodeGenOpenCL/opencl_types.cl ===

[PATCH] D37742: Add more tests for OpenCL atomic builtin functions

2017-09-12 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl created this revision. Add tests for different address spaces and insert some blank lines to make them more readable. https://reviews.llvm.org/D37742 Files: test/CodeGenOpenCL/atomic-ops-libcall.cl test/CodeGenOpenCL/atomic-ops.cl Index: test/CodeGenOpenCL/atomic-ops.cl ===

[PATCH] D37703: [AMDGPU] Change addr space of clk_event_t, queue_t and reserve_id_t to global

2017-09-13 Thread Yaxun Liu via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rL313171: [AMDGPU] Change addr space of clk_event_t, queue_t and reserve_id_t to global (authored by yaxunl). Changed prior to commit: https://reviews.llvm.org/D37703?vs=114642&id=115088#toc Repository:

[PATCH] D37742: Add more tests for OpenCL atomic builtin functions

2017-09-13 Thread Yaxun Liu via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rL313172: Add more tests for OpenCL atomic builtin functions (authored by yaxunl). Changed prior to commit: https://reviews.llvm.org/D37742?vs=114824&id=115090#toc Repository: rL LLVM https://reviews.

[PATCH] D37822: [OpenCL] Clean up and add missing fields for block struct

2017-09-13 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl created this revision. Currently block is translated to a structure equivalent to struct Block { void *isa; int flags; int reserved; void *invoke; void *descriptor; }; Except `invoke`, which is the pointer to the block invoke function, all other fields are useless
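
For readers who want the quoted layout spelled out, the structure from the summary can be written as plain C++ as follows. This is only a sketch of the structure as described in the text, not what clang emits after the patch; the accessor `getInvoke` is a hypothetical helper added for illustration.

```cpp
#include <cstddef>

// Layout as quoted in the review summary.
struct Block {
  void *isa;
  int flags;
  int reserved;
  void *invoke;     // pointer to the block invoke function
  void *descriptor;
};

// Hypothetical helper: returns the one field the summary singles out,
// the pointer to the block invoke function.
inline void *getInvoke(const Block &B) { return B.invoke; }
```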

[PATCH] D37804: [OpenCL] Handle address space conversion while setting type alignment

2017-09-13 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added inline comments. Comment at: lib/CodeGen/CGExpr.cpp:957 -return Builder.CreateBitCast(Addr, ConvertType(E->getType())); +return Builder.CreatePointerBitCastOrAddrSpaceCast( +Addr, ConvertType(E->getType())); Better asser

[PATCH] D37822: [OpenCL] Clean up and add missing fields for block struct

2017-09-14 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl updated this revision to Diff 115222. yaxunl added a comment. Fix bug about calling blocks. https://reviews.llvm.org/D37822 Files: lib/CodeGen/CGBlocks.cpp lib/CodeGen/CGOpenCLRuntime.cpp lib/CodeGen/CGOpenCLRuntime.h test/CodeGen/blocks-opencl.cl test/CodeGenOpenCL/blocks.cl

[PATCH] D37568: [AMDGPU] Allow flexible register names in inline asm constraints

2017-09-15 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl marked 2 inline comments as done. yaxunl added inline comments. Comment at: lib/Basic/Targets/AMDGPU.h:194 +Info.setAllowsRegister(); +Name = S.data() - 1; +return true; arsenm wrote: > I'm not sure I understand these data() - 1s. The caller o

[PATCH] D37822: [OpenCL] Clean up and add missing fields for block struct

2017-09-15 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl marked 2 inline comments as done. yaxunl added a comment. In https://reviews.llvm.org/D37822#872291, @Anastasia wrote: > Could you please explain a bit more why the alignment have to be put > explicitly in the struct? I am just not very convinced this is general enough. The captured var

[PATCH] D37804: [OpenCL] Handle address space conversion while setting type alignment

2017-09-20 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added inline comments. Comment at: lib/CodeGen/CGExpr.cpp:957 -return Builder.CreateBitCast(Addr, ConvertType(E->getType())); +return Builder.CreatePointerBitCastOrAddrSpaceCast( +Addr, ConvertType(E->getType())); Anastasia wr

[PATCH] D37822: [OpenCL] Clean up and add missing fields for block struct

2017-09-21 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl marked 4 inline comments as done. yaxunl added a comment. In https://reviews.llvm.org/D37822#873876, @Anastasia wrote: > In https://reviews.llvm.org/D37822#872446, @yaxunl wrote: > > > In https://reviews.llvm.org/D37822#872291, @Anastasia wrote: > > > > > Could you please explain a bit mor

[PATCH] D38134: [OpenCL] Emit enqueued block as kernel

2017-09-21 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl created this revision. Herald added a subscriber: nhaehnle. In OpenCL, kernel and non-kernel functions have different calling conventions. For certain targets they have different argument ABIs. Also, kernels have special function attributes and metadata for the runtime to launch them

[PATCH] D37822: [OpenCL] Clean up and add missing fields for block struct

2017-09-21 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl updated this revision to Diff 116277. yaxunl marked 6 inline comments as done. yaxunl added a comment. Revised per Anastasia's comments. https://reviews.llvm.org/D37822 Files: lib/CodeGen/CGBlocks.cpp lib/CodeGen/CGOpenCLRuntime.cpp lib/CodeGen/CGOpenCLRuntime.h test/CodeGen/blocks

[PATCH] D37822: [OpenCL] Clean up and add missing fields for block struct

2017-09-21 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. In https://reviews.llvm.org/D37822#877903, @Anastasia wrote: > In https://reviews.llvm.org/D37822#877572, @yaxunl wrote: > > > In https://reviews.llvm.org/D37822#873876, @Anastasia wrote: > > > > > In https://reviews.llvm.org/D37822#872446, @yaxunl wrote: > > > > > > > In

[PATCH] D38134: [OpenCL] Emit enqueued block as kernel

2017-09-21 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl marked 3 inline comments as done. yaxunl added a comment. In https://reviews.llvm.org/D38134#877831, @Anastasia wrote: > Now if we have a block which is being called and enqueued at the same time, > will we generate 2 functions for it? Could we add such test case btw? Yes. It is covered

[PATCH] D37822: [OpenCL] Clean up and add missing fields for block struct

2017-09-22 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl updated this revision to Diff 116363. yaxunl edited the summary of this revision. yaxunl added a comment. Add custom fields to block and target hooks to fill them. https://reviews.llvm.org/D37822 Files: lib/CodeGen/CGBlocks.cpp lib/CodeGen/CGOpenCLRuntime.cpp lib/CodeGen/CGOpenCLRu

[PATCH] D37804: [OpenCL] Handle address space conversion while setting type alignment

2017-09-22 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl accepted this revision. yaxunl added a comment. This revision is now accepted and ready to land. LGTM. Thanks. https://reviews.llvm.org/D37804

[PATCH] D38134: [OpenCL] Emit enqueued block as kernel

2017-09-22 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl marked 10 inline comments as done. yaxunl added inline comments. Comment at: lib/CodeGen/CGOpenCLRuntime.cpp:113 + +llvm::Value *CGOpenCLRuntime::emitOpenCLEnqueuedBlock(CodeGenFunction &CGF, + const Expr *E) { -

[PATCH] D37568: [AMDGPU] Allow flexible register names in inline asm constraints

2017-09-22 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl updated this revision to Diff 116383. yaxunl marked 4 inline comments as done. yaxunl edited the summary of this revision. yaxunl added a comment. Fix typo. https://reviews.llvm.org/D37568 Files: lib/Basic/Targets/AMDGPU.h test/CodeGenOpenCL/amdgcn-inline-asm.cl test/Sema/inline-as

[PATCH] D37568: [AMDGPU] Allow flexible register names in inline asm constraints

2017-09-22 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl marked an inline comment as done. yaxunl added inline comments. Comment at: test/Sema/inline-asm-validate-amdgpu.cl:74 +__asm("v_add_f64_e64 v[1:2], v[3:4], v[5:6]" : "=v[1:2]"(ci) : "v[3:4]"(ai), "v[5:6]"(bi) : ); //expected-error {{invalid output constraint '=v[1:2

[PATCH] D38113: OpenCL: Assume functions are convergent

2017-09-25 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. Do we need an option to disable this, at least for debugging, in case it causes regressions in some applications and users want to turn it off? Comment at: test/CodeGenOpenCL/convergent.cl:73 // CHECK: %[[tobool_pr:.+]] = phi i1 [ true, %[[if_then]] ],

[PATCH] D62738: [HIP] Support texture type

2019-05-31 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl created this revision. yaxunl added a reviewer: tra. Herald added a reviewer: a.sidorin. This patch handles `__attribute__((device_builtin_vector_type))` for HIP. If a class or struct type has this attribute, any variable of this type will be emitted as a global symbol in device code wit

[PATCH] D62739: AMDGPU: Always emit amdgpu-flat-work-group-size

2019-05-31 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added inline comments. Comment at: lib/CodeGen/TargetInfo.cpp:7885 +// By default, restrict the maximum size to 256. +F->addFnAttr("amdgpu-flat-work-group-size", "128,256"); } arsenm wrote: > b-sumner wrote: > > Theoretically, shouldn't the mini

[PATCH] D62971: [HIP] Remove the assertion on match between host/device names.

2019-06-06 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. LGTM. There seems to be no reason to assume the mangled name is the same on the host and device sides once anonymous types are mangled differently in host and device code. On Windows, kernels already have completely different names on the host and device sides without issues. Repository: rG LLVM Git

[PATCH] D62696: AMDGPU: Use AMDGPU toolchain for other OSes

2019-06-07 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl accepted this revision. yaxunl added a comment. This revision is now accepted and ready to land. LGTM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D62696/new/ https://reviews.llvm.org/D62696

[PATCH] D62739: AMDGPU: Always emit amdgpu-flat-work-group-size

2019-06-10 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. My concern is that this essentially forces users to add the amdgpu_flat_work_group_size attribute to all kernels that are executed outside of (128,256). This could cause many regressions for existing OpenCL apps. I am not sure if it is feasible to force all Open

[PATCH] D62738: [HIP] Support texture type

2019-06-11 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. ping CHANGES SINCE LAST ACTION https://reviews.llvm.org/D62738/new/ https://reviews.llvm.org/D62738

[PATCH] D62738: [HIP] Support texture type

2019-06-11 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl updated this revision to Diff 204155. yaxunl marked 5 inline comments as done. yaxunl added a comment. Revised per Artem's comments. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D62738/new/ https://reviews.llvm.org/D62738 Files: include/clang/Basic/Attr.td include/clang/Basic

[PATCH] D62244: [AMDGPU] Enable the implicit arguments for HIP (CLANG)

2019-06-14 Thread Yaxun Liu via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rL363414: [AMDGPU] Enable the implicit arguments for HIP (CLANG) (authored by yaxunl, committed by ). Herald added a project: LLVM. Herald added a subscriber: llvm-commits. Changed prior to commit: https:

[PATCH] D63335: [HIP] Add the interface deriving the stub name of device kernels.

2019-06-17 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl accepted this revision. yaxunl added a comment. LGTM. Thanks. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D63335/new/ https://reviews.llvm.org/D63335

[PATCH] D62697: AMDGPU: Disable errno by default

2019-06-17 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl accepted this revision. yaxunl added a comment. This revision is now accepted and ready to land. LGTM. Sorry for the delay. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D62697/new/ https://reviews.llvm.org/D62697

[PATCH] D62738: [HIP] Support device_shadow variable

2019-06-18 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl updated this revision to Diff 205490. yaxunl retitled this revision from "[HIP] Support texture type" to "[HIP] Support device_shadow variable". yaxunl edited the summary of this revision. yaxunl added a comment. Revised per Artem's comments. CHANGES SINCE LAST ACTION https://reviews.ll

[PATCH] D62738: [HIP] Support device_shadow variable

2019-06-18 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl updated this revision to Diff 205503. yaxunl added a comment. Fix visibility and dso_local. Allow undefined symbol in code object. This is to allow merging the host and device symbols at run time. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D62738/new/ https://reviews.llvm.org/

[PATCH] D62738: [HIP] Support device_shadow variable

2019-06-20 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl marked an inline comment as done. yaxunl added a comment. In D62738#1538900, @tra wrote: > So, the only thing this patch appears to do is make everything with this > attribute uninitialized on device side and give protected visibility. > If I und

[PATCH] D62738: [HIP] Support device_shadow variable

2019-06-20 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. > The problem is that we do not see generic usage of > Although there is no texture specific handling on the compiler side, there > is texture specific handling of symbols Please ignore this comment. It is an old comment that was submitted by accident. CHANGES SINCE LAST ACT

[PATCH] D64364: [HIP] Add GPU arch gfx1010, gfx1011, and gfx1012

2019-07-18 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl marked an inline comment as done. yaxunl added inline comments. Comment at: lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp:4973-4992 case CudaArch::GFX600: case CudaArch::GFX601: case CudaArch::GFX700: case CudaArch::GFX701: case CudaArch::GFX702: case CudaArch::GF

[PATCH] D63256: [OpenCL] Split type and macro definitions into opencl-c-base.h

2019-06-21 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added inline comments. Comment at: cfe/trunk/lib/Headers/opencl-c.h:13638 #if __OPENCL_C_VERSION__ >= CL_VERSION_2_0 -#ifndef ATOMIC_VAR_INIT -#define ATOMIC_VAR_INIT(x) (x) kzhuravl wrote: > Any reason this piece of code got completely removed? Removing

[PATCH] D62738: [HIP] Support device_shadow variable

2019-06-23 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl marked an inline comment as done. yaxunl added inline comments. Comment at: lib/CodeGen/CodeGenModule.cpp:3775 + // left undefined. + bool IsHIPDeviceShadowVar = getLangOpts().HIP && getLangOpts().CUDAIsDevice && + D->hasAttr();

[PATCH] D62738: [HIP] Support device_shadow variable

2019-06-24 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl marked an inline comment as done. yaxunl added inline comments. Comment at: include/clang/Basic/Attr.td:955 +def CUDADeviceShadow : InheritableAttr { + let Spellings = [GNU<"device_shadow">, Declspec<"__device_shadow__">]; + let Subjects = SubjectList<[Var]>; ---

[PATCH] D63756: [AMDGPU] Increased the number of implicit argument bytes for both OpenCL and HIP (CLANG).

2019-06-25 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. Can you try compiling an empty HIP kernel and see what metadata the backend generates? If I remember correctly, the backend generates kernel arg metadata based on the number of implicit kernel args. It knows how to handle 48 but I am not sure what will happen if it becomes

[PATCH] D62738: [HIP] Support attribute hip_pinned_shadow

2019-06-25 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl updated this revision to Diff 206504. yaxunl retitled this revision from "[HIP] Support device_shadow variable" to "[HIP] Support attribute hip_pinned_shadow". yaxunl edited the summary of this revision. yaxunl added a comment. Rename the attribute and make it HIP-only. CHANGES SINCE LAS

[PATCH] D62738: [HIP] Support attribute hip_pinned_shadow

2019-06-25 Thread Yaxun Liu via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rL364381: [HIP] Support attribute hip_pinned_shadow (authored by yaxunl, committed by ). Herald added a project: LLVM. Herald added a subscriber: llvm-commits. Changed prior to commit: https://reviews.llv

[PATCH] D53295: Mark store and load of block invoke function as invariant.group

2019-07-04 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. In D53295#1561890, @rjmccall wrote: > Great, thank you. Yaxun, are you planning to pick this back up? I know it's > been a long time. Sorry, I got caught up with some other work. Currently there has been another change about block

[PATCH] D63756: [AMDGPU] Increased the number of implicit argument bytes for both OpenCL and HIP (CLANG).

2019-07-08 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl accepted this revision. yaxunl added a comment. This revision is now accepted and ready to land. LGTM. Thanks. Repository: rC Clang CHANGES SINCE LAST ACTION https://reviews.llvm.org/D63756/new/ https://reviews.llvm.org/D63756

[PATCH] D64364: [HIP] Add GPU arch gfx1010, gfx1011, and gfx1012

2019-07-08 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl created this revision. yaxunl added reviewers: ashi1, tra. Herald added a subscriber: jholewinski. https://reviews.llvm.org/D64364 Files: include/clang/Basic/Cuda.h lib/Basic/Cuda.cpp lib/Basic/Targets/NVPTX.cpp lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp Index: lib/CodeGen/CGOpenMPRunti

[PATCH] D64364: [HIP] Add GPU arch gfx1010, gfx1011, and gfx1012

2019-07-11 Thread Yaxun Liu via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rL365799: [HIP] Add GPU arch gfx1010, gfx1011, and gfx1012 (authored by yaxunl, committed by ). Herald added a project: LLVM. Herald added a subscriber: llvm-commits. Changed prior to commit: https://revi

[PATCH] D62197: [OpenCL] Fix file-scope const sampler variable for 2.0

2019-05-21 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl created this revision. yaxunl added a reviewer: Anastasia. OpenCL spec v2.0 s6.13.14: Samplers can also be declared as global constants in the program source using the following syntax. const sampler_t = This works fine for OpenCL 1.2 but fails for 2.0, because clang deduces

[PATCH] D62197: [OpenCL] Fix file-scope const sampler variable for 2.0

2019-05-21 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl updated this revision to Diff 200509. yaxunl added a comment. Add full diff. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D62197/new/ https://reviews.llvm.org/D62197 Files: lib/Sema/SemaType.cpp test/CodeGenOpenCL/sampler.cl test/SemaOpenCL/sampler_t.cl Index: test/SemaOp

[PATCH] D62244: [AMDGPU] Enable the implicit arguments for HIP (CLANG)

2019-05-22 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. Currently HIP and CUDA share the same test directories, so it is better to put the test in CodeGenCUDA. Repository: rC Clang CHANGES SINCE LAST ACTION https://reviews.llvm.org/D62244/new/ https://reviews.llvm.org/D62244

[PATCH] D62197: [OpenCL] Fix file-scope const sampler variable for 2.0

2019-05-27 Thread Yaxun Liu via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rC361757: [OpenCL] Fix file-scope const sampler variable for 2.0 (authored by yaxunl, committed by ). Herald added a project: clang. Repository: rC Clang CHANGES SINCE LAST ACTION https://reviews.llvm.

[PATCH] D62483: [CUDA][HIP] Emit dependent libs for host only

2019-05-27 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl created this revision. yaxunl added a reviewer: tra. Recently D60274 was introduced to allow lld to handle dependent libs. However, the current uses of dependent libs (e.g. pragma comment(lib, *) in Windows header files) are intended for the host only. Emitting

[PATCH] D62483: [CUDA][HIP] Emit dependent libs for host only

2019-05-28 Thread Yaxun Liu via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rC361880: [CUDA][HIP] Emit dependent libs for host only (authored by yaxunl, committed by ). Herald added a project: clang. Repository: rC Clang CHANGES SINCE LAST ACTION https://reviews.llvm.org/D6248

[PATCH] D62603: [CUDA][HIP] Skip setting `externally_initialized` for static device variables.

2019-05-29 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. LGTM. The externally initializable attribute causes some optimizations to be disabled. For static device variables it seems reasonable to remove the externally initializable attribute. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/

[PATCH] D62603: [CUDA][HIP] Skip setting `externally_initialized` for static device variables.

2019-05-29 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. In D62603#1521832, @tra wrote: > In D62603#1521792, @hliao wrote: > > > that should assume that variable is not declared with `static`. that's also > > the motivation of this patch. > > >

[PATCH] D55663: [CUDA] Make all host-side shadows of device-side variables undef.

2018-12-13 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl accepted this revision. yaxunl added a comment. LGTM. Thanks! CHANGES SINCE LAST ACTION https://reviews.llvm.org/D55663/new/ https://reviews.llvm.org/D55663

[PATCH] D56225: [HIP] Use nul instead of /dev/null when running on windows

2019-01-02 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl created this revision. yaxunl added a reviewer: tra. When clang is running on Windows, /dev/null is not available. Use nul as the empty input file instead. https://reviews.llvm.org/D56225 Files: lib/Driver/ToolChains/HIP.cpp Index: lib/Driver/ToolChains/HIP.cpp =
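
The idea in this summary can be sketched in a few lines. The helpers below are hypothetical and are not the code in lib/Driver/ToolChains/HIP.cpp; the host check is passed in as a plain flag (an assumption made so the sketch is easy to test), whereas the real driver would query its own host information.

```cpp
#include <string>

// Hypothetical helper: name of the empty input file for the given host.
// "nul" is the Windows null device; "/dev/null" is the POSIX one.
inline std::string nullDeviceName(bool HostIsWindows) {
  return HostIsWindows ? "nul" : "/dev/null";
}

// Hypothetical helper: builds an "-inputs=" argument whose value is the
// null device chosen above.
inline std::string bundlerInputArg(bool HostIsWindows) {
  return "-inputs=" + nullDeviceName(HostIsWindows);
}
```

Keeping the choice in one small function means the rest of the command-line construction does not need to know which host it runs on.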

[PATCH] D56318: [HIP] Fix size_t for MSVC environment

2019-01-04 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl created this revision. yaxunl added reviewers: tra, rjmccall. Herald added subscribers: tpr, nhaehnle, jvesely. In a 64-bit MSVC environment, size_t is defined as unsigned long long. Fix the AMDGPU target info to match it in an MSVC environment. https://reviews.llvm.org/D56318 Files: lib/Basic/

[PATCH] D56318: [HIP] Fix size_t for MSVC environment

2019-01-04 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. In D56318#1346456, @rjmccall wrote: > What's the general idea here, that you're going to pretend to be the > environment's "standard" CPU target of the right pointer width and try to > match the ABI exactly? This seems like a pr

[PATCH] D56321: [HIP][DRIVER][OFFLOAD] Do not unbundle unsupported file types

2019-01-04 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl accepted this revision. yaxunl added a comment. This revision is now accepted and ready to land. LGTM. Can you rename the lit test as hip-link-shared-library.hip? That is more meaningful. Thanks. Repository: rC Clang CHANGES SINCE LAST ACTION https://reviews.llvm.org/D56321/new/ ht

[PATCH] D56318: [HIP] Fix size_t for MSVC environment

2019-01-04 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. In D56318#1346693, @rjmccall wrote: > No, no, I understand that you're not changing pointer sizes, but this is one > example of trying to match the ABI of the target environment, and I'm trying > to understand how far that goes.

[PATCH] D56411: [CUDA][HIP][Sema] Fix template kernel with function as template parameter

2019-01-07 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl created this revision. yaxunl added reviewers: tra, rjmccall. If a kernel template has a function as its template parameter, a device function should be allowed as a template argument, since a kernel can call a device function. However, currently if the kernel template is instantiated in a ho

[PATCH] D56411: [CUDA][HIP][Sema] Fix template kernel with function as template parameter

2019-01-08 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. In D56411#1349364, @jlebar wrote: > Without reading the patch in detail (sorry) but looking mainly at the > testcase: It looks like we're not checking how overloading and `__host__ > __device__` functions play into this. Maybe t

[PATCH] D56411: [CUDA][HIP][Sema] Fix template kernel with function as template parameter

2019-01-09 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. In D56411#1350233, @jlebar wrote: > __host__ void bar() {} > __device__ int bar() { return 0; } > __host__ __device__ void foo() { int x = bar(); } > template __global__ void kernel() { devF();} > > kernel(); > > > > >

[PATCH] D56411: [CUDA][HIP][Sema] Fix template kernel with function as template parameter

2019-01-09 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl updated this revision to Diff 180848. yaxunl added a comment. Add test for `__host__ __device__`. Remove the flag IsParsingTemplateArgument in Sema. Instead, check ExprEvalContexts for disabling checking device/host consistency. I did not use ExprEvalContext Unevaluated to condition the

[PATCH] D56411: [CUDA][HIP][Sema] Fix template kernel with function as template parameter

2019-01-09 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl updated this revision to Diff 180888. yaxunl added a comment. Passing template decl by ExpressionEvaluationContextRecord. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D56411/new/ https://reviews.llvm.org/D56411 Files: include/clang/Sema/Sema.h lib/Sema/SemaCUDA.cpp lib/Sem

[PATCH] D56411: [CUDA][HIP][Sema] Fix template kernel with function as template parameter

2019-01-09 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. In D56411#1349342, @rjmccall wrote: > Sema won't necessarily have resolved a template decl when parsing a template > argument list, so trying to propagate that decl down to indicate that we're > resolving a template argument is n

[PATCH] D56411: [CUDA][HIP][Sema] Fix template kernel with function as template parameter

2019-01-09 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl updated this revision to Diff 180960. yaxunl added a comment. disable the check for more general cases. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D56411/new/ https://reviews.llvm.org/D56411 Files: include/clang/Sema/Sema.h lib/Sema/SemaCUDA.cpp lib/Sema/SemaTemplate.cpp

[PATCH] D56411: [CUDA][HIP][Sema] Fix template kernel with function as template parameter

2019-01-09 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. In D56411#1351824, @rjmccall wrote: > But why? Why do you want to limit this to just template arguments instead of > all sorts of similar contexts? I updated the patch to disable the check for unevaluated expr context and const

[PATCH] D56411: [CUDA][HIP][Sema] Fix template kernel with function as template parameter

2019-01-10 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. In D56411#1352332, @rjmccall wrote: > This patch still doesn't make any sense. You don't need to do any special > validation when passing a function as a template argument. When Sema > instantiates the template definition, it'l

[PATCH] D56225: [HIP] Use nul instead of /dev/null when running on windows

2019-01-10 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl marked an inline comment as done. yaxunl added inline comments. Comment at: lib/Driver/ToolChains/HIP.cpp:210 std::string BundlerTargetArg = "-targets=host-x86_64-unknown-linux"; - std::string BundlerInputArg = "-inputs=/dev/null"; + std::string BundlerInputArg = "-in

[PATCH] D56318: [HIP] Fix size_t for MSVC environment

2019-01-10 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. In D56318#1346991, @rjmccall wrote: > Okay. Is there a reasonable way to make your targets delegate to a different > `TargetInfo` implementation for most things so that you can generally match > the host target for things like t

[PATCH] D56318: [HIP] Fix size_t for MSVC environment

2019-01-10 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. In D56318#1352962, @rjmccall wrote: > If I was only concerned about `size_t`, your current solution would be fine. > My concern is that you really need to match *all* of the associated CPU > target's ABI choices, so your target

[PATCH] D56318: [HIP] Fix size_t for MSVC environment

2019-01-10 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. In D56318#1353106, @rjmccall wrote: > No, I understand that things like the function-call ABI should be different > from the associated host ABI, but things like the size of `long` and the > bit-field layout algorithm presumably

[PATCH] D56225: [HIP] Use nul instead of /dev/null when running on windows

2019-01-10 Thread Yaxun Liu via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rL350885: [HIP] Use nul instead of /dev/null when running on windows (authored by yaxunl, committed by ). Herald added a subscriber: llvm-commits. Changed prior to commit: https://reviews.llvm.org/D56225?

[PATCH] D56318: [HIP] Fix size_t for MSVC environment

2019-01-10 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. In D56318#1353176, @rjmccall wrote: > In D56318#1353116, @yaxunl wrote: > > > In D56318#1353106, @rjmccall > > wrote: > > > > > No, I understand t

[PATCH] D67509: [CUDA][HIP] Diagnose defaulted constructor only if it is used

2019-09-12 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl created this revision. yaxunl added reviewers: tra, rjmccall. Clang infers a defaulted ctor as `__device__ __host__` and a virtual dtor as `__host__`. It diagnoses the following code in device compilation because B() references ~A() implicitly. struct A { virtual ~A(); }; struct B: public A

[PATCH] D67509: [CUDA][HIP] Diagnose defaulted constructor only if it is used

2019-09-13 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. Sorry, I found an issue with the fix. The following code: struct A { virtual ~A(); }; struct B: public A { B(); }; B::B() = default; will cause B::B() to be emitted in IR with external linkage, since `B::B() = default;` is a function definition. This somehow defeats
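
The distinction drawn above, that `B::B() = default;` outside the class is a function definition, can be observed in plain C++. A minimal sketch, assuming only standard C++11 and no CUDA/HIP attributes; the type names `InClassDefault` and `OutOfClassDefault` are hypothetical. A constructor defaulted on its first declaration is not user-provided, while one defaulted out of class is, and therefore counts as non-trivial.

```cpp
#include <type_traits>

// Defaulted on first declaration: not user-provided, so it stays trivial.
struct InClassDefault {
  InClassDefault() = default;
};

// Declared in class, defaulted later: this is a user-provided constructor
// with an ordinary (non-inline) definition in this translation unit.
struct OutOfClassDefault {
  OutOfClassDefault();
};
OutOfClassDefault::OutOfClassDefault() = default;

static_assert(std::is_trivially_default_constructible<InClassDefault>::value,
              "in-class defaulted ctor is trivial");
static_assert(!std::is_trivially_default_constructible<OutOfClassDefault>::value,
              "out-of-class defaulted ctor is user-provided");
```

This only illustrates the language-level difference; how Clang infers host/device attributes for each form is the subject of the review itself.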

[PATCH] D67509: [CUDA][HIP] Fix hostness of defaulted constructor

2019-09-19 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl updated this revision to Diff 220838. yaxunl retitled this revision from "[CUDA][HIP] Diagnose defaulted constructor only if it is used" to "[CUDA][HIP] Fix hostness of defaulted constructor". yaxunl edited the summary of this revision. yaxunl added a comment. Posts a new fix for this issu

[PATCH] D67509: [CUDA][HIP] Fix hostness of defaulted constructor

2019-09-19 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl marked an inline comment as done. yaxunl added inline comments. Comment at: lib/Sema/SemaCUDA.cpp:273-274 + MemberDecl->hasAttr(); + if (!InClass || hasAttr) +return false; + tra wrote: > A comment here would be helpful. > > I think t

[PATCH] D67509: [CUDA][HIP] Fix hostness of defaulted constructor

2019-09-19 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl updated this revision to Diff 220892. yaxunl edited the summary of this revision. yaxunl added a comment. Skip inferring for explicit host/device attrs only. Add checks for implicit device and host attrs and avoid duplicates. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67509/n

[PATCH] D67509: [CUDA][HIP] Fix hostness of defaulted constructor

2019-09-19 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl updated this revision to Diff 220907. yaxunl added a comment. Simplify logic per Artem's comments. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67509/new/ https://reviews.llvm.org/D67509 Files: lib/Sema/SemaCUDA.cpp test/SemaCUDA/default-ctor.cu Index: test/SemaCUDA/default

[PATCH] D67509: [CUDA][HIP] Fix hostness of defaulted constructor

2019-09-19 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl updated this revision to Diff 220912. yaxunl added a comment. Revise per Artem's comments. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67509/new/ https://reviews.llvm.org/D67509 Files: lib/Sema/SemaCUDA.cpp test/SemaCUDA/default-ctor.cu Index: test/SemaCUDA/default-ctor.cu

[PATCH] D67509: [CUDA][HIP] Fix hostness of defaulted constructor

2019-09-20 Thread Yaxun Liu via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rL372394: [CUDA][HIP] Fix hostness of defaulted constructor (authored by yaxunl, committed by ). Herald added a project: LLVM. Herald added a subscriber: llvm-commits. Changed prior to commit: https://rev

[PATCH] D67837: [CUDA][HIP] Fix assertion in Sema::markKnownEmitted with -fopenmp

2019-09-20 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl created this revision. yaxunl added reviewers: tra, rjmccall. Herald added a subscriber: guansong. Herald added a reviewer: jdoerfert. Herald added a project: clang. A CUDA/HIP program may be compiled with -fopenmp. In this case, -fopenmp is only passed to host compilation to take advantage

[PATCH] D67837: [CUDA][HIP] Fix assertion in Sema::markKnownEmitted with -fopenmp

2019-09-20 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl updated this revision to Diff 221097. yaxunl added a comment. reuse the call tree. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67837/new/ https://reviews.llvm.org/D67837 Files: lib/Sema/Sema.cpp test/SemaCUDA/openmp-static-func.cu Index: test/SemaCUDA/openmp-static-func.

[PATCH] D67837: [CUDA][HIP] Fix assertion in Sema::markKnownEmitted with -fopenmp

2019-09-20 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl marked an inline comment as done. yaxunl added inline comments. Comment at: lib/Sema/Sema.cpp:1511 +if (Loc != S.DeviceCallGraph.end()) + S.DeviceCallGraph.erase(Loc); return; rjmccall wrote: > There's an overload of `DenseMap::erase` that ju
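
The quoted diff looks up an iterator and then erases through it. LLVM's `DenseMap` also offers an `erase` overload that takes a key directly and reports whether anything was removed, which makes the lookup unnecessary. The sketch below shows both forms, using `std::unordered_map` as a stand-in so that it compiles without LLVM headers; the alias `CallGraph` and both function names are hypothetical.

```cpp
#include <string>
#include <unordered_map>

// Hypothetical stand-in for a call-graph map keyed by function name.
using CallGraph = std::unordered_map<std::string, int>;

// Two-step form, mirroring the quoted diff: find, then erase the iterator.
inline void eraseViaIterator(CallGraph &G, const std::string &Key) {
  auto Loc = G.find(Key);
  if (Loc != G.end())
    G.erase(Loc);
}

// One-step form: erase by key. Returns true if an entry was removed.
inline bool eraseByKey(CallGraph &G, const std::string &Key) {
  return G.erase(Key) != 0;
}
```

Erasing by key is safe when the key is absent, so the explicit `end()` check is not needed in the one-step form.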

[PATCH] D67509: [CUDA][HIP] Fix hostness of defaulted constructor

2019-09-20 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. In D67509#1677528, @tra wrote: > Looks like CUDA test-suite is triggering the assertion added by this patch: > > http://lab.llvm.org:8011/builders/clang-cuda-build/builds/37301/steps/ninja%20build%20simple%20CUDA%20tests/logs/stdio

[PATCH] D67509: [CUDA][HIP] Fix hostness of defaulted constructor

2019-09-20 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. In D67509#1677586, @yaxunl wrote: > In D67509#1677528, @tra wrote: > > > Looks like CUDA test-suite is triggering the assertion added by this patch: > > > > http://lab.llvm.org:8011/builder

[PATCH] D67509: [CUDA][HIP] Fix hostness of defaulted constructor

2019-09-22 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. In D67509#1677722, @yaxunl wrote: > In D67509#1677586, @yaxunl wrote: > > > In D67509#1677528, @tra wrote: > > > > > Looks like CUDA test-suite is

[PATCH] D67837: [CUDA][HIP] Fix assertion in Sema::markKnownEmitted with -fopenmp

2019-09-23 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl updated this revision to Diff 221341. yaxunl added a comment. Revised per John's comments. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D67837/new/ https://reviews.llvm.org/D67837 Files: lib/Sema/Sema.cpp test/SemaCUDA/openmp-static-func.cu Index: test/SemaCUDA/openmp-static

[PATCH] D67947: [HIP] Support new kernel launching API

2019-09-23 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl created this revision. yaxunl added a reviewer: tra. https://reviews.llvm.org/D67947 Files: include/clang/Basic/LangOptions.def include/clang/Driver/Options.td lib/CodeGen/CGCUDANV.cpp lib/Driver/ToolChains/Clang.cpp lib/Frontend/CompilerInvocation.cpp lib/Sema/SemaCUDA.cpp t

[PATCH] D67837: [CUDA][HIP] Fix assertion in Sema::markKnownEmitted with -fopenmp

2019-09-24 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. In D67837#1679670, @rjmccall wrote: > Okay. And it's okay to fall down to the code below when functions are used > in both ways this way? This part of the code is for delayed checking of hostness. If a host function calls device f

[PATCH] D67947: [HIP] Support new kernel launching API

2019-09-24 Thread Yaxun Liu via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rL372773: [HIP] Support new kernel launching API (authored by yaxunl, committed by ). Herald added a project: LLVM. Herald added a subscriber: llvm-commits. Changed prior to commit: https://reviews.llvm.o

[PATCH] D67509: [CUDA][HIP] Fix hostness of defaulted constructor

2019-09-24 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. In D67509#1679524, @tra wrote: > In D67509#1678394, @yaxunl wrote: > > > A reduced test case is > > > > struct A { > > A(); > > }; > > > > template > > struct B > > { > >
