[PATCH] D52179: [clang-tidy] Replace redundant checks with an assert().

2018-09-17 Thread Artem Belevich via Phabricator via cfe-commits
tra updated this revision to Diff 165841. tra added a comment. - Check that D is non-null https://reviews.llvm.org/D52179 Files: clang-tools-extra/clang-tidy/readability/IdentifierNamingCheck.cpp Index: clang-tools-extra/clang-tidy/readability/IdentifierNamingCheck.cpp =

[PATCH] D52179: [clang-tidy] Replace redundant checks with an assert().

2018-09-17 Thread Artem Belevich via Phabricator via cfe-commits
tra marked an inline comment as done. tra added inline comments. Comment at: clang-tools-extra/clang-tidy/readability/IdentifierNamingCheck.cpp:551-552 if (Decl->isMain() || !Decl->isUserProvided() || -Decl->isUsualDeallocationFunction() || -Decl->isCopyAssi

[PATCH] D52179: [clang-tidy] Replace redundant checks with an assert().

2018-09-17 Thread Artem Belevich via Phabricator via cfe-commits
tra added inline comments. Comment at: clang-tools-extra/clang-tidy/readability/IdentifierNamingCheck.cpp:388 &NamingStyles) { + assert(D && D->getIdentifier() && !D->getName().empty() && !D->isImplicit() && + "Decl must be an explicit identifier with a name."
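
A minimal sketch of the pattern in the assertion quoted above, assuming the callers have already filtered out null, unnamed, and implicit declarations. `FakeDecl` and `nameLength` are hypothetical stand-ins for illustration and are not part of clang-tidy or the clang AST API.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for a named declaration; this is not clang::NamedDecl.
struct FakeDecl {
  std::string Name;
  bool Implicit = false;
};

// The preconditions are assumed to be guaranteed by the caller, so the
// function states them once in an assert instead of re-checking them at
// run time and silently returning early.
std::size_t nameLength(const FakeDecl *D) {
  assert(D && !D->Name.empty() && !D->Implicit &&
         "Decl must be an explicit identifier with a name.");
  return D->Name.size();
}
```

The string literal in the condition is always non-null, so it does not change the result and only makes the failure message readable.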

[PATCH] D52179: [clang-tidy] Replace redundant checks with an assert().

2018-09-18 Thread Artem Belevich via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rCTE342514: [clang-tidy] Replace redundant checks with an assert(). (authored by tra, committed by ). Changed prior to commit: https://reviews.llvm.org/D52179?vs=165841&id=166042#toc Repository: rCTE C

[PATCH] D51808: [CUDA] Ignore uncallable functions when we check for usual deallocators.

2018-09-18 Thread Artem Belevich via Phabricator via cfe-commits
tra updated this revision to Diff 166050. tra marked an inline comment as done. tra added a comment. Updated assertion message. https://reviews.llvm.org/D51808 Files: clang/include/clang/AST/DeclCXX.h clang/include/clang/Sema/Sema.h clang/lib/AST/DeclCXX.cpp clang/lib/Sema/SemaDeclCXX.c

[PATCH] D51808: [CUDA] Ignore uncallable functions when we check for usual deallocators.

2018-09-18 Thread Artem Belevich via Phabricator via cfe-commits
tra updated this revision to Diff 166053. tra added a comment. Renamed last instance of 'Matches' -> 'PreventedBy'. https://reviews.llvm.org/D51808 Files: clang/include/clang/AST/DeclCXX.h clang/include/clang/Sema/Sema.h clang/lib/AST/DeclCXX.cpp clang/lib/Sema/SemaDeclCXX.cpp clang/l

[PATCH] D52259: [CUDA] Rearrange search path ordering to fix two test case failures

2018-09-19 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. The patch does not seem to match the description and appears to have nothing to do with rearranging include paths. Could you check if these are the changes you intended to send for review? Repository: rC Clang https://reviews.llvm.org/D52259

[PATCH] D51808: [CUDA] Ignore uncallable functions when we check for usual deallocators.

2018-09-21 Thread Artem Belevich via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rL342749: [CUDA] Ignore uncallable functions when we check for usual deallocators. (authored by tra, committed by ). Herald added a subscriber: llvm-commits. Changed prior to commit: https://reviews.llvm.

[PATCH] D52321: [CUDA] Fixed parsing of optional template-argument-list.

2018-09-21 Thread Artem Belevich via Phabricator via cfe-commits
tra updated this revision to Diff 166509. tra added a comment. Added '>=' and '>>=' to the list of tokens that may indicate the end of the empty template argument list. https://reviews.llvm.org/D52321 Files: clang/lib/Parse/ParseTemplate.cpp clang/test/Parser/cuda-kernel-call-c++11.cu cla

[PATCH] D52321: [CUDA] Fixed parsing of optional template-argument-list.

2018-09-21 Thread Artem Belevich via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rC342752: [CUDA] Fixed parsing of optional template-argument-list. (authored by tra, committed by ). Changed prior to commit: https://reviews.llvm.org/D52321?vs=166509&id=166512#toc Repository: rC Clan

[PATCH] D52377: [HIP] Support early finalization of device code

2018-09-21 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. Overall the patch looks OK. I'll take a closer look on Monday. Which mode do you expect will be most commonly used for HIP by default? With this patch we'll have two different ways to do similar things in HIP vs. CUDA. E.g. by default CUDA compiles GPU code in each TU in a co

[PATCH] D52438: [CUDA] Add basic support for CUDA-10.0

2018-09-24 Thread Artem Belevich via Phabricator via cfe-commits
tra created this revision. tra added a reviewer: jlebar. Herald added subscribers: bixia, hiraditya, sanjoy, jholewinski. https://reviews.llvm.org/D52438 Files: clang/include/clang/Basic/Cuda.h clang/lib/Basic/Cuda.cpp clang/lib/Basic/Targets/NVPTX.cpp clang/lib/Driver/ToolChains/Cuda.cpp

[PATCH] D52437: [CUDA] Add preliminary support for CUDA 10.0

2018-09-24 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. Great to see someone beating me to add support for a new CUDA version. :-) I've posted my patch in https://reviews.llvm.org/D52438. It's very similar to yours with a couple of other necessary changes. Repository: rC Clang https://reviews.llvm.org/D52437

[PATCH] D52259: [CUDA] Fix two failed test cases using --cuda-path-ignore-env

2018-09-25 Thread Artem Belevich via Phabricator via cfe-commits
tra accepted this revision. tra added a comment. This revision is now accepted and ready to land. It's unfortunate that lit does not scrub the environment in order to make tests hermetic. Repository: rC Clang https://reviews.llvm.org/D52259

[PATCH] D52438: [CUDA] Add basic support for CUDA-10.0

2018-10-01 Thread Artem Belevich via Phabricator via cfe-commits
tra closed this revision. tra added a comment. Landed in https://reviews.llvm.org/rC342924 https://reviews.llvm.org/D52438 ___ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D52377: [HIP] Support early finalization of device code for -fno-gpu-rdc

2018-10-01 Thread Artem Belevich via Phabricator via cfe-commits
tra added inline comments. Comment at: include/clang/Driver/Options.td:587-589 +def fgpu_rdc : Flag<["-"], "fgpu-rdc">, Flags<[CC1Option]>, HelpText<"Generate relocatable device code, also known as separate compilation mode.">; +def fno_gpu_rdc : Flag<["-"], "fno-gpu-rdc">; -

[PATCH] D52377: [HIP] Support early finalization of device code for -fno-gpu-rdc

2018-10-02 Thread Artem Belevich via Phabricator via cfe-commits
tra accepted this revision. tra added a comment. This revision is now accepted and ready to land. LGTM. https://reviews.llvm.org/D52377

[PATCH] D52938: [CUDA] Use all 64 bits of GUID in __nv_module_id

2018-10-05 Thread Artem Belevich via Phabricator via cfe-commits
tra created this revision. tra added a reviewer: Hahnfeld. Herald added subscribers: bixia, jlebar, sanjoy. getGUID() returns a uint64_t and "%x" only prints 32 bits of it. Use PRIx64 format string to print all 64 bits. https://reviews.llvm.org/D52938 Files: clang/lib/CodeGen/CGCUDANV.cpp
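
The difference described above can be sketched in plain C++ with `snprintf`. This is an illustration under stated assumptions, not the code from CGCUDANV.cpp: `formatLow32` and `formatAll64` are hypothetical helpers, and the explicit cast to `unsigned` (assumed to be 32 bits wide) stands in for what a "%x" conversion consumes.

```cpp
#include <cassert>
#include <cinttypes>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical helper: "%x" expects an unsigned int, so only the low
// 32 bits of the 64-bit value survive (assuming a 32-bit unsigned).
std::string formatLow32(std::uint64_t guid) {
  char buf[32];
  int n = std::snprintf(buf, sizeof(buf), "%x", static_cast<unsigned>(guid));
  assert(n > 0 && static_cast<std::size_t>(n) < sizeof(buf));
  return std::string(buf, static_cast<std::size_t>(n));
}

// Hypothetical helper: PRIx64 expands to the conversion specifier that
// matches uint64_t on the current platform, so all 64 bits are printed.
std::string formatAll64(std::uint64_t guid) {
  char buf[32];
  int n = std::snprintf(buf, sizeof(buf), "%" PRIx64, guid);
  assert(n > 0 && static_cast<std::size_t>(n) < sizeof(buf));
  return std::string(buf, static_cast<std::size_t>(n));
}
```

With the 32-bit form, two identifiers that differ only in their upper half would format to the same string, which is why printing the full width matters for an ID that is meant to be unique.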

[PATCH] D52938: [CUDA] Use all 64 bits of GUID in __nv_module_id

2018-10-05 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. This particular change is largely cosmetic. I've just spotted this nit while I was debugging a different problem. It's also related to module ID. We're trying to compile NCCL 2.3 with -fcuda-rdc and we were getting duplicate symbols when we tried to link multiple object fi

[PATCH] D52938: [CUDA] Use all 64 bits of GUID in __nv_module_id

2018-10-05 Thread Artem Belevich via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rL343875: [CUDA] Use all 64 bits of GUID in __nv_module_id (authored by tra, committed by ). Herald added a subscriber: llvm-commits. Changed prior to commit: https://reviews.llvm.org/D52938?vs=168489&id=

[PATCH] D57487: [CUDA] Propagate detected version of CUDA to cc1

2019-01-31 Thread Artem Belevich via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rL352798: [CUDA] Propagate detected version of CUDA to cc1 (authored by tra, committed by ). Herald added a subscriber: llvm-commits. Changed prior to commit: https://reviews.llvm.org/D57487?vs=184416&id=

[PATCH] D57488: [CUDA] add support for the new kernel launch API in CUDA-9.2+.

2019-01-31 Thread Artem Belevich via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rC352799: [CUDA] add support for the new kernel launch API in CUDA-9.2+. (authored by tra, committed by ). Changed prior to commit: https://reviews.llvm.org/D57488?vs=184592&id=184598#toc Repository: r

[PATCH] D57771: [CUDA] Add basic support for CUDA-10.1

2019-02-05 Thread Artem Belevich via Phabricator via cfe-commits
tra created this revision. tra added a reviewer: jlebar. Herald added subscribers: bixia, sanjoy. https://reviews.llvm.org/D57771 Files: clang/include/clang/Basic/Cuda.h clang/lib/Basic/Cuda.cpp clang/lib/CodeGen/CGCUDANV.cpp clang/lib/Driver/ToolChains/Cuda.cpp clang/lib/Headers/__clan

[PATCH] D57771: [CUDA] Add basic support for CUDA-10.1

2019-02-05 Thread Artem Belevich via Phabricator via cfe-commits
tra updated this revision to Diff 185353. tra added a comment. Make the function object local. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D57771/new/ https://reviews.llvm.org/D57771 Files: clang/include/clang/Basic/Cuda.h clang/lib/Basic/Cuda.cpp clang/lib/CodeGen/CGCUDANV.cpp

[PATCH] D57716: [CUDA][HIP] Check calling convention based on function target

2019-02-05 Thread Artem Belevich via Phabricator via cfe-commits
tra added inline comments. Comment at: lib/Sema/SemaDeclAttr.cpp:4621-4638 +auto CUDATarget = IdentifyCUDATarget(FD); +if (CUDATarget == CFT_HostDevice) { + A = TI.checkCallingConvention(CC); + if (A == TargetInfo::CCCR_OK && Aux) +A = Aux->checkCallingC

[PATCH] D57771: [CUDA] Add basic support for CUDA-10.1

2019-02-05 Thread Artem Belevich via Phabricator via cfe-commits
tra updated this revision to Diff 185398. tra added a comment. Made a comment more readable. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D57771/new/ https://reviews.llvm.org/D57771 Files: clang/include/clang/Basic/Cuda.h clang/lib/Basic/Cuda.cpp clang/lib/CodeGen/CGCUDANV.cpp

[PATCH] D57771: [CUDA] Add basic support for CUDA-10.1

2019-02-05 Thread Artem Belevich via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rL353232: Basic CUDA-10 support. (authored by tra, committed by ). Herald added a project: LLVM. Herald added a subscriber: llvm-commits. Changed prior to commit: https://reviews.llvm.org/D57771?vs=185398

[PATCH] D57829: [HIP] Disable emitting llvm.linker.options in device compilation

2019-02-06 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. Could you elaborate on why you want to disable this metadata? I think the original idea of llvm.linker.options was that it should be ignored if the back-end does not support it. Comment at: lib/CodeGen/CodeGenModule.cpp:441 if (CodeGenOpts.Autolink &&

[PATCH] D57829: [HIP] Disable emitting llvm.linker.options in device compilation

2019-02-06 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. In D57829#1387416 , @yaxunl wrote: > In D57829#1387412 , @tra wrote: > > > Could you elaborate on why you want to disable this metadata? I think the > > original idea of llvm.linker.options was

[PATCH] D57908: [SEMA]Generalize deferred diagnostic interface, NFC.

2019-02-07 Thread Artem Belevich via Phabricator via cfe-commits
tra accepted this revision. tra added a comment. This revision is now accepted and ready to land. LGTM. I've added jlebar@ as he originally wrote the code. Repository: rC Clang CHANGES SINCE LAST ACTION https://reviews.llvm.org/D57908/new/ https://reviews.llvm.org/D57908

[PATCH] D58163: [CUDA][HIP] Use device side kernel and variable names when registering them

2019-02-13 Thread Artem Belevich via Phabricator via cfe-commits
tra added inline comments. Comment at: lib/CodeGen/CGCUDANV.cpp:128 + unsigned Flags) override { +DeviceVars.push_back(VarInfo{&Var, getDeviceSideName(VD), Flags}); } Nit: `VarInfo` is not needed. Compiler should be able to infer it

[PATCH] D58163: [CUDA][HIP] Use device side kernel and variable names when registering them

2019-02-13 Thread Artem Belevich via Phabricator via cfe-commits
tra accepted this revision. tra added inline comments. This revision is now accepted and ready to land. Comment at: lib/CodeGen/CGCUDANV.cpp:412 + for (auto &&I : EmittedKernels) { +llvm::Constant *KernelName = makeConstantString(I.DeviceSideName); llvm::Constant *NullP

[PATCH] D58163: [CUDA][HIP] Use device side kernel and variable names when registering them

2019-02-13 Thread Artem Belevich via Phabricator via cfe-commits
tra accepted this revision. tra added a comment. Thank you. LGTM. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D58163/new/ https://reviews.llvm.org/D58163

[PATCH] D56411: [CUDA][HIP][Sema] Fix template kernel with function as template parameter

2019-02-14 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. >> That said, does CUDA have a general rule resolving `__host__` vs. >> `__device__` overloads based on context? And does it allow overloading >> based solely on `__host__` vs. `__device__`? NVCC does not. Clang does. See https://goo.gl/EXnymm for the details. AFAICT, NVI

[PATCH] D58243: [OPENMP] Delay emission of the asm target-specific error messages.

2019-02-19 Thread Artem Belevich via Phabricator via cfe-commits
tra added inline comments. Comment at: lib/Sema/SemaStmtAsm.cpp:256-263 // If we're compiling CUDA file and function attributes indicate that it's not // for this compilation side, skip all the checks. if (!DeclAttrsMatchCUDAMode(getLangOpts(), getCurFunctionDecl())) {

[PATCH] D56411: [CUDA][HIP][Sema] Fix template kernel with function as template parameter

2019-02-19 Thread Artem Belevich via Phabricator via cfe-commits
tra added a subscriber: rsmith. tra added a comment. In D56411#1400300 , @rjmccall wrote: > Okay, but it's not great design to have a kind of overloading that can't be > resolved to an exact intended declaration even by an explicit cast. That's > why I

[PATCH] D58243: [OPENMP] Delay emission of the asm target-specific error messages.

2019-02-19 Thread Artem Belevich via Phabricator via cfe-commits
tra added inline comments. Comment at: lib/Sema/SemaStmtAsm.cpp:256-263 // If we're compiling CUDA file and function attributes indicate that it's not // for this compilation side, skip all the checks. if (!DeclAttrsMatchCUDAMode(getLangOpts(), getCurFunctionDecl())) {

[PATCH] D58463: [CUDA]Delayed diagnostics for the asm instructions.

2019-02-20 Thread Artem Belevich via Phabricator via cfe-commits
tra accepted this revision. tra added a comment. This revision is now accepted and ready to land. Thank you. Comment at: lib/Sema/Sema.cpp:1494-1496 +if (getLangOpts().CUDAIsDevice) + return CUDADiagIfDeviceCode(Loc, DiagID); +return CUDADiagIfHostCode(Loc, DiagID)

[PATCH] D58518: [HIP] change kernel stub name

2019-02-21 Thread Artem Belevich via Phabricator via cfe-commits
tra accepted this revision. tra added a comment. This revision is now accepted and ready to land. My guess is that this is needed because HIP debugger can see symbols from both host and device executables at the same time. Is that so? If that's the case, I guess HIP may have similar naming probl

[PATCH] D58518: [HIP] change kernel stub name

2019-02-21 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. In D58518#1406202 , @t-tye wrote: > Yes this relates to supporting the debugger. > > For the same function being present on both host and device, having the same > name is correct as the debugger must set a breakpoint at both places.

[PATCH] D58518: [HIP] change kernel stub name

2019-02-21 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. In D58518#1406274 , @t-tye wrote: > To clarify, I am saying that the stub does have a different name since it is > conceptually part of the implementation of doing the call to the device > function implementation, and is not in fact

[PATCH] D57716: [CUDA][HIP] Check calling convention based on function target

2019-02-21 Thread Artem Belevich via Phabricator via cfe-commits
tra added inline comments. Comment at: test/SemaCUDA/amdgpu-windows-vectorcall.cu:3 + +// expected-no-diagnostics +template tra wrote: > It may be good to add a check where we *would* expect to see the diagnostics. > It may be good to add a check where we *would*

[PATCH] D58518: [HIP] change kernel stub name

2019-02-22 Thread Artem Belevich via Phabricator via cfe-commits
tra requested changes to this revision. tra added a subscriber: echristo. tra added inline comments. This revision now requires changes to proceed. Comment at: lib/CodeGen/CodeGenModule.cpp:1059 +FD->hasAttr()) + MangledName = MangledName + ".stub"; + ---

[PATCH] D37539: [CUDA] Add device overloads for non-placement new/delete.

2017-09-06 Thread Artem Belevich via Phabricator via cfe-commits
tra accepted this revision. tra added inline comments. This revision is now accepted and ready to land. Comment at: clang/lib/Headers/cuda_wrappers/new:79 +} +__device__ void operator delete[](void *ptr, std::size_t sz) CUDA_NOEXCEPT { + ::operator delete(ptr);

[PATCH] D37548: [CUDA] When compilation fails, print the compilation mode.

2017-09-07 Thread Artem Belevich via Phabricator via cfe-commits
tra accepted this revision. tra added a comment. This revision is now accepted and ready to land. Nice. https://reviews.llvm.org/D37548

[PATCH] D37576: [CUDA] Added rudimentary support for CUDA-9 and sm_70.

2017-09-07 Thread Artem Belevich via Phabricator via cfe-commits
tra created this revision. Herald added subscribers: hiraditya, sanjoy, jholewinski. For now CUDA-9 is not included in the list of CUDA versions clang searches for, so the path to CUDA-9 must be explicitly passed via --cuda-path=. On LLVM side NVPTX added sm_70 GPU type which bumps required PTX v

[PATCH] D37576: [CUDA] Added rudimentary support for CUDA-9 and sm_70.

2017-09-07 Thread Artem Belevich via Phabricator via cfe-commits
tra updated this revision to Diff 114206. tra added a comment. Added tests for sm_70 support. https://reviews.llvm.org/D37576 Files: clang/include/clang/Basic/Cuda.h clang/lib/Basic/Cuda.cpp clang/lib/Basic/Targets/NVPTX.cpp clang/lib/Driver/ToolChains/Cuda.cpp clang/lib/Headers/__cla

[PATCH] D37576: [CUDA] Added rudimentary support for CUDA-9 and sm_70.

2017-09-07 Thread Artem Belevich via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rL312734: [CUDA] Added rudimentary support for CUDA-9 and sm_70. (authored by tra). Changed prior to commit: https://reviews.llvm.org/D37576?vs=114206&id=114216#toc Repository: rL LLVM https://reviews

[PATCH] D37906: [CUDA] Work around a new quirk in CUDA9 headers.

2017-09-15 Thread Artem Belevich via Phabricator via cfe-commits
tra created this revision. Herald added a subscriber: sanjoy. In CUDA-9 some of device-side math functions that we need are conditionally defined within '#if _GLIBCXX_MATH_H'. We need to temporarily undo the guard around inclusion of math_functions.hpp https://reviews.llvm.org/D37906 Files:

[PATCH] D37906: [CUDA] Work around a new quirk in CUDA9 headers.

2017-09-15 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. I don't think we really care why they do it for nvcc. My understanding is that nvcc needs to avoid name clashes between their implementation of functions and the ones that come from the host headers and that's why they have to tread really carefully around host includes. W

[PATCH] D37906: [CUDA] Work around a new quirk in CUDA9 headers.

2017-09-15 Thread Artem Belevich via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rL313369: [CUDA] Work around a new quirk in CUDA9 headers. (authored by tra). Changed prior to commit: https://reviews.llvm.org/D37906?vs=115415&id=115422#toc Repository: rL LLVM https://reviews.llvm.

[PATCH] D37912: [OpenMP] Bugfix: output file name drops the absolute path where full path is needed.

2017-09-15 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. Shouldn't this temp .cubin file go into the temporary directory, as opposed to the same directory as the input file? Repository: rL LLVM https://reviews.llvm.org/D37912

[PATCH] D37914: [OpenMP] Don't throw cudalib not found error if only front-end is required.

2017-09-15 Thread Artem Belevich via Phabricator via cfe-commits
tra added inline comments. Comment at: lib/Driver/ToolChains/Cuda.cpp:505-508 +if ((DeviceOffloadingKind == Action::OFK_OpenMP && + DriverArgs.hasArg(options::OPT_S)) || +(DeviceOffloadingKind == Action::OFK_OpenMP && + DriverArgs.hasArg(options::OPT_c

[PATCH] D37912: [OpenMP] Bugfix: output file name drops the absolute path where full path is needed.

2017-09-15 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. In https://reviews.llvm.org/D37912#872318, @gtbercea wrote: > In https://reviews.llvm.org/D37912#872294, @tra wrote: > > > Shouldn't this temp .cubin file go into the temporary directory, as opposed > > to the same directory as the input file? > > > That is indeed the intent

[PATCH] D37914: [OpenMP] Don't throw cudalib not found error if only front-end is required.

2017-09-15 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. BTW, at least for CUDA compilation, '-c' would still need libdevice as device-side will compile PTX to SASS and will need all the symbols PTX may refer to. Would that not be the case for OpenMP's compilation, too? https://reviews.llvm.org/D37914

[PATCH] D37914: [OpenMP] Don't throw cudalib not found error if only front-end is required.

2017-09-15 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. Now we just need a test case to make sure this works as intended. https://reviews.llvm.org/D37914

[PATCH] D37914: [OpenMP] Don't throw cudalib not found error if only front-end is required.

2017-09-15 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. I'm not particularly familiar with OpenMP internals. Could you elaborate on why libdevice would not be needed with -c for the OpenMP case? Is that because it would only apply to the host compilation and that nothing will be compiled for the openmp targets? Does openmp allow

[PATCH] D37913: [OpenMP] Enable the existing nocudalib flag for OpenMP offloading toolchain.

2017-09-19 Thread Artem Belevich via Phabricator via cfe-commits
tra added inline comments. Comment at: lib/Driver/ToolChains/Cuda.cpp:255-257 CudaArch gpu_arch = StringToCudaArch(GPUArchName); - assert(gpu_arch != CudaArch::UNKNOWN && + assert((gpu_arch != CudaArch::UNKNOWN || + Args.hasArg(options::OPT_nocudalib)) && --

[PATCH] D37912: [OpenMP] Bugfix: output file name drops the absolute path where full path is needed.

2017-09-19 Thread Artem Belevich via Phabricator via cfe-commits
tra added inline comments. Comment at: lib/Driver/ToolChains/Cuda.cpp:441 -SmallString<256> Name = llvm::sys::path::filename(II.getFilename()); +SmallString<256> Name = StringRef(II.getFilename()); llvm::sys::path::replace_extension(Name, "cubin");
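
The effect of the one-line change quoted above can be approximated with C++17 `std::filesystem`, used here as a stand-in for `llvm::sys::path`. Both helper names are hypothetical and the paths are made up for illustration; this is a sketch of the behavioral difference, not the driver code.

```cpp
#include <filesystem>
#include <string>

// Hypothetical helper mirroring the old behavior: take only the file name,
// then swap the extension. The directory part of the input is lost.
std::string cubinNameFromFilename(const std::string &input) {
  std::filesystem::path p = std::filesystem::path(input).filename();
  p.replace_extension("cubin");
  return p.generic_string();
}

// Hypothetical helper mirroring the new behavior: keep the full path and
// swap only the extension, so the output stays next to the input.
std::string cubinNameFromFullPath(const std::string &input) {
  std::filesystem::path p(input);
  p.replace_extension("cubin");
  return p.generic_string();
}
```

Whether the full path is the right choice depends on where the temporary file is supposed to live, which is the question raised in the review thread.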

[PATCH] D38040: [OpenMP] Add an additional test for D34888

2017-09-19 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. LGTM in general. Comment at: test/OpenMP/target_map_codegen.cpp:4845 +///==/// +// RUN: %clang_cc1 -DCK30 -std=c++11 -fopenmp -S -emit-llvm -fopenmp -fopenmp-targets=nvptx64-nvidia-cud

[PATCH] D37913: [OpenMP] Enable the existing nocudalib flag for OpenMP offloading toolchain.

2017-09-20 Thread Artem Belevich via Phabricator via cfe-commits
tra accepted this revision. tra added a comment. This revision is now accepted and ready to land. One small nit. LGTM otherwise. Comment at: test/Driver/openmp-offload-gpu.c:133 +/// Check that the flag is passed when -fopenmp-relocatable-target is used. +// RUN: %clang -###

[PATCH] D38040: [OpenMP] Add an additional test for D34888

2017-09-20 Thread Artem Belevich via Phabricator via cfe-commits
tra added inline comments. Comment at: test/OpenMP/target_map_codegen.cpp:4845 +///==/// +// RUN: %clang_cc1 -DCK30 -std=c++11 -fopenmp -S -emit-llvm -fopenmp -fopenmp-targets=nvptx64-nvidia-cuda %s -o - 2>&

[PATCH] D38090: [NVPTX] Implemented shfl.sync instruction and supporting intrinsics/builtins.

2017-09-20 Thread Artem Belevich via Phabricator via cfe-commits
tra created this revision. Herald added subscribers: hiraditya, sanjoy, jholewinski. https://reviews.llvm.org/D38090 Files: clang/include/clang/Basic/BuiltinsNVPTX.def clang/lib/Driver/ToolChains/Cuda.cpp clang/lib/Headers/__clang_cuda_intrinsics.h clang/test/CodeGen/builtins-nvptx-ptx60.

[PATCH] D38090: [NVPTX] Implemented shfl.sync instruction and supporting intrinsics/builtins.

2017-09-20 Thread Artem Belevich via Phabricator via cfe-commits
tra updated this revision to Diff 116047. tra added a comment. Addressed Justin's comments. https://reviews.llvm.org/D38090 Files: clang/include/clang/Basic/BuiltinsNVPTX.def clang/lib/Driver/ToolChains/Cuda.cpp clang/lib/Headers/__clang_cuda_intrinsics.h clang/test/CodeGen/builtins-nvp

[PATCH] D38090: [NVPTX] Implemented shfl.sync instruction and supporting intrinsics/builtins.

2017-09-20 Thread Artem Belevich via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rL313820: [NVPTX] Implemented shfl.sync instruction and supporting intrinsics/builtins. (authored by tra). Changed prior to commit: https://reviews.llvm.org/D38090?vs=116047&id=116073#toc Repository: r

[PATCH] D38147: [CUDA] Fixed order of words in the names of shfl builtins.

2017-09-21 Thread Artem Belevich via Phabricator via cfe-commits
tra created this revision. Herald added a subscriber: sanjoy. https://reviews.llvm.org/D38147 Files: clang/lib/Headers/__clang_cuda_intrinsics.h Index: clang/lib/Headers/__clang_cuda_intrinsics.h === --- clang/lib/Headers/__clang

[PATCH] D38148: [NVPTX] Implemented bar.warp.sync, barrier.sync, and vote{.sync} instructions/intrinsics/builtins.

2017-09-21 Thread Artem Belevich via Phabricator via cfe-commits
tra created this revision. Herald added subscribers: hiraditya, sanjoy, jholewinski. https://reviews.llvm.org/D38148 Files: clang/include/clang/Basic/BuiltinsNVPTX.def clang/lib/Headers/__clang_cuda_intrinsics.h clang/test/CodeGen/builtins-nvptx-ptx60.cu clang/test/CodeGen/builtins-nvptx.

[PATCH] D38148: [NVPTX] Implemented bar.warp.sync, barrier.sync, and vote{.sync} instructions/intrinsics/builtins.

2017-09-21 Thread Artem Belevich via Phabricator via cfe-commits
tra updated this revision to Diff 116236. tra added a comment. Fixed a typo in one test. https://reviews.llvm.org/D38148 Files: clang/include/clang/Basic/BuiltinsNVPTX.def clang/lib/Headers/__clang_cuda_intrinsics.h clang/test/CodeGen/builtins-nvptx-ptx60.cu clang/test/CodeGen/builtins-

[PATCH] D38148: [NVPTX] Implemented bar.warp.sync, barrier.sync, and vote{.sync} instructions/intrinsics/builtins.

2017-09-21 Thread Artem Belevich via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rL313898: [NVPTX] Implemented bar.warp.sync, barrier.sync, and vote{.sync}… (authored by tra). Changed prior to commit: https://reviews.llvm.org/D38148?vs=116236&id=116237#toc Repository: rL LLVM http

[PATCH] D38147: [CUDA] Fixed order of words in the names of shfl builtins.

2017-09-21 Thread Artem Belevich via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rL313899: [CUDA] Fixed order of words in the names of shfl builtins. (authored by tra). Changed prior to commit: https://reviews.llvm.org/D38147?vs=116228&id=116238#toc Repository: rL LLVM https://rev

[PATCH] D38188: [CUDA] Fix names of __nvvm_vote* intrinsics.

2017-09-25 Thread Artem Belevich via Phabricator via cfe-commits
tra created this revision. Herald added a subscriber: sanjoy. Also fixed a syntax error in activemask(). https://reviews.llvm.org/D38188 Files: clang/lib/Headers/__clang_cuda_intrinsics.h Index: clang/lib/Headers/__clang_cuda_intrinsics.h

[PATCH] D38191: [NVPTX] added match.{any, all}.sync instructions, intrinsics & builtins.

2017-09-25 Thread Artem Belevich via Phabricator via cfe-commits
tra created this revision. Herald added subscribers: hiraditya, sanjoy, jholewinski. https://reviews.llvm.org/D38191 Files: clang/include/clang/Basic/BuiltinsNVPTX.def clang/lib/CodeGen/CGBuiltin.cpp clang/lib/Headers/__clang_cuda_intrinsics.h clang/test/CodeGen/builtins-nvptx-ptx60.cu

[PATCH] D63029: [NFC][CUDA] Avoid undefined grep in cuda-types.cu

2019-06-07 Thread Artem Belevich via Phabricator via cfe-commits
tra added inline comments. Comment at: test/Preprocessor/cuda-types.cu:11 // RUN: %clang --cuda-host-only -nocudainc -target i386-unknown-linux-gnu -x cuda -E -dM -o - /dev/null \ -// RUN: | grep 'define __[^ ]*\(TYPE\|MAX\|SIZEOF|WIDTH\)\|define __GCC_ATOMIC' \ -// RUN: |

[PATCH] D62738: [HIP] Support texture type

2019-06-11 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. Syntactically the patch looks OK to me, but I think the purpose and meaning of the builtin type should be documented in more detail. Based on this patch alone it's not clear at all what it's supposed to be used for and how. Comment at: include/clang/Basi

[PATCH] D62738: [HIP] Support texture type

2019-06-11 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. So, the only thing this patch appears to do is make everything with this attribute uninitialized on device side and give protected visibility. If I understand it correctly, you're using the attribute in order to construct something that's sort of opposite of the currently us

[PATCH] D63164: [HIP] Add option to force lambda nameing following ODR in HIP/CUDA.

2019-06-11 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. So, in short, what you're saying is that lambda type may leak into the mangled name of a `__global__` function and we need to ensure that the mangled name is identical for both host and device, hence the need for consistent naming of lambdas. If that's the case, shouldn't

[PATCH] D63335: [HIP] Change kernel stub name again

2019-06-14 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. Is there particular reason you need to switch to this naming scheme? One issue with this patch is that demanglers will no longer be able to deal with the name. While they do know to ignore .stub suffix, they can't deal with `__device_stub_` prefix. E.g: % c++filt __devic

[PATCH] D63335: [HIP] Change kernel stub name again

2019-06-14 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. In D63335#1543845, @hliao wrote: > it's requested from debugger people. they don't want to the host-side stub > could match the device-side kernel function name. the previous scheme cannot > prevent that. I understand that you wan

[PATCH] D63335: [HIP] Change kernel stub name again

2019-06-14 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. In D63335#1544019, @hliao wrote: > In D63335#1543854, @tra wrote: > > > In D63335#1543845, @hliao wrote: > > > > > it's requested from debugger people

[PATCH] D63335: [HIP] Change kernel stub name again

2019-06-14 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. In D63335#1544026, @hliao wrote: > Is it OK for us to mangle `__device_stub __` as the nested name into the > original one, says, we prepend `_ZN15__device_stub__E`, so that we have > `_ZN15__device_stub__E10kernelfuncIiEvv` > > and

[PATCH] D63335: [HIP] Change kernel stub name again

2019-06-14 Thread Artem Belevich via Phabricator via cfe-commits
tra requested changes to this revision. tra added a comment. This revision now requires changes to proceed. In D63335#1544315, @hliao wrote: > > Sorry, I still don't think I understand the reasons for this change. The > > stub and the kernel do have a di

[PATCH] D63335: [HIP] Add the interface deriving the stub name of device kernels.

2019-06-14 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. In D63335#1544324, @hliao wrote: > > I think debugger does have sufficient information to deal with this and > > that would be the right place to deal with the issue. > > em, I did push the later as well, :(. OK, I will simplify the

[PATCH] D63335: [HIP] Add the interface deriving the stub name of device kernels.

2019-06-14 Thread Artem Belevich via Phabricator via cfe-commits
tra accepted this revision. tra added a comment. This revision is now accepted and ready to land. LGTM. This is a cleaner way to provide stub name tweaks. Comment at: clang/lib/CodeGen/CGCUDANV.cpp:223 + // Ensure either we have different ABIs between host and device compilati

[PATCH] D63277: [CUDA][HIP] Don't set "comdat" attribute for CUDA device stub functions.

2019-06-19 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. SGTM in principle. Folding the stubs would be bad as their addresses are implicitly used to identify the kernels to launch. Comment at: clang/lib/CodeGen/CodeGenModule.cpp:4294 setNonAliasAttributes(GD, Fn); SetLLVMFunctionAttributesForDefinition(D,
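The point about stub addresses identifying kernels can be sketched as follows. This is an illustration under stated assumptions, not the CUDA or HIP runtime's actual data structure: `KernelRegistry`, `stub_kernel_a`, and `stub_kernel_b` are hypothetical names, and the table simply maps a host-side stub address to a device kernel name.

```cpp
#include <map>
#include <string>

// Hypothetical stand-ins for compiler-generated host-side stubs. The bodies
// differ so that the two functions are clearly distinct.
int stub_kernel_a() { return 1; }
int stub_kernel_b() { return 2; }

using StubAddr = int (*)();

// Hypothetical registry keyed by stub address.
class KernelRegistry {
public:
  // Returns false when the address is already taken, which is what would
  // happen if two stubs were folded into a single function.
  bool registerKernel(StubAddr stub, const std::string &deviceName) {
    return table_.emplace(stub, deviceName).second;
  }

  // Returns the device kernel name for a stub, or nullptr if unknown.
  const std::string *lookup(StubAddr stub) const {
    auto it = table_.find(stub);
    return it == table_.end() ? nullptr : &it->second;
  }

private:
  std::map<StubAddr, std::string> table_;
};
```

If two stubs ended up sharing one address, the second registration would collide with the first and a launch through either stub could only ever resolve to one kernel, which is why folding is undesirable here.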

[PATCH] D62738: [HIP] Support device_shadow variable

2019-06-19 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. Overall looks good. Thank you for making the change. While reviewing the patch it occurred to me that it presents an opportunity to generalize the shadow variables to work in both directions. See below. Comment at: include/clang/Basic/AttrDocs.td:4164-4171

[PATCH] D25845: [CUDA] Ignore implicit target attributes during function template instantiation.

2019-08-12 Thread Artem Belevich via Phabricator via cfe-commits
tra marked an inline comment as done. tra added inline comments. Comment at: cfe/trunk/lib/Sema/SemaDecl.cpp:8416 +// in the CheckFunctionTemplateSpecialization() call below. +if (getLangOpts().CUDA & !isFunctionTemplateSpecialization) + maybeAddCUDAHostDeviceAttrs(N

[PATCH] D62738: [HIP] Support device_shadow variable

2019-06-24 Thread Artem Belevich via Phabricator via cfe-commits
tra added inline comments. Comment at: include/clang/Basic/Attr.td:954 +def CUDADeviceShadow : InheritableAttr { + let Spellings = [GNU<"device_shadow">, Declspec<"__device_shadow__">]; `HIPDeviceShadow` ? Comment at: include/clang/Basic/Att

[PATCH] D63756: [AMDGPU] Increased the number of implicit argument bytes for both OpenCL and HIP (CLANG).

2019-06-25 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. In D63756#1557658, @cdevadas wrote: > Hi Sam, > The compiler generates metadata for the first 48 bytes. I compiled a sample > code and verified it. The backend does nothing for the extra bytes now. > I will soon submit the backend

[PATCH] D62738: [HIP] Support attribute hip_pinned_shadow

2019-06-25 Thread Artem Belevich via Phabricator via cfe-commits
tra accepted this revision. tra added a comment. This revision is now accepted and ready to land. LGTM. Thank you! https://reviews.llvm.org/D62738

[PATCH] D64364: [HIP] Add GPU arch gfx1010, gfx1011, and gfx1012

2019-07-10 Thread Artem Belevich via Phabricator via cfe-commits
tra accepted this revision. tra added inline comments. Comment at: lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp:4973-4992 case CudaArch::GFX600: case CudaArch::GFX601: case CudaArch::GFX700: case CudaArch::GFX701: case CudaArch::GFX702: case CudaArch::GFX703: case Cu

[PATCH] D61765: [OpenMP][Clang][BugFix] Split declares and math functions inclusion.

2019-05-10 Thread Artem Belevich via Phabricator via cfe-commits
tra added inline comments. Comment at: lib/Headers/__clang_cuda_math_forward_declares.h:30-38 +#ifndef _OPENMP +__DEVICE__ long abs(long); +__DEVICE__ long long abs(long long); +#else +#ifndef __cplusplus __DEVICE__ long abs(long); __DEVICE__ long long abs(long long); -

[PATCH] D61765: [OpenMP][Clang][BugFix] Split declares and math functions inclusion.

2019-05-10 Thread Artem Belevich via Phabricator via cfe-commits
tra added inline comments. Comment at: lib/Headers/__clang_cuda_math_forward_declares.h:30-38 +#ifndef _OPENMP +__DEVICE__ long abs(long); +__DEVICE__ long long abs(long long); +#else +#ifndef __cplusplus __DEVICE__ long abs(long); __DEVICE__ long long abs(long long); -

[PATCH] D61765: [OpenMP][Clang][BugFix] Split declares and math functions inclusion.

2019-05-10 Thread Artem Belevich via Phabricator via cfe-commits
tra added inline comments. Comment at: lib/Headers/__clang_cuda_math_forward_declares.h:30-38 +#ifndef _OPENMP +__DEVICE__ long abs(long); +__DEVICE__ long long abs(long long); +#else +#ifndef __cplusplus __DEVICE__ long abs(long); __DEVICE__ long long abs(long long); -

[PATCH] D61765: [OpenMP][Clang][BugFix] Split declares and math functions inclusion.

2019-05-10 Thread Artem Belevich via Phabricator via cfe-commits
tra added inline comments. Comment at: lib/Headers/__clang_cuda_math_forward_declares.h:30-38 +#ifndef _OPENMP +__DEVICE__ long abs(long); +__DEVICE__ long long abs(long long); +#else +#ifndef __cplusplus __DEVICE__ long abs(long); __DEVICE__ long long abs(long long); -

[PATCH] D61765: [OpenMP][Clang][BugFix] Split declares and math functions inclusion.

2019-05-13 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. In D61765#1499957, @jdoerfert wrote: > Two small changes and then it is fine with me. @tra ? LGTM in general. I would still like to confirm that the changes work with libc++.

[PATCH] D61765: [OpenMP][Clang][BugFix] Split declares and math functions inclusion.

2019-05-13 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. In D61765#1500309, @gtbercea wrote: > As soon as libc++ the limits header included in > > __clang_cuda_cmath.h:15 > ``` is not found: > > > > > __clang_cuda_cmath.h:15:10: fatal error: 'limits' file not found > #include <limits> > >

[PATCH] D61765: [OpenMP][Clang][BugFix] Split declares and math functions inclusion.

2019-05-13 Thread Artem Belevich via Phabricator via cfe-commits
tra accepted this revision. tra added a comment. This revision is now accepted and ready to land. In D61765#1500457, @gtbercea wrote: > This won't affect CUDA in any way, all we have added is OpenMP specific. LGTM for CUDA. I'll leave the question of te

[PATCH] D61949: [OpenMP][bugfix] Fix issues with C++ 17 compilation when handling math functions

2019-05-15 Thread Artem Belevich via Phabricator via cfe-commits
tra added inline comments. Comment at: lib/Headers/__clang_cuda_device_functions.h:1477 #endif // CUDA_VERSION >= 9020 +#if __cplusplus >= 201703L +__DEVICE__ int abs(int __a) noexcept { return __nv_abs(__a); } jdoerfert wrote: > Hahnfeld wrote: > > If I recall

[PATCH] D61949: [OpenMP][bugfix] Fix issues with C++ 17 compilation when handling math functions

2019-05-15 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. LGTM. Comment at: lib/Headers/__clang_cuda_cmath.h:42 +// variant is supported. +#if defined(_OPENMP) && defined(__cplusplus) && __cplusplus >= 201703L +#define __NOEXCEPT noexcept I think the change is useful for CUDA, too, but I'm OK keep
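The guard quoted above selects `noexcept` only when compiling as C++17 or later. A minimal sketch of the same pattern follows, with two deliberate departures that are assumptions of this example: it drops the `_OPENMP` condition so it builds in any mode, and it uses the hypothetical names `EXAMPLE_NOEXCEPT` and `example_abs` instead of the header's own `__NOEXCEPT` and `abs`.

```cpp
// Expand to `noexcept` only for C++17 and later; expand to nothing otherwise.
// (The quoted header additionally requires _OPENMP; that is omitted here.)
#if defined(__cplusplus) && __cplusplus >= 201703L
#define EXAMPLE_NOEXCEPT noexcept
#else
#define EXAMPLE_NOEXCEPT
#endif

// Hypothetical function that picks up the conditional exception specifier.
inline int example_abs(int x) EXAMPLE_NOEXCEPT { return x < 0 ? -x : x; }
```

Keeping the specifier behind a macro lets one declaration serve both language modes, which matters because since C++17 the exception specification is part of the function type.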

[PATCH] D62046: [OpenMP][bugfix] Add missing math functions variants for log and abs.

2019-05-16 Thread Artem Belevich via Phabricator via cfe-commits
tra added inline comments. Comment at: lib/Headers/__clang_cuda_cmath.h:55-56 +#if defined(_OPENMP) && defined(__cplusplus) +__DEVICE__ const float abs(const float __x) { return ::fabsf((float)__x); } +__DEVICE__ const double abs(const double __x) { return ::fabs((double)__x); }
