[PATCH] D60620: [HIP] Support target id by --offload-arch

2020-05-26 Thread Artem Belevich via Phabricator via cfe-commits
tra added inline comments. Comment at: clang/lib/Driver/ToolChains/HIP.cpp:121-123 + auto Pos = SubArchName.find_first_of("+-"); + if (Pos != SubArchName.npos) +SubArchName = SubArchName.substr(0, Pos); yaxunl wrote: > tra wrote: > > Parsing should probably

[PATCH] D60620: [HIP] Support target id by --offload-arch

2020-05-27 Thread Artem Belevich via Phabricator via cfe-commits
tra added inline comments. Comment at: clang/lib/Driver/ToolChains/HIP.cpp:121-123 + auto Pos = SubArchName.find_first_of("+-"); + if (Pos != SubArchName.npos) +SubArchName = SubArchName.substr(0, Pos); yaxunl wrote: > tra wrote: > > yaxunl wrote: > > > tra

[PATCH] D80897: [OpenMP] Initial support for std::complex in target regions

2020-06-01 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. Hmm. I'm pretty sure tensorflow is using std::complex for various types. I'm surprised that we haven't seen these functions missing. Plain CUDA (e.g. https://godbolt.org/z/Us6oXC) code appears to have no references to `__mul*` or `__div*`, at least for optimized builds, but

[PATCH] D80897: [OpenMP] Initial support for std::complex in target regions

2020-06-01 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. In D80897#2066952 , @jdoerfert wrote: > In D80897#2066723 , @tra wrote: > > > Hmm. I'm pretty sure tensorflow is using std::complex for various types. > > I'm surprised that we haven't seen the

[PATCH] D60620: [HIP] Support target id by --offload-arch

2020-06-01 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. In D60620#2064839 , @yaxunl wrote: > It means HIP will create two compilation passes: one for gfx908 and one for > gfx908:xnack+:sramecc+. OK, so empty feature list may also be valid. > > >> One thing you'll need is a way to norm

[PATCH] D80858: [HIP] Support accessing static device variable in host code

2020-06-01 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. In D80858#2067537 , @JonChesterfield wrote: > The value is based on llvm::sys::Process::GetRandomNumber(). So unless one > provides a build-system-derived uuid for every compilation unit, recompiling > identical source will yield an

[PATCH] D83893: [CUDA][HIP] Always defer diagnostics for wrong-sided reference

2020-07-16 Thread Artem Belevich via Phabricator via cfe-commits
tra accepted this revision. tra added a comment. This revision is now accepted and ready to land. LGTM. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D83893/new/ https://reviews.llvm.org/D83893 ___ cfe-commits mailing list cfe-commits@lists.

[PATCH] D84068: AMDGPU/clang: Search resource directory for device libraries

2020-07-17 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. Could you walk me through how you see this working in practice? IIUIC, the idea is to have bitcode files located somewhere within clang installation. If that's the case, will we ship those bitcode libraries with clang, or do they come from ROCm packages? If we ship them wit

[PATCH] D60620: [HIP] Support target id by --offload-arch

2020-07-21 Thread Artem Belevich via Phabricator via cfe-commits
tra added inline comments. Comment at: clang/include/clang/Driver/Driver.h:332 + llvm::Triple getHIPOffloadTargetTriple() const; + This is used exclusively by the Driver.cpp and does not have to be a public API call. Comment at: clang/lib/B

[PATCH] D71227: [cuda][hip] Fix function overload resolution in the global initiailizer.

2020-07-22 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. Is this patch still actual? Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D71227/new/ https://reviews.llvm.org/D71227 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https:/

[PATCH] D84362: [NFC] Add missing functions to PartialDiagnostic

2020-07-22 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. FYI, the patch does not compile: In file included from llvm/repo/clang/lib/Parse/ParseStmtAsm.cpp:13: In file included from /llvm/repo/clang/include/clang/Parse/Parser.h:24: llvm/repo/clang/include/clang/Sema/Sema.h:1833:10: error: use of overloaded operator '<<' is am

[PATCH] D84364: [CUDA][HIP] Defer all diagnostics for host device functions

2020-07-22 Thread Artem Belevich via Phabricator via cfe-commits
tra added a subscriber: rsmith. tra added a comment. It's an interesting idea and it may be needed to handle wider range of implicitly-HD functions. However it's likely to have interesting consequences, not all of them desirable. It may be worth considering hiding the new behavior behind a flag

[PATCH] D84364: [CUDA][HIP] Defer all diagnostics for host device functions

2020-07-23 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. I'm going to try the patch on our CUDA code and see how it fares. Stay tuned. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D84364/new/ https://reviews.llvm.org/D84364 ___ cfe-commits mailing list cfe-commits@lists.llvm.

[PATCH] D80858: [CUDA][HIP] Support accessing static device variable in host code

2020-07-23 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. Would it work if we generate a globally unique visible aliases for the static vars and use the alias' name to register device-side entities without changing their visibility? CHANGES SINCE LAST ACTION https://reviews.llvm.org/D80858/new/ https://reviews.llvm.org/D80858

[PATCH] D80858: [CUDA][HIP] Support accessing static device variable in host code

2020-07-23 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. In D80858#2170604 , @hliao wrote: > In D80858#2170547 , @tra wrote: > > > Would it work if we generate a globally unique visible aliases for the > > static vars and use the alias' name to regis

[PATCH] D84364: [CUDA][HIP] Defer all diagnostics for host device functions

2020-07-23 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. In D84364#2170244 , @tra wrote: > I'm going to try the patch on our CUDA code and see how it fares. Stay tuned. The patch works well enough to build TensorFlow which is a good sign that existing working code should be OK -- we're ma

[PATCH] D80858: [CUDA][HIP] Support accessing static device variable in host code

2020-07-23 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. In D80858#2170781 , @hliao wrote: > The problem is not whether we have solution to tell them but when we need to > add that. Not all `static` device variables need to be visible to the host > side. `Externalizing` them adds the overh

[PATCH] D80858: [CUDA][HIP] Support accessing static device variable in host code

2020-07-27 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. It's a good point. Perhaps this is one of the cases where we should *not* follow nvcc. We can't have our cake (preserve static behavior) and eat it (treat it as non-static in case something on the host side may decide to use an API which uses symbol names). Something's got

[PATCH] D84743: [Clang][AMDGCN] Universal device offloading macros header

2020-07-28 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. I'm not sure it's particularly useful, to be honest. CUDA code still needs to be compatible with NVCC so it can't be used in portable code like TF or other currently used CUDA libraries. It could be useful internally, though, so I'm fine with it for that purpose.

[PATCH] D80858: [CUDA][HIP] Support accessing static device variable in host code

2020-07-28 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. I think Sam's approach is reasonable. The ability to supply CUID to sub-compilations is useful by itself and should probably be split into a separate patch as it's largely independent of the externalization of file-scope static vars. CHANGES SINCE LAST ACTION https://re

[PATCH] D84743: [Clang][AMDGCN] Universal device offloading macros header

2020-07-29 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. Also, do we need the header at all? It would be much easier to just get clang itself to add normalized macros without trying to reconstruct them from the existing macros. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D84743/new/

[PATCH] D71726: Let clang atomic builtins fetch add/sub support floating point types

2020-07-29 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. LGTM, modulo couple of nits. @jyknight are you OK with this? In D71726#2179428 , @yaxunl wrote: > Make IEEE single and double type as supported for fp atomics in all targets > by default. This is based on the assumption that AtomicE

[PATCH] D84824: [HIP] Emit target-id module flag

2020-07-29 Thread Artem Belevich via Phabricator via cfe-commits
tra accepted this revision. tra added inline comments. This revision is now accepted and ready to land. Comment at: clang/lib/CodeGen/CodeGenModule.cpp:603-605 +TargetIDStr == "" +? TargetIDStr +: (Twine(getTriple().str()) + "-" + Targe

[PATCH] D77451: Accept -x cu to indicate language is CUDA, transfer CUDA language flag to header-file arguments

2020-04-06 Thread Artem Belevich via Phabricator via cfe-commits
tra added a reviewer: sammccall. tra added a comment. Please add some more details to the bug description. This change is to make clangd work when compilation database sees CUDA sources compiled with nvcc. NVCC uses different options that should be properly translated. This patch only deals wit

[PATCH] D77451: Accept -x cu to indicate language is CUDA, transfer CUDA language flag to header-file arguments

2020-04-06 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. In D77451#1964971 , @sammccall wrote: > > NVCC uses different options that should be properly translated > > Interested to see how this will work. Is clang itself going to support these > args (act compatibly with nvcc, or is the idea

[PATCH] D77665: [CUDA] Simplify GPU variant handling. NFC.

2020-04-07 Thread Artem Belevich via Phabricator via cfe-commits
tra created this revision. tra added a reviewer: yaxunl. Herald added subscribers: sanjoy.google, bixia. Herald added a project: clang. Instead of hardcoding individual GPU mappings in multiple functions, keep them all in one table and use it to look up the mappings. We also don't care about 'vir

[PATCH] D77665: [CUDA] Simplify GPU variant handling. NFC.

2020-04-07 Thread Artem Belevich via Phabricator via cfe-commits
tra updated this revision to Diff 255756. tra edited the summary of this revision. tra added a comment. Enumerate all known GPU variants during libdevice detection instead of hardcoding them. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D77665/new/

[PATCH] D77670: [CUDA] Add partial support for recent CUDA versions.

2020-04-07 Thread Artem Belevich via Phabricator via cfe-commits
tra created this revision. tra added a reviewer: yaxunl. Herald added subscribers: sanjoy.google, bixia, hiraditya, jholewinski. Herald added a project: clang. Generate PTX using newer versions of PTX and allow using sm_80 with CUDA-11. None of the new features of CUDA-10.2+ have been implemented

[PATCH] D77665: [CUDA] Simplify GPU variant handling. NFC.

2020-04-07 Thread Artem Belevich via Phabricator via cfe-commits
tra marked an inline comment as done. tra added inline comments. Comment at: clang/lib/Basic/Cuda.cpp:61 +// clang-format off +SM2(20, "compute_20"), SM2(21, "compute_20"), // Fermi +SM(30), SM(32), SM(35), SM(37), // Kepler yaxunl wrote: > Thanks fo

[PATCH] D77688: [CUDA] Improve testing of libdevice detection.

2020-04-07 Thread Artem Belevich via Phabricator via cfe-commits
tra created this revision. tra added a reviewer: yaxunl. Herald added subscribers: sanjoy.google, bixia. Herald added a project: clang. Added new testcases for libdevice in CUDA-9+ and removed unused checks. Repository: rG LLVM Github Monorepo https://reviews.llvm.org/D77688 Files: clang/t

[PATCH] D77665: [CUDA] Simplify GPU variant handling. NFC.

2020-04-08 Thread Artem Belevich via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rG33386b20aa82: [CUDA] Simplify GPU variant handling. NFC. (authored by tra). Changed prior to commit: https://reviews.llvm.org/D77665?vs=255756&id=256078#toc Repository: rG LLVM Github Monorepo CHANG

[PATCH] D77743: [HIP] Emit symbols with kernel name in host binary

2020-04-08 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. Would not this scheme create a conflict between the device-side mangled kernel name and the handle which we emit with the same name? I recall that the distinct stub name was introduced specifically to avoid confusion between device-side kernel and the host-side stub that we

[PATCH] D77688: [CUDA] Improve testing of libdevice detection.

2020-04-08 Thread Artem Belevich via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rGd2e498b1725d: [CUDA] Improve testing of libdevice detection. (authored by tra). Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D77688/new/ https://reviews.llv

[PATCH] D77670: [CUDA] Add partial support for recent CUDA versions.

2020-04-08 Thread Artem Belevich via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rGa9627b7ea7e2: [CUDA] Add partial support for recent CUDA versions. (authored by tra). Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D77670/new/ https://revie

[PATCH] D77743: [HIP] Emit symbols with kernel name in host binary

2020-04-08 Thread Artem Belevich via Phabricator via cfe-commits
tra accepted this revision. tra added a comment. This revision is now accepted and ready to land. In D77743#1970163 , @yaxunl wrote: > The kernel handle is a variable. Even if it has the same name as kernel, it > is OK for the debugger since the debugger

[PATCH] D77451: Accept -x cu to indicate language is CUDA, transfer CUDA language flag to header-file arguments

2020-04-08 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. Thank you for the patch. I assume you don't have commit access to LLVM. I can land the patch for you. How should I attribute it? Will `ADRA ` (used in phabricator emails) do or do you prefer some other form? CHANGES SINCE LAST ACTION https://reviews.llvm.org/D77451/new/

[PATCH] D77777: [nvptx] Add `nvvm.texsurf.handle` internalizer.

2020-04-09 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. The patch could use a more detailed description. Specifically, it does not describe the purpose of these changes. > Replace them with the internal version, i.e. nvvm.texsurf.handle.internal > just before the instruction selector. It's not clear what is 'them'. 'nvvm.texsur

[PATCH] D77451: Accept -x cu to indicate language is CUDA, transfer CUDA language flag to header-file arguments

2020-04-09 Thread Artem Belevich via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rG6ed88afd780c: [CUDA] Accept -x cu to indicate language is CUDA, transfer CUDA language flag… (authored by ADRAADRA, committed by tra). Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION ht

[PATCH] D77451: Accept -x cu to indicate language is CUDA, transfer CUDA language flag to header-file arguments

2020-04-09 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. Thank you! A apologize for not checking the patch before landing it. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D77451/new/ https://reviews.llvm.org/D77451 ___ cfe-commits mailin

[PATCH] D77777: [nvptx] Add `nvvm.texsurf.handle` internalizer.

2020-04-10 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. In D7#1972720 , @hliao wrote: > In D7#1972349 , @tra wrote: > > > The patch could use a more detailed description. Specifically, it does not > > describe the purpose of these changes. >

[PATCH] D77777: [nvptx] Add `nvvm.texsurf.handle` internalizer.

2020-04-10 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. In D7#1974849 , @hliao wrote: > >> NVVM IR spec is for nvidia's own compiler. It's based on LLVM, but it does >> not impose direct constraints on LLVM's design choices. > > It would be an advantage and, sometimes, desirable t

[PATCH] D77777: [nvptx] Add `nvvm.texsurf.handle` internalizer.

2020-04-10 Thread Artem Belevich via Phabricator via cfe-commits
tra added a subscriber: majnemer. tra added a comment. In D7#1975178 , @hliao wrote: > the 1st argument in `llvm.nvvm.texsurf.hande.internal` or the 2nd one in > `llvm.nvvm.texsurf.handle` must be kept as an immediate or constant value, > i.e. that g

[PATCH] D77777: [nvptx] Add `nvvm.texsurf.handle` internalizer.

2020-04-10 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. Also, if I read PTX docs correctly, it should be OK to pass texture handle address via an intermediate variable: https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#texture-sampler-and-surface-types > Creating pointers to opaque variables using mov, e.g., mov.u

[PATCH] D76365: [cuda][hip] Add CUDA builtin surface/texture reference support.

2020-04-10 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. It appears I can crash clang with some texture code: https://godbolt.org/z/5vdEwC Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D76365/new/ https://reviews.llvm.org/D76365 ___ cfe-

[PATCH] D77777: [nvptx] Add `nvvm.texsurf.handle` internalizer.

2020-04-10 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. In D7#1975628 , @hliao wrote: > In D7#1975440 , @tra wrote: > > > Also, if I read PTX docs correctly, it should be OK to pass texture handle > > address via an intermediate variable: >

[PATCH] D78019: HIP: Fix handling of denormal mode

2020-04-13 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. LGTM. The patch appears to be an NFC one for CUDA. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D78019/new/ https://reviews.llvm.org/D78019 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/

[PATCH] D77954: [CUDA][HIP] Fix host/device based overload resolution

2020-04-13 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. LGTM in principle. That said, my gut feeling is that this patch has a good chance of breaking something in sufficiently convoluted CUDA code like Eigen. When you land this patch, I'd appreciate if you could do it on a workday morning (Pacific time) so I'm around to test it

[PATCH] D80450: [CUDA][HIP] Fix implicit HD function resolution

2020-06-03 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. In D80450#2055463 , @tra wrote: > Is this patch supposed to be used with D79526 > or instead of it? ^^^ I don't think this has been answered. I would like to test this change before it lands. CHA

[PATCH] D79237: [CUDA][HIP] Fix constexpr variables for C++17

2020-06-03 Thread Artem Belevich via Phabricator via cfe-commits
tra accepted this revision. tra added a comment. This revision is now accepted and ready to land. Tested with tensorflow build. The patch- does not seem to break anything now. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D79237/new/ https://reviews.llvm.org/D79237 ___

[PATCH] D80450: [CUDA][HIP] Fix implicit HD function resolution

2020-06-03 Thread Artem Belevich via Phabricator via cfe-commits
tra accepted this revision. tra added a comment. This revision is now accepted and ready to land. LGTM. Combined with D79526 it appears to work for tensorflow build. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D80450/new/ https://reviews.llvm.org/D804

[PATCH] D81176: [HIP] Add default header and include path

2020-06-04 Thread Artem Belevich via Phabricator via cfe-commits
tra accepted this revision. tra added a comment. This revision is now accepted and ready to land. Thank you for the patch. This will make my life a lot easier. There are a few nits, but it's LGTM in general. Do I understand it correctly that for detection of HIP installation all we need is to f

[PATCH] D81427: [hip] Fix device-only relocatable code compilation.

2020-06-08 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. LGTM in general, but I'll let Sam stamp it. Comment at: clang/lib/Driver/Driver.cpp:4594-4599 +if (!AtTopLevel && JA.getType() == types::TY_LLVM_BC && +(C.getArgs().hasArg(options::OPT_emit_llvm) || + (JA.getOffloadingDeviceKind() == Act

[PATCH] D80450: [CUDA][HIP] Fix implicit HD function resolution

2020-06-11 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. Reproducer for the regression. https://gist.github.com/Artem-B/183e9cfc28c6b04c1c862c853b5d9575 It's not particularly small, but that's as far as I could get it reduced. With the patch, an attempt to instantiate `ag` on line 36 (in the reproducer sources I linked to above)

[PATCH] D80450: [CUDA][HIP] Fix implicit HD function resolution

2020-06-11 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. In D80450#2087938 , @tra wrote: > Reproducer for the regression. > https://gist.github.com/Artem-B/183e9cfc28c6b04c1c862c853b5d9575 > It's not particularly small, but that's as far as I could get it reduced. > > With the patch, an at

[PATCH] D81627: [HIP] Do not call opt/llc for -fno-gpu-rdc

2020-06-11 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. Looks OK in general. I'm happy to see reduced opt/llc use. You may want to get someone more familiar with the AMD GPU compilation process to double-check that the compilation pipeline still does the right thing. Comment at: clang/lib/Driver/Driver.cpp:272

[PATCH] D81713: [HIP] Fix rocm not found on rocm3.5

2020-06-12 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. In D81713#2089760 , @arsenm wrote: > In D81713#2089680 , @yaxunl wrote: > > > In D81713#2089672 , @arsenm wrote: > > > > > Can you add tests for this? Is

[PATCH] D60620: [HIP] Support target id by --offload-arch

2020-06-12 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. In D60620#2089722 , @yaxunl wrote: > In D60620#2067134 , @tra wrote: > > > Do you expect users to specify these IDs? How do you see it being used in > > practice? I think you do need to impleme

[PATCH] D81627: [HIP] Do not call opt/llc for -fno-gpu-rdc

2020-06-12 Thread Artem Belevich via Phabricator via cfe-commits
tra accepted this revision. tra added a comment. This revision is now accepted and ready to land. LGTM. Good to go if @arsenm is OK with fixing -fgpu-rdc in a separate patch. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D81627/new/ https://reviews.llvm.org/D81627 ___

[PATCH] D81713: [HIP] Fix rocm not found on rocm3.5

2020-06-12 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. In D81713#2090309 , @arsenm wrote: > > I can tell that it does not. We're still looking under `/opt/rocm` by > > default. I've ran into it trying to get comgr working with the recent LLVM > > which prompted my comment on Github > >

[PATCH] D81771: [CUDA,HIP] Use VFS for SDK detection.

2020-06-12 Thread Artem Belevich via Phabricator via cfe-commits
tra created this revision. tra added reviewers: yaxunl, arsenm. Herald added subscribers: kerbowa, sanjoy.google, bixia, nhaehnle, wdng, jvesely. Herald added a project: clang. tra updated this revision to Diff 270533. tra edited the summary of this revision. tra added a comment. Replaced another

[PATCH] D81771: [CUDA,HIP] Use VFS for SDK detection.

2020-06-12 Thread Artem Belevich via Phabricator via cfe-commits
tra updated this revision to Diff 270533. tra edited the summary of this revision. tra added a comment. Replaced another use of D.getVFS. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D81771/new/ https://reviews.llvm.org/D81771 Files: clang/lib/D

[PATCH] D81771: [CUDA,HIP] Use VFS for SDK detection.

2020-06-15 Thread Artem Belevich via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rGd700237f1aa1: [CUDA,HIP] Use VFS for SDK detection. (authored by tra). Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D81771/new/ https://reviews.llvm.org/D81

[PATCH] D81861: [HIP] Do not use llvm-link/opt/llc for -fgpu-rdc

2020-06-15 Thread Artem Belevich via Phabricator via cfe-commits
tra accepted this revision. tra added a comment. LGTM. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D81861/new/ https://reviews.llvm.org/D81861 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/l

[PATCH] D81938: [InferAddressSpaces] Handle the pair of `ptrtoint`/`inttoptr`.

2020-06-16 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. In D81938#2095869 , @hliao wrote: > In D81938#2095828 , @lebedev.ri > wrote: > > > This should be two separate patches - inferaddressspace and SROA. > > > Yes, I prepared that into 2 commits bu

[PATCH] D78655: [CUDA][HIP] Let lambda be host device by default

2020-07-06 Thread Artem Belevich via Phabricator via cfe-commits
tra added inline comments. Comment at: clang/lib/Sema/SemaLambda.cpp:1783 + + if (LangOpts.CUDA && LangOpts.CUDAIsDevice) +CUDACheckLambdaCapture(CallOperator, From); yaxunl wrote: > tra wrote: > > I would expect Sema-level diags to be produced durin

[PATCH] D82930: [HIP] Fix rocm detection

2020-07-06 Thread Artem Belevich via Phabricator via cfe-commits
tra added inline comments. Comment at: clang/lib/Driver/ToolChains/AMDGPU.cpp:167 +llvm::ErrorOr> VersionFile = +FS.getBufferForFile(BinPath + "/.hipVersion"); +if (!VersionFile) arsenm wrote: > yaxunl wrote: > > arsenm wrote: > > > arsenm wrote:

[PATCH] D80858: [CUDA][HIP] Support accessing static device variable in host code

2020-07-06 Thread Artem Belevich via Phabricator via cfe-commits
tra added inline comments. Comment at: clang/lib/CodeGen/CodeGenModule.cpp:6069 +llvm::raw_ostream &OS) const { + OS << ".static." << getLangOpts().CUID; +} I suspect that will have interesting issues if CUID is an arbitrary user-supplied string. We may wan

[PATCH] D80897: [OpenMP] Initial support for std::complex in target regions

2020-07-07 Thread Artem Belevich via Phabricator via cfe-commits
tra accepted this revision. tra added a comment. LGTM. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D80897/new/ https://reviews.llvm.org/D80897 ___ cfe-commits mailing list cfe-commits@lists.llvm.org h

[PATCH] D82930: [HIP] Fix rocm detection

2020-07-07 Thread Artem Belevich via Phabricator via cfe-commits
tra added inline comments. Comment at: clang/lib/Driver/ToolChains/AMDGPU.cpp:167 +llvm::ErrorOr> VersionFile = +FS.getBufferForFile(BinPath + "/.hipVersion"); +if (!VersionFile) arsenm wrote: > yaxunl wrote: > > yaxunl wrote: > > > yaxunl wrote:

[PATCH] D82930: [HIP] Fix rocm detection

2020-07-07 Thread Artem Belevich via Phabricator via cfe-commits
tra added inline comments. Comment at: clang/lib/Driver/ToolChains/AMDGPU.cpp:167 +llvm::ErrorOr> VersionFile = +FS.getBufferForFile(BinPath + "/.hipVersion"); +if (!VersionFile) arsenm wrote: > tra wrote: > > arsenm wrote: > > > yaxunl wrote: > >

[PATCH] D78655: [CUDA][HIP] Let lambda be host device by default

2020-07-07 Thread Artem Belevich via Phabricator via cfe-commits
tra added inline comments. Comment at: clang/lib/Sema/SemaCUDA.cpp:757-759 + // In host compilation, deferred diagnostics are only emitted for functions + // which are sure to be emitted on host side since there is no reliable + // way to check if a function is emitted on devi

[PATCH] D78655: [CUDA][HIP] Let lambda be host device by default

2020-07-08 Thread Artem Belevich via Phabricator via cfe-commits
tra accepted this revision. tra added a comment. LGTM. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D78655/new/ https://reviews.llvm.org/D78655 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/l

[PATCH] D82930: [HIP] Fix rocm detection

2020-07-10 Thread Artem Belevich via Phabricator via cfe-commits
tra accepted this revision. tra added a comment. This revision is now accepted and ready to land. Herald added a subscriber: dexonsmith. LGTM in principle. Few style comments. Comment at: clang/lib/Driver/ToolChains/AMDGPU.cpp:161 +unsigned Minor = 0; +auto Splits = HIP

[PATCH] D83591: [OpenMP][CUDA] Fix std::complex in GPU regions

2020-07-10 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. In D83591#2145378 , @JonChesterfield wrote: > Fine by me. Let's get nvptx working properly in tree now and work out how to > wire up amdgcn subsequently. I'm sure a reasonable abstraction will present > itself. I'm missing somethi

[PATCH] D83591: [OpenMP][CUDA] Fix std::complex in GPU regions

2020-07-10 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. In D83591#2145512 , @JonChesterfield wrote: > In D83591#2145437 , @jdoerfert wrote: > > > I did not know they are using __clang_cuda headers. (Site note, we should > > rename them then.) > > >

[PATCH] D83591: [OpenMP][CUDA] Fix std::complex in GPU regions

2020-07-10 Thread Artem Belevich via Phabricator via cfe-commits
tra accepted this revision. tra added a comment. LGTM. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D83591/new/ https://reviews.llvm.org/D83591 ___ cfe-commits mailing list cfe-commits@lists.llvm.org h

[PATCH] D83893: [CUDA][HIP] Always defer diagnostics for wrong-sided reference

2020-07-15 Thread Artem Belevich via Phabricator via cfe-commits
tra added subscribers: jlebar, rsmith. tra added a comment. > This is different from nvcc behavior, where it is diagnosed only if the > function is really emitted: That by itself is not a sufficient reason for relaxing the checks. Clang is stricter/more principled in diagnosing other questional

[PATCH] D78655: [CUDA][HIP] Let non-caputuring lambda be host device

2020-06-22 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. In D78655#2105058 , @yaxunl wrote: > - lambdas with any lambda-capture (which must therefore have an enclosing > function) inherit the enclosing function's HDness. Nit: *any* capture does not necessarily imply existence of the enclo

[PATCH] D78655: [CUDA][HIP] Let non-caputuring lambda be host device

2020-06-22 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. In D78655#2107190 , @yaxunl wrote: > It seems we can only promote non-capturing lambdas, no matter whether it has > enclosing function or not. I'd be OK with promoting only non-capturing lambdas until we figure out a consistent way

[PATCH] D78655: [CUDA][HIP] Let non-caputuring lambda be host device

2020-06-22 Thread Artem Belevich via Phabricator via cfe-commits
tra added inline comments. Comment at: clang/lib/Sema/SemaCUDA.cpp:753 return; + if (LI.Default == LCD_None && LI.Captures.size() == 0) { +Method->addAttr(CUDADeviceAttr::CreateImplicit(Context)); pfultz2 wrote: > There should at least be a flag to enab

[PATCH] D78655: [CUDA][HIP] Let non-caputuring lambda be host device

2020-06-23 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. In D78655#2108047 , @pfultz2 wrote: > > Could you give an example to demonstrate current use and how it will break? > > Here is place where it would break: > > https://github.com/ROCmSoftwarePlatform/AMDMIGraphX/blob/develop/src/target

[PATCH] D82506: [HIP] Add missing options for lto

2020-06-24 Thread Artem Belevich via Phabricator via cfe-commits
tra added inline comments. Comment at: clang/lib/Driver/ToolChains/HIP.cpp:76 +MAttrString.append(Args.MakeArgString(OneFeature)); +if (OneFeature != Features.back()) + MAttrString.append(","); I think StringRef will result in comparison by value. If

[PATCH] D82506: [HIP] Add missing options for lto

2020-06-25 Thread Artem Belevich via Phabricator via cfe-commits
tra accepted this revision. tra added a comment. This revision is now accepted and ready to land. LGTM. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D82506/new/ https://reviews.llvm.org/D82506 ___ cfe-commits mailing list cfe-commits@lists.

[PATCH] D82579: [NFC] Extract unifyTargetFeatures

2020-06-25 Thread Artem Belevich via Phabricator via cfe-commits
tra added inline comments. Comment at: clang/lib/Driver/ToolChains/CommonArgs.cpp:93 +StringRef Name = Features[I]; +assert(Name[0] == '-' || Name[0] == '+'); +LastOpt[Name.drop_front(1)] = I; I don't think assert should be used for something that may

[PATCH] D82579: [NFC] Extract unifyTargetFeatures

2020-06-25 Thread Artem Belevich via Phabricator via cfe-commits
tra marked an inline comment as done. tra added inline comments. Comment at: clang/lib/Driver/ToolChains/CommonArgs.cpp:93 +StringRef Name = Features[I]; +assert(Name[0] == '-' || Name[0] == '+'); +LastOpt[Name.drop_front(1)] = I; yaxunl wrote: > tra

[PATCH] D82579: [NFC] Extract unifyTargetFeatures

2020-06-25 Thread Artem Belevich via Phabricator via cfe-commits
tra accepted this revision. tra added inline comments. This revision is now accepted and ready to land. Comment at: clang/lib/Driver/ToolChains/CommonArgs.cpp:93 +StringRef Name = Features[I]; +assert(Name[0] == '-' || Name[0] == '+'); +LastOpt[Name.drop_front(1)] = I

[PATCH] D78655: [CUDA][HIP] Let lambda be host device by default

2020-06-30 Thread Artem Belevich via Phabricator via cfe-commits
tra added inline comments. Comment at: clang/lib/Sema/SemaCUDA.cpp:750 +bool Sema::CUDACheckLambdaCapture(CXXMethodDecl *Callee, + const sema::Capture &Capture) { What does the return value mean? We don't seem to check it anyway

[PATCH] D78392: [CUDA] Define __CUDACC__ before standard library headers

2020-04-17 Thread Artem Belevich via Phabricator via cfe-commits
tra added inline comments. Comment at: clang/lib/Headers/__clang_cuda_runtime_wrapper.h:44 #include // Preserve common macros that will be changed below by us or by CUDA We should undef __CUDACC__ here to make sure it does not affect anything else in the he

[PATCH] D78392: [CUDA] Define __CUDACC__ before standard library headers

2020-04-17 Thread Artem Belevich via Phabricator via cfe-commits
tra updated this revision to Diff 258396. tra added a comment. Undef __CUDACC__ after the standard headers. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D78392/new/ https://reviews.llvm.org/D78392 Files: clang/lib/Headers/__clang_cuda_runtime_wr

[PATCH] D78392: [CUDA] Define __CUDACC__ before standard library headers

2020-04-17 Thread Artem Belevich via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rG8e2051654044: [CUDA] Define __CUDACC__ before standard library headers (authored by tambre, committed by tra). Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D

[PATCH] D59647: [CUDA][HIP] Warn shared var initialization

2020-04-17 Thread Artem Belevich via Phabricator via cfe-commits
tra requested changes to this revision. tra added a comment. This revision now requires changes to proceed. NVCC is not always right. > Is there a specific use case for this, other than matching nvcc bug-for-bug? I don't think I've seen the answer to my question above. > However, user can turn

[PATCH] D78655: [HIP] Let lambda be host device by default

2020-04-22 Thread Artem Belevich via Phabricator via cfe-commits
tra added a reviewer: rsmith. tra added a subscriber: rsmith. tra added a comment. Summoning @rsmith as I'm sure that there are interesting corner cases in lambda handling that we didn't consider. Making lambdas implicitly HD will make it easier to write the code which can't be instantiated on

[PATCH] D78726: [CUDA] Define __CUDACC_DEBUG__ when compiling device code in debug mode

2020-04-23 Thread Artem Belevich via Phabricator via cfe-commits
tra added inline comments. Comment at: clang/lib/Driver/ToolChains/Cuda.cpp:632 + if (mustEmitDebugInfo(DriverArgs) == EmitSameDebugInfoAsHost) +CC1Args.push_back("-D__CUDACC_DEBUG__"); + Built-in preprocessor macros are defined in clang/lib/Frontend/InitP

[PATCH] D77954: [CUDA][HIP] Fix host/device based overload resolution

2020-04-23 Thread Artem Belevich via Phabricator via cfe-commits
tra accepted this revision. tra added inline comments. This revision is now accepted and ready to land. Comment at: clang/lib/Sema/SemaOverload.cpp:9749 + if (isBetterMultiversionCandidate(Cand1, Cand2)) +return true; + rjmccall wrote: > erichkeane wrote: >

[PATCH] D77954: [CUDA][HIP] Fix host/device based overload resolution

2020-04-24 Thread Artem Belevich via Phabricator via cfe-commits
tra added inline comments. Comment at: clang/lib/Sema/SemaOverload.cpp:9749 + if (isBetterMultiversionCandidate(Cand1, Cand2)) +return true; + yaxunl wrote: > rjmccall wrote: > > tra wrote: > > > rjmccall wrote: > > > > erichkeane wrote: > > > > > yaxunl wro

[PATCH] D77954: [CUDA][HIP] Fix host/device based overload resolution

2020-04-24 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. Go ahead. I'll revert it if it breaks anything on our side. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D77954/new/ https://reviews.llvm.org/D77954 ___ cfe-commits mailing list cf

[PATCH] D78655: [HIP] Let lambda be host device by default

2020-04-24 Thread Artem Belevich via Phabricator via cfe-commits
tra added inline comments. Comment at: clang/test/CodeGenCUDA/lambda.cu:16-26 +template +__global__ void g(F f) { f(); } + +template +void h(F f) { g<<<1,1>>>(f); } + +__device__ int a; yaxunl wrote: > tra wrote: > > The test example may not be doing what it's se

[PATCH] D78970: [CUDA][HIP] Fix ambiguity of new operator

2020-04-28 Thread Artem Belevich via Phabricator via cfe-commits
tra accepted this revision. tra added a comment. OK. Let's give it a try. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D78970/new/ https://reviews.llvm.org/D78970 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.or

[PATCH] D78655: [HIP] Let lambda be host device by default

2020-04-30 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. In D78655#2013226 , @pfultz2 wrote: > > Not only the capture is an issue, like a regular function, lambda could > > also access non-local variables/functions. > > In practice this is not an issue. Hcc will implictly treat anything inl

[PATCH] D78655: [HIP] Let lambda be host device by default

2020-04-30 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. In D78655#2013616 , @pfultz2 wrote: > > I.e. if I pass a mutable lambda by reference to the GPU kernel > > I dont think we are enabling passing host objects by reference through > functions. Although it could be possible to capture th

<    1   2   3   4   5   6   7   8   9   10   >