[PATCH] D102237: [CUDA][HIP] Fix non-ODR-use of static device variable

2021-05-18 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. In D102237#2767131, @yaxunl wrote: > > It would be interesting to know what kind of variables are emitted. I'm still reducing the failure. I'll send you a reproducer once I have it. > Would you like the change reverted? We ha

[PATCH] D102237: [CUDA][HIP] Fix non-ODR-use of static device variable

2021-05-18 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. Here's one example reproducer: https://godbolt.org/z/77M596W89 It's rather hairy, but should be usable for further debugging. There are no CUDA attributes anywhere in sight, but we do end up emitting a host-only constructor for `o_u` which calls `strlen`. Repository: rG

[PATCH] D102237: [CUDA][HIP] Fix non-ODR-use of static device variable

2021-05-18 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. Here's a slightly simpler reproducer: https://godbolt.org/z/85EsxnPPM Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D102237/new/ https://reviews.llvm.org/D102237 ___ cfe-commits maili

[PATCH] D102801: [CUDA][HIP] Fix implicit constant variable

2021-05-19 Thread Artem Belevich via Phabricator via cfe-commits
tra added a reviewer: rsmith. tra added a subscriber: rsmith. tra added a comment. Tentative LGTM as we need it to fix the regression soon. Summoning @rsmith for the 'big picture' opinion. While the patch may fix this particular regression, I wonder if there's a better way to deal with this. We

[PATCH] D102237: [CUDA][HIP] Fix non-ODR-use of static device variable

2021-05-19 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. In D102237#2769475, @yaxunl wrote: > In D102237#2767538, @tra wrote: > >> Here's a slightly simpler reproducer: https://godbolt.org/z/rW6P9e37s > > I have a fix for this: https://reviews.llv

[PATCH] D102801: [CUDA][HIP] Fix implicit constant variable

2021-05-19 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. This patch does not appear to fix the second regression introduced by D102237. Trying to compile the following code triggers an assertion in CGExpr.cpp: class a { public: a(char *); }; void b() { [](char *c) { sta

[PATCH] D102801: [CUDA][HIP] Fix device variables used by host

2021-05-20 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. In D102801#2771664, @yaxunl wrote: > In the updated patch I have a simpler solution which is easier to explain to > the users. Basically we classify variables by how they are emitted: device > side only, host side only, both sides

[PATCH] D102801: [CUDA][HIP] Fix device variables used by host

2021-05-20 Thread Artem Belevich via Phabricator via cfe-commits
tra accepted this revision. tra added a comment. This revision is now accepted and ready to land. LGTM. I've verified that Tensorflow still builds with this patch and that the patch does fix the regressions we've seen. If you could land this patch soon, that would be appreciated. CHANGES SINCE

[PATCH] D102936: [CUDA] Work around compatibility issue with libstdc++ 11.1.0

2021-05-21 Thread Artem Belevich via Phabricator via cfe-commits
tra created this revision. tra added reviewers: jlebar, jwakely. Herald added subscribers: bixia, yaxunl. tra requested review of this revision. Herald added a project: clang. libstdc++ 11.1.0 redeclares __failed_assertion multiple times and that results in the function declared with conflicting

[PATCH] D102936: [CUDA] Work around compatibility issue with libstdc++ 11.1.0

2021-05-21 Thread Artem Belevich via Phabricator via cfe-commits
tra added inline comments. Comment at: clang/lib/Headers/cuda_wrappers/complex:77 +// https://bugs.llvm.org/show_bug.cgi?id=50383 +#pragma push_macro("__failed_assert") +#if _GLIBCXX_RELEASE == 11 && __GLIBCXX__ == 20210427 fodinabor wrote: > Not sure I understan

[PATCH] D102936: [CUDA] Work around compatibility issue with libstdc++ 11.1.0

2021-05-21 Thread Artem Belevich via Phabricator via cfe-commits
tra updated this revision to Diff 347092. tra edited the summary of this revision. tra added a comment. Fixed typo in push/pop macro name. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D102936/new/ https://reviews.llvm.org/D102936 Files: clang/li

[PATCH] D102936: [CUDA] Work around compatibility issue with libstdc++ 11.1.0

2021-05-21 Thread Artem Belevich via Phabricator via cfe-commits
tra added inline comments. Comment at: clang/lib/Headers/cuda_wrappers/complex:77 +// https://bugs.llvm.org/show_bug.cgi?id=50383 +#pragma push_macro("__failed_assert") +#if _GLIBCXX_RELEASE == 11 && __GLIBCXX__ == 20210427 tra wrote: > fodinabor wrote: > > Not s

[PATCH] D102936: [CUDA] Work around compatibility issue with libstdc++ 11.1.0

2021-05-21 Thread Artem Belevich via Phabricator via cfe-commits
tra updated this revision to Diff 347133. tra added a comment. Check only _GLIBCXX_RELEASE Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D102936/new/ https://reviews.llvm.org/D102936 Files: clang/lib/Headers/cuda_wrappers/complex Index: clang/l

[PATCH] D102936: [CUDA] Work around compatibility issue with libstdc++ 11.1.0

2021-05-21 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. In D102936#2774686, @jwakely wrote: > You can't use `__GLIBCXX__` this way. It will be different for different > snapshots from the gcc-11 branch. Some distros are already shipping gcc-11 > snapshots with later dates. > > I would j

[PATCH] D102936: [CUDA] Work around compatibility issue with libstdc++ 11.1.0

2021-05-21 Thread Artem Belevich via Phabricator via cfe-commits
tra added inline comments. Comment at: clang/lib/Headers/cuda_wrappers/complex:79 +#if _GLIBCXX_RELEASE == 11 +#define __failed_assertion __cuda_failed_assertion +#endif yaxunl wrote: > May I ask where is __cuda_failed_assertion defined? Thanks. The function is n
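For readers unfamiliar with the technique discussed in this thread, the following is a minimal sketch of the `#pragma push_macro` / `#pragma pop_macro` renaming pattern. It is not the contents of `cuda_wrappers/complex`: the names `failed_assertion` and `renamed_failed_assertion` are hypothetical stand-ins, and the first inline function stands in for the declarations that a wrapped header would bring in.

```cpp
#include <cassert>

// Save whatever definition (if any) the macro currently has, then rename
// the symbol for the duration of the wrapped region.
#pragma push_macro("failed_assertion")
#define failed_assertion renamed_failed_assertion

// Hypothetical stand-in for the wrapped header. Because of the macro,
// this line actually declares renamed_failed_assertion().
inline int failed_assertion() { return 1; }

// Restore the previous state. The macro was undefined before the push,
// so it is undefined again from here on.
#pragma pop_macro("failed_assertion")

// Code after the wrapped region can use the original name without
// colliding with the declaration made inside it.
inline int failed_assertion() { return 2; }

// Small self-check: both functions coexist under different names.
inline void check_renaming() {
  assert(renamed_failed_assertion() == 1);
  assert(failed_assertion() == 2);
}
```

The point of the push/pop pair is that the rename is scoped: anything the user had defined under that macro name before the wrapper is restored afterwards.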

[PATCH] D101630: [HIP] Fix device-only compilation

2021-05-24 Thread Artem Belevich via Phabricator via cfe-commits
tra added a subscriber: echristo. tra added a comment. In D101630#2777346, @yaxunl wrote: > In D101630#2748513, @tra wrote: > >> How about this: >> If the user explicitly specified `--cuda-host-only` or `--cuda-

[PATCH] D102975: [HIP] Check compatibility of -fgpu-sanitize with offload arch

2021-05-24 Thread Artem Belevich via Phabricator via cfe-commits
tra added inline comments. Herald added a subscriber: foad. Comment at: clang/lib/Driver/ToolChains/AMDGPU.h:114-115 + /// specified and valid. + std::tuple, Optional, + Optional>> + getParsedTargetID(const llvm::opt::ArgList &DriverArgs) const; I'

[PATCH] D102936: [CUDA] Work around compatibility issue with libstdc++ 11.1.0

2021-05-24 Thread Artem Belevich via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rG9a75c06cd9d9: [CUDA] Work around compatibility issue with libstdc++ 11.1.0 (authored by tra). Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D102936/new/ htt

[PATCH] D103108: [CUDA][HIP] Promote const variables to constant

2021-05-26 Thread Artem Belevich via Phabricator via cfe-commits
tra added inline comments. Comment at: clang/lib/Sema/SemaCUDA.cpp:568 +} +// Check whether a variable has an allowed initializer for a CUDA device side +// variable with global storage. \p VD may be a host variable to be checked for Nit: add an empty line to sep

[PATCH] D103108: [CUDA][HIP] Promote const variables to constant

2021-05-26 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. Overall looks good, though I've got one more question. Comment at: clang/test/SemaCUDA/device-use-host-var.cu:90 + const int &ref_const_var = global_const_var; const int &ref_constexpr_var = global_constexpr_var; *out = ref_host_var;

[PATCH] D103221: [HIP] Change default lang std to c++14

2021-06-01 Thread Artem Belevich via Phabricator via cfe-commits
tra added inline comments. Comment at: clang/include/clang/Basic/LangStandards.def:196-197 // CUDA LANGSTANDARD(cuda, "cuda", CUDA, "NVIDIA CUDA(tm)", LineComment | CPlusPlus | Digraphs) It would make sense to bump C++ version for CUDA as well.

[PATCH] D101630: [HIP] Fix device-only compilation

2021-06-01 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. In D101630#2787714, @yaxunl wrote: > How does nvcc --genco behave when there are multiple GPU arch's? Does it > output a fat binary containing multiple ISA's? Also, does it support > device-only compilation for intermediate outputs

[PATCH] D103108: [CUDA][HIP] Promote const variables to constant

2021-06-01 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. LGTM. I would like to test the patch on our code first. Please wait a bit before landing the patch. I should be able to have the results tomorrow. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D103108/new/ https://reviews.llvm.org/D103108

[PATCH] D103221: [CUDA][HIP] Change default lang std to c++14

2021-06-01 Thread Artem Belevich via Phabricator via cfe-commits
tra added inline comments. Comment at: clang/test/SemaCUDA/asm_delayed_diags.cu:31 static __device__ __host__ double t3(double x) { - register long double result; + register long double result; // expected-warning {{'register' storage class specifier is deprecated and incompa

[PATCH] D101630: [HIP] Fix device-only compilation

2021-06-01 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. In D101630#2792052, @yaxunl wrote: > I think for intermediate outputs e.g. preprocessor expansion, IR, and > assembly, probably it makes sense not to bundle by default. Agreed. > However, for default action (emitting object), we n

[PATCH] D103108: [CUDA][HIP] Promote const variables to constant

2021-06-01 Thread Artem Belevich via Phabricator via cfe-commits
tra accepted this revision. tra added a comment. This revision is now accepted and ready to land. I'm done with testing. The patch does not seem to break anything obvious. Tensorflow builds and works. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D103108/new/ https://reviews.llvm.org/D1

[PATCH] D94732: [CUDA] Normalize handling of defauled dtor.

2021-01-21 Thread Artem Belevich via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rG127091bfd5ed: [CUDA] Normalize handling of defauled dtor. (authored by tra). Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D94732/new/ https://reviews.llvm.o

[PATCH] D95299: Fix truncated __OPENMP_NVPTX__ preprocessor condition

2021-01-25 Thread Artem Belevich via Phabricator via cfe-commits
tra added reviewers: tra, ABataev. tra accepted this revision. tra added a comment. This revision is now accepted and ready to land. LGTM Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D95299/new/ https://reviews.llvm.org/D95299

[PATCH] D69322: [hip][cuda] Enable extended lambda support on Windows.

2021-01-26 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. @hliao -- Can you take a look at https://bugs.llvm.org/show_bug.cgi?id=48866. This patch may be relevant there. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D69322/new/ https://reviews.llvm.org/D69322

[PATCH] D95560: [CUDA][HIP] Fix function scope static variable

2021-01-27 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. > A static variable in device and global functions is supposed to have > implicit device attribute. Currently it does not. This causes incorrect > diagnostics about host variables accessed by device functions. Correct diagnostics for device-side local static vars is a valid conc

[PATCH] D95558: [NFC][CUDA] Refactor registering device variable

2021-01-27 Thread Artem Belevich via Phabricator via cfe-commits
tra added inline comments. Comment at: clang/lib/CodeGen/CGCUDANV.cpp:157 llvm::Function *makeModuleDtorFunction() override; + void + adjustShadowVarLinkage(const VarDecl *D, clang-format it? `void` hanging all by itself looks odd. Comment

[PATCH] D69322: [hip][cuda] Enable extended lambda support on Windows.

2021-01-28 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. In D69322#2523058, @tra wrote: > @hliao -- Can you take a look at > https://bugs.llvm.org/show_bug.cgi?id=48866. This patch may be relevant there. @rnk Reid, looks like this patch does fix a lambda mangling issue in CUDA on Window

[PATCH] D95660: [NFC] Disallow unused prefixes under clang/test/Driver

2021-02-01 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. LGTM for CUDA. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D95660/new/ https://reviews.llvm.org/D95660

[PATCH] D95558: [NFC][CUDA] Refactor registering device variable

2021-02-01 Thread Artem Belevich via Phabricator via cfe-commits
tra accepted this revision. tra added a comment. This revision is now accepted and ready to land. A couple of minor nits. LGTM otherwise. Comment at: clang/lib/CodeGen/CGCUDANV.cpp:924 + +void CGNVCUDARuntime::adjustShadowVarLinkage( +const VarDecl *D, llvm::GlobalValue::Linka

[PATCH] D71726: Let clang atomic builtins fetch add/sub support floating point types

2021-02-02 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. In D71726#2536966, @jyknight wrote: > My concern is that this is treating a backend _bug_ as if it were just an > optional feature. But it's not the case that it might be reasonable to either > implement or not implement this in a b

[PATCH] D95558: [NFC][CUDA] Refactor registering device variable

2021-02-02 Thread Artem Belevich via Phabricator via cfe-commits
tra added inline comments. Comment at: clang/lib/CodeGen/CodeGenModule.cpp:4270-4271 } else { - // Host-side shadows of external declarations of device-side - // global variables become internal definitions. These have to - // be internal in order to prevent n

[PATCH] D95560: [CUDA][HIP] Fix function scope static variable

2021-02-02 Thread Artem Belevich via Phabricator via cfe-commits
tra added inline comments. Comment at: clang/lib/Sema/SemaDecl.cpp:7247-7250 + // CUDA/HIP: Function-scope static variables in device or global functions + // have implicit device or constant attribute. Function-scope static variables + // in host device functions have implic

[PATCH] D95840: [CUDA][HIP] Fix checking dependent initalizer

2021-02-02 Thread Artem Belevich via Phabricator via cfe-commits
tra accepted this revision. tra added inline comments. This revision is now accepted and ready to land. Comment at: clang/lib/Sema/SemaCUDA.cpp:538 AllowedInit = - ((VD->getType()->isDependentType() || Init->isValueDependent()) && - VD->isConstexpr()) ||

[PATCH] D95558: [NFC][CUDA] Refactor registering device variable

2021-02-02 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. LGTM. Comment at: clang/lib/CodeGen/CodeGenModule.cpp:4270-4271 } else { - // Host-side shadows of external declarations of device-side - // global variables become internal definitions. These have to - // be internal in order to prevent

[PATCH] D95901: [CUDA][HIP] Fix device variable linkage

2021-02-03 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. > For -fgpu-rdc, shadow variables should not be internalized, otherwise they > cannot be accessed by other TUs. > This is necessary because the shadow variable of external device variables > are always emitted as undefined symbols, which need to resolve to a global > symbo

[PATCH] D95901: [CUDA][HIP] Fix device variable linkage

2021-02-03 Thread Artem Belevich via Phabricator via cfe-commits
tra accepted this revision. tra added a comment. This revision is now accepted and ready to land. In D95901#2539754, @yaxunl wrote: > For -fno-gpu-rdc, two TU's can have global device variables with the same > name, therefore the shadow variables need to

[PATCH] D95901: [CUDA][HIP] Fix device variable linkage

2021-02-03 Thread Artem Belevich via Phabricator via cfe-commits
tra added inline comments. Comment at: clang/lib/AST/ASTContext.cpp:11437-11443 + return ((!getLangOpts().GPURelocatableDeviceCode && + ((D->hasAttr() && + !D->getAttr()->isImplicit()) || +(D->hasAttr() && + !D->getAttr()->isImplicit

[PATCH] D95974: [CUDA, NVPTX] Allow targeting sm_86 GPUs.

2021-02-03 Thread Artem Belevich via Phabricator via cfe-commits
tra created this revision. tra added reviewers: yaxunl, jdoerfert. Herald added subscribers: dexonsmith, bixia, hiraditya, jholewinski. tra requested review of this revision. Herald added projects: clang, LLVM. The patch only plumbs through the option necessary for targeting sm_86 GPUs w/o adding

[PATCH] D95970: [HIP] Allow undefined symbols

2021-02-03 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. What's going to happen if you do have an undefined reference that's *not* to a `__managed__` variable? CHANGES SINCE LAST ACTION https://reviews.llvm.org/D95970/new/ https://reviews.llvm.org/D95970

[PATCH] D95901: [CUDA][HIP] Fix device variable linkage

2021-02-03 Thread Artem Belevich via Phabricator via cfe-commits
tra added inline comments. Comment at: clang/lib/AST/ASTContext.cpp:11437-11443 + return ((!getLangOpts().GPURelocatableDeviceCode && + ((D->hasAttr() && + !D->getAttr()->isImplicit()) || +(D->hasAttr() && + !D->getAttr()->isImplicit

[PATCH] D95974: [CUDA, NVPTX] Allow targeting sm_86 GPUs.

2021-02-03 Thread Artem Belevich via Phabricator via cfe-commits
tra updated this revision to Diff 321220. tra edited the summary of this revision. tra added a comment. Removed debug printout Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D95974/new/ https://reviews.llvm.org/D95974 Files: clang/include/clang/Ba

[PATCH] D95974: [CUDA, NVPTX] Allow targeting sm_86 GPUs.

2021-02-03 Thread Artem Belevich via Phabricator via cfe-commits
tra marked an inline comment as done. tra added a comment. > So we add ptx72 but it's not used with sm_86, interesting. `ptx71` is the minimum/default required PTX version for sm_86. If we compile with CUDA-11.2, clang will set the '+ptx72' as we may potentially need it in order to link in libd

[PATCH] D95970: [HIP] Allow undefined symbols

2021-02-03 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. In D95970#2540414, @yaxunl wrote: > In D95970#2540303, @tra wrote: > >> What's going to happen if you do have an undefined reference that's *not* to >> a `__managed__` variable? > > By defaul

[PATCH] D108787: [CUDA] Pass ExecConfig through BuildCallToMemberFunction

2021-09-16 Thread Artem Belevich via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds. This revision was automatically updated to reflect the committed changes. Closed by commit rG6b20ea696356: [CUDA] Pass ExecConfig through BuildCallToMemberFunction (authored by tra). Repository: rG LLVM Github Monorepo CHANGES SINCE LAST

[PATCH] D110089: [CUDA] Implement experimental support for texture lookups.

2021-09-20 Thread Artem Belevich via Phabricator via cfe-commits
tra created this revision. tra added reviewers: jlebar, yaxunl, hliao. Herald added subscribers: bixia, mgorny. Herald added a reviewer: a.sidorin. tra requested review of this revision. Herald added a project: clang. The patch implements support for texture lookups (mostly) in a header file. The

[PATCH] D110089: [CUDA] Implement experimental support for texture lookups.

2021-09-20 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. Here's expanded and formatted version of the header: https://gist.github.com/Artem-B/ec4290809650f5092d61d6dafa6b0131 It may help to see what's going on. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D110089/new/ https://review

[PATCH] D110089: [CUDA] Implement experimental support for texture lookups.

2021-09-20 Thread Artem Belevich via Phabricator via cfe-commits
tra updated this revision to Diff 373747. tra edited the summary of this revision. tra added a comment. cosmetic cleanups. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D110089/new/ https://reviews.llvm.org/D110089 Files: clang/include/clang/Basi

[PATCH] D110089: [CUDA] Implement experimental support for texture lookups.

2021-09-21 Thread Artem Belevich via Phabricator via cfe-commits
tra updated this revision to Diff 374028. tra added a comment. Minor cleanups Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D110089/new/ https://reviews.llvm.org/D110089 Files: clang/include/clang/Basic/Builtins.def clang/include/clang/Sema/Sem

[PATCH] D110089: [CUDA] Implement experimental support for texture lookups.

2021-09-21 Thread Artem Belevich via Phabricator via cfe-commits
tra updated this revision to Diff 374034. tra added a comment. Undo useless NOLINT Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D110089/new/ https://reviews.llvm.org/D110089 Files: clang/include/clang/Basic/Builtins.def clang/include/clang/Sem

[PATCH] D110089: [CUDA] Implement experimental support for texture lookups.

2021-09-21 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. Most of clang-tidy warnings are irrelevant -- it tries to parse the header all by itself, without CUDA headers. It also ignores `NOLINTNEXTLINE(clang-diagnostic-error)` which was intended to suppress the warning triggered by `#error`. The only useful one was in SemaChecking

[PATCH] D110089: [CUDA] Implement experimental support for texture lookups.

2021-09-22 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. In D110089#3014388, @jlebar wrote: >> One alternative would be to use run-time dispatch, but, given that texture >> lookup is a single instruction, the overhead would be >> substantial-to-prohibitive. > > I guess I'm confused... I

[PATCH] D110089: [CUDA] Implement experimental support for texture lookups.

2021-09-22 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. In D110089#3016145, @jlebar wrote: >> Depending on which particular operation is used, the arguments vary, too. > > So something like > > T __nv_tex_surf_handler(name, arg1) { > switch (name) { > ... > default: >

[PATCH] D110089: [CUDA] Implement experimental support for texture lookups.

2021-09-22 Thread Artem Belevich via Phabricator via cfe-commits
tra updated this revision to Diff 374360. tra added a comment. Switched to purely in-header implementation based on constexpr perfect hash. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D110089/new/ https://reviews.llvm.org/D110089 Files: clang/l
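As an illustration of the general idea mentioned in this update (resolving an operation name to an implementation at compile time, entirely in a header), here is a minimal sketch. It is not the code from the patch. It uses an ordinary FNV-1a hash rather than a perfect hash, so a collision between two names would surface as a duplicate-specialization compile error. All names here (`fnv1a`, `Op`, `DISPATCH`) are hypothetical.

```cpp
#include <cassert>

// FNV-1a string hash, written as a single return statement so that it is
// a valid constexpr function under C++11 rules.
constexpr unsigned fnv1a(const char *s, unsigned h = 2166136261u) {
  return *s == '\0'
             ? h
             : fnv1a(s + 1, (h ^ static_cast<unsigned char>(*s)) * 16777619u);
}

// Primary template is declared but never defined, so dispatching on an
// unknown name fails at compile time instead of at run time.
template <unsigned Id> struct Op;

// One specialization per supported operation name (hypothetical names).
template <> struct Op<fnv1a("add")> {
  static int run(int a, int b) { return a + b; }
};
template <> struct Op<fnv1a("mul")> {
  static int run(int a, int b) { return a * b; }
};

// The name must be a string literal; the hash is evaluated by the
// compiler, so no string comparison happens at run time.
#define DISPATCH(name, a, b) Op<fnv1a(name)>::run((a), (b))

// Small self-check of the dispatch.
inline void check_dispatch() {
  assert(DISPATCH("add", 2, 3) == 5);
  assert(DISPATCH("mul", 2, 3) == 6);
}
```

Because the selection happens during template instantiation, the call compiles down to a direct call to the chosen `run`, which is the property that makes this kind of approach attractive when run-time dispatch would be too expensive.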

[PATCH] D110089: [CUDA] Implement experimental support for texture lookups.

2021-09-22 Thread Artem Belevich via Phabricator via cfe-commits
tra updated this revision to Diff 374362. tra added a comment. Require c++11 for texture support. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D110089/new/ https://reviews.llvm.org/D110089 Files: clang/lib/Headers/CMakeLists.txt clang/lib/Head

[PATCH] D110089: [CUDA] Implement experimental support for texture lookups.

2021-09-22 Thread Artem Belevich via Phabricator via cfe-commits
tra updated this revision to Diff 374391. tra marked 2 inline comments as done. tra added a comment. Added better C++11 guards. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D110089/new/ https://reviews.llvm.org/D110089 Files: clang/lib/Headers/C

[PATCH] D110089: [CUDA] Implement experimental support for texture lookups.

2021-09-22 Thread Artem Belevich via Phabricator via cfe-commits
tra added inline comments. Comment at: clang/lib/Headers/__clang_cuda_texture_intrinsics.h:41 + +namespace { + jlebar wrote: > jlebar wrote: > > what are you trying to accomplish with an anon ns inside a header? > I know you wrote it in the commit message, but th

[PATCH] D110089: [CUDA] Implement experimental support for texture lookups.

2021-09-22 Thread Artem Belevich via Phabricator via cfe-commits
tra updated this revision to Diff 374404. tra added a comment. Cleanups. Added more comments. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D110089/new/ https://reviews.llvm.org/D110089 Files: clang/lib/Headers/CMakeLists.txt clang/lib/Headers/

[PATCH] D110089: [CUDA] Implement experimental support for texture lookups.

2021-09-22 Thread Artem Belevich via Phabricator via cfe-commits
tra updated this revision to Diff 374405. tra added a comment. Removed a test file committed by mistake. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D110089/new/ https://reviews.llvm.org/D110089 Files: clang/lib/Headers/CMakeLists.txt clang/l

[PATCH] D110304: [HIP] Fix linking of asanrt.bc

2021-09-23 Thread Artem Belevich via Phabricator via cfe-commits
tra added inline comments. Comment at: clang/include/clang/Driver/ToolChain.h:116-117 + // Enums corresponding to clang options for linking bitcode, i.e., + // -mlink-builtin-bitcode or -mlink-bitcode-file + enum BitCodeLinkOpt { It appears that what we deal

[PATCH] D110089: [CUDA] Implement experimental support for texture lookups.

2021-09-23 Thread Artem Belevich via Phabricator via cfe-commits
tra updated this revision to Diff 374604. tra added a comment. Added a test. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D110089/new/ https://reviews.llvm.org/D110089 Files: clang/lib/Headers/CMakeLists.txt clang/lib/Headers/__clang_cuda_runt

[PATCH] D110089: [CUDA] Implement experimental support for texture lookups.

2021-09-23 Thread Artem Belevich via Phabricator via cfe-commits
tra updated this revision to Diff 374642. tra added a comment. Disable sparse ops for pre-sm_60 GPUs. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D110089/new/ https://reviews.llvm.org/D110089 Files: clang/lib/Headers/CMakeLists.txt clang/lib/

[PATCH] D110089: [CUDA] Implement experimental support for texture lookups.

2021-09-23 Thread Artem Belevich via Phabricator via cfe-commits
tra updated this revision to Diff 374644. tra added a comment. Sort push/pop_macro. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D110089/new/ https://reviews.llvm.org/D110089 Files: clang/lib/Headers/CMakeLists.txt clang/lib/Headers/__clang_cu

[PATCH] D110304: [HIP] Fix linking of asanrt.bc

2021-09-24 Thread Artem Belevich via Phabricator via cfe-commits
tra accepted this revision. tra added inline comments. This revision is now accepted and ready to land. Comment at: clang/lib/Driver/ToolChains/HIP.cpp:413 } else -BCLibs.push_back(AsanRTL.str()); +BCLibs.push_back({AsanRTL.str(), false}); } --

[PATCH] D110089: [CUDA] Implement experimental support for texture lookups.

2021-09-24 Thread Artem Belevich via Phabricator via cfe-commits
tra updated this revision to Diff 374979. tra added a comment. More comments. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D110089/new/ https://reviews.llvm.org/D110089 Files: clang/lib/Headers/CMakeLists.txt clang/lib/Headers/__clang_cuda_run

[PATCH] D110089: [CUDA] Implement experimental support for texture lookups.

2021-09-24 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. In D110089#3021652, @jlebar wrote: > Okay, I give up on the phab interface. It's unreadable with all the existing > comments and lint errors. Yeah. Phabricator experience is not great. > +// Put all functions into anonymous namesp

[PATCH] D108247: [CUDA] Improve CUDA version detection and diagnostics.

2021-09-27 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. So, what's the current state of affairs regarding CUDA SDK layout in Debian? Clang does rely on very particular ordering of includes, so having CUDA headers in a location clang does not expect will lead to issues sooner or later. If the headers are not available where --cuda-

[PATCH] D108247: [CUDA] Improve CUDA version detection and diagnostics.

2021-09-27 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. In D108247#3025415, @tambre wrote: >> Another workaround would be to place a fake /usr/lib/cuda/include/cuda.h >> with something like this: > > My CMake CI bot [[ https://open.cdash.org/test/498767666 | encountered > `cmath` templa

[PATCH] D110596: [CUDA] Move CUDA SDK include path further down the include search path.

2021-09-27 Thread Artem Belevich via Phabricator via cfe-commits
tra created this revision. tra added reviewers: tambre, jlebar. Herald added subscribers: bixia, yaxunl. tra requested review of this revision. Herald added a project: clang. This allows clang to work on Linux distributions like Debian where /include may be a symlink to /usr/include. We only need

[PATCH] D110596: [CUDA] Move CUDA SDK include path further down the include search path.

2021-09-27 Thread Artem Belevich via Phabricator via cfe-commits
tra updated this revision to Diff 375425. tra edited the summary of this revision. tra added a comment. Fixed the failing test affected by the search path changes. Disabled addition of CUDA include path if -nogpuinc is in effect. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION

[PATCH] D110596: [CUDA] Move CUDA SDK include path further down the include search path.

2021-09-28 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. @tambre I've tested the patch on experimental Debian docker and it appears to work with a symlink `/usr/lib/cuda/include` -> `/usr/include`. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D110596/new/ https://reviews.llvm.org/D11

[PATCH] D110596: [CUDA] Move CUDA SDK include path further down the include search path.

2021-09-28 Thread Artem Belevich via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds. This revision was automatically updated to reflect the committed changes. Closed by commit rGfd582eeffe58: [CUDA] Move CUDA SDK include path further down the include search path. (authored by tra). Repository: rG LLVM Github Monorepo CHAN

[PATCH] D110622: [HIPSPV][3/4] Enable SPIR-V emission for HIP

2021-09-29 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. > A Cuda GPU architecture ‘generic’ is added. The name is picked from the LLVM > SPIR-V Backend. In the HIPSPV code path the architecture name is inserted to > the bundle entry ID as target ID. Target ID is expected to be always present > so a component in the target triple

[PATCH] D110618: [HIPSPV][2/4] Add HIPSPV tool chain

2021-09-29 Thread Artem Belevich via Phabricator via cfe-commits
tra added inline comments. Comment at: clang/include/clang/Driver/Options.td:3701 " do not include the default CUDA/HIP wrapper headers">; +def nohipwrapperinc : Flag<["-"], "nohipwrapperinc">, + HelpText<"Do not include the default HIP wrapper headers">; Is

[PATCH] D110622: [HIPSPV][3/4] Enable SPIR-V emission for HIP

2021-09-29 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. > --offload’ option, which is envisioned in [1], is added for specifying > offload targets. This option is used to override default device target > (amdgcn-amd-amdhsa) for HIP compilation for emitting device code as SPIR-V > binary. The option is handled in getHIPOffloadTar

[PATCH] D110781: [CUDA] Make sure is included with original __THROW defined.

2021-09-29 Thread Artem Belevich via Phabricator via cfe-commits
tra created this revision. tra added a reviewer: jdoerfert. Herald added subscribers: bixia, yaxunl. tra requested review of this revision. Herald added a project: clang. Otherwise we may end up with inconsistent redeclarations of the standard library functions if _FORTIFY_SOURCE is in effect.

[PATCH] D98143: [HIP] Diagnose aggregate args containing half types

2021-09-30 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. Is this patch still relevant? Looks like I've missed it. What exactly is the difference between gcc and clang regarding fp16 and why does it matter for aggregate arguments? On a trivial example both clang and gcc appear to treat _Float16 similarly: https://godbolt.org/z/8Wx

[PATCH] D98143: [HIP] Diagnose aggregate args containing half types

2021-09-30 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. In D98143#3034672 , @yaxunl wrote: > On gcc11 and below, since gcc does not support fp16, it is common practice to > use short to pass fp16 in struct. Then gcc and clang has different ABI: > https://godbolt.org/z/zqhT7x7qo > > Basica

[PATCH] D96105: [CUDA][HIP] Pass -fgpu-rdc to host clang -cc1

2021-02-08 Thread Artem Belevich via Phabricator via cfe-commits
tra accepted this revision. tra added inline comments. This revision is now accepted and ready to land. Comment at: clang/test/Driver/hip-rdc-device-only.hip:50-61 // COMMON: [[CLANG:".*clang.*"]] "-cc1" "-mllvm" "--amdhsa-code-object-version={{[0-9]+}}" "-triple" "amdgcn-amd-a

[PATCH] D95007: [CUDA][HIP] Add -fuse-cuid

2021-02-08 Thread Artem Belevich via Phabricator via cfe-commits
tra accepted this revision. tra added a comment. This revision is now accepted and ready to land. Few test nits. LGTM in principle. Comment at: clang/test/Driver/hip-cuid.hip:98 + +// RUN: rm -rf %t.out + Is it necessary? The next 'RUN' command would overwrite

[PATCH] D85223: [CUDA][HIP] Support accessing static device variable in host code for -fgpu-rdc

2021-02-09 Thread Artem Belevich via Phabricator via cfe-commits
tra accepted this revision. tra added a comment. This revision is now accepted and ready to land. LGTM with new test nits. @JonChesterfield -- are you OK with the patch? Comment at: clang/test/CodeGenCUDA/device-var-linkage.cu:40 // NORDC-DAG: @_ZL3sv1 = dso_local addrspace(1

[PATCH] D85223: [CUDA][HIP] Support accessing static device variable in host code for -fgpu-rdc

2021-02-09 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. What breaks existing abstractions is that we produce N ELF objects from a single TU and the meaning of `static` becomes fuzzy. On one hand, we don't want that static symbol to be visible across objects on the same target, at the same time we do want it to be visible across

[PATCH] D95974: [CUDA, NVPTX] Allow targeting sm_86 GPUs.

2021-02-09 Thread Artem Belevich via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds. This revision was automatically updated to reflect the committed changes. Closed by commit rG2aa01ccec301: [CUDA, NVPTX] Allow targeting sm_86 GPUs. (authored by tra). Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https:

[PATCH] D86376: [HIP] Emit kernel symbol

2021-02-09 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. In D86376#2551298 , @yaxunl wrote: > Actually there is one issue with this approach. > > HIP have API's to launch kernels, which accept kernel as function pointer > argument. Currently when taking address of kernel, we get the stub fu

[PATCH] D86376: [HIP] Emit kernel symbol

2021-02-09 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. In D86376#2552419 , @yaxunl wrote: > For triple chevron with kernel name, it is not needed. We only need > indirection for a triple chevron with a function pointer, in which case we do > not know its stub function at compile time. Th
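
The indirection discussed in this thread (a launch through a function pointer cannot know its kernel at compile time) can be pictured with a small host-only model. This is only a sketch under stated assumptions: `HostStub`, `KernelRegistry`, `resolveLaunchTarget`, the two stub functions, and the device symbol names are all hypothetical and are not the HIP runtime's API or what the patch implements.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Hypothetical host-side stub type; real stubs take the kernel's arguments.
using HostStub = void (*)();

// Hypothetical stand-ins for compiler-generated host stubs. Each one has a
// distinct body so the two functions are clearly distinct entities.
static int lastStubCalled = 0;
void stub_vector_add() { lastStubCalled = 1; }
void stub_scale() { lastStubCalled = 2; }

// Illustrative registry: host stub address -> device kernel symbol name.
class KernelRegistry {
public:
  void registerKernel(HostStub stub, std::string deviceName) {
    table_[stub] = std::move(deviceName);
  }

  // Returns the device symbol for a stub, or an empty string if unknown.
  std::string lookup(HostStub stub) const {
    auto it = table_.find(stub);
    return it == table_.end() ? std::string() : it->second;
  }

private:
  std::map<HostStub, std::string> table_;
};

// A launch by kernel name can be resolved statically. A launch through a
// function pointer only has the pointer value, so it is resolved at run time.
std::string resolveLaunchTarget(const KernelRegistry &reg, HostStub fn) {
  return reg.lookup(fn);
}
```

In this model the lookup cost is paid only on the function-pointer path, which mirrors the point made in the quoted reply that a launch naming the kernel directly needs no indirection.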

[PATCH] D96835: [HIP] Support device sanitizer

2021-02-17 Thread Artem Belevich via Phabricator via cfe-commits
tra added inline comments. Comment at: clang/include/clang/Driver/Options.td:939 "__cyg_profile_func_enter and __cyg_profile_func_exit">; +def fgpu_sanitize : Flag<["-"], "fgpu-sanitize">, + HelpText<"Enable sanitizer for AMDGPU target.">; We do have `BoolFOp

[PATCH] D96906: [AMDGPU] gfx90a support

2021-02-18 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. This is a pretty huge patch, with no details in the commit log. One hour between sending the patch out and landing it is not sufficient for anyone to meaningfully review the patch and there are no mentions of the review done anywhere else. While the code only changes AMDGP

[PATCH] D96906: [AMDGPU] gfx90a support

2021-02-18 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. In D96906#2570095 , @rampitec wrote: > It's a year of work necessarily downstream. Every line there was reviewed and > tested in the downstream. I understand no one can reasonably review something > that big, although I cannot break

[PATCH] D96835: [HIP] Support device sanitizer

2021-02-18 Thread Artem Belevich via Phabricator via cfe-commits
tra accepted this revision. tra added a comment. This revision is now accepted and ready to land. Nice. LGTM with few minor nits. Comment at: clang/lib/Driver/ToolChain.cpp:1185 +ToolChain::getHIPDeviceLibs(const ArgList &DriverArgs) const { + return llvm::SmallVector(); +} --

[PATCH] D97009: [CUDA] fix builtin constraints for PTX 7.2

2021-02-18 Thread Artem Belevich via Phabricator via cfe-commits
tra created this revision. tra added a reviewer: yaxunl. Herald added subscribers: bixia, jholewinski. tra requested review of this revision. Herald added a project: clang. This fixes build issues w/ CUDA-11 introduced by https://reviews.llvm.org/D95974 Repository: rG LLVM Github Monorepo htt

[PATCH] D97009: [CUDA] fix builtin constraints for PTX 7.2

2021-02-18 Thread Artem Belevich via Phabricator via cfe-commits
tra updated this revision to Diff 324812. tra edited the summary of this revision. tra added a comment. Updated the test. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D97009/new/ https://reviews.llvm.org/D97009 Files: clang/include/clang/Basic/B

[PATCH] D96195: [HIP] Fix managed variable linkage

2021-02-18 Thread Artem Belevich via Phabricator via cfe-commits
tra added inline comments. Comment at: clang/lib/CodeGen/CGCUDANV.cpp:1017 + +void CGNVCUDARuntime::transformManagedVars() { + for (auto &&Info : DeviceVars) { A comment about how exactly we're transforming the vars would be helpful. Comment a

[PATCH] D97009: [CUDA] fix builtin constraints for PTX 7.2

2021-02-18 Thread Artem Belevich via Phabricator via cfe-commits
tra updated this revision to Diff 324829. tra added a comment. pop the macro Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D97009/new/ https://reviews.llvm.org/D97009 Files: clang/include/clang/Basic/BuiltinsNVPTX.def clang/test/CodeGen/builtin

[PATCH] D97009: [CUDA] fix builtin constraints for PTX 7.2

2021-02-18 Thread Artem Belevich via Phabricator via cfe-commits
tra marked an inline comment as done. tra added inline comments. Comment at: clang/include/clang/Basic/BuiltinsNVPTX.def:744 #pragma pop_macro("PTX70") #pragma pop_macro("PTX71") yaxunl wrote: > need to pop PTX72 ? Good catch. Done. Repository: rG LLVM Gith
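
The review comment above concerns balancing `#pragma push_macro` with `#pragma pop_macro`. A minimal sketch of that pairing, assuming a hypothetical macro `MIN_PTX` and hypothetical helper functions (none of these names come from `BuiltinsNVPTX.def`):

```cpp
#include <cassert>

// Hypothetical macro that is already defined before the section below.
#define MIN_PTX 60

// --- start of a section that needs its own temporary definition ---
#pragma push_macro("MIN_PTX")  // save the outer definition
#undef MIN_PTX
#define MIN_PTX 72             // temporary value used only in this section
inline int sectionMinPtx() { return MIN_PTX; }
#pragma pop_macro("MIN_PTX")   // restore the outer definition
// --- end of section ---

// Sees the restored outer definition because the push was matched by a pop.
inline int outerMinPtx() { return MIN_PTX; }
```

If the `pop_macro` line were omitted, the temporary definition would leak into every later use of the macro, which is the kind of imbalance a missing pop for a newly added macro would cause.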

[PATCH] D96906: [AMDGPU] gfx90a support

2021-02-19 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. In D96906#2572842 , @msearles wrote: > In D96906#2572749 , @kzhuravl wrote: > >>> The point is that nobody upstream even got a chance to chime in. >> >> We are and will be taking care of any fee

[PATCH] D96195: [HIP] Fix managed variable linkage

2021-02-22 Thread Artem Belevich via Phabricator via cfe-commits
tra accepted this revision. tra added inline comments. This revision is now accepted and ready to land. Comment at: clang/lib/CodeGen/CGCUDARuntime.h:107 + /// Transform managed variables in device compilation. + virtual void transformManagedVars() = 0; }; yax
