[PATCH] D79210: Let clang print registered targets for --version

2020-04-30 Thread Artem Belevich via Phabricator via cfe-commits
tra added a reviewer: echristo. tra added a subscriber: echristo. tra added a comment. LGTM, I've been annoyed by the lack of this info in the past. I've added @echristo for the driver owner's stamp of approval. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D79210/new/ https://reviews.ll

[PATCH] D80858: [CUDA][HIP] Support accessing static device variable in host code for -fno-gpu-rdc

2020-08-04 Thread Artem Belevich via Phabricator via cfe-commits
tra accepted this revision. tra added a comment. What is expected to happen to device statics in anonymous name spaces? It may be worth adding them to the tests. LGTM otherwise. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D80858/new/ https://reviews.llvm.org/D80858 _

[PATCH] D85236: [CUDA] Work around a bug in rint() caused by a broken implementation provided by CUDA.

2020-08-04 Thread Artem Belevich via Phabricator via cfe-commits
tra created this revision. tra added a reviewer: jlebar. Herald added subscribers: sanjoy.google, bixia, yaxunl. Herald added a project: clang. tra requested review of this revision. Normally math functions are forwarded to __nv_* counterparts provided by CUDA's libdevice bitcode. However, __nv_ri

[PATCH] D85276: [PGO][CUDA][HIP] Skip generating profile on the device stub and wrong-side functions.

2020-08-05 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. LGTM for CUDA. Comment at: clang/lib/CodeGen/CodeGenPGO.cpp:839-840 + // Skip host-only functions in the CUDA device compilation and device-only + // functions in the host compilation. + if (CGM.getLangOpts().CUDA && We will still have

[PATCH] D85276: [PGO][CUDA][HIP] Skip generating profile on the device stub and wrong-side functions.

2020-08-05 Thread Artem Belevich via Phabricator via cfe-commits
tra added inline comments. Comment at: clang/lib/CodeGen/CodeGenPGO.cpp:839-840 + // Skip host-only functions in the CUDA device compilation and device-only + // functions in the host compilation. + if (CGM.getLangOpts().CUDA && hliao wrote: > tra wrote: > >

[PATCH] D85236: [CUDA] Work around a bug in rint() caused by a broken implementation provided by CUDA.

2020-08-05 Thread Artem Belevich via Phabricator via cfe-commits
tra updated this revision to Diff 283359. tra edited the summary of this revision. tra added a comment. Also fixed the same bug in nearbyint(). Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D85236/new/ https://reviews.llvm.org/D85236 Files: clang

[PATCH] D85236: [CUDA] Work around a bug in rint() caused by a broken implementation provided by CUDA.

2020-08-05 Thread Artem Belevich via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds. This revision was automatically updated to reflect the committed changes. Closed by commit rG7d057efddc00: [CUDA] Work around a bug in rint/nearbyint caused by a broken implementation… (authored by tra). Repository: rG LLVM Github Monorepo

[PATCH] D85276: [PGO][CUDA][HIP] Skip generating profile on the device stub and wrong-side functions.

2020-08-06 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. In D85276#2199655 , @yaxunl wrote: > Do we need to disable pgo and coverage mapping for device compilation? Or it > is already disabled? We already disable profiling during device compilation for NVIDIA and AMD GPUs: https://github.c

[PATCH] D85276: [PGO][CUDA][HIP] Skip generating profile on the device stub and wrong-side functions.

2020-08-06 Thread Artem Belevich via Phabricator via cfe-commits
tra added inline comments. Comment at: clang/lib/CodeGen/CodeGenPGO.cpp:839-840 + // Skip host-only functions in the CUDA device compilation and device-only + // functions in the host compilation. + if (CGM.getLangOpts().CUDA && hliao wrote: > tra wrote: > >

[PATCH] D84364: [CUDA][HIP] Defer overloading resolution diagnostics for host device functions

2020-08-06 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. In D84364#2176091 , @yaxunl wrote: > I added a `Deferrable` bit to the diagnostics which can be specified in td > files. This can be added to individual diagnostic defs or added to a bunch of > diagnostic defs all together. > > This

[PATCH] D85551: [OpenMP] Split OpenMP/target_map_codegen test [NFC]

2020-08-07 Thread Artem Belevich via Phabricator via cfe-commits
tra created this revision. tra added a reviewer: ABataev. Herald added subscribers: sanjoy.google, bixia, guansong, yaxunl. Herald added a project: clang. tra requested review of this revision. Herald added a reviewer: jdoerfert. Herald added a subscriber: sstefan1. The test file is the single lon

[PATCH] D85551: [OpenMP] Split OpenMP/target_map_codegen test [NFC]

2020-08-07 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. In D85551#2203711 , @jdoerfert wrote: > Wow, cool. I imagine it was hard to split this given the manual check lines. > We really need to start using the upgrade scripts here. > > I'm fine with this, @ABataev WDYT? It was not that har

[PATCH] D85551: [OpenMP] Split OpenMP/target_map_codegen test [NFC]

2020-08-07 Thread Artem Belevich via Phabricator via cfe-commits
tra added inline comments. Comment at: clang/test/OpenMP/target_map_codegen_18.cpp:5 + +///==/// +// RUN: %clang_cc1 -DUSE -DCK19 -verify -fopenmp -fopenmp-targets=powerpc64le-ibm-linux-gnu -x c++ -triple p

[PATCH] D85551: [OpenMP] Split OpenMP/target_map_codegen test [NFC]

2020-08-07 Thread Artem Belevich via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds. This revision was automatically updated to reflect the committed changes. Closed by commit rGcd01980f308a: [OpenMP] Split OpenMP/target_map_codegen test [NFC] (authored by tra). Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTIO

[PATCH] D85575: [ARM] Speed up arm-cortex-cpus.c test

2020-08-07 Thread Artem Belevich via Phabricator via cfe-commits
tra created this revision. tra added a reviewer: LukeGeeson. Herald added subscribers: danielkiss, sanjoy.google, bixia, kristof.beyls. Herald added a project: clang. tra requested review of this revision. Trailing wildcard regex searches greedily continue searching through the whole input and mak

[PATCH] D80858: [CUDA][HIP] Support accessing static device variable in host code for -fno-gpu-rdc

2020-08-10 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. Sam, just a FYI that the patch has a couple of unintended consequences. We now end up with various things instantiated as device-side __constant__ objects when they were not before, when we compile with -std=c++17 (especially with libc++): https://godbolt.org/z/KbTM9M That

[PATCH] D80858: [CUDA][HIP] Support accessing static device variable in host code for -fno-gpu-rdc

2020-08-10 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. > We can restrict externalization to constant variables with explicit > 'constant' attributes only, which should fix this issue. SGTM. If it does not have explicit device-side attribute, it's never going to need to cross host/device boundary. I guess this applies to vars wi

[PATCH] D85686: [CUDA][HIP] Do not externalize implicit constant static variable

2020-08-10 Thread Artem Belevich via Phabricator via cfe-commits
tra accepted this revision. tra added inline comments. This revision is now accepted and ready to land. Comment at: clang/lib/AST/ASTContext.cpp:11196 return !getLangOpts().GPURelocatableDeviceCode && - (D->hasAttr() || D->hasAttr()) && + (D->hasAttr() || +

[PATCH] D80858: [CUDA][HIP] Support accessing static device variable in host code for -fno-gpu-rdc

2020-08-10 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. > The fix is here https://reviews.llvm.org/D85686 Thank you. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D80858/new/ https://reviews.llvm.org/D80858 ___ cfe-commits mailing list cf

[PATCH] D84068: AMDGPU/clang: Search resource directory for device libraries

2020-08-10 Thread Artem Belevich via Phabricator via cfe-commits
tra added a subscriber: echristo. tra added a comment. In D84068#2204713 , @arsenm wrote: >> If we ship them with clang, who/where/how builds them? >> If they come from ROCm packages, how would those packages add stuff into >> *clang* install directory? R

[PATCH] D85575: [ARM] Speed up arm-cortex-cpus.c test

2020-08-10 Thread Artem Belevich via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rG9c8ae4086031: [ARM] Speed up arm-cortex-cpus.c test (authored by tra). Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D85575/new/ https://reviews.llvm.org/D85

[PATCH] D85695: [OpenMP] split execution of a long test into smaller parts.

2020-08-10 Thread Artem Belevich via Phabricator via cfe-commits
tra created this revision. tra added reviewers: ABataev, jdoerfert. Herald added subscribers: sanjoy.google, bixia, guansong, yaxunl. Herald added a project: clang. tra requested review of this revision. Herald added a subscriber: sstefan1. This test is bottlenecked by heavy regex use (~0.6s per F

[PATCH] D85551: [OpenMP] Split OpenMP/target_map_codegen test [NFC]

2020-08-10 Thread Artem Belevich via Phabricator via cfe-commits
tra added inline comments. Comment at: clang/test/OpenMP/target_map_codegen_18.cpp:5 + +///==/// +// RUN: %clang_cc1 -DUSE -DCK19 -verify -fopenmp -fopenmp-targets=powerpc64le-ibm-linux-gnu -x c++ -triple p

[PATCH] D85695: [OpenMP] split execution of a long test into smaller parts.

2020-08-10 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. The test is no longer sticking out: https://gist.github.com/Artem-B/d0b05c2e98a49158c02de23f7f4f0279 Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D85695/new/ https://reviews.llvm.org/D85695 ___

[PATCH] D85695: [OpenMP] split execution of a long test into smaller parts.

2020-08-11 Thread Artem Belevich via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds. This revision was automatically updated to reflect the committed changes. Closed by commit rGec5f793996f4: [OpenMP] split execution of a long test into smaller parts. (authored by tra). Repository: rG LLVM Github Monorepo CHANGES SINCE LA

[PATCH] D60620: [HIP] Support target id by --offload-arch

2020-08-11 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. Looks good in general. Mostly C++ style comments below. Comment at: clang/include/clang/Basic/TargetID.h:30 +/// Returns canonical processor name or empty if the processor name is invalid. +llvm::StringRef getProcessorFromTargetID(const llvm::Triple &T, +

[PATCH] D85798: Split Preprocessor/init.c test. NFC.

2020-08-11 Thread Artem Belevich via Phabricator via cfe-commits
tra created this revision. tra added a reviewer: MaskRay. Herald added subscribers: sanjoy.google, mstorsjo, jfb, atanasyan, bixia, jrtc27, kbarton, krytarowski, nemanjai. Herald added a project: clang. tra requested review of this revision. Herald added a subscriber: wuzish. Some parts of the te

[PATCH] D85879: [OpenMP] Overload `std::isnan` and friends multiple times for the GPU

2020-08-13 Thread Artem Belevich via Phabricator via cfe-commits
tra added inline comments. Comment at: clang/lib/Headers/__clang_cuda_cmath.h:85 +//(note that we do not create implicit base functions here). To avoid +//this clash we add a new trait to some of them that is always true +//(this is LLVM after all ;)). It

[PATCH] D85798: Split Preprocessor/init.c test. NFC.

2020-08-13 Thread Artem Belevich via Phabricator via cfe-commits
tra added inline comments. Comment at: clang/test/Preprocessor/init-arm.c:19 +// ARM:#define __CHAR16_TYPE__ unsigned short +// ARM:#define __CHAR32_TYPE__ unsigned int +// ARM:#define __CHAR_BIT__ 8 MaskRay wrote: > Consider spending a bit more time and adding `

[PATCH] D85879: [OpenMP] Overload `std::isnan` and friends multiple times for the GPU

2020-08-13 Thread Artem Belevich via Phabricator via cfe-commits
tra added inline comments. Comment at: clang/lib/Headers/__clang_cuda_cmath.h:85 +//(note that we do not create implicit base functions here). To avoid +//this clash we add a new trait to some of them that is always true +//(this is LLVM after all ;)). It

[PATCH] D60620: [HIP] Support target id by --offload-arch

2020-08-13 Thread Artem Belevich via Phabricator via cfe-commits
tra accepted this revision. tra added a comment. This revision is now accepted and ready to land. Couple of minor nits. LGTM otherwise. Comment at: clang/include/clang/Basic/TargetID.h:42 +/// is not a null pointer. +/// If \p CanonicalizeProc is true, canonicalize returned pro

[PATCH] D85879: [OpenMP] Overload `std::isnan` and friends multiple times for the GPU

2020-08-13 Thread Artem Belevich via Phabricator via cfe-commits
tra added inline comments. Comment at: clang/lib/Headers/__clang_cuda_cmath.h:85 +//(note that we do not create implicit base functions here). To avoid +//this clash we add a new trait to some of them that is always true +//(this is LLVM after all ;)). It

[PATCH] D85879: [OpenMP] Overload `std::isnan` and friends multiple times for the GPU

2020-08-14 Thread Artem Belevich via Phabricator via cfe-commits
tra accepted this revision. tra added inline comments. Comment at: clang/lib/Headers/__clang_cuda_cmath.h:85 +//(note that we do not create implicit base functions here). To avoid +//this clash we add a new trait to some of them that is always true +//(thi

[PATCH] D85798: Split Preprocessor/init.c test. NFC.

2020-08-14 Thread Artem Belevich via Phabricator via cfe-commits
tra updated this revision to Diff 285745. tra edited the summary of this revision. tra added a comment. Separated blocks of RUN commands with an empty line to make navigation easier. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D85798/new/ https://

[PATCH] D85798: Split Preprocessor/init.c test. NFC.

2020-08-14 Thread Artem Belevich via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds. This revision was automatically updated to reflect the committed changes. Closed by commit rG1689c36b1aeb: Split Preprocessor/init.c test (authored by tra). Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.l

[PATCH] D78655: [HIP] Add -fhip-lambda-host-device

2020-05-04 Thread Artem Belevich via Phabricator via cfe-commits
tra accepted this revision. tra added a comment. LGTM. Thank you for adding the checks for capturing lambdas and for putting this behind the flag. I've asked @rsmith to chime in in case there's anything else about the lambdas we need to deal with. Please wait a day or two before landing the pat

[PATCH] D79344: [cuda] Start diagnosing variables with bad target.

2020-05-04 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. This has a good chance of breaking existing code. It would be great to add an escape hatch option to revert to the old behavior if we run into problems. The change is relatively simple, so reverting it in case something goes wrong should work, too. Up to you. ===

[PATCH] D79344: [cuda] Start diagnosing variables with bad target.

2020-05-04 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. In D79344#2018628 , @hliao wrote: > In D79344#2018561 , @tra wrote: > > > This has a good chance of breaking existing code. It would be great to add > > an escape hatch option to revert to the

[PATCH] D79344: [cuda] Start diagnosing variables with bad target.

2020-05-04 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. In D79344#2018693 , @hliao wrote: > OK, I will add one option, But, do we turn it on by default or off? As a rule of thumb, if it's an experimental feature, then the default would be off. For a change which should be the default, bu

[PATCH] D79367: [CUDA][HIP] Fix empty ctor/dtor check for union

2020-05-04 Thread Artem Belevich via Phabricator via cfe-commits
tra accepted this revision. tra added a comment. This revision is now accepted and ready to land. Nice! Thank you for the patch. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D79367/new/ https://reviews.llvm.org/D79367 ___ cfe-commits mailin

[PATCH] D78655: [HIP] Add -fhip-lambda-host-device

2020-05-05 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. > I agree with Richard that just making lambdas HD by default in all modes > seems like the right rule. Ack. Let's give it a try. I'll test this on our code and see what falls out. Stay tuned. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D78655/new/ https://revie

[PATCH] D77954: [CUDA][HIP] Fix host/device based overload resolution

2020-05-05 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. It appears that re-landed b46b1a916d44216f0c70de55ae2123eb9de69027 has created another compilation regression. I don't have a simple reproducer yet, so here's the error message for now: llvm_unstable/t

[PATCH] D77954: [CUDA][HIP] Fix host/device based overload resolution

2020-05-05 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. FYI, I've just reverted it in bf6a26b066382e0f41bf023c781d84061c542307 Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D77954/new/ https://reviews.llvm.org/D77

[PATCH] D77954: [CUDA][HIP] Fix host/device based overload resolution

2020-05-05 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. In D77954#2021299 , @yaxunl wrote: > I will put a workaround: In device compilation, in implicit `__device__ > __host__` callers, I will keep the old behavior, that is, implicit > `__device__ __host__` candidate has equal preference

[PATCH] D79449: [CUDA] Make NVVM builtins available with CUDA-11 & PTX6.5

2020-05-05 Thread Artem Belevich via Phabricator via cfe-commits
tra updated this revision to Diff 262236. tra added a comment. Test that the builtins work all the way to sm_80/ptx65 Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D79449/new/ https://reviews.llvm.org/D79449 Files: clang/include/clang/Basic/Built

[PATCH] D79449: [CUDA] Make NVVM builtins available with CUDA-11 & PTX6.5

2020-05-05 Thread Artem Belevich via Phabricator via cfe-commits
tra created this revision. tra added a reviewer: timshen. Herald added subscribers: sanjoy.google, bixia, yaxunl, jholewinski. Herald added a project: clang. tra updated this revision to Diff 262236. tra added a comment. Test that the builtins work all the way to sm_80/ptx65 Repository: rG LLV

[PATCH] D79449: [CUDA] Make NVVM builtins available with CUDA-11 & PTX6.5

2020-05-05 Thread Artem Belevich via Phabricator via cfe-commits
tra updated this revision to Diff 262237. tra added a comment. Actually added sm_80 predicate. It's not used on any builtins yet, but it makes sense to add it right now along with ptx65. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D79449/new/ http

[PATCH] D79449: [CUDA] Make NVVM builtins available with CUDA-11 & PTX6.5

2020-05-05 Thread Artem Belevich via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rG844096b996a0: [CUDA] Make NVVM builtins available with CUDA-11/PTX6.5 (authored by tra). Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D79449/new/ https://re

[PATCH] D79515: [CUDA] Enable existing builtins for PTX7.0 as well.

2020-05-06 Thread Artem Belevich via Phabricator via cfe-commits
tra created this revision. tra added a reviewer: timshen. Herald added subscribers: sanjoy.google, bixia, yaxunl, jholewinski. Herald added a project: clang. I've missed PXT7.0 in D79449 Repository: rG LLVM Github Monorepo https://reviews.llvm.org/D79515 Fil

[PATCH] D79515: [CUDA] Enable existing builtins for PTX7.0 as well.

2020-05-06 Thread Artem Belevich via Phabricator via cfe-commits
tra updated this revision to Diff 262473. tra added a comment. Updates test. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D79515/new/ https://reviews.llvm.org/D79515 Files: clang/include/clang/Basic/BuiltinsNVPTX.def clang/test/CodeGen/builtin

[PATCH] D79515: [CUDA] Enable existing builtins for PTX7.0 as well.

2020-05-06 Thread Artem Belevich via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rG314f99e7d42d: [CUDA] Enable existing builtins for PTX7.0 as well. (authored by tra). Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D79515/new/ https://review

[PATCH] D79526: [CUDA][HIP] Workaround for resolving host device function against wrong-sided function

2020-05-07 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. I've tested the patch on our sources and it still breaks tensorflow compilation, though in a different way: In file included from third_party/tensorflow/core/kernels/slice_op_gpu.cu.cc:22: In file included from ./third_party/tensorflow/core/framework/register_types.h:2

[PATCH] D79344: [cuda] Start diagnosing variables with bad target.

2020-05-07 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. In D79344#2018915 , @tra wrote: > If you can wait, I can try patching this change into our clang tree and then > see if it breaks anything obvious. If nothing falls apart, I'll be fine with > the patch as is. The patch appears to b

[PATCH] D78655: [HIP] Add -fhip-lambda-host-device

2020-05-07 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. In D78655#2020651 , @tra wrote: > Ack. Let's give it a try. I'll test this on our code and see what falls out. > Stay tuned. The patch seems to cause no issues. I've ran it with local changes that enable it unconditionally for CUDA

[PATCH] D79344: [cuda] Start diagnosing variables with bad target.

2020-05-07 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. We're calling `copysign( int, double)`. The standard library provides `copysign(double, double)`, CUDA provides only `copysign(float, double)`. As far as C++ is concerned, both require one type conversion. I guess previously we would give `__device__` one provided by CUDA

[PATCH] D79344: [cuda] Start diagnosing variables with bad target.

2020-05-07 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. In D79344#2026126 , @hliao wrote: > In D79344#2026025 , @tra wrote: > > > We're calling `copysign( int, double)`. The standard library provides > > `copysign(double, double)`, CUDA provides onl

[PATCH] D79344: [cuda] Start diagnosing variables with bad target.

2020-05-07 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. The problem is reproducible in upstream clang. Let's see if I can reduce it to something simpler. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D79344/new/ https://reviews.llvm.org/D79344

[PATCH] D79344: [cuda] Start diagnosing variables with bad target.

2020-05-07 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. In D79344#2026180 , @tra wrote: > The problem is reproducible in upstream clang. Let's see if I can reduce it > to something simpler. Reduced it down to this -- compiles with clang w/o the patch, but fails with it. __attribute__(

[PATCH] D79344: [cuda] Start diagnosing variables with bad target.

2020-05-07 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. Here's a slightly smaller variant which may be a good clue for tracking down the root cause. This one fails with: var.cc:6:14: error: no matching function for call to 'copysign' double g = copysign(0, g); ^~~~ var.cc:5:56: note: candidate template

[PATCH] D79526: [CUDA][HIP] Workaround for resolving host device function against wrong-sided function

2020-05-08 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. The latest version of the patch works well enough to compile tensorflow. That's the good news. In D79526#2026857 , @yaxunl wrote: > Looks like we went overboard to treat implicit host device candidate as > inferior. They should be t

[PATCH] D79526: [CUDA][HIP] Workaround for resolving host device function against wrong-sided function

2020-05-08 Thread Artem Belevich via Phabricator via cfe-commits
tra added a subscriber: wash. tra added a comment. In D79526#2027470 , @yaxunl wrote: > For implicit host device functions, since they are not guaranteed to work in > device compilation, we can only resolve them as if they are host functions. > This caus

[PATCH] D79526: [CUDA][HIP] Workaround for resolving host device function against wrong-sided function

2020-05-08 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. This one is just a FYI. I've managed to reduce the failure in the first version of this patch and it looks rather odd because the reduced test case has nothing to do with CUDA. Instead it appears to introduce a difference in compilation of regular host-only C++ code with `-

[PATCH] D79344: [cuda] Start diagnosing variables with bad target.

2020-05-08 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. This triggers an assertion: clang: /usr/local/google/home/tra/work/llvm/repo/clang/lib/AST/Decl.cpp:2697: clang::Expr *clang::ParmVarDecl::getDefaultArg(): Assertion `!hasUninstantiatedDefaultArg() && "Default argument is not yet instantiated!"' failed. #2 0x7f

[PATCH] D79526: [CUDA][HIP] Workaround for resolving host device function against wrong-sided function

2020-05-11 Thread Artem Belevich via Phabricator via cfe-commits
tra added inline comments. Comment at: clang/include/clang/Sema/Sema.h:11670 + bool IsCUDAImplicitHostDeviceFunction(const FunctionDecl *D); + I think this can be `static` as it does not need Sema's state. Comment at: clang/lib/Sema/SemaCUDA

[PATCH] D79526: [CUDA][HIP] Workaround for resolving host device function against wrong-sided function

2020-05-11 Thread Artem Belevich via Phabricator via cfe-commits
tra accepted this revision. tra added a comment. This revision is now accepted and ready to land. LGTM, modulo cosmetic test changes mentioned below. Comment at: clang/test/SemaCUDA/function-overload.cu:479 +namespace ImplicitHostDeviceVsWrongSided { +inline CorrectOverloadRetT

[PATCH] D79237: [CUDA][HIP] Fix constexpr variables for C++17

2020-05-12 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. > constexpr variables are compile time constants and implicitly const, therefore > they are safe to emit on both device and host side. Besides, in many cases > they are intended for both device and host, therefore it makes sense > to emit them on both device and host sides

[PATCH] D79866: [CUDA][HIP] Do not emit debug info for stub function

2020-05-13 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. I do not see the behavior the patch is supposed to fix in CUDA. If I compile a simple program, host-side debugger does not see the `kernel`, sees `__device_stub_kernel` and, if the breakpoint is set on `kernel`, it treats it as a yet-to-be-loaded one and does end up breaking

[PATCH] D79866: [CUDA][HIP] Do not emit debug info for stub function

2020-05-13 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. In D79866#2034683 , @yaxunl wrote: > can you try set bp by using file name and line number on the kernel? In regular gdb it is set on the stub. In cuda-gdb the behavior is interesting -- it initially gets set and breaks on the stub,

[PATCH] D79967: Fix debug info for NoDebug attr

2020-05-14 Thread Artem Belevich via Phabricator via cfe-commits
tra added a reviewer: dblaikie. tra added a subscriber: dblaikie. tra added a comment. LGTM. Added @dblaikie as reviewer for debug info expertise. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D79967/new/ https://reviews.llvm.org/D79967 ___

[PATCH] D79237: [CUDA][HIP] Fix constexpr variables for C++17

2020-05-15 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. LGTM in general. Let me check the patch on our tensorflow build. Comment at: clang/test/SemaCUDA/constexpr-variables.cu:30-31 + static constexpr int c = sizeof(a); + a[0] = &b; + a[1] = &c; + foo(a); Can we verify the diags for bad case

[PATCH] D79237: [CUDA][HIP] Fix constexpr variables for C++17

2020-05-15 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. In D79237#2039417 , @tra wrote: > LGTM in general. Let me check the patch on our tensorflow build. Bad news -- it breaks the standard C++ library. Reproducer: $ bin/clang++ -x cuda /dev/null -fsyntax-only -include algorithm --cu

[PATCH] D79237: [CUDA][HIP] Fix constexpr variables for C++17

2020-05-15 Thread Artem Belevich via Phabricator via cfe-commits
tra added inline comments. Comment at: clang/test/SemaCUDA/constexpr-variables.cu:30-31 + static constexpr int c = sizeof(a); + a[0] = &b; + a[1] = &c; + foo(a); yaxunl wrote: > tra wrote: > > Can we verify the diags for bad cases, too? > By bad cases you mea

[PATCH] D79526: [CUDA][HIP] Workaround for resolving host device function against wrong-sided function

2020-05-18 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. e03394c6a6ff5832aa43259d4b8345f40ca6a22c Still breaks some of the existing CUDA code (got failures in pytorch and Eigen). I'll revert the patch and will send you a reduced reproducer. Repository: rG L

[PATCH] D79526: [CUDA][HIP] Workaround for resolving host device function against wrong-sided function

2020-05-18 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. Reduced test case: struct a { __attribute__((device)) a(short); __attribute__((device)) operator unsigned() const; __attribute__((device)) operator int() const; }; struct b { a d; }; void f(b g) { b e = g; } Failure: $ bin/clang++ -x cuda aten.cc

[PATCH] D78155: [OpenMP] Use __OPENMP_NVPTX__ instead of _OPENMP in wrapper headers

2020-05-19 Thread Artem Belevich via Phabricator via cfe-commits
tra accepted this revision. tra added a comment. This revision is now accepted and ready to land. LGTM. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D78155/new/ https://reviews.llvm.org/D78155 ___ cfe-

[PATCH] D40275: [CUDA] Report "unsupported VLA" errors only on device side.

2017-11-22 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. @rjmccall : are you OK with this approach? If VLA is not supported by the target, CUDA is handled as a special case so it can emit deferred diag, OpenMP reports an error only if shouldDiagnoseTargetSupportFromOpenMP() allows it, and everything else does so unconditionally.

[PATCH] D40453: Add the nvidia-cuda-toolkit Debian package path to search path

2017-11-27 Thread Artem Belevich via Phabricator via cfe-commits
tra requested changes to this revision. tra added a comment. This revision now requires changes to proceed. I'm reluctant to add a distribution-specific search path unconditionally. We do have Distro.IsDebian() https://github.com/llvm-mirror/clang/blob/9f9177d3ef72580ca29e8844327f63d7aa1908af/in

[PATCH] D40275: [CUDA] Report "unsupported VLA" errors only on device side.

2017-11-27 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. In https://reviews.llvm.org/D40275#933253, @tra wrote: > @rjmccall : are you OK with this approach? If VLA is not supported by the > target, CUDA is handled as a special case so it can emit deferred diag, > OpenMP reports an error only if shouldDiagnoseTargetSupportFromOpen

[PATCH] D40275: [CUDA] Report "unsupported VLA" errors only on device side.

2017-11-28 Thread Artem Belevich via Phabricator via cfe-commits
tra added inline comments. Comment at: clang/lib/Sema/SemaType.cpp:2188 + !Context.getTargetInfo().isVLASupported() && + shouldDiagnoseTargetSupportFromOpenMP()) { + // Some targets don't support VLAs. rjmccall wrote: > tra wrote:

[PATCH] D40275: [CUDA] Report "unsupported VLA" errors only on device side.

2017-11-28 Thread Artem Belevich via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rL319201: [CUDA] Report "unsupported VLA" errors only on device side. (authored by tra). Changed prior to commit: https://reviews.llvm.org/D40275?vs=123831&id=124607#toc Repository: rL LLVM https://re

[PATCH] D40453: Add the nvidia-cuda-toolkit Debian package path to search path

2017-11-28 Thread Artem Belevich via Phabricator via cfe-commits
tra added inline comments. Comment at: lib/Driver/ToolChains/Cuda.cpp:81 + +if (Distro.IsDebian()) + CudaPathCandidates.push_back(D.SysRoot + "/usr/lib/cuda"); No need for a named temporary. `if (Distro(D.getVFS).IsDebian())` should do. Also, please add

[PATCH] D40198: [CUDA] Tweak CUDA wrappers to make cuda-9 work with libc++

2017-11-29 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. ping. https://reviews.llvm.org/D40198 ___ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D40198: [CUDA] Tweak CUDA wrappers to make cuda-9 work with libc++

2017-11-30 Thread Artem Belevich via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rL319485: [CUDA] Tweak CUDA wrappers to make cuda-9 work with libc++ (authored by tra). Changed prior to commit: https://reviews.llvm.org/D40198?vs=123421&id=125021#toc Repository: rL LLVM https://rev

[PATCH] D40871: [CUDA] Added overloads for '[unsigned] long' variants of shfl builtins.

2017-12-05 Thread Artem Belevich via Phabricator via cfe-commits
tra created this revision. Herald added a subscriber: sanjoy. https://reviews.llvm.org/D40871 Files: clang/lib/Headers/__clang_cuda_intrinsics.h Index: clang/lib/Headers/__clang_cuda_intrinsics.h === --- clang/lib/Headers/__clang

[PATCH] D40872: [NVPTX, CUDA] Added llvm.nvvm.fns intrinsic and matching __nvvm_fns builtin in clang.

2017-12-05 Thread Artem Belevich via Phabricator via cfe-commits
tra created this revision. Herald added subscribers: hiraditya, sanjoy, jholewinski. https://reviews.llvm.org/D40872 Files: clang/include/clang/Basic/BuiltinsNVPTX.def clang/lib/Headers/__clang_cuda_intrinsics.h llvm/include/llvm/IR/IntrinsicsNVVM.td llvm/lib/Target/NVPTX/NVPTXIntrinsics.

[PATCH] D40871: [CUDA] Added overloads for '[unsigned] long' variants of shfl builtins.

2017-12-06 Thread Artem Belevich via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rC319908: [CUDA] Added overloads for '[unsigned] long' variants of shfl builtins. (authored by tra). Changed prior to commit: https://reviews.llvm.org/D40871?vs=125648&id=125756#toc Repository: rC Clan

[PATCH] D40871: [CUDA] Added overloads for '[unsigned] long' variants of shfl builtins.

2017-12-06 Thread Artem Belevich via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rL319908: [CUDA] Added overloads for '[unsigned] long' variants of shfl builtins. (authored by tra). Changed prior to commit: https://reviews.llvm.org/D40871?vs=125648&id=125755#toc Repository: rL LLVM

[PATCH] D40872: [NVPTX, CUDA] Added llvm.nvvm.fns intrinsic and matching __nvvm_fns builtin in clang.

2017-12-06 Thread Artem Belevich via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rL319909: [NVPTX,CUDA] Added llvm.nvvm.fns intrinsic and matching __nvvm_fns builtin in… (authored by tra). Changed prior to commit: https://reviews.llvm.org/D40872?vs=125649&id=125757#toc Repository:

[PATCH] D40996: Add --no-cuda-version-check in unknown-std.cpp

2017-12-08 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. Ideally the tests should be hermetic and should use mock CUDA installation that comes with the tests. E.g. `--cuda-path=%S/Inputs/CUDA/usr/local/cuda` https://reviews.llvm.org/D40996 ___ cfe-commits mailing list cfe-commits@lis

[PATCH] D133133: [CUDA] Allow using -o with -fsyntax-only

2022-09-01 Thread Artem Belevich via Phabricator via cfe-commits
tra created this revision. Herald added subscribers: mattd, carlosgalvezp, bixia, yaxunl. Herald added a project: All. tra updated this revision to Diff 457341. tra added a comment. tra updated this revision to Diff 457343. tra published this revision for review. tra added reviewers: jhuber6, yaxun

[PATCH] D133133: [CUDA] Allow using -o with -fsyntax-only

2022-09-01 Thread Artem Belevich via Phabricator via cfe-commits
tra added inline comments. Comment at: clang/test/Driver/cuda-bindings.cu:99 +// SYN-DAG: # "nvptx64-nvidia-cuda" - "clang", inputs: [{{.*}}], output: (nothing) // // Test two gpu architectures up to the assemble phase. yaxunl wrote: > should we check there is

[PATCH] D133133: [CUDA] Allow using -o with -fsyntax-only

2022-09-01 Thread Artem Belevich via Phabricator via cfe-commits
tra updated this revision to Diff 457377. tra added a comment. Updated tests and addressed comments. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D133133/new/ https://reviews.llvm.org/D133133 Files: clang/lib/Driver/Driver.cpp clang/test/Drive

[PATCH] D133133: [CUDA] Allow using -o with -fsyntax-only

2022-09-01 Thread Artem Belevich via Phabricator via cfe-commits
tra marked an inline comment as done. tra added inline comments. Comment at: clang/test/Driver/cuda-bindings.cu:99 +// SYN-DAG: # "nvptx64-nvidia-cuda" - "clang", inputs: [{{.*}}], output: (nothing) // // Test two gpu architectures up to the assemble phase. tr

[PATCH] D133161: [Clang] Fix the new driver crashing when using '-fsyntax-only'

2022-09-01 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. Does this patch obviate D133133 or is it purely for the new driver functionality? Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D133161/new/ https://reviews.llvm.org/D133161 _

[PATCH] D133133: [CUDA] Allow using -o with -fsyntax-only

2022-09-01 Thread Artem Belevich via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds. This revision was automatically updated to reflect the committed changes. tra marked an inline comment as done. Closed by commit rG54c47ff9398f: [CUDA] Allow using -o with -fsyntax-only (authored by tra). Repository: rG LLVM Github Monorep

[PATCH] D133161: [Clang] Fix the new driver crashing when using '-fsyntax-only'

2022-09-01 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. OK. I'm going to land it then. Comment at: clang/lib/Driver/Driver.cpp:4396-4398 + bool SingleDeviceOutput = !llvm::any_of(OffloadActions, [](Action *A) { +return A->getType() == types::TY_Nothing; + }) && isa(HostAction); `any_of(A->

[PATCH] D133367: [OpenMP] Remove use of removed '-f[no-]openmp-new-driver' flag

2022-09-06 Thread Artem Belevich via Phabricator via cfe-commits
tra added inline comments. Comment at: clang/lib/Driver/Driver.cpp:3906 - (C.isOffloadingHostKind(Action::OFK_OpenMP) && - Args.hasFlag(options::OPT_fopenmp_new_driver, -options::OPT_no_offload_new_driver, true)) || The option still

[PATCH] D133367: [OpenMP] Remove use of removed '-f[no-]openmp-new-driver' flag

2022-09-06 Thread Artem Belevich via Phabricator via cfe-commits
tra accepted this revision. tra added inline comments. This revision is now accepted and ready to land. Comment at: clang/lib/Driver/Driver.cpp:3906 - (C.isOffloadingHostKind(Action::OFK_OpenMP) && - Args.hasFlag(options::OPT_fopenmp_new_driver, -op

[PATCH] D133161: [Clang] Fix the new driver crashing when using '-fsyntax-only'

2022-09-06 Thread Artem Belevich via Phabricator via cfe-commits
tra accepted this revision. tra added inline comments. This revision is now accepted and ready to land. Comment at: clang/lib/Driver/Driver.cpp:4396-4398 + bool SingleDeviceOutput = !llvm::any_of(OffloadActions, [](Action *A) { +return A->getType() == types::TY_Nothing; + }

[PATCH] D130308: [clang] extend getCommonSugaredType to merge sugar nodes

2022-09-08 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. It looks like some of these changes are causing compiler to crash in `clang::ASTContext::getFloatTypeSemantics` during CUDA compilation: https://lab.llvm.org/buildbot/#/builders/55/builds/35047 I'm currently working on narrowing down which commit is the culprit. Here's a sn

<    1   2   3   4   5   6   7   8   9   10   >