[PATCH] D47757: [Sema] Produce diagnostics when unavailable aligned allocation/deallocation functions are called

2018-08-09 Thread Artem Belevich via Phabricator via cfe-commits
tra added inline comments. Comment at: test/SemaCUDA/call-host-fn-from-device.cu:88 __host__ __device__ void class_specific_delete(T *t, U *u) { - delete t; // ok, call sized device delete even though host has preferable non-sized version + delete t; // expected-error {{refer

[PATCH] D47757: [Sema] Produce diagnostics when unavailable aligned allocation/deallocation functions are called

2018-08-13 Thread Artem Belevich via Phabricator via cfe-commits
tra added inline comments. Comment at: test/SemaCUDA/call-host-fn-from-device.cu:88 __host__ __device__ void class_specific_delete(T *t, U *u) { - delete t; // ok, call sized device delete even though host has preferable non-sized version + delete t; // expected-error {{refer

[PATCH] D50845: [CUDA/OpenMP] Define only some host macros during device compilation

2018-08-16 Thread Artem Belevich via Phabricator via cfe-commits
tra added a subscriber: pcc. tra added a comment. In https://reviews.llvm.org/D50845#1202551, @ABataev wrote: > In https://reviews.llvm.org/D50845#1202550, @Hahnfeld wrote: > > > In https://reviews.llvm.org/D50845#1202540, @ABataev wrote: > > > > > Maybe for device compilation we also should defi

[PATCH] D50845: [CUDA/OpenMP] Define only some host macros during device compilation

2018-08-16 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. In https://reviews.llvm.org/D50845#1203031, @gtbercea wrote: > In https://reviews.llvm.org/D50845#1202991, @hfinkel wrote: > > > In https://reviews.llvm.org/D50845#1202965, @Hahnfeld wrote: > > > > > In https://reviews.llvm.org/D50845#1202963, @hfinkel wrote: > > > > > > > As

[PATCH] D50815: Establish the header

2018-08-16 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. It appears that libcxx/include/CMakeLists.txt needs to be updated to include `bit` file into the file set. https://reviews.llvm.org/D50815 ___ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/ma

[PATCH] D47757: [Sema] Produce diagnostics when unavailable aligned allocation/deallocation functions are called

2018-08-17 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. In https://reviews.llvm.org/D47757#1204545, @ahatanak wrote: > @tra and @rsmith: Can we move forward and fix the incorrect cuda diagnostics > in a separate patch? Doing that in a separate patch is OK, provided that that patch will be committed along with this one. It's a

[PATCH] D47757: [Sema] Produce diagnostics when unavailable aligned allocation/deallocation functions are called

2018-08-17 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. In https://reviews.llvm.org/D47757#1204621, @ahatanak wrote: > In https://reviews.llvm.org/D47757#1204561, @tra wrote: > > > It's a regression. There's a decent chance it breaks someone and this > > patch, if committed by itself, will end up being rolled back. > > > Is the r

[PATCH] D47757: [Sema] Produce diagnostics when unavailable aligned allocation/deallocation functions are called

2018-08-17 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. Talked to @ahatanak over IRC. It appears that this patch may have exposed a preexisting bug. Apparently `delete t;` in test/SemaCUDA/call-host-fn-from-device.cu does actually end up calling `__host__ operator delete`. It should've picked `__device__ operator delete`, but i

[PATCH] D47849: [OpenMP][Clang][NVPTX] Enable math functions called in an OpenMP NVPTX target device region to be resolved as device-native function calls

2018-08-22 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. `__clang_cuda_device_functions.h` is not intended to be a device-side math.h, despite having a lot of overlap/similarities. It may change any time we get a new CUDA version. I would suggest writing an OpenMP-specific replacement for math.h which would map to whatever devic

[PATCH] D44435: CUDA ctor/dtor Module-Unique Symbol Name

2018-05-08 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. Great! Let's close this review then. And good luck with cling. https://reviews.llvm.org/D44435

[PATCH] D46471: [HIP] Add hip offload kind

2018-05-08 Thread Artem Belevich via Phabricator via cfe-commits
tra accepted this revision. tra added a comment. Small nit. LGTM otherwise. Comment at: lib/Driver/ToolChains/Clang.cpp:133-135 Work(*C.getSingleOffloadToolChain()); + if (JA.isHostOffloading(Action::OFK_HIP)) CUDA and HIP are mutually exclusive, so thi

[PATCH] D46148: [CUDA] Added -f[no-]cuda-short-ptr option

2018-05-09 Thread Artem Belevich via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rL331938: [CUDA] Added -f[no-]cuda-short-ptr option (authored by tra, committed by ). Herald added a subscriber: llvm-commits. Changed prior to commit: https://reviews.llvm.org/D46148?vs=144419&id=146027#

[PATCH] D46994: [test-suite] Test CUDA in C++14 mode with C++11 stdlibs.

2018-05-17 Thread Artem Belevich via Phabricator via cfe-commits
tra accepted this revision. tra added inline comments. This revision is now accepted and ready to land. Comment at: External/CUDA/CMakeLists.txt:339-345 # Same as above, but for libc++ # Tell clang to use libc++ # We also need to add compiler's include pat

[PATCH] D46995: [test-suite] Enable CUDA complex tests with libc++ now that D25403 is resolved.

2018-05-17 Thread Artem Belevich via Phabricator via cfe-commits
tra accepted this revision. tra added inline comments. This revision is now accepted and ready to land. Comment at: External/CUDA/complex.cu:24 // libstdc++ (compile errors in ). -#if __cplusplus >= 201103L && !defined(_LIBCPP_VERSION) && \ -(__cplusplus < 201402L || STDLIB_

[PATCH] D47070: [CUDA] Upgrade linked bitcode to enable inlining

2018-05-18 Thread Artem Belevich via Phabricator via cfe-commits
tra added a subscriber: echristo. tra added a comment. This was not intended. :-( I was unaware that GetCPUAndFeaturesAttributes() would add any feature that looks like a valid CPU name to the target-cpu attribute. All I needed is to make builtins available or not. Setting them as function attr

[PATCH] D45212: [HIP] Let CUDA toolchain support HIP language mode and amdgpu

2018-05-18 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. Hi, Sorry about the long silence. I'm back to continue the reviews. I'll handle what I can today and will continue with the rest on Tuesday. It looks like the patch description needs to be updated: > Use clang-offload-bindler to create binary for device ISA. I don't see anyth

[PATCH] D46476: [HIP] Add action builder for HIP

2018-05-18 Thread Artem Belevich via Phabricator via cfe-commits
tra added inline comments. Comment at: lib/Driver/Driver.cpp:2221 +CudaDeviceActions.clear(); +for (unsigned I = 0, E = GpuArchList.size(); I != E; ++I) { + CudaDeviceActions.push_back(UA); `for(auto Arch: GpuArchList)`
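The review comment above suggests a range-based `for` in place of the index loop. A minimal sketch of the two forms follows; `Action`, `buildIndexed`, and `buildRanged` are hypothetical stand-ins for illustration, not the driver's actual types or functions.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for a driver action; not the real clang class.
struct Action {
  std::string Arch;
};

// Index-based form, mirroring the shape of the loop in the diff.
std::vector<Action> buildIndexed(const std::vector<std::string> &GpuArchList) {
  std::vector<Action> Actions;
  for (unsigned I = 0, E = GpuArchList.size(); I != E; ++I)
    Actions.push_back(Action{GpuArchList[I]});
  return Actions;
}

// Range-based form, in the spirit of `for(auto Arch: GpuArchList)`.
std::vector<Action> buildRanged(const std::vector<std::string> &GpuArchList) {
  std::vector<Action> Actions;
  for (const auto &Arch : GpuArchList)
    Actions.push_back(Action{Arch});
  return Actions;
}
```

Both functions produce the same sequence; the range-based form drops the index bookkeeping.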

[PATCH] D45212: [HIP] Let CUDA toolchain support HIP language mode and amdgpu

2018-05-18 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. One more thing -- it would be really good to add some tests to make sure your commands are constructed the way you want. https://reviews.llvm.org/D45212

[PATCH] D46472: [HIP] Support offloading by linker script

2018-05-18 Thread Artem Belevich via Phabricator via cfe-commits
tra added inline comments. Comment at: cfe/trunk/lib/Driver/ToolChains/CommonArgs.cpp:1371-1388 + // machines. + LksStream << "/*\n"; + LksStream << " HIP Offload Linker Script\n"; + LksStream << " *** Automatically generated by Clang ***\n"; + LksStream << "*/\n"; +

[PATCH] D47070: [CUDA] Upgrade linked bitcode to enable inlining

2018-05-22 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. In https://reviews.llvm.org/D47070#1106018, @echristo wrote: > > As a short-term fix we can disable feature-to-function attribute > > propagation for NVPTX until we fix it. > > > > @echristo -- any other suggestions? > > This is some of what I was talking about when I was m

[PATCH] D47154: Try to make builtin address space declarations not useless

2018-05-22 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. CUDA does not expose explicit AS on the clang side. All pointers are treated as generic and we infer specific address space only in LLVM. `__nvvm_atom_*_[sg]_*` builtins should probably be removed as they are indeed useless without pointers with explicit AS and NVCC itself does

[PATCH] D47268: [CUDA] Fixed the list of GPUs supported by CUDA-9

2018-05-23 Thread Artem Belevich via Phabricator via cfe-commits
tra created this revision. tra added reviewers: jlebar, klimek. Herald added subscribers: bixia, sanjoy. Removed sm_20 as it is not supported by CUDA-9. Added sm_37. https://reviews.llvm.org/D47268 Files: clang/lib/Driver/ToolChains/Cuda.cpp Index: clang/lib/Driver/ToolChains/Cuda.cpp =

[PATCH] D47268: [CUDA] Fixed the list of GPUs supported by CUDA-9

2018-05-23 Thread Artem Belevich via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rC333098: [CUDA] Fixed the list of GPUs supported by CUDA-9. (authored by tra, committed by ). Changed prior to commit: https://reviews.llvm.org/D47268?vs=148232&id=148236#toc Repository: rC Clang htt

[PATCH] D45212: Add HIP toolchain

2018-05-23 Thread Artem Belevich via Phabricator via cfe-commits
tra added inline comments. Comment at: lib/Driver/ToolChains/HIP.cpp:29-47 +static bool addBCLib(Compilation &C, const ArgList &Args, + ArgStringList &CmdArgs, ArgStringList LibraryPaths, + StringRef BCName) { + StringRef FullName; + bool

[PATCH] D45212: Add HIP toolchain

2018-05-23 Thread Artem Belevich via Phabricator via cfe-commits
tra accepted this revision. tra added a comment. This revision is now accepted and ready to land. One small nit. LGTM otherwise. Comment at: lib/Driver/ToolChains/HIP.cpp:44 + } + if (!FoundLibDevice) +C.getDriver().Diag(diag::err_drv_no_such_file) << BCName;

[PATCH] D47394: [OpenMP][Clang][NVPTX] Replace bundling with partial linking for the OpenMP NVPTX device offloading toolchain

2018-05-29 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. "Interoperability with other compilers" is probably a statement that's a bit too strong. At best it's kind of compatible with CUDA tools and I don't think it's feasible for other compilers. I.e. it will be useless for AMD GPUs and whatever compiler they use. In general it

[PATCH] D46476: [HIP] Add action builder for HIP

2018-05-29 Thread Artem Belevich via Phabricator via cfe-commits
tra accepted this revision. tra added a comment. This revision is now accepted and ready to land. One nit. LGTM otherwise. Comment at: test/Driver/cuda-phases.cu:16 +// RUN: | FileCheck -check-prefixes=BIN,BIN_NV %s +// RUN: %clang -x hip -target powerpc64le-ibm-linux-gnu -ccc-

[PATCH] D38188: [CUDA] Fix names of __nvvm_vote* intrinsics.

2017-09-25 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. In https://reviews.llvm.org/D38188#880318, @jlebar wrote: > Should we add tests to the test-suite? Or, are these already caught by the > existing tests we have? That's the plan. Once clang can compile CUDA headers, I'll add CUDA-9 specific tests to the testsuite and upda

[PATCH] D38188: [CUDA] Fix names of __nvvm_vote* intrinsics.

2017-09-25 Thread Artem Belevich via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rL314129: [CUDA] Fix names of __nvvm_vote* intrinsics. (authored by tra). Changed prior to commit: https://reviews.llvm.org/D38188?vs=116400&id=116576#toc Repository: rL LLVM https://reviews.llvm.org/

[PATCH] D38191: [NVPTX] added match.{any, all}.sync instructions, intrinsics & builtins.

2017-09-25 Thread Artem Belevich via Phabricator via cfe-commits
tra updated this revision to Diff 116578. tra marked an inline comment as done. tra added a comment. Addressed Justin's comments. https://reviews.llvm.org/D38191 Files: clang/include/clang/Basic/BuiltinsNVPTX.def clang/lib/CodeGen/CGBuiltin.cpp clang/lib/Headers/__clang_cuda_intrinsics.h

[PATCH] D38191: [NVPTX] added match.{any, all}.sync instructions, intrinsics & builtins.

2017-09-25 Thread Artem Belevich via Phabricator via cfe-commits
tra added inline comments. Comment at: clang/lib/CodeGen/CGBuiltin.cpp:9603 +Value *Pred = Builder.CreateSExt(Builder.CreateExtractValue(ResultPair, 1), + PredOutPtr.getElementType()); +Builder.CreateStore(Pred, PredOutPtr); ---

[PATCH] D38191: [NVPTX] added match.{any, all}.sync instructions, intrinsics & builtins.

2017-09-25 Thread Artem Belevich via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rL314135: [NVPTX] added match.{any,all}.sync instructions, intrinsics & builtins. (authored by tra). Changed prior to commit: https://reviews.llvm.org/D38191?vs=116578&id=116584#toc Repository: rL LLVM

[PATCH] D38191: [NVPTX] added match.{any, all}.sync instructions, intrinsics & builtins.

2017-09-26 Thread Artem Belevich via Phabricator via cfe-commits
tra reopened this revision. tra added inline comments. This revision is now accepted and ready to land. Comment at: llvm/trunk/lib/Target/NVPTX/NVPTXISelDAGToDAG.cpp:716 case Intrinsic::nvvm_texsurf_handle_internal: SelectTexSurfHandle(N); + case Intrinsic::nvvm_match_al

[PATCH] D38191: [NVPTX] added match.{any, all}.sync instructions, intrinsics & builtins.

2017-09-26 Thread Artem Belevich via Phabricator via cfe-commits
tra updated this revision to Diff 116674. tra added a comment. Added missing return. Tests pass now. https://reviews.llvm.org/D38191 Files: clang/include/clang/Basic/BuiltinsNVPTX.def clang/lib/CodeGen/CGBuiltin.cpp clang/lib/Headers/__clang_cuda_intrinsics.h clang/test/CodeGen/builtins
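The "missing return" fix mentioned above concerns a `switch` case that fell through into the next one. The snippets do not show exactly which case lacked the return, so the following is only a sketch of the general failure mode with hypothetical names (`Intrin`, `selectBuggy`, `selectFixed`); it is not the NVPTX selection code.

```cpp
#include <string>
#include <vector>

// Hypothetical intrinsic kinds, for illustration only.
enum class Intrin { TexSurfHandle, Match, Other };

// Buggy variant: the first case has no return, so control falls
// through and the second handler runs as well.
std::vector<std::string> selectBuggy(Intrin I) {
  std::vector<std::string> Log;
  switch (I) {
  case Intrin::TexSurfHandle:
    Log.push_back("texsurf");
    // falls through (bug): no return here
  case Intrin::Match:
    Log.push_back("match");
    return Log;
  default:
    return Log;
  }
}

// Fixed variant: every case returns once its handler has run.
std::vector<std::string> selectFixed(Intrin I) {
  std::vector<std::string> Log;
  switch (I) {
  case Intrin::TexSurfHandle:
    Log.push_back("texsurf");
    return Log;
  case Intrin::Match:
    Log.push_back("match");
    return Log;
  default:
    return Log;
  }
}
```

In the buggy variant a `TexSurfHandle` input triggers both handlers, which is the kind of error that compiles cleanly and only shows up in tests.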

[PATCH] D38191: [NVPTX] added match.{any, all}.sync instructions, intrinsics & builtins.

2017-09-26 Thread Artem Belevich via Phabricator via cfe-commits
tra closed this revision. tra added a comment. Landed with fix in r314223. https://reviews.llvm.org/D38191

[PATCH] D38326: [CUDA] Work around conflicting function definitions in CUDA-9 headers.

2017-09-27 Thread Artem Belevich via Phabricator via cfe-commits
tra created this revision. Herald added a subscriber: sanjoy. https://reviews.llvm.org/D38326 Files: clang/lib/Headers/__clang_cuda_runtime_wrapper.h Index: clang/lib/Headers/__clang_cuda_runtime_wrapper.h === --- clang/lib/Heade

[PATCH] D38326: [CUDA] Work around conflicting function definitions in CUDA-9 headers.

2017-09-27 Thread Artem Belevich via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rL314334: [CUDA] Work around conflicting function definitions in CUDA-9 headers. (authored by tra). Changed prior to commit: https://reviews.llvm.org/D38326?vs=116856&id=116858#toc Repository: rL LLVM

[PATCH] D38742: [CUDA] Added __hmma_m16n16k16_* builtins to support mma instructions in sm_70

2017-10-10 Thread Artem Belevich via Phabricator via cfe-commits
tra created this revision. Herald added subscribers: sanjoy, jholewinski. https://reviews.llvm.org/D38742 Files: clang/include/clang/Basic/BuiltinsNVPTX.def clang/lib/CodeGen/CGBuiltin.cpp clang/test/CodeGen/builtins-nvptx-sm_70.cu Index: clang/test/CodeGen/builtins-nvptx-sm_70.cu

[PATCH] D38742: [CUDA] Added __hmma_m16n16k16_* builtins to support mma instructions in sm_70

2017-10-11 Thread Artem Belevich via Phabricator via cfe-commits
tra updated this revision to Diff 118636. tra marked 6 inline comments as done. tra added a comment. Addressed Justin's comments. https://reviews.llvm.org/D38742 Files: clang/include/clang/Basic/BuiltinsNVPTX.def clang/lib/CodeGen/CGBuiltin.cpp clang/test/CodeGen/builtins-nvptx-sm_70.cu

[PATCH] D38742: [CUDA] Added __hmma_m16n16k16_* builtins to support mma instructions in sm_70

2017-10-11 Thread Artem Belevich via Phabricator via cfe-commits
tra added inline comments. Comment at: clang/lib/CodeGen/CGBuiltin.cpp:9726 + case NVPTX::BI__hmma_m16n16k16_ld_c_f16: +case NVPTX::BI__hmma_m16n16k16_ld_c_f32:{ +Address Dst = EmitPointerWithAlignment(E->getArg(0)); jlebar wrote: > weird indentation? My

[PATCH] D38742: [CUDA] Added __hmma_m16n16k16_* builtins to support mma instructions in sm_70

2017-10-12 Thread Artem Belevich via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rL315624: [CUDA] Added __hmma_m16n16k16_* builtins to support mma instructions on sm_70 (authored by tra). Changed prior to commit: https://reviews.llvm.org/D38742?vs=118636&id=118848#toc Repository: r

[PATCH] D38883: [CMake][OpenMP] Customize default offloading arch

2017-10-13 Thread Artem Belevich via Phabricator via cfe-commits
tra added inline comments. Comment at: lib/Driver/ToolChains/Cuda.cpp:170-182 -// This code prevents IsValid from being set when -// no libdevice has been found. -bool allEmpty = true; -std::string LibDeviceFile; -for (auto key : LibDeviceMap.keys()) { -

[PATCH] D38883: [CMake][OpenMP] Customize default offloading arch

2017-10-13 Thread Artem Belevich via Phabricator via cfe-commits
tra added inline comments. Comment at: lib/Driver/ToolChains/Cuda.cpp:170-182 -// This code prevents IsValid from being set when -// no libdevice has been found. -bool allEmpty = true; -std::string LibDeviceFile; -for (auto key : LibDeviceMap.keys()) { -

[PATCH] D38883: [CMake][OpenMP] Customize default offloading arch

2017-10-13 Thread Artem Belevich via Phabricator via cfe-commits
tra added inline comments. Comment at: lib/Driver/ToolChains/Cuda.cpp:170-182 -// This code prevents IsValid from being set when -// no libdevice has been found. -bool allEmpty = true; -std::string LibDeviceFile; -for (auto key : LibDeviceMap.keys()) { -

[PATCH] D38883: [CMake][OpenMP] Customize default offloading arch

2017-10-13 Thread Artem Belevich via Phabricator via cfe-commits
tra added inline comments. Comment at: lib/Driver/ToolChains/Cuda.cpp:170-182 -// This code prevents IsValid from being set when -// no libdevice has been found. -bool allEmpty = true; -std::string LibDeviceFile; -for (auto key : LibDeviceMap.keys()) { -

[PATCH] D38883: [CMake][OpenMP] Customize default offloading arch

2017-10-13 Thread Artem Belevich via Phabricator via cfe-commits
tra added inline comments. Comment at: lib/Driver/ToolChains/Cuda.h:90 - } }; gtbercea wrote: > gtbercea wrote: > > I would also like to keep the spirit of this code if not in this exact form > > at least something that performs the same functionality. > @tr

[PATCH] D38901: [CUDA] Require libdevice only if needed

2017-10-13 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. The change could use a test. https://reviews.llvm.org/D38901

[PATCH] D38883: [CMake][OpenMP] Customize default offloading arch

2017-10-13 Thread Artem Belevich via Phabricator via cfe-commits
tra added inline comments. Comment at: lib/Driver/ToolChains/Cuda.h:90 - } }; gtbercea wrote: > gtbercea wrote: > > tra wrote: > > > gtbercea wrote: > > > > gtbercea wrote: > > > > > I would also like to keep the spirit of this code if not in this > > > > >

[PATCH] D38901: [CUDA] Require libdevice only if needed

2017-10-13 Thread Artem Belevich via Phabricator via cfe-commits
tra accepted this revision. tra added a comment. This revision is now accepted and ready to land. Looks good. Thank you. https://reviews.llvm.org/D38901

[PATCH] D38978: [OpenMP] Enable the lowering of implicitly shared variables in OpenMP GPU-offloaded target regions to the GPU shared memory

2017-10-16 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. Please add tests for the cases where such local->shared conversion should and should not happen. I would appreciate it if you could add details on what exactly your passes are supposed to move to shared memory. Considering that device-side code tends to be heavily inlined, it m

[PATCH] D39005: [OpenMP] Clean up variable and function names for NVPTX backend

2017-10-17 Thread Artem Belevich via Phabricator via cfe-commits
tra requested changes to this revision. tra added a comment. Justin is right. I completely forgot about this. :-/ Hal offered a possible solution: https://reviews.llvm.org/D17738#661115 Repository: rL LLVM https://reviews.llvm.org/D39005

[PATCH] D47757: [Sema] Produce diagnostics when unavailable aligned allocation/deallocation functions are called

2018-08-23 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. I've confirmed that the patch does not break anything in our CUDA code, so it's good to go as far as CUDA is concerned. I'll fix the exposed CUDA issue in a separate patch. Repository: rC Clang https://reviews.llvm.org/D47757

[PATCH] D50845: [CUDA/OpenMP] Define only some host macros during device compilation

2018-08-24 Thread Artem Belevich via Phabricator via cfe-commits
tra accepted this revision. tra added a comment. This revision is now accepted and ready to land. Please keep an eye on the CUDA buildbot http://lab.llvm.org:8011/builders/clang-cuda-build. It runs a fair amount of tests with libc++ and a handful of libstdc++ versions and may be a canary if these changes b

[PATCH] D51434: [HIP] Add -amdgpu-internalize-symbols option to opt

2018-08-29 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. Could you elaborate on what exactly is the problem this patch fixes? I don't see how internalizing the symbols connects to PLTs. My understanding is that PLTs are used to provide stubs for symbols to be resolved by dynamic linker at runtime. AFAICT AMD does not use shared li

[PATCH] D51434: [HIP] Add -amdgpu-internalize-symbols option to opt

2018-08-29 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. I could not find anything about PLTs in AMDGPU-ABI, nor could I find anything relevant on Google. I still have no idea why PLTs are required in this case. Without that info, the problem

[PATCH] D51441: Add predefined macro __gnu_linux__ for proper aux-triple

2018-08-29 Thread Artem Belevich via Phabricator via cfe-commits
tra added inline comments. Comment at: lib/Frontend/InitPreprocessor.cpp:1126 Builder.defineMacro("__linux__"); +if (AuxTriple.getEnvironment() == llvm::Triple::GNU) + Builder.defineMacro("__gnu_linux__"); AFAICT, we always define `__gnu_linux__` on

[PATCH] D51441: Add predefined macro __gnu_linux__ for proper aux-triple

2018-08-29 Thread Artem Belevich via Phabricator via cfe-commits
tra accepted this revision. tra added a comment. This revision is now accepted and ready to land. While we're here, perhaps `Builder.defineMacro("__linux__")` should be changed to `DefineStd("linux")` which defines `linux/__linux/__linux__`? https://reviews.llvm.org/D51441
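The comment above says a `DefineStd("linux")` call would define the three spellings `linux`, `__linux`, and `__linux__`. A small stand-alone model of that expansion is sketched below. `stdMacroNames` is a hypothetical function, not clang's helper, and the `GnuMode` switch (plain spelling only in GNU language modes) is an assumption that the thread itself does not state.

```cpp
#include <string>
#include <vector>

// Hypothetical model of a DefineStd-style helper: returns the macro
// names that would be defined for a base name. Not clang's code.
std::vector<std::string> stdMacroNames(const std::string &Base, bool GnuMode) {
  std::vector<std::string> Names;
  // Assumption: the unprefixed spelling lives in the user's namespace,
  // so it is only emitted when GNU extensions are enabled.
  if (GnuMode)
    Names.push_back(Base);
  Names.push_back("__" + Base);        // e.g. __linux
  Names.push_back("__" + Base + "__"); // e.g. __linux__
  return Names;
}
```

Under these assumptions, `stdMacroNames("linux", true)` yields all three names, while a strict mode would yield only the two reserved spellings.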

[PATCH] D51434: [HIP] Add -fvisibility hidden option to clang

2018-08-29 Thread Artem Belevich via Phabricator via cfe-commits
tra added inline comments. Comment at: lib/Driver/ToolChains/HIP.cpp:255 + options::OPT_fvisibility_ms_compat)) { +CC1Args.push_back("-fvisibility"); +CC1Args.push_back("hidden"); Nit: You could collapse multiple `push_back` calls

[PATCH] D50845: [CUDA/OpenMP] Define only some host macros during device compilation

2018-08-30 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. FYI. This breaks our CUDA compilation. I haven't figured out what exactly is wrong yet. I may need to unroll the patch if the fix is not obvious. Repository: rL LLVM https://reviews.llvm.org/D50845

[PATCH] D50845: [CUDA/OpenMP] Define only some host macros during device compilation

2018-08-30 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. In our case the headers are from a relatively old glibc and the compiler errors out on this: /* This function is used in the `isfinite' macro. */ __MATH_INLINE int __NTH (__finite (double __x)) { return (__extension__ (union { double __d; int __i[2]; }) {_

[PATCH] D50845: [CUDA/OpenMP] Define only some host macros during device compilation

2018-08-30 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. In https://reviews.llvm.org/D50845#1219733, @Hahnfeld wrote: > In https://reviews.llvm.org/D50845#1219726, @gtbercea wrote: > > > In general, it looks like this patch leads to some host macros having to be > > defined again for the auxiliary triple case. It is not clear to m

[PATCH] D51501: [CUDA] Fix CUDA compilation broken by D50845

2018-08-30 Thread Artem Belevich via Phabricator via cfe-commits
tra created this revision. tra added a reviewer: Hahnfeld. Herald added subscribers: bixia, jlebar, sanjoy. This keeps predefined macros for CUDA to work as they were before and lets OpenMP control the set of macros it needs. https://reviews.llvm.org/D51501 Files: clang/lib/Frontend/InitPrep

[PATCH] D50845: [CUDA/OpenMP] Define only some host macros during device compilation

2018-08-30 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. I've sent out https://reviews.llvm.org/D51501. It unbreaks CUDA compilation and keeps OpenMP unchanged. Repository: rL LLVM https://reviews.llvm.org/D50845

[PATCH] D50845: [CUDA/OpenMP] Define only some host macros during device compilation

2018-08-30 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. In https://reviews.llvm.org/D50845#1219819, @Hahnfeld wrote: > Ok, the top preprocessor condition for that function is `#ifndef > __SSE2_MATH__` - the exact same macro that was part of the motivation. Can > you please test compiling a simple C file (including `math.h`) with

[PATCH] D50845: [CUDA/OpenMP] Define only some host macros during device compilation

2018-08-30 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. That, and r340967 https://reviews.llvm.org/D51441. I'm running check-clang now and will land reverted changes shortly. Repository: rL LLVM https://reviews.llvm.org/D50845

[PATCH] D50845: [CUDA/OpenMP] Define only some host macros during device compilation

2018-08-30 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. Reverted in https://reviews.llvm.org/rL341115 Repository: rL LLVM https://reviews.llvm.org/D50845

[PATCH] D51441: Add predefined macro __gnu_linux__ for proper aux-triple

2018-08-30 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. Reverted in https://reviews.llvm.org/rL341115. Repository: rC Clang https://reviews.llvm.org/D51441

[PATCH] D50845: [CUDA/OpenMP] Define only some host macros during device compilation

2018-08-30 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. Tests reverted in https://reviews.llvm.org/rL341118 Repository: rL LLVM https://reviews.llvm.org/D50845

[PATCH] D51312: [OpenMP][NVPTX] Use appropriate _CALL_ELF macro when offloading

2018-08-30 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. Reverted in https://reviews.llvm.org/rL341115 & https://reviews.llvm.org/rL341118. Repository: rC Clang https://reviews.llvm.org/D51312

[PATCH] D51441: Add predefined macro __gnu_linux__ for proper aux-triple

2018-08-30 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. Tests reverted in https://reviews.llvm.org/rL341118. Repository: rC Clang https://reviews.llvm.org/D51441

[PATCH] D51507: Allow all supportable attributes to be used with #pragma clang attribute.

2018-08-30 Thread Artem Belevich via Phabricator via cfe-commits
tra added inline comments. Comment at: test/Misc/pragma-attribute-supported-attributes-list.test:27-32 +// CHECK-NEXT: CUDAConstant (SubjectMatchRule_variable) +// CHECK-NEXT: CUDADevice (SubjectMatchRule_function, SubjectMatchRule_variable) +// CHECK-NEXT: CUDAGlobal (SubjectMa

[PATCH] D51554: [CUDA][OPENMP][NVPTX]Improve logic of the debug info support.

2018-08-31 Thread Artem Belevich via Phabricator via cfe-commits
tra accepted this revision. tra added a comment. This revision is now accepted and ready to land. Nice. So, in effect, for optimized builds we'll generate pre-DWARF line info only, unless --cuda-noopt-device-debug is specified. Will this deal with the warnings about back-end being unable to handl

[PATCH] D51501: [CUDA] Fix CUDA compilation broken by D50845

2018-09-04 Thread Artem Belevich via Phabricator via cfe-commits
tra abandoned this revision. tra added a comment. > Not needed anymore after the reverts in https://reviews.llvm.org/rC341115 and > https://reviews.llvm.org/rC341118, right? Correct. https://reviews.llvm.org/D51501

[PATCH] D51808: [CUDA] Ignore uncallable functions when we check for usual deallocators.

2018-09-07 Thread Artem Belevich via Phabricator via cfe-commits
tra created this revision. tra added a reviewer: rsmith. Herald added subscribers: bixia, jlebar, sanjoy. Previously clang considered function variants from both sides of compilation and that sometimes resulted in picking up wrong deallocation function. https://reviews.llvm.org/D51808 Files:

[PATCH] D51809: [CUDA][HIP] Fix assertion in LookupSpecialMember

2018-09-07 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. @jlebar Justin, can you take a look? https://reviews.llvm.org/D51809

[PATCH] D51808: [CUDA] Ignore uncallable functions when we check for usual deallocators.

2018-09-13 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. @rsmith ping. https://reviews.llvm.org/D51808

[PATCH] D49274: [CUDA] Provide integer SIMD functions for CUDA-9.2

2018-07-19 Thread Artem Belevich via Phabricator via cfe-commits
tra updated this revision to Diff 156386. tra added a comment. Fixed inline asm syntax. Added workaround for the bug in __vmaxs2() discovered during testing. I've got a set of tests for these functions that I'll add to test-suite shortly. AFAICT this implementation matches nvidia's bit-for-bit.

[PATCH] D49274: [CUDA] Provide integer SIMD functions for CUDA-9.2

2018-07-19 Thread Artem Belevich via Phabricator via cfe-commits
tra updated this revision to Diff 156397. tra added a comment. Fixed the issues pointed out by bkramer@. Apparently, `.sat` does not matter for the vabsdiff instruction with unsigned operands. My tests were also missing __vabsssN. https://reviews.llvm.org/D49274 Files: clang/lib/Headers/__clang_cu

[PATCH] D49274: [CUDA] Provide integer SIMD functions for CUDA-9.2

2018-07-19 Thread Artem Belevich via Phabricator via cfe-commits
tra marked 2 inline comments as done. tra added a comment. Ben, PTAL. Comment at: clang/lib/Headers/__clang_cuda_device_functions.h:1080 + unsigned int r; + asm("vabsdiff2.u32.u32.u32.sat %0,%1,%2,0;" : "=r"(r) : "r"(__a), "r"(__b)); + return r; bkramer wrot

[PATCH] D49274: [CUDA] Provide integer SIMD functions for CUDA-9.2

2018-07-20 Thread Artem Belevich via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. tra marked 2 inline comments as done. Closed by commit rC337587: [CUDA] Provide integer SIMD functions for CUDA-9.2 (authored by tra, committed by ). Changed prior to commit: https://reviews.llvm.org/D49274?vs=156397&id=1

[PATCH] D48287: [HIP] Support -fcuda-flush-denormals-to-zero for amdgcn

2018-07-20 Thread Artem Belevich via Phabricator via cfe-commits
tra accepted this revision. tra added a comment. This revision is now accepted and ready to land. Thank you. That should work. https://reviews.llvm.org/D48287

[PATCH] D47849: [OpenMP][Clang][NVPTX] Enable math functions called in an OpenMP NVPTX target device region to be resolved as device-native function calls

2018-07-20 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. In https://reviews.llvm.org/D47849#1126925, @gtbercea wrote: > I just stumbled upon a very interesting situation. > > I noticed that, for OpenMP, the use of device math functions happens as I > expected for -O0. For -O1 or higher math functions such as "sqrt" resolve to > l

[PATCH] D49763: [CUDA] Call atexit() for CUDA destructor early on.

2018-07-24 Thread Artem Belevich via Phabricator via cfe-commits
tra created this revision. tra added reviewers: jlebar, timshen. Herald added subscribers: bixia, sanjoy. There's apparently a race between fatbin destructors registered by us and some internal calls registered by CUDA runtime from cudaRegisterFatbin. Moving fatbin de-registration to atexit() was

[PATCH] D49763: [CUDA] Call atexit() for CUDA destructor early on.

2018-07-24 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. In https://reviews.llvm.org/D49763#1174283, @joerg wrote: > Can this ever end up in a shared library? If yes, please use the normal logic > for creating a global destructor. atexit is not very friendly to dlopen... Yes, it can end up in a shared library. What would be the

[PATCH] D49763: [CUDA] Call atexit() for CUDA destructor early on.

2018-07-24 Thread Artem Belevich via Phabricator via cfe-commits
tra planned changes to this revision. tra added a comment. Ugh. Apparently moving this code up just disabled the module destructor. :-( That explains why we no longer crash. https://reviews.llvm.org/D49763

[PATCH] D49931: [CUDA][HIP] Allow function-scope static const variable

2018-07-27 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. > This patch also allows function-scope static const variable without device > memory qualifier and emits it as a global variable in constant address space. What does NVCC do with local static const variables? https://reviews.llvm.org/D49931

[PATCH] D49931: [CUDA][HIP] Allow function-scope static const variable

2018-07-27 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. Looks OK overall except for the huge `if` below. Comment at: lib/Sema/SemaDecl.cpp:11923-11930 + if (getLangOpts().CUDA && + !(VD->hasAttr() || +(VD->getType().isConstQualified() && + !VD->hasAttr() && + !VD

[PATCH] D49763: [CUDA] Call atexit() for CUDA destructor early on.

2018-07-30 Thread Artem Belevich via Phabricator via cfe-commits
tra abandoned this revision. tra added a comment. It appears that the issue that originally prompted this change is due to a suspected bug in glibc triggered by specific details of our internal build. https://reviews.llvm.org/D49763

[PATCH] D49148: [DEBUGINFO] Disable unsupported debug info options for NVPTX target.

2018-08-01 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. I wonder what the right thing to do is to silence the warnings. For instance, we compile everything with -Werror and the warnings result in build breaks. The easy way out is to pass `-Wno-unsupported-target-opt`. It works, but it does not really solve anything. It also may mas

[PATCH] D49148: [DEBUGINFO] Disable unsupported debug info options for NVPTX target.

2018-08-01 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. We normally do not need to deviate from the host options all that often. I would argue that keeping options identical is a reasonable default for most options. For some options the driver may be able to derive a sensible value based on the host options. E.g. some options ca

[PATCH] D43045: Add NVPTX Support to ValidCPUList (enabling march notes)

2018-02-08 Thread Artem Belevich via Phabricator via cfe-commits
tra added inline comments. Comment at: test/Misc/target-invalid-cpu-note.c:38 +// NVPTX: note: valid target CPU values are: sm_20, sm_21, sm_30, sm_32, sm_35, +// NVPTX-SAME: sm_37, sm_50, sm_52, sm_53, sm_60, sm_61, sm_62, sm_70, sm_72 Nit: Generally speaking th

[PATCH] D42581: [NVPTX] Emit debug info in DWARF-2 by default for Cuda devices.

2018-02-08 Thread Artem Belevich via Phabricator via cfe-commits
tra accepted this revision. tra added a comment. This revision is now accepted and ready to land. LGTM Repository: rC Clang https://reviews.llvm.org/D42581

[PATCH] D43045: Add NVPTX Support to ValidCPUList (enabling march notes)

2018-02-08 Thread Artem Belevich via Phabricator via cfe-commits
tra added inline comments. Comment at: test/Misc/target-invalid-cpu-note.c:38 +// NVPTX: note: valid target CPU values are: sm_20, sm_21, sm_30, sm_32, sm_35, +// NVPTX-SAME: sm_37, sm_50, sm_52, sm_53, sm_60, sm_61, sm_62, sm_70, sm_72 erichkeane wrote: > tra wr

[PATCH] D42920: [CUDA] Fix test cuda-external-tools.cu

2018-02-09 Thread Artem Belevich via Phabricator via cfe-commits
tra accepted this revision. tra added inline comments. This revision is now accepted and ready to land. Comment at: test/Driver/cuda-external-tools.cu:11 +// RUN: | FileCheck -check-prefix CHECK -check-prefix ARCH64 \ +// RUN: -check-prefix SM20 -check-prefix OPT0 %s

[PATCH] D42921: [CUDA] Add option to generate relocatable device code

2018-02-09 Thread Artem Belevich via Phabricator via cfe-commits
tra accepted this revision. tra added inline comments. This revision is now accepted and ready to land. Comment at: include/clang/Driver/Options.td:572 + HelpText<"Generate relocatable device code, also known as separate compilation mode.">; +def fno_cuda_rdc : Flag<["-"], "fno

[PATCH] D42923: [CUDA] Allow external variables in separate compilation

2018-02-12 Thread Artem Belevich via Phabricator via cfe-commits
tra accepted this revision. tra added a comment. This revision is now accepted and ready to land. LGTM. Comment at: test/SemaCUDA/extern-shared.cu:4 +// These declarations are fine in separate compilation mode! +// RUN: %clang_cc1 -fsyntax-only -fcuda-rdc -verify=rdc %s +// RUN

[PATCH] D42923: [CUDA] Allow external variables in separate compilation

2018-02-12 Thread Artem Belevich via Phabricator via cfe-commits
tra added inline comments. Comment at: test/SemaCUDA/extern-shared.cu:4 +// These declarations are fine in separate compilation mode! +// RUN: %clang_cc1 -fsyntax-only -fcuda-rdc -verify=rdc %s +// RUN: %clang_cc1 -fsyntax-only -fcuda-is-device -fcuda-rdc -verify=rdc %s -

[PATCH] D42922: [CUDA] Register relocatable GPU binaries

2018-02-16 Thread Artem Belevich via Phabricator via cfe-commits
tra added inline comments. Comment at: lib/CodeGen/CGCUDANV.cpp:330-331 // the GPU side. for (const std::string &GpuBinaryFileName : CGM.getCodeGenOpts().CudaGpuBinaryFileNames) { llvm::ErrorOr> GpuBinaryOrErr = Hahnfeld wrote: > Can we actuall

[PATCH] D42922: [CUDA] Register relocatable GPU binaries

2018-02-16 Thread Artem Belevich via Phabricator via cfe-commits
tra added inline comments. Comment at: lib/CodeGen/CGCUDANV.cpp:330-331 // the GPU side. for (const std::string &GpuBinaryFileName : CGM.getCodeGenOpts().CudaGpuBinaryFileNames) { llvm::ErrorOr> GpuBinaryOrErr = Hahnfeld wrote: > tra wrote: > >

[PATCH] D43461: [CUDA] Include single GPU binary, NFCI.

2018-02-20 Thread Artem Belevich via Phabricator via cfe-commits
tra added inline comments. Comment at: lib/Driver/ToolChains/Clang.cpp:4659 if (IsCuda) { -// Host-side cuda compilation receives device-side outputs as Inputs[1...]. -// Include them with -fcuda-include-gpubinary. +// Host-side cuda compilation receives device-sid
