[PATCH] D47849: [OpenMP][Clang][NVPTX] Enable math functions called in an OpenMP NVPTX target device region to be resolved as device-native function calls

2018-08-07 Thread Jonas Hahnfeld via Phabricator via cfe-commits
Hahnfeld added a comment. Do we still need this? I think what we really need to solve is the problem of (host) inline assembly in the header files... Repository: rC Clang https://reviews.llvm.org/D47849 ___ cfe-commits mailing list cfe-commits@l

[PATCH] D47849: [OpenMP][Clang][NVPTX] Enable math functions called in an OpenMP NVPTX target device region to be resolved as device-native function calls

2018-08-08 Thread Jonas Hahnfeld via Phabricator via cfe-commits
Hahnfeld added a comment. In https://reviews.llvm.org/D47849#1190997, @gtbercea wrote: > Don't we want to use device specific math functions? > It's not just about avoiding some of the host-specific assembly, it's also > about getting an implementation tailored to the device. Ok, so you are alre

[PATCH] D47849: [OpenMP][Clang][NVPTX] Enable math functions called in an OpenMP NVPTX target device region to be resolved as device-native function calls

2018-08-08 Thread Jonas Hahnfeld via Phabricator via cfe-commits
Hahnfeld added a comment. In https://reviews.llvm.org/D47849#1192134, @gtbercea wrote: > This patch is concerned with calling device functions when you're on the > device. The correctness issues you mention are orthogonal to this and should > be handled by another patch. I don't think this patc

[PATCH] D47849: [OpenMP][Clang][NVPTX] Enable math functions called in an OpenMP NVPTX target device region to be resolved as device-native function calls

2018-08-08 Thread Jonas Hahnfeld via Phabricator via cfe-commits
Hahnfeld added a comment. In https://reviews.llvm.org/D47849#1192321, @gtbercea wrote: > > IIRC you started to work on this to fix the problem with inline assembly > > (see https://reviews.llvm.org/D47849#1125019). AFAICS this patch fixes > > declarations of math functions but you still cannot

[PATCH] D47849: [OpenMP][Clang][NVPTX] Enable math functions called in an OpenMP NVPTX target device region to be resolved as device-native function calls

2018-08-08 Thread Jonas Hahnfeld via Phabricator via cfe-commits
Hahnfeld added a comment. In https://reviews.llvm.org/D47849#1192375, @gtbercea wrote: > I do not get that error. In the beginning you said that you were facing the same error. Did that go away in the meantime? Are you testing on x86 or Power? With optimizations enabled? Repository: rC Cla

[PATCH] D47849: [OpenMP][Clang][NVPTX] Enable math functions called in an OpenMP NVPTX target device region to be resolved as device-native function calls

2018-08-08 Thread Jonas Hahnfeld via Phabricator via cfe-commits
Hahnfeld added a comment. In https://reviews.llvm.org/D47849#1192493, @gtbercea wrote: > @Hahnfeld do you get the same error if you compile with clang++ instead of > clang? Yes, with both trunk and this patch applied. It's the same header after all... Repository: rC Clang https://reviews.

[PATCH] D47849: [OpenMP][Clang][NVPTX] Enable math functions called in an OpenMP NVPTX target device region to be resolved as device-native function calls

2018-08-10 Thread Jonas Hahnfeld via Phabricator via cfe-commits
Hahnfeld removed a reviewer: Hahnfeld. Hahnfeld added a comment. I feel like there is no progress in the discussion (here and off-list), partly because we might still not be talking about the same things. So I'm stepping down from this revision to unblock review from somebody else. Here's my cu

[PATCH] D50845: [CUDA/OpenMP] Define only some host macros during device compilation

2018-08-16 Thread Jonas Hahnfeld via Phabricator via cfe-commits
Hahnfeld created this revision. Hahnfeld added reviewers: tra, gtbercea, hfinkel. Herald added subscribers: cfe-commits, guansong. When compiling CUDA or OpenMP device code Clang parses header files that expect certain predefined macros from the host architecture. To make this work the compiler pa

[PATCH] D50845: [CUDA/OpenMP] Define only some host macros during device compilation

2018-08-16 Thread Jonas Hahnfeld via Phabricator via cfe-commits
Hahnfeld added inline comments. Comment at: test/SemaCUDA/builtins.cu:15-17 +#if !defined(__x86_64__) +#error "Expected to see preprocessor macros from the host." #endif @tra I'm not sure here: Do we want `__PTX__` to be defined during host compilation? I can't

[PATCH] D50845: [CUDA/OpenMP] Define only some host macros during device compilation

2018-08-16 Thread Jonas Hahnfeld via Phabricator via cfe-commits
Hahnfeld added a comment. In https://reviews.llvm.org/D50845#1202540, @ABataev wrote: > Maybe for device compilation we also should define `__NO_MATH_INLINES` and > `__NO_STRING_INLINES` macros to disable inline assembly in glibc? The problem is that `__NO_MATH_INLINES` doesn't even avoid all

[PATCH] D50845: [CUDA/OpenMP] Define only some host macros during device compilation

2018-08-16 Thread Jonas Hahnfeld via Phabricator via cfe-commits
Hahnfeld added a comment. In https://reviews.llvm.org/D50845#1202838, @tra wrote: > In https://reviews.llvm.org/D50845#1202551, @ABataev wrote: > > > In https://reviews.llvm.org/D50845#1202550, @Hahnfeld wrote: > > > > > In https://reviews.llvm.org/D50845#1202540, @ABataev wrote: > > > > > > > Ma

[PATCH] D50845: [CUDA/OpenMP] Define only some host macros during device compilation

2018-08-16 Thread Jonas Hahnfeld via Phabricator via cfe-commits
Hahnfeld added a comment. In https://reviews.llvm.org/D50845#1202963, @hfinkel wrote: > As a result, we should really have a separate header that has those > actually-available functions. When targeting NVPTX, why don't we have the > included math.h be CUDA's math.h? In the end, those are the f

[PATCH] D50845: [CUDA/OpenMP] Define only some host macros during device compilation

2018-08-17 Thread Jonas Hahnfeld via Phabricator via cfe-commits
Hahnfeld added a comment. In https://reviews.llvm.org/D50845#1202973, @ABataev wrote: > > So ideally I think Clang should determine which functions are really > > `declare target` (either explicit or implicit) and only run semantical > > analysis on them. If a function is then found to be "brok

[PATCH] D50845: [CUDA/OpenMP] Define only some host macros during device compilation

2018-08-17 Thread Jonas Hahnfeld via Phabricator via cfe-commits
Hahnfeld added a comment. In https://reviews.llvm.org/D50845#1203967, @ABataev wrote: > In https://reviews.llvm.org/D50845#1203961, @Hahnfeld wrote: > > > In https://reviews.llvm.org/D50845#1202973, @ABataev wrote: > > > > > > So ideally I think Clang should determine which functions are really

[PATCH] D50845: [CUDA/OpenMP] Define only some host macros during device compilation

2018-08-17 Thread Jonas Hahnfeld via Phabricator via cfe-commits
Hahnfeld added a comment. In https://reviews.llvm.org/D50845#1203991, @ABataev wrote: > In https://reviews.llvm.org/D50845#1203973, @Hahnfeld wrote: > > > In https://reviews.llvm.org/D50845#1203967, @ABataev wrote: > > > > > I thought about this approach already. But it won't work in general. The

[PATCH] D50845: [CUDA/OpenMP] Define only some host macros during device compilation

2018-08-17 Thread Jonas Hahnfeld via Phabricator via cfe-commits
Hahnfeld added a comment. In https://reviews.llvm.org/D50845#1204210, @ABataev wrote: > > Right, warning wasn't a good thought. We really want strict checking and > > would have to error out when we find a function that wasn't implicitly > > `declare target` on the host. > > I meant to ask how

[PATCH] D50845: [CUDA/OpenMP] Define only some host macros during device compilation

2018-08-17 Thread Jonas Hahnfeld via Phabricator via cfe-commits
Hahnfeld added a comment. In https://reviews.llvm.org/D50845#1202540, @ABataev wrote: > Maybe for device compilation we also should define `__NO_MATH_INLINES` and > `__NO_STRING_INLINES` macros to disable inline assembly in glibc? Coming back to this original question: - I just searched the h

[PATCH] D50845: [CUDA/OpenMP] Define only some host macros during device compilation

2018-08-17 Thread Jonas Hahnfeld via Phabricator via cfe-commits
Hahnfeld added a comment. In https://reviews.llvm.org/D50845#1204340, @ABataev wrote: > In https://reviews.llvm.org/D50845#1204216, @Hahnfeld wrote: > > > Got that, I agree on the conservative approach: If we find a function to be > > called that wasn't checked (because it wasn't implicitly `dec

[PATCH] D46540: [X86] ptwrite intrinsic

2018-05-08 Thread Jonas Hahnfeld via Phabricator via cfe-commits
Hahnfeld added a comment. Could you maybe add some short summaries to your patches? It's hard for non-Intel employees to guess what all these instructions do... https://reviews.llvm.org/D46540

[PATCH] D46540: [X86] ptwrite intrinsic

2018-05-10 Thread Jonas Hahnfeld via Phabricator via cfe-commits
Hahnfeld added a comment. In https://reviews.llvm.org/D46540#1092620, @GBuella wrote: > In https://reviews.llvm.org/D46540#1091625, @Hahnfeld wrote: > > > Could you maybe add some short summaries to your patches? It's hard for > > non-Intel employees to guess what all these instructions do... >

[PATCH] D47070: [CUDA] Upgrade linked bitcode to enable inlining

2018-05-18 Thread Jonas Hahnfeld via Phabricator via cfe-commits
Hahnfeld created this revision. Hahnfeld added reviewers: tra, jlebar. Herald added a subscriber: cfe-commits. Revision https://reviews.llvm.org/rC329829 added the architecture to "target-features". This prevents inlining of previously generated bitcode because the feature sets don't match. Thus

[PATCH] D47070: [CUDA] Upgrade linked bitcode to enable inlining

2018-05-18 Thread Jonas Hahnfeld via Phabricator via cfe-commits
Hahnfeld added a comment. I think that's intended because the generated code might use instructions based on that feature. If we want to ignore that, we could override `TargetTransformInfo::areInlineCompatible` for NVPTX to only compare `target-cpu` Repository: rC Clang https://reviews.llv

[PATCH] D47070: [CUDA] Upgrade linked bitcode to enable inlining

2018-05-19 Thread Jonas Hahnfeld via Phabricator via cfe-commits
Hahnfeld added subscribers: chandlerc, ahatanak. Hahnfeld added a comment. Looks like this was added as a "temporary solution" in https://reviews.llvm.org/D8984. Meanwhile the attribute whitelist was merged half a year later (https://reviews.llvm.org/D7802), so maybe we can just get rid of comp

[PATCH] D47200: [Sema] Add tests for weak functions

2018-05-22 Thread Jonas Hahnfeld via Phabricator via cfe-commits
Hahnfeld created this revision. Hahnfeld added reviewers: aaron.ballman, rjmccall. Herald added a subscriber: cfe-commits. I found these checks to be missing, just add some simple cases. Repository: rC Clang https://reviews.llvm.org/D47200 Files: test/Sema/attr-weak.c Index: test/Sema/at

[PATCH] D47201: [CUDA] Implement nv_weak attribute for functions

2018-05-22 Thread Jonas Hahnfeld via Phabricator via cfe-commits
Hahnfeld created this revision. Hahnfeld added a reviewer: tra. Herald added a subscriber: cfe-commits. This is needed for relocatable device code with CUDA 9 and later. Before this patch, linking two or more object files resulted in "Multiple definition" errors for a group of functions from cuda_

[PATCH] D47200: [Sema] Add tests for weak functions

2018-05-25 Thread Jonas Hahnfeld via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rL333283: [Sema] Add tests for weak functions (authored by Hahnfeld). Herald added a subscriber: llvm-commits. Changed prior to commit: https://reviews.llvm.org/D47200?vs=148021&id=148616#t

[PATCH] D47394: [OpenMP][Clang][NVPTX] Replace bundling with partial linking for the OpenMP NVPTX device offloading toolchain

2018-05-29 Thread Jonas Hahnfeld via Phabricator via cfe-commits
Hahnfeld added a comment. In https://reviews.llvm.org/D47394#1115086, @tra wrote: > On one hand I can see how being able to treat GPU-side binaries as any other > host files is convenient. On the other hand, this convenience comes with the > price of targeting only NVPTX. This seems contrary to

[PATCH] D38257: [OpenMP] Fix memory leak when translating arguments

2017-09-25 Thread Jonas Hahnfeld via Phabricator via cfe-commits
Hahnfeld created this revision. Parsing the argument after -Xopenmp-target allocates memory that needs to be freed. Associate it with the final DerivedArgList after we know which one will be used. https://reviews.llvm.org/D38257 Files: include/clang/Driver/ToolChain.h lib/Driver/Compilation

[PATCH] D38258: [OpenMP] Fix passing of -m arguments to device toolchain

2017-09-25 Thread Jonas Hahnfeld via Phabricator via cfe-commits
Hahnfeld created this revision. AuxTriple is not set if host and device share a toolchain. Also, removing an argument modifies the DAL which needs to be returned for future use. (Move tests back to offload-openmp.c as they are not related to GPUs.) https://reviews.llvm.org/D38258 Files: lib/D

[PATCH] D38259: [OpenMP] Fix translation of target args

2017-09-25 Thread Jonas Hahnfeld via Phabricator via cfe-commits
Hahnfeld created this revision. ToolChain::TranslateArgs() returns nullptr if no changes are performed. This would currently mean that OpenMPArgs are lost. Patch fixes this by falling back to simply using OpenMPArgs in that case. https://reviews.llvm.org/D38259 Files: lib/Driver/Compilation.c
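The fallback described in D38259 can be sketched roughly as follows. This is a minimal illustration and not the patch itself: `ArgList`, `translateArgs`, `selectDeviceArgs`, and the flag `-fhost-only` are hypothetical stand-ins, and only the contract "return nullptr when nothing changed" is taken from the summary above.

```cpp
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for a derived argument list (not Clang's DerivedArgList).
using ArgList = std::vector<std::string>;

// Hypothetical translation step. It returns nullptr when no argument had to be
// changed, which is the contract the summary describes for TranslateArgs().
std::unique_ptr<ArgList> translateArgs(const ArgList &Args) {
  bool Changed = false;
  auto Out = std::make_unique<ArgList>();
  for (const std::string &A : Args) {
    if (A == "-fhost-only") { // made-up flag that the device side drops
      Changed = true;
      continue;
    }
    Out->push_back(A);
  }
  if (!Changed)
    return nullptr;
  return Out;
}

// Use the translated list if there is one; otherwise fall back to the
// OpenMP-specific arguments instead of losing them.
ArgList selectDeviceArgs(const ArgList &OpenMPArgs) {
  std::unique_ptr<ArgList> Translated = translateArgs(OpenMPArgs);
  if (Translated)
    return *Translated;
  return OpenMPArgs;
}
```

The point of the sketch is the second function: a null result means "unchanged", so the caller has to keep the input list rather than treat null as empty.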

[PATCH] D38277: [compiler-rt}[CMake] Fix configuration on PowerPC with sanitizers

2017-09-26 Thread Jonas Hahnfeld via Phabricator via cfe-commits
Hahnfeld created this revision. Herald added subscribers: mgorny, dberris, nemanjai. TEST_BIG_ENDIAN() performs compile tests that will fail with -nodefaultlibs when building under LLVM_USE_SANITIZER. https://reviews.llvm.org/D38277 Files: cmake/base-config-ix.cmake Index: cmake/base-config

[PATCH] D38040: [OpenMP] Add an additional test for D34888

2017-09-26 Thread Jonas Hahnfeld via Phabricator via cfe-commits
Hahnfeld accepted this revision. Hahnfeld added a comment. This revision is now accepted and ready to land. LGTM https://reviews.llvm.org/D38040

[PATCH] D38258: [OpenMP] Fix passing of -m arguments to device toolchain

2017-09-27 Thread Jonas Hahnfeld via Phabricator via cfe-commits
Hahnfeld added inline comments. Comment at: test/Driver/openmp-offload.c:89 +/// ### + /// Check the phases graph when using a single target, different from the host. gtbercea wrote: > Shoul

[PATCH] D38258: [OpenMP] Fix passing of -m arguments to device toolchain

2017-09-27 Thread Jonas Hahnfeld via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rL314329: [OpenMP] Fix passing of -m arguments to device toolchain (authored by Hahnfeld). Changed prior to commit: https://reviews.llvm.org/D38258?vs=116608&id=116845#toc Repository: rL LLVM https://

[PATCH] D38259: [OpenMP] Fix translation of target args

2017-09-27 Thread Jonas Hahnfeld via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rL314330: [OpenMP] Fix translation of target args (authored by Hahnfeld). Changed prior to commit: https://reviews.llvm.org/D38259?vs=116610&id=116846#toc Repository: rL LLVM https://reviews.llvm.org/

[PATCH] D38257: [OpenMP] Fix memory leak when translating arguments

2017-09-27 Thread Jonas Hahnfeld via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rL314328: [OpenMP] Fix memory leak when translating arguments (authored by Hahnfeld). Changed prior to commit: https://reviews.llvm.org/D38257?vs=116607&id=116844#toc Repository: rL LLVM https://revie

[PATCH] D38277: [compiler-rt}[CMake] Fix configuration on PowerPC with sanitizers

2017-09-28 Thread Jonas Hahnfeld via Phabricator via cfe-commits
Hahnfeld marked 2 inline comments as done. Hahnfeld added a comment. The error with `-DLLVM_USE_SANITIZER=Address` is -- Check if the system is big endian -- Searching 16 bit integer -- Looking for stddef.h -- Looking for stddef.h - not found -- Check size of unsigned short -- Check s

[PATCH] D38277: [compiler-rt][CMake] Fix configuration on PowerPC with sanitizers

2017-09-28 Thread Jonas Hahnfeld via Phabricator via cfe-commits
Hahnfeld updated this revision to Diff 116977. Hahnfeld retitled this revision from "[compiler-rt}[CMake] Fix configuration on PowerPC with sanitizers" to "[compiler-rt][CMake] Fix configuration on PowerPC with sanitizers". Hahnfeld added a subscriber: gtbercea. https://reviews.llvm.org/D38277

[PATCH] D38372: [OpenMP] Fix passing of -m arguments correctly

2017-09-28 Thread Jonas Hahnfeld via Phabricator via cfe-commits
Hahnfeld created this revision. The recent fix in https://reviews.llvm.org/D38258 was wrong: getAuxTriple() only returns non-null values for the CUDA toolchain. That is why the now added test for PPC and X86 failed. https://reviews.llvm.org/D38372 Files: include/clang/Driver/ToolChain.h li

[PATCH] D38277: [compiler-rt][CMake] Fix configuration on PowerPC with sanitizers

2017-09-29 Thread Jonas Hahnfeld via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rL314512: [CMake] Fix configuration on PowerPC with sanitizers (authored by Hahnfeld). Changed prior to commit: https://reviews.llvm.org/D38277?vs=116977&id=117132#toc Repository: rL LLVM https://revi

[PATCH] D38277: [compiler-rt][CMake] Fix configuration on PowerPC with sanitizers

2017-09-29 Thread Jonas Hahnfeld via Phabricator via cfe-commits
Hahnfeld added inline comments. Comment at: compiler-rt/trunk/cmake/base-config-ix.cmake:154 + cmake_push_check_state() + string(REPLACE "-nodefaultlibs" "" CMAKE_REQUIRED_FLAGS ${OLD_CMAKE_REQUIRED_FLAGS}) TEST_BIG_ENDIAN(HOST_IS_BIG_ENDIAN) al

[PATCH] D38468: [CUDA] Fix name of __activemask()

2017-10-02 Thread Jonas Hahnfeld via Phabricator via cfe-commits
Hahnfeld created this revision. The name has two underscores in the official CUDA documentation: http://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#warp-vote-functions https://reviews.llvm.org/D38468 Files: lib/Headers/__clang_cuda_intrinsics.h Index: lib/Headers/__clang_cuda_i

[PATCH] D38468: [CUDA] Fix name of __activemask()

2017-10-02 Thread Jonas Hahnfeld via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rL314691: [CUDA] Fix name of __activemask() (authored by Hahnfeld). Changed prior to commit: https://reviews.llvm.org/D38468?vs=117384&id=117392#toc Repository: rL LLVM https://reviews.llvm.org/D38468

[PATCH] D38372: [OpenMP] Fix passing of -m arguments correctly

2017-10-04 Thread Jonas Hahnfeld via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rL314902: [OpenMP] Fix passing of -m arguments correctly (authored by Hahnfeld). Changed prior to commit: https://reviews.llvm.org/D38372?vs=117023&id=117664#toc Repository: rL LLVM https://reviews.ll

[PATCH] D38883: [CMake][OpenMP] Customize default offloading arch

2017-10-13 Thread Jonas Hahnfeld via Phabricator via cfe-commits
Hahnfeld created this revision. Herald added a subscriber: mgorny. For the shuffle instructions in reductions we need at least sm_30 but the user may want to customize the default architecture. Also remove some code that went in while troubleshooting broken tests on external build bots. https://

[PATCH] D38883: [CMake][OpenMP] Customize default offloading arch

2017-10-13 Thread Jonas Hahnfeld via Phabricator via cfe-commits
Hahnfeld marked an inline comment as done. Hahnfeld added inline comments. Comment at: lib/Driver/ToolChains/Cuda.cpp:170-182 -// This code prevents IsValid from being set when -// no libdevice has been found. -bool allEmpty = true; -std::string LibDeviceFile; -

[PATCH] D38883: [CMake][OpenMP] Customize default offloading arch

2017-10-13 Thread Jonas Hahnfeld via Phabricator via cfe-commits
Hahnfeld marked 4 inline comments as done. Hahnfeld added inline comments. Comment at: lib/Driver/ToolChains/Cuda.cpp:170-182 -// This code prevents IsValid from being set when -// no libdevice has been found. -bool allEmpty = true; -std::string LibDeviceFile; -

[PATCH] D38883: [CMake][OpenMP] Customize default offloading arch

2017-10-13 Thread Jonas Hahnfeld via Phabricator via cfe-commits
Hahnfeld updated this revision to Diff 118961. Hahnfeld marked an inline comment as done. Hahnfeld edited the summary of this revision. Hahnfeld added a comment. Check that the user didn't specify a value lower than `sm_30` and re-add some code as discussed. https://reviews.llvm.org/D38883 Fil
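A check of this kind could look like the sketch below. The patch performs its validation in the build configuration, so this C++ version is only an illustration of the rule "nothing lower than `sm_30`"; the helper names `parseSmArch` and `isSupportedDefaultArch` are hypothetical.

```cpp
#include <cstddef>
#include <optional>
#include <string>

// Parse "sm_<digits>" into its numeric part. Returns std::nullopt for any
// other spelling. (Hypothetical helper, not part of Clang.)
std::optional<int> parseSmArch(const std::string &Arch) {
  const std::string Prefix = "sm_";
  // Require the prefix plus one to four digits.
  if (Arch.size() <= Prefix.size() || Arch.size() > Prefix.size() + 4 ||
      Arch.compare(0, Prefix.size(), Prefix) != 0)
    return std::nullopt;
  int Value = 0;
  for (std::size_t I = Prefix.size(); I < Arch.size(); ++I) {
    char C = Arch[I];
    if (C < '0' || C > '9')
      return std::nullopt;
    Value = Value * 10 + (C - '0');
  }
  return Value;
}

// Accept a default offloading architecture only if it is at least sm_30,
// the minimum the summary gives for shuffle instructions in reductions.
bool isSupportedDefaultArch(const std::string &Arch) {
  std::optional<int> V = parseSmArch(Arch);
  return V.has_value() && *V >= 30;
}
```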

[PATCH] D38901: [CUDA] Require libdevice only if needed

2017-10-13 Thread Jonas Hahnfeld via Phabricator via cfe-commits
Hahnfeld created this revision. If the user passes -nocudalib, we can live without it being present. Simplify the code by just checking whether LibDeviceMap is empty. https://reviews.llvm.org/D38901 Files: lib/Driver/ToolChains/Cuda.cpp Index: lib/Driver/ToolChains/Cuda.cpp

[PATCH] D38901: [CUDA] Require libdevice only if needed

2017-10-13 Thread Jonas Hahnfeld via Phabricator via cfe-commits
Hahnfeld updated this revision to Diff 118969. Hahnfeld added a comment. Fix one more condition that checks for `nvvm/libdevice` and add a test. https://reviews.llvm.org/D38901 Files: lib/Driver/ToolChains/Cuda.cpp test/Driver/Inputs/CUDA-nolibdevice/usr/local/cuda/bin/.keep test/Driver/I

[PATCH] D38901: [CUDA] Require libdevice only if needed

2017-10-16 Thread Jonas Hahnfeld via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rL315902: [CUDA] Require libdevice only if needed (authored by Hahnfeld). Changed prior to commit: https://reviews.llvm.org/D38901?vs=118969&id=119149#toc Repository: rL LLVM https://reviews.llvm.org/

[PATCH] D38968: [OpenMP] Implement omp_is_initial_device() as builtin

2017-10-16 Thread Jonas Hahnfeld via Phabricator via cfe-commits
Hahnfeld created this revision. This allows returning the static value that we know at compile time. https://reviews.llvm.org/D38968 Files: include/clang/Basic/Builtins.def include/clang/Basic/Builtins.h lib/AST/ExprConstant.cpp lib/Basic/Builtins.cpp test/OpenMP/is_initial_device.c

[PATCH] D38968: [OpenMP] Implement omp_is_initial_device() as builtin

2017-10-16 Thread Jonas Hahnfeld via Phabricator via cfe-commits
Hahnfeld added a comment. In https://reviews.llvm.org/D38968#898951, @grokos wrote: > Now that this issue has been addressed and regressions tests pass, should we > re-enable Cmake to build libomptarget by default? Yes, I already have a local patch which also takes care of restricting the tes

[PATCH] D38883: [CMake][OpenMP] Customize default offloading arch

2017-10-17 Thread Jonas Hahnfeld via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rL315996: [CMake][OpenMP] Customize default offloading arch (authored by Hahnfeld). Changed prior to commit: https://reviews.llvm.org/D38883?vs=118961&id=119310#toc Repository: rL LLVM https://reviews

[PATCH] D38968: [OpenMP] Implement omp_is_initial_device() as builtin

2017-10-17 Thread Jonas Hahnfeld via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rL316001: [OpenMP] Implement omp_is_initial_device() as builtin (authored by Hahnfeld). Changed prior to commit: https://reviews.llvm.org/D38968?vs=119190&id=119320#toc Repository: rL LLVM https://rev

[PATCH] D26244: [Driver] Add CLANG_PREFER_GCC_LIBRARIES which can be disabled

2017-10-19 Thread Jonas Hahnfeld via Phabricator via cfe-commits
Hahnfeld abandoned this revision. Hahnfeld added a comment. Abandoning as I lost interest in this. https://reviews.llvm.org/D26244

[PATCH] D39136: [OpenMP] Avoid VLAs for some reductions on array sections

2017-10-20 Thread Jonas Hahnfeld via Phabricator via cfe-commits
Hahnfeld created this revision. In some cases the compiler can deduce the length of an array section as a constant. With this information, VLAs can be replaced by a constant-sized array or even a scalar value if the length is 1. Example: int a[4], b[2]; #pragma omp parallel reduction(+:
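As an illustration of the situation D39136 targets, the sketch below reduces over an array section whose length is a compile-time constant. It is a hypothetical example, not the one from the patch summary (which is cut off above), and the function name `sumIntoFirstTwo` is made up. Without OpenMP enabled the pragma is ignored and the loop runs serially with the same result.

```cpp
#include <array>

// Reduce into the section a[0:2]. Its length (2) is known at compile time, so
// a compiler could give each thread a fixed-size private copy rather than a
// variable-length array.
std::array<int, 4> sumIntoFirstTwo() {
  int a[4] = {0, 0, 0, 0};
  #pragma omp parallel for reduction(+ : a[0:2])
  for (int i = 0; i < 100; ++i) {
    a[i % 2] += 1; // only a[0] and a[1] are written
  }
  return {a[0], a[1], a[2], a[3]};
}
```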

[PATCH] D39136: [OpenMP] Avoid VLAs for some reductions on array sections

2017-10-20 Thread Jonas Hahnfeld via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rL316229: [OpenMP] Avoid VLAs for some reductions on array sections (authored by Hahnfeld). Changed prior to commit: https://reviews.llvm.org/D39136?vs=119687&id=119689#toc Repository: rL LLVM https:/

[PATCH] D39136: [OpenMP] Avoid VLAs for some reductions on array sections

2017-10-20 Thread Jonas Hahnfeld via Phabricator via cfe-commits
Hahnfeld reopened this revision. Hahnfeld added a comment. This revision is now accepted and ready to land. At least two buildbots failing: http://lab.llvm.org:8011/builders/clang-cmake-x86_64-avx2-linux/builds/1175 http://lab.llvm.org:8011/builders/clang-atom-d525-fedora-rel/builds/10478 Reposi

[PATCH] D39136: [OpenMP] Avoid VLAs for some reductions on array sections

2017-10-23 Thread Jonas Hahnfeld via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rL316362: [OpenMP] Avoid VLAs for some reductions on array sections (authored by Hahnfeld). Changed prior to commit: https://reviews.llvm.org/D39136?vs=119689&id=119909#toc Repository: rL LLVM https:/

[PATCH] D50845: [CUDA/OpenMP] Define only some host macros during device compilation

2018-08-23 Thread Jonas Hahnfeld via Phabricator via cfe-commits
Hahnfeld added a comment. The discussion kind of moved away from the original patch, probably because the problem is larger than the definition of some host macros. However, I still think that this patch improves the situation. Repository: rC Clang https://reviews.llvm.org/D50845

[PATCH] D50845: [CUDA/OpenMP] Define only some host macros during device compilation

2018-08-23 Thread Jonas Hahnfeld via Phabricator via cfe-commits
Hahnfeld added a comment. In https://reviews.llvm.org/D50845#1211463, @gregrodgers wrote: > What am I missing? As discussed above this patch doesn't fix this problem. However we need `__x86_64__` because `bits/wordsize.h` will use it to determine if we are 64- or 32-bit. Repository: rC Cl
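Why the host macro matters can be seen from a simplified check modeled loosely on the x86 variant of glibc's `bits/wordsize.h`. This is a sketch under that assumption: `SKETCH_WORDSIZE` and `sketchWordSize` are hypothetical names, and the real header defines `__WORDSIZE` along with further macros.

```cpp
// If __x86_64__ is not predefined (for example because device compilation
// dropped it), this style of check silently concludes a 32-bit word size.
#if defined(__x86_64__) && !defined(__ILP32__)
#define SKETCH_WORDSIZE 64
#else
#define SKETCH_WORDSIZE 32
#endif

// Returns the word size the x86-style check above concluded.
constexpr int sketchWordSize() { return SKETCH_WORDSIZE; }
```

Because the `#else` branch is a silent default, a missing `__x86_64__` does not produce an error in this header; the mismatch only shows up later in code that depends on the word size.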

[PATCH] D50845: [CUDA/OpenMP] Define only some host macros during device compilation

2018-08-23 Thread Jonas Hahnfeld via Phabricator via cfe-commits
Hahnfeld planned changes to this revision. Hahnfeld added a comment. This patch breaks C++ and CUDA compilation at the moment, sorry. I need to find and add more macros that turn out to be needed. Repository: rC Clang https://reviews.llvm.org/D50845

[PATCH] D50845: [CUDA/OpenMP] Define only some host macros during device compilation

2018-08-24 Thread Jonas Hahnfeld via Phabricator via cfe-commits
Hahnfeld updated this revision to Diff 162328. Hahnfeld added a comment. Herald added a subscriber: krytarowski. Add required macros for compiling C++ code. https://reviews.llvm.org/D50845 Files: lib/Frontend/InitPreprocessor.cpp test/Preprocessor/aux-triple.c test/SemaCUDA/builtins.cu I

[PATCH] D50845: [CUDA/OpenMP] Define only some host macros during device compilation

2018-08-25 Thread Jonas Hahnfeld via Phabricator via cfe-commits
Hahnfeld updated this revision to Diff 162543. Hahnfeld added a comment. Based on libc++ I guessed some more macros that may be needed on macOS and Windows. As I can't test this myself, could somebody else report whether this change regresses CUDA support on these platforms? https://reviews.llvm.o

[PATCH] D50845: [CUDA/OpenMP] Define only some host macros during device compilation

2018-08-25 Thread Jonas Hahnfeld via Phabricator via cfe-commits
Hahnfeld added a comment. In https://reviews.llvm.org/D50845#1212643, @tra wrote: > Please keep an eye on CUDA buildbot > http://lab.llvm.org:8011/builders/clang-cuda-build. > It runs a fair amount of tests with libc++ and a handful of libstdc++ versions > and may be a canary if these changes break s

[PATCH] D50845: [CUDA/OpenMP] Define only some host macros during device compilation

2018-08-25 Thread Jonas Hahnfeld via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rL340681: [CUDA/OpenMP] Define only some host macros during device compilation (authored by Hahnfeld). Herald added a subscriber: llvm-commits. Changed prior to commit: https://reviews.llvm

[PATCH] D51312: [OpenMP][NVPTX] Use appropriate _CALL_ELF macro when offloading

2018-08-27 Thread Jonas Hahnfeld via Phabricator via cfe-commits
Hahnfeld accepted this revision. Hahnfeld added a comment. This revision is now accepted and ready to land. LGTM. Can you add a comment to `InitializePredefinedAuxMacros` explaining that the macro is used in `gnu/stubs.h` and add a check to the test? Repository: rC Clang https://reviews.llvm

[PATCH] D51378: [OPENMP] Add support for nested 'declare target' directives

2018-08-30 Thread Jonas Hahnfeld via Phabricator via cfe-commits
Hahnfeld added a comment. In https://reviews.llvm.org/D51378#1218184, @RaviNarayanaswamy wrote: > We should just go with generating an error if the DeclareTargetNestingLevel > is not 0 at the end of compilation unit. > Hard to detect if user accidentally forgot to have end declare in header

[PATCH] D51446: [OpenMP][bugfix] Add missing macros for Power

2018-08-30 Thread Jonas Hahnfeld via Phabricator via cfe-commits
Hahnfeld requested changes to this revision. Hahnfeld added a comment. This revision now requires changes to proceed. Please also update the test. Comment at: lib/Frontend/InitPreprocessor.cpp:1115-1130 case llvm::Triple::ppc64: +if (AuxTI.getLongDoubleWidth() == 128) {

[PATCH] D50845: [CUDA/OpenMP] Define only some host macros during device compilation

2018-08-30 Thread Jonas Hahnfeld via Phabricator via cfe-commits
Hahnfeld added a comment. Do you have invocations or headers that don't work? The problem is that the previous code defined all macros unconditionally, so it will afterwards be hard to find the necessary macros... Repository: rL LLVM https://reviews.llvm.org/D50845

[PATCH] D50845: [CUDA/OpenMP] Define only some host macros during device compilation

2018-08-30 Thread Jonas Hahnfeld via Phabricator via cfe-commits
Hahnfeld added a comment. In https://reviews.llvm.org/D50845#1219726, @gtbercea wrote: > In general, it looks like this patch leads to some host macros having to be > defined again for the auxiliary triple case. It is not clear to me how to > exhaustively identify the missing macros, so far it'

[PATCH] D50845: [CUDA/OpenMP] Define only some host macros during device compilation

2018-08-30 Thread Jonas Hahnfeld via Phabricator via cfe-commits
Hahnfeld added a comment. In https://reviews.llvm.org/D50845#1219797, @tra wrote: > I've sent out https://reviews.llvm.org/D51501. It unbreaks CUDA compilation > and keeps OpenMP unchanged. I think a full revert would make more sense. And you definitely want to reinstantiate // FIXME: This

[PATCH] D50845: [CUDA/OpenMP] Define only some host macros during device compilation

2018-08-30 Thread Jonas Hahnfeld via Phabricator via cfe-commits
Hahnfeld added a comment. In https://reviews.llvm.org/D50845#1219746, @tra wrote: > In our case the headers from a relatively old glibc and compiler errors out > on this: > > /* This function is used in the `isfinite' macro. */ > __MATH_INLINE int > __NTH (__finite (double __x)) > { >

[PATCH] D50845: [CUDA/OpenMP] Define only some host macros during device compilation

2018-08-30 Thread Jonas Hahnfeld via Phabricator via cfe-commits
Hahnfeld added a comment. In https://reviews.llvm.org/D50845#1219853, @tra wrote: > In https://reviews.llvm.org/D50845#1219819, @Hahnfeld wrote: > > > Ok, the top preprocessor condition for that function is `#ifndef > > __SSE2_MATH__` - the exact same macro that was part of the motivation. Can

[PATCH] D50845: [CUDA/OpenMP] Define only some host macros during device compilation

2018-08-30 Thread Jonas Hahnfeld via Phabricator via cfe-commits
Hahnfeld added a comment. In https://reviews.llvm.org/D50845#1219865, @gtbercea wrote: > In https://reviews.llvm.org/D50845#1219859, @Hahnfeld wrote: > > > removing `InitializePredefinedAuxMacros` and the new test completely should > > do. > > > Yep they also contain https://reviews.llvm.org/D51

[PATCH] D51446: [OpenMP][bugfix] Add missing macros for Power

2018-09-04 Thread Jonas Hahnfeld via Phabricator via cfe-commits
Hahnfeld added a comment. Not needed anymore after the reverts in https://reviews.llvm.org/rC341115 and https://reviews.llvm.org/rC341118, right? Maybe we should revive the test to make sure we don't break this in the future? Repository: rC Clang https://reviews.llvm.org/D51446

[PATCH] D51501: [CUDA] Fix CUDA compilation broken by D50845

2018-09-04 Thread Jonas Hahnfeld via Phabricator via cfe-commits
Hahnfeld added a comment. Not needed anymore after the reverts in https://reviews.llvm.org/rC341115 and https://reviews.llvm.org/rC341118, right? https://reviews.llvm.org/D51501

[PATCH] D51686: [OpenMP] Improve search for libomptarget-nvptx

2018-09-05 Thread Jonas Hahnfeld via Phabricator via cfe-commits
Hahnfeld created this revision. Hahnfeld added reviewers: gtbercea, ABataev. Herald added subscribers: cfe-commits, guansong. When looking for the bclib Clang considered the default library path first while it preferred directories in LIBRARY_PATH when constructing the invocation of nvlink. The la

[PATCH] D48862: [OpenEmbedded] Fix lib paths for OpenEmbedded targets

2018-07-31 Thread Jonas Hahnfeld via Phabricator via cfe-commits
Hahnfeld added a comment. I fixed `linux-header-search.cpp` by adding `-stdlib=libstdc++` in r338360 because I was seeing the same failure and that's what we agreed to do in these cases. If you can verify that it fixes your problems, I think it's safe to add `-rtlib=libgcc` to the other test. Re

[PATCH] D47849: [OpenMP][Clang][NVPTX] Enable math functions called in an OpenMP NVPTX target device region to be resolved as device-native function calls

2018-07-31 Thread Jonas Hahnfeld via Phabricator via cfe-commits
Hahnfeld added a comment. In https://reviews.llvm.org/D47849#1124861, @hfinkel wrote: > In https://reviews.llvm.org/D47849#1124638, @Hahnfeld wrote: > > > 2. Incidentally I ran into a closely related problem: I can't `#include > > ` in translation units compiled for offloading, Clang complains

[PATCH] D47849: [OpenMP][Clang][NVPTX] Enable math functions called in an OpenMP NVPTX target device region to be resolved as device-native function calls

2018-08-01 Thread Jonas Hahnfeld via Phabricator via cfe-commits
Hahnfeld added a comment. In https://reviews.llvm.org/D47849#1183150, @hfinkel wrote: > Hrmm. Doesn't that make it so that whatever functions are implemented using > that inline assembly will not be callable from target code (or, perhaps > worse, will crash the backend if called)? You are rig

[PATCH] D47849: [OpenMP][Clang][NVPTX] Enable math functions called in an OpenMP NVPTX target device region to be resolved as device-native function calls

2018-08-01 Thread Jonas Hahnfeld via Phabricator via cfe-commits
Hahnfeld added a comment. In https://reviews.llvm.org/D47849#1184367, @hfinkel wrote: > The problem is that the inline assembly might actually be for the target, > instead of the host, because we also have target preprocessor macros defined, > and it's going to be hard to tell. I'm not sure tha

[PATCH] D42978: Make march/target-cpu print a note with the list of valid values for ARM

2018-02-08 Thread Jonas Hahnfeld via Phabricator via cfe-commits
Hahnfeld added a comment. I think this means that the Clang test needs to be updated whenever somebody adds an architecture to LLVM? Maybe just test that Clang emits a note and don't check which values it prints? These should be checked in the backend... https://reviews.llvm.org/D42978

[PATCH] D43041: Add X86 Support to ValidCPUList (enabling march notes)

2018-02-08 Thread Jonas Hahnfeld via Phabricator via cfe-commits
Hahnfeld added inline comments. Comment at: lib/Basic/Targets/X86.cpp:1670-1672 +#define PROC_ALIAS(ENUM, ALIAS) \ + if (checkCPUKind(getCPUKind(ALIAS))) \ +Values.emplace_back(ALIAS); -

[PATCH] D42978: Make march/target-cpu print a note with the list of valid values for ARM

2018-02-08 Thread Jonas Hahnfeld via Phabricator via cfe-commits
Hahnfeld added inline comments. Comment at: test/Misc/target-invalid-cpu-note.c:1 +// RUN: not %clang_cc1 -triple armv5--- -target-cpu not-a-cpu -fsyntax-only %s 2>&1 | FileCheck %s --check-prefix ARM +// ARM: error: unknown target CPU 'not-a-cpu' Is there a rea

[PATCH] D42840: [docs] Fix duplicate arguments for JoinedAndSeparate

2018-02-09 Thread Jonas Hahnfeld via Phabricator via cfe-commits
Hahnfeld added a comment. Ping Repository: rC Clang https://reviews.llvm.org/D42840

[PATCH] D42920: [CUDA] Fix test cuda-external-tools.cu

2018-02-12 Thread Jonas Hahnfeld via Phabricator via cfe-commits
Hahnfeld added inline comments. Comment at: test/Driver/cuda-external-tools.cu:11 +// RUN: | FileCheck -check-prefix CHECK -check-prefix ARCH64 \ +// RUN: -check-prefix SM20 -check-prefix OPT0 %s // RUN: %clang -### -target x86_64-linux-gnu -O1 -c %s 2>&1 \ -

[PATCH] D42920: [CUDA] Fix test cuda-external-tools.cu

2018-02-12 Thread Jonas Hahnfeld via Phabricator via cfe-commits
Hahnfeld updated this revision to Diff 133816. Hahnfeld marked an inline comment as done. Hahnfeld added a comment. Use `--check-prefixes` instead of multiple `--check-prefix`. https://reviews.llvm.org/D42920 Files: test/Driver/cuda-external-tools.cu Index: test/Driver/cuda-external-tools.cu

[PATCH] D42921: [CUDA] Add option to generate relocatable device code

2018-02-12 Thread Jonas Hahnfeld via Phabricator via cfe-commits
Hahnfeld updated this revision to Diff 133817. Hahnfeld marked an inline comment as done. Hahnfeld added a comment. Hide help for `-fcuda-rdc` until support is ready. https://reviews.llvm.org/D42921 Files: include/clang/Basic/LangOptions.def include/clang/Driver/Options.td lib/Driver/Tool

[PATCH] D42921: [CUDA] Add option to generate relocatable device code

2018-02-12 Thread Jonas Hahnfeld via Phabricator via cfe-commits
Hahnfeld added inline comments. Comment at: include/clang/Driver/Options.td:572 + HelpText<"Generate relocatable device code, also known as separate compilation mode.">; +def fno_cuda_rdc : Flag<["-"], "fno-cuda-rdc">; def dA : Flag<["-"], "dA">, Group; tra wr

[PATCH] D42921: [CUDA] Add option to generate relocatable device code

2018-02-12 Thread Jonas Hahnfeld via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rL324878: [CUDA] Add option to generate relocatable device code (authored by Hahnfeld, committed by ). Herald added a subscriber: llvm-commits. Changed prior to commit: https://reviews.llvm.org/D42921?vs=

[PATCH] D42920: [CUDA] Fix test cuda-external-tools.cu

2018-02-12 Thread Jonas Hahnfeld via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rL324877: [CUDA] Fix test cuda-external-tools.cu (authored by Hahnfeld, committed by ). Herald added a subscriber: llvm-commits. Changed prior to commit: https://reviews.llvm.org/D42920?vs=133816&id=13381

[PATCH] D42922: [CUDA] Register relocatable GPU binaries

2018-02-12 Thread Jonas Hahnfeld via Phabricator via cfe-commits
Hahnfeld updated this revision to Diff 133831. Hahnfeld added a comment. Rebase and fix `Debug` build. https://reviews.llvm.org/D42922 Files: lib/CodeGen/CGCUDANV.cpp Index: lib/CodeGen/CGCUDANV.cpp === --- lib/CodeGen/CGCUDANV.

[PATCH] D42922: [CUDA] Register relocatable GPU binaries

2018-02-12 Thread Jonas Hahnfeld via Phabricator via cfe-commits
Hahnfeld planned changes to this revision. Hahnfeld added a comment. Still no regression tests. I did some functional tests though (https://reviews.llvm.org/F5822023): With this patch Clang can generate valid object files with relocatable device code. For linking I still defer to `nvcc` and I'm

[PATCH] D42923: [CUDA] Allow external variables in separate compilation

2018-02-12 Thread Jonas Hahnfeld via Phabricator via cfe-commits
Hahnfeld added inline comments. Comment at: test/SemaCUDA/extern-shared.cu:4 +// These declarations are fine in separate compilation mode! +// RUN: %clang_cc1 -fsyntax-only -fcuda-rdc -verify=rdc %s +// RUN: %clang_cc1 -fsyntax-only -fcuda-is-device -fcuda-rdc -verify=rdc %s

[PATCH] D43197: [OpenMP] Add flag for linking runtime bitcode library

2018-02-13 Thread Jonas Hahnfeld via Phabricator via cfe-commits
Hahnfeld added inline comments. Comment at: test/Driver/openmp-offload-gpu.c:150 +/// bitcode library that will be found via the LIBRARY_PATH. +// RUN: touch /tmp/libomptarget-nvptx-sm_60.bc +// RUN: LIBRARY_PATH=/tmp %clang -### -fopenmp=libomp -fopenmp-targets=nvptx64-nvid

[PATCH] D42923: [CUDA] Allow external variables in separate compilation

2018-02-14 Thread Jonas Hahnfeld via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rC325136: [CUDA] Allow external variables in separate compilation (authored by Hahnfeld, committed by ). Changed prior to commit: https://reviews.llvm.org/D42923?vs=132866&id=134230#toc Repository: rL

[PATCH] D42923: [CUDA] Allow external variables in separate compilation

2018-02-14 Thread Jonas Hahnfeld via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rL325136: [CUDA] Allow external variables in separate compilation (authored by Hahnfeld, committed by ). Herald added a subscriber: llvm-commits. Changed prior to commit: https://reviews.llvm.org/D42923?v
