[PATCH] D94745: [OpenMP][WIP] Build the deviceRTLs with OpenMP instead of target dependent language - NOT FOR REVIEW

2021-01-22 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added inline comments. Comment at: openmp/libomptarget/deviceRTLs/nvptx/CMakeLists.txt:24 + return() +endif() + jdoerfert wrote: > we need the 32 bit versions as well, I guess? We could limit to 64 and see if a feature request for 32 comes in.

[PATCH] D95161: [Clang][OpenMP][NVPTX] Replace `libomptarget-nvptx-path` with `libomptarget-nvptx-bc-path`

2021-01-23 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. Still LGTM. A little surprised these are being run on windows, but good to be robust. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D95161/new/ https://reviews.llvm.org/D95161 __

[PATCH] D95313: [WIP] Move part of nvptx devicertl under clang

2021-01-24 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield created this revision. JonChesterfield added reviewers: jdoerfert, tianshilei1992. Herald added subscribers: mgorny, jvesely. JonChesterfield requested review of this revision. Herald added subscribers: openmp-commits, cfe-commits, sstefan1. Herald added projects: clang, OpenMP. [W

[PATCH] D95313: [WIP] Move part of nvptx devicertl under clang

2021-01-24 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added inline comments. Comment at: clang/lib/Driver/ToolChains/Clang.cpp:1204 +{ + auto *CTC = static_cast( + C.getSingleOffloadToolChain()); Logic very like this could pick out a second, small devicertl bitcode library ===

[PATCH] D94745: [OpenMP][WIP] Build the deviceRTLs with OpenMP instead of target dependent language - NOT FOR REVIEW

2021-01-25 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added inline comments. Comment at: openmp/libomptarget/deviceRTLs/common/src/libcall.cu:319 -EXTERN int omp_is_initial_device(void) { - PRINT0(LD_IO, "call omp_is_initial_device() returns 0\n"); - return 0; // 0 by def on device -} +// EXTERN int omp_is_initia

[PATCH] D94745: [OpenMP][WIP] Build the deviceRTLs with OpenMP instead of target dependent language - NOT FOR REVIEW

2021-01-25 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. Couple of minor suggestions inline, but overall this looks pretty good. Hopefully device-only compilation works already, the rest could be left for after this patch. Comment at: openmp/libomptarget/deviceRTLs/nvptx/CMakeLists.txt:100 +

[PATCH] D94745: [OpenMP][WIP] Build the deviceRTLs with OpenMP instead of target dependent language - NOT FOR REVIEW

2021-01-25 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. > the access size (8 bytes) exceeds the max lock-free size (0 bytes) > [-Watomic-alignment That's not a useful warning. Every size is greater than 0. I guess nvptx hasn't set a value somewhere in clang for max lock free size. Fortunately we only compile with c

[PATCH] D94745: [OpenMP][WIP] Build the deviceRTLs with OpenMP instead of target dependent language - NOT FOR REVIEW

2021-01-25 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added inline comments. Comment at: openmp/libomptarget/deviceRTLs/nvptx/src/target_impl.h:98 #ifndef CUDA_VERSION #error CUDA_VERSION macro is undefined, something wrong with cuda. #endif This should be firing now that cuda.h is removed Repos

[PATCH] D94745: [OpenMP][WIP] Build the deviceRTLs with OpenMP instead of target dependent language - NOT FOR REVIEW

2021-01-25 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added inline comments. Comment at: openmp/libomptarget/deviceRTLs/common/omptarget.h:301 +extern DEVICE uint8_t parallelLevel[MAX_THREADS_PER_TEAM / WARPSIZE]; +#pragma omp allocate(parallelLevel) allocator(omp_pteam_mem_alloc) +extern DEVICE uint16_t EXTERN_SHARE

[PATCH] D94961: [OpenMP] Add OpenMP offloading toolchain for AMDGPU

2021-01-25 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. The deviceRTL is changing from cuda/hip to openmp at present. It would be good to be able to compile that as openmp for amdgpu, which needs a patch roughly like this and probably some follow up. It's plausible that the contents of this file are only really of in

[PATCH] D94745: [OpenMP][WIP] Build the deviceRTLs with OpenMP instead of target dependent language - NOT FOR REVIEW

2021-01-25 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added inline comments. Comment at: openmp/libomptarget/deviceRTLs/common/allocator.h:16 + +// Follows the pattern in interface.h +// Clang sema checks this type carefully, needs to closely match that from omp.h If we guard this with #ifdef _OPENM

[PATCH] D94745: [OpenMP][deviceRTLs] Build the deviceRTLs with OpenMP instead of target dependent language

2021-01-25 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield accepted this revision. JonChesterfield added a comment. This revision is now accepted and ready to land. LGTM! Thank you for iterating on this, and for trying to keep amdgpu working. With this patch, the devicertl no longer needs the cuda sdk installed to compile. I believe ther

[PATCH] D95313: [WIP] Move part of nvptx devicertl under clang

2021-01-26 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield abandoned this revision. JonChesterfield added a comment. Abandoned in favour of multiple instantiations of the devicertl, which works across all languages and doesn't require hacks to clang Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.

[PATCH] D95764: [OpenMP] Do not propagate match extensions to nested contexts

2021-01-31 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. I would have expected a nested match statement to hit on a subset of those that matched in the parent. If (match x64) If (match Intel) - expect x64 and intel to be true here I think that's how the normal openmp variant match works. Why do we want to diverge fr

[PATCH] D94961: [OpenMP] Add OpenMP offloading toolchain for AMDGPU

2021-02-01 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. In D94961#2533848 , @jdoerfert wrote: > Overall, looks reasonable. I'm not a fan of the way things are hooked up but > that is also true for the NVPTX toolchain and for now unavoidable I guess. This is loosely based on th

[PATCH] D94961: [OpenMP] Add OpenMP offloading toolchain for AMDGPU

2021-02-01 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added inline comments. Comment at: clang/lib/Driver/ToolChains/AMDGPUOpenMP.cpp:65 + .Case("g", "1") + .Default("2"); +} JonChesterfield wrote: > jdoerfert wrote: > > To verify, O0 is not mapped to O2, correct?

[PATCH] D94961: [OpenMP] Add OpenMP offloading toolchain for AMDGPU

2021-02-01 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. I believe this is 'safe', in the sense that it can't break anything other than AMDGPU's openmp implementation, which doesn't work yet anyway. The change to non-amd code is to OpenMPActionBuilder, and while parsing `-march=gfx` isn't pretty, it won't fire on nvpt

[PATCH] D95764: [OpenMP] Do not propagate match extensions to nested contexts

2021-02-01 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield accepted this revision. JonChesterfield added a comment. This revision is now accepted and ready to land. That sounds reasonable. We can probably expect features to be renamed and semantically adjusted on their way to standardisation anyway. Repository: rG LLVM Github Monorepo

[PATCH] D95765: [OpenMP] Introduce the `disable_selector_propagation` variant selector trait

2021-02-01 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. Yes, OK. I'm a little nervous that variant is evolving into a turing complete sublanguage, but I see the use of opting out of the implicit inheritance. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D95765/new/ https

[PATCH] D94961: [OpenMP] Add OpenMP offloading toolchain for AMDGPU

2021-02-02 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added inline comments. Comment at: clang/test/Driver/amdgpu-openmp-toolchain.c:2 +// REQUIRES: amdgpu-registered-target +// RUN: %clang -### --target=x86_64-unknown-linux-gnu -fopenmp -fopenmp-targets=amdgcn-amd-amdhsa -Xopenmp-target=amdgcn-amd-amdhsa -march=

[PATCH] D94961: [OpenMP] Add OpenMP offloading toolchain for AMDGPU

2021-02-02 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. Looks much simpler, thanks! Comment at: clang/lib/Driver/Driver.cpp:755 + *this, TT, *HostTC, C.getInputArgs(), Action::OFK_OpenMP); +else if (TT.isAMDGCN()) + DeviceTC = Mi

[PATCH] D94961: [OpenMP] Add OpenMP offloading toolchain for AMDGPU

2021-02-02 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added inline comments. Comment at: clang/lib/Driver/Driver.cpp:755 + *this, TT, *HostTC, C.getInputArgs(), Action::OFK_OpenMP); +else if (TT.isAMDGCN()) + DeviceTC = pdhaliwal wrote: > JonChes

[PATCH] D94745: [OpenMP][deviceRTLs] Build the deviceRTLs with OpenMP instead of target dependent language

2021-02-03 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. I think there's a bug report about this. Sycl (iirc) introduced a change that caught invalid things with types that were previously ignored. @jdoerfert is on point I think. `-DLLVM_ENABLE_PROJECTS="clang;compiler-rt;libcxxabi;libcxx;libunwind;openmp" ` ^ That wo

[PATCH] D109885: [MLIR][[amdgpu-arch]][OpenMP] Remove direct dependency on /opt/rocm

2021-09-16 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. Dropping PATHS /opt/rocm from the openmp parts will break people using an existing rocm install with trunk llvm. I think it would be reasonable to look at whatever ROCM_PATH is set to instead of assuming /opt/rocm. Is that a variable we can pass to find_package?

[PATCH] D102107: [OpenMP] Codegen aggregate for outlined function captures

2021-09-21 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a subscriber: dpalermo. JonChesterfield added a comment. @ronlieb can you apply this to amd-stg-open? If it breaks there we have a chance of trying a debugger on it. @dpalermo might be available again now. @jdoerfert I debug stuff like this by inspection, guesswork and a DI

[PATCH] D102107: [OpenMP] Codegen aggregate for outlined function captures

2021-09-22 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. In D102107#3014599 , @pdhaliwal wrote: > It looks like from IR diff that this patch is adding use of kmpc_alloc_shared > method. These methods likely won't work on AMDGPU as device malloc is not > available. Not sure wha

[PATCH] D102107: [OpenMP] Codegen aggregate for outlined function captures

2021-09-22 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. In D102107#3014743 , @pdhaliwal wrote: > I got this after changing __kmpc_impl_malloc to return 0xdeadbeef. So, this > confirms that missing malloc implementation is the root cause. > >> Memory access fault by GPU node-4

[PATCH] D110029: [OpenMP][Offloading] Use bitset to indicate execution mode instead of value

2021-09-27 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. This seems to be related to some things failing on amdgpu at present: > Target AMDGPU RTL --> Error wrong exec_mode value specified in HSA code > object file: 3 The semantics for this mode seem to have been introduced in D106460

[PATCH] D96248: [OpenMP][AMDGPU] Add support for linking libomptarget bitcode

2021-02-08 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. I like this. Using the same logic, in the same function call, to find this library on either gpu is the right thing to do. Looks like a non functional change on nvptx, though phab doesn't make that obvious. Comment at: clang/include/clang/Driv

[PATCH] D96248: [OpenMP][AMDGPU] Add support for linking libomptarget bitcode

2021-02-08 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. LGTM. Let's wait for someone using nvptx to sanity check Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D96248/new/ https://reviews.llvm.org/D96248 ___ cfe-commits mailing

[PATCH] D96248: [OpenMP][AMDGPU] Add support for linking libomptarget bitcode

2021-02-08 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. Looks like an existing problem, but for building openmp via runtimes, with this patch applied this doesn't look in the right place and errors: 'error: No library 'libomptarget-amdgcn-gfx906.bc' found in the default clang lib directory or in LIBRARY_PATH. Please

[PATCH] D96248: [OpenMP][AMDGPU] Add support for linking libomptarget bitcode

2021-02-08 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added inline comments. Comment at: clang/lib/Driver/ToolChains/AMDGPUOpenMP.cpp:193 CC1Args.push_back("-emit-llvm-bc"); + + std::string BitcodeSuffix = "amdgcn-" + GpuArch.str(); Need `if (DriverArgs.hasArg(options::OPT_nogpulib)) return;` her

[PATCH] D96248: [OpenMP][AMDGPU] Add support for linking libomptarget bitcode

2021-02-08 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added inline comments. Comment at: clang/lib/Driver/ToolChains/AMDGPUOpenMP.cpp:193 CC1Args.push_back("-emit-llvm-bc"); + + std::string BitcodeSuffix = "amdgcn-" + GpuArch.str(); tianshilei1992 wrote: > JonChesterfield wrote: > > Need `if (Dri

[PATCH] D96248: [OpenMP][AMDGPU] Add support for linking libomptarget bitcode

2021-02-09 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. Some background discussion about the diagnostic. This change means people using nvptx, where the library cannot be found, will now be advised to use libomptarget-device-bc-path instead of libomptarget-nvptx-bc-path. If they use libomptarget-nvptx-bc-path anyway,

[PATCH] D96248: [OpenMP][AMDGPU] Add support for linking libomptarget bitcode

2021-02-09 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. Using one option for both targets seems great - if both have put the devicertl in the same folder. Which I suppose they might not have. Maybe keep it separate for one, one for nvptx and one for amdgcn, and hope for a common 'device' later. Repository: rG LLV

[PATCH] D85223: [CUDA][HIP] Support accessing static device variable in host code for -fgpu-rdc

2021-02-09 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield accepted this revision. JonChesterfield added a comment. This works around the limitations of the binary format nvptx and amdgpu are using in the compiler. It's the wrong place in the stack to fix it - we could introduce another symbol table in the binary to capture the per-tu-be

[PATCH] D96248: [OpenMP][AMDGPU] Add support for linking libomptarget bitcode

2021-02-11 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. Parameter to err_drv_omp_offload_target_missingbcruntime is sensible. @tianshilei1992? With this we can set the path for nvptx and amdgcn independently. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D96248/new/ htt

[PATCH] D95912: [OpenMP] Attribute target diagnostics properly

2021-02-15 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield accepted this revision. JonChesterfield added a comment. This revision is now accepted and ready to land. I agree this is an improvement on the current status and should land. Reading through it has made me nervous about our current semantic checking. In particular, I'm not sure

[PATCH] D95928: [OpenMP] Delay more diagnostics of potentially non-emitted code

2021-02-15 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added inline comments. Comment at: clang/lib/Sema/SemaDecl.cpp:18345 if (LangOpts.OpenMPIsDevice) { +// In OpenMP device mode we will not emit host only functions, or functions +// we don't need due to their linkage. What catches a stat

[PATCH] D95928: [OpenMP] Delay more diagnostics of potentially non-emitted code

2021-02-15 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield accepted this revision. JonChesterfield added a comment. This revision is now accepted and ready to land. Thanks! Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D95928/new/ https://reviews.llvm.org/D95928

[PATCH] D96769: [OpenMP][AMDGPU] Skip backend and assemble phases for amdgcn

2021-02-16 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. Change looks reasonable. Does this fix save-temps as intended? Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D96769/new/ https://reviews.llvm.org/D96769 ___ cfe-commits ma

[PATCH] D96769: [OpenMP][AMDGPU] Skip backend and assemble phases for amdgcn

2021-02-16 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. Perhaps paraphrase that and add it to the commit message (which is derived from the text at the top of this page) Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D96769/new/ https://reviews.llvm.org/D96769 __

[PATCH] D96877: [libomptarget] Try a fallback devicertl if the preferred one is missing

2021-02-17 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield created this revision. JonChesterfield added reviewers: jdoerfert, ye-luo, tianshilei1992, ABataev, grokos. JonChesterfield requested review of this revision. Herald added subscribers: cfe-commits, sstefan1. Herald added a project: clang. [libomptarget] Try a fallback devicertl if

[PATCH] D96877: [libomptarget] Try a fallback devicertl if the preferred one is missing

2021-02-17 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield updated this revision to Diff 324342. JonChesterfield added a comment. - drop whitespace from .td Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D96877/new/ https://reviews.llvm.org/D96877 Files: clang/include/clang/Basic/Diagnosti

[PATCH] D96877: [libomptarget] Try a fallback devicertl if the preferred one is missing

2021-02-17 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield updated this revision to Diff 324354. JonChesterfield added a comment. - review comments Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D96877/new/ https://reviews.llvm.org/D96877 Files: clang/include/clang/Basic/DiagnosticDriverKi

[PATCH] D96877: [libomptarget] Try a fallback devicertl if the preferred one is missing

2021-02-17 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield marked an inline comment as done. JonChesterfield added inline comments. Comment at: clang/lib/Driver/ToolChains/CommonArgs.cpp:1647 +llvm::Twine lib) { + for (StringRef LibraryPath : LibraryPaths) { +SmallString<128> TargetFile(LibraryPath);

[PATCH] D96877: [libomptarget] Try a fallback devicertl if the preferred one is missing

2021-02-17 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added inline comments. Comment at: clang/include/clang/Basic/DiagnosticDriverKinds.td:265 "No library '%0' found in the default clang lib directory or in LIBRARY_PATH. Please use --libomptarget-%1-bc-path to specify %1 bitcode library.">; +def warn_drv_omp_of

[PATCH] D96769: [OpenMP][AMDGPU] Skip backend and assemble phases for amdgcn

2021-02-17 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. Debugged this because I need save-temps (or hack up a binary by hand) to debug something else. The problem is addClangTargetOptions gets called to add target options to clang. With save temps, an early clang invocation runs the preprocessor alone. That gets pass

[PATCH] D96877: [libomptarget] Try a fallback devicertl if the preferred one is missing

2021-02-17 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. Suggestion is to resolve libomptarget-nvptx-unknown.bc to a cp of the bitcode libary built for the newest sm_xx and ptx version clang knows of. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D96877/new/ https://revie

[PATCH] D96877: [libomptarget] Try a fallback devicertl if the preferred one is missing

2021-02-17 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield updated this revision to Diff 32. JonChesterfield added a comment. - address review comment Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D96877/new/ https://reviews.llvm.org/D96877 Files: clang/include/clang/Basic/DiagnosticD

[PATCH] D96877: [libomptarget] Try a fallback devicertl if the preferred one is missing

2021-02-17 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. In D96877#2569863 , @tianshilei1992 wrote: > Please also update the test. > > In D96877#2569861 , @JonChesterfield > wrote: > >> Suggestion is to resolve libomptarget-nvptx-unknown

[PATCH] D96769: [OpenMP][AMDGPU] Skip backend and assemble phases for amdgcn

2021-02-17 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. I don't think there's a general bug here. Maybe save-temps could strip out some arguments that collide with E, or pass E at the end of the list instead of the start, but that seems pretty invasive for a problem that doesn't seem to affect anyone. Passing emit-l

[PATCH] D96769: [OpenMP][AMDGPU] Skip backend and assemble phases for amdgcn

2021-02-17 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. An upshot of looking at this is that `-Xclang -emit-llvm-bc -save-temps` does not work, and nor does `-Xclang -emit-llvm -save-temps`. However `-emit-llvm -save-temps` is fine. Opened https://bugs.llvm.org/show_bug.cgi?id=49234 to track. Skipping the phases sti

[PATCH] D96877: [libomptarget] Try a fallback devicertl if the preferred one is missing

2021-02-18 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. If you like it, accept it. If there's more stuff to do first, please say so. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D96877/new/ https://reviews.llvm.org/D96877 ___

[PATCH] D96877: [libomptarget] Try a fallback devicertl if the preferred one is missing

2021-02-18 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. Which tests? Choosing a file to duplicate with the unknown.bc name is left separate. This just adds code to look for that file if the first choice failed. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D96877/new/ h

[PATCH] D97120: [Clang][OpenMP] Update driver test case for OpenMP offload to use sm_35

2021-02-20 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield accepted this revision. JonChesterfield added a comment. This revision is now accepted and ready to land. LG. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D97120/new/ https://reviews.llvm.org/D97120

[PATCH] D96877: [libomptarget] Try a fallback devicertl if the preferred one is missing

2021-02-22 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. Is this obsolete with the change to devicertl cmake? Would prefer abandon to land if so Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D96877/new/ https://reviews.llvm.org/D96877

[PATCH] D97003: [Clang][OpenMP] Require CUDA 9.2+ for OpenMP offloading on NVPTX target

2021-02-22 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield accepted this revision. JonChesterfield added a comment. LGTM too. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D97003/new/ https://reviews.llvm.org/D97003 ___ cfe-commits mailing list cf

[PATCH] D96769: [OpenMP][AMDGPU] Skip backend and assemble phases for amdgcn

2021-02-24 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. Replacing `CC1Args.push_back("-emit-llvm-bc");` with `CC1Args.push_back("-emit-llvm-bc");` as suggested on the call does not work. This hook is downstream of the clang driver, so all it does under save temps is lead to `clang -E -emit-llvm`, which generated llvm

[PATCH] D96769: [OpenMP][AMDGPU] Skip backend and assemble phases for amdgcn

2021-02-24 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added inline comments. Comment at: clang/lib/Driver/ToolChains/AMDGPUOpenMP.cpp:192 CC1Args.push_back("-fcuda-is-device"); - CC1Args.push_back("-emit-llvm-bc"); vaguely interesting that `-emit-llvm` here appears to generate the same code as

[PATCH] D97273: OpenMP: Fix object clobbering issue when using save-temps

2021-02-24 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. Works everywhere we have tried it. Fundamentally it renames a temporary file, so shouldn't break much. Will be great to have -save-temps working for nvptx. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D97273/new/ h

[PATCH] D96769: [OpenMP][AMDGPU] Skip backend and assemble phases for amdgcn

2021-02-24 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. Something we should probably check is the interaction between -save-temps and whether we are trying to compile a single file or an executable, e.g. the difference between clang and clang -c. If trying to compile foo.c directly to an executable, -save-temps shoul

[PATCH] D96769: [OpenMP][AMDGPU] Skip backend and assemble phases for amdgcn

2021-02-25 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. Strictly speaking I suppose that's not a temporary file, since it isn't used for anything, but whenever I pass -save-temps I am likely to want to read the assembly. This looks like a great workaround to me. @jdoerfert? Repository: rG LLVM Github Monorepo CHA

[PATCH] D104677: [OpenMP][AMDGCN] Apply fix for isnan, isinf and isfinite for amdgcn.

2021-06-22 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. Looks very similar to the cuda workaround. Any chance we can #include to use the same code in both? Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D104677/new/ https://reviews.llvm.org/D104677 __

[PATCH] D104677: [OpenMP][AMDGCN] Apply fix for isnan, isinf and isfinite for amdgcn.

2021-06-22 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. Looks like the __OPENMP_AMDGCN__ guard is sound to me, but can't really argue against having more tests Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D104677/new/ https://reviews.llvm.org/D104677 __

[PATCH] D104677: [OpenMP][AMDGCN] Apply fix for isnan, isinf and isfinite for amdgcn.

2021-06-23 Thread Jon Chesterfield via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rG5dfdc1812d9b: [OpenMP][AMDGCN] Apply fix for isnan, isinf and isfinite for amdgcn. (authored by estewart08, committed by JonChesterfield). Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION

[PATCH] D105981: [AMDGPU][OpenMP] Support linking of math libraries

2021-07-30 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield accepted this revision. JonChesterfield added a comment. This revision is now accepted and ready to land. I think this is good enough for now, assuming the majority of OvO is running locally. More cleanups to do after landing. Repository: rG LLVM Github Monorepo CHANGES SINCE

[PATCH] D104904: [OpenMP][AMDGCN] Initial math headers support

2021-07-30 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield accepted this revision. JonChesterfield added inline comments. Comment at: clang/lib/Headers/openmp_wrappers/cmath:113 +__DEVICE__ float lgamma(float __x) { return ::lgammaf(__x); } +//__DEVICE__ long long int llrint(float __x) { return ::llrintf(__x); } +//__DEVI

[PATCH] D104904: [OpenMP][AMDGCN] Initial math headers support

2021-07-30 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. This patch didn't change complex so I'm struggling to make sense of the backtrace. Something in libstdc++ needs memory but doesn't include it? Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D104904/new/ https://revie

[PATCH] D104904: [OpenMP][AMDGCN] Initial math headers support

2021-07-30 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. In D104904#2916943 , @ye-luo wrote: > Unforuantely I hit error on my ubuntu 20.04 system. > > #include > int main() > { } Given that ^ in fail.cpp and an invocation on a machine that doesn't have cuda or an nvidia

[PATCH] D104904: [OpenMP][AMDGCN] Initial math headers support

2021-08-02 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added inline comments. Comment at: clang/lib/Headers/openmp_wrappers/math.h:53 +#pragma omp begin declare variant match(device = {arch(amdgcn)}) + +#define __OPENMP_AMDGCN__ That's quite worrying. Declare variant match amdgcn is supposed to have t

[PATCH] D104904: [OpenMP][AMDGCN] Initial math headers support

2021-08-02 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield accepted this revision. JonChesterfield added a comment. Latest patch can only misfire on amdgpu so lets go with it and try to work out variant vs ifdef subsequently. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D104904/new/ https:

[PATCH] D104904: [OpenMP][AMDGCN] Initial math headers support

2021-08-04 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added inline comments. Comment at: clang/lib/Headers/openmp_wrappers/__clang_openmp_device_functions.h:38 + +#pragma omp begin declare variant match(device = {arch(amdgcn)}) + Given that declare variant didn't work elsewhere, it probably doesn't

[PATCH] D107468: [AMDGPU][OpenMP] Wrap amdgcn declare variant inside ifdef

2021-08-04 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield accepted this revision. JonChesterfield added a comment. This revision is now accepted and ready to land. Yes, ship it. We'll look at fixing variant and pull the ifdef back out once it's working. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.

[PATCH] D107492: [clang] Replace asm with __asm__ in cuda header

2021-08-04 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield created this revision. JonChesterfield added reviewers: tra, emankov, gtbercea, jdoerfert. Herald added a subscriber: yaxunl. JonChesterfield requested review of this revision. Herald added subscribers: cfe-commits, sstefan1. Herald added a project: clang. Asm is a gnu extension fo

[PATCH] D107492: [clang] Replace asm with __asm__ in cuda header

2021-08-04 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. Implemented by `F=__clang_cuda_device_functions.h ; sed -i 's$asm$__asm__$g' $F && clang-format -i $F` Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D107492/new/ https://reviews.llvm.org/D107492 ___

[PATCH] D107492: [clang] Replace asm with __asm__ in cuda header

2021-08-04 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. Re: `__volatile__`. I've done some background reading on it but can't find the original motivation. `__asm__` exists because the asm word is in the application namespace and asm is an extension. Volatile has been with us since C89 though, and is also accepted un

[PATCH] D107492: [clang] Replace asm with __asm__ in cuda header

2021-08-04 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield updated this revision to Diff 364274. JonChesterfield added a comment. - whitespace change Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D107492/new/ https://reviews.llvm.org/D107492 Files: clang/lib/Headers/__clang_cuda_device_fu

[PATCH] D107492: [clang] Replace asm with __asm__ in cuda header

2021-08-04 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added inline comments. Comment at: clang/lib/Headers/__clang_cuda_device_functions.h:1043 } -#else // CUDA_VERSION >= 9020 +#else // CUDA_VERSION >= 9020 // CUDA no longer provides inline assembly (or bitcode) implementation of these JonChester

[PATCH] D107492: [clang] Replace asm with __asm__ in cuda header

2021-08-04 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. In D107492#2926910 , @tra wrote: > In D107492#2926871 , > @JonChesterfield wrote: > >> Therefore I'd like to leave it as `__asm__ volatile`. > > Being the one who introduced incon

[PATCH] D107492: [clang] Replace asm with __asm__ in cuda header

2021-08-04 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield updated this revision to Diff 364277. JonChesterfield added a comment. - consistent __volatile__ Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D107492/new/ https://reviews.llvm.org/D107492 Files: clang/lib/Headers/__clang_cuda_dev

[PATCH] D107492: [clang] Replace asm with __asm__ in cuda header

2021-08-05 Thread Jon Chesterfield via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds. This revision was automatically updated to reflect the committed changes. Closed by commit rG509854b69cea: [clang] Replace asm with __asm__ in cuda header (authored by JonChesterfield). Repository: rG LLVM Github Monorepo CHANGES SINCE LA

[PATCH] D105981: [AMDGPU][OpenMP] Support linking of math libraries

2021-08-08 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. It's been pointed out to me that -lm is a linker flag so it's weird to require it at compile time. I haven't thought of a good fix for that yet. We don't need to splice in ocml for each compilation unit, so can move the test+splice into the link phase, except th

[PATCH] D106960: [OffloadArch] Library to query properties of current offload archicture

2021-08-11 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. T Comment at: llvm/lib/OffloadArch/amdgpu/vendor_specific_capabilities.cpp:25 +// +#include "hsa-subset.h" +#include yaxunl wrote: > It would be much simpler to use HIP API to get device name and capabilities > e.g. gfx906:xna

[PATCH] D107952: [AMDGPU][OpenMP] Use llvm-link to link ocml libraries

2021-08-12 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. This is semantically different to using mlink-builtin-bitcode with clang. The former internalizes all the symbols after they are introduced, using llvm-link does not. I'm not immediately sure whether that is a problem - the symbols in rocm device libs should all

[PATCH] D107952: [AMDGPU][OpenMP] Use llvm-link to link ocml libraries

2021-08-12 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield accepted this revision. JonChesterfield added a comment. This revision is now accepted and ready to land. LG, thanks! Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D107952/new/ https://reviews.llvm.org/D107952 __

[PATCH] D107971: [openmp] Annotate tmp variables with omp_thread_mem_alloc

2021-08-12 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield created this revision. JonChesterfield added reviewers: arsenm, jdoerfert. Herald added subscribers: guansong, yaxunl. JonChesterfield requested review of this revision. Herald added subscribers: cfe-commits, sstefan1, wdng. Herald added a project: clang. Fixes miscompile of calls

[PATCH] D107971: [openmp] Annotate tmp variables with omp_thread_mem_alloc

2021-08-12 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added reviewers: ye-luo, yaxunl. JonChesterfield added a subscriber: ye-luo. JonChesterfield added a comment. @ye-luo this fixes the modf and sincos test cases from https://github.com/ye-luo/openmp-target Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https:/

[PATCH] D107971: [openmp] Annotate tmp variables with omp_thread_mem_alloc

2021-08-12 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. In D107971#2941666 , @jdoerfert wrote: > Assuming this causes us to generate an `alloc as(5)` for `__tmp`, LG Yes, with the running example and this patch we get the perfect: %__tmp = alloca double, align 8, addrspace(

[PATCH] D107971: [openmp] Annotate tmp variables with omp_thread_mem_alloc

2021-08-12 Thread Jon Chesterfield via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds. This revision was automatically updated to reflect the committed changes. Closed by commit rGb6113548c921: [openmp] Annotate tmp variables with omp_thread_mem_alloc (authored by JonChesterfield). Repository: rG LLVM Github Monorepo CHANGE

[PATCH] D107971: [openmp] Annotate tmp variables with omp_thread_mem_alloc

2021-08-12 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. Failed a CI job that builds an openmp test in an environment without omp.h, will revert. Thoughts on fixing? Putting the omp allocator definition in this header is likely to collide with a real omp.h. Fairly clean fix is to move the definitions into the deviceR

[PATCH] D108291: [clang-nvlink-wrapper] Wrapper around nvlink for archive files

2021-08-18 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. Tests. The usual bug in this area is that an archive can contain multiple files with the same name which clobber each other if extracted to the same directory. It looks to me like this implementation will fail that test. Also, motivation? Seems this can be worke

[PATCH] D93525: [clang-offload-bundler] Add unbundling of archives containing bundled object files into device specific archives

2021-08-18 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. This is mentioned as broken in the referenced patch and landed with @t-tye still marked as requires changes. Revert warranted? Testing looks very sparse for something handling archives. Presumably it has the same set of bugs as Repository: rG LLVM Github Mon

[PATCH] D108291: [clang-nvlink-wrapper] Wrapper around nvlink for archive files

2021-08-18 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. Discussed at the multicompany meeting today. Consensus reached is that the whole-archive/no-whole-archive distinction is unimportant - we can ship a toolchain that has whole-archive semantics and later change the default when doing more work in the linker. It's

[PATCH] D99551: [clang-offload-wrapper] Add standard notes for ELF offload images

2021-08-18 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. Nvptx broken here too, amdgpu is fine. I'm guessing one of the cuda tools does some overly aggressive input validation that we're running afoul of. There was a discussion about this on the call today - plan was to put it behind a disabled boolean argument while

[PATCH] D108303: [clang][openmp] Disable embedded elf notes

2021-08-18 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield created this revision. JonChesterfield added reviewers: vzakhari, ABataev, grokos, sdmitriev, jdoerfert, ronlieb. Herald added subscribers: guansong, yaxunl. JonChesterfield requested review of this revision. Herald added subscribers: cfe-commits, sstefan1. Herald added a project:

[PATCH] D99551: [clang-offload-wrapper] Add standard notes for ELF offload images

2021-08-18 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. It's a hack, but D108303 will unblock nvptx offloading. Alternative to reverting. Suggest we go with that then revisit in a couple of weeks. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.ll

[PATCH] D108246: [clang-offload-wrapper] Disabled ELF offload notes embedding by default.

2021-08-18 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield accepted this revision. JonChesterfield added a comment. Amusingly similar to D108303 . LGTM. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D108246/new/ https://reviews.llvm.org/D108246 ___ cfe

[PATCH] D108246: [clang-offload-wrapper] Disabled ELF offload notes embedding by default.

2021-08-18 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added inline comments. Comment at: clang/tools/clang-offload-wrapper/ClangOffloadWrapper.cpp:79 +static cl::opt AddOpenMPOffloadNotes( +"add-omp-offload-notes", I'd have probably gone with an explicit false here but it doesn't make much dif

[PATCH] D108303: [clang][openmp] Disable embedded elf notes

2021-08-18 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield abandoned this revision. JonChesterfield added a comment. I like D108246 more. None of the offloading tests updated in D108246 failed with the above patch, perhaps they're not run by `make check-openmp` Repos

<    1   2   3   4   5   6   7   8   9   >