[PATCH] D116542: [OpenMP] Add a flag for embedding a file into the module

2022-01-31 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added inline comments. Comment at: clang/lib/CodeGen/BackendUtil.cpp:1760 + for (StringRef OffloadObject : CGOpts.OffloadObjects) { +auto FilenameAndSection = OffloadObject.split(','); +llvm::ErrorOr> ObjectOrErr = JonChesterfield wrote:

[PATCH] D116543: [OpenMP] Embed device files into the host IR

2022-01-31 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. Description and test have slightly diverged from implementation - filename is appended to disambiguate, but the filecheck regex only looks at the prefix and the name described in the commit message is missing the filename Repository: rG LLVM Github Monorepo

[PATCH] D116542: [OpenMP] Add a flag for embedding a file into the module

2022-01-31 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield accepted this revision. JonChesterfield added a comment. This revision is now accepted and ready to land. Thanks for sticking with me on this! LG Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D116542/new/ https://reviews.llvm.org/D11

[PATCH] D116544: [Clang] Introduce Clang Linker Wrapper Tool

2022-01-31 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield accepted this revision. JonChesterfield added a comment. This revision is now accepted and ready to land. I think this is reasonable. It's a lot of boilerplate to put the interception shim in the right place, but we do want a specific point to inject offloading functionality and

[PATCH] D118493: Set rpath on openmp executables

2022-01-31 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added inline comments. Comment at: clang/test/OpenMP/implicit_rpath.c:28 +// CHECK-COMPOSABLE: ({{R|RUN}}PATH) Library {{r|run}}path: [early:late:{{.*}}llvm/lib] + +int main() {} This ^ probably has path separator issues on windows, will try to f

[PATCH] D118493: Set rpath on openmp executables

2022-01-31 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. Yep. I'm not totally confident what windows builds will embed in the dynamic table - might be windows style \ or it might be linux style /. Or it might be \\, which can be matched as in D109057 . I think I'm going with the very

[PATCH] D118493: Set rpath on openmp executables

2022-01-31 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield updated this revision to Diff 404536. JonChesterfield added a comment. - More paranoid regex - not sure what windows uses as the path separator in the dynamic table Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D118493/new/ https://

[PATCH] D118493: Set rpath on openmp executables

2022-01-31 Thread Jon Chesterfield via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds. This revision was automatically updated to reflect the committed changes. Closed by commit rGa80d5c34e4b9: Set rpath on openmp executables (authored by JonChesterfield). Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION http

[PATCH] D118493: Set rpath on openmp executables

2022-01-31 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. Apologies for the noise, turns out 'llvm/lib' and 'llvm/lib64' are not the only two options. Made it more permissive and reapplied. Also looks like ppc64le defaults to rpath as opposed to runpath, so that's worth knowing. Repository: rG LLVM Github Monorepo

[PATCH] D118493: Set rpath on openmp executables

2022-01-31 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a reviewer: kparzysz. JonChesterfield added a subscriber: kparzysz. JonChesterfield added a comment. This test doesn't work on hexagon because hexagon doesn't have a linker (??), emailed @kparzysz asking how to disable tests on hexagon as I can't find a likely looking UNSUP

[PATCH] D118493: Set rpath on openmp executables

2022-01-31 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. In D118493#3284617 , @thakis wrote: > looks like the mac linker doesn't like this test either: > http://45.33.8.238/macm1/26834/step_7.txt Error message is ld: file too small (length=8) file '/Users/thakis/src/llvm-proj

[PATCH] D118493: Set rpath on openmp executables

2022-01-31 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. In D118493#3284781 , @thakis wrote: > In D118493#3284663 , > @JonChesterfield wrote: > >> In D118493#3284617 , @thakis wrote: >> >>> look

[PATCH] D120353: [OpenMP] Add option to make offloading mandatory

2022-02-23 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. Thanks! Seems a good thing to add to the offloading test runner, preferably in a separate change to avoid reverting this in case of unforeseen problems Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D120353/new/ http

[PATCH] D120272: [CUDA] Add driver support for compiling CUDA with the new driver

2022-03-04 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added inline comments. Comment at: clang/lib/Driver/Driver.cpp:4132-4134 + Archs.insert(CudaArchToString(CudaArch::SM_35)); +else if (Kind == Action::OFK_HIP) + Archs.insert(CudaArchToString(CudaArch::GFX803)); tra wrote: > If we do

[PATCH] D101077: [clang][nfc] Split getOrCheckAMDGPUCodeObjectVersion

2021-04-22 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield created this revision. JonChesterfield added reviewers: yaxunl, t-tye, kzhuravl, ronlieb, b-sumner, pdhaliwal. Herald added subscribers: kerbowa, tpr, dstuttard, nhaehnle, jvesely. JonChesterfield requested review of this revision. Herald added subscribers: cfe-commits, wdng. Heral

[PATCH] D101077: [clang][nfc] Split getOrCheckAMDGPUCodeObjectVersion

2021-04-22 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added inline comments. Comment at: clang/lib/Driver/ToolChains/CommonArgs.cpp:1597 + +if (CodeObjArg->getOption().getID() == +options::OPT_mcode_object_version_EQ) { This would probably be more useful if it diagnosed an invalid argumen

[PATCH] D98746: [clang][amdgpu] Use implicit code object default

2021-04-22 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. Posted D101077 which is a refactor. Expect to shortly post a replacement for this which implements @yaxunl's suggestion above, i.e. pass the argument to llc whenever the user specified one to clang Repository: rG LLVM Github

[PATCH] D99949: [AMDGPU][OpenMP] Add amdgpu-arch tool to list AMD GPUs installed

2021-04-22 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. In D99949#2709297 , @davezarzycki wrote: > This change broke multi-stage builds (even if AMDGPU is disabled as a > target). The problem is that clang/tools/amdgpu-arch/AMDGPUArch.cpp can't > find hsa.h. Is there a quick

[PATCH] D101095: [clang][amdgpu] Use implicit code object version

2021-04-22 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield created this revision. JonChesterfield added reviewers: yaxunl, t-tye, kzhuravl, ronlieb, b-sumner, pdhaliwal. Herald added subscribers: pengfei, tpr, dstuttard. JonChesterfield requested review of this revision. Herald added subscribers: cfe-commits, sstefan1, wdng. Herald added a

[PATCH] D101095: [clang][amdgpu] Use implicit code object version

2021-04-22 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. Diff created relative to D101077 , not relative to main. Iiuc that will work out fine if the NFC lands first. Comment at: clang/test/Driver/hip-code-object-version.hip:55 // RUN: %s 2>&1 | FileCheck -check-p

[PATCH] D98746: [clang][amdgpu] Use implicit code object default

2021-04-22 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. Revision at D101095 implements the suggestion above. The argument is passed to llc whenever an argument was passed to clang. Front end tests / device runtime libraries that are happy with the default can then build without the

[PATCH] D99949: [AMDGPU][OpenMP] Add amdgpu-arch tool to list AMD GPUs installed

2021-04-22 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. That's interesting. The various permutations I have on disk are also all path/include/hsa/hsa.h. The existing in tree use of hsa.h is the amdgpu plugin, which uses `#include "hsa.h"` and `#include `, which seems unlikely to be correct. I'm going to patch this o

[PATCH] D99949: [AMDGPU][OpenMP] Add amdgpu-arch tool to list AMD GPUs installed

2021-04-22 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. It's not obvious to me why any of those stages would get a different result for the search for rocr. Do you do things with chroot/jails to ensure isolation for some of them? Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm

[PATCH] D99949: [AMDGPU][OpenMP] Add amdgpu-arch tool to list AMD GPUs installed

2021-04-22 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. When find_package can find rocr, but it sets up include paths that don't work with it, I think that constitutes a broken rocr install. At least, one that is difficult to use from cmake. The packages are meant to deal with that sort of thing which suggests the /

[PATCH] D101077: [clang][nfc] Split getOrCheckAMDGPUCodeObjectVersion

2021-04-22 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added inline comments. Comment at: clang/lib/Driver/ToolChains/CommonArgs.cpp:1626 } else { - auto Remnant = - StringRef(CodeObjArg->getValue()).getAsInteger(0, CodeObjVer); - if (Diagnose && - (Remnant || CodeObjVer < MinCodeObjVe

[PATCH] D99949: [AMDGPU][OpenMP] Add amdgpu-arch tool to list AMD GPUs installed

2021-04-22 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. In D99949#2710273 , @davezarzycki wrote: > I removed all HSA/ROCM via the package manager and everything is fine now. > Have you considered using __has_include() instead of CMake? It seems to be > supported by all contem

[PATCH] D101117: [clang][nfc] Split getOrCheckAMDGPUCodeObjectVersion

2021-04-22 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield created this revision. JonChesterfield added a reviewer: yaxunl. Herald added subscribers: kerbowa, t-tye, tpr, dstuttard, nhaehnle, jvesely, kzhuravl. JonChesterfield requested review of this revision. Herald added subscribers: cfe-commits, wdng. Herald added a project: clang. [c

[PATCH] D101077: [clang][nfc] Split getOrCheckAMDGPUCodeObjectVersion

2021-04-22 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added inline comments. Comment at: clang/lib/Driver/ToolChains/CommonArgs.cpp:1626 } else { - auto Remnant = - StringRef(CodeObjArg->getValue()).getAsInteger(0, CodeObjVer); - if (Diagnose && - (Remnant || CodeObjVer < MinCodeObjVe

[PATCH] D101117: [clang][nfc] Split getOrCheckAMDGPUCodeObjectVersion

2021-04-22 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. I'm not very pleased with this. There's too much state to move around when representing the failure and the control flow is probably harder to read than it was in the original. May take another pass at it. Repository: rG LLVM Github Monorepo CHANGES SINCE LA

[PATCH] D101117: [clang][nfc] Split getOrCheckAMDGPUCodeObjectVersion

2021-04-22 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield abandoned this revision. JonChesterfield added a comment. Abandoning this. Unacceptably ugly, will revise D101077 in a slightly different direction. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D10

[PATCH] D101117: [clang][nfc] Split getOrCheckAMDGPUCodeObjectVersion

2021-04-22 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield updated this revision to Diff 339806. JonChesterfield added a comment. This revision is now accepted and ready to land. - factor out last arg wins Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D101117/new/ https://reviews.llvm.org/D1

[PATCH] D101077: [clang][nfc] Split getOrCheckAMDGPUCodeObjectVersion

2021-04-22 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield updated this revision to Diff 339807. JonChesterfield added a comment. - factor out last arg wins Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D101077/new/ https://reviews.llvm.org/D101077 Files: clang/lib/Driver/ToolChains/AMDGP

[PATCH] D101077: [clang][nfc] Split getOrCheckAMDGPUCodeObjectVersion

2021-04-22 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield updated this revision to Diff 339810. JonChesterfield added a comment. - whitespace Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D101077/new/ https://reviews.llvm.org/D101077 Files: clang/lib/Driver/ToolChains/AMDGPU.cpp clang/

[PATCH] D101077: [clang][nfc] Split getOrCheckAMDGPUCodeObjectVersion

2021-04-22 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added inline comments. Comment at: clang/lib/Driver/ToolChains/CommonArgs.cpp:1576 -unsigned tools::getOrCheckAMDGPUCodeObjectVersion( -const Driver &D, const llvm::opt::ArgList &Args, bool Diagnose) { +static llvm::opt::Arg * +getAMDGPUCodeObjectArgument(co

[PATCH] D101077: [clang][nfc] Split getOrCheckAMDGPUCodeObjectVersion

2021-04-22 Thread Jon Chesterfield via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds. This revision was automatically updated to reflect the committed changes. Closed by commit rG2cdb9873b210: [clang][nfc] Split getOrCheckAMDGPUCodeObjectVersion (authored by JonChesterfield). Repository: rG LLVM Github Monorepo CHANGES SIN

[PATCH] D101095: [clang][amdgpu] Use implicit code object version

2021-04-22 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield updated this revision to Diff 339820. JonChesterfield added a comment. - rebase Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D101095/new/ https://reviews.llvm.org/D101095 Files: clang/lib/Driver/ToolChains/Clang.cpp clang/lib/D

[PATCH] D99949: [AMDGPU][OpenMP] Add amdgpu-arch tool to list AMD GPUs installed

2021-04-22 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. Live again as of 15be0c41d2e5 Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D99949/new/ https://reviews.llvm.org/D99949

[PATCH] D101095: [clang][amdgpu] Use implicit code object version

2021-04-22 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield updated this revision to Diff 339824. JonChesterfield added a comment. - simplify Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D101095/new/ https://reviews.llvm.org/D101095 Files: clang/lib/Driver/ToolChains/Clang.cpp clang/lib

[PATCH] D99949: [AMDGPU][OpenMP] Add amdgpu-arch tool to list AMD GPUs installed

2021-04-23 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. None of those include paths look like the result of find_package locating a hsa so I'd guess this some quirk of cmake. It seems it is setting hsa-runtime64_FOUND but not doing the include path setup to match. I don't know how to debug that. @pdhaliwal? Reposit

[PATCH] D100929: [Clang] Allow the combination of loader_uninitialized and address spaces

2021-04-23 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. Diff now shows unrelated changed (// C++ for OpenCL v1.0 s2.4:), maybe needs to be rebased? Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D100929/new/ https://reviews.llvm.org/D100929 __

[PATCH] D98746: [clang][amdgpu] Use implicit code object default

2021-04-23 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield abandoned this revision. JonChesterfield added a comment. Abandoning in favour of the more conservative D101095 Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D98746/new/ https://reviews.llvm.org/D9

[PATCH] D101095: [clang][amdgpu] Use implicit code object version

2021-04-23 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. In D101095#2713652 , @yaxunl wrote: > In general we think we should move toward the direction of clang being > agnostic about code object version and let backend decide which code object > version to use This sounds the

[PATCH] D101095: [clang][amdgpu] Use implicit code object version

2021-04-23 Thread Jon Chesterfield via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rGfc88d927e30d: [clang][amdgpu] Use implicit code object version (authored by JonChesterfield). Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D101095/new/ http

[PATCH] D101595: [Clang][OpenMP] Allow unified_shared_memory for Pascal-generation GPUs.

2021-04-30 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield accepted this revision. JonChesterfield added a comment. This revision is now accepted and ready to land. Nice. Thanks for including the references. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D101595/new/ https://reviews.llvm.org/

[PATCH] D69498: IR: Invert convergent attribute handling

2021-04-30 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. I wonder if it is necessary for the exec mask to be implicit state, managed by a convergent/divergent abstraction. We could reify it as an argument passed to kernels (where all bits are set), and adjusted by phi nodes at entry to basic blocks. Various intrinsics

[PATCH] D100621: [OpenMP] Ensure the DefaultMapperId has a location

2021-05-03 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield accepted this revision. JonChesterfield added a comment. This revision is now accepted and ready to land. Not ideal to have unreproducible failures but the fix looks safe regardless Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D10062

[PATCH] D100620: [OpenMP] Make sure classes work on the device as they do on the host

2021-05-03 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield accepted this revision. JonChesterfield added a comment. This revision is now accepted and ready to land. cstdlib may not exist, probably in the same situations as new not existing, but that can probably be left until the problem arises. I think this looks reasonable, and it'll u

[PATCH] D100620: [OpenMP] Make sure classes work on the device as they do on the host

2021-05-03 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. We might want a test that includes along with this header, to check they don't conflict with one another Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D100620/new/ https://reviews.llvm.org/D100620

[PATCH] D100620: [OpenMP] Make sure classes work on the device as they do on the host

2021-05-03 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. In D100620#2733708 , @jdoerfert wrote: > For that I need to replicate the potentially conflicting `` stuff in the > test header mockups. I'll try it later on one system though. Yep, may not be able to check in something

[PATCH] D101901: [AMDGPU][OpenMP] Fix clang driver crash when provided -c

2021-05-05 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. @jdoerfert This drops the logic for save-temps that I remember being contentious. Doing a no-op backend pass instead of skipping over it is probably more robust. Comment at: clang/lib/Driver/Driver.cpp:4600 + + if (const AssembleJobAction

[PATCH] D99949: [AMDGPU][OpenMP] Add amdgpu-arch tool to list AMD GPUs installed

2021-05-05 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. We may need more robust error handling around this. I'm seeing test failures when building clang, `clang-13: error: Cannot determine AMDGPU architecture: $HOME/llvm-build/llvm/./bin/amdgpu-arch: Execute failed: No such file or directory. Consider passing it via

[PATCH] D101926: [amdgpu-arch] Fix rpath to run from build dir

2021-05-05 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield created this revision. JonChesterfield added reviewers: gregrodgers, pdhaliwal, tianshilei1992, jdoerfert. Herald added subscribers: kerbowa, t-tye, tpr, dstuttard, yaxunl, mgorny, nhaehnle, jvesely, kzhuravl. JonChesterfield requested review of this revision. Herald added subscri

[PATCH] D101935: [clang] Search runtimes build tree for openmp runtime

2021-05-05 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield created this revision. JonChesterfield added reviewers: jdoerfert, pdhaliwal, ronlieb, grokos, tianshilei1992. Herald added subscribers: guansong, yaxunl. JonChesterfield requested review of this revision. Herald added subscribers: cfe-commits, sstefan1. Herald added a project: cla

[PATCH] D101935: [clang] Search runtimes build tree for openmp runtime

2021-05-05 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. I think the lit test suite is intended to set environment variables such that this is not necessary. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D101935/new/ https://reviews.llvm.org/D101935 _

[PATCH] D101960: [openmp] Drop requirement on library path environment variables

2021-05-05 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield created this revision. JonChesterfield added reviewers: jdoerfert, grokos, ABataev, gregrodgers, RaviNarayanaswamy, pdhaliwal, hfinkel, tlwilmar, AndreyChurbanov, jlpeyton, protze.joachim, ye-luo, tianshilei1992. Herald added subscribers: kerbowa, guansong, yaxunl, mgorny, nhaehnl

[PATCH] D101960: [openmp] Drop requirement on library path environment variables

2021-05-05 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added inline comments. Comment at: clang/lib/Driver/ToolChains/CommonArgs.cpp:648 +void tools::addOpenMPRuntimeSpecificRPath(const ToolChain &TC, + const ArgList &Args, Similar to other functions in this

[PATCH] D101960: [openmp] Drop requirement on library path environment variables

2021-05-05 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added inline comments. Comment at: openmp/libomptarget/src/rtl.cpp:76 + std::string full_plugin_name; + void *handle = dlopen("libomptarget.so", RTLD_NOW); JonChesterfield wrote: > This logic is cut from D73657 without addressing any of the re

[PATCH] D101926: [amdgpu-arch] Fix rpath to run from build dir

2021-05-06 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. It does indeed fix that comment from D99949 Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D101926/new/ https://reviews.llvm.org/D101926 _

[PATCH] D101926: [amdgpu-arch] Fix rpath to run from build dir

2021-05-06 Thread Jon Chesterfield via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds. This revision was automatically updated to reflect the committed changes. Closed by commit rGb24e9f82b71f: [amdgpu-arch] Fix rpath to run from build dir (authored by JonChesterfield). Repository: rG LLVM Github Monorepo CHANGES SINCE LAST

[PATCH] D101976: [OpenMP] Unified entry point for SPMD & generic kernels in the device RTL

2021-05-06 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield requested changes to this revision. JonChesterfield added a comment. This revision now requires changes to proceed. What are the required semantics of the barrier operations? Amdgcn builds them on shared memory, so probably needs a change to the corresponding target_impl to match

[PATCH] D101960: [openmp] Drop requirement on library path environment variables

2021-05-06 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. Yes. Libomp and libomptarget are in the same directory, so the rpath/runpath change catches both of them. Libomptarget then needs to find the plugins to load, either through library path or something equivalent to the above. Comment at: clang/

[PATCH] D101976: [OpenMP] Unified entry point for SPMD & generic kernels in the device RTL

2021-05-06 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. In D101976#2742188 , @jdoerfert wrote: > In D101976#2742166 , > @JonChesterfield wrote: > >> What are the required semantics of the barrier operations? Amdgcn builds >> them on s

[PATCH] D99949: [AMDGPU][OpenMP] Add amdgpu-arch tool to list AMD GPUs installed

2021-05-06 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. There is a potential hazard here for parallel compilation and to a lesser extent testing. (I have just learned that) kfd imposes a process limit which is low enough that we may see hsa_init fail under multiple processes. @jdoerfert you talked me out of embedding

[PATCH] D101976: [OpenMP] Unified entry point for SPMD & generic kernels in the device RTL

2021-05-06 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. In D101976#2742919 , @jdoerfert wrote: > So what do you wnat me to change for this patch now? Equivalent change to amdgpu target_impl to the nvptx target_impl, which looks like syncthreads should call a new barrier. Iiu

[PATCH] D101960: [openmp] Drop requirement on library path environment variables

2021-05-06 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield updated this revision to Diff 343557. JonChesterfield added a comment. - rework logic for finding libomptarget.so Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D101960/new/ https://reviews.llvm.org/D101960 Files: clang/lib/Driver/

[PATCH] D101960: [openmp] Drop requirement on library path environment variables

2021-05-06 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added inline comments. Comment at: openmp/libomptarget/src/rtl.cpp:76 + std::string full_plugin_name; + void *handle = dlopen("libomptarget.so", RTLD_NOW); JonChesterfield wrote: > JonChesterfield wrote: > > This logic is cut from D73657 witho

[PATCH] D102065: [AMDGPU][OpenMP] Emit textual IR for -emit-llvm -S

2021-05-07 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added inline comments. Comment at: clang/lib/Driver/ToolChains/Clang.cpp:4397 JA.getType() == types::TY_LTO_BC) { - CmdArgs.push_back("-emit-llvm-bc"); + // Emit textual llvm IR for AMDGPU offloading for -emit-llvm -S + if (Triple.is

[PATCH] D102065: [AMDGPU][OpenMP] Emit textual IR for -emit-llvm -S

2021-05-07 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield accepted this revision. JonChesterfield added a comment. This revision is now accepted and ready to land. Ah, yes. Thanks! Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D102065/new/ https://reviews.llvm.org/D102065 _

[PATCH] D102067: [amdgpu-arch] Guard hsa.h with __has_include

2021-05-07 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. Logic doesn't look quite right to me. If the compiler supports has_include, but neither of those headers exist, we fall through to main() which won't compile. How about: #if defined(__has_include) #if __has_include("hsa.h") #define HSA_FOUND 1 // nam

[PATCH] D102067: [amdgpu-arch] Guard hsa.h with __has_include

2021-05-07 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield accepted this revision. JonChesterfield added a comment. This revision is now accepted and ready to land. LGTM, thanks. I think this leaves us totally sure the clang build will succeed with a best effort chance of having a working amdgpu ident tool afterwards. Repository: rG L

[PATCH] D102067: [amdgpu-arch] Guard hsa.h with __has_include

2021-05-07 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. A thought - we need to keep return 0 for success, but could return a different integer for each failure mode. That would be useful when guessing why it failed. Orthogonal to this patch. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://

[PATCH] D101960: [openmp] Drop requirement on library path environment variables

2021-05-07 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added inline comments. Comment at: clang/lib/Driver/ToolChains/CommonArgs.cpp:648 +void tools::addOpenMPRuntimeSpecificRPath(const ToolChain &TC, + const ArgList &Args, protze.joachim wrote: > JonChesterf

[PATCH] D101935: [clang] Search runtimes build tree for openmp runtime

2021-05-11 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. More compelling than the testing case is that openmp is presently unusable without setting ld_library_path. This is part of fixing that. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D101935/new/ https://reviews.llv

[PATCH] D101935: [clang] Search runtimes build tree for openmp runtime

2021-05-11 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. I'm not so sure about that. With the other stuff in flight, it would let us build openmp apps from the clang build directory. That's useful for quicker build/debug cycles, as well as for lit tests that can be run by copying into a shell. At zero runtime cost whe

[PATCH] D96248: [OpenMP][AMDGPU] Add support for linking libomptarget bitcode

2022-01-08 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added inline comments. Comment at: clang/include/clang/Driver/Options.td:948 def libomptarget_nvptx_bc_path_EQ : Joined<["--"], "libomptarget-nvptx-bc-path=">, Group, HelpText<"Path to libomptarget-nvptx bitcode library">; def dD : Flag<["-"], "dD">, Group,

[PATCH] D116906: [OpenMP][AMDGPU] Optimize the linked in math libraries

2022-01-10 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield accepted this revision. JonChesterfield added a comment. This revision is now accepted and ready to land. Using the optimiser to hide errors in how we're pulling in libm is clearly not right, but it leaves us less obviously broken than we are at present. Repository: rG LLVM Gi

[PATCH] D117048: [OpenMP] Link the bitcode library late for device LTO

2022-01-11 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. Do we have a hook where we could link it at, uh, link time on nvtptx without LTO? Amdgpu had a llvm-link already there. Always bothered me a little that we link a copy per TU then hope the optimiser sorts it out nicely. Repository: rG LLVM Github Monorepo CH

[PATCH] D117048: [OpenMP] Link the bitcode library late for device LTO

2022-01-11 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. Oh, right. Nvptx is still lowering to machine code per-tu. We don't want the devicertl linking as machine code, so it has to go in per-tu. Or we could link nvptx as IR instead, and send that plus amdgpu down the same code path. Probably makes applications faster

[PATCH] D117048: [OpenMP] Link the bitcode library late for device LTO

2022-01-11 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added inline comments. Comment at: clang/lib/Driver/ToolChains/AMDGPUOpenMP.cpp:268 + // Link the bitcode library late if we're using device LTO. + if (getDriver().isUsingLTO(/* IsOffload */ true)) +return; Can/should probably always link it

[PATCH] D97446: Change some addUsedGlobal to addUsedOrCompilerUsedGlobal

2022-01-13 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield reopened this revision. JonChesterfield added a comment. This revision is now accepted and ready to land. I don't know the context of this patch but changing attribute((used)) to put things under compiler.used is definitely a breaking change. Please introduce a new attribute if n

[PATCH] D97446: Change some addUsedGlobal to addUsedOrCompilerUsedGlobal

2022-01-13 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. It looks deeply wrong to be marking global as either llvm.used or llvm.compiler.used based on the output file format. It should be based on the (purpose of) the global. In D97446#3241142 , @rnk wrote: > This change was i

[PATCH] D97446: Change some addUsedGlobal to addUsedOrCompilerUsedGlobal

2022-01-13 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. > Modulo optimizer bugs, __attribute__((used)) hasn't changed semantics. > If your downstream project does not handle llvm.compiler.used, maybe handle > it now :) There seems to be some confusion here. The 'downstream project' is openmp, which has worked around

[PATCH] D97446: Change some addUsedGlobal to addUsedOrCompilerUsedGlobal

2022-01-13 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. In D97446#3241735 , @MaskRay wrote: > For your comment it appears an issue in the internalisation pass. It is > orthogonal to this patch. > Do you have a reduced example with reproduce instructions for the issue? I > know

[PATCH] D115157: [openmp] Default to new rtl for amdgpu

2021-12-06 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield created this revision. JonChesterfield added reviewers: ronlieb, jhuber6, jdoerfert, pdhaliwal. Herald added subscribers: dang, kerbowa, guansong, t-tye, tpr, dstuttard, yaxunl, nhaehnle, jvesely, kzhuravl. JonChesterfield requested review of this revision. Herald added subscribers

[PATCH] D115157: [openmp] Default to new rtl for amdgpu

2021-12-06 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. It's a bit hard to tell. Definitely might work. I'm hoping someone will feed this into the internal CI infra. @ronlieb any objection to making this change on trunk and reverting it internally if things break there? Repository: rG LLVM Github Monorepo CHANGES

[PATCH] D115157: [openmp] Default to new rtl for amdgpu

2021-12-06 Thread Jon Chesterfield via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds. This revision was automatically updated to reflect the committed changes. Closed by commit rG6bb2a4f3e654: [openmp] Default to new rtl for amdgpu (authored by JonChesterfield). Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION

[PATCH] D115283: [AMDGPU] Set "amdgpu_hostcall" module flag if an AMDGPU function has calls to device lib functions that use hostcalls.

2021-12-07 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added inline comments. Comment at: clang/lib/CodeGen/TargetInfo.cpp:9425 +void AMDGPUTargetCodeGenInfo::checkFunctionCallABI(CodeGenModule &CGM, + SourceLocation CallLoc, Doesn't seem to check an

[PATCH] D115283: [AMDGPU] Set "amdgpu_hostcall" module flag if an AMDGPU function has calls to device lib functions that use hostcalls.

2021-12-08 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. Openmp defines a weak symbol in the hostcall_invoke function. Optimisation and deadstripping friendly, no compiler support necessary. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D115283/new/ https://reviews.llvm.o

[PATCH] D115283: [AMDGPU] Set "amdgpu_hostcall" module flag if an AMDGPU function has calls to device lib functions that use hostcalls.

2021-12-08 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. Not exactly that. The weak symbol isn't the function name, as that gets renamed or inlined. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D115283/new/ https://reviews.llvm.org/D115283 __

[PATCH] D115153: clang/AMDGPU: Don't set implicit arg attribute to default size

2021-12-09 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield accepted this revision. JonChesterfield added a comment. This revision is now accepted and ready to land. Great, thanks! CHANGES SINCE LAST ACTION https://reviews.llvm.org/D115153/new/ https://reviews.llvm.org/D115153 ___ cfe-commit

[PATCH] D115283: [AMDGPU] Set "amdgpu_hostcall" module flag if an AMDGPU function has calls to device lib functions that use hostcalls.

2021-12-09 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. In D115283#3182879 , @yaxunl wrote: > In D115283#3181128 , > @JonChesterfield wrote: > >> Not exactly that. The weak symbol isn't the function name, as that gets >> renamed or in

[PATCH] D115661: [clang][amdgpu] - Choose when to promote VarDecl to address space 4.

2021-12-13 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. Patch looks ok to me. This will fix the miscompile (we end up with a store to addrspace(4) at present) without upsetting whatever hacks rely on addrspace(4). @arsenm reasonable as a point fix? Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION h

[PATCH] D115661: [clang][amdgpu] - Choose when to promote VarDecl to address space 4.

2021-12-13 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield accepted this revision. JonChesterfield added a comment. This revision is now accepted and ready to land. That sounds good here. Infer addrspaces is pretty complicated, given marginal benefit for (4) this patch seems to hit the mark. Repository: rG LLVM Github Monorepo CHANGE

[PATCH] D115661: [clang][amdgpu] - Choose when to promote VarDecl to address space 4.

2021-12-14 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. Changing the has-constant-initialiser check to a more precise/robust one sounds reasonable to me. In D115661#3193157 , @arsenm wrote: > The backend also knows because the constant flag is set on the global > variable. A

[PATCH] D109885: [MLIR][[amdgpu-arch]][OpenMP] Remove direct dependency on /opt/rocm

2021-12-15 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. Diff was created without context unfortunately, scrolling through the cmake locally suggests variables have names of the form `LIBOMPTARGET_AMDGPU_FOO` though the namespacing isn't as consistent as it might be. Something like: if (not defined (LIBOMPTARGET_AM

[PATCH] D109885: [MLIR][[amdgpu-arch]][OpenMP] Remove direct dependency on /opt/rocm

2021-12-16 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. In D109885#3198232 , @arsenm wrote: > Dropping the default install location from the default search hint is > entirely unreasonable That seems to be the feature request, though I haven't gone through the jira web to wor

[PATCH] D109885: [MLIR][[amdgpu-arch]][OpenMP] Remove direct dependency on /opt/rocm

2021-12-16 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. In D109885#3198296 , @arsenm wrote: > This isn't a feature, it's the introduction of a bug. It would regress people depending on the implicit pickup of /opt/rocm, leaving them with a straightforward workaround of setting

[PATCH] D109885: [MLIR][[amdgpu-arch]][OpenMP] Remove direct dependency on /opt/rocm

2021-12-16 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added inline comments. Comment at: clang/tools/amdgpu-arch/CMakeLists.txt:13 + +find_package(hsa-runtime64 QUIET 1.2.0 HINTS ${CMAKE_INSTALL_PREFIX} PATH ${ROCM_PATH}) if (NOT ${hsa-runtime64_FOUND}) arsenm wrote: > I also think CMAKE_INSTALL_PR

[PATCH] D109885: [MLIR][[amdgpu-arch]][OpenMP] Remove direct dependency on /opt/rocm

2021-12-20 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. In D109885#3198375 , @arsenm wrote: > In D109885#3198340 , > @JonChesterfield wrote: > >> I'm very mistrusting of mixing a rocm toolchain with a trunk toolchain as >> they're bot

[PATCH] D109885: [MLIR][[amdgpu-arch]][OpenMP] Remove direct dependency on /opt/rocm

2021-12-21 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. In D109885#3203412 , @arsenm wrote: > We should have versioned libraries and well defined ABI breaks. The build > should know what runtime versions it requires and account when searching If we're playing that game, ROCm

<    2   3   4   5   6   7   8   9   >