[PATCH] D108317: [libomptarget][devicertl] Replace lanemask with uint64 at interface

2021-08-18 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield created this revision. JonChesterfield added reviewers: jdoerfert, dpalermo, grokos, tianshilei1992. Herald added subscribers: hiraditya, tpr. JonChesterfield requested review of this revision. Herald added subscribers: llvm-commits, openmp-commits, cfe-commits, sstefan1. Herald add

[PATCH] D108317: [libomptarget][devicertl] Replace lanemask with uint64 at interface

2021-08-18 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield updated this revision to Diff 367295. JonChesterfield added a comment. - void Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D108317/new/ https://reviews.llvm.org/D108317 Files: clang/test/OpenMP/nvptx_parallel_codegen.cpp llvm/i

[PATCH] D108317: [libomptarget][devicertl] Replace lanemask with uint64 at interface

2021-08-18 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. Yep, and optionally in the old one as well. The 32/64 conversion all shakes out OK in the end. Restricting the patch to the interface and immediate consequences makes it easier to review. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https:

[PATCH] D108317: [libomptarget][devicertl] Replace lanemask with uint64 at interface

2021-08-18 Thread Jon Chesterfield via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds. This revision was automatically updated to reflect the committed changes. Closed by commit rG21d91a8ef319: [libomptarget][devicertl] Replace lanemask with uint64 at interface (authored by JonChesterfield). Repository: rG LLVM Github Monore

[PATCH] D108339: [openmp][nfc] Replace OMPGridValues array with struct

2021-08-18 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield created this revision. JonChesterfield added reviewers: jdoerfert, dpalermo, gregrodgers, ronlieb, tianshilei1992, grokos. Herald added subscribers: kerbowa, guansong, tpr, yaxunl, nhaehnle, jvesely, jholewinski. JonChesterfield requested review of this revision. Herald added subs

[PATCH] D107971: [openmp] Annotate tmp variables with omp_thread_mem_alloc

2021-08-18 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. intended content for the new omp.h is #ifndef __OMP_H #define __OMP_H #if _OPENMP // Follows the pattern in interface.h // Clang sema checks this type carefully, needs to closely match that from omp.h typedef enum omp_allocator_handle_t { omp_n

[PATCH] D107971: [openmp] Annotate tmp variables with omp_thread_mem_alloc

2021-08-18 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield updated this revision to Diff 367374. JonChesterfield added a comment. - add minimal omp.h to clang test inputs Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D107971/new/ https://reviews.llvm.org/D107971 Files: clang/lib/Headers/_

[PATCH] D107971: [openmp] Annotate tmp variables with omp_thread_mem_alloc

2021-08-18 Thread Jon Chesterfield via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds. This revision was automatically updated to reflect the committed changes. Closed by commit rGdbd7bad9ad9b: [openmp] Annotate tmp variables with omp_thread_mem_alloc (authored by JonChesterfield). Repository: rG LLVM Github Monorepo CHANGE

[PATCH] D108339: [openmp][nfc] Replace OMPGridValues array with struct

2021-08-19 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. Given this patch, I can then template the amdgpu one on wavesize to get something that is a compile time constant in the devicertl and easily used from clang/host plugin. Expecting to have a few refactors following on from this to make the gfx9/gfx10 abstraction

[PATCH] D108339: [openmp][nfc] Replace OMPGridValues array with struct

2021-08-19 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield updated this revision to Diff 367451. JonChesterfield added a comment. - revert to unsigned, file doesn't currently require stdint Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D108339/new/ https://reviews.llvm.org/D108339 Files:

[PATCH] D106870: [OpenMP] Multi architecture compilation support

2021-08-19 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. I think this patch needs to split up into a large number of much smaller pieces. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D106870/new/ https://reviews.llvm.org/D106870 __

[PATCH] D108339: [openmp][nfc] Replace OMPGridValues array with struct

2021-08-19 Thread Jon Chesterfield via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds. This revision was automatically updated to reflect the committed changes. Closed by commit rG77579b99e9ce: [openmp][nfc] Replace OMPGridValues array with struct (authored by JonChesterfield). Repository: rG LLVM Github Monorepo CHANGES SI

[PATCH] D106870: [OpenMP] Multi architecture compilation support

2021-08-19 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added reviewers: ronlieb, pdhaliwal. JonChesterfield added a comment. Spent some time reading through this. I think the idea is to create a host binary that contains code objects for multiple variants of amdgpu - e.g. one that runs on gfx906 and another on gfx908, or one that run

[PATCH] D108380: [openmp][nfc] Refactor GridValues

2021-08-19 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield created this revision. JonChesterfield added reviewers: jdoerfert, dpalermo, gregrodgers, ronlieb, tianshilei1992, grokos, atmnpatel. Herald added subscribers: kerbowa, guansong, yaxunl, nhaehnle, jvesely, jholewinski. JonChesterfield requested review of this revision. Herald adde

[PATCH] D108380: [openmp][nfc] Refactor GridValues

2021-08-19 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield updated this revision to Diff 367513. JonChesterfield added a comment. - reorder field Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D108380/new/ https://reviews.llvm.org/D108380 Files: clang/include/clang/Basic/TargetInfo.h cla

[PATCH] D108380: [openmp][nfc] Refactor GridValues

2021-08-19 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added inline comments. Comment at: llvm/include/llvm/Frontend/OpenMP/OMPGridValues.h:102 +return R; + } }; jdoerfert wrote: > It should be in the device rtl then, no? This header is currently used from clang and the (amdgpu, could also be cu

[PATCH] D108380: [openmp][nfc] Refactor GridValues

2021-08-19 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added inline comments. Comment at: llvm/include/llvm/Frontend/OpenMP/OMPGridValues.h:102 +return R; + } }; jdoerfert wrote: > JonChesterfield wrote: > > jdoerfert wrote: > > > It should be in the device rtl then, no? > > This header is curre

[PATCH] D108380: [openmp][nfc] Refactor GridValues

2021-08-19 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield updated this revision to Diff 367587. JonChesterfield added a comment. - delete log2 accessors per review comments Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D108380/new/ https://reviews.llvm.org/D108380 Files: clang/include/cl

[PATCH] D108380: [openmp][nfc] Refactor GridValues

2021-08-19 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield updated this revision to Diff 367589. JonChesterfield added a comment. - whitespace, drop asserts Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D108380/new/ https://reviews.llvm.org/D108380 Files: clang/include/clang/Basic/TargetI

[PATCH] D104904: [OpenMP][AMDGCN] Initial math headers support

2021-06-25 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added inline comments. Comment at: clang/lib/Headers/__clang_hip_cmath.h:29 +#ifdef __OPENMP_AMDGCN__ +#define __DEVICE__ static constexpr __attribute__((always_inline, nothrow)) +#define __constant__ __attribute__((constant)) scchan wrote: > `__D

[PATCH] D104904: [OpenMP][AMDGCN] Initial math headers support

2021-06-30 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added inline comments. Comment at: clang/test/Headers/Inputs/include/cstdlib:29 float fabs(float __x) { return __builtin_fabs(__x); } +#endif jdoerfert wrote: > That seems to be fundamentally broken then, but let's see, maybe it will > somehow

[PATCH] D104904: [OpenMP][AMDGCN] Initial math headers support

2021-06-30 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added inline comments. Comment at: clang/lib/Headers/__clang_hip_cmath.h:43 +__DEVICE__ __CONSTEXPR__ long abs(long __n) { return ::labs(__n); } +__DEVICE__ __CONSTEXPR__ float fma(float __x, float __y, float __z) { return ::fmaf(__x, __y, __z); ---

[PATCH] D104904: [OpenMP][AMDGCN] Initial math headers support

2021-06-30 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield accepted this revision. JonChesterfield added a comment. This revision is now accepted and ready to land. I recommend we ship this and fix up the rough edges as we run into them. Paired with ocml it passes OVO libm tests which seems to be a fairly high bar for 'does it work'. The

[PATCH] D104904: [OpenMP][AMDGCN] Initial math headers support

2021-06-30 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. @ronlieb reports that this change means __CUDA__ is defined for openmp amdgcn compilation. I'm going to try to verify that Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D104904/new/ https://reviews.llvm.org/D104904

[PATCH] D104904: [OpenMP][AMDGCN] Initial math headers support

2021-06-30 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. Good spot. I've been feeding the following to various toolchains: // permute //#include //#include //#include #ifndef _OPENMP #error "OpenMP should be defined #endif #ifdef __CUDA__ #error "Cuda should not be defined" #endif #ifde

[PATCH] D104904: [OpenMP][AMDGCN] Initial math headers support

2021-06-30 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. Weird pre-existing stuff in cuda_complex_builtins. It has an #ifdef AMDGCN macro in it, despite 'cuda' in the name. I note there is no corresponding 'hip' complex builtins. The ifdef logic for stubbing out some functions (which is done with macros...) isn't ide

[PATCH] D104904: [OpenMP][AMDGCN] Initial math headers support

2021-06-30 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. tagged request changes because I think we should ifdef around complex before (or while) landing this, as defining `__CUDA__`, even transiently, is a user hostile thing to do from amdgpu openmp It is *really* ugly that we have cuda and hip implementations of cmat

[PATCH] D104904: [OpenMP][AMDGCN] Initial math headers support

2021-06-30 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added inline comments. Comment at: clang/lib/Headers/openmp_wrappers/math.h:41 #pragma omp begin declare variant match( \ device = {arch(nvptx, nvptx64)}, implementation = {extension(match_any)}) tian

[PATCH] D105221: [openmp][nfc] Simplify macros guarding math complex headers

2021-06-30 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield created this revision. JonChesterfield added reviewers: jdoerfert, tianshilei1992, pdhaliwal. Herald added subscribers: guansong, kristof.beyls, tpr, yaxunl. JonChesterfield requested review of this revision. Herald added subscribers: cfe-commits, sstefan1. Herald added a project: c

[PATCH] D105221: [openmp][nfc] Simplify macros guarding math complex headers

2021-06-30 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a reviewer: ronlieb. JonChesterfield added a comment. this unblocks the hazard I am concerned about for D104904 , namely it stops us defining `__CUDA__` when compiling amdgcn code that includes complex.h Comment at: clan

[PATCH] D105221: [openmp][nfc] Simplify macros guarding math complex headers

2021-07-01 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. Yeah, it probably should be. I should also check the blame list for the file to see who else should be on the reviewer list. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D105221/new/ https://reviews.llvm.org/D10522

[PATCH] D105221: [openmp][nfc] Simplify macros guarding math complex headers

2021-07-01 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. That's interesting. I don't see how there is a semantic change here - _openmp is defined already and the builtins file ignores the cuda define - but I also haven't tried openmp+cuda in combination. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTIO

[PATCH] D105221: [openmp][nfc] Simplify macros guarding math complex headers

2021-07-08 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. I think the openmp_wrappers are only used when compiling device code, which would explain why setting a macro in one of them is a proxy for detecting compilation for the device. Attempting to verify that, it looks like: trunk-nvptx includes openmp_wrappers on de

[PATCH] D104904: [OpenMP][AMDGCN] Initial math headers support

2021-07-09 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added inline comments. Comment at: clang/test/Headers/Inputs/include/cstdlib:29 float fabs(float __x) { return __builtin_fabs(__x); } +#endif jdoerfert wrote: > estewart08 wrote: > > JonChesterfield wrote: > > > jdoerfert wrote: > > > > That se

[PATCH] D105221: [openmp][nfc] Simplify macros guarding math complex headers

2021-07-13 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield updated this revision to Diff 358248. JonChesterfield added a comment. - reduce patch to only dropping cuda define Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D105221/new/ https://reviews.llvm.org/D105221 Files: clang/lib/Header

[PATCH] D105221: [openmp][nfc] Simplify macros guarding math complex headers

2021-07-13 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield updated this revision to Diff 358250. JonChesterfield added a comment. - reduce patch to only dropping cuda define v2, now with missing save Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D105221/new/ https://reviews.llvm.org/D105221

[PATCH] D105221: [openmp][nfc] Simplify macros guarding math complex headers

2021-07-13 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. Cut down to only dropping the cuda define, which is sufficient to resolve D104904 . Haven't build/tested this diff yet. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D105221/new/ ht

[PATCH] D105937: [OpenMP] Encode `omp [...] assume[...]` assumptions with `omp[x]` prefix

2021-07-14 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. LGTM Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D105937/new/ https://reviews.llvm.org/D105937 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm

[PATCH] D105221: [openmp][nfc] Simplify macros guarding math complex headers

2021-07-14 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added inline comments. Comment at: clang/lib/Headers/openmp_wrappers/complex:21 #define __OPENMP_NVPTX__ #include <__clang_cuda_complex_builtins.h> #undef __OPENMP_NVPTX__ ^ this header does not look for a macro called __CUDA__ or include any o

[PATCH] D104904: [OpenMP][AMDGCN] Initial math headers support

2021-07-14 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. Only revision I'm looking for here is to land D105221 or equivalent first Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D104904/new/ https://reviews.llvm.org/D104904 _

[PATCH] D105981: [AMDGPU][OpenMP] Support linking of math libraries

2021-07-14 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added inline comments. Comment at: clang/lib/Driver/ToolChains/AMDGPUOpenMP.cpp:252 +bool Wave64 = isWave64(DriverArgs, Kind); + +// TODO: There are way too many flags that change this. Do we need to check I recognise this comment. Is this

[PATCH] D105981: [AMDGPU][OpenMP] Support linking of math libraries

2021-07-14 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a reviewer: t-tye. JonChesterfield added inline comments. Comment at: clang/lib/Driver/ToolChains/AMDGPUOpenMP.cpp:267 +llvm::SmallVector BCLibs; +BCLibs.append(RocmInstallation.getCommonBitcodeLibs( +DriverArgs, LibDeviceFile, Wave64, DAZ, F

[PATCH] D105981: [AMDGPU][OpenMP] Support linking of math libraries

2021-07-14 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added inline comments. Comment at: clang/lib/Driver/ToolChains/AMDGPU.cpp:879-889 + // TODO: There are way too many flags that change this. Do we need to check + // them all? + bool DAZ = DriverArgs.hasArg(options::OPT_cl_denorms_are_zero) || + getD

[PATCH] D105981: [AMDGPU][OpenMP] Support linking of math libraries

2021-07-14 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added inline comments. Comment at: clang/lib/Driver/ToolChains/AMDGPU.cpp:879-889 + // TODO: There are way too many flags that change this. Do we need to check + // them all? + bool DAZ = DriverArgs.hasArg(options::OPT_cl_denorms_are_zero) || + getD

[PATCH] D112504: [OpenMP] Wrap (v)printf in the new RT and use same handling for AMD

2021-10-25 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. That's an interesting approach. Do you happen to know where I can find details of the data format behind that void*? Have been meaning to look at writing printf for amdgpu as host side decoding of that buffer. If the compiler knows how long it is, that would be

[PATCH] D112504: [OpenMP] Wrap (v)printf in the new RT and use same handling for AMD

2021-10-26 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. Nice! Yep, can add a size argument later. Will want it to control copying the payload over to the host Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D112504/new/ https://reviews.llvm.org/D112504 ___

[PATCH] D112227: [libomptarget] Build DeviceRTL for amdgpu

2021-10-26 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield updated this revision to Diff 382306. JonChesterfield added a comment. - rebase Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D112227/new/ https://reviews.llvm.org/D112227 Files: clang/lib/Driver/ToolChains/AMDGPUOpenMP.cpp open

[PATCH] D112227: [libomptarget] Build DeviceRTL for amdgpu

2021-10-26 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield updated this revision to Diff 382421. JonChesterfield added a comment. - Enable tests on amdgpu, with same ones marked xfail/unsupported as on the old runtime Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D112227/new/ https://review

[PATCH] D112227: [libomptarget] Build DeviceRTL for amdgpu

2021-10-26 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield updated this revision to Diff 382430. JonChesterfield added a comment. - drop outdated comment Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D112227/new/ https://reviews.llvm.org/D112227 Files: clang/lib/Driver/ToolChains/AMDGPUOp

[PATCH] D112227: [libomptarget] Build DeviceRTL for amdgpu

2021-10-26 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield updated this revision to Diff 382433. JonChesterfield added a comment. - delete the extern debug_kind variable Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D112227/new/ https://reviews.llvm.org/D112227 Files: clang/lib/Driver/Too

[PATCH] D112227: [libomptarget] Build DeviceRTL for amdgpu

2021-10-26 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. I think this is good enough for now. It drops the not yet used debug variable and writes out the lowering for runtime values of memory ordering manually. The latter will be simplified once clang learns to emit the switch instead of error. Omp lock is a problem I

[PATCH] D112227: [libomptarget] Build DeviceRTL for amdgpu

2021-10-26 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield updated this revision to Diff 382441. JonChesterfield added a comment. - rebase Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D112227/new/ https://reviews.llvm.org/D112227 Files: clang/lib/Driver/ToolChains/AMDGPUOpenMP.cpp open

[PATCH] D112227: [libomptarget] Build DeviceRTL for amdgpu

2021-10-26 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield updated this revision to Diff 382442. JonChesterfield added a comment. - rebase Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D112227/new/ https://reviews.llvm.org/D112227 Files: clang/lib/Driver/ToolChains/AMDGPUOpenMP.cpp open

[PATCH] D112227: [libomptarget] Build DeviceRTL for amdgpu

2021-10-26 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield updated this revision to Diff 382446. JonChesterfield added a comment. - rebase, having landed D111987 this time Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D112227/new/ https://reviews.llvm.org/D

[PATCH] D112227: [libomptarget] Build DeviceRTL for amdgpu

2021-10-26 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. Problem with missing symbol for __omp_rtl_debug_kind was a local error. I did the initial testing of this with a jury rigged clang that linked the new bitcode and ignored the old. The generation of this integer is guarded by which runtime clang thinks it is comp

[PATCH] D112227: [libomptarget] Build DeviceRTL for amdgpu

2021-10-26 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. Requires either D112544 or disabling bug51982 on newRTL before landing Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D112227/new/ https://reviews.llvm.org/D112227

[PATCH] D112639: [openmp][amdgpu] Add comment warning that libm may be broken

2021-10-27 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield created this revision. JonChesterfield added reviewers: arsenm, jdoerfert, gregrodgers, pdhaliwal, ye-luo, tianshilei1992, b-sumner. Herald added subscribers: kerbowa, guansong, t-tye, tpr, dstuttard, yaxunl, nhaehnle, jvesely, kzhuravl. JonChesterfield requested review of this re

[PATCH] D112639: [openmp][amdgpu] Add comment warning that libm may be broken

2021-10-27 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a reviewer: ronlieb. JonChesterfield added a comment. Using phab as a bug tracker here, but am also happy to check in that comment as-is if someone hits the green button. The problem is that rocm device-libs doesn't have architecture specific attributes set because it uses

[PATCH] D112504: [OpenMP] Wrap (v)printf in the new RT and use same handling for AMD

2021-10-27 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. This doesn't apply against main, diff relative to something that isn't in main Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D112504/new/ https://reviews.llvm.org/D112504

[PATCH] D112504: [OpenMP] Wrap (v)printf in the new RT and use same handling for AMD

2021-10-27 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield updated this revision to Diff 382747. JonChesterfield added a comment. -rebase on main Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D112504/new/ https://reviews.llvm.org/D112504 Files: clang/lib/CodeGen/CGBuiltin.cpp clang/lib/

[PATCH] D112227: [libomptarget] Build DeviceRTL for amdgpu

2021-10-27 Thread Jon Chesterfield via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rG33427fdb7b52: [libomptarget] Build DeviceRTL for amdgpu (authored by JonChesterfield). Changed prior to commit: https://reviews.llvm.org/D112227?vs=382446&id=382853#toc Repository: rG LLVM Github Mon

[PATCH] D112227: [libomptarget] Build DeviceRTL for amdgpu

2021-10-27 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield reopened this revision. JonChesterfield added a comment. This revision is now accepted and ready to land. Didn't fare very well under CI, investigating. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D112227/new/ https://reviews.llvm.

[PATCH] D112227: [libomptarget] Build DeviceRTL for amdgpu

2021-10-27 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added inline comments. Comment at: openmp/libomptarget/DeviceRTL/src/Synchronization.cpp:196 -void syncThreadsAligned() { syncThreads(); } - -void fenceTeam(int Ordering) { __builtin_amdgcn_fence(Ordering, "workgroup"); } - -void fenceKernel(int Ordering) { __bu

[PATCH] D112227: [libomptarget] Build DeviceRTL for amdgpu

2021-10-27 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield updated this revision to Diff 382864. JonChesterfield added a comment. - Reintroduce syncThreadsAligned, dropped in git merge conflict Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D112227/new/ https://reviews.llvm.org/D112227 Files

[PATCH] D112227: [libomptarget] Build DeviceRTL for amdgpu

2021-10-27 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield updated this revision to Diff 382866. JonChesterfield added a comment. - fix new+old test enabling Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D112227/new/ https://reviews.llvm.org/D112227 Files: clang/lib/Driver/ToolChains/AMDG

[PATCH] D112680: [OpenMP] Lower printf to __llvm_omp_vprintf

2021-10-27 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield created this revision. JonChesterfield added reviewers: jdoerfert, tianshilei1992, jhuber6, ye-luo, ronlieb, pdhaliwal. Herald added subscribers: asavonic, kerbowa, guansong, tpr, yaxunl, mgorny, nhaehnle, jvesely. JonChesterfield requested review of this revision. Herald added su

[PATCH] D112680: [OpenMP] Lower printf to __llvm_omp_vprintf

2021-10-27 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. Makes more sense with D112227 landed, but probably worthwhile just for the old runtime as it removes the link failure from all the tests that use printf. Comment at: clang/lib/CodeGen/CGGPUBuiltin.cpp:94 +std

[PATCH] D112227: [libomptarget] Build DeviceRTL for amdgpu

2021-10-27 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a reviewer: ronlieb. JonChesterfield added a comment. Tagging Ron as this is current stuck on the mystery of passing locally and failing CI Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D112227/new/ https://reviews.llvm.org/D1

[PATCH] D112680: [OpenMP] Lower printf to __llvm_omp_vprintf

2021-10-27 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield updated this revision to Diff 382873. JonChesterfield added a comment. - rebase to not require new runtime, stubs now return error Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D112680/new/ https://reviews.llvm.org/D112680 Files:

[PATCH] D112680: [OpenMP] Lower printf to __llvm_omp_vprintf

2021-10-27 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. New runtime failing CI suggests it may take some iteration to land, so reworked this to make sense with or without it. Doesn't make much difference to nvptx - printf still ultimately resolves to the same vprintf function as before - but a lot of tests that were

[PATCH] D112682: [openmp][nfc] Subset of D112680, landing would clean up diff

2021-10-27 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield created this revision. JonChesterfield added reviewers: jdoerfert, tianshilei1992, jhuber6, ye-luo, ronlieb, pdhaliwal. Herald added subscribers: guansong, yaxunl. JonChesterfield requested review of this revision. Herald added subscribers: cfe-commits, sstefan1. Herald added a pro

[PATCH] D112682: [openmp][nfc] Subset of D112680

2021-10-27 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. Change takes the bulk of EmitNVPTXDevicePrintfCallExpr into a helper function, and adjusts it to also return the size of the alloca for use in D112680 . NFC. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION ht

[PATCH] D112227: [libomptarget] Build DeviceRTL for amdgpu

2021-10-28 Thread Jon Chesterfield via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rG4d50803ce49c: [libomptarget] Build DeviceRTL for amdgpu (authored by JonChesterfield). Changed prior to commit: https://reviews.llvm.org/D112227?vs=382866&id=382987#toc Repository: rG LLVM Github Mon

[PATCH] D112227: [libomptarget] Build DeviceRTL for amdgpu

2021-10-28 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. Landed a slightly modified version of this - code and test changes are included, but the tests are not run by default. I'm hopeful this will help the process of working out why ~10 are failing on CI and passing locally. Repository: rG LLVM Github Monorepo CH

[PATCH] D112680: [OpenMP] Lower printf to __llvm_omp_vprintf

2021-11-08 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. Ping Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D112680/new/ https://reviews.llvm.org/D112680 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm

[PATCH] D109885: [MLIR][[amdgpu-arch]][OpenMP] Remove direct dependency on /opt/rocm

2021-11-08 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield requested changes to this revision. JonChesterfield added a reviewer: ronl. JonChesterfield added a comment. This revision now requires changes to proceed. Marking as request changes because: `if (DEFINED ENV{ROCM_PATH})` Looking at environment variables is strictly worse than look

[PATCH] D109885: [MLIR][[amdgpu-arch]][OpenMP] Remove direct dependency on /opt/rocm

2021-11-08 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. I don't know about the MLIR parts, but replacing > find_package(hsa-runtime64 QUIET 1.2.0 HINTS ${CMAKE_INSTALL_PREFIX} PATHS > /opt/rocm) with > find_package(hsa-runtime64 QUIET 1.2.0 HINTS ${CMAKE_INSTALL_PREFIX} PATHS > ${ROCM_PATH}) seems an improvement,

[PATCH] D112680: [OpenMP] Lower printf to __llvm_omp_vprintf

2021-11-08 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. Thanks for the review! Couple of helpers to split out and the rebase on main is messy, retesting now Comment at: clang/lib/CodeGen/CGGPUBuiltin.cpp:231 + VprintfFunc, {Args[0].getRValue(*this).getScalarVal(), BufferPtr, Size})); +} --

[PATCH] D112680: [OpenMP] Lower printf to __llvm_omp_vprintf

2021-11-08 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield updated this revision to Diff 385505. JonChesterfield added a comment. - rebase, static functions Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D112680/new/ https://reviews.llvm.org/D112680 Files: clang/lib/CodeGen/CGBuiltin.cpp

[PATCH] D112680: [OpenMP] Lower printf to __llvm_omp_vprintf

2021-11-08 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield updated this revision to Diff 385512. JonChesterfield added a comment. - helper for nonscalars Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D112680/new/ https://reviews.llvm.org/D112680 Files: clang/lib/CodeGen/CGBuiltin.cpp cl

[PATCH] D112680: [OpenMP] Lower printf to __llvm_omp_vprintf

2021-11-08 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a subscriber: ronlieb. JonChesterfield added a comment. Updated. Left Builder.CreateCall alone as I've failed to think of a clean way to handle the sometimes present size argument. Tests pass locally and are a credible guess at what will pass the CI bot, but may need to rol

[PATCH] D112680: [OpenMP] Lower printf to __llvm_omp_vprintf

2021-11-08 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield updated this revision to Diff 385525. JonChesterfield added a comment. - fold nvptx and openmp printf into one call - Drop ReturnValue argument Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D112680/new/ https://reviews.llvm.org/D1126

[PATCH] D112680: [OpenMP] Lower printf to __llvm_omp_vprintf

2021-11-08 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. Folded EmitOpenMPDevicePrintfCallExpr and EmitNVPTXDevicePrintfCallExpr into asserts plus one line call into a common function. Leaving amdgpu alone in this patch as it's doing something odd with getScalarVal and I don't want to run this through the internal CI

[PATCH] D112680: [OpenMP] Lower printf to __llvm_omp_vprintf

2021-11-08 Thread Jon Chesterfield via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds. This revision was automatically updated to reflect the committed changes. Closed by commit rGdb81d8f6c4d6: [OpenMP] Lower printf to __llvm_omp_vprintf (authored by JonChesterfield). Repository: rG LLVM Github Monorepo CHANGES SINCE LAST A

[PATCH] D113421: [clang][openmp][NFC] Remove arch-specific CGOpenMPRuntimeGPU files

2021-11-08 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. Example of the function as opposed to intrinsics is __kmpc_get_hardware_num_threads_in_block from just above where you've modified. That corresponds to a function in the device runtime, e.g. int __kmpc_get_hardware_num_threads_in_block() { return get_workg

[PATCH] D112680: [OpenMP] Lower printf to __llvm_omp_vprintf

2021-11-08 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. This passed the openmp buildbot but failed some others, e.g. clang-armv7-quick/llvm/clang/test/OpenMP/nvptx_target_printf_codegen.c:64:19: error: CHECK-64-NEXT: expected string not found in input // CHECK-64-NEXT: [[TMP6:%.*]] = call i32 @vprintf(i8* [[TMP1

[PATCH] D112680: [OpenMP] Lower printf to __llvm_omp_vprintf

2021-11-08 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield updated this revision to Diff 385614. JonChesterfield added a comment. - ./llvm/utils/update_cc_test_checks.py clang/test/OpenMP/nvptx_target_printf_codegen.c --opt=$HOME/llvm-install/bin/opt Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.ll

[PATCH] D113421: [clang][openmp][NFC] Remove arch-specific CGOpenMPRuntimeGPU files

2021-11-08 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. I think __kmpc_get_warp_size needs an entry in Utils.cpp keepAlive(), and corresponding functions added to `deviceRTLs/*/src/target_impl.*` Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D113421/new/ https://reviews.

[PATCH] D113421: [clang][openmp][NFC] Remove arch-specific CGOpenMPRuntimeGPU files

2021-11-08 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield accepted this revision. JonChesterfield added a comment. Thanks! Delighted to see the subclasses gone Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D113421/new/ https://reviews.llvm.org/D113421 __

[PATCH] D113538: OpenMP: Start calling setTargetAttributes for generated kernels

2021-11-10 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added reviewers: tianshilei1992, ye-luo, jhuber6. JonChesterfield added a comment. That seems important. What was the symptom of failing to set these? We may now be redundantly setting some, e.g. I think convergent is set somewhere else before this patch. Added a few people who

[PATCH] D112680: [OpenMP] Lower printf to __llvm_omp_vprintf

2021-11-10 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield updated this revision to Diff 386156. JonChesterfield added a comment. - regen test file using the better './llvm/utils/update_cc_test_checks.py --llvm-bin=/home/amd/llvm-install/bin clang/test/OpenMP/nvptx_target_printf_codegen.c', now passing Repository: rG LLVM Github Mono

[PATCH] D112680: [OpenMP] Lower printf to __llvm_omp_vprintf

2021-11-10 Thread Jon Chesterfield via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds. This revision was automatically updated to reflect the committed changes. Closed by commit rG27177b82d4ca: [OpenMP] Lower printf to __llvm_omp_vprintf (authored by JonChesterfield). Repository: rG LLVM Github Monorepo CHANGES SINCE LAST A

[PATCH] D96769: [OpenMP][AMDGPU] Skip backend and assemble phases for amdgcn

2021-03-15 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield accepted this revision. JonChesterfield added a comment. This revision is now accepted and ready to land. Agreed. Lack of save temps is causing grief when debugging, so I keep on applying this patch locally. Let's go with this for now. and change to something better when we think

[PATCH] D96877: [libomptarget] Try a fallback devicertl if the preferred one is missing

2021-03-15 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield abandoned this revision. JonChesterfield added a comment. I'm going to abandon this. I'm not confident that a cuda toolkit that is newer than the compiler will work with it correctly and would prefer it take some jury rigging on the end users part to put the two together. I get

[PATCH] D113359: [Libomptarget][WIP] Introduce VGPU Plugin

2021-11-13 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. I can't see it in the diff - does the cmake somewhere enable the existing tests on this new target? A bit surprised to see ffi involved, are we thinking of spawning a separate process for the target? Comment at: clang/lib/Basic/Targets/X86.h:

[PATCH] D112639: [openmp][amdgpu] Add comment warning that libm may be broken

2021-11-15 Thread Jon Chesterfield via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rG0e738323a9c4: [openmp][amdgpu] Add comment warning that libm may be broken (authored by JonChesterfield). Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D11263

[PATCH] D110257: [CFE][Codegen] Make sure to maintain the contiguity of all the static allocas

2021-11-15 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. Please change the commit message to say why this change is necessary / an improvement on what we have now. My recollection is that the amdgpu backend crashes on some IR and this decreases the probability of that IR pattern occuring, which still sounds like fixi

[PATCH] D110257: [CFE][Codegen] Make sure to maintain the contiguity of all the static allocas

2021-11-15 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. Please update the documentation with this new constraint. It would be helpful to know exactly when we now require alloca instructions to be adjacent to one another. If you wish to avoid other passes breaking the invariant in future, I think it needs to be added

[PATCH] D110257: [CFE][Codegen] Make sure to maintain the contiguity of all the static allocas

2021-11-16 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. In D110257#3133866 , @hsmhsm wrote: > This is not something specific to AMDGPU backend, but AMDGPU backend at > present requires this canonical form. Undocumented and not checked by the IR verifier. Canonical form seems

[PATCH] D110257: [CFE][Codegen] Make sure to maintain the contiguity of all the static allocas

2021-11-16 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. In D110257#3133895 , @hsmhsm wrote: > In D110257#3133879 , > @JonChesterfield wrote: > >> In D110257#3133866 , @hsmhsm wrote: >> >>> This

<    1   2   3   4   5   6   7   8   9   >