[PATCH] D62046: [OpenMP][bugfix] Add missing math functions variants for log and abs.

2019-08-12 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. I'm not sure about this diff. I think it's breaking and . Raised bug https://bugs.llvm.org/show_bug.cgi?id=42972 Repository: rC Clang CHANGES SINCE LAST ACTION https://reviews.llvm.org/D62046/new/ https://reviews.llvm.org/D62046 ___

[PATCH] D64943: [Clang][OpenMP offload] Eliminate use of OpenMP linker script

2019-09-11 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. I'm not sure copying the crtbegin/crtend mechanism from the early days of C runtime is ideal. Since the data is stored in a common section anyway, please could we rename it to __omp_offloading_entries in which case the linker will provide start/end symbols autom

[PATCH] D64943: [Clang][OpenMP offload] Eliminate use of OpenMP linker script

2019-09-11 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. > OpenMP linker script is known to cause problems for gold and lld linkers on > Linux and it will also cause problems for Windows enabling in future What are the known problems with the linker script? I'm wondering if they can be resolved without the overhead of

[PATCH] D64943: [Clang][OpenMP offload] Eliminate use of OpenMP linker script

2019-09-11 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. In D64943#173 , @ABataev wrote: > In D64943#158 , @JonChesterfield > wrote: > > > > OpenMP linker script is known to cause problems for gold and lld linkers > > > on Linux

[PATCH] D64943: [Clang][OpenMP offload] Eliminate use of OpenMP linker script

2019-09-11 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. In D64943#179 , @ABataev wrote: > In D64943#178 , @JonChesterfield > wrote: > > > In D64943#173 , @ABataev wrote: > > > > > In D

[PATCH] D64943: [Clang][OpenMP offload] Eliminate use of OpenMP linker script

2019-09-11 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. In D64943#1666849 , @sdmitriev wrote: > In D64943#136 , @JonChesterfield > wrote: > > > I'm not sure copying the crtbegin/crtend mechanism from the early days of C > > runtime

[PATCH] D64943: [Clang][OpenMP offload] Eliminate use of OpenMP linker script

2019-09-12 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. > Hm, I was not aware of this Linux linker feature, thanks a lot for the > explanation! I see only one problem with using it as a replacement for the > begin/end objects – it looks like `__start_name`/`__stop_name` symbols are > created with `default` visibility

[PATCH] D64943: [Clang][OpenMP offload] Eliminate use of OpenMP linker script

2019-09-12 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. I'm on board with getting rid of the linker script. Gold's limited support for that seems conclusive. I believe the current script does two things: 1/ takes a binary and embeds it in a section named .omp_offloading.amdgcn-amd-amdhsa 2/ provides start/end symbols

[PATCH] D64943: [Clang][OpenMP offload] Eliminate use of OpenMP linker script

2019-09-12 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. >> Does this diff mix getting rid of the linker script with other changes? E.g. >> it looks like the metadata generation is moving from clang to the new tool, >> but that seems orthogonal to dropping the linker script. > > Metadata is still generated by the clan

[PATCH] D64943: [Clang][OpenMP offload] Eliminate use of OpenMP linker script

2019-09-12 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. > I see some problems with using llvm-objcopy for that. First issue is that > symbols created by llvm-objcopy for embedded data depend on the input file > name. As you know these symbols are referenced from the offload registration > code that is currently added

[PATCH] D64943: [Clang][OpenMP offload] Eliminate use of OpenMP linker script

2019-09-25 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. In D64943#1682452 , @hfinkel wrote: > This LGTM. I'm happy that this is a design improvement over the current > scheme. @JonChesterfield , @ABataev , any further comments? This patch mixes two concerns. 1/ Remove the li

[PATCH] D64943: [Clang][OpenMP offload] Eliminate use of OpenMP linker script

2019-09-25 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. The three way split looks great, thanks. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D64943/new/ https://reviews.llvm.org/D64943 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.or

[PATCH] D68166: [Clang][OpenMP Offload] Add new tool for wrapping offload device binaries

2019-10-02 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. I still don't understand what advantage the standalone tool has over renaming the file to `omp_offloading` and then using `objcopy -I binary`. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D68166/new/ https://review

[PATCH] D68166: [Clang][OpenMP Offload] Add new tool for wrapping offload device binaries

2019-10-02 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. I think this patch is a behaviour change. Currently, the target binary is embedded in the host binary at link time. With this change, the contents of the binary are embedded in bitcode which is subsequently fed into the link. If indeed so, that seems strictly be

[PATCH] D68166: [Clang][OpenMP Offload] Add new tool for wrapping offload device binaries

2019-10-02 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. > The tool indeed does not have anything specific to OpenMP at this step, but > that will change... That makes sense to me, thanks. I think we're going to have some trouble adapting this to our build as there's already a standalone tool that runs at link time.

[PATCH] D68166: [Clang][OpenMP Offload] Add new tool for wrapping offload device binaries

2019-10-09 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield accepted this revision. JonChesterfield added a comment. This revision is now accepted and ready to land. The direction is good and I believe all the feedback from D64943 has already been incorporated. LGTM, thanks. CHANGES SINCE LAST ACTION h

[PATCH] D69494: OpenMP: Add helper function for convergent runtime calls

2019-10-27 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a reviewer: grokos. JonChesterfield accepted this revision. JonChesterfield added a comment. This revision is now accepted and ready to land. LGTM. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D69494/new/ https://reviews.llvm.org/D69494 _

[PATCH] D80804: [AMDGPU] Expose llvm atomic inc/dec instructions as clang builtins for AMDGPU target

2020-05-29 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. Thanks for this. I like the refactor to share code with amdgcn_fence. Agreed with Matt's points above. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D80804/new/ https://reviews.llvm.org/D80804 ___

[PATCH] D80858: [HIP] Support accessing static device variable in host code

2020-06-01 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. The value is based on llvm::sys::Process::GetRandomNumber(). So unless one provides a build-system-derived uuid for every compilation unit, recompiling identical source will yield an observably different binary. The distinction between 'unique' and 'random' is s

[PATCH] D84476: Make hip math headers easier to use from C

2020-07-23 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield created this revision. JonChesterfield added a reviewer: yaxunl. Herald added a reviewer: jdoerfert. Herald added subscribers: cfe-commits, sstefan1. Herald added a project: clang. Make hip math headers easier to use from C Motivation is a step towards using the hip math headers t

[PATCH] D84476: Make hip math headers easier to use from C

2020-07-24 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield updated this revision to Diff 280524. JonChesterfield added a comment. - Fix missing underscores Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D84476/new/ https://reviews.llvm.org/D84476 Files: clang/lib/Headers/__clang_hip_libdev

[PATCH] D84476: Make hip math headers easier to use from C

2020-07-24 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield marked 4 inline comments as done. JonChesterfield added inline comments. Comment at: clang/lib/Headers/__clang_hip_math.h:561 inline double abs(double __x) { return __ocml_fabs_f64(__x); } +#endif __DEVICE__ yaxunl wrote: > jdoerfert wrote: > >

[PATCH] D84476: Make hip math headers easier to use from C

2020-07-24 Thread Jon Chesterfield via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rG679158e662aa: Make hip math headers easier to use from C (authored by JonChesterfield). Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D84476/new/ https://rev

[PATCH] D82087: AMDGPU/clang: Add builtins for llvm.amdgcn.ballot

2020-07-26 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. A macro for wavefront size would make targeting gfx10 for openmp easier. We currently use an int32_t for nvptx and an int64_t for amdgcn in various runtime function interfaces. I'd like to be able to set the latter based on said macro. CHANGES SINCE LAST ACTIO

[PATCH] D84743: [Clang][AMDGCN] Universal device offloading macros header

2020-07-28 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a reviewer: arsenm. JonChesterfield added a comment. Herald added a subscriber: wdng. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D84743/new/ https://reviews.llvm.org/D84743 ___ c

[PATCH] D80917: [OpenMP][AMDGCN] Support OpenMP offloading for AMDGCN architecture - Part 2

2020-07-28 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added inline comments. Comment at: llvm/include/llvm/Frontend/OpenMP/OMPGridValues.h:66 + /// for NVPTX. + GV_Warp_Size_32, + /// The number of bits required to represent the max number of threads in warp What's the point of warp_size_32? It's

[PATCH] D84743: [Clang][AMDGCN] Universal device offloading macros header

2020-07-29 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. We probably do want a macro to indicate 'compiling for amdgcn as the device half of a combined host+device language'. I'm having a tough time with the control flow in this header so we probably want tests to check the overall behaviour is as intended. E.g. stati

[PATCH] D75917: Expose llvm fence instruction as clang intrinsic

2020-04-06 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added inline comments. Comment at: clang/lib/Sema/SemaChecking.cpp:1888 +// Check valididty of memory ordering as per C11 / C++11's memody model. +if (ord < static_cast(llvm::AtomicOrderingCABI::acquire) || + ord > static_cast(llvm::AtomicOrderingCABI

[PATCH] D75917: Expose llvm fence instruction as clang intrinsic

2020-04-06 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added inline comments. Comment at: clang/lib/CodeGen/CGBuiltin.cpp:13651 + llvm::getConstantStringInfo(Scope, scp); + SSID = getLLVMContext().getOrInsertSyncScopeID(scp); + sameerds wrote: > saiislam wrote: > > sameerds wrote: > > > This

[PATCH] D75917: Expose llvm fence instruction as clang intrinsic

2020-04-06 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added inline comments. Comment at: clang/test/CodeGenHIP/builtin_memory_fence.cpp:9 + // CHECK: fence syncscope("workgroup") seq_cst + __builtin_memory_fence(__ATOMIC_SEQ_CST, "workgroup"); + sameerds wrote: > JonChesterfield wrote: > > samee

[PATCH] D75917: Expose llvm fence instruction as clang intrinsic

2020-04-06 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added inline comments. Comment at: clang/test/CodeGenHIP/builtin_memory_fence.cpp:9 + // CHECK: fence syncscope("workgroup") seq_cst + __builtin_memory_fence(__ATOMIC_SEQ_CST, "workgroup"); + JonChesterfield wrote: > sameerds wrote: > > JonCh

[PATCH] D75917: Expose llvm fence instruction as clang intrinsic

2020-04-06 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added inline comments. Comment at: clang/test/CodeGenHIP/builtin_memory_fence.cpp:1 // REQUIRES: amdgpu-registered-target // RUN: %clang_cc1 %s -x hip -emit-llvm -O0 -o - \ Codegen test should be under CodeGen and/or CodeGenCXX Repository: r

[PATCH] D77774: [OpenMP] Allow to go first in C++-mode in target regions

2020-04-09 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield accepted this revision. JonChesterfield added a comment. This revision is now accepted and ready to land. The cmath/math.h story makes me sad, but this is a good workaround. Thanks Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D4/

[PATCH] D77390: Fix __builtin_amdgcn_workgroup_size_x/y/z return type

2020-04-11 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added inline comments. Comment at: clang/test/CodeGenOpenCL/builtins-amdgcn.cl:541 switch (d) { - case 0: *out = __builtin_amdgcn_workgroup_size_x(); break; + case 0: *out = __builtin_amdgcn_workgroup_size_x() + 1; break; case 1: *out =

[PATCH] D77918: [OpenMP] Avoid crash in preparation for diagnose of unsupported type

2020-04-13 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield accepted this revision. JonChesterfield added a comment. This revision is now accepted and ready to land. Thanks! Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D77918/new/ https://reviews.llvm.org/D77918 __

[PATCH] D80897: [OpenMP] Initial support for std::complex in target regions

2020-07-02 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield accepted this revision. JonChesterfield added a comment. This revision is now accepted and ready to land. I think this change is good. The library story is a bit difficult, but fundamentally openmp needs a shim of some sort to map target math functions onto the libm of the underl

[PATCH] D83121: [AMDGPU] Change Clang AMDGCN atomic inc/dec builtins to take unsigned values

2020-07-03 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield accepted this revision. JonChesterfield added a comment. This revision is now accepted and ready to land. Thanks for this. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D83121/new/ https://reviews.llvm.org/D83121 _

[PATCH] D80858: [CUDA][HIP] Support accessing static device variable in host code

2020-07-06 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added inline comments. Comment at: clang/lib/AST/ASTContext.cpp:10068 +isa(D) && cast(D)->isFileVarDecl() && +cast(D)->getStorageClass() == SC_Static) { + return GVA_StrongExternal; yaxunl wrote: > rjmccall wrote: > > Are you

[PATCH] D83268: [OpenMP][NFC] Remove unused (always fixed) arguments

2020-07-07 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. I'm not sure we have a consensus on api stability. Usually llvm allows mixing libraries and compilers from different sources, e.g. the various libunwind or compiler-rt vs libgcc. Libomptarget in general appears to be considered fixed and has external users (int

[PATCH] D83268: [OpenMP][NFC] Remove unused (always fixed) arguments

2020-07-07 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield accepted this revision. JonChesterfield added a comment. This revision is now accepted and ready to land. Aside from the API stability concern this looks uncontentious. Removes dead arguments, generally makes things simpler. Thus LGTM. @Hahnfeld @ABataev - are you sufficiently pe

[PATCH] D83268: [OpenMP][NFC] Remove unused (always fixed) arguments

2020-07-07 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. In D83268#2135938 , @ABataev wrote: > > @Hahnfeld @ABataev - are you sufficiently persuaded that preserving the > > current interface is not worth the development cost? > > No, I'm not. Long before that, we relied on the A

[PATCH] D83349: [OpenMP][NFC] Remove unused and untested code from the device runtime

2020-07-07 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield accepted this revision. JonChesterfield added a comment. This revision is now accepted and ready to land. Applied to the amdgcn implementation. Compiles fine, tests all passing. Seems likely that this lot really is dead. Interesting that this removes*_data_sharing_environment. I

[PATCH] D83268: [OpenMP][NFC] Remove unused (always fixed) arguments

2020-07-07 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. __kmpc_spmd_kernel_init is always called with RequiresDataSharing == 0 Specifically, it's only called from clang, and emitSPMDEntryHeader unconditionally passes zero to it I.e. I think there's more stuff that can be cleaned up in the theme of the above, suggest

[PATCH] D83492: [OpenMP] Use common interface to access GPU Grid Values

2020-07-09 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. Changing to getGridValue would be useful for sharing parts of this with amdgcn. The aomp toolchain handles codegen for amdgcn by adding if (isAMDGCN) to this file. Until such time as tregions obsoletes this code, I think we should go with layers instead of scatt

[PATCH] D83591: [OpenMP][CUDA] Fix std::complex in GPU regions

2020-07-10 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield accepted this revision. JonChesterfield added a comment. This revision is now accepted and ready to land. Fine by me. Let's get nvptx working properly in tree now and work out how to wire up amdgcn subsequently. I'm sure a reasonable abstraction will present itself. Repository:

[PATCH] D83591: [OpenMP][CUDA] Fix std::complex in GPU regions

2020-07-10 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. In D83591#2145437 , @jdoerfert wrote: > I did not know they are using __clang_cuda headers. (Site note, we should > rename them then.) I also did not know that. I am repeatedly caught out by things named 'cuda', 'nvptx'

[PATCH] D83723: [OpenMP] Generalize CGOpenMPRuntimeNVPTX as CGOpenMPRuntimeGPU

2020-07-13 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. This is indeed the direction I had in mind. Broad strokes look right. I probably wouldn't notice an accidental change amidst the whitespace reshuffle. Very happy to read through line by line if you can split the whitespace change out. Comment

[PATCH] D83723: [OpenMP] Generalize CGOpenMPRuntimeNVPTX as CGOpenMPRuntimeGPU

2020-07-14 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added inline comments. Comment at: clang/lib/CodeGen/CGOpenMPRuntimeAMDGCN.cpp:39 + // return constant compile-time target-specific warp size + unsigned WarpSize = CGF.getTarget().getGridValue(llvm::omp::GV_Warp_Size); + return Bld.getInt32(WarpSize); -

[PATCH] D83281: [OpenMP] Allow traits for the OpenMP context selector `isa`

2020-07-14 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. Agreed on tests. I like the mechanism - passing a string through to the backend as a way to dispatch between isa properties looks cleanly extensible. We probably do want to emit a warning when the backend claims it doesn't know anything about said string as it'l

[PATCH] D83832: [OpenMP] Provide a flag to disable safety checks for GPU optimizations

2020-07-14 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. I think there's an unfortunate interaction with link time optimisation here. If there are external regions, but their code is combined with llvm-link before codegen, then a user could reasonably assume this flag is safe. Would it would be correct to compile the

[PATCH] D83832: [OpenMP] Provide a flag to disable safety checks for GPU optimizations

2020-07-15 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield accepted this revision. JonChesterfield added a comment. This revision is now accepted and ready to land. I think this is good. It's dangerous, but it's also undocumented and has unsafe in the name. I should be able to use this to sidestep limitations in the amdgpu function poin

[PATCH] D78759: Add Statically Linked Libraries

2020-06-28 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. This appears to have been committed without addressing all the comments or waiting for an acceptance from someone outside of our organisation. That doesn't seem right - am I missing part of the thread here? Repository: rG LLVM Github Monorepo CHANGES SINCE L

[PATCH] D80804: [AMDGPU] Introduce Clang builtins to be mapped to AMDGCN atomic inc/dec intrinsics

2020-06-30 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. This patch declares the clang builtin as acting on signed values, but the IR intrinsic maps to an instruction which does an unsigned comparison. We don't have ISA support for a signed comparison equivalent. Addition is the same operation on signed or unsigned in

[PATCH] D75917: Expose llvm fence instruction as clang intrinsic

2020-04-17 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. The tests look good, but I can't see the implementation in this diff. Maybe a file missing from the patch? Can be hard to tell with phabricator, the error may be at my end. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.

[PATCH] D78495: [nfc] Accept addrspacecast allocas in InitTempAlloca

2020-04-20 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield created this revision. JonChesterfield added reviewers: rjmccall, aaron.ballman, ABataev, jdoerfert, arsenm. Herald added subscribers: cfe-commits, wdng. Herald added a project: clang. [nfc] Accept addrspacecast allocas in InitTempAlloca Changes the precondition to be slightly mor

[PATCH] D78495: [nfc] Accept addrspacecast allocas in InitTempAlloca

2020-04-20 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. In D78495#1992404 , @arsenm wrote: > Needs test? I'm not sure how to write said test. How do we normally hit asserts from the clang test suite? This fires a lot in the openmp on amdgcn downstream branch, but I'm happy

[PATCH] D78495: [nfc] Accept addrspacecast allocas in InitTempAlloca

2020-04-20 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield abandoned this revision. JonChesterfield added a comment. No problem. This isn't on the live path - the function is mostly called from openmp codegen and clang doesn't target openmp/amdgcn just yet. I'll roll this change into the codegen patch to enable that. Repository: rG L

[PATCH] D75917: Expose llvm fence instruction as clang intrinsic

2020-04-22 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. Amdgcn specific is fine by me. Hopefully that unblocks this. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D75917/new/ https://reviews.llvm.org/D75917 ___ cfe-commits ma

[PATCH] D75917: Expose llvm fence instruction as clang intrinsic

2020-04-22 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added inline comments. Comment at: clang/test/CodeGenCXX/builtin-amdgcn-fence-failure.cpp:5 + +void test_amdgcn_fence_failure() { + arsenm wrote: > Does this really depend on C++? Can it use OpenCL like the other builtin > tests?This also belongs

[PATCH] D46472: [HIP] Support offloading by linker script

2020-04-23 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added inline comments. Herald added a reviewer: jdoerfert. Comment at: cfe/trunk/lib/Driver/ToolChains/CommonArgs.cpp:1329 + // Get the HIP offload tool chain. + auto *HIPTC = static_cast( + C.getSingleOffloadToolChain()); Should this be `t

[PATCH] D85735: [OpenMP] Context selector extensions for template functions

2020-08-11 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. Nice. What makes it an extension? 5.0 / 2.3.5 claims "and where variant-func-id is the name of a function variant that is either a base language identifier or, for C++, a template-id." which suggests this could be always-on Comment at: clang/l

[PATCH] D85879: [OpenMP] Overload `std::isnan` and friends multiple times for the GPU

2020-08-13 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. I think this is reasonable. It's unfortunate to have isnan return bool or int depending on the system headers, but considering we have that in a language that doesn't mangle the return type into the name the workaround seems OK. I think `#define isnan()` in a sy

[PATCH] D85877: [OpenMP] Support nested OpenMP context selectors (declare variant)

2020-08-13 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield accepted this revision. JonChesterfield added a comment. This revision is now accepted and ready to land. Thanks! Probably good to implement this just at the time math headers needs it as that gives us a real world example of the change working. Repository: rG LLVM Github Mono

[PATCH] D85878: [OpenMP] Context selector extensions for return value overloading

2020-08-13 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. If I recall correctly, &foo with variants of foo returns a pointer to the base. If we have no base, and disable_implicit_base, what does &foo yield? It should probably be a compilation error with some descriptive message Repository: rG LLVM Github Monorepo C

[PATCH] D74361: [Clang] Undef attribute for global variables

2020-08-14 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. In D74361#2218143 , @erichkeane wrote: > Also, see this bug: This crashes immediately when used on a template > instantiation: https://bugs.llvm.org/show_bug.cgi?id=47169 Thanks for the bug report! Template instantiation

[PATCH] D74361: [Clang] Undef attribute for global variables

2020-08-14 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. In D74361#2218258 , @erichkeane wrote: > I did a little debugging, and the problem is the template itself isn't a > complete type That's clear cut then, thanks. This patch was limited to trivially constructible types, an

[PATCH] D74361: [Clang] Undef attribute for global variables

2020-08-14 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. In D74361#2218294 , @erichkeane wrote: > Yep! Declaring a global variable that isn't 'extern' with an incomplete type > is disallowed anyway, so if you call RequireCompleteType, you're likely just > diagnosing that early

[PATCH] D85990: [Clang] Fix BZ47169, loader_uninitialized on incomplete types

2020-08-14 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield created this revision. JonChesterfield added reviewers: erichkeane, aaron.ballman, rjmccall. Herald added a project: clang. Herald added a subscriber: cfe-commits. JonChesterfield requested review of this revision. [Clang] Fix BZ47169, loader_uninitialized on incomplete types Repo

[PATCH] D74361: [Clang] Undef attribute for global variables

2020-08-14 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. Fix proposed at D85990 . Thanks for the detailed bug report! Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D74361/new/ https://reviews.llvm.org/D74361 __

[PATCH] D85990: [Clang] Fix BZ47169, loader_uninitialized on incomplete types

2020-08-14 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added inline comments. Comment at: clang/test/Sema/attr-loader-uninitialized.cpp:14 +extern S incomplete_external_rejected __attribute__((loader_uninitialized)); +// expected-error@-1 {{variable has incomplete type 'S'}} +// expected-note@-3 {{forward declaration

[PATCH] D85878: [OpenMP] Context selector extensions for return value overloading

2020-08-16 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield accepted this revision. JonChesterfield added a comment. This revision is now accepted and ready to land. Link error seems reasonable to me. Thanks Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D85878/new/ https://reviews.llvm.org/D8

[PATCH] D85875: [OpenMP][FIX] Do not crash trying to print a missing (demangled) user condition

2020-08-16 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield accepted this revision. JonChesterfield added a comment. This revision is now accepted and ready to land. Seems OK to me as a local fix. I'm not sure about encoding 'missing condition' as 0, but that's a preexisting choice. Repository: rG LLVM Github Monorepo CHANGES SINCE LA

[PATCH] D133726: [OpenMP][AMDGPU] Link bitcode ROCm device libraries per-TU

2022-09-12 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. We can do this but should expect an increase in code size from having multiple internalised copies of the same function. There may be an incidental benefit if we can specialise some functions to the call site without additional cloning. Address of the same funct

[PATCH] D133726: [OpenMP][AMDGPU] Link bitcode ROCm device libraries per-TU

2022-09-12 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added inline comments. Comment at: clang/lib/Driver/ToolChains/AMDGPUOpenMP.h:58 + llvm::SmallVector + getHIPDeviceLibs(const llvm::opt::ArgList &Args) const override; + Why hip device libs? There's a common set, plus a hip.bc plus a opencl.bc.

[PATCH] D133726: [OpenMP][AMDGPU] Link bitcode ROCm device libraries per-TU

2022-09-12 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added inline comments. Comment at: clang/lib/Driver/ToolChains/AMDGPUOpenMP.h:58 + llvm::SmallVector + getHIPDeviceLibs(const llvm::opt::ArgList &Args) const override; + jhuber6 wrote: > JonChesterfield wrote: > > Why hip device libs? There's a

[PATCH] D133726: [OpenMP][AMDGPU] Link bitcode ROCm device libraries per-TU

2022-09-13 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. march for openmp, mcpu for hip seems ok. Notably llc needs to be told this as well, using mcpu, which may be an issue for save-temps Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D133726/new/ https://reviews.llvm.or

[PATCH] D133726: [OpenMP][AMDGPU] Link bitcode ROCm device libraries per-TU

2022-09-14 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield accepted this revision. JonChesterfield added a comment. This revision is now accepted and ready to land. I don't like this but will concede it's quicker than changing device libs to contain IR that doesn't have to be patched on the fly. Repository: rG LLVM Github Monorepo CH

[PATCH] D122352: [OpenMP] Do not create offloading entries for internal or hidden symbols

2022-03-23 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. Sounds good to me too, thanks! Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D122352/new/ https://reviews.llvm.org/D122352 ___ cfe-commits mailing list cfe-commits@lists.l

[PATCH] D122504: [OpenMP] Make Ctor / Dtor functions have external visibility

2022-03-25 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield accepted this revision. JonChesterfield added a comment. This revision is now accepted and ready to land. Nice, thanks. Wonder if we want protected visibility as well. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D122504/new/ https:

[PATCH] D122069: [Object] Add binary format for bundling offloading metadata

2022-03-29 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added reviewers: grokos, ABataev, ronlieb, tianshilei1992. JonChesterfield added a comment. Added some reviewers. I'd much prefer this used an existing binary format, DIY is prone to errors and maintenance hassles down the road. Don't care as much about which format as about it b

[PATCH] D131307: [Clang] Allow downgrading to a warning the diagnostic for setting a non fixed enum to a value outside the range of the enumeration values

2022-08-09 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. Did some digging here. The function hsa_agent_get_info takes an argument of type hsa_agent_info_t, which has declared values in the range [0 24]. The implementation of that (in amd_gpu_agent fwiw) casts that argument to a size_t and then switches on it, checking

[PATCH] D106070: [HIP] Remove workaround in __clang_hip_runtime_wrapper.h

2021-07-15 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. OK from the openmp gpu side as far as I can tell. This is probably another instance of where we really wanted _OPENMP_TARGET or similar. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D106070/new/ https://reviews.llvm.org/D106070

[PATCH] D105221: [openmp][nfc] Simplify macros guarding math complex headers

2021-07-18 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. @fodinabor? Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D105221/new/ https://reviews.llvm.org/D105221 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lis

[PATCH] D105221: [openmp][nfc] Simplify macros guarding math complex headers

2021-07-18 Thread Jon Chesterfield via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds. This revision was automatically updated to reflect the committed changes. Closed by commit rG3e649f8ef187: [openmp][nfc] Simplify macros guarding math complex headers (authored by JonChesterfield). Repository: rG LLVM Github Monorepo CHAN

[PATCH] D102107: [OpenMP] Codegen aggregate for outlined function captures

2021-07-19 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a subscriber: ronlieb. JonChesterfield added a comment. @ronlieb bisected amdgpu crashing to this too, rocm 'veccopy' case tries to dereference 0. Might be the same failure mode as the above or a different one, the hsa error reporting is quite coarse grained. Suggest we pu

[PATCH] D106301: [OpenMP] Disable trap before unreachable for OpenMP device jobs

2021-07-19 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. What's the problem with emitting llvm.trap in various unreachable places? Wondering if it also affects translating assert to an llvm.trap Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D106301/new/ https://reviews.ll

[PATCH] D106301: [OpenMP] Disable trap before unreachable for OpenMP device jobs

2021-07-19 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. In D106301#2888170 , @jdoerfert wrote: > llvm.trap is preserved, thus branches to an llvm.trap are preserved. That's interesting. Consistent with IR in general, template int test(int x) { if (x < 42) { retur

[PATCH] D106301: [OpenMP] Disable trap before unreachable for OpenMP device jobs

2021-07-19 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. I'm not sure about that - we could tie instcombine to -O0 or some similar proxy for debugging ve performance - but I'm practice it's fairly likely that most traps are compiler inserted so it probably works out the same. Conditional instcombine would let us remov

[PATCH] D104904: [OpenMP][AMDGCN] Initial math headers support

2021-07-20 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield accepted this revision. JonChesterfield added a comment. This revision is now accepted and ready to land. D105221 so LGTM too Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D104904/new/ https://revie

[PATCH] D104904: [OpenMP][AMDGCN] Initial math headers support

2021-07-21 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield updated this revision to Diff 360447. JonChesterfield added a comment. - rebase on main Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D104904/new/ https://reviews.llvm.org/D104904 Files: clang/lib/Driver/ToolChains/Clang.cpp cla

[PATCH] D104904: [OpenMP][AMDGCN] Initial math headers support

2021-07-21 Thread Jon Chesterfield via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds. This revision was automatically updated to reflect the committed changes. Closed by commit rG968899ad9cf1: [OpenMP][AMDGCN] Initial math headers support (authored by pdhaliwal, committed by JonChesterfield). Repository: rG LLVM Github Mono

[PATCH] D104904: [OpenMP][AMDGCN] Initial math headers support

2021-07-21 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. Landed on @pdhaliwal's behalf. My expectation is that this patch mostly works and the rough edges can be cleaned up once ocml is linked in and we can more easily run more applications through it. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION

[PATCH] D104904: [OpenMP][AMDGCN] Initial math headers support

2021-07-21 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. Thanks! Will take a look. Feel free to revert, I'll do so shortly if noone beats me to it Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D104904/new/ https://reviews.llvm.org/D104904

[PATCH] D104904: [OpenMP][AMDGCN] Initial math headers support

2021-07-21 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. cstdlib test header contains // amdgcn already provides definition of fabs #ifndef __AMDGCN__ float fabs(float __x) { return __builtin_fabs(__x); } #endif If I delete or invert the ifndef > $HOME/llvm-build/llvm/lib/clang/13.0.0/include/__clang_hip_cmath

[PATCH] D106542: [OPENMP]Fix PR49787: Codegen for calling __tgt_target_teams_nowait_mapper has too few arguments.

2021-07-22 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. Nice, thanks Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D106542/new/ https://reviews.llvm.org/D106542 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://li

[PATCH] D106793: [OpenMP] Add a driver flag to enable the new device runtime library

2021-07-26 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. Can we call this something other than new? We don't tend to remove command line arguments and this won't make much sense once it's the only runtime. I'd be inclined to add an argument called 'use_legacy_runtime' or similar, which defaults to true Repository:

[PATCH] D106793: [OpenMP] Add a driver flag to enable the new device runtime library

2021-07-26 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added inline comments. Comment at: clang/lib/Driver/ToolChains/AMDGPUOpenMP.cpp:229 + options::OPT_fno_openmp_target_new_runtime, false)) +BitcodeSuffix = "new-amdgcn-" + GPUArch; + else Likewise here, how about amdgcn

[PATCH] D106870: [OpenMP] Multi architecture compilation support

2021-07-27 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. There seems to be a bunch of different things in this patch. There's some driver plumbing to compile for more than one arch (presumably by calling the target compiler N times). That's a great feature, I want to build an application bthat can run on nvptx or amdg

[PATCH] D105981: [AMDGPU][OpenMP] Support linking of math libraries

2021-07-28 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. I don't like that this pulls in ockl automatically but don't think that's a blocker. OK on my side, @yaxunl? Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D105981/new/ https://reviews.llvm.org/D105981 _

[PATCH] D105981: [AMDGPU][OpenMP] Support linking of math libraries

2021-07-28 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added inline comments. Comment at: clang/lib/Driver/ToolChains/AMDGPU.cpp:831-860 auto Kind = llvm::AMDGPU::parseArchAMDGCN(GpuArch); const StringRef CanonArch = llvm::AMDGPU::getArchNameAMDGCN(Kind); std::string LibDeviceFile = RocmInstallation.getLibDev

  1   2   3   4   5   6   7   8   9   >