from:"Jon Chesterfield via cfe\-commits"

[clang] [llvm] [Transforms] Implement always_specialize attribute lowering (PR #143983)

2025-06-13 Thread Jon Chesterfield via cfe-commits

JonChesterfield wrote: A user perspective is useful here, thanks for stopping by! The docs might be too concise. There's a nice example in the discourse thread, ``` void qsort( void* ptr, size_t count, size_t __attribute__((always_specialize)) size, int (* __attribute__((always_specialize

[clang] [llvm] [Transforms] Implement always_specialize attribute lowering (PR #143983)

2025-06-12 Thread Jon Chesterfield via cfe-commits

JonChesterfield wrote: Discourse RFC is at https://github.com/llvm/llvm-project/pull/143983 https://github.com/llvm/llvm-project/pull/143983 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-com

[clang] [llvm] [Transforms] Implement always_specialize attribute lowering (PR #143983)

2025-06-12 Thread Jon Chesterfield via cfe-commits

https://github.com/JonChesterfield created https://github.com/llvm/llvm-project/pull/143983 Implement an attribute in the spirit of always_inline, for giving programmers a hook to have llvm specialise subtrees of their program with respect to constant variables. For example, specialise a sort

[clang] [llvm] [mlir] [AMDGPU] Add a new amdgcn.load.to.lds intrinsic (PR #137425)

2025-05-22 Thread Jon Chesterfield via cfe-commits

JonChesterfield wrote: I think we could do with an additional overload here. Currently a bunch of code (notably CK but probably elsewhere) uses the v4i32 version of the LDS intrinsics. I think this patch lets one use the addrspace(7) pointer of 128 bits alternative. So callers could transform

[clang] [llvm] [clang][amdgpu] Add builtins for raw/struct buffer lds load (PR #137678)

2025-05-01 Thread Jon Chesterfield via cfe-commits

JonChesterfield wrote: There's some existing builtin stuff going on in this area, e.g. https://github.com/llvm/llvm-project/pull/138141/commits/96e94b5662c613fd80f712080751076254a73524 The use case in CK is behind a hip header file, so could become static inline functions renaming a builtin, o

[clang] [llvm] [mlir] [Sema] Fix bug in builtin AS override (PR #138141)

2025-05-01 Thread Jon Chesterfield via cfe-commits

JonChesterfield wrote: The patch to SemaExpr looks reasonable to me. I'd suggest that goes in separate from the amdgpu intrinsic stuff. I'd test this by tweaking the code to do the current lowering _and_ the proposed and check that they do exactly the same thing on all the existing builtins,

[clang] [llvm] [clang][amdgpu] Add builtins for raw/struct buffer lds load (PR #137678)

2025-04-29 Thread Jon Chesterfield via cfe-commits

@@ -163,7 +163,10 @@ BUILTIN(__builtin_amdgcn_raw_buffer_load_b64, "V2UiQbiiIi", "n") BUILTIN(__builtin_amdgcn_raw_buffer_load_b96, "V3UiQbiiIi", "n") BUILTIN(__builtin_amdgcn_raw_buffer_load_b128, "V4UiQbiiIi", "n") +TARGET_BUILTIN(__builtin_amdgcn_raw_buffer_load_lds, "vV4U

[clang] [llvm] [clang][amdgpu] Add builtins for raw/struct buffer lds load (PR #137678)

2025-04-29 Thread Jon Chesterfield via cfe-commits

@@ -0,0 +1,12 @@ +// RUN: %clang_cc1 -triple amdgcn-unknown-unknown -target-cpu tahiti -S -verify -o - %s +// RUN: %clang_cc1 -triple amdgcn-unknown-unknown -target-cpu bonaire -S -verify -o - %s +// RUN: %clang_cc1 -triple amdgcn-unknown-unknown -target-cpu carrizo -S -verify

[clang] [llvm] [clang][amdgpu] Add builtins for raw/struct buffer lds load (PR #137678)

2025-04-29 Thread Jon Chesterfield via cfe-commits

@@ -163,7 +163,10 @@ BUILTIN(__builtin_amdgcn_raw_buffer_load_b64, "V2UiQbiiIi", "n") BUILTIN(__builtin_amdgcn_raw_buffer_load_b96, "V3UiQbiiIi", "n") BUILTIN(__builtin_amdgcn_raw_buffer_load_b128, "V4UiQbiiIi", "n") +TARGET_BUILTIN(__builtin_amdgcn_raw_buffer_load_lds, "vV4U

[clang] [llvm] [clang][amdgpu] Add builtins for raw/struct buffer lds load (PR #137678)

2025-04-29 Thread Jon Chesterfield via cfe-commits

@@ -0,0 +1,12 @@ +// RUN: %clang_cc1 -triple amdgcn-unknown-unknown -target-cpu tahiti -S -verify -o - %s +// RUN: %clang_cc1 -triple amdgcn-unknown-unknown -target-cpu bonaire -S -verify -o - %s +// RUN: %clang_cc1 -triple amdgcn-unknown-unknown -target-cpu carrizo -S -verify

[clang] [llvm] [clang][amdgpu] Add builtins for raw/struct buffer lds load (PR #137678)

2025-04-28 Thread Jon Chesterfield via cfe-commits

@@ -0,0 +1,12 @@ +// RUN: %clang_cc1 -triple amdgcn-unknown-unknown -target-cpu tahiti -S -verify -o - %s +// RUN: %clang_cc1 -triple amdgcn-unknown-unknown -target-cpu bonaire -S -verify -o - %s +// RUN: %clang_cc1 -triple amdgcn-unknown-unknown -target-cpu carrizo -S -verify

[clang] [llvm] [clang][amdgpu] Add builtins for raw/struct buffer lds load (PR #137678)

2025-04-28 Thread Jon Chesterfield via cfe-commits

https://github.com/JonChesterfield created https://github.com/llvm/llvm-project/pull/137678 We have a clang builtin for one of four very similar IR intrinsics. This patch adds builtins for the other three. IR intrinsics introduced in https://reviews.llvm.org/D124884. The request from composab

[clang] [Headers] Create stub spirvintrin.h (PR #131164)

2025-04-05 Thread Jon Chesterfield via cfe-commits

https://github.com/JonChesterfield updated https://github.com/llvm/llvm-project/pull/131164 >From 092024bbf31b0677e6efbb0e6fc0cee7606ecb08 Mon Sep 17 00:00:00 2001 From: Jon Chesterfield Date: Tue, 18 Mar 2025 15:57:02 + Subject: [PATCH] [Headers] Implement spirvamdgcnintrin.h --- clang/l

[clang] [llvm] [SPIRV] GPU intrinsics (PR #131190)

2025-04-05 Thread Jon Chesterfield via cfe-commits

https://github.com/JonChesterfield updated https://github.com/llvm/llvm-project/pull/131190 >From e3d1c0d0f430a96e26c68e22ab53dc2fa4a14e47 Mon Sep 17 00:00:00 2001 From: Jon Chesterfield Date: Wed, 12 Mar 2025 20:55:17 + Subject: [PATCH] [SPIRV] GPU intrinsics --- clang/include/clang/Basi

[clang] [Headers] Implement spirvamdgcnintrin.h (PR #131164)

2025-03-18 Thread Jon Chesterfield via cfe-commits

JonChesterfield wrote: We don't need a way to call the builtins. See for example this pull request. Intel have done the spirv64-intel- thing similar to the spirv64-amd-amdhsa so they can (presumably?) use whatever intrinsics they like, using another header quite like this one. https://github.

[clang] [Clang] Permit `-Xarch_` to be used with `--offload-arch` (PR #131884)

2025-03-18 Thread Jon Chesterfield via cfe-commits

https://github.com/JonChesterfield edited https://github.com/llvm/llvm-project/pull/131884 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [Headers] Implement spirvamdgcnintrin.h (PR #131164)

2025-03-18 Thread Jon Chesterfield via cfe-commits

https://github.com/JonChesterfield updated https://github.com/llvm/llvm-project/pull/131164 >From 402a091ac6eac8a50ce54a519acce5bfa4de1c88 Mon Sep 17 00:00:00 2001 From: Jon Chesterfield Date: Tue, 18 Mar 2025 15:57:02 + Subject: [PATCH] [Headers] Implement spirvamdgcnintrin.h --- clang/l

[clang] [Headers] Implement spirvamdgcnintrin.h (PR #131164)

2025-03-18 Thread Jon Chesterfield via cfe-commits

https://github.com/JonChesterfield edited https://github.com/llvm/llvm-project/pull/131164 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [Headers] Implement spirvamdgcnintrin.h (PR #131164)

2025-03-18 Thread Jon Chesterfield via cfe-commits

JonChesterfield wrote: @sarnex I'm deeply sorry about this sequence of events. The single spirv64 header that lowered to intrinsics that amdgpu or intel map onto their own world would have removed a swathe of spurious variation. What we're going to have to do in the interim is have spirvamdgcn

[clang] [Headers] Implement spirvamdgcnintrin.h (PR #131164)

2025-03-18 Thread Jon Chesterfield via cfe-commits

https://github.com/JonChesterfield edited https://github.com/llvm/llvm-project/pull/131164 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [Headers] Create stub spirvintrin.h (PR #131164)

2025-03-17 Thread Jon Chesterfield via cfe-commits

JonChesterfield wrote: OK, we can't have spirv64-unknown-unknown at this point. But we could get this compiling using a mixture of spirv intrinsics (where they exist) and amdgpu intrinsics for spirv64-amd-amdhsa, leaving a preprocessor error for other targets. I'll take a stab at that. https:

[clang] [Headers] Create stub spirvintrin.h (PR #131164)

2025-03-17 Thread Jon Chesterfield via cfe-commits

https://github.com/JonChesterfield updated https://github.com/llvm/llvm-project/pull/131164 >From d91671fdbb2aa9204f728747009459373bfd6553 Mon Sep 17 00:00:00 2001 From: Jon Chesterfield Date: Thu, 13 Mar 2025 15:44:52 + Subject: [PATCH 1/2] [Headers] Create stub spirintrin.h --- clang/li

[clang] [Headers] Create stub spirvintrin.h (PR #131164)

2025-03-17 Thread Jon Chesterfield via cfe-commits

https://github.com/JonChesterfield updated https://github.com/llvm/llvm-project/pull/131164 >From d91671fdbb2aa9204f728747009459373bfd6553 Mon Sep 17 00:00:00 2001 From: Jon Chesterfield Date: Thu, 13 Mar 2025 15:44:52 + Subject: [PATCH 1/2] [Headers] Create stub spirintrin.h --- clang/li

[clang] [Headers] Create stub spirvintrin.h (PR #131164)

2025-03-17 Thread Jon Chesterfield via cfe-commits

JonChesterfield wrote: Well it's not pretty, but spirv64-amd-amdhsa sets both __AMDGPU__ and __SPIRV64__ macros. Added a commit with an example that dispatches to amdgpu intrinsics on the happy path and preprocessor error otherwise. If you let that get to the spirv backend it falls over with `

[clang] [Headers] Create stub spirvintrin.h (PR #131164)

2025-03-17 Thread Jon Chesterfield via cfe-commits

JonChesterfield wrote: The utility is having a place to fill things in incrementally as we get them working, and thus use libc to drive the implementation of enough of spirv to get things working. If you decline to have any spirv code until everything is working I'll have to do the testing som

[clang] [llvm] [SPIRV] GPU intrinsics (PR #131190)

2025-03-17 Thread Jon Chesterfield via cfe-commits

@@ -0,0 +1,501 @@ +//===- LowerGPUIntrinsic.cpp --===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Ap

[clang] [Headers] Create stub spirvintrin.h (PR #131164)

2025-03-17 Thread Jon Chesterfield via cfe-commits

JonChesterfield wrote: @jhuber6 I think we should have this despite the rejected https://github.com/llvm/llvm-project/pull/131190. Maybe we'll get some clang builtins for spirv. Otherwise some things can be done with the asm label hack. Some can just be left as nop, e.g. a suspend that does n

[clang] [llvm] [SPIRV] GPU intrinsics (PR #131190)

2025-03-17 Thread Jon Chesterfield via cfe-commits

JonChesterfield wrote: Patch opposed internally. Will see if I can think of an alternative. https://github.com/llvm/llvm-project/pull/131190 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-com

[clang] [llvm] [SPIRV] GPU intrinsics (PR #131190)

2025-03-17 Thread Jon Chesterfield via cfe-commits

https://github.com/JonChesterfield closed https://github.com/llvm/llvm-project/pull/131190 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [llvm] [SPIRV] GPU intrinsics (PR #131190)

2025-03-17 Thread Jon Chesterfield via cfe-commits

JonChesterfield wrote: I've dropped the clang headers part from this patch, rewritten to elide the grid intrinsic and moved the test. Matt suggested llvm.simt as the prefix which I like very much more than llvm.gpu but the rename is probably going to take a moment. Turns out I do need to upda

[clang] [llvm] [SPIRV] GPU intrinsics (PR #131190)

2025-03-17 Thread Jon Chesterfield via cfe-commits

@@ -0,0 +1,501 @@ +//===- LowerGPUIntrinsic.cpp --===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Ap

[clang] [llvm] [SPIRV] GPU intrinsics (PR #131190)

2025-03-17 Thread Jon Chesterfield via cfe-commits

@@ -0,0 +1,501 @@ +//===- LowerGPUIntrinsic.cpp --===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Ap

[clang] [llvm] [SPIRV] GPU intrinsics (PR #131190)

2025-03-17 Thread Jon Chesterfield via cfe-commits

https://github.com/JonChesterfield updated https://github.com/llvm/llvm-project/pull/131190 >From 65dcf7cb54156ec9623c2f04f1d70b479b4623b5 Mon Sep 17 00:00:00 2001 From: Jon Chesterfield Date: Wed, 12 Mar 2025 20:55:17 + Subject: [PATCH] [SPIRV] GPU intrinsics --- clang/include/clang/Basi

[clang] [llvm] [SPIRV] GPU intrinsics (PR #131190)

2025-03-17 Thread Jon Chesterfield via cfe-commits

@@ -0,0 +1,427 @@ +; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --check-attributes +; RUN: opt -S -mtriple=amdgcn-- -passes=lower-gpu-intrinsic < %s | FileCheck %s --check-prefix=AMDGCN JonChesterfield wrote: I don't love

[clang] [llvm] [SPIRV] GPU intrinsics (PR #131190)

2025-03-17 Thread Jon Chesterfield via cfe-commits

@@ -0,0 +1,501 @@ +//===- LowerGPUIntrinsic.cpp --===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Ap

[clang] [llvm] [SPIRV] GPU intrinsics (PR #131190)

2025-03-17 Thread Jon Chesterfield via cfe-commits

JonChesterfield wrote: This patch probably does too many things and it's implemented in a non-conventional fashion. I think that does a decent job of having something to upset most people that have looked at it. I note that reviewers are primarily complaining about disjoint aspects of the patc

[clang] [llvm] [SPIRV] GPU intrinsics (PR #131190)

2025-03-17 Thread Jon Chesterfield via cfe-commits

@@ -0,0 +1,501 @@ +//===- LowerGPUIntrinsic.cpp --===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Ap

[clang] [llvm] [SPIRV] GPU intrinsics (PR #131190)

2025-03-17 Thread Jon Chesterfield via cfe-commits

@@ -0,0 +1,501 @@ +//===- LowerGPUIntrinsic.cpp --===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Ap

[clang] [llvm] [SPIRV] GPU intrinsics (PR #131190)

2025-03-17 Thread Jon Chesterfield via cfe-commits

@@ -2861,6 +2861,69 @@ def int_experimental_convergence_anchor def int_experimental_convergence_loop : DefaultAttrsIntrinsic<[llvm_token_ty], [], [IntrNoMem, IntrConvergent]>; +//===--- GPU Intrinsics ---===// + +class GPUIntr

[clang] [llvm] [SPIRV] GPU intrinsics (PR #131190)

2025-03-17 Thread Jon Chesterfield via cfe-commits

@@ -150,6 +150,8 @@ defm int_amdgcn_workitem_id : AMDGPUReadPreloadRegisterIntrinsic_xyz; defm int_amdgcn_workgroup_id : AMDGPUReadPreloadRegisterIntrinsic_xyz_named <"__builtin_amdgcn_workgroup_id">; +defm int_amdgcn_grid_size : AMDGPUReadPrelo

[clang] [libc][nfc] Steps to allow sharing code between gpu intrin.h headers (PR #131134)

2025-03-15 Thread Jon Chesterfield via cfe-commits

https://github.com/JonChesterfield updated https://github.com/llvm/llvm-project/pull/131134 >From 0466c31d1e0b10aa2d2352bb6befd36eb5306f9c Mon Sep 17 00:00:00 2001 From: Jon Chesterfield Date: Thu, 13 Mar 2025 12:49:42 + Subject: [PATCH] [libc][nfc] Steps to allow sharing code between gpu i

[clang] [llvm] [SPIRV] GPU intrinsics (PR #131190)

2025-03-13 Thread Jon Chesterfield via cfe-commits

JonChesterfield wrote: If the name llvm.gpu is a stumbling block, how about llvm.offload? Will add JD to the reviewers https://github.com/llvm/llvm-project/pull/131190 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-

[clang] [llvm] [SPIRV] GPU intrinsics (PR #131190)

2025-03-13 Thread Jon Chesterfield via cfe-commits

JonChesterfield wrote: > I'm not convinced we should do this, as I have a bunch of concerns: > > * it's intrusive and duplicates work already done by > [libclc](https://github.com/llvm/llvm-project/tree/main/libclc); Also compiler-rt. Also gpuintrin.h. Also openmp's devicertl. All the lang

[clang] [llvm] [SPIRV] GPU intrinsics (PR #131190)

2025-03-13 Thread Jon Chesterfield via cfe-commits

@@ -0,0 +1,501 @@ +//===- LowerGPUIntrinsic.cpp --===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Ap

[clang] [llvm] [SPIRV] GPU intrinsics (PR #131190)

2025-03-13 Thread Jon Chesterfield via cfe-commits

https://github.com/JonChesterfield edited https://github.com/llvm/llvm-project/pull/131190 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [Headers] Create stub spirvintrin.h (PR #131164)

2025-03-13 Thread Jon Chesterfield via cfe-commits

JonChesterfield wrote: Effectively a subset of https://github.com/llvm/llvm-project/pull/131190/, I'd still like to land this and rebase 131190 on the grounds of signal to noise ratio. https://github.com/llvm/llvm-project/pull/131164 ___ cfe-commits

[clang] [llvm] [SPIRV] GPU intrinsics (PR #131190)

2025-03-13 Thread Jon Chesterfield via cfe-commits

https://github.com/JonChesterfield updated https://github.com/llvm/llvm-project/pull/131190 >From b52a04c55ad56e1172dec6262f2536ec3fe7162b Mon Sep 17 00:00:00 2001 From: Jon Chesterfield Date: Wed, 12 Mar 2025 20:55:17 + Subject: [PATCH] [SPIRV] GPU intrinsics --- clang/include/clang/Basi

[clang] [llvm] [SPIRV] GPU intrinsics (PR #131190)

2025-03-13 Thread Jon Chesterfield via cfe-commits

https://github.com/JonChesterfield created https://github.com/llvm/llvm-project/pull/131190 Introduce __builtin_gpu builtins to clang and corresponding llvm.gpu intrinsics in llvm for abstracting over minor differences between GPU architectures, and use those to implement a gpuintrin.h instant

[clang] [Headers] Create stub spirv64intrin.h (PR #131164)

2025-03-13 Thread Jon Chesterfield via cfe-commits

JonChesterfield wrote: > Name should be `spirvintrin.h` but if we can't support 32 for now just error > in the preprocessor. Yep, you're right. It'll be caught by only checking for the SPIRV64 macro, but nothing in this file is 32 vs 64 bit dependent as that's part of what gpuintrin.h gives u

[clang] [Headers] Create stub spirvintrin.h (PR #131164)

2025-03-13 Thread Jon Chesterfield via cfe-commits

https://github.com/JonChesterfield updated https://github.com/llvm/llvm-project/pull/131164 >From d91671fdbb2aa9204f728747009459373bfd6553 Mon Sep 17 00:00:00 2001 From: Jon Chesterfield Date: Thu, 13 Mar 2025 15:44:52 + Subject: [PATCH] [Headers] Create stub spirintrin.h --- clang/lib/He

[clang] [Headers] Create stub spirvintrin.h (PR #131164)

2025-03-13 Thread Jon Chesterfield via cfe-commits

https://github.com/JonChesterfield edited https://github.com/llvm/llvm-project/pull/131164 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [Headers] Create stub spirv64intrin.h (PR #131164)

2025-03-13 Thread Jon Chesterfield via cfe-commits

https://github.com/JonChesterfield updated https://github.com/llvm/llvm-project/pull/131164 >From be94c9af7eaa8bc05ac9bdb80dadc285575c1472 Mon Sep 17 00:00:00 2001 From: Jon Chesterfield Date: Thu, 13 Mar 2025 15:44:52 + Subject: [PATCH] [Headers] Create stub spirv64intrin.h --- clang/lib

[clang] [Headers] Create stub spirv64intrin.h (PR #131164)

2025-03-13 Thread Jon Chesterfield via cfe-commits

JonChesterfield wrote: Clang raises a lot of exciting errors for spirv-- about vulcan environments and I don't really know the distinction between the two - if 32 bit spirv turns out to be a workable thing it should go down the same code path, with `ifdef SPIRV || SPIRV64` and a file rename. I

[clang] [Headers] Create stub spirv64intrin.h (PR #131164)

2025-03-13 Thread Jon Chesterfield via cfe-commits

https://github.com/JonChesterfield created https://github.com/llvm/llvm-project/pull/131164 Structure follows amdgcnintrin.h but with declarations where compiler intrinsics are not yet available. Address space numbers, kernel attribute, checking how it interacts with openmp are left for later

[clang] [Headers][NFC] Deduplicate gpu_match_ between targets via inlining (PR #131141)

2025-03-13 Thread Jon Chesterfield via cfe-commits

https://github.com/JonChesterfield updated https://github.com/llvm/llvm-project/pull/131141 >From fbeb177a750ca671a9cff9f37f57e58c6900e7fd Mon Sep 17 00:00:00 2001 From: Jon Chesterfield Date: Thu, 13 Mar 2025 13:23:38 + Subject: [PATCH] [Headers][NFC] Deduplicate gpu_match_ between targets

[clang] [Headers][NFC] Deduplicate gpu_match_ between targets via inlining (PR #131141)

2025-03-13 Thread Jon Chesterfield via cfe-commits

https://github.com/JonChesterfield closed https://github.com/llvm/llvm-project/pull/131141 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [Headers][NFC] Deduplicate gpu_match_ between targets via inlining (PR #131141)

2025-03-13 Thread Jon Chesterfield via cfe-commits

@@ -32,6 +32,31 @@ _Pragma("push_macro(\"bool\")"); #define bool _Bool #endif +_Pragma("omp begin declare target device_type(nohost)"); +_Pragma("omp begin declare variant match(device = {kind(gpu)})"); + +// Forward declare a few functions for the implementation header. + +//

[clang] [libc][nfc] Steps to allow sharing code between gpu intrin.h headers (PR #131134)

2025-03-13 Thread Jon Chesterfield via cfe-commits

https://github.com/JonChesterfield edited https://github.com/llvm/llvm-project/pull/131134 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [Headers][NFC] Deduplicate gpu_match_any between targets (PR #131141)

2025-03-13 Thread Jon Chesterfield via cfe-commits

https://github.com/JonChesterfield edited https://github.com/llvm/llvm-project/pull/131141 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [Headers][NFC] Steps to allow sharing code between gpu intrin.h headers (PR #131134)

2025-03-13 Thread Jon Chesterfield via cfe-commits

https://github.com/JonChesterfield updated https://github.com/llvm/llvm-project/pull/131134 >From 7347ebc6a0aadd1b9676e329bdf7705dbfae7875 Mon Sep 17 00:00:00 2001 From: Jon Chesterfield Date: Thu, 13 Mar 2025 12:49:42 + Subject: [PATCH] [libc][nfc] Steps to allow sharing code between gpu i

[clang] [Headers][NFC] Deduplicate gpu_match_ between targets via inlining (PR #131141)

2025-03-13 Thread Jon Chesterfield via cfe-commits

https://github.com/JonChesterfield updated https://github.com/llvm/llvm-project/pull/131141 >From 5e55b829eb3c7f4a4e674333cdde73b5bfe970f8 Mon Sep 17 00:00:00 2001 From: Jon Chesterfield Date: Thu, 13 Mar 2025 13:23:38 + Subject: [PATCH] [Headers][NFC] Deduplicate gpu_match_ between targets

[clang] [Headers][NFC] Steps to allow sharing code between gpu intrin.h headers (PR #131134)

2025-03-13 Thread Jon Chesterfield via cfe-commits

https://github.com/JonChesterfield closed https://github.com/llvm/llvm-project/pull/131134 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [Headers][NFC] Deduplicate gpu_match_ between targets via inlining (PR #131141)

2025-03-13 Thread Jon Chesterfield via cfe-commits

https://github.com/JonChesterfield edited https://github.com/llvm/llvm-project/pull/131141 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [Headers][NFC] Deduplicate gpu_match_ between targets via inlining (PR #131141)

2025-03-13 Thread Jon Chesterfield via cfe-commits

https://github.com/JonChesterfield edited https://github.com/llvm/llvm-project/pull/131141 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [Headers][NFC] Deduplicate gpu_match_any between targets (PR #131141)

2025-03-13 Thread Jon Chesterfield via cfe-commits

https://github.com/JonChesterfield updated https://github.com/llvm/llvm-project/pull/131141 >From cde4232ed28eed2b0c0c1cb11815b5a4317345b6 Mon Sep 17 00:00:00 2001 From: Jon Chesterfield Date: Thu, 13 Mar 2025 13:23:38 + Subject: [PATCH] [Headers][NFC] Deduplicate gpu_match_any between targ

[clang] [Headers][NFC] Deduplicate gpu_match_any between targets (PR #131141)

2025-03-13 Thread Jon Chesterfield via cfe-commits

https://github.com/JonChesterfield updated https://github.com/llvm/llvm-project/pull/131141 >From 28cb801f73e6886eacfd5cdbcd17abb68b6dd947 Mon Sep 17 00:00:00 2001 From: Jon Chesterfield Date: Thu, 13 Mar 2025 13:23:38 + Subject: [PATCH] [Headers][NFC] Deduplicate gpu_match_any between targ

[clang] [Headers][NFC] Deduplicate gpu_match_any between targets (PR #131141)

2025-03-13 Thread Jon Chesterfield via cfe-commits

https://github.com/JonChesterfield updated https://github.com/llvm/llvm-project/pull/131141 >From e1456be61130ea0ea006472990c7d294b8a32c03 Mon Sep 17 00:00:00 2001 From: Jon Chesterfield Date: Thu, 13 Mar 2025 13:23:38 + Subject: [PATCH] [Headers][NFC] Deduplicate gpu_match_any between targ

[clang] [Headers][NFC] Deduplicate gpu_match_any between targets (PR #131141)

2025-03-13 Thread Jon Chesterfield via cfe-commits

https://github.com/JonChesterfield updated https://github.com/llvm/llvm-project/pull/131141 >From 42253295a3b11b4303e15c3455047e3bfc5d196a Mon Sep 17 00:00:00 2001 From: Jon Chesterfield Date: Thu, 13 Mar 2025 13:23:38 + Subject: [PATCH] [Headers][NFC] Deduplicate gpu_match_any between targ

[clang] [Headers][NFC] Deduplicate gpu_match_any between targets (PR #131141)

2025-03-13 Thread Jon Chesterfield via cfe-commits

https://github.com/JonChesterfield created https://github.com/llvm/llvm-project/pull/131141 Declare a few functions before including the target specific headers then define a fallback_match_any, used by amdgpu and by older nvptx. >From b9fdef141a83969eff8e7ac2dbc8c98163c0fbf5 Mon Sep 17 00:00:

[clang] [Headers][NFC] Steps to allow sharing code between gpu intrin.h headers (PR #131134)

2025-03-13 Thread Jon Chesterfield via cfe-commits

https://github.com/JonChesterfield edited https://github.com/llvm/llvm-project/pull/131134 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [Headers][NFC] Steps to allow sharing code between gpu intrin.h headers (PR #131134)

2025-03-13 Thread Jon Chesterfield via cfe-commits

https://github.com/JonChesterfield edited https://github.com/llvm/llvm-project/pull/131134 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [libc][nfc] Steps to allow sharing code between gpu intrin.h headers (PR #131134)

2025-03-13 Thread Jon Chesterfield via cfe-commits

https://github.com/JonChesterfield created https://github.com/llvm/llvm-project/pull/131134 Adds macro guards to warn if the implementation headers are included directly as part of dropping the need for them to be standalone. I'd like to declare functions before the include but it might be be

[clang] [Headers][NFC] Steps to allow sharing code between gpu intrin.h headers (PR #131134)

2025-03-13 Thread Jon Chesterfield via cfe-commits

https://github.com/JonChesterfield updated https://github.com/llvm/llvm-project/pull/131134 >From 4c04f6979409642eb6bc9dc3c48b5e3636210ef0 Mon Sep 17 00:00:00 2001 From: Jon Chesterfield Date: Thu, 13 Mar 2025 12:49:42 + Subject: [PATCH] [libc][nfc] Steps to allow sharing code between gpu i

[clang] [libc][nfc] Steps to allow sharing code between gpu intrin.h headers (PR #131134)

2025-03-13 Thread Jon Chesterfield via cfe-commits

@@ -13,11 +13,8 @@ #error "This file is intended for AMDGPU targets or offloading to AMDGPU" #endif -#include - -#if !defined(__cplusplus) -_Pragma("push_macro(\"bool\")"); -#define bool _Bool +#ifndef __GPUINTRIN_H +#warning "This file is intended as an implementation detail

[clang] [libc][nfc] Steps to allow sharing code between gpu intrin.h headers (PR #131134)

2025-03-13 Thread Jon Chesterfield via cfe-commits

@@ -263,8 +256,4 @@ _DEFAULT_FN_ATTRS static __inline__ void __gpu_thread_suspend(void) { _Pragma("omp end declare variant"); _Pragma("omp end declare target"); -#if !defined(__cplusplus) -_Pragma("pop_macro(\"bool\")"); JonChesterfield wrote: Up into gpuint

[clang] [libc][nfc] Steps to allow sharing code between gpu intrin.h headers (PR #131134)

2025-03-13 Thread Jon Chesterfield via cfe-commits

https://github.com/JonChesterfield updated https://github.com/llvm/llvm-project/pull/131134 >From 7347ebc6a0aadd1b9676e329bdf7705dbfae7875 Mon Sep 17 00:00:00 2001 From: Jon Chesterfield Date: Thu, 13 Mar 2025 12:49:42 + Subject: [PATCH] [libc][nfc] Steps to allow sharing code between gpu i

[clang] [libc][nfc] Use common implementation of read_first_lane_u64 (PR #131027)

2025-03-12 Thread Jon Chesterfield via cfe-commits

https://github.com/JonChesterfield closed https://github.com/llvm/llvm-project/pull/131027 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [libc][nfc] Use common implementation of read_first_lane_u64 (PR #131027)

2025-03-12 Thread Jon Chesterfield via cfe-commits

https://github.com/JonChesterfield updated https://github.com/llvm/llvm-project/pull/131027 >From 68f09d0f3f7849b91cb39ce42ba48e3e4aafb488 Mon Sep 17 00:00:00 2001 From: Jon Chesterfield Date: Wed, 12 Mar 2025 20:31:39 + Subject: [PATCH] [libc][nfc] Use common implementation of read_first_l

[clang] [libc][nfc] Use common implementation of read_first_lane_u64 (PR #131027)

2025-03-12 Thread Jon Chesterfield via cfe-commits

https://github.com/JonChesterfield created https://github.com/llvm/llvm-project/pull/131027 No codegen regression on either target. The two builtin_ffs implied on nvptx CSE away. ``` define internal i64 @__gpu_read_first_lane_u64(i64 noundef %__lane_mask, i64 noundef %__x) #2 { entry: %shr

[clang] [libc][nfc] Include instantiations of gpuintrin.h in IR test case (PR #130956)

2025-03-12 Thread Jon Chesterfield via cfe-commits

JonChesterfield wrote: Yep. I'm looking at changing their implementation and want a before&after shot in the git diff for the upcoming review. If that doesn't pan out, still good to get a heads up if codegen for these changes on us unexpectedly. https://github.com/llvm/llvm-project/pull/130956

[clang] [libc][nfc] Include instantiations of gpuintrin.h in IR test case (PR #130956)

2025-03-12 Thread Jon Chesterfield via cfe-commits

https://github.com/JonChesterfield closed https://github.com/llvm/llvm-project/pull/130956 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [libc][nfc] Include instantiations of gpuintrin.h in IR test case (PR #130956)

2025-03-12 Thread Jon Chesterfield via cfe-commits

https://github.com/JonChesterfield created https://github.com/llvm/llvm-project/pull/130956 Regenerated existing test case with include-generated-funcs to show the lowered IR for each instantiation. >From 4ec726e4fcf5ab0b03f3942e42a4dbde1a6f43a4 Mon Sep 17 00:00:00 2001 From: Jon Chesterfield

[clang] [spirv][amdgpu] Set atomic size in the clang target info (PR #128569)

2025-02-25 Thread Jon Chesterfield via cfe-commits

https://github.com/JonChesterfield closed https://github.com/llvm/llvm-project/pull/128569 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [spirv][amdgpu] Set atomic size in the clang target info (PR #128569)

2025-02-24 Thread Jon Chesterfield via cfe-commits

https://github.com/JonChesterfield created https://github.com/llvm/llvm-project/pull/128569 Problem identified by Joseph. The openmp device runtime uses __scoped_atomic_load_n and similar which presently hit ``` error: large atomic operation may incur significant performance penalty; th

[clang] [AMDGPU] Use the AMDGPUToolChain when targeting C/C++ directly (PR #99687)

2025-01-27 Thread Jon Chesterfield via cfe-commits

https://github.com/JonChesterfield approved this pull request. This is the right thing. amdgpu is completely usable without rocm libraries and already has out of tree users doing that. Needing to manually opt out of rocm libs when not using any of rocm is definitely annoying, especially in cont

[clang] [Clang][AMDGPU] Stop defaulting to `one-as` for all atomic scopes (PR #120095)

2024-12-17 Thread Jon Chesterfield via cfe-commits

JonChesterfield wrote: Stringing the pieces together, we may have been conflating opencl-the-language with opencl-the-implementation. Let's go with the first line of attack, no language special casing here, no checking seq-cst and appending one-as. Opencl the implementation won't care because

[clang] [Clang][AMDGPU] Stop defaulting to `one-as` for all atomic scopes (PR #120095)

2024-12-16 Thread Jon Chesterfield via cfe-commits

JonChesterfield wrote: > This one-as business seems like it's cruft from before MMRAs. Can we rip them > out and replace them with MMRAs for OpenCL? https://llvm.org/docs/MemoryModelRelaxationAnnotations.html calls out the opencl fence as a motivating example which suggests either yes, or we s

[clang] [Clang][AMDGPU] Stop defaulting to `one-as` for all atomic scopes (PR #120095)

2024-12-16 Thread Jon Chesterfield via cfe-commits

https://github.com/JonChesterfield requested changes to this pull request. "You need to leave a comment indicating the requested changes." https://github.com/llvm/llvm-project/pull/120095 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://

[clang] [Clang][AMDGPU] Stop defaulting to `one-as` for all atomic scopes (PR #120095)

2024-12-16 Thread Jon Chesterfield via cfe-commits

JonChesterfield wrote: @b-sumner has useful context on this. I won't paraphrase, but it sounds like the block deleted here has the right semantics for opencl, where "seqcst" has some special meaning and generally the semantics don't totally make sense to me. Suggest we amend this to "if opencl

[clang] [Clang][AMDGPU] Stop defaulting to `one-as` for all atomic scopes (PR #120095)

2024-12-16 Thread Jon Chesterfield via cfe-commits

https://github.com/JonChesterfield edited https://github.com/llvm/llvm-project/pull/120095 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [Clang][AMDGPU] Stop defaulting to `one-as` for all atomic scopes (PR #120095)

2024-12-16 Thread Jon Chesterfield via cfe-commits

https://github.com/JonChesterfield approved this pull request. Explicitly marking green, even if this commit upsets something else in the backend having the concurrency primitives default to racy is clearly bad. https://github.com/llvm/llvm-project/pull/120095 __

[clang] [Clang][AMDGPU] Stop defaulting to `one-as` for all atomic scopes (PR #120095)

2024-12-16 Thread Jon Chesterfield via cfe-commits

JonChesterfield wrote: I would say this change is obviously correct, but I can't see why it was introduced and vaguely fear tripping over abhorrent behaviour in the backend. Can you send this down the internal CI pipeline to pick up some more runtime testing (unless amd-stg-open is already def

[clang] [amdgpu-arch] Replace use of HSA with reading sysfs directly (PR #116651)

2024-11-18 Thread Jon Chesterfield via cfe-commits

JonChesterfield wrote: > Would only work for Linux unfortunately, unless some Windows driver developer > out there knows if there's some similar win32 magic. Windows getting subprocess calls until their driver catches up (or someone points out how to do this) seems fine to me. Linux people get

[clang] [amdgpu-arch] Replace use of HSA with reading sysfs directly (PR #116651)

2024-11-18 Thread Jon Chesterfield via cfe-commits

JonChesterfield wrote: Oh. I now see there was a bunch of discussion about this, will add some context. The driver has a hard limit on how many processes can open it at a time. clang calls this utility to ask what gpu to compile for by default. If you put those together, a parallel build on a

[clang] [amdgpu-arch] Replace use of HSA with reading sysfs directly (PR #116651)

2024-11-18 Thread Jon Chesterfield via cfe-commits

https://github.com/JonChesterfield approved this pull request. Absolutely yes, awesome. I had a todo to have the kernel export this under sysfs literally years ago and didn't get around to working out their commit structure, delighted to see it is exposed. The unreliable hsa calls has been a c

[clang] [llvm] [llvm][AMDGPU] Fold `llvm.amdgcn.wavefrontsize` early (PR #114481)

2024-11-07 Thread Jon Chesterfield via cfe-commits

JonChesterfield wrote: Awesome! This is absolutely something that has been on my todo stack for ages and it's very good to see someone else writing the thing. It looks like the implementation is contentious so I'll leave that for the moment. Under some time constraints so please forgive the le

[clang] [Clang/AMDGPU] Zero sized arrays not allowed in HIP device code. (PR #113470)

2024-10-23 Thread Jon Chesterfield via cfe-commits

https://github.com/JonChesterfield approved this pull request. Error looks good. Might want to add a case for "dynamic __shared__" to the test file as the syntax is very close to the case being diagnosed - iirc it's things like ```cuda extern __shared__ float array[]; ``` Some existing handli

[clang] [Clang] Implement resource directory headers for common GPU intrinsics (PR #110179)

2024-09-26 Thread Jon Chesterfield via cfe-commits

JonChesterfield wrote: Probably want a longer prefix. _gpu or_llvm or similar. If the shared header gets the declarations then people can include the intrin.h and look at it to see what functions they have available, without going and looking through all the implementations. That seems like a

[clang] [llvm] [Sanitizer] Make sanitizer passes idempotent (PR #99439)

2024-08-13 Thread Jon Chesterfield via cfe-commits

JonChesterfield wrote: Sanizer passes setting a "no sanitizer" magic variable is backwards. If this behaviour is the way to go, have clang set a "needs_asan_lowering" or whatever and have the corresponding pass remove it. It shouldn't be necessary to emit ever increasing lists of pass and targ

[clang] [libc] [llvm] [libc] Implement (v|f)printf on the GPU (PR #96369)

2024-07-01 Thread Jon Chesterfield via cfe-commits

@@ -116,8 +116,7 @@ class LLVM_LIBRARY_VISIBILITY NVPTXTargetInfo : public TargetInfo { } BuiltinVaListKind getBuiltinVaListKind() const override { -// FIXME: implement -return TargetInfo::CharPtrBuiltinVaList; +return TargetInfo::VoidPtrBuiltinVaList; ---

1 2 3 >

1 - 100 of 296 matches

Mail list logo