JonChesterfield wrote:
I think we could do with an additional overload here.
Currently a bunch of code (notably CK, but probably elsewhere) uses the v4i32
version of the LDS intrinsics. I think this patch lets one use the 128-bit
addrspace(7) pointer alternative instead. So callers could transform
JonChesterfield wrote:
There's some existing builtin stuff going on in this area, e.g.
https://github.com/llvm/llvm-project/pull/138141/commits/96e94b5662c613fd80f712080751076254a73524
The use case in CK is behind a hip header file, so could become static inline
functions renaming a builtin, o
JonChesterfield wrote:
The patch to SemaExpr looks reasonable to me. I'd suggest that goes in separate
from the amdgpu intrinsic stuff.
I'd test this by tweaking the code to do the current lowering _and_ the
proposed and check that they do exactly the same thing on all the existing
builtins,
@@ -163,7 +163,10 @@ BUILTIN(__builtin_amdgcn_raw_buffer_load_b64, "V2UiQbiiIi", "n")
BUILTIN(__builtin_amdgcn_raw_buffer_load_b96, "V3UiQbiiIi", "n")
BUILTIN(__builtin_amdgcn_raw_buffer_load_b128, "V4UiQbiiIi", "n")
+TARGET_BUILTIN(__builtin_amdgcn_raw_buffer_load_lds, "vV4U
@@ -0,0 +1,12 @@
+// RUN: %clang_cc1 -triple amdgcn-unknown-unknown -target-cpu tahiti -S -verify -o - %s
+// RUN: %clang_cc1 -triple amdgcn-unknown-unknown -target-cpu bonaire -S -verify -o - %s
+// RUN: %clang_cc1 -triple amdgcn-unknown-unknown -target-cpu carrizo -S -verify
https://github.com/JonChesterfield created
https://github.com/llvm/llvm-project/pull/137678
We have a clang builtin for one of four very similar IR intrinsics. This patch
adds builtins for the other three.
IR intrinsics introduced in https://reviews.llvm.org/D124884. The request from
composab
https://github.com/JonChesterfield updated
https://github.com/llvm/llvm-project/pull/131164
>From 092024bbf31b0677e6efbb0e6fc0cee7606ecb08 Mon Sep 17 00:00:00 2001
From: Jon Chesterfield
Date: Tue, 18 Mar 2025 15:57:02 +
Subject: [PATCH] [Headers] Implement spirvamdgcnintrin.h
---
clang/l
https://github.com/JonChesterfield updated
https://github.com/llvm/llvm-project/pull/131190
>From e3d1c0d0f430a96e26c68e22ab53dc2fa4a14e47 Mon Sep 17 00:00:00 2001
From: Jon Chesterfield
Date: Wed, 12 Mar 2025 20:55:17 +
Subject: [PATCH] [SPIRV] GPU intrinsics
---
clang/include/clang/Basi
JonChesterfield wrote:
We don't need a way to call the builtins. See for example this pull request.
Intel have done the spirv64-intel- thing similar to the spirv64-amd-amdhsa so
they can (presumably?) use whatever intrinsics they like, using another header
quite like this one.
https://github.
https://github.com/JonChesterfield edited
https://github.com/llvm/llvm-project/pull/131884
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
https://github.com/JonChesterfield updated
https://github.com/llvm/llvm-project/pull/131164
>From 402a091ac6eac8a50ce54a519acce5bfa4de1c88 Mon Sep 17 00:00:00 2001
From: Jon Chesterfield
Date: Tue, 18 Mar 2025 15:57:02 +
Subject: [PATCH] [Headers] Implement spirvamdgcnintrin.h
---
clang/l
https://github.com/JonChesterfield edited
https://github.com/llvm/llvm-project/pull/131164
JonChesterfield wrote:
@sarnex I'm deeply sorry about this sequence of events. The single spirv64
header that lowered to intrinsics that amdgpu or intel map onto their own world
would have removed a swathe of spurious variation.
What we're going to have to do in the interim is have spirvamdgcn
JonChesterfield wrote:
OK, we can't have spirv64-unknown-unknown at this point. But we could get this
compiling using a mixture of spirv intrinsics (where they exist) and amdgpu
intrinsics for spirv64-amd-amdhsa, leaving a preprocessor error for other
targets. I'll take a stab at that.
https:
https://github.com/JonChesterfield updated
https://github.com/llvm/llvm-project/pull/131164
>From d91671fdbb2aa9204f728747009459373bfd6553 Mon Sep 17 00:00:00 2001
From: Jon Chesterfield
Date: Thu, 13 Mar 2025 15:44:52 +
Subject: [PATCH 1/2] [Headers] Create stub spirintrin.h
---
clang/li
JonChesterfield wrote:
Well it's not pretty, but spirv64-amd-amdhsa sets both __AMDGPU__ and
__SPIRV64__ macros. Added a commit with an example that dispatches to amdgpu
intrinsics on the happy path and preprocessor error otherwise.
If you let that get to the spirv backend it falls over with `
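A minimal sketch of the dispatch described above, assuming only what the message states: spirv64-amd-amdhsa defines both `__AMDGPU__` and `__SPIRV64__`, the happy path calls an amdgpu builtin, and any other SPIR-V target gets a preprocessor error. The helper name `example_thread_id_x` and the host stand-in branch are hypothetical and not taken from the patch.

```c
#include <stdint.h>

#if defined(__SPIRV64__)
#if defined(__AMDGPU__)
/* spirv64-amd-amdhsa: both macros are set, so use the amdgpu builtin. */
static inline uint32_t example_thread_id_x(void) {
  return __builtin_amdgcn_workitem_id_x();
}
#else
/* Any other SPIR-V target: fail early instead of reaching the backend. */
#error "example: no lowering available for this SPIR-V target yet"
#endif
#else
/* Assumption: a stand-in for non-SPIR-V builds so the sketch compiles
   with an ordinary host compiler. It is not part of the header. */
static inline uint32_t example_thread_id_x(void) { return 0; }
#endif
```

The error branch keeps unsupported targets from silently compiling to something the backend cannot lower.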
JonChesterfield wrote:
The utility is having a place to fill things in incrementally as we get them
working, and thus use libc to drive the implementation of enough of spirv to
get things working. If you decline to have any spirv code until everything is
working I'll have to do the testing som
@@ -0,0 +1,501 @@
+//===- LowerGPUIntrinsic.cpp --===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Ap
JonChesterfield wrote:
@jhuber6 I think we should have this despite the rejected
https://github.com/llvm/llvm-project/pull/131190.
Maybe we'll get some clang builtins for spirv. Otherwise some things can be
done with the asm label hack. Some can just be left as nop, e.g. a suspend that
does n
JonChesterfield wrote:
Patch opposed internally. Will see if I can think of an alternative.
https://github.com/llvm/llvm-project/pull/131190
https://github.com/JonChesterfield closed
https://github.com/llvm/llvm-project/pull/131190
JonChesterfield wrote:
I've dropped the clang headers part from this patch, rewritten to elide the
grid intrinsic and moved the test. Matt suggested llvm.simt as the prefix which
I like very much more than llvm.gpu but the rename is probably going to take a
moment.
Turns out I do need to upda
https://github.com/JonChesterfield updated
https://github.com/llvm/llvm-project/pull/131190
>From 65dcf7cb54156ec9623c2f04f1d70b479b4623b5 Mon Sep 17 00:00:00 2001
From: Jon Chesterfield
Date: Wed, 12 Mar 2025 20:55:17 +
Subject: [PATCH] [SPIRV] GPU intrinsics
---
clang/include/clang/Basi
@@ -0,0 +1,427 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --check-attributes
+; RUN: opt -S -mtriple=amdgcn-- -passes=lower-gpu-intrinsic < %s | FileCheck %s --check-prefix=AMDGCN
JonChesterfield wrote:
I don't love
JonChesterfield wrote:
This patch probably does too many things and it's implemented in a
non-conventional fashion. I think that does a decent job of having something to
upset most people that have looked at it. I note that reviewers are primarily
complaining about disjoint aspects of the patc
@@ -2861,6 +2861,69 @@ def int_experimental_convergence_anchor
def int_experimental_convergence_loop
: DefaultAttrsIntrinsic<[llvm_token_ty], [], [IntrNoMem, IntrConvergent]>;
+//===--- GPU Intrinsics ---===//
+
+class GPUIntr
@@ -150,6 +150,8 @@ defm int_amdgcn_workitem_id : AMDGPUReadPreloadRegisterIntrinsic_xyz;
defm int_amdgcn_workgroup_id : AMDGPUReadPreloadRegisterIntrinsic_xyz_named <"__builtin_amdgcn_workgroup_id">;
+defm int_amdgcn_grid_size : AMDGPUReadPrelo
https://github.com/JonChesterfield updated
https://github.com/llvm/llvm-project/pull/131134
>From 0466c31d1e0b10aa2d2352bb6befd36eb5306f9c Mon Sep 17 00:00:00 2001
From: Jon Chesterfield
Date: Thu, 13 Mar 2025 12:49:42 +
Subject: [PATCH] [libc][nfc] Steps to allow sharing code between gpu i
JonChesterfield wrote:
If the name llvm.gpu is a stumbling block, how about llvm.offload? Will add JD
to the reviewers
https://github.com/llvm/llvm-project/pull/131190
JonChesterfield wrote:
> I'm not convinced we should do this, as I have a bunch of concerns:
>
> * it's intrusive and duplicates work already done by
> [libclc](https://github.com/llvm/llvm-project/tree/main/libclc);
Also compiler-rt. Also gpuintrin.h. Also openmp's devicertl. All the lang
https://github.com/JonChesterfield edited
https://github.com/llvm/llvm-project/pull/131190
JonChesterfield wrote:
Effectively a subset of https://github.com/llvm/llvm-project/pull/131190/, I'd
still like to land this and rebase 131190 on the grounds of signal to noise
ratio.
https://github.com/llvm/llvm-project/pull/131164
https://github.com/JonChesterfield updated
https://github.com/llvm/llvm-project/pull/131190
>From b52a04c55ad56e1172dec6262f2536ec3fe7162b Mon Sep 17 00:00:00 2001
From: Jon Chesterfield
Date: Wed, 12 Mar 2025 20:55:17 +
Subject: [PATCH] [SPIRV] GPU intrinsics
---
clang/include/clang/Basi
https://github.com/JonChesterfield created
https://github.com/llvm/llvm-project/pull/131190
Introduce __builtin_gpu builtins to clang and corresponding llvm.gpu intrinsics
in llvm for abstracting over minor differences between GPU architectures, and
use those to implement a gpuintrin.h instant
JonChesterfield wrote:
> Name should be `spirvintrin.h` but if we can't support 32 for now just error
> in the preprocessor.
Yep, you're right. It'll be caught by only checking for the SPIRV64 macro, but
nothing in this file is 32 vs 64 bit dependent as that's part of what
gpuintrin.h gives u
https://github.com/JonChesterfield updated
https://github.com/llvm/llvm-project/pull/131164
>From d91671fdbb2aa9204f728747009459373bfd6553 Mon Sep 17 00:00:00 2001
From: Jon Chesterfield
Date: Thu, 13 Mar 2025 15:44:52 +
Subject: [PATCH] [Headers] Create stub spirintrin.h
---
clang/lib/He
https://github.com/JonChesterfield updated
https://github.com/llvm/llvm-project/pull/131164
>From be94c9af7eaa8bc05ac9bdb80dadc285575c1472 Mon Sep 17 00:00:00 2001
From: Jon Chesterfield
Date: Thu, 13 Mar 2025 15:44:52 +
Subject: [PATCH] [Headers] Create stub spirv64intrin.h
---
clang/lib
JonChesterfield wrote:
Clang raises a lot of exciting errors for spirv-- about Vulkan environments and
I don't really know the distinction between the two - if 32 bit spirv turns out
to be a workable thing it should go down the same code path, with `ifdef SPIRV
|| SPIRV64` and a file rename. I
https://github.com/JonChesterfield created
https://github.com/llvm/llvm-project/pull/131164
Structure follows amdgcnintrin.h but with declarations where compiler
intrinsics are not yet available.
Address space numbers, kernel attribute, checking how it interacts with openmp
are left for later
https://github.com/JonChesterfield updated
https://github.com/llvm/llvm-project/pull/131141
>From fbeb177a750ca671a9cff9f37f57e58c6900e7fd Mon Sep 17 00:00:00 2001
From: Jon Chesterfield
Date: Thu, 13 Mar 2025 13:23:38 +
Subject: [PATCH] [Headers][NFC] Deduplicate gpu_match_ between targets
https://github.com/JonChesterfield closed
https://github.com/llvm/llvm-project/pull/131141
@@ -32,6 +32,31 @@ _Pragma("push_macro(\"bool\")");
#define bool _Bool
#endif
+_Pragma("omp begin declare target device_type(nohost)");
+_Pragma("omp begin declare variant match(device = {kind(gpu)})");
+
+// Forward declare a few functions for the implementation header.
+
+//
https://github.com/JonChesterfield edited
https://github.com/llvm/llvm-project/pull/131134
https://github.com/JonChesterfield edited
https://github.com/llvm/llvm-project/pull/131141
https://github.com/JonChesterfield updated
https://github.com/llvm/llvm-project/pull/131134
>From 7347ebc6a0aadd1b9676e329bdf7705dbfae7875 Mon Sep 17 00:00:00 2001
From: Jon Chesterfield
Date: Thu, 13 Mar 2025 12:49:42 +
Subject: [PATCH] [libc][nfc] Steps to allow sharing code between gpu i
https://github.com/JonChesterfield updated
https://github.com/llvm/llvm-project/pull/131141
>From 5e55b829eb3c7f4a4e674333cdde73b5bfe970f8 Mon Sep 17 00:00:00 2001
From: Jon Chesterfield
Date: Thu, 13 Mar 2025 13:23:38 +
Subject: [PATCH] [Headers][NFC] Deduplicate gpu_match_ between targets
https://github.com/JonChesterfield closed
https://github.com/llvm/llvm-project/pull/131134
https://github.com/JonChesterfield updated
https://github.com/llvm/llvm-project/pull/131141
>From cde4232ed28eed2b0c0c1cb11815b5a4317345b6 Mon Sep 17 00:00:00 2001
From: Jon Chesterfield
Date: Thu, 13 Mar 2025 13:23:38 +
Subject: [PATCH] [Headers][NFC] Deduplicate gpu_match_any between targ
https://github.com/JonChesterfield updated
https://github.com/llvm/llvm-project/pull/131141
>From 28cb801f73e6886eacfd5cdbcd17abb68b6dd947 Mon Sep 17 00:00:00 2001
From: Jon Chesterfield
Date: Thu, 13 Mar 2025 13:23:38 +
Subject: [PATCH] [Headers][NFC] Deduplicate gpu_match_any between targ
https://github.com/JonChesterfield updated
https://github.com/llvm/llvm-project/pull/131141
>From e1456be61130ea0ea006472990c7d294b8a32c03 Mon Sep 17 00:00:00 2001
From: Jon Chesterfield
Date: Thu, 13 Mar 2025 13:23:38 +
Subject: [PATCH] [Headers][NFC] Deduplicate gpu_match_any between targ
https://github.com/JonChesterfield updated
https://github.com/llvm/llvm-project/pull/131141
>From 42253295a3b11b4303e15c3455047e3bfc5d196a Mon Sep 17 00:00:00 2001
From: Jon Chesterfield
Date: Thu, 13 Mar 2025 13:23:38 +
Subject: [PATCH] [Headers][NFC] Deduplicate gpu_match_any between targ
https://github.com/JonChesterfield created
https://github.com/llvm/llvm-project/pull/131141
Declare a few functions before including the target specific headers then
define a fallback_match_any, used by amdgpu and by older nvptx.
>From b9fdef141a83969eff8e7ac2dbc8c98163c0fbf5 Mon Sep 17 00:00:
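As a rough illustration of what a match_any operation computes, here is a host-side reference model in plain C. It is not the header's fallback: the real code runs per lane on the GPU, whereas this sketch loops over an array standing in for the lanes. The names `example_match_any`, `example_values` and `EXAMPLE_NUM_LANES` are hypothetical, and the semantics (each lane receives the mask of active lanes holding the same value) are an assumption based on the usual meaning of match_any.

```c
#include <stdint.h>

#define EXAMPLE_NUM_LANES 4

/* Example per-lane values: lanes 0 and 2 hold the same value. */
static const uint32_t example_values[EXAMPLE_NUM_LANES] = {1, 2, 1, 3};

/* Return the mask of active lanes whose value equals the value held by
   `lane`. Bit i of lane_mask marks lane i as active. */
static uint64_t example_match_any(uint64_t lane_mask, const uint32_t *values,
                                  unsigned lane) {
  uint64_t match = 0;
  for (unsigned i = 0; i < EXAMPLE_NUM_LANES; ++i) {
    uint64_t bit = (uint64_t)1 << i;
    if ((lane_mask & bit) && values[i] == values[lane])
      match |= bit;
  }
  return match;
}
```

With all four lanes active, lane 0 gets the mask 0x5 (lanes 0 and 2), while lanes 1 and 3 each match only themselves.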
https://github.com/JonChesterfield created
https://github.com/llvm/llvm-project/pull/131134
Adds macro guards to warn if the implementation headers are included directly
as part of dropping the need for them to be standalone.
I'd like to declare functions before the include but it might be be
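The macro-guard idea can be sketched in a single file, assuming an umbrella header that defines its include guard before pulling in the implementation header. `EXAMPLE_UMBRELLA_H` and `example_impl_value` are hypothetical names. The patch itself checks `__GPUINTRIN_H`, as the diff in this thread shows.

```c
/* Stand-in for the umbrella header: its include guard is defined first. */
#define EXAMPLE_UMBRELLA_H

/* Stand-in for the implementation header. It warns when reached without
   the umbrella header's guard, i.e. when it was included directly. */
#ifndef EXAMPLE_UMBRELLA_H
#warning "This file is an implementation detail; include the umbrella header"
#endif

/* The implementation header can now rely on whatever the umbrella header
   declared or defined before the include. */
static inline int example_impl_value(void) { return 42; }
```

A warning rather than an error keeps existing direct includes compiling while nudging users toward the umbrella header.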
https://github.com/JonChesterfield updated
https://github.com/llvm/llvm-project/pull/131134
>From 4c04f6979409642eb6bc9dc3c48b5e3636210ef0 Mon Sep 17 00:00:00 2001
From: Jon Chesterfield
Date: Thu, 13 Mar 2025 12:49:42 +
Subject: [PATCH] [libc][nfc] Steps to allow sharing code between gpu i
@@ -13,11 +13,8 @@
#error "This file is intended for AMDGPU targets or offloading to AMDGPU"
#endif
-#include
-
-#if !defined(__cplusplus)
-_Pragma("push_macro(\"bool\")");
-#define bool _Bool
+#ifndef __GPUINTRIN_H
+#warning "This file is intended as an implementation detail
@@ -263,8 +256,4 @@ _DEFAULT_FN_ATTRS static __inline__ void
__gpu_thread_suspend(void) {
_Pragma("omp end declare variant");
_Pragma("omp end declare target");
-#if !defined(__cplusplus)
-_Pragma("pop_macro(\"bool\")");
JonChesterfield wrote:
Up into gpuint
https://github.com/JonChesterfield closed
https://github.com/llvm/llvm-project/pull/131027
https://github.com/JonChesterfield updated
https://github.com/llvm/llvm-project/pull/131027
>From 68f09d0f3f7849b91cb39ce42ba48e3e4aafb488 Mon Sep 17 00:00:00 2001
From: Jon Chesterfield
Date: Wed, 12 Mar 2025 20:31:39 +
Subject: [PATCH] [libc][nfc] Use common implementation of read_first_l
https://github.com/JonChesterfield created
https://github.com/llvm/llvm-project/pull/131027
No codegen regression on either target. The two builtin_ffs implied on nvptx
CSE away.
```
define internal i64 @__gpu_read_first_lane_u64(i64 noundef %__lane_mask, i64
noundef %__x) #2 {
entry:
%shr
JonChesterfield wrote:
Yep. I'm looking at changing their implementation and want a before&after shot
in the git diff for the upcoming review. If that doesn't pan out, still good to
get a heads up if codegen for these changes on us unexpectedly.
https://github.com/llvm/llvm-project/pull/130956
https://github.com/JonChesterfield closed
https://github.com/llvm/llvm-project/pull/130956
https://github.com/JonChesterfield created
https://github.com/llvm/llvm-project/pull/130956
Regenerated existing test case with include-generated-funcs to show the lowered
IR for each instantiation.
>From 4ec726e4fcf5ab0b03f3942e42a4dbde1a6f43a4 Mon Sep 17 00:00:00 2001
From: Jon Chesterfield
https://github.com/JonChesterfield closed
https://github.com/llvm/llvm-project/pull/128569
https://github.com/JonChesterfield created
https://github.com/llvm/llvm-project/pull/128569
Problem identified by Joseph. The openmp device runtime uses
__scoped_atomic_load_n and similar which presently hit
```
error: large atomic operation may incur significant performance
penalty; th
https://github.com/JonChesterfield approved this pull request.
This is the right thing. amdgpu is completely usable without rocm libraries and
already has out of tree users doing that. Needing to manually opt out of rocm
libs when not using any of rocm is definitely annoying, especially in cont
JonChesterfield wrote:
Stringing the pieces together, we may have been conflating opencl-the-language
with opencl-the-implementation.
Let's go with the first line of attack, no language special casing here, no
checking seq-cst and appending one-as.
Opencl the implementation won't care because
JonChesterfield wrote:
> This one-as business seems like it's cruft from before MMRAs. Can we rip them
> out and replace them with MMRAs for OpenCL?
https://llvm.org/docs/MemoryModelRelaxationAnnotations.html calls out the
opencl fence as a motivating example which suggests either yes, or we s
https://github.com/JonChesterfield requested changes to this pull request.
"You need to leave a comment indicating the requested changes."
https://github.com/llvm/llvm-project/pull/120095
JonChesterfield wrote:
@b-sumner has useful context on this. I won't paraphrase, but it sounds like
the block deleted here has the right semantics for opencl, where "seqcst" has
some special meaning and generally the semantics don't totally make sense to
me. Suggest we amend this to "if opencl
https://github.com/JonChesterfield edited
https://github.com/llvm/llvm-project/pull/120095
https://github.com/JonChesterfield approved this pull request.
Explicitly marking green. Even if this commit upsets something else in the
backend, having the concurrency primitives default to racy is clearly bad.
https://github.com/llvm/llvm-project/pull/120095
JonChesterfield wrote:
I would say this change is obviously correct, but I can't see why it was
introduced and vaguely fear tripping over abhorrent behaviour in the backend.
Can you send this down the internal CI pipeline to pick up some more runtime
testing (unless amd-stg-open is already def
JonChesterfield wrote:
> Would only work for Linux unfortunately, unless some Windows driver developer
> out there knows if there's some similar win32 magic.
Windows getting subprocess calls until their driver catches up (or someone
points out how to do this) seems fine to me. Linux people get
JonChesterfield wrote:
Oh. I now see there was a bunch of discussion about this, will add some context.
The driver has a hard limit on how many processes can open it at a time. clang
calls this utility to ask what gpu to compile for by default. If you put those
together, a parallel build on a
https://github.com/JonChesterfield approved this pull request.
Absolutely yes, awesome. I had a todo to have the kernel export this under
sysfs literally years ago and didn't get around to working out their commit
structure, delighted to see it is exposed. The unreliable hsa calls have been a
c
JonChesterfield wrote:
Awesome! This is absolutely something that has been on my todo stack for ages
and it's very good to see someone else writing the thing.
It looks like the implementation is contentious so I'll leave that for the
moment. Under some time constraints so please forgive the le
https://github.com/JonChesterfield approved this pull request.
Error looks good. Might want to add a case for "dynamic __shared__" to the test
file as the syntax is very close to the case being diagnosed - iirc it's things
like
```cuda
extern __shared__ float array[];
```
Some existing handli
JonChesterfield wrote:
Probably want a longer prefix. _gpu or _llvm or similar.
If the shared header gets the declarations then people can include the intrin.h
and look at it to see what functions they have available, without going and
looking through all the implementations. That seems like a
JonChesterfield wrote:
Sanitizer passes setting a "no sanitizer" magic variable is backwards.
If this behaviour is the way to go, have clang set a "needs_asan_lowering" or
whatever and have the corresponding pass remove it.
It shouldn't be necessary to emit ever increasing lists of pass and targ
@@ -116,8 +116,7 @@ class LLVM_LIBRARY_VISIBILITY NVPTXTargetInfo : public
TargetInfo {
}
BuiltinVaListKind getBuiltinVaListKind() const override {
-// FIXME: implement
-return TargetInfo::CharPtrBuiltinVaList;
+return TargetInfo::VoidPtrBuiltinVaList;
---
@@ -54,7 +54,34 @@ class MockArgList {
}
template LIBC_INLINE T next_var() {
-++arg_counter;
+arg_counter++;
+return T(arg_counter);
+ }
+
+ size_t read_count() const { return arg_counter; }
+};
+
+// Used by the GPU implementation to parse how many bytes ne
@@ -942,6 +942,36 @@ struct Amdgpu final : public VariadicABIInfo {
}
};
+struct NVPTX final : public VariadicABIInfo {
+
+ bool enableForTarget() override { return true; }
+
+ bool vaListPassedInSSARegister() override { return true; }
+
+ Type *vaListType(LLVMContext &Ct
https://github.com/JonChesterfield edited
https://github.com/llvm/llvm-project/pull/96369