[llvm] [compiler-rt] [clang-tools-extra] [clang] [InferAddressSpaces] Fix constant replace to avoid modifying other functions (PR #70611)

2023-11-08 Thread Wenju He via cfe-commits


@@ -334,6 +335,15 @@ template<> struct simplify_type {
   }
 };
 
+template <> struct GraphTraits<User *> {

wenju-he wrote:

> For the specific problem here, I'd consider expanding all constant 
> expressions in the function upfront, and then not having to deal with it.

Is it right that I can use convertUsersOfConstantsToInstructions to expand 
constant expressions into instructions? However, 
convertUsersOfConstantsToInstructions changes other functions as well. One 
option is to add an additional function parameter to 
convertUsersOfConstantsToInstructions so that only constant-expression users in 
the given function are expanded. @nikic WDYT of this option?

Note that convertUsersOfConstantsToInstructions also does a DFS over the users 
of a User.
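
To make the DFS concrete, here is a minimal sketch of walking the users of a shared constant and keeping only the instructions that belong to one function. It is a toy model, not the LLVM API: `Node`, `addUse`, `collect` and `instructionUsersIn` are hypothetical names invented for illustration, and the real code would go through `Value::users()` and `Instruction::getFunction()`.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Toy stand-in for an IR value; this is NOT the LLVM API.
struct Node {
  std::string Name;
  std::string Function;      // empty: a constant shared by every function
  std::vector<Node *> Users; // values that use this node as an operand
  bool isInstruction() const { return !Function.empty(); }
};

// Record that User has Op as an operand.
void addUse(Node &User, Node &Op) { Op.Users.push_back(&User); }

// Depth-first walk over the transitive users of N. Constant users are walked
// through; an instruction ends that branch of the walk. Only instructions
// that belong to function F are appended to Out.
void collect(Node &N, const std::string &F, std::set<Node *> &Visited,
             std::vector<Node *> &Out) {
  for (Node *U : N.Users) {
    if (!Visited.insert(U).second)
      continue; // already reached through another path
    if (U->isInstruction()) {
      if (U->Function == F)
        Out.push_back(U);
      continue; // do not walk the users of an instruction
    }
    collect(*U, F, Visited, Out);
  }
}

// Instructions in F that use constant C directly or through other constants.
std::vector<Node *> instructionUsersIn(Node &C, const std::string &F) {
  assert(!F.empty() && "F must name a function");
  std::set<Node *> Visited;
  std::vector<Node *> Out;
  collect(C, F, Visited, Out);
  return Out;
}
```

In this model, an instruction in another function that reaches the same constant through a shared constant expression is never returned, which is the property the PR is after.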

https://github.com/llvm/llvm-project/pull/70611
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[llvm] [clang-tools-extra] [clang] [compiler-rt] [InferAddressSpaces] Fix constant replace to avoid modifying other functions (PR #70611)

2023-11-12 Thread Wenju He via cfe-commits

wenju-he wrote:

> I think it would be better if we could eliminate ConstantExpr addrspacecasts 
> from the IR altogether, which would avoid most of the complexity here. I 
> would also somewhat prefer to push this DFS into a helper function, but can 
> live with it inline as-is

Thank you for the review. I'll draft another PR that modifies 
convertUsersOfConstantsToInstructions so that it can restrict its changes to a 
single function; the DFS will then no longer be needed here.

https://github.com/llvm/llvm-project/pull/70611


[clang] [llvm] [compiler-rt] [clang-tools-extra] [InferAddressSpaces] Fix constant replace to avoid modifying other functions (PR #70611)

2023-10-31 Thread Wenju He via cfe-commits

https://github.com/wenju-he updated 
https://github.com/llvm/llvm-project/pull/70611

>From 7c41be75c1ef661e757bfaca8d693b3937df649e Mon Sep 17 00:00:00 2001
From: Wenju He 
Date: Mon, 30 Oct 2023 08:36:52 +0800
Subject: [PATCH 1/2] [InferAddressSpaces] Fix constant replace to avoid
 modifying other functions

A constant value is unique in an LLVM context. InferAddressSpaces was
replacing its users in other functions as well. This leads to unexpected
behavior in our downstream use case after the pass.

InferAddressSpaces is a function pass, so it shall not modify functions
other than the one currently being processed.

Co-authored-by: Abhinav Gaba 
---
 llvm/include/llvm/IR/User.h   | 10 +
 .../Transforms/Scalar/InferAddressSpaces.cpp  | 20 +-
 .../ensure-other-funcs-unchanged.ll   | 40 +++
 3 files changed, 69 insertions(+), 1 deletion(-)
 create mode 100644 
llvm/test/Transforms/InferAddressSpaces/ensure-other-funcs-unchanged.ll

diff --git a/llvm/include/llvm/IR/User.h b/llvm/include/llvm/IR/User.h
index a9cf60151e5dc6c..d27d4bf4f5f1e66 100644
--- a/llvm/include/llvm/IR/User.h
+++ b/llvm/include/llvm/IR/User.h
@@ -18,6 +18,7 @@
 #ifndef LLVM_IR_USER_H
 #define LLVM_IR_USER_H
 
+#include "llvm/ADT/GraphTraits.h"
 #include "llvm/ADT/iterator.h"
 #include "llvm/ADT/iterator_range.h"
 #include "llvm/IR/Use.h"
@@ -334,6 +335,15 @@ template<> struct simplify_type {
   }
 };
 
+template <> struct GraphTraits<User *> {
+  using NodeRef = User *;
+  using ChildIteratorType = Value::user_iterator;
+
+  static NodeRef getEntryNode(NodeRef N) { return N; }
+  static ChildIteratorType child_begin(NodeRef N) { return N->user_begin(); }
+  static ChildIteratorType child_end(NodeRef N) { return N->user_end(); }
+};
+
 } // end namespace llvm
 
 #endif // LLVM_IR_USER_H
diff --git a/llvm/lib/Transforms/Scalar/InferAddressSpaces.cpp 
b/llvm/lib/Transforms/Scalar/InferAddressSpaces.cpp
index 2da521375c00161..828b1e765cb1af2 100644
--- a/llvm/lib/Transforms/Scalar/InferAddressSpaces.cpp
+++ b/llvm/lib/Transforms/Scalar/InferAddressSpaces.cpp
@@ -1166,6 +1166,8 @@ bool InferAddressSpacesImpl::rewriteWithNewAddressSpaces(
   }
 
   SmallVector DeadInstructions;
+  ValueToValueMapTy VMap;
+  ValueMapper VMapper(VMap, RF_NoModuleLevelChanges | RF_IgnoreMissingLocals);
 
   // Replaces the uses of the old address expressions with the new ones.
   for (const WeakTrackingVH &WVH : Postorder) {
@@ -1184,7 +1186,18 @@ bool InferAddressSpacesImpl::rewriteWithNewAddressSpaces(
       if (C != Replace) {
         LLVM_DEBUG(dbgs() << "Inserting replacement const cast: " << Replace
                           << ": " << *Replace << '\n');
-        C->replaceAllUsesWith(Replace);
+        VMap[C] = Replace;
+        for (User *U : make_early_inc_range(C->users())) {
+          for (auto It = df_begin(U), E = df_end(U); It != E;) {
+            if (auto *I = dyn_cast<Instruction>(*It)) {
+              if (I->getFunction() == F)
+                VMapper.remapInstruction(*I);
+              It.skipChildren();
+              continue;
+            }
+            ++It;
+          }
+        }
         V = Replace;
       }
     }
@@ -1210,6 +1223,11 @@ bool InferAddressSpacesImpl::rewriteWithNewAddressSpaces(
       // Skip if the current user is the new value itself.
       if (CurUser == NewV)
         continue;
+
+      if (auto *CurUserI = dyn_cast<Instruction>(CurUser);
+          CurUserI && CurUserI->getFunction() != F)
+        continue;
+
       // Handle more complex cases like intrinsic that need to be remangled.
       if (auto *MI = dyn_cast<MemIntrinsic>(CurUser)) {
         if (!MI->isVolatile() && handleMemIntrinsicPtrUse(MI, V, NewV))
diff --git 
a/llvm/test/Transforms/InferAddressSpaces/ensure-other-funcs-unchanged.ll 
b/llvm/test/Transforms/InferAddressSpaces/ensure-other-funcs-unchanged.ll
new file mode 100644
index 000..ae052a69f9ed02d
--- /dev/null
+++ b/llvm/test/Transforms/InferAddressSpaces/ensure-other-funcs-unchanged.ll
@@ -0,0 +1,40 @@
+; RUN: opt -assume-default-is-flat-addrspace -print-module-scope -print-after-all -S -disable-output -passes=infer-address-spaces <%s 2>&1 | FileCheck %s
+
+; CHECK: IR Dump After InferAddressSpacesPass on f2
+
+; Check that after running infer-address-spaces on f2, the redundant addrspace cast %x1 in f2 is gone.
+; CHECK-LABEL: define spir_func void @f2()
+; CHECK: [[X:%.*]] = addrspacecast ptr addrspace(1) @x to ptr
+; CHECK-NEXT: call spir_func void @f1(ptr noundef [[X]])
+
+; But it should not affect f3.
+; CHECK-LABEL: define spir_func void @f3()
+; CHECK: %x1 = addrspacecast ptr addrspacecast (ptr addrspace(1) @x to ptr) to ptr addrspace(1)
+; CHECK-NEXT: %x2 = addrspacecast ptr addrspace(1) %x1 to ptr
+; CHECK-NEXT: call spir_func void @f1(ptr noundef %x2)
+
+; Ensure that the pass hasn't run on f3 yet.
+; CHECK: IR Dump After InferAddressSpacesPass on f3
+
+target datalayout = 
"e-i64:64-v16:16-v24:32-v32:32-v48:64-v96:128-v192:256-v256:256

[clang] [llvm] [compiler-rt] [clang-tools-extra] [InferAddressSpaces] Fix constant replace to avoid modifying other functions (PR #70611)

2023-11-01 Thread Wenju He via cfe-commits

https://github.com/wenju-he updated 
https://github.com/llvm/llvm-project/pull/70611

>From 7c41be75c1ef661e757bfaca8d693b3937df649e Mon Sep 17 00:00:00 2001
From: Wenju He 
Date: Mon, 30 Oct 2023 08:36:52 +0800
Subject: [PATCH 1/3] [InferAddressSpaces] Fix constant replace to avoid
 modifying other functions


[clang] [llvm] [compiler-rt] [clang-tools-extra] [InferAddressSpaces] Fix constant replace to avoid modifying other functions (PR #70611)

2023-11-01 Thread Wenju He via cfe-commits

https://github.com/wenju-he ready_for_review 
https://github.com/llvm/llvm-project/pull/70611


[clang] [llvm] [compiler-rt] [clang-tools-extra] [InferAddressSpaces] Fix constant replace to avoid modifying other functions (PR #70611)

2023-11-01 Thread Wenju He via cfe-commits


@@ -0,0 +1,40 @@
+; RUN: opt -assume-default-is-flat-addrspace -print-module-scope 
-print-after-all -S -disable-output -passes=infer-address-spaces <%s 2>&1 | 
FileCheck %s
+
+; CHECK: IR Dump After InferAddressSpacesPass on f2
+
+; Check that after running infer-address-spaces on f2, the redundant addrspace 
cast %x1 in f2 is gone.
+; CHECK-LABEL: define spir_func void @f2()
+; CHECK: [[X:%.*]] = addrspacecast ptr addrspace(1) @x to ptr
+; CHECK-NEXT:call spir_func void @f1(ptr noundef [[X]])
+
+; But it should not affect f3.
+; CHECK-LABEL: define spir_func void @f3()
+; CHECK: %x1 = addrspacecast ptr addrspacecast (ptr addrspace(1) @x to 
ptr) to ptr addrspace(1)
+; CHECK-NEXT:%x2 = addrspacecast ptr addrspace(1) %x1 to ptr
+; CHECK-NEXT:call spir_func void @f1(ptr noundef %x2)
+
+; Ensure that the pass hasn't run on f3 yet.
+; CHECK: IR Dump After InferAddressSpacesPass on f3

wenju-he wrote:

done

https://github.com/llvm/llvm-project/pull/70611


[clang] [llvm] [compiler-rt] [clang-tools-extra] [InferAddressSpaces] Fix constant replace to avoid modifying other functions (PR #70611)

2023-11-01 Thread Wenju He via cfe-commits


@@ -0,0 +1,40 @@
+; RUN: opt -assume-default-is-flat-addrspace -print-module-scope 
-print-after-all -S -disable-output -passes=infer-address-spaces <%s 2>&1 | 
FileCheck %s
+
+; CHECK: IR Dump After InferAddressSpacesPass on f2
+
+; Check that after running infer-address-spaces on f2, the redundant addrspace 
cast %x1 in f2 is gone.
+; CHECK-LABEL: define spir_func void @f2()
+; CHECK: [[X:%.*]] = addrspacecast ptr addrspace(1) @x to ptr
+; CHECK-NEXT:call spir_func void @f1(ptr noundef [[X]])
+
+; But it should not affect f3.
+; CHECK-LABEL: define spir_func void @f3()
+; CHECK: %x1 = addrspacecast ptr addrspacecast (ptr addrspace(1) @x to 
ptr) to ptr addrspace(1)
+; CHECK-NEXT:%x2 = addrspacecast ptr addrspace(1) %x1 to ptr
+; CHECK-NEXT:call spir_func void @f1(ptr noundef %x2)
+
+; Ensure that the pass hasn't run on f3 yet.
+; CHECK: IR Dump After InferAddressSpacesPass on f3
+
+target datalayout = 
"e-i64:64-v16:16-v24:32-v32:32-v48:64-v96:128-v192:256-v256:256-v512:512-v1024:1024-n8:16:32:64"
+target triple = "spir64"
+
+@x = addrspace(1) global i32 0, align 4
+
+define spir_func void @f2() {

wenju-he wrote:

done

https://github.com/llvm/llvm-project/pull/70611


[compiler-rt] [llvm] [clang] [clang-tools-extra] [InferAddressSpaces] Fix constant replace to avoid modifying other functions (PR #70611)

2023-11-01 Thread Wenju He via cfe-commits


@@ -334,6 +335,15 @@ template<> struct simplify_type {
   }
 };
 
+template <> struct GraphTraits<User *> {

wenju-he wrote:

> In any case, this should not be in a public IR header.

I've reverted the change in this file.

https://github.com/llvm/llvm-project/pull/70611


[compiler-rt] [clang] [clang-tools-extra] [llvm] [InferAddressSpaces] Fix constant replace to avoid modifying other functions (PR #70611)

2023-11-01 Thread Wenju He via cfe-commits


@@ -334,6 +335,15 @@ template<> struct simplify_type {
   }
 };
 
+template <> struct GraphTraits<User *> {

wenju-he wrote:

I measured the impact of this change to llvm/IR/User.h on llvm-project compile 
time, using an Intel ICX 8358 CPU.

Build command:
`
cmake -GNinja -DLLVM_ENABLE_PROJECTS="llvm;clang;clang-tools-extra" 
-DLLVM_INCLUDE_TESTS=ON -DLLVM_BUILD_TESTS=ON -DLLVM_ENABLE_ASSERTIONS=ON 
-DCMAKE_BUILD_TYPE=Release ../llvm -DBUILD_SHARED_LIBS=OFF 
-DLLVM_TARGETS_TO_BUILD="X86;AMDGPU" -DCMAKE_INSTALL_PREFIX=install && ninja 
-j40
`

Ref: https://github.com/llvm/llvm-project.git at 
c0d78c4232057768b04d3330e581d81544391e68
Exp: https://github.com/llvm/llvm-project.git at 
c0d78c4232057768b04d3330e581d81544391e68, plus the change to User.h

### OS: RHEL9 (note: the llvm-project build on this system also builds 
https://github.com/KhronosGroup/SPIRV-LLVM-Translator as a project inside the 
llvm/projects folder)

REF:
real    10m39.052s
user    398m36.074s
sys     19m26.839s

real    10m39.058s
user    398m31.903s
sys     19m18.405s

EXP:
real    10m39.910s
user    398m55.339s
sys     19m36.278s

real    10m40.319s
user    398m52.378s
sys     19m38.137s

### OS: Ubuntu 22

REF:
real    10m2.783s
user    376m47.038s
sys     18m35.310s

real    10m2.983s
user    376m47.032s
sys     18m46.517s

real    10m3.059s
user    376m52.423s
sys     18m45.694s

EXP:
real    10m2.730s
user    376m46.556s
sys     18m44.566s

real    10m3.007s
user    376m48.369s
sys     18m46.733s

real    10m2.860s
user    376m44.655s
sys     18m51.154s

Summary:
On RHEL9 the build is about 1 second slower (out of roughly 10m39s). On 
Ubuntu 22 there is no measurable change in compile time.

@nikic could we add GraphTraits<User *> to User.h, given that the compile-time 
impact is negligible?
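
For reference, the summary figures can be reproduced by averaging the "real" times listed above. This is only a sketch of that arithmetic: the helper names (`toSeconds`, `mean`, `meanSlowdown`, `rhel9Ref` and so on) are made up for illustration, and only the wall-clock numbers come from the measurements.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Convert a `time` reading of the form <minutes>m<seconds>s to seconds.
double toSeconds(int Minutes, double Seconds) { return Minutes * 60.0 + Seconds; }

// Arithmetic mean of a non-empty list of samples.
double mean(const std::vector<double> &Xs) {
  assert(!Xs.empty());
  return std::accumulate(Xs.begin(), Xs.end(), 0.0) / Xs.size();
}

// Mean wall-clock slowdown of EXP relative to REF, in seconds.
// A negative result means EXP was faster on average.
double meanSlowdown(const std::vector<double> &Ref,
                    const std::vector<double> &Exp) {
  return mean(Exp) - mean(Ref);
}

// "real" times copied from the runs listed above.
std::vector<double> rhel9Ref() {
  return {toSeconds(10, 39.052), toSeconds(10, 39.058)};
}
std::vector<double> rhel9Exp() {
  return {toSeconds(10, 39.910), toSeconds(10, 40.319)};
}
std::vector<double> ubuntuRef() {
  return {toSeconds(10, 2.783), toSeconds(10, 2.983), toSeconds(10, 3.059)};
}
std::vector<double> ubuntuExp() {
  return {toSeconds(10, 2.730), toSeconds(10, 3.007), toSeconds(10, 2.860)};
}
```

With these inputs, `meanSlowdown(rhel9Ref(), rhel9Exp())` comes out at a little over one second, and the Ubuntu difference is under a tenth of a second in the other direction, which is within run-to-run noise for so few samples.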

https://github.com/llvm/llvm-project/pull/70611


[compiler-rt] [clang-tools-extra] [llvm] [clang] [InferAddressSpaces] Fix constant replace to avoid modifying other functions (PR #70611)

2023-11-02 Thread Wenju He via cfe-commits

https://github.com/wenju-he edited 
https://github.com/llvm/llvm-project/pull/70611




[compiler-rt] [clang-tools-extra] [clang] [llvm] [InferAddressSpaces] Fix constant replace to avoid modifying other functions (PR #70611)

2023-11-02 Thread Wenju He via cfe-commits


@@ -334,6 +335,15 @@ template<> struct simplify_type {
   }
 };
 
+template <> struct GraphTraits<User *> {

wenju-he wrote:

Added 3 more runs on RHEL9 today; the compile-time difference is very small.

https://github.com/llvm/llvm-project/pull/70611


[clang-tools-extra] [clang] [llvm] [IR] Disallow ZeroInit for spirv.Image (PR #73887)

2023-12-17 Thread Wenju He via cfe-commits

https://github.com/wenju-he updated 
https://github.com/llvm/llvm-project/pull/73887

>From e369b5d62094c9b48f3c33075d764c115a208a74 Mon Sep 17 00:00:00 2001
From: Wenju He 
Date: Thu, 30 Nov 2023 09:57:06 +0800
Subject: [PATCH] [IR] Disallow ZeroInit for spirv.Image

According to the SPIR-V spec, OpConstantNull's result type can't be an image
type, so we can't generate zeroinitializer for spirv.Image.
---
 llvm/lib/IR/Type.cpp| 2 ++
 llvm/unittests/Transforms/Utils/ValueMapperTest.cpp | 2 +-
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/llvm/lib/IR/Type.cpp b/llvm/lib/IR/Type.cpp
index 3d2e203a20dac7..e3a09018ad8b98 100644
--- a/llvm/lib/IR/Type.cpp
+++ b/llvm/lib/IR/Type.cpp
@@ -841,6 +841,8 @@ struct TargetTypeInfo {
 static TargetTypeInfo getTargetTypeInfo(const TargetExtType *Ty) {
   LLVMContext &C = Ty->getContext();
   StringRef Name = Ty->getName();
+  if (Name.equals("spirv.Image"))
+return TargetTypeInfo(PointerType::get(C, 0), TargetExtType::CanBeGlobal);
   if (Name.startswith("spirv."))
 return TargetTypeInfo(PointerType::get(C, 0), TargetExtType::HasZeroInit,
   TargetExtType::CanBeGlobal);
diff --git a/llvm/unittests/Transforms/Utils/ValueMapperTest.cpp 
b/llvm/unittests/Transforms/Utils/ValueMapperTest.cpp
index 17083b3846430d..c0c9d383ac1816 100644
--- a/llvm/unittests/Transforms/Utils/ValueMapperTest.cpp
+++ b/llvm/unittests/Transforms/Utils/ValueMapperTest.cpp
@@ -423,7 +423,7 @@ TEST(ValueMapperTest, mapValuePoisonWithTypeRemap) {
 
 TEST(ValueMapperTest, mapValueConstantTargetNoneToLayoutTypeNullValue) {
   LLVMContext C;
-  auto *OldTy = TargetExtType::get(C, "spirv.Image");
+  auto *OldTy = TargetExtType::get(C, "spirv.Event");
   Type *NewTy = OldTy->getLayoutType();
 
   TestTypeRemapper TM(NewTy);



[clang-tools-extra] [clang] [llvm] [libc] [flang] [compiler-rt] [libcxx] [IR] Disallow ZeroInit for spirv.Image (PR #73887)

2023-12-18 Thread Wenju He via cfe-commits

https://github.com/wenju-he updated 
https://github.com/llvm/llvm-project/pull/73887



[compiler-rt] [llvm] [clang] [flang] [libcxx] [clang-tools-extra] [libc] [IR] Disallow ZeroInit for spirv.Image (PR #73887)

2023-12-19 Thread Wenju He via cfe-commits

wenju-he wrote:

> image-unoptimized.ll

backtrace:
```
 #0 0x01d2f0e1 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/export/users/wenjuhe/llvm/llvm-project/build/bin/llc+0x1d2f0e1)
 #1 0x01d2c644 SignalHandler(int) Signals.cpp:0:0
 #2 0x7fd516a7ddb0 __restore_rt (/lib64/libc.so.6+0x59db0)
 #3 0x7fd516aca42c __pthread_kill_implementation (/lib64/libc.so.6+0xa642c)
 #4 0x7fd516a7dd06 gsignal (/lib64/libc.so.6+0x59d06)
 #5 0x7fd516a507d3 abort (/lib64/libc.so.6+0x2c7d3)
 #6 0x7fd516a506fb _nl_load_domain.cold (/lib64/libc.so.6+0x2c6fb)
 #7 0x7fd516a76c86 (/lib64/libc.so.6+0x52c86)
 #8 0x014b3bf1 llvm::ConstantTargetNone::get(llvm::TargetExtType*) (/export/users/wenjuhe/llvm/llvm-project/build/bin/llc+0x14b3bf1)
 #9 0x00abac68 (anonymous namespace)::SPIRVEmitIntrinsics::runOnFunction(llvm::Function&) (.part.0) SPIRVEmitIntrinsics.cpp:0:0
```
The SPIRVEmitIntrinsics pass probably needs a fix so that it does not generate 
ConstantTargetNone for spirv.Image. @bogner WDYT?
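
To illustrate why the pass now aborts, here is a toy model of the property lookup after this PR. It is a sketch under assumptions, not the LLVM implementation: `propertiesFor`, `canZeroInit` and `requireZeroInit` are hypothetical names, the flag values are arbitrary, returning no properties for non-SPIR-V names is a simplification, and the abort in `ConstantTargetNone::get` is presumed to be a check on the zero-init property.

```cpp
#include <cassert>
#include <string>

// Toy property flags, loosely modelled on TargetExtType's properties.
enum Property : unsigned { HasZeroInit = 1u << 0, CanBeGlobal = 1u << 1 };

// Mirrors the name-based lookup in the patch: spirv.Image loses HasZeroInit,
// every other "spirv." type keeps both properties.
unsigned propertiesFor(const std::string &Name) {
  if (Name == "spirv.Image")
    return CanBeGlobal;
  if (Name.rfind("spirv.", 0) == 0) // Name starts with "spirv."
    return HasZeroInit | CanBeGlobal;
  return 0; // simplification: unknown types get no properties in this model
}

// True if a zeroinitializer may be created for the named type.
bool canZeroInit(const std::string &Name) {
  return (propertiesFor(Name) & HasZeroInit) != 0;
}

// Stand-in for the presumed check behind the abort in the backtrace above.
void requireZeroInit(const std::string &Name) {
  assert(canZeroInit(Name) && "type is not allowed to have a zeroinitializer");
}
```

In this model `requireZeroInit("spirv.Event")` passes while `requireZeroInit("spirv.Image")` trips the assertion, which matches the change of the unit test from spirv.Image to spirv.Event.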

https://github.com/llvm/llvm-project/pull/73887


[libclc] [libclc] link_bc target should depends on target builtins.link.clc-arch_suffix (PR #132338)

2025-03-23 Thread Wenju He via cfe-commits

https://github.com/wenju-he updated 
https://github.com/llvm/llvm-project/pull/132338

>From 6ce54aa767f8cdff2f938cdce8656e495a1346f0 Mon Sep 17 00:00:00 2001
From: Wenju He 
Date: Thu, 20 Mar 2025 22:01:55 -0700
Subject: [PATCH 1/2] [libclc] link_bc target should depends on target
 builtins.link.clc-arch_suffix

Currently the link_bc command depends on the bitcode file that is associated
with the custom target builtins.link.clc-arch_suffix.
On Windows we intermittently see the following error:
`
  Generating builtins.link.clc-${ARCH}--.bc
  Generating builtins.link.libspirv-${ARCH}.bc
  error : The requested operation cannot be performed on a file with a user-mapped section open.
`
I suspect that the builtins.link.clc-${ARCH}--.bc file is being generated
while it is being used in link_bc.
This PR adds a target-level dependency to ensure builtins.link.clc-${ARCH}--.bc
is generated first.
---
 libclc/CMakeLists.txt| 2 +-
 libclc/cmake/modules/AddLibclc.cmake | 9 +
 2 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/libclc/CMakeLists.txt b/libclc/CMakeLists.txt
index 426f210a73fcc..3de7ee9b707a8 100644
--- a/libclc/CMakeLists.txt
+++ b/libclc/CMakeLists.txt
@@ -413,7 +413,7 @@ foreach( t ${LIBCLC_TARGETS_TO_BUILD} )
   GEN_FILES ${opencl_gen_files}
   ALIASES ${${d}_aliases}
   # Link in the CLC builtins and internalize their symbols
-  INTERNAL_LINK_DEPENDENCIES 
$
+  INTERNAL_LINK_DEPENDENCIES builtins.link.clc-${arch_suffix}
 )
   endforeach( d )
 endforeach( t )
diff --git a/libclc/cmake/modules/AddLibclc.cmake 
b/libclc/cmake/modules/AddLibclc.cmake
index 911559ff4bfa9..0808b39e06555 100644
--- a/libclc/cmake/modules/AddLibclc.cmake
+++ b/libclc/cmake/modules/AddLibclc.cmake
@@ -211,8 +211,9 @@ endfunction()
 #  * ALIASES  ...
 #  List of aliases
 #  * INTERNAL_LINK_DEPENDENCIES  ...
-#  A list of extra bytecode files to link into the builtin library. Symbols
-#  from these link dependencies will be internalized during linking.
+#  A list of extra bytecode file's targets. The bitcode files will be linked
+#  into the builtin library. Symbols from these link dependencies will be
+#  internalized during linking.
 function(add_libclc_builtin_set)
   cmake_parse_arguments(ARG
 "CLC_INTERNAL"
@@ -313,8 +314,8 @@ function(add_libclc_builtin_set)
   INTERNALIZE
   TARGET ${builtins_link_lib_tgt}
   INPUTS $
-${ARG_INTERNAL_LINK_DEPENDENCIES}
-  DEPENDENCIES ${builtins_link_lib_tmp_tgt}
+$
+  DEPENDENCIES ${builtins_link_lib_tmp_tgt} ${ARG_INTERNAL_LINK_DEPENDENCIES}
 )
   endif()
 

>From 7b9a65f8dac3ab5e13fcabade3e9ed4d384c5b92 Mon Sep 17 00:00:00 2001
From: Wenju He 
Date: Fri, 21 Mar 2025 01:29:51 -0700
Subject: [PATCH 2/2] update comment, fix files to link

---
 libclc/cmake/modules/AddLibclc.cmake | 8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/libclc/cmake/modules/AddLibclc.cmake 
b/libclc/cmake/modules/AddLibclc.cmake
index 0808b39e06555..29d728494cd3e 100644
--- a/libclc/cmake/modules/AddLibclc.cmake
+++ b/libclc/cmake/modules/AddLibclc.cmake
@@ -210,7 +210,7 @@ endfunction()
 #  Optimization options (for opt)
 #  * ALIASES  ...
 #  List of aliases
-#  * INTERNAL_LINK_DEPENDENCIES  ...
+#  * INTERNAL_LINK_DEPENDENCIES  ...
 #  A list of extra bytecode file's targets. The bitcode files will be linked
 #  into the builtin library. Symbols from these link dependencies will be
 #  internalized during linking.
@@ -310,11 +310,15 @@ function(add_libclc_builtin_set)
   INPUTS ${bytecode_files}
   DEPENDENCIES ${builtins_comp_lib_tgt}
 )
+set( internal_link_depend_files )
+foreach( tgt ${ARG_INTERNAL_LINK_DEPENDENCIES} )
+  list( APPEND internal_link_depend_files 
$ )
+endforeach()
 link_bc(
   INTERNALIZE
   TARGET ${builtins_link_lib_tgt}
   INPUTS $
-$
+${internal_link_depend_files}
   DEPENDENCIES ${builtins_link_lib_tmp_tgt} ${ARG_INTERNAL_LINK_DEPENDENCIES}
 )
   endif()



[libclc] [libclc] Skip opt command if opt_flags is empty (PR #130882)

2025-03-21 Thread Wenju He via cfe-commits


@@ -342,22 +342,32 @@ function(add_libclc_builtin_set)
 
   set( builtins_opt_lib_tgt builtins.opt.${ARG_ARCH_SUFFIX} )
 
-  # Add opt target
-  add_custom_command( OUTPUT ${builtins_opt_lib_tgt}.bc
-COMMAND ${opt_exe} ${ARG_OPT_FLAGS} -o ${builtins_opt_lib_tgt}.bc
-  ${builtins_link_lib}
-DEPENDS ${opt_target} ${builtins_link_lib} ${builtins_link_lib_tgt}
-  )
-  add_custom_target( ${builtins_opt_lib_tgt}
-ALL DEPENDS ${builtins_opt_lib_tgt}.bc
-  )
+  if( ${ARG_OPT_FLAGS} STREQUAL "" )

wenju-he wrote:

Let me know if it is worth updating the PR to unify the code paths and avoid 
the early exit. Otherwise I'll close this PR.

The original motivation for this PR was to disable opt -O3, but things have 
changed and that request is no longer relevant.
In practice it might still make sense to apply the PR to avoid diverging code 
paths like the one for SPIR-V.

https://github.com/llvm/llvm-project/pull/130882


[libclc] [libclc] link_bc target should depends on target builtins.link.clc-arch_suffix (PR #132338)

2025-03-22 Thread Wenju He via cfe-commits


@@ -313,8 +314,8 @@ function(add_libclc_builtin_set)
   INTERNALIZE
   TARGET ${builtins_link_lib_tgt}
   INPUTS $
-${ARG_INTERNAL_LINK_DEPENDENCIES}
-  DEPENDENCIES ${builtins_link_lib_tmp_tgt}
+$

wenju-he wrote:

Thanks, changed to reading TARGET_FILE from each target.

https://github.com/llvm/llvm-project/pull/132338


[libclc] [libclc] add --only-needed to llvm-link when INTERNALIZE flag is set (PR #130871)

2025-03-20 Thread Wenju He via cfe-commits

wenju-he wrote:

@frasercrmck could you please try this PR on https://github.com/intel/llvm repo?

My experiment:
clang --version
clang version 21.0.0git (https://github.com/llvm/llvm-project 
f5ee10538b68835112323c241ca7db67ca78bf62)

before PR: 
find . -name "builtins.link*.bc" -printf "%s\n" | paste -sd+ | bc
101593692

after PR:
find . -name "builtins.link*.bc" -printf "%s\n" | paste -sd+ | bc
101316928

This PR reduces the total size of the builtins.link*.bc files by 0.27%.
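
The percentage follows directly from the two byte totals above; a minimal sketch of the arithmetic (the function name `percentReduction` is made up for illustration):

```cpp
#include <cassert>
#include <cstdint>

// Relative size reduction in percent, given total sizes in bytes.
double percentReduction(std::uint64_t Before, std::uint64_t After) {
  assert(Before > 0 && After <= Before);
  return 100.0 * static_cast<double>(Before - After) /
         static_cast<double>(Before);
}
```

Calling `percentReduction(101593692, 101316928)` gives the saving of 276764 bytes as a percentage of the original total, which rounds to 0.27.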

https://github.com/llvm/llvm-project/pull/130871


[libclc] [libclc] Skip opt command if opt_flags is empty (PR #130882)

2025-03-20 Thread Wenju He via cfe-commits


@@ -342,22 +342,32 @@ function(add_libclc_builtin_set)
 
   set( builtins_opt_lib_tgt builtins.opt.${ARG_ARCH_SUFFIX} )
 
-  # Add opt target
-  add_custom_command( OUTPUT ${builtins_opt_lib_tgt}.bc
-COMMAND ${opt_exe} ${ARG_OPT_FLAGS} -o ${builtins_opt_lib_tgt}.bc
-  ${builtins_link_lib}
-DEPENDS ${opt_target} ${builtins_link_lib} ${builtins_link_lib_tgt}
-  )
-  add_custom_target( ${builtins_opt_lib_tgt}
-ALL DEPENDS ${builtins_opt_lib_tgt}.bc
-  )
+  if( ${ARG_OPT_FLAGS} STREQUAL "" )

wenju-he wrote:

> I just realised this code isn't triggered upstream. I take it you have a 
> downstream making use of it?

There is a case in upstream LLVM where the opt flags are empty: 
https://github.com/llvm/llvm-project/blob/ad5cac3b06c3cb41397acc1fc96beae9b460f20c/libclc/CMakeLists.txt#L353

> I tried it locally and this needs to be `if( "${ARG_OPT_FLAGS}" STREQUAL "" )` 
> or maybe better yet `if ( NOT ARG_OPT_FLAGS )`. The code as-is generates an 
> error if I pass in an empty OPT_FLAGS:
> 
> ```
> CMake Error at /llvm-project/libclc/cmake/modules/AddLibclc.cmake:345 (if):
>   if given arguments:
> 
> "STREQUAL" ""
> 
>   Unknown arguments specified
> ```

Thanks for testing. Changed to `if ( NOT ARG_OPT_FLAGS )`.
I don't know what happened; a few days ago I did observe that the bitcode size 
became smaller when the opt flags were empty.
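
The CMake error quoted above comes from how an unquoted variable reference is expanded: the value is split as a list and only non-empty elements become arguments, so an empty value contributes no argument at all. The following is a small C++ simulation of that rule, not CMake itself; `expandUnquoted`, `expandQuoted` and `ifArguments` are hypothetical helpers written only to show the argument counts.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Unquoted ${VAR}: the value is split on ';' and each non-empty element
// becomes one argument, so an empty value yields no arguments at all.
std::vector<std::string> expandUnquoted(const std::string &Value) {
  std::vector<std::string> Args;
  std::string Cur;
  for (char Ch : Value) {
    if (Ch == ';') {
      if (!Cur.empty())
        Args.push_back(Cur);
      Cur.clear();
    } else {
      Cur += Ch;
    }
  }
  if (!Cur.empty())
    Args.push_back(Cur);
  return Args;
}

// Quoted "${VAR}": always exactly one argument, possibly the empty string.
std::vector<std::string> expandQuoted(const std::string &Value) {
  return {Value};
}

// Arguments that if() receives for `<expansion> STREQUAL ""`.
std::vector<std::string> ifArguments(std::vector<std::string> Lhs) {
  Lhs.push_back("STREQUAL");
  Lhs.push_back("");
  return Lhs;
}
```

With an empty value, `ifArguments(expandUnquoted(""))` has only two elements, `STREQUAL` and the empty string, which is the malformed condition reported in the error, whereas the quoted form keeps the empty left-hand operand. `if ( NOT ARG_OPT_FLAGS )` avoids the expansion altogether.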


https://github.com/llvm/llvm-project/pull/130882


[libclc] [libclc] link_bc target should depends on target builtins.link.clc-arch_suffix (PR #132338)

2025-03-20 Thread Wenju He via cfe-commits

https://github.com/wenju-he updated 
https://github.com/llvm/llvm-project/pull/132338

>From 6ce54aa767f8cdff2f938cdce8656e495a1346f0 Mon Sep 17 00:00:00 2001
From: Wenju He 
Date: Thu, 20 Mar 2025 22:01:55 -0700
Subject: [PATCH] [libclc] link_bc target should depends on target
 builtins.link.clc-arch_suffix

Currently the link_bc command depends on the bitcode file that is associated
with the custom target builtins.link.clc-arch_suffix.
On Windows we intermittently see the following error:
`
  Generating builtins.link.clc-${ARCH}--.bc
  Generating builtins.link.libspirv-${ARCH}.bc
  error : The requested operation cannot be performed on a file with a 
user-mapped section open.
`
I suspect that the builtins.link.clc-${ARCH}--.bc file is still being generated
while link_bc is already using it.
This PR adds a target-level dependency to ensure builtins.link.clc-${ARCH}--.bc
is generated first.
---
 libclc/CMakeLists.txt| 2 +-
 libclc/cmake/modules/AddLibclc.cmake | 9 +
 2 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/libclc/CMakeLists.txt b/libclc/CMakeLists.txt
index 426f210a73fcc..3de7ee9b707a8 100644
--- a/libclc/CMakeLists.txt
+++ b/libclc/CMakeLists.txt
@@ -413,7 +413,7 @@ foreach( t ${LIBCLC_TARGETS_TO_BUILD} )
   GEN_FILES ${opencl_gen_files}
   ALIASES ${${d}_aliases}
   # Link in the CLC builtins and internalize their symbols
-  INTERNAL_LINK_DEPENDENCIES 
$
+  INTERNAL_LINK_DEPENDENCIES builtins.link.clc-${arch_suffix}
 )
   endforeach( d )
 endforeach( t )
diff --git a/libclc/cmake/modules/AddLibclc.cmake 
b/libclc/cmake/modules/AddLibclc.cmake
index 911559ff4bfa9..0808b39e06555 100644
--- a/libclc/cmake/modules/AddLibclc.cmake
+++ b/libclc/cmake/modules/AddLibclc.cmake
@@ -211,8 +211,9 @@ endfunction()
 #  * ALIASES  ...
 #  List of aliases
 #  * INTERNAL_LINK_DEPENDENCIES  ...
-#  A list of extra bytecode files to link into the builtin library. Symbols
-#  from these link dependencies will be internalized during linking.
+#  A list of extra bytecode file's targets. The bitcode files will be 
linked
+#  into the builtin library. Symbols from these link dependencies will be
+#  internalized during linking.
 function(add_libclc_builtin_set)
   cmake_parse_arguments(ARG
 "CLC_INTERNAL"
@@ -313,8 +314,8 @@ function(add_libclc_builtin_set)
   INTERNALIZE
   TARGET ${builtins_link_lib_tgt}
   INPUTS $
-${ARG_INTERNAL_LINK_DEPENDENCIES}
-  DEPENDENCIES ${builtins_link_lib_tmp_tgt}
+$
+  DEPENDENCIES ${builtins_link_lib_tmp_tgt} 
${ARG_INTERNAL_LINK_DEPENDENCIES}
 )
   endif()
 



[libclc] [libclc] Skip opt command if opt_flags is empty (PR #130882)

2025-03-20 Thread Wenju He via cfe-commits

https://github.com/wenju-he updated 
https://github.com/llvm/llvm-project/pull/130882

>From 1727cb49ebbee324ecad0a766ec341eb1aed082b Mon Sep 17 00:00:00 2001
From: Wenju He 
Date: Tue, 11 Mar 2025 19:05:25 -0700
Subject: [PATCH 1/6] [libclc] Skip opt command if opt_flags is empty

When the flag is empty, the opt command won't modify the bitcode;
however, the command is slow for large bitcode files in debug mode.
---
 libclc/cmake/modules/AddLibclc.cmake | 42 +---
 1 file changed, 25 insertions(+), 17 deletions(-)

diff --git a/libclc/cmake/modules/AddLibclc.cmake 
b/libclc/cmake/modules/AddLibclc.cmake
index 911559ff4bfa9..de24848256d72 100644
--- a/libclc/cmake/modules/AddLibclc.cmake
+++ b/libclc/cmake/modules/AddLibclc.cmake
@@ -340,29 +340,37 @@ function(add_libclc_builtin_set)
 return()
   endif()
 
-  set( builtins_opt_lib_tgt builtins.opt.${ARG_ARCH_SUFFIX} )
+  if( ${ARG_OPT_FLAGS} )
+set( builtins_opt_lib_tgt builtins.opt.${ARG_ARCH_SUFFIX} )
+
+# Add opt target
+add_custom_command( OUTPUT ${builtins_opt_lib_tgt}.bc
+  COMMAND ${opt_exe} ${ARG_OPT_FLAGS} -o ${builtins_opt_lib_tgt}.bc
+${builtins_link_lib}
+  DEPENDS ${opt_target} ${builtins_link_lib} ${builtins_link_lib_tgt}
+)
+add_custom_target( ${builtins_opt_lib_tgt}
+  ALL DEPENDS ${builtins_opt_lib_tgt}.bc
+)
+set_target_properties( ${builtins_opt_lib_tgt} PROPERTIES
+  TARGET_FILE ${CMAKE_CURRENT_BINARY_DIR}/${builtins_opt_lib_tgt}.bc
+  FOLDER "libclc/Device IR/Opt"
+)
 
-  # Add opt target
-  add_custom_command( OUTPUT ${builtins_opt_lib_tgt}.bc
-COMMAND ${opt_exe} ${ARG_OPT_FLAGS} -o ${builtins_opt_lib_tgt}.bc
-  ${builtins_link_lib}
-DEPENDS ${opt_target} ${builtins_link_lib} ${builtins_link_lib_tgt}
-  )
-  add_custom_target( ${builtins_opt_lib_tgt}
-ALL DEPENDS ${builtins_opt_lib_tgt}.bc
-  )
-  set_target_properties( ${builtins_opt_lib_tgt} PROPERTIES
-TARGET_FILE ${CMAKE_CURRENT_BINARY_DIR}/${builtins_opt_lib_tgt}.bc
-FOLDER "libclc/Device IR/Opt"
-  )
+set( builtins_opt_lib 
$ )
 
-  set( builtins_opt_lib $ 
)
+set( builtins_link_opt_lib ${builtins_opt_lib} )
+set( builtins_link_opt_lib_tgt ${builtins_opt_lib_tgt} )
+  else()
+set( builtins_link_opt_lib ${builtins_link_lib} )
+set( builtins_link_opt_lib_tgt ${builtins_link_lib_tgt} )
+  endif()
 
   # Add prepare target
   set( obj_suffix ${ARG_ARCH_SUFFIX}.bc )
   add_custom_command( OUTPUT ${obj_suffix}
-COMMAND ${prepare_builtins_exe} -o ${obj_suffix} ${builtins_opt_lib}
-DEPENDS ${builtins_opt_lib} ${builtins_opt_lib_tgt} 
${prepare_builtins_target} )
+COMMAND ${prepare_builtins_exe} -o ${obj_suffix} ${builtins_link_opt_lib}
+DEPENDS ${builtins_link_opt_lib} ${builtins_link_opt_lib_tgt} 
${prepare_builtins_target} )
   add_custom_target( prepare-${obj_suffix} ALL DEPENDS ${obj_suffix} )
   set_target_properties( "prepare-${obj_suffix}" PROPERTIES FOLDER 
"libclc/Device IR/Prepare" )
 

>From d3543f3601469b2e2b51f3fa275019f06c8378e6 Mon Sep 17 00:00:00 2001
From: Wenju He 
Date: Tue, 18 Mar 2025 04:04:34 -0700
Subject: [PATCH 2/6] unconditionally add builtins_opt_lib_tgt, which is empty
 if OPT_FLAGS is empty

---
 libclc/cmake/modules/AddLibclc.cmake | 30 +---
 1 file changed, 14 insertions(+), 16 deletions(-)

diff --git a/libclc/cmake/modules/AddLibclc.cmake 
b/libclc/cmake/modules/AddLibclc.cmake
index de24848256d72..14f28a0b525b0 100644
--- a/libclc/cmake/modules/AddLibclc.cmake
+++ b/libclc/cmake/modules/AddLibclc.cmake
@@ -340,37 +340,35 @@ function(add_libclc_builtin_set)
 return()
   endif()
 
-  if( ${ARG_OPT_FLAGS} )
-set( builtins_opt_lib_tgt builtins.opt.${ARG_ARCH_SUFFIX} )
+  set( builtins_opt_lib_tgt builtins.opt.${ARG_ARCH_SUFFIX} )
+  add_custom_target( ${builtins_opt_lib_tgt} ALL )
+  set_target_properties( ${builtins_opt_lib_tgt} PROPERTIES
+FOLDER "libclc/Device IR/Opt"
+  )
+  add_dependencies( ${builtins_opt_lib_tgt} ${builtins_link_lib_tgt} )
 
-# Add opt target
+  # Add opt target
+  if( ${ARG_OPT_FLAGS} STREQUAL "" )
+# no-op
+set( builtins_opt_lib ${builtins_link_lib} )
+  else()
 add_custom_command( OUTPUT ${builtins_opt_lib_tgt}.bc
   COMMAND ${opt_exe} ${ARG_OPT_FLAGS} -o ${builtins_opt_lib_tgt}.bc
 ${builtins_link_lib}
   DEPENDS ${opt_target} ${builtins_link_lib} ${builtins_link_lib_tgt}
 )
-add_custom_target( ${builtins_opt_lib_tgt}
-  ALL DEPENDS ${builtins_opt_lib_tgt}.bc
-)
 set_target_properties( ${builtins_opt_lib_tgt} PROPERTIES
   TARGET_FILE ${CMAKE_CURRENT_BINARY_DIR}/${builtins_opt_lib_tgt}.bc
-  FOLDER "libclc/Device IR/Opt"
+  DEPENDS ${builtins_opt_lib_tgt}.bc
 )
-
 set( builtins_opt_lib 
$ )
-
-set( builtins_link_opt_lib ${builtins_opt_lib} )
-set( builtins_link_opt_lib_tgt ${builtins_opt_lib_tgt} )
-  else()
-set( builtins_link_opt_lib ${builtins_link_lib} )
-set

[libclc] [libclc] link_bc target should depends on target builtins.link.clc-arch_suffix (PR #132338)

2025-03-21 Thread Wenju He via cfe-commits

wenju-he wrote:

I've tested on Windows for half a day, and this PR appears to fix the issue.
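
For context, a minimal sketch of what a target-level dependency looks like, as opposed to depending on the generated file alone. This is not the libclc code; the file names `a.bc`/`b.bc` and the targets `gen_a`/`gen_b` are made up for illustration.

```cmake
cmake_minimum_required(VERSION 3.20)
project(dep_sketch NONE)

# Producer: a custom command generates a.bc, and a custom target owns it.
add_custom_command(OUTPUT a.bc
  COMMAND ${CMAKE_COMMAND} -E touch a.bc
)
add_custom_target(gen_a DEPENDS a.bc)

# Consumer: listing the target gen_a in DEPENDS creates a target-level
# dependency, so gen_a is built before any target that uses this command.
# The file is listed as well so b.bc is regenerated when a.bc changes.
add_custom_command(OUTPUT b.bc
  COMMAND ${CMAKE_COMMAND} -E copy a.bc b.bc
  DEPENDS gen_a ${CMAKE_CURRENT_BINARY_DIR}/a.bc
)
add_custom_target(gen_b ALL DEPENDS b.bc)
```

With only the file listed, the rule for a.bc can end up attached to both targets, and nothing orders the two builds relative to each other, which is the kind of race suspected in this PR.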

https://github.com/llvm/llvm-project/pull/132338


[libclc] [libclc] link_bc target should depends on target builtins.link.clc-arch_suffix (PR #132338)

2025-03-20 Thread Wenju He via cfe-commits

https://github.com/wenju-he edited 
https://github.com/llvm/llvm-project/pull/132338


[libclc] [libclc] link_bc target should depends on target builtins.link.clc-arch_suffix (PR #132338)

2025-03-20 Thread Wenju He via cfe-commits

wenju-he wrote:

@frasercrmck, could you please review? Thanks.

https://github.com/llvm/llvm-project/pull/132338


[libclc] [libclc] Skip opt command if opt_flags is empty (PR #130882)

2025-03-24 Thread Wenju He via cfe-commits

wenju-he wrote:

@frasercrmck, could you please merge this? I don't have merge access. Thanks.

https://github.com/llvm/llvm-project/pull/130882


[libclc] [libclc] Skip opt command if opt_flags is empty (PR #130882)

2025-03-18 Thread Wenju He via cfe-commits

https://github.com/wenju-he updated 
https://github.com/llvm/llvm-project/pull/130882


[libclc] [libclc] Fix commands in compile_to_bc are executed sequentially (PR #130755)

2025-03-18 Thread Wenju He via cfe-commits

wenju-he wrote:

> Am I right in thinking that CMake 3.27's `DEPENDS_EXPLICIT_ONLY` would also 
> fix this? If so it might be worth documenting this explicitly, both in the 
> PR/commit and in the code? We might be able to refactor this in the future, 
> when LLVM updates its minimum version to 3.27.

Done.
I noticed DEPENDS_EXPLICIT_ONLY in the CMake issue linked above, but I hadn't 
tested it.
I have just tested DEPENDS_EXPLICIT_ONLY on Windows, and it does indeed fix the 
sequential build issue.
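
A rough sketch of what using that option could look like once the minimum CMake version allows it. This is not what the PR implements (the PR adds one custom target per command); `in.cl`, `out.bc` and `compile_out` are hypothetical names, and the copy command merely stands in for the real clang invocation.

```cmake
cmake_minimum_required(VERSION 3.27)
project(explicit_only_sketch NONE)

# DEPENDS_EXPLICIT_ONLY (CMake >= 3.27) declares that DEPENDS lists every
# file the command needs, so no implicit dependencies have to be added.
# As far as I understand the documentation, only the Ninja generators
# act on this hint.
add_custom_command(OUTPUT out.bc
  COMMAND ${CMAKE_COMMAND} -E copy
          ${CMAKE_CURRENT_SOURCE_DIR}/in.cl out.bc
  DEPENDS ${CMAKE_CURRENT_SOURCE_DIR}/in.cl
  DEPENDS_EXPLICIT_ONLY
)
add_custom_target(compile_out ALL DEPENDS out.bc)
```

The sketch assumes a file named in.cl exists next to the CMakeLists.txt.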

https://github.com/llvm/llvm-project/pull/130755


[libclc] [libclc] Skip opt command if opt_flags is empty (PR #130882)

2025-03-18 Thread Wenju He via cfe-commits

https://github.com/wenju-he edited 
https://github.com/llvm/llvm-project/pull/130882


[libclc] [libclc] Skip opt command if opt_flags is empty (PR #130882)

2025-03-18 Thread Wenju He via cfe-commits

https://github.com/wenju-he updated 
https://github.com/llvm/llvm-project/pull/130882


[libclc] [libclc] Fix commands in compile_to_bc are executed sequentially (PR #130755)

2025-03-18 Thread Wenju He via cfe-commits

https://github.com/wenju-he updated 
https://github.com/llvm/llvm-project/pull/130755

>From 1f8b5bfbfea6b562e9cae088256e8e5dddf0a335 Mon Sep 17 00:00:00 2001
From: Wenju He 
Date: Tue, 11 Mar 2025 04:24:36 -0700
Subject: [PATCH 1/3] [libclc] Fix commands in compile_to_bc are executed
 sequentially

In libclc, we observe that compiling OpenCL source files to bitcode is
executed sequentially on Windows, which increases debug build time by
about an hour.
add_custom_command may introduce additional implicit dependencies, see
https://gitlab.kitware.com/cmake/cmake/-/issues/17097
This PR adds a target for each command and enables parallel builds of
OpenCL source files.
---
 libclc/cmake/modules/AddLibclc.cmake | 18 +++---
 1 file changed, 15 insertions(+), 3 deletions(-)

diff --git a/libclc/cmake/modules/AddLibclc.cmake 
b/libclc/cmake/modules/AddLibclc.cmake
index 911559ff4bfa9..9dc328fcd489c 100644
--- a/libclc/cmake/modules/AddLibclc.cmake
+++ b/libclc/cmake/modules/AddLibclc.cmake
@@ -1,6 +1,8 @@
 # Compiles an OpenCL C - or assembles an LL file - to bytecode
 #
 # Arguments:
+# * TARGET 
+# Custom target to create
 # * TRIPLE 
 # Target triple for which to compile the bytecode file.
 # * INPUT 
@@ -17,7 +19,7 @@
 function(compile_to_bc)
   cmake_parse_arguments(ARG
 ""
-"TRIPLE;INPUT;OUTPUT"
+"TARGET;TRIPLE;INPUT;OUTPUT"
 "EXTRA_OPTS;DEPENDENCIES"
 ${ARGN}
   )
@@ -60,9 +62,11 @@ function(compile_to_bc)
 DEPENDS
   ${clang_target}
   ${ARG_INPUT}
-  ${ARG_DEPENDENCIES}
 DEPFILE ${ARG_OUTPUT}.d
   )
+  add_custom_target( ${ARG_TARGET}
+DEPENDS ${ARG_OUTPUT}${TMP_SUFFIX} ${ARG_DEPENDENCIES}
+  )
 
   if( ${FILE_EXT} STREQUAL ".ll" )
 add_custom_command(
@@ -70,6 +74,7 @@ function(compile_to_bc)
   COMMAND ${llvm-as_exe} -o ${ARG_OUTPUT} ${ARG_OUTPUT}${TMP_SUFFIX}
   DEPENDS ${llvm-as_target} ${ARG_OUTPUT}${TMP_SUFFIX}
 )
+add_custom_target( ${ARG_TARGET}-as DEPENDS ${ARG_OUTPUT} )
   endif()
 endfunction()
 
@@ -227,6 +232,7 @@ function(add_libclc_builtin_set)
 
   set( bytecode_files )
   set( bytecode_ir_files )
+  set( compile_tgts )
   foreach( file IN LISTS ARG_GEN_FILES ARG_LIB_FILES )
 # We need to take each file and produce an absolute input file, as well
 # as a unique architecture-specific output file. We deal with a mix of
@@ -256,7 +262,11 @@ function(add_libclc_builtin_set)
 
 get_filename_component( file_dir ${file} DIRECTORY )
 
+string( REPLACE "/" "-" replaced ${file} )
+set( tgt compile_tgt-${ARG_ARCH_SUFFIX}${replaced})
+
 compile_to_bc(
+  TARGET ${tgt}
   TRIPLE ${ARG_TRIPLE}
   INPUT ${input_file}
   OUTPUT ${output_file}
@@ -264,11 +274,13 @@ function(add_libclc_builtin_set)
 "${ARG_COMPILE_FLAGS}" -I${CMAKE_CURRENT_SOURCE_DIR}/${file_dir}
   DEPENDENCIES ${input_file_dep}
 )
+list( APPEND compile_tgts ${tgt} )
 
 # Collect all files originating in LLVM IR separately
 get_filename_component( file_ext ${file} EXT )
 if( ${file_ext} STREQUAL ".ll" )
   list( APPEND bytecode_ir_files ${output_file} )
+  list( APPEND compile_tgts ${tgt}-as )
 else()
   list( APPEND bytecode_files ${output_file} )
 endif()
@@ -283,7 +295,7 @@ function(add_libclc_builtin_set)
 
   set( builtins_comp_lib_tgt builtins.comp.${ARG_ARCH_SUFFIX} )
   add_custom_target( ${builtins_comp_lib_tgt}
-DEPENDS ${bytecode_files}
+DEPENDS ${compile_tgts}
   )
   set_target_properties( ${builtins_comp_lib_tgt} PROPERTIES FOLDER 
"libclc/Device IR/Comp" )
 

>From 318148023265ea8e71d7c1d65e932748bacd417a Mon Sep 17 00:00:00 2001
From: Wenju He 
Date: Tue, 11 Mar 2025 06:04:41 -0700
Subject: [PATCH 2/3] revert ARG_DEPENDENCIES change

---
 libclc/cmake/modules/AddLibclc.cmake | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/libclc/cmake/modules/AddLibclc.cmake 
b/libclc/cmake/modules/AddLibclc.cmake
index 9dc328fcd489c..fba3b0110aaec 100644
--- a/libclc/cmake/modules/AddLibclc.cmake
+++ b/libclc/cmake/modules/AddLibclc.cmake
@@ -62,11 +62,10 @@ function(compile_to_bc)
 DEPENDS
   ${clang_target}
   ${ARG_INPUT}
+  ${ARG_DEPENDENCIES}
 DEPFILE ${ARG_OUTPUT}.d
   )
-  add_custom_target( ${ARG_TARGET}
-DEPENDS ${ARG_OUTPUT}${TMP_SUFFIX} ${ARG_DEPENDENCIES}
-  )
+  add_custom_target( ${ARG_TARGET} DEPENDS ${ARG_OUTPUT}${TMP_SUFFIX} )
 
   if( ${FILE_EXT} STREQUAL ".ll" )
 add_custom_command(

>From ea2f503e3c8ea76de06dccf125f985a862535ae8 Mon Sep 17 00:00:00 2001
From: Wenju He 
Date: Tue, 18 Mar 2025 03:14:41 -0700
Subject: [PATCH 3/3] add FIXME about DEPENDS_EXPLICIT_ONLY

---
 libclc/cmake/modules/AddLibclc.cmake | 5 +
 1 file changed, 5 insertions(+)

diff --git a/libclc/cmake/modules/AddLibclc.cmake 
b/libclc/cmake/modules/AddLibclc.cmake
index fba3b0110aaec..5d03acc73b3d0 100644
--- a/libclc/cmake/modules/AddLibclc.cmake
+++ b/libclc/cmake/modules/AddLibclc.cmake
@@ -65,6 +65,11 @@ funct

[libclc] [libclc] Skip opt command if opt_flags is empty (PR #130882)

2025-03-18 Thread Wenju He via cfe-commits

https://github.com/wenju-he updated 
https://github.com/llvm/llvm-project/pull/130882


[libclc] [libclc] Skip opt command if opt_flags is empty (PR #130882)

2025-03-18 Thread Wenju He via cfe-commits

https://github.com/wenju-he updated 
https://github.com/llvm/llvm-project/pull/130882

>From 1727cb49ebbee324ecad0a766ec341eb1aed082b Mon Sep 17 00:00:00 2001
From: Wenju He 
Date: Tue, 11 Mar 2025 19:05:25 -0700
Subject: [PATCH 1/3] [libclc] Skip opt command if opt_flags is empty

When the flag is empty, the opt command won't modify the bitcode;
however, the command is slow for large bitcode files in debug mode.
---
 libclc/cmake/modules/AddLibclc.cmake | 42 +---
 1 file changed, 25 insertions(+), 17 deletions(-)

diff --git a/libclc/cmake/modules/AddLibclc.cmake 
b/libclc/cmake/modules/AddLibclc.cmake
index 911559ff4bfa9..de24848256d72 100644
--- a/libclc/cmake/modules/AddLibclc.cmake
+++ b/libclc/cmake/modules/AddLibclc.cmake
@@ -340,29 +340,37 @@ function(add_libclc_builtin_set)
 return()
   endif()
 
-  set( builtins_opt_lib_tgt builtins.opt.${ARG_ARCH_SUFFIX} )
+  if( ${ARG_OPT_FLAGS} )
+set( builtins_opt_lib_tgt builtins.opt.${ARG_ARCH_SUFFIX} )
+
+# Add opt target
+add_custom_command( OUTPUT ${builtins_opt_lib_tgt}.bc
+  COMMAND ${opt_exe} ${ARG_OPT_FLAGS} -o ${builtins_opt_lib_tgt}.bc
+${builtins_link_lib}
+  DEPENDS ${opt_target} ${builtins_link_lib} ${builtins_link_lib_tgt}
+)
+add_custom_target( ${builtins_opt_lib_tgt}
+  ALL DEPENDS ${builtins_opt_lib_tgt}.bc
+)
+set_target_properties( ${builtins_opt_lib_tgt} PROPERTIES
+  TARGET_FILE ${CMAKE_CURRENT_BINARY_DIR}/${builtins_opt_lib_tgt}.bc
+  FOLDER "libclc/Device IR/Opt"
+)
 
-  # Add opt target
-  add_custom_command( OUTPUT ${builtins_opt_lib_tgt}.bc
-COMMAND ${opt_exe} ${ARG_OPT_FLAGS} -o ${builtins_opt_lib_tgt}.bc
-  ${builtins_link_lib}
-DEPENDS ${opt_target} ${builtins_link_lib} ${builtins_link_lib_tgt}
-  )
-  add_custom_target( ${builtins_opt_lib_tgt}
-ALL DEPENDS ${builtins_opt_lib_tgt}.bc
-  )
-  set_target_properties( ${builtins_opt_lib_tgt} PROPERTIES
-TARGET_FILE ${CMAKE_CURRENT_BINARY_DIR}/${builtins_opt_lib_tgt}.bc
-FOLDER "libclc/Device IR/Opt"
-  )
+set( builtins_opt_lib 
$ )
 
-  set( builtins_opt_lib $ 
)
+set( builtins_link_opt_lib ${builtins_opt_lib} )
+set( builtins_link_opt_lib_tgt ${builtins_opt_lib_tgt} )
+  else()
+set( builtins_link_opt_lib ${builtins_link_lib} )
+set( builtins_link_opt_lib_tgt ${builtins_link_lib_tgt} )
+  endif()
 
   # Add prepare target
   set( obj_suffix ${ARG_ARCH_SUFFIX}.bc )
   add_custom_command( OUTPUT ${obj_suffix}
-COMMAND ${prepare_builtins_exe} -o ${obj_suffix} ${builtins_opt_lib}
-DEPENDS ${builtins_opt_lib} ${builtins_opt_lib_tgt} ${prepare_builtins_target} )
+COMMAND ${prepare_builtins_exe} -o ${obj_suffix} ${builtins_link_opt_lib}
+DEPENDS ${builtins_link_opt_lib} ${builtins_link_opt_lib_tgt} ${prepare_builtins_target} )
   add_custom_target( prepare-${obj_suffix} ALL DEPENDS ${obj_suffix} )
   set_target_properties( "prepare-${obj_suffix}" PROPERTIES FOLDER "libclc/Device IR/Prepare" )
 

>From d3543f3601469b2e2b51f3fa275019f06c8378e6 Mon Sep 17 00:00:00 2001
From: Wenju He 
Date: Tue, 18 Mar 2025 04:04:34 -0700
Subject: [PATCH 2/3] unconditionally add builtins_opt_lib_tgt, which is empty
 if OPT_FLAGS is empty

---
 libclc/cmake/modules/AddLibclc.cmake | 30 +---
 1 file changed, 14 insertions(+), 16 deletions(-)

diff --git a/libclc/cmake/modules/AddLibclc.cmake b/libclc/cmake/modules/AddLibclc.cmake
index de24848256d72..14f28a0b525b0 100644
--- a/libclc/cmake/modules/AddLibclc.cmake
+++ b/libclc/cmake/modules/AddLibclc.cmake
@@ -340,37 +340,35 @@ function(add_libclc_builtin_set)
 return()
   endif()
 
-  if( ${ARG_OPT_FLAGS} )
-set( builtins_opt_lib_tgt builtins.opt.${ARG_ARCH_SUFFIX} )
+  set( builtins_opt_lib_tgt builtins.opt.${ARG_ARCH_SUFFIX} )
+  add_custom_target( ${builtins_opt_lib_tgt} ALL )
+  set_target_properties( ${builtins_opt_lib_tgt} PROPERTIES
+FOLDER "libclc/Device IR/Opt"
+  )
+  add_dependencies( ${builtins_opt_lib_tgt} ${builtins_link_lib_tgt} )
 
-# Add opt target
+  # Add opt target
+  if( ${ARG_OPT_FLAGS} STREQUAL "" )
+# no-op
+set( builtins_opt_lib ${builtins_link_lib} )
+  else()
 add_custom_command( OUTPUT ${builtins_opt_lib_tgt}.bc
   COMMAND ${opt_exe} ${ARG_OPT_FLAGS} -o ${builtins_opt_lib_tgt}.bc
 ${builtins_link_lib}
   DEPENDS ${opt_target} ${builtins_link_lib} ${builtins_link_lib_tgt}
 )
-add_custom_target( ${builtins_opt_lib_tgt}
-  ALL DEPENDS ${builtins_opt_lib_tgt}.bc
-)
 set_target_properties( ${builtins_opt_lib_tgt} PROPERTIES
   TARGET_FILE ${CMAKE_CURRENT_BINARY_DIR}/${builtins_opt_lib_tgt}.bc
-  FOLDER "libclc/Device IR/Opt"
+  DEPENDS ${builtins_opt_lib_tgt}.bc
 )
-
 set( builtins_opt_lib 
$ )
-
-set( builtins_link_opt_lib ${builtins_opt_lib} )
-set( builtins_link_opt_lib_tgt ${builtins_opt_lib_tgt} )
-  else()
-set( builtins_link_opt_lib ${builtins_link_lib} )
-set

[libclc] [libclc] Skip opt command if opt_flags is empty (PR #130882)

2025-03-18 Thread Wenju He via cfe-commits


@@ -342,21 +342,32 @@ function(add_libclc_builtin_set)
 
   set( builtins_opt_lib_tgt builtins.opt.${ARG_ARCH_SUFFIX} )
 
-  # Add opt target
-  add_custom_command( OUTPUT ${builtins_opt_lib_tgt}.bc
-COMMAND ${opt_exe} ${ARG_OPT_FLAGS} -o ${builtins_opt_lib_tgt}.bc
-  ${builtins_link_lib}
-DEPENDS ${opt_target} ${builtins_link_lib} ${builtins_link_lib_tgt}
-  )
-  add_custom_target( ${builtins_opt_lib_tgt}
-ALL DEPENDS ${builtins_opt_lib_tgt}.bc
-  )
-  set_target_properties( ${builtins_opt_lib_tgt} PROPERTIES
-TARGET_FILE ${CMAKE_CURRENT_BINARY_DIR}/${builtins_opt_lib_tgt}.bc
-FOLDER "libclc/Device IR/Opt"
-  )
+  if( ${ARG_OPT_FLAGS} STREQUAL "" )
+# Add empty opt target.
+add_custom_target( ${builtins_opt_lib_tgt} ALL )
+set_target_properties( ${builtins_opt_lib_tgt} PROPERTIES

wenju-he wrote:

Thanks for the suggestion. Done.

https://github.com/llvm/llvm-project/pull/130882
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[libclc] [libclc] Skip opt command if opt_flags is empty (PR #130882)

2025-03-19 Thread Wenju He via cfe-commits

wenju-he wrote:

> Another option would be to unconditionally have builtin.opt targets but if no 
> flags are passed, just make them empty targets that rely only on builtin.link 
> targets. 

Thanks for the suggestion. Done. Please review again.
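
A minimal sketch of the suggestion quoted above, not the code that landed in AddLibclc.cmake. The helper add_opt_stage, its parameters, and the OPT_EXE variable are hypothetical names chosen for illustration. The condition quotes the flags variable so that it stays well-formed when the variable expands to nothing.

```cmake
# Hypothetical helper: add_opt_stage, its parameters and OPT_EXE are
# illustrative names, not part of libclc.
function(add_opt_stage name input_bc input_tgt opt_flags)
  if("${opt_flags}" STREQUAL "")
    # No flags: create an empty target that is only ordered after the
    # link stage, and pass the linked bitcode through unchanged.
    add_custom_target(${name} ALL)
    add_dependencies(${name} ${input_tgt})
    set(output_bc ${input_bc})
  else()
    # Flags given: run opt and make the target depend on its output.
    set(output_bc ${CMAKE_CURRENT_BINARY_DIR}/${name}.bc)
    add_custom_command(OUTPUT ${output_bc}
      COMMAND ${OPT_EXE} ${opt_flags} -o ${output_bc} ${input_bc}
      DEPENDS ${input_bc} ${input_tgt}
    )
    add_custom_target(${name} ALL DEPENDS ${output_bc})
  endif()
  # Record the file that later stages should consume.
  set_target_properties(${name} PROPERTIES TARGET_FILE ${output_bc})
endfunction()
```

With this shape, later steps can read the TARGET_FILE property of the opt target without needing to know whether opt actually ran.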

https://github.com/llvm/llvm-project/pull/130882


[libclc] [libclc] Fix commands in compile_to_bc are executed sequentially (PR #130755)

2025-03-18 Thread Wenju He via cfe-commits

https://github.com/wenju-he edited 
https://github.com/llvm/llvm-project/pull/130755


[clang] [clang-format] Add 'cl' to enable OpenCL kernel file formatting (PR #134529)

2025-04-06 Thread Wenju He via cfe-commits

https://github.com/wenju-he created 
https://github.com/llvm/llvm-project/pull/134529

None

>From ac389b8b92fbb77c8884515d8f7293b4af17dfa5 Mon Sep 17 00:00:00 2001
From: Wenju He 
Date: Sun, 6 Apr 2025 18:30:42 +0800
Subject: [PATCH] [clang-format] Add 'cl' to enable OpenCL kernel file
 formatting

---
 clang/tools/clang-format/git-clang-format | 1 +
 1 file changed, 1 insertion(+)

diff --git a/clang/tools/clang-format/git-clang-format b/clang/tools/clang-format/git-clang-format
index 85eff4761e289..ba324b14ab80d 100755
--- a/clang/tools/clang-format/git-clang-format
+++ b/clang/tools/clang-format/git-clang-format
@@ -126,6 +126,7 @@ def main():
 "pb.txt",
 "textproto",
 "asciipb",  # TextProto
+"cl", # OpenCL
 ]
 )
 



[clang] [clang-format] Add 'cl' to enable OpenCL kernel file formatting (PR #134529)

2025-04-06 Thread Wenju He via cfe-commits

https://github.com/wenju-he edited 
https://github.com/llvm/llvm-project/pull/134529


[libclc] [libclc] Fix commands in compile_to_bc are executed sequentially (PR #130755)

2025-04-05 Thread Wenju He via cfe-commits

https://github.com/wenju-he updated 
https://github.com/llvm/llvm-project/pull/130755

>From 1f8b5bfbfea6b562e9cae088256e8e5dddf0a335 Mon Sep 17 00:00:00 2001
From: Wenju He 
Date: Tue, 11 Mar 2025 04:24:36 -0700
Subject: [PATCH 1/4] [libclc] Fix commands in compile_to_bc are executed
 sequentially

In libclc, we observe that compiling OpenCL source files to bitcode is
executed sequentially on Windows, which increases debug build time by
about an hour.
add_custom_command may introduce additional implicit dependencies, see
https://gitlab.kitware.com/cmake/cmake/-/issues/17097
This PR adds a target for each command and enables parallel builds of
OpenCL source files.
---
 libclc/cmake/modules/AddLibclc.cmake | 18 +++---
 1 file changed, 15 insertions(+), 3 deletions(-)

diff --git a/libclc/cmake/modules/AddLibclc.cmake b/libclc/cmake/modules/AddLibclc.cmake
index 911559ff4bfa9..9dc328fcd489c 100644
--- a/libclc/cmake/modules/AddLibclc.cmake
+++ b/libclc/cmake/modules/AddLibclc.cmake
@@ -1,6 +1,8 @@
 # Compiles an OpenCL C - or assembles an LL file - to bytecode
 #
 # Arguments:
+# * TARGET 
+# Custom target to create
 # * TRIPLE 
 # Target triple for which to compile the bytecode file.
 # * INPUT 
@@ -17,7 +19,7 @@
 function(compile_to_bc)
   cmake_parse_arguments(ARG
 ""
-"TRIPLE;INPUT;OUTPUT"
+"TARGET;TRIPLE;INPUT;OUTPUT"
 "EXTRA_OPTS;DEPENDENCIES"
 ${ARGN}
   )
@@ -60,9 +62,11 @@ function(compile_to_bc)
 DEPENDS
   ${clang_target}
   ${ARG_INPUT}
-  ${ARG_DEPENDENCIES}
 DEPFILE ${ARG_OUTPUT}.d
   )
+  add_custom_target( ${ARG_TARGET}
+DEPENDS ${ARG_OUTPUT}${TMP_SUFFIX} ${ARG_DEPENDENCIES}
+  )
 
   if( ${FILE_EXT} STREQUAL ".ll" )
 add_custom_command(
@@ -70,6 +74,7 @@ function(compile_to_bc)
   COMMAND ${llvm-as_exe} -o ${ARG_OUTPUT} ${ARG_OUTPUT}${TMP_SUFFIX}
   DEPENDS ${llvm-as_target} ${ARG_OUTPUT}${TMP_SUFFIX}
 )
+add_custom_target( ${ARG_TARGET}-as DEPENDS ${ARG_OUTPUT} )
   endif()
 endfunction()
 
@@ -227,6 +232,7 @@ function(add_libclc_builtin_set)
 
   set( bytecode_files )
   set( bytecode_ir_files )
+  set( compile_tgts )
   foreach( file IN LISTS ARG_GEN_FILES ARG_LIB_FILES )
 # We need to take each file and produce an absolute input file, as well
 # as a unique architecture-specific output file. We deal with a mix of
@@ -256,7 +262,11 @@ function(add_libclc_builtin_set)
 
 get_filename_component( file_dir ${file} DIRECTORY )
 
+string( REPLACE "/" "-" replaced ${file} )
+set( tgt compile_tgt-${ARG_ARCH_SUFFIX}${replaced})
+
 compile_to_bc(
+  TARGET ${tgt}
   TRIPLE ${ARG_TRIPLE}
   INPUT ${input_file}
   OUTPUT ${output_file}
@@ -264,11 +274,13 @@ function(add_libclc_builtin_set)
 "${ARG_COMPILE_FLAGS}" -I${CMAKE_CURRENT_SOURCE_DIR}/${file_dir}
   DEPENDENCIES ${input_file_dep}
 )
+list( APPEND compile_tgts ${tgt} )
 
 # Collect all files originating in LLVM IR separately
 get_filename_component( file_ext ${file} EXT )
 if( ${file_ext} STREQUAL ".ll" )
   list( APPEND bytecode_ir_files ${output_file} )
+  list( APPEND compile_tgts ${tgt}-as )
 else()
   list( APPEND bytecode_files ${output_file} )
 endif()
@@ -283,7 +295,7 @@ function(add_libclc_builtin_set)
 
   set( builtins_comp_lib_tgt builtins.comp.${ARG_ARCH_SUFFIX} )
   add_custom_target( ${builtins_comp_lib_tgt}
-DEPENDS ${bytecode_files}
+DEPENDS ${compile_tgts}
   )
   set_target_properties( ${builtins_comp_lib_tgt} PROPERTIES FOLDER "libclc/Device IR/Comp" )
 

>From 318148023265ea8e71d7c1d65e932748bacd417a Mon Sep 17 00:00:00 2001
From: Wenju He 
Date: Tue, 11 Mar 2025 06:04:41 -0700
Subject: [PATCH 2/4] revert ARG_DEPENDENCIES change

---
 libclc/cmake/modules/AddLibclc.cmake | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/libclc/cmake/modules/AddLibclc.cmake b/libclc/cmake/modules/AddLibclc.cmake
index 9dc328fcd489c..fba3b0110aaec 100644
--- a/libclc/cmake/modules/AddLibclc.cmake
+++ b/libclc/cmake/modules/AddLibclc.cmake
@@ -62,11 +62,10 @@ function(compile_to_bc)
 DEPENDS
   ${clang_target}
   ${ARG_INPUT}
+  ${ARG_DEPENDENCIES}
 DEPFILE ${ARG_OUTPUT}.d
   )
-  add_custom_target( ${ARG_TARGET}
-DEPENDS ${ARG_OUTPUT}${TMP_SUFFIX} ${ARG_DEPENDENCIES}
-  )
+  add_custom_target( ${ARG_TARGET} DEPENDS ${ARG_OUTPUT}${TMP_SUFFIX} )
 
   if( ${FILE_EXT} STREQUAL ".ll" )
 add_custom_command(

>From ea2f503e3c8ea76de06dccf125f985a862535ae8 Mon Sep 17 00:00:00 2001
From: Wenju He 
Date: Tue, 18 Mar 2025 03:14:41 -0700
Subject: [PATCH 3/4] add FIXME about DEPENDS_EXPLICIT_ONLY

---
 libclc/cmake/modules/AddLibclc.cmake | 5 +
 1 file changed, 5 insertions(+)

diff --git a/libclc/cmake/modules/AddLibclc.cmake b/libclc/cmake/modules/AddLibclc.cmake
index fba3b0110aaec..5d03acc73b3d0 100644
--- a/libclc/cmake/modules/AddLibclc.cmake
+++ b/libclc/cmake/modules/AddLibclc.cmake
@@ -65,6 +65,11 @@ funct

[libclc] [libclc] Fix commands in compile_to_bc are executed sequentially (PR #130755)

2025-04-05 Thread Wenju He via cfe-commits


@@ -283,7 +294,7 @@ function(add_libclc_builtin_set)
 
   set( builtins_comp_lib_tgt builtins.comp.${ARG_ARCH_SUFFIX} )
   add_custom_target( ${builtins_comp_lib_tgt}
-DEPENDS ${bytecode_files}
+DEPENDS ${compile_tgts}

wenju-he wrote:

Thanks, I'll add the ${bytecode_files} dependency back.

https://github.com/llvm/llvm-project/pull/130755


[libclc] [NFC][libclc] Merge atomic extension built-ins with identical name into a single file (PR #134489)

2025-04-05 Thread Wenju He via cfe-commits

https://github.com/wenju-he created 
https://github.com/llvm/llvm-project/pull/134489

Similar to how the cl_khr_fp64 and cl_khr_fp16 implementations are put in the 
same file for math built-ins, this PR does the same for atom_* built-ins.

The main motivation is to prevent two files with the same base name from 
implementing different built-ins. In a follow-up PR, I'd like to relax 
libclc_configure_lib_source to compare only the filename instead of the path 
for overriding, since in our downstream the same category of built-ins, e.g. 
math, is organized in several different folders.

>From 310f537cb36fdb6cfe208860ad9c149c4a0c372a Mon Sep 17 00:00:00 2001
From: Wenju He 
Date: Sat, 5 Apr 2025 02:08:28 -0700
Subject: [PATCH] [NFC][libclc] Merge atomic extension built-ins with identical
 name into a single file

Similar to how the cl_khr_fp64 and cl_khr_fp16 implementations are put in
the same file for math built-ins, this PR does the same for atom_* built-ins.

The main motivation is to prevent two files with the same base name from
implementing different built-ins. In a follow-up PR, I'd like to relax
libclc_configure_lib_source to compare only the filename instead of the
path for overriding, since in our downstream the same category of
built-ins, e.g. math, is organized in several different folders.
---
 .../atom_add.h| 12 +++-
 .../atom_and.h| 12 +++-
 .../generic/include/clc/atomic/atom_cmpxchg.h | 36 +++
 .../atom_dec.h| 15 -
 .../clc/{ => atomic}/atom_decl_int32.inc  |  7 +-
 .../clc/{ => atomic}/atom_decl_int64.inc  |  7 +-
 .../atom_inc.h| 15 -
 .../atom_max.h| 13 +++-
 .../atom_min.h| 12 +++-
 .../atom_or.h | 12 +++-
 .../atom_sub.h| 12 +++-
 .../atom_xchg.h   | 12 +++-
 .../atom_xor.h| 12 +++-
 .../atom_cmpxchg.h| 10 ---
 .../atom_dec.h| 10 ---
 .../atom_inc.h| 10 ---
 .../clc/cl_khr_int64_base_atomics/atom_add.h  | 10 ---
 .../cl_khr_int64_base_atomics/atom_cmpxchg.h  | 12 
 .../clc/cl_khr_int64_base_atomics/atom_sub.h  | 10 ---
 .../clc/cl_khr_int64_base_atomics/atom_xchg.h | 10 ---
 .../cl_khr_int64_extended_atomics/atom_and.h  | 10 ---
 .../cl_khr_int64_extended_atomics/atom_max.h  | 10 ---
 .../cl_khr_int64_extended_atomics/atom_min.h  | 10 ---
 .../cl_khr_int64_extended_atomics/atom_or.h   | 10 ---
 .../cl_khr_int64_extended_atomics/atom_xor.h  | 10 ---
 .../atom_add.h| 11 
 .../atom_cmpxchg.h| 10 ---
 .../atom_dec.h| 10 ---
 .../atom_inc.h| 10 ---
 .../atom_sub.h| 11 
 .../atom_xchg.h   | 11 
 .../atom_and.h| 11 
 .../atom_max.h| 11 
 .../atom_min.h| 11 
 .../atom_or.h | 11 
 .../atom_xor.h| 11 
 libclc/generic/include/clc/clc.h  | 64 +--
 libclc/generic/lib/SOURCES| 46 -
 .../atom_add.cl   | 14 +++-
 .../atom_and.cl   | 14 +++-
 .../atom_cmpxchg.cl   | 25 +++-
 .../atom_dec.cl   | 26 +++-
 .../atom_inc.cl   | 26 +++-
 .../lib/{ => atomic}/atom_int32_binary.inc|  9 +--
 .../atom_max.cl   | 14 +++-
 .../atom_min.cl   | 14 +++-
 .../atom_or.cl| 12 +++-
 .../atom_sub.cl   | 14 +++-
 .../atom_xchg.cl  | 14 +++-
 .../atom_xor.cl   | 12 +++-
 .../atom_add.cl   | 11 
 .../atom_cmpxchg.cl   | 17 -
 .../atom_dec.cl   | 17 -
 .../atom_inc.cl   | 17 -
 .../atom_sub.cl   | 11 
 .../atom_xchg.cl  | 11 
 .../atom_and.cl   | 11 
 .../atom_max.cl   | 11 
 .../atom_min.cl   | 11 
 .../atom_or.cl| 11 
 .../atom_xor.cl   | 11 
 .../atom_add.cl   | 11 
 .../atom_cmpxchg.cl   | 17 -
 .../atom_dec.cl   | 17 -
 .../atom_inc.cl

[libclc] [NFC][libclc] Merge atomic extension built-ins with identical name into a single file (PR #134489)

2025-04-05 Thread Wenju He via cfe-commits

wenju-he wrote:

@frasercrmck could you please review, thanks.

https://github.com/llvm/llvm-project/pull/134489


[libclc] [libclc] link_bc target should depends on target builtins.link.clc-arch_suffix (PR #132338)

2025-03-26 Thread Wenju He via cfe-commits

https://github.com/wenju-he updated 
https://github.com/llvm/llvm-project/pull/132338

>From 6ce54aa767f8cdff2f938cdce8656e495a1346f0 Mon Sep 17 00:00:00 2001
From: Wenju He 
Date: Thu, 20 Mar 2025 22:01:55 -0700
Subject: [PATCH 1/3] [libclc] link_bc target should depends on target
 builtins.link.clc-arch_suffix

Currently the link_bc command depends on the bitcode file that is associated
with the custom target builtins.link.clc-arch_suffix.
On Windows we randomly see the following error:
`
  Generating builtins.link.clc-${ARCH}--.bc
  Generating builtins.link.libspirv-${ARCH}.bc
  error : The requested operation cannot be performed on a file with a user-mapped section open.
`
I suspect that the builtins.link.clc-${ARCH}--.bc file is still being generated
while it is being used in link_bc.
This PR adds a target-level dependency to ensure builtins.link.clc-${ARCH}--.bc
is generated first.
---
 libclc/CMakeLists.txt| 2 +-
 libclc/cmake/modules/AddLibclc.cmake | 9 +
 2 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/libclc/CMakeLists.txt b/libclc/CMakeLists.txt
index 426f210a73fcc..3de7ee9b707a8 100644
--- a/libclc/CMakeLists.txt
+++ b/libclc/CMakeLists.txt
@@ -413,7 +413,7 @@ foreach( t ${LIBCLC_TARGETS_TO_BUILD} )
   GEN_FILES ${opencl_gen_files}
   ALIASES ${${d}_aliases}
   # Link in the CLC builtins and internalize their symbols
-  INTERNAL_LINK_DEPENDENCIES 
$
+  INTERNAL_LINK_DEPENDENCIES builtins.link.clc-${arch_suffix}
 )
   endforeach( d )
 endforeach( t )
diff --git a/libclc/cmake/modules/AddLibclc.cmake b/libclc/cmake/modules/AddLibclc.cmake
index 911559ff4bfa9..0808b39e06555 100644
--- a/libclc/cmake/modules/AddLibclc.cmake
+++ b/libclc/cmake/modules/AddLibclc.cmake
@@ -211,8 +211,9 @@ endfunction()
 #  * ALIASES  ...
 #  List of aliases
 #  * INTERNAL_LINK_DEPENDENCIES  ...
-#  A list of extra bytecode files to link into the builtin library. Symbols
-#  from these link dependencies will be internalized during linking.
+#  A list of extra bytecode file's targets. The bitcode files will be linked
+#  into the builtin library. Symbols from these link dependencies will be
+#  internalized during linking.
 function(add_libclc_builtin_set)
   cmake_parse_arguments(ARG
 "CLC_INTERNAL"
@@ -313,8 +314,8 @@ function(add_libclc_builtin_set)
   INTERNALIZE
   TARGET ${builtins_link_lib_tgt}
   INPUTS $
-${ARG_INTERNAL_LINK_DEPENDENCIES}
-  DEPENDENCIES ${builtins_link_lib_tmp_tgt}
+$
+  DEPENDENCIES ${builtins_link_lib_tmp_tgt} ${ARG_INTERNAL_LINK_DEPENDENCIES}
 )
   endif()
 

>From 7b9a65f8dac3ab5e13fcabade3e9ed4d384c5b92 Mon Sep 17 00:00:00 2001
From: Wenju He 
Date: Fri, 21 Mar 2025 01:29:51 -0700
Subject: [PATCH 2/3] update comment, fix files to link

---
 libclc/cmake/modules/AddLibclc.cmake | 8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/libclc/cmake/modules/AddLibclc.cmake b/libclc/cmake/modules/AddLibclc.cmake
index 0808b39e06555..29d728494cd3e 100644
--- a/libclc/cmake/modules/AddLibclc.cmake
+++ b/libclc/cmake/modules/AddLibclc.cmake
@@ -210,7 +210,7 @@ endfunction()
 #  Optimization options (for opt)
 #  * ALIASES  ...
 #  List of aliases
-#  * INTERNAL_LINK_DEPENDENCIES  ...
+#  * INTERNAL_LINK_DEPENDENCIES  ...
 #  A list of extra bytecode file's targets. The bitcode files will be linked
 #  into the builtin library. Symbols from these link dependencies will be
 #  internalized during linking.
@@ -310,11 +310,15 @@ function(add_libclc_builtin_set)
   INPUTS ${bytecode_files}
   DEPENDENCIES ${builtins_comp_lib_tgt}
 )
+set( internal_link_depend_files )
+foreach( tgt ${ARG_INTERNAL_LINK_DEPENDENCIES} )
+  list( APPEND internal_link_depend_files 
$ )
+endforeach()
 link_bc(
   INTERNALIZE
   TARGET ${builtins_link_lib_tgt}
   INPUTS $
-$
+${internal_link_depend_files}
   DEPENDENCIES ${builtins_link_lib_tmp_tgt} ${ARG_INTERNAL_LINK_DEPENDENCIES}
 )
   endif()

>From 7097315c4dbf474a9d0e4adbca18bf868a503436 Mon Sep 17 00:00:00 2001
From: Wenju He 
Date: Fri, 21 Mar 2025 02:05:20 -0700
Subject: [PATCH 3/3] fix

---
 libclc/cmake/modules/AddLibclc.cmake | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/libclc/cmake/modules/AddLibclc.cmake b/libclc/cmake/modules/AddLibclc.cmake
index 29d728494cd3e..fe7151e61b466 100644
--- a/libclc/cmake/modules/AddLibclc.cmake
+++ b/libclc/cmake/modules/AddLibclc.cmake
@@ -312,7 +312,7 @@ function(add_libclc_builtin_set)
 )
 set( internal_link_depend_files )
 foreach( tgt ${ARG_INTERNAL_LINK_DEPENDENCIES} )
-  list( APPEND internal_link_depend_files 
$ )
+  list( APPEND internal_link_depend_files 
$ )
 endforeach()
 link_bc(
   INTERNALIZE


[libclc] [libclc] Fix commands in compile_to_bc are executed sequentially (PR #130755)

2025-03-26 Thread Wenju He via cfe-commits

wenju-he wrote:

@frasercrmck should we check CMAKE_VERSION and use DEPENDS_EXPLICIT_ONLY if the 
version is 3.27 or higher? I suppose DEPENDS_EXPLICIT_ONLY would be more 
lightweight. We would need to keep both the DEPENDS_EXPLICIT_ONLY path and the 
new-target path until LLVM raises its minimum CMake version to 3.27.
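
One way the version check raised above could look, sketched under assumptions: compile_one and CLANG_EXE are hypothetical names, and the clang command line is only illustrative. DEPENDS_EXPLICIT_ONLY is documented as available from CMake 3.27, and per the CMake documentation only the Ninja generators act on it.

```cmake
# Sketch only: compile_one and CLANG_EXE are hypothetical names, not
# taken from AddLibclc.cmake.
function(compile_one target input output)
  if(CMAKE_VERSION VERSION_GREATER_EQUAL 3.27)
    # Newer CMake: state that DEPENDS lists everything the command needs,
    # so no implicit dependencies are added for it.
    add_custom_command(OUTPUT ${output}
      COMMAND ${CLANG_EXE} -c -emit-llvm -o ${output} ${input}
      DEPENDS ${input}
      DEPENDS_EXPLICIT_ONLY
    )
  else()
    # Older CMake: wrap the command in its own custom target, which is
    # the approach this PR takes. Callers would then add a target-level
    # dependency on ${target}.
    add_custom_command(OUTPUT ${output}
      COMMAND ${CLANG_EXE} -c -emit-llvm -o ${output} ${input}
      DEPENDS ${input}
    )
    add_custom_target(${target} DEPENDS ${output})
  endif()
endfunction()
```

Keeping both branches costs some duplication, which is the trade-off the question above is asking about.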

https://github.com/llvm/llvm-project/pull/130755


[clang] [clang-format] Add 'cl' to enable OpenCL kernel file formatting (PR #134529)

2025-04-08 Thread Wenju He via cfe-commits

https://github.com/wenju-he updated 
https://github.com/llvm/llvm-project/pull/134529

>From ac389b8b92fbb77c8884515d8f7293b4af17dfa5 Mon Sep 17 00:00:00 2001
From: Wenju He 
Date: Sun, 6 Apr 2025 18:30:42 +0800
Subject: [PATCH 1/4] [clang-format] Add 'cl' to enable OpenCL kernel file
 formatting

---
 clang/tools/clang-format/git-clang-format | 1 +
 1 file changed, 1 insertion(+)

diff --git a/clang/tools/clang-format/git-clang-format b/clang/tools/clang-format/git-clang-format
index 85eff4761e289..ba324b14ab80d 100755
--- a/clang/tools/clang-format/git-clang-format
+++ b/clang/tools/clang-format/git-clang-format
@@ -126,6 +126,7 @@ def main():
 "pb.txt",
 "textproto",
 "asciipb",  # TextProto
+"cl", # OpenCL
 ]
 )
 

>From 88c11747fcc8db1921dfd8f73c9330c662f7fd91 Mon Sep 17 00:00:00 2001
From: Wenju He 
Date: Mon, 7 Apr 2025 16:17:39 -0700
Subject: [PATCH 2/4] return C language format style for OpenCL

---
 clang/lib/Format/Format.cpp   | 5 +
 clang/test/Format/dump-config-opencl-stdin.cl | 7 +++
 clang/test/Format/lit.local.cfg   | 3 ++-
 clang/unittests/Format/FormatTest.cpp | 3 +++
 4 files changed, 17 insertions(+), 1 deletion(-)
 create mode 100644 clang/test/Format/dump-config-opencl-stdin.cl

diff --git a/clang/lib/Format/Format.cpp b/clang/lib/Format/Format.cpp
index 226d39f635676..0565d6d46eb32 100644
--- a/clang/lib/Format/Format.cpp
+++ b/clang/lib/Format/Format.cpp
@@ -4094,6 +4094,9 @@ static FormatStyle::LanguageKind getLanguageByFileName(StringRef FileName) {
   FileName.ends_with_insensitive(".vh")) {
 return FormatStyle::LK_Verilog;
   }
+  // OpenCL is based on C99 and C11.
+  if (FileName.ends_with(".cl"))
+return FormatStyle::LK_C;
   return FormatStyle::LK_Cpp;
 }
 
@@ -4121,6 +4124,8 @@ static FormatStyle::LanguageKind getLanguageByComment(const Environment &Env) {
   return FormatStyle::LK_Cpp;
 if (Text == "ObjC")
   return FormatStyle::LK_ObjC;
+if (Text == "OpenCL")
+  return FormatStyle::LK_C;
   }
 
   return FormatStyle::LK_None;
diff --git a/clang/test/Format/dump-config-opencl-stdin.cl b/clang/test/Format/dump-config-opencl-stdin.cl
new file mode 100644
index 0..d02a3fb287a42
--- /dev/null
+++ b/clang/test/Format/dump-config-opencl-stdin.cl
@@ -0,0 +1,7 @@
+// RUN: clang-format -assume-filename=foo.cl -dump-config | FileCheck %s
+
+// RUN: clang-format -dump-config - < %s | FileCheck %s
+
+// CHECK: Language: C
+
+void foo() {}
diff --git a/clang/test/Format/lit.local.cfg b/clang/test/Format/lit.local.cfg
index b060c79226cbd..3717ee0dac577 100644
--- a/clang/test/Format/lit.local.cfg
+++ b/clang/test/Format/lit.local.cfg
@@ -20,7 +20,8 @@ config.suffixes = [
 ".textpb",
 ".asciipb",
 ".td",
-".test"
+".test",
+".cl"
 ]
 
 # AIX 'diff' command doesn't support --strip-trailing-cr, but the internal
diff --git a/clang/unittests/Format/FormatTest.cpp b/clang/unittests/Format/FormatTest.cpp
index 69c9ee1d1dcb2..146ec9e0a1616 100644
--- a/clang/unittests/Format/FormatTest.cpp
+++ b/clang/unittests/Format/FormatTest.cpp
@@ -25187,6 +25187,9 @@ TEST_F(FormatTest, GetLanguageByComment) {
   EXPECT_EQ(FormatStyle::LK_ObjC,
 guessLanguage("foo.h", "// clang-format Language: ObjC\n"
"int i;"));
+  EXPECT_EQ(FormatStyle::LK_C,
+guessLanguage("foo.h", "// clang-format Language: OpenCL\n"
+   "int i;"));
 }
 
 TEST_F(FormatTest, TypenameMacros) {

>From 69825a4bd73df7bdfaf21c52880ed1441c1d4d6b Mon Sep 17 00:00:00 2001
From: Wenju He 
Date: Mon, 7 Apr 2025 23:48:38 -0700
Subject: [PATCH 3/4] Revert "return C language format style for OpenCL"

This reverts commit 88c11747fcc8db1921dfd8f73c9330c662f7fd91.
---
 clang/lib/Format/Format.cpp   | 5 -
 clang/test/Format/dump-config-opencl-stdin.cl | 7 ---
 clang/test/Format/lit.local.cfg   | 3 +--
 clang/unittests/Format/FormatTest.cpp | 3 ---
 4 files changed, 1 insertion(+), 17 deletions(-)
 delete mode 100644 clang/test/Format/dump-config-opencl-stdin.cl

diff --git a/clang/lib/Format/Format.cpp b/clang/lib/Format/Format.cpp
index 0565d6d46eb32..226d39f635676 100644
--- a/clang/lib/Format/Format.cpp
+++ b/clang/lib/Format/Format.cpp
@@ -4094,9 +4094,6 @@ static FormatStyle::LanguageKind getLanguageByFileName(StringRef FileName) {
   FileName.ends_with_insensitive(".vh")) {
 return FormatStyle::LK_Verilog;
   }
-  // OpenCL is based on C99 and C11.
-  if (FileName.ends_with(".cl"))
-return FormatStyle::LK_C;
   return FormatStyle::LK_Cpp;
 }
 
@@ -4124,8 +4121,6 @@ static FormatStyle::LanguageKind getLanguageByComment(const Environment &Env) {
   return FormatStyle::LK_Cpp;
 if (Text == "ObjC")
   return FormatStyle::LK_ObjC;
-if (Text == "OpenCL")
-  return FormatStyle::LK_C;
   }
 
   return FormatStyle

[clang] [clang-format] Add 'cl' to enable OpenCL kernel file formatting (PR #134529)

2025-04-08 Thread Wenju He via cfe-commits


@@ -126,6 +126,7 @@ def main():
 "pb.txt",
 "textproto",
 "asciipb",  # TextProto
+"cl", # OpenCL

wenju-he wrote:

Done, moved after line 108.
> Do we want to add "clcpp", # OpenCL C++?

I don't know the current development status of C++ for OpenCL in clang, so if 
there is such a request, I think it can be added in a separate PR.

https://github.com/llvm/llvm-project/pull/134529


[clang] [clang-format] Add 'cl' to enable OpenCL kernel file formatting (PR #134529)

2025-04-07 Thread Wenju He via cfe-commits

https://github.com/wenju-he updated 
https://github.com/llvm/llvm-project/pull/134529

>From ac389b8b92fbb77c8884515d8f7293b4af17dfa5 Mon Sep 17 00:00:00 2001
From: Wenju He 
Date: Sun, 6 Apr 2025 18:30:42 +0800
Subject: [PATCH 1/3] [clang-format] Add 'cl' to enable OpenCL kernel file
 formatting

---
 clang/tools/clang-format/git-clang-format | 1 +
 1 file changed, 1 insertion(+)

diff --git a/clang/tools/clang-format/git-clang-format b/clang/tools/clang-format/git-clang-format
index 85eff4761e289..ba324b14ab80d 100755
--- a/clang/tools/clang-format/git-clang-format
+++ b/clang/tools/clang-format/git-clang-format
@@ -126,6 +126,7 @@ def main():
 "pb.txt",
 "textproto",
 "asciipb",  # TextProto
+"cl", # OpenCL
 ]
 )
 

>From 88c11747fcc8db1921dfd8f73c9330c662f7fd91 Mon Sep 17 00:00:00 2001
From: Wenju He 
Date: Mon, 7 Apr 2025 16:17:39 -0700
Subject: [PATCH 2/3] return C language format style for OpenCL

---
 clang/lib/Format/Format.cpp   | 5 +
 clang/test/Format/dump-config-opencl-stdin.cl | 7 +++
 clang/test/Format/lit.local.cfg   | 3 ++-
 clang/unittests/Format/FormatTest.cpp | 3 +++
 4 files changed, 17 insertions(+), 1 deletion(-)
 create mode 100644 clang/test/Format/dump-config-opencl-stdin.cl

diff --git a/clang/lib/Format/Format.cpp b/clang/lib/Format/Format.cpp
index 226d39f635676..0565d6d46eb32 100644
--- a/clang/lib/Format/Format.cpp
+++ b/clang/lib/Format/Format.cpp
@@ -4094,6 +4094,9 @@ static FormatStyle::LanguageKind getLanguageByFileName(StringRef FileName) {
   FileName.ends_with_insensitive(".vh")) {
 return FormatStyle::LK_Verilog;
   }
+  // OpenCL is based on C99 and C11.
+  if (FileName.ends_with(".cl"))
+return FormatStyle::LK_C;
   return FormatStyle::LK_Cpp;
 }
 
@@ -4121,6 +4124,8 @@ static FormatStyle::LanguageKind getLanguageByComment(const Environment &Env) {
   return FormatStyle::LK_Cpp;
 if (Text == "ObjC")
   return FormatStyle::LK_ObjC;
+if (Text == "OpenCL")
+  return FormatStyle::LK_C;
   }
 
   return FormatStyle::LK_None;
diff --git a/clang/test/Format/dump-config-opencl-stdin.cl b/clang/test/Format/dump-config-opencl-stdin.cl
new file mode 100644
index 0..d02a3fb287a42
--- /dev/null
+++ b/clang/test/Format/dump-config-opencl-stdin.cl
@@ -0,0 +1,7 @@
+// RUN: clang-format -assume-filename=foo.cl -dump-config | FileCheck %s
+
+// RUN: clang-format -dump-config - < %s | FileCheck %s
+
+// CHECK: Language: C
+
+void foo() {}
diff --git a/clang/test/Format/lit.local.cfg b/clang/test/Format/lit.local.cfg
index b060c79226cbd..3717ee0dac577 100644
--- a/clang/test/Format/lit.local.cfg
+++ b/clang/test/Format/lit.local.cfg
@@ -20,7 +20,8 @@ config.suffixes = [
 ".textpb",
 ".asciipb",
 ".td",
-".test"
+".test",
+".cl"
 ]
 
 # AIX 'diff' command doesn't support --strip-trailing-cr, but the internal
diff --git a/clang/unittests/Format/FormatTest.cpp 
b/clang/unittests/Format/FormatTest.cpp
index 69c9ee1d1dcb2..146ec9e0a1616 100644
--- a/clang/unittests/Format/FormatTest.cpp
+++ b/clang/unittests/Format/FormatTest.cpp
@@ -25187,6 +25187,9 @@ TEST_F(FormatTest, GetLanguageByComment) {
   EXPECT_EQ(FormatStyle::LK_ObjC,
 guessLanguage("foo.h", "// clang-format Language: ObjC\n"
"int i;"));
+  EXPECT_EQ(FormatStyle::LK_C,
+guessLanguage("foo.h", "// clang-format Language: OpenCL\n"
+   "int i;"));
 }
 
 TEST_F(FormatTest, TypenameMacros) {

>From 69825a4bd73df7bdfaf21c52880ed1441c1d4d6b Mon Sep 17 00:00:00 2001
From: Wenju He 
Date: Mon, 7 Apr 2025 23:48:38 -0700
Subject: [PATCH 3/3] Revert "return C language format style for OpenCL"

This reverts commit 88c11747fcc8db1921dfd8f73c9330c662f7fd91.
---
 clang/lib/Format/Format.cpp   | 5 -
 clang/test/Format/dump-config-opencl-stdin.cl | 7 ---
 clang/test/Format/lit.local.cfg   | 3 +--
 clang/unittests/Format/FormatTest.cpp | 3 ---
 4 files changed, 1 insertion(+), 17 deletions(-)
 delete mode 100644 clang/test/Format/dump-config-opencl-stdin.cl

diff --git a/clang/lib/Format/Format.cpp b/clang/lib/Format/Format.cpp
index 0565d6d46eb32..226d39f635676 100644
--- a/clang/lib/Format/Format.cpp
+++ b/clang/lib/Format/Format.cpp
@@ -4094,9 +4094,6 @@ static FormatStyle::LanguageKind getLanguageByFileName(StringRef FileName) {
   FileName.ends_with_insensitive(".vh")) {
 return FormatStyle::LK_Verilog;
   }
-  // OpenCL is based on C99 and C11.
-  if (FileName.ends_with(".cl"))
-return FormatStyle::LK_C;
   return FormatStyle::LK_Cpp;
 }
 
@@ -4124,8 +4121,6 @@ static FormatStyle::LanguageKind getLanguageByComment(const Environment &Env) {
   return FormatStyle::LK_Cpp;
 if (Text == "ObjC")
   return FormatStyle::LK_ObjC;
-if (Text == "OpenCL")
-  return FormatStyle::LK_C;
   }
 
   return FormatStyle

[clang] [clang-format] Add 'cl' to enable OpenCL kernel file formatting (PR #134529)

2025-04-07 Thread Wenju He via cfe-commits

wenju-he wrote:

> > > I feel like there are more places where this needs to be addressed if we 
> > > are going to add it.
> > 
> > 
> > I added support in language detection for OpenCL in 
> > [88c1174](https://github.com/llvm/llvm-project/commit/88c11747fcc8db1921dfd8f73c9330c662f7fd91).
> >  It returns C language since OpenCL is based on C99 and C11 with 
> > restrictions, see 
> > https://registry.khronos.org/OpenCL/specs/3.0-unified/html/OpenCL_C.html
> 
> OpenCL is formatted as C++ now. Do we really need to change that? See #132832 
> for potential problems.

Reverted, thanks. Keeping the default (C++) formatting is preferable if it works better.

https://github.com/llvm/llvm-project/pull/134529


[libclc] [libclc] Fix commands in compile_to_bc are executed sequentially (PR #130755)

2025-04-10 Thread Wenju He via cfe-commits


@@ -63,13 +65,15 @@ function(compile_to_bc)
   ${ARG_DEPENDENCIES}
 DEPFILE ${ARG_OUTPUT}.d
   )
+  add_custom_target( ${ARG_TARGET} DEPENDS ${ARG_OUTPUT}${TMP_SUFFIX} )

wenju-he wrote:

I have simplified the code by removing the ${ARG_TARGET}-as target.
The issue we are observing is that .cl files are compiled sequentially. We don't 
see the problem with .ll files, so we can leave that part as-is.

https://github.com/llvm/llvm-project/pull/130755


[clang] [clang-format] Add 'cl' to enable OpenCL kernel file formatting (PR #134529)

2025-04-08 Thread Wenju He via cfe-commits

https://github.com/wenju-he updated 
https://github.com/llvm/llvm-project/pull/134529

>From ac389b8b92fbb77c8884515d8f7293b4af17dfa5 Mon Sep 17 00:00:00 2001
From: Wenju He 
Date: Sun, 6 Apr 2025 18:30:42 +0800
Subject: [PATCH 1/4] [clang-format] Add 'cl' to enable OpenCL kernel file
 formatting

---
 clang/tools/clang-format/git-clang-format | 1 +
 1 file changed, 1 insertion(+)

diff --git a/clang/tools/clang-format/git-clang-format 
b/clang/tools/clang-format/git-clang-format
index 85eff4761e289..ba324b14ab80d 100755
--- a/clang/tools/clang-format/git-clang-format
+++ b/clang/tools/clang-format/git-clang-format
@@ -126,6 +126,7 @@ def main():
 "pb.txt",
 "textproto",
 "asciipb",  # TextProto
+"cl", # OpenCL
 ]
 )
 

>From 88c11747fcc8db1921dfd8f73c9330c662f7fd91 Mon Sep 17 00:00:00 2001
From: Wenju He 
Date: Mon, 7 Apr 2025 16:17:39 -0700
Subject: [PATCH 2/4] return C language format style for OpenCL

---
 clang/lib/Format/Format.cpp   | 5 +
 clang/test/Format/dump-config-opencl-stdin.cl | 7 +++
 clang/test/Format/lit.local.cfg   | 3 ++-
 clang/unittests/Format/FormatTest.cpp | 3 +++
 4 files changed, 17 insertions(+), 1 deletion(-)
 create mode 100644 clang/test/Format/dump-config-opencl-stdin.cl

diff --git a/clang/lib/Format/Format.cpp b/clang/lib/Format/Format.cpp
index 226d39f635676..0565d6d46eb32 100644
--- a/clang/lib/Format/Format.cpp
+++ b/clang/lib/Format/Format.cpp
@@ -4094,6 +4094,9 @@ static FormatStyle::LanguageKind 
getLanguageByFileName(StringRef FileName) {
   FileName.ends_with_insensitive(".vh")) {
 return FormatStyle::LK_Verilog;
   }
+  // OpenCL is based on C99 and C11.
+  if (FileName.ends_with(".cl"))
+return FormatStyle::LK_C;
   return FormatStyle::LK_Cpp;
 }
 
@@ -4121,6 +4124,8 @@ static FormatStyle::LanguageKind 
getLanguageByComment(const Environment &Env) {
   return FormatStyle::LK_Cpp;
 if (Text == "ObjC")
   return FormatStyle::LK_ObjC;
+if (Text == "OpenCL")
+  return FormatStyle::LK_C;
   }
 
   return FormatStyle::LK_None;
diff --git a/clang/test/Format/dump-config-opencl-stdin.cl 
b/clang/test/Format/dump-config-opencl-stdin.cl
new file mode 100644
index 0..d02a3fb287a42
--- /dev/null
+++ b/clang/test/Format/dump-config-opencl-stdin.cl
@@ -0,0 +1,7 @@
+// RUN: clang-format -assume-filename=foo.cl -dump-config | FileCheck %s
+
+// RUN: clang-format -dump-config - < %s | FileCheck %s
+
+// CHECK: Language: C
+
+void foo() {}
diff --git a/clang/test/Format/lit.local.cfg b/clang/test/Format/lit.local.cfg
index b060c79226cbd..3717ee0dac577 100644
--- a/clang/test/Format/lit.local.cfg
+++ b/clang/test/Format/lit.local.cfg
@@ -20,7 +20,8 @@ config.suffixes = [
 ".textpb",
 ".asciipb",
 ".td",
-".test"
+".test",
+".cl"
 ]
 
 # AIX 'diff' command doesn't support --strip-trailing-cr, but the internal
diff --git a/clang/unittests/Format/FormatTest.cpp 
b/clang/unittests/Format/FormatTest.cpp
index 69c9ee1d1dcb2..146ec9e0a1616 100644
--- a/clang/unittests/Format/FormatTest.cpp
+++ b/clang/unittests/Format/FormatTest.cpp
@@ -25187,6 +25187,9 @@ TEST_F(FormatTest, GetLanguageByComment) {
   EXPECT_EQ(FormatStyle::LK_ObjC,
 guessLanguage("foo.h", "// clang-format Language: ObjC\n"
"int i;"));
+  EXPECT_EQ(FormatStyle::LK_C,
+guessLanguage("foo.h", "// clang-format Language: OpenCL\n"
+   "int i;"));
 }
 
 TEST_F(FormatTest, TypenameMacros) {

>From 69825a4bd73df7bdfaf21c52880ed1441c1d4d6b Mon Sep 17 00:00:00 2001
From: Wenju He 
Date: Mon, 7 Apr 2025 23:48:38 -0700
Subject: [PATCH 3/4] Revert "return C language format style for OpenCL"

This reverts commit 88c11747fcc8db1921dfd8f73c9330c662f7fd91.
---
 clang/lib/Format/Format.cpp   | 5 -
 clang/test/Format/dump-config-opencl-stdin.cl | 7 ---
 clang/test/Format/lit.local.cfg   | 3 +--
 clang/unittests/Format/FormatTest.cpp | 3 ---
 4 files changed, 1 insertion(+), 17 deletions(-)
 delete mode 100644 clang/test/Format/dump-config-opencl-stdin.cl

diff --git a/clang/lib/Format/Format.cpp b/clang/lib/Format/Format.cpp
index 0565d6d46eb32..226d39f635676 100644
--- a/clang/lib/Format/Format.cpp
+++ b/clang/lib/Format/Format.cpp
@@ -4094,9 +4094,6 @@ static FormatStyle::LanguageKind 
getLanguageByFileName(StringRef FileName) {
   FileName.ends_with_insensitive(".vh")) {
 return FormatStyle::LK_Verilog;
   }
-  // OpenCL is based on C99 and C11.
-  if (FileName.ends_with(".cl"))
-return FormatStyle::LK_C;
   return FormatStyle::LK_Cpp;
 }
 
@@ -4124,8 +4121,6 @@ static FormatStyle::LanguageKind 
getLanguageByComment(const Environment &Env) {
   return FormatStyle::LK_Cpp;
 if (Text == "ObjC")
   return FormatStyle::LK_ObjC;
-if (Text == "OpenCL")
-  return FormatStyle::LK_C;
   }
 
   return FormatStyle

[libclc] [NFC][libclc] Merge atomic extension built-ins with identical name into a single file (PR #134489)

2025-04-09 Thread Wenju He via cfe-commits


@@ -6,6 +6,17 @@
 //
 
//===--===//
 
+// cl_khr_global_int32_base_atomics

wenju-he wrote:

Done. You're right that they should all be guarded since they are extensions. 
Thanks.

https://github.com/llvm/llvm-project/pull/134489


[libclc] [libclc] Fix commands in compile_to_bc are executed sequentially (PR #130755)

2025-04-10 Thread Wenju He via cfe-commits


@@ -283,7 +294,7 @@ function(add_libclc_builtin_set)
 
   set( builtins_comp_lib_tgt builtins.comp.${ARG_ARCH_SUFFIX} )
   add_custom_target( ${builtins_comp_lib_tgt}
-DEPENDS ${bytecode_files}
+DEPENDS ${compile_tgts}

wenju-he wrote:

done, added back ${bytecode_files}

https://github.com/llvm/llvm-project/pull/130755


[libclc] [libclc] Fix commands in compile_to_bc are executed sequentially (PR #130755)

2025-04-10 Thread Wenju He via cfe-commits

https://github.com/wenju-he updated 
https://github.com/llvm/llvm-project/pull/130755

>From 1f8b5bfbfea6b562e9cae088256e8e5dddf0a335 Mon Sep 17 00:00:00 2001
From: Wenju He 
Date: Tue, 11 Mar 2025 04:24:36 -0700
Subject: [PATCH 1/5] [libclc] Fix commands in compile_to_bc are executed
 sequentially

In libclc, we observe that compiling OpenCL source files to bitcode is
executed sequentially on Windows, which increases debug build time by
about an hour.
add_custom_command may introduce additional implicit dependencies, see
https://gitlab.kitware.com/cmake/cmake/-/issues/17097
This PR adds a target for each command and enables parallel builds of
OpenCL source files.
---
 libclc/cmake/modules/AddLibclc.cmake | 18 +++---
 1 file changed, 15 insertions(+), 3 deletions(-)

diff --git a/libclc/cmake/modules/AddLibclc.cmake 
b/libclc/cmake/modules/AddLibclc.cmake
index 911559ff4bfa9..9dc328fcd489c 100644
--- a/libclc/cmake/modules/AddLibclc.cmake
+++ b/libclc/cmake/modules/AddLibclc.cmake
@@ -1,6 +1,8 @@
 # Compiles an OpenCL C - or assembles an LL file - to bytecode
 #
 # Arguments:
+# * TARGET 
+# Custom target to create
 # * TRIPLE 
 # Target triple for which to compile the bytecode file.
 # * INPUT 
@@ -17,7 +19,7 @@
 function(compile_to_bc)
   cmake_parse_arguments(ARG
 ""
-"TRIPLE;INPUT;OUTPUT"
+"TARGET;TRIPLE;INPUT;OUTPUT"
 "EXTRA_OPTS;DEPENDENCIES"
 ${ARGN}
   )
@@ -60,9 +62,11 @@ function(compile_to_bc)
 DEPENDS
   ${clang_target}
   ${ARG_INPUT}
-  ${ARG_DEPENDENCIES}
 DEPFILE ${ARG_OUTPUT}.d
   )
+  add_custom_target( ${ARG_TARGET}
+DEPENDS ${ARG_OUTPUT}${TMP_SUFFIX} ${ARG_DEPENDENCIES}
+  )
 
   if( ${FILE_EXT} STREQUAL ".ll" )
 add_custom_command(
@@ -70,6 +74,7 @@ function(compile_to_bc)
   COMMAND ${llvm-as_exe} -o ${ARG_OUTPUT} ${ARG_OUTPUT}${TMP_SUFFIX}
   DEPENDS ${llvm-as_target} ${ARG_OUTPUT}${TMP_SUFFIX}
 )
+add_custom_target( ${ARG_TARGET}-as DEPENDS ${ARG_OUTPUT} )
   endif()
 endfunction()
 
@@ -227,6 +232,7 @@ function(add_libclc_builtin_set)
 
   set( bytecode_files )
   set( bytecode_ir_files )
+  set( compile_tgts )
   foreach( file IN LISTS ARG_GEN_FILES ARG_LIB_FILES )
 # We need to take each file and produce an absolute input file, as well
 # as a unique architecture-specific output file. We deal with a mix of
@@ -256,7 +262,11 @@ function(add_libclc_builtin_set)
 
 get_filename_component( file_dir ${file} DIRECTORY )
 
+string( REPLACE "/" "-" replaced ${file} )
+set( tgt compile_tgt-${ARG_ARCH_SUFFIX}${replaced})
+
 compile_to_bc(
+  TARGET ${tgt}
   TRIPLE ${ARG_TRIPLE}
   INPUT ${input_file}
   OUTPUT ${output_file}
@@ -264,11 +274,13 @@ function(add_libclc_builtin_set)
 "${ARG_COMPILE_FLAGS}" -I${CMAKE_CURRENT_SOURCE_DIR}/${file_dir}
   DEPENDENCIES ${input_file_dep}
 )
+list( APPEND compile_tgts ${tgt} )
 
 # Collect all files originating in LLVM IR separately
 get_filename_component( file_ext ${file} EXT )
 if( ${file_ext} STREQUAL ".ll" )
   list( APPEND bytecode_ir_files ${output_file} )
+  list( APPEND compile_tgts ${tgt}-as )
 else()
   list( APPEND bytecode_files ${output_file} )
 endif()
@@ -283,7 +295,7 @@ function(add_libclc_builtin_set)
 
   set( builtins_comp_lib_tgt builtins.comp.${ARG_ARCH_SUFFIX} )
   add_custom_target( ${builtins_comp_lib_tgt}
-DEPENDS ${bytecode_files}
+DEPENDS ${compile_tgts}
   )
   set_target_properties( ${builtins_comp_lib_tgt} PROPERTIES FOLDER 
"libclc/Device IR/Comp" )
 

>From 318148023265ea8e71d7c1d65e932748bacd417a Mon Sep 17 00:00:00 2001
From: Wenju He 
Date: Tue, 11 Mar 2025 06:04:41 -0700
Subject: [PATCH 2/5] revert ARG_DEPENDENCIES change

---
 libclc/cmake/modules/AddLibclc.cmake | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/libclc/cmake/modules/AddLibclc.cmake 
b/libclc/cmake/modules/AddLibclc.cmake
index 9dc328fcd489c..fba3b0110aaec 100644
--- a/libclc/cmake/modules/AddLibclc.cmake
+++ b/libclc/cmake/modules/AddLibclc.cmake
@@ -62,11 +62,10 @@ function(compile_to_bc)
 DEPENDS
   ${clang_target}
   ${ARG_INPUT}
+  ${ARG_DEPENDENCIES}
 DEPFILE ${ARG_OUTPUT}.d
   )
-  add_custom_target( ${ARG_TARGET}
-DEPENDS ${ARG_OUTPUT}${TMP_SUFFIX} ${ARG_DEPENDENCIES}
-  )
+  add_custom_target( ${ARG_TARGET} DEPENDS ${ARG_OUTPUT}${TMP_SUFFIX} )
 
   if( ${FILE_EXT} STREQUAL ".ll" )
 add_custom_command(

>From ea2f503e3c8ea76de06dccf125f985a862535ae8 Mon Sep 17 00:00:00 2001
From: Wenju He 
Date: Tue, 18 Mar 2025 03:14:41 -0700
Subject: [PATCH 3/5] add FIXME about DEPENDS_EXPLICIT_ONLY

---
 libclc/cmake/modules/AddLibclc.cmake | 5 +
 1 file changed, 5 insertions(+)

diff --git a/libclc/cmake/modules/AddLibclc.cmake 
b/libclc/cmake/modules/AddLibclc.cmake
index fba3b0110aaec..5d03acc73b3d0 100644
--- a/libclc/cmake/modules/AddLibclc.cmake
+++ b/libclc/cmake/modules/AddLibclc.cmake
@@ -65,6 +65,11 @@ funct

[libclc] [libclc] Fix commands in compile_to_bc are executed sequentially (PR #130755)

2025-04-10 Thread Wenju He via cfe-commits


@@ -256,19 +261,25 @@ function(add_libclc_builtin_set)
 
 get_filename_component( file_dir ${file} DIRECTORY )
 
+string( REPLACE "/" "-" replaced ${file} )
+set( tgt compile_tgt-${ARG_ARCH_SUFFIX}${replaced})
+
 compile_to_bc(
+  TARGET ${tgt}
   TRIPLE ${ARG_TRIPLE}
   INPUT ${input_file}
   OUTPUT ${output_file}
   EXTRA_OPTS -fno-builtin -nostdlib
 "${ARG_COMPILE_FLAGS}" -I${CMAKE_CURRENT_SOURCE_DIR}/${file_dir}
   DEPENDENCIES ${input_file_dep}
 )
+list( APPEND compile_tgts ${tgt} )
 
 # Collect all files originating in LLVM IR separately
 get_filename_component( file_ext ${file} EXT )
 if( ${file_ext} STREQUAL ".ll" )
   list( APPEND bytecode_ir_files ${output_file} )
+  list( APPEND compile_tgts ${tgt}-as )

wenju-he wrote:

done, removed
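
For reference, the per-file target naming in the quoted hunk (string( REPLACE "/" "-" ... ) followed by set( tgt compile_tgt-${ARG_ARCH_SUFFIX}${replaced} )) can be modelled with a short Python sketch. The function names and the example arch suffix and paths are hypothetical; only the replace-and-concatenate rule comes from the patch.

```python
from typing import Iterable, List


def compile_target_name(arch_suffix: str, file: str) -> str:
    # Mirrors: string( REPLACE "/" "-" replaced ${file} )
    replaced = file.replace("/", "-")
    # Mirrors: set( tgt compile_tgt-${ARG_ARCH_SUFFIX}${replaced} )
    # Note there is no separator between the arch suffix and the path.
    return f"compile_tgt-{arch_suffix}{replaced}"


def compile_targets(arch_suffix: str, files: Iterable[str]) -> List[str]:
    # One custom target per source file, so the build tool can schedule
    # the compile commands independently of each other.
    names = [compile_target_name(arch_suffix, f) for f in files]
    # CMake target names must be unique; fail loudly on a collision.
    if len(set(names)) != len(names):
        raise ValueError("duplicate target name generated")
    return names


if __name__ == "__main__":
    # "myarch" and the file paths are made-up inputs for illustration.
    for name in compile_targets("myarch", ["generic/lib/math/cos.cl",
                                           "generic/lib/math/sin.cl"]):
        print(name)
```

The uniqueness check is an addition of this sketch; it makes visible that two paths such as a/b.cl and a-b.cl would map to the same target name under this scheme.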

https://github.com/llvm/llvm-project/pull/130755


[clang] [clang-format] Add 'cl' to enable OpenCL kernel file formatting (PR #134529)

2025-04-10 Thread Wenju He via cfe-commits

wenju-he wrote:

@owenca could you please merge this for me? Thanks. I don't have commit 
permission.

https://github.com/llvm/llvm-project/pull/134529


[libclc] [NFC][libclc] Merge atomic extension built-ins with identical name into a single file (PR #134489)

2025-04-08 Thread Wenju He via cfe-commits

https://github.com/wenju-he edited 
https://github.com/llvm/llvm-project/pull/134489


[libclc] [liblc] only check filename part of the source for avoiding duplication (PR #135710)

2025-04-14 Thread Wenju He via cfe-commits

wenju-he wrote:

@frasercrmck could you please review this? Thanks.

https://github.com/llvm/llvm-project/pull/135710


[libclc] [NFC][libclc] Refine clz to use __builtin_clzg (PR #135301)

2025-04-14 Thread Wenju He via cfe-commits

https://github.com/wenju-he closed 
https://github.com/llvm/llvm-project/pull/135301


[libclc] [NFC][libclc] Refine clz to use __builtin_clzg (PR #135301)

2025-04-14 Thread Wenju He via cfe-commits

wenju-he wrote:

> Note I'm in the process of introducing elementwise clz/ctz builtins which 
> would accomplish this and achieve vector intrinsics. See #131995. It might be 
> worth waiting for that change instead?

Thanks @frasercrmck, LGTM. Please refactor clz after #131995.
Closing this PR.

https://github.com/llvm/llvm-project/pull/135301


[libclc] [libclc] Set OpenCL C version for each target (PR #135733)

2025-04-21 Thread Wenju He via cfe-commits

wenju-he wrote:

> > LGTM. That means we compile for the last OpenCL version, which is 3.0, and 
> > enable all OpenCL extensions/features, right?
> 
> Yes. Otherwise you have to still write the code to work with the lowest 
> common denominator.

done in commit 
https://github.com/llvm/llvm-project/pull/135733/commits/b3653cdcfde00ece9ac929d6c0555c237e87ff86

https://github.com/llvm/llvm-project/pull/135733


[libclc] [libclc] Set OpenCL C version for each target (PR #135733)

2025-04-21 Thread Wenju He via cfe-commits

wenju-he wrote:

> An OpenCL 1.2 module could still call a builtin that makes internal use of 
> CLC `ctz`, for example.

Yes, you're right.



https://github.com/llvm/llvm-project/pull/135733


[libclc] [libclc] Build for OpenCL 3.0 and enable all extensions and features (PR #135733)

2025-04-21 Thread Wenju He via cfe-commits

https://github.com/wenju-he edited 
https://github.com/llvm/llvm-project/pull/135733


[libclc] [libclc] Set OpenCL C version for each target (PR #135733)

2025-04-21 Thread Wenju He via cfe-commits

https://github.com/wenju-he updated 
https://github.com/llvm/llvm-project/pull/135733

>From 64d7bfdceb5a0a6fbf34bb15cd7d6cbeb9214881 Mon Sep 17 00:00:00 2001
From: Wenju He 
Date: Mon, 14 Apr 2025 19:20:25 -0700
Subject: [PATCH 1/5] [libclc] Set OpenCL version to 3.0

This PR is cherry-pick of https://github.com/intel/llvm/commit/cba338e5fb1c
This allows adding OpenCL 2.0 built-ins, e.g. ctz, and OpenCL 3.0
extension built-ins, including generic address space variants.

llvm-diff shows this PR has no change in amdgcn--amdhsa.bc.
---
 libclc/CMakeLists.txt | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/libclc/CMakeLists.txt b/libclc/CMakeLists.txt
index dbbc29261d3b5..278ae5d777a84 100644
--- a/libclc/CMakeLists.txt
+++ b/libclc/CMakeLists.txt
@@ -411,6 +411,16 @@ foreach( t ${LIBCLC_TARGETS_TO_BUILD} )
 set( LIBCLC_ARCH_OBJFILE_DIR "${LIBCLC_OBJFILE_DIR}/${arch_suffix}" )
 file( MAKE_DIRECTORY ${LIBCLC_ARCH_OBJFILE_DIR} )
 
+# OpenCL 3.0 extensions
+string(CONCAT CL_3_0_EXTENSIONS
+  "-cl-ext="
+  "+cl_khr_fp64,"
+  "+cl_khr_fp16,"
+  "+__opencl_c_3d_image_writes,"
+  "+__opencl_c_images,"
+  "+cl_khr_3d_image_writes")
+list( APPEND build_flags -cl-std=CL3.0 "-Xclang" ${CL_3_0_EXTENSIONS} )
+
 string( TOUPPER "CLC_${MACRO_ARCH}" CLC_TARGET_DEFINE )
 
 list( APPEND build_flags

>From 4facfec781e39a247aba639ea8e080aa79153a12 Mon Sep 17 00:00:00 2001
From: Wenju He 
Date: Tue, 15 Apr 2025 20:56:40 -0700
Subject: [PATCH 2/5] set opencl_c_version per target, remove CL_3_0_EXTENSIONS

---
 libclc/CMakeLists.txt | 32 +---
 1 file changed, 21 insertions(+), 11 deletions(-)

diff --git a/libclc/CMakeLists.txt b/libclc/CMakeLists.txt
index 278ae5d777a84..e3093af57e728 100644
--- a/libclc/CMakeLists.txt
+++ b/libclc/CMakeLists.txt
@@ -387,7 +387,11 @@ foreach( t ${LIBCLC_TARGETS_TO_BUILD} )
 
 message( STATUS "  device: ${d} ( ${${d}_aliases} )" )
 
-if ( ARCH STREQUAL spirv OR ARCH STREQUAL spirv64 )
+# 1.2 is Clang's default OpenCL C language standard to compile for.
+set( opencl_lang_std "CL1.2" )
+
+if ( ${DARCH} STREQUAL spirv )
+  set( opencl_lang_std "CL3.0" )
   set( build_flags -O0 -finline-hint-functions -DCLC_SPIRV )
   set( opt_flags )
   set( spvflags --spirv-max-version=1.1 )
@@ -395,13 +399,27 @@ foreach( t ${LIBCLC_TARGETS_TO_BUILD} )
   if( ARCH STREQUAL spirv64 )
 set( MACRO_ARCH SPIRV64 )
   endif()
-elseif( ARCH STREQUAL clspv OR ARCH STREQUAL clspv64 )
+elseif( ${DARCH} STREQUAL clspv )
+  # Refer to https://github.com/google/clspv for OpenCL version.
+  set( opencl_lang_std "CL3.0" )
   set( build_flags "-Wno-unknown-assumption" -DCLC_CLSPV )
   set( opt_flags -O3 )
   set( MACRO_ARCH CLSPV32 )
   if( ARCH STREQUAL clspv64 )
 set( MACRO_ARCH CLSPV64 )
   endif()
+elseif( ${DARCH} STREQUAL nvptx )
+  # Refer to https://www.khronos.org/opencl/ for OpenCL version in NV implementation.
+  set( opencl_lang_std "CL3.0" )
+  set( build_flags )
+  set( opt_flags -O3 )
+  set( MACRO_ARCH ${ARCH} )
+elseif( ${DARCH} STREQUAL amdgcn OR ${DARCH} STREQUAL amdgcn-amdhsa OR ${DARCH} STREQUAL r600 )
+  # Refer to https://github.com/ROCm/clr/tree/develop/opencl for OpenCL version.
+  set( opencl_lang_std "CL2.0" )
+  set( build_flags )
+  set( opt_flags -O3 )
+  set( MACRO_ARCH ${ARCH} )
 else()
   set( build_flags )
   set( opt_flags -O3 )
@@ -411,15 +429,7 @@ foreach( t ${LIBCLC_TARGETS_TO_BUILD} )
 set( LIBCLC_ARCH_OBJFILE_DIR "${LIBCLC_OBJFILE_DIR}/${arch_suffix}" )
 file( MAKE_DIRECTORY ${LIBCLC_ARCH_OBJFILE_DIR} )
 
-# OpenCL 3.0 extensions
-string(CONCAT CL_3_0_EXTENSIONS
-  "-cl-ext="
-  "+cl_khr_fp64,"
-  "+cl_khr_fp16,"
-  "+__opencl_c_3d_image_writes,"
-  "+__opencl_c_images,"
-  "+cl_khr_3d_image_writes")
-list( APPEND build_flags -cl-std=CL3.0 "-Xclang" ${CL_3_0_EXTENSIONS} )
+list( APPEND build_flags -cl-std=${opencl_lang_std} )
 
 string( TOUPPER "CLC_${MACRO_ARCH}" CLC_TARGET_DEFINE )
 

>From 31604df0f2c7337d476878bc3245f452fe2c941b Mon Sep 17 00:00:00 2001
From: Wenju He 
Date: Tue, 15 Apr 2025 21:05:57 -0700
Subject: [PATCH 3/5] use default OpenCL C version for r600

---
 libclc/CMakeLists.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/libclc/CMakeLists.txt b/libclc/CMakeLists.txt
index e3093af57e728..07da2466f5e42 100644
--- a/libclc/CMakeLists.txt
+++ b/libclc/CMakeLists.txt
@@ -414,7 +414,7 @@ foreach( t ${LIBCLC_TARGETS_TO_BUILD} )
   set( build_flags )
   set( opt_flags -O3 )
   set( MACRO_ARCH ${ARCH} )
-elseif( ${DARCH} STREQUAL amdgcn OR ${DARCH} STREQUAL amdgcn-amdhsa OR ${DARCH} STREQUAL r600 )
+elseif( ${DARCH} STREQUAL amdgcn OR ${DARCH} STREQUAL amdgcn-amdhsa )
   # Refer to https://github.com/ROCm/clr/tree/develop/opencl for O

[libclc] [libclc] Set OpenCL C version for each target (PR #135733)

2025-04-17 Thread Wenju He via cfe-commits

wenju-he wrote:

> I'd expect the libclc build (or any other runtime support library) to 
> consistently use the same language version independent of the target. If the 
> target doesn't support that version (which IIRC isn't actually a hard error 
> anywhere) and fails to compile on some feature, those should be carved out 
> into separate version dependent build (as in, this isn't a compile target 
> property but a specific build target property)

LGTM. That means we compile for the latest OpenCL version, which is 3.0, and 
enable all OpenCL extensions/features, right?
libclc is implemented using functions supported by clang, so a target is 
unlikely to hit a build error in the generic libclc files that are shared by 
all targets.
When libclc is linked into an application module, libclc built-in 
implementations of unsupported extensions/features won't be linked in, because 
those built-ins won't appear in the user module. If the application code uses 
an unsupported extension, the frontend would already report an error.
There could be a problem if an application module links in a supported libclc 
built-in function which calls another libclc built-in that belongs to an 
unsupported extension/feature, but I think this scenario is unlikely.
@frasercrmck What do you think?
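
The linking argument above can be illustrated with a toy Python model. It assumes on-demand linking, where a library function is pulled in only if it is reachable from something the user module references. The library contents and all names (`add_sat`, `popcount_helper`, `fp64_sqrt`) are hypothetical; `ctz` is used only because it is mentioned elsewhere in this thread.

```python
from typing import Dict, Set


def linked_builtins(user_refs: Set[str], calls: Dict[str, Set[str]]) -> Set[str]:
    """Return the library functions pulled in by on-demand linking.

    `calls` maps each library function to the library functions it calls.
    """
    linked: Set[str] = set()
    # Start from the library functions the user module references directly.
    worklist = [name for name in user_refs if name in calls]
    while worklist:
        name = worklist.pop()
        if name in linked:
            continue
        linked.add(name)
        # Follow internal calls so transitive dependencies are linked too.
        worklist.extend(callee for callee in calls[name] if callee in calls)
    return linked


# Hypothetical library: 'popcount_helper' makes internal use of 'ctz'.
LIBRARY: Dict[str, Set[str]] = {
    "add_sat": set(),
    "ctz": set(),
    "popcount_helper": {"ctz"},
    "fp64_sqrt": set(),
}

if __name__ == "__main__":
    # Unreferenced built-ins such as 'fp64_sqrt' are never linked in.
    print(sorted(linked_builtins({"add_sat"}, LIBRARY)))
    # The transitive case is the risky scenario described above.
    print(sorted(linked_builtins({"popcount_helper"}, LIBRARY)))
```

The second call shows the scenario considered unlikely above: a supported built-in drags in another built-in that the user module never named.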

https://github.com/llvm/llvm-project/pull/135733


[libclc] [libclc] Set OpenCL C version for each target (PR #135733)

2025-04-18 Thread Wenju He via cfe-commits


@@ -387,21 +387,39 @@ foreach( t ${LIBCLC_TARGETS_TO_BUILD} )
 
 message( STATUS "  device: ${d} ( ${${d}_aliases} )" )
 
-if ( ARCH STREQUAL spirv OR ARCH STREQUAL spirv64 )
+# 1.2 is Clang's default OpenCL C language standard to compile for.
+set( opencl_lang_std "CL1.2" )
+
+if ( ${DARCH} STREQUAL spirv )
+  set( opencl_lang_std "CL3.0" )
   set( build_flags -O0 -finline-hint-functions -DCLC_SPIRV )
   set( opt_flags )
   set( spvflags --spirv-max-version=1.1 )
   set( MACRO_ARCH SPIRV32 )
   if( ARCH STREQUAL spirv64 )
 set( MACRO_ARCH SPIRV64 )
   endif()
-elseif( ARCH STREQUAL clspv OR ARCH STREQUAL clspv64 )
+elseif( ${DARCH} STREQUAL clspv )
+  # Refer to https://github.com/google/clspv for OpenCL version.
+  set( opencl_lang_std "CL3.0" )
   set( build_flags "-Wno-unknown-assumption" -DCLC_CLSPV )
   set( opt_flags -O3 )
   set( MACRO_ARCH CLSPV32 )
   if( ARCH STREQUAL clspv64 )
 set( MACRO_ARCH CLSPV64 )
   endif()
+elseif( ${DARCH} STREQUAL nvptx )
+  # Refer to https://www.khronos.org/opencl/ for OpenCL version in NV implementation.
+  set( opencl_lang_std "CL3.0" )
+  set( build_flags )
+  set( opt_flags -O3 )
+  set( MACRO_ARCH ${ARCH} )
+elseif( ${DARCH} STREQUAL amdgcn OR ${DARCH} STREQUAL amdgcn-amdhsa )
+  # Refer to https://github.com/ROCm/clr/tree/develop/opencl for OpenCL version.
+  set( opencl_lang_std "CL2.0" )

wenju-he wrote:

> I thought we already picked out device compatible default versions in clang?

In clang, the OpenCL C version defaults to 1.2 for all targets if the version 
isn't specified on the command line: 
https://github.com/llvm/llvm-project/blob/b07c88563febdb62b82daad0480d7b6131bc54d4/clang/lib/Basic/LangStandards.cpp#L99

There is a TargetInfo API, setSupportedOpenCLOpts, for a target to define its 
supported extensions and features.
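
As a summary of the hunk under review, the per-target version selection can be written as a small Python sketch. The mapping is taken from the quoted CMake (spirv, clspv and nvptx get CL3.0, amdgcn and amdgcn-amdhsa get CL2.0, everything else keeps the CL1.2 default); the function names are hypothetical and nothing here is part of the libclc build itself.

```python
def opencl_lang_std(darch: str) -> str:
    # Targets the hunk builds for OpenCL C 3.0.
    if darch in ("spirv", "clspv", "nvptx"):
        return "CL3.0"
    # AMD targets are set to OpenCL C 2.0 in the hunk.
    if darch in ("amdgcn", "amdgcn-amdhsa"):
        return "CL2.0"
    # Everything else, including r600, keeps Clang's default of 1.2.
    return "CL1.2"


def cl_std_flag(darch: str) -> str:
    # Mirrors: list( APPEND build_flags -cl-std=${opencl_lang_std} )
    return f"-cl-std={opencl_lang_std(darch)}"


if __name__ == "__main__":
    for arch in ("spirv", "nvptx", "amdgcn", "r600"):
        print(arch, cl_std_flag(arch))
```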

https://github.com/llvm/llvm-project/pull/135733


[libclc] [libclc] Build for OpenCL 3.0 (PR #135733)

2025-04-23 Thread Wenju He via cfe-commits

https://github.com/wenju-he edited 
https://github.com/llvm/llvm-project/pull/135733


[libclc] [libclc] Build for OpenCL 3.0 (PR #135733)

2025-04-23 Thread Wenju He via cfe-commits


@@ -429,7 +411,9 @@ foreach( t ${LIBCLC_TARGETS_TO_BUILD} )
 set( LIBCLC_ARCH_OBJFILE_DIR "${LIBCLC_OBJFILE_DIR}/${arch_suffix}" )
 file( MAKE_DIRECTORY ${LIBCLC_ARCH_OBJFILE_DIR} )
 
-list( APPEND build_flags -cl-std=${opencl_lang_std} )
+# Build for OpenCL 3.0 and enable all extensions and features independently
+# of the target or device.
+list( APPEND build_flags -cl-std=CL3.0 -Xclang -cl-ext=+all )

wenju-he wrote:

done, thanks

https://github.com/llvm/llvm-project/pull/135733


[libclc] [libclc] Build for OpenCL 3.0 and enable all extensions and features (PR #135733)

2025-04-23 Thread Wenju He via cfe-commits

https://github.com/wenju-he updated 
https://github.com/llvm/llvm-project/pull/135733

>From 64d7bfdceb5a0a6fbf34bb15cd7d6cbeb9214881 Mon Sep 17 00:00:00 2001
From: Wenju He 
Date: Mon, 14 Apr 2025 19:20:25 -0700
Subject: [PATCH 1/6] [libclc] Set OpenCL version to 3.0

This PR is cherry-pick of https://github.com/intel/llvm/commit/cba338e5fb1c
This allows adding OpenCL 2.0 built-ins, e.g. ctz, and OpenCL 3.0
extension built-ins, including generic address space variants.

llvm-diff shows this PR has no change in amdgcn--amdhsa.bc.
---
 libclc/CMakeLists.txt | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/libclc/CMakeLists.txt b/libclc/CMakeLists.txt
index dbbc29261d3b5..278ae5d777a84 100644
--- a/libclc/CMakeLists.txt
+++ b/libclc/CMakeLists.txt
@@ -411,6 +411,16 @@ foreach( t ${LIBCLC_TARGETS_TO_BUILD} )
 set( LIBCLC_ARCH_OBJFILE_DIR "${LIBCLC_OBJFILE_DIR}/${arch_suffix}" )
 file( MAKE_DIRECTORY ${LIBCLC_ARCH_OBJFILE_DIR} )
 
+# OpenCL 3.0 extensions
+string(CONCAT CL_3_0_EXTENSIONS
+  "-cl-ext="
+  "+cl_khr_fp64,"
+  "+cl_khr_fp16,"
+  "+__opencl_c_3d_image_writes,"
+  "+__opencl_c_images,"
+  "+cl_khr_3d_image_writes")
+list( APPEND build_flags -cl-std=CL3.0 "-Xclang" ${CL_3_0_EXTENSIONS} )
+
 string( TOUPPER "CLC_${MACRO_ARCH}" CLC_TARGET_DEFINE )
 
 list( APPEND build_flags

>From 4facfec781e39a247aba639ea8e080aa79153a12 Mon Sep 17 00:00:00 2001
From: Wenju He 
Date: Tue, 15 Apr 2025 20:56:40 -0700
Subject: [PATCH 2/6] set opencl_c_version per target, remove CL_3_0_EXTENSIONS

---
 libclc/CMakeLists.txt | 32 +---
 1 file changed, 21 insertions(+), 11 deletions(-)

diff --git a/libclc/CMakeLists.txt b/libclc/CMakeLists.txt
index 278ae5d777a84..e3093af57e728 100644
--- a/libclc/CMakeLists.txt
+++ b/libclc/CMakeLists.txt
@@ -387,7 +387,11 @@ foreach( t ${LIBCLC_TARGETS_TO_BUILD} )
 
 message( STATUS "  device: ${d} ( ${${d}_aliases} )" )
 
-if ( ARCH STREQUAL spirv OR ARCH STREQUAL spirv64 )
+# 1.2 is Clang's default OpenCL C language standard to compile for.
+set( opencl_lang_std "CL1.2" )
+
+if ( ${DARCH} STREQUAL spirv )
+  set( opencl_lang_std "CL3.0" )
   set( build_flags -O0 -finline-hint-functions -DCLC_SPIRV )
   set( opt_flags )
   set( spvflags --spirv-max-version=1.1 )
@@ -395,13 +399,27 @@ foreach( t ${LIBCLC_TARGETS_TO_BUILD} )
   if( ARCH STREQUAL spirv64 )
 set( MACRO_ARCH SPIRV64 )
   endif()
-elseif( ARCH STREQUAL clspv OR ARCH STREQUAL clspv64 )
+elseif( ${DARCH} STREQUAL clspv )
+  # Refer to https://github.com/google/clspv for OpenCL version.
+  set( opencl_lang_std "CL3.0" )
   set( build_flags "-Wno-unknown-assumption" -DCLC_CLSPV )
   set( opt_flags -O3 )
   set( MACRO_ARCH CLSPV32 )
   if( ARCH STREQUAL clspv64 )
 set( MACRO_ARCH CLSPV64 )
   endif()
+elseif( ${DARCH} STREQUAL nvptx )
+  # Refer to https://www.khronos.org/opencl/ for OpenCL version in NV implementation.
+  set( opencl_lang_std "CL3.0" )
+  set( build_flags )
+  set( opt_flags -O3 )
+  set( MACRO_ARCH ${ARCH} )
+elseif( ${DARCH} STREQUAL amdgcn OR ${DARCH} STREQUAL amdgcn-amdhsa OR ${DARCH} STREQUAL r600 )
+  # Refer to https://github.com/ROCm/clr/tree/develop/opencl for OpenCL version.
+  set( opencl_lang_std "CL2.0" )
+  set( build_flags )
+  set( opt_flags -O3 )
+  set( MACRO_ARCH ${ARCH} )
 else()
   set( build_flags )
   set( opt_flags -O3 )
@@ -411,15 +429,7 @@ foreach( t ${LIBCLC_TARGETS_TO_BUILD} )
 set( LIBCLC_ARCH_OBJFILE_DIR "${LIBCLC_OBJFILE_DIR}/${arch_suffix}" )
 file( MAKE_DIRECTORY ${LIBCLC_ARCH_OBJFILE_DIR} )
 
-# OpenCL 3.0 extensions
-string(CONCAT CL_3_0_EXTENSIONS
-  "-cl-ext="
-  "+cl_khr_fp64,"
-  "+cl_khr_fp16,"
-  "+__opencl_c_3d_image_writes,"
-  "+__opencl_c_images,"
-  "+cl_khr_3d_image_writes")
-list( APPEND build_flags -cl-std=CL3.0 "-Xclang" ${CL_3_0_EXTENSIONS} )
+list( APPEND build_flags -cl-std=${opencl_lang_std} )
 
 string( TOUPPER "CLC_${MACRO_ARCH}" CLC_TARGET_DEFINE )
 

>From 31604df0f2c7337d476878bc3245f452fe2c941b Mon Sep 17 00:00:00 2001
From: Wenju He 
Date: Tue, 15 Apr 2025 21:05:57 -0700
Subject: [PATCH 3/6] use default OpenCL C version for r600

---
 libclc/CMakeLists.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/libclc/CMakeLists.txt b/libclc/CMakeLists.txt
index e3093af57e728..07da2466f5e42 100644
--- a/libclc/CMakeLists.txt
+++ b/libclc/CMakeLists.txt
@@ -414,7 +414,7 @@ foreach( t ${LIBCLC_TARGETS_TO_BUILD} )
   set( build_flags )
   set( opt_flags -O3 )
   set( MACRO_ARCH ${ARCH} )
-elseif( ${DARCH} STREQUAL amdgcn OR ${DARCH} STREQUAL amdgcn-amdhsa OR ${DARCH} STREQUAL r600 )
+elseif( ${DARCH} STREQUAL amdgcn OR ${DARCH} STREQUAL amdgcn-amdhsa )
   # Refer to https://github.com/ROCm/clr/tree/develop/opencl for O

[libclc] [libclc] Add v3 variants of async_work_group_copy/async_work_group_strided_copy/prefetch (PR #137932)

2025-04-30 Thread Wenju He via cfe-commits

https://github.com/wenju-he created 
https://github.com/llvm/llvm-project/pull/137932

The 3-component vector types are supported for these built-ins per the OpenCL spec.
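
The repeated define/include/undef pattern that gentype.inc uses can be sketched as a small Python generator, which makes it easy to see where the 3-component entries slot in. This is an illustration only: the base type list is limited to the types visible in the patch, the width list assumes the standard OpenCL vector widths (2, 3, 4, 8, 16) plus the scalar, and any extension guards the real file may have around some types are omitted.

```python
from typing import Iterable

# Base types that appear in the patch hunks.
BASE_TYPES = ["char", "uchar", "short", "ushort", "int",
              "uint", "float", "long", "ulong", "double"]
# "" is the scalar; the rest are the OpenCL vector widths, including 3.
WIDTHS = ["", "2", "3", "4", "8", "16"]


def gentype_block(type_name: str) -> str:
    # One instantiation of the body for a single concrete type.
    return (f"#define __CLC_GENTYPE {type_name}\n"
            "#include __CLC_BODY\n"
            "#undef __CLC_GENTYPE\n")


def generate(base_types: Iterable[str] = BASE_TYPES,
             widths: Iterable[str] = WIDTHS) -> str:
    widths = list(widths)
    # Blocks are separated by a blank line, as in the patch context.
    return "\n".join(gentype_block(base + width)
                     for base in base_types for width in widths)


if __name__ == "__main__":
    print(generate(["char"], ["3", "4"]))
```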

>From cafb374de8d77c82fa450b732a122663090f6e34 Mon Sep 17 00:00:00 2001
From: Wenju He 
Date: Wed, 30 Apr 2025 00:44:50 -0700
Subject: [PATCH] [libclc] Add v3 variants of
 async_work_group_copy/async_work_group_strided_copy/prefetch

3-component vector type is supported for them per OpenCL spec.
---
 libclc/generic/include/clc/async/gentype.inc | 44 
 1 file changed, 44 insertions(+)

diff --git a/libclc/generic/include/clc/async/gentype.inc 
b/libclc/generic/include/clc/async/gentype.inc
index 1114883e1ad35..e023c8bbd97d2 100644
--- a/libclc/generic/include/clc/async/gentype.inc
+++ b/libclc/generic/include/clc/async/gentype.inc
@@ -14,6 +14,10 @@
 #include __CLC_BODY
 #undef __CLC_GENTYPE
 
+#define __CLC_GENTYPE char3
+#include __CLC_BODY
+#undef __CLC_GENTYPE
+
 #define __CLC_GENTYPE char4
 #include __CLC_BODY
 #undef __CLC_GENTYPE
@@ -34,6 +38,10 @@
 #include __CLC_BODY
 #undef __CLC_GENTYPE
 
+#define __CLC_GENTYPE uchar3
+#include __CLC_BODY
+#undef __CLC_GENTYPE
+
 #define __CLC_GENTYPE uchar4
 #include __CLC_BODY
 #undef __CLC_GENTYPE
@@ -54,6 +62,10 @@
 #include __CLC_BODY
 #undef __CLC_GENTYPE
 
+#define __CLC_GENTYPE short3
+#include __CLC_BODY
+#undef __CLC_GENTYPE
+
 #define __CLC_GENTYPE short4
 #include __CLC_BODY
 #undef __CLC_GENTYPE
@@ -74,6 +86,10 @@
 #include __CLC_BODY
 #undef __CLC_GENTYPE
 
+#define __CLC_GENTYPE ushort3
+#include __CLC_BODY
+#undef __CLC_GENTYPE
+
 #define __CLC_GENTYPE ushort4
 #include __CLC_BODY
 #undef __CLC_GENTYPE
@@ -94,6 +110,10 @@
 #include __CLC_BODY
 #undef __CLC_GENTYPE
 
+#define __CLC_GENTYPE int3
+#include __CLC_BODY
+#undef __CLC_GENTYPE
+
 #define __CLC_GENTYPE int4
 #include __CLC_BODY
 #undef __CLC_GENTYPE
@@ -114,6 +134,10 @@
 #include __CLC_BODY
 #undef __CLC_GENTYPE
 
+#define __CLC_GENTYPE uint3
+#include __CLC_BODY
+#undef __CLC_GENTYPE
+
 #define __CLC_GENTYPE uint4
 #include __CLC_BODY
 #undef __CLC_GENTYPE
@@ -134,6 +158,10 @@
 #include __CLC_BODY
 #undef __CLC_GENTYPE
 
+#define __CLC_GENTYPE float3
+#include __CLC_BODY
+#undef __CLC_GENTYPE
+
 #define __CLC_GENTYPE float4
 #include __CLC_BODY
 #undef __CLC_GENTYPE
@@ -154,6 +182,10 @@
 #include __CLC_BODY
 #undef __CLC_GENTYPE
 
+#define __CLC_GENTYPE long3
+#include __CLC_BODY
+#undef __CLC_GENTYPE
+
 #define __CLC_GENTYPE long4
 #include __CLC_BODY
 #undef __CLC_GENTYPE
@@ -174,6 +206,10 @@
 #include __CLC_BODY
 #undef __CLC_GENTYPE
 
+#define __CLC_GENTYPE ulong3
+#include __CLC_BODY
+#undef __CLC_GENTYPE
+
 #define __CLC_GENTYPE ulong4
 #include __CLC_BODY
 #undef __CLC_GENTYPE
@@ -197,6 +233,10 @@
 #include __CLC_BODY
 #undef __CLC_GENTYPE
 
+#define __CLC_GENTYPE double3
+#include __CLC_BODY
+#undef __CLC_GENTYPE
+
 #define __CLC_GENTYPE double4
 #include __CLC_BODY
 #undef __CLC_GENTYPE
@@ -222,6 +262,10 @@
 #include __CLC_BODY
 #undef __CLC_GENTYPE
 
+#define __CLC_GENTYPE half3
+#include __CLC_BODY
+#undef __CLC_GENTYPE
+
 #define __CLC_GENTYPE half4
 #include __CLC_BODY
 #undef __CLC_GENTYPE

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[libclc] [libclc] Skip opt command if opt_flags is empty (PR #130882)

2025-04-30 Thread Wenju He via cfe-commits

wenju-he wrote:

Closing this PR. The original intention of disabling opt -O3 in the downstream
project has changed.

https://github.com/llvm/llvm-project/pull/130882


[libclc] [libclc] Skip opt command if opt_flags is empty (PR #130882)

2025-04-30 Thread Wenju He via cfe-commits

https://github.com/wenju-he closed 
https://github.com/llvm/llvm-project/pull/130882


[libclc] [libclc] Add v3 variants of async_work_group_copy/async_work_group_strided_copy/prefetch (PR #137932)

2025-04-30 Thread Wenju He via cfe-commits

wenju-he wrote:

@frasercrmck could you please review? Thanks.

https://github.com/llvm/llvm-project/pull/137932


[libclc] [libclc] Add v3 variants of async_work_group_copy/async_work_group_strided_copy/prefetch (PR #137932)

2025-04-30 Thread Wenju He via cfe-commits


@@ -9,15 +9,31 @@
 #define __CLC_DST_ADDR_SPACE local
 #define __CLC_SRC_ADDR_SPACE global
 #define __CLC_BODY 
-#include 
+#include 
+#undef __CLC_DST_ADDR_SPACE

wenju-he wrote:

done, thanks for the suggestion

https://github.com/llvm/llvm-project/pull/137932


[libclc] [libclc] Add v3 variants of async_work_group_copy/async_work_group_strided_copy/prefetch (PR #137932)

2025-04-30 Thread Wenju He via cfe-commits

https://github.com/wenju-he updated 
https://github.com/llvm/llvm-project/pull/137932

>From cafb374de8d77c82fa450b732a122663090f6e34 Mon Sep 17 00:00:00 2001
From: Wenju He 
Date: Wed, 30 Apr 2025 00:44:50 -0700
Subject: [PATCH 1/3] [libclc] Add v3 variants of
 async_work_group_copy/async_work_group_strided_copy/prefetch

3-component vector type is supported for them per OpenCL spec.
---
 libclc/generic/include/clc/async/gentype.inc | 44 
 1 file changed, 44 insertions(+)

diff --git a/libclc/generic/include/clc/async/gentype.inc b/libclc/generic/include/clc/async/gentype.inc
index 1114883e1ad35..e023c8bbd97d2 100644
--- a/libclc/generic/include/clc/async/gentype.inc
+++ b/libclc/generic/include/clc/async/gentype.inc
@@ -14,6 +14,10 @@
 #include __CLC_BODY
 #undef __CLC_GENTYPE
 
+#define __CLC_GENTYPE char3
+#include __CLC_BODY
+#undef __CLC_GENTYPE
+
 #define __CLC_GENTYPE char4
 #include __CLC_BODY
 #undef __CLC_GENTYPE
@@ -34,6 +38,10 @@
 #include __CLC_BODY
 #undef __CLC_GENTYPE
 
+#define __CLC_GENTYPE uchar3
+#include __CLC_BODY
+#undef __CLC_GENTYPE
+
 #define __CLC_GENTYPE uchar4
 #include __CLC_BODY
 #undef __CLC_GENTYPE
@@ -54,6 +62,10 @@
 #include __CLC_BODY
 #undef __CLC_GENTYPE
 
+#define __CLC_GENTYPE short3
+#include __CLC_BODY
+#undef __CLC_GENTYPE
+
 #define __CLC_GENTYPE short4
 #include __CLC_BODY
 #undef __CLC_GENTYPE
@@ -74,6 +86,10 @@
 #include __CLC_BODY
 #undef __CLC_GENTYPE
 
+#define __CLC_GENTYPE ushort3
+#include __CLC_BODY
+#undef __CLC_GENTYPE
+
 #define __CLC_GENTYPE ushort4
 #include __CLC_BODY
 #undef __CLC_GENTYPE
@@ -94,6 +110,10 @@
 #include __CLC_BODY
 #undef __CLC_GENTYPE
 
+#define __CLC_GENTYPE int3
+#include __CLC_BODY
+#undef __CLC_GENTYPE
+
 #define __CLC_GENTYPE int4
 #include __CLC_BODY
 #undef __CLC_GENTYPE
@@ -114,6 +134,10 @@
 #include __CLC_BODY
 #undef __CLC_GENTYPE
 
+#define __CLC_GENTYPE uint3
+#include __CLC_BODY
+#undef __CLC_GENTYPE
+
 #define __CLC_GENTYPE uint4
 #include __CLC_BODY
 #undef __CLC_GENTYPE
@@ -134,6 +158,10 @@
 #include __CLC_BODY
 #undef __CLC_GENTYPE
 
+#define __CLC_GENTYPE float3
+#include __CLC_BODY
+#undef __CLC_GENTYPE
+
 #define __CLC_GENTYPE float4
 #include __CLC_BODY
 #undef __CLC_GENTYPE
@@ -154,6 +182,10 @@
 #include __CLC_BODY
 #undef __CLC_GENTYPE
 
+#define __CLC_GENTYPE long3
+#include __CLC_BODY
+#undef __CLC_GENTYPE
+
 #define __CLC_GENTYPE long4
 #include __CLC_BODY
 #undef __CLC_GENTYPE
@@ -174,6 +206,10 @@
 #include __CLC_BODY
 #undef __CLC_GENTYPE
 
+#define __CLC_GENTYPE ulong3
+#include __CLC_BODY
+#undef __CLC_GENTYPE
+
 #define __CLC_GENTYPE ulong4
 #include __CLC_BODY
 #undef __CLC_GENTYPE
@@ -197,6 +233,10 @@
 #include __CLC_BODY
 #undef __CLC_GENTYPE
 
+#define __CLC_GENTYPE double3
+#include __CLC_BODY
+#undef __CLC_GENTYPE
+
 #define __CLC_GENTYPE double4
 #include __CLC_BODY
 #undef __CLC_GENTYPE
@@ -222,6 +262,10 @@
 #include __CLC_BODY
 #undef __CLC_GENTYPE
 
+#define __CLC_GENTYPE half3
+#include __CLC_BODY
+#undef __CLC_GENTYPE
+
 #define __CLC_GENTYPE half4
 #include __CLC_BODY
 #undef __CLC_GENTYPE

>From 1def8b25f6cbd5e3128f697393fa52ff1d97e90b Mon Sep 17 00:00:00 2001
From: Wenju He 
Date: Wed, 30 Apr 2025 02:32:40 -0700
Subject: [PATCH 2/3] delete async gentype.inc, use integer and math
 gentype.inc

---
 .../include/clc/async/async_work_group_copy.h |  20 +-
 .../clc/async/async_work_group_strided_copy.h |  20 +-
 libclc/generic/include/clc/async/gentype.inc  | 283 --
 libclc/generic/include/clc/async/prefetch.h   |   6 +-
 .../lib/async/async_work_group_copy.cl|   7 +-
 .../async/async_work_group_strided_copy.cl|   7 +-
 libclc/generic/lib/async/prefetch.cl  |   7 +-
 7 files changed, 59 insertions(+), 291 deletions(-)
 delete mode 100644 libclc/generic/include/clc/async/gentype.inc

diff --git a/libclc/generic/include/clc/async/async_work_group_copy.h b/libclc/generic/include/clc/async/async_work_group_copy.h
index a2c4e353ce469..1af31056e62f3 100644
--- a/libclc/generic/include/clc/async/async_work_group_copy.h
+++ b/libclc/generic/include/clc/async/async_work_group_copy.h
@@ -9,7 +9,23 @@
 #define __CLC_DST_ADDR_SPACE local
 #define __CLC_SRC_ADDR_SPACE global
 #define __CLC_BODY 
-#include 
+#include 
+#undef __CLC_DST_ADDR_SPACE
+#undef __CLC_SRC_ADDR_SPACE
+#undef __CLC_BODY
+
+#define __CLC_DST_ADDR_SPACE local
+#define __CLC_SRC_ADDR_SPACE global
+#define __CLC_BODY 
+#include 
+#undef __CLC_DST_ADDR_SPACE
+#undef __CLC_SRC_ADDR_SPACE
+#undef __CLC_BODY
+
+#define __CLC_DST_ADDR_SPACE global
+#define __CLC_SRC_ADDR_SPACE local
+#define __CLC_BODY 
+#include 
 #undef __CLC_DST_ADDR_SPACE
 #undef __CLC_SRC_ADDR_SPACE
 #undef __CLC_BODY
@@ -17,7 +33,7 @@
 #define __CLC_DST_ADDR_SPACE global
 #define __CLC_SRC_ADDR_SPACE local
 #define __CLC_BODY 
-#include 
+#include 
 #undef __CLC_DST_ADDR_SPACE
 #undef __CLC_SRC_ADDR_SPACE
 #undef __CLC_BODY
diff --git a/libclc/generic/include/clc/async/async_work_group_strided_copy.h 
b/libc

[libclc] [libclc] Add v3 variants of async_work_group_copy/async_work_group_strided_copy/prefetch (PR #137932)

2025-04-30 Thread Wenju He via cfe-commits

wenju-he wrote:

> We could probably come up with some kind of combined `gentype.inc` that does 
> them both at once 

good idea.


https://github.com/llvm/llvm-project/pull/137932


[libclc] [NFC][libclc] Merge atomic extension built-ins with identical name into a single file (PR #134489)

2025-04-10 Thread Wenju He via cfe-commits

https://github.com/wenju-he edited 
https://github.com/llvm/llvm-project/pull/134489


[libclc] [NFC][libclc] Merge atomic extension built-ins with identical name into a single file (PR #134489)

2025-04-10 Thread Wenju He via cfe-commits


@@ -0,0 +1,24 @@
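+//===----------------------------------------------------------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//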
+
+#ifdef cl_khr_global_int32_extended_atomics
+#define __CLC_FUNCTION atom_min
+#define __CLC_ADDRESS_SPACE global
+#include 
+#endif // cl_khr_global_int32_extended_atomics
+
+#ifdef cl_khr_local_int32_extended_atomics
+#define __CLC_FUNCTION atom_min
+#define __CLC_ADDRESS_SPACE local
+#include 
+#endif // cl_khr_local_int32_extended_atomics
+
+#ifdef cl_khr_int64_base_atomics

wenju-he wrote:

done

https://github.com/llvm/llvm-project/pull/134489


[libclc] [NFC][libclc] Merge atomic extension built-ins with identical name into a single file (PR #134489)

2025-04-10 Thread Wenju He via cfe-commits


@@ -0,0 +1,24 @@
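+//===----------------------------------------------------------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//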
+
+#ifdef cl_khr_global_int32_extended_atomics
+#define __CLC_FUNCTION atom_or
+#define __CLC_ADDRESS_SPACE global
+#include 
+#endif // cl_khr_global_int32_extended_atomics
+
+#ifdef cl_khr_local_int32_extended_atomics
+#define __CLC_FUNCTION atom_or
+#define __CLC_ADDRESS_SPACE local
+#include 
+#endif // cl_khr_local_int32_extended_atomics
+
+#ifdef cl_khr_int64_base_atomics

wenju-he wrote:

done in 
https://github.com/llvm/llvm-project/pull/134489/commits/de7d8b93e0f6f0e162f4672b98ce240ecbb419ea

https://github.com/llvm/llvm-project/pull/134489


[libclc] [NFC][libclc] Merge atomic extension built-ins with identical name into a single file (PR #134489)

2025-04-10 Thread Wenju He via cfe-commits


@@ -0,0 +1,24 @@
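+//===----------------------------------------------------------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//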
+
+#ifdef cl_khr_global_int32_extended_atomics
+#define __CLC_FUNCTION atom_xor
+#define __CLC_ADDRESS_SPACE global
+#include 
+#endif // cl_khr_global_int32_extended_atomics
+
+#ifdef cl_khr_local_int32_extended_atomics
+#define __CLC_FUNCTION atom_xor
+#define __CLC_ADDRESS_SPACE local
+#include 
+#endif // cl_khr_local_int32_extended_atomics
+
+#ifdef cl_khr_int64_base_atomics

wenju-he wrote:

done in 
https://github.com/llvm/llvm-project/pull/134489/commits/de7d8b93e0f6f0e162f4672b98ce240ecbb419ea

https://github.com/llvm/llvm-project/pull/134489


[libclc] [NFC][libclc] Merge atomic extension built-ins with identical name into a single file (PR #134489)

2025-04-10 Thread Wenju He via cfe-commits


@@ -0,0 +1,24 @@
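+//===----------------------------------------------------------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//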
+
+#ifdef cl_khr_global_int32_extended_atomics
+#define __CLC_FUNCTION atom_and
+#define __CLC_ADDRESS_SPACE global
+#include 
+#endif // cl_khr_global_int32_extended_atomics
+
+#ifdef cl_khr_local_int32_extended_atomics
+#define __CLC_FUNCTION atom_and
+#define __CLC_ADDRESS_SPACE local
+#include 
+#endif // cl_khr_local_int32_extended_atomics
+
+#ifdef cl_khr_int64_base_atomics

wenju-he wrote:

good eye. done, thanks. I didn't notice this.

https://github.com/llvm/llvm-project/pull/134489


[libclc] [NFC][libclc] Merge atomic extension built-ins with identical name into a single file (PR #134489)

2025-04-10 Thread Wenju He via cfe-commits


@@ -0,0 +1,24 @@
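+//===----------------------------------------------------------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//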
+
+#ifdef cl_khr_global_int32_extended_atomics
+#define __CLC_FUNCTION atom_max
+#define __CLC_ADDRESS_SPACE global
+#include 
+#endif // cl_khr_global_int32_extended_atomics
+
+#ifdef cl_khr_local_int32_extended_atomics
+#define __CLC_FUNCTION atom_max
+#define __CLC_ADDRESS_SPACE local
+#include 
+#endif // cl_khr_local_int32_extended_atomics
+
+#ifdef cl_khr_int64_base_atomics

wenju-he wrote:

done

https://github.com/llvm/llvm-project/pull/134489
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[libclc] [libclc] Fix commands in compile_to_bc are executed sequentially (PR #130755)

2025-04-10 Thread Wenju He via cfe-commits


@@ -63,13 +65,15 @@ function(compile_to_bc)
   ${ARG_DEPENDENCIES}
 DEPFILE ${ARG_OUTPUT}.d
   )
+  add_custom_target( ${ARG_TARGET} DEPENDS ${ARG_OUTPUT}${TMP_SUFFIX} )

wenju-he wrote:

> This is still better than what we've got so maybe we can keep an eye on it 
> and see if we have any parallel build issues stemming from .ll files, and we 
> can adjust as needed.

yeah, I agree

https://github.com/llvm/llvm-project/pull/130755


[libclc] [NFC][libclc] Refine clz to use __builtin_clzg (PR #135301)

2025-04-10 Thread Wenju He via cfe-commits

https://github.com/wenju-he created 
https://github.com/llvm/llvm-project/pull/135301

It looks simpler to use __builtin_clzg for all unsigned types.
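As a rough host-side illustration of the difference (a sketch in plain C, not the libclc code): the older style guards zero explicitly because `__builtin_clz(0)` is undefined, and handles narrow types by widening and subtracting, while `__builtin_clzg(x, fallback)` takes the value to return for a zero input as its second argument and counts in the width of its argument type. The function names below are hypothetical, the widening here goes through 32 bits rather than `ushort` as in the patch, and `__builtin_clzg` is only used when the compiler reports it via `__has_builtin`; otherwise the sketch falls back to the guarded form.

```c
#include <assert.h>
#include <stdint.h>

#ifndef __has_builtin
#define __has_builtin(x) 0 /* older compilers: assume no type-generic builtin */
#endif

/* Guarded style: __builtin_clz(0) is undefined, so zero is handled by hand. */
unsigned clz32_guarded(uint32_t x) {
  return x ? (unsigned)__builtin_clz(x) : 32u;
}

/* Narrow type via widening: count in 32 bits, then drop the 24 extra zeros. */
unsigned clz8_widened(uint8_t x) { return clz32_guarded(x) - 24u; }

/* Type-generic style: the second argument is the result for a zero input. */
unsigned clz32_generic(uint32_t x) {
#if __has_builtin(__builtin_clzg)
  return (unsigned)__builtin_clzg(x, 32);
#else
  return clz32_guarded(x);
#endif
}

unsigned clz8_generic(uint8_t x) {
#if __has_builtin(__builtin_clzg)
  return (unsigned)__builtin_clzg(x, 8);
#else
  return clz8_widened(x);
#endif
}

/* Self-check: both styles agree on every 8-bit input, including zero. */
void clz_demo_check(void) {
  for (unsigned v = 0; v < 256; ++v)
    assert(clz8_generic((uint8_t)v) == clz8_widened((uint8_t)v));
  assert(clz32_generic(0u) == 32u);
}
```

The type-generic form removes both the explicit zero test and the per-width subtraction, which is the simplification the PR description refers to.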

>From 5992cc83e904ce047598a1987e2f8ce1926b9292 Mon Sep 17 00:00:00 2001
From: Wenju He 
Date: Thu, 10 Apr 2025 19:34:50 -0700
Subject: [PATCH] [NFC][libclc] Refine clz to use __builtin_clzg

It looks simpler to use __builtin_clzg for all unsigned types.
---
 libclc/clc/lib/generic/integer/clc_clz.cl | 16 
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/libclc/clc/lib/generic/integer/clc_clz.cl b/libclc/clc/lib/generic/integer/clc_clz.cl
index 74f662375af6b..a38c1d7ea0685 100644
--- a/libclc/clc/lib/generic/integer/clc_clz.cl
+++ b/libclc/clc/lib/generic/integer/clc_clz.cl
@@ -11,35 +11,35 @@
 #include 
 
 _CLC_OVERLOAD _CLC_DEF char __clc_clz(char x) {
-  return __clc_clz((ushort)(uchar)x) - 8;
+  return __clc_clz(__clc_as_uchar(x));
 }
 
 _CLC_OVERLOAD _CLC_DEF uchar __clc_clz(uchar x) {
-  return __clc_clz((ushort)x) - 8;
+  return __builtin_clzg(x, 8);
 }
 
 _CLC_OVERLOAD _CLC_DEF short __clc_clz(short x) {
-  return x ? __builtin_clzs(x) : 16;
+  return __clc_clz(__clc_as_ushort(x));
 }
 
 _CLC_OVERLOAD _CLC_DEF ushort __clc_clz(ushort x) {
-  return x ? __builtin_clzs(x) : 16;
+  return __builtin_clzg(x, 16);
 }
 
 _CLC_OVERLOAD _CLC_DEF int __clc_clz(int x) {
-  return x ? __builtin_clz(x) : 32;
+  return __clc_clz(__clc_as_uint(x));
 }
 
 _CLC_OVERLOAD _CLC_DEF uint __clc_clz(uint x) {
-  return x ? __builtin_clz(x) : 32;
+  return __builtin_clzg(x, 32);
 }
 
 _CLC_OVERLOAD _CLC_DEF long __clc_clz(long x) {
-  return x ? __builtin_clzl(x) : 64;
+  return __clc_clz(__clc_as_ulong(x));
 }
 
 _CLC_OVERLOAD _CLC_DEF ulong __clc_clz(ulong x) {
-  return x ? __builtin_clzl(x) : 64;
+  return __builtin_clzg(x, 64);
 }
 
 _CLC_UNARY_VECTORIZE(_CLC_OVERLOAD _CLC_DEF, char, __clc_clz, char)



[libclc] [NFC][libclc] Refine clz to use __builtin_clzg (PR #135301)

2025-04-10 Thread Wenju He via cfe-commits

https://github.com/wenju-he updated 
https://github.com/llvm/llvm-project/pull/135301

>From 5992cc83e904ce047598a1987e2f8ce1926b9292 Mon Sep 17 00:00:00 2001
From: Wenju He 
Date: Thu, 10 Apr 2025 19:34:50 -0700
Subject: [PATCH 1/2] [NFC][libclc] Refine clz to use __builtin_clzg

It looks simpler to use __builtin_clzg for all unsigned types.
---
 libclc/clc/lib/generic/integer/clc_clz.cl | 16 
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/libclc/clc/lib/generic/integer/clc_clz.cl b/libclc/clc/lib/generic/integer/clc_clz.cl
index 74f662375af6b..a38c1d7ea0685 100644
--- a/libclc/clc/lib/generic/integer/clc_clz.cl
+++ b/libclc/clc/lib/generic/integer/clc_clz.cl
@@ -11,35 +11,35 @@
 #include 
 
 _CLC_OVERLOAD _CLC_DEF char __clc_clz(char x) {
-  return __clc_clz((ushort)(uchar)x) - 8;
+  return __clc_clz(__clc_as_uchar(x));
 }
 
 _CLC_OVERLOAD _CLC_DEF uchar __clc_clz(uchar x) {
-  return __clc_clz((ushort)x) - 8;
+  return __builtin_clzg(x, 8);
 }
 
 _CLC_OVERLOAD _CLC_DEF short __clc_clz(short x) {
-  return x ? __builtin_clzs(x) : 16;
+  return __clc_clz(__clc_as_ushort(x));
 }
 
 _CLC_OVERLOAD _CLC_DEF ushort __clc_clz(ushort x) {
-  return x ? __builtin_clzs(x) : 16;
+  return __builtin_clzg(x, 16);
 }
 
 _CLC_OVERLOAD _CLC_DEF int __clc_clz(int x) {
-  return x ? __builtin_clz(x) : 32;
+  return __clc_clz(__clc_as_uint(x));
 }
 
 _CLC_OVERLOAD _CLC_DEF uint __clc_clz(uint x) {
-  return x ? __builtin_clz(x) : 32;
+  return __builtin_clzg(x, 32);
 }
 
 _CLC_OVERLOAD _CLC_DEF long __clc_clz(long x) {
-  return x ? __builtin_clzl(x) : 64;
+  return __clc_clz(__clc_as_ulong(x));
 }
 
 _CLC_OVERLOAD _CLC_DEF ulong __clc_clz(ulong x) {
-  return x ? __builtin_clzl(x) : 64;
+  return __builtin_clzg(x, 64);
 }
 
 _CLC_UNARY_VECTORIZE(_CLC_OVERLOAD _CLC_DEF, char, __clc_clz, char)

>From 2be72b2db81865f777cd45c3bd5b9fc15399eeb8 Mon Sep 17 00:00:00 2001
From: Wenju He 
Date: Thu, 10 Apr 2025 20:26:44 -0700
Subject: [PATCH 2/2] clang-format

---
 libclc/clc/lib/generic/integer/clc_clz.cl | 8 ++--
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/libclc/clc/lib/generic/integer/clc_clz.cl b/libclc/clc/lib/generic/integer/clc_clz.cl
index a38c1d7ea0685..71582fca94172 100644
--- a/libclc/clc/lib/generic/integer/clc_clz.cl
+++ b/libclc/clc/lib/generic/integer/clc_clz.cl
@@ -14,9 +14,7 @@ _CLC_OVERLOAD _CLC_DEF char __clc_clz(char x) {
   return __clc_clz(__clc_as_uchar(x));
 }
 
-_CLC_OVERLOAD _CLC_DEF uchar __clc_clz(uchar x) {
-  return __builtin_clzg(x, 8);
-}
+_CLC_OVERLOAD _CLC_DEF uchar __clc_clz(uchar x) { return __builtin_clzg(x, 8); }
 
 _CLC_OVERLOAD _CLC_DEF short __clc_clz(short x) {
   return __clc_clz(__clc_as_ushort(x));
@@ -30,9 +28,7 @@ _CLC_OVERLOAD _CLC_DEF int __clc_clz(int x) {
   return __clc_clz(__clc_as_uint(x));
 }
 
-_CLC_OVERLOAD _CLC_DEF uint __clc_clz(uint x) {
-  return __builtin_clzg(x, 32);
-}
+_CLC_OVERLOAD _CLC_DEF uint __clc_clz(uint x) { return __builtin_clzg(x, 32); }
 
 _CLC_OVERLOAD _CLC_DEF long __clc_clz(long x) {
   return __clc_clz(__clc_as_ulong(x));



[libclc] [libclc] add ctz built-in implementation to clc and generic (PR #135309)

2025-04-10 Thread Wenju He via cfe-commits

https://github.com/wenju-he created 
https://github.com/llvm/llvm-project/pull/135309

None

>From 31423376e35f34ca032fe3d11998537912ba2c63 Mon Sep 17 00:00:00 2001
From: Wenju He 
Date: Thu, 10 Apr 2025 20:10:41 -0700
Subject: [PATCH] [libclc] add ctz built-in implementation to clc and generic

---
 libclc/clc/include/clc/integer/clc_ctz.h  | 20 ++
 libclc/clc/lib/generic/SOURCES|  1 +
 libclc/clc/lib/generic/integer/clc_ctz.cl | 48 +++
 libclc/generic/include/clc/clc.h  |  1 +
 libclc/generic/include/clc/integer/ctz.h  | 15 +++
 libclc/generic/lib/SOURCES|  1 +
 libclc/generic/lib/integer/ctz.cl | 15 +++
 7 files changed, 101 insertions(+)
 create mode 100644 libclc/clc/include/clc/integer/clc_ctz.h
 create mode 100644 libclc/clc/lib/generic/integer/clc_ctz.cl
 create mode 100644 libclc/generic/include/clc/integer/ctz.h
 create mode 100644 libclc/generic/lib/integer/ctz.cl

diff --git a/libclc/clc/include/clc/integer/clc_ctz.h b/libclc/clc/include/clc/integer/clc_ctz.h
new file mode 100644
index 0..1e6365100827b
--- /dev/null
+++ b/libclc/clc/include/clc/integer/clc_ctz.h
@@ -0,0 +1,20 @@
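+//===----------------------------------------------------------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//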
+
+#ifndef __CLC_INTEGER_CLC_CTZ_H__
+#define __CLC_INTEGER_CLC_CTZ_H__
+
+#define __CLC_FUNCTION __clc_ctz
+#define __CLC_BODY 
+
+#include 
+
+#undef __CLC_BODY
+#undef __CLC_FUNCTION
+
+#endif // __CLC_INTEGER_CLC_CTZ_H__
diff --git a/libclc/clc/lib/generic/SOURCES b/libclc/clc/lib/generic/SOURCES
index 4503a20ad9848..1e73627c3a270 100644
--- a/libclc/clc/lib/generic/SOURCES
+++ b/libclc/clc/lib/generic/SOURCES
@@ -7,6 +7,7 @@ integer/clc_abs.cl
 integer/clc_abs_diff.cl
 integer/clc_add_sat.cl
 integer/clc_clz.cl
+integer/clc_ctz.cl
 integer/clc_hadd.cl
 integer/clc_mad24.cl
 integer/clc_mad_sat.cl
diff --git a/libclc/clc/lib/generic/integer/clc_ctz.cl b/libclc/clc/lib/generic/integer/clc_ctz.cl
new file mode 100644
index 0..50fda4a214b24
--- /dev/null
+++ b/libclc/clc/lib/generic/integer/clc_ctz.cl
@@ -0,0 +1,48 @@
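+//===----------------------------------------------------------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//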
+
+#include 
+#include 
+#include 
+
+_CLC_OVERLOAD _CLC_DEF char __clc_ctz(char x) {
+  return __clc_ctz(__clc_as_uchar(x));
+}
+
+_CLC_OVERLOAD _CLC_DEF uchar __clc_ctz(uchar x) { return __builtin_ctzg(x, 8); }
+
+_CLC_OVERLOAD _CLC_DEF short __clc_ctz(short x) {
+  return __clc_ctz(__clc_as_ushort(x));
+}
+
+_CLC_OVERLOAD _CLC_DEF ushort __clc_ctz(ushort x) {
+  return __builtin_ctzg(x, 16);
+}
+
+_CLC_OVERLOAD _CLC_DEF int __clc_ctz(int x) {
+  return __clc_ctz(__clc_as_uint(x));
+}
+
+_CLC_OVERLOAD _CLC_DEF uint __clc_ctz(uint x) { return __builtin_ctzg(x, 32); }
+
+_CLC_OVERLOAD _CLC_DEF long __clc_ctz(long x) {
+  return __clc_ctz(__clc_as_ulong(x));
+}
+
+_CLC_OVERLOAD _CLC_DEF ulong __clc_ctz(ulong x) {
+  return __builtin_ctzg(x, 64);
+}
+
+_CLC_UNARY_VECTORIZE(_CLC_OVERLOAD _CLC_DEF, char, __clc_ctz, char)
+_CLC_UNARY_VECTORIZE(_CLC_OVERLOAD _CLC_DEF, uchar, __clc_ctz, uchar)
+_CLC_UNARY_VECTORIZE(_CLC_OVERLOAD _CLC_DEF, short, __clc_ctz, short)
+_CLC_UNARY_VECTORIZE(_CLC_OVERLOAD _CLC_DEF, ushort, __clc_ctz, ushort)
+_CLC_UNARY_VECTORIZE(_CLC_OVERLOAD _CLC_DEF, int, __clc_ctz, int)
+_CLC_UNARY_VECTORIZE(_CLC_OVERLOAD _CLC_DEF, uint, __clc_ctz, uint)
+_CLC_UNARY_VECTORIZE(_CLC_OVERLOAD _CLC_DEF, long, __clc_ctz, long)
+_CLC_UNARY_VECTORIZE(_CLC_OVERLOAD _CLC_DEF, ulong, __clc_ctz, ulong)
diff --git a/libclc/generic/include/clc/clc.h b/libclc/generic/include/clc/clc.h
index 51a4f3413e725..d950fa5b76cca 100644
--- a/libclc/generic/include/clc/clc.h
+++ b/libclc/generic/include/clc/clc.h
@@ -153,6 +153,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
diff --git a/libclc/generic/include/clc/integer/ctz.h b/libclc/generic/include/clc/integer/ctz.h
new file mode 100644
index 0..e8a91cd0ac1fc
--- /dev/null
+++ b/libclc/generic/include/clc/integer/ctz.h
@@ -0,0 +1,15 @@
+//===----------------------------------------------------------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===---

[libclc] [libclc] add ctz built-in implementation to clc and generic (PR #135309)

2025-04-10 Thread Wenju He via cfe-commits

wenju-he wrote:

@frasercrmck could you please review? Thanks.

https://github.com/llvm/llvm-project/pull/135309


[libclc] [NFC][libclc] Refine clz to use __builtin_clzg (PR #135301)

2025-04-10 Thread Wenju He via cfe-commits

wenju-he wrote:

@frasercrmck could you please review? Thanks.

https://github.com/llvm/llvm-project/pull/135301


[libclc] [libclc] Fix commands in compile_to_bc are executed sequentially (PR #130755)

2025-04-10 Thread Wenju He via cfe-commits

wenju-he wrote:

> On my build this increases the number of targets `ninja -t targets` from 19K 
> to 28K. That's building all targets. I'm not sure what else we could do about 
> that and whether it's a problem.

Yes, the number of targets increases significantly with this approach. But as
far as I know, the increase in the number of targets does not cause any
compilation correctness issue.
The only alternative approach is DEPENDS_EXPLICIT_ONLY. I can add
DEPENDS_EXPLICIT_ONLY support to this PR for CMake 3.27 or higher.

https://github.com/llvm/llvm-project/pull/130755


[libclc] [libclc] Set OpenCL C version for each target (PR #135733)

2025-04-15 Thread Wenju He via cfe-commits

wenju-he wrote:

> Targets may even want to compile libclc for multiple different versions? That 
> might introduce extra complexity (we'd need "an extra loop", more compilation 
> time, new naming/versioning schemes for the build artifacts).

An application may compile with a different -cl-std version, but IIUC such usage
is incompatible with a target that supports a specific OpenCL version. So
compiling for the target's supported OpenCL version might be enough.

https://github.com/llvm/llvm-project/pull/135733


[libclc] [libclc] Set OpenCL version to 3.0 (PR #135733)

2025-04-15 Thread Wenju He via cfe-commits


@@ -411,6 +411,16 @@ foreach( t ${LIBCLC_TARGETS_TO_BUILD} )
 set( LIBCLC_ARCH_OBJFILE_DIR "${LIBCLC_OBJFILE_DIR}/${arch_suffix}" )
 file( MAKE_DIRECTORY ${LIBCLC_ARCH_OBJFILE_DIR} )
 
+# OpenCL 3.0 extensions
+string(CONCAT CL_3_0_EXTENSIONS
+  "-cl-ext="
+  "+cl_khr_fp64,"
+  "+cl_khr_fp16,"
+  "+__opencl_c_3d_image_writes,"
+  "+__opencl_c_images,"
+  "+cl_khr_3d_image_writes")
+list( APPEND build_flags -cl-std=CL3.0 "-Xclang" ${CL_3_0_EXTENSIONS} )

wenju-he wrote:

I've reverted the OpenCL extensions change in this PR in 
https://github.com/llvm/llvm-project/pull/135733/commits/4facfec781e39a247aba639ea8e080aa79153a12
I found that a target should define its supported extensions via the
setSupportedOpenCLOpts API.

https://github.com/llvm/llvm-project/pull/135733


[libclc] [libclc] Set OpenCL version to 3.0 (PR #135733)

2025-04-15 Thread Wenju He via cfe-commits

https://github.com/wenju-he updated 
https://github.com/llvm/llvm-project/pull/135733

>From 64d7bfdceb5a0a6fbf34bb15cd7d6cbeb9214881 Mon Sep 17 00:00:00 2001
From: Wenju He 
Date: Mon, 14 Apr 2025 19:20:25 -0700
Subject: [PATCH 1/2] [libclc] Set OpenCL version to 3.0

This PR is a cherry-pick of https://github.com/intel/llvm/commit/cba338e5fb1c
This allows adding OpenCL 2.0 built-ins, e.g. ctz, and OpenCL 3.0
extension built-ins, including generic address space variants.

llvm-diff shows that this PR makes no change to amdgcn--amdhsa.bc.
---
 libclc/CMakeLists.txt | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/libclc/CMakeLists.txt b/libclc/CMakeLists.txt
index dbbc29261d3b5..278ae5d777a84 100644
--- a/libclc/CMakeLists.txt
+++ b/libclc/CMakeLists.txt
@@ -411,6 +411,16 @@ foreach( t ${LIBCLC_TARGETS_TO_BUILD} )
 set( LIBCLC_ARCH_OBJFILE_DIR "${LIBCLC_OBJFILE_DIR}/${arch_suffix}" )
 file( MAKE_DIRECTORY ${LIBCLC_ARCH_OBJFILE_DIR} )
 
+# OpenCL 3.0 extensions
+string(CONCAT CL_3_0_EXTENSIONS
+  "-cl-ext="
+  "+cl_khr_fp64,"
+  "+cl_khr_fp16,"
+  "+__opencl_c_3d_image_writes,"
+  "+__opencl_c_images,"
+  "+cl_khr_3d_image_writes")
+list( APPEND build_flags -cl-std=CL3.0 "-Xclang" ${CL_3_0_EXTENSIONS} )
+
 string( TOUPPER "CLC_${MACRO_ARCH}" CLC_TARGET_DEFINE )
 
 list( APPEND build_flags

>From 4facfec781e39a247aba639ea8e080aa79153a12 Mon Sep 17 00:00:00 2001
From: Wenju He 
Date: Tue, 15 Apr 2025 20:56:40 -0700
Subject: [PATCH 2/2] set opencl_c_version per target, remove CL_3_0_EXTENSIONS

---
 libclc/CMakeLists.txt | 32 +---
 1 file changed, 21 insertions(+), 11 deletions(-)

diff --git a/libclc/CMakeLists.txt b/libclc/CMakeLists.txt
index 278ae5d777a84..e3093af57e728 100644
--- a/libclc/CMakeLists.txt
+++ b/libclc/CMakeLists.txt
@@ -387,7 +387,11 @@ foreach( t ${LIBCLC_TARGETS_TO_BUILD} )
 
 message( STATUS "  device: ${d} ( ${${d}_aliases} )" )
 
-if ( ARCH STREQUAL spirv OR ARCH STREQUAL spirv64 )
+# 1.2 is Clang's default OpenCL C language standard to compile for.
+set( opencl_lang_std "CL1.2" )
+
+if ( ${DARCH} STREQUAL spirv )
+  set( opencl_lang_std "CL3.0" )
   set( build_flags -O0 -finline-hint-functions -DCLC_SPIRV )
   set( opt_flags )
   set( spvflags --spirv-max-version=1.1 )
@@ -395,13 +399,27 @@ foreach( t ${LIBCLC_TARGETS_TO_BUILD} )
   if( ARCH STREQUAL spirv64 )
 set( MACRO_ARCH SPIRV64 )
   endif()
-elseif( ARCH STREQUAL clspv OR ARCH STREQUAL clspv64 )
+elseif( ${DARCH} STREQUAL clspv )
+  # Refer to https://github.com/google/clspv for OpenCL version.
+  set( opencl_lang_std "CL3.0" )
   set( build_flags "-Wno-unknown-assumption" -DCLC_CLSPV )
   set( opt_flags -O3 )
   set( MACRO_ARCH CLSPV32 )
   if( ARCH STREQUAL clspv64 )
 set( MACRO_ARCH CLSPV64 )
   endif()
+elseif( ${DARCH} STREQUAL nvptx )
+  # Refer to https://www.khronos.org/opencl/ for OpenCL version in NV implementation.
+  set( opencl_lang_std "CL3.0" )
+  set( build_flags )
+  set( opt_flags -O3 )
+  set( MACRO_ARCH ${ARCH} )
+elseif( ${DARCH} STREQUAL amdgcn OR ${DARCH} STREQUAL amdgcn-amdhsa OR ${DARCH} STREQUAL r600 )
+  # Refer to https://github.com/ROCm/clr/tree/develop/opencl for OpenCL version.
+  set( opencl_lang_std "CL2.0" )
+  set( build_flags )
+  set( opt_flags -O3 )
+  set( MACRO_ARCH ${ARCH} )
 else()
   set( build_flags )
   set( opt_flags -O3 )
@@ -411,15 +429,7 @@ foreach( t ${LIBCLC_TARGETS_TO_BUILD} )
 set( LIBCLC_ARCH_OBJFILE_DIR "${LIBCLC_OBJFILE_DIR}/${arch_suffix}" )
 file( MAKE_DIRECTORY ${LIBCLC_ARCH_OBJFILE_DIR} )
 
-# OpenCL 3.0 extensions
-string(CONCAT CL_3_0_EXTENSIONS
-  "-cl-ext="
-  "+cl_khr_fp64,"
-  "+cl_khr_fp16,"
-  "+__opencl_c_3d_image_writes,"
-  "+__opencl_c_images,"
-  "+cl_khr_3d_image_writes")
-list( APPEND build_flags -cl-std=CL3.0 "-Xclang" ${CL_3_0_EXTENSIONS} )
+list( APPEND build_flags -cl-std=${opencl_lang_std} )
 
 string( TOUPPER "CLC_${MACRO_ARCH}" CLC_TARGET_DEFINE )
 

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[libclc] [libclc] Set OpenCL C version for each target (PR #135733)

2025-04-16 Thread Wenju He via cfe-commits


@@ -387,21 +387,39 @@ foreach( t ${LIBCLC_TARGETS_TO_BUILD} )
 
 message( STATUS "  device: ${d} ( ${${d}_aliases} )" )
 
-if ( ARCH STREQUAL spirv OR ARCH STREQUAL spirv64 )
+# 1.2 is Clang's default OpenCL C language standard to compile for.
+set( opencl_lang_std "CL1.2" )
+
+if ( ${DARCH} STREQUAL spirv )

wenju-he wrote:

done

https://github.com/llvm/llvm-project/pull/135733


[libclc] [libclc] Set OpenCL C version for each target (PR #135733)

2025-04-16 Thread Wenju He via cfe-commits


@@ -387,21 +387,39 @@ foreach( t ${LIBCLC_TARGETS_TO_BUILD} )
 
 message( STATUS "  device: ${d} ( ${${d}_aliases} )" )
 
-if ( ARCH STREQUAL spirv OR ARCH STREQUAL spirv64 )
+# 1.2 is Clang's default OpenCL C language standard to compile for.
+set( opencl_lang_std "CL1.2" )
+
+if ( ${DARCH} STREQUAL spirv )
+  set( opencl_lang_std "CL3.0" )
   set( build_flags -O0 -finline-hint-functions -DCLC_SPIRV )
   set( opt_flags )
   set( spvflags --spirv-max-version=1.1 )
   set( MACRO_ARCH SPIRV32 )
   if( ARCH STREQUAL spirv64 )
 set( MACRO_ARCH SPIRV64 )
   endif()
-elseif( ARCH STREQUAL clspv OR ARCH STREQUAL clspv64 )
+elseif( ${DARCH} STREQUAL clspv )
+  # Refer to https://github.com/google/clspv for OpenCL version.
+  set( opencl_lang_std "CL3.0" )
   set( build_flags "-Wno-unknown-assumption" -DCLC_CLSPV )
   set( opt_flags -O3 )
   set( MACRO_ARCH CLSPV32 )
   if( ARCH STREQUAL clspv64 )
 set( MACRO_ARCH CLSPV64 )
   endif()
+elseif( ${DARCH} STREQUAL nvptx )
+  # Refer to https://www.khronos.org/opencl/ for OpenCL version in NV implementation.

wenju-he wrote:

updated link to https://developer.nvidia.com/opencl

https://github.com/llvm/llvm-project/pull/135733


[libclc] [libclc] Set OpenCL C version for each target (PR #135733)

2025-04-16 Thread Wenju He via cfe-commits

https://github.com/wenju-he updated 
https://github.com/llvm/llvm-project/pull/135733

>From 64d7bfdceb5a0a6fbf34bb15cd7d6cbeb9214881 Mon Sep 17 00:00:00 2001
From: Wenju He 
Date: Mon, 14 Apr 2025 19:20:25 -0700
Subject: [PATCH 1/4] [libclc] Set OpenCL version to 3.0

This PR is a cherry-pick of https://github.com/intel/llvm/commit/cba338e5fb1c
This allows adding OpenCL 2.0 built-ins, e.g. ctz, and OpenCL 3.0
extension built-ins, including generic address space variants.

llvm-diff shows this PR has no change in amdgcn--amdhsa.bc.
---
 libclc/CMakeLists.txt | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/libclc/CMakeLists.txt b/libclc/CMakeLists.txt
index dbbc29261d3b5..278ae5d777a84 100644
--- a/libclc/CMakeLists.txt
+++ b/libclc/CMakeLists.txt
@@ -411,6 +411,16 @@ foreach( t ${LIBCLC_TARGETS_TO_BUILD} )
 set( LIBCLC_ARCH_OBJFILE_DIR "${LIBCLC_OBJFILE_DIR}/${arch_suffix}" )
 file( MAKE_DIRECTORY ${LIBCLC_ARCH_OBJFILE_DIR} )
 
+# OpenCL 3.0 extensions
+string(CONCAT CL_3_0_EXTENSIONS
+  "-cl-ext="
+  "+cl_khr_fp64,"
+  "+cl_khr_fp16,"
+  "+__opencl_c_3d_image_writes,"
+  "+__opencl_c_images,"
+  "+cl_khr_3d_image_writes")
+list( APPEND build_flags -cl-std=CL3.0 "-Xclang" ${CL_3_0_EXTENSIONS} )
+
 string( TOUPPER "CLC_${MACRO_ARCH}" CLC_TARGET_DEFINE )
 
 list( APPEND build_flags

>From 4facfec781e39a247aba639ea8e080aa79153a12 Mon Sep 17 00:00:00 2001
From: Wenju He 
Date: Tue, 15 Apr 2025 20:56:40 -0700
Subject: [PATCH 2/4] set opencl_c_version per target, remove CL_3_0_EXTENSIONS

---
 libclc/CMakeLists.txt | 32 +---
 1 file changed, 21 insertions(+), 11 deletions(-)

diff --git a/libclc/CMakeLists.txt b/libclc/CMakeLists.txt
index 278ae5d777a84..e3093af57e728 100644
--- a/libclc/CMakeLists.txt
+++ b/libclc/CMakeLists.txt
@@ -387,7 +387,11 @@ foreach( t ${LIBCLC_TARGETS_TO_BUILD} )
 
 message( STATUS "  device: ${d} ( ${${d}_aliases} )" )
 
-if ( ARCH STREQUAL spirv OR ARCH STREQUAL spirv64 )
+# 1.2 is Clang's default OpenCL C language standard to compile for.
+set( opencl_lang_std "CL1.2" )
+
+if ( ${DARCH} STREQUAL spirv )
+  set( opencl_lang_std "CL3.0" )
   set( build_flags -O0 -finline-hint-functions -DCLC_SPIRV )
   set( opt_flags )
   set( spvflags --spirv-max-version=1.1 )
@@ -395,13 +399,27 @@ foreach( t ${LIBCLC_TARGETS_TO_BUILD} )
   if( ARCH STREQUAL spirv64 )
 set( MACRO_ARCH SPIRV64 )
   endif()
-elseif( ARCH STREQUAL clspv OR ARCH STREQUAL clspv64 )
+elseif( ${DARCH} STREQUAL clspv )
+  # Refer to https://github.com/google/clspv for OpenCL version.
+  set( opencl_lang_std "CL3.0" )
   set( build_flags "-Wno-unknown-assumption" -DCLC_CLSPV )
   set( opt_flags -O3 )
   set( MACRO_ARCH CLSPV32 )
   if( ARCH STREQUAL clspv64 )
 set( MACRO_ARCH CLSPV64 )
   endif()
+elseif( ${DARCH} STREQUAL nvptx )
+  # Refer to https://www.khronos.org/opencl/ for OpenCL version in NV implementation.
+  set( opencl_lang_std "CL3.0" )
+  set( build_flags )
+  set( opt_flags -O3 )
+  set( MACRO_ARCH ${ARCH} )
+elseif( ${DARCH} STREQUAL amdgcn OR ${DARCH} STREQUAL amdgcn-amdhsa OR ${DARCH} STREQUAL r600 )
+  # Refer to https://github.com/ROCm/clr/tree/develop/opencl for OpenCL version.
+  set( opencl_lang_std "CL2.0" )
+  set( build_flags )
+  set( opt_flags -O3 )
+  set( MACRO_ARCH ${ARCH} )
 else()
   set( build_flags )
   set( opt_flags -O3 )
@@ -411,15 +429,7 @@ foreach( t ${LIBCLC_TARGETS_TO_BUILD} )
 set( LIBCLC_ARCH_OBJFILE_DIR "${LIBCLC_OBJFILE_DIR}/${arch_suffix}" )
 file( MAKE_DIRECTORY ${LIBCLC_ARCH_OBJFILE_DIR} )
 
-# OpenCL 3.0 extensions
-string(CONCAT CL_3_0_EXTENSIONS
-  "-cl-ext="
-  "+cl_khr_fp64,"
-  "+cl_khr_fp16,"
-  "+__opencl_c_3d_image_writes,"
-  "+__opencl_c_images,"
-  "+cl_khr_3d_image_writes")
-list( APPEND build_flags -cl-std=CL3.0 "-Xclang" ${CL_3_0_EXTENSIONS} )
+list( APPEND build_flags -cl-std=${opencl_lang_std} )
 
 string( TOUPPER "CLC_${MACRO_ARCH}" CLC_TARGET_DEFINE )
 

>From 31604df0f2c7337d476878bc3245f452fe2c941b Mon Sep 17 00:00:00 2001
From: Wenju He 
Date: Tue, 15 Apr 2025 21:05:57 -0700
Subject: [PATCH 3/4] use default OpenCL C version for r600

---
 libclc/CMakeLists.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/libclc/CMakeLists.txt b/libclc/CMakeLists.txt
index e3093af57e728..07da2466f5e42 100644
--- a/libclc/CMakeLists.txt
+++ b/libclc/CMakeLists.txt
@@ -414,7 +414,7 @@ foreach( t ${LIBCLC_TARGETS_TO_BUILD} )
   set( build_flags )
   set( opt_flags -O3 )
   set( MACRO_ARCH ${ARCH} )
-elseif( ${DARCH} STREQUAL amdgcn OR ${DARCH} STREQUAL amdgcn-amdhsa OR ${DARCH} STREQUAL r600 )
+elseif( ${DARCH} STREQUAL amdgcn OR ${DARCH} STREQUAL amdgcn-amdhsa )
   # Refer to https://github.com/ROCm/clr/tree/develop/opencl for O

[libclc] [libclc] Set OpenCL C version for each target (PR #135733)

2025-04-16 Thread Wenju He via cfe-commits

wenju-he wrote:

> Yes I think on reflection it's probably okay to compile for the highest 
> supported OpenCL C version. 

Yeah, I think libclc should compile for the version that the target claims to 
support.
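
As a rough illustration of "the version that the target claims to support", the per-target choice made by the patch series in this PR (as of PATCH 3/4, where r600 falls back to the default) can be modeled as a lookup table. This is only a sketch in Python, not the CMake code itself; `STD_BY_DARCH` and `cl_std_flag` are hypothetical names, and the input is assumed to be the same architecture string that the patch compares against `${DARCH}`.

```python
# Hypothetical model of the per-target OpenCL C version selection in this PR.
# The values mirror the patch hunks; the names are illustrative only.

# The patch comment states that 1.2 is Clang's default OpenCL C standard.
DEFAULT_STD = "CL1.2"

# Architectures for which the patch sets an explicit language standard.
STD_BY_DARCH = {
    "spirv": "CL3.0",
    "clspv": "CL3.0",
    "nvptx": "CL3.0",
    "amdgcn": "CL2.0",
    "amdgcn-amdhsa": "CL2.0",
}


def cl_std_flag(darch: str) -> str:
    """Return the -cl-std flag appended to build_flags for one target."""
    return "-cl-std=" + STD_BY_DARCH.get(darch, DEFAULT_STD)


if __name__ == "__main__":
    for darch in ("spirv", "amdgcn", "r600"):
        print(darch, cl_std_flag(darch))
```

Any architecture not listed in the table, r600 included, ends up with the default standard, which matches the `else()` branch of the CMake code.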

> I believe they're supersets of one another, so a 3.0 builtins library would 
> still contain all of the 1.2 builtins? I might be missing some of the nitty 
> gritty here though.

3.0 is backward compatible with 1.2, 1.1, and 1.0.
For other versions, it depends on whether the target includes those versions in its 
[CL_DEVICE_OPENCL_C_ALL_VERSIONS](https://registry.khronos.org/OpenCL/specs/3.0-unified/html/OpenCL_API.html#CL_DEVICE_OPENCL_C_ALL_VERSIONS)
 query result.
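
The rule above could be sketched as follows. This is a hedged Python model that takes the rule exactly as stated in this message; it does not call the OpenCL API. The function name `device_accepts` is hypothetical, and the list of `(major, minor)` tuples stands in for what a CL_DEVICE_OPENCL_C_ALL_VERSIONS query would report.

```python
# Hypothetical model of the compatibility rule described in this message.
# Versions are (major, minor) tuples; reported_versions stands in for the
# result of a CL_DEVICE_OPENCL_C_ALL_VERSIONS query (assumption, no API call).

# Versions that 3.0 is backward compatible with, per the message.
COMPATIBLE_WITH_3_0 = {(1, 0), (1, 1), (1, 2)}


def device_accepts(reported_versions, requested):
    """Return True if a library built for `requested` suits the device."""
    reported = set(reported_versions)
    if requested in reported:
        return True
    # A 3.0 device also accepts the 1.x versions that 3.0 is compatible with.
    return (3, 0) in reported and requested in COMPATIBLE_WITH_3_0


if __name__ == "__main__":
    print(device_accepts([(3, 0)], (1, 2)))
    print(device_accepts([(3, 0)], (2, 0)))
```

Under this model a 2.0 library is only acceptable when the device explicitly lists 2.0, which is the "it depends" case in the reply.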


https://github.com/llvm/llvm-project/pull/135733

