[llvm-branch-commits] [llvm] 3bf9b25 - [DAG] Allow AssertZExt to scalarize. (#122463)

2025-01-14 Thread Tobias Hieta via llvm-branch-commits

Author: David Green
Date: 2025-01-14T10:39:41+01:00
New Revision: 3bf9b25b1627ea8610c04b1ca8de4ba9ce432050

URL: 
https://github.com/llvm/llvm-project/commit/3bf9b25b1627ea8610c04b1ca8de4ba9ce432050
DIFF: 
https://github.com/llvm/llvm-project/commit/3bf9b25b1627ea8610c04b1ca8de4ba9ce432050.diff

LOG: [DAG] Allow AssertZExt to scalarize. (#122463)

With range and undef metadata on a call we can have vector AssertZExt
generated on a target with no vector operations. The AssertZExt needs to
scalarize to a normal `AssertZext tin, ValueType`. I have added
AssertSext too, although I do not have a test case.
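
As a rough illustration of what "scalarize" means here, the following is a minimal sketch and not LLVM code. `ToyOpcode`, `ToyVectorNode`, `ToyScalarNode`, and the free function below are hypothetical stand-ins, assuming a one-element vector node whose second operand is forwarded unchanged to the scalar node with the same opcode.

```cpp
#include <cstdint>

// Hypothetical opcodes mirroring the three cases routed to the shared helper.
enum class ToyOpcode { AssertZext, AssertSext, FPowI };

// Toy model of a <1 x T> vector node: one lane plus an extra scalar operand.
// Extra stands in for operand 1 (a value type for AssertZext/AssertSext,
// the integer exponent for FPOWI); here it is just a number.
struct ToyVectorNode {
  ToyOpcode Opcode;
  uint32_t Lane;
  unsigned Extra;
};

// Toy model of the scalar node produced by scalarization.
struct ToyScalarNode {
  ToyOpcode Opcode;
  uint32_t Value;
  unsigned Extra;
};

// Sketch of the idea: keep the opcode, replace the vector operand with its
// scalarized form, and pass the extra operand through untouched.
ToyScalarNode scalarizeUnaryOpWithExtraInput(const ToyVectorNode &N) {
  return ToyScalarNode{N.Opcode, N.Lane, N.Extra};
}
```

Because the extra operand is not itself a vector, the same toy routine serves all three opcodes, which is presumably why the helper was renamed rather than duplicated.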

Fixes #110374

(cherry picked from commit ab9a80a3ad78f611fd06cd6f7215bd828809310c)

Added: 
llvm/test/CodeGen/ARM/scalarize-assert-zext.ll

Modified: 
llvm/lib/CodeGen/SelectionDAG/LegalizeTypes.h
llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp

Removed: 




diff --git a/llvm/lib/CodeGen/SelectionDAG/LegalizeTypes.h b/llvm/lib/CodeGen/SelectionDAG/LegalizeTypes.h
index d4e61c85889012..d74896772bf53b 100644
--- a/llvm/lib/CodeGen/SelectionDAG/LegalizeTypes.h
+++ b/llvm/lib/CodeGen/SelectionDAG/LegalizeTypes.h
@@ -838,7 +838,7 @@ class LLVM_LIBRARY_VISIBILITY DAGTypeLegalizer {
   SDValue ScalarizeVecRes_BUILD_VECTOR(SDNode *N);
   SDValue ScalarizeVecRes_EXTRACT_SUBVECTOR(SDNode *N);
   SDValue ScalarizeVecRes_FP_ROUND(SDNode *N);
-  SDValue ScalarizeVecRes_ExpOp(SDNode *N);
+  SDValue ScalarizeVecRes_UnaryOpWithExtraInput(SDNode *N);
   SDValue ScalarizeVecRes_INSERT_VECTOR_ELT(SDNode *N);
   SDValue ScalarizeVecRes_LOAD(LoadSDNode *N);
   SDValue ScalarizeVecRes_SCALAR_TO_VECTOR(SDNode *N);

diff --git a/llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp b/llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp
index 92b62ccdc27552..ea95aaef8a1e87 100644
--- a/llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp
@@ -58,7 +58,11 @@ void DAGTypeLegalizer::ScalarizeVectorResult(SDNode *N, unsigned ResNo) {
   case ISD::BUILD_VECTOR:  R = ScalarizeVecRes_BUILD_VECTOR(N); break;
   case ISD::EXTRACT_SUBVECTOR: R = ScalarizeVecRes_EXTRACT_SUBVECTOR(N); break;
   case ISD::FP_ROUND:  R = ScalarizeVecRes_FP_ROUND(N); break;
-  case ISD::FPOWI: R = ScalarizeVecRes_ExpOp(N); break;
+  case ISD::AssertZext:
+  case ISD::AssertSext:
+  case ISD::FPOWI:
+    R = ScalarizeVecRes_UnaryOpWithExtraInput(N);
+    break;
   case ISD::INSERT_VECTOR_ELT: R = ScalarizeVecRes_INSERT_VECTOR_ELT(N); break;
   case ISD::LOAD:   R = ScalarizeVecRes_LOAD(cast<LoadSDNode>(N));break;
   case ISD::SCALAR_TO_VECTOR:  R = ScalarizeVecRes_SCALAR_TO_VECTOR(N); break;
@@ -426,7 +430,7 @@ SDValue DAGTypeLegalizer::ScalarizeVecRes_FP_ROUND(SDNode *N) {
                      N->getOperand(1));
 }
 
-SDValue DAGTypeLegalizer::ScalarizeVecRes_ExpOp(SDNode *N) {
+SDValue DAGTypeLegalizer::ScalarizeVecRes_UnaryOpWithExtraInput(SDNode *N) {
   SDValue Op = GetScalarizedVector(N->getOperand(0));
   return DAG.getNode(N->getOpcode(), SDLoc(N), Op.getValueType(), Op,
                      N->getOperand(1));

diff --git a/llvm/test/CodeGen/ARM/scalarize-assert-zext.ll b/llvm/test/CodeGen/ARM/scalarize-assert-zext.ll
new file mode 100644
index 00..5638bb4a398803
--- /dev/null
+++ b/llvm/test/CodeGen/ARM/scalarize-assert-zext.ll
@@ -0,0 +1,46 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 5
+; RUN: llc -mtriple=armv7-unknown-linux-musleabihf -mattr=-neon %s -o - | FileCheck %s
+
+declare fastcc noundef range(i16 0, 256) <4 x i16> @other()
+
+define void @test(ptr %0) #0 {
+; CHECK-LABEL: test:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:.save {r4, lr}
+; CHECK-NEXT:push {r4, lr}
+; CHECK-NEXT:mov r4, r0
+; CHECK-NEXT:bl other
+; CHECK-NEXT:uxth r3, r3
+; CHECK-NEXT:uxth r2, r2
+; CHECK-NEXT:uxth r1, r1
+; CHECK-NEXT:uxth r0, r0
+; CHECK-NEXT:strb r3, [r4, #3]
+; CHECK-NEXT:strb r2, [r4, #2]
+; CHECK-NEXT:strb r1, [r4, #1]
+; CHECK-NEXT:strb r0, [r4]
+; CHECK-NEXT:pop {r4, pc}
+entry:
+  %call = call fastcc <4 x i16> @other()
+  %t = trunc <4 x i16> %call to <4 x i8>
+  store <4 x i8> %t, ptr %0, align 1
+  ret void
+}
+
+define <4 x i16> @test2() #0 {
+; CHECK-LABEL: test2:
+; CHECK:   @ %bb.0: @ %entry
+; CHECK-NEXT:.save {r11, lr}
+; CHECK-NEXT:push {r11, lr}
+; CHECK-NEXT:bl other
+; CHECK-NEXT:movw r1, #65408
+; CHECK-NEXT:and r0, r0, r1
+; CHECK-NEXT:and r2, r2, r1
+; CHECK-NEXT:mov r1, #0
+; CHECK-NEXT:mov r3, #0
+; CHECK-NEXT:pop {r11, pc}
+entry:
+  %call = call fastcc <4 x i16> @other()
+  %a = and <4 x i16> %call, 
+  ret <4 x i16> %a
+}
+



___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-b

[llvm-branch-commits] [clang] [llvm] [IR] Add FPOperation intrinsic property (PR #122313)

2025-01-14 Thread Serge Pavlov via llvm-branch-commits

spavloff wrote:

Rounding mode is a parameter of a floating-point operation; its effect does not
depend on whether it comes from a register or is specified as an immediate
value. Implementing the two cases differently does not look like a good
solution.

Implementing static rounding with a separate set of intrinsics would require
four intrinsics to represent a single FP operation (constrained intrinsics can
also use static rounding). This is the problem that operand bundles should
solve.

An FP operation may now have two parameters attached to it: rounding mode and
exception handling. They are very different in nature: rounding mode is part of
the FP environment, whereas exception handling is just an instruction to the
compiler and carries no information about the FP environment. Adding static
rounding (and, in the future, probably static denormal mode) hardly adds
confusion to this solution.
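
The first point can be sketched with standard C++ rather than LLVM IR. This is only an illustration, assuming a hosted implementation with `<cfenv>` support; `roundWithMode` is a hypothetical helper, not part of the PR. The same operation gives the same result for a given rounding mode whether the mode is written as a constant at the call site or arrives in a runtime variable.

```cpp
#include <cfenv>
#include <cmath>

// Rounds X to an integral value under the given rounding mode and then
// restores the previously active mode. Mode is an ordinary runtime value:
// the operation does not care whether the caller passed a literal macro
// such as FE_UPWARD or a value computed elsewhere.
double roundWithMode(double X, int Mode) {
  const int Saved = std::fegetround();
  std::fesetround(Mode);
  // volatile keeps the compiler from folding or moving the computation
  // across the calls that change the floating-point environment.
  volatile double Input = X;
  volatile double Result = std::nearbyint(Input);
  std::fesetround(Saved);
  return Result;
}
```

A call such as `roundWithMode(2.5, FE_UPWARD)` and a call that passes the same mode through an `int` variable go through exactly the same code path, which is the sense in which the mode is "just a parameter".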


https://github.com/llvm/llvm-project/pull/122313
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] 8fa5dff - [SLP] Check if instructions exist after vectorization (#120434)

2025-01-14 Thread Tobias Hieta via llvm-branch-commits

Author: DianQK
Date: 2025-01-14T10:39:18+01:00
New Revision: 8fa5dffc865a9c9239ec0aceb3fbc801895147b7

URL: 
https://github.com/llvm/llvm-project/commit/8fa5dffc865a9c9239ec0aceb3fbc801895147b7
DIFF: 
https://github.com/llvm/llvm-project/commit/8fa5dffc865a9c9239ec0aceb3fbc801895147b7.diff

LOG: [SLP] Check if instructions exist after vectorization (#120434)

Fixes #120433.

(cherry picked from commit e7a4d78ad328d02bf515b2fa4af8b2c188a6a636)
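
The shape of the fix can be sketched outside of LLVM as follows. This is a toy model under stated assumptions: `ToyInst`, `ToyVectorizer`, `Kills`, and `vectorizeOperands` are hypothetical names, and "deleting" an instruction only records it in a set instead of freeing it.

```cpp
#include <map>
#include <set>
#include <vector>

// Toy instruction: only its operand list matters for this sketch.
struct ToyInst {
  std::vector<ToyInst *> Operands;
};

// Toy vectorizer state. Kills maps a root to the instruction that gets
// erased as a side effect of vectorizing that root (an assumption made
// purely to trigger the situation the patch guards against).
struct ToyVectorizer {
  std::set<const ToyInst *> Deleted;
  std::map<const ToyInst *, const ToyInst *> Kills;
  int RootsTried = 0;

  bool isDeleted(const ToyInst *I) const { return Deleted.count(I) != 0; }

  bool vectorizeRoot(const ToyInst *Root) {
    ++RootsTried;
    auto It = Kills.find(Root);
    if (It == Kills.end())
      return false;
    Deleted.insert(It->second);
    return true;
  }
};

// Mirrors the patched loop: after each attempt, re-check whether the
// instruction whose operands are being walked still exists.
bool vectorizeOperands(const std::vector<ToyInst *> &Insts, ToyVectorizer &R) {
  bool Changed = false;
  for (ToyInst *I : Insts) {
    if (R.isDeleted(I))
      continue;
    for (ToyInst *Op : I->Operands) {
      Changed |= R.vectorizeRoot(Op);
      if (R.isDeleted(I))
        break; // I is gone; its remaining operands must not be visited.
    }
  }
  return Changed;
}
```

In the toy nothing is actually freed, so skipping the check would merely do extra work; in the real pass the operand walk would continue over an instruction that no longer exists.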

Added: 
llvm/test/Transforms/SLPVectorizer/slp-deleted-inst.ll

Modified: 
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp

Removed: 




diff --git a/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp b/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
index 746ba51a981fe0..fd08d5d9d7556a 100644
--- a/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
+++ b/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
@@ -18596,8 +18596,11 @@ bool SLPVectorizerPass::vectorizeCmpInsts(iterator_range CmpInsts,
     if (R.isDeleted(I))
       continue;
     for (Value *Op : I->operands())
-      if (auto *RootOp = dyn_cast<Instruction>(Op))
+      if (auto *RootOp = dyn_cast<Instruction>(Op)) {
         Changed |= vectorizeRootInstruction(nullptr, RootOp, BB, R, TTI);
+        if (R.isDeleted(I))
+          break;
+      }
   }
   // Try to vectorize operands as vector bundles.
   for (CmpInst *I : CmpInsts) {

diff --git a/llvm/test/Transforms/SLPVectorizer/slp-deleted-inst.ll b/llvm/test/Transforms/SLPVectorizer/slp-deleted-inst.ll
new file mode 100644
index 00..d3995f1bb7f85d
--- /dev/null
+++ b/llvm/test/Transforms/SLPVectorizer/slp-deleted-inst.ll
@@ -0,0 +1,51 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 5
+; RUN: opt -S -passes=slp-vectorizer < %s | FileCheck %s
+
+define void @foo() {
+; CHECK-LABEL: define void @foo() {
+; CHECK-NEXT:  [[BB:.*]]:
+; CHECK-NEXT:br label %[[BB1:.*]]
+; CHECK:   [[BB1]]:
+; CHECK-NEXT:    [[TMP0:%.*]] = phi <2 x i32> [ [[TMP11:%.*]], %[[BB3:.*]] ], [ zeroinitializer, %[[BB]] ]
+; CHECK-NEXT:br label %[[BB3]]
+; CHECK:   [[BB3]]:
+; CHECK-NEXT:[[TMP1:%.*]] = trunc <2 x i32> [[TMP0]] to <2 x i1>
+; CHECK-NEXT:[[TMP2:%.*]] = mul <2 x i1> [[TMP1]], zeroinitializer
+; CHECK-NEXT:[[TMP3:%.*]] = or <2 x i1> zeroinitializer, [[TMP2]]
+; CHECK-NEXT:[[TMP4:%.*]] = and <2 x i1> [[TMP3]], zeroinitializer
+; CHECK-NEXT:[[TMP5:%.*]] = extractelement <2 x i1> [[TMP4]], i32 0
+; CHECK-NEXT:[[TMP6:%.*]] = zext i1 [[TMP5]] to i32
+; CHECK-NEXT:[[TMP7:%.*]] = extractelement <2 x i1> [[TMP4]], i32 1
+; CHECK-NEXT:[[TMP8:%.*]] = zext i1 [[TMP7]] to i32
+; CHECK-NEXT:[[I22:%.*]] = or i32 [[TMP6]], [[TMP8]]
+; CHECK-NEXT:[[TMP9:%.*]] = insertelement <2 x i32> , 
i32 [[I22]], i32 0
+; CHECK-NEXT:[[TMP10:%.*]] = icmp ult <2 x i32> [[TMP9]], zeroinitializer
+; CHECK-NEXT:    [[TMP11]] = select <2 x i1> [[TMP10]], <2 x i32> zeroinitializer, <2 x i32> zeroinitializer
+; CHECK-NEXT:br label %[[BB1]]
+;
+bb:
+  br label %bb1
+
+bb1:  ; preds = %bb3, %bb
+  %i = phi i32 [ %i26, %bb3 ], [ 0, %bb ]
+  %i2 = phi i32 [ %i24, %bb3 ], [ 0, %bb ]
+  br label %bb3
+
+bb3:  ; preds = %bb1
+  %i4 = zext i32 %i2 to i64
+  %i5 = mul i64 %i4, 0
+  %i10 = or i64 0, %i5
+  %i11 = trunc i64 %i10 to i32
+  %i12 = and i32 %i11, 0
+  %i13 = zext i32 %i to i64
+  %i14 = mul i64 %i13, 0
+  %i19 = or i64 0, %i14
+  %i20 = trunc i64 %i19 to i32
+  %i21 = and i32 %i20, 0
+  %i22 = or i32 %i12, %i21
+  %i23 = icmp ult i32 %i22, 0
+  %i24 = select i1 %i23, i32 0, i32 0
+  %i25 = icmp ult i32 0, 0
+  %i26 = select i1 %i25, i32 0, i32 0
+  br label %bb1
+}





[llvm-branch-commits] [llvm] [SLP] Check if instructions exist after vectorization (#120434) (PR #120505)

2025-01-14 Thread Tobias Hieta via llvm-branch-commits

https://github.com/tru updated https://github.com/llvm/llvm-project/pull/120505

>From 8fa5dffc865a9c9239ec0aceb3fbc801895147b7 Mon Sep 17 00:00:00 2001
From: DianQK 
Date: Thu, 19 Dec 2024 06:21:57 +0800
Subject: [PATCH] [SLP] Check if instructions exist after vectorization
 (#120434)

Fixes #120433.

(cherry picked from commit e7a4d78ad328d02bf515b2fa4af8b2c188a6a636)
---
 .../Transforms/Vectorize/SLPVectorizer.cpp|  5 +-
 .../SLPVectorizer/slp-deleted-inst.ll | 51 +++
 2 files changed, 55 insertions(+), 1 deletion(-)
 create mode 100644 llvm/test/Transforms/SLPVectorizer/slp-deleted-inst.ll




[llvm-branch-commits] [clang] f1b37b6 - Fix std::initializer_list recognition if it's exported out of a module

2025-01-14 Thread Tobias Hieta via llvm-branch-commits

Author: Artsiom Drapun
Date: 2025-01-14T07:24:48+01:00
New Revision: f1b37b6665b5ea2b8962f11fb6d034f77a7bbd36

URL: 
https://github.com/llvm/llvm-project/commit/f1b37b6665b5ea2b8962f11fb6d034f77a7bbd36
DIFF: 
https://github.com/llvm/llvm-project/commit/f1b37b6665b5ea2b8962f11fb6d034f77a7bbd36.diff

LOG: Fix std::initializer_list recognition if it's exported out of a module

- Add implementation
- Add a regression test
- Add release notes
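
The implementation change swaps `getDeclContext()` for `getNonTransparentDeclContext()` when checking that the template lives in namespace `std`. The sketch below models that lookup with hypothetical toy types (`ToyContextKind`, `ToyDeclContext`, `nearestNonTransparent`); it is not Clang code and assumes only that linkage specifications and export declarations are the transparent contexts of interest.

```cpp
// Kinds of enclosing context in this toy model.
enum class ToyContextKind { TranslationUnit, Namespace, LinkageSpec, Export };

// A context with a link to its lexical parent (nullptr at the top).
struct ToyDeclContext {
  ToyContextKind Kind;
  const ToyDeclContext *Parent;
};

// extern "C++" { ... } and export ... wrap declarations without
// introducing a new scope, so name lookup should look through them.
bool isTransparent(const ToyDeclContext &DC) {
  return DC.Kind == ToyContextKind::LinkageSpec ||
         DC.Kind == ToyContextKind::Export;
}

// Walks outward until a context that is not transparent is found.
const ToyDeclContext *nearestNonTransparent(const ToyDeclContext *DC) {
  while (DC && isTransparent(*DC))
    DC = DC->Parent;
  return DC;
}
```

For a template declared as `extern "C++" { namespace std { export template ... } }`, the immediate context is the export declaration; stepping out to the nearest non-transparent context reaches the `std` namespace, which is what the namespace check needs to see.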

Added: 

clang/test/Modules/initializer-list-recognition-through-export-and-linkage-issue-118218.cpp

Modified: 
clang/docs/ReleaseNotes.rst
clang/lib/Sema/SemaDeclCXX.cpp

Removed: 




diff  --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index 8c7a6ba70acd28..1e9845b9b9c5b2 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -1123,7 +1123,8 @@ Bug Fixes to C++ Support
 - Fixed assertion failure by skipping the analysis of an invalid field declaration. (#GH99868)
 - Fix an issue with dependent source location expressions (#GH106428), (#GH81155), (#GH80210), (#GH85373)
 - Fix handling of ``_`` as the name of a lambda's init capture variable. (#GH107024)
-
+- Fixed recognition of ``std::initializer_list`` when it's surrounded with ``extern "C++"`` and exported
+  out of a module (which is the case e.g. in MSVC's implementation of ``std`` module). (#GH118218)
 
 Bug Fixes to AST Handling
 ^^^^^^^^^^^^^^^^^^^^^^^^^

diff  --git a/clang/lib/Sema/SemaDeclCXX.cpp b/clang/lib/Sema/SemaDeclCXX.cpp
index 4e4f91de8cd5a5..18262993af283a 100644
--- a/clang/lib/Sema/SemaDeclCXX.cpp
+++ b/clang/lib/Sema/SemaDeclCXX.cpp
@@ -11919,7 +11919,7 @@ bool Sema::isStdInitializerList(QualType Ty, QualType *Element) {
     if (TemplateClass->getIdentifier() !=
             &PP.getIdentifierTable().get("initializer_list") ||
         !getStdNamespace()->InEnclosingNamespaceSetOf(
-            TemplateClass->getDeclContext()))
+            TemplateClass->getNonTransparentDeclContext()))
       return false;
     // This is a template called std::initializer_list, but is it the right
     // template?

diff --git a/clang/test/Modules/initializer-list-recognition-through-export-and-linkage-issue-118218.cpp b/clang/test/Modules/initializer-list-recognition-through-export-and-linkage-issue-118218.cpp
new file mode 100644
index 00..e2c796fb103f6c
--- /dev/null
+++ b/clang/test/Modules/initializer-list-recognition-through-export-and-linkage-issue-118218.cpp
@@ -0,0 +1,39 @@
+// RUN: rm -rf %t
+// RUN: mkdir -p %t
+// RUN: split-file %s %t
+//
+// RUN: %clang_cc1 -std=c++20 %t/std.cppm -emit-module-interface -o %t/std.pcm
+// RUN: %clang_cc1 -std=c++20 %t/mod.cppm -fprebuilt-module-path=%t -emit-module-interface -o %t/mod.pcm
+// RUN: %clang_cc1 -std=c++20 -fprebuilt-module-path=%t -verify %t/main.cpp
+
+//--- std.cppm
+export module std;
+
+extern "C++" {
+  namespace std {
+  export template 
+  class initializer_list {
+const E* _1;
+const E* _2;
+  };
+  }
+}
+
+//--- mod.cppm
+export module mod;
+
+import std;
+
+export struct A {
+  void func(std::initializer_list) {}
+};
+
+//--- main.cpp
+// expected-no-diagnostics
+import std;
+import mod;
+
+int main() {
+  A{}.func({1,1});
+  return 0;
+}





[llvm-branch-commits] [llvm] [SLP] Check if instructions exist after vectorization (#120434) (PR #120505)

2025-01-14 Thread Tobias Hieta via llvm-branch-commits

https://github.com/tru closed https://github.com/llvm/llvm-project/pull/120505


[llvm-branch-commits] [clang] Fix std::initializer_list recognition if it's exported out of a module (PR #121739)

2025-01-14 Thread Tobias Hieta via llvm-branch-commits

https://github.com/tru closed https://github.com/llvm/llvm-project/pull/121739


[llvm-branch-commits] [clang-tools-extra] [clang-tidy][NFC] refactor modernize-raw-string-literal fix hint (PR #122909)

2025-01-14 Thread Congcong Cai via llvm-branch-commits

HerrCai0907 wrote:

> [!WARNING]
> This pull request is not mergeable via GitHub because a downstack PR is
> open. Once all requirements are satisfied, merge this PR as a stack on
> Graphite:
> https://app.graphite.dev/github/pr/llvm/llvm-project/122909?utm_source=stack-comment-downstack-mergeability-warning
> Learn more: https://graphite.dev/docs/merge-pull-requests

* **#122909** 👈 (View in Graphite:
  https://app.graphite.dev/github/pr/llvm/llvm-project/122909?utm_source=stack-comment-view-in-graphite)
* **#122901**
* `main`

This stack of pull requests is managed by Graphite
(https://graphite.dev?utm-source=stack-comment). Learn more about stacking:
https://stacking.dev/?utm_source=stack-comment


https://github.com/llvm/llvm-project/pull/122909


[llvm-branch-commits] [clang-tools-extra] [clang-tidy][NFC] refactor modernize-raw-string-literal fix hint (PR #122909)

2025-01-14 Thread Congcong Cai via llvm-branch-commits

https://github.com/HerrCai0907 created 
https://github.com/llvm/llvm-project/pull/122909

None

>From c413bd7b5f8d8524335ad50a71bc986d8bdca2a8 Mon Sep 17 00:00:00 2001
From: Congcong Cai 
Date: Tue, 14 Jan 2025 22:24:46 +0800
Subject: [PATCH] [clang-tidy][NFC] refactor modernize-raw-string-literal fix
 hint

---
 .../modernize/RawStringLiteralCheck.cpp   | 105 +++---
 .../modernize/RawStringLiteralCheck.h |   4 -
 2 files changed, 62 insertions(+), 47 deletions(-)

diff --git a/clang-tools-extra/clang-tidy/modernize/RawStringLiteralCheck.cpp 
b/clang-tools-extra/clang-tidy/modernize/RawStringLiteralCheck.cpp
index bd9830278facb7..1618b5be7699d9 100644
--- a/clang-tools-extra/clang-tidy/modernize/RawStringLiteralCheck.cpp
+++ b/clang-tools-extra/clang-tidy/modernize/RawStringLiteralCheck.cpp
@@ -9,8 +9,11 @@
 #include "RawStringLiteralCheck.h"
 #include "clang/AST/ASTContext.h"
 #include "clang/ASTMatchers/ASTMatchFinder.h"
+#include "clang/Basic/LangOptions.h"
+#include "clang/Basic/SourceManager.h"
 #include "clang/Lex/Lexer.h"
 #include "llvm/ADT/StringRef.h"
+#include <optional>
 
 using namespace clang::ast_matchers;
 
@@ -67,20 +70,6 @@ bool containsDelimiter(StringRef Bytes, const std::string 
&Delimiter) {
 : (")" + Delimiter + R"(")")) != StringRef::npos;
 }
 
-std::string asRawStringLiteral(const StringLiteral *Literal,
-   const std::string &DelimiterStem) {
-  const StringRef Bytes = Literal->getBytes();
-  std::string Delimiter;
-  for (int I = 0; containsDelimiter(Bytes, Delimiter); ++I) {
-Delimiter = (I == 0) ? DelimiterStem : DelimiterStem + std::to_string(I);
-  }
-
-  if (Delimiter.empty())
-return (R"(R"()" + Bytes + R"lit()")lit").str();
-
-  return (R"(R")" + Delimiter + "(" + Bytes + ")" + Delimiter + R"(")").str();
-}
-
 } // namespace
 
 RawStringLiteralCheck::RawStringLiteralCheck(StringRef Name,
@@ -120,43 +109,73 @@ void RawStringLiteralCheck::registerMatchers(MatchFinder *Finder) {
       stringLiteral(unless(hasParent(predefinedExpr()))).bind("lit"), this);
 }
 
-void RawStringLiteralCheck::check(const MatchFinder::MatchResult &Result) {
-  const auto *Literal = Result.Nodes.getNodeAs<StringLiteral>("lit");
-  if (Literal->getBeginLoc().isMacroID())
-return;
-
-  if (containsEscapedCharacters(Result, Literal, DisallowedChars)) {
-std::string Replacement = asRawStringLiteral(Literal, DelimiterStem);
-if (ReplaceShorterLiterals ||
-Replacement.length() <=
-Lexer::MeasureTokenLength(Literal->getBeginLoc(),
-  *Result.SourceManager, getLangOpts()))
-  replaceWithRawStringLiteral(Result, Literal, Replacement);
-  }
-}
-
-void RawStringLiteralCheck::replaceWithRawStringLiteral(
-const MatchFinder::MatchResult &Result, const StringLiteral *Literal,
-std::string Replacement) {
-  DiagnosticBuilder Builder =
-  diag(Literal->getBeginLoc(),
-   "escaped string literal can be written as a raw string literal");
-  const SourceManager &SM = *Result.SourceManager;
+static std::optional<StringRef>
+createUserDefinedSuffix(const StringLiteral *Literal, const SourceManager &SM,
+                        const LangOptions &LangOpts) {
   const CharSourceRange TokenRange =
   CharSourceRange::getTokenRange(Literal->getSourceRange());
   Token T;
-  if (Lexer::getRawToken(Literal->getBeginLoc(), T, SM, getLangOpts()))
-return;
+  if (Lexer::getRawToken(Literal->getBeginLoc(), T, SM, LangOpts))
+return std::nullopt;
   const CharSourceRange CharRange =
-  Lexer::makeFileCharRange(TokenRange, SM, getLangOpts());
+  Lexer::makeFileCharRange(TokenRange, SM, LangOpts);
   if (T.hasUDSuffix()) {
-StringRef Text = Lexer::getSourceText(CharRange, SM, getLangOpts());
+StringRef Text = Lexer::getSourceText(CharRange, SM, LangOpts);
 const size_t UDSuffixPos = Text.find_last_of('"');
 if (UDSuffixPos == StringRef::npos)
-  return;
-Replacement += Text.slice(UDSuffixPos + 1, Text.size());
+  return std::nullopt;
+return Text.slice(UDSuffixPos + 1, Text.size());
+  }
+  return std::nullopt;
+}
+
+static std::string createRawStringLiteral(const StringLiteral *Literal,
+  const std::string &DelimiterStem,
+  const SourceManager &SM,
+  const LangOptions &LangOpts) {
+  const StringRef Bytes = Literal->getBytes();
+  std::string Delimiter;
+  for (int I = 0; containsDelimiter(Bytes, Delimiter); ++I) {
+Delimiter = (I == 0) ? DelimiterStem : DelimiterStem + std::to_string(I);
+  }
+
+  std::optional<StringRef> UserDefinedSuffix =
+      createUserDefinedSuffix(Literal, SM, LangOpts);
+
+  if (Delimiter.empty())
+return (R"(R"()" + Bytes + R"lit()")lit" + UserDefinedSuffix.value_or(""))
+.str();
+
+  return (R"(R")" + Delimiter + "(" + Bytes + ")" + Delimiter + R"(")" +
+  UserDefinedSuffix.valu

[llvm-branch-commits] [clang-tools-extra] [clang-tidy][NFC] refactor modernize-raw-string-literal fix hint (PR #122909)

2025-01-14 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-clang-tidy

Author: Congcong Cai (HerrCai0907)


Changes



---
Full diff: https://github.com/llvm/llvm-project/pull/122909.diff


2 Files Affected:

- (modified) clang-tools-extra/clang-tidy/modernize/RawStringLiteralCheck.cpp 
(+62-43) 
- (modified) clang-tools-extra/clang-tidy/modernize/RawStringLiteralCheck.h 
(-4) 


``diff
diff --git a/clang-tools-extra/clang-tidy/modernize/RawStringLiteralCheck.cpp 
b/clang-tools-extra/clang-tidy/modernize/RawStringLiteralCheck.cpp
index bd9830278facb7..1618b5be7699d9 100644
--- a/clang-tools-extra/clang-tidy/modernize/RawStringLiteralCheck.cpp
+++ b/clang-tools-extra/clang-tidy/modernize/RawStringLiteralCheck.cpp
@@ -9,8 +9,11 @@
 #include "RawStringLiteralCheck.h"
 #include "clang/AST/ASTContext.h"
 #include "clang/ASTMatchers/ASTMatchFinder.h"
+#include "clang/Basic/LangOptions.h"
+#include "clang/Basic/SourceManager.h"
 #include "clang/Lex/Lexer.h"
 #include "llvm/ADT/StringRef.h"
+#include <optional>
 
 using namespace clang::ast_matchers;
 
@@ -67,20 +70,6 @@ bool containsDelimiter(StringRef Bytes, const std::string 
&Delimiter) {
 : (")" + Delimiter + R"(")")) != StringRef::npos;
 }
 
-std::string asRawStringLiteral(const StringLiteral *Literal,
-   const std::string &DelimiterStem) {
-  const StringRef Bytes = Literal->getBytes();
-  std::string Delimiter;
-  for (int I = 0; containsDelimiter(Bytes, Delimiter); ++I) {
-Delimiter = (I == 0) ? DelimiterStem : DelimiterStem + std::to_string(I);
-  }
-
-  if (Delimiter.empty())
-return (R"(R"()" + Bytes + R"lit()")lit").str();
-
-  return (R"(R")" + Delimiter + "(" + Bytes + ")" + Delimiter + R"(")").str();
-}
-
 } // namespace
 
 RawStringLiteralCheck::RawStringLiteralCheck(StringRef Name,
@@ -120,43 +109,73 @@ void RawStringLiteralCheck::registerMatchers(MatchFinder *Finder) {
       stringLiteral(unless(hasParent(predefinedExpr()))).bind("lit"), this);
 }
 
-void RawStringLiteralCheck::check(const MatchFinder::MatchResult &Result) {
-  const auto *Literal = Result.Nodes.getNodeAs<StringLiteral>("lit");
-  if (Literal->getBeginLoc().isMacroID())
-return;
-
-  if (containsEscapedCharacters(Result, Literal, DisallowedChars)) {
-std::string Replacement = asRawStringLiteral(Literal, DelimiterStem);
-if (ReplaceShorterLiterals ||
-Replacement.length() <=
-Lexer::MeasureTokenLength(Literal->getBeginLoc(),
-  *Result.SourceManager, getLangOpts()))
-  replaceWithRawStringLiteral(Result, Literal, Replacement);
-  }
-}
-
-void RawStringLiteralCheck::replaceWithRawStringLiteral(
-const MatchFinder::MatchResult &Result, const StringLiteral *Literal,
-std::string Replacement) {
-  DiagnosticBuilder Builder =
-  diag(Literal->getBeginLoc(),
-   "escaped string literal can be written as a raw string literal");
-  const SourceManager &SM = *Result.SourceManager;
+static std::optional<StringRef>
+createUserDefinedSuffix(const StringLiteral *Literal, const SourceManager &SM,
+                        const LangOptions &LangOpts) {
   const CharSourceRange TokenRange =
   CharSourceRange::getTokenRange(Literal->getSourceRange());
   Token T;
-  if (Lexer::getRawToken(Literal->getBeginLoc(), T, SM, getLangOpts()))
-return;
+  if (Lexer::getRawToken(Literal->getBeginLoc(), T, SM, LangOpts))
+return std::nullopt;
   const CharSourceRange CharRange =
-  Lexer::makeFileCharRange(TokenRange, SM, getLangOpts());
+  Lexer::makeFileCharRange(TokenRange, SM, LangOpts);
   if (T.hasUDSuffix()) {
-StringRef Text = Lexer::getSourceText(CharRange, SM, getLangOpts());
+StringRef Text = Lexer::getSourceText(CharRange, SM, LangOpts);
 const size_t UDSuffixPos = Text.find_last_of('"');
 if (UDSuffixPos == StringRef::npos)
-  return;
-Replacement += Text.slice(UDSuffixPos + 1, Text.size());
+  return std::nullopt;
+return Text.slice(UDSuffixPos + 1, Text.size());
+  }
+  return std::nullopt;
+}
+
+static std::string createRawStringLiteral(const StringLiteral *Literal,
+  const std::string &DelimiterStem,
+  const SourceManager &SM,
+  const LangOptions &LangOpts) {
+  const StringRef Bytes = Literal->getBytes();
+  std::string Delimiter;
+  for (int I = 0; containsDelimiter(Bytes, Delimiter); ++I) {
+Delimiter = (I == 0) ? DelimiterStem : DelimiterStem + std::to_string(I);
+  }
+
+  std::optional<StringRef> UserDefinedSuffix =
+      createUserDefinedSuffix(Literal, SM, LangOpts);
+
+  if (Delimiter.empty())
+return (R"(R"()" + Bytes + R"lit()")lit" + UserDefinedSuffix.value_or(""))
+.str();
+
+  return (R"(R")" + Delimiter + "(" + Bytes + ")" + Delimiter + R"(")" +
+  UserDefinedSuffix.value_or(""))
+  .str();
+}
+
+static bool compareStringLength(StringRef Replacement,
+

[llvm-branch-commits] [clang-tools-extra] [clang-tidy][NFC] refactor modernize-raw-string-literal fix hint (PR #122909)

2025-01-14 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-clang-tools-extra

Author: Congcong Cai (HerrCai0907)


Changes



---
Full diff: https://github.com/llvm/llvm-project/pull/122909.diff


2 Files Affected:

- (modified) clang-tools-extra/clang-tidy/modernize/RawStringLiteralCheck.cpp 
(+62-43) 
- (modified) clang-tools-extra/clang-tidy/modernize/RawStringLiteralCheck.h 
(-4) 



[llvm-branch-commits] [flang] [llvm] [flang][OpenMP] Parse WHEN, OTHERWISE, MATCH clauses plus METADIRECTIVE (PR #121817)

2025-01-14 Thread Krzysztof Parzyszek via llvm-branch-commits

https://github.com/kparzysz updated 
https://github.com/llvm/llvm-project/pull/121817

>From fe3ec47965d5f970e26f9f729a21b61acf366053 Mon Sep 17 00:00:00 2001
From: Krzysztof Parzyszek 
Date: Thu, 12 Dec 2024 15:26:26 -0600
Subject: [PATCH] [flang][OpenMP] Parse WHEN, OTHERWISE, MATCH clauses plus
 METADIRECTIVE

Parse METADIRECTIVE as a standalone executable directive at the moment.
This will allow testing the parser code.

There is no lowering, not even clause conversion yet. There is also no
verification of the allowed values for trait sets, trait properties.
---
 flang/include/flang/Parser/dump-parse-tree.h |   6 +
 flang/include/flang/Parser/parse-tree.h  |  49 -
 flang/lib/Lower/OpenMP/Clauses.cpp   |  21 +-
 flang/lib/Lower/OpenMP/Clauses.h |   1 +
 flang/lib/Lower/OpenMP/OpenMP.cpp|   5 +
 flang/lib/Parser/openmp-parsers.cpp  |  29 ++-
 flang/lib/Parser/unparse.cpp |  16 ++
 flang/lib/Semantics/check-omp-structure.cpp  |  30 +++
 flang/lib/Semantics/check-omp-structure.h|  12 +-
 flang/lib/Semantics/resolve-directives.cpp   |  11 ++
 flang/test/Parser/OpenMP/metadirective.f90   | 196 +++
 llvm/include/llvm/Frontend/OpenMP/OMP.td |   9 +-
 12 files changed, 377 insertions(+), 8 deletions(-)
 create mode 100644 flang/test/Parser/OpenMP/metadirective.f90

diff --git a/flang/include/flang/Parser/dump-parse-tree.h b/flang/include/flang/Parser/dump-parse-tree.h
index 49eeed0e7b4393..1323fd695d4439 100644
--- a/flang/include/flang/Parser/dump-parse-tree.h
+++ b/flang/include/flang/Parser/dump-parse-tree.h
@@ -476,6 +476,12 @@ class ParseTreeDumper {
   NODE(parser, NullInit)
   NODE(parser, ObjectDecl)
   NODE(parser, OldParameterStmt)
+  NODE(parser, OmpMetadirectiveDirective)
+  NODE(parser, OmpMatchClause)
+  NODE(parser, OmpOtherwiseClause)
+  NODE(parser, OmpWhenClause)
+  NODE(OmpWhenClause, Modifier)
+  NODE(parser, OmpDirectiveSpecification)
   NODE(parser, OmpTraitPropertyName)
   NODE(parser, OmpTraitScore)
   NODE(parser, OmpTraitPropertyExtension)
diff --git a/flang/include/flang/Parser/parse-tree.h b/flang/include/flang/Parser/parse-tree.h
index f8175ea1de679e..db19608a8491ee 100644
--- a/flang/include/flang/Parser/parse-tree.h
+++ b/flang/include/flang/Parser/parse-tree.h
@@ -3456,6 +3456,14 @@ WRAPPER_CLASS(PauseStmt, std::optional);
 struct OmpClause;
 struct OmpClauseList;
 
+struct OmpDirectiveSpecification {
+  TUPLE_CLASS_BOILERPLATE(OmpDirectiveSpecification);
+  std::tuple>>
+  t;
+  CharBlock source;
+};
+
 // 2.1 Directives or clauses may accept a list or extended-list.
 // A list item is a variable, array section or common block name (enclosed
 // in slashes). An extended list item is a list item or a procedure Name.
@@ -3964,6 +3972,7 @@ struct OmpBindClause {
 // data-sharing-attribute ->
 //SHARED | NONE |   // since 4.5
 //PRIVATE | FIRSTPRIVATE// since 5.0
+// See also otherwise-clause.
 struct OmpDefaultClause {
   ENUM_CLASS(DataSharingAttribute, Private, Firstprivate, Shared, None)
   WRAPPER_CLASS_BOILERPLATE(OmpDefaultClause, DataSharingAttribute);
@@ -4184,6 +4193,16 @@ struct OmpMapClause {
   std::tuple t;
 };
 
+// Ref: [5.0:58-60], [5.1:63-68], [5.2:194-195]
+//
+// match-clause ->
+//MATCH (context-selector-specification)// since 5.0
+struct OmpMatchClause {
+  // The context-selector is an argument.
+  WRAPPER_CLASS_BOILERPLATE(
+  OmpMatchClause, traits::OmpContextSelectorSpecification);
+};
+
 // Ref: [5.2:217-218]
 // message-clause ->
 //MESSAGE("message-text")
@@ -4214,6 +4233,17 @@ struct OmpOrderClause {
   std::tuple t;
 };
 
+// Ref: [5.0:56-57], [5.1:60-62], [5.2:191]
+//
+// otherwise-clause ->
+//DEFAULT ([directive-specification])   // since 5.0, until 5.1
+// otherwise-clause ->
+//OTHERWISE ([directive-specification])]// since 5.2
+struct OmpOtherwiseClause {
+  WRAPPER_CLASS_BOILERPLATE(
+  OmpOtherwiseClause, std::optional);
+};
+
 // Ref: [4.5:46-50], [5.0:74-78], [5.1:92-96], [5.2:229-230]
 //
 // proc-bind-clause ->
@@ -4299,6 +4329,17 @@ struct OmpUpdateClause {
   std::variant u;
 };
 
+// Ref: [5.0:56-57], [5.1:60-62], [5.2:190-191]
+//
+// when-clause ->
+//WHEN (context-selector :
+//[directive-specification])// since 5.0
+struct OmpWhenClause {
+  TUPLE_CLASS_BOILERPLATE(OmpWhenClause);
+  MODIFIER_BOILERPLATE(OmpContextSelector);
+  std::tuple> t;
+};
+
 // OpenMP Clauses
 struct OmpClause {
   UNION_CLASS_BOILERPLATE(OmpClause);
@@ -4323,6 +4364,12 @@ struct OmpClauseList {
 
 // --- Directives and constructs
 
+struct OmpMetadirectiveDirective {
+  TUPLE_CLASS_BOILERPLATE(OmpMetadirectiveDirective);
+  std::tuple t;
+  CharBlock source;
+};
+
 // Ref: [5.1:89-90], [5.2:216]
 //
 // nothing-directive ->
@@ -4696,7 +4743,7 @@ struct OpenMPStandaloneConstruct {
   CharBlock source;
   std::variant
+ 

[llvm-branch-commits] [llvm] [mlir] [OMPIRBuilder][MLIR] Add support for target 'if' clause (PR #122478)

2025-01-14 Thread Sergio Afonso via llvm-branch-commits

https://github.com/skatrak updated 
https://github.com/llvm/llvm-project/pull/122478

>From 8c348ba2796e08d45fe167d52db0fe047eaafa8a Mon Sep 17 00:00:00 2001
From: Sergio Afonso 
Date: Fri, 10 Jan 2025 15:40:05 +
Subject: [PATCH] [OMPIRBuilder][MLIR] Add support for target 'if' clause

This patch implements support for handling the 'if' clause of OpenMP 'target'
constructs in the OMPIRBuilder and updates MLIR to LLVM IR translation of the
`omp.target` MLIR operation to make use of this new feature.
---
 .../llvm/Frontend/OpenMP/OMPIRBuilder.h   |  14 +-
 llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp | 188 ++
 .../Frontend/OpenMPIRBuilderTest.cpp  |  26 +--
 .../OpenMP/OpenMPToLLVMIRTranslation.cpp  |  11 +-
 mlir/test/Target/LLVMIR/omptarget-if.mlir |  68 +++
 mlir/test/Target/LLVMIR/openmp-todo.mlir  |  11 -
 6 files changed, 200 insertions(+), 118 deletions(-)
 create mode 100644 mlir/test/Target/LLVMIR/omptarget-if.mlir

diff --git a/llvm/include/llvm/Frontend/OpenMP/OMPIRBuilder.h b/llvm/include/llvm/Frontend/OpenMP/OMPIRBuilder.h
index 7eceec3d8cf8f5..6b6e5bc19d95a4 100644
--- a/llvm/include/llvm/Frontend/OpenMP/OMPIRBuilder.h
+++ b/llvm/include/llvm/Frontend/OpenMP/OMPIRBuilder.h
@@ -2994,27 +2994,29 @@ class OpenMPIRBuilder {
   /// \param Loc where the target data construct was encountered.
   /// \param IsOffloadEntry whether it is an offload entry.
   /// \param CodeGenIP The insertion point where the call to the outlined
-  /// function should be emitted.
+  ///function should be emitted.
   /// \param EntryInfo The entry information about the function.
   /// \param DefaultAttrs Structure containing the default attributes, including
   ///numbers of threads and teams to launch the kernel with.
   /// \param RuntimeAttrs Structure containing the runtime numbers of threads
   ///and teams to launch the kernel with.
+  /// \param IfCond value of the `if` clause.
   /// \param Inputs The input values to the region that will be passed.
-  /// as arguments to the outlined function.
+  ///as arguments to the outlined function.
   /// \param BodyGenCB Callback that will generate the region code.
   /// \param ArgAccessorFuncCB Callback that will generate accessors
-  /// instructions for passed in target arguments where neccessary
+  ///instructions for passed in target arguments where neccessary
   /// \param Dependencies A vector of DependData objects that carry
-  // dependency information as passed in the depend clause
-  // \param HasNowait Whether the target construct has a `nowait` clause or not.
+  ///dependency information as passed in the depend clause
+  /// \param HasNowait Whether the target construct has a `nowait` clause or
+  ///not.
   InsertPointOrErrorTy createTarget(
   const LocationDescription &Loc, bool IsOffloadEntry,
   OpenMPIRBuilder::InsertPointTy AllocaIP,
   OpenMPIRBuilder::InsertPointTy CodeGenIP,
   TargetRegionEntryInfo &EntryInfo,
   const TargetKernelDefaultAttrs &DefaultAttrs,
-  const TargetKernelRuntimeAttrs &RuntimeAttrs,
+  const TargetKernelRuntimeAttrs &RuntimeAttrs, Value *IfCond,
   SmallVectorImpl &Inputs, GenMapInfoCallbackTy GenMapInfoCB,
   TargetBodyGenCallbackTy BodyGenCB,
   TargetGenArgAccessorsCallbackTy ArgAccessorFuncCB,
diff --git a/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp b/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
index 3d461f0ad4228c..d29e22c762bd4f 100644
--- a/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
+++ b/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
@@ -7340,7 +7340,7 @@ emitTargetCall(OpenMPIRBuilder &OMPBuilder, IRBuilderBase &Builder,
OpenMPIRBuilder::InsertPointTy AllocaIP,
const OpenMPIRBuilder::TargetKernelDefaultAttrs &DefaultAttrs,
const OpenMPIRBuilder::TargetKernelRuntimeAttrs &RuntimeAttrs,
-   Function *OutlinedFn, Constant *OutlinedFnID,
+   Value *IfCond, Function *OutlinedFn, Constant *OutlinedFnID,
SmallVectorImpl &Args,
OpenMPIRBuilder::GenMapInfoCallbackTy GenMapInfoCB,
SmallVector Dependencies = {},
@@ -7386,9 +7386,9 @@ emitTargetCall(OpenMPIRBuilder &OMPBuilder, IRBuilderBase &Builder,
 return Error::success();
   };
 
-  // If we don't have an ID for the target region, it means an offload entry
-  // wasn't created. In this case we just run the host fallback directly.
-  if (!OutlinedFnID) {
+  auto &&EmitTargetCallElse =
+  [&](OpenMPIRBuilder::InsertPointTy AllocaIP,
+  OpenMPIRBuilder::InsertPointTy CodeGenIP) -> Error {
 // Assume no error was returned because EmitTargetCallFallbackCB doesn't
 // produce any.
 OpenMPIRBuilder::InsertPointTy AfterIP = cantFail([&]() {
@@ -7404,102 +7404,124 @@ emitTargetCall(OpenMPIRBuilder &OMPBuilder, IRBuilderBase &Builder,
 }());
 
 Builder.restoreIP(AfterIP);
-return;
-  

[llvm-branch-commits] [llvm] DAG: Avoid forming shufflevector from a single extract_vector_elt (PR #122672)

2025-01-14 Thread Matt Arsenault via llvm-branch-commits

arsenm wrote:

> I don't think the heuristic here is quite what you want. I believe this 
> heuristic disables both of the following cases:
> 
> * BuildVector w/one non-zero non-undef element
> * BuildVector w/one non-zero non-undef source, repeated 100 times (i.e. splat 
> or select of two splats)
> 
> Disabling the former seems defensible, doing so for the second less so.

Depends if isExtractVecEltCheap is true or not for the non-zero index.

> * BuildVector w/one non-zero non-undef source, repeated 100 times (i.e. splat 
> or select of two splats)

I don't follow, this is a 2 element vector, how can you have 100 variants?

>  If the target isn't optimally lowering the splat or select of splat case in 
> the shuffle lowering, maybe we should just adjust the target lowering to do 
> so?

It's not a lowering issue, it's the effect on every other combine. We'd have to 
special case 1 element + 1 undef shuffles everywhere we handle 
extract_vector_elt now, which is just excessive complexity. #122671 is almost 
an alternative in one instance, but still shows expanding complexity of 
handling this edge case. 

https://github.com/llvm/llvm-project/pull/122672
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] Add documentation for Multilib custom flags (PR #114998)

2025-01-14 Thread Sam Elliott via llvm-branch-commits

lenary wrote:

I'm still happy with this, and it is for docs, so I don't think the barrier to 
landing it is very high. 

https://github.com/llvm/llvm-project/pull/114998


[llvm-branch-commits] [llvm] DAG: Avoid forming shufflevector from a single extract_vector_elt (PR #122672)

2025-01-14 Thread Matt Arsenault via llvm-branch-commits

https://github.com/arsenm updated 
https://github.com/llvm/llvm-project/pull/122672

>From acdbd89a32c585668dc6ad9797a9b7f578f84776 Mon Sep 17 00:00:00 2001
From: Matt Arsenault 
Date: Fri, 10 Jan 2025 21:13:09 +0700
Subject: [PATCH] DAG: Avoid forming shufflevector from a single
 extract_vector_elt

This avoids regressions in a future AMDGPU commit. Previously we
would have a build_vector (extract_vector_elt x), undef with free
access to the elements bloated into a shuffle of one element + undef,
which has much worse combine support than the extract.

Alternatively could check aggressivelyPreferBuildVectorSources, but
I'm not sure it's really different than isExtractVecEltCheap.
---
 llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp |  25 ++-
 .../CodeGen/AMDGPU/insert_vector_dynelt.ll|  10 +-
 llvm/test/CodeGen/X86/avx512-build-vector.ll  |   8 +-
 .../X86/avx512-shuffles/partial_permute.ll| 157 ++
 .../CodeGen/X86/insertelement-duplicates.ll   |  10 +-
 llvm/test/CodeGen/X86/sse-align-12.ll |   4 +-
 6 files changed, 123 insertions(+), 91 deletions(-)

diff --git a/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp b/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
index 51381b85a5e1b6..5c10cf400d8bc9 100644
--- a/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
@@ -23799,6 +23799,10 @@ SDValue DAGCombiner::reduceBuildVecToShuffle(SDNode *N) {
   SmallVector VecIn;
   VecIn.push_back(SDValue());
 
+  // If we have a single extract_element with a constant index, track the index
+  // value.
+  unsigned OneConstExtractIndex = ~0u;
+
   for (unsigned i = 0; i != NumElems; ++i) {
 SDValue Op = N->getOperand(i);
 
@@ -23816,16 +23820,18 @@ SDValue DAGCombiner::reduceBuildVecToShuffle(SDNode *N) {
 
 // Not an undef or zero. If the input is something other than an
 // EXTRACT_VECTOR_ELT with an in-range constant index, bail out.
-if (Op.getOpcode() != ISD::EXTRACT_VECTOR_ELT ||
-!isa(Op.getOperand(1)))
+if (Op.getOpcode() != ISD::EXTRACT_VECTOR_ELT)
   return SDValue();
-SDValue ExtractedFromVec = Op.getOperand(0);
 
+SDValue ExtractedFromVec = Op.getOperand(0);
 if (ExtractedFromVec.getValueType().isScalableVector())
   return SDValue();
+auto *ExtractIdx = dyn_cast(Op.getOperand(1));
+if (!ExtractIdx)
+  return SDValue();
 
-const APInt &ExtractIdx = Op.getConstantOperandAPInt(1);
-if (ExtractIdx.uge(ExtractedFromVec.getValueType().getVectorNumElements()))
+if (ExtractIdx->getAsAPIntVal().uge(
+ExtractedFromVec.getValueType().getVectorNumElements()))
   return SDValue();
 
 // All inputs must have the same element type as the output.
@@ -23833,6 +23839,8 @@ SDValue DAGCombiner::reduceBuildVecToShuffle(SDNode *N) {
 ExtractedFromVec.getValueType().getVectorElementType())
   return SDValue();
 
+OneConstExtractIndex = ExtractIdx->getZExtValue();
+
 // Have we seen this input vector before?
 // The vectors are expected to be tiny (usually 1 or 2 elements), so using
 // a map back from SDValues to numbers isn't worth it.
@@ -23855,6 +23863,13 @@ SDValue DAGCombiner::reduceBuildVecToShuffle(SDNode *N) {
   // VecIn accordingly.
   bool DidSplitVec = false;
   if (VecIn.size() == 2) {
+// If we only found a single constant indexed extract_vector_elt feeding the
+// build_vector, do not produce a more complicated shuffle if the extract is
+// cheap.
+if (TLI.isOperationLegalOrCustom(ISD::EXTRACT_VECTOR_ELT, VT) &&
+TLI.isExtractVecEltCheap(VT, OneConstExtractIndex))
+  return SDValue();
+
 unsigned MaxIndex = 0;
 unsigned NearestPow2 = 0;
 SDValue Vec = VecIn.back();
diff --git a/llvm/test/CodeGen/AMDGPU/insert_vector_dynelt.ll b/llvm/test/CodeGen/AMDGPU/insert_vector_dynelt.ll
index 7912d1cf8dc0d1..add8c0f75bf335 100644
--- a/llvm/test/CodeGen/AMDGPU/insert_vector_dynelt.ll
+++ b/llvm/test/CodeGen/AMDGPU/insert_vector_dynelt.ll
@@ -452,11 +452,11 @@ define amdgpu_kernel void @byte8_inselt(ptr addrspace(1) %out, <8 x i8> %vec, i3
 ; GCN-NEXT:s_and_b32 s6, s4, 0x1010101
 ; GCN-NEXT:s_andn2_b64 s[2:3], s[2:3], s[4:5]
 ; GCN-NEXT:s_or_b64 s[2:3], s[6:7], s[2:3]
-; GCN-NEXT:v_mov_b32_e32 v3, s1
-; GCN-NEXT:v_mov_b32_e32 v0, s2
-; GCN-NEXT:v_mov_b32_e32 v1, s3
-; GCN-NEXT:v_mov_b32_e32 v2, s0
-; GCN-NEXT:flat_store_dwordx2 v[2:3], v[0:1]
+; GCN-NEXT:v_mov_b32_e32 v0, s0
+; GCN-NEXT:v_mov_b32_e32 v2, s2
+; GCN-NEXT:v_mov_b32_e32 v1, s1
+; GCN-NEXT:v_mov_b32_e32 v3, s3
+; GCN-NEXT:flat_store_dwordx2 v[0:1], v[2:3]
 ; GCN-NEXT:s_endpgm
 entry:
   %v = insertelement <8 x i8> %vec, i8 1, i32 %sel
diff --git a/llvm/test/CodeGen/X86/avx512-build-vector.ll b/llvm/test/CodeGen/X86/avx512-build-vector.ll
index b21a0c4e36c2bd..27cb3eb406e9e8 100644
--- a/llvm/test/CodeGen/X86/avx512-build-vector.ll
+++ b/llvm/test/CodeGen/X86/avx512-build-

[llvm-branch-commits] [llvm] AMDGPU: Implement isExtractVecEltCheap (PR #122460)

2025-01-14 Thread Matt Arsenault via llvm-branch-commits

https://github.com/arsenm updated 
https://github.com/llvm/llvm-project/pull/122460

>From ea6a8ce50fbd64222bd897080eed662aaba15e43 Mon Sep 17 00:00:00 2001
From: Matt Arsenault 
Date: Fri, 10 Jan 2025 14:57:24 +0700
Subject: [PATCH] AMDGPU: Implement isExtractVecEltCheap

Once again we have excessive TLI hooks with bad defaults. Permit this
for 32-bit element vectors, which are just use-different-register.
We should permit 16-bit vectors as cheap with legal packed instructions,
but I see some mixed improvements and regressions that need investigation.
---
 llvm/lib/Target/AMDGPU/SIISelLowering.cpp |  7 +
 llvm/lib/Target/AMDGPU/SIISelLowering.h   |  1 +
 llvm/test/CodeGen/AMDGPU/mad-mix.ll   | 12 -
 llvm/test/CodeGen/AMDGPU/packed-fp32.ll   | 32 +++
 llvm/test/CodeGen/AMDGPU/trunc-combine.ll |  9 ---
 5 files changed, 45 insertions(+), 16 deletions(-)

diff --git a/llvm/lib/Target/AMDGPU/SIISelLowering.cpp b/llvm/lib/Target/AMDGPU/SIISelLowering.cpp
index 69dca988b2cad9..88f84d2b2ced47 100644
--- a/llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+++ b/llvm/lib/Target/AMDGPU/SIISelLowering.cpp
@@ -1949,6 +1949,13 @@ bool SITargetLowering::isExtractSubvectorCheap(EVT ResVT, EVT SrcVT,
   return Index == 0;
 }
 
+bool SITargetLowering::isExtractVecEltCheap(EVT VT, unsigned Index) const {
+  // TODO: This should be more aggressive, particular for 16-bit element
+  // vectors. However there are some mixed improvements and regressions.
+  EVT EltTy = VT.getVectorElementType();
+  return EltTy.getSizeInBits() % 32 == 0;
+}
+
 bool SITargetLowering::isTypeDesirableForOp(unsigned Op, EVT VT) const {
   if (Subtarget->has16BitInsts() && VT == MVT::i16) {
 switch (Op) {
diff --git a/llvm/lib/Target/AMDGPU/SIISelLowering.h b/llvm/lib/Target/AMDGPU/SIISelLowering.h
index 5c215f76552d9c..bbb96d9115a0a9 100644
--- a/llvm/lib/Target/AMDGPU/SIISelLowering.h
+++ b/llvm/lib/Target/AMDGPU/SIISelLowering.h
@@ -365,6 +365,7 @@ class SITargetLowering final : public AMDGPUTargetLowering {
 
   bool isExtractSubvectorCheap(EVT ResVT, EVT SrcVT,
unsigned Index) const override;
+  bool isExtractVecEltCheap(EVT VT, unsigned Index) const override;
 
   bool isTypeDesirableForOp(unsigned Op, EVT VT) const override;
 
diff --git a/llvm/test/CodeGen/AMDGPU/mad-mix.ll b/llvm/test/CodeGen/AMDGPU/mad-mix.ll
index b520dd1060ec8c..30e3bc3ba5da85 100644
--- a/llvm/test/CodeGen/AMDGPU/mad-mix.ll
+++ b/llvm/test/CodeGen/AMDGPU/mad-mix.ll
@@ -385,17 +385,15 @@ define <2 x float> @v_mad_mix_v2f32_shuffle(<2 x half> %src0, <2 x half> %src1,
 ; SDAG-CI:   ; %bb.0:
 ; SDAG-CI-NEXT:s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
 ; SDAG-CI-NEXT:v_cvt_f16_f32_e32 v3, v3
-; SDAG-CI-NEXT:v_cvt_f16_f32_e32 v4, v5
 ; SDAG-CI-NEXT:v_cvt_f16_f32_e32 v2, v2
-; SDAG-CI-NEXT:v_cvt_f16_f32_e32 v5, v1
+; SDAG-CI-NEXT:v_cvt_f16_f32_e32 v1, v1
 ; SDAG-CI-NEXT:v_cvt_f16_f32_e32 v0, v0
 ; SDAG-CI-NEXT:v_cvt_f32_f16_e32 v3, v3
-; SDAG-CI-NEXT:v_cvt_f32_f16_e32 v1, v4
 ; SDAG-CI-NEXT:v_cvt_f32_f16_e32 v2, v2
-; SDAG-CI-NEXT:v_cvt_f32_f16_e32 v4, v5
-; SDAG-CI-NEXT:v_cvt_f32_f16_e32 v5, v0
-; SDAG-CI-NEXT:v_mad_f32 v0, v4, v2, v1
-; SDAG-CI-NEXT:v_mac_f32_e32 v1, v5, v3
+; SDAG-CI-NEXT:v_cvt_f32_f16_e32 v1, v1
+; SDAG-CI-NEXT:v_cvt_f32_f16_e32 v4, v0
+; SDAG-CI-NEXT:v_mad_f32 v0, v1, v2, v5
+; SDAG-CI-NEXT:v_mad_f32 v1, v4, v3, v5
 ; SDAG-CI-NEXT:s_setpc_b64 s[30:31]
 ;
 ; GISEL-CI-LABEL: v_mad_mix_v2f32_shuffle:
diff --git a/llvm/test/CodeGen/AMDGPU/packed-fp32.ll b/llvm/test/CodeGen/AMDGPU/packed-fp32.ll
index 6b7eff316fe95b..0833dada43e4d5 100644
--- a/llvm/test/CodeGen/AMDGPU/packed-fp32.ll
+++ b/llvm/test/CodeGen/AMDGPU/packed-fp32.ll
@@ -549,17 +549,19 @@ bb:
   ret void
 }
 
-; GCN-LABEL: {{^}}fadd_fadd_fsub:
+; GCN-LABEL: {{^}}fadd_fadd_fsub_0:
 ; GFX900:   v_add_f32_e64 v{{[0-9]+}}, s{{[0-9]+}}, 0
 ; GFX900:   v_add_f32_e32 v{{[0-9]+}}, 0, v{{[0-9]+}}
-; PACKED-SDAG:  v_pk_add_f32 v[{{[0-9:]+}}], s[{{[0-9:]+}}], 0 op_sel_hi:[1,0]{{$}}
-; PACKED-SDAG:  v_pk_add_f32 v[{{[0-9:]+}}], v[{{[0-9:]+}}], 0 op_sel_hi:[1,0]{{$}}
+
+; PACKED-SDAG: v_add_f32_e64 v{{[0-9]+}}, s{{[0-9]+}}, 0
+; PACKED-SDAG: v_add_f32_e32 v{{[0-9]+}}, 0, v{{[0-9]+}}
+
 ; PACKED-GISEL: v_pk_add_f32 v[{{[0-9:]+}}], s[{{[0-9:]+}}], v[{{[0-9:]+}}]{{$}}
 ; PACKED-GISEL: v_pk_add_f32 v[{{[0-9:]+}}], v[{{[0-9:]+}}], s[{{[0-9:]+}}]{{$}}
-define amdgpu_kernel void @fadd_fadd_fsub(<2 x float> %arg) {
+define amdgpu_kernel void @fadd_fadd_fsub_0(<2 x float> %arg) {
 bb:
   %i12 = fadd <2 x float> zeroinitializer, %arg
-  %shift8 = shufflevector <2 x float> %i12, <2 x float> undef, <2 x i32> 
+  %shift8 = shufflevector <2 x float> %i12, <2 x float> poison, <2 x i32> 
   %i13 = fadd <2 x float> zeroinitializer, %shift8
   %i14 = shufflevector <2 x float> %arg, <2 x float> %i13, <2 x i32> 
   %i15 = fsub <2 x float> %i14, zeroinitializer
@@ -567,6 

[llvm-branch-commits] [llvm] AMDGPU: Implement isExtractVecEltCheap (PR #122460)

2025-01-14 Thread Jay Foad via llvm-branch-commits

https://github.com/jayfoad edited 
https://github.com/llvm/llvm-project/pull/122460


[llvm-branch-commits] [llvm] AMDGPU: Implement isExtractVecEltCheap (PR #122460)

2025-01-14 Thread Jay Foad via llvm-branch-commits

https://github.com/jayfoad approved this pull request.

LGTM

https://github.com/llvm/llvm-project/pull/122460


[llvm-branch-commits] [clang] [llvm] [OffloadBundler] Rework the ctor of `OffloadTargetInfo` to support AMDGPU's generic target (PR #122629)

2025-01-14 Thread Shilei Tian via llvm-branch-commits

https://github.com/shiltian edited 
https://github.com/llvm/llvm-project/pull/122629


[llvm-branch-commits] [llvm] AMDGPU: Implement isExtractVecEltCheap (PR #122460)

2025-01-14 Thread Jay Foad via llvm-branch-commits


@@ -1949,6 +1949,13 @@ bool SITargetLowering::isExtractSubvectorCheap(EVT ResVT, EVT SrcVT,
   return Index == 0;
 }
 
+bool SITargetLowering::isExtractVecEltCheap(EVT VT, unsigned Index) const {
+  // TODO: This should be more aggressive, particular for 16-bit element
+  // vectors. However there are some mixed improvements and regressions.
+  EVT EltTy = VT.getVectorElementType();
+  return EltTy.getSizeInBits() % 32 == 0;

jayfoad wrote:

@broxigarchen @Sisyph for true16 we should aim to return `EltTy.getSizeInBits() 
% 16 == 0` here.

https://github.com/llvm/llvm-project/pull/122460


[llvm-branch-commits] [llvm] DAG: Avoid forming shufflevector from a single extract_vector_elt (PR #122672)

2025-01-14 Thread Matt Arsenault via llvm-branch-commits

arsenm wrote:

> > > * BuildVector w/one non-zero non-undef source, repeated 100 times (i.e. 
> > > splat or select of two splats)
> > 
> > 
> > I don't follow, this is a 2 element vector, how can you have 100 variants?
> 
> Isn't the condition in code in terms of VecIn.size() == 2? I believe that 
> VecIn is the _unique_ input elements, right? Which is distinct from the 
> number of elements in the destination type? (Am I just misreading? I only 
> skimmed this.)



> 
> > > If the target isn't optimally lowering the splat or select of splat case 
> > > in the shuffle lowering, maybe we should just adjust the target lowering 
> > > to do so?
> > 
> > 
> > It's not a lowering issue, it's the effect on every other combine. We'd 
> > have to special case 1 element + 1 undef shuffles everywhere we handle 
> > extract_vector_elt now, which is just excessive complexity. #122671 is 
> > almost an alternative in one instance, but still shows expanding complexity 
> > of handling this edge case.
> 
> Honestly, #122671 (from the review description only) sounds like a worthwhile 
> change. That's not a hugely compelling argument here. Let's settle the prior 
> point, and then return to this. If I'm just misreading something, let's not 
> waste time discussing this.




https://github.com/llvm/llvm-project/pull/122672


[llvm-branch-commits] [llvm] AMDGPU: Implement isExtractVecEltCheap (PR #122460)

2025-01-14 Thread Matt Arsenault via llvm-branch-commits


@@ -1949,6 +1949,13 @@ bool SITargetLowering::isExtractSubvectorCheap(EVT ResVT, EVT SrcVT,
   return Index == 0;
 }
 
+bool SITargetLowering::isExtractVecEltCheap(EVT VT, unsigned Index) const {
+  // TODO: This should be more aggressive, particular for 16-bit element
+  // vectors. However there are some mixed improvements and regressions.
+  EVT EltTy = VT.getVectorElementType();
+  return EltTy.getSizeInBits() % 32 == 0;

arsenm wrote:

Even without true16 it should be better (maybe only even aligned cases?) 

https://github.com/llvm/llvm-project/pull/122460


[llvm-branch-commits] [llvm] DAG: Avoid forming shufflevector from a single extract_vector_elt (PR #122672)

2025-01-14 Thread Philip Reames via llvm-branch-commits

preames wrote:

> > * BuildVector w/one non-zero non-undef source, repeated 100 times (i.e. 
> > splat or select of two splats)
> 
> I don't follow, this is a 2 element vector, how can you have 100 variants?

Isn't the condition in code in terms of VecIn.size() == 2?  I believe that 
VecIn is the *unique* input elements, right?  Which is distinct from the number 
of elements in the destination type?  (Am I just misreading?  I only skimmed 
this.)
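
The distinction raised here can be illustrated with a small standalone model. This is only a sketch: it assumes, going by the diff context (`VecIn.push_back(SDValue())` followed by the "Have we seen this input vector before?" lookup), that `VecIn` holds one placeholder entry plus each distinct source vector. `VecInModel`, `addExtractFrom`, and the integer source IDs are hypothetical names for illustration, not LLVM API.

```cpp
#include <cstddef>

// Toy stand-in for the VecIn bookkeeping; this is not LLVM code.
struct VecInModel {
  static constexpr std::size_t MaxSources = 8;
  int Sources[MaxSources] = {}; // slot 0 models the leading placeholder
  std::size_t Size = 1;

  // Record the source vector of one extract_vector_elt operand, but only
  // the first time that source is seen.
  constexpr void addExtractFrom(int SourceId) {
    for (std::size_t I = 1; I < Size; ++I)
      if (Sources[I] == SourceId)
        return;
    if (Size < MaxSources)
      Sources[Size++] = SourceId;
  }
};

// build_vector whose NumElems operands all extract from the same source.
constexpr std::size_t vecInSizeForOneSource(std::size_t NumElems) {
  VecInModel M;
  for (std::size_t I = 0; I < NumElems; ++I)
    M.addExtractFrom(1);
  return M.Size;
}

// build_vector whose operands alternate between two different sources.
constexpr std::size_t vecInSizeForTwoSources(std::size_t NumElems) {
  VecInModel M;
  for (std::size_t I = 0; I < NumElems; ++I)
    M.addExtractFrom(I % 2 == 0 ? 1 : 2);
  return M.Size;
}
```

Under this model, `vecInSizeForOneSource(2)` and `vecInSizeForOneSource(100)` both yield 2, so a `VecIn.size() == 2` test on its own says nothing about how many elements the destination vector has.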

> > If the target isn't optimally lowering the splat or select of splat case in 
> > the shuffle lowering, maybe we should just adjust the target lowering to do 
> > so?
> 
> It's not a lowering issue, it's the effect on every other combine. We'd have 
> to special case 1 element + 1 undef shuffles everywhere we handle 
> extract_vector_elt now, which is just excessive complexity. #122671 is almost 
> an alternative in one instance, but still shows expanding complexity of 
> handling this edge case.

Honestly,  #122671 (from the review description only) sounds like a worthwhile 
change.  That's not a hugely compelling argument here.  Let's settle the prior 
point, and then return to this.  If I'm just misreading something, let's not 
waste time discussing this.  



https://github.com/llvm/llvm-project/pull/122672


[llvm-branch-commits] [clang] [llvm] [OffloadBundler] Rework the ctor of `OffloadTargetInfo` to support AMDGPU's generic target (PR #122629)

2025-01-14 Thread Joseph Huber via llvm-branch-commits

jhuber6 wrote:

Somewhere for the linker wrapper I just checked if the triple was recognized, 
you could probably just take strings after the `-` until it stops working.
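
One possible reading of this suggestion, as a standalone sketch: split off the text before the first `-`, then keep extending the candidate triple one `-`-separated component at a time and remember the longest candidate that a recognizer accepts. Everything here is an assumption for illustration: `isRecognizedTriple` is a hypothetical stand-in for a real triple check, `SplitEntry` and `splitEntry` are made-up names, and the input string in the usage note is only an example shape, not the bundler's documented format.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical recognizer; a real tool would ask its triple parser instead.
constexpr bool isRecognizedTriple(std::string_view S) {
  return S == "amdgcn-amd-amdhsa" || S == "x86_64-unknown-linux-gnu";
}

struct SplitEntry {
  std::string_view Kind;   // text before the first '-'
  std::string_view Triple; // longest recognized run of components after it
  std::string_view Rest;   // whatever follows the triple
  bool Ok;                 // false if no candidate was recognized
};

constexpr SplitEntry splitEntry(std::string_view Entry) {
  constexpr std::size_t NPos = std::string_view::npos;
  const std::size_t KindEnd = Entry.find('-');
  if (KindEnd == NPos)
    return {Entry, {}, {}, false};

  const std::string_view Tail = Entry.substr(KindEnd + 1);
  SplitEntry Result{Entry.substr(0, KindEnd), {}, Tail, false};

  // Grow the candidate by one component per iteration; the last accepted
  // candidate wins, so the result is the longest recognized prefix.
  std::size_t Pos = 0;
  while (true) {
    const std::size_t Dash = Tail.find('-', Pos);
    const std::string_view Candidate = Tail.substr(0, Dash);
    if (isRecognizedTriple(Candidate)) {
      Result.Triple = Candidate;
      Result.Rest = Dash == NPos ? std::string_view{} : Tail.substr(Dash + 1);
      Result.Ok = true;
    }
    if (Dash == NPos)
      break;
    Pos = Dash + 1;
  }
  return Result;
}
```

With the toy recognizer above, `splitEntry("hip-amdgcn-amd-amdhsa-gfx9-4-generic")` separates `amdgcn-amd-amdhsa` from the remaining `gfx9-4-generic`, even though the remainder itself contains dashes.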

https://github.com/llvm/llvm-project/pull/122629


[llvm-branch-commits] [llvm] AMDGPU: Implement isExtractVecEltCheap (PR #122460)

2025-01-14 Thread Jay Foad via llvm-branch-commits


@@ -1949,6 +1949,13 @@ bool SITargetLowering::isExtractSubvectorCheap(EVT ResVT, EVT SrcVT,
   return Index == 0;
 }
 
+bool SITargetLowering::isExtractVecEltCheap(EVT VT, unsigned Index) const {
+  // TODO: This should be more aggressive, particular for 16-bit element
+  // vectors. However there are some mixed improvements and regressions.
+  EVT EltTy = VT.getVectorElementType();
+  return EltTy.getSizeInBits() % 32 == 0;

jayfoad wrote:

Yeah, without true16 `EltTy.getSizeInBits() * Index % 32 == 0` would make sense 
to me.

https://github.com/llvm/llvm-project/pull/122460


[llvm-branch-commits] [flang] [llvm] [flang][OpenMP] Parse WHEN, OTHERWISE, MATCH clauses plus METADIRECTIVE (PR #121817)

2025-01-14 Thread Krzysztof Parzyszek via llvm-branch-commits

https://github.com/kparzysz updated 
https://github.com/llvm/llvm-project/pull/121817

>From fe3ec47965d5f970e26f9f729a21b61acf366053 Mon Sep 17 00:00:00 2001
From: Krzysztof Parzyszek 
Date: Thu, 12 Dec 2024 15:26:26 -0600
Subject: [PATCH] [flang][OpenMP] Parse WHEN, OTHERWISE, MATCH clauses plus
 METADIRECTIVE

Parse METADIRECTIVE as a standalone executable directive at the moment.
This will allow testing the parser code.

There is no lowering, not even clause conversion yet. There is also no
verification of the allowed values for trait sets, trait properties.
---
 flang/include/flang/Parser/dump-parse-tree.h |   6 +
 flang/include/flang/Parser/parse-tree.h  |  49 -
 flang/lib/Lower/OpenMP/Clauses.cpp   |  21 +-
 flang/lib/Lower/OpenMP/Clauses.h |   1 +
 flang/lib/Lower/OpenMP/OpenMP.cpp|   5 +
 flang/lib/Parser/openmp-parsers.cpp  |  29 ++-
 flang/lib/Parser/unparse.cpp |  16 ++
 flang/lib/Semantics/check-omp-structure.cpp  |  30 +++
 flang/lib/Semantics/check-omp-structure.h|  12 +-
 flang/lib/Semantics/resolve-directives.cpp   |  11 ++
 flang/test/Parser/OpenMP/metadirective.f90   | 196 +++
 llvm/include/llvm/Frontend/OpenMP/OMP.td |   9 +-
 12 files changed, 377 insertions(+), 8 deletions(-)
 create mode 100644 flang/test/Parser/OpenMP/metadirective.f90

diff --git a/flang/include/flang/Parser/dump-parse-tree.h 
b/flang/include/flang/Parser/dump-parse-tree.h
index 49eeed0e7b4393..1323fd695d4439 100644
--- a/flang/include/flang/Parser/dump-parse-tree.h
+++ b/flang/include/flang/Parser/dump-parse-tree.h
@@ -476,6 +476,12 @@ class ParseTreeDumper {
   NODE(parser, NullInit)
   NODE(parser, ObjectDecl)
   NODE(parser, OldParameterStmt)
+  NODE(parser, OmpMetadirectiveDirective)
+  NODE(parser, OmpMatchClause)
+  NODE(parser, OmpOtherwiseClause)
+  NODE(parser, OmpWhenClause)
+  NODE(OmpWhenClause, Modifier)
+  NODE(parser, OmpDirectiveSpecification)
   NODE(parser, OmpTraitPropertyName)
   NODE(parser, OmpTraitScore)
   NODE(parser, OmpTraitPropertyExtension)
diff --git a/flang/include/flang/Parser/parse-tree.h 
b/flang/include/flang/Parser/parse-tree.h
index f8175ea1de679e..db19608a8491ee 100644
--- a/flang/include/flang/Parser/parse-tree.h
+++ b/flang/include/flang/Parser/parse-tree.h
@@ -3456,6 +3456,14 @@ WRAPPER_CLASS(PauseStmt, std::optional);
 struct OmpClause;
 struct OmpClauseList;
 
+struct OmpDirectiveSpecification {
+  TUPLE_CLASS_BOILERPLATE(OmpDirectiveSpecification);
+  std::tuple>>
+  t;
+  CharBlock source;
+};
+
 // 2.1 Directives or clauses may accept a list or extended-list.
 // A list item is a variable, array section or common block name (enclosed
 // in slashes). An extended list item is a list item or a procedure Name.
@@ -3964,6 +3972,7 @@ struct OmpBindClause {
 // data-sharing-attribute ->
 //SHARED | NONE |   // since 4.5
 //PRIVATE | FIRSTPRIVATE// since 5.0
+// See also otherwise-clause.
 struct OmpDefaultClause {
   ENUM_CLASS(DataSharingAttribute, Private, Firstprivate, Shared, None)
   WRAPPER_CLASS_BOILERPLATE(OmpDefaultClause, DataSharingAttribute);
@@ -4184,6 +4193,16 @@ struct OmpMapClause {
   std::tuple t;
 };
 
+// Ref: [5.0:58-60], [5.1:63-68], [5.2:194-195]
+//
+// match-clause ->
+//MATCH (context-selector-specification)// since 5.0
+struct OmpMatchClause {
+  // The context-selector is an argument.
+  WRAPPER_CLASS_BOILERPLATE(
+  OmpMatchClause, traits::OmpContextSelectorSpecification);
+};
+
 // Ref: [5.2:217-218]
 // message-clause ->
 //MESSAGE("message-text")
@@ -4214,6 +4233,17 @@ struct OmpOrderClause {
   std::tuple t;
 };
 
+// Ref: [5.0:56-57], [5.1:60-62], [5.2:191]
+//
+// otherwise-clause ->
+//DEFAULT ([directive-specification])   // since 5.0, until 5.1
+// otherwise-clause ->
+//OTHERWISE ([directive-specification])]// since 5.2
+struct OmpOtherwiseClause {
+  WRAPPER_CLASS_BOILERPLATE(
+  OmpOtherwiseClause, std::optional);
+};
+
 // Ref: [4.5:46-50], [5.0:74-78], [5.1:92-96], [5.2:229-230]
 //
 // proc-bind-clause ->
@@ -4299,6 +4329,17 @@ struct OmpUpdateClause {
   std::variant u;
 };
 
+// Ref: [5.0:56-57], [5.1:60-62], [5.2:190-191]
+//
+// when-clause ->
+//WHEN (context-selector :
+//[directive-specification])// since 5.0
+struct OmpWhenClause {
+  TUPLE_CLASS_BOILERPLATE(OmpWhenClause);
+  MODIFIER_BOILERPLATE(OmpContextSelector);
+  std::tuple> t;
+};
+
 // OpenMP Clauses
 struct OmpClause {
   UNION_CLASS_BOILERPLATE(OmpClause);
@@ -4323,6 +4364,12 @@ struct OmpClauseList {
 
 // --- Directives and constructs
 
+struct OmpMetadirectiveDirective {
+  TUPLE_CLASS_BOILERPLATE(OmpMetadirectiveDirective);
+  std::tuple t;
+  CharBlock source;
+};
+
 // Ref: [5.1:89-90], [5.2:216]
 //
 // nothing-directive ->
@@ -4696,7 +4743,7 @@ struct OpenMPStandaloneConstruct {
   CharBlock source;
   std::variant
+ 

[llvm-branch-commits] [flang] [llvm] [flang][OpenMP] Parse WHEN, OTHERWISE, MATCH clauses plus METADIRECTIVE (PR #121817)

2025-01-14 Thread Krzysztof Parzyszek via llvm-branch-commits

kparzysz wrote:

Force-pushed to change the base.  There were no review comments yet, so it 
shouldn't mess up anything.

https://github.com/llvm/llvm-project/pull/121817


[llvm-branch-commits] [clang] [Multilib] Custom flags processing for library selection (PR #110659)

2025-01-14 Thread Victor Campos via llvm-branch-commits

https://github.com/vhscampos edited 
https://github.com/llvm/llvm-project/pull/110659


[llvm-branch-commits] [clang] Add documentation for Multilib custom flags (PR #114998)

2025-01-14 Thread Victor Campos via llvm-branch-commits

https://github.com/vhscampos updated 
https://github.com/llvm/llvm-project/pull/114998

>From 7fd397e918ba2663c7342bc1653c9ccbc5be9d96 Mon Sep 17 00:00:00 2001
From: Victor Campos 
Date: Tue, 5 Nov 2024 14:22:06 +
Subject: [PATCH 1/4] Add documentation for Multilib custom flags

---
 clang/docs/Multilib.rst | 90 +
 1 file changed, 90 insertions(+)

diff --git a/clang/docs/Multilib.rst b/clang/docs/Multilib.rst
index 7637d0db9565b8..85cb789b9847ac 100644
--- a/clang/docs/Multilib.rst
+++ b/clang/docs/Multilib.rst
@@ -122,6 +122,78 @@ subclass and a suitable base multilib variant is present 
then the
 It is the responsibility of layered multilib authors to ensure that headers and
 libraries in each layer are complete enough to mask any incompatibilities.
 
+Multilib custom flags
+=====================
+
+Introduction
+------------
+
+The multilib mechanism supports library variants that correspond to target,
+code generation or language command-line flags. Examples include ``--target``,
+``-mcpu``, ``-mfpu``, ``-mbranch-protection``, ``-fno-rtti``. However, some library
+variants are particular to features that do not correspond to any command-line
+option. Multithreading and semihosting, for instance, have no associated
+compiler option.
+
+In order to support the selection of variants for which no compiler option
+exists, the multilib specification includes the concept of *custom flags*.
+These flags have no impact on code generation and are only used in the multilib
+processing.
+
+Multilib custom flags follow this format in the driver invocation:
+
+::
+
+  -fmultilib-flag=
+
+They are fed into the multilib system alongside the remaining flags.
+
+Custom flag declarations
+------------------------
+
+Custom flags can be declared in the YAML file under the *Flags* section.
+
+.. code-block:: yaml
+
+  Flags:
+  - Name: multithreaded
+    Values:
+    - Name: no-multithreaded
+      DriverArgs: [-D__SINGLE_THREAD__]
+    - Name: multithreaded
+    Default: no-multithreaded
+
+* Name: the name to categorize a flag.
+* Values: a list of flag *Value*s (defined below).
+* Default: it specifies the name of the value this flag should take if not
+  specified in the command-line invocation. It must be one value from the Values
+  field.
+
+A Default value is useful to save users from specifying custom flags that have a
+most commonly used value.
+
+Each flag *Value* is defined as:
+
+* Name: name of the value. This is the string to be used in
+  ``-fmultilib-flag=``.
+* DriverArgs: a list of strings corresponding to the extra driver arguments
+  used to build a library variant that's in accordance to this specific custom
+  flag value. These arguments are fed back into the driver if this flag *Value*
+  is enabled.
+
+The namespace of flag values is common across all flags. This means that flag
+value names must be unique.
+
+Usage of custom flags in the *Variants* specifications
+------------------------------------------------------
+
+Library variants should list their requirement on one or more custom flags like
+they do for any other flag. Each requirement must be listed as
+``-fmultilib-flag=``.
+
+A variant that does not specify a requirement on one particular flag can be
+matched against any value of that flag.
+
 Stability
 =========
 
@@ -222,6 +294,24 @@ For a more comprehensive example see
 # Flags is a list of one or more strings.
 Flags: [--target=thumbv7m-none-eabi]
 
+  # Custom flag declarations. Each item is a different declaration.
+  Flags:
+# Name of the flag
+  - Name: multithreaded
+# List of custom flag values
+Values:
+  # Name of the custom flag value. To be used in -fmultilib-flag=.
+- Name: no-multithreaded
+  # Extra driver arguments to be printed with -print-multi-lib. Useful for
+  # specifying extra arguments for building the associated library
+  # variant(s).
+  DriverArgs: [-D__SINGLE_THREAD__]
+- Name: multithreaded
+# Default flag value. If no value for this flag declaration is used in the
+# command-line, the multilib system will use this one. Must be equal to one
+# of the flag value names from this flag declaration.
+Default: no-multithreaded
+
 Design principles
 =================
 

>From 27217d74b6be78263e2365ed74600953fc3f353c Mon Sep 17 00:00:00 2001
From: Victor Campos 
Date: Mon, 25 Nov 2024 15:07:57 +
Subject: [PATCH 2/4] Fix doc build warning

---
 clang/docs/Multilib.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/clang/docs/Multilib.rst b/clang/docs/Multilib.rst
index 85cb789b9847ac..48d84087dda01c 100644
--- a/clang/docs/Multilib.rst
+++ b/clang/docs/Multilib.rst
@@ -164,7 +164,7 @@ Custom flags can be declared in the YAML file under the 
*Flags* section.
 Default: no-multithreaded
 
 * Name: the name to categorize a flag.
-* Values: a list of flag *Value*s (defined below).
+* Values: a list of flag Values (defined b

[llvm-branch-commits] [clang-tools-extra] [clang-tidy][NFC] refactor modernize-raw-string-literal fix hint (PR #122909)

2025-01-14 Thread Congcong Cai via llvm-branch-commits

https://github.com/HerrCai0907 updated 
https://github.com/llvm/llvm-project/pull/122909

>From 9e5c5eb96a65d9bdec47566c9bf5ae95c57107f0 Mon Sep 17 00:00:00 2001
From: Congcong Cai 
Date: Tue, 14 Jan 2025 22:24:46 +0800
Subject: [PATCH] [clang-tidy][NFC] refactor modernize-raw-string-literal fix
 hint

---
 .../modernize/RawStringLiteralCheck.cpp   | 105 +++---
 .../modernize/RawStringLiteralCheck.h |   4 -
 2 files changed, 62 insertions(+), 47 deletions(-)

diff --git a/clang-tools-extra/clang-tidy/modernize/RawStringLiteralCheck.cpp 
b/clang-tools-extra/clang-tidy/modernize/RawStringLiteralCheck.cpp
index 126463ae795eb6..24674a407cb369 100644
--- a/clang-tools-extra/clang-tidy/modernize/RawStringLiteralCheck.cpp
+++ b/clang-tools-extra/clang-tidy/modernize/RawStringLiteralCheck.cpp
@@ -9,8 +9,11 @@
 #include "RawStringLiteralCheck.h"
 #include "clang/AST/ASTContext.h"
 #include "clang/ASTMatchers/ASTMatchFinder.h"
+#include "clang/Basic/LangOptions.h"
+#include "clang/Basic/SourceManager.h"
 #include "clang/Lex/Lexer.h"
 #include "llvm/ADT/StringRef.h"
+#include 
 
 using namespace clang::ast_matchers;
 
@@ -67,20 +70,6 @@ bool containsDelimiter(StringRef Bytes, const std::string 
&Delimiter) {
 : (")" + Delimiter + R"(")")) != StringRef::npos;
 }
 
-std::string asRawStringLiteral(const StringLiteral *Literal,
-   const std::string &DelimiterStem) {
-  const StringRef Bytes = Literal->getBytes();
-  std::string Delimiter;
-  for (int I = 0; containsDelimiter(Bytes, Delimiter); ++I) {
-Delimiter = (I == 0) ? DelimiterStem : DelimiterStem + std::to_string(I);
-  }
-
-  if (Delimiter.empty())
-return (R"(R"()" + Bytes + R"lit()")lit").str();
-
-  return (R"(R")" + Delimiter + "(" + Bytes + ")" + Delimiter + R"(")").str();
-}
-
 } // namespace
 
 RawStringLiteralCheck::RawStringLiteralCheck(StringRef Name,
@@ -120,43 +109,73 @@ void RawStringLiteralCheck::registerMatchers(MatchFinder 
*Finder) {
   stringLiteral(unless(hasParent(predefinedExpr(.bind("lit"), this);
 }
 
-void RawStringLiteralCheck::check(const MatchFinder::MatchResult &Result) {
-  const auto *Literal = Result.Nodes.getNodeAs("lit");
-  if (Literal->getBeginLoc().isMacroID())
-return;
-
-  if (containsEscapedCharacters(Result, Literal, DisallowedChars)) {
-std::string Replacement = asRawStringLiteral(Literal, DelimiterStem);
-if (ReplaceShorterLiterals ||
-Replacement.length() <=
-Lexer::MeasureTokenLength(Literal->getBeginLoc(),
-  *Result.SourceManager, getLangOpts()))
-  replaceWithRawStringLiteral(Result, Literal, Replacement);
-  }
-}
-
-void RawStringLiteralCheck::replaceWithRawStringLiteral(
-const MatchFinder::MatchResult &Result, const StringLiteral *Literal,
-std::string Replacement) {
-  DiagnosticBuilder Builder =
-  diag(Literal->getBeginLoc(),
-   "escaped string literal can be written as a raw string literal");
-  const SourceManager &SM = *Result.SourceManager;
+static std::optional
+createUserDefinedSuffix(const StringLiteral *Literal, const SourceManager &SM,
+const LangOptions &LangOpts) {
   const CharSourceRange TokenRange =
   CharSourceRange::getTokenRange(Literal->getSourceRange());
   Token T;
-  if (Lexer::getRawToken(Literal->getBeginLoc(), T, SM, getLangOpts()))
-return;
+  if (Lexer::getRawToken(Literal->getBeginLoc(), T, SM, LangOpts))
+return std::nullopt;
   const CharSourceRange CharRange =
-  Lexer::makeFileCharRange(TokenRange, SM, getLangOpts());
+  Lexer::makeFileCharRange(TokenRange, SM, LangOpts);
   if (T.hasUDSuffix()) {
-const StringRef Text = Lexer::getSourceText(CharRange, SM, getLangOpts());
+StringRef Text = Lexer::getSourceText(CharRange, SM, LangOpts);
 const size_t UDSuffixPos = Text.find_last_of('"');
 if (UDSuffixPos == StringRef::npos)
-  return;
-Replacement += Text.slice(UDSuffixPos + 1, Text.size());
+  return std::nullopt;
+return Text.slice(UDSuffixPos + 1, Text.size());
+  }
+  return std::nullopt;
+}
+
+static std::string createRawStringLiteral(const StringLiteral *Literal,
+  const std::string &DelimiterStem,
+  const SourceManager &SM,
+  const LangOptions &LangOpts) {
+  const StringRef Bytes = Literal->getBytes();
+  std::string Delimiter;
+  for (int I = 0; containsDelimiter(Bytes, Delimiter); ++I) {
+Delimiter = (I == 0) ? DelimiterStem : DelimiterStem + std::to_string(I);
+  }
+
+  std::optional UserDefinedSuffix =
+  createUserDefinedSuffix(Literal, SM, LangOpts);
+
+  if (Delimiter.empty())
+return (R"(R"()" + Bytes + R"lit()")lit" + UserDefinedSuffix.value_or(""))
+.str();
+
+  return (R"(R")" + Delimiter + "(" + Bytes + ")" + Delimiter + R"(")" +
+  UserDefinedSuffix.valu

[llvm-branch-commits] [flang] [llvm] [flang][OpenMP] Parse WHEN, OTHERWISE, MATCH clauses plus METADIRECTIVE (PR #121817)

2025-01-14 Thread Krzysztof Parzyszek via llvm-branch-commits

https://github.com/kparzysz edited 
https://github.com/llvm/llvm-project/pull/121817


[llvm-branch-commits] [clang-tools-extra] [clang-tidy][NFC] refactor modernize-raw-string-literal fix hint (PR #122909)

2025-01-14 Thread Congcong Cai via llvm-branch-commits

https://github.com/HerrCai0907 ready_for_review 
https://github.com/llvm/llvm-project/pull/122909


[llvm-branch-commits] [llvm] [mlir] [OMPIRBuilder][MLIR] Add support for target 'if' clause (PR #122478)

2025-01-14 Thread via llvm-branch-commits

github-actions[bot] wrote:




:warning: C/C++ code formatter, clang-format found issues in your code. :warning:



You can test this locally with the following command:


```bash
git-clang-format --diff 364cd46360d7a5d2a79ae9bf516f23c4840ff09b \
  8c348ba2796e08d45fe167d52db0fe047eaafa8a --extensions cpp,h -- \
  llvm/include/llvm/Frontend/OpenMP/OMPIRBuilder.h \
  llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp \
  llvm/unittests/Frontend/OpenMPIRBuilderTest.cpp \
  mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
```





View the diff from clang-format here.


``diff
diff --git a/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp 
b/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
index d29e22c762..273470b2ba 100644
--- a/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
+++ b/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
@@ -7423,34 +7423,36 @@ emitTargetCall(OpenMPIRBuilder &OMPBuilder, 
IRBuilderBase &Builder,
 
 SmallVector NumTeamsC;
 for (auto [DefaultVal, RuntimeVal] :
-zip_equal(DefaultAttrs.MaxTeams, RuntimeAttrs.MaxTeams))
-  NumTeamsC.push_back(RuntimeVal ? RuntimeVal : 
Builder.getInt32(DefaultVal));
+ zip_equal(DefaultAttrs.MaxTeams, RuntimeAttrs.MaxTeams))
+  NumTeamsC.push_back(RuntimeVal ? RuntimeVal
+ : Builder.getInt32(DefaultVal));
 
-// Calculate number of threads: 0 if no clauses specified, otherwise it is 
the
-// minimum between optional THREAD_LIMIT and NUM_THREADS clauses.
+// Calculate number of threads: 0 if no clauses specified, otherwise it is
+// the minimum between optional THREAD_LIMIT and NUM_THREADS clauses.
 auto InitMaxThreadsClause = [&Builder](Value *Clause) {
   if (Clause)
 Clause = Builder.CreateIntCast(Clause, Builder.getInt32Ty(),
-  /*isSigned=*/false);
+   /*isSigned=*/false);
   return Clause;
 };
 auto CombineMaxThreadsClauses = [&Builder](Value *Clause, Value *&Result) {
   if (Clause)
-Result = Result
-? Builder.CreateSelect(Builder.CreateICmpULT(Result, 
Clause),
-Result, Clause)
-: Clause;
+Result =
+Result ? Builder.CreateSelect(Builder.CreateICmpULT(Result, 
Clause),
+  Result, Clause)
+   : Clause;
 };
 
 // If a multi-dimensional THREAD_LIMIT is set, it is the OMPX_BARE case, so
 // the NUM_THREADS clause is overriden by THREAD_LIMIT.
 SmallVector NumThreadsC;
-Value *MaxThreadsClause = RuntimeAttrs.TeamsThreadLimit.size() == 1
-  ? 
InitMaxThreadsClause(RuntimeAttrs.MaxThreads)
-  : nullptr;
+Value *MaxThreadsClause =
+RuntimeAttrs.TeamsThreadLimit.size() == 1
+? InitMaxThreadsClause(RuntimeAttrs.MaxThreads)
+: nullptr;
 
-for (auto [TeamsVal, TargetVal] : zip_equal(RuntimeAttrs.TeamsThreadLimit,
-
RuntimeAttrs.TargetThreadLimit)) {
+for (auto [TeamsVal, TargetVal] : zip_equal(
+ RuntimeAttrs.TeamsThreadLimit, RuntimeAttrs.TargetThreadLimit)) {
   Value *TeamsThreadLimitClause = InitMaxThreadsClause(TeamsVal);
   Value *NumThreads = InitMaxThreadsClause(TargetVal);
 
@@ -7466,13 +7468,13 @@ emitTargetCall(OpenMPIRBuilder &OMPBuilder, 
IRBuilderBase &Builder,
 uint32_t SrcLocStrSize;
 Constant *SrcLocStr = 
OMPBuilder.getOrCreateDefaultSrcLocStr(SrcLocStrSize);
 Value *RTLoc = OMPBuilder.getOrCreateIdent(SrcLocStr, SrcLocStrSize,
-  llvm::omp::IdentFlag(0), 0);
+   llvm::omp::IdentFlag(0), 0);
 
 Value *TripCount = RuntimeAttrs.LoopTripCount
-  ? Builder.CreateIntCast(RuntimeAttrs.LoopTripCount,
-  Builder.getInt64Ty(),
-  /*isSigned=*/false)
-  : Builder.getInt64(0);
+   ? Builder.CreateIntCast(RuntimeAttrs.LoopTripCount,
+   Builder.getInt64Ty(),
+   /*isSigned=*/false)
+   : Builder.getInt64(0);
 
 // TODO: Use correct DynCGGroupMem
 Value *DynCGGroupMem = Builder.getInt32(0);

``




https://github.com/llvm/llvm-project/pull/122478


[llvm-branch-commits] [llvm] [mlir] [OMPIRBuilder][MLIR] Add support for target 'if' clause (PR #122478)

2025-01-14 Thread Sergio Afonso via llvm-branch-commits

https://github.com/skatrak updated 
https://github.com/llvm/llvm-project/pull/122478

>From 8c348ba2796e08d45fe167d52db0fe047eaafa8a Mon Sep 17 00:00:00 2001
From: Sergio Afonso 
Date: Fri, 10 Jan 2025 15:40:05 +
Subject: [PATCH 1/2] [OMPIRBuilder][MLIR] Add support for target 'if' clause

This patch implements support for handling the 'if' clause of OpenMP 'target'
constructs in the OMPIRBuilder and updates MLIR to LLVM IR translation of the
`omp.target` MLIR operation to make use of this new feature.
---
 .../llvm/Frontend/OpenMP/OMPIRBuilder.h   |  14 +-
 llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp | 188 ++
 .../Frontend/OpenMPIRBuilderTest.cpp  |  26 +--
 .../OpenMP/OpenMPToLLVMIRTranslation.cpp  |  11 +-
 mlir/test/Target/LLVMIR/omptarget-if.mlir |  68 +++
 mlir/test/Target/LLVMIR/openmp-todo.mlir  |  11 -
 6 files changed, 200 insertions(+), 118 deletions(-)
 create mode 100644 mlir/test/Target/LLVMIR/omptarget-if.mlir

diff --git a/llvm/include/llvm/Frontend/OpenMP/OMPIRBuilder.h 
b/llvm/include/llvm/Frontend/OpenMP/OMPIRBuilder.h
index 7eceec3d8cf8f5..6b6e5bc19d95a4 100644
--- a/llvm/include/llvm/Frontend/OpenMP/OMPIRBuilder.h
+++ b/llvm/include/llvm/Frontend/OpenMP/OMPIRBuilder.h
@@ -2994,27 +2994,29 @@ class OpenMPIRBuilder {
   /// \param Loc where the target data construct was encountered.
   /// \param IsOffloadEntry whether it is an offload entry.
   /// \param CodeGenIP The insertion point where the call to the outlined
-  /// function should be emitted.
+  ///function should be emitted.
   /// \param EntryInfo The entry information about the function.
   /// \param DefaultAttrs Structure containing the default attributes, 
including
   ///numbers of threads and teams to launch the kernel with.
   /// \param RuntimeAttrs Structure containing the runtime numbers of threads
   ///and teams to launch the kernel with.
+  /// \param IfCond value of the `if` clause.
   /// \param Inputs The input values to the region that will be passed.
-  /// as arguments to the outlined function.
+  ///as arguments to the outlined function.
   /// \param BodyGenCB Callback that will generate the region code.
   /// \param ArgAccessorFuncCB Callback that will generate accessors
-  /// instructions for passed in target arguments where neccessary
+  ///instructions for passed in target arguments where neccessary
   /// \param Dependencies A vector of DependData objects that carry
-  // dependency information as passed in the depend clause
-  // \param HasNowait Whether the target construct has a `nowait` clause or 
not.
+  ///dependency information as passed in the depend clause
+  /// \param HasNowait Whether the target construct has a `nowait` clause or
+  ///not.
   InsertPointOrErrorTy createTarget(
   const LocationDescription &Loc, bool IsOffloadEntry,
   OpenMPIRBuilder::InsertPointTy AllocaIP,
   OpenMPIRBuilder::InsertPointTy CodeGenIP,
   TargetRegionEntryInfo &EntryInfo,
   const TargetKernelDefaultAttrs &DefaultAttrs,
-  const TargetKernelRuntimeAttrs &RuntimeAttrs,
+  const TargetKernelRuntimeAttrs &RuntimeAttrs, Value *IfCond,
   SmallVectorImpl &Inputs, GenMapInfoCallbackTy GenMapInfoCB,
   TargetBodyGenCallbackTy BodyGenCB,
   TargetGenArgAccessorsCallbackTy ArgAccessorFuncCB,
diff --git a/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp 
b/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
index 3d461f0ad4228c..d29e22c762bd4f 100644
--- a/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
+++ b/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
@@ -7340,7 +7340,7 @@ emitTargetCall(OpenMPIRBuilder &OMPBuilder, IRBuilderBase 
&Builder,
OpenMPIRBuilder::InsertPointTy AllocaIP,
const OpenMPIRBuilder::TargetKernelDefaultAttrs &DefaultAttrs,
const OpenMPIRBuilder::TargetKernelRuntimeAttrs &RuntimeAttrs,
-   Function *OutlinedFn, Constant *OutlinedFnID,
+   Value *IfCond, Function *OutlinedFn, Constant *OutlinedFnID,
SmallVectorImpl &Args,
OpenMPIRBuilder::GenMapInfoCallbackTy GenMapInfoCB,
SmallVector Dependencies = 
{},
@@ -7386,9 +7386,9 @@ emitTargetCall(OpenMPIRBuilder &OMPBuilder, IRBuilderBase 
&Builder,
 return Error::success();
   };
 
-  // If we don't have an ID for the target region, it means an offload entry
-  // wasn't created. In this case we just run the host fallback directly.
-  if (!OutlinedFnID) {
+  auto &&EmitTargetCallElse =
+  [&](OpenMPIRBuilder::InsertPointTy AllocaIP,
+  OpenMPIRBuilder::InsertPointTy CodeGenIP) -> Error {
 // Assume no error was returned because EmitTargetCallFallbackCB doesn't
 // produce any.
 OpenMPIRBuilder::InsertPointTy AfterIP = cantFail([&]() {
@@ -7404,102 +7404,124 @@ emitTargetCall(OpenMPIRBuilder &OMPBuilder, 
IRBuilderBase &Builder,
 }());
 
 Builder.restoreIP(AfterIP);
-return;

[llvm-branch-commits] [flang] [llvm] [mlir] [MLIR][OpenMP] Introduce overlapped record type map support (PR #119588)

2025-01-14 Thread via llvm-branch-commits

agozillon wrote:

Small ping for some attention on this PR if at all possible please! Would be 
greatly appreciated.

https://github.com/llvm/llvm-project/pull/119588


[llvm-branch-commits] [flang] [llvm] [mlir] [Flang][OpenMP][MLIR] Initial declare target to for variables implementation (PR #119589)

2025-01-14 Thread via llvm-branch-commits

agozillon wrote:

Small ping for some attention on this PR if at all possible please! Would be 
greatly appreciated.

https://github.com/llvm/llvm-project/pull/119589


[llvm-branch-commits] [llvm] DAG: Avoid forming shufflevector from a single extract_vector_elt (PR #122672)

2025-01-14 Thread Philip Reames via llvm-branch-commits

https://github.com/preames commented:

I don't think the heuristic here is quite what you want.  I believe this 
heuristic disables both of the following cases:
* BuildVector w/one non-zero non-undef element
* BuildVector w/one non-zero non-undef source, repeated 100 times (i.e. splat 
or select of two splats)

Disabling the former seems defensible, doing so for the second less so.  

Though honestly, I'm not sure of this change as a whole.  Having a single 
canonical form seems valuable here.  If the target isn't optimally lowering the 
splat or select of splat case in the shuffle lowering, maybe we should just 
adjust the target lowering to do so?

https://github.com/llvm/llvm-project/pull/122672


[llvm-branch-commits] [llvm] DAG: Avoid forming shufflevector from a single extract_vector_elt (PR #122672)

2025-01-14 Thread Matt Arsenault via llvm-branch-commits

arsenm wrote:

> Isn't the condition in code in terms of VecIn.size() == 2? I believe that 
> VecIn is the _unique_ input elements, right? Which is distinct from the 
> number of elements in the destination type? (Am I just misreading? I only 
> skimmed this.)

VecIn is collecting only extract_vector_elts feeding the build_vector. So it's 
true it's not only a 2 element vector, in general (but the standard case of 
building a complete vector is 2 elements). The other skipped elements are all 
constant or undef.

A 2 element shuffle just happens to be the only case I care about, which I'm 
trying to make legal (and really only the odd -> even case is of any use). 



https://github.com/llvm/llvm-project/pull/122672


[llvm-branch-commits] [clang] [llvm] [OffloadBundler] Rework the ctor of `OffloadTargetInfo` to support AMDGPU's generic target (PR #122629)

2025-01-14 Thread Yaxun Liu via llvm-branch-commits

yxsamliu wrote:

> Somewhere for the linker wrapper I just checked if the triple was recognized, 
> you could probably just take strings after the `-` until it stops working.

+1

Breaking existing apps would be a bad user experience. The risk that an 
env+cpu string happens to be a valid cpu is low. So you could first assume the 
env exists; if the remainder fails to parse as a cpu, fall back to assuming no 
env and parse all of the remainder as the cpu.

https://github.com/llvm/llvm-project/pull/122629


[llvm-branch-commits] [llvm] DAG: Avoid forming shufflevector from a single extract_vector_elt (PR #122672)

2025-01-14 Thread Philip Reames via llvm-branch-commits

preames wrote:

> > Isn't the condition in code in terms of VecIn.size() == 2? I believe that 
> > VecIn is the _unique_ input elements, right? Which is distinct from the 
> > number of elements in the destination type? (Am I just misreading? I only 
> > skimmed this.)
> 
> VecIn is collecting only extract_vector_elts feeding the build_vector. So 
> it's true it's not only a 2 element vector, in general (but the standard case 
> of building a complete vector is 2 elements). The other skipped elements are 
> all constant or undef.
> 
> A 2 element shuffle just happens to be the only case I care about, which I'm 
> trying to make legal (and really only the odd -> even case is of any use).

This is exactly the distinction I'm trying to get at.  Avoiding the creation of 
a 1-2 element shuffle seems quite reasonable.  Avoiding the creation of a 100 
element splat shuffle does not.  I think you need to add an explicit condition 
in terms of the number of elements in the result, not the number of *unique* 
elements in the result.  
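
A minimal sketch of the kind of guard being suggested here, written as 
standalone C++ rather than against the SelectionDAG API. The helper name 
`shouldFormShuffle`, its parameters, and the threshold of 2 are all 
assumptions made for illustration; they are not taken from the patch.

```cpp
#include <cstddef>

// Hypothetical helper: decide whether a build_vector fed by
// extract_vector_elt nodes should be turned into a shuffle.
//   numResultElts  - number of elements in the build_vector result type
//   numExtractElts - number of operands that are extract_vector_elt
//                    (the remaining operands are constant or undef)
bool shouldFormShuffle(std::size_t numResultElts, std::size_t numExtractElts) {
  // Nothing to shuffle if no operand comes from another vector.
  if (numExtractElts == 0)
    return false;
  // Skip only the tiny case: a result of at most 2 elements built from a
  // single extract. The check is on the result width, so a wide splat built
  // from one extracted element is still allowed to become a shuffle.
  if (numResultElts <= 2 && numExtractElts == 1)
    return false;
  return true;
}
```

With this shape, `shouldFormShuffle(2, 1)` is rejected while 
`shouldFormShuffle(100, 1)` is still accepted, which is the distinction 
between result size and input count discussed above.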

https://github.com/llvm/llvm-project/pull/122672


[llvm-branch-commits] [clang] [llvm] [OffloadBundler] Rework the ctor of `OffloadTargetInfo` to support AMDGPU's generic target (PR #122629)

2025-01-14 Thread Shilei Tian via llvm-branch-commits

https://github.com/shiltian updated 
https://github.com/llvm/llvm-project/pull/122629

>From c2588fee406f43e6cc1fc1e902a06d18c96a0444 Mon Sep 17 00:00:00 2001
From: Shilei Tian 
Date: Sun, 12 Jan 2025 18:01:55 -0500
Subject: [PATCH] [OffloadBundler] Rework the ctor of `OffloadTargetInfo` to
 support generic target

The current parsing of the target string assumes it is in the form
`kind-triple-targetid:feature`, such as
`hipv4-amdgcn-amd-amdhsa-gfx1030:+xnack`. Specifically, it assumes the target id
does not contain any `-`, which is not the case for a generic target: a generic
target may contain one or more `-`, such as `gfx10-3-generic` and
`gfx12-generic`. As a result, we can no longer depend on `rsplit` to get things
right. This patch reworks the target string parsing to make it more robust and
to support generic targets.
---
 clang/docs/ClangOffloadBundler.rst|  7 ++-
 clang/lib/Driver/OffloadBundler.cpp   | 49 +--
 clang/lib/Driver/ToolChains/Clang.cpp |  6 ++-
 clang/lib/Driver/ToolChains/CommonArgs.cpp|  3 +-
 clang/lib/Driver/ToolChains/HIPUtility.cpp|  2 +-
 .../Driver/clang-offload-bundler-asserts-on.c | 14 +++---
 .../clang-offload-bundler-standardize.c   | 18 ++-
 .../test/Driver/clang-offload-bundler-zlib.c  | 12 ++---
 .../test/Driver/clang-offload-bundler-zstd.c  | 12 ++---
 clang/test/Driver/clang-offload-bundler.c | 48 +-
 clang/test/Driver/hip-link-bc-to-bc.hip   |  8 +--
 clang/test/Driver/hip-link-bundle-archive.hip | 10 ++--
 .../test/Driver/hip-offload-compress-zstd.hip |  4 +-
 clang/test/Driver/hip-rdc-device-only.hip |  8 +--
 .../Driver/hip-toolchain-rdc-separate.hip | 12 ++---
 llvm/utils/lit/lit/llvm/config.py | 12 -
 16 files changed, 114 insertions(+), 111 deletions(-)

diff --git a/clang/docs/ClangOffloadBundler.rst 
b/clang/docs/ClangOffloadBundler.rst
index 3c241027d405ca..25214c2ea6a4e1 100644
--- a/clang/docs/ClangOffloadBundler.rst
+++ b/clang/docs/ClangOffloadBundler.rst
@@ -266,15 +266,14 @@ without differentiation based on offload kind.
 The target triple of the code object. See `Target Triple
 `_.
 
-The bundler accepts target triples with or without the optional environment
-field:
+LLVM target triples can be with or without the optional environment field:
 
 ``--``, or
 ``---``
 
 However, in order to standardize outputs for tools that consume bitcode
-bundles, bundles written by the bundler internally use only the 4-field
-target triple:
+bundles, the bundler only accepts target triples with the 4-field target
+triple:
 
 ``---``
 
diff --git a/clang/lib/Driver/OffloadBundler.cpp 
b/clang/lib/Driver/OffloadBundler.cpp
index 2d6bdff0393be5..c29ab61853efa8 100644
--- a/clang/lib/Driver/OffloadBundler.cpp
+++ b/clang/lib/Driver/OffloadBundler.cpp
@@ -84,31 +84,27 @@ OffloadTargetInfo::OffloadTargetInfo(const StringRef Target,
 : BundlerConfig(BC) {
 
   // TODO: Add error checking from ClangOffloadBundler.cpp
-  auto TargetFeatures = Target.split(':');
-  auto TripleOrGPU = TargetFeatures.first.rsplit('-');
-
-  if (clang::StringToOffloadArch(TripleOrGPU.second) !=
-  clang::OffloadArch::UNKNOWN) {
-auto KindTriple = TripleOrGPU.first.split('-');
-this->OffloadKind = KindTriple.first;
-
-// Enforce optional env field to standardize bundles
-llvm::Triple t = llvm::Triple(KindTriple.second);
-this->Triple = llvm::Triple(t.getArchName(), t.getVendorName(),
-t.getOSName(), t.getEnvironmentName());
-
-this->TargetID = Target.substr(Target.find(TripleOrGPU.second));
-  } else {
-auto KindTriple = TargetFeatures.first.split('-');
-this->OffloadKind = KindTriple.first;
-
-// Enforce optional env field to standardize bundles
-llvm::Triple t = llvm::Triple(KindTriple.second);
-this->Triple = llvm::Triple(t.getArchName(), t.getVendorName(),
-t.getOSName(), t.getEnvironmentName());
-
+  // <kind>-<triple>[-<target id>[:target features]]
+  // <triple> := <arch>-<vendor>-<os>-<env>
+  SmallVector Components;
+  Target.split(Components, '-', /*MaxSplit=*/5);
+  assert((Components.size() == 5 || Components.size() == 6) &&
+ "malformed target string");
+
+  StringRef TargetIdWithFeature =
+  Components.size() == 6 ? Components.back() : "";
+  StringRef TargetId = TargetIdWithFeature.split(':').first;
+  if (!TargetId.empty() &&
+  clang::StringToOffloadArch(TargetId) != clang::OffloadArch::UNKNOWN)
+this->TargetID = TargetIdWithFeature;
+  else
 this->TargetID = "";
-  }
+
+  this->OffloadKind = Components.front();
+  ArrayRef TripleSlice{&Components[1], /*length=*/4};
+  llvm::Triple T = llvm::Triple(llvm::join(TripleSlice, "-"));
+  this->Triple = llvm::Triple(T.getArchName(), T.getVendorName(), 
T.getOSName(),
+  T.getEnvironmentName(

[llvm-branch-commits] [clang] [llvm] [OffloadBundler] Rework the ctor of `OffloadTargetInfo` to support AMDGPU's generic target (PR #122629)

2025-01-14 Thread Shilei Tian via llvm-branch-commits

shiltian wrote:

First of all, I don't think it can fix the issue in a robust way. Second, 
`generic` is already a valid target/cpu/offload target.

Unless we do something like: if the last part is `generic`, we keep looking 
forward until we can construct a valid target. That is no different from 
doing pattern matching, which is back to my previous point about AMD special 
sauce.

If we assert that the offload bundler is an AMD only thing (which TBH it really 
looks like), I'm fine with adding a bunch more special sauce here.

https://github.com/llvm/llvm-project/pull/122629


[llvm-branch-commits] [clang] [llvm] [OffloadBundler] Rework the ctor of `OffloadTargetInfo` to support AMDGPU's generic target (PR #122629)

2025-01-14 Thread Yaxun Liu via llvm-branch-commits

yxsamliu wrote:

> First of all, I don't think it can fix the issue in a robust way. Second, 
> `generic` is already a valid target/cpu/offload target.
> 
> Unless we do something like: if the last part is `generic`, we keep looking 
> forward until we can construct a valid target. That is no different from 
> doing pattern matching, which is back to my previous point about AMD special 
> sauce.
> 
> If we assert that the offload bundler is an AMD only thing (which TBH it really 
> looks like), I'm fine with adding a bunch more special sauce here.

offload-bundler is not an AMD only thing. At least the HIPSPRV toolchain, which 
is for Intel GPUs, uses it.

Still, I think it is possible to make it generic with a minor assumption. Let's 
say you are now about to parse the final part of the target ID string, which 
may be either "env-cpu" or "cpu" without env. clang has a function 
getCanonicalProcessorName() which can check whether a string is a valid cpu 
name. Just pass the remaining string to it. If it is valid, the remaining 
string is a cpu, without an env string. Otherwise, assume there is an env 
string that contains no "-" and split it off.

https://github.com/llvm/llvm-project/pull/122629


[llvm-branch-commits] [clang] [llvm] [OffloadBundler] Rework the ctor of `OffloadTargetInfo` to support AMDGPU's generic target (PR #122629)

2025-01-14 Thread Shilei Tian via llvm-branch-commits

shiltian wrote:

> Still, I think it is possible to make it generic with a minor assumption. Let's 
> say you are now about to parse the final part of the target ID string, which 
> may be either "env-cpu" or "cpu" without env.

This is not actually the issue. The issue is when the cpu is a generic target, 
such as `gfx10-3-generic`. By the current logic, the target id after the split 
is `generic`, which is a perfectly valid one, leaving the rest as something like 
`hip-amd-amdhsa-amd-gfx10-3`.
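
To make the failure mode concrete, here is a small standalone sketch using 
only the C++ standard library. `rsplitDash` and `splitDash` are hypothetical 
helpers that mimic splitting at the rightmost `-` and splitting a bounded 
number of times; they are not the LLVM `StringRef` API. The example string 
with an empty environment field is likewise an assumption chosen for 
illustration.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Split at the rightmost '-', as the legacy parsing does.
std::pair<std::string, std::string> rsplitDash(const std::string &s) {
  std::size_t pos = s.rfind('-');
  if (pos == std::string::npos)
    return {s, ""};
  return {s.substr(0, pos), s.substr(pos + 1)};
}

// Split on '-' at most maxSplit times, keeping empty fields. The final piece
// keeps any remaining '-' characters.
std::vector<std::string> splitDash(const std::string &s, int maxSplit) {
  std::vector<std::string> parts;
  std::size_t start = 0;
  for (int i = 0; i < maxSplit; ++i) {
    std::size_t pos = s.find('-', start);
    if (pos == std::string::npos)
      break;
    parts.push_back(s.substr(start, pos - start));
    start = pos + 1;
  }
  parts.push_back(s.substr(start));
  return parts;
}
```

For `hip-amdgcn-amd-amdhsa--gfx10-3-generic`, `rsplitDash` yields `generic` as 
the last piece, whereas `splitDash(..., 5)` yields six pieces whose last one 
is `gfx10-3-generic`. The bounded split only works because the kind and the 
four triple fields are assumed to be present and free of `-`.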

https://github.com/llvm/llvm-project/pull/122629


[llvm-branch-commits] [flang] [llvm] [mlir] [Flang][OpenMP][MLIR] Initial declare target to for variables implementation (PR #119589)

2025-01-14 Thread via llvm-branch-commits

https://github.com/agozillon updated 
https://github.com/llvm/llvm-project/pull/119589

>From 82f879526d8618b40a6bc83d27c9acc16422de6e Mon Sep 17 00:00:00 2001
From: agozillon 
Date: Wed, 11 Dec 2024 04:52:00 -0600
Subject: [PATCH] [Flang][OpenMP][MLIR] Initial declare target to for variables
 implementation

While the infrastructure for declare target to/enter and link for variables 
exists in the MLIR dialect and at the Flang level, the lowering from 
MLIR -> LLVM IR is currently only in place for variables that have the 
link clause applied.

This PR aims to extend that lowering to an initial implementation that 
incorporates declare target to as well, which primarily requires changes in the 
OpenMPToLLVMIRTranslation phase. However, a minor addition to the OpenMP 
dialect was required to extend the declare target enumerator to include a 
default None field as well.

This also requires a minor change to the Flang lowering's 
MapInfoFinalization.cpp pass to alter the map type for descriptors to deal with 
cases where a variable is marked declare to.  Currently, when a descriptor 
variable is mapped declare target to, the descriptor component can become 
attached and cannot be updated; this results in issues when an unusual 
allocation range is specified (effectively an off-by-X error). The current 
solution is to map the descriptor always, as we always require an up-to-date 
version of this data. However, this also requires an interlinked PR that adds a 
more intricate type of mapping of structures/record types that clang currently 
implements, to circumvent the overwriting of the pointer in the descriptor.

3/3 required PRs to enable declare target to mapping, this PR should pass all 
tests and provide an all green CI.

Co-authored-by: Raghu Maddhipatla raghu.maddhipa...@amd.com
---
 .../Optimizer/OpenMP/MapInfoFinalization.cpp  |   6 +-
 .../Lower/OpenMP/allocatable-array-bounds.f90 |   7 +-
 flang/test/Lower/OpenMP/allocatable-map.f90   |   2 +-
 flang/test/Lower/OpenMP/array-bounds.f90  |   2 +-
 .../OpenMP/declare-target-link-tarop-cap.f90  |   4 +-
 .../OpenMP/derived-type-allocatable-map.f90   |  12 +-
 flang/test/Lower/OpenMP/target.f90|   4 +-
 .../Transforms/omp-map-info-finalization.fir  |  20 +--
 .../mlir/Dialect/OpenMP/OpenMPEnums.td|   8 +-
 .../OpenMP/OpenMPToLLVMIRTranslation.cpp  | 127 ++
 .../omptarget-declare-target-to-device.mlir   |  38 ++
 .../omptarget-declare-target-to-host.mlir |  35 +
 ...allocatable-vars-in-target-with-update.f90 |  52 +++
 ...arget-to-vars-target-region-and-update.f90 |  36 +
 ...t-to-zero-index-allocatable-target-map.f90 |  30 +
 15 files changed, 326 insertions(+), 57 deletions(-)
 create mode 100644 
mlir/test/Target/LLVMIR/omptarget-declare-target-to-device.mlir
 create mode 100644 
mlir/test/Target/LLVMIR/omptarget-declare-target-to-host.mlir
 create mode 100644 
offload/test/offloading/fortran/declare-target-to-allocatable-vars-in-target-with-update.f90
 create mode 100644 
offload/test/offloading/fortran/declare-target-to-vars-target-region-and-update.f90
 create mode 100644 
offload/test/offloading/fortran/declare-target-to-zero-index-allocatable-target-map.f90

diff --git a/flang/lib/Optimizer/OpenMP/MapInfoFinalization.cpp 
b/flang/lib/Optimizer/OpenMP/MapInfoFinalization.cpp
index c63d2f4531a6f1..cb21769bfb8dda 100644
--- a/flang/lib/Optimizer/OpenMP/MapInfoFinalization.cpp
+++ b/flang/lib/Optimizer/OpenMP/MapInfoFinalization.cpp
@@ -254,8 +254,10 @@ class MapInfoFinalizationPass
 return llvm::to_underlying(
 hasImplicitMap
 ? llvm::omp::OpenMPOffloadMappingFlags::OMP_MAP_TO |
-  llvm::omp::OpenMPOffloadMappingFlags::OMP_MAP_IMPLICIT
-: llvm::omp::OpenMPOffloadMappingFlags::OMP_MAP_TO);
+  llvm::omp::OpenMPOffloadMappingFlags::OMP_MAP_IMPLICIT |
+  llvm::omp::OpenMPOffloadMappingFlags::OMP_MAP_ALWAYS
+: llvm::omp::OpenMPOffloadMappingFlags::OMP_MAP_TO |
+  llvm::omp::OpenMPOffloadMappingFlags::OMP_MAP_ALWAYS);
   }
 
   mlir::omp::MapInfoOp genDescriptorMemberMaps(mlir::omp::MapInfoOp op,
diff --git a/flang/test/Lower/OpenMP/allocatable-array-bounds.f90 
b/flang/test/Lower/OpenMP/allocatable-array-bounds.f90
index e66b6f17d8858a..ff54464dc328fd 100644
--- a/flang/test/Lower/OpenMP/allocatable-array-bounds.f90
+++ b/flang/test/Lower/OpenMP/allocatable-array-bounds.f90
@@ -24,7 +24,7 @@
 !HOST: %[[BOUNDS_1:.*]] = omp.map.bounds lower_bound(%[[LB_1]] : index) 
upper_bound(%[[UB_1]] : index) extent(%[[BOX_3]]#1 : index) stride(%[[BOX_2]]#2 
: index) start_idx(%[[BOX_1]]#0 : index) {stride_in_bytes = true}
 !HOST: %[[VAR_PTR_PTR:.*]] = fir.box_offset %[[DECLARE_1]]#1 base_addr : 
(!fir.ref>>>) -> 
!fir.llvm_ptr>>
 !HOST: %[[MAP_INFO_MEMBER:.*]] = omp.map.info var_ptr(%[[DECLARE_1]]#1 : 
!fir.ref>>>, i32) 
var_ptr_ptr(%[[VAR_PTR_PTR]] : !fir.llvm_ptr>>) 
map_c

[llvm-branch-commits] [flang] [llvm] [mlir] [MLIR][OpenMP] Introduce overlapped record type map support (PR #119588)

2025-01-14 Thread via llvm-branch-commits

https://github.com/agozillon updated 
https://github.com/llvm/llvm-project/pull/119588

>From ed1fb2d5340505faa3b14253eb4395b33a69f63e Mon Sep 17 00:00:00 2001
From: agozillon 
Date: Wed, 11 Dec 2024 09:11:58 -0600
Subject: [PATCH] [MLIR][OpenMP] Introduce overlapped record type map support

This PR introduces a new additional type of map lowering for record types that 
Clang currently supports, in which a user can map a top-level record type and 
then individual members with different mapping, effectively creating a sort of 
"overlapping" mapping that we attempt to cut around.

This is currently most predominantly used in Fortran, when mapping descriptors 
and their data: we map the descriptor and its data with separate map modifiers 
and "cut around" the pointer data, so that we do not overwrite it unless the 
runtime deems it a necessary action based on its reference counting mechanism. 
However, it is a mechanism that will come in handy/trigger when a user 
explicitly maps a record type (derived type or structure) and then explicitly 
maps a member with a different map type.

These additions were predominantly in the OpenMPToLLVMIRTranslation.cpp file 
and phase, however, one Flang test that checks end-to-end IR compilation (as 
far as we care for now at least) was altered.

This is 2/3 of the required PRs to enable declare target to mapping; look at 
PR 3/3 to check for fully green passes (this one will fail a number of tests 
due to some dependencies).

Co-authored-by: Raghu Maddhipatla raghu.maddhipa...@amd.com
---
 .../OpenMP/map-types-and-sizes.f90| 130 +
 .../OpenMP/OpenMPToLLVMIRTranslation.cpp  | 260 ++
 .../omptarget-data-use-dev-ordering.mlir  |  20 +-
 ...ptarget-overlapping-record-member-map.mlir |  65 +
 ...rget-record-type-with-ptr-member-host.mlir |  95 ---
 .../fortran/dtype-member-overlap-map.f90  |  47 
 6 files changed, 463 insertions(+), 154 deletions(-)
 create mode 100644 
mlir/test/Target/LLVMIR/omptarget-overlapping-record-member-map.mlir
 create mode 100644 offload/test/offloading/fortran/dtype-member-overlap-map.f90

diff --git a/flang/test/Integration/OpenMP/map-types-and-sizes.f90 
b/flang/test/Integration/OpenMP/map-types-and-sizes.f90
index dac4938157a60d..efa73245361895 100644
--- a/flang/test/Integration/OpenMP/map-types-and-sizes.f90
+++ b/flang/test/Integration/OpenMP/map-types-and-sizes.f90
@@ -30,8 +30,8 @@ subroutine mapType_array
   !$omp end target
 end subroutine mapType_array
 
-!CHECK: @.offload_sizes{{.*}} = private unnamed_addr constant [4 x i64] [i64 
0, i64 24, i64 8, i64 4]
-!CHECK: @.offload_maptypes{{.*}} = private unnamed_addr constant [4 x i64] 
[i64 32, i64 281474976711169, i64 281474976711171, i64 281474976711187]
+!CHECK: @.offload_sizes{{.*}} = private unnamed_addr constant [5 x i64] [i64 
0, i64 0, i64 0, i64 8, i64 4]
+!CHECK: @.offload_maptypes{{.*}} = private unnamed_addr constant [5 x i64] 
[i64 32, i64 281474976711173, i64 281474976711173, i64 281474976711171, i64 
281474976711187]
 subroutine mapType_ptr
   integer, pointer :: a
   !$omp target
@@ -39,8 +39,8 @@ subroutine mapType_ptr
   !$omp end target
 end subroutine mapType_ptr
 
-!CHECK: @.offload_sizes{{.*}} = private unnamed_addr constant [4 x i64] [i64 
0, i64 24, i64 8, i64 4]
-!CHECK: @.offload_maptypes{{.*}} = private unnamed_addr constant [4 x i64] 
[i64 32, i64 281474976711169, i64 281474976711171, i64 281474976711187]
+!CHECK: @.offload_sizes{{.*}} = private unnamed_addr constant [5 x i64] [i64 
0, i64 0, i64 0, i64 8, i64 4]
+!CHECK: @.offload_maptypes{{.*}} = private unnamed_addr constant [5 x i64] 
[i64 32, i64 281474976711173, i64 281474976711173, i64 281474976711171, i64 
281474976711187]
 subroutine mapType_allocatable
   integer, allocatable :: a
   allocate(a)
@@ -50,8 +50,8 @@ subroutine mapType_allocatable
   deallocate(a)
 end subroutine mapType_allocatable
 
-!CHECK: @.offload_sizes{{.*}} = private unnamed_addr constant [4 x i64] [i64 
0, i64 24, i64 8, i64 4]
-!CHECK: @.offload_maptypes{{.*}} = private unnamed_addr constant [4 x i64] 
[i64 32, i64 281474976710657, i64 281474976710659, i64 281474976710675]
+!CHECK: @.offload_sizes{{.*}} = private unnamed_addr constant [5 x i64] [i64 
0, i64 0, i64 0, i64 8, i64 4]
+!CHECK: @.offload_maptypes{{.*}} = private unnamed_addr constant [5 x i64] 
[i64 32, i64 281474976710661, i64 281474976710661, i64 281474976710659, i64 
281474976710675]
 subroutine mapType_ptr_explicit
   integer, pointer :: a
   !$omp target map(tofrom: a)
@@ -59,8 +59,8 @@ subroutine mapType_ptr_explicit
   !$omp end target
 end subroutine mapType_ptr_explicit
 
-!CHECK: @.offload_sizes{{.*}} = private unnamed_addr constant [4 x i64] [i64 
0, i64 24, i64 8, i64 4]
-!CHECK: @.offload_maptypes{{.*}} = private unnamed_addr constant [4 x i64] 
[i64 32, i64 281474976710657, i64 281474976710659, i64 281474976710675]
+!CHECK: @.offload_sizes{{.*}} = private unnamed_addr constant [5 x i64] [i64 
0, i64 0, i64 0, i64 8, 

[llvm-branch-commits] [clang] [llvm] [OffloadBundler] Rework the ctor of `OffloadTargetInfo` to support AMDGPU's generic target (PR #122629)

2025-01-14 Thread Yaxun Liu via llvm-branch-commits

yxsamliu wrote:

How about assuming the strict triple format first? This will make the generic 
GPU arch work with a strict triple.

If the first assumption fails, then fall back to the legacy parsing, that is, 
assume there is no '-' in the GPU arch and split at the rightmost '-'. This 
way, the old target ID string with a non-strict triple still works.
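
The two-stage parse described above could look roughly like the following 
standalone sketch. Everything here is hypothetical: `ParsedTarget`, 
`parseTarget`, and especially `isKnownArch`, which is a hard-coded stand-in 
for a real query of known GPU names. Features after `:` are simply dropped to 
keep the example short.

```cpp
#include <cstddef>
#include <set>
#include <string>
#include <vector>

struct ParsedTarget {
  std::string kind;     // offload kind, e.g. "hip"
  std::string triple;   // target triple, possibly with an empty env field
  std::string targetId; // GPU arch without features; empty if none recognised
};

// Hypothetical stand-in for a real "is this a known GPU arch" query.
bool isKnownArch(const std::string &name) {
  static const std::set<std::string> known = {
      "gfx906", "gfx1030", "gfx10-3-generic", "gfx12-generic", "generic"};
  return known.count(name) != 0;
}

ParsedTarget parseTarget(const std::string &target) {
  // Drop any ":feature" suffix; this sketch does not keep features.
  const std::string s = target.substr(0, target.find(':'));

  // 1) Strict form: kind-arch-vendor-os-env-targetid (split at most 5 times).
  std::vector<std::string> parts;
  std::size_t start = 0;
  for (int i = 0; i < 5; ++i) {
    std::size_t pos = s.find('-', start);
    if (pos == std::string::npos)
      break;
    parts.push_back(s.substr(start, pos - start));
    start = pos + 1;
  }
  parts.push_back(s.substr(start));
  if (parts.size() == 6 && isKnownArch(parts[5]))
    return {parts[0],
            parts[1] + "-" + parts[2] + "-" + parts[3] + "-" + parts[4],
            parts[5]};

  // 2) Legacy form: assume no '-' in the GPU arch, split at the rightmost '-'.
  std::string kindTriple = s;
  std::string id;
  std::size_t last = s.rfind('-');
  if (last != std::string::npos && isKnownArch(s.substr(last + 1))) {
    kindTriple = s.substr(0, last);
    id = s.substr(last + 1);
  }
  std::size_t first = kindTriple.find('-');
  if (first == std::string::npos)
    return {kindTriple, "", id};
  return {kindTriple.substr(0, first), kindTriple.substr(first + 1), id};
}
```

Under these assumptions a strict string such as 
`hip-amdgcn-amd-amdhsa--gfx10-3-generic` resolves to the full generic arch, 
and a legacy string such as `hipv4-amdgcn-amd-amdhsa-gfx1030:+xnack` still 
parses. A legacy string that combines a 3-field triple with a generic arch 
(`hip-amdgcn-amd-amdhsa-gfx10-3-generic`) would still be misparsed as 
`generic`, so the fallback does not remove that limitation.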

https://github.com/llvm/llvm-project/pull/122629


[llvm-branch-commits] [clang] [llvm] [OffloadBundler] Rework the ctor of `OffloadTargetInfo` to support AMDGPU's generic target (PR #122629)

2025-01-14 Thread Yaxun Liu via llvm-branch-commits

yxsamliu wrote:

> > > Still, I think it is possible to make it generic with a minor assumption. 
> > > Let's say you are now about to parse the final part of the target ID 
> > > string, which may be either "env-cpu" or "cpu" without env.
> > 
> > 
> > This is not actually the issue. The issue is when the cpu is a generic 
> > target, such as `gfx10-3-generic`. By the current logic, the target id 
> > after the split is `generic`, which is a perfectly valid one, leaving the 
> > rest as something like `hip-amd-amdhsa-amd-gfx10-3`.
> 
> That is probably due to this line 
> https://github.com/llvm/llvm-project/blob/main/clang/lib/Driver/OffloadBundler.cpp#L88
> 
> It assumes there is no '-' in GPU name.
> 
> we could add a loop. If that line fails, we will split at the second '-' from 
> right.

OK, I get your point: since `generic` is a valid GPU name, it will stop there.

https://github.com/llvm/llvm-project/pull/122629


[llvm-branch-commits] [flang] [llvm] [mlir] [Flang][OpenMP][MLIR] Initial declare target to for variables implementation (PR #119589)

2025-01-14 Thread via llvm-branch-commits

agozillon wrote:

Performed an update of PR 3 to address some reviewer nits, and then a rebase. 
The rebase has unfortunately brought in some code that will break the CI / 
patch series for the moment. However, I need to create a separate PR to address 
it, as it's a little unrelated to the changes in the PR stack. Once that 
lands, everything will pass happily and hopefully be ready for submission after 
some reviews :-) 

https://github.com/llvm/llvm-project/pull/119589


[llvm-branch-commits] [llvm] [LoongArch] Merge base and offset for tls-le code sequence (PR #122999)

2025-01-14 Thread via llvm-branch-commits

https://github.com/zhaoqi5 updated 
https://github.com/llvm/llvm-project/pull/122999

>From e37c2933125a65e627949557fba7a606d41db716 Mon Sep 17 00:00:00 2001
From: Qi Zhao 
Date: Tue, 14 Jan 2025 21:35:31 +0800
Subject: [PATCH] [LoongArch] Merge base and offset for tls-le code sequence

Adapt the merge base offset pass to optimize the tls-le
code sequence.
---
 .../LoongArch/LoongArchMergeBaseOffset.cpp| 165 -
 .../LoongArch/machinelicm-address-pseudos.ll  |   6 +-
 .../LoongArch/merge-base-offset-tlsle.ll  | 318 +++---
 3 files changed, 266 insertions(+), 223 deletions(-)

diff --git a/llvm/lib/Target/LoongArch/LoongArchMergeBaseOffset.cpp 
b/llvm/lib/Target/LoongArch/LoongArchMergeBaseOffset.cpp
index 7f98f7718a538d..2aae498e1f2de2 100644
--- a/llvm/lib/Target/LoongArch/LoongArchMergeBaseOffset.cpp
+++ b/llvm/lib/Target/LoongArch/LoongArchMergeBaseOffset.cpp
@@ -37,6 +37,8 @@ class LoongArchMergeBaseOffsetOpt : public 
MachineFunctionPass {
   bool detectFoldable(MachineInstr &Hi20, MachineInstr *&Lo12,
   MachineInstr *&Lo20, MachineInstr *&Hi12,
   MachineInstr *&Last);
+  bool detectFoldable(MachineInstr &Hi20, MachineInstr *&Add,
+  MachineInstr *&Lo12);
 
   bool detectAndFoldOffset(MachineInstr &Hi20, MachineInstr &Lo12,
MachineInstr *&Lo20, MachineInstr *&Hi12,
@@ -176,7 +178,80 @@ bool 
LoongArchMergeBaseOffsetOpt::detectFoldable(MachineInstr &Hi20,
   return true;
 }
 
-// Update the offset in Hi20, Lo12, Lo20 and Hi12 instructions.
+// Detect the pattern:
+//
+// (small/medium):
+//   lu12i.w  vreg1, %le_hi20_r(s)
+//   add.w/d  vreg2, vreg1, r2, %le_add_r(s)
+//   addi.w/d vreg3, vreg2, %le_lo12_r(s)
+
+// The pattern is only accepted if:
+//1) The first instruction has only one use, which is the PseudoAddTPRel.
+//   The second instruction has only one use, which is the ADDI. The
+//   second instruction's last operand is the tp register.
+//2) The address operands have the appropriate type, reflecting the
+//   lowering of a thread_local global address using the pattern.
+//3) The offset value in the ThreadLocal Global Address is 0.
+bool LoongArchMergeBaseOffsetOpt::detectFoldable(MachineInstr &Hi20,
+ MachineInstr *&Add,
+ MachineInstr *&Lo12) {
+  if (Hi20.getOpcode() != LoongArch::LU12I_W)
+return false;
+
+  auto isGlobalOrCPI = [](const MachineOperand &Op) {
+return Op.isGlobal() || Op.isCPI();
+  };
+
+  const MachineOperand &Hi20Op1 = Hi20.getOperand(1);
+  if (LoongArchII::getDirectFlags(Hi20Op1) != LoongArchII::MO_LE_HI_R ||
+  !isGlobalOrCPI(Hi20Op1) || Hi20Op1.getOffset() != 0)
+return false;
+
+  Register HiDestReg = Hi20.getOperand(0).getReg();
+  if (!MRI->hasOneUse(HiDestReg))
+return false;
+
+  Add = &*MRI->use_instr_begin(HiDestReg);
+  if ((ST->is64Bit() && Add->getOpcode() != LoongArch::PseudoAddTPRel_D) ||
+  (!ST->is64Bit() && Add->getOpcode() != LoongArch::PseudoAddTPRel_W))
+return false;
+
+  if (Add->getOperand(2).getReg() != LoongArch::R2)
+return false;
+
+  const MachineOperand &AddOp3 = Add->getOperand(3);
+  if (LoongArchII::getDirectFlags(AddOp3) != LoongArchII::MO_LE_ADD_R ||
+  !(isGlobalOrCPI(AddOp3) || AddOp3.isMCSymbol()) ||
+  AddOp3.getOffset() != 0)
+return false;
+
+  Register AddDestReg = Add->getOperand(0).getReg();
+  if (!MRI->hasOneUse(AddDestReg))
+return false;
+
+  Lo12 = &*MRI->use_instr_begin(AddDestReg);
+  if ((ST->is64Bit() && Lo12->getOpcode() != LoongArch::ADDI_D) ||
+  (!ST->is64Bit() && Lo12->getOpcode() != LoongArch::ADDI_W))
+return false;
+
+  const MachineOperand &Lo12Op2 = Lo12->getOperand(2);
+  if (LoongArchII::getDirectFlags(Lo12Op2) != LoongArchII::MO_LE_LO_R ||
+  !(isGlobalOrCPI(Lo12Op2) || Lo12Op2.isMCSymbol()) ||
+  Lo12Op2.getOffset() != 0)
+return false;
+
+  if (Hi20Op1.isGlobal()) {
+LLVM_DEBUG(dbgs() << "  Found lowered global address: "
+  << *Hi20Op1.getGlobal() << "\n");
+  } else if (Hi20Op1.isCPI()) {
+LLVM_DEBUG(dbgs() << "  Found lowered constant pool: " << 
Hi20Op1.getIndex()
+  << "\n");
+  }
+
+  return true;
+}
+
+// Update the offset in Hi20, (Add), Lo12, (Lo20 and Hi12) instructions.
 // Delete the tail instruction and update all the uses to use the
 // output from Last.
 void LoongArchMergeBaseOffsetOpt::foldOffset(
@@ -190,31 +265,49 @@ void LoongArchMergeBaseOffsetOpt::foldOffset(
 Lo20->getOperand(2).setOffset(Offset);
 Hi12->getOperand(2).setOffset(Offset);
   }
+
+  // For tls-le, offset of the second PseudoAddTPRel instr should also be
+  // updated.
+  MachineInstr *Add = &*MRI->use_instr_begin(Hi20.getOperand(0).getReg());
+  if (Hi20.getOpcode() == LoongArch::LU12I_W)
+Add->getOperand(3).setOffset(Offset);
+
   // Delete the tail instr

[llvm-branch-commits] [llvm] [LoongArch] Merge base and offset for tls-le code sequence (PR #122999)

2025-01-14 Thread via llvm-branch-commits

https://github.com/zhaoqi5 updated 
https://github.com/llvm/llvm-project/pull/122999

>From ac63a4f1c8e8d1b3831c83c5fab2a139a284dcc6 Mon Sep 17 00:00:00 2001
From: Qi Zhao 
Date: Tue, 14 Jan 2025 21:35:31 +0800
Subject: [PATCH] [LoongArch] Merge base and offset for tls-le code sequence

Adapt the merge base offset pass to optimize the tls-le
code sequence.
---
 .../LoongArch/LoongArchMergeBaseOffset.cpp| 165 -
 .../LoongArch/machinelicm-address-pseudos.ll  |   6 +-
 .../LoongArch/merge-base-offset-tlsle.ll  | 318 +++---
 3 files changed, 265 insertions(+), 224 deletions(-)

diff --git a/llvm/lib/Target/LoongArch/LoongArchMergeBaseOffset.cpp 
b/llvm/lib/Target/LoongArch/LoongArchMergeBaseOffset.cpp
index 7f98f7718a538d..bef56e58bdc88d 100644
--- a/llvm/lib/Target/LoongArch/LoongArchMergeBaseOffset.cpp
+++ b/llvm/lib/Target/LoongArch/LoongArchMergeBaseOffset.cpp
@@ -37,6 +37,8 @@ class LoongArchMergeBaseOffsetOpt : public 
MachineFunctionPass {
   bool detectFoldable(MachineInstr &Hi20, MachineInstr *&Lo12,
   MachineInstr *&Lo20, MachineInstr *&Hi12,
   MachineInstr *&Last);
+  bool detectFoldable(MachineInstr &Hi20, MachineInstr *&Add,
+  MachineInstr *&Lo12);
 
   bool detectAndFoldOffset(MachineInstr &Hi20, MachineInstr &Lo12,
MachineInstr *&Lo20, MachineInstr *&Hi12,
@@ -176,7 +178,80 @@ bool 
LoongArchMergeBaseOffsetOpt::detectFoldable(MachineInstr &Hi20,
   return true;
 }
 
-// Update the offset in Hi20, Lo12, Lo20 and Hi12 instructions.
+// Detect the pattern:
+//
+// (small/medium):
+//   lu12i.w  vreg1, %le_hi20_r(s)
+//   add.w/d  vreg2, vreg1, r2, %le_add_r(s)
+//   addi.w/d vreg3, vreg2, %le_lo12_r(s)
+
+// The pattern is only accepted if:
+//1) The first instruction has only one use, which is the PseudoAddTPRel.
+//   The second instruction has only one use, which is the ADDI. The
+//   second instruction's last operand is the tp register.
+//2) The address operands have the appropriate type, reflecting the
+//   lowering of a thread_local global address using the pattern.
+//3) The offset value in the ThreadLocal Global Address is 0.
+bool LoongArchMergeBaseOffsetOpt::detectFoldable(MachineInstr &Hi20,
+ MachineInstr *&Add,
+ MachineInstr *&Lo12) {
+  if (Hi20.getOpcode() != LoongArch::LU12I_W)
+return false;
+
+  auto isGlobalOrCPI = [](const MachineOperand &Op) {
+return Op.isGlobal() || Op.isCPI();
+  };
+
+  const MachineOperand &Hi20Op1 = Hi20.getOperand(1);
+  if (LoongArchII::getDirectFlags(Hi20Op1) != LoongArchII::MO_LE_HI_R ||
+  !isGlobalOrCPI(Hi20Op1) || Hi20Op1.getOffset() != 0)
+return false;
+
+  Register HiDestReg = Hi20.getOperand(0).getReg();
+  if (!MRI->hasOneUse(HiDestReg))
+return false;
+
+  Add = &*MRI->use_instr_begin(HiDestReg);
+  if ((ST->is64Bit() && Add->getOpcode() != LoongArch::PseudoAddTPRel_D) ||
+  (!ST->is64Bit() && Add->getOpcode() != LoongArch::PseudoAddTPRel_W))
+return false;
+
+  if (Add->getOperand(2).getReg() != LoongArch::R2)
+return false;
+
+  const MachineOperand &AddOp3 = Add->getOperand(3);
+  if (LoongArchII::getDirectFlags(AddOp3) != LoongArchII::MO_LE_ADD_R ||
+  !(isGlobalOrCPI(AddOp3) || AddOp3.isMCSymbol()) ||
+  AddOp3.getOffset() != 0)
+return false;
+
+  Register AddDestReg = Add->getOperand(0).getReg();
+  if (!MRI->hasOneUse(AddDestReg))
+return false;
+
+  Lo12 = &*MRI->use_instr_begin(AddDestReg);
+  if ((ST->is64Bit() && Lo12->getOpcode() != LoongArch::ADDI_D) ||
+  (!ST->is64Bit() && Lo12->getOpcode() != LoongArch::ADDI_W))
+return false;
+
+  const MachineOperand &Lo12Op2 = Lo12->getOperand(2);
+  if (LoongArchII::getDirectFlags(Lo12Op2) != LoongArchII::MO_LE_LO_R ||
+  !(isGlobalOrCPI(Lo12Op2) || Lo12Op2.isMCSymbol()) ||
+  Lo12Op2.getOffset() != 0)
+return false;
+
+  if (Hi20Op1.isGlobal()) {
+LLVM_DEBUG(dbgs() << "  Found lowered global address: "
+  << *Hi20Op1.getGlobal() << "\n");
+  } else if (Hi20Op1.isCPI()) {
+LLVM_DEBUG(dbgs() << "  Found lowered constant pool: " << 
Hi20Op1.getIndex()
+  << "\n");
+  }
+
+  return true;
+}
+
+// Update the offset in Hi20, (Add), Lo12, (Lo20 and Hi12) instructions.
 // Delete the tail instruction and update all the uses to use the
 // output from Last.
 void LoongArchMergeBaseOffsetOpt::foldOffset(
@@ -190,31 +265,49 @@ void LoongArchMergeBaseOffsetOpt::foldOffset(
     Lo20->getOperand(2).setOffset(Offset);
     Hi12->getOperand(2).setOffset(Offset);
   }
+
+  // For tls-le, offset of the second PseudoAddTPRel instr should also be
+  // updated.
+  MachineInstr *Add = &*MRI->use_instr_begin(Hi20.getOperand(0).getReg());
+  if (Hi20.getOpcode() == LoongArch::LU12I_W)
+    Add->getOperand(3).setOffset(Offset);
+
   // Delete the tail instr

[llvm-branch-commits] [llvm] AMDGPU: Implement isExtractVecEltCheap (PR #122460)

2025-01-14 Thread Joe Nash via llvm-branch-commits


@@ -1949,6 +1949,13 @@ bool SITargetLowering::isExtractSubvectorCheap(EVT ResVT, EVT SrcVT,
   return Index == 0;
 }
 
+bool SITargetLowering::isExtractVecEltCheap(EVT VT, unsigned Index) const {
+  // TODO: This should be more aggressive, particularly for 16-bit element
+  // vectors. However, there are some mixed improvements and regressions.
+  EVT EltTy = VT.getVectorElementType();
+  return EltTy.getSizeInBits() % 32 == 0;

Sisyph wrote:

Yes, I would think `EltTy.getSizeInBits() * Index % 16 == 0` for True16 would
be the way to go.
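
For comparison, here is a minimal standalone sketch of the two predicates under discussion. The function names `isCheapCurrent` and `isCheapTrue16` are hypothetical and take plain integers instead of `EVT`; this only models the arithmetic, not the actual `SITargetLowering` hook.

```cpp
#include <cassert>

// Hypothetical model of the check in the patch: an element is treated as
// cheap to extract only when its size is a multiple of 32 bits.
bool isCheapCurrent(unsigned EltBits) {
  assert(EltBits > 0 && "element size must be non-zero");
  return EltBits % 32 == 0;
}

// Hypothetical model of the True16 suggestion: the bit offset of the
// element (size * index) must fall on a 16-bit boundary.
bool isCheapTrue16(unsigned EltBits, unsigned Index) {
  assert(EltBits > 0 && "element size must be non-zero");
  return (EltBits * Index) % 16 == 0;
}
```

Under this model a 16-bit element at index 1 is rejected by the first predicate but accepted by the second, while an 8-bit element at index 1 is rejected by both.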

https://github.com/llvm/llvm-project/pull/122460
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] [llvm] [OffloadBundler] Rework the ctor of `OffloadTargetInfo` to support AMDGPU's generic target (PR #122629)

2025-01-14 Thread Shilei Tian via llvm-branch-commits

https://github.com/shiltian updated 
https://github.com/llvm/llvm-project/pull/122629

From e7eac39e7fd0516069edbc1b0b62bf8eced359df Mon Sep 17 00:00:00 2001
From: Shilei Tian 
Date: Sun, 12 Jan 2025 18:01:55 -0500
Subject: [PATCH] [OffloadBundler] Rework the ctor of `OffloadTargetInfo` to
 support generic target

The current parsing assumes the target string has the form
`kind-triple-targetid:feature`, such as
`hipv4-amdgcn-amd-amdhsa-gfx1030:+xnack`. Specifically, it assumes the target
id does not contain any `-`, which is not the case for a generic target: a
generic target id may contain one or more `-`, such as `gfx10-3-generic` and
`gfx12-generic`. As a result, we can no longer depend on `rsplit` to get things
right. This patch reworks the target string parsing to make it more robust and
to support generic targets.
---
 clang/docs/ClangOffloadBundler.rst            |  7 ++-
 clang/lib/Driver/OffloadBundler.cpp           | 49 +--
 clang/lib/Driver/ToolChains/Clang.cpp         |  6 ++-
 clang/lib/Driver/ToolChains/CommonArgs.cpp    |  3 +-
 clang/lib/Driver/ToolChains/HIPUtility.cpp    |  2 +-
 .../Driver/clang-offload-bundler-asserts-on.c | 14 +++---
 .../clang-offload-bundler-standardize.c       | 18 ++-
 .../test/Driver/clang-offload-bundler-zlib.c  | 12 ++---
 .../test/Driver/clang-offload-bundler-zstd.c  | 12 ++---
 clang/test/Driver/clang-offload-bundler.c     | 48 +-
 clang/test/Driver/hip-link-bc-to-bc.hip       |  8 +--
 clang/test/Driver/hip-link-bundle-archive.hip | 10 ++--
 .../test/Driver/hip-offload-compress-zstd.hip |  4 +-
 clang/test/Driver/hip-rdc-device-only.hip     |  8 +--
 .../Driver/hip-toolchain-rdc-separate.hip     | 12 ++---
 llvm/utils/lit/lit/llvm/config.py             | 12 -
 16 files changed, 114 insertions(+), 111 deletions(-)

diff --git a/clang/docs/ClangOffloadBundler.rst b/clang/docs/ClangOffloadBundler.rst
index 3c241027d405ca..25214c2ea6a4e1 100644
--- a/clang/docs/ClangOffloadBundler.rst
+++ b/clang/docs/ClangOffloadBundler.rst
@@ -266,15 +266,14 @@ without differentiation based on offload kind.
 The target triple of the code object. See `Target Triple
 `_.
 
-The bundler accepts target triples with or without the optional environment
-field:
+LLVM target triples can be with or without the optional environment field:
 
 ``--``, or
 ``---``
 
 However, in order to standardize outputs for tools that consume bitcode
-bundles, bundles written by the bundler internally use only the 4-field
-target triple:
+bundles, the bundler only accepts target triples with the 4-field target
+triple:
 
 ``---``
 
diff --git a/clang/lib/Driver/OffloadBundler.cpp b/clang/lib/Driver/OffloadBundler.cpp
index 2d6bdff0393be5..c29ab61853efa8 100644
--- a/clang/lib/Driver/OffloadBundler.cpp
+++ b/clang/lib/Driver/OffloadBundler.cpp
@@ -84,31 +84,27 @@ OffloadTargetInfo::OffloadTargetInfo(const StringRef Target,
 : BundlerConfig(BC) {
 
   // TODO: Add error checking from ClangOffloadBundler.cpp
-  auto TargetFeatures = Target.split(':');
-  auto TripleOrGPU = TargetFeatures.first.rsplit('-');
-
-  if (clang::StringToOffloadArch(TripleOrGPU.second) !=
-      clang::OffloadArch::UNKNOWN) {
-    auto KindTriple = TripleOrGPU.first.split('-');
-    this->OffloadKind = KindTriple.first;
-
-    // Enforce optional env field to standardize bundles
-    llvm::Triple t = llvm::Triple(KindTriple.second);
-    this->Triple = llvm::Triple(t.getArchName(), t.getVendorName(),
-                                t.getOSName(), t.getEnvironmentName());
-
-    this->TargetID = Target.substr(Target.find(TripleOrGPU.second));
-  } else {
-    auto KindTriple = TargetFeatures.first.split('-');
-    this->OffloadKind = KindTriple.first;
-
-    // Enforce optional env field to standardize bundles
-    llvm::Triple t = llvm::Triple(KindTriple.second);
-    this->Triple = llvm::Triple(t.getArchName(), t.getVendorName(),
-                                t.getOSName(), t.getEnvironmentName());
-
+  // -[-[:target features]]
+  //  := ---
+  SmallVector Components;
+  Target.split(Components, '-', /*MaxSplit=*/5);
+  assert((Components.size() == 5 || Components.size() == 6) &&
+         "malformed target string");
+
+  StringRef TargetIdWithFeature =
+      Components.size() == 6 ? Components.back() : "";
+  StringRef TargetId = TargetIdWithFeature.split(':').first;
+  if (!TargetId.empty() &&
+      clang::StringToOffloadArch(TargetId) != clang::OffloadArch::UNKNOWN)
+    this->TargetID = TargetIdWithFeature;
+  else
     this->TargetID = "";
-  }
+
+  this->OffloadKind = Components.front();
+  ArrayRef TripleSlice{&Components[1], /*length=*/4};
+  llvm::Triple T = llvm::Triple(llvm::join(TripleSlice, "-"));
+  this->Triple = llvm::Triple(T.getArchName(), T.getVendorName(), T.getOSName(),
+                              T.getEnvironmentName(

[llvm-branch-commits] [clang] [llvm] [OffloadBundler] Rework the ctor of `OffloadTargetInfo` to support AMDGPU's generic target (PR #122629)

2025-01-14 Thread Shilei Tian via llvm-branch-commits

shiltian wrote:

> how about assuming the strict triple format first, this will make the generic 
> GPU arch work with strict triple.

That doesn't work if someone writes a 3-tuple with a generic target, because
neither pass can handle that.
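
To illustrate, here is a minimal standalone sketch of fixed-position splitting (at most five splits on `-`), written against the standard library only. `splitMax`, `ParsedTarget` and `parseTarget` are hypothetical names that imitate the idea; they are not the bundler code, and the check against known offload archs is left out.

```cpp
#include <cassert>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical helper: split S on Sep at most MaxSplit times, keeping empty
// fields; whatever remains after the last split becomes the final element.
std::vector<std::string_view> splitMax(std::string_view S, char Sep,
                                       int MaxSplit) {
  std::vector<std::string_view> Out;
  while (MaxSplit-- > 0) {
    std::string_view::size_type Pos = S.find(Sep);
    if (Pos == std::string_view::npos)
      break;
    Out.push_back(S.substr(0, Pos));
    S.remove_prefix(Pos + 1);
  }
  Out.push_back(S);
  return Out;
}

// Hypothetical result type for the sketch.
struct ParsedTarget {
  std::string Kind, Triple, TargetId;
};

// Assumes the form kind-arch-vendor-os-env[-targetid[:features]].
ParsedTarget parseTarget(std::string_view Target) {
  std::vector<std::string_view> C = splitMax(Target, '-', 5);
  assert((C.size() == 5 || C.size() == 6) && "malformed target string");
  ParsedTarget P;
  P.Kind = std::string(C[0]);
  P.Triple = std::string(C[1]);
  for (unsigned I = 2; I <= 4; ++I) {
    P.Triple += '-';
    P.Triple.append(C[I].data(), C[I].size());
  }
  if (C.size() == 6)
    P.TargetId = std::string(C[5]); // features, if any, stay attached
  return P;
}
```

With a strict 4-field triple, `hip-amdgcn-amd-amdhsa--gfx10-3-generic` yields the target id `gfx10-3-generic`. With a 3-field triple, `hip-amdgcn-amd-amdhsa-gfx10-3-generic` puts `gfx10` in the environment slot and leaves `3-generic` as the target id, which is the failure mode described above.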

https://github.com/llvm/llvm-project/pull/122629


[llvm-branch-commits] Reapply "[Fuchsia][cmake] Allow using FatLTO when building runtimes" (#119252) (PR #121820)

2025-01-14 Thread Paul Kirth via llvm-branch-commits

https://github.com/ilovepi updated 
https://github.com/llvm/llvm-project/pull/121820




[llvm-branch-commits] [clang] [llvm] [OffloadBundler] Rework the ctor of `OffloadTargetInfo` to support AMDGPU's generic target (PR #122629)

2025-01-14 Thread Yaxun Liu via llvm-branch-commits

yxsamliu wrote:

> > Still, I think it is possible to make it generic with a minor assumption.
> > Let's say you are now about to parse the final part of the target ID
> > string, which may be either "env-cpu" or "cpu" without env.
> 
> This is not actually the issue. The issue is when the cpu is a generic
> target, such as `gfx10-3-generic`. By the current logic, the target id after
> the split is `generic`, which is a perfectly valid one, and that leaves the
> rest as something like `hip-amd-amdhsa-amd-gfx10-3`.

That is probably due to this line 
https://github.com/llvm/llvm-project/blob/main/clang/lib/Driver/OffloadBundler.cpp#L88

It assumes there is no '-' in the GPU name.

We could add a loop: if that line fails, we split at the second '-' from the
right.
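
A minimal sketch of that loop, under stated assumptions: `isKnownArch` is a hypothetical stand-in for the `clang::StringToOffloadArch(...) != UNKNOWN` check with a made-up table, the input has already had any `:feature` part removed, and bare `generic` is deliberately not in the table. If `generic` alone is accepted as a valid arch, as the quoted reply says, the first split would already succeed and the loop would never reach the longer suffix.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <string_view>
#include <utility>

// Hypothetical stand-in for the real arch lookup; the table is illustrative.
bool isKnownArch(std::string_view Name) {
  static const std::set<std::string_view> Known = {
      "gfx1030", "gfx10-3-generic", "gfx12-generic"};
  return Known.count(Name) != 0;
}

// Try the suffix after the last '-', then after the second-to-last '-', and
// so on, until a known arch is found. Returns {prefix, arch}; arch is empty
// when no suffix is recognised.
std::pair<std::string, std::string> splitArchFromRight(std::string_view S) {
  std::string_view::size_type Pos = S.rfind('-');
  while (Pos != std::string_view::npos) {
    std::string_view Suffix = S.substr(Pos + 1);
    if (isKnownArch(Suffix))
      return {std::string(S.substr(0, Pos)), std::string(Suffix)};
    if (Pos == 0)
      break; // nothing left to the left of this '-'
    Pos = S.rfind('-', Pos - 1);
  }
  return {std::string(S), std::string()};
}
```

For `hip-amdgcn-amd-amdhsa-gfx10-3-generic` this rejects `generic` and `3-generic` before accepting `gfx10-3-generic`, leaving `hip-amdgcn-amd-amdhsa` as the prefix.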

https://github.com/llvm/llvm-project/pull/122629