[llvm-branch-commits] [llvm] [NFC] Leave a comment in `Local.cpp` about debug info & sample profiling (PR #155296)

2025-09-13 Thread Mircea Trofin via llvm-branch-commits

https://github.com/mtrofin updated 
https://github.com/llvm/llvm-project/pull/155296

>From 421deb7334bf6030686eb809132e1d13b730cbc6 Mon Sep 17 00:00:00 2001
From: Mircea Trofin 
Date: Mon, 25 Aug 2025 21:04:05 +
Subject: [PATCH] [NFC] Leave a comment in `Local.cpp` about debug info &
 sample profiling

---
 llvm/lib/Transforms/Utils/Local.cpp | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/llvm/lib/Transforms/Utils/Local.cpp 
b/llvm/lib/Transforms/Utils/Local.cpp
index 2cfd70a1746c8..57dc1b38b8ec3 100644
--- a/llvm/lib/Transforms/Utils/Local.cpp
+++ b/llvm/lib/Transforms/Utils/Local.cpp
@@ -3342,8 +3342,11 @@ void llvm::hoistAllInstructionsInto(BasicBlock 
*DomBlock, Instruction *InsertPt,
   // retain their original debug locations (DILocations) and debug intrinsic
   // instructions.
   //
-  // Doing so would degrade the debugging experience and adversely affect the
-  // accuracy of profiling information.
+  // Doing so would degrade the debugging experience.
+  //
+  // FIXME: Issue #152767: debug info should also be the same as the
+  // original branch, **if** the user explicitly indicated that (for sampling
+  // PGO)
   //
   // Currently, when hoisting the instructions, we take the following actions:
   // - Remove their debug intrinsic instructions.

___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] release/21.x: [Clang][Cygwin] Use correct mangling rule (#158404) (PR #158442)

2025-09-13 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-clang

Author: None (llvmbot)


Changes

Backport 4abcbb053f8adaf48dbfff677e8ccda1f6d52b33

Requested by: @mstorsjo

---
Full diff: https://github.com/llvm/llvm-project/pull/158442.diff


3 Files Affected:

- (modified) clang/lib/Basic/Targets/X86.h (+2) 
- (modified) clang/test/CodeGen/mangle-windows.c (+4-2) 
- (modified) clang/test/CodeGenCXX/mangle-windows.cpp (+3) 


```diff
diff --git a/clang/lib/Basic/Targets/X86.h b/clang/lib/Basic/Targets/X86.h
index ebc59c92f4c24..a7be080695ed3 100644
--- a/clang/lib/Basic/Targets/X86.h
+++ b/clang/lib/Basic/Targets/X86.h
@@ -649,6 +649,7 @@ class LLVM_LIBRARY_VISIBILITY CygwinX86_32TargetInfo : public X86_32TargetInfo {
   : X86_32TargetInfo(Triple, Opts) {
 this->WCharType = TargetInfo::UnsignedShort;
 this->WIntType = TargetInfo::UnsignedInt;
+this->UseMicrosoftManglingForC = true;
 DoubleAlign = LongLongAlign = 64;
 resetDataLayout("e-m:x-p:32:32-p270:32:32-p271:32:32-p272:64:64-i64:64-"
 "i128:128-f80:32-n8:16:32-a:0:32-S32",
@@ -986,6 +987,7 @@ class LLVM_LIBRARY_VISIBILITY CygwinX86_64TargetInfo : public X86_64TargetInfo {
   : X86_64TargetInfo(Triple, Opts) {
 this->WCharType = TargetInfo::UnsignedShort;
 this->WIntType = TargetInfo::UnsignedInt;
+this->UseMicrosoftManglingForC = true;
   }
 
   void getTargetDefines(const LangOptions &Opts,
diff --git a/clang/test/CodeGen/mangle-windows.c b/clang/test/CodeGen/mangle-windows.c
index 046b1e8815a8a..e1b06e72a9635 100644
--- a/clang/test/CodeGen/mangle-windows.c
+++ b/clang/test/CodeGen/mangle-windows.c
@@ -1,8 +1,10 @@
 // RUN: %clang_cc1 -emit-llvm %s -o - -triple=i386-pc-win32 | FileCheck %s
-// RUN: %clang_cc1 -emit-llvm %s -o - -triple=i386-mingw32 | FileCheck %s
+// RUN: %clang_cc1 -emit-llvm %s -o - -triple=i386-mingw32  | FileCheck %s
+// RUN: %clang_cc1 -emit-llvm %s -o - -triple=i386-cygwin   | FileCheck %s
 // RUN: %clang_cc1 -emit-llvm %s -o - -triple=i386-pc-windows-msvc-elf | FileCheck %s --check-prefix=ELF32
 // RUN: %clang_cc1 -emit-llvm %s -o - -triple=x86_64-pc-win32 | FileCheck %s --check-prefix=X64
-// RUN: %clang_cc1 -emit-llvm %s -o - -triple=x86_64-mingw32 | FileCheck %s --check-prefix=X64
+// RUN: %clang_cc1 -emit-llvm %s -o - -triple=x86_64-mingw32  | FileCheck %s --check-prefix=X64
+// RUN: %clang_cc1 -emit-llvm %s -o - -triple=x86_64-cygwin   | FileCheck %s --check-prefix=X64
 // RUN: %clang_cc1 -emit-llvm %s -o - -triple=x86_64-pc-windows-msvc-elf | FileCheck %s --check-prefix=ELF64
 
 // CHECK: target datalayout = "e-m:x-{{.*}}"
diff --git a/clang/test/CodeGenCXX/mangle-windows.cpp b/clang/test/CodeGenCXX/mangle-windows.cpp
index 3d5a1e9a868ef..737abcf6e3498 100644
--- a/clang/test/CodeGenCXX/mangle-windows.cpp
+++ b/clang/test/CodeGenCXX/mangle-windows.cpp
@@ -4,6 +4,9 @@
 // RUN: %clang_cc1 -emit-llvm %s -o - -triple=i386-mingw32 | \
 // RUN: FileCheck --check-prefix=ITANIUM %s
 
+// RUN: %clang_cc1 -emit-llvm %s -o - -triple=i386-cygwin | \
+// RUN: FileCheck --check-prefix=ITANIUM %s
+
 void __stdcall f1(void) {}
 // WIN: define dso_local x86_stdcallcc void @"?f1@@YGXXZ"
 // ITANIUM: define dso_local x86_stdcallcc void @"\01__Z2f1v@0"

```




https://github.com/llvm/llvm-project/pull/158442


[llvm-branch-commits] [clang] release/21.x: [Clang][Cygwin] Use correct mangling rule (#158404) (PR #158442)

2025-09-13 Thread via llvm-branch-commits

https://github.com/llvmbot created 
https://github.com/llvm/llvm-project/pull/158442

Backport 4abcbb053f8adaf48dbfff677e8ccda1f6d52b33

Requested by: @mstorsjo

>From f4907049285ca0875cc91770e3ceb3f162ec7c48 Mon Sep 17 00:00:00 2001
From: Tomohiro Kashiwada 
Date: Sun, 14 Sep 2025 06:40:12 +0900
Subject: [PATCH] [Clang][Cygwin] Use correct mangling rule (#158404)

In
https://github.com/llvm/llvm-project/commit/45ca613c135ea7b5fbc63bff003f20bf20f62081,
whether to mangle names based on calling conventions according to
Microsoft conventions was refactored to a bool in the TargetInfo. Cygwin
targets also require this mangling, but were missed, presumably due to
lack of test coverage of these targets. This commit enables the name
mangling for Cygwin, and also enables test coverage of this mangling on
Cygwin targets.

(cherry picked from commit 4abcbb053f8adaf48dbfff677e8ccda1f6d52b33)
---
 clang/lib/Basic/Targets/X86.h            | 2 ++
 clang/test/CodeGen/mangle-windows.c      | 6 ++++--
 clang/test/CodeGenCXX/mangle-windows.cpp | 3 +++
 3 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/clang/lib/Basic/Targets/X86.h b/clang/lib/Basic/Targets/X86.h
index ebc59c92f4c24..a7be080695ed3 100644
--- a/clang/lib/Basic/Targets/X86.h
+++ b/clang/lib/Basic/Targets/X86.h
@@ -649,6 +649,7 @@ class LLVM_LIBRARY_VISIBILITY CygwinX86_32TargetInfo : 
public X86_32TargetInfo {
   : X86_32TargetInfo(Triple, Opts) {
 this->WCharType = TargetInfo::UnsignedShort;
 this->WIntType = TargetInfo::UnsignedInt;
+this->UseMicrosoftManglingForC = true;
 DoubleAlign = LongLongAlign = 64;
 resetDataLayout("e-m:x-p:32:32-p270:32:32-p271:32:32-p272:64:64-i64:64-"
 "i128:128-f80:32-n8:16:32-a:0:32-S32",
@@ -986,6 +987,7 @@ class LLVM_LIBRARY_VISIBILITY CygwinX86_64TargetInfo : 
public X86_64TargetInfo {
   : X86_64TargetInfo(Triple, Opts) {
 this->WCharType = TargetInfo::UnsignedShort;
 this->WIntType = TargetInfo::UnsignedInt;
+this->UseMicrosoftManglingForC = true;
   }
 
   void getTargetDefines(const LangOptions &Opts,
diff --git a/clang/test/CodeGen/mangle-windows.c 
b/clang/test/CodeGen/mangle-windows.c
index 046b1e8815a8a..e1b06e72a9635 100644
--- a/clang/test/CodeGen/mangle-windows.c
+++ b/clang/test/CodeGen/mangle-windows.c
@@ -1,8 +1,10 @@
 // RUN: %clang_cc1 -emit-llvm %s -o - -triple=i386-pc-win32 | FileCheck %s
-// RUN: %clang_cc1 -emit-llvm %s -o - -triple=i386-mingw32 | FileCheck %s
+// RUN: %clang_cc1 -emit-llvm %s -o - -triple=i386-mingw32  | FileCheck %s
+// RUN: %clang_cc1 -emit-llvm %s -o - -triple=i386-cygwin   | FileCheck %s
 // RUN: %clang_cc1 -emit-llvm %s -o - -triple=i386-pc-windows-msvc-elf | 
FileCheck %s --check-prefix=ELF32
 // RUN: %clang_cc1 -emit-llvm %s -o - -triple=x86_64-pc-win32 | FileCheck %s 
--check-prefix=X64
-// RUN: %clang_cc1 -emit-llvm %s -o - -triple=x86_64-mingw32 | FileCheck %s 
--check-prefix=X64
+// RUN: %clang_cc1 -emit-llvm %s -o - -triple=x86_64-mingw32  | FileCheck %s 
--check-prefix=X64
+// RUN: %clang_cc1 -emit-llvm %s -o - -triple=x86_64-cygwin   | FileCheck %s 
--check-prefix=X64
 // RUN: %clang_cc1 -emit-llvm %s -o - -triple=x86_64-pc-windows-msvc-elf | 
FileCheck %s --check-prefix=ELF64
 
 // CHECK: target datalayout = "e-m:x-{{.*}}"
diff --git a/clang/test/CodeGenCXX/mangle-windows.cpp 
b/clang/test/CodeGenCXX/mangle-windows.cpp
index 3d5a1e9a868ef..737abcf6e3498 100644
--- a/clang/test/CodeGenCXX/mangle-windows.cpp
+++ b/clang/test/CodeGenCXX/mangle-windows.cpp
@@ -4,6 +4,9 @@
 // RUN: %clang_cc1 -emit-llvm %s -o - -triple=i386-mingw32 | \
 // RUN: FileCheck --check-prefix=ITANIUM %s
 
+// RUN: %clang_cc1 -emit-llvm %s -o - -triple=i386-cygwin | \
+// RUN: FileCheck --check-prefix=ITANIUM %s
+
 void __stdcall f1(void) {}
 // WIN: define dso_local x86_stdcallcc void @"?f1@@YGXXZ"
 // ITANIUM: define dso_local x86_stdcallcc void @"\01__Z2f1v@0"
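
For readers unfamiliar with the decoration this patch enables on Cygwin, the following is a minimal sketch of the 32-bit Microsoft-style rule for names that carry a calling convention. `decorateForX86` and `CallConv` are hypothetical names for illustration only, not Clang APIs, and the sketch ignores the `\01` marker Clang places in front of the final symbol. Applied to the Itanium-mangled name `_Z2f1v` with zero argument bytes, it produces `__Z2f1v@0`, the string the ITANIUM check line above expects.

```cpp
#include <string>

enum class CallConv { CDecl, StdCall, FastCall };

// Hypothetical helper (not a Clang API): decorates a symbol the way 32-bit
// Microsoft-style toolchains do for each calling convention.
// ArgBytes is the total size of the arguments in bytes.
std::string decorateForX86(CallConv CC, const std::string &Name,
                           unsigned ArgBytes) {
  switch (CC) {
  case CallConv::StdCall: // _name@N
    return "_" + Name + "@" + std::to_string(ArgBytes);
  case CallConv::FastCall: // @name@N
    return "@" + Name + "@" + std::to_string(ArgBytes);
  case CallConv::CDecl: // _name, no size suffix
    break;
  }
  return "_" + Name;
}
```

The sketch covers only the three conventions shown; the real logic also has to compute the argument size from the function type, which is omitted here.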



[llvm-branch-commits] [clang] release/21.x: [Clang][Cygwin] Use correct mangling rule (#158404) (PR #158442)

2025-09-13 Thread via llvm-branch-commits

llvmbot wrote:

@jeremyd2019 What do you think about merging this PR to the release branch?

https://github.com/llvm/llvm-project/pull/158442


[llvm-branch-commits] [llvm] CodeGen: Remove TRI argument from reMaterialize (PR #158229)

2025-09-13 Thread Simon Pilgrim via llvm-branch-commits

https://github.com/RKSimon approved this pull request.


https://github.com/llvm/llvm-project/pull/158229


[llvm-branch-commits] [llvm] [DA] Add overflow check in ExactSIV (PR #157086)

2025-09-13 Thread Ryotaro Kasuga via llvm-branch-commits

https://github.com/kasuga-fj updated 
https://github.com/llvm/llvm-project/pull/157086

>From 94b18495719b35a89ee6a18e474e8e92a4429d99 Mon Sep 17 00:00:00 2001
From: Ryotaro Kasuga 
Date: Fri, 5 Sep 2025 11:41:29 +
Subject: [PATCH] [DA] Add overflow check in ExactSIV

---
 llvm/lib/Analysis/DependenceAnalysis.cpp          | 14 +++++++++++++-
 llvm/test/Analysis/DependenceAnalysis/ExactSIV.ll |  2 +-
 2 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/llvm/lib/Analysis/DependenceAnalysis.cpp 
b/llvm/lib/Analysis/DependenceAnalysis.cpp
index 0f77a1410e83b..6e576e866b310 100644
--- a/llvm/lib/Analysis/DependenceAnalysis.cpp
+++ b/llvm/lib/Analysis/DependenceAnalysis.cpp
@@ -1170,6 +1170,15 @@ const SCEVConstant 
*DependenceInfo::collectConstantUpperBound(const Loop *L,
   return nullptr;
 }
 
+/// Returns \p A - \p B if it guaranteed not to signed wrap. Otherwise returns
+/// nullptr. \p A and \p B must have the same integer type.
+static const SCEV *minusSCEVNoSignedOverflow(const SCEV *A, const SCEV *B,
+ ScalarEvolution &SE) {
+  if (SE.willNotOverflow(Instruction::Sub, /*Signed=*/true, A, B))
+return SE.getMinusSCEV(A, B);
+  return nullptr;
+}
+
 // testZIV -
 // When we have a pair of subscripts of the form [c1] and [c2],
 // where c1 and c2 are both loop invariant, we attack it using
@@ -1626,7 +1635,9 @@ bool DependenceInfo::exactSIVtest(const SCEV *SrcCoeff, 
const SCEV *DstCoeff,
   assert(0 < Level && Level <= CommonLevels && "Level out of range");
   Level--;
   Result.Consistent = false;
-  const SCEV *Delta = SE->getMinusSCEV(DstConst, SrcConst);
+  const SCEV *Delta = minusSCEVNoSignedOverflow(DstConst, SrcConst, *SE);
+  if (!Delta)
+return false;
   LLVM_DEBUG(dbgs() << "\tDelta = " << *Delta << "\n");
   NewConstraint.setLine(SrcCoeff, SE->getNegativeSCEV(DstCoeff), Delta,
 CurLoop);
@@ -1716,6 +1727,7 @@ bool DependenceInfo::exactSIVtest(const SCEV *SrcCoeff, 
const SCEV *DstCoeff,
   // explore directions
   unsigned NewDirection = Dependence::DVEntry::NONE;
   APInt LowerDistance, UpperDistance;
+  // TODO: Overflow check may be needed.
   if (TA.sgt(TB)) {
 LowerDistance = (TY - TX) + (TA - TB) * TL;
 UpperDistance = (TY - TX) + (TA - TB) * TU;
diff --git a/llvm/test/Analysis/DependenceAnalysis/ExactSIV.ll 
b/llvm/test/Analysis/DependenceAnalysis/ExactSIV.ll
index 54bb8b73da02a..fd58568d02c43 100644
--- a/llvm/test/Analysis/DependenceAnalysis/ExactSIV.ll
+++ b/llvm/test/Analysis/DependenceAnalysis/ExactSIV.ll
@@ -841,7 +841,7 @@ define void @exact14(ptr %A) {
 ; CHECK-SIV-ONLY-NEXT:  Src: store i8 0, ptr %idx.0, align 1 --> Dst: store i8 
0, ptr %idx.0, align 1
 ; CHECK-SIV-ONLY-NEXT:da analyze - none!
 ; CHECK-SIV-ONLY-NEXT:  Src: store i8 0, ptr %idx.0, align 1 --> Dst: store i8 
1, ptr %idx.1, align 1
-; CHECK-SIV-ONLY-NEXT:da analyze - none!
+; CHECK-SIV-ONLY-NEXT:da analyze - output [*|<]!
 ; CHECK-SIV-ONLY-NEXT:  Src: store i8 1, ptr %idx.1, align 1 --> Dst: store i8 
1, ptr %idx.1, align 1
 ; CHECK-SIV-ONLY-NEXT:da analyze - none!
 ;
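
To illustrate the guard added by `minusSCEVNoSignedOverflow`, here is a minimal standalone sketch on plain 64-bit integers instead of SCEV expressions. `checkedSignedSub` and `canUseDelta` are hypothetical names, not LLVM APIs; the empty optional plays the role of the `nullptr` result in the patch, and the caller then gives up on proving independence, as the `if (!Delta) return false;` hunk does.

```cpp
#include <cstdint>
#include <limits>
#include <optional>

// Hypothetical analogue of minusSCEVNoSignedOverflow on int64_t:
// returns A - B only when the subtraction cannot wrap.
std::optional<int64_t> checkedSignedSub(int64_t A, int64_t B) {
  constexpr int64_t Min = std::numeric_limits<int64_t>::min();
  constexpr int64_t Max = std::numeric_limits<int64_t>::max();
  // A - B would exceed Max when B is negative and A > Max + B.
  if (B < 0 && A > Max + B)
    return std::nullopt;
  // A - B would fall below Min when B is positive and A < Min + B.
  if (B > 0 && A < Min + B)
    return std::nullopt;
  return A - B;
}

// Mirrors the early exit in the patch: a Delta that may have wrapped is
// never used, so the test conservatively proves nothing in that case.
bool canUseDelta(int64_t DstConst, int64_t SrcConst) {
  return checkedSignedSub(DstConst, SrcConst).has_value();
}
```

Both range checks are written so that the comparison itself cannot overflow, which is the same property the patch obtains from `SE.willNotOverflow`.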



[llvm-branch-commits] [Clang] Rewrite tests using subshells to set env variables (PR #158446)

2025-09-13 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-clang-modules

Author: Aiden Grossman (boomanaiden154)


Changes

Now that we have the %readfile substitution, we can rewrite the tests
that set environment variables through subshells: write the output of
the command into a file, then load it where it is needed using readfile.

This does involve one invocation of bash so that we are using the system
env binary, which does support redirection into a tool like grep. We
already do this in one LLVM test. I'm not happy about needing that, but
the more principled way to solve it involves reworking how in-process
builtins work within lit. That is something we want to do eventually,
but not something that I think should block this patch.


---
Full diff: https://github.com/llvm/llvm-project/pull/158446.diff


5 Files Affected:

- (modified) clang/test/ClangScanDeps/pr61006.cppm (+2-1) 
- (modified) clang/test/ClangScanDeps/resource_directory.c (+4-5) 
- (modified) clang/test/Driver/env.c (+3-2) 
- (modified) clang/test/Driver/program-path-priority.c (+8-8) 
- (modified) clang/test/Modules/relative-resource-dir.m (+3-3) 


``diff
diff --git a/clang/test/ClangScanDeps/pr61006.cppm 
b/clang/test/ClangScanDeps/pr61006.cppm
index f75edd38c81ba..f10bc1e673987 100644
--- a/clang/test/ClangScanDeps/pr61006.cppm
+++ b/clang/test/ClangScanDeps/pr61006.cppm
@@ -6,7 +6,8 @@
 // RUN: mkdir -p %t
 // RUN: split-file %s %t
 //
-// RUN: EXPECTED_RESOURCE_DIR=`%clang -print-resource-dir` && \
+// RUN: %clang -print-resource-dir | tr -d '\n' > %t/resource-dir
+// RUN: env EXPECTED_RESOURCE_DIR=%{readfile:%t/resource-dir} && \
 // RUN: ln -s %clang++ %t/clang++ && \
 // RUN: sed "s|EXPECTED_RESOURCE_DIR|$EXPECTED_RESOURCE_DIR|g; s|DIR|%/t|g" 
%t/P1689.json.in > %t/P1689.json && \
 // RUN: clang-scan-deps -compilation-database %t/P1689.json -format=p1689 | 
FileCheck %t/a.cpp -DPREFIX=%/t && \
diff --git a/clang/test/ClangScanDeps/resource_directory.c 
b/clang/test/ClangScanDeps/resource_directory.c
index 55d5d90bbcdea..6183e8aefacfa 100644
--- a/clang/test/ClangScanDeps/resource_directory.c
+++ b/clang/test/ClangScanDeps/resource_directory.c
@@ -12,14 +12,14 @@
 // then verify `%clang-scan-deps` arrives at the same path by calling the
 // `Driver::GetResourcesPath` function.
 //
-// RUN: EXPECTED_RESOURCE_DIR=`%clang -print-resource-dir`
+// RUN: %clang -print-resource-dir | tr -d '\n' > %t/resource-dir
 // RUN: sed -e "s|CLANG|%clang|g" -e "s|DIR|%/t|g" \
 // RUN:   %S/Inputs/resource_directory/cdb.json.template > %t/cdb_path.json
 //
 // RUN: clang-scan-deps -compilation-database %t/cdb_path.json --format 
experimental-full \
 // RUN:   --resource-dir-recipe modify-compiler-path > %t/result_path.json
 // RUN: cat %t/result_path.json | sed 's:\?:/:g' \
-// RUN:   | FileCheck %s --check-prefix=CHECK-PATH 
-DEXPECTED_RESOURCE_DIR="$EXPECTED_RESOURCE_DIR"
+// RUN:   | FileCheck %s --check-prefix=CHECK-PATH 
-DEXPECTED_RESOURCE_DIR="%{readfile:%t/resource-dir}"
 // CHECK-PATH:  "-resource-dir"
 // CHECK-PATH-NEXT: "[[EXPECTED_RESOURCE_DIR]]"
 
@@ -31,9 +31,8 @@
 // Here we hard-code the expected path into `%t/compiler` and then verify
 // `%clang-scan-deps` arrives at the path by actually running the executable.
 //
-// RUN: EXPECTED_RESOURCE_DIR="/custom/compiler/resources"
 // RUN: echo "#!/bin/sh"  > %t/compiler
-// RUN: echo "echo '$EXPECTED_RESOURCE_DIR'" >> %t/compiler
+// RUN: echo "echo '/custom/compiler/resources'" >> %t/compiler
 // RUN: chmod +x %t/compiler
 // RUN: sed -e "s|CLANG|%/t/compiler|g" -e "s|DIR|%/t|g" \
 // RUN:   %S/Inputs/resource_directory/cdb.json.template > 
%t/cdb_invocation.json
@@ -41,6 +40,6 @@
 // RUN: clang-scan-deps -compilation-database %t/cdb_invocation.json --format 
experimental-full \
 // RUN:   --resource-dir-recipe invoke-compiler > %t/result_invocation.json
 // RUN: cat %t/result_invocation.json | sed 's:\?:/:g' \
-// RUN:   | FileCheck %s --check-prefix=CHECK-PATH 
-DEXPECTED_RESOURCE_DIR="$EXPECTED_RESOURCE_DIR"
+// RUN:   | FileCheck %s --check-prefix=CHECK-PATH 
-DEXPECTED_RESOURCE_DIR="/custom/compiler/resources"
 // CHECK-INVOCATION:  "-resource-dir"
 // CHECK-INVOCATION-NEXT: "[[EXPECTED_RESOURCE_DIR]]"
diff --git a/clang/test/Driver/env.c b/clang/test/Driver/env.c
index b3345ae09ffef..68ded45385702 100644
--- a/clang/test/Driver/env.c
+++ b/clang/test/Driver/env.c
@@ -1,13 +1,14 @@
 // Some assertions in this test use Linux style (/) file paths.
 // UNSUPPORTED: system-windows
+// RUN: bash -c env | grep LD_LIBRARY_PATH | tr -d '\n' > /tmp/ld_library_path
 // The PATH variable is heavily used when trying to find a linker.
-// RUN: env -i LC_ALL=C LD_LIBRARY_PATH="$LD_LIBRARY_PATH" 
CLANG_NO_DEFAULT_CONFIG=1 \
+// RUN: env -i LC_ALL=C LD_LIBRARY_PATH="%{readfile:/tmp/ld_library_path}" 
CLANG_NO_DEFAULT_CONFIG=1 \
 // RUN:   %clang %s -### -o %t.o --target=i386-unknown-linux \
 // RUN: --sysroot=%S/Inputs/basic_linux_tree \
 // RUN: --rtlib=platfo

[llvm-branch-commits] [Clang] Rewrite tests using subshells to set env variables (PR #158446)

2025-09-13 Thread Aiden Grossman via llvm-branch-commits

https://github.com/boomanaiden154 created 
https://github.com/llvm/llvm-project/pull/158446

Now that we have the %readfile substitution, we can rewrite the tests
that set environment variables through subshells: write the output of
the command into a file, then load it where it is needed using readfile.

This does involve one invocation of bash so that we are using the system
env binary, which does support redirection into a tool like grep. We
already do this in one LLVM test. I'm not happy about needing that, but
the more principled way to solve it involves reworking how in-process
builtins work within lit. That is something we want to do eventually,
but not something that I think should block this patch.





[llvm-branch-commits] [clang] [llvm] [lit] Make builtin cat work with stdin (PR #158447)

2025-09-13 Thread Alexander Richardson via llvm-branch-commits

https://github.com/arichardson approved this pull request.


https://github.com/llvm/llvm-project/pull/158447


[llvm-branch-commits] [llvm] release/21.x: [RISCV] Support PreserveMost calling convention (#148214) (PR #158403)

2025-09-13 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-backend-risc-v

Author: None (llvmbot)


Changes

Backport 058d96f2d68d3496ae52774c06177d4a9039a134

Requested by: @llvmbot

---

Patch is 20.40 KiB, truncated to 20.00 KiB below, full version: 
https://github.com/llvm/llvm-project/pull/158403.diff


5 Files Affected:

- (modified) llvm/docs/LangRef.rst (+2) 
- (modified) llvm/lib/Target/RISCV/RISCVCallingConv.td (+4) 
- (modified) llvm/lib/Target/RISCV/RISCVISelLowering.cpp (+1) 
- (modified) llvm/lib/Target/RISCV/RISCVRegisterInfo.cpp (+10-1) 
- (added) llvm/test/CodeGen/RISCV/calling-conv-preserve-most.ll (+449) 
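
As a reading aid for the register set this backport documents in LangRef ("x5-x31 except x6, x7 and x28"), here is a small sketch that builds that set as a bitmask. `preserveMostCalleeSaved` is a hypothetical helper, not part of LLVM, and it models only the LangRef wording; the TableGen definition in the patch is derived from `CSR_Interrupt` and may cover more than this sketch shows. The RVE case drops x16-x31, following the `(sequence "X%u", 16, 31)` subtraction in `CSR_RT_MostRegs_RVE`.

```cpp
#include <bitset>

// Hypothetical helper: bit R is set when xR must be preserved by a
// preserve_most callee, per the LangRef sentence added in this patch.
std::bitset<32> preserveMostCalleeSaved(bool IsRVE) {
  std::bitset<32> Regs;
  // RVE targets only have x0-x15, so the upper half is never included.
  const unsigned Last = IsRVE ? 15 : 31;
  for (unsigned R = 5; R <= Last; ++R)
    Regs.set(R);
  // The three temporaries the convention leaves caller-saved.
  Regs.reset(6);
  Regs.reset(7);
  Regs.reset(28); // already clear for RVE
  return Regs;
}
```

Under these assumptions the full set holds 24 registers and the RVE set holds 9.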


``diff
diff --git a/llvm/docs/LangRef.rst b/llvm/docs/LangRef.rst
index 8ea850af7a69b..2a9b67b671e11 100644
--- a/llvm/docs/LangRef.rst
+++ b/llvm/docs/LangRef.rst
@@ -413,6 +413,8 @@ added in the future:
 - On AArch64 the callee preserve all general purpose registers, except
   X0-X8 and X16-X18. Not allowed with ``nest``.
 
+- On RISC-V the callee preserve x5-x31 except x6, x7 and x28 registers.
+
 The idea behind this convention is to support calls to runtime functions
 that have a hot path and a cold path. The hot path is usually a small piece
 of code that doesn't use many registers. The cold path might need to call 
out to
diff --git a/llvm/lib/Target/RISCV/RISCVCallingConv.td 
b/llvm/lib/Target/RISCV/RISCVCallingConv.td
index cbf039edec273..d8c52cbde04c7 100644
--- a/llvm/lib/Target/RISCV/RISCVCallingConv.td
+++ b/llvm/lib/Target/RISCV/RISCVCallingConv.td
@@ -93,3 +93,7 @@ def CSR_XLEN_F32_V_Interrupt_RVE: CalleeSavedRegs<(sub 
CSR_XLEN_F32_V_Interrupt,
 // Same as CSR_XLEN_F64_V_Interrupt, but excluding X16-X31.
 def CSR_XLEN_F64_V_Interrupt_RVE: CalleeSavedRegs<(sub 
CSR_XLEN_F64_V_Interrupt,
(sequence "X%u", 16, 31))>;
+
+def CSR_RT_MostRegs : CalleeSavedRegs<(sub CSR_Interrupt, X6, X7, X28)>;
+def CSR_RT_MostRegs_RVE : CalleeSavedRegs<(sub CSR_RT_MostRegs,
+   (sequence "X%u", 16, 31))>;
diff --git a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp 
b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
index 5fb16f5ac6b9e..07a03792c2b23 100644
--- a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
+++ b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
@@ -4,6 +4,7 @@ SDValue RISCVTargetLowering::LowerFormalArguments(
   case CallingConv::C:
   case CallingConv::Fast:
   case CallingConv::SPIR_KERNEL:
+  case CallingConv::PreserveMost:
   case CallingConv::GRAAL:
   case CallingConv::RISCV_VectorCall:
 #define CC_VLS_CASE(ABI_VLEN) case CallingConv::RISCV_VLSCall_##ABI_VLEN:
diff --git a/llvm/lib/Target/RISCV/RISCVRegisterInfo.cpp 
b/llvm/lib/Target/RISCV/RISCVRegisterInfo.cpp
index 540412366026b..214536d7f3a74 100644
--- a/llvm/lib/Target/RISCV/RISCVRegisterInfo.cpp
+++ b/llvm/lib/Target/RISCV/RISCVRegisterInfo.cpp
@@ -68,6 +68,9 @@ RISCVRegisterInfo::getCalleeSavedRegs(const MachineFunction 
*MF) const {
   auto &Subtarget = MF->getSubtarget();
   if (MF->getFunction().getCallingConv() == CallingConv::GHC)
 return CSR_NoRegs_SaveList;
+  if (MF->getFunction().getCallingConv() == CallingConv::PreserveMost)
+return Subtarget.hasStdExtE() ? CSR_RT_MostRegs_RVE_SaveList
+  : CSR_RT_MostRegs_SaveList;
   if (MF->getFunction().hasFnAttribute("interrupt")) {
 if (Subtarget.hasVInstructions()) {
   if (Subtarget.hasStdExtD())
@@ -811,7 +814,13 @@ RISCVRegisterInfo::getCallPreservedMask(const 
MachineFunction & MF,
 
   if (CC == CallingConv::GHC)
 return CSR_NoRegs_RegMask;
-  switch (Subtarget.getTargetABI()) {
+  RISCVABI::ABI ABI = Subtarget.getTargetABI();
+  if (CC == CallingConv::PreserveMost) {
+if (ABI == RISCVABI::ABI_ILP32E || ABI == RISCVABI::ABI_LP64E)
+  return CSR_RT_MostRegs_RVE_RegMask;
+return CSR_RT_MostRegs_RegMask;
+  }
+  switch (ABI) {
   default:
 llvm_unreachable("Unrecognized ABI");
   case RISCVABI::ABI_ILP32E:
diff --git a/llvm/test/CodeGen/RISCV/calling-conv-preserve-most.ll 
b/llvm/test/CodeGen/RISCV/calling-conv-preserve-most.ll
new file mode 100644
index 0..08340bbe0013a
--- /dev/null
+++ b/llvm/test/CodeGen/RISCV/calling-conv-preserve-most.ll
@@ -0,0 +1,449 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc -mtriple=riscv32 < %s | FileCheck %s -check-prefix=RV32I
+; RUN: llc -mtriple=riscv64 < %s | FileCheck %s -check-prefix=RV64I
+; RUN: llc -mtriple=riscv32 -mattr=+e -target-abi ilp32e < %s | FileCheck %s 
-check-prefix=RV32E
+; RUN: llc -mtriple=riscv64 -mattr=+e -target-abi lp64e < %s | FileCheck %s 
-check-prefix=RV64E
+
+; Check the PreserveMost calling convention works.
+
+declare void @standard_cc_func()
+declare preserve_mostcc void @preserve_mostcc_func()
+
+define preserve_mostcc void @preserve_mostcc1() nounwind {
+; RV32I-LABEL: preserve_mostcc1:
+; RV32I:   # %bb.0: # %entry
+; RV32I-NEXT:addi sp, sp, -64
+; RV32I

[llvm-branch-commits] [clang] [llvm] [lit] Make builtin cat work with stdin (PR #158447)

2025-09-13 Thread Aiden Grossman via llvm-branch-commits

https://github.com/boomanaiden154 edited 
https://github.com/llvm/llvm-project/pull/158447


[llvm-branch-commits] [llvm] [flang][do concurent] Add saxpy offload tests for OpenMP mapping (PR #155993)

2025-09-13 Thread Kareem Ergawy via llvm-branch-commits

https://github.com/ergawy updated 
https://github.com/llvm/llvm-project/pull/155993

>From 5d9fe15da36fe4af54c34f09ebb11fca7f8a1ac3 Mon Sep 17 00:00:00 2001
From: ergawy 
Date: Fri, 29 Aug 2025 04:04:07 -0500
Subject: [PATCH] [flang][do concurent] Add saxpy offload tests for OpenMP
 mapping

Adds end-to-end tests for `do concurrent` offloading to the device.
---
 .../fortran/do-concurrent-to-omp-saxpy-2d.f90 | 53 +++
 .../fortran/do-concurrent-to-omp-saxpy.f90| 53 +++
 2 files changed, 106 insertions(+)
 create mode 100644 
offload/test/offloading/fortran/do-concurrent-to-omp-saxpy-2d.f90
 create mode 100644 
offload/test/offloading/fortran/do-concurrent-to-omp-saxpy.f90

diff --git a/offload/test/offloading/fortran/do-concurrent-to-omp-saxpy-2d.f90 
b/offload/test/offloading/fortran/do-concurrent-to-omp-saxpy-2d.f90
new file mode 100644
index 0..c6f576acb90b6
--- /dev/null
+++ b/offload/test/offloading/fortran/do-concurrent-to-omp-saxpy-2d.f90
@@ -0,0 +1,53 @@
+! REQUIRES: flang, amdgpu
+
+! RUN: %libomptarget-compile-fortran-generic -fdo-concurrent-to-openmp=device
+! RUN: env LIBOMPTARGET_INFO=16 %libomptarget-run-generic 2>&1 | 
%fcheck-generic
+module saxpymod
+   use iso_fortran_env
+   public :: saxpy
+contains
+
+subroutine saxpy(a, x, y, n, m)
+   use iso_fortran_env
+   implicit none
+   integer,intent(in) :: n, m
+   real(kind=real32),intent(in) :: a
+   real(kind=real32), dimension(:,:),intent(in) :: x
+   real(kind=real32), dimension(:,:),intent(inout) :: y
+   integer :: i, j
+
+   do concurrent(i=1:n, j=1:m)
+   y(i,j) = a * x(i,j) + y(i,j)
+   end do
+
+   write(*,*) "plausibility check:"
+   write(*,'("y(1,1) ",f8.6)') y(1,1)
+   write(*,'("y(n,m) ",f8.6)') y(n,m)
+end subroutine saxpy
+
+end module saxpymod
+
+program main
+   use iso_fortran_env
+   use saxpymod, ONLY:saxpy
+   implicit none
+
+   integer,parameter :: n = 1000, m=1
+   real(kind=real32), allocatable, dimension(:,:) :: x, y
+   real(kind=real32) :: a
+   integer :: i
+
+   allocate(x(1:n,1:m), y(1:n,1:m))
+   a = 2.0_real32
+   x(:,:) = 1.0_real32
+   y(:,:) = 2.0_real32
+
+   call saxpy(a, x, y, n, m)
+
+   deallocate(x,y)
+end program main
+
+! CHECK:  "PluginInterface" device {{[0-9]+}} info: Launching kernel {{.*}}
+! CHECK:  plausibility check:
+! CHECK:  y(1,1) 4.0
+! CHECK:  y(n,m) 4.0
diff --git a/offload/test/offloading/fortran/do-concurrent-to-omp-saxpy.f90 
b/offload/test/offloading/fortran/do-concurrent-to-omp-saxpy.f90
new file mode 100644
index 0..e094a1d7459ef
--- /dev/null
+++ b/offload/test/offloading/fortran/do-concurrent-to-omp-saxpy.f90
@@ -0,0 +1,53 @@
+! REQUIRES: flang, amdgpu
+
+! RUN: %libomptarget-compile-fortran-generic -fdo-concurrent-to-openmp=device
+! RUN: env LIBOMPTARGET_INFO=16 %libomptarget-run-generic 2>&1 | 
%fcheck-generic
+module saxpymod
+   use iso_fortran_env
+   public :: saxpy
+contains
+
+subroutine saxpy(a, x, y, n)
+   use iso_fortran_env
+   implicit none
+   integer,intent(in) :: n
+   real(kind=real32),intent(in) :: a
+   real(kind=real32), dimension(:),intent(in) :: x
+   real(kind=real32), dimension(:),intent(inout) :: y
+   integer :: i
+
+   do concurrent(i=1:n)
+   y(i) = a * x(i) + y(i)
+   end do
+
+   write(*,*) "plausibility check:"
+   write(*,'("y(1) ",f8.6)') y(1)
+   write(*,'("y(n) ",f8.6)') y(n)
+end subroutine saxpy
+
+end module saxpymod
+
+program main
+   use iso_fortran_env
+   use saxpymod, ONLY:saxpy
+   implicit none
+
+   integer,parameter :: n = 1000
+   real(kind=real32), allocatable, dimension(:) :: x, y
+   real(kind=real32) :: a
+   integer :: i
+
+   allocate(x(1:n), y(1:n))
+   a = 2.0_real32
+   x(:) = 1.0_real32
+   y(:) = 2.0_real32
+
+   call saxpy(a, x, y, n)
+
+   deallocate(x,y)
+end program main
+
+! CHECK:  "PluginInterface" device {{[0-9]+}} info: Launching kernel {{.*}}
+! CHECK:  plausibility check:
+! CHECK:  y(1) 4.0
+! CHECK:  y(n) 4.0
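
For comparison with the 1-D Fortran test above, here is a sequential C++ sketch of the same computation. It assumes nothing about the offload runtime: `saxpy` and `runSaxpyLikeTest` are illustrative names, and the loop runs on the host. With the test's inputs (a = 2, x = 1, y = 2), each element becomes 2 * 1 + 2 = 4, which is the value the CHECK lines look for.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sequential stand-in for the `do concurrent` loop: y(i) = a * x(i) + y(i).
// No iteration reads a value written by another, which is what makes the
// loop a candidate for parallel execution.
void saxpy(float a, const std::vector<float> &x, std::vector<float> &y) {
  assert(x.size() == y.size() && "x and y must have the same length");
  for (std::size_t i = 0; i < y.size(); ++i)
    y[i] = a * x[i] + y[i];
}

// Builds the same inputs as the test's main program (n = 1000, a = 2,
// x = 1, y = 2) and returns the updated y.
std::vector<float> runSaxpyLikeTest() {
  const std::size_t n = 1000;
  std::vector<float> x(n, 1.0f), y(n, 2.0f);
  saxpy(2.0f, x, y);
  return y;
}
```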



[llvm-branch-commits] [clang] release/21.x: [Clang][Cygwin] Use correct mangling rule (#158404) (PR #158442)

2025-09-13 Thread via llvm-branch-commits

https://github.com/llvmbot milestoned 
https://github.com/llvm/llvm-project/pull/158442


[llvm-branch-commits] [clang] [AMDGPU] Add builtins for wave reduction intrinsics (PR #150170)

2025-09-13 Thread via llvm-branch-commits

https://github.com/easyonaadit updated 
https://github.com/llvm/llvm-project/pull/150170

>From be85e6c0222fe757ac59959bad5c56a85a32b869 Mon Sep 17 00:00:00 2001
From: Aaditya 
Date: Sat, 19 Jul 2025 12:57:27 +0530
Subject: [PATCH] Add builtins for wave reduction intrinsics

---
 clang/include/clang/Basic/BuiltinsAMDGPU.def |  25 ++
 clang/lib/CodeGen/TargetBuiltins/AMDGPU.cpp  |  58 +++
 clang/test/CodeGenOpenCL/builtins-amdgcn.cl  | 378 +++
 3 files changed, 461 insertions(+)

diff --git a/clang/include/clang/Basic/BuiltinsAMDGPU.def 
b/clang/include/clang/Basic/BuiltinsAMDGPU.def
index e5a1422fe8778..56b1a8dc09b15 100644
--- a/clang/include/clang/Basic/BuiltinsAMDGPU.def
+++ b/clang/include/clang/Basic/BuiltinsAMDGPU.def
@@ -364,6 +364,31 @@ BUILTIN(__builtin_amdgcn_endpgm, "v", "nr")
 BUILTIN(__builtin_amdgcn_get_fpenv, "WUi", "n")
 BUILTIN(__builtin_amdgcn_set_fpenv, "vWUi", "n")
 
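+//===----------------------------------------------------------------------===//
+
+// Wave Reduction builtins.
+
+//===----------------------------------------------------------------------===//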
+
+BUILTIN(__builtin_amdgcn_wave_reduce_add_u32, "ZUiZUiZi", "nc")
+BUILTIN(__builtin_amdgcn_wave_reduce_sub_u32, "ZUiZUiZi", "nc")
+BUILTIN(__builtin_amdgcn_wave_reduce_min_i32, "ZiZiZi", "nc")
+BUILTIN(__builtin_amdgcn_wave_reduce_min_u32, "ZUiZUiZi", "nc")
+BUILTIN(__builtin_amdgcn_wave_reduce_max_i32, "ZiZiZi", "nc")
+BUILTIN(__builtin_amdgcn_wave_reduce_max_u32, "ZUiZUiZi", "nc")
+BUILTIN(__builtin_amdgcn_wave_reduce_and_b32, "ZiZiZi", "nc")
+BUILTIN(__builtin_amdgcn_wave_reduce_or_b32, "ZiZiZi", "nc")
+BUILTIN(__builtin_amdgcn_wave_reduce_xor_b32, "ZiZiZi", "nc")
+BUILTIN(__builtin_amdgcn_wave_reduce_add_u64, "WUiWUiZi", "nc")
+BUILTIN(__builtin_amdgcn_wave_reduce_sub_u64, "WUiWUiZi", "nc")
+BUILTIN(__builtin_amdgcn_wave_reduce_min_i64, "WiWiZi", "nc")
+BUILTIN(__builtin_amdgcn_wave_reduce_min_u64, "WUiWUiZi", "nc")
+BUILTIN(__builtin_amdgcn_wave_reduce_max_i64, "WiWiZi", "nc")
+BUILTIN(__builtin_amdgcn_wave_reduce_max_u64, "WUiWUiZi", "nc")
+BUILTIN(__builtin_amdgcn_wave_reduce_and_b64, "WiWiZi", "nc")
+BUILTIN(__builtin_amdgcn_wave_reduce_or_b64, "WiWiZi", "nc")
+BUILTIN(__builtin_amdgcn_wave_reduce_xor_b64, "WiWiZi", "nc")
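+
 //===----------------------------------------------------------------------===//
 // R600-NI only builtins.
 //===----------------------------------------------------------------------===//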
diff --git a/clang/lib/CodeGen/TargetBuiltins/AMDGPU.cpp 
b/clang/lib/CodeGen/TargetBuiltins/AMDGPU.cpp
index 87a46287c4022..07cf08c54985a 100644
--- a/clang/lib/CodeGen/TargetBuiltins/AMDGPU.cpp
+++ b/clang/lib/CodeGen/TargetBuiltins/AMDGPU.cpp
@@ -295,11 +295,69 @@ void CodeGenFunction::AddAMDGPUFenceAddressSpaceMMRA(llvm::Instruction *Inst,
   Inst->setMetadata(LLVMContext::MD_mmra, MMRAMetadata::getMD(Ctx, MMRAs));
 }
 
+static Intrinsic::ID getIntrinsicIDforWaveReduction(unsigned BuiltinID) {
+  switch (BuiltinID) {
+  default:
+    llvm_unreachable("Unknown BuiltinID for wave reduction");
+  case clang::AMDGPU::BI__builtin_amdgcn_wave_reduce_add_u32:
+  case clang::AMDGPU::BI__builtin_amdgcn_wave_reduce_add_u64:
+    return Intrinsic::amdgcn_wave_reduce_add;
+  case clang::AMDGPU::BI__builtin_amdgcn_wave_reduce_sub_u32:
+  case clang::AMDGPU::BI__builtin_amdgcn_wave_reduce_sub_u64:
+    return Intrinsic::amdgcn_wave_reduce_sub;
+  case clang::AMDGPU::BI__builtin_amdgcn_wave_reduce_min_i32:
+  case clang::AMDGPU::BI__builtin_amdgcn_wave_reduce_min_i64:
+    return Intrinsic::amdgcn_wave_reduce_min;
+  case clang::AMDGPU::BI__builtin_amdgcn_wave_reduce_min_u32:
+  case clang::AMDGPU::BI__builtin_amdgcn_wave_reduce_min_u64:
+    return Intrinsic::amdgcn_wave_reduce_umin;
+  case clang::AMDGPU::BI__builtin_amdgcn_wave_reduce_max_i32:
+  case clang::AMDGPU::BI__builtin_amdgcn_wave_reduce_max_i64:
+    return Intrinsic::amdgcn_wave_reduce_max;
+  case clang::AMDGPU::BI__builtin_amdgcn_wave_reduce_max_u32:
+  case clang::AMDGPU::BI__builtin_amdgcn_wave_reduce_max_u64:
+    return Intrinsic::amdgcn_wave_reduce_umax;
+  case clang::AMDGPU::BI__builtin_amdgcn_wave_reduce_and_b32:
+  case clang::AMDGPU::BI__builtin_amdgcn_wave_reduce_and_b64:
+    return Intrinsic::amdgcn_wave_reduce_and;
+  case clang::AMDGPU::BI__builtin_amdgcn_wave_reduce_or_b32:
+  case clang::AMDGPU::BI__builtin_amdgcn_wave_reduce_or_b64:
+    return Intrinsic::amdgcn_wave_reduce_or;
+  case clang::AMDGPU::BI__builtin_amdgcn_wave_reduce_xor_b32:
+  case clang::AMDGPU::BI__builtin_amdgcn_wave_reduce_xor_b64:
+    return Intrinsic::amdgcn_wave_reduce_xor;
+  }
+}
+
 Value *CodeGenFunction::EmitAMDGPUBuiltinExpr(unsigned BuiltinID,
   const CallExpr *E) {
   llvm::AtomicOrdering AO = llvm::AtomicOrdering::SequentiallyConsistent;
   llvm::SyncScope::ID SSID;
   switch (BuiltinID) {
+  case AMDGPU::BI__builtin_amdgcn_wave_reduce_add_u32:
+  case AMDGPU::BI__builtin_amdgcn_wave_reduce_sub_u
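
The switch in `getIntrinsicIDforWaveReduction` sends the signed and unsigned min/max builtins to different intrinsics (`wave_reduce_min` vs. `wave_reduce_umin`, and likewise for max), while each bitwise builtin maps to a single intrinsic per operation. A minimal host-side sketch of why signedness matters for min but not for a bitwise AND follows. The `Wave` alias and the `reduce*` helpers are hypothetical names; they only model a reduction over per-lane values and are not the builtins, the intrinsics, or the hardware behaviour.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical model: one 32-bit value per active lane of a wave.
using Wave = std::vector<std::uint32_t>;

// Unsigned minimum over all lanes (models the *_min_u32 flavour).
std::uint32_t reduceUMin(const Wave &Lanes) {
  return *std::min_element(Lanes.begin(), Lanes.end());
}

// Signed minimum over the same bit patterns (models the *_min_i32 flavour).
std::int32_t reduceSMin(const Wave &Lanes) {
  std::int32_t Result = static_cast<std::int32_t>(Lanes.front());
  for (std::uint32_t Bits : Lanes)
    Result = std::min(Result, static_cast<std::int32_t>(Bits));
  return Result;
}

// Bitwise AND ignores signedness, so one operation serves both readings.
std::uint32_t reduceAnd(const Wave &Lanes) {
  std::uint32_t Result = ~0u;
  for (std::uint32_t Bits : Lanes)
    Result &= Bits;
  return Result;
}
```

For lanes holding 5, 0xFFFFFFFF, and 7, the unsigned minimum is 5, whereas the signed minimum is -1 because 0xFFFFFFFF reads as -1; the AND result is 5 under either reading.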

[llvm-branch-commits] [llvm] [Offload][Conformance] Update olMemFree calls in conformance tests (PR #157773)

2025-09-13 Thread Joseph Huber via llvm-branch-commits

https://github.com/jhuber6 approved this pull request.


https://github.com/llvm/llvm-project/pull/157773


[llvm-branch-commits] [mlir] [MLIR][Standalone] test Standalone against install distributions (PR #157944)

2025-09-13 Thread Renato Golin via llvm-branch-commits


@@ -65,6 +65,12 @@ if (MLIR_INCLUDE_INTEGRATION_TESTS)
 
 endif()
 
+option(MLIR_RUN_STANDALONE_INSTALL_TESTS "Run Standalone example install tests." ON)
+if(MLIR_RUN_STANDALONE_INSTALL_TESTS AND "${CMAKE_INSTALL_PREFIX}" STREQUAL "")
+  message(WARNING "Standalone example install tests will install into root!\

rengolin wrote:

What do you mean by `root`? The Git root? `/`? The root of the build directory?

Usually, when I do automatic installs for CI, I set it to `%build/install`, so 
that it is guaranteed to be writable by the build process and on the same 
filesystem (e.g., the source directory may be on NFS and slow).

https://github.com/llvm/llvm-project/pull/157944


[llvm-branch-commits] [llvm] AMDGPU: Use RegClassByHwMode to manage operand VGPR operand constraints (PR #158272)

2025-09-13 Thread Matt Arsenault via llvm-branch-commits

https://github.com/arsenm ready_for_review 
https://github.com/llvm/llvm-project/pull/158272


[llvm-branch-commits] [clang] release/21.x: [Clang][Cygwin] Use correct mangling rule (#158404) (PR #158442)

2025-09-13 Thread via llvm-branch-commits

https://github.com/jeremyd2019 approved this pull request.

LGTM, and this seems to be a regression from 20.x

https://github.com/llvm/llvm-project/pull/158442


[llvm-branch-commits] [llvm] [Offload] Add GenericPluginTy::get_mem_info (PR #157484)

2025-09-13 Thread Jan Patrick Lehr via llvm-branch-commits

https://github.com/jplehr edited 
https://github.com/llvm/llvm-project/pull/157484


[llvm-branch-commits] [lldb] release/21.x: [lldb][DataFormatter] Allow std::string formatters to match against custom allocators (#156050) (PR #157048)

2025-09-13 Thread via llvm-branch-commits

github-actions[bot] wrote:

@Michael137 (or anyone else): if you would like to add a note about this fix to 
the release notes (completely optional), please reply to this comment with a 
one- or two-sentence description of the fix. When you are done, please add the 
release:note label to this PR.

https://github.com/llvm/llvm-project/pull/157048


[llvm-branch-commits] [llvm] AMDGPU: Use RegClassByHwMode to manage operand VGPR operand constraints (PR #158272)

2025-09-13 Thread Matt Arsenault via llvm-branch-commits

arsenm wrote:

> [!WARNING]
> This pull request is not mergeable via GitHub because a downstack PR is 
> open. Once all requirements are satisfied, merge this PR as a stack 
> [on Graphite](https://app.graphite.dev/github/pr/llvm/llvm-project/158272?utm_source=stack-comment-downstack-mergeability-warning).
> [Learn more](https://graphite.dev/docs/merge-pull-requests)

* **#158272** 👈 [(View in Graphite)](https://app.graphite.dev/github/pr/llvm/llvm-project/158272?utm_source=stack-comment-view-in-graphite)
* **#158269**: 1 other dependent PR 
([#158271](https://github.com/llvm/llvm-project/pull/158271))
* `main`

This stack of pull requests is managed by [Graphite](https://graphite.dev?utm-source=stack-comment). Learn 
more about [stacking](https://stacking.dev/?utm_source=stack-comment).


https://github.com/llvm/llvm-project/pull/158272


[llvm-branch-commits] [clang] 4779770 - Revert "Revert "[clang][dataflow] Transfer more cast expressions." (#157148)"

2025-09-13 Thread via llvm-branch-commits

Author: Samira Bakon
Date: 2025-09-08T15:08:52-04:00
New Revision: 4779770aab880315d359005a5cacd4c4976d8649

URL: 
https://github.com/llvm/llvm-project/commit/4779770aab880315d359005a5cacd4c4976d8649
DIFF: 
https://github.com/llvm/llvm-project/commit/4779770aab880315d359005a5cacd4c4976d8649.diff

LOG: Revert "Revert "[clang][dataflow] Transfer more cast expressions." 
(#157148)"

This reverts commit 4c29a600fa34d0c1cabf4ffcf081f2a00b09fddd.

Added: 


Modified: 
clang/include/clang/Analysis/FlowSensitive/StorageLocation.h
clang/lib/Analysis/FlowSensitive/Transfer.cpp
clang/unittests/Analysis/FlowSensitive/TransferTest.cpp

Removed: 




diff --git a/clang/include/clang/Analysis/FlowSensitive/StorageLocation.h b/clang/include/clang/Analysis/FlowSensitive/StorageLocation.h
index 8fcc6a44027a0..534b9a017d8f0 100644
--- a/clang/include/clang/Analysis/FlowSensitive/StorageLocation.h
+++ b/clang/include/clang/Analysis/FlowSensitive/StorageLocation.h
@@ -17,6 +17,7 @@
 #include "clang/AST/Decl.h"
 #include "clang/AST/Type.h"
 #include "llvm/ADT/DenseMap.h"
+#include "llvm/ADT/StringRef.h"
 #include "llvm/Support/Debug.h"
 #include 
 
@@ -152,6 +153,11 @@ class RecordStorageLocation final : public StorageLocation {
     return {SyntheticFields.begin(), SyntheticFields.end()};
   }
 
+  /// Add a synthetic field, if none by that name is already present.
+  void addSyntheticField(llvm::StringRef Name, StorageLocation &Loc) {
+    SyntheticFields.insert({Name, &Loc});
+  }
+
   /// Changes the child storage location for a field `D` of reference type.
   /// All other fields cannot change their storage location and always retain
   /// the storage location passed to the `RecordStorageLocation` constructor.
@@ -164,6 +170,11 @@ class RecordStorageLocation final : public StorageLocation {
     Children[&D] = Loc;
   }
 
+  /// Add a child storage location for a field `D`, if not already present.
+  void addChild(const ValueDecl &D, StorageLocation *Loc) {
+    Children.insert({&D, Loc});
+  }
+
   llvm::iterator_range children() const {
     return {Children.begin(), Children.end()};
   }

diff --git a/clang/lib/Analysis/FlowSensitive/Transfer.cpp b/clang/lib/Analysis/FlowSensitive/Transfer.cpp
index 86a816e2e406c..23a6de45e18b1 100644
--- a/clang/lib/Analysis/FlowSensitive/Transfer.cpp
+++ b/clang/lib/Analysis/FlowSensitive/Transfer.cpp
@@ -20,14 +20,17 @@
 #include "clang/AST/OperationKinds.h"
 #include "clang/AST/Stmt.h"
 #include "clang/AST/StmtVisitor.h"
+#include "clang/AST/Type.h"
 #include "clang/Analysis/FlowSensitive/ASTOps.h"
 #include "clang/Analysis/FlowSensitive/AdornedCFG.h"
 #include "clang/Analysis/FlowSensitive/DataflowAnalysisContext.h"
 #include "clang/Analysis/FlowSensitive/DataflowEnvironment.h"
 #include "clang/Analysis/FlowSensitive/NoopAnalysis.h"
 #include "clang/Analysis/FlowSensitive/RecordOps.h"
+#include "clang/Analysis/FlowSensitive/StorageLocation.h"
 #include "clang/Analysis/FlowSensitive/Value.h"
 #include "clang/Basic/Builtins.h"
+#include "clang/Basic/LLVM.h"
 #include "clang/Basic/OperatorKinds.h"
 #include "llvm/Support/Casting.h"
 #include 
@@ -287,7 +290,7 @@ class TransferVisitor : public ConstStmtVisitor<TransferVisitor> {
     }
   }
 
-  void VisitImplicitCastExpr(const ImplicitCastExpr *S) {
+  void VisitCastExpr(const CastExpr *S) {
     const Expr *SubExpr = S->getSubExpr();
     assert(SubExpr != nullptr);
 
@@ -317,6 +320,60 @@ class TransferVisitor : public ConstStmtVisitor<TransferVisitor> {
       break;
     }
 
+    case CK_BaseToDerived: {
+      // This is a cast of (single-layer) pointer or reference to a record type.
+      // We should now model the fields for the derived type.
+
+      // Get the RecordStorageLocation for the record object underneath.
+      RecordStorageLocation *Loc = nullptr;
+      if (S->getType()->isPointerType()) {
+        auto *PV = Env.get<PointerValue>(*SubExpr);
+        assert(PV != nullptr);
+        if (PV == nullptr)
+          break;
+        Loc = cast<RecordStorageLocation>(&PV->getPointeeLoc());
+      } else {
+        assert(S->getType()->isRecordType());
+        if (SubExpr->isGLValue()) {
+          Loc = Env.get<RecordStorageLocation>(*SubExpr);
+        } else {
+          Loc = &Env.getResultObjectLocation(*SubExpr);
+        }
+      }
+      if (!Loc) {
+        // Nowhere to add children or propagate from, so we're done.
+        break;
+      }
+
+      // Get the derived record type underneath the reference or pointer.
+      QualType Derived = S->getType().getNonReferenceType();
+      if (Derived->isPointerType()) {
+        Derived = Derived->getPointeeType();
+      }
+
+      // Add children to the storage location for fields (including synthetic
+      // fields) of the derived type and initialize their values.
+      for (const FieldDecl *Field :
+           Env.getDataflowAnalysisContext().getModeledFields(Derived)) {
+        assert(Field != nullptr);
+QualType FieldType = Field-
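
The two helpers added to `RecordStorageLocation` in this patch, `addSyntheticField` and `addChild`, are documented as adding an entry only when none is present. That matches the usual behaviour of a map's `insert`, which leaves an existing entry untouched, in contrast to the `Children[&D] = Loc;` assignment in the existing setter. A minimal sketch of the difference follows; it uses `std::map` as a stand-in for the LLVM containers (an assumption made to keep the example self-contained), and `Loc`, `FieldMap`, `addIfAbsent`, and `overwrite` are hypothetical names.

```cpp
#include <map>
#include <string>

// Hypothetical stand-in for a storage location.
struct Loc {
  int Id;
};

using FieldMap = std::map<std::string, Loc *>;

// Mirrors the "add if not already present" helpers: insert() does nothing
// when the key already exists. Returns true if a new entry was added.
bool addIfAbsent(FieldMap &Fields, const std::string &Name, Loc &L) {
  return Fields.insert({Name, &L}).second;
}

// Mirrors the overwriting setter: operator[] replaces any existing entry.
void overwrite(FieldMap &Fields, const std::string &Name, Loc &L) {
  Fields[Name] = &L;
}
```

Calling `addIfAbsent` twice with the same name keeps the first location; only `overwrite` changes it.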

[llvm-branch-commits] [llvm] [NFC] Leave a comment in `Local.cpp` about debug info & sample profiling (PR #155296)

2025-09-13 Thread Mircea Trofin via llvm-branch-commits

https://github.com/mtrofin updated 
https://github.com/llvm/llvm-project/pull/155296

>From 2362af9d23a45db4bb85381539630be98703a2c3 Mon Sep 17 00:00:00 2001
From: Mircea Trofin 
Date: Mon, 25 Aug 2025 21:04:05 +
Subject: [PATCH] [NFC] Leave a comment in `Local.cpp` about debug info &
 sample profiling

---
 llvm/lib/Transforms/Utils/Local.cpp | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/llvm/lib/Transforms/Utils/Local.cpp b/llvm/lib/Transforms/Utils/Local.cpp
index 2cfd70a1746c8..57dc1b38b8ec3 100644
--- a/llvm/lib/Transforms/Utils/Local.cpp
+++ b/llvm/lib/Transforms/Utils/Local.cpp
@@ -3342,8 +3342,11 @@ void llvm::hoistAllInstructionsInto(BasicBlock *DomBlock, Instruction *InsertPt,
   // retain their original debug locations (DILocations) and debug intrinsic
   // instructions.
   //
-  // Doing so would degrade the debugging experience and adversely affect the
-  // accuracy of profiling information.
+  // Doing so would degrade the debugging experience.
+  //
+  // FIXME: Issue #152767: debug info should also be the same as the
+  // original branch, **if** the user explicitly indicated that (for sampling
+  // PGO)
   //
   // Currently, when hoisting the instructions, we take the following actions:
   // - Remove their debug intrinsic instructions.



[llvm-branch-commits] [Remarks] BitstreamRemarkParser: Refactor error handling (PR #156511)

2025-09-13 Thread Jon Roelofs via llvm-branch-commits


@@ -13,81 +13,171 @@
 #ifndef LLVM_LIB_REMARKS_BITSTREAM_REMARK_PARSER_H
 #define LLVM_LIB_REMARKS_BITSTREAM_REMARK_PARSER_H
 
-#include "llvm/ADT/ArrayRef.h"
 #include "llvm/ADT/StringRef.h"
 #include "llvm/Bitstream/BitstreamReader.h"
 #include "llvm/Remarks/BitstreamRemarkContainer.h"
+#include "llvm/Remarks/Remark.h"
 #include "llvm/Remarks/RemarkFormat.h"
 #include "llvm/Remarks/RemarkParser.h"
+#include "llvm/Remarks/RemarkStringTable.h"
 #include "llvm/Support/Error.h"
-#include 
+#include "llvm/Support/FormatVariadic.h"
 #include 
 #include 
 #include 
 
 namespace llvm {
 namespace remarks {
 
-struct Remark;
+class BitstreamBlockParserHelperBase {
+protected:
+  BitstreamCursor &Stream;
+
+  unsigned BlockID;
+  StringRef BlockName;
+
+public:
+  BitstreamBlockParserHelperBase(BitstreamCursor &Stream, unsigned BlockID,
+ StringRef BlockName)
+  : Stream(Stream), BlockID(BlockID), BlockName(BlockName) {}

jroelofs wrote:

to go with the other re-ordering:
```suggestion
  : Stream(Stream), BlockName(BlockName), BlockID(BlockID) {}
```
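
For context on why the initializer order is worth matching: C++ initializes non-static data members in declaration order, whatever order the constructor's initializer list uses, and compilers can warn about a mismatch (`-Wreorder` in Clang and GCC). A small sketch follows, using a hypothetical `Helper` struct with `int` members rather than the parser's actual class, that records the order in which its members are initialized:

```cpp
#include <string>
#include <vector>

// Records the order in which members get initialized.
static std::vector<std::string> InitLog;

static int logged(const char *Name, int Value) {
  InitLog.push_back(Name);
  return Value;
}

// Hypothetical struct: members are declared as Stream, BlockName, BlockID.
struct Helper {
  int Stream;
  int BlockName;
  int BlockID;

  // The initializer list is deliberately written in a different order;
  // initialization still follows the declaration order above.
  Helper()
      : Stream(logged("Stream", 0)), BlockID(logged("BlockID", 2)),
        BlockName(logged("BlockName", 1)) {}
};
```

Constructing a `Helper` logs `Stream`, `BlockName`, `BlockID`, so keeping the list in declaration order makes the code say what actually happens.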

https://github.com/llvm/llvm-project/pull/156511


[llvm-branch-commits] [flang] [flang][OpenMP] `do concurrent`: support `reduce` on device (PR #156610)

2025-09-13 Thread Kareem Ergawy via llvm-branch-commits

https://github.com/ergawy updated 
https://github.com/llvm/llvm-project/pull/156610

>From 31bf2c1e204b8ad977f1416c5e91aacfc0faaf80 Mon Sep 17 00:00:00 2001
From: ergawy 
Date: Tue, 2 Sep 2025 08:36:34 -0500
Subject: [PATCH] [flang][OpenMP] `do concurrent`: support `reduce` on device

Extends `do concurrent` to OpenMP device mapping by adding support for
mapping `reduce` specifiers to omp `reduction` clauses. The changes
attach 2 `reduction` clauses to the mapped OpenMP construct: one on the
`teams` part of the construct and one on the `wsloop` part.
---
 .../OpenMP/DoConcurrentConversion.cpp | 117 ++
 .../DoConcurrent/reduce_device.mlir   |  53 
 2 files changed, 121 insertions(+), 49 deletions(-)
 create mode 100644 flang/test/Transforms/DoConcurrent/reduce_device.mlir

diff --git a/flang/lib/Optimizer/OpenMP/DoConcurrentConversion.cpp b/flang/lib/Optimizer/OpenMP/DoConcurrentConversion.cpp
index d00a4fdd2cf2e..6e308499100fa 100644
--- a/flang/lib/Optimizer/OpenMP/DoConcurrentConversion.cpp
+++ b/flang/lib/Optimizer/OpenMP/DoConcurrentConversion.cpp
@@ -141,6 +141,9 @@ void collectLoopLiveIns(fir::DoConcurrentLoopOp loop,
 
   for (mlir::Value local : loop.getLocalVars())
 liveIns.push_back(local);
+
+  for (mlir::Value reduce : loop.getReduceVars())
+liveIns.push_back(reduce);
 }
 
 /// Collects values that are local to a loop: "loop-local values". A loop-local
@@ -319,7 +322,7 @@ class DoConcurrentConversion
   targetOp =
   genTargetOp(doLoop.getLoc(), rewriter, mapper, loopNestLiveIns,
   targetClauseOps, loopNestClauseOps, liveInShapeInfoMap);
-  genTeamsOp(doLoop.getLoc(), rewriter);
+  genTeamsOp(rewriter, loop, mapper);
 }
 
 mlir::omp::ParallelOp parallelOp =
@@ -492,46 +495,7 @@ class DoConcurrentConversion
 if (!mapToDevice)
   genPrivatizers(rewriter, mapper, loop, wsloopClauseOps);
 
-if (!loop.getReduceVars().empty()) {
-  for (auto [op, byRef, sym, arg] : llvm::zip_equal(
-   loop.getReduceVars(), loop.getReduceByrefAttr().asArrayRef(),
-   loop.getReduceSymsAttr().getAsRange(),
-   loop.getRegionReduceArgs())) {
-auto firReducer = moduleSymbolTable.lookup(
-sym.getLeafReference());
-
-mlir::OpBuilder::InsertionGuard guard(rewriter);
-rewriter.setInsertionPointAfter(firReducer);
-std::string ompReducerName = sym.getLeafReference().str() + ".omp";
-
-auto ompReducer =
-moduleSymbolTable.lookup(
-rewriter.getStringAttr(ompReducerName));
-
-if (!ompReducer) {
-  ompReducer = mlir::omp::DeclareReductionOp::create(
-  rewriter, firReducer.getLoc(), ompReducerName,
-  firReducer.getTypeAttr().getValue());
-
-  cloneFIRRegionToOMP(rewriter, firReducer.getAllocRegion(),
-  ompReducer.getAllocRegion());
-  cloneFIRRegionToOMP(rewriter, firReducer.getInitializerRegion(),
-  ompReducer.getInitializerRegion());
-  cloneFIRRegionToOMP(rewriter, firReducer.getReductionRegion(),
-  ompReducer.getReductionRegion());
-  cloneFIRRegionToOMP(rewriter, firReducer.getAtomicReductionRegion(),
-  ompReducer.getAtomicReductionRegion());
-  cloneFIRRegionToOMP(rewriter, firReducer.getCleanupRegion(),
-  ompReducer.getCleanupRegion());
-  moduleSymbolTable.insert(ompReducer);
-}
-
-wsloopClauseOps.reductionVars.push_back(op);
-wsloopClauseOps.reductionByref.push_back(byRef);
-wsloopClauseOps.reductionSyms.push_back(
-mlir::SymbolRefAttr::get(ompReducer));
-  }
-}
+genReductions(rewriter, mapper, loop, wsloopClauseOps);
 
 auto wsloopOp =
 mlir::omp::WsloopOp::create(rewriter, loop.getLoc(), wsloopClauseOps);
@@ -553,8 +517,6 @@ class DoConcurrentConversion
 
 rewriter.setInsertionPointToEnd(&loopNestOp.getRegion().back());
 mlir::omp::YieldOp::create(rewriter, loop->getLoc());
-loop->getParentOfType().print(
-llvm::errs(), mlir::OpPrintingFlags().assumeVerified());
 
 return {loopNestOp, wsloopOp};
   }
@@ -778,15 +740,26 @@ class DoConcurrentConversion
 liveInName, shape);
   }
 
-  mlir::omp::TeamsOp
-  genTeamsOp(mlir::Location loc,
- mlir::ConversionPatternRewriter &rewriter) const {
-auto teamsOp = rewriter.create(
-loc, /*clauses=*/mlir::omp::TeamsOperands{});
+  mlir::omp::TeamsOp genTeamsOp(mlir::ConversionPatternRewriter &rewriter,
+fir::DoConcurrentLoopOp loop,
+mlir::IRMapping &mapper) const {
+mlir::omp::TeamsOperands teamsOps;
+genReductions(rewriter, mapper, loop, teamsOps);
+
+mlir::Location loc = loop.getLoc();
+aut
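
The commit message describes attaching a `reduction` clause at two levels of the mapped construct: the `teams` level and the worksharing loop. A rough source-level analogue in C++ OpenMP is sketched below. It is not what the Flang pass emits, and `sumOnDevice` is a hypothetical name. Without OpenMP enabled the pragmas are ignored and the loop simply runs serially.

```cpp
#include <vector>

// Sums the elements of Data. With OpenMP offloading enabled, the reduction
// is declared twice: once on the teams level and once on the loop that is
// distributed and workshared inside each team.
long sumOnDevice(const std::vector<int> &Data) {
  const int *Ptr = Data.data();
  const int N = static_cast<int>(Data.size());
  long Sum = 0;
#pragma omp target teams map(to : Ptr[0:N]) map(tofrom : Sum) reduction(+ : Sum)
#pragma omp distribute parallel for reduction(+ : Sum)
  for (int I = 0; I < N; ++I)
    Sum += Ptr[I];
  return Sum;
}
```

The inner clause combines the partial sums of the threads within a team, and the outer one combines the per-team results into the final value.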

[llvm-branch-commits] [llvm] [IR2Vec] Refactor vocabulary to use section-based storage (PR #158376)

2025-09-13 Thread S. VenkataKeerthy via llvm-branch-commits

https://github.com/svkeerthy edited 
https://github.com/llvm/llvm-project/pull/158376


[llvm-branch-commits] [mlir] [MLIR][Standalone] test Standalone against install distributions (PR #157944)

2025-09-13 Thread Maksim Levental via llvm-branch-commits

makslevental wrote:

Note: this PR depends on https://github.com/llvm/llvm-project/pull/158090 
because currently (at HEAD) the `monolithic` CI scripts do not install the 
`FileCheck`, `count`, and `not` targets.

https://github.com/llvm/llvm-project/pull/157944


[llvm-branch-commits] [llvm] Add deactivation symbol operand to ConstantPtrAuth. (PR #133537)

2025-09-13 Thread Oliver Hunt via llvm-branch-commits

ojhunt wrote:

> > I have checked in with @ahmedbougacha and his feeling is that this is fine 
> > as it requires a bunch of work to opt in, and for places where the security 
> > is important enough that we don't want people using this it's easy enough 
> > to block.
> 
> Thanks for checking.

As noted above, I misunderstood what Ahmed was saying, and my wording was also 
poor: the opinion on disabling and similar points was mine. Those concerns 
were mine, and I was trying to say that I felt they had been addressed.

https://github.com/llvm/llvm-project/pull/133537


[llvm-branch-commits] [lld] CodeGen: Emit .prefalign directives based on the prefalign attribute. (PR #155529)

2025-09-13 Thread Peter Collingbourne via llvm-branch-commits

pcc wrote:

Ping

https://github.com/llvm/llvm-project/pull/155529
___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [mlir] [MLIR][Standalone] test Standalone against install distributions (PR #157944)

2025-09-13 Thread Maksim Levental via llvm-branch-commits

https://github.com/makslevental updated 
https://github.com/llvm/llvm-project/pull/157944

>From 888cca06e4c81b1b12c85ec0ac48408e53ad57bd Mon Sep 17 00:00:00 2001
From: makslevental 
Date: Wed, 10 Sep 2025 12:57:54 -0700
Subject: [PATCH 01/10] [MLIR][Standalone] test Standalone against install
 distributions

---
 mlir/test/Examples/standalone/lit.local.cfg  |  2 ++
 .../Examples/standalone/test.toy.install-dir | 16 
 mlir/test/lit.cfg.py |  3 +++
 mlir/test/lit.site.cfg.py.in |  1 +
 4 files changed, 22 insertions(+)
 create mode 100644 mlir/test/Examples/standalone/test.toy.install-dir

diff --git a/mlir/test/Examples/standalone/lit.local.cfg b/mlir/test/Examples/standalone/lit.local.cfg
index fe8397c6b9a10..bc9928decf527 100644
--- a/mlir/test/Examples/standalone/lit.local.cfg
+++ b/mlir/test/Examples/standalone/lit.local.cfg
@@ -10,3 +10,5 @@ config.substitutions.append(("%host_cc", config.host_cc))
 config.substitutions.append(("%enable_libcxx", config.enable_libcxx))
 config.substitutions.append(("%mlir_cmake_dir", config.mlir_cmake_dir))
 config.substitutions.append(("%llvm_use_linker", config.llvm_use_linker))
+config.substitutions.append(("%llvm_obj_root", config.llvm_obj_root))
+config.substitutions.append(("%host_cmake_install_prefix", config.host_cmake_install_prefix))
diff --git a/mlir/test/Examples/standalone/test.toy.install-dir b/mlir/test/Examples/standalone/test.toy.install-dir
new file mode 100644
index 0..5c33a70491ae1
--- /dev/null
+++ b/mlir/test/Examples/standalone/test.toy.install-dir
@@ -0,0 +1,16 @@
+# REQUIRES: github-actions
+# RUN: "%cmake_exe" --build %llvm_obj_root --target install
+# RUN: "%cmake_exe" "%mlir_src_root/examples/standalone" -G "%cmake_generator" \
+# RUN: -DCMAKE_CXX_COMPILER=%host_cxx -DCMAKE_C_COMPILER=%host_cc \
+# RUN: -DLLVM_ENABLE_LIBCXX=%enable_libcxx -DMLIR_DIR=%host_cmake_install_prefix \
+# RUN: -DLLVM_USE_LINKER=%llvm_use_linker \
+# RUN: -DPython3_EXECUTABLE=%python \
+# RUN: -DPython_EXECUTABLE=%python
+# RUN: "%cmake_exe" --build . --target check-standalone | tee %t
+# RUN: FileCheck --input-file=%t %s
+
+# Note: The number of checked tests is not important. The command will fail
+# if any fail.
+# CHECK: Passed
+# CHECK-NOT: Failed
+# UNSUPPORTED: target={{.*(windows|android).*}}
diff --git a/mlir/test/lit.cfg.py b/mlir/test/lit.cfg.py
index f99c24d6e299a..08c7947c1e9a6 100644
--- a/mlir/test/lit.cfg.py
+++ b/mlir/test/lit.cfg.py
@@ -383,3 +383,6 @@ def have_host_jit_feature_support(feature_name):
 
 if sys.version_info >= (3, 11):
     config.available_features.add("python-ge-311")
+
+if "GITHUB_ACTIONS" in os.environ:
+    config.available_features.add("github-actions")
diff --git a/mlir/test/lit.site.cfg.py.in b/mlir/test/lit.site.cfg.py.in
index 8a742a227847b..7e22ebf23c773 100644
--- a/mlir/test/lit.site.cfg.py.in
+++ b/mlir/test/lit.site.cfg.py.in
@@ -18,6 +18,7 @@ config.host_cxx = "@HOST_CXX@"
 config.enable_libcxx = "@LLVM_ENABLE_LIBCXX@"
 config.host_cmake = "@CMAKE_COMMAND@"
 config.host_cmake_generator = "@CMAKE_GENERATOR@"
+config.host_cmake_install_prefix = "@CMAKE_INSTALL_PREFIX@"
 config.llvm_use_linker = "@LLVM_USE_LINKER@"
 config.llvm_use_sanitizer = "@LLVM_USE_SANITIZER@"
 config.host_arch = "@HOST_ARCH@"

>From f26de0615a7e62b55bfa4dd0eee2ea423a1175f1 Mon Sep 17 00:00:00 2001
From: Maksim Levental 
Date: Wed, 10 Sep 2025 13:23:07 -0700
Subject: [PATCH 02/10] Update lit.site.cfg.py.in

---
 .../standalone/{test.toy.install-dir => test.install-dir.toy}| 0
 mlir/test/lit.site.cfg.py.in | 1 +
 2 files changed, 1 insertion(+)
 rename mlir/test/Examples/standalone/{test.toy.install-dir => test.install-dir.toy} (100%)

diff --git a/mlir/test/Examples/standalone/test.toy.install-dir b/mlir/test/Examples/standalone/test.install-dir.toy
similarity index 100%
rename from mlir/test/Examples/standalone/test.toy.install-dir
rename to mlir/test/Examples/standalone/test.install-dir.toy
diff --git a/mlir/test/lit.site.cfg.py.in b/mlir/test/lit.site.cfg.py.in
index 7e22ebf23c773..eadfd047d15f7 100644
--- a/mlir/test/lit.site.cfg.py.in
+++ b/mlir/test/lit.site.cfg.py.in
@@ -3,6 +3,7 @@
 import sys
 
 config.target_triple = "@LLVM_TARGET_TRIPLE@"
+config.llvm_obj_root = "@LLVM_BINARY_DIR@"
 config.llvm_src_root = "@LLVM_SOURCE_DIR@"
 config.llvm_tools_dir = lit_config.substitute("@LLVM_TOOLS_DIR@")
 config.lit_tools_dir = "@LLVM_LIT_TOOLS_DIR@"

>From b682c08d8682f4775b050d92a9082b943f42988b Mon Sep 17 00:00:00 2001
From: makslevental 
Date: Wed, 10 Sep 2025 15:54:54 -0700
Subject: [PATCH 03/10] add test.install-distribution-dir.toy

---
 mlir/test/Examples/standalone/lit.local.cfg |  1 +
 .../Examples/standalone/test.install-dir.toy|  4 ++--
 .../test.install-distribution-dir.toy   | 17 +
 3 files changed, 20 insertions(+), 2 deletions(-)
 create mode 100644 
mlir/test/Examples/stand

[llvm-branch-commits] [mlir] [MLIR][Standalone] test Standalone against install distributions (PR #157944)

2025-09-13 Thread Maksim Levental via llvm-branch-commits

https://github.com/makslevental updated 
https://github.com/llvm/llvm-project/pull/157944

>From 888cca06e4c81b1b12c85ec0ac48408e53ad57bd Mon Sep 17 00:00:00 2001
From: makslevental 
Date: Wed, 10 Sep 2025 12:57:54 -0700
Subject: [PATCH 01/10] [MLIR][Standalone] test Standalone against install
 distributions

---
 mlir/test/Examples/standalone/lit.local.cfg  |  2 ++
 .../Examples/standalone/test.toy.install-dir | 16 
 mlir/test/lit.cfg.py |  3 +++
 mlir/test/lit.site.cfg.py.in |  1 +
 4 files changed, 22 insertions(+)
 create mode 100644 mlir/test/Examples/standalone/test.toy.install-dir

diff --git a/mlir/test/Examples/standalone/lit.local.cfg 
b/mlir/test/Examples/standalone/lit.local.cfg
index fe8397c6b9a10..bc9928decf527 100644
--- a/mlir/test/Examples/standalone/lit.local.cfg
+++ b/mlir/test/Examples/standalone/lit.local.cfg
@@ -10,3 +10,5 @@ config.substitutions.append(("%host_cc", config.host_cc))
 config.substitutions.append(("%enable_libcxx", config.enable_libcxx))
 config.substitutions.append(("%mlir_cmake_dir", config.mlir_cmake_dir))
 config.substitutions.append(("%llvm_use_linker", config.llvm_use_linker))
+config.substitutions.append(("%llvm_obj_root", config.llvm_obj_root))
+config.substitutions.append(("%host_cmake_install_prefix", 
config.host_cmake_install_prefix))
diff --git a/mlir/test/Examples/standalone/test.toy.install-dir 
b/mlir/test/Examples/standalone/test.toy.install-dir
new file mode 100644
index 0..5c33a70491ae1
--- /dev/null
+++ b/mlir/test/Examples/standalone/test.toy.install-dir
@@ -0,0 +1,16 @@
+# REQUIRES: github-actions
+# RUN: "%cmake_exe" --build %llvm_obj_root --target install
+# RUN: "%cmake_exe" "%mlir_src_root/examples/standalone" -G "%cmake_generator" 
\
+# RUN: -DCMAKE_CXX_COMPILER=%host_cxx -DCMAKE_C_COMPILER=%host_cc \
+# RUN: -DLLVM_ENABLE_LIBCXX=%enable_libcxx 
-DMLIR_DIR=%host_cmake_install_prefix \
+# RUN: -DLLVM_USE_LINKER=%llvm_use_linker \
+# RUN: -DPython3_EXECUTABLE=%python \
+# RUN: -DPython_EXECUTABLE=%python
+# RUN: "%cmake_exe" --build . --target check-standalone | tee %t
+# RUN: FileCheck --input-file=%t %s
+
+# Note: The number of checked tests is not important. The command will fail
+# if any fail.
+# CHECK: Passed
+# CHECK-NOT: Failed
+# UNSUPPORTED: target={{.*(windows|android).*}}
diff --git a/mlir/test/lit.cfg.py b/mlir/test/lit.cfg.py
index f99c24d6e299a..08c7947c1e9a6 100644
--- a/mlir/test/lit.cfg.py
+++ b/mlir/test/lit.cfg.py
@@ -383,3 +383,6 @@ def have_host_jit_feature_support(feature_name):
 
 if sys.version_info >= (3, 11):
 config.available_features.add("python-ge-311")
+
+if "GITHUB_ACTIONS" in os.environ:
+config.available_features.add("github-actions")
diff --git a/mlir/test/lit.site.cfg.py.in b/mlir/test/lit.site.cfg.py.in
index 8a742a227847b..7e22ebf23c773 100644
--- a/mlir/test/lit.site.cfg.py.in
+++ b/mlir/test/lit.site.cfg.py.in
@@ -18,6 +18,7 @@ config.host_cxx = "@HOST_CXX@"
 config.enable_libcxx = "@LLVM_ENABLE_LIBCXX@"
 config.host_cmake = "@CMAKE_COMMAND@"
 config.host_cmake_generator = "@CMAKE_GENERATOR@"
+config.host_cmake_install_prefix = "@CMAKE_INSTALL_PREFIX@"
 config.llvm_use_linker = "@LLVM_USE_LINKER@"
 config.llvm_use_sanitizer = "@LLVM_USE_SANITIZER@"
 config.host_arch = "@HOST_ARCH@"

>From f26de0615a7e62b55bfa4dd0eee2ea423a1175f1 Mon Sep 17 00:00:00 2001
From: Maksim Levental 
Date: Wed, 10 Sep 2025 13:23:07 -0700
Subject: [PATCH 02/10] Update lit.site.cfg.py.in

---
 .../standalone/{test.toy.install-dir => test.install-dir.toy}| 0
 mlir/test/lit.site.cfg.py.in | 1 +
 2 files changed, 1 insertion(+)
 rename mlir/test/Examples/standalone/{test.toy.install-dir => test.install-dir.toy} (100%)

diff --git a/mlir/test/Examples/standalone/test.toy.install-dir b/mlir/test/Examples/standalone/test.install-dir.toy
similarity index 100%
rename from mlir/test/Examples/standalone/test.toy.install-dir
rename to mlir/test/Examples/standalone/test.install-dir.toy
diff --git a/mlir/test/lit.site.cfg.py.in b/mlir/test/lit.site.cfg.py.in
index 7e22ebf23c773..eadfd047d15f7 100644
--- a/mlir/test/lit.site.cfg.py.in
+++ b/mlir/test/lit.site.cfg.py.in
@@ -3,6 +3,7 @@
 import sys
 
 config.target_triple = "@LLVM_TARGET_TRIPLE@"
+config.llvm_obj_root = "@LLVM_BINARY_DIR@"
 config.llvm_src_root = "@LLVM_SOURCE_DIR@"
 config.llvm_tools_dir = lit_config.substitute("@LLVM_TOOLS_DIR@")
 config.lit_tools_dir = "@LLVM_LIT_TOOLS_DIR@"

>From b682c08d8682f4775b050d92a9082b943f42988b Mon Sep 17 00:00:00 2001
From: makslevental 
Date: Wed, 10 Sep 2025 15:54:54 -0700
Subject: [PATCH 03/10] add test.install-distribution-dir.toy

---
 mlir/test/Examples/standalone/lit.local.cfg |  1 +
 .../Examples/standalone/test.install-dir.toy|  4 ++--
 .../test.install-distribution-dir.toy   | 17 +
 3 files changed, 20 insertions(+), 2 deletions(-)
 create mode 100644 
mlir/test/Examples/stand

[llvm-branch-commits] [mlir] [MLIR][Standalone] test Standalone against install distributions (PR #157944)

2025-09-13 Thread Maksim Levental via llvm-branch-commits

https://github.com/makslevental edited 
https://github.com/llvm/llvm-project/pull/157944


[llvm-branch-commits] [llvm] Add deactivation symbol operand to ConstantPtrAuth. (PR #133537)

2025-09-13 Thread Ahmed Bougacha via llvm-branch-commits


@@ -1046,7 +1046,8 @@ class ConstantPtrAuth final : public Constant {
 public:
   /// Return a pointer signed with the specified parameters.
   LLVM_ABI static ConstantPtrAuth *get(Constant *Ptr, ConstantInt *Key,
-   ConstantInt *Disc, Constant *AddrDisc);
+   ConstantInt *Disc, Constant *AddrDisc,
+   Constant *DeactivationSymbol);

ahmedbougacha wrote:

You don't have to do this here, but we probably should make the optional 
operands (in textual IR) optional here as well, and implicitly make them null?  
Now that I think about it, I'm not sure how idiomatic that would be
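
One way to picture the suggestion above is a factory whose trailing operands default to null. The sketch below is only an illustration: `ConstantStub`, `ConstantIntStub`, and `PtrAuthSketch` are hypothetical stand-ins, not the real `llvm::Constant`, `llvm::ConstantInt`, or `ConstantPtrAuth` classes, and real LLVM may represent an absent operand with a null constant value rather than a C++ `nullptr`.

```cpp
#include <cassert>

// Hypothetical stand-ins; these are NOT the LLVM classes of similar name.
struct ConstantStub {
  int Id = 0;
};
struct ConstantIntStub {
  int Value = 0;
};

// Toy model of a signed-pointer constant with two optional trailing operands.
struct PtrAuthSketch {
  ConstantStub *Ptr;
  ConstantIntStub *Key;
  ConstantIntStub *Disc;
  ConstantStub *AddrDisc;
  ConstantStub *DeactivationSymbol;

  // Callers that do not care about the optional operands can omit them;
  // both then default to nullptr (an assumption made for this sketch).
  static PtrAuthSketch get(ConstantStub *Ptr, ConstantIntStub *Key,
                           ConstantIntStub *Disc,
                           ConstantStub *AddrDisc = nullptr,
                           ConstantStub *DeactivationSymbol = nullptr) {
    return PtrAuthSketch{Ptr, Key, Disc, AddrDisc, DeactivationSymbol};
  }

  bool hasDeactivationSymbol() const { return DeactivationSymbol != nullptr; }
};
```

The trade-off hinted at in the comment is that default arguments make call sites shorter but also make it easier to forget an operand silently, which may be why the idiom is in question.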

https://github.com/llvm/llvm-project/pull/133537


[llvm-branch-commits] [llvm] Add deactivation symbol operand to ConstantPtrAuth. (PR #133537)

2025-09-13 Thread Ahmed Bougacha via llvm-branch-commits

ahmedbougacha wrote:

Yep, this does seem reasonable to me as well (with a question in-line).
Thanks for the summons, sorry I haven't had the chance to take a look before!

https://github.com/llvm/llvm-project/pull/133537


[llvm-branch-commits] [mlir] [MLIR][Standalone] test Standalone against install distributions (PR #157944)

2025-09-13 Thread Maksim Levental via llvm-branch-commits

makslevental wrote:

Thinking about the `install-cxx-test-suite-prefix` thing some more, I feel like 
this is basically the same thing that I'm doing here: it runs CMake in a 
subprocess and executes the install script against the same build directory. 
The only difference is that it resets `-DCMAKE_INSTALL_PREFIX=...` without 
overwriting the user's prefix, and because it's an `add_custom_target` it plays 
nicely with dependencies. Other than that it's the same thing.

https://github.com/llvm/llvm-project/pull/157944


[llvm-branch-commits] [mlir] [MLIR][Standalone] test Standalone against install distributions (PR #157944)

2025-09-13 Thread Maksim Levental via llvm-branch-commits

https://github.com/makslevental updated 
https://github.com/llvm/llvm-project/pull/157944

>From 888cca06e4c81b1b12c85ec0ac48408e53ad57bd Mon Sep 17 00:00:00 2001
From: makslevental 
Date: Wed, 10 Sep 2025 12:57:54 -0700
Subject: [PATCH 01/10] [MLIR][Standalone] test Standalone against install
 distributions

---
 mlir/test/Examples/standalone/lit.local.cfg  |  2 ++
 .../Examples/standalone/test.toy.install-dir | 16 
 mlir/test/lit.cfg.py |  3 +++
 mlir/test/lit.site.cfg.py.in |  1 +
 4 files changed, 22 insertions(+)
 create mode 100644 mlir/test/Examples/standalone/test.toy.install-dir

diff --git a/mlir/test/Examples/standalone/lit.local.cfg b/mlir/test/Examples/standalone/lit.local.cfg
index fe8397c6b9a10..bc9928decf527 100644
--- a/mlir/test/Examples/standalone/lit.local.cfg
+++ b/mlir/test/Examples/standalone/lit.local.cfg
@@ -10,3 +10,5 @@ config.substitutions.append(("%host_cc", config.host_cc))
 config.substitutions.append(("%enable_libcxx", config.enable_libcxx))
 config.substitutions.append(("%mlir_cmake_dir", config.mlir_cmake_dir))
 config.substitutions.append(("%llvm_use_linker", config.llvm_use_linker))
+config.substitutions.append(("%llvm_obj_root", config.llvm_obj_root))
+config.substitutions.append(("%host_cmake_install_prefix", config.host_cmake_install_prefix))
diff --git a/mlir/test/Examples/standalone/test.toy.install-dir b/mlir/test/Examples/standalone/test.toy.install-dir
new file mode 100644
index 0..5c33a70491ae1
--- /dev/null
+++ b/mlir/test/Examples/standalone/test.toy.install-dir
@@ -0,0 +1,16 @@
+# REQUIRES: github-actions
+# RUN: "%cmake_exe" --build %llvm_obj_root --target install
+# RUN: "%cmake_exe" "%mlir_src_root/examples/standalone" -G "%cmake_generator" \
+# RUN: -DCMAKE_CXX_COMPILER=%host_cxx -DCMAKE_C_COMPILER=%host_cc \
+# RUN: -DLLVM_ENABLE_LIBCXX=%enable_libcxx -DMLIR_DIR=%host_cmake_install_prefix \
+# RUN: -DLLVM_USE_LINKER=%llvm_use_linker \
+# RUN: -DPython3_EXECUTABLE=%python \
+# RUN: -DPython_EXECUTABLE=%python
+# RUN: "%cmake_exe" --build . --target check-standalone | tee %t
+# RUN: FileCheck --input-file=%t %s
+
+# Note: The number of checked tests is not important. The command will fail
+# if any fail.
+# CHECK: Passed
+# CHECK-NOT: Failed
+# UNSUPPORTED: target={{.*(windows|android).*}}
diff --git a/mlir/test/lit.cfg.py b/mlir/test/lit.cfg.py
index f99c24d6e299a..08c7947c1e9a6 100644
--- a/mlir/test/lit.cfg.py
+++ b/mlir/test/lit.cfg.py
@@ -383,3 +383,6 @@ def have_host_jit_feature_support(feature_name):
 
 if sys.version_info >= (3, 11):
     config.available_features.add("python-ge-311")
+
+if "GITHUB_ACTIONS" in os.environ:
+    config.available_features.add("github-actions")
diff --git a/mlir/test/lit.site.cfg.py.in b/mlir/test/lit.site.cfg.py.in
index 8a742a227847b..7e22ebf23c773 100644
--- a/mlir/test/lit.site.cfg.py.in
+++ b/mlir/test/lit.site.cfg.py.in
@@ -18,6 +18,7 @@ config.host_cxx = "@HOST_CXX@"
 config.enable_libcxx = "@LLVM_ENABLE_LIBCXX@"
 config.host_cmake = "@CMAKE_COMMAND@"
 config.host_cmake_generator = "@CMAKE_GENERATOR@"
+config.host_cmake_install_prefix = "@CMAKE_INSTALL_PREFIX@"
 config.llvm_use_linker = "@LLVM_USE_LINKER@"
 config.llvm_use_sanitizer = "@LLVM_USE_SANITIZER@"
 config.host_arch = "@HOST_ARCH@"

>From f26de0615a7e62b55bfa4dd0eee2ea423a1175f1 Mon Sep 17 00:00:00 2001
From: Maksim Levental 
Date: Wed, 10 Sep 2025 13:23:07 -0700
Subject: [PATCH 02/10] Update lit.site.cfg.py.in

---
 .../standalone/{test.toy.install-dir => test.install-dir.toy}| 0
 mlir/test/lit.site.cfg.py.in | 1 +
 2 files changed, 1 insertion(+)
 rename mlir/test/Examples/standalone/{test.toy.install-dir => test.install-dir.toy} (100%)

diff --git a/mlir/test/Examples/standalone/test.toy.install-dir b/mlir/test/Examples/standalone/test.install-dir.toy
similarity index 100%
rename from mlir/test/Examples/standalone/test.toy.install-dir
rename to mlir/test/Examples/standalone/test.install-dir.toy
diff --git a/mlir/test/lit.site.cfg.py.in b/mlir/test/lit.site.cfg.py.in
index 7e22ebf23c773..eadfd047d15f7 100644
--- a/mlir/test/lit.site.cfg.py.in
+++ b/mlir/test/lit.site.cfg.py.in
@@ -3,6 +3,7 @@
 import sys
 
 config.target_triple = "@LLVM_TARGET_TRIPLE@"
+config.llvm_obj_root = "@LLVM_BINARY_DIR@"
 config.llvm_src_root = "@LLVM_SOURCE_DIR@"
 config.llvm_tools_dir = lit_config.substitute("@LLVM_TOOLS_DIR@")
 config.lit_tools_dir = "@LLVM_LIT_TOOLS_DIR@"

>From b682c08d8682f4775b050d92a9082b943f42988b Mon Sep 17 00:00:00 2001
From: makslevental 
Date: Wed, 10 Sep 2025 15:54:54 -0700
Subject: [PATCH 03/10] add test.install-distribution-dir.toy

---
 mlir/test/Examples/standalone/lit.local.cfg |  1 +
 .../Examples/standalone/test.install-dir.toy|  4 ++--
 .../test.install-distribution-dir.toy   | 17 +
 3 files changed, 20 insertions(+), 2 deletions(-)
 create mode 100644 
mlir/test/Examples/stand

[llvm-branch-commits] [mlir] [MLIR][Standalone] test Standalone against install distributions (PR #157944)

2025-09-13 Thread Renato Golin via llvm-branch-commits

https://github.com/rengolin edited 
https://github.com/llvm/llvm-project/pull/157944


[llvm-branch-commits] [NFC][CodeGen][CFI] Pre-commit transparent_union tests (PR #158192)

2025-09-13 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-clang

Author: Vitaly Buka (vitalybuka)


Changes



---
Full diff: https://github.com/llvm/llvm-project/pull/158192.diff


4 Files Affected:

- (modified) clang/test/CodeGen/cfi-icall-generalize.c (+15) 
- (modified) clang/test/CodeGen/cfi-icall-normalize2.c (+13) 
- (modified) clang/test/CodeGen/kcfi-generalize.c (+15) 
- (modified) clang/test/CodeGen/kcfi-normalize.c (+13) 


``diff
diff --git a/clang/test/CodeGen/cfi-icall-generalize.c b/clang/test/CodeGen/cfi-icall-generalize.c
index 0af17e5760cc6..116a99e4e2859 100644
--- a/clang/test/CodeGen/cfi-icall-generalize.c
+++ b/clang/test/CodeGen/cfi-icall-generalize.c
@@ -15,5 +15,20 @@ void g(int** (*fp)(const char *, const char **)) {
   fp(0, 0);
 }
 
+union Union {
+  char *c;
+  long* n;
+} __attribute__((transparent_union));
+
+// CHECK: define{{.*}} void @uni({{.*}} !type [[TYPE2:![0-9]+]] !type [[TYPE2_GENERALIZED:![0-9]+]]
+void uni(void (*fn)(union Union), union Union arg1) {
+  // UNGENERALIZED: call i1 @llvm.type.test(ptr {{.*}}, metadata !"_ZTSFv5UnionE")
+  // GENERALIZED: call i1 @llvm.type.test(ptr {{.*}}, metadata !"_ZTSFv5UnionE.generalized")
+fn(arg1);
+}
+
 // CHECK: [[TYPE]] = !{i64 0, !"_ZTSFPPiPKcPS2_E"}
 // CHECK: [[TYPE_GENERALIZED]] = !{i64 0, !"_ZTSFPvPKvS_E.generalized"}
+
+// CHECK: [[TYPE2]] = !{i64 0, !"_ZTSFvPFv5UnionES_E"}
+// CHECK: [[TYPE2_GENERALIZED]] = !{i64 0, !"_ZTSFvPv5UnionE.generalized"}
diff --git a/clang/test/CodeGen/cfi-icall-normalize2.c b/clang/test/CodeGen/cfi-icall-normalize2.c
index 93893065cf903..c88ecc9f0c3f7 100644
--- a/clang/test/CodeGen/cfi-icall-normalize2.c
+++ b/clang/test/CodeGen/cfi-icall-normalize2.c
@@ -24,6 +24,19 @@ void baz(void (*fn)(int, int, int), int arg1, int arg2, int arg3) {
 fn(arg1, arg2, arg3);
 }
 
+union Union {
+  char *c;
+  long* n;
+} __attribute__((transparent_union));
+
+void uni(void (*fn)(union Union), union Union arg1) {
+// CHECK-LABEL: define{{.*}}uni
+// CHECK-SAME: {{.*}}!type ![[TYPE4:[0-9]+]] !type !{{[0-9]+}}
+// CHECK: call i1 @llvm.type.test({{i8\*|ptr}} {{%f|%0}}, metadata !"_ZTSFv5UnionE.normalized")
+fn(arg1);
+}
+
 // CHECK: ![[TYPE1]] = !{i64 0, !"_ZTSFvPFvu3i32ES_E.normalized"}
 // CHECK: ![[TYPE2]] = !{i64 0, !"_ZTSFvPFvu3i32S_ES_S_E.normalized"}
 // CHECK: ![[TYPE3]] = !{i64 0, !"_ZTSFvPFvu3i32S_S_ES_S_S_E.normalized"}
+// CHECK: ![[TYPE4]] = !{i64 0, !"_ZTSFvPFv5UnionES_E.normalized"}
diff --git a/clang/test/CodeGen/kcfi-generalize.c b/clang/test/CodeGen/kcfi-generalize.c
index 4e32f4f35057c..89b298f3e2faa 100644
--- a/clang/test/CodeGen/kcfi-generalize.c
+++ b/clang/test/CodeGen/kcfi-generalize.c
@@ -26,8 +26,23 @@ void g(int** (*fp)(const char *, const char **)) {
   fp(0, 0);
 }
 
+union Union {
+  char *c;
+  long* n;
+} __attribute__((transparent_union));
+
+// CHECK: define{{.*}} void @uni({{.*}} !kcfi_type [[TYPE2:![0-9]+]]
+void uni(void (*fn)(union Union), union Union arg1) {
+  // UNGENERALIZED: call {{.*}} [ "kcfi"(i32 -1037059548) ]
+  // GENERALIZED: call {{.*}} [ "kcfi"(i32 422130955) ]
+fn(arg1);
+}
+
 // UNGENERALIZED: [[TYPE]] = !{i32 1296635908}
 // GENERALIZED: [[TYPE]] = !{i32 -49168686}
 
 // UNGENERALIZED: [[TYPE3]] = !{i32 874141567}
 // GENERALIZED: [[TYPE3]] = !{i32 954385378}
+
+// UNGENERALIZED: [[TYPE2]] = !{i32 981319178}
+// GENERALIZED: [[TYPE2]] = !{i32 -1599950473}
\ No newline at end of file
diff --git a/clang/test/CodeGen/kcfi-normalize.c b/clang/test/CodeGen/kcfi-normalize.c
index b9150e88f6ab5..cde784962d11a 100644
--- a/clang/test/CodeGen/kcfi-normalize.c
+++ b/clang/test/CodeGen/kcfi-normalize.c
@@ -28,7 +28,20 @@ void baz(void (*fn)(int, int, int), int arg1, int arg2, int arg3) {
 fn(arg1, arg2, arg3);
 }
 
+union Union {
+  char *c;
+  long* n;
+} __attribute__((transparent_union));
+
+void uni(void (*fn)(union Union), union Union arg1) {
+// CHECK-LABEL: define{{.*}}uni
+// CHECK-SAME: {{.*}}!kcfi_type ![[TYPE4:[0-9]+]]
+// CHECK: call void %0(ptr %1) [ "kcfi"(i32 -1430221633) ]
+fn(arg1);
+}
+
 // CHECK: ![[#]] = !{i32 4, !"cfi-normalize-integers", i32 1}
 // CHECK: ![[TYPE1]] = !{i32 -1143117868}
 // CHECK: ![[TYPE2]] = !{i32 -460921415}
 // CHECK: ![[TYPE3]] = !{i32 -333839615}
+// CHECK: ![[TYPE4]] = !{i32 1766237188}
\ No newline at end of file

``




https://github.com/llvm/llvm-project/pull/158192


[llvm-branch-commits] [mlir] [MLIR][Standalone] test Standalone against install distributions (PR #157944)

2025-09-13 Thread Maksim Levental via llvm-branch-commits

https://github.com/makslevental ready_for_review 
https://github.com/llvm/llvm-project/pull/157944


[llvm-branch-commits] [llvm] AMDGPU: Move spill pseudo special case out of adjustAllocatableRegClass (PR #158246)

2025-09-13 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-backend-amdgpu

Author: Matt Arsenault (arsenm)


Changes

This is special for the same reason av_mov_b64_imm_pseudo is special.

---
Full diff: https://github.com/llvm/llvm-project/pull/158246.diff


2 Files Affected:

- (modified) llvm/lib/Target/AMDGPU/SIInstrInfo.cpp (+3-5) 
- (modified) llvm/lib/Target/AMDGPU/SIInstrInfo.h (+4-2) 


``diff
diff --git a/llvm/lib/Target/AMDGPU/SIInstrInfo.cpp b/llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
index 5c3340703ba3b..b1a61886802f4 100644
--- a/llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
+++ b/llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
@@ -5976,8 +5976,7 @@ SIInstrInfo::getWholeWaveFunctionSetup(MachineFunction &MF) const {
 static const TargetRegisterClass *
 adjustAllocatableRegClass(const GCNSubtarget &ST, const SIRegisterInfo &RI,
   const MCInstrDesc &TID, unsigned RCID) {
-  if (!ST.hasGFX90AInsts() && (((TID.mayLoad() || TID.mayStore()) &&
-!(TID.TSFlags & SIInstrFlags::Spill)))) {
+  if (!ST.hasGFX90AInsts() && (((TID.mayLoad() || TID.mayStore())))) {
 switch (RCID) {
 case AMDGPU::AV_32RegClassID:
   RCID = AMDGPU::VGPR_32RegClassID;
@@ -6012,10 +6011,9 @@ const TargetRegisterClass *SIInstrInfo::getRegClass(const MCInstrDesc &TID,
   if (OpNum >= TID.getNumOperands())
 return nullptr;
   auto RegClass = TID.operands()[OpNum].RegClass;
-  if (TID.getOpcode() == AMDGPU::AV_MOV_B64_IMM_PSEUDO) {
-// Special pseudos have no alignment requirement
+  // Special pseudos have no alignment requirement
+  if (TID.getOpcode() == AMDGPU::AV_MOV_B64_IMM_PSEUDO || isSpill(TID))
 return RI.getRegClass(RegClass);
-  }
 
   return adjustAllocatableRegClass(ST, RI, TID, RegClass);
 }
diff --git a/llvm/lib/Target/AMDGPU/SIInstrInfo.h b/llvm/lib/Target/AMDGPU/SIInstrInfo.h
index f7dde2b90b68e..e0373e7768435 100644
--- a/llvm/lib/Target/AMDGPU/SIInstrInfo.h
+++ b/llvm/lib/Target/AMDGPU/SIInstrInfo.h
@@ -797,10 +797,12 @@ class SIInstrInfo final : public AMDGPUGenInstrInfo {
 return get(Opcode).TSFlags & SIInstrFlags::Spill;
   }
 
-  static bool isSpill(const MachineInstr &MI) {
-return MI.getDesc().TSFlags & SIInstrFlags::Spill;
+  static bool isSpill(const MCInstrDesc &Desc) {
+return Desc.TSFlags & SIInstrFlags::Spill;
   }
 
+  static bool isSpill(const MachineInstr &MI) { return isSpill(MI.getDesc()); }
+
   static bool isWWMRegSpillOpcode(uint16_t Opcode) {
 return Opcode == AMDGPU::SI_SPILL_WWM_V32_SAVE ||
Opcode == AMDGPU::SI_SPILL_WWM_AV32_SAVE ||

``
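
The shape of this change, an `isSpill` overload on the descriptor that the instruction overload forwards to, plus an early return in the caller, can be sketched outside LLVM. Everything below is a simplified assumption for illustration: `ToyDesc`, `ToyInstr`, `ToyFlags::Spill`, and `ToyRegClass` are hypothetical names, and the narrowing rule is reduced to a single case rather than the real register-class switch.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical flag bit; the real SIInstrFlags values are not reproduced here.
namespace ToyFlags {
constexpr std::uint64_t Spill = 1ull << 3;
}

// Minimal stand-in for an instruction descriptor.
struct ToyDesc {
  std::uint64_t TSFlags = 0;
  bool MayLoad = false;
  bool MayStore = false;
};

// Minimal stand-in for a machine instruction that owns its descriptor.
struct ToyInstr {
  ToyDesc Desc;
  const ToyDesc &getDesc() const { return Desc; }
};

// Descriptor-level query: usable where only the descriptor is available.
inline bool isSpill(const ToyDesc &Desc) {
  return (Desc.TSFlags & ToyFlags::Spill) != 0;
}

// Instruction-level query forwards to the descriptor overload.
inline bool isSpill(const ToyInstr &MI) { return isSpill(MI.getDesc()); }

enum class ToyRegClass { AV, VGPR };

// The caller handles the special case up front, so the adjustment logic
// below no longer needs to know about spills.
inline ToyRegClass getRegClass(const ToyDesc &Desc, bool HasGFX90AInsts,
                               ToyRegClass RC) {
  if (isSpill(Desc))
    return RC; // spill pseudos keep their class unchanged
  if (!HasGFX90AInsts && (Desc.MayLoad || Desc.MayStore) &&
      RC == ToyRegClass::AV)
    return ToyRegClass::VGPR; // simplified narrowing for memory instructions
  return RC;
}
```

Keeping the special case in the caller leaves the adjustment helper with a single concern, which appears to be the point of moving the check.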




https://github.com/llvm/llvm-project/pull/158246


[llvm-branch-commits] [clang] [clang][LoongArch] Introduce LASX and LSX conversion intrinsics (PR #157819)

2025-09-13 Thread via llvm-branch-commits

https://github.com/heiher updated 
https://github.com/llvm/llvm-project/pull/157819

>From 9e2fb6040167b52d7bf4fbd4e8ab444de3099d74 Mon Sep 17 00:00:00 2001
From: WANG Rui 
Date: Wed, 10 Sep 2025 17:11:10 +0800
Subject: [PATCH] [clang][LoongArch] Introduce LASX and LSX conversion
 intrinsics

This patch introduces the LASX and LSX conversion intrinsics:

- __m256 __lasx_cast_128_s (__m128)
- __m256d __lasx_cast_128_d (__m128d)
- __m256i __lasx_cast_128 (__m128i)
- __m256 __lasx_concat_128_s (__m128, __m128)
- __m256d __lasx_concat_128_d (__m128d, __m128d)
- __m256i __lasx_concat_128 (__m128i, __m128i)
- __m128 __lasx_extract_128_lo_s (__m256)
- __m128d __lasx_extract_128_lo_d (__m256d)
- __m128i __lasx_extract_128_lo (__m256i)
- __m128 __lasx_extract_128_hi_s (__m256)
- __m128d __lasx_extract_128_hi_d (__m256d)
- __m128i __lasx_extract_128_hi (__m256i)
- __m256 __lasx_insert_128_lo_s (__m256, __m128)
- __m256d __lasx_insert_128_lo_d (__m256d, __m128d)
- __m256i __lasx_insert_128_lo (__m256i, __m128i)
- __m256 __lasx_insert_128_hi_s (__m256, __m128)
- __m256d __lasx_insert_128_hi_d (__m256d, __m128d)
- __m256i __lasx_insert_128_hi (__m256i, __m128i)
---
 .../clang/Basic/BuiltinsLoongArchLASX.def |  19 +++
 clang/lib/Headers/lasxintrin.h| 110 
 .../CodeGen/LoongArch/lasx/builtin-alias.c| 153 +
 clang/test/CodeGen/LoongArch/lasx/builtin.c   | 157 ++
 4 files changed, 439 insertions(+)

diff --git a/clang/include/clang/Basic/BuiltinsLoongArchLASX.def b/clang/include/clang/Basic/BuiltinsLoongArchLASX.def
index c4ea46a3bc5b5..b234dedad648e 100644
--- a/clang/include/clang/Basic/BuiltinsLoongArchLASX.def
+++ b/clang/include/clang/Basic/BuiltinsLoongArchLASX.def
@@ -986,3 +986,22 @@ TARGET_BUILTIN(__builtin_lasx_xbnz_b, "iV32Uc", "nc", "lasx")
 TARGET_BUILTIN(__builtin_lasx_xbnz_h, "iV16Us", "nc", "lasx")
 TARGET_BUILTIN(__builtin_lasx_xbnz_w, "iV8Ui", "nc", "lasx")
 TARGET_BUILTIN(__builtin_lasx_xbnz_d, "iV4ULLi", "nc", "lasx")
+
+TARGET_BUILTIN(__builtin_lasx_cast_128_s, "V8fV4f", "nc", "lasx")
+TARGET_BUILTIN(__builtin_lasx_cast_128_d, "V4dV2d", "nc", "lasx")
+TARGET_BUILTIN(__builtin_lasx_cast_128, "V32ScV16Sc", "nc", "lasx")
+TARGET_BUILTIN(__builtin_lasx_concat_128_s, "V8fV4fV4f", "nc", "lasx")
+TARGET_BUILTIN(__builtin_lasx_concat_128_d, "V4dV2dV2d", "nc", "lasx")
+TARGET_BUILTIN(__builtin_lasx_concat_128, "V32ScV16ScV16Sc", "nc", "lasx")
+TARGET_BUILTIN(__builtin_lasx_extract_128_lo_s, "V4fV8f", "nc", "lasx")
+TARGET_BUILTIN(__builtin_lasx_extract_128_lo_d, "V2dV4d", "nc", "lasx")
+TARGET_BUILTIN(__builtin_lasx_extract_128_lo, "V16ScV32Sc", "nc", "lasx")
+TARGET_BUILTIN(__builtin_lasx_extract_128_hi_s, "V4fV8f", "nc", "lasx")
+TARGET_BUILTIN(__builtin_lasx_extract_128_hi_d, "V2dV4d", "nc", "lasx")
+TARGET_BUILTIN(__builtin_lasx_extract_128_hi, "V16ScV32Sc", "nc", "lasx")
+TARGET_BUILTIN(__builtin_lasx_insert_128_lo_s, "V8fV8fV4f", "nc", "lasx")
+TARGET_BUILTIN(__builtin_lasx_insert_128_lo_d, "V4dV4dV2d", "nc", "lasx")
+TARGET_BUILTIN(__builtin_lasx_insert_128_lo, "V32ScV32ScV16Sc", "nc", "lasx")
+TARGET_BUILTIN(__builtin_lasx_insert_128_hi_s, "V8fV8fV4f", "nc", "lasx")
+TARGET_BUILTIN(__builtin_lasx_insert_128_hi_d, "V4dV4dV2d", "nc", "lasx")
+TARGET_BUILTIN(__builtin_lasx_insert_128_hi, "V32ScV32ScV16Sc", "nc", "lasx")
diff --git a/clang/lib/Headers/lasxintrin.h b/clang/lib/Headers/lasxintrin.h
index 85020d82829e2..417671ffd437d 100644
--- a/clang/lib/Headers/lasxintrin.h
+++ b/clang/lib/Headers/lasxintrin.h
@@ -10,6 +10,8 @@
 #ifndef _LOONGSON_ASXINTRIN_H
 #define _LOONGSON_ASXINTRIN_H 1
 
+#include <lsxintrin.h>
+
 #if defined(__loongarch_asx)
 
 typedef signed char v32i8 __attribute__((vector_size(32), aligned(32)));
@@ -3882,5 +3884,113 @@ extern __inline
 
 #define __lasx_xvrepli_w(/*si10*/ _1) ((__m256i)__builtin_lasx_xvrepli_w((_1)))
 
+extern __inline
+__attribute__((__gnu_inline__, __always_inline__,
+   __artificial__)) __m256 __lasx_cast_128_s(__m128 _1) {
+  return (__m256)__builtin_lasx_cast_128_s((v4f32)_1);
+}
+
+extern __inline
+__attribute__((__gnu_inline__, __always_inline__, __artificial__)) __m256d
+__lasx_cast_128_d(__m128d _1) {
+  return (__m256d)__builtin_lasx_cast_128_d((v2f64)_1);
+}
+
+extern __inline
+__attribute__((__gnu_inline__, __always_inline__, __artificial__)) __m256i
+__lasx_cast_128(__m128i _1) {
+  return (__m256i)__builtin_lasx_cast_128((v16i8)_1);
+}
+
+extern __inline
+__attribute__((__gnu_inline__, __always_inline__, __artificial__)) __m256
+__lasx_concat_128_s(__m128 _1, __m128 _2) {
+  return (__m256)__builtin_lasx_concat_128_s((v4f32)_1, (v4f32)_2);
+}
+
+extern __inline
+__attribute__((__gnu_inline__, __always_inline__, __artificial__)) __m256d
+__lasx_concat_128_d(__m128d _1, __m128d _2) {
+  return (__m256d)__builtin_lasx_concat_128_d((v2f64)_1, (v2f64)_2);
+}
+
+extern __inline
+__attribute__((__gnu_inline__, __always_inline__, __artificial__)) __

[llvm-branch-commits] [llvm] CodeGen: Keep reference to TargetRegisterInfo in TargetInstrInfo (PR #158224)

2025-09-13 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-backend-msp430

Author: Matt Arsenault (arsenm)


Changes

Both conceptually belong to the same subtarget, so it should not
be necessary to pass in the context TargetRegisterInfo to any
TargetInstrInfo member. Add this reference so those superfluous
arguments can be removed.

Most targets placed their TargetRegisterInfo as a member
in TargetInstrInfo. A few had this owned by the TargetSubtargetInfo,
so unify all targets to look the same.
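
The ownership layout described above can be sketched in a few lines: the target-specific class owns the register info as a member and hands a reference to the base class, which then exposes it to every hook. This is a minimal sketch under stated assumptions; `RegisterInfoSketch`, `InstrInfoSketch`, `ToyInstrInfo`, and `describeSpill` are hypothetical names, not LLVM APIs.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical miniature of a register-info class.
class RegisterInfoSketch {
public:
  explicit RegisterInfoSketch(std::string Name) : Name(std::move(Name)) {}
  const std::string &getName() const { return Name; }

private:
  std::string Name;
};

// The base class stores a reference supplied by the derived constructor, so
// member functions no longer need a register-info parameter.
class InstrInfoSketch {
protected:
  const RegisterInfoSketch &TRI;
  explicit InstrInfoSketch(const RegisterInfoSketch &TRI) : TRI(TRI) {}

public:
  // Copying would leave the copy referring to the original's register info.
  InstrInfoSketch(const InstrInfoSketch &) = delete;
  InstrInfoSketch &operator=(const InstrInfoSketch &) = delete;

  const RegisterInfoSketch &getRegisterInfo() const { return TRI; }

  // A hook that reads the stored reference instead of taking an argument.
  std::string describeSpill(int FrameIndex) const {
    return TRI.getName() + ":fi" + std::to_string(FrameIndex);
  }
};

// The derived class owns the register info. Binding the reference before RI
// is constructed is fine because the base constructor only stores it and
// does not read through it.
class ToyInstrInfo : public InstrInfoSketch {
  RegisterInfoSketch RI;

public:
  ToyInstrInfo() : InstrInfoSketch(RI), RI("toy") {}
};
```

The one thing to watch with this layout is that the base constructor must not dereference the reference, since the member it points at is initialized only after the base class.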

---

Patch is 45.06 KiB, truncated to 20.00 KiB below, full version: 
https://github.com/llvm/llvm-project/pull/158224.diff


50 Files Affected:

- (modified) llvm/include/llvm/CodeGen/TargetInstrInfo.h (+8-3) 
- (modified) llvm/lib/CodeGen/TargetInstrInfo.cpp (+27-41) 
- (modified) llvm/lib/Target/AArch64/AArch64InstrInfo.cpp (+1-1) 
- (modified) llvm/lib/Target/AMDGPU/R600InstrInfo.cpp (+1-1) 
- (modified) llvm/lib/Target/AMDGPU/SIInstrInfo.cpp (+2-1) 
- (modified) llvm/lib/Target/ARC/ARCInstrInfo.cpp (+2-1) 
- (modified) llvm/lib/Target/ARM/ARMBaseInstrInfo.cpp (+3-2) 
- (modified) llvm/lib/Target/ARM/ARMBaseInstrInfo.h (+7-2) 
- (modified) llvm/lib/Target/ARM/ARMInstrInfo.cpp (+2-1) 
- (modified) llvm/lib/Target/ARM/ARMInstrInfo.h (+1-1) 
- (modified) llvm/lib/Target/ARM/Thumb1InstrInfo.cpp (+1-1) 
- (modified) llvm/lib/Target/ARM/Thumb1InstrInfo.h (+1-1) 
- (modified) llvm/lib/Target/ARM/Thumb2InstrInfo.cpp (+1-1) 
- (modified) llvm/lib/Target/ARM/Thumb2InstrInfo.h (+1-1) 
- (modified) llvm/lib/Target/AVR/AVRInstrInfo.cpp (+2-2) 
- (modified) llvm/lib/Target/BPF/BPFInstrInfo.cpp (+1-1) 
- (modified) llvm/lib/Target/CSKY/CSKYInstrInfo.cpp (+1-1) 
- (modified) llvm/lib/Target/DirectX/DirectXInstrInfo.cpp (+1-1) 
- (modified) llvm/lib/Target/Hexagon/HexagonInstrInfo.cpp (+2-2) 
- (modified) llvm/lib/Target/Hexagon/HexagonInstrInfo.h (+5) 
- (modified) llvm/lib/Target/Hexagon/HexagonSubtarget.cpp (+1-2) 
- (modified) llvm/lib/Target/Hexagon/HexagonSubtarget.h (+1-2) 
- (modified) llvm/lib/Target/Lanai/LanaiInstrInfo.cpp (+2-1) 
- (modified) llvm/lib/Target/LoongArch/LoongArchInstrInfo.cpp (+2-2) 
- (modified) llvm/lib/Target/LoongArch/LoongArchInstrInfo.h (+4) 
- (modified) llvm/lib/Target/LoongArch/LoongArchSubtarget.cpp (+1-1) 
- (modified) llvm/lib/Target/LoongArch/LoongArchSubtarget.h (+1-2) 
- (modified) llvm/lib/Target/MSP430/MSP430InstrInfo.cpp (+2-1) 
- (modified) llvm/lib/Target/Mips/Mips16InstrInfo.cpp (+1-5) 
- (modified) llvm/lib/Target/Mips/Mips16InstrInfo.h (+1-1) 
- (modified) llvm/lib/Target/Mips/MipsInstrInfo.cpp (+3-2) 
- (modified) llvm/lib/Target/Mips/MipsInstrInfo.h (+6-2) 
- (modified) llvm/lib/Target/Mips/MipsSEInstrInfo.cpp (+1-5) 
- (modified) llvm/lib/Target/Mips/MipsSEInstrInfo.h (+1-1) 
- (modified) llvm/lib/Target/NVPTX/NVPTXInstrInfo.cpp (+1-1) 
- (modified) llvm/lib/Target/PowerPC/PPCInstrInfo.cpp (+1-1) 
- (modified) llvm/lib/Target/RISCV/RISCVInstrInfo.cpp (+3-2) 
- (modified) llvm/lib/Target/RISCV/RISCVInstrInfo.h (+3) 
- (modified) llvm/lib/Target/RISCV/RISCVSubtarget.cpp (+1-1) 
- (modified) llvm/lib/Target/RISCV/RISCVSubtarget.h (+1-2) 
- (modified) llvm/lib/Target/SPIRV/SPIRVInstrInfo.cpp (+1-1) 
- (modified) llvm/lib/Target/Sparc/SparcInstrInfo.cpp (+2-2) 
- (modified) llvm/lib/Target/SystemZ/SystemZInstrInfo.cpp (+1-1) 
- (modified) llvm/lib/Target/VE/VEInstrInfo.cpp (+1-1) 
- (modified) llvm/lib/Target/WebAssembly/WebAssemblyInstrInfo.cpp (+1-1) 
- (modified) llvm/lib/Target/X86/X86InstrInfo.cpp (+1-1) 
- (modified) llvm/lib/Target/XCore/XCoreInstrInfo.cpp (+1-1) 
- (modified) llvm/lib/Target/Xtensa/XtensaInstrInfo.cpp (+2-1) 
- (modified) llvm/unittests/CodeGen/MFCommon.inc (+3-1) 
- (modified) llvm/utils/TableGen/InstrInfoEmitter.cpp (+7-5) 


``diff
diff --git a/llvm/include/llvm/CodeGen/TargetInstrInfo.h b/llvm/include/llvm/CodeGen/TargetInstrInfo.h
index 6a624a7052cdd..802cca6022074 100644
--- a/llvm/include/llvm/CodeGen/TargetInstrInfo.h
+++ b/llvm/include/llvm/CodeGen/TargetInstrInfo.h
@@ -113,9 +113,12 @@ struct ExtAddrMode {
 ///
 class LLVM_ABI TargetInstrInfo : public MCInstrInfo {
 protected:
-  TargetInstrInfo(unsigned CFSetupOpcode = ~0u, unsigned CFDestroyOpcode = ~0u,
-  unsigned CatchRetOpcode = ~0u, unsigned ReturnOpcode = ~0u)
-  : CallFrameSetupOpcode(CFSetupOpcode),
+  const TargetRegisterInfo &TRI;
+
+  TargetInstrInfo(const TargetRegisterInfo &TRI, unsigned CFSetupOpcode = ~0u,
+  unsigned CFDestroyOpcode = ~0u, unsigned CatchRetOpcode = ~0u,
+  unsigned ReturnOpcode = ~0u)
+  : TRI(TRI), CallFrameSetupOpcode(CFSetupOpcode),
 CallFrameDestroyOpcode(CFDestroyOpcode), CatchRetOpcode(CatchRetOpcode),
 ReturnOpcode(ReturnOpcode) {}
 
@@ -124,6 +127,8 @@ class LLVM_ABI TargetInstrInfo : public MCInstrInfo {
   TargetInstrInfo &operator=(const TargetInstrInfo &) = delete;
   virtual ~TargetInstrInfo();
 
+  const TargetRegisterInfo &getRegisterInfo() const { return TRI; }
+
   static bool isG

[llvm-branch-commits] [llvm] CodeGen: Remove TRI arguments from stack load/store hooks (PR #158240)

2025-09-13 Thread via llvm-branch-commits

llvmbot wrote:



@llvm/pr-subscribers-backend-aarch64

@llvm/pr-subscribers-backend-msp430

Author: Matt Arsenault (arsenm)


Changes

This is directly available in TargetInstrInfo

---

Patch is 110.41 KiB, truncated to 20.00 KiB below, full version: 
https://github.com/llvm/llvm-project/pull/158240.diff


63 Files Affected:

- (modified) llvm/include/llvm/CodeGen/TargetInstrInfo.h (+2-4) 
- (modified) llvm/lib/CodeGen/FixupStatepointCallerSaved.cpp (+3-3) 
- (modified) llvm/lib/CodeGen/InlineSpiller.cpp (+4-4) 
- (modified) llvm/lib/CodeGen/RegAllocFast.cpp (+3-4) 
- (modified) llvm/lib/CodeGen/RegisterScavenging.cpp (+2-2) 
- (modified) llvm/lib/CodeGen/TargetFrameLoweringImpl.cpp (+2-3) 
- (modified) llvm/lib/CodeGen/TargetInstrInfo.cpp (+2-2) 
- (modified) llvm/lib/Target/AArch64/AArch64FrameLowering.cpp (+2-3) 
- (modified) llvm/lib/Target/AArch64/AArch64InstrInfo.cpp (+12-12) 
- (modified) llvm/lib/Target/AArch64/AArch64InstrInfo.h (+2-3) 
- (modified) llvm/lib/Target/AMDGPU/SIInstrInfo.cpp (+3-5) 
- (modified) llvm/lib/Target/AMDGPU/SIInstrInfo.h (+2-4) 
- (modified) llvm/lib/Target/AMDGPU/SILowerSGPRSpills.cpp (+4-5) 
- (modified) llvm/lib/Target/ARM/ARMBaseInstrInfo.cpp (+10-7) 
- (modified) llvm/lib/Target/ARM/ARMBaseInstrInfo.h (+2-3) 
- (modified) llvm/lib/Target/ARM/Thumb1InstrInfo.cpp (+6-5) 
- (modified) llvm/lib/Target/ARM/Thumb1InstrInfo.h (+2-3) 
- (modified) llvm/lib/Target/ARM/Thumb2InstrInfo.cpp (+8-8) 
- (modified) llvm/lib/Target/ARM/Thumb2InstrInfo.h (+2-3) 
- (modified) llvm/lib/Target/AVR/AVRInstrInfo.cpp (+5-7) 
- (modified) llvm/lib/Target/AVR/AVRInstrInfo.h (+2-4) 
- (modified) llvm/lib/Target/BPF/BPFInstrInfo.cpp (+6-5) 
- (modified) llvm/lib/Target/BPF/BPFInstrInfo.h (+2-3) 
- (modified) llvm/lib/Target/Hexagon/HexagonFrameLowering.cpp (+4-7) 
- (modified) llvm/lib/Target/Hexagon/HexagonInstrInfo.cpp (+6-5) 
- (modified) llvm/lib/Target/Hexagon/HexagonInstrInfo.h (+2-3) 
- (modified) llvm/lib/Target/Lanai/LanaiInstrInfo.cpp (+2-4) 
- (modified) llvm/lib/Target/Lanai/LanaiInstrInfo.h (+2-4) 
- (modified) llvm/lib/Target/LoongArch/LoongArchFrameLowering.cpp (+1-1) 
- (modified) llvm/lib/Target/LoongArch/LoongArchInstrInfo.cpp (+8-8) 
- (modified) llvm/lib/Target/LoongArch/LoongArchInstrInfo.h (+2-4) 
- (modified) llvm/lib/Target/MSP430/MSP430InstrInfo.cpp (+7-6) 
- (modified) llvm/lib/Target/MSP430/MSP430InstrInfo.h (+2-4) 
- (modified) llvm/lib/Target/Mips/Mips16InstrInfo.cpp (+6-5) 
- (modified) llvm/lib/Target/Mips/Mips16InstrInfo.h (+3-2) 
- (modified) llvm/lib/Target/Mips/MipsInstrInfo.h (+6-9) 
- (modified) llvm/lib/Target/Mips/MipsSEFrameLowering.cpp (+20-29) 
- (modified) llvm/lib/Target/Mips/MipsSEInstrInfo.cpp (+20-19) 
- (modified) llvm/lib/Target/Mips/MipsSEInstrInfo.h (+2-3) 
- (modified) llvm/lib/Target/PowerPC/PPCFrameLowering.cpp (+5-6) 
- (modified) llvm/lib/Target/PowerPC/PPCInstrInfo.cpp (+11-12) 
- (modified) llvm/lib/Target/PowerPC/PPCInstrInfo.h (+6-6) 
- (modified) llvm/lib/Target/RISCV/RISCVFrameLowering.cpp (+13-15) 
- (modified) llvm/lib/Target/RISCV/RISCVISelLowering.cpp (+4-6) 
- (modified) llvm/lib/Target/RISCV/RISCVInstrInfo.cpp (+14-13) 
- (modified) llvm/lib/Target/RISCV/RISCVInstrInfo.h (+3-3) 
- (modified) llvm/lib/Target/Sparc/SparcInstrInfo.cpp (+6-5) 
- (modified) llvm/lib/Target/Sparc/SparcInstrInfo.h (+2-3) 
- (modified) llvm/lib/Target/SystemZ/SystemZFrameLowering.cpp (+8-8) 
- (modified) llvm/lib/Target/SystemZ/SystemZInstrInfo.cpp (+8-6) 
- (modified) llvm/lib/Target/SystemZ/SystemZInstrInfo.h (+4-2) 
- (modified) llvm/lib/Target/VE/VEInstrInfo.cpp (+6-5) 
- (modified) llvm/lib/Target/VE/VEInstrInfo.h (+4-2) 
- (modified) llvm/lib/Target/X86/X86FastPreTileConfig.cpp (+1-2) 
- (modified) llvm/lib/Target/X86/X86FrameLowering.cpp (+3-4) 
- (modified) llvm/lib/Target/X86/X86InstrInfo.cpp (+13-11) 
- (modified) llvm/lib/Target/X86/X86InstrInfo.h (+3-3) 
- (modified) llvm/lib/Target/XCore/XCoreFrameLowering.cpp (+2-3) 
- (modified) llvm/lib/Target/XCore/XCoreInstrInfo.cpp (+2-3) 
- (modified) llvm/lib/Target/XCore/XCoreInstrInfo.h (+4-2) 
- (modified) llvm/lib/Target/Xtensa/XtensaFrameLowering.cpp (+1-1) 
- (modified) llvm/lib/Target/Xtensa/XtensaInstrInfo.cpp (+10-8) 
- (modified) llvm/lib/Target/Xtensa/XtensaInstrInfo.h (+2-3) 


``diff
diff --git a/llvm/include/llvm/CodeGen/TargetInstrInfo.h b/llvm/include/llvm/CodeGen/TargetInstrInfo.h
index 802cca6022074..fb7ced7960846 100644
--- a/llvm/include/llvm/CodeGen/TargetInstrInfo.h
+++ b/llvm/include/llvm/CodeGen/TargetInstrInfo.h
@@ -1165,8 +1165,7 @@ class LLVM_ABI TargetInstrInfo : public MCInstrInfo {
   /// register spill instruction, part of prologue, during the frame lowering.
   virtual void storeRegToStackSlot(
   MachineBasicBlock &MBB, MachineBasicBlock::iterator MI, Register SrcReg,
-  bool isKill, int FrameIndex, const TargetRegisterClass *RC,
-  const TargetRegisterInfo *TRI, Register VReg,
+  bool isKill, int FrameIndex, const TargetRegisterClass *RC, 

[llvm-branch-commits] [NFC][CFI][CodeGen] Move GeneralizeFunctionType out of CreateMetadataIdentifierGeneralized (PR #158190)

2025-09-13 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-clang-codegen

Author: Vitaly Buka (vitalybuka)


Changes



---
Full diff: https://github.com/llvm/llvm-project/pull/158190.diff


1 Files Affected:

- (modified) clang/lib/CodeGen/CodeGenModule.cpp (+9-5) 


``diff
diff --git a/clang/lib/CodeGen/CodeGenModule.cpp 
b/clang/lib/CodeGen/CodeGenModule.cpp
index d45fb823d4c35..acd77c5aca89c 100644
--- a/clang/lib/CodeGen/CodeGenModule.cpp
+++ b/clang/lib/CodeGen/CodeGenModule.cpp
@@ -3041,9 +3041,11 @@ void 
CodeGenModule::createFunctionTypeMetadataForIcall(const FunctionDecl *FD,
   if (isa<CXXMethodDecl>(FD) && !cast<CXXMethodDecl>(FD)->isStatic())
 return;
 
-  llvm::Metadata *MD = CreateMetadataIdentifierForType(FD->getType());
+  QualType FnType = FD->getType();
+  llvm::Metadata *MD = CreateMetadataIdentifierForType(FnType);
   F->addTypeMetadata(0, MD);
-  F->addTypeMetadata(0, CreateMetadataIdentifierGeneralized(FD->getType()));
+  FnType = GeneralizeFunctionType(getContext(), FnType);
+  F->addTypeMetadata(0, CreateMetadataIdentifierGeneralized(FnType));
 
   // Emit a hash-based bit set entry for cross-DSO calls.
   if (CodeGenOpts.SanitizeCfiCrossDso)
@@ -7936,8 +7938,10 @@ CodeGenModule::CreateMetadataIdentifierImpl(QualType T, 
MetadataTypeMap &Map,
 
 llvm::Metadata *CodeGenModule::CreateMetadataIdentifierForFnType(QualType T) {
   assert(isa<FunctionType>(T));
-  if (getCodeGenOpts().SanitizeCfiICallGeneralizePointers)
+  if (getCodeGenOpts().SanitizeCfiICallGeneralizePointers) {
+T = GeneralizeFunctionType(getContext(), T);
 return CreateMetadataIdentifierGeneralized(T);
+  }
   return CreateMetadataIdentifierForType(T);
 }
 
@@ -7951,8 +7955,8 @@ 
CodeGenModule::CreateMetadataIdentifierForVirtualMemPtrType(QualType T) {
 }
 
 llvm::Metadata *CodeGenModule::CreateMetadataIdentifierGeneralized(QualType T) 
{
-  return CreateMetadataIdentifierImpl(GeneralizeFunctionType(getContext(), T),
-  GeneralizedMetadataIdMap, 
".generalized");
+  return CreateMetadataIdentifierImpl(T, GeneralizedMetadataIdMap,
+  ".generalized");
 }
 
 /// Returns whether this module needs the "all-vtables" type identifier.

``




https://github.com/llvm/llvm-project/pull/158190


[llvm-branch-commits] [CodeGen][CFI] Generalize transparent union parameters (PR #158193)

2025-09-13 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-clang

Author: Vitaly Buka (vitalybuka)


Changes

According to the GCC documentation, the calling convention for a
transparent union is the same as for the type of the first member
of the union.

C++ ignores the attribute.
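
The rule described above can be sketched with a toy type model. This is only an illustration under stated assumptions: `ToyType` and `generalizeTransparentUnion` are hypothetical names, and the real patch works on Clang's `QualType` and `RecordDecl` rather than on a struct like this.

```cpp
#include <string>
#include <vector>

// Toy stand-in for a C type; this is not Clang's QualType.
struct ToyType {
  std::string Name;
  bool IsUnion = false;
  bool IsTransparentUnion = false; // models __attribute__((transparent_union))
  std::vector<ToyType> Fields;     // union members, in declaration order
};

// Returns the type used for CFI type matching: a transparent union is
// replaced by its first member; every other type is returned unchanged.
ToyType generalizeTransparentUnion(const ToyType &Ty) {
  if (!Ty.IsUnion || !Ty.IsTransparentUnion || Ty.Fields.empty())
    return Ty;
  return Ty.Fields.front();
}
```

With a transparent union whose first member is `char *`, the sketch yields `char *`, which matches the direction of the updated test expectations in this PR (`_ZTSFv5UnionE` becoming `_ZTSFvPcE`).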


---
Full diff: https://github.com/llvm/llvm-project/pull/158193.diff


5 Files Affected:

- (modified) clang/lib/CodeGen/CodeGenModule.cpp (+17-1) 
- (modified) clang/test/CodeGen/cfi-icall-generalize.c (+4-4) 
- (modified) clang/test/CodeGen/cfi-icall-normalize2.c (+2-2) 
- (modified) clang/test/CodeGen/kcfi-generalize.c (+4-4) 
- (modified) clang/test/CodeGen/kcfi-normalize.c (+6-4) 


``diff
diff --git a/clang/lib/CodeGen/CodeGenModule.cpp 
b/clang/lib/CodeGen/CodeGenModule.cpp
index c647003ff389d..46dbd85665e5d 100644
--- a/clang/lib/CodeGen/CodeGenModule.cpp
+++ b/clang/lib/CodeGen/CodeGenModule.cpp
@@ -2339,13 +2339,29 @@ llvm::ConstantInt 
*CodeGenModule::CreateCrossDsoCfiTypeId(llvm::Metadata *MD) {
   return llvm::ConstantInt::get(Int64Ty, llvm::MD5Hash(MDS->getString()));
 }
 
+static QualType GeneralizeTransparentUnion(QualType Ty) {
+  const RecordType *UT = Ty->getAsUnionType();
+  if (!UT)
+return Ty;
+  const RecordDecl *UD = UT->getOriginalDecl()->getDefinitionOrSelf();
+  if (!UD->hasAttr<TransparentUnionAttr>())
+return Ty;
+  for (const auto *it : UD->fields()) {
+return it->getType();
+  }
+  return Ty;
+}
+
+static QualType GeneralizeTransparentUnion(QualType Ty) {
+}
+
 // Generalize pointer types to a void pointer with the qualifiers of the
 // originally pointed-to type, e.g. 'const char *' and 'char * const *'
 // generalize to 'const void *' while 'char *' and 'const char **' generalize 
to
 // 'void *'.
 static QualType GeneralizeType(ASTContext &Ctx, QualType Ty,
bool GeneralizePointers) {
-  // TODO: Add other generalizations.
+  Ty = GeneralizeTransparentUnion(Ty);
 
   if (!GeneralizePointers || !Ty->isPointerType())
 return Ty;
diff --git a/clang/test/CodeGen/cfi-icall-generalize.c 
b/clang/test/CodeGen/cfi-icall-generalize.c
index 116a99e4e2859..5359134805198 100644
--- a/clang/test/CodeGen/cfi-icall-generalize.c
+++ b/clang/test/CodeGen/cfi-icall-generalize.c
@@ -22,13 +22,13 @@ union Union {
 
 // CHECK: define{{.*}} void @uni({{.*}} !type [[TYPE2:![0-9]+]] !type 
[[TYPE2_GENERALIZED:![0-9]+]]
 void uni(void (*fn)(union Union), union Union arg1) {
-  // UNGENERALIZED: call i1 @llvm.type.test(ptr {{.*}}, metadata 
!"_ZTSFv5UnionE")
-  // GENERALIZED: call i1 @llvm.type.test(ptr {{.*}}, metadata 
!"_ZTSFv5UnionE.generalized")
+  // UNGENERALIZED: call i1 @llvm.type.test(ptr {{.*}}, metadata !"_ZTSFvPcE")
+  // GENERALIZED: call i1 @llvm.type.test(ptr {{.*}}, metadata 
!"_ZTSFvPvE.generalized")
 fn(arg1);
 }
 
 // CHECK: [[TYPE]] = !{i64 0, !"_ZTSFPPiPKcPS2_E"}
 // CHECK: [[TYPE_GENERALIZED]] = !{i64 0, !"_ZTSFPvPKvS_E.generalized"}
 
-// CHECK: [[TYPE2]] = !{i64 0, !"_ZTSFvPFv5UnionES_E"}
-// CHECK: [[TYPE2_GENERALIZED]] = !{i64 0, !"_ZTSFvPv5UnionE.generalized"}
+// CHECK: [[TYPE2]] = !{i64 0, !"_ZTSFvPFv5UnionEPcE"}
+// CHECK: [[TYPE2_GENERALIZED]] = !{i64 0, !"_ZTSFvPvS_E.generalized"}
diff --git a/clang/test/CodeGen/cfi-icall-normalize2.c 
b/clang/test/CodeGen/cfi-icall-normalize2.c
index c88ecc9f0c3f7..b9d9af7c8a47b 100644
--- a/clang/test/CodeGen/cfi-icall-normalize2.c
+++ b/clang/test/CodeGen/cfi-icall-normalize2.c
@@ -32,11 +32,11 @@ union Union {
 void uni(void (*fn)(union Union), union Union arg1) {
 // CHECK-LABEL: define{{.*}}uni
 // CHECK-SAME: {{.*}}!type ![[TYPE4:[0-9]+]] !type !{{[0-9]+}}
-// CHECK: call i1 @llvm.type.test({{i8\*|ptr}} {{%f|%0}}, metadata 
!"_ZTSFv5UnionE.normalized")
+// CHECK: call i1 @llvm.type.test({{i8\*|ptr}} {{%f|%0}}, metadata 
!"_ZTSFvPu2i8E.normalized")
 fn(arg1);
 }
 
 // CHECK: ![[TYPE1]] = !{i64 0, !"_ZTSFvPFvu3i32ES_E.normalized"}
 // CHECK: ![[TYPE2]] = !{i64 0, !"_ZTSFvPFvu3i32S_ES_S_E.normalized"}
 // CHECK: ![[TYPE3]] = !{i64 0, !"_ZTSFvPFvu3i32S_S_ES_S_S_E.normalized"}
-// CHECK: ![[TYPE4]] = !{i64 0, !"_ZTSFvPFv5UnionES_E.normalized"}
+// CHECK: ![[TYPE4]] = !{i64 0, !"_ZTSFvPFv5UnionEPu2i8E.normalized"}
diff --git a/clang/test/CodeGen/kcfi-generalize.c 
b/clang/test/CodeGen/kcfi-generalize.c
index 89b298f3e2faa..24e054549d527 100644
--- a/clang/test/CodeGen/kcfi-generalize.c
+++ b/clang/test/CodeGen/kcfi-generalize.c
@@ -33,8 +33,8 @@ union Union {
 
 // CHECK: define{{.*}} void @uni({{.*}} !kcfi_type [[TYPE2:![0-9]+]]
 void uni(void (*fn)(union Union), union Union arg1) {
-  // UNGENERALIZED: call {{.*}} [ "kcfi"(i32 -1037059548) ]
-  // GENERALIZED: call {{.*}} [ "kcfi"(i32 422130955) ]
+  // UNGENERALIZED: call {{.*}} [ "kcfi"(i32 -587217045) ]
+  // GENERALIZED: call {{.*}} [ "kcfi"(i32 2139530422) ]
 fn(arg1);
 }
 
@@ -44,5 +44,5 @@ void uni(void (*fn)(union Union), union Union arg1) {
 // UNGENERALIZED: [[TYPE3]] = !{i32 874141567}
 // GENERALIZED: [[TYPE3]] = !{i32 954385378}
 
-// UNGENERALIZED: [[TYPE2]] = !{i32 98

[llvm-branch-commits] [CodeGen][CFI] Generalize transparent union in args of args of functions (PR #158194)

2025-09-13 Thread Vitaly Buka via llvm-branch-commits

https://github.com/vitalybuka created 
https://github.com/llvm/llvm-project/pull/158194

According to the GCC documentation, the calling convention for a
transparent union is the same as for the type of the first member
of the union.

C++ ignores the attribute.





[llvm-branch-commits] [Clang] Invoke shell script with bash (PR #157608)

2025-09-13 Thread Aiden Grossman via llvm-branch-commits

https://github.com/boomanaiden154 updated 
https://github.com/llvm/llvm-project/pull/157608




[llvm-branch-commits] [llvm] CodeGen: Remove MachineFunction argument from getRegClass (PR #158188)

2025-09-13 Thread Sergei Barannikov via llvm-branch-commits

https://github.com/s-barannikov approved this pull request.


https://github.com/llvm/llvm-project/pull/158188


[llvm-branch-commits] [NFC][CodeGe][CFI] Pre-commit transparent_union tests (PR #158192)

2025-09-13 Thread Vitaly Buka via llvm-branch-commits

https://github.com/vitalybuka created 
https://github.com/llvm/llvm-project/pull/158192

None




[llvm-branch-commits] [llvm] CodeGen: Remove TRI argument from getRegClass (PR #158225)

2025-09-13 Thread Simon Pilgrim via llvm-branch-commits

https://github.com/RKSimon approved this pull request.

LGTM

https://github.com/llvm/llvm-project/pull/158225


[llvm-branch-commits] [Remarks] Restructure bitstream remarks to be fully standalone (PR #156715)

2025-09-13 Thread Jon Roelofs via llvm-branch-commits


@@ -232,43 +221,40 @@ void BitstreamRemarkSerializerHelper::setupBlockInfo() {
 }

jroelofs wrote:

if you push this into each `case`, and replace `break;` with `return;`, then 
this could become a fully covered-switch-with-unreachable pattern.

https://github.com/llvm/llvm-project/pull/156715


[llvm-branch-commits] [mlir] 111fd8e - Revert "Introduce LDBG_OS() macro as a variant of LDBG() (#157194) (#158260)"

2025-09-13 Thread via llvm-branch-commits

Author: Mehdi Amini
Date: 2025-09-12T11:37:23+01:00
New Revision: 111fd8e494728ba5eef6ffac50c6bf7a3d967f21

URL: 
https://github.com/llvm/llvm-project/commit/111fd8e494728ba5eef6ffac50c6bf7a3d967f21
DIFF: 
https://github.com/llvm/llvm-project/commit/111fd8e494728ba5eef6ffac50c6bf7a3d967f21.diff

LOG: Revert "Introduce LDBG_OS() macro as a variant of LDBG() (#157194) 
(#158260)"

This reverts commit 8457e68b6b59f8daf5fb747fe3a2f9c48c3c3ba8.

Added: 


Modified: 
llvm/include/llvm/Support/Debug.h
llvm/include/llvm/Support/DebugLog.h
llvm/unittests/Support/DebugLogTest.cpp
mlir/lib/Dialect/Transform/IR/TransformOps.cpp

Removed: 




diff  --git a/llvm/include/llvm/Support/Debug.h 
b/llvm/include/llvm/Support/Debug.h
index b73f2d7c8b852..a7795d403721c 100644
--- a/llvm/include/llvm/Support/Debug.h
+++ b/llvm/include/llvm/Support/Debug.h
@@ -44,6 +44,11 @@ class raw_ostream;
 /// level, return false.
 LLVM_ABI bool isCurrentDebugType(const char *Type, int Level = 0);
 
+/// Overload allowing to swap the order of the Type and Level arguments.
+LLVM_ABI inline bool isCurrentDebugType(int Level, const char *Type) {
+  return isCurrentDebugType(Type, Level);
+}
+
 /// setCurrentDebugType - Set the current debug type, as if the -debug-only=X
 /// option were specified.  Note that DebugFlag also needs to be set to true 
for
 /// debug output to be produced.

diff  --git a/llvm/include/llvm/Support/DebugLog.h 
b/llvm/include/llvm/Support/DebugLog.h
index f7748bc9904b1..dce706e196bde 100644
--- a/llvm/include/llvm/Support/DebugLog.h
+++ b/llvm/include/llvm/Support/DebugLog.h
@@ -19,55 +19,52 @@
 namespace llvm {
 #ifndef NDEBUG
 
-/// LDBG() is a macro that can be used as a raw_ostream for debugging.
-/// It will stream the output to the dbgs() stream, with a prefix of the
-/// debug type and the file and line number. A trailing newline is added to the
-/// output automatically. If the streamed content contains a newline, the 
prefix
-/// is added to each beginning of a new line. Nothing is printed if the debug
-/// output is not enabled or the debug type does not match.
-///
-/// E.g.,
-///   LDBG() << "Bitset contains: " << Bitset;
-/// is equivalent to
-///   LLVM_DEBUG(dbgs() << "[" << DEBUG_TYPE << "] " << __FILE__ << ":" <<
-///   __LINE__ << " "
-///  << "Bitset contains: " << Bitset << "\n");
-///
+// LDBG() is a macro that can be used as a raw_ostream for debugging.
+// It will stream the output to the dbgs() stream, with a prefix of the
+// debug type and the file and line number. A trailing newline is added to the
+// output automatically. If the streamed content contains a newline, the prefix
+// is added to each beginning of a new line. Nothing is printed if the debug
+// output is not enabled or the debug type does not match.
+//
+// E.g.,
+//   LDBG() << "Bitset contains: " << Bitset;
+// is somehow equivalent to
+//   LLVM_DEBUG(dbgs() << "[" << DEBUG_TYPE << "] " << __FILE__ << ":" <<
+//   __LINE__ << " "
+//  << "Bitset contains: " << Bitset << "\n");
+//
 // An optional `level` argument can be provided to control the verbosity of the
-/// output. The default level is 1, and is in increasing level of verbosity.
-///
-/// The `level` argument can be a literal integer, or a macro that evaluates to
-/// an integer.
-///
-/// An optional `type` argument can be provided to control the debug type. The
-/// default type is DEBUG_TYPE. The `type` argument can be a literal string, or
-/// a macro that evaluates to a string.
-///
-/// E.g.,
-///   LDBG(2) << "Bitset contains: " << Bitset;
-///   LDBG("debug_type") << "Bitset contains: " << Bitset;
-///   LDBG("debug_type", 2) << "Bitset contains: " << Bitset;
+// output. The default level is 1, and is in increasing level of verbosity.
+//
+// The `level` argument can be a literal integer, or a macro that evaluates to
+// an integer.
+//
+// An optional `type` argument can be provided to control the debug type. The
+// default type is DEBUG_TYPE. The `type` argument can be a literal string, or 
a
+// macro that evaluates to a string.
 #define LDBG(...) _GET_LDBG_MACRO(__VA_ARGS__)(__VA_ARGS__)
 
-/// LDBG_OS() is a macro that behaves like LDBG() but instead of directly using
-/// it to stream the output, it takes a callback function that will be called
-/// with a raw_ostream.
-/// This is useful when you need to pass a `raw_ostream` to a helper function 
to
-/// be able to print (when the `<<` operator is not available).
-///
-/// E.g.,
-///   LDBG_OS([&] (raw_ostream &Os) {
-/// Os << "Pass Manager contains: ";
-/// pm.printAsTextual(Os);
-///   });
-///
-/// Just like LDBG(), it optionally accepts a `level` and `type` arguments.
-/// E.g.,
-///   LDBG_OS(2, [&] (raw_ostream &Os) { ... });
-///   LDBG_OS("debug_type", [&] (raw_ostream &Os) { ... });
-///   LDBG_OS("debug_type", 2, [&] (raw_ostream &Os) { ... });
-///
-#defi

[llvm-branch-commits] [llvm] 1ff8b74 - Revert "[DebugLine] Correct debug line emittion (#157529)"

2025-09-13 Thread via llvm-branch-commits

Author: David Blaikie
Date: 2025-09-12T11:22:56-07:00
New Revision: 1ff8b74c4d592f3c235460fda236e636b2f2590f

URL: 
https://github.com/llvm/llvm-project/commit/1ff8b74c4d592f3c235460fda236e636b2f2590f
DIFF: 
https://github.com/llvm/llvm-project/commit/1ff8b74c4d592f3c235460fda236e636b2f2590f.diff

LOG: Revert "[DebugLine] Correct debug line emittion (#157529)"

This reverts commit 84f431c35b3fbd5b9c46608689f25a5d29bc0f55.

Added: 


Modified: 
llvm/lib/MC/MCDwarf.cpp
llvm/test/DebugInfo/X86/DW_AT_LLVM_stmt_seq_sec_offset.ll
llvm/test/MC/ELF/debug-loc-label.s

Removed: 
llvm/test/DebugInfo/ARM/stmt_seq_macho.test



diff  --git a/llvm/lib/MC/MCDwarf.cpp b/llvm/lib/MC/MCDwarf.cpp
index e8f000a584839..e7c0d37e8f99b 100644
--- a/llvm/lib/MC/MCDwarf.cpp
+++ b/llvm/lib/MC/MCDwarf.cpp
@@ -181,7 +181,7 @@ void MCDwarfLineTable::emitOne(
 
   unsigned FileNum, LastLine, Column, Flags, Isa, Discriminator;
   bool IsAtStartSeq;
-  MCSymbol *PrevLabel;
+  MCSymbol *LastLabel;
   auto init = [&]() {
 FileNum = 1;
 LastLine = 1;
@@ -189,31 +189,21 @@ void MCDwarfLineTable::emitOne(
 Flags = DWARF2_LINE_DEFAULT_IS_STMT ? DWARF2_FLAG_IS_STMT : 0;
 Isa = 0;
 Discriminator = 0;
-PrevLabel = nullptr;
+LastLabel = nullptr;
 IsAtStartSeq = true;
   };
   init();
 
   // Loop through each MCDwarfLineEntry and encode the dwarf line number table.
   bool EndEntryEmitted = false;
-  for (auto It = LineEntries.begin(); It != LineEntries.end(); ++It) {
-auto LineEntry = *It;
-MCSymbol *CurrLabel = LineEntry.getLabel();
+  for (const MCDwarfLineEntry &LineEntry : LineEntries) {
+MCSymbol *Label = LineEntry.getLabel();
 const MCAsmInfo *asmInfo = MCOS->getContext().getAsmInfo();
 
 if (LineEntry.LineStreamLabel) {
   if (!IsAtStartSeq) {
-auto *Label = CurrLabel;
-auto NextIt = It + 1;
-// LineEntry with a null Label is probably a fake LineEntry we added
-// when `-emit-func-debug-line-table-offsets` in order to terminate the
-// sequence. Look for the next Label if possible, otherwise we will set
-// the PC to the end of the section.
-if (!Label && NextIt != LineEntries.end()) {
-  Label = NextIt->getLabel();
-}
-MCOS->emitDwarfLineEndEntry(Section, PrevLabel,
-/*EndLabel =*/Label);
+MCOS->emitDwarfLineEndEntry(Section, LastLabel,
+/*EndLabel =*/LastLabel);
 init();
   }
   MCOS->emitLabel(LineEntry.LineStreamLabel, LineEntry.StreamLabelDefLoc);
@@ -221,7 +211,7 @@ void MCDwarfLineTable::emitOne(
 }
 
 if (LineEntry.IsEndEntry) {
-  MCOS->emitDwarfAdvanceLineAddr(INT64_MAX, PrevLabel, CurrLabel,
+  MCOS->emitDwarfAdvanceLineAddr(INT64_MAX, LastLabel, Label,
  asmInfo->getCodePointerSize());
   init();
   EndEntryEmitted = true;
@@ -268,12 +258,12 @@ void MCDwarfLineTable::emitOne(
 // At this point we want to emit/create the sequence to encode the delta in
 // line numbers and the increment of the address from the previous Label
 // and the current Label.
-MCOS->emitDwarfAdvanceLineAddr(LineDelta, PrevLabel, CurrLabel,
+MCOS->emitDwarfAdvanceLineAddr(LineDelta, LastLabel, Label,
asmInfo->getCodePointerSize());
 
 Discriminator = 0;
 LastLine = LineEntry.getLine();
-PrevLabel = CurrLabel;
+LastLabel = Label;
 IsAtStartSeq = false;
   }
 
@@ -283,7 +273,7 @@ void MCDwarfLineTable::emitOne(
   // does not track ranges nor terminate the line table. In that case,
   // conservatively use the section end symbol to end the line table.
   if (!EndEntryEmitted && !IsAtStartSeq)
-MCOS->emitDwarfLineEndEntry(Section, PrevLabel);
+MCOS->emitDwarfLineEndEntry(Section, LastLabel);
 }
 
 void MCDwarfLineTable::endCurrentSeqAndEmitLineStreamLabel(MCStreamer *MCOS,

diff  --git a/llvm/test/DebugInfo/ARM/stmt_seq_macho.test 
b/llvm/test/DebugInfo/ARM/stmt_seq_macho.test
deleted file mode 100644
index f0874bfc45ed2..0
--- a/llvm/test/DebugInfo/ARM/stmt_seq_macho.test
+++ /dev/null
@@ -1,98 +0,0 @@
-// RUN: split-file %s %t
-
-// RUN: clang++ --target=arm64-apple-macos11 \
-// RUN:   %t/stmt_seq_macho.cpp -o %t/stmt_seq_macho.o \
-// RUN:   -g -Oz -gdwarf-4 -c -mno-outline \
-// RUN:   -mllvm -emit-func-debug-line-table-offsets \
-// RUN:   -fdebug-compilation-dir=/private/tmp/stmt_seq \
-// RUN:   -fno-unwind-tables -fno-exceptions
-
-// RUN: llvm-dwarfdump -all %t/stmt_seq_macho.o | FileCheck %s
-
-// CHECK:  AddressLine   Column File   ISA 
Discriminator OpIndex Flags
-// CHECK-NEXT: -- -- -- -- --- 
- --- -
-// CHECK-NEXT: 0x  2 33  1  

[llvm-branch-commits] [NFC][CFI][CodeGen] Move GeneralizeFunctionType out of CreateMetadataIdentifierGeneralized (PR #158190)

2025-09-13 Thread Vitaly Buka via llvm-branch-commits

https://github.com/vitalybuka updated 
https://github.com/llvm/llvm-project/pull/158190




[llvm-branch-commits] [llvm] Mips: Switch to RegClassByHwMode (PR #158273)

2025-09-13 Thread Matt Arsenault via llvm-branch-commits

arsenm wrote:

> [!WARNING]
> This pull request is not mergeable via GitHub because a downstack PR is
> open. Once all requirements are satisfied, merge this PR as a stack on
> Graphite.
> Learn more: https://graphite.dev/docs/merge-pull-requests

* **#158273** 👈 (View in Graphite: https://app.graphite.dev/github/pr/llvm/llvm-project/158273)
* **#158269**: 2 other dependent PRs
  ([#158271](https://github.com/llvm/llvm-project/pull/158271),
  [#158272](https://github.com/llvm/llvm-project/pull/158272))
* `main`

This stack of pull requests is managed by Graphite (https://graphite.dev).
Learn more about stacking: https://stacking.dev/


https://github.com/llvm/llvm-project/pull/158273


[llvm-branch-commits] [llvm] X86: Switch to RegClassByHwMode (PR #158274)

2025-09-13 Thread Matt Arsenault via llvm-branch-commits

https://github.com/arsenm ready_for_review 
https://github.com/llvm/llvm-project/pull/158274


[llvm-branch-commits] [llvm] [Offload] Add GenericPluginTy::get_mem_info (PR #157484)

2025-09-13 Thread Ross Brunton via llvm-branch-commits

https://github.com/RossBrunton updated 
https://github.com/llvm/llvm-project/pull/157484

>From 7bf7fe1df8a873964df2ebc17328d9bef00f1347 Mon Sep 17 00:00:00 2001
From: Ross Brunton 
Date: Mon, 8 Sep 2025 10:45:42 +0100
Subject: [PATCH 1/6] [Offload] Add GenericPluginTy::get_mem_info

This takes a pointer allocated by the plugin, and returns a struct
containing important information about it. This is now used in
`olMemFree` instead of using a map to track allocation info.
---
 offload/include/omptarget.h   |   2 +
 offload/liboffload/src/OffloadImpl.cpp|  27 +--
 .../amdgpu/dynamic_hsa/hsa.cpp|   1 +
 .../amdgpu/dynamic_hsa/hsa_ext_amd.h  |   3 +
 offload/plugins-nextgen/amdgpu/src/rtl.cpp|  31 ++-
 .../common/include/PluginInterface.h  |  13 ++
 offload/plugins-nextgen/cuda/src/rtl.cpp  | 216 +++---
 offload/plugins-nextgen/host/src/rtl.cpp  |   5 +
 8 files changed, 193 insertions(+), 105 deletions(-)

diff --git a/offload/include/omptarget.h b/offload/include/omptarget.h
index 8fd722bb15022..197cbd3806d91 100644
--- a/offload/include/omptarget.h
+++ b/offload/include/omptarget.h
@@ -96,6 +96,8 @@ enum OpenMPOffloadingDeclareTargetFlags {
   OMP_REGISTER_REQUIRES = 0x10,
 };
 
+// Note: This type should be no larger than 3 bits, as the amdgpu platform uses
+// the lower 3 bits of a pointer to store it
 enum TargetAllocTy : int32_t {
   TARGET_ALLOC_DEVICE = 0,
   TARGET_ALLOC_HOST,
diff --git a/offload/liboffload/src/OffloadImpl.cpp 
b/offload/liboffload/src/OffloadImpl.cpp
index fef3a5669e0d5..9620c35ac5c10 100644
--- a/offload/liboffload/src/OffloadImpl.cpp
+++ b/offload/liboffload/src/OffloadImpl.cpp
@@ -201,8 +201,6 @@ struct OffloadContext {
 
   bool TracingEnabled = false;
   bool ValidationEnabled = true;
-  DenseMap AllocInfoMap{};
-  std::mutex AllocInfoMapMutex{};
   SmallVector Platforms{};
   size_t RefCount;
 
@@ -624,32 +622,15 @@ Error olMemAlloc_impl(ol_device_handle_t Device, 
ol_alloc_type_t Type,
 return Alloc.takeError();
 
   *AllocationOut = *Alloc;
-  {
-std::lock_guard Lock(OffloadContext::get().AllocInfoMapMutex);
-OffloadContext::get().AllocInfoMap.insert_or_assign(
-*Alloc, AllocInfo{Device, Type});
-  }
   return Error::success();
 }
 
 Error olMemFree_impl(ol_platform_handle_t Platform, void *Address) {
-  ol_device_handle_t Device;
-  ol_alloc_type_t Type;
-  {
-std::lock_guard Lock(OffloadContext::get().AllocInfoMapMutex);
-if (!OffloadContext::get().AllocInfoMap.contains(Address))
-  return createOffloadError(ErrorCode::INVALID_ARGUMENT,
-"address is not a known allocation");
-
-auto AllocInfo = OffloadContext::get().AllocInfoMap.at(Address);
-Device = AllocInfo.Device;
-Type = AllocInfo.Type;
-OffloadContext::get().AllocInfoMap.erase(Address);
-  }
-  assert(Platform == Device->Platform);
+  auto MemInfo = Platform->Plugin->get_memory_info(Address);
+  if (auto Err = MemInfo.takeError())
+return Err;
 
-  if (auto Res =
-  Device->Device->dataDelete(Address, convertOlToPluginAllocTy(Type)))
+  if (auto Res = MemInfo->Device->dataDelete(Address, MemInfo->Type))
 return Res;
 
   return Error::success();
diff --git a/offload/plugins-nextgen/amdgpu/dynamic_hsa/hsa.cpp 
b/offload/plugins-nextgen/amdgpu/dynamic_hsa/hsa.cpp
index bc92f4a46a5c0..7f0e75cb9b500 100644
--- a/offload/plugins-nextgen/amdgpu/dynamic_hsa/hsa.cpp
+++ b/offload/plugins-nextgen/amdgpu/dynamic_hsa/hsa.cpp
@@ -68,6 +68,7 @@ DLWRAP(hsa_amd_register_system_event_handler, 2)
 DLWRAP(hsa_amd_signal_create, 5)
 DLWRAP(hsa_amd_signal_async_handler, 5)
 DLWRAP(hsa_amd_pointer_info, 5)
+DLWRAP(hsa_amd_pointer_info_set_userdata, 2)
 DLWRAP(hsa_code_object_reader_create_from_memory, 3)
 DLWRAP(hsa_code_object_reader_destroy, 1)
 DLWRAP(hsa_executable_load_agent_code_object, 5)
diff --git a/offload/plugins-nextgen/amdgpu/dynamic_hsa/hsa_ext_amd.h 
b/offload/plugins-nextgen/amdgpu/dynamic_hsa/hsa_ext_amd.h
index 29cfe78082dbb..5c2fbd127c86d 100644
--- a/offload/plugins-nextgen/amdgpu/dynamic_hsa/hsa_ext_amd.h
+++ b/offload/plugins-nextgen/amdgpu/dynamic_hsa/hsa_ext_amd.h
@@ -160,6 +160,7 @@ typedef struct hsa_amd_pointer_info_s {
   void* agentBaseAddress;
   void* hostBaseAddress;
   size_t sizeInBytes;
+  void *userData;
 } hsa_amd_pointer_info_t;
 
 hsa_status_t hsa_amd_pointer_info(const void* ptr,
@@ -168,6 +169,8 @@ hsa_status_t hsa_amd_pointer_info(const void* ptr,
   uint32_t* num_agents_accessible,
   hsa_agent_t** accessible);
 
+hsa_status_t hsa_amd_pointer_info_set_userdata(const void *ptr, void 
*userdata);
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/offload/plugins-nextgen/amdgpu/src/rtl.cpp 
b/offload/plugins-nextgen/amdgpu/src/rtl.cpp
index c26cfe961aa0e..90d9ca9f787e7 100644
--- a/offload/plugins-nextgen/amdgpu/src/rtl.cpp
+++ b/offload/plugins-nex

[llvm-branch-commits] [llvm] [DA] Add overflow check in ExactSIV (PR #157086)

2025-09-13 Thread Ryotaro Kasuga via llvm-branch-commits

https://github.com/kasuga-fj updated 
https://github.com/llvm/llvm-project/pull/157086

>From 94b18495719b35a89ee6a18e474e8e92a4429d99 Mon Sep 17 00:00:00 2001
From: Ryotaro Kasuga 
Date: Fri, 5 Sep 2025 11:41:29 +
Subject: [PATCH] [DA] Add overflow check in ExactSIV

---
 llvm/lib/Analysis/DependenceAnalysis.cpp  | 14 +-
 llvm/test/Analysis/DependenceAnalysis/ExactSIV.ll |  2 +-
 2 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/llvm/lib/Analysis/DependenceAnalysis.cpp 
b/llvm/lib/Analysis/DependenceAnalysis.cpp
index 0f77a1410e83b..6e576e866b310 100644
--- a/llvm/lib/Analysis/DependenceAnalysis.cpp
+++ b/llvm/lib/Analysis/DependenceAnalysis.cpp
@@ -1170,6 +1170,15 @@ const SCEVConstant 
*DependenceInfo::collectConstantUpperBound(const Loop *L,
   return nullptr;
 }
 
+/// Returns \p A - \p B if it guaranteed not to signed wrap. Otherwise returns
+/// nullptr. \p A and \p B must have the same integer type.
+static const SCEV *minusSCEVNoSignedOverflow(const SCEV *A, const SCEV *B,
+ ScalarEvolution &SE) {
+  if (SE.willNotOverflow(Instruction::Sub, /*Signed=*/true, A, B))
+return SE.getMinusSCEV(A, B);
+  return nullptr;
+}
+
 // testZIV -
 // When we have a pair of subscripts of the form [c1] and [c2],
 // where c1 and c2 are both loop invariant, we attack it using
@@ -1626,7 +1635,9 @@ bool DependenceInfo::exactSIVtest(const SCEV *SrcCoeff, 
const SCEV *DstCoeff,
   assert(0 < Level && Level <= CommonLevels && "Level out of range");
   Level--;
   Result.Consistent = false;
-  const SCEV *Delta = SE->getMinusSCEV(DstConst, SrcConst);
+  const SCEV *Delta = minusSCEVNoSignedOverflow(DstConst, SrcConst, *SE);
+  if (!Delta)
+return false;
   LLVM_DEBUG(dbgs() << "\tDelta = " << *Delta << "\n");
   NewConstraint.setLine(SrcCoeff, SE->getNegativeSCEV(DstCoeff), Delta,
 CurLoop);
@@ -1716,6 +1727,7 @@ bool DependenceInfo::exactSIVtest(const SCEV *SrcCoeff, 
const SCEV *DstCoeff,
   // explore directions
   unsigned NewDirection = Dependence::DVEntry::NONE;
   APInt LowerDistance, UpperDistance;
+  // TODO: Overflow check may be needed.
   if (TA.sgt(TB)) {
 LowerDistance = (TY - TX) + (TA - TB) * TL;
 UpperDistance = (TY - TX) + (TA - TB) * TU;
diff --git a/llvm/test/Analysis/DependenceAnalysis/ExactSIV.ll 
b/llvm/test/Analysis/DependenceAnalysis/ExactSIV.ll
index 54bb8b73da02a..fd58568d02c43 100644
--- a/llvm/test/Analysis/DependenceAnalysis/ExactSIV.ll
+++ b/llvm/test/Analysis/DependenceAnalysis/ExactSIV.ll
@@ -841,7 +841,7 @@ define void @exact14(ptr %A) {
 ; CHECK-SIV-ONLY-NEXT:  Src: store i8 0, ptr %idx.0, align 1 --> Dst: store i8 
0, ptr %idx.0, align 1
 ; CHECK-SIV-ONLY-NEXT:da analyze - none!
 ; CHECK-SIV-ONLY-NEXT:  Src: store i8 0, ptr %idx.0, align 1 --> Dst: store i8 
1, ptr %idx.1, align 1
-; CHECK-SIV-ONLY-NEXT:da analyze - none!
+; CHECK-SIV-ONLY-NEXT:da analyze - output [*|<]!
 ; CHECK-SIV-ONLY-NEXT:  Src: store i8 1, ptr %idx.1, align 1 --> Dst: store i8 
1, ptr %idx.1, align 1
 ; CHECK-SIV-ONLY-NEXT:da analyze - none!
 ;
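
The `minusSCEVNoSignedOverflow` helper in the patch above returns `nullptr` unless the subtraction is known not to signed-wrap, and the caller then gives up on the exact test. The same "compute only if safe" idea on plain 64-bit integers might look like the sketch below; `checkedSub` is a hypothetical name, and it checks concrete values at run time, whereas the patch relies on a static query to ScalarEvolution.

```cpp
#include <cstdint>
#include <limits>
#include <optional>

// Returns A - B, or std::nullopt if the result does not fit in int64_t.
std::optional<int64_t> checkedSub(int64_t A, int64_t B) {
  constexpr int64_t Min = std::numeric_limits<int64_t>::min();
  constexpr int64_t Max = std::numeric_limits<int64_t>::max();
  // For B > 0 the result underflows when A < Min + B; for B < 0 it
  // overflows when A > Max + B. Neither bound computation can wrap.
  if ((B > 0 && A < Min + B) || (B < 0 && A > Max + B))
    return std::nullopt;
  return A - B;
}
```

A caller would treat an empty result conservatively, in the same way `exactSIVtest` returns `false` when `Delta` is null.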



[llvm-branch-commits] [llvm] SPARC: Use RegClassByHwMode instead of PointerLikeRegClass (PR #158271)

2025-09-13 Thread Matt Arsenault via llvm-branch-commits

https://github.com/arsenm ready_for_review 
https://github.com/llvm/llvm-project/pull/158271


[llvm-branch-commits] [Clang] Port ulimit tests to work with internal shell (PR #157977)

2025-09-13 Thread Aiden Grossman via llvm-branch-commits

https://github.com/boomanaiden154 updated 
https://github.com/llvm/llvm-project/pull/157977




[llvm-branch-commits] [llvm] [Offload][Conformance] Update olMemFree calls in conformance tests (PR #157773)

2025-09-13 Thread Ross Brunton via llvm-branch-commits

RossBrunton wrote:

@jhuber6 This was merged into my user branch, was that intentional?

https://github.com/llvm/llvm-project/pull/157773


[llvm-branch-commits] [mlir] [MLIR][Standalone] test Standalone against install distributions (PR #157944)

2025-09-13 Thread Renato Golin via llvm-branch-commits


@@ -65,6 +65,12 @@ if (MLIR_INCLUDE_INTEGRATION_TESTS)
 
 endif()
 
+option(MLIR_RUN_STANDALONE_INSTALL_TESTS "Run Standalone example install 
tests." ON)
+if(MLIR_RUN_STANDALONE_INSTALL_TESTS AND "${CMAKE_INSTALL_PREFIX}" STREQUAL "")
+  message(WARNING "Standalone example install tests will install into root!\

rengolin wrote:

D'oh, now I saw the line above! 🤣 

https://github.com/llvm/llvm-project/pull/157944


[llvm-branch-commits] [llvm] [llvm-profgen] Extend llvm-profgen to generate vtable profiles with data access events for non context-sensitive profiles using debug info (PR #148013)

2025-09-13 Thread Paschalis Mpeis via llvm-branch-commits

https://github.com/paschalis-mpeis approved this pull request.

Thanks for addressing the comments and adding a pie test. Looks good.

https://github.com/llvm/llvm-project/pull/148013


[llvm-branch-commits] [llvm] [AMDGPU] Generate canonical additions in AMDGPUPromoteAlloca (PR #157810)

2025-09-13 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-backend-amdgpu

Author: Fabian Ritter (ritter-x2a)


Changes

When we know that one operand of an addition is a constant, we might as
well put it on the right-hand side and avoid the work of canonicalizing it
in a later pass.
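
The reordering described above can be illustrated with a small standalone sketch. It does not use the LLVM API: `Operand`, `Add`, `makeCanonicalAdd`, and `toString` are hypothetical names invented for this illustration, and the only assumption is that the canonical form of a commutative add keeps a lone constant operand on the right-hand side.

```cpp
#include <cstdint>
#include <string>
#include <utility>

// Toy operand: either a named value or an integer constant.
// (Hypothetical type for illustration; this is not llvm::Value.)
struct Operand {
  bool IsConstant;
  std::string Name;  // used when !IsConstant
  int64_t Constant;  // used when IsConstant
};

// Toy binary add node holding its two operands in order.
struct Add {
  Operand LHS;
  Operand RHS;
};

// Build an add so that a lone constant operand ends up on the right.
// Addition is commutative, so swapping the operands keeps the result.
inline Add makeCanonicalAdd(Operand A, Operand B) {
  if (A.IsConstant && !B.IsConstant)
    std::swap(A, B);
  return Add{std::move(A), std::move(B)};
}

// Render the add in an IR-like textual form, e.g. "add %sel, 3".
inline std::string toString(const Add &A) {
  auto Print = [](const Operand &O) {
    return O.IsConstant ? std::to_string(O.Constant) : "%" + O.Name;
  };
  return "add " + Print(A.LHS) + ", " + Print(A.RHS);
}
```

With this helper, building the add from `(3, %sel)` or from `(%sel, 3)` renders the same text, `add %sel, 3`, which is the operand order the updated CHECK lines in the diff expect.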

---
Full diff: https://github.com/llvm/llvm-project/pull/157810.diff


4 Files Affected:

- (modified) llvm/lib/Target/AMDGPU/AMDGPUPromoteAlloca.cpp (+1-1) 
- (modified) llvm/test/CodeGen/AMDGPU/promote-alloca-multidim.ll (+4-4) 
- (modified) llvm/test/CodeGen/AMDGPU/promote-alloca-negative-index.ll (+2-2) 
- (modified) llvm/test/CodeGen/AMDGPU/promote-alloca-vector-gep-of-gep.ll (+3-3)


``diff
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUPromoteAlloca.cpp 
b/llvm/lib/Target/AMDGPU/AMDGPUPromoteAlloca.cpp
index bb77cdff778c0..7dbe1235a98b5 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUPromoteAlloca.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUPromoteAlloca.cpp
@@ -478,7 +478,7 @@ static Value *GEPToVectorIndex(GetElementPtrInst *GEP, 
AllocaInst *Alloca,
 
   ConstantInt *ConstIndex =
   ConstantInt::get(OffsetType, IndexQuot.getSExtValue());
-  Value *IndexAdd = Builder.CreateAdd(ConstIndex, Offset);
+  Value *IndexAdd = Builder.CreateAdd(Offset, ConstIndex);
   if (Instruction *NewInst = dyn_cast<Instruction>(IndexAdd))
 NewInsts.push_back(NewInst);
   return IndexAdd;
diff --git a/llvm/test/CodeGen/AMDGPU/promote-alloca-multidim.ll 
b/llvm/test/CodeGen/AMDGPU/promote-alloca-multidim.ll
index d72f158763c61..63622e67e7d0b 100644
--- a/llvm/test/CodeGen/AMDGPU/promote-alloca-multidim.ll
+++ b/llvm/test/CodeGen/AMDGPU/promote-alloca-multidim.ll
@@ -312,7 +312,7 @@ define amdgpu_kernel void 
@i64_2d_load_store_subvec_3_i64_offset_index(ptr %out)
 ; CHECK-NEXT:[[TMP15:%.*]] = insertelement <6 x i64> [[TMP14]], i64 4, i32 
4
 ; CHECK-NEXT:[[TMP16:%.*]] = insertelement <6 x i64> [[TMP15]], i64 5, i32 
5
 ; CHECK-NEXT:[[TMP1:%.*]] = mul i64 [[SEL3]], 3
-; CHECK-NEXT:[[TMP2:%.*]] = add i64 6, [[TMP1]]
+; CHECK-NEXT:[[TMP2:%.*]] = add i64 [[TMP1]], 6
 ; CHECK-NEXT:[[TMP3:%.*]] = extractelement <6 x i64> [[TMP16]], i64 
[[TMP2]]
 ; CHECK-NEXT:[[TMP4:%.*]] = insertelement <3 x i64> poison, i64 [[TMP3]], 
i64 0
 ; CHECK-NEXT:[[TMP5:%.*]] = add i64 [[TMP2]], 1
@@ -464,7 +464,7 @@ define amdgpu_kernel void @i16_2d_load_store(ptr %out, i32 
%sel) {
 ; CHECK-NEXT:[[TMP4:%.*]] = insertelement <6 x i16> [[TMP3]], i16 3, i32 3
 ; CHECK-NEXT:[[TMP5:%.*]] = insertelement <6 x i16> [[TMP4]], i16 4, i32 4
 ; CHECK-NEXT:[[TMP6:%.*]] = insertelement <6 x i16> [[TMP5]], i16 5, i32 5
-; CHECK-NEXT:[[TMP1:%.*]] = add i32 3, [[SEL]]
+; CHECK-NEXT:[[TMP1:%.*]] = add i32 [[SEL]], 3
 ; CHECK-NEXT:[[TMP2:%.*]] = extractelement <6 x i16> [[TMP6]], i32 [[TMP1]]
 ; CHECK-NEXT:store i16 [[TMP2]], ptr [[OUT]], align 2
 ; CHECK-NEXT:ret void
@@ -498,7 +498,7 @@ define amdgpu_kernel void @float_2d_load_store(ptr %out, 
i32 %sel) {
 ; CHECK-NEXT:[[TMP4:%.*]] = insertelement <6 x float> [[TMP3]], float 
3.00e+00, i32 3
 ; CHECK-NEXT:[[TMP5:%.*]] = insertelement <6 x float> [[TMP4]], float 
4.00e+00, i32 4
 ; CHECK-NEXT:[[TMP6:%.*]] = insertelement <6 x float> [[TMP5]], float 
5.00e+00, i32 5
-; CHECK-NEXT:[[TMP1:%.*]] = add i32 3, [[SEL]]
+; CHECK-NEXT:[[TMP1:%.*]] = add i32 [[SEL]], 3
 ; CHECK-NEXT:[[TMP2:%.*]] = extractelement <6 x float> [[TMP6]], i32 
[[TMP1]]
 ; CHECK-NEXT:store float [[TMP2]], ptr [[OUT]], align 4
 ; CHECK-NEXT:ret void
@@ -538,7 +538,7 @@ define amdgpu_kernel void @ptr_2d_load_store(ptr %out, i32 
%sel) {
 ; CHECK-NEXT:[[TMP4:%.*]] = insertelement <6 x ptr> [[TMP3]], ptr 
[[PTR_3]], i32 3
 ; CHECK-NEXT:[[TMP5:%.*]] = insertelement <6 x ptr> [[TMP4]], ptr 
[[PTR_4]], i32 4
 ; CHECK-NEXT:[[TMP6:%.*]] = insertelement <6 x ptr> [[TMP5]], ptr 
[[PTR_5]], i32 5
-; CHECK-NEXT:[[TMP7:%.*]] = add i32 3, [[SEL]]
+; CHECK-NEXT:[[TMP7:%.*]] = add i32 [[SEL]], 3
 ; CHECK-NEXT:[[TMP8:%.*]] = extractelement <6 x ptr> [[TMP6]], i32 [[TMP7]]
 ; CHECK-NEXT:store ptr [[TMP8]], ptr [[OUT]], align 8
 ; CHECK-NEXT:ret void
diff --git a/llvm/test/CodeGen/AMDGPU/promote-alloca-negative-index.ll 
b/llvm/test/CodeGen/AMDGPU/promote-alloca-negative-index.ll
index 1b6ac0bd93c19..a865bf5058d6a 100644
--- a/llvm/test/CodeGen/AMDGPU/promote-alloca-negative-index.ll
+++ b/llvm/test/CodeGen/AMDGPU/promote-alloca-negative-index.ll
@@ -11,7 +11,7 @@ define amdgpu_kernel void @negative_index_byte(ptr %out, i64 
%offset) {
 ; CHECK-NEXT:[[TMP2:%.*]] = insertelement <4 x i8> [[TMP1]], i8 1, i32 1
 ; CHECK-NEXT:[[TMP3:%.*]] = insertelement <4 x i8> [[TMP2]], i8 2, i32 2
 ; CHECK-NEXT:[[TMP4:%.*]] = insertelement <4 x i8> [[TMP3]], i8 3, i32 3
-; CHECK-NEXT:[[TMP5:%.*]] = add i64 -1, [[OFFSET:%.*]]
+; CHECK-NEXT:[[TMP5:%.*]] = add i64 [[OFFSET:%.*]], -1
 ; CHECK-NEXT:[[TMP6:%.*]] = extractelement <4 x i8> [[TMP4]], i64 [[TMP5]]
 ; CHECK-NEXT:   

[llvm-branch-commits] [llvm] CodeGen: Remove TRI arguments from stack load/store hooks (PR #158240)

2025-09-13 Thread Simon Pilgrim via llvm-branch-commits

https://github.com/RKSimon approved this pull request.

LGTM with the clang-format fix

https://github.com/llvm/llvm-project/pull/158240


[llvm-branch-commits] [mlir] [MLIR][Standalone] test Standalone against install distributions (PR #157944)

2025-09-13 Thread Christopher Bate via llvm-branch-commits


@@ -65,6 +65,12 @@ if (MLIR_INCLUDE_INTEGRATION_TESTS)
 
 endif()
 
+option(MLIR_RUN_STANDALONE_INSTALL_TESTS "Run Standalone example install 
tests." ON)
+if(MLIR_RUN_STANDALONE_INSTALL_TESTS AND "${CMAKE_INSTALL_PREFIX}" STREQUAL "")
+  message(WARNING "Standalone example install tests will install into root!\

christopherbate wrote:

Shouldn't any potential to write outside the build directory in a test be a 
FATAL_ERROR, not a WARNING?

https://github.com/llvm/llvm-project/pull/157944


[llvm-branch-commits] [clang] [clang][LoongArch] Introduce LASX and LSX conversion intrinsics (PR #157819)

2025-09-13 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-clang

Author: hev (heiher)


Changes

This patch introduces the LASX and LSX conversion intrinsics:

- __m256 __lasx_cast_128_s (__m128)
- __m256d __lasx_cast_128_d (__m128d)
- __m256i __lasx_cast_128 (__m128i)
- __m256 __lasx_concat_128_s (__m128, __m128)
- __m256d __lasx_concat_128_d (__m128d, __m128d)
- __m256i __lasx_concat_128 (__m128i, __m128i)
- __m128 __lasx_extract_128_lo_s (__m256)
- __m128d __lasx_extract_128_lo_d (__m256d)
- __m128i __lasx_extract_128_lo (__m256i)
- __m128 __lasx_extract_128_hi_s (__m256)
- __m128d __lasx_extract_128_hi_d (__m256d)
- __m128i __lasx_extract_128_hi (__m256i)
- __m256 __lasx_insert_128_lo_s (__m256, __m128)
- __m256d __lasx_insert_128_lo_d (__m256d, __m128d)
- __m256i __lasx_insert_128_lo (__m256i, __m128i)
- __m256 __lasx_insert_128_hi_s (__m256, __m128)
- __m256d __lasx_insert_128_hi_d (__m256d, __m128d)
- __m256i __lasx_insert_128_hi (__m256i, __m128i)
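
As a rough mental model of the list above, the following portable sketch mimics the four operation families on plain arrays. It is not the LoongArch implementation and does not use the real intrinsics: `V128`, `V256`, and all function names are hypothetical. Two points are assumptions rather than facts from the patch: that the first argument of a concat becomes the low half, and that a cast leaves the upper half unspecified (modelled here as zero).

```cpp
#include <array>
#include <cstddef>

using V128 = std::array<float, 4>; // stand-in for __m128
using V256 = std::array<float, 8>; // stand-in for __m256

// ASSUMPTION: the first argument supplies the low 128 bits.
inline V256 concat128(const V128 &Lo, const V128 &Hi) {
  V256 R{};
  for (std::size_t I = 0; I < 4; ++I) {
    R[I] = Lo[I];
    R[I + 4] = Hi[I];
  }
  return R;
}

// Return the low four lanes of a 256-bit value.
inline V128 extract128Lo(const V256 &V) {
  return V128{V[0], V[1], V[2], V[3]};
}

// Return the high four lanes of a 256-bit value.
inline V128 extract128Hi(const V256 &V) {
  return V128{V[4], V[5], V[6], V[7]};
}

// Replace the low four lanes and keep the high four.
inline V256 insert128Lo(V256 V, const V128 &Lo) {
  for (std::size_t I = 0; I < 4; ++I)
    V[I] = Lo[I];
  return V;
}

// Replace the high four lanes and keep the low four.
inline V256 insert128Hi(V256 V, const V128 &Hi) {
  for (std::size_t I = 0; I < 4; ++I)
    V[I + 4] = Hi[I];
  return V;
}

// ASSUMPTION: only the low half of a cast result is meaningful; the
// upper half is zeroed here purely to keep the model deterministic.
inline V256 cast128(const V128 &Lo) {
  return insert128Lo(V256{}, Lo);
}
```

Under these assumptions, extracting the low or high half of a concat returns the corresponding input, and an insert followed by the matching extract returns the inserted value.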

---

Patch is 25.73 KiB, truncated to 20.00 KiB below, full version: 
https://github.com/llvm/llvm-project/pull/157819.diff


4 Files Affected:

- (modified) clang/include/clang/Basic/BuiltinsLoongArchLASX.def (+19) 
- (modified) clang/lib/Headers/lasxintrin.h (+110) 
- (modified) clang/test/CodeGen/LoongArch/lasx/builtin-alias.c (+153) 
- (modified) clang/test/CodeGen/LoongArch/lasx/builtin.c (+157) 


``diff
diff --git a/clang/include/clang/Basic/BuiltinsLoongArchLASX.def 
b/clang/include/clang/Basic/BuiltinsLoongArchLASX.def
index c4ea46a3bc5b5..b234dedad648e 100644
--- a/clang/include/clang/Basic/BuiltinsLoongArchLASX.def
+++ b/clang/include/clang/Basic/BuiltinsLoongArchLASX.def
@@ -986,3 +986,22 @@ TARGET_BUILTIN(__builtin_lasx_xbnz_b, "iV32Uc", "nc", 
"lasx")
 TARGET_BUILTIN(__builtin_lasx_xbnz_h, "iV16Us", "nc", "lasx")
 TARGET_BUILTIN(__builtin_lasx_xbnz_w, "iV8Ui", "nc", "lasx")
 TARGET_BUILTIN(__builtin_lasx_xbnz_d, "iV4ULLi", "nc", "lasx")
+
+TARGET_BUILTIN(__builtin_lasx_cast_128_s, "V8fV4f", "nc", "lasx")
+TARGET_BUILTIN(__builtin_lasx_cast_128_d, "V4dV2d", "nc", "lasx")
+TARGET_BUILTIN(__builtin_lasx_cast_128, "V32ScV16Sc", "nc", "lasx")
+TARGET_BUILTIN(__builtin_lasx_concat_128_s, "V8fV4fV4f", "nc", "lasx")
+TARGET_BUILTIN(__builtin_lasx_concat_128_d, "V4dV2dV2d", "nc", "lasx")
+TARGET_BUILTIN(__builtin_lasx_concat_128, "V32ScV16ScV16Sc", "nc", "lasx")
+TARGET_BUILTIN(__builtin_lasx_extract_128_lo_s, "V4fV8f", "nc", "lasx")
+TARGET_BUILTIN(__builtin_lasx_extract_128_lo_d, "V2dV4d", "nc", "lasx")
+TARGET_BUILTIN(__builtin_lasx_extract_128_lo, "V16ScV32Sc", "nc", "lasx")
+TARGET_BUILTIN(__builtin_lasx_extract_128_hi_s, "V4fV8f", "nc", "lasx")
+TARGET_BUILTIN(__builtin_lasx_extract_128_hi_d, "V2dV4d", "nc", "lasx")
+TARGET_BUILTIN(__builtin_lasx_extract_128_hi, "V16ScV32Sc", "nc", "lasx")
+TARGET_BUILTIN(__builtin_lasx_insert_128_lo_s, "V8fV8fV4f", "nc", "lasx")
+TARGET_BUILTIN(__builtin_lasx_insert_128_lo_d, "V4dV4dV2d", "nc", "lasx")
+TARGET_BUILTIN(__builtin_lasx_insert_128_lo, "V32ScV32ScV16Sc", "nc", "lasx")
+TARGET_BUILTIN(__builtin_lasx_insert_128_hi_s, "V8fV8fV4f", "nc", "lasx")
+TARGET_BUILTIN(__builtin_lasx_insert_128_hi_d, "V4dV4dV2d", "nc", "lasx")
+TARGET_BUILTIN(__builtin_lasx_insert_128_hi, "V32ScV32ScV16Sc", "nc", "lasx")
diff --git a/clang/lib/Headers/lasxintrin.h b/clang/lib/Headers/lasxintrin.h
index 85020d82829e2..6dd8ac24ed46d 100644
--- a/clang/lib/Headers/lasxintrin.h
+++ b/clang/lib/Headers/lasxintrin.h
@@ -10,6 +10,8 @@
 #ifndef _LOONGSON_ASXINTRIN_H
 #define _LOONGSON_ASXINTRIN_H 1
 
+#include 
+
 #if defined(__loongarch_asx)
 
 typedef signed char v32i8 __attribute__((vector_size(32), aligned(32)));
@@ -3882,5 +3884,113 @@ extern __inline
 
 #define __lasx_xvrepli_w(/*si10*/ _1) ((__m256i)__builtin_lasx_xvrepli_w((_1)))
 
+extern __inline
+__attribute__((__gnu_inline__, __always_inline__, __artificial__)) __m256
+__lasx_cast_128_s(__m128 _1) {
+  return (__m256)__builtin_lasx_cast_128_s((v4f32)_1);
+}
+
+extern __inline
+__attribute__((__gnu_inline__, __always_inline__, __artificial__)) __m256d
+__lasx_cast_128_d(__m128d _1) {
+  return (__m256d)__builtin_lasx_cast_128_d((v2f64)_1);
+}
+
+extern __inline
+__attribute__((__gnu_inline__, __always_inline__, __artificial__)) __m256i
+__lasx_cast_128(__m128i _1) {
+  return (__m256i)__builtin_lasx_cast_128((v16i8)_1);
+}
+
+extern __inline
+__attribute__((__gnu_inline__, __always_inline__, __artificial__)) __m256
+__lasx_concat_128_s(__m128 _1, __m128 _2) {
+  return (__m256)__builtin_lasx_concat_128_s((v4f32)_1, (v4f32)_2);
+}
+
+extern __inline
+__attribute__((__gnu_inline__, __always_inline__, __artificial__)) __m256d
+__lasx_concat_128_d(__m128d _1, __m128d _2) {
+  return (__m256d)__builtin_lasx_concat_128_d((v2f64)_1, (v2f64)_2);
+}
+
+extern __inline
+__attribute__((__gnu_inline__, __always_inline__, __artificial__)) __m256i
+__lasx_concat_128(__m128i _1, __m128i _2) {
+  return (__m256i)__builtin_lasx_concat_128((v16i8)_1, (v1

[llvm-branch-commits] [llvm] Add deactivation symbol operand to ConstantPtrAuth. (PR #133537)

2025-09-13 Thread Peter Collingbourne via llvm-branch-commits

pcc wrote:

> I have checked in with @ahmedbougacha and his feeling is that this is fine as 
> it requires a bunch of work to opt in, and for places where the security is 
> important enough that we don't want people using this it's easy enough to 
> block.

Thanks for checking.

> I'm concerned about the interaction of these changes with ptrauth intrinsic 
> optimizations

I took a look and found some cases where we needed to inhibit optimizations. 
There was no practical effect due to how PFP uses these intrinsics, but I 
implemented the inhibitions in #133536 and this PR.

> the ability for attackers to gain control of the enablement flags.

This isn't possible: the symbols are resolved at static link time. See the RFC 
for more information: 
https://discourse.llvm.org/t/rfc-deactivation-symbols/85556

https://github.com/llvm/llvm-project/pull/133537


[llvm-branch-commits] [clang] [AMDGPU] Add builtins for wave reduction intrinsics (PR #150170)

2025-09-13 Thread via llvm-branch-commits

https://github.com/easyonaadit updated 
https://github.com/llvm/llvm-project/pull/150170

>From 308545da2b700e93d2c4b5e32c8392468385 Mon Sep 17 00:00:00 2001
From: Aaditya 
Date: Sat, 19 Jul 2025 12:57:27 +0530
Subject: [PATCH] Add builtins for wave reduction intrinsics

---
 clang/include/clang/Basic/BuiltinsAMDGPU.def |  25 ++
 clang/lib/CodeGen/TargetBuiltins/AMDGPU.cpp  |  58 +++
 clang/test/CodeGenOpenCL/builtins-amdgcn.cl  | 378 +++
 3 files changed, 461 insertions(+)

diff --git a/clang/include/clang/Basic/BuiltinsAMDGPU.def 
b/clang/include/clang/Basic/BuiltinsAMDGPU.def
index e5a1422fe8778..56b1a8dc09b15 100644
--- a/clang/include/clang/Basic/BuiltinsAMDGPU.def
+++ b/clang/include/clang/Basic/BuiltinsAMDGPU.def
@@ -364,6 +364,31 @@ BUILTIN(__builtin_amdgcn_endpgm, "v", "nr")
 BUILTIN(__builtin_amdgcn_get_fpenv, "WUi", "n")
 BUILTIN(__builtin_amdgcn_set_fpenv, "vWUi", "n")
 
+//===--===//
+
+// Wave Reduction builtins.
+
+//===--===//
+
+BUILTIN(__builtin_amdgcn_wave_reduce_add_u32, "ZUiZUiZi", "nc")
+BUILTIN(__builtin_amdgcn_wave_reduce_sub_u32, "ZUiZUiZi", "nc")
+BUILTIN(__builtin_amdgcn_wave_reduce_min_i32, "ZiZiZi", "nc")
+BUILTIN(__builtin_amdgcn_wave_reduce_min_u32, "ZUiZUiZi", "nc")
+BUILTIN(__builtin_amdgcn_wave_reduce_max_i32, "ZiZiZi", "nc")
+BUILTIN(__builtin_amdgcn_wave_reduce_max_u32, "ZUiZUiZi", "nc")
+BUILTIN(__builtin_amdgcn_wave_reduce_and_b32, "ZiZiZi", "nc")
+BUILTIN(__builtin_amdgcn_wave_reduce_or_b32, "ZiZiZi", "nc")
+BUILTIN(__builtin_amdgcn_wave_reduce_xor_b32, "ZiZiZi", "nc")
+BUILTIN(__builtin_amdgcn_wave_reduce_add_u64, "WUiWUiZi", "nc")
+BUILTIN(__builtin_amdgcn_wave_reduce_sub_u64, "WUiWUiZi", "nc")
+BUILTIN(__builtin_amdgcn_wave_reduce_min_i64, "WiWiZi", "nc")
+BUILTIN(__builtin_amdgcn_wave_reduce_min_u64, "WUiWUiZi", "nc")
+BUILTIN(__builtin_amdgcn_wave_reduce_max_i64, "WiWiZi", "nc")
+BUILTIN(__builtin_amdgcn_wave_reduce_max_u64, "WUiWUiZi", "nc")
+BUILTIN(__builtin_amdgcn_wave_reduce_and_b64, "WiWiZi", "nc")
+BUILTIN(__builtin_amdgcn_wave_reduce_or_b64, "WiWiZi", "nc")
+BUILTIN(__builtin_amdgcn_wave_reduce_xor_b64, "WiWiZi", "nc")
+
 
//===--===//
 // R600-NI only builtins.
 
//===--===//
diff --git a/clang/lib/CodeGen/TargetBuiltins/AMDGPU.cpp 
b/clang/lib/CodeGen/TargetBuiltins/AMDGPU.cpp
index 87a46287c4022..07cf08c54985a 100644
--- a/clang/lib/CodeGen/TargetBuiltins/AMDGPU.cpp
+++ b/clang/lib/CodeGen/TargetBuiltins/AMDGPU.cpp
@@ -295,11 +295,69 @@ void 
CodeGenFunction::AddAMDGPUFenceAddressSpaceMMRA(llvm::Instruction *Inst,
   Inst->setMetadata(LLVMContext::MD_mmra, MMRAMetadata::getMD(Ctx, MMRAs));
 }
 
+static Intrinsic::ID getIntrinsicIDforWaveReduction(unsigned BuiltinID) {
+  switch (BuiltinID) {
+  default:
+llvm_unreachable("Unknown BuiltinID for wave reduction");
+  case clang::AMDGPU::BI__builtin_amdgcn_wave_reduce_add_u32:
+  case clang::AMDGPU::BI__builtin_amdgcn_wave_reduce_add_u64:
+return Intrinsic::amdgcn_wave_reduce_add;
+  case clang::AMDGPU::BI__builtin_amdgcn_wave_reduce_sub_u32:
+  case clang::AMDGPU::BI__builtin_amdgcn_wave_reduce_sub_u64:
+return Intrinsic::amdgcn_wave_reduce_sub;
+  case clang::AMDGPU::BI__builtin_amdgcn_wave_reduce_min_i32:
+  case clang::AMDGPU::BI__builtin_amdgcn_wave_reduce_min_i64:
+return Intrinsic::amdgcn_wave_reduce_min;
+  case clang::AMDGPU::BI__builtin_amdgcn_wave_reduce_min_u32:
+  case clang::AMDGPU::BI__builtin_amdgcn_wave_reduce_min_u64:
+return Intrinsic::amdgcn_wave_reduce_umin;
+  case clang::AMDGPU::BI__builtin_amdgcn_wave_reduce_max_i32:
+  case clang::AMDGPU::BI__builtin_amdgcn_wave_reduce_max_i64:
+return Intrinsic::amdgcn_wave_reduce_max;
+  case clang::AMDGPU::BI__builtin_amdgcn_wave_reduce_max_u32:
+  case clang::AMDGPU::BI__builtin_amdgcn_wave_reduce_max_u64:
+return Intrinsic::amdgcn_wave_reduce_umax;
+  case clang::AMDGPU::BI__builtin_amdgcn_wave_reduce_and_b32:
+  case clang::AMDGPU::BI__builtin_amdgcn_wave_reduce_and_b64:
+return Intrinsic::amdgcn_wave_reduce_and;
+  case clang::AMDGPU::BI__builtin_amdgcn_wave_reduce_or_b32:
+  case clang::AMDGPU::BI__builtin_amdgcn_wave_reduce_or_b64:
+return Intrinsic::amdgcn_wave_reduce_or;
+  case clang::AMDGPU::BI__builtin_amdgcn_wave_reduce_xor_b32:
+  case clang::AMDGPU::BI__builtin_amdgcn_wave_reduce_xor_b64:
+return Intrinsic::amdgcn_wave_reduce_xor;
+  }
+}
+
 Value *CodeGenFunction::EmitAMDGPUBuiltinExpr(unsigned BuiltinID,
   const CallExpr *E) {
   llvm::AtomicOrdering AO = llvm::AtomicOrdering::SequentiallyConsistent;
   llvm::SyncScope::ID SSID;
   switch (BuiltinID) {
+  case AMDGPU::BI__builtin_amdgcn_wave_reduce_add_u32:
+  case AMDGPU::BI__builtin_amdgcn_wave_reduce_sub_u

[llvm-branch-commits] [llvm] AMDGPU/GlobalISel: Import D16 load patterns and add combines for them (PR #153178)

2025-09-13 Thread Petar Avramovic via llvm-branch-commits

https://github.com/petar-avramovic updated 
https://github.com/llvm/llvm-project/pull/153178

>From 739caaa21a514cc89c57deae344bf563c9563e90 Mon Sep 17 00:00:00 2001
From: Petar Avramovic 
Date: Tue, 26 Aug 2025 14:10:41 +0200
Subject: [PATCH] AMDGPU/GlobalISel: Import D16 load patterns and add combines
 for them

Add G_AMDGPU_LOAD_D16 generic instructions and GINodeEquivs for them;
this imports the D16 load patterns into GlobalISel's TableGen-generated
instruction selector.
For the newly imported patterns to work, add combines for G_AMDGPU_LOAD_D16
in AMDGPURegBankCombiner.
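
For readers unfamiliar with D16 loads, here is a minimal standalone sketch of the bit manipulation the combine looks for, assuming the behaviour implied by `d16PreservesUnusedBits()`: a 16-bit load writes one half of a 32-bit register and leaves the other half untouched. The mask names and helper functions below are hypothetical and do not reproduce the constants in the patch; the "hi" variant is included by analogy and is an assumption.

```cpp
#include <cstdint>

// A 32-bit register viewed as two 16-bit halves (illustrative names).
constexpr uint32_t Lo16Mask = 0x0000FFFFu;
constexpr uint32_t Hi16Mask = 0xFFFF0000u;

// "lo" load: put 16 loaded bits in the low half, keep the high half.
constexpr uint32_t loadD16Lo(uint32_t Old, uint16_t Loaded) {
  return (Old & Hi16Mask) | static_cast<uint32_t>(Loaded);
}

// "hi" load (assumed by analogy): put 16 loaded bits in the high half.
constexpr uint32_t loadD16Hi(uint32_t Old, uint16_t Loaded) {
  return (Old & Lo16Mask) | (static_cast<uint32_t>(Loaded) << 16);
}

// "lo" load of an unsigned byte: zero-extend to 16 bits first.
constexpr uint32_t loadD16LoU8(uint32_t Old, uint8_t Loaded) {
  return (Old & Hi16Mask) | static_cast<uint32_t>(Loaded);
}

// "lo" load of a signed byte: sign-extend, then keep only 16 bits so
// the preserved high half is not overwritten by the sign bits.
constexpr uint32_t loadD16LoI8(uint32_t Old, int8_t Loaded) {
  return (Old & Hi16Mask) |
         (static_cast<uint32_t>(static_cast<int32_t>(Loaded)) & Lo16Mask);
}
```

The OR-of-AND shape of these helpers corresponds to the `m_GOr(m_GAnd(...), ...)` pattern matched in `combineD16Load`, and the extra mask in the signed-byte case corresponds to the `G_SEXTLOAD` branch there.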
---
 llvm/lib/Target/AMDGPU/AMDGPUCombine.td   |   9 +-
 llvm/lib/Target/AMDGPU/AMDGPUGISel.td |   7 +
 .../Target/AMDGPU/AMDGPURegBankCombiner.cpp   |  86 
 llvm/lib/Target/AMDGPU/SIInstructions.td  |  15 +
 .../AMDGPU/GlobalISel/atomic_load_flat.ll |  15 +-
 .../AMDGPU/GlobalISel/atomic_load_global.ll   |  15 +-
 .../AMDGPU/GlobalISel/atomic_load_local_2.ll  |  13 +-
 .../CodeGen/AMDGPU/GlobalISel/load-d16.ll | 412 ++
 llvm/test/CodeGen/AMDGPU/global-saddr-load.ll | 246 +++
 9 files changed, 622 insertions(+), 196 deletions(-)
 create mode 100644 llvm/test/CodeGen/AMDGPU/GlobalISel/load-d16.ll

diff --git a/llvm/lib/Target/AMDGPU/AMDGPUCombine.td 
b/llvm/lib/Target/AMDGPU/AMDGPUCombine.td
index b5dac95b57a2d..e8b211f7866ad 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUCombine.td
+++ b/llvm/lib/Target/AMDGPU/AMDGPUCombine.td
@@ -71,6 +71,12 @@ def int_minmax_to_med3 : GICombineRule<
  [{ return matchIntMinMaxToMed3(*${min_or_max}, ${matchinfo}); }]),
   (apply [{ applyMed3(*${min_or_max}, ${matchinfo}); }])>;
 
+let Predicates = [Predicate<"Subtarget->d16PreservesUnusedBits()">] in
+def d16_load : GICombineRule<
+  (defs root:$bitcast),
+  (combine (G_BITCAST $dst, $src):$bitcast,
+   [{ return combineD16Load(*${bitcast} ); }])>;
+
 def fp_minmax_to_med3 : GICombineRule<
   (defs root:$min_or_max, med3_matchdata:$matchinfo),
   (match (wip_match_opcode G_FMAXNUM,
@@ -219,5 +225,6 @@ def AMDGPURegBankCombiner : GICombiner<
zext_trunc_fold, int_minmax_to_med3, ptr_add_immed_chain,
fp_minmax_to_clamp, fp_minmax_to_med3, fmed3_intrinsic_to_clamp,
identity_combines, redundant_and, constant_fold_cast_op,
-   cast_of_cast_combines, sext_trunc, zext_of_shift_amount_combines]> {
+   cast_of_cast_combines, sext_trunc, zext_of_shift_amount_combines,
+   d16_load]> {
 }
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUGISel.td 
b/llvm/lib/Target/AMDGPU/AMDGPUGISel.td
index 0c112d1787c1a..bb4bf742fb861 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUGISel.td
+++ b/llvm/lib/Target/AMDGPU/AMDGPUGISel.td
@@ -315,6 +315,13 @@ def : GINodeEquiv;
 def : GINodeEquiv;
 def : GINodeEquiv;
 
+def : GINodeEquiv;
+def : GINodeEquiv;
+def : GINodeEquiv;
+def : GINodeEquiv;
+def : GINodeEquiv;
+def : GINodeEquiv;
+
 def : GINodeEquiv;
 // G_AMDGPU_WHOLE_WAVE_FUNC_RETURN is simpler than AMDGPUwhole_wave_return,
 // so we don't mark it as equivalent.
diff --git a/llvm/lib/Target/AMDGPU/AMDGPURegBankCombiner.cpp 
b/llvm/lib/Target/AMDGPU/AMDGPURegBankCombiner.cpp
index ee324a5e93f0f..fd604e1b19cd4 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPURegBankCombiner.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPURegBankCombiner.cpp
@@ -89,6 +89,10 @@ class AMDGPURegBankCombinerImpl : public Combiner {
 
   void applyCanonicalizeZextShiftAmt(MachineInstr &MI, MachineInstr &Ext) 
const;
 
+  bool combineD16Load(MachineInstr &MI) const;
+  bool applyD16Load(unsigned D16Opc, MachineInstr &DstMI,
+MachineInstr *SmallLoad, Register ToOverwriteD16) const;
+
 private:
   SIModeRegisterDefaults getMode() const;
   bool getIEEE() const;
@@ -392,6 +396,88 @@ void 
AMDGPURegBankCombinerImpl::applyCanonicalizeZextShiftAmt(
   MI.eraseFromParent();
 }
 
+bool AMDGPURegBankCombinerImpl::combineD16Load(MachineInstr &MI) const {
+  Register Dst;
+  MachineInstr *Load, *SextLoad;
+  const int64_t CleanLo16 = 0x;
+  const int64_t CleanHi16 = 0x;
+
+  // Load lo
+  if (mi_match(MI.getOperand(1).getReg(), MRI,
+   m_GOr(m_GAnd(m_GBitcast(m_Reg(Dst)),
+m_Copy(m_SpecificICst(CleanLo16))),
+ m_MInstr(Load)))) {
+
+if (Load->getOpcode() == AMDGPU::G_ZEXTLOAD) {
+  const MachineMemOperand *MMO = *Load->memoperands_begin();
+  unsigned LoadSize = MMO->getSizeInBits().getValue();
+  if (LoadSize == 8)
+return applyD16Load(AMDGPU::G_AMDGPU_LOAD_D16_LO_U8, MI, Load, Dst);
+  if (LoadSize == 16)
+return applyD16Load(AMDGPU::G_AMDGPU_LOAD_D16_LO, MI, Load, Dst);
+  return false;
+}
+
+if (mi_match(
+Load, MRI,
+m_GAnd(m_MInstr(SextLoad), m_Copy(m_SpecificICst(CleanHi16))))) {
+  if (SextLoad->getOpcode() != AMDGPU::G_SEXTLOAD)
+return false;
+
+  const MachineMemOperand *MMO = *SextLoad->memoperands_begin();
+  if (MMO->getSizeInBits().getValue() != 8)
+retu

[llvm-branch-commits] [lld] CodeGen: Emit .prefalign directives based on the prefalign attribute. (PR #155529)

2025-09-13 Thread Peter Collingbourne via llvm-branch-commits

https://github.com/pcc updated https://github.com/llvm/llvm-project/pull/155529

>From 38615b9b39e93afab94c6aaa3ae6c026b7f2086a Mon Sep 17 00:00:00 2001
From: Peter Collingbourne 
Date: Tue, 26 Aug 2025 19:19:33 -0700
Subject: [PATCH] Fix failing lld test

Created using spr 1.3.6-beta.1
---
 lld/test/ELF/lto/linker-script-symbols-ipo.ll | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lld/test/ELF/lto/linker-script-symbols-ipo.ll 
b/lld/test/ELF/lto/linker-script-symbols-ipo.ll
index 414ee4080bee0..39996cbfa28db 100644
--- a/lld/test/ELF/lto/linker-script-symbols-ipo.ll
+++ b/lld/test/ELF/lto/linker-script-symbols-ipo.ll
@@ -18,7 +18,7 @@
 ; NOIPO:  :
 ; NOIPO-NEXT:   movl $2, %eax
 ; NOIPO:  <_start>:
-; NOIPO-NEXT:   jmp 0x201160 
+; NOIPO-NEXT:   jmp 0x201158 
 
 target datalayout = 
"e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
 target triple = "x86_64-unknown-linux-gnu"



[llvm-branch-commits] [llvm] release/21.x: [RISCV] Support PreserveMost calling convention (#148214) (PR #158403)

2025-09-13 Thread Nikita Popov via llvm-branch-commits

https://github.com/nikic closed https://github.com/llvm/llvm-project/pull/158403


[llvm-branch-commits] [llvm] release/21.x: [RISCV] Support PreserveMost calling convention (#148214) (PR #158403)

2025-09-13 Thread Nikita Popov via llvm-branch-commits

nikic wrote:

Duplicate of https://github.com/llvm/llvm-project/pull/158402.

https://github.com/llvm/llvm-project/pull/158403


[llvm-branch-commits] [llvm] [AMDGPU][SDAG] Enable ISD::PTRADD for 64-bit AS by default (PR #146076)

2025-09-13 Thread Fabian Ritter via llvm-branch-commits

https://github.com/ritter-x2a updated 
https://github.com/llvm/llvm-project/pull/146076

>From 8710de705f09d90f166f82c1733620b2c8581306 Mon Sep 17 00:00:00 2001
From: Fabian Ritter 
Date: Fri, 27 Jun 2025 05:38:52 -0400
Subject: [PATCH 1/3] [AMDGPU][SDAG] Enable ISD::PTRADD for 64-bit AS by
 default

Also removes the command line option to control this feature.

There seem to be mainly two kinds of test changes:
- Some operands of addition instructions are swapped; that is to be expected
  since PTRADD is not commutative.
- Improvements in code generation, probably because the legacy lowering enabled
  some transformations that were sometimes harmful.

For SWDEV-516125.
---
 llvm/lib/Target/AMDGPU/SIISelLowering.cpp |  10 +-
 .../identical-subrange-spill-infloop.ll   | 352 +++---
 .../AMDGPU/infer-addrspace-flat-atomic.ll |  14 +-
 llvm/test/CodeGen/AMDGPU/lds-frame-extern.ll  |   8 +-
 .../AMDGPU/lower-module-lds-via-hybrid.ll |   4 +-
 .../AMDGPU/lower-module-lds-via-table.ll  |  16 +-
 .../match-perm-extract-vector-elt-bug.ll  |  22 +-
 llvm/test/CodeGen/AMDGPU/memmove-var-size.ll  |  16 +-
 .../AMDGPU/preload-implicit-kernargs.ll   |   6 +-
 .../AMDGPU/promote-constOffset-to-imm.ll  |   8 +-
 llvm/test/CodeGen/AMDGPU/ptradd-sdag-mubuf.ll |   7 +-
 .../AMDGPU/ptradd-sdag-optimizations.ll   |  94 ++---
 .../AMDGPU/ptradd-sdag-undef-poison.ll|   6 +-
 llvm/test/CodeGen/AMDGPU/ptradd-sdag.ll   |  27 +-
 llvm/test/CodeGen/AMDGPU/store-weird-sizes.ll |  29 +-
 15 files changed, 310 insertions(+), 309 deletions(-)

diff --git a/llvm/lib/Target/AMDGPU/SIISelLowering.cpp 
b/llvm/lib/Target/AMDGPU/SIISelLowering.cpp
index a1af50dac7e54..05ab745171f6d 100644
--- a/llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+++ b/llvm/lib/Target/AMDGPU/SIISelLowering.cpp
@@ -63,14 +63,6 @@ static cl::opt<bool> UseDivergentRegisterIndexing(
 cl::desc("Use indirect register addressing for divergent indexes"),
 cl::init(false));
 
-// TODO: This option should be removed once we switch to always using PTRADD in
-// the SelectionDAG.
-static cl::opt<bool> UseSelectionDAGPTRADD(
-"amdgpu-use-sdag-ptradd", cl::Hidden,
-cl::desc("Generate ISD::PTRADD nodes for 64-bit pointer arithmetic in the "
- "SelectionDAG ISel"),
-cl::init(false));
-
 static bool denormalModeIsFlushAllF32(const MachineFunction &MF) {
   const SIMachineFunctionInfo *Info = MF.getInfo<SIMachineFunctionInfo>();
   return Info->getMode().FP32Denormals == DenormalMode::getPreserveSign();
@@ -11252,7 +11244,7 @@ static bool isNoUnsignedWrap(SDValue Addr) {
 
 bool SITargetLowering::shouldPreservePtrArith(const Function &F,
   EVT PtrVT) const {
-  return UseSelectionDAGPTRADD && PtrVT == MVT::i64;
+  return PtrVT == MVT::i64;
 }
 
 bool SITargetLowering::canTransformPtrArithOutOfBounds(const Function &F,
diff --git a/llvm/test/CodeGen/AMDGPU/identical-subrange-spill-infloop.ll 
b/llvm/test/CodeGen/AMDGPU/identical-subrange-spill-infloop.ll
index 2c03113e8af47..805cdd37d6e70 100644
--- a/llvm/test/CodeGen/AMDGPU/identical-subrange-spill-infloop.ll
+++ b/llvm/test/CodeGen/AMDGPU/identical-subrange-spill-infloop.ll
@@ -6,96 +6,150 @@ define void @main(i1 %arg) #0 {
 ; CHECK:   ; %bb.0: ; %bb
 ; CHECK-NEXT:s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
 ; CHECK-NEXT:s_xor_saveexec_b64 s[4:5], -1
-; CHECK-NEXT:buffer_store_dword v5, off, s[0:3], s32 ; 4-byte Folded Spill
-; CHECK-NEXT:buffer_store_dword v6, off, s[0:3], s32 offset:4 ; 4-byte 
Folded Spill
+; CHECK-NEXT:buffer_store_dword v6, off, s[0:3], s32 ; 4-byte Folded Spill
+; CHECK-NEXT:buffer_store_dword v7, off, s[0:3], s32 offset:4 ; 4-byte 
Folded Spill
 ; CHECK-NEXT:s_mov_b64 exec, s[4:5]
-; CHECK-NEXT:v_writelane_b32 v5, s30, 0
-; CHECK-NEXT:v_writelane_b32 v5, s31, 1
-; CHECK-NEXT:v_writelane_b32 v5, s36, 2
-; CHECK-NEXT:v_writelane_b32 v5, s37, 3
-; CHECK-NEXT:v_writelane_b32 v5, s38, 4
-; CHECK-NEXT:v_writelane_b32 v5, s39, 5
-; CHECK-NEXT:v_writelane_b32 v5, s48, 6
-; CHECK-NEXT:v_writelane_b32 v5, s49, 7
-; CHECK-NEXT:v_writelane_b32 v5, s50, 8
-; CHECK-NEXT:v_writelane_b32 v5, s51, 9
-; CHECK-NEXT:v_writelane_b32 v5, s52, 10
-; CHECK-NEXT:v_writelane_b32 v5, s53, 11
-; CHECK-NEXT:v_writelane_b32 v5, s54, 12
-; CHECK-NEXT:v_writelane_b32 v5, s55, 13
-; CHECK-NEXT:s_getpc_b64 s[24:25]
-; CHECK-NEXT:v_writelane_b32 v5, s64, 14
-; CHECK-NEXT:s_movk_i32 s4, 0xf0
-; CHECK-NEXT:s_mov_b32 s5, s24
-; CHECK-NEXT:v_writelane_b32 v5, s65, 15
-; CHECK-NEXT:s_load_dwordx16 s[8:23], s[4:5], 0x0
-; CHECK-NEXT:s_mov_b64 s[4:5], 0
-; CHECK-NEXT:v_writelane_b32 v5, s66, 16
-; CHECK-NEXT:s_load_dwordx4 s[4:7], s[4:5], 0x0
-; CHECK-NEXT:v_writelane_b32 v5, s67, 17
-; CHECK-NEXT:s_waitcnt lgkmcnt(0)
-; CHECK-NEXT:s_movk_i32 s6, 0x130
-; CHECK-NEXT:s_mov_b32 s7, s24
-; CHECK-NEXT:v_writelane_b32 v5

[llvm-branch-commits] [mlir] [MLIR][Standalone] test Standalone against install distributions (PR #157944)

2025-09-13 Thread Maksim Levental via llvm-branch-commits

makslevental wrote:

> The fact that Subprocess 1 CMake and Parent CMake have identical build 
> directories seems particularly problematic.

My assumption was (and I realize that it's flawed now) that process 1 isn't 
actually building anything, just "installing" artifacts. But of course we've 
all seen things get built when doing `ninja install` even after doing a full 
build.

> Sounds like we're aligned that this Standalone/LIT location isn't the right 
> place, @makslevental ?

I mean I don't care how the thing is tested right - I was just going for what I 
thought was the least controversial approach. If there's an even less 
controversial approach I'm happy to do that instead!

BTW I guess this is what @boomanaiden154 was talking about

https://github.com/llvm/llvm-project/blob/e236a52a88956968f318fb908c584e5cb80b5b03/libcxx/test/CMakeLists.txt#L40-L58

which I can try out if we _do_ want it to be a lit test (but again happy not to 
have to do that).



https://github.com/llvm/llvm-project/pull/157944


[llvm-branch-commits] [Remarks] BitstreamRemarkParser: Refactor error handling (PR #156511)

2025-09-13 Thread Jon Roelofs via llvm-branch-commits

jroelofs wrote:

>  We could delete bytes from valid files to trigger the errors and commit them 
> as binary blobs.

Committing more binary blobs is not a great idea given the `xz` debacle. It 
would be better to have them set up as tests that serialize something from yaml 
to bitstream, and then corrupt some bytes and check that the error triggers.
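
A minimal sketch of that test shape, using a made-up container format instead of the real remark bitstream: serialize in memory, flip or drop bytes, and check that the parser reports an error. The format (four magic bytes, a one-byte length, then the payload) and the names `serialize`, `parse`, and `ParseResult` are hypothetical and only illustrate the pattern.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>
#include <vector>

// Hypothetical magic bytes spelling "RMRK" in ASCII.
static const uint8_t Magic[4] = {0x52, 0x4D, 0x52, 0x4B};

// Serialize a payload (shorter than 256 bytes) into the toy format:
// magic, one length byte, then the payload bytes.
inline std::vector<uint8_t> serialize(const std::string &Payload) {
  std::vector<uint8_t> Out(Magic, Magic + 4);
  Out.push_back(static_cast<uint8_t>(Payload.size()));
  Out.insert(Out.end(), Payload.begin(), Payload.end());
  return Out;
}

struct ParseResult {
  std::optional<std::string> Payload; // set on success
  std::string Error;                  // non-empty on failure
};

// Parse the toy format and report a distinct error for each way the
// buffer can be malformed.
inline ParseResult parse(const std::vector<uint8_t> &Buf) {
  if (Buf.size() < 5)
    return {std::nullopt, "truncated header"};
  for (std::size_t I = 0; I < 4; ++I)
    if (Buf[I] != Magic[I])
      return {std::nullopt, "bad magic"};
  std::size_t Len = Buf[4];
  if (Buf.size() != 5 + Len)
    return {std::nullopt, "payload length mismatch"};
  return {std::string(Buf.begin() + 5, Buf.end()), ""};
}
```

A test would then copy the serialized buffer, corrupt one byte (for example `Buf[0] ^= 0xFF`) or drop the last byte, and assert on the resulting `Error` string. Nothing binary has to be committed, because the corrupted input is produced at test time.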

https://github.com/llvm/llvm-project/pull/156511


[llvm-branch-commits] [Remarks] BitstreamRemarkParser: Refactor error handling (PR #156511)

2025-09-13 Thread Jon Roelofs via llvm-branch-commits


@@ -13,81 +13,171 @@
 #ifndef LLVM_LIB_REMARKS_BITSTREAM_REMARK_PARSER_H
 #define LLVM_LIB_REMARKS_BITSTREAM_REMARK_PARSER_H
 
-#include "llvm/ADT/ArrayRef.h"
 #include "llvm/ADT/StringRef.h"
 #include "llvm/Bitstream/BitstreamReader.h"
 #include "llvm/Remarks/BitstreamRemarkContainer.h"
+#include "llvm/Remarks/Remark.h"
 #include "llvm/Remarks/RemarkFormat.h"
 #include "llvm/Remarks/RemarkParser.h"
+#include "llvm/Remarks/RemarkStringTable.h"
 #include "llvm/Support/Error.h"
-#include 
+#include "llvm/Support/FormatVariadic.h"
 #include 
 #include 
 #include 
 
 namespace llvm {
 namespace remarks {
 
-struct Remark;
+class BitstreamBlockParserHelperBase {
+protected:
+  BitstreamCursor &Stream;
+
+  unsigned BlockID;
+  StringRef BlockName;

jroelofs wrote:

My intuition says this will have better struct layout, though I haven't checked:
```suggestion
  StringRef BlockName;
  unsigned BlockID;
```
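
One way to check that intuition, as a sketch with stand-in types: `Cursor` and `NameRef` are hypothetical substitutes for `BitstreamCursor` and `StringRef` (a pointer plus a length), and the `Flag` member is invented to show what happens when another small field follows. The rest of the real class is not visible in this hunk, so whether the reorder actually helps there is not something this sketch can settle.

```cpp
#include <cstddef>

// Hypothetical stand-ins for the real types.
struct Cursor {};
struct NameRef {
  const char *Data;
  std::size_t Length;
};

// Member order as in the patch under review.
struct IdBeforeName {
  Cursor &Stream;
  unsigned BlockID;
  NameRef BlockName;
};

// Member order from the suggestion.
struct NameBeforeId {
  Cursor &Stream;
  NameRef BlockName;
  unsigned BlockID;
};

// The same two orders with one more small member at the end (invented here).
struct IdBeforeNameThenFlag {
  Cursor &Stream;
  unsigned BlockID;
  NameRef BlockName;
  unsigned Flag;
};
struct NameBeforeIdThenFlag {
  Cursor &Stream;
  NameRef BlockName;
  unsigned BlockID;
  unsigned Flag;
};

// True when swapping the two members alone changes the object size.
constexpr bool reorderAloneChangesSize() {
  return sizeof(IdBeforeName) != sizeof(NameBeforeId);
}

// True when the suggested order is smaller once a second small member follows.
constexpr bool reorderHelpsWithTrailingMember() {
  return sizeof(NameBeforeIdThenFlag) < sizeof(IdBeforeNameThenFlag);
}
```

On a typical 64-bit ABI I would expect the first pair to come out the same size, because the padding just moves from the middle to the tail, and the second pair to differ, because the two `unsigned` members can share one 8-byte slot. That is an expectation about common ABIs, not a guarantee.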

https://github.com/llvm/llvm-project/pull/156511


[llvm-branch-commits] [llvm] release/21.x: [RISCV] Support PreserveMost calling convention (#148214) (PR #158403)

2025-09-13 Thread via llvm-branch-commits

https://github.com/llvmbot milestoned 
https://github.com/llvm/llvm-project/pull/158403


[llvm-branch-commits] [llvm] [AMDGPU][Attributor] Add `AAAMDGPUClusterDims` (PR #158076)

2025-09-13 Thread Shilei Tian via llvm-branch-commits


@@ -1296,6 +1303,157 @@ struct AAAMDGPUNoAGPR
 
 const char AAAMDGPUNoAGPR::ID = 0;
 
+/// An abstract attribute to propagate the function attribute
+/// "amdgpu-cluster-dims" from kernel entry functions to device functions.
+struct AAAMDGPUClusterDims
+: public StateWrapper {
+  using Base = StateWrapper;
+  AAAMDGPUClusterDims(const IRPosition &IRP, Attributor &A) : Base(IRP) {}
+
+  /// Create an abstract attribute view for the position \p IRP.
+  static AAAMDGPUClusterDims &createForPosition(const IRPosition &IRP,
+Attributor &A);
+
+  /// See AbstractAttribute::getName().
+  StringRef getName() const override { return "AAAMDGPUClusterDims"; }
+
+  /// See AbstractAttribute::getIdAddr().
+  const char *getIdAddr() const override { return &ID; }
+
+  /// This function should return true if the type of the \p AA is
+  /// AAAMDGPUClusterDims.
+  static bool classof(const AbstractAttribute *AA) {
+return (AA->getIdAddr() == &ID);
+  }
+
+  virtual const AMDGPU::ClusterDimsAttr &getClusterDims() const = 0;
+
+  /// Unique ID (due to the unique address)
+  static const char ID;
+};
+
+const char AAAMDGPUClusterDims::ID = 0;
+
+struct AAAMDGPUClusterDimsFunction : public AAAMDGPUClusterDims {
+  AAAMDGPUClusterDimsFunction(const IRPosition &IRP, Attributor &A)
+  : AAAMDGPUClusterDims(IRP, A) {}
+
+  void initialize(Attributor &A) override {
+Function *F = getAssociatedFunction();
+assert(F && "empty associated function");
+
+Attr = AMDGPU::ClusterDimsAttr::get(*F);
+
+// No matter what a kernel function has, it is final.
+if (AMDGPU::isEntryFunctionCC(F->getCallingConv())) {
+  if (Attr.isUnknown())
+indicatePessimisticFixpoint();
+  else
+indicateOptimisticFixpoint();
+}
+  }
+
+  const std::string getAsStr(Attributor *A) const override {
+if (!getAssumed() || Attr.isUnknown())
+  return "unknown";
+if (Attr.isNoCluster())
+  return "no";
+if (Attr.isVariableedDims())

shiltian wrote:

oh that's bad. will do.

https://github.com/llvm/llvm-project/pull/158076


[llvm-branch-commits] [llvm] SPARC: Use RegClassByHwMode instead of PointerLikeRegClass (PR #158271)

2025-09-13 Thread Sergei Barannikov via llvm-branch-commits


@@ -95,10 +95,27 @@ def HasFSMULD : Predicate<"!Subtarget->hasNoFSMULD()">;
 // will pick deprecated instructions.
 def UseDeprecatedInsts : Predicate<"Subtarget->useV8DeprecatedInsts()">;
 
+//===--===//
+// HwModes Pattern Stuff
+//===--===//
+
+defvar SPARC32 = DefaultMode;
+def SPARC64 : HwMode<[Is64Bit]>;

s-barannikov wrote:

I meant default mode in hardware. This is more of a stylistic suggestion.

https://github.com/llvm/llvm-project/pull/158271


[llvm-branch-commits] [flang] [flang][OpenMP] `do concurrent`: support `local` on device (PR #157638)

2025-09-13 Thread Kareem Ergawy via llvm-branch-commits

https://github.com/ergawy updated 
https://github.com/llvm/llvm-project/pull/157638

>From 983b97d91cbf9dbf45973fdacabf2ae6948491a4 Mon Sep 17 00:00:00 2001
From: ergawy 
Date: Tue, 2 Sep 2025 05:54:00 -0500
Subject: [PATCH] [flang][OpenMP] `do concurrent`: support `local` on device

Extends support for mapping `do concurrent` on the device by adding
support for `local` specifiers. The changes in this PR map the local
variable to the `omp.target` op and use the mapped value as the
`private` clause operand in the nested `omp.parallel` op.
---
 .../include/flang/Optimizer/Dialect/FIROps.td |  12 ++
 .../OpenMP/DoConcurrentConversion.cpp | 192 +++---
 .../Transforms/DoConcurrent/local_device.mlir |  49 +
 3 files changed, 175 insertions(+), 78 deletions(-)
 create mode 100644 flang/test/Transforms/DoConcurrent/local_device.mlir

diff --git a/flang/include/flang/Optimizer/Dialect/FIROps.td b/flang/include/flang/Optimizer/Dialect/FIROps.td
index bc971e8fd6600..fc6eedc6ed4c6 100644
--- a/flang/include/flang/Optimizer/Dialect/FIROps.td
+++ b/flang/include/flang/Optimizer/Dialect/FIROps.td
@@ -3894,6 +3894,18 @@ def fir_DoConcurrentLoopOp : fir_Op<"do_concurrent.loop",
   return getReduceVars().size();
 }
 
+unsigned getInductionVarsStart() {
+  return 0;
+}
+
+unsigned getLocalOperandsStart() {
+  return getNumInductionVars();
+}
+
+unsigned getReduceOperandsStart() {
+  return getLocalOperandsStart() + getNumLocalOperands();
+}
+
 mlir::Block::BlockArgListType getInductionVars() {
   return getBody()->getArguments().slice(0, getNumInductionVars());
 }
diff --git a/flang/lib/Optimizer/OpenMP/DoConcurrentConversion.cpp b/flang/lib/Optimizer/OpenMP/DoConcurrentConversion.cpp
index 6c71924000842..d00a4fdd2cf2e 100644
--- a/flang/lib/Optimizer/OpenMP/DoConcurrentConversion.cpp
+++ b/flang/lib/Optimizer/OpenMP/DoConcurrentConversion.cpp
@@ -138,6 +138,9 @@ void collectLoopLiveIns(fir::DoConcurrentLoopOp loop,
 
 liveIns.push_back(operand->get());
   });
+
+  for (mlir::Value local : loop.getLocalVars())
+liveIns.push_back(local);
 }
 
 /// Collects values that are local to a loop: "loop-local values". A loop-local
@@ -298,8 +301,7 @@ class DoConcurrentConversion
   .getIsTargetDevice();
 
   mlir::omp::TargetOperands targetClauseOps;
-  genLoopNestClauseOps(doLoop.getLoc(), rewriter, loop, mapper,
-   loopNestClauseOps,
+  genLoopNestClauseOps(doLoop.getLoc(), rewriter, loop, loopNestClauseOps,
isTargetDevice ? nullptr : &targetClauseOps);
 
   LiveInShapeInfoMap liveInShapeInfoMap;
@@ -321,14 +323,13 @@ class DoConcurrentConversion
 }
 
 mlir::omp::ParallelOp parallelOp =
-genParallelOp(doLoop.getLoc(), rewriter, ivInfos, mapper);
+genParallelOp(rewriter, loop, ivInfos, mapper);
 
 // Only set as composite when part of `distribute parallel do`.
 parallelOp.setComposite(mapToDevice);
 
 if (!mapToDevice)
-  genLoopNestClauseOps(doLoop.getLoc(), rewriter, loop, mapper,
-   loopNestClauseOps);
+  genLoopNestClauseOps(doLoop.getLoc(), rewriter, loop, loopNestClauseOps);
 
 for (mlir::Value local : locals)
   looputils::localizeLoopLocalValue(local, parallelOp.getRegion(),
@@ -337,10 +338,38 @@ class DoConcurrentConversion
 if (mapToDevice)
   genDistributeOp(doLoop.getLoc(), rewriter).setComposite(/*val=*/true);
 
-mlir::omp::LoopNestOp ompLoopNest =
+auto [loopNestOp, wsLoopOp] =
 genWsLoopOp(rewriter, loop, mapper, loopNestClauseOps,
 /*isComposite=*/mapToDevice);
 
+// `local` region arguments are transferred/cloned from the `do concurrent`
+// loop to the loopnest op when the region is cloned above. Instead, these
+// region arguments should be on the workshare loop's region.
+if (mapToDevice) {
+  for (auto [parallelArg, loopNestArg] : llvm::zip_equal(
+   parallelOp.getRegion().getArguments(),
+   loopNestOp.getRegion().getArguments().slice(
+   loop.getLocalOperandsStart(), loop.getNumLocalOperands(
+rewriter.replaceAllUsesWith(loopNestArg, parallelArg);
+
+  for (auto [wsloopArg, loopNestArg] : llvm::zip_equal(
+   wsLoopOp.getRegion().getArguments(),
+   loopNestOp.getRegion().getArguments().slice(
+   loop.getReduceOperandsStart(), 
loop.getNumReduceOperands(
+rewriter.replaceAllUsesWith(loopNestArg, wsloopArg);
+} else {
+  for (auto [wsloopArg, loopNestArg] :
+   llvm::zip_equal(wsLoopOp.getRegion().getArguments(),
+   loopNestOp.getRegion().getArguments().drop_front(
+   loopNestClauseOps.loopLowerBounds.size(
+rewriter.replaceAllUsesWith(loopNestArg, wsloopArg);
+}
+
+for (unsigned i = 0;
+ i 

[llvm-branch-commits] [llvm] AMDGPU/UniformityAnalysis: fix G_ZEXTLOAD and G_SEXTLOAD (PR #157845)

2025-09-13 Thread Petar Avramovic via llvm-branch-commits

https://github.com/petar-avramovic updated 
https://github.com/llvm/llvm-project/pull/157845

>From f426257364826fbec65abb6de92698bfa18f9487 Mon Sep 17 00:00:00 2001
From: Petar Avramovic 
Date: Wed, 10 Sep 2025 13:04:20 +0200
Subject: [PATCH] AMDGPU/UniformityAnalysis: fix G_ZEXTLOAD and G_SEXTLOAD

Use the same rules for G_ZEXTLOAD and G_SEXTLOAD as for G_LOAD.
Flat addrspace(0) and private addrspace(5) G_ZEXTLOAD and G_SEXTLOAD
should always be divergent.
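
The rule can be sketched outside LLVM as follows. This is a simplified model under stated assumptions, not the code in `SIInstrInfo.cpp`: the enums and `classifyLoad` are hypothetical stand-ins, the address-space numbers are the ones quoted above plus global = 1 from the test file, and the "any memory operand in flat or private" check is my reading of the commit message, since that part of the function is not in the hunks below. Atomics and address-space casts are left out.

```cpp
#include <vector>

// Hypothetical stand-ins for the real opcode and address-space enums.
enum class Opcode { Load, ZExtLoad, SExtLoad, Other };
enum class AddrSpace : unsigned { Flat = 0, Global = 1, Private = 5 };
enum class Uniformity { Default, NeverUniform };

// After the patch, all three load flavours share one rule.
bool isLoadLike(Opcode Op) {
  return Op == Opcode::Load || Op == Opcode::ZExtLoad ||
         Op == Opcode::SExtLoad;
}

// MemOperands lists the address space of each memory operand on the
// instruction; an empty list means nothing is known about the access.
Uniformity classifyLoad(Opcode Op, const std::vector<AddrSpace> &MemOperands) {
  if (!isLoadLike(Op))
    return Uniformity::Default;
  if (MemOperands.empty())
    return Uniformity::NeverUniform; // conservative assumption
  for (AddrSpace Space : MemOperands) {
    if (Space == AddrSpace::Flat || Space == AddrSpace::Private)
      return Uniformity::NeverUniform;
  }
  return Uniformity::Default;
}
```

Under this model a `ZExtLoad` from the flat or private address space is never uniform, while the same opcode on a global pointer falls back to the default handling, which matches the updated CHECK lines in the test.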
---
 llvm/lib/Target/AMDGPU/SIInstrInfo.cpp| 15 +++---
 .../AMDGPU/MIR/loads-gmir.mir | 20 +++
 2 files changed, 20 insertions(+), 15 deletions(-)

diff --git a/llvm/lib/Target/AMDGPU/SIInstrInfo.cpp b/llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
index 5c958dfe6954f..398c99b3bd127 100644
--- a/llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
+++ b/llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
@@ -10281,7 +10281,7 @@ unsigned SIInstrInfo::getInstrLatency(const InstrItineraryData *ItinData,
 InstructionUniformity
 SIInstrInfo::getGenericInstructionUniformity(const MachineInstr &MI) const {
   const MachineRegisterInfo &MRI = MI.getMF()->getRegInfo();
-  unsigned opcode = MI.getOpcode();
+  unsigned Opcode = MI.getOpcode();
 
   auto HandleAddrSpaceCast = [this, &MRI](const MachineInstr &MI) {
 Register Dst = MI.getOperand(0).getReg();
@@ -10301,7 +10301,7 @@ SIInstrInfo::getGenericInstructionUniformity(const MachineInstr &MI) const {
   // If the target supports globally addressable scratch, the mapping from
   // scratch memory to the flat aperture changes therefore an address space cast
   // is no longer uniform.
-  if (opcode == TargetOpcode::G_ADDRSPACE_CAST)
+  if (Opcode == TargetOpcode::G_ADDRSPACE_CAST)
 return HandleAddrSpaceCast(MI);
 
   if (auto *GI = dyn_cast(&MI)) {
@@ -10329,7 +10329,8 @@ SIInstrInfo::getGenericInstructionUniformity(const MachineInstr &MI) const {
   //
   // All other loads are not divergent, because if threads issue loads with the
   // same arguments, they will always get the same result.
-  if (opcode == AMDGPU::G_LOAD) {
+  if (Opcode == AMDGPU::G_LOAD || Opcode == AMDGPU::G_ZEXTLOAD ||
+  Opcode == AMDGPU::G_SEXTLOAD) {
 if (MI.memoperands_empty())
   return InstructionUniformity::NeverUniform; // conservative assumption
 
@@ -10343,10 +10344,10 @@ SIInstrInfo::getGenericInstructionUniformity(const MachineInstr &MI) const {
 return InstructionUniformity::Default;
   }
 
-  if (SIInstrInfo::isGenericAtomicRMWOpcode(opcode) ||
-  opcode == AMDGPU::G_ATOMIC_CMPXCHG ||
-  opcode == AMDGPU::G_ATOMIC_CMPXCHG_WITH_SUCCESS ||
-  AMDGPU::isGenericAtomic(opcode)) {
+  if (SIInstrInfo::isGenericAtomicRMWOpcode(Opcode) ||
+  Opcode == AMDGPU::G_ATOMIC_CMPXCHG ||
+  Opcode == AMDGPU::G_ATOMIC_CMPXCHG_WITH_SUCCESS ||
+  AMDGPU::isGenericAtomic(Opcode)) {
 return InstructionUniformity::NeverUniform;
   }
   return InstructionUniformity::Default;
diff --git a/llvm/test/Analysis/UniformityAnalysis/AMDGPU/MIR/loads-gmir.mir b/llvm/test/Analysis/UniformityAnalysis/AMDGPU/MIR/loads-gmir.mir
index cb3c2de5b8753..d799cd2057f47 100644
--- a/llvm/test/Analysis/UniformityAnalysis/AMDGPU/MIR/loads-gmir.mir
+++ b/llvm/test/Analysis/UniformityAnalysis/AMDGPU/MIR/loads-gmir.mir
@@ -46,13 +46,13 @@ body: |
 %6:_(p5) = G_IMPLICIT_DEF
 
 ; Atomic load
-; CHECK-NOT: DIVERGENT
-
+; CHECK: DIVERGENT
+; CHECK-SAME: G_ZEXTLOAD
 %0:_(s32) = G_ZEXTLOAD %1(p0) :: (load seq_cst (s16) from `ptr undef`)
 
 ; flat load
-; CHECK-NOT: DIVERGENT
-
+; CHECK: DIVERGENT
+; CHECK-SAME: G_ZEXTLOAD
 %2:_(s32) = G_ZEXTLOAD %1(p0) :: (load (s16) from `ptr undef`)
 
 ; Gloabal load
@@ -60,7 +60,8 @@ body: |
 %3:_(s32) = G_ZEXTLOAD %4(p1) :: (load (s16) from `ptr addrspace(1) undef`, addrspace 1)
 
 ; Private load
-; CHECK-NOT: DIVERGENT
+; CHECK: DIVERGENT
+; CHECK-SAME: G_ZEXTLOAD
 %5:_(s32) = G_ZEXTLOAD %6(p5) :: (volatile load (s16) from `ptr addrspace(5) undef`, addrspace 5)
 G_STORE %2(s32), %4(p1) :: (volatile store (s32) into `ptr addrspace(1) undef`, addrspace 1)
 G_STORE %3(s32), %4(p1) :: (volatile store (s32) into `ptr addrspace(1) undef`, addrspace 1)
@@ -80,11 +81,13 @@ body: |
 %6:_(p5) = G_IMPLICIT_DEF
 
 ; Atomic load
-; CHECK-NOT: DIVERGENT
+; CHECK: DIVERGENT
+; CHECK-SAME: G_SEXTLOAD
 %0:_(s32) = G_SEXTLOAD %1(p0) :: (load seq_cst (s16) from `ptr undef`)
 
 ; flat load
-; CHECK-NOT: DIVERGENT
+; CHECK: DIVERGENT
+; CHECK-SAME: G_SEXTLOAD
 %2:_(s32) = G_SEXTLOAD %1(p0) :: (load (s16) from `ptr undef`)
 
 ; Gloabal load
@@ -92,7 +95,8 @@ body: |
 %3:_(s32) = G_SEXTLOAD %4(p1) :: (load (s16) from `ptr addrspace(1) undef`, addrspace 1)
 
 ; Private load
-; CHECK-NOT: DIVERGENT
+; CHECK: DIVERGENT
+; CHECK-SAME: G_SEXTLOAD
 %5:_(s32) = G_SEXTLOAD %6(p5) :: (volatile load (s16)

[llvm-branch-commits] [llvm] [mlir] [flang][OpenMP] Support multi-block reduction combiner regions on the GPU (PR #156837)

2025-09-13 Thread Kareem Ergawy via llvm-branch-commits

https://github.com/ergawy updated 
https://github.com/llvm/llvm-project/pull/156837

>From 7f6d6feb526c33b05e9705ef6587e8bcc145458f Mon Sep 17 00:00:00 2001
From: ergawy 
Date: Thu, 4 Sep 2025 01:06:21 -0500
Subject: [PATCH] [flang][OpenMP] Support multi-block reduction combiner 
 regions on the GPU

Fixes a bug related to insertion points when inlining multi-block
combiner reduction regions. The IP at the end of the inlined region was
not used, which resulted in BBs with multiple terminators being emitted.
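
The failure mode can be illustrated with a toy model. This is a sketch only: `Block`, `inlineTwoBlockCombiner` and `buildReduction` are hypothetical names, instructions are plain strings, and none of it uses `IRBuilder` or `OpenMPIRBuilder`. It assumes the inlined combiner ends its first block with a branch and hands back the block where emission should continue.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical toy IR: a block is a list of instruction strings, and any
// instruction starting with "br " counts as a terminator.
struct Block {
  std::vector<std::string> Insts;
};

bool isTerminator(const std::string &Inst) {
  return Inst.rfind("br ", 0) == 0;
}

// Inlines a two-block "combiner" starting in Blocks[Start] and returns the
// index of the block where emission should continue (the "after" point).
std::size_t inlineTwoBlockCombiner(std::vector<Block> &Blocks,
                                   std::size_t Start) {
  Blocks[Start].Insts.push_back("call bar");
  Blocks.emplace_back();
  std::size_t Second = Blocks.size() - 1;
  Blocks[Start].Insts.push_back("br block" + std::to_string(Second));
  Blocks[Second].Insts.push_back("call baz");
  return Second;
}

// Emits the store that follows the combiner. When HonourAfterIP is false the
// caller keeps writing into the block it started in, which models the bug.
std::vector<Block> buildReduction(bool HonourAfterIP) {
  std::vector<Block> Blocks(1);
  std::size_t InsertAt = 0;
  std::size_t AfterIP = inlineTwoBlockCombiner(Blocks, InsertAt);
  if (HonourAfterIP)
    InsertAt = AfterIP; // models moving the insert point to the after-IP
  Blocks[InsertAt].Insts.push_back("store reduced");
  Blocks[InsertAt].Insts.push_back("br done");
  return Blocks;
}

// Largest number of terminators found in any single block.
std::size_t maxTerminatorsPerBlock(const std::vector<Block> &Blocks) {
  std::size_t Max = 0;
  for (const Block &B : Blocks) {
    std::size_t Count = static_cast<std::size_t>(
        std::count_if(B.Insts.begin(), B.Insts.end(), isTerminator));
    Max = std::max(Max, Count);
  }
  return Max;
}
```

With `buildReduction(false)` the store and its branch land after the combiner's own branch in the first block, so that block ends up with two terminators. With `buildReduction(true)` every block has exactly one.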
---
 llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp |  3 +
 .../omptarget-multi-block-reduction.mlir  | 85 +++
 2 files changed, 88 insertions(+)
 create mode 100644 mlir/test/Target/LLVMIR/omptarget-multi-block-reduction.mlir

diff --git a/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp b/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
index c955ecd403633..116d1d9f4a951 100644
--- a/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
+++ b/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
@@ -3507,6 +3507,8 @@ Expected OpenMPIRBuilder::createReductionFunction(
 return AfterIP.takeError();
   if (!Builder.GetInsertBlock())
 return ReductionFunc;
+
+  Builder.SetInsertPoint(AfterIP->getBlock(), AfterIP->getPoint());
   Builder.CreateStore(Reduced, LHSPtr);
 }
   }
@@ -3751,6 +3753,7 @@ OpenMPIRBuilder::InsertPointOrErrorTy OpenMPIRBuilder::createReductionsGPU(
   RI.ReductionGen(Builder.saveIP(), RHSValue, LHSValue, Reduced);
   if (!AfterIP)
 return AfterIP.takeError();
+  Builder.SetInsertPoint(AfterIP->getBlock(), AfterIP->getPoint());
   Builder.CreateStore(Reduced, LHS, false);
 }
   }
diff --git a/mlir/test/Target/LLVMIR/omptarget-multi-block-reduction.mlir b/mlir/test/Target/LLVMIR/omptarget-multi-block-reduction.mlir
new file mode 100644
index 0..aaf06d2d0e0c2
--- /dev/null
+++ b/mlir/test/Target/LLVMIR/omptarget-multi-block-reduction.mlir
@@ -0,0 +1,85 @@
+// RUN: mlir-translate -mlir-to-llvmir %s | FileCheck %s
+
+// Verifies that the IR builder can handle reductions with multi-block combiner
+// regions on the GPU.
+
+module attributes {dlti.dl_spec = #dlti.dl_spec<"dlti.alloca_memory_space" = 5 : ui64, "dlti.global_memory_space" = 1 : ui64>, llvm.target_triple = "amdgcn-amd-amdhsa", omp.is_gpu = true, omp.is_target_device = true} {
+  llvm.func @bar() {}
+  llvm.func @baz() {}
+
+  omp.declare_reduction @add_reduction_byref_box_5xf32 : !llvm.ptr alloc {
+%0 = llvm.mlir.constant(1 : i64) : i64
+%1 = llvm.alloca %0 x !llvm.struct<(ptr, i64, i32, i8, i8, i8, i8, array<1 x array<3 x i64>>)> : (i64) -> !llvm.ptr<5>
+%2 = llvm.addrspacecast %1 : !llvm.ptr<5> to !llvm.ptr
+omp.yield(%2 : !llvm.ptr)
+  } init {
+  ^bb0(%arg0: !llvm.ptr, %arg1: !llvm.ptr):
+omp.yield(%arg1 : !llvm.ptr)
+  } combiner {
+  ^bb0(%arg0: !llvm.ptr, %arg1: !llvm.ptr):
+llvm.call @bar() : () -> ()
+llvm.br ^bb3
+
+  ^bb3:  // pred: ^bb1
+llvm.call @baz() : () -> ()
+omp.yield(%arg0 : !llvm.ptr)
+  }
+  llvm.func @foo_() {
+%c1 = llvm.mlir.constant(1 : i64) : i64
+%10 = llvm.alloca %c1 x !llvm.array<5 x f32> {bindc_name = "x"} : (i64) -> !llvm.ptr<5>
+%11 = llvm.addrspacecast %10 : !llvm.ptr<5> to !llvm.ptr
+%74 = omp.map.info var_ptr(%11 : !llvm.ptr, !llvm.array<5 x f32>) map_clauses(tofrom) capture(ByRef) -> !llvm.ptr {name = "x"}
+omp.target map_entries(%74 -> %arg0 : !llvm.ptr) {
+  %c1_2 = llvm.mlir.constant(1 : i32) : i32
+  %c10 = llvm.mlir.constant(10 : i32) : i32
+  omp.teams reduction(byref @add_reduction_byref_box_5xf32 %arg0 -> %arg2 
: !llvm.ptr) {
+omp.parallel {
+  omp.distribute {
+omp.wsloop {
+  omp.loop_nest (%arg5) : i32 = (%c1_2) to (%c10) inclusive step 
(%c1_2) {
+omp.yield
+  }
+} {omp.composite}
+  } {omp.composite}
+  omp.terminator
+} {omp.composite}
+omp.terminator
+  }
+  omp.terminator
+}
+llvm.return
+  }
+}
+
+// CHECK:  call void @__kmpc_parallel_51({{.*}}, i32 1, i32 -1, i32 -1,
+// CHECK-SAME:   ptr @[[PAR_OUTLINED:.*]], ptr null, ptr %2, i64 1)
+
+// CHECK: define internal void @[[PAR_OUTLINED]]{{.*}} {
+// CHECK:   .omp.reduction.then:
+// CHECK: br label %omp.reduction.nonatomic.body
+
+// CHECK:   omp.reduction.nonatomic.body:
+// CHECK: call void @bar()
+// CHECK: br label %[[BODY_2ND_BB:.*]]
+
+// CHECK:   [[BODY_2ND_BB]]:
+// CHECK: call void @baz()
+// CHECK: br label %[[CONT_BB:.*]]
+
+// CHECK:   [[CONT_BB]]:
+// CHECK: br label %.omp.reduction.done
+// CHECK: }
+
+// CHECK: define internal void @"{{.*}}$reduction$reduction_func"(ptr noundef %0, ptr noundef %1) #0 {
+// CHECK: br label %omp.reduction.nonatomic.body
+
+// CHECK:   [[BODY_2ND_BB:.*]]:
+// CHECK: call void @baz()
+// CHECK: br label %omp.region.cont
+
+
+// CHECK: omp.reduction.nonatomic.body:
+// CHECK:   call void @bar()

[llvm-branch-commits] [llvm] [NFC][flang][do concurent] Add saxpy offload tests for OpenMP mapping (PR #155993)

2025-09-13 Thread Kareem Ergawy via llvm-branch-commits

https://github.com/ergawy updated 
https://github.com/llvm/llvm-project/pull/155993

>From 2177ccc20d333d6c6645f96a2b9c427d4ea952ac Mon Sep 17 00:00:00 2001
From: ergawy 
Date: Fri, 29 Aug 2025 04:04:07 -0500
Subject: [PATCH] [flang][do concurent] Add saxpy offload tests for OpenMP
 mapping

Adds end-to-end tests for `do concurrent` offloading to the device.
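
For reference, the values the CHECK lines in these tests expect can be reproduced on the host with a plain loop. This is a sketch, not part of the patch: `saxpyReference` and `expectedSaxpyResult` are hypothetical helper names, and the inputs (a = 2.0, x = 1.0, y = 2.0, n = 1000) are taken from the Fortran sources in the diff.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Reference SAXPY on the host: Y[i] = A * X[i] + Y[i].
void saxpyReference(float A, const std::vector<float> &X,
                    std::vector<float> &Y) {
  std::size_t N = std::min(X.size(), Y.size());
  for (std::size_t I = 0; I < N; ++I)
    Y[I] = A * X[I] + Y[I];
}

// Mirrors the test setup: every X element is 1.0, every Y element is 2.0,
// and the scale factor is 2.0.
std::vector<float> expectedSaxpyResult(std::size_t N) {
  std::vector<float> X(N, 1.0f);
  std::vector<float> Y(N, 2.0f);
  saxpyReference(2.0f, X, Y);
  return Y;
}
```

Every element works out to 2.0 * 1.0 + 2.0 = 4.0, which is why both the first and the last element are checked against `4.0`.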
---
 .../fortran/do-concurrent-to-omp-saxpy-2d.f90 | 53 +++
 .../fortran/do-concurrent-to-omp-saxpy.f90| 53 +++
 2 files changed, 106 insertions(+)
 create mode 100644 offload/test/offloading/fortran/do-concurrent-to-omp-saxpy-2d.f90
 create mode 100644 offload/test/offloading/fortran/do-concurrent-to-omp-saxpy.f90

diff --git a/offload/test/offloading/fortran/do-concurrent-to-omp-saxpy-2d.f90 b/offload/test/offloading/fortran/do-concurrent-to-omp-saxpy-2d.f90
new file mode 100644
index 0..c6f576acb90b6
--- /dev/null
+++ b/offload/test/offloading/fortran/do-concurrent-to-omp-saxpy-2d.f90
@@ -0,0 +1,53 @@
+! REQUIRES: flang, amdgpu
+
+! RUN: %libomptarget-compile-fortran-generic -fdo-concurrent-to-openmp=device
+! RUN: env LIBOMPTARGET_INFO=16 %libomptarget-run-generic 2>&1 | %fcheck-generic
+module saxpymod
+   use iso_fortran_env
+   public :: saxpy
+contains
+
+subroutine saxpy(a, x, y, n, m)
+   use iso_fortran_env
+   implicit none
+   integer,intent(in) :: n, m
+   real(kind=real32),intent(in) :: a
+   real(kind=real32), dimension(:,:),intent(in) :: x
+   real(kind=real32), dimension(:,:),intent(inout) :: y
+   integer :: i, j
+
+   do concurrent(i=1:n, j=1:m)
+   y(i,j) = a * x(i,j) + y(i,j)
+   end do
+
+   write(*,*) "plausibility check:"
+   write(*,'("y(1,1) ",f8.6)') y(1,1)
+   write(*,'("y(n,m) ",f8.6)') y(n,m)
+end subroutine saxpy
+
+end module saxpymod
+
+program main
+   use iso_fortran_env
+   use saxpymod, ONLY:saxpy
+   implicit none
+
+   integer,parameter :: n = 1000, m=1
+   real(kind=real32), allocatable, dimension(:,:) :: x, y
+   real(kind=real32) :: a
+   integer :: i
+
+   allocate(x(1:n,1:m), y(1:n,1:m))
+   a = 2.0_real32
+   x(:,:) = 1.0_real32
+   y(:,:) = 2.0_real32
+
+   call saxpy(a, x, y, n, m)
+
+   deallocate(x,y)
+end program main
+
+! CHECK:  "PluginInterface" device {{[0-9]+}} info: Launching kernel {{.*}}
+! CHECK:  plausibility check:
+! CHECK:  y(1,1) 4.0
+! CHECK:  y(n,m) 4.0
diff --git a/offload/test/offloading/fortran/do-concurrent-to-omp-saxpy.f90 b/offload/test/offloading/fortran/do-concurrent-to-omp-saxpy.f90
new file mode 100644
index 0..e094a1d7459ef
--- /dev/null
+++ b/offload/test/offloading/fortran/do-concurrent-to-omp-saxpy.f90
@@ -0,0 +1,53 @@
+! REQUIRES: flang, amdgpu
+
+! RUN: %libomptarget-compile-fortran-generic -fdo-concurrent-to-openmp=device
+! RUN: env LIBOMPTARGET_INFO=16 %libomptarget-run-generic 2>&1 | %fcheck-generic
+module saxpymod
+   use iso_fortran_env
+   public :: saxpy
+contains
+
+subroutine saxpy(a, x, y, n)
+   use iso_fortran_env
+   implicit none
+   integer,intent(in) :: n
+   real(kind=real32),intent(in) :: a
+   real(kind=real32), dimension(:),intent(in) :: x
+   real(kind=real32), dimension(:),intent(inout) :: y
+   integer :: i
+
+   do concurrent(i=1:n)
+   y(i) = a * x(i) + y(i)
+   end do
+
+   write(*,*) "plausibility check:"
+   write(*,'("y(1) ",f8.6)') y(1)
+   write(*,'("y(n) ",f8.6)') y(n)
+end subroutine saxpy
+
+end module saxpymod
+
+program main
+   use iso_fortran_env
+   use saxpymod, ONLY:saxpy
+   implicit none
+
+   integer,parameter :: n = 1000
+   real(kind=real32), allocatable, dimension(:) :: x, y
+   real(kind=real32) :: a
+   integer :: i
+
+   allocate(x(1:n), y(1:n))
+   a = 2.0_real32
+   x(:) = 1.0_real32
+   y(:) = 2.0_real32
+
+   call saxpy(a, x, y, n)
+
+   deallocate(x,y)
+end program main
+
+! CHECK:  "PluginInterface" device {{[0-9]+}} info: Launching kernel {{.*}}
+! CHECK:  plausibility check:
+! CHECK:  y(1) 4.0
+! CHECK:  y(n) 4.0



[llvm-branch-commits] [flang] [flang][OpenMP] `do concurrent`: support `reduce` on device (PR #156610)

2025-09-13 Thread Kareem Ergawy via llvm-branch-commits

https://github.com/ergawy updated 
https://github.com/llvm/llvm-project/pull/156610

>From 6dbc28572976b06ae8ad6661724c5cbfd7aab2e9 Mon Sep 17 00:00:00 2001
From: ergawy 
Date: Tue, 2 Sep 2025 08:36:34 -0500
Subject: [PATCH] [flang][OpenMP] `do concurrent`: support `reduce` on device

Extends `do concurrent` to OpenMP device mapping by adding support for
mapping `reduce` specifiers to omp `reduction` clauses. The changes
attach two `reduction` clauses to the mapped OpenMP construct: one on the
`teams` part of the construct and one on the `wsloop` part.
---
 .../OpenMP/DoConcurrentConversion.cpp | 117 ++
 .../DoConcurrent/reduce_device.mlir   |  53 
 2 files changed, 121 insertions(+), 49 deletions(-)
 create mode 100644 flang/test/Transforms/DoConcurrent/reduce_device.mlir

diff --git a/flang/lib/Optimizer/OpenMP/DoConcurrentConversion.cpp b/flang/lib/Optimizer/OpenMP/DoConcurrentConversion.cpp
index d00a4fdd2cf2e..6e308499100fa 100644
--- a/flang/lib/Optimizer/OpenMP/DoConcurrentConversion.cpp
+++ b/flang/lib/Optimizer/OpenMP/DoConcurrentConversion.cpp
@@ -141,6 +141,9 @@ void collectLoopLiveIns(fir::DoConcurrentLoopOp loop,
 
   for (mlir::Value local : loop.getLocalVars())
 liveIns.push_back(local);
+
+  for (mlir::Value reduce : loop.getReduceVars())
+liveIns.push_back(reduce);
 }
 
 /// Collects values that are local to a loop: "loop-local values". A loop-local
@@ -319,7 +322,7 @@ class DoConcurrentConversion
   targetOp =
   genTargetOp(doLoop.getLoc(), rewriter, mapper, loopNestLiveIns,
   targetClauseOps, loopNestClauseOps, liveInShapeInfoMap);
-  genTeamsOp(doLoop.getLoc(), rewriter);
+  genTeamsOp(rewriter, loop, mapper);
 }
 
 mlir::omp::ParallelOp parallelOp =
@@ -492,46 +495,7 @@ class DoConcurrentConversion
 if (!mapToDevice)
   genPrivatizers(rewriter, mapper, loop, wsloopClauseOps);
 
-if (!loop.getReduceVars().empty()) {
-  for (auto [op, byRef, sym, arg] : llvm::zip_equal(
-   loop.getReduceVars(), loop.getReduceByrefAttr().asArrayRef(),
-   loop.getReduceSymsAttr().getAsRange(),
-   loop.getRegionReduceArgs())) {
-auto firReducer = moduleSymbolTable.lookup(
-sym.getLeafReference());
-
-mlir::OpBuilder::InsertionGuard guard(rewriter);
-rewriter.setInsertionPointAfter(firReducer);
-std::string ompReducerName = sym.getLeafReference().str() + ".omp";
-
-auto ompReducer =
-moduleSymbolTable.lookup(
-rewriter.getStringAttr(ompReducerName));
-
-if (!ompReducer) {
-  ompReducer = mlir::omp::DeclareReductionOp::create(
-  rewriter, firReducer.getLoc(), ompReducerName,
-  firReducer.getTypeAttr().getValue());
-
-  cloneFIRRegionToOMP(rewriter, firReducer.getAllocRegion(),
-  ompReducer.getAllocRegion());
-  cloneFIRRegionToOMP(rewriter, firReducer.getInitializerRegion(),
-  ompReducer.getInitializerRegion());
-  cloneFIRRegionToOMP(rewriter, firReducer.getReductionRegion(),
-  ompReducer.getReductionRegion());
-  cloneFIRRegionToOMP(rewriter, firReducer.getAtomicReductionRegion(),
-  ompReducer.getAtomicReductionRegion());
-  cloneFIRRegionToOMP(rewriter, firReducer.getCleanupRegion(),
-  ompReducer.getCleanupRegion());
-  moduleSymbolTable.insert(ompReducer);
-}
-
-wsloopClauseOps.reductionVars.push_back(op);
-wsloopClauseOps.reductionByref.push_back(byRef);
-wsloopClauseOps.reductionSyms.push_back(
-mlir::SymbolRefAttr::get(ompReducer));
-  }
-}
+genReductions(rewriter, mapper, loop, wsloopClauseOps);
 
 auto wsloopOp =
 mlir::omp::WsloopOp::create(rewriter, loop.getLoc(), wsloopClauseOps);
@@ -553,8 +517,6 @@ class DoConcurrentConversion
 
 rewriter.setInsertionPointToEnd(&loopNestOp.getRegion().back());
 mlir::omp::YieldOp::create(rewriter, loop->getLoc());
-loop->getParentOfType().print(
-llvm::errs(), mlir::OpPrintingFlags().assumeVerified());
 
 return {loopNestOp, wsloopOp};
   }
@@ -778,15 +740,26 @@ class DoConcurrentConversion
 liveInName, shape);
   }
 
-  mlir::omp::TeamsOp
-  genTeamsOp(mlir::Location loc,
- mlir::ConversionPatternRewriter &rewriter) const {
-auto teamsOp = rewriter.create(
-loc, /*clauses=*/mlir::omp::TeamsOperands{});
+  mlir::omp::TeamsOp genTeamsOp(mlir::ConversionPatternRewriter &rewriter,
+fir::DoConcurrentLoopOp loop,
+mlir::IRMapping &mapper) const {
+mlir::omp::TeamsOperands teamsOps;
+genReductions(rewriter, mapper, loop, teamsOps);
+
+mlir::Location loc = loop.getLoc();
+aut
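
The reduction code that this hunk moves into `genReductions` follows a look-up-or-create pattern: find the existing reducer by symbol, derive a name with an `.omp` suffix, and create the counterpart only if the symbol table does not already hold one. Below is a minimal, self-contained sketch of that pattern. `ToyReducer`, `ToySymbolTable`, and `getOrCreateOmpReducer` are hypothetical stand-ins built on `std::map`; they are not the MLIR classes used in the patch, and the real code also clones several regions that are not modeled here.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for a declare-reduction op; only a name and a type
// are modeled, and the regions cloned in the real patch are left out.
struct ToyReducer {
  std::string Name;
  std::string Type;
};

// Hypothetical stand-in for a module-level symbol table.
using ToySymbolTable = std::map<std::string, ToyReducer>;

// Returns the ".omp" counterpart of FirName, creating it from the existing
// reducer on first use. Returns nullptr if FirName is not in the table.
const ToyReducer *getOrCreateOmpReducer(ToySymbolTable &Table,
                                        const std::string &FirName) {
  auto FirIt = Table.find(FirName);
  if (FirIt == Table.end())
    return nullptr;

  const std::string OmpName = FirName + ".omp";
  auto OmpIt = Table.find(OmpName);
  if (OmpIt == Table.end()) {
    // First request for this reducer: create the counterpart and register it
    // so that later loops referencing the same symbol reuse it.
    OmpIt =
        Table.emplace(OmpName, ToyReducer{OmpName, FirIt->second.Type}).first;
  }
  return &OmpIt->second;
}
```

Calling the helper twice with the same name returns the same entry, which is why the look-up comes before the create step: several loops can reference one reducer without producing duplicate symbols.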

[llvm-branch-commits] [llvm] [mlir] [flang][OpenMP] Support multi-block reduction combiner regions on the GPU (PR #156837)

2025-09-13 Thread Kareem Ergawy via llvm-branch-commits


@@ -3506,6 +3506,8 @@ Expected 
OpenMPIRBuilder::createReductionFunction(
 return AfterIP.takeError();
   if (!Builder.GetInsertBlock())
 return ReductionFunc;
+
+  Builder.SetInsertPoint(AfterIP->getBlock(), AfterIP->getPoint());

ergawy wrote:

Done.
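
For readers following the thread, here is a toy sketch of why the caller re-synchronizes its builder with the insertion point returned by the callback. It assumes the combiner spans more than one block and does not move the caller's builder itself. `Block`, `InsertPoint`, `ToyBuilder`, `multiBlockCombiner`, and `emitReduction` are hypothetical names for illustration only, not the LLVM `IRBuilder` or `OpenMPIRBuilder` API.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy stand-ins for a basic block and an insertion point (hypothetical).
struct Block {
  std::string Name;
  std::vector<std::string> Insts;
};

struct InsertPoint {
  Block *BB;
};

// Toy builder that appends instructions at its current insertion point.
struct ToyBuilder {
  InsertPoint IP{nullptr};
  void setInsertPoint(InsertPoint NewIP) { IP = NewIP; }
  InsertPoint saveIP() const { return IP; }
  void emit(const std::string &Inst) { IP.BB->Insts.push_back(Inst); }
};

// A combiner that spans two blocks. It emits through its own local builder,
// so the caller's builder is not moved; the continuation point is only
// communicated through the return value.
InsertPoint multiBlockCombiner(InsertPoint Start, Block &Cont) {
  ToyBuilder Local;
  Local.setInsertPoint(Start);
  Local.emit("br cont");
  Local.setInsertPoint(InsertPoint{&Cont});
  Local.emit("combine lhs, rhs");
  return Local.saveIP();
}

// Emits the combiner and then a store of the reduced value. Without the
// resync step the store is appended to the starting block, after the branch.
void emitReduction(ToyBuilder &B, Block &Cont, bool Resync) {
  InsertPoint AfterIP = multiBlockCombiner(B.saveIP(), Cont);
  if (Resync)
    B.setInsertPoint(AfterIP);
  B.emit("store reduced");
}
```

With `Resync` set, the store follows the combine instruction in the continuation block; without it, the store ends up after the terminator of the starting block, which is the kind of misplaced code the added `SetInsertPoint` call guards against in this sketch.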

https://github.com/llvm/llvm-project/pull/156837


[llvm-branch-commits] [llvm] [mlir] [flang][OpenMP] Support multi-block reduction combiner regions on the GPU (PR #156837)

2025-09-13 Thread Kareem Ergawy via llvm-branch-commits


@@ -3750,6 +3752,7 @@ OpenMPIRBuilder::InsertPointOrErrorTy 
OpenMPIRBuilder::createReductionsGPU(
   RI.ReductionGen(Builder.saveIP(), RHSValue, LHSValue, Reduced);
   if (!AfterIP)
 return AfterIP.takeError();
+  Builder.SetInsertPoint(AfterIP->getBlock(), AfterIP->getPoint());

ergawy wrote:

Done.

https://github.com/llvm/llvm-project/pull/156837


[llvm-branch-commits] [llvm] VocabStorage (PR #158376)

2025-09-13 Thread S. VenkataKeerthy via llvm-branch-commits

svkeerthy wrote:

> [!WARNING]
> This pull request is not mergeable via GitHub because a downstack PR is
> open. Once all requirements are satisfied, merge this PR as a stack on
> Graphite: https://app.graphite.dev/github/pr/llvm/llvm-project/158376?utm_source=stack-comment-downstack-mergeability-warning
> Learn more: https://graphite.dev/docs/merge-pull-requests

* **#158376** 👈 (View in Graphite: https://app.graphite.dev/github/pr/llvm/llvm-project/158376?utm_source=stack-comment-view-in-graphite)
* **#156952**
* **#155690**
* **#155516**
* **#155323**: 1 other dependent PR ([#155700](https://github.com/llvm/llvm-project/pull/155700))
* **#153094**
* **#153089**
* **#153087**
* **#152613**
* `main`

This stack of pull requests is managed by Graphite (https://graphite.dev?utm-source=stack-comment). Learn
more about stacking: https://stacking.dev/?utm_source=stack-comment


https://github.com/llvm/llvm-project/pull/158376


[llvm-branch-commits] [NFC][CFI][CodeGen] Move GeneralizeFunctionType out of CreateMetadataIdentifierGeneralized (PR #158190)

2025-09-13 Thread Vitaly Buka via llvm-branch-commits

https://github.com/vitalybuka updated 
https://github.com/llvm/llvm-project/pull/158190




[llvm-branch-commits] [llvm] [mlir] [flang][OpenMP] Support multi-block reduction combiner regions on the GPU (PR #156837)

2025-09-13 Thread Kareem Ergawy via llvm-branch-commits

ergawy wrote:

Thanks for the review, Abid. I have addressed your comments.

https://github.com/llvm/llvm-project/pull/156837