[llvm-branch-commits] [llvm] [WIP] Handle guard insertion in callbacks to OpenMP runtime functions. (PR #164655)
https://github.com/abidh updated
https://github.com/llvm/llvm-project/pull/164655
>From 56037a64dbd5f73d2c020dd5d58d2c99758b35d0 Mon Sep 17 00:00:00 2001
From: Abid Qadeer
Date: Tue, 21 Oct 2025 20:53:46 +0100
Subject: [PATCH 01/11] Add callback metadata to runtime functions which take
callbacks.
---
llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp | 25
.../Frontend/OpenMPIRBuilderTest.cpp | 58 +++
2 files changed, 83 insertions(+)
diff --git a/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp b/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
index c164d32f8f98c..312e119c4280d 100644
--- a/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
+++ b/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
@@ -750,6 +750,31 @@ OpenMPIRBuilder::getOrCreateRuntimeFunction(Module &M, RuntimeFunction FnID) {
*MDNode::get(Ctx, {MDB.createCallbackEncoding(
2, {-1, -1}, /* VarArgsArePassed */ true)}));
}
+
+} else if (FnID == OMPRTL___kmpc_distribute_static_loop_4 ||
+ FnID == OMPRTL___kmpc_distribute_static_loop_4u ||
+ FnID == OMPRTL___kmpc_distribute_static_loop_8 ||
+ FnID == OMPRTL___kmpc_distribute_static_loop_8u ||
+ FnID == OMPRTL___kmpc_distribute_for_static_loop_4 ||
+ FnID == OMPRTL___kmpc_distribute_for_static_loop_4u ||
+ FnID == OMPRTL___kmpc_distribute_for_static_loop_8 ||
+ FnID == OMPRTL___kmpc_distribute_for_static_loop_8u ||
+ FnID == OMPRTL___kmpc_for_static_loop_4 ||
+ FnID == OMPRTL___kmpc_for_static_loop_4u ||
+ FnID == OMPRTL___kmpc_for_static_loop_8 ||
+ FnID == OMPRTL___kmpc_for_static_loop_8u) {
+ if (!Fn->hasMetadata(LLVMContext::MD_callback)) {
+LLVMContext &Ctx = Fn->getContext();
+MDBuilder MDB(Ctx);
+// Annotate the callback behavior of the runtime function:
+// - The callback callee is argument number 1.
+// - The first argument of the callback callee is unknown (-1).
+// - The second argument of the callback callee is argument number 2
+Fn->addMetadata(
+LLVMContext::MD_callback,
+*MDNode::get(Ctx, {MDB.createCallbackEncoding(
+ 1, {-1, 2}, /* VarArgsArePassed */ false)}));
+ }
}
LLVM_DEBUG(dbgs() << "Created OpenMP runtime function " << Fn->getName()
diff --git a/llvm/unittests/Frontend/OpenMPIRBuilderTest.cpp b/llvm/unittests/Frontend/OpenMPIRBuilderTest.cpp
index d231a778a8a97..aca2153f85c26 100644
--- a/llvm/unittests/Frontend/OpenMPIRBuilderTest.cpp
+++ b/llvm/unittests/Frontend/OpenMPIRBuilderTest.cpp
@@ -7957,4 +7957,62 @@ TEST_F(OpenMPIRBuilderTest, spliceBBWithEmptyBB) {
EXPECT_FALSE(Terminator->getDbgRecordRange().empty());
}
+TEST_F(OpenMPIRBuilderTest, callBackFunctions) {
+ OpenMPIRBuilder OMPBuilder(*M);
+ OMPBuilder.Config.IsTargetDevice = true;
+ OMPBuilder.initialize();
+
+ // Test multiple runtime functions that should have callback metadata
+ std::vector<RuntimeFunction> CallbackFunctions = {
+OMPRTL___kmpc_distribute_static_loop_4,
+OMPRTL___kmpc_distribute_static_loop_4u,
+OMPRTL___kmpc_distribute_static_loop_8,
+OMPRTL___kmpc_distribute_static_loop_8u,
+OMPRTL___kmpc_distribute_for_static_loop_4,
+OMPRTL___kmpc_distribute_for_static_loop_4u,
+OMPRTL___kmpc_distribute_for_static_loop_8,
+OMPRTL___kmpc_distribute_for_static_loop_8u,
+OMPRTL___kmpc_for_static_loop_4,
+OMPRTL___kmpc_for_static_loop_4u,
+OMPRTL___kmpc_for_static_loop_8,
+OMPRTL___kmpc_for_static_loop_8u
+ };
+
+ for (RuntimeFunction RF : CallbackFunctions) {
+Function *Fn = OMPBuilder.getOrCreateRuntimeFunctionPtr(RF);
+ASSERT_NE(Fn, nullptr) << "Function should exist for runtime function";
+
+MDNode *CallbackMD = Fn->getMetadata(LLVMContext::MD_callback);
+EXPECT_NE(CallbackMD, nullptr) << "Function should have callback metadata";
+
+if (CallbackMD) {
+ // Should have at least one callback
+ EXPECT_GE(CallbackMD->getNumOperands(), 1U);
+
+ // Test first callback entry
+ MDNode *FirstCallback = cast<MDNode>(CallbackMD->getOperand(0));
+ EXPECT_EQ(FirstCallback->getNumOperands(), 4U);
+
+ // Callee index should be valid
+ auto *CalleeIdxCM = cast<ConstantAsMetadata>(FirstCallback->getOperand(0));
+ uint64_t CalleeIdx = cast<ConstantInt>(CalleeIdxCM->getValue())->getZExtValue();
+ EXPECT_EQ(CalleeIdx, 1u);
+
+ // Verify payload arguments are (-1, 2)
+ auto *Arg0CM = cast<ConstantAsMetadata>(FirstCallback->getOperand(1));
+ int64_t Arg0 = cast<ConstantInt>(Arg0CM->getValue())->getSExtValue();
+ EXPECT_EQ(Arg0, -1);
+ auto *Arg1CM = cast<ConstantAsMetadata>(FirstCallback->getOperand(2));
+ int64_t Arg1 = cast<ConstantInt>(Arg1CM->getValue())->getSExtValue();
+ EXPECT_EQ(Arg1, 2);
+
+ // Verify the varArgs is false.
+ auto *VarArgCM = cast<ConstantAsMetadata>(FirstCallback->getOperand(3));
+ uint64_t Var
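The callback encoding that the patch above attaches, and that the unit test reads back operand by operand, can be modeled without LLVM. The following is only a sketch under stated assumptions: `CallbackEncoding`, `flatten`, `resolveCallbackArgs` and the broker argument names are hypothetical stand-ins, not LLVM types or APIs. It mirrors the layout described in the patch comments: the callee index first, then one broker argument index per callback parameter (-1 meaning unknown), then the var-args flag.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>
#include <vector>

// Hypothetical model of one callback encoding; this is not an LLVM type.
struct CallbackEncoding {
  unsigned CalleeIdx;              // broker argument that holds the callback
  std::vector<int64_t> PayloadIdx; // broker argument per callback parameter,
                                   // -1 when the value is unknown
  bool VarArgsArePassed;           // whether broker var-args are forwarded
};

// Flatten into the operand order the test inspects:
// callee index, payload indices, var-args flag.
std::vector<int64_t> flatten(const CallbackEncoding &E) {
  std::vector<int64_t> Ops;
  Ops.push_back(static_cast<int64_t>(E.CalleeIdx));
  Ops.insert(Ops.end(), E.PayloadIdx.begin(), E.PayloadIdx.end());
  Ops.push_back(E.VarArgsArePassed ? 1 : 0);
  return Ops;
}

// For one call to the broker, report which broker argument each callback
// parameter receives; std::nullopt marks a parameter that is unknown.
std::vector<std::optional<std::string>>
resolveCallbackArgs(const CallbackEncoding &E,
                    const std::vector<std::string> &BrokerArgs) {
  assert(E.CalleeIdx < BrokerArgs.size() && "callee index out of range");
  std::vector<std::optional<std::string>> Out;
  for (int64_t Idx : E.PayloadIdx) {
    if (Idx < 0 || static_cast<std::size_t>(Idx) >= BrokerArgs.size())
      Out.push_back(std::nullopt);
    else
      Out.push_back(BrokerArgs[static_cast<std::size_t>(Idx)]);
  }
  return Out;
}
```

With the encoding used in the patch, `{1, {-1, 2}, false}`, `flatten` yields four operands, which matches the `EXPECT_EQ(FirstCallback->getNumOperands(), 4U)` check, and the second callback parameter resolves to broker argument 2.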
[llvm-branch-commits] [llvm] [LIR][profcheck] Reuse the loop's exit condition profile (PR #164523)
https://github.com/jinhuang1102 approved this pull request.
LGTM
https://github.com/llvm/llvm-project/pull/164523
___
llvm-branch-commits mailing list
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [clang] [llvm] [OpenMP] Add codegen support for dyn_groupprivate clause (PR #152830)
https://github.com/skc7 approved this pull request.
LGTM.
https://github.com/llvm/llvm-project/pull/152830
[llvm-branch-commits] [llvm] Backport #164873 and #166067 to `release/21.x` (PR #166409)
https://github.com/nga888 closed
https://github.com/llvm/llvm-project/pull/166409
[llvm-branch-commits] [llvm] Backport #164873 and #166067 to `release/21.x` (PR #166409)
nga888 wrote:
> Hi @nga888, at this point in the release cycle we are only accepting fixes
> for regressions or major bug fixes. This only seems to be a performance
> improvement in a very specific scenario, so it does not seem to meet the bar,
> plus I am unsure how risky taking these two changes might be for the release
> branch. Can you provide some detail on why you feel we should incorporate
> these changes into the release branch at this time?

Yes, I forgot that these "final" point releases are only for regressions or bug fixes. Both these changes are NFC but you're right they don't really meet the criteria. Will have to wait for the improvement in the next release. Thanks!
https://github.com/llvm/llvm-project/pull/166409
[llvm-branch-commits] [llvm] [WIP] Handle guard insertion in callbacks to OpenMP runtime functions. (PR #164655)
https://github.com/abidh updated
https://github.com/llvm/llvm-project/pull/164655
>From 56037a64dbd5f73d2c020dd5d58d2c99758b35d0 Mon Sep 17 00:00:00 2001
From: Abid Qadeer
Date: Tue, 21 Oct 2025 20:53:46 +0100
Subject: [PATCH 01/12] Add callback metadata to runtime functions which take
callbacks.
---
llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp | 25
.../Frontend/OpenMPIRBuilderTest.cpp | 58 +++
2 files changed, 83 insertions(+)
diff --git a/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp b/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
index c164d32f8f98c..312e119c4280d 100644
--- a/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
+++ b/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
@@ -750,6 +750,31 @@ OpenMPIRBuilder::getOrCreateRuntimeFunction(Module &M, RuntimeFunction FnID) {
*MDNode::get(Ctx, {MDB.createCallbackEncoding(
2, {-1, -1}, /* VarArgsArePassed */ true)}));
}
+
+} else if (FnID == OMPRTL___kmpc_distribute_static_loop_4 ||
+ FnID == OMPRTL___kmpc_distribute_static_loop_4u ||
+ FnID == OMPRTL___kmpc_distribute_static_loop_8 ||
+ FnID == OMPRTL___kmpc_distribute_static_loop_8u ||
+ FnID == OMPRTL___kmpc_distribute_for_static_loop_4 ||
+ FnID == OMPRTL___kmpc_distribute_for_static_loop_4u ||
+ FnID == OMPRTL___kmpc_distribute_for_static_loop_8 ||
+ FnID == OMPRTL___kmpc_distribute_for_static_loop_8u ||
+ FnID == OMPRTL___kmpc_for_static_loop_4 ||
+ FnID == OMPRTL___kmpc_for_static_loop_4u ||
+ FnID == OMPRTL___kmpc_for_static_loop_8 ||
+ FnID == OMPRTL___kmpc_for_static_loop_8u) {
+ if (!Fn->hasMetadata(LLVMContext::MD_callback)) {
+LLVMContext &Ctx = Fn->getContext();
+MDBuilder MDB(Ctx);
+// Annotate the callback behavior of the runtime function:
+// - The callback callee is argument number 1.
+// - The first argument of the callback callee is unknown (-1).
+// - The second argument of the callback callee is argument number 2
+Fn->addMetadata(
+LLVMContext::MD_callback,
+*MDNode::get(Ctx, {MDB.createCallbackEncoding(
+ 1, {-1, 2}, /* VarArgsArePassed */ false)}));
+ }
}
LLVM_DEBUG(dbgs() << "Created OpenMP runtime function " << Fn->getName()
diff --git a/llvm/unittests/Frontend/OpenMPIRBuilderTest.cpp b/llvm/unittests/Frontend/OpenMPIRBuilderTest.cpp
index d231a778a8a97..aca2153f85c26 100644
--- a/llvm/unittests/Frontend/OpenMPIRBuilderTest.cpp
+++ b/llvm/unittests/Frontend/OpenMPIRBuilderTest.cpp
@@ -7957,4 +7957,62 @@ TEST_F(OpenMPIRBuilderTest, spliceBBWithEmptyBB) {
EXPECT_FALSE(Terminator->getDbgRecordRange().empty());
}
+TEST_F(OpenMPIRBuilderTest, callBackFunctions) {
+ OpenMPIRBuilder OMPBuilder(*M);
+ OMPBuilder.Config.IsTargetDevice = true;
+ OMPBuilder.initialize();
+
+ // Test multiple runtime functions that should have callback metadata
+ std::vector<RuntimeFunction> CallbackFunctions = {
+OMPRTL___kmpc_distribute_static_loop_4,
+OMPRTL___kmpc_distribute_static_loop_4u,
+OMPRTL___kmpc_distribute_static_loop_8,
+OMPRTL___kmpc_distribute_static_loop_8u,
+OMPRTL___kmpc_distribute_for_static_loop_4,
+OMPRTL___kmpc_distribute_for_static_loop_4u,
+OMPRTL___kmpc_distribute_for_static_loop_8,
+OMPRTL___kmpc_distribute_for_static_loop_8u,
+OMPRTL___kmpc_for_static_loop_4,
+OMPRTL___kmpc_for_static_loop_4u,
+OMPRTL___kmpc_for_static_loop_8,
+OMPRTL___kmpc_for_static_loop_8u
+ };
+
+ for (RuntimeFunction RF : CallbackFunctions) {
+Function *Fn = OMPBuilder.getOrCreateRuntimeFunctionPtr(RF);
+ASSERT_NE(Fn, nullptr) << "Function should exist for runtime function";
+
+MDNode *CallbackMD = Fn->getMetadata(LLVMContext::MD_callback);
+EXPECT_NE(CallbackMD, nullptr) << "Function should have callback metadata";
+
+if (CallbackMD) {
+ // Should have at least one callback
+ EXPECT_GE(CallbackMD->getNumOperands(), 1U);
+
+ // Test first callback entry
+ MDNode *FirstCallback = cast<MDNode>(CallbackMD->getOperand(0));
+ EXPECT_EQ(FirstCallback->getNumOperands(), 4U);
+
+ // Callee index should be valid
+ auto *CalleeIdxCM = cast<ConstantAsMetadata>(FirstCallback->getOperand(0));
+ uint64_t CalleeIdx = cast<ConstantInt>(CalleeIdxCM->getValue())->getZExtValue();
+ EXPECT_EQ(CalleeIdx, 1u);
+
+ // Verify payload arguments are (-1, 2)
+ auto *Arg0CM = cast<ConstantAsMetadata>(FirstCallback->getOperand(1));
+ int64_t Arg0 = cast<ConstantInt>(Arg0CM->getValue())->getSExtValue();
+ EXPECT_EQ(Arg0, -1);
+ auto *Arg1CM = cast<ConstantAsMetadata>(FirstCallback->getOperand(2));
+ int64_t Arg1 = cast<ConstantInt>(Arg1CM->getValue())->getSExtValue();
+ EXPECT_EQ(Arg1, 2);
+
+ // Verify the varArgs is false.
+ auto *VarArgCM = cast<ConstantAsMetadata>(FirstCallback->getOperand(3));
+ uint64_t Var
[llvm-branch-commits] [libc] [llvm] [libc][math] Refactor expm1f implementation to header-only in src/__support/math folder. (PR #162131)
https://github.com/bassiounix updated
https://github.com/llvm/llvm-project/pull/162131
>From 3d04ebf27d1d4341ed4972c3720d85919d81ca24 Mon Sep 17 00:00:00 2001
From: bassiounix
Date: Mon, 6 Oct 2025 20:55:27 +0300
Subject: [PATCH] [libc][math] Refactor expm1f implementation to header-only in
src/__support/math folder.
---
libc/shared/math.h| 1 +
libc/shared/math/expm1f.h | 23 +++
libc/src/__support/math/CMakeLists.txt| 17 ++
libc/src/__support/math/expm1f.h | 182 ++
libc/src/math/generic/CMakeLists.txt | 11 +-
libc/src/math/generic/expm1f.cpp | 162 +---
libc/test/shared/CMakeLists.txt | 1 +
libc/test/shared/shared_math_test.cpp | 1 +
.../llvm-project-overlay/libc/BUILD.bazel | 24 ++-
9 files changed, 244 insertions(+), 178 deletions(-)
create mode 100644 libc/shared/math/expm1f.h
create mode 100644 libc/src/__support/math/expm1f.h
diff --git a/libc/shared/math.h b/libc/shared/math.h
index 2db1d5501b7b3..70c6d375c22de 100644
--- a/libc/shared/math.h
+++ b/libc/shared/math.h
@@ -55,6 +55,7 @@
#include "math/expf.h"
#include "math/expf16.h"
#include "math/expm1.h"
+#include "math/expm1f.h"
#include "math/frexpf.h"
#include "math/frexpf128.h"
#include "math/frexpf16.h"
diff --git a/libc/shared/math/expm1f.h b/libc/shared/math/expm1f.h
new file mode 100644
index 0..e0cf6a846f116
--- /dev/null
+++ b/libc/shared/math/expm1f.h
@@ -0,0 +1,23 @@
+//===-- Shared expm1f function --*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===--===//
+
+#ifndef LLVM_LIBC_SHARED_MATH_EXPM1F_H
+#define LLVM_LIBC_SHARED_MATH_EXPM1F_H
+
+#include "shared/libc_common.h"
+#include "src/__support/math/expm1f.h"
+
+namespace LIBC_NAMESPACE_DECL {
+namespace shared {
+
+using math::expm1f;
+
+} // namespace shared
+} // namespace LIBC_NAMESPACE_DECL
+
+#endif // LLVM_LIBC_SHARED_MATH_EXPM1F_H
diff --git a/libc/src/__support/math/CMakeLists.txt b/libc/src/__support/math/CMakeLists.txt
index cc7719e1d98fc..924b2e29abc56 100644
--- a/libc/src/__support/math/CMakeLists.txt
+++ b/libc/src/__support/math/CMakeLists.txt
@@ -890,6 +890,23 @@ add_header_library(
libc.src.errno.errno
)
+add_header_library(
+ expm1f
+ HDRS
+expm1f.h
+ DEPENDS
+.common_constants
+libc.src.__support.FPUtil.basic_operations
+libc.src.__support.FPUtil.fenv_impl
+libc.src.__support.FPUtil.fp_bits
+libc.src.__support.FPUtil.multiply_add
+libc.src.__support.FPUtil.nearest_integer
+libc.src.__support.FPUtil.polyeval
+libc.src.__support.FPUtil.rounding_mode
+libc.src.__support.macros.optimization
+libc.src.errno.errno
+)
+
add_header_library(
range_reduction_double
HDRS
diff --git a/libc/src/__support/math/expm1f.h b/libc/src/__support/math/expm1f.h
new file mode 100644
index 0..43e79ae3112dc
--- /dev/null
+++ b/libc/src/__support/math/expm1f.h
@@ -0,0 +1,182 @@
+//===-- Implementation header for expm1f *- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===--===//
+
+#ifndef LLVM_LIBC_SRC___SUPPORT_MATH_EXPM1F_H
+#define LLVM_LIBC_SRC___SUPPORT_MATH_EXPM1F_H
+
+#include "common_constants.h" // Lookup tables EXP_M1 and EXP_M2.
+#include "src/__support/FPUtil/BasicOperations.h"
+#include "src/__support/FPUtil/FEnvImpl.h"
+#include "src/__support/FPUtil/FMA.h"
+#include "src/__support/FPUtil/FPBits.h"
+#include "src/__support/FPUtil/PolyEval.h"
+#include "src/__support/FPUtil/multiply_add.h"
+#include "src/__support/FPUtil/nearest_integer.h"
+#include "src/__support/FPUtil/rounding_mode.h"
+#include "src/__support/common.h"
+#include "src/__support/macros/config.h"
+#include "src/__support/macros/optimization.h" // LIBC_UNLIKELY
+#include "src/__support/macros/properties/cpu_features.h" // LIBC_TARGET_CPU_HAS_FMA
+
+namespace LIBC_NAMESPACE_DECL {
+
+namespace math {
+
+LIBC_INLINE static constexpr float expm1f(float x) {
+ using namespace common_constants_internal;
+ using FPBits = typename fputil::FPBits<float>;
+ FPBits xbits(x);
+
+ uint32_t x_u = xbits.uintval();
+ uint32_t x_abs = x_u & 0x7fff'ffffU;
+
+#ifndef LIBC_MATH_HAS_SKIP_ACCURATE_PASS
+ // Exceptional value
+ if (LIBC_UNLIKELY(x_u == 0x3e35'bec5U)) { // x = 0x1.6b7d8ap-3f
+int round_mode = fputil::quick_get_round();
+if (round_mode == FE_TONEAREST || round_mode == FE_UPWARD)
+ re
[llvm-branch-commits] [libc] [llvm] [libc][math] Refactor expm1f16 implementation to header-only in src/__support/math folder. (PR #162132)
https://github.com/bassiounix updated
https://github.com/llvm/llvm-project/pull/162132
>From 0f507adf2a7fe0ce8bc7b2bb866fbe0fc7022045 Mon Sep 17 00:00:00 2001
From: bassiounix
Date: Mon, 6 Oct 2025 21:54:20 +0300
Subject: [PATCH] [libc][math] Refactor expm1f16 implementation to header-only
in src/__support/math folder.
---
libc/shared/math.h| 1 +
libc/shared/math/expm1f16.h | 29
libc/src/__support/math/CMakeLists.txt| 16 ++
libc/src/__support/math/expm1f16.h| 153 ++
libc/src/math/generic/CMakeLists.txt | 12 +-
libc/src/math/generic/expm1f16.cpp| 130 +--
libc/test/shared/CMakeLists.txt | 1 +
libc/test/shared/shared_math_test.cpp | 1 +
.../llvm-project-overlay/libc/BUILD.bazel | 18 ++-
9 files changed, 221 insertions(+), 140 deletions(-)
create mode 100644 libc/shared/math/expm1f16.h
create mode 100644 libc/src/__support/math/expm1f16.h
diff --git a/libc/shared/math.h b/libc/shared/math.h
index 70c6d375c22de..874c2c0779adb 100644
--- a/libc/shared/math.h
+++ b/libc/shared/math.h
@@ -56,6 +56,7 @@
#include "math/expf16.h"
#include "math/expm1.h"
#include "math/expm1f.h"
+#include "math/expm1f16.h"
#include "math/frexpf.h"
#include "math/frexpf128.h"
#include "math/frexpf16.h"
diff --git a/libc/shared/math/expm1f16.h b/libc/shared/math/expm1f16.h
new file mode 100644
index 0..5698400d7066a
--- /dev/null
+++ b/libc/shared/math/expm1f16.h
@@ -0,0 +1,29 @@
+//===-- Shared expm1f16 function *- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===--===//
+
+#ifndef LLVM_LIBC_SHARED_MATH_EXPM1F16_H
+#define LLVM_LIBC_SHARED_MATH_EXPM1F16_H
+
+#include "include/llvm-libc-macros/float16-macros.h"
+#include "shared/libc_common.h"
+
+#ifdef LIBC_TYPES_HAS_FLOAT16
+
+#include "src/__support/math/expm1f16.h"
+
+namespace LIBC_NAMESPACE_DECL {
+namespace shared {
+
+using math::expm1f16;
+
+} // namespace shared
+} // namespace LIBC_NAMESPACE_DECL
+
+#endif // LIBC_TYPES_HAS_FLOAT16
+
+#endif // LLVM_LIBC_SHARED_MATH_EXPM1F16_H
diff --git a/libc/src/__support/math/CMakeLists.txt b/libc/src/__support/math/CMakeLists.txt
index 924b2e29abc56..effded3c2bfb2 100644
--- a/libc/src/__support/math/CMakeLists.txt
+++ b/libc/src/__support/math/CMakeLists.txt
@@ -907,6 +907,22 @@ add_header_library(
libc.src.errno.errno
)
+add_header_library(
+ expm1f16
+ HDRS
+expm1f16.h
+ DEPENDS
+.expxf16_utils
+libc.src.__support.FPUtil.cast
+libc.src.__support.FPUtil.except_value_utils
+libc.src.__support.FPUtil.fenv_impl
+libc.src.__support.FPUtil.fp_bits
+libc.src.__support.FPUtil.multiply_add
+libc.src.__support.FPUtil.polyeval
+libc.src.__support.FPUtil.rounding_mode
+libc.src.__support.macros.optimization
+)
+
add_header_library(
range_reduction_double
HDRS
diff --git a/libc/src/__support/math/expm1f16.h b/libc/src/__support/math/expm1f16.h
new file mode 100644
index 0..79547b62b0892
--- /dev/null
+++ b/libc/src/__support/math/expm1f16.h
@@ -0,0 +1,153 @@
+//===-- Implementation header for expm1f16 --*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===--===//
+
+#ifndef LLVM_LIBC_SRC___SUPPORT_MATH_EXPM1F16_H
+#define LLVM_LIBC_SRC___SUPPORT_MATH_EXPM1F16_H
+
+#include "include/llvm-libc-macros/float16-macros.h"
+
+#ifdef LIBC_TYPES_HAS_FLOAT16
+
+#include "src/__support/FPUtil/FEnvImpl.h"
+#include "src/__support/FPUtil/FPBits.h"
+#include "src/__support/FPUtil/PolyEval.h"
+#include "src/__support/FPUtil/cast.h"
+#include "src/__support/FPUtil/except_value_utils.h"
+#include "src/__support/FPUtil/multiply_add.h"
+#include "src/__support/FPUtil/rounding_mode.h"
+#include "src/__support/common.h"
+#include "src/__support/macros/config.h"
+#include "src/__support/macros/optimization.h"
+#include "src/__support/math/expxf16_utils.h"
+
+namespace LIBC_NAMESPACE_DECL {
+
+namespace math {
+
+LIBC_INLINE static constexpr float16 expm1f16(float16 x) {
+#ifndef LIBC_MATH_HAS_SKIP_ACCURATE_PASS
+ constexpr fputil::ExceptValues<float16, 1> EXPM1F16_EXCEPTS_LO = {{
+ // (input, RZ output, RU offset, RD offset, RN offset)
+ // x = 0x1.564p-5, expm1f16(x) = 0x1.5d4p-5 (RZ)
+ {0x2959U, 0x2975U, 1U, 0U, 1U},
+ }};
+
+#ifdef LIBC_TARGET_CPU_HAS_FMA_FLOAT
+ constexpr size_t N_EXPM1F16_EXCEPTS_HI = 2;
+#else
+ constexpr size_t N_EXPM1F16_EXCEPTS_HI =
[llvm-branch-commits] [libc] [llvm] [libc][math] Refactor fma implementation to header-only in src/__support/math folder. (PR #163968)
https://github.com/bassiounix updated
https://github.com/llvm/llvm-project/pull/163968
>From 8ec6dc139887826777dd290f3d5cddf85591629f Mon Sep 17 00:00:00 2001
From: bassiounix
Date: Fri, 17 Oct 2025 17:02:59 +0300
Subject: [PATCH] [libc][math] Refactor fma implementation to header-only in
src/__support/math folder.
---
libc/shared/math.h| 1 +
libc/shared/math/fma.h| 23
libc/src/__support/math/CMakeLists.txt| 8 ++
libc/src/__support/math/fma.h | 27 +++
libc/src/math/generic/CMakeLists.txt | 2 +-
libc/src/math/generic/fma.cpp | 7 ++---
libc/test/shared/CMakeLists.txt | 1 +
libc/test/shared/shared_math_test.cpp | 1 +
.../llvm-project-overlay/libc/BUILD.bazel | 14 +++---
9 files changed, 75 insertions(+), 9 deletions(-)
create mode 100644 libc/shared/math/fma.h
create mode 100644 libc/src/__support/math/fma.h
diff --git a/libc/shared/math.h b/libc/shared/math.h
index 874c2c0779adb..79ba2ea5aa6ff 100644
--- a/libc/shared/math.h
+++ b/libc/shared/math.h
@@ -57,6 +57,7 @@
#include "math/expm1.h"
#include "math/expm1f.h"
#include "math/expm1f16.h"
+#include "math/fma.h"
#include "math/frexpf.h"
#include "math/frexpf128.h"
#include "math/frexpf16.h"
diff --git a/libc/shared/math/fma.h b/libc/shared/math/fma.h
new file mode 100644
index 0..82f1dac61dda2
--- /dev/null
+++ b/libc/shared/math/fma.h
@@ -0,0 +1,23 @@
+//===-- Shared fma function -*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===--===//
+
+#ifndef LLVM_LIBC_SHARED_MATH_FMA_H
+#define LLVM_LIBC_SHARED_MATH_FMA_H
+
+#include "shared/libc_common.h"
+#include "src/__support/math/fma.h"
+
+namespace LIBC_NAMESPACE_DECL {
+namespace shared {
+
+using math::fma;
+
+} // namespace shared
+} // namespace LIBC_NAMESPACE_DECL
+
+#endif // LLVM_LIBC_SHARED_MATH_FMA_H
diff --git a/libc/src/__support/math/CMakeLists.txt b/libc/src/__support/math/CMakeLists.txt
index effded3c2bfb2..76dc425caef5b 100644
--- a/libc/src/__support/math/CMakeLists.txt
+++ b/libc/src/__support/math/CMakeLists.txt
@@ -593,6 +593,14 @@ add_header_library(
libc.src.__support.math.exp10_float16_constants
)
+add_header_library(
+ fma
+ HDRS
+fma.h
+ DEPENDS
+libc.src.__support.FPUtil.fma
+)
+
add_header_library(
frexpf128
HDRS
diff --git a/libc/src/__support/math/fma.h b/libc/src/__support/math/fma.h
new file mode 100644
index 0..d996610167a19
--- /dev/null
+++ b/libc/src/__support/math/fma.h
@@ -0,0 +1,27 @@
+//===-- Implementation header for fma ---*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===--===//
+
+#ifndef LLVM_LIBC_SRC___SUPPORT_MATH_FMA_H
+#define LLVM_LIBC_SRC___SUPPORT_MATH_FMA_H
+
+#include "src/__support/FPUtil/FMA.h"
+#include "src/__support/macros/config.h"
+
+namespace LIBC_NAMESPACE_DECL {
+
+namespace math {
+
+LIBC_INLINE static double fma(double x, double y, double z) {
+ return fputil::fma(x, y, z);
+}
+
+} // namespace math
+
+} // namespace LIBC_NAMESPACE_DECL
+
+#endif // LLVM_LIBC_SRC___SUPPORT_MATH_FMA_H
diff --git a/libc/src/math/generic/CMakeLists.txt b/libc/src/math/generic/CMakeLists.txt
index 08e16711e00af..fdd5c3b329192 100644
--- a/libc/src/math/generic/CMakeLists.txt
+++ b/libc/src/math/generic/CMakeLists.txt
@@ -4696,7 +4696,7 @@ add_entrypoint_object(
HDRS
../fma.h
DEPENDS
-libc.src.__support.FPUtil.fma
+libc.src.__support.math.fma
)
add_entrypoint_object(
diff --git a/libc/src/math/generic/fma.cpp b/libc/src/math/generic/fma.cpp
index 2ea4ae9961150..3ccdb78846e34 100644
--- a/libc/src/math/generic/fma.cpp
+++ b/libc/src/math/generic/fma.cpp
@@ -7,15 +7,12 @@
//===--===//
#include "src/math/fma.h"
-#include "src/__support/common.h"
-
-#include "src/__support/FPUtil/FMA.h"
-#include "src/__support/macros/config.h"
+#include "src/__support/math/fma.h"
namespace LIBC_NAMESPACE_DECL {
LLVM_LIBC_FUNCTION(double, fma, (double x, double y, double z)) {
- return fputil::fma(x, y, z);
+ return math::fma(x, y, z);
}
} // namespace LIBC_NAMESPACE_DECL
diff --git a/libc/test/shared/CMakeLists.txt b/libc/test/shared/CMakeLists.txt
index bfcac7884e646..cd4b5ec75f876 100644
--- a/libc/test/shared/CMakeLists.txt
+++ b/libc/test/shared/CMakeLists.txt
@@ -53,6 +53
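The layering this patch introduces (a low-level primitive, an inline `math::` wrapper in a header, a `shared::` using-declaration that re-exports it, and a thin entrypoint that forwards to the wrapper) can be sketched in a self-contained form. Everything below is a hypothetical illustration: the `*_sketch` namespaces and `fma_entrypoint` are invented names, and `std::fma` stands in for `fputil::fma`; none of it is the LLVM libc code itself.

```cpp
#include <cmath>

// Stand-in for the low-level primitive layer (hypothetical; the patch uses
// fputil::fma from src/__support/FPUtil/FMA.h).
namespace fputil_sketch {
template <typename T> inline T fma(T x, T y, T z) { return std::fma(x, y, z); }
} // namespace fputil_sketch

// Header-only implementation layer: an inline wrapper that any translation
// unit can include without linking against the entrypoint object.
namespace math_sketch {
inline double fma(double x, double y, double z) {
  return fputil_sketch::fma(x, y, z);
}
} // namespace math_sketch

// Shared layer: re-export the implementation under a public namespace.
namespace shared_sketch {
using math_sketch::fma;
} // namespace shared_sketch

// Entrypoint layer: the out-of-line symbol simply forwards to the header.
double fma_entrypoint(double x, double y, double z) {
  return math_sketch::fma(x, y, z);
}
```

In this arrangement the entrypoint and the shared re-export both resolve to the same inline function, so the two paths cannot drift apart.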
[llvm-branch-commits] [libc] [llvm] [libc][math] Refactor fmaf implementation to header-only in src/__support/math folder. (PR #163970)
https://github.com/bassiounix updated
https://github.com/llvm/llvm-project/pull/163970
>From cf9e52117e957292b09cefd66e8163e8ed8aeff8 Mon Sep 17 00:00:00 2001
From: bassiounix
Date: Fri, 17 Oct 2025 17:19:30 +0300
Subject: [PATCH] [libc][math] Refactor fmaf implementation to header-only in
src/__support/math folder.
---
libc/shared/math.h| 1 +
libc/shared/math/fmaf.h | 23
libc/src/__support/math/CMakeLists.txt| 8 ++
libc/src/__support/math/fmaf.h| 27 +++
libc/src/math/generic/CMakeLists.txt | 2 +-
libc/src/math/generic/fmaf.cpp| 7 ++---
libc/test/shared/CMakeLists.txt | 1 +
libc/test/shared/shared_math_test.cpp | 2 +-
.../llvm-project-overlay/libc/BUILD.bazel | 10 ++-
9 files changed, 73 insertions(+), 8 deletions(-)
create mode 100644 libc/shared/math/fmaf.h
create mode 100644 libc/src/__support/math/fmaf.h
diff --git a/libc/shared/math.h b/libc/shared/math.h
index 79ba2ea5aa6ff..2198ffdc74459 100644
--- a/libc/shared/math.h
+++ b/libc/shared/math.h
@@ -58,6 +58,7 @@
#include "math/expm1f.h"
#include "math/expm1f16.h"
#include "math/fma.h"
+#include "math/fmaf.h"
#include "math/frexpf.h"
#include "math/frexpf128.h"
#include "math/frexpf16.h"
diff --git a/libc/shared/math/fmaf.h b/libc/shared/math/fmaf.h
new file mode 100644
index 0..eef75b05d4a18
--- /dev/null
+++ b/libc/shared/math/fmaf.h
@@ -0,0 +1,23 @@
+//===-- Shared fmaf function *- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===--===//
+
+#ifndef LLVM_LIBC_SHARED_MATH_FMAF_H
+#define LLVM_LIBC_SHARED_MATH_FMAF_H
+
+#include "shared/libc_common.h"
+#include "src/__support/math/fmaf.h"
+
+namespace LIBC_NAMESPACE_DECL {
+namespace shared {
+
+using math::fmaf;
+
+} // namespace shared
+} // namespace LIBC_NAMESPACE_DECL
+
+#endif // LLVM_LIBC_SHARED_MATH_FMAF_H
diff --git a/libc/src/__support/math/CMakeLists.txt b/libc/src/__support/math/CMakeLists.txt
index 76dc425caef5b..09625a16d86e8 100644
--- a/libc/src/__support/math/CMakeLists.txt
+++ b/libc/src/__support/math/CMakeLists.txt
@@ -601,6 +601,14 @@ add_header_library(
libc.src.__support.FPUtil.fma
)
+add_header_library(
+ fmaf
+ HDRS
+fmaf.h
+ DEPENDS
+libc.src.__support.FPUtil.fma
+)
+
add_header_library(
frexpf128
HDRS
diff --git a/libc/src/__support/math/fmaf.h b/libc/src/__support/math/fmaf.h
new file mode 100644
index 0..f5b893873e885
--- /dev/null
+++ b/libc/src/__support/math/fmaf.h
@@ -0,0 +1,27 @@
+//===-- Implementation header for fmaf --*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===--===//
+
+#ifndef LLVM_LIBC_SRC___SUPPORT_MATH_FMAF_H
+#define LLVM_LIBC_SRC___SUPPORT_MATH_FMAF_H
+
+#include "src/__support/FPUtil/FMA.h"
+#include "src/__support/macros/config.h"
+
+namespace LIBC_NAMESPACE_DECL {
+
+namespace math {
+
+LIBC_INLINE static float fmaf(float x, float y, float z) {
+ return fputil::fma(x, y, z);
+}
+
+} // namespace math
+
+} // namespace LIBC_NAMESPACE_DECL
+
+#endif // LLVM_LIBC_SRC___SUPPORT_MATH_FMAF_H
diff --git a/libc/src/math/generic/CMakeLists.txt b/libc/src/math/generic/CMakeLists.txt
index fdd5c3b329192..c1a7027a13a88 100644
--- a/libc/src/math/generic/CMakeLists.txt
+++ b/libc/src/math/generic/CMakeLists.txt
@@ -4675,7 +4675,7 @@ add_entrypoint_object(
HDRS
../fmaf.h
DEPENDS
-libc.src.__support.FPUtil.fma
+libc.src.__support.math.fmaf
)
add_entrypoint_object(
diff --git a/libc/src/math/generic/fmaf.cpp b/libc/src/math/generic/fmaf.cpp
index dad85b4e9d1e6..069c7c4605da7 100644
--- a/libc/src/math/generic/fmaf.cpp
+++ b/libc/src/math/generic/fmaf.cpp
@@ -7,15 +7,12 @@
//===--===//
#include "src/math/fmaf.h"
-#include "src/__support/common.h"
-
-#include "src/__support/FPUtil/FMA.h"
-#include "src/__support/macros/config.h"
+#include "src/__support/math/fmaf.h"
namespace LIBC_NAMESPACE_DECL {
LLVM_LIBC_FUNCTION(float, fmaf, (float x, float y, float z)) {
- return fputil::fma(x, y, z);
+ return math::fmaf(x, y, z);
}
} // namespace LIBC_NAMESPACE_DECL
diff --git a/libc/test/shared/CMakeLists.txt b/libc/test/shared/CMakeLists.txt
index cd4b5ec75f876..f5c42c145c318 100644
--- a/libc/test/shared/CMakeLists.txt
+++ b/libc/test/shared/CMakeLists.txt
@@ -54,6 +5
[llvm-branch-commits] [libc] [llvm] [libc][math] Refactor fmaf16 implementation to header-only in src/__support/math folder. (PR #163977)
https://github.com/bassiounix updated
https://github.com/llvm/llvm-project/pull/163977
>From 389fba5d8c64e1625b02e412ed64e04905f12708 Mon Sep 17 00:00:00 2001
From: bassiounix
Date: Fri, 17 Oct 2025 18:09:02 +0300
Subject: [PATCH] [libc][math] Refactor fmaf16 implementation to header-only in
src/__support/math folder.
---
libc/shared/math.h| 1 +
libc/shared/math/fmaf16.h | 29
libc/src/__support/math/CMakeLists.txt| 8 +
libc/src/__support/math/fmaf16.h | 33 +++
libc/src/math/generic/CMakeLists.txt | 3 +-
libc/src/math/generic/fmaf16.cpp | 6 ++--
libc/test/shared/CMakeLists.txt | 1 +
libc/test/shared/shared_math_test.cpp | 3 +-
.../llvm-project-overlay/libc/BUILD.bazel | 10 +-
9 files changed, 86 insertions(+), 8 deletions(-)
create mode 100644 libc/shared/math/fmaf16.h
create mode 100644 libc/src/__support/math/fmaf16.h
diff --git a/libc/shared/math.h b/libc/shared/math.h
index 2198ffdc74459..050b67f0efb5f 100644
--- a/libc/shared/math.h
+++ b/libc/shared/math.h
@@ -59,6 +59,7 @@
#include "math/expm1f16.h"
#include "math/fma.h"
#include "math/fmaf.h"
+#include "math/fmaf16.h"
#include "math/frexpf.h"
#include "math/frexpf128.h"
#include "math/frexpf16.h"
diff --git a/libc/shared/math/fmaf16.h b/libc/shared/math/fmaf16.h
new file mode 100644
index 0..a6da1bfa58a3a
--- /dev/null
+++ b/libc/shared/math/fmaf16.h
@@ -0,0 +1,29 @@
+//===-- Shared fmaf16 function --*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===--===//
+
+#ifndef LLVM_LIBC_SHARED_MATH_FMAF16_H
+#define LLVM_LIBC_SHARED_MATH_FMAF16_H
+
+#include "include/llvm-libc-macros/float16-macros.h"
+#include "shared/libc_common.h"
+
+#ifdef LIBC_TYPES_HAS_FLOAT16
+
+#include "src/__support/math/fmaf16.h"
+
+namespace LIBC_NAMESPACE_DECL {
+namespace shared {
+
+using math::fmaf16;
+
+} // namespace shared
+} // namespace LIBC_NAMESPACE_DECL
+
+#endif // LIBC_TYPES_HAS_FLOAT16
+
+#endif // LLVM_LIBC_SHARED_MATH_FMAF16_H
diff --git a/libc/src/__support/math/CMakeLists.txt b/libc/src/__support/math/CMakeLists.txt
index 09625a16d86e8..68561cc5e7cb8 100644
--- a/libc/src/__support/math/CMakeLists.txt
+++ b/libc/src/__support/math/CMakeLists.txt
@@ -609,6 +609,14 @@ add_header_library(
libc.src.__support.FPUtil.fma
)
+add_header_library(
+ fmaf16
+ HDRS
+fmaf16.h
+ DEPENDS
+libc.src.__support.FPUtil.fma
+)
+
add_header_library(
frexpf128
HDRS
diff --git a/libc/src/__support/math/fmaf16.h b/libc/src/__support/math/fmaf16.h
new file mode 100644
index 0..e15a646190ad2
--- /dev/null
+++ b/libc/src/__support/math/fmaf16.h
@@ -0,0 +1,33 @@
+//===-- Implementation header for fmaf16 --*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===--===//
+
+#ifndef LLVM_LIBC_SRC___SUPPORT_MATH_FMAF16_H
+#define LLVM_LIBC_SRC___SUPPORT_MATH_FMAF16_H
+
+#include "include/llvm-libc-macros/float16-macros.h"
+
+#ifdef LIBC_TYPES_HAS_FLOAT16
+
+#include "src/__support/FPUtil/FMA.h"
+#include "src/__support/macros/config.h"
+
+namespace LIBC_NAMESPACE_DECL {
+
+namespace math {
+
+LIBC_INLINE static float16 fmaf16(float16 x, float16 y, float16 z) {
+ return fputil::fma(x, y, z);
+}
+
+} // namespace math
+
+} // namespace LIBC_NAMESPACE_DECL
+
+#endif // LIBC_TYPES_HAS_FLOAT16
+
+#endif // LLVM_LIBC_SRC___SUPPORT_MATH_FMAF16_H
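
Note on the header above: math::fmaf16 forwards to fputil::fma, i.e. a fused multiply-add that rounds once. A minimal standalone sketch of that semantic follows. It is an illustration only: it uses std::fma on float from the C++ standard library, not the libc float16 path, and the helper names mul_then_add and fused are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Unfused reference: the product is rounded to float first, then the sum is
// rounded again. The volatile store keeps the compiler from contracting the
// two operations into a single fused instruction.
float mul_then_add(float x, float y, float z) {
  volatile float product = x * y;
  return product + z;
}

// Fused version: x * y + z is computed as if to infinite precision and
// rounded once. std::fma stands in here for the fputil::fma call above.
float fused(float x, float y, float z) { return std::fma(x, y, z); }
```

With x = y = 1 + 2^-12 and z = -(1 + 2^-11), the exact product is 1 + 2^-11 + 2^-24. The low term is lost when the product is rounded to float, so mul_then_add returns 0, while fused keeps it and returns 2^-24.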
diff --git a/libc/src/math/generic/CMakeLists.txt
b/libc/src/math/generic/CMakeLists.txt
index c1a7027a13a88..fed29c645a72c 100644
--- a/libc/src/math/generic/CMakeLists.txt
+++ b/libc/src/math/generic/CMakeLists.txt
@@ -4685,8 +4685,7 @@ add_entrypoint_object(
HDRS
../fmaf16.h
DEPENDS
-libc.src.__support.FPUtil.fma
-libc.src.__support.macros.properties.types
+libc.src.__support.math.fmaf16
)
add_entrypoint_object(
diff --git a/libc/src/math/generic/fmaf16.cpp b/libc/src/math/generic/fmaf16.cpp
index 4f712f5de764f..e21240900c2f5 100644
--- a/libc/src/math/generic/fmaf16.cpp
+++ b/libc/src/math/generic/fmaf16.cpp
@@ -7,14 +7,12 @@
//===--===//
#include "src/math/fmaf16.h"
-#include "src/__support/FPUtil/FMA.h"
-#include "src/__support/common.h"
-#include "src/__support/macros/config.h"
+#include "src/__support/math/fmaf16.h"
namespace LIBC_NAMESPACE_DECL {
LLVM_LIBC_FUNCTION
[llvm-branch-commits] [llvm] ec0dd66 - Revert "[utils][UpdateLLCTestChecks] Add MIR support to update_llc_test_check…"
Author: Valery Pykhtin
Date: 2025-11-05T13:58:56+01:00
New Revision: ec0dd66aba8536fd8db2f4b1fb8454ca42015425
URL:
https://github.com/llvm/llvm-project/commit/ec0dd66aba8536fd8db2f4b1fb8454ca42015425
DIFF:
https://github.com/llvm/llvm-project/commit/ec0dd66aba8536fd8db2f4b1fb8454ca42015425.diff
LOG: Revert "[utils][UpdateLLCTestChecks] Add MIR support to
update_llc_test_check…"
This reverts commit c782ed3440b5a1565428db9731504fd1c4c2a9a9.
Added:
Modified:
llvm/utils/UpdateTestChecks/common.py
llvm/utils/UpdateTestChecks/mir.py
llvm/utils/update_llc_test_checks.py
Removed:
llvm/test/tools/UpdateTestChecks/update_llc_test_checks/Inputs/x86_asm_mir_mixed.ll
llvm/test/tools/UpdateTestChecks/update_llc_test_checks/Inputs/x86_asm_mir_mixed.ll.expected
llvm/test/tools/UpdateTestChecks/update_llc_test_checks/Inputs/x86_asm_mir_same_prefix.ll
llvm/test/tools/UpdateTestChecks/update_llc_test_checks/Inputs/x86_asm_mir_same_prefix.ll.expected
llvm/test/tools/UpdateTestChecks/update_llc_test_checks/x86-asm-mir-mixed.test
llvm/test/tools/UpdateTestChecks/update_llc_test_checks/x86-asm-mir-same-prefix.test
diff --git
a/llvm/test/tools/UpdateTestChecks/update_llc_test_checks/Inputs/x86_asm_mir_mixed.ll
b/llvm/test/tools/UpdateTestChecks/update_llc_test_checks/Inputs/x86_asm_mir_mixed.ll
deleted file mode 100644
index 292637177591f..0
---
a/llvm/test/tools/UpdateTestChecks/update_llc_test_checks/Inputs/x86_asm_mir_mixed.ll
+++ /dev/null
@@ -1,17 +0,0 @@
-; RUN: llc -mtriple=x86_64 < %s | FileCheck %s --check-prefix=ASM
-; RUN: llc -mtriple=x86_64 -stop-after=finalize-isel < %s | FileCheck %s --check-prefix=MIR
-
-define i64 @test1(i64 %i) nounwind readnone {
- %loc = alloca i64
- %j = load i64, ptr %loc
- %r = add i64 %i, %j
- ret i64 %r
-}
-
-define i64 @test2(i32 %i) nounwind readnone {
- %loc = alloca i32
- %j = load i32, ptr %loc
- %r = add i32 %i, %j
- %ext = zext i32 %r to i64
- ret i64 %ext
-}
diff --git
a/llvm/test/tools/UpdateTestChecks/update_llc_test_checks/Inputs/x86_asm_mir_mixed.ll.expected
b/llvm/test/tools/UpdateTestChecks/update_llc_test_checks/Inputs/x86_asm_mir_mixed.ll.expected
deleted file mode 100644
index 88cb03e85204a..0
---
a/llvm/test/tools/UpdateTestChecks/update_llc_test_checks/Inputs/x86_asm_mir_mixed.ll.expected
+++ /dev/null
@@ -1,45 +0,0 @@
-; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
-; RUN: llc -mtriple=x86_64 < %s | FileCheck %s --check-prefix=ASM
-; RUN: llc -mtriple=x86_64 -stop-after=finalize-isel < %s | FileCheck %s --check-prefix=MIR
-
-define i64 @test1(i64 %i) nounwind readnone {
-; ASM-LABEL: test1:
-; ASM: # %bb.0:
-; ASM-NEXT:movq %rdi, %rax
-; ASM-NEXT:addq -{{[0-9]+}}(%rsp), %rax
-; ASM-NEXT:retq
-; MIR-LABEL: name: test1
-; MIR: bb.0 (%ir-block.0):
-; MIR-NEXT: liveins: $rdi
-; MIR-NEXT: {{ $}}
-; MIR-NEXT: [[COPY:%[0-9]+]]:gr64 = COPY $rdi
-; MIR-NEXT: [[ADD64rm:%[0-9]+]]:gr64 = ADD64rm [[COPY]], %stack.0.loc, 1, $noreg, 0, $noreg, implicit-def dead $eflags :: (dereferenceable load (s64) from %ir.loc)
-; MIR-NEXT: $rax = COPY [[ADD64rm]]
-; MIR-NEXT: RET 0, $rax
- %loc = alloca i64
- %j = load i64, ptr %loc
- %r = add i64 %i, %j
- ret i64 %r
-}
-
-define i64 @test2(i32 %i) nounwind readnone {
-; ASM-LABEL: test2:
-; ASM: # %bb.0:
-; ASM-NEXT:movl %edi, %eax
-; ASM-NEXT:addl -{{[0-9]+}}(%rsp), %eax
-; ASM-NEXT:retq
-; MIR-LABEL: name: test2
-; MIR: bb.0 (%ir-block.0):
-; MIR-NEXT: liveins: $edi
-; MIR-NEXT: {{ $}}
-; MIR-NEXT: [[COPY:%[0-9]+]]:gr32 = COPY $edi
-; MIR-NEXT: [[ADD32rm:%[0-9]+]]:gr32 = ADD32rm [[COPY]], %stack.0.loc, 1, $noreg, 0, $noreg, implicit-def dead $eflags :: (dereferenceable load (s32) from %ir.loc)
-; MIR-NEXT: [[SUBREG_TO_REG:%[0-9]+]]:gr64 = SUBREG_TO_REG 0, killed [[ADD32rm]], %subreg.sub_32bit
-; MIR-NEXT: $rax = COPY [[SUBREG_TO_REG]]
-; MIR-NEXT: RET 0, $rax
- %loc = alloca i32
- %j = load i32, ptr %loc
- %r = add i32 %i, %j
- %ext = zext i32 %r to i64
- ret i64 %ext
-}
diff --git
a/llvm/test/tools/UpdateTestChecks/update_llc_test_checks/Inputs/x86_asm_mir_same_prefix.ll
b/llvm/test/tools/UpdateTestChecks/update_llc_test_checks/Inputs/x86_asm_mir_same_prefix.ll
deleted file mode 100644
index 7167bcf258e68..0
---
a/llvm/test/tools/UpdateTestChecks/update_llc_test_checks/Inputs/x86_asm_mir_same_prefix.ll
+++ /dev/null
@@ -1,13 +0,0 @@
-; RUN: llc -mtriple=x86_64 < %s | FileCheck %s --check-prefix=CHECK
-; RUN: llc -mtriple=x86_64 -stop-after=finalize-isel < %s | FileCheck %s --check-prefix=CHECK
-
-define i32 @add(i32 %a, i32 %b) {
- %sum = add i32 %a, %b
- ret i32 %sum
-}
-
-define i32 @sub(i32 %a, i32 %b) {
- %diff = sub i32 %a, %b
- ret i32 %diff
-}
-
diff --git
a/llvm/test/tools/UpdateTestChecks/update_llc_test_checks/Input
[llvm-branch-commits] [llvm] Analysis: Add RuntimeLibcall analysis pass (PR #165196)
https://github.com/arsenm updated
https://github.com/llvm/llvm-project/pull/165196
>From 4da4e58e170a44f208b8eeddb35fb083940b7498 Mon Sep 17 00:00:00 2001
From: Matt Arsenault
Date: Mon, 2 Jun 2025 18:32:22 +0200
Subject: [PATCH] Analysis: Add RuntimeLibcall analysis pass
Currently RuntimeLibcallsInfo is a hardcoded list based on the triple.
In the future the available libcall set should be dynamically modifiable
with module flags.
Note this isn't really used yet. TargetLowering is still constructing
its own copy, and untangling that to use this requires several more
steps.
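
A toy sketch of the direction described above, under loud assumptions: every name here (ToyModule, ToyLibcallInfo, baselineFor, runToyAnalysis) is hypothetical, the libcall sets are arbitrary placeholders, and nothing below is LLVM API. It only illustrates "baseline chosen from the triple, then adjusted by per-module flags".

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

// Hypothetical stand-in for a module: a triple plus per-libcall overrides.
struct ToyModule {
  std::string Triple;
  std::map<std::string, bool> LibcallFlags; // name -> force available or not
};

// Hypothetical stand-in for the analysis result.
struct ToyLibcallInfo {
  std::set<std::string> Available;
  bool has(const std::string &Name) const { return Available.count(Name) != 0; }
};

// Today's behaviour in miniature: the set depends only on the triple.
// The entries are placeholders, not real target data.
ToyLibcallInfo baselineFor(const std::string &Triple) {
  ToyLibcallInfo Info;
  Info.Available = {"memcpy", "memset"};
  if (Triple.rfind("x86_64", 0) == 0) // triple starts with "x86_64"
    Info.Available.insert("fmodf");
  return Info;
}

// The proposed direction in miniature: start from the baseline, then let
// module-level flags add or remove entries.
ToyLibcallInfo runToyAnalysis(const ToyModule &M) {
  ToyLibcallInfo Info = baselineFor(M.Triple);
  for (const auto &KV : M.LibcallFlags) {
    if (KV.second)
      Info.Available.insert(KV.first);
    else
      Info.Available.erase(KV.first);
  }
  return Info;
}
```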
---
.../llvm/Analysis/RuntimeLibcallInfo.h| 60 +++
llvm/include/llvm/CodeGen/SelectionDAGISel.h | 1 +
llvm/include/llvm/IR/RuntimeLibcalls.h| 10
llvm/include/llvm/InitializePasses.h | 1 +
llvm/include/llvm/Passes/CodeGenPassBuilder.h | 3 +
llvm/lib/Analysis/Analysis.cpp| 1 +
llvm/lib/Analysis/CMakeLists.txt | 1 +
llvm/lib/Analysis/RuntimeLibcallInfo.cpp | 43 +
llvm/lib/IR/RuntimeLibcalls.cpp | 7 ++-
llvm/lib/Passes/PassBuilder.cpp | 1 +
llvm/lib/Passes/PassRegistry.def | 1 +
llvm/lib/Target/Target.cpp| 1 +
12 files changed, 129 insertions(+), 1 deletion(-)
create mode 100644 llvm/include/llvm/Analysis/RuntimeLibcallInfo.h
create mode 100644 llvm/lib/Analysis/RuntimeLibcallInfo.cpp
diff --git a/llvm/include/llvm/Analysis/RuntimeLibcallInfo.h
b/llvm/include/llvm/Analysis/RuntimeLibcallInfo.h
new file mode 100644
index 0..a3e1014b417e5
--- /dev/null
+++ b/llvm/include/llvm/Analysis/RuntimeLibcallInfo.h
@@ -0,0 +1,60 @@
+//===-- RuntimeLibcallInfo.h - Runtime library information --*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===--===//
+
+#ifndef LLVM_ANALYSIS_RUNTIMELIBCALLINFO_H
+#define LLVM_ANALYSIS_RUNTIMELIBCALLINFO_H
+
+#include "llvm/IR/RuntimeLibcalls.h"
+#include "llvm/Pass.h"
+
+namespace llvm {
+
+class LLVM_ABI RuntimeLibraryAnalysis
+: public AnalysisInfoMixin<RuntimeLibraryAnalysis> {
+public:
+ using Result = RTLIB::RuntimeLibcallsInfo;
+
+ RuntimeLibraryAnalysis() = default;
+ RuntimeLibraryAnalysis(RTLIB::RuntimeLibcallsInfo &&BaselineInfoImpl)
+ : LibcallsInfo(std::move(BaselineInfoImpl)) {}
+ explicit RuntimeLibraryAnalysis(const Triple &T) : LibcallsInfo(T) {}
+
+ LLVM_ABI RTLIB::RuntimeLibcallsInfo run(const Module &M,
+ ModuleAnalysisManager &);
+
+private:
+ friend AnalysisInfoMixin<RuntimeLibraryAnalysis>;
+ LLVM_ABI static AnalysisKey Key;
+
+ RTLIB::RuntimeLibcallsInfo LibcallsInfo;
+};
+
+class LLVM_ABI RuntimeLibraryInfoWrapper : public ImmutablePass {
+ RuntimeLibraryAnalysis RTLA;
+ std::optional<RTLIB::RuntimeLibcallsInfo> RTLCI;
+
+public:
+ static char ID;
+ RuntimeLibraryInfoWrapper();
+ explicit RuntimeLibraryInfoWrapper(const Triple &T);
+ explicit RuntimeLibraryInfoWrapper(const RTLIB::RuntimeLibcallsInfo &RTLCI);
+
+ const RTLIB::RuntimeLibcallsInfo &getRTLCI(const Module &M) {
+ModuleAnalysisManager DummyMAM;
+RTLCI = RTLA.run(M, DummyMAM);
+return *RTLCI;
+ }
+
+ void getAnalysisUsage(AnalysisUsage &AU) const override;
+};
+
+LLVM_ABI ModulePass *createRuntimeLibraryInfoWrapperPass();
+
+} // namespace llvm
+
+#endif
diff --git a/llvm/include/llvm/CodeGen/SelectionDAGISel.h
b/llvm/include/llvm/CodeGen/SelectionDAGISel.h
index 5241a51dd8cd8..d7921c3eb3f7c 100644
--- a/llvm/include/llvm/CodeGen/SelectionDAGISel.h
+++ b/llvm/include/llvm/CodeGen/SelectionDAGISel.h
@@ -46,6 +46,7 @@ class SelectionDAGISel {
public:
TargetMachine &TM;
const TargetLibraryInfo *LibInfo;
+ const RTLIB::RuntimeLibcallsInfo *RuntimeLibCallInfo;
std::unique_ptr<FunctionLoweringInfo> FuncInfo;
std::unique_ptr<SwiftErrorValueTracking> SwiftError;
MachineFunction *MF;
diff --git a/llvm/include/llvm/IR/RuntimeLibcalls.h
b/llvm/include/llvm/IR/RuntimeLibcalls.h
index 78e4b1723aafa..c822b6530a441 100644
--- a/llvm/include/llvm/IR/RuntimeLibcalls.h
+++ b/llvm/include/llvm/IR/RuntimeLibcalls.h
@@ -9,6 +9,8 @@
// This file implements a common interface to work with library calls into a
// runtime that may be emitted by a given backend.
//
+// FIXME: This should probably move to Analysis
+//
//===--===//
#ifndef LLVM_IR_RUNTIME_LIBCALLS_H
@@ -20,6 +22,7 @@
#include "llvm/ADT/StringTable.h"
#include "llvm/IR/CallingConv.h"
#include "llvm/IR/InstrTypes.h"
+#include "llvm/IR/PassManager.h"
#include "llvm/Support/AtomicOrdering.h"
#include "llvm/Support/CodeGen.h"
#include "llvm/Support/Compiler.h"
@@ -74,6 +77,8 @@ struct RuntimeLibcallsInfo {
public:
friend class llvm::LibcallLoweringInfo;
+ RuntimeLibcallsInfo() = default;
+
[llvm-branch-commits] [llvm] ExpandFp: Require RuntimeLibcallsInfo analysis (PR #165197)
https://github.com/arsenm updated
https://github.com/llvm/llvm-project/pull/165197
>From 3f57ae9a3872fab0aa24308065e906534c869809 Mon Sep 17 00:00:00 2001
From: Matt Arsenault
Date: Sun, 26 Oct 2025 02:44:00 +0900
Subject: [PATCH] ExpandFp: Require RuntimeLibcallsInfo analysis
Not sure I'm doing the new pass manager handling correctly. I do
not like needing to manually check if the cached module pass is
available and manually erroring in every pass.
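
The pattern the message complains about can be sketched in isolation. This is a toy model, not LLVM code: ToyModuleCache, ToyLibcallInfo and runToyFunctionPass are hypothetical names, and only the diagnostic string is taken from the patch below. A function-level pass may only look up an already computed module-level result, so it has to handle the null case itself.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for the module analysis result.
struct ToyLibcallInfo {
  int NumLibcalls;
};

// Hypothetical module-level cache: a lookup never computes anything, it only
// returns a result that something else already produced.
class ToyModuleCache {
  std::map<std::string, ToyLibcallInfo> Results;

public:
  void compute(const std::string &Module, int NumLibcalls) {
    ToyLibcallInfo Info;
    Info.NumLibcalls = NumLibcalls;
    Results[Module] = Info;
  }

  const ToyLibcallInfo *getCachedResult(const std::string &Module) const {
    auto It = Results.find(Module);
    return It == Results.end() ? nullptr : &It->second;
  }
};

// Each function pass repeats this check: fetch the cached result, emit a
// diagnostic and bail out if it is missing. Returns true if the pass ran.
bool runToyFunctionPass(const ToyModuleCache &Cache, const std::string &Module,
                        std::vector<std::string> &Diags) {
  const ToyLibcallInfo *Libcalls = Cache.getCachedResult(Module);
  if (!Libcalls) {
    Diags.push_back("'runtime-libcall-info' analysis required");
    return false;
  }
  return true;
}
```

In the toy, calling compute before the pass plays the role that requiring the module analysis ahead of expand-fp plays in the updated RUN lines.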
---
llvm/lib/CodeGen/ExpandFp.cpp | 14 ++
llvm/test/Transforms/ExpandFp/AMDGPU/frem-inf.ll | 4 ++--
llvm/test/Transforms/ExpandFp/AMDGPU/frem.ll | 2 +-
.../Transforms/ExpandFp/AMDGPU/missing-analysis.ll | 6 ++
.../Transforms/ExpandFp/AMDGPU/pass-parameters.ll | 8
5 files changed, 27 insertions(+), 7 deletions(-)
create mode 100644 llvm/test/Transforms/ExpandFp/AMDGPU/missing-analysis.ll
diff --git a/llvm/lib/CodeGen/ExpandFp.cpp b/llvm/lib/CodeGen/ExpandFp.cpp
index f44eb227133ae..9386ffe7791a3 100644
--- a/llvm/lib/CodeGen/ExpandFp.cpp
+++ b/llvm/lib/CodeGen/ExpandFp.cpp
@@ -18,6 +18,7 @@
#include "llvm/ADT/SmallVector.h"
#include "llvm/Analysis/AssumptionCache.h"
#include "llvm/Analysis/GlobalsModRef.h"
+#include "llvm/Analysis/RuntimeLibcallInfo.h"
#include "llvm/Analysis/SimplifyQuery.h"
#include "llvm/Analysis/ValueTracking.h"
#include "llvm/CodeGen/ISDOpcodes.h"
@@ -1092,6 +1093,8 @@ class ExpandFpLegacyPass : public FunctionPass {
auto *TM = &getAnalysis().getTM();
auto *TLI = TM->getSubtargetImpl(F)->getTargetLowering();
AssumptionCache *AC = nullptr;
+const RTLIB::RuntimeLibcallsInfo *Libcalls =
+&getAnalysis<RuntimeLibraryInfoWrapper>().getRTLCI(*F.getParent());
if (OptLevel != CodeGenOptLevel::None && !F.hasOptNone())
AC = &getAnalysis().getAssumptionCache(F);
@@ -1104,6 +1107,7 @@ class ExpandFpLegacyPass : public FunctionPass {
AU.addRequired();
AU.addPreserved();
AU.addPreserved();
+AU.addRequired<RuntimeLibraryInfoWrapper>();
}
};
} // namespace
@@ -1126,6 +1130,15 @@ PreservedAnalyses ExpandFpPass::run(Function &F,
FunctionAnalysisManager &FAM) {
AssumptionCache *AC = nullptr;
if (OptLevel != CodeGenOptLevel::None)
AC = &FAM.getResult(F);
+
+ auto &MAMProxy = FAM.getResult<ModuleAnalysisManagerFunctionProxy>(F);
+ const RTLIB::RuntimeLibcallsInfo *Libcalls =
+ MAMProxy.getCachedResult<RuntimeLibraryAnalysis>(*F.getParent());
+ if (!Libcalls) {
+F.getContext().emitError("'runtime-libcall-info' analysis required");
+return PreservedAnalyses::all();
+ }
+
return runImpl(F, TLI, AC) ? PreservedAnalyses::none()
: PreservedAnalyses::all();
}
@@ -1133,6 +1146,7 @@ PreservedAnalyses ExpandFpPass::run(Function &F,
FunctionAnalysisManager &FAM) {
char ExpandFpLegacyPass::ID = 0;
INITIALIZE_PASS_BEGIN(ExpandFpLegacyPass, "expand-fp",
"Expand certain fp instructions", false, false)
+INITIALIZE_PASS_DEPENDENCY(RuntimeLibraryInfoWrapper)
INITIALIZE_PASS_END(ExpandFpLegacyPass, "expand-fp", "Expand fp", false, false)
FunctionPass *llvm::createExpandFpPass(CodeGenOptLevel OptLevel) {
diff --git a/llvm/test/Transforms/ExpandFp/AMDGPU/frem-inf.ll
b/llvm/test/Transforms/ExpandFp/AMDGPU/frem-inf.ll
index f70f0d25f172d..4d302f63e1f0b 100644
--- a/llvm/test/Transforms/ExpandFp/AMDGPU/frem-inf.ll
+++ b/llvm/test/Transforms/ExpandFp/AMDGPU/frem-inf.ll
@@ -1,5 +1,5 @@
-; RUN: opt -mtriple=amdgcn -passes="expand-fp" %s -S -o - | FileCheck --check-prefixes CHECK %s
-; RUN: opt -mtriple=amdgcn -passes="expand-fp" %s -S -o - | FileCheck --check-prefixes CHECK,OPT1 %s
+; RUN: opt -mtriple=amdgcn -passes="require<runtime-libcall-info>,expand-fp" %s -S -o - | FileCheck --check-prefixes CHECK %s
+; RUN: opt -mtriple=amdgcn -passes="require<runtime-libcall-info>,expand-fp" %s -S -o - | FileCheck --check-prefixes CHECK,OPT1 %s
; Check the handling of potentially infinite numerators in the frem
; expansion at different optimization levels and with different
diff --git a/llvm/test/Transforms/ExpandFp/AMDGPU/frem.ll
b/llvm/test/Transforms/ExpandFp/AMDGPU/frem.ll
index 4c0f9db147c96..56ccfb6bf454c 100644
--- a/llvm/test/Transforms/ExpandFp/AMDGPU/frem.ll
+++ b/llvm/test/Transforms/ExpandFp/AMDGPU/frem.ll
@@ -1,5 +1,5 @@
; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 5
-; RUN: opt -mtriple=amdgcn -passes="expand-fp" %s -S -o - | FileCheck %s
+; RUN: opt -mtriple=amdgcn -passes="require<runtime-libcall-info>,expand-fp" %s -S -o - | FileCheck %s
define amdgpu_kernel void @frem_f16(ptr addrspace(1) %out, ptr addrspace(1) %in1,
; CHECK-LABEL: define amdgpu_kernel void @frem_f16(
diff --git a/llvm/test/Transforms/ExpandFp/AMDGPU/missing-analysis.ll
b/llvm/test/Transforms/ExpandFp/AMDGPU/missing-analysis.ll
new file mode 100644
index 0..5cad68e66d3ee
--- /dev/null
+++ b/llvm/test/Transforms/ExpandFp/AMDGPU/missing-analysis.ll
@@ -0,0 +1,6 @@
+; RUN: not opt -mtriple=amdgcn -passes=expand-fp -disable-output %s 2>&1 | FileCheck %s
+
+; CHECK: 'runtime-libc
[llvm-branch-commits] [llvm] ExpandFp: Require RuntimeLibcallsInfo analysis (PR #165197)
https://github.com/arsenm updated
https://github.com/llvm/llvm-project/pull/165197
>From c4f4a6be5d21b8204e48fb5b8971dcac3d13d56f Mon Sep 17 00:00:00 2001
From: Matt Arsenault
Date: Sun, 26 Oct 2025 02:44:00 +0900
Subject: [PATCH] ExpandFp: Require RuntimeLibcallsInfo analysis
Not sure I'm doing the new pass manager handling correctly. I do
not like needing to manually check if the cached module pass is
available and manually erroring in every pass.
---
llvm/lib/CodeGen/ExpandFp.cpp | 14 ++
llvm/test/Transforms/ExpandFp/AMDGPU/frem-inf.ll | 4 ++--
llvm/test/Transforms/ExpandFp/AMDGPU/frem.ll | 2 +-
.../Transforms/ExpandFp/AMDGPU/missing-analysis.ll | 6 ++
.../Transforms/ExpandFp/AMDGPU/pass-parameters.ll | 8
5 files changed, 27 insertions(+), 7 deletions(-)
create mode 100644 llvm/test/Transforms/ExpandFp/AMDGPU/missing-analysis.ll
diff --git a/llvm/lib/CodeGen/ExpandFp.cpp b/llvm/lib/CodeGen/ExpandFp.cpp
index f44eb227133ae..9386ffe7791a3 100644
--- a/llvm/lib/CodeGen/ExpandFp.cpp
+++ b/llvm/lib/CodeGen/ExpandFp.cpp
@@ -18,6 +18,7 @@
#include "llvm/ADT/SmallVector.h"
#include "llvm/Analysis/AssumptionCache.h"
#include "llvm/Analysis/GlobalsModRef.h"
+#include "llvm/Analysis/RuntimeLibcallInfo.h"
#include "llvm/Analysis/SimplifyQuery.h"
#include "llvm/Analysis/ValueTracking.h"
#include "llvm/CodeGen/ISDOpcodes.h"
@@ -1092,6 +1093,8 @@ class ExpandFpLegacyPass : public FunctionPass {
auto *TM = &getAnalysis().getTM();
auto *TLI = TM->getSubtargetImpl(F)->getTargetLowering();
AssumptionCache *AC = nullptr;
+const RTLIB::RuntimeLibcallsInfo *Libcalls =
+&getAnalysis<RuntimeLibraryInfoWrapper>().getRTLCI(*F.getParent());
if (OptLevel != CodeGenOptLevel::None && !F.hasOptNone())
AC = &getAnalysis().getAssumptionCache(F);
@@ -1104,6 +1107,7 @@ class ExpandFpLegacyPass : public FunctionPass {
AU.addRequired();
AU.addPreserved();
AU.addPreserved();
+AU.addRequired<RuntimeLibraryInfoWrapper>();
}
};
} // namespace
@@ -1126,6 +1130,15 @@ PreservedAnalyses ExpandFpPass::run(Function &F,
FunctionAnalysisManager &FAM) {
AssumptionCache *AC = nullptr;
if (OptLevel != CodeGenOptLevel::None)
AC = &FAM.getResult(F);
+
+ auto &MAMProxy = FAM.getResult<ModuleAnalysisManagerFunctionProxy>(F);
+ const RTLIB::RuntimeLibcallsInfo *Libcalls =
+ MAMProxy.getCachedResult<RuntimeLibraryAnalysis>(*F.getParent());
+ if (!Libcalls) {
+F.getContext().emitError("'runtime-libcall-info' analysis required");
+return PreservedAnalyses::all();
+ }
+
return runImpl(F, TLI, AC) ? PreservedAnalyses::none()
: PreservedAnalyses::all();
}
@@ -1133,6 +1146,7 @@ PreservedAnalyses ExpandFpPass::run(Function &F,
FunctionAnalysisManager &FAM) {
char ExpandFpLegacyPass::ID = 0;
INITIALIZE_PASS_BEGIN(ExpandFpLegacyPass, "expand-fp",
"Expand certain fp instructions", false, false)
+INITIALIZE_PASS_DEPENDENCY(RuntimeLibraryInfoWrapper)
INITIALIZE_PASS_END(ExpandFpLegacyPass, "expand-fp", "Expand fp", false, false)
FunctionPass *llvm::createExpandFpPass(CodeGenOptLevel OptLevel) {
diff --git a/llvm/test/Transforms/ExpandFp/AMDGPU/frem-inf.ll
b/llvm/test/Transforms/ExpandFp/AMDGPU/frem-inf.ll
index f70f0d25f172d..4d302f63e1f0b 100644
--- a/llvm/test/Transforms/ExpandFp/AMDGPU/frem-inf.ll
+++ b/llvm/test/Transforms/ExpandFp/AMDGPU/frem-inf.ll
@@ -1,5 +1,5 @@
-; RUN: opt -mtriple=amdgcn -passes="expand-fp" %s -S -o - | FileCheck --check-prefixes CHECK %s
-; RUN: opt -mtriple=amdgcn -passes="expand-fp" %s -S -o - | FileCheck --check-prefixes CHECK,OPT1 %s
+; RUN: opt -mtriple=amdgcn -passes="require<runtime-libcall-info>,expand-fp" %s -S -o - | FileCheck --check-prefixes CHECK %s
+; RUN: opt -mtriple=amdgcn -passes="require<runtime-libcall-info>,expand-fp" %s -S -o - | FileCheck --check-prefixes CHECK,OPT1 %s
; Check the handling of potentially infinite numerators in the frem
; expansion at different optimization levels and with different
diff --git a/llvm/test/Transforms/ExpandFp/AMDGPU/frem.ll
b/llvm/test/Transforms/ExpandFp/AMDGPU/frem.ll
index 4c0f9db147c96..56ccfb6bf454c 100644
--- a/llvm/test/Transforms/ExpandFp/AMDGPU/frem.ll
+++ b/llvm/test/Transforms/ExpandFp/AMDGPU/frem.ll
@@ -1,5 +1,5 @@
; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 5
-; RUN: opt -mtriple=amdgcn -passes="expand-fp" %s -S -o - | FileCheck %s
+; RUN: opt -mtriple=amdgcn -passes="require<runtime-libcall-info>,expand-fp" %s -S -o - | FileCheck %s
define amdgpu_kernel void @frem_f16(ptr addrspace(1) %out, ptr addrspace(1) %in1,
; CHECK-LABEL: define amdgpu_kernel void @frem_f16(
diff --git a/llvm/test/Transforms/ExpandFp/AMDGPU/missing-analysis.ll
b/llvm/test/Transforms/ExpandFp/AMDGPU/missing-analysis.ll
new file mode 100644
index 0..5cad68e66d3ee
--- /dev/null
+++ b/llvm/test/Transforms/ExpandFp/AMDGPU/missing-analysis.ll
@@ -0,0 +1,6 @@
+; RUN: not opt -mtriple=amdgcn -passes=expand-fp -disable-output %s 2>&1 | FileCheck %s
+
+; CHECK: 'runtime-libc
[llvm-branch-commits] [llvm] ExpandFp: Require RuntimeLibcallsInfo analysis (PR #165197)
https://github.com/arsenm updated
https://github.com/llvm/llvm-project/pull/165197
>From c4f4a6be5d21b8204e48fb5b8971dcac3d13d56f Mon Sep 17 00:00:00 2001
From: Matt Arsenault
Date: Sun, 26 Oct 2025 02:44:00 +0900
Subject: [PATCH] ExpandFp: Require RuntimeLibcallsInfo analysis
Not sure I'm doing the new pass manager handling correctly. I do
not like needing to manually check if the cached module pass is
available and manually erroring in every pass.
---
llvm/lib/CodeGen/ExpandFp.cpp | 14 ++
llvm/test/Transforms/ExpandFp/AMDGPU/frem-inf.ll | 4 ++--
llvm/test/Transforms/ExpandFp/AMDGPU/frem.ll | 2 +-
.../Transforms/ExpandFp/AMDGPU/missing-analysis.ll | 6 ++
.../Transforms/ExpandFp/AMDGPU/pass-parameters.ll | 8
5 files changed, 27 insertions(+), 7 deletions(-)
create mode 100644 llvm/test/Transforms/ExpandFp/AMDGPU/missing-analysis.ll
diff --git a/llvm/lib/CodeGen/ExpandFp.cpp b/llvm/lib/CodeGen/ExpandFp.cpp
index f44eb227133ae..9386ffe7791a3 100644
--- a/llvm/lib/CodeGen/ExpandFp.cpp
+++ b/llvm/lib/CodeGen/ExpandFp.cpp
@@ -18,6 +18,7 @@
#include "llvm/ADT/SmallVector.h"
#include "llvm/Analysis/AssumptionCache.h"
#include "llvm/Analysis/GlobalsModRef.h"
+#include "llvm/Analysis/RuntimeLibcallInfo.h"
#include "llvm/Analysis/SimplifyQuery.h"
#include "llvm/Analysis/ValueTracking.h"
#include "llvm/CodeGen/ISDOpcodes.h"
@@ -1092,6 +1093,8 @@ class ExpandFpLegacyPass : public FunctionPass {
auto *TM = &getAnalysis<TargetPassConfig>().getTM<TargetMachine>();
auto *TLI = TM->getSubtargetImpl(F)->getTargetLowering();
AssumptionCache *AC = nullptr;
+const RTLIB::RuntimeLibcallsInfo *Libcalls =
+&getAnalysis<RuntimeLibraryInfoWrapper>().getRTLCI(*F.getParent());
if (OptLevel != CodeGenOptLevel::None && !F.hasOptNone())
AC = &getAnalysis<AssumptionCacheTracker>().getAssumptionCache(F);
@@ -1104,6 +1107,7 @@ class ExpandFpLegacyPass : public FunctionPass {
AU.addRequired();
AU.addPreserved();
AU.addPreserved();
+AU.addRequired<RuntimeLibraryInfoWrapper>();
}
};
} // namespace
@@ -1126,6 +1130,15 @@ PreservedAnalyses ExpandFpPass::run(Function &F,
FunctionAnalysisManager &FAM) {
AssumptionCache *AC = nullptr;
if (OptLevel != CodeGenOptLevel::None)
AC = &FAM.getResult<AssumptionAnalysis>(F);
+
+ auto &MAMProxy = FAM.getResult<ModuleAnalysisManagerFunctionProxy>(F);
+ const RTLIB::RuntimeLibcallsInfo *Libcalls =
+ MAMProxy.getCachedResult<RuntimeLibraryAnalysis>(*F.getParent());
+ if (!Libcalls) {
+F.getContext().emitError("'runtime-libcall-info' analysis required");
+return PreservedAnalyses::all();
+ }
+
return runImpl(F, TLI, AC) ? PreservedAnalyses::none()
: PreservedAnalyses::all();
}
@@ -1133,6 +1146,7 @@ PreservedAnalyses ExpandFpPass::run(Function &F,
FunctionAnalysisManager &FAM) {
char ExpandFpLegacyPass::ID = 0;
INITIALIZE_PASS_BEGIN(ExpandFpLegacyPass, "expand-fp",
"Expand certain fp instructions", false, false)
+INITIALIZE_PASS_DEPENDENCY(RuntimeLibraryInfoWrapper)
INITIALIZE_PASS_END(ExpandFpLegacyPass, "expand-fp", "Expand fp", false, false)
FunctionPass *llvm::createExpandFpPass(CodeGenOptLevel OptLevel) {
diff --git a/llvm/test/Transforms/ExpandFp/AMDGPU/frem-inf.ll
b/llvm/test/Transforms/ExpandFp/AMDGPU/frem-inf.ll
index f70f0d25f172d..4d302f63e1f0b 100644
--- a/llvm/test/Transforms/ExpandFp/AMDGPU/frem-inf.ll
+++ b/llvm/test/Transforms/ExpandFp/AMDGPU/frem-inf.ll
@@ -1,5 +1,5 @@
-; RUN: opt -mtriple=amdgcn -passes="expand-fp" %s -S -o - | FileCheck --check-prefixes CHECK %s
-; RUN: opt -mtriple=amdgcn -passes="expand-fp" %s -S -o - | FileCheck --check-prefixes CHECK,OPT1 %s
+; RUN: opt -mtriple=amdgcn -passes="require<runtime-libcall-info>,expand-fp" %s -S -o - | FileCheck --check-prefixes CHECK %s
+; RUN: opt -mtriple=amdgcn -passes="require<runtime-libcall-info>,expand-fp" %s -S -o - | FileCheck --check-prefixes CHECK,OPT1 %s
; Check the handling of potentially infinite numerators in the frem
; expansion at different optimization levels and with different
diff --git a/llvm/test/Transforms/ExpandFp/AMDGPU/frem.ll
b/llvm/test/Transforms/ExpandFp/AMDGPU/frem.ll
index 4c0f9db147c96..56ccfb6bf454c 100644
--- a/llvm/test/Transforms/ExpandFp/AMDGPU/frem.ll
+++ b/llvm/test/Transforms/ExpandFp/AMDGPU/frem.ll
@@ -1,5 +1,5 @@
; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 5
-; RUN: opt -mtriple=amdgcn -passes="expand-fp" %s -S -o - | FileCheck %s
+; RUN: opt -mtriple=amdgcn -passes="require<runtime-libcall-info>,expand-fp" %s -S -o - | FileCheck %s
define amdgpu_kernel void @frem_f16(ptr addrspace(1) %out, ptr addrspace(1)
%in1,
; CHECK-LABEL: define amdgpu_kernel void @frem_f16(
diff --git a/llvm/test/Transforms/ExpandFp/AMDGPU/missing-analysis.ll
b/llvm/test/Transforms/ExpandFp/AMDGPU/missing-analysis.ll
new file mode 100644
index 0..5cad68e66d3ee
--- /dev/null
+++ b/llvm/test/Transforms/ExpandFp/AMDGPU/missing-analysis.ll
@@ -0,0 +1,6 @@
+; RUN: not opt -mtriple=amdgcn -passes=expand-fp -disable-output %s 2>&1 | FileCheck %s
+
+; CHECK: 'runtime-libc
[llvm-branch-commits] [llvm] Analysis: Add RuntimeLibcall analysis pass (PR #165196)
https://github.com/arsenm updated
https://github.com/llvm/llvm-project/pull/165196
>From 833269e63a65128bd7f95bf6176c08921dc69fde Mon Sep 17 00:00:00 2001
From: Matt Arsenault
Date: Mon, 2 Jun 2025 18:32:22 +0200
Subject: [PATCH] Analysis: Add RuntimeLibcall analysis pass
Currently RuntimeLibcallsInfo is a hardcoded list based on the triple.
In the future the available libcall set should be dynamically modifiable
with module flags.
Note this isn't really used yet. TargetLowering is still constructing
its own copy, and untangling that to use this requires several more
steps.
---
.../llvm/Analysis/RuntimeLibcallInfo.h| 60 +++
llvm/include/llvm/CodeGen/SelectionDAGISel.h | 1 +
llvm/include/llvm/IR/RuntimeLibcalls.h| 10
llvm/include/llvm/InitializePasses.h | 1 +
llvm/include/llvm/Passes/CodeGenPassBuilder.h | 3 +
llvm/lib/Analysis/Analysis.cpp| 1 +
llvm/lib/Analysis/CMakeLists.txt | 1 +
llvm/lib/Analysis/RuntimeLibcallInfo.cpp | 43 +
llvm/lib/IR/RuntimeLibcalls.cpp | 7 ++-
llvm/lib/Passes/PassBuilder.cpp | 1 +
llvm/lib/Passes/PassRegistry.def | 1 +
llvm/lib/Target/Target.cpp| 1 +
12 files changed, 129 insertions(+), 1 deletion(-)
create mode 100644 llvm/include/llvm/Analysis/RuntimeLibcallInfo.h
create mode 100644 llvm/lib/Analysis/RuntimeLibcallInfo.cpp
diff --git a/llvm/include/llvm/Analysis/RuntimeLibcallInfo.h
b/llvm/include/llvm/Analysis/RuntimeLibcallInfo.h
new file mode 100644
index 0..a3e1014b417e5
--- /dev/null
+++ b/llvm/include/llvm/Analysis/RuntimeLibcallInfo.h
@@ -0,0 +1,60 @@
+//===-- RuntimeLibcallInfo.h - Runtime library information --*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===--===//
+
+#ifndef LLVM_ANALYSIS_RUNTIMELIBCALLINFO_H
+#define LLVM_ANALYSIS_RUNTIMELIBCALLINFO_H
+
+#include "llvm/IR/RuntimeLibcalls.h"
+#include "llvm/Pass.h"
+
+namespace llvm {
+
+class LLVM_ABI RuntimeLibraryAnalysis
+: public AnalysisInfoMixin<RuntimeLibraryAnalysis> {
+public:
+ using Result = RTLIB::RuntimeLibcallsInfo;
+
+ RuntimeLibraryAnalysis() = default;
+ RuntimeLibraryAnalysis(RTLIB::RuntimeLibcallsInfo &&BaselineInfoImpl)
+ : LibcallsInfo(std::move(BaselineInfoImpl)) {}
+ explicit RuntimeLibraryAnalysis(const Triple &T) : LibcallsInfo(T) {}
+
+ LLVM_ABI RTLIB::RuntimeLibcallsInfo run(const Module &M,
+ ModuleAnalysisManager &);
+
+private:
+ friend AnalysisInfoMixin<RuntimeLibraryAnalysis>;
+ LLVM_ABI static AnalysisKey Key;
+
+ RTLIB::RuntimeLibcallsInfo LibcallsInfo;
+};
+
+class LLVM_ABI RuntimeLibraryInfoWrapper : public ImmutablePass {
+ RuntimeLibraryAnalysis RTLA;
+ std::optional<RTLIB::RuntimeLibcallsInfo> RTLCI;
+
+public:
+ static char ID;
+ RuntimeLibraryInfoWrapper();
+ explicit RuntimeLibraryInfoWrapper(const Triple &T);
+ explicit RuntimeLibraryInfoWrapper(const RTLIB::RuntimeLibcallsInfo &RTLCI);
+
+ const RTLIB::RuntimeLibcallsInfo &getRTLCI(const Module &M) {
+ModuleAnalysisManager DummyMAM;
+RTLCI = RTLA.run(M, DummyMAM);
+return *RTLCI;
+ }
+
+ void getAnalysisUsage(AnalysisUsage &AU) const override;
+};
+
+LLVM_ABI ModulePass *createRuntimeLibraryInfoWrapperPass();
+
+} // namespace llvm
+
+#endif
diff --git a/llvm/include/llvm/CodeGen/SelectionDAGISel.h
b/llvm/include/llvm/CodeGen/SelectionDAGISel.h
index 5241a51dd8cd8..d7921c3eb3f7c 100644
--- a/llvm/include/llvm/CodeGen/SelectionDAGISel.h
+++ b/llvm/include/llvm/CodeGen/SelectionDAGISel.h
@@ -46,6 +46,7 @@ class SelectionDAGISel {
public:
TargetMachine &TM;
const TargetLibraryInfo *LibInfo;
+ const RTLIB::RuntimeLibcallsInfo *RuntimeLibCallInfo;
std::unique_ptr<FunctionLoweringInfo> FuncInfo;
std::unique_ptr<SwiftErrorValueTracking> SwiftError;
MachineFunction *MF;
diff --git a/llvm/include/llvm/IR/RuntimeLibcalls.h
b/llvm/include/llvm/IR/RuntimeLibcalls.h
index 78e4b1723aafa..c822b6530a441 100644
--- a/llvm/include/llvm/IR/RuntimeLibcalls.h
+++ b/llvm/include/llvm/IR/RuntimeLibcalls.h
@@ -9,6 +9,8 @@
// This file implements a common interface to work with library calls into a
// runtime that may be emitted by a given backend.
//
+// FIXME: This should probably move to Analysis
+//
//===--===//
#ifndef LLVM_IR_RUNTIME_LIBCALLS_H
@@ -20,6 +22,7 @@
#include "llvm/ADT/StringTable.h"
#include "llvm/IR/CallingConv.h"
#include "llvm/IR/InstrTypes.h"
+#include "llvm/IR/PassManager.h"
#include "llvm/Support/AtomicOrdering.h"
#include "llvm/Support/CodeGen.h"
#include "llvm/Support/Compiler.h"
@@ -74,6 +77,8 @@ struct RuntimeLibcallsInfo {
public:
friend class llvm::LibcallLoweringInfo;
+ RuntimeLibcallsInfo() = default;
+
[llvm-branch-commits] [CI] Make premerge_advisor_explain write comments (PR #166605)
llvmbot wrote:
@llvm/pr-subscribers-infrastructure
Author: Aiden Grossman (boomanaiden154)
Changes
This patch makes the premerge advisor write out comments. This allows
for surfacing the findings of the advisor in a user-visible manner
beyond just dumping the output in the logs. Surfacing the information in
a comment also makes it much easier to discover compared to the Github
summary view.
---
Patch is 21.20 KiB, truncated to 20.00 KiB below, full version:
https://github.com/llvm/llvm-project/pull/166605.diff
4 Files Affected:
- (modified) .ci/all_requirements.txt (+192-2)
- (modified) .ci/premerge_advisor_explain.py (+65-3)
- (modified) .ci/requirements.txt (+1)
- (modified) .ci/utils.sh (+2-1)
``diff
diff --git a/.ci/all_requirements.txt b/.ci/all_requirements.txt
index ac9682a09bec1..4918d7519291f 100644
--- a/.ci/all_requirements.txt
+++ b/.ci/all_requirements.txt
@@ -12,6 +12,94 @@ certifi==2025.8.3 \
--hash=sha256:e564105f78ded564e3ae7c923924435e1daa7463faeab5bb932bc53ffae63407 \
--hash=sha256:f6c12493cfb1b06ba2ff328595af9350c65d6644968e5d3a2ffd78699af217a5
# via requests
+cffi==2.0.0 \
+ --hash=sha256:00bdf7acc5f795150faa6957054fbbca2439db2f775ce831222b66f192f03beb \
+ --hash=sha256:07b271772c100085dd28b74fa0cd81c8fb1a3ba18b21e03d7c27f3436a10606b \
+ --hash=sha256:087067fa8953339c723661eda6b54bc98c5625757ea62e95eb4898ad5e776e9f \
+ --hash=sha256:0a1527a803f0a659de1af2e1fd700213caba79377e27e4693648c2923da066f9 \
+ --hash=sha256:0cf2d91ecc3fcc0625c2c530fe004f82c110405f101548512cce44322fa8ac44 \
+ --hash=sha256:0f6084a0ea23d05d20c3edcda20c3d006f9b6f3fefeac38f59262e10cef47ee2 \
+ --hash=sha256:12873ca6cb9b0f0d3a0da705d6086fe911591737a59f28b7936bdfed27c0d47c \
+ --hash=sha256:19f705ada2530c1167abacb171925dd886168931e0a7b78f5bffcae5c6b5be75 \
+ --hash=sha256:1cd13c99ce269b3ed80b417dcd591415d3372bcac067009b6e0f59c7d4015e65 \
+ --hash=sha256:1e3a615586f05fc4065a8b22b8152f0c1b00cdbc60596d187c2a74f9e3036e4e \
+ --hash=sha256:1f72fb8906754ac8a2cc3f9f5aaa298070652a0ffae577e0ea9bd480dc3c931a \
+ --hash=sha256:1fc9ea04857caf665289b7a75923f2c6ed559b8298a1b8c49e59f7dd95c8481e \
+ --hash=sha256:203a48d1fb583fc7d78a4c6655692963b860a417c0528492a6bc21f1aaefab25 \
+ --hash=sha256:2081580ebb843f759b9f617314a24ed5738c51d2aee65d31e02f6f7a2b97707a \
+ --hash=sha256:21d1152871b019407d8ac3985f6775c079416c282e431a4da6afe7aefd2bccbe \
+ --hash=sha256:24b6f81f1983e6df8db3adc38562c83f7d4a0c36162885ec7f7b77c7dcbec97b \
+ --hash=sha256:256f80b80ca3853f90c21b23ee78cd008713787b1b1e93eae9f3d6a7134abd91 \
+ --hash=sha256:28a3a209b96630bca57cce802da70c266eb08c6e97e5afd61a75611ee6c64592 \
+ --hash=sha256:2c8f814d84194c9ea681642fd164267891702542f028a15fc97d4674b6206187 \
+ --hash=sha256:2de9a304e27f7596cd03d16f1b7c72219bd944e99cc52b84d0145aefb07cbd3c \
+ --hash=sha256:38100abb9d1b1435bc4cc340bb4489635dc2f0da7456590877030c9b3d40b0c1 \
+ --hash=sha256:3925dd22fa2b7699ed2617149842d2e6adde22b262fcbfada50e3d195e4b3a94 \
+ --hash=sha256:3e17ed538242334bf70832644a32a7aae3d83b57567f9fd60a26257e992b79ba \
+ --hash=sha256:3e837e369566884707ddaf85fc1744b47575005c0a229de3327f8f9a20f4efeb \
+ --hash=sha256:3f4d46d8b35698056ec29bca21546e1551a205058ae1a181d871e278b0b28165 \
+ --hash=sha256:44d1b5909021139fe36001ae048dbdde8214afa20200eda0f64c068cac5d5529 \
+ --hash=sha256:45d5e886156860dc35862657e1494b9bae8dfa63bf56796f2fb56e1679fc0bca \
+ --hash=sha256:4647afc2f90d1ddd33441e5b0e85b16b12ddec4fca55f0d9671fef036ecca27c \
+ --hash=sha256:4671d9dd5ec934cb9a73e7ee9676f9362aba54f7f34910956b84d727b0d73fb6 \
+ --hash=sha256:53f77cbe57044e88bbd5ed26ac1d0514d2acf0591dd6bb02a3ae37f76811b80c \
+ --hash=sha256:5eda85d6d1879e692d546a078b44251cdd08dd1cfb98dfb77b670c97cee49ea0 \
+ --hash=sha256:5fed36fccc0612a53f1d4d9a816b50a36702c28a2aa880cb8a122b3466638743 \
+ --hash=sha256:61d028e90346df14fedc3d1e5441df818d095f3b87d286825dfcbd6459b7ef63 \
+ --hash=sha256:66f011380d0e49ed280c789fbd08ff0d40968ee7b665575489afa95c98196ab5 \
+ --hash=sha256:6824f87845e3396029f3820c206e459ccc91760e8fa24422f8b0c3d1731cbec5 \
+ --hash=sha256:6c6c373cfc5c83a975506110d17457138c8c63016b563cc9ed6e056a82f13ce4 \
+ --hash=sha256:6d02d6655b0e54f54c4ef0b94eb6be0607b70853c45ce98bd278dc7de718be5d \
+ --hash=sha256:6d50360be4546678fc1b79ffe7a66265e28667840010348dd69a314145807a1b \
+ --hash=sha256:730cacb21e1bdff3ce90babf007d0a0917cc3e6492f336c2f0134101e0944f93 \
+ --hash=sha256:737fe7d37e1a1bffe70bd5754ea763a62a066dc5913ca57e957824b72a85e205 \
+ --hash=sha256:74a03b9698e198d47562765773b4a8309919089150a0bb17d829ad7b44b60d27 \
+ --hash=sha256:7553fb2090d71822f02c629afe6042c299edf91ba1bf94951165613553984512 \
+ --hash=sha256:7a66c7204d8869299919db4d5069a82f1561581af12b11b3c9f48c584eb8743d \
+ --hash=sha256:7cc09976e8b56f8cebd752f7113ad07752461f48a58cbba644139015ac24954c \
+ --hash=
[llvm-branch-commits] [CI][NFC] Refactor compute_platform_title into generate_test_report_lib (PR #166604)
llvmbot wrote:
@llvm/pr-subscribers-infrastructure
Author: Aiden Grossman (boomanaiden154)
Changes
This enables reuse in other CI components, like
premerge_advisor_explain.py.
---
Full diff: https://github.com/llvm/llvm-project/pull/166604.diff
2 Files Affected:
- (modified) .ci/generate_test_report_github.py (+3-12)
- (modified) .ci/generate_test_report_lib.py (+11)
``diff
diff --git a/.ci/generate_test_report_github.py
b/.ci/generate_test_report_github.py
index 08387de817467..18c5e078a5064 100644
--- a/.ci/generate_test_report_github.py
+++ b/.ci/generate_test_report_github.py
@@ -4,21 +4,10 @@
"""Script to generate a build report for Github."""
import argparse
-import platform
import generate_test_report_lib
-def compute_platform_title() -> str:
-    logo = ":window:" if platform.system() == "Windows" else ":penguin:"
-    # On Linux the machine value is x86_64 on Windows it is AMD64.
-    if platform.machine() == "x86_64" or platform.machine() == "AMD64":
-        arch = "x64"
-    else:
-        arch = platform.machine()
-    return f"{logo} {platform.system()} {arch} Test Results"
-
-
 if __name__ == "__main__":
     parser = argparse.ArgumentParser()
     parser.add_argument("return_code", help="The build's return code.", type=int)
@@ -28,7 +17,9 @@ def compute_platform_title() -> str:
     args = parser.parse_args()
     report = generate_test_report_lib.generate_report_from_files(
-        compute_platform_title(), args.return_code, args.build_test_logs
+        generate_test_report_lib.compute_platform_title(),
+        args.return_code,
+        args.build_test_logs,
     )
     print(report)
diff --git a/.ci/generate_test_report_lib.py b/.ci/generate_test_report_lib.py
index c9a2aaeb10f8c..48a6be903da41 100644
--- a/.ci/generate_test_report_lib.py
+++ b/.ci/generate_test_report_lib.py
@@ -4,6 +4,7 @@
"""Library to parse JUnit XML files and return a markdown report."""
from typing import TypedDict
+import platform
from junitparser import JUnitXml, Failure
@@ -305,3 +306,13 @@ def load_info_from_files(build_log_files):
 def generate_report_from_files(title, return_code, build_log_files):
     junit_objects, ninja_logs = load_info_from_files(build_log_files)
     return generate_report(title, return_code, junit_objects, ninja_logs)
+
+
+def compute_platform_title() -> str:
+    logo = ":window:" if platform.system() == "Windows" else ":penguin:"
+    # On Linux the machine value is x86_64 on Windows it is AMD64.
+    if platform.machine() == "x86_64" or platform.machine() == "AMD64":
+        arch = "x64"
+    else:
+        arch = platform.machine()
+    return f"{logo} {platform.system()} {arch} Test Results"
``
https://github.com/llvm/llvm-project/pull/166604
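To illustrate why a shared helper is convenient for other CI scripts, here
is a minimal Python sketch of the same title logic with the platform values
passed in explicitly so it can be exercised on any host. The name
`platform_title` and its two parameters are assumptions made for this
example; they are not part of the patch, which reads the values from the
`platform` module directly.

```python
import platform


def platform_title(system: str, machine: str) -> str:
    """Build a report title from explicit system and machine strings.

    Hypothetical variant of compute_platform_title() for illustration.
    """
    logo = ":window:" if system == "Windows" else ":penguin:"
    # Linux reports x86_64 and Windows reports AMD64 for the same architecture.
    arch = "x64" if machine in ("x86_64", "AMD64") else machine
    return f"{logo} {system} {arch} Test Results"


if __name__ == "__main__":
    # Mirror what the library helper does on the current host.
    print(platform_title(platform.system(), platform.machine()))
```

Any script that imports the library could then build the same title for its
own report or comment without duplicating the architecture mapping.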
___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [CI] Make premerge_advisor_explain write comments (PR #166605)
https://github.com/boomanaiden154 created
https://github.com/llvm/llvm-project/pull/166605
This patch makes the premerge advisor write out comments. This allows
for surfacing the findings of the advisor in a user-visible manner
beyond just dumping the output in the logs. Surfacing the information in
a comment also makes it much easier to discover compared to the Github
summary view.
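For illustration only, a minimal Python sketch of the kind of comment
payload such a script could write to disk for a later CI step to post. The
helper names `build_comment` and `write_comments` are hypothetical, and the
ID of an existing comment is passed in as a plain argument here rather than
looked up through the GitHub API.

```python
import json
from pathlib import Path
from typing import Optional


def build_comment(body: str, existing_comment_id: Optional[int] = None) -> dict:
    """Return a comment payload; 'id' is set only when updating a comment."""
    comment = {"body": body}
    if existing_comment_id is not None:
        comment["id"] = existing_comment_id
    return comment


def write_comments(comments: list, path: Path) -> None:
    """Serialize the list of comment payloads as JSON."""
    with open(path, "w") as handle:
        json.dump(comments, handle)
```

Keeping the payload construction separate from the GitHub lookup makes the
"new comment" and "update existing comment" cases easy to test in isolation.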
[llvm-branch-commits] [CI][NFC] Refactor compute_platform_title into generate_test_report_lib (PR #166604)
https://github.com/boomanaiden154 created
https://github.com/llvm/llvm-project/pull/166604
This enables reuse in other CI components, like
premerge_advisor_explain.py.
[llvm-branch-commits] [llvm] [CI] Make premerge_advisor_explain write comments (PR #166605)
https://github.com/boomanaiden154 updated
https://github.com/llvm/llvm-project/pull/166605
>From 06c030dcb4ee57be287beffd96d1b21ef1697dd4 Mon Sep 17 00:00:00 2001
From: Aiden Grossman
Date: Wed, 5 Nov 2025 18:23:46 +
Subject: [PATCH] fix
Created using spr 1.3.7
---
.ci/premerge_advisor_explain.py | 34 -
.ci/utils.sh| 10 +-
2 files changed, 22 insertions(+), 22 deletions(-)
diff --git a/.ci/premerge_advisor_explain.py b/.ci/premerge_advisor_explain.py
index 4d840a33c3cf2..1d487af9e9ec7 100644
--- a/.ci/premerge_advisor_explain.py
+++ b/.ci/premerge_advisor_explain.py
@@ -31,22 +31,11 @@ def get_comment_id(platform: str, pr: github.PullRequest.PullRequest) -> int | N
 def get_comment(
     github_token: str,
     pr_number: int,
-    junit_objects,
-    ninja_logs,
-    advisor_response,
-    return_code,
+    body: str,
 ) -> dict[str, str]:
     repo = github.Github(github_token).get_repo("llvm/llvm-project")
     pr = repo.get_issue(pr_number).as_pull_request()
-    comment = {
-        "body": generate_test_report_lib.generate_report(
-            generate_test_report_lib.compute_platform_title(),
-            return_code,
-            junit_objects,
-            ninja_logs,
-            failure_explanations_list=advisor_response,
-        )
-    }
+    comment = {"body": body}
     comment_id = get_comment_id(platform.system(), pr)
     if comment_id:
         comment["id"] = comment_id
@@ -59,6 +48,14 @@ def main(
     pr_number: int,
     return_code: int,
 ):
+    if return_code == 0:
+        with open("comment", "w") as comment_file_handle:
+            comment = get_comment(
+                ":white_check_mark: With the latest revision this PR passed "
+                "the premerge checks."
+            )
+            if comment["id"]:
+                json.dump([comment], comment_file_handle)
     junit_objects, ninja_logs = generate_test_report_lib.load_info_from_files(
         build_log_files
     )
@@ -90,10 +87,13 @@ def main(
get_comment(
github_token,
pr_number,
-junit_objects,
-ninja_logs,
-advisor_response.json(),
-return_code,
+generate_test_report_lib.generate_report(
+generate_test_report_lib.compute_platform_title(),
+return_code,
+junit_objects,
+ninja_logs,
+failure_explanations_list=advisor_response.json(),
+),
)
]
with open("comment", "w") as comment_file_handle:
diff --git a/.ci/utils.sh b/.ci/utils.sh
index 72f4b04f5bf3a..91c27319f3534 100644
--- a/.ci/utils.sh
+++ b/.ci/utils.sh
@@ -33,18 +33,18 @@ function at-exit {
# If building fails there will be no results files.
shopt -s nullglob
- if [[ "$GITHUB_STEP_SUMMARY" != "" ]]; then
+ if [[ "$GITHUB_ACTIONS" != "" ]]; then
python "${MONOREPO_ROOT}"/.ci/generate_test_report_github.py \
$retcode "${BUILD_DIR}"/test-results.*.xml "${MONOREPO_ROOT}"/ninja*.log
\
>> $GITHUB_STEP_SUMMARY
+python "${MONOREPO_ROOT}"/.ci/premerge_advisor_explain.py \
+ $(git rev-parse HEAD~1) $retcode ${{ secrets.GITHUB_TOKEN }} \
+ $GITHUB_PR_NUMBER "${BUILD_DIR}"/test-results.*.xml \
+ "${MONOREPO_ROOT}"/ninja*.log
fi
if [[ "$retcode" != "0" ]]; then
if [[ "$GITHUB_ACTIONS" != "" ]]; then
- python "${MONOREPO_ROOT}"/.ci/premerge_advisor_explain.py \
-$(git rev-parse HEAD~1) $retcode ${{ secrets.GITHUB_TOKEN }} \
-$GITHUB_PR_NUMBER "${BUILD_DIR}"/test-results.*.xml \
-"${MONOREPO_ROOT}"/ninja*.log
python "${MONOREPO_ROOT}"/.ci/premerge_advisor_upload.py \
$(git rev-parse HEAD~1) $GITHUB_RUN_NUMBER \
"${BUILD_DIR}"/test-results.*.xml "${MONOREPO_ROOT}"/ninja*.log
[llvm-branch-commits] [compiler-rt][sanitizers] Mark three tests as unsupported on Android (PR #166639)
https://github.com/boomanaiden154 created
https://github.com/llvm/llvm-project/pull/166639
These tests were already XFailed on Android, but are unresolved when
running under the internal shell rather than failing due to missing file
paths, which is likely the same reason they are xfailed. This does make
it slightly worse if someone accidentally fixes these tests for Android
without realizing it, but the alternative is likely fixing the
functionality/test on Android.
[llvm-branch-commits] [compiler-rt][HWAsan] Remove CHECK lines from test (PR #166638)
https://github.com/boomanaiden154 created
https://github.com/llvm/llvm-project/pull/166638
These check lines were added in 144dae207a3b1750ec94553248bf44c359b6d452
as part of reenabling on Linux. The check lines were added using an or
clause though that gets short circuited, so were never actually
executed. Fixing the short circuit so they do execute reveals the
filecheck assertions no longer pass. Remove them for now given they did
not exist in the original test.
This would cause failures on the internal shell given the () syntax is
not understood by the internal shell.
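The short-circuit behaviour described here can be reproduced with a small
Python sketch that runs two command lines through a POSIX `sh`. This is
only an illustration: the helper name `run_shell` and the `echo check-ran`
stand-in for the `cat ... | FileCheck` step are assumptions, and it assumes
a system shell rather than lit's internal shell.

```python
import subprocess


def run_shell(command: str) -> subprocess.CompletedProcess:
    """Run a command line through POSIX sh and capture its output."""
    return subprocess.run(
        ["sh", "-c", command], capture_output=True, text=True, check=False
    )


# The left-hand side succeeds, so the right-hand side of || is skipped.
passing = run_shell("true || (echo check-ran)")
# The left-hand side fails, so the right-hand side runs.
failing = run_shell("false || (echo check-ran)")

print(repr(passing.stdout))  # ''
print(repr(failing.stdout))  # 'check-ran\n'
```

When the command before `||` exits successfully, the parenthesized check
never runs, which is why such assertions can silently stop being exercised.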
[llvm-branch-commits] [compiler-rt][HWAsan] Remove CHECK lines from test (PR #166638)
llvmbot wrote:
@llvm/pr-subscribers-compiler-rt-sanitizer
Author: Aiden Grossman (boomanaiden154)
Changes
These check lines were added in 144dae207a3b1750ec94553248bf44c359b6d452
as part of reenabling on Linux. The check lines were added using an or
clause though that gets short circuited, so were never actually
executed. Fixing the short circuit so they do execute reveals the
filecheck assertions no longer pass. Remove them for now given they did
not exist in the original test.
This would cause failures on the internal shell given the () syntax is
not understood by the internal shell.
---
Full diff: https://github.com/llvm/llvm-project/pull/166638.diff
1 Files Affected:
- (modified) compiler-rt/test/hwasan/TestCases/Linux/fixed-shadow.c (+2-4)
``diff
diff --git a/compiler-rt/test/hwasan/TestCases/Linux/fixed-shadow.c b/compiler-rt/test/hwasan/TestCases/Linux/fixed-shadow.c
index fc83b213561c8..08a04fc305ffb 100644
--- a/compiler-rt/test/hwasan/TestCases/Linux/fixed-shadow.c
+++ b/compiler-rt/test/hwasan/TestCases/Linux/fixed-shadow.c
@@ -3,12 +3,12 @@
// Default compiler instrumentation works with any shadow base (dynamic or fixed).
// RUN: %clang_hwasan %s -o %t
// RUN: %run %t
-// RUN: env HWASAN_OPTIONS=fixed_shadow_base=263878495698944 %run %t 2>%t.out || (cat %t.out | FileCheck %s)
+// RUN: env HWASAN_OPTIONS=fixed_shadow_base=263878495698944 %run %t
// RUN: env HWASAN_OPTIONS=fixed_shadow_base=4398046511104 %run %t
//
// If -hwasan-mapping-offset is set, then the fixed_shadow_base needs to match.
// RUN: %clang_hwasan %s -mllvm -hwasan-mapping-offset=263878495698944 -o %t
-// RUN: env HWASAN_OPTIONS=fixed_shadow_base=263878495698944 %run %t 2>%t.out || (cat %t.out | FileCheck %s)
+// RUN: env HWASAN_OPTIONS=fixed_shadow_base=263878495698944 %run %t
// RUN: env HWASAN_OPTIONS=fixed_shadow_base=4398046511104 not %run %t
// RUN: %clang_hwasan %s -mllvm -hwasan-mapping-offset=4398046511104 -o %t
@@ -26,8 +26,6 @@
//
// UNSUPPORTED: android
-// CHECK: FATAL: HWAddressSanitizer: Shadow range {{.*}} is not available
-
#include
#include
#include
``
https://github.com/llvm/llvm-project/pull/166638
[llvm-branch-commits] [compiler-rt][sanitizers] Mark three tests as unsupported on Android (PR #166639)
llvmbot wrote:
@llvm/pr-subscribers-compiler-rt-sanitizer
Author: Aiden Grossman (boomanaiden154)
Changes
These tests were already XFailed on Android, but are unresolved when
running under the internal shell rather than failing due to missing file
paths, which is likely the same reason they are xfailed. This does make
it slightly worse if someone accidentally fixes these tests for Android
without realizing it, but the alternative is likely fixing the
functionality/test on Android.
---
Full diff: https://github.com/llvm/llvm-project/pull/166639.diff
3 Files Affected:
- (modified) compiler-rt/test/asan/TestCases/log-path_test.cpp (+1-2)
- (modified) compiler-rt/test/asan/TestCases/verbose-log-path_test.cpp (+2-2)
- (modified) compiler-rt/test/sanitizer_common/TestCases/Posix/sanitizer_set_report_fd_test.cpp (+1-1)
``diff
diff --git a/compiler-rt/test/asan/TestCases/log-path_test.cpp
b/compiler-rt/test/asan/TestCases/log-path_test.cpp
index 6875d57c43cc0..22f077fb54680 100644
--- a/compiler-rt/test/asan/TestCases/log-path_test.cpp
+++ b/compiler-rt/test/asan/TestCases/log-path_test.cpp
@@ -1,6 +1,5 @@
// FIXME: https://code.google.com/p/address-sanitizer/issues/detail?id=316
-// XFAIL: android
-// UNSUPPORTED: ios
+// UNSUPPORTED: ios, android
//
// The for loop in the backticks below requires bash.
// REQUIRES: shell
diff --git a/compiler-rt/test/asan/TestCases/verbose-log-path_test.cpp
b/compiler-rt/test/asan/TestCases/verbose-log-path_test.cpp
index 53166ccded390..f4781a7d47647 100644
--- a/compiler-rt/test/asan/TestCases/verbose-log-path_test.cpp
+++ b/compiler-rt/test/asan/TestCases/verbose-log-path_test.cpp
@@ -9,8 +9,8 @@
// RUN: FileCheck %s --check-prefix=CHECK-ERROR <
%t-dir/asan.log.verbose-log-path_test-binary.*
// FIXME: only FreeBSD, NetBSD and Linux have verbose log paths now.
-// XFAIL: target={{.*windows-msvc.*}},android
-// UNSUPPORTED: ios
+// XFAIL: target={{.*windows-msvc.*}}
+// UNSUPPORTED: ios, android
#include
#include
diff --git a/compiler-rt/test/sanitizer_common/TestCases/Posix/sanitizer_set_report_fd_test.cpp b/compiler-rt/test/sanitizer_common/TestCases/Posix/sanitizer_set_report_fd_test.cpp
index 6ba7025bf7578..68e76bb49f631 100644
--- a/compiler-rt/test/sanitizer_common/TestCases/Posix/sanitizer_set_report_fd_test.cpp
+++ b/compiler-rt/test/sanitizer_common/TestCases/Posix/sanitizer_set_report_fd_test.cpp
@@ -7,7 +7,7 @@
// RUN: not %run %t %t-out && FileCheck < %t-out %s
// REQUIRES: stable-runtime
-// XFAIL: android && asan
+// UNSUPPORTED: android
#include
#include
``
https://github.com/llvm/llvm-project/pull/166639
[llvm-branch-commits] [llvm] [BOLT] Move call probe information to CallSiteInfo (PR #165490)
https://github.com/aaupov edited
https://github.com/llvm/llvm-project/pull/165490
[llvm-branch-commits] [BOLT] Compress YAML pseudo probe information (PR #166680)
https://github.com/aaupov created
https://github.com/llvm/llvm-project/pull/166680
None
[llvm-branch-commits] [BOLT] Compress YAML pseudo probe information (PR #166680)
github-actions[bot] wrote:
:warning: C/C++ code formatter, clang-format found issues in your code. :warning:
You can test this locally with the following command:
```bash
git-clang-format --diff origin/main HEAD --extensions h,cpp -- \
  bolt/include/bolt/Profile/ProfileYAMLMapping.h \
  bolt/include/bolt/Profile/YAMLProfileWriter.h \
  bolt/include/bolt/Utils/CommandLineOpts.h \
  bolt/lib/Profile/DataAggregator.cpp \
  bolt/lib/Profile/YAMLProfileReader.cpp \
  bolt/lib/Profile/YAMLProfileWriter.cpp \
  bolt/lib/Rewrite/PseudoProbeRewriter.cpp --diff_from_common_commit
```
:warning:
The reproduction instructions above might return results for more than one PR
in a stack if you are using a stacked PR workflow. You can limit the results by
changing `origin/main` to the base branch/commit you want to compare against.
:warning:
View the diff from clang-format here.
```diff
diff --git a/bolt/lib/Profile/YAMLProfileWriter.cpp
b/bolt/lib/Profile/YAMLProfileWriter.cpp
index 1667026a8..99d42c4c9 100644
--- a/bolt/lib/Profile/YAMLProfileWriter.cpp
+++ b/bolt/lib/Profile/YAMLProfileWriter.cpp
@@ -486,7 +486,7 @@ std::error_code YAMLProfileWriter::writeProfile(const
RewriteInstance &RI) {
// Add probe inline tree nodes.
InlineTreeDesc InlineTree;
if (const MCPseudoProbeDecoder *Decoder =
- opts::ProfileWritePseudoProbes ? BC.getPseudoProbeDecoder() : nullptr)
+ opts::ProfileWritePseudoProbes ? BC.getPseudoProbeDecoder() :
nullptr)
std::tie(BP.PseudoProbeDesc, InlineTree) =
convertPseudoProbeDesc(*Decoder);
// Add all function objects.
```
https://github.com/llvm/llvm-project/pull/166680
[llvm-branch-commits] [llvm] [AMDGPU] Enable amdgpu-lower-special-lds pass in pipeline (PR #165746)
https://github.com/skc7 updated https://github.com/llvm/llvm-project/pull/165746
>From ca4b858851a2b6c2a0e81fe6d48618332d18ca15 Mon Sep 17 00:00:00 2001
From: skc7
Date: Thu, 30 Oct 2025 22:42:33 +0530
Subject: [PATCH 1/3] [AMDGPU] Enable amdgpu-lower-special-lds pass in pipeline
---
.../AMDGPU/AMDGPULowerModuleLDSPass.cpp | 126 --
llvm/lib/Target/AMDGPU/AMDGPUMemoryUtils.cpp | 6 +
llvm/lib/Target/AMDGPU/AMDGPUSwLowerLDS.cpp | 3 +-
.../lib/Target/AMDGPU/AMDGPUTargetMachine.cpp | 14 ++
...amdgpu-lower-special-lds-and-module-lds.ll | 119 +
.../amdgpu-lower-special-lds-and-sw-lds.ll| 86
llvm/test/CodeGen/AMDGPU/llc-pipeline-npm.ll | 6 +-
llvm/test/CodeGen/AMDGPU/llc-pipeline.ll | 5 +
.../test/CodeGen/AMDGPU/s-barrier-lowering.ll | 2 +-
9 files changed, 236 insertions(+), 131 deletions(-)
create mode 100644
llvm/test/CodeGen/AMDGPU/amdgpu-lower-special-lds-and-module-lds.ll
create mode 100644
llvm/test/CodeGen/AMDGPU/amdgpu-lower-special-lds-and-sw-lds.ll
diff --git a/llvm/lib/Target/AMDGPU/AMDGPULowerModuleLDSPass.cpp
b/llvm/lib/Target/AMDGPU/AMDGPULowerModuleLDSPass.cpp
index a4ef524c43466..3c0328e93ffbd 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPULowerModuleLDSPass.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPULowerModuleLDSPass.cpp
@@ -922,126 +922,6 @@ class AMDGPULowerModuleLDS {
return KernelToCreatedDynamicLDS;
}
- static GlobalVariable *uniquifyGVPerKernel(Module &M, GlobalVariable *GV,
- Function *KF) {
-bool NeedsReplacement = false;
-for (Use &U : GV->uses()) {
- if (auto *I = dyn_cast(U.getUser())) {
-Function *F = I->getFunction();
-if (isKernelLDS(F) && F != KF) {
- NeedsReplacement = true;
- break;
-}
- }
-}
-if (!NeedsReplacement)
- return GV;
-// Create a new GV used only by this kernel and its function
-GlobalVariable *NewGV = new GlobalVariable(
-M, GV->getValueType(), GV->isConstant(), GV->getLinkage(),
-GV->getInitializer(), GV->getName() + "." + KF->getName(), nullptr,
-GV->getThreadLocalMode(), GV->getType()->getAddressSpace());
-NewGV->copyAttributesFrom(GV);
-for (Use &U : make_early_inc_range(GV->uses())) {
- if (auto *I = dyn_cast(U.getUser())) {
-Function *F = I->getFunction();
-if (!isKernelLDS(F) || F == KF) {
- U.getUser()->replaceUsesOfWith(GV, NewGV);
-}
- }
-}
-return NewGV;
- }
-
- bool lowerSpecialLDSVariables(
- Module &M, LDSUsesInfoTy &LDSUsesInfo,
- VariableFunctionMap &LDSToKernelsThatNeedToAccessItIndirectly) {
-bool Changed = false;
-const DataLayout &DL = M.getDataLayout();
-// The 1st round: give module-absolute assignments
-int NumAbsolutes = 0;
-std::vector OrderedGVs;
-for (auto &K : LDSToKernelsThatNeedToAccessItIndirectly) {
- GlobalVariable *GV = K.first;
- if (!isNamedBarrier(*GV))
-continue;
- // give a module-absolute assignment if it is indirectly accessed by
- // multiple kernels. This is not precise, but we don't want to duplicate
- // a function when it is called by multiple kernels.
- if (LDSToKernelsThatNeedToAccessItIndirectly[GV].size() > 1) {
-OrderedGVs.push_back(GV);
- } else {
-// leave it to the 2nd round, which will give a kernel-relative
-// assignment if it is only indirectly accessed by one kernel
-LDSUsesInfo.direct_access[*K.second.begin()].insert(GV);
- }
- LDSToKernelsThatNeedToAccessItIndirectly.erase(GV);
-}
-OrderedGVs = sortByName(std::move(OrderedGVs));
-for (GlobalVariable *GV : OrderedGVs) {
- unsigned BarrierScope = llvm::AMDGPU::Barrier::BARRIER_SCOPE_WORKGROUP;
- unsigned BarId = NumAbsolutes + 1;
- unsigned BarCnt = DL.getTypeAllocSize(GV->getValueType()) / 16;
- NumAbsolutes += BarCnt;
-
- // 4 bits for alignment, 5 bits for the barrier num,
- // 3 bits for the barrier scope
- unsigned Offset = 0x802000u | BarrierScope << 9 | BarId << 4;
- recordLDSAbsoluteAddress(&M, GV, Offset);
-}
-OrderedGVs.clear();
-
-// The 2nd round: give a kernel-relative assignment for GV that
-// either only indirectly accessed by single kernel or only directly
-// accessed by multiple kernels.
-std::vector OrderedKernels;
-for (auto &K : LDSUsesInfo.direct_access) {
- Function *F = K.first;
- assert(isKernelLDS(F));
- OrderedKernels.push_back(F);
-}
-OrderedKernels = sortByName(std::move(OrderedKernels));
-
-llvm::DenseMap Kernel2BarId;
-for (Function *F : OrderedKernels) {
- for (GlobalVariable *GV : LDSUsesInfo.direct_access[F]) {
-if (!isNamedBarrier(*GV))
- continue;
-
-LDSUsesInfo.direct_access[F].erase(GV);
-if (GV->isAbsoluteSymbolRef()) {
- // already assigned
-
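Note on the removed `lowerSpecialLDSVariables` code above: the comment "4 bits for alignment, 5 bits for the barrier num, 3 bits for the barrier scope" describes how the absolute address is packed. Below is a minimal standalone sketch of that packing, based only on the expression shown in the patch. The names `kBarrierBase`, `encodeBarrierOffset`, `decodeBarrierId` and `decodeBarrierScope` are hypothetical and not part of LLVM, and the decode helpers are an assumption added for illustration.

```cpp
#include <cstdint>

// Base constant copied from the expression in the patch.
constexpr uint32_t kBarrierBase = 0x802000u;

// Pack the scope (3 bits, starting at bit 9) and the barrier id
// (5 bits, starting at bit 4); the low 4 bits stay zero for alignment.
constexpr uint32_t encodeBarrierOffset(uint32_t Scope, uint32_t BarId) {
  return kBarrierBase | (Scope << 9) | (BarId << 4);
}

// Hypothetical inverse helpers: recover the fields from a packed offset.
constexpr uint32_t decodeBarrierId(uint32_t Offset) {
  return (Offset >> 4) & 0x1Fu;
}
constexpr uint32_t decodeBarrierScope(uint32_t Offset) {
  return (Offset >> 9) & 0x7u;
}
```

In the removed code, `BarId` starts at `NumAbsolutes + 1` and `NumAbsolutes` grows by the variable's size divided by 16, so consecutive named barriers get consecutive id ranges.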
[llvm-branch-commits] [llvm] [AMDGPU] Enable amdgpu-lower-exec-sync pass in pipeline (PR #165746)
https://github.com/skc7 edited https://github.com/llvm/llvm-project/pull/165746
[llvm-branch-commits] [llvm] [LIR][profcheck] Reuse the loop's exit condition profile (PR #164523)
https://github.com/mtrofin updated
https://github.com/llvm/llvm-project/pull/164523
>From a4e7fae2b33bb24e688bd6f28d8b0f78589574e2 Mon Sep 17 00:00:00 2001
From: Mircea Trofin
Date: Tue, 21 Oct 2025 17:24:49 -0700
Subject: [PATCH] [LIR][profcheck] Reuse the loop's exit condition profile
---
.../Transforms/Scalar/LoopIdiomRecognize.cpp | 40 +--
.../LoopIdiom/X86/preserve-profile.ll | 70 +++
2 files changed, 106 insertions(+), 4 deletions(-)
create mode 100644 llvm/test/Transforms/LoopIdiom/X86/preserve-profile.ll
diff --git a/llvm/lib/Transforms/Scalar/LoopIdiomRecognize.cpp
b/llvm/lib/Transforms/Scalar/LoopIdiomRecognize.cpp
index 019536ca91ae0..9070d252ae09f 100644
--- a/llvm/lib/Transforms/Scalar/LoopIdiomRecognize.cpp
+++ b/llvm/lib/Transforms/Scalar/LoopIdiomRecognize.cpp
@@ -72,6 +72,7 @@
#include "llvm/IR/Module.h"
#include "llvm/IR/PassManager.h"
#include "llvm/IR/PatternMatch.h"
+#include "llvm/IR/ProfDataUtils.h"
#include "llvm/IR/Type.h"
#include "llvm/IR/User.h"
#include "llvm/IR/Value.h"
@@ -105,6 +106,7 @@ STATISTIC(
STATISTIC(NumShiftUntilZero,
"Number of uncountable loops recognized as 'shift until zero'
idiom");
+namespace llvm {
bool DisableLIRP::All;
static cl::opt
DisableLIRPAll("disable-" DEBUG_TYPE "-all",
@@ -163,6 +165,10 @@ static cl::opt ForceMemsetPatternIntrinsic(
cl::desc("Use memset.pattern intrinsic whenever possible"),
cl::init(false),
cl::Hidden);
+extern cl::opt ProfcheckDisableMetadataFixes;
+
+} // namespace llvm
+
namespace {
class LoopIdiomRecognize {
@@ -3199,7 +3205,21 @@ bool LoopIdiomRecognize::recognizeShiftUntilBitTest() {
// The loop trip count check.
auto *IVCheck = Builder.CreateICmpEQ(IVNext, LoopTripCount,
CurLoop->getName() + ".ivcheck");
- Builder.CreateCondBr(IVCheck, SuccessorBB, LoopHeaderBB);
+ SmallVector BranchWeights;
+ const bool HasBranchWeights =
+ !ProfcheckDisableMetadataFixes &&
+ extractBranchWeights(*LoopHeaderBB->getTerminator(), BranchWeights);
+
+ auto *BI = Builder.CreateCondBr(IVCheck, SuccessorBB, LoopHeaderBB);
+ if (HasBranchWeights) {
+if (SuccessorBB == LoopHeaderBB->getTerminator()->getSuccessor(1))
+ std::swap(BranchWeights[0], BranchWeights[1]);
+// We're not changing the loop profile, so we can reuse the original loop's
+// profile.
+setBranchWeights(*BI, BranchWeights,
+ /*IsExpected=*/false);
+ }
+
LoopHeaderBB->getTerminator()->eraseFromParent();
// Populate the IV PHI.
@@ -3368,10 +3388,10 @@ static bool detectShiftUntilZeroIdiom(Loop *CurLoop,
ScalarEvolution *SE,
/// %start = <...>
/// %extraoffset = <...>
/// <...>
-/// br label %for.cond
+/// br label %loop
///
/// loop:
-/// %iv = phi i8 [ %start, %entry ], [ %iv.next, %for.cond ]
+/// %iv = phi i8 [ %start, %entry ], [ %iv.next, %loop ]
/// %nbits = add nsw i8 %iv, %extraoffset
/// %val.shifted = {{l,a}shr,shl} i8 %val, %nbits
/// %val.shifted.iszero = icmp eq i8 %val.shifted, 0
@@ -3533,7 +3553,19 @@ bool LoopIdiomRecognize::recognizeShiftUntilZero() {
// The loop terminator.
Builder.SetInsertPoint(LoopHeaderBB->getTerminator());
- Builder.CreateCondBr(CIVCheck, SuccessorBB, LoopHeaderBB);
+ SmallVector BranchWeights;
+ const bool HasBranchWeights =
+ !ProfcheckDisableMetadataFixes &&
+ extractBranchWeights(*LoopHeaderBB->getTerminator(), BranchWeights);
+
+ auto *BI = Builder.CreateCondBr(CIVCheck, SuccessorBB, LoopHeaderBB);
+ if (HasBranchWeights) {
+if (InvertedCond)
+ std::swap(BranchWeights[0], BranchWeights[1]);
+// We're not changing the loop profile, so we can reuse the original loop's
+// profile.
+setBranchWeights(*BI, BranchWeights, /*IsExpected=*/false);
+ }
LoopHeaderBB->getTerminator()->eraseFromParent();
// Populate the IV PHI.
diff --git a/llvm/test/Transforms/LoopIdiom/X86/preserve-profile.ll
b/llvm/test/Transforms/LoopIdiom/X86/preserve-profile.ll
new file mode 100644
index 0..d01bb748d9422
--- /dev/null
+++ b/llvm/test/Transforms/LoopIdiom/X86/preserve-profile.ll
@@ -0,0 +1,70 @@
+; RUN: opt
-passes="module(print),function(loop(loop-idiom)),module(print)"
-mtriple=x86_64 -mcpu=core-avx2 %s -disable-output 2>&1 | FileCheck
--check-prefix=PROFILE %s
+
+declare void @escape_inner(i8, i8, i8, i1, i8)
+declare void @escape_outer(i8, i8, i8, i1, i8)
+
+declare i8 @gen.i8()
+
+; Most basic pattern; Note that iff the shift amount is offset, said offsetting
+; must not cause an overflow, but `add nsw` is fine.
+define i8 @p0(i8 %val, i8 %start, i8 %extraoffset) mustprogress {
+entry:
+ br label %loop
+
+loop:
+ %iv = phi i8 [ %start, %entry ], [ %iv.next, %loop ]
+ %nbits = add nsw i8 %iv, %extraoffset
+ %val.shifted = ashr i8 %val, %nbits
+ %val.shifted.iszero = icmp eq i8 %val.shifted, 0
+ %iv.next = add i8 %iv, 1
+
+ call void @escap
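Note on the two `CreateCondBr` hunks above: both read the weights of the old loop terminator, swap the pair when the new branch lists its successors in the opposite order, and attach the result to the new branch. A standalone sketch of that idea with no LLVM dependency follows; `reuseExitWeights` and its parameters are hypothetical names used only for illustration.

```cpp
#include <cstdint>
#include <utility>
#include <vector>

// Given the {true-successor, false-successor} weights of the original
// conditional branch, return the weights for a replacement branch.
// If the replacement lists its successors in the opposite order, the
// two weights are swapped so each destination keeps its original weight.
std::vector<uint32_t> reuseExitWeights(std::vector<uint32_t> Weights,
                                       bool SuccessorsSwapped) {
  if (Weights.size() == 2 && SuccessorsSwapped)
    std::swap(Weights[0], Weights[1]);
  return Weights;
}
```

The total weight is unchanged, which matches the patch comment that the loop profile itself is not being altered.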
[llvm-branch-commits] [llvm] [LSCFG][profcheck] Add dummy branch weights for the dummy switch to dead exits (PR #164714)
https://github.com/mtrofin updated
https://github.com/llvm/llvm-project/pull/164714
>From 6efd7cb0a75d10bdfce54a7dfa611c06940b7801 Mon Sep 17 00:00:00 2001
From: Mircea Trofin
Date: Wed, 22 Oct 2025 14:34:31 -0700
Subject: [PATCH] [LSCFG][profcheck] Add dummy branch weights for the dummy
switch to dead exits
---
.../lib/Transforms/Scalar/LoopSimplifyCFG.cpp | 12 ++
.../LoopSimplifyCFG/constant-fold-branch.ll | 104 +-
2 files changed, 66 insertions(+), 50 deletions(-)
diff --git a/llvm/lib/Transforms/Scalar/LoopSimplifyCFG.cpp
b/llvm/lib/Transforms/Scalar/LoopSimplifyCFG.cpp
index b9546c5fa236b..e902b71776973 100644
--- a/llvm/lib/Transforms/Scalar/LoopSimplifyCFG.cpp
+++ b/llvm/lib/Transforms/Scalar/LoopSimplifyCFG.cpp
@@ -24,6 +24,7 @@
#include "llvm/Analysis/ScalarEvolution.h"
#include "llvm/IR/Dominators.h"
#include "llvm/IR/IRBuilder.h"
+#include "llvm/IR/ProfDataUtils.h"
#include "llvm/Support/CommandLine.h"
#include "llvm/Transforms/Scalar.h"
#include "llvm/Transforms/Scalar/LoopPassManager.h"
@@ -393,6 +394,17 @@ class ConstantTerminatorFoldingImpl {
DTUpdates.push_back({DominatorTree::Insert, Preheader, BB});
++NumLoopExitsDeleted;
}
+// We don't really need to add branch weights to DummySwitch, because all
+// but one branches are just a temporary artifact - see the comment on top
+// of this function. But, it's easy to estimate the weights, and it helps
+// maintain a property of the overall compiler - that the branch weights
+// don't "just get dropped" accidentally (i.e. profcheck)
+if (DummySwitch->getParent()->getParent()->hasProfileData()) {
+ SmallVector DummyBranchWeights(1 + DummySwitch->getNumCases());
+ // default. 100% probability, the rest are dead.
+ DummyBranchWeights[0] = 1;
+ setBranchWeights(*DummySwitch, DummyBranchWeights, /*IsExpected=*/false);
+}
assert(L.getLoopPreheader() == NewPreheader && "Malformed CFG?");
if (Loop *OuterLoop = LI.getLoopFor(Preheader)) {
diff --git a/llvm/test/Transforms/LoopSimplifyCFG/constant-fold-branch.ll
b/llvm/test/Transforms/LoopSimplifyCFG/constant-fold-branch.ll
index 1ec212f0bb5ea..46b6209986fed 100644
--- a/llvm/test/Transforms/LoopSimplifyCFG/constant-fold-branch.ll
+++ b/llvm/test/Transforms/LoopSimplifyCFG/constant-fold-branch.ll
@@ -1,4 +1,4 @@
-; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
UTC_ARGS: --check-globals
; REQUIRES: asserts
; RUN: opt -S -enable-loop-simplifycfg-term-folding=true
-passes=loop-simplifycfg -verify-loop-info -verify-dom-info -verify-loop-lcssa
< %s | FileCheck %s
; RUN: opt -S -enable-loop-simplifycfg-term-folding=true
-passes='require,loop(loop-simplifycfg)' -verify-loop-info
-verify-dom-info -verify-loop-lcssa < %s | FileCheck %s
@@ -59,7 +59,7 @@ define i32 @dead_backedge_test_switch_loop(i32 %end) {
; CHECK: dead_backedge:
; CHECK-NEXT:[[I_2]] = add i32 [[I_1]], 10
; CHECK-NEXT:switch i32 1, label [[EXIT:%.*]] [
-; CHECK-NEXT:i32 0, label [[HEADER_BACKEDGE]]
+; CHECK-NEXT: i32 0, label [[HEADER_BACKEDGE]]
; CHECK-NEXT:]
; CHECK: exit:
; CHECK-NEXT:[[I_2_LCSSA:%.*]] = phi i32 [ [[I_2]], [[DEAD_BACKEDGE]] ]
@@ -233,12 +233,12 @@ exit:
; Check that we preserve static reachibility of a dead exit block while
deleting
; a branch.
-define i32 @dead_exit_test_branch_loop(i32 %end) {
+define i32 @dead_exit_test_branch_loop(i32 %end) !prof
!{!"function_entry_count", i32 10} {
; CHECK-LABEL: @dead_exit_test_branch_loop(
; CHECK-NEXT: preheader:
; CHECK-NEXT:switch i32 0, label [[PREHEADER_SPLIT:%.*]] [
-; CHECK-NEXT:i32 1, label [[DEAD:%.*]]
-; CHECK-NEXT:]
+; CHECK-NEXT: i32 1, label [[DEAD:%.*]]
+; CHECK-NEXT:], !prof [[PROF1:![0-9]+]]
; CHECK: preheader.split:
; CHECK-NEXT:br label [[HEADER:%.*]]
; CHECK: header:
@@ -262,7 +262,7 @@ preheader:
header:
%i = phi i32 [0, %preheader], [%i.inc, %backedge]
- br i1 true, label %backedge, label %dead
+ br i1 true, label %backedge, label %dead, !prof !{!"branch_weights", i32 10,
i32 1}
dead:
br label %dummy
@@ -286,7 +286,7 @@ define i32 @dead_exit_test_switch_loop(i32 %end) {
; CHECK-LABEL: @dead_exit_test_switch_loop(
; CHECK-NEXT: preheader:
; CHECK-NEXT:switch i32 0, label [[PREHEADER_SPLIT:%.*]] [
-; CHECK-NEXT:i32 1, label [[DEAD:%.*]]
+; CHECK-NEXT: i32 1, label [[DEAD:%.*]]
; CHECK-NEXT:]
; CHECK: preheader.split:
; CHECK-NEXT:br label [[HEADER:%.*]]
@@ -383,9 +383,9 @@ define i32 @dead_loop_test_switch_loop(i32 %end) {
; CHECK: header:
; CHECK-NEXT:[[I:%.*]] = phi i32 [ 0, [[PREHEADER:%.*]] ], [
[[I_INC:%.*]], [[BACKEDGE:%.*]] ]
; CHECK-NEXT:switch i32 1, label [[DEAD:%.*]] [
-; CHECK-NEXT:i32 0, label [[DEAD]]
-; CHECK-NEXT:i32 1, label [[BACKEDGE]]
-; CHECK-NEXT:i32 2, lab
[llvm-branch-commits] [llvm] [LSCFG][profcheck] Add dummy branch weights for the dummy switch to dead exits (PR #164714)
https://github.com/mtrofin updated
https://github.com/llvm/llvm-project/pull/164714
>From aaa20e689ee9be4aa7e43c612a363973d5459f6e Mon Sep 17 00:00:00 2001
From: Mircea Trofin
Date: Wed, 22 Oct 2025 14:34:31 -0700
Subject: [PATCH] [LSCFG][profcheck] Add dummy branch weights for the dummy
switch to dead exits
---
.../lib/Transforms/Scalar/LoopSimplifyCFG.cpp | 12 ++
.../LoopSimplifyCFG/constant-fold-branch.ll | 104 +-
2 files changed, 66 insertions(+), 50 deletions(-)
diff --git a/llvm/lib/Transforms/Scalar/LoopSimplifyCFG.cpp
b/llvm/lib/Transforms/Scalar/LoopSimplifyCFG.cpp
index b9546c5fa236b..e902b71776973 100644
--- a/llvm/lib/Transforms/Scalar/LoopSimplifyCFG.cpp
+++ b/llvm/lib/Transforms/Scalar/LoopSimplifyCFG.cpp
@@ -24,6 +24,7 @@
#include "llvm/Analysis/ScalarEvolution.h"
#include "llvm/IR/Dominators.h"
#include "llvm/IR/IRBuilder.h"
+#include "llvm/IR/ProfDataUtils.h"
#include "llvm/Support/CommandLine.h"
#include "llvm/Transforms/Scalar.h"
#include "llvm/Transforms/Scalar/LoopPassManager.h"
@@ -393,6 +394,17 @@ class ConstantTerminatorFoldingImpl {
DTUpdates.push_back({DominatorTree::Insert, Preheader, BB});
++NumLoopExitsDeleted;
}
+// We don't really need to add branch weights to DummySwitch, because all
+// but one branches are just a temporary artifact - see the comment on top
+// of this function. But, it's easy to estimate the weights, and it helps
+// maintain a property of the overall compiler - that the branch weights
+// don't "just get dropped" accidentally (i.e. profcheck)
+if (DummySwitch->getParent()->getParent()->hasProfileData()) {
+ SmallVector DummyBranchWeights(1 + DummySwitch->getNumCases());
+ // default. 100% probability, the rest are dead.
+ DummyBranchWeights[0] = 1;
+ setBranchWeights(*DummySwitch, DummyBranchWeights, /*IsExpected=*/false);
+}
assert(L.getLoopPreheader() == NewPreheader && "Malformed CFG?");
if (Loop *OuterLoop = LI.getLoopFor(Preheader)) {
diff --git a/llvm/test/Transforms/LoopSimplifyCFG/constant-fold-branch.ll
b/llvm/test/Transforms/LoopSimplifyCFG/constant-fold-branch.ll
index 1ec212f0bb5ea..46b6209986fed 100644
--- a/llvm/test/Transforms/LoopSimplifyCFG/constant-fold-branch.ll
+++ b/llvm/test/Transforms/LoopSimplifyCFG/constant-fold-branch.ll
@@ -1,4 +1,4 @@
-; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
UTC_ARGS: --check-globals
; REQUIRES: asserts
; RUN: opt -S -enable-loop-simplifycfg-term-folding=true
-passes=loop-simplifycfg -verify-loop-info -verify-dom-info -verify-loop-lcssa
< %s | FileCheck %s
; RUN: opt -S -enable-loop-simplifycfg-term-folding=true
-passes='require,loop(loop-simplifycfg)' -verify-loop-info
-verify-dom-info -verify-loop-lcssa < %s | FileCheck %s
@@ -59,7 +59,7 @@ define i32 @dead_backedge_test_switch_loop(i32 %end) {
; CHECK: dead_backedge:
; CHECK-NEXT:[[I_2]] = add i32 [[I_1]], 10
; CHECK-NEXT:switch i32 1, label [[EXIT:%.*]] [
-; CHECK-NEXT:i32 0, label [[HEADER_BACKEDGE]]
+; CHECK-NEXT: i32 0, label [[HEADER_BACKEDGE]]
; CHECK-NEXT:]
; CHECK: exit:
; CHECK-NEXT:[[I_2_LCSSA:%.*]] = phi i32 [ [[I_2]], [[DEAD_BACKEDGE]] ]
@@ -233,12 +233,12 @@ exit:
; Check that we preserve static reachibility of a dead exit block while
deleting
; a branch.
-define i32 @dead_exit_test_branch_loop(i32 %end) {
+define i32 @dead_exit_test_branch_loop(i32 %end) !prof
!{!"function_entry_count", i32 10} {
; CHECK-LABEL: @dead_exit_test_branch_loop(
; CHECK-NEXT: preheader:
; CHECK-NEXT:switch i32 0, label [[PREHEADER_SPLIT:%.*]] [
-; CHECK-NEXT:i32 1, label [[DEAD:%.*]]
-; CHECK-NEXT:]
+; CHECK-NEXT: i32 1, label [[DEAD:%.*]]
+; CHECK-NEXT:], !prof [[PROF1:![0-9]+]]
; CHECK: preheader.split:
; CHECK-NEXT:br label [[HEADER:%.*]]
; CHECK: header:
@@ -262,7 +262,7 @@ preheader:
header:
%i = phi i32 [0, %preheader], [%i.inc, %backedge]
- br i1 true, label %backedge, label %dead
+ br i1 true, label %backedge, label %dead, !prof !{!"branch_weights", i32 10,
i32 1}
dead:
br label %dummy
@@ -286,7 +286,7 @@ define i32 @dead_exit_test_switch_loop(i32 %end) {
; CHECK-LABEL: @dead_exit_test_switch_loop(
; CHECK-NEXT: preheader:
; CHECK-NEXT:switch i32 0, label [[PREHEADER_SPLIT:%.*]] [
-; CHECK-NEXT:i32 1, label [[DEAD:%.*]]
+; CHECK-NEXT: i32 1, label [[DEAD:%.*]]
; CHECK-NEXT:]
; CHECK: preheader.split:
; CHECK-NEXT:br label [[HEADER:%.*]]
@@ -383,9 +383,9 @@ define i32 @dead_loop_test_switch_loop(i32 %end) {
; CHECK: header:
; CHECK-NEXT:[[I:%.*]] = phi i32 [ 0, [[PREHEADER:%.*]] ], [
[[I_INC:%.*]], [[BACKEDGE:%.*]] ]
; CHECK-NEXT:switch i32 1, label [[DEAD:%.*]] [
-; CHECK-NEXT:i32 0, label [[DEAD]]
-; CHECK-NEXT:i32 1, label [[BACKEDGE]]
-; CHECK-NEXT:i32 2, lab
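Note on the `DummySwitch` hunk above: the weights vector has one slot for the default destination plus one per case, and only the default slot is non-zero. A standalone sketch of that construction, assuming nothing beyond what the patch shows; `makeDummySwitchWeights` is a hypothetical helper, not an LLVM API.

```cpp
#include <cstdint>
#include <vector>

// Build weights for a switch whose default destination is always taken.
// Index 0 is the default; indices 1..NumCases are the (dead) cases.
std::vector<uint32_t> makeDummySwitchWeights(unsigned NumCases) {
  std::vector<uint32_t> Weights(1 + NumCases); // value-initialized to 0
  Weights[0] = 1;                              // default: 100% probability
  return Weights;
}
```

For a switch with two cases this yields `{1, 0, 0}`.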
[llvm-branch-commits] [llvm] [LIR][profcheck] Reuse the loop's exit condition profile (PR #164523)
https://github.com/mtrofin updated
https://github.com/llvm/llvm-project/pull/164523
>From d41e0bec22c98ba940fbeea4e9fd5351279ecdc1 Mon Sep 17 00:00:00 2001
From: Mircea Trofin
Date: Tue, 21 Oct 2025 17:24:49 -0700
Subject: [PATCH] [LIR][profcheck] Reuse the loop's exit condition profile
---
.../Transforms/Scalar/LoopIdiomRecognize.cpp | 40 +--
.../LoopIdiom/X86/preserve-profile.ll | 70 +++
2 files changed, 106 insertions(+), 4 deletions(-)
create mode 100644 llvm/test/Transforms/LoopIdiom/X86/preserve-profile.ll
diff --git a/llvm/lib/Transforms/Scalar/LoopIdiomRecognize.cpp
b/llvm/lib/Transforms/Scalar/LoopIdiomRecognize.cpp
index 019536ca91ae0..9070d252ae09f 100644
--- a/llvm/lib/Transforms/Scalar/LoopIdiomRecognize.cpp
+++ b/llvm/lib/Transforms/Scalar/LoopIdiomRecognize.cpp
@@ -72,6 +72,7 @@
#include "llvm/IR/Module.h"
#include "llvm/IR/PassManager.h"
#include "llvm/IR/PatternMatch.h"
+#include "llvm/IR/ProfDataUtils.h"
#include "llvm/IR/Type.h"
#include "llvm/IR/User.h"
#include "llvm/IR/Value.h"
@@ -105,6 +106,7 @@ STATISTIC(
STATISTIC(NumShiftUntilZero,
"Number of uncountable loops recognized as 'shift until zero'
idiom");
+namespace llvm {
bool DisableLIRP::All;
static cl::opt
DisableLIRPAll("disable-" DEBUG_TYPE "-all",
@@ -163,6 +165,10 @@ static cl::opt ForceMemsetPatternIntrinsic(
cl::desc("Use memset.pattern intrinsic whenever possible"),
cl::init(false),
cl::Hidden);
+extern cl::opt<bool> ProfcheckDisableMetadataFixes;
+
+} // namespace llvm
+
namespace {
class LoopIdiomRecognize {
@@ -3199,7 +3205,21 @@ bool LoopIdiomRecognize::recognizeShiftUntilBitTest() {
// The loop trip count check.
auto *IVCheck = Builder.CreateICmpEQ(IVNext, LoopTripCount,
CurLoop->getName() + ".ivcheck");
- Builder.CreateCondBr(IVCheck, SuccessorBB, LoopHeaderBB);
+ SmallVector<uint32_t> BranchWeights;
+ const bool HasBranchWeights =
+ !ProfcheckDisableMetadataFixes &&
+ extractBranchWeights(*LoopHeaderBB->getTerminator(), BranchWeights);
+
+ auto *BI = Builder.CreateCondBr(IVCheck, SuccessorBB, LoopHeaderBB);
+ if (HasBranchWeights) {
+if (SuccessorBB == LoopHeaderBB->getTerminator()->getSuccessor(1))
+ std::swap(BranchWeights[0], BranchWeights[1]);
+// We're not changing the loop profile, so we can reuse the original loop's
+// profile.
+setBranchWeights(*BI, BranchWeights,
+ /*IsExpected=*/false);
+ }
+
LoopHeaderBB->getTerminator()->eraseFromParent();
// Populate the IV PHI.
@@ -3368,10 +3388,10 @@ static bool detectShiftUntilZeroIdiom(Loop *CurLoop,
ScalarEvolution *SE,
/// %start = <...>
/// %extraoffset = <...>
/// <...>
-/// br label %for.cond
+/// br label %loop
///
/// loop:
-/// %iv = phi i8 [ %start, %entry ], [ %iv.next, %for.cond ]
+/// %iv = phi i8 [ %start, %entry ], [ %iv.next, %loop ]
/// %nbits = add nsw i8 %iv, %extraoffset
/// %val.shifted = {{l,a}shr,shl} i8 %val, %nbits
/// %val.shifted.iszero = icmp eq i8 %val.shifted, 0
@@ -3533,7 +3553,19 @@ bool LoopIdiomRecognize::recognizeShiftUntilZero() {
// The loop terminator.
Builder.SetInsertPoint(LoopHeaderBB->getTerminator());
- Builder.CreateCondBr(CIVCheck, SuccessorBB, LoopHeaderBB);
+ SmallVector<uint32_t> BranchWeights;
+ const bool HasBranchWeights =
+ !ProfcheckDisableMetadataFixes &&
+ extractBranchWeights(*LoopHeaderBB->getTerminator(), BranchWeights);
+
+ auto *BI = Builder.CreateCondBr(CIVCheck, SuccessorBB, LoopHeaderBB);
+ if (HasBranchWeights) {
+if (InvertedCond)
+ std::swap(BranchWeights[0], BranchWeights[1]);
+// We're not changing the loop profile, so we can reuse the original loop's
+// profile.
+setBranchWeights(*BI, BranchWeights, /*IsExpected=*/false);
+ }
LoopHeaderBB->getTerminator()->eraseFromParent();
// Populate the IV PHI.
diff --git a/llvm/test/Transforms/LoopIdiom/X86/preserve-profile.ll
b/llvm/test/Transforms/LoopIdiom/X86/preserve-profile.ll
new file mode 100644
index 0..d01bb748d9422
--- /dev/null
+++ b/llvm/test/Transforms/LoopIdiom/X86/preserve-profile.ll
@@ -0,0 +1,70 @@
+; RUN: opt
-passes="module(print),function(loop(loop-idiom)),module(print)"
-mtriple=x86_64 -mcpu=core-avx2 %s -disable-output 2>&1 | FileCheck
--check-prefix=PROFILE %s
+
+declare void @escape_inner(i8, i8, i8, i1, i8)
+declare void @escape_outer(i8, i8, i8, i1, i8)
+
+declare i8 @gen.i8()
+
+; Most basic pattern; Note that iff the shift amount is offset, said offsetting
+; must not cause an overflow, but `add nsw` is fine.
+define i8 @p0(i8 %val, i8 %start, i8 %extraoffset) mustprogress {
+entry:
+ br label %loop
+
+loop:
+ %iv = phi i8 [ %start, %entry ], [ %iv.next, %loop ]
+ %nbits = add nsw i8 %iv, %extraoffset
+ %val.shifted = ashr i8 %val, %nbits
+ %val.shifted.iszero = icmp eq i8 %val.shifted, 0
+ %iv.next = add i8 %iv, 1
+
+ call void @escap
[llvm-branch-commits] [llvm] [LSCFG][profcheck] Add dummy branch weights for the dummy switch to dead exits (PR #164714)
https://github.com/mtrofin updated
https://github.com/llvm/llvm-project/pull/164714
>From aaa20e689ee9be4aa7e43c612a363973d5459f6e Mon Sep 17 00:00:00 2001
From: Mircea Trofin
Date: Wed, 22 Oct 2025 14:34:31 -0700
Subject: [PATCH] [LSCFG][profcheck] Add dummy branch weights for the dummy
switch to dead exits
---
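Reader's note (not part of the commit): a minimal standalone sketch of the weight layout the subject line describes, assuming the usual convention that the first weight belongs to the switch's default destination. `makeDummySwitchWeights` is a hypothetical name and `std::vector` stands in for LLVM's `SmallVector`; this is an illustration, not the patch's code.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical helper: build weights for a switch with NumCases explicit cases
// where only the default destination is ever taken. Slot 0 is the default and
// gets weight 1; every case slot stays at 0 because those edges are dead.
std::vector<uint32_t> makeDummySwitchWeights(unsigned NumCases) {
  std::vector<uint32_t> Weights(1 + NumCases, 0);
  Weights[0] = 1;
  return Weights;
}
```

Any non-zero value in slot 0 would express the same 100% probability; 1 is simply the smallest one.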
.../lib/Transforms/Scalar/LoopSimplifyCFG.cpp | 12 ++
.../LoopSimplifyCFG/constant-fold-branch.ll | 104 +-
2 files changed, 66 insertions(+), 50 deletions(-)
diff --git a/llvm/lib/Transforms/Scalar/LoopSimplifyCFG.cpp
b/llvm/lib/Transforms/Scalar/LoopSimplifyCFG.cpp
index b9546c5fa236b..e902b71776973 100644
--- a/llvm/lib/Transforms/Scalar/LoopSimplifyCFG.cpp
+++ b/llvm/lib/Transforms/Scalar/LoopSimplifyCFG.cpp
@@ -24,6 +24,7 @@
#include "llvm/Analysis/ScalarEvolution.h"
#include "llvm/IR/Dominators.h"
#include "llvm/IR/IRBuilder.h"
+#include "llvm/IR/ProfDataUtils.h"
#include "llvm/Support/CommandLine.h"
#include "llvm/Transforms/Scalar.h"
#include "llvm/Transforms/Scalar/LoopPassManager.h"
@@ -393,6 +394,17 @@ class ConstantTerminatorFoldingImpl {
DTUpdates.push_back({DominatorTree::Insert, Preheader, BB});
++NumLoopExitsDeleted;
}
+// We don't really need to add branch weights to DummySwitch, because all
+// but one branches are just a temporary artifact - see the comment on top
+// of this function. But, it's easy to estimate the weights, and it helps
+// maintain a property of the overall compiler - that the branch weights
+// don't "just get dropped" accidentally (i.e. profcheck)
+if (DummySwitch->getParent()->getParent()->hasProfileData()) {
+ SmallVector<uint32_t> DummyBranchWeights(1 + DummySwitch->getNumCases());
+ // default. 100% probability, the rest are dead.
+ DummyBranchWeights[0] = 1;
+ setBranchWeights(*DummySwitch, DummyBranchWeights, /*IsExpected=*/false);
+}
assert(L.getLoopPreheader() == NewPreheader && "Malformed CFG?");
if (Loop *OuterLoop = LI.getLoopFor(Preheader)) {
diff --git a/llvm/test/Transforms/LoopSimplifyCFG/constant-fold-branch.ll
b/llvm/test/Transforms/LoopSimplifyCFG/constant-fold-branch.ll
index 1ec212f0bb5ea..46b6209986fed 100644
--- a/llvm/test/Transforms/LoopSimplifyCFG/constant-fold-branch.ll
+++ b/llvm/test/Transforms/LoopSimplifyCFG/constant-fold-branch.ll
@@ -1,4 +1,4 @@
-; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
UTC_ARGS: --check-globals
; REQUIRES: asserts
; RUN: opt -S -enable-loop-simplifycfg-term-folding=true
-passes=loop-simplifycfg -verify-loop-info -verify-dom-info -verify-loop-lcssa
< %s | FileCheck %s
; RUN: opt -S -enable-loop-simplifycfg-term-folding=true
-passes='require,loop(loop-simplifycfg)' -verify-loop-info
-verify-dom-info -verify-loop-lcssa < %s | FileCheck %s
@@ -59,7 +59,7 @@ define i32 @dead_backedge_test_switch_loop(i32 %end) {
; CHECK: dead_backedge:
; CHECK-NEXT:[[I_2]] = add i32 [[I_1]], 10
; CHECK-NEXT:switch i32 1, label [[EXIT:%.*]] [
-; CHECK-NEXT:i32 0, label [[HEADER_BACKEDGE]]
+; CHECK-NEXT: i32 0, label [[HEADER_BACKEDGE]]
; CHECK-NEXT:]
; CHECK: exit:
; CHECK-NEXT:[[I_2_LCSSA:%.*]] = phi i32 [ [[I_2]], [[DEAD_BACKEDGE]] ]
@@ -233,12 +233,12 @@ exit:
; Check that we preserve static reachibility of a dead exit block while
deleting
; a branch.
-define i32 @dead_exit_test_branch_loop(i32 %end) {
+define i32 @dead_exit_test_branch_loop(i32 %end) !prof
!{!"function_entry_count", i32 10} {
; CHECK-LABEL: @dead_exit_test_branch_loop(
; CHECK-NEXT: preheader:
; CHECK-NEXT:switch i32 0, label [[PREHEADER_SPLIT:%.*]] [
-; CHECK-NEXT:i32 1, label [[DEAD:%.*]]
-; CHECK-NEXT:]
+; CHECK-NEXT: i32 1, label [[DEAD:%.*]]
+; CHECK-NEXT:], !prof [[PROF1:![0-9]+]]
; CHECK: preheader.split:
; CHECK-NEXT:br label [[HEADER:%.*]]
; CHECK: header:
@@ -262,7 +262,7 @@ preheader:
header:
%i = phi i32 [0, %preheader], [%i.inc, %backedge]
- br i1 true, label %backedge, label %dead
+ br i1 true, label %backedge, label %dead, !prof !{!"branch_weights", i32 10,
i32 1}
dead:
br label %dummy
@@ -286,7 +286,7 @@ define i32 @dead_exit_test_switch_loop(i32 %end) {
; CHECK-LABEL: @dead_exit_test_switch_loop(
; CHECK-NEXT: preheader:
; CHECK-NEXT:switch i32 0, label [[PREHEADER_SPLIT:%.*]] [
-; CHECK-NEXT:i32 1, label [[DEAD:%.*]]
+; CHECK-NEXT: i32 1, label [[DEAD:%.*]]
; CHECK-NEXT:]
; CHECK: preheader.split:
; CHECK-NEXT:br label [[HEADER:%.*]]
@@ -383,9 +383,9 @@ define i32 @dead_loop_test_switch_loop(i32 %end) {
; CHECK: header:
; CHECK-NEXT:[[I:%.*]] = phi i32 [ 0, [[PREHEADER:%.*]] ], [
[[I_INC:%.*]], [[BACKEDGE:%.*]] ]
; CHECK-NEXT:switch i32 1, label [[DEAD:%.*]] [
-; CHECK-NEXT:i32 0, label [[DEAD]]
-; CHECK-NEXT:i32 1, label [[BACKEDGE]]
-; CHECK-NEXT:i32 2, lab
[llvm-branch-commits] [llvm] [dwarf] make dwarf fission compatible with RISCV relaxations 2/2 (PR #164813)
https://github.com/dlav-sc updated
https://github.com/llvm/llvm-project/pull/164813
>From dc7aa182f878a3c6212fff8ffba7a2c0172c6a0c Mon Sep 17 00:00:00 2001
From: Daniil Avdeev
Date: Thu, 18 Sep 2025 02:05:39 +
Subject: [PATCH] [dwarf] make dwarf fission compatible with RISCV relaxations
2/2
This patch makes DWARF fission compatible with RISC-V relaxations by
using indirect addressing for the DW_AT_high_pc attribute. This
eliminates the remaining relocations in .dwo files.
---
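Reader's note (not part of the commit): the choice described in the commit message can be modelled as a small pure function. This is a sketch under stated assumptions, not the patch's code. `HighPcForm` and `chooseHighPcForm` are hypothetical names, and the three inputs stand in for the DWARF version, whether the unit is a .dwo unit, and whether the [Begin, End) range may change size under linker relaxation.

```cpp
// Hypothetical model of how DW_AT_high_pc is encoded.
enum class HighPcForm {
  Address,        // end label emitted as an address
  OffsetFromLowPc // end label emitted as a delta from DW_AT_low_pc
};

// Use the delta form only for DWARF v4 and later, and only when doing so
// cannot require a relocation inside a .dwo unit, that is, when the unit is
// not a .dwo unit or the range is not relaxable. Otherwise use an address.
HighPcForm chooseHighPcForm(unsigned DwarfVersion, bool IsDwoUnit,
                            bool RangeIsRelaxable) {
  if (DwarfVersion >= 4 && (!IsDwoUnit || !RangeIsRelaxable))
    return HighPcForm::OffsetFromLowPc;
  return HighPcForm::Address;
}
```

In the relaxable .dwo case the address form is what lets the end address be referenced indirectly, as the DW_FORM_addrx expectations in the updated test show.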
.../CodeGen/AsmPrinter/DwarfCompileUnit.cpp | 8 ++--
llvm/test/DebugInfo/RISCV/relax_dwo_ranges.ll | 44 +--
2 files changed, 35 insertions(+), 17 deletions(-)
diff --git a/llvm/lib/CodeGen/AsmPrinter/DwarfCompileUnit.cpp
b/llvm/lib/CodeGen/AsmPrinter/DwarfCompileUnit.cpp
index 751d3735d3b2b..2e4a26ef70bc2 100644
--- a/llvm/lib/CodeGen/AsmPrinter/DwarfCompileUnit.cpp
+++ b/llvm/lib/CodeGen/AsmPrinter/DwarfCompileUnit.cpp
@@ -493,10 +493,12 @@ void DwarfCompileUnit::attachLowHighPC(DIE &D, const
MCSymbol *Begin,
assert(End->isDefined() && "Invalid end label");
addLabelAddress(D, dwarf::DW_AT_low_pc, Begin);
- if (DD->getDwarfVersion() < 4)
-addLabelAddress(D, dwarf::DW_AT_high_pc, End);
- else
+ if (DD->getDwarfVersion() >= 4 &&
+ (!isDwoUnit() || !llvm::isRangeRelaxable(Begin, End))) {
addLabelDelta(D, dwarf::DW_AT_high_pc, End, Begin);
+return;
+ }
+ addLabelAddress(D, dwarf::DW_AT_high_pc, End);
}
// Add info for Wasm-global-based relocation.
diff --git a/llvm/test/DebugInfo/RISCV/relax_dwo_ranges.ll
b/llvm/test/DebugInfo/RISCV/relax_dwo_ranges.ll
index f8ab7fc5ad900..64f83ba1a7d7f 100644
--- a/llvm/test/DebugInfo/RISCV/relax_dwo_ranges.ll
+++ b/llvm/test/DebugInfo/RISCV/relax_dwo_ranges.ll
@@ -1,12 +1,13 @@
; RUN: llc -dwarf-version=5 -split-dwarf-file=foo.dwo -O0 %s
-mtriple=riscv64-unknown-linux-gnu -filetype=obj -o %t
; RUN: llvm-dwarfdump -v %t | FileCheck --check-prefix=DWARF5 %s
; RUN: llvm-dwarfdump --debug-info %t 2> %t.txt
-; RUN: FileCheck --input-file=%t.txt %s --check-prefix=RELOCS
--implicit-check-not=warning:
+; RUN: FileCheck --input-file=%t.txt %s --check-prefix=RELOCS --allow-empty
--implicit-check-not=warning:
; RUN: llc -dwarf-version=4 -split-dwarf-file=foo.dwo -O0 %s
-mtriple=riscv64-unknown-linux-gnu -filetype=obj -o %t
; RUN: llvm-dwarfdump -v %t | FileCheck --check-prefix=DWARF4 %s
; RUN: llvm-dwarfdump --debug-info %t 2> %t.txt
-; RUN: FileCheck --input-file=%t.txt %s --check-prefix=RELOCS
--implicit-check-not=warning:
+; RUN: FileCheck --input-file=%t.txt %s --check-prefix=RELOCS --allow-empty
--implicit-check-not=warning:
+; RUN: llvm-objdump -h %t | FileCheck --check-prefix=HDR %s
; In the RISC-V architecture, the .text section is subject to
; relaxation, meaning the start address of each function can change
@@ -49,60 +50,75 @@
; clang -g -S -gsplit-dwarf --target=riscv64 -march=rv64gc -O0
relax_dwo_ranges.cpp
-; Currently, square() still uses an offset to represent the function's end
address,
-; which requires a relocation here.
-; RELOCS: warning: unexpected relocations for dwo section '.debug_info.dwo'
+; RELOCS-NOT: warning: unexpected relocations for dwo section '.debug_info.dwo'
+; Make sure we don't produce any relocations in any .dwo section
+; HDR-NOT: .rela.{{.*}}.dwo
+
+; Ensure that 'square()' function uses indexed start and end addresses
; DWARF5: .debug_info.dwo contents:
; DWARF5: DW_TAG_subprogram
-; DWARF5-NEXT: DW_AT_low_pc [DW_FORM_addrx](indexed () address =
0x ".text")
-; DWARF5-NEXT: DW_AT_high_pc [DW_FORM_data4] (0x)
+; DWARF5-NEXT: DW_AT_low_pc [DW_FORM_addrx](indexed () address =
0x ".text")
+; DWARF5-NEXT: DW_AT_high_pc [DW_FORM_addrx](indexed (0001) address =
0x0044 ".text")
; DWARF5: DW_AT_name {{.*}} "square")
; DWARF5: DW_TAG_formal_parameter
+; HDR-NOT: .rela.{{.*}}.dwo
+
; Ensure there is no unnecessary addresses in .o file
; DWARF5: .debug_addr contents:
; DWARF5: Addrs: [
; DWARF5-NEXT: 0x
+; DWARF5-NEXT: 0x0044
; DWARF5-NEXT: 0x0046
; DWARF5-NEXT: 0x006c
; DWARF5-NEXT: 0x00b0
; DWARF5-NEXT: ]
+; HDR-NOT: .rela.{{.*}}.dwo
+
; Ensure that 'boo()' and 'main()' use DW_RLE_startx_length and
DW_RLE_startx_endx
; entries respectively
; DWARF5: .debug_rnglists.dwo contents:
; DWARF5: ranges:
-; DWARF5-NEXT: 0x0014: [DW_RLE_startx_length]: 0x0001,
0x0024 => [0x0046, 0x006a)
+; DWARF5-NEXT: 0x0014: [DW_RLE_startx_length]: 0x0002,
0x0024 => [0x0046, 0x006a)
; DWARF5-NEXT: 0x0017: [DW_RLE_end_of_list ]
-; DWARF5-NEXT: 0x0018: [DW_RLE_startx_endx ]: 0x0002,
0x0003 => [0x006c, 0x00b0)
+; DWARF5-NEXT: 0x0018: [DW_RLE_startx_endx ]: 0x0
[llvm-branch-commits] [mlir] check float cast (PR #166618)
https://github.com/makslevental created
https://github.com/llvm/llvm-project/pull/166618
None
>From 186a5f9dd5545db6e3ccb228174e9f6edbce95d5 Mon Sep 17 00:00:00 2001
From: makslevental
Date: Wed, 5 Nov 2025 11:13:09 -0800
Subject: [PATCH] check float cast
---
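Reader's note (not part of the commit): the change swaps an asserting cast for a checked one that bails out when the type does not match. A standalone analogy in plain C++, using `dynamic_cast` in place of MLIR's `dyn_cast`, might look like the sketch below. `Type`, `FloatLike`, `IntLike`, and `floatWidthOrFailure` are hypothetical names for illustration only and are not MLIR APIs.

```cpp
#include <optional>

// Hypothetical type hierarchy standing in for the real type classes.
struct Type {
  virtual ~Type() = default;
};
struct FloatLike : Type {
  unsigned Width;
  explicit FloatLike(unsigned W) : Width(W) {}
};
struct IntLike : Type {};

// Checked cast: return the width for float-like types and an empty optional
// (the analogue of returning failure()) for anything else, instead of
// assuming the cast always succeeds.
std::optional<unsigned> floatWidthOrFailure(const Type &T) {
  if (const auto *F = dynamic_cast<const FloatLike *>(&T))
    return F->Width;
  return std::nullopt;
}
```

Performing the check before any other work, as the patch does, means a non-matching type is rejected without side effects.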
mlir/lib/Conversion/ArithToLLVM/ArithToLLVM.cpp | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/mlir/lib/Conversion/ArithToLLVM/ArithToLLVM.cpp
b/mlir/lib/Conversion/ArithToLLVM/ArithToLLVM.cpp
index 632e1a7f02602..99d181f6262cd 100644
--- a/mlir/lib/Conversion/ArithToLLVM/ArithToLLVM.cpp
+++ b/mlir/lib/Conversion/ArithToLLVM/ArithToLLVM.cpp
@@ -583,9 +583,11 @@ struct FancyAddFLowering : public
ConvertOpToLLVMPattern {
auto parent = op->getParentOfType();
if (!parent)
return failure();
+auto floatTy = dyn_cast(op.getType());
+if (!floatTy)
+ return failure();
FailureOr adder =
LLVM::lookupOrCreateApFloatAddFFn(rewriter, parent);
-auto floatTy = cast(op.getType());
// Cast operands to 64-bit integers.
Location loc = op.getLoc();
___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [LSCFG][profcheck] Add dummy branch weights for the dummy switch to dead exits (PR #164714)
https://github.com/mtrofin updated
https://github.com/llvm/llvm-project/pull/164714
>From 663f7a64988be7a1ace16fef79e72c10058b3e16 Mon Sep 17 00:00:00 2001
From: Mircea Trofin
Date: Wed, 22 Oct 2025 14:34:31 -0700
Subject: [PATCH] [LSCFG][profcheck] Add dummy branch weights for the dummy
switch to dead exits
---
.../lib/Transforms/Scalar/LoopSimplifyCFG.cpp | 12 ++
.../LoopSimplifyCFG/constant-fold-branch.ll | 104 +-
2 files changed, 66 insertions(+), 50 deletions(-)
diff --git a/llvm/lib/Transforms/Scalar/LoopSimplifyCFG.cpp
b/llvm/lib/Transforms/Scalar/LoopSimplifyCFG.cpp
index b9546c5fa236b..e902b71776973 100644
--- a/llvm/lib/Transforms/Scalar/LoopSimplifyCFG.cpp
+++ b/llvm/lib/Transforms/Scalar/LoopSimplifyCFG.cpp
@@ -24,6 +24,7 @@
#include "llvm/Analysis/ScalarEvolution.h"
#include "llvm/IR/Dominators.h"
#include "llvm/IR/IRBuilder.h"
+#include "llvm/IR/ProfDataUtils.h"
#include "llvm/Support/CommandLine.h"
#include "llvm/Transforms/Scalar.h"
#include "llvm/Transforms/Scalar/LoopPassManager.h"
@@ -393,6 +394,17 @@ class ConstantTerminatorFoldingImpl {
DTUpdates.push_back({DominatorTree::Insert, Preheader, BB});
++NumLoopExitsDeleted;
}
+// We don't really need to add branch weights to DummySwitch, because all
+// but one branches are just a temporary artifact - see the comment on top
+// of this function. But, it's easy to estimate the weights, and it helps
+// maintain a property of the overall compiler - that the branch weights
+// don't "just get dropped" accidentally (i.e. profcheck)
+if (DummySwitch->getParent()->getParent()->hasProfileData()) {
+ SmallVector<uint32_t> DummyBranchWeights(1 + DummySwitch->getNumCases());
+ // default. 100% probability, the rest are dead.
+ DummyBranchWeights[0] = 1;
+ setBranchWeights(*DummySwitch, DummyBranchWeights, /*IsExpected=*/false);
+}
assert(L.getLoopPreheader() == NewPreheader && "Malformed CFG?");
if (Loop *OuterLoop = LI.getLoopFor(Preheader)) {
diff --git a/llvm/test/Transforms/LoopSimplifyCFG/constant-fold-branch.ll
b/llvm/test/Transforms/LoopSimplifyCFG/constant-fold-branch.ll
index 1ec212f0bb5ea..46b6209986fed 100644
--- a/llvm/test/Transforms/LoopSimplifyCFG/constant-fold-branch.ll
+++ b/llvm/test/Transforms/LoopSimplifyCFG/constant-fold-branch.ll
@@ -1,4 +1,4 @@
-; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
UTC_ARGS: --check-globals
; REQUIRES: asserts
; RUN: opt -S -enable-loop-simplifycfg-term-folding=true
-passes=loop-simplifycfg -verify-loop-info -verify-dom-info -verify-loop-lcssa
< %s | FileCheck %s
; RUN: opt -S -enable-loop-simplifycfg-term-folding=true
-passes='require,loop(loop-simplifycfg)' -verify-loop-info
-verify-dom-info -verify-loop-lcssa < %s | FileCheck %s
@@ -59,7 +59,7 @@ define i32 @dead_backedge_test_switch_loop(i32 %end) {
; CHECK: dead_backedge:
; CHECK-NEXT:[[I_2]] = add i32 [[I_1]], 10
; CHECK-NEXT:switch i32 1, label [[EXIT:%.*]] [
-; CHECK-NEXT:i32 0, label [[HEADER_BACKEDGE]]
+; CHECK-NEXT: i32 0, label [[HEADER_BACKEDGE]]
; CHECK-NEXT:]
; CHECK: exit:
; CHECK-NEXT:[[I_2_LCSSA:%.*]] = phi i32 [ [[I_2]], [[DEAD_BACKEDGE]] ]
@@ -233,12 +233,12 @@ exit:
; Check that we preserve static reachibility of a dead exit block while
deleting
; a branch.
-define i32 @dead_exit_test_branch_loop(i32 %end) {
+define i32 @dead_exit_test_branch_loop(i32 %end) !prof
!{!"function_entry_count", i32 10} {
; CHECK-LABEL: @dead_exit_test_branch_loop(
; CHECK-NEXT: preheader:
; CHECK-NEXT:switch i32 0, label [[PREHEADER_SPLIT:%.*]] [
-; CHECK-NEXT:i32 1, label [[DEAD:%.*]]
-; CHECK-NEXT:]
+; CHECK-NEXT: i32 1, label [[DEAD:%.*]]
+; CHECK-NEXT:], !prof [[PROF1:![0-9]+]]
; CHECK: preheader.split:
; CHECK-NEXT:br label [[HEADER:%.*]]
; CHECK: header:
@@ -262,7 +262,7 @@ preheader:
header:
%i = phi i32 [0, %preheader], [%i.inc, %backedge]
- br i1 true, label %backedge, label %dead
+ br i1 true, label %backedge, label %dead, !prof !{!"branch_weights", i32 10,
i32 1}
dead:
br label %dummy
@@ -286,7 +286,7 @@ define i32 @dead_exit_test_switch_loop(i32 %end) {
; CHECK-LABEL: @dead_exit_test_switch_loop(
; CHECK-NEXT: preheader:
; CHECK-NEXT:switch i32 0, label [[PREHEADER_SPLIT:%.*]] [
-; CHECK-NEXT:i32 1, label [[DEAD:%.*]]
+; CHECK-NEXT: i32 1, label [[DEAD:%.*]]
; CHECK-NEXT:]
; CHECK: preheader.split:
; CHECK-NEXT:br label [[HEADER:%.*]]
@@ -383,9 +383,9 @@ define i32 @dead_loop_test_switch_loop(i32 %end) {
; CHECK: header:
; CHECK-NEXT:[[I:%.*]] = phi i32 [ 0, [[PREHEADER:%.*]] ], [
[[I_INC:%.*]], [[BACKEDGE:%.*]] ]
; CHECK-NEXT:switch i32 1, label [[DEAD:%.*]] [
-; CHECK-NEXT:i32 0, label [[DEAD]]
-; CHECK-NEXT:i32 1, label [[BACKEDGE]]
-; CHECK-NEXT:i32 2, lab
[llvm-branch-commits] [llvm] [LIR][profcheck] Reuse the loop's exit condition profile (PR #164523)
https://github.com/mtrofin updated
https://github.com/llvm/llvm-project/pull/164523
>From 38069a384122f039598f14c4f08ad1905df9b201 Mon Sep 17 00:00:00 2001
From: Mircea Trofin
Date: Tue, 21 Oct 2025 17:24:49 -0700
Subject: [PATCH] [LIR][profcheck] Reuse the loop's exit condition profile
---
.../Transforms/Scalar/LoopIdiomRecognize.cpp | 40 +--
.../LoopIdiom/X86/preserve-profile.ll | 70 +++
2 files changed, 106 insertions(+), 4 deletions(-)
create mode 100644 llvm/test/Transforms/LoopIdiom/X86/preserve-profile.ll
diff --git a/llvm/lib/Transforms/Scalar/LoopIdiomRecognize.cpp
b/llvm/lib/Transforms/Scalar/LoopIdiomRecognize.cpp
index 019536ca91ae0..9070d252ae09f 100644
--- a/llvm/lib/Transforms/Scalar/LoopIdiomRecognize.cpp
+++ b/llvm/lib/Transforms/Scalar/LoopIdiomRecognize.cpp
@@ -72,6 +72,7 @@
#include "llvm/IR/Module.h"
#include "llvm/IR/PassManager.h"
#include "llvm/IR/PatternMatch.h"
+#include "llvm/IR/ProfDataUtils.h"
#include "llvm/IR/Type.h"
#include "llvm/IR/User.h"
#include "llvm/IR/Value.h"
@@ -105,6 +106,7 @@ STATISTIC(
STATISTIC(NumShiftUntilZero,
"Number of uncountable loops recognized as 'shift until zero'
idiom");
+namespace llvm {
bool DisableLIRP::All;
static cl::opt
DisableLIRPAll("disable-" DEBUG_TYPE "-all",
@@ -163,6 +165,10 @@ static cl::opt ForceMemsetPatternIntrinsic(
cl::desc("Use memset.pattern intrinsic whenever possible"),
cl::init(false),
cl::Hidden);
+extern cl::opt<bool> ProfcheckDisableMetadataFixes;
+
+} // namespace llvm
+
namespace {
class LoopIdiomRecognize {
@@ -3199,7 +3205,21 @@ bool LoopIdiomRecognize::recognizeShiftUntilBitTest() {
// The loop trip count check.
auto *IVCheck = Builder.CreateICmpEQ(IVNext, LoopTripCount,
CurLoop->getName() + ".ivcheck");
- Builder.CreateCondBr(IVCheck, SuccessorBB, LoopHeaderBB);
+ SmallVector<uint32_t> BranchWeights;
+ const bool HasBranchWeights =
+ !ProfcheckDisableMetadataFixes &&
+ extractBranchWeights(*LoopHeaderBB->getTerminator(), BranchWeights);
+
+ auto *BI = Builder.CreateCondBr(IVCheck, SuccessorBB, LoopHeaderBB);
+ if (HasBranchWeights) {
+if (SuccessorBB == LoopHeaderBB->getTerminator()->getSuccessor(1))
+ std::swap(BranchWeights[0], BranchWeights[1]);
+// We're not changing the loop profile, so we can reuse the original loop's
+// profile.
+setBranchWeights(*BI, BranchWeights,
+ /*IsExpected=*/false);
+ }
+
LoopHeaderBB->getTerminator()->eraseFromParent();
// Populate the IV PHI.
@@ -3368,10 +3388,10 @@ static bool detectShiftUntilZeroIdiom(Loop *CurLoop,
ScalarEvolution *SE,
/// %start = <...>
/// %extraoffset = <...>
/// <...>
-/// br label %for.cond
+/// br label %loop
///
/// loop:
-/// %iv = phi i8 [ %start, %entry ], [ %iv.next, %for.cond ]
+/// %iv = phi i8 [ %start, %entry ], [ %iv.next, %loop ]
/// %nbits = add nsw i8 %iv, %extraoffset
/// %val.shifted = {{l,a}shr,shl} i8 %val, %nbits
/// %val.shifted.iszero = icmp eq i8 %val.shifted, 0
@@ -3533,7 +3553,19 @@ bool LoopIdiomRecognize::recognizeShiftUntilZero() {
// The loop terminator.
Builder.SetInsertPoint(LoopHeaderBB->getTerminator());
- Builder.CreateCondBr(CIVCheck, SuccessorBB, LoopHeaderBB);
+ SmallVector<uint32_t> BranchWeights;
+ const bool HasBranchWeights =
+ !ProfcheckDisableMetadataFixes &&
+ extractBranchWeights(*LoopHeaderBB->getTerminator(), BranchWeights);
+
+ auto *BI = Builder.CreateCondBr(CIVCheck, SuccessorBB, LoopHeaderBB);
+ if (HasBranchWeights) {
+if (InvertedCond)
+ std::swap(BranchWeights[0], BranchWeights[1]);
+// We're not changing the loop profile, so we can reuse the original loop's
+// profile.
+setBranchWeights(*BI, BranchWeights, /*IsExpected=*/false);
+ }
LoopHeaderBB->getTerminator()->eraseFromParent();
// Populate the IV PHI.
diff --git a/llvm/test/Transforms/LoopIdiom/X86/preserve-profile.ll
b/llvm/test/Transforms/LoopIdiom/X86/preserve-profile.ll
new file mode 100644
index 0..d01bb748d9422
--- /dev/null
+++ b/llvm/test/Transforms/LoopIdiom/X86/preserve-profile.ll
@@ -0,0 +1,70 @@
+; RUN: opt
-passes="module(print),function(loop(loop-idiom)),module(print)"
-mtriple=x86_64 -mcpu=core-avx2 %s -disable-output 2>&1 | FileCheck
--check-prefix=PROFILE %s
+
+declare void @escape_inner(i8, i8, i8, i1, i8)
+declare void @escape_outer(i8, i8, i8, i1, i8)
+
+declare i8 @gen.i8()
+
+; Most basic pattern; Note that iff the shift amount is offset, said offsetting
+; must not cause an overflow, but `add nsw` is fine.
+define i8 @p0(i8 %val, i8 %start, i8 %extraoffset) mustprogress {
+entry:
+ br label %loop
+
+loop:
+ %iv = phi i8 [ %start, %entry ], [ %iv.next, %loop ]
+ %nbits = add nsw i8 %iv, %extraoffset
+ %val.shifted = ashr i8 %val, %nbits
+ %val.shifted.iszero = icmp eq i8 %val.shifted, 0
+ %iv.next = add i8 %iv, 1
+
+ call void @escap
[llvm-branch-commits] [llvm] [LIR][profcheck] Reuse the loop's exit condition profile (PR #164523)
https://github.com/mtrofin updated
https://github.com/llvm/llvm-project/pull/164523
>From 38069a384122f039598f14c4f08ad1905df9b201 Mon Sep 17 00:00:00 2001
From: Mircea Trofin
Date: Tue, 21 Oct 2025 17:24:49 -0700
Subject: [PATCH] [LIR][profcheck] Reuse the loop's exit condition profile
---
.../Transforms/Scalar/LoopIdiomRecognize.cpp | 40 +--
.../LoopIdiom/X86/preserve-profile.ll | 70 +++
2 files changed, 106 insertions(+), 4 deletions(-)
create mode 100644 llvm/test/Transforms/LoopIdiom/X86/preserve-profile.ll
diff --git a/llvm/lib/Transforms/Scalar/LoopIdiomRecognize.cpp
b/llvm/lib/Transforms/Scalar/LoopIdiomRecognize.cpp
index 019536ca91ae0..9070d252ae09f 100644
--- a/llvm/lib/Transforms/Scalar/LoopIdiomRecognize.cpp
+++ b/llvm/lib/Transforms/Scalar/LoopIdiomRecognize.cpp
@@ -72,6 +72,7 @@
#include "llvm/IR/Module.h"
#include "llvm/IR/PassManager.h"
#include "llvm/IR/PatternMatch.h"
+#include "llvm/IR/ProfDataUtils.h"
#include "llvm/IR/Type.h"
#include "llvm/IR/User.h"
#include "llvm/IR/Value.h"
@@ -105,6 +106,7 @@ STATISTIC(
STATISTIC(NumShiftUntilZero,
"Number of uncountable loops recognized as 'shift until zero'
idiom");
+namespace llvm {
bool DisableLIRP::All;
static cl::opt
DisableLIRPAll("disable-" DEBUG_TYPE "-all",
@@ -163,6 +165,10 @@ static cl::opt ForceMemsetPatternIntrinsic(
cl::desc("Use memset.pattern intrinsic whenever possible"),
cl::init(false),
cl::Hidden);
+extern cl::opt<bool> ProfcheckDisableMetadataFixes;
+
+} // namespace llvm
+
namespace {
class LoopIdiomRecognize {
@@ -3199,7 +3205,21 @@ bool LoopIdiomRecognize::recognizeShiftUntilBitTest() {
// The loop trip count check.
auto *IVCheck = Builder.CreateICmpEQ(IVNext, LoopTripCount,
CurLoop->getName() + ".ivcheck");
- Builder.CreateCondBr(IVCheck, SuccessorBB, LoopHeaderBB);
+ SmallVector<uint32_t> BranchWeights;
+ const bool HasBranchWeights =
+ !ProfcheckDisableMetadataFixes &&
+ extractBranchWeights(*LoopHeaderBB->getTerminator(), BranchWeights);
+
+ auto *BI = Builder.CreateCondBr(IVCheck, SuccessorBB, LoopHeaderBB);
+ if (HasBranchWeights) {
+if (SuccessorBB == LoopHeaderBB->getTerminator()->getSuccessor(1))
+ std::swap(BranchWeights[0], BranchWeights[1]);
+// We're not changing the loop profile, so we can reuse the original loop's
+// profile.
+setBranchWeights(*BI, BranchWeights,
+ /*IsExpected=*/false);
+ }
+
LoopHeaderBB->getTerminator()->eraseFromParent();
// Populate the IV PHI.
@@ -3368,10 +3388,10 @@ static bool detectShiftUntilZeroIdiom(Loop *CurLoop,
ScalarEvolution *SE,
/// %start = <...>
/// %extraoffset = <...>
/// <...>
-/// br label %for.cond
+/// br label %loop
///
/// loop:
-/// %iv = phi i8 [ %start, %entry ], [ %iv.next, %for.cond ]
+/// %iv = phi i8 [ %start, %entry ], [ %iv.next, %loop ]
/// %nbits = add nsw i8 %iv, %extraoffset
/// %val.shifted = {{l,a}shr,shl} i8 %val, %nbits
/// %val.shifted.iszero = icmp eq i8 %val.shifted, 0
@@ -3533,7 +3553,19 @@ bool LoopIdiomRecognize::recognizeShiftUntilZero() {
// The loop terminator.
Builder.SetInsertPoint(LoopHeaderBB->getTerminator());
- Builder.CreateCondBr(CIVCheck, SuccessorBB, LoopHeaderBB);
+ SmallVector<uint32_t> BranchWeights;
+ const bool HasBranchWeights =
+ !ProfcheckDisableMetadataFixes &&
+ extractBranchWeights(*LoopHeaderBB->getTerminator(), BranchWeights);
+
+ auto *BI = Builder.CreateCondBr(CIVCheck, SuccessorBB, LoopHeaderBB);
+ if (HasBranchWeights) {
+if (InvertedCond)
+ std::swap(BranchWeights[0], BranchWeights[1]);
+// We're not changing the loop profile, so we can reuse the original loop's
+// profile.
+setBranchWeights(*BI, BranchWeights, /*IsExpected=*/false);
+ }
LoopHeaderBB->getTerminator()->eraseFromParent();
// Populate the IV PHI.
diff --git a/llvm/test/Transforms/LoopIdiom/X86/preserve-profile.ll
b/llvm/test/Transforms/LoopIdiom/X86/preserve-profile.ll
new file mode 100644
index 0..d01bb748d9422
--- /dev/null
+++ b/llvm/test/Transforms/LoopIdiom/X86/preserve-profile.ll
@@ -0,0 +1,70 @@
+; RUN: opt
-passes="module(print),function(loop(loop-idiom)),module(print)"
-mtriple=x86_64 -mcpu=core-avx2 %s -disable-output 2>&1 | FileCheck
--check-prefix=PROFILE %s
+
+declare void @escape_inner(i8, i8, i8, i1, i8)
+declare void @escape_outer(i8, i8, i8, i1, i8)
+
+declare i8 @gen.i8()
+
+; Most basic pattern; Note that iff the shift amount is offset, said offsetting
+; must not cause an overflow, but `add nsw` is fine.
+define i8 @p0(i8 %val, i8 %start, i8 %extraoffset) mustprogress {
+entry:
+ br label %loop
+
+loop:
+ %iv = phi i8 [ %start, %entry ], [ %iv.next, %loop ]
+ %nbits = add nsw i8 %iv, %extraoffset
+ %val.shifted = ashr i8 %val, %nbits
+ %val.shifted.iszero = icmp eq i8 %val.shifted, 0
+ %iv.next = add i8 %iv, 1
+
+ call void @escap
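The weight-reuse step in the patch above can be illustrated outside LLVM. What follows is a minimal, self-contained sketch rather than the LLVM API: `reuseBranchWeights` and its parameters are hypothetical names, and the profile is modeled as a plain `std::vector<uint32_t>` in successor order. It assumes, as in the patch, that the new conditional branch always lists the loop exit first, so the two weights are swapped when the original terminator had the exit as its second successor.

```cpp
#include <cstdint>
#include <optional>
#include <utility>
#include <vector>

// Hypothetical stand-in for the patch's logic; not an LLVM function.
// `Original` holds the old terminator's weights in successor order and is
// empty when the branch carried no profile. `ExitWasSuccessor1` says whether
// the loop exit was the old terminator's second successor.
std::optional<std::vector<uint32_t>>
reuseBranchWeights(const std::vector<uint32_t> &Original,
                   bool ExitWasSuccessor1) {
  if (Original.size() != 2)
    return std::nullopt; // No usable profile: leave the new branch bare.
  std::vector<uint32_t> Weights = Original;
  // The new branch lists the exit first, so flip the weights if the old
  // branch listed it second.
  if (ExitWasSuccessor1)
    std::swap(Weights[0], Weights[1]);
  return Weights;
}
```

The loop's overall profile is unchanged by the transformation, which is why the sketch only reorders the existing weights and never rescales them.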
[llvm-branch-commits] [compiler-rt][sanitizers] Mark three tests as unsupported on Android (PR #166639)
https://github.com/boomanaiden154 updated https://github.com/llvm/llvm-project/pull/166639 ___ llvm-branch-commits mailing list [email protected] https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [compiler-rt][HWAsan] Remove CHECK lines from test (PR #166638)
https://github.com/boomanaiden154 updated https://github.com/llvm/llvm-project/pull/166638
[llvm-branch-commits] [llvm] [LSCFG][profcheck] Add dummy branch weights for the dummy switch to dead exits (PR #164714)
https://github.com/mtrofin updated
https://github.com/llvm/llvm-project/pull/164714
>From 7d9c10cead00d07e2bd50bbd4c2287b0b3b60e34 Mon Sep 17 00:00:00 2001
From: Mircea Trofin
Date: Wed, 22 Oct 2025 14:34:31 -0700
Subject: [PATCH] [LSCFG][profcheck] Add dummy branch weights for the dummy
switch to dead exits
---
.../lib/Transforms/Scalar/LoopSimplifyCFG.cpp | 12 ++
.../LoopSimplifyCFG/constant-fold-branch.ll | 104 +-
2 files changed, 66 insertions(+), 50 deletions(-)
diff --git a/llvm/lib/Transforms/Scalar/LoopSimplifyCFG.cpp
b/llvm/lib/Transforms/Scalar/LoopSimplifyCFG.cpp
index b9546c5fa236b..e902b71776973 100644
--- a/llvm/lib/Transforms/Scalar/LoopSimplifyCFG.cpp
+++ b/llvm/lib/Transforms/Scalar/LoopSimplifyCFG.cpp
@@ -24,6 +24,7 @@
#include "llvm/Analysis/ScalarEvolution.h"
#include "llvm/IR/Dominators.h"
#include "llvm/IR/IRBuilder.h"
+#include "llvm/IR/ProfDataUtils.h"
#include "llvm/Support/CommandLine.h"
#include "llvm/Transforms/Scalar.h"
#include "llvm/Transforms/Scalar/LoopPassManager.h"
@@ -393,6 +394,17 @@ class ConstantTerminatorFoldingImpl {
DTUpdates.push_back({DominatorTree::Insert, Preheader, BB});
++NumLoopExitsDeleted;
}
+// We don't really need to add branch weights to DummySwitch, because all
+// but one branches are just a temporary artifact - see the comment on top
+// of this function. But, it's easy to estimate the weights, and it helps
+// maintain a property of the overall compiler - that the branch weights
+// don't "just get dropped" accidentally (i.e. profcheck)
+if (DummySwitch->getParent()->getParent()->hasProfileData()) {
+ SmallVector<uint32_t> DummyBranchWeights(1 + DummySwitch->getNumCases());
+ // default. 100% probability, the rest are dead.
+ DummyBranchWeights[0] = 1;
+ setBranchWeights(*DummySwitch, DummyBranchWeights, /*IsExpected=*/false);
+}
assert(L.getLoopPreheader() == NewPreheader && "Malformed CFG?");
if (Loop *OuterLoop = LI.getLoopFor(Preheader)) {
diff --git a/llvm/test/Transforms/LoopSimplifyCFG/constant-fold-branch.ll
b/llvm/test/Transforms/LoopSimplifyCFG/constant-fold-branch.ll
index 1ec212f0bb5ea..46b6209986fed 100644
--- a/llvm/test/Transforms/LoopSimplifyCFG/constant-fold-branch.ll
+++ b/llvm/test/Transforms/LoopSimplifyCFG/constant-fold-branch.ll
@@ -1,4 +1,4 @@
-; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
UTC_ARGS: --check-globals
; REQUIRES: asserts
; RUN: opt -S -enable-loop-simplifycfg-term-folding=true
-passes=loop-simplifycfg -verify-loop-info -verify-dom-info -verify-loop-lcssa
< %s | FileCheck %s
; RUN: opt -S -enable-loop-simplifycfg-term-folding=true
-passes='require,loop(loop-simplifycfg)' -verify-loop-info
-verify-dom-info -verify-loop-lcssa < %s | FileCheck %s
@@ -59,7 +59,7 @@ define i32 @dead_backedge_test_switch_loop(i32 %end) {
; CHECK: dead_backedge:
; CHECK-NEXT:[[I_2]] = add i32 [[I_1]], 10
; CHECK-NEXT:switch i32 1, label [[EXIT:%.*]] [
-; CHECK-NEXT:i32 0, label [[HEADER_BACKEDGE]]
+; CHECK-NEXT: i32 0, label [[HEADER_BACKEDGE]]
; CHECK-NEXT:]
; CHECK: exit:
; CHECK-NEXT:[[I_2_LCSSA:%.*]] = phi i32 [ [[I_2]], [[DEAD_BACKEDGE]] ]
@@ -233,12 +233,12 @@ exit:
; Check that we preserve static reachibility of a dead exit block while
deleting
; a branch.
-define i32 @dead_exit_test_branch_loop(i32 %end) {
+define i32 @dead_exit_test_branch_loop(i32 %end) !prof
!{!"function_entry_count", i32 10} {
; CHECK-LABEL: @dead_exit_test_branch_loop(
; CHECK-NEXT: preheader:
; CHECK-NEXT:switch i32 0, label [[PREHEADER_SPLIT:%.*]] [
-; CHECK-NEXT:i32 1, label [[DEAD:%.*]]
-; CHECK-NEXT:]
+; CHECK-NEXT: i32 1, label [[DEAD:%.*]]
+; CHECK-NEXT:], !prof [[PROF1:![0-9]+]]
; CHECK: preheader.split:
; CHECK-NEXT:br label [[HEADER:%.*]]
; CHECK: header:
@@ -262,7 +262,7 @@ preheader:
header:
%i = phi i32 [0, %preheader], [%i.inc, %backedge]
- br i1 true, label %backedge, label %dead
+ br i1 true, label %backedge, label %dead, !prof !{!"branch_weights", i32 10,
i32 1}
dead:
br label %dummy
@@ -286,7 +286,7 @@ define i32 @dead_exit_test_switch_loop(i32 %end) {
; CHECK-LABEL: @dead_exit_test_switch_loop(
; CHECK-NEXT: preheader:
; CHECK-NEXT:switch i32 0, label [[PREHEADER_SPLIT:%.*]] [
-; CHECK-NEXT:i32 1, label [[DEAD:%.*]]
+; CHECK-NEXT: i32 1, label [[DEAD:%.*]]
; CHECK-NEXT:]
; CHECK: preheader.split:
; CHECK-NEXT:br label [[HEADER:%.*]]
@@ -383,9 +383,9 @@ define i32 @dead_loop_test_switch_loop(i32 %end) {
; CHECK: header:
; CHECK-NEXT:[[I:%.*]] = phi i32 [ 0, [[PREHEADER:%.*]] ], [
[[I_INC:%.*]], [[BACKEDGE:%.*]] ]
; CHECK-NEXT:switch i32 1, label [[DEAD:%.*]] [
-; CHECK-NEXT:i32 0, label [[DEAD]]
-; CHECK-NEXT:i32 1, label [[BACKEDGE]]
-; CHECK-NEXT:i32 2, lab
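The dummy weights attached to the switch in the patch above follow a simple layout, sketched here without any LLVM dependency. `makeDummySwitchWeights` is a hypothetical name, and the sketch assumes the same slot order the patch uses: slot 0 is the default destination, followed by one slot per case.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper mirroring the patch's weight layout; not an LLVM
// function. Slot 0 is the switch's default destination, slots 1..NumCases
// are the case destinations (the dead exits).
std::vector<uint32_t> makeDummySwitchWeights(std::size_t NumCases) {
  std::vector<uint32_t> Weights(1 + NumCases); // value-initialized to 0
  Weights[0] = 1; // the default edge is always taken; every case is dead
  return Weights;
}
```

For a switch with two cases this yields the weights 1, 0, 0, which states that only the default edge is ever taken.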
[llvm-branch-commits] [llvm] [LSCFG][profcheck] Add dummy branch weights for the dummy switch to dead exits (PR #164714)
https://github.com/mtrofin updated
https://github.com/llvm/llvm-project/pull/164714
>From eb82bc3d94bc231f5b141cd6d4e596262ff354c1 Mon Sep 17 00:00:00 2001
From: Mircea Trofin
Date: Wed, 22 Oct 2025 14:34:31 -0700
Subject: [PATCH] [LSCFG][profcheck] Add dummy branch weights for the dummy
switch to dead exits
---
.../lib/Transforms/Scalar/LoopSimplifyCFG.cpp | 12 ++
.../LoopSimplifyCFG/constant-fold-branch.ll | 104 +-
2 files changed, 66 insertions(+), 50 deletions(-)
diff --git a/llvm/lib/Transforms/Scalar/LoopSimplifyCFG.cpp
b/llvm/lib/Transforms/Scalar/LoopSimplifyCFG.cpp
index b9546c5fa236b..e902b71776973 100644
--- a/llvm/lib/Transforms/Scalar/LoopSimplifyCFG.cpp
+++ b/llvm/lib/Transforms/Scalar/LoopSimplifyCFG.cpp
@@ -24,6 +24,7 @@
#include "llvm/Analysis/ScalarEvolution.h"
#include "llvm/IR/Dominators.h"
#include "llvm/IR/IRBuilder.h"
+#include "llvm/IR/ProfDataUtils.h"
#include "llvm/Support/CommandLine.h"
#include "llvm/Transforms/Scalar.h"
#include "llvm/Transforms/Scalar/LoopPassManager.h"
@@ -393,6 +394,17 @@ class ConstantTerminatorFoldingImpl {
DTUpdates.push_back({DominatorTree::Insert, Preheader, BB});
++NumLoopExitsDeleted;
}
+// We don't really need to add branch weights to DummySwitch, because all
+// but one branches are just a temporary artifact - see the comment on top
+// of this function. But, it's easy to estimate the weights, and it helps
+// maintain a property of the overall compiler - that the branch weights
+// don't "just get dropped" accidentally (i.e. profcheck)
+if (DummySwitch->getParent()->getParent()->hasProfileData()) {
+ SmallVector<uint32_t> DummyBranchWeights(1 + DummySwitch->getNumCases());
+ // default. 100% probability, the rest are dead.
+ DummyBranchWeights[0] = 1;
+ setBranchWeights(*DummySwitch, DummyBranchWeights, /*IsExpected=*/false);
+}
assert(L.getLoopPreheader() == NewPreheader && "Malformed CFG?");
if (Loop *OuterLoop = LI.getLoopFor(Preheader)) {
diff --git a/llvm/test/Transforms/LoopSimplifyCFG/constant-fold-branch.ll
b/llvm/test/Transforms/LoopSimplifyCFG/constant-fold-branch.ll
index 1ec212f0bb5ea..46b6209986fed 100644
--- a/llvm/test/Transforms/LoopSimplifyCFG/constant-fold-branch.ll
+++ b/llvm/test/Transforms/LoopSimplifyCFG/constant-fold-branch.ll
@@ -1,4 +1,4 @@
-; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
UTC_ARGS: --check-globals
; REQUIRES: asserts
; RUN: opt -S -enable-loop-simplifycfg-term-folding=true
-passes=loop-simplifycfg -verify-loop-info -verify-dom-info -verify-loop-lcssa
< %s | FileCheck %s
; RUN: opt -S -enable-loop-simplifycfg-term-folding=true
-passes='require,loop(loop-simplifycfg)' -verify-loop-info
-verify-dom-info -verify-loop-lcssa < %s | FileCheck %s
@@ -59,7 +59,7 @@ define i32 @dead_backedge_test_switch_loop(i32 %end) {
; CHECK: dead_backedge:
; CHECK-NEXT:[[I_2]] = add i32 [[I_1]], 10
; CHECK-NEXT:switch i32 1, label [[EXIT:%.*]] [
-; CHECK-NEXT:i32 0, label [[HEADER_BACKEDGE]]
+; CHECK-NEXT: i32 0, label [[HEADER_BACKEDGE]]
; CHECK-NEXT:]
; CHECK: exit:
; CHECK-NEXT:[[I_2_LCSSA:%.*]] = phi i32 [ [[I_2]], [[DEAD_BACKEDGE]] ]
@@ -233,12 +233,12 @@ exit:
; Check that we preserve static reachibility of a dead exit block while
deleting
; a branch.
-define i32 @dead_exit_test_branch_loop(i32 %end) {
+define i32 @dead_exit_test_branch_loop(i32 %end) !prof
!{!"function_entry_count", i32 10} {
; CHECK-LABEL: @dead_exit_test_branch_loop(
; CHECK-NEXT: preheader:
; CHECK-NEXT:switch i32 0, label [[PREHEADER_SPLIT:%.*]] [
-; CHECK-NEXT:i32 1, label [[DEAD:%.*]]
-; CHECK-NEXT:]
+; CHECK-NEXT: i32 1, label [[DEAD:%.*]]
+; CHECK-NEXT:], !prof [[PROF1:![0-9]+]]
; CHECK: preheader.split:
; CHECK-NEXT:br label [[HEADER:%.*]]
; CHECK: header:
@@ -262,7 +262,7 @@ preheader:
header:
%i = phi i32 [0, %preheader], [%i.inc, %backedge]
- br i1 true, label %backedge, label %dead
+ br i1 true, label %backedge, label %dead, !prof !{!"branch_weights", i32 10,
i32 1}
dead:
br label %dummy
@@ -286,7 +286,7 @@ define i32 @dead_exit_test_switch_loop(i32 %end) {
; CHECK-LABEL: @dead_exit_test_switch_loop(
; CHECK-NEXT: preheader:
; CHECK-NEXT:switch i32 0, label [[PREHEADER_SPLIT:%.*]] [
-; CHECK-NEXT:i32 1, label [[DEAD:%.*]]
+; CHECK-NEXT: i32 1, label [[DEAD:%.*]]
; CHECK-NEXT:]
; CHECK: preheader.split:
; CHECK-NEXT:br label [[HEADER:%.*]]
@@ -383,9 +383,9 @@ define i32 @dead_loop_test_switch_loop(i32 %end) {
; CHECK: header:
; CHECK-NEXT:[[I:%.*]] = phi i32 [ 0, [[PREHEADER:%.*]] ], [
[[I_INC:%.*]], [[BACKEDGE:%.*]] ]
; CHECK-NEXT:switch i32 1, label [[DEAD:%.*]] [
-; CHECK-NEXT:i32 0, label [[DEAD]]
-; CHECK-NEXT:i32 1, label [[BACKEDGE]]
-; CHECK-NEXT:i32 2, lab
[llvm-branch-commits] [CI] Make premerge upload/write comments (PR #166609)
boomanaiden154 wrote: We can't just upload a single comment from one job, because the Linux/Windows/Linux AArch64 jobs will race against each other and we have no ordering guarantees. https://github.com/llvm/llvm-project/pull/166609
[llvm-branch-commits] [llvm] ExpandFp: Require RuntimeLibcallsInfo analysis (PR #165197)
https://github.com/arsenm updated
https://github.com/llvm/llvm-project/pull/165197
>From 70f2855ae3759bd63f785c94d53e6a704e79cf33 Mon Sep 17 00:00:00 2001
From: Matt Arsenault
Date: Sun, 26 Oct 2025 02:44:00 +0900
Subject: [PATCH] ExpandFp: Require RuntimeLibcallsInfo analysis
Not sure I'm doing the new pass manager handling correctly. I do
not like needing to manually check if the cached module pass is
available and manually erroring in every pass.
---
llvm/lib/CodeGen/ExpandFp.cpp | 14 ++
llvm/test/Transforms/ExpandFp/AMDGPU/frem-inf.ll | 4 ++--
llvm/test/Transforms/ExpandFp/AMDGPU/frem.ll | 2 +-
.../Transforms/ExpandFp/AMDGPU/missing-analysis.ll | 6 ++
.../Transforms/ExpandFp/AMDGPU/pass-parameters.ll | 8
5 files changed, 27 insertions(+), 7 deletions(-)
create mode 100644 llvm/test/Transforms/ExpandFp/AMDGPU/missing-analysis.ll
diff --git a/llvm/lib/CodeGen/ExpandFp.cpp b/llvm/lib/CodeGen/ExpandFp.cpp
index f44eb227133ae..9386ffe7791a3 100644
--- a/llvm/lib/CodeGen/ExpandFp.cpp
+++ b/llvm/lib/CodeGen/ExpandFp.cpp
@@ -18,6 +18,7 @@
#include "llvm/ADT/SmallVector.h"
#include "llvm/Analysis/AssumptionCache.h"
#include "llvm/Analysis/GlobalsModRef.h"
+#include "llvm/Analysis/RuntimeLibcallInfo.h"
#include "llvm/Analysis/SimplifyQuery.h"
#include "llvm/Analysis/ValueTracking.h"
#include "llvm/CodeGen/ISDOpcodes.h"
@@ -1092,6 +1093,8 @@ class ExpandFpLegacyPass : public FunctionPass {
auto *TM = &getAnalysis().getTM();
auto *TLI = TM->getSubtargetImpl(F)->getTargetLowering();
AssumptionCache *AC = nullptr;
+const RTLIB::RuntimeLibcallsInfo *Libcalls =
+&getAnalysis().getRTLCI(*F.getParent());
if (OptLevel != CodeGenOptLevel::None && !F.hasOptNone())
AC = &getAnalysis().getAssumptionCache(F);
@@ -1104,6 +1107,7 @@ class ExpandFpLegacyPass : public FunctionPass {
AU.addRequired();
AU.addPreserved();
AU.addPreserved();
+AU.addRequired();
}
};
} // namespace
@@ -1126,6 +1130,15 @@ PreservedAnalyses ExpandFpPass::run(Function &F,
FunctionAnalysisManager &FAM) {
AssumptionCache *AC = nullptr;
if (OptLevel != CodeGenOptLevel::None)
AC = &FAM.getResult(F);
+
+ auto &MAMProxy = FAM.getResult(F);
+ const RTLIB::RuntimeLibcallsInfo *Libcalls =
+ MAMProxy.getCachedResult(*F.getParent());
+ if (!Libcalls) {
+F.getContext().emitError("'runtime-libcall-info' analysis required");
+return PreservedAnalyses::all();
+ }
+
return runImpl(F, TLI, AC) ? PreservedAnalyses::none()
: PreservedAnalyses::all();
}
@@ -1133,6 +1146,7 @@ PreservedAnalyses ExpandFpPass::run(Function &F,
FunctionAnalysisManager &FAM) {
char ExpandFpLegacyPass::ID = 0;
INITIALIZE_PASS_BEGIN(ExpandFpLegacyPass, "expand-fp",
"Expand certain fp instructions", false, false)
+INITIALIZE_PASS_DEPENDENCY(RuntimeLibraryInfoWrapper)
INITIALIZE_PASS_END(ExpandFpLegacyPass, "expand-fp", "Expand fp", false, false)
FunctionPass *llvm::createExpandFpPass(CodeGenOptLevel OptLevel) {
diff --git a/llvm/test/Transforms/ExpandFp/AMDGPU/frem-inf.ll
b/llvm/test/Transforms/ExpandFp/AMDGPU/frem-inf.ll
index f70f0d25f172d..4d302f63e1f0b 100644
--- a/llvm/test/Transforms/ExpandFp/AMDGPU/frem-inf.ll
+++ b/llvm/test/Transforms/ExpandFp/AMDGPU/frem-inf.ll
@@ -1,5 +1,5 @@
-; RUN: opt -mtriple=amdgcn -passes="expand-fp" %s -S -o - | FileCheck
--check-prefixes CHECK %s
-; RUN: opt -mtriple=amdgcn -passes="expand-fp" %s -S -o - | FileCheck
--check-prefixes CHECK,OPT1 %s
+; RUN: opt -mtriple=amdgcn
-passes="require,expand-fp" %s -S -o - | FileCheck
--check-prefixes CHECK %s
+; RUN: opt -mtriple=amdgcn
-passes="require,expand-fp" %s -S -o - | FileCheck
--check-prefixes CHECK,OPT1 %s
; Check the handling of potentially infinite numerators in the frem
; expansion at different optimization levels and with different
diff --git a/llvm/test/Transforms/ExpandFp/AMDGPU/frem.ll
b/llvm/test/Transforms/ExpandFp/AMDGPU/frem.ll
index 4c0f9db147c96..56ccfb6bf454c 100644
--- a/llvm/test/Transforms/ExpandFp/AMDGPU/frem.ll
+++ b/llvm/test/Transforms/ExpandFp/AMDGPU/frem.ll
@@ -1,5 +1,5 @@
; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
UTC_ARGS: --version 5
-; RUN: opt -mtriple=amdgcn -passes="expand-fp" %s -S -o - | FileCheck %s
+; RUN: opt -mtriple=amdgcn
-passes="require,expand-fp" %s -S -o - | FileCheck %s
define amdgpu_kernel void @frem_f16(ptr addrspace(1) %out, ptr addrspace(1)
%in1,
; CHECK-LABEL: define amdgpu_kernel void @frem_f16(
diff --git a/llvm/test/Transforms/ExpandFp/AMDGPU/missing-analysis.ll
b/llvm/test/Transforms/ExpandFp/AMDGPU/missing-analysis.ll
new file mode 100644
index 0..5cad68e66d3ee
--- /dev/null
+++ b/llvm/test/Transforms/ExpandFp/AMDGPU/missing-analysis.ll
@@ -0,0 +1,6 @@
+; RUN: not opt -mtriple=amdgcn -passes=expand-fp -disable-output %s 2>&1 |
FileCheck %s
+
+; CHECK: 'runtime-libc
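The new-pass-manager part of the patch above only reads a module-level result if it is already cached, and reports an error otherwise. A minimal sketch of that pattern in plain C++ follows. Every name in it (`LibcallInfo`, `ModuleAnalysisCache`, `runFunctionPass`, `RunStatus`) is hypothetical and stands in for the LLVM classes; it is not the pass manager API.

```cpp
#include <string>
#include <unordered_map>

// Hypothetical analysis result; stands in for the libcall information.
struct LibcallInfo {
  int NumLibcalls = 0;
};

// Hypothetical stand-in for a module-level analysis cache that an inner
// (function-level) pass may query but must not populate.
class ModuleAnalysisCache {
  std::unordered_map<std::string, LibcallInfo> Results;

public:
  // Models an explicit "require" step that runs before the function pass.
  void compute(const std::string &Module) { Results[Module] = LibcallInfo{}; }

  // Returns the cached result, or nullptr if it was never computed.
  const LibcallInfo *getCached(const std::string &Module) const {
    auto It = Results.find(Module);
    return It == Results.end() ? nullptr : &It->second;
  }
};

enum class RunStatus { MissingAnalysis, Ran };

// The function-level pass bails out early when the module-level result is
// absent; in the patch this is where the error is emitted.
RunStatus runFunctionPass(const ModuleAnalysisCache &Cache,
                          const std::string &Module) {
  const LibcallInfo *Info = Cache.getCached(Module);
  if (!Info)
    return RunStatus::MissingAnalysis;
  return RunStatus::Ran;
}
```

This mirrors why the updated RUN lines request the analysis before `expand-fp`, and why the new `missing-analysis.ll` test expects a failure when it is absent.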
[llvm-branch-commits] [lldb] release/21.x: [debugserver] Fix debugserver build on < macOS 10.15 (#166599) (PR #166614)
https://github.com/llvmbot created
https://github.com/llvm/llvm-project/pull/166614
Backport bc55f4f4f2b4ef196cf3ec25f69dfbd9cd032237
Requested by: @JDevlieghere
>From 4b2ac3f7a210c1c8de5e8ee4c5f47104a3859ed7 Mon Sep 17 00:00:00 2001
From: Jonas Devlieghere
Date: Wed, 5 Nov 2025 10:32:38 -0800
Subject: [PATCH] [debugserver] Fix debugserver build on < macOS 10.15 (#166599)
The VM_MEMORY_SANITIZER constant was added in macOs 10.15 and friends. Support
using the constant on older OSes.
Fixes #156144
(cherry picked from commit bc55f4f4f2b4ef196cf3ec25f69dfbd9cd032237)
---
lldb/tools/debugserver/source/MacOSX/MachVMRegion.cpp | 6 ++
1 file changed, 6 insertions(+)
diff --git a/lldb/tools/debugserver/source/MacOSX/MachVMRegion.cpp
b/lldb/tools/debugserver/source/MacOSX/MachVMRegion.cpp
index 97908b4acaf28..18d254e76b917 100644
--- a/lldb/tools/debugserver/source/MacOSX/MachVMRegion.cpp
+++ b/lldb/tools/debugserver/source/MacOSX/MachVMRegion.cpp
@@ -14,6 +14,12 @@
#include "DNBLog.h"
#include
#include
+#include
+
+// From , but not on older OSs.
+#ifndef VM_MEMORY_SANITIZER
+#define VM_MEMORY_SANITIZER 99
+#endif
MachVMRegion::MachVMRegion(task_t task)
: m_task(task), m_addr(INVALID_NUB_ADDRESS), m_err(),
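The fix above relies on a common preprocessor fallback: define the constant yourself only when the system headers did not. A minimal sketch of the pattern follows. The `#ifndef` block is taken from the patch, while `isSanitizerRegion` is a hypothetical helper added only to show a use of the constant. The sketch includes no system header, so the fallback definition is the one that takes effect here.

```cpp
// On a new enough macOS SDK the system header would already define this
// macro and the fallback would be skipped. This sketch includes no system
// header, so the definition below is the one in effect.
#ifndef VM_MEMORY_SANITIZER
#define VM_MEMORY_SANITIZER 99
#endif

// Hypothetical helper, not part of debugserver: reports whether a VM
// region's user tag marks it as sanitizer-owned memory.
bool isSanitizerRegion(unsigned UserTag) {
  return UserTag == VM_MEMORY_SANITIZER;
}
```

Because the guard checks for an existing definition first, the same source builds unchanged against SDKs that do and do not provide the constant.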
[llvm-branch-commits] [lldb] release/21.x: [debugserver] Fix debugserver build on < macOS 10.15 (#166599) (PR #166614)
https://github.com/llvmbot milestoned https://github.com/llvm/llvm-project/pull/166614
[llvm-branch-commits] [lldb] release/21.x: [debugserver] Fix debugserver build on < macOS 10.15 (#166599) (PR #166614)
llvmbot wrote:
@llvm/pr-subscribers-lldb
Author: None (llvmbot)
Changes
Backport bc55f4f4f2b4ef196cf3ec25f69dfbd9cd032237
Requested by: @JDevlieghere
---
Full diff: https://github.com/llvm/llvm-project/pull/166614.diff
1 Files Affected:
- (modified) lldb/tools/debugserver/source/MacOSX/MachVMRegion.cpp (+6)
```diff
diff --git a/lldb/tools/debugserver/source/MacOSX/MachVMRegion.cpp b/lldb/tools/debugserver/source/MacOSX/MachVMRegion.cpp
index 97908b4acaf28..18d254e76b917 100644
--- a/lldb/tools/debugserver/source/MacOSX/MachVMRegion.cpp
+++ b/lldb/tools/debugserver/source/MacOSX/MachVMRegion.cpp
@@ -14,6 +14,12 @@
#include "DNBLog.h"
#include
#include
+#include
+
+// From , but not on older OSs.
+#ifndef VM_MEMORY_SANITIZER
+#define VM_MEMORY_SANITIZER 99
+#endif
MachVMRegion::MachVMRegion(task_t task)
: m_task(task), m_addr(INVALID_NUB_ADDRESS), m_err(),
```
https://github.com/llvm/llvm-project/pull/166614
[llvm-branch-commits] [lldb] release/21.x: [debugserver] Fix debugserver build on < macOS 10.15 (#166599) (PR #166614)
llvmbot wrote: @felipepiovezan What do you think about merging this PR to the release branch? https://github.com/llvm/llvm-project/pull/166614
[llvm-branch-commits] [mlir] check float cast (PR #166618)
https://github.com/makslevental updated
https://github.com/llvm/llvm-project/pull/166618
>From 186a5f9dd5545db6e3ccb228174e9f6edbce95d5 Mon Sep 17 00:00:00 2001
From: makslevental
Date: Wed, 5 Nov 2025 11:13:09 -0800
Subject: [PATCH 1/2] check float cast
---
mlir/lib/Conversion/ArithToLLVM/ArithToLLVM.cpp | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/mlir/lib/Conversion/ArithToLLVM/ArithToLLVM.cpp
b/mlir/lib/Conversion/ArithToLLVM/ArithToLLVM.cpp
index 632e1a7f02602..99d181f6262cd 100644
--- a/mlir/lib/Conversion/ArithToLLVM/ArithToLLVM.cpp
+++ b/mlir/lib/Conversion/ArithToLLVM/ArithToLLVM.cpp
@@ -583,9 +583,11 @@ struct FancyAddFLowering : public
ConvertOpToLLVMPattern {
auto parent = op->getParentOfType();
if (!parent)
return failure();
+auto floatTy = dyn_cast<FloatType>(op.getType());
+if (!floatTy)
+ return failure();
FailureOr adder =
LLVM::lookupOrCreateApFloatAddFFn(rewriter, parent);
-auto floatTy = cast<FloatType>(op.getType());
// Cast operands to 64-bit integers.
Location loc = op.getLoc();
>From 45b3830b7cc440bc62b975d169837159201e0f3c Mon Sep 17 00:00:00 2001
From: makslevental
Date: Wed, 5 Nov 2025 13:26:59 -0800
Subject: [PATCH 2/2] fix creates
---
mlir/lib/Conversion/ArithToLLVM/ArithToLLVM.cpp | 16
1 file changed, 8 insertions(+), 8 deletions(-)
diff --git a/mlir/lib/Conversion/ArithToLLVM/ArithToLLVM.cpp
b/mlir/lib/Conversion/ArithToLLVM/ArithToLLVM.cpp
index 99d181f6262cd..6fe4c22178c03 100644
--- a/mlir/lib/Conversion/ArithToLLVM/ArithToLLVM.cpp
+++ b/mlir/lib/Conversion/ArithToLLVM/ArithToLLVM.cpp
@@ -591,16 +591,16 @@ struct FancyAddFLowering : public
ConvertOpToLLVMPattern {
// Cast operands to 64-bit integers.
Location loc = op.getLoc();
-Value lhsBits = rewriter.create<LLVM::ZExtOp>(loc, rewriter.getI64Type(),
- adaptor.getLhs());
-Value rhsBits = rewriter.create<LLVM::ZExtOp>(loc, rewriter.getI64Type(),
- adaptor.getRhs());
+Value lhsBits = LLVM::ZExtOp::create(rewriter, loc, rewriter.getI64Type(),
+ adaptor.getLhs());
+Value rhsBits = LLVM::ZExtOp::create(rewriter, loc, rewriter.getI64Type(),
+ adaptor.getRhs());
// Call software implementation of floating point addition.
int32_t sem =
llvm::APFloatBase::SemanticsToEnum(floatTy.getFloatSemantics());
-Value semValue = rewriter.create<LLVM::ConstantOp>(
-loc, rewriter.getI32Type(),
+Value semValue = LLVM::ConstantOp::create(
+rewriter, loc, rewriter.getI32Type(),
rewriter.getIntegerAttr(rewriter.getI32Type(), sem));
SmallVector params = {semValue, lhsBits, rhsBits};
auto resultOp =
@@ -608,8 +608,8 @@ struct FancyAddFLowering : public
ConvertOpToLLVMPattern {
SymbolRefAttr::get(*adder), params);
// Truncate result to the original width.
-Value truncatedBits = rewriter.create<LLVM::TruncOp>(
-loc, rewriter.getIntegerType(floatTy.getWidth()),
+Value truncatedBits = LLVM::TruncOp::create(
+rewriter, loc, rewriter.getIntegerType(floatTy.getWidth()),
resultOp->getResult(0));
rewriter.replaceOp(op, truncatedBits);
return success();
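The first patch in PR #166618 swaps an asserting cast for a checked one, so the lowering pattern bails out with `failure()` when the result type is not a float. A minimal standalone sketch of that guard follows. It assumes nothing about MLIR itself: `Type`, `FloatType`, `VectorType`, and `floatWidthOrFailure` are hypothetical stand-ins, and standard `dynamic_cast` plays the role of `dyn_cast`.

```cpp
#include <cassert>
#include <optional>

// Hypothetical stand-ins for a type hierarchy; these are NOT the MLIR classes.
struct Type {
  virtual ~Type() = default;
};
struct FloatType : Type {
  unsigned width;
  explicit FloatType(unsigned w) : width(w) {}
};
struct VectorType : Type {};

// Returns the float width, or std::nullopt when the type is not a float.
// The checked cast plus early return mirrors the "dyn_cast, then
// return failure()" shape of the patch, instead of asserting on mismatch.
std::optional<unsigned> floatWidthOrFailure(const Type &ty) {
  const auto *floatTy = dynamic_cast<const FloatType *>(&ty);
  if (!floatTy)
    return std::nullopt; // analogous to `return failure();`
  return floatTy->width;
}
```

Called with a `FloatType(32)` this yields 32; called with a `VectorType` it yields an empty optional rather than aborting, which is the behavior the patch appears to be after.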
[llvm-branch-commits] [llvm] AMDGPU: Start to use AV classes for unknown vector class (PR #166482)
https://github.com/shiltian approved this pull request. https://github.com/llvm/llvm-project/pull/166482
[llvm-branch-commits] [llvm] [dwarf] make dwarf fission compatible with RISCV relaxations 2/2 (PR #164813)
https://github.com/dlav-sc updated
https://github.com/llvm/llvm-project/pull/164813
>From 7f6a94fb0462c7b7a30fda84151d38f4182590bc Mon Sep 17 00:00:00 2001
From: Daniil Avdeev
Date: Thu, 18 Sep 2025 02:05:39 +0000
Subject: [PATCH] [dwarf] make dwarf fission compatible with RISCV relaxations
2/2
This patch makes DWARF fission compatible with RISC-V relaxations by
using indirect addressing for the DW_AT_high_pc attribute. This
eliminates the remaining relocations in .dwo files.
---
.../CodeGen/AsmPrinter/DwarfCompileUnit.cpp | 8 ++--
llvm/test/DebugInfo/RISCV/relax_dwo_ranges.ll | 44 +--
2 files changed, 35 insertions(+), 17 deletions(-)
diff --git a/llvm/lib/CodeGen/AsmPrinter/DwarfCompileUnit.cpp
b/llvm/lib/CodeGen/AsmPrinter/DwarfCompileUnit.cpp
index 751d3735d3b2b..2e4a26ef70bc2 100644
--- a/llvm/lib/CodeGen/AsmPrinter/DwarfCompileUnit.cpp
+++ b/llvm/lib/CodeGen/AsmPrinter/DwarfCompileUnit.cpp
@@ -493,10 +493,12 @@ void DwarfCompileUnit::attachLowHighPC(DIE &D, const
MCSymbol *Begin,
assert(End->isDefined() && "Invalid end label");
addLabelAddress(D, dwarf::DW_AT_low_pc, Begin);
- if (DD->getDwarfVersion() < 4)
-addLabelAddress(D, dwarf::DW_AT_high_pc, End);
- else
+ if (DD->getDwarfVersion() >= 4 &&
+ (!isDwoUnit() || !llvm::isRangeRelaxable(Begin, End))) {
addLabelDelta(D, dwarf::DW_AT_high_pc, End, Begin);
+return;
+ }
+ addLabelAddress(D, dwarf::DW_AT_high_pc, End);
}
// Add info for Wasm-global-based relocation.
diff --git a/llvm/test/DebugInfo/RISCV/relax_dwo_ranges.ll
b/llvm/test/DebugInfo/RISCV/relax_dwo_ranges.ll
index f8ab7fc5ad900..64f83ba1a7d7f 100644
--- a/llvm/test/DebugInfo/RISCV/relax_dwo_ranges.ll
+++ b/llvm/test/DebugInfo/RISCV/relax_dwo_ranges.ll
@@ -1,12 +1,13 @@
; RUN: llc -dwarf-version=5 -split-dwarf-file=foo.dwo -O0 %s
-mtriple=riscv64-unknown-linux-gnu -filetype=obj -o %t
; RUN: llvm-dwarfdump -v %t | FileCheck --check-prefix=DWARF5 %s
; RUN: llvm-dwarfdump --debug-info %t 2> %t.txt
-; RUN: FileCheck --input-file=%t.txt %s --check-prefix=RELOCS
--implicit-check-not=warning:
+; RUN: FileCheck --input-file=%t.txt %s --check-prefix=RELOCS --allow-empty
--implicit-check-not=warning:
; RUN: llc -dwarf-version=4 -split-dwarf-file=foo.dwo -O0 %s
-mtriple=riscv64-unknown-linux-gnu -filetype=obj -o %t
; RUN: llvm-dwarfdump -v %t | FileCheck --check-prefix=DWARF4 %s
; RUN: llvm-dwarfdump --debug-info %t 2> %t.txt
-; RUN: FileCheck --input-file=%t.txt %s --check-prefix=RELOCS
--implicit-check-not=warning:
+; RUN: FileCheck --input-file=%t.txt %s --check-prefix=RELOCS --allow-empty
--implicit-check-not=warning:
+; RUN: llvm-objdump -h %t | FileCheck --check-prefix=HDR %s
; In the RISC-V architecture, the .text section is subject to
; relaxation, meaning the start address of each function can change
@@ -49,60 +50,75 @@
; clang -g -S -gsplit-dwarf --target=riscv64 -march=rv64gc -O0
relax_dwo_ranges.cpp
-; Currently, square() still uses an offset to represent the function's end
address,
-; which requires a relocation here.
-; RELOCS: warning: unexpected relocations for dwo section '.debug_info.dwo'
+; RELOCS-NOT: warning: unexpected relocations for dwo section '.debug_info.dwo'
+; Make sure we don't produce any relocations in any .dwo section
+; HDR-NOT: .rela.{{.*}}.dwo
+
+; Ensure that 'square()' function uses indexed start and end addresses
; DWARF5: .debug_info.dwo contents:
; DWARF5: DW_TAG_subprogram
-; DWARF5-NEXT: DW_AT_low_pc [DW_FORM_addrx](indexed () address =
0x ".text")
-; DWARF5-NEXT: DW_AT_high_pc [DW_FORM_data4] (0x)
+; DWARF5-NEXT: DW_AT_low_pc [DW_FORM_addrx](indexed () address =
0x ".text")
+; DWARF5-NEXT: DW_AT_high_pc [DW_FORM_addrx](indexed (0001) address =
0x0044 ".text")
; DWARF5: DW_AT_name {{.*}} "square")
; DWARF5: DW_TAG_formal_parameter
+; HDR-NOT: .rela.{{.*}}.dwo
+
; Ensure there is no unnecessary addresses in .o file
; DWARF5: .debug_addr contents:
; DWARF5: Addrs: [
; DWARF5-NEXT: 0x
+; DWARF5-NEXT: 0x0044
; DWARF5-NEXT: 0x0046
; DWARF5-NEXT: 0x006c
; DWARF5-NEXT: 0x00b0
; DWARF5-NEXT: ]
+; HDR-NOT: .rela.{{.*}}.dwo
+
; Ensure that 'boo()' and 'main()' use DW_RLE_startx_length and
DW_RLE_startx_endx
; entries respectively
; DWARF5: .debug_rnglists.dwo contents:
; DWARF5: ranges:
-; DWARF5-NEXT: 0x0014: [DW_RLE_startx_length]: 0x0001,
0x0024 => [0x0046, 0x006a)
+; DWARF5-NEXT: 0x0014: [DW_RLE_startx_length]: 0x0002,
0x0024 => [0x0046, 0x006a)
; DWARF5-NEXT: 0x0017: [DW_RLE_end_of_list ]
-; DWARF5-NEXT: 0x0018: [DW_RLE_startx_endx ]: 0x0002,
0x0003 => [0x006c, 0x00b0)
+; DWARF5-NEXT: 0x0018: [DW_RLE_startx_endx ]: 0x0
[llvm-branch-commits] [llvm] [dwarf] make dwarf fission compatible with RISCV relaxations 2/2 (PR #164813)
https://github.com/dlav-sc edited https://github.com/llvm/llvm-project/pull/164813
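The `attachLowHighPC` change in PR #164813 reduces to a decision over three inputs: the DWARF version, whether the unit is a .dwo unit, and whether the range can be affected by relaxation. The sketch below restates that condition as a pure function so the cases are easy to enumerate. The enum `HighPcForm` and the function `chooseHighPcForm` are hypothetical names for illustration only; the real code calls `addLabelDelta` or `addLabelAddress` directly.

```cpp
#include <cassert>

// Hypothetical names for the two encodings the patched function picks between.
enum class HighPcForm {
  OffsetFromLowPc, // corresponds to addLabelDelta(D, DW_AT_high_pc, End, Begin)
  Address          // corresponds to addLabelAddress(D, DW_AT_high_pc, End)
};

// Same condition as in the diff: keep the offset form for DWARF >= 4 unless
// the unit is a .dwo unit AND the [Begin, End) range is relaxable.
HighPcForm chooseHighPcForm(unsigned dwarfVersion, bool isDwoUnit,
                            bool rangeRelaxable) {
  if (dwarfVersion >= 4 && (!isDwoUnit || !rangeRelaxable))
    return HighPcForm::OffsetFromLowPc;
  return HighPcForm::Address;
}
```

Under this reading, only the combination "DWARF 4 or later, .dwo unit, relaxable range" changes behavior compared with the old code; judging by the updated test expectations, that case then shows up as a `DW_FORM_addrx` high_pc plus one extra `.debug_addr` entry.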
[llvm-branch-commits] [CI] Make premerge upload/write comments (PR #166609)
https://github.com/boomanaiden154 created https://github.com/llvm/llvm-project/pull/166609 This only does this for Linux currently as the issue-write workflow currently does not support writing out multiple comments. This gets the ball rolling as the failures that most people see are common to both platforms. Ensuring we have coverage on Windows for comments will be done in a future patch.
[llvm-branch-commits] [CI] Make premerge upload/write comments (PR #166609)
llvmbot wrote:
@llvm/pr-subscribers-github-workflow
Author: Aiden Grossman (boomanaiden154)
Changes
This only does this for Linux currently as the issue-write workflow currently
does not support writing out multiple comments. This gets the ball rolling as
the failures that most people see are common to both platforms. Ensuring we
have coverage on Windows for comments will be done in a future patch.
---
Full diff: https://github.com/llvm/llvm-project/pull/166609.diff
2 Files Affected:
- (modified) .github/workflows/issue-write.yml (+1)
- (modified) .github/workflows/premerge.yaml (+7)
``diff
diff --git a/.github/workflows/issue-write.yml b/.github/workflows/issue-write.yml
index 26cd60c070251..8966a1b372dfc 100644
--- a/.github/workflows/issue-write.yml
+++ b/.github/workflows/issue-write.yml
@@ -7,6 +7,7 @@ on:
- "Check for private emails used in PRs"
- "PR Request Release Note"
- "Code lint"
+ - "CI Checks"
types:
- completed
diff --git a/.github/workflows/premerge.yaml b/.github/workflows/premerge.yaml
index 973d3abf358ce..2b88ebd3e6106 100644
--- a/.github/workflows/premerge.yaml
+++ b/.github/workflows/premerge.yaml
@@ -116,6 +116,13 @@ jobs:
path: artifacts/
retention-days: 5
include-hidden-files: 'true'
+ - name: Upload Comment
+if: always()
+continue-on-error: true
+with:
+ name: workflow-args
+ path: |
+comments
premerge-checks-windows:
name: Build and Test Windows
``
https://github.com/llvm/llvm-project/pull/166609
[llvm-branch-commits] [libc] e3b12c0 - Revert "[libc] Return errno from OFD failure paths in fcntl. (#166252)"
Author: Jackson Stogel
Date: 2025-11-05T14:52:26-08:00
New Revision: e3b12c0859b0e472a44df89dd2be19b2016191f6
URL:
https://github.com/llvm/llvm-project/commit/e3b12c0859b0e472a44df89dd2be19b2016191f6
DIFF:
https://github.com/llvm/llvm-project/commit/e3b12c0859b0e472a44df89dd2be19b2016191f6.diff
LOG: Revert "[libc] Return errno from OFD failure paths in fcntl. (#166252)"
This reverts commit 81dede888a35281e20b59107a6bf347c23e1c5f6.
Added:
Modified:
libc/src/__support/OSUtil/linux/fcntl.cpp
libc/test/src/fcntl/fcntl_test.cpp
Removed:
diff --git a/libc/src/__support/OSUtil/linux/fcntl.cpp
b/libc/src/__support/OSUtil/linux/fcntl.cpp
index 08db4859c6417..bb76eee90efd2 100644
--- a/libc/src/__support/OSUtil/linux/fcntl.cpp
+++ b/libc/src/__support/OSUtil/linux/fcntl.cpp
@@ -66,7 +66,7 @@ ErrorOr fcntl(int fd, int cmd, void *arg) {
LIBC_NAMESPACE::syscall_impl(FCNTL_SYSCALL_ID, fd, cmd, &flk64);
// On failure, return
if (ret < 0)
- return Error(-ret);
+ return Error(-1);
// Check for overflow, i.e. the offsets are not the same when cast
// to off_t from off64_t.
if (static_cast<off_t>(flk64.l_len) != flk64.l_len ||
diff --git a/libc/test/src/fcntl/fcntl_test.cpp
b/libc/test/src/fcntl/fcntl_test.cpp
index d6aaccddd76ee..84feb34e537a0 100644
--- a/libc/test/src/fcntl/fcntl_test.cpp
+++ b/libc/test/src/fcntl/fcntl_test.cpp
@@ -94,99 +94,78 @@ TEST_F(LlvmLibcFcntlTest, FcntlSetFl) {
ASSERT_THAT(LIBC_NAMESPACE::close(fd), Succeeds(0));
}
-/* Tests that are common between OFD and traditional variants of fcntl locks. */
-template
-class LibcFcntlCommonLockTests : public LlvmLibcFcntlTest {
-public:
- void GetLkRead() {
-using LIBC_NAMESPACE::testing::ErrnoSetterMatcher::Succeeds;
-constexpr const char *TEST_FILE_NAME = "testdata/fcntl_getlkread.test";
-const auto TEST_FILE = libc_make_test_file_path(TEST_FILE_NAME);
-
-struct flock flk = {};
-struct flock svflk = {};
-int retVal;
-int fd =
-LIBC_NAMESPACE::open(TEST_FILE, O_CREAT | O_TRUNC | O_RDONLY, S_IRWXU);
-ASSERT_ERRNO_SUCCESS();
-ASSERT_GT(fd, 0);
-
-flk.l_type = F_RDLCK;
-flk.l_start = 0;
-flk.l_whence = SEEK_SET;
-flk.l_len = 50;
-
-// copy flk into svflk
-svflk = flk;
-
-retVal = LIBC_NAMESPACE::fcntl(fd, GETLK_CMD, &svflk);
-ASSERT_ERRNO_SUCCESS();
-ASSERT_GT(retVal, -1);
-ASSERT_NE((int)svflk.l_type, F_WRLCK); // File should not be write locked.
-
-retVal = LIBC_NAMESPACE::fcntl(fd, SETLK_CMD, &svflk);
-ASSERT_ERRNO_SUCCESS();
-ASSERT_GT(retVal, -1);
-
-ASSERT_THAT(LIBC_NAMESPACE::close(fd), Succeeds(0));
- }
-
- void GetLkWrite() {
-using LIBC_NAMESPACE::testing::ErrnoSetterMatcher::Succeeds;
-constexpr const char *TEST_FILE_NAME = "testdata/fcntl_getlkwrite.test";
-const auto TEST_FILE = libc_make_test_file_path(TEST_FILE_NAME);
-
-struct flock flk = {};
-struct flock svflk = {};
-int retVal;
-int fd =
-LIBC_NAMESPACE::open(TEST_FILE, O_CREAT | O_TRUNC | O_RDWR, S_IRWXU);
-ASSERT_ERRNO_SUCCESS();
-ASSERT_GT(fd, 0);
-
-flk.l_type = F_WRLCK;
-flk.l_start = 0;
-flk.l_whence = SEEK_SET;
-flk.l_len = 0;
-
-// copy flk into svflk
-svflk = flk;
-
-retVal = LIBC_NAMESPACE::fcntl(fd, GETLK_CMD, &svflk);
-ASSERT_ERRNO_SUCCESS();
-ASSERT_GT(retVal, -1);
-ASSERT_NE((int)svflk.l_type, F_RDLCK); // File should not be read locked.
-
-retVal = LIBC_NAMESPACE::fcntl(fd, SETLK_CMD, &svflk);
-ASSERT_ERRNO_SUCCESS();
-ASSERT_GT(retVal, -1);
-
-ASSERT_THAT(LIBC_NAMESPACE::close(fd), Succeeds(0));
- }
-
- void UseAfterClose() {
-using LIBC_NAMESPACE::testing::ErrnoSetterMatcher::Succeeds;
-constexpr const char *TEST_FILE_NAME =
-"testdata/fcntl_use_after_close.test";
-const auto TEST_FILE = libc_make_test_file_path(TEST_FILE_NAME);
-int fd =
-LIBC_NAMESPACE::open(TEST_FILE, O_CREAT | O_TRUNC | O_RDWR, S_IRWXU);
-ASSERT_THAT(LIBC_NAMESPACE::close(fd), Succeeds(0));
-ASSERT_EQ(-1, LIBC_NAMESPACE::fcntl(fd, GETLK_CMD));
-ASSERT_ERRNO_EQ(EBADF);
- }
-};
-
-#define COMMON_LOCK_TESTS(NAME, GETLK, SETLK) \
- using NAME = LibcFcntlCommonLockTests<GETLK, SETLK>; \
- TEST_F(NAME, GetLkRead) { GetLkRead(); } \
- TEST_F(NAME, GetLkWrite) { GetLkWrite(); } \
- TEST_F(NAME, UseAfterClose) { UseAfterClose(); } \
- static_assert(true, "Require semicolon.")
-
-COMMON_LOCK_TESTS(LlvmLibcFcntlProcessAssociatedLockTest, F_GETLK, F_SETLK);
-COMMON_LOCK_TESTS(LlvmLibcFcntlOpenFileDescriptionLockTest, F_OFD_GETLK,
- F_OFD_SETLK);
+TEST_F(LlvmLibcFcntlTest, FcntlGetLkRead) {
+ using LIBC_NAMESPACE::testing::ErrnoSetterMatcher::Suc
[llvm-branch-commits] [llvm] [dwarf] make dwarf fission compatible with RISCV relaxations 2/2 (PR #164813)
https://github.com/dlav-sc updated
https://github.com/llvm/llvm-project/pull/164813
>From ac224f9079e297ae6b5f675234120d0dcef0ebe8 Mon Sep 17 00:00:00 2001
From: Daniil Avdeev
Date: Thu, 18 Sep 2025 02:05:39 +0000
Subject: [PATCH] [dwarf] make dwarf fission compatible with RISCV relaxations
2/2
This patch makes DWARF fission compatible with RISC-V relaxations by
using indirect addressing for the DW_AT_high_pc attribute. This
eliminates the remaining relocations in .dwo files.
---
.../CodeGen/AsmPrinter/DwarfCompileUnit.cpp | 8 ++--
llvm/test/DebugInfo/RISCV/relax_dwo_ranges.ll | 44 +--
2 files changed, 35 insertions(+), 17 deletions(-)
diff --git a/llvm/lib/CodeGen/AsmPrinter/DwarfCompileUnit.cpp
b/llvm/lib/CodeGen/AsmPrinter/DwarfCompileUnit.cpp
index 751d3735d3b2b..2e4a26ef70bc2 100644
--- a/llvm/lib/CodeGen/AsmPrinter/DwarfCompileUnit.cpp
+++ b/llvm/lib/CodeGen/AsmPrinter/DwarfCompileUnit.cpp
@@ -493,10 +493,12 @@ void DwarfCompileUnit::attachLowHighPC(DIE &D, const
MCSymbol *Begin,
assert(End->isDefined() && "Invalid end label");
addLabelAddress(D, dwarf::DW_AT_low_pc, Begin);
- if (DD->getDwarfVersion() < 4)
-addLabelAddress(D, dwarf::DW_AT_high_pc, End);
- else
+ if (DD->getDwarfVersion() >= 4 &&
+ (!isDwoUnit() || !llvm::isRangeRelaxable(Begin, End))) {
addLabelDelta(D, dwarf::DW_AT_high_pc, End, Begin);
+return;
+ }
+ addLabelAddress(D, dwarf::DW_AT_high_pc, End);
}
// Add info for Wasm-global-based relocation.
diff --git a/llvm/test/DebugInfo/RISCV/relax_dwo_ranges.ll
b/llvm/test/DebugInfo/RISCV/relax_dwo_ranges.ll
index f8ab7fc5ad900..64f83ba1a7d7f 100644
--- a/llvm/test/DebugInfo/RISCV/relax_dwo_ranges.ll
+++ b/llvm/test/DebugInfo/RISCV/relax_dwo_ranges.ll
@@ -1,12 +1,13 @@
; RUN: llc -dwarf-version=5 -split-dwarf-file=foo.dwo -O0 %s
-mtriple=riscv64-unknown-linux-gnu -filetype=obj -o %t
; RUN: llvm-dwarfdump -v %t | FileCheck --check-prefix=DWARF5 %s
; RUN: llvm-dwarfdump --debug-info %t 2> %t.txt
-; RUN: FileCheck --input-file=%t.txt %s --check-prefix=RELOCS
--implicit-check-not=warning:
+; RUN: FileCheck --input-file=%t.txt %s --check-prefix=RELOCS --allow-empty
--implicit-check-not=warning:
; RUN: llc -dwarf-version=4 -split-dwarf-file=foo.dwo -O0 %s
-mtriple=riscv64-unknown-linux-gnu -filetype=obj -o %t
; RUN: llvm-dwarfdump -v %t | FileCheck --check-prefix=DWARF4 %s
; RUN: llvm-dwarfdump --debug-info %t 2> %t.txt
-; RUN: FileCheck --input-file=%t.txt %s --check-prefix=RELOCS
--implicit-check-not=warning:
+; RUN: FileCheck --input-file=%t.txt %s --check-prefix=RELOCS --allow-empty
--implicit-check-not=warning:
+; RUN: llvm-objdump -h %t | FileCheck --check-prefix=HDR %s
; In the RISC-V architecture, the .text section is subject to
; relaxation, meaning the start address of each function can change
@@ -49,60 +50,75 @@
; clang -g -S -gsplit-dwarf --target=riscv64 -march=rv64gc -O0
relax_dwo_ranges.cpp
-; Currently, square() still uses an offset to represent the function's end
address,
-; which requires a relocation here.
-; RELOCS: warning: unexpected relocations for dwo section '.debug_info.dwo'
+; RELOCS-NOT: warning: unexpected relocations for dwo section '.debug_info.dwo'
+; Make sure we don't produce any relocations in any .dwo section
+; HDR-NOT: .rela.{{.*}}.dwo
+
+; Ensure that 'square()' function uses indexed start and end addresses
; DWARF5: .debug_info.dwo contents:
; DWARF5: DW_TAG_subprogram
-; DWARF5-NEXT: DW_AT_low_pc [DW_FORM_addrx](indexed () address =
0x ".text")
-; DWARF5-NEXT: DW_AT_high_pc [DW_FORM_data4] (0x)
+; DWARF5-NEXT: DW_AT_low_pc [DW_FORM_addrx](indexed () address =
0x ".text")
+; DWARF5-NEXT: DW_AT_high_pc [DW_FORM_addrx](indexed (0001) address =
0x0044 ".text")
; DWARF5: DW_AT_name {{.*}} "square")
; DWARF5: DW_TAG_formal_parameter
+; HDR-NOT: .rela.{{.*}}.dwo
+
; Ensure there is no unnecessary addresses in .o file
; DWARF5: .debug_addr contents:
; DWARF5: Addrs: [
; DWARF5-NEXT: 0x
+; DWARF5-NEXT: 0x0044
; DWARF5-NEXT: 0x0046
; DWARF5-NEXT: 0x006c
; DWARF5-NEXT: 0x00b0
; DWARF5-NEXT: ]
+; HDR-NOT: .rela.{{.*}}.dwo
+
; Ensure that 'boo()' and 'main()' use DW_RLE_startx_length and
DW_RLE_startx_endx
; entries respectively
; DWARF5: .debug_rnglists.dwo contents:
; DWARF5: ranges:
-; DWARF5-NEXT: 0x0014: [DW_RLE_startx_length]: 0x0001,
0x0024 => [0x0046, 0x006a)
+; DWARF5-NEXT: 0x0014: [DW_RLE_startx_length]: 0x0002,
0x0024 => [0x0046, 0x006a)
; DWARF5-NEXT: 0x0017: [DW_RLE_end_of_list ]
-; DWARF5-NEXT: 0x0018: [DW_RLE_startx_endx ]: 0x0002,
0x0003 => [0x006c, 0x00b0)
+; DWARF5-NEXT: 0x0018: [DW_RLE_startx_endx ]: 0x0
[llvm-branch-commits] [CI] Make premerge upload/write comments (PR #166609)
@@ -116,6 +116,13 @@ jobs:
path: artifacts/
retention-days: 5
include-hidden-files: 'true'
+ - name: Upload Comment
+if: always()
+continue-on-error: true
tstellar wrote:
Are you missing a 'uses' here?
https://github.com/llvm/llvm-project/pull/166609
[llvm-branch-commits] [llvm] [LLVM][X86] Add native ct.select support for X86 and i386 (PR #166704)
github-actions[bot] wrote:
:warning: C/C++ code formatter, clang-format found issues in your code.
:warning:
You can test this locally with the following command:
``bash
git-clang-format --diff origin/main HEAD --extensions h,cpp -- \
  llvm/lib/Target/X86/X86ISelLowering.cpp llvm/lib/Target/X86/X86ISelLowering.h \
  llvm/lib/Target/X86/X86InstrInfo.cpp llvm/lib/Target/X86/X86InstrInfo.h \
  llvm/lib/Target/X86/X86TargetMachine.cpp --diff_from_common_commit
``
:warning:
The reproduction instructions above might return results for more than one PR
in a stack if you are using a stacked PR workflow. You can limit the results by
changing `origin/main` to the base branch/commit you want to compare against.
:warning:
View the diff from clang-format here.
``diff
diff --git a/llvm/lib/Target/X86/X86ISelLowering.cpp
b/llvm/lib/Target/X86/X86ISelLowering.cpp
index 4c73f7402..3afdb884b 100644
--- a/llvm/lib/Target/X86/X86ISelLowering.cpp
+++ b/llvm/lib/Target/X86/X86ISelLowering.cpp
@@ -29,9 +29,9 @@
#include "llvm/Analysis/BlockFrequencyInfo.h"
#include "llvm/Analysis/ProfileSummaryInfo.h"
#include "llvm/Analysis/VectorUtils.h"
-#include "llvm/CodeGen/LivePhysRegs.h"
#include "llvm/CodeGen/ISDOpcodes.h"
#include "llvm/CodeGen/IntrinsicLowering.h"
+#include "llvm/CodeGen/LivePhysRegs.h"
#include "llvm/CodeGen/MachineFrameInfo.h"
#include "llvm/CodeGen/MachineFunction.h"
#include "llvm/CodeGen/MachineInstrBuilder.h"
@@ -25372,7 +25372,7 @@ static SDValue LowerSIGN_EXTEND_Mask(SDValue Op, const
SDLoc &dl,
}
SDValue X86TargetLowering::LowerCTSELECT(SDValue Op, SelectionDAG &DAG) const {
- SDValue Cond = Op.getOperand(0); // condition
+ SDValue Cond = Op.getOperand(0);// condition
SDValue TrueOp = Op.getOperand(1); // true_value
SDValue FalseOp = Op.getOperand(2); // false_value
SDLoc DL(Op);
@@ -25542,29 +25542,33 @@ SDValue X86TargetLowering::LowerCTSELECT(SDValue Op,
SelectionDAG &DAG) const {
SDValue FalseSlot = DAG.CreateStackTemporary(MVT::f80);
// Store f80 values to memory
-SDValue StoreTrueF80 = DAG.getStore(Chain, DL, TrueOp, TrueSlot,
-MachinePointerInfo());
-SDValue StoreFalseF80 = DAG.getStore(Chain, DL, FalseOp, FalseSlot,
- MachinePointerInfo());
+SDValue StoreTrueF80 =
+DAG.getStore(Chain, DL, TrueOp, TrueSlot, MachinePointerInfo());
+SDValue StoreFalseF80 =
+DAG.getStore(Chain, DL, FalseOp, FalseSlot, MachinePointerInfo());
// Load i32 parts from memory (3 chunks for 96-bit f80 storage)
-SDValue TruePart0 = DAG.getLoad(MVT::i32, DL, StoreTrueF80, TrueSlot,
- MachinePointerInfo());
-SDValue TruePart1Ptr = DAG.getMemBasePlusOffset(TrueSlot,
TypeSize::getFixed(4), DL);
+SDValue TruePart0 =
+DAG.getLoad(MVT::i32, DL, StoreTrueF80, TrueSlot,
MachinePointerInfo());
+SDValue TruePart1Ptr =
+DAG.getMemBasePlusOffset(TrueSlot, TypeSize::getFixed(4), DL);
SDValue TruePart1 = DAG.getLoad(MVT::i32, DL, StoreTrueF80, TruePart1Ptr,
- MachinePointerInfo());
-SDValue TruePart2Ptr = DAG.getMemBasePlusOffset(TrueSlot,
TypeSize::getFixed(8), DL);
+MachinePointerInfo());
+SDValue TruePart2Ptr =
+DAG.getMemBasePlusOffset(TrueSlot, TypeSize::getFixed(8), DL);
SDValue TruePart2 = DAG.getLoad(MVT::i32, DL, StoreTrueF80, TruePart2Ptr,
- MachinePointerInfo());
+MachinePointerInfo());
SDValue FalsePart0 = DAG.getLoad(MVT::i32, DL, StoreFalseF80, FalseSlot,
- MachinePointerInfo());
-SDValue FalsePart1Ptr = DAG.getMemBasePlusOffset(FalseSlot,
TypeSize::getFixed(4), DL);
+ MachinePointerInfo());
+SDValue FalsePart1Ptr =
+DAG.getMemBasePlusOffset(FalseSlot, TypeSize::getFixed(4), DL);
SDValue FalsePart1 = DAG.getLoad(MVT::i32, DL, StoreFalseF80,
FalsePart1Ptr,
- MachinePointerInfo());
-SDValue FalsePart2Ptr = DAG.getMemBasePlusOffset(FalseSlot,
TypeSize::getFixed(8), DL);
+ MachinePointerInfo());
+SDValue FalsePart2Ptr =
+DAG.getMemBasePlusOffset(FalseSlot, TypeSize::getFixed(8), DL);
SDValue FalsePart2 = DAG.getLoad(MVT::i32, DL, StoreFalseF80,
FalsePart2Ptr,
- MachinePointerInfo());
+ MachinePointerInfo());
// Perform CTSELECT on each 32-bit chunk
SDValue Part0Ops[] = {FalsePart0, TruePart0, CC, ProcessedCond};
@@ -25576,17 +25580,20 @@ SDValue X86TargetLowering::LowerCTSELECT(SDValue Op,
SelectionDAG &DAG) const {
// Create result stack slot and store the selected parts
SDValue ResultSlot = DAG.CreateStackTempor
[llvm-branch-commits] [llvm] [LLVM][ARM] Add native ct.select support for ARM32 and Thumb (PR #166707)
github-actions[bot] wrote:
:warning: C/C++ code formatter, clang-format found issues in your code.
:warning:
You can test this locally with the following command:
``bash
git-clang-format --diff origin/main HEAD --extensions h,cpp -- \
  llvm/lib/Target/ARM/ARMBaseInstrInfo.cpp llvm/lib/Target/ARM/ARMBaseInstrInfo.h \
  llvm/lib/Target/ARM/ARMISelDAGToDAG.cpp llvm/lib/Target/ARM/ARMISelLowering.cpp \
  llvm/lib/Target/ARM/ARMISelLowering.h llvm/lib/Target/ARM/ARMTargetMachine.cpp \
  --diff_from_common_commit
``
:warning:
The reproduction instructions above might return results for more than one PR
in a stack if you are using a stacked PR workflow. You can limit the results by
changing `origin/main` to the base branch/commit you want to compare against.
:warning:
View the diff from clang-format here.
``diff
diff --git a/llvm/lib/Target/ARM/ARMBaseInstrInfo.cpp
b/llvm/lib/Target/ARM/ARMBaseInstrInfo.cpp
index 6e2aaa9fc..6d8a3b722 100644
--- a/llvm/lib/Target/ARM/ARMBaseInstrInfo.cpp
+++ b/llvm/lib/Target/ARM/ARMBaseInstrInfo.cpp
@@ -1552,8 +1552,9 @@ bool ARMBaseInstrInfo::expandCtSelectVector(MachineInstr
&MI) const {
unsigned RsbOp = Subtarget.isThumb2() ? ARM::t2RSBri : ARM::RSBri;
- // Any vector pseudo has: ((outs $dst, $tmp_mask, $bcast_mask), (ins $src1,
$src2, $cond))
- Register VectorMaskReg = MI.getOperand(2).getReg();
+ // Any vector pseudo has: ((outs $dst, $tmp_mask, $bcast_mask), (ins $src1,
+ // $src2, $cond))
+ Register VectorMaskReg = MI.getOperand(2).getReg();
Register Src1Reg = MI.getOperand(3).getReg();
Register Src2Reg = MI.getOperand(4).getReg();
Register CondReg = MI.getOperand(5).getReg();
@@ -1564,47 +1565,46 @@ bool
ARMBaseInstrInfo::expandCtSelectVector(MachineInstr &MI) const {
// When cond = 0: mask = 0x00000000.
// When cond = 1: mask = 0xFFFFFFFF.
- MachineInstr *FirstNewMI =
-BuildMI(*MBB, MI, DL, get(RsbOp), MaskReg)
-.addReg(CondReg)
-.addImm(0)
-.add(predOps(ARMCC::AL))
-.add(condCodeOp())
-.setMIFlag(MachineInstr::MIFlag::NoMerge);
-
+ MachineInstr *FirstNewMI = BuildMI(*MBB, MI, DL, get(RsbOp), MaskReg)
+ .addReg(CondReg)
+ .addImm(0)
+ .add(predOps(ARMCC::AL))
+ .add(condCodeOp())
+ .setMIFlag(MachineInstr::MIFlag::NoMerge);
+
// 2. A = src1 & mask
// For vectors, broadcast the scalar mask so it matches operand size.
BuildMI(*MBB, MI, DL, get(BroadcastOp), VectorMaskReg)
-.addReg(MaskReg)
-.add(predOps(ARMCC::AL))
-.setMIFlag(MachineInstr::MIFlag::NoMerge);
+ .addReg(MaskReg)
+ .add(predOps(ARMCC::AL))
+ .setMIFlag(MachineInstr::MIFlag::NoMerge);
BuildMI(*MBB, MI, DL, get(AndOp), DestReg)
-.addReg(Src1Reg)
-.addReg(VectorMaskReg)
-.add(predOps(ARMCC::AL))
-.setMIFlag(MachineInstr::MIFlag::NoMerge);
+ .addReg(Src1Reg)
+ .addReg(VectorMaskReg)
+ .add(predOps(ARMCC::AL))
+ .setMIFlag(MachineInstr::MIFlag::NoMerge);
// 3. B = src2 & ~mask
BuildMI(*MBB, MI, DL, get(BicOp), VectorMaskReg)
-.addReg(Src2Reg)
-.addReg(VectorMaskReg)
-.add(predOps(ARMCC::AL))
-.setMIFlag(MachineInstr::MIFlag::NoMerge);
+ .addReg(Src2Reg)
+ .addReg(VectorMaskReg)
+ .add(predOps(ARMCC::AL))
+ .setMIFlag(MachineInstr::MIFlag::NoMerge);
// 4. result = A | B
auto LastNewMI = BuildMI(*MBB, MI, DL, get(OrrOp), DestReg)
-.addReg(DestReg)
-.addReg(VectorMaskReg)
-.add(predOps(ARMCC::AL))
-.setMIFlag(MachineInstr::MIFlag::NoMerge);
+ .addReg(DestReg)
+ .addReg(VectorMaskReg)
+ .add(predOps(ARMCC::AL))
+ .setMIFlag(MachineInstr::MIFlag::NoMerge);
auto BundleStart = FirstNewMI->getIterator();
auto BundleEnd = LastNewMI->getIterator();
// Add instruction bundling
finalizeBundle(*MBB, BundleStart, std::next(BundleEnd));
-
+
MI.eraseFromParent();
return true;
}
@@ -1614,8 +1614,8 @@ bool ARMBaseInstrInfo::expandCtSelectThumb(MachineInstr
&MI) const {
MachineBasicBlock *MBB = MI.getParent();
DebugLoc DL = MI.getDebugLoc();
- // pseudos in thumb1 mode have: (outs $dst, $tmp_mask), (ins $src1, $src2,
$cond))
- // register class here is always tGPR.
+ // pseudos in thumb1 mode have: (outs $dst, $tmp_mask), (ins $src1, $src2,
+ // $cond)) register class here is always tGPR.
Register DestReg = MI.getOperand(0).getReg();
Register MaskReg = MI.getOperand(1).getReg();
Register Src1Reg = MI.getOperand(2).getReg();
@@ -1631,60 +1631,64 @@ bool ARMBaseInstrInfo::expandCtSelectThumb(MachineInstr
&MI) const {
unsigned ShiftAmount = RegSize - 1;
// Option 1: Shift-based mask (preferred - no flag modification)
- MachineInstr *FirstNewMI =
-BuildMI(*MBB, MI, DL, get(ARM::tMOVr), MaskReg)
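The numbered comments in `expandCtSelectVector` above describe a mask-based select: derive an all-zeros or all-ones mask from the condition, AND each source with the mask or its complement, then OR the halves together. A scalar sketch of the same arithmetic in plain C++ follows. The function name `ctSelect` is hypothetical, and the sketch assumes `cond` is exactly 0 or 1; it illustrates the bit arithmetic only and is not the PR's implementation.

```cpp
#include <cassert>
#include <cstdint>

// Mask-based select over 32-bit values. Assumes cond is exactly 0 or 1.
std::uint32_t ctSelect(std::uint32_t cond, std::uint32_t src1,
                       std::uint32_t src2) {
  // 1. mask = 0 - cond: 0 gives 0x00000000, 1 gives 0xFFFFFFFF
  //    (the reverse-subtract step in the diff).
  std::uint32_t mask = 0u - cond;
  // 2. A = src1 & mask
  std::uint32_t a = src1 & mask;
  // 3. B = src2 & ~mask
  std::uint32_t b = src2 & ~mask;
  // 4. result = A | B
  return a | b;
}
```

Note that a C++ compiler gives no guarantee that source like this stays branch-free after optimization, which is presumably why the PR builds the sequence from machine instructions and marks them so they are not merged or rewritten.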
-
[llvm-branch-commits] [llvm] [LLVM][AArch64] Add native ct.select support for ARM64 (PR #166706)
github-actions[bot] wrote:
:warning: C/C++ code formatter, clang-format found issues in your code.
:warning:
You can test this locally with the following command:
``bash
git-clang-format --diff origin/main HEAD --extensions h,cpp -- \
  llvm/lib/Target/AArch64/AArch64ISelLowering.cpp \
  llvm/lib/Target/AArch64/AArch64ISelLowering.h \
  llvm/lib/Target/AArch64/AArch64InstrInfo.cpp \
  llvm/lib/Target/AArch64/AArch64MCInstLower.cpp --diff_from_common_commit
``
:warning:
The reproduction instructions above might return results for more than one PR
in a stack if you are using a stacked PR workflow. You can limit the results by
changing `origin/main` to the base branch/commit you want to compare against.
:warning:
View the diff from clang-format here.
``diff
diff --git a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
index a86aac88b..54d0ea168 100644
--- a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
+++ b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
@@ -511,7 +511,7 @@ AArch64TargetLowering::AArch64TargetLowering(const
TargetMachine &TM,
setOperationAction(ISD::BR_CC, MVT::f64, Custom);
setOperationAction(ISD::SELECT, MVT::i32, Custom);
setOperationAction(ISD::SELECT, MVT::i64, Custom);
- setOperationAction(ISD::CTSELECT, MVT::i8, Promote);
+ setOperationAction(ISD::CTSELECT, MVT::i8, Promote);
setOperationAction(ISD::CTSELECT, MVT::i16, Promote);
setOperationAction(ISD::CTSELECT, MVT::i32, Custom);
setOperationAction(ISD::CTSELECT, MVT::i64, Custom);
@@ -534,7 +534,8 @@ AArch64TargetLowering::AArch64TargetLowering(const
TargetMachine &TM,
MVT elemType = VT.getVectorElementType();
if (elemType == MVT::i8 || elemType == MVT::i16) {
setOperationAction(ISD::CTSELECT, VT, Promote);
-} else if ((elemType == MVT::f16 || elemType == MVT::bf16) &&
!Subtarget->hasFullFP16()) {
+} else if ((elemType == MVT::f16 || elemType == MVT::bf16) &&
+ !Subtarget->hasFullFP16()) {
setOperationAction(ISD::CTSELECT, VT, Promote);
} else {
setOperationAction(ISD::CTSELECT, VT, Expand);
@@ -3351,7 +3352,9 @@ void AArch64TargetLowering::fixupPtrauthDiscriminator(
IntDiscOp.setImm(IntDisc);
}
-MachineBasicBlock *AArch64TargetLowering::EmitCTSELECT(MachineInstr &MI,
MachineBasicBlock *MBB, unsigned Opcode) const {
+MachineBasicBlock *AArch64TargetLowering::EmitCTSELECT(MachineInstr &MI,
+ MachineBasicBlock *MBB,
+ unsigned Opcode) const {
const TargetInstrInfo *TII = Subtarget->getInstrInfo();
DebugLoc DL = MI.getDebugLoc();
MachineInstrBuilder Builder = BuildMI(*MBB, MI, DL, TII->get(Opcode));
diff --git a/llvm/lib/Target/AArch64/AArch64ISelLowering.h
b/llvm/lib/Target/AArch64/AArch64ISelLowering.h
index d14d64ffe..987377bc4 100644
--- a/llvm/lib/Target/AArch64/AArch64ISelLowering.h
+++ b/llvm/lib/Target/AArch64/AArch64ISelLowering.h
@@ -207,7 +207,8 @@ public:
MachineOperand &AddrDiscOp,
const TargetRegisterClass *AddrDiscRC) const;
- MachineBasicBlock *EmitCTSELECT(MachineInstr &MI, MachineBasicBlock *BB,
unsigned Opcode) const;
+ MachineBasicBlock *EmitCTSELECT(MachineInstr &MI, MachineBasicBlock *BB,
+ unsigned Opcode) const;
MachineBasicBlock *
EmitInstrWithCustomInserter(MachineInstr &MI,
@@ -928,9 +929,7 @@ private:
return VT.isScalableVector();
}
- bool isSelectSupported(SelectSupportKind Kind) const override {
-return true;
- }
+ bool isSelectSupported(SelectSupportKind Kind) const override { return true;
}
};
namespace AArch64 {
diff --git a/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
b/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
index 227e5d596..bab67f57e 100644
--- a/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
+++ b/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
@@ -2113,7 +2113,8 @@ bool AArch64InstrInfo::removeCmpToZeroOrOne(
return true;
}
-static inline void expandCtSelect(MachineBasicBlock &MBB, MachineInstr &MI,
DebugLoc &DL, const MCInstrDesc &MCID) {
+static inline void expandCtSelect(MachineBasicBlock &MBB, MachineInstr &MI,
+ DebugLoc &DL, const MCInstrDesc &MCID) {
MachineInstrBuilder Builder = BuildMI(MBB, MI, DL, MCID);
for (unsigned Idx = 0; Idx < MI.getNumOperands(); ++Idx) {
Builder.add(MI.getOperand(Idx));
@@ -2129,24 +2130,24 @@ bool AArch64InstrInfo::expandPostRAPseudo(MachineInstr
&MI) const {
DebugLoc DL = MI.getDebugLoc();
switch (MI.getOpcode()) {
-case AArch64::I32CTSELECT:
- expandCtSelect(MBB, MI, DL, get(AArch64::CSELWr));
- return true;
-case AArch64::I64CTSELECT:
- expandCtSelect(MBB, MI, DL, get(AArch64::CSELXr));
- return true;
-case AArch64::BF16CTSELECT:
- expandCtSelect
[llvm-branch-commits] [llvm] [LLVM][MIPS] Add comprehensive tests for ct.select (PR #166705)
https://github.com/wizardengineer created
https://github.com/llvm/llvm-project/pull/166705
None
>From 22e92921b5a3b7720e53ba32777e2bccf024896a Mon Sep 17 00:00:00 2001
From: wizardengineer
Date: Wed, 5 Nov 2025 11:01:26 -0500
Subject: [PATCH] [LLVM][MIPS] Add comprehensive tests for ct.select
---
.../Mips/ctselect-fallback-edge-cases.ll | 244 +
.../Mips/ctselect-fallback-patterns.ll| 426 +
.../CodeGen/Mips/ctselect-fallback-vector.ll | 830 ++
llvm/test/CodeGen/Mips/ctselect-fallback.ll | 371
.../CodeGen/Mips/ctselect-side-effects.ll | 183
5 files changed, 2054 insertions(+)
create mode 100644 llvm/test/CodeGen/Mips/ctselect-fallback-edge-cases.ll
create mode 100644 llvm/test/CodeGen/Mips/ctselect-fallback-patterns.ll
create mode 100644 llvm/test/CodeGen/Mips/ctselect-fallback-vector.ll
create mode 100644 llvm/test/CodeGen/Mips/ctselect-fallback.ll
create mode 100644 llvm/test/CodeGen/Mips/ctselect-side-effects.ll
diff --git a/llvm/test/CodeGen/Mips/ctselect-fallback-edge-cases.ll b/llvm/test/CodeGen/Mips/ctselect-fallback-edge-cases.ll
new file mode 100644
index 0..f1831a625d4a4
--- /dev/null
+++ b/llvm/test/CodeGen/Mips/ctselect-fallback-edge-cases.ll
@@ -0,0 +1,244 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 5
+; RUN: llc < %s -mtriple=mipsel-unknown-linux-gnu -O3 | FileCheck %s --check-prefix=M32
+; RUN: llc < %s -mtriple=mips64el-unknown-linux-gnu -O3 | FileCheck %s --check-prefix=M64
+
+; Portable edge case tests
+
+; Test with small integer types
+define i1 @test_ctselect_i1(i1 %cond, i1 %a, i1 %b) {
+; M32-LABEL: test_ctselect_i1:
+; M32: # %bb.0:
+; M32-NEXT:xori $2, $4, 1
+; M32-NEXT:and $1, $4, $5
+; M32-NEXT:and $2, $2, $6
+; M32-NEXT:jr $ra
+; M32-NEXT:or $2, $1, $2
+;
+; M64-LABEL: test_ctselect_i1:
+; M64: # %bb.0:
+; M64-NEXT:sll $2, $4, 0
+; M64-NEXT:sll $1, $6, 0
+; M64-NEXT:xori $2, $2, 1
+; M64-NEXT:and $1, $2, $1
+; M64-NEXT:and $2, $4, $5
+; M64-NEXT:sll $2, $2, 0
+; M64-NEXT:jr $ra
+; M64-NEXT:or $2, $2, $1
+ %result = call i1 @llvm.ct.select.i1(i1 %cond, i1 %a, i1 %b)
+ ret i1 %result
+}
+
+; Test with extremal values
+define i32 @test_ctselect_extremal_values(i1 %cond) {
+; M32-LABEL: test_ctselect_extremal_values:
+; M32: # %bb.0:
+; M32-NEXT:lui $3, 32767
+; M32-NEXT:andi $1, $4, 1
+; M32-NEXT:negu $2, $1
+; M32-NEXT:ori $3, $3, 65535
+; M32-NEXT:addiu $1, $1, -1
+; M32-NEXT:and $2, $2, $3
+; M32-NEXT:lui $3, 32768
+; M32-NEXT:and $1, $1, $3
+; M32-NEXT:jr $ra
+; M32-NEXT:or $2, $2, $1
+;
+; M64-LABEL: test_ctselect_extremal_values:
+; M64: # %bb.0:
+; M64-NEXT:sll $1, $4, 0
+; M64-NEXT:lui $3, 32767
+; M64-NEXT:andi $1, $1, 1
+; M64-NEXT:ori $3, $3, 65535
+; M64-NEXT:negu $2, $1
+; M64-NEXT:addiu $1, $1, -1
+; M64-NEXT:and $2, $2, $3
+; M64-NEXT:lui $3, 32768
+; M64-NEXT:and $1, $1, $3
+; M64-NEXT:jr $ra
+; M64-NEXT:or $2, $2, $1
+ %result = call i32 @llvm.ct.select.i32(i1 %cond, i32 2147483647, i32 -2147483648)
+ ret i32 %result
+}
+
+; Test with null pointers
+define ptr @test_ctselect_null_ptr(i1 %cond, ptr %ptr) {
+; M32-LABEL: test_ctselect_null_ptr:
+; M32: # %bb.0:
+; M32-NEXT:andi $1, $4, 1
+; M32-NEXT:negu $1, $1
+; M32-NEXT:jr $ra
+; M32-NEXT:and $2, $1, $5
+;
+; M64-LABEL: test_ctselect_null_ptr:
+; M64: # %bb.0:
+; M64-NEXT:andi $1, $4, 1
+; M64-NEXT:dnegu $1, $1
+; M64-NEXT:jr $ra
+; M64-NEXT:and $2, $1, $5
+ %result = call ptr @llvm.ct.select.p0(i1 %cond, ptr %ptr, ptr null)
+ ret ptr %result
+}
+
+; Test with function pointers
+define ptr @test_ctselect_function_ptr(i1 %cond, ptr %func1, ptr %func2) {
+; M32-LABEL: test_ctselect_function_ptr:
+; M32: # %bb.0:
+; M32-NEXT:andi $1, $4, 1
+; M32-NEXT:negu $2, $1
+; M32-NEXT:addiu $1, $1, -1
+; M32-NEXT:and $2, $2, $5
+; M32-NEXT:and $1, $1, $6
+; M32-NEXT:jr $ra
+; M32-NEXT:or $2, $2, $1
+;
+; M64-LABEL: test_ctselect_function_ptr:
+; M64: # %bb.0:
+; M64-NEXT:andi $1, $4, 1
+; M64-NEXT:dnegu $2, $1
+; M64-NEXT:daddiu $1, $1, -1
+; M64-NEXT:and $2, $2, $5
+; M64-NEXT:and $1, $1, $6
+; M64-NEXT:jr $ra
+; M64-NEXT:or $2, $2, $1
+ %result = call ptr @llvm.ct.select.p0(i1 %cond, ptr %func1, ptr %func2)
+ ret ptr %result
+}
+
+; Test with condition from icmp on pointers
+define ptr @test_ctselect_ptr_cmp(ptr %p1, ptr %p2, ptr %a, ptr %b) {
+; M32-LABEL: test_ctselect_ptr_cmp:
+; M32: # %bb.0:
+; M32-NEXT:xor $1, $4, $5
+; M32-NEXT:sltu $1, $zero, $1
+; M32-NEXT:addiu $1, $1, -1
+; M32-NEXT:and $2, $1, $6
+; M32-NEXT:not $1, $1
+; M32-NEXT:and $1, $1, $7
+; M32-NEXT:jr $ra
+; M32-NEXT:or $2, $2, $1
+;
+; M64-LABEL: test_ctselect_ptr_cmp:
+; M64: # %bb.0:
[llvm-branch-commits] [llvm] [LLVM][AArch64] Add native ct.select support for ARM64 (PR #166706)
https://github.com/wizardengineer created
https://github.com/llvm/llvm-project/pull/166706
This patch implements architecture-specific lowering for ct.select on AArch64
using CSEL (conditional select) instructions for constant-time selection.
Implementation details:
- Uses CSEL family of instructions for scalar integer types
- Uses FCSEL for floating-point types (F16, BF16, F32, F64)
- Post-RA MC lowering to convert pseudo-instructions to real CSEL/FCSEL
- Handles vector types appropriately
- Comprehensive test coverage for AArch64
The implementation includes:
- ISelLowering: Custom lowering to CTSELECT pseudo-instructions
- InstrInfo: Pseudo-instruction definitions and patterns
- MCInstLower: Post-RA lowering of pseudo-instructions to actual CSEL/FCSEL
- Proper handling of condition codes for constant-time guarantees
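The per-type handling described above can be summarised as a small decision table. The following is only an illustrative C++ model of the scalar `setOperationAction(ISD::CTSELECT, ...)` calls that appear in the patch below; the names `Action`, `Ty`, and `ctSelectScalarAction` are hypothetical and are not LLVM APIs, and the vector cases from the patch are deliberately left out.

```cpp
// Hypothetical stand-ins for LLVM's legalize actions and value types.
enum class Action { Promote, Custom, Expand };
enum class Ty { I8, I16, I32, I64, F16, BF16, F32, F64 };

// Models which legalization action the patch registers for a scalar
// ct.select of the given type. hasFullFP16 mirrors the subtarget query
// used in the patch to decide how f16/bf16 are handled.
constexpr Action ctSelectScalarAction(Ty ty, bool hasFullFP16) {
  switch (ty) {
  case Ty::I8:
  case Ty::I16:
    return Action::Promote; // widened to a legal integer type first
  case Ty::I32:
  case Ty::I64:
  case Ty::F32:
  case Ty::F64:
    return Action::Custom; // custom-lowered to a CTSELECT pseudo
  case Ty::F16:
  case Ty::BF16:
    return hasFullFP16 ? Action::Custom : Action::Promote;
  }
  return Action::Expand; // not reached for the scalar types listed above
}
```

Under this model, `ctSelectScalarAction(Ty::F16, false)` yields `Action::Promote`, matching the `else` branch of the `hasFullFP16()` check in the patch.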
>From 071428b7a6eed7a800364cc4b9a7e25e1d8e310e Mon Sep 17 00:00:00 2001
From: wizardengineer
Date: Wed, 5 Nov 2025 17:09:45 -0500
Subject: [PATCH] [LLVM][AArch64] Add native ct.select support for ARM64
This patch implements architecture-specific lowering for ct.select on AArch64
using CSEL (conditional select) instructions for constant-time selection.
Implementation details:
- Uses CSEL family of instructions for scalar integer types
- Uses FCSEL for floating-point types (F16, BF16, F32, F64)
- Post-RA MC lowering to convert pseudo-instructions to real CSEL/FCSEL
- Handles vector types appropriately
- Comprehensive test coverage for AArch64
The implementation includes:
- ISelLowering: Custom lowering to CTSELECT pseudo-instructions
- InstrInfo: Pseudo-instruction definitions and patterns
- MCInstLower: Post-RA lowering of pseudo-instructions to actual CSEL/FCSEL
- Proper handling of condition codes for constant-time guarantees
---
.../Target/AArch64/AArch64ISelLowering.cpp| 53 +
llvm/lib/Target/AArch64/AArch64ISelLowering.h | 12 ++
llvm/lib/Target/AArch64/AArch64InstrInfo.cpp | 200 --
llvm/lib/Target/AArch64/AArch64InstrInfo.td | 45
.../lib/Target/AArch64/AArch64MCInstLower.cpp | 18 ++
llvm/test/CodeGen/AArch64/ctselect.ll | 153 ++
6 files changed, 371 insertions(+), 110 deletions(-)
create mode 100644 llvm/test/CodeGen/AArch64/ctselect.ll
diff --git a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
index 60aa61e993b26..a86aac88b94a8 100644
--- a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
+++ b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
@@ -511,12 +511,35 @@ AArch64TargetLowering::AArch64TargetLowering(const TargetMachine &TM,
setOperationAction(ISD::BR_CC, MVT::f64, Custom);
setOperationAction(ISD::SELECT, MVT::i32, Custom);
setOperationAction(ISD::SELECT, MVT::i64, Custom);
+ setOperationAction(ISD::CTSELECT, MVT::i8, Promote);
+ setOperationAction(ISD::CTSELECT, MVT::i16, Promote);
+ setOperationAction(ISD::CTSELECT, MVT::i32, Custom);
+ setOperationAction(ISD::CTSELECT, MVT::i64, Custom);
if (Subtarget->hasFPARMv8()) {
setOperationAction(ISD::SELECT, MVT::f16, Custom);
setOperationAction(ISD::SELECT, MVT::bf16, Custom);
}
+ if (Subtarget->hasFullFP16()) {
+setOperationAction(ISD::CTSELECT, MVT::f16, Custom);
+setOperationAction(ISD::CTSELECT, MVT::bf16, Custom);
+ } else {
+setOperationAction(ISD::CTSELECT, MVT::f16, Promote);
+setOperationAction(ISD::CTSELECT, MVT::bf16, Promote);
+ }
setOperationAction(ISD::SELECT, MVT::f32, Custom);
setOperationAction(ISD::SELECT, MVT::f64, Custom);
+ setOperationAction(ISD::CTSELECT, MVT::f32, Custom);
+ setOperationAction(ISD::CTSELECT, MVT::f64, Custom);
+ for (MVT VT : MVT::vector_valuetypes()) {
+MVT elemType = VT.getVectorElementType();
+if (elemType == MVT::i8 || elemType == MVT::i16) {
+ setOperationAction(ISD::CTSELECT, VT, Promote);
+} else if ((elemType == MVT::f16 || elemType == MVT::bf16) && !Subtarget->hasFullFP16()) {
+ setOperationAction(ISD::CTSELECT, VT, Promote);
+} else {
+ setOperationAction(ISD::CTSELECT, VT, Expand);
+}
+ }
setOperationAction(ISD::SELECT_CC, MVT::i32, Custom);
setOperationAction(ISD::SELECT_CC, MVT::i64, Custom);
setOperationAction(ISD::SELECT_CC, MVT::f16, Custom);
@@ -3328,6 +3351,18 @@ void AArch64TargetLowering::fixupPtrauthDiscriminator(
IntDiscOp.setImm(IntDisc);
}
+MachineBasicBlock *AArch64TargetLowering::EmitCTSELECT(MachineInstr &MI, MachineBasicBlock *MBB, unsigned Opcode) const {
+ const TargetInstrInfo *TII = Subtarget->getInstrInfo();
+ DebugLoc DL = MI.getDebugLoc();
+ MachineInstrBuilder Builder = BuildMI(*MBB, MI, DL, TII->get(Opcode));
+ for (unsigned Idx = 0; Idx < MI.getNumOperands(); ++Idx) {
+Builder.add(MI.getOperand(Idx));
+ }
+ Builder->setFlag(MachineInstr::NoMerge);
+ MBB->remove_instr(&MI);
+ return MBB;
+}
+
MachineBasicBlock *AArch64TargetLowering::EmitInstrWithCustomInserter(
MachineInstr &MI, MachineBasicBlock *BB) const
[llvm-branch-commits] [llvm] [ConstantTime][RISCV] Add comprehensive tests for ct.select (PR #166708)
https://github.com/wizardengineer created
https://github.com/llvm/llvm-project/pull/166708
Add comprehensive test suite for RISC-V fallback implementation:
- Edge cases (zero conditions, large integers, sign extension)
- Pattern matching (nested selects, chains)
- Vector support with RVV extensions
- Side effects and memory operations
The basic fallback test is in the core infrastructure PR.
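For orientation, the mask-based sequence that the fallback checks in this patch expect (`andi`, `neg`, `addi -1`, `and`, `and`, `or`) corresponds roughly to the C++ sketch below. This is an assumption-labelled illustration, not code from the patch: `ctSelect32` is a hypothetical name, and an ordinary C++ compiler is free to turn this source back into a branch or a conditional move, which is the reason a dedicated intrinsic exists at all.

```cpp
#include <cstdint>

// Hypothetical helper: returns a when cond is true, b otherwise, using
// only arithmetic and bitwise operations on the source level.
constexpr std::uint32_t ctSelect32(bool cond, std::uint32_t a,
                                   std::uint32_t b) {
  std::uint32_t bit = static_cast<std::uint32_t>(cond) & 1u; // andi cond, 1
  std::uint32_t maskA = 0u - bit; // neg: all ones when cond is true
  std::uint32_t maskB = bit - 1u; // addi -1: all ones when cond is false
  return (a & maskA) | (b & maskB); // and, and, or
}
```

Exactly one of the two masks is all ones for any input, so the `or` simply forwards the selected operand.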
>From 05465f4247bd1e05138118a697625b0ad9378e81 Mon Sep 17 00:00:00 2001
From: wizardengineer
Date: Wed, 5 Nov 2025 11:01:00 -0500
Subject: [PATCH] [ConstantTime][RISCV] Add comprehensive tests for ct.select
Add comprehensive test suite for RISC-V fallback implementation:
- Edge cases (zero conditions, large integers, sign extension)
- Pattern matching (nested selects, chains)
- Vector support with RVV extensions
- Side effects and memory operations
The basic fallback test is in the core infrastructure PR.
---
.../RISCV/ctselect-fallback-edge-cases.ll | 214 +
.../RISCV/ctselect-fallback-patterns.ll | 383 +
.../RISCV/ctselect-fallback-vector-rvv.ll | 804 ++
llvm/test/CodeGen/RISCV/ctselect-fallback.ll | 330 ---
.../CodeGen/RISCV/ctselect-side-effects.ll| 176
5 files changed, 1577 insertions(+), 330 deletions(-)
create mode 100644 llvm/test/CodeGen/RISCV/ctselect-fallback-edge-cases.ll
create mode 100644 llvm/test/CodeGen/RISCV/ctselect-fallback-patterns.ll
create mode 100644 llvm/test/CodeGen/RISCV/ctselect-fallback-vector-rvv.ll
delete mode 100644 llvm/test/CodeGen/RISCV/ctselect-fallback.ll
create mode 100644 llvm/test/CodeGen/RISCV/ctselect-side-effects.ll
diff --git a/llvm/test/CodeGen/RISCV/ctselect-fallback-edge-cases.ll b/llvm/test/CodeGen/RISCV/ctselect-fallback-edge-cases.ll
new file mode 100644
index 0..af1be0c8f3ddc
--- /dev/null
+++ b/llvm/test/CodeGen/RISCV/ctselect-fallback-edge-cases.ll
@@ -0,0 +1,214 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 5
+; RUN: llc < %s -mtriple=riscv64 -O3 | FileCheck %s --check-prefix=RV64
+; RUN: llc < %s -mtriple=riscv32 -O3 | FileCheck %s --check-prefix=RV32
+
+; Test with small integer types
+define i1 @test_ctselect_i1(i1 %cond, i1 %a, i1 %b) {
+; RV64-LABEL: test_ctselect_i1:
+; RV64: # %bb.0:
+; RV64-NEXT:and a1, a0, a1
+; RV64-NEXT:xori a0, a0, 1
+; RV64-NEXT:and a0, a0, a2
+; RV64-NEXT:or a0, a1, a0
+; RV64-NEXT:ret
+;
+; RV32-LABEL: test_ctselect_i1:
+; RV32: # %bb.0:
+; RV32-NEXT:and a1, a0, a1
+; RV32-NEXT:xori a0, a0, 1
+; RV32-NEXT:and a0, a0, a2
+; RV32-NEXT:or a0, a1, a0
+; RV32-NEXT:ret
+ %result = call i1 @llvm.ct.select.i1(i1 %cond, i1 %a, i1 %b)
+ ret i1 %result
+}
+
+; Test with extremal values
+define i32 @test_ctselect_extremal_values(i1 %cond) {
+; RV64-LABEL: test_ctselect_extremal_values:
+; RV64: # %bb.0:
+; RV64-NEXT:andi a0, a0, 1
+; RV64-NEXT:lui a1, 524288
+; RV64-NEXT:subw a0, a1, a0
+; RV64-NEXT:ret
+;
+; RV32-LABEL: test_ctselect_extremal_values:
+; RV32: # %bb.0:
+; RV32-NEXT:andi a0, a0, 1
+; RV32-NEXT:lui a1, 524288
+; RV32-NEXT:addi a2, a0, -1
+; RV32-NEXT:neg a0, a0
+; RV32-NEXT:and a1, a2, a1
+; RV32-NEXT:slli a0, a0, 1
+; RV32-NEXT:srli a0, a0, 1
+; RV32-NEXT:or a0, a0, a1
+; RV32-NEXT:ret
+ %result = call i32 @llvm.ct.select.i32(i1 %cond, i32 2147483647, i32 -2147483648)
+ ret i32 %result
+}
+
+; Test with null pointers
+define ptr @test_ctselect_null_ptr(i1 %cond, ptr %ptr) {
+; RV64-LABEL: test_ctselect_null_ptr:
+; RV64: # %bb.0:
+; RV64-NEXT:slli a0, a0, 63
+; RV64-NEXT:srai a0, a0, 63
+; RV64-NEXT:and a0, a0, a1
+; RV64-NEXT:ret
+;
+; RV32-LABEL: test_ctselect_null_ptr:
+; RV32: # %bb.0:
+; RV32-NEXT:slli a0, a0, 31
+; RV32-NEXT:srai a0, a0, 31
+; RV32-NEXT:and a0, a0, a1
+; RV32-NEXT:ret
+ %result = call ptr @llvm.ct.select.p0(i1 %cond, ptr %ptr, ptr null)
+ ret ptr %result
+}
+
+; Test with function pointers
+define ptr @test_ctselect_function_ptr(i1 %cond, ptr %func1, ptr %func2) {
+; RV64-LABEL: test_ctselect_function_ptr:
+; RV64: # %bb.0:
+; RV64-NEXT:andi a0, a0, 1
+; RV64-NEXT:neg a3, a0
+; RV64-NEXT:addi a0, a0, -1
+; RV64-NEXT:and a1, a3, a1
+; RV64-NEXT:and a0, a0, a2
+; RV64-NEXT:or a0, a1, a0
+; RV64-NEXT:ret
+;
+; RV32-LABEL: test_ctselect_function_ptr:
+; RV32: # %bb.0:
+; RV32-NEXT:andi a0, a0, 1
+; RV32-NEXT:neg a3, a0
+; RV32-NEXT:addi a0, a0, -1
+; RV32-NEXT:and a1, a3, a1
+; RV32-NEXT:and a0, a0, a2
+; RV32-NEXT:or a0, a1, a0
+; RV32-NEXT:ret
+ %result = call ptr @llvm.ct.select.p0(i1 %cond, ptr %func1, ptr %func2)
+ ret ptr %result
+}
+
+; Test with condition from icmp on pointers
+define ptr @test_ctselect_ptr_cmp(ptr %p1, ptr %p2, ptr %a, ptr %b) {
+; RV64-LABEL: test_ctselect_ptr_cmp:
+; RV64: # %bb.0:
+; RV64-NEXT:xor a
[llvm-branch-commits] [llvm] [ConstantTime][WebAssembly] Add comprehensive tests for ct.select (PR #166709)
https://github.com/wizardengineer created
https://github.com/llvm/llvm-project/pull/166709
None
>From d55c2d69a0d48538f7376d9f06b8cbf0e2215e93 Mon Sep 17 00:00:00 2001
From: wizardengineer
Date: Wed, 5 Nov 2025 11:03:23 -0500
Subject: [PATCH] [ConstantTime][WebAssembly] Add comprehensive tests for
ct.select
---
.../ctselect-fallback-edge-cases.ll | 376 +
.../WebAssembly/ctselect-fallback-patterns.ll | 641
.../WebAssembly/ctselect-fallback-vector.ll | 714 ++
.../CodeGen/WebAssembly/ctselect-fallback.ll | 552 ++
.../WebAssembly/ctselect-side-effects.ll | 226 ++
5 files changed, 2509 insertions(+)
create mode 100644 llvm/test/CodeGen/WebAssembly/ctselect-fallback-edge-cases.ll
create mode 100644 llvm/test/CodeGen/WebAssembly/ctselect-fallback-patterns.ll
create mode 100644 llvm/test/CodeGen/WebAssembly/ctselect-fallback-vector.ll
create mode 100644 llvm/test/CodeGen/WebAssembly/ctselect-fallback.ll
create mode 100644 llvm/test/CodeGen/WebAssembly/ctselect-side-effects.ll
diff --git a/llvm/test/CodeGen/WebAssembly/ctselect-fallback-edge-cases.ll b/llvm/test/CodeGen/WebAssembly/ctselect-fallback-edge-cases.ll
new file mode 100644
index 0..b0f7f2807debd
--- /dev/null
+++ b/llvm/test/CodeGen/WebAssembly/ctselect-fallback-edge-cases.ll
@@ -0,0 +1,376 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 5
+; RUN: llc < %s -mtriple=wasm32-unknown-unknown -O3 -filetype=asm | FileCheck %s --check-prefix=W32
+; RUN: llc < %s -mtriple=wasm64-unknown-unknown -O3 -filetype=asm | FileCheck %s --check-prefix=W64
+
+; Test with small integer types
+define i1 @test_ctselect_i1(i1 %cond, i1 %a, i1 %b) {
+; W32-LABEL: test_ctselect_i1:
+; W32: .functype test_ctselect_i1 (i32, i32, i32) -> (i32)
+; W32-NEXT: # %bb.0:
+; W32-NEXT:local.get 0
+; W32-NEXT:local.get 1
+; W32-NEXT:i32.and
+; W32-NEXT:local.get 0
+; W32-NEXT:i32.const 1
+; W32-NEXT:i32.xor
+; W32-NEXT:local.get 2
+; W32-NEXT:i32.and
+; W32-NEXT:i32.or
+; W32-NEXT:# fallthrough-return
+;
+; W64-LABEL: test_ctselect_i1:
+; W64: .functype test_ctselect_i1 (i32, i32, i32) -> (i32)
+; W64-NEXT: # %bb.0:
+; W64-NEXT:local.get 0
+; W64-NEXT:local.get 1
+; W64-NEXT:i32.and
+; W64-NEXT:local.get 0
+; W64-NEXT:i32.const 1
+; W64-NEXT:i32.xor
+; W64-NEXT:local.get 2
+; W64-NEXT:i32.and
+; W64-NEXT:i32.or
+; W64-NEXT:# fallthrough-return
+ %result = call i1 @llvm.ct.select.i1(i1 %cond, i1 %a, i1 %b)
+ ret i1 %result
+}
+
+; Test with extremal values
+define i32 @test_ctselect_extremal_values(i1 %cond) {
+; W32-LABEL: test_ctselect_extremal_values:
+; W32: .functype test_ctselect_extremal_values (i32) -> (i32)
+; W32-NEXT: # %bb.0:
+; W32-NEXT:i32.const 0
+; W32-NEXT:local.get 0
+; W32-NEXT:i32.const 1
+; W32-NEXT:i32.and
+; W32-NEXT:local.tee 0
+; W32-NEXT:i32.sub
+; W32-NEXT:i32.const 2147483647
+; W32-NEXT:i32.and
+; W32-NEXT:local.get 0
+; W32-NEXT:i32.const -1
+; W32-NEXT:i32.add
+; W32-NEXT:i32.const -2147483648
+; W32-NEXT:i32.and
+; W32-NEXT:i32.or
+; W32-NEXT:# fallthrough-return
+;
+; W64-LABEL: test_ctselect_extremal_values:
+; W64: .functype test_ctselect_extremal_values (i32) -> (i32)
+; W64-NEXT: # %bb.0:
+; W64-NEXT:i32.const 0
+; W64-NEXT:local.get 0
+; W64-NEXT:i32.const 1
+; W64-NEXT:i32.and
+; W64-NEXT:local.tee 0
+; W64-NEXT:i32.sub
+; W64-NEXT:i32.const 2147483647
+; W64-NEXT:i32.and
+; W64-NEXT:local.get 0
+; W64-NEXT:i32.const -1
+; W64-NEXT:i32.add
+; W64-NEXT:i32.const -2147483648
+; W64-NEXT:i32.and
+; W64-NEXT:i32.or
+; W64-NEXT:# fallthrough-return
+ %result = call i32 @llvm.ct.select.i32(i1 %cond, i32 2147483647, i32 -2147483648)
+ ret i32 %result
+}
+
+; Test with null pointers
+define ptr @test_ctselect_null_ptr(i1 %cond, ptr %ptr) {
+; W32-LABEL: test_ctselect_null_ptr:
+; W32: .functype test_ctselect_null_ptr (i32, i32) -> (i32)
+; W32-NEXT: # %bb.0:
+; W32-NEXT:i32.const 0
+; W32-NEXT:local.get 0
+; W32-NEXT:i32.const 1
+; W32-NEXT:i32.and
+; W32-NEXT:i32.sub
+; W32-NEXT:local.get 1
+; W32-NEXT:i32.and
+; W32-NEXT:# fallthrough-return
+;
+; W64-LABEL: test_ctselect_null_ptr:
+; W64: .functype test_ctselect_null_ptr (i32, i64) -> (i64)
+; W64-NEXT: # %bb.0:
+; W64-NEXT:i64.const 0
+; W64-NEXT:local.get 0
+; W64-NEXT:i64.extend_i32_u
+; W64-NEXT:i64.const 1
+; W64-NEXT:i64.and
+; W64-NEXT:i64.sub
+; W64-NEXT:local.get 1
+; W64-NEXT:i64.and
+; W64-NEXT:# fallthrough-return
+ %result = call ptr @llvm.ct.select.p0(i1 %cond, ptr %ptr, ptr null)
+ ret ptr %result
+}
+
+; Test with function pointers
+define ptr @test_ctselect_function_ptr(i1 %cond, ptr %func1, ptr %func2) {
+
[llvm-branch-commits] [clang] [ConstantTime][Clang] Add __builtin_ct_select for constant-time selection (PR #166703)
wizardengineer wrote:
> [!WARNING]
> This pull request is not mergeable via GitHub because a downstack PR is open. Once all requirements are satisfied, merge this PR as a stack on Graphite (https://app.graphite.com/github/pr/llvm/llvm-project/166703?utm_source=stack-comment-downstack-mergeability-warning).
> Learn more: https://graphite.dev/docs/merge-pull-requests

* **#166703**: 6 dependent PRs ([#166704](https://github.com/llvm/llvm-project/pull/166704), [#166705](https://github.com/llvm/llvm-project/pull/166705), [#166706](https://github.com/llvm/llvm-project/pull/166706) and 3 others) (View in Graphite)
* **#166702**
* `main`

This stack of pull requests is managed by Graphite (https://graphite.dev?utm-source=stack-comment). Learn more about stacking: https://stacking.dev/?utm_source=stack-comment

https://github.com/llvm/llvm-project/pull/166703
___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [clang] [ConstantTime][Clang] Add __builtin_ct_select for constant-time selection (PR #166703)
https://github.com/wizardengineer created
https://github.com/llvm/llvm-project/pull/166703
None
>From cbb549061c8ec8023ca44b664e0f9d9b4727afde Mon Sep 17 00:00:00 2001
From: wizardengineer
Date: Wed, 5 Nov 2025 10:56:34 -0500
Subject: [PATCH] [ConstantTime][Clang] Add __builtin_ct_select for
constant-time selection
---
clang/include/clang/Basic/Builtins.td | 8 +
clang/lib/CodeGen/CGBuiltin.cpp | 35 +
clang/lib/Sema/SemaChecking.cpp | 87 +++
.../test/Sema/builtin-ct-select-edge-cases.c | 384 ++
clang/test/Sema/builtin-ct-select.c | 683 ++
5 files changed, 1197 insertions(+)
create mode 100644 clang/test/Sema/builtin-ct-select-edge-cases.c
create mode 100644 clang/test/Sema/builtin-ct-select.c
diff --git a/clang/include/clang/Basic/Builtins.td b/clang/include/clang/Basic/Builtins.td
index 2b400b012d6ed..13e2a9849bfca 100644
--- a/clang/include/clang/Basic/Builtins.td
+++ b/clang/include/clang/Basic/Builtins.td
@@ -5278,3 +5278,11 @@ def CountedByRef : Builtin {
let Attributes = [NoThrow, CustomTypeChecking];
let Prototype = "int(...)";
}
+
+// Constant-time select builtin
+def CtSelect : Builtin {
+ let Spellings = ["__builtin_ct_select"];
+ let Attributes = [NoThrow, Const, UnevaluatedArguments,
+ConstIgnoringExceptions, CustomTypeChecking];
+ let Prototype = "void(...)";
+}
diff --git a/clang/lib/CodeGen/CGBuiltin.cpp b/clang/lib/CodeGen/CGBuiltin.cpp
index b81e0d02da2c9..6c0a9eb1e7d86 100644
--- a/clang/lib/CodeGen/CGBuiltin.cpp
+++ b/clang/lib/CodeGen/CGBuiltin.cpp
@@ -28,6 +28,7 @@
#include "clang/AST/StmtVisitor.h"
#include "clang/Basic/TargetInfo.h"
#include "clang/Frontend/FrontendDiagnostic.h"
+#include "clang/Basic/DiagnosticSema.h"
#include "llvm/IR/InlineAsm.h"
#include "llvm/IR/Instruction.h"
#include "llvm/IR/Intrinsics.h"
@@ -6450,6 +6451,40 @@ RValue CodeGenFunction::EmitBuiltinExpr(const GlobalDecl GD, unsigned BuiltinID,
auto Str = CGM.GetAddrOfConstantCString(Name, "");
return RValue::get(Str.getPointer());
}
+ case Builtin::BI__builtin_ct_select: {
+if (E->getNumArgs() != 3) {
+ CGM.getDiags().Report(E->getBeginLoc(),
+E->getNumArgs() > 3
+? diag::err_typecheck_call_too_many_args
+: diag::err_typecheck_call_too_few_args);
+ return GetUndefRValue(E->getType());
+}
+
+auto *Cond = EmitScalarExpr(E->getArg(0));
+auto *A = EmitScalarExpr(E->getArg(1));
+auto *B = EmitScalarExpr(E->getArg(2));
+
+// Verify types match
+if (A->getType() != B->getType()) {
+ CGM.getDiags().Report(E->getBeginLoc(),
+diag::err_typecheck_convert_incompatible);
+ return GetUndefRValue(E->getType());
+}
+
+// Verify condition is integer type
+if (!Cond->getType()->isIntegerTy()) {
+ CGM.getDiags().Report(E->getBeginLoc(), diag::err_typecheck_expect_int);
+ return GetUndefRValue(E->getType());
+}
+
+if (Cond->getType()->getIntegerBitWidth() != 1)
+ Cond = Builder.CreateICmpNE(
+ Cond, llvm::ConstantInt::get(Cond->getType(), 0), "cond.bool");
+
+llvm::Function *Fn =
+CGM.getIntrinsic(llvm::Intrinsic::ct_select, {A->getType()});
+return RValue::get(Builder.CreateCall(Fn, {Cond, A, B}));
+ }
}
// If this is an alias for a lib function (e.g. __builtin_sin), emit
diff --git a/clang/lib/Sema/SemaChecking.cpp b/clang/lib/Sema/SemaChecking.cpp
index ad2c2e4a97bb9..bce47cc4586c6 100644
--- a/clang/lib/Sema/SemaChecking.cpp
+++ b/clang/lib/Sema/SemaChecking.cpp
@@ -3494,6 +3494,93 @@ Sema::CheckBuiltinFunctionCall(FunctionDecl *FDecl, unsigned BuiltinID,
if (BuiltinCountedByRef(TheCall))
return ExprError();
break;
+
+ case Builtin::BI__builtin_ct_select: {
+if (TheCall->getNumArgs() != 3) {
+ // Simple argument count check without complex diagnostics
+ if (TheCall->getNumArgs() < 3) {
+return Diag(TheCall->getEndLoc(), diag::err_typecheck_call_too_few_args_at_least)
+ << 0 << 3 << TheCall->getNumArgs() << 0
+ << TheCall->getCallee()->getSourceRange();
+ } else {
+return Diag(TheCall->getEndLoc(), diag::err_typecheck_call_too_many_args)
+ << 0 << 3 << TheCall->getNumArgs() << 0
+ << TheCall->getCallee()->getSourceRange();
+ }
+}
+auto *Cond = TheCall->getArg(0);
+auto *A = TheCall->getArg(1);
+auto *B = TheCall->getArg(2);
+
+QualType CondTy = Cond->getType();
+if (!CondTy->isIntegerType()) {
+ return Diag(Cond->getBeginLoc(), diag::err_typecheck_cond_expect_scalar)
+ << CondTy << Cond->getSourceRange();
+}
+
+QualType ATy = A->getType();
+QualType BTy = B->getType();
+
+// check for scalar or vector scalar type
+if ((!ATy->isScalarType() && !ATy->isVectorType()) ||
+(!BTy
[llvm-branch-commits] [llvm] [LLVM][X86] Add native ct.select support for X86 and i386 (PR #166704)
wizardengineer wrote:
> [!WARNING]
> This pull request is not mergeable via GitHub because a downstack PR is open. Once all requirements are satisfied, merge this PR as a stack on Graphite (https://app.graphite.com/github/pr/llvm/llvm-project/166704?utm_source=stack-comment-downstack-mergeability-warning).
> Learn more: https://graphite.dev/docs/merge-pull-requests

* **#166704** (View in Graphite)
* **#166703**: 5 other dependent PRs ([#166705](https://github.com/llvm/llvm-project/pull/166705), [#166706](https://github.com/llvm/llvm-project/pull/166706), [#166707](https://github.com/llvm/llvm-project/pull/166707) and 2 others)
* **#166702**
* `main`

This stack of pull requests is managed by Graphite (https://graphite.dev?utm-source=stack-comment). Learn more about stacking: https://stacking.dev/?utm_source=stack-comment

https://github.com/llvm/llvm-project/pull/166704
[llvm-branch-commits] [llvm] [ConstantTime][WebAssembly] Add comprehensive tests for ct.select (PR #166709)
wizardengineer wrote:
> [!WARNING]
> This pull request is not mergeable via GitHub because a downstack PR is open. Once all requirements are satisfied, merge this PR as a stack on Graphite (https://app.graphite.com/github/pr/llvm/llvm-project/166709?utm_source=stack-comment-downstack-mergeability-warning).
> Learn more: https://graphite.dev/docs/merge-pull-requests

* **#166709** (View in Graphite)
* **#166703**: 5 other dependent PRs ([#166704](https://github.com/llvm/llvm-project/pull/166704), [#166705](https://github.com/llvm/llvm-project/pull/166705), [#166706](https://github.com/llvm/llvm-project/pull/166706) and 2 others)
* **#166702**
* `main`

This stack of pull requests is managed by Graphite (https://graphite.dev?utm-source=stack-comment). Learn more about stacking: https://stacking.dev/?utm_source=stack-comment

https://github.com/llvm/llvm-project/pull/166709
[llvm-branch-commits] [llvm] [LLVM][ARM] Add native ct.select support for ARM32 and Thumb (PR #166707)
wizardengineer wrote:
> [!WARNING]
> This pull request is not mergeable via GitHub because a downstack PR is open. Once all requirements are satisfied, merge this PR as a stack on Graphite (https://app.graphite.com/github/pr/llvm/llvm-project/166707?utm_source=stack-comment-downstack-mergeability-warning).
> Learn more: https://graphite.dev/docs/merge-pull-requests

* **#166707** (View in Graphite)
* **#166703**: 5 other dependent PRs ([#166704](https://github.com/llvm/llvm-project/pull/166704), [#166705](https://github.com/llvm/llvm-project/pull/166705), [#166706](https://github.com/llvm/llvm-project/pull/166706) and 2 others)
* **#166702**
* `main`

This stack of pull requests is managed by Graphite (https://graphite.dev?utm-source=stack-comment). Learn more about stacking: https://stacking.dev/?utm_source=stack-comment

https://github.com/llvm/llvm-project/pull/166707
[llvm-branch-commits] [llvm] [ConstantTime][RISCV] Add comprehensive tests for ct.select (PR #166708)
wizardengineer wrote:
> [!WARNING]
> This pull request is not mergeable via GitHub because a downstack PR is open. Once all requirements are satisfied, merge this PR as a stack [on Graphite](https://app.graphite.com/github/pr/llvm/llvm-project/166708?utm_source=stack-comment-downstack-mergeability-warning).
> [Learn more](https://graphite.dev/docs/merge-pull-requests)
* **#166708** [(View in Graphite)](https://app.graphite.com/github/pr/llvm/llvm-project/166708?utm_source=stack-comment-view-in-graphite)
* **#166703**: 5 other dependent PRs ([#166704](https://github.com/llvm/llvm-project/pull/166704), [#166705](https://github.com/llvm/llvm-project/pull/166705), [#166706](https://github.com/llvm/llvm-project/pull/166706) and 2 others)
* **#166702**
* `main`
This stack of pull requests is managed by [Graphite](https://graphite.dev?utm-source=stack-comment). Learn more about [stacking](https://stacking.dev/?utm_source=stack-comment).
https://github.com/llvm/llvm-project/pull/166708
___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [clang] [ConstantTime][Clang] Add __builtin_ct_select for constant-time selection (PR #166703)
github-actions[bot] wrote:
:warning: C/C++ code formatter, clang-format found issues in your code. :warning:
You can test this locally with the following command:
```bash
git-clang-format --diff origin/main HEAD --extensions cpp,c -- \
  clang/test/Sema/builtin-ct-select-edge-cases.c \
  clang/test/Sema/builtin-ct-select.c \
  clang/lib/CodeGen/CGBuiltin.cpp \
  clang/lib/Sema/SemaChecking.cpp --diff_from_common_commit
```
:warning: The reproduction instructions above might return results for more than one PR in a stack if you are using a stacked PR workflow. You can limit the results by changing `origin/main` to the base branch/commit you want to compare against. :warning:
View the diff from clang-format here.
```diff
diff --git a/clang/lib/CodeGen/CGBuiltin.cpp b/clang/lib/CodeGen/CGBuiltin.cpp
index 6c0a9eb1e..a589f9092 100644
--- a/clang/lib/CodeGen/CGBuiltin.cpp
+++ b/clang/lib/CodeGen/CGBuiltin.cpp
@@ -26,9 +26,9 @@
 #include "TargetInfo.h"
 #include "clang/AST/OSLog.h"
 #include "clang/AST/StmtVisitor.h"
+#include "clang/Basic/DiagnosticSema.h"
 #include "clang/Basic/TargetInfo.h"
 #include "clang/Frontend/FrontendDiagnostic.h"
-#include "clang/Basic/DiagnosticSema.h"
 #include "llvm/IR/InlineAsm.h"
 #include "llvm/IR/Instruction.h"
 #include "llvm/IR/Intrinsics.h"
diff --git a/clang/lib/Sema/SemaChecking.cpp b/clang/lib/Sema/SemaChecking.cpp
index bce47cc45..026912c85 100644
--- a/clang/lib/Sema/SemaChecking.cpp
+++ b/clang/lib/Sema/SemaChecking.cpp
@@ -3499,11 +3499,13 @@ Sema::CheckBuiltinFunctionCall(FunctionDecl *FDecl, unsigned BuiltinID,
     if (TheCall->getNumArgs() != 3) {
       // Simple argument count check without complex diagnostics
       if (TheCall->getNumArgs() < 3) {
-        return Diag(TheCall->getEndLoc(), diag::err_typecheck_call_too_few_args_at_least)
+        return Diag(TheCall->getEndLoc(),
+                    diag::err_typecheck_call_too_few_args_at_least)
                << 0 << 3 << TheCall->getNumArgs() << 0
                << TheCall->getCallee()->getSourceRange();
       } else {
-        return Diag(TheCall->getEndLoc(), diag::err_typecheck_call_too_many_args)
+        return Diag(TheCall->getEndLoc(),
+                    diag::err_typecheck_call_too_many_args)
                << 0 << 3 << TheCall->getNumArgs() << 0
                << TheCall->getCallee()->getSourceRange();
       }
```
https://github.com/llvm/llvm-project/pull/166703
___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [LLVM][AArch64] Add native ct.select support for ARM64 (PR #166706)
wizardengineer wrote:
> [!WARNING]
> This pull request is not mergeable via GitHub because a downstack PR is open. Once all requirements are satisfied, merge this PR as a stack [on Graphite](https://app.graphite.com/github/pr/llvm/llvm-project/166706?utm_source=stack-comment-downstack-mergeability-warning).
> [Learn more](https://graphite.dev/docs/merge-pull-requests)
* **#166706** [(View in Graphite)](https://app.graphite.com/github/pr/llvm/llvm-project/166706?utm_source=stack-comment-view-in-graphite)
* **#166703**: 5 other dependent PRs ([#166704](https://github.com/llvm/llvm-project/pull/166704), [#166705](https://github.com/llvm/llvm-project/pull/166705), [#166707](https://github.com/llvm/llvm-project/pull/166707) and 2 others)
* **#166702**
* `main`
This stack of pull requests is managed by [Graphite](https://graphite.dev?utm-source=stack-comment). Learn more about [stacking](https://stacking.dev/?utm_source=stack-comment).
https://github.com/llvm/llvm-project/pull/166706
___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [LLVM][MIPS] Add comprehensive tests for ct.select (PR #166705)
wizardengineer wrote:
> [!WARNING]
> This pull request is not mergeable via GitHub because a downstack PR is open. Once all requirements are satisfied, merge this PR as a stack [on Graphite](https://app.graphite.com/github/pr/llvm/llvm-project/166705?utm_source=stack-comment-downstack-mergeability-warning).
> [Learn more](https://graphite.dev/docs/merge-pull-requests)
* **#166705** [(View in Graphite)](https://app.graphite.com/github/pr/llvm/llvm-project/166705?utm_source=stack-comment-view-in-graphite)
* **#166703**: 5 other dependent PRs ([#166704](https://github.com/llvm/llvm-project/pull/166704), [#166706](https://github.com/llvm/llvm-project/pull/166706), [#166707](https://github.com/llvm/llvm-project/pull/166707) and 2 others)
* **#166702**
* `main`
This stack of pull requests is managed by [Graphite](https://graphite.dev?utm-source=stack-comment). Learn more about [stacking](https://stacking.dev/?utm_source=stack-comment).
https://github.com/llvm/llvm-project/pull/166705
___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [LoongArch] Initial implementation for `enableMemCmpExpansion` hook (PR #166526)
https://github.com/zhaoqi5 edited
https://github.com/llvm/llvm-project/pull/166526
___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [LoongArch] Initial implementation for `enableMemCmpExpansion` hook (PR #166526)
https://github.com/zhaoqi5 ready_for_review
https://github.com/llvm/llvm-project/pull/166526
___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
