[llvm-branch-commits] [flang] [mlir] [OpenMP][MLIR] Add thread_limit with dims modifier support (PR #171825)

2025-12-18 Thread Michael Klemm via llvm-branch-commits

mjklemm wrote:

I'm fine with the PR, consider the comments to be nitpicking.

https://github.com/llvm/llvm-project/pull/171825
___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [flang] [mlir] [OpenMP][MLIR] Add thread_limit with dims modifier support (PR #171825)

2025-12-18 Thread Michael Klemm via llvm-branch-commits




mjklemm wrote:

GitHub messed this comment up.  It's meant to be a change to the string of the 
error message in that line.

https://github.com/llvm/llvm-project/pull/171825


[llvm-branch-commits] [flang] [mlir] [OpenMP][MLIR] Add thread_limit with dims modifier support (PR #171825)

2025-12-18 Thread Michael Klemm via llvm-branch-commits




mjklemm wrote:

```suggestion
  "dimension values can only be specified with 'dims' modifier");
```

https://github.com/llvm/llvm-project/pull/171825


[llvm-branch-commits] [flang] [mlir] [OpenMP][MLIR] Add thread_limit with dims modifier support (PR #171825)

2025-12-18 Thread Michael Klemm via llvm-branch-commits


@@ -1974,6 +1974,10 @@ convertOmpTeams(omp::TeamsOp op, llvm::IRBuilderBase &builder,
     return op.emitError("Lowering of num_teams with dims modifier is NYI.");
   }
 
+  if (op.hasThreadLimitDimsModifier()) {
+    return op.emitError("Lowering of thread_limit with dims modifier is NYI.");

mjklemm wrote:

Please expand 'NYI'.

https://github.com/llvm/llvm-project/pull/171825


[llvm-branch-commits] [mlir] [OpenMP][MLIR] Add num_threads clause with dims modifier support (PR #171767)

2025-12-18 Thread Michael Klemm via llvm-branch-commits


@@ -2887,6 +2887,9 @@ convertOmpParallel(omp::ParallelOp opInst, llvm::IRBuilderBase &builder,
   if (auto ifVar = opInst.getIfExpr())
     ifCond = moduleTranslation.lookupValue(ifVar);
   llvm::Value *numThreads = nullptr;
+  // num_threads dims and values are not yet supported
+  assert(!opInst.hasNumThreadsDimsModifier() &&
+         "Lowering of num_threads with dims modifier is NYI.");

mjklemm wrote:

Please expand the 'NYI'.

https://github.com/llvm/llvm-project/pull/171767


[llvm-branch-commits] [mlir] [OpenMP][MLIR] Add num_threads clause with dims modifier support (PR #171767)

2025-12-18 Thread Michael Klemm via llvm-branch-commits


@@ -2601,14 +2604,39 @@ static LogicalResult verifyPrivateVarList(OpType &op) {
   return success();
 }
 
+// Helper: Verify num_threads clause
+LogicalResult
+verifyNumThreadsClause(Operation *op,
+                       std::optional numThreadsNumDims,
+                       OperandRange numThreadsDimsValues, Value numThreads) {
+  bool hasDimsModifier =
+      numThreadsNumDims.has_value() && numThreadsNumDims.value();
+  if (hasDimsModifier && numThreads) {
+    return op->emitError("num_threads with dims modifier cannot be used "
+                         "together with number of threads");

mjklemm wrote:

This error message does not make much sense to me.
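One possible reading of the check, sketched outside MLIR so that it compiles on its own. The names `NumThreadsClause` and `verifyNumThreads` and the reworded diagnostic are hypothetical and not taken from the PR; the sketch only assumes that the `dims` form and the plain single-value form of `num_threads` are mutually exclusive.

```cpp
#include <cstdint>
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-in for the clause operands; these are not the MLIR types.
struct NumThreadsClause {
  std::optional<int64_t> numDims;     // N from dims(N), if present
  std::vector<int64_t> dimsValues;    // per-dimension values
  std::optional<int64_t> numThreads;  // value of the plain num_threads(x) form
};

// Returns an empty string on success, otherwise a diagnostic message.
std::string verifyNumThreads(const NumThreadsClause &c) {
  bool hasDimsModifier = c.numDims.has_value() && *c.numDims != 0;
  if (hasDimsModifier && c.numThreads.has_value())
    return "num_threads cannot take both a 'dims' modifier and a single "
           "thread count";
  return "";
}
```

A message along these lines names both conflicting forms, which may be what the original wording was trying to express.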

https://github.com/llvm/llvm-project/pull/171767


[llvm-branch-commits] [mlir] [OpenMP][MLIR] Add num_threads clause with dims modifier support (PR #171767)

2025-12-18 Thread Michael Klemm via llvm-branch-commits


@@ -5654,6 +5657,9 @@ extractHostEvalClauses(omp::TargetOp targetOp, Value &numThreads,
   llvm_unreachable("unsupported host_eval use");
   })
   .Case([&](omp::ParallelOp parallelOp) {
+// num_threads dims and values are not yet supported
+assert(!parallelOp.hasNumThreadsDimsModifier() &&
+   "Lowering of num_threads with dims modifier is NYI.");

mjklemm wrote:

Same for 'NYI'.

https://github.com/llvm/llvm-project/pull/171767


[llvm-branch-commits] [mlir] [OpenMP][MLIR] Add num_threads clause with dims modifier support (PR #171767)

2025-12-18 Thread Michael Klemm via llvm-branch-commits


@@ -5774,8 +5780,12 @@ initTargetDefaultAttrs(omp::TargetOp targetOp, Operation *capturedOp,
   threadLimit = teamsOp.getThreadLimit();
 }
 
-if (auto parallelOp = castOrGetParentOfType(capturedOp))
+if (auto parallelOp = castOrGetParentOfType(capturedOp)) {
+  // num_threads dims and values are not yet supported
+  assert(!parallelOp.hasNumThreadsDimsModifier() &&
+ "Lowering of num_threads with dims modifier is NYI.");

mjklemm wrote:

Same for 'NYI'.

https://github.com/llvm/llvm-project/pull/171767


[llvm-branch-commits] [llvm] [GlobalISel][AArch64] Added support for sli/sri intrinsics (PR #171448)

2025-12-18 Thread Hari Limaye via llvm-branch-commits

hazzlim wrote:

Nit: use imperative present tense for commit message, i.e. `Added support for 
sli/sri intrinsics` -> `Add support for sli/sri intrinsics`

https://github.com/llvm/llvm-project/pull/171448


[llvm-branch-commits] [mlir] [OpenMP][MLIR] Add num_threads clause with dims modifier support (PR #171767)

2025-12-18 Thread Krzysztof Parzyszek via llvm-branch-commits


@@ -1069,16 +1069,55 @@ class OpenMP_NumThreadsClauseSkip<
   > : OpenMP_Clause {
   let arguments = (ins
+ConfinedAttr, [IntPositive]>:$num_threads_num_dims,
+Variadic:$num_threads_dims_values,
 Optional:$num_threads

kparzysz wrote:

I think we should have one list for the arguments in addition to the num_dims 
attribute.

I'm not sure if I understand the rest...  The N in dims(N) must be a literal 
integer (so we can verify that it's positive), but the actual arguments can be 
expressions.  We can use any suitable type for those.  Whether it's AnyInteger 
or IntLikeType probably doesn't matter that much.
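The layout suggested here, a literal dimension count plus one list of values, could be modelled roughly as follows. This is a standalone sketch: `DimsClause` and `verifyDims` are hypothetical names, and requiring exactly one value per dimension is an assumption of the sketch rather than something stated in this thread.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical model: N is a literal known at compile time, while each list
// entry stands in for an arbitrary integer-typed expression.
struct DimsClause {
  int64_t numDims;                  // the literal N in dims(N)
  std::vector<std::string> values;  // placeholder names for runtime operands
};

// Because N is a literal, positivity can be checked statically.
// Requiring one value per dimension is an assumption of this sketch.
bool verifyDims(const DimsClause &c) {
  if (c.numDims <= 0)
    return false;
  return c.values.size() == static_cast<std::size_t>(c.numDims);
}
```

Only the count needs to be a constant in this model; the element type of the list is left open, in line with the remark that the exact integer type probably does not matter much.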

https://github.com/llvm/llvm-project/pull/171767


[llvm-branch-commits] [clang] [llvm] Continuation of fexec-charset (PR #169803)

2025-12-18 Thread Abhina Sree via llvm-branch-commits

https://github.com/abhina-sree updated 
https://github.com/llvm/llvm-project/pull/169803

>From 16f3faac450b16cb527e409339fd32b42cc0ad43 Mon Sep 17 00:00:00 2001
From: Abhina Sreeskantharajan 
Date: Mon, 24 Nov 2025 11:00:04 -0500
Subject: [PATCH 1/4] add ParserConversionAction

(cherry picked from commit c2647a73957921d3f7a53c6f25a69f1cc2725aa3)
---
 clang/include/clang/Parse/Parser.h |  1 +
 clang/include/clang/Sema/Sema.h|  8 ++--
 clang/lib/Parse/ParseDecl.cpp  | 13 +
 clang/lib/Parse/ParseDeclCXX.cpp   | 10 +++---
 clang/lib/Parse/ParseExpr.cpp  |  9 +
 clang/lib/Parse/Parser.cpp |  4 
 clang/lib/Sema/SemaExpr.cpp| 12 +++-
 7 files changed, 43 insertions(+), 14 deletions(-)

diff --git a/clang/include/clang/Parse/Parser.h b/clang/include/clang/Parse/Parser.h
index 58eb1c0a7c114..97867183b5a1d 100644
--- a/clang/include/clang/Parse/Parser.h
+++ b/clang/include/clang/Parse/Parser.h
@@ -5633,6 +5633,7 @@ class Parser : public CodeCompletionHandler {
 bool Finished;
   };
   ObjCImplParsingDataRAII *CurParsedObjCImpl;
+  ConversionAction ParserConversionAction;
 
   /// StashAwayMethodOrFunctionBodyTokens -  Consume the tokens and store them
   /// for later parsing.
diff --git a/clang/include/clang/Sema/Sema.h b/clang/include/clang/Sema/Sema.h
index cbfcc9bc0ea99..65567e367dea4 100644
--- a/clang/include/clang/Sema/Sema.h
+++ b/clang/include/clang/Sema/Sema.h
@@ -54,6 +54,7 @@
 #include "clang/Basic/TemplateKinds.h"
 #include "clang/Basic/TokenKinds.h"
 #include "clang/Basic/TypeTraits.h"
+#include "clang/Lex/LiteralConverter.h"
 #include "clang/Sema/AnalysisBasedWarnings.h"
 #include "clang/Sema/Attr.h"
 #include "clang/Sema/CleanupInfo.h"
@@ -7272,9 +7273,12 @@ class Sema final : public SemaBase {
   /// from multiple tokens.  However, the common case is that StringToks points
   /// to one string.
   ExprResult ActOnStringLiteral(ArrayRef StringToks,
-Scope *UDLScope = nullptr);
+Scope *UDLScope = nullptr,
+ConversionAction Action = CA_ToExecEncoding);
 
-  ExprResult ActOnUnevaluatedStringLiteral(ArrayRef StringToks);
+  ExprResult
+  ActOnUnevaluatedStringLiteral(ArrayRef StringToks,
+ConversionAction Action = CA_ToExecEncoding);
 
   /// ControllingExprOrType is either an opaque pointer coming out of a
   /// ParsedType or an Expr *. FIXME: it'd be better to split this interface
diff --git a/clang/lib/Parse/ParseDecl.cpp b/clang/lib/Parse/ParseDecl.cpp
index 8688ccf41acb5..fd537618a3c83 100644
--- a/clang/lib/Parse/ParseDecl.cpp
+++ b/clang/lib/Parse/ParseDecl.cpp
@@ -555,6 +555,9 @@ unsigned Parser::ParseAttributeArgsCommon(
   nullptr,
   Sema::ExpressionEvaluationContextRecord::EK_AttrArgument);
 
+  SaveAndRestore SavedTranslationState(
+  ParserConversionAction, CA_NoConversion);
+
   ExprResult ArgExpr = ParseAssignmentExpression();
   if (ArgExpr.isInvalid()) {
 SkipUntil(tok::r_paren, StopAtSemi);
@@ -634,6 +637,9 @@ void Parser::ParseGNUAttributeArgs(
   ParsedAttr::Kind AttrKind =
   ParsedAttr::getParsedKind(AttrName, ScopeName, Form.getSyntax());
 
+  SaveAndRestore SavedTranslationState(ParserConversionAction,
+                                       CA_NoConversion);
+
   if (AttrKind == ParsedAttr::AT_Availability) {
     ParseAvailabilityAttribute(*AttrName, AttrNameLoc, Attrs, EndLoc, ScopeName,
                                ScopeLoc, Form);
@@ -699,6 +705,9 @@ unsigned Parser::ParseClangAttributeArgs(
   ParsedAttr::Kind AttrKind =
   ParsedAttr::getParsedKind(AttrName, ScopeName, Form.getSyntax());
 
+  SaveAndRestore SavedTranslationState(ParserConversionAction,
+                                       CA_NoConversion);
+
   switch (AttrKind) {
   default:
 return ParseAttributeArgsCommon(AttrName, AttrNameLoc, Attrs, EndLoc,
@@ -1521,6 +1530,10 @@ void Parser::ParseExternalSourceSymbolAttribute(
   SkipUntil(tok::comma, tok::r_paren, StopAtSemi | StopBeforeMatch);
   continue;
 }
+
+SaveAndRestore SavedTranslationState(
+ParserConversionAction, CA_NoConversion);
+
 if (Keyword == Ident_language) {
   if (HadLanguage) {
 Diag(KeywordLoc, diag::err_external_source_symbol_duplicate_clause)
diff --git a/clang/lib/Parse/ParseDeclCXX.cpp b/clang/lib/Parse/ParseDeclCXX.cpp
index d8ed7e3ff96bd..40bf409124711 100644
--- a/clang/lib/Parse/ParseDeclCXX.cpp
+++ b/clang/lib/Parse/ParseDeclCXX.cpp
@@ -314,7 +314,9 @@ Decl *Parser::ParseNamespaceAlias(SourceLocation NamespaceLoc,
 
 Decl *Parser::ParseLinkage(ParsingDeclSpec &DS, DeclaratorContext Context) {
   assert(isTokenStringLiteral() && "Not a string literal!");
-  ExprResult Lang = ParseUnevaluatedStringLiteralExpression();
+  ExprResult Lang = (SaveAndRestore(ParserConversio
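
The hunks above repeatedly wrap `ParserConversionAction` in a `SaveAndRestore` guard so that string literals in attribute arguments are parsed with `CA_NoConversion` and the previous action comes back on scope exit. A minimal standalone sketch of that RAII pattern follows; `ScopedOverride` and `parseAttributeArgs` are hypothetical names, the enum lists only the two enumerators visible in this patch, and the patch itself uses `llvm::SaveAndRestore` rather than this class.

```cpp
#include <utility>

// Minimal re-implementation for illustration only.
template <typename T> class ScopedOverride {
  T &Ref;  // variable being overridden
  T Old;   // value to restore on destruction

public:
  ScopedOverride(T &Target, T NewValue) : Ref(Target), Old(std::move(Target)) {
    Ref = std::move(NewValue);
  }
  ~ScopedOverride() { Ref = std::move(Old); }
  ScopedOverride(const ScopedOverride &) = delete;
  ScopedOverride &operator=(const ScopedOverride &) = delete;
};

// Hypothetical enum mirroring the two names that appear in the patch.
enum ConversionAction { CA_NoConversion, CA_ToExecEncoding };

// Returns the action that is in effect while the guard is alive.
ConversionAction parseAttributeArgs(ConversionAction &ParserAction) {
  ScopedOverride<ConversionAction> Guard(ParserAction, CA_NoConversion);
  return ParserAction;  // copied before Guard restores the old value
}
```

The benefit of the guard is that every exit path of the parsing function, including early returns after a diagnostic, restores the previous action without extra bookkeeping.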

[llvm-branch-commits] [libc] [libc] Add `IN6_IS_ADDR_V4MAPPED` (PR #172645)

2025-12-18 Thread Connector Switch via llvm-branch-commits

c8ef wrote:

### Merge activity

* **Dec 18, 3:02 PM UTC**: A user started a stack merge that includes this pull 
request via 
[Graphite](https://app.graphite.com/github/pr/llvm/llvm-project/172645).


https://github.com/llvm/llvm-project/pull/172645


[llvm-branch-commits] [libc] [libc] Add `IN6_IS_ADDR_MC*` (PR #172643)

2025-12-18 Thread Connector Switch via llvm-branch-commits

c8ef wrote:

### Merge activity

* **Dec 18, 3:02 PM UTC**: A user started a stack merge that includes this pull 
request via 
[Graphite](https://app.graphite.com/github/pr/llvm/llvm-project/172643).


https://github.com/llvm/llvm-project/pull/172643


[llvm-branch-commits] [libc] [libc] Add `IN6_IS_ADDR_MULTICAST` (PR #172498)

2025-12-18 Thread Connector Switch via llvm-branch-commits

c8ef wrote:

### Merge activity

* **Dec 18, 3:02 PM UTC**: A user started a stack merge that includes this pull 
request via 
[Graphite](https://app.graphite.com/github/pr/llvm/llvm-project/172498).


https://github.com/llvm/llvm-project/pull/172498


[llvm-branch-commits] [libc] [libc] Add `IN6_IS_ADDR_LOOPBACK` (PR #172312)

2025-12-18 Thread Connector Switch via llvm-branch-commits

c8ef wrote:

### Merge activity

* **Dec 18, 3:02 PM UTC**: A user started a stack merge that includes this pull 
request via 
[Graphite](https://app.graphite.com/github/pr/llvm/llvm-project/172312).


https://github.com/llvm/llvm-project/pull/172312


[llvm-branch-commits] [libc] [libc] Add `IN6_IS_ADDR_V4COMPAT` (PR #172646)

2025-12-18 Thread Connector Switch via llvm-branch-commits

c8ef wrote:

### Merge activity

* **Dec 18, 3:02 PM UTC**: A user started a stack merge that includes this pull 
request via 
[Graphite](https://app.graphite.com/github/pr/llvm/llvm-project/172646).


https://github.com/llvm/llvm-project/pull/172646


[llvm-branch-commits] [llvm] [AMDGPU] Add liverange split instructions into BB Prolog (PR #117544)

2025-12-18 Thread Christudasan Devadasan via llvm-branch-commits

cdevadas wrote:

This PR stack is essential for fixing multiple issues. Can we get this stack 
merged?

https://github.com/llvm/llvm-project/pull/117544


[llvm-branch-commits] [llvm] [CodeGen][NPM] Remove "LowerConstantIntrinsicsPass" from the pipeline (PR #172794)

2025-12-18 Thread Christudasan Devadasan via llvm-branch-commits

https://github.com/cdevadas edited 
https://github.com/llvm/llvm-project/pull/172794


[llvm-branch-commits] [llvm] [AMDGPU][NPM] Disable few non useful passes (PR #172796)

2025-12-18 Thread Christudasan Devadasan via llvm-branch-commits

https://github.com/cdevadas approved this pull request.


https://github.com/llvm/llvm-project/pull/172796


[llvm-branch-commits] [llvm] [AMDGPU][NPM] add "addPostBBSections()" to NPM (PR #172793)

2025-12-18 Thread Christudasan Devadasan via llvm-branch-commits

https://github.com/cdevadas approved this pull request.


https://github.com/llvm/llvm-project/pull/172793


[llvm-branch-commits] [llvm] [CodeGen][NPM] Remove "LowerConstantIntrinsicsPass" from the pipeline (PR #172794)

2025-12-18 Thread Christudasan Devadasan via llvm-branch-commits

https://github.com/cdevadas approved this pull request.


https://github.com/llvm/llvm-project/pull/172794


[llvm-branch-commits] [clang] [Clang] Add __builtin_allow_sanitize_check() (PR #172030)

2025-12-18 Thread Marco Elver via llvm-branch-commits

melver wrote:

@fmayer @vitalybuka - do the names look reasonable to you?

Thanks!

https://github.com/llvm/llvm-project/pull/172030


[llvm-branch-commits] [clang] [Clang] Add __builtin_allow_sanitize_check() (PR #172030)

2025-12-18 Thread Marco Elver via llvm-branch-commits

https://github.com/melver updated 
https://github.com/llvm/llvm-project/pull/172030

>From d4f149dbb21fd7f5e706560b1aa3b8f0b9fa5ae9 Mon Sep 17 00:00:00 2001
From: Marco Elver 
Date: Fri, 12 Dec 2025 17:18:15 +0100
Subject: [PATCH 1/2] tweak test

Created using spr 1.3.8-beta.1
---
 clang/test/CodeGen/builtin-allow-sanitize-check-lower.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/clang/test/CodeGen/builtin-allow-sanitize-check-lower.c b/clang/test/CodeGen/builtin-allow-sanitize-check-lower.c
index 05a0295799f55..5e52f77f55573 100644
--- a/clang/test/CodeGen/builtin-allow-sanitize-check-lower.c
+++ b/clang/test/CodeGen/builtin-allow-sanitize-check-lower.c
@@ -9,9 +9,9 @@ _Bool check() {
   return __builtin_allow_sanitize_check("address");
 }
 
-// CHECK-LABEL: @test
+// CHECK-LABEL: @test_sanitize
 // CHECK: ret i1 true
-_Bool test() {
+_Bool test_sanitize() {
   return check();
 }
 

>From f4c6cfded128616e2c8d119e19174bbed1298781 Mon Sep 17 00:00:00 2001
From: Marco Elver 
Date: Wed, 17 Dec 2025 20:47:40 +0100
Subject: [PATCH 2/2] fixes

Created using spr 1.3.8-beta.1
---
 clang/lib/CodeGen/CGBuiltin.cpp   | 37 ++-
 .../test/Sema/builtin-allow-sanitize-check.c  |  6 +--
 2 files changed, 23 insertions(+), 20 deletions(-)

diff --git a/clang/lib/CodeGen/CGBuiltin.cpp b/clang/lib/CodeGen/CGBuiltin.cpp
index 98d80620b44a5..9de085379882c 100644
--- a/clang/lib/CodeGen/CGBuiltin.cpp
+++ b/clang/lib/CodeGen/CGBuiltin.cpp
@@ -3551,25 +3551,28 @@ RValue CodeGenFunction::EmitBuiltinExpr(const GlobalDecl GD, unsigned BuiltinID,
   }
   case Builtin::BI__builtin_allow_sanitize_check: {
 Intrinsic::ID IntrID = Intrinsic::not_intrinsic;
-StringRef SanitizerName =
+StringRef Name =
 cast(E->getArg(0)->IgnoreParenCasts())->getString();
 
-if (SanitizerName == "address" || SanitizerName == "kernel-address") {
-  if (CGM.getLangOpts().Sanitize.hasOneOf(SanitizerKind::Address |
-  SanitizerKind::KernelAddress))
-IntrID = Intrinsic::allow_sanitize_address;
-} else if (SanitizerName == "thread") {
-  if (CGM.getLangOpts().Sanitize.has(SanitizerKind::Thread))
-IntrID = Intrinsic::allow_sanitize_thread;
-} else if (SanitizerName == "memory" || SanitizerName == "kernel-memory") {
-  if (CGM.getLangOpts().Sanitize.hasOneOf(SanitizerKind::Memory |
-  SanitizerKind::KernelMemory))
-IntrID = Intrinsic::allow_sanitize_memory;
-} else if (SanitizerName == "hwaddress" ||
-   SanitizerName == "kernel-hwaddress") {
-  if (CGM.getLangOpts().Sanitize.hasOneOf(SanitizerKind::HWAddress |
-  SanitizerKind::KernelHWAddress))
-IntrID = Intrinsic::allow_sanitize_hwaddress;
+// We deliberately allow the use of kernel- and non-kernel names
+// interchangably, even when one or the other is enabled. This is consistent
+// with the no_sanitize-attribute, which allows either kernel- or non-kernel
+// name to disable instrumentation (see CodeGenFunction::StartFunction).
+if (getLangOpts().Sanitize.hasOneOf(SanitizerKind::Address |
+SanitizerKind::KernelAddress) &&
+(Name == "address" || Name == "kernel-address")) {
+  IntrID = Intrinsic::allow_sanitize_address;
+} else if (getLangOpts().Sanitize.has(SanitizerKind::Thread) &&
+   Name == "thread") {
+  IntrID = Intrinsic::allow_sanitize_thread;
+} else if (getLangOpts().Sanitize.hasOneOf(SanitizerKind::Memory |
+   SanitizerKind::KernelMemory) &&
+   (Name == "memory" || Name == "kernel-memory")) {
+  IntrID = Intrinsic::allow_sanitize_memory;
+} else if (getLangOpts().Sanitize.hasOneOf(
+   SanitizerKind::HWAddress | SanitizerKind::KernelHWAddress) &&
+   (Name == "hwaddress" || Name == "kernel-hwaddress")) {
+  IntrID = Intrinsic::allow_sanitize_hwaddress;
 }
 
 if (IntrID != Intrinsic::not_intrinsic) {
diff --git a/clang/test/Sema/builtin-allow-sanitize-check.c b/clang/test/Sema/builtin-allow-sanitize-check.c
index 94deb16dd89f9..6e0e21a869461 100644
--- a/clang/test/Sema/builtin-allow-sanitize-check.c
+++ b/clang/test/Sema/builtin-allow-sanitize-check.c
@@ -1,15 +1,15 @@
 // RUN: %clang_cc1 -fsyntax-only -verify %s
 
 void test_builtin_allow_sanitize_check() {
-  // Test with non-string literal argument
+  // Test with non-string literal argument.
   char str[] = "address";
   (void)__builtin_allow_sanitize_check(str); // expected-error {{expression is not a string literal}}
   (void)__builtin_allow_sanitize_check(123); // expected-error {{expression is not a string literal}}
 
-  // Test with unsupported sanitizer name
+  // Test with unsupported sanitizer name.
   (void)__builtin_allow_sanitize_check("unsupported"
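
The name matching introduced in `CGBuiltin.cpp` above can be sketched without any Clang dependencies. The `Sanitizer` bit values, the `Allow` enum and `selectIntrinsic` are hypothetical stand-ins for `SanitizerKind`, the `Intrinsic::allow_sanitize_*` IDs and the builtin's lowering; the sketch only mirrors the rule from the patch comment that kernel and non-kernel names are accepted interchangeably once either variant is enabled.

```cpp
#include <cstdint>
#include <string_view>

// Hypothetical stand-ins for the SanitizerKind bits.
enum Sanitizer : uint32_t {
  Address = 1u << 0,
  KernelAddress = 1u << 1,
  Thread = 1u << 2,
  Memory = 1u << 3,
  KernelMemory = 1u << 4,
  HWAddress = 1u << 5,
  KernelHWAddress = 1u << 6,
};

// Hypothetical stand-ins for the intrinsic IDs; None models not_intrinsic.
enum class Allow { None, Address, Thread, Memory, HWAddress };

Allow selectIntrinsic(uint32_t enabled, std::string_view name) {
  auto any = [&](uint32_t mask) { return (enabled & mask) != 0; };
  if (any(Address | KernelAddress) &&
      (name == "address" || name == "kernel-address"))
    return Allow::Address;
  if (any(Thread) && name == "thread")
    return Allow::Thread;
  if (any(Memory | KernelMemory) &&
      (name == "memory" || name == "kernel-memory"))
    return Allow::Memory;
  if (any(HWAddress | KernelHWAddress) &&
      (name == "hwaddress" || name == "kernel-hwaddress"))
    return Allow::HWAddress;
  return Allow::None;  // sanitizer not enabled, or name does not match
}
```

With only the kernel variant enabled, the non-kernel spelling still selects the intrinsic, for example `selectIntrinsic(KernelAddress, "address")` yields `Allow::Address`.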

[llvm-branch-commits] [LowerAllowCheck] Add llvm.allow.sanitize.* intrinsics (PR #172029)

2025-12-18 Thread Alexander Potapenko via llvm-branch-commits


@@ -123,26 +124,41 @@ static bool lowerAllowChecks(Function &F, const BlockFrequencyInfo &BFI,
 switch (ID) {
 case Intrinsic::allow_ubsan_check:
 case Intrinsic::allow_runtime_check: {
-  ++NumChecksTotal;
-
   bool ToRemove = ShouldRemove(II);
 
   ReplaceWithValue.push_back({
   II,
-  ToRemove,
+  !ToRemove,
   });
-  if (ToRemove)
-++NumChecksRemoved;
   emitRemark(II, ORE, ToRemove);
   break;
 }
+case Intrinsic::allow_sanitize_address:
+  ReplaceWithValue.push_back(
+  {II, F.hasFnAttribute(Attribute::SanitizeAddress)});
+  break;
+case Intrinsic::allow_sanitize_thread:
+  ReplaceWithValue.push_back(
+  {II, F.hasFnAttribute(Attribute::SanitizeThread)});
+  break;
+case Intrinsic::allow_sanitize_memory:
+  ReplaceWithValue.push_back(
+  {II, F.hasFnAttribute(Attribute::SanitizeMemory)});
+  break;
+case Intrinsic::allow_sanitize_hwaddress:
+  ReplaceWithValue.push_back(
+  {II, F.hasFnAttribute(Attribute::SanitizeHWAddress)});
+  break;
 default:
   break;
 }
   }
 
   for (auto [I, V] : ReplaceWithValue) {
-I->replaceAllUsesWith(ConstantInt::getBool(I->getType(), !V));
+++NumChecksTotal;
+if (!V) // If the final value is false, the check is considered removed

ramosian-glider wrote:

Nit: period at the end of the line.
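
For context, the hunk above appears to move the statistics into the final replacement loop, so that every lowered intrinsic is counted once and those folded to `false` count as removed. A standalone sketch of that counting scheme is below. `Stats` and `countChecks` are hypothetical names, and the increment of the removed counter is inferred from the quoted comment, since the hunk is cut off at that line.

```cpp
#include <vector>

// Hypothetical counterpart of the NumChecksTotal / NumChecksRemoved counters.
struct Stats {
  unsigned total = 0;
  unsigned removed = 0;
};

// Each entry is the boolean constant an intrinsic call will be replaced with.
Stats countChecks(const std::vector<bool> &finalValues) {
  Stats s;
  for (bool v : finalValues) {
    ++s.total;
    if (!v)  // A check whose final value is false is considered removed.
      ++s.removed;
  }
  return s;
}
```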

https://github.com/llvm/llvm-project/pull/172029


[llvm-branch-commits] [LowerAllowCheck] Add llvm.allow.sanitize.* intrinsics (PR #172029)

2025-12-18 Thread Alexander Potapenko via llvm-branch-commits

https://github.com/ramosian-glider approved this pull request.


https://github.com/llvm/llvm-project/pull/172029
___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [libc] [libc] Add `IN6_IS_ADDR_LOOPBACK` (PR #172312)

2025-12-18 Thread Connector Switch via llvm-branch-commits

https://github.com/c8ef edited https://github.com/llvm/llvm-project/pull/172312
___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] [Clang] Add __builtin_allow_sanitize_check() (PR #172030)

2025-12-18 Thread Marco Elver via llvm-branch-commits

https://github.com/melver updated 
https://github.com/llvm/llvm-project/pull/172030

>From d4f149dbb21fd7f5e706560b1aa3b8f0b9fa5ae9 Mon Sep 17 00:00:00 2001
From: Marco Elver 
Date: Fri, 12 Dec 2025 17:18:15 +0100
Subject: [PATCH 1/2] tweak test

Created using spr 1.3.8-beta.1
---
 clang/test/CodeGen/builtin-allow-sanitize-check-lower.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/clang/test/CodeGen/builtin-allow-sanitize-check-lower.c 
b/clang/test/CodeGen/builtin-allow-sanitize-check-lower.c
index 05a0295799f55..5e52f77f55573 100644
--- a/clang/test/CodeGen/builtin-allow-sanitize-check-lower.c
+++ b/clang/test/CodeGen/builtin-allow-sanitize-check-lower.c
@@ -9,9 +9,9 @@ _Bool check() {
   return __builtin_allow_sanitize_check("address");
 }
 
-// CHECK-LABEL: @test
+// CHECK-LABEL: @test_sanitize
 // CHECK: ret i1 true
-_Bool test() {
+_Bool test_sanitize() {
   return check();
 }
 

>From f4c6cfded128616e2c8d119e19174bbed1298781 Mon Sep 17 00:00:00 2001
From: Marco Elver 
Date: Wed, 17 Dec 2025 20:47:40 +0100
Subject: [PATCH 2/2] fixes

Created using spr 1.3.8-beta.1
---
 clang/lib/CodeGen/CGBuiltin.cpp   | 37 ++-
 .../test/Sema/builtin-allow-sanitize-check.c  |  6 +--
 2 files changed, 23 insertions(+), 20 deletions(-)

diff --git a/clang/lib/CodeGen/CGBuiltin.cpp b/clang/lib/CodeGen/CGBuiltin.cpp
index 98d80620b44a5..9de085379882c 100644
--- a/clang/lib/CodeGen/CGBuiltin.cpp
+++ b/clang/lib/CodeGen/CGBuiltin.cpp
@@ -3551,25 +3551,28 @@ RValue CodeGenFunction::EmitBuiltinExpr(const GlobalDecl GD, unsigned BuiltinID,
   }
   case Builtin::BI__builtin_allow_sanitize_check: {
 Intrinsic::ID IntrID = Intrinsic::not_intrinsic;
-StringRef SanitizerName =
+StringRef Name =
 cast<StringLiteral>(E->getArg(0)->IgnoreParenCasts())->getString();
 
-if (SanitizerName == "address" || SanitizerName == "kernel-address") {
-  if (CGM.getLangOpts().Sanitize.hasOneOf(SanitizerKind::Address |
-  SanitizerKind::KernelAddress))
-IntrID = Intrinsic::allow_sanitize_address;
-} else if (SanitizerName == "thread") {
-  if (CGM.getLangOpts().Sanitize.has(SanitizerKind::Thread))
-IntrID = Intrinsic::allow_sanitize_thread;
-} else if (SanitizerName == "memory" || SanitizerName == "kernel-memory") {
-  if (CGM.getLangOpts().Sanitize.hasOneOf(SanitizerKind::Memory |
-  SanitizerKind::KernelMemory))
-IntrID = Intrinsic::allow_sanitize_memory;
-} else if (SanitizerName == "hwaddress" ||
-   SanitizerName == "kernel-hwaddress") {
-  if (CGM.getLangOpts().Sanitize.hasOneOf(SanitizerKind::HWAddress |
-  SanitizerKind::KernelHWAddress))
-IntrID = Intrinsic::allow_sanitize_hwaddress;
+// We deliberately allow the use of kernel- and non-kernel names
+// interchangeably, even when one or the other is enabled. This is consistent
+// with the no_sanitize-attribute, which allows either kernel- or non-kernel
+// name to disable instrumentation (see CodeGenFunction::StartFunction).
+if (getLangOpts().Sanitize.hasOneOf(SanitizerKind::Address |
+SanitizerKind::KernelAddress) &&
+(Name == "address" || Name == "kernel-address")) {
+  IntrID = Intrinsic::allow_sanitize_address;
+} else if (getLangOpts().Sanitize.has(SanitizerKind::Thread) &&
+   Name == "thread") {
+  IntrID = Intrinsic::allow_sanitize_thread;
+} else if (getLangOpts().Sanitize.hasOneOf(SanitizerKind::Memory |
+   SanitizerKind::KernelMemory) &&
+   (Name == "memory" || Name == "kernel-memory")) {
+  IntrID = Intrinsic::allow_sanitize_memory;
+} else if (getLangOpts().Sanitize.hasOneOf(
+   SanitizerKind::HWAddress | SanitizerKind::KernelHWAddress) &&
+   (Name == "hwaddress" || Name == "kernel-hwaddress")) {
+  IntrID = Intrinsic::allow_sanitize_hwaddress;
 }
 
 if (IntrID != Intrinsic::not_intrinsic) {
diff --git a/clang/test/Sema/builtin-allow-sanitize-check.c 
b/clang/test/Sema/builtin-allow-sanitize-check.c
index 94deb16dd89f9..6e0e21a869461 100644
--- a/clang/test/Sema/builtin-allow-sanitize-check.c
+++ b/clang/test/Sema/builtin-allow-sanitize-check.c
@@ -1,15 +1,15 @@
 // RUN: %clang_cc1 -fsyntax-only -verify %s
 
 void test_builtin_allow_sanitize_check() {
-  // Test with non-string literal argument
+  // Test with non-string literal argument.
   char str[] = "address";
   (void)__builtin_allow_sanitize_check(str); // expected-error {{expression is not a string literal}}
   (void)__builtin_allow_sanitize_check(123); // expected-error {{expression is not a string literal}}
 
-  // Test with unsupported sanitizer name
+  // Test with unsupported sanitizer name.
   (void)__builtin_allow_sanitize_check("unsupported"
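
The name-matching logic in the `CGBuiltin.cpp` hunk above can be sketched in isolation. This is a simplified illustration under stated assumptions, not the Clang code: `AllowIntrinsic`, `EnabledSanitizers`, and `selectIntrinsic` are hypothetical stand-ins for `Intrinsic::ID`, the `Sanitize` set in the language options, and the inline `if`/`else if` chain.

```cpp
#include <string_view>

// Hypothetical stand-ins for the intrinsic IDs and the enabled-sanitizer set.
enum class AllowIntrinsic { None, Address, Thread, Memory, HWAddress };

struct EnabledSanitizers {
  bool Address = false, KernelAddress = false;
  bool Thread = false;
  bool Memory = false, KernelMemory = false;
  bool HWAddress = false, KernelHWAddress = false;
};

// Kernel and non-kernel names are accepted interchangeably, as in the patch:
// either spelling selects the intrinsic when either variant is enabled.
inline AllowIntrinsic selectIntrinsic(std::string_view Name,
                                      const EnabledSanitizers &S) {
  if ((S.Address || S.KernelAddress) &&
      (Name == "address" || Name == "kernel-address"))
    return AllowIntrinsic::Address;
  if (S.Thread && Name == "thread")
    return AllowIntrinsic::Thread;
  if ((S.Memory || S.KernelMemory) &&
      (Name == "memory" || Name == "kernel-memory"))
    return AllowIntrinsic::Memory;
  if ((S.HWAddress || S.KernelHWAddress) &&
      (Name == "hwaddress" || Name == "kernel-hwaddress"))
    return AllowIntrinsic::HWAddress;
  return AllowIntrinsic::None;
}
```

With only the kernel variant enabled, both `"address"` and `"kernel-address"` select the address intrinsic in this sketch, which mirrors the behaviour the patch comment describes for the `no_sanitize` attribute.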

[llvm-branch-commits] [llvm] [LowerAllowCheck] Add llvm.allow.sanitize.* intrinsics (PR #172029)

2025-12-18 Thread Marco Elver via llvm-branch-commits

https://github.com/melver updated 
https://github.com/llvm/llvm-project/pull/172029

>From 7c8dbba4f20841f2759fb7ee9a7c012facf056ac Mon Sep 17 00:00:00 2001
From: Marco Elver 
Date: Thu, 18 Dec 2025 12:05:08 +0100
Subject: [PATCH] fix

Created using spr 1.3.8-beta.1
---
 llvm/lib/Transforms/Instrumentation/LowerAllowCheckPass.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/llvm/lib/Transforms/Instrumentation/LowerAllowCheckPass.cpp 
b/llvm/lib/Transforms/Instrumentation/LowerAllowCheckPass.cpp
index d4c23ffe9a723..7a950036d9b6b 100644
--- a/llvm/lib/Transforms/Instrumentation/LowerAllowCheckPass.cpp
+++ b/llvm/lib/Transforms/Instrumentation/LowerAllowCheckPass.cpp
@@ -156,7 +156,7 @@ static bool lowerAllowChecks(Function &F, const 
BlockFrequencyInfo &BFI,
 
   for (auto [I, V] : ReplaceWithValue) {
 ++NumChecksTotal;
-if (!V) // If the final value is false, the check is considered removed
+if (!V) // If the final value is false, the check is considered removed.
   ++NumChecksRemoved;
 I->replaceAllUsesWith(ConstantInt::getBool(I->getType(), V));
 I->eraseFromParent();

___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [CodeGen][NPM] Remove "LowerConstantIntrinsicsPass" from the pipeline (PR #172794)

2025-12-18 Thread Vikram Hegde via llvm-branch-commits

https://github.com/vikramRH updated 
https://github.com/llvm/llvm-project/pull/172794

>From 2069a0869ecd304aadebc71ed6b038a012aec376 Mon Sep 17 00:00:00 2001
From: vikhegde 
Date: Wed, 17 Dec 2025 15:00:27 +0530
Subject: [PATCH] [NPM] Remove "LowerConstantIntrinsicsPass" from the pipeline

---
 llvm/include/llvm/Passes/CodeGenPassBuilder.h | 1 -
 llvm/test/CodeGen/AMDGPU/llc-pipeline-npm.ll  | 6 +++---
 2 files changed, 3 insertions(+), 4 deletions(-)

diff --git a/llvm/include/llvm/Passes/CodeGenPassBuilder.h 
b/llvm/include/llvm/Passes/CodeGenPassBuilder.h
index d7ce17bebbf5d..7151dceac4b79 100644
--- a/llvm/include/llvm/Passes/CodeGenPassBuilder.h
+++ b/llvm/include/llvm/Passes/CodeGenPassBuilder.h
@@ -730,7 +730,6 @@ void CodeGenPassBuilder::addIRPasses(
 flushFPMsToMPM(PMW);
 addModulePass(ShadowStackGCLoweringPass(), PMW);
   }
-  addFunctionPass(LowerConstantIntrinsicsPass(), PMW);
 
   // Make sure that no unreachable blocks are instruction selected.
   addFunctionPass(UnreachableBlockElimPass(), PMW);
diff --git a/llvm/test/CodeGen/AMDGPU/llc-pipeline-npm.ll 
b/llvm/test/CodeGen/AMDGPU/llc-pipeline-npm.ll
index 0383f0613b71d..9b8f0c5f4ef0d 100644
--- a/llvm/test/CodeGen/AMDGPU/llc-pipeline-npm.ll
+++ b/llvm/test/CodeGen/AMDGPU/llc-pipeline-npm.ll
@@ -9,11 +9,11 @@
 ; RUN:   | FileCheck -check-prefix=GCN-O3 %s
 
 
-; GCN-O0: 
require,require,require,require,pre-isel-intrinsic-lowering,function(expand-large-div-rem,expand-fp),amdgpu-remove-incompatible-functions,amdgpu-printf-runtime-binding,amdgpu-lower-ctor-dtor,function(amdgpu-uniform-intrinsic-combine),expand-variadics,amdgpu-always-inline,always-inline,amdgpu-export-kernel-runtime-handles,amdgpu-lower-exec-sync,amdgpu-sw-lower-lds,amdgpu-lower-module-lds,function(atomic-expand,verify,gc-lowering,lower-constant-intrinsics,unreachableblockelim,ee-instrument,scalarize-masked-mem-intrin,expand-reductions,amdgpu-lower-kernel-arguments),amdgpu-lower-buffer-fat-pointers,amdgpu-lower-intrinsics,cgscc(function(lower-switch,lower-invoke,unreachableblockelim)),require,cgscc(function(amdgpu-unify-divergent-exit-nodes,fix-irreducible,unify-loop-exits,StructurizeCFGPass,amdgpu-annotate-uniform,si-annotate-control-flow,amdgpu-rewrite-undef-for-phi,lcssa,require,callbr-prepare,safe-stack,stack-protector,verify)),cgscc(function(machine-function(amdgpu-isel,si-fix-sgpr-copies,si-i1-copies,finalize-isel,localstackalloc))),require,cgscc(function(machine-function(reg-usage-propagation,phi-node-elimination,two-address-instruction,regallocfast,si-fix-vgpr-copies,remove-redundant-debug-values,fixup-statepoint-caller-saved,prolog-epilog,post-ra-pseudos,si-post-ra-bundler,fentry-insert,xray-instrumentation,patchable-function,si-memory-legalizer,si-insert-waitcnts,si-mode-register,si-late-branch-lowering,post-RA-hazard-rec,amdgpu-wait-sgpr-hazards,amdgpu-lower-vgpr-encoding,branch-relaxation))),require,cgscc(function(machine-function(reg-usage-collector,remove-loads-into-fake-uses,live-debug-values,machine-sanmd,amdgpu-preload-kern-arg-prolog,stack-frame-layout,verify),free-machine-function))
+; GCN-O0: 
require,require,require,require,pre-isel-intrinsic-lowering,function(expand-large-div-rem,expand-fp),amdgpu-remove-incompatible-functions,amdgpu-printf-runtime-binding,amdgpu-lower-ctor-dtor,function(amdgpu-uniform-intrinsic-combine),expand-variadics,amdgpu-always-inline,always-inline,amdgpu-export-kernel-runtime-handles,amdgpu-lower-exec-sync,amdgpu-sw-lower-lds,amdgpu-lower-module-lds,function(atomic-expand,verify,gc-lowering,unreachableblockelim,ee-instrument,scalarize-masked-mem-intrin,expand-reductions,amdgpu-lower-kernel-arguments),amdgpu-lower-buffer-fat-pointers,amdgpu-lower-intrinsics,cgscc(function(lower-switch,lower-invoke,unreachableblockelim)),require,cgscc(function(amdgpu-unify-divergent-exit-nodes,fix-irreducible,unify-loop-exits,StructurizeCFGPass,amdgpu-annotate-uniform,si-annotate-control-flow,amdgpu-rewrite-undef-for-phi,lcssa,require,callbr-prepare,safe-stack,stack-protector,verify)),cgscc(function(machine-function(amdgpu-isel,si-fix-sgpr-copies,si-i1-copies,finalize-isel,localstackalloc))),require,cgscc(function(machine-function(reg-usage-propagation,phi-node-elimination,two-address-instruction,regallocfast,si-fix-vgpr-copies,remove-redundant-debug-values,fixup-statepoint-caller-saved,prolog-epilog,post-ra-pseudos,si-post-ra-bundler,fentry-insert,xray-instrumentation,patchable-function,si-memory-legalizer,si-insert-waitcnts,si-mode-register,si-late-branch-lowering,post-RA-hazard-rec,amdgpu-wait-sgpr-hazards,amdgpu-lower-vgpr-encoding,branch-relaxation))),require,cgscc(function(machine-function(reg-usage-collector,remove-loads-into-fake-uses,live-debug-values,machine-sanmd,amdgpu-preload-kern-arg-prolog,stack-frame-layout,verify),free-machine-function))
 
-; GCN-O2: 
require,require,require,require,pre-isel-intrinsic-lowering,function(expand-large-div-rem,expand-fp),amdgpu-remove-incompatible-functions,amdgpu-printf-runtime-binding,amdgpu-lo

[llvm-branch-commits] [llvm] [NPM] Update OptimizedRegAlloc and MachineLateOptimization pipelines (PR #172795)

2025-12-18 Thread Vikram Hegde via llvm-branch-commits

https://github.com/vikramRH updated 
https://github.com/llvm/llvm-project/pull/172795

>From 82805bc6ac48375d4f18fd3810cb7e4a7a56 Mon Sep 17 00:00:00 2001
From: vikhegde 
Date: Wed, 17 Dec 2025 15:17:49 +0530
Subject: [PATCH] [NPM] Update OptimizedRegAlloc and MachineLateOptimization
 pipelines

---
 llvm/include/llvm/Passes/CodeGenPassBuilder.h | 9 ++---
 llvm/test/CodeGen/AMDGPU/llc-pipeline-npm.ll  | 4 ++--
 2 files changed, 8 insertions(+), 5 deletions(-)

diff --git a/llvm/include/llvm/Passes/CodeGenPassBuilder.h 
b/llvm/include/llvm/Passes/CodeGenPassBuilder.h
index 7151dceac4b79..7528ee9f251e5 100644
--- a/llvm/include/llvm/Passes/CodeGenPassBuilder.h
+++ b/llvm/include/llvm/Passes/CodeGenPassBuilder.h
@@ -1255,6 +1255,9 @@ void CodeGenPassBuilder::addOptimizedRegAlloc(
 // addRegAssignmentOptimized did not add a reg alloc pass, so do nothing.
 return;
   }
+
+  addMachineFunctionPass(StackSlotColoringPass(), PMW);
+
   // Allow targets to expand pseudo instructions depending on the choice of
   // registers before MachineCopyPropagation.
   derived().addPostRewrite(PMW);
@@ -1277,6 +1280,9 @@ void CodeGenPassBuilder::addOptimizedRegAlloc(
 template 
 void CodeGenPassBuilder::addMachineLateOptimization(
 PassManagerWrapper &PMW) const {
+  // Cleanup of redundant (identical) address/immediate loads.
+  addMachineFunctionPass(MachineLateInstrsCleanupPass(), PMW);
+
   // Branch folding must be run after regalloc and prolog/epilog insertion.
   addMachineFunctionPass(BranchFolderPass(Opt.EnableTailMerge), PMW);
 
@@ -1287,9 +1293,6 @@ void CodeGenPassBuilder::addMachineLateOptimization(
   if (!TM.requiresStructuredCFG())
 addMachineFunctionPass(TailDuplicatePass(), PMW);
 
-  // Cleanup of redundant (identical) address/immediate loads.
-  addMachineFunctionPass(MachineLateInstrsCleanupPass(), PMW);
-
   // Copy propagation.
   addMachineFunctionPass(MachineCopyPropagationPass(), PMW);
 }
diff --git a/llvm/test/CodeGen/AMDGPU/llc-pipeline-npm.ll 
b/llvm/test/CodeGen/AMDGPU/llc-pipeline-npm.ll
index 9b8f0c5f4ef0d..d4227d72c7c5a 100644
--- a/llvm/test/CodeGen/AMDGPU/llc-pipeline-npm.ll
+++ b/llvm/test/CodeGen/AMDGPU/llc-pipeline-npm.ll
@@ -11,9 +11,9 @@
 
 ; GCN-O0: 
require,require,require,require,pre-isel-intrinsic-lowering,function(expand-large-div-rem,expand-fp),amdgpu-remove-incompatible-functions,amdgpu-printf-runtime-binding,amdgpu-lower-ctor-dtor,function(amdgpu-uniform-intrinsic-combine),expand-variadics,amdgpu-always-inline,always-inline,amdgpu-export-kernel-runtime-handles,amdgpu-lower-exec-sync,amdgpu-sw-lower-lds,amdgpu-lower-module-lds,function(atomic-expand,verify,gc-lowering,unreachableblockelim,ee-instrument,scalarize-masked-mem-intrin,expand-reductions,amdgpu-lower-kernel-arguments),amdgpu-lower-buffer-fat-pointers,amdgpu-lower-intrinsics,cgscc(function(lower-switch,lower-invoke,unreachableblockelim)),require,cgscc(function(amdgpu-unify-divergent-exit-nodes,fix-irreducible,unify-loop-exits,StructurizeCFGPass,amdgpu-annotate-uniform,si-annotate-control-flow,amdgpu-rewrite-undef-for-phi,lcssa,require,callbr-prepare,safe-stack,stack-protector,verify)),cgscc(function(machine-function(amdgpu-isel,si-fix-sgpr-copies,si-i1-copies,finalize-isel,localstackalloc))),require,cgscc(function(machine-function(reg-usage-propagation,phi-node-elimination,two-address-instruction,regallocfast,si-fix-vgpr-copies,remove-redundant-debug-values,fixup-statepoint-caller-saved,prolog-epilog,post-ra-pseudos,si-post-ra-bundler,fentry-insert,xray-instrumentation,patchable-function,si-memory-legalizer,si-insert-waitcnts,si-mode-register,si-late-branch-lowering,post-RA-hazard-rec,amdgpu-wait-sgpr-hazards,amdgpu-lower-vgpr-encoding,branch-relaxation))),require,cgscc(function(machine-function(reg-usage-collector,remove-loads-into-fake-uses,live-debug-values,machine-sanmd,amdgpu-preload-kern-arg-prolog,stack-frame-layout,verify),free-machine-function))
 
-; GCN-O2: 
require,require,require,require,pre-isel-intrinsic-lowering,function(expand-large-div-rem,expand-fp),amdgpu-remove-incompatible-functions,amdgpu-printf-runtime-binding,amdgpu-lower-ctor-dtor,function(amdgpu-image-intrinsic-opt,amdgpu-uniform-intrinsic-combine),expand-variadics,amdgpu-always-inline,always-inline,amdgpu-export-kernel-runtime-handles,amdgpu-lower-exec-sync,amdgpu-sw-lower-lds,amdgpu-lower-module-lds,function(amdgpu-atomic-optimizer,atomic-expand,amdgpu-promote-alloca,separate-const-offset-from-gep<>,slsr,early-cse<>,nary-reassociate,early-cse<>,amdgpu-codegenprepare,loop-mssa(licm),verify,loop-mssa(canon-freeze,loop-reduce),mergeicmps,expand-memcmp,gc-lowering,unreachableblockelim,consthoist,replace-with-veclib,partially-inline-libcalls,ee-instrument,scalarize-masked-mem-intrin,expand-reductions,early-cse<>),amdgpu-preload-kernel-arguments,function(amdgpu-lower-kernel-arguments,codegenprepare,load-store-vectorizer),amdgpu-lower-buffer-fat-pointers,amdgpu-lower-intrinsics,cgscc(function(lower-switch,lower-invoke,unreacha

[llvm-branch-commits] [llvm] [AMDGPU][NPM] Disable few non useful passes (PR #172796)

2025-12-18 Thread Vikram Hegde via llvm-branch-commits

https://github.com/vikramRH updated 
https://github.com/llvm/llvm-project/pull/172796

>From ef5f9507ae7795d273d14b991a1f36572f3c6328 Mon Sep 17 00:00:00 2001
From: vikhegde 
Date: Wed, 17 Dec 2025 15:28:47 +0530
Subject: [PATCH] [AMDGPU][NPM] Disable few non useful passes

---
 llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp | 4 ++--
 llvm/test/CodeGen/AMDGPU/llc-pipeline-npm.ll   | 6 +++---
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp 
b/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
index b21c107a026da..4a9853288d996 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
@@ -2097,8 +2097,8 @@ AMDGPUCodeGenPassBuilder::AMDGPUCodeGenPassBuilder(
   // Exceptions and StackMaps are not supported, so these passes will never do
   // anything.
   // Garbage collection is not supported.
-  disablePass();
+  disablePass();
 }
 
 void AMDGPUCodeGenPassBuilder::addIRPasses(PassManagerWrapper &PMW) const {
diff --git a/llvm/test/CodeGen/AMDGPU/llc-pipeline-npm.ll 
b/llvm/test/CodeGen/AMDGPU/llc-pipeline-npm.ll
index d4227d72c7c5a..1ae057d2c3bc0 100644
--- a/llvm/test/CodeGen/AMDGPU/llc-pipeline-npm.ll
+++ b/llvm/test/CodeGen/AMDGPU/llc-pipeline-npm.ll
@@ -9,11 +9,11 @@
 ; RUN:   | FileCheck -check-prefix=GCN-O3 %s
 
 
-; GCN-O0: 
require,require,require,require,pre-isel-intrinsic-lowering,function(expand-large-div-rem,expand-fp),amdgpu-remove-incompatible-functions,amdgpu-printf-runtime-binding,amdgpu-lower-ctor-dtor,function(amdgpu-uniform-intrinsic-combine),expand-variadics,amdgpu-always-inline,always-inline,amdgpu-export-kernel-runtime-handles,amdgpu-lower-exec-sync,amdgpu-sw-lower-lds,amdgpu-lower-module-lds,function(atomic-expand,verify,gc-lowering,unreachableblockelim,ee-instrument,scalarize-masked-mem-intrin,expand-reductions,amdgpu-lower-kernel-arguments),amdgpu-lower-buffer-fat-pointers,amdgpu-lower-intrinsics,cgscc(function(lower-switch,lower-invoke,unreachableblockelim)),require,cgscc(function(amdgpu-unify-divergent-exit-nodes,fix-irreducible,unify-loop-exits,StructurizeCFGPass,amdgpu-annotate-uniform,si-annotate-control-flow,amdgpu-rewrite-undef-for-phi,lcssa,require,callbr-prepare,safe-stack,stack-protector,verify)),cgscc(function(machine-function(amdgpu-isel,si-fix-sgpr-copies,si-i1-copies,finalize-isel,localstackalloc))),require,cgscc(function(machine-function(reg-usage-propagation,phi-node-elimination,two-address-instruction,regallocfast,si-fix-vgpr-copies,remove-redundant-debug-values,fixup-statepoint-caller-saved,prolog-epilog,post-ra-pseudos,si-post-ra-bundler,fentry-insert,xray-instrumentation,patchable-function,si-memory-legalizer,si-insert-waitcnts,si-mode-register,si-late-branch-lowering,post-RA-hazard-rec,amdgpu-wait-sgpr-hazards,amdgpu-lower-vgpr-encoding,branch-relaxation))),require,cgscc(function(machine-function(reg-usage-collector,remove-loads-into-fake-uses,live-debug-values,machine-sanmd,amdgpu-preload-kern-arg-prolog,stack-frame-layout,verify),free-machine-function))
+; GCN-O0: 
require,require,require,require,pre-isel-intrinsic-lowering,function(expand-large-div-rem,expand-fp),amdgpu-remove-incompatible-functions,amdgpu-printf-runtime-binding,amdgpu-lower-ctor-dtor,function(amdgpu-uniform-intrinsic-combine),expand-variadics,amdgpu-always-inline,always-inline,amdgpu-export-kernel-runtime-handles,amdgpu-lower-exec-sync,amdgpu-sw-lower-lds,amdgpu-lower-module-lds,function(atomic-expand,verify,unreachableblockelim,ee-instrument,scalarize-masked-mem-intrin,expand-reductions,amdgpu-lower-kernel-arguments),amdgpu-lower-buffer-fat-pointers,amdgpu-lower-intrinsics,cgscc(function(lower-switch,lower-invoke,unreachableblockelim)),require,cgscc(function(amdgpu-unify-divergent-exit-nodes,fix-irreducible,unify-loop-exits,StructurizeCFGPass,amdgpu-annotate-uniform,si-annotate-control-flow,amdgpu-rewrite-undef-for-phi,lcssa,require,callbr-prepare,safe-stack,stack-protector,verify)),cgscc(function(machine-function(amdgpu-isel,si-fix-sgpr-copies,si-i1-copies,finalize-isel,localstackalloc))),require,cgscc(function(machine-function(reg-usage-propagation,phi-node-elimination,two-address-instruction,regallocfast,si-fix-vgpr-copies,remove-redundant-debug-values,fixup-statepoint-caller-saved,prolog-epilog,post-ra-pseudos,si-post-ra-bundler,fentry-insert,xray-instrumentation,si-memory-legalizer,si-insert-waitcnts,si-mode-register,si-late-branch-lowering,post-RA-hazard-rec,amdgpu-wait-sgpr-hazards,amdgpu-lower-vgpr-encoding,branch-relaxation))),require,cgscc(function(machine-function(reg-usage-collector,remove-loads-into-fake-uses,live-debug-values,machine-sanmd,amdgpu-preload-kern-arg-prolog,stack-frame-layout,verify),free-machine-function))
 
-; GCN-O2: 
require,require,require,require,pre-isel-intrinsic-lowering,function(expand-large-div-rem,expand-fp),amdgpu-remove-incompatible-functions,amdgpu-printf-runtime-binding,amdgpu-lower-ctor-dtor,function(amdgpu-image-intrinsic-opt,amdgpu-

[llvm-branch-commits] [llvm] [AMDGPU][NPM] add "addPostBBSections()" to NPM (PR #172793)

2025-12-18 Thread Vikram Hegde via llvm-branch-commits

https://github.com/vikramRH updated 
https://github.com/llvm/llvm-project/pull/172793

>From a1e46965396fefcf12645e376c202839cfc3af56 Mon Sep 17 00:00:00 2001
From: vikhegde 
Date: Wed, 17 Dec 2025 14:51:08 +0530
Subject: [PATCH] [AMDGPU][NPM] add "addPostBBSections()" to NPM

---
 llvm/include/llvm/Passes/CodeGenPassBuilder.h  | 4 
 llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp | 8 
 llvm/test/CodeGen/AMDGPU/llc-pipeline-npm.ll   | 6 +++---
 3 files changed, 15 insertions(+), 3 deletions(-)

diff --git a/llvm/include/llvm/Passes/CodeGenPassBuilder.h 
b/llvm/include/llvm/Passes/CodeGenPassBuilder.h
index f47537d109671..d7ce17bebbf5d 100644
--- a/llvm/include/llvm/Passes/CodeGenPassBuilder.h
+++ b/llvm/include/llvm/Passes/CodeGenPassBuilder.h
@@ -486,6 +486,8 @@ template  class 
CodeGenPassBuilder {
   /// Add standard basic block placement passes.
   void addBlockPlacement(PassManagerWrapper &PMW) const;
 
+  void addPostBBSections(PassManagerWrapper &PMW) const {}
+
   using CreateMCStreamer =
   std::function>(MCContext &)>;
   void addAsmPrinter(PassManagerWrapper &PMW, CreateMCStreamer) const {
@@ -1063,6 +1065,8 @@ Error CodeGenPassBuilder::addMachinePasses(
 }
   }
 
+  derived().addPostBBSections(PMW);
+
   addMachineFunctionPass(StackFrameLayoutAnalysisPass(), PMW);
 
   // Add passes that directly emit MI after all other MI passes.
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp 
b/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
index 39b8bb77a9f20..b21c107a026da 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
@@ -157,6 +157,7 @@ class AMDGPUCodeGenPassBuilder
   void addPreRegAlloc(PassManagerWrapper &PMW) const;
   void addOptimizedRegAlloc(PassManagerWrapper &PMW) const;
   void addPreSched2(PassManagerWrapper &PMW) const;
+  void addPostBBSections(PassManagerWrapper &PMW) const;
 
   /// Check if a pass is enabled given \p Opt option. The option always
   /// overrides defaults if explicitly used. Otherwise its default will be used
@@ -2403,6 +2404,13 @@ void 
AMDGPUCodeGenPassBuilder::addPreSched2(PassManagerWrapper &PMW) const {
   addMachineFunctionPass(SIPostRABundlerPass(), PMW);
 }
 
+void AMDGPUCodeGenPassBuilder::addPostBBSections(
+PassManagerWrapper &PMW) const {
+  // We run this later to avoid passes like livedebugvalues and BBSections
+  // having to deal with the apparent multi-entry functions we may generate.
+  addMachineFunctionPass(AMDGPUPreloadKernArgPrologPass(), PMW);
+}
+
 void AMDGPUCodeGenPassBuilder::addPreEmitPass(PassManagerWrapper &PMW) const {
   if (isPassEnabled(EnableVOPD, CodeGenOptLevel::Less)) {
 addMachineFunctionPass(GCNCreateVOPDPass(), PMW);
diff --git a/llvm/test/CodeGen/AMDGPU/llc-pipeline-npm.ll 
b/llvm/test/CodeGen/AMDGPU/llc-pipeline-npm.ll
index c3dc26f3e10e4..0383f0613b71d 100644
--- a/llvm/test/CodeGen/AMDGPU/llc-pipeline-npm.ll
+++ b/llvm/test/CodeGen/AMDGPU/llc-pipeline-npm.ll
@@ -9,11 +9,11 @@
 ; RUN:   | FileCheck -check-prefix=GCN-O3 %s
 
 
-; GCN-O0: 
require,require,require,require,pre-isel-intrinsic-lowering,function(expand-large-div-rem,expand-fp),amdgpu-remove-incompatible-functions,amdgpu-printf-runtime-binding,amdgpu-lower-ctor-dtor,function(amdgpu-uniform-intrinsic-combine),expand-variadics,amdgpu-always-inline,always-inline,amdgpu-export-kernel-runtime-handles,amdgpu-lower-exec-sync,amdgpu-sw-lower-lds,amdgpu-lower-module-lds,function(atomic-expand,verify,gc-lowering,lower-constant-intrinsics,unreachableblockelim,ee-instrument,scalarize-masked-mem-intrin,expand-reductions,amdgpu-lower-kernel-arguments),amdgpu-lower-buffer-fat-pointers,amdgpu-lower-intrinsics,cgscc(function(lower-switch,lower-invoke,unreachableblockelim)),require,cgscc(function(amdgpu-unify-divergent-exit-nodes,fix-irreducible,unify-loop-exits,StructurizeCFGPass,amdgpu-annotate-uniform,si-annotate-control-flow,amdgpu-rewrite-undef-for-phi,lcssa,require,callbr-prepare,safe-stack,stack-protector,verify)),cgscc(function(machine-function(amdgpu-isel,si-fix-sgpr-copies,si-i1-copies,finalize-isel,localstackalloc))),require,cgscc(function(machine-function(reg-usage-propagation,phi-node-elimination,two-address-instruction,regallocfast,si-fix-vgpr-copies,remove-redundant-debug-values,fixup-statepoint-caller-saved,prolog-epilog,post-ra-pseudos,si-post-ra-bundler,fentry-insert,xray-instrumentation,patchable-function,si-memory-legalizer,si-insert-waitcnts,si-mode-register,si-late-branch-lowering,post-RA-hazard-rec,amdgpu-wait-sgpr-hazards,amdgpu-lower-vgpr-encoding,branch-relaxation))),require,cgscc(function(machine-function(reg-usage-collector,remove-loads-into-fake-uses,live-debug-values,machine-sanmd,stack-frame-layout,verify),free-machine-function))
+; GCN-O0: 
require,require,require,require,pre-isel-intrinsic-lowering,function(expand-large-div-rem,expand-fp),amdgpu-remove-incompatible-functions,amdgpu-printf-runtime-binding,amdgpu-lower-ctor-dtor,function(amdgpu-uniform
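
The hook added in this patch follows the CRTP pattern that `CodeGenPassBuilder` already uses: the base class provides an empty default and calls it through `derived()`, so a target that declares its own `addPostBBSections` gets static dispatch without virtual functions. A minimal sketch of that pattern, assuming a plain vector of pass names in place of `PassManagerWrapper`; `BuilderBase`, `GenericBuilder`, and `TargetBuilder` are hypothetical names, and the pass strings are only illustrative:

```cpp
#include <string>
#include <vector>

// Hypothetical, heavily simplified stand-in for a CRTP pass builder.
template <typename DerivedT> class BuilderBase {
public:
  // Default hook: targets that do not shadow it add nothing.
  void addPostBBSections(std::vector<std::string> &Pipeline) const {}

  void addMachinePasses(std::vector<std::string> &Pipeline) const {
    Pipeline.push_back("machine-sanmd");
    // Static dispatch to the target's hook if it shadows the default.
    derived().addPostBBSections(Pipeline);
    Pipeline.push_back("stack-frame-layout");
  }

private:
  const DerivedT &derived() const {
    return static_cast<const DerivedT &>(*this);
  }
};

// A target that keeps the default no-op hook.
class GenericBuilder : public BuilderBase<GenericBuilder> {};

// A target that inserts one extra pass at the hook point.
class TargetBuilder : public BuilderBase<TargetBuilder> {
public:
  void addPostBBSections(std::vector<std::string> &Pipeline) const {
    Pipeline.push_back("amdgpu-preload-kern-arg-prolog");
  }
};
```

In this sketch `GenericBuilder` produces a two-entry pipeline, while `TargetBuilder` places its extra pass between the two common ones, matching the position of `amdgpu-preload-kern-arg-prolog` before `stack-frame-layout` in the expected pipeline strings.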

[llvm-branch-commits] [llvm] [CodeGen][NPM] Remove "LowerConstantIntrinsicsPass" from the pipeline (PR #172794)

2025-12-18 Thread Vikram Hegde via llvm-branch-commits

https://github.com/vikramRH updated https://github.com/llvm/llvm-project/pull/172794

>From 2069a0869ecd304aadebc71ed6b038a012aec376 Mon Sep 17 00:00:00 2001
From: vikhegde 
Date: Wed, 17 Dec 2025 15:00:27 +0530
Subject: [PATCH] [NPM] Remove "LowerConstantIntrinsicsPass" from the pipeline

---
 llvm/include/llvm/Passes/CodeGenPassBuilder.h | 1 -
 llvm/test/CodeGen/AMDGPU/llc-pipeline-npm.ll  | 6 +++---
 2 files changed, 3 insertions(+), 4 deletions(-)

diff --git a/llvm/include/llvm/Passes/CodeGenPassBuilder.h b/llvm/include/llvm/Passes/CodeGenPassBuilder.h
index d7ce17bebbf5d..7151dceac4b79 100644
--- a/llvm/include/llvm/Passes/CodeGenPassBuilder.h
+++ b/llvm/include/llvm/Passes/CodeGenPassBuilder.h
@@ -730,7 +730,6 @@ void CodeGenPassBuilder::addIRPasses(
 flushFPMsToMPM(PMW);
 addModulePass(ShadowStackGCLoweringPass(), PMW);
   }
-  addFunctionPass(LowerConstantIntrinsicsPass(), PMW);
 
   // Make sure that no unreachable blocks are instruction selected.
   addFunctionPass(UnreachableBlockElimPass(), PMW);
diff --git a/llvm/test/CodeGen/AMDGPU/llc-pipeline-npm.ll b/llvm/test/CodeGen/AMDGPU/llc-pipeline-npm.ll
index 0383f0613b71d..9b8f0c5f4ef0d 100644
--- a/llvm/test/CodeGen/AMDGPU/llc-pipeline-npm.ll
+++ b/llvm/test/CodeGen/AMDGPU/llc-pipeline-npm.ll
@@ -9,11 +9,11 @@
 ; RUN:   | FileCheck -check-prefix=GCN-O3 %s
 
 
-; GCN-O0: 
require,require,require,require,pre-isel-intrinsic-lowering,function(expand-large-div-rem,expand-fp),amdgpu-remove-incompatible-functions,amdgpu-printf-runtime-binding,amdgpu-lower-ctor-dtor,function(amdgpu-uniform-intrinsic-combine),expand-variadics,amdgpu-always-inline,always-inline,amdgpu-export-kernel-runtime-handles,amdgpu-lower-exec-sync,amdgpu-sw-lower-lds,amdgpu-lower-module-lds,function(atomic-expand,verify,gc-lowering,lower-constant-intrinsics,unreachableblockelim,ee-instrument,scalarize-masked-mem-intrin,expand-reductions,amdgpu-lower-kernel-arguments),amdgpu-lower-buffer-fat-pointers,amdgpu-lower-intrinsics,cgscc(function(lower-switch,lower-invoke,unreachableblockelim)),require,cgscc(function(amdgpu-unify-divergent-exit-nodes,fix-irreducible,unify-loop-exits,StructurizeCFGPass,amdgpu-annotate-uniform,si-annotate-control-flow,amdgpu-rewrite-undef-for-phi,lcssa,require,callbr-prepare,safe-stack,stack-protector,verify)),cgscc(function(machine-function(amdgpu-isel,si-fix-sgpr-copies,si-i1-copies,finalize-isel,localstackalloc))),require,cgscc(function(machine-function(reg-usage-propagation,phi-node-elimination,two-address-instruction,regallocfast,si-fix-vgpr-copies,remove-redundant-debug-values,fixup-statepoint-caller-saved,prolog-epilog,post-ra-pseudos,si-post-ra-bundler,fentry-insert,xray-instrumentation,patchable-function,si-memory-legalizer,si-insert-waitcnts,si-mode-register,si-late-branch-lowering,post-RA-hazard-rec,amdgpu-wait-sgpr-hazards,amdgpu-lower-vgpr-encoding,branch-relaxation))),require,cgscc(function(machine-function(reg-usage-collector,remove-loads-into-fake-uses,live-debug-values,machine-sanmd,amdgpu-preload-kern-arg-prolog,stack-frame-layout,verify),free-machine-function))
+; GCN-O0: 
require,require,require,require,pre-isel-intrinsic-lowering,function(expand-large-div-rem,expand-fp),amdgpu-remove-incompatible-functions,amdgpu-printf-runtime-binding,amdgpu-lower-ctor-dtor,function(amdgpu-uniform-intrinsic-combine),expand-variadics,amdgpu-always-inline,always-inline,amdgpu-export-kernel-runtime-handles,amdgpu-lower-exec-sync,amdgpu-sw-lower-lds,amdgpu-lower-module-lds,function(atomic-expand,verify,gc-lowering,unreachableblockelim,ee-instrument,scalarize-masked-mem-intrin,expand-reductions,amdgpu-lower-kernel-arguments),amdgpu-lower-buffer-fat-pointers,amdgpu-lower-intrinsics,cgscc(function(lower-switch,lower-invoke,unreachableblockelim)),require,cgscc(function(amdgpu-unify-divergent-exit-nodes,fix-irreducible,unify-loop-exits,StructurizeCFGPass,amdgpu-annotate-uniform,si-annotate-control-flow,amdgpu-rewrite-undef-for-phi,lcssa,require,callbr-prepare,safe-stack,stack-protector,verify)),cgscc(function(machine-function(amdgpu-isel,si-fix-sgpr-copies,si-i1-copies,finalize-isel,localstackalloc))),require,cgscc(function(machine-function(reg-usage-propagation,phi-node-elimination,two-address-instruction,regallocfast,si-fix-vgpr-copies,remove-redundant-debug-values,fixup-statepoint-caller-saved,prolog-epilog,post-ra-pseudos,si-post-ra-bundler,fentry-insert,xray-instrumentation,patchable-function,si-memory-legalizer,si-insert-waitcnts,si-mode-register,si-late-branch-lowering,post-RA-hazard-rec,amdgpu-wait-sgpr-hazards,amdgpu-lower-vgpr-encoding,branch-relaxation))),require,cgscc(function(machine-function(reg-usage-collector,remove-loads-into-fake-uses,live-debug-values,machine-sanmd,amdgpu-preload-kern-arg-prolog,stack-frame-layout,verify),free-machine-function))
 
-; GCN-O2: 
require,require,require,require,pre-isel-intrinsic-lowering,function(expand-large-div-rem,expand-fp),amdgpu-remove-incompatible-functions,amdgpu-printf-runtime-binding,amdgpu-lo
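
The removed and added `GCN-O0` check lines above are very long and differ in a single pass name, which is hard to spot by eye. A minimal sketch of a helper that tokenizes two pipeline strings and reports what disappeared is given below. The names `splitPasses` and `removedPasses` are hypothetical and not part of LLVM; wrapper names such as `function` or `cgscc` are treated as ordinary tokens.

```cpp
#include <algorithm>
#include <iterator>
#include <string>
#include <vector>

// Split a textual pipeline into tokens at ',', '(' and ')'.
// Hypothetical helper for illustration only; not an LLVM API.
std::vector<std::string> splitPasses(const std::string &Pipeline) {
  std::vector<std::string> Tokens;
  std::string Current;
  for (char C : Pipeline) {
    if (C == ',' || C == '(' || C == ')') {
      if (!Current.empty())
        Tokens.push_back(Current);
      Current.clear();
    } else {
      Current.push_back(C);
    }
  }
  if (!Current.empty())
    Tokens.push_back(Current);
  return Tokens;
}

// Return the tokens that occur in Before but are missing from After.
// Duplicates are handled as a multiset difference; ordering is ignored.
std::vector<std::string> removedPasses(const std::string &Before,
                                       const std::string &After) {
  std::vector<std::string> B = splitPasses(Before);
  std::vector<std::string> A = splitPasses(After);
  std::sort(B.begin(), B.end());
  std::sort(A.begin(), A.end());
  std::vector<std::string> Removed;
  std::set_difference(B.begin(), B.end(), A.begin(), A.end(),
                      std::back_inserter(Removed));
  return Removed;
}
```

Called on the `function(...)` fragment of the old and new `GCN-O0` lines, this should report only `lower-constant-intrinsics`. Because it ignores ordering, it would not detect a pass that merely moved.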

[llvm-branch-commits] [llvm] [AMDGPU][NPM] add "addPostBBSections()" to NPM (PR #172793)

2025-12-18 Thread Vikram Hegde via llvm-branch-commits

https://github.com/vikramRH updated https://github.com/llvm/llvm-project/pull/172793

>From a1e46965396fefcf12645e376c202839cfc3af56 Mon Sep 17 00:00:00 2001
From: vikhegde 
Date: Wed, 17 Dec 2025 14:51:08 +0530
Subject: [PATCH] [AMDGPU][NPM] add "addPostBBSections()" to NPM

---
 llvm/include/llvm/Passes/CodeGenPassBuilder.h  | 4 
 llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp | 8 
 llvm/test/CodeGen/AMDGPU/llc-pipeline-npm.ll   | 6 +++---
 3 files changed, 15 insertions(+), 3 deletions(-)

diff --git a/llvm/include/llvm/Passes/CodeGenPassBuilder.h b/llvm/include/llvm/Passes/CodeGenPassBuilder.h
index f47537d109671..d7ce17bebbf5d 100644
--- a/llvm/include/llvm/Passes/CodeGenPassBuilder.h
+++ b/llvm/include/llvm/Passes/CodeGenPassBuilder.h
@@ -486,6 +486,8 @@ template <typename DerivedT, typename TargetMachineT> class CodeGenPassBuilder {
   /// Add standard basic block placement passes.
   void addBlockPlacement(PassManagerWrapper &PMW) const;
 
+  void addPostBBSections(PassManagerWrapper &PMW) const {}
+
   using CreateMCStreamer =
      std::function<Expected<std::unique_ptr<MCStreamer>>(MCContext &)>;
   void addAsmPrinter(PassManagerWrapper &PMW, CreateMCStreamer) const {
@@ -1063,6 +1065,8 @@ Error CodeGenPassBuilder::addMachinePasses(
 }
   }
 
+  derived().addPostBBSections(PMW);
+
   addMachineFunctionPass(StackFrameLayoutAnalysisPass(), PMW);
 
   // Add passes that directly emit MI after all other MI passes.
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp b/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
index 39b8bb77a9f20..b21c107a026da 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
@@ -157,6 +157,7 @@ class AMDGPUCodeGenPassBuilder
   void addPreRegAlloc(PassManagerWrapper &PMW) const;
   void addOptimizedRegAlloc(PassManagerWrapper &PMW) const;
   void addPreSched2(PassManagerWrapper &PMW) const;
+  void addPostBBSections(PassManagerWrapper &PMW) const;
 
   /// Check if a pass is enabled given \p Opt option. The option always
   /// overrides defaults if explicitly used. Otherwise its default will be used
@@ -2403,6 +2404,13 @@ void AMDGPUCodeGenPassBuilder::addPreSched2(PassManagerWrapper &PMW) const {
   addMachineFunctionPass(SIPostRABundlerPass(), PMW);
 }
 
+void AMDGPUCodeGenPassBuilder::addPostBBSections(
+PassManagerWrapper &PMW) const {
+  // We run this later to avoid passes like livedebugvalues and BBSections
+  // having to deal with the apparent multi-entry functions we may generate.
+  addMachineFunctionPass(AMDGPUPreloadKernArgPrologPass(), PMW);
+}
+
 void AMDGPUCodeGenPassBuilder::addPreEmitPass(PassManagerWrapper &PMW) const {
   if (isPassEnabled(EnableVOPD, CodeGenOptLevel::Less)) {
 addMachineFunctionPass(GCNCreateVOPDPass(), PMW);
diff --git a/llvm/test/CodeGen/AMDGPU/llc-pipeline-npm.ll b/llvm/test/CodeGen/AMDGPU/llc-pipeline-npm.ll
index c3dc26f3e10e4..0383f0613b71d 100644
--- a/llvm/test/CodeGen/AMDGPU/llc-pipeline-npm.ll
+++ b/llvm/test/CodeGen/AMDGPU/llc-pipeline-npm.ll
@@ -9,11 +9,11 @@
 ; RUN:   | FileCheck -check-prefix=GCN-O3 %s
 
 
-; GCN-O0: 
require,require,require,require,pre-isel-intrinsic-lowering,function(expand-large-div-rem,expand-fp),amdgpu-remove-incompatible-functions,amdgpu-printf-runtime-binding,amdgpu-lower-ctor-dtor,function(amdgpu-uniform-intrinsic-combine),expand-variadics,amdgpu-always-inline,always-inline,amdgpu-export-kernel-runtime-handles,amdgpu-lower-exec-sync,amdgpu-sw-lower-lds,amdgpu-lower-module-lds,function(atomic-expand,verify,gc-lowering,lower-constant-intrinsics,unreachableblockelim,ee-instrument,scalarize-masked-mem-intrin,expand-reductions,amdgpu-lower-kernel-arguments),amdgpu-lower-buffer-fat-pointers,amdgpu-lower-intrinsics,cgscc(function(lower-switch,lower-invoke,unreachableblockelim)),require,cgscc(function(amdgpu-unify-divergent-exit-nodes,fix-irreducible,unify-loop-exits,StructurizeCFGPass,amdgpu-annotate-uniform,si-annotate-control-flow,amdgpu-rewrite-undef-for-phi,lcssa,require,callbr-prepare,safe-stack,stack-protector,verify)),cgscc(function(machine-function(amdgpu-isel,si-fix-sgpr-copies,si-i1-copies,finalize-isel,localstackalloc))),require,cgscc(function(machine-function(reg-usage-propagation,phi-node-elimination,two-address-instruction,regallocfast,si-fix-vgpr-copies,remove-redundant-debug-values,fixup-statepoint-caller-saved,prolog-epilog,post-ra-pseudos,si-post-ra-bundler,fentry-insert,xray-instrumentation,patchable-function,si-memory-legalizer,si-insert-waitcnts,si-mode-register,si-late-branch-lowering,post-RA-hazard-rec,amdgpu-wait-sgpr-hazards,amdgpu-lower-vgpr-encoding,branch-relaxation))),require,cgscc(function(machine-function(reg-usage-collector,remove-loads-into-fake-uses,live-debug-values,machine-sanmd,stack-frame-layout,verify),free-machine-function))
+; GCN-O0: 
require,require,require,require,pre-isel-intrinsic-lowering,function(expand-large-div-rem,expand-fp),amdgpu-remove-incompatible-functions,amdgpu-printf-runtime-binding,amdgpu-lower-ctor-dtor,function(amdgpu-uniform
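
The patch above adds an empty default `addPostBBSections` hook to the builder, calls it through `derived()`, and overrides it in the AMDGPU builder. A minimal standalone sketch of that CRTP dispatch follows. `PassList`, `PipelineBuilderBase`, `GenericBuilder`, `AMDGPULikeBuilder`, and `buildPipeline` are hypothetical stand-ins, not the real LLVM classes, and passes are reduced to the names that appear in the test's check line.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for PassManagerWrapper: it only records pass names.
struct PassList {
  std::vector<std::string> Names;
};

// CRTP base that mirrors the shape of the hook in the patch.
template <typename DerivedT> class PipelineBuilderBase {
public:
  // Default hook: a target that does not override it adds nothing.
  void addPostBBSections(PassList &) const {}

  void addMachinePasses(PassList &PL) const {
    PL.Names.push_back("live-debug-values");
    PL.Names.push_back("machine-sanmd");
    // Static dispatch: picks the derived class's hook when it defines one,
    // otherwise falls back to the empty default above.
    derived().addPostBBSections(PL);
    PL.Names.push_back("stack-frame-layout");
  }

private:
  const DerivedT &derived() const {
    return static_cast<const DerivedT &>(*this);
  }
};

// A target that keeps the default (empty) hook.
struct GenericBuilder : PipelineBuilderBase<GenericBuilder> {};

// A target that overrides the hook, as the AMDGPU builder does in the patch.
struct AMDGPULikeBuilder : PipelineBuilderBase<AMDGPULikeBuilder> {
  void addPostBBSections(PassList &PL) const {
    PL.Names.push_back("amdgpu-preload-kern-arg-prolog");
  }
};

// Build the recorded pass list for a given builder type.
template <typename BuilderT> std::vector<std::string> buildPipeline() {
  PassList PL;
  BuilderT().addMachinePasses(PL);
  return PL.Names;
}
```

In this sketch, `buildPipeline<AMDGPULikeBuilder>()` places `amdgpu-preload-kern-arg-prolog` between `machine-sanmd` and `stack-frame-layout`, matching the position the pass takes in the updated `GCN-O0` check line, while `buildPipeline<GenericBuilder>()` is unchanged.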

[llvm-branch-commits] [llvm] [AMDGPU][NPM] Disable few non useful passes (PR #172796)

2025-12-18 Thread Vikram Hegde via llvm-branch-commits

https://github.com/vikramRH updated https://github.com/llvm/llvm-project/pull/172796

>From ef5f9507ae7795d273d14b991a1f36572f3c6328 Mon Sep 17 00:00:00 2001
From: vikhegde 
Date: Wed, 17 Dec 2025 15:28:47 +0530
Subject: [PATCH] [AMDGPU][NPM] Disable few non useful passes

---
 llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp | 4 ++--
 llvm/test/CodeGen/AMDGPU/llc-pipeline-npm.ll   | 6 +++---
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp b/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
index b21c107a026da..4a9853288d996 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
@@ -2097,8 +2097,8 @@ AMDGPUCodeGenPassBuilder::AMDGPUCodeGenPassBuilder(
   // Exceptions and StackMaps are not supported, so these passes will never do
   // anything.
   // Garbage collection is not supported.
-  disablePass();
+  disablePass();
 }
 
 void AMDGPUCodeGenPassBuilder::addIRPasses(PassManagerWrapper &PMW) const {
diff --git a/llvm/test/CodeGen/AMDGPU/llc-pipeline-npm.ll b/llvm/test/CodeGen/AMDGPU/llc-pipeline-npm.ll
index d4227d72c7c5a..1ae057d2c3bc0 100644
--- a/llvm/test/CodeGen/AMDGPU/llc-pipeline-npm.ll
+++ b/llvm/test/CodeGen/AMDGPU/llc-pipeline-npm.ll
@@ -9,11 +9,11 @@
 ; RUN:   | FileCheck -check-prefix=GCN-O3 %s
 
 
-; GCN-O0: 
require,require,require,require,pre-isel-intrinsic-lowering,function(expand-large-div-rem,expand-fp),amdgpu-remove-incompatible-functions,amdgpu-printf-runtime-binding,amdgpu-lower-ctor-dtor,function(amdgpu-uniform-intrinsic-combine),expand-variadics,amdgpu-always-inline,always-inline,amdgpu-export-kernel-runtime-handles,amdgpu-lower-exec-sync,amdgpu-sw-lower-lds,amdgpu-lower-module-lds,function(atomic-expand,verify,gc-lowering,unreachableblockelim,ee-instrument,scalarize-masked-mem-intrin,expand-reductions,amdgpu-lower-kernel-arguments),amdgpu-lower-buffer-fat-pointers,amdgpu-lower-intrinsics,cgscc(function(lower-switch,lower-invoke,unreachableblockelim)),require,cgscc(function(amdgpu-unify-divergent-exit-nodes,fix-irreducible,unify-loop-exits,StructurizeCFGPass,amdgpu-annotate-uniform,si-annotate-control-flow,amdgpu-rewrite-undef-for-phi,lcssa,require,callbr-prepare,safe-stack,stack-protector,verify)),cgscc(function(machine-function(amdgpu-isel,si-fix-sgpr-copies,si-i1-copies,finalize-isel,localstackalloc))),require,cgscc(function(machine-function(reg-usage-propagation,phi-node-elimination,two-address-instruction,regallocfast,si-fix-vgpr-copies,remove-redundant-debug-values,fixup-statepoint-caller-saved,prolog-epilog,post-ra-pseudos,si-post-ra-bundler,fentry-insert,xray-instrumentation,patchable-function,si-memory-legalizer,si-insert-waitcnts,si-mode-register,si-late-branch-lowering,post-RA-hazard-rec,amdgpu-wait-sgpr-hazards,amdgpu-lower-vgpr-encoding,branch-relaxation))),require,cgscc(function(machine-function(reg-usage-collector,remove-loads-into-fake-uses,live-debug-values,machine-sanmd,amdgpu-preload-kern-arg-prolog,stack-frame-layout,verify),free-machine-function))
+; GCN-O0: 
require,require,require,require,pre-isel-intrinsic-lowering,function(expand-large-div-rem,expand-fp),amdgpu-remove-incompatible-functions,amdgpu-printf-runtime-binding,amdgpu-lower-ctor-dtor,function(amdgpu-uniform-intrinsic-combine),expand-variadics,amdgpu-always-inline,always-inline,amdgpu-export-kernel-runtime-handles,amdgpu-lower-exec-sync,amdgpu-sw-lower-lds,amdgpu-lower-module-lds,function(atomic-expand,verify,unreachableblockelim,ee-instrument,scalarize-masked-mem-intrin,expand-reductions,amdgpu-lower-kernel-arguments),amdgpu-lower-buffer-fat-pointers,amdgpu-lower-intrinsics,cgscc(function(lower-switch,lower-invoke,unreachableblockelim)),require,cgscc(function(amdgpu-unify-divergent-exit-nodes,fix-irreducible,unify-loop-exits,StructurizeCFGPass,amdgpu-annotate-uniform,si-annotate-control-flow,amdgpu-rewrite-undef-for-phi,lcssa,require,callbr-prepare,safe-stack,stack-protector,verify)),cgscc(function(machine-function(amdgpu-isel,si-fix-sgpr-copies,si-i1-copies,finalize-isel,localstackalloc))),require,cgscc(function(machine-function(reg-usage-propagation,phi-node-elimination,two-address-instruction,regallocfast,si-fix-vgpr-copies,remove-redundant-debug-values,fixup-statepoint-caller-saved,prolog-epilog,post-ra-pseudos,si-post-ra-bundler,fentry-insert,xray-instrumentation,si-memory-legalizer,si-insert-waitcnts,si-mode-register,si-late-branch-lowering,post-RA-hazard-rec,amdgpu-wait-sgpr-hazards,amdgpu-lower-vgpr-encoding,branch-relaxation))),require,cgscc(function(machine-function(reg-usage-collector,remove-loads-into-fake-uses,live-debug-values,machine-sanmd,amdgpu-preload-kern-arg-prolog,stack-frame-layout,verify),free-machine-function))
 
-; GCN-O2: 
require,require,require,require,pre-isel-intrinsic-lowering,function(expand-large-div-rem,expand-fp),amdgpu-remove-incompatible-functions,amdgpu-printf-runtime-binding,amdgpu-lower-ctor-dtor,function(amdgpu-image-intrinsic-opt,amdgpu-

[llvm-branch-commits] [llvm] [NPM] Update OptimizedRegAlloc and MachineLateOptimization pipelines (PR #172795)

2025-12-18 Thread Vikram Hegde via llvm-branch-commits

https://github.com/vikramRH updated https://github.com/llvm/llvm-project/pull/172795

>From 82805bc6ac48375d4f18fd3810cb7e4a7a56 Mon Sep 17 00:00:00 2001
From: vikhegde 
Date: Wed, 17 Dec 2025 15:17:49 +0530
Subject: [PATCH] [NPM] Update OptimizedRegAlloc and MachineLateOptimization
 pipelines

---
 llvm/include/llvm/Passes/CodeGenPassBuilder.h | 9 ++---
 llvm/test/CodeGen/AMDGPU/llc-pipeline-npm.ll  | 4 ++--
 2 files changed, 8 insertions(+), 5 deletions(-)

diff --git a/llvm/include/llvm/Passes/CodeGenPassBuilder.h b/llvm/include/llvm/Passes/CodeGenPassBuilder.h
index 7151dceac4b79..7528ee9f251e5 100644
--- a/llvm/include/llvm/Passes/CodeGenPassBuilder.h
+++ b/llvm/include/llvm/Passes/CodeGenPassBuilder.h
@@ -1255,6 +1255,9 @@ void CodeGenPassBuilder::addOptimizedRegAlloc(
 // addRegAssignmentOptimized did not add a reg alloc pass, so do nothing.
 return;
   }
+
+  addMachineFunctionPass(StackSlotColoringPass(), PMW);
+
   // Allow targets to expand pseudo instructions depending on the choice of
   // registers before MachineCopyPropagation.
   derived().addPostRewrite(PMW);
@@ -1277,6 +1280,9 @@ void CodeGenPassBuilder::addOptimizedRegAlloc(
 template 
 void CodeGenPassBuilder::addMachineLateOptimization(
 PassManagerWrapper &PMW) const {
+  // Cleanup of redundant (identical) address/immediate loads.
+  addMachineFunctionPass(MachineLateInstrsCleanupPass(), PMW);
+
   // Branch folding must be run after regalloc and prolog/epilog insertion.
   addMachineFunctionPass(BranchFolderPass(Opt.EnableTailMerge), PMW);
 
@@ -1287,9 +1293,6 @@ void CodeGenPassBuilder::addMachineLateOptimization(
   if (!TM.requiresStructuredCFG())
 addMachineFunctionPass(TailDuplicatePass(), PMW);
 
-  // Cleanup of redundant (identical) address/immediate loads.
-  addMachineFunctionPass(MachineLateInstrsCleanupPass(), PMW);
-
   // Copy propagation.
   addMachineFunctionPass(MachineCopyPropagationPass(), PMW);
 }
diff --git a/llvm/test/CodeGen/AMDGPU/llc-pipeline-npm.ll b/llvm/test/CodeGen/AMDGPU/llc-pipeline-npm.ll
index 9b8f0c5f4ef0d..d4227d72c7c5a 100644
--- a/llvm/test/CodeGen/AMDGPU/llc-pipeline-npm.ll
+++ b/llvm/test/CodeGen/AMDGPU/llc-pipeline-npm.ll
@@ -11,9 +11,9 @@
 
 ; GCN-O0: 
require,require,require,require,pre-isel-intrinsic-lowering,function(expand-large-div-rem,expand-fp),amdgpu-remove-incompatible-functions,amdgpu-printf-runtime-binding,amdgpu-lower-ctor-dtor,function(amdgpu-uniform-intrinsic-combine),expand-variadics,amdgpu-always-inline,always-inline,amdgpu-export-kernel-runtime-handles,amdgpu-lower-exec-sync,amdgpu-sw-lower-lds,amdgpu-lower-module-lds,function(atomic-expand,verify,gc-lowering,unreachableblockelim,ee-instrument,scalarize-masked-mem-intrin,expand-reductions,amdgpu-lower-kernel-arguments),amdgpu-lower-buffer-fat-pointers,amdgpu-lower-intrinsics,cgscc(function(lower-switch,lower-invoke,unreachableblockelim)),require,cgscc(function(amdgpu-unify-divergent-exit-nodes,fix-irreducible,unify-loop-exits,StructurizeCFGPass,amdgpu-annotate-uniform,si-annotate-control-flow,amdgpu-rewrite-undef-for-phi,lcssa,require,callbr-prepare,safe-stack,stack-protector,verify)),cgscc(function(machine-function(amdgpu-isel,si-fix-sgpr-copies,si-i1-copies,finalize-isel,localstackalloc))),require,cgscc(function(machine-function(reg-usage-propagation,phi-node-elimination,two-address-instruction,regallocfast,si-fix-vgpr-copies,remove-redundant-debug-values,fixup-statepoint-caller-saved,prolog-epilog,post-ra-pseudos,si-post-ra-bundler,fentry-insert,xray-instrumentation,patchable-function,si-memory-legalizer,si-insert-waitcnts,si-mode-register,si-late-branch-lowering,post-RA-hazard-rec,amdgpu-wait-sgpr-hazards,amdgpu-lower-vgpr-encoding,branch-relaxation))),require,cgscc(function(machine-function(reg-usage-collector,remove-loads-into-fake-uses,live-debug-values,machine-sanmd,amdgpu-preload-kern-arg-prolog,stack-frame-layout,verify),free-machine-function))
 
-; GCN-O2: 
require,require,require,require,pre-isel-intrinsic-lowering,function(expand-large-div-rem,expand-fp),amdgpu-remove-incompatible-functions,amdgpu-printf-runtime-binding,amdgpu-lower-ctor-dtor,function(amdgpu-image-intrinsic-opt,amdgpu-uniform-intrinsic-combine),expand-variadics,amdgpu-always-inline,always-inline,amdgpu-export-kernel-runtime-handles,amdgpu-lower-exec-sync,amdgpu-sw-lower-lds,amdgpu-lower-module-lds,function(amdgpu-atomic-optimizer,atomic-expand,amdgpu-promote-alloca,separate-const-offset-from-gep<>,slsr,early-cse<>,nary-reassociate,early-cse<>,amdgpu-codegenprepare,loop-mssa(licm),verify,loop-mssa(canon-freeze,loop-reduce),mergeicmps,expand-memcmp,gc-lowering,unreachableblockelim,consthoist,replace-with-veclib,partially-inline-libcalls,ee-instrument,scalarize-masked-mem-intrin,expand-reductions,early-cse<>),amdgpu-preload-kernel-arguments,function(amdgpu-lower-kernel-arguments,codegenprepare,load-store-vectorizer),amdgpu-lower-buffer-fat-pointers,amdgpu-lower-intrinsics,cgscc(function(lower-switch,lower-invoke,unreacha
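
The `addMachineLateOptimization` hunks above move `MachineLateInstrsCleanupPass` from after tail duplication to the front of the function. A small sketch of the resulting order is given below, using only the pass class names visible in the hunks as plain strings. The function `lateOptimizationOrder` is hypothetical, and any passes outside the shown hunks are not modeled.

```cpp
#include <string>
#include <vector>

// Order of the passes visible in the patch's hunks, after the change.
// Hypothetical illustration; the real builder adds pass objects, not strings.
std::vector<std::string> lateOptimizationOrder(bool RequiresStructuredCFG) {
  std::vector<std::string> Order;
  // Moved to the front by the patch (previously after tail duplication).
  Order.push_back("MachineLateInstrsCleanupPass");
  // Branch folding must run after regalloc and prolog/epilog insertion.
  Order.push_back("BranchFolderPass");
  // Tail duplication is skipped for targets that require a structured CFG.
  if (!RequiresStructuredCFG)
    Order.push_back("TailDuplicatePass");
  Order.push_back("MachineCopyPropagationPass");
  return Order;
}
```

For a target that requires a structured CFG, the sketch yields three entries with the cleanup pass first; otherwise `TailDuplicatePass` appears between branch folding and copy propagation.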

[llvm-branch-commits] [libc] [libc] Add `IN6_IS_ADDR_V4MAPPED` (PR #172645)

2025-12-18 Thread Connector Switch via llvm-branch-commits

https://github.com/c8ef edited https://github.com/llvm/llvm-project/pull/172645


[llvm-branch-commits] [libc] [libc] Add `IN6_IS_ADDR_V4COMPAT` (PR #172646)

2025-12-18 Thread Connector Switch via llvm-branch-commits

https://github.com/c8ef edited https://github.com/llvm/llvm-project/pull/172646


[llvm-branch-commits] [libc] [libc] Add `IN6_IS_ADDR_MULTICAST` (PR #172498)

2025-12-18 Thread Connector Switch via llvm-branch-commits

https://github.com/c8ef edited https://github.com/llvm/llvm-project/pull/172498


[llvm-branch-commits] [libc] [libc] Add `IN6_IS_ADDR_MC*` (PR #172643)

2025-12-18 Thread Connector Switch via llvm-branch-commits

https://github.com/c8ef edited https://github.com/llvm/llvm-project/pull/172643


[llvm-branch-commits] [llvm] [AMDGPU][NPM] Disable few non useful passes (PR #172796)

2025-12-18 Thread Vikram Hegde via llvm-branch-commits

https://github.com/vikramRH updated https://github.com/llvm/llvm-project/pull/172796

>From 9595ea731e7eb243cc1777e33fb0732d68bd4fa5 Mon Sep 17 00:00:00 2001
From: vikhegde 
Date: Wed, 17 Dec 2025 15:28:47 +0530
Subject: [PATCH] [AMDGPU][NPM] Disable few non useful passes

---
 llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp | 4 ++--
 llvm/test/CodeGen/AMDGPU/llc-pipeline-npm.ll   | 6 +++---
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp b/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
index b21c107a026da..4a9853288d996 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
@@ -2097,8 +2097,8 @@ AMDGPUCodeGenPassBuilder::AMDGPUCodeGenPassBuilder(
   // Exceptions and StackMaps are not supported, so these passes will never do
   // anything.
   // Garbage collection is not supported.
-  disablePass();
+  disablePass();
 }
 
 void AMDGPUCodeGenPassBuilder::addIRPasses(PassManagerWrapper &PMW) const {
diff --git a/llvm/test/CodeGen/AMDGPU/llc-pipeline-npm.ll b/llvm/test/CodeGen/AMDGPU/llc-pipeline-npm.ll
index d4227d72c7c5a..1ae057d2c3bc0 100644
--- a/llvm/test/CodeGen/AMDGPU/llc-pipeline-npm.ll
+++ b/llvm/test/CodeGen/AMDGPU/llc-pipeline-npm.ll
@@ -9,11 +9,11 @@
 ; RUN:   | FileCheck -check-prefix=GCN-O3 %s
 
 
-; GCN-O0: 
require,require,require,require,pre-isel-intrinsic-lowering,function(expand-large-div-rem,expand-fp),amdgpu-remove-incompatible-functions,amdgpu-printf-runtime-binding,amdgpu-lower-ctor-dtor,function(amdgpu-uniform-intrinsic-combine),expand-variadics,amdgpu-always-inline,always-inline,amdgpu-export-kernel-runtime-handles,amdgpu-lower-exec-sync,amdgpu-sw-lower-lds,amdgpu-lower-module-lds,function(atomic-expand,verify,gc-lowering,unreachableblockelim,ee-instrument,scalarize-masked-mem-intrin,expand-reductions,amdgpu-lower-kernel-arguments),amdgpu-lower-buffer-fat-pointers,amdgpu-lower-intrinsics,cgscc(function(lower-switch,lower-invoke,unreachableblockelim)),require,cgscc(function(amdgpu-unify-divergent-exit-nodes,fix-irreducible,unify-loop-exits,StructurizeCFGPass,amdgpu-annotate-uniform,si-annotate-control-flow,amdgpu-rewrite-undef-for-phi,lcssa,require,callbr-prepare,safe-stack,stack-protector,verify)),cgscc(function(machine-function(amdgpu-isel,si-fix-sgpr-copies,si-i1-copies,finalize-isel,localstackalloc))),require,cgscc(function(machine-function(reg-usage-propagation,phi-node-elimination,two-address-instruction,regallocfast,si-fix-vgpr-copies,remove-redundant-debug-values,fixup-statepoint-caller-saved,prolog-epilog,post-ra-pseudos,si-post-ra-bundler,fentry-insert,xray-instrumentation,patchable-function,si-memory-legalizer,si-insert-waitcnts,si-mode-register,si-late-branch-lowering,post-RA-hazard-rec,amdgpu-wait-sgpr-hazards,amdgpu-lower-vgpr-encoding,branch-relaxation))),require,cgscc(function(machine-function(reg-usage-collector,remove-loads-into-fake-uses,live-debug-values,machine-sanmd,amdgpu-preload-kern-arg-prolog,stack-frame-layout,verify),free-machine-function))
+; GCN-O0: 
require,require,require,require,pre-isel-intrinsic-lowering,function(expand-large-div-rem,expand-fp),amdgpu-remove-incompatible-functions,amdgpu-printf-runtime-binding,amdgpu-lower-ctor-dtor,function(amdgpu-uniform-intrinsic-combine),expand-variadics,amdgpu-always-inline,always-inline,amdgpu-export-kernel-runtime-handles,amdgpu-lower-exec-sync,amdgpu-sw-lower-lds,amdgpu-lower-module-lds,function(atomic-expand,verify,unreachableblockelim,ee-instrument,scalarize-masked-mem-intrin,expand-reductions,amdgpu-lower-kernel-arguments),amdgpu-lower-buffer-fat-pointers,amdgpu-lower-intrinsics,cgscc(function(lower-switch,lower-invoke,unreachableblockelim)),require,cgscc(function(amdgpu-unify-divergent-exit-nodes,fix-irreducible,unify-loop-exits,StructurizeCFGPass,amdgpu-annotate-uniform,si-annotate-control-flow,amdgpu-rewrite-undef-for-phi,lcssa,require,callbr-prepare,safe-stack,stack-protector,verify)),cgscc(function(machine-function(amdgpu-isel,si-fix-sgpr-copies,si-i1-copies,finalize-isel,localstackalloc))),require,cgscc(function(machine-function(reg-usage-propagation,phi-node-elimination,two-address-instruction,regallocfast,si-fix-vgpr-copies,remove-redundant-debug-values,fixup-statepoint-caller-saved,prolog-epilog,post-ra-pseudos,si-post-ra-bundler,fentry-insert,xray-instrumentation,si-memory-legalizer,si-insert-waitcnts,si-mode-register,si-late-branch-lowering,post-RA-hazard-rec,amdgpu-wait-sgpr-hazards,amdgpu-lower-vgpr-encoding,branch-relaxation))),require,cgscc(function(machine-function(reg-usage-collector,remove-loads-into-fake-uses,live-debug-values,machine-sanmd,amdgpu-preload-kern-arg-prolog,stack-frame-layout,verify),free-machine-function))
 
-; GCN-O2: 
require,require,require,require,pre-isel-intrinsic-lowering,function(expand-large-div-rem,expand-fp),amdgpu-remove-incompatible-functions,amdgpu-printf-runtime-binding,amdgpu-lower-ctor-dtor,function(amdgpu-image-intrinsic-opt,amdgpu-

[llvm-branch-commits] [llvm] [CodeGen][NPM] Remove "LowerConstantIntrinsicsPass" from the pipeline (PR #172794)

2025-12-18 Thread Vikram Hegde via llvm-branch-commits

https://github.com/vikramRH updated https://github.com/llvm/llvm-project/pull/172794

>From 5a8d3b2edb0e61d6cafde1d3b84d35ce09aeb386 Mon Sep 17 00:00:00 2001
From: vikhegde 
Date: Wed, 17 Dec 2025 15:00:27 +0530
Subject: [PATCH] [NPM] Remove "LowerConstantIntrinsicsPass" from the pipeline

---
 llvm/include/llvm/Passes/CodeGenPassBuilder.h | 1 -
 llvm/test/CodeGen/AMDGPU/llc-pipeline-npm.ll  | 6 +++---
 2 files changed, 3 insertions(+), 4 deletions(-)

diff --git a/llvm/include/llvm/Passes/CodeGenPassBuilder.h b/llvm/include/llvm/Passes/CodeGenPassBuilder.h
index d7ce17bebbf5d..7151dceac4b79 100644
--- a/llvm/include/llvm/Passes/CodeGenPassBuilder.h
+++ b/llvm/include/llvm/Passes/CodeGenPassBuilder.h
@@ -730,7 +730,6 @@ void CodeGenPassBuilder::addIRPasses(
 flushFPMsToMPM(PMW);
 addModulePass(ShadowStackGCLoweringPass(), PMW);
   }
-  addFunctionPass(LowerConstantIntrinsicsPass(), PMW);
 
   // Make sure that no unreachable blocks are instruction selected.
   addFunctionPass(UnreachableBlockElimPass(), PMW);
diff --git a/llvm/test/CodeGen/AMDGPU/llc-pipeline-npm.ll b/llvm/test/CodeGen/AMDGPU/llc-pipeline-npm.ll
index 0383f0613b71d..9b8f0c5f4ef0d 100644
--- a/llvm/test/CodeGen/AMDGPU/llc-pipeline-npm.ll
+++ b/llvm/test/CodeGen/AMDGPU/llc-pipeline-npm.ll
@@ -9,11 +9,11 @@
 ; RUN:   | FileCheck -check-prefix=GCN-O3 %s
 
 
-; GCN-O0: 
require,require,require,require,pre-isel-intrinsic-lowering,function(expand-large-div-rem,expand-fp),amdgpu-remove-incompatible-functions,amdgpu-printf-runtime-binding,amdgpu-lower-ctor-dtor,function(amdgpu-uniform-intrinsic-combine),expand-variadics,amdgpu-always-inline,always-inline,amdgpu-export-kernel-runtime-handles,amdgpu-lower-exec-sync,amdgpu-sw-lower-lds,amdgpu-lower-module-lds,function(atomic-expand,verify,gc-lowering,lower-constant-intrinsics,unreachableblockelim,ee-instrument,scalarize-masked-mem-intrin,expand-reductions,amdgpu-lower-kernel-arguments),amdgpu-lower-buffer-fat-pointers,amdgpu-lower-intrinsics,cgscc(function(lower-switch,lower-invoke,unreachableblockelim)),require,cgscc(function(amdgpu-unify-divergent-exit-nodes,fix-irreducible,unify-loop-exits,StructurizeCFGPass,amdgpu-annotate-uniform,si-annotate-control-flow,amdgpu-rewrite-undef-for-phi,lcssa,require,callbr-prepare,safe-stack,stack-protector,verify)),cgscc(function(machine-function(amdgpu-isel,si-fix-sgpr-copies,si-i1-copies,finalize-isel,localstackalloc))),require,cgscc(function(machine-function(reg-usage-propagation,phi-node-elimination,two-address-instruction,regallocfast,si-fix-vgpr-copies,remove-redundant-debug-values,fixup-statepoint-caller-saved,prolog-epilog,post-ra-pseudos,si-post-ra-bundler,fentry-insert,xray-instrumentation,patchable-function,si-memory-legalizer,si-insert-waitcnts,si-mode-register,si-late-branch-lowering,post-RA-hazard-rec,amdgpu-wait-sgpr-hazards,amdgpu-lower-vgpr-encoding,branch-relaxation))),require,cgscc(function(machine-function(reg-usage-collector,remove-loads-into-fake-uses,live-debug-values,machine-sanmd,amdgpu-preload-kern-arg-prolog,stack-frame-layout,verify),free-machine-function))
+; GCN-O0: 
require,require,require,require,pre-isel-intrinsic-lowering,function(expand-large-div-rem,expand-fp),amdgpu-remove-incompatible-functions,amdgpu-printf-runtime-binding,amdgpu-lower-ctor-dtor,function(amdgpu-uniform-intrinsic-combine),expand-variadics,amdgpu-always-inline,always-inline,amdgpu-export-kernel-runtime-handles,amdgpu-lower-exec-sync,amdgpu-sw-lower-lds,amdgpu-lower-module-lds,function(atomic-expand,verify,gc-lowering,unreachableblockelim,ee-instrument,scalarize-masked-mem-intrin,expand-reductions,amdgpu-lower-kernel-arguments),amdgpu-lower-buffer-fat-pointers,amdgpu-lower-intrinsics,cgscc(function(lower-switch,lower-invoke,unreachableblockelim)),require,cgscc(function(amdgpu-unify-divergent-exit-nodes,fix-irreducible,unify-loop-exits,StructurizeCFGPass,amdgpu-annotate-uniform,si-annotate-control-flow,amdgpu-rewrite-undef-for-phi,lcssa,require,callbr-prepare,safe-stack,stack-protector,verify)),cgscc(function(machine-function(amdgpu-isel,si-fix-sgpr-copies,si-i1-copies,finalize-isel,localstackalloc))),require,cgscc(function(machine-function(reg-usage-propagation,phi-node-elimination,two-address-instruction,regallocfast,si-fix-vgpr-copies,remove-redundant-debug-values,fixup-statepoint-caller-saved,prolog-epilog,post-ra-pseudos,si-post-ra-bundler,fentry-insert,xray-instrumentation,patchable-function,si-memory-legalizer,si-insert-waitcnts,si-mode-register,si-late-branch-lowering,post-RA-hazard-rec,amdgpu-wait-sgpr-hazards,amdgpu-lower-vgpr-encoding,branch-relaxation))),require,cgscc(function(machine-function(reg-usage-collector,remove-loads-into-fake-uses,live-debug-values,machine-sanmd,amdgpu-preload-kern-arg-prolog,stack-frame-layout,verify),free-machine-function))
 
-; GCN-O2: 
require,require,require,require,pre-isel-intrinsic-lowering,function(expand-large-div-rem,expand-fp),amdgpu-remove-incompatible-functions,amdgpu-printf-runtime-binding,amdgpu-lo

[llvm-branch-commits] [llvm] [AMDGPU][NPM] add "addPostBBSections()" to NPM (PR #172793)

2025-12-18 Thread Vikram Hegde via llvm-branch-commits

https://github.com/vikramRH updated https://github.com/llvm/llvm-project/pull/172793

>From 867d24508715755011000841ce6f2a5b24bc8f3e Mon Sep 17 00:00:00 2001
From: vikhegde 
Date: Wed, 17 Dec 2025 14:51:08 +0530
Subject: [PATCH] [AMDGPU][NPM] add "addPostBBSections()" to NPM

---
 llvm/include/llvm/Passes/CodeGenPassBuilder.h  | 4 
 llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp | 8 
 llvm/test/CodeGen/AMDGPU/llc-pipeline-npm.ll   | 6 +++---
 3 files changed, 15 insertions(+), 3 deletions(-)

diff --git a/llvm/include/llvm/Passes/CodeGenPassBuilder.h b/llvm/include/llvm/Passes/CodeGenPassBuilder.h
index f47537d109671..d7ce17bebbf5d 100644
--- a/llvm/include/llvm/Passes/CodeGenPassBuilder.h
+++ b/llvm/include/llvm/Passes/CodeGenPassBuilder.h
@@ -486,6 +486,8 @@ template <typename DerivedT, typename TargetMachineT> class CodeGenPassBuilder {
   /// Add standard basic block placement passes.
   void addBlockPlacement(PassManagerWrapper &PMW) const;
 
+  void addPostBBSections(PassManagerWrapper &PMW) const {}
+
   using CreateMCStreamer =
       std::function<Expected<std::unique_ptr<MCStreamer>>(MCContext &)>;
   void addAsmPrinter(PassManagerWrapper &PMW, CreateMCStreamer) const {
@@ -1063,6 +1065,8 @@ Error CodeGenPassBuilder<Derived, TargetMachineT>::addMachinePasses(
 }
   }
 
+  derived().addPostBBSections(PMW);
+
   addMachineFunctionPass(StackFrameLayoutAnalysisPass(), PMW);
 
   // Add passes that directly emit MI after all other MI passes.
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp b/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
index 39b8bb77a9f20..b21c107a026da 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
@@ -157,6 +157,7 @@ class AMDGPUCodeGenPassBuilder
   void addPreRegAlloc(PassManagerWrapper &PMW) const;
   void addOptimizedRegAlloc(PassManagerWrapper &PMW) const;
   void addPreSched2(PassManagerWrapper &PMW) const;
+  void addPostBBSections(PassManagerWrapper &PMW) const;
 
   /// Check if a pass is enabled given \p Opt option. The option always
   /// overrides defaults if explicitly used. Otherwise its default will be used
@@ -2403,6 +2404,13 @@ void AMDGPUCodeGenPassBuilder::addPreSched2(PassManagerWrapper &PMW) const {
   addMachineFunctionPass(SIPostRABundlerPass(), PMW);
 }
 
+void AMDGPUCodeGenPassBuilder::addPostBBSections(
+PassManagerWrapper &PMW) const {
+  // We run this later to avoid passes like livedebugvalues and BBSections
+  // having to deal with the apparent multi-entry functions we may generate.
+  addMachineFunctionPass(AMDGPUPreloadKernArgPrologPass(), PMW);
+}
+
 void AMDGPUCodeGenPassBuilder::addPreEmitPass(PassManagerWrapper &PMW) const {
   if (isPassEnabled(EnableVOPD, CodeGenOptLevel::Less)) {
 addMachineFunctionPass(GCNCreateVOPDPass(), PMW);
diff --git a/llvm/test/CodeGen/AMDGPU/llc-pipeline-npm.ll b/llvm/test/CodeGen/AMDGPU/llc-pipeline-npm.ll
index c3dc26f3e10e4..0383f0613b71d 100644
--- a/llvm/test/CodeGen/AMDGPU/llc-pipeline-npm.ll
+++ b/llvm/test/CodeGen/AMDGPU/llc-pipeline-npm.ll
@@ -9,11 +9,11 @@
 ; RUN:   | FileCheck -check-prefix=GCN-O3 %s
 
 
-; GCN-O0: 
require,require,require,require,pre-isel-intrinsic-lowering,function(expand-large-div-rem,expand-fp),amdgpu-remove-incompatible-functions,amdgpu-printf-runtime-binding,amdgpu-lower-ctor-dtor,function(amdgpu-uniform-intrinsic-combine),expand-variadics,amdgpu-always-inline,always-inline,amdgpu-export-kernel-runtime-handles,amdgpu-lower-exec-sync,amdgpu-sw-lower-lds,amdgpu-lower-module-lds,function(atomic-expand,verify,gc-lowering,lower-constant-intrinsics,unreachableblockelim,ee-instrument,scalarize-masked-mem-intrin,expand-reductions,amdgpu-lower-kernel-arguments),amdgpu-lower-buffer-fat-pointers,amdgpu-lower-intrinsics,cgscc(function(lower-switch,lower-invoke,unreachableblockelim)),require,cgscc(function(amdgpu-unify-divergent-exit-nodes,fix-irreducible,unify-loop-exits,StructurizeCFGPass,amdgpu-annotate-uniform,si-annotate-control-flow,amdgpu-rewrite-undef-for-phi,lcssa,require,callbr-prepare,safe-stack,stack-protector,verify)),cgscc(function(machine-function(amdgpu-isel,si-fix-sgpr-copies,si-i1-copies,finalize-isel,localstackalloc))),require,cgscc(function(machine-function(reg-usage-propagation,phi-node-elimination,two-address-instruction,regallocfast,si-fix-vgpr-copies,remove-redundant-debug-values,fixup-statepoint-caller-saved,prolog-epilog,post-ra-pseudos,si-post-ra-bundler,fentry-insert,xray-instrumentation,patchable-function,si-memory-legalizer,si-insert-waitcnts,si-mode-register,si-late-branch-lowering,post-RA-hazard-rec,amdgpu-wait-sgpr-hazards,amdgpu-lower-vgpr-encoding,branch-relaxation))),require,cgscc(function(machine-function(reg-usage-collector,remove-loads-into-fake-uses,live-debug-values,machine-sanmd,stack-frame-layout,verify),free-machine-function))
+; GCN-O0: 
require,require,require,require,pre-isel-intrinsic-lowering,function(expand-large-div-rem,expand-fp),amdgpu-remove-incompatible-functions,amdgpu-printf-runtime-binding,amdgpu-lower-ctor-dtor,function(amdgpu-uniform
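
The hook added by this patch relies on the CRTP dispatch that CodeGenPassBuilder already uses: the base class provides an empty addPostBBSections() and calls it through derived(), so a target that declares its own version is picked up at compile time. A minimal, self-contained sketch of that pattern follows. PassList, PipelineBuilderBase, GenericBuilder, GpuBuilder and buildPipeline are hypothetical stand-ins, not LLVM APIs; the pass names are the ones that appear around the new pass in the GCN-O0 check line.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for PassManagerWrapper: it only records pass names.
struct PassList {
  std::vector<std::string> Names;
  void add(const std::string &Name) { Names.push_back(Name); }
};

// CRTP base, loosely modeled on the shape of CodeGenPassBuilder (assumption).
template <typename DerivedT> class PipelineBuilderBase {
public:
  // Default hook: targets that need nothing at this point inherit a no-op.
  void addPostBBSections(PassList &) const {}

  void addMachinePasses(PassList &PL) const {
    PL.add("live-debug-values");
    PL.add("machine-sanmd");
    // Static dispatch: a hook declared in DerivedT hides the no-op above.
    derived().addPostBBSections(PL);
    PL.add("stack-frame-layout");
  }

protected:
  const DerivedT &derived() const {
    return static_cast<const DerivedT &>(*this);
  }
};

// A target that does not override the hook.
struct GenericBuilder : PipelineBuilderBase<GenericBuilder> {};

// A target that schedules one extra pass at the hook point.
struct GpuBuilder : PipelineBuilderBase<GpuBuilder> {
  void addPostBBSections(PassList &PL) const {
    PL.add("amdgpu-preload-kern-arg-prolog");
  }
};

// Builds the pass list for a given builder type and returns the names.
template <typename BuilderT> std::vector<std::string> buildPipeline() {
  PassList PL;
  BuilderT().addMachinePasses(PL);
  assert(!PL.Names.empty() && "addMachinePasses always adds passes");
  return PL.Names;
}
```

Calling buildPipeline<GpuBuilder>() yields the extra pass between machine-sanmd and stack-frame-layout, while buildPipeline<GenericBuilder>() leaves the list unchanged, which is the effect the test update in this patch checks for.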

[llvm-branch-commits] [llvm] [NPM] Update OptimizedRegAlloc and MachineLateOptimization pipelines (PR #172795)

2025-12-18 Thread Vikram Hegde via llvm-branch-commits

https://github.com/vikramRH updated https://github.com/llvm/llvm-project/pull/172795

>From 5e0c6385e708a7249d7bda256010890acd2c1c6c Mon Sep 17 00:00:00 2001
From: vikhegde 
Date: Wed, 17 Dec 2025 15:17:49 +0530
Subject: [PATCH] [NPM] Update OptimizedRegAlloc and MachineLateOptimization
 pipelines

---
 llvm/include/llvm/Passes/CodeGenPassBuilder.h | 9 ++---
 llvm/test/CodeGen/AMDGPU/llc-pipeline-npm.ll  | 4 ++--
 2 files changed, 8 insertions(+), 5 deletions(-)

diff --git a/llvm/include/llvm/Passes/CodeGenPassBuilder.h b/llvm/include/llvm/Passes/CodeGenPassBuilder.h
index 7151dceac4b79..7528ee9f251e5 100644
--- a/llvm/include/llvm/Passes/CodeGenPassBuilder.h
+++ b/llvm/include/llvm/Passes/CodeGenPassBuilder.h
@@ -1255,6 +1255,9 @@ void CodeGenPassBuilder<Derived, TargetMachineT>::addOptimizedRegAlloc(
 // addRegAssignmentOptimized did not add a reg alloc pass, so do nothing.
 return;
   }
+
+  addMachineFunctionPass(StackSlotColoringPass(), PMW);
+
   // Allow targets to expand pseudo instructions depending on the choice of
   // registers before MachineCopyPropagation.
   derived().addPostRewrite(PMW);
@@ -1277,6 +1280,9 @@ void CodeGenPassBuilder<Derived, TargetMachineT>::addOptimizedRegAlloc(
 template <typename Derived, typename TargetMachineT>
 void CodeGenPassBuilder<Derived, TargetMachineT>::addMachineLateOptimization(
     PassManagerWrapper &PMW) const {
+  // Cleanup of redundant (identical) address/immediate loads.
+  addMachineFunctionPass(MachineLateInstrsCleanupPass(), PMW);
+
   // Branch folding must be run after regalloc and prolog/epilog insertion.
   addMachineFunctionPass(BranchFolderPass(Opt.EnableTailMerge), PMW);
 
@@ -1287,9 +1293,6 @@ void CodeGenPassBuilder<Derived, TargetMachineT>::addMachineLateOptimization(
   if (!TM.requiresStructuredCFG())
 addMachineFunctionPass(TailDuplicatePass(), PMW);
 
-  // Cleanup of redundant (identical) address/immediate loads.
-  addMachineFunctionPass(MachineLateInstrsCleanupPass(), PMW);
-
   // Copy propagation.
   addMachineFunctionPass(MachineCopyPropagationPass(), PMW);
 }
diff --git a/llvm/test/CodeGen/AMDGPU/llc-pipeline-npm.ll b/llvm/test/CodeGen/AMDGPU/llc-pipeline-npm.ll
index 9b8f0c5f4ef0d..d4227d72c7c5a 100644
--- a/llvm/test/CodeGen/AMDGPU/llc-pipeline-npm.ll
+++ b/llvm/test/CodeGen/AMDGPU/llc-pipeline-npm.ll
@@ -11,9 +11,9 @@
 
 ; GCN-O0: 
require,require,require,require,pre-isel-intrinsic-lowering,function(expand-large-div-rem,expand-fp),amdgpu-remove-incompatible-functions,amdgpu-printf-runtime-binding,amdgpu-lower-ctor-dtor,function(amdgpu-uniform-intrinsic-combine),expand-variadics,amdgpu-always-inline,always-inline,amdgpu-export-kernel-runtime-handles,amdgpu-lower-exec-sync,amdgpu-sw-lower-lds,amdgpu-lower-module-lds,function(atomic-expand,verify,gc-lowering,unreachableblockelim,ee-instrument,scalarize-masked-mem-intrin,expand-reductions,amdgpu-lower-kernel-arguments),amdgpu-lower-buffer-fat-pointers,amdgpu-lower-intrinsics,cgscc(function(lower-switch,lower-invoke,unreachableblockelim)),require,cgscc(function(amdgpu-unify-divergent-exit-nodes,fix-irreducible,unify-loop-exits,StructurizeCFGPass,amdgpu-annotate-uniform,si-annotate-control-flow,amdgpu-rewrite-undef-for-phi,lcssa,require,callbr-prepare,safe-stack,stack-protector,verify)),cgscc(function(machine-function(amdgpu-isel,si-fix-sgpr-copies,si-i1-copies,finalize-isel,localstackalloc))),require,cgscc(function(machine-function(reg-usage-propagation,phi-node-elimination,two-address-instruction,regallocfast,si-fix-vgpr-copies,remove-redundant-debug-values,fixup-statepoint-caller-saved,prolog-epilog,post-ra-pseudos,si-post-ra-bundler,fentry-insert,xray-instrumentation,patchable-function,si-memory-legalizer,si-insert-waitcnts,si-mode-register,si-late-branch-lowering,post-RA-hazard-rec,amdgpu-wait-sgpr-hazards,amdgpu-lower-vgpr-encoding,branch-relaxation))),require,cgscc(function(machine-function(reg-usage-collector,remove-loads-into-fake-uses,live-debug-values,machine-sanmd,amdgpu-preload-kern-arg-prolog,stack-frame-layout,verify),free-machine-function))
 
-; GCN-O2: 
require,require,require,require,pre-isel-intrinsic-lowering,function(expand-large-div-rem,expand-fp),amdgpu-remove-incompatible-functions,amdgpu-printf-runtime-binding,amdgpu-lower-ctor-dtor,function(amdgpu-image-intrinsic-opt,amdgpu-uniform-intrinsic-combine),expand-variadics,amdgpu-always-inline,always-inline,amdgpu-export-kernel-runtime-handles,amdgpu-lower-exec-sync,amdgpu-sw-lower-lds,amdgpu-lower-module-lds,function(amdgpu-atomic-optimizer,atomic-expand,amdgpu-promote-alloca,separate-const-offset-from-gep<>,slsr,early-cse<>,nary-reassociate,early-cse<>,amdgpu-codegenprepare,loop-mssa(licm),verify,loop-mssa(canon-freeze,loop-reduce),mergeicmps,expand-memcmp,gc-lowering,unreachableblockelim,consthoist,replace-with-veclib,partially-inline-libcalls,ee-instrument,scalarize-masked-mem-intrin,expand-reductions,early-cse<>),amdgpu-preload-kernel-arguments,function(amdgpu-lower-kernel-arguments,codegenprepare,load-store-vectorizer),amdgpu-lower-buffer-fat-pointers,amdgpu-lower-intrinsics,cgscc(function(lower-switch,lower-invoke,unreacha
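
To make the reordering in addMachineLateOptimization() easier to see, here is a small sketch that lists only the passes visible in the hunks above, in their post-patch order. lateOptimizationOrder is a hypothetical helper, and the strings are simply the pass class names from the diff; anything the hunks elide is left out.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: returns the post-patch order of the passes that are
// visible in the addMachineLateOptimization() hunks. Not an LLVM API.
std::vector<std::string> lateOptimizationOrder(bool RequiresStructuredCFG) {
  std::vector<std::string> Order;
  // Moved to the front by the patch (previously ran after tail duplication).
  Order.push_back("MachineLateInstrsCleanupPass");
  Order.push_back("BranchFolderPass");
  // Tail duplication is skipped for targets that require a structured CFG.
  if (!RequiresStructuredCFG)
    Order.push_back("TailDuplicatePass");
  Order.push_back("MachineCopyPropagationPass");
  assert(Order.front() == "MachineLateInstrsCleanupPass");
  return Order;
}
```

With RequiresStructuredCFG set to true the list has three entries, otherwise four; in both cases the late-instruction cleanup now comes before branch folding.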

[llvm-branch-commits] [llvm] [AMDGPU][NPM] Disable few non useful passes (PR #172796)

2025-12-18 Thread Vikram Hegde via llvm-branch-commits

https://github.com/vikramRH updated https://github.com/llvm/llvm-project/pull/172796

>From 9595ea731e7eb243cc1777e33fb0732d68bd4fa5 Mon Sep 17 00:00:00 2001
From: vikhegde 
Date: Wed, 17 Dec 2025 15:28:47 +0530
Subject: [PATCH] [AMDGPU][NPM] Disable few non useful passes

---
 llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp | 4 ++--
 llvm/test/CodeGen/AMDGPU/llc-pipeline-npm.ll   | 6 +++---
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp b/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
index b21c107a026da..4a9853288d996 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
@@ -2097,8 +2097,8 @@ AMDGPUCodeGenPassBuilder::AMDGPUCodeGenPassBuilder(
   // Exceptions and StackMaps are not supported, so these passes will never do
   // anything.
   // Garbage collection is not supported.
-  disablePass();
+  disablePass();
 }
 
 void AMDGPUCodeGenPassBuilder::addIRPasses(PassManagerWrapper &PMW) const {
diff --git a/llvm/test/CodeGen/AMDGPU/llc-pipeline-npm.ll b/llvm/test/CodeGen/AMDGPU/llc-pipeline-npm.ll
index d4227d72c7c5a..1ae057d2c3bc0 100644
--- a/llvm/test/CodeGen/AMDGPU/llc-pipeline-npm.ll
+++ b/llvm/test/CodeGen/AMDGPU/llc-pipeline-npm.ll
@@ -9,11 +9,11 @@
 ; RUN:   | FileCheck -check-prefix=GCN-O3 %s
 
 
-; GCN-O0: 
require,require,require,require,pre-isel-intrinsic-lowering,function(expand-large-div-rem,expand-fp),amdgpu-remove-incompatible-functions,amdgpu-printf-runtime-binding,amdgpu-lower-ctor-dtor,function(amdgpu-uniform-intrinsic-combine),expand-variadics,amdgpu-always-inline,always-inline,amdgpu-export-kernel-runtime-handles,amdgpu-lower-exec-sync,amdgpu-sw-lower-lds,amdgpu-lower-module-lds,function(atomic-expand,verify,gc-lowering,unreachableblockelim,ee-instrument,scalarize-masked-mem-intrin,expand-reductions,amdgpu-lower-kernel-arguments),amdgpu-lower-buffer-fat-pointers,amdgpu-lower-intrinsics,cgscc(function(lower-switch,lower-invoke,unreachableblockelim)),require,cgscc(function(amdgpu-unify-divergent-exit-nodes,fix-irreducible,unify-loop-exits,StructurizeCFGPass,amdgpu-annotate-uniform,si-annotate-control-flow,amdgpu-rewrite-undef-for-phi,lcssa,require,callbr-prepare,safe-stack,stack-protector,verify)),cgscc(function(machine-function(amdgpu-isel,si-fix-sgpr-copies,si-i1-copies,finalize-isel,localstackalloc))),require,cgscc(function(machine-function(reg-usage-propagation,phi-node-elimination,two-address-instruction,regallocfast,si-fix-vgpr-copies,remove-redundant-debug-values,fixup-statepoint-caller-saved,prolog-epilog,post-ra-pseudos,si-post-ra-bundler,fentry-insert,xray-instrumentation,patchable-function,si-memory-legalizer,si-insert-waitcnts,si-mode-register,si-late-branch-lowering,post-RA-hazard-rec,amdgpu-wait-sgpr-hazards,amdgpu-lower-vgpr-encoding,branch-relaxation))),require,cgscc(function(machine-function(reg-usage-collector,remove-loads-into-fake-uses,live-debug-values,machine-sanmd,amdgpu-preload-kern-arg-prolog,stack-frame-layout,verify),free-machine-function))
+; GCN-O0: 
require,require,require,require,pre-isel-intrinsic-lowering,function(expand-large-div-rem,expand-fp),amdgpu-remove-incompatible-functions,amdgpu-printf-runtime-binding,amdgpu-lower-ctor-dtor,function(amdgpu-uniform-intrinsic-combine),expand-variadics,amdgpu-always-inline,always-inline,amdgpu-export-kernel-runtime-handles,amdgpu-lower-exec-sync,amdgpu-sw-lower-lds,amdgpu-lower-module-lds,function(atomic-expand,verify,unreachableblockelim,ee-instrument,scalarize-masked-mem-intrin,expand-reductions,amdgpu-lower-kernel-arguments),amdgpu-lower-buffer-fat-pointers,amdgpu-lower-intrinsics,cgscc(function(lower-switch,lower-invoke,unreachableblockelim)),require,cgscc(function(amdgpu-unify-divergent-exit-nodes,fix-irreducible,unify-loop-exits,StructurizeCFGPass,amdgpu-annotate-uniform,si-annotate-control-flow,amdgpu-rewrite-undef-for-phi,lcssa,require,callbr-prepare,safe-stack,stack-protector,verify)),cgscc(function(machine-function(amdgpu-isel,si-fix-sgpr-copies,si-i1-copies,finalize-isel,localstackalloc))),require,cgscc(function(machine-function(reg-usage-propagation,phi-node-elimination,two-address-instruction,regallocfast,si-fix-vgpr-copies,remove-redundant-debug-values,fixup-statepoint-caller-saved,prolog-epilog,post-ra-pseudos,si-post-ra-bundler,fentry-insert,xray-instrumentation,si-memory-legalizer,si-insert-waitcnts,si-mode-register,si-late-branch-lowering,post-RA-hazard-rec,amdgpu-wait-sgpr-hazards,amdgpu-lower-vgpr-encoding,branch-relaxation))),require,cgscc(function(machine-function(reg-usage-collector,remove-loads-into-fake-uses,live-debug-values,machine-sanmd,amdgpu-preload-kern-arg-prolog,stack-frame-layout,verify),free-machine-function))
 
-; GCN-O2: 
require,require,require,require,pre-isel-intrinsic-lowering,function(expand-large-div-rem,expand-fp),amdgpu-remove-incompatible-functions,amdgpu-printf-runtime-binding,amdgpu-lower-ctor-dtor,function(amdgpu-image-intrinsic-opt,amdgpu-

[llvm-branch-commits] [llvm] [RISCV] Schedule RVV instructions with compatible type first (PR #95924)

2025-12-18 Thread Pengcheng Wang via llvm-branch-commits

wangpc-pp wrote:

Current results of this approach on llvm-test-suite (compiled with `-O3 -march=rva23u64 -mrvv-vector-bits=128`; `-mrvv-vector-bits` is specified to increase the opportunities for vectorization):
```
Metric: riscv-insert-vsetvli.NumInsertedVSETVL,riscv-instr-info.NumVRegSpilled,riscv-instr-info.NumVRegReloaded

Program                                        riscv-insert-vsetvli.NumInsertedVSETVL  riscv-instr-info.NumVRegSpilled         riscv-instr-info.NumVRegReloaded
                                                  baseline    vtype-sched        diff     baseline    vtype-sched        diff     baseline    vtype-sched        diff
SingleSource/Benchmarks/Stanford/Queens              26.00          17.00      -34.6%        12.00          12.00        0.0%        12.00          12.00        0.0%
SingleSource/Benchmarks/Stanford/Oscar               12.00          10.00      -16.7%
MultiSourc...arks/Rodinia/backprop/backprop          14.00          12.00      -14.3%
SingleSource/Benchmarks/Misc/flops-8                  7.00           8.00       14.3%
SingleSource/Benchmarks/Misc/flops-6                  7.00           8.00       14.3%
SingleSource/Benchmarks/Misc/flops-5                  7.00           8.00       14.3%
SingleSour.../floyd-warshall/floyd-warshall          32.00          36.00       12.5%
MultiSourc...hmarks/MallocBench/cfrac/cfrac          33.00          29.00      -12.1%         5.00           5.00        0.0%         5.00           5.00        0.0%
SingleSour...ce/UnitTests/matrix-types-spec        2495.00        2218.00      -11.1%     38865.00       37481.00       -3.6%     58086.00       57734.00       -0.6%
MultiSourc...ch/consumer-lame/consumer-lame         719.00         660.00       -8.2%        29.00          29.00        0.0%        44.00          44.00        0.0%
External/S...2017rate/525.x264_r/525.x264_r        1414.00        1319.00       -6.7%       116.00         111.00       -4.3%       128.00         121.00       -5.5%
External/S...017speed/625.x264_s/625.x264_s        1414.00        1319.00       -6.7%       116.00         111.00       -4.3%       128.00         121.00       -5.5%
MultiSource/Benchmarks/Ptrdist/bc/bc                 17.00          18.00        5.9%
MultiSource/Benchmarks/McCat/18-imp/imp              35.00          33.00       -5.7%
External/S...T2017speed/605.mcf_s/605.mcf_s          70.00          66.00       -5.7%
Geomean difference                                                              -0.5%                                       -0.9%                                       -1.1%
```

In geomean, this gives:
* `-0.5%` inserted `vsetvli` instructions.
* `-0.9%` vector register spills.
* `-1.1%` vector register reloads.
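
For readers unfamiliar with the "Geomean difference" row, the sketch below shows one common way such a figure can be computed: take the geometric mean of the per-program patched/baseline ratios and subtract 1. Whether this matches the exact formula of the script that produced the table is an assumption, and Sample and geomeanDifference are hypothetical names. The table lists only a subset of programs, so feeding its rows into this function is not expected to reproduce the reported geomeans.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// One program's metric value before and after the change (hypothetical type).
struct Sample {
  double Baseline;
  double Patched;
};

// Relative change of the geometric mean of Patched/Baseline ratios.
// A result of -0.005 corresponds to "-0.5%".
double geomeanDifference(const std::vector<Sample> &Samples) {
  assert(!Samples.empty() && "need at least one sample");
  double LogSum = 0.0;
  for (const Sample &S : Samples) {
    // Ratios are only defined for strictly positive values.
    assert(S.Baseline > 0.0 && S.Patched > 0.0);
    LogSum += std::log(S.Patched / S.Baseline);
  }
  return std::exp(LogSum / static_cast<double>(Samples.size())) - 1.0;
}
```

Summing logarithms instead of multiplying the ratios directly avoids overflow or underflow when many programs are involved.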


https://github.com/llvm/llvm-project/pull/95924


[llvm-branch-commits] [llvm] [CodeGen][NPM] Remove "LowerConstantIntrinsicsPass" from the pipeline (PR #172794)

2025-12-18 Thread Vikram Hegde via llvm-branch-commits

https://github.com/vikramRH updated https://github.com/llvm/llvm-project/pull/172794

>From 5a8d3b2edb0e61d6cafde1d3b84d35ce09aeb386 Mon Sep 17 00:00:00 2001
From: vikhegde 
Date: Wed, 17 Dec 2025 15:00:27 +0530
Subject: [PATCH] [NPM] Remove "LowerConstantIntrinsicsPass" from the pipeline

---
 llvm/include/llvm/Passes/CodeGenPassBuilder.h | 1 -
 llvm/test/CodeGen/AMDGPU/llc-pipeline-npm.ll  | 6 +++---
 2 files changed, 3 insertions(+), 4 deletions(-)

diff --git a/llvm/include/llvm/Passes/CodeGenPassBuilder.h b/llvm/include/llvm/Passes/CodeGenPassBuilder.h
index d7ce17bebbf5d..7151dceac4b79 100644
--- a/llvm/include/llvm/Passes/CodeGenPassBuilder.h
+++ b/llvm/include/llvm/Passes/CodeGenPassBuilder.h
@@ -730,7 +730,6 @@ void CodeGenPassBuilder<Derived, TargetMachineT>::addIRPasses(
 flushFPMsToMPM(PMW);
 addModulePass(ShadowStackGCLoweringPass(), PMW);
   }
-  addFunctionPass(LowerConstantIntrinsicsPass(), PMW);
 
   // Make sure that no unreachable blocks are instruction selected.
   addFunctionPass(UnreachableBlockElimPass(), PMW);
diff --git a/llvm/test/CodeGen/AMDGPU/llc-pipeline-npm.ll b/llvm/test/CodeGen/AMDGPU/llc-pipeline-npm.ll
index 0383f0613b71d..9b8f0c5f4ef0d 100644
--- a/llvm/test/CodeGen/AMDGPU/llc-pipeline-npm.ll
+++ b/llvm/test/CodeGen/AMDGPU/llc-pipeline-npm.ll
@@ -9,11 +9,11 @@
 ; RUN:   | FileCheck -check-prefix=GCN-O3 %s
 
 
-; GCN-O0: 
require,require,require,require,pre-isel-intrinsic-lowering,function(expand-large-div-rem,expand-fp),amdgpu-remove-incompatible-functions,amdgpu-printf-runtime-binding,amdgpu-lower-ctor-dtor,function(amdgpu-uniform-intrinsic-combine),expand-variadics,amdgpu-always-inline,always-inline,amdgpu-export-kernel-runtime-handles,amdgpu-lower-exec-sync,amdgpu-sw-lower-lds,amdgpu-lower-module-lds,function(atomic-expand,verify,gc-lowering,lower-constant-intrinsics,unreachableblockelim,ee-instrument,scalarize-masked-mem-intrin,expand-reductions,amdgpu-lower-kernel-arguments),amdgpu-lower-buffer-fat-pointers,amdgpu-lower-intrinsics,cgscc(function(lower-switch,lower-invoke,unreachableblockelim)),require,cgscc(function(amdgpu-unify-divergent-exit-nodes,fix-irreducible,unify-loop-exits,StructurizeCFGPass,amdgpu-annotate-uniform,si-annotate-control-flow,amdgpu-rewrite-undef-for-phi,lcssa,require,callbr-prepare,safe-stack,stack-protector,verify)),cgscc(function(machine-function(amdgpu-isel,si-fix-sgpr-copies,si-i1-copies,finalize-isel,localstackalloc))),require,cgscc(function(machine-function(reg-usage-propagation,phi-node-elimination,two-address-instruction,regallocfast,si-fix-vgpr-copies,remove-redundant-debug-values,fixup-statepoint-caller-saved,prolog-epilog,post-ra-pseudos,si-post-ra-bundler,fentry-insert,xray-instrumentation,patchable-function,si-memory-legalizer,si-insert-waitcnts,si-mode-register,si-late-branch-lowering,post-RA-hazard-rec,amdgpu-wait-sgpr-hazards,amdgpu-lower-vgpr-encoding,branch-relaxation))),require,cgscc(function(machine-function(reg-usage-collector,remove-loads-into-fake-uses,live-debug-values,machine-sanmd,amdgpu-preload-kern-arg-prolog,stack-frame-layout,verify),free-machine-function))
+; GCN-O0: 
require,require,require,require,pre-isel-intrinsic-lowering,function(expand-large-div-rem,expand-fp),amdgpu-remove-incompatible-functions,amdgpu-printf-runtime-binding,amdgpu-lower-ctor-dtor,function(amdgpu-uniform-intrinsic-combine),expand-variadics,amdgpu-always-inline,always-inline,amdgpu-export-kernel-runtime-handles,amdgpu-lower-exec-sync,amdgpu-sw-lower-lds,amdgpu-lower-module-lds,function(atomic-expand,verify,gc-lowering,unreachableblockelim,ee-instrument,scalarize-masked-mem-intrin,expand-reductions,amdgpu-lower-kernel-arguments),amdgpu-lower-buffer-fat-pointers,amdgpu-lower-intrinsics,cgscc(function(lower-switch,lower-invoke,unreachableblockelim)),require,cgscc(function(amdgpu-unify-divergent-exit-nodes,fix-irreducible,unify-loop-exits,StructurizeCFGPass,amdgpu-annotate-uniform,si-annotate-control-flow,amdgpu-rewrite-undef-for-phi,lcssa,require,callbr-prepare,safe-stack,stack-protector,verify)),cgscc(function(machine-function(amdgpu-isel,si-fix-sgpr-copies,si-i1-copies,finalize-isel,localstackalloc))),require,cgscc(function(machine-function(reg-usage-propagation,phi-node-elimination,two-address-instruction,regallocfast,si-fix-vgpr-copies,remove-redundant-debug-values,fixup-statepoint-caller-saved,prolog-epilog,post-ra-pseudos,si-post-ra-bundler,fentry-insert,xray-instrumentation,patchable-function,si-memory-legalizer,si-insert-waitcnts,si-mode-register,si-late-branch-lowering,post-RA-hazard-rec,amdgpu-wait-sgpr-hazards,amdgpu-lower-vgpr-encoding,branch-relaxation))),require,cgscc(function(machine-function(reg-usage-collector,remove-loads-into-fake-uses,live-debug-values,machine-sanmd,amdgpu-preload-kern-arg-prolog,stack-frame-layout,verify),free-machine-function))
 
-; GCN-O2: 
require,require,require,require,pre-isel-intrinsic-lowering,function(expand-large-div-rem,expand-fp),amdgpu-remove-incompatible-functions,amdgpu-printf-runtime-binding,amdgpu-lo

[llvm-branch-commits] [llvm] [AMDGPU][NPM] add "addPostBBSections()" to NPM (PR #172793)

2025-12-18 Thread Vikram Hegde via llvm-branch-commits

https://github.com/vikramRH edited 
https://github.com/llvm/llvm-project/pull/172793


[llvm-branch-commits] [llvm] [NPM] Update OptimizedRegAlloc and MachineLateOptimization pipelines (PR #172795)

2025-12-18 Thread Christudasan Devadasan via llvm-branch-commits

https://github.com/cdevadas approved this pull request.


https://github.com/llvm/llvm-project/pull/172795


[llvm-branch-commits] [clang] [Clang] Add __builtin_allow_sanitize_check() (PR #172030)

2025-12-18 Thread Alexander Potapenko via llvm-branch-commits

ramosian-glider wrote:

LGTM

https://github.com/llvm/llvm-project/pull/172030


[llvm-branch-commits] [clang] [flang] [llvm] [openmp] [Flang] Move builtin .mod generation into runtimes (Reapply #137828) (PR #171515)

2025-12-18 Thread Michael Kruse via llvm-branch-commits

Meinersbur wrote:

Not this year anymore; our operations would not be able to react before the 
holidays if something goes wrong. After that, it depends on @petrhosek's 
review. I don't know whether they will have larger change requests. It's been 
NFC only so far.

Changes made are responding to @petrhosek's review comments (also from 
#171610), with the exception of [`9e7ab48` (#171515)](https://github.com/llvm/llvm-project/pull/171515/commits/9e7ab487945a04468469e82aab81cbf86a2da0d7), 
which I made in the hope of reducing build problems.

https://github.com/llvm/llvm-project/pull/171515


[llvm-branch-commits] [llvm] [RISCV][NFC] Add RISCVVSETVLIInfoAnalysis (PR #172615)

2025-12-18 Thread Craig Topper via llvm-branch-commits

https://github.com/topperc approved this pull request.

LGTM

https://github.com/llvm/llvm-project/pull/172615


[llvm-branch-commits] [llvm] StackProtector: Use LibcallLoweringInfo analysis (PR #170329)

2025-12-18 Thread Matt Arsenault via llvm-branch-commits

https://github.com/arsenm updated 
https://github.com/llvm/llvm-project/pull/170329

>From 78c30697352a8d115391b0b66cd51873fa154e81 Mon Sep 17 00:00:00 2001
From: Matt Arsenault 
Date: Mon, 24 Nov 2025 13:48:45 -0500
Subject: [PATCH] StackProtector: Use LibcallLoweringInfo analysis

---
 llvm/lib/CodeGen/StackProtector.cpp   | 82 +--
 .../NVPTX/no-stack-protector-libcall-error.ll |  2 +-
 .../X86/stack-protector-atomicrmw-xchg.ll |  4 +-
 .../cross-dso-cfi-stack-chk-fail.ll   |  2 +-
 .../StackProtector/missing-analysis.ll|  7 ++
 .../StackProtector/stack-chk-fail-alias.ll|  2 +-
 6 files changed, 67 insertions(+), 32 deletions(-)
 create mode 100644 llvm/test/Transforms/StackProtector/missing-analysis.ll

diff --git a/llvm/lib/CodeGen/StackProtector.cpp b/llvm/lib/CodeGen/StackProtector.cpp
index 971c3e58b10ef..2fef484dbc583 100644
--- a/llvm/lib/CodeGen/StackProtector.cpp
+++ b/llvm/lib/CodeGen/StackProtector.cpp
@@ -69,13 +69,15 @@ static cl::opt<bool> DisableCheckNoReturn("disable-check-noreturn-call",
 ///  - The prologue code loads and stores the stack guard onto the stack.
 ///  - The epilogue checks the value stored in the prologue against the 
original
 ///value. It calls __stack_chk_fail if they differ.
-static bool InsertStackProtectors(const TargetMachine *TM, Function *F,
-  DomTreeUpdater *DTU, bool &HasPrologue,
-  bool &HasIRCheck);
+static bool InsertStackProtectors(const TargetLowering &TLI,
+  const LibcallLoweringInfo &Libcalls,
+  Function *F, DomTreeUpdater *DTU,
+  bool &HasPrologue, bool &HasIRCheck);
 
 /// CreateFailBB - Create a basic block to jump to when the stack protector
 /// check fails.
-static BasicBlock *CreateFailBB(Function *F, const TargetLowering &TLI);
+static BasicBlock *CreateFailBB(Function *F,
+const LibcallLoweringInfo &Libcalls);
 
 bool SSPLayoutInfo::shouldEmitSDCheck(const BasicBlock &BB) const {
   return HasPrologue && !HasIRCheck && isa<ReturnInst>(BB.getTerminator());
@@ -131,8 +133,23 @@ PreservedAnalyses StackProtectorPass::run(Function &F,
   return PreservedAnalyses::all();
   }
 
+  auto &MAMProxy = FAM.getResult<ModuleAnalysisManagerFunctionProxy>(F);
+  const LibcallLoweringModuleAnalysisResult *LibcallLowering =
+  MAMProxy.getCachedResult<LibcallLoweringModuleAnalysis>(*F.getParent());
+
+  if (!LibcallLowering) {
+F.getContext().emitError("'" + LibcallLoweringModuleAnalysis::name() +
+ "' analysis required");
+return PreservedAnalyses::all();
+  }
+
+  const TargetSubtargetInfo *STI = TM->getSubtargetImpl(F);
+  const TargetLowering *TLI = STI->getTargetLowering();
+  const LibcallLoweringInfo &Libcalls =
+  LibcallLowering->getLibcallLowering(*STI);
+
   ++NumFunProtected;
-  bool Changed = InsertStackProtectors(TM, &F, DT ? &DTU : nullptr,
+  bool Changed = InsertStackProtectors(*TLI, Libcalls, &F, DT ? &DTU : nullptr,
Info.HasPrologue, Info.HasIRCheck);
 #ifdef EXPENSIVE_CHECKS
   assert((!DT ||
@@ -156,6 +173,7 @@ StackProtector::StackProtector() : FunctionPass(ID) {
 
 INITIALIZE_PASS_BEGIN(StackProtector, DEBUG_TYPE,
   "Insert stack protectors", false, true)
+INITIALIZE_PASS_DEPENDENCY(LibcallLoweringInfoWrapper)
 INITIALIZE_PASS_DEPENDENCY(TargetPassConfig)
 INITIALIZE_PASS_DEPENDENCY(DominatorTreeWrapperPass)
 INITIALIZE_PASS_END(StackProtector, DEBUG_TYPE,
@@ -164,6 +182,7 @@ INITIALIZE_PASS_END(StackProtector, DEBUG_TYPE,
 FunctionPass *llvm::createStackProtectorPass() { return new StackProtector(); }
 
 void StackProtector::getAnalysisUsage(AnalysisUsage &AU) const {
+  AU.addRequired<LibcallLoweringInfoWrapper>();
   AU.addRequired<TargetPassConfig>();
   AU.addPreserved<DominatorTreeWrapperPass>();
 }
@@ -190,9 +209,16 @@ bool StackProtector::runOnFunction(Function &Fn) {
   return false;
   }
 
+  const TargetSubtargetInfo *Subtarget = TM->getSubtargetImpl(Fn);
+  const LibcallLoweringInfo &Libcalls =
+  getAnalysis<LibcallLoweringInfoWrapper>().getLibcallLowering(*M,
+   *Subtarget);
+
+  const TargetLowering *TLI = Subtarget->getTargetLowering();
+
   ++NumFunProtected;
   bool Changed =
-  InsertStackProtectors(TM, F, DTU ? &*DTU : nullptr,
+  InsertStackProtectors(*TLI, Libcalls, F, DTU ? &*DTU : nullptr,
 LayoutInfo.HasPrologue, LayoutInfo.HasIRCheck);
 #ifdef EXPENSIVE_CHECKS
   assert((!DTU ||
@@ -519,10 +545,10 @@ bool SSPLayoutAnalysis::requiresStackProtector(Function 
*F,
 
 /// Create a stack guard loading and populate whether SelectionDAG SSP is
 /// supported.
-static Value *getStackGuard(const TargetLoweringBase *TLI, Module *M,
+static Value *getStackGuard(const TargetLoweringBase &TLI, Module *M,
 IRBuilder<> &B,
 bool *SupportsSelectionDAGSP = nullptr) {
-  Value *Guard = TLI->getIRSt

[llvm-branch-commits] [llvm] [AMDGPU] Add liverange split instructions into BB Prolog (PR #117544)

2025-12-18 Thread Quentin Colombet via llvm-branch-commits


@@ -9709,6 +9709,30 @@ unsigned SIInstrInfo::getLiveRangeSplitOpcode(Register 
SrcReg,
   return AMDGPU::COPY;
 }
 
+bool SIInstrInfo::canAddToBBProlog(const MachineInstr &MI) const {
+  uint16_t Opcode = MI.getOpcode();
+  // Check if it is SGPR spill or wwm-register spill Opcode.
+  if (isSGPRSpill(Opcode) || isWWMRegSpillOpcode(Opcode))
+return true;
+
+  const MachineFunction *MF = MI.getMF();
+  const MachineRegisterInfo &MRI = MF->getRegInfo();
+  const SIMachineFunctionInfo *MFI = MF->getInfo<SIMachineFunctionInfo>();
+
+  // See if this is Liverange split instruction inserted for SGPR or
+  // wwm-register. The implicit def inserted for wwm-registers should also be
+  // included as they can appear at the bb begin.
+  bool IsLRSplitInst = MI.getFlag(MachineInstr::LRSplit);

qcolombet wrote:

Instead of adding a flag, would it be enough to check that this is an SGPR copy?

https://github.com/llvm/llvm-project/pull/117544


[llvm-branch-commits] [llvm] [AArch64][PAC] Factor out printing real AUT/PAC/BLRA encodings (NFC) (PR #160901)

2025-12-18 Thread Peter Collingbourne via llvm-branch-commits

https://github.com/pcc approved this pull request.


https://github.com/llvm/llvm-project/pull/160901


[llvm-branch-commits] [clang] [llvm] Continuation of fexec-charset (PR #169803)

2025-12-18 Thread Abhina Sree via llvm-branch-commits

https://github.com/abhina-sree updated 
https://github.com/llvm/llvm-project/pull/169803

>From 16f3faac450b16cb527e409339fd32b42cc0ad43 Mon Sep 17 00:00:00 2001
From: Abhina Sreeskantharajan 
Date: Mon, 24 Nov 2025 11:00:04 -0500
Subject: [PATCH 1/3] add ParserConversionAction

(cherry picked from commit c2647a73957921d3f7a53c6f25a69f1cc2725aa3)
---
 clang/include/clang/Parse/Parser.h |  1 +
 clang/include/clang/Sema/Sema.h|  8 ++--
 clang/lib/Parse/ParseDecl.cpp  | 13 +
 clang/lib/Parse/ParseDeclCXX.cpp   | 10 +++---
 clang/lib/Parse/ParseExpr.cpp  |  9 +
 clang/lib/Parse/Parser.cpp |  4 
 clang/lib/Sema/SemaExpr.cpp| 12 +++-
 7 files changed, 43 insertions(+), 14 deletions(-)

diff --git a/clang/include/clang/Parse/Parser.h b/clang/include/clang/Parse/Parser.h
index 58eb1c0a7c114..97867183b5a1d 100644
--- a/clang/include/clang/Parse/Parser.h
+++ b/clang/include/clang/Parse/Parser.h
@@ -5633,6 +5633,7 @@ class Parser : public CodeCompletionHandler {
 bool Finished;
   };
   ObjCImplParsingDataRAII *CurParsedObjCImpl;
+  ConversionAction ParserConversionAction;
 
   /// StashAwayMethodOrFunctionBodyTokens -  Consume the tokens and store them
   /// for later parsing.
diff --git a/clang/include/clang/Sema/Sema.h b/clang/include/clang/Sema/Sema.h
index cbfcc9bc0ea99..65567e367dea4 100644
--- a/clang/include/clang/Sema/Sema.h
+++ b/clang/include/clang/Sema/Sema.h
@@ -54,6 +54,7 @@
 #include "clang/Basic/TemplateKinds.h"
 #include "clang/Basic/TokenKinds.h"
 #include "clang/Basic/TypeTraits.h"
+#include "clang/Lex/LiteralConverter.h"
 #include "clang/Sema/AnalysisBasedWarnings.h"
 #include "clang/Sema/Attr.h"
 #include "clang/Sema/CleanupInfo.h"
@@ -7272,9 +7273,12 @@ class Sema final : public SemaBase {
   /// from multiple tokens.  However, the common case is that StringToks points
   /// to one string.
   ExprResult ActOnStringLiteral(ArrayRef<Token> StringToks,
-Scope *UDLScope = nullptr);
+Scope *UDLScope = nullptr,
+ConversionAction Action = CA_ToExecEncoding);
 
-  ExprResult ActOnUnevaluatedStringLiteral(ArrayRef<Token> StringToks);
+  ExprResult
+  ActOnUnevaluatedStringLiteral(ArrayRef<Token> StringToks,
+ConversionAction Action = CA_ToExecEncoding);
 
   /// ControllingExprOrType is either an opaque pointer coming out of a
   /// ParsedType or an Expr *. FIXME: it'd be better to split this interface
diff --git a/clang/lib/Parse/ParseDecl.cpp b/clang/lib/Parse/ParseDecl.cpp
index 8688ccf41acb5..fd537618a3c83 100644
--- a/clang/lib/Parse/ParseDecl.cpp
+++ b/clang/lib/Parse/ParseDecl.cpp
@@ -555,6 +555,9 @@ unsigned Parser::ParseAttributeArgsCommon(
   nullptr,
   Sema::ExpressionEvaluationContextRecord::EK_AttrArgument);
 
+  SaveAndRestore SavedTranslationState(
+  ParserConversionAction, CA_NoConversion);
+
   ExprResult ArgExpr = ParseAssignmentExpression();
   if (ArgExpr.isInvalid()) {
 SkipUntil(tok::r_paren, StopAtSemi);
@@ -634,6 +637,9 @@ void Parser::ParseGNUAttributeArgs(
   ParsedAttr::Kind AttrKind =
   ParsedAttr::getParsedKind(AttrName, ScopeName, Form.getSyntax());
 
+  SaveAndRestore SavedTranslationState(ParserConversionAction,
+ CA_NoConversion);
+
   if (AttrKind == ParsedAttr::AT_Availability) {
 ParseAvailabilityAttribute(*AttrName, AttrNameLoc, Attrs, EndLoc, 
ScopeName,
ScopeLoc, Form);
@@ -699,6 +705,9 @@ unsigned Parser::ParseClangAttributeArgs(
   ParsedAttr::Kind AttrKind =
   ParsedAttr::getParsedKind(AttrName, ScopeName, Form.getSyntax());
 
+  SaveAndRestore SavedTranslationState(ParserConversionAction,
+ CA_NoConversion);
+
   switch (AttrKind) {
   default:
 return ParseAttributeArgsCommon(AttrName, AttrNameLoc, Attrs, EndLoc,
@@ -1521,6 +1530,10 @@ void Parser::ParseExternalSourceSymbolAttribute(
   SkipUntil(tok::comma, tok::r_paren, StopAtSemi | StopBeforeMatch);
   continue;
 }
+
+SaveAndRestore SavedTranslationState(
+ParserConversionAction, CA_NoConversion);
+
 if (Keyword == Ident_language) {
   if (HadLanguage) {
 Diag(KeywordLoc, diag::err_external_source_symbol_duplicate_clause)
diff --git a/clang/lib/Parse/ParseDeclCXX.cpp b/clang/lib/Parse/ParseDeclCXX.cpp
index d8ed7e3ff96bd..40bf409124711 100644
--- a/clang/lib/Parse/ParseDeclCXX.cpp
+++ b/clang/lib/Parse/ParseDeclCXX.cpp
@@ -314,7 +314,9 @@ Decl *Parser::ParseNamespaceAlias(SourceLocation 
NamespaceLoc,
 
 Decl *Parser::ParseLinkage(ParsingDeclSpec &DS, DeclaratorContext Context) {
   assert(isTokenStringLiteral() && "Not a string literal!");
-  ExprResult Lang = ParseUnevaluatedStringLiteralExpression();
+  ExprResult Lang = (SaveAndRestore(ParserConversio

[llvm-branch-commits] [llvm] [AMDGPU] Add liverange split instructions into BB Prolog (PR #117544)

2025-12-18 Thread Matt Arsenault via llvm-branch-commits


@@ -9709,6 +9709,30 @@ unsigned SIInstrInfo::getLiveRangeSplitOpcode(Register 
SrcReg,
   return AMDGPU::COPY;
 }
 
+bool SIInstrInfo::canAddToBBProlog(const MachineInstr &MI) const {
+  uint16_t Opcode = MI.getOpcode();
+  // Check if it is SGPR spill or wwm-register spill Opcode.
+  if (isSGPRSpill(Opcode) || isWWMRegSpillOpcode(Opcode))
+return true;
+
+  const MachineFunction *MF = MI.getMF();
+  const MachineRegisterInfo &MRI = MF->getRegInfo();
+  const SIMachineFunctionInfo *MFI = MF->getInfo<SIMachineFunctionInfo>();
+
+  // See if this is Liverange split instruction inserted for SGPR or
+  // wwm-register. The implicit def inserted for wwm-registers should also be
+  // included as they can appear at the bb begin.
+  bool IsLRSplitInst = MI.getFlag(MachineInstr::LRSplit);

arsenm wrote:

That won't help for the WWM case (though below it is checking the flag in the 
SGPR case too)
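
The point of this exchange can be illustrated with a small toy model. This is only a sketch under stated assumptions: `RegKind`, `ToyInstr`, `isSGPRCopy`, and `isLRSplitForProlog` are hypothetical stand-ins invented for illustration, not LLVM types or the patch's actual logic. It shows why a check that only recognises SGPR copies would not cover a live-range split instruction inserted for a WWM register, whereas a flag-based check can cover both.

```cpp
// Toy model only: these types are hypothetical and do not exist in LLVM.
enum class RegKind { SGPR, VGPR, WWM };

struct ToyInstr {
  bool IsCopy;      // the instruction is a plain copy
  RegKind DstKind;  // kind of register being defined
  bool LRSplitFlag; // stands in for the LRSplit flag discussed in the review
};

// Alternative raised in review: treat any SGPR copy as prolog-eligible.
bool isSGPRCopy(const ToyInstr &MI) {
  return MI.IsCopy && MI.DstKind == RegKind::SGPR;
}

// Flag-based variant: a live-range split instruction that was inserted for
// either an SGPR or a WWM register.
bool isLRSplitForProlog(const ToyInstr &MI) {
  return MI.LRSplitFlag &&
         (MI.DstKind == RegKind::SGPR || MI.DstKind == RegKind::WWM);
}
```

In this model, a split instruction targeting a WWM register is rejected by `isSGPRCopy` but accepted by `isLRSplitForProlog`, while an ordinary SGPR copy without the flag is accepted only by `isSGPRCopy`.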

https://github.com/llvm/llvm-project/pull/117544


[llvm-branch-commits] [flang] 50e8835 - Revert "Reland "[flang][cuda] Add support for derived-type initialization on …"

2025-12-18 Thread via llvm-branch-commits

Author: Valentin Clement (バレンタイン クレメン)
Date: 2025-12-18T11:11:01-08:00
New Revision: 50e88351f8debca0a734c9a0ca784bb86a455bf3

URL: 
https://github.com/llvm/llvm-project/commit/50e88351f8debca0a734c9a0ca784bb86a455bf3
DIFF: 
https://github.com/llvm/llvm-project/commit/50e88351f8debca0a734c9a0ca784bb86a455bf3.diff

LOG: Revert "Reland "[flang][cuda] Add support for derived-type initialization 
on …"

This reverts commit e81bae73fe7530d164376cf0cdaf257ee7344fb4.

Added: 
flang/test/Lower/CUDA/TODO/cuda-allocate-default-init.cuf

Modified: 
flang-rt/include/flang-rt/runtime/derived.h
flang-rt/include/flang-rt/runtime/work-queue.h
flang-rt/lib/cuda/allocatable.cpp
flang-rt/lib/cuda/memmove-function.cpp
flang-rt/lib/cuda/pointer.cpp
flang-rt/lib/runtime/allocatable.cpp
flang-rt/lib/runtime/derived.cpp
flang-rt/lib/runtime/pointer.cpp
flang/include/flang/Optimizer/Builder/Runtime/RTBuilder.h
flang/include/flang/Runtime/CUDA/allocatable.h
flang/include/flang/Runtime/CUDA/memmove-function.h
flang/include/flang/Runtime/CUDA/pointer.h
flang/include/flang/Runtime/allocatable.h
flang/include/flang/Runtime/freestanding-tools.h
flang/include/flang/Runtime/pointer.h
flang/lib/Lower/Allocatable.cpp
flang/lib/Optimizer/Builder/Runtime/Allocatable.cpp
flang/lib/Optimizer/Transforms/CUDA/CUFAllocationConversion.cpp
flang/test/Fir/CUDA/cuda-allocate.fir
flang/test/Lower/Intrinsics/c_loc.f90
flang/test/Lower/OpenACC/acc-declare.f90
flang/test/Lower/OpenMP/parallel-reduction-pointer-array.f90
flang/test/Lower/OpenMP/wsloop-reduction-pointer.f90
flang/test/Lower/allocatable-polymorphic.f90
flang/test/Lower/allocatable-runtime.f90
flang/test/Lower/allocate-mold.f90
flang/test/Lower/assign-statement.f90
flang/test/Lower/nullify-polymorphic.f90
flang/test/Lower/polymorphic.f90
flang/test/Lower/volatile-allocatable.f90
flang/test/Transforms/lower-repack-arrays.fir

Removed: 




diff  --git a/flang-rt/include/flang-rt/runtime/derived.h b/flang-rt/include/flang-rt/runtime/derived.h
index b3718e2a1c2cf..ac6962c57168c 100644
--- a/flang-rt/include/flang-rt/runtime/derived.h
+++ b/flang-rt/include/flang-rt/runtime/derived.h
@@ -12,7 +12,6 @@
 #define FLANG_RT_RUNTIME_DERIVED_H_
 
 #include "flang/Common/api-attrs.h"
-#include "flang/Runtime/freestanding-tools.h"
 
 namespace Fortran::runtime::typeInfo {
 class DerivedType;
@@ -25,8 +24,7 @@ class Terminator;
 // Perform default component initialization, allocate automatic components.
 // Returns a STAT= code (0 when all's well).
 RT_API_ATTRS int Initialize(const Descriptor &, const typeInfo::DerivedType &,
-Terminator &, bool hasStat = false, const Descriptor *errMsg = nullptr,
-MemcpyFct memcpyFct = nullptr);
+Terminator &, bool hasStat = false, const Descriptor *errMsg = nullptr);
 
 // Initializes an object clone from the original object.
 // Each allocatable member of the clone is allocated with the same bounds as

diff  --git a/flang-rt/include/flang-rt/runtime/work-queue.h b/flang-rt/include/flang-rt/runtime/work-queue.h
index 4b56540d6fd5a..7d7f8ad991a57 100644
--- a/flang-rt/include/flang-rt/runtime/work-queue.h
+++ b/flang-rt/include/flang-rt/runtime/work-queue.h
@@ -249,15 +249,12 @@ class ElementsOverComponents : public Elementwise, public Componentwise {
 class InitializeTicket : public ImmediateTicketRunner,
  private ElementsOverComponents {
 public:
-  RT_API_ATTRS InitializeTicket(const Descriptor &instance,
-  const typeInfo::DerivedType &derived, MemcpyFct memcpyFct)
+  RT_API_ATTRS InitializeTicket(
+  const Descriptor &instance, const typeInfo::DerivedType &derived)
   : ImmediateTicketRunner{*this},
-ElementsOverComponents{instance, derived}, memcpyFct_{memcpyFct} {}
+ElementsOverComponents{instance, derived} {}
   RT_API_ATTRS int Begin(WorkQueue &);
   RT_API_ATTRS int Continue(WorkQueue &);
-
-private:
-  MemcpyFct memcpyFct_;
 };
 
 // Initializes one derived type instance from the value of another
@@ -451,12 +448,12 @@ class WorkQueue {
 
   // APIs for particular tasks.  These can return StatOk if the work is
   // completed immediately.
-  RT_API_ATTRS int BeginInitialize(const Descriptor &descriptor,
-  const typeInfo::DerivedType &derived, MemcpyFct memcpyFct = nullptr) {
+  RT_API_ATTRS int BeginInitialize(
+  const Descriptor &descriptor, const typeInfo::DerivedType &derived) {
 if (runTicketsImmediately_) {
-  return InitializeTicket{descriptor, derived, memcpyFct}.Run(*this);
+  return InitializeTicket{descriptor, derived}.Run(*this);
 } else {
-  StartTicket().u.emplace<InitializeTicket>(descriptor, derived, memcpyFct);
+  StartTicket().u.emplace<InitializeTicket>(descriptor, derived);
   return StatContinue;
 }
   }

diff  --git a/flang-rt/lib/cuda/allocatable.cpp 
b/fla

[llvm-branch-commits] [llvm] [CodeGen][NPM] Remove "LowerConstantIntrinsicsPass" from the pipeline (PR #172794)

2025-12-18 Thread Matt Arsenault via llvm-branch-commits

https://github.com/arsenm approved this pull request.


https://github.com/llvm/llvm-project/pull/172794


[llvm-branch-commits] [llvm] StackProtector: Use LibcallLoweringInfo analysis (PR #170329)

2025-12-18 Thread Matt Arsenault via llvm-branch-commits

https://github.com/arsenm updated 
https://github.com/llvm/llvm-project/pull/170329

>From aee9d14de9a741e458196b9600d784b41ad3f942 Mon Sep 17 00:00:00 2001
From: Matt Arsenault 
Date: Mon, 24 Nov 2025 13:48:45 -0500
Subject: [PATCH] StackProtector: Use LibcallLoweringInfo analysis

---
 llvm/lib/CodeGen/StackProtector.cpp   | 82 +--
 .../NVPTX/no-stack-protector-libcall-error.ll |  2 +-
 .../X86/stack-protector-atomicrmw-xchg.ll |  4 +-
 .../cross-dso-cfi-stack-chk-fail.ll   |  2 +-
 .../StackProtector/missing-analysis.ll|  7 ++
 .../StackProtector/stack-chk-fail-alias.ll|  2 +-
 6 files changed, 67 insertions(+), 32 deletions(-)
 create mode 100644 llvm/test/Transforms/StackProtector/missing-analysis.ll

diff --git a/llvm/lib/CodeGen/StackProtector.cpp b/llvm/lib/CodeGen/StackProtector.cpp
index 971c3e58b10ef..2fef484dbc583 100644
--- a/llvm/lib/CodeGen/StackProtector.cpp
+++ b/llvm/lib/CodeGen/StackProtector.cpp
@@ -69,13 +69,15 @@ static cl::opt<bool> DisableCheckNoReturn("disable-check-noreturn-call",
 ///  - The prologue code loads and stores the stack guard onto the stack.
 ///  - The epilogue checks the value stored in the prologue against the 
original
 ///value. It calls __stack_chk_fail if they differ.
-static bool InsertStackProtectors(const TargetMachine *TM, Function *F,
-  DomTreeUpdater *DTU, bool &HasPrologue,
-  bool &HasIRCheck);
+static bool InsertStackProtectors(const TargetLowering &TLI,
+  const LibcallLoweringInfo &Libcalls,
+  Function *F, DomTreeUpdater *DTU,
+  bool &HasPrologue, bool &HasIRCheck);
 
 /// CreateFailBB - Create a basic block to jump to when the stack protector
 /// check fails.
-static BasicBlock *CreateFailBB(Function *F, const TargetLowering &TLI);
+static BasicBlock *CreateFailBB(Function *F,
+const LibcallLoweringInfo &Libcalls);
 
 bool SSPLayoutInfo::shouldEmitSDCheck(const BasicBlock &BB) const {
   return HasPrologue && !HasIRCheck && isa<ReturnInst>(BB.getTerminator());
@@ -131,8 +133,23 @@ PreservedAnalyses StackProtectorPass::run(Function &F,
   return PreservedAnalyses::all();
   }
 
+  auto &MAMProxy = FAM.getResult<ModuleAnalysisManagerFunctionProxy>(F);
+  const LibcallLoweringModuleAnalysisResult *LibcallLowering =
+  MAMProxy.getCachedResult<LibcallLoweringModuleAnalysis>(*F.getParent());
+
+  if (!LibcallLowering) {
+F.getContext().emitError("'" + LibcallLoweringModuleAnalysis::name() +
+ "' analysis required");
+return PreservedAnalyses::all();
+  }
+
+  const TargetSubtargetInfo *STI = TM->getSubtargetImpl(F);
+  const TargetLowering *TLI = STI->getTargetLowering();
+  const LibcallLoweringInfo &Libcalls =
+  LibcallLowering->getLibcallLowering(*STI);
+
   ++NumFunProtected;
-  bool Changed = InsertStackProtectors(TM, &F, DT ? &DTU : nullptr,
+  bool Changed = InsertStackProtectors(*TLI, Libcalls, &F, DT ? &DTU : nullptr,
Info.HasPrologue, Info.HasIRCheck);
 #ifdef EXPENSIVE_CHECKS
   assert((!DT ||
@@ -156,6 +173,7 @@ StackProtector::StackProtector() : FunctionPass(ID) {
 
 INITIALIZE_PASS_BEGIN(StackProtector, DEBUG_TYPE,
   "Insert stack protectors", false, true)
+INITIALIZE_PASS_DEPENDENCY(LibcallLoweringInfoWrapper)
 INITIALIZE_PASS_DEPENDENCY(TargetPassConfig)
 INITIALIZE_PASS_DEPENDENCY(DominatorTreeWrapperPass)
 INITIALIZE_PASS_END(StackProtector, DEBUG_TYPE,
@@ -164,6 +182,7 @@ INITIALIZE_PASS_END(StackProtector, DEBUG_TYPE,
 FunctionPass *llvm::createStackProtectorPass() { return new StackProtector(); }
 
 void StackProtector::getAnalysisUsage(AnalysisUsage &AU) const {
+  AU.addRequired<LibcallLoweringInfoWrapper>();
   AU.addRequired<TargetPassConfig>();
   AU.addPreserved<DominatorTreeWrapperPass>();
 }
@@ -190,9 +209,16 @@ bool StackProtector::runOnFunction(Function &Fn) {
   return false;
   }
 
+  const TargetSubtargetInfo *Subtarget = TM->getSubtargetImpl(Fn);
+  const LibcallLoweringInfo &Libcalls =
+  getAnalysis<LibcallLoweringInfoWrapper>().getLibcallLowering(*M,
+   *Subtarget);
+
+  const TargetLowering *TLI = Subtarget->getTargetLowering();
+
   ++NumFunProtected;
   bool Changed =
-  InsertStackProtectors(TM, F, DTU ? &*DTU : nullptr,
+  InsertStackProtectors(*TLI, Libcalls, F, DTU ? &*DTU : nullptr,
 LayoutInfo.HasPrologue, LayoutInfo.HasIRCheck);
 #ifdef EXPENSIVE_CHECKS
   assert((!DTU ||
@@ -519,10 +545,10 @@ bool SSPLayoutAnalysis::requiresStackProtector(Function 
*F,
 
 /// Create a stack guard loading and populate whether SelectionDAG SSP is
 /// supported.
-static Value *getStackGuard(const TargetLoweringBase *TLI, Module *M,
+static Value *getStackGuard(const TargetLoweringBase &TLI, Module *M,
 IRBuilder<> &B,
 bool *SupportsSelectionDAGSP = nullptr) {
-  Value *Guard = TLI->getIRSt

[llvm-branch-commits] [llvm] [AMDGPU][NPM] Disable few non useful passes (PR #172796)

2025-12-18 Thread Matt Arsenault via llvm-branch-commits

https://github.com/arsenm approved this pull request.


https://github.com/llvm/llvm-project/pull/172796


[llvm-branch-commits] [llvm] [NPM] Update OptimizedRegAlloc and MachineLateOptimization pipelines (PR #172795)

2025-12-18 Thread Matt Arsenault via llvm-branch-commits

https://github.com/arsenm approved this pull request.


https://github.com/llvm/llvm-project/pull/172795


[llvm-branch-commits] [llvm] StackProtector: Use LibcallLoweringInfo analysis (PR #170329)

2025-12-18 Thread via llvm-branch-commits

github-actions[bot] wrote:


# :window: Windows x64 Test Results

* 128788 tests passed
* 2828 tests skipped
* 1 test failed

## Failed Tests
(click on a test name to see its output)

### LLVM

LLVM.Transforms/StackProtector/missing-analysis.ll

```
Exit Code: 1

Command Output (stdout):
--
# RUN: at line 2
not c:\_work\llvm-project\llvm-project\build\bin\opt.exe -mtriple=x86_64-- -passes=expand-fp -disable-output C:\_work\llvm-project\llvm-project\llvm\test\Transforms\StackProtector\missing-analysis.ll 2>&1 | c:\_work\llvm-project\llvm-project\build\bin\filecheck.exe C:\_work\llvm-project\llvm-project\llvm\test\Transforms\StackProtector\missing-analysis.ll
# executed command: not 'c:\_work\llvm-project\llvm-project\build\bin\opt.exe' -mtriple=x86_64-- -passes=expand-fp -disable-output 'C:\_work\llvm-project\llvm-project\llvm\test\Transforms\StackProtector\missing-analysis.ll'
# note: command had no output on stdout or stderr
# executed command: 'c:\_work\llvm-project\llvm-project\build\bin\filecheck.exe' 'C:\_work\llvm-project\llvm-project\llvm\test\Transforms\StackProtector\missing-analysis.ll'
# .---command stderr
# | C:\_work\llvm-project\llvm-project\llvm\test\Transforms\StackProtector\missing-analysis.ll:4:10: error: CHECK: expected string not found in input
# | ; CHECK: 'LibcallLoweringModuleAnalysis' analysis required
# |          ^
# | <stdin>:1:1: note: scanning from here
# | c:\_work\llvm-project\llvm-project\build\bin\opt.exe: unknown pass name 'expand-fp'
# | ^
# | <stdin>:1:41: note: possible intended match here
# | c:\_work\llvm-project\llvm-project\build\bin\opt.exe: unknown pass name 'expand-fp'
# |                                         ^
# | 
# | Input file: <stdin>
# | Check file: C:\_work\llvm-project\llvm-project\llvm\test\Transforms\StackProtector\missing-analysis.ll
# | 
# | -dump-input=help explains the following input dump.
# | 
# | Input was:
# | <<
# |            1: c:\_work\llvm-project\llvm-project\build\bin\opt.exe: unknown pass name 'expand-fp' 
# | check:4'0     X~~~ error: no match found
# | check:4'1     ?    possible intended match
# | >>
# `-
# error: command failed with exit status: 1

--

```


If these failures are unrelated to your changes (for example tests are broken 
or flaky at HEAD), please open an issue at 
https://github.com/llvm/llvm-project/issues and add the `infrastructure` label.

https://github.com/llvm/llvm-project/pull/170329


[llvm-branch-commits] [llvm] StackProtector: Use LibcallLoweringInfo analysis (PR #170329)

2025-12-18 Thread via llvm-branch-commits

github-actions[bot] wrote:


# :penguin: Linux x64 Test Results

* 167306 tests passed
* 2957 tests skipped
* 1 test failed

## Failed Tests
(click on a test name to see its output)

### LLVM

LLVM.Transforms/StackProtector/missing-analysis.ll

```
Exit Code: 1

Command Output (stdout):
--
# RUN: at line 2
not /home/gha/actions-runner/_work/llvm-project/llvm-project/build/bin/opt -mtriple=x86_64-- -passes=expand-fp -disable-output /home/gha/actions-runner/_work/llvm-project/llvm-project/llvm/test/Transforms/StackProtector/missing-analysis.ll 2>&1 | /home/gha/actions-runner/_work/llvm-project/llvm-project/build/bin/FileCheck /home/gha/actions-runner/_work/llvm-project/llvm-project/llvm/test/Transforms/StackProtector/missing-analysis.ll
# executed command: not /home/gha/actions-runner/_work/llvm-project/llvm-project/build/bin/opt -mtriple=x86_64-- -passes=expand-fp -disable-output /home/gha/actions-runner/_work/llvm-project/llvm-project/llvm/test/Transforms/StackProtector/missing-analysis.ll
# note: command had no output on stdout or stderr
# executed command: /home/gha/actions-runner/_work/llvm-project/llvm-project/build/bin/FileCheck /home/gha/actions-runner/_work/llvm-project/llvm-project/llvm/test/Transforms/StackProtector/missing-analysis.ll
# .---command stderr
# | /home/gha/actions-runner/_work/llvm-project/llvm-project/llvm/test/Transforms/StackProtector/missing-analysis.ll:4:10: error: CHECK: expected string not found in input
# | ; CHECK: 'LibcallLoweringModuleAnalysis' analysis required
# |          ^
# | <stdin>:1:1: note: scanning from here
# | /home/gha/actions-runner/_work/llvm-project/llvm-project/build/bin/opt: unknown pass name 'expand-fp'
# | ^
# | <stdin>:1:61: note: possible intended match here
# | /home/gha/actions-runner/_work/llvm-project/llvm-project/build/bin/opt: unknown pass name 'expand-fp'
# |                                                             ^
# | 
# | Input file: <stdin>
# | Check file: /home/gha/actions-runner/_work/llvm-project/llvm-project/llvm/test/Transforms/StackProtector/missing-analysis.ll
# | 
# | -dump-input=help explains the following input dump.
# | 
# | Input was:
# | <<
# |            1: /home/gha/actions-runner/_work/llvm-project/llvm-project/build/bin/opt: unknown pass name 'expand-fp' 
# | check:4'0     X~ error: no match found
# | check:4'1     ?  possible intended match
# | >>
# `-
# error: command failed with exit status: 1

--

```


If these failures are unrelated to your changes (for example tests are broken 
or flaky at HEAD), please open an issue at 
https://github.com/llvm/llvm-project/issues and add the `infrastructure` label.

https://github.com/llvm/llvm-project/pull/170329


[llvm-branch-commits] [clang] [Clang] Add __builtin_allow_sanitize_check() (PR #172030)

2025-12-18 Thread Marco Elver via llvm-branch-commits


@@ -3587,6 +3587,30 @@ Sema::CheckBuiltinFunctionCall(FunctionDecl *FDecl, unsigned BuiltinID,
     }
     break;
   }
+
+  case Builtin::BI__builtin_allow_sanitize_check: {
+    Expr *Arg = TheCall->getArg(0);
+    // Check if the argument is a string literal.
+    const StringLiteral *SanitizerName =
+        dyn_cast<StringLiteral>(Arg->IgnoreParenImpCasts());

melver wrote:

I fear that GCC folks won't like that, as it's a bit too magical: the builtin 
would have too many different semantics depending on its inputs (we really 
want GCC to implement it too, otherwise it's unlikely this will get used).

Currently, `__builtin_allow_runtime_check()` is only used to denote checks in 
hot/cold code. If we expand its semantics to depend on sanitizer names, I 
think it will become too complex. In general, I also hesitated between using 
strings to disambiguate and introducing one builtin per sanitizer, but 
keeping the precedent of mapping the strings to `no_sanitize(string)` is 
probably good here.
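
As a rough illustration of the string-keyed design discussed here, the sketch below guards the proposed builtin behind `__has_builtin` so it still compiles on toolchains that lack it. It assumes that `__builtin_allow_sanitize_check("address")` evaluates to false inside a function marked `no_sanitize("address")`; that is an assumption drawn from this PR's stated intent, not a documented feature of any released compiler. The `ALLOW_SANITIZE_CHECK` macro, its `false` fallback, and `skips_asan_check` are hypothetical names used only for this example.

```cpp
// Hypothetical wrapper: use the builtin proposed in PR #172030 when the
// compiler provides it; otherwise assume no sanitizer check is allowed.
#if defined(__has_builtin)
#  if __has_builtin(__builtin_allow_sanitize_check)
#    define ALLOW_SANITIZE_CHECK(name) __builtin_allow_sanitize_check(name)
#  endif
#endif
#ifndef ALLOW_SANITIZE_CHECK
#  define ALLOW_SANITIZE_CHECK(name) false
#endif

// The string passed to the wrapper mirrors the name given to no_sanitize.
// Assumption: with "address" disabled for this function, the builtin (or
// the fallback above) reports that the check is not allowed.
__attribute__((no_sanitize("address")))
bool skips_asan_check() {
  return !ALLOW_SANITIZE_CHECK("address");
}
```

Reusing the `no_sanitize` names means a caller only has to keep one set of strings in sync, which is the precedent the comment above argues for.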

https://github.com/llvm/llvm-project/pull/172030


[llvm-branch-commits] [clang] [Clang] Add __builtin_allow_sanitize_check() (PR #172030)

2025-12-18 Thread Vitaly Buka via llvm-branch-commits

https://github.com/vitalybuka approved this pull request.

@efriedma-quic do we need some RFC for that?
Similar to 
https://discourse.llvm.org/t/rfc-add-llvm-allow-runtime-check-intrinsic/77641

Other than possible redundancy LGTM!

https://github.com/llvm/llvm-project/pull/172030


[llvm-branch-commits] [llvm] [RISCV][NFC] Add RISCVVSETVLIInfoAnalysis (PR #172615)

2025-12-18 Thread Min-Yih Hsu via llvm-branch-commits

https://github.com/mshockwave approved this pull request.

LGTM thanks

https://github.com/llvm/llvm-project/pull/172615