[llvm-branch-commits] [llvm] [LoopInterchange] Fix the vectorizable check for a loop (PR #133667)
https://github.com/kasuga-fj created https://github.com/llvm/llvm-project/pull/133667 In the profitability check for vectorization, the dependency matrix was not handled correctly. This can result to make a wrong decision: It may say "this loop can be vectorized" when in fact it cannot. The root cause of this is that the check process early returns when it finds '=' or 'I' in the dependency matrix. To make sure that we can actually vectorize the loop, we need to check all the rows of the matrix. This patch fixes the process of checking whether we can vectorize the loop or not. Now it won't make a wrong decision for a loop that cannot be vectorized. Related: #131130 >From 2db59e8629d3640ec070eb906ac55a5e970176d1 Mon Sep 17 00:00:00 2001 From: Ryotaro Kasuga Date: Thu, 27 Mar 2025 09:52:16 + Subject: [PATCH] [LoopInterchange] Fix the vectorizable check for a loop In the profitability check for vectorization, the dependency matrix was not handled correctly. This can result to make a wrong decision: It may say "this loop can be vectorized" when in fact it cannot. The root cause of this is that the check process early returns when it finds '=' or 'I' in the dependency matrix. To make sure that we can actually vectorize the loop, we need to check all the rows of the matrix. This patch fixes the process of checking whether we can vectorize the loop or not. Now it won't make a wrong decision for a loop that cannot be vectorized. Related: #131130 --- .../lib/Transforms/Scalar/LoopInterchange.cpp | 41 +++ .../profitability-vectorization-heuristic.ll | 9 ++-- 2 files changed, 27 insertions(+), 23 deletions(-) diff --git a/llvm/lib/Transforms/Scalar/LoopInterchange.cpp b/llvm/lib/Transforms/Scalar/LoopInterchange.cpp index e777f950a7c5a..b6b0b7d7a947a 100644 --- a/llvm/lib/Transforms/Scalar/LoopInterchange.cpp +++ b/llvm/lib/Transforms/Scalar/LoopInterchange.cpp @@ -1197,25 +1197,32 @@ LoopInterchangeProfitability::isProfitablePerInstrOrderCost() { return std::nullopt; } +/// Return true if we can vectorize the loop specified by \p LoopId. +static bool canVectorize(const CharMatrix &DepMatrix, unsigned LoopId) { + for (unsigned I = 0; I != DepMatrix.size(); I++) { +char Dir = DepMatrix[I][LoopId]; +if (Dir != 'I' && Dir != '=') + return false; + } + return true; +} + std::optional LoopInterchangeProfitability::isProfitableForVectorization( unsigned InnerLoopId, unsigned OuterLoopId, CharMatrix &DepMatrix) { - for (auto &Row : DepMatrix) { -// If the inner loop is loop independent or doesn't carry any dependency -// it is not profitable to move this to outer position, since we are -// likely able to do inner loop vectorization already. -if (Row[InnerLoopId] == 'I' || Row[InnerLoopId] == '=') - return std::optional(false); - -// If the outer loop is not loop independent it is not profitable to move -// this to inner position, since doing so would not enable inner loop -// parallelism. -if (Row[OuterLoopId] != 'I' && Row[OuterLoopId] != '=') - return std::optional(false); - } - // If inner loop has dependence and outer loop is loop independent then it - // is/ profitable to interchange to enable inner loop parallelism. - // If there are no dependences, interchanging will not improve anything. - return std::optional(!DepMatrix.empty()); + // If the outer loop is not loop independent it is not profitable to move + // this to inner position, since doing so would not enable inner loop + // parallelism. + if (!canVectorize(DepMatrix, OuterLoopId)) +return false; + + // If inner loop has dependence and outer loop is loop independent then it is + // profitable to interchange to enable inner loop parallelism. + if (!canVectorize(DepMatrix, InnerLoopId)) +return true; + + // TODO: Estimate the cost of vectorized loop body when both the outer and the + // inner loop can be vectorized. + return std::nullopt; } bool LoopInterchangeProfitability::isProfitable( diff --git a/llvm/test/Transforms/LoopInterchange/profitability-vectorization-heuristic.ll b/llvm/test/Transforms/LoopInterchange/profitability-vectorization-heuristic.ll index 606117e70db86..b82dd5141a6b2 100644 --- a/llvm/test/Transforms/LoopInterchange/profitability-vectorization-heuristic.ll +++ b/llvm/test/Transforms/LoopInterchange/profitability-vectorization-heuristic.ll @@ -15,16 +15,13 @@ ; } ; } ; -; FIXME: These loops are not exchanged at this time due to the problem of -; profitablity heuristic for vectorization. -; CHECK: --- !Missed +; CHECK: --- !Passed ; CHECK-NEXT: Pass:loop-interchange -; CHECK-NEXT: Name:InterchangeNotProfitable +; CHECK-NEXT: Name:Interchanged ; CHECK-NEXT: Function:interchange_necesasry_for_vectorization ; CHECK-NEXT: Args: -; CHECK-NEXT: - String: Interchanging loops is not considered to improve cache locality nor vectori
[llvm-branch-commits] [llvm] [LoopInterchange] Improve profitability check for vectorization (PR #133672)
@@ -80,6 +80,21 @@ enum class RuleTy { ForVectorization, }; +/// Store the information about if corresponding direction vector was negated kasuga-fj wrote: > But I now guess that the complication here is the unique entries in the > dependency matrix, is that right? Yes. (But holding two boolean values is a bit redundant. What is actually needed are three states. If both of them are false, it is an illegal state.) > I am wondering if it isn't easier to keep all the entries and don't make them > unique? I think it would be simpler. Also, there is no need to stop making entries unique altogether. If duplicate direction vectors are allowed, I think the simplest implementation would be to keep pairs of a direction vector and a boolean value indicating whether the corresponding vector is negated. However, I'm not sure how effective it is to make direction vectors unique. In the worst case, holding pairs of a vector and a boolean value instead of a single vector doubles the number of entries. Is this allowed? https://github.com/llvm/llvm-project/pull/133672 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [LoopInterchange] Add tests for the vectorization profitability (NFC) (PR #133665)
https://github.com/kasuga-fj edited https://github.com/llvm/llvm-project/pull/133665 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [LoopInterchange] Improve profitability check for vectorization (PR #133672)
https://github.com/kasuga-fj updated https://github.com/llvm/llvm-project/pull/133672 >From 72b48ba6d6b70eb9a65abdc516697f3dee9c7a2e Mon Sep 17 00:00:00 2001 From: Ryotaro Kasuga Date: Thu, 27 Mar 2025 10:45:26 + Subject: [PATCH] [LoopInterchange] Improve profitability check for vectorization The vectorization profitability has a process to check whether a given loop can be vectorized or not. Since the process is conservative, a loop that can be vectorized may be deemed not to be possible. This can trigger unnecessary exchanges. This patch improves the profitability decision by mitigating such misjudgments. Before this patch, we considered a loop to be vectorizable only when there are no loop carried dependencies with the IV of the loop. However, a loop carried dependency doesn't prevent vectorization if the distance is positive. This patch makes the vectorization check more accurate by allowing a loop with the positive dependency. Note that it is difficult to make a complete decision whether a loop can be vectorized or not. To achieve this, we must check the vector width and the distance of dependency. --- .../lib/Transforms/Scalar/LoopInterchange.cpp | 128 ++ .../profitability-vectorization-heuristic.ll | 8 +- 2 files changed, 106 insertions(+), 30 deletions(-) diff --git a/llvm/lib/Transforms/Scalar/LoopInterchange.cpp b/llvm/lib/Transforms/Scalar/LoopInterchange.cpp index b6b0b7d7a947a..cb2cb6652899c 100644 --- a/llvm/lib/Transforms/Scalar/LoopInterchange.cpp +++ b/llvm/lib/Transforms/Scalar/LoopInterchange.cpp @@ -17,8 +17,8 @@ #include "llvm/ADT/SmallSet.h" #include "llvm/ADT/SmallVector.h" #include "llvm/ADT/Statistic.h" +#include "llvm/ADT/StringMap.h" #include "llvm/ADT/StringRef.h" -#include "llvm/ADT/StringSet.h" #include "llvm/Analysis/DependenceAnalysis.h" #include "llvm/Analysis/LoopCacheAnalysis.h" #include "llvm/Analysis/LoopInfo.h" @@ -80,6 +80,21 @@ enum class RuleTy { ForVectorization, }; +/// Store the information about if corresponding direction vector was negated +/// by normalization or not. This is necessary to restore the original one from +/// a row of a dependency matrix, because we only manage normalized direction +/// vectors and duplicate vectors are eliminated. So there may be both original +/// and negated vectors for a single entry (a row of dependency matrix). E.g., +/// if there are two direction vectors `[< =]` and `[> =]`, the later one will +/// be converted to the same as former one by normalization, so only `[< =]` +/// would be retained in the final result. +struct NegatedStatus { + bool Original = false; + bool Negated = false; + + bool isNonNegativeDir(char Dir) const; +}; + } // end anonymous namespace // Minimum loop depth supported. @@ -126,9 +141,10 @@ static void printDepMatrix(CharMatrix &DepMatrix) { } #endif -static bool populateDependencyMatrix(CharMatrix &DepMatrix, unsigned Level, - Loop *L, DependenceInfo *DI, - ScalarEvolution *SE, +static bool populateDependencyMatrix(CharMatrix &DepMatrix, + std::vector &NegStatusVec, + unsigned Level, Loop *L, + DependenceInfo *DI, ScalarEvolution *SE, OptimizationRemarkEmitter *ORE) { using ValueVector = SmallVector; @@ -167,7 +183,9 @@ static bool populateDependencyMatrix(CharMatrix &DepMatrix, unsigned Level, return false; } ValueVector::iterator I, IE, J, JE; - StringSet<> Seen; + + // Manage all found direction vectors. and map it to the index of DepMatrix. + StringMap Seen; for (I = MemInstr.begin(), IE = MemInstr.end(); I != IE; ++I) { for (J = I, JE = MemInstr.end(); J != JE; ++J) { @@ -182,7 +200,8 @@ static bool populateDependencyMatrix(CharMatrix &DepMatrix, unsigned Level, assert(D->isOrdered() && "Expected an output, flow or anti dep."); // If the direction vector is negative, normalize it to // make it non-negative. -if (D->normalize(SE)) +bool Normalized = D->normalize(SE); +if (Normalized) LLVM_DEBUG(dbgs() << "Negative dependence vector normalized.\n"); LLVM_DEBUG(StringRef DepType = D->isFlow() ? "flow" : D->isAnti() ? "anti" : "output"; @@ -214,8 +233,17 @@ static bool populateDependencyMatrix(CharMatrix &DepMatrix, unsigned Level, } // Make sure we only add unique entries to the dependency matrix. -if (Seen.insert(StringRef(Dep.data(), Dep.size())).second) +unsigned Index = DepMatrix.size(); +auto [Ite, Inserted] = +Seen.try_emplace(StringRef(Dep.data(), Dep.size()), Index); +if (Inserted) { DepMatrix.push_back(Dep); + NegStatusVec.push_back(NegatedStatus{}); +} else + Index = Ite->second; + +
[llvm-branch-commits] [llvm] [LoopInterchange] Fix the vectorizable check for a loop (PR #133667)
https://github.com/kasuga-fj updated https://github.com/llvm/llvm-project/pull/133667 >From ae5d9cff055000480e7e71205265b18911440e29 Mon Sep 17 00:00:00 2001 From: Ryotaro Kasuga Date: Thu, 27 Mar 2025 09:52:16 + Subject: [PATCH] [LoopInterchange] Fix the vectorizable check for a loop In the profitability check for vectorization, the dependency matrix was not handled correctly. This can result to make a wrong decision: It may say "this loop can be vectorized" when in fact it cannot. The root cause of this is that the check process early returns when it finds '=' or 'I' in the dependency matrix. To make sure that we can actually vectorize the loop, we need to check all the rows of the matrix. This patch fixes the process of checking whether we can vectorize the loop or not. Now it won't make a wrong decision for a loop that cannot be vectorized. Related: #131130 --- .../lib/Transforms/Scalar/LoopInterchange.cpp | 41 +++ .../profitability-vectorization-heuristic.ll | 9 ++-- 2 files changed, 27 insertions(+), 23 deletions(-) diff --git a/llvm/lib/Transforms/Scalar/LoopInterchange.cpp b/llvm/lib/Transforms/Scalar/LoopInterchange.cpp index e777f950a7c5a..b6b0b7d7a947a 100644 --- a/llvm/lib/Transforms/Scalar/LoopInterchange.cpp +++ b/llvm/lib/Transforms/Scalar/LoopInterchange.cpp @@ -1197,25 +1197,32 @@ LoopInterchangeProfitability::isProfitablePerInstrOrderCost() { return std::nullopt; } +/// Return true if we can vectorize the loop specified by \p LoopId. +static bool canVectorize(const CharMatrix &DepMatrix, unsigned LoopId) { + for (unsigned I = 0; I != DepMatrix.size(); I++) { +char Dir = DepMatrix[I][LoopId]; +if (Dir != 'I' && Dir != '=') + return false; + } + return true; +} + std::optional LoopInterchangeProfitability::isProfitableForVectorization( unsigned InnerLoopId, unsigned OuterLoopId, CharMatrix &DepMatrix) { - for (auto &Row : DepMatrix) { -// If the inner loop is loop independent or doesn't carry any dependency -// it is not profitable to move this to outer position, since we are -// likely able to do inner loop vectorization already. -if (Row[InnerLoopId] == 'I' || Row[InnerLoopId] == '=') - return std::optional(false); - -// If the outer loop is not loop independent it is not profitable to move -// this to inner position, since doing so would not enable inner loop -// parallelism. -if (Row[OuterLoopId] != 'I' && Row[OuterLoopId] != '=') - return std::optional(false); - } - // If inner loop has dependence and outer loop is loop independent then it - // is/ profitable to interchange to enable inner loop parallelism. - // If there are no dependences, interchanging will not improve anything. - return std::optional(!DepMatrix.empty()); + // If the outer loop is not loop independent it is not profitable to move + // this to inner position, since doing so would not enable inner loop + // parallelism. + if (!canVectorize(DepMatrix, OuterLoopId)) +return false; + + // If inner loop has dependence and outer loop is loop independent then it is + // profitable to interchange to enable inner loop parallelism. + if (!canVectorize(DepMatrix, InnerLoopId)) +return true; + + // TODO: Estimate the cost of vectorized loop body when both the outer and the + // inner loop can be vectorized. + return std::nullopt; } bool LoopInterchangeProfitability::isProfitable( diff --git a/llvm/test/Transforms/LoopInterchange/profitability-vectorization-heuristic.ll b/llvm/test/Transforms/LoopInterchange/profitability-vectorization-heuristic.ll index 606117e70db86..b82dd5141a6b2 100644 --- a/llvm/test/Transforms/LoopInterchange/profitability-vectorization-heuristic.ll +++ b/llvm/test/Transforms/LoopInterchange/profitability-vectorization-heuristic.ll @@ -15,16 +15,13 @@ ; } ; } ; -; FIXME: These loops are not exchanged at this time due to the problem of -; profitablity heuristic for vectorization. -; CHECK: --- !Missed +; CHECK: --- !Passed ; CHECK-NEXT: Pass:loop-interchange -; CHECK-NEXT: Name:InterchangeNotProfitable +; CHECK-NEXT: Name:Interchanged ; CHECK-NEXT: Function:interchange_necesasry_for_vectorization ; CHECK-NEXT: Args: -; CHECK-NEXT: - String: Interchanging loops is not considered to improve cache locality nor vectorization. -; CHECK-NEXT: ... +; CHECK-NEXT: - String: Loop interchanged with enclosing loop. define void @interchange_necesasry_for_vectorization() { entry: br label %for.i.header ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [LoopInterchange] Improve profitability check for vectorization (PR #133672)
https://github.com/kasuga-fj updated https://github.com/llvm/llvm-project/pull/133672 >From 692e4de4f84281f8c2bc5f7278f8066929df3cd6 Mon Sep 17 00:00:00 2001 From: Ryotaro Kasuga Date: Thu, 27 Mar 2025 10:45:26 + Subject: [PATCH] [LoopInterchange] Improve profitability check for vectorization The vectorization profitability has a process to check whether a given loop can be vectorized or not. Since the process is conservative, a loop that can be vectorized may be deemed not to be possible. This can trigger unnecessary exchanges. This patch improves the profitability decision by mitigating such misjudgments. Before this patch, we considered a loop to be vectorizable only when there are no loop carried dependencies with the IV of the loop. However, a loop carried dependency doesn't prevent vectorization if the distance is positive. This patch makes the vectorization check more accurate by allowing a loop with the positive dependency. Note that it is difficult to make a complete decision whether a loop can be vectorized or not. To achieve this, we must check the vector width and the distance of dependency. --- .../lib/Transforms/Scalar/LoopInterchange.cpp | 128 ++ .../profitability-vectorization-heuristic.ll | 8 +- 2 files changed, 106 insertions(+), 30 deletions(-) diff --git a/llvm/lib/Transforms/Scalar/LoopInterchange.cpp b/llvm/lib/Transforms/Scalar/LoopInterchange.cpp index 1dccba4cfa7b8..078da53c52b52 100644 --- a/llvm/lib/Transforms/Scalar/LoopInterchange.cpp +++ b/llvm/lib/Transforms/Scalar/LoopInterchange.cpp @@ -17,8 +17,8 @@ #include "llvm/ADT/SmallSet.h" #include "llvm/ADT/SmallVector.h" #include "llvm/ADT/Statistic.h" +#include "llvm/ADT/StringMap.h" #include "llvm/ADT/StringRef.h" -#include "llvm/ADT/StringSet.h" #include "llvm/Analysis/DependenceAnalysis.h" #include "llvm/Analysis/LoopCacheAnalysis.h" #include "llvm/Analysis/LoopInfo.h" @@ -80,6 +80,21 @@ enum class RuleTy { ForVectorization, }; +/// Store the information about if corresponding direction vector was negated +/// by normalization or not. This is necessary to restore the original one from +/// a row of a dependency matrix, because we only manage normalized direction +/// vectors and duplicate vectors are eliminated. So there may be both original +/// and negated vectors for a single entry (a row of dependency matrix). E.g., +/// if there are two direction vectors `[< =]` and `[> =]`, the later one will +/// be converted to the same as former one by normalization, so only `[< =]` +/// would be retained in the final result. +struct NegatedStatus { + bool Original = false; + bool Negated = false; + + bool isNonNegativeDir(char Dir) const; +}; + } // end anonymous namespace // Minimum loop depth supported. @@ -126,9 +141,10 @@ static void printDepMatrix(CharMatrix &DepMatrix) { } #endif -static bool populateDependencyMatrix(CharMatrix &DepMatrix, unsigned Level, - Loop *L, DependenceInfo *DI, - ScalarEvolution *SE, +static bool populateDependencyMatrix(CharMatrix &DepMatrix, + std::vector &NegStatusVec, + unsigned Level, Loop *L, + DependenceInfo *DI, ScalarEvolution *SE, OptimizationRemarkEmitter *ORE) { using ValueVector = SmallVector; @@ -167,7 +183,9 @@ static bool populateDependencyMatrix(CharMatrix &DepMatrix, unsigned Level, return false; } ValueVector::iterator I, IE, J, JE; - StringSet<> Seen; + + // Manage all found direction vectors. and map it to the index of DepMatrix. + StringMap Seen; for (I = MemInstr.begin(), IE = MemInstr.end(); I != IE; ++I) { for (J = I, JE = MemInstr.end(); J != JE; ++J) { @@ -182,7 +200,8 @@ static bool populateDependencyMatrix(CharMatrix &DepMatrix, unsigned Level, assert(D->isOrdered() && "Expected an output, flow or anti dep."); // If the direction vector is negative, normalize it to // make it non-negative. -if (D->normalize(SE)) +bool Normalized = D->normalize(SE); +if (Normalized) LLVM_DEBUG(dbgs() << "Negative dependence vector normalized.\n"); LLVM_DEBUG(StringRef DepType = D->isFlow() ? "flow" : D->isAnti() ? "anti" : "output"; @@ -214,8 +233,17 @@ static bool populateDependencyMatrix(CharMatrix &DepMatrix, unsigned Level, } // Make sure we only add unique entries to the dependency matrix. -if (Seen.insert(StringRef(Dep.data(), Dep.size())).second) +unsigned Index = DepMatrix.size(); +auto [Ite, Inserted] = +Seen.try_emplace(StringRef(Dep.data(), Dep.size()), Index); +if (Inserted) { DepMatrix.push_back(Dep); + NegStatusVec.push_back(NegatedStatus{}); +} else + Index = Ite->second; + +
[llvm-branch-commits] [llvm] [LoopInterchange] Fix the vectorizable check for a loop (PR #133667)
https://github.com/kasuga-fj updated https://github.com/llvm/llvm-project/pull/133667 >From bd84ddc9e4dc645e965b2a6dc535a3023e0d7e45 Mon Sep 17 00:00:00 2001 From: Ryotaro Kasuga Date: Thu, 27 Mar 2025 09:52:16 + Subject: [PATCH] [LoopInterchange] Fix the vectorizable check for a loop In the profitability check for vectorization, the dependency matrix was not handled correctly. This can result to make a wrong decision: It may say "this loop can be vectorized" when in fact it cannot. The root cause of this is that the check process early returns when it finds '=' or 'I' in the dependency matrix. To make sure that we can actually vectorize the loop, we need to check all the rows of the matrix. This patch fixes the process of checking whether we can vectorize the loop or not. Now it won't make a wrong decision for a loop that cannot be vectorized. Related: #131130 --- .../lib/Transforms/Scalar/LoopInterchange.cpp | 44 --- .../profitability-vectorization-heuristic.ll | 9 ++-- 2 files changed, 30 insertions(+), 23 deletions(-) diff --git a/llvm/lib/Transforms/Scalar/LoopInterchange.cpp b/llvm/lib/Transforms/Scalar/LoopInterchange.cpp index e777f950a7c5a..1dccba4cfa7b8 100644 --- a/llvm/lib/Transforms/Scalar/LoopInterchange.cpp +++ b/llvm/lib/Transforms/Scalar/LoopInterchange.cpp @@ -1197,25 +1197,35 @@ LoopInterchangeProfitability::isProfitablePerInstrOrderCost() { return std::nullopt; } +/// Return true if we can vectorize the loop specified by \p LoopId. +static bool canVectorize(const CharMatrix &DepMatrix, unsigned LoopId) { + for (unsigned I = 0; I != DepMatrix.size(); I++) { +char Dir = DepMatrix[I][LoopId]; +if (Dir != 'I' && Dir != '=') + return false; + } + return true; +} + std::optional LoopInterchangeProfitability::isProfitableForVectorization( unsigned InnerLoopId, unsigned OuterLoopId, CharMatrix &DepMatrix) { - for (auto &Row : DepMatrix) { -// If the inner loop is loop independent or doesn't carry any dependency -// it is not profitable to move this to outer position, since we are -// likely able to do inner loop vectorization already. -if (Row[InnerLoopId] == 'I' || Row[InnerLoopId] == '=') - return std::optional(false); - -// If the outer loop is not loop independent it is not profitable to move -// this to inner position, since doing so would not enable inner loop -// parallelism. -if (Row[OuterLoopId] != 'I' && Row[OuterLoopId] != '=') - return std::optional(false); - } - // If inner loop has dependence and outer loop is loop independent then it - // is/ profitable to interchange to enable inner loop parallelism. - // If there are no dependences, interchanging will not improve anything. - return std::optional(!DepMatrix.empty()); + // If the outer loop is not loop independent it is not profitable to move + // this to inner position, since doing so would not enable inner loop + // parallelism. + if (!canVectorize(DepMatrix, OuterLoopId)) +return false; + + // If inner loop has dependence and outer loop is loop independent then it is + // profitable to interchange to enable inner loop parallelism. + if (!canVectorize(DepMatrix, InnerLoopId)) +return true; + + // If both the inner and the outer loop can be vectorized, it is necessary to + // check the cost of each vectorized loop for profitability decision. At this + // time we do not have a cost model to estimate them, so return nullopt. + // TODO: Estimate the cost of vectorized loop when both the outer and the + // inner loop can be vectorized. + return std::nullopt; } bool LoopInterchangeProfitability::isProfitable( diff --git a/llvm/test/Transforms/LoopInterchange/profitability-vectorization-heuristic.ll b/llvm/test/Transforms/LoopInterchange/profitability-vectorization-heuristic.ll index 606117e70db86..b82dd5141a6b2 100644 --- a/llvm/test/Transforms/LoopInterchange/profitability-vectorization-heuristic.ll +++ b/llvm/test/Transforms/LoopInterchange/profitability-vectorization-heuristic.ll @@ -15,16 +15,13 @@ ; } ; } ; -; FIXME: These loops are not exchanged at this time due to the problem of -; profitablity heuristic for vectorization. -; CHECK: --- !Missed +; CHECK: --- !Passed ; CHECK-NEXT: Pass:loop-interchange -; CHECK-NEXT: Name:InterchangeNotProfitable +; CHECK-NEXT: Name:Interchanged ; CHECK-NEXT: Function:interchange_necesasry_for_vectorization ; CHECK-NEXT: Args: -; CHECK-NEXT: - String: Interchanging loops is not considered to improve cache locality nor vectorization. -; CHECK-NEXT: ... +; CHECK-NEXT: - String: Loop interchanged with enclosing loop. define void @interchange_necesasry_for_vectorization() { entry: br label %for.i.header ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-b
[llvm-branch-commits] [llvm] [MachinePipeliner] Introduce a new class for loop-carried deps (PR #137663)
https://github.com/kasuga-fj created https://github.com/llvm/llvm-project/pull/137663 In MachinePipeliner, loop-carried memory dependencies are represented by DAG, which makes things complicated and causes some necessary dependencies to be missing. This patch introduces a new class to manage loop-carried memory dependencies to simplify the logic. The ultimate goal is to add currently missing dependencies, but this is a first step of that, and this patch doesn't intend to change current behavior. This patch also adds new tests that show the missed dependencies, which should be fixed in the future. Split off from #135148 >From 487aa84ff51e727fc53b1dc5da52281c14adebca Mon Sep 17 00:00:00 2001 From: Ryotaro Kasuga Date: Mon, 28 Apr 2025 13:04:47 + Subject: [PATCH] [MachinePipeliner] Introduce a new class for loop-carried deps In MachinePipeliner, loop-carried memory dependencies are represented by DAG, which makes things complicated and causes some necessary dependencies to be missing. This patch introduces a new class to manage loop-carried memory dependencies to simplify the logic. The ultimate goal is to add currently missing dependencies, but this is a first step of that, and this patch doesn't intend to change current behavior. This patch also adds new tests that show the missed dependencies, which should be fixed in the future. Split off from #135148 --- llvm/include/llvm/CodeGen/MachinePipeliner.h | 34 ++- llvm/lib/CodeGen/MachinePipeliner.cpp | 253 -- .../sms-loop-carried-fp-exceptions1.mir | 109 .../sms-loop-carried-fp-exceptions2.mir | 99 +++ .../Hexagon/swp-loop-carried-order-dep1.mir | 110 .../Hexagon/swp-loop-carried-order-dep2.mir | 105 .../Hexagon/swp-loop-carried-order-dep3.mir | 109 .../Hexagon/swp-loop-carried-order-dep4.mir | 109 .../Hexagon/swp-loop-carried-order-dep5.mir | 111 .../Hexagon/swp-loop-carried-order-dep6.mir | 154 +++ 10 files changed, 1163 insertions(+), 30 deletions(-) create mode 100644 llvm/test/CodeGen/AArch64/sms-loop-carried-fp-exceptions1.mir create mode 100644 llvm/test/CodeGen/AArch64/sms-loop-carried-fp-exceptions2.mir create mode 100644 llvm/test/CodeGen/Hexagon/swp-loop-carried-order-dep1.mir create mode 100644 llvm/test/CodeGen/Hexagon/swp-loop-carried-order-dep2.mir create mode 100644 llvm/test/CodeGen/Hexagon/swp-loop-carried-order-dep3.mir create mode 100644 llvm/test/CodeGen/Hexagon/swp-loop-carried-order-dep4.mir create mode 100644 llvm/test/CodeGen/Hexagon/swp-loop-carried-order-dep5.mir create mode 100644 llvm/test/CodeGen/Hexagon/swp-loop-carried-order-dep6.mir diff --git a/llvm/include/llvm/CodeGen/MachinePipeliner.h b/llvm/include/llvm/CodeGen/MachinePipeliner.h index 966ffb7a1fbd2..e4e794c434adb 100644 --- a/llvm/include/llvm/CodeGen/MachinePipeliner.h +++ b/llvm/include/llvm/CodeGen/MachinePipeliner.h @@ -190,6 +190,38 @@ class SwingSchedulerDDGEdge { bool ignoreDependence(bool IgnoreAnti) const; }; +/// Represents loop-carried dependencies. Because SwingSchedulerDAG doesn't +/// assume cycle dependencies as the name suggests, such dependencies must be +/// handled separately. After DAG construction is finished, these dependencies +/// are added to SwingSchedulerDDG. +/// TODO: Also handle output-dependencies introduced by physical registers. +struct LoopCarriedEdges { + using OrderDep = SmallSetVector; + using OrderDepsType = DenseMap; + + OrderDepsType OrderDeps; + + const OrderDep *getOrderDepOrNull(SUnit *Key) const { +auto Ite = OrderDeps.find(Key); +if (Ite == OrderDeps.end()) + return nullptr; +return &Ite->second; + } + + /// Retruns true if the edge from \p From to \p To is a back-edge that should + /// be used when scheduling. + bool shouldUseWhenScheduling(const SUnit *From, const SUnit *To) const; + + /// Adds some edges to the original DAG that correspond to loop-carried + /// dependencies. Historically, loop-carried edges are represented by using + /// non-loop-carried edges in the original DAG. This function appends such + /// edges to preserve the previous behavior. + void modifySUnits(std::vector &SUnits); + + void dump(SUnit *SU, const TargetRegisterInfo *TRI, +const MachineRegisterInfo *MRI) const; +}; + /// Represents dependencies between instructions. This class is a wrapper of /// `SUnits` and its dependencies to manipulate back-edges in a natural way. /// Currently it only supports back-edges via PHI, which are expressed as @@ -402,7 +434,7 @@ class SwingSchedulerDAG : public ScheduleDAGInstrs { const MachineInstr *OtherMI) const; private: - void addLoopCarriedDependences(); + LoopCarriedEdges addLoopCarriedDependences(); void updatePhiDependences(); void changeDependences(); unsigned calculateResMII(); diff --git a/llvm/lib/CodeGen/MachinePipeliner.cpp b/llvm/lib/CodeGen/Machin
[llvm-branch-commits] [llvm] [MachinePipeliner] Introduce a new class for loop-carried deps (PR #137663)
kasuga-fj wrote: This PR is a part of [Stacked Pull Requests](https://llvm.org/docs/GitHub.html#stacked-pull-requests) and depends on #137662. The target branch will automatically change to main after the dependent PR is merged. Could you please take a look at #137662 at first? https://github.com/llvm/llvm-project/pull/137663 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [LoopInterchange] Add tests for the vectorization profitability (NFC) (PR #133665)
https://github.com/kasuga-fj created https://github.com/llvm/llvm-project/pull/133665 There is a problem with the current profitability check for vectorization in LoopInterchange. There are both false positives and false negatives. The former means that the heuristic may say that "an exchange is necessary to vectorize the innermost loop" even though it's already possible. The latter means that the heuristic may miss a case where an exchange is necessary to vectorize the innermost loop. Note that this is not a dependency analysis problem. These problems can occur even if the analysis is accurate (no overestimation). This patch adds tests to clarify the cases that should be fixed. The root cause of these cases is that the heuristic doesn't handle the direction of a dependency correctly. >From b53b7ce2b303ff9ea94d77b3ffe74d1697db9f3d Mon Sep 17 00:00:00 2001 From: Ryotaro Kasuga Date: Thu, 27 Mar 2025 07:04:27 + Subject: [PATCH] [LoopInterchange] Add tests for the vectorization profitability (NFC) There is a problem with the current profitability check for vectorization in LoopInterchange. There are both false positives and false negatives. The former means that the heuristic may say that "an exchange is necessary to vectorize the innermost loop" even though it's already possible. The latter means that the heuristic may miss a case where an exchange is necessary to vectorize the innermost loop. Note that this is not a dependency analysis problem. These problems can occur even if the analysis is accurate (no overestimation). This patch adds tests to clarify the cases that should be fixed. The root cause of these cases is that the heuristic doesn't handle the direction of a dependency correctly. --- .../profitability-vectorization-heuristic.ll | 108 ++ 1 file changed, 108 insertions(+) create mode 100644 llvm/test/Transforms/LoopInterchange/profitability-vectorization-heuristic.ll diff --git a/llvm/test/Transforms/LoopInterchange/profitability-vectorization-heuristic.ll b/llvm/test/Transforms/LoopInterchange/profitability-vectorization-heuristic.ll new file mode 100644 index 0..606117e70db86 --- /dev/null +++ b/llvm/test/Transforms/LoopInterchange/profitability-vectorization-heuristic.ll @@ -0,0 +1,108 @@ +; RUN: opt < %s -passes=loop-interchange -cache-line-size=64 \ +; RUN: -pass-remarks-output=%t -disable-output -loop-interchange-profitabilities=vectorize +; RUN: FileCheck -input-file %t %s + +@A = dso_local global [256 x [256 x float]] zeroinitializer +@B = dso_local global [256 x [256 x float]] zeroinitializer +@C = dso_local global [256 x [256 x float]] zeroinitializer + +; Check that the below loops are exchanged for vectorization. +; +; for (int i = 0; i < 256; i++) { +; for (int j = 1; j < 256; j++) { +; A[i][j] = A[i][j-1] + B[i][j]; +; C[i][j] += 1; +; } +; } +; +; FIXME: These loops are not exchanged at this time due to the problem of +; profitablity heuristic for vectorization. + +; CHECK: --- !Missed +; CHECK-NEXT: Pass:loop-interchange +; CHECK-NEXT: Name:InterchangeNotProfitable +; CHECK-NEXT: Function:interchange_necesasry_for_vectorization +; CHECK-NEXT: Args: +; CHECK-NEXT: - String: Interchanging loops is not considered to improve cache locality nor vectorization. +; CHECK-NEXT: ... +define void @interchange_necesasry_for_vectorization() { +entry: + br label %for.i.header + +for.i.header: + %i = phi i64 [ 1, %entry ], [ %i.next, %for.i.inc ] + br label %for.j.body + +for.j.body: + %j = phi i64 [ 1, %for.i.header ], [ %j.next, %for.j.body ] + %j.dec = add nsw i64 %j, -1 + %a.load.index = getelementptr nuw inbounds [256 x [256 x float]], ptr @A, i64 %i, i64 %j.dec + %b.index = getelementptr nuw inbounds [256 x [256 x float]], ptr @B, i64 %i, i64 %j + %c.index = getelementptr nuw inbounds [256 x [256 x float]], ptr @C, i64 %i, i64 %j + %a = load float, ptr %a.load.index, align 4 + %b = load float, ptr %b.index, align 4 + %c = load float, ptr %c.index, align 4 + %add.0 = fadd float %a, %b + %a.store.index = getelementptr nuw inbounds [256 x [256 x float]], ptr @A, i64 %i, i64 %j + store float %add.0, ptr %a.store.index, align 4 + %add.1 = fadd float %c, 1.0 + store float %add.1, ptr %c.index, align 4 + %j.next = add nuw nsw i64 %j, 1 + %cmp.j = icmp eq i64 %j.next, 256 + br i1 %cmp.j, label %for.i.inc, label %for.j.body + +for.i.inc: + %i.next = add nuw nsw i64 %i, 1 + %cmp.i = icmp eq i64 %i.next, 256 + br i1 %cmp.i, label %exit, label %for.i.header + +exit: + ret void +} + +; Check that the following innermost loop can be vectorized so that +; interchangig is unnecessary. +; +; for (int i = 0; i < 256; i++) +; for (int j = 1; j < 256; j++) +; A[i][j-1] = A[i][j] + B[i][j]; +; +; FIXME: These loops are exchanged at this time due to the problem of +; profitablity heuristic for vectorization. + +; CHECK: --- !Passed +; CHECK-NEXT: Pass
[llvm-branch-commits] [llvm] [LoopInterchange] Improve profitability check for vectorization (PR #133672)
https://github.com/kasuga-fj created https://github.com/llvm/llvm-project/pull/133672 The vectorization profitability has a process to check whether a given loop can be vectorized or not. Since the process is conservative, a loop that can be vectorized may be deemed not to be possible. This can trigger unnecessary exchanges. This patch improves the profitability decision by mitigating such misjudgments. Before this patch, we considered a loop to be vectorizable only when there are no loop carried dependencies with the IV of the loop. However, a loop carried dependency doesn't prevent vectorization if the distance is positive. This patch makes the vectorization check more accurate by allowing a loop with the positive dependency. Note that it is difficult to make a complete decision whether a loop can be vectorized or not. To achieve this, we must check the vector width and the distance of dependency. >From cdec72a2b2c365e29cbe05f2ad2d21b403104999 Mon Sep 17 00:00:00 2001 From: Ryotaro Kasuga Date: Thu, 27 Mar 2025 10:45:26 + Subject: [PATCH] [LoopInterchange] Improve profitability check for vectorization The vectorization profitability has a process to check whether a given loop can be vectorized or not. Since the process is conservative, a loop that can be vectorized may be deemed not to be possible. This can trigger unnecessary exchanges. This patch improves the profitability decision by mitigating such misjudgments. Before this patch, we considered a loop to be vectorizable only when there are no loop carried dependencies with the IV of the loop. However, a loop carried dependency doesn't prevent vectorization if the distance is positive. This patch makes the vectorization check more accurate by allowing a loop with the positive dependency. Note that it is difficult to make a complete decision whether a loop can be vectorized or not. To achieve this, we must check the vector width and the distance of dependency. --- .../lib/Transforms/Scalar/LoopInterchange.cpp | 128 ++ .../profitability-vectorization-heuristic.ll | 8 +- 2 files changed, 106 insertions(+), 30 deletions(-) diff --git a/llvm/lib/Transforms/Scalar/LoopInterchange.cpp b/llvm/lib/Transforms/Scalar/LoopInterchange.cpp index b6b0b7d7a947a..0c3a9cbfeed5f 100644 --- a/llvm/lib/Transforms/Scalar/LoopInterchange.cpp +++ b/llvm/lib/Transforms/Scalar/LoopInterchange.cpp @@ -17,8 +17,8 @@ #include "llvm/ADT/SmallSet.h" #include "llvm/ADT/SmallVector.h" #include "llvm/ADT/Statistic.h" +#include "llvm/ADT/StringMap.h" #include "llvm/ADT/StringRef.h" -#include "llvm/ADT/StringSet.h" #include "llvm/Analysis/DependenceAnalysis.h" #include "llvm/Analysis/LoopCacheAnalysis.h" #include "llvm/Analysis/LoopInfo.h" @@ -80,6 +80,21 @@ enum class RuleTy { ForVectorization, }; +/// Store the information about if corresponding direction vector was negated +/// by normalization or not. This is necessary to restore the original one from +/// a row of a dependency matrix because we only manage normalized direction +/// vectors. Also, duplicate vectors are eliminated, so there may be both +/// original and negated vectors for a single entry (a row of dependency +/// matrix). E.g., if there are two direction vectors `[< =]` and `[> =]`, the +/// later one will be converted to the same as former one by normalization, so +/// only `[< =]` would be retained in the final result. +struct NegatedStatus { + bool Original = false; + bool Negated = false; + + bool isNonNegativeDir(char Dir) const; +}; + } // end anonymous namespace // Minimum loop depth supported. @@ -126,9 +141,10 @@ static void printDepMatrix(CharMatrix &DepMatrix) { } #endif -static bool populateDependencyMatrix(CharMatrix &DepMatrix, unsigned Level, - Loop *L, DependenceInfo *DI, - ScalarEvolution *SE, +static bool populateDependencyMatrix(CharMatrix &DepMatrix, + std::vector &NegStatusVec, + unsigned Level, Loop *L, + DependenceInfo *DI, ScalarEvolution *SE, OptimizationRemarkEmitter *ORE) { using ValueVector = SmallVector; @@ -167,7 +183,9 @@ static bool populateDependencyMatrix(CharMatrix &DepMatrix, unsigned Level, return false; } ValueVector::iterator I, IE, J, JE; - StringSet<> Seen; + + // Manage all found direction vectors. and map it to the index of DepMatrix. + StringMap Seen; for (I = MemInstr.begin(), IE = MemInstr.end(); I != IE; ++I) { for (J = I, JE = MemInstr.end(); J != JE; ++J) { @@ -182,7 +200,8 @@ static bool populateDependencyMatrix(CharMatrix &DepMatrix, unsigned Level, assert(D->isOrdered() && "Expected an output, flow or anti dep."); // If the direction vector is negative, normalize it to // make it non-negative. -if (D->normalize(SE)) +
[llvm-branch-commits] [llvm] [LoopInterchange] Improve profitability check for vectorization (PR #133672)
kasuga-fj wrote: Depends on #133667 https://github.com/llvm/llvm-project/pull/133672 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [LoopInterchange] Fix the vectorizable check for a loop (PR #133667)
kasuga-fj wrote: Depends on #133665 https://github.com/llvm/llvm-project/pull/133667 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [LoopInterchange] Add tests for the vectorization profitability (NFC) (PR #133665)
kasuga-fj wrote: Depends on #133664 (Sorry for inconvenience, I tried using [Graphite](https://app.graphite.dev/), but it didn't work fine due to my network problem). https://github.com/llvm/llvm-project/pull/133665 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [LoopInterchange] Improve profitability check for vectorization (PR #133672)
https://github.com/kasuga-fj updated https://github.com/llvm/llvm-project/pull/133672 >From 1a1c1f61a8cb179443d782127c157695bd21f6cc Mon Sep 17 00:00:00 2001 From: Ryotaro Kasuga Date: Thu, 27 Mar 2025 10:45:26 + Subject: [PATCH] [LoopInterchange] Improve profitability check for vectorization The vectorization profitability has a process to check whether a given loop can be vectorized or not. Since the process is conservative, a loop that can be vectorized may be deemed not to be possible. This can trigger unnecessary exchanges. This patch improves the profitability decision by mitigating such misjudgments. Before this patch, we considered a loop to be vectorizable only when there are no loop carried dependencies with the IV of the loop. However, a loop carried dependency doesn't prevent vectorization if the distance is positive. This patch makes the vectorization check more accurate by allowing a loop with the positive dependency. Note that it is difficult to make a complete decision whether a loop can be vectorized or not. To achieve this, we must check the vector width and the distance of dependency. --- .../lib/Transforms/Scalar/LoopInterchange.cpp | 128 ++ .../profitability-vectorization-heuristic.ll | 8 +- 2 files changed, 106 insertions(+), 30 deletions(-) diff --git a/llvm/lib/Transforms/Scalar/LoopInterchange.cpp b/llvm/lib/Transforms/Scalar/LoopInterchange.cpp index 1dccba4cfa7b8..078da53c52b52 100644 --- a/llvm/lib/Transforms/Scalar/LoopInterchange.cpp +++ b/llvm/lib/Transforms/Scalar/LoopInterchange.cpp @@ -17,8 +17,8 @@ #include "llvm/ADT/SmallSet.h" #include "llvm/ADT/SmallVector.h" #include "llvm/ADT/Statistic.h" +#include "llvm/ADT/StringMap.h" #include "llvm/ADT/StringRef.h" -#include "llvm/ADT/StringSet.h" #include "llvm/Analysis/DependenceAnalysis.h" #include "llvm/Analysis/LoopCacheAnalysis.h" #include "llvm/Analysis/LoopInfo.h" @@ -80,6 +80,21 @@ enum class RuleTy { ForVectorization, }; +/// Store the information about if corresponding direction vector was negated +/// by normalization or not. This is necessary to restore the original one from +/// a row of a dependency matrix, because we only manage normalized direction +/// vectors and duplicate vectors are eliminated. So there may be both original +/// and negated vectors for a single entry (a row of dependency matrix). E.g., +/// if there are two direction vectors `[< =]` and `[> =]`, the later one will +/// be converted to the same as former one by normalization, so only `[< =]` +/// would be retained in the final result. +struct NegatedStatus { + bool Original = false; + bool Negated = false; + + bool isNonNegativeDir(char Dir) const; +}; + } // end anonymous namespace // Minimum loop depth supported. @@ -126,9 +141,10 @@ static void printDepMatrix(CharMatrix &DepMatrix) { } #endif -static bool populateDependencyMatrix(CharMatrix &DepMatrix, unsigned Level, - Loop *L, DependenceInfo *DI, - ScalarEvolution *SE, +static bool populateDependencyMatrix(CharMatrix &DepMatrix, + std::vector &NegStatusVec, + unsigned Level, Loop *L, + DependenceInfo *DI, ScalarEvolution *SE, OptimizationRemarkEmitter *ORE) { using ValueVector = SmallVector; @@ -167,7 +183,9 @@ static bool populateDependencyMatrix(CharMatrix &DepMatrix, unsigned Level, return false; } ValueVector::iterator I, IE, J, JE; - StringSet<> Seen; + + // Manage all found direction vectors. and map it to the index of DepMatrix. + StringMap Seen; for (I = MemInstr.begin(), IE = MemInstr.end(); I != IE; ++I) { for (J = I, JE = MemInstr.end(); J != JE; ++J) { @@ -182,7 +200,8 @@ static bool populateDependencyMatrix(CharMatrix &DepMatrix, unsigned Level, assert(D->isOrdered() && "Expected an output, flow or anti dep."); // If the direction vector is negative, normalize it to // make it non-negative. -if (D->normalize(SE)) +bool Normalized = D->normalize(SE); +if (Normalized) LLVM_DEBUG(dbgs() << "Negative dependence vector normalized.\n"); LLVM_DEBUG(StringRef DepType = D->isFlow() ? "flow" : D->isAnti() ? "anti" : "output"; @@ -214,8 +233,17 @@ static bool populateDependencyMatrix(CharMatrix &DepMatrix, unsigned Level, } // Make sure we only add unique entries to the dependency matrix. -if (Seen.insert(StringRef(Dep.data(), Dep.size())).second) +unsigned Index = DepMatrix.size(); +auto [Ite, Inserted] = +Seen.try_emplace(StringRef(Dep.data(), Dep.size()), Index); +if (Inserted) { DepMatrix.push_back(Dep); + NegStatusVec.push_back(NegatedStatus{}); +} else + Index = Ite->second; + +
[llvm-branch-commits] [llvm] [LoopInterchange] Improve profitability check for vectorization (PR #133672)
@@ -80,6 +80,21 @@ enum class RuleTy { ForVectorization, }; +/// Store the information about if corresponding direction vector was negated kasuga-fj wrote: > I think duplicated direction vectors are always allowed. They don't add new > or different information, so it shouldn't effect the interpretation of the > dependence analysis in any way. The only thing that it affects is processing > the same information again and again, so the only benefit of making them > unique is to avoid that. I agree with this, and am only concerned about the compile time degradation. So I will try to compare the difference of the number of entries with or without making them unique. Thank you for your opinion! https://github.com/llvm/llvm-project/pull/133672 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [LoopInterchange] Relax the legality check to accept more patterns (PR #118267)
https://github.com/kasuga-fj edited https://github.com/llvm/llvm-project/pull/118267 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [LoopInterchange] Relax the legality check to accept more patterns (PR #118267)
https://github.com/kasuga-fj updated https://github.com/llvm/llvm-project/pull/118267 Rate limit · GitHub body { background-color: #f6f8fa; color: #24292e; font-family: -apple-system,BlinkMacSystemFont,Segoe UI,Helvetica,Arial,sans-serif,Apple Color Emoji,Segoe UI Emoji,Segoe UI Symbol; font-size: 14px; line-height: 1.5; margin: 0; } .container { margin: 50px auto; max-width: 600px; text-align: center; padding: 0 24px; } a { color: #0366d6; text-decoration: none; } a:hover { text-decoration: underline; } h1 { line-height: 60px; font-size: 48px; font-weight: 300; margin: 0px; text-shadow: 0 1px 0 #fff; } p { color: rgba(0, 0, 0, 0.5); margin: 20px 0 40px; } ul { list-style: none; margin: 25px 0; padding: 0; } li { display: table-cell; font-weight: bold; width: 1%; } .logo { display: inline-block; margin-top: 35px; } .logo-img-2x { display: none; } @media only screen and (-webkit-min-device-pixel-ratio: 2), only screen and ( min--moz-device-pixel-ratio: 2), only screen and ( -o-min-device-pixel-ratio: 2/1), only screen and (min-device-pixel-ratio: 2), only screen and (min-resolution: 192dpi), only screen and (min-resolution: 2dppx) { .logo-img-1x { display: none; } .logo-img-2x { display: inline-block; } } #suggestions { margin-top: 35px; color: #ccc; } #suggestions a { color: #66; font-weight: 200; font-size: 14px; margin: 0 10px; } Whoa there! You have exceeded a secondary rate limit. Please wait a few minutes before you try again; in some cases this may take up to an hour. https://support.github.com/contact";>Contact Support — https://githubstatus.com";>GitHub Status — https://twitter.com/githubstatus";>@githubstatus ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [LoopInterchange] Relax the legality check to accept more patterns (PR #118267)
https://github.com/kasuga-fj edited https://github.com/llvm/llvm-project/pull/118267 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [LoopInterchange] Relax the legality check to accept more patterns (PR #118267)
kasuga-fj wrote: @Meinersbur Could you please take a look? I have submitted another PR #139254 as well, on which this PR depends, to address the following problems: I first tried to simply replace the function `isLexicographicallyPositive(std::vector &DV)` to take an additional argument like `StartAt` to ignore the prefix, as discussed in the last meeting. However, it made several tests to fail, e.g. a case where a direction vector is `[< > =]` and we would like to swap the last two loops. It is safe because the first element guarantees that the whole vector is lexicographically positive, but just changing the legality check to drop the unrelated prefix rejected such a case because it makes the direction vector to be `[> =]`, which is negative. The original issue #71519 isn't resolved by this patch because a problem similar to the above exists (I believe this can be resolved by improving the handling of negative direction). For now, I'd like to add a logic to ignore the irrelevant prefix of direction vectors, and will create another PR to address #71519. Thanks. https://github.com/llvm/llvm-project/pull/118267 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [LoopInterchange] Relax the legality check to accept more patterns (PR #118267)
kasuga-fj wrote: Thanks for your review. https://github.com/llvm/llvm-project/pull/118267 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [LoopInterchange] Relax the legality check to accept more patterns (PR #118267)
kasuga-fj wrote: I misunderstood GitHub behavior and cannot reopen this PR... Submitted another PR #139690 with same contents. https://github.com/llvm/llvm-project/pull/118267 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [DA] Add check for base pointer invariance (PR #148241)
https://github.com/kasuga-fj ready_for_review https://github.com/llvm/llvm-project/pull/148241 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [DA] Add check for base pointer invariance (PR #148241)
@@ -113,7 +113,7 @@ define void @banerjee1(ptr %A, ptr %B, i64 %m, i64 %n) nounwind uwtable ssp { ; CHECK-NEXT: Src: %2 = load i64, ptr %arrayidx6, align 8 --> Dst: store i64 %2, ptr %B.addr.12, align 8 ; CHECK-NEXT:da analyze - confused! ; CHECK-NEXT: Src: store i64 %2, ptr %B.addr.12, align 8 --> Dst: store i64 %2, ptr %B.addr.12, align 8 -; CHECK-NEXT:da analyze - output [* *]! kasuga-fj wrote: All the test changes except for FlipFlopBaseAddress.ll are related to dependencies between memory accesses of the form `*B++`. https://github.com/llvm/llvm-project/pull/148241 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [DA] Add check for base pointer invariance (PR #148241)
https://github.com/kasuga-fj created https://github.com/llvm/llvm-project/pull/148241 None >From 3596b1726514dda5942d99d4779852e01367c764 Mon Sep 17 00:00:00 2001 From: Ryotaro Kasuga Date: Fri, 11 Jul 2025 13:10:44 + Subject: [PATCH] [DA] Add check for base pointer invariance --- llvm/lib/Analysis/DependenceAnalysis.cpp | 16 .../DependenceAnalysis/FlipFlopBaseAddress.ll | 18 +- 2 files changed, 25 insertions(+), 9 deletions(-) diff --git a/llvm/lib/Analysis/DependenceAnalysis.cpp b/llvm/lib/Analysis/DependenceAnalysis.cpp index 428342f51ad2e..07bf560772c8c 100644 --- a/llvm/lib/Analysis/DependenceAnalysis.cpp +++ b/llvm/lib/Analysis/DependenceAnalysis.cpp @@ -3383,6 +3383,10 @@ bool DependenceInfo::tryDelinearize(Instruction *Src, Instruction *Dst, SrcSubscripts, DstSubscripts)) return false; + assert(isLoopInvariant(SrcBase, SrcLoop) && + isLoopInvariant(DstBase, DstLoop) && + "Expected SrcBase and DstBase to be loop invariant"); + int Size = SrcSubscripts.size(); LLVM_DEBUG({ dbgs() << "\nSrcSubscripts: "; @@ -3666,6 +3670,18 @@ DependenceInfo::depends(Instruction *Src, Instruction *Dst, SCEVUnionPredicate(Assume, *SE)); } + // Even if the base pointers are the same, they may not be loop-invariant. It + // could lead to incorrect results, as we're analyzing loop-carried + // dependencies. + Loop *SrcLoop = LI->getLoopFor(Src->getParent()); + Loop *DstLoop = LI->getLoopFor(Dst->getParent()); + if (!isLoopInvariant(SrcBase, SrcLoop) || + !isLoopInvariant(DstBase, DstLoop)) { +LLVM_DEBUG(dbgs() << "The base pointer is not loop invariant.\n"); +return std::make_unique(Src, Dst, +SCEVUnionPredicate(Assume, *SE)); + } + uint64_t EltSize = SrcLoc.Size.toRaw(); const SCEV *SrcEv = SE->getMinusSCEV(SrcSCEV, SrcBase); const SCEV *DstEv = SE->getMinusSCEV(DstSCEV, DstBase); diff --git a/llvm/test/Analysis/DependenceAnalysis/FlipFlopBaseAddress.ll b/llvm/test/Analysis/DependenceAnalysis/FlipFlopBaseAddress.ll index 843c18a6e0d1e..a357018563be1 100644 --- a/llvm/test/Analysis/DependenceAnalysis/FlipFlopBaseAddress.ll +++ b/llvm/test/Analysis/DependenceAnalysis/FlipFlopBaseAddress.ll @@ -8,11 +8,11 @@ define float @bug41488_test1(float %f) { ; CHECK-LABEL: 'bug41488_test1' ; CHECK-NEXT: Src: %0 = load float, ptr %p, align 4 --> Dst: %0 = load float, ptr %p, align 4 -; CHECK-NEXT:da analyze - input [*]! +; CHECK-NEXT:da analyze - confused! ; CHECK-NEXT: Src: %0 = load float, ptr %p, align 4 --> Dst: store float %f, ptr %q, align 4 ; CHECK-NEXT:da analyze - confused! ; CHECK-NEXT: Src: store float %f, ptr %q, align 4 --> Dst: store float %f, ptr %q, align 4 -; CHECK-NEXT:da analyze - output [*]! +; CHECK-NEXT:da analyze - confused! ; entry: %g = alloca float, align 4 @@ -34,11 +34,11 @@ for.cond.cleanup: define void @bug41488_test2(i32 %n) { ; CHECK-LABEL: 'bug41488_test2' ; CHECK-NEXT: Src: %0 = load float, ptr %p, align 4 --> Dst: %0 = load float, ptr %p, align 4 -; CHECK-NEXT:da analyze - input [*]! +; CHECK-NEXT:da analyze - confused! ; CHECK-NEXT: Src: %0 = load float, ptr %p, align 4 --> Dst: store float 0.00e+00, ptr %q, align 4 ; CHECK-NEXT:da analyze - confused! ; CHECK-NEXT: Src: store float 0.00e+00, ptr %q, align 4 --> Dst: store float 0.00e+00, ptr %q, align 4 -; CHECK-NEXT:da analyze - output [*]! +; CHECK-NEXT:da analyze - confused! ; entry: %g = alloca float, align 4 @@ -68,7 +68,7 @@ define void @bug53942_foo(i32 noundef %n, ptr noalias nocapture noundef writeonl ; CHECK-NEXT: Src: %.pre = load double, ptr %B, align 8 --> Dst: store double %.pre, ptr %arrayidx2, align 8 ; CHECK-NEXT:da analyze - confused! ; CHECK-NEXT: Src: store double %.pre, ptr %arrayidx2, align 8 --> Dst: store double %.pre, ptr %arrayidx2, align 8 -; CHECK-NEXT:da analyze - output [*]! +; CHECK-NEXT:da analyze - confused! ; entry: %cmp8 = icmp sgt i32 %n, 1 @@ -99,11 +99,11 @@ for.body: ; preds = %for.body.preheader, define void @bug53942_bar(i32 noundef %n, ptr noalias noundef %A, ptr noalias noundef %B) { ; CHECK-LABEL: 'bug53942_bar' ; CHECK-NEXT: Src: %0 = load double, ptr %arrayidx, align 8 --> Dst: %0 = load double, ptr %arrayidx, align 8 -; CHECK-NEXT:da analyze - input [*]! +; CHECK-NEXT:da analyze - confused! ; CHECK-NEXT: Src: %0 = load double, ptr %arrayidx, align 8 --> Dst: store double %0, ptr %arrayidx8, align 8 ; CHECK-NEXT:da analyze - confused! ; CHECK-NEXT: Src: store double %0, ptr %arrayidx8, align 8 --> Dst: store double %0, ptr %arrayidx8, align 8 -; CHECK-NEXT:da analyze - output [*]! +; CHECK-NEXT:da analyze - confused! ; entry: br label %for.cond @@ -173,7 +173,7 @@ for.end:
[llvm-branch-commits] [llvm] [DA] Add check for base pointer invariance (PR #148241)
https://github.com/kasuga-fj updated https://github.com/llvm/llvm-project/pull/148241 >From d914ea5f3c44387570cab65ce9a507ebf429f827 Mon Sep 17 00:00:00 2001 From: Ryotaro Kasuga Date: Fri, 11 Jul 2025 13:10:44 + Subject: [PATCH] [DA] Add check for base pointer invariance --- llvm/lib/Analysis/DependenceAnalysis.cpp | 16 .../DependenceAnalysis/FlipFlopBaseAddress.ll | 18 +- 2 files changed, 25 insertions(+), 9 deletions(-) diff --git a/llvm/lib/Analysis/DependenceAnalysis.cpp b/llvm/lib/Analysis/DependenceAnalysis.cpp index 428342f51ad2e..07bf560772c8c 100644 --- a/llvm/lib/Analysis/DependenceAnalysis.cpp +++ b/llvm/lib/Analysis/DependenceAnalysis.cpp @@ -3383,6 +3383,10 @@ bool DependenceInfo::tryDelinearize(Instruction *Src, Instruction *Dst, SrcSubscripts, DstSubscripts)) return false; + assert(isLoopInvariant(SrcBase, SrcLoop) && + isLoopInvariant(DstBase, DstLoop) && + "Expected SrcBase and DstBase to be loop invariant"); + int Size = SrcSubscripts.size(); LLVM_DEBUG({ dbgs() << "\nSrcSubscripts: "; @@ -3666,6 +3670,18 @@ DependenceInfo::depends(Instruction *Src, Instruction *Dst, SCEVUnionPredicate(Assume, *SE)); } + // Even if the base pointers are the same, they may not be loop-invariant. It + // could lead to incorrect results, as we're analyzing loop-carried + // dependencies. + Loop *SrcLoop = LI->getLoopFor(Src->getParent()); + Loop *DstLoop = LI->getLoopFor(Dst->getParent()); + if (!isLoopInvariant(SrcBase, SrcLoop) || + !isLoopInvariant(DstBase, DstLoop)) { +LLVM_DEBUG(dbgs() << "The base pointer is not loop invariant.\n"); +return std::make_unique(Src, Dst, +SCEVUnionPredicate(Assume, *SE)); + } + uint64_t EltSize = SrcLoc.Size.toRaw(); const SCEV *SrcEv = SE->getMinusSCEV(SrcSCEV, SrcBase); const SCEV *DstEv = SE->getMinusSCEV(DstSCEV, DstBase); diff --git a/llvm/test/Analysis/DependenceAnalysis/FlipFlopBaseAddress.ll b/llvm/test/Analysis/DependenceAnalysis/FlipFlopBaseAddress.ll index 3e3426afab0f7..52cab0f77e73e 100644 --- a/llvm/test/Analysis/DependenceAnalysis/FlipFlopBaseAddress.ll +++ b/llvm/test/Analysis/DependenceAnalysis/FlipFlopBaseAddress.ll @@ -8,11 +8,11 @@ define float @bug41488_test1(float %f) { ; CHECK-LABEL: 'bug41488_test1' ; CHECK-NEXT: Src: %0 = load float, ptr %p, align 4 --> Dst: %0 = load float, ptr %p, align 4 -; CHECK-NEXT:da analyze - input [*]! +; CHECK-NEXT:da analyze - confused! ; CHECK-NEXT: Src: %0 = load float, ptr %p, align 4 --> Dst: store float %f, ptr %q, align 4 ; CHECK-NEXT:da analyze - confused! ; CHECK-NEXT: Src: store float %f, ptr %q, align 4 --> Dst: store float %f, ptr %q, align 4 -; CHECK-NEXT:da analyze - output [*]! +; CHECK-NEXT:da analyze - confused! ; entry: %g = alloca float, align 4 @@ -34,11 +34,11 @@ for.cond.cleanup: define void @bug41488_test2(i32 %n) { ; CHECK-LABEL: 'bug41488_test2' ; CHECK-NEXT: Src: %0 = load float, ptr %p, align 4 --> Dst: %0 = load float, ptr %p, align 4 -; CHECK-NEXT:da analyze - input [*]! +; CHECK-NEXT:da analyze - confused! ; CHECK-NEXT: Src: %0 = load float, ptr %p, align 4 --> Dst: store float 0.00e+00, ptr %q, align 4 ; CHECK-NEXT:da analyze - confused! ; CHECK-NEXT: Src: store float 0.00e+00, ptr %q, align 4 --> Dst: store float 0.00e+00, ptr %q, align 4 -; CHECK-NEXT:da analyze - output [*]! +; CHECK-NEXT:da analyze - confused! ; entry: %g = alloca float, align 4 @@ -68,7 +68,7 @@ define void @bug53942_foo(i32 noundef %n, ptr noalias nocapture noundef writeonl ; CHECK-NEXT: Src: %.pre = load double, ptr %B, align 8 --> Dst: store double %.pre, ptr %arrayidx2, align 8 ; CHECK-NEXT:da analyze - confused! ; CHECK-NEXT: Src: store double %.pre, ptr %arrayidx2, align 8 --> Dst: store double %.pre, ptr %arrayidx2, align 8 -; CHECK-NEXT:da analyze - output [*]! +; CHECK-NEXT:da analyze - confused! ; entry: %cmp8 = icmp sgt i32 %n, 1 @@ -99,11 +99,11 @@ for.body: ; preds = %for.body.preheader, define void @bug53942_bar(i32 noundef %n, ptr noalias noundef %A, ptr noalias noundef %B) { ; CHECK-LABEL: 'bug53942_bar' ; CHECK-NEXT: Src: %0 = load double, ptr %arrayidx, align 8 --> Dst: %0 = load double, ptr %arrayidx, align 8 -; CHECK-NEXT:da analyze - input [*]! +; CHECK-NEXT:da analyze - confused! ; CHECK-NEXT: Src: %0 = load double, ptr %arrayidx, align 8 --> Dst: store double %0, ptr %arrayidx8, align 8 ; CHECK-NEXT:da analyze - confused! ; CHECK-NEXT: Src: store double %0, ptr %arrayidx8, align 8 --> Dst: store double %0, ptr %arrayidx8, align 8 -; CHECK-NEXT:da analyze - output [*]! +; CHECK-NEXT:da analyze - confused! ; entry: br label %for.cond @@ -173,7 +173,7 @@ for.end:
[llvm-branch-commits] [llvm] [DA] Add check for base pointer invariance (PR #148241)
https://github.com/kasuga-fj edited https://github.com/llvm/llvm-project/pull/148241 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [DA] Add check for base pointer invariance (PR #148241)
https://github.com/kasuga-fj edited https://github.com/llvm/llvm-project/pull/148241 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [DA] Add check for base pointer invariance (PR #148241)
https://github.com/kasuga-fj updated https://github.com/llvm/llvm-project/pull/148241 >From d914ea5f3c44387570cab65ce9a507ebf429f827 Mon Sep 17 00:00:00 2001 From: Ryotaro Kasuga Date: Fri, 11 Jul 2025 13:10:44 + Subject: [PATCH 1/2] [DA] Add check for base pointer invariance --- llvm/lib/Analysis/DependenceAnalysis.cpp | 16 .../DependenceAnalysis/FlipFlopBaseAddress.ll | 18 +- 2 files changed, 25 insertions(+), 9 deletions(-) diff --git a/llvm/lib/Analysis/DependenceAnalysis.cpp b/llvm/lib/Analysis/DependenceAnalysis.cpp index 428342f51ad2e..07bf560772c8c 100644 --- a/llvm/lib/Analysis/DependenceAnalysis.cpp +++ b/llvm/lib/Analysis/DependenceAnalysis.cpp @@ -3383,6 +3383,10 @@ bool DependenceInfo::tryDelinearize(Instruction *Src, Instruction *Dst, SrcSubscripts, DstSubscripts)) return false; + assert(isLoopInvariant(SrcBase, SrcLoop) && + isLoopInvariant(DstBase, DstLoop) && + "Expected SrcBase and DstBase to be loop invariant"); + int Size = SrcSubscripts.size(); LLVM_DEBUG({ dbgs() << "\nSrcSubscripts: "; @@ -3666,6 +3670,18 @@ DependenceInfo::depends(Instruction *Src, Instruction *Dst, SCEVUnionPredicate(Assume, *SE)); } + // Even if the base pointers are the same, they may not be loop-invariant. It + // could lead to incorrect results, as we're analyzing loop-carried + // dependencies. + Loop *SrcLoop = LI->getLoopFor(Src->getParent()); + Loop *DstLoop = LI->getLoopFor(Dst->getParent()); + if (!isLoopInvariant(SrcBase, SrcLoop) || + !isLoopInvariant(DstBase, DstLoop)) { +LLVM_DEBUG(dbgs() << "The base pointer is not loop invariant.\n"); +return std::make_unique(Src, Dst, +SCEVUnionPredicate(Assume, *SE)); + } + uint64_t EltSize = SrcLoc.Size.toRaw(); const SCEV *SrcEv = SE->getMinusSCEV(SrcSCEV, SrcBase); const SCEV *DstEv = SE->getMinusSCEV(DstSCEV, DstBase); diff --git a/llvm/test/Analysis/DependenceAnalysis/FlipFlopBaseAddress.ll b/llvm/test/Analysis/DependenceAnalysis/FlipFlopBaseAddress.ll index 3e3426afab0f7..52cab0f77e73e 100644 --- a/llvm/test/Analysis/DependenceAnalysis/FlipFlopBaseAddress.ll +++ b/llvm/test/Analysis/DependenceAnalysis/FlipFlopBaseAddress.ll @@ -8,11 +8,11 @@ define float @bug41488_test1(float %f) { ; CHECK-LABEL: 'bug41488_test1' ; CHECK-NEXT: Src: %0 = load float, ptr %p, align 4 --> Dst: %0 = load float, ptr %p, align 4 -; CHECK-NEXT:da analyze - input [*]! +; CHECK-NEXT:da analyze - confused! ; CHECK-NEXT: Src: %0 = load float, ptr %p, align 4 --> Dst: store float %f, ptr %q, align 4 ; CHECK-NEXT:da analyze - confused! ; CHECK-NEXT: Src: store float %f, ptr %q, align 4 --> Dst: store float %f, ptr %q, align 4 -; CHECK-NEXT:da analyze - output [*]! +; CHECK-NEXT:da analyze - confused! ; entry: %g = alloca float, align 4 @@ -34,11 +34,11 @@ for.cond.cleanup: define void @bug41488_test2(i32 %n) { ; CHECK-LABEL: 'bug41488_test2' ; CHECK-NEXT: Src: %0 = load float, ptr %p, align 4 --> Dst: %0 = load float, ptr %p, align 4 -; CHECK-NEXT:da analyze - input [*]! +; CHECK-NEXT:da analyze - confused! ; CHECK-NEXT: Src: %0 = load float, ptr %p, align 4 --> Dst: store float 0.00e+00, ptr %q, align 4 ; CHECK-NEXT:da analyze - confused! ; CHECK-NEXT: Src: store float 0.00e+00, ptr %q, align 4 --> Dst: store float 0.00e+00, ptr %q, align 4 -; CHECK-NEXT:da analyze - output [*]! +; CHECK-NEXT:da analyze - confused! ; entry: %g = alloca float, align 4 @@ -68,7 +68,7 @@ define void @bug53942_foo(i32 noundef %n, ptr noalias nocapture noundef writeonl ; CHECK-NEXT: Src: %.pre = load double, ptr %B, align 8 --> Dst: store double %.pre, ptr %arrayidx2, align 8 ; CHECK-NEXT:da analyze - confused! ; CHECK-NEXT: Src: store double %.pre, ptr %arrayidx2, align 8 --> Dst: store double %.pre, ptr %arrayidx2, align 8 -; CHECK-NEXT:da analyze - output [*]! +; CHECK-NEXT:da analyze - confused! ; entry: %cmp8 = icmp sgt i32 %n, 1 @@ -99,11 +99,11 @@ for.body: ; preds = %for.body.preheader, define void @bug53942_bar(i32 noundef %n, ptr noalias noundef %A, ptr noalias noundef %B) { ; CHECK-LABEL: 'bug53942_bar' ; CHECK-NEXT: Src: %0 = load double, ptr %arrayidx, align 8 --> Dst: %0 = load double, ptr %arrayidx, align 8 -; CHECK-NEXT:da analyze - input [*]! +; CHECK-NEXT:da analyze - confused! ; CHECK-NEXT: Src: %0 = load double, ptr %arrayidx, align 8 --> Dst: store double %0, ptr %arrayidx8, align 8 ; CHECK-NEXT:da analyze - confused! ; CHECK-NEXT: Src: store double %0, ptr %arrayidx8, align 8 --> Dst: store double %0, ptr %arrayidx8, align 8 -; CHECK-NEXT:da analyze - output [*]! +; CHECK-NEXT:da analyze - confused! ; entry: br label %for.cond @@ -173,7 +173,7 @@ for.end:
[llvm-branch-commits] [llvm] [DA] Add check for base pointer invariance (PR #148241)
https://github.com/kasuga-fj updated https://github.com/llvm/llvm-project/pull/148241 >From d914ea5f3c44387570cab65ce9a507ebf429f827 Mon Sep 17 00:00:00 2001 From: Ryotaro Kasuga Date: Fri, 11 Jul 2025 13:10:44 + Subject: [PATCH 1/3] [DA] Add check for base pointer invariance --- llvm/lib/Analysis/DependenceAnalysis.cpp | 16 .../DependenceAnalysis/FlipFlopBaseAddress.ll | 18 +- 2 files changed, 25 insertions(+), 9 deletions(-) diff --git a/llvm/lib/Analysis/DependenceAnalysis.cpp b/llvm/lib/Analysis/DependenceAnalysis.cpp index 428342f51ad2e..07bf560772c8c 100644 --- a/llvm/lib/Analysis/DependenceAnalysis.cpp +++ b/llvm/lib/Analysis/DependenceAnalysis.cpp @@ -3383,6 +3383,10 @@ bool DependenceInfo::tryDelinearize(Instruction *Src, Instruction *Dst, SrcSubscripts, DstSubscripts)) return false; + assert(isLoopInvariant(SrcBase, SrcLoop) && + isLoopInvariant(DstBase, DstLoop) && + "Expected SrcBase and DstBase to be loop invariant"); + int Size = SrcSubscripts.size(); LLVM_DEBUG({ dbgs() << "\nSrcSubscripts: "; @@ -3666,6 +3670,18 @@ DependenceInfo::depends(Instruction *Src, Instruction *Dst, SCEVUnionPredicate(Assume, *SE)); } + // Even if the base pointers are the same, they may not be loop-invariant. It + // could lead to incorrect results, as we're analyzing loop-carried + // dependencies. + Loop *SrcLoop = LI->getLoopFor(Src->getParent()); + Loop *DstLoop = LI->getLoopFor(Dst->getParent()); + if (!isLoopInvariant(SrcBase, SrcLoop) || + !isLoopInvariant(DstBase, DstLoop)) { +LLVM_DEBUG(dbgs() << "The base pointer is not loop invariant.\n"); +return std::make_unique(Src, Dst, +SCEVUnionPredicate(Assume, *SE)); + } + uint64_t EltSize = SrcLoc.Size.toRaw(); const SCEV *SrcEv = SE->getMinusSCEV(SrcSCEV, SrcBase); const SCEV *DstEv = SE->getMinusSCEV(DstSCEV, DstBase); diff --git a/llvm/test/Analysis/DependenceAnalysis/FlipFlopBaseAddress.ll b/llvm/test/Analysis/DependenceAnalysis/FlipFlopBaseAddress.ll index 3e3426afab0f7..52cab0f77e73e 100644 --- a/llvm/test/Analysis/DependenceAnalysis/FlipFlopBaseAddress.ll +++ b/llvm/test/Analysis/DependenceAnalysis/FlipFlopBaseAddress.ll @@ -8,11 +8,11 @@ define float @bug41488_test1(float %f) { ; CHECK-LABEL: 'bug41488_test1' ; CHECK-NEXT: Src: %0 = load float, ptr %p, align 4 --> Dst: %0 = load float, ptr %p, align 4 -; CHECK-NEXT:da analyze - input [*]! +; CHECK-NEXT:da analyze - confused! ; CHECK-NEXT: Src: %0 = load float, ptr %p, align 4 --> Dst: store float %f, ptr %q, align 4 ; CHECK-NEXT:da analyze - confused! ; CHECK-NEXT: Src: store float %f, ptr %q, align 4 --> Dst: store float %f, ptr %q, align 4 -; CHECK-NEXT:da analyze - output [*]! +; CHECK-NEXT:da analyze - confused! ; entry: %g = alloca float, align 4 @@ -34,11 +34,11 @@ for.cond.cleanup: define void @bug41488_test2(i32 %n) { ; CHECK-LABEL: 'bug41488_test2' ; CHECK-NEXT: Src: %0 = load float, ptr %p, align 4 --> Dst: %0 = load float, ptr %p, align 4 -; CHECK-NEXT:da analyze - input [*]! +; CHECK-NEXT:da analyze - confused! ; CHECK-NEXT: Src: %0 = load float, ptr %p, align 4 --> Dst: store float 0.00e+00, ptr %q, align 4 ; CHECK-NEXT:da analyze - confused! ; CHECK-NEXT: Src: store float 0.00e+00, ptr %q, align 4 --> Dst: store float 0.00e+00, ptr %q, align 4 -; CHECK-NEXT:da analyze - output [*]! +; CHECK-NEXT:da analyze - confused! ; entry: %g = alloca float, align 4 @@ -68,7 +68,7 @@ define void @bug53942_foo(i32 noundef %n, ptr noalias nocapture noundef writeonl ; CHECK-NEXT: Src: %.pre = load double, ptr %B, align 8 --> Dst: store double %.pre, ptr %arrayidx2, align 8 ; CHECK-NEXT:da analyze - confused! ; CHECK-NEXT: Src: store double %.pre, ptr %arrayidx2, align 8 --> Dst: store double %.pre, ptr %arrayidx2, align 8 -; CHECK-NEXT:da analyze - output [*]! +; CHECK-NEXT:da analyze - confused! ; entry: %cmp8 = icmp sgt i32 %n, 1 @@ -99,11 +99,11 @@ for.body: ; preds = %for.body.preheader, define void @bug53942_bar(i32 noundef %n, ptr noalias noundef %A, ptr noalias noundef %B) { ; CHECK-LABEL: 'bug53942_bar' ; CHECK-NEXT: Src: %0 = load double, ptr %arrayidx, align 8 --> Dst: %0 = load double, ptr %arrayidx, align 8 -; CHECK-NEXT:da analyze - input [*]! +; CHECK-NEXT:da analyze - confused! ; CHECK-NEXT: Src: %0 = load double, ptr %arrayidx, align 8 --> Dst: store double %0, ptr %arrayidx8, align 8 ; CHECK-NEXT:da analyze - confused! ; CHECK-NEXT: Src: store double %0, ptr %arrayidx8, align 8 --> Dst: store double %0, ptr %arrayidx8, align 8 -; CHECK-NEXT:da analyze - output [*]! +; CHECK-NEXT:da analyze - confused! ; entry: br label %for.cond @@ -173,7 +173,7 @@ for.end:
[llvm-branch-commits] [llvm] [DA] Add check for base pointer invariance (PR #148241)
https://github.com/kasuga-fj converted_to_draft https://github.com/llvm/llvm-project/pull/148241 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits