[llvm-branch-commits] [llvm] [LoopInterchange] Fix the vectorizable check for a loop (PR #133667)

2025-04-05 Thread Ryotaro Kasuga via llvm-branch-commits

https://github.com/kasuga-fj created 
https://github.com/llvm/llvm-project/pull/133667

In the profitability check for vectorization, the dependency matrix was not 
handled correctly. This can result to make a wrong decision: It may say "this 
loop can be vectorized" when in fact it cannot. The root cause of this is that 
the check process early returns when it finds '=' or 'I' in the dependency 
matrix. To make sure that we can actually vectorize the loop, we need to check 
all the rows of the matrix. This patch fixes the process of checking whether we 
can vectorize the loop or not. Now it won't make a wrong decision for a loop 
that cannot be vectorized.

Related: #131130

>From 2db59e8629d3640ec070eb906ac55a5e970176d1 Mon Sep 17 00:00:00 2001
From: Ryotaro Kasuga 
Date: Thu, 27 Mar 2025 09:52:16 +
Subject: [PATCH] [LoopInterchange] Fix the vectorizable check for a loop

In the profitability check for vectorization, the dependency matrix was
not handled correctly. This can result to make a wrong decision: It may
say "this loop can be vectorized" when in fact it cannot. The root cause
of this is that the check process early returns when it finds '=' or 'I'
in the dependency matrix. To make sure that we can actually vectorize
the loop, we need to check all the rows of the matrix. This patch fixes
the process of checking whether we can vectorize the loop or not. Now it
won't make a wrong decision for a loop that cannot be vectorized.

Related: #131130
---
 .../lib/Transforms/Scalar/LoopInterchange.cpp | 41 +++
 .../profitability-vectorization-heuristic.ll  |  9 ++--
 2 files changed, 27 insertions(+), 23 deletions(-)

diff --git a/llvm/lib/Transforms/Scalar/LoopInterchange.cpp 
b/llvm/lib/Transforms/Scalar/LoopInterchange.cpp
index e777f950a7c5a..b6b0b7d7a947a 100644
--- a/llvm/lib/Transforms/Scalar/LoopInterchange.cpp
+++ b/llvm/lib/Transforms/Scalar/LoopInterchange.cpp
@@ -1197,25 +1197,32 @@ 
LoopInterchangeProfitability::isProfitablePerInstrOrderCost() {
   return std::nullopt;
 }
 
+/// Return true if we can vectorize the loop specified by \p LoopId.
+static bool canVectorize(const CharMatrix &DepMatrix, unsigned LoopId) {
+  for (unsigned I = 0; I != DepMatrix.size(); I++) {
+char Dir = DepMatrix[I][LoopId];
+if (Dir != 'I' && Dir != '=')
+  return false;
+  }
+  return true;
+}
+
 std::optional LoopInterchangeProfitability::isProfitableForVectorization(
 unsigned InnerLoopId, unsigned OuterLoopId, CharMatrix &DepMatrix) {
-  for (auto &Row : DepMatrix) {
-// If the inner loop is loop independent or doesn't carry any dependency
-// it is not profitable to move this to outer position, since we are
-// likely able to do inner loop vectorization already.
-if (Row[InnerLoopId] == 'I' || Row[InnerLoopId] == '=')
-  return std::optional(false);
-
-// If the outer loop is not loop independent it is not profitable to move
-// this to inner position, since doing so would not enable inner loop
-// parallelism.
-if (Row[OuterLoopId] != 'I' && Row[OuterLoopId] != '=')
-  return std::optional(false);
-  }
-  // If inner loop has dependence and outer loop is loop independent then it
-  // is/ profitable to interchange to enable inner loop parallelism.
-  // If there are no dependences, interchanging will not improve anything.
-  return std::optional(!DepMatrix.empty());
+  // If the outer loop is not loop independent it is not profitable to move
+  // this to inner position, since doing so would not enable inner loop
+  // parallelism.
+  if (!canVectorize(DepMatrix, OuterLoopId))
+return false;
+
+  // If inner loop has dependence and outer loop is loop independent then it is
+  // profitable to interchange to enable inner loop parallelism.
+  if (!canVectorize(DepMatrix, InnerLoopId))
+return true;
+
+  // TODO: Estimate the cost of vectorized loop body when both the outer and 
the
+  // inner loop can be vectorized.
+  return std::nullopt;
 }
 
 bool LoopInterchangeProfitability::isProfitable(
diff --git 
a/llvm/test/Transforms/LoopInterchange/profitability-vectorization-heuristic.ll 
b/llvm/test/Transforms/LoopInterchange/profitability-vectorization-heuristic.ll
index 606117e70db86..b82dd5141a6b2 100644
--- 
a/llvm/test/Transforms/LoopInterchange/profitability-vectorization-heuristic.ll
+++ 
b/llvm/test/Transforms/LoopInterchange/profitability-vectorization-heuristic.ll
@@ -15,16 +15,13 @@
 ;   }
 ; }
 ;
-; FIXME: These loops are not exchanged at this time due to the problem of
-; profitablity heuristic for vectorization.
 
-; CHECK:  --- !Missed
+; CHECK:  --- !Passed
 ; CHECK-NEXT: Pass:loop-interchange
-; CHECK-NEXT: Name:InterchangeNotProfitable
+; CHECK-NEXT: Name:Interchanged
 ; CHECK-NEXT: Function:interchange_necesasry_for_vectorization
 ; CHECK-NEXT: Args:
-; CHECK-NEXT:   - String:  Interchanging loops is not considered to 
improve cache locality nor vectori

[llvm-branch-commits] [llvm] [LoopInterchange] Improve profitability check for vectorization (PR #133672)

2025-04-05 Thread Ryotaro Kasuga via llvm-branch-commits


@@ -80,6 +80,21 @@ enum class RuleTy {
   ForVectorization,
 };
 
+/// Store the information about if corresponding direction vector was negated

kasuga-fj wrote:

> But I now guess that the complication here is the unique entries in the 
> dependency matrix, is that right?

Yes. (But holding two boolean values is a bit redundant. What is actually 
needed are three states. If both of them are false, it is an illegal state.)

> I am wondering if it isn't easier to keep all the entries and don't make them 
> unique?

I think it would be simpler. Also, there is no need to stop making entries 
unique altogether. If duplicate direction vectors are allowed, I think the 
simplest implementation would be to keep pairs of a direction vector and a 
boolean value indicating whether the corresponding vector is negated. However, 
I'm not sure how effective it is to make direction vectors unique. In the worst 
case, holding pairs of a vector and a boolean value instead of a single vector 
doubles the number of entries. Is this allowed?

https://github.com/llvm/llvm-project/pull/133672
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [LoopInterchange] Add tests for the vectorization profitability (NFC) (PR #133665)

2025-04-05 Thread Ryotaro Kasuga via llvm-branch-commits

https://github.com/kasuga-fj edited 
https://github.com/llvm/llvm-project/pull/133665
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [LoopInterchange] Improve profitability check for vectorization (PR #133672)

2025-04-05 Thread Ryotaro Kasuga via llvm-branch-commits

https://github.com/kasuga-fj updated 
https://github.com/llvm/llvm-project/pull/133672

>From 72b48ba6d6b70eb9a65abdc516697f3dee9c7a2e Mon Sep 17 00:00:00 2001
From: Ryotaro Kasuga 
Date: Thu, 27 Mar 2025 10:45:26 +
Subject: [PATCH] [LoopInterchange] Improve profitability check for
 vectorization

The vectorization profitability has a process to check whether a given
loop can be vectorized or not. Since the process is conservative, a loop
that can be vectorized may be deemed not to be possible. This can
trigger unnecessary exchanges.
This patch improves the profitability decision by mitigating such
misjudgments. Before this patch, we considered a loop to be vectorizable
only when there are no loop carried dependencies with the IV of the
loop. However, a loop carried dependency doesn't prevent vectorization
if the distance is positive. This patch makes the vectorization check
more accurate by allowing a loop with the positive dependency. Note that
it is difficult to make a complete decision whether a loop can be
vectorized or not. To achieve this, we must check the vector width and
the distance of dependency.
---
 .../lib/Transforms/Scalar/LoopInterchange.cpp | 128 ++
 .../profitability-vectorization-heuristic.ll  |   8 +-
 2 files changed, 106 insertions(+), 30 deletions(-)

diff --git a/llvm/lib/Transforms/Scalar/LoopInterchange.cpp 
b/llvm/lib/Transforms/Scalar/LoopInterchange.cpp
index b6b0b7d7a947a..cb2cb6652899c 100644
--- a/llvm/lib/Transforms/Scalar/LoopInterchange.cpp
+++ b/llvm/lib/Transforms/Scalar/LoopInterchange.cpp
@@ -17,8 +17,8 @@
 #include "llvm/ADT/SmallSet.h"
 #include "llvm/ADT/SmallVector.h"
 #include "llvm/ADT/Statistic.h"
+#include "llvm/ADT/StringMap.h"
 #include "llvm/ADT/StringRef.h"
-#include "llvm/ADT/StringSet.h"
 #include "llvm/Analysis/DependenceAnalysis.h"
 #include "llvm/Analysis/LoopCacheAnalysis.h"
 #include "llvm/Analysis/LoopInfo.h"
@@ -80,6 +80,21 @@ enum class RuleTy {
   ForVectorization,
 };
 
+/// Store the information about if corresponding direction vector was negated
+/// by normalization or not. This is necessary to restore the original one from
+/// a row of a dependency matrix, because we only manage normalized direction
+/// vectors and duplicate vectors are eliminated. So there may be both original
+/// and negated vectors for a single entry (a row of dependency matrix). E.g.,
+/// if there are two direction vectors `[< =]` and `[> =]`, the later one will
+/// be converted to the same as former one by normalization, so only `[< =]`
+/// would be retained in the final result.
+struct NegatedStatus {
+  bool Original = false;
+  bool Negated = false;
+
+  bool isNonNegativeDir(char Dir) const;
+};
+
 } // end anonymous namespace
 
 // Minimum loop depth supported.
@@ -126,9 +141,10 @@ static void printDepMatrix(CharMatrix &DepMatrix) {
 }
 #endif
 
-static bool populateDependencyMatrix(CharMatrix &DepMatrix, unsigned Level,
- Loop *L, DependenceInfo *DI,
- ScalarEvolution *SE,
+static bool populateDependencyMatrix(CharMatrix &DepMatrix,
+ std::vector &NegStatusVec,
+ unsigned Level, Loop *L,
+ DependenceInfo *DI, ScalarEvolution *SE,
  OptimizationRemarkEmitter *ORE) {
   using ValueVector = SmallVector;
 
@@ -167,7 +183,9 @@ static bool populateDependencyMatrix(CharMatrix &DepMatrix, 
unsigned Level,
 return false;
   }
   ValueVector::iterator I, IE, J, JE;
-  StringSet<> Seen;
+
+  // Manage all found direction vectors. and map it to the index of DepMatrix.
+  StringMap Seen;
 
   for (I = MemInstr.begin(), IE = MemInstr.end(); I != IE; ++I) {
 for (J = I, JE = MemInstr.end(); J != JE; ++J) {
@@ -182,7 +200,8 @@ static bool populateDependencyMatrix(CharMatrix &DepMatrix, 
unsigned Level,
 assert(D->isOrdered() && "Expected an output, flow or anti dep.");
 // If the direction vector is negative, normalize it to
 // make it non-negative.
-if (D->normalize(SE))
+bool Normalized = D->normalize(SE);
+if (Normalized)
   LLVM_DEBUG(dbgs() << "Negative dependence vector normalized.\n");
 LLVM_DEBUG(StringRef DepType =
D->isFlow() ? "flow" : D->isAnti() ? "anti" : "output";
@@ -214,8 +233,17 @@ static bool populateDependencyMatrix(CharMatrix 
&DepMatrix, unsigned Level,
 }
 
 // Make sure we only add unique entries to the dependency matrix.
-if (Seen.insert(StringRef(Dep.data(), Dep.size())).second)
+unsigned Index = DepMatrix.size();
+auto [Ite, Inserted] =
+Seen.try_emplace(StringRef(Dep.data(), Dep.size()), Index);
+if (Inserted) {
   DepMatrix.push_back(Dep);
+  NegStatusVec.push_back(NegatedStatus{});
+} else
+  Index = Ite->second;
+
+

[llvm-branch-commits] [llvm] [LoopInterchange] Fix the vectorizable check for a loop (PR #133667)

2025-04-05 Thread Ryotaro Kasuga via llvm-branch-commits

https://github.com/kasuga-fj updated 
https://github.com/llvm/llvm-project/pull/133667

>From ae5d9cff055000480e7e71205265b18911440e29 Mon Sep 17 00:00:00 2001
From: Ryotaro Kasuga 
Date: Thu, 27 Mar 2025 09:52:16 +
Subject: [PATCH] [LoopInterchange] Fix the vectorizable check for a loop

In the profitability check for vectorization, the dependency matrix was
not handled correctly. This can result to make a wrong decision: It may
say "this loop can be vectorized" when in fact it cannot. The root cause
of this is that the check process early returns when it finds '=' or 'I'
in the dependency matrix. To make sure that we can actually vectorize
the loop, we need to check all the rows of the matrix. This patch fixes
the process of checking whether we can vectorize the loop or not. Now it
won't make a wrong decision for a loop that cannot be vectorized.

Related: #131130
---
 .../lib/Transforms/Scalar/LoopInterchange.cpp | 41 +++
 .../profitability-vectorization-heuristic.ll  |  9 ++--
 2 files changed, 27 insertions(+), 23 deletions(-)

diff --git a/llvm/lib/Transforms/Scalar/LoopInterchange.cpp 
b/llvm/lib/Transforms/Scalar/LoopInterchange.cpp
index e777f950a7c5a..b6b0b7d7a947a 100644
--- a/llvm/lib/Transforms/Scalar/LoopInterchange.cpp
+++ b/llvm/lib/Transforms/Scalar/LoopInterchange.cpp
@@ -1197,25 +1197,32 @@ 
LoopInterchangeProfitability::isProfitablePerInstrOrderCost() {
   return std::nullopt;
 }
 
+/// Return true if we can vectorize the loop specified by \p LoopId.
+static bool canVectorize(const CharMatrix &DepMatrix, unsigned LoopId) {
+  for (unsigned I = 0; I != DepMatrix.size(); I++) {
+char Dir = DepMatrix[I][LoopId];
+if (Dir != 'I' && Dir != '=')
+  return false;
+  }
+  return true;
+}
+
 std::optional LoopInterchangeProfitability::isProfitableForVectorization(
 unsigned InnerLoopId, unsigned OuterLoopId, CharMatrix &DepMatrix) {
-  for (auto &Row : DepMatrix) {
-// If the inner loop is loop independent or doesn't carry any dependency
-// it is not profitable to move this to outer position, since we are
-// likely able to do inner loop vectorization already.
-if (Row[InnerLoopId] == 'I' || Row[InnerLoopId] == '=')
-  return std::optional(false);
-
-// If the outer loop is not loop independent it is not profitable to move
-// this to inner position, since doing so would not enable inner loop
-// parallelism.
-if (Row[OuterLoopId] != 'I' && Row[OuterLoopId] != '=')
-  return std::optional(false);
-  }
-  // If inner loop has dependence and outer loop is loop independent then it
-  // is/ profitable to interchange to enable inner loop parallelism.
-  // If there are no dependences, interchanging will not improve anything.
-  return std::optional(!DepMatrix.empty());
+  // If the outer loop is not loop independent it is not profitable to move
+  // this to inner position, since doing so would not enable inner loop
+  // parallelism.
+  if (!canVectorize(DepMatrix, OuterLoopId))
+return false;
+
+  // If inner loop has dependence and outer loop is loop independent then it is
+  // profitable to interchange to enable inner loop parallelism.
+  if (!canVectorize(DepMatrix, InnerLoopId))
+return true;
+
+  // TODO: Estimate the cost of vectorized loop body when both the outer and 
the
+  // inner loop can be vectorized.
+  return std::nullopt;
 }
 
 bool LoopInterchangeProfitability::isProfitable(
diff --git 
a/llvm/test/Transforms/LoopInterchange/profitability-vectorization-heuristic.ll 
b/llvm/test/Transforms/LoopInterchange/profitability-vectorization-heuristic.ll
index 606117e70db86..b82dd5141a6b2 100644
--- 
a/llvm/test/Transforms/LoopInterchange/profitability-vectorization-heuristic.ll
+++ 
b/llvm/test/Transforms/LoopInterchange/profitability-vectorization-heuristic.ll
@@ -15,16 +15,13 @@
 ;   }
 ; }
 ;
-; FIXME: These loops are not exchanged at this time due to the problem of
-; profitablity heuristic for vectorization.
 
-; CHECK:  --- !Missed
+; CHECK:  --- !Passed
 ; CHECK-NEXT: Pass:loop-interchange
-; CHECK-NEXT: Name:InterchangeNotProfitable
+; CHECK-NEXT: Name:Interchanged
 ; CHECK-NEXT: Function:interchange_necesasry_for_vectorization
 ; CHECK-NEXT: Args:
-; CHECK-NEXT:   - String:  Interchanging loops is not considered to 
improve cache locality nor vectorization.
-; CHECK-NEXT: ...
+; CHECK-NEXT:   - String:  Loop interchanged with enclosing loop.
 define void @interchange_necesasry_for_vectorization() {
 entry:
   br label %for.i.header

___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [LoopInterchange] Improve profitability check for vectorization (PR #133672)

2025-04-04 Thread Ryotaro Kasuga via llvm-branch-commits

https://github.com/kasuga-fj updated 
https://github.com/llvm/llvm-project/pull/133672

>From 692e4de4f84281f8c2bc5f7278f8066929df3cd6 Mon Sep 17 00:00:00 2001
From: Ryotaro Kasuga 
Date: Thu, 27 Mar 2025 10:45:26 +
Subject: [PATCH] [LoopInterchange] Improve profitability check for
 vectorization

The vectorization profitability has a process to check whether a given
loop can be vectorized or not. Since the process is conservative, a loop
that can be vectorized may be deemed not to be possible. This can
trigger unnecessary exchanges.
This patch improves the profitability decision by mitigating such
misjudgments. Before this patch, we considered a loop to be vectorizable
only when there are no loop carried dependencies with the IV of the
loop. However, a loop carried dependency doesn't prevent vectorization
if the distance is positive. This patch makes the vectorization check
more accurate by allowing a loop with the positive dependency. Note that
it is difficult to make a complete decision whether a loop can be
vectorized or not. To achieve this, we must check the vector width and
the distance of dependency.
---
 .../lib/Transforms/Scalar/LoopInterchange.cpp | 128 ++
 .../profitability-vectorization-heuristic.ll  |   8 +-
 2 files changed, 106 insertions(+), 30 deletions(-)

diff --git a/llvm/lib/Transforms/Scalar/LoopInterchange.cpp 
b/llvm/lib/Transforms/Scalar/LoopInterchange.cpp
index 1dccba4cfa7b8..078da53c52b52 100644
--- a/llvm/lib/Transforms/Scalar/LoopInterchange.cpp
+++ b/llvm/lib/Transforms/Scalar/LoopInterchange.cpp
@@ -17,8 +17,8 @@
 #include "llvm/ADT/SmallSet.h"
 #include "llvm/ADT/SmallVector.h"
 #include "llvm/ADT/Statistic.h"
+#include "llvm/ADT/StringMap.h"
 #include "llvm/ADT/StringRef.h"
-#include "llvm/ADT/StringSet.h"
 #include "llvm/Analysis/DependenceAnalysis.h"
 #include "llvm/Analysis/LoopCacheAnalysis.h"
 #include "llvm/Analysis/LoopInfo.h"
@@ -80,6 +80,21 @@ enum class RuleTy {
   ForVectorization,
 };
 
+/// Store the information about if corresponding direction vector was negated
+/// by normalization or not. This is necessary to restore the original one from
+/// a row of a dependency matrix, because we only manage normalized direction
+/// vectors and duplicate vectors are eliminated. So there may be both original
+/// and negated vectors for a single entry (a row of dependency matrix). E.g.,
+/// if there are two direction vectors `[< =]` and `[> =]`, the later one will
+/// be converted to the same as former one by normalization, so only `[< =]`
+/// would be retained in the final result.
+struct NegatedStatus {
+  bool Original = false;
+  bool Negated = false;
+
+  bool isNonNegativeDir(char Dir) const;
+};
+
 } // end anonymous namespace
 
 // Minimum loop depth supported.
@@ -126,9 +141,10 @@ static void printDepMatrix(CharMatrix &DepMatrix) {
 }
 #endif
 
-static bool populateDependencyMatrix(CharMatrix &DepMatrix, unsigned Level,
- Loop *L, DependenceInfo *DI,
- ScalarEvolution *SE,
+static bool populateDependencyMatrix(CharMatrix &DepMatrix,
+ std::vector &NegStatusVec,
+ unsigned Level, Loop *L,
+ DependenceInfo *DI, ScalarEvolution *SE,
  OptimizationRemarkEmitter *ORE) {
   using ValueVector = SmallVector;
 
@@ -167,7 +183,9 @@ static bool populateDependencyMatrix(CharMatrix &DepMatrix, 
unsigned Level,
 return false;
   }
   ValueVector::iterator I, IE, J, JE;
-  StringSet<> Seen;
+
+  // Manage all found direction vectors. and map it to the index of DepMatrix.
+  StringMap Seen;
 
   for (I = MemInstr.begin(), IE = MemInstr.end(); I != IE; ++I) {
 for (J = I, JE = MemInstr.end(); J != JE; ++J) {
@@ -182,7 +200,8 @@ static bool populateDependencyMatrix(CharMatrix &DepMatrix, 
unsigned Level,
 assert(D->isOrdered() && "Expected an output, flow or anti dep.");
 // If the direction vector is negative, normalize it to
 // make it non-negative.
-if (D->normalize(SE))
+bool Normalized = D->normalize(SE);
+if (Normalized)
   LLVM_DEBUG(dbgs() << "Negative dependence vector normalized.\n");
 LLVM_DEBUG(StringRef DepType =
D->isFlow() ? "flow" : D->isAnti() ? "anti" : "output";
@@ -214,8 +233,17 @@ static bool populateDependencyMatrix(CharMatrix 
&DepMatrix, unsigned Level,
 }
 
 // Make sure we only add unique entries to the dependency matrix.
-if (Seen.insert(StringRef(Dep.data(), Dep.size())).second)
+unsigned Index = DepMatrix.size();
+auto [Ite, Inserted] =
+Seen.try_emplace(StringRef(Dep.data(), Dep.size()), Index);
+if (Inserted) {
   DepMatrix.push_back(Dep);
+  NegStatusVec.push_back(NegatedStatus{});
+} else
+  Index = Ite->second;
+
+

[llvm-branch-commits] [llvm] [LoopInterchange] Fix the vectorizable check for a loop (PR #133667)

2025-04-04 Thread Ryotaro Kasuga via llvm-branch-commits

https://github.com/kasuga-fj updated 
https://github.com/llvm/llvm-project/pull/133667

>From bd84ddc9e4dc645e965b2a6dc535a3023e0d7e45 Mon Sep 17 00:00:00 2001
From: Ryotaro Kasuga 
Date: Thu, 27 Mar 2025 09:52:16 +
Subject: [PATCH] [LoopInterchange] Fix the vectorizable check for a loop

In the profitability check for vectorization, the dependency matrix was
not handled correctly. This can result to make a wrong decision: It may
say "this loop can be vectorized" when in fact it cannot. The root cause
of this is that the check process early returns when it finds '=' or 'I'
in the dependency matrix. To make sure that we can actually vectorize
the loop, we need to check all the rows of the matrix. This patch fixes
the process of checking whether we can vectorize the loop or not. Now it
won't make a wrong decision for a loop that cannot be vectorized.

Related: #131130
---
 .../lib/Transforms/Scalar/LoopInterchange.cpp | 44 ---
 .../profitability-vectorization-heuristic.ll  |  9 ++--
 2 files changed, 30 insertions(+), 23 deletions(-)

diff --git a/llvm/lib/Transforms/Scalar/LoopInterchange.cpp 
b/llvm/lib/Transforms/Scalar/LoopInterchange.cpp
index e777f950a7c5a..1dccba4cfa7b8 100644
--- a/llvm/lib/Transforms/Scalar/LoopInterchange.cpp
+++ b/llvm/lib/Transforms/Scalar/LoopInterchange.cpp
@@ -1197,25 +1197,35 @@ 
LoopInterchangeProfitability::isProfitablePerInstrOrderCost() {
   return std::nullopt;
 }
 
+/// Return true if we can vectorize the loop specified by \p LoopId.
+static bool canVectorize(const CharMatrix &DepMatrix, unsigned LoopId) {
+  for (unsigned I = 0; I != DepMatrix.size(); I++) {
+char Dir = DepMatrix[I][LoopId];
+if (Dir != 'I' && Dir != '=')
+  return false;
+  }
+  return true;
+}
+
 std::optional LoopInterchangeProfitability::isProfitableForVectorization(
 unsigned InnerLoopId, unsigned OuterLoopId, CharMatrix &DepMatrix) {
-  for (auto &Row : DepMatrix) {
-// If the inner loop is loop independent or doesn't carry any dependency
-// it is not profitable to move this to outer position, since we are
-// likely able to do inner loop vectorization already.
-if (Row[InnerLoopId] == 'I' || Row[InnerLoopId] == '=')
-  return std::optional(false);
-
-// If the outer loop is not loop independent it is not profitable to move
-// this to inner position, since doing so would not enable inner loop
-// parallelism.
-if (Row[OuterLoopId] != 'I' && Row[OuterLoopId] != '=')
-  return std::optional(false);
-  }
-  // If inner loop has dependence and outer loop is loop independent then it
-  // is/ profitable to interchange to enable inner loop parallelism.
-  // If there are no dependences, interchanging will not improve anything.
-  return std::optional(!DepMatrix.empty());
+  // If the outer loop is not loop independent it is not profitable to move
+  // this to inner position, since doing so would not enable inner loop
+  // parallelism.
+  if (!canVectorize(DepMatrix, OuterLoopId))
+return false;
+
+  // If inner loop has dependence and outer loop is loop independent then it is
+  // profitable to interchange to enable inner loop parallelism.
+  if (!canVectorize(DepMatrix, InnerLoopId))
+return true;
+
+  // If both the inner and the outer loop can be vectorized, it is necessary to
+  // check the cost of each vectorized loop for profitability decision. At this
+  // time we do not have a cost model to estimate them, so return nullopt.
+  // TODO: Estimate the cost of vectorized loop when both the outer and the
+  // inner loop can be vectorized.
+  return std::nullopt;
 }
 
 bool LoopInterchangeProfitability::isProfitable(
diff --git 
a/llvm/test/Transforms/LoopInterchange/profitability-vectorization-heuristic.ll 
b/llvm/test/Transforms/LoopInterchange/profitability-vectorization-heuristic.ll
index 606117e70db86..b82dd5141a6b2 100644
--- 
a/llvm/test/Transforms/LoopInterchange/profitability-vectorization-heuristic.ll
+++ 
b/llvm/test/Transforms/LoopInterchange/profitability-vectorization-heuristic.ll
@@ -15,16 +15,13 @@
 ;   }
 ; }
 ;
-; FIXME: These loops are not exchanged at this time due to the problem of
-; profitablity heuristic for vectorization.
 
-; CHECK:  --- !Missed
+; CHECK:  --- !Passed
 ; CHECK-NEXT: Pass:loop-interchange
-; CHECK-NEXT: Name:InterchangeNotProfitable
+; CHECK-NEXT: Name:Interchanged
 ; CHECK-NEXT: Function:interchange_necesasry_for_vectorization
 ; CHECK-NEXT: Args:
-; CHECK-NEXT:   - String:  Interchanging loops is not considered to 
improve cache locality nor vectorization.
-; CHECK-NEXT: ...
+; CHECK-NEXT:   - String:  Loop interchanged with enclosing loop.
 define void @interchange_necesasry_for_vectorization() {
 entry:
   br label %for.i.header

___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-b

[llvm-branch-commits] [llvm] [MachinePipeliner] Introduce a new class for loop-carried deps (PR #137663)

2025-04-28 Thread Ryotaro Kasuga via llvm-branch-commits

https://github.com/kasuga-fj created 
https://github.com/llvm/llvm-project/pull/137663

In MachinePipeliner, loop-carried memory dependencies are represented by DAG, 
which makes things complicated and causes some necessary dependencies to be 
missing. This patch introduces a new class to manage loop-carried memory 
dependencies to simplify the logic. The ultimate goal is to add currently 
missing dependencies, but this is a first step of that, and this patch doesn't 
intend to change current behavior. This patch also adds new tests that show the 
missed dependencies, which should be fixed in the future.

Split off from #135148

>From 487aa84ff51e727fc53b1dc5da52281c14adebca Mon Sep 17 00:00:00 2001
From: Ryotaro Kasuga 
Date: Mon, 28 Apr 2025 13:04:47 +
Subject: [PATCH] [MachinePipeliner] Introduce a new class for loop-carried
 deps

In MachinePipeliner, loop-carried memory dependencies are represented by
DAG, which makes things complicated and causes some necessary
dependencies to be missing. This patch introduces a new class to manage
loop-carried memory dependencies to simplify the logic. The ultimate
goal is to add currently missing dependencies, but this is a first step
of that, and this patch doesn't intend to change current behavior. This
patch also adds new tests that show the missed dependencies, which
should be fixed in the future.

Split off from #135148
---
 llvm/include/llvm/CodeGen/MachinePipeliner.h  |  34 ++-
 llvm/lib/CodeGen/MachinePipeliner.cpp | 253 --
 .../sms-loop-carried-fp-exceptions1.mir   | 109 
 .../sms-loop-carried-fp-exceptions2.mir   |  99 +++
 .../Hexagon/swp-loop-carried-order-dep1.mir   | 110 
 .../Hexagon/swp-loop-carried-order-dep2.mir   | 105 
 .../Hexagon/swp-loop-carried-order-dep3.mir   | 109 
 .../Hexagon/swp-loop-carried-order-dep4.mir   | 109 
 .../Hexagon/swp-loop-carried-order-dep5.mir   | 111 
 .../Hexagon/swp-loop-carried-order-dep6.mir   | 154 +++
 10 files changed, 1163 insertions(+), 30 deletions(-)
 create mode 100644 
llvm/test/CodeGen/AArch64/sms-loop-carried-fp-exceptions1.mir
 create mode 100644 
llvm/test/CodeGen/AArch64/sms-loop-carried-fp-exceptions2.mir
 create mode 100644 llvm/test/CodeGen/Hexagon/swp-loop-carried-order-dep1.mir
 create mode 100644 llvm/test/CodeGen/Hexagon/swp-loop-carried-order-dep2.mir
 create mode 100644 llvm/test/CodeGen/Hexagon/swp-loop-carried-order-dep3.mir
 create mode 100644 llvm/test/CodeGen/Hexagon/swp-loop-carried-order-dep4.mir
 create mode 100644 llvm/test/CodeGen/Hexagon/swp-loop-carried-order-dep5.mir
 create mode 100644 llvm/test/CodeGen/Hexagon/swp-loop-carried-order-dep6.mir

diff --git a/llvm/include/llvm/CodeGen/MachinePipeliner.h 
b/llvm/include/llvm/CodeGen/MachinePipeliner.h
index 966ffb7a1fbd2..e4e794c434adb 100644
--- a/llvm/include/llvm/CodeGen/MachinePipeliner.h
+++ b/llvm/include/llvm/CodeGen/MachinePipeliner.h
@@ -190,6 +190,38 @@ class SwingSchedulerDDGEdge {
   bool ignoreDependence(bool IgnoreAnti) const;
 };
 
+/// Represents loop-carried dependencies. Because SwingSchedulerDAG doesn't
+/// assume cycle dependencies as the name suggests, such dependencies must be
+/// handled separately. After DAG construction is finished, these dependencies
+/// are added to SwingSchedulerDDG.
+/// TODO: Also handle output-dependencies introduced by physical registers.
+struct LoopCarriedEdges {
+  using OrderDep = SmallSetVector;
+  using OrderDepsType = DenseMap;
+
+  OrderDepsType OrderDeps;
+
+  const OrderDep *getOrderDepOrNull(SUnit *Key) const {
+auto Ite = OrderDeps.find(Key);
+if (Ite == OrderDeps.end())
+  return nullptr;
+return &Ite->second;
+  }
+
+  /// Retruns true if the edge from \p From to \p To is a back-edge that should
+  /// be used when scheduling.
+  bool shouldUseWhenScheduling(const SUnit *From, const SUnit *To) const;
+
+  /// Adds some edges to the original DAG that correspond to loop-carried
+  /// dependencies. Historically, loop-carried edges are represented by using
+  /// non-loop-carried edges in the original DAG. This function appends such
+  /// edges to preserve the previous behavior.
+  void modifySUnits(std::vector &SUnits);
+
+  void dump(SUnit *SU, const TargetRegisterInfo *TRI,
+const MachineRegisterInfo *MRI) const;
+};
+
 /// Represents dependencies between instructions. This class is a wrapper of
 /// `SUnits` and its dependencies to manipulate back-edges in a natural way.
 /// Currently it only supports back-edges via PHI, which are expressed as
@@ -402,7 +434,7 @@ class SwingSchedulerDAG : public ScheduleDAGInstrs {
  const MachineInstr *OtherMI) const;
 
 private:
-  void addLoopCarriedDependences();
+  LoopCarriedEdges addLoopCarriedDependences();
   void updatePhiDependences();
   void changeDependences();
   unsigned calculateResMII();
diff --git a/llvm/lib/CodeGen/MachinePipeliner.cpp 
b/llvm/lib/CodeGen/Machin

[llvm-branch-commits] [llvm] [MachinePipeliner] Introduce a new class for loop-carried deps (PR #137663)

2025-05-01 Thread Ryotaro Kasuga via llvm-branch-commits

kasuga-fj wrote:

This PR is a part of [Stacked Pull 
Requests](https://llvm.org/docs/GitHub.html#stacked-pull-requests) and depends 
on #137662. The target branch will automatically change to main after the 
dependent PR is merged. Could you please take a look at #137662 at first?

https://github.com/llvm/llvm-project/pull/137663
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [LoopInterchange] Add tests for the vectorization profitability (NFC) (PR #133665)

2025-03-30 Thread Ryotaro Kasuga via llvm-branch-commits

https://github.com/kasuga-fj created 
https://github.com/llvm/llvm-project/pull/133665

There is a problem with the current profitability check for vectorization in 
LoopInterchange. There are both false positives and false negatives. The former 
means that the heuristic may say that "an exchange is necessary to vectorize 
the innermost loop" even though it's already possible. The latter means that 
the heuristic may miss a case where an exchange is necessary to vectorize the 
innermost loop. Note that this is not a dependency analysis problem.  These 
problems can occur even if the analysis is accurate (no overestimation).

This patch adds tests to clarify the cases that should be fixed. The root cause 
of these cases is that the heuristic doesn't handle the direction of a 
dependency correctly.

>From b53b7ce2b303ff9ea94d77b3ffe74d1697db9f3d Mon Sep 17 00:00:00 2001
From: Ryotaro Kasuga 
Date: Thu, 27 Mar 2025 07:04:27 +
Subject: [PATCH] [LoopInterchange] Add tests for the vectorization
 profitability (NFC)

There is a problem with the current profitability check for
vectorization in LoopInterchange. There are both false positives and
false negatives. The former means that the heuristic may say that "an
exchange is necessary to vectorize the innermost loop" even though it's
already possible. The latter means that the heuristic may miss a case
where an exchange is necessary to vectorize the innermost loop. Note
that this is not a dependency analysis problem.  These problems can
occur even if the analysis is accurate (no overestimation).

This patch adds tests to clarify the cases that should be fixed. The
root cause of these cases is that the heuristic doesn't handle the
direction of a dependency correctly.
---
 .../profitability-vectorization-heuristic.ll  | 108 ++
 1 file changed, 108 insertions(+)
 create mode 100644 
llvm/test/Transforms/LoopInterchange/profitability-vectorization-heuristic.ll

diff --git 
a/llvm/test/Transforms/LoopInterchange/profitability-vectorization-heuristic.ll 
b/llvm/test/Transforms/LoopInterchange/profitability-vectorization-heuristic.ll
new file mode 100644
index 0..606117e70db86
--- /dev/null
+++ 
b/llvm/test/Transforms/LoopInterchange/profitability-vectorization-heuristic.ll
@@ -0,0 +1,108 @@
+; RUN: opt < %s -passes=loop-interchange -cache-line-size=64 \
+; RUN: -pass-remarks-output=%t -disable-output 
-loop-interchange-profitabilities=vectorize
+; RUN: FileCheck -input-file %t %s
+
+@A = dso_local global [256 x [256 x float]] zeroinitializer
+@B = dso_local global [256 x [256 x float]] zeroinitializer
+@C = dso_local global [256 x [256 x float]] zeroinitializer
+
+; Check that the below loops are exchanged for vectorization.
+;
+; for (int i = 0; i < 256; i++) {
+;   for (int j = 1; j < 256; j++) {
+; A[i][j] = A[i][j-1] + B[i][j];
+; C[i][j] += 1;
+;   }
+; }
+;
+; FIXME: These loops are not exchanged at this time due to the problem of
+; profitablity heuristic for vectorization.
+
+; CHECK:  --- !Missed
+; CHECK-NEXT: Pass:loop-interchange
+; CHECK-NEXT: Name:InterchangeNotProfitable
+; CHECK-NEXT: Function:interchange_necesasry_for_vectorization
+; CHECK-NEXT: Args:
+; CHECK-NEXT:   - String:  Interchanging loops is not considered to 
improve cache locality nor vectorization.
+; CHECK-NEXT: ...
+define void @interchange_necesasry_for_vectorization() {
+entry:
+  br label %for.i.header
+
+for.i.header:
+  %i = phi i64 [ 1, %entry ], [ %i.next, %for.i.inc ]
+  br label %for.j.body
+
+for.j.body:
+  %j = phi i64 [ 1, %for.i.header ], [ %j.next, %for.j.body ]
+  %j.dec = add nsw i64 %j, -1
+  %a.load.index = getelementptr nuw inbounds [256 x [256 x float]], ptr @A, 
i64 %i, i64 %j.dec
+  %b.index = getelementptr nuw inbounds [256 x [256 x float]], ptr @B, i64 %i, 
i64 %j
+  %c.index = getelementptr nuw inbounds [256 x [256 x float]], ptr @C, i64 %i, 
i64 %j
+  %a = load float, ptr %a.load.index, align 4
+  %b = load float, ptr %b.index, align 4
+  %c = load float, ptr %c.index, align 4
+  %add.0 = fadd float %a, %b
+  %a.store.index = getelementptr nuw inbounds [256 x [256 x float]], ptr @A, 
i64 %i, i64 %j
+  store float %add.0, ptr %a.store.index, align 4
+  %add.1 = fadd float %c, 1.0
+  store float %add.1, ptr %c.index, align 4
+  %j.next = add nuw nsw i64 %j, 1
+  %cmp.j = icmp eq i64 %j.next, 256
+  br i1 %cmp.j, label %for.i.inc, label %for.j.body
+
+for.i.inc:
+  %i.next = add nuw nsw i64 %i, 1
+  %cmp.i = icmp eq i64 %i.next, 256
+  br i1 %cmp.i, label %exit, label %for.i.header
+
+exit:
+  ret void
+}
+
+; Check that the following innermost loop can be vectorized so that
+; interchangig is unnecessary.
+;
+; for (int i = 0; i < 256; i++)
+;   for (int j = 1; j < 256; j++)
+; A[i][j-1] = A[i][j] + B[i][j];
+;
+; FIXME: These loops are exchanged at this time due to the problem of
+; profitablity heuristic for vectorization.
+
+; CHECK:  --- !Passed
+; CHECK-NEXT: Pass

[llvm-branch-commits] [llvm] [LoopInterchange] Improve profitability check for vectorization (PR #133672)

2025-03-30 Thread Ryotaro Kasuga via llvm-branch-commits

https://github.com/kasuga-fj created 
https://github.com/llvm/llvm-project/pull/133672

The vectorization profitability has a process to check whether a given loop can 
be vectorized or not. Since the process is conservative, a loop that can be 
vectorized may be deemed not to be possible. This can trigger unnecessary 
exchanges.
This patch improves the profitability decision by mitigating such misjudgments. 
Before this patch, we considered a loop to be vectorizable only when there are 
no loop carried dependencies with the IV of the loop. However, a loop carried 
dependency doesn't prevent vectorization if the distance is positive. This 
patch makes the vectorization check more accurate by allowing a loop with the 
positive dependency. Note that it is difficult to make a complete decision 
whether a loop can be vectorized or not. To achieve this, we must check the 
vector width and the distance of dependency.

>From cdec72a2b2c365e29cbe05f2ad2d21b403104999 Mon Sep 17 00:00:00 2001
From: Ryotaro Kasuga 
Date: Thu, 27 Mar 2025 10:45:26 +
Subject: [PATCH] [LoopInterchange] Improve profitability check for
 vectorization

The vectorization profitability has a process to check whether a given
loop can be vectorized or not. Since the process is conservative, a loop
that can be vectorized may be deemed not to be possible. This can
trigger unnecessary exchanges.
This patch improves the profitability decision by mitigating such
misjudgments. Before this patch, we considered a loop to be vectorizable
only when there are no loop carried dependencies with the IV of the
loop. However, a loop carried dependency doesn't prevent vectorization
if the distance is positive. This patch makes the vectorization check
more accurate by allowing a loop with the positive dependency. Note that
it is difficult to make a complete decision whether a loop can be
vectorized or not. To achieve this, we must check the vector width and
the distance of dependency.
---
 .../lib/Transforms/Scalar/LoopInterchange.cpp | 128 ++
 .../profitability-vectorization-heuristic.ll  |   8 +-
 2 files changed, 106 insertions(+), 30 deletions(-)

diff --git a/llvm/lib/Transforms/Scalar/LoopInterchange.cpp 
b/llvm/lib/Transforms/Scalar/LoopInterchange.cpp
index b6b0b7d7a947a..0c3a9cbfeed5f 100644
--- a/llvm/lib/Transforms/Scalar/LoopInterchange.cpp
+++ b/llvm/lib/Transforms/Scalar/LoopInterchange.cpp
@@ -17,8 +17,8 @@
 #include "llvm/ADT/SmallSet.h"
 #include "llvm/ADT/SmallVector.h"
 #include "llvm/ADT/Statistic.h"
+#include "llvm/ADT/StringMap.h"
 #include "llvm/ADT/StringRef.h"
-#include "llvm/ADT/StringSet.h"
 #include "llvm/Analysis/DependenceAnalysis.h"
 #include "llvm/Analysis/LoopCacheAnalysis.h"
 #include "llvm/Analysis/LoopInfo.h"
@@ -80,6 +80,21 @@ enum class RuleTy {
   ForVectorization,
 };
 
+/// Store the information about if corresponding direction vector was negated
+/// by normalization or not. This is necessary to restore the original one from
+/// a row of a dependency matrix because we only manage normalized direction
+/// vectors. Also, duplicate vectors are eliminated, so there may be both
+/// original and negated vectors for a single entry (a row of dependency
+/// matrix). E.g., if there are two direction vectors `[< =]` and `[> =]`, the
+/// later one will be converted to the same as former one by normalization, so
+/// only `[< =]` would be retained in the final result.
+struct NegatedStatus {
+  bool Original = false;
+  bool Negated = false;
+
+  bool isNonNegativeDir(char Dir) const;
+};
+
 } // end anonymous namespace
 
 // Minimum loop depth supported.
@@ -126,9 +141,10 @@ static void printDepMatrix(CharMatrix &DepMatrix) {
 }
 #endif
 
-static bool populateDependencyMatrix(CharMatrix &DepMatrix, unsigned Level,
- Loop *L, DependenceInfo *DI,
- ScalarEvolution *SE,
+static bool populateDependencyMatrix(CharMatrix &DepMatrix,
+ std::vector &NegStatusVec,
+ unsigned Level, Loop *L,
+ DependenceInfo *DI, ScalarEvolution *SE,
  OptimizationRemarkEmitter *ORE) {
   using ValueVector = SmallVector;
 
@@ -167,7 +183,9 @@ static bool populateDependencyMatrix(CharMatrix &DepMatrix, 
unsigned Level,
 return false;
   }
   ValueVector::iterator I, IE, J, JE;
-  StringSet<> Seen;
+
+  // Manage all found direction vectors. and map it to the index of DepMatrix.
+  StringMap Seen;
 
   for (I = MemInstr.begin(), IE = MemInstr.end(); I != IE; ++I) {
 for (J = I, JE = MemInstr.end(); J != JE; ++J) {
@@ -182,7 +200,8 @@ static bool populateDependencyMatrix(CharMatrix &DepMatrix, 
unsigned Level,
 assert(D->isOrdered() && "Expected an output, flow or anti dep.");
 // If the direction vector is negative, normalize it to
 // make it non-negative.
-if (D->normalize(SE))
+  

[llvm-branch-commits] [llvm] [LoopInterchange] Improve profitability check for vectorization (PR #133672)

2025-03-31 Thread Ryotaro Kasuga via llvm-branch-commits

kasuga-fj wrote:

Depends on #133667 

https://github.com/llvm/llvm-project/pull/133672
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [LoopInterchange] Fix the vectorizable check for a loop (PR #133667)

2025-03-31 Thread Ryotaro Kasuga via llvm-branch-commits

kasuga-fj wrote:

Depends on #133665 

https://github.com/llvm/llvm-project/pull/133667
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [LoopInterchange] Add tests for the vectorization profitability (NFC) (PR #133665)

2025-04-04 Thread Ryotaro Kasuga via llvm-branch-commits

kasuga-fj wrote:

Depends on #133664 (Sorry for inconvenience, I tried using 
[Graphite](https://app.graphite.dev/), but it didn't work fine due to my 
network problem).

https://github.com/llvm/llvm-project/pull/133665
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [LoopInterchange] Improve profitability check for vectorization (PR #133672)

2025-04-02 Thread Ryotaro Kasuga via llvm-branch-commits

https://github.com/kasuga-fj updated 
https://github.com/llvm/llvm-project/pull/133672

>From 1a1c1f61a8cb179443d782127c157695bd21f6cc Mon Sep 17 00:00:00 2001
From: Ryotaro Kasuga 
Date: Thu, 27 Mar 2025 10:45:26 +
Subject: [PATCH] [LoopInterchange] Improve profitability check for
 vectorization

The vectorization profitability has a process to check whether a given
loop can be vectorized or not. Since the process is conservative, a loop
that can be vectorized may be deemed not to be possible. This can
trigger unnecessary exchanges.
This patch improves the profitability decision by mitigating such
misjudgments. Before this patch, we considered a loop to be vectorizable
only when there are no loop carried dependencies with the IV of the
loop. However, a loop carried dependency doesn't prevent vectorization
if the distance is positive. This patch makes the vectorization check
more accurate by allowing a loop with the positive dependency. Note that
it is difficult to make a complete decision whether a loop can be
vectorized or not. To achieve this, we must check the vector width and
the distance of dependency.
---
 .../lib/Transforms/Scalar/LoopInterchange.cpp | 128 ++
 .../profitability-vectorization-heuristic.ll  |   8 +-
 2 files changed, 106 insertions(+), 30 deletions(-)

diff --git a/llvm/lib/Transforms/Scalar/LoopInterchange.cpp 
b/llvm/lib/Transforms/Scalar/LoopInterchange.cpp
index 1dccba4cfa7b8..078da53c52b52 100644
--- a/llvm/lib/Transforms/Scalar/LoopInterchange.cpp
+++ b/llvm/lib/Transforms/Scalar/LoopInterchange.cpp
@@ -17,8 +17,8 @@
 #include "llvm/ADT/SmallSet.h"
 #include "llvm/ADT/SmallVector.h"
 #include "llvm/ADT/Statistic.h"
+#include "llvm/ADT/StringMap.h"
 #include "llvm/ADT/StringRef.h"
-#include "llvm/ADT/StringSet.h"
 #include "llvm/Analysis/DependenceAnalysis.h"
 #include "llvm/Analysis/LoopCacheAnalysis.h"
 #include "llvm/Analysis/LoopInfo.h"
@@ -80,6 +80,21 @@ enum class RuleTy {
   ForVectorization,
 };
 
+/// Store the information about if corresponding direction vector was negated
+/// by normalization or not. This is necessary to restore the original one from
+/// a row of a dependency matrix, because we only manage normalized direction
+/// vectors and duplicate vectors are eliminated. So there may be both original
+/// and negated vectors for a single entry (a row of dependency matrix). E.g.,
+/// if there are two direction vectors `[< =]` and `[> =]`, the later one will
+/// be converted to the same as former one by normalization, so only `[< =]`
+/// would be retained in the final result.
+struct NegatedStatus {
+  bool Original = false;
+  bool Negated = false;
+
+  bool isNonNegativeDir(char Dir) const;
+};
+
 } // end anonymous namespace
 
 // Minimum loop depth supported.
@@ -126,9 +141,10 @@ static void printDepMatrix(CharMatrix &DepMatrix) {
 }
 #endif
 
-static bool populateDependencyMatrix(CharMatrix &DepMatrix, unsigned Level,
- Loop *L, DependenceInfo *DI,
- ScalarEvolution *SE,
+static bool populateDependencyMatrix(CharMatrix &DepMatrix,
+ std::vector &NegStatusVec,
+ unsigned Level, Loop *L,
+ DependenceInfo *DI, ScalarEvolution *SE,
  OptimizationRemarkEmitter *ORE) {
   using ValueVector = SmallVector;
 
@@ -167,7 +183,9 @@ static bool populateDependencyMatrix(CharMatrix &DepMatrix, 
unsigned Level,
 return false;
   }
   ValueVector::iterator I, IE, J, JE;
-  StringSet<> Seen;
+
+  // Manage all found direction vectors. and map it to the index of DepMatrix.
+  StringMap Seen;
 
   for (I = MemInstr.begin(), IE = MemInstr.end(); I != IE; ++I) {
 for (J = I, JE = MemInstr.end(); J != JE; ++J) {
@@ -182,7 +200,8 @@ static bool populateDependencyMatrix(CharMatrix &DepMatrix, 
unsigned Level,
 assert(D->isOrdered() && "Expected an output, flow or anti dep.");
 // If the direction vector is negative, normalize it to
 // make it non-negative.
-if (D->normalize(SE))
+bool Normalized = D->normalize(SE);
+if (Normalized)
   LLVM_DEBUG(dbgs() << "Negative dependence vector normalized.\n");
 LLVM_DEBUG(StringRef DepType =
D->isFlow() ? "flow" : D->isAnti() ? "anti" : "output";
@@ -214,8 +233,17 @@ static bool populateDependencyMatrix(CharMatrix 
&DepMatrix, unsigned Level,
 }
 
 // Make sure we only add unique entries to the dependency matrix.
-if (Seen.insert(StringRef(Dep.data(), Dep.size())).second)
+unsigned Index = DepMatrix.size();
+auto [Ite, Inserted] =
+Seen.try_emplace(StringRef(Dep.data(), Dep.size()), Index);
+if (Inserted) {
   DepMatrix.push_back(Dep);
+  NegStatusVec.push_back(NegatedStatus{});
+} else
+  Index = Ite->second;
+
+

[llvm-branch-commits] [llvm] [LoopInterchange] Improve profitability check for vectorization (PR #133672)

2025-04-02 Thread Ryotaro Kasuga via llvm-branch-commits


@@ -80,6 +80,21 @@ enum class RuleTy {
   ForVectorization,
 };
 
+/// Store the information about if corresponding direction vector was negated

kasuga-fj wrote:

> I think duplicated direction vectors are always allowed. They don't add new 
> or different information, so it shouldn't effect the interpretation of the 
> dependence analysis in any way. The only thing that it affects is processing 
> the same information again and again, so the only benefit of making them 
> unique is to avoid that.

I agree with this, and am only concerned about the compile time degradation. So 
I will try to compare the difference of the number of entries with or without 
making them unique. Thank you for your opinion!

https://github.com/llvm/llvm-project/pull/133672
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [LoopInterchange] Relax the legality check to accept more patterns (PR #118267)

2025-05-09 Thread Ryotaro Kasuga via llvm-branch-commits

https://github.com/kasuga-fj edited 
https://github.com/llvm/llvm-project/pull/118267
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [LoopInterchange] Relax the legality check to accept more patterns (PR #118267)

2025-05-09 Thread Ryotaro Kasuga via llvm-branch-commits

https://github.com/kasuga-fj updated 
https://github.com/llvm/llvm-project/pull/118267



  



Rate limit · GitHub


  body {
background-color: #f6f8fa;
color: #24292e;
font-family: -apple-system,BlinkMacSystemFont,Segoe 
UI,Helvetica,Arial,sans-serif,Apple Color Emoji,Segoe UI Emoji,Segoe UI Symbol;
font-size: 14px;
line-height: 1.5;
margin: 0;
  }

  .container { margin: 50px auto; max-width: 600px; text-align: center; 
padding: 0 24px; }

  a { color: #0366d6; text-decoration: none; }
  a:hover { text-decoration: underline; }

  h1 { line-height: 60px; font-size: 48px; font-weight: 300; margin: 0px; 
text-shadow: 0 1px 0 #fff; }
  p { color: rgba(0, 0, 0, 0.5); margin: 20px 0 40px; }

  ul { list-style: none; margin: 25px 0; padding: 0; }
  li { display: table-cell; font-weight: bold; width: 1%; }

  .logo { display: inline-block; margin-top: 35px; }
  .logo-img-2x { display: none; }
  @media
  only screen and (-webkit-min-device-pixel-ratio: 2),
  only screen and (   min--moz-device-pixel-ratio: 2),
  only screen and ( -o-min-device-pixel-ratio: 2/1),
  only screen and (min-device-pixel-ratio: 2),
  only screen and (min-resolution: 192dpi),
  only screen and (min-resolution: 2dppx) {
.logo-img-1x { display: none; }
.logo-img-2x { display: inline-block; }
  }

  #suggestions {
margin-top: 35px;
color: #ccc;
  }
  #suggestions a {
color: #66;
font-weight: 200;
font-size: 14px;
margin: 0 10px;
  }


  
  



  Whoa there!
  You have exceeded a secondary rate limit.
Please wait a few minutes before you try again;
in some cases this may take up to an hour.
  
  
https://support.github.com/contact";>Contact Support —
https://githubstatus.com";>GitHub Status —
https://twitter.com/githubstatus";>@githubstatus
  

  

  

  

  

  


___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [LoopInterchange] Relax the legality check to accept more patterns (PR #118267)

2025-05-09 Thread Ryotaro Kasuga via llvm-branch-commits

https://github.com/kasuga-fj edited 
https://github.com/llvm/llvm-project/pull/118267
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [LoopInterchange] Relax the legality check to accept more patterns (PR #118267)

2025-05-09 Thread Ryotaro Kasuga via llvm-branch-commits

kasuga-fj wrote:

@Meinersbur Could you please take a look?

I have submitted another PR #139254 as well, on which this PR depends, to 
address the following problems: I first tried to simply replace the function 
`isLexicographicallyPositive(std::vector &DV)` to take an additional 
argument like `StartAt` to ignore the prefix, as discussed in the last meeting. 
However, it made several tests to fail, e.g. a case where a direction vector is 
`[< > =]` and we would like to swap the last two loops. It is safe because the 
first element guarantees that the whole vector is lexicographically positive, 
but just changing the legality check to drop the unrelated prefix rejected such 
a case because it makes the direction vector to be `[> =]`, which is negative.

The original issue #71519 isn't resolved by this patch because a problem 
similar to the above exists (I believe this can be resolved by improving the 
handling of negative direction). For now, I'd like to add a logic to ignore the 
irrelevant prefix of direction vectors, and will create another PR to address 
#71519.

Thanks.

https://github.com/llvm/llvm-project/pull/118267
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [LoopInterchange] Relax the legality check to accept more patterns (PR #118267)

2025-05-12 Thread Ryotaro Kasuga via llvm-branch-commits

kasuga-fj wrote:

Thanks for your review.

https://github.com/llvm/llvm-project/pull/118267
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [LoopInterchange] Relax the legality check to accept more patterns (PR #118267)

2025-05-13 Thread Ryotaro Kasuga via llvm-branch-commits

kasuga-fj wrote:

I misunderstood GitHub behavior and cannot reopen this PR... Submitted another 
PR #139690 with same contents.

https://github.com/llvm/llvm-project/pull/118267
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [DA] Add check for base pointer invariance (PR #148241)

2025-07-11 Thread Ryotaro Kasuga via llvm-branch-commits

https://github.com/kasuga-fj ready_for_review 
https://github.com/llvm/llvm-project/pull/148241
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [DA] Add check for base pointer invariance (PR #148241)

2025-07-11 Thread Ryotaro Kasuga via llvm-branch-commits


@@ -113,7 +113,7 @@ define void @banerjee1(ptr %A, ptr %B, i64 %m, i64 %n) 
nounwind uwtable ssp {
 ; CHECK-NEXT:  Src: %2 = load i64, ptr %arrayidx6, align 8 --> Dst: store i64 
%2, ptr %B.addr.12, align 8
 ; CHECK-NEXT:da analyze - confused!
 ; CHECK-NEXT:  Src: store i64 %2, ptr %B.addr.12, align 8 --> Dst: store i64 
%2, ptr %B.addr.12, align 8
-; CHECK-NEXT:da analyze - output [* *]!

kasuga-fj wrote:

All the test changes except for FlipFlopBaseAddress.ll are related to 
dependencies between memory accesses of the form `*B++`.

https://github.com/llvm/llvm-project/pull/148241
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [DA] Add check for base pointer invariance (PR #148241)

2025-07-11 Thread Ryotaro Kasuga via llvm-branch-commits

https://github.com/kasuga-fj created 
https://github.com/llvm/llvm-project/pull/148241

None

>From 3596b1726514dda5942d99d4779852e01367c764 Mon Sep 17 00:00:00 2001
From: Ryotaro Kasuga 
Date: Fri, 11 Jul 2025 13:10:44 +
Subject: [PATCH] [DA] Add check for base pointer invariance

---
 llvm/lib/Analysis/DependenceAnalysis.cpp   | 16 
 .../DependenceAnalysis/FlipFlopBaseAddress.ll  | 18 +-
 2 files changed, 25 insertions(+), 9 deletions(-)

diff --git a/llvm/lib/Analysis/DependenceAnalysis.cpp 
b/llvm/lib/Analysis/DependenceAnalysis.cpp
index 428342f51ad2e..07bf560772c8c 100644
--- a/llvm/lib/Analysis/DependenceAnalysis.cpp
+++ b/llvm/lib/Analysis/DependenceAnalysis.cpp
@@ -3383,6 +3383,10 @@ bool DependenceInfo::tryDelinearize(Instruction *Src, 
Instruction *Dst,
 SrcSubscripts, DstSubscripts))
 return false;
 
+  assert(isLoopInvariant(SrcBase, SrcLoop) &&
+ isLoopInvariant(DstBase, DstLoop) &&
+ "Expected SrcBase and DstBase to be loop invariant");
+
   int Size = SrcSubscripts.size();
   LLVM_DEBUG({
 dbgs() << "\nSrcSubscripts: ";
@@ -3666,6 +3670,18 @@ DependenceInfo::depends(Instruction *Src, Instruction 
*Dst,
 SCEVUnionPredicate(Assume, *SE));
   }
 
+  // Even if the base pointers are the same, they may not be loop-invariant. It
+  // could lead to incorrect results, as we're analyzing loop-carried
+  // dependencies.
+  Loop *SrcLoop = LI->getLoopFor(Src->getParent());
+  Loop *DstLoop = LI->getLoopFor(Dst->getParent());
+  if (!isLoopInvariant(SrcBase, SrcLoop) ||
+  !isLoopInvariant(DstBase, DstLoop)) {
+LLVM_DEBUG(dbgs() << "The base pointer is not loop invariant.\n");
+return std::make_unique(Src, Dst,
+SCEVUnionPredicate(Assume, *SE));
+  }
+
   uint64_t EltSize = SrcLoc.Size.toRaw();
   const SCEV *SrcEv = SE->getMinusSCEV(SrcSCEV, SrcBase);
   const SCEV *DstEv = SE->getMinusSCEV(DstSCEV, DstBase);
diff --git a/llvm/test/Analysis/DependenceAnalysis/FlipFlopBaseAddress.ll 
b/llvm/test/Analysis/DependenceAnalysis/FlipFlopBaseAddress.ll
index 843c18a6e0d1e..a357018563be1 100644
--- a/llvm/test/Analysis/DependenceAnalysis/FlipFlopBaseAddress.ll
+++ b/llvm/test/Analysis/DependenceAnalysis/FlipFlopBaseAddress.ll
@@ -8,11 +8,11 @@
 define float @bug41488_test1(float %f) {
 ; CHECK-LABEL: 'bug41488_test1'
 ; CHECK-NEXT:  Src: %0 = load float, ptr %p, align 4 --> Dst: %0 = load float, 
ptr %p, align 4
-; CHECK-NEXT:da analyze - input [*]!
+; CHECK-NEXT:da analyze - confused!
 ; CHECK-NEXT:  Src: %0 = load float, ptr %p, align 4 --> Dst: store float %f, 
ptr %q, align 4
 ; CHECK-NEXT:da analyze - confused!
 ; CHECK-NEXT:  Src: store float %f, ptr %q, align 4 --> Dst: store float %f, 
ptr %q, align 4
-; CHECK-NEXT:da analyze - output [*]!
+; CHECK-NEXT:da analyze - confused!
 ;
 entry:
   %g = alloca float, align 4
@@ -34,11 +34,11 @@ for.cond.cleanup:
 define void @bug41488_test2(i32 %n) {
 ; CHECK-LABEL: 'bug41488_test2'
 ; CHECK-NEXT:  Src: %0 = load float, ptr %p, align 4 --> Dst: %0 = load float, 
ptr %p, align 4
-; CHECK-NEXT:da analyze - input [*]!
+; CHECK-NEXT:da analyze - confused!
 ; CHECK-NEXT:  Src: %0 = load float, ptr %p, align 4 --> Dst: store float 
0.00e+00, ptr %q, align 4
 ; CHECK-NEXT:da analyze - confused!
 ; CHECK-NEXT:  Src: store float 0.00e+00, ptr %q, align 4 --> Dst: store 
float 0.00e+00, ptr %q, align 4
-; CHECK-NEXT:da analyze - output [*]!
+; CHECK-NEXT:da analyze - confused!
 ;
 entry:
   %g = alloca float, align 4
@@ -68,7 +68,7 @@ define void @bug53942_foo(i32 noundef %n, ptr noalias 
nocapture noundef writeonl
 ; CHECK-NEXT:  Src: %.pre = load double, ptr %B, align 8 --> Dst: store double 
%.pre, ptr %arrayidx2, align 8
 ; CHECK-NEXT:da analyze - confused!
 ; CHECK-NEXT:  Src: store double %.pre, ptr %arrayidx2, align 8 --> Dst: store 
double %.pre, ptr %arrayidx2, align 8
-; CHECK-NEXT:da analyze - output [*]!
+; CHECK-NEXT:da analyze - confused!
 ;
 entry:
   %cmp8 = icmp sgt i32 %n, 1
@@ -99,11 +99,11 @@ for.body: ; preds = 
%for.body.preheader,
 define void @bug53942_bar(i32 noundef %n, ptr noalias noundef %A, ptr noalias 
noundef %B) {
 ; CHECK-LABEL: 'bug53942_bar'
 ; CHECK-NEXT:  Src: %0 = load double, ptr %arrayidx, align 8 --> Dst: %0 = 
load double, ptr %arrayidx, align 8
-; CHECK-NEXT:da analyze - input [*]!
+; CHECK-NEXT:da analyze - confused!
 ; CHECK-NEXT:  Src: %0 = load double, ptr %arrayidx, align 8 --> Dst: store 
double %0, ptr %arrayidx8, align 8
 ; CHECK-NEXT:da analyze - confused!
 ; CHECK-NEXT:  Src: store double %0, ptr %arrayidx8, align 8 --> Dst: store 
double %0, ptr %arrayidx8, align 8
-; CHECK-NEXT:da analyze - output [*]!
+; CHECK-NEXT:da analyze - confused!
 ;
 entry:
   br label %for.cond
@@ -173,7 +173,7 @@ for.end: 

[llvm-branch-commits] [llvm] [DA] Add check for base pointer invariance (PR #148241)

2025-07-11 Thread Ryotaro Kasuga via llvm-branch-commits

https://github.com/kasuga-fj updated 
https://github.com/llvm/llvm-project/pull/148241

>From d914ea5f3c44387570cab65ce9a507ebf429f827 Mon Sep 17 00:00:00 2001
From: Ryotaro Kasuga 
Date: Fri, 11 Jul 2025 13:10:44 +
Subject: [PATCH] [DA] Add check for base pointer invariance

---
 llvm/lib/Analysis/DependenceAnalysis.cpp   | 16 
 .../DependenceAnalysis/FlipFlopBaseAddress.ll  | 18 +-
 2 files changed, 25 insertions(+), 9 deletions(-)

diff --git a/llvm/lib/Analysis/DependenceAnalysis.cpp 
b/llvm/lib/Analysis/DependenceAnalysis.cpp
index 428342f51ad2e..07bf560772c8c 100644
--- a/llvm/lib/Analysis/DependenceAnalysis.cpp
+++ b/llvm/lib/Analysis/DependenceAnalysis.cpp
@@ -3383,6 +3383,10 @@ bool DependenceInfo::tryDelinearize(Instruction *Src, 
Instruction *Dst,
 SrcSubscripts, DstSubscripts))
 return false;
 
+  assert(isLoopInvariant(SrcBase, SrcLoop) &&
+ isLoopInvariant(DstBase, DstLoop) &&
+ "Expected SrcBase and DstBase to be loop invariant");
+
   int Size = SrcSubscripts.size();
   LLVM_DEBUG({
 dbgs() << "\nSrcSubscripts: ";
@@ -3666,6 +3670,18 @@ DependenceInfo::depends(Instruction *Src, Instruction 
*Dst,
 SCEVUnionPredicate(Assume, *SE));
   }
 
+  // Even if the base pointers are the same, they may not be loop-invariant. It
+  // could lead to incorrect results, as we're analyzing loop-carried
+  // dependencies.
+  Loop *SrcLoop = LI->getLoopFor(Src->getParent());
+  Loop *DstLoop = LI->getLoopFor(Dst->getParent());
+  if (!isLoopInvariant(SrcBase, SrcLoop) ||
+  !isLoopInvariant(DstBase, DstLoop)) {
+LLVM_DEBUG(dbgs() << "The base pointer is not loop invariant.\n");
+return std::make_unique(Src, Dst,
+SCEVUnionPredicate(Assume, *SE));
+  }
+
   uint64_t EltSize = SrcLoc.Size.toRaw();
   const SCEV *SrcEv = SE->getMinusSCEV(SrcSCEV, SrcBase);
   const SCEV *DstEv = SE->getMinusSCEV(DstSCEV, DstBase);
diff --git a/llvm/test/Analysis/DependenceAnalysis/FlipFlopBaseAddress.ll 
b/llvm/test/Analysis/DependenceAnalysis/FlipFlopBaseAddress.ll
index 3e3426afab0f7..52cab0f77e73e 100644
--- a/llvm/test/Analysis/DependenceAnalysis/FlipFlopBaseAddress.ll
+++ b/llvm/test/Analysis/DependenceAnalysis/FlipFlopBaseAddress.ll
@@ -8,11 +8,11 @@
 define float @bug41488_test1(float %f) {
 ; CHECK-LABEL: 'bug41488_test1'
 ; CHECK-NEXT:  Src: %0 = load float, ptr %p, align 4 --> Dst: %0 = load float, 
ptr %p, align 4
-; CHECK-NEXT:da analyze - input [*]!
+; CHECK-NEXT:da analyze - confused!
 ; CHECK-NEXT:  Src: %0 = load float, ptr %p, align 4 --> Dst: store float %f, 
ptr %q, align 4
 ; CHECK-NEXT:da analyze - confused!
 ; CHECK-NEXT:  Src: store float %f, ptr %q, align 4 --> Dst: store float %f, 
ptr %q, align 4
-; CHECK-NEXT:da analyze - output [*]!
+; CHECK-NEXT:da analyze - confused!
 ;
 entry:
   %g = alloca float, align 4
@@ -34,11 +34,11 @@ for.cond.cleanup:
 define void @bug41488_test2(i32 %n) {
 ; CHECK-LABEL: 'bug41488_test2'
 ; CHECK-NEXT:  Src: %0 = load float, ptr %p, align 4 --> Dst: %0 = load float, 
ptr %p, align 4
-; CHECK-NEXT:da analyze - input [*]!
+; CHECK-NEXT:da analyze - confused!
 ; CHECK-NEXT:  Src: %0 = load float, ptr %p, align 4 --> Dst: store float 
0.00e+00, ptr %q, align 4
 ; CHECK-NEXT:da analyze - confused!
 ; CHECK-NEXT:  Src: store float 0.00e+00, ptr %q, align 4 --> Dst: store 
float 0.00e+00, ptr %q, align 4
-; CHECK-NEXT:da analyze - output [*]!
+; CHECK-NEXT:da analyze - confused!
 ;
 entry:
   %g = alloca float, align 4
@@ -68,7 +68,7 @@ define void @bug53942_foo(i32 noundef %n, ptr noalias 
nocapture noundef writeonl
 ; CHECK-NEXT:  Src: %.pre = load double, ptr %B, align 8 --> Dst: store double 
%.pre, ptr %arrayidx2, align 8
 ; CHECK-NEXT:da analyze - confused!
 ; CHECK-NEXT:  Src: store double %.pre, ptr %arrayidx2, align 8 --> Dst: store 
double %.pre, ptr %arrayidx2, align 8
-; CHECK-NEXT:da analyze - output [*]!
+; CHECK-NEXT:da analyze - confused!
 ;
 entry:
   %cmp8 = icmp sgt i32 %n, 1
@@ -99,11 +99,11 @@ for.body: ; preds = 
%for.body.preheader,
 define void @bug53942_bar(i32 noundef %n, ptr noalias noundef %A, ptr noalias 
noundef %B) {
 ; CHECK-LABEL: 'bug53942_bar'
 ; CHECK-NEXT:  Src: %0 = load double, ptr %arrayidx, align 8 --> Dst: %0 = 
load double, ptr %arrayidx, align 8
-; CHECK-NEXT:da analyze - input [*]!
+; CHECK-NEXT:da analyze - confused!
 ; CHECK-NEXT:  Src: %0 = load double, ptr %arrayidx, align 8 --> Dst: store 
double %0, ptr %arrayidx8, align 8
 ; CHECK-NEXT:da analyze - confused!
 ; CHECK-NEXT:  Src: store double %0, ptr %arrayidx8, align 8 --> Dst: store 
double %0, ptr %arrayidx8, align 8
-; CHECK-NEXT:da analyze - output [*]!
+; CHECK-NEXT:da analyze - confused!
 ;
 entry:
   br label %for.cond
@@ -173,7 +173,7 @@ for.end:   

[llvm-branch-commits] [llvm] [DA] Add check for base pointer invariance (PR #148241)

2025-07-11 Thread Ryotaro Kasuga via llvm-branch-commits

https://github.com/kasuga-fj edited 
https://github.com/llvm/llvm-project/pull/148241
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [DA] Add check for base pointer invariance (PR #148241)

2025-07-11 Thread Ryotaro Kasuga via llvm-branch-commits

https://github.com/kasuga-fj edited 
https://github.com/llvm/llvm-project/pull/148241
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [DA] Add check for base pointer invariance (PR #148241)

2025-07-11 Thread Ryotaro Kasuga via llvm-branch-commits

https://github.com/kasuga-fj updated 
https://github.com/llvm/llvm-project/pull/148241

>From d914ea5f3c44387570cab65ce9a507ebf429f827 Mon Sep 17 00:00:00 2001
From: Ryotaro Kasuga 
Date: Fri, 11 Jul 2025 13:10:44 +
Subject: [PATCH 1/2] [DA] Add check for base pointer invariance

---
 llvm/lib/Analysis/DependenceAnalysis.cpp   | 16 
 .../DependenceAnalysis/FlipFlopBaseAddress.ll  | 18 +-
 2 files changed, 25 insertions(+), 9 deletions(-)

diff --git a/llvm/lib/Analysis/DependenceAnalysis.cpp 
b/llvm/lib/Analysis/DependenceAnalysis.cpp
index 428342f51ad2e..07bf560772c8c 100644
--- a/llvm/lib/Analysis/DependenceAnalysis.cpp
+++ b/llvm/lib/Analysis/DependenceAnalysis.cpp
@@ -3383,6 +3383,10 @@ bool DependenceInfo::tryDelinearize(Instruction *Src, 
Instruction *Dst,
 SrcSubscripts, DstSubscripts))
 return false;
 
+  assert(isLoopInvariant(SrcBase, SrcLoop) &&
+ isLoopInvariant(DstBase, DstLoop) &&
+ "Expected SrcBase and DstBase to be loop invariant");
+
   int Size = SrcSubscripts.size();
   LLVM_DEBUG({
 dbgs() << "\nSrcSubscripts: ";
@@ -3666,6 +3670,18 @@ DependenceInfo::depends(Instruction *Src, Instruction 
*Dst,
 SCEVUnionPredicate(Assume, *SE));
   }
 
+  // Even if the base pointers are the same, they may not be loop-invariant. It
+  // could lead to incorrect results, as we're analyzing loop-carried
+  // dependencies.
+  Loop *SrcLoop = LI->getLoopFor(Src->getParent());
+  Loop *DstLoop = LI->getLoopFor(Dst->getParent());
+  if (!isLoopInvariant(SrcBase, SrcLoop) ||
+  !isLoopInvariant(DstBase, DstLoop)) {
+LLVM_DEBUG(dbgs() << "The base pointer is not loop invariant.\n");
+return std::make_unique(Src, Dst,
+SCEVUnionPredicate(Assume, *SE));
+  }
+
   uint64_t EltSize = SrcLoc.Size.toRaw();
   const SCEV *SrcEv = SE->getMinusSCEV(SrcSCEV, SrcBase);
   const SCEV *DstEv = SE->getMinusSCEV(DstSCEV, DstBase);
diff --git a/llvm/test/Analysis/DependenceAnalysis/FlipFlopBaseAddress.ll 
b/llvm/test/Analysis/DependenceAnalysis/FlipFlopBaseAddress.ll
index 3e3426afab0f7..52cab0f77e73e 100644
--- a/llvm/test/Analysis/DependenceAnalysis/FlipFlopBaseAddress.ll
+++ b/llvm/test/Analysis/DependenceAnalysis/FlipFlopBaseAddress.ll
@@ -8,11 +8,11 @@
 define float @bug41488_test1(float %f) {
 ; CHECK-LABEL: 'bug41488_test1'
 ; CHECK-NEXT:  Src: %0 = load float, ptr %p, align 4 --> Dst: %0 = load float, 
ptr %p, align 4
-; CHECK-NEXT:da analyze - input [*]!
+; CHECK-NEXT:da analyze - confused!
 ; CHECK-NEXT:  Src: %0 = load float, ptr %p, align 4 --> Dst: store float %f, 
ptr %q, align 4
 ; CHECK-NEXT:da analyze - confused!
 ; CHECK-NEXT:  Src: store float %f, ptr %q, align 4 --> Dst: store float %f, 
ptr %q, align 4
-; CHECK-NEXT:da analyze - output [*]!
+; CHECK-NEXT:da analyze - confused!
 ;
 entry:
   %g = alloca float, align 4
@@ -34,11 +34,11 @@ for.cond.cleanup:
 define void @bug41488_test2(i32 %n) {
 ; CHECK-LABEL: 'bug41488_test2'
 ; CHECK-NEXT:  Src: %0 = load float, ptr %p, align 4 --> Dst: %0 = load float, 
ptr %p, align 4
-; CHECK-NEXT:da analyze - input [*]!
+; CHECK-NEXT:da analyze - confused!
 ; CHECK-NEXT:  Src: %0 = load float, ptr %p, align 4 --> Dst: store float 
0.00e+00, ptr %q, align 4
 ; CHECK-NEXT:da analyze - confused!
 ; CHECK-NEXT:  Src: store float 0.00e+00, ptr %q, align 4 --> Dst: store 
float 0.00e+00, ptr %q, align 4
-; CHECK-NEXT:da analyze - output [*]!
+; CHECK-NEXT:da analyze - confused!
 ;
 entry:
   %g = alloca float, align 4
@@ -68,7 +68,7 @@ define void @bug53942_foo(i32 noundef %n, ptr noalias 
nocapture noundef writeonl
 ; CHECK-NEXT:  Src: %.pre = load double, ptr %B, align 8 --> Dst: store double 
%.pre, ptr %arrayidx2, align 8
 ; CHECK-NEXT:da analyze - confused!
 ; CHECK-NEXT:  Src: store double %.pre, ptr %arrayidx2, align 8 --> Dst: store 
double %.pre, ptr %arrayidx2, align 8
-; CHECK-NEXT:da analyze - output [*]!
+; CHECK-NEXT:da analyze - confused!
 ;
 entry:
   %cmp8 = icmp sgt i32 %n, 1
@@ -99,11 +99,11 @@ for.body: ; preds = 
%for.body.preheader,
 define void @bug53942_bar(i32 noundef %n, ptr noalias noundef %A, ptr noalias 
noundef %B) {
 ; CHECK-LABEL: 'bug53942_bar'
 ; CHECK-NEXT:  Src: %0 = load double, ptr %arrayidx, align 8 --> Dst: %0 = 
load double, ptr %arrayidx, align 8
-; CHECK-NEXT:da analyze - input [*]!
+; CHECK-NEXT:da analyze - confused!
 ; CHECK-NEXT:  Src: %0 = load double, ptr %arrayidx, align 8 --> Dst: store 
double %0, ptr %arrayidx8, align 8
 ; CHECK-NEXT:da analyze - confused!
 ; CHECK-NEXT:  Src: store double %0, ptr %arrayidx8, align 8 --> Dst: store 
double %0, ptr %arrayidx8, align 8
-; CHECK-NEXT:da analyze - output [*]!
+; CHECK-NEXT:da analyze - confused!
 ;
 entry:
   br label %for.cond
@@ -173,7 +173,7 @@ for.end:   

[llvm-branch-commits] [llvm] [DA] Add check for base pointer invariance (PR #148241)

2025-07-11 Thread Ryotaro Kasuga via llvm-branch-commits

https://github.com/kasuga-fj updated 
https://github.com/llvm/llvm-project/pull/148241

>From d914ea5f3c44387570cab65ce9a507ebf429f827 Mon Sep 17 00:00:00 2001
From: Ryotaro Kasuga 
Date: Fri, 11 Jul 2025 13:10:44 +
Subject: [PATCH 1/3] [DA] Add check for base pointer invariance

---
 llvm/lib/Analysis/DependenceAnalysis.cpp   | 16 
 .../DependenceAnalysis/FlipFlopBaseAddress.ll  | 18 +-
 2 files changed, 25 insertions(+), 9 deletions(-)

diff --git a/llvm/lib/Analysis/DependenceAnalysis.cpp 
b/llvm/lib/Analysis/DependenceAnalysis.cpp
index 428342f51ad2e..07bf560772c8c 100644
--- a/llvm/lib/Analysis/DependenceAnalysis.cpp
+++ b/llvm/lib/Analysis/DependenceAnalysis.cpp
@@ -3383,6 +3383,10 @@ bool DependenceInfo::tryDelinearize(Instruction *Src, 
Instruction *Dst,
 SrcSubscripts, DstSubscripts))
 return false;
 
+  assert(isLoopInvariant(SrcBase, SrcLoop) &&
+ isLoopInvariant(DstBase, DstLoop) &&
+ "Expected SrcBase and DstBase to be loop invariant");
+
   int Size = SrcSubscripts.size();
   LLVM_DEBUG({
 dbgs() << "\nSrcSubscripts: ";
@@ -3666,6 +3670,18 @@ DependenceInfo::depends(Instruction *Src, Instruction 
*Dst,
 SCEVUnionPredicate(Assume, *SE));
   }
 
+  // Even if the base pointers are the same, they may not be loop-invariant. It
+  // could lead to incorrect results, as we're analyzing loop-carried
+  // dependencies.
+  Loop *SrcLoop = LI->getLoopFor(Src->getParent());
+  Loop *DstLoop = LI->getLoopFor(Dst->getParent());
+  if (!isLoopInvariant(SrcBase, SrcLoop) ||
+  !isLoopInvariant(DstBase, DstLoop)) {
+LLVM_DEBUG(dbgs() << "The base pointer is not loop invariant.\n");
+return std::make_unique(Src, Dst,
+SCEVUnionPredicate(Assume, *SE));
+  }
+
   uint64_t EltSize = SrcLoc.Size.toRaw();
   const SCEV *SrcEv = SE->getMinusSCEV(SrcSCEV, SrcBase);
   const SCEV *DstEv = SE->getMinusSCEV(DstSCEV, DstBase);
diff --git a/llvm/test/Analysis/DependenceAnalysis/FlipFlopBaseAddress.ll 
b/llvm/test/Analysis/DependenceAnalysis/FlipFlopBaseAddress.ll
index 3e3426afab0f7..52cab0f77e73e 100644
--- a/llvm/test/Analysis/DependenceAnalysis/FlipFlopBaseAddress.ll
+++ b/llvm/test/Analysis/DependenceAnalysis/FlipFlopBaseAddress.ll
@@ -8,11 +8,11 @@
 define float @bug41488_test1(float %f) {
 ; CHECK-LABEL: 'bug41488_test1'
 ; CHECK-NEXT:  Src: %0 = load float, ptr %p, align 4 --> Dst: %0 = load float, 
ptr %p, align 4
-; CHECK-NEXT:da analyze - input [*]!
+; CHECK-NEXT:da analyze - confused!
 ; CHECK-NEXT:  Src: %0 = load float, ptr %p, align 4 --> Dst: store float %f, 
ptr %q, align 4
 ; CHECK-NEXT:da analyze - confused!
 ; CHECK-NEXT:  Src: store float %f, ptr %q, align 4 --> Dst: store float %f, 
ptr %q, align 4
-; CHECK-NEXT:da analyze - output [*]!
+; CHECK-NEXT:da analyze - confused!
 ;
 entry:
   %g = alloca float, align 4
@@ -34,11 +34,11 @@ for.cond.cleanup:
 define void @bug41488_test2(i32 %n) {
 ; CHECK-LABEL: 'bug41488_test2'
 ; CHECK-NEXT:  Src: %0 = load float, ptr %p, align 4 --> Dst: %0 = load float, 
ptr %p, align 4
-; CHECK-NEXT:da analyze - input [*]!
+; CHECK-NEXT:da analyze - confused!
 ; CHECK-NEXT:  Src: %0 = load float, ptr %p, align 4 --> Dst: store float 
0.00e+00, ptr %q, align 4
 ; CHECK-NEXT:da analyze - confused!
 ; CHECK-NEXT:  Src: store float 0.00e+00, ptr %q, align 4 --> Dst: store 
float 0.00e+00, ptr %q, align 4
-; CHECK-NEXT:da analyze - output [*]!
+; CHECK-NEXT:da analyze - confused!
 ;
 entry:
   %g = alloca float, align 4
@@ -68,7 +68,7 @@ define void @bug53942_foo(i32 noundef %n, ptr noalias 
nocapture noundef writeonl
 ; CHECK-NEXT:  Src: %.pre = load double, ptr %B, align 8 --> Dst: store double 
%.pre, ptr %arrayidx2, align 8
 ; CHECK-NEXT:da analyze - confused!
 ; CHECK-NEXT:  Src: store double %.pre, ptr %arrayidx2, align 8 --> Dst: store 
double %.pre, ptr %arrayidx2, align 8
-; CHECK-NEXT:da analyze - output [*]!
+; CHECK-NEXT:da analyze - confused!
 ;
 entry:
   %cmp8 = icmp sgt i32 %n, 1
@@ -99,11 +99,11 @@ for.body: ; preds = 
%for.body.preheader,
 define void @bug53942_bar(i32 noundef %n, ptr noalias noundef %A, ptr noalias 
noundef %B) {
 ; CHECK-LABEL: 'bug53942_bar'
 ; CHECK-NEXT:  Src: %0 = load double, ptr %arrayidx, align 8 --> Dst: %0 = 
load double, ptr %arrayidx, align 8
-; CHECK-NEXT:da analyze - input [*]!
+; CHECK-NEXT:da analyze - confused!
 ; CHECK-NEXT:  Src: %0 = load double, ptr %arrayidx, align 8 --> Dst: store 
double %0, ptr %arrayidx8, align 8
 ; CHECK-NEXT:da analyze - confused!
 ; CHECK-NEXT:  Src: store double %0, ptr %arrayidx8, align 8 --> Dst: store 
double %0, ptr %arrayidx8, align 8
-; CHECK-NEXT:da analyze - output [*]!
+; CHECK-NEXT:da analyze - confused!
 ;
 entry:
   br label %for.cond
@@ -173,7 +173,7 @@ for.end:   

[llvm-branch-commits] [llvm] [DA] Add check for base pointer invariance (PR #148241)

2025-07-11 Thread Ryotaro Kasuga via llvm-branch-commits

https://github.com/kasuga-fj converted_to_draft 
https://github.com/llvm/llvm-project/pull/148241
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits