[llvm-branch-commits] [clang-tools-extra] [clang] Backport '[clang] static operators should evaluate object argument (reland)' to release/18.x (PR #80109)

2024-01-31 Thread Tianlan Zhou via llvm-branch-commits

https://github.com/SuperSodaSea edited 
https://github.com/llvm/llvm-project/pull/80109
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [mlir] 2373491 - Revert "[mlir][memref] `memref.subview`: Verify result strides"

2024-01-31 Thread via llvm-branch-commits

Author: Matthias Springer
Date: 2024-01-31T09:32:54+01:00
New Revision: 23734917da16bdfbadf4d75e657f7ff6deabd618

URL: 
https://github.com/llvm/llvm-project/commit/23734917da16bdfbadf4d75e657f7ff6deabd618
DIFF: 
https://github.com/llvm/llvm-project/commit/23734917da16bdfbadf4d75e657f7ff6deabd618.diff

LOG: Revert "[mlir][memref] `memref.subview`: Verify result strides"

Added: 


Modified: 
mlir/lib/Dialect/MemRef/IR/MemRefOps.cpp
mlir/test/Dialect/GPU/decompose-memrefs.mlir
mlir/test/Dialect/MemRef/fold-memref-alias-ops.mlir
mlir/test/Dialect/MemRef/invalid.mlir
mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_rewrite_sort_coo.mlir

Removed: 




diff  --git a/mlir/lib/Dialect/MemRef/IR/MemRefOps.cpp 
b/mlir/lib/Dialect/MemRef/IR/MemRefOps.cpp
index f43217f6f27ae..8b5765b7f8dba 100644
--- a/mlir/lib/Dialect/MemRef/IR/MemRefOps.cpp
+++ b/mlir/lib/Dialect/MemRef/IR/MemRefOps.cpp
@@ -917,7 +917,7 @@ static std::map 
getNumOccurences(ArrayRef vals) {
 /// This accounts for cases where there are multiple unit-dims, but only a
 /// subset of those are dropped. For MemRefTypes these can be disambiguated
 /// using the strides. If a dimension is dropped the stride must be dropped 
too.
-static FailureOr
+static std::optional
 computeMemRefRankReductionMask(MemRefType originalType, MemRefType reducedType,
ArrayRef sizes) {
   llvm::SmallBitVector unusedDims(originalType.getRank());
@@ -941,7 +941,7 @@ computeMemRefRankReductionMask(MemRefType originalType, 
MemRefType reducedType,
   getStridesAndOffset(originalType, originalStrides, originalOffset)) 
||
   failed(
   getStridesAndOffset(reducedType, candidateStrides, candidateOffset)))
-return failure();
+return std::nullopt;
 
   // For memrefs, a dimension is truly dropped if its corresponding stride is
   // also dropped. This is particularly important when more than one of the 
dims
@@ -976,22 +976,22 @@ computeMemRefRankReductionMask(MemRefType originalType, 
MemRefType reducedType,
 candidateStridesNumOccurences[originalStride]) {
   // This should never happen. Can't have a stride in the reduced rank type
   // that wasn't in the original one.
-  return failure();
+  return std::nullopt;
 }
   }
 
   if ((int64_t)unusedDims.count() + reducedType.getRank() !=
   originalType.getRank())
-return failure();
+return std::nullopt;
   return unusedDims;
 }
 
 llvm::SmallBitVector SubViewOp::getDroppedDims() {
   MemRefType sourceType = getSourceType();
   MemRefType resultType = getType();
-  FailureOr unusedDims =
+  std::optional unusedDims =
   computeMemRefRankReductionMask(sourceType, resultType, getMixedSizes());
-  assert(succeeded(unusedDims) && "unable to find unused dims of subview");
+  assert(unusedDims && "unable to find unused dims of subview");
   return *unusedDims;
 }
 
@@ -2745,7 +2745,7 @@ void SubViewOp::build(OpBuilder &b, OperationState 
&result, Value source,
 /// For ViewLikeOpInterface.
 Value SubViewOp::getViewSource() { return getSource(); }
 
-/// Return true if `t1` and `t2` have equal offsets (both dynamic or of same
+/// Return true if t1 and t2 have equal offsets (both dynamic or of same
 /// static value).
 static bool haveCompatibleOffsets(MemRefType t1, MemRefType t2) {
   int64_t t1Offset, t2Offset;
@@ -2755,41 +2755,56 @@ static bool haveCompatibleOffsets(MemRefType t1, 
MemRefType t2) {
   return succeeded(res1) && succeeded(res2) && t1Offset == t2Offset;
 }
 
-/// Return true if `t1` and `t2` have equal strides (both dynamic or of same
-/// static value).
-static bool haveCompatibleStrides(MemRefType t1, MemRefType t2) {
-  int64_t t1Offset, t2Offset;
-  SmallVector t1Strides, t2Strides;
-  auto res1 = getStridesAndOffset(t1, t1Strides, t1Offset);
-  auto res2 = getStridesAndOffset(t2, t2Strides, t2Offset);
-  if (failed(res1) || failed(res2))
-return false;
-  for (auto [s1, s2] : llvm::zip_equal(t1Strides, t2Strides))
-if (s1 != s2)
-  return false;
-  return true;
+/// Checks if `original` Type type can be rank reduced to `reduced` type.
+/// This function is slight variant of `is subsequence` algorithm where
+/// not matching dimension must be 1.
+static SliceVerificationResult
+isRankReducedMemRefType(MemRefType originalType,
+MemRefType candidateRankReducedType,
+ArrayRef sizes) {
+  auto partialRes = isRankReducedType(originalType, candidateRankReducedType);
+  if (partialRes != SliceVerificationResult::Success)
+return partialRes;
+
+  auto optionalUnusedDimsMask = computeMemRefRankReductionMask(
+  originalType, candidateRankReducedType, sizes);
+
+  // Sizes cannot be matched in case empty vector is returned.
+  if (!optionalUnusedDimsMask)
+return SliceVerificationResult::LayoutMismatch;
+
+  if (originalType.getMemorySpace() !=
+ 

[llvm-branch-commits] [clang] PR for llvm/llvm-project#79568 (PR #80120)

2024-01-31 Thread Younan Zhang via llvm-branch-commits

https://github.com/zyn0217 milestoned 
https://github.com/llvm/llvm-project/pull/80120
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] PR for llvm/llvm-project#79568 (PR #80120)

2024-01-31 Thread Younan Zhang via llvm-branch-commits

https://github.com/zyn0217 created 
https://github.com/llvm/llvm-project/pull/80120

Backporting https://github.com/llvm/llvm-project/pull/79568 to clang 18.

>From 16a8c31d342abed9c02493445c14c1bc7fd1519e Mon Sep 17 00:00:00 2001
From: Younan Zhang 
Date: Sat, 27 Jan 2024 15:42:52 +0800
Subject: [PATCH] [Concepts] Traverse the instantiation chain for parameter
 injection inside a constraint scope (#79568)

We preserve the trailing requires-expression during the lambda
expression transformation. In order to get the parameters referenced
inside a requires-expression properly resolved to the instantiated
decls, we intended to inject these 'original' `ParmVarDecl`s into the
current instantiation scope, at `Sema::SetupConstraintScope`.

The previous approach overlooked nested instantiation chains, leading
to a crash within a nested lambda followed by a requires clause.

This fixes https://github.com/llvm/llvm-project/issues/73418.
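
As a rough illustration (not taken from the patch; names and exact shape are
assumptions distilled from the GH73418 test added below), a minimal C++20
sketch of the pattern that used to crash: a nested generic lambda whose
trailing requires-clause must resolve parameters through more than one level
of instantiation.

```cpp
// Hedged reproducer sketch; assumes -std=c++20.
void repro() {
  [](auto x) {
    // The inner generic lambda has a trailing requires-clause that refers to
    // its own parameter; resolving it needs parameters from every enclosing
    // instantiation, not just the immediately enclosing one.
    return [](auto y)
             requires requires { sizeof(y); }
           { return 0; }(x);
  }(1);
}
```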
---
 clang/docs/ReleaseNotes.rst |  4 
 clang/lib/Sema/SemaConcept.cpp  |  8 ++--
 clang/test/SemaTemplate/concepts-lambda.cpp | 18 ++
 3 files changed, 28 insertions(+), 2 deletions(-)

diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index 060bc7669b72a..da52d5ac4d3c6 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -1054,6 +1054,10 @@ Bug Fixes to C++ Support
   Fixes (`#78830 `_)
   Fixes (`#60085 `_)
 
+- Fixed a bug where variables referenced by requires-clauses inside
+  nested generic lambdas were not properly injected into the constraint scope.
+  (`#73418 `_)
+
 Bug Fixes to AST Handling
 ^
 - Fixed an import failure of recursive friend class template.
diff --git a/clang/lib/Sema/SemaConcept.cpp b/clang/lib/Sema/SemaConcept.cpp
index acfc00f412540..88fc846c89e42 100644
--- a/clang/lib/Sema/SemaConcept.cpp
+++ b/clang/lib/Sema/SemaConcept.cpp
@@ -612,8 +612,12 @@ bool Sema::SetupConstraintScope(
 
 // If this is a member function, make sure we get the parameters that
 // reference the original primary template.
-if (const auto *FromMemTempl =
-PrimaryTemplate->getInstantiatedFromMemberTemplate()) {
+// We walk up the instantiated template chain so that nested lambdas get
+// handled properly.
+for (FunctionTemplateDecl *FromMemTempl =
+ PrimaryTemplate->getInstantiatedFromMemberTemplate();
+ FromMemTempl;
+ FromMemTempl = FromMemTempl->getInstantiatedFromMemberTemplate()) {
   if (addInstantiatedParametersToScope(FD, 
FromMemTempl->getTemplatedDecl(),
Scope, MLTAL))
 return true;
diff --git a/clang/test/SemaTemplate/concepts-lambda.cpp 
b/clang/test/SemaTemplate/concepts-lambda.cpp
index 7e431529427df..0b7580f91043c 100644
--- a/clang/test/SemaTemplate/concepts-lambda.cpp
+++ b/clang/test/SemaTemplate/concepts-lambda.cpp
@@ -149,3 +149,21 @@ void foo() {
   auto caller = make_caller.operator()<&S1::f1>();
 }
 } // namespace ReturnTypeRequirementInLambda
+
+namespace GH73418 {
+void foo() {
+  int x;
+  [&x](auto) {
+return [](auto y) {
+  return [](auto obj, auto... params)
+requires requires {
+  sizeof...(params);
+  [](auto... pack) {
+return sizeof...(pack);
+  }(params...);
+}
+  { return false; }(y);
+}(x);
+  }(x);
+}
+} // namespace GH73418

___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] PR for llvm/llvm-project#79568 (PR #80120)

2024-01-31 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-clang

Author: Younan Zhang (zyn0217)


Changes

Backporting https://github.com/llvm/llvm-project/pull/79568 to clang 18.

---
Full diff: https://github.com/llvm/llvm-project/pull/80120.diff


3 Files Affected:

- (modified) clang/docs/ReleaseNotes.rst (+4) 
- (modified) clang/lib/Sema/SemaConcept.cpp (+6-2) 
- (modified) clang/test/SemaTemplate/concepts-lambda.cpp (+18) 


```diff
diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index 060bc7669b72a..da52d5ac4d3c6 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -1054,6 +1054,10 @@ Bug Fixes to C++ Support
   Fixes (`#78830 `_)
   Fixes (`#60085 `_)
 
+- Fixed a bug where variables referenced by requires-clauses inside
+  nested generic lambdas were not properly injected into the constraint scope.
+  (`#73418 `_)
+
 Bug Fixes to AST Handling
 ^
 - Fixed an import failure of recursive friend class template.
diff --git a/clang/lib/Sema/SemaConcept.cpp b/clang/lib/Sema/SemaConcept.cpp
index acfc00f412540..88fc846c89e42 100644
--- a/clang/lib/Sema/SemaConcept.cpp
+++ b/clang/lib/Sema/SemaConcept.cpp
@@ -612,8 +612,12 @@ bool Sema::SetupConstraintScope(
 
 // If this is a member function, make sure we get the parameters that
 // reference the original primary template.
-if (const auto *FromMemTempl =
-PrimaryTemplate->getInstantiatedFromMemberTemplate()) {
+// We walk up the instantiated template chain so that nested lambdas get
+// handled properly.
+for (FunctionTemplateDecl *FromMemTempl =
+ PrimaryTemplate->getInstantiatedFromMemberTemplate();
+ FromMemTempl;
+ FromMemTempl = FromMemTempl->getInstantiatedFromMemberTemplate()) {
   if (addInstantiatedParametersToScope(FD, 
FromMemTempl->getTemplatedDecl(),
Scope, MLTAL))
 return true;
diff --git a/clang/test/SemaTemplate/concepts-lambda.cpp 
b/clang/test/SemaTemplate/concepts-lambda.cpp
index 7e431529427df..0b7580f91043c 100644
--- a/clang/test/SemaTemplate/concepts-lambda.cpp
+++ b/clang/test/SemaTemplate/concepts-lambda.cpp
@@ -149,3 +149,21 @@ void foo() {
   auto caller = make_caller.operator()<&S1::f1>();
 }
 } // namespace ReturnTypeRequirementInLambda
+
+namespace GH73418 {
+void foo() {
+  int x;
+  [&x](auto) {
+return [](auto y) {
+  return [](auto obj, auto... params)
+requires requires {
+  sizeof...(params);
+  [](auto... pack) {
+return sizeof...(pack);
+  }(params...);
+}
+  { return false; }(y);
+}(x);
+  }(x);
+}
+} // namespace GH73418

```




https://github.com/llvm/llvm-project/pull/80120
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] [llvm] [lld] [docs] Add release notes for Windows specific changes in 18.x (PR #80011)

2024-01-31 Thread David Spickett via llvm-branch-commits


https://github.com/DavidSpickett approved this pull request.

LGTM

https://github.com/llvm/llvm-project/pull/80011
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] PR for llvm/llvm-project#79992 (PR #79997)

2024-01-31 Thread via llvm-branch-commits

https://github.com/cor3ntin approved this pull request.


https://github.com/llvm/llvm-project/pull/79997
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [libcxx] [libc] [flang] [llvm] [clang] [compiler-rt] [RISCV] Support select optimization (PR #80124)

2024-01-31 Thread Wang Pengcheng via llvm-branch-commits

https://github.com/wangpc-pp created 
https://github.com/llvm/llvm-project/pull/80124

AArch64 enabled this in https://reviews.llvm.org/D138990, and
the measurement data still holds for RISC-V.

A similar optimization to #77284 is also added.



___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [libcxx] [libc] [flang] [llvm] [clang] [compiler-rt] [RISCV] Support select optimization (PR #80124)

2024-01-31 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-backend-risc-v

Author: Wang Pengcheng (wangpc-pp)


Changes

AArch64 enabled this in https://reviews.llvm.org/D138990, and
the measurement data still holds for RISC-V.

A similar optimization to #77284 is also added.


---

Patch is 59.85 KiB, truncated to 20.00 KiB below, full version: 
https://github.com/llvm/llvm-project/pull/80124.diff


7 Files Affected:

- (modified) llvm/lib/Target/RISCV/RISCVFeatures.td (+8) 
- (modified) llvm/lib/Target/RISCV/RISCVISelLowering.cpp (+2) 
- (modified) llvm/lib/Target/RISCV/RISCVTargetMachine.cpp (+8) 
- (modified) llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp (+15) 
- (modified) llvm/lib/Target/RISCV/RISCVTargetTransformInfo.h (+3) 
- (modified) llvm/test/CodeGen/RISCV/O3-pipeline.ll (+9) 
- (added) llvm/test/CodeGen/RISCV/selectopt.ll (+873) 


```diff
diff --git a/llvm/lib/Target/RISCV/RISCVFeatures.td 
b/llvm/lib/Target/RISCV/RISCVFeatures.td
index 58bf5e8fdefbd..a1600a48900cd 100644
--- a/llvm/lib/Target/RISCV/RISCVFeatures.td
+++ b/llvm/lib/Target/RISCV/RISCVFeatures.td
@@ -1046,6 +1046,14 @@ def FeatureFastUnalignedAccess
 def FeaturePostRAScheduler : SubtargetFeature<"use-postra-scheduler",
 "UsePostRAScheduler", "true", "Schedule again after register allocation">;
 
+def FeaturePredictableSelectIsExpensive
+  : SubtargetFeature<"predictable-select-expensive", 
"PredictableSelectIsExpensive",
+ "true", "Prefer likely predicted branches over selects">;
+
+def FeatureEnableSelectOptimize
+  : SubtargetFeature<"enable-select-opt", "EnableSelectOptimize", "true",
+"Enable the select optimize pass for select loop 
heuristics">;
+
 def TuneNoOptimizedZeroStrideLoad
: SubtargetFeature<"no-optimized-zero-stride-load", 
"HasOptimizedZeroStrideLoad",
   "false", "Hasn't optimized (perform fewer memory 
operations)"
diff --git a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp 
b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
index 82836346d8832..02fa067c59094 100644
--- a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
+++ b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
@@ -1374,6 +1374,8 @@ RISCVTargetLowering::RISCVTargetLowering(const 
TargetMachine &TM,
   setPrefFunctionAlignment(Subtarget.getPrefFunctionAlignment());
   setPrefLoopAlignment(Subtarget.getPrefLoopAlignment());
 
+  PredictableSelectIsExpensive = Subtarget.predictableSelectIsExpensive();
+
   setTargetDAGCombine({ISD::INTRINSIC_VOID, ISD::INTRINSIC_W_CHAIN,
ISD::INTRINSIC_WO_CHAIN, ISD::ADD, ISD::SUB, ISD::MUL,
ISD::AND, ISD::OR, ISD::XOR, ISD::SETCC, ISD::SELECT});
diff --git a/llvm/lib/Target/RISCV/RISCVTargetMachine.cpp 
b/llvm/lib/Target/RISCV/RISCVTargetMachine.cpp
index 2285c99d79010..fdf1c023fff87 100644
--- a/llvm/lib/Target/RISCV/RISCVTargetMachine.cpp
+++ b/llvm/lib/Target/RISCV/RISCVTargetMachine.cpp
@@ -101,6 +101,11 @@ static cl::opt EnableMISchedLoadClustering(
 cl::desc("Enable load clustering in the machine scheduler"),
 cl::init(false));
 
+static cl::opt
+EnableSelectOpt("riscv-select-opt", cl::Hidden,
+cl::desc("Enable select to branch optimizations"),
+cl::init(true));
+
 extern "C" LLVM_EXTERNAL_VISIBILITY void LLVMInitializeRISCVTarget() {
   RegisterTargetMachine X(getTheRISCV32Target());
   RegisterTargetMachine Y(getTheRISCV64Target());
@@ -445,6 +450,9 @@ void RISCVPassConfig::addIRPasses() {
 if (EnableLoopDataPrefetch)
   addPass(createLoopDataPrefetchPass());
 
+if (EnableSelectOpt && getOptLevel() == CodeGenOptLevel::Aggressive)
+  addPass(createSelectOptimizePass());
+
 addPass(createRISCVGatherScatterLoweringPass());
 addPass(createInterleavedAccessPass());
 addPass(createRISCVCodeGenPreparePass());
diff --git a/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp 
b/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp
index fe1cdb2dfa423..aad2786623dcb 100644
--- a/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp
+++ b/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp
@@ -34,6 +34,9 @@ static cl::opt SLPMaxVF(
 "exclusively by SLP vectorizer."),
 cl::Hidden);
 
+static cl::opt EnableOrLikeSelectOpt("enable-riscv-or-like-select",
+   cl::init(true), cl::Hidden);
+
 InstructionCost
 RISCVTTIImpl::getRISCVInstructionCost(ArrayRef OpCodes, MVT VT,
   TTI::TargetCostKind CostKind) {
@@ -1594,3 +1597,15 @@ bool RISCVTTIImpl::isLSRCostLess(const 
TargetTransformInfo::LSRCost &C1,
   C2.NumIVMuls, C2.NumBaseAdds,
   C2.ScaleCost, C2.ImmCost, C2.SetupCost);
 }
+
+bool RISCVTTIImpl::shouldTreatInstructionLikeSelect(const Instruction *I) {
+  // For the binary operators (e.g. or) we need to be more careful than
+  // selects, here we only transform them if they are already at a natural
+  // break point in the code - the 

[llvm-branch-commits] [flang] [libc] [libcxx] [clang] [llvm] [compiler-rt] [SelectOpt] Print instruction instead of pointer (PR #80125)

2024-01-31 Thread Wang Pengcheng via llvm-branch-commits

https://github.com/wangpc-pp created 
https://github.com/llvm/llvm-project/pull/80125

None


___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [libcxx] [libc] [flang] [llvm] [clang] [compiler-rt] [SelectOpt] Print instruction instead of pointer (PR #80125)

2024-01-31 Thread David Green via llvm-branch-commits

https://github.com/davemgreen approved this pull request.

Thanks. LGTM

https://github.com/llvm/llvm-project/pull/80125
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [flang] [libc] [compiler-rt] [libcxx] [clang] [llvm] [RISCV] Support select optimization (PR #80124)

2024-01-31 Thread Yingwei Zheng via llvm-branch-commits


@@ -0,0 +1,873 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
+; RUN: opt -select-optimize -mtriple=riscv64 -S < %s \
+; RUN:   | FileCheck %s --check-prefix=CHECK-SELECT
+; RUN: opt -select-optimize -mtriple=riscv64 -mattr=+enable-select-opt -S < %s 
\
+; RUN:   | FileCheck %s --check-prefix=CHECK-BRANCH
+; RUN: opt -select-optimize -mtriple=riscv64 
-mattr=+enable-select-opt,+predictable-select-expensive -S < %s \
+; RUN:   | FileCheck %s --check-prefix=CHECK-BRANCH
+
+%struct.st = type { i32, i64, ptr, ptr, i16, ptr, ptr, i64, i64 }
+
+; This test has a select at the end of if.then, which is better transformed to 
a branch on OoO cores.
+
+define void @replace(ptr nocapture noundef %newst, ptr noundef %t, ptr noundef 
%h, i64 noundef %c, i64 noundef %rc, i64 noundef %ma, i64 noundef %n) {

dtcxzyw wrote:

Could you please reduce the test?


https://github.com/llvm/llvm-project/pull/80124
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [flang] [libc] [libcxx] [compiler-rt] [clang] [RISCV] Support select optimization (PR #80124)

2024-01-31 Thread Wang Pengcheng via llvm-branch-commits


@@ -0,0 +1,873 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
+; RUN: opt -select-optimize -mtriple=riscv64 -S < %s \
+; RUN:   | FileCheck %s --check-prefix=CHECK-SELECT
+; RUN: opt -select-optimize -mtriple=riscv64 -mattr=+enable-select-opt -S < %s 
\
+; RUN:   | FileCheck %s --check-prefix=CHECK-BRANCH
+; RUN: opt -select-optimize -mtriple=riscv64 
-mattr=+enable-select-opt,+predictable-select-expensive -S < %s \
+; RUN:   | FileCheck %s --check-prefix=CHECK-BRANCH
+
+%struct.st = type { i32, i64, ptr, ptr, i16, ptr, ptr, i64, i64 }
+
+; This test has a select at the end of if.then, which is better transformed to 
a branch on OoO cores.
+
+define void @replace(ptr nocapture noundef %newst, ptr noundef %t, ptr noundef 
%h, i64 noundef %c, i64 noundef %rc, i64 noundef %ma, i64 noundef %n) {

wangpc-pp wrote:

This file is copied from AArch64; I don't know if I can reduce it.

https://github.com/llvm/llvm-project/pull/80124
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] PR for llvm/llvm-project#80137 (PR #80138)

2024-01-31 Thread via llvm-branch-commits

https://github.com/llvmbot milestoned 
https://github.com/llvm/llvm-project/pull/80138
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] PR for llvm/llvm-project#80137 (PR #80138)

2024-01-31 Thread via llvm-branch-commits

https://github.com/llvmbot created 
https://github.com/llvm/llvm-project/pull/80138

resolves llvm/llvm-project#80137

>From 48b5718efad2595682081c1cd6fa39fdf8781d66 Mon Sep 17 00:00:00 2001
From: Sander de Smalen 
Date: Wed, 31 Jan 2024 11:38:29 +
Subject: [PATCH] [AArch64][SME] Fix inlining bug introduced in #78703 (#79994)

Calling a `__arm_locally_streaming` function from a function that
is not a streaming-SVE function would lead to incorrect inlining.

The issue didn't surface because the tests were not testing what
they were supposed to test.

(cherry picked from commit 3abf55a68caefd45042c27b73a658c638afbbb8b)
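
To illustrate the scenario (a hedged sketch, not part of the patch; function
names are hypothetical, and it assumes a Clang with SME support, e.g.
`-target aarch64-linux-gnu -march=armv9-a+sme`):

```cpp
// The callee's body runs in streaming mode even though its interface is
// non-streaming, so the inliner must account for the mode change instead of
// folding the body straight into a non-streaming caller.
__arm_locally_streaming int locally_streaming_callee(void) { return 1; }

int normal_caller(void) {
  return locally_streaming_callee();
}
```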
---
 .../AArch64/AArch64TargetTransformInfo.cpp|  17 +-
 .../Inline/AArch64/sme-pstatesm-attrs.ll  | 369 +-
 2 files changed, 195 insertions(+), 191 deletions(-)

diff --git a/llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp 
b/llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
index d611338fc268f..992b11da7 100644
--- a/llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
+++ b/llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
@@ -233,15 +233,20 @@ static bool hasPossibleIncompatibleOps(const Function *F) 
{
 
 bool AArch64TTIImpl::areInlineCompatible(const Function *Caller,
  const Function *Callee) const {
-  SMEAttrs CallerAttrs(*Caller);
-  SMEAttrs CalleeAttrs(*Callee);
+  SMEAttrs CallerAttrs(*Caller), CalleeAttrs(*Callee);
+
+  // When inlining, we should consider the body of the function, not the
+  // interface.
+  if (CalleeAttrs.hasStreamingBody()) {
+CalleeAttrs.set(SMEAttrs::SM_Compatible, false);
+CalleeAttrs.set(SMEAttrs::SM_Enabled, true);
+  }
+
   if (CalleeAttrs.hasNewZABody())
 return false;
 
   if (CallerAttrs.requiresLazySave(CalleeAttrs) ||
-  (CallerAttrs.requiresSMChange(CalleeAttrs) &&
-   (!CallerAttrs.hasStreamingInterfaceOrBody() ||
-!CalleeAttrs.hasStreamingBody( {
+  CallerAttrs.requiresSMChange(CalleeAttrs)) {
 if (hasPossibleIncompatibleOps(Callee))
   return false;
   }
@@ -4062,4 +4067,4 @@ bool 
AArch64TTIImpl::shouldTreatInstructionLikeSelect(const Instruction *I) {
   cast(I->getNextNode())->isUnconditional())
 return true;
   return BaseT::shouldTreatInstructionLikeSelect(I);
-}
\ No newline at end of file
+}
diff --git a/llvm/test/Transforms/Inline/AArch64/sme-pstatesm-attrs.ll 
b/llvm/test/Transforms/Inline/AArch64/sme-pstatesm-attrs.ll
index d6b1f3ef45e76..7723e6c664c3d 100644
--- a/llvm/test/Transforms/Inline/AArch64/sme-pstatesm-attrs.ll
+++ b/llvm/test/Transforms/Inline/AArch64/sme-pstatesm-attrs.ll
@@ -1,71 +1,70 @@
 ; NOTE: Assertions have been autogenerated by utils/update_test_checks.py 
UTC_ARGS: --version 2
 ; RUN: opt < %s -mtriple=aarch64-unknown-linux-gnu -mattr=+sme -S 
-passes=inline | FileCheck %s
 
-declare void @inlined_body() "aarch64_pstate_sm_compatible";
+declare i32 @llvm.vscale.i32()
 
-; Define some functions that will be called by the functions below.
-; These just call a '...body()' function. If we see the call to one of
-; these functions being replaced by '...body()', then we know it has been
-; inlined.
+; Define some functions that merely call llvm.vscale.i32(), which will be 
called
+; by the other functions below. If we see the call to one of these functions
+; being replaced by 'llvm.vscale()', then we know it has been inlined.
 
-define void @normal_callee() {
-; CHECK-LABEL: define void @normal_callee
+define i32 @normal_callee() {
+; CHECK-LABEL: define i32 @normal_callee
 ; CHECK-SAME: () #[[ATTR1:[0-9]+]] {
 ; CHECK-NEXT:  entry:
-; CHECK-NEXT:call void @inlined_body()
-; CHECK-NEXT:ret void
+; CHECK-NEXT:[[RES:%.*]] = call i32 @llvm.vscale.i32()
+; CHECK-NEXT:ret i32 [[RES]]
 ;
 entry:
-  call void @inlined_body()
-  ret void
+  %res = call i32 @llvm.vscale.i32()
+  ret i32 %res
 }
 
-define void @streaming_callee() "aarch64_pstate_sm_enabled" {
-; CHECK-LABEL: define void @streaming_callee
+define i32 @streaming_callee() "aarch64_pstate_sm_enabled" {
+; CHECK-LABEL: define i32 @streaming_callee
 ; CHECK-SAME: () #[[ATTR2:[0-9]+]] {
 ; CHECK-NEXT:  entry:
-; CHECK-NEXT:call void @inlined_body()
-; CHECK-NEXT:ret void
+; CHECK-NEXT:[[RES:%.*]] = call i32 @llvm.vscale.i32()
+; CHECK-NEXT:ret i32 [[RES]]
 ;
 entry:
-  call void @inlined_body()
-  ret void
+  %res = call i32 @llvm.vscale.i32()
+  ret i32 %res
 }
 
-define void @locally_streaming_callee() "aarch64_pstate_sm_body" {
-; CHECK-LABEL: define void @locally_streaming_callee
+define i32 @locally_streaming_callee() "aarch64_pstate_sm_body" {
+; CHECK-LABEL: define i32 @locally_streaming_callee
 ; CHECK-SAME: () #[[ATTR3:[0-9]+]] {
 ; CHECK-NEXT:  entry:
-; CHECK-NEXT:call void @inlined_body()
-; CHECK-NEXT:ret void
+; CHECK-NEXT:[[RES:%.*]] = call i32 @llvm.vscale.i32()
+; CHECK-NEXT:ret i32 [[RES]]
 ;
 entry:
-  call void @inlined_body()
-  ret void
+  %res = call i32 @l

[llvm-branch-commits] [llvm] PR for llvm/llvm-project#80137 (PR #80138)

2024-01-31 Thread via llvm-branch-commits

llvmbot wrote:

@kmclaughlin-arm What do you think about merging this PR to the release branch?

https://github.com/llvm/llvm-project/pull/80138
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] PR for llvm/llvm-project#80137 (PR #80138)

2024-01-31 Thread via llvm-branch-commits

llvmbot wrote:



@llvm/pr-subscribers-backend-aarch64

@llvm/pr-subscribers-llvm-transforms

Author: None (llvmbot)


Changes

resolves llvm/llvm-project#80137

---

Patch is 26.14 KiB, truncated to 20.00 KiB below, full version: 
https://github.com/llvm/llvm-project/pull/80138.diff


2 Files Affected:

- (modified) llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp (+11-6) 
- (modified) llvm/test/Transforms/Inline/AArch64/sme-pstatesm-attrs.ll 
(+184-185) 


```diff
diff --git a/llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp 
b/llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
index d611338fc268f..992b11da7 100644
--- a/llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
+++ b/llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
@@ -233,15 +233,20 @@ static bool hasPossibleIncompatibleOps(const Function *F) 
{
 
 bool AArch64TTIImpl::areInlineCompatible(const Function *Caller,
  const Function *Callee) const {
-  SMEAttrs CallerAttrs(*Caller);
-  SMEAttrs CalleeAttrs(*Callee);
+  SMEAttrs CallerAttrs(*Caller), CalleeAttrs(*Callee);
+
+  // When inlining, we should consider the body of the function, not the
+  // interface.
+  if (CalleeAttrs.hasStreamingBody()) {
+CalleeAttrs.set(SMEAttrs::SM_Compatible, false);
+CalleeAttrs.set(SMEAttrs::SM_Enabled, true);
+  }
+
   if (CalleeAttrs.hasNewZABody())
 return false;
 
   if (CallerAttrs.requiresLazySave(CalleeAttrs) ||
-  (CallerAttrs.requiresSMChange(CalleeAttrs) &&
-   (!CallerAttrs.hasStreamingInterfaceOrBody() ||
-!CalleeAttrs.hasStreamingBody( {
+  CallerAttrs.requiresSMChange(CalleeAttrs)) {
 if (hasPossibleIncompatibleOps(Callee))
   return false;
   }
@@ -4062,4 +4067,4 @@ bool 
AArch64TTIImpl::shouldTreatInstructionLikeSelect(const Instruction *I) {
   cast(I->getNextNode())->isUnconditional())
 return true;
   return BaseT::shouldTreatInstructionLikeSelect(I);
-}
\ No newline at end of file
+}
diff --git a/llvm/test/Transforms/Inline/AArch64/sme-pstatesm-attrs.ll 
b/llvm/test/Transforms/Inline/AArch64/sme-pstatesm-attrs.ll
index d6b1f3ef45e76..7723e6c664c3d 100644
--- a/llvm/test/Transforms/Inline/AArch64/sme-pstatesm-attrs.ll
+++ b/llvm/test/Transforms/Inline/AArch64/sme-pstatesm-attrs.ll
@@ -1,71 +1,70 @@
 ; NOTE: Assertions have been autogenerated by utils/update_test_checks.py 
UTC_ARGS: --version 2
 ; RUN: opt < %s -mtriple=aarch64-unknown-linux-gnu -mattr=+sme -S 
-passes=inline | FileCheck %s
 
-declare void @inlined_body() "aarch64_pstate_sm_compatible";
+declare i32 @llvm.vscale.i32()
 
-; Define some functions that will be called by the functions below.
-; These just call a '...body()' function. If we see the call to one of
-; these functions being replaced by '...body()', then we know it has been
-; inlined.
+; Define some functions that merely call llvm.vscale.i32(), which will be 
called
+; by the other functions below. If we see the call to one of these functions
+; being replaced by 'llvm.vscale()', then we know it has been inlined.
 
-define void @normal_callee() {
-; CHECK-LABEL: define void @normal_callee
+define i32 @normal_callee() {
+; CHECK-LABEL: define i32 @normal_callee
 ; CHECK-SAME: () #[[ATTR1:[0-9]+]] {
 ; CHECK-NEXT:  entry:
-; CHECK-NEXT:call void @inlined_body()
-; CHECK-NEXT:ret void
+; CHECK-NEXT:[[RES:%.*]] = call i32 @llvm.vscale.i32()
+; CHECK-NEXT:ret i32 [[RES]]
 ;
 entry:
-  call void @inlined_body()
-  ret void
+  %res = call i32 @llvm.vscale.i32()
+  ret i32 %res
 }
 
-define void @streaming_callee() "aarch64_pstate_sm_enabled" {
-; CHECK-LABEL: define void @streaming_callee
+define i32 @streaming_callee() "aarch64_pstate_sm_enabled" {
+; CHECK-LABEL: define i32 @streaming_callee
 ; CHECK-SAME: () #[[ATTR2:[0-9]+]] {
 ; CHECK-NEXT:  entry:
-; CHECK-NEXT:call void @inlined_body()
-; CHECK-NEXT:ret void
+; CHECK-NEXT:[[RES:%.*]] = call i32 @llvm.vscale.i32()
+; CHECK-NEXT:ret i32 [[RES]]
 ;
 entry:
-  call void @inlined_body()
-  ret void
+  %res = call i32 @llvm.vscale.i32()
+  ret i32 %res
 }
 
-define void @locally_streaming_callee() "aarch64_pstate_sm_body" {
-; CHECK-LABEL: define void @locally_streaming_callee
+define i32 @locally_streaming_callee() "aarch64_pstate_sm_body" {
+; CHECK-LABEL: define i32 @locally_streaming_callee
 ; CHECK-SAME: () #[[ATTR3:[0-9]+]] {
 ; CHECK-NEXT:  entry:
-; CHECK-NEXT:call void @inlined_body()
-; CHECK-NEXT:ret void
+; CHECK-NEXT:[[RES:%.*]] = call i32 @llvm.vscale.i32()
+; CHECK-NEXT:ret i32 [[RES]]
 ;
 entry:
-  call void @inlined_body()
-  ret void
+  %res = call i32 @llvm.vscale.i32()
+  ret i32 %res
 }
 
-define void @streaming_compatible_callee() "aarch64_pstate_sm_compatible" {
-; CHECK-LABEL: define void @streaming_compatible_callee
+define i32 @streaming_compatible_callee() "aarch64_pstate_sm_compatible" {
+; CHECK-LABEL: define i32 @streaming_compatible_callee
 ; CHECK-SAME: () #[[ATTR0:[

[llvm-branch-commits] [llvm] PR for llvm/llvm-project#80140 (PR #80141)

2024-01-31 Thread via llvm-branch-commits

https://github.com/llvmbot milestoned 
https://github.com/llvm/llvm-project/pull/80141
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] PR for llvm/llvm-project#80140 (PR #80141)

2024-01-31 Thread via llvm-branch-commits

https://github.com/llvmbot created 
https://github.com/llvm/llvm-project/pull/80141

resolves llvm/llvm-project#80140

>From baa6dce938079604d671bbf58ffdb57133fd48bf Mon Sep 17 00:00:00 2001
From: Sander de Smalen 
Date: Wed, 31 Jan 2024 09:04:13 +
Subject: [PATCH] [SME] Stop RA from coalescing COPY instructions that
 transcend beyond smstart/smstop. (#78294)

This patch introduces a 'COALESCER_BARRIER', a pseudo node that expands
to a 'nop' but stops the register allocator from coalescing a COPY node
when its use/def crosses an SMSTART or SMSTOP instruction.

For example:

%0:fpr64 = COPY killed $d0
undef %2.dsub:zpr = COPY %0   // <- Do not coalesce this COPY
ADJCALLSTACKDOWN 0, 0
MSRpstatesvcrImm1 1, 0, csr_aarch64_smstartstop, implicit-def dead $d0
$d0 = COPY killed %0
BL @use_f64, csr_aarch64_aapcs

If the COPY would be coalesced, that would lead to:

$d0 = COPY killed %0

being replaced by:

$d0 = COPY killed %2.dsub

which means the whole ZPR reg would be live upto the call, causing the
MSRpstatesvcrImm1 (smstop) to spill/reload the ZPR register:

str q0, [sp]   // 16-byte Folded Spill
smstop  sm
ldr z0, [sp]   // 16-byte Folded Reload
bl  use_f64

which would be incorrect for two reasons:
1. The program may load more data than it has allocated.
2. If there are other SVE objects on the stack, the compiler might use
the
   'mul vl' addressing modes to access the spill location.

By disabling the coalescing, we get the desired results:

str d0, [sp, #8]  // 8-byte Folded Spill
smstop  sm
ldr d0, [sp, #8]  // 8-byte Folded Reload
bl  use_f64

(cherry picked from commit dd736661826e215ac70ff3a4a4ccd75bda0c5ccd)
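
At the source level, the kind of call that exercises this looks roughly like
the following (a hedged sketch with hypothetical names; it assumes the ACLE
SME keyword attributes and an SME-enabled Clang):

```cpp
void use_f64(double); // non-streaming callee

// Streaming caller: the call is wrapped in smstop/smstart, and 'd' has to
// survive the mode change in a d-register rather than being coalesced into
// a full z-register that would then need to be spilled and reloaded.
void caller(double d) __arm_streaming {
  use_f64(d);
}
```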
---
 .../AArch64/AArch64ExpandPseudoInsts.cpp  |6 +
 .../Target/AArch64/AArch64ISelLowering.cpp|   24 +-
 llvm/lib/Target/AArch64/AArch64ISelLowering.h |4 +-
 .../Target/AArch64/AArch64RegisterInfo.cpp|   35 +
 .../lib/Target/AArch64/AArch64SMEInstrInfo.td |   22 +
 .../AArch64/sme-disable-gisel-fisel.ll|   20 +-
 ...ate-sm-changing-call-disable-coalescing.ll | 1640 +
 .../CodeGen/AArch64/sme-streaming-body.ll |4 +
 .../sme-streaming-compatible-interface.ll |   29 +-
 .../AArch64/sme-streaming-interface.ll|   12 +-
 ...nging-call-disable-stackslot-scavenging.ll |2 +-
 11 files changed, 1769 insertions(+), 29 deletions(-)
 create mode 100644 
llvm/test/CodeGen/AArch64/sme-pstate-sm-changing-call-disable-coalescing.ll

diff --git a/llvm/lib/Target/AArch64/AArch64ExpandPseudoInsts.cpp 
b/llvm/lib/Target/AArch64/AArch64ExpandPseudoInsts.cpp
index 352c61d48e2ff..1af064b6de3cb 100644
--- a/llvm/lib/Target/AArch64/AArch64ExpandPseudoInsts.cpp
+++ b/llvm/lib/Target/AArch64/AArch64ExpandPseudoInsts.cpp
@@ -1544,6 +1544,12 @@ bool AArch64ExpandPseudo::expandMI(MachineBasicBlock 
&MBB,
NextMBBI = MBB.end(); // The NextMBBI iterator is invalidated.
  return true;
}
+   case AArch64::COALESCER_BARRIER_FPR16:
+   case AArch64::COALESCER_BARRIER_FPR32:
+   case AArch64::COALESCER_BARRIER_FPR64:
+   case AArch64::COALESCER_BARRIER_FPR128:
+ MI.eraseFromParent();
+ return true;
case AArch64::LD1B_2Z_IMM_PSEUDO:
  return expandMultiVecPseudo(
  MBB, MBBI, AArch64::ZPR2RegClass, AArch64::ZPR2StridedRegClass,
diff --git a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp 
b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
index 332fb37655288..a59b1f2ec3c1c 100644
--- a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
+++ b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
@@ -2375,6 +2375,7 @@ const char 
*AArch64TargetLowering::getTargetNodeName(unsigned Opcode) const {
   switch ((AArch64ISD::NodeType)Opcode) {
   case AArch64ISD::FIRST_NUMBER:
 break;
+MAKE_CASE(AArch64ISD::COALESCER_BARRIER)
 MAKE_CASE(AArch64ISD::SMSTART)
 MAKE_CASE(AArch64ISD::SMSTOP)
 MAKE_CASE(AArch64ISD::RESTORE_ZA)
@@ -7154,13 +7155,18 @@ void AArch64TargetLowering::saveVarArgRegisters(CCState 
&CCInfo,
   }
 }
 
+static bool isPassedInFPR(EVT VT) {
+  return VT.isFixedLengthVector() ||
+ (VT.isFloatingPoint() && !VT.isScalableVector());
+}
+
 /// LowerCallResult - Lower the result values of a call into the
 /// appropriate copies out of appropriate physical registers.
 SDValue AArch64TargetLowering::LowerCallResult(
 SDValue Chain, SDValue InGlue, CallingConv::ID CallConv, bool isVarArg,
 const SmallVectorImpl &RVLocs, const SDLoc &DL,
 SelectionDAG &DAG, SmallVectorImpl &InVals, bool isThisReturn,
-SDValue ThisVal) const {
+SDValue ThisVal, bool RequiresSMChange) const {
   DenseMap CopiedRegs;
   // Copy all of the result registers out of their specified physreg.
   for (unsigned i = 0; i != RVLocs.size(); ++i) {
@@ -7205,6 +7211,10 @@ SDValue AArch64TargetLowering::LowerCallResult(
   break;
 }
 
+if (RequiresSMChange && isPassedInFPR(VA.getValVT()))
+  Val = DAG.getN

[llvm-branch-commits] [llvm] PR for llvm/llvm-project#80140 (PR #80141)

2024-01-31 Thread via llvm-branch-commits

llvmbot wrote:

@david-arm What do you think about merging this PR to the release branch?

https://github.com/llvm/llvm-project/pull/80141
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] PR for llvm/llvm-project#80140 (PR #80141)

2024-01-31 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-backend-aarch64

Author: None (llvmbot)


Changes

resolves llvm/llvm-project#80140

---

Patch is 93.79 KiB, truncated to 20.00 KiB below, full version: 
https://github.com/llvm/llvm-project/pull/80141.diff


11 Files Affected:

- (modified) llvm/lib/Target/AArch64/AArch64ExpandPseudoInsts.cpp (+6) 
- (modified) llvm/lib/Target/AArch64/AArch64ISelLowering.cpp (+20-4) 
- (modified) llvm/lib/Target/AArch64/AArch64ISelLowering.h (+3-1) 
- (modified) llvm/lib/Target/AArch64/AArch64RegisterInfo.cpp (+35) 
- (modified) llvm/lib/Target/AArch64/AArch64SMEInstrInfo.td (+22) 
- (modified) llvm/test/CodeGen/AArch64/sme-disable-gisel-fisel.ll (+12-8) 
- (added) 
llvm/test/CodeGen/AArch64/sme-pstate-sm-changing-call-disable-coalescing.ll 
(+1640) 
- (modified) llvm/test/CodeGen/AArch64/sme-streaming-body.ll (+4) 
- (modified) llvm/test/CodeGen/AArch64/sme-streaming-compatible-interface.ll 
(+20-9) 
- (modified) llvm/test/CodeGen/AArch64/sme-streaming-interface.ll (+6-6) 
- (modified) 
llvm/test/CodeGen/AArch64/sme-streaming-mode-changing-call-disable-stackslot-scavenging.ll
 (+1-1) 


```diff
diff --git a/llvm/lib/Target/AArch64/AArch64ExpandPseudoInsts.cpp 
b/llvm/lib/Target/AArch64/AArch64ExpandPseudoInsts.cpp
index 352c61d48e2ff..1af064b6de3cb 100644
--- a/llvm/lib/Target/AArch64/AArch64ExpandPseudoInsts.cpp
+++ b/llvm/lib/Target/AArch64/AArch64ExpandPseudoInsts.cpp
@@ -1544,6 +1544,12 @@ bool AArch64ExpandPseudo::expandMI(MachineBasicBlock 
&MBB,
NextMBBI = MBB.end(); // The NextMBBI iterator is invalidated.
  return true;
}
+   case AArch64::COALESCER_BARRIER_FPR16:
+   case AArch64::COALESCER_BARRIER_FPR32:
+   case AArch64::COALESCER_BARRIER_FPR64:
+   case AArch64::COALESCER_BARRIER_FPR128:
+ MI.eraseFromParent();
+ return true;
case AArch64::LD1B_2Z_IMM_PSEUDO:
  return expandMultiVecPseudo(
  MBB, MBBI, AArch64::ZPR2RegClass, AArch64::ZPR2StridedRegClass,
diff --git a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp 
b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
index 332fb37655288..a59b1f2ec3c1c 100644
--- a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
+++ b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
@@ -2375,6 +2375,7 @@ const char 
*AArch64TargetLowering::getTargetNodeName(unsigned Opcode) const {
   switch ((AArch64ISD::NodeType)Opcode) {
   case AArch64ISD::FIRST_NUMBER:
 break;
+MAKE_CASE(AArch64ISD::COALESCER_BARRIER)
 MAKE_CASE(AArch64ISD::SMSTART)
 MAKE_CASE(AArch64ISD::SMSTOP)
 MAKE_CASE(AArch64ISD::RESTORE_ZA)
@@ -7154,13 +7155,18 @@ void AArch64TargetLowering::saveVarArgRegisters(CCState 
&CCInfo,
   }
 }
 
+static bool isPassedInFPR(EVT VT) {
+  return VT.isFixedLengthVector() ||
+ (VT.isFloatingPoint() && !VT.isScalableVector());
+}
+
 /// LowerCallResult - Lower the result values of a call into the
 /// appropriate copies out of appropriate physical registers.
 SDValue AArch64TargetLowering::LowerCallResult(
 SDValue Chain, SDValue InGlue, CallingConv::ID CallConv, bool isVarArg,
 const SmallVectorImpl &RVLocs, const SDLoc &DL,
 SelectionDAG &DAG, SmallVectorImpl &InVals, bool isThisReturn,
-SDValue ThisVal) const {
+SDValue ThisVal, bool RequiresSMChange) const {
   DenseMap CopiedRegs;
   // Copy all of the result registers out of their specified physreg.
   for (unsigned i = 0; i != RVLocs.size(); ++i) {
@@ -7205,6 +7211,10 @@ SDValue AArch64TargetLowering::LowerCallResult(
   break;
 }
 
+if (RequiresSMChange && isPassedInFPR(VA.getValVT()))
+  Val = DAG.getNode(AArch64ISD::COALESCER_BARRIER, DL, Val.getValueType(),
+Val);
+
 InVals.push_back(Val);
   }
 
@@ -7915,6 +7925,12 @@ AArch64TargetLowering::LowerCall(CallLoweringInfo &CLI,
   return ArgReg.Reg == VA.getLocReg();
 });
   } else {
+// Add an extra level of indirection for streaming mode changes by
+// using a pseudo copy node that cannot be rematerialised between a
+// smstart/smstop and the call by the simple register coalescer.
+if (RequiresSMChange && isPassedInFPR(Arg.getValueType()))
+  Arg = DAG.getNode(AArch64ISD::COALESCER_BARRIER, DL,
+Arg.getValueType(), Arg);
 RegsToPass.emplace_back(VA.getLocReg(), Arg);
 RegsUsed.insert(VA.getLocReg());
 const TargetOptions &Options = DAG.getTarget().Options;
@@ -8151,9 +8167,9 @@ AArch64TargetLowering::LowerCall(CallLoweringInfo &CLI,
 
   // Handle result values, copying them out of physregs into vregs that we
   // return.
-  SDValue Result = LowerCallResult(Chain, InGlue, CallConv, IsVarArg, RVLocs,
-   DL, DAG, InVals, IsThisReturn,
-   IsThisReturn ? OutVals[0] : SDValue());
+  SDValue Result = LowerCallResult(
+  Chain, InGlue, CallConv, IsVarArg, RVLocs, DL, DAG, InVals, IsThisReturn,
+  IsThisReturn ? OutVals

[llvm-branch-commits] [llvm] PR for llvm/llvm-project#80137 (PR #80138)

2024-01-31 Thread Kerry McLaughlin via llvm-branch-commits

kmclaughlin-arm wrote:

> @kmclaughlin-arm What do you think about merging this PR to the release 
> branch?

I think this should be merged into the release branch, as it fixes incorrect 
inlining of `__arm_locally_streaming` functions.

https://github.com/llvm/llvm-project/pull/80138
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] PR for llvm/llvm-project#80137 (PR #80138)

2024-01-31 Thread Kerry McLaughlin via llvm-branch-commits

https://github.com/kmclaughlin-arm approved this pull request.


https://github.com/llvm/llvm-project/pull/80138
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BPF] expand cttz, ctlz for i32, i64 (PR #73668)

2024-01-31 Thread via llvm-branch-commits

eddyz87 wrote:

I tried this with the test below using current kernel master, and all works as 
expected.

```diff
diff --git a/tools/testing/selftests/bpf/progs/verifier_and.c 
b/tools/testing/selftests/bpf/progs/verifier_and.c
index e97e518516b6..8a051bd0c886 100644
--- a/tools/testing/selftests/bpf/progs/verifier_and.c
+++ b/tools/testing/selftests/bpf/progs/verifier_and.c
@@ -104,4 +104,14 @@ l0_%=: r0 = 0; 
\
: __clobber_all);
 }
 
+unsigned A[3] = {1u << 31, 1u << 30, 1u << 29};
+
+SEC("socket") __success __retval(0) int clz1(void *ctx) { return 
__builtin_clz(A[0]); }
+SEC("socket") __success __retval(1) int clz2(void *ctx) { return 
__builtin_clz(A[1]); }
+SEC("socket") __success __retval(2) int clz3(void *ctx) { return 
__builtin_clz(A[2]); }
+
+SEC("socket") __success __retval(31) int ctz1(void *ctx) { return 
__builtin_ctz(A[0]); }
+SEC("socket") __success __retval(30) int ctz2(void *ctx) { return 
__builtin_ctz(A[1]); }
+SEC("socket") __success __retval(29) int ctz3(void *ctx) { return 
__builtin_ctz(A[2]); }
+
 char _license[] SEC("license") = "GPL";
```

https://github.com/llvm/llvm-project/pull/73668
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BPF] expand cttz, ctlz for i32, i64 (PR #73668)

2024-01-31 Thread via llvm-branch-commits


@@ -0,0 +1,304 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py 
UTC_ARGS: --version 4
+; RUN: llc < %s -march=bpf | FileCheck %s
+
+; test that we can expand CTTZ & CTLZ
+
+declare i32 @llvm.cttz.i32(i32, i1)
+
+define i32 @cttz_i32_zdef(i32 %a) {
+; CHECK-LABEL: cttz_i32_zdef:

eddyz87 wrote:

Question: how stable are these expansions?
Previously the compiler would just error out. Maybe just insert some dummy
checks that verify that something is returned from these functions? And we
can add a few tests on the kernel side that verify the runtime result. Wdyt?

https://github.com/llvm/llvm-project/pull/73668
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] PR for llvm/llvm-project#79568 (PR #80120)

2024-01-31 Thread Erich Keane via llvm-branch-commits

erichkeane wrote:

Does this fix a regression against 17?  I didn't think it did?

https://github.com/llvm/llvm-project/pull/80120
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] PR for llvm/llvm-project#79568 (PR #80120)

2024-01-31 Thread Younan Zhang via llvm-branch-commits

zyn0217 wrote:

I think this is a regression from clang 16. https://cpp1.godbolt.org/z/1PnWbvY1r

https://github.com/llvm/llvm-project/pull/80120
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] PR for llvm/llvm-project#79568 (PR #80120)

2024-01-31 Thread Erich Keane via llvm-branch-commits

https://github.com/erichkeane approved this pull request.


https://github.com/llvm/llvm-project/pull/80120
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] PR for llvm/llvm-project#80140 (PR #80141)

2024-01-31 Thread David Sherwood via llvm-branch-commits

https://github.com/david-arm approved this pull request.

LGTM. This is a critical fix for SME to ensure correct behaviour and prevent 
stack corruption.

https://github.com/llvm/llvm-project/pull/80141
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] [clang-tools-extra] Backport '[clang] static operators should evaluate object argument (reland)' to release/18.x (PR #80109)

2024-01-31 Thread Aaron Ballman via llvm-branch-commits

https://github.com/AaronBallman milestoned 
https://github.com/llvm/llvm-project/pull/80109
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] PR for llvm/llvm-project#80150 (PR #80151)

2024-01-31 Thread via llvm-branch-commits

https://github.com/llvmbot created 
https://github.com/llvm/llvm-project/pull/80151

resolves llvm/llvm-project#80150

>From 80dc5345695c8b72626463a737a944caabc18734 Mon Sep 17 00:00:00 2001
From: Andrey Ali Khan Bolshakov
 <32954549+bolshako...@users.noreply.github.com>
Date: Wed, 31 Jan 2024 17:28:37 +0300
Subject: [PATCH] [clang] Represent array refs as
 `TemplateArgument::Declaration` (#80050)

This returns (probably temporarily) array-referring NTTP behavior to
what it was prior to #78041 because ~~I'm fed up~~ I have no time to
fix regressions.

(cherry picked from commit 9bf4e54ef42d907ae7550f36fa518f14fa97af6f)
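
For illustration, a hedged C++ sketch of an array-referring non-type template
argument (names are hypothetical; it mirrors the coverage-mapping test added
below, whose template parameter list is elided in this digest):

```cpp
// 'arr' decays to a pointer to its first element; with this change the
// argument is again represented as a TemplateArgument::Declaration referring
// to the VarDecl 'arr' rather than as a structural value.
template <int *P> void tpl_fn() { (void)P; }

int arr[] = {1, 2, 3};

void test() { tpl_fn<arr>(); }
```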
---
 clang/lib/Sema/SemaTemplate.cpp  | 44 +---
 clang/test/CoverageMapping/templates.cpp | 13 +++
 2 files changed, 37 insertions(+), 20 deletions(-)

diff --git a/clang/lib/Sema/SemaTemplate.cpp b/clang/lib/Sema/SemaTemplate.cpp
index 9bfa71dc8bcf1..a381d876a54c6 100644
--- a/clang/lib/Sema/SemaTemplate.cpp
+++ b/clang/lib/Sema/SemaTemplate.cpp
@@ -7412,9 +7412,9 @@ ExprResult 
Sema::CheckTemplateArgument(NonTypeTemplateParmDecl *Param,
 if (ArgResult.isInvalid())
   return ExprError();
 
-// Prior to C++20, enforce restrictions on possible template argument
-// values.
-if (!getLangOpts().CPlusPlus20 && Value.isLValue()) {
+if (Value.isLValue()) {
+  APValue::LValueBase Base = Value.getLValueBase();
+  auto *VD = const_cast(Base.dyn_cast());
   //   For a non-type template-parameter of pointer or reference type,
   //   the value of the constant expression shall not refer to
   assert(ParamType->isPointerType() || ParamType->isReferenceType() ||
@@ -7423,8 +7423,6 @@ ExprResult 
Sema::CheckTemplateArgument(NonTypeTemplateParmDecl *Param,
   // -- a string literal
   // -- the result of a typeid expression, or
   // -- a predefined __func__ variable
-  APValue::LValueBase Base = Value.getLValueBase();
-  auto *VD = const_cast(Base.dyn_cast());
   if (Base &&
   (!VD ||
isa(VD))) 
{
@@ -7432,24 +7430,30 @@ ExprResult 
Sema::CheckTemplateArgument(NonTypeTemplateParmDecl *Param,
 << Arg->getSourceRange();
 return ExprError();
   }
-  // -- a subobject [until C++20]
-  if (Value.hasLValuePath() && Value.getLValuePath().size() == 1 &&
-  VD && VD->getType()->isArrayType() &&
+
+  if (Value.hasLValuePath() && Value.getLValuePath().size() == 1 && VD &&
+  VD->getType()->isArrayType() &&
   Value.getLValuePath()[0].getAsArrayIndex() == 0 &&
   !Value.isLValueOnePastTheEnd() && ParamType->isPointerType()) {
-// Per defect report (no number yet):
-//   ... other than a pointer to the first element of a complete array
-//   object.
-  } else if (!Value.hasLValuePath() || Value.getLValuePath().size() ||
- Value.isLValueOnePastTheEnd()) {
-Diag(StartLoc, diag::err_non_type_template_arg_subobject)
-  << Value.getAsString(Context, ParamType);
-return ExprError();
+SugaredConverted = TemplateArgument(VD, ParamType);
+CanonicalConverted = TemplateArgument(
+cast(VD->getCanonicalDecl()), CanonParamType);
+return ArgResult.get();
+  }
+
+  // -- a subobject [until C++20]
+  if (!getLangOpts().CPlusPlus20) {
+if (!Value.hasLValuePath() || Value.getLValuePath().size() ||
+Value.isLValueOnePastTheEnd()) {
+  Diag(StartLoc, diag::err_non_type_template_arg_subobject)
+  << Value.getAsString(Context, ParamType);
+  return ExprError();
+}
+assert((VD || !ParamType->isReferenceType()) &&
+   "null reference should not be a constant expression");
+assert((!VD || !ParamType->isNullPtrType()) &&
+   "non-null value of type nullptr_t?");
   }
-  assert((VD || !ParamType->isReferenceType()) &&
- "null reference should not be a constant expression");
-  assert((!VD || !ParamType->isNullPtrType()) &&
- "non-null value of type nullptr_t?");
 }
 
 if (Value.isAddrLabelDiff())
diff --git a/clang/test/CoverageMapping/templates.cpp 
b/clang/test/CoverageMapping/templates.cpp
index 7010edbc32c34..143e566a33cb8 100644
--- a/clang/test/CoverageMapping/templates.cpp
+++ b/clang/test/CoverageMapping/templates.cpp
@@ -19,3 +19,16 @@ int main() {
   func(true);
   return 0;
 }
+
+namespace structural_value_crash {
+  template 
+  void tpl_fn() {
+(void)p;
+  }
+
+  int arr[] = {1, 2, 3};
+
+  void test() {
+tpl_fn();
+  }
+}

___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] PR for llvm/llvm-project#80150 (PR #80151)

2024-01-31 Thread via llvm-branch-commits

https://github.com/llvmbot milestoned 
https://github.com/llvm/llvm-project/pull/80151
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] PR for llvm/llvm-project#80150 (PR #80151)

2024-01-31 Thread via llvm-branch-commits

llvmbot wrote:

@cor3ntin What do you think about merging this PR to the release branch?

https://github.com/llvm/llvm-project/pull/80151
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] PR for llvm/llvm-project#80150 (PR #80151)

2024-01-31 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-clang

Author: None (llvmbot)


Changes

resolves llvm/llvm-project#80150

---
Full diff: https://github.com/llvm/llvm-project/pull/80151.diff


2 Files Affected:

- (modified) clang/lib/Sema/SemaTemplate.cpp (+24-20) 
- (modified) clang/test/CoverageMapping/templates.cpp (+13) 


```diff
diff --git a/clang/lib/Sema/SemaTemplate.cpp b/clang/lib/Sema/SemaTemplate.cpp
index 9bfa71dc8bcf1..a381d876a54c6 100644
--- a/clang/lib/Sema/SemaTemplate.cpp
+++ b/clang/lib/Sema/SemaTemplate.cpp
@@ -7412,9 +7412,9 @@ ExprResult 
Sema::CheckTemplateArgument(NonTypeTemplateParmDecl *Param,
 if (ArgResult.isInvalid())
   return ExprError();
 
-// Prior to C++20, enforce restrictions on possible template argument
-// values.
-if (!getLangOpts().CPlusPlus20 && Value.isLValue()) {
+if (Value.isLValue()) {
+  APValue::LValueBase Base = Value.getLValueBase();
+  auto *VD = const_cast(Base.dyn_cast());
   //   For a non-type template-parameter of pointer or reference type,
   //   the value of the constant expression shall not refer to
   assert(ParamType->isPointerType() || ParamType->isReferenceType() ||
@@ -7423,8 +7423,6 @@ ExprResult 
Sema::CheckTemplateArgument(NonTypeTemplateParmDecl *Param,
   // -- a string literal
   // -- the result of a typeid expression, or
   // -- a predefined __func__ variable
-  APValue::LValueBase Base = Value.getLValueBase();
-  auto *VD = const_cast(Base.dyn_cast());
   if (Base &&
   (!VD ||
isa(VD))) 
{
@@ -7432,24 +7430,30 @@ ExprResult 
Sema::CheckTemplateArgument(NonTypeTemplateParmDecl *Param,
 << Arg->getSourceRange();
 return ExprError();
   }
-  // -- a subobject [until C++20]
-  if (Value.hasLValuePath() && Value.getLValuePath().size() == 1 &&
-  VD && VD->getType()->isArrayType() &&
+
+  if (Value.hasLValuePath() && Value.getLValuePath().size() == 1 && VD &&
+  VD->getType()->isArrayType() &&
   Value.getLValuePath()[0].getAsArrayIndex() == 0 &&
   !Value.isLValueOnePastTheEnd() && ParamType->isPointerType()) {
-// Per defect report (no number yet):
-//   ... other than a pointer to the first element of a complete array
-//   object.
-  } else if (!Value.hasLValuePath() || Value.getLValuePath().size() ||
- Value.isLValueOnePastTheEnd()) {
-Diag(StartLoc, diag::err_non_type_template_arg_subobject)
-  << Value.getAsString(Context, ParamType);
-return ExprError();
+SugaredConverted = TemplateArgument(VD, ParamType);
+CanonicalConverted = TemplateArgument(
+cast(VD->getCanonicalDecl()), CanonParamType);
+return ArgResult.get();
+  }
+
+  // -- a subobject [until C++20]
+  if (!getLangOpts().CPlusPlus20) {
+if (!Value.hasLValuePath() || Value.getLValuePath().size() ||
+Value.isLValueOnePastTheEnd()) {
+  Diag(StartLoc, diag::err_non_type_template_arg_subobject)
+  << Value.getAsString(Context, ParamType);
+  return ExprError();
+}
+assert((VD || !ParamType->isReferenceType()) &&
+   "null reference should not be a constant expression");
+assert((!VD || !ParamType->isNullPtrType()) &&
+   "non-null value of type nullptr_t?");
   }
-  assert((VD || !ParamType->isReferenceType()) &&
- "null reference should not be a constant expression");
-  assert((!VD || !ParamType->isNullPtrType()) &&
- "non-null value of type nullptr_t?");
 }
 
 if (Value.isAddrLabelDiff())
diff --git a/clang/test/CoverageMapping/templates.cpp 
b/clang/test/CoverageMapping/templates.cpp
index 7010edbc32c34..143e566a33cb8 100644
--- a/clang/test/CoverageMapping/templates.cpp
+++ b/clang/test/CoverageMapping/templates.cpp
@@ -19,3 +19,16 @@ int main() {
   func(true);
   return 0;
 }
+
+namespace structural_value_crash {
+  template 
+  void tpl_fn() {
+(void)p;
+  }
+
+  int arr[] = {1, 2, 3};
+
+  void test() {
+tpl_fn();
+  }
+}

``




https://github.com/llvm/llvm-project/pull/80151
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang-tools-extra] [clang] Backport '[clang] static operators should evaluate object argument (reland)' to release/18.x (PR #80109)

2024-01-31 Thread Aaron Ballman via llvm-branch-commits

https://github.com/AaronBallman approved this pull request.

LGTM for 18.x, but another set of eyes verifying the fix looks safe would not 
be a bad thing, either.

https://github.com/llvm/llvm-project/pull/80109
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] [Release Notes][FMV] Document support for rcpc3 and mops features. (PR #80152)

2024-01-31 Thread Alexandros Lamprineas via llvm-branch-commits

https://github.com/labrinea created 
https://github.com/llvm/llvm-project/pull/80152

Documents support for Load-Acquire RCpc instructions v3 (rcpc3) as well as 
Memory Copy and Memory Set Acceleration instructions (mops) when targeting 
AArch64.
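
A rough illustration (not part of this patch; `impl` and `copy_block` are
made-up names) of how the new identifiers plug into the existing FMV
attributes:

```c++
// Each version carries one of the newly documented feature identifiers.
__attribute__((target_version("rcpc3")))   int impl(void) { return 1; }
__attribute__((target_version("mops")))    int impl(void) { return 2; }
__attribute__((target_version("default"))) int impl(void) { return 0; }

// Or with target_clones, emitting all versions from a single body.
__attribute__((target_clones("rcpc3", "mops", "default")))
void copy_block(char *dst, const char *src, unsigned long n) {
  for (unsigned long i = 0; i < n; ++i)
    dst[i] = src[i];
}
```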

>From 6556c342d6c98b9d755fecfbd1d41e49dc620c4c Mon Sep 17 00:00:00 2001
From: Alexandros Lamprineas 
Date: Wed, 31 Jan 2024 15:19:41 +
Subject: [PATCH] [Release Notes][FMV] Document support for rcpc3 and mops
 features.

---
 clang/docs/ReleaseNotes.rst | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index 060bc7669b72a..421f793ee1d8e 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -1164,6 +1164,12 @@ Arm and AArch64 Support
   * Cortex-A720 (cortex-a720).
   * Cortex-X4 (cortex-x4).
 
+- Function Multi Versioning has been extended to support Load-Acquire RCpc
+  instructions v3 (rcpc3) as well as Memory Copy and Memory Set Acceleration
+  instructions (mops) when targeting AArch64. The feature identifiers (in
+  parenthesis) can be used with either of the ``target_version`` and
+  ``target_clones`` attributes.
+
 Android Support
 ^^^
 

___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] [Release Notes][FMV] Document support for rcpc3 and mops features. (PR #80152)

2024-01-31 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-clang

Author: Alexandros Lamprineas (labrinea)


Changes

Documents support for Load-Acquire RCpc instructions v3 (rcpc3) as well as 
Memory Copy and Memory Set Acceleration instructions (mops) when targeting 
AArch64.

---
Full diff: https://github.com/llvm/llvm-project/pull/80152.diff


1 Files Affected:

- (modified) clang/docs/ReleaseNotes.rst (+6) 


``diff
diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index 060bc7669b72a..421f793ee1d8e 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -1164,6 +1164,12 @@ Arm and AArch64 Support
   * Cortex-A720 (cortex-a720).
   * Cortex-X4 (cortex-x4).
 
+- Function Multi Versioning has been extended to support Load-Acquire RCpc
+  instructions v3 (rcpc3) as well as Memory Copy and Memory Set Acceleration
+  instructions (mops) when targeting AArch64. The feature identifiers (in
+  parenthesis) can be used with either of the ``target_version`` and
+  ``target_clones`` attributes.
+
 Android Support
 ^^^
 

``




https://github.com/llvm/llvm-project/pull/80152
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] PR for llvm/llvm-project#80150 (PR #80151)

2024-01-31 Thread via llvm-branch-commits

https://github.com/llvmbot updated 
https://github.com/llvm/llvm-project/pull/80151

>From 6709a582a8345025bf771ff9009d62eab94f2224 Mon Sep 17 00:00:00 2001
From: Andrey Ali Khan Bolshakov
 <32954549+bolshako...@users.noreply.github.com>
Date: Wed, 31 Jan 2024 17:28:37 +0300
Subject: [PATCH] [clang] Represent array refs as
 `TemplateArgument::Declaration` (#80050)

This returns (probably temporarily) array-referring NTTP behavior to what it
was prior to #78041 because ~~I'm fed up~~ I have no time to fix the
regressions.

(cherry picked from commit 9bf4e54ef42d907ae7550f36fa518f14fa97af6f)
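
For context, a minimal sketch (illustrative only, not code from the patch) of
the kind of array-referring non-type template argument this affects:

```c++
int arr[3] = {1, 2, 3};

template <int *P>        // pointer NTTP; 'arr' decays to a pointer to its first element
void tpl() { (void)P; }

void test() {
  tpl<arr>();            // with this change the argument is once again
}                        // represented as TemplateArgument::Declaration
```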
---
 clang/lib/Sema/SemaTemplate.cpp  | 44 +---
 clang/test/CoverageMapping/templates.cpp | 13 +++
 2 files changed, 37 insertions(+), 20 deletions(-)

diff --git a/clang/lib/Sema/SemaTemplate.cpp b/clang/lib/Sema/SemaTemplate.cpp
index 9bfa71dc8bcf1..a381d876a54c6 100644
--- a/clang/lib/Sema/SemaTemplate.cpp
+++ b/clang/lib/Sema/SemaTemplate.cpp
@@ -7412,9 +7412,9 @@ ExprResult 
Sema::CheckTemplateArgument(NonTypeTemplateParmDecl *Param,
 if (ArgResult.isInvalid())
   return ExprError();
 
-// Prior to C++20, enforce restrictions on possible template argument
-// values.
-if (!getLangOpts().CPlusPlus20 && Value.isLValue()) {
+if (Value.isLValue()) {
+  APValue::LValueBase Base = Value.getLValueBase();
+  auto *VD = const_cast(Base.dyn_cast());
   //   For a non-type template-parameter of pointer or reference type,
   //   the value of the constant expression shall not refer to
   assert(ParamType->isPointerType() || ParamType->isReferenceType() ||
@@ -7423,8 +7423,6 @@ ExprResult 
Sema::CheckTemplateArgument(NonTypeTemplateParmDecl *Param,
   // -- a string literal
   // -- the result of a typeid expression, or
   // -- a predefined __func__ variable
-  APValue::LValueBase Base = Value.getLValueBase();
-  auto *VD = const_cast(Base.dyn_cast());
   if (Base &&
   (!VD ||
isa(VD))) 
{
@@ -7432,24 +7430,30 @@ ExprResult 
Sema::CheckTemplateArgument(NonTypeTemplateParmDecl *Param,
 << Arg->getSourceRange();
 return ExprError();
   }
-  // -- a subobject [until C++20]
-  if (Value.hasLValuePath() && Value.getLValuePath().size() == 1 &&
-  VD && VD->getType()->isArrayType() &&
+
+  if (Value.hasLValuePath() && Value.getLValuePath().size() == 1 && VD &&
+  VD->getType()->isArrayType() &&
   Value.getLValuePath()[0].getAsArrayIndex() == 0 &&
   !Value.isLValueOnePastTheEnd() && ParamType->isPointerType()) {
-// Per defect report (no number yet):
-//   ... other than a pointer to the first element of a complete array
-//   object.
-  } else if (!Value.hasLValuePath() || Value.getLValuePath().size() ||
- Value.isLValueOnePastTheEnd()) {
-Diag(StartLoc, diag::err_non_type_template_arg_subobject)
-  << Value.getAsString(Context, ParamType);
-return ExprError();
+SugaredConverted = TemplateArgument(VD, ParamType);
+CanonicalConverted = TemplateArgument(
+cast(VD->getCanonicalDecl()), CanonParamType);
+return ArgResult.get();
+  }
+
+  // -- a subobject [until C++20]
+  if (!getLangOpts().CPlusPlus20) {
+if (!Value.hasLValuePath() || Value.getLValuePath().size() ||
+Value.isLValueOnePastTheEnd()) {
+  Diag(StartLoc, diag::err_non_type_template_arg_subobject)
+  << Value.getAsString(Context, ParamType);
+  return ExprError();
+}
+assert((VD || !ParamType->isReferenceType()) &&
+   "null reference should not be a constant expression");
+assert((!VD || !ParamType->isNullPtrType()) &&
+   "non-null value of type nullptr_t?");
   }
-  assert((VD || !ParamType->isReferenceType()) &&
- "null reference should not be a constant expression");
-  assert((!VD || !ParamType->isNullPtrType()) &&
- "non-null value of type nullptr_t?");
 }
 
 if (Value.isAddrLabelDiff())
diff --git a/clang/test/CoverageMapping/templates.cpp 
b/clang/test/CoverageMapping/templates.cpp
index 7010edbc32c34..143e566a33cb8 100644
--- a/clang/test/CoverageMapping/templates.cpp
+++ b/clang/test/CoverageMapping/templates.cpp
@@ -19,3 +19,16 @@ int main() {
   func(true);
   return 0;
 }
+
+namespace structural_value_crash {
+  template 
+  void tpl_fn() {
+(void)p;
+  }
+
+  int arr[] = {1, 2, 3};
+
+  void test() {
+tpl_fn();
+  }
+}

___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] [Release Notes][FMV] Document support for rcpc3 and mops features. (PR #80152)

2024-01-31 Thread Alexandros Lamprineas via llvm-branch-commits

https://github.com/labrinea milestoned 
https://github.com/llvm/llvm-project/pull/80152
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] PR for llvm/llvm-project#80140 (PR #80141)

2024-01-31 Thread via llvm-branch-commits

https://github.com/llvmbot updated 
https://github.com/llvm/llvm-project/pull/80141

>From 27139bceb27d0b551e6e9d18fb91c703cbc3d7b8 Mon Sep 17 00:00:00 2001
From: Sander de Smalen 
Date: Wed, 31 Jan 2024 09:04:13 +
Subject: [PATCH] [SME] Stop RA from coalescing COPY instructions that
 transcend beyond smstart/smstop. (#78294)

This patch introduces a 'COALESCER_BARRIER', a pseudo node that expands to
a 'nop' but stops the register allocator from coalescing a COPY node when
its use/def crosses an SMSTART or SMSTOP instruction.

For example:

%0:fpr64 = COPY killed $d0
undef %2.dsub:zpr = COPY %0   // <- Do not coalesce this COPY
ADJCALLSTACKDOWN 0, 0
MSRpstatesvcrImm1 1, 0, csr_aarch64_smstartstop, implicit-def dead $d0
$d0 = COPY killed %0
BL @use_f64, csr_aarch64_aapcs

If the COPY would be coalesced, that would lead to:

$d0 = COPY killed %0

being replaced by:

$d0 = COPY killed %2.dsub

which means the whole ZPR reg would be live upto the call, causing the
MSRpstatesvcrImm1 (smstop) to spill/reload the ZPR register:

str q0, [sp]   // 16-byte Folded Spill
smstop  sm
ldr z0, [sp]   // 16-byte Folded Reload
bl  use_f64

which would be incorrect for two reasons:
1. The program may load more data than it has allocated.
2. If there are other SVE objects on the stack, the compiler might use the
   'mul vl' addressing modes to access the spill location.

By disabling the coalescing, we get the desired results:

str d0, [sp, #8]  // 8-byte Folded Spill
smstop  sm
ldr d0, [sp, #8]  // 8-byte Folded Reload
bl  use_f64

(cherry picked from commit dd736661826e215ac70ff3a4a4ccd75bda0c5ccd)
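
At the source level, the shape involved is roughly the following sketch (it
assumes an SME-enabled toolchain; `use_f64` is an illustrative name, not taken
from the patch):

```c++
// A streaming caller passes an FP value to a non-streaming callee.  The call is
// wrapped in smstop/smstart, so 'x' is live across the mode change and must be
// spilled/reloaded as a 64-bit value rather than as a whole Z register.
extern "C" void use_f64(double);         // non-streaming callee (assumed)

void caller(double x) __arm_streaming {  // SME ACLE keyword attribute
  use_f64(x);                            // 'x' crosses the smstop/smstart pair
}
```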
---
 .../AArch64/AArch64ExpandPseudoInsts.cpp  |6 +
 .../Target/AArch64/AArch64ISelLowering.cpp|   24 +-
 llvm/lib/Target/AArch64/AArch64ISelLowering.h |4 +-
 .../Target/AArch64/AArch64RegisterInfo.cpp|   35 +
 .../lib/Target/AArch64/AArch64SMEInstrInfo.td |   22 +
 .../AArch64/sme-disable-gisel-fisel.ll|   20 +-
 ...ate-sm-changing-call-disable-coalescing.ll | 1640 +
 .../CodeGen/AArch64/sme-streaming-body.ll |4 +
 .../sme-streaming-compatible-interface.ll |   29 +-
 .../AArch64/sme-streaming-interface.ll|   12 +-
 ...nging-call-disable-stackslot-scavenging.ll |2 +-
 11 files changed, 1769 insertions(+), 29 deletions(-)
 create mode 100644 
llvm/test/CodeGen/AArch64/sme-pstate-sm-changing-call-disable-coalescing.ll

diff --git a/llvm/lib/Target/AArch64/AArch64ExpandPseudoInsts.cpp 
b/llvm/lib/Target/AArch64/AArch64ExpandPseudoInsts.cpp
index 352c61d48e2ff..1af064b6de3cb 100644
--- a/llvm/lib/Target/AArch64/AArch64ExpandPseudoInsts.cpp
+++ b/llvm/lib/Target/AArch64/AArch64ExpandPseudoInsts.cpp
@@ -1544,6 +1544,12 @@ bool AArch64ExpandPseudo::expandMI(MachineBasicBlock 
&MBB,
NextMBBI = MBB.end(); // The NextMBBI iterator is invalidated.
  return true;
}
+   case AArch64::COALESCER_BARRIER_FPR16:
+   case AArch64::COALESCER_BARRIER_FPR32:
+   case AArch64::COALESCER_BARRIER_FPR64:
+   case AArch64::COALESCER_BARRIER_FPR128:
+ MI.eraseFromParent();
+ return true;
case AArch64::LD1B_2Z_IMM_PSEUDO:
  return expandMultiVecPseudo(
  MBB, MBBI, AArch64::ZPR2RegClass, AArch64::ZPR2StridedRegClass,
diff --git a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp 
b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
index 332fb37655288..a59b1f2ec3c1c 100644
--- a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
+++ b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
@@ -2375,6 +2375,7 @@ const char 
*AArch64TargetLowering::getTargetNodeName(unsigned Opcode) const {
   switch ((AArch64ISD::NodeType)Opcode) {
   case AArch64ISD::FIRST_NUMBER:
 break;
+MAKE_CASE(AArch64ISD::COALESCER_BARRIER)
 MAKE_CASE(AArch64ISD::SMSTART)
 MAKE_CASE(AArch64ISD::SMSTOP)
 MAKE_CASE(AArch64ISD::RESTORE_ZA)
@@ -7154,13 +7155,18 @@ void AArch64TargetLowering::saveVarArgRegisters(CCState 
&CCInfo,
   }
 }
 
+static bool isPassedInFPR(EVT VT) {
+  return VT.isFixedLengthVector() ||
+ (VT.isFloatingPoint() && !VT.isScalableVector());
+}
+
 /// LowerCallResult - Lower the result values of a call into the
 /// appropriate copies out of appropriate physical registers.
 SDValue AArch64TargetLowering::LowerCallResult(
 SDValue Chain, SDValue InGlue, CallingConv::ID CallConv, bool isVarArg,
 const SmallVectorImpl &RVLocs, const SDLoc &DL,
 SelectionDAG &DAG, SmallVectorImpl &InVals, bool isThisReturn,
-SDValue ThisVal) const {
+SDValue ThisVal, bool RequiresSMChange) const {
   DenseMap CopiedRegs;
   // Copy all of the result registers out of their specified physreg.
   for (unsigned i = 0; i != RVLocs.size(); ++i) {
@@ -7205,6 +7211,10 @@ SDValue AArch64TargetLowering::LowerCallResult(
   break;
 }
 
+if (RequiresSMChange && isPassedInFPR(VA.getValVT()))
+  Val = DAG.getNode(AArch64ISD::COALESCER_BARRIER,

[llvm-branch-commits] [llvm] PR for llvm/llvm-project#80137 (PR #80138)

2024-01-31 Thread via llvm-branch-commits

https://github.com/llvmbot updated 
https://github.com/llvm/llvm-project/pull/80138

>From 467057c926eacfbe8ed00bba9de3e86e79b64d7e Mon Sep 17 00:00:00 2001
From: Sander de Smalen 
Date: Wed, 31 Jan 2024 11:38:29 +
Subject: [PATCH] [AArch64][SME] Fix inlining bug introduced in #78703 (#79994)

Calling a `__arm_locally_streaming` function from a function that
is not a streaming-SVE function would lead to incorrect inlining.

The issue didn't surface because the tests were not testing what
they were supposed to test.

(cherry picked from commit 3abf55a68caefd45042c27b73a658c638afbbb8b)
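
In source terms, the mis-handled case looks roughly like the sketch below
(assuming an SME-enabled toolchain; this is not code from the patch):

```c++
__arm_locally_streaming void callee() {
  // The *body* runs in streaming mode, even though the interface is normal.
}

void caller() {   // a plain, non-streaming function
  callee();       // inlining must be judged against the callee's streaming body,
}                 // not against its non-streaming interface
```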
---
 .../AArch64/AArch64TargetTransformInfo.cpp|  17 +-
 .../Inline/AArch64/sme-pstatesm-attrs.ll  | 369 +-
 2 files changed, 195 insertions(+), 191 deletions(-)

diff --git a/llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp 
b/llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
index d611338fc268f..992b11da7 100644
--- a/llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
+++ b/llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
@@ -233,15 +233,20 @@ static bool hasPossibleIncompatibleOps(const Function *F) 
{
 
 bool AArch64TTIImpl::areInlineCompatible(const Function *Caller,
  const Function *Callee) const {
-  SMEAttrs CallerAttrs(*Caller);
-  SMEAttrs CalleeAttrs(*Callee);
+  SMEAttrs CallerAttrs(*Caller), CalleeAttrs(*Callee);
+
+  // When inlining, we should consider the body of the function, not the
+  // interface.
+  if (CalleeAttrs.hasStreamingBody()) {
+CalleeAttrs.set(SMEAttrs::SM_Compatible, false);
+CalleeAttrs.set(SMEAttrs::SM_Enabled, true);
+  }
+
   if (CalleeAttrs.hasNewZABody())
 return false;
 
   if (CallerAttrs.requiresLazySave(CalleeAttrs) ||
-  (CallerAttrs.requiresSMChange(CalleeAttrs) &&
-   (!CallerAttrs.hasStreamingInterfaceOrBody() ||
-!CalleeAttrs.hasStreamingBody( {
+  CallerAttrs.requiresSMChange(CalleeAttrs)) {
 if (hasPossibleIncompatibleOps(Callee))
   return false;
   }
@@ -4062,4 +4067,4 @@ bool 
AArch64TTIImpl::shouldTreatInstructionLikeSelect(const Instruction *I) {
   cast(I->getNextNode())->isUnconditional())
 return true;
   return BaseT::shouldTreatInstructionLikeSelect(I);
-}
\ No newline at end of file
+}
diff --git a/llvm/test/Transforms/Inline/AArch64/sme-pstatesm-attrs.ll 
b/llvm/test/Transforms/Inline/AArch64/sme-pstatesm-attrs.ll
index d6b1f3ef45e76..7723e6c664c3d 100644
--- a/llvm/test/Transforms/Inline/AArch64/sme-pstatesm-attrs.ll
+++ b/llvm/test/Transforms/Inline/AArch64/sme-pstatesm-attrs.ll
@@ -1,71 +1,70 @@
 ; NOTE: Assertions have been autogenerated by utils/update_test_checks.py 
UTC_ARGS: --version 2
 ; RUN: opt < %s -mtriple=aarch64-unknown-linux-gnu -mattr=+sme -S 
-passes=inline | FileCheck %s
 
-declare void @inlined_body() "aarch64_pstate_sm_compatible";
+declare i32 @llvm.vscale.i32()
 
-; Define some functions that will be called by the functions below.
-; These just call a '...body()' function. If we see the call to one of
-; these functions being replaced by '...body()', then we know it has been
-; inlined.
+; Define some functions that merely call llvm.vscale.i32(), which will be 
called
+; by the other functions below. If we see the call to one of these functions
+; being replaced by 'llvm.vscale()', then we know it has been inlined.
 
-define void @normal_callee() {
-; CHECK-LABEL: define void @normal_callee
+define i32 @normal_callee() {
+; CHECK-LABEL: define i32 @normal_callee
 ; CHECK-SAME: () #[[ATTR1:[0-9]+]] {
 ; CHECK-NEXT:  entry:
-; CHECK-NEXT:call void @inlined_body()
-; CHECK-NEXT:ret void
+; CHECK-NEXT:[[RES:%.*]] = call i32 @llvm.vscale.i32()
+; CHECK-NEXT:ret i32 [[RES]]
 ;
 entry:
-  call void @inlined_body()
-  ret void
+  %res = call i32 @llvm.vscale.i32()
+  ret i32 %res
 }
 
-define void @streaming_callee() "aarch64_pstate_sm_enabled" {
-; CHECK-LABEL: define void @streaming_callee
+define i32 @streaming_callee() "aarch64_pstate_sm_enabled" {
+; CHECK-LABEL: define i32 @streaming_callee
 ; CHECK-SAME: () #[[ATTR2:[0-9]+]] {
 ; CHECK-NEXT:  entry:
-; CHECK-NEXT:call void @inlined_body()
-; CHECK-NEXT:ret void
+; CHECK-NEXT:[[RES:%.*]] = call i32 @llvm.vscale.i32()
+; CHECK-NEXT:ret i32 [[RES]]
 ;
 entry:
-  call void @inlined_body()
-  ret void
+  %res = call i32 @llvm.vscale.i32()
+  ret i32 %res
 }
 
-define void @locally_streaming_callee() "aarch64_pstate_sm_body" {
-; CHECK-LABEL: define void @locally_streaming_callee
+define i32 @locally_streaming_callee() "aarch64_pstate_sm_body" {
+; CHECK-LABEL: define i32 @locally_streaming_callee
 ; CHECK-SAME: () #[[ATTR3:[0-9]+]] {
 ; CHECK-NEXT:  entry:
-; CHECK-NEXT:call void @inlined_body()
-; CHECK-NEXT:ret void
+; CHECK-NEXT:[[RES:%.*]] = call i32 @llvm.vscale.i32()
+; CHECK-NEXT:ret i32 [[RES]]
 ;
 entry:
-  call void @inlined_body()
-  ret void
+  %res = call i32 @llvm.vscale.i32()
+  ret i32 %res
 

[llvm-branch-commits] [clang] [Release Notes][FMV] Document support for rcpc3 and mops features. (PR #80152)

2024-01-31 Thread Pavel Iliin via llvm-branch-commits

https://github.com/ilinpv approved this pull request.


https://github.com/llvm/llvm-project/pull/80152
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] 5490201 - Revert "[AArch64] Convert concat(uhadd(a,b), uhadd(c,d)) to uhadd(concat(a,c)…"

2024-01-31 Thread via llvm-branch-commits

Author: Rin Dobrescu
Date: 2024-01-31T16:32:38Z
New Revision: 5490201f1bcf78b0d51bc5bd95e9eb5976d8891b

URL: 
https://github.com/llvm/llvm-project/commit/5490201f1bcf78b0d51bc5bd95e9eb5976d8891b
DIFF: 
https://github.com/llvm/llvm-project/commit/5490201f1bcf78b0d51bc5bd95e9eb5976d8891b.diff

LOG: Revert "[AArch64] Convert concat(uhadd(a,b), uhadd(c,d)) to 
uhadd(concat(a,c)…"

This reverts commit cf828aee2460058db5dacb1523797fe787486f4d.

Added: 


Modified: 
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
llvm/test/CodeGen/AArch64/avoid-pre-trunc.ll

Removed: 
llvm/test/CodeGen/AArch64/concat-vector-add-combine.ll



diff  --git a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp 
b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
index 242a2d35b93f3..823d181efc4f0 100644
--- a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
+++ b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
@@ -18242,11 +18242,18 @@ static SDValue performConcatVectorsCombine(SDNode *N,
   if (DCI.isBeforeLegalizeOps())
 return SDValue();
 
-  // Optimise concat_vectors of two [us]avgceils or [us]avgfloors with a 
128-bit
-  // destination size, combine into an avg of two contacts of the source
-  // vectors. eg: concat(uhadd(a,b), uhadd(c, d)) -> uhadd(concat(a, c),
-  // concat(b, d))
-  if (N->getNumOperands() == 2 && N0Opc == N1Opc && VT.is128BitVector() &&
+  // Optimise concat_vectors of two [us]avgceils or [us]avgfloors that use
+  // extracted subvectors from the same original vectors. Combine these into a
+  // single avg that operates on the two original vectors.
+  // avgceil is the target independant name for rhadd, avgfloor is a hadd.
+  // Example:
+  //  (concat_vectors (v8i8 (avgceils (extract_subvector (v16i8 OpA, <0>),
+  //   extract_subvector (v16i8 OpB, <0>))),
+  //  (v8i8 (avgceils (extract_subvector (v16i8 OpA, <8>),
+  //   extract_subvector (v16i8 OpB, <8>)
+  // ->
+  //  (v16i8(avgceils(v16i8 OpA, v16i8 OpB)))
+  if (N->getNumOperands() == 2 && N0Opc == N1Opc &&
   (N0Opc == ISD::AVGCEILU || N0Opc == ISD::AVGCEILS ||
N0Opc == ISD::AVGFLOORU || N0Opc == ISD::AVGFLOORS)) {
 SDValue N00 = N0->getOperand(0);
@@ -18254,9 +18261,32 @@ static SDValue performConcatVectorsCombine(SDNode *N,
 SDValue N10 = N1->getOperand(0);
 SDValue N11 = N1->getOperand(1);
 
-SDValue Concat0 = DAG.getNode(ISD::CONCAT_VECTORS, dl, VT, N00, N10);
-SDValue Concat1 = DAG.getNode(ISD::CONCAT_VECTORS, dl, VT, N01, N11);
-return DAG.getNode(N0Opc, dl, VT, Concat0, Concat1);
+EVT N00VT = N00.getValueType();
+EVT N10VT = N10.getValueType();
+
+if (N00->getOpcode() == ISD::EXTRACT_SUBVECTOR &&
+N01->getOpcode() == ISD::EXTRACT_SUBVECTOR &&
+N10->getOpcode() == ISD::EXTRACT_SUBVECTOR &&
+N11->getOpcode() == ISD::EXTRACT_SUBVECTOR && N00VT == N10VT) {
+  SDValue N00Source = N00->getOperand(0);
+  SDValue N01Source = N01->getOperand(0);
+  SDValue N10Source = N10->getOperand(0);
+  SDValue N11Source = N11->getOperand(0);
+
+  if (N00Source == N10Source && N01Source == N11Source &&
+  N00Source.getValueType() == VT && N01Source.getValueType() == VT) {
+assert(N0.getValueType() == N1.getValueType());
+
+uint64_t N00Index = N00.getConstantOperandVal(1);
+uint64_t N01Index = N01.getConstantOperandVal(1);
+uint64_t N10Index = N10.getConstantOperandVal(1);
+uint64_t N11Index = N11.getConstantOperandVal(1);
+
+if (N00Index == N01Index && N10Index == N11Index && N00Index == 0 &&
+N10Index == N00VT.getVectorNumElements())
+  return DAG.getNode(N0Opc, dl, VT, N00Source, N01Source);
+  }
+}
   }
 
   auto IsRSHRN = [](SDValue Shr) {

diff  --git a/llvm/test/CodeGen/AArch64/avoid-pre-trunc.ll 
b/llvm/test/CodeGen/AArch64/avoid-pre-trunc.ll
index c4de177176e33..24cce9a2b26b5 100644
--- a/llvm/test/CodeGen/AArch64/avoid-pre-trunc.ll
+++ b/llvm/test/CodeGen/AArch64/avoid-pre-trunc.ll
@@ -1,36 +1,75 @@
 ; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py 
UTC_ARGS: --version 4
 ; RUN: llc -mtriple=aarch64 < %s | FileCheck %s
 
+define i32 @lower_lshr(<4 x i32> %a, <4 x i32> %b, <4 x i32> %c, <4 x i32> %d, 
<4 x i32> %e, <4 x i32> %f, <4 x i32> %g, <4 x i32> %h) {
+; CHECK-LABEL: lower_lshr:
+; CHECK:   // %bb.0:
+; CHECK-NEXT:addv s0, v0.4s
+; CHECK-NEXT:addv s1, v1.4s
+; CHECK-NEXT:addv s4, v4.4s
+; CHECK-NEXT:addv s5, v5.4s
+; CHECK-NEXT:addv s2, v2.4s
+; CHECK-NEXT:addv s6, v6.4s
+; CHECK-NEXT:mov v0.s[1], v1.s[0]
+; CHECK-NEXT:addv s1, v3.4s
+; CHECK-NEXT:addv s3, v7.4s
+; CHECK-NEXT:mov v4.s[1], v5.s[0]
+; CHECK-NEXT:mov v0.s[2], v2.s[0]
+; CHECK-NEXT:mov v4.s[2], v6.s[0]
+; CHECK-NEXT:mov v0.s[3], v1.s[0]
+; CHECK-NEXT:

[llvm-branch-commits] [llvm] 977a511 - Revert "[MIRPrinter] Don't print space when there is no successor (#80143)"

2024-01-31 Thread via llvm-branch-commits

Author: Alex Bradbury
Date: 2024-01-31T17:00:31Z
New Revision: 977a5117995dc4f78f1f9232f13395d50775887c

URL: 
https://github.com/llvm/llvm-project/commit/977a5117995dc4f78f1f9232f13395d50775887c
DIFF: 
https://github.com/llvm/llvm-project/commit/977a5117995dc4f78f1f9232f13395d50775887c.diff

LOG: Revert "[MIRPrinter] Don't print space when there is no successor (#80143)"

This reverts commit b7738e275dc097f224d00434253b485288a6caff.

Added: 


Modified: 
llvm/lib/CodeGen/MIRPrinter.cpp
llvm/test/CodeGen/AArch64/GlobalISel/uaddo-8-16-bits.mir
llvm/test/CodeGen/AArch64/regalloc-last-chance-recolor-with-split.mir
llvm/test/CodeGen/AArch64/tail-dup-redundant-phi.mir
llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-trap.mir
llvm/test/CodeGen/ARM/constant-island-movwt.mir
llvm/test/CodeGen/X86/statepoint-invoke-ra-enter-at-end.mir
llvm/test/CodeGen/X86/statepoint-invoke-ra-remove-back-copies.mir
llvm/test/CodeGen/X86/statepoint-vreg-invoke.ll

Removed: 
llvm/test/CodeGen/MIR/X86/unreachable-block-print.mir



diff  --git a/llvm/lib/CodeGen/MIRPrinter.cpp b/llvm/lib/CodeGen/MIRPrinter.cpp
index b1ad035739a0d..b19a377f07448 100644
--- a/llvm/lib/CodeGen/MIRPrinter.cpp
+++ b/llvm/lib/CodeGen/MIRPrinter.cpp
@@ -694,9 +694,7 @@ void MIPrinter::print(const MachineBasicBlock &MBB) {
   // fallthrough.
   if ((!MBB.succ_empty() && !SimplifyMIR) || !canPredictProbs ||
   !canPredictSuccessors(MBB)) {
-OS.indent(2) << "successors:";
-if (!MBB.succ_empty())
-  OS << " ";
+OS.indent(2) << "successors: ";
 for (auto I = MBB.succ_begin(), E = MBB.succ_end(); I != E; ++I) {
   if (I != MBB.succ_begin())
 OS << ", ";

diff  --git a/llvm/test/CodeGen/AArch64/GlobalISel/uaddo-8-16-bits.mir 
b/llvm/test/CodeGen/AArch64/GlobalISel/uaddo-8-16-bits.mir
index f4366fb7888ea..0ab11b3ac558f 100644
--- a/llvm/test/CodeGen/AArch64/GlobalISel/uaddo-8-16-bits.mir
+++ b/llvm/test/CodeGen/AArch64/GlobalISel/uaddo-8-16-bits.mir
@@ -24,7 +24,7 @@ body: |
   ; CHECK-NEXT:   G_BR %bb.1
   ; CHECK-NEXT: {{  $}}
   ; CHECK-NEXT: bb.1:
-  ; CHECK-NEXT:   successors:
+  ; CHECK-NEXT:   successors:{{ $}}
   ; CHECK-NEXT: {{  $}}
   ; CHECK-NEXT:   G_INTRINSIC_W_SIDE_EFFECTS intrinsic(@llvm.trap)
   ; CHECK-NEXT: {{  $}}
@@ -78,7 +78,7 @@ body: |
   ; CHECK-NEXT:   G_BR %bb.1
   ; CHECK-NEXT: {{  $}}
   ; CHECK-NEXT: bb.1:
-  ; CHECK-NEXT:   successors:
+  ; CHECK-NEXT:   successors:{{ $}}
   ; CHECK-NEXT: {{  $}}
   ; CHECK-NEXT:   G_INTRINSIC_W_SIDE_EFFECTS intrinsic(@llvm.trap)
   ; CHECK-NEXT: {{  $}}
@@ -132,7 +132,7 @@ body: |
   ; CHECK-NEXT:   G_BR %bb.1
   ; CHECK-NEXT: {{  $}}
   ; CHECK-NEXT: bb.1:
-  ; CHECK-NEXT:   successors:
+  ; CHECK-NEXT:   successors:{{ $}}
   ; CHECK-NEXT: {{  $}}
   ; CHECK-NEXT:   G_INTRINSIC_W_SIDE_EFFECTS intrinsic(@llvm.trap)
   ; CHECK-NEXT: {{  $}}
@@ -204,7 +204,7 @@ body: |
   ; CHECK-NEXT:   G_BR %bb.1
   ; CHECK-NEXT: {{  $}}
   ; CHECK-NEXT: bb.1:
-  ; CHECK-NEXT:   successors:
+  ; CHECK-NEXT:   successors:{{ $}}
   ; CHECK-NEXT: {{  $}}
   ; CHECK-NEXT:   G_INTRINSIC_W_SIDE_EFFECTS intrinsic(@llvm.trap)
   ; CHECK-NEXT: {{  $}}
@@ -259,7 +259,7 @@ body: |
   ; CHECK-NEXT:   G_BR %bb.1
   ; CHECK-NEXT: {{  $}}
   ; CHECK-NEXT: bb.1:
-  ; CHECK-NEXT:   successors:
+  ; CHECK-NEXT:   successors:{{ $}}
   ; CHECK-NEXT: {{  $}}
   ; CHECK-NEXT:   G_INTRINSIC_W_SIDE_EFFECTS intrinsic(@llvm.trap)
   ; CHECK-NEXT: {{  $}}
@@ -315,7 +315,7 @@ body: |
   ; CHECK-NEXT:   G_BR %bb.1
   ; CHECK-NEXT: {{  $}}
   ; CHECK-NEXT: bb.1:
-  ; CHECK-NEXT:   successors:
+  ; CHECK-NEXT:   successors:{{ $}}
   ; CHECK-NEXT: {{  $}}
   ; CHECK-NEXT:   G_INTRINSIC_W_SIDE_EFFECTS intrinsic(@llvm.trap)
   ; CHECK-NEXT: {{  $}}
@@ -375,7 +375,7 @@ body: |
   ; CHECK-NEXT:   G_BR %bb.1
   ; CHECK-NEXT: {{  $}}
   ; CHECK-NEXT: bb.1:
-  ; CHECK-NEXT:   successors:
+  ; CHECK-NEXT:   successors:{{ $}}
   ; CHECK-NEXT: {{  $}}
   ; CHECK-NEXT:   G_INTRINSIC_W_SIDE_EFFECTS intrinsic(@llvm.trap)
   ; CHECK-NEXT: {{  $}}
@@ -510,7 +510,7 @@ body: |
   ; CHECK-NEXT:   G_BR %bb.3
   ; CHECK-NEXT: {{  $}}
   ; CHECK-NEXT: bb.2:
-  ; CHECK-NEXT:   successors:
+  ; CHECK-NEXT:   successors:{{ $}}
   ; CHECK-NEXT: {{  $}}
   ; CHECK-NEXT:   G_INTRINSIC_W_SIDE_EFFECTS intrinsic(@llvm.trap)
   ; CHECK-NEXT: {{  $}}
@@ -575,7 +575,7 @@ body: |
   ; CHECK-NEXT:   G_BR %bb.1
   ; CHECK-NEXT: {{  $}}
   ; CHECK-NEXT: bb.1:
-  ; CHECK-NEXT:   successors:
+  ; CHECK-NEXT:   successors:{{ $}}
   ; CHECK-NEXT: {{  $}}
   ; CHECK-NEXT:   G_INTRINSIC_W_SIDE_EFFECTS intrinsic(@llvm.trap)
   ; CHECK-NEXT: {{  $}}
@@ -632,7 +632,7 @@ body: |
   ; CHECK-NEXT:   G_BR %bb.1
   ; CHECK-NEXT: {{  $}}
   ; CHECK-NEXT: bb.1:
-  ; CHECK-NEXT:   successors:
+  ; CHECK-NEXT:   successors:{{ $}}

[llvm-branch-commits] [llvm] 27139bc - [SME] Stop RA from coalescing COPY instructions that transcend beyond smstart/smstop. (#78294)

2024-01-31 Thread via llvm-branch-commits

Author: Sander de Smalen
Date: 2024-01-31T15:41:15Z
New Revision: 27139bceb27d0b551e6e9d18fb91c703cbc3d7b8

URL: 
https://github.com/llvm/llvm-project/commit/27139bceb27d0b551e6e9d18fb91c703cbc3d7b8
DIFF: 
https://github.com/llvm/llvm-project/commit/27139bceb27d0b551e6e9d18fb91c703cbc3d7b8.diff

LOG: [SME] Stop RA from coalescing COPY instructions that transcend beyond 
smstart/smstop. (#78294)

This patch introduces a 'COALESCER_BARRIER', a pseudo node that expands to
a 'nop' but stops the register allocator from coalescing a COPY node when
its use/def crosses an SMSTART or SMSTOP instruction.

For example:

%0:fpr64 = COPY killed $d0
undef %2.dsub:zpr = COPY %0   // <- Do not coalesce this COPY
ADJCALLSTACKDOWN 0, 0
MSRpstatesvcrImm1 1, 0, csr_aarch64_smstartstop, implicit-def dead $d0
$d0 = COPY killed %0
BL @use_f64, csr_aarch64_aapcs

If the COPY would be coalesced, that would lead to:

$d0 = COPY killed %0

being replaced by:

$d0 = COPY killed %2.dsub

which means the whole ZPR reg would be live up to the call, causing the
MSRpstatesvcrImm1 (smstop) to spill/reload the ZPR register:

str q0, [sp]   // 16-byte Folded Spill
smstop  sm
ldr z0, [sp]   // 16-byte Folded Reload
bl  use_f64

which would be incorrect for two reasons:
1. The program may load more data than it has allocated.
2. If there are other SVE objects on the stack, the compiler might use the
   'mul vl' addressing modes to access the spill location.

By disabling the coalescing, we get the desired results:

str d0, [sp, #8]  // 8-byte Folded Spill
smstop  sm
ldr d0, [sp, #8]  // 8-byte Folded Reload
bl  use_f64

(cherry picked from commit dd736661826e215ac70ff3a4a4ccd75bda0c5ccd)

Added: 
llvm/test/CodeGen/AArch64/sme-pstate-sm-changing-call-disable-coalescing.ll

Modified: 
llvm/lib/Target/AArch64/AArch64ExpandPseudoInsts.cpp
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
llvm/lib/Target/AArch64/AArch64ISelLowering.h
llvm/lib/Target/AArch64/AArch64RegisterInfo.cpp
llvm/lib/Target/AArch64/AArch64SMEInstrInfo.td
llvm/test/CodeGen/AArch64/sme-disable-gisel-fisel.ll
llvm/test/CodeGen/AArch64/sme-streaming-body.ll
llvm/test/CodeGen/AArch64/sme-streaming-compatible-interface.ll
llvm/test/CodeGen/AArch64/sme-streaming-interface.ll

llvm/test/CodeGen/AArch64/sme-streaming-mode-changing-call-disable-stackslot-scavenging.ll

Removed: 




diff  --git a/llvm/lib/Target/AArch64/AArch64ExpandPseudoInsts.cpp 
b/llvm/lib/Target/AArch64/AArch64ExpandPseudoInsts.cpp
index 352c61d48e2ff..1af064b6de3cb 100644
--- a/llvm/lib/Target/AArch64/AArch64ExpandPseudoInsts.cpp
+++ b/llvm/lib/Target/AArch64/AArch64ExpandPseudoInsts.cpp
@@ -1544,6 +1544,12 @@ bool AArch64ExpandPseudo::expandMI(MachineBasicBlock 
&MBB,
NextMBBI = MBB.end(); // The NextMBBI iterator is invalidated.
  return true;
}
+   case AArch64::COALESCER_BARRIER_FPR16:
+   case AArch64::COALESCER_BARRIER_FPR32:
+   case AArch64::COALESCER_BARRIER_FPR64:
+   case AArch64::COALESCER_BARRIER_FPR128:
+ MI.eraseFromParent();
+ return true;
case AArch64::LD1B_2Z_IMM_PSEUDO:
  return expandMultiVecPseudo(
  MBB, MBBI, AArch64::ZPR2RegClass, AArch64::ZPR2StridedRegClass,

diff  --git a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp 
b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
index 332fb37655288..a59b1f2ec3c1c 100644
--- a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
+++ b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
@@ -2375,6 +2375,7 @@ const char 
*AArch64TargetLowering::getTargetNodeName(unsigned Opcode) const {
   switch ((AArch64ISD::NodeType)Opcode) {
   case AArch64ISD::FIRST_NUMBER:
 break;
+MAKE_CASE(AArch64ISD::COALESCER_BARRIER)
 MAKE_CASE(AArch64ISD::SMSTART)
 MAKE_CASE(AArch64ISD::SMSTOP)
 MAKE_CASE(AArch64ISD::RESTORE_ZA)
@@ -7154,13 +7155,18 @@ void AArch64TargetLowering::saveVarArgRegisters(CCState 
&CCInfo,
   }
 }
 
+static bool isPassedInFPR(EVT VT) {
+  return VT.isFixedLengthVector() ||
+ (VT.isFloatingPoint() && !VT.isScalableVector());
+}
+
 /// LowerCallResult - Lower the result values of a call into the
 /// appropriate copies out of appropriate physical registers.
 SDValue AArch64TargetLowering::LowerCallResult(
 SDValue Chain, SDValue InGlue, CallingConv::ID CallConv, bool isVarArg,
 const SmallVectorImpl &RVLocs, const SDLoc &DL,
 SelectionDAG &DAG, SmallVectorImpl &InVals, bool isThisReturn,
-SDValue ThisVal) const {
+SDValue ThisVal, bool RequiresSMChange) const {
   DenseMap CopiedRegs;
   // Copy all of the result registers out of their specified physreg.
   for (unsigned i = 0; i != RVLocs.size(); ++i) {
@@ -7205,6 +7211,10 @@ SDValue AArch64TargetLowering::LowerCallResult(
   break;
 }
 
+if (RequiresSMChange && isPassedInFPR(VA.ge

[llvm-branch-commits] [llvm] PR for llvm/llvm-project#80140 (PR #80141)

2024-01-31 Thread Tom Stellard via llvm-branch-commits

https://github.com/tstellar closed 
https://github.com/llvm/llvm-project/pull/80141
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [libc] [llvm] [compiler-rt] [clang] [libcxx] [flang] [RISCV] Support select optimization (PR #80124)

2024-01-31 Thread Alexander Richardson via llvm-branch-commits


@@ -0,0 +1,873 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
+; RUN: opt -select-optimize -mtriple=riscv64 -S < %s \
+; RUN:   | FileCheck %s --check-prefix=CHECK-SELECT
+; RUN: opt -select-optimize -mtriple=riscv64 -mattr=+enable-select-opt -S < %s 
\
+; RUN:   | FileCheck %s --check-prefix=CHECK-BRANCH
+; RUN: opt -select-optimize -mtriple=riscv64 
-mattr=+enable-select-opt,+predictable-select-expensive -S < %s \
+; RUN:   | FileCheck %s --check-prefix=CHECK-BRANCH
+
+%struct.st = type { i32, i64, ptr, ptr, i16, ptr, ptr, i64, i64 }
+
+; This test has a select at the end of if.then, which is better transformed to 
a branch on OoO cores.
+
+define void @replace(ptr nocapture noundef %newst, ptr noundef %t, ptr noundef 
%h, i64 noundef %c, i64 noundef %rc, i64 noundef %ma, i64 noundef %n) {

arichardson wrote:

You could add an llvm_unreachable() when the new optimization fires and feed
this test case to llvm-reduce to shrink it. Once that has been done, the
AArch64 test could ideally also be reduced, since it has a lot of noise that
does not seem relevant to the optimization.

https://github.com/llvm/llvm-project/pull/80124
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] Refactor recomputeLiveIns to operate on whole CFG (#79498) (PR #79641)

2024-01-31 Thread Oskar Wirga via llvm-branch-commits

https://github.com/oskarwirga updated 
https://github.com/llvm/llvm-project/pull/79641

>From b4901ff3e9f9fef59bf64fb9f7465f949a2ca32a Mon Sep 17 00:00:00 2001
From: Oskar Wirga <10386631+oskarwi...@users.noreply.github.com>
Date: Tue, 30 Jan 2024 19:33:04 -0800
Subject: [PATCH] Refactor recomputeLiveIns to converge on added
 MachineBasicBlocks (#79940)

This is a fix for the regression seen in
https://github.com/llvm/llvm-project/pull/79498

> Currently, the way that recomputeLiveIns works is that it will
recompute the livein registers for that MachineBasicBlock but it matters
what order you call recomputeLiveIn which can result in incorrect
register allocations down the line.

Now we do not recompute the entire CFG, but we do ensure that the newly
added MBBs reach convergence.

(cherry picked from commit ff4636a4ab00b633c15eb3942c26126ceb2662e6)
---
 llvm/include/llvm/CodeGen/LivePhysRegs.h  | 11 --
 llvm/include/llvm/CodeGen/MachineBasicBlock.h |  6 
 llvm/lib/CodeGen/BranchFolding.cpp|  6 ++--
 .../Target/AArch64/AArch64FrameLowering.cpp   |  6 ++--
 llvm/lib/Target/AArch64/AArch64InstrInfo.cpp  | 10 --
 llvm/lib/Target/ARM/ARMLowOverheadLoops.cpp   | 13 +++
 .../PowerPC/PPCExpandAtomicPseudoInsts.cpp| 14 +---
 llvm/lib/Target/PowerPC/PPCFrameLowering.cpp  | 13 ---
 .../Target/SystemZ/SystemZFrameLowering.cpp   | 12 ---
 llvm/lib/Target/X86/X86FrameLowering.cpp  | 15 
 .../SystemZ/branch-folder-hoist-livein.mir| 36 +--
 .../Thumb2/LowOverheadLoops/spillingmove.mir  |  2 +-
 12 files changed, 98 insertions(+), 46 deletions(-)

diff --git a/llvm/include/llvm/CodeGen/LivePhysRegs.h 
b/llvm/include/llvm/CodeGen/LivePhysRegs.h
index 76bb34d270a26..1d40b1cbb0eaa 100644
--- a/llvm/include/llvm/CodeGen/LivePhysRegs.h
+++ b/llvm/include/llvm/CodeGen/LivePhysRegs.h
@@ -193,11 +193,18 @@ void addLiveIns(MachineBasicBlock &MBB, const 
LivePhysRegs &LiveRegs);
 void computeAndAddLiveIns(LivePhysRegs &LiveRegs,
   MachineBasicBlock &MBB);
 
-/// Convenience function for recomputing live-in's for \p MBB.
-static inline void recomputeLiveIns(MachineBasicBlock &MBB) {
+/// Convenience function for recomputing live-in's for a MBB. Returns true if
+/// any changes were made.
+static inline bool recomputeLiveIns(MachineBasicBlock &MBB) {
   LivePhysRegs LPR;
+  auto oldLiveIns = MBB.getLiveIns();
+
   MBB.clearLiveIns();
   computeAndAddLiveIns(LPR, MBB);
+  MBB.sortUniqueLiveIns();
+
+  auto newLiveIns = MBB.getLiveIns();
+  return oldLiveIns != newLiveIns;
 }
 
 } // end namespace llvm
diff --git a/llvm/include/llvm/CodeGen/MachineBasicBlock.h 
b/llvm/include/llvm/CodeGen/MachineBasicBlock.h
index c84fd281c6a54..dc2035fa598c4 100644
--- a/llvm/include/llvm/CodeGen/MachineBasicBlock.h
+++ b/llvm/include/llvm/CodeGen/MachineBasicBlock.h
@@ -111,6 +111,10 @@ class MachineBasicBlock
 
 RegisterMaskPair(MCPhysReg PhysReg, LaneBitmask LaneMask)
 : PhysReg(PhysReg), LaneMask(LaneMask) {}
+
+bool operator==(const RegisterMaskPair &other) const {
+  return PhysReg == other.PhysReg && LaneMask == other.LaneMask;
+}
   };
 
 private:
@@ -473,6 +477,8 @@ class MachineBasicBlock
   /// Remove entry from the livein set and return iterator to the next.
   livein_iterator removeLiveIn(livein_iterator I);
 
+  std::vector getLiveIns() const { return LiveIns; }
+
   class liveout_iterator {
   public:
 using iterator_category = std::input_iterator_tag;
diff --git a/llvm/lib/CodeGen/BranchFolding.cpp 
b/llvm/lib/CodeGen/BranchFolding.cpp
index a9f78358e57b9..ecf7bc30913f5 100644
--- a/llvm/lib/CodeGen/BranchFolding.cpp
+++ b/llvm/lib/CodeGen/BranchFolding.cpp
@@ -2048,8 +2048,10 @@ bool 
BranchFolder::HoistCommonCodeInSuccs(MachineBasicBlock *MBB) {
   FBB->erase(FBB->begin(), FIB);
 
   if (UpdateLiveIns) {
-recomputeLiveIns(*TBB);
-recomputeLiveIns(*FBB);
+bool anyChange = false;
+do {
+  anyChange = recomputeLiveIns(*TBB) || recomputeLiveIns(*FBB);
+} while (anyChange);
   }
 
   ++NumHoist;
diff --git a/llvm/lib/Target/AArch64/AArch64FrameLowering.cpp 
b/llvm/lib/Target/AArch64/AArch64FrameLowering.cpp
index d55deec976009..732e787d2a321 100644
--- a/llvm/lib/Target/AArch64/AArch64FrameLowering.cpp
+++ b/llvm/lib/Target/AArch64/AArch64FrameLowering.cpp
@@ -4339,8 +4339,10 @@ AArch64FrameLowering::inlineStackProbeLoopExactMultiple(
   ExitMBB->transferSuccessorsAndUpdatePHIs(&MBB);
   MBB.addSuccessor(LoopMBB);
   // Update liveins.
-  recomputeLiveIns(*LoopMBB);
-  recomputeLiveIns(*ExitMBB);
+  bool anyChange = false;
+  do {
+anyChange = recomputeLiveIns(*ExitMBB) || recomputeLiveIns(*LoopMBB);
+  } while (anyChange);
 
   return ExitMBB->begin();
 }
diff --git a/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp 
b/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
index 13e9d9725cc2e..9b4bb7c88bc82 100644
--- a/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
+++ b/llvm/lib/Target/AArch64

[llvm-branch-commits] [llvm] [BPF] expand cttz, ctlz for i32, i64 (PR #73668)

2024-01-31 Thread Yingchi Long via llvm-branch-commits


@@ -0,0 +1,304 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py 
UTC_ARGS: --version 4
+; RUN: llc < %s -march=bpf | FileCheck %s
+
+; test that we can expand CTTZ & CTLZ
+
+declare i32 @llvm.cttz.i32(i32, i1)
+
+define i32 @cttz_i32_zdef(i32 %a) {
+; CHECK-LABEL: cttz_i32_zdef:

inclyc wrote:

> Question, how stable are these expansions?

They are expanded by common codegen functions, and thus might change without
any modifications in the BPF backend.

> Previously compiler would just error out, maybe just insert some dummy checks 
> that verify that something is returned from these functions? 

I think for compiler "CodeGen" tests it is common to assert on the generated asm
without actually running it. For example, for AArch64 backend developers, a
"trick" is to write some code on an x86 machine and test the assembly without an
AArch64 emulator (e.g. qemu-user).

So IMHO, for LLVM codegen we can just `CHECK-NEXT:` these asm lines. (They can
be generated automatically.)

> And we can add a few tests on kernel side that verify runtime result.

Yes, it would be nice if we could test the "runtime" behavior.
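
For instance (just a sketch, not an existing selftest), such a check could
compare the compiler builtin against a naive reference implementation:

```c++
static int ref_cttz32(unsigned v) {        // naive reference: count trailing zeros
  int n = 0;
  while ((v & 1u) == 0) { v >>= 1; ++n; }
  return n;
}

bool check_cttz(unsigned v) {
  // __builtin_ctz is undefined for 0, so only compare non-zero inputs.
  return v == 0 || __builtin_ctz(v) == ref_cttz32(v);
}
```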

https://github.com/llvm/llvm-project/pull/73668
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BPF] expand cttz, ctlz for i32, i64 (PR #73668)

2024-01-31 Thread via llvm-branch-commits


@@ -0,0 +1,304 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py 
UTC_ARGS: --version 4
+; RUN: llc < %s -march=bpf | FileCheck %s
+
+; test that we can expand CTTZ & CTLZ
+
+declare i32 @llvm.cttz.i32(i32, i1)
+
+define i32 @cttz_i32_zdef(i32 %a) {
+; CHECK-LABEL: cttz_i32_zdef:

eddyz87 wrote:

> I think for compiler "CodeGen" tests it is common to assert on the generated asm
> without actually running it. For example, for AArch64 backend developers, a
> "trick" is to write some code on an x86 machine and test the assembly without an
> AArch64 emulator (e.g. qemu-user).

Actually, I meant to skip matching the full expansion and just do something 
like below:

```
; CHECK-LABEL: cttz_i32_zdef:
; CHECK: r0 =
; CHECK: exit
```

(As we don't really test the expansion logic, we just test that some expansion 
is applied).

> So IMHO, for LLVM codegen we can just CHECK-NEXT: these asm lines. (They can
> be generated automatically.)

You mean `utils/update_llc_test_checks`, right?
Well, maybe that's good enough; at least the tests are easy to adjust if
anything changes.

https://github.com/llvm/llvm-project/pull/73668
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BPF] expand cttz, ctlz for i32, i64 (PR #73668)

2024-01-31 Thread Yingchi Long via llvm-branch-commits


@@ -0,0 +1,304 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py 
UTC_ARGS: --version 4
+; RUN: llc < %s -march=bpf | FileCheck %s
+
+; test that we can expand CTTZ & CTLZ
+
+declare i32 @llvm.cttz.i32(i32, i1)
+
+define i32 @cttz_i32_zdef(i32 %a) {
+; CHECK-LABEL: cttz_i32_zdef:

inclyc wrote:

> (As we don't really test the expansion logic, we just test that some 
> expansion is applied)

Hmm, I'm worried that if we only do tests like

```
; CHECK-LABEL: cttz_i32_zdef:
; CHECK: r0 =
; CHECK: exit
```

how can this catch BPF backend changes such as switching `setOperationAction()`
to `Custom` and doing some custom lowering, if that custom lowering happens to
be incorrect?

I think we actually should test the implementation details of "how to expand /
custom lower". For example, if someone really changes the expansion logic, these
tests will have to be regenerated and we will see such patches (they touch our
backend tests).

https://github.com/llvm/llvm-project/pull/73668
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BPF] expand cttz, ctlz for i32, i64 (PR #73668)

2024-01-31 Thread via llvm-branch-commits


@@ -0,0 +1,304 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py 
UTC_ARGS: --version 4
+; RUN: llc < %s -march=bpf | FileCheck %s
+
+; test that we can expand CTTZ & CTLZ
+
+declare i32 @llvm.cttz.i32(i32, i1)
+
+define i32 @cttz_i32_zdef(i32 %a) {
+; CHECK-LABEL: cttz_i32_zdef:

eddyz87 wrote:

Ok, fair enough.

https://github.com/llvm/llvm-project/pull/73668
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BPF] expand cttz, ctlz for i32, i64 (PR #73668)

2024-01-31 Thread via llvm-branch-commits

https://github.com/eddyz87 approved this pull request.


https://github.com/llvm/llvm-project/pull/73668
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [libcxx] [llvm] [clang] [clang-tools-extra] [lld] [flang] [libc] [compiler-rt] [libcxxabi] [lldb] [SelectOpt] Print instruction instead of pointer (PR #80125)

2024-01-31 Thread Wang Pengcheng via llvm-branch-commits

https://github.com/wangpc-pp updated 
https://github.com/llvm/llvm-project/pull/80125


___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [clang-tools-extra] [flang] [libcxx] [lldb] [libcxxabi] [libc] [compiler-rt] [lld] [clang] [SelectOpt] Print instruction instead of pointer (PR #80125)

2024-01-31 Thread Wang Pengcheng via llvm-branch-commits

https://github.com/wangpc-pp updated 
https://github.com/llvm/llvm-project/pull/80125


___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] Revert "[RISCV] Recurse on first operand of two operand shuffles (#79180)" (PR #80238)

2024-01-31 Thread Philip Reames via llvm-branch-commits

https://github.com/preames created 
https://github.com/llvm/llvm-project/pull/80238

This reverts commit bdc41106ee48dce59c500c9a3957af947f30c8c3 on the
release/18.x branch.  This change was the first in a mini-series
and while I'm not aware of any particular problem from having it on
its own in the branch, it seems safer to ship with the previous
known good state.

@tstellar This is my first backport in the new process, so please bear with me 
and double check I got all pieces of this right.

>From 98e43e0054ab81e3455011933e1bdf64bd59e148 Mon Sep 17 00:00:00 2001
From: Philip Reames 
Date: Wed, 31 Jan 2024 14:44:39 -0800
Subject: [PATCH] Revert "[RISCV] Recurse on first operand of two operand
 shuffles (#79180)"

This reverts commit bdc41106ee48dce59c500c9a3957af947f30c8c3 on the
release/18.x branch.  This change was the first in a mini-series
and while I'm not aware of any particular problem from having it on
its own in the branch, it seems safer to ship with the previous
known good state.
---
 llvm/lib/Target/RISCV/RISCVISelLowering.cpp   |  92 ++---
 .../RISCV/rvv/fixed-vectors-fp-interleave.ll  |  41 +-
 .../RISCV/rvv/fixed-vectors-int-interleave.ll |  63 +--
 .../RISCV/rvv/fixed-vectors-int-shuffles.ll   |  43 +-
 .../rvv/fixed-vectors-interleaved-access.ll   | 387 +-
 .../rvv/fixed-vectors-shuffle-transpose.ll| 128 +++---
 6 files changed, 407 insertions(+), 347 deletions(-)

diff --git a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp 
b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
index 47c6cd6e5487b..c8f7b5c35a381 100644
--- a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
+++ b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
@@ -5033,60 +5033,56 @@ static SDValue lowerVECTOR_SHUFFLE(SDValue Op, 
SelectionDAG &DAG,
   MVT IndexContainerVT =
   ContainerVT.changeVectorElementType(IndexVT.getScalarType());
 
-  // Base case for the recursion just below - handle the worst case
-  // single source permutation.  Note that all the splat variants
-  // are handled above.
-  if (V2.isUndef()) {
+  SDValue Gather;
+  // TODO: This doesn't trigger for i64 vectors on RV32, since there we
+  // encounter a bitcasted BUILD_VECTOR with low/high i32 values.
+  if (SDValue SplatValue = DAG.getSplatValue(V1, /*LegalTypes*/ true)) {
+Gather = lowerScalarSplat(SDValue(), SplatValue, VL, ContainerVT, DL, DAG,
+  Subtarget);
+  } else {
 V1 = convertToScalableVector(ContainerVT, V1, DAG, Subtarget);
-SDValue LHSIndices = DAG.getBuildVector(IndexVT, DL, GatherIndicesLHS);
-LHSIndices = convertToScalableVector(IndexContainerVT, LHSIndices, DAG,
- Subtarget);
-SDValue Gather = DAG.getNode(GatherVVOpc, DL, ContainerVT, V1, LHSIndices,
- DAG.getUNDEF(ContainerVT), TrueMask, VL);
-return convertFromScalableVector(VT, Gather, DAG, Subtarget);
-  }
-
-  // Translate the gather index we computed above (and possibly swapped)
-  // back to a shuffle mask.  This step should disappear once we complete
-  // the migration to recursive design.
-  SmallVector ShuffleMaskLHS;
-  ShuffleMaskLHS.reserve(GatherIndicesLHS.size());
-  for (SDValue GatherIndex : GatherIndicesLHS) {
-if (GatherIndex.isUndef()) {
-  ShuffleMaskLHS.push_back(-1);
-  continue;
+// If only one index is used, we can use a "splat" vrgather.
+// TODO: We can splat the most-common index and fix-up any stragglers, if
+// that's beneficial.
+if (LHSIndexCounts.size() == 1) {
+  int SplatIndex = LHSIndexCounts.begin()->getFirst();
+  Gather = DAG.getNode(GatherVXOpc, DL, ContainerVT, V1,
+   DAG.getConstant(SplatIndex, DL, XLenVT),
+   DAG.getUNDEF(ContainerVT), TrueMask, VL);
+} else {
+  SDValue LHSIndices = DAG.getBuildVector(IndexVT, DL, GatherIndicesLHS);
+  LHSIndices =
+  convertToScalableVector(IndexContainerVT, LHSIndices, DAG, 
Subtarget);
+
+  Gather = DAG.getNode(GatherVVOpc, DL, ContainerVT, V1, LHSIndices,
+   DAG.getUNDEF(ContainerVT), TrueMask, VL);
 }
-auto *IdxC = cast(GatherIndex);
-ShuffleMaskLHS.push_back(IdxC->getZExtValue());
   }
 
-  // Recursively invoke lowering for the LHS as if there were no RHS.
-  // This allows us to leverage all of our single source permute tricks.
-  SDValue Gather =
-DAG.getVectorShuffle(VT, DL, V1, DAG.getUNDEF(VT), ShuffleMaskLHS);
-  Gather = convertToScalableVector(ContainerVT, Gather, DAG, Subtarget);
+  // If a second vector operand is used by this shuffle, blend it in with an
+  // additional vrgather.
+  if (!V2.isUndef()) {
+V2 = convertToScalableVector(ContainerVT, V2, DAG, Subtarget);
 
-  // Blend in second vector source with an additional vrgather.
-  V2 = convertToScalableVector(ContainerVT, V2, DAG, Subtarget);
+MVT MaskContainerVT = ContainerVT.changeVectorElementType(MVT::i1);
+SelectMask =
+convert

[llvm-branch-commits] [llvm] Revert "[RISCV] Recurse on first operand of two operand shuffles (#79180)" (PR #80238)

2024-01-31 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-backend-risc-v

Author: Philip Reames (preames)


Changes

This reverts commit bdc41106ee48dce59c500c9a3957af947f30c8c3 on the
release/18.x branch.  This change was the first in a mini-series
and while I'm not aware of any particular problem from having it on
its own in the branch, it seems safer to ship with the previous
known good state.

@tstellar This is my first backport in the new process, so please bear 
with me and double check I got all pieces of this right.

---

Patch is 54.12 KiB, truncated to 20.00 KiB below, full version: 
https://github.com/llvm/llvm-project/pull/80238.diff


6 Files Affected:

- (modified) llvm/lib/Target/RISCV/RISCVISelLowering.cpp (+44-48) 
- (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-fp-interleave.ll 
(+27-14) 
- (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-int-interleave.ll 
(+35-28) 
- (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-int-shuffles.ll (+27-16) 
- (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-interleaved-access.ll 
(+194-193) 
- (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-shuffle-transpose.ll 
(+80-48) 


``diff
diff --git a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp 
b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
index 47c6cd6e5487b..c8f7b5c35a381 100644
--- a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
+++ b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
@@ -5033,60 +5033,56 @@ static SDValue lowerVECTOR_SHUFFLE(SDValue Op, 
SelectionDAG &DAG,
   MVT IndexContainerVT =
   ContainerVT.changeVectorElementType(IndexVT.getScalarType());
 
-  // Base case for the recursion just below - handle the worst case
-  // single source permutation.  Note that all the splat variants
-  // are handled above.
-  if (V2.isUndef()) {
+  SDValue Gather;
+  // TODO: This doesn't trigger for i64 vectors on RV32, since there we
+  // encounter a bitcasted BUILD_VECTOR with low/high i32 values.
+  if (SDValue SplatValue = DAG.getSplatValue(V1, /*LegalTypes*/ true)) {
+Gather = lowerScalarSplat(SDValue(), SplatValue, VL, ContainerVT, DL, DAG,
+  Subtarget);
+  } else {
 V1 = convertToScalableVector(ContainerVT, V1, DAG, Subtarget);
-SDValue LHSIndices = DAG.getBuildVector(IndexVT, DL, GatherIndicesLHS);
-LHSIndices = convertToScalableVector(IndexContainerVT, LHSIndices, DAG,
- Subtarget);
-SDValue Gather = DAG.getNode(GatherVVOpc, DL, ContainerVT, V1, LHSIndices,
- DAG.getUNDEF(ContainerVT), TrueMask, VL);
-return convertFromScalableVector(VT, Gather, DAG, Subtarget);
-  }
-
-  // Translate the gather index we computed above (and possibly swapped)
-  // back to a shuffle mask.  This step should disappear once we complete
-  // the migration to recursive design.
-  SmallVector ShuffleMaskLHS;
-  ShuffleMaskLHS.reserve(GatherIndicesLHS.size());
-  for (SDValue GatherIndex : GatherIndicesLHS) {
-if (GatherIndex.isUndef()) {
-  ShuffleMaskLHS.push_back(-1);
-  continue;
+// If only one index is used, we can use a "splat" vrgather.
+// TODO: We can splat the most-common index and fix-up any stragglers, if
+// that's beneficial.
+if (LHSIndexCounts.size() == 1) {
+  int SplatIndex = LHSIndexCounts.begin()->getFirst();
+  Gather = DAG.getNode(GatherVXOpc, DL, ContainerVT, V1,
+   DAG.getConstant(SplatIndex, DL, XLenVT),
+   DAG.getUNDEF(ContainerVT), TrueMask, VL);
+} else {
+  SDValue LHSIndices = DAG.getBuildVector(IndexVT, DL, GatherIndicesLHS);
+  LHSIndices =
+  convertToScalableVector(IndexContainerVT, LHSIndices, DAG, 
Subtarget);
+
+  Gather = DAG.getNode(GatherVVOpc, DL, ContainerVT, V1, LHSIndices,
+   DAG.getUNDEF(ContainerVT), TrueMask, VL);
 }
-auto *IdxC = cast(GatherIndex);
-ShuffleMaskLHS.push_back(IdxC->getZExtValue());
   }
 
-  // Recursively invoke lowering for the LHS as if there were no RHS.
-  // This allows us to leverage all of our single source permute tricks.
-  SDValue Gather =
-DAG.getVectorShuffle(VT, DL, V1, DAG.getUNDEF(VT), ShuffleMaskLHS);
-  Gather = convertToScalableVector(ContainerVT, Gather, DAG, Subtarget);
+  // If a second vector operand is used by this shuffle, blend it in with an
+  // additional vrgather.
+  if (!V2.isUndef()) {
+V2 = convertToScalableVector(ContainerVT, V2, DAG, Subtarget);
 
-  // Blend in second vector source with an additional vrgather.
-  V2 = convertToScalableVector(ContainerVT, V2, DAG, Subtarget);
+MVT MaskContainerVT = ContainerVT.changeVectorElementType(MVT::i1);
+SelectMask =
+convertToScalableVector(MaskContainerVT, SelectMask, DAG, Subtarget);
 
-  MVT MaskContainerVT = ContainerVT.changeVectorElementType(MVT::i1);
-  SelectMask =
-convertToScalableVector(MaskContainerVT, SelectMask, DAG, Subtarget);
-
-  // If only one index is used,

[llvm-branch-commits] [flang] [libc] [compiler-rt] [clang] [libcxx] [llvm] [RISCV] Support select optimization (PR #80124)

2024-01-31 Thread Philip Reames via llvm-branch-commits

preames wrote:

> and the measurement data still stands for RISCV.

Please give the measurement data in this review or a direct link to it.  I 
tried searching for it, and did not immediately find it.

https://github.com/llvm/llvm-project/pull/80124
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [flang] [clang] [libcxx] [libc] [compiler-rt] [llvm] [RISCV] Support select optimization (PR #80124)

2024-01-31 Thread Wang Pengcheng via llvm-branch-commits

wangpc-pp wrote:

> > and the measurement data still stands for RISCV.
> 
> Please give the measurement data in this review or a direct link to it. I 
> tried searching for it, and did not immediately find it.

It's in the Phabricator link (https://reviews.llvm.org/D138990): 

> The headline numbers are these for SPEC2017 on a Neoverse N1:
>
> 500.perlbench_r   -0.12%
> 502.gcc_r          0.02%
> 505.mcf_r          6.02%
> 520.omnetpp_r      0.32%
> 523.xalancbmk_r    0.20%
> 525.x264_r         0.02%
> 531.deepsjeng_r    0.00%
> 541.leela_r       -0.09%
> 548.exchange2_r    0.00%
> 557.xz_r          -0.20%
>
> Running benchmarks with a combination of the llvm-test-suite plus several 
> versions of SPEC gave between a 0.2% and 0.4% geomean improvement depending 
> on the core/run. The instruction count went down by 0.1% too.

The performance gain depends on the core implementation. For RISC-V, the 
subtarget feature `FeatureEnableSelectOptimize` can be added to a core's tune 
features where it proves beneficial.
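
To make the trade-off concrete, a small conceptual illustration in plain C++ (not LLVM code, and not taken from this PR; the function names are invented for the sketch) of what the select-optimization pass weighs:

```cpp
// Conceptual sketch only: a select computes both operands and then picks one,
// while a branch can skip the expensive side entirely when the condition is
// well predicted.  That is why the win depends on the core implementation.
#include <cmath>
#include <cstdio>

static double expensive(double x) { return std::sqrt(std::fabs(x)) * 3.0; }

// Select-style: both values are available before the choice (cmov-friendly).
static double pick_select(double x, bool rare) {
  double heavy = expensive(x);
  return rare ? heavy : x;
}

// Branch-style: the expensive work runs only on the rare path; this is the
// shape the select-optimization pass produces when it judges it profitable.
static double pick_branch(double x, bool rare) {
  if (rare)
    return expensive(x);  // only evaluated when the condition is true
  return x;
}

int main() {
  std::printf("%f %f\n", pick_select(2.0, false), pick_branch(2.0, false));
  return 0;
}
```

Whether the branch form wins depends on branch prediction and the cost of the operands, which is why exposing it as a per-core tune feature is more sensible than enabling it unconditionally.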

https://github.com/llvm/llvm-project/pull/80124
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [compiler-rt] [clang] [lldb] [flang] [clang-tools-extra] [lld] [libcxxabi] [libcxx] [libc] [SelectOpt] Print instruction instead of pointer (PR #80125)

2024-01-31 Thread Wang Pengcheng via llvm-branch-commits

https://github.com/wangpc-pp closed 
https://github.com/llvm/llvm-project/pull/80125
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang-tools-extra] [libcxx] [libcxxabi] [libc] [flang] [llvm] [lldb] [clang] [compiler-rt] [lld] [SelectOpt] Print instruction instead of pointer (PR #80125)

2024-01-31 Thread Wang Pengcheng via llvm-branch-commits

wangpc-pp wrote:

Committed as 995d21bc6ff2220b2887cf9640d936eb99b3c617.
Somehow `spr` failed with an error, so I had to land it manually:
```
  #️⃣   Pull Request #80125
  🛫  Getting started...
  🛑  GitHub: Validation Failed
  Documentation URL: https://docs.github.com/rest/pulls/pulls#update-a-pull-request
  Errors:
  - {"code":"invalid","field":"base","message":"Proposed base branch 'refs/heads/main' is invalid","resource":"PullRequest"}
```

https://github.com/llvm/llvm-project/pull/80125
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] PR for llvm/llvm-project#80137 (PR #80138)

2024-01-31 Thread via llvm-branch-commits

https://github.com/llvmbot updated 
https://github.com/llvm/llvm-project/pull/80138

>From e502141a420a65a4ea534274e07a023a3801fa76 Mon Sep 17 00:00:00 2001
From: Sander de Smalen 
Date: Wed, 31 Jan 2024 11:38:29 +
Subject: [PATCH] [AArch64][SME] Fix inlining bug introduced in #78703 (#79994)

Calling a `__arm_locally_streaming` function from a function that
is not a streaming-SVE function would lead to incorrect inlining.

The issue didn't surface because the tests were not testing what
they were supposed to test.

(cherry picked from commit 3abf55a68caefd45042c27b73a658c638afbbb8b)
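
A minimal source-level sketch of the scenario described above, using the ACLE SME keyword and assuming an SME-enabled target (e.g. -march=armv9-a+sme); the function names and flags are illustrative, not taken from the patch:

```cpp
// callee() has a non-streaming interface but a streaming body, so deciding
// inline compatibility from the interface alone is not enough: inlining it
// into a non-streaming caller has to account for the streaming-mode change
// that its body requires.
__arm_locally_streaming int callee(void) { return 42; }

int caller(void) {   // plain, non-streaming function
  return callee();   // the case the inline-compatibility check must handle
}
```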
---
 .../AArch64/AArch64TargetTransformInfo.cpp|  17 +-
 .../Inline/AArch64/sme-pstatesm-attrs.ll  | 369 +-
 2 files changed, 195 insertions(+), 191 deletions(-)

diff --git a/llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp 
b/llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
index d611338fc268f..992b11da7 100644
--- a/llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
+++ b/llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
@@ -233,15 +233,20 @@ static bool hasPossibleIncompatibleOps(const Function *F) 
{
 
 bool AArch64TTIImpl::areInlineCompatible(const Function *Caller,
  const Function *Callee) const {
-  SMEAttrs CallerAttrs(*Caller);
-  SMEAttrs CalleeAttrs(*Callee);
+  SMEAttrs CallerAttrs(*Caller), CalleeAttrs(*Callee);
+
+  // When inlining, we should consider the body of the function, not the
+  // interface.
+  if (CalleeAttrs.hasStreamingBody()) {
+CalleeAttrs.set(SMEAttrs::SM_Compatible, false);
+CalleeAttrs.set(SMEAttrs::SM_Enabled, true);
+  }
+
   if (CalleeAttrs.hasNewZABody())
 return false;
 
   if (CallerAttrs.requiresLazySave(CalleeAttrs) ||
-  (CallerAttrs.requiresSMChange(CalleeAttrs) &&
-   (!CallerAttrs.hasStreamingInterfaceOrBody() ||
-!CalleeAttrs.hasStreamingBody( {
+  CallerAttrs.requiresSMChange(CalleeAttrs)) {
 if (hasPossibleIncompatibleOps(Callee))
   return false;
   }
@@ -4062,4 +4067,4 @@ bool 
AArch64TTIImpl::shouldTreatInstructionLikeSelect(const Instruction *I) {
   cast(I->getNextNode())->isUnconditional())
 return true;
   return BaseT::shouldTreatInstructionLikeSelect(I);
-}
\ No newline at end of file
+}
diff --git a/llvm/test/Transforms/Inline/AArch64/sme-pstatesm-attrs.ll 
b/llvm/test/Transforms/Inline/AArch64/sme-pstatesm-attrs.ll
index d6b1f3ef45e76..7723e6c664c3d 100644
--- a/llvm/test/Transforms/Inline/AArch64/sme-pstatesm-attrs.ll
+++ b/llvm/test/Transforms/Inline/AArch64/sme-pstatesm-attrs.ll
@@ -1,71 +1,70 @@
 ; NOTE: Assertions have been autogenerated by utils/update_test_checks.py 
UTC_ARGS: --version 2
 ; RUN: opt < %s -mtriple=aarch64-unknown-linux-gnu -mattr=+sme -S 
-passes=inline | FileCheck %s
 
-declare void @inlined_body() "aarch64_pstate_sm_compatible";
+declare i32 @llvm.vscale.i32()
 
-; Define some functions that will be called by the functions below.
-; These just call a '...body()' function. If we see the call to one of
-; these functions being replaced by '...body()', then we know it has been
-; inlined.
+; Define some functions that merely call llvm.vscale.i32(), which will be 
called
+; by the other functions below. If we see the call to one of these functions
+; being replaced by 'llvm.vscale()', then we know it has been inlined.
 
-define void @normal_callee() {
-; CHECK-LABEL: define void @normal_callee
+define i32 @normal_callee() {
+; CHECK-LABEL: define i32 @normal_callee
 ; CHECK-SAME: () #[[ATTR1:[0-9]+]] {
 ; CHECK-NEXT:  entry:
-; CHECK-NEXT:call void @inlined_body()
-; CHECK-NEXT:ret void
+; CHECK-NEXT:[[RES:%.*]] = call i32 @llvm.vscale.i32()
+; CHECK-NEXT:ret i32 [[RES]]
 ;
 entry:
-  call void @inlined_body()
-  ret void
+  %res = call i32 @llvm.vscale.i32()
+  ret i32 %res
 }
 
-define void @streaming_callee() "aarch64_pstate_sm_enabled" {
-; CHECK-LABEL: define void @streaming_callee
+define i32 @streaming_callee() "aarch64_pstate_sm_enabled" {
+; CHECK-LABEL: define i32 @streaming_callee
 ; CHECK-SAME: () #[[ATTR2:[0-9]+]] {
 ; CHECK-NEXT:  entry:
-; CHECK-NEXT:call void @inlined_body()
-; CHECK-NEXT:ret void
+; CHECK-NEXT:[[RES:%.*]] = call i32 @llvm.vscale.i32()
+; CHECK-NEXT:ret i32 [[RES]]
 ;
 entry:
-  call void @inlined_body()
-  ret void
+  %res = call i32 @llvm.vscale.i32()
+  ret i32 %res
 }
 
-define void @locally_streaming_callee() "aarch64_pstate_sm_body" {
-; CHECK-LABEL: define void @locally_streaming_callee
+define i32 @locally_streaming_callee() "aarch64_pstate_sm_body" {
+; CHECK-LABEL: define i32 @locally_streaming_callee
 ; CHECK-SAME: () #[[ATTR3:[0-9]+]] {
 ; CHECK-NEXT:  entry:
-; CHECK-NEXT:call void @inlined_body()
-; CHECK-NEXT:ret void
+; CHECK-NEXT:[[RES:%.*]] = call i32 @llvm.vscale.i32()
+; CHECK-NEXT:ret i32 [[RES]]
 ;
 entry:
-  call void @inlined_body()
-  ret void
+  %res = call i32 @llvm.vscale.i32()
+  ret i32 %res
 

[llvm-branch-commits] [llvm] e502141 - [AArch64][SME] Fix inlining bug introduced in #78703 (#79994)

2024-01-31 Thread Tom Stellard via llvm-branch-commits

Author: Sander de Smalen
Date: 2024-01-31T22:29:42-08:00
New Revision: e502141a420a65a4ea534274e07a023a3801fa76

URL: 
https://github.com/llvm/llvm-project/commit/e502141a420a65a4ea534274e07a023a3801fa76
DIFF: 
https://github.com/llvm/llvm-project/commit/e502141a420a65a4ea534274e07a023a3801fa76.diff

LOG: [AArch64][SME] Fix inlining bug introduced in #78703 (#79994)

Calling a `__arm_locally_streaming` function from a function that
is not a streaming-SVE function would lead to incorrect inlining.

The issue didn't surface because the tests were not testing what
they were supposed to test.

(cherry picked from commit 3abf55a68caefd45042c27b73a658c638afbbb8b)

Added: 


Modified: 
llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
llvm/test/Transforms/Inline/AArch64/sme-pstatesm-attrs.ll

Removed: 




diff  --git a/llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp 
b/llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
index d611338fc268f..992b11da7 100644
--- a/llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
+++ b/llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
@@ -233,15 +233,20 @@ static bool hasPossibleIncompatibleOps(const Function *F) 
{
 
 bool AArch64TTIImpl::areInlineCompatible(const Function *Caller,
  const Function *Callee) const {
-  SMEAttrs CallerAttrs(*Caller);
-  SMEAttrs CalleeAttrs(*Callee);
+  SMEAttrs CallerAttrs(*Caller), CalleeAttrs(*Callee);
+
+  // When inlining, we should consider the body of the function, not the
+  // interface.
+  if (CalleeAttrs.hasStreamingBody()) {
+CalleeAttrs.set(SMEAttrs::SM_Compatible, false);
+CalleeAttrs.set(SMEAttrs::SM_Enabled, true);
+  }
+
   if (CalleeAttrs.hasNewZABody())
 return false;
 
   if (CallerAttrs.requiresLazySave(CalleeAttrs) ||
-  (CallerAttrs.requiresSMChange(CalleeAttrs) &&
-   (!CallerAttrs.hasStreamingInterfaceOrBody() ||
-!CalleeAttrs.hasStreamingBody( {
+  CallerAttrs.requiresSMChange(CalleeAttrs)) {
 if (hasPossibleIncompatibleOps(Callee))
   return false;
   }
@@ -4062,4 +4067,4 @@ bool 
AArch64TTIImpl::shouldTreatInstructionLikeSelect(const Instruction *I) {
   cast(I->getNextNode())->isUnconditional())
 return true;
   return BaseT::shouldTreatInstructionLikeSelect(I);
-}
\ No newline at end of file
+}

diff  --git a/llvm/test/Transforms/Inline/AArch64/sme-pstatesm-attrs.ll 
b/llvm/test/Transforms/Inline/AArch64/sme-pstatesm-attrs.ll
index d6b1f3ef45e76..7723e6c664c3d 100644
--- a/llvm/test/Transforms/Inline/AArch64/sme-pstatesm-attrs.ll
+++ b/llvm/test/Transforms/Inline/AArch64/sme-pstatesm-attrs.ll
@@ -1,71 +1,70 @@
 ; NOTE: Assertions have been autogenerated by utils/update_test_checks.py 
UTC_ARGS: --version 2
 ; RUN: opt < %s -mtriple=aarch64-unknown-linux-gnu -mattr=+sme -S 
-passes=inline | FileCheck %s
 
-declare void @inlined_body() "aarch64_pstate_sm_compatible";
+declare i32 @llvm.vscale.i32()
 
-; Define some functions that will be called by the functions below.
-; These just call a '...body()' function. If we see the call to one of
-; these functions being replaced by '...body()', then we know it has been
-; inlined.
+; Define some functions that merely call llvm.vscale.i32(), which will be 
called
+; by the other functions below. If we see the call to one of these functions
+; being replaced by 'llvm.vscale()', then we know it has been inlined.
 
-define void @normal_callee() {
-; CHECK-LABEL: define void @normal_callee
+define i32 @normal_callee() {
+; CHECK-LABEL: define i32 @normal_callee
 ; CHECK-SAME: () #[[ATTR1:[0-9]+]] {
 ; CHECK-NEXT:  entry:
-; CHECK-NEXT:call void @inlined_body()
-; CHECK-NEXT:ret void
+; CHECK-NEXT:[[RES:%.*]] = call i32 @llvm.vscale.i32()
+; CHECK-NEXT:ret i32 [[RES]]
 ;
 entry:
-  call void @inlined_body()
-  ret void
+  %res = call i32 @llvm.vscale.i32()
+  ret i32 %res
 }
 
-define void @streaming_callee() "aarch64_pstate_sm_enabled" {
-; CHECK-LABEL: define void @streaming_callee
+define i32 @streaming_callee() "aarch64_pstate_sm_enabled" {
+; CHECK-LABEL: define i32 @streaming_callee
 ; CHECK-SAME: () #[[ATTR2:[0-9]+]] {
 ; CHECK-NEXT:  entry:
-; CHECK-NEXT:call void @inlined_body()
-; CHECK-NEXT:ret void
+; CHECK-NEXT:[[RES:%.*]] = call i32 @llvm.vscale.i32()
+; CHECK-NEXT:ret i32 [[RES]]
 ;
 entry:
-  call void @inlined_body()
-  ret void
+  %res = call i32 @llvm.vscale.i32()
+  ret i32 %res
 }
 
-define void @locally_streaming_callee() "aarch64_pstate_sm_body" {
-; CHECK-LABEL: define void @locally_streaming_callee
+define i32 @locally_streaming_callee() "aarch64_pstate_sm_body" {
+; CHECK-LABEL: define i32 @locally_streaming_callee
 ; CHECK-SAME: () #[[ATTR3:[0-9]+]] {
 ; CHECK-NEXT:  entry:
-; CHECK-NEXT:call void @inlined_body()
-; CHECK-NEXT:ret void
+; CHECK-NEXT:[[RES:%.*]] = call i32 @llvm.vscale.i32()
+; CHECK

[llvm-branch-commits] [llvm] PR for llvm/llvm-project#80137 (PR #80138)

2024-01-31 Thread Tom Stellard via llvm-branch-commits

https://github.com/tstellar closed 
https://github.com/llvm/llvm-project/pull/80138
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] 5605312 - [RISCV] Add test to showcase miscompile from #79072

2024-01-31 Thread Tom Stellard via llvm-branch-commits

Author: Luke Lau
Date: 2024-01-31T22:49:19-08:00
New Revision: 5605312fc5742c1e9825bfa4deafe29509795e78

URL: 
https://github.com/llvm/llvm-project/commit/5605312fc5742c1e9825bfa4deafe29509795e78
DIFF: 
https://github.com/llvm/llvm-project/commit/5605312fc5742c1e9825bfa4deafe29509795e78.diff

LOG: [RISCV] Add test to showcase miscompile from #79072

Added: 


Modified: 
llvm/test/CodeGen/RISCV/rvv/fixed-vectors-shuffle-exact-vlen.ll

Removed: 




diff  --git a/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-shuffle-exact-vlen.ll 
b/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-shuffle-exact-vlen.ll
index f53b51e05c572..c0b02f62444ef 100644
--- a/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-shuffle-exact-vlen.ll
+++ b/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-shuffle-exact-vlen.ll
@@ -138,8 +138,8 @@ define <4 x i64> @m2_splat_two_source(<4 x i64> %v1, <4 x 
i64> %v2) vscale_range
   ret <4 x i64> %res
 }
 
-define <4 x i64> @m2_splat_into_identity_two_source(<4 x i64> %v1, <4 x i64> 
%v2) vscale_range(2,2) {
-; CHECK-LABEL: m2_splat_into_identity_two_source:
+define <4 x i64> @m2_splat_into_identity_two_source_v2_hi(<4 x i64> %v1, <4 x 
i64> %v2) vscale_range(2,2) {
+; CHECK-LABEL: m2_splat_into_identity_two_source_v2_hi:
 ; CHECK:   # %bb.0:
 ; CHECK-NEXT:vsetivli zero, 2, e64, m1, ta, ma
 ; CHECK-NEXT:vrgather.vi v10, v8, 0
@@ -149,6 +149,20 @@ define <4 x i64> @m2_splat_into_identity_two_source(<4 x 
i64> %v1, <4 x i64> %v2
   ret <4 x i64> %res
 }
 
+; FIXME: This is a miscompile, we're clobbering the lower reg group of %v2
+; (v10), and the vmv1r.v is moving from the wrong reg group (should be v10)
+define <4 x i64> @m2_splat_into_slide_two_source_v2_lo(<4 x i64> %v1, <4 x 
i64> %v2) vscale_range(2,2) {
+; CHECK-LABEL: m2_splat_into_slide_two_source_v2_lo:
+; CHECK:   # %bb.0:
+; CHECK-NEXT:vsetivli zero, 2, e64, m1, ta, ma
+; CHECK-NEXT:vrgather.vi v10, v8, 0
+; CHECK-NEXT:vmv1r.v v11, v8
+; CHECK-NEXT:vmv2r.v v8, v10
+; CHECK-NEXT:ret
+  %res = shufflevector <4 x i64> %v1, <4 x i64> %v2, <4 x i32> 
+  ret <4 x i64> %res
+}
+
 define <4 x i64> @m2_splat_into_slide_two_source(<4 x i64> %v1, <4 x i64> %v2) 
vscale_range(2,2) {
 ; CHECK-LABEL: m2_splat_into_slide_two_source:
 ; CHECK:   # %bb.0:



___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] aca7586 - [RISCV] Fix M1 shuffle on wrong SrcVec in lowerShuffleViaVRegSplitting

2024-01-31 Thread Tom Stellard via llvm-branch-commits

Author: Luke Lau
Date: 2024-01-31T22:49:19-08:00
New Revision: aca7586ac9cef896a0ab47bd1ccfbbcf9ec50e61

URL: 
https://github.com/llvm/llvm-project/commit/aca7586ac9cef896a0ab47bd1ccfbbcf9ec50e61
DIFF: 
https://github.com/llvm/llvm-project/commit/aca7586ac9cef896a0ab47bd1ccfbbcf9ec50e61.diff

LOG: [RISCV] Fix M1 shuffle on wrong SrcVec in lowerShuffleViaVRegSplitting

This fixes a miscompile from #79072 where we were taking the wrong SrcVec to do
the M1 shuffle. E.g. if the SrcVecIdx was 2 and we had 2 VRegsPerSrc, we ended
up taking it from V1 instead of V2.
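
A tiny standalone illustration of the off-by-one (the helper names below are invented for the sketch; only the `>` versus `>=` comparison comes from the diff):

```cpp
// With VRegsPerSrc = 2, M1 sub-vector indices 0 and 1 come from V1 and indices
// 2 and 3 come from V2.  The old test `SrcVecIdx > VRegsPerSrc` routed index 2
// to V1; `>=` sends it to V2 as intended.
#include <cstdio>

enum class Src { V1, V2 };

static Src pickOld(unsigned SrcVecIdx, unsigned VRegsPerSrc) {
  return SrcVecIdx > VRegsPerSrc ? Src::V2 : Src::V1;   // buggy comparison
}

static Src pickNew(unsigned SrcVecIdx, unsigned VRegsPerSrc) {
  return SrcVecIdx >= VRegsPerSrc ? Src::V2 : Src::V1;  // fixed comparison
}

int main() {
  std::printf("SrcVecIdx 2 of 2-per-source: old=%s new=%s\n",
              pickOld(2, 2) == Src::V2 ? "V2" : "V1",   // prints V1 (wrong)
              pickNew(2, 2) == Src::V2 ? "V2" : "V1");  // prints V2 (right)
  return 0;
}
```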

Added: 


Modified: 
llvm/lib/Target/RISCV/RISCVISelLowering.cpp
llvm/test/CodeGen/RISCV/rvv/fixed-vectors-shuffle-exact-vlen.ll

Removed: 




diff  --git a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp 
b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
index 47c6cd6e5487b..7895d74f06d12 100644
--- a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
+++ b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
@@ -4718,7 +4718,7 @@ static SDValue 
lowerShuffleViaVRegSplitting(ShuffleVectorSDNode *SVN,
 if (SrcVecIdx == -1)
   continue;
 unsigned ExtractIdx = (SrcVecIdx % VRegsPerSrc) * NumOpElts;
-SDValue SrcVec = (unsigned)SrcVecIdx > VRegsPerSrc ? V2 : V1;
+SDValue SrcVec = (unsigned)SrcVecIdx >= VRegsPerSrc ? V2 : V1;
 SDValue SubVec = DAG.getNode(ISD::EXTRACT_SUBVECTOR, DL, M1VT, SrcVec,
  DAG.getVectorIdxConstant(ExtractIdx, DL));
 SubVec = convertFromScalableVector(OneRegVT, SubVec, DAG, Subtarget);

diff  --git a/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-shuffle-exact-vlen.ll 
b/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-shuffle-exact-vlen.ll
index c0b02f62444ef..3f0bdb9d5e316 100644
--- a/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-shuffle-exact-vlen.ll
+++ b/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-shuffle-exact-vlen.ll
@@ -149,15 +149,13 @@ define <4 x i64> 
@m2_splat_into_identity_two_source_v2_hi(<4 x i64> %v1, <4 x i6
   ret <4 x i64> %res
 }
 
-; FIXME: This is a miscompile, we're clobbering the lower reg group of %v2
-; (v10), and the vmv1r.v is moving from the wrong reg group (should be v10)
 define <4 x i64> @m2_splat_into_slide_two_source_v2_lo(<4 x i64> %v1, <4 x 
i64> %v2) vscale_range(2,2) {
 ; CHECK-LABEL: m2_splat_into_slide_two_source_v2_lo:
 ; CHECK:   # %bb.0:
 ; CHECK-NEXT:vsetivli zero, 2, e64, m1, ta, ma
-; CHECK-NEXT:vrgather.vi v10, v8, 0
-; CHECK-NEXT:vmv1r.v v11, v8
-; CHECK-NEXT:vmv2r.v v8, v10
+; CHECK-NEXT:vrgather.vi v12, v8, 0
+; CHECK-NEXT:vmv1r.v v13, v10
+; CHECK-NEXT:vmv2r.v v8, v12
 ; CHECK-NEXT:ret
   %res = shufflevector <4 x i64> %v1, <4 x i64> %v2, <4 x i32> 
   ret <4 x i64> %res



___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [RISCV] Backport 5cf9f2cd9888feea23a624c1de3cc37ce8ce8112 to release/18.x (PR #79931)

2024-01-31 Thread Tom Stellard via llvm-branch-commits

tstellar wrote:

Merged: 5605312fc5742c1e9825bfa4deafe29509795e78 
aca7586ac9cef896a0ab47bd1ccfbbcf9ec50e61

https://github.com/llvm/llvm-project/pull/79931
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [RISCV] Backport 5cf9f2cd9888feea23a624c1de3cc37ce8ce8112 to release/18.x (PR #79931)

2024-01-31 Thread Tom Stellard via llvm-branch-commits

https://github.com/tstellar closed 
https://github.com/llvm/llvm-project/pull/79931
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] [backport] [C++20] [Modules] Backport the ability to skip ODR checks in GMF (PR #80249)

2024-01-31 Thread Chuanqi Xu via llvm-branch-commits

https://github.com/ChuanqiXu9 created 
https://github.com/llvm/llvm-project/pull/80249

The backport follows the new practice suggested in 
https://discourse.llvm.org/t/release-18-x-branch-has-been-created/76480.

See https://github.com/llvm/llvm-project/issues/79240 and 
https://github.com/llvm/llvm-project/pull/79959 for the full context.

This is pretty helpful for improving the user experience with modules, given 
the large number of issue reports about false-positive ODR violation diagnostics.
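
For context, a minimal sketch of the kind of setup that tends to produce those false positives; the file names, the header, and its contents are illustrative assumptions rather than a reduction of any linked issue:

```cpp
// common.h (illustrative)
struct Point { int x; int y; };

// a.cppm: the #include lands in the explicit global module fragment (GMF)
module;
#include "common.h"
export module a;
export int ax(Point p) { return p.x; }

// b.cppm: a second copy of the same declarations in another GMF
module;
#include "common.h"
export module b;
export int by(Point p) { return p.y; }
```

A translation unit importing both modules could previously be diagnosed with a spurious ODR violation for entities like `Point`; with this backport such GMF declarations are skipped by default, and the stricter checking remains reachable via `-Xclang -fno-skip-odr-check-in-gmf`.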

>From 85dc0ff79515cc439cc3e0d8c991709ad789a50b Mon Sep 17 00:00:00 2001
From: Chuanqi Xu 
Date: Mon, 29 Jan 2024 11:42:08 +0800
Subject: [PATCH 1/3] [C++20] [Modules] Don't perform ODR checks in GMF

Close https://github.com/llvm/llvm-project/issues/79240.

See the linked issue for details. Given the frequency of issue reporting
about false positive ODR checks (I received private issue reports too),
I'd like to backport this to 18.x too.
---
 clang/docs/ReleaseNotes.rst   |  5 ++
 clang/include/clang/Serialization/ASTReader.h |  4 ++
 clang/lib/Serialization/ASTReader.cpp |  3 ++
 clang/lib/Serialization/ASTReaderDecl.cpp | 37 +
 clang/lib/Serialization/ASTWriter.cpp |  8 ++-
 clang/lib/Serialization/ASTWriterDecl.cpp | 13 +++--
 clang/test/Modules/concept.cppm   | 14 ++---
 clang/test/Modules/no-eager-load.cppm | 53 ---
 clang/test/Modules/polluted-operator.cppm |  8 ++-
 clang/test/Modules/pr76638.cppm   |  6 +--
 10 files changed, 68 insertions(+), 83 deletions(-)

diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index 060bc7669b72a..e8dfdfa63717c 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -188,6 +188,11 @@ C++20 Feature Support
   This feature is still experimental. Accordingly, 
``__cpp_nontype_template_args`` was not updated.
   However, its support can be tested with 
``__has_extension(cxx_generalized_nttp)``.
 
+- Clang won't perform ODR checks for decls in the global module fragment any
+  more to ease the implementation and improve the user's using experience.
+  This follows the MSVC's behavior.
+  (`#79240 `_).
+
 C++23 Feature Support
 ^
 - Implemented `P0847R7: Deducing this `_. Some 
related core issues were also
diff --git a/clang/include/clang/Serialization/ASTReader.h 
b/clang/include/clang/Serialization/ASTReader.h
index dd1451bbf2d2c..ba06ab0cd4509 100644
--- a/clang/include/clang/Serialization/ASTReader.h
+++ b/clang/include/clang/Serialization/ASTReader.h
@@ -2452,6 +2452,10 @@ class BitsUnpacker {
   uint32_t CurrentBitsIndex = ~0;
 };
 
+inline bool isFromExplicitGMF(const Decl *D) {
+  return D->getOwningModule() && 
D->getOwningModule()->isExplicitGlobalModule();
+}
+
 } // namespace clang
 
 #endif // LLVM_CLANG_SERIALIZATION_ASTREADER_H
diff --git a/clang/lib/Serialization/ASTReader.cpp 
b/clang/lib/Serialization/ASTReader.cpp
index fecd94e875f67..e91c5fe08a043 100644
--- a/clang/lib/Serialization/ASTReader.cpp
+++ b/clang/lib/Serialization/ASTReader.cpp
@@ -9743,6 +9743,9 @@ void ASTReader::finishPendingActions() {
 
 if (!FD->isLateTemplateParsed() &&
 !NonConstDefn->isLateTemplateParsed() &&
+// We only perform ODR checks for decls not in the explicit
+// global module fragment.
+!isFromExplicitGMF(FD) &&
 FD->getODRHash() != NonConstDefn->getODRHash()) {
   if (!isa(FD)) {
 PendingFunctionOdrMergeFailures[FD].push_back(NonConstDefn);
diff --git a/clang/lib/Serialization/ASTReaderDecl.cpp 
b/clang/lib/Serialization/ASTReaderDecl.cpp
index a149d82153037..7697f29b9054b 100644
--- a/clang/lib/Serialization/ASTReaderDecl.cpp
+++ b/clang/lib/Serialization/ASTReaderDecl.cpp
@@ -804,8 +804,10 @@ void ASTDeclReader::VisitEnumDecl(EnumDecl *ED) {
   ED->setScopedUsingClassTag(EnumDeclBits.getNextBit());
   ED->setFixed(EnumDeclBits.getNextBit());
 
-  ED->setHasODRHash(true);
-  ED->ODRHash = Record.readInt();
+  if (!isFromExplicitGMF(ED)) {
+ED->setHasODRHash(true);
+ED->ODRHash = Record.readInt();
+  }
 
   // If this is a definition subject to the ODR, and we already have a
   // definition, merge this one into it.
@@ -827,7 +829,9 @@ void ASTDeclReader::VisitEnumDecl(EnumDecl *ED) {
   Reader.MergedDeclContexts.insert(std::make_pair(ED, OldDef));
   ED->demoteThisDefinitionToDeclaration();
   Reader.mergeDefinitionVisibility(OldDef, ED);
-  if (OldDef->getODRHash() != ED->getODRHash())
+  // We don't want to check the ODR hash value for declarations from global
+  // module fragment.
+  if (!isFromExplicitGMF(ED) && OldDef->getODRHash() != ED->getODRHash())
 Reader.PendingEnumOdrMergeFailures[OldDef].push_back(ED);
 } else {
   OldDef = ED;
@@ -866,6 +870,9 @@ ASTDeclReader::VisitRecordDeclImpl(Rec

[llvm-branch-commits] [clang] [backport] [C++20] [Modules] Backport the ability to skip ODR checks in GMF (PR #80249)

2024-01-31 Thread Chuanqi Xu via llvm-branch-commits

https://github.com/ChuanqiXu9 milestoned 
https://github.com/llvm/llvm-project/pull/80249
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [clang] [backport] [C++20] [Modules] Backport the ability to skip ODR checks in GMF (PR #80249)

2024-01-31 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-clang-driver

Author: Chuanqi Xu (ChuanqiXu9)


Changes

The backport follows the new practice suggested in 
https://discourse.llvm.org/t/release-18-x-branch-has-been-created/76480.

See https://github.com/llvm/llvm-project/issues/79240 and 
https://github.com/llvm/llvm-project/pull/79959 for the full context.

This is pretty helpful for improving the user experience with modules, given 
the large number of issue reports about false-positive ODR violation diagnostics.

---

Patch is 25.64 KiB, truncated to 20.00 KiB below, full version: 
https://github.com/llvm/llvm-project/pull/80249.diff


18 Files Affected:

- (modified) clang/docs/ReleaseNotes.rst (+6-3) 
- (modified) clang/docs/StandardCPlusPlusModules.rst (+23) 
- (modified) clang/include/clang/Basic/LangOptions.def (+1) 
- (modified) clang/include/clang/Driver/Options.td (+8) 
- (modified) clang/include/clang/Serialization/ASTReader.h (+6) 
- (modified) clang/lib/AST/ODRHash.cpp (+1-48) 
- (modified) clang/lib/Driver/ToolChains/Clang.cpp (+4) 
- (modified) clang/lib/Serialization/ASTReader.cpp (+3) 
- (modified) clang/lib/Serialization/ASTReaderDecl.cpp (+29-9) 
- (modified) clang/lib/Serialization/ASTWriter.cpp (+6-2) 
- (modified) clang/lib/Serialization/ASTWriterDecl.cpp (+9-4) 
- (added) clang/test/Driver/modules-skip-odr-check-in-gmf.cpp (+10) 
- (modified) clang/test/Modules/concept.cppm (+10-1) 
- (added) clang/test/Modules/cxx20-modules-enum-odr.cppm (+51) 
- (modified) clang/test/Modules/no-eager-load.cppm (-53) 
- (modified) clang/test/Modules/polluted-operator.cppm (+10) 
- (modified) clang/test/Modules/pr76638.cppm (+10) 
- (added) clang/test/Modules/skip-odr-check-in-gmf.cppm (+56) 


``diff
diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index 060bc7669b72a..ca6f4439971f1 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -188,6 +188,12 @@ C++20 Feature Support
   This feature is still experimental. Accordingly, 
``__cpp_nontype_template_args`` was not updated.
   However, its support can be tested with 
``__has_extension(cxx_generalized_nttp)``.
 
+- Clang won't perform ODR checks for decls in the global module fragment any
+  more to ease the implementation and improve the user's using experience.
+  This follows the MSVC's behavior. Users interested in testing the more strict
+  behavior can use the flag '-Xclang -fno-skip-odr-check-in-gmf'.
+  (`#79240 `_).
+
 C++23 Feature Support
 ^
 - Implemented `P0847R7: Deducing this `_. Some 
related core issues were also
@@ -1041,9 +1047,6 @@ Bug Fixes to C++ Support
   in different visibility.
   Fixes (`#67893 `_)
 
-- Fix a false-positive ODR violation for different definitions for 
`std::align_val_t`.
-  Fixes (`#76638 `_)
-
 - Remove recorded `#pragma once` state for headers included in named modules.
   Fixes (`#77995 `_)
 
diff --git a/clang/docs/StandardCPlusPlusModules.rst 
b/clang/docs/StandardCPlusPlusModules.rst
index 81043ff25be02..0f85065f464a8 100644
--- a/clang/docs/StandardCPlusPlusModules.rst
+++ b/clang/docs/StandardCPlusPlusModules.rst
@@ -457,6 +457,29 @@ Note that **currently** the compiler doesn't consider 
inconsistent macro definit
 Currently Clang would accept the above example. But it may produce surprising 
results if the
 debugging code depends on consistent use of ``NDEBUG`` also in other 
translation units.
 
+Definitions consistency
+^^^
+
+The C++ language defines that same declarations in different translation units 
should have
+the same definition, as known as ODR (One Definition Rule). Prior to modules, 
the translation
+units don't dependent on each other and the compiler itself can't perform a 
strong
+ODR violation check. With the introduction of modules, now the compiler have
+the chance to perform ODR violations with language semantics across 
translation units.
+
+However, in the practice, we found the existing ODR checking mechanism is not 
stable
+enough. Many people suffers from the false positive ODR violation diagnostics, 
AKA,
+the compiler are complaining two identical declarations have different 
definitions
+incorrectly. Also the true positive ODR violations are rarely reported.
+Also we learned that MSVC don't perform ODR check for declarations in the 
global module
+fragment.
+
+So in order to get better user experience, save the time checking ODR and keep 
consistent
+behavior with MSVC, we disabled the ODR check for the declarations in the 
global module
+fragment by default. Users who want more strict check can still use the
+``-Xclang -fno-skip-odr-check-in-gmf`` flag to get the ODR check enabled. It 
is also
+encouraged to report issues if users find false positive ODR violat

[llvm-branch-commits] [llvm] [clang] [backport] [C++20] [Modules] Backport the ability to skip ODR checks in GMF (PR #80249)

2024-01-31 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-clang

Author: Chuanqi Xu (ChuanqiXu9)


Changes

The backport follows the new practice suggested in 
https://discourse.llvm.org/t/release-18-x-branch-has-been-created/76480.

See https://github.com/llvm/llvm-project/issues/79240 and 
https://github.com/llvm/llvm-project/pull/79959 for the full context.

This is pretty helpful for improving the user experience with modules, given 
the large number of issue reports about false-positive ODR violation diagnostics.

---

Patch is 25.64 KiB, truncated to 20.00 KiB below, full version: 
https://github.com/llvm/llvm-project/pull/80249.diff


18 Files Affected:

- (modified) clang/docs/ReleaseNotes.rst (+6-3) 
- (modified) clang/docs/StandardCPlusPlusModules.rst (+23) 
- (modified) clang/include/clang/Basic/LangOptions.def (+1) 
- (modified) clang/include/clang/Driver/Options.td (+8) 
- (modified) clang/include/clang/Serialization/ASTReader.h (+6) 
- (modified) clang/lib/AST/ODRHash.cpp (+1-48) 
- (modified) clang/lib/Driver/ToolChains/Clang.cpp (+4) 
- (modified) clang/lib/Serialization/ASTReader.cpp (+3) 
- (modified) clang/lib/Serialization/ASTReaderDecl.cpp (+29-9) 
- (modified) clang/lib/Serialization/ASTWriter.cpp (+6-2) 
- (modified) clang/lib/Serialization/ASTWriterDecl.cpp (+9-4) 
- (added) clang/test/Driver/modules-skip-odr-check-in-gmf.cpp (+10) 
- (modified) clang/test/Modules/concept.cppm (+10-1) 
- (added) clang/test/Modules/cxx20-modules-enum-odr.cppm (+51) 
- (modified) clang/test/Modules/no-eager-load.cppm (-53) 
- (modified) clang/test/Modules/polluted-operator.cppm (+10) 
- (modified) clang/test/Modules/pr76638.cppm (+10) 
- (added) clang/test/Modules/skip-odr-check-in-gmf.cppm (+56) 


``diff
diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index 060bc7669b72a..ca6f4439971f1 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -188,6 +188,12 @@ C++20 Feature Support
   This feature is still experimental. Accordingly, 
``__cpp_nontype_template_args`` was not updated.
   However, its support can be tested with 
``__has_extension(cxx_generalized_nttp)``.
 
+- Clang won't perform ODR checks for decls in the global module fragment any
+  more to ease the implementation and improve the user's using experience.
+  This follows the MSVC's behavior. Users interested in testing the more strict
+  behavior can use the flag '-Xclang -fno-skip-odr-check-in-gmf'.
+  (`#79240 `_).
+
 C++23 Feature Support
 ^
 - Implemented `P0847R7: Deducing this `_. Some 
related core issues were also
@@ -1041,9 +1047,6 @@ Bug Fixes to C++ Support
   in different visibility.
   Fixes (`#67893 `_)
 
-- Fix a false-positive ODR violation for different definitions for 
`std::align_val_t`.
-  Fixes (`#76638 `_)
-
 - Remove recorded `#pragma once` state for headers included in named modules.
   Fixes (`#77995 `_)
 
diff --git a/clang/docs/StandardCPlusPlusModules.rst 
b/clang/docs/StandardCPlusPlusModules.rst
index 81043ff25be02..0f85065f464a8 100644
--- a/clang/docs/StandardCPlusPlusModules.rst
+++ b/clang/docs/StandardCPlusPlusModules.rst
@@ -457,6 +457,29 @@ Note that **currently** the compiler doesn't consider 
inconsistent macro definit
 Currently Clang would accept the above example. But it may produce surprising 
results if the
 debugging code depends on consistent use of ``NDEBUG`` also in other 
translation units.
 
+Definitions consistency
+^^^
+
+The C++ language defines that same declarations in different translation units 
should have
+the same definition, as known as ODR (One Definition Rule). Prior to modules, 
the translation
+units don't dependent on each other and the compiler itself can't perform a 
strong
+ODR violation check. With the introduction of modules, now the compiler have
+the chance to perform ODR violations with language semantics across 
translation units.
+
+However, in the practice, we found the existing ODR checking mechanism is not 
stable
+enough. Many people suffers from the false positive ODR violation diagnostics, 
AKA,
+the compiler are complaining two identical declarations have different 
definitions
+incorrectly. Also the true positive ODR violations are rarely reported.
+Also we learned that MSVC don't perform ODR check for declarations in the 
global module
+fragment.
+
+So in order to get better user experience, save the time checking ODR and keep 
consistent
+behavior with MSVC, we disabled the ODR check for the declarations in the 
global module
+fragment by default. Users who want more strict check can still use the
+``-Xclang -fno-skip-odr-check-in-gmf`` flag to get the ODR check enabled. It 
is also
+encouraged to report issues if users find false positive ODR violations or

[llvm-branch-commits] [clang] [llvm] [backport] [C++20] [Modules] Backport the ability to skip ODR checks in GMF (PR #80249)

2024-01-31 Thread Chuanqi Xu via llvm-branch-commits

https://github.com/ChuanqiXu9 updated 
https://github.com/llvm/llvm-project/pull/80249

>From 85dc0ff79515cc439cc3e0d8c991709ad789a50b Mon Sep 17 00:00:00 2001
From: Chuanqi Xu 
Date: Mon, 29 Jan 2024 11:42:08 +0800
Subject: [PATCH 1/3] [C++20] [Modules] Don't perform ODR checks in GMF

Close https://github.com/llvm/llvm-project/issues/79240.

See the linked issue for details. Given the frequency of issue reporting
about false positive ODR checks (I received private issue reports too),
I'd like to backport this to 18.x too.
---
 clang/docs/ReleaseNotes.rst   |  5 ++
 clang/include/clang/Serialization/ASTReader.h |  4 ++
 clang/lib/Serialization/ASTReader.cpp |  3 ++
 clang/lib/Serialization/ASTReaderDecl.cpp | 37 +
 clang/lib/Serialization/ASTWriter.cpp |  8 ++-
 clang/lib/Serialization/ASTWriterDecl.cpp | 13 +++--
 clang/test/Modules/concept.cppm   | 14 ++---
 clang/test/Modules/no-eager-load.cppm | 53 ---
 clang/test/Modules/polluted-operator.cppm |  8 ++-
 clang/test/Modules/pr76638.cppm   |  6 +--
 10 files changed, 68 insertions(+), 83 deletions(-)

diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index 060bc7669b72a..e8dfdfa63717c 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -188,6 +188,11 @@ C++20 Feature Support
   This feature is still experimental. Accordingly, 
``__cpp_nontype_template_args`` was not updated.
   However, its support can be tested with 
``__has_extension(cxx_generalized_nttp)``.
 
+- Clang won't perform ODR checks for decls in the global module fragment any
+  more to ease the implementation and improve the user's using experience.
+  This follows the MSVC's behavior.
+  (`#79240 `_).
+
 C++23 Feature Support
 ^
 - Implemented `P0847R7: Deducing this `_. Some 
related core issues were also
diff --git a/clang/include/clang/Serialization/ASTReader.h 
b/clang/include/clang/Serialization/ASTReader.h
index dd1451bbf2d2c..ba06ab0cd4509 100644
--- a/clang/include/clang/Serialization/ASTReader.h
+++ b/clang/include/clang/Serialization/ASTReader.h
@@ -2452,6 +2452,10 @@ class BitsUnpacker {
   uint32_t CurrentBitsIndex = ~0;
 };
 
+inline bool isFromExplicitGMF(const Decl *D) {
+  return D->getOwningModule() && 
D->getOwningModule()->isExplicitGlobalModule();
+}
+
 } // namespace clang
 
 #endif // LLVM_CLANG_SERIALIZATION_ASTREADER_H
diff --git a/clang/lib/Serialization/ASTReader.cpp 
b/clang/lib/Serialization/ASTReader.cpp
index fecd94e875f67..e91c5fe08a043 100644
--- a/clang/lib/Serialization/ASTReader.cpp
+++ b/clang/lib/Serialization/ASTReader.cpp
@@ -9743,6 +9743,9 @@ void ASTReader::finishPendingActions() {
 
 if (!FD->isLateTemplateParsed() &&
 !NonConstDefn->isLateTemplateParsed() &&
+// We only perform ODR checks for decls not in the explicit
+// global module fragment.
+!isFromExplicitGMF(FD) &&
 FD->getODRHash() != NonConstDefn->getODRHash()) {
   if (!isa(FD)) {
 PendingFunctionOdrMergeFailures[FD].push_back(NonConstDefn);
diff --git a/clang/lib/Serialization/ASTReaderDecl.cpp 
b/clang/lib/Serialization/ASTReaderDecl.cpp
index a149d82153037..7697f29b9054b 100644
--- a/clang/lib/Serialization/ASTReaderDecl.cpp
+++ b/clang/lib/Serialization/ASTReaderDecl.cpp
@@ -804,8 +804,10 @@ void ASTDeclReader::VisitEnumDecl(EnumDecl *ED) {
   ED->setScopedUsingClassTag(EnumDeclBits.getNextBit());
   ED->setFixed(EnumDeclBits.getNextBit());
 
-  ED->setHasODRHash(true);
-  ED->ODRHash = Record.readInt();
+  if (!isFromExplicitGMF(ED)) {
+ED->setHasODRHash(true);
+ED->ODRHash = Record.readInt();
+  }
 
   // If this is a definition subject to the ODR, and we already have a
   // definition, merge this one into it.
@@ -827,7 +829,9 @@ void ASTDeclReader::VisitEnumDecl(EnumDecl *ED) {
   Reader.MergedDeclContexts.insert(std::make_pair(ED, OldDef));
   ED->demoteThisDefinitionToDeclaration();
   Reader.mergeDefinitionVisibility(OldDef, ED);
-  if (OldDef->getODRHash() != ED->getODRHash())
+  // We don't want to check the ODR hash value for declarations from global
+  // module fragment.
+  if (!isFromExplicitGMF(ED) && OldDef->getODRHash() != ED->getODRHash())
 Reader.PendingEnumOdrMergeFailures[OldDef].push_back(ED);
 } else {
   OldDef = ED;
@@ -866,6 +870,9 @@ ASTDeclReader::VisitRecordDeclImpl(RecordDecl *RD) {
 
 void ASTDeclReader::VisitRecordDecl(RecordDecl *RD) {
   VisitRecordDeclImpl(RD);
+  // We should only reach here if we're in C/Objective-C. There is no
+  // global module fragment.
+  assert(!isFromExplicitGMF(RD));
   RD->setODRHash(Record.readInt());
 
   // Maintain the invariant of a redeclaration chain containing only
@@ -1094,8 +1101,10 @@ void ASTDeclReader::VisitFunctionDecl(Funct

[llvm-branch-commits] [clang] d3aeedc - [Docs] Fix documentation build.

2024-01-31 Thread Tom Stellard via llvm-branch-commits

Author: Craig Topper
Date: 2024-01-31T22:56:47-08:00
New Revision: d3aeedcd47cb9ac29769716c1eed6d5b80b45728

URL: 
https://github.com/llvm/llvm-project/commit/d3aeedcd47cb9ac29769716c1eed6d5b80b45728
DIFF: 
https://github.com/llvm/llvm-project/commit/d3aeedcd47cb9ac29769716c1eed6d5b80b45728.diff

LOG: [Docs] Fix documentation build.

Missing ending `` after c92ad411f2f94d8521cd18abcb37285f9a390ecb

Added: 


Modified: 
clang/docs/ReleaseNotes.rst

Removed: 




diff  --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index 45d1ab34d0f93..2f4fe8bf7556e 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -1227,7 +1227,7 @@ RISC-V Support
 - Default ABI with F but without D was changed to ilp32f for RV32 and to lp64f
   for RV64.
 
-- ``__attribute__((rvv_vector_bits(N))) is now supported for RVV vbool*_t 
types.
+- ``__attribute__((rvv_vector_bits(N)))`` is now supported for RVV vbool*_t 
types.
 
 CUDA/HIP Language Changes
 ^



___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] f4e5ce0 - [RISCV] Support __riscv_v_fixed_vlen for vbool types. (#76551)

2024-01-31 Thread Tom Stellard via llvm-branch-commits

Author: Craig Topper
Date: 2024-01-31T22:56:47-08:00
New Revision: f4e5ce059eb4f04788e0c8391dc565b6aab952dc

URL: 
https://github.com/llvm/llvm-project/commit/f4e5ce059eb4f04788e0c8391dc565b6aab952dc
DIFF: 
https://github.com/llvm/llvm-project/commit/f4e5ce059eb4f04788e0c8391dc565b6aab952dc.diff

LOG: [RISCV] Support __riscv_v_fixed_vlen for vbool types. (#76551)

This adopts a similar behavior to AArch64 SVE, where bool vectors are
represented as a vector of chars with 1/8 the number of elements. This
ensures the vector always occupies a power of 2 number of bytes.

A consequence of this is that vbool64_t, vbool32_t, and vbool16_t can
only be used with a vector length that guarantees at least 8 bits.
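
A short usage sketch of what this enables, assuming the attribute spelling `riscv_rvv_vector_bits` from clang's attribute documentation and a build that defines `__riscv_v_fixed_vlen` (e.g. `-march=rv64gcv -mrvv-vector-bits=256`); these flags are illustrative assumptions, not part of this commit:

```cpp
// Sketch: fixed-length RVV mask types after this change.  Per the updated
// attribute docs, vboolN_t takes __riscv_v_fixed_vlen / N, and that value
// must be a multiple of 8 for the type to be accepted.
#include <riscv_vector.h>

#if defined(__riscv_v_fixed_vlen)
typedef vbool8_t fixed_vbool8_t
    __attribute__((riscv_rvv_vector_bits(__riscv_v_fixed_vlen / 8)));

// At __riscv_v_fixed_vlen == 256, vbool64_t would need 256 / 64 == 4 bits,
// which is not a multiple of 8, so it stays unsupported at that VLEN.
#endif
```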

Added: 


Modified: 
clang/docs/ReleaseNotes.rst
clang/include/clang/AST/Type.h
clang/include/clang/Basic/AttrDocs.td
clang/lib/AST/ASTContext.cpp
clang/lib/AST/ItaniumMangle.cpp
clang/lib/AST/JSONNodeDumper.cpp
clang/lib/AST/TextNodeDumper.cpp
clang/lib/AST/Type.cpp
clang/lib/AST/TypePrinter.cpp
clang/lib/CodeGen/Targets/RISCV.cpp
clang/lib/Sema/SemaExpr.cpp
clang/lib/Sema/SemaType.cpp
clang/test/CodeGen/attr-riscv-rvv-vector-bits-bitcast.c
clang/test/CodeGen/attr-riscv-rvv-vector-bits-call.c
clang/test/CodeGen/attr-riscv-rvv-vector-bits-cast.c
clang/test/CodeGen/attr-riscv-rvv-vector-bits-codegen.c
clang/test/CodeGen/attr-riscv-rvv-vector-bits-globals.c
clang/test/CodeGen/attr-riscv-rvv-vector-bits-types.c
clang/test/CodeGenCXX/riscv-mangle-rvv-fixed-vectors.cpp
clang/test/Sema/attr-riscv-rvv-vector-bits.c

Removed: 




diff  --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index 060bc7669b72a..45d1ab34d0f93 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -1227,6 +1227,8 @@ RISC-V Support
 - Default ABI with F but without D was changed to ilp32f for RV32 and to lp64f
   for RV64.
 
+- ``__attribute__((rvv_vector_bits(N))) is now supported for RVV vbool*_t 
types.
+
 CUDA/HIP Language Changes
 ^
 

diff  --git a/clang/include/clang/AST/Type.h b/clang/include/clang/AST/Type.h
index ea425791fc97f..6384cf9420b82 100644
--- a/clang/include/clang/AST/Type.h
+++ b/clang/include/clang/AST/Type.h
@@ -3495,6 +3495,9 @@ enum class VectorKind {
 
   /// is RISC-V RVV fixed-length data vector
   RVVFixedLengthData,
+
+  /// is RISC-V RVV fixed-length mask vector
+  RVVFixedLengthMask,
 };
 
 /// Represents a GCC generic vector type. This type is created using

diff  --git a/clang/include/clang/Basic/AttrDocs.td 
b/clang/include/clang/Basic/AttrDocs.td
index 7e633f8e2635a..e02a1201e2ad7 100644
--- a/clang/include/clang/Basic/AttrDocs.td
+++ b/clang/include/clang/Basic/AttrDocs.td
@@ -2424,7 +2424,10 @@ only be a power of 2 between 64 and 65536.
 For types where LMUL!=1, ``__riscv_v_fixed_vlen`` needs to be scaled by the 
LMUL
 of the type before passing to the attribute.
 
-``vbool*_t`` types are not supported at this time.
+For ``vbool*_t`` types, ``__riscv_v_fixed_vlen`` needs to be divided by the
+number from the type name. For example, ``vbool8_t`` needs to use
+``__riscv_v_fixed_vlen`` / 8. If the resulting value is not a multiple of 8,
+the type is not supported for that value of ``__riscv_v_fixed_vlen``.
 }];
 }
 

diff  --git a/clang/lib/AST/ASTContext.cpp b/clang/lib/AST/ASTContext.cpp
index 5eb7aa3664569..ab16ca10395fa 100644
--- a/clang/lib/AST/ASTContext.cpp
+++ b/clang/lib/AST/ASTContext.cpp
@@ -1945,7 +1945,8 @@ TypeInfo ASTContext::getTypeInfoImpl(const Type *T) const 
{
 else if (VT->getVectorKind() == VectorKind::SveFixedLengthPredicate)
   // Adjust the alignment for fixed-length SVE predicates.
   Align = 16;
-else if (VT->getVectorKind() == VectorKind::RVVFixedLengthData)
+else if (VT->getVectorKind() == VectorKind::RVVFixedLengthData ||
+ VT->getVectorKind() == VectorKind::RVVFixedLengthMask)
   // Adjust the alignment for fixed-length RVV vectors.
   Align = std::min(64, Width);
 break;
@@ -9416,7 +9417,9 @@ bool ASTContext::areCompatibleVectorTypes(QualType 
FirstVec,
   Second->getVectorKind() != VectorKind::SveFixedLengthData &&
   Second->getVectorKind() != VectorKind::SveFixedLengthPredicate &&
   First->getVectorKind() != VectorKind::RVVFixedLengthData &&
-  Second->getVectorKind() != VectorKind::RVVFixedLengthData)
+  Second->getVectorKind() != VectorKind::RVVFixedLengthData &&
+  First->getVectorKind() != VectorKind::RVVFixedLengthMask &&
+  Second->getVectorKind() != VectorKind::RVVFixedLengthMask)
 return true;
 
   return false;
@@ -9522,8 +9525,11 @@ static uint64_t getRVVTypeSize(ASTContext &Context, 
const BuiltinType *Ty) {
 
   ASTContext::BuiltinVectorTypeInfo Info = 
Context.getBuiltinVectorTypeInfo(Ty);
 
-  uint64_t EltSize = Context.getTypeSize(Info.ElementType);
-  u

[llvm-branch-commits] [clang] PR for llvm/llvm-project#79479 (PR #79907)

2024-01-31 Thread Tom Stellard via llvm-branch-commits

https://github.com/tstellar closed 
https://github.com/llvm/llvm-project/pull/79907
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] PR for llvm/llvm-project#79479 (PR #79907)

2024-01-31 Thread Tom Stellard via llvm-branch-commits

tstellar wrote:

Merged: f4e5ce059eb4f04788e0c8391dc565b6aab952dc 
d3aeedcd47cb9ac29769716c1eed6d5b80b45728

https://github.com/llvm/llvm-project/pull/79907
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [clang] [backport] [C++20] [Modules] Backport the ability to skip ODR checks in GMF (PR #80249)

2024-01-31 Thread Chuanqi Xu via llvm-branch-commits

https://github.com/ChuanqiXu9 edited 
https://github.com/llvm/llvm-project/pull/80249
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits