date:20240205

[llvm-branch-commits] [clang] PR for llvm/llvm-project#80432 (PR #80433)

2024-02-05 Thread Dinar Temirbulatov via llvm-branch-commits


https://github.com/dtemirbulatov approved this pull request.

LGTM.

https://github.com/llvm/llvm-project/pull/80433
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [clang] PR for llvm/llvm-project#79614 (PR #79870)

2024-02-05 Thread Alexandros Lamprineas via llvm-branch-commits


labrinea wrote:

Ping! @tstellar do I need to take any actions for this to show up in the 
release board? I am not seeing it under `needs triage` or `needs merge`.

https://github.com/llvm/llvm-project/pull/79870

[llvm-branch-commits] [flang] [mlir] [mlir][flang][openmp] Rework wsloop reduction operations (PR #80019)

2024-02-05 Thread Kiran Chandramohan via llvm-branch-commits


https://github.com/kiranchandramohan ready_for_review 
https://github.com/llvm/llvm-project/pull/80019

[llvm-branch-commits] [flang] [mlir] [mlir][flang][openmp] Rework wsloop reduction operations (PR #80019)

2024-02-05 Thread via llvm-branch-commits


llvmbot wrote:




@llvm/pr-subscribers-mlir-llvm

Author: David Truby (DavidTruby)


Changes

This patch reworks the way that wsloop reduction operations function to better 
match the expected semantics from the OpenMP specification, following the 
rework of parallel reductions.

The new semantics create a private reduction variable as a block argument which 
should be used normally for all operations on that variable in the region; this 
private variable is then combined with the others into the shared variable. 
This way no special omp.reduction operations are needed inside the region. 
These block arguments follow the loop control block arguments.
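
The semantics described above can be sketched outside MLIR. What follows is a minimal illustration in plain C++, not the dialect's implementation: each worker gets a private accumulator initialised to the reduction's identity, the loop body updates it with ordinary operations, and the private copies are combined into the shared variable afterwards. The name `reduceWithPrivates`, the `numWorkers` parameter, and the round-robin schedule are assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical helper: simulates a worksharing-loop reduction sequentially.
// Each "worker" owns a private accumulator; nothing touches `shared` until
// the final combine step.
int reduceWithPrivates(const std::vector<int> &data, std::size_t numWorkers,
                       int identity,
                       const std::function<int(int, int)> &combine,
                       int shared) {
  assert(numWorkers > 0 && "need at least one worker");
  std::vector<int> privates(numWorkers, identity);

  // Round-robin schedule (an assumption for illustration): iteration i belongs
  // to worker i % numWorkers and updates only that worker's private copy.
  for (std::size_t i = 0; i < data.size(); ++i) {
    int &priv = privates[i % numWorkers];
    priv = combine(priv, data[i]);
  }

  // Combine every private copy into the shared variable.
  for (int priv : privates)
    shared = combine(shared, priv);
  return shared;
}
```

The result matches a sequential loop only when `combine` is associative and commutative and `identity` is its neutral element, which is the case for the usual reduction operators.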

---

Patch is 361.55 KiB, truncated to 20.00 KiB below, full version: 
https://github.com/llvm/llvm-project/pull/80019.diff


37 Files Affected:

- (modified) flang/lib/Lower/OpenMP.cpp (+37-19) 
- (modified) flang/test/Fir/convert-to-llvm-openmp-and-fir.fir (+16-4) 
- (modified) flang/test/Lower/OpenMP/FIR/wsloop-reduction-add.f90 (+266-166) 
- (modified) flang/test/Lower/OpenMP/FIR/wsloop-reduction-iand.f90 (+5-3) 
- (modified) flang/test/Lower/OpenMP/FIR/wsloop-reduction-ieor.f90 (+5-3) 
- (modified) flang/test/Lower/OpenMP/FIR/wsloop-reduction-ior.f90 (+4-2) 
- (removed) flang/test/Lower/OpenMP/FIR/wsloop-reduction-logical-and.f90 (-137) 
- (modified) flang/test/Lower/OpenMP/FIR/wsloop-reduction-logical-eqv.f90 (+138-93) 
- (modified) flang/test/Lower/OpenMP/FIR/wsloop-reduction-logical-neqv.f90 (+140-93) 
- (removed) flang/test/Lower/OpenMP/FIR/wsloop-reduction-logical-or.f90 (-137) 
- (modified) flang/test/Lower/OpenMP/FIR/wsloop-reduction-max.f90 (+8-5) 
- (modified) flang/test/Lower/OpenMP/FIR/wsloop-reduction-min.f90 (+8-4) 
- (removed) flang/test/Lower/OpenMP/FIR/wsloop-reduction-mul.f90 (-274) 
- (modified) flang/test/Lower/OpenMP/default-clause.f90 (+1-1) 
- (modified) flang/test/Lower/OpenMP/wsloop-reduction-add-hlfir.f90 (+37-28) 
- (modified) flang/test/Lower/OpenMP/wsloop-reduction-add.f90 (+312-187) 
- (modified) flang/test/Lower/OpenMP/wsloop-reduction-iand.f90 (+40-24) 
- (modified) flang/test/Lower/OpenMP/wsloop-reduction-ieor.f90 (+6-3) 
- (modified) flang/test/Lower/OpenMP/wsloop-reduction-ior.f90 (+41-24) 
- (modified) flang/test/Lower/OpenMP/wsloop-reduction-logical-and.f90 (+158-97) 
- (modified) flang/test/Lower/OpenMP/wsloop-reduction-logical-eqv.f90 (+153-96) 
- (modified) flang/test/Lower/OpenMP/wsloop-reduction-logical-neqv.f90 (+158-96) 
- (modified) flang/test/Lower/OpenMP/wsloop-reduction-logical-or.f90 (+155-96) 
- (modified) flang/test/Lower/OpenMP/wsloop-reduction-max-2.f90 (+2-1) 
- (modified) flang/test/Lower/OpenMP/wsloop-reduction-max-hlfir.f90 (+41-19) 
- (modified) flang/test/Lower/OpenMP/wsloop-reduction-max.f90 (+105-48) 
- (modified) flang/test/Lower/OpenMP/wsloop-reduction-min.f90 (+107-48) 
- (modified) flang/test/Lower/OpenMP/wsloop-reduction-mul.f90 (+282-186) 
- (modified) mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td (+3-6) 
- (modified) mlir/lib/Conversion/SCFToOpenMP/SCFToOpenMP.cpp (+31-1) 
- (modified) mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp (+90-11) 
- (modified) mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp (+14-24) 
- (modified) mlir/test/Conversion/OpenMPToLLVM/convert-to-llvmir.mlir (+10-4) 
- (modified) mlir/test/Conversion/SCFToOpenMP/reductions.mlir (+22-8) 
- (modified) mlir/test/Dialect/OpenMP/invalid.mlir (+3-32) 
- (modified) mlir/test/Dialect/OpenMP/ops.mlir (+17-9) 
- (modified) mlir/test/Target/LLVMIR/openmp-reduction.mlir (+30-14) 


``diff
diff --git a/flang/lib/Lower/OpenMP.cpp b/flang/lib/Lower/OpenMP.cpp
index fcf10b26c135b..74cd6c27b3440 100644
--- a/flang/lib/Lower/OpenMP.cpp
+++ b/flang/lib/Lower/OpenMP.cpp
@@ -2274,6 +2274,12 @@ static void createBodyOfOp(
 return undef.getDefiningOp();
   };
 
+  llvm::SmallVector<mlir::Type> blockArgTypes;
+  llvm::SmallVector<mlir::Location> blockArgLocs;
+  blockArgTypes.reserve(loopArgs.size() + reductionArgs.size());
+  blockArgLocs.reserve(blockArgTypes.size());
+  mlir::Block *entryBlock;
+
   // If an argument for the region is provided then create the block with that
   // argument. Also update the symbol's address with the mlir argument value.
   // e.g. For loops the argument is the induction variable. And all further
@@ -2283,11 +2289,21 @@ static void createBodyOfOp(
 for (const Fortran::semantics::Symbol *arg : loopArgs)
   loopVarTypeSize = std::max(loopVarTypeSize, arg->GetUltimate().size());
 mlir::Type loopVarType = getLoopVarType(converter, loopVarTypeSize);
-llvm::SmallVector<mlir::Type> tiv(loopArgs.size(), loopVarType);
-llvm::SmallVector<mlir::Location> locs(loopArgs.size(), loc);
-firOpBuilder.createBlock(&op.getRegion(), {}, tiv, locs);
-// The argument is not currently in memory, so make a temporary for the
-// argument, and store it there, then bind that location to the argument.
+std::fill_n(std::back_inserter(blockArgTypes), loopArgs.size(),
+loopVarType);
+std::fill_n(std::


[llvm-branch-commits] [clang] [Release Notes][FMV] Document support for rcpc3 and mops features. (PR #80152)

2024-02-05 Thread Alexandros Lamprineas via llvm-branch-commits


labrinea wrote:

@tstellar could you please merge this patch on the release branch? Cheers.

https://github.com/llvm/llvm-project/pull/80152

[llvm-branch-commits] [mlir] [flang] [mlir][flang][openmp] Rework wsloop reduction operations (PR #80019)

2024-02-05 Thread Tom Eccles via llvm-branch-commits





tblah wrote:

Please could you update the documentation for reductions on line 442 - I 
presume we don't want to encourage `omp.reduction` operations anymore

https://github.com/llvm/llvm-project/pull/80019

[llvm-branch-commits] [mlir] [flang] [mlir][flang][openmp] Rework wsloop reduction operations (PR #80019)

2024-02-05 Thread Tom Eccles via llvm-branch-commits



@@ -398,11 +400,39 @@ struct ParallelOpLowering : public OpRewritePattern<scf::ParallelOp> {
 // Replace the reduction operations contained in this loop. Must be done
 // here rather than in a separate pattern to have access to the list of
 // reduction variables.
+unsigned int reductionIndex = 0;
 for (auto [x, y] :
      llvm::zip_equal(reductionVariables, reduce.getOperands())) {
   OpBuilder::InsertionGuard guard(rewriter);
   rewriter.setInsertionPoint(reduce);
-  rewriter.create<omp::ReductionOp>(reduce.getLoc(), y, x);
+  Region &redRegion =
+      ompReductionDecls[reductionIndex].getReductionRegion();
+  assert(redRegion.hasOneBlock() &&
+         "expect reduction region to have one block");

tblah wrote:

Please could you add a comment explaining why a reduction region must have only 
one block, or adding a TODO for multiple blocks.

https://github.com/llvm/llvm-project/pull/80019

[llvm-branch-commits] [mlir] [flang] [mlir][flang][openmp] Rework wsloop reduction operations (PR #80019)

2024-02-05 Thread Tom Eccles via llvm-branch-commits



@@ -398,11 +400,39 @@ struct ParallelOpLowering : public OpRewritePattern<scf::ParallelOp> {
 // Replace the reduction operations contained in this loop. Must be done
 // here rather than in a separate pattern to have access to the list of
 // reduction variables.
+unsigned int reductionIndex = 0;
 for (auto [x, y] :
      llvm::zip_equal(reductionVariables, reduce.getOperands())) {

tblah wrote:

nit: you could add `ompReductionDecls` to the `llvm::zip_equal` so that the 
loop handles the iteration.
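
The idea behind this nit can be sketched in plain C++: iterate all the related ranges in lockstep so that no separately maintained index is needed. The helper `forEachZipped` below is a hypothetical stand-in written for this example; it is not `llvm::zip_equal`, although it mirrors its equal-length requirement. The `describe` function and its string output are likewise illustrative only.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a three-way zip: visits element i of each
// container together and asserts up front that the lengths agree.
template <typename A, typename B, typename C, typename Fn>
void forEachZipped(const std::vector<A> &as, const std::vector<B> &bs,
                   const std::vector<C> &cs, Fn fn) {
  assert(as.size() == bs.size() && bs.size() == cs.size() &&
         "zipped ranges must have equal length");
  for (std::size_t i = 0; i < as.size(); ++i)
    fn(as[i], bs[i], cs[i]);
}

// Example use: each variable and operand is paired with its declaration
// directly, so there is no counter to keep in sync with the loop.
std::vector<std::string> describe(const std::vector<std::string> &vars,
                                  const std::vector<int> &operands,
                                  const std::vector<std::string> &decls) {
  std::vector<std::string> out;
  forEachZipped(vars, operands, decls,
                [&](const std::string &v, int op, const std::string &d) {
                  out.push_back(v + "=" + std::to_string(op) + " via " + d);
                });
  return out;
}
```

The length check happens once, before the loop, instead of surfacing later as an out-of-range index.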

https://github.com/llvm/llvm-project/pull/80019

[llvm-branch-commits] [clang] PR for llvm/llvm-project#80432 (PR #80433)

2024-02-05 Thread Dinar Temirbulatov via llvm-branch-commits


dtemirbulatov wrote:

> @dtemirbulatov What do you think about merging this PR to the release branch?

no objections.

https://github.com/llvm/llvm-project/pull/80433

[llvm-branch-commits] [llvm] PR for llvm/llvm-project#80694 (PR #80695)

2024-02-05 Thread via llvm-branch-commits


https://github.com/llvmbot created 
https://github.com/llvm/llvm-project/pull/80695

resolves llvm/llvm-project#80694

>From ba5a8cd31193ed21602781c7f0f23ddd380401cf Mon Sep 17 00:00:00 2001
From: Pierre van Houtryve 
Date: Mon, 5 Feb 2024 14:36:15 +0100
Subject: [PATCH] [AMDGPU][PromoteAlloca] Support memsets to ptr allocas
 (#80678)

Fixes #80366

(cherry picked from commit 4e958abf2f44d08129eafd5b6a4ee2bd3584ed22)
---
 .../lib/Target/AMDGPU/AMDGPUPromoteAlloca.cpp | 16 --
 .../CodeGen/AMDGPU/promote-alloca-memset.ll   | 54 +++
 2 files changed, 66 insertions(+), 4 deletions(-)

diff --git a/llvm/lib/Target/AMDGPU/AMDGPUPromoteAlloca.cpp b/llvm/lib/Target/AMDGPU/AMDGPUPromoteAlloca.cpp
index 5e73411cae9b7..c1b244f50d93f 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUPromoteAlloca.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUPromoteAlloca.cpp
@@ -521,10 +521,18 @@ static Value *promoteAllocaUserToVector(
   // For memset, we don't need to know the previous value because we
   // currently only allow memsets that cover the whole alloca.
   Value *Elt = MSI->getOperand(1);
-  if (DL.getTypeStoreSize(VecEltTy) > 1) {
-    Value *EltBytes =
-        Builder.CreateVectorSplat(DL.getTypeStoreSize(VecEltTy), Elt);
-    Elt = Builder.CreateBitCast(EltBytes, VecEltTy);
+  const unsigned BytesPerElt = DL.getTypeStoreSize(VecEltTy);
+  if (BytesPerElt > 1) {
+    Value *EltBytes = Builder.CreateVectorSplat(BytesPerElt, Elt);
+
+    // If the element type of the vector is a pointer, we need to first cast
+    // to an integer, then use a PtrCast.
+    if (VecEltTy->isPointerTy()) {
+      Type *PtrInt = Builder.getIntNTy(BytesPerElt * 8);
+      Elt = Builder.CreateBitCast(EltBytes, PtrInt);
+      Elt = Builder.CreateIntToPtr(Elt, VecEltTy);
+    } else
+      Elt = Builder.CreateBitCast(EltBytes, VecEltTy);
   }
 
   return Builder.CreateVectorSplat(VectorTy->getElementCount(), Elt);
diff --git a/llvm/test/CodeGen/AMDGPU/promote-alloca-memset.ll b/llvm/test/CodeGen/AMDGPU/promote-alloca-memset.ll
index 15af1f17e230e..f1e2737b370ef 100644
--- a/llvm/test/CodeGen/AMDGPU/promote-alloca-memset.ll
+++ b/llvm/test/CodeGen/AMDGPU/promote-alloca-memset.ll
@@ -84,4 +84,58 @@ entry:
   ret void
 }
 
+define amdgpu_kernel void @memset_array_ptr_alloca(ptr %out) {
+; CHECK-LABEL: @memset_array_ptr_alloca(
+; CHECK-NEXT:    store i64 0, ptr [[OUT:%.*]], align 8
+; CHECK-NEXT:    ret void
+;
+  %alloca = alloca [6 x ptr], align 16, addrspace(5)
+  call void @llvm.memset.p5.i64(ptr addrspace(5) %alloca, i8 0, i64 48, i1 false)
+  %load = load i64, ptr addrspace(5) %alloca
+  store i64 %load, ptr %out
+  ret void
+}
+
+define amdgpu_kernel void @memset_vector_ptr_alloca(ptr %out) {
+; CHECK-LABEL: @memset_vector_ptr_alloca(
+; CHECK-NEXT:    store i64 0, ptr [[OUT:%.*]], align 8
+; CHECK-NEXT:    ret void
+;
+  %alloca = alloca <6 x ptr>, align 16, addrspace(5)
+  call void @llvm.memset.p5.i64(ptr addrspace(5) %alloca, i8 0, i64 48, i1 false)
+  %load = load i64, ptr addrspace(5) %alloca
+  store i64 %load, ptr %out
+  ret void
+}
+
+define amdgpu_kernel void @memset_array_of_array_ptr_alloca(ptr %out) {
+; CHECK-LABEL: @memset_array_of_array_ptr_alloca(
+; CHECK-NEXT:    [[ALLOCA:%.*]] = alloca [2 x [3 x ptr]], align 16, addrspace(5)
+; CHECK-NEXT:    call void @llvm.memset.p5.i64(ptr addrspace(5) [[ALLOCA]], i8 0, i64 48, i1 false)
+; CHECK-NEXT:    [[LOAD:%.*]] = load i64, ptr addrspace(5) [[ALLOCA]], align 8
+; CHECK-NEXT:    store i64 [[LOAD]], ptr [[OUT:%.*]], align 8
+; CHECK-NEXT:    ret void
+;
+  %alloca = alloca [2 x [3 x ptr]], align 16, addrspace(5)
+  call void @llvm.memset.p5.i64(ptr addrspace(5) %alloca, i8 0, i64 48, i1 false)
+  %load = load i64, ptr addrspace(5) %alloca
+  store i64 %load, ptr %out
+  ret void
+}
+
+define amdgpu_kernel void @memset_array_of_vec_ptr_alloca(ptr %out) {
+; CHECK-LABEL: @memset_array_of_vec_ptr_alloca(
+; CHECK-NEXT:    [[ALLOCA:%.*]] = alloca [2 x <3 x ptr>], align 16, addrspace(5)
+; CHECK-NEXT:    call void @llvm.memset.p5.i64(ptr addrspace(5) [[ALLOCA]], i8 0, i64 48, i1 false)
+; CHECK-NEXT:    [[LOAD:%.*]] = load i64, ptr addrspace(5) [[ALLOCA]], align 8
+; CHECK-NEXT:    store i64 [[LOAD]], ptr [[OUT:%.*]], align 8
+; CHECK-NEXT:    ret void
+;
+  %alloca = alloca [2 x <3 x ptr>], align 16, addrspace(5)
+  call void @llvm.memset.p5.i64(ptr addrspace(5) %alloca, i8 0, i64 48, i1 false)
+  %load = load i64, ptr addrspace(5) %alloca
+  store i64 %load, ptr %out
+  ret void
+}
+
 declare void @llvm.memset.p5.i64(ptr addrspace(5) nocapture writeonly, i8, i64, i1 immarg)
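
The value computation in the C++ hunk above can be illustrated with scalars. This is a rough sketch in plain C++ rather than LLVM IR builder code: the memset byte is replicated across the element's store size to form an integer, and for pointer elements that integer is then converted to a pointer as a separate step. The function names `splatByte` and `splatByteAsPointer` are made up for the example, and the sketch assumes elements of at most 8 bytes.

```cpp
#include <cassert>
#include <cstdint>

// Replicates `byte` across the low `bytesPerElt` bytes of an integer. This is
// the scalar analogue of splatting the memset value into a byte vector and
// bitcasting that vector to an integer of the same width.
std::uint64_t splatByte(std::uint8_t byte, unsigned bytesPerElt) {
  assert(bytesPerElt >= 1 && bytesPerElt <= 8 && "element must fit in 64 bits");
  std::uint64_t value = 0;
  for (unsigned i = 0; i < bytesPerElt; ++i)
    value |= static_cast<std::uint64_t>(byte) << (8 * i);
  return value;
}

// Pointer elements take one extra step: the integer is converted to a pointer
// (the analogue of an int-to-pointer cast) instead of being reinterpreted
// directly from the byte vector.
void *splatByteAsPointer(std::uint8_t byte) {
  std::uint64_t bits = splatByte(byte, sizeof(void *));
  return reinterpret_cast<void *>(static_cast<std::uintptr_t>(bits));
}
```

For the tests in this patch the memset value is 0, so every element becomes an all-zero pointer.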


[llvm-branch-commits] [llvm] PR for llvm/llvm-project#80694 (PR #80695)

2024-02-05 Thread via llvm-branch-commits


https://github.com/llvmbot milestoned 
https://github.com/llvm/llvm-project/pull/80695

[llvm-branch-commits] [llvm] PR for llvm/llvm-project#80694 (PR #80695)

2024-02-05 Thread via llvm-branch-commits


llvmbot wrote:

@arsenm What do you think about merging this PR to the release branch?

https://github.com/llvm/llvm-project/pull/80695

[llvm-branch-commits] [llvm] PR for llvm/llvm-project#80694 (PR #80695)

2024-02-05 Thread via llvm-branch-commits


llvmbot wrote:




@llvm/pr-subscribers-backend-amdgpu

Author: None (llvmbot)


Changes

resolves llvm/llvm-project#80694

---
Full diff: https://github.com/llvm/llvm-project/pull/80695.diff


2 Files Affected:

- (modified) llvm/lib/Target/AMDGPU/AMDGPUPromoteAlloca.cpp (+12-4) 
- (modified) llvm/test/CodeGen/AMDGPU/promote-alloca-memset.ll (+54) 



https://github.com/llvm/llvm-project/pull/80695

[llvm-branch-commits] [llvm] [VPlan] Explicitly handle scalar pointer inductions. (PR #80273)

2024-02-05 Thread via llvm-branch-commits


https://github.com/ayalz commented:

Nice refactoring clean-up! Adding some comments.

https://github.com/llvm/llvm-project/pull/80273

[llvm-branch-commits] [llvm] [VPlan] Explicitly handle scalar pointer inductions. (PR #80273)

2024-02-05 Thread via llvm-branch-commits



@@ -489,6 +490,23 @@ Value *VPInstruction::generateInstruction(VPTransformState &State,
 
     return ReducedPartRdx;
   }
+  case VPInstruction::PtrAdd: {
+    if (vputils::onlyFirstLaneUsed(this)) {
+      auto *P = Builder.CreatePtrAdd(
+          State.get(getOperand(0), VPIteration(Part, 0)),
+          State.get(getOperand(1), VPIteration(Part, 0)), Name);
+      State.set(this, P, VPIteration(Part, 0));
+    } else {
+      for (unsigned Lane = 0; Lane != State.VF.getKnownMinValue(); ++Lane) {
+        Value *P = Builder.CreatePtrAdd(
+            State.get(getOperand(0), VPIteration(Part, Lane)),
+            State.get(getOperand(1), VPIteration(Part, Lane)), Name);
+
+        State.set(this, P, VPIteration(Part, Lane));
+      }
+    }
+    return nullptr;

ayalz wrote:

Better for generateInstruction() to continue to generate and return a single 
per-part Value, which is then set in State, possibly renaming it 
generateValuePerPart(), and have a separate generateValuePerLane() - currently 
to be invoked only for PtrAdd having all lanes used?
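
One way to picture the suggested split is a generator that only returns a value, with the caller responsible for recording it. The sketch below is a toy model in plain C++ and not VPlan code: addresses are integers, the state is a map keyed by (part, lane), and the names `State`, `generateValuePerLane` as implemented here, and `executePtrAdd` are assumptions for illustration.

```cpp
#include <cassert>
#include <map>
#include <utility>
#include <vector>

// Hypothetical model: addresses are plain integers and the transform state is
// a map keyed by (part, lane).
using State = std::map<std::pair<unsigned, unsigned>, long>;

// Produces the value for one lane and returns it; storing it is left to the
// caller.
long generateValuePerLane(const std::vector<long> &bases,
                          const std::vector<long> &offsets, unsigned lane) {
  return bases[lane] + offsets[lane];
}

// Caller-side loop: a single lane when only the first lane is used, otherwise
// every lane. The generated value is set in the state here, not inside the
// generator.
void executePtrAdd(State &state, unsigned part, const std::vector<long> &bases,
                   const std::vector<long> &offsets, bool onlyFirstLaneUsed) {
  assert(!bases.empty() && bases.size() == offsets.size() &&
         "expected one base and one offset per lane");
  unsigned lanes = onlyFirstLaneUsed ? 1u : static_cast<unsigned>(bases.size());
  for (unsigned lane = 0; lane != lanes; ++lane)
    state[{part, lane}] = generateValuePerLane(bases, offsets, lane);
}
```

With this shape the generator never needs to return a null placeholder, since it always yields exactly one value per call.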

https://github.com/llvm/llvm-project/pull/80273

[llvm-branch-commits] [llvm] [VPlan] Explicitly handle scalar pointer inductions. (PR #80273)

2024-02-05 Thread via llvm-branch-commits


https://github.com/ayalz edited https://github.com/llvm/llvm-project/pull/80273

[llvm-branch-commits] [llvm] [VPlan] Explicitly handle scalar pointer inductions. (PR #80273)

2024-02-05 Thread via llvm-branch-commits



@@ -540,6 +560,7 @@ bool VPInstruction::onlyFirstLaneUsed(const VPValue *Op) const {
   default:
     return false;
   case Instruction::ICmp:
+  case VPInstruction::PtrAdd:
     // TODO: Cover additional opcodes.

ayalz wrote:

nit: better place this TODO under default?

https://github.com/llvm/llvm-project/pull/80273

[llvm-branch-commits] [llvm] [VPlan] Explicitly handle scalar pointer inductions. (PR #80273)

2024-02-05 Thread via llvm-branch-commits



@@ -537,6 +542,30 @@ void VPlanTransforms::optimizeInductions(VPlan &Plan, ScalarEvolution &SE) {
   bool HasOnlyVectorVFs = !Plan.hasVF(ElementCount::getFixed(1));
   VPBasicBlock::iterator InsertPt = HeaderVPBB->getFirstNonPhi();
   for (VPRecipeBase &Phi : HeaderVPBB->phis()) {
+    if (auto *PtrIV = dyn_cast<VPWidenPointerInductionRecipe>(&Phi)) {
+      if (!PtrIV->onlyScalarsGenerated(Plan.hasScalableVF()))
+        continue;
+
+      const InductionDescriptor &ID = PtrIV->getInductionDescriptor();
+      VPValue *StartV = Plan.getVPValueOrAddLiveIn(
+          ConstantInt::get(ID.getStep()->getType(), 0));
+      VPValue *StepV = PtrIV->getOperand(1);
+      VPRecipeBase *Steps =

ayalz wrote:

Have createScalarIVSteps() return a VPSingleDefRecipe (or even 
VPScalarIVStepsRecipe) to avoid going through getDefiningRecipe() and 
getVPSingleValue()?

https://github.com/llvm/llvm-project/pull/80273

[llvm-branch-commits] [llvm] [VPlan] Explicitly handle scalar pointer inductions. (PR #80273)

2024-02-05 Thread via llvm-branch-commits



@@ -857,11 +857,7 @@ void VPlan::execute(VPTransformState *State) {
     Phi = cast<PHINode>(State->get(R.getVPSingleValue(), 0));
   } else {
     auto *WidenPhi = cast<VPWidenPointerInductionRecipe>(&R);
-    // TODO: Split off the case that all users of a pointer phi are scalar
-    // from the VPWidenPointerInductionRecipe.
-    if (WidenPhi->onlyScalarsGenerated(State->VF.isScalable()))
-      continue;
-
+    assert(!WidenPhi->onlyScalarsGenerated(State->VF.isScalable()));

ayalz wrote:

nit: assert message.
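
For reference, the usual way to attach a message is to `&&` the condition with a string literal. The snippet below is a generic sketch in plain C++: `onlyScalarsGenerated` is a hypothetical stand-in for the recipe query, its logic is invented for the example, and the message wording is illustrative rather than the text the patch should use.

```cpp
#include <cassert>

// Hypothetical predicate standing in for the recipe query in the quoted hunk;
// the logic here is invented for the example.
bool onlyScalarsGenerated(bool isScalable, bool allUsersScalar) {
  return !isScalable && allUsersScalar;
}

int widenPointerInduction(bool isScalable, bool allUsersScalar) {
  // A string literal is never null, so `cond && "text"` has the truth value
  // of `cond`; on failure the text is printed as part of the expression.
  assert(!onlyScalarsGenerated(isScalable, allUsersScalar) &&
         "scalar-only pointer inductions are expected to be handled earlier");
  return 1; // placeholder for the widened code path
}
```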

https://github.com/llvm/llvm-project/pull/80273

[llvm-branch-commits] [llvm] [VPlan] Explicitly handle scalar pointer inductions. (PR #80273)

2024-02-05 Thread via llvm-branch-commits



@@ -546,9 +575,10 @@ void VPlanTransforms::optimizeInductions(VPlan &Plan, ScalarEvolution &SE) {
       continue;
 
     const InductionDescriptor &ID = WideIV->getInductionDescriptor();
-    VPValue *Steps = createScalarIVSteps(Plan, ID, SE, WideIV->getTruncInst(),
-                                         WideIV->getStartValue(),
-                                         WideIV->getStepValue(), InsertPt);
+    VPValue *Steps = createScalarIVSteps(
+        Plan, ID.getKind(), SE, WideIV->getTruncInst(), WideIV->getStartValue(),
+        WideIV->getStepValue(), ID.getInductionOpcode(), InsertPt,
+        dyn_cast_or_null<FPMathOperator>(ID.getInductionBinOp()));

ayalz wrote:

nit: seems more logical to group the three fields of ID and pass them as 
adjacent parameters?

https://github.com/llvm/llvm-project/pull/80273

[llvm-branch-commits] [llvm] [VPlan] Explicitly handle scalar pointer inductions. (PR #80273)

2024-02-05 Thread via llvm-branch-commits



@@ -540,6 +560,7 @@ bool VPInstruction::onlyFirstLaneUsed(const VPValue *Op) const {
   default:
     return false;
   case Instruction::ICmp:
+  case VPInstruction::PtrAdd:
     // TODO: Cover additional opcodes.
     return vputils::onlyFirstLaneUsed(this);
   case VPInstruction::ComputeReductionResult:

ayalz wrote:

nit (unrelated): fall-through to join `true` cases.
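
The layout being suggested groups case labels that share a result so they fall through to a single `return`. Below is a small standalone C++ sketch of that pattern; the `Opcode` enum is hypothetical, and which opcodes belong in the always-true group (here `BranchOnCount` alongside `ComputeReductionResult`) is an assumption made for the example.

```cpp
// Hypothetical opcode set for illustration only.
enum class Opcode { ICmp, PtrAdd, ComputeReductionResult, BranchOnCount, Load };

bool onlyFirstLaneUsed(Opcode op, bool usersOnlyNeedFirstLane) {
  switch (op) {
  case Opcode::ICmp:
  case Opcode::PtrAdd:
    // Depends on how the users consume the value.
    return usersOnlyNeedFirstLane;
  case Opcode::ComputeReductionResult: // falls through to the shared return
  case Opcode::BranchOnCount:
    return true;
  default:
    return false;
  }
}
```

Adjacent labels with no statements between them fall through without needing an annotation, so adding another always-true opcode is a one-line change.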

https://github.com/llvm/llvm-project/pull/80273

[llvm-branch-commits] [llvm] [VPlan] Explicitly handle scalar pointer inductions. (PR #80273)

2024-02-05 Thread via llvm-branch-commits



@@ -2503,6 +2504,12 @@ class VPDerivedIVRecipe : public VPSingleDefRecipe {
             dyn_cast_or_null<FPMathOperator>(IndDesc.getInductionBinOp()),
             Start, CanonicalIV, Step) {}
 
+  VPDerivedIVRecipe(InductionDescriptor::InductionKind Kind, VPValue *Start,
+                    VPCanonicalIVPHIRecipe *CanonicalIV, VPValue *Step,
+                    FPMathOperator *FPBinOp)

ayalz wrote:

nit: this is identical to the private constructor above, except for accepting a 
non-const FPBinOp?

https://github.com/llvm/llvm-project/pull/80273

[llvm-branch-commits] [llvm] [VPlan] Explicitly handle scalar pointer inductions. (PR #80273)

2024-02-05 Thread via llvm-branch-commits



@@ -537,6 +542,30 @@ void VPlanTransforms::optimizeInductions(VPlan &Plan, ScalarEvolution &SE) {
   bool HasOnlyVectorVFs = !Plan.hasVF(ElementCount::getFixed(1));
   VPBasicBlock::iterator InsertPt = HeaderVPBB->getFirstNonPhi();
   for (VPRecipeBase &Phi : HeaderVPBB->phis()) {
+if (auto *PtrIV = dyn_cast<VPWidenPointerInductionRecipe>(&Phi)) {
+  if (!PtrIV->onlyScalarsGenerated(Plan.hasScalableVF()))
+continue;
+
+  const InductionDescriptor &ID = PtrIV->getInductionDescriptor();
+  VPValue *StartV = Plan.getVPValueOrAddLiveIn(
+  ConstantInt::get(ID.getStep()->getType(), 0));

ayalz wrote:

nit: would getting the Type of ID.getStartValue() be more consistent?

https://github.com/llvm/llvm-project/pull/80273

[llvm-branch-commits] [llvm] [VPlan] Explicitly handle scalar pointer inductions. (PR #80273)

2024-02-05 Thread via llvm-branch-commits



@@ -489,15 +489,18 @@ void VPlanTransforms::removeDeadRecipes(VPlan &Plan) {
   }
 }
 
-static VPValue *createScalarIVSteps(VPlan &Plan, const InductionDescriptor &ID,
+static VPValue *createScalarIVSteps(VPlan &Plan,
+InductionDescriptor::InductionKind Kind,
 ScalarEvolution &SE, Instruction *TruncI,
 VPValue *StartV, VPValue *Step,
-VPBasicBlock::iterator IP) {
+Instruction::BinaryOps InductionOpcode,
+VPBasicBlock::iterator IP,
+FPMathOperator *FPBinOp = nullptr) {
   VPBasicBlock *HeaderVPBB = Plan.getVectorLoopRegion()->getEntryBasicBlock();
   VPCanonicalIVPHIRecipe *CanonicalIV = Plan.getCanonicalIV();
   VPSingleDefRecipe *BaseIV = CanonicalIV;
-  if (!CanonicalIV->isCanonical(ID.getKind(), StartV, Step)) {
-BaseIV = new VPDerivedIVRecipe(ID, StartV, CanonicalIV, Step);
+  if (!CanonicalIV->isCanonical(Kind, StartV, Step)) {
+BaseIV = new VPDerivedIVRecipe(Kind, StartV, CanonicalIV, Step, FPBinOp);

ayalz wrote:

Should this refactoring to accept and pass Kind instead of ID be pushed as a 
separate simplifying preparation? 

https://github.com/llvm/llvm-project/pull/80273

[llvm-branch-commits] [llvm] [VPlan] Explicitly handle scalar pointer inductions. (PR #80273)

2024-02-05 Thread via llvm-branch-commits



@@ -515,6 +533,8 @@ void VPInstruction::execute(VPTransformState &State) {
 State.Builder.setFastMathFlags(getFastMathFlags());
   for (unsigned Part = 0; Part < State.UF; ++Part) {
 Value *GeneratedValue = generateInstruction(State, Part);
+if (!GeneratedValue)
+  continue;
 if (!hasResult())
   continue;

ayalz wrote:

```suggestion
if (!GeneratedValue || !hasResult())
  continue;
```

https://github.com/llvm/llvm-project/pull/80273

[llvm-branch-commits] [llvm] [VPlan] Explicitly handle scalar pointer inductions. (PR #80273)

2024-02-05 Thread via llvm-branch-commits



@@ -515,6 +533,8 @@ void VPInstruction::execute(VPTransformState &State) {
 State.Builder.setFastMathFlags(getFastMathFlags());
   for (unsigned Part = 0; Part < State.UF; ++Part) {
 Value *GeneratedValue = generateInstruction(State, Part);
+if (!GeneratedValue)
+  continue;
 if (!hasResult())
   continue;
 assert(GeneratedValue && "generateInstruction must produce a value");

ayalz wrote:

This assert is now redundant.

https://github.com/llvm/llvm-project/pull/80273

[llvm-branch-commits] [llvm] [VPlan] Explicitly handle scalar pointer inductions. (PR #80273)

2024-02-05 Thread via llvm-branch-commits



@@ -537,6 +542,30 @@ void VPlanTransforms::optimizeInductions(VPlan &Plan, ScalarEvolution &SE) {
   bool HasOnlyVectorVFs = !Plan.hasVF(ElementCount::getFixed(1));
   VPBasicBlock::iterator InsertPt = HeaderVPBB->getFirstNonPhi();
   for (VPRecipeBase &Phi : HeaderVPBB->phis()) {

ayalz wrote:

nit: worth adding a comment describing what unfolds next.

Plus revisit the documentation of optimizeInductions().

https://github.com/llvm/llvm-project/pull/80273

[llvm-branch-commits] [llvm] PR for llvm/llvm-project#80694 (PR #80695)

2024-02-05 Thread Pierre van Houtryve via llvm-branch-commits


https://github.com/Pierre-vh approved this pull request.


https://github.com/llvm/llvm-project/pull/80695

[llvm-branch-commits] [llvm] PR for llvm/llvm-project#79175 (PR #80274)

2024-02-05 Thread Florian Hahn via llvm-branch-commits


https://github.com/fhahn approved this pull request.

LGTM as this fixes a miscompile

https://github.com/llvm/llvm-project/pull/80274

[llvm-branch-commits] [llvm] PR for llvm/llvm-project#80694 (PR #80695)

2024-02-05 Thread via llvm-branch-commits


https://github.com/llvmbot updated 
https://github.com/llvm/llvm-project/pull/80695

>From 09303e727e515a7856d5f4cb100c5a9dec00b626 Mon Sep 17 00:00:00 2001
From: Pierre van Houtryve 
Date: Mon, 5 Feb 2024 14:36:15 +0100
Subject: [PATCH] [AMDGPU][PromoteAlloca] Support memsets to ptr allocas
 (#80678)

Fixes #80366

(cherry picked from commit 4e958abf2f44d08129eafd5b6a4ee2bd3584ed22)
---
 .../lib/Target/AMDGPU/AMDGPUPromoteAlloca.cpp | 16 --
 .../CodeGen/AMDGPU/promote-alloca-memset.ll   | 54 +++
 2 files changed, 66 insertions(+), 4 deletions(-)

diff --git a/llvm/lib/Target/AMDGPU/AMDGPUPromoteAlloca.cpp b/llvm/lib/Target/AMDGPU/AMDGPUPromoteAlloca.cpp
index 5e73411cae9b70..c1b244f50d93f8 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUPromoteAlloca.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUPromoteAlloca.cpp
@@ -521,10 +521,18 @@ static Value *promoteAllocaUserToVector(
   // For memset, we don't need to know the previous value because we
   // currently only allow memsets that cover the whole alloca.
   Value *Elt = MSI->getOperand(1);
-  if (DL.getTypeStoreSize(VecEltTy) > 1) {
-Value *EltBytes =
-Builder.CreateVectorSplat(DL.getTypeStoreSize(VecEltTy), Elt);
-Elt = Builder.CreateBitCast(EltBytes, VecEltTy);
+  const unsigned BytesPerElt = DL.getTypeStoreSize(VecEltTy);
+  if (BytesPerElt > 1) {
+Value *EltBytes = Builder.CreateVectorSplat(BytesPerElt, Elt);
+
+// If the element type of the vector is a pointer, we need to first cast
+// to an integer, then use a PtrCast.
+if (VecEltTy->isPointerTy()) {
+  Type *PtrInt = Builder.getIntNTy(BytesPerElt * 8);
+  Elt = Builder.CreateBitCast(EltBytes, PtrInt);
+  Elt = Builder.CreateIntToPtr(Elt, VecEltTy);
+} else
+  Elt = Builder.CreateBitCast(EltBytes, VecEltTy);
   }
 
   return Builder.CreateVectorSplat(VectorTy->getElementCount(), Elt);
diff --git a/llvm/test/CodeGen/AMDGPU/promote-alloca-memset.ll b/llvm/test/CodeGen/AMDGPU/promote-alloca-memset.ll
index 15af1f17e230ec..f1e2737b370ef0 100644
--- a/llvm/test/CodeGen/AMDGPU/promote-alloca-memset.ll
+++ b/llvm/test/CodeGen/AMDGPU/promote-alloca-memset.ll
@@ -84,4 +84,58 @@ entry:
   ret void
 }
 
+define amdgpu_kernel void @memset_array_ptr_alloca(ptr %out) {
+; CHECK-LABEL: @memset_array_ptr_alloca(
+; CHECK-NEXT:store i64 0, ptr [[OUT:%.*]], align 8
+; CHECK-NEXT:ret void
+;
+  %alloca = alloca [6 x ptr], align 16, addrspace(5)
+  call void @llvm.memset.p5.i64(ptr addrspace(5) %alloca, i8 0, i64 48, i1 false)
+  %load = load i64, ptr addrspace(5) %alloca
+  store i64 %load, ptr %out
+  ret void
+}
+
+define amdgpu_kernel void @memset_vector_ptr_alloca(ptr %out) {
+; CHECK-LABEL: @memset_vector_ptr_alloca(
+; CHECK-NEXT:store i64 0, ptr [[OUT:%.*]], align 8
+; CHECK-NEXT:ret void
+;
+  %alloca = alloca <6 x ptr>, align 16, addrspace(5)
+  call void @llvm.memset.p5.i64(ptr addrspace(5) %alloca, i8 0, i64 48, i1 false)
+  %load = load i64, ptr addrspace(5) %alloca
+  store i64 %load, ptr %out
+  ret void
+}
+
+define amdgpu_kernel void @memset_array_of_array_ptr_alloca(ptr %out) {
+; CHECK-LABEL: @memset_array_of_array_ptr_alloca(
+; CHECK-NEXT:[[ALLOCA:%.*]] = alloca [2 x [3 x ptr]], align 16, addrspace(5)
+; CHECK-NEXT:call void @llvm.memset.p5.i64(ptr addrspace(5) [[ALLOCA]], i8 0, i64 48, i1 false)
+; CHECK-NEXT:[[LOAD:%.*]] = load i64, ptr addrspace(5) [[ALLOCA]], align 8
+; CHECK-NEXT:store i64 [[LOAD]], ptr [[OUT:%.*]], align 8
+; CHECK-NEXT:ret void
+;
+  %alloca = alloca [2 x [3 x ptr]], align 16, addrspace(5)
+  call void @llvm.memset.p5.i64(ptr addrspace(5) %alloca, i8 0, i64 48, i1 false)
+  %load = load i64, ptr addrspace(5) %alloca
+  store i64 %load, ptr %out
+  ret void
+}
+
+define amdgpu_kernel void @memset_array_of_vec_ptr_alloca(ptr %out) {
+; CHECK-LABEL: @memset_array_of_vec_ptr_alloca(
+; CHECK-NEXT:[[ALLOCA:%.*]] = alloca [2 x <3 x ptr>], align 16, addrspace(5)
+; CHECK-NEXT:call void @llvm.memset.p5.i64(ptr addrspace(5) [[ALLOCA]], i8 0, i64 48, i1 false)
+; CHECK-NEXT:[[LOAD:%.*]] = load i64, ptr addrspace(5) [[ALLOCA]], align 8
+; CHECK-NEXT:store i64 [[LOAD]], ptr [[OUT:%.*]], align 8
+; CHECK-NEXT:ret void
+;
+  %alloca = alloca [2 x <3 x ptr>], align 16, addrspace(5)
+  call void @llvm.memset.p5.i64(ptr addrspace(5) %alloca, i8 0, i64 48, i1 false)
+  %load = load i64, ptr addrspace(5) %alloca
+  store i64 %load, ptr %out
+  ret void
+}
+
 declare void @llvm.memset.p5.i64(ptr addrspace(5) nocapture writeonly, i8, i64, i1 immarg)
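
For readers unfamiliar with the trick in the AMDGPUPromoteAlloca.cpp hunk above, the following is a rough plain C++ analogue, not LLVM code. It assumes elements of at most 8 bytes; `splatByte` and `splatByteAsPointer` are hypothetical names. The memset byte is repeated across the element's width to form an integer (the splat plus bitcast to `iN`), and only then is that integer reinterpreted as a pointer (the `inttoptr` step), since a vector of bytes cannot be bitcast straight to a pointer type.

```cpp
#include <cassert>
#include <cstdint>

// Repeat one byte across `bytes` bytes of an integer, mirroring what a
// vector splat of i8 followed by a bitcast to iN produces.
std::uint64_t splatByte(std::uint8_t value, unsigned bytes) {
  assert(bytes >= 1 && bytes <= 8 && "sketch only handles up to 64 bits");
  std::uint64_t result = 0;
  for (unsigned i = 0; i < bytes; ++i)
    result = (result << 8) | value;
  return result;
}

// Reinterpret the splatted integer as a pointer-sized value, the plain C++
// counterpart of going through an integer before producing a pointer.
void *splatByteAsPointer(std::uint8_t value) {
  return reinterpret_cast<void *>(
      static_cast<std::uintptr_t>(splatByte(value, sizeof(void *))));
}
```

With a zero byte, as in the tests of this patch, the resulting pointer element is simply the all-zero bit pattern.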


[llvm-branch-commits] [libcxx] PR for llvm/llvm-project#80168 (PR #80702)

2024-02-05 Thread via llvm-branch-commits


https://github.com/llvmbot milestoned 
https://github.com/llvm/llvm-project/pull/80702

[llvm-branch-commits] [libcxx] PR for llvm/llvm-project#80168 (PR #80702)

2024-02-05 Thread via llvm-branch-commits


https://github.com/llvmbot created 
https://github.com/llvm/llvm-project/pull/80702

resolves llvm/llvm-project#80168

>From c04bd5109fe4a15d24e6c66cb91567d0589d33c3 Mon Sep 17 00:00:00 2001
From: Louis Dionne 
Date: Mon, 5 Feb 2024 11:05:46 -0500
Subject: [PATCH] [libc++] Add missing conditionals for feature-test macros
 (#80168)

We noticed that some feature-test macros were not conditional on
configuration flags like _LIBCPP_HAS_NO_FILESYSTEM. As a result, code
attempting to use FTMs would not work as intended.

This patch adds conditionals for a few feature-test macros, but more
issues may exist.

rdar://122020466
(cherry picked from commit f2c84211d2834c73ff874389c6bb47b1c76d391a)
---
 libcxx/include/version|  14 +-
 .../filesystem.version.compile.pass.cpp   |  16 +-
 .../fstream.version.compile.pass.cpp  |  16 +-
 .../iomanip.version.compile.pass.cpp  |  80 +---
 .../mutex.version.compile.pass.cpp|  64 +--
 .../version.version.compile.pass.cpp  | 176 --
 .../generate_feature_test_macro_components.py |  10 +-
 7 files changed, 254 insertions(+), 122 deletions(-)
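
As a reader's note on the commit message above: a minimal sketch of how user code typically consumes a feature-test macro such as `__cpp_lib_filesystem`. It shows why the macro must not be defined when the feature is configured out, because the guarded branch would then fail to compile. The function name `tempDirOrFallback` and the `SKETCH_HAS_FILESYSTEM` macro are hypothetical and not part of libc++.

```cpp
#include <cassert>
#include <string>
#include <system_error>

// <version> is where the standard library publishes its feature-test macros.
#if defined(__has_include)
#  if __has_include(<version>)
#    include <version>
#  endif
#endif

// Only pull in <filesystem> when the library says it is available.
#if defined(__cpp_lib_filesystem) && __cpp_lib_filesystem >= 201703L
#  include <filesystem>
#  define SKETCH_HAS_FILESYSTEM 1
#else
#  define SKETCH_HAS_FILESYSTEM 0
#endif

// Return the system temp directory when <filesystem> is usable, otherwise
// (or on error) the caller-supplied fallback.
std::string tempDirOrFallback(const std::string &fallback) {
  assert(!fallback.empty() && "caller must supply a usable fallback");
#if SKETCH_HAS_FILESYSTEM
  std::error_code ec;
  std::filesystem::path p = std::filesystem::temp_directory_path(ec);
  return ec ? fallback : p.string();
#else
  return fallback;
#endif
}
```

If the macro were defined unconditionally on a build with `_LIBCPP_HAS_NO_FILESYSTEM`, the first branch would be selected and would not compile.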

diff --git a/libcxx/include/version b/libcxx/include/version
index 9e26da8c1b242..d356976d6454a 100644
--- a/libcxx/include/version
+++ b/libcxx/include/version
@@ -266,7 +266,9 @@ __cpp_lib_within_lifetime   
202306L 
 # define __cpp_lib_make_reverse_iterator201402L
 # define __cpp_lib_make_unique  201304L
 # define __cpp_lib_null_iterators   201304L
-# define __cpp_lib_quoted_string_io 201304L
+# if !defined(_LIBCPP_HAS_NO_LOCALIZATION)
+#   define __cpp_lib_quoted_string_io   201304L
+# endif
 # define __cpp_lib_result_of_sfinae 201210L
 # define __cpp_lib_robust_nonmodifying_seq_ops  201304L
 # if !defined(_LIBCPP_HAS_NO_THREADS)
@@ -294,7 +296,7 @@ __cpp_lib_within_lifetime   
202306L 
 # define __cpp_lib_clamp201603L
 # define __cpp_lib_enable_shared_from_this  201603L
 // # define __cpp_lib_execution201603L
-# if _LIBCPP_AVAILABILITY_HAS_FILESYSTEM_LIBRARY
+# if !defined(_LIBCPP_HAS_NO_FILESYSTEM) && _LIBCPP_AVAILABILITY_HAS_FILESYSTEM_LIBRARY
 #   define __cpp_lib_filesystem 201703L
 # endif
 # define __cpp_lib_gcd_lcm  201606L
@@ -323,7 +325,9 @@ __cpp_lib_within_lifetime   
202306L 
 // # define __cpp_lib_parallel_algorithm   201603L
 # define __cpp_lib_raw_memory_algorithms201606L
 # define __cpp_lib_sample   201603L
-# define __cpp_lib_scoped_lock  201703L
+# if !defined(_LIBCPP_HAS_NO_THREADS)
+#   define __cpp_lib_scoped_lock201703L
+# endif
 # if !defined(_LIBCPP_HAS_NO_THREADS)
 #   define __cpp_lib_shared_mutex   201505L
 # endif
@@ -496,7 +500,9 @@ __cpp_lib_within_lifetime   
202306L 
 // # define __cpp_lib_freestanding_optional202311L
 // # define __cpp_lib_freestanding_string_view 202311L
 // # define __cpp_lib_freestanding_variant 202311L
-# define __cpp_lib_fstream_native_handle202306L
+# if !defined(_LIBCPP_HAS_NO_FILESYSTEM) && !defined(_LIBCPP_HAS_NO_LOCALIZATION)
+#   define __cpp_lib_fstream_native_handle  202306L
+# endif
 // # define __cpp_lib_function_ref 202306L
 // # define __cpp_lib_hazard_pointer   202306L
 // # define __cpp_lib_linalg   202311L
diff --git 
a/libcxx/test/std/language.support/support.limits/support.limits.general/filesystem.version.compile.pass.cpp
 
b/libcxx/test/std/language.support/support.limits/support.limits.general/filesystem.version.compile.pass.cpp
index 46ccde800c179..3f03e8be9aeab 100644
--- 
a/libcxx/test/std/language.support/support.limits/support.limits.general/filesystem.version.compile.pass.cpp
+++ 
b/libcxx/test/std/language.support/support.limits/support.limits.general/filesystem.version.compile.pass.cpp
@@ -51,7 +51,7 @@
 #   error "__cpp_lib_char8_t should not be defined before c++20"
 # endif
 
-# if !defined(_LIBCPP_VERSION) || _LIBCPP_AVAILABILITY_HAS_FILESYSTEM_LIBRARY
+# if !defined(_LIBCPP_VERSION) || (!defined(_LIBCPP_HAS_NO_FILESYSTEM) && _LIBCPP_AVAILABILITY_HAS_FILESYSTEM_LIBRARY)
 #   ifndef __cpp_lib_filesystem
 # error "__cpp_lib_filesystem should be defined in c++17"
 #   endif
@@ -60,7 +60,7 @@
 #   endif
 # else
 #   ifdef __cpp_lib_filesystem
-# error "__cpp_lib_filesystem should not be defined when the requirement '!defined(_LIBCPP_VERSION) || _LIBCPP_AVAILABILITY_HAS_FILESYSTEM_LIBRARY' is not met!"
+# error "__cpp_lib_filesystem

[llvm-branch-commits] [libcxx] PR for llvm/llvm-project#80168 (PR #80702)

2024-02-05 Thread via llvm-branch-commits


llvmbot wrote:

@ldionne What do you think about merging this PR to the release branch?

https://github.com/llvm/llvm-project/pull/80702

[llvm-branch-commits] [llvm] PR for llvm/llvm-project#79571 (PR #79572)

2024-02-05 Thread Alina Sbirlea via llvm-branch-commits


https://github.com/alinas approved this pull request.

There are some pre-merge failures to review, but including this in the release 
makes sense.

https://github.com/llvm/llvm-project/pull/79572

[llvm-branch-commits] [libcxx] PR for llvm/llvm-project#80168 (PR #80702)

2024-02-05 Thread via llvm-branch-commits


llvmbot wrote:




@llvm/pr-subscribers-libcxx

Author: None (llvmbot)


Changes

resolves llvm/llvm-project#80168

---

Patch is 30.77 KiB, truncated to 20.00 KiB below, full version: 
https://github.com/llvm/llvm-project/pull/80702.diff


7 Files Affected:

- (modified) libcxx/include/version (+10-4) 
- (modified) 
libcxx/test/std/language.support/support.limits/support.limits.general/filesystem.version.compile.pass.cpp
 (+8-8) 
- (modified) 
libcxx/test/std/language.support/support.limits/support.limits.general/fstream.version.compile.pass.cpp
 (+11-5) 
- (modified) 
libcxx/test/std/language.support/support.limits/support.limits.general/iomanip.version.compile.pass.cpp
 (+55-25) 
- (modified) 
libcxx/test/std/language.support/support.limits/support.limits.general/mutex.version.compile.pass.cpp
 (+44-20) 
- (modified) 
libcxx/test/std/language.support/support.limits/support.limits.general/version.version.compile.pass.cpp
 (+118-58) 
- (modified) libcxx/utils/generate_feature_test_macro_components.py (+8-2) 


``diff
diff --git a/libcxx/include/version b/libcxx/include/version
index 9e26da8c1b242..d356976d6454a 100644
--- a/libcxx/include/version
+++ b/libcxx/include/version
@@ -266,7 +266,9 @@ __cpp_lib_within_lifetime   
202306L 
 # define __cpp_lib_make_reverse_iterator201402L
 # define __cpp_lib_make_unique  201304L
 # define __cpp_lib_null_iterators   201304L
-# define __cpp_lib_quoted_string_io 201304L
+# if !defined(_LIBCPP_HAS_NO_LOCALIZATION)
+#   define __cpp_lib_quoted_string_io   201304L
+# endif
 # define __cpp_lib_result_of_sfinae 201210L
 # define __cpp_lib_robust_nonmodifying_seq_ops  201304L
 # if !defined(_LIBCPP_HAS_NO_THREADS)
@@ -294,7 +296,7 @@ __cpp_lib_within_lifetime   
202306L 
 # define __cpp_lib_clamp201603L
 # define __cpp_lib_enable_shared_from_this  201603L
 // # define __cpp_lib_execution201603L
-# if _LIBCPP_AVAILABILITY_HAS_FILESYSTEM_LIBRARY
+# if !defined(_LIBCPP_HAS_NO_FILESYSTEM) && 
_LIBCPP_AVAILABILITY_HAS_FILESYSTEM_LIBRARY
 #   define __cpp_lib_filesystem 201703L
 # endif
 # define __cpp_lib_gcd_lcm  201606L
@@ -323,7 +325,9 @@ __cpp_lib_within_lifetime   
202306L 
 // # define __cpp_lib_parallel_algorithm   201603L
 # define __cpp_lib_raw_memory_algorithms201606L
 # define __cpp_lib_sample   201603L
-# define __cpp_lib_scoped_lock  201703L
+# if !defined(_LIBCPP_HAS_NO_THREADS)
+#   define __cpp_lib_scoped_lock201703L
+# endif
 # if !defined(_LIBCPP_HAS_NO_THREADS)
 #   define __cpp_lib_shared_mutex   201505L
 # endif
@@ -496,7 +500,9 @@ __cpp_lib_within_lifetime   
202306L 
 // # define __cpp_lib_freestanding_optional202311L
 // # define __cpp_lib_freestanding_string_view 202311L
 // # define __cpp_lib_freestanding_variant 202311L
-# define __cpp_lib_fstream_native_handle202306L
+# if !defined(_LIBCPP_HAS_NO_FILESYSTEM) && 
!defined(_LIBCPP_HAS_NO_LOCALIZATION)
+#   define __cpp_lib_fstream_native_handle  202306L
+# endif
 // # define __cpp_lib_function_ref 202306L
 // # define __cpp_lib_hazard_pointer   202306L
 // # define __cpp_lib_linalg   202311L
diff --git 
a/libcxx/test/std/language.support/support.limits/support.limits.general/filesystem.version.compile.pass.cpp
 
b/libcxx/test/std/language.support/support.limits/support.limits.general/filesystem.version.compile.pass.cpp
index 46ccde800c179..3f03e8be9aeab 100644
--- 
a/libcxx/test/std/language.support/support.limits/support.limits.general/filesystem.version.compile.pass.cpp
+++ 
b/libcxx/test/std/language.support/support.limits/support.limits.general/filesystem.version.compile.pass.cpp
@@ -51,7 +51,7 @@
 #   error "__cpp_lib_char8_t should not be defined before c++20"
 # endif
 
-# if !defined(_LIBCPP_VERSION) || _LIBCPP_AVAILABILITY_HAS_FILESYSTEM_LIBRARY
+# if !defined(_LIBCPP_VERSION) || (!defined(_LIBCPP_HAS_NO_FILESYSTEM) && 
_LIBCPP_AVAILABILITY_HAS_FILESYSTEM_LIBRARY)
 #   ifndef __cpp_lib_filesystem
 # error "__cpp_lib_filesystem should be defined in c++17"
 #   endif
@@ -60,7 +60,7 @@
 #   endif
 # else
 #   ifdef __cpp_lib_filesystem
-# error "__cpp_lib_filesystem should not be defined when the requirement 
'!defined(_LIBCPP_VERSION) || _LIBCPP_AVAILABILITY_HAS_FILESYSTEM_LIBRARY' is 
not met!"
+# error "__cpp_lib_filesystem should not be defined when the requirement 
'!defined(_LIBCPP_VERSION) || (!defined(_LIBCPP_HAS_NO_FILESYSTEM) && 
_LIBCPP_AVAILABI

[llvm-branch-commits] [libcxx] PR for llvm/llvm-project#80168 (PR #80702)

2024-02-05 Thread Louis Dionne via llvm-branch-commits


ldionne wrote:

> @ldionne What do you think about merging this PR to the release branch?

Approved!

https://github.com/llvm/llvm-project/pull/80702

[llvm-branch-commits] [mlir] 4ce4248 - Revert "[mlir][openacc] Add legalize data pass for compute operation (#80351)"

2024-02-05 Thread via llvm-branch-commits


Author: Valentin Clement (バレンタイン クレメン)
Date: 2024-02-05T08:47:02-08:00
New Revision: 4ce4248b450f71324d547d78fdf3dd48bb76d587

URL: 
https://github.com/llvm/llvm-project/commit/4ce4248b450f71324d547d78fdf3dd48bb76d587
DIFF: 
https://github.com/llvm/llvm-project/commit/4ce4248b450f71324d547d78fdf3dd48bb76d587.diff

LOG: Revert "[mlir][openacc] Add legalize data pass for compute operation 
(#80351)"

This reverts commit 29d47513b3ce706b5df66409170e40ba39f3795a.

Added: 


Modified: 
flang/include/flang/Optimizer/Support/InitFIR.h
mlir/include/mlir/Dialect/OpenACC/CMakeLists.txt
mlir/include/mlir/InitAllPasses.h
mlir/lib/Dialect/OpenACC/CMakeLists.txt

Removed: 
flang/test/Fir/OpenACC/legalize-data.fir
mlir/include/mlir/Dialect/OpenACC/Transforms/CMakeLists.txt
mlir/include/mlir/Dialect/OpenACC/Transforms/Passes.h
mlir/include/mlir/Dialect/OpenACC/Transforms/Passes.td
mlir/lib/Dialect/OpenACC/IR/CMakeLists.txt
mlir/lib/Dialect/OpenACC/Transforms/CMakeLists.txt
mlir/lib/Dialect/OpenACC/Transforms/LegalizeData.cpp
mlir/test/Dialect/OpenACC/legalize-data.mlir



diff  --git a/flang/include/flang/Optimizer/Support/InitFIR.h 
b/flang/include/flang/Optimizer/Support/InitFIR.h
index b5c41699205f4..8c47ad3d9f445 100644
--- a/flang/include/flang/Optimizer/Support/InitFIR.h
+++ b/flang/include/flang/Optimizer/Support/InitFIR.h
@@ -19,7 +19,6 @@
 #include "mlir/Dialect/Affine/Passes.h"
 #include "mlir/Dialect/Complex/IR/Complex.h"
 #include "mlir/Dialect/Func/Extensions/InlinerExtension.h"
-#include "mlir/Dialect/OpenACC/Transforms/Passes.h"
 #include "mlir/InitAllDialects.h"
 #include "mlir/Pass/Pass.h"
 #include "mlir/Pass/PassRegistry.h"
@@ -75,7 +74,6 @@ inline void loadDialects(mlir::MLIRContext &context) {
 /// Register the standard passes we use. This comes from registerAllPasses(),
 /// but is a smaller set since we aren't using many of the passes found there.
 inline void registerMLIRPassesForFortranTools() {
-  mlir::acc::registerOpenACCPasses();
   mlir::registerCanonicalizerPass();
   mlir::registerCSEPass();
   mlir::affine::registerAffineLoopFusionPass();

diff  --git a/flang/test/Fir/OpenACC/legalize-data.fir 
b/flang/test/Fir/OpenACC/legalize-data.fir
deleted file mode 100644
index 3b8695434e6e4..0
--- a/flang/test/Fir/OpenACC/legalize-data.fir
+++ /dev/null
@@ -1,24 +0,0 @@
-// RUN: fir-opt -split-input-file --openacc-legalize-data %s | FileCheck %s
-
-func.func @_QPsub1(%arg0: !fir.ref {fir.bindc_name = "i"}) {
-  %0:2 = hlfir.declare %arg0 {uniq_name = "_QFsub1Ei"} : (!fir.ref) -> 
(!fir.ref, !fir.ref)
-  %1 = acc.copyin varPtr(%0#0 : !fir.ref) -> !fir.ref {dataClause = 
#acc, name = "i"}
-  acc.parallel dataOperands(%1 : !fir.ref) {
-%c0_i32 = arith.constant 0 : i32
-hlfir.assign %c0_i32 to %0#0 : i32, !fir.ref
-acc.yield
-  }
-  acc.copyout accPtr(%1 : !fir.ref) to varPtr(%0#0 : !fir.ref) 
{dataClause = #acc, name = "i"}
-  return
-}
-
-// CHECK-LABEL: func.func @_QPsub1
-// CHECK-SAME: (%[[ARG0:.*]]: !fir.ref {fir.bindc_name = "i"})
-// CHECK: %[[I:.*]]:2 = hlfir.declare %[[ARG0]] {uniq_name = "_QFsub1Ei"} : 
(!fir.ref) -> (!fir.ref, !fir.ref)
-// CHECK: %[[COPYIN:.*]] = acc.copyin varPtr(%[[I]]#0 : !fir.ref) -> 
!fir.ref {dataClause = #acc, name = "i"}
-// CHECK: acc.parallel dataOperands(%[[COPYIN]] : !fir.ref) {
-// CHECK:   %c0_i32 = arith.constant 0 : i32
-// CHECK:   hlfir.assign %c0{{.*}} to %[[COPYIN]] : i32, !fir.ref
-// CHECK:   acc.yield
-// CHECK: }
-// CHECK: acc.copyout accPtr(%[[COPYIN]] : !fir.ref) to varPtr(%[[I]]#0 : 
!fir.ref) {dataClause = #acc, name = "i"}

diff  --git a/mlir/include/mlir/Dialect/OpenACC/CMakeLists.txt 
b/mlir/include/mlir/Dialect/OpenACC/CMakeLists.txt
index 8a4b1c7b196ea..56ba2976ee5d4 100644
--- a/mlir/include/mlir/Dialect/OpenACC/CMakeLists.txt
+++ b/mlir/include/mlir/Dialect/OpenACC/CMakeLists.txt
@@ -1,5 +1,3 @@
-add_subdirectory(Transforms)
-
 set(LLVM_TARGET_DEFINITIONS 
${LLVM_MAIN_INCLUDE_DIR}/llvm/Frontend/OpenACC/ACC.td)
 mlir_tablegen(AccCommon.td --gen-directive-decl --directives-dialect=OpenACC)
 add_public_tablegen_target(acc_common_td)

diff  --git a/mlir/include/mlir/Dialect/OpenACC/Transforms/CMakeLists.txt 
b/mlir/include/mlir/Dialect/OpenACC/Transforms/CMakeLists.txt
deleted file mode 100644
index ddbd5839576fc..0
--- a/mlir/include/mlir/Dialect/OpenACC/Transforms/CMakeLists.txt
+++ /dev/null
@@ -1,5 +0,0 @@
-set(LLVM_TARGET_DEFINITIONS Passes.td)
-mlir_tablegen(Passes.h.inc -gen-pass-decls -name OpenACC)
-add_public_tablegen_target(MLIROpenACCPassIncGen)
-
-add_mlir_doc(Passes OpenACCPasses ./ -gen-pass-doc)

diff  --git a/mlir/include/mlir/Dialect/OpenACC/Transforms/Passes.h 
b/mlir/include/mlir/Dialect/OpenACC/Transforms/Passes.h
deleted file mode 100644
index 5a11056cda609..0
--- a/mlir/include/mlir/Dialect/OpenACC/Transforms/Passes.h
+++ /dev/null
@@ -1,40 +

[llvm-branch-commits] [mlir] 248b916 - Revert "[mlir][vector] Drop inner unit dims for transfer ops on dynamic shape…"

2024-02-05 Thread via llvm-branch-commits


Author: Han-Chung Wang
Date: 2024-02-05T09:06:32-08:00
New Revision: 248b9161015f4c030294359182169e96737998c3

URL: 
https://github.com/llvm/llvm-project/commit/248b9161015f4c030294359182169e96737998c3
DIFF: 
https://github.com/llvm/llvm-project/commit/248b9161015f4c030294359182169e96737998c3.diff

LOG: Revert "[mlir][vector] Drop inner unit dims for transfer ops on dynamic 
shape…"

This reverts commit 66347e516e22f9159b86024071fb92f364ac4418.

Added: 


Modified: 
mlir/lib/Dialect/Vector/Transforms/VectorTransforms.cpp
mlir/test/Dialect/Vector/vector-transfer-collapse-inner-most-dims.mlir

Removed: 




diff  --git a/mlir/lib/Dialect/Vector/Transforms/VectorTransforms.cpp 
b/mlir/lib/Dialect/Vector/Transforms/VectorTransforms.cpp
index 8363e73857e5c..12aa11e9e33f5 100644
--- a/mlir/lib/Dialect/Vector/Transforms/VectorTransforms.cpp
+++ b/mlir/lib/Dialect/Vector/Transforms/VectorTransforms.cpp
@@ -1236,7 +1236,7 @@ class DropInnerMostUnitDimsTransferRead
   return failure();
 
 auto srcType = dyn_cast(readOp.getSource().getType());
-if (!srcType)
+if (!srcType || !srcType.hasStaticShape())
   return failure();
 
 if (!readOp.getPermutationMap().isMinorIdentity())
@@ -1260,21 +1260,19 @@ class DropInnerMostUnitDimsTransferRead
 targetType.getElementType());
 
 auto loc = readOp.getLoc();
-SmallVector sizes =
-memref::getMixedSizes(rewriter, loc, readOp.getSource());
-SmallVector offsets(srcType.getRank(),
-  rewriter.getIndexAttr(0));
-SmallVector strides(srcType.getRank(),
-  rewriter.getIndexAttr(1));
 MemRefType resultMemrefType =
 getMemRefTypeWithDroppingInnerDims(rewriter, srcType, dimsToDrop);
+SmallVector offsets(srcType.getRank(), 0);
+SmallVector strides(srcType.getRank(), 1);
+
 ArrayAttr inBoundsAttr =
 readOp.getInBounds()
 ? rewriter.getArrayAttr(
   readOp.getInBoundsAttr().getValue().drop_back(dimsToDrop))
 : ArrayAttr();
 Value rankedReducedView = rewriter.create(
-loc, resultMemrefType, readOp.getSource(), offsets, sizes, strides);
+loc, resultMemrefType, readOp.getSource(), offsets, srcType.getShape(),
+strides);
 auto permMap = getTransferMinorIdentityMap(
 cast(rankedReducedView.getType()), resultTargetVecType);
 Value result = rewriter.create(
@@ -1320,7 +1318,7 @@ class DropInnerMostUnitDimsTransferWrite
   return failure();
 
 auto srcType = dyn_cast(writeOp.getSource().getType());
-if (!srcType)
+if (!srcType || !srcType.hasStaticShape())
   return failure();
 
 if (!writeOp.getPermutationMap().isMinorIdentity())
@@ -1343,23 +1341,20 @@ class DropInnerMostUnitDimsTransferWrite
 VectorType::get(targetType.getShape().drop_back(dimsToDrop),
 targetType.getElementType());
 
-Location loc = writeOp.getLoc();
-SmallVector sizes =
-memref::getMixedSizes(rewriter, loc, writeOp.getSource());
-SmallVector offsets(srcType.getRank(),
-  rewriter.getIndexAttr(0));
-SmallVector strides(srcType.getRank(),
-  rewriter.getIndexAttr(1));
 MemRefType resultMemrefType =
 getMemRefTypeWithDroppingInnerDims(rewriter, srcType, dimsToDrop);
+SmallVector offsets(srcType.getRank(), 0);
+SmallVector strides(srcType.getRank(), 1);
 ArrayAttr inBoundsAttr =
 writeOp.getInBounds()
 ? rewriter.getArrayAttr(
   writeOp.getInBoundsAttr().getValue().drop_back(dimsToDrop))
 : ArrayAttr();
 
+Location loc = writeOp.getLoc();
 Value rankedReducedView = rewriter.create(
-loc, resultMemrefType, writeOp.getSource(), offsets, sizes, strides);
+loc, resultMemrefType, writeOp.getSource(), offsets, 
srcType.getShape(),
+strides);
 auto permMap = getTransferMinorIdentityMap(
 cast(rankedReducedView.getType()), resultTargetVecType);
 

diff  --git 
a/mlir/test/Dialect/Vector/vector-transfer-collapse-inner-most-dims.mlir 
b/mlir/test/Dialect/Vector/vector-transfer-collapse-inner-most-dims.mlir
index 3984f17f9e8cd..d6d69c8af8850 100644
--- a/mlir/test/Dialect/Vector/vector-transfer-collapse-inner-most-dims.mlir
+++ b/mlir/test/Dialect/Vector/vector-transfer-collapse-inner-most-dims.mlir
@@ -16,25 +16,6 @@ func.func @contiguous_inner_most_view(%in: 
memref<1x1x8x1xf32, strided<[3072, 8,
 
 // -
 
-func.func @contiguous_outer_dyn_inner_most_view(%in: memref>) -> vector<1x8x1xf32>{
-  %c0 = arith.constant 0 : index
-  %cst = arith.constant 0.0 : f32
-  %0 = vector.transfer_read %in[%c0, %c0, %c0, %c0], %cst {in_bounds = [true, 
true, true]} : memref>, 
vector<1x8x1xf32>
-  return %0 : vector<1x8x1xf32>
-}
-//

[llvm-branch-commits] [llvm] PR for llvm/llvm-project#80715 (PR #80716)

2024-02-05 Thread via llvm-branch-commits


https://github.com/llvmbot milestoned 
https://github.com/llvm/llvm-project/pull/80716

[llvm-branch-commits] [llvm] PR for llvm/llvm-project#80715 (PR #80716)

2024-02-05 Thread via llvm-branch-commits


llvmbot wrote:

@yxsamliu What do you think about merging this PR to the release branch?

https://github.com/llvm/llvm-project/pull/80716

[llvm-branch-commits] [llvm] PR for llvm/llvm-project#80715 (PR #80716)

2024-02-05 Thread Yaxun Liu via llvm-branch-commits


https://github.com/yxsamliu approved this pull request.


https://github.com/llvm/llvm-project/pull/80716

[llvm-branch-commits] [llvm] PR for llvm/llvm-project#80715 (PR #80716)

2024-02-05 Thread via llvm-branch-commits


llvmbot wrote:




@llvm/pr-subscribers-backend-amdgpu

Author: None (llvmbot)


Changes

resolves llvm/llvm-project#80715

---

Patch is 295.23 KiB, truncated to 20.00 KiB below, full version: 
https://github.com/llvm/llvm-project/pull/80716.diff


3 Files Affected:

- (modified) llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp (+1) 
- (modified) llvm/test/CodeGen/AMDGPU/div_i128.ll (+5443-6) 
- (added) llvm/test/CodeGen/AMDGPU/div_v2i128.ll (+25) 


``diff
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp 
b/llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp
index 55d95154c75878..2af53a664ff173 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp
@@ -577,6 +577,7 @@ AMDGPUTargetLowering::AMDGPUTargetLowering(const TargetMachine &TM,
ISD::AssertSext, ISD::INTRINSIC_WO_CHAIN});
 
   setMaxAtomicSizeInBitsSupported(64);
+  setMaxDivRemBitWidthSupported(64);
 }
 
 bool AMDGPUTargetLowering::mayIgnoreSignedZero(SDValue Op) const {
diff --git a/llvm/test/CodeGen/AMDGPU/div_i128.ll b/llvm/test/CodeGen/AMDGPU/div_i128.ll
index 4aa97c57cbd9c2..5296ad3ab51d31 100644
--- a/llvm/test/CodeGen/AMDGPU/div_i128.ll
+++ b/llvm/test/CodeGen/AMDGPU/div_i128.ll
@@ -1,9 +1,5446 @@
-; RUN: not --crash llc -global-isel=0 -mtriple=amdgcn-amd-amdhsa -verify-machineinstrs -o - %s 2>&1 | FileCheck -check-prefix=SDAG-ERR %s
-; RUN: not --crash llc -global-isel=1 -mtriple=amdgcn-amd-amdhsa -verify-machineinstrs -o - %s 2>&1 | FileCheck -check-prefix=GISEL-ERR %s
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 4
+; RUN: llc -global-isel=0 -mtriple=amdgcn-amd-amdhsa -mcpu=gfx900 -o - %s | FileCheck -check-prefixes=GFX9,GFX9-SDAG %s
+; RUN: llc -O0 -global-isel=0 -mtriple=amdgcn-amd-amdhsa -mcpu=gfx900 -o - %s | FileCheck -check-prefixes=GFX9-O0,GFX9-SDAG-O0 %s
+
+; FIXME: GlobalISel missing the power-of-2 cases in legalization. https://github.com/llvm/llvm-project/issues/80671
+; xUN: llc -global-isel=1 -mtriple=amdgcn-amd-amdhsa -mcpu=gfx900 -o - %s | FileCheck -check-prefixes=GFX9,GFX9 %s
+; xUN: llc -O0 -global-isel=1 -mtriple=amdgcn-amd-amdhsa -mcpu=gfx900 -o - %s | FileCheck -check-prefixes=GFX9-O0,GFX9-O0 %s
 
-; SDAG-ERR: LLVM ERROR: unsupported libcall legalization
-; GISEL-ERR: LLVM ERROR: unable to legalize instruction: %{{[0-9]+}}:_(s128) = G_SDIV %{{[0-9]+}}:_, %{{[0-9]+}}:_ (in function: v_sdiv_i128_vv)
 define i128 @v_sdiv_i128_vv(i128 %lhs, i128 %rhs) {
-  %shl = sdiv i128 %lhs, %rhs
-  ret i128 %shl
+; GFX9-LABEL: v_sdiv_i128_vv:
+; GFX9:   ; %bb.0: ; %_udiv-special-cases
+; GFX9-NEXT:s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; GFX9-NEXT:v_ashrrev_i32_e32 v16, 31, v3
+; GFX9-NEXT:v_xor_b32_e32 v0, v16, v0
+; GFX9-NEXT:v_xor_b32_e32 v1, v16, v1
+; GFX9-NEXT:v_sub_co_u32_e32 v8, vcc, v0, v16
+; GFX9-NEXT:v_xor_b32_e32 v2, v16, v2
+; GFX9-NEXT:v_subb_co_u32_e32 v9, vcc, v1, v16, vcc
+; GFX9-NEXT:v_ashrrev_i32_e32 v17, 31, v7
+; GFX9-NEXT:v_xor_b32_e32 v3, v16, v3
+; GFX9-NEXT:v_subb_co_u32_e32 v10, vcc, v2, v16, vcc
+; GFX9-NEXT:v_subb_co_u32_e32 v11, vcc, v3, v16, vcc
+; GFX9-NEXT:v_xor_b32_e32 v3, v17, v4
+; GFX9-NEXT:v_xor_b32_e32 v2, v17, v5
+; GFX9-NEXT:v_sub_co_u32_e32 v20, vcc, v3, v17
+; GFX9-NEXT:v_xor_b32_e32 v0, v17, v6
+; GFX9-NEXT:v_subb_co_u32_e32 v21, vcc, v2, v17, vcc
+; GFX9-NEXT:v_xor_b32_e32 v1, v17, v7
+; GFX9-NEXT:v_subb_co_u32_e32 v0, vcc, v0, v17, vcc
+; GFX9-NEXT:v_subb_co_u32_e32 v1, vcc, v1, v17, vcc
+; GFX9-NEXT:v_or_b32_e32 v3, v21, v1
+; GFX9-NEXT:v_or_b32_e32 v2, v20, v0
+; GFX9-NEXT:v_cmp_eq_u64_e32 vcc, 0, v[2:3]
+; GFX9-NEXT:v_or_b32_e32 v3, v9, v11
+; GFX9-NEXT:v_or_b32_e32 v2, v8, v10
+; GFX9-NEXT:v_cmp_eq_u64_e64 s[4:5], 0, v[2:3]
+; GFX9-NEXT:v_ffbh_u32_e32 v2, v0
+; GFX9-NEXT:v_add_u32_e32 v2, 32, v2
+; GFX9-NEXT:v_ffbh_u32_e32 v3, v1
+; GFX9-NEXT:v_min_u32_e32 v2, v2, v3
+; GFX9-NEXT:v_ffbh_u32_e32 v3, v20
+; GFX9-NEXT:v_add_u32_e32 v3, 32, v3
+; GFX9-NEXT:v_ffbh_u32_e32 v4, v21
+; GFX9-NEXT:v_min_u32_e32 v3, v3, v4
+; GFX9-NEXT:s_or_b64 s[4:5], vcc, s[4:5]
+; GFX9-NEXT:v_add_co_u32_e32 v3, vcc, 64, v3
+; GFX9-NEXT:v_addc_co_u32_e64 v4, s[6:7], 0, 0, vcc
+; GFX9-NEXT:v_cmp_ne_u64_e32 vcc, 0, v[0:1]
+; GFX9-NEXT:v_ffbh_u32_e32 v5, v11
+; GFX9-NEXT:v_cndmask_b32_e32 v2, v3, v2, vcc
+; GFX9-NEXT:v_ffbh_u32_e32 v3, v10
+; GFX9-NEXT:v_add_u32_e32 v3, 32, v3
+; GFX9-NEXT:v_min_u32_e32 v3, v3, v5
+; GFX9-NEXT:v_ffbh_u32_e32 v5, v8
+; GFX9-NEXT:v_add_u32_e32 v5, 32, v5
+; GFX9-NEXT:v_ffbh_u32_e32 v6, v9
+; GFX9-NEXT:v_min_u32_e32 v5, v5, v6
+; GFX9-NEXT:v_cndmask_b32_e64 v4, v4, 0, vcc
+; GFX9-NEXT:v_add_co_u32_e32 v5, vcc, 64, v5
+; GFX9-NEXT:v_addc_co_u32_e64 v6, s[6:7], 0, 0, vcc
+; GFX9-NEXT:v_cmp_ne_u64_e32 vcc, 0, v[10:11]
+; GFX9-NEXT:

[llvm-branch-commits] [libcxx] PR for llvm/llvm-project#80718 (PR #80720)

2024-02-05 Thread via llvm-branch-commits


https://github.com/llvmbot milestoned 
https://github.com/llvm/llvm-project/pull/80720

[llvm-branch-commits] [libcxx] PR for llvm/llvm-project#80718 (PR #80720)

2024-02-05 Thread via llvm-branch-commits


llvmbot wrote:

@philnik777 What do you think about merging this PR to the release branch?

https://github.com/llvm/llvm-project/pull/80720

[llvm-branch-commits] [libcxx] PR for llvm/llvm-project#80718 (PR #80720)

2024-02-05 Thread via llvm-branch-commits


https://github.com/llvmbot created 
https://github.com/llvm/llvm-project/pull/80720

resolves llvm/llvm-project#80718

>From 8df2173d644846197b4285bccc3aba3a651d1521 Mon Sep 17 00:00:00 2001
From: Dimitry Andric 
Date: Mon, 5 Feb 2024 17:41:12 +0100
Subject: [PATCH] [libc++] Rename __bit_reference template parameter to avoid
 conflict (#80661)

As of 4d20cfcf4eb08217ed37c4d4c38dc395d7a66d26, `__bit_reference`
contains a template `__fill_n` with a bool `_FillValue` parameter.

Unfortunately there is a relatively widely used piece of scientific
software called NetCDF, which exposes a (C) macro `_FillValue` in its
public headers.

When building the NetCDF C++ bindings, this quickly leads to compilation
errors when the macro interferes with the template in `__bit_reference`.

Rename the parameter to `_FillVal` to avoid the conflict.

(cherry picked from commit 1ec252298925de50b27930c557ba9de3cc397afe)
---
 libcxx/include/__bit_reference | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/libcxx/include/__bit_reference b/libcxx/include/__bit_reference
index 9032b8f018093..3a5339b72ddc3 100644
--- a/libcxx/include/__bit_reference
+++ b/libcxx/include/__bit_reference
@@ -173,7 +173,7 @@ private:
 
 // fill_n
 
-template <bool _FillValue, class _Cp>
+template <bool _FillVal, class _Cp>
 _LIBCPP_CONSTEXPR_SINCE_CXX20 _LIBCPP_HIDE_FROM_ABI void
 __fill_n(__bit_iterator<_Cp, false> __first, typename _Cp::size_type __n) {
   using _It= __bit_iterator<_Cp, false>;
@@ -185,7 +185,7 @@ __fill_n(__bit_iterator<_Cp, false> __first, typename _Cp::size_type __n) {
 __storage_type __clz_f = static_cast<__storage_type>(__bits_per_word - __first.__ctz_);
 __storage_type __dn= std::min(__clz_f, __n);
 __storage_type __m = (~__storage_type(0) << __first.__ctz_) & (~__storage_type(0) >> (__clz_f - __dn));
-if (_FillValue)
+if (_FillVal)
   *__first.__seg_ |= __m;
 else
   *__first.__seg_ &= ~__m;
@@ -194,13 +194,13 @@ __fill_n(__bit_iterator<_Cp, false> __first, typename _Cp::size_type __n) {
   }
   // do middle whole words
   __storage_type __nw = __n / __bits_per_word;
-  std::fill_n(std::__to_address(__first.__seg_), __nw, _FillValue ? static_cast<__storage_type>(-1) : 0);
+  std::fill_n(std::__to_address(__first.__seg_), __nw, _FillVal ? static_cast<__storage_type>(-1) : 0);
   __n -= __nw * __bits_per_word;
   // do last partial word
   if (__n > 0) {
 __first.__seg_ += __nw;
 __storage_type __m = ~__storage_type(0) >> (__bits_per_word - __n);
-if (_FillValue)
+if (_FillVal)
   *__first.__seg_ |= __m;
 else
   *__first.__seg_ &= ~__m;
@@ -1007,7 +1007,7 @@ private:
   friend class __bit_iterator<_Cp, true>;
   template 
   friend struct __bit_array;
-  template <bool _FillValue, class _Dp>
+  template <bool _FillVal, class _Dp>
   _LIBCPP_CONSTEXPR_SINCE_CXX20 friend void __fill_n(__bit_iterator<_Dp, false> __first, typename _Dp::size_type __n);
 
   template 


[llvm-branch-commits] [libcxx] PR for llvm/llvm-project#80718 (PR #80720)

2024-02-05 Thread via llvm-branch-commits


llvmbot wrote:




@llvm/pr-subscribers-libcxx

Author: None (llvmbot)


Changes

resolves llvm/llvm-project#80718

---
Full diff: https://github.com/llvm/llvm-project/pull/80720.diff


1 Files Affected:

- (modified) libcxx/include/__bit_reference (+5-5) 


```diff
diff --git a/libcxx/include/__bit_reference b/libcxx/include/__bit_reference
index 9032b8f018093..3a5339b72ddc3 100644
--- a/libcxx/include/__bit_reference
+++ b/libcxx/include/__bit_reference
@@ -173,7 +173,7 @@ private:
 
 // fill_n
 
-template <bool _FillValue, class _Cp>
+template <bool _FillVal, class _Cp>
 _LIBCPP_CONSTEXPR_SINCE_CXX20 _LIBCPP_HIDE_FROM_ABI void
 __fill_n(__bit_iterator<_Cp, false> __first, typename _Cp::size_type __n) {
   using _It= __bit_iterator<_Cp, false>;
@@ -185,7 +185,7 @@ __fill_n(__bit_iterator<_Cp, false> __first, typename _Cp::size_type __n) {
 __storage_type __clz_f = static_cast<__storage_type>(__bits_per_word - __first.__ctz_);
 __storage_type __dn= std::min(__clz_f, __n);
 __storage_type __m = (~__storage_type(0) << __first.__ctz_) & (~__storage_type(0) >> (__clz_f - __dn));
-if (_FillValue)
+if (_FillVal)
   *__first.__seg_ |= __m;
 else
   *__first.__seg_ &= ~__m;
@@ -194,13 +194,13 @@ __fill_n(__bit_iterator<_Cp, false> __first, typename _Cp::size_type __n) {
   }
   // do middle whole words
   __storage_type __nw = __n / __bits_per_word;
-  std::fill_n(std::__to_address(__first.__seg_), __nw, _FillValue ? static_cast<__storage_type>(-1) : 0);
+  std::fill_n(std::__to_address(__first.__seg_), __nw, _FillVal ? static_cast<__storage_type>(-1) : 0);
   __n -= __nw * __bits_per_word;
   // do last partial word
   if (__n > 0) {
 __first.__seg_ += __nw;
 __storage_type __m = ~__storage_type(0) >> (__bits_per_word - __n);
-if (_FillValue)
+if (_FillVal)
   *__first.__seg_ |= __m;
 else
   *__first.__seg_ &= ~__m;
@@ -1007,7 +1007,7 @@ private:
   friend class __bit_iterator<_Cp, true>;
   template 
   friend struct __bit_array;
-  template <bool _FillValue, class _Dp>
+  template <bool _FillVal, class _Dp>
   _LIBCPP_CONSTEXPR_SINCE_CXX20 friend void __fill_n(__bit_iterator<_Dp, false> __first, typename _Dp::size_type __n);
 
   template 

```




https://github.com/llvm/llvm-project/pull/80720

[llvm-branch-commits] [llvm] PR for llvm/llvm-project#80348 (PR #80585)

2024-02-05 Thread Tom Stellard via llvm-branch-commits


tstellar wrote:

/cherry-pick 4b34558f43121df9b863ff2492f74fb2e65a5af1.

https://github.com/llvm/llvm-project/pull/80585

[llvm-branch-commits] [llvm] PR for llvm/llvm-project#80348 (PR #80585)

2024-02-05 Thread via llvm-branch-commits


llvmbot wrote:


Failed to cherry-pick: 4b34558f43121df9b863ff2492f74fb2e65a5af1.

https://github.com/llvm/llvm-project/actions/runs/7789532649

Please manually backport the fix and push it to your GitHub fork. Once this is
done, please create a [pull request](https://github.com/llvm/llvm-project/compare)

https://github.com/llvm/llvm-project/pull/80585

[llvm-branch-commits] [llvm] PR for llvm/llvm-project#80699 (PR #80729)

2024-02-05 Thread via llvm-branch-commits


https://github.com/llvmbot created 
https://github.com/llvm/llvm-project/pull/80729

resolves llvm/llvm-project#80699

>From 29a91711a135622cf74989e100274ab46c8c0bc1 Mon Sep 17 00:00:00 2001
From: Jeremy Morse 
Date: Wed, 24 Jan 2024 17:45:43 +
Subject: [PATCH] [BPI] Transfer value-handles when assign/move constructing
 BPI (#4)

Background: BPI stores a collection of edge branch-probabilities, and
also a set of Callback value-handles for the blocks in the
edge-collection. When a block is deleted, BPI's eraseBlock method is
called to clear the edge-collection of references to that block, to
avoid dangling pointers.

However, when move-constructing or assigning a BPI object, the
edge-collection gets moved, but the value-handles are discarded. This
can lead to stale entries in the edge-collection when blocks are
deleted without the callback -- not normally a problem, but if a new
block is allocated with the same address as an old block, spurious
branch probabilities will be recorded about it. The fix is to transfer
the handles from the source BPI object.

This was exposed by an unrelated debug-info change, which probably just
shifted allocation order around. It was detected as nondeterminism and
reduced by Zequan Wu:

https://github.com/llvm/llvm-project/commit/f1b0a544514f3d343f32a41de9d6fb0b6cbb6021#commitcomment-136737090

(No test because IMHO testing for a behaviour that varies with memory
allocators is likely futile; I can add the reproducer with a CHECK for
the relevant branch weights if it's desired though)

(cherry picked from commit 604a6c409e8473b212952b8633d92bbdb22a45c9)
---
 llvm/include/llvm/Analysis/BranchProbabilityInfo.h | 13 +++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/llvm/include/llvm/Analysis/BranchProbabilityInfo.h b/llvm/include/llvm/Analysis/BranchProbabilityInfo.h
index 6b9d178182011..91e1872e9bd6f 100644
--- a/llvm/include/llvm/Analysis/BranchProbabilityInfo.h
+++ b/llvm/include/llvm/Analysis/BranchProbabilityInfo.h
@@ -122,16 +122,23 @@ class BranchProbabilityInfo {
   }
 
   BranchProbabilityInfo(BranchProbabilityInfo &&Arg)
-  : Probs(std::move(Arg.Probs)), LastF(Arg.LastF),
-EstimatedBlockWeight(std::move(Arg.EstimatedBlockWeight)) {}
+  : Handles(std::move(Arg.Handles)), Probs(std::move(Arg.Probs)),
+LastF(Arg.LastF),
+EstimatedBlockWeight(std::move(Arg.EstimatedBlockWeight)) {
+for (auto &Handle : Handles)
+  Handle.setBPI(this);
+  }
 
   BranchProbabilityInfo(const BranchProbabilityInfo &) = delete;
   BranchProbabilityInfo &operator=(const BranchProbabilityInfo &) = delete;
 
   BranchProbabilityInfo &operator=(BranchProbabilityInfo &&RHS) {
 releaseMemory();
+Handles = std::move(RHS.Handles);
 Probs = std::move(RHS.Probs);
 EstimatedBlockWeight = std::move(RHS.EstimatedBlockWeight);
+for (auto &Handle : Handles)
+  Handle.setBPI(this);
 return *this;
   }
 
@@ -279,6 +286,8 @@ class BranchProbabilityInfo {
 }
 
   public:
+void setBPI(BranchProbabilityInfo *BPI) { this->BPI = BPI; }
+
 BasicBlockCallbackVH(const Value *V, BranchProbabilityInfo *BPI = nullptr)
 : CallbackVH(const_cast<Value *>(V)), BPI(BPI) {}
   };


[llvm-branch-commits] [llvm] PR for llvm/llvm-project#80699 (PR #80729)

2024-02-05 Thread via llvm-branch-commits


https://github.com/llvmbot milestoned 
https://github.com/llvm/llvm-project/pull/80729

[llvm-branch-commits] [llvm] PR for llvm/llvm-project#80699 (PR #80729)

2024-02-05 Thread via llvm-branch-commits


llvmbot wrote:

@ZequanWu What do you think about merging this PR to the release branch?

https://github.com/llvm/llvm-project/pull/80729

[llvm-branch-commits] [llvm] PR for llvm/llvm-project#80699 (PR #80729)

2024-02-05 Thread via llvm-branch-commits


llvmbot wrote:




@llvm/pr-subscribers-llvm-analysis

Author: None (llvmbot)


Changes

resolves llvm/llvm-project#80699

---
Full diff: https://github.com/llvm/llvm-project/pull/80729.diff


1 Files Affected:

- (modified) llvm/include/llvm/Analysis/BranchProbabilityInfo.h (+11-2) 


```diff
diff --git a/llvm/include/llvm/Analysis/BranchProbabilityInfo.h b/llvm/include/llvm/Analysis/BranchProbabilityInfo.h
index 6b9d178182011..91e1872e9bd6f 100644
--- a/llvm/include/llvm/Analysis/BranchProbabilityInfo.h
+++ b/llvm/include/llvm/Analysis/BranchProbabilityInfo.h
@@ -122,16 +122,23 @@ class BranchProbabilityInfo {
   }
 
   BranchProbabilityInfo(BranchProbabilityInfo &&Arg)
-  : Probs(std::move(Arg.Probs)), LastF(Arg.LastF),
-EstimatedBlockWeight(std::move(Arg.EstimatedBlockWeight)) {}
+  : Handles(std::move(Arg.Handles)), Probs(std::move(Arg.Probs)),
+LastF(Arg.LastF),
+EstimatedBlockWeight(std::move(Arg.EstimatedBlockWeight)) {
+for (auto &Handle : Handles)
+  Handle.setBPI(this);
+  }
 
   BranchProbabilityInfo(const BranchProbabilityInfo &) = delete;
   BranchProbabilityInfo &operator=(const BranchProbabilityInfo &) = delete;
 
   BranchProbabilityInfo &operator=(BranchProbabilityInfo &&RHS) {
 releaseMemory();
+Handles = std::move(RHS.Handles);
 Probs = std::move(RHS.Probs);
 EstimatedBlockWeight = std::move(RHS.EstimatedBlockWeight);
+for (auto &Handle : Handles)
+  Handle.setBPI(this);
 return *this;
   }
 
@@ -279,6 +286,8 @@ class BranchProbabilityInfo {
 }
 
   public:
+void setBPI(BranchProbabilityInfo *BPI) { this->BPI = BPI; }
+
 BasicBlockCallbackVH(const Value *V, BranchProbabilityInfo *BPI = nullptr)
 : CallbackVH(const_cast<Value *>(V)), BPI(BPI) {}
   };

```




https://github.com/llvm/llvm-project/pull/80729

[llvm-branch-commits] [llvm] PR for llvm/llvm-project#80597 (PR #80731)

2024-02-05 Thread via llvm-branch-commits


https://github.com/llvmbot created 
https://github.com/llvm/llvm-project/pull/80731

resolves llvm/llvm-project#80597

>From 3df992ed00f46a44492416cce46121f5e4fc0716 Mon Sep 17 00:00:00 2001
From: Yingwei Zheng 
Date: Tue, 6 Feb 2024 01:29:38 +0800
Subject: [PATCH] [InstCombine] Fix assertion failure in issue80597 (#80614)

The assertion in #80597 failed when we were trying to compute known bits
of a value in an unreachable BB.

https://github.com/llvm/llvm-project/blob/859b09da08c2a47026ba0a7d2f21b7dca705864d/llvm/lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp#L749-L810

In this case, `SignBits` is 30 (deduced from instr info), but `Known` is
`110101011101000101000?0?`
(deduced from dom cond). Setting high bits of `lshr Known, 1` will lead
to conflict.

This patch masks out high bits of `Known.Zero` to address this problem.

Fixes #80597.

(cherry picked from commit cb8d83a77c25e529f58eba17bb1ec76069a04e90)
---
 .../InstCombineSimplifyDemanded.cpp   |  3 ++
 llvm/test/Transforms/InstCombine/pr80597.ll   | 33 +++
 2 files changed, 36 insertions(+)
 create mode 100644 llvm/test/Transforms/InstCombine/pr80597.ll

diff --git a/llvm/lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp b/llvm/lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp
index a8a5f9831e15e..79873a9b4cbb4 100644
--- a/llvm/lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp
+++ b/llvm/lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp
@@ -802,6 +802,9 @@ Value *InstCombinerImpl::SimplifyDemandedUseBits(Value *V, APInt DemandedMask,
 return InsertNewInstWith(LShr, I->getIterator());
   } else if (Known.One[BitWidth-ShiftAmt-1]) { // New bits are known one.
 Known.One |= HighBits;
+// SignBits may be out-of-sync with Known.countMinSignBits(). Mask out
+// high bits of Known.Zero to avoid conflicts.
+Known.Zero &= ~HighBits;
   }
 } else {
   computeKnownBits(I, Known, Depth, CxtI);
diff --git a/llvm/test/Transforms/InstCombine/pr80597.ll b/llvm/test/Transforms/InstCombine/pr80597.ll
new file mode 100644
index 0..5feae4a06c45c
--- /dev/null
+++ b/llvm/test/Transforms/InstCombine/pr80597.ll
@@ -0,0 +1,33 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 4
+; RUN: opt -S -passes=instcombine < %s | FileCheck %s
+
+define i64 @pr80597(i1 %cond) {
+; CHECK-LABEL: define i64 @pr80597(
+; CHECK-SAME: i1 [[COND:%.*]]) {
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:[[ADD:%.*]] = select i1 [[COND]], i64 0, i64 -12884901888
+; CHECK-NEXT:[[SEXT1:%.*]] = add nsw i64 [[ADD]], 8836839514384105472
+; CHECK-NEXT:[[CMP:%.*]] = icmp ult i64 [[SEXT1]], -34359738368
+; CHECK-NEXT:br i1 [[CMP]], label [[IF_THEN:%.*]], label [[IF_ELSE:%.*]]
+; CHECK:   if.else:
+; CHECK-NEXT:[[SEXT2:%.*]] = ashr exact i64 [[ADD]], 1
+; CHECK-NEXT:[[ASHR:%.*]] = or i64 [[SEXT2]], 4418419761487020032
+; CHECK-NEXT:ret i64 [[ASHR]]
+; CHECK:   if.then:
+; CHECK-NEXT:ret i64 0
+;
+entry:
+  %add = select i1 %cond, i64 0, i64 4294967293
+  %add8 = shl i64 %add, 32
+  %sext1 = add i64 %add8, 8836839514384105472
+  %cmp = icmp ult i64 %sext1, -34359738368
+  br i1 %cmp, label %if.then, label %if.else
+
+if.else:
+  %sext2 = or i64 %add8, 8836839522974040064
+  %ashr = ashr i64 %sext2, 1
+  ret i64 %ashr
+
+if.then:
+  ret i64 0
+}


[llvm-branch-commits] [llvm] PR for llvm/llvm-project#80597 (PR #80731)

2024-02-05 Thread via llvm-branch-commits


https://github.com/llvmbot milestoned 
https://github.com/llvm/llvm-project/pull/80731

[llvm-branch-commits] [llvm] PR for llvm/llvm-project#80597 (PR #80731)

2024-02-05 Thread via llvm-branch-commits


llvmbot wrote:

@nikic What do you think about merging this PR to the release branch?

https://github.com/llvm/llvm-project/pull/80731

[llvm-branch-commits] [llvm] PR for llvm/llvm-project#80597 (PR #80731)

2024-02-05 Thread via llvm-branch-commits


llvmbot wrote:




@llvm/pr-subscribers-llvm-transforms

Author: None (llvmbot)


Changes

resolves llvm/llvm-project#80597

---
Full diff: https://github.com/llvm/llvm-project/pull/80731.diff


2 Files Affected:

- (modified) llvm/lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp (+3)
- (added) llvm/test/Transforms/InstCombine/pr80597.ll (+33) 


```diff
diff --git a/llvm/lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp b/llvm/lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp
index a8a5f9831e15e..79873a9b4cbb4 100644
--- a/llvm/lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp
+++ b/llvm/lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp
@@ -802,6 +802,9 @@ Value *InstCombinerImpl::SimplifyDemandedUseBits(Value *V, APInt DemandedMask,
 return InsertNewInstWith(LShr, I->getIterator());
   } else if (Known.One[BitWidth-ShiftAmt-1]) { // New bits are known one.
 Known.One |= HighBits;
+// SignBits may be out-of-sync with Known.countMinSignBits(). Mask out
+// high bits of Known.Zero to avoid conflicts.
+Known.Zero &= ~HighBits;
   }
 } else {
   computeKnownBits(I, Known, Depth, CxtI);
diff --git a/llvm/test/Transforms/InstCombine/pr80597.ll b/llvm/test/Transforms/InstCombine/pr80597.ll
new file mode 100644
index 0..5feae4a06c45c
--- /dev/null
+++ b/llvm/test/Transforms/InstCombine/pr80597.ll
@@ -0,0 +1,33 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 4
+; RUN: opt -S -passes=instcombine < %s | FileCheck %s
+
+define i64 @pr80597(i1 %cond) {
+; CHECK-LABEL: define i64 @pr80597(
+; CHECK-SAME: i1 [[COND:%.*]]) {
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:[[ADD:%.*]] = select i1 [[COND]], i64 0, i64 -12884901888
+; CHECK-NEXT:[[SEXT1:%.*]] = add nsw i64 [[ADD]], 8836839514384105472
+; CHECK-NEXT:[[CMP:%.*]] = icmp ult i64 [[SEXT1]], -34359738368
+; CHECK-NEXT:br i1 [[CMP]], label [[IF_THEN:%.*]], label [[IF_ELSE:%.*]]
+; CHECK:   if.else:
+; CHECK-NEXT:[[SEXT2:%.*]] = ashr exact i64 [[ADD]], 1
+; CHECK-NEXT:[[ASHR:%.*]] = or i64 [[SEXT2]], 4418419761487020032
+; CHECK-NEXT:ret i64 [[ASHR]]
+; CHECK:   if.then:
+; CHECK-NEXT:ret i64 0
+;
+entry:
+  %add = select i1 %cond, i64 0, i64 4294967293
+  %add8 = shl i64 %add, 32
+  %sext1 = add i64 %add8, 8836839514384105472
+  %cmp = icmp ult i64 %sext1, -34359738368
+  br i1 %cmp, label %if.then, label %if.else
+
+if.else:
+  %sext2 = or i64 %add8, 8836839522974040064
+  %ashr = ashr i64 %sext2, 1
+  ret i64 %ashr
+
+if.then:
+  ret i64 0
+}

```




https://github.com/llvm/llvm-project/pull/80731

[llvm-branch-commits] [clang] PR for llvm/llvm-project#80432 (PR #80433)

2024-02-05 Thread via llvm-branch-commits


https://github.com/llvmbot updated 
https://github.com/llvm/llvm-project/pull/80433

>From 1a791e84d9e6ef0e6be1a15e64b78a8fcc18467a Mon Sep 17 00:00:00 2001
From: Sander de Smalen 
Date: Fri, 2 Feb 2024 11:56:38 +
Subject: [PATCH] [Clang][AArch64] Emit 'unimplemented' diagnostic for SME
 (#80295)

When a function F that has ZA and ZT0 state calls another function G that
only shares ZT0 state with its caller, F will have to save ZA before
the call to G and restore it afterwards (rather than setting up a
lazy-save).

This is not yet implemented in LLVM and does not result in a
compile-time error either. So instead of silently generating incorrect
code, it's better to emit an error saying this is not yet implemented.

(cherry picked from commit 319f4c03ba2909c7240ac157cc46216bf1518c10)
---
 .../clang/Basic/DiagnosticSemaKinds.td|  6 +++
 clang/lib/Sema/SemaChecking.cpp   | 50 +--
 clang/test/Sema/aarch64-sme-func-attrs.c  | 13 -
 3 files changed, 40 insertions(+), 29 deletions(-)

diff --git a/clang/include/clang/Basic/DiagnosticSemaKinds.td b/clang/include/clang/Basic/DiagnosticSemaKinds.td
index a1c32abb4dcd88..ef8c111b1d8cc8 100644
--- a/clang/include/clang/Basic/DiagnosticSemaKinds.td
+++ b/clang/include/clang/Basic/DiagnosticSemaKinds.td
@@ -3711,6 +3711,12 @@ def err_sme_za_call_no_za_state : Error<
   "call to a shared ZA function requires the caller to have ZA state">;
 def err_sme_zt0_call_no_zt0_state : Error<
   "call to a shared ZT0 function requires the caller to have ZT0 state">;
+def err_sme_unimplemented_za_save_restore : Error<
+  "call to a function that shares state other than 'za' from a "
+  "function that has live 'za' state requires a spill/fill of ZA, which is not yet "
+  "implemented">;
+def note_sme_use_preserves_za : Note<
+  "add '__arm_preserves(\"za\")' to the callee if it preserves ZA">;
 def err_sme_definition_using_sm_in_non_sme_target : Error<
   "function executed in streaming-SVE mode requires 'sme'">;
 def err_sme_definition_using_za_in_non_sme_target : Error<
diff --git a/clang/lib/Sema/SemaChecking.cpp b/clang/lib/Sema/SemaChecking.cpp
index 25e9af1ea3f362..09b7e1c62fbd7b 100644
--- a/clang/lib/Sema/SemaChecking.cpp
+++ b/clang/lib/Sema/SemaChecking.cpp
@@ -7545,47 +7545,43 @@ void Sema::checkCall(NamedDecl *FDecl, const FunctionProtoType *Proto,
   }
 }
 
-// If the callee uses AArch64 SME ZA state but the caller doesn't define
-// any, then this is an error.
-FunctionType::ArmStateValue ArmZAState =
+FunctionType::ArmStateValue CalleeArmZAState =
 FunctionType::getArmZAState(ExtInfo.AArch64SMEAttributes);
-if (ArmZAState != FunctionType::ARM_None) {
+FunctionType::ArmStateValue CalleeArmZT0State =
+FunctionType::getArmZT0State(ExtInfo.AArch64SMEAttributes);
+if (CalleeArmZAState != FunctionType::ARM_None ||
+CalleeArmZT0State != FunctionType::ARM_None) {
   bool CallerHasZAState = false;
+  bool CallerHasZT0State = false;
   if (const auto *CallerFD = dyn_cast<FunctionDecl>(CurContext)) {
 auto *Attr = CallerFD->getAttr<ArmNewAttr>();
 if (Attr && Attr->isNewZA())
   CallerHasZAState = true;
-else if (const auto *FPT =
- CallerFD->getType()->getAs<FunctionProtoType>())
-  CallerHasZAState = FunctionType::getArmZAState(
- FPT->getExtProtoInfo().AArch64SMEAttributes) !=
- FunctionType::ARM_None;
-  }
-
-  if (!CallerHasZAState)
-Diag(Loc, diag::err_sme_za_call_no_za_state);
-}
-
-// If the callee uses AArch64 SME ZT0 state but the caller doesn't define
-// any, then this is an error.
-FunctionType::ArmStateValue ArmZT0State =
-FunctionType::getArmZT0State(ExtInfo.AArch64SMEAttributes);
-if (ArmZT0State != FunctionType::ARM_None) {
-  bool CallerHasZT0State = false;
-  if (const auto *CallerFD = dyn_cast<FunctionDecl>(CurContext)) {
-auto *Attr = CallerFD->getAttr<ArmNewAttr>();
 if (Attr && Attr->isNewZT0())
   CallerHasZT0State = true;
-else if (const auto *FPT =
- CallerFD->getType()->getAs<FunctionProtoType>())
-  CallerHasZT0State =
+if (const auto *FPT = CallerFD->getType()->getAs<FunctionProtoType>()) {
+  CallerHasZAState |=
+  FunctionType::getArmZAState(
+  FPT->getExtProtoInfo().AArch64SMEAttributes) !=
+  FunctionType::ARM_None;
+  CallerHasZT0State |=
   FunctionType::getArmZT0State(
   FPT->getExtProtoInfo().AArch64SMEAttributes) !=
   FunctionType::ARM_None;
+}
   }
 
-  if (!CallerHasZT0State)
+  if (CalleeArmZAState != FunctionType::ARM_None && !CallerHasZAState)
+Diag(Loc, diag::err_sme_za_call_no_za_state);
+
+  if (CalleeArmZT0State != FunctionType::ARM_None && !CallerHasZT0State)
 Diag(Loc, diag::err_sme_zt0_call_no_zt0_state);
+
+  if (CallerHasZAState && CalleeAr

[llvm-branch-commits] [clang] 1a791e8 - [Clang][AArch64] Emit 'unimplemented' diagnostic for SME (#80295)

2024-02-05 Thread Tom Stellard via llvm-branch-commits


Author: Sander de Smalen
Date: 2024-02-05T11:20:35-08:00
New Revision: 1a791e84d9e6ef0e6be1a15e64b78a8fcc18467a

URL: 
https://github.com/llvm/llvm-project/commit/1a791e84d9e6ef0e6be1a15e64b78a8fcc18467a
DIFF: 
https://github.com/llvm/llvm-project/commit/1a791e84d9e6ef0e6be1a15e64b78a8fcc18467a.diff

LOG: [Clang][AArch64] Emit 'unimplemented' diagnostic for SME (#80295)

When a function F that has ZA and ZT0 state calls another function G that
only shares ZT0 state with its caller, F will have to save ZA before
the call to G and restore it afterwards (rather than setting up a
lazy-save).

This is not yet implemented in LLVM and does not result in a
compile-time error either. So instead of silently generating incorrect
code, it's better to emit an error saying this is not yet implemented.

(cherry picked from commit 319f4c03ba2909c7240ac157cc46216bf1518c10)

Added: 


Modified: 
clang/include/clang/Basic/DiagnosticSemaKinds.td
clang/lib/Sema/SemaChecking.cpp
clang/test/Sema/aarch64-sme-func-attrs.c

Removed: 




diff  --git a/clang/include/clang/Basic/DiagnosticSemaKinds.td b/clang/include/clang/Basic/DiagnosticSemaKinds.td
index a1c32abb4dcd8..ef8c111b1d8cc 100644
--- a/clang/include/clang/Basic/DiagnosticSemaKinds.td
+++ b/clang/include/clang/Basic/DiagnosticSemaKinds.td
@@ -3711,6 +3711,12 @@ def err_sme_za_call_no_za_state : Error<
   "call to a shared ZA function requires the caller to have ZA state">;
 def err_sme_zt0_call_no_zt0_state : Error<
   "call to a shared ZT0 function requires the caller to have ZT0 state">;
+def err_sme_unimplemented_za_save_restore : Error<
+  "call to a function that shares state other than 'za' from a "
+  "function that has live 'za' state requires a spill/fill of ZA, which is not yet "
+  "implemented">;
+def note_sme_use_preserves_za : Note<
+  "add '__arm_preserves(\"za\")' to the callee if it preserves ZA">;
 def err_sme_definition_using_sm_in_non_sme_target : Error<
   "function executed in streaming-SVE mode requires 'sme'">;
 def err_sme_definition_using_za_in_non_sme_target : Error<

diff  --git a/clang/lib/Sema/SemaChecking.cpp b/clang/lib/Sema/SemaChecking.cpp
index 25e9af1ea3f36..09b7e1c62fbd7 100644
--- a/clang/lib/Sema/SemaChecking.cpp
+++ b/clang/lib/Sema/SemaChecking.cpp
@@ -7545,47 +7545,43 @@ void Sema::checkCall(NamedDecl *FDecl, const FunctionProtoType *Proto,
   }
 }
 
-// If the callee uses AArch64 SME ZA state but the caller doesn't define
-// any, then this is an error.
-FunctionType::ArmStateValue ArmZAState =
+FunctionType::ArmStateValue CalleeArmZAState =
 FunctionType::getArmZAState(ExtInfo.AArch64SMEAttributes);
-if (ArmZAState != FunctionType::ARM_None) {
+FunctionType::ArmStateValue CalleeArmZT0State =
+FunctionType::getArmZT0State(ExtInfo.AArch64SMEAttributes);
+if (CalleeArmZAState != FunctionType::ARM_None ||
+CalleeArmZT0State != FunctionType::ARM_None) {
   bool CallerHasZAState = false;
+  bool CallerHasZT0State = false;
   if (const auto *CallerFD = dyn_cast<FunctionDecl>(CurContext)) {
 auto *Attr = CallerFD->getAttr<ArmNewAttr>();
 if (Attr && Attr->isNewZA())
   CallerHasZAState = true;
-else if (const auto *FPT =
- CallerFD->getType()->getAs<FunctionProtoType>())
-  CallerHasZAState = FunctionType::getArmZAState(
- FPT->getExtProtoInfo().AArch64SMEAttributes) !=
- FunctionType::ARM_None;
-  }
-
-  if (!CallerHasZAState)
-Diag(Loc, diag::err_sme_za_call_no_za_state);
-}
-
-// If the callee uses AArch64 SME ZT0 state but the caller doesn't define
-// any, then this is an error.
-FunctionType::ArmStateValue ArmZT0State =
-FunctionType::getArmZT0State(ExtInfo.AArch64SMEAttributes);
-if (ArmZT0State != FunctionType::ARM_None) {
-  bool CallerHasZT0State = false;
-  if (const auto *CallerFD = dyn_cast<FunctionDecl>(CurContext)) {
-auto *Attr = CallerFD->getAttr<ArmNewAttr>();
 if (Attr && Attr->isNewZT0())
   CallerHasZT0State = true;
-else if (const auto *FPT =
- CallerFD->getType()->getAs<FunctionProtoType>())
-  CallerHasZT0State =
+if (const auto *FPT = CallerFD->getType()->getAs<FunctionProtoType>()) {
+  CallerHasZAState |=
+  FunctionType::getArmZAState(
+  FPT->getExtProtoInfo().AArch64SMEAttributes) !=
+  FunctionType::ARM_None;
+  CallerHasZT0State |=
   FunctionType::getArmZT0State(
   FPT->getExtProtoInfo().AArch64SMEAttributes) !=
   FunctionType::ARM_None;
+}
   }
 
-  if (!CallerHasZT0State)
+  if (CalleeArmZAState != FunctionType::ARM_None && !CallerHasZAState)
+Diag(Loc, diag::err_sme_za_call_no_za_state);
+
+  if (CalleeArmZT0State != FunctionType::ARM_None && !CallerHasZT0State)
 Dia

[llvm-branch-commits] [clang] PR for llvm/llvm-project#80432 (PR #80433)

2024-02-05 Thread Tom Stellard via llvm-branch-commits


https://github.com/tstellar closed 
https://github.com/llvm/llvm-project/pull/80433

[llvm-branch-commits] [llvm] PR for llvm/llvm-project#79571 (PR #79572)

2024-02-05 Thread via llvm-branch-commits


https://github.com/llvmbot updated 
https://github.com/llvm/llvm-project/pull/79572

>From 43db795259d91ddb3b12596e8aec3dddbd1fb583 Mon Sep 17 00:00:00 2001
From: Nikita Popov 
Date: Wed, 24 Jan 2024 10:15:42 +0100
Subject: [PATCH] [MSSAUpdater] Handle simplified accesses when updating phis
 (#78272)

This is a followup to #76819. After those changes, we can still run into
an assertion failure for a slight variation of the test case: When
fixing up MemoryPhis, we map the incoming access to the access of the
cloned instruction -- which may now no longer exist.

Fix this by reusing the getNewDefiningAccessForClone() helper, which
will look upwards for a new defining access in that case.

(cherry picked from commit a7a1b8b17e264fb0f2d2b4165cf9a7f5094b08b3)
---
 llvm/lib/Analysis/MemorySSAUpdater.cpp|  22 +---
 .../memssa-readnone-access.ll | 104 ++
 2 files changed, 107 insertions(+), 19 deletions(-)

diff --git a/llvm/lib/Analysis/MemorySSAUpdater.cpp 
b/llvm/lib/Analysis/MemorySSAUpdater.cpp
index e87ae7d71fffe2..aa550f0b6a7bfd 100644
--- a/llvm/lib/Analysis/MemorySSAUpdater.cpp
+++ b/llvm/lib/Analysis/MemorySSAUpdater.cpp
@@ -692,25 +692,9 @@ void MemorySSAUpdater::updateForClonedLoop(const 
LoopBlocksRPO &LoopBlocks,
 continue;
 
   // Determine incoming value and add it as incoming from IncBB.
-  if (MemoryUseOrDef *IncMUD = dyn_cast<MemoryUseOrDef>(IncomingAccess)) {
-if (!MSSA->isLiveOnEntryDef(IncMUD)) {
-  Instruction *IncI = IncMUD->getMemoryInst();
-  assert(IncI && "Found MemoryUseOrDef with no Instruction.");
-  if (Instruction *NewIncI =
-  cast_or_null<Instruction>(VMap.lookup(IncI))) {
-IncMUD = MSSA->getMemoryAccess(NewIncI);
-assert(IncMUD &&
-   "MemoryUseOrDef cannot be null, all preds processed.");
-  }
-}
-NewPhi->addIncoming(IncMUD, IncBB);
-  } else {
-MemoryPhi *IncPhi = cast<MemoryPhi>(IncomingAccess);
-if (MemoryAccess *NewDefPhi = MPhiMap.lookup(IncPhi))
-  NewPhi->addIncoming(NewDefPhi, IncBB);
-else
-  NewPhi->addIncoming(IncPhi, IncBB);
-  }
+  NewPhi->addIncoming(
+  getNewDefiningAccessForClone(IncomingAccess, VMap, MPhiMap, MSSA),
+  IncBB);
 }
 if (auto *SingleAccess = onlySingleValue(NewPhi)) {
   MPhiMap[Phi] = SingleAccess;
diff --git a/llvm/test/Transforms/SimpleLoopUnswitch/memssa-readnone-access.ll 
b/llvm/test/Transforms/SimpleLoopUnswitch/memssa-readnone-access.ll
index 2aaf777683e116..c6e6608d4be383 100644
--- a/llvm/test/Transforms/SimpleLoopUnswitch/memssa-readnone-access.ll
+++ b/llvm/test/Transforms/SimpleLoopUnswitch/memssa-readnone-access.ll
@@ -115,3 +115,107 @@ split:
 exit:
   ret void
 }
+
+; Variants of the above test with swapped branch destinations.
+
+define void @test1_swapped(i1 %c) {
+; CHECK-LABEL: define void @test1_swapped(
+; CHECK-SAME: i1 [[C:%.*]]) {
+; CHECK-NEXT:  start:
+; CHECK-NEXT:[[C_FR:%.*]] = freeze i1 [[C]]
+; CHECK-NEXT:br i1 [[C_FR]], label [[START_SPLIT_US:%.*]], label 
[[START_SPLIT:%.*]]
+; CHECK:   start.split.us:
+; CHECK-NEXT:br label [[LOOP_US:%.*]]
+; CHECK:   loop.us:
+; CHECK-NEXT:call void @foo()
+; CHECK-NEXT:br label [[LOOP_US]]
+; CHECK:   start.split:
+; CHECK-NEXT:br label [[LOOP:%.*]]
+; CHECK:   loop:
+; CHECK-NEXT:call void @foo()
+; CHECK-NEXT:br label [[EXIT:%.*]]
+; CHECK:   exit:
+; CHECK-NEXT:ret void
+;
+start:
+  br label %loop
+
+loop:
+  %fn = load ptr, ptr @vtable, align 8
+  call void %fn()
+  br i1 %c, label %loop, label %exit
+
+exit:
+  ret void
+}
+
+define void @test2_swapped(i1 %c, ptr %p) {
+; CHECK-LABEL: define void @test2_swapped(
+; CHECK-SAME: i1 [[C:%.*]], ptr [[P:%.*]]) {
+; CHECK-NEXT:[[C_FR:%.*]] = freeze i1 [[C]]
+; CHECK-NEXT:br i1 [[C_FR]], label [[DOTSPLIT_US:%.*]], label 
[[DOTSPLIT:%.*]]
+; CHECK:   .split.us:
+; CHECK-NEXT:br label [[LOOP_US:%.*]]
+; CHECK:   loop.us:
+; CHECK-NEXT:call void @foo()
+; CHECK-NEXT:call void @bar()
+; CHECK-NEXT:br label [[LOOP_US]]
+; CHECK:   .split:
+; CHECK-NEXT:br label [[LOOP:%.*]]
+; CHECK:   loop:
+; CHECK-NEXT:call void @foo()
+; CHECK-NEXT:call void @bar()
+; CHECK-NEXT:br label [[EXIT:%.*]]
+; CHECK:   exit:
+; CHECK-NEXT:ret void
+;
+  br label %loop
+
+loop:
+  %fn = load ptr, ptr @vtable, align 8
+  call void %fn()
+  call void @bar()
+  br i1 %c, label %loop, label %exit
+
+exit:
+  ret void
+}
+
+define void @test3_swapped(i1 %c, ptr %p) {
+; CHECK-LABEL: define void @test3_swapped(
+; CHECK-SAME: i1 [[C:%.*]], ptr [[P:%.*]]) {
+; CHECK-NEXT:[[C_FR:%.*]] = freeze i1 [[C]]
+; CHECK-NEXT:br i1 [[C_FR]], label [[DOTSPLIT_US:%.*]], label 
[[DOTSPLIT:%.*]]
+; CHECK:   .split.us:
+; CHECK-NEXT:br label [[LOOP_US:%.*]]
+; CHECK:   loop.us:
+; CHECK-NEXT:br label [[SPLIT_US:%.*]]
+; CHECK:

[llvm-branch-commits] [llvm] 43db795 - [MSSAUpdater] Handle simplified accesses when updating phis (#78272)

2024-02-05 Thread Tom Stellard via llvm-branch-commits


Author: Nikita Popov
Date: 2024-02-05T11:23:33-08:00
New Revision: 43db795259d91ddb3b12596e8aec3dddbd1fb583

URL: 
https://github.com/llvm/llvm-project/commit/43db795259d91ddb3b12596e8aec3dddbd1fb583
DIFF: 
https://github.com/llvm/llvm-project/commit/43db795259d91ddb3b12596e8aec3dddbd1fb583.diff

LOG: [MSSAUpdater] Handle simplified accesses when updating phis (#78272)

This is a followup to #76819. After those changes, we can still run into
an assertion failure for a slight variation of the test case: When
fixing up MemoryPhis, we map the incoming access to the access of the
cloned instruction -- which may now no longer exist.

Fix this by reusing the getNewDefiningAccessForClone() helper, which
will look upwards for a new defining access in that case.

(cherry picked from commit a7a1b8b17e264fb0f2d2b4165cf9a7f5094b08b3)
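The "look upwards" behaviour described in the log can be illustrated with a small standalone sketch. This is not the MemorySSA API: `Access`, `CloneMap` and `newDefiningAccessForClone` are hypothetical stand-ins, and the sketch assumes that a clone whose access was simplified away is modelled as a map entry holding a null pointer.

```cpp
#include <cassert>
#include <map>

// Hypothetical, simplified memory access: each access points at the access
// that defines it. A null Defining pointer stands in for live-on-entry.
struct Access {
  const Access *Defining;
};

// Hypothetical clone map: original access -> access of the cloned
// instruction. A null value models "the instruction was cloned, but its
// access no longer exists" (an assumption made for this sketch).
using CloneMap = std::map<const Access *, const Access *>;

// Walk up the defining-access chain until an access is found that is either
// not cloned at all (keep the original) or has a surviving clone.
const Access *newDefiningAccessForClone(const Access *A,
                                        const CloneMap &Clones) {
  assert(A && "expected a valid access");
  while (A->Defining != nullptr) {
    auto It = Clones.find(A);
    if (It == Clones.end())
      return A; // Not part of the cloned region: reuse the original access.
    if (It->second != nullptr)
      return It->second; // The clone still has an access: use it.
    A = A->Defining; // The clone was simplified away: look further up.
  }
  return A; // Reached live-on-entry.
}
```

With this model, an incoming access whose clone has disappeared resolves to the nearest surviving definition above it, rather than tripping an assertion on a missing access.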

Added: 


Modified: 
llvm/lib/Analysis/MemorySSAUpdater.cpp
llvm/test/Transforms/SimpleLoopUnswitch/memssa-readnone-access.ll

Removed: 




diff  --git a/llvm/lib/Analysis/MemorySSAUpdater.cpp 
b/llvm/lib/Analysis/MemorySSAUpdater.cpp
index e87ae7d71fffe..aa550f0b6a7bf 100644
--- a/llvm/lib/Analysis/MemorySSAUpdater.cpp
+++ b/llvm/lib/Analysis/MemorySSAUpdater.cpp
@@ -692,25 +692,9 @@ void MemorySSAUpdater::updateForClonedLoop(const 
LoopBlocksRPO &LoopBlocks,
 continue;
 
   // Determine incoming value and add it as incoming from IncBB.
-  if (MemoryUseOrDef *IncMUD = dyn_cast<MemoryUseOrDef>(IncomingAccess)) {
-if (!MSSA->isLiveOnEntryDef(IncMUD)) {
-  Instruction *IncI = IncMUD->getMemoryInst();
-  assert(IncI && "Found MemoryUseOrDef with no Instruction.");
-  if (Instruction *NewIncI =
-  cast_or_null<Instruction>(VMap.lookup(IncI))) {
-IncMUD = MSSA->getMemoryAccess(NewIncI);
-assert(IncMUD &&
-   "MemoryUseOrDef cannot be null, all preds processed.");
-  }
-}
-NewPhi->addIncoming(IncMUD, IncBB);
-  } else {
-MemoryPhi *IncPhi = cast<MemoryPhi>(IncomingAccess);
-if (MemoryAccess *NewDefPhi = MPhiMap.lookup(IncPhi))
-  NewPhi->addIncoming(NewDefPhi, IncBB);
-else
-  NewPhi->addIncoming(IncPhi, IncBB);
-  }
+  NewPhi->addIncoming(
+  getNewDefiningAccessForClone(IncomingAccess, VMap, MPhiMap, MSSA),
+  IncBB);
 }
 if (auto *SingleAccess = onlySingleValue(NewPhi)) {
   MPhiMap[Phi] = SingleAccess;

diff  --git a/llvm/test/Transforms/SimpleLoopUnswitch/memssa-readnone-access.ll 
b/llvm/test/Transforms/SimpleLoopUnswitch/memssa-readnone-access.ll
index 2aaf777683e11..c6e6608d4be38 100644
--- a/llvm/test/Transforms/SimpleLoopUnswitch/memssa-readnone-access.ll
+++ b/llvm/test/Transforms/SimpleLoopUnswitch/memssa-readnone-access.ll
@@ -115,3 +115,107 @@ split:
 exit:
   ret void
 }
+
+; Variants of the above test with swapped branch destinations.
+
+define void @test1_swapped(i1 %c) {
+; CHECK-LABEL: define void @test1_swapped(
+; CHECK-SAME: i1 [[C:%.*]]) {
+; CHECK-NEXT:  start:
+; CHECK-NEXT:[[C_FR:%.*]] = freeze i1 [[C]]
+; CHECK-NEXT:br i1 [[C_FR]], label [[START_SPLIT_US:%.*]], label 
[[START_SPLIT:%.*]]
+; CHECK:   start.split.us:
+; CHECK-NEXT:br label [[LOOP_US:%.*]]
+; CHECK:   loop.us:
+; CHECK-NEXT:call void @foo()
+; CHECK-NEXT:br label [[LOOP_US]]
+; CHECK:   start.split:
+; CHECK-NEXT:br label [[LOOP:%.*]]
+; CHECK:   loop:
+; CHECK-NEXT:call void @foo()
+; CHECK-NEXT:br label [[EXIT:%.*]]
+; CHECK:   exit:
+; CHECK-NEXT:ret void
+;
+start:
+  br label %loop
+
+loop:
+  %fn = load ptr, ptr @vtable, align 8
+  call void %fn()
+  br i1 %c, label %loop, label %exit
+
+exit:
+  ret void
+}
+
+define void @test2_swapped(i1 %c, ptr %p) {
+; CHECK-LABEL: define void @test2_swapped(
+; CHECK-SAME: i1 [[C:%.*]], ptr [[P:%.*]]) {
+; CHECK-NEXT:[[C_FR:%.*]] = freeze i1 [[C]]
+; CHECK-NEXT:br i1 [[C_FR]], label [[DOTSPLIT_US:%.*]], label 
[[DOTSPLIT:%.*]]
+; CHECK:   .split.us:
+; CHECK-NEXT:br label [[LOOP_US:%.*]]
+; CHECK:   loop.us:
+; CHECK-NEXT:call void @foo()
+; CHECK-NEXT:call void @bar()
+; CHECK-NEXT:br label [[LOOP_US]]
+; CHECK:   .split:
+; CHECK-NEXT:br label [[LOOP:%.*]]
+; CHECK:   loop:
+; CHECK-NEXT:call void @foo()
+; CHECK-NEXT:call void @bar()
+; CHECK-NEXT:br label [[EXIT:%.*]]
+; CHECK:   exit:
+; CHECK-NEXT:ret void
+;
+  br label %loop
+
+loop:
+  %fn = load ptr, ptr @vtable, align 8
+  call void %fn()
+  call void @bar()
+  br i1 %c, label %loop, label %exit
+
+exit:
+  ret void
+}
+
+define void @test3_swapped(i1 %c, ptr %p) {
+; CHECK-LABEL: define void @test3_swapped(
+; CHECK-SAME: i1 [[C:%.*]], ptr [[P:%.*]]) {
+; CHECK-NEXT:[[C_FR:%.*]] = freeze i1 [[C]]
+; CHECK-NEXT:br i1 [[C_FR]], label [[DOTSPLIT_US:%.*]], label 
[[DOTSPLIT:%.*]]
+; CHECK:   .split.us:
+; CHEC

[llvm-branch-commits] [llvm] PR for llvm/llvm-project#79571 (PR #79572)

2024-02-05 Thread Tom Stellard via llvm-branch-commits


https://github.com/tstellar closed 
https://github.com/llvm/llvm-project/pull/79572

[llvm-branch-commits] [llvm] PR for llvm/llvm-project#80699 (PR #80729)

2024-02-05 Thread Zequan Wu via llvm-branch-commits


https://github.com/ZequanWu approved this pull request.


https://github.com/llvm/llvm-project/pull/80729

[llvm-branch-commits] [llvm] PR for llvm/llvm-project#77871 (PR #80513)

2024-02-05 Thread via llvm-branch-commits


https://github.com/llvmbot updated 
https://github.com/llvm/llvm-project/pull/80513

>From a6817b7315af5da94cfbe69767c8e8f827fecbca Mon Sep 17 00:00:00 2001
From: NAKAMURA Takumi 
Date: Fri, 2 Feb 2024 18:37:10 +0900
Subject: [PATCH 1/2] CoverageMappingWriter: Emit `Decision` before `Expansion`
 (#78966)

To simplify scanning of the records, order `Decision` before `Expansion`.
Otherwise it cannot be determined whether an `Expansion` belongs to a
`Decision` or not.

Relevant to #77871

(cherry picked from commit 438fe1db09b0c20708ea1020519d8073c37feae8)
---
 .../Coverage/CoverageMappingWriter.cpp| 10 +-
 .../ProfileData/CoverageMappingTest.cpp   | 36 +++
 2 files changed, 45 insertions(+), 1 deletion(-)

diff --git a/llvm/lib/ProfileData/Coverage/CoverageMappingWriter.cpp 
b/llvm/lib/ProfileData/Coverage/CoverageMappingWriter.cpp
index 1c7d8a8909c48..27727f216b051 100644
--- a/llvm/lib/ProfileData/Coverage/CoverageMappingWriter.cpp
+++ b/llvm/lib/ProfileData/Coverage/CoverageMappingWriter.cpp
@@ -167,7 +167,15 @@ void CoverageMappingWriter::write(raw_ostream &OS) {
   return LHS.FileID < RHS.FileID;
 if (LHS.startLoc() != RHS.startLoc())
   return LHS.startLoc() < RHS.startLoc();
-return LHS.Kind < RHS.Kind;
+
+// Put `Decision` before `Expansion`.
+auto getKindKey = [](CounterMappingRegion::RegionKind Kind) {
+  return (Kind == CounterMappingRegion::MCDCDecisionRegion
+  ? 2 * CounterMappingRegion::ExpansionRegion - 1
+  : 2 * Kind);
+};
+
+return getKindKey(LHS.Kind) < getKindKey(RHS.Kind);
   });
 
   // Write out the fileid -> filename mapping.
diff --git a/llvm/unittests/ProfileData/CoverageMappingTest.cpp 
b/llvm/unittests/ProfileData/CoverageMappingTest.cpp
index 23f66a0232ddb..2849781a9dc43 100644
--- a/llvm/unittests/ProfileData/CoverageMappingTest.cpp
+++ b/llvm/unittests/ProfileData/CoverageMappingTest.cpp
@@ -890,6 +890,42 @@ TEST_P(CoverageMappingTest, non_code_region_bitmask) {
   ASSERT_EQ(1U, Names.size());
 }
 
+// Test the order of MCDCDecision before Expansion
+TEST_P(CoverageMappingTest, decision_before_expansion) {
+  startFunction("foo", 0x1234);
+  addCMR(Counter::getCounter(0), "foo", 3, 23, 5, 2);
+
+  // This(4:11) was put after Expansion(4:11) before the fix
+  addMCDCDecisionCMR(0, 2, "foo", 4, 11, 4, 20);
+
+  addExpansionCMR("foo", "A", 4, 11, 4, 12);
+  addExpansionCMR("foo", "B", 4, 19, 4, 20);
+  addCMR(Counter::getCounter(0), "A", 1, 14, 1, 17);
+  addCMR(Counter::getCounter(0), "A", 1, 14, 1, 17);
+  addMCDCBranchCMR(Counter::getCounter(0), Counter::getCounter(1), 1, 2, 0, "A",
+   1, 14, 1, 17);
+  addCMR(Counter::getCounter(1), "B", 1, 14, 1, 17);
+  addMCDCBranchCMR(Counter::getCounter(1), Counter::getCounter(2), 2, 0, 0, "B",
+   1, 14, 1, 17);
+
+  // InputFunctionCoverageData::Regions is rewritten after the write.
+  auto InputRegions = InputFunctions.back().Regions;
+
+  writeAndReadCoverageRegions();
+
+  const auto &OutputRegions = OutputFunctions.back().Regions;
+
+  size_t N = ArrayRef(InputRegions).size();
+  ASSERT_EQ(N, OutputRegions.size());
+  for (size_t I = 0; I < N; ++I) {
+ASSERT_EQ(InputRegions[I].Kind, OutputRegions[I].Kind);
+ASSERT_EQ(InputRegions[I].FileID, OutputRegions[I].FileID);
+ASSERT_EQ(InputRegions[I].ExpandedFileID, OutputRegions[I].ExpandedFileID);
+ASSERT_EQ(InputRegions[I].startLoc(), OutputRegions[I].startLoc());
+ASSERT_EQ(InputRegions[I].endLoc(), OutputRegions[I].endLoc());
+  }
+}
+
 TEST_P(CoverageMappingTest, strip_filename_prefix) {
   ProfileWriter.addRecord({"file1:func", 0x1234, {0}}, Err);
 

>From b50a84e303378df35996d7330aa80aa4ea1f497a Mon Sep 17 00:00:00 2001
From: NAKAMURA Takumi 
Date: Fri, 2 Feb 2024 20:34:12 +0900
Subject: [PATCH 2/2] [Coverage] Let `Decision` take account of expansions
 (#78969)

The current implementation (D138849) assumes that `Branch`(es) follow
the corresponding `Decision`. That is not true if `Branch`(es) are
forwarded to an expanded file ID. As a result, consecutive `Decision`(s)
would be matched with an insufficient number of `Branch`(es).

`Expansion` will point `Branch`(es) in other file IDs if `Expansion` is
included in the range of `Decision`.

Fixes #77871

-

Co-authored-by: Alan Phipps 
(cherry picked from commit d912f1f0cb49465b08f82fae89ece222404e5640)
---
 .../ProfileData/Coverage/CoverageMapping.cpp  | 240 ++
 llvm/test/tools/llvm-cov/Inputs/mcdc-macro.c  |  20 ++
 llvm/test/tools/llvm-cov/Inputs/mcdc-macro.o  | Bin 0 -> 6424 bytes
 .../tools/llvm-cov/Inputs/mcdc-macro.proftext |  62 +
 llvm/test/tools/llvm-cov/mcdc-macro.test  |  99 
 5 files changed, 378 insertions(+), 43 deletions(-)
 create mode 100644 llvm/test/tools/llvm-cov/Inputs/mcdc-macro.c
 create mode 100644 llvm/test/tools/llvm-cov/Inputs/mcdc-macro.o
 create mode 100644 llvm/test/tools/llvm-cov/Inputs/mcdc-macro.proftext
 create mode 10064

[llvm-branch-commits] [llvm] a6817b7 - CoverageMappingWriter: Emit `Decision` before `Expansion` (#78966)

2024-02-05 Thread Tom Stellard via llvm-branch-commits


Author: NAKAMURA Takumi
Date: 2024-02-05T11:32:50-08:00
New Revision: a6817b7315af5da94cfbe69767c8e8f827fecbca

URL: 
https://github.com/llvm/llvm-project/commit/a6817b7315af5da94cfbe69767c8e8f827fecbca
DIFF: 
https://github.com/llvm/llvm-project/commit/a6817b7315af5da94cfbe69767c8e8f827fecbca.diff

LOG: CoverageMappingWriter: Emit `Decision` before `Expansion` (#78966)

To simplify scanning of the records, order `Decision` before `Expansion`.
Otherwise it cannot be determined whether an `Expansion` belongs to a
`Decision` or not.

Relevant to #77871

(cherry picked from commit 438fe1db09b0c20708ea1020519d8073c37feae8)
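The ordering trick in this change can be tried out in isolation. The following is a minimal sketch, not the writer's actual code: the `RegionKind` values and the `Region` struct are hypothetical stand-ins, with the enumerator values assumed for illustration. Doubling every kind leaves the odd keys free, so `Decision` can take the key just below `Expansion`.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical stand-in for CounterMappingRegion::RegionKind. The numeric
// values are an assumption made for this sketch.
enum RegionKind {
  CodeRegion = 0,
  ExpansionRegion = 1,
  SkippedRegion = 2,
  GapRegion = 3,
  BranchRegion = 4,
  MCDCDecisionRegion = 5,
  MCDCBranchRegion = 6,
};

// Hypothetical, simplified region: file ID, start position and kind.
struct Region {
  unsigned FileID;
  unsigned Line;
  unsigned Col;
  RegionKind Kind;
};

// Every kind maps to an even key except Decision, which is given the odd
// key immediately below Expansion's key.
int getKindKey(RegionKind Kind) {
  return Kind == MCDCDecisionRegion ? 2 * ExpansionRegion - 1 : 2 * Kind;
}

// Order by file, then start location, then the adjusted kind key.
bool regionLess(const Region &LHS, const Region &RHS) {
  if (LHS.FileID != RHS.FileID)
    return LHS.FileID < RHS.FileID;
  std::pair<unsigned, unsigned> L(LHS.Line, LHS.Col), R(RHS.Line, RHS.Col);
  if (L != R)
    return L < R;
  return getKindKey(LHS.Kind) < getKindKey(RHS.Kind);
}

std::vector<Region> sortRegions(std::vector<Region> Regions) {
  std::stable_sort(Regions.begin(), Regions.end(), regionLess);
  return Regions;
}
```

Under these assumed values, a `Decision` and an `Expansion` that start at the same location sort with the `Decision` first, while the relative order of all other kinds is unchanged.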

Added: 


Modified: 
llvm/lib/ProfileData/Coverage/CoverageMappingWriter.cpp
llvm/unittests/ProfileData/CoverageMappingTest.cpp

Removed: 




diff  --git a/llvm/lib/ProfileData/Coverage/CoverageMappingWriter.cpp 
b/llvm/lib/ProfileData/Coverage/CoverageMappingWriter.cpp
index 1c7d8a8909c48..27727f216b051 100644
--- a/llvm/lib/ProfileData/Coverage/CoverageMappingWriter.cpp
+++ b/llvm/lib/ProfileData/Coverage/CoverageMappingWriter.cpp
@@ -167,7 +167,15 @@ void CoverageMappingWriter::write(raw_ostream &OS) {
   return LHS.FileID < RHS.FileID;
 if (LHS.startLoc() != RHS.startLoc())
   return LHS.startLoc() < RHS.startLoc();
-return LHS.Kind < RHS.Kind;
+
+// Put `Decision` before `Expansion`.
+auto getKindKey = [](CounterMappingRegion::RegionKind Kind) {
+  return (Kind == CounterMappingRegion::MCDCDecisionRegion
+  ? 2 * CounterMappingRegion::ExpansionRegion - 1
+  : 2 * Kind);
+};
+
+return getKindKey(LHS.Kind) < getKindKey(RHS.Kind);
   });
 
   // Write out the fileid -> filename mapping.

diff  --git a/llvm/unittests/ProfileData/CoverageMappingTest.cpp 
b/llvm/unittests/ProfileData/CoverageMappingTest.cpp
index 23f66a0232ddb..2849781a9dc43 100644
--- a/llvm/unittests/ProfileData/CoverageMappingTest.cpp
+++ b/llvm/unittests/ProfileData/CoverageMappingTest.cpp
@@ -890,6 +890,42 @@ TEST_P(CoverageMappingTest, non_code_region_bitmask) {
   ASSERT_EQ(1U, Names.size());
 }
 
+// Test the order of MCDCDecision before Expansion
+TEST_P(CoverageMappingTest, decision_before_expansion) {
+  startFunction("foo", 0x1234);
+  addCMR(Counter::getCounter(0), "foo", 3, 23, 5, 2);
+
+  // This(4:11) was put after Expansion(4:11) before the fix
+  addMCDCDecisionCMR(0, 2, "foo", 4, 11, 4, 20);
+
+  addExpansionCMR("foo", "A", 4, 11, 4, 12);
+  addExpansionCMR("foo", "B", 4, 19, 4, 20);
+  addCMR(Counter::getCounter(0), "A", 1, 14, 1, 17);
+  addCMR(Counter::getCounter(0), "A", 1, 14, 1, 17);
+  addMCDCBranchCMR(Counter::getCounter(0), Counter::getCounter(1), 1, 2, 0, "A",
+   1, 14, 1, 17);
+  addCMR(Counter::getCounter(1), "B", 1, 14, 1, 17);
+  addMCDCBranchCMR(Counter::getCounter(1), Counter::getCounter(2), 2, 0, 0, "B",
+   1, 14, 1, 17);
+
+  // InputFunctionCoverageData::Regions is rewritten after the write.
+  auto InputRegions = InputFunctions.back().Regions;
+
+  writeAndReadCoverageRegions();
+
+  const auto &OutputRegions = OutputFunctions.back().Regions;
+
+  size_t N = ArrayRef(InputRegions).size();
+  ASSERT_EQ(N, OutputRegions.size());
+  for (size_t I = 0; I < N; ++I) {
+ASSERT_EQ(InputRegions[I].Kind, OutputRegions[I].Kind);
+ASSERT_EQ(InputRegions[I].FileID, OutputRegions[I].FileID);
+ASSERT_EQ(InputRegions[I].ExpandedFileID, OutputRegions[I].ExpandedFileID);
+ASSERT_EQ(InputRegions[I].startLoc(), OutputRegions[I].startLoc());
+ASSERT_EQ(InputRegions[I].endLoc(), OutputRegions[I].endLoc());
+  }
+}
+
 TEST_P(CoverageMappingTest, strip_filename_prefix) {
   ProfileWriter.addRecord({"file1:func", 0x1234, {0}}, Err);
 




[llvm-branch-commits] [llvm] b50a84e - [Coverage] Let `Decision` take account of expansions (#78969)

2024-02-05 Thread Tom Stellard via llvm-branch-commits


Author: NAKAMURA Takumi
Date: 2024-02-05T11:32:51-08:00
New Revision: b50a84e303378df35996d7330aa80aa4ea1f497a

URL: 
https://github.com/llvm/llvm-project/commit/b50a84e303378df35996d7330aa80aa4ea1f497a
DIFF: 
https://github.com/llvm/llvm-project/commit/b50a84e303378df35996d7330aa80aa4ea1f497a.diff

LOG: [Coverage] Let `Decision` take account of expansions (#78969)

The current implementation (D138849) assumes that `Branch`(es) follow
the corresponding `Decision`. That is not true if `Branch`(es) are
forwarded to an expanded file ID. As a result, consecutive `Decision`(s)
would be matched with an insufficient number of `Branch`(es).

`Expansion` will point `Branch`(es) in other file IDs if `Expansion` is
included in the range of `Decision`.

Fixes #77871

-

Co-authored-by: Alan Phipps 
(cherry picked from commit d912f1f0cb49465b08f82fae89ece222404e5640)
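The association rule described above (a region belongs to a `Decision` if it lies inside the decision's range, or if it lives in a file ID reached through an expansion inside that range) can be sketched standalone. The types below are hypothetical simplifications, and `recordExpansion` with its explicit `ExpandedFileID` parameter is an assumed helper for illustration, not necessarily how the patch tracks expansions.

```cpp
#include <cassert>
#include <set>
#include <utility>

using LineColPair = std::pair<unsigned, unsigned>;

// Hypothetical, simplified mapping region.
struct Region {
  unsigned FileID;
  LineColPair Start;
  LineColPair End;
};

// Hypothetical, simplified decision record.
struct DecisionRecord {
  Region Decision;
  // File IDs of expansions found inside the decision, directly or through
  // nested expansions.
  std::set<unsigned> ExpandedFileIDs;

  // A region is dominated if it lies within the decision's source range in
  // the same file, or if it lives in a file reached via a known expansion.
  bool dominates(const Region &R) const {
    if (R.FileID == Decision.FileID && R.Start >= Decision.Start &&
        R.End <= Decision.End)
      return true;
    return ExpandedFileIDs.count(R.FileID) != 0;
  }

  // Remember the file an expansion points to, provided the expansion itself
  // is dominated by this decision. Returns whether it was recorded.
  bool recordExpansion(const Region &Expansion, unsigned ExpandedFileID) {
    if (!dominates(Expansion))
      return false;
    ExpandedFileIDs.insert(ExpandedFileID);
    return true;
  }
};
```

In this sketch, a branch located in a macro's file ID is not dominated until the expansion pointing at that file has been recorded, which is one reason it helps for the `Decision` to be seen before the `Expansion` at the same location.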

Added: 
llvm/test/tools/llvm-cov/Inputs/mcdc-macro.c
llvm/test/tools/llvm-cov/Inputs/mcdc-macro.o
llvm/test/tools/llvm-cov/Inputs/mcdc-macro.proftext
llvm/test/tools/llvm-cov/mcdc-macro.test

Modified: 
llvm/lib/ProfileData/Coverage/CoverageMapping.cpp

Removed: 




diff  --git a/llvm/lib/ProfileData/Coverage/CoverageMapping.cpp 
b/llvm/lib/ProfileData/Coverage/CoverageMapping.cpp
index da8e1d87319dd..a357b4cb49211 100644
--- a/llvm/lib/ProfileData/Coverage/CoverageMapping.cpp
+++ b/llvm/lib/ProfileData/Coverage/CoverageMapping.cpp
@@ -14,6 +14,7 @@
 #include "llvm/ProfileData/Coverage/CoverageMapping.h"
 #include "llvm/ADT/ArrayRef.h"
 #include "llvm/ADT/DenseMap.h"
+#include "llvm/ADT/STLExtras.h"
 #include "llvm/ADT/SmallBitVector.h"
 #include "llvm/ADT/SmallVector.h"
 #include "llvm/ADT/StringExtras.h"
@@ -583,6 +584,160 @@ static unsigned getMaxBitmapSize(const 
CounterMappingContext &Ctx,
   return MaxBitmapID + (SizeInBits / CHAR_BIT);
 }
 
+namespace {
+
+/// Collect Decisions, Branchs, and Expansions and associate them.
+class MCDCDecisionRecorder {
+private:
+  /// This holds the DecisionRegion and MCDCBranches under it.
+  /// Also traverses Expansion(s).
+  /// The Decision has the number of MCDCBranches and will complete
+  /// when it is filled with unique ConditionID of MCDCBranches.
+  struct DecisionRecord {
+const CounterMappingRegion *DecisionRegion;
+
+/// They are reflected from DecisionRegion for convenience.
+LineColPair DecisionStartLoc;
+LineColPair DecisionEndLoc;
+
+/// This is passed to `MCDCRecordProcessor`, so this should be compatible
+/// to`ArrayRef`.
+SmallVector MCDCBranches;
+
+/// IDs that are stored in MCDCBranches
+/// Complete when all IDs (1 to NumConditions) are met.
+DenseSet ConditionIDs;
+
+/// Set of IDs of Expansion(s) that are relevant to DecisionRegion
+/// and its children (via expansions).
+/// FileID  pointed by ExpandedFileID is dedicated to the expansion, so
+/// the location in the expansion doesn't matter.
+DenseSet ExpandedFileIDs;
+
+DecisionRecord(const CounterMappingRegion &Decision)
+: DecisionRegion(&Decision), DecisionStartLoc(Decision.startLoc()),
+  DecisionEndLoc(Decision.endLoc()) {
+  assert(Decision.Kind == CounterMappingRegion::MCDCDecisionRegion);
+}
+
+/// Determine whether DecisionRecord dominates `R`.
+bool dominates(const CounterMappingRegion &R) const {
+  // Determine whether `R` is included in `DecisionRegion`.
+  if (R.FileID == DecisionRegion->FileID &&
+  R.startLoc() >= DecisionStartLoc && R.endLoc() <= DecisionEndLoc)
+return true;
+
+  // Determine whether `R` is pointed by any of Expansions.
+  return ExpandedFileIDs.contains(R.FileID);
+}
+
+enum Result {
+  NotProcessed = 0, /// Irrelevant to this Decision
+  Processed,/// Added to this Decision
+  Completed,/// Added and filled this Decision
+};
+
+/// Add Branch into the Decision
+/// \param Branch expects MCDCBranchRegion
+/// \returns NotProcessed/Processed/Completed
+Result addBranch(const CounterMappingRegion &Branch) {
+  assert(Branch.Kind == CounterMappingRegion::MCDCBranchRegion);
+
+  auto ConditionID = Branch.MCDCParams.ID;
+  assert(ConditionID > 0 && "ConditionID should begin with 1");
+
+  if (ConditionIDs.contains(ConditionID) ||
+  ConditionID > DecisionRegion->MCDCParams.NumConditions)
+return NotProcessed;
+
+  if (!this->dominates(Branch))
+return NotProcessed;
+
+  assert(MCDCBranches.size() < DecisionRegion->MCDCParams.NumConditions);
+
+  // Put `ID=1` in front of `MCDCBranches` for convenience
+  // even if `MCDCBranches` is not topological.
+  if (ConditionID == 1)
+MCDCBranches.insert(MCDCBranches.begin(), &Branch);
+  else
+MCDCBranches.push_back(&Branch);
+
+  // Mark `ID` as `assigned`.
+  ConditionIDs.insert(Con

[llvm-branch-commits] [llvm] PR for llvm/llvm-project#77871 (PR #80513)

2024-02-05 Thread Tom Stellard via llvm-branch-commits


https://github.com/tstellar closed 
https://github.com/llvm/llvm-project/pull/80513

[llvm-branch-commits] [flang] [flang][OpenMP] Add support for copyprivate (PR #80485)

2024-02-05 Thread Leandro Lupori via llvm-branch-commits



@@ -1092,6 +1040,79 @@ class FirConverter : public 
Fortran::lower::AbstractConverter {
 return true;
   }
 
+  void copyVar(const Fortran::semantics::Symbol &sym,
+   const Fortran::lower::SymbolBox &lhs_sb,
+   const Fortran::lower::SymbolBox &rhs_sb) {
+mlir::Location loc = genLocation(sym.name());
+if (lowerToHighLevelFIR())
+  copyVarHLFIR(loc, lhs_sb.getAddr(), rhs_sb.getAddr());
+else
+  copyVarFIR(loc, sym, lhs_sb, rhs_sb);
+  }
+
+  void copyVarHLFIR(mlir::Location loc, mlir::Value dst, mlir::Value src) {
+assert(lowerToHighLevelFIR());
+hlfir::Entity lhs{dst};
+hlfir::Entity rhs{src};
+// Temporary_lhs is set to true in hlfir.assign below to avoid user
+// assignment to be used and finalization to be called on the LHS.
+// This may or may not be correct but mimics the current behaviour
+// without HLFIR.
+auto copyData = [&](hlfir::Entity l, hlfir::Entity r) {
+  // Dereference RHS and load it if trivial scalar.
+  r = hlfir::loadTrivialScalar(loc, *builder, r);
+  builder->create<hlfir::AssignOp>(
+  loc, r, l,
+  /*isWholeAllocatableAssignment=*/false,
+  /*keepLhsLengthInAllocatableAssignment=*/false,
+  /*temporary_lhs=*/true);
+};
+if (lhs.isAllocatable()) {
+  // Deep copy allocatable if it is allocated.
+  // Note that when allocated, the RHS is already allocated with the LHS
+  // shape for copy on entry in createHostAssociateVarClone.
+  // For lastprivate, this assumes that the RHS was not reallocated in
+  // the OpenMP region.
+  lhs = hlfir::derefPointersAndAllocatables(loc, *builder, lhs);
+  mlir::Value addr = hlfir::genVariableRawAddress(loc, *builder, lhs);
+  mlir::Value isAllocated = builder->genIsNotNullAddr(loc, addr);
+  builder->genIfThen(loc, isAllocated)
+  .genThen([&]() {
+// Copy the DATA, not the descriptors.
+copyData(lhs, rhs);
+  })
+  .end();
+} else if (lhs.isPointer()) {
+  // Set LHS target to the target of RHS (do not copy the RHS
+  // target data into the LHS target storage).
+  auto loadVal = builder->create<fir::LoadOp>(loc, rhs);
+  builder->create<fir::StoreOp>(loc, loadVal, lhs);
+} else {
+  // Non ALLOCATABLE/POINTER variable. Simple DATA copy.
+  copyData(lhs, rhs);
+}
+  }
+
+  void copyVarFIR(mlir::Location loc, const Fortran::semantics::Symbol &sym,
+  const Fortran::lower::SymbolBox &lhs_sb,
+  const Fortran::lower::SymbolBox &rhs_sb) {
+assert(!lowerToHighLevelFIR());
+fir::ExtendedValue lhs = symBoxToExtendedValue(lhs_sb);
+fir::ExtendedValue rhs = symBoxToExtendedValue(rhs_sb);
+mlir::Type symType = genType(sym);
+if (auto seqTy = symType.dyn_cast<fir::SequenceType>()) {
+  Fortran::lower::StatementContext stmtCtx;
+  Fortran::lower::createSomeArrayAssignment(*this, lhs, rhs, localSymbols,
+stmtCtx);
+  stmtCtx.finalizeAndReset();
+} else if (lhs.getBoxOf<fir::CharBoxValue>()) {
+  fir::factory::CharacterExprHelper{*builder, loc}.createAssign(lhs, rhs);
+} else {
+  auto loadVal = builder->create<fir::LoadOp>(loc, fir::getBase(rhs));
+  builder->create<fir::StoreOp>(loc, loadVal, fir::getBase(lhs));
+}
+  }
+

luporl wrote:

I guess it would be possible to move this to OpenMP.cpp, but that would mean 
duplicating around 40 lines of code.
The `copyVarHLFIR()` code was extracted from `copyHostAssociateVar()`, which now 
calls `copyVar()` instead.

Can we keep it in the converter to avoid code duplication?

https://github.com/llvm/llvm-project/pull/80485

[llvm-branch-commits] [compiler-rt] PR for llvm/llvm-project#80543 (PR #80544)

2024-02-05 Thread via llvm-branch-commits

https://github.com/llvmbot updated 
https://github.com/llvm/llvm-project/pull/80544

>From 7a5cba8bea8f774d48db1b0426bcc102edd2b69f Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Martin=20Storsj=C3=B6?= 
Date: Sat, 3 Feb 2024 14:52:49 +0100
Subject: [PATCH] [compiler-rt] Remove duplicate MS names for chkstk symbols
 (#80450)

Prior to 885d7b759b5c166c07c07f4c58c6e0ba110fb0c2, the builtins library
contained two chkstk implementations for each of i386 and x86_64, one
that was used in mingw environments, and one unused (with a symbol name
not matching anything that is used anywhere). Some of the functions
additionally had other, also unused, aliases.

After cleaning this up in 885d7b759b5c166c07c07f4c58c6e0ba110fb0c2, the
unused symbol names were removed.

At the same time, symbol aliases were added for the names as they are
used by MSVC; the functions are functionally equivalent, but have
different names between mingw and MSVC style environments.

By adding a symbol alias (so that one object file contains two different
symbols for the same function), users can run into problems with
duplicate definitions, if they themselves define one of the symbols (for
various reasons), but need to link in the other one.

This happens for Wine, which provides their own definition of
"__chkstk", but when built in mingw mode does need compiler-rt to
provide the mingw specific symbol names; see
https://github.com/mstorsjo/llvm-mingw/issues/397.

To avoid the issue, remove the extra MS style names. They weren't
entirely usable as such for MSVC style environments anyway, as
compiler-rt builtins don't build these object files at all, when built
in MSVC mode; thus, the effort to provide them for MSVC style
environments in 885d7b759b5c166c07c07f4c58c6e0ba110fb0c2 was a
half-hearted step towards that.

If we really do want to provide those functions (as an alternative to
the ones provided by MSVC itself), we should do it in a separate object
file (even if the function implementation is the same), so that users
who have a definition of one of them but need a definition of the other,
won't have conflicts.

Additionally, if we do want to provide them for MSVC, those files
actually should be built when building the builtins in MSVC mode as well
(see compiler-rt/lib/builtins/CMakeLists.txt).

If we do that, there's a risk that an MSVC style build ends up linking
in and preferring our implementation over the one provided by MSVC,
which would be suboptimal. Our implementation always probes the
requested amount of stack, while the MSVC one checks the amount of
allocated stack and only probes as much as really is needed.

In short - this reverts the situation to what it was in the 17.x release
series (except for unused functions that have been removed).

(cherry picked from commit 248aeac1ad2cf4f583490dd1312a5b448d2bb8cc)
---
 compiler-rt/lib/builtins/i386/chkstk.S   | 2 --
 compiler-rt/lib/builtins/x86_64/chkstk.S | 2 --
 2 files changed, 4 deletions(-)

diff --git a/compiler-rt/lib/builtins/i386/chkstk.S b/compiler-rt/lib/builtins/i386/chkstk.S
index a84bb0ee30070..cdd9a4c2a5752 100644
--- a/compiler-rt/lib/builtins/i386/chkstk.S
+++ b/compiler-rt/lib/builtins/i386/chkstk.S
@@ -14,7 +14,6 @@
 .text
 .balign 4
 DEFINE_COMPILERRT_FUNCTION(_alloca) // _chkstk and _alloca are the same function
-DEFINE_COMPILERRT_FUNCTION(_chkstk)
 push   %ecx
 cmp    $0x1000,%eax
 lea    8(%esp),%ecx // esp before calling this routine -> ecx
@@ -35,7 +34,6 @@ DEFINE_COMPILERRT_FUNCTION(_chkstk)
 push   (%eax)   // push return address onto the stack
 sub    %esp,%eax // restore the original value in eax
 ret
-END_COMPILERRT_FUNCTION(_chkstk)
 END_COMPILERRT_FUNCTION(_alloca)

 #endif // __i386__
diff --git a/compiler-rt/lib/builtins/x86_64/chkstk.S b/compiler-rt/lib/builtins/x86_64/chkstk.S
index 494ee261193bc..ad7953a116ac7 100644
--- a/compiler-rt/lib/builtins/x86_64/chkstk.S
+++ b/compiler-rt/lib/builtins/x86_64/chkstk.S
@@ -18,7 +18,6 @@
 .text
 .balign 4
 DEFINE_COMPILERRT_FUNCTION(___chkstk_ms)
-DEFINE_COMPILERRT_FUNCTION(__chkstk)
 push   %rcx
 push   %rax
 cmp    $0x1000,%rax
@@ -36,7 +35,6 @@ DEFINE_COMPILERRT_FUNCTION(__chkstk)
 pop    %rax
 pop    %rcx
 ret
-END_COMPILERRT_FUNCTION(__chkstk)
 END_COMPILERRT_FUNCTION(___chkstk_ms)

 #endif // __x86_64__


[llvm-branch-commits] [compiler-rt] 7a5cba8 - [compiler-rt] Remove duplicate MS names for chkstk symbols (#80450)

2024-02-05 Thread Tom Stellard via llvm-branch-commits


Author: Martin Storsjö
Date: 2024-02-05T11:39:38-08:00
New Revision: 7a5cba8bea8f774d48db1b0426bcc102edd2b69f

URL: 
https://github.com/llvm/llvm-project/commit/7a5cba8bea8f774d48db1b0426bcc102edd2b69f
DIFF: 
https://github.com/llvm/llvm-project/commit/7a5cba8bea8f774d48db1b0426bcc102edd2b69f.diff

LOG: [compiler-rt] Remove duplicate MS names for chkstk symbols (#80450)

Prior to 885d7b759b5c166c07c07f4c58c6e0ba110fb0c2, the builtins library
contained two chkstk implementations for each of i386 and x86_64, one
that was used in mingw environments, and one unused (with a symbol name
not matching anything that is used anywhere). Some of the functions
additionally had other, also unused, aliases.

After cleaning this up in 885d7b759b5c166c07c07f4c58c6e0ba110fb0c2, the
unused symbol names were removed.

At the same time, symbol aliases were added for the names as they are
used by MSVC; the functions are functionally equivalent, but have
different names between mingw and MSVC style environments.

By adding a symbol alias (so that one object file contains two different
symbols for the same function), users can run into problems with
duplicate definitions, if they themselves define one of the symbols (for
various reasons), but need to link in the other one.

This happens for Wine, which provides their own definition of
"__chkstk", but when built in mingw mode does need compiler-rt to
provide the mingw specific symbol names; see
https://github.com/mstorsjo/llvm-mingw/issues/397.

To avoid the issue, remove the extra MS style names. They weren't
entirely usable as such for MSVC style environments anyway, as
compiler-rt builtins don't build these object files at all, when built
in MSVC mode; thus, the effort to provide them for MSVC style
environments in 885d7b759b5c166c07c07f4c58c6e0ba110fb0c2 was a
half-hearted step towards that.

If we really do want to provide those functions (as an alternative to
the ones provided by MSVC itself), we should do it in a separate object
file (even if the function implementation is the same), so that users
who have a definition of one of them but need a definition of the other,
won't have conflicts.

Additionally, if we do want to provide them for MSVC, those files
actually should be built when building the builtins in MSVC mode as well
(see compiler-rt/lib/builtins/CMakeLists.txt).

If we do that, there's a risk that an MSVC style build ends up linking
in and preferring our implementation over the one provided by MSVC,
which would be suboptimal. Our implementation always probes the
requested amount of stack, while the MSVC one checks the amount of
allocated stack and only probes as much as really is needed.

In short - this reverts the situation to what it was in the 17.x release
series (except for unused functions that have been removed).

(cherry picked from commit 248aeac1ad2cf4f583490dd1312a5b448d2bb8cc)
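
As an illustration of the probing difference described in the log (not part of the patch), here is a minimal C++ model of the two strategies. It assumes a 4 KiB page (the 0x1000 constant in chkstk.S) and a stack that grows downwards. The function names and the committedLow parameter are hypothetical; the real routines are the assembly in chkstk.S and the MSVC runtime, which touch pages rather than count them.

```cpp
#include <cassert>
#include <cstdint>

// Page size used by the probe loop in chkstk.S (cmp $0x1000).
constexpr std::uint64_t kPageSize = 0x1000;

// Simplified model of "always probes the requested amount":
// every page covered by the request is touched.
std::uint64_t pagesProbedAlways(std::uint64_t requestedBytes) {
  return (requestedBytes + kPageSize - 1) / kPageSize;
}

// Simplified model of "only probes as much as really is needed":
// pages at or above committedLow are already usable, so only the part
// of the allocation that falls below it has to be touched.
// sp is the current stack pointer; the stack grows towards lower addresses.
std::uint64_t pagesProbedLazy(std::uint64_t sp, std::uint64_t committedLow,
                              std::uint64_t requestedBytes) {
  assert(requestedBytes <= sp && committedLow <= sp);
  const std::uint64_t newSp = sp - requestedBytes;
  if (newSp >= committedLow)
    return 0; // the whole allocation fits in already-committed stack
  return (committedLow - newSp + kPageSize - 1) / kPageSize;
}
```

Under this model a 12 KiB request always costs three probes with the first strategy, while the second costs none as long as the request stays inside stack that is already committed.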

Added: 


Modified: 
compiler-rt/lib/builtins/i386/chkstk.S
compiler-rt/lib/builtins/x86_64/chkstk.S

Removed: 




diff  --git a/compiler-rt/lib/builtins/i386/chkstk.S b/compiler-rt/lib/builtins/i386/chkstk.S
index a84bb0ee30070..cdd9a4c2a5752 100644
--- a/compiler-rt/lib/builtins/i386/chkstk.S
+++ b/compiler-rt/lib/builtins/i386/chkstk.S
@@ -14,7 +14,6 @@
 .text
 .balign 4
 DEFINE_COMPILERRT_FUNCTION(_alloca) // _chkstk and _alloca are the same function
-DEFINE_COMPILERRT_FUNCTION(_chkstk)
 push   %ecx
 cmp    $0x1000,%eax
 lea    8(%esp),%ecx // esp before calling this routine -> ecx
@@ -35,7 +34,6 @@ DEFINE_COMPILERRT_FUNCTION(_chkstk)
 push   (%eax)   // push return address onto the stack
 sub    %esp,%eax // restore the original value in eax
 ret
-END_COMPILERRT_FUNCTION(_chkstk)
 END_COMPILERRT_FUNCTION(_alloca)
 
 #endif // __i386__

diff  --git a/compiler-rt/lib/builtins/x86_64/chkstk.S b/compiler-rt/lib/builtins/x86_64/chkstk.S
index 494ee261193bc..ad7953a116ac7 100644
--- a/compiler-rt/lib/builtins/x86_64/chkstk.S
+++ b/compiler-rt/lib/builtins/x86_64/chkstk.S
@@ -18,7 +18,6 @@
 .text
 .balign 4
 DEFINE_COMPILERRT_FUNCTION(___chkstk_ms)
-DEFINE_COMPILERRT_FUNCTION(__chkstk)
 push   %rcx
 push   %rax
 cmp    $0x1000,%rax
@@ -36,7 +35,6 @@ DEFINE_COMPILERRT_FUNCTION(__chkstk)
 pop    %rax
 pop    %rcx
 ret
-END_COMPILERRT_FUNCTION(__chkstk)
 END_COMPILERRT_FUNCTION(___chkstk_ms)
 
 #endif // __x86_64__




[llvm-branch-commits] [compiler-rt] PR for llvm/llvm-project#80543 (PR #80544)

2024-02-05 Thread Tom Stellard via llvm-branch-commits


https://github.com/tstellar closed 
https://github.com/llvm/llvm-project/pull/80544

[llvm-branch-commits] [llvm] PR for llvm/llvm-project#79175 (PR #80274)

2024-02-05 Thread via llvm-branch-commits


https://github.com/llvmbot updated 
https://github.com/llvm/llvm-project/pull/80274

>From aa6980841e587eba9c98bf54c51f5414f8a15871 Mon Sep 17 00:00:00 2001
From: Nikita Popov 
Date: Wed, 24 Jan 2024 12:33:57 +0100
Subject: [PATCH 1/3] [Loads] Use BatchAAResults for available value APIs
 (NFCI)

This allows caching AA queries both within and across the calls,
and enables us to use a custom AAQI configuration.

(cherry picked from commit 89dae798cc77789a43e9a60173f647dae03a65fe)
---
 llvm/include/llvm/Analysis/Loads.h   | 12 ++--
 llvm/lib/Analysis/Lint.cpp   |  3 ++-
 llvm/lib/Analysis/Loads.cpp  |  9 -
 .../InstCombine/InstCombineLoadStoreAlloca.cpp   |  3 ++-
 llvm/lib/Transforms/Scalar/JumpThreading.cpp | 11 ++-
 5 files changed, 20 insertions(+), 18 deletions(-)

diff --git a/llvm/include/llvm/Analysis/Loads.h b/llvm/include/llvm/Analysis/Loads.h
index 2880ed33a34cbc..0926093bba99de 100644
--- a/llvm/include/llvm/Analysis/Loads.h
+++ b/llvm/include/llvm/Analysis/Loads.h
@@ -18,7 +18,7 @@
 
 namespace llvm {
 
-class AAResults;
+class BatchAAResults;
 class AssumptionCache;
 class DataLayout;
 class DominatorTree;
@@ -129,11 +129,10 @@ extern cl::opt<unsigned> DefMaxInstsToScan;
 /// location in memory, as opposed to the value operand of a store.
 ///
 /// \returns The found value, or nullptr if no value is found.
-Value *FindAvailableLoadedValue(LoadInst *Load,
-BasicBlock *ScanBB,
+Value *FindAvailableLoadedValue(LoadInst *Load, BasicBlock *ScanBB,
 BasicBlock::iterator &ScanFrom,
 unsigned MaxInstsToScan = DefMaxInstsToScan,
-AAResults *AA = nullptr,
+BatchAAResults *AA = nullptr,
 bool *IsLoadCSE = nullptr,
 unsigned *NumScanedInst = nullptr);
 
@@ -141,7 +140,8 @@ Value *FindAvailableLoadedValue(LoadInst *Load,
 /// FindAvailableLoadedValue() for the case where we are not interested in
 /// finding the closest clobbering instruction if no available load is found.
 /// This overload cannot be used to scan across multiple blocks.
-Value *FindAvailableLoadedValue(LoadInst *Load, AAResults &AA, bool *IsLoadCSE,
+Value *FindAvailableLoadedValue(LoadInst *Load, BatchAAResults &AA,
+bool *IsLoadCSE,
 unsigned MaxInstsToScan = DefMaxInstsToScan);
 
 /// Scan backwards to see if we have the value of the given pointer available
@@ -170,7 +170,7 @@ Value *FindAvailableLoadedValue(LoadInst *Load, AAResults &AA, bool *IsLoadCSE,
 Value *findAvailablePtrLoadStore(const MemoryLocation &Loc, Type *AccessTy,
  bool AtLeastAtomic, BasicBlock *ScanBB,
  BasicBlock::iterator &ScanFrom,
- unsigned MaxInstsToScan, AAResults *AA,
+ unsigned MaxInstsToScan, BatchAAResults *AA,
  bool *IsLoadCSE, unsigned *NumScanedInst);
 
 /// Returns true if a pointer value \p A can be replace with another pointer
diff --git a/llvm/lib/Analysis/Lint.cpp b/llvm/lib/Analysis/Lint.cpp
index 1ebc593016bc0d..16635097d20afe 100644
--- a/llvm/lib/Analysis/Lint.cpp
+++ b/llvm/lib/Analysis/Lint.cpp
@@ -657,11 +657,12 @@ Value *Lint::findValueImpl(Value *V, bool OffsetOk,
 BasicBlock::iterator BBI = L->getIterator();
 BasicBlock *BB = L->getParent();
 SmallPtrSet VisitedBlocks;
+BatchAAResults BatchAA(*AA);
 for (;;) {
   if (!VisitedBlocks.insert(BB).second)
 break;
   if (Value *U =
-  FindAvailableLoadedValue(L, BB, BBI, DefMaxInstsToScan, AA))
+  FindAvailableLoadedValue(L, BB, BBI, DefMaxInstsToScan, &BatchAA))
 return findValueImpl(U, OffsetOk, Visited);
   if (BBI != BB->begin())
 break;
diff --git a/llvm/lib/Analysis/Loads.cpp b/llvm/lib/Analysis/Loads.cpp
index 97d21db86abf28..6bf0d2f56eb4eb 100644
--- a/llvm/lib/Analysis/Loads.cpp
+++ b/llvm/lib/Analysis/Loads.cpp
@@ -450,11 +450,10 @@ llvm::DefMaxInstsToScan("available-load-scan-limit", cl::init(6), cl::Hidden,
"to scan backward from a given instruction, when searching for "
"available loaded value"));
 
-Value *llvm::FindAvailableLoadedValue(LoadInst *Load,
-  BasicBlock *ScanBB,
+Value *llvm::FindAvailableLoadedValue(LoadInst *Load, BasicBlock *ScanBB,
   BasicBlock::iterator &ScanFrom,
   unsigned MaxInstsToScan,
-  AAResults *AA, bool *IsLoad,
+  BatchAAResults *AA, bool *IsLoad,
   unsigned *NumScanedInst) {
   // Don't CSE load that

[llvm-branch-commits] [llvm] aa69808 - [Loads] Use BatchAAResults for available value APIs (NFCI)

2024-02-05 Thread Tom Stellard via llvm-branch-commits


Author: Nikita Popov
Date: 2024-02-05T11:41:54-08:00
New Revision: aa6980841e587eba9c98bf54c51f5414f8a15871

URL: 
https://github.com/llvm/llvm-project/commit/aa6980841e587eba9c98bf54c51f5414f8a15871
DIFF: 
https://github.com/llvm/llvm-project/commit/aa6980841e587eba9c98bf54c51f5414f8a15871.diff

LOG: [Loads] Use BatchAAResults for available value APIs (NFCI)

This allows caching AA queries both within and across the calls,
and enables us to use a custom AAQI configuration.

(cherry picked from commit 89dae798cc77789a43e9a60173f647dae03a65fe)
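
To illustrate why passing a batch object helps (this is not LLVM code), here is a minimal self-contained sketch of a caching wrapper around an expensive query. Every type and function name below is hypothetical and lives in a `sketch` namespace; the real classes are `AAResults` and `BatchAAResults`, whose caching is more involved than a plain map.

```cpp
#include <cstddef>
#include <map>
#include <utility>

namespace sketch {

enum class Alias { No, May, Must };

// Stand-in for an expensive alias oracle; counts how often it is consulted.
struct SlowOracle {
  std::size_t Calls = 0;
  Alias query(int A, int B) {
    ++Calls;
    if (A == B)
      return Alias::Must;
    return (A % 2 == B % 2) ? Alias::May : Alias::No;
  }
};

// Batch wrapper: remembers every answer for the lifetime of the batch.
class BatchOracle {
  SlowOracle &Inner;
  std::map<std::pair<int, int>, Alias> Cache;

public:
  explicit BatchOracle(SlowOracle &O) : Inner(O) {}

  Alias query(int A, int B) {
    auto Key = std::make_pair(A, B);
    auto It = Cache.find(Key);
    if (It != Cache.end())
      return It->second; // answered from the cache
    Alias R = Inner.query(A, B);
    Cache.emplace(Key, R);
    return R;
  }
};

// A helper that takes the batch object, so repeated calls share one cache.
// Counts the pointers in [First, Last) that may or must alias Ptr.
std::size_t countPossibleAliases(BatchOracle &AA, int Ptr, int First,
                                 int Last) {
  std::size_t N = 0;
  for (int P = First; P < Last; ++P)
    if (AA.query(Ptr, P) != Alias::No)
      ++N;
  return N;
}

} // namespace sketch
```

Because the caller owns the batch object, a second call to the helper with the same arguments does not reach the underlying oracle at all, which is the "across the calls" caching the log refers to.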

Added: 


Modified: 
llvm/include/llvm/Analysis/Loads.h
llvm/lib/Analysis/Lint.cpp
llvm/lib/Analysis/Loads.cpp
llvm/lib/Transforms/InstCombine/InstCombineLoadStoreAlloca.cpp
llvm/lib/Transforms/Scalar/JumpThreading.cpp

Removed: 




diff  --git a/llvm/include/llvm/Analysis/Loads.h b/llvm/include/llvm/Analysis/Loads.h
index 2880ed33a34cb..0926093bba99d 100644
--- a/llvm/include/llvm/Analysis/Loads.h
+++ b/llvm/include/llvm/Analysis/Loads.h
@@ -18,7 +18,7 @@
 
 namespace llvm {
 
-class AAResults;
+class BatchAAResults;
 class AssumptionCache;
 class DataLayout;
 class DominatorTree;
@@ -129,11 +129,10 @@ extern cl::opt<unsigned> DefMaxInstsToScan;
 /// location in memory, as opposed to the value operand of a store.
 ///
 /// \returns The found value, or nullptr if no value is found.
-Value *FindAvailableLoadedValue(LoadInst *Load,
-BasicBlock *ScanBB,
+Value *FindAvailableLoadedValue(LoadInst *Load, BasicBlock *ScanBB,
 BasicBlock::iterator &ScanFrom,
 unsigned MaxInstsToScan = DefMaxInstsToScan,
-AAResults *AA = nullptr,
+BatchAAResults *AA = nullptr,
 bool *IsLoadCSE = nullptr,
 unsigned *NumScanedInst = nullptr);
 
@@ -141,7 +140,8 @@ Value *FindAvailableLoadedValue(LoadInst *Load,
 /// FindAvailableLoadedValue() for the case where we are not interested in
 /// finding the closest clobbering instruction if no available load is found.
 /// This overload cannot be used to scan across multiple blocks.
-Value *FindAvailableLoadedValue(LoadInst *Load, AAResults &AA, bool *IsLoadCSE,
+Value *FindAvailableLoadedValue(LoadInst *Load, BatchAAResults &AA,
+bool *IsLoadCSE,
 unsigned MaxInstsToScan = DefMaxInstsToScan);
 
 /// Scan backwards to see if we have the value of the given pointer available
@@ -170,7 +170,7 @@ Value *FindAvailableLoadedValue(LoadInst *Load, AAResults &AA, bool *IsLoadCSE,
 Value *findAvailablePtrLoadStore(const MemoryLocation &Loc, Type *AccessTy,
  bool AtLeastAtomic, BasicBlock *ScanBB,
  BasicBlock::iterator &ScanFrom,
- unsigned MaxInstsToScan, AAResults *AA,
+ unsigned MaxInstsToScan, BatchAAResults *AA,
  bool *IsLoadCSE, unsigned *NumScanedInst);
 
 /// Returns true if a pointer value \p A can be replace with another pointer

diff  --git a/llvm/lib/Analysis/Lint.cpp b/llvm/lib/Analysis/Lint.cpp
index 1ebc593016bc0..16635097d20af 100644
--- a/llvm/lib/Analysis/Lint.cpp
+++ b/llvm/lib/Analysis/Lint.cpp
@@ -657,11 +657,12 @@ Value *Lint::findValueImpl(Value *V, bool OffsetOk,
 BasicBlock::iterator BBI = L->getIterator();
 BasicBlock *BB = L->getParent();
 SmallPtrSet VisitedBlocks;
+BatchAAResults BatchAA(*AA);
 for (;;) {
   if (!VisitedBlocks.insert(BB).second)
 break;
   if (Value *U =
-  FindAvailableLoadedValue(L, BB, BBI, DefMaxInstsToScan, AA))
+  FindAvailableLoadedValue(L, BB, BBI, DefMaxInstsToScan, &BatchAA))
 return findValueImpl(U, OffsetOk, Visited);
   if (BBI != BB->begin())
 break;

diff  --git a/llvm/lib/Analysis/Loads.cpp b/llvm/lib/Analysis/Loads.cpp
index 97d21db86abf2..6bf0d2f56eb4e 100644
--- a/llvm/lib/Analysis/Loads.cpp
+++ b/llvm/lib/Analysis/Loads.cpp
@@ -450,11 +450,10 @@ llvm::DefMaxInstsToScan("available-load-scan-limit", cl::init(6), cl::Hidden,
"to scan backward from a given instruction, when searching for "
"available loaded value"));
 
-Value *llvm::FindAvailableLoadedValue(LoadInst *Load,
-  BasicBlock *ScanBB,
+Value *llvm::FindAvailableLoadedValue(LoadInst *Load, BasicBlock *ScanBB,
   BasicBlock::iterator &ScanFrom,
   unsigned MaxInstsToScan,
-  AAResults *AA, bool *IsLoad,
+  BatchAAResults *AA, bool *IsLoad,
   unsigned *NumScanedInst) {
   // Don't C

[llvm-branch-commits] [llvm] 28879ab - [AA][JumpThreading] Don't use DomTree for AA in JumpThreading (#79294)

2024-02-05 Thread Tom Stellard via llvm-branch-commits


Author: Nikita Popov
Date: 2024-02-05T11:41:55-08:00
New Revision: 28879ab8276e7237bfc86f4c7d7890fd4311d334

URL: 
https://github.com/llvm/llvm-project/commit/28879ab8276e7237bfc86f4c7d7890fd4311d334
DIFF: 
https://github.com/llvm/llvm-project/commit/28879ab8276e7237bfc86f4c7d7890fd4311d334.diff

LOG: [AA][JumpThreading] Don't use DomTree for AA in JumpThreading (#79294)

JumpThreading may perform AA queries while the dominator tree is not up
to date, which may result in miscompilations.

Fix this by adding a new AAQI option to disable the use of the dominator
tree in BasicAA.

Fixes https://github.com/llvm/llvm-project/issues/79175.

(cherry picked from commit 4f32f5d5720fbef06672714a62376f236a36aef5)
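
The mechanism can be sketched in isolation as follows. This is a stand-alone mock that only borrows the names from the patch: the types in the `sketch` namespace are empty placeholders, not the LLVM classes, `getDT` is made public for demonstration (it is private in the patch), and the `info()` accessor is hypothetical.

```cpp
namespace sketch {

// Placeholder for the real dominator tree.
struct DominatorTree {};

// Per-query options; the flag defaults to using the dominator tree.
struct AAQueryInfo {
  bool UseDominatorTree = true;
};

class BasicAAResult {
  // Never read this member directly; go through getDT() so that the
  // per-query option is respected.
  DominatorTree *DT_;

public:
  explicit BasicAAResult(DominatorTree *DT = nullptr) : DT_(DT) {}

  // Returns the tree only when the query allows it, otherwise nullptr.
  DominatorTree *getDT(const AAQueryInfo &AAQI) const {
    return AAQI.UseDominatorTree ? DT_ : nullptr;
  }
};

class BatchAAResults {
  AAQueryInfo AAQI;

public:
  // Disable the use of the dominator tree for all queries in this batch.
  void disableDominatorTree() { AAQI.UseDominatorTree = false; }

  // Hypothetical accessor, present only so the sketch can be exercised.
  const AAQueryInfo &info() const { return AAQI; }
};

} // namespace sketch
```

Routing every access through a single accessor means a pass whose dominator tree may be stale only has to flip one flag on its batch object, and each call site that previously read the member picks up a null tree instead.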

Added: 


Modified: 
llvm/include/llvm/Analysis/AliasAnalysis.h
llvm/include/llvm/Analysis/BasicAliasAnalysis.h
llvm/lib/Analysis/BasicAliasAnalysis.cpp
llvm/lib/Transforms/Scalar/JumpThreading.cpp
llvm/test/Transforms/JumpThreading/pr79175.ll

Removed: 




diff  --git a/llvm/include/llvm/Analysis/AliasAnalysis.h b/llvm/include/llvm/Analysis/AliasAnalysis.h
index d6f732d35fd4c..e8e4f491be5a3 100644
--- a/llvm/include/llvm/Analysis/AliasAnalysis.h
+++ b/llvm/include/llvm/Analysis/AliasAnalysis.h
@@ -287,6 +287,10 @@ class AAQueryInfo {
   ///   store %l, ...
   bool MayBeCrossIteration = false;
 
+  /// Whether alias analysis is allowed to use the dominator tree, for use by
+  /// passes that lazily update the DT while performing AA queries.
+  bool UseDominatorTree = true;
+
   AAQueryInfo(AAResults &AAR, CaptureInfo *CI) : AAR(AAR), CI(CI) {}
 };
 
@@ -668,6 +672,9 @@ class BatchAAResults {
   void enableCrossIterationMode() {
 AAQI.MayBeCrossIteration = true;
   }
+
+  /// Disable the use of the dominator tree during alias analysis queries.
+  void disableDominatorTree() { AAQI.UseDominatorTree = false; }
 };
 
 /// Temporary typedef for legacy code that uses a generic \c AliasAnalysis

diff  --git a/llvm/include/llvm/Analysis/BasicAliasAnalysis.h b/llvm/include/llvm/Analysis/BasicAliasAnalysis.h
index afc1811239f28..7eca82729430d 100644
--- a/llvm/include/llvm/Analysis/BasicAliasAnalysis.h
+++ b/llvm/include/llvm/Analysis/BasicAliasAnalysis.h
@@ -43,20 +43,26 @@ class BasicAAResult : public AAResultBase {
   const Function &F;
   const TargetLibraryInfo &TLI;
   AssumptionCache &AC;
-  DominatorTree *DT;
+  /// Use getDT() instead of accessing this member directly, in order to
+  /// respect the AAQI.UseDominatorTree option.
+  DominatorTree *DT_;
+
+  DominatorTree *getDT(const AAQueryInfo &AAQI) const {
+return AAQI.UseDominatorTree ? DT_ : nullptr;
+  }
 
 public:
   BasicAAResult(const DataLayout &DL, const Function &F,
 const TargetLibraryInfo &TLI, AssumptionCache &AC,
 DominatorTree *DT = nullptr)
-  : DL(DL), F(F), TLI(TLI), AC(AC), DT(DT) {}
+  : DL(DL), F(F), TLI(TLI), AC(AC), DT_(DT) {}
 
   BasicAAResult(const BasicAAResult &Arg)
   : AAResultBase(Arg), DL(Arg.DL), F(Arg.F), TLI(Arg.TLI), AC(Arg.AC),
-DT(Arg.DT) {}
+DT_(Arg.DT_) {}
   BasicAAResult(BasicAAResult &&Arg)
   : AAResultBase(std::move(Arg)), DL(Arg.DL), F(Arg.F), TLI(Arg.TLI),
-AC(Arg.AC), DT(Arg.DT) {}
+AC(Arg.AC), DT_(Arg.DT_) {}
 
   /// Handle invalidation events in the new pass manager.
   bool invalidate(Function &Fn, const PreservedAnalyses &PA,

diff  --git a/llvm/lib/Analysis/BasicAliasAnalysis.cpp b/llvm/lib/Analysis/BasicAliasAnalysis.cpp
index 3178e2d278167..1028b52a79123 100644
--- a/llvm/lib/Analysis/BasicAliasAnalysis.cpp
+++ b/llvm/lib/Analysis/BasicAliasAnalysis.cpp
@@ -89,7 +89,7 @@ bool BasicAAResult::invalidate(Function &Fn, const PreservedAnalyses &PA,
   // may be created without handles to some analyses and in that case don't
   // depend on them.
   if (Inv.invalidate(Fn, PA) ||
-  (DT && Inv.invalidate(Fn, PA)))
+  (DT_ && Inv.invalidate(Fn, PA)))
 return true;
 
   // Otherwise this analysis result remains valid.
@@ -1063,6 +1063,7 @@ AliasResult BasicAAResult::aliasGEP(
  : AliasResult::MayAlias;
   }
 
+  DominatorTree *DT = getDT(AAQI);
   DecomposedGEP DecompGEP1 = DecomposeGEPExpression(GEP1, DL, &AC, DT);
   DecomposedGEP DecompGEP2 = DecomposeGEPExpression(V2, DL, &AC, DT);
 
@@ -1556,6 +1557,7 @@ AliasResult BasicAAResult::aliasCheck(const Value *V1, LocationSize V1Size,
 const Value *HintO1 = getUnderlyingObject(Hint1);
 const Value *HintO2 = getUnderlyingObject(Hint2);
 
+DominatorTree *DT = getDT(AAQI);
 auto ValidAssumeForPtrContext = [&](const Value *Ptr) {
   if (const Instruction *PtrI = dyn_cast<Instruction>(Ptr)) {
 return isValidAssumeForContext(Assume, PtrI, DT,
@@ -1735,7 +1737,7 @@ bool BasicAAResult::isValueEqualInPotentialCycles(const Value *V,
   if (!Inst || Inst->getParent()->isEntryBlo

[llvm-branch-commits] [llvm] PR for llvm/llvm-project#79175 (PR #80274)

2024-02-05 Thread Tom Stellard via llvm-branch-commits


https://github.com/tstellar closed 
https://github.com/llvm/llvm-project/pull/80274

[llvm-branch-commits] [llvm] a581690 - [JumpThreading] Add test for #79175 (NFC)

2024-02-05 Thread Tom Stellard via llvm-branch-commits


Author: Nikita Popov
Date: 2024-02-05T11:41:55-08:00
New Revision: a581690c57d153f329ded71004a8616b93cb88ca

URL: 
https://github.com/llvm/llvm-project/commit/a581690c57d153f329ded71004a8616b93cb88ca
DIFF: 
https://github.com/llvm/llvm-project/commit/a581690c57d153f329ded71004a8616b93cb88ca.diff

LOG: [JumpThreading] Add test for #79175 (NFC)

(cherry picked from commit 7143b451d71fe314730f7610d7908e3b9611815c)

Added: 
llvm/test/Transforms/JumpThreading/pr79175.ll

Modified: 


Removed: 




diff  --git a/llvm/test/Transforms/JumpThreading/pr79175.ll b/llvm/test/Transforms/JumpThreading/pr79175.ll
new file mode 100644
index 0..6815aabb26dfc
--- /dev/null
+++ b/llvm/test/Transforms/JumpThreading/pr79175.ll
@@ -0,0 +1,64 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 4
+; RUN: opt -S -passes=jump-threading < %s | FileCheck %s
+
+@f = external global i32
+
+; Make sure the value of @f is reloaded prior to the final comparison.
+; FIXME: This is a miscompile.
+define i32 @test(i64 %idx, i32 %val) {
+; CHECK-LABEL: define i32 @test(
+; CHECK-SAME: i64 [[IDX:%.*]], i32 [[VAL:%.*]]) {
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:[[CMP:%.*]] = icmp slt i64 [[IDX]], 1
+; CHECK-NEXT:br i1 [[CMP]], label [[FOR_BODY:%.*]], label [[RETURN:%.*]]
+; CHECK:   for.body:
+; CHECK-NEXT:[[F:%.*]] = load i32, ptr @f, align 4
+; CHECK-NEXT:[[CMP1:%.*]] = icmp eq i32 [[F]], 0
+; CHECK-NEXT:br i1 [[CMP1]], label [[COND_END_THREAD:%.*]], label [[COND_END:%.*]]
+; CHECK:   cond.end:
+; CHECK-NEXT:[[CMP_I:%.*]] = icmp sgt i32 [[VAL]], 0
+; CHECK-NEXT:[[COND_FR:%.*]] = freeze i1 [[CMP_I]]
+; CHECK-NEXT:br i1 [[COND_FR]], label [[COND_END_THREAD]], label [[TMP0:%.*]]
+; CHECK:   cond.end.thread:
+; CHECK-NEXT:[[F_RELOAD_PR:%.*]] = load i32, ptr @f, align 4
+; CHECK-NEXT:br label [[TMP0]]
+; CHECK:   0:
+; CHECK-NEXT:[[F_RELOAD:%.*]] = phi i32 [ [[F]], [[COND_END]] ], [ [[F_RELOAD_PR]], [[COND_END_THREAD]] ]
+; CHECK-NEXT:[[TMP1:%.*]] = phi i32 [ 0, [[COND_END_THREAD]] ], [ [[VAL]], [[COND_END]] ]
+; CHECK-NEXT:[[F_IDX:%.*]] = getelementptr inbounds i32, ptr @f, i64 [[IDX]]
+; CHECK-NEXT:store i32 [[TMP1]], ptr [[F_IDX]], align 4
+; CHECK-NEXT:[[CMP3:%.*]] = icmp slt i32 [[F_RELOAD]], 1
+; CHECK-NEXT:br i1 [[CMP3]], label [[RETURN2:%.*]], label [[RETURN]]
+; CHECK:   return:
+; CHECK-NEXT:ret i32 0
+; CHECK:   return2:
+; CHECK-NEXT:ret i32 1
+;
+entry:
+  %cmp = icmp slt i64 %idx, 1
+  br i1 %cmp, label %for.body, label %return
+
+for.body:
+  %f = load i32, ptr @f, align 4
+  %cmp1 = icmp eq i32 %f, 0
+  br i1 %cmp1, label %cond.end, label %cond.false
+
+cond.false:
+  br label %cond.end
+
+cond.end:
+  %phi = phi i32 [ %val, %cond.false ], [ 1, %for.body ]
+  %cmp.i = icmp sgt i32 %phi, 0
+  %sel = select i1 %cmp.i, i32 0, i32 %phi
+  %f.idx = getelementptr inbounds i32, ptr @f, i64 %idx
+  store i32 %sel, ptr %f.idx, align 4
+  %f.reload = load i32, ptr @f, align 4
+  %cmp3 = icmp slt i32 %f.reload, 1
+  br i1 %cmp3, label %return2, label %return
+
+return:
+  ret i32 0
+
+return2:
+  ret i32 1
+}




[llvm-branch-commits] [llvm] PR for llvm/llvm-project#80597 (PR #80731)

2024-02-05 Thread Nikita Popov via llvm-branch-commits


https://github.com/nikic approved this pull request.


https://github.com/llvm/llvm-project/pull/80731

[llvm-branch-commits] [clang] PR for llvm/llvm-project#80599 (PR #80600)

2024-02-05 Thread via llvm-branch-commits


https://github.com/llvmbot updated 
https://github.com/llvm/llvm-project/pull/80600

>From f7d0a0e7aec97eb7f0719f0f3cfcf94ad823fedd Mon Sep 17 00:00:00 2001
From: Koakuma 
Date: Sun, 4 Feb 2024 11:08:00 +0700
Subject: [PATCH] [clang] Add GCC-compatible code model names for sparc64

This adds GCC-compatible names for code model selection on 64-bit SPARC
with absolute code.
Testing with a 2-stage build then running codegen tests works okay under
all of the supported code models.

(32-bit target does not have selectable code models)

Reviewed By: @brad0, @MaskRay

(cherry picked from commit b0f0babff22e9c0af74535b05e2c6424392bb24a)
---
 clang/lib/Driver/ToolChains/Clang.cpp | 8 
 clang/test/Driver/sparc64-codemodel.c | 6 ++
 2 files changed, 14 insertions(+)
 create mode 100644 clang/test/Driver/sparc64-codemodel.c

diff --git a/clang/lib/Driver/ToolChains/Clang.cpp b/clang/lib/Driver/ToolChains/Clang.cpp
index 8092fc050b0ee..54de8edd9a039 100644
--- a/clang/lib/Driver/ToolChains/Clang.cpp
+++ b/clang/lib/Driver/ToolChains/Clang.cpp
@@ -5779,6 +5779,14 @@ void Clang::ConstructJob(Compilation &C, const JobAction &JA,
   // NVPTX/AMDGPU does not care about the code model and will accept
   // whatever works for the host.
   Ok = true;
+} else if (Triple.isSPARC64()) {
+  if (CM == "medlow")
+CM = "small";
+  else if (CM == "medmid")
+CM = "medium";
+  else if (CM == "medany")
+CM = "large";
+  Ok = CM == "small" || CM == "medium" || CM == "large";
 }
 if (Ok) {
   CmdArgs.push_back(Args.MakeArgString("-mcmodel=" + CM));
diff --git a/clang/test/Driver/sparc64-codemodel.c b/clang/test/Driver/sparc64-codemodel.c
new file mode 100644
index 0..e4b01fd61b6fa
--- /dev/null
+++ b/clang/test/Driver/sparc64-codemodel.c
@@ -0,0 +1,6 @@
+// RUN: %clang --target=sparc64 -mcmodel=medlow %s -### 2>&1 | FileCheck -check-prefix=MEDLOW %s
+// RUN: %clang --target=sparc64 -mcmodel=medmid %s -### 2>&1 | FileCheck -check-prefix=MEDMID %s
+// RUN: %clang --target=sparc64 -mcmodel=medany %s -### 2>&1 | FileCheck -check-prefix=MEDANY %s
+// MEDLOW: "-mcmodel=small"
+// MEDMID: "-mcmodel=medium"
+// MEDANY: "-mcmodel=large"


[llvm-branch-commits] [clang] f7d0a0e - [clang] Add GCC-compatible code model names for sparc64

2024-02-05 Thread Tom Stellard via llvm-branch-commits


Author: Koakuma
Date: 2024-02-05T11:46:24-08:00
New Revision: f7d0a0e7aec97eb7f0719f0f3cfcf94ad823fedd

URL: 
https://github.com/llvm/llvm-project/commit/f7d0a0e7aec97eb7f0719f0f3cfcf94ad823fedd
DIFF: 
https://github.com/llvm/llvm-project/commit/f7d0a0e7aec97eb7f0719f0f3cfcf94ad823fedd.diff

LOG: [clang] Add GCC-compatible code model names for sparc64

This adds GCC-compatible names for code model selection on 64-bit SPARC
with absolute code.
Testing with a 2-stage build then running codegen tests works okay under
all of the supported code models.

(32-bit target does not have selectable code models)

Reviewed By: @brad0, @MaskRay

(cherry picked from commit b0f0babff22e9c0af74535b05e2c6424392bb24a)
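
The name mapping added by the patch can be exercised outside the driver with a small stand-alone function. This is a sketch only: the function name and the use of `std::optional` (C++17) are assumptions made for illustration, while the three name pairs and the accepted set are taken from the diff.

```cpp
#include <optional>
#include <string>

// Maps a GCC-style sparc64 code model name to the name the driver forwards
// in -mcmodel=, or returns std::nullopt if the name is not accepted.
std::optional<std::string> normalizeSparc64CodeModel(std::string CM) {
  if (CM == "medlow")
    CM = "small";
  else if (CM == "medmid")
    CM = "medium";
  else if (CM == "medany")
    CM = "large";

  // After translation only the three generic names are valid.
  if (CM == "small" || CM == "medium" || CM == "large")
    return CM;
  return std::nullopt;
}
```

Note that the generic names are accepted unchanged, so `-mcmodel=medium` and `-mcmodel=medmid` end up forwarding the same value.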

Added: 
clang/test/Driver/sparc64-codemodel.c

Modified: 
clang/lib/Driver/ToolChains/Clang.cpp

Removed: 




diff  --git a/clang/lib/Driver/ToolChains/Clang.cpp b/clang/lib/Driver/ToolChains/Clang.cpp
index 8092fc050b0ee..54de8edd9a039 100644
--- a/clang/lib/Driver/ToolChains/Clang.cpp
+++ b/clang/lib/Driver/ToolChains/Clang.cpp
@@ -5779,6 +5779,14 @@ void Clang::ConstructJob(Compilation &C, const JobAction &JA,
   // NVPTX/AMDGPU does not care about the code model and will accept
   // whatever works for the host.
   Ok = true;
+} else if (Triple.isSPARC64()) {
+  if (CM == "medlow")
+CM = "small";
+  else if (CM == "medmid")
+CM = "medium";
+  else if (CM == "medany")
+CM = "large";
+  Ok = CM == "small" || CM == "medium" || CM == "large";
 }
 if (Ok) {
   CmdArgs.push_back(Args.MakeArgString("-mcmodel=" + CM));

diff  --git a/clang/test/Driver/sparc64-codemodel.c b/clang/test/Driver/sparc64-codemodel.c
new file mode 100644
index 0..e4b01fd61b6fa
--- /dev/null
+++ b/clang/test/Driver/sparc64-codemodel.c
@@ -0,0 +1,6 @@
+// RUN: %clang --target=sparc64 -mcmodel=medlow %s -### 2>&1 | FileCheck -check-prefix=MEDLOW %s
+// RUN: %clang --target=sparc64 -mcmodel=medmid %s -### 2>&1 | FileCheck -check-prefix=MEDMID %s
+// RUN: %clang --target=sparc64 -mcmodel=medany %s -### 2>&1 | FileCheck -check-prefix=MEDANY %s
+// MEDLOW: "-mcmodel=small"
+// MEDMID: "-mcmodel=medium"
+// MEDANY: "-mcmodel=large"




[llvm-branch-commits] [clang] PR for llvm/llvm-project#80599 (PR #80600)

2024-02-05 Thread Tom Stellard via llvm-branch-commits


https://github.com/tstellar closed 
https://github.com/llvm/llvm-project/pull/80600

[llvm-branch-commits] [llvm] PR for llvm/llvm-project#80694 (PR #80695)

2024-02-05 Thread via llvm-branch-commits


https://github.com/llvmbot updated 
https://github.com/llvm/llvm-project/pull/80695

>From 47fbb649e12f7016ee60a5918bda26c01f2ea543 Mon Sep 17 00:00:00 2001
From: Pierre van Houtryve 
Date: Mon, 5 Feb 2024 14:36:15 +0100
Subject: [PATCH] [AMDGPU][PromoteAlloca] Support memsets to ptr allocas
 (#80678)

Fixes #80366

(cherry picked from commit 4e958abf2f44d08129eafd5b6a4ee2bd3584ed22)
---
 .../lib/Target/AMDGPU/AMDGPUPromoteAlloca.cpp | 16 --
 .../CodeGen/AMDGPU/promote-alloca-memset.ll   | 54 +++
 2 files changed, 66 insertions(+), 4 deletions(-)

diff --git a/llvm/lib/Target/AMDGPU/AMDGPUPromoteAlloca.cpp b/llvm/lib/Target/AMDGPU/AMDGPUPromoteAlloca.cpp
index 5e73411cae9b70..c1b244f50d93f8 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUPromoteAlloca.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUPromoteAlloca.cpp
@@ -521,10 +521,18 @@ static Value *promoteAllocaUserToVector(
   // For memset, we don't need to know the previous value because we
   // currently only allow memsets that cover the whole alloca.
   Value *Elt = MSI->getOperand(1);
-  if (DL.getTypeStoreSize(VecEltTy) > 1) {
-Value *EltBytes =
-Builder.CreateVectorSplat(DL.getTypeStoreSize(VecEltTy), Elt);
-Elt = Builder.CreateBitCast(EltBytes, VecEltTy);
+  const unsigned BytesPerElt = DL.getTypeStoreSize(VecEltTy);
+  if (BytesPerElt > 1) {
+Value *EltBytes = Builder.CreateVectorSplat(BytesPerElt, Elt);
+
+// If the element type of the vector is a pointer, we need to first cast
+// to an integer, then use a PtrCast.
+if (VecEltTy->isPointerTy()) {
+  Type *PtrInt = Builder.getIntNTy(BytesPerElt * 8);
+  Elt = Builder.CreateBitCast(EltBytes, PtrInt);
+  Elt = Builder.CreateIntToPtr(Elt, VecEltTy);
+} else
+  Elt = Builder.CreateBitCast(EltBytes, VecEltTy);
   }
 
   return Builder.CreateVectorSplat(VectorTy->getElementCount(), Elt);
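
For readers following the hunk above: the memset value is a single byte, which the pass replicates BytesPerElt times and reinterprets as one element. The plain C++ sketch below models only the integer that results from that replication; `splatByte` is a hypothetical helper, not an IRBuilder call, and it is limited to elements of at most 8 bytes. For pointer elements the patch then converts this integer with an inttoptr cast, since a byte vector cannot be bitcast directly to a pointer.

```cpp
#include <cassert>
#include <cstdint>

// Builds the integer whose BytesPerElt bytes all equal Byte, i.e. the value
// obtained by splatting the memset byte and viewing it as one integer.
std::uint64_t splatByte(std::uint8_t Byte, unsigned BytesPerElt) {
  assert(BytesPerElt >= 1 && BytesPerElt <= 8);
  std::uint64_t Result = 0;
  for (unsigned I = 0; I < BytesPerElt; ++I)
    Result |= static_cast<std::uint64_t>(Byte) << (8 * I);
  return Result;
}
```

With a zero byte, as in the tests added by this patch, every element becomes the all-zero bit pattern regardless of its width.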
diff --git a/llvm/test/CodeGen/AMDGPU/promote-alloca-memset.ll b/llvm/test/CodeGen/AMDGPU/promote-alloca-memset.ll
index 15af1f17e230ec..f1e2737b370ef0 100644
--- a/llvm/test/CodeGen/AMDGPU/promote-alloca-memset.ll
+++ b/llvm/test/CodeGen/AMDGPU/promote-alloca-memset.ll
@@ -84,4 +84,58 @@ entry:
   ret void
 }
 
+define amdgpu_kernel void @memset_array_ptr_alloca(ptr %out) {
+; CHECK-LABEL: @memset_array_ptr_alloca(
+; CHECK-NEXT:store i64 0, ptr [[OUT:%.*]], align 8
+; CHECK-NEXT:ret void
+;
+  %alloca = alloca [6 x ptr], align 16, addrspace(5)
+  call void @llvm.memset.p5.i64(ptr addrspace(5) %alloca, i8 0, i64 48, i1 
false)
+  %load = load i64, ptr addrspace(5) %alloca
+  store i64 %load, ptr %out
+  ret void
+}
+
+define amdgpu_kernel void @memset_vector_ptr_alloca(ptr %out) {
+; CHECK-LABEL: @memset_vector_ptr_alloca(
+; CHECK-NEXT:store i64 0, ptr [[OUT:%.*]], align 8
+; CHECK-NEXT:ret void
+;
+  %alloca = alloca <6 x ptr>, align 16, addrspace(5)
+  call void @llvm.memset.p5.i64(ptr addrspace(5) %alloca, i8 0, i64 48, i1 
false)
+  %load = load i64, ptr addrspace(5) %alloca
+  store i64 %load, ptr %out
+  ret void
+}
+
+define amdgpu_kernel void @memset_array_of_array_ptr_alloca(ptr %out) {
+; CHECK-LABEL: @memset_array_of_array_ptr_alloca(
+; CHECK-NEXT:[[ALLOCA:%.*]] = alloca [2 x [3 x ptr]], align 16, 
addrspace(5)
+; CHECK-NEXT:call void @llvm.memset.p5.i64(ptr addrspace(5) [[ALLOCA]], i8 
0, i64 48, i1 false)
+; CHECK-NEXT:[[LOAD:%.*]] = load i64, ptr addrspace(5) [[ALLOCA]], align 8
+; CHECK-NEXT:store i64 [[LOAD]], ptr [[OUT:%.*]], align 8
+; CHECK-NEXT:ret void
+;
+  %alloca = alloca [2 x [3 x ptr]], align 16, addrspace(5)
+  call void @llvm.memset.p5.i64(ptr addrspace(5) %alloca, i8 0, i64 48, i1 
false)
+  %load = load i64, ptr addrspace(5) %alloca
+  store i64 %load, ptr %out
+  ret void
+}
+
+define amdgpu_kernel void @memset_array_of_vec_ptr_alloca(ptr %out) {
+; CHECK-LABEL: @memset_array_of_vec_ptr_alloca(
+; CHECK-NEXT:[[ALLOCA:%.*]] = alloca [2 x <3 x ptr>], align 16, 
addrspace(5)
+; CHECK-NEXT:call void @llvm.memset.p5.i64(ptr addrspace(5) [[ALLOCA]], i8 
0, i64 48, i1 false)
+; CHECK-NEXT:[[LOAD:%.*]] = load i64, ptr addrspace(5) [[ALLOCA]], align 8
+; CHECK-NEXT:store i64 [[LOAD]], ptr [[OUT:%.*]], align 8
+; CHECK-NEXT:ret void
+;
+  %alloca = alloca [2 x <3 x ptr>], align 16, addrspace(5)
+  call void @llvm.memset.p5.i64(ptr addrspace(5) %alloca, i8 0, i64 48, i1 
false)
+  %load = load i64, ptr addrspace(5) %alloca
+  store i64 %load, ptr %out
+  ret void
+}
+
 declare void @llvm.memset.p5.i64(ptr addrspace(5) nocapture writeonly, i8, 
i64, i1 immarg)


[llvm-branch-commits] [llvm] 47fbb64 - [AMDGPU][PromoteAlloca] Support memsets to ptr allocas (#80678)

2024-02-05 Thread Tom Stellard via llvm-branch-commits


Author: Pierre van Houtryve
Date: 2024-02-05T11:48:14-08:00
New Revision: 47fbb649e12f7016ee60a5918bda26c01f2ea543

URL: 
https://github.com/llvm/llvm-project/commit/47fbb649e12f7016ee60a5918bda26c01f2ea543
DIFF: 
https://github.com/llvm/llvm-project/commit/47fbb649e12f7016ee60a5918bda26c01f2ea543.diff

LOG: [AMDGPU][PromoteAlloca] Support memsets to ptr allocas (#80678)

Fixes #80366

(cherry picked from commit 4e958abf2f44d08129eafd5b6a4ee2bd3584ed22)
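
The cast sequence in this change can be illustrated outside of LLVM. The following is a minimal standalone sketch, not code from the patch: `splatByte` and `splatToPointer` are hypothetical helper names, and plain C++ integers stand in for IR values. It mirrors the two steps taken for pointer elements: build an integer of `BytesPerElt * 8` bits whose bytes all equal the memset value, then convert that integer to a pointer.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: repeat the memset byte across the low `bytesPerElt`
// bytes of an integer. This is the analogue of splatting the byte into a
// vector and bitcasting that vector to an integer of the same width.
std::uint64_t splatByte(std::uint8_t value, unsigned bytesPerElt) {
  assert(bytesPerElt >= 1 && bytesPerElt <= 8);
  std::uint64_t result = 0;
  for (unsigned i = 0; i < bytesPerElt; ++i)
    result = (result << 8) | value;
  return result;
}

// Hypothetical helper: the analogue of the final inttoptr step. A byte
// vector cannot be reinterpreted as a pointer directly, so the value goes
// through a pointer-sized integer first.
void *splatToPointer(std::uint8_t value) {
  const std::uint64_t bits = splatByte(value, sizeof(void *));
  return reinterpret_cast<void *>(static_cast<std::uintptr_t>(bits));
}
```

For the zero-filling memsets in the new tests, every element becomes a null pointer, which is consistent with the `store i64 0` that the CHECK lines expect.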

Added: 


Modified: 
llvm/lib/Target/AMDGPU/AMDGPUPromoteAlloca.cpp
llvm/test/CodeGen/AMDGPU/promote-alloca-memset.ll

Removed: 




diff  --git a/llvm/lib/Target/AMDGPU/AMDGPUPromoteAlloca.cpp 
b/llvm/lib/Target/AMDGPU/AMDGPUPromoteAlloca.cpp
index 5e73411cae9b7..c1b244f50d93f 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUPromoteAlloca.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUPromoteAlloca.cpp
@@ -521,10 +521,18 @@ static Value *promoteAllocaUserToVector(
   // For memset, we don't need to know the previous value because we
   // currently only allow memsets that cover the whole alloca.
   Value *Elt = MSI->getOperand(1);
-  if (DL.getTypeStoreSize(VecEltTy) > 1) {
-    Value *EltBytes =
-        Builder.CreateVectorSplat(DL.getTypeStoreSize(VecEltTy), Elt);
-    Elt = Builder.CreateBitCast(EltBytes, VecEltTy);
+  const unsigned BytesPerElt = DL.getTypeStoreSize(VecEltTy);
+  if (BytesPerElt > 1) {
+    Value *EltBytes = Builder.CreateVectorSplat(BytesPerElt, Elt);
+
+    // If the element type of the vector is a pointer, we need to first cast
+    // to an integer, then use a PtrCast.
+    if (VecEltTy->isPointerTy()) {
+      Type *PtrInt = Builder.getIntNTy(BytesPerElt * 8);
+      Elt = Builder.CreateBitCast(EltBytes, PtrInt);
+      Elt = Builder.CreateIntToPtr(Elt, VecEltTy);
+    } else
+      Elt = Builder.CreateBitCast(EltBytes, VecEltTy);
   }
 
   return Builder.CreateVectorSplat(VectorTy->getElementCount(), Elt);

diff  --git a/llvm/test/CodeGen/AMDGPU/promote-alloca-memset.ll 
b/llvm/test/CodeGen/AMDGPU/promote-alloca-memset.ll
index 15af1f17e230e..f1e2737b370ef 100644
--- a/llvm/test/CodeGen/AMDGPU/promote-alloca-memset.ll
+++ b/llvm/test/CodeGen/AMDGPU/promote-alloca-memset.ll
@@ -84,4 +84,58 @@ entry:
   ret void
 }
 
+define amdgpu_kernel void @memset_array_ptr_alloca(ptr %out) {
+; CHECK-LABEL: @memset_array_ptr_alloca(
+; CHECK-NEXT:store i64 0, ptr [[OUT:%.*]], align 8
+; CHECK-NEXT:ret void
+;
+  %alloca = alloca [6 x ptr], align 16, addrspace(5)
+  call void @llvm.memset.p5.i64(ptr addrspace(5) %alloca, i8 0, i64 48, i1 
false)
+  %load = load i64, ptr addrspace(5) %alloca
+  store i64 %load, ptr %out
+  ret void
+}
+
+define amdgpu_kernel void @memset_vector_ptr_alloca(ptr %out) {
+; CHECK-LABEL: @memset_vector_ptr_alloca(
+; CHECK-NEXT:store i64 0, ptr [[OUT:%.*]], align 8
+; CHECK-NEXT:ret void
+;
+  %alloca = alloca <6 x ptr>, align 16, addrspace(5)
+  call void @llvm.memset.p5.i64(ptr addrspace(5) %alloca, i8 0, i64 48, i1 
false)
+  %load = load i64, ptr addrspace(5) %alloca
+  store i64 %load, ptr %out
+  ret void
+}
+
+define amdgpu_kernel void @memset_array_of_array_ptr_alloca(ptr %out) {
+; CHECK-LABEL: @memset_array_of_array_ptr_alloca(
+; CHECK-NEXT:[[ALLOCA:%.*]] = alloca [2 x [3 x ptr]], align 16, 
addrspace(5)
+; CHECK-NEXT:call void @llvm.memset.p5.i64(ptr addrspace(5) [[ALLOCA]], i8 
0, i64 48, i1 false)
+; CHECK-NEXT:[[LOAD:%.*]] = load i64, ptr addrspace(5) [[ALLOCA]], align 8
+; CHECK-NEXT:store i64 [[LOAD]], ptr [[OUT:%.*]], align 8
+; CHECK-NEXT:ret void
+;
+  %alloca = alloca [2 x [3 x ptr]], align 16, addrspace(5)
+  call void @llvm.memset.p5.i64(ptr addrspace(5) %alloca, i8 0, i64 48, i1 
false)
+  %load = load i64, ptr addrspace(5) %alloca
+  store i64 %load, ptr %out
+  ret void
+}
+
+define amdgpu_kernel void @memset_array_of_vec_ptr_alloca(ptr %out) {
+; CHECK-LABEL: @memset_array_of_vec_ptr_alloca(
+; CHECK-NEXT:[[ALLOCA:%.*]] = alloca [2 x <3 x ptr>], align 16, 
addrspace(5)
+; CHECK-NEXT:call void @llvm.memset.p5.i64(ptr addrspace(5) [[ALLOCA]], i8 
0, i64 48, i1 false)
+; CHECK-NEXT:[[LOAD:%.*]] = load i64, ptr addrspace(5) [[ALLOCA]], align 8
+; CHECK-NEXT:store i64 [[LOAD]], ptr [[OUT:%.*]], align 8
+; CHECK-NEXT:ret void
+;
+  %alloca = alloca [2 x <3 x ptr>], align 16, addrspace(5)
+  call void @llvm.memset.p5.i64(ptr addrspace(5) %alloca, i8 0, i64 48, i1 
false)
+  %load = load i64, ptr addrspace(5) %alloca
+  store i64 %load, ptr %out
+  ret void
+}
+
 declare void @llvm.memset.p5.i64(ptr addrspace(5) nocapture writeonly, i8, 
i64, i1 immarg)




[llvm-branch-commits] [llvm] PR for llvm/llvm-project#80694 (PR #80695)

2024-02-05 Thread Tom Stellard via llvm-branch-commits


https://github.com/tstellar closed 
https://github.com/llvm/llvm-project/pull/80695

[llvm-branch-commits] [libcxx] PR for llvm/llvm-project#80168 (PR #80702)

2024-02-05 Thread via llvm-branch-commits


https://github.com/llvmbot updated 
https://github.com/llvm/llvm-project/pull/80702

>From 72533964036dca3ce806044e92a1e70584e3aca9 Mon Sep 17 00:00:00 2001
From: Louis Dionne 
Date: Mon, 5 Feb 2024 11:05:46 -0500
Subject: [PATCH] [libc++] Add missing conditionals for feature-test macros
 (#80168)

We noticed that some feature-test macros were not conditional on
configuration flags like _LIBCPP_HAS_NO_FILESYSTEM. As a result, code
attempting to use FTMs would not work as intended.

This patch adds conditionals for a few feature-test macros, but more
issues may exist.

rdar://122020466
(cherry picked from commit f2c84211d2834c73ff874389c6bb47b1c76d391a)
---
 libcxx/include/version|  14 +-
 .../filesystem.version.compile.pass.cpp   |  16 +-
 .../fstream.version.compile.pass.cpp  |  16 +-
 .../iomanip.version.compile.pass.cpp  |  80 +---
 .../mutex.version.compile.pass.cpp|  64 +--
 .../version.version.compile.pass.cpp  | 176 --
 .../generate_feature_test_macro_components.py |  10 +-
 7 files changed, 254 insertions(+), 122 deletions(-)

diff --git a/libcxx/include/version b/libcxx/include/version
index 9e26da8c1b2425..d356976d6454ad 100644
--- a/libcxx/include/version
+++ b/libcxx/include/version
@@ -266,7 +266,9 @@ __cpp_lib_within_lifetime   
202306L 
 # define __cpp_lib_make_reverse_iterator201402L
 # define __cpp_lib_make_unique  201304L
 # define __cpp_lib_null_iterators   201304L
-# define __cpp_lib_quoted_string_io 201304L
+# if !defined(_LIBCPP_HAS_NO_LOCALIZATION)
+#   define __cpp_lib_quoted_string_io   201304L
+# endif
 # define __cpp_lib_result_of_sfinae 201210L
 # define __cpp_lib_robust_nonmodifying_seq_ops  201304L
 # if !defined(_LIBCPP_HAS_NO_THREADS)
@@ -294,7 +296,7 @@ __cpp_lib_within_lifetime   
202306L 
 # define __cpp_lib_clamp201603L
 # define __cpp_lib_enable_shared_from_this  201603L
 // # define __cpp_lib_execution201603L
-# if _LIBCPP_AVAILABILITY_HAS_FILESYSTEM_LIBRARY
+# if !defined(_LIBCPP_HAS_NO_FILESYSTEM) && _LIBCPP_AVAILABILITY_HAS_FILESYSTEM_LIBRARY
 #   define __cpp_lib_filesystem 201703L
 # endif
 # define __cpp_lib_gcd_lcm  201606L
@@ -323,7 +325,9 @@ __cpp_lib_within_lifetime   
202306L 
 // # define __cpp_lib_parallel_algorithm   201603L
 # define __cpp_lib_raw_memory_algorithms201606L
 # define __cpp_lib_sample   201603L
-# define __cpp_lib_scoped_lock  201703L
+# if !defined(_LIBCPP_HAS_NO_THREADS)
+#   define __cpp_lib_scoped_lock201703L
+# endif
 # if !defined(_LIBCPP_HAS_NO_THREADS)
 #   define __cpp_lib_shared_mutex   201505L
 # endif
@@ -496,7 +500,9 @@ __cpp_lib_within_lifetime   
202306L 
 // # define __cpp_lib_freestanding_optional202311L
 // # define __cpp_lib_freestanding_string_view 202311L
 // # define __cpp_lib_freestanding_variant 202311L
-# define __cpp_lib_fstream_native_handle202306L
+# if !defined(_LIBCPP_HAS_NO_FILESYSTEM) && !defined(_LIBCPP_HAS_NO_LOCALIZATION)
+#   define __cpp_lib_fstream_native_handle  202306L
+# endif
 // # define __cpp_lib_function_ref 202306L
 // # define __cpp_lib_hazard_pointer   202306L
 // # define __cpp_lib_linalg   202311L
diff --git 
a/libcxx/test/std/language.support/support.limits/support.limits.general/filesystem.version.compile.pass.cpp
 
b/libcxx/test/std/language.support/support.limits/support.limits.general/filesystem.version.compile.pass.cpp
index 46ccde800c1796..3f03e8be9aeab3 100644
--- 
a/libcxx/test/std/language.support/support.limits/support.limits.general/filesystem.version.compile.pass.cpp
+++ 
b/libcxx/test/std/language.support/support.limits/support.limits.general/filesystem.version.compile.pass.cpp
@@ -51,7 +51,7 @@
 #   error "__cpp_lib_char8_t should not be defined before c++20"
 # endif
 
-# if !defined(_LIBCPP_VERSION) || _LIBCPP_AVAILABILITY_HAS_FILESYSTEM_LIBRARY
+# if !defined(_LIBCPP_VERSION) || (!defined(_LIBCPP_HAS_NO_FILESYSTEM) && _LIBCPP_AVAILABILITY_HAS_FILESYSTEM_LIBRARY)
 #   ifndef __cpp_lib_filesystem
 # error "__cpp_lib_filesystem should be defined in c++17"
 #   endif
@@ -60,7 +60,7 @@
 #   endif
 # else
 #   ifdef __cpp_lib_filesystem
-# error "__cpp_lib_filesystem should not be defined when the requirement 
'!defined(_LIBCPP_VERSION) || _LIBCPP_AVAILABILITY_HAS_FILESYSTEM_LIBRARY' is 
not met!"
+# error "__cpp_lib_filesystem should not be defined when the

[llvm-branch-commits] [libcxx] 7253396 - [libc++] Add missing conditionals for feature-test macros (#80168)

2024-02-05 Thread Tom Stellard via llvm-branch-commits


Author: Louis Dionne
Date: 2024-02-05T11:49:51-08:00
New Revision: 72533964036dca3ce806044e92a1e70584e3aca9

URL: 
https://github.com/llvm/llvm-project/commit/72533964036dca3ce806044e92a1e70584e3aca9
DIFF: 
https://github.com/llvm/llvm-project/commit/72533964036dca3ce806044e92a1e70584e3aca9.diff

LOG: [libc++] Add missing conditionals for feature-test macros (#80168)

We noticed that some feature-test macros were not conditional on
configuration flags like _LIBCPP_HAS_NO_FILESYSTEM. As a result, code
attempting to use FTMs would not work as intended.

This patch adds conditionals for a few feature-test macros, but more
issues may exist.

rdar://122020466
(cherry picked from commit f2c84211d2834c73ff874389c6bb47b1c76d391a)
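
The effect on user code is easiest to see with a feature-test check. The following is a minimal sketch, not taken from the patch: `EXAMPLE_HAS_FILESYSTEM`, `hasFilesystemSupport`, and `currentDirOrFallback` are hypothetical names. It assumes a toolchain that provides `__has_include`. If the library does not define `__cpp_lib_filesystem`, for example because it was configured without filesystem support, the guarded branch is skipped rather than failing to compile.

```cpp
#include <cassert>
#include <string>
#include <system_error>

// <version> is where the library publishes its feature-test macros.
#if __has_include(<version>)
#  include <version>
#endif

// Only pull in <filesystem> when the library advertises it.
#if defined(__cpp_lib_filesystem) && __cpp_lib_filesystem >= 201703L
#  include <filesystem>
#  define EXAMPLE_HAS_FILESYSTEM 1
#else
#  define EXAMPLE_HAS_FILESYSTEM 0
#endif

// Hypothetical helper: reports which branch was compiled in.
constexpr bool hasFilesystemSupport() { return EXAMPLE_HAS_FILESYSTEM != 0; }

// Hypothetical helper: returns the current directory when <filesystem> is
// usable and the query succeeds, and the caller's fallback otherwise.
std::string currentDirOrFallback(const std::string &fallback) {
#if EXAMPLE_HAS_FILESYSTEM
  std::error_code ec;
  const std::filesystem::path dir = std::filesystem::current_path(ec);
  if (!ec)
    return dir.string();
#endif
  return fallback;
}
```

This pattern only works if the macro is defined exactly when the feature is usable, which is the property the added conditionals are meant to restore.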

Added: 


Modified: 
libcxx/include/version

libcxx/test/std/language.support/support.limits/support.limits.general/filesystem.version.compile.pass.cpp

libcxx/test/std/language.support/support.limits/support.limits.general/fstream.version.compile.pass.cpp

libcxx/test/std/language.support/support.limits/support.limits.general/iomanip.version.compile.pass.cpp

libcxx/test/std/language.support/support.limits/support.limits.general/mutex.version.compile.pass.cpp

libcxx/test/std/language.support/support.limits/support.limits.general/version.version.compile.pass.cpp
libcxx/utils/generate_feature_test_macro_components.py

Removed: 




diff  --git a/libcxx/include/version b/libcxx/include/version
index 9e26da8c1b242..d356976d6454a 100644
--- a/libcxx/include/version
+++ b/libcxx/include/version
@@ -266,7 +266,9 @@ __cpp_lib_within_lifetime   
202306L 
 # define __cpp_lib_make_reverse_iterator201402L
 # define __cpp_lib_make_unique  201304L
 # define __cpp_lib_null_iterators   201304L
-# define __cpp_lib_quoted_string_io 201304L
+# if !defined(_LIBCPP_HAS_NO_LOCALIZATION)
+#   define __cpp_lib_quoted_string_io   201304L
+# endif
 # define __cpp_lib_result_of_sfinae 201210L
 # define __cpp_lib_robust_nonmodifying_seq_ops  201304L
 # if !defined(_LIBCPP_HAS_NO_THREADS)
@@ -294,7 +296,7 @@ __cpp_lib_within_lifetime   
202306L 
 # define __cpp_lib_clamp201603L
 # define __cpp_lib_enable_shared_from_this  201603L
 // # define __cpp_lib_execution201603L
-# if _LIBCPP_AVAILABILITY_HAS_FILESYSTEM_LIBRARY
+# if !defined(_LIBCPP_HAS_NO_FILESYSTEM) && _LIBCPP_AVAILABILITY_HAS_FILESYSTEM_LIBRARY
 #   define __cpp_lib_filesystem 201703L
 # endif
 # define __cpp_lib_gcd_lcm  201606L
@@ -323,7 +325,9 @@ __cpp_lib_within_lifetime   
202306L 
 // # define __cpp_lib_parallel_algorithm   201603L
 # define __cpp_lib_raw_memory_algorithms201606L
 # define __cpp_lib_sample   201603L
-# define __cpp_lib_scoped_lock  201703L
+# if !defined(_LIBCPP_HAS_NO_THREADS)
+#   define __cpp_lib_scoped_lock201703L
+# endif
 # if !defined(_LIBCPP_HAS_NO_THREADS)
 #   define __cpp_lib_shared_mutex   201505L
 # endif
@@ -496,7 +500,9 @@ __cpp_lib_within_lifetime   
202306L 
 // # define __cpp_lib_freestanding_optional202311L
 // # define __cpp_lib_freestanding_string_view 202311L
 // # define __cpp_lib_freestanding_variant 202311L
-# define __cpp_lib_fstream_native_handle202306L
+# if !defined(_LIBCPP_HAS_NO_FILESYSTEM) && !defined(_LIBCPP_HAS_NO_LOCALIZATION)
+#   define __cpp_lib_fstream_native_handle  202306L
+# endif
 // # define __cpp_lib_function_ref 202306L
 // # define __cpp_lib_hazard_pointer   202306L
 // # define __cpp_lib_linalg   202311L

diff  --git 
a/libcxx/test/std/language.support/support.limits/support.limits.general/filesystem.version.compile.pass.cpp
 
b/libcxx/test/std/language.support/support.limits/support.limits.general/filesystem.version.compile.pass.cpp
index 46ccde800c179..3f03e8be9aeab 100644
--- 
a/libcxx/test/std/language.support/support.limits/support.limits.general/filesystem.version.compile.pass.cpp
+++ 
b/libcxx/test/std/language.support/support.limits/support.limits.general/filesystem.version.compile.pass.cpp
@@ -51,7 +51,7 @@
 #   error "__cpp_lib_char8_t should not be defined before c++20"
 # endif
 
-# if !defined(_LIBCPP_VERSION) || _LIBCPP_AVAILABILITY_HAS_FILESYSTEM_LIBRARY
+# if !defined(_LIBCPP_VERSION) || (!defined(_LIBCPP_HAS_NO_FILESYSTEM) && _LIBCPP_AVAILABILITY_HAS_FILESYSTEM_LIBRARY)
 #   ifndef __cpp_lib_filesystem
 # error "__cpp_lib_fil

[llvm-branch-commits] [libcxx] PR for llvm/llvm-project#80168 (PR #80702)

2024-02-05 Thread Tom Stellard via llvm-branch-commits


https://github.com/tstellar closed 
https://github.com/llvm/llvm-project/pull/80702

[llvm-branch-commits] [libcxx] PR for llvm/llvm-project#80718 (PR #80720)

2024-02-05 Thread Nikolas Klauser via llvm-branch-commits


https://github.com/philnik777 approved this pull request.


https://github.com/llvm/llvm-project/pull/80720

[llvm-branch-commits] [clang] PR for llvm/llvm-project#78892 (PR #80259)

2024-02-05 Thread Björn Schäpers via llvm-branch-commits


https://github.com/HazardyKnusperkeks approved this pull request.

As stated in the discussion, it is an absolute must to merge this into the 
release. In my opinion, we can't just drop an option for the next release.

https://github.com/llvm/llvm-project/pull/80259

[llvm-branch-commits] [libcxx] PR for llvm/llvm-project#80718 (PR #80720)

2024-02-05 Thread via llvm-branch-commits


https://github.com/llvmbot updated 
https://github.com/llvm/llvm-project/pull/80720

>From 984fe4054a4e67ed3a781e15a4269a2a89b5f424 Mon Sep 17 00:00:00 2001
From: Dimitry Andric 
Date: Mon, 5 Feb 2024 17:41:12 +0100
Subject: [PATCH] [libc++] Rename __bit_reference template parameter to avoid
 conflict (#80661)

As of 4d20cfcf4eb08217ed37c4d4c38dc395d7a66d26, `__bit_reference`
contains a template `__fill_n` with a bool `_FillValue` parameter.

Unfortunately there is a relatively widely used piece of scientific
software called NetCDF, which exposes a (C) macro `_FillValue` in its
public headers.

When building the NetCDF C++ bindings, this quickly leads to compilation
errors when the macro interferes with the template in `__bit_reference`.

Rename the parameter to `_FillVal` to avoid the conflict.

(cherry picked from commit 1ec252298925de50b27930c557ba9de3cc397afe)
---
 libcxx/include/__bit_reference | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/libcxx/include/__bit_reference b/libcxx/include/__bit_reference
index 9032b8f0180937..3a5339b72ddc31 100644
--- a/libcxx/include/__bit_reference
+++ b/libcxx/include/__bit_reference
@@ -173,7 +173,7 @@ private:
 
 // fill_n
 
-template <bool _FillValue, class _Cp>
+template <bool _FillVal, class _Cp>
 _LIBCPP_CONSTEXPR_SINCE_CXX20 _LIBCPP_HIDE_FROM_ABI void
 __fill_n(__bit_iterator<_Cp, false> __first, typename _Cp::size_type __n) {
   using _It= __bit_iterator<_Cp, false>;
@@ -185,7 +185,7 @@ __fill_n(__bit_iterator<_Cp, false> __first, typename 
_Cp::size_type __n) {
 __storage_type __clz_f = static_cast<__storage_type>(__bits_per_word - 
__first.__ctz_);
 __storage_type __dn= std::min(__clz_f, __n);
 __storage_type __m = (~__storage_type(0) << __first.__ctz_) & 
(~__storage_type(0) >> (__clz_f - __dn));
-if (_FillValue)
+if (_FillVal)
   *__first.__seg_ |= __m;
 else
   *__first.__seg_ &= ~__m;
@@ -194,13 +194,13 @@ __fill_n(__bit_iterator<_Cp, false> __first, typename 
_Cp::size_type __n) {
   }
   // do middle whole words
   __storage_type __nw = __n / __bits_per_word;
-  std::fill_n(std::__to_address(__first.__seg_), __nw, _FillValue ? 
static_cast<__storage_type>(-1) : 0);
+  std::fill_n(std::__to_address(__first.__seg_), __nw, _FillVal ? 
static_cast<__storage_type>(-1) : 0);
   __n -= __nw * __bits_per_word;
   // do last partial word
   if (__n > 0) {
 __first.__seg_ += __nw;
 __storage_type __m = ~__storage_type(0) >> (__bits_per_word - __n);
-if (_FillValue)
+if (_FillVal)
   *__first.__seg_ |= __m;
 else
   *__first.__seg_ &= ~__m;
@@ -1007,7 +1007,7 @@ private:
   friend class __bit_iterator<_Cp, true>;
   template 
   friend struct __bit_array;
-  template <bool _FillValue, class _Dp>
+  template <bool _FillVal, class _Dp>
   _LIBCPP_CONSTEXPR_SINCE_CXX20 friend void __fill_n(__bit_iterator<_Dp, 
false> __first, typename _Dp::size_type __n);
 
   template 


[llvm-branch-commits] [libcxx] 984fe40 - [libc++] Rename __bit_reference template parameter to avoid conflict (#80661)

2024-02-05 Thread Tom Stellard via llvm-branch-commits


Author: Dimitry Andric
Date: 2024-02-05T13:21:41-08:00
New Revision: 984fe4054a4e67ed3a781e15a4269a2a89b5f424

URL: 
https://github.com/llvm/llvm-project/commit/984fe4054a4e67ed3a781e15a4269a2a89b5f424
DIFF: 
https://github.com/llvm/llvm-project/commit/984fe4054a4e67ed3a781e15a4269a2a89b5f424.diff

LOG: [libc++] Rename __bit_reference template parameter to avoid conflict 
(#80661)

As of 4d20cfcf4eb08217ed37c4d4c38dc395d7a66d26, `__bit_reference`
contains a template `__fill_n` with a bool `_FillValue` parameter.

Unfortunately there is a relatively widely used piece of scientific
software called NetCDF, which exposes a (C) macro `_FillValue` in its
public headers.

When building the NetCDF C++ bindings, this quickly leads to compilation
errors when the macro interferes with the template in `__bit_reference`.

Rename the parameter to `_FillVal` to avoid the conflict.

(cherry picked from commit 1ec252298925de50b27930c557ba9de3cc397afe)
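
The conflict described above can be reproduced in a few lines. The following is a minimal sketch, not code from libc++ or NetCDF: `fillWords` is a hypothetical function, and the expansion given to the `_FillValue` macro is an assumption made for illustration. It shows why a template parameter that shares its name with a macro cannot survive preprocessing, and why a different parameter name is unaffected.

```cpp
#include <cassert>
#include <cstddef>

// Stand-in for a third-party C header that defines a macro with this name.
// The expansion is an assumption; only the name matters for the conflict.
#define _FillValue "_FillValue"

// With the macro visible, a declaration such as
//   template <bool _FillValue> void fillWords(unsigned *, std::size_t);
// is preprocessed into `template <bool "_FillValue"> ...` and fails to parse.
// A parameter name that no macro uses is left alone by the preprocessor.
template <bool FillVal>
void fillWords(unsigned *words, std::size_t count) {
  for (std::size_t i = 0; i < count; ++i)
    words[i] = FillVal ? ~0u : 0u;
}
```

Because the fill value is a template argument, `fillWords<true>(buf, n)` and `fillWords<false>(buf, n)` are separate instantiations, as with the `__fill_n` template discussed in the message.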

Added: 


Modified: 
libcxx/include/__bit_reference

Removed: 




diff  --git a/libcxx/include/__bit_reference b/libcxx/include/__bit_reference
index 9032b8f018093..3a5339b72ddc3 100644
--- a/libcxx/include/__bit_reference
+++ b/libcxx/include/__bit_reference
@@ -173,7 +173,7 @@ private:
 
 // fill_n
 
-template <bool _FillValue, class _Cp>
+template <bool _FillVal, class _Cp>
 _LIBCPP_CONSTEXPR_SINCE_CXX20 _LIBCPP_HIDE_FROM_ABI void
 __fill_n(__bit_iterator<_Cp, false> __first, typename _Cp::size_type __n) {
   using _It= __bit_iterator<_Cp, false>;
@@ -185,7 +185,7 @@ __fill_n(__bit_iterator<_Cp, false> __first, typename 
_Cp::size_type __n) {
 __storage_type __clz_f = static_cast<__storage_type>(__bits_per_word - 
__first.__ctz_);
 __storage_type __dn= std::min(__clz_f, __n);
 __storage_type __m = (~__storage_type(0) << __first.__ctz_) & 
(~__storage_type(0) >> (__clz_f - __dn));
-if (_FillValue)
+if (_FillVal)
   *__first.__seg_ |= __m;
 else
   *__first.__seg_ &= ~__m;
@@ -194,13 +194,13 @@ __fill_n(__bit_iterator<_Cp, false> __first, typename 
_Cp::size_type __n) {
   }
   // do middle whole words
   __storage_type __nw = __n / __bits_per_word;
-  std::fill_n(std::__to_address(__first.__seg_), __nw, _FillValue ? 
static_cast<__storage_type>(-1) : 0);
+  std::fill_n(std::__to_address(__first.__seg_), __nw, _FillVal ? 
static_cast<__storage_type>(-1) : 0);
   __n -= __nw * __bits_per_word;
   // do last partial word
   if (__n > 0) {
 __first.__seg_ += __nw;
 __storage_type __m = ~__storage_type(0) >> (__bits_per_word - __n);
-if (_FillValue)
+if (_FillVal)
   *__first.__seg_ |= __m;
 else
   *__first.__seg_ &= ~__m;
@@ -1007,7 +1007,7 @@ private:
   friend class __bit_iterator<_Cp, true>;
   template 
   friend struct __bit_array;
-  template <bool _FillValue, class _Dp>
+  template <bool _FillVal, class _Dp>
   _LIBCPP_CONSTEXPR_SINCE_CXX20 friend void __fill_n(__bit_iterator<_Dp, 
false> __first, typename _Dp::size_type __n);
 
   template 




[llvm-branch-commits] [libcxx] PR for llvm/llvm-project#80718 (PR #80720)

2024-02-05 Thread Tom Stellard via llvm-branch-commits


https://github.com/tstellar closed 
https://github.com/llvm/llvm-project/pull/80720

[llvm-branch-commits] [clang] [18.x][Docs] Add release note about Clang-defined target OS macros (PR #80044)

2024-02-05 Thread Tom Stellard via llvm-branch-commits


tstellar wrote:

Looks like this patch caused the documentation build to fail.

https://github.com/llvm/llvm-project/pull/80044

[llvm-branch-commits] [llvm] PR for llvm/llvm-project#80752 (PR #80754)

2024-02-05 Thread via llvm-branch-commits


https://github.com/llvmbot created 
https://github.com/llvm/llvm-project/pull/80754

resolves llvm/llvm-project#80752

>From 3ac083df943f040770b9d324956fb066bb8db27d Mon Sep 17 00:00:00 2001
From: Billy Laws 
Date: Wed, 31 Jan 2024 02:32:15 +
Subject: [PATCH 1/2] [AArch64] Fix variadic tail-calls on ARM64EC (#79774)

ARM64EC varargs calls expect that x4 = sp at entry. Special handling is
needed to ensure this for tail calls, since they occur after the
epilogue while the x4 write happens before it.

I tried going through AArch64MachineFrameLowering for this, hoping to
avoid creating the dummy object, but this was the best I could do: the
stack info that it uses isn't populated at this stage, and
CreateFixedObject explicitly forbids 0-sized objects.

(cherry picked from commit c761b4a5e4cc003a2c850898e1dc67d2637cfb0c)
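
The register requirement can be pictured with a small model. The following is a simplified sketch, not the lowering code: `VarargRegs` and `modelVarargRegs` are hypothetical names, and the model assumes the whole frame is released in the epilogue, so that the callee of a tail call sees the caller's SP plus the frame size.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of the two registers the Arm64EC vararg convention uses.
struct VarargRegs {
  std::uint64_t x4; // address of the first stack parameter
  std::uint64_t x5; // size in bytes of all parameters passed on the stack
};

// Hypothetical helper. `spInBody` is SP while the caller's frame is live and
// `frameSize` is the amount the epilogue adds back to SP.
VarargRegs modelVarargRegs(std::uint64_t spInBody, std::uint64_t frameSize,
                           std::uint64_t stackArgBytes, bool isTailCall) {
  // A normal call enters the callee with the frame still allocated. A tail
  // call branches after the epilogue, so x4 has to hold the post-epilogue SP
  // even though it is written while the frame is still live.
  const std::uint64_t spAtCalleeEntry =
      isTailCall ? spInBody + frameSize : spInBody;
  return {spAtCalleeEntry, stackArgBytes};
}
```

With a 48-byte frame this reproduces the shape of the `varargs_caller_tail` test: `mov x4, sp` with 16 bytes of stack arguments for the ordinary call, and `add x4, sp, #48` with no stack arguments for the tail call.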
---
 .../Target/AArch64/AArch64ISelLowering.cpp| 10 -
 llvm/test/CodeGen/AArch64/arm64ec-varargs.ll  | 37 +++
 llvm/test/CodeGen/AArch64/vararg-tallcall.ll  |  8 
 3 files changed, 54 insertions(+), 1 deletion(-)

diff --git a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp 
b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
index e97f5e3220148..957b556edaf31 100644
--- a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
+++ b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
@@ -8007,11 +8007,19 @@ AArch64TargetLowering::LowerCall(CallLoweringInfo &CLI,
   }
 
   if (IsVarArg && Subtarget->isWindowsArm64EC()) {
+SDValue ParamPtr = StackPtr;
+if (IsTailCall) {
+  // Create a dummy object at the top of the stack that can be used to get
+  // the SP after the epilogue
+  int FI = MF.getFrameInfo().CreateFixedObject(1, FPDiff, true);
+  ParamPtr = DAG.getFrameIndex(FI, PtrVT);
+}
+
 // For vararg calls, the Arm64EC ABI requires values in x4 and x5
 // describing the argument list.  x4 contains the address of the
 // first stack parameter. x5 contains the size in bytes of all parameters
 // passed on the stack.
-RegsToPass.emplace_back(AArch64::X4, StackPtr);
+RegsToPass.emplace_back(AArch64::X4, ParamPtr);
 RegsToPass.emplace_back(AArch64::X5,
 DAG.getConstant(NumBytes, DL, MVT::i64));
   }
diff --git a/llvm/test/CodeGen/AArch64/arm64ec-varargs.ll 
b/llvm/test/CodeGen/AArch64/arm64ec-varargs.ll
index dc16b3a1a0f27..844fc52ddade6 100644
--- a/llvm/test/CodeGen/AArch64/arm64ec-varargs.ll
+++ b/llvm/test/CodeGen/AArch64/arm64ec-varargs.ll
@@ -100,5 +100,42 @@ define void @varargs_many_argscalleer() nounwind {
   ret void
 }
 
+define void @varargs_caller_tail() nounwind {
+; CHECK-LABEL: varargs_caller_tail:
+; CHECK:// %bb.0:
+; CHECK-NEXT:sub sp, sp, #48
+; CHECK-NEXT:mov x4, sp
+; CHECK-NEXT:add x8, sp, #16
+; CHECK-NEXT:mov x9, #4617315517961601024// 
=0x4014
+; CHECK-NEXT:mov x0, #4607182418800017408// 
=0x3ff0
+; CHECK-NEXT:mov w1, #2  // =0x2
+; CHECK-NEXT:mov x2, #4613937818241073152// 
=0x4008
+; CHECK-NEXT:mov w3, #4  // =0x4
+; CHECK-NEXT:mov w5, #16 // =0x10
+; CHECK-NEXT:stp xzr, x30, [sp, #24] // 8-byte Folded 
Spill
+; CHECK-NEXT:stp x9, x8, [sp]
+; CHECK-NEXT:str xzr, [sp, #16]
+; CHECK-NEXT:.weak_anti_dep  varargs_callee
+; CHECK-NEXT:.set varargs_callee, "#varargs_callee"@WEAKREF
+; CHECK-NEXT:.weak_anti_dep  "#varargs_callee"
+; CHECK-NEXT:.set "#varargs_callee", varargs_callee@WEAKREF
+; CHECK-NEXT:bl  "#varargs_callee"
+; CHECK-NEXT:ldr x30, [sp, #32]  // 8-byte Folded 
Reload
+; CHECK-NEXT:add x4, sp, #48
+; CHECK-NEXT:mov x0, #4607182418800017408// 
=0x3ff0
+; CHECK-NEXT:mov w1, #4  // =0x4
+; CHECK-NEXT:mov w2, #3  // =0x3
+; CHECK-NEXT:mov w3, #2  // =0x2
+; CHECK-NEXT:mov x5, xzr
+; CHECK-NEXT:add sp, sp, #48
+; CHECK-NEXT:.weak_anti_dep  varargs_callee
+; CHECK-NEXT:.set varargs_callee, "#varargs_callee"@WEAKREF
+; CHECK-NEXT:.weak_anti_dep  "#varargs_callee"
+; CHECK-NEXT:.set "#varargs_callee", varargs_callee@WEAKREF
+; CHECK-NEXT:b   "#varargs_callee"
+  call void (double, ...) @varargs_callee(double 1.0, i32 2, double 3.0, i32 
4, double 5.0, <2 x double> )
+  tail call void (double, ...) @varargs_callee(double 1.0, i32 4, i32 3, i32 2)
+  ret void
+}
 
 declare void @llvm.va_start(ptr)
diff --git a/llvm/test/CodeGen/AArch64/vararg-tallcall.ll 
b/llvm/test/CodeGen/AArch64/vararg-tallcall.ll
index 2d6db1642247d..812837639196e 100644
--- a/llvm/test/CodeGen/AArch64/vararg-tallcall.ll
+++ b/llvm/test/CodeGen/AArch64/vara

[llvm-branch-commits] [llvm] PR for llvm/llvm-project#80752 (PR #80754)

2024-02-05 Thread via llvm-branch-commits


https://github.com/llvmbot milestoned 
https://github.com/llvm/llvm-project/pull/80754

[llvm-branch-commits] [llvm] PR for llvm/llvm-project#80752 (PR #80754)

2024-02-05 Thread via llvm-branch-commits


llvmbot wrote:

@efriedma-quic @cjacek What do you think about merging this PR to the release 
branch?

https://github.com/llvm/llvm-project/pull/80754

[llvm-branch-commits] [llvm] PR for llvm/llvm-project#80752 (PR #80754)

2024-02-05 Thread via llvm-branch-commits


llvmbot wrote:




@llvm/pr-subscribers-backend-aarch64

Author: None (llvmbot)


Changes

resolves llvm/llvm-project#80752

---
Full diff: https://github.com/llvm/llvm-project/pull/80754.diff


5 Files Affected:

- (modified) llvm/lib/Target/AArch64/AArch64Arm64ECCallLowering.cpp (+27-21) 
- (modified) llvm/lib/Target/AArch64/AArch64ISelLowering.cpp (+9-1) 
- (modified) llvm/test/CodeGen/AArch64/arm64ec-entry-thunks.ll (+2-2) 
- (modified) llvm/test/CodeGen/AArch64/arm64ec-varargs.ll (+37) 
- (modified) llvm/test/CodeGen/AArch64/vararg-tallcall.ll (+8) 


``diff
diff --git a/llvm/lib/Target/AArch64/AArch64Arm64ECCallLowering.cpp 
b/llvm/lib/Target/AArch64/AArch64Arm64ECCallLowering.cpp
index 11248bb7aef31..91b4f18c73c93 100644
--- a/llvm/lib/Target/AArch64/AArch64Arm64ECCallLowering.cpp
+++ b/llvm/lib/Target/AArch64/AArch64Arm64ECCallLowering.cpp
@@ -43,6 +43,8 @@ static cl::opt 
GenerateThunks("arm64ec-generate-thunks", cl::Hidden,
 
 namespace {
 
+enum class ThunkType { GuestExit, Entry, Exit };
+
 class AArch64Arm64ECCallLowering : public ModulePass {
 public:
   static char ID;
@@ -69,14 +71,14 @@ class AArch64Arm64ECCallLowering : public ModulePass {
   Type *I64Ty;
   Type *VoidTy;
 
-  void getThunkType(FunctionType *FT, AttributeList AttrList, bool EntryThunk,
+  void getThunkType(FunctionType *FT, AttributeList AttrList, ThunkType TT,
 raw_ostream &Out, FunctionType *&Arm64Ty,
 FunctionType *&X64Ty);
   void getThunkRetType(FunctionType *FT, AttributeList AttrList,
raw_ostream &Out, Type *&Arm64RetTy, Type *&X64RetTy,
SmallVectorImpl &Arm64ArgTypes,
SmallVectorImpl &X64ArgTypes, bool &HasSretPtr);
-  void getThunkArgTypes(FunctionType *FT, AttributeList AttrList,
+  void getThunkArgTypes(FunctionType *FT, AttributeList AttrList, ThunkType TT,
 raw_ostream &Out,
 SmallVectorImpl &Arm64ArgTypes,
 SmallVectorImpl &X64ArgTypes, bool HasSretPtr);
@@ -89,10 +91,11 @@ class AArch64Arm64ECCallLowering : public ModulePass {
 
 void AArch64Arm64ECCallLowering::getThunkType(FunctionType *FT,
   AttributeList AttrList,
-  bool EntryThunk, raw_ostream &Out,
+  ThunkType TT, raw_ostream &Out,
   FunctionType *&Arm64Ty,
   FunctionType *&X64Ty) {
-  Out << (EntryThunk ? "$ientry_thunk$cdecl$" : "$iexit_thunk$cdecl$");
+  Out << (TT == ThunkType::Entry ? "$ientry_thunk$cdecl$"
+ : "$iexit_thunk$cdecl$");
 
   Type *Arm64RetTy;
   Type *X64RetTy;
@@ -102,8 +105,8 @@ void AArch64Arm64ECCallLowering::getThunkType(FunctionType *FT,
 
   // The first argument to a thunk is the called function, stored in x9.
   // For exit thunks, we pass the called function down to the emulator;
-  // for entry thunks, we just call the Arm64 function directly.
-  if (!EntryThunk)
+  // for entry/guest exit thunks, we just call the Arm64 function directly.
+  if (TT == ThunkType::Exit)
 Arm64ArgTypes.push_back(PtrTy);
   X64ArgTypes.push_back(PtrTy);
 
@@ -111,14 +114,16 @@ void AArch64Arm64ECCallLowering::getThunkType(FunctionType *FT,
   getThunkRetType(FT, AttrList, Out, Arm64RetTy, X64RetTy, Arm64ArgTypes,
   X64ArgTypes, HasSretPtr);
 
-  getThunkArgTypes(FT, AttrList, Out, Arm64ArgTypes, X64ArgTypes, HasSretPtr);
+  getThunkArgTypes(FT, AttrList, TT, Out, Arm64ArgTypes, X64ArgTypes,
+   HasSretPtr);
 
-  Arm64Ty = FunctionType::get(Arm64RetTy, Arm64ArgTypes, false);
+  Arm64Ty = FunctionType::get(Arm64RetTy, Arm64ArgTypes,
+  TT == ThunkType::Entry && FT->isVarArg());
   X64Ty = FunctionType::get(X64RetTy, X64ArgTypes, false);
 }
 
 void AArch64Arm64ECCallLowering::getThunkArgTypes(
-FunctionType *FT, AttributeList AttrList, raw_ostream &Out,
+FunctionType *FT, AttributeList AttrList, ThunkType TT, raw_ostream &Out,
 SmallVectorImpl<Type *> &Arm64ArgTypes,
 SmallVectorImpl<Type *> &X64ArgTypes, bool HasSretPtr) {
 
@@ -151,14 +156,16 @@ void AArch64Arm64ECCallLowering::getThunkArgTypes(
   X64ArgTypes.push_back(I64Ty);
 }
 
-// x4
-Arm64ArgTypes.push_back(PtrTy);
-X64ArgTypes.push_back(PtrTy);
-// x5
-Arm64ArgTypes.push_back(I64Ty);
-// FIXME: x5 isn't actually passed/used by the x64 side; revisit once we
-// have proper isel for varargs
-X64ArgTypes.push_back(I64Ty);
+if (TT != ThunkType::Entry) {
+  // x4
+  Arm64ArgTypes.push_back(PtrTy);
+  X64ArgTypes.push_back(PtrTy);
+  // x5
+  Arm64ArgTypes.push_back(I64Ty);
+  // FIXME: x5 isn't actually passed/used by the x64 side; revisit once we
+  // have proper isel for varargs
+  X64ArgTypes.push_back(
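
The diff above is cut off at this point. As a reading aid, the branches that `getThunkType` takes after the patch can be approximated with plain standard-library types. This is only a sketch: `ThunkSignature`, `describeThunk` and `checkExamples` are hypothetical names that do not exist in LLVM, argument types are modelled as strings rather than `llvm::Type *`, and the variadic `x4`/`x5` handling is left out because the hunk that changes it is truncated.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Same three cases as the enum added by the patch.
enum class ThunkType { GuestExit, Entry, Exit };

// Hypothetical summary of a thunk signature; the real pass builds
// llvm::FunctionType objects instead of strings.
struct ThunkSignature {
  std::string NamePrefix;             // start of the mangled thunk name
  std::vector<std::string> Arm64Args; // leading Arm64-side argument types
  std::vector<std::string> X64Args;   // leading x64-side argument types
  bool Arm64IsVarArg = false;         // variadic flag on the Arm64 type
};

// Illustrative helper (not part of LLVM) that follows the branches
// visible in the patched getThunkType.
ThunkSignature describeThunk(ThunkType TT, bool CalleeIsVarArg) {
  ThunkSignature Sig;

  // Only entry thunks get the entry prefix; Exit and GuestExit share one.
  Sig.NamePrefix = TT == ThunkType::Entry ? "$ientry_thunk$cdecl$"
                                          : "$iexit_thunk$cdecl$";

  // The called function (held in x9) becomes an explicit Arm64-side
  // argument only for exit thunks; the x64 side always receives it.
  if (TT == ThunkType::Exit)
    Sig.Arm64Args.push_back("ptr");
  Sig.X64Args.push_back("ptr");

  // The Arm64 type stays variadic only for entry thunks of variadic callees.
  Sig.Arm64IsVarArg = TT == ThunkType::Entry && CalleeIsVarArg;
  return Sig;
}

// Small self-check covering the three thunk kinds.
void checkExamples() {
  ThunkSignature Entry = describeThunk(ThunkType::Entry, true);
  assert(Entry.NamePrefix == "$ientry_thunk$cdecl$");
  assert(Entry.Arm64Args.empty() && Entry.Arm64IsVarArg);

  ThunkSignature Exit = describeThunk(ThunkType::Exit, true);
  assert(Exit.NamePrefix == "$iexit_thunk$cdecl$");
  assert(Exit.Arm64Args.size() == 1 && !Exit.Arm64IsVarArg);

  ThunkSignature Guest = describeThunk(ThunkType::GuestExit, false);
  assert(Guest.Arm64Args.empty() && Guest.X64Args.size() == 1);
}
```

Replacing the old `bool EntryThunk` with a three-valued enum lets guest exit thunks share the exit name prefix while still skipping the extra Arm64-side callee argument, which a single boolean could not express.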

[llvm-branch-commits] [compiler-rt] [llvm] [NFC] (PR #80762)

2024-02-05 Thread Mingming Liu via llvm-branch-commits


https://github.com/minglotus-6 created https://github.com/llvm/llvm-project/pull/80762

None



[llvm-branch-commits] [compiler-rt] [llvm] [NFC] (PR #80762)

2024-02-05 Thread Bill Wendling via llvm-branch-commits


bwendling wrote:

Please add a title and description.

https://github.com/llvm/llvm-project/pull/80762

[llvm-branch-commits] [llvm] PR for llvm/llvm-project#80715 (PR #80716)

2024-02-05 Thread Tom Stellard via llvm-branch-commits


https://github.com/tstellar closed https://github.com/llvm/llvm-project/pull/80716
