[llvm-branch-commits] [clang] release/19.x: [C++20] [Modules] Fix the duplicated static initializer problem (#114193) (PR #114197)

2024-10-30 Thread via llvm-branch-commits

https://github.com/llvmbot created 
https://github.com/llvm/llvm-project/pull/114197

Backport 259eaa6878ead1e2e7ef572a874dc3d885c1899b

Requested by: @ChuanqiXu9

>From 6a3b8b5c12d45c5f981e90dae1a1e6c40d986cc7 Mon Sep 17 00:00:00 2001
From: Chuanqi Xu 
Date: Wed, 30 Oct 2024 17:27:04 +0800
Subject: [PATCH] [C++20] [Modules] Fix the duplicated static initializer
 problem (#114193)

Reproducer:

```
//--- a.cppm
export module a;
int func();
static int a = func();

//--- a.cpp
import a;
```

The `func()` should only execute once. However, before this patch we
will somehow import `static int a` from a.cppm incorrectly and
initialize that again.

This is super bad and can introduce serious runtime behaviors.

And also surprisingly, it looks like the root cause of the problem is
simply some oversight choosing APIs.

(cherry picked from commit 259eaa6878ead1e2e7ef572a874dc3d885c1899b)
---
 clang/lib/CodeGen/CodeGenModule.cpp|  4 ++--
 clang/test/Modules/static-initializer.cppm | 18 ++
 2 files changed, 20 insertions(+), 2 deletions(-)
 create mode 100644 clang/test/Modules/static-initializer.cppm

diff --git a/clang/lib/CodeGen/CodeGenModule.cpp 
b/clang/lib/CodeGen/CodeGenModule.cpp
index 151505baf38db1..2a5d5f9083ae65 100644
--- a/clang/lib/CodeGen/CodeGenModule.cpp
+++ b/clang/lib/CodeGen/CodeGenModule.cpp
@@ -7080,8 +7080,8 @@ void CodeGenModule::EmitTopLevelDecl(Decl *D) {
 // For C++ standard modules we are done - we will call the module
 // initializer for imported modules, and that will likewise call those for
 // any imports it has.
-if (CXX20ModuleInits && Import->getImportedOwningModule() &&
-!Import->getImportedOwningModule()->isModuleMapModule())
+if (CXX20ModuleInits && Import->getImportedModule() &&
+Import->getImportedModule()->isNamedModule())
   break;
 
 // For clang C++ module map modules the initializers for sub-modules are
diff --git a/clang/test/Modules/static-initializer.cppm 
b/clang/test/Modules/static-initializer.cppm
new file mode 100644
index 00..10d4854ee67fa6
--- /dev/null
+++ b/clang/test/Modules/static-initializer.cppm
@@ -0,0 +1,18 @@
+// RUN: rm -rf %t
+// RUN: mkdir -p %t
+// RUN: split-file %s %t
+//
+// RUN: %clang_cc1 -triple %itanium_abi_triple -std=c++20 %t/a.cppm 
-emit-module-interface -o %t/a.pcm
+// RUN: %clang_cc1 -triple %itanium_abi_triple -std=c++20 %t/a.cpp 
-fmodule-file=a=%t/a.pcm -emit-llvm -o - | FileCheck %t/a.cpp
+
+//--- a.cppm
+export module a;
+int func();
+static int a = func();
+
+//--- a.cpp
+import a;
+
+// CHECK-NOT: internal global
+// CHECK-NOT: __cxx_global_var_init
+

___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [CodeGen][NewPM] Port RegUsageInfoCollector pass to NPM (PR #113874)

2024-10-30 Thread Akshat Oke via llvm-branch-commits

https://github.com/optimisan updated 
https://github.com/llvm/llvm-project/pull/113874

>From a95b69c07c7804d2e2a10b939a178a191643a41c Mon Sep 17 00:00:00 2001
From: Akshat Oke 
Date: Mon, 28 Oct 2024 06:22:49 +
Subject: [PATCH 1/4] [CodeGen][NewPM] Port RegUsageInfoCollector pass to NPM

---
 .../llvm/CodeGen/RegUsageInfoCollector.h  | 25 
 llvm/include/llvm/InitializePasses.h  |  2 +-
 llvm/include/llvm/Passes/CodeGenPassBuilder.h |  1 +
 .../llvm/Passes/MachinePassRegistry.def   |  2 +-
 llvm/lib/CodeGen/CodeGen.cpp  |  2 +-
 llvm/lib/CodeGen/RegUsageInfoCollector.cpp| 60 +--
 llvm/lib/Passes/PassBuilder.cpp   |  1 +
 llvm/test/CodeGen/AMDGPU/ipra-regmask.ll  |  5 ++
 8 files changed, 76 insertions(+), 22 deletions(-)
 create mode 100644 llvm/include/llvm/CodeGen/RegUsageInfoCollector.h

diff --git a/llvm/include/llvm/CodeGen/RegUsageInfoCollector.h 
b/llvm/include/llvm/CodeGen/RegUsageInfoCollector.h
new file mode 100644
index 00..6b88cc4f99089e
--- /dev/null
+++ b/llvm/include/llvm/CodeGen/RegUsageInfoCollector.h
@@ -0,0 +1,25 @@
+//===- llvm/CodeGen/RegUsageInfoCollector.h -*- C++ 
-*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM 
Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===--===//
+
+#ifndef LLVM_CODEGEN_REGUSAGEINFOCOLLECTOR_H
+#define LLVM_CODEGEN_REGUSAGEINFOCOLLECTOR_H
+
+#include "llvm/CodeGen/MachinePassManager.h"
+
+namespace llvm {
+
+class RegUsageInfoCollectorPass
+: public AnalysisInfoMixin {
+public:
+  PreservedAnalyses run(MachineFunction &MF,
+MachineFunctionAnalysisManager &MFAM);
+};
+
+} // namespace llvm
+
+#endif // LLVM_CODEGEN_REGUSAGEINFOCOLLECTOR_H
diff --git a/llvm/include/llvm/InitializePasses.h 
b/llvm/include/llvm/InitializePasses.h
index edc237f2819818..44b7ba830bb329 100644
--- a/llvm/include/llvm/InitializePasses.h
+++ b/llvm/include/llvm/InitializePasses.h
@@ -257,7 +257,7 @@ void 
initializeRegAllocPriorityAdvisorAnalysisPass(PassRegistry &);
 void initializeRegAllocScoringPass(PassRegistry &);
 void initializeRegBankSelectPass(PassRegistry &);
 void initializeRegToMemWrapperPassPass(PassRegistry &);
-void initializeRegUsageInfoCollectorPass(PassRegistry &);
+void initializeRegUsageInfoCollectorLegacyPass(PassRegistry &);
 void initializeRegUsageInfoPropagationPass(PassRegistry &);
 void initializeRegionInfoPassPass(PassRegistry &);
 void initializeRegionOnlyPrinterPass(PassRegistry &);
diff --git a/llvm/include/llvm/Passes/CodeGenPassBuilder.h 
b/llvm/include/llvm/Passes/CodeGenPassBuilder.h
index 8cbc9f71ab26d0..066cd70ec8b996 100644
--- a/llvm/include/llvm/Passes/CodeGenPassBuilder.h
+++ b/llvm/include/llvm/Passes/CodeGenPassBuilder.h
@@ -53,6 +53,7 @@
 #include "llvm/CodeGen/PHIElimination.h"
 #include "llvm/CodeGen/PreISelIntrinsicLowering.h"
 #include "llvm/CodeGen/RegAllocFast.h"
+#include "llvm/CodeGen/RegUsageInfoCollector.h"
 #include "llvm/CodeGen/RegisterUsageInfo.h"
 #include "llvm/CodeGen/ReplaceWithVeclib.h"
 #include "llvm/CodeGen/SafeStack.h"
diff --git a/llvm/include/llvm/Passes/MachinePassRegistry.def 
b/llvm/include/llvm/Passes/MachinePassRegistry.def
index 7db28cb0092525..0ee4794034e98b 100644
--- a/llvm/include/llvm/Passes/MachinePassRegistry.def
+++ b/llvm/include/llvm/Passes/MachinePassRegistry.def
@@ -156,6 +156,7 @@ MACHINE_FUNCTION_PASS("print",
   MachinePostDominatorTreePrinterPass(dbgs()))
 MACHINE_FUNCTION_PASS("print", SlotIndexesPrinterPass(dbgs()))
 MACHINE_FUNCTION_PASS("print", VirtRegMapPrinterPass(dbgs()))
+MACHINE_FUNCTION_PASS("reg-usage-collector", RegUsageInfoCollectorPass())
 MACHINE_FUNCTION_PASS("require-all-machine-function-properties",
   RequireAllMachineFunctionPropertiesPass())
 MACHINE_FUNCTION_PASS("stack-coloring", StackColoringPass())
@@ -250,7 +251,6 @@ DUMMY_MACHINE_FUNCTION_PASS("prologepilog-code", 
PrologEpilogCodeInserterPass)
 DUMMY_MACHINE_FUNCTION_PASS("ra-basic", RABasicPass)
 DUMMY_MACHINE_FUNCTION_PASS("ra-greedy", RAGreedyPass)
 DUMMY_MACHINE_FUNCTION_PASS("ra-pbqp", RAPBQPPass)
-DUMMY_MACHINE_FUNCTION_PASS("reg-usage-collector", RegUsageInfoCollectorPass)
 DUMMY_MACHINE_FUNCTION_PASS("reg-usage-propagation", 
RegUsageInfoPropagationPass)
 DUMMY_MACHINE_FUNCTION_PASS("regalloc", RegAllocPass)
 DUMMY_MACHINE_FUNCTION_PASS("regallocscoringpass", RegAllocScoringPass)
diff --git a/llvm/lib/CodeGen/CodeGen.cpp b/llvm/lib/CodeGen/CodeGen.cpp
index 39fba1d0b527ef..e7e8a121369b75 100644
--- a/llvm/lib/CodeGen/CodeGen.cpp
+++ b/llvm/lib/CodeGen/CodeGen.cpp
@@ -113,7 +113,7 @@ void llvm::initializeCodeGen(PassRegistry &Registry) {
   initializeRABasicPass(Registry);
   initializeRAGreedyPass(Registry);
   initializeRegAllocFastPass(Reg

[llvm-branch-commits] [mlir] [MLIR][OpenMP] Emit descriptive errors for all unsupported clauses (PR #114037)

2024-10-30 Thread Sergio Afonso via llvm-branch-commits

https://github.com/skatrak updated 
https://github.com/llvm/llvm-project/pull/114037

>From 5f9c42714f1f8168adcb55ef72bf10fd0f6db81a Mon Sep 17 00:00:00 2001
From: Sergio Afonso 
Date: Tue, 29 Oct 2024 11:18:07 +
Subject: [PATCH 1/2] [MLIR][OpenMP] Emit descriptive errors for all
 unsupported clauses

This patch improves error reporting in the MLIR to LLVM IR translation pass for
the 'omp' dialect by emitting descriptive errors when encountering clauses not
yet supported by that pass.

Additionally, not-yet-implemented errors previously missing for some clauses
are added, to avoid silently ignoring them.

Error messages related to inlining of `omp.private` and `omp.declare_reduction`
regions have been updated to use the same format.
---
 .../OpenMP/OpenMPToLLVMIRTranslation.cpp  | 340 --
 mlir/test/Target/LLVMIR/openmp-todo.mlir  | 212 +--
 2 files changed, 421 insertions(+), 131 deletions(-)

diff --git 
a/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp 
b/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
index 4d189b1f40c46b..582a68f4c00a47 100644
--- a/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
+++ b/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
@@ -581,7 +581,8 @@ makeReductionGen(omp::DeclareReductionOp decl, 
llvm::IRBuilderBase &builder,
 if (failed(inlineConvertOmpRegions(decl.getReductionRegion(),
"omp.reduction.nonatomic.body", builder,
moduleTranslation, &phis)))
-  return llvm::createStringError("failed reduction region translation");
+  return llvm::createStringError(
+  "failed to inline `combiner` region of `omp.declare_reduction`");
 assert(phis.size() == 1);
 result = phis[0];
 return builder.saveIP();
@@ -614,7 +615,8 @@ makeAtomicReductionGen(omp::DeclareReductionOp decl,
 if (failed(inlineConvertOmpRegions(decl.getAtomicReductionRegion(),
"omp.reduction.atomic.body", builder,
moduleTranslation, &phis)))
-  return llvm::createStringError("failed reduction region translation");
+  return llvm::createStringError(
+  "failed to inline `atomic` region of `omp.declare_reduction`");
 assert(phis.empty());
 return builder.saveIP();
   };
@@ -650,6 +652,13 @@ convertOmpOrdered(Operation &opInst, llvm::IRBuilderBase 
&builder,
   return success();
 }
 
+static LogicalResult orderedRegionSupported(omp::OrderedRegionOp op) {
+  if (op.getParLevelSimd())
+return op.emitError("parallelization-level clause set not yet supported");
+
+  return success();
+}
+
 /// Converts an OpenMP 'ordered_region' operation into LLVM IR using
 /// OpenMPIRBuilder.
 static LogicalResult
@@ -658,9 +667,8 @@ convertOmpOrderedRegion(Operation &opInst, 
llvm::IRBuilderBase &builder,
   using InsertPointTy = llvm::OpenMPIRBuilder::InsertPointTy;
   auto orderedRegionOp = cast(opInst);
 
-  // TODO: The code generation for ordered simd directive is not supported yet.
-  if (orderedRegionOp.getParLevelSimd())
-return opInst.emitError("unhandled clauses for translation to LLVM IR");
+  if (failed(orderedRegionSupported(orderedRegionOp)))
+return failure();
 
   auto bodyGenCB = [&](InsertPointTy allocaIP, InsertPointTy codeGenIP) {
 // OrderedOp has only one region associated with it.
@@ -727,9 +735,10 @@ allocReductionVars(T loop, ArrayRef 
reductionArgs,
   SmallVector phis;
   if (failed(inlineConvertOmpRegions(allocRegion, "omp.reduction.alloc",
  builder, moduleTranslation, &phis)))
-return failure();
-  assert(phis.size() == 1 && "expected one allocation to be yielded");
+return loop.emitError(
+"failed to inline `alloc` region of `omp.declare_reduction`");
 
+  assert(phis.size() == 1 && "expected one allocation to be yielded");
   builder.SetInsertPoint(allocaIP.getBlock()->getTerminator());
 
   // Allocate reduction variable (which is a pointer to the real reduction
@@ -995,6 +1004,16 @@ static LogicalResult allocAndInitializeReductionVars(
   return success();
 }
 
+static LogicalResult sectionsOpSupported(omp::SectionsOp op) {
+  if (!op.getAllocateVars().empty() || !op.getAllocatorVars().empty())
+return op.emitError("allocate clause not yet supported");
+
+  if (!op.getPrivateVars().empty() || op.getPrivateSyms())
+return op.emitError("privatization clauses not yet supported");
+
+  return success();
+}
+
 static LogicalResult
 convertOmpSections(Operation &opInst, llvm::IRBuilderBase &builder,
LLVM::ModuleTranslation &moduleTranslation) {
@@ -1004,12 +1023,8 @@ convertOmpSections(Operation &opInst, 
llvm::IRBuilderBase &builder,
 
   auto sectionsOp = cast(opInst);
 
-  // TODO: Support the following clauses: private, firstprivate, lastprivat

[llvm-branch-commits] [mlir] [MLIR][OpenMP] Emit descriptive errors for all unsupported clauses (PR #114037)

2024-10-30 Thread Sergio Afonso via llvm-branch-commits


@@ -640,6 +642,13 @@ convertOmpOrdered(Operation &opInst, llvm::IRBuilderBase 
&builder,
   return success();
 }
 
+static LogicalResult orderedRegionSupported(omp::OrderedRegionOp op) {

skatrak wrote:

This should be ready to review again. Now the implementation is much more 
centralized, helping us make sure error messages are consistent.

https://github.com/llvm/llvm-project/pull/114037
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] release/19.x: [C++20] [Modules] Fix the duplicated static initializer problem (#114193) (PR #114197)

2024-10-30 Thread via llvm-branch-commits

https://github.com/llvmbot milestoned 
https://github.com/llvm/llvm-project/pull/114197
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [RISCV][NFC] refactor CFI emitting (PR #114227)

2024-10-30 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-backend-risc-v

Author: None (dlav-sc)


Changes



---
Full diff: https://github.com/llvm/llvm-project/pull/114227.diff


2 Files Affected:

- (modified) llvm/lib/Target/RISCV/RISCVFrameLowering.cpp (+94-98) 
- (modified) llvm/lib/Target/RISCV/RISCVFrameLowering.h (+3-2) 


``diff
diff --git a/llvm/lib/Target/RISCV/RISCVFrameLowering.cpp 
b/llvm/lib/Target/RISCV/RISCVFrameLowering.cpp
index 6375300117090f..b94dd031186356 100644
--- a/llvm/lib/Target/RISCV/RISCVFrameLowering.cpp
+++ b/llvm/lib/Target/RISCV/RISCVFrameLowering.cpp
@@ -27,6 +27,88 @@
 
 using namespace llvm;
 
+static unsigned getCaleeSavedRVVNumRegs(const Register &BaseReg) {
+  return RISCV::VRRegClass.contains(BaseReg) ? 1
+ : RISCV::VRM2RegClass.contains(BaseReg) ? 2
+ : RISCV::VRM4RegClass.contains(BaseReg) ? 4
+ : 8;
+}
+
+static MCRegister getRVVBaseRegister(const RISCVRegisterInfo &TRI, const 
Register &Reg) {
+  MCRegister BaseReg = TRI.getSubReg(Reg, RISCV::sub_vrm1_0);
+  // If it's not a grouped vector register, it doesn't have subregister, so
+  // the base register is just itself.
+  if (BaseReg == RISCV::NoRegister)
+BaseReg = Reg;
+  return BaseReg;
+}
+
+namespace {
+
+struct CFIRestoreRegisterEmitter {
+  CFIRestoreRegisterEmitter(MachineFunction &, const RISCVSubtarget &) {};
+
+  void emit(MachineFunction &MF, MachineBasicBlock &MBB, 
MachineBasicBlock::iterator MBBI, const RISCVRegisterInfo &RI, const 
RISCVInstrInfo &TII, const DebugLoc &DL, const CalleeSavedInfo &CS) const {
+Register Reg = CS.getReg();
+unsigned CFIIndex = MF.addFrameInst(MCCFIInstruction::createRestore(
+nullptr, RI.getDwarfRegNum(Reg, true)));
+BuildMI(MBB, MBBI, DL, TII.get(TargetOpcode::CFI_INSTRUCTION))
+.addCFIIndex(CFIIndex)
+.setMIFlag(MachineInstr::FrameDestroy);
+  }
+};
+
+class CFIStoreRegisterEmitter {
+  MachineFrameInfo &MFI;
+  
+  public:
+  CFIStoreRegisterEmitter(MachineFunction &MF, const RISCVSubtarget &) : 
MFI{MF.getFrameInfo()} {};
+  
+  void emit(MachineFunction &MF, MachineBasicBlock &MBB, 
MachineBasicBlock::iterator MBBI, const RISCVRegisterInfo &RI, const 
RISCVInstrInfo &TII, const DebugLoc &DL, const CalleeSavedInfo &CS) const {
+int FrameIdx = CS.getFrameIdx();
+int64_t Offset = MFI.getObjectOffset(FrameIdx);
+Register Reg = CS.getReg();
+unsigned CFIIndex = MF.addFrameInst(MCCFIInstruction::createOffset(
+nullptr, RI.getDwarfRegNum(Reg, true), Offset));
+BuildMI(MBB, MBBI, DL, TII.get(TargetOpcode::CFI_INSTRUCTION))
+.addCFIIndex(CFIIndex)
+.setMIFlag(MachineInstr::FrameSetup);
+  }
+};
+
+class CFIRestoreRVVRegisterEmitter {
+  const llvm::RISCVRegisterInfo *TRI;
+  
+  public:
+  CFIRestoreRVVRegisterEmitter(MachineFunction &, const RISCVSubtarget &STI) : 
TRI{STI.getRegisterInfo()} {};
+
+  void emit(MachineFunction &MF, MachineBasicBlock &MBB, 
MachineBasicBlock::iterator MBBI, const RISCVRegisterInfo &RI, const 
RISCVInstrInfo &TII, const DebugLoc &DL, const CalleeSavedInfo &CS) const {
+MCRegister BaseReg = getRVVBaseRegister(*TRI, CS.getReg());
+unsigned NumRegs = getCaleeSavedRVVNumRegs(CS.getReg());
+for (unsigned i = 0; i < NumRegs; ++i) {
+  unsigned CFIIndex = MF.addFrameInst(MCCFIInstruction::createRestore(
+  nullptr, RI.getDwarfRegNum(BaseReg + i, true)));
+  BuildMI(MBB, MBBI, DL, TII.get(TargetOpcode::CFI_INSTRUCTION))
+  .addCFIIndex(CFIIndex)
+  .setMIFlag(MachineInstr::FrameDestroy);
+}
+  }
+};
+
+}
+
+template 
+void RISCVFrameLowering::emitCFIForCSI(MachineBasicBlock &MBB, 
MachineBasicBlock::iterator MBBI, const SmallVector &CSI) 
const {
+  MachineFunction *MF = MBB.getParent();
+  const RISCVRegisterInfo *RI = STI.getRegisterInfo();
+  const RISCVInstrInfo *TII = STI.getInstrInfo();
+  DebugLoc DL = MBB.findDebugLoc(MBBI);
+  
+  Emitter E{*MF, STI}; 
+  for (const auto &CS : CSI)
+E.emit(*MF, MBB, MBBI, *RI, *TII, DL, CS);
+}
+
 static Align getABIStackAlignment(RISCVABI::ABI ABI) {
   if (ABI == RISCVABI::ABI_ILP32E)
 return Align(4);
@@ -607,16 +689,7 @@ void RISCVFrameLowering::emitPrologue(MachineFunction &MF,
 .addCFIIndex(CFIIndex)
 .setMIFlag(MachineInstr::FrameSetup);
 
-for (const auto &Entry : getPushOrLibCallsSavedInfo(MF, CSI)) {
-  int FrameIdx = Entry.getFrameIdx();
-  int64_t Offset = MFI.getObjectOffset(FrameIdx);
-  Register Reg = Entry.getReg();
-  unsigned CFIIndex = MF.addFrameInst(MCCFIInstruction::createOffset(
-  nullptr, RI->getDwarfRegNum(Reg, true), Offset));
-  BuildMI(MBB, MBBI, DL, TII->get(TargetOpcode::CFI_INSTRUCTION))
-  .addCFIIndex(CFIIndex)
-  .setMIFlag(MachineInstr::FrameSetup);
-}
+emitCFIForCSI(MBB, MBBI, 
getPushOrLibCallsSavedInfo(MF, CSI));
   }
 
   // FIXME (note copied from Lanai): This appears to be overallocating.  Needs
@

[llvm-branch-commits] [llvm] [RISCV][NFC] refactor CFI emitting (PR #114227)

2024-10-30 Thread via llvm-branch-commits

https://github.com/dlav-sc created 
https://github.com/llvm/llvm-project/pull/114227

None

>From 825bc966611bb9a5e737f2ed65b524766998c261 Mon Sep 17 00:00:00 2001
From: Daniil Avdeev 
Date: Wed, 30 Oct 2024 13:24:21 +
Subject: [PATCH] [RISCV][NFC] refactor CFI emitting

---
 llvm/lib/Target/RISCV/RISCVFrameLowering.cpp | 192 +--
 llvm/lib/Target/RISCV/RISCVFrameLowering.h   |   5 +-
 2 files changed, 97 insertions(+), 100 deletions(-)

diff --git a/llvm/lib/Target/RISCV/RISCVFrameLowering.cpp 
b/llvm/lib/Target/RISCV/RISCVFrameLowering.cpp
index 6375300117090f..b94dd031186356 100644
--- a/llvm/lib/Target/RISCV/RISCVFrameLowering.cpp
+++ b/llvm/lib/Target/RISCV/RISCVFrameLowering.cpp
@@ -27,6 +27,88 @@
 
 using namespace llvm;
 
+static unsigned getCaleeSavedRVVNumRegs(const Register &BaseReg) {
+  return RISCV::VRRegClass.contains(BaseReg) ? 1
+ : RISCV::VRM2RegClass.contains(BaseReg) ? 2
+ : RISCV::VRM4RegClass.contains(BaseReg) ? 4
+ : 8;
+}
+
+static MCRegister getRVVBaseRegister(const RISCVRegisterInfo &TRI, const 
Register &Reg) {
+  MCRegister BaseReg = TRI.getSubReg(Reg, RISCV::sub_vrm1_0);
+  // If it's not a grouped vector register, it doesn't have subregister, so
+  // the base register is just itself.
+  if (BaseReg == RISCV::NoRegister)
+BaseReg = Reg;
+  return BaseReg;
+}
+
+namespace {
+
+struct CFIRestoreRegisterEmitter {
+  CFIRestoreRegisterEmitter(MachineFunction &, const RISCVSubtarget &) {};
+
+  void emit(MachineFunction &MF, MachineBasicBlock &MBB, 
MachineBasicBlock::iterator MBBI, const RISCVRegisterInfo &RI, const 
RISCVInstrInfo &TII, const DebugLoc &DL, const CalleeSavedInfo &CS) const {
+Register Reg = CS.getReg();
+unsigned CFIIndex = MF.addFrameInst(MCCFIInstruction::createRestore(
+nullptr, RI.getDwarfRegNum(Reg, true)));
+BuildMI(MBB, MBBI, DL, TII.get(TargetOpcode::CFI_INSTRUCTION))
+.addCFIIndex(CFIIndex)
+.setMIFlag(MachineInstr::FrameDestroy);
+  }
+};
+
+class CFIStoreRegisterEmitter {
+  MachineFrameInfo &MFI;
+  
+  public:
+  CFIStoreRegisterEmitter(MachineFunction &MF, const RISCVSubtarget &) : 
MFI{MF.getFrameInfo()} {};
+  
+  void emit(MachineFunction &MF, MachineBasicBlock &MBB, 
MachineBasicBlock::iterator MBBI, const RISCVRegisterInfo &RI, const 
RISCVInstrInfo &TII, const DebugLoc &DL, const CalleeSavedInfo &CS) const {
+int FrameIdx = CS.getFrameIdx();
+int64_t Offset = MFI.getObjectOffset(FrameIdx);
+Register Reg = CS.getReg();
+unsigned CFIIndex = MF.addFrameInst(MCCFIInstruction::createOffset(
+nullptr, RI.getDwarfRegNum(Reg, true), Offset));
+BuildMI(MBB, MBBI, DL, TII.get(TargetOpcode::CFI_INSTRUCTION))
+.addCFIIndex(CFIIndex)
+.setMIFlag(MachineInstr::FrameSetup);
+  }
+};
+
+class CFIRestoreRVVRegisterEmitter {
+  const llvm::RISCVRegisterInfo *TRI;
+  
+  public:
+  CFIRestoreRVVRegisterEmitter(MachineFunction &, const RISCVSubtarget &STI) : 
TRI{STI.getRegisterInfo()} {};
+
+  void emit(MachineFunction &MF, MachineBasicBlock &MBB, 
MachineBasicBlock::iterator MBBI, const RISCVRegisterInfo &RI, const 
RISCVInstrInfo &TII, const DebugLoc &DL, const CalleeSavedInfo &CS) const {
+MCRegister BaseReg = getRVVBaseRegister(*TRI, CS.getReg());
+unsigned NumRegs = getCaleeSavedRVVNumRegs(CS.getReg());
+for (unsigned i = 0; i < NumRegs; ++i) {
+  unsigned CFIIndex = MF.addFrameInst(MCCFIInstruction::createRestore(
+  nullptr, RI.getDwarfRegNum(BaseReg + i, true)));
+  BuildMI(MBB, MBBI, DL, TII.get(TargetOpcode::CFI_INSTRUCTION))
+  .addCFIIndex(CFIIndex)
+  .setMIFlag(MachineInstr::FrameDestroy);
+}
+  }
+};
+
+}
+
+template 
+void RISCVFrameLowering::emitCFIForCSI(MachineBasicBlock &MBB, 
MachineBasicBlock::iterator MBBI, const SmallVector &CSI) 
const {
+  MachineFunction *MF = MBB.getParent();
+  const RISCVRegisterInfo *RI = STI.getRegisterInfo();
+  const RISCVInstrInfo *TII = STI.getInstrInfo();
+  DebugLoc DL = MBB.findDebugLoc(MBBI);
+  
+  Emitter E{*MF, STI}; 
+  for (const auto &CS : CSI)
+E.emit(*MF, MBB, MBBI, *RI, *TII, DL, CS);
+}
+
 static Align getABIStackAlignment(RISCVABI::ABI ABI) {
   if (ABI == RISCVABI::ABI_ILP32E)
 return Align(4);
@@ -607,16 +689,7 @@ void RISCVFrameLowering::emitPrologue(MachineFunction &MF,
 .addCFIIndex(CFIIndex)
 .setMIFlag(MachineInstr::FrameSetup);
 
-for (const auto &Entry : getPushOrLibCallsSavedInfo(MF, CSI)) {
-  int FrameIdx = Entry.getFrameIdx();
-  int64_t Offset = MFI.getObjectOffset(FrameIdx);
-  Register Reg = Entry.getReg();
-  unsigned CFIIndex = MF.addFrameInst(MCCFIInstruction::createOffset(
-  nullptr, RI->getDwarfRegNum(Reg, true), Offset));
-  BuildMI(MBB, MBBI, DL, TII->get(TargetOpcode::CFI_INSTRUCTION))
-  .addCFIIndex(CFIIndex)
-  .setMIFlag(MachineInstr::FrameSetup);
-}
+emitCFIForCSI(MBB, MBB

[llvm-branch-commits] [compiler-rt] release/19.x: [compiler-rt] [test] Fix using toolchains that rely on Clang default configs (#113491) (PR #114229)

2024-10-30 Thread via llvm-branch-commits

https://github.com/llvmbot created 
https://github.com/llvm/llvm-project/pull/114229

Backport a14a83d9a102253eca7c02ff4c35a2ce3f7de6e5

Requested by: @mstorsjo

>From 3f88a54c4eeb8d7c92aab78b39e9fb9162c6030c Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Martin=20Storsj=C3=B6?= 
Date: Thu, 24 Oct 2024 23:45:14 +0300
Subject: [PATCH] [compiler-rt] [test] Fix using toolchains that rely on Clang
 default configs (#113491)

The use of CLANG_NO_DEFAULT_CONFIG in the tests was added because some
Linux distributions had a global default config file, that added flags
relating to hardening, which interfere with the sanitizer tests. By
setting CLANG_NO_DEFAULT_CONFIG, the global default config files that
are found are ignored, and the sanitizers get the expected default
compiler behaviour.

(This was https://github.com/llvm/llvm-project/issues/60394, which was
fixed in 8ab762557fb057af1a3015211ee116a975027e78.)

However, some toolchains may rely on default config files for mandatory
parts required for functioning at all - setting things like sysroots,
-rtlib, -unwindlib, -stdlib, -fuse-ld etc. In such a case we can't
forcibly disable any default config, because it will break the otherwise
working toolchain.

Add a test for whether the compiler works while passing
--no-default-config to it. If the option is accepted and the toolchain
still works while that is set, set CLANG_NO_DEFAULT_CONFIG while running
tests.

(This adds a little bit of inconsistency, as we're testing for the
command line option, while using the environment variable. However doing
compile testing with an environment variable isn't quite as easily
doable, and passing an extra command line flag to all compile commands
while testing, is a bit clumsy - therefore this inconsistency.)

(cherry picked from commit a14a83d9a102253eca7c02ff4c35a2ce3f7de6e5)
---
 compiler-rt/CMakeLists.txt| 16 
 compiler-rt/test/CMakeLists.txt   |  2 ++
 compiler-rt/test/lit.common.cfg.py|  6 +-
 compiler-rt/test/lit.common.configured.in |  1 +
 4 files changed, 24 insertions(+), 1 deletion(-)

diff --git a/compiler-rt/CMakeLists.txt b/compiler-rt/CMakeLists.txt
index 2207555b03a03f..6cf20ab7c183ce 100644
--- a/compiler-rt/CMakeLists.txt
+++ b/compiler-rt/CMakeLists.txt
@@ -39,6 +39,22 @@ include(CompilerRTUtils)
 include(CMakeDependentOption)
 include(GetDarwinLinkerVersion)
 
+include(CheckCXXCompilerFlag)
+
+# Check if we can compile with --no-default-config, or if that omits a config
+# file that is essential for the toolchain to work properly.
+#
+# Using CMAKE_REQUIRED_FLAGS to make sure the flag is used both for compilation
+# and for linking.
+#
+# Doing this test early on, to see if the flag works on the toolchain
+# out of the box. Later on, we end up adding -nostdlib and similar flags
+# to all test compiles, which easily can give false positives on this test.
+set(OLD_CMAKE_REQUIRED_FLAGS "${CMAKE_REQUIRED_FLAGS}")
+set(CMAKE_REQUIRED_FLAGS "${CMAKE_REQUIRED_FLAGS} --no-default-config")
+check_cxx_compiler_flag("" COMPILER_RT_HAS_NO_DEFAULT_CONFIG_FLAG)
+set(CMAKE_REQUIRED_FLAGS "${OLD_CMAKE_REQUIRED_FLAGS}")
+
 option(COMPILER_RT_BUILD_BUILTINS "Build builtins" ON)
 mark_as_advanced(COMPILER_RT_BUILD_BUILTINS)
 option(COMPILER_RT_DISABLE_AARCH64_FMV "Disable AArch64 Function Multi 
Versioning support" OFF)
diff --git a/compiler-rt/test/CMakeLists.txt b/compiler-rt/test/CMakeLists.txt
index 84a98f36747495..f9e23710d3e4f7 100644
--- a/compiler-rt/test/CMakeLists.txt
+++ b/compiler-rt/test/CMakeLists.txt
@@ -12,6 +12,8 @@ pythonize_bool(COMPILER_RT_ENABLE_INTERNAL_SYMBOLIZER)
 
 pythonize_bool(COMPILER_RT_HAS_AARCH64_SME)
 
+pythonize_bool(COMPILER_RT_HAS_NO_DEFAULT_CONFIG_FLAG)
+
 configure_compiler_rt_lit_site_cfg(
   ${CMAKE_CURRENT_SOURCE_DIR}/lit.common.configured.in
   ${CMAKE_CURRENT_BINARY_DIR}/lit.common.configured)
diff --git a/compiler-rt/test/lit.common.cfg.py 
b/compiler-rt/test/lit.common.cfg.py
index 0690c3a18efdbc..d4b1e1d71d3c54 100644
--- a/compiler-rt/test/lit.common.cfg.py
+++ b/compiler-rt/test/lit.common.cfg.py
@@ -980,7 +980,11 @@ def is_windows_lto_supported():
 # default configs for the test runs. In particular, anything hardening
 # related is likely to cause issues with sanitizer tests, because it may
 # preempt something we're looking to trap (e.g. _FORTIFY_SOURCE vs our ASAN).
-config.environment["CLANG_NO_DEFAULT_CONFIG"] = "1"
+#
+# Only set this if we know we can still build for the target while disabling
+# default configs.
+if config.has_no_default_config_flag:
+config.environment["CLANG_NO_DEFAULT_CONFIG"] = "1"
 
 if config.has_compiler_rt_libatomic:
   base_lib = os.path.join(config.compiler_rt_libdir, "libclang_rt.atomic%s.so"
diff --git a/compiler-rt/test/lit.common.configured.in 
b/compiler-rt/test/lit.common.configured.in
index 8889b816b149fc..f7276627995520 100644
--- a/compiler-rt/test/lit.common.configured.in
+++ b/compiler-rt/test/lit.common.configured.in
@@ -53,6 +53,7

[llvm-branch-commits] [compiler-rt] release/19.x: [compiler-rt] [test] Fix using toolchains that rely on Clang default configs (#113491) (PR #114229)

2024-10-30 Thread via llvm-branch-commits

https://github.com/llvmbot milestoned 
https://github.com/llvm/llvm-project/pull/114229
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [compiler-rt] release/19.x: [compiler-rt] [test] Fix using toolchains that rely on Clang default configs (#113491) (PR #114229)

2024-10-30 Thread via llvm-branch-commits

llvmbot wrote:

@mgorny What do you think about merging this PR to the release branch?

https://github.com/llvm/llvm-project/pull/114229
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [mlir] [OpenMP][MLIR] Descriptor explicit member map lowering changes (PR #113556)

2024-10-30 Thread Sergio Afonso via llvm-branch-commits

https://github.com/skatrak commented:

LGTM, but buildbot errors in "MapInfoFinalization.cpp", 
"MapsForPrivatizedSymbols.cpp" and "Utils.cpp" seem to be a consequence of the 
type change to `members_index`. Hopefully it's a simple fix.

https://github.com/llvm/llvm-project/pull/113556
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] AMDGPU/GlobalISel: AMDGPURegBankSelect (PR #112863)

2024-10-30 Thread Petar Avramovic via llvm-branch-commits

https://github.com/petar-avramovic edited 
https://github.com/llvm/llvm-project/pull/112863
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] release/19.x: [C++20] [Modules] Fix the duplicated static initializer problem (#114193) (PR #114197)

2024-10-30 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-clang-modules

Author: None (llvmbot)


Changes

Backport 259eaa6878ead1e2e7ef572a874dc3d885c1899b

Requested by: @ChuanqiXu9

---
Full diff: https://github.com/llvm/llvm-project/pull/114197.diff


2 Files Affected:

- (modified) clang/lib/CodeGen/CodeGenModule.cpp (+2-2) 
- (added) clang/test/Modules/static-initializer.cppm (+18) 


``diff
diff --git a/clang/lib/CodeGen/CodeGenModule.cpp 
b/clang/lib/CodeGen/CodeGenModule.cpp
index 151505baf38db1..2a5d5f9083ae65 100644
--- a/clang/lib/CodeGen/CodeGenModule.cpp
+++ b/clang/lib/CodeGen/CodeGenModule.cpp
@@ -7080,8 +7080,8 @@ void CodeGenModule::EmitTopLevelDecl(Decl *D) {
 // For C++ standard modules we are done - we will call the module
 // initializer for imported modules, and that will likewise call those for
 // any imports it has.
-if (CXX20ModuleInits && Import->getImportedOwningModule() &&
-!Import->getImportedOwningModule()->isModuleMapModule())
+if (CXX20ModuleInits && Import->getImportedModule() &&
+Import->getImportedModule()->isNamedModule())
   break;
 
 // For clang C++ module map modules the initializers for sub-modules are
diff --git a/clang/test/Modules/static-initializer.cppm 
b/clang/test/Modules/static-initializer.cppm
new file mode 100644
index 00..10d4854ee67fa6
--- /dev/null
+++ b/clang/test/Modules/static-initializer.cppm
@@ -0,0 +1,18 @@
+// RUN: rm -rf %t
+// RUN: mkdir -p %t
+// RUN: split-file %s %t
+//
+// RUN: %clang_cc1 -triple %itanium_abi_triple -std=c++20 %t/a.cppm 
-emit-module-interface -o %t/a.pcm
+// RUN: %clang_cc1 -triple %itanium_abi_triple -std=c++20 %t/a.cpp 
-fmodule-file=a=%t/a.pcm -emit-llvm -o - | FileCheck %t/a.cpp
+
+//--- a.cppm
+export module a;
+int func();
+static int a = func();
+
+//--- a.cpp
+import a;
+
+// CHECK-NOT: internal global
+// CHECK-NOT: __cxx_global_var_init
+

``




https://github.com/llvm/llvm-project/pull/114197
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [libcxx] [libc++][C++03] Update include guards (PR #109001)

2024-10-30 Thread Louis Dionne via llvm-branch-commits

https://github.com/ldionne edited 
https://github.com/llvm/llvm-project/pull/109001
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [libcxx] [libc++][C++03] Update include guards (PR #109001)

2024-10-30 Thread Louis Dionne via llvm-branch-commits


@@ -29,18 +29,18 @@ _LIBCPP_PUSH_MACROS
 
 // TODO: Find out how altivec changes things and allow vectorizations there 
too.
 #if _LIBCPP_STD_VER >= 14 && defined(_LIBCPP_CLANG_VER) && 
!defined(__ALTIVEC__)
-#  define _LIBCPP_HAS_ALGORITHM_VECTOR_UTILS 1
+#  define _LIBCPP___CXX03_HAS_ALGORITHM_VECTOR_UTILS 1

ldionne wrote:

Oops!

https://github.com/llvm/llvm-project/pull/109001
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [libcxx] [libc++][C++03] Update include guards (PR #109001)

2024-10-30 Thread Louis Dionne via llvm-branch-commits

https://github.com/ldionne commented:

This looks good to me in principle, however the patch itself has a few 
incorrect renamings. One thing you could do is use `git diff --stat` to confirm 
that all the changed files have exactly 3 lines changed in them. A few headers 
like the C compat headers might need additional renames, that needs to be 
investigated.

https://github.com/llvm/llvm-project/pull/109001
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [libcxx] [libc++][C++03] Update include guards (PR #109001)

2024-10-30 Thread Louis Dionne via llvm-branch-commits


@@ -41,7 +41,7 @@ template , int> = 0>
 [[nodiscard]] _LIBCPP_HIDE_FROM_ABI bool
 any_of(_ExecutionPolicy&& __policy, _ForwardIterator __first, _ForwardIterator 
__last, _Predicate __pred) {
-  _LIBCPP_REQUIRE_CPP17_FORWARD_ITERATOR(_ForwardIterator, "any_of requires a 
ForwardIterator");
+  _LIBCPP___CXX03_REQUIRE_CPP17_FORWARD_ITERATOR(_ForwardIterator, "any_of 
requires a ForwardIterator");

ldionne wrote:

Oops!

https://github.com/llvm/llvm-project/pull/109001
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [lld] [PAC][lld][AArch64][ELF] Support signed GOT with tiny code model (PR #113816)

2024-10-30 Thread Paul Kirth via llvm-branch-commits


@@ -78,6 +78,79 @@ _start:
   adrp x1, :got_auth:zed
   add  x1, x1, :got_auth_lo12:zed
 
+#--- ok-tiny.s
+
+# RUN: llvm-mc -filetype=obj -triple=aarch64-none-linux ok-tiny.s -o ok-tiny.o
+
+# RUN: ld.lld ok-tiny.o a.so -pie -o external-tiny
+# RUN: llvm-readelf -r -S -x .got external-tiny | FileCheck %s 
--check-prefix=EXTERNAL-TINY
+
+# RUN: ld.lld ok-tiny.o a.o -pie -o local-tiny
+# RUN: llvm-readelf -r -S -x .got -s local-tiny | FileCheck %s 
--check-prefix=LOCAL-TINY
+
+# EXTERNAL-TINY:  OffsetInfo Type  
  Symbol's Value   Symbol's Name + Addend
+# EXTERNAL-TINY-NEXT: 00020380  0001e201 
R_AARCH64_AUTH_GLOB_DAT  bar + 0
+# EXTERNAL-TINY-NEXT: 00020388  0002e201 
R_AARCH64_AUTH_GLOB_DAT  zed + 0
+
+## Symbol's values for bar and zed are equal since they contain no content 
(see Inputs/shared.s)
+# LOCAL-TINY: OffsetInfo Type  
  Symbol's Value   Symbol's Name + Addend
+# LOCAL-TINY-NEXT:00020320  0411 
R_AARCH64_AUTH_RELATIVE 10260
+# LOCAL-TINY-NEXT:00020328  0411 
R_AARCH64_AUTH_RELATIVE 10260
+
+# EXTERNAL-TINY:  Hex dump of section '.got':
+# EXTERNAL-TINY-NEXT: 0x00020380  0080  00a0
+#   ^^
+#   0b1000 bit 63 address 
diversity = true, bits 61..60 key = IA
+# ^^
+# 0b1010 
bit 63 address diversity = true, bits 61..60 key = DA

ilovepi wrote:

I assume these are intentionally not matched. In that case, is there a good 
reason to keep them in the test?

https://github.com/llvm/llvm-project/pull/113816
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] [llvm] [LLVM] [Clang] Backport "Support for Gentoo `*t64` triples (64-bit time_t ABIs)" (PR #112364)

2024-10-30 Thread Michał Górny via llvm-branch-commits


@@ -294,7 +294,11 @@ class Triple {
 
 PAuthTest,
 
-LastEnvironmentType = PAuthTest
+GNUT64,
+GNUEABIT64,
+GNUEABIHFT64,
+
+LastEnvironmentType = GNUEABIHFT64

mgorny wrote:

The change wasn't meant to break API or ABI compatibility, and whether it did 
is at least controversial. What Zig is doing here is quite uncommon.

https://github.com/llvm/llvm-project/pull/112364
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] [llvm] [LLVM] [Clang] Backport "Support for Gentoo `*t64` triples (64-bit time_t ABIs)" (PR #112364)

2024-10-30 Thread Tobias Hieta via llvm-branch-commits

https://github.com/tru edited https://github.com/llvm/llvm-project/pull/112364
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] [llvm] [LLVM] [Clang] Backport "Support for Gentoo `*t64` triples (64-bit time_t ABIs)" (PR #112364)

2024-10-30 Thread Tobias Hieta via llvm-branch-commits


@@ -294,7 +294,11 @@ class Triple {
 
 PAuthTest,
 
-LastEnvironmentType = PAuthTest
+GNUT64,
+GNUEABIT64,
+GNUEABIHFT64,
+
+LastEnvironmentType = GNUEABIHFT64

tru wrote:

Yeah this is obviously an oversight and of the ABI checker would have flagged 
it I would asked for changes.

The question now is what we do about it - reverting the change might not be the 
best way forward.

I think we need some input from @tstellar @nikic and possibly @AaronBallman 
here. 

https://github.com/llvm/llvm-project/pull/112364
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [mlir] [OpenMP][MLIR] Descriptor explicit member map lowering changes (PR #113556)

2024-10-30 Thread via llvm-branch-commits

agozillon wrote:

Ah yes, this one won't build, as MapInfoFinalization.cpp is a Flang change, as 
is the Utils.cpp/hpp and other complaints. This patch requires the changeset 
from 113557, so it's 113557 we should be looking at to pass, which it currently 
doesn't, so I'll have a look there. But don't expect this one to pass as it 
doesn't contain the changeset require to do so, it'd require both the Flang 
frontend changes and the changeset in this PR which is only the MLIR project 
level changes.

https://github.com/llvm/llvm-project/pull/113556
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [mlir] [OpenMP][MLIR] Descriptor explicit member map lowering changes (PR #113556)

2024-10-30 Thread via llvm-branch-commits

https://github.com/agozillon updated 
https://github.com/llvm/llvm-project/pull/113556

>From b27db9198dedd284ea24161dc4fe6fcb3952814b Mon Sep 17 00:00:00 2001
From: agozillon 
Date: Fri, 4 Oct 2024 13:03:22 -0500
Subject: [PATCH] [OpenMP][MLIR] Descriptor explicit member map lowering
 changes

This is one of 3 PRs in a PR stack that aims to add support for explicit 
mapping of
allocatable members in derived types.

The primary changes in this PR are the OpenMPToLLVMIRTranslation.cpp changes,
which are small and seek to alter the current member mapping to add an
additional map insertion for pointers. Effectively, if the member is a pointer
(currently indicated by having a varPtrPtr field) we add an additional map for
the pointer and then alter the subsequent mapping of the member (the data)
to utilise the member rather than the parents base pointer. This appears to be
necessary in certain cases when mapping pointer data within record types to
avoid segfaulting on device (due to incorrect data mapping). In general this
record type mapping may be simplifiable in the future.

There are also additions of tests which should help to showcase the affect
of the changes above.
---
 mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td |   2 +-
 mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp  |  58 +++--
 .../OpenMP/OpenMPToLLVMIRTranslation.cpp  |  81 -
 mlir/test/Dialect/OpenMP/ops.mlir |   4 +-
 ...t-nested-ptr-record-type-mapping-host.mlir |  66 ++
 ...arget-nested-record-type-mapping-host.mlir |   2 +-
 ...get-record-type-with-ptr-member-host.mlir} | 114 ++
 7 files changed, 197 insertions(+), 130 deletions(-)
 create mode 100644 
mlir/test/Target/LLVMIR/omptarget-nested-ptr-record-type-mapping-host.mlir
 rename mlir/test/Target/LLVMIR/{omptarget-fortran-allocatable-types-host.mlir 
=> omptarget-record-type-with-ptr-member-host.mlir} (58%)

diff --git a/mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td 
b/mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td
index 626539cb7bde42..348c1b9c2b8bdf 100644
--- a/mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td
+++ b/mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td
@@ -895,7 +895,7 @@ def MapInfoOp : OpenMP_Op<"map.info", 
[AttrSizedOperandSegments]> {
TypeAttr:$var_type,
Optional:$var_ptr_ptr,
Variadic:$members,
-   OptionalAttr:$members_index,
+   OptionalAttr:$members_index,
Variadic:$bounds, /* rank-0 to 
rank-{n-1} */
OptionalAttr:$map_type,
OptionalAttr:$map_capture_type,
diff --git a/mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp 
b/mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp
index e1df647d6a3c71..8d31cda3a33ee9 100644
--- a/mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp
+++ b/mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp
@@ -1395,16 +1395,15 @@ static void printMapClause(OpAsmPrinter &p, Operation 
*op,
 }
 
 static ParseResult parseMembersIndex(OpAsmParser &parser,
- DenseIntElementsAttr &membersIdx) {
-  SmallVector values;
-  int64_t value;
-  int64_t shape[2] = {0, 0};
-  unsigned shapeTmp = 0;
+ ArrayAttr &membersIdx) {
+  SmallVector values, memberIdxs;
+
   auto parseIndices = [&]() -> ParseResult {
+int64_t value;
 if (parser.parseInteger(value))
   return failure();
-shapeTmp++;
-values.push_back(APInt(32, value, /*isSigned=*/true));
+values.push_back(IntegerAttr::get(parser.getBuilder().getIntegerType(64),
+  APInt(64, value, /*isSigned=*/false)));
 return success();
   };
 
@@ -1418,52 +1417,29 @@ static ParseResult parseMembersIndex(OpAsmParser 
&parser,
 if (failed(parser.parseRSquare()))
   return failure();
 
-// Only set once, if any indices are not the same size
-// we error out in the next check as that's unsupported
-if (shape[1] == 0)
-  shape[1] = shapeTmp;
-
-// Verify that the recently parsed list is equal to the
-// first one we parsed, they must be equal lengths to
-// keep the rectangular shape DenseIntElementsAttr
-// requires
-if (shapeTmp != shape[1])
-  return failure();
-
-shapeTmp = 0;
-shape[0]++;
+memberIdxs.push_back(ArrayAttr::get(parser.getContext(), values));
+values.clear();
   } while (succeeded(parser.parseOptionalComma()));
 
-  if (!values.empty()) {
-ShapedType valueType =
-VectorType::get(shape, IntegerType::get(parser.getContext(), 32));
-membersIdx = DenseIntElementsAttr::get(valueType, values);
-  }
+  if (!memberIdxs.empty())
+membersIdx = ArrayAttr::get(parser.getContext(), memberIdxs);
 
   return success();
 }
 
 static void printMembersIndex(OpAsmPrinter &p, MapInfoOp op,
-  DenseIntElementsAttr membersIdx) {
-  llvm::ArrayRef shape = membersIdx.getShapedType

[llvm-branch-commits] [mlir] [mlir][bufferization] Remove `finalizing-bufferize` pass (PR #114154)

2024-10-30 Thread Javed Absar via llvm-branch-commits

https://github.com/javedabsar1 commented:

Hi Matthias.
Will you delete references in docs in a different diff ? 
https://github.com/llvm/llvm-project/blob/main/mlir/docs/Bufferization.md?plain=1#L561

https://github.com/llvm/llvm-project/pull/114154
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [mlir] [OpenMP][MLIR] Descriptor explicit member map lowering changes (PR #113556)

2024-10-30 Thread via llvm-branch-commits

agozillon wrote:

> LGTM, but buildbot errors in "MapInfoFinalization.cpp", 
> "MapsForPrivatizedSymbols.cpp" and "Utils.cpp" seem to be a consequence of 
> the type change to `members_index`. Hopefully it's a simple fix.

I have a feeling it's just not setup correctly to layer on top of it's 
dependent PRs, hence the errors. As in conjunction it all builds fine and runs 
on my machine and I'll double check again before landing the PR stack :-)

Would love an approval if you're happy with the PRs current state!

And would love an approval or further review comments from @ergawy @TIFitis if 
at all possible, would be wonderful to be able to land this in the next week or 
two as it seems to be approaching closure :-) 

https://github.com/llvm/llvm-project/pull/113556
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [libcxx] [llvm] [libc++][C++03] Use `__cxx03/` headers in C++03 mode (PR #109002)

2024-10-30 Thread Louis Dionne via llvm-branch-commits

https://github.com/ldionne edited 
https://github.com/llvm/llvm-project/pull/109002
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [flang] [Flang][OpenMP] Derived type explicit allocatable member mapping (PR #113557)

2024-10-30 Thread via llvm-branch-commits

agozillon wrote:

Would love a review on this whenever you all get a bit of spare time! Thank you 
very much for your help and time it's greatly appreciated  :-)

https://github.com/llvm/llvm-project/pull/113557
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] [llvm] [LLVM] [Clang] Backport "Support for Gentoo `*t64` triples (64-bit time_t ABIs)" (PR #112364)

2024-10-30 Thread Alex Rønne Petersen via llvm-branch-commits


@@ -294,7 +294,11 @@ class Triple {
 
 PAuthTest,
 
-LastEnvironmentType = PAuthTest
+GNUT64,
+GNUEABIT64,
+GNUEABIHFT64,
+
+LastEnvironmentType = GNUEABIHFT64

alexrp wrote:

Also, ABI aside, quoting [Release Patch 
Rules](https://llvm.org/docs/HowToReleaseLLVM.html#release-patch-rules):

> Bug fix releases Patches should be limited to bug fixes or very safe and 
> critical performance improvements. Patches must maintain **both API and ABI 
> compatibility** with the previous major release.

https://github.com/llvm/llvm-project/pull/112364
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [llvm][fatlto] Drop any CFI related instrumentation after emitting bitcode (PR #112788)

2024-10-30 Thread Paul Kirth via llvm-branch-commits

https://github.com/ilovepi updated 
https://github.com/llvm/llvm-project/pull/112788

>From ad89d61e60bac57cf8c66a974d741377ebe1db30 Mon Sep 17 00:00:00 2001
From: Paul Kirth 
Date: Fri, 18 Oct 2024 01:59:26 +
Subject: [PATCH] Use new enum in constructor

Created using spr 1.3.4
---
 llvm/lib/Passes/PassBuilderPipelines.cpp | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/llvm/lib/Passes/PassBuilderPipelines.cpp 
b/llvm/lib/Passes/PassBuilderPipelines.cpp
index aec79304ab5c3c..0585e83e59a9ab 100644
--- a/llvm/lib/Passes/PassBuilderPipelines.cpp
+++ b/llvm/lib/Passes/PassBuilderPipelines.cpp
@@ -1631,8 +1631,11 @@ 
PassBuilder::buildFatLTODefaultPipeline(OptimizationLevel Level, bool ThinLTO,
   MPM.addPass(EmbedBitcodePass(ThinLTO, EmitSummary));
 
   // If we're doing FatLTO w/ CFI enabled, we don't want the type tests in the
-  // object file.
-  MPM.addPass(LowerTypeTestsPass(nullptr, nullptr, true, true));
+  // object code, only in the bitcode section, so drop it before we run
+  // module optimization and generate machine code. If llvm.type.test() isn't 
in
+  // the IR, this won't do anything.
+  MPM.addPass(
+  LowerTypeTestsPass(nullptr, nullptr, lowertypetests::DropTestKind::All));
 
   // Use the ThinLTO post-link pipeline with sample profiling
   if (ThinLTO && PGOOpt && PGOOpt->Action == PGOOptions::SampleUse)

___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [llvm][fatlto] Drop any CFI related instrumentation after emitting bitcode (PR #112788)

2024-10-30 Thread Paul Kirth via llvm-branch-commits

https://github.com/ilovepi updated 
https://github.com/llvm/llvm-project/pull/112788

>From ad89d61e60bac57cf8c66a974d741377ebe1db30 Mon Sep 17 00:00:00 2001
From: Paul Kirth 
Date: Fri, 18 Oct 2024 01:59:26 +
Subject: [PATCH] Use new enum in constructor

Created using spr 1.3.4
---
 llvm/lib/Passes/PassBuilderPipelines.cpp | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/llvm/lib/Passes/PassBuilderPipelines.cpp 
b/llvm/lib/Passes/PassBuilderPipelines.cpp
index aec79304ab5c3c..0585e83e59a9ab 100644
--- a/llvm/lib/Passes/PassBuilderPipelines.cpp
+++ b/llvm/lib/Passes/PassBuilderPipelines.cpp
@@ -1631,8 +1631,11 @@ 
PassBuilder::buildFatLTODefaultPipeline(OptimizationLevel Level, bool ThinLTO,
   MPM.addPass(EmbedBitcodePass(ThinLTO, EmitSummary));
 
   // If we're doing FatLTO w/ CFI enabled, we don't want the type tests in the
-  // object file.
-  MPM.addPass(LowerTypeTestsPass(nullptr, nullptr, true, true));
+  // object code, only in the bitcode section, so drop it before we run
+  // module optimization and generate machine code. If llvm.type.test() isn't 
in
+  // the IR, this won't do anything.
+  MPM.addPass(
+  LowerTypeTestsPass(nullptr, nullptr, lowertypetests::DropTestKind::All));
 
   // Use the ThinLTO post-link pipeline with sample profiling
   if (ThinLTO && PGOOpt && PGOOpt->Action == PGOOptions::SampleUse)

___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [RISCV][NFC] refactor CFI emitting (PR #114227)

2024-10-30 Thread Sam Elliott via llvm-branch-commits

https://github.com/lenary commented:

This cleanup is going in a nice direction, I think.

A few suggestions/questions below.

https://github.com/llvm/llvm-project/pull/114227
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [RISCV][NFC] refactor CFI emitting (PR #114227)

2024-10-30 Thread Sam Elliott via llvm-branch-commits


@@ -27,6 +27,102 @@
 
 using namespace llvm;
 
+static unsigned getCaleeSavedRVVNumRegs(const Register &BaseReg) {
+  return RISCV::VRRegClass.contains(BaseReg) ? 1
+ : RISCV::VRM2RegClass.contains(BaseReg) ? 2
+ : RISCV::VRM4RegClass.contains(BaseReg) ? 4
+ : 8;
+}
+
+static MCRegister getRVVBaseRegister(const RISCVRegisterInfo &TRI,
+ const Register &Reg) {
+  MCRegister BaseReg = TRI.getSubReg(Reg, RISCV::sub_vrm1_0);
+  // If it's not a grouped vector register, it doesn't have subregister, so
+  // the base register is just itself.
+  if (BaseReg == RISCV::NoRegister)
+BaseReg = Reg;
+  return BaseReg;
+}
+
+namespace {
+
+struct CFIRestoreRegisterEmitter {
+  CFIRestoreRegisterEmitter(MachineFunction &, const RISCVSubtarget &) {};
+
+  void emit(MachineFunction &MF, MachineBasicBlock &MBB,
+MachineBasicBlock::iterator MBBI, const RISCVRegisterInfo &RI,
+const RISCVInstrInfo &TII, const DebugLoc &DL,
+const CalleeSavedInfo &CS) const {
+Register Reg = CS.getReg();
+unsigned CFIIndex = MF.addFrameInst(
+MCCFIInstruction::createRestore(nullptr, RI.getDwarfRegNum(Reg, 
true)));
+BuildMI(MBB, MBBI, DL, TII.get(TargetOpcode::CFI_INSTRUCTION))
+.addCFIIndex(CFIIndex)
+.setMIFlag(MachineInstr::FrameDestroy);
+  }
+};
+
+class CFIStoreRegisterEmitter {
+  MachineFrameInfo &MFI;
+
+public:
+  CFIStoreRegisterEmitter(MachineFunction &MF, const RISCVSubtarget &)
+  : MFI{MF.getFrameInfo()} {};
+
+  void emit(MachineFunction &MF, MachineBasicBlock &MBB,
+MachineBasicBlock::iterator MBBI, const RISCVRegisterInfo &RI,
+const RISCVInstrInfo &TII, const DebugLoc &DL,
+const CalleeSavedInfo &CS) const {
+int FrameIdx = CS.getFrameIdx();
+int64_t Offset = MFI.getObjectOffset(FrameIdx);
+Register Reg = CS.getReg();
+unsigned CFIIndex = MF.addFrameInst(MCCFIInstruction::createOffset(
+nullptr, RI.getDwarfRegNum(Reg, true), Offset));
+BuildMI(MBB, MBBI, DL, TII.get(TargetOpcode::CFI_INSTRUCTION))
+.addCFIIndex(CFIIndex)
+.setMIFlag(MachineInstr::FrameSetup);
+  }
+};
+
+class CFIRestoreRVVRegisterEmitter {
+  const llvm::RISCVRegisterInfo *TRI;
+
+public:
+  CFIRestoreRVVRegisterEmitter(MachineFunction &, const RISCVSubtarget &STI)
+  : TRI{STI.getRegisterInfo()} {};
+
+  void emit(MachineFunction &MF, MachineBasicBlock &MBB,
+MachineBasicBlock::iterator MBBI, const RISCVRegisterInfo &RI,
+const RISCVInstrInfo &TII, const DebugLoc &DL,
+const CalleeSavedInfo &CS) const {
+MCRegister BaseReg = getRVVBaseRegister(*TRI, CS.getReg());
+unsigned NumRegs = getCaleeSavedRVVNumRegs(CS.getReg());
+for (unsigned i = 0; i < NumRegs; ++i) {
+  unsigned CFIIndex = MF.addFrameInst(MCCFIInstruction::createRestore(
+  nullptr, RI.getDwarfRegNum(BaseReg + i, true)));
+  BuildMI(MBB, MBBI, DL, TII.get(TargetOpcode::CFI_INSTRUCTION))
+  .addCFIIndex(CFIIndex)
+  .setMIFlag(MachineInstr::FrameDestroy);
+}
+  }
+};
+
+} // namespace
+
+template 
+void RISCVFrameLowering::emitCFIForCSI(
+MachineBasicBlock &MBB, MachineBasicBlock::iterator MBBI,
+const SmallVector &CSI) const {
+  MachineFunction *MF = MBB.getParent();
+  const RISCVRegisterInfo *RI = STI.getRegisterInfo();
+  const RISCVInstrInfo *TII = STI.getInstrInfo();
+  DebugLoc DL = MBB.findDebugLoc(MBBI);
+
+  Emitter E{*MF, STI};
+  for (const auto &CS : CSI)
+E.emit(*MF, MBB, MBBI, *RI, *TII, DL, CS);

lenary wrote:

I don't quite know why `*MF` needs to be passed into both the constructor and 
into `emit`? Would it not be simpler to just pass it into the constructor and 
cache the reference?

https://github.com/llvm/llvm-project/pull/114227
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [RISCV][NFC] refactor CFI emitting (PR #114227)

2024-10-30 Thread Sam Elliott via llvm-branch-commits


@@ -27,6 +27,102 @@
 
 using namespace llvm;
 
+static unsigned getCaleeSavedRVVNumRegs(const Register &BaseReg) {
+  return RISCV::VRRegClass.contains(BaseReg) ? 1
+ : RISCV::VRM2RegClass.contains(BaseReg) ? 2
+ : RISCV::VRM4RegClass.contains(BaseReg) ? 4
+ : 8;
+}
+
+static MCRegister getRVVBaseRegister(const RISCVRegisterInfo &TRI,
+ const Register &Reg) {
+  MCRegister BaseReg = TRI.getSubReg(Reg, RISCV::sub_vrm1_0);
+  // If it's not a grouped vector register, it doesn't have subregister, so
+  // the base register is just itself.
+  if (BaseReg == RISCV::NoRegister)
+BaseReg = Reg;
+  return BaseReg;
+}
+
+namespace {
+
+struct CFIRestoreRegisterEmitter {
+  CFIRestoreRegisterEmitter(MachineFunction &, const RISCVSubtarget &) {};
+
+  void emit(MachineFunction &MF, MachineBasicBlock &MBB,
+MachineBasicBlock::iterator MBBI, const RISCVRegisterInfo &RI,
+const RISCVInstrInfo &TII, const DebugLoc &DL,
+const CalleeSavedInfo &CS) const {
+Register Reg = CS.getReg();
+unsigned CFIIndex = MF.addFrameInst(
+MCCFIInstruction::createRestore(nullptr, RI.getDwarfRegNum(Reg, 
true)));
+BuildMI(MBB, MBBI, DL, TII.get(TargetOpcode::CFI_INSTRUCTION))
+.addCFIIndex(CFIIndex)
+.setMIFlag(MachineInstr::FrameDestroy);
+  }
+};
+
+class CFIStoreRegisterEmitter {

lenary wrote:

```suggestion
class CFISaveRegisterEmitter {
```

I think it would be clearer to use "Save" as the opposite of "Restore", rather 
than "Store" vs "Restore".

https://github.com/llvm/llvm-project/pull/114227
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [RISCV][NFC] refactor CFI emitting (PR #114227)

2024-10-30 Thread Sam Elliott via llvm-branch-commits


@@ -1737,39 +1776,14 @@ void RISCVFrameLowering::emitCalleeSavedRVVPrologCFI(
   for (auto &CS : RVVCSI) {
 // Insert the spill to the stack frame.
 int FI = CS.getFrameIdx();

lenary wrote:

Is there a reason you didn't replace this loop with a call to `emitCFIforCSI` 
with a new `CFISaveRVVRegisterEmitter`? That would give a little better 
symmetry in those objects, even though it doesn't get rid of the calls to 
`emitCalleeSavedRVVPrologCFI` because of the other logic happening before this 
loop.

https://github.com/llvm/llvm-project/pull/114227
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [RISCV][NFC] refactor CFI emitting (PR #114227)

2024-10-30 Thread Sam Elliott via llvm-branch-commits


@@ -27,6 +27,102 @@
 
 using namespace llvm;
 
+static unsigned getCaleeSavedRVVNumRegs(const Register &BaseReg) {
+  return RISCV::VRRegClass.contains(BaseReg) ? 1
+ : RISCV::VRM2RegClass.contains(BaseReg) ? 2
+ : RISCV::VRM4RegClass.contains(BaseReg) ? 4
+ : 8;
+}
+
+static MCRegister getRVVBaseRegister(const RISCVRegisterInfo &TRI,
+ const Register &Reg) {
+  MCRegister BaseReg = TRI.getSubReg(Reg, RISCV::sub_vrm1_0);
+  // If it's not a grouped vector register, it doesn't have subregister, so
+  // the base register is just itself.
+  if (BaseReg == RISCV::NoRegister)
+BaseReg = Reg;
+  return BaseReg;
+}
+
+namespace {
+
+struct CFIRestoreRegisterEmitter {
+  CFIRestoreRegisterEmitter(MachineFunction &, const RISCVSubtarget &) {};
+
+  void emit(MachineFunction &MF, MachineBasicBlock &MBB,
+MachineBasicBlock::iterator MBBI, const RISCVRegisterInfo &RI,
+const RISCVInstrInfo &TII, const DebugLoc &DL,
+const CalleeSavedInfo &CS) const {
+Register Reg = CS.getReg();
+unsigned CFIIndex = MF.addFrameInst(
+MCCFIInstruction::createRestore(nullptr, RI.getDwarfRegNum(Reg, 
true)));
+BuildMI(MBB, MBBI, DL, TII.get(TargetOpcode::CFI_INSTRUCTION))
+.addCFIIndex(CFIIndex)
+.setMIFlag(MachineInstr::FrameDestroy);
+  }
+};
+
+class CFIStoreRegisterEmitter {
+  MachineFrameInfo &MFI;
+
+public:
+  CFIStoreRegisterEmitter(MachineFunction &MF, const RISCVSubtarget &)
+  : MFI{MF.getFrameInfo()} {};
+
+  void emit(MachineFunction &MF, MachineBasicBlock &MBB,
+MachineBasicBlock::iterator MBBI, const RISCVRegisterInfo &RI,
+const RISCVInstrInfo &TII, const DebugLoc &DL,
+const CalleeSavedInfo &CS) const {
+int FrameIdx = CS.getFrameIdx();
+int64_t Offset = MFI.getObjectOffset(FrameIdx);
+Register Reg = CS.getReg();
+unsigned CFIIndex = MF.addFrameInst(MCCFIInstruction::createOffset(
+nullptr, RI.getDwarfRegNum(Reg, true), Offset));
+BuildMI(MBB, MBBI, DL, TII.get(TargetOpcode::CFI_INSTRUCTION))
+.addCFIIndex(CFIIndex)
+.setMIFlag(MachineInstr::FrameSetup);
+  }
+};
+
+class CFIRestoreRVVRegisterEmitter {
+  const llvm::RISCVRegisterInfo *TRI;
+
+public:
+  CFIRestoreRVVRegisterEmitter(MachineFunction &, const RISCVSubtarget &STI)
+  : TRI{STI.getRegisterInfo()} {};

lenary wrote:

Why is this saved, when the same pointee is passed into `emit` as a reference 
`RI`? Getting rid of this would simplify the constructor, removing the STI 
argument.

https://github.com/llvm/llvm-project/pull/114227
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [RISCV][NFC] refactor CFI emitting (PR #114227)

2024-10-30 Thread Sam Elliott via llvm-branch-commits


@@ -27,6 +27,102 @@
 
 using namespace llvm;
 
+static unsigned getCaleeSavedRVVNumRegs(const Register &BaseReg) {

lenary wrote:

Typo
```suggestion
static unsigned getCalleeSavedRVVNumRegs(const Register &BaseReg) {
```


https://github.com/llvm/llvm-project/pull/114227
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [RISCV][NFC] refactor CFI emitting (PR #114227)

2024-10-30 Thread Sam Elliott via llvm-branch-commits

https://github.com/lenary edited 
https://github.com/llvm/llvm-project/pull/114227
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] [llvm] [LLVM] [Clang] Backport "Support for Gentoo `*t64` triples (64-bit time_t ABIs)" (PR #112364)

2024-10-30 Thread Tobias Hieta via llvm-branch-commits


@@ -294,7 +294,11 @@ class Triple {
 
 PAuthTest,
 
-LastEnvironmentType = PAuthTest
+GNUT64,
+GNUEABIT64,
+GNUEABIHFT64,
+
+LastEnvironmentType = GNUEABIHFT64

tru wrote:

Hmm. That looks a bit odd. The Abi checker didn't catch this. Wonder why - 
@tstellar 

https://github.com/llvm/llvm-project/pull/112364
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] AMDGPU/GlobalISel: Add skeletons for new register bank select passes (PR #112862)

2024-10-30 Thread Matt Arsenault via llvm-branch-commits

https://github.com/arsenm approved this pull request.


https://github.com/llvm/llvm-project/pull/112862
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [libcxx] [llvm] [libc++][C++03] Use `__cxx03/` headers in C++03 mode (PR #109002)

2024-10-30 Thread Louis Dionne via llvm-branch-commits

https://github.com/ldionne edited 
https://github.com/llvm/llvm-project/pull/109002
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [libcxx] [llvm] [libc++][C++03] Use `__cxx03/` headers in C++03 mode (PR #109002)

2024-10-30 Thread Louis Dionne via llvm-branch-commits


@@ -1827,232 +1827,147 @@ template 
 
 */
 
-#include <__config>
-
-#include <__algorithm/adjacent_find.h>
-#include <__algorithm/all_of.h>
-#include <__algorithm/any_of.h>
-#include <__algorithm/binary_search.h>
-#include <__algorithm/copy.h>
-#include <__algorithm/copy_backward.h>
-#include <__algorithm/copy_if.h>
-#include <__algorithm/copy_n.h>
-#include <__algorithm/count.h>
-#include <__algorithm/count_if.h>
-#include <__algorithm/equal.h>
-#include <__algorithm/equal_range.h>
-#include <__algorithm/fill.h>
-#include <__algorithm/fill_n.h>
-#include <__algorithm/find.h>
-#include <__algorithm/find_end.h>
-#include <__algorithm/find_first_of.h>
-#include <__algorithm/find_if.h>
-#include <__algorithm/find_if_not.h>
-#include <__algorithm/for_each.h>
-#include <__algorithm/generate.h>
-#include <__algorithm/generate_n.h>
-#include <__algorithm/includes.h>
-#include <__algorithm/inplace_merge.h>
-#include <__algorithm/is_heap.h>
-#include <__algorithm/is_heap_until.h>
-#include <__algorithm/is_partitioned.h>
-#include <__algorithm/is_permutation.h>
-#include <__algorithm/is_sorted.h>
-#include <__algorithm/is_sorted_until.h>
-#include <__algorithm/iter_swap.h>
-#include <__algorithm/lexicographical_compare.h>
-#include <__algorithm/lower_bound.h>
-#include <__algorithm/make_heap.h>
-#include <__algorithm/max.h>
-#include <__algorithm/max_element.h>
-#include <__algorithm/merge.h>
-#include <__algorithm/min.h>
-#include <__algorithm/min_element.h>
-#include <__algorithm/minmax.h>
-#include <__algorithm/minmax_element.h>
-#include <__algorithm/mismatch.h>
-#include <__algorithm/move.h>
-#include <__algorithm/move_backward.h>
-#include <__algorithm/next_permutation.h>
-#include <__algorithm/none_of.h>
-#include <__algorithm/nth_element.h>
-#include <__algorithm/partial_sort.h>
-#include <__algorithm/partial_sort_copy.h>
-#include <__algorithm/partition.h>
-#include <__algorithm/partition_copy.h>
-#include <__algorithm/partition_point.h>
-#include <__algorithm/pop_heap.h>
-#include <__algorithm/prev_permutation.h>
-#include <__algorithm/push_heap.h>
-#include <__algorithm/remove.h>
-#include <__algorithm/remove_copy.h>
-#include <__algorithm/remove_copy_if.h>
-#include <__algorithm/remove_if.h>
-#include <__algorithm/replace.h>
-#include <__algorithm/replace_copy.h>
-#include <__algorithm/replace_copy_if.h>
-#include <__algorithm/replace_if.h>
-#include <__algorithm/reverse.h>
-#include <__algorithm/reverse_copy.h>
-#include <__algorithm/rotate.h>
-#include <__algorithm/rotate_copy.h>
-#include <__algorithm/search.h>
-#include <__algorithm/search_n.h>
-#include <__algorithm/set_difference.h>
-#include <__algorithm/set_intersection.h>
-#include <__algorithm/set_symmetric_difference.h>
-#include <__algorithm/set_union.h>
-#include <__algorithm/shuffle.h>
-#include <__algorithm/sort.h>
-#include <__algorithm/sort_heap.h>
-#include <__algorithm/stable_partition.h>
-#include <__algorithm/stable_sort.h>
-#include <__algorithm/swap_ranges.h>
-#include <__algorithm/transform.h>
-#include <__algorithm/unique.h>
-#include <__algorithm/unique_copy.h>
-#include <__algorithm/upper_bound.h>
-
-#if _LIBCPP_STD_VER >= 17
-#  include <__algorithm/clamp.h>
-#  include <__algorithm/for_each_n.h>
-#  include <__algorithm/pstl.h>
-#  include <__algorithm/sample.h>
-#endif // _LIBCPP_STD_VER >= 17
-
-#if _LIBCPP_STD_VER >= 20
-#  include <__algorithm/in_found_result.h>
-#  include <__algorithm/in_fun_result.h>
-#  include <__algorithm/in_in_out_result.h>
-#  include <__algorithm/in_in_result.h>
-#  include <__algorithm/in_out_out_result.h>
-#  include <__algorithm/in_out_result.h>
-#  include <__algorithm/lexicographical_compare_three_way.h>
-#  include <__algorithm/min_max_result.h>
-#  include <__algorithm/ranges_adjacent_find.h>
-#  include <__algorithm/ranges_all_of.h>
-#  include <__algorithm/ranges_any_of.h>
-#  include <__algorithm/ranges_binary_search.h>
-#  include <__algorithm/ranges_clamp.h>
-#  include <__algorithm/ranges_contains.h>
-#  include <__algorithm/ranges_copy.h>
-#  include <__algorithm/ranges_copy_backward.h>
-#  include <__algorithm/ranges_copy_if.h>
-#  include <__algorithm/ranges_copy_n.h>
-#  include <__algorithm/ranges_count.h>
-#  include <__algorithm/ranges_count_if.h>
-#  include <__algorithm/ranges_equal.h>
-#  include <__algorithm/ranges_equal_range.h>
-#  include <__algorithm/ranges_fill.h>
-#  include <__algorithm/ranges_fill_n.h>
-#  include <__algorithm/ranges_find.h>
-#  include <__algorithm/ranges_find_end.h>
-#  include <__algorithm/ranges_find_first_of.h>
-#  include <__algorithm/ranges_find_if.h>
-#  include <__algorithm/ranges_find_if_not.h>
-#  include <__algorithm/ranges_for_each.h>
-#  include <__algorithm/ranges_for_each_n.h>
-#  include <__algorithm/ranges_generate.h>
-#  include <__algorithm/ranges_generate_n.h>
-#  include <__algorithm/ranges_includes.h>
-#  include <__algorithm/ranges_inplace_merge.h>
-#  include <__algorithm/ranges_is_heap.h>
-#  include <__a

[llvm-branch-commits] [libcxx] [llvm] [libc++][C++03] Use `__cxx03/` headers in C++03 mode (PR #109002)

2024-10-30 Thread Louis Dionne via llvm-branch-commits


@@ -0,0 +1,2 @@
+set(LIBCXX_TEST_PARAMS "std=c++03;test-cxx03-headers=True" CACHE STRING "")

ldionne wrote:

I would probably name this new param `use-frozen-cxx03-headers`. That tells 
what the setting does a bit more explicitly.

https://github.com/llvm/llvm-project/pull/109002
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [libcxx] [llvm] [libc++][C++03] Use `__cxx03/` headers in C++03 mode (PR #109002)

2024-10-30 Thread Louis Dionne via llvm-branch-commits


@@ -587,42 +587,48 @@ template 
 
 */
 
-#include <__config>
-
-#include <__atomic/aliases.h>
-#include <__atomic/atomic.h>
-#include <__atomic/atomic_base.h>
-#include <__atomic/atomic_flag.h>
-#include <__atomic/atomic_init.h>
-#include <__atomic/atomic_lock_free.h>
-#include <__atomic/atomic_sync.h>
-#include <__atomic/check_memory_order.h>
-#include <__atomic/contention_t.h>
-#include <__atomic/cxx_atomic_impl.h>
-#include <__atomic/fence.h>
-#include <__atomic/is_always_lock_free.h>
-#include <__atomic/kill_dependency.h>
-#include <__atomic/memory_order.h>
-#include 
-
-#if _LIBCPP_STD_VER >= 20
-#  include <__atomic/atomic_ref.h>
-#endif
-
-#if !defined(_LIBCPP_HAS_NO_PRAGMA_SYSTEM_HEADER)
-#  pragma GCC system_header
-#endif
-
-#if !_LIBCPP_HAS_ATOMIC_HEADER
-#  error  is not implemented
-#endif
-
-#if !defined(_LIBCPP_REMOVE_TRANSITIVE_INCLUDES) && _LIBCPP_STD_VER <= 20
-#  include 
-#  include 
-#  include 
-#  include 
-#  include 
-#endif
+#include <__configuration/cxx03.h>
+
+#if defined(_LIBCPP_CXX03_LANG) && !defined(_LIBCPP_USE_CXX03_HEADERS)
+#  include <__cxx03/algorithm>

ldionne wrote:

```suggestion
#  include <__cxx03/atomic>
```

Perhaps other occurrences of the same mistake?

https://github.com/llvm/llvm-project/pull/109002
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [libcxx] [llvm] [libc++][C++03] Use `__cxx03/` headers in C++03 mode (PR #109002)

2024-10-30 Thread Louis Dionne via llvm-branch-commits

https://github.com/ldionne commented:

I think this makes sense, however I would like you to post this PR to the RFC, 
since we mentioned back then that the decision of Driver vs conditional 
includes would be best taken based on actual proposed changes. This will give 
this patch a bit more visibility and we can ensure that we have consensus on 
moving forward with this implementation strategy.

https://github.com/llvm/llvm-project/pull/109002
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [libcxx] [llvm] [libc++][C++03] Use `__cxx03/` headers in C++03 mode (PR #109002)

2024-10-30 Thread Louis Dionne via llvm-branch-commits


@@ -14,6 +14,7 @@
 #include <__configuration/abi.h>
 #include <__configuration/availability.h>
 #include <__configuration/compiler.h>
+#include <__configuration/cxx03.h>

ldionne wrote:

Could we move `_LIBCPP_CXX03_LANG` into `language.h` instead? And it might be 
worth making a separate patch since that's so easy to do.

So in many places the includes would become `#include 
<__configuration/language.h>` at the beginning of files.

https://github.com/llvm/llvm-project/pull/109002
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [compiler-rt] release/19.x: [compiler-rt] [test] Fix using toolchains that rely on Clang default configs (#113491) (PR #114229)

2024-10-30 Thread Michał Górny via llvm-branch-commits

https://github.com/mgorny approved this pull request.

LGTM

https://github.com/llvm/llvm-project/pull/114229
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [mlir] [MLIR][OpenMP] Emit descriptive errors for all unsupported clauses (PR #114037)

2024-10-30 Thread Tom Eccles via llvm-branch-commits

https://github.com/tblah approved this pull request.

Thanks for this. Looks great!

https://github.com/llvm/llvm-project/pull/114037
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] 821d9ba - Revert "[GlobalISel] Import samesign flag (#113090)"

2024-10-30 Thread via llvm-branch-commits

Author: Thorsten Schütt
Date: 2024-10-30T17:02:07+01:00
New Revision: 821d9bad7373f8ddc4d1c424c0d806f8c087faa3

URL: 
https://github.com/llvm/llvm-project/commit/821d9bad7373f8ddc4d1c424c0d806f8c087faa3
DIFF: 
https://github.com/llvm/llvm-project/commit/821d9bad7373f8ddc4d1c424c0d806f8c087faa3.diff

LOG: Revert "[GlobalISel] Import samesign flag (#113090)"

This reverts commit 72b115301d1c0d56f40f5030bb8d16f422ac211b.

Added: 


Modified: 
llvm/include/llvm/CodeGen/GlobalISel/GenericMachineInstrs.h
llvm/include/llvm/CodeGen/GlobalISel/MachineIRBuilder.h
llvm/include/llvm/CodeGen/MachineInstr.h
llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp
llvm/lib/CodeGen/GlobalISel/MachineIRBuilder.cpp
llvm/lib/CodeGen/MIRParser/MILexer.cpp
llvm/lib/CodeGen/MIRParser/MILexer.h
llvm/lib/CodeGen/MIRParser/MIParser.cpp
llvm/lib/CodeGen/MIRPrinter.cpp
llvm/lib/CodeGen/MachineInstr.cpp

Removed: 
llvm/test/CodeGen/AArch64/GlobalISel/irtranslator-samesign.ll
llvm/test/CodeGen/MIR/icmp-flags.mir



diff  --git a/llvm/include/llvm/CodeGen/GlobalISel/GenericMachineInstrs.h 
b/llvm/include/llvm/CodeGen/GlobalISel/GenericMachineInstrs.h
index cd7ebcf54c9e1e..b6309a9ea0ec78 100644
--- a/llvm/include/llvm/CodeGen/GlobalISel/GenericMachineInstrs.h
+++ b/llvm/include/llvm/CodeGen/GlobalISel/GenericMachineInstrs.h
@@ -28,7 +28,7 @@ namespace llvm {
 class GenericMachineInstr : public MachineInstr {
   constexpr static unsigned PoisonFlags = NoUWrap | NoSWrap | NoUSWrap |
   IsExact | Disjoint | NonNeg |
-  FmNoNans | FmNoInfs | SameSign;
+  FmNoNans | FmNoInfs;
 
 public:
   GenericMachineInstr() = delete;

diff  --git a/llvm/include/llvm/CodeGen/GlobalISel/MachineIRBuilder.h 
b/llvm/include/llvm/CodeGen/GlobalISel/MachineIRBuilder.h
index a38dd34a17097a..c41e74ec7ebdcc 100644
--- a/llvm/include/llvm/CodeGen/GlobalISel/MachineIRBuilder.h
+++ b/llvm/include/llvm/CodeGen/GlobalISel/MachineIRBuilder.h
@@ -1266,8 +1266,7 @@ class MachineIRBuilder {
   ///
   /// \return a MachineInstrBuilder for the newly created instruction.
   MachineInstrBuilder buildICmp(CmpInst::Predicate Pred, const DstOp &Res,
-const SrcOp &Op0, const SrcOp &Op1,
-std::optional Flags = std::nullopt);
+const SrcOp &Op0, const SrcOp &Op1);
 
   /// Build and insert a \p Res = G_FCMP \p Pred\p Op0, \p Op1
   ///

diff  --git a/llvm/include/llvm/CodeGen/MachineInstr.h 
b/llvm/include/llvm/CodeGen/MachineInstr.h
index ead6bbe1d5f641..36051732474634 100644
--- a/llvm/include/llvm/CodeGen/MachineInstr.h
+++ b/llvm/include/llvm/CodeGen/MachineInstr.h
@@ -119,7 +119,6 @@ class MachineInstr
 Disjoint = 1 << 19,  // Each bit is zero in at least one of the inputs.
 NoUSWrap = 1 << 20,  // Instruction supports geps
  // no unsigned signed wrap.
-SameSign = 1 << 21   // Both operands have the same sign.
   };
 
 private:

diff  --git a/llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp 
b/llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp
index a87754389cc8ed..5381dce58f9e65 100644
--- a/llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp
+++ b/llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp
@@ -340,17 +340,20 @@ bool IRTranslator::translateCompare(const User &U,
   Register Op1 = getOrCreateVReg(*U.getOperand(1));
   Register Res = getOrCreateVReg(U);
   CmpInst::Predicate Pred = CI->getPredicate();
-  uint32_t Flags = MachineInstr::copyFlagsFromInstruction(*CI);
   if (CmpInst::isIntPredicate(Pred))
-MIRBuilder.buildICmp(Pred, Res, Op0, Op1, Flags);
+MIRBuilder.buildICmp(Pred, Res, Op0, Op1);
   else if (Pred == CmpInst::FCMP_FALSE)
 MIRBuilder.buildCopy(
 Res, getOrCreateVReg(*Constant::getNullValue(U.getType(;
   else if (Pred == CmpInst::FCMP_TRUE)
 MIRBuilder.buildCopy(
 Res, getOrCreateVReg(*Constant::getAllOnesValue(U.getType(;
-  else
+  else {
+uint32_t Flags = 0;
+if (CI)
+  Flags = MachineInstr::copyFlagsFromInstruction(*CI);
 MIRBuilder.buildFCmp(Pred, Res, Op0, Op1, Flags);
+  }
 
   return true;
 }

diff  --git a/llvm/lib/CodeGen/GlobalISel/MachineIRBuilder.cpp 
b/llvm/lib/CodeGen/GlobalISel/MachineIRBuilder.cpp
index 15b9164247846c..59f2fc633f5de7 100644
--- a/llvm/lib/CodeGen/GlobalISel/MachineIRBuilder.cpp
+++ b/llvm/lib/CodeGen/GlobalISel/MachineIRBuilder.cpp
@@ -898,9 +898,8 @@ MachineIRBuilder::buildFPTrunc(const DstOp &Res, const 
SrcOp &Op,
 MachineInstrBuilder MachineIRBuilder::buildICmp(CmpInst::Predicate Pred,
 const DstOp &Res,
 const SrcOp &Op0,
-const SrcOp &Op1,
-  

[llvm-branch-commits] [clang] [llvm] [LLVM] [Clang] Backport "Support for Gentoo `*t64` triples (64-bit time_t ABIs)" (PR #112364)

2024-10-30 Thread Alex Rønne Petersen via llvm-branch-commits


@@ -294,7 +294,11 @@ class Triple {
 
 PAuthTest,
 
-LastEnvironmentType = PAuthTest
+GNUT64,
+GNUEABIT64,
+GNUEABIHFT64,
+
+LastEnvironmentType = GNUEABIHFT64

alexrp wrote:

I don't *think* a revert at this stage would make much of a difference since 
the break has already shipped in a release anyway. We'll have to deal with it 
in Zig regardless.

We just need to be a bit more careful with enum changes like this going forward.

https://github.com/llvm/llvm-project/pull/112364
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [flang] [flang][cuda] Data transfer with descriptor (PR #114302)

2024-10-30 Thread Valentin Clement バレンタイン クレメン via llvm-branch-commits

https://github.com/clementval created 
https://github.com/llvm/llvm-project/pull/114302

Use the feature added in #114301 to perform data transfer between data having a 
descriptor. 

>From e4c7e31c77bbfda563e4e2c9b591fe2f5cb2c259 Mon Sep 17 00:00:00 2001
From: Valentin Clement 
Date: Wed, 30 Oct 2024 11:53:12 -0700
Subject: [PATCH] [flang][cuda] Data transfer with descriptor

---
 flang/runtime/CUDA/memory.cpp   | 34 +++--
 flang/unittests/Runtime/CUDA/Memory.cpp | 40 +
 2 files changed, 72 insertions(+), 2 deletions(-)

diff --git a/flang/runtime/CUDA/memory.cpp b/flang/runtime/CUDA/memory.cpp
index 4778a4ae77683f..f25d3b531c84f0 100644
--- a/flang/runtime/CUDA/memory.cpp
+++ b/flang/runtime/CUDA/memory.cpp
@@ -9,10 +9,32 @@
 #include "flang/Runtime/CUDA/memory.h"
 #include "../terminator.h"
 #include "flang/Runtime/CUDA/common.h"
+#include "flang/Runtime/assign.h"
 
 #include "cuda_runtime.h"
 
 namespace Fortran::runtime::cuda {
+static void *MemmoveHostToDevice(
+void *dst, const void *src, std::size_t count) {
+  // TODO: Use cudaMemcpyAsync when we have support for stream.
+  CUDA_REPORT_IF_ERROR(cudaMemcpy(dst, src, count, cudaMemcpyHostToDevice));
+  return dst;
+}
+
+static void *MemmoveDeviceToHost(
+void *dst, const void *src, std::size_t count) {
+  // TODO: Use cudaMemcpyAsync when we have support for stream.
+  CUDA_REPORT_IF_ERROR(cudaMemcpy(dst, src, count, cudaMemcpyDeviceToHost));
+  return dst;
+}
+
+static void *MemmoveDeviceToDevice(
+void *dst, const void *src, std::size_t count) {
+  // TODO: Use cudaMemcpyAsync when we have support for stream.
+  CUDA_REPORT_IF_ERROR(cudaMemcpy(dst, src, count, cudaMemcpyHostToDevice));
+  return dst;
+}
+
 extern "C" {
 
 void *RTDEF(CUFMemAlloc)(
@@ -90,8 +112,16 @@ void RTDEF(CUFDataTransferPtrDesc)(void *addr, Descriptor 
*desc,
 void RTDECL(CUFDataTransferDescDesc)(Descriptor *dstDesc, Descriptor *srcDesc,
 unsigned mode, const char *sourceFile, int sourceLine) {
   Terminator terminator{sourceFile, sourceLine};
-  terminator.Crash(
-  "not yet implemented: CUDA data transfer between two descriptors");
+  MemmoveFct memmoveFct;
+  if (mode == kHostToDevice) {
+memmoveFct = &MemmoveHostToDevice;
+  } else if (mode == kDeviceToHost) {
+memmoveFct = &MemmoveDeviceToHost;
+  } else if (mode == kDeviceToDevice) {
+memmoveFct = &MemmoveDeviceToDevice;
+  }
+  Fortran::runtime::Assign(
+  dstDesc, srcDesc, terminator, MaybeReallocate, memmoveFct);
 }
 }
 } // namespace Fortran::runtime::cuda
diff --git a/flang/unittests/Runtime/CUDA/Memory.cpp 
b/flang/unittests/Runtime/CUDA/Memory.cpp
index 157d3cdb531def..ade05e21b70a89 100644
--- a/flang/unittests/Runtime/CUDA/Memory.cpp
+++ b/flang/unittests/Runtime/CUDA/Memory.cpp
@@ -9,11 +9,17 @@
 #include "flang/Runtime/CUDA/memory.h"
 #include "gtest/gtest.h"
 #include "../../../runtime/terminator.h"
+#include "../tools.h"
 #include "flang/Common/Fortran.h"
+#include "flang/Runtime/CUDA/allocator.h"
 #include "flang/Runtime/CUDA/common.h"
+#include "flang/Runtime/CUDA/descriptor.h"
+#include "flang/Runtime/allocatable.h"
+#include "flang/Runtime/allocator-registry.h"
 
 #include "cuda_runtime.h"
 
+using namespace Fortran::runtime;
 using namespace Fortran::runtime::cuda;
 
 TEST(MemoryCUFTest, SimpleAllocTramsferFree) {
@@ -29,3 +35,37 @@ TEST(MemoryCUFTest, SimpleAllocTramsferFree) {
   EXPECT_EQ(42, host);
   RTNAME(CUFMemFree)((void *)dev, kMemTypeDevice, __FILE__, __LINE__);
 }
+
+static OwningPtr createAllocatable(
+Fortran::common::TypeCategory tc, int kind, int rank = 1) {
+  return Descriptor::Create(TypeCode{tc, kind}, kind, nullptr, rank, nullptr,
+  CFI_attribute_allocatable);
+}
+
+TEST(MemoryCUFTest, CUFDataTransferDescDesc) {
+  using Fortran::common::TypeCategory;
+  RTNAME(CUFRegisterAllocator)();
+  // INTEGER(4), DEVICE, ALLOCATABLE :: a(:)
+  auto dev{createAllocatable(TypeCategory::Integer, 4)};
+  dev->SetAllocIdx(kDeviceAllocatorPos);
+  EXPECT_EQ((int)kDeviceAllocatorPos, dev->GetAllocIdx());
+  RTNAME(AllocatableSetBounds)(*dev, 0, 1, 10);
+  RTNAME(AllocatableAllocate)
+  (*dev, /*hasStat=*/false, /*errMsg=*/nullptr, __FILE__, __LINE__);
+  EXPECT_TRUE(dev->IsAllocated());
+
+  // Create temp array to transfer to device.
+  auto x{MakeArray(std::vector{10},
+  std::vector{0, 1, 2, 3, 4, 5, 6, 7, 8, 9})};
+  RTNAME(CUFDataTransferDescDesc)(*dev, *x, kHostToDevice, __FILE__, __LINE__);
+
+  // Retrieve data from device.
+  auto host{MakeArray(std::vector{10},
+  std::vector{0, 0, 0, 0, 0, 0, 0, 0, 0, 0})};
+  RTNAME(CUFDataTransferDescDesc)(
+  *host, *dev, kDeviceToHost, __FILE__, __LINE__);
+
+  for (unsigned i = 0; i < 10; ++i) {
+EXPECT_EQ(*host->ZeroBasedIndexedElement(i), 
(std::int32_t)i);
+  }
+}

___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch

[llvm-branch-commits] [flang] [flang][cuda] Data transfer with descriptor (PR #114302)

2024-10-30 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-flang-runtime

Author: Valentin Clement (バレンタイン クレメン) (clementval)


Changes

Use the feature added in #114301 to perform data transfer between data 
having a descriptor. 

---
Full diff: https://github.com/llvm/llvm-project/pull/114302.diff


2 Files Affected:

- (modified) flang/runtime/CUDA/memory.cpp (+32-2) 
- (modified) flang/unittests/Runtime/CUDA/Memory.cpp (+40) 


``diff
diff --git a/flang/runtime/CUDA/memory.cpp b/flang/runtime/CUDA/memory.cpp
index 4778a4ae77683f..f25d3b531c84f0 100644
--- a/flang/runtime/CUDA/memory.cpp
+++ b/flang/runtime/CUDA/memory.cpp
@@ -9,10 +9,32 @@
 #include "flang/Runtime/CUDA/memory.h"
 #include "../terminator.h"
 #include "flang/Runtime/CUDA/common.h"
+#include "flang/Runtime/assign.h"
 
 #include "cuda_runtime.h"
 
 namespace Fortran::runtime::cuda {
+static void *MemmoveHostToDevice(
+void *dst, const void *src, std::size_t count) {
+  // TODO: Use cudaMemcpyAsync when we have support for stream.
+  CUDA_REPORT_IF_ERROR(cudaMemcpy(dst, src, count, cudaMemcpyHostToDevice));
+  return dst;
+}
+
+static void *MemmoveDeviceToHost(
+void *dst, const void *src, std::size_t count) {
+  // TODO: Use cudaMemcpyAsync when we have support for stream.
+  CUDA_REPORT_IF_ERROR(cudaMemcpy(dst, src, count, cudaMemcpyDeviceToHost));
+  return dst;
+}
+
+static void *MemmoveDeviceToDevice(
+void *dst, const void *src, std::size_t count) {
+  // TODO: Use cudaMemcpyAsync when we have support for stream.
+  CUDA_REPORT_IF_ERROR(cudaMemcpy(dst, src, count, cudaMemcpyHostToDevice));
+  return dst;
+}
+
 extern "C" {
 
 void *RTDEF(CUFMemAlloc)(
@@ -90,8 +112,16 @@ void RTDEF(CUFDataTransferPtrDesc)(void *addr, Descriptor 
*desc,
 void RTDECL(CUFDataTransferDescDesc)(Descriptor *dstDesc, Descriptor *srcDesc,
 unsigned mode, const char *sourceFile, int sourceLine) {
   Terminator terminator{sourceFile, sourceLine};
-  terminator.Crash(
-  "not yet implemented: CUDA data transfer between two descriptors");
+  MemmoveFct memmoveFct;
+  if (mode == kHostToDevice) {
+memmoveFct = &MemmoveHostToDevice;
+  } else if (mode == kDeviceToHost) {
+memmoveFct = &MemmoveDeviceToHost;
+  } else if (mode == kDeviceToDevice) {
+memmoveFct = &MemmoveDeviceToDevice;
+  }
+  Fortran::runtime::Assign(
+  dstDesc, srcDesc, terminator, MaybeReallocate, memmoveFct);
 }
 }
 } // namespace Fortran::runtime::cuda
diff --git a/flang/unittests/Runtime/CUDA/Memory.cpp 
b/flang/unittests/Runtime/CUDA/Memory.cpp
index 157d3cdb531def..ade05e21b70a89 100644
--- a/flang/unittests/Runtime/CUDA/Memory.cpp
+++ b/flang/unittests/Runtime/CUDA/Memory.cpp
@@ -9,11 +9,17 @@
 #include "flang/Runtime/CUDA/memory.h"
 #include "gtest/gtest.h"
 #include "../../../runtime/terminator.h"
+#include "../tools.h"
 #include "flang/Common/Fortran.h"
+#include "flang/Runtime/CUDA/allocator.h"
 #include "flang/Runtime/CUDA/common.h"
+#include "flang/Runtime/CUDA/descriptor.h"
+#include "flang/Runtime/allocatable.h"
+#include "flang/Runtime/allocator-registry.h"
 
 #include "cuda_runtime.h"
 
+using namespace Fortran::runtime;
 using namespace Fortran::runtime::cuda;
 
 TEST(MemoryCUFTest, SimpleAllocTramsferFree) {
@@ -29,3 +35,37 @@ TEST(MemoryCUFTest, SimpleAllocTramsferFree) {
   EXPECT_EQ(42, host);
   RTNAME(CUFMemFree)((void *)dev, kMemTypeDevice, __FILE__, __LINE__);
 }
+
+static OwningPtr createAllocatable(
+Fortran::common::TypeCategory tc, int kind, int rank = 1) {
+  return Descriptor::Create(TypeCode{tc, kind}, kind, nullptr, rank, nullptr,
+  CFI_attribute_allocatable);
+}
+
+TEST(MemoryCUFTest, CUFDataTransferDescDesc) {
+  using Fortran::common::TypeCategory;
+  RTNAME(CUFRegisterAllocator)();
+  // INTEGER(4), DEVICE, ALLOCATABLE :: a(:)
+  auto dev{createAllocatable(TypeCategory::Integer, 4)};
+  dev->SetAllocIdx(kDeviceAllocatorPos);
+  EXPECT_EQ((int)kDeviceAllocatorPos, dev->GetAllocIdx());
+  RTNAME(AllocatableSetBounds)(*dev, 0, 1, 10);
+  RTNAME(AllocatableAllocate)
+  (*dev, /*hasStat=*/false, /*errMsg=*/nullptr, __FILE__, __LINE__);
+  EXPECT_TRUE(dev->IsAllocated());
+
+  // Create temp array to transfer to device.
+  auto x{MakeArray(std::vector{10},
+  std::vector{0, 1, 2, 3, 4, 5, 6, 7, 8, 9})};
+  RTNAME(CUFDataTransferDescDesc)(*dev, *x, kHostToDevice, __FILE__, __LINE__);
+
+  // Retrieve data from device.
+  auto host{MakeArray(std::vector{10},
+  std::vector{0, 0, 0, 0, 0, 0, 0, 0, 0, 0})};
+  RTNAME(CUFDataTransferDescDesc)(
+  *host, *dev, kDeviceToHost, __FILE__, __LINE__);
+
+  for (unsigned i = 0; i < 10; ++i) {
+EXPECT_EQ(*host->ZeroBasedIndexedElement(i), 
(std::int32_t)i);
+  }
+}

``




https://github.com/llvm/llvm-project/pull/114302
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [flang] [flang][cuda] Data transfer with descriptor (PR #114302)

2024-10-30 Thread via llvm-branch-commits

github-actions[bot] wrote:




:warning: C/C++ code formatter, clang-format found issues in your code. 
:warning:



You can test this locally with the following command:


``bash
git-clang-format --diff a1f81d24c44da15071196faa9e8d466bcbbd7e97 
e4c7e31c77bbfda563e4e2c9b591fe2f5cb2c259 --extensions cpp -- 
flang/runtime/CUDA/memory.cpp flang/unittests/Runtime/CUDA/Memory.cpp
``





View the diff from clang-format here.


``diff
diff --git a/flang/unittests/Runtime/CUDA/Memory.cpp 
b/flang/unittests/Runtime/CUDA/Memory.cpp
index ade05e21b7..3765fbbb7b 100644
--- a/flang/unittests/Runtime/CUDA/Memory.cpp
+++ b/flang/unittests/Runtime/CUDA/Memory.cpp
@@ -62,8 +62,8 @@ TEST(MemoryCUFTest, CUFDataTransferDescDesc) {
   // Retrieve data from device.
   auto host{MakeArray(std::vector{10},
   std::vector{0, 0, 0, 0, 0, 0, 0, 0, 0, 0})};
-  RTNAME(CUFDataTransferDescDesc)(
-  *host, *dev, kDeviceToHost, __FILE__, __LINE__);
+  RTNAME(CUFDataTransferDescDesc)
+  (*host, *dev, kDeviceToHost, __FILE__, __LINE__);
 
   for (unsigned i = 0; i < 10; ++i) {
 EXPECT_EQ(*host->ZeroBasedIndexedElement(i), 
(std::int32_t)i);

``




https://github.com/llvm/llvm-project/pull/114302
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [PAC][CodeGen][ELF][AArch64] Support signed GOT with tiny code model (PR #113812)

2024-10-30 Thread Paul Kirth via llvm-branch-commits

https://github.com/ilovepi commented:

Overall, this seems fine as a translation of the spec. I still need to take 
another pass over the tests to be sure, but they seem appropriate so far. 

All that said, I'm far from an expert on Aarch64 minutiae, so a more 
experienced reviewer here will have to provide any final approval.

https://github.com/llvm/llvm-project/pull/113812
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [PAC][CodeGen][ELF][AArch64] Support signed GOT with tiny code model (PR #113812)

2024-10-30 Thread Paul Kirth via llvm-branch-commits


@@ -2277,28 +2277,40 @@ void AArch64AsmPrinter::LowerLOADgotAUTH(const 
MachineInstr &MI) {
   const MachineOperand &GAMO = MI.getOperand(1);
   assert(GAMO.getOffset() == 0);
 
-  MachineOperand GAHiOp(GAMO);
-  MachineOperand GALoOp(GAMO);
-  GAHiOp.addTargetFlag(AArch64II::MO_PAGE);
-  GALoOp.addTargetFlag(AArch64II::MO_PAGEOFF | AArch64II::MO_NC);
+  if (MI.getParent()->getParent()->getTarget().getCodeModel() ==
+  CodeModel::Tiny) {

ilovepi wrote:

```suggestion
  if (MI.getMF()->getTarget().getCodeModel() == CodeModel::Tiny) {
```
Not sure if there's an easier way to get to the target from MI off the top of 
my head, but you can get the MachineFunction w/o going through `getParent`

https://github.com/llvm/llvm-project/pull/113812
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [PAC][CodeGen][ELF][AArch64] Support signed GOT with tiny code model (PR #113812)

2024-10-30 Thread Paul Kirth via llvm-branch-commits

https://github.com/ilovepi edited 
https://github.com/llvm/llvm-project/pull/113812
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] [llvm] [LLVM] [Clang] Backport "Support for Gentoo `*t64` triples (64-bit time_t ABIs)" (PR #112364)

2024-10-30 Thread Tobias Hieta via llvm-branch-commits


@@ -294,7 +294,11 @@ class Triple {
 
 PAuthTest,
 
-LastEnvironmentType = PAuthTest
+GNUT64,
+GNUEABIT64,
+GNUEABIHFT64,
+
+LastEnvironmentType = GNUEABIHFT64

tru wrote:

I want to understand why the Abi checker didn't flag this. But yes - I will 
have to read the diffs much more carefully until we know we can trust it.

Maybe it would be enough to have an action that flags any changes to the public 
include files so that it's not easily missed. 

https://github.com/llvm/llvm-project/pull/112364
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [mlir] [MLIR][OpenMP] Emit descriptive errors for all unsupported clauses (PR #114037)

2024-10-30 Thread Michael Kruse via llvm-branch-commits

https://github.com/Meinersbur approved this pull request.

LGTM

https://github.com/llvm/llvm-project/pull/114037
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [compiler-rt] release/19.x: [compiler-rt] [test] Fix using toolchains that rely on Clang default configs (#113491) (PR #114229)

2024-10-30 Thread Sam James via llvm-branch-commits

https://github.com/thesamesam approved this pull request.


https://github.com/llvm/llvm-project/pull/114229
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] MachineUniformityAnalysis: Improve isConstantOrUndefValuePhi (PR #112866)

2024-10-30 Thread Petar Avramovic via llvm-branch-commits

https://github.com/petar-avramovic updated 
https://github.com/llvm/llvm-project/pull/112866

>From 98a00e5a2ed28da3a4608d9c209a04f0cff6fe12 Mon Sep 17 00:00:00 2001
From: Petar Avramovic 
Date: Wed, 30 Oct 2024 15:41:59 +0100
Subject: [PATCH] MachineUniformityAnalysis: Improve isConstantOrUndefValuePhi

Change existing code for G_PHI to match what LLVM-IR version is doing
via PHINode::hasConstantOrUndefValue. This is not safe for regular PHI
since it may appear with an undef operand and getVRegDef can fail.
Most notably this improves number of values that can be allocated
to sgpr register bank in AMDGPURegBankSelect.
Common case here are phis that appear in structurize-cfg lowering
for cycles with multiple exits:
Undef incoming value is coming from block that reached cycle exit
condition, if other incoming is uniform keep the phi uniform despite
the fact it is joining values from pair of blocks that are entered
via divergent condition branch.
---
 llvm/lib/CodeGen/MachineSSAContext.cpp| 27 +-
 .../AMDGPU/MIR/hidden-diverge-gmir.mir| 28 +++
 .../AMDGPU/MIR/hidden-loop-diverge.mir|  4 +-
 .../AMDGPU/MIR/uses-value-from-cycle.mir  |  8 +-
 .../GlobalISel/divergence-structurizer.mir| 80 --
 .../regbankselect-mui-regbanklegalize.mir | 69 ---
 .../regbankselect-mui-regbankselect.mir   | 18 ++--
 .../AMDGPU/GlobalISel/regbankselect-mui.ll| 84 ++-
 .../AMDGPU/GlobalISel/regbankselect-mui.mir   | 51 ++-
 9 files changed, 191 insertions(+), 178 deletions(-)

diff --git a/llvm/lib/CodeGen/MachineSSAContext.cpp 
b/llvm/lib/CodeGen/MachineSSAContext.cpp
index e384187b6e8593..8e13c0916dd9e1 100644
--- a/llvm/lib/CodeGen/MachineSSAContext.cpp
+++ b/llvm/lib/CodeGen/MachineSSAContext.cpp
@@ -54,9 +54,34 @@ const MachineBasicBlock 
*MachineSSAContext::getDefBlock(Register value) const {
   return F->getRegInfo().getVRegDef(value)->getParent();
 }
 
+static bool isUndef(const MachineInstr &MI) {
+  return MI.getOpcode() == TargetOpcode::G_IMPLICIT_DEF ||
+ MI.getOpcode() == TargetOpcode::IMPLICIT_DEF;
+}
+
+/// MachineInstr equivalent of PHINode::hasConstantOrUndefValue() for G_PHI.
 template <>
 bool MachineSSAContext::isConstantOrUndefValuePhi(const MachineInstr &Phi) {
-  return Phi.isConstantValuePHI();
+  if (!Phi.isPHI())
+return false;
+
+  // In later passes PHI may appear with an undef operand, getVRegDef can fail.
+  if (Phi.getOpcode() == TargetOpcode::PHI)
+return Phi.isConstantValuePHI();
+
+  // For G_PHI we do equivalent of PHINode::hasConstantOrUndefValue().
+  const MachineRegisterInfo &MRI = Phi.getMF()->getRegInfo();
+  Register This = Phi.getOperand(0).getReg();
+  Register ConstantValue;
+  for (unsigned i = 1, e = Phi.getNumOperands(); i < e; i += 2) {
+Register Incoming = Phi.getOperand(i).getReg();
+if (Incoming != This && !isUndef(*MRI.getVRegDef(Incoming))) {
+  if (ConstantValue && ConstantValue != Incoming)
+return false;
+  ConstantValue = Incoming;
+}
+  }
+  return true;
 }
 
 template <>
diff --git 
a/llvm/test/Analysis/UniformityAnalysis/AMDGPU/MIR/hidden-diverge-gmir.mir 
b/llvm/test/Analysis/UniformityAnalysis/AMDGPU/MIR/hidden-diverge-gmir.mir
index ce00edf3363f77..9694a340b5e906 100644
--- a/llvm/test/Analysis/UniformityAnalysis/AMDGPU/MIR/hidden-diverge-gmir.mir
+++ b/llvm/test/Analysis/UniformityAnalysis/AMDGPU/MIR/hidden-diverge-gmir.mir
@@ -1,24 +1,24 @@
 # RUN: llc -mtriple=amdgcn-- -run-pass=print-machine-uniformity -o - %s 2>&1 | 
FileCheck %s
 # CHECK-LABEL: MachineUniformityInfo for function: hidden_diverge
 # CHECK-LABEL: BLOCK bb.0
-# CHECK: DIVERGENT: %{{[0-9]*}}: %{{[0-9]*}}:_(s32) = G_INTRINSIC 
intrinsic(@llvm.amdgcn.workitem.id.x)
-# CHECK: DIVERGENT: %{{[0-9]*}}: %{{[0-9]*}}:_(s1) = G_ICMP intpred(slt)
-# CHECK: DIVERGENT: %{{[0-9]*}}: %{{[0-9]*}}:_(s1) = G_XOR %{{[0-9]*}}:_, 
%{{[0-9]*}}:_
-# CHECK: DIVERGENT: %{{[0-9]*}}: %{{[0-9]*}}:_(s1), %{{[0-9]*}}:_(s64) = 
G_INTRINSIC_W_SIDE_EFFECTS intrinsic(@llvm.amdgcn.if)
-# CHECK: DIVERGENT: %{{[0-9]*}}: %{{[0-9]*}}:_(s1), %{{[0-9]*}}:_(s64) = 
G_INTRINSIC_W_SIDE_EFFECTS intrinsic(@llvm.amdgcn.if)
-# CHECK: DIVERGENT: G_BRCOND %{{[0-9]*}}:_(s1), %bb.1
-# CHECK: DIVERGENT: G_BR %bb.2
+# CHECK: DIVERGENT: %{{[0-9]*}}: %{{[0-9]*}}:_(s32) = G_INTRINSIC 
intrinsic(@llvm.amdgcn.workitem.id.x)
+# CHECK: DIVERGENT: %{{[0-9]*}}: %{{[0-9]*}}:_(s1) = G_ICMP intpred(slt)
+# CHECK: DIVERGENT: %{{[0-9]*}}: %{{[0-9]*}}:_(s1) = G_XOR %{{[0-9]*}}:_, 
%{{[0-9]*}}:_
+# CHECK: DIVERGENT: %{{[0-9]*}}: %{{[0-9]*}}:_(s1), %{{[0-9]*}}:_(s64) = 
G_INTRINSIC_W_SIDE_EFFECTS intrinsic(@llvm.amdgcn.if)
+# CHECK: DIVERGENT: %{{[0-9]*}}: %{{[0-9]*}}:_(s1), %{{[0-9]*}}:_(s64) = 
G_INTRINSIC_W_SIDE_EFFECTS intrinsic(@llvm.amdgcn.if)
+# CHECK: DIVERGENT: G_BRCOND %{{[0-9]*}}:_(s1), %bb.1
+# CHECK: DIVERGENT: G_BR %bb.2
 # CHECK-LABEL: BLOCK bb.1
 # CHECK-LABEL: BLOCK bb.2
-# CHECK: D

[llvm-branch-commits] [llvm] AMDGPU/GlobalISel: RegBankLegalize rules for load (PR #112882)

2024-10-30 Thread Petar Avramovic via llvm-branch-commits

https://github.com/petar-avramovic updated 
https://github.com/llvm/llvm-project/pull/112882

>From 4675f79f28222cef60d1607acb1b682ca3363eb6 Mon Sep 17 00:00:00 2001
From: Petar Avramovic 
Date: Wed, 30 Oct 2024 15:37:59 +0100
Subject: [PATCH] AMDGPU/GlobalISel: RegBankLegalize rules for load

Add IDs for bit width that cover multiple LLTs: B32 B64 etc.
"Predicate" wrapper class for bool predicate functions used to
write pretty rules. Predicates can be combined using &&, || and !.
Lowering for splitting and widening loads.
Write rules for loads to not change existing mir tests from old
regbankselect.
---
 .../AMDGPU/AMDGPURegBankLegalizeHelper.cpp| 287 +++-
 .../AMDGPU/AMDGPURegBankLegalizeHelper.h  |   5 +
 .../AMDGPU/AMDGPURegBankLegalizeRules.cpp | 309 -
 .../AMDGPU/AMDGPURegBankLegalizeRules.h   |  65 +++-
 .../AMDGPU/GlobalISel/regbankselect-load.mir  | 320 +++---
 .../GlobalISel/regbankselect-zextload.mir |   9 +-
 6 files changed, 929 insertions(+), 66 deletions(-)

diff --git a/llvm/lib/Target/AMDGPU/AMDGPURegBankLegalizeHelper.cpp 
b/llvm/lib/Target/AMDGPU/AMDGPURegBankLegalizeHelper.cpp
index 15ccf1a38af9a5..19d8d466e3b12e 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPURegBankLegalizeHelper.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPURegBankLegalizeHelper.cpp
@@ -36,6 +36,83 @@ void 
RegBankLegalizeHelper::findRuleAndApplyMapping(MachineInstr &MI) {
   lower(MI, Mapping, WaterfallSgprs);
 }
 
+void RegBankLegalizeHelper::splitLoad(MachineInstr &MI,
+  ArrayRef LLTBreakdown, LLT MergeTy) 
{
+  MachineFunction &MF = B.getMF();
+  assert(MI.getNumMemOperands() == 1);
+  MachineMemOperand &BaseMMO = **MI.memoperands_begin();
+  Register Dst = MI.getOperand(0).getReg();
+  const RegisterBank *DstRB = MRI.getRegBankOrNull(Dst);
+  Register Base = MI.getOperand(1).getReg();
+  LLT PtrTy = MRI.getType(Base);
+  const RegisterBank *PtrRB = MRI.getRegBankOrNull(Base);
+  LLT OffsetTy = LLT::scalar(PtrTy.getSizeInBits());
+  SmallVector LoadPartRegs;
+
+  unsigned ByteOffset = 0;
+  for (LLT PartTy : LLTBreakdown) {
+Register BasePlusOffset;
+if (ByteOffset == 0) {
+  BasePlusOffset = Base;
+} else {
+  auto Offset = B.buildConstant({PtrRB, OffsetTy}, ByteOffset);
+  BasePlusOffset = B.buildPtrAdd({PtrRB, PtrTy}, Base, Offset).getReg(0);
+}
+auto *OffsetMMO = MF.getMachineMemOperand(&BaseMMO, ByteOffset, PartTy);
+auto LoadPart = B.buildLoad({DstRB, PartTy}, BasePlusOffset, *OffsetMMO);
+LoadPartRegs.push_back(LoadPart.getReg(0));
+ByteOffset += PartTy.getSizeInBytes();
+  }
+
+  if (!MergeTy.isValid()) {
+// Loads are of same size, concat or merge them together.
+B.buildMergeLikeInstr(Dst, LoadPartRegs);
+  } else {
+// Loads are not all of same size, need to unmerge them to smaller pieces
+// of MergeTy type, then merge pieces to Dst.
+SmallVector MergeTyParts;
+for (Register Reg : LoadPartRegs) {
+  if (MRI.getType(Reg) == MergeTy) {
+MergeTyParts.push_back(Reg);
+  } else {
+auto Unmerge = B.buildUnmerge({DstRB, MergeTy}, Reg);
+for (unsigned i = 0; i < Unmerge->getNumOperands() - 1; ++i)
+  MergeTyParts.push_back(Unmerge.getReg(i));
+  }
+}
+B.buildMergeLikeInstr(Dst, MergeTyParts);
+  }
+  MI.eraseFromParent();
+}
+
+void RegBankLegalizeHelper::widenLoad(MachineInstr &MI, LLT WideTy,
+  LLT MergeTy) {
+  MachineFunction &MF = B.getMF();
+  assert(MI.getNumMemOperands() == 1);
+  MachineMemOperand &BaseMMO = **MI.memoperands_begin();
+  Register Dst = MI.getOperand(0).getReg();
+  const RegisterBank *DstRB = MRI.getRegBankOrNull(Dst);
+  Register Base = MI.getOperand(1).getReg();
+
+  MachineMemOperand *WideMMO = MF.getMachineMemOperand(&BaseMMO, 0, WideTy);
+  auto WideLoad = B.buildLoad({DstRB, WideTy}, Base, *WideMMO);
+
+  if (WideTy.isScalar()) {
+B.buildTrunc(Dst, WideLoad);
+  } else {
+SmallVector MergeTyParts;
+auto Unmerge = B.buildUnmerge({DstRB, MergeTy}, WideLoad);
+
+LLT DstTy = MRI.getType(Dst);
+unsigned NumElts = DstTy.getSizeInBits() / MergeTy.getSizeInBits();
+for (unsigned i = 0; i < NumElts; ++i) {
+  MergeTyParts.push_back(Unmerge.getReg(i));
+}
+B.buildMergeLikeInstr(Dst, MergeTyParts);
+  }
+  MI.eraseFromParent();
+}
+
 void RegBankLegalizeHelper::lower(MachineInstr &MI,
   const RegBankLLTMapping &Mapping,
   SmallSet &WaterfallSgprs) {
@@ -114,6 +191,50 @@ void RegBankLegalizeHelper::lower(MachineInstr &MI,
 MI.eraseFromParent();
 break;
   }
+  case SplitLoad: {
+LLT DstTy = MRI.getType(MI.getOperand(0).getReg());
+unsigned Size = DstTy.getSizeInBits();
+// Even split to 128-bit loads
+if (Size > 128) {
+  LLT B128;
+  if (DstTy.isVector()) {
+LLT EltTy = DstTy.getElementType();
+B128 = LLT:

[llvm-branch-commits] [flang] [flang][cuda] Data transfer with descriptor (PR #114302)

2024-10-30 Thread Renaud Kauffmann via llvm-branch-commits


@@ -9,10 +9,32 @@
 #include "flang/Runtime/CUDA/memory.h"
 #include "../terminator.h"
 #include "flang/Runtime/CUDA/common.h"
+#include "flang/Runtime/assign.h"
 
 #include "cuda_runtime.h"
 
 namespace Fortran::runtime::cuda {
+static void *MemmoveHostToDevice(
+void *dst, const void *src, std::size_t count) {
+  // TODO: Use cudaMemcpyAsync when we have support for stream.
+  CUDA_REPORT_IF_ERROR(cudaMemcpy(dst, src, count, cudaMemcpyHostToDevice));
+  return dst;
+}
+
+static void *MemmoveDeviceToHost(
+void *dst, const void *src, std::size_t count) {
+  // TODO: Use cudaMemcpyAsync when we have support for stream.
+  CUDA_REPORT_IF_ERROR(cudaMemcpy(dst, src, count, cudaMemcpyDeviceToHost));
+  return dst;
+}
+
+static void *MemmoveDeviceToDevice(
+void *dst, const void *src, std::size_t count) {
+  // TODO: Use cudaMemcpyAsync when we have support for stream.
+  CUDA_REPORT_IF_ERROR(cudaMemcpy(dst, src, count, cudaMemcpyHostToDevice));

Renaud-K wrote:

DeviceToDevice?

https://github.com/llvm/llvm-project/pull/114302
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] ValueTracking: Allow getUnderlyingObject to look at vectors (PR #114311)

2024-10-30 Thread Matt Arsenault via llvm-branch-commits

https://github.com/arsenm created 
https://github.com/llvm/llvm-project/pull/114311

We can identify some easy vector of pointer cases, such as
a getelementptr with a scalar base.

>From 6e8c27a281929c1c1941960e0134122c3ccf77b8 Mon Sep 17 00:00:00 2001
From: Matt Arsenault 
Date: Tue, 29 Oct 2024 15:30:51 -0700
Subject: [PATCH] ValueTracking: Allow getUnderlyingObject to look at vectors

We can identify some easy vector of pointer cases, such as
a getelementptr with a scalar base.
---
 llvm/lib/Analysis/ValueTracking.cpp   |  4 +--
 .../AMDGPU/promote-alloca-vector-gep.ll   | 27 ++-
 2 files changed, 22 insertions(+), 9 deletions(-)

diff --git a/llvm/lib/Analysis/ValueTracking.cpp 
b/llvm/lib/Analysis/ValueTracking.cpp
index aa5142f3362409..dd7cac9f4fc325 100644
--- a/llvm/lib/Analysis/ValueTracking.cpp
+++ b/llvm/lib/Analysis/ValueTracking.cpp
@@ -6686,11 +6686,11 @@ static bool isSameUnderlyingObjectInLoop(const PHINode 
*PN,
 }
 
 const Value *llvm::getUnderlyingObject(const Value *V, unsigned MaxLookup) {
-  if (!V->getType()->isPointerTy())
-return V;
   for (unsigned Count = 0; MaxLookup == 0 || Count < MaxLookup; ++Count) {
 if (auto *GEP = dyn_cast(V)) {
   V = GEP->getPointerOperand();
+  if (!V->getType()->isPointerTy()) // Only handle scalar pointer base.
+return nullptr;
 } else if (Operator::getOpcode(V) == Instruction::BitCast ||
Operator::getOpcode(V) == Instruction::AddrSpaceCast) {
   Value *NewV = cast(V)->getOperand(0);
diff --git a/llvm/test/CodeGen/AMDGPU/promote-alloca-vector-gep.ll 
b/llvm/test/CodeGen/AMDGPU/promote-alloca-vector-gep.ll
index 355a2b8796b24d..76e1868b3c4b9e 100644
--- a/llvm/test/CodeGen/AMDGPU/promote-alloca-vector-gep.ll
+++ b/llvm/test/CodeGen/AMDGPU/promote-alloca-vector-gep.ll
@@ -35,17 +35,30 @@ bb:
   ret void
 }
 
-; TODO: Should be able to promote this
 define amdgpu_kernel void @scalar_alloca_ptr_with_vector_gep_offset_select(i1 
%cond) {
 ; CHECK-LABEL: define amdgpu_kernel void 
@scalar_alloca_ptr_with_vector_gep_offset_select(
 ; CHECK-SAME: i1 [[COND:%.*]]) {
 ; CHECK-NEXT:  [[BB:.*:]]
-; CHECK-NEXT:[[ALLOCA:%.*]] = alloca [4 x i32], align 4, addrspace(5)
-; CHECK-NEXT:[[GETELEMENTPTR0:%.*]] = getelementptr inbounds i8, ptr 
addrspace(5) [[ALLOCA]], <4 x i64> 
-; CHECK-NEXT:[[GETELEMENTPTR1:%.*]] = getelementptr inbounds i8, ptr 
addrspace(5) [[ALLOCA]], <4 x i64> 
-; CHECK-NEXT:[[SELECT:%.*]] = select i1 [[COND]], <4 x ptr addrspace(5)> 
[[GETELEMENTPTR0]], <4 x ptr addrspace(5)> [[GETELEMENTPTR1]]
-; CHECK-NEXT:[[EXTRACTELEMENT:%.*]] = extractelement <4 x ptr 
addrspace(5)> [[SELECT]], i64 1
-; CHECK-NEXT:store i32 0, ptr addrspace(5) [[EXTRACTELEMENT]], align 4
+; CHECK-NEXT:[[TMP0:%.*]] = call noalias nonnull dereferenceable(64) ptr 
addrspace(4) @llvm.amdgcn.dispatch.ptr()
+; CHECK-NEXT:[[TMP1:%.*]] = getelementptr inbounds i32, ptr addrspace(4) 
[[TMP0]], i64 1
+; CHECK-NEXT:[[TMP2:%.*]] = load i32, ptr addrspace(4) [[TMP1]], align 4, 
!invariant.load [[META0]]
+; CHECK-NEXT:[[TMP3:%.*]] = getelementptr inbounds i32, ptr addrspace(4) 
[[TMP0]], i64 2
+; CHECK-NEXT:[[TMP4:%.*]] = load i32, ptr addrspace(4) [[TMP3]], align 4, 
!range [[RNG1]], !invariant.load [[META0]]
+; CHECK-NEXT:[[TMP5:%.*]] = lshr i32 [[TMP2]], 16
+; CHECK-NEXT:[[TMP6:%.*]] = call range(i32 0, 1024) i32 
@llvm.amdgcn.workitem.id.x()
+; CHECK-NEXT:[[TMP7:%.*]] = call range(i32 0, 1024) i32 
@llvm.amdgcn.workitem.id.y()
+; CHECK-NEXT:[[TMP8:%.*]] = call range(i32 0, 1024) i32 
@llvm.amdgcn.workitem.id.z()
+; CHECK-NEXT:[[TMP9:%.*]] = mul nuw nsw i32 [[TMP5]], [[TMP4]]
+; CHECK-NEXT:[[TMP10:%.*]] = mul i32 [[TMP9]], [[TMP6]]
+; CHECK-NEXT:[[TMP11:%.*]] = mul nuw nsw i32 [[TMP7]], [[TMP4]]
+; CHECK-NEXT:[[TMP12:%.*]] = add i32 [[TMP10]], [[TMP11]]
+; CHECK-NEXT:[[TMP13:%.*]] = add i32 [[TMP12]], [[TMP8]]
+; CHECK-NEXT:[[TMP14:%.*]] = getelementptr inbounds [1024 x [4 x i32]], 
ptr addrspace(3) @scalar_alloca_ptr_with_vector_gep_offset_select.alloca, i32 
0, i32 [[TMP13]]
+; CHECK-NEXT:[[GETELEMENTPTR0:%.*]] = getelementptr inbounds i8, ptr 
addrspace(3) [[TMP14]], <4 x i64> 
+; CHECK-NEXT:[[GETELEMENTPTR1:%.*]] = getelementptr inbounds i8, ptr 
addrspace(3) [[TMP14]], <4 x i64> 
+; CHECK-NEXT:[[SELECT:%.*]] = select i1 [[COND]], <4 x ptr addrspace(3)> 
[[GETELEMENTPTR0]], <4 x ptr addrspace(3)> [[GETELEMENTPTR1]]
+; CHECK-NEXT:[[EXTRACTELEMENT:%.*]] = extractelement <4 x ptr 
addrspace(3)> [[SELECT]], i64 1
+; CHECK-NEXT:store i32 0, ptr addrspace(3) [[EXTRACTELEMENT]], align 4
 ; CHECK-NEXT:ret void
 ;
 bb:

___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] ValueTracking: Allow getUnderlyingObject to look at vectors (PR #114311)

2024-10-30 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-llvm-analysis

Author: Matt Arsenault (arsenm)


Changes

We can identify some easy vector of pointer cases, such as
a getelementptr with a scalar base.

---
Full diff: https://github.com/llvm/llvm-project/pull/114311.diff


2 Files Affected:

- (modified) llvm/lib/Analysis/ValueTracking.cpp (+2-2) 
- (modified) llvm/test/CodeGen/AMDGPU/promote-alloca-vector-gep.ll (+20-7) 


``diff
diff --git a/llvm/lib/Analysis/ValueTracking.cpp 
b/llvm/lib/Analysis/ValueTracking.cpp
index aa5142f33624099..dd7cac9f4fc3259 100644
--- a/llvm/lib/Analysis/ValueTracking.cpp
+++ b/llvm/lib/Analysis/ValueTracking.cpp
@@ -6686,11 +6686,11 @@ static bool isSameUnderlyingObjectInLoop(const PHINode 
*PN,
 }
 
 const Value *llvm::getUnderlyingObject(const Value *V, unsigned MaxLookup) {
-  if (!V->getType()->isPointerTy())
-return V;
   for (unsigned Count = 0; MaxLookup == 0 || Count < MaxLookup; ++Count) {
 if (auto *GEP = dyn_cast(V)) {
   V = GEP->getPointerOperand();
+  if (!V->getType()->isPointerTy()) // Only handle scalar pointer base.
+return nullptr;
 } else if (Operator::getOpcode(V) == Instruction::BitCast ||
Operator::getOpcode(V) == Instruction::AddrSpaceCast) {
   Value *NewV = cast(V)->getOperand(0);
diff --git a/llvm/test/CodeGen/AMDGPU/promote-alloca-vector-gep.ll 
b/llvm/test/CodeGen/AMDGPU/promote-alloca-vector-gep.ll
index 355a2b8796b24dc..76e1868b3c4b9e8 100644
--- a/llvm/test/CodeGen/AMDGPU/promote-alloca-vector-gep.ll
+++ b/llvm/test/CodeGen/AMDGPU/promote-alloca-vector-gep.ll
@@ -35,17 +35,30 @@ bb:
   ret void
 }
 
-; TODO: Should be able to promote this
 define amdgpu_kernel void @scalar_alloca_ptr_with_vector_gep_offset_select(i1 
%cond) {
 ; CHECK-LABEL: define amdgpu_kernel void 
@scalar_alloca_ptr_with_vector_gep_offset_select(
 ; CHECK-SAME: i1 [[COND:%.*]]) {
 ; CHECK-NEXT:  [[BB:.*:]]
-; CHECK-NEXT:[[ALLOCA:%.*]] = alloca [4 x i32], align 4, addrspace(5)
-; CHECK-NEXT:[[GETELEMENTPTR0:%.*]] = getelementptr inbounds i8, ptr 
addrspace(5) [[ALLOCA]], <4 x i64> 
-; CHECK-NEXT:[[GETELEMENTPTR1:%.*]] = getelementptr inbounds i8, ptr 
addrspace(5) [[ALLOCA]], <4 x i64> 
-; CHECK-NEXT:[[SELECT:%.*]] = select i1 [[COND]], <4 x ptr addrspace(5)> 
[[GETELEMENTPTR0]], <4 x ptr addrspace(5)> [[GETELEMENTPTR1]]
-; CHECK-NEXT:[[EXTRACTELEMENT:%.*]] = extractelement <4 x ptr 
addrspace(5)> [[SELECT]], i64 1
-; CHECK-NEXT:store i32 0, ptr addrspace(5) [[EXTRACTELEMENT]], align 4
+; CHECK-NEXT:[[TMP0:%.*]] = call noalias nonnull dereferenceable(64) ptr 
addrspace(4) @llvm.amdgcn.dispatch.ptr()
+; CHECK-NEXT:[[TMP1:%.*]] = getelementptr inbounds i32, ptr addrspace(4) 
[[TMP0]], i64 1
+; CHECK-NEXT:[[TMP2:%.*]] = load i32, ptr addrspace(4) [[TMP1]], align 4, 
!invariant.load [[META0]]
+; CHECK-NEXT:[[TMP3:%.*]] = getelementptr inbounds i32, ptr addrspace(4) 
[[TMP0]], i64 2
+; CHECK-NEXT:[[TMP4:%.*]] = load i32, ptr addrspace(4) [[TMP3]], align 4, 
!range [[RNG1]], !invariant.load [[META0]]
+; CHECK-NEXT:[[TMP5:%.*]] = lshr i32 [[TMP2]], 16
+; CHECK-NEXT:[[TMP6:%.*]] = call range(i32 0, 1024) i32 
@llvm.amdgcn.workitem.id.x()
+; CHECK-NEXT:[[TMP7:%.*]] = call range(i32 0, 1024) i32 
@llvm.amdgcn.workitem.id.y()
+; CHECK-NEXT:[[TMP8:%.*]] = call range(i32 0, 1024) i32 
@llvm.amdgcn.workitem.id.z()
+; CHECK-NEXT:[[TMP9:%.*]] = mul nuw nsw i32 [[TMP5]], [[TMP4]]
+; CHECK-NEXT:[[TMP10:%.*]] = mul i32 [[TMP9]], [[TMP6]]
+; CHECK-NEXT:[[TMP11:%.*]] = mul nuw nsw i32 [[TMP7]], [[TMP4]]
+; CHECK-NEXT:[[TMP12:%.*]] = add i32 [[TMP10]], [[TMP11]]
+; CHECK-NEXT:[[TMP13:%.*]] = add i32 [[TMP12]], [[TMP8]]
+; CHECK-NEXT:[[TMP14:%.*]] = getelementptr inbounds [1024 x [4 x i32]], 
ptr addrspace(3) @scalar_alloca_ptr_with_vector_gep_offset_select.alloca, i32 
0, i32 [[TMP13]]
+; CHECK-NEXT:[[GETELEMENTPTR0:%.*]] = getelementptr inbounds i8, ptr 
addrspace(3) [[TMP14]], <4 x i64> 
+; CHECK-NEXT:[[GETELEMENTPTR1:%.*]] = getelementptr inbounds i8, ptr 
addrspace(3) [[TMP14]], <4 x i64> 
+; CHECK-NEXT:[[SELECT:%.*]] = select i1 [[COND]], <4 x ptr addrspace(3)> 
[[GETELEMENTPTR0]], <4 x ptr addrspace(3)> [[GETELEMENTPTR1]]
+; CHECK-NEXT:[[EXTRACTELEMENT:%.*]] = extractelement <4 x ptr 
addrspace(3)> [[SELECT]], i64 1
+; CHECK-NEXT:store i32 0, ptr addrspace(3) [[EXTRACTELEMENT]], align 4
 ; CHECK-NEXT:ret void
 ;
 bb:

``




https://github.com/llvm/llvm-project/pull/114311
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] ValueTracking: Allow getUnderlyingObject to look at vectors (PR #114311)

2024-10-30 Thread Matt Arsenault via llvm-branch-commits

https://github.com/arsenm ready_for_review 
https://github.com/llvm/llvm-project/pull/114311
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] [llvm] [llvm][fatlto] Drop any CFI related instrumentation after emitting bitcode (PR #112788)

2024-10-30 Thread Paul Kirth via llvm-branch-commits

https://github.com/ilovepi updated 
https://github.com/llvm/llvm-project/pull/112788

>From ad89d61e60bac57cf8c66a974d741377ebe1db30 Mon Sep 17 00:00:00 2001
From: Paul Kirth 
Date: Fri, 18 Oct 2024 01:59:26 +
Subject: [PATCH 1/2] Use new enum in constructor

Created using spr 1.3.4
---
 llvm/lib/Passes/PassBuilderPipelines.cpp | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/llvm/lib/Passes/PassBuilderPipelines.cpp 
b/llvm/lib/Passes/PassBuilderPipelines.cpp
index aec79304ab5c3c..0585e83e59a9ab 100644
--- a/llvm/lib/Passes/PassBuilderPipelines.cpp
+++ b/llvm/lib/Passes/PassBuilderPipelines.cpp
@@ -1631,8 +1631,11 @@ 
PassBuilder::buildFatLTODefaultPipeline(OptimizationLevel Level, bool ThinLTO,
   MPM.addPass(EmbedBitcodePass(ThinLTO, EmitSummary));
 
   // If we're doing FatLTO w/ CFI enabled, we don't want the type tests in the
-  // object file.
-  MPM.addPass(LowerTypeTestsPass(nullptr, nullptr, true, true));
+  // object code, only in the bitcode section, so drop it before we run
+  // module optimization and generate machine code. If llvm.type.test() isn't 
in
+  // the IR, this won't do anything.
+  MPM.addPass(
+  LowerTypeTestsPass(nullptr, nullptr, lowertypetests::DropTestKind::All));
 
   // Use the ThinLTO post-link pipeline with sample profiling
   if (ThinLTO && PGOOpt && PGOOpt->Action == PGOOptions::SampleUse)

>From f331c70196e399d0e0d4ec8fdb76d3a313b25ac3 Mon Sep 17 00:00:00 2001
From: Paul Kirth 
Date: Wed, 30 Oct 2024 23:54:00 +
Subject: [PATCH 2/2] Update test to use cc1

Created using spr 1.3.4
---
 clang/test/CodeGen/fat-lto-objects-cfi.cpp | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/clang/test/CodeGen/fat-lto-objects-cfi.cpp 
b/clang/test/CodeGen/fat-lto-objects-cfi.cpp
index 022e74fd9b6f22..628951847053ac 100644
--- a/clang/test/CodeGen/fat-lto-objects-cfi.cpp
+++ b/clang/test/CodeGen/fat-lto-objects-cfi.cpp
@@ -1,7 +1,7 @@
 // REQUIRES: x86-registered-target
 
-// RUN: %clangxx --target=x86_64-unknown-fuchsia -O2 -flto -ffat-lto-objects \
-// RUN:  -fsanitize=cfi -fvisibility=hidden -S -emit-llvm -o - %s \
+// RUN: %clang_cc1 -triple x86_64-unknown-fuchsia -O2 -flto -ffat-lto-objects \
+// RUN:  -fsanitize=cfi-icall -fsanitize-trap=cfi-icall 
-fvisibility=hidden  -emit-llvm -o - %s \
 // RUN:   | FileCheck %s
 
 // CHECK: llvm.embedded.object

___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] [llvm] [LLVM] [Clang] Backport "Support for Gentoo `*t64` triples (64-bit time_t ABIs)" (PR #112364)

2024-10-30 Thread Michał Górny via llvm-branch-commits


@@ -294,7 +294,11 @@ class Triple {
 
 PAuthTest,
 
-LastEnvironmentType = PAuthTest
+GNUT64,
+GNUEABIT64,
+GNUEABIHFT64,
+
+LastEnvironmentType = GNUEABIHFT64

mgorny wrote:

For the record, Gentoo has already backported it to 18.x as well (I was 
literally waiting for the patch to be merged as a confirmation that this 
approach is good).

My alternative idea would be to limit the changes to accept `*t64` but return 
one of the existing enum values, i.e. `-gnut64` would return `::GNU` and so on. 
We'd lose the ability to customize clang behavior on it, but at least it won't 
refuse to work right away.

https://github.com/llvm/llvm-project/pull/112364
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [flang] [flang][cuda] Data transfer with descriptor (PR #114302)

2024-10-30 Thread Valentin Clement バレンタイン クレメン via llvm-branch-commits

https://github.com/clementval updated 
https://github.com/llvm/llvm-project/pull/114302

>From e4c7e31c77bbfda563e4e2c9b591fe2f5cb2c259 Mon Sep 17 00:00:00 2001
From: Valentin Clement 
Date: Wed, 30 Oct 2024 11:53:12 -0700
Subject: [PATCH 1/2] [flang][cuda] Data transfer with descriptor

---
 flang/runtime/CUDA/memory.cpp   | 34 +++--
 flang/unittests/Runtime/CUDA/Memory.cpp | 40 +
 2 files changed, 72 insertions(+), 2 deletions(-)

diff --git a/flang/runtime/CUDA/memory.cpp b/flang/runtime/CUDA/memory.cpp
index 4778a4ae77683f..f25d3b531c84f0 100644
--- a/flang/runtime/CUDA/memory.cpp
+++ b/flang/runtime/CUDA/memory.cpp
@@ -9,10 +9,32 @@
 #include "flang/Runtime/CUDA/memory.h"
 #include "../terminator.h"
 #include "flang/Runtime/CUDA/common.h"
+#include "flang/Runtime/assign.h"
 
 #include "cuda_runtime.h"
 
 namespace Fortran::runtime::cuda {
+static void *MemmoveHostToDevice(
+void *dst, const void *src, std::size_t count) {
+  // TODO: Use cudaMemcpyAsync when we have support for stream.
+  CUDA_REPORT_IF_ERROR(cudaMemcpy(dst, src, count, cudaMemcpyHostToDevice));
+  return dst;
+}
+
+static void *MemmoveDeviceToHost(
+void *dst, const void *src, std::size_t count) {
+  // TODO: Use cudaMemcpyAsync when we have support for stream.
+  CUDA_REPORT_IF_ERROR(cudaMemcpy(dst, src, count, cudaMemcpyDeviceToHost));
+  return dst;
+}
+
+static void *MemmoveDeviceToDevice(
+void *dst, const void *src, std::size_t count) {
+  // TODO: Use cudaMemcpyAsync when we have support for stream.
+  CUDA_REPORT_IF_ERROR(cudaMemcpy(dst, src, count, cudaMemcpyHostToDevice));
+  return dst;
+}
+
 extern "C" {
 
 void *RTDEF(CUFMemAlloc)(
@@ -90,8 +112,16 @@ void RTDEF(CUFDataTransferPtrDesc)(void *addr, Descriptor 
*desc,
 void RTDECL(CUFDataTransferDescDesc)(Descriptor *dstDesc, Descriptor *srcDesc,
 unsigned mode, const char *sourceFile, int sourceLine) {
   Terminator terminator{sourceFile, sourceLine};
-  terminator.Crash(
-  "not yet implemented: CUDA data transfer between two descriptors");
+  MemmoveFct memmoveFct;
+  if (mode == kHostToDevice) {
+memmoveFct = &MemmoveHostToDevice;
+  } else if (mode == kDeviceToHost) {
+memmoveFct = &MemmoveDeviceToHost;
+  } else if (mode == kDeviceToDevice) {
+memmoveFct = &MemmoveDeviceToDevice;
+  }
+  Fortran::runtime::Assign(
+  dstDesc, srcDesc, terminator, MaybeReallocate, memmoveFct);
 }
 }
 } // namespace Fortran::runtime::cuda
diff --git a/flang/unittests/Runtime/CUDA/Memory.cpp 
b/flang/unittests/Runtime/CUDA/Memory.cpp
index 157d3cdb531def..ade05e21b70a89 100644
--- a/flang/unittests/Runtime/CUDA/Memory.cpp
+++ b/flang/unittests/Runtime/CUDA/Memory.cpp
@@ -9,11 +9,17 @@
 #include "flang/Runtime/CUDA/memory.h"
 #include "gtest/gtest.h"
 #include "../../../runtime/terminator.h"
+#include "../tools.h"
 #include "flang/Common/Fortran.h"
+#include "flang/Runtime/CUDA/allocator.h"
 #include "flang/Runtime/CUDA/common.h"
+#include "flang/Runtime/CUDA/descriptor.h"
+#include "flang/Runtime/allocatable.h"
+#include "flang/Runtime/allocator-registry.h"
 
 #include "cuda_runtime.h"
 
+using namespace Fortran::runtime;
 using namespace Fortran::runtime::cuda;
 
 TEST(MemoryCUFTest, SimpleAllocTramsferFree) {
@@ -29,3 +35,37 @@ TEST(MemoryCUFTest, SimpleAllocTramsferFree) {
   EXPECT_EQ(42, host);
   RTNAME(CUFMemFree)((void *)dev, kMemTypeDevice, __FILE__, __LINE__);
 }
+
+static OwningPtr createAllocatable(
+Fortran::common::TypeCategory tc, int kind, int rank = 1) {
+  return Descriptor::Create(TypeCode{tc, kind}, kind, nullptr, rank, nullptr,
+  CFI_attribute_allocatable);
+}
+
+TEST(MemoryCUFTest, CUFDataTransferDescDesc) {
+  using Fortran::common::TypeCategory;
+  RTNAME(CUFRegisterAllocator)();
+  // INTEGER(4), DEVICE, ALLOCATABLE :: a(:)
+  auto dev{createAllocatable(TypeCategory::Integer, 4)};
+  dev->SetAllocIdx(kDeviceAllocatorPos);
+  EXPECT_EQ((int)kDeviceAllocatorPos, dev->GetAllocIdx());
+  RTNAME(AllocatableSetBounds)(*dev, 0, 1, 10);
+  RTNAME(AllocatableAllocate)
+  (*dev, /*hasStat=*/false, /*errMsg=*/nullptr, __FILE__, __LINE__);
+  EXPECT_TRUE(dev->IsAllocated());
+
+  // Create temp array to transfer to device.
+  auto x{MakeArray(std::vector{10},
+  std::vector{0, 1, 2, 3, 4, 5, 6, 7, 8, 9})};
+  RTNAME(CUFDataTransferDescDesc)(*dev, *x, kHostToDevice, __FILE__, __LINE__);
+
+  // Retrieve data from device.
+  auto host{MakeArray(std::vector{10},
+  std::vector{0, 0, 0, 0, 0, 0, 0, 0, 0, 0})};
+  RTNAME(CUFDataTransferDescDesc)(
+  *host, *dev, kDeviceToHost, __FILE__, __LINE__);
+
+  for (unsigned i = 0; i < 10; ++i) {
+EXPECT_EQ(*host->ZeroBasedIndexedElement(i), 
(std::int32_t)i);
+  }
+}

>From be6734745eba64a1e05886b30eba64409658b5ae Mon Sep 17 00:00:00 2001
From: Valentin Clement 
Date: Wed, 30 Oct 2024 14:08:23 -0700
Subject: [PATCH 2/2] fix call

---
 flang/runtime/CUDA/allocatable.cpp | 1 +
 flang/runtime/CUDA/memory.cpp  | 2 +-
 2 files cha

[llvm-branch-commits] [clang] [llvm] [LLVM] [Clang] Backport "Support for Gentoo `*t64` triples (64-bit time_t ABIs)" (PR #112364)

2024-10-30 Thread Alex Rønne Petersen via llvm-branch-commits


@@ -294,7 +294,11 @@ class Triple {
 
 PAuthTest,
 
-LastEnvironmentType = PAuthTest
+GNUT64,
+GNUEABIT64,
+GNUEABIHFT64,
+
+LastEnvironmentType = GNUEABIHFT64

alexrp wrote:

We don't mandate a particular patch version of LLVM because we try to support 
building Zig with a distro-provided LLVM. Distro packages aren't necessarily 
going to be on the latest LLVM patch release.

https://github.com/llvm/llvm-project/pull/112364
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] 78dcfb3 - Revert "[TLI] Add support for hypot libcall. (#113724)"

2024-10-30 Thread via llvm-branch-commits

Author: gulfemsavrun
Date: 2024-10-30T15:09:24-07:00
New Revision: 78dcfb323241e46701ff8d19c8509307bce904bb

URL: 
https://github.com/llvm/llvm-project/commit/78dcfb323241e46701ff8d19c8509307bce904bb
DIFF: 
https://github.com/llvm/llvm-project/commit/78dcfb323241e46701ff8d19c8509307bce904bb.diff

LOG: Revert "[TLI] Add support for hypot libcall. (#113724)"

This reverts commit feb2d867fac3b6339c169fff97ddf0716fce6f0a.

Added: 


Modified: 
llvm/include/llvm/Analysis/TargetLibraryInfo.def
llvm/lib/Analysis/TargetLibraryInfo.cpp
llvm/lib/Transforms/Utils/BuildLibCalls.cpp
llvm/test/Transforms/InferFunctionAttrs/annotate.ll
llvm/test/tools/llvm-tli-checker/ps4-tli-check.yaml
llvm/unittests/Analysis/TargetLibraryInfoTest.cpp

Removed: 




diff  --git a/llvm/include/llvm/Analysis/TargetLibraryInfo.def 
b/llvm/include/llvm/Analysis/TargetLibraryInfo.def
index fd53a26ef8fc11b..3e23e398f6a7976 100644
--- a/llvm/include/llvm/Analysis/TargetLibraryInfo.def
+++ b/llvm/include/llvm/Analysis/TargetLibraryInfo.def
@@ -1671,21 +1671,6 @@ TLI_DEFINE_ENUM_INTERNAL(htons)
 TLI_DEFINE_STRING_INTERNAL("htons")
 TLI_DEFINE_SIG_INTERNAL(Int16, Int16)
 
-/// double hypot(double x, double y);
-TLI_DEFINE_ENUM_INTERNAL(hypot)
-TLI_DEFINE_STRING_INTERNAL("hypot")
-TLI_DEFINE_SIG_INTERNAL(Dbl, Dbl, Dbl)
-
-/// float hypotf(float x, float y);
-TLI_DEFINE_ENUM_INTERNAL(hypotf)
-TLI_DEFINE_STRING_INTERNAL("hypotf")
-TLI_DEFINE_SIG_INTERNAL(Flt, Flt, Flt)
-
-/// long double hypotl(long double x, long double y);
-TLI_DEFINE_ENUM_INTERNAL(hypotl)
-TLI_DEFINE_STRING_INTERNAL("hypotl")
-TLI_DEFINE_SIG_INTERNAL(LDbl, LDbl, LDbl)
-
 /// int iprintf(const char *format, ...);
 TLI_DEFINE_ENUM_INTERNAL(iprintf)
 TLI_DEFINE_STRING_INTERNAL("iprintf")

diff  --git a/llvm/lib/Analysis/TargetLibraryInfo.cpp 
b/llvm/lib/Analysis/TargetLibraryInfo.cpp
index 7f0b98ab3c1514a..0ee83d217a5001e 100644
--- a/llvm/lib/Analysis/TargetLibraryInfo.cpp
+++ b/llvm/lib/Analysis/TargetLibraryInfo.cpp
@@ -300,7 +300,6 @@ static void initializeLibCalls(TargetLibraryInfoImpl &TLI, 
const Triple &T,
   TLI.setUnavailable(LibFunc_expf);
   TLI.setUnavailable(LibFunc_floorf);
   TLI.setUnavailable(LibFunc_fmodf);
-  TLI.setUnavailable(LibFunc_hypotf);
   TLI.setUnavailable(LibFunc_log10f);
   TLI.setUnavailable(LibFunc_logf);
   TLI.setUnavailable(LibFunc_modff);
@@ -332,7 +331,6 @@ static void initializeLibCalls(TargetLibraryInfoImpl &TLI, 
const Triple &T,
 TLI.setUnavailable(LibFunc_floorl);
 TLI.setUnavailable(LibFunc_fmodl);
 TLI.setUnavailable(LibFunc_frexpl);
-TLI.setUnavailable(LibFunc_hypotl);
 TLI.setUnavailable(LibFunc_ldexpl);
 TLI.setUnavailable(LibFunc_log10l);
 TLI.setUnavailable(LibFunc_logl);

diff  --git a/llvm/lib/Transforms/Utils/BuildLibCalls.cpp 
b/llvm/lib/Transforms/Utils/BuildLibCalls.cpp
index e039457f313b29e..5fd4fd78c28a953 100644
--- a/llvm/lib/Transforms/Utils/BuildLibCalls.cpp
+++ b/llvm/lib/Transforms/Utils/BuildLibCalls.cpp
@@ -1215,9 +1215,6 @@ bool llvm::inferNonMandatoryLibFuncAttrs(Function &F,
   case LibFunc_fmod:
   case LibFunc_fmodf:
   case LibFunc_fmodl:
-  case LibFunc_hypot:
-  case LibFunc_hypotf:
-  case LibFunc_hypotl:
   case LibFunc_isascii:
   case LibFunc_isdigit:
   case LibFunc_labs:

diff  --git a/llvm/test/Transforms/InferFunctionAttrs/annotate.ll 
b/llvm/test/Transforms/InferFunctionAttrs/annotate.ll
index 452d90aa98d88df..d8266f4c6703dd6 100644
--- a/llvm/test/Transforms/InferFunctionAttrs/annotate.ll
+++ b/llvm/test/Transforms/InferFunctionAttrs/annotate.ll
@@ -589,15 +589,6 @@ declare ptr @gets(ptr)
 ; CHECK: declare noundef i32 @gettimeofday(ptr nocapture noundef, ptr 
nocapture noundef) [[NOFREE_NOUNWIND]]
 declare i32 @gettimeofday(ptr, ptr)
 
-; CHECK: declare double @hypot(double, double) 
[[NOFREE_NOUNWIND_WILLRETURN_WRITEONLY]]
-declare double @hypot(double, double)
-
-; CHECK: declare float @hypotf(float, float) 
[[NOFREE_NOUNWIND_WILLRETURN_WRITEONLY]]
-declare float @hypotf(float, float)
-
-; CHECK: declare x86_fp80 @hypotl(x86_fp80, x86_fp80) 
[[NOFREE_NOUNWIND_WILLRETURN_WRITEONLY]]
-declare x86_fp80 @hypotl(x86_fp80, x86_fp80)
-
 ; CHECK: declare i32 @isascii(i32) [[NOFREE_NOUNWIND_WILLRETURN_WRITEONLY]]
 declare i32 @isascii(i32)
 

diff  --git a/llvm/test/tools/llvm-tli-checker/ps4-tli-check.yaml 
b/llvm/test/tools/llvm-tli-checker/ps4-tli-check.yaml
index d52f3c751b06695..408b9c39934286f 100644
--- a/llvm/test/tools/llvm-tli-checker/ps4-tli-check.yaml
+++ b/llvm/test/tools/llvm-tli-checker/ps4-tli-check.yaml
@@ -602,18 +602,6 @@ DynamicSymbols:
 Type:STT_FUNC
 Section: .text
 Binding: STB_GLOBAL
-  - Name:hypot
-Type:STT_FUNC
-Section: .text
-Binding: STB_GLOBAL
-  - Name:hypotf
-Type:STT_FUNC
-Section:

[llvm-branch-commits] [llvm] ValueTracking: Allow getUnderlyingObject to look at vectors (PR #114311)

2024-10-30 Thread Matt Arsenault via llvm-branch-commits

arsenm wrote:

> [!WARNING]
> This pull request is not mergeable via GitHub because a downstack PR is 
> open. Once all requirements are satisfied, merge this PR as a stack  href="https://app.graphite.dev/github/pr/llvm/llvm-project/114311?utm_source=stack-comment-downstack-mergeability-warning";
>  >on Graphite.
> https://graphite.dev/docs/merge-pull-requests";>Learn more

* **#114311** https://app.graphite.dev/github/pr/llvm/llvm-project/114311?utm_source=stack-comment-icon";
 target="_blank">https://static.graphite.dev/graphite-32x32-black.png"; alt="Graphite" 
width="10px" height="10px"/> 👈
* **#114144** https://app.graphite.dev/github/pr/llvm/llvm-project/114144?utm_source=stack-comment-icon";
 target="_blank">https://static.graphite.dev/graphite-32x32-black.png"; alt="Graphite" 
width="10px" height="10px"/>
* **#114113** https://app.graphite.dev/github/pr/llvm/llvm-project/114113?utm_source=stack-comment-icon";
 target="_blank">https://static.graphite.dev/graphite-32x32-black.png"; alt="Graphite" 
width="10px" height="10px"/>
* **#114091** https://app.graphite.dev/github/pr/llvm/llvm-project/114091?utm_source=stack-comment-icon";
 target="_blank">https://static.graphite.dev/graphite-32x32-black.png"; alt="Graphite" 
width="10px" height="10px"/>
* `main`

This stack of pull requests is managed by Graphite. https://stacking.dev/?utm_source=stack-comment";>Learn more about 
stacking.


 Join @arsenm and the rest of your teammates on https://graphite.dev?utm-source=stack-comment";>https://static.graphite.dev/graphite-32x32-black.png"; alt="Graphite" 
width="11px" height="11px"/> Graphite
  

https://github.com/llvm/llvm-project/pull/114311
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] [llvm] [llvm][fatlto] Drop any CFI related instrumentation after emitting bitcode (PR #112788)

2024-10-30 Thread Paul Kirth via llvm-branch-commits

https://github.com/ilovepi updated 
https://github.com/llvm/llvm-project/pull/112788

>From ad89d61e60bac57cf8c66a974d741377ebe1db30 Mon Sep 17 00:00:00 2001
From: Paul Kirth 
Date: Fri, 18 Oct 2024 01:59:26 +
Subject: [PATCH 1/2] Use new enum in constructor

Created using spr 1.3.4
---
 llvm/lib/Passes/PassBuilderPipelines.cpp | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/llvm/lib/Passes/PassBuilderPipelines.cpp 
b/llvm/lib/Passes/PassBuilderPipelines.cpp
index aec79304ab5c3c..0585e83e59a9ab 100644
--- a/llvm/lib/Passes/PassBuilderPipelines.cpp
+++ b/llvm/lib/Passes/PassBuilderPipelines.cpp
@@ -1631,8 +1631,11 @@ 
PassBuilder::buildFatLTODefaultPipeline(OptimizationLevel Level, bool ThinLTO,
   MPM.addPass(EmbedBitcodePass(ThinLTO, EmitSummary));
 
   // If we're doing FatLTO w/ CFI enabled, we don't want the type tests in the
-  // object file.
-  MPM.addPass(LowerTypeTestsPass(nullptr, nullptr, true, true));
+  // object code, only in the bitcode section, so drop it before we run
+  // module optimization and generate machine code. If llvm.type.test() isn't 
in
+  // the IR, this won't do anything.
+  MPM.addPass(
+  LowerTypeTestsPass(nullptr, nullptr, lowertypetests::DropTestKind::All));
 
   // Use the ThinLTO post-link pipeline with sample profiling
   if (ThinLTO && PGOOpt && PGOOpt->Action == PGOOptions::SampleUse)

>From f331c70196e399d0e0d4ec8fdb76d3a313b25ac3 Mon Sep 17 00:00:00 2001
From: Paul Kirth 
Date: Wed, 30 Oct 2024 23:54:00 +
Subject: [PATCH 2/2] Update test to use cc1

Created using spr 1.3.4
---
 clang/test/CodeGen/fat-lto-objects-cfi.cpp | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/clang/test/CodeGen/fat-lto-objects-cfi.cpp 
b/clang/test/CodeGen/fat-lto-objects-cfi.cpp
index 022e74fd9b6f22..628951847053ac 100644
--- a/clang/test/CodeGen/fat-lto-objects-cfi.cpp
+++ b/clang/test/CodeGen/fat-lto-objects-cfi.cpp
@@ -1,7 +1,7 @@
 // REQUIRES: x86-registered-target
 
-// RUN: %clangxx --target=x86_64-unknown-fuchsia -O2 -flto -ffat-lto-objects \
-// RUN:  -fsanitize=cfi -fvisibility=hidden -S -emit-llvm -o - %s \
+// RUN: %clang_cc1 -triple x86_64-unknown-fuchsia -O2 -flto -ffat-lto-objects \
+// RUN:  -fsanitize=cfi-icall -fsanitize-trap=cfi-icall 
-fvisibility=hidden  -emit-llvm -o - %s \
 // RUN:   | FileCheck %s
 
 // CHECK: llvm.embedded.object

___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] 3ca951e - Revert "[GlobalISel][AArch64] Legalize G_INSERT_VECTOR_ELT for SVE (#114310)"

2024-10-30 Thread via llvm-branch-commits

Author: Thorsten Schütt
Date: 2024-10-31T05:40:53+01:00
New Revision: 3ca951e1f59110cb29b9c03fc1733cab1f6fbc30

URL: 
https://github.com/llvm/llvm-project/commit/3ca951e1f59110cb29b9c03fc1733cab1f6fbc30
DIFF: 
https://github.com/llvm/llvm-project/commit/3ca951e1f59110cb29b9c03fc1733cab1f6fbc30.diff

LOG: Revert "[GlobalISel][AArch64] Legalize G_INSERT_VECTOR_ELT for SVE 
(#114310)"

This reverts commit 6bf214b7c6d74ec581bc52a9142756a1d1df6df0.

Added: 


Modified: 
llvm/include/llvm/CodeGen/GlobalISel/LegalizerInfo.h
llvm/lib/CodeGen/GlobalISel/LegalityPredicates.cpp
llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp
llvm/lib/Target/AArch64/GISel/AArch64PostLegalizerLowering.cpp

Removed: 
llvm/test/CodeGen/AArch64/GlobalISel/legalize-vector-insert-elt.mir



diff  --git a/llvm/include/llvm/CodeGen/GlobalISel/LegalizerInfo.h 
b/llvm/include/llvm/CodeGen/GlobalISel/LegalizerInfo.h
index 6811b37767cb21..6d71c150c8da6b 100644
--- a/llvm/include/llvm/CodeGen/GlobalISel/LegalizerInfo.h
+++ b/llvm/include/llvm/CodeGen/GlobalISel/LegalizerInfo.h
@@ -273,11 +273,6 @@ inline LegalityPredicate typeIsNot(unsigned TypeIdx, LLT 
Type) {
 LegalityPredicate
 typePairInSet(unsigned TypeIdx0, unsigned TypeIdx1,
   std::initializer_list> TypesInit);
-/// True iff the given types for the given tuple of type indexes is one of the
-/// specified type tuple.
-LegalityPredicate
-typeTupleInSet(unsigned TypeIdx0, unsigned TypeIdx1, unsigned TypeIdx2,
-   std::initializer_list> TypesInit);
 /// True iff the given types for the given pair of type indexes is one of the
 /// specified type pairs.
 LegalityPredicate typePairAndMemDescInSet(
@@ -509,15 +504,6 @@ class LegalizeRuleSet {
 using namespace LegalityPredicates;
 return actionIf(Action, typePairInSet(typeIdx(0), typeIdx(1), Types));
   }
-
-  LegalizeRuleSet &
-  actionFor(LegalizeAction Action,
-std::initializer_list> Types) {
-using namespace LegalityPredicates;
-return actionIf(Action,
-typeTupleInSet(typeIdx(0), typeIdx(1), typeIdx(2), Types));
-  }
-
   /// Use the given action when type indexes 0 and 1 is any type pair in the
   /// given list.
   /// Action should be an action that requires mutation.
@@ -629,12 +615,6 @@ class LegalizeRuleSet {
   return *this;
 return actionFor(LegalizeAction::Legal, Types);
   }
-  LegalizeRuleSet &
-  legalFor(bool Pred, std::initializer_list> Types) {
-if (!Pred)
-  return *this;
-return actionFor(LegalizeAction::Legal, Types);
-  }
   /// The instruction is legal when type index 0 is any type in the given list
   /// and imm index 0 is anything.
   LegalizeRuleSet &legalForTypeWithAnyImm(std::initializer_list Types) {

diff  --git a/llvm/lib/CodeGen/GlobalISel/LegalityPredicates.cpp 
b/llvm/lib/CodeGen/GlobalISel/LegalityPredicates.cpp
index dc7ed6cbe8b7da..8fe48195c610be 100644
--- a/llvm/lib/CodeGen/GlobalISel/LegalityPredicates.cpp
+++ b/llvm/lib/CodeGen/GlobalISel/LegalityPredicates.cpp
@@ -49,17 +49,6 @@ LegalityPredicate LegalityPredicates::typePairInSet(
   };
 }
 
-LegalityPredicate LegalityPredicates::typeTupleInSet(
-unsigned TypeIdx0, unsigned TypeIdx1, unsigned TypeIdx2,
-std::initializer_list> TypesInit) {
-  SmallVector, 4> Types = TypesInit;
-  return [=](const LegalityQuery &Query) {
-std::tuple Match = {
-Query.Types[TypeIdx0], Query.Types[TypeIdx1], Query.Types[TypeIdx2]};
-return llvm::is_contained(Types, Match);
-  };
-}
-
 LegalityPredicate LegalityPredicates::typePairAndMemDescInSet(
 unsigned TypeIdx0, unsigned TypeIdx1, unsigned MMOIdx,
 std::initializer_list TypesAndMemDescInit) {

diff  --git a/llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp 
b/llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp
index 7beda0e92a75bc..6024027afaf6ce 100644
--- a/llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp
+++ b/llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp
@@ -978,10 +978,6 @@ AArch64LegalizerInfo::AArch64LegalizerInfo(const 
AArch64Subtarget &ST)
   getActionDefinitionsBuilder(G_INSERT_VECTOR_ELT)
   .legalIf(
   typeInSet(0, {v16s8, v8s8, v8s16, v4s16, v4s32, v2s32, v2s64, v2p0}))
-  .legalFor(HasSVE, {{nxv16s8, s32, s64},
- {nxv8s16, s32, s64},
- {nxv4s32, s32, s64},
- {nxv2s64, s64, s64}})
   .moreElementsToNextPow2(0)
   .widenVectorEltsToVectorMinSize(0, 64)
   .clampNumElements(0, v8s8, v16s8)

diff  --git a/llvm/lib/Target/AArch64/GISel/AArch64PostLegalizerLowering.cpp 
b/llvm/lib/Target/AArch64/GISel/AArch64PostLegalizerLowering.cpp
index 0bf0a4bf27c44d..b40fe55fdfaf67 100644
--- a/llvm/lib/Target/AArch64/GISel/AArch64PostLegalizerLowering.cpp
+++ b/llvm/lib/Target/AArch64/GISel/AArch64PostLegalizerLowering.cpp
@@ -161,8 +161,6 @@ bool matchREV(MachineInstr 

[llvm-branch-commits] [llvm] [llvm][fatlto] Drop any CFI related instrumentation after emitting bitcode (PR #112788)

2024-10-30 Thread via llvm-branch-commits

https://github.com/pcc approved this pull request.


https://github.com/llvm/llvm-project/pull/112788
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] [llvm] [LLVM] [Clang] Backport "Support for Gentoo `*t64` triples (64-bit time_t ABIs)" (PR #112364)

2024-10-30 Thread Tom Stellard via llvm-branch-commits


@@ -294,7 +294,11 @@ class Triple {
 
 PAuthTest,
 
-LastEnvironmentType = PAuthTest
+GNUT64,
+GNUEABIT64,
+GNUEABIHFT64,
+
+LastEnvironmentType = GNUEABIHFT64

tstellar wrote:

@alexrp Why would zig still need to deal with it?  I don't think it's likely 
19.1.3 will be used much once 19.1.4 comes out.

https://github.com/llvm/llvm-project/pull/112364
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] GlobalISel: Fix combine duplicating atomic loads (PR #111730)

2024-10-30 Thread Matt Arsenault via llvm-branch-commits

https://github.com/arsenm updated 
https://github.com/llvm/llvm-project/pull/111730

>From 41780b3398f46745fabe2ae2d24ca02c6580cf49 Mon Sep 17 00:00:00 2001
From: Matt Arsenault 
Date: Wed, 9 Oct 2024 22:05:48 +0400
Subject: [PATCH 1/2] GlobalISel: Fix combine duplicating atomic loads

The sext_inreg (load) combine was not deleting the old load instruction,
and it would never be deleted if volatile or atomic.
---
 .../lib/CodeGen/GlobalISel/CombinerHelper.cpp |  1 +
 .../AMDGPU/GlobalISel/atomic_load_flat.ll | 96 ---
 .../AMDGPU/GlobalISel/atomic_load_global.ll   | 51 +++---
 .../AMDGPU/GlobalISel/atomic_load_local_2.ll  | 36 ++-
 ...lizer-combiner-sextload-from-sextinreg.mir |  2 -
 5 files changed, 40 insertions(+), 146 deletions(-)

diff --git a/llvm/lib/CodeGen/GlobalISel/CombinerHelper.cpp 
b/llvm/lib/CodeGen/GlobalISel/CombinerHelper.cpp
index b7ddf9f479ef8e..a5cabd46e23124 100644
--- a/llvm/lib/CodeGen/GlobalISel/CombinerHelper.cpp
+++ b/llvm/lib/CodeGen/GlobalISel/CombinerHelper.cpp
@@ -1110,6 +1110,7 @@ void CombinerHelper::applySextInRegOfLoad(
   Builder.buildLoadInstr(TargetOpcode::G_SEXTLOAD, MI.getOperand(0).getReg(),
  LoadDef->getPointerReg(), *NewMMO);
   MI.eraseFromParent();
+  LoadDef->eraseFromParent();
 }
 
 /// Return true if 'MI' is a load or a store that may be fold it's address
diff --git a/llvm/test/CodeGen/AMDGPU/GlobalISel/atomic_load_flat.ll 
b/llvm/test/CodeGen/AMDGPU/GlobalISel/atomic_load_flat.ll
index 817d1af9c226c8..83912b1e77db20 100644
--- a/llvm/test/CodeGen/AMDGPU/GlobalISel/atomic_load_flat.ll
+++ b/llvm/test/CodeGen/AMDGPU/GlobalISel/atomic_load_flat.ll
@@ -27,32 +27,12 @@ define i32 @atomic_load_flat_monotonic_i8_zext_to_i32(ptr 
%ptr) {
 }
 
 define i32 @atomic_load_flat_monotonic_i8_sext_to_i32(ptr %ptr) {
-; GFX7-LABEL: atomic_load_flat_monotonic_i8_sext_to_i32:
-; GFX7:   ; %bb.0:
-; GFX7-NEXT:s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
-; GFX7-NEXT:flat_load_sbyte v2, v[0:1] glc
-; GFX7-NEXT:flat_load_ubyte v0, v[0:1] glc
-; GFX7-NEXT:s_waitcnt vmcnt(0) lgkmcnt(0)
-; GFX7-NEXT:v_mov_b32_e32 v0, v2
-; GFX7-NEXT:s_setpc_b64 s[30:31]
-;
-; GFX8-LABEL: atomic_load_flat_monotonic_i8_sext_to_i32:
-; GFX8:   ; %bb.0:
-; GFX8-NEXT:s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
-; GFX8-NEXT:flat_load_sbyte v2, v[0:1] glc
-; GFX8-NEXT:flat_load_ubyte v0, v[0:1] glc
-; GFX8-NEXT:s_waitcnt vmcnt(0) lgkmcnt(0)
-; GFX8-NEXT:v_mov_b32_e32 v0, v2
-; GFX8-NEXT:s_setpc_b64 s[30:31]
-;
-; GFX9-LABEL: atomic_load_flat_monotonic_i8_sext_to_i32:
-; GFX9:   ; %bb.0:
-; GFX9-NEXT:s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
-; GFX9-NEXT:flat_load_sbyte v2, v[0:1] glc
-; GFX9-NEXT:flat_load_ubyte v3, v[0:1] glc
-; GFX9-NEXT:s_waitcnt vmcnt(0) lgkmcnt(0)
-; GFX9-NEXT:v_mov_b32_e32 v0, v2
-; GFX9-NEXT:s_setpc_b64 s[30:31]
+; GCN-LABEL: atomic_load_flat_monotonic_i8_sext_to_i32:
+; GCN:   ; %bb.0:
+; GCN-NEXT:s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; GCN-NEXT:flat_load_sbyte v0, v[0:1] glc
+; GCN-NEXT:s_waitcnt vmcnt(0) lgkmcnt(0)
+; GCN-NEXT:s_setpc_b64 s[30:31]
   %load = load atomic i8, ptr %ptr monotonic, align 1
   %ext = sext i8 %load to i32
   ret i32 %ext
@@ -71,32 +51,12 @@ define i16 @atomic_load_flat_monotonic_i8_zext_to_i16(ptr 
%ptr) {
 }
 
 define i16 @atomic_load_flat_monotonic_i8_sext_to_i16(ptr %ptr) {
-; GFX7-LABEL: atomic_load_flat_monotonic_i8_sext_to_i16:
-; GFX7:   ; %bb.0:
-; GFX7-NEXT:s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
-; GFX7-NEXT:flat_load_sbyte v2, v[0:1] glc
-; GFX7-NEXT:flat_load_ubyte v0, v[0:1] glc
-; GFX7-NEXT:s_waitcnt vmcnt(0) lgkmcnt(0)
-; GFX7-NEXT:v_mov_b32_e32 v0, v2
-; GFX7-NEXT:s_setpc_b64 s[30:31]
-;
-; GFX8-LABEL: atomic_load_flat_monotonic_i8_sext_to_i16:
-; GFX8:   ; %bb.0:
-; GFX8-NEXT:s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
-; GFX8-NEXT:flat_load_sbyte v2, v[0:1] glc
-; GFX8-NEXT:flat_load_ubyte v0, v[0:1] glc
-; GFX8-NEXT:s_waitcnt vmcnt(0) lgkmcnt(0)
-; GFX8-NEXT:v_mov_b32_e32 v0, v2
-; GFX8-NEXT:s_setpc_b64 s[30:31]
-;
-; GFX9-LABEL: atomic_load_flat_monotonic_i8_sext_to_i16:
-; GFX9:   ; %bb.0:
-; GFX9-NEXT:s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
-; GFX9-NEXT:flat_load_sbyte v2, v[0:1] glc
-; GFX9-NEXT:flat_load_ubyte v3, v[0:1] glc
-; GFX9-NEXT:s_waitcnt vmcnt(0) lgkmcnt(0)
-; GFX9-NEXT:v_mov_b32_e32 v0, v2
-; GFX9-NEXT:s_setpc_b64 s[30:31]
+; GCN-LABEL: atomic_load_flat_monotonic_i8_sext_to_i16:
+; GCN:   ; %bb.0:
+; GCN-NEXT:s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; GCN-NEXT:flat_load_sbyte v0, v[0:1] glc
+; GCN-NEXT:s_waitcnt vmcnt(0) lgkmcnt(0)
+; GCN-NEXT:s_setpc_b64 s[30:31]
   %load = load atomic i8, ptr %ptr monotonic, align 1
   %ext = sext i8 %load to i16
   ret i16 %ext
@@ -126,32 +86,12 @@ define i32 @atomic_load_flat_monotonic_i16_zext_to_i32(ptr 
%ptr) {
 }
 
 define i32 @atom

[llvm-branch-commits] [mlir] [mlir][func] Remove `func-bufferize` pass (PR #114152)

2024-10-30 Thread Javed Absar via llvm-branch-commits


@@ -111,9 +111,9 @@ module attributes {transform.with_named_sequence} {
   transform.named_sequence @__transform_main(%arg1: !transform.any_op) {
 %1 = transform.structured.match ops{["func.func"]} in %arg1 : 
(!transform.any_op) -> !transform.any_op
 
-// func-bufferize can be applied only to ModuleOps.
+// duplicate-function-elimination can be applied only to ModuleOps.
 // expected-error @below {{pass pipeline failed}}
-transform.apply_registered_pass "func-bufferize" to %1 : 
(!transform.any_op) -> !transform.any_op
+transform.apply_registered_pass "duplicate-function-elimination" to %1 : 
(!transform.any_op) -> !transform.any_op

javedabsar1 wrote:

AFAIU here 'func-bufferize' just happened to be used. It is not really playing 
any specific role. right? 

https://github.com/llvm/llvm-project/pull/114152
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] [Serialization] Code cleanups and polish 83233 (PR #83237)

2024-10-30 Thread Chuanqi Xu via llvm-branch-commits

ChuanqiXu9 wrote:

> > I tried to take a look at eigen and it looks like the declaration looks 
> > well and I had no clue how that happens. A reproducer may be necessary here 
> > to proceed. Thanks in advance.
> 
> I can reproduce using the following sources and invocations outlined in 
> `run.sh` 
> [usx95@363d877](https://github.com/usx95/llvm-project/commit/363d877bd317638b197f57c3591860e1688950d5)
> 
> ```shell
> >  module-reproducer/run.sh
> 
> Building sensor_data.h
> Building tensor.h
> Building base.cc
> In module 'sensor_data':
> ../../eigen/Eigen/src/Core/../plugins/CommonCwiseBinaryOps.inc:47:29: 
> warning: inline function 'Eigen::operator*' is not defined 
> [-Wundefined-inline]
>47 | EIGEN_MAKE_SCALAR_BINARY_OP(operator*, product)
>   | ^
> ../../eigen/Eigen/src/Geometry/AngleAxis.h:221:35: note: used here
>   221 |   Vector3 sin_axis = sin(m_angle) * m_axis;
>   |   ^
> 1 warning generated.
> ```
> 
> This warning is a new breakage and does not happen without this change 
> (ignore the linker failure). Let me know if you can reproduce or need help 
> reproducing.

Sorry, could you provide the hash id for the commit that avoid the warning? I 
tried with d54953ef472bfd8d4b503aae7682aa76c49f8cc0 but I still saw the 
warning. I suspected if it is due to cache so I add `rm -fr 
~/.cache/clang/ModuleCache/` in the top of the script and `rm -f ${sensor_data} 
&& rm -f ${tensor}` in the end but I still saw the failures.

I am using libc++17. What's your environment?

https://github.com/llvm/llvm-project/pull/83237
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] 565d18d - Revert "[Clang][Sema] Always use latest redeclaration of primary template (#1…"

2024-10-30 Thread via llvm-branch-commits

Author: Felipe de Azevedo Piovezan
Date: 2024-10-30T14:03:47-07:00
New Revision: 565d18daf296b9848cf9d1b23fc82892e10eef8c

URL: 
https://github.com/llvm/llvm-project/commit/565d18daf296b9848cf9d1b23fc82892e10eef8c
DIFF: 
https://github.com/llvm/llvm-project/commit/565d18daf296b9848cf9d1b23fc82892e10eef8c.diff

LOG: Revert "[Clang][Sema] Always use latest redeclaration of primary template 
(#1…"

This reverts commit 90786adade22784a52856a0e8b545ec6710b47f6.

Added: 


Modified: 
clang/include/clang/AST/DeclTemplate.h
clang/lib/AST/Decl.cpp
clang/lib/AST/DeclCXX.cpp
clang/lib/AST/DeclTemplate.cpp
clang/lib/Sema/SemaDecl.cpp
clang/lib/Sema/SemaInit.cpp
clang/lib/Sema/SemaTemplateInstantiate.cpp
clang/test/AST/ast-dump-decl.cpp
clang/test/CXX/temp/temp.spec/temp.expl.spec/p7.cpp

Removed: 




diff  --git a/clang/include/clang/AST/DeclTemplate.h 
b/clang/include/clang/AST/DeclTemplate.h
index 0ca3fd48e81cf4..a572e3380f1655 100644
--- a/clang/include/clang/AST/DeclTemplate.h
+++ b/clang/include/clang/AST/DeclTemplate.h
@@ -857,6 +857,16 @@ class RedeclarableTemplateDecl : public TemplateDecl,
   /// \endcode
   bool isMemberSpecialization() const { return Common.getInt(); }
 
+  /// Determines whether any redeclaration of this template was
+  /// a specialization of a member template.
+  bool hasMemberSpecialization() const {
+for (const auto *D : redecls()) {
+  if (D->isMemberSpecialization())
+return true;
+}
+return false;
+  }
+
   /// Note that this member template is a specialization.
   void setMemberSpecialization() {
 assert(!isMemberSpecialization() && "already a member specialization");
@@ -1955,7 +1965,13 @@ class ClassTemplateSpecializationDecl : public 
CXXRecordDecl,
   /// specialization which was specialized by this.
   llvm::PointerUnion
-  getSpecializedTemplateOrPartial() const;
+  getSpecializedTemplateOrPartial() const {
+if (const auto *PartialSpec =
+SpecializedTemplate.dyn_cast())
+  return PartialSpec->PartialSpecialization;
+
+return SpecializedTemplate.get();
+  }
 
   /// Retrieve the set of template arguments that should be used
   /// to instantiate members of the class template or class template partial
@@ -2192,6 +2208,17 @@ class ClassTemplatePartialSpecializationDecl
 return InstantiatedFromMember.getInt();
   }
 
+  /// Determines whether any redeclaration of this this class template partial
+  /// specialization was a specialization of a member partial specialization.
+  bool hasMemberSpecialization() const {
+for (const auto *D : redecls()) {
+  if (cast(D)
+  ->isMemberSpecialization())
+return true;
+}
+return false;
+  }
+
   /// Note that this member template is a specialization.
   void setMemberSpecialization() { return InstantiatedFromMember.setInt(true); 
}
 
@@ -2713,7 +2740,13 @@ class VarTemplateSpecializationDecl : public VarDecl,
   /// Retrieve the variable template or variable template partial
   /// specialization which was specialized by this.
   llvm::PointerUnion
-  getSpecializedTemplateOrPartial() const;
+  getSpecializedTemplateOrPartial() const {
+if (const auto *PartialSpec =
+SpecializedTemplate.dyn_cast())
+  return PartialSpec->PartialSpecialization;
+
+return SpecializedTemplate.get();
+  }
 
   /// Retrieve the set of template arguments that should be used
   /// to instantiate the initializer of the variable template or variable
@@ -2947,6 +2980,18 @@ class VarTemplatePartialSpecializationDecl
 return InstantiatedFromMember.getInt();
   }
 
+  /// Determines whether any redeclaration of this this variable template
+  /// partial specialization was a specialization of a member partial
+  /// specialization.
+  bool hasMemberSpecialization() const {
+for (const auto *D : redecls()) {
+  if (cast(D)
+  ->isMemberSpecialization())
+return true;
+}
+return false;
+  }
+
   /// Note that this member template is a specialization.
   void setMemberSpecialization() { return InstantiatedFromMember.setInt(true); 
}
 
@@ -3119,9 +3164,6 @@ class VarTemplateDecl : public RedeclarableTemplateDecl {
 return makeSpecIterator(getSpecializations(), true);
   }
 
-  /// Merge \p Prev with our RedeclarableTemplateDecl::Common.
-  void mergePrevDecl(VarTemplateDecl *Prev);
-
   // Implement isa/cast/dyncast support
   static bool classof(const Decl *D) { return classofKind(D->getKind()); }
   static bool classofKind(Kind K) { return K == VarTemplate; }

diff  --git a/clang/lib/AST/Decl.cpp b/clang/lib/AST/Decl.cpp
index cd173d17263792..86913763ef9ff5 100644
--- a/clang/lib/AST/Decl.cpp
+++ b/clang/lib/AST/Decl.cpp
@@ -2708,7 +2708,7 @@ VarDecl *VarDecl::getTemplateInstantiationPattern() const 
{
 if (isTemplateInstantiation(VDTemplSpec->getTemplateSpecializationKind())) 
{
   

[llvm-branch-commits] [clang] [llvm] [LLVM] [Clang] Backport "Support for Gentoo `*t64` triples (64-bit time_t ABIs)" (PR #112364)

2024-10-30 Thread Alex Rønne Petersen via llvm-branch-commits

https://github.com/alexrp edited 
https://github.com/llvm/llvm-project/pull/112364
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] [llvm] [llvm][fatlto] Drop any CFI related instrumentation after emitting bitcode (PR #112788)

2024-10-30 Thread Paul Kirth via llvm-branch-commits

https://github.com/ilovepi updated 
https://github.com/llvm/llvm-project/pull/112788

>From ad89d61e60bac57cf8c66a974d741377ebe1db30 Mon Sep 17 00:00:00 2001
From: Paul Kirth 
Date: Fri, 18 Oct 2024 01:59:26 +
Subject: [PATCH 1/2] Use new enum in constructor

Created using spr 1.3.4
---
 llvm/lib/Passes/PassBuilderPipelines.cpp | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/llvm/lib/Passes/PassBuilderPipelines.cpp 
b/llvm/lib/Passes/PassBuilderPipelines.cpp
index aec79304ab5c3c..0585e83e59a9ab 100644
--- a/llvm/lib/Passes/PassBuilderPipelines.cpp
+++ b/llvm/lib/Passes/PassBuilderPipelines.cpp
@@ -1631,8 +1631,11 @@ 
PassBuilder::buildFatLTODefaultPipeline(OptimizationLevel Level, bool ThinLTO,
   MPM.addPass(EmbedBitcodePass(ThinLTO, EmitSummary));
 
   // If we're doing FatLTO w/ CFI enabled, we don't want the type tests in the
-  // object file.
-  MPM.addPass(LowerTypeTestsPass(nullptr, nullptr, true, true));
+  // object code, only in the bitcode section, so drop it before we run
+  // module optimization and generate machine code. If llvm.type.test() isn't 
in
+  // the IR, this won't do anything.
+  MPM.addPass(
+  LowerTypeTestsPass(nullptr, nullptr, lowertypetests::DropTestKind::All));
 
   // Use the ThinLTO post-link pipeline with sample profiling
   if (ThinLTO && PGOOpt && PGOOpt->Action == PGOOptions::SampleUse)

>From f331c70196e399d0e0d4ec8fdb76d3a313b25ac3 Mon Sep 17 00:00:00 2001
From: Paul Kirth 
Date: Wed, 30 Oct 2024 23:54:00 +
Subject: [PATCH 2/2] Update test to use cc1

Created using spr 1.3.4
---
 clang/test/CodeGen/fat-lto-objects-cfi.cpp | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/clang/test/CodeGen/fat-lto-objects-cfi.cpp 
b/clang/test/CodeGen/fat-lto-objects-cfi.cpp
index 022e74fd9b6f22..628951847053ac 100644
--- a/clang/test/CodeGen/fat-lto-objects-cfi.cpp
+++ b/clang/test/CodeGen/fat-lto-objects-cfi.cpp
@@ -1,7 +1,7 @@
 // REQUIRES: x86-registered-target
 
-// RUN: %clangxx --target=x86_64-unknown-fuchsia -O2 -flto -ffat-lto-objects \
-// RUN:  -fsanitize=cfi -fvisibility=hidden -S -emit-llvm -o - %s \
+// RUN: %clang_cc1 -triple x86_64-unknown-fuchsia -O2 -flto -ffat-lto-objects \
+// RUN:  -fsanitize=cfi-icall -fsanitize-trap=cfi-icall 
-fvisibility=hidden  -emit-llvm -o - %s \
 // RUN:   | FileCheck %s
 
 // CHECK: llvm.embedded.object

___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [flang] [flang][cuda] Data transfer with descriptor (PR #114302)

2024-10-30 Thread Renaud Kauffmann via llvm-branch-commits

https://github.com/Renaud-K approved this pull request.

Looks good. Nice way of testing the runtime. 

https://github.com/llvm/llvm-project/pull/114302
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] 70a8e4c - Revert "[DXIL] Add GroupMemoryBarrierWithGroupSync intrinsic (#111884)"

2024-10-30 Thread via llvm-branch-commits

Author: Adam Yang
Date: 2024-10-30T15:57:27-07:00
New Revision: 70a8e4c4f4f0eba9f9192fae97654d3a32a731cd

URL: 
https://github.com/llvm/llvm-project/commit/70a8e4c4f4f0eba9f9192fae97654d3a32a731cd
DIFF: 
https://github.com/llvm/llvm-project/commit/70a8e4c4f4f0eba9f9192fae97654d3a32a731cd.diff

LOG: Revert "[DXIL] Add GroupMemoryBarrierWithGroupSync intrinsic (#111884)"

This reverts commit 9a5b3a1bbca6790602ec3291da850fc4485cc807.

Added: 


Modified: 
llvm/include/llvm/IR/IntrinsicsDirectX.td
llvm/lib/Target/DirectX/DXIL.td
llvm/lib/Target/DirectX/DXILOpLowering.cpp
llvm/utils/TableGen/DXILEmitter.cpp

Removed: 
llvm/test/CodeGen/DirectX/group_memory_barrier_with_group_sync.ll



diff  --git a/llvm/include/llvm/IR/IntrinsicsDirectX.td 
b/llvm/include/llvm/IR/IntrinsicsDirectX.td
index dada426368995d..e30d37f69f781e 100644
--- a/llvm/include/llvm/IR/IntrinsicsDirectX.td
+++ b/llvm/include/llvm/IR/IntrinsicsDirectX.td
@@ -92,6 +92,4 @@ def int_dx_step : DefaultAttrsIntrinsic<[LLVMMatchType<0>], 
[llvm_anyfloat_ty, L
 def int_dx_splitdouble : DefaultAttrsIntrinsic<[llvm_anyint_ty, 
LLVMMatchType<0>], 
 [LLVMScalarOrSameVectorWidth<0, llvm_double_ty>], [IntrNoMem]>;
 def int_dx_radians : DefaultAttrsIntrinsic<[llvm_anyfloat_ty], 
[LLVMMatchType<0>], [IntrNoMem]>;
-
-def int_dx_group_memory_barrier_with_group_sync : DefaultAttrsIntrinsic<[], 
[], []>;
 }

diff  --git a/llvm/lib/Target/DirectX/DXIL.td b/llvm/lib/Target/DirectX/DXIL.td
index 263ca50011aa7b..1e8dc63ffa257e 100644
--- a/llvm/lib/Target/DirectX/DXIL.td
+++ b/llvm/lib/Target/DirectX/DXIL.td
@@ -294,43 +294,6 @@ class Attributes attrs> {
   list op_attrs = attrs;
 }
 
-class DXILConstant {
-  int value = value_;
-}
-
-defset list BarrierModes = {
-  def BarrierMode_DeviceMemoryBarrier  : DXILConstant<2>;
-  def BarrierMode_DeviceMemoryBarrierWithGroupSync : DXILConstant<3>;
-  def BarrierMode_GroupMemoryBarrier   : DXILConstant<8>;
-  def BarrierMode_GroupMemoryBarrierWithGroupSync  : DXILConstant<9>;
-  def BarrierMode_AllMemoryBarrier : DXILConstant<10>;
-  def BarrierMode_AllMemoryBarrierWithGroupSync: DXILConstant<11>;
-}
-
-// Intrinsic arg selection
-class Arg {
-  int index = -1;
-  DXILConstant value;
-  bit is_i8 = 0;
-  bit is_i32 = 0;
-}
-class ArgSelect : Arg {
-  let index = index_;
-}
-class ArgI32 : Arg {
-  let value = value_;
-  let is_i32 = 1;
-}
-class ArgI8 : Arg {
-  let value = value_;
-  let is_i8 = 1;
-}
-
-class IntrinsicSelect args_> {
-  Intrinsic intrinsic = intrinsic_;
-  list args = args_;
-}
-
 // Abstraction DXIL Operation
 class DXILOp {
   // A short description of the operation
@@ -345,9 +308,6 @@ class DXILOp {
   // LLVM Intrinsic DXIL Operation maps to
   Intrinsic LLVMIntrinsic = ?;
 
-  // Non-trivial LLVM Intrinsics DXIL Operation maps to
-  list intrinsic_selects = [];
-
   // Result type of the op
   DXILOpParamType result;
 
@@ -869,17 +829,3 @@ def WaveGetLaneIndex : DXILOp<111, waveGetLaneIndex> {
   let stages = [Stages];
   let attributes = [Attributes];
 }
-
-def Barrier : DXILOp<80, barrier> {
-  let Doc = "inserts a memory barrier in the shader";
-  let intrinsic_selects = [
-IntrinsicSelect<
-int_dx_group_memory_barrier_with_group_sync,
-[ ArgI32 ]>,
-  ];
-
-  let arguments = [Int32Ty];
-  let result = VoidTy;
-  let stages = [Stages];
-  let attributes = [Attributes];
-}

diff  --git a/llvm/lib/Target/DirectX/DXILOpLowering.cpp 
b/llvm/lib/Target/DirectX/DXILOpLowering.cpp
index b5cf1654181c6c..8acc9c1efa08c0 100644
--- a/llvm/lib/Target/DirectX/DXILOpLowering.cpp
+++ b/llvm/lib/Target/DirectX/DXILOpLowering.cpp
@@ -106,43 +106,17 @@ class OpLowerer {
 return false;
   }
 
-  struct ArgSelect {
-enum class Type {
-  Index,
-  I8,
-  I32,
-};
-Type Type = Type::Index;
-int Value = -1;
-  };
-
-  [[nodiscard]] bool replaceFunctionWithOp(Function &F, dxil::OpCode DXILOp,
-   ArrayRef ArgSelects) {
+  [[nodiscard]]
+  bool replaceFunctionWithOp(Function &F, dxil::OpCode DXILOp) {
 bool IsVectorArgExpansion = isVectorArgExpansion(F);
 return replaceFunction(F, [&](CallInst *CI) -> Error {
-  OpBuilder.getIRB().SetInsertPoint(CI);
   SmallVector Args;
-  if (ArgSelects.size()) {
-for (const ArgSelect &A : ArgSelects) {
-  switch (A.Type) {
-  case ArgSelect::Type::Index:
-Args.push_back(CI->getArgOperand(A.Value));
-break;
-  case ArgSelect::Type::I8:
-Args.push_back(OpBuilder.getIRB().getInt8((uint8_t)A.Value));
-break;
-  case ArgSelect::Type::I32:
-Args.push_back(OpBuilder.getIRB().getInt32(A.Value));
-break;
-  default:
-llvm_unreachable("Invalid type of intrinsic arg select.");
-  }
-}
-  

[llvm-branch-commits] [clang] [llvm] [LLVM] [Clang] Backport "Support for Gentoo `*t64` triples (64-bit time_t ABIs)" (PR #112364)

2024-10-30 Thread Andrew Kelley via llvm-branch-commits


@@ -294,7 +294,11 @@ class Triple {
 
 PAuthTest,
 
-LastEnvironmentType = PAuthTest
+GNUT64,
+GNUEABIT64,
+GNUEABIHFT64,
+
+LastEnvironmentType = GNUEABIHFT64

andrewrk wrote:

Well, that used to be the case anyway. @alexrp pointed out in 
https://github.com/ziglang/zig/pull/21862 that since we now generate LLVM 
bitcode rather than using the LLVM IRBuilder API, we can delete these strict 
checks.

So, this backport is not relevant to the zig project after all.

https://github.com/llvm/llvm-project/pull/112364
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] [llvm] [LLVM] [Clang] Backport "Support for Gentoo `*t64` triples (64-bit time_t ABIs)" (PR #112364)

2024-10-30 Thread Andrew Kelley via llvm-branch-commits

https://github.com/andrewrk edited 
https://github.com/llvm/llvm-project/pull/112364
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] [llvm] [LLVM] [Clang] Backport "Support for Gentoo `*t64` triples (64-bit time_t ABIs)" (PR #112364)

2024-10-30 Thread Andrew Kelley via llvm-branch-commits

https://github.com/andrewrk edited 
https://github.com/llvm/llvm-project/pull/112364
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [CodeGen][NewPM] Port RegUsageInfoCollector pass to NPM (PR #113874)

2024-10-30 Thread Akshat Oke via llvm-branch-commits

https://github.com/optimisan updated 
https://github.com/llvm/llvm-project/pull/113874

>From a95b69c07c7804d2e2a10b939a178a191643a41c Mon Sep 17 00:00:00 2001
From: Akshat Oke 
Date: Mon, 28 Oct 2024 06:22:49 +
Subject: [PATCH 1/4] [CodeGen][NewPM] Port RegUsageInfoCollector pass to NPM

---
 .../llvm/CodeGen/RegUsageInfoCollector.h  | 25 
 llvm/include/llvm/InitializePasses.h  |  2 +-
 llvm/include/llvm/Passes/CodeGenPassBuilder.h |  1 +
 .../llvm/Passes/MachinePassRegistry.def   |  2 +-
 llvm/lib/CodeGen/CodeGen.cpp  |  2 +-
 llvm/lib/CodeGen/RegUsageInfoCollector.cpp| 60 +--
 llvm/lib/Passes/PassBuilder.cpp   |  1 +
 llvm/test/CodeGen/AMDGPU/ipra-regmask.ll  |  5 ++
 8 files changed, 76 insertions(+), 22 deletions(-)
 create mode 100644 llvm/include/llvm/CodeGen/RegUsageInfoCollector.h

diff --git a/llvm/include/llvm/CodeGen/RegUsageInfoCollector.h 
b/llvm/include/llvm/CodeGen/RegUsageInfoCollector.h
new file mode 100644
index 00..6b88cc4f99089e
--- /dev/null
+++ b/llvm/include/llvm/CodeGen/RegUsageInfoCollector.h
@@ -0,0 +1,25 @@
+//===- llvm/CodeGen/RegUsageInfoCollector.h -*- C++ 
-*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM 
Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===--===//
+
+#ifndef LLVM_CODEGEN_REGUSAGEINFOCOLLECTOR_H
+#define LLVM_CODEGEN_REGUSAGEINFOCOLLECTOR_H
+
+#include "llvm/CodeGen/MachinePassManager.h"
+
+namespace llvm {
+
+class RegUsageInfoCollectorPass
+: public AnalysisInfoMixin {
+public:
+  PreservedAnalyses run(MachineFunction &MF,
+MachineFunctionAnalysisManager &MFAM);
+};
+
+} // namespace llvm
+
+#endif // LLVM_CODEGEN_REGUSAGEINFOCOLLECTOR_H
diff --git a/llvm/include/llvm/InitializePasses.h 
b/llvm/include/llvm/InitializePasses.h
index edc237f2819818..44b7ba830bb329 100644
--- a/llvm/include/llvm/InitializePasses.h
+++ b/llvm/include/llvm/InitializePasses.h
@@ -257,7 +257,7 @@ void 
initializeRegAllocPriorityAdvisorAnalysisPass(PassRegistry &);
 void initializeRegAllocScoringPass(PassRegistry &);
 void initializeRegBankSelectPass(PassRegistry &);
 void initializeRegToMemWrapperPassPass(PassRegistry &);
-void initializeRegUsageInfoCollectorPass(PassRegistry &);
+void initializeRegUsageInfoCollectorLegacyPass(PassRegistry &);
 void initializeRegUsageInfoPropagationPass(PassRegistry &);
 void initializeRegionInfoPassPass(PassRegistry &);
 void initializeRegionOnlyPrinterPass(PassRegistry &);
diff --git a/llvm/include/llvm/Passes/CodeGenPassBuilder.h 
b/llvm/include/llvm/Passes/CodeGenPassBuilder.h
index 8cbc9f71ab26d0..066cd70ec8b996 100644
--- a/llvm/include/llvm/Passes/CodeGenPassBuilder.h
+++ b/llvm/include/llvm/Passes/CodeGenPassBuilder.h
@@ -53,6 +53,7 @@
 #include "llvm/CodeGen/PHIElimination.h"
 #include "llvm/CodeGen/PreISelIntrinsicLowering.h"
 #include "llvm/CodeGen/RegAllocFast.h"
+#include "llvm/CodeGen/RegUsageInfoCollector.h"
 #include "llvm/CodeGen/RegisterUsageInfo.h"
 #include "llvm/CodeGen/ReplaceWithVeclib.h"
 #include "llvm/CodeGen/SafeStack.h"
diff --git a/llvm/include/llvm/Passes/MachinePassRegistry.def 
b/llvm/include/llvm/Passes/MachinePassRegistry.def
index 7db28cb0092525..0ee4794034e98b 100644
--- a/llvm/include/llvm/Passes/MachinePassRegistry.def
+++ b/llvm/include/llvm/Passes/MachinePassRegistry.def
@@ -156,6 +156,7 @@ MACHINE_FUNCTION_PASS("print",
   MachinePostDominatorTreePrinterPass(dbgs()))
 MACHINE_FUNCTION_PASS("print", SlotIndexesPrinterPass(dbgs()))
 MACHINE_FUNCTION_PASS("print", VirtRegMapPrinterPass(dbgs()))
+MACHINE_FUNCTION_PASS("reg-usage-collector", RegUsageInfoCollectorPass())
 MACHINE_FUNCTION_PASS("require-all-machine-function-properties",
   RequireAllMachineFunctionPropertiesPass())
 MACHINE_FUNCTION_PASS("stack-coloring", StackColoringPass())
@@ -250,7 +251,6 @@ DUMMY_MACHINE_FUNCTION_PASS("prologepilog-code", 
PrologEpilogCodeInserterPass)
 DUMMY_MACHINE_FUNCTION_PASS("ra-basic", RABasicPass)
 DUMMY_MACHINE_FUNCTION_PASS("ra-greedy", RAGreedyPass)
 DUMMY_MACHINE_FUNCTION_PASS("ra-pbqp", RAPBQPPass)
-DUMMY_MACHINE_FUNCTION_PASS("reg-usage-collector", RegUsageInfoCollectorPass)
 DUMMY_MACHINE_FUNCTION_PASS("reg-usage-propagation", 
RegUsageInfoPropagationPass)
 DUMMY_MACHINE_FUNCTION_PASS("regalloc", RegAllocPass)
 DUMMY_MACHINE_FUNCTION_PASS("regallocscoringpass", RegAllocScoringPass)
diff --git a/llvm/lib/CodeGen/CodeGen.cpp b/llvm/lib/CodeGen/CodeGen.cpp
index 39fba1d0b527ef..e7e8a121369b75 100644
--- a/llvm/lib/CodeGen/CodeGen.cpp
+++ b/llvm/lib/CodeGen/CodeGen.cpp
@@ -113,7 +113,7 @@ void llvm::initializeCodeGen(PassRegistry &Registry) {
   initializeRABasicPass(Registry);
   initializeRAGreedyPass(Registry);
   initializeRegAllocFastPass(Reg

[llvm-branch-commits] [llvm] [CodeGen][NewPM] Port RegUsageInfoPropagation pass to NPM (PR #114010)

2024-10-30 Thread Akshat Oke via llvm-branch-commits

https://github.com/optimisan updated 
https://github.com/llvm/llvm-project/pull/114010

>From 3370e24f9e9ec16b6404d7bcf3d72361c46934de Mon Sep 17 00:00:00 2001
From: Akshat Oke 
Date: Tue, 29 Oct 2024 07:14:30 +
Subject: [PATCH 1/2] [CodeGen][NewPM] Port RegUsageInfoPropagation pass to NPM

---
 .../llvm/CodeGen/RegUsageInfoPropagate.h  | 25 +++
 llvm/include/llvm/InitializePasses.h  |  2 +-
 llvm/include/llvm/Passes/CodeGenPassBuilder.h |  1 +
 .../llvm/Passes/MachinePassRegistry.def   |  2 +-
 llvm/lib/CodeGen/CodeGen.cpp  |  2 +-
 llvm/lib/CodeGen/RegUsageInfoPropagate.cpp| 75 +--
 llvm/lib/Passes/PassBuilder.cpp   |  1 +
 llvm/test/CodeGen/AArch64/preserve.ll |  4 +
 8 files changed, 86 insertions(+), 26 deletions(-)
 create mode 100644 llvm/include/llvm/CodeGen/RegUsageInfoPropagate.h

diff --git a/llvm/include/llvm/CodeGen/RegUsageInfoPropagate.h 
b/llvm/include/llvm/CodeGen/RegUsageInfoPropagate.h
new file mode 100644
index 00..73624015e37d9d
--- /dev/null
+++ b/llvm/include/llvm/CodeGen/RegUsageInfoPropagate.h
@@ -0,0 +1,25 @@
+//===- llvm/CodeGen/RegUsageInfoPropagate.h -*- C++ 
-*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM 
Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===--===//
+
+#ifndef LLVM_CODEGEN_REGUSAGEINFOPROPAGATE_H
+#define LLVM_CODEGEN_REGUSAGEINFOPROPAGATE_H
+
+#include "llvm/CodeGen/MachinePassManager.h"
+
+namespace llvm {
+
+class RegUsageInfoPropagationPass
+: public PassInfoMixin {
+public:
+  PreservedAnalyses run(MachineFunction &MF,
+MachineFunctionAnalysisManager &MFAM);
+};
+
+} // namespace llvm
+
+#endif // LLVM_CODEGEN_REGUSAGEINFOPROPAGATE_H
diff --git a/llvm/include/llvm/InitializePasses.h 
b/llvm/include/llvm/InitializePasses.h
index 44b7ba830bb329..bc209a4e939415 100644
--- a/llvm/include/llvm/InitializePasses.h
+++ b/llvm/include/llvm/InitializePasses.h
@@ -258,7 +258,7 @@ void initializeRegAllocScoringPass(PassRegistry &);
 void initializeRegBankSelectPass(PassRegistry &);
 void initializeRegToMemWrapperPassPass(PassRegistry &);
 void initializeRegUsageInfoCollectorLegacyPass(PassRegistry &);
-void initializeRegUsageInfoPropagationPass(PassRegistry &);
+void initializeRegUsageInfoPropagationLegacyPass(PassRegistry &);
 void initializeRegionInfoPassPass(PassRegistry &);
 void initializeRegionOnlyPrinterPass(PassRegistry &);
 void initializeRegionOnlyViewerPass(PassRegistry &);
diff --git a/llvm/include/llvm/Passes/CodeGenPassBuilder.h 
b/llvm/include/llvm/Passes/CodeGenPassBuilder.h
index 066cd70ec8b996..9f41cc41a7c926 100644
--- a/llvm/include/llvm/Passes/CodeGenPassBuilder.h
+++ b/llvm/include/llvm/Passes/CodeGenPassBuilder.h
@@ -54,6 +54,7 @@
 #include "llvm/CodeGen/PreISelIntrinsicLowering.h"
 #include "llvm/CodeGen/RegAllocFast.h"
 #include "llvm/CodeGen/RegUsageInfoCollector.h"
+#include "llvm/CodeGen/RegUsageInfoPropagate.h"
 #include "llvm/CodeGen/RegisterUsageInfo.h"
 #include "llvm/CodeGen/ReplaceWithVeclib.h"
 #include "llvm/CodeGen/SafeStack.h"
diff --git a/llvm/include/llvm/Passes/MachinePassRegistry.def 
b/llvm/include/llvm/Passes/MachinePassRegistry.def
index 0ee4794034e98b..6327ab1abd48e9 100644
--- a/llvm/include/llvm/Passes/MachinePassRegistry.def
+++ b/llvm/include/llvm/Passes/MachinePassRegistry.def
@@ -157,6 +157,7 @@ MACHINE_FUNCTION_PASS("print",
 MACHINE_FUNCTION_PASS("print", SlotIndexesPrinterPass(dbgs()))
 MACHINE_FUNCTION_PASS("print", VirtRegMapPrinterPass(dbgs()))
 MACHINE_FUNCTION_PASS("reg-usage-collector", RegUsageInfoCollectorPass())
+MACHINE_FUNCTION_PASS("reg-usage-propagation", RegUsageInfoPropagationPass())
 MACHINE_FUNCTION_PASS("require-all-machine-function-properties",
   RequireAllMachineFunctionPropertiesPass())
 MACHINE_FUNCTION_PASS("stack-coloring", StackColoringPass())
@@ -251,7 +252,6 @@ DUMMY_MACHINE_FUNCTION_PASS("prologepilog-code", 
PrologEpilogCodeInserterPass)
 DUMMY_MACHINE_FUNCTION_PASS("ra-basic", RABasicPass)
 DUMMY_MACHINE_FUNCTION_PASS("ra-greedy", RAGreedyPass)
 DUMMY_MACHINE_FUNCTION_PASS("ra-pbqp", RAPBQPPass)
-DUMMY_MACHINE_FUNCTION_PASS("reg-usage-propagation", 
RegUsageInfoPropagationPass)
 DUMMY_MACHINE_FUNCTION_PASS("regalloc", RegAllocPass)
 DUMMY_MACHINE_FUNCTION_PASS("regallocscoringpass", RegAllocScoringPass)
 DUMMY_MACHINE_FUNCTION_PASS("regbankselect", RegBankSelectPass)
diff --git a/llvm/lib/CodeGen/CodeGen.cpp b/llvm/lib/CodeGen/CodeGen.cpp
index e7e8a121369b75..013a9b3c9c4ffa 100644
--- a/llvm/lib/CodeGen/CodeGen.cpp
+++ b/llvm/lib/CodeGen/CodeGen.cpp
@@ -114,7 +114,7 @@ void llvm::initializeCodeGen(PassRegistry &Registry) {
   initializeRAGreedyPass(Registry);
   initializeRegAllocFastPass(Registry);
   initializeRegUsageInfoCollector

[llvm-branch-commits] [llvm] [DataLayout][LangRef] Split non-integral and unstable pointer properties (PR #105735)

2024-10-30 Thread David Chisnall via llvm-branch-commits

davidchisnall wrote:

I’m concerned about the semantics of unstable. This sounds like it would impact 
optimisation of memcmp, for example (is it still allowable to optimise away 
self comparisons?). I wouldn’t want that added to the LangRef without some 
clearer description of what optimisers *can* assume.  That said, given that 
it’s a property that’s already present for NI pointers for GC, I suppose we’re 
stuck with it for now.

Note that CHERI LLVM’s use of ptrtoint is largely historical. We worked with 
the folks doing NI pointers and my plan was to move to them eventually. We 
didn’t initially have the address-get intrinsic and so we abused ptrtoint for 
that, but I believe we’ve fixed the places in the front end that do this. When 
we upstream, we should get rid of that entirely. This also improves 
optimisation because optimisers treat address-get as non-escaping whereas 
ptrtoint assumes the pointer may be materialised anywhere else and makes alias 
analysis conservative.

https://github.com/llvm/llvm-project/pull/105735
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits