[llvm-branch-commits] [clang] release/19.x: [C++20] [Modules] Fix the duplicated static initializer problem (#114193) (PR #114197)
https://github.com/llvmbot created https://github.com/llvm/llvm-project/pull/114197

Backport 259eaa6878ead1e2e7ef572a874dc3d885c1899b

Requested by: @ChuanqiXu9

>From 6a3b8b5c12d45c5f981e90dae1a1e6c40d986cc7 Mon Sep 17 00:00:00 2001
From: Chuanqi Xu
Date: Wed, 30 Oct 2024 17:27:04 +0800
Subject: [PATCH] [C++20] [Modules] Fix the duplicated static initializer problem (#114193)

Reproducer:

```
//--- a.cppm
export module a;
int func();
static int a = func();

//--- a.cpp
import a;
```

The `func()` should only execute once. However, before this patch we will somehow import `static int a` from a.cppm incorrectly and initialize that again.

This is super bad and can introduce serious runtime behaviors.

And also surprisingly, it looks like the root cause of the problem is simply some oversight choosing APIs.

(cherry picked from commit 259eaa6878ead1e2e7ef572a874dc3d885c1899b)
---
 clang/lib/CodeGen/CodeGenModule.cpp        |  4 ++--
 clang/test/Modules/static-initializer.cppm | 18 ++
 2 files changed, 20 insertions(+), 2 deletions(-)
 create mode 100644 clang/test/Modules/static-initializer.cppm

diff --git a/clang/lib/CodeGen/CodeGenModule.cpp b/clang/lib/CodeGen/CodeGenModule.cpp
index 151505baf38db1..2a5d5f9083ae65 100644
--- a/clang/lib/CodeGen/CodeGenModule.cpp
+++ b/clang/lib/CodeGen/CodeGenModule.cpp
@@ -7080,8 +7080,8 @@ void CodeGenModule::EmitTopLevelDecl(Decl *D) {
     // For C++ standard modules we are done - we will call the module
     // initializer for imported modules, and that will likewise call those for
     // any imports it has.
-    if (CXX20ModuleInits && Import->getImportedOwningModule() &&
-        !Import->getImportedOwningModule()->isModuleMapModule())
+    if (CXX20ModuleInits && Import->getImportedModule() &&
+        Import->getImportedModule()->isNamedModule())
       break;
 
     // For clang C++ module map modules the initializers for sub-modules are
diff --git a/clang/test/Modules/static-initializer.cppm b/clang/test/Modules/static-initializer.cppm
new file mode 100644
index 00..10d4854ee67fa6
--- /dev/null
+++ b/clang/test/Modules/static-initializer.cppm
@@ -0,0 +1,18 @@
+// RUN: rm -rf %t
+// RUN: mkdir -p %t
+// RUN: split-file %s %t
+//
+// RUN: %clang_cc1 -triple %itanium_abi_triple -std=c++20 %t/a.cppm -emit-module-interface -o %t/a.pcm
+// RUN: %clang_cc1 -triple %itanium_abi_triple -std=c++20 %t/a.cpp -fmodule-file=a=%t/a.pcm -emit-llvm -o - | FileCheck %t/a.cpp
+
+//--- a.cppm
+export module a;
+int func();
+static int a = func();
+
+//--- a.cpp
+import a;
+
+// CHECK-NOT: internal global
+// CHECK-NOT: __cxx_global_var_init
+
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
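The invariant behind the reproducer (the dynamic initializer of `static int a` runs exactly once) can be illustrated with a single-translation-unit analogue. This is only a sketch: it does not use C++20 modules, and the counter `g_calls` and the helper `initializer_runs` are hypothetical names introduced for the illustration, not part of the patch or its test.

```cpp
// Hypothetical call counter; constant-initialized before any dynamic
// initializer in this translation unit runs.
static int g_calls = 0;

// Stand-in for the `func()` declared in a.cppm; counts its invocations.
int func() { return ++g_calls; }

// Internal-linkage variable with a dynamic initializer, as in a.cppm.
static int a = func();

// Reports how many times the initializer has run so far.
int initializer_runs() { return g_calls; }
```

In this analogue `initializer_runs()` is expected to report 1 once static initialization has finished. The bug described above would correspond to the importing translation unit emitting a second copy of `a` together with its initializer, which is what the `CHECK-NOT` lines in the new test guard against.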
[llvm-branch-commits] [llvm] [CodeGen][NewPM] Port RegUsageInfoCollector pass to NPM (PR #113874)
https://github.com/optimisan updated https://github.com/llvm/llvm-project/pull/113874 >From a95b69c07c7804d2e2a10b939a178a191643a41c Mon Sep 17 00:00:00 2001 From: Akshat Oke Date: Mon, 28 Oct 2024 06:22:49 + Subject: [PATCH 1/4] [CodeGen][NewPM] Port RegUsageInfoCollector pass to NPM --- .../llvm/CodeGen/RegUsageInfoCollector.h | 25 llvm/include/llvm/InitializePasses.h | 2 +- llvm/include/llvm/Passes/CodeGenPassBuilder.h | 1 + .../llvm/Passes/MachinePassRegistry.def | 2 +- llvm/lib/CodeGen/CodeGen.cpp | 2 +- llvm/lib/CodeGen/RegUsageInfoCollector.cpp| 60 +-- llvm/lib/Passes/PassBuilder.cpp | 1 + llvm/test/CodeGen/AMDGPU/ipra-regmask.ll | 5 ++ 8 files changed, 76 insertions(+), 22 deletions(-) create mode 100644 llvm/include/llvm/CodeGen/RegUsageInfoCollector.h diff --git a/llvm/include/llvm/CodeGen/RegUsageInfoCollector.h b/llvm/include/llvm/CodeGen/RegUsageInfoCollector.h new file mode 100644 index 00..6b88cc4f99089e --- /dev/null +++ b/llvm/include/llvm/CodeGen/RegUsageInfoCollector.h @@ -0,0 +1,25 @@ +//===- llvm/CodeGen/RegUsageInfoCollector.h -*- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. 
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===--===// + +#ifndef LLVM_CODEGEN_REGUSAGEINFOCOLLECTOR_H +#define LLVM_CODEGEN_REGUSAGEINFOCOLLECTOR_H + +#include "llvm/CodeGen/MachinePassManager.h" + +namespace llvm { + +class RegUsageInfoCollectorPass +: public AnalysisInfoMixin { +public: + PreservedAnalyses run(MachineFunction &MF, +MachineFunctionAnalysisManager &MFAM); +}; + +} // namespace llvm + +#endif // LLVM_CODEGEN_REGUSAGEINFOCOLLECTOR_H diff --git a/llvm/include/llvm/InitializePasses.h b/llvm/include/llvm/InitializePasses.h index edc237f2819818..44b7ba830bb329 100644 --- a/llvm/include/llvm/InitializePasses.h +++ b/llvm/include/llvm/InitializePasses.h @@ -257,7 +257,7 @@ void initializeRegAllocPriorityAdvisorAnalysisPass(PassRegistry &); void initializeRegAllocScoringPass(PassRegistry &); void initializeRegBankSelectPass(PassRegistry &); void initializeRegToMemWrapperPassPass(PassRegistry &); -void initializeRegUsageInfoCollectorPass(PassRegistry &); +void initializeRegUsageInfoCollectorLegacyPass(PassRegistry &); void initializeRegUsageInfoPropagationPass(PassRegistry &); void initializeRegionInfoPassPass(PassRegistry &); void initializeRegionOnlyPrinterPass(PassRegistry &); diff --git a/llvm/include/llvm/Passes/CodeGenPassBuilder.h b/llvm/include/llvm/Passes/CodeGenPassBuilder.h index 8cbc9f71ab26d0..066cd70ec8b996 100644 --- a/llvm/include/llvm/Passes/CodeGenPassBuilder.h +++ b/llvm/include/llvm/Passes/CodeGenPassBuilder.h @@ -53,6 +53,7 @@ #include "llvm/CodeGen/PHIElimination.h" #include "llvm/CodeGen/PreISelIntrinsicLowering.h" #include "llvm/CodeGen/RegAllocFast.h" +#include "llvm/CodeGen/RegUsageInfoCollector.h" #include "llvm/CodeGen/RegisterUsageInfo.h" #include "llvm/CodeGen/ReplaceWithVeclib.h" #include "llvm/CodeGen/SafeStack.h" diff --git a/llvm/include/llvm/Passes/MachinePassRegistry.def b/llvm/include/llvm/Passes/MachinePassRegistry.def index 7db28cb0092525..0ee4794034e98b 100644 --- 
a/llvm/include/llvm/Passes/MachinePassRegistry.def +++ b/llvm/include/llvm/Passes/MachinePassRegistry.def @@ -156,6 +156,7 @@ MACHINE_FUNCTION_PASS("print", MachinePostDominatorTreePrinterPass(dbgs())) MACHINE_FUNCTION_PASS("print", SlotIndexesPrinterPass(dbgs())) MACHINE_FUNCTION_PASS("print", VirtRegMapPrinterPass(dbgs())) +MACHINE_FUNCTION_PASS("reg-usage-collector", RegUsageInfoCollectorPass()) MACHINE_FUNCTION_PASS("require-all-machine-function-properties", RequireAllMachineFunctionPropertiesPass()) MACHINE_FUNCTION_PASS("stack-coloring", StackColoringPass()) @@ -250,7 +251,6 @@ DUMMY_MACHINE_FUNCTION_PASS("prologepilog-code", PrologEpilogCodeInserterPass) DUMMY_MACHINE_FUNCTION_PASS("ra-basic", RABasicPass) DUMMY_MACHINE_FUNCTION_PASS("ra-greedy", RAGreedyPass) DUMMY_MACHINE_FUNCTION_PASS("ra-pbqp", RAPBQPPass) -DUMMY_MACHINE_FUNCTION_PASS("reg-usage-collector", RegUsageInfoCollectorPass) DUMMY_MACHINE_FUNCTION_PASS("reg-usage-propagation", RegUsageInfoPropagationPass) DUMMY_MACHINE_FUNCTION_PASS("regalloc", RegAllocPass) DUMMY_MACHINE_FUNCTION_PASS("regallocscoringpass", RegAllocScoringPass) diff --git a/llvm/lib/CodeGen/CodeGen.cpp b/llvm/lib/CodeGen/CodeGen.cpp index 39fba1d0b527ef..e7e8a121369b75 100644 --- a/llvm/lib/CodeGen/CodeGen.cpp +++ b/llvm/lib/CodeGen/CodeGen.cpp @@ -113,7 +113,7 @@ void llvm::initializeCodeGen(PassRegistry &Registry) { initializeRABasicPass(Registry); initializeRAGreedyPass(Registry); initializeRegAllocFastPass(Reg
[llvm-branch-commits] [mlir] [MLIR][OpenMP] Emit descriptive errors for all unsupported clauses (PR #114037)
https://github.com/skatrak updated https://github.com/llvm/llvm-project/pull/114037 >From 5f9c42714f1f8168adcb55ef72bf10fd0f6db81a Mon Sep 17 00:00:00 2001 From: Sergio Afonso Date: Tue, 29 Oct 2024 11:18:07 + Subject: [PATCH 1/2] [MLIR][OpenMP] Emit descriptive errors for all unsupported clauses This patch improves error reporting in the MLIR to LLVM IR translation pass for the 'omp' dialect by emitting descriptive errors when encountering clauses not yet supported by that pass. Additionally, not-yet-implemented errors previously missing for some clauses are added, to avoid silently ignoring them. Error messages related to inlining of `omp.private` and `omp.declare_reduction` regions have been updated to use the same format. --- .../OpenMP/OpenMPToLLVMIRTranslation.cpp | 340 -- mlir/test/Target/LLVMIR/openmp-todo.mlir | 212 +-- 2 files changed, 421 insertions(+), 131 deletions(-) diff --git a/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp b/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp index 4d189b1f40c46b..582a68f4c00a47 100644 --- a/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp +++ b/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp @@ -581,7 +581,8 @@ makeReductionGen(omp::DeclareReductionOp decl, llvm::IRBuilderBase &builder, if (failed(inlineConvertOmpRegions(decl.getReductionRegion(), "omp.reduction.nonatomic.body", builder, moduleTranslation, &phis))) - return llvm::createStringError("failed reduction region translation"); + return llvm::createStringError( + "failed to inline `combiner` region of `omp.declare_reduction`"); assert(phis.size() == 1); result = phis[0]; return builder.saveIP(); @@ -614,7 +615,8 @@ makeAtomicReductionGen(omp::DeclareReductionOp decl, if (failed(inlineConvertOmpRegions(decl.getAtomicReductionRegion(), "omp.reduction.atomic.body", builder, moduleTranslation, &phis))) - return llvm::createStringError("failed reduction region translation"); + return 
llvm::createStringError( + "failed to inline `atomic` region of `omp.declare_reduction`"); assert(phis.empty()); return builder.saveIP(); }; @@ -650,6 +652,13 @@ convertOmpOrdered(Operation &opInst, llvm::IRBuilderBase &builder, return success(); } +static LogicalResult orderedRegionSupported(omp::OrderedRegionOp op) { + if (op.getParLevelSimd()) +return op.emitError("parallelization-level clause set not yet supported"); + + return success(); +} + /// Converts an OpenMP 'ordered_region' operation into LLVM IR using /// OpenMPIRBuilder. static LogicalResult @@ -658,9 +667,8 @@ convertOmpOrderedRegion(Operation &opInst, llvm::IRBuilderBase &builder, using InsertPointTy = llvm::OpenMPIRBuilder::InsertPointTy; auto orderedRegionOp = cast(opInst); - // TODO: The code generation for ordered simd directive is not supported yet. - if (orderedRegionOp.getParLevelSimd()) -return opInst.emitError("unhandled clauses for translation to LLVM IR"); + if (failed(orderedRegionSupported(orderedRegionOp))) +return failure(); auto bodyGenCB = [&](InsertPointTy allocaIP, InsertPointTy codeGenIP) { // OrderedOp has only one region associated with it. 
@@ -727,9 +735,10 @@ allocReductionVars(T loop, ArrayRef reductionArgs, SmallVector phis; if (failed(inlineConvertOmpRegions(allocRegion, "omp.reduction.alloc", builder, moduleTranslation, &phis))) -return failure(); - assert(phis.size() == 1 && "expected one allocation to be yielded"); +return loop.emitError( +"failed to inline `alloc` region of `omp.declare_reduction`"); + assert(phis.size() == 1 && "expected one allocation to be yielded"); builder.SetInsertPoint(allocaIP.getBlock()->getTerminator()); // Allocate reduction variable (which is a pointer to the real reduction @@ -995,6 +1004,16 @@ static LogicalResult allocAndInitializeReductionVars( return success(); } +static LogicalResult sectionsOpSupported(omp::SectionsOp op) { + if (!op.getAllocateVars().empty() || !op.getAllocatorVars().empty()) +return op.emitError("allocate clause not yet supported"); + + if (!op.getPrivateVars().empty() || op.getPrivateSyms()) +return op.emitError("privatization clauses not yet supported"); + + return success(); +} + static LogicalResult convertOmpSections(Operation &opInst, llvm::IRBuilderBase &builder, LLVM::ModuleTranslation &moduleTranslation) { @@ -1004,12 +1023,8 @@ convertOmpSections(Operation &opInst, llvm::IRBuilderBase &builder, auto sectionsOp = cast(opInst); - // TODO: Support the following clauses: private, firstprivate, lastprivat
[llvm-branch-commits] [mlir] [MLIR][OpenMP] Emit descriptive errors for all unsupported clauses (PR #114037)
@@ -640,6 +642,13 @@ convertOmpOrdered(Operation &opInst, llvm::IRBuilderBase &builder,
   return success();
 }
 
+static LogicalResult orderedRegionSupported(omp::OrderedRegionOp op) {

skatrak wrote:

This should be ready to review again. Now the implementation is much more centralized, helping us make sure error messages are consistent.

https://github.com/llvm/llvm-project/pull/114037
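The centralized pattern mentioned in the comment (one `...Supported` check per operation, consulted before translation) can be sketched without MLIR. Everything below is a hypothetical model: `SectionsOpModel`, its fields, and `convertSections` are stand-ins invented for illustration, and only the two error strings are taken from the patch.

```cpp
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-in for omp::SectionsOp. The real op exposes accessors
// such as getAllocateVars() and getPrivateVars(); plain fields model them.
struct SectionsOpModel {
  std::vector<int> allocateVars;
  std::vector<int> allocatorVars;
  std::vector<int> privateVars;
  bool hasPrivateSyms = false;
};

// Returns a descriptive message for the first unsupported clause found, or
// std::nullopt when every clause on the op can be translated.
std::optional<std::string> sectionsOpSupported(const SectionsOpModel &op) {
  if (!op.allocateVars.empty() || !op.allocatorVars.empty())
    return std::string("allocate clause not yet supported");
  if (!op.privateVars.empty() || op.hasPrivateSyms)
    return std::string("privatization clauses not yet supported");
  return std::nullopt;
}

// The translation entry point consults the check first and bails out early,
// so no clause is silently ignored.
bool convertSections(const SectionsOpModel &op, std::string &error) {
  if (std::optional<std::string> msg = sectionsOpSupported(op)) {
    error = *msg;
    return false;
  }
  return true;
}
```

Keeping every message in one function per operation is what makes it easy to keep the wording consistent, which appears to be the point of the refactoring.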
[llvm-branch-commits] [clang] release/19.x: [C++20] [Modules] Fix the duplicated static initializer problem (#114193) (PR #114197)
https://github.com/llvmbot milestoned https://github.com/llvm/llvm-project/pull/114197
[llvm-branch-commits] [llvm] [RISCV][NFC] refactor CFI emitting (PR #114227)
llvmbot wrote: @llvm/pr-subscribers-backend-risc-v Author: None (dlav-sc) Changes --- Full diff: https://github.com/llvm/llvm-project/pull/114227.diff 2 Files Affected: - (modified) llvm/lib/Target/RISCV/RISCVFrameLowering.cpp (+94-98) - (modified) llvm/lib/Target/RISCV/RISCVFrameLowering.h (+3-2) ``diff diff --git a/llvm/lib/Target/RISCV/RISCVFrameLowering.cpp b/llvm/lib/Target/RISCV/RISCVFrameLowering.cpp index 6375300117090f..b94dd031186356 100644 --- a/llvm/lib/Target/RISCV/RISCVFrameLowering.cpp +++ b/llvm/lib/Target/RISCV/RISCVFrameLowering.cpp @@ -27,6 +27,88 @@ using namespace llvm; +static unsigned getCaleeSavedRVVNumRegs(const Register &BaseReg) { + return RISCV::VRRegClass.contains(BaseReg) ? 1 + : RISCV::VRM2RegClass.contains(BaseReg) ? 2 + : RISCV::VRM4RegClass.contains(BaseReg) ? 4 + : 8; +} + +static MCRegister getRVVBaseRegister(const RISCVRegisterInfo &TRI, const Register &Reg) { + MCRegister BaseReg = TRI.getSubReg(Reg, RISCV::sub_vrm1_0); + // If it's not a grouped vector register, it doesn't have subregister, so + // the base register is just itself. 
+ if (BaseReg == RISCV::NoRegister) +BaseReg = Reg; + return BaseReg; +} + +namespace { + +struct CFIRestoreRegisterEmitter { + CFIRestoreRegisterEmitter(MachineFunction &, const RISCVSubtarget &) {}; + + void emit(MachineFunction &MF, MachineBasicBlock &MBB, MachineBasicBlock::iterator MBBI, const RISCVRegisterInfo &RI, const RISCVInstrInfo &TII, const DebugLoc &DL, const CalleeSavedInfo &CS) const { +Register Reg = CS.getReg(); +unsigned CFIIndex = MF.addFrameInst(MCCFIInstruction::createRestore( +nullptr, RI.getDwarfRegNum(Reg, true))); +BuildMI(MBB, MBBI, DL, TII.get(TargetOpcode::CFI_INSTRUCTION)) +.addCFIIndex(CFIIndex) +.setMIFlag(MachineInstr::FrameDestroy); + } +}; + +class CFIStoreRegisterEmitter { + MachineFrameInfo &MFI; + + public: + CFIStoreRegisterEmitter(MachineFunction &MF, const RISCVSubtarget &) : MFI{MF.getFrameInfo()} {}; + + void emit(MachineFunction &MF, MachineBasicBlock &MBB, MachineBasicBlock::iterator MBBI, const RISCVRegisterInfo &RI, const RISCVInstrInfo &TII, const DebugLoc &DL, const CalleeSavedInfo &CS) const { +int FrameIdx = CS.getFrameIdx(); +int64_t Offset = MFI.getObjectOffset(FrameIdx); +Register Reg = CS.getReg(); +unsigned CFIIndex = MF.addFrameInst(MCCFIInstruction::createOffset( +nullptr, RI.getDwarfRegNum(Reg, true), Offset)); +BuildMI(MBB, MBBI, DL, TII.get(TargetOpcode::CFI_INSTRUCTION)) +.addCFIIndex(CFIIndex) +.setMIFlag(MachineInstr::FrameSetup); + } +}; + +class CFIRestoreRVVRegisterEmitter { + const llvm::RISCVRegisterInfo *TRI; + + public: + CFIRestoreRVVRegisterEmitter(MachineFunction &, const RISCVSubtarget &STI) : TRI{STI.getRegisterInfo()} {}; + + void emit(MachineFunction &MF, MachineBasicBlock &MBB, MachineBasicBlock::iterator MBBI, const RISCVRegisterInfo &RI, const RISCVInstrInfo &TII, const DebugLoc &DL, const CalleeSavedInfo &CS) const { +MCRegister BaseReg = getRVVBaseRegister(*TRI, CS.getReg()); +unsigned NumRegs = getCaleeSavedRVVNumRegs(CS.getReg()); +for (unsigned i = 0; i < NumRegs; ++i) { + 
unsigned CFIIndex = MF.addFrameInst(MCCFIInstruction::createRestore( + nullptr, RI.getDwarfRegNum(BaseReg + i, true))); + BuildMI(MBB, MBBI, DL, TII.get(TargetOpcode::CFI_INSTRUCTION)) + .addCFIIndex(CFIIndex) + .setMIFlag(MachineInstr::FrameDestroy); +} + } +}; + +} + +template +void RISCVFrameLowering::emitCFIForCSI(MachineBasicBlock &MBB, MachineBasicBlock::iterator MBBI, const SmallVector &CSI) const { + MachineFunction *MF = MBB.getParent(); + const RISCVRegisterInfo *RI = STI.getRegisterInfo(); + const RISCVInstrInfo *TII = STI.getInstrInfo(); + DebugLoc DL = MBB.findDebugLoc(MBBI); + + Emitter E{*MF, STI}; + for (const auto &CS : CSI) +E.emit(*MF, MBB, MBBI, *RI, *TII, DL, CS); +} + static Align getABIStackAlignment(RISCVABI::ABI ABI) { if (ABI == RISCVABI::ABI_ILP32E) return Align(4); @@ -607,16 +689,7 @@ void RISCVFrameLowering::emitPrologue(MachineFunction &MF, .addCFIIndex(CFIIndex) .setMIFlag(MachineInstr::FrameSetup); -for (const auto &Entry : getPushOrLibCallsSavedInfo(MF, CSI)) { - int FrameIdx = Entry.getFrameIdx(); - int64_t Offset = MFI.getObjectOffset(FrameIdx); - Register Reg = Entry.getReg(); - unsigned CFIIndex = MF.addFrameInst(MCCFIInstruction::createOffset( - nullptr, RI->getDwarfRegNum(Reg, true), Offset)); - BuildMI(MBB, MBBI, DL, TII->get(TargetOpcode::CFI_INSTRUCTION)) - .addCFIIndex(CFIIndex) - .setMIFlag(MachineInstr::FrameSetup); -} +emitCFIForCSI(MBB, MBBI, getPushOrLibCallsSavedInfo(MF, CSI)); } // FIXME (note copied from Lanai): This appears to be overallocating. Needs @
[llvm-branch-commits] [llvm] [RISCV][NFC] refactor CFI emitting (PR #114227)
https://github.com/dlav-sc created https://github.com/llvm/llvm-project/pull/114227 None >From 825bc966611bb9a5e737f2ed65b524766998c261 Mon Sep 17 00:00:00 2001 From: Daniil Avdeev Date: Wed, 30 Oct 2024 13:24:21 + Subject: [PATCH] [RISCV][NFC] refactor CFI emitting --- llvm/lib/Target/RISCV/RISCVFrameLowering.cpp | 192 +-- llvm/lib/Target/RISCV/RISCVFrameLowering.h | 5 +- 2 files changed, 97 insertions(+), 100 deletions(-) diff --git a/llvm/lib/Target/RISCV/RISCVFrameLowering.cpp b/llvm/lib/Target/RISCV/RISCVFrameLowering.cpp index 6375300117090f..b94dd031186356 100644 --- a/llvm/lib/Target/RISCV/RISCVFrameLowering.cpp +++ b/llvm/lib/Target/RISCV/RISCVFrameLowering.cpp @@ -27,6 +27,88 @@ using namespace llvm; +static unsigned getCaleeSavedRVVNumRegs(const Register &BaseReg) { + return RISCV::VRRegClass.contains(BaseReg) ? 1 + : RISCV::VRM2RegClass.contains(BaseReg) ? 2 + : RISCV::VRM4RegClass.contains(BaseReg) ? 4 + : 8; +} + +static MCRegister getRVVBaseRegister(const RISCVRegisterInfo &TRI, const Register &Reg) { + MCRegister BaseReg = TRI.getSubReg(Reg, RISCV::sub_vrm1_0); + // If it's not a grouped vector register, it doesn't have subregister, so + // the base register is just itself. 
+ if (BaseReg == RISCV::NoRegister) +BaseReg = Reg; + return BaseReg; +} + +namespace { + +struct CFIRestoreRegisterEmitter { + CFIRestoreRegisterEmitter(MachineFunction &, const RISCVSubtarget &) {}; + + void emit(MachineFunction &MF, MachineBasicBlock &MBB, MachineBasicBlock::iterator MBBI, const RISCVRegisterInfo &RI, const RISCVInstrInfo &TII, const DebugLoc &DL, const CalleeSavedInfo &CS) const { +Register Reg = CS.getReg(); +unsigned CFIIndex = MF.addFrameInst(MCCFIInstruction::createRestore( +nullptr, RI.getDwarfRegNum(Reg, true))); +BuildMI(MBB, MBBI, DL, TII.get(TargetOpcode::CFI_INSTRUCTION)) +.addCFIIndex(CFIIndex) +.setMIFlag(MachineInstr::FrameDestroy); + } +}; + +class CFIStoreRegisterEmitter { + MachineFrameInfo &MFI; + + public: + CFIStoreRegisterEmitter(MachineFunction &MF, const RISCVSubtarget &) : MFI{MF.getFrameInfo()} {}; + + void emit(MachineFunction &MF, MachineBasicBlock &MBB, MachineBasicBlock::iterator MBBI, const RISCVRegisterInfo &RI, const RISCVInstrInfo &TII, const DebugLoc &DL, const CalleeSavedInfo &CS) const { +int FrameIdx = CS.getFrameIdx(); +int64_t Offset = MFI.getObjectOffset(FrameIdx); +Register Reg = CS.getReg(); +unsigned CFIIndex = MF.addFrameInst(MCCFIInstruction::createOffset( +nullptr, RI.getDwarfRegNum(Reg, true), Offset)); +BuildMI(MBB, MBBI, DL, TII.get(TargetOpcode::CFI_INSTRUCTION)) +.addCFIIndex(CFIIndex) +.setMIFlag(MachineInstr::FrameSetup); + } +}; + +class CFIRestoreRVVRegisterEmitter { + const llvm::RISCVRegisterInfo *TRI; + + public: + CFIRestoreRVVRegisterEmitter(MachineFunction &, const RISCVSubtarget &STI) : TRI{STI.getRegisterInfo()} {}; + + void emit(MachineFunction &MF, MachineBasicBlock &MBB, MachineBasicBlock::iterator MBBI, const RISCVRegisterInfo &RI, const RISCVInstrInfo &TII, const DebugLoc &DL, const CalleeSavedInfo &CS) const { +MCRegister BaseReg = getRVVBaseRegister(*TRI, CS.getReg()); +unsigned NumRegs = getCaleeSavedRVVNumRegs(CS.getReg()); +for (unsigned i = 0; i < NumRegs; ++i) { + 
unsigned CFIIndex = MF.addFrameInst(MCCFIInstruction::createRestore( + nullptr, RI.getDwarfRegNum(BaseReg + i, true))); + BuildMI(MBB, MBBI, DL, TII.get(TargetOpcode::CFI_INSTRUCTION)) + .addCFIIndex(CFIIndex) + .setMIFlag(MachineInstr::FrameDestroy); +} + } +}; + +} + +template +void RISCVFrameLowering::emitCFIForCSI(MachineBasicBlock &MBB, MachineBasicBlock::iterator MBBI, const SmallVector &CSI) const { + MachineFunction *MF = MBB.getParent(); + const RISCVRegisterInfo *RI = STI.getRegisterInfo(); + const RISCVInstrInfo *TII = STI.getInstrInfo(); + DebugLoc DL = MBB.findDebugLoc(MBBI); + + Emitter E{*MF, STI}; + for (const auto &CS : CSI) +E.emit(*MF, MBB, MBBI, *RI, *TII, DL, CS); +} + static Align getABIStackAlignment(RISCVABI::ABI ABI) { if (ABI == RISCVABI::ABI_ILP32E) return Align(4); @@ -607,16 +689,7 @@ void RISCVFrameLowering::emitPrologue(MachineFunction &MF, .addCFIIndex(CFIIndex) .setMIFlag(MachineInstr::FrameSetup); -for (const auto &Entry : getPushOrLibCallsSavedInfo(MF, CSI)) { - int FrameIdx = Entry.getFrameIdx(); - int64_t Offset = MFI.getObjectOffset(FrameIdx); - Register Reg = Entry.getReg(); - unsigned CFIIndex = MF.addFrameInst(MCCFIInstruction::createOffset( - nullptr, RI->getDwarfRegNum(Reg, true), Offset)); - BuildMI(MBB, MBBI, DL, TII->get(TargetOpcode::CFI_INSTRUCTION)) - .addCFIIndex(CFIIndex) - .setMIFlag(MachineInstr::FrameSetup); -} +emitCFIForCSI(MBB, MBB
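The refactoring above replaces hand-written loops with a single loop parameterized by an emitter type (the `Emitter E{*MF, STI};` line suggests a template parameter named `Emitter`, although the template parameter list did not survive in this copy of the patch). A minimal sketch of that policy-style design, with LLVM types replaced by hypothetical stand-ins (`SavedReg`, string records instead of CFI instructions), might look like this:

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for CalleeSavedInfo. `numRegs` models the value
// computed by getCaleeSavedRVVNumRegs (1, 2, 4 or 8 registers in a group).
struct SavedReg {
  unsigned reg;
  unsigned numRegs;
  long offset;
};

// Emits one "offset" record per saved register (prologue side).
struct StoreEmitter {
  void emit(std::vector<std::string> &out, const SavedReg &cs) const {
    out.push_back("offset r" + std::to_string(cs.reg) + " " +
                  std::to_string(cs.offset));
  }
};

// Emits one "restore" record per saved register (epilogue side).
struct RestoreEmitter {
  void emit(std::vector<std::string> &out, const SavedReg &cs) const {
    out.push_back("restore r" + std::to_string(cs.reg));
  }
};

// Emits one "restore" record for every register in a vector register group.
struct GroupedRestoreEmitter {
  void emit(std::vector<std::string> &out, const SavedReg &cs) const {
    for (unsigned i = 0; i < cs.numRegs; ++i)
      out.push_back("restore r" + std::to_string(cs.reg + i));
  }
};

// The shared loop: the policy type decides what a single entry produces.
template <typename Emitter>
void emitForAll(std::vector<std::string> &out,
                const std::vector<SavedReg> &csi) {
  Emitter e;
  for (const SavedReg &cs : csi)
    e.emit(out, cs);
}
```

With this shape, a call such as `emitForAll<GroupedRestoreEmitter>(out, csi)` selects the behavior at compile time, so prologue and epilogue code share one loop and differ only in the emitter they name.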
[llvm-branch-commits] [compiler-rt] release/19.x: [compiler-rt] [test] Fix using toolchains that rely on Clang default configs (#113491) (PR #114229)
https://github.com/llvmbot created https://github.com/llvm/llvm-project/pull/114229

Backport a14a83d9a102253eca7c02ff4c35a2ce3f7de6e5

Requested by: @mstorsjo

>From 3f88a54c4eeb8d7c92aab78b39e9fb9162c6030c Mon Sep 17 00:00:00 2001
From: Martin Storsjö
Date: Thu, 24 Oct 2024 23:45:14 +0300
Subject: [PATCH] [compiler-rt] [test] Fix using toolchains that rely on Clang default configs (#113491)

The use of CLANG_NO_DEFAULT_CONFIG in the tests was added because some Linux distributions had a global default config file, that added flags relating to hardening, which interfere with the sanitizer tests. By setting CLANG_NO_DEFAULT_CONFIG, the global default config files that are found are ignored, and the sanitizers get the expected default compiler behaviour. (This was https://github.com/llvm/llvm-project/issues/60394, which was fixed in 8ab762557fb057af1a3015211ee116a975027e78.)

However, some toolchains may rely on default config files for mandatory parts required for functioning at all - setting things like sysroots, -rtlib, -unwindlib, -stdlib, -fuse-ld etc. In such a case we can't forcibly disable any default config, because it will break the otherwise working toolchain.

Add a test for whether the compiler works while passing --no-default-config to it. If the option is accepted and the toolchain still works while that is set, set CLANG_NO_DEFAULT_CONFIG while running tests.

(This adds a little bit of inconsistency, as we're testing for the command line option, while using the environment variable. However doing compile testing with an environment variable isn't quite as easily doable, and passing an extra command line flag to all compile commands while testing, is a bit clumsy - therefore this inconsistency.)

(cherry picked from commit a14a83d9a102253eca7c02ff4c35a2ce3f7de6e5) --- compiler-rt/CMakeLists.txt| 16 compiler-rt/test/CMakeLists.txt | 2 ++ compiler-rt/test/lit.common.cfg.py| 6 +- compiler-rt/test/lit.common.configured.in | 1 + 4 files changed, 24 insertions(+), 1 deletion(-) diff --git a/compiler-rt/CMakeLists.txt b/compiler-rt/CMakeLists.txt index 2207555b03a03f..6cf20ab7c183ce 100644 --- a/compiler-rt/CMakeLists.txt +++ b/compiler-rt/CMakeLists.txt @@ -39,6 +39,22 @@ include(CompilerRTUtils) include(CMakeDependentOption) include(GetDarwinLinkerVersion) +include(CheckCXXCompilerFlag) + +# Check if we can compile with --no-default-config, or if that omits a config +# file that is essential for the toolchain to work properly. +# +# Using CMAKE_REQUIRED_FLAGS to make sure the flag is used both for compilation +# and for linking. +# +# Doing this test early on, to see if the flag works on the toolchain +# out of the box. Later on, we end up adding -nostdlib and similar flags +# to all test compiles, which easily can give false positives on this test. 
+set(OLD_CMAKE_REQUIRED_FLAGS "${CMAKE_REQUIRED_FLAGS}") +set(CMAKE_REQUIRED_FLAGS "${CMAKE_REQUIRED_FLAGS} --no-default-config") +check_cxx_compiler_flag("" COMPILER_RT_HAS_NO_DEFAULT_CONFIG_FLAG) +set(CMAKE_REQUIRED_FLAGS "${OLD_CMAKE_REQUIRED_FLAGS}") + option(COMPILER_RT_BUILD_BUILTINS "Build builtins" ON) mark_as_advanced(COMPILER_RT_BUILD_BUILTINS) option(COMPILER_RT_DISABLE_AARCH64_FMV "Disable AArch64 Function Multi Versioning support" OFF) diff --git a/compiler-rt/test/CMakeLists.txt b/compiler-rt/test/CMakeLists.txt index 84a98f36747495..f9e23710d3e4f7 100644 --- a/compiler-rt/test/CMakeLists.txt +++ b/compiler-rt/test/CMakeLists.txt @@ -12,6 +12,8 @@ pythonize_bool(COMPILER_RT_ENABLE_INTERNAL_SYMBOLIZER) pythonize_bool(COMPILER_RT_HAS_AARCH64_SME) +pythonize_bool(COMPILER_RT_HAS_NO_DEFAULT_CONFIG_FLAG) + configure_compiler_rt_lit_site_cfg( ${CMAKE_CURRENT_SOURCE_DIR}/lit.common.configured.in ${CMAKE_CURRENT_BINARY_DIR}/lit.common.configured) diff --git a/compiler-rt/test/lit.common.cfg.py b/compiler-rt/test/lit.common.cfg.py index 0690c3a18efdbc..d4b1e1d71d3c54 100644 --- a/compiler-rt/test/lit.common.cfg.py +++ b/compiler-rt/test/lit.common.cfg.py @@ -980,7 +980,11 @@ def is_windows_lto_supported(): # default configs for the test runs. In particular, anything hardening # related is likely to cause issues with sanitizer tests, because it may # preempt something we're looking to trap (e.g. _FORTIFY_SOURCE vs our ASAN). -config.environment["CLANG_NO_DEFAULT_CONFIG"] = "1" +# +# Only set this if we know we can still build for the target while disabling +# default configs. 
+if config.has_no_default_config_flag: +config.environment["CLANG_NO_DEFAULT_CONFIG"] = "1" if config.has_compiler_rt_libatomic: base_lib = os.path.join(config.compiler_rt_libdir, "libclang_rt.atomic%s.so" diff --git a/compiler-rt/test/lit.common.configured.in b/compiler-rt/test/lit.common.configured.in index 8889b816b149fc..f7276627995520 100644 --- a/compiler-rt/test/lit.common.configured.in +++ b/compiler-rt/test/lit.common.configured.in @@ -53,6 +53,7
[llvm-branch-commits] [compiler-rt] release/19.x: [compiler-rt] [test] Fix using toolchains that rely on Clang default configs (#113491) (PR #114229)
https://github.com/llvmbot milestoned https://github.com/llvm/llvm-project/pull/114229
[llvm-branch-commits] [compiler-rt] release/19.x: [compiler-rt] [test] Fix using toolchains that rely on Clang default configs (#113491) (PR #114229)
llvmbot wrote:

@mgorny What do you think about merging this PR to the release branch?

https://github.com/llvm/llvm-project/pull/114229
[llvm-branch-commits] [mlir] [OpenMP][MLIR] Descriptor explicit member map lowering changes (PR #113556)
https://github.com/skatrak commented:

LGTM, but buildbot errors in "MapInfoFinalization.cpp", "MapsForPrivatizedSymbols.cpp" and "Utils.cpp" seem to be a consequence of the type change to `members_index`. Hopefully it's a simple fix.

https://github.com/llvm/llvm-project/pull/113556
[llvm-branch-commits] [llvm] AMDGPU/GlobalISel: AMDGPURegBankSelect (PR #112863)
https://github.com/petar-avramovic edited https://github.com/llvm/llvm-project/pull/112863
[llvm-branch-commits] [clang] release/19.x: [C++20] [Modules] Fix the duplicated static initializer problem (#114193) (PR #114197)
llvmbot wrote:

@llvm/pr-subscribers-clang-modules

Author: None (llvmbot)

Changes

Backport 259eaa6878ead1e2e7ef572a874dc3d885c1899b

Requested by: @ChuanqiXu9

---
Full diff: https://github.com/llvm/llvm-project/pull/114197.diff

2 Files Affected:

- (modified) clang/lib/CodeGen/CodeGenModule.cpp (+2-2)
- (added) clang/test/Modules/static-initializer.cppm (+18)

``diff
diff --git a/clang/lib/CodeGen/CodeGenModule.cpp b/clang/lib/CodeGen/CodeGenModule.cpp
index 151505baf38db1..2a5d5f9083ae65 100644
--- a/clang/lib/CodeGen/CodeGenModule.cpp
+++ b/clang/lib/CodeGen/CodeGenModule.cpp
@@ -7080,8 +7080,8 @@ void CodeGenModule::EmitTopLevelDecl(Decl *D) {
     // For C++ standard modules we are done - we will call the module
     // initializer for imported modules, and that will likewise call those for
     // any imports it has.
-    if (CXX20ModuleInits && Import->getImportedOwningModule() &&
-        !Import->getImportedOwningModule()->isModuleMapModule())
+    if (CXX20ModuleInits && Import->getImportedModule() &&
+        Import->getImportedModule()->isNamedModule())
       break;
 
     // For clang C++ module map modules the initializers for sub-modules are
diff --git a/clang/test/Modules/static-initializer.cppm b/clang/test/Modules/static-initializer.cppm
new file mode 100644
index 00..10d4854ee67fa6
--- /dev/null
+++ b/clang/test/Modules/static-initializer.cppm
@@ -0,0 +1,18 @@
+// RUN: rm -rf %t
+// RUN: mkdir -p %t
+// RUN: split-file %s %t
+//
+// RUN: %clang_cc1 -triple %itanium_abi_triple -std=c++20 %t/a.cppm -emit-module-interface -o %t/a.pcm
+// RUN: %clang_cc1 -triple %itanium_abi_triple -std=c++20 %t/a.cpp -fmodule-file=a=%t/a.pcm -emit-llvm -o - | FileCheck %t/a.cpp
+
+//--- a.cppm
+export module a;
+int func();
+static int a = func();
+
+//--- a.cpp
+import a;
+
+// CHECK-NOT: internal global
+// CHECK-NOT: __cxx_global_var_init
+
``

https://github.com/llvm/llvm-project/pull/114197
[llvm-branch-commits] [libcxx] [libc++][C++03] Update include guards (PR #109001)
https://github.com/ldionne edited https://github.com/llvm/llvm-project/pull/109001
[llvm-branch-commits] [libcxx] [libc++][C++03] Update include guards (PR #109001)
@@ -29,18 +29,18 @@ _LIBCPP_PUSH_MACROS
 // TODO: Find out how altivec changes things and allow vectorizations there too.
 #if _LIBCPP_STD_VER >= 14 && defined(_LIBCPP_CLANG_VER) && !defined(__ALTIVEC__)
-# define _LIBCPP_HAS_ALGORITHM_VECTOR_UTILS 1
+# define _LIBCPP___CXX03_HAS_ALGORITHM_VECTOR_UTILS 1

ldionne wrote:

Oops!

https://github.com/llvm/llvm-project/pull/109001
[llvm-branch-commits] [libcxx] [libc++][C++03] Update include guards (PR #109001)
https://github.com/ldionne commented: This looks good to me in principle; however, the patch itself has a few incorrect renamings. One thing you could do is use `git diff --stat` to confirm that all the changed files have exactly 3 lines changed in them. A few headers, like the C compat headers, might need additional renames; that needs to be investigated. https://github.com/llvm/llvm-project/pull/109001 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [libcxx] [libc++][C++03] Update include guards (PR #109001)
@@ -41,7 +41,7 @@ template , int> = 0> [[nodiscard]] _LIBCPP_HIDE_FROM_ABI bool any_of(_ExecutionPolicy&& __policy, _ForwardIterator __first, _ForwardIterator __last, _Predicate __pred) { - _LIBCPP_REQUIRE_CPP17_FORWARD_ITERATOR(_ForwardIterator, "any_of requires a ForwardIterator"); + _LIBCPP___CXX03_REQUIRE_CPP17_FORWARD_ITERATOR(_ForwardIterator, "any_of requires a ForwardIterator"); ldionne wrote: Oops! https://github.com/llvm/llvm-project/pull/109001 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [lld] [PAC][lld][AArch64][ELF] Support signed GOT with tiny code model (PR #113816)
@@ -78,6 +78,79 @@ _start: adrp x1, :got_auth:zed add x1, x1, :got_auth_lo12:zed +#--- ok-tiny.s + +# RUN: llvm-mc -filetype=obj -triple=aarch64-none-linux ok-tiny.s -o ok-tiny.o + +# RUN: ld.lld ok-tiny.o a.so -pie -o external-tiny +# RUN: llvm-readelf -r -S -x .got external-tiny | FileCheck %s --check-prefix=EXTERNAL-TINY + +# RUN: ld.lld ok-tiny.o a.o -pie -o local-tiny +# RUN: llvm-readelf -r -S -x .got -s local-tiny | FileCheck %s --check-prefix=LOCAL-TINY + +# EXTERNAL-TINY: OffsetInfo Type Symbol's Value Symbol's Name + Addend +# EXTERNAL-TINY-NEXT: 00020380 0001e201 R_AARCH64_AUTH_GLOB_DAT bar + 0 +# EXTERNAL-TINY-NEXT: 00020388 0002e201 R_AARCH64_AUTH_GLOB_DAT zed + 0 + +## Symbol's values for bar and zed are equal since they contain no content (see Inputs/shared.s) +# LOCAL-TINY: OffsetInfo Type Symbol's Value Symbol's Name + Addend +# LOCAL-TINY-NEXT:00020320 0411 R_AARCH64_AUTH_RELATIVE 10260 +# LOCAL-TINY-NEXT:00020328 0411 R_AARCH64_AUTH_RELATIVE 10260 + +# EXTERNAL-TINY: Hex dump of section '.got': +# EXTERNAL-TINY-NEXT: 0x00020380 0080 00a0 +# ^^ +# 0b1000 bit 63 address diversity = true, bits 61..60 key = IA +# ^^ +# 0b1010 bit 63 address diversity = true, bits 61..60 key = DA ilovepi wrote: I assume these are intentionally not matched. In that case, is there a good reason to keep them in the test? https://github.com/llvm/llvm-project/pull/113816 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [clang] [llvm] [LLVM] [Clang] Backport "Support for Gentoo `*t64` triples (64-bit time_t ABIs)" (PR #112364)
@@ -294,7 +294,11 @@ class Triple { PAuthTest, -LastEnvironmentType = PAuthTest +GNUT64, +GNUEABIT64, +GNUEABIHFT64, + +LastEnvironmentType = GNUEABIHFT64 mgorny wrote: The change wasn't meant to break API or ABI compatibility, and whether it did is at least controversial. What Zig is doing here is quite uncommon. https://github.com/llvm/llvm-project/pull/112364 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [clang] [llvm] [LLVM] [Clang] Backport "Support for Gentoo `*t64` triples (64-bit time_t ABIs)" (PR #112364)
https://github.com/tru edited https://github.com/llvm/llvm-project/pull/112364 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [clang] [llvm] [LLVM] [Clang] Backport "Support for Gentoo `*t64` triples (64-bit time_t ABIs)" (PR #112364)
@@ -294,7 +294,11 @@ class Triple { PAuthTest, -LastEnvironmentType = PAuthTest +GNUT64, +GNUEABIT64, +GNUEABIHFT64, + +LastEnvironmentType = GNUEABIHFT64 tru wrote: Yeah, this is obviously an oversight, and if the ABI checker had flagged it I would have asked for changes. The question now is what we do about it; reverting the change might not be the best way forward. I think we need some input from @tstellar, @nikic and possibly @AaronBallman here. https://github.com/llvm/llvm-project/pull/112364 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [mlir] [OpenMP][MLIR] Descriptor explicit member map lowering changes (PR #113556)
agozillon wrote: Ah yes, this one won't build, as MapInfoFinalization.cpp is a Flang change, as are Utils.cpp/hpp and the other complaints. This patch requires the changeset from 113557, so it's 113557 we should be looking at to pass, which it currently doesn't, so I'll have a look there. But don't expect this one to pass, as it doesn't contain the changeset required to do so; that would need both the Flang frontend changes and the changeset in this PR, which contains only the MLIR project-level changes. https://github.com/llvm/llvm-project/pull/113556 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [mlir] [OpenMP][MLIR] Descriptor explicit member map lowering changes (PR #113556)
https://github.com/agozillon updated https://github.com/llvm/llvm-project/pull/113556 >From b27db9198dedd284ea24161dc4fe6fcb3952814b Mon Sep 17 00:00:00 2001 From: agozillon Date: Fri, 4 Oct 2024 13:03:22 -0500 Subject: [PATCH] [OpenMP][MLIR] Descriptor explicit member map lowering changes This is one of 3 PRs in a PR stack that aims to add support for explicit mapping of allocatable members in derived types. The primary changes in this PR are the OpenMPToLLVMIRTranslation.cpp changes, which are small and seek to alter the current member mapping to add an additional map insertion for pointers. Effectively, if the member is a pointer (currently indicated by having a varPtrPtr field) we add an additional map for the pointer and then alter the subsequent mapping of the member (the data) to utilise the member rather than the parent's base pointer. This appears to be necessary in certain cases when mapping pointer data within record types to avoid segfaulting on device (due to incorrect data mapping). In general this record type mapping may be simplifiable in the future. There are also additions of tests which should help to showcase the effect of the changes above. 
--- mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td | 2 +- mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp | 58 +++-- .../OpenMP/OpenMPToLLVMIRTranslation.cpp | 81 - mlir/test/Dialect/OpenMP/ops.mlir | 4 +- ...t-nested-ptr-record-type-mapping-host.mlir | 66 ++ ...arget-nested-record-type-mapping-host.mlir | 2 +- ...get-record-type-with-ptr-member-host.mlir} | 114 ++ 7 files changed, 197 insertions(+), 130 deletions(-) create mode 100644 mlir/test/Target/LLVMIR/omptarget-nested-ptr-record-type-mapping-host.mlir rename mlir/test/Target/LLVMIR/{omptarget-fortran-allocatable-types-host.mlir => omptarget-record-type-with-ptr-member-host.mlir} (58%) diff --git a/mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td b/mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td index 626539cb7bde42..348c1b9c2b8bdf 100644 --- a/mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td +++ b/mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td @@ -895,7 +895,7 @@ def MapInfoOp : OpenMP_Op<"map.info", [AttrSizedOperandSegments]> { TypeAttr:$var_type, Optional:$var_ptr_ptr, Variadic:$members, - OptionalAttr:$members_index, + OptionalAttr:$members_index, Variadic:$bounds, /* rank-0 to rank-{n-1} */ OptionalAttr:$map_type, OptionalAttr:$map_capture_type, diff --git a/mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp b/mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp index e1df647d6a3c71..8d31cda3a33ee9 100644 --- a/mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp +++ b/mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp @@ -1395,16 +1395,15 @@ static void printMapClause(OpAsmPrinter &p, Operation *op, } static ParseResult parseMembersIndex(OpAsmParser &parser, - DenseIntElementsAttr &membersIdx) { - SmallVector values; - int64_t value; - int64_t shape[2] = {0, 0}; - unsigned shapeTmp = 0; + ArrayAttr &membersIdx) { + SmallVector values, memberIdxs; + auto parseIndices = [&]() -> ParseResult { +int64_t value; if (parser.parseInteger(value)) return failure(); -shapeTmp++; -values.push_back(APInt(32, value, /*isSigned=*/true)); 
+values.push_back(IntegerAttr::get(parser.getBuilder().getIntegerType(64), + APInt(64, value, /*isSigned=*/false))); return success(); }; @@ -1418,52 +1417,29 @@ static ParseResult parseMembersIndex(OpAsmParser &parser, if (failed(parser.parseRSquare())) return failure(); -// Only set once, if any indices are not the same size -// we error out in the next check as that's unsupported -if (shape[1] == 0) - shape[1] = shapeTmp; - -// Verify that the recently parsed list is equal to the -// first one we parsed, they must be equal lengths to -// keep the rectangular shape DenseIntElementsAttr -// requires -if (shapeTmp != shape[1]) - return failure(); - -shapeTmp = 0; -shape[0]++; +memberIdxs.push_back(ArrayAttr::get(parser.getContext(), values)); +values.clear(); } while (succeeded(parser.parseOptionalComma())); - if (!values.empty()) { -ShapedType valueType = -VectorType::get(shape, IntegerType::get(parser.getContext(), 32)); -membersIdx = DenseIntElementsAttr::get(valueType, values); - } + if (!memberIdxs.empty()) +membersIdx = ArrayAttr::get(parser.getContext(), memberIdxs); return success(); } static void printMembersIndex(OpAsmPrinter &p, MapInfoOp op, - DenseIntElementsAttr membersIdx) { - llvm::ArrayRef shape = membersIdx.getShapedType
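For readers following the `members_index` type change in the hunk above: the removed code rejected index lists of unequal length because a `DenseIntElementsAttr` has to be rectangular, whereas an array of arrays can hold lists of different lengths. Below is a minimal standalone sketch of that difference, using plain `std::vector` rather than MLIR attributes. The function name `parseMemberIndices`, the `MemberIndices` alias and the exact textual format are assumptions made for illustration; this is not the MLIR parser.

```cpp
#include <cctype>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a nested array attribute: one inner list per member.
using MemberIndices = std::vector<std::vector<int64_t>>;

// Parses text such as "[0,1],[2]" into nested index lists.
// Inner lists may differ in length; no rectangular shape is enforced.
std::optional<MemberIndices> parseMemberIndices(const std::string &text) {
  MemberIndices result;
  std::size_t pos = 0;
  auto isDigitAt = [&](std::size_t p) {
    return p < text.size() &&
           std::isdigit(static_cast<unsigned char>(text[p])) != 0;
  };
  while (true) {
    if (pos >= text.size() || text[pos] != '[')
      return std::nullopt;
    ++pos;
    std::vector<int64_t> values;
    while (true) {
      if (!isDigitAt(pos))
        return std::nullopt;
      int64_t value = 0;
      while (isDigitAt(pos))
        value = value * 10 + (text[pos++] - '0');
      values.push_back(value);
      if (pos < text.size() && text[pos] == ',') {
        ++pos; // another index follows in the same list
        continue;
      }
      break;
    }
    if (pos >= text.size() || text[pos] != ']')
      return std::nullopt;
    ++pos;
    result.push_back(std::move(values));
    if (pos < text.size() && text[pos] == ',') {
      ++pos; // another bracketed list follows
      continue;
    }
    break;
  }
  if (pos != text.size())
    return std::nullopt; // trailing characters are an error
  return result;
}
```

With this shape, `"[0,1],[2]"` parses into two lists of lengths 2 and 1, which a rectangular container could not represent without padding.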
[llvm-branch-commits] [mlir] [mlir][bufferization] Remove `finalizing-bufferize` pass (PR #114154)
https://github.com/javedabsar1 commented: Hi Matthias. Will you delete the references in the docs in a different diff? https://github.com/llvm/llvm-project/blob/main/mlir/docs/Bufferization.md?plain=1#L561 https://github.com/llvm/llvm-project/pull/114154 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [mlir] [OpenMP][MLIR] Descriptor explicit member map lowering changes (PR #113556)
agozillon wrote: > LGTM, but buildbot errors in "MapInfoFinalization.cpp", > "MapsForPrivatizedSymbols.cpp" and "Utils.cpp" seem to be a consequence of > the type change to `members_index`. Hopefully it's a simple fix. I have a feeling it's just not set up correctly to layer on top of its dependent PRs, hence the errors. In conjunction it all builds fine and runs on my machine, and I'll double check again before landing the PR stack :-) Would love an approval if you're happy with the PR's current state! And would love an approval or further review comments from @ergawy @TIFitis if at all possible; it would be wonderful to be able to land this in the next week or two as it seems to be approaching closure :-) https://github.com/llvm/llvm-project/pull/113556 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [libcxx] [llvm] [libc++][C++03] Use `__cxx03/` headers in C++03 mode (PR #109002)
https://github.com/ldionne edited https://github.com/llvm/llvm-project/pull/109002 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [flang] [Flang][OpenMP] Derived type explicit allocatable member mapping (PR #113557)
agozillon wrote: Would love a review on this whenever you all get a bit of spare time! Thank you very much for your help and time; it's greatly appreciated :-) https://github.com/llvm/llvm-project/pull/113557 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [clang] [llvm] [LLVM] [Clang] Backport "Support for Gentoo `*t64` triples (64-bit time_t ABIs)" (PR #112364)
@@ -294,7 +294,11 @@ class Triple { PAuthTest, -LastEnvironmentType = PAuthTest +GNUT64, +GNUEABIT64, +GNUEABIHFT64, + +LastEnvironmentType = GNUEABIHFT64 alexrp wrote: Also, ABI aside, quoting [Release Patch Rules](https://llvm.org/docs/HowToReleaseLLVM.html#release-patch-rules): > Bug fix releases Patches should be limited to bug fixes or very safe and > critical performance improvements. Patches must maintain **both API and ABI > compatibility** with the previous major release. https://github.com/llvm/llvm-project/pull/112364 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
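To make the compatibility question in this thread concrete, here is a small sketch of what the hunk does numerically. It uses a hypothetical, heavily shortened enum (the namespaces `v1`/`v2` and the helper `isKnownEnvironment` are invented for illustration and are not part of LLVM). Whether the change counts as an API or ABI break is exactly what is being debated; the sketch only shows which values move and which do not.

```cpp
// Hypothetical miniature of the enum before the change.
namespace v1 {
enum EnvironmentType {
  Unknown,
  GNU,
  PAuthTest,
  LastEnvironmentType = PAuthTest
};
} // namespace v1

// The same enum after appending three enumerators at the end.
namespace v2 {
enum EnvironmentType {
  Unknown,
  GNU,
  PAuthTest,
  GNUT64,
  GNUEABIT64,
  GNUEABIHFT64,
  LastEnvironmentType = GNUEABIHFT64
};
} // namespace v2

// Enumerators that already existed keep their numeric values...
static_assert(static_cast<int>(v1::PAuthTest) ==
                  static_cast<int>(v2::PAuthTest),
              "appending must not renumber existing enumerators");
// ...but the sentinel changes, so code compiled against the old value
// (or mirroring the enum by hand) disagrees about the valid range.
static_assert(static_cast<int>(v1::LastEnvironmentType) !=
                  static_cast<int>(v2::LastEnvironmentType),
              "the sentinel moves when enumerators are appended");

// Range check parameterised on the sentinel the caller was built with.
constexpr bool isKnownEnvironment(int raw, int last) {
  return raw >= 0 && raw <= last;
}
```

A client that baked in the old sentinel would reject `GNUT64` as out of range, while one built against the new header accepts it.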
[llvm-branch-commits] [llvm] [llvm][fatlto] Drop any CFI related instrumentation after emitting bitcode (PR #112788)
https://github.com/ilovepi updated https://github.com/llvm/llvm-project/pull/112788 >From ad89d61e60bac57cf8c66a974d741377ebe1db30 Mon Sep 17 00:00:00 2001 From: Paul Kirth Date: Fri, 18 Oct 2024 01:59:26 + Subject: [PATCH] Use new enum in constructor Created using spr 1.3.4 --- llvm/lib/Passes/PassBuilderPipelines.cpp | 7 +-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/llvm/lib/Passes/PassBuilderPipelines.cpp b/llvm/lib/Passes/PassBuilderPipelines.cpp index aec79304ab5c3c..0585e83e59a9ab 100644 --- a/llvm/lib/Passes/PassBuilderPipelines.cpp +++ b/llvm/lib/Passes/PassBuilderPipelines.cpp @@ -1631,8 +1631,11 @@ PassBuilder::buildFatLTODefaultPipeline(OptimizationLevel Level, bool ThinLTO, MPM.addPass(EmbedBitcodePass(ThinLTO, EmitSummary)); // If we're doing FatLTO w/ CFI enabled, we don't want the type tests in the - // object file. - MPM.addPass(LowerTypeTestsPass(nullptr, nullptr, true, true)); + // object code, only in the bitcode section, so drop it before we run + // module optimization and generate machine code. If llvm.type.test() isn't in + // the IR, this won't do anything. + MPM.addPass( + LowerTypeTestsPass(nullptr, nullptr, lowertypetests::DropTestKind::All)); // Use the ThinLTO post-link pipeline with sample profiling if (ThinLTO && PGOOpt && PGOOpt->Action == PGOOptions::SampleUse) ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
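The hunk above swaps a pair of positional `true, true` arguments for a named enumerator. As a rough illustration of why that reads better at the call site, here is a standalone sketch. The enum `DropKind` and the function `describe` are hypothetical stand-ins and are not the LLVM declarations; the descriptions attached to each enumerator are likewise assumptions.

```cpp
#include <string>

// Hypothetical stand-in for an enum in the style of
// lowertypetests::DropTestKind; the enumerator meanings are assumed.
enum class DropKind { None, Assume, All };

// A call such as pass(nullptr, nullptr, true, true) does not say what the
// booleans control. Taking an enum makes the intent visible at the call site.
std::string describe(DropKind kind) {
  switch (kind) {
  case DropKind::None:
    return "keep type tests";
  case DropKind::Assume:
    return "drop type tests that only feed assumes";
  case DropKind::All:
    return "drop all type tests";
  }
  return "unknown"; // unreachable for valid enumerators
}
```

An enum also leaves room for a third state without adding yet another boolean parameter.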
[llvm-branch-commits] [llvm] [RISCV][NFC] refactor CFI emitting (PR #114227)
https://github.com/lenary commented: This cleanup is going in a nice direction, I think. A few suggestions/questions below. https://github.com/llvm/llvm-project/pull/114227 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [RISCV][NFC] refactor CFI emitting (PR #114227)
@@ -27,6 +27,102 @@ using namespace llvm; +static unsigned getCaleeSavedRVVNumRegs(const Register &BaseReg) { + return RISCV::VRRegClass.contains(BaseReg) ? 1 + : RISCV::VRM2RegClass.contains(BaseReg) ? 2 + : RISCV::VRM4RegClass.contains(BaseReg) ? 4 + : 8; +} + +static MCRegister getRVVBaseRegister(const RISCVRegisterInfo &TRI, + const Register &Reg) { + MCRegister BaseReg = TRI.getSubReg(Reg, RISCV::sub_vrm1_0); + // If it's not a grouped vector register, it doesn't have subregister, so + // the base register is just itself. + if (BaseReg == RISCV::NoRegister) +BaseReg = Reg; + return BaseReg; +} + +namespace { + +struct CFIRestoreRegisterEmitter { + CFIRestoreRegisterEmitter(MachineFunction &, const RISCVSubtarget &) {}; + + void emit(MachineFunction &MF, MachineBasicBlock &MBB, +MachineBasicBlock::iterator MBBI, const RISCVRegisterInfo &RI, +const RISCVInstrInfo &TII, const DebugLoc &DL, +const CalleeSavedInfo &CS) const { +Register Reg = CS.getReg(); +unsigned CFIIndex = MF.addFrameInst( +MCCFIInstruction::createRestore(nullptr, RI.getDwarfRegNum(Reg, true))); +BuildMI(MBB, MBBI, DL, TII.get(TargetOpcode::CFI_INSTRUCTION)) +.addCFIIndex(CFIIndex) +.setMIFlag(MachineInstr::FrameDestroy); + } +}; + +class CFIStoreRegisterEmitter { + MachineFrameInfo &MFI; + +public: + CFIStoreRegisterEmitter(MachineFunction &MF, const RISCVSubtarget &) + : MFI{MF.getFrameInfo()} {}; + + void emit(MachineFunction &MF, MachineBasicBlock &MBB, +MachineBasicBlock::iterator MBBI, const RISCVRegisterInfo &RI, +const RISCVInstrInfo &TII, const DebugLoc &DL, +const CalleeSavedInfo &CS) const { +int FrameIdx = CS.getFrameIdx(); +int64_t Offset = MFI.getObjectOffset(FrameIdx); +Register Reg = CS.getReg(); +unsigned CFIIndex = MF.addFrameInst(MCCFIInstruction::createOffset( +nullptr, RI.getDwarfRegNum(Reg, true), Offset)); +BuildMI(MBB, MBBI, DL, TII.get(TargetOpcode::CFI_INSTRUCTION)) +.addCFIIndex(CFIIndex) +.setMIFlag(MachineInstr::FrameSetup); + } +}; + +class 
CFIRestoreRVVRegisterEmitter { + const llvm::RISCVRegisterInfo *TRI; + +public: + CFIRestoreRVVRegisterEmitter(MachineFunction &, const RISCVSubtarget &STI) + : TRI{STI.getRegisterInfo()} {}; + + void emit(MachineFunction &MF, MachineBasicBlock &MBB, +MachineBasicBlock::iterator MBBI, const RISCVRegisterInfo &RI, +const RISCVInstrInfo &TII, const DebugLoc &DL, +const CalleeSavedInfo &CS) const { +MCRegister BaseReg = getRVVBaseRegister(*TRI, CS.getReg()); +unsigned NumRegs = getCaleeSavedRVVNumRegs(CS.getReg()); +for (unsigned i = 0; i < NumRegs; ++i) { + unsigned CFIIndex = MF.addFrameInst(MCCFIInstruction::createRestore( + nullptr, RI.getDwarfRegNum(BaseReg + i, true))); + BuildMI(MBB, MBBI, DL, TII.get(TargetOpcode::CFI_INSTRUCTION)) + .addCFIIndex(CFIIndex) + .setMIFlag(MachineInstr::FrameDestroy); +} + } +}; + +} // namespace + +template +void RISCVFrameLowering::emitCFIForCSI( +MachineBasicBlock &MBB, MachineBasicBlock::iterator MBBI, +const SmallVector &CSI) const { + MachineFunction *MF = MBB.getParent(); + const RISCVRegisterInfo *RI = STI.getRegisterInfo(); + const RISCVInstrInfo *TII = STI.getInstrInfo(); + DebugLoc DL = MBB.findDebugLoc(MBBI); + + Emitter E{*MF, STI}; + for (const auto &CS : CSI) +E.emit(*MF, MBB, MBBI, *RI, *TII, DL, CS); lenary wrote: I don't quite know why `*MF` needs to be passed into both the constructor and into `emit`? Would it not be simpler to just pass it into the constructor and cache the reference? https://github.com/llvm/llvm-project/pull/114227 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
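The suggestion above (pass the function once to the constructor and cache the reference, instead of passing it again to every `emit` call) can be sketched as follows. All types here (`Function`, `SavedReg`, `RestoreEmitter`, `emitForAll`) are simplified, hypothetical stand-ins and not the LLVM classes from the patch.

```cpp
#include <vector>

// Hypothetical stand-ins for MachineFunction and CalleeSavedInfo.
struct Function {
  std::vector<int> frameInsts; // records one entry per emitted CFI directive
};
struct SavedReg {
  int dwarfReg;
};

// The emitter stores the function reference once, so emit() only needs
// the per-register data.
class RestoreEmitter {
  Function &F;

public:
  explicit RestoreEmitter(Function &F) : F(F) {}

  void emit(const SavedReg &CS) const { F.frameInsts.push_back(CS.dwarfReg); }
};

// Generic driver in the spirit of a templated emit-for-all-registers helper.
template <typename Emitter>
void emitForAll(Function &F, const std::vector<SavedReg> &CSI) {
  Emitter E{F};
  for (const SavedReg &CS : CSI)
    E.emit(CS);
}
```

The trade-off is that each emitter then owns a little state, but every `emit` signature shrinks and the driver loop no longer has to thread the same arguments through on each iteration.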
[llvm-branch-commits] [llvm] [RISCV][NFC] refactor CFI emitting (PR #114227)
@@ -27,6 +27,102 @@ using namespace llvm; +static unsigned getCaleeSavedRVVNumRegs(const Register &BaseReg) { + return RISCV::VRRegClass.contains(BaseReg) ? 1 + : RISCV::VRM2RegClass.contains(BaseReg) ? 2 + : RISCV::VRM4RegClass.contains(BaseReg) ? 4 + : 8; +} + +static MCRegister getRVVBaseRegister(const RISCVRegisterInfo &TRI, + const Register &Reg) { + MCRegister BaseReg = TRI.getSubReg(Reg, RISCV::sub_vrm1_0); + // If it's not a grouped vector register, it doesn't have subregister, so + // the base register is just itself. + if (BaseReg == RISCV::NoRegister) +BaseReg = Reg; + return BaseReg; +} + +namespace { + +struct CFIRestoreRegisterEmitter { + CFIRestoreRegisterEmitter(MachineFunction &, const RISCVSubtarget &) {}; + + void emit(MachineFunction &MF, MachineBasicBlock &MBB, +MachineBasicBlock::iterator MBBI, const RISCVRegisterInfo &RI, +const RISCVInstrInfo &TII, const DebugLoc &DL, +const CalleeSavedInfo &CS) const { +Register Reg = CS.getReg(); +unsigned CFIIndex = MF.addFrameInst( +MCCFIInstruction::createRestore(nullptr, RI.getDwarfRegNum(Reg, true))); +BuildMI(MBB, MBBI, DL, TII.get(TargetOpcode::CFI_INSTRUCTION)) +.addCFIIndex(CFIIndex) +.setMIFlag(MachineInstr::FrameDestroy); + } +}; + +class CFIStoreRegisterEmitter { lenary wrote: ```suggestion class CFISaveRegisterEmitter { ``` I think it would be clearer to use "Save" as the opposite of "Restore", rather than "Store" vs "Restore". https://github.com/llvm/llvm-project/pull/114227 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [RISCV][NFC] refactor CFI emitting (PR #114227)
@@ -1737,39 +1776,14 @@ void RISCVFrameLowering::emitCalleeSavedRVVPrologCFI( for (auto &CS : RVVCSI) { // Insert the spill to the stack frame. int FI = CS.getFrameIdx(); lenary wrote: Is there a reason you didn't replace this loop with a call to `emitCFIForCSI` with a new `CFISaveRVVRegisterEmitter`? That would give slightly better symmetry in those objects, even though it doesn't get rid of the calls to `emitCalleeSavedRVVPrologCFI` because of the other logic happening before this loop. https://github.com/llvm/llvm-project/pull/114227 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [RISCV][NFC] refactor CFI emitting (PR #114227)
@@ -27,6 +27,102 @@ using namespace llvm; +static unsigned getCaleeSavedRVVNumRegs(const Register &BaseReg) { + return RISCV::VRRegClass.contains(BaseReg) ? 1 + : RISCV::VRM2RegClass.contains(BaseReg) ? 2 + : RISCV::VRM4RegClass.contains(BaseReg) ? 4 + : 8; +} + +static MCRegister getRVVBaseRegister(const RISCVRegisterInfo &TRI, + const Register &Reg) { + MCRegister BaseReg = TRI.getSubReg(Reg, RISCV::sub_vrm1_0); + // If it's not a grouped vector register, it doesn't have subregister, so + // the base register is just itself. + if (BaseReg == RISCV::NoRegister) +BaseReg = Reg; + return BaseReg; +} + +namespace { + +struct CFIRestoreRegisterEmitter { + CFIRestoreRegisterEmitter(MachineFunction &, const RISCVSubtarget &) {}; + + void emit(MachineFunction &MF, MachineBasicBlock &MBB, +MachineBasicBlock::iterator MBBI, const RISCVRegisterInfo &RI, +const RISCVInstrInfo &TII, const DebugLoc &DL, +const CalleeSavedInfo &CS) const { +Register Reg = CS.getReg(); +unsigned CFIIndex = MF.addFrameInst( +MCCFIInstruction::createRestore(nullptr, RI.getDwarfRegNum(Reg, true))); +BuildMI(MBB, MBBI, DL, TII.get(TargetOpcode::CFI_INSTRUCTION)) +.addCFIIndex(CFIIndex) +.setMIFlag(MachineInstr::FrameDestroy); + } +}; + +class CFIStoreRegisterEmitter { + MachineFrameInfo &MFI; + +public: + CFIStoreRegisterEmitter(MachineFunction &MF, const RISCVSubtarget &) + : MFI{MF.getFrameInfo()} {}; + + void emit(MachineFunction &MF, MachineBasicBlock &MBB, +MachineBasicBlock::iterator MBBI, const RISCVRegisterInfo &RI, +const RISCVInstrInfo &TII, const DebugLoc &DL, +const CalleeSavedInfo &CS) const { +int FrameIdx = CS.getFrameIdx(); +int64_t Offset = MFI.getObjectOffset(FrameIdx); +Register Reg = CS.getReg(); +unsigned CFIIndex = MF.addFrameInst(MCCFIInstruction::createOffset( +nullptr, RI.getDwarfRegNum(Reg, true), Offset)); +BuildMI(MBB, MBBI, DL, TII.get(TargetOpcode::CFI_INSTRUCTION)) +.addCFIIndex(CFIIndex) +.setMIFlag(MachineInstr::FrameSetup); + } +}; + +class 
CFIRestoreRVVRegisterEmitter { + const llvm::RISCVRegisterInfo *TRI; + +public: + CFIRestoreRVVRegisterEmitter(MachineFunction &, const RISCVSubtarget &STI) + : TRI{STI.getRegisterInfo()} {}; lenary wrote: Why is this saved, when the same pointee is passed into `emit` as a reference `RI`? Getting rid of this would simplify the constructor, removing the STI argument. https://github.com/llvm/llvm-project/pull/114227 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [RISCV][NFC] refactor CFI emitting (PR #114227)
@@ -27,6 +27,102 @@ using namespace llvm; +static unsigned getCaleeSavedRVVNumRegs(const Register &BaseReg) { lenary wrote: Typo ```suggestion static unsigned getCalleeSavedRVVNumRegs(const Register &BaseReg) { ``` https://github.com/llvm/llvm-project/pull/114227 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [RISCV][NFC] refactor CFI emitting (PR #114227)
https://github.com/lenary edited https://github.com/llvm/llvm-project/pull/114227 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [clang] [llvm] [LLVM] [Clang] Backport "Support for Gentoo `*t64` triples (64-bit time_t ABIs)" (PR #112364)
@@ -294,7 +294,11 @@ class Triple { PAuthTest, -LastEnvironmentType = PAuthTest +GNUT64, +GNUEABIT64, +GNUEABIHFT64, + +LastEnvironmentType = GNUEABIHFT64 tru wrote: Hmm. That looks a bit odd. The ABI checker didn't catch this. I wonder why. @tstellar https://github.com/llvm/llvm-project/pull/112364 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] AMDGPU/GlobalISel: Add skeletons for new register bank select passes (PR #112862)
https://github.com/arsenm approved this pull request. https://github.com/llvm/llvm-project/pull/112862 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [libcxx] [llvm] [libc++][C++03] Use `__cxx03/` headers in C++03 mode (PR #109002)
https://github.com/ldionne edited https://github.com/llvm/llvm-project/pull/109002 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [libcxx] [llvm] [libc++][C++03] Use `__cxx03/` headers in C++03 mode (PR #109002)
@@ -1827,232 +1827,147 @@ template */ -#include <__config> - -#include <__algorithm/adjacent_find.h> -#include <__algorithm/all_of.h> -#include <__algorithm/any_of.h> -#include <__algorithm/binary_search.h> -#include <__algorithm/copy.h> -#include <__algorithm/copy_backward.h> -#include <__algorithm/copy_if.h> -#include <__algorithm/copy_n.h> -#include <__algorithm/count.h> -#include <__algorithm/count_if.h> -#include <__algorithm/equal.h> -#include <__algorithm/equal_range.h> -#include <__algorithm/fill.h> -#include <__algorithm/fill_n.h> -#include <__algorithm/find.h> -#include <__algorithm/find_end.h> -#include <__algorithm/find_first_of.h> -#include <__algorithm/find_if.h> -#include <__algorithm/find_if_not.h> -#include <__algorithm/for_each.h> -#include <__algorithm/generate.h> -#include <__algorithm/generate_n.h> -#include <__algorithm/includes.h> -#include <__algorithm/inplace_merge.h> -#include <__algorithm/is_heap.h> -#include <__algorithm/is_heap_until.h> -#include <__algorithm/is_partitioned.h> -#include <__algorithm/is_permutation.h> -#include <__algorithm/is_sorted.h> -#include <__algorithm/is_sorted_until.h> -#include <__algorithm/iter_swap.h> -#include <__algorithm/lexicographical_compare.h> -#include <__algorithm/lower_bound.h> -#include <__algorithm/make_heap.h> -#include <__algorithm/max.h> -#include <__algorithm/max_element.h> -#include <__algorithm/merge.h> -#include <__algorithm/min.h> -#include <__algorithm/min_element.h> -#include <__algorithm/minmax.h> -#include <__algorithm/minmax_element.h> -#include <__algorithm/mismatch.h> -#include <__algorithm/move.h> -#include <__algorithm/move_backward.h> -#include <__algorithm/next_permutation.h> -#include <__algorithm/none_of.h> -#include <__algorithm/nth_element.h> -#include <__algorithm/partial_sort.h> -#include <__algorithm/partial_sort_copy.h> -#include <__algorithm/partition.h> -#include <__algorithm/partition_copy.h> -#include <__algorithm/partition_point.h> -#include <__algorithm/pop_heap.h> 
-#include <__algorithm/prev_permutation.h> -#include <__algorithm/push_heap.h> -#include <__algorithm/remove.h> -#include <__algorithm/remove_copy.h> -#include <__algorithm/remove_copy_if.h> -#include <__algorithm/remove_if.h> -#include <__algorithm/replace.h> -#include <__algorithm/replace_copy.h> -#include <__algorithm/replace_copy_if.h> -#include <__algorithm/replace_if.h> -#include <__algorithm/reverse.h> -#include <__algorithm/reverse_copy.h> -#include <__algorithm/rotate.h> -#include <__algorithm/rotate_copy.h> -#include <__algorithm/search.h> -#include <__algorithm/search_n.h> -#include <__algorithm/set_difference.h> -#include <__algorithm/set_intersection.h> -#include <__algorithm/set_symmetric_difference.h> -#include <__algorithm/set_union.h> -#include <__algorithm/shuffle.h> -#include <__algorithm/sort.h> -#include <__algorithm/sort_heap.h> -#include <__algorithm/stable_partition.h> -#include <__algorithm/stable_sort.h> -#include <__algorithm/swap_ranges.h> -#include <__algorithm/transform.h> -#include <__algorithm/unique.h> -#include <__algorithm/unique_copy.h> -#include <__algorithm/upper_bound.h> - -#if _LIBCPP_STD_VER >= 17 -# include <__algorithm/clamp.h> -# include <__algorithm/for_each_n.h> -# include <__algorithm/pstl.h> -# include <__algorithm/sample.h> -#endif // _LIBCPP_STD_VER >= 17 - -#if _LIBCPP_STD_VER >= 20 -# include <__algorithm/in_found_result.h> -# include <__algorithm/in_fun_result.h> -# include <__algorithm/in_in_out_result.h> -# include <__algorithm/in_in_result.h> -# include <__algorithm/in_out_out_result.h> -# include <__algorithm/in_out_result.h> -# include <__algorithm/lexicographical_compare_three_way.h> -# include <__algorithm/min_max_result.h> -# include <__algorithm/ranges_adjacent_find.h> -# include <__algorithm/ranges_all_of.h> -# include <__algorithm/ranges_any_of.h> -# include <__algorithm/ranges_binary_search.h> -# include <__algorithm/ranges_clamp.h> -# include <__algorithm/ranges_contains.h> -# include 
<__algorithm/ranges_copy.h> -# include <__algorithm/ranges_copy_backward.h> -# include <__algorithm/ranges_copy_if.h> -# include <__algorithm/ranges_copy_n.h> -# include <__algorithm/ranges_count.h> -# include <__algorithm/ranges_count_if.h> -# include <__algorithm/ranges_equal.h> -# include <__algorithm/ranges_equal_range.h> -# include <__algorithm/ranges_fill.h> -# include <__algorithm/ranges_fill_n.h> -# include <__algorithm/ranges_find.h> -# include <__algorithm/ranges_find_end.h> -# include <__algorithm/ranges_find_first_of.h> -# include <__algorithm/ranges_find_if.h> -# include <__algorithm/ranges_find_if_not.h> -# include <__algorithm/ranges_for_each.h> -# include <__algorithm/ranges_for_each_n.h> -# include <__algorithm/ranges_generate.h> -# include <__algorithm/ranges_generate_n.h> -# include <__algorithm/ranges_includes.h> -# include <__algorithm/ranges_inplace_merge.h> -# include <__algorithm/ranges_is_heap.h> -# include <__a
[llvm-branch-commits] [libcxx] [llvm] [libc++][C++03] Use `__cxx03/` headers in C++03 mode (PR #109002)
@@ -0,0 +1,2 @@
+set(LIBCXX_TEST_PARAMS "std=c++03;test-cxx03-headers=True" CACHE STRING "")

ldionne wrote:

I would probably name this new param `use-frozen-cxx03-headers`. That tells what the setting does a bit more explicitly.

https://github.com/llvm/llvm-project/pull/109002
[llvm-branch-commits] [libcxx] [llvm] [libc++][C++03] Use `__cxx03/` headers in C++03 mode (PR #109002)
@@ -587,42 +587,48 @@ template
 */
-#include <__config>
-
-#include <__atomic/aliases.h>
-#include <__atomic/atomic.h>
-#include <__atomic/atomic_base.h>
-#include <__atomic/atomic_flag.h>
-#include <__atomic/atomic_init.h>
-#include <__atomic/atomic_lock_free.h>
-#include <__atomic/atomic_sync.h>
-#include <__atomic/check_memory_order.h>
-#include <__atomic/contention_t.h>
-#include <__atomic/cxx_atomic_impl.h>
-#include <__atomic/fence.h>
-#include <__atomic/is_always_lock_free.h>
-#include <__atomic/kill_dependency.h>
-#include <__atomic/memory_order.h>
-#include
-
-#if _LIBCPP_STD_VER >= 20
-# include <__atomic/atomic_ref.h>
-#endif
-
-#if !defined(_LIBCPP_HAS_NO_PRAGMA_SYSTEM_HEADER)
-# pragma GCC system_header
-#endif
-
-#if !_LIBCPP_HAS_ATOMIC_HEADER
-# error is not implemented
-#endif
-
-#if !defined(_LIBCPP_REMOVE_TRANSITIVE_INCLUDES) && _LIBCPP_STD_VER <= 20
-# include
-# include
-# include
-# include
-# include
-#endif
+#include <__configuration/cxx03.h>
+
+#if defined(_LIBCPP_CXX03_LANG) && !defined(_LIBCPP_USE_CXX03_HEADERS)
+# include <__cxx03/algorithm>

ldionne wrote:

```suggestion
# include <__cxx03/atomic>
```

Perhaps other occurrences of the same mistake?

https://github.com/llvm/llvm-project/pull/109002
[llvm-branch-commits] [libcxx] [llvm] [libc++][C++03] Use `__cxx03/` headers in C++03 mode (PR #109002)
https://github.com/ldionne commented:

I think this makes sense. However, I would like you to post this PR to the RFC, since we mentioned back then that the decision of Driver vs. conditional includes would be best made based on actual proposed changes. This will give this patch a bit more visibility, and we can ensure that we have consensus on moving forward with this implementation strategy.

https://github.com/llvm/llvm-project/pull/109002
[llvm-branch-commits] [libcxx] [llvm] [libc++][C++03] Use `__cxx03/` headers in C++03 mode (PR #109002)
@@ -14,6 +14,7 @@
 #include <__configuration/abi.h>
 #include <__configuration/availability.h>
 #include <__configuration/compiler.h>
+#include <__configuration/cxx03.h>

ldionne wrote:

Could we move `_LIBCPP_CXX03_LANG` into `language.h` instead? It might be worth making that a separate patch since it's so easy to do. Then, in many places, the include at the beginning of the file would become `#include <__configuration/language.h>`.

https://github.com/llvm/llvm-project/pull/109002
[llvm-branch-commits] [compiler-rt] release/19.x: [compiler-rt] [test] Fix using toolchains that rely on Clang default configs (#113491) (PR #114229)
https://github.com/mgorny approved this pull request. LGTM https://github.com/llvm/llvm-project/pull/114229
[llvm-branch-commits] [mlir] [MLIR][OpenMP] Emit descriptive errors for all unsupported clauses (PR #114037)
https://github.com/tblah approved this pull request. Thanks for this. Looks great! https://github.com/llvm/llvm-project/pull/114037
[llvm-branch-commits] [llvm] 821d9ba - Revert "[GlobalISel] Import samesign flag (#113090)"
Author: Thorsten Schütt Date: 2024-10-30T17:02:07+01:00 New Revision: 821d9bad7373f8ddc4d1c424c0d806f8c087faa3 URL: https://github.com/llvm/llvm-project/commit/821d9bad7373f8ddc4d1c424c0d806f8c087faa3 DIFF: https://github.com/llvm/llvm-project/commit/821d9bad7373f8ddc4d1c424c0d806f8c087faa3.diff LOG: Revert "[GlobalISel] Import samesign flag (#113090)" This reverts commit 72b115301d1c0d56f40f5030bb8d16f422ac211b. Added: Modified: llvm/include/llvm/CodeGen/GlobalISel/GenericMachineInstrs.h llvm/include/llvm/CodeGen/GlobalISel/MachineIRBuilder.h llvm/include/llvm/CodeGen/MachineInstr.h llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp llvm/lib/CodeGen/GlobalISel/MachineIRBuilder.cpp llvm/lib/CodeGen/MIRParser/MILexer.cpp llvm/lib/CodeGen/MIRParser/MILexer.h llvm/lib/CodeGen/MIRParser/MIParser.cpp llvm/lib/CodeGen/MIRPrinter.cpp llvm/lib/CodeGen/MachineInstr.cpp Removed: llvm/test/CodeGen/AArch64/GlobalISel/irtranslator-samesign.ll llvm/test/CodeGen/MIR/icmp-flags.mir diff --git a/llvm/include/llvm/CodeGen/GlobalISel/GenericMachineInstrs.h b/llvm/include/llvm/CodeGen/GlobalISel/GenericMachineInstrs.h index cd7ebcf54c9e1e..b6309a9ea0ec78 100644 --- a/llvm/include/llvm/CodeGen/GlobalISel/GenericMachineInstrs.h +++ b/llvm/include/llvm/CodeGen/GlobalISel/GenericMachineInstrs.h @@ -28,7 +28,7 @@ namespace llvm { class GenericMachineInstr : public MachineInstr { constexpr static unsigned PoisonFlags = NoUWrap | NoSWrap | NoUSWrap | IsExact | Disjoint | NonNeg | - FmNoNans | FmNoInfs | SameSign; + FmNoNans | FmNoInfs; public: GenericMachineInstr() = delete; diff --git a/llvm/include/llvm/CodeGen/GlobalISel/MachineIRBuilder.h b/llvm/include/llvm/CodeGen/GlobalISel/MachineIRBuilder.h index a38dd34a17097a..c41e74ec7ebdcc 100644 --- a/llvm/include/llvm/CodeGen/GlobalISel/MachineIRBuilder.h +++ b/llvm/include/llvm/CodeGen/GlobalISel/MachineIRBuilder.h @@ -1266,8 +1266,7 @@ class MachineIRBuilder { /// /// \return a MachineInstrBuilder for the newly created instruction. 
MachineInstrBuilder buildICmp(CmpInst::Predicate Pred, const DstOp &Res, -const SrcOp &Op0, const SrcOp &Op1, -std::optional Flags = std::nullopt); +const SrcOp &Op0, const SrcOp &Op1); /// Build and insert a \p Res = G_FCMP \p Pred\p Op0, \p Op1 /// diff --git a/llvm/include/llvm/CodeGen/MachineInstr.h b/llvm/include/llvm/CodeGen/MachineInstr.h index ead6bbe1d5f641..36051732474634 100644 --- a/llvm/include/llvm/CodeGen/MachineInstr.h +++ b/llvm/include/llvm/CodeGen/MachineInstr.h @@ -119,7 +119,6 @@ class MachineInstr Disjoint = 1 << 19, // Each bit is zero in at least one of the inputs. NoUSWrap = 1 << 20, // Instruction supports geps // no unsigned signed wrap. -SameSign = 1 << 21 // Both operands have the same sign. }; private: diff --git a/llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp b/llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp index a87754389cc8ed..5381dce58f9e65 100644 --- a/llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp +++ b/llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp @@ -340,17 +340,20 @@ bool IRTranslator::translateCompare(const User &U, Register Op1 = getOrCreateVReg(*U.getOperand(1)); Register Res = getOrCreateVReg(U); CmpInst::Predicate Pred = CI->getPredicate(); - uint32_t Flags = MachineInstr::copyFlagsFromInstruction(*CI); if (CmpInst::isIntPredicate(Pred)) -MIRBuilder.buildICmp(Pred, Res, Op0, Op1, Flags); +MIRBuilder.buildICmp(Pred, Res, Op0, Op1); else if (Pred == CmpInst::FCMP_FALSE) MIRBuilder.buildCopy( Res, getOrCreateVReg(*Constant::getNullValue(U.getType(; else if (Pred == CmpInst::FCMP_TRUE) MIRBuilder.buildCopy( Res, getOrCreateVReg(*Constant::getAllOnesValue(U.getType(; - else + else { +uint32_t Flags = 0; +if (CI) + Flags = MachineInstr::copyFlagsFromInstruction(*CI); MIRBuilder.buildFCmp(Pred, Res, Op0, Op1, Flags); + } return true; } diff --git a/llvm/lib/CodeGen/GlobalISel/MachineIRBuilder.cpp b/llvm/lib/CodeGen/GlobalISel/MachineIRBuilder.cpp index 15b9164247846c..59f2fc633f5de7 100644 --- 
a/llvm/lib/CodeGen/GlobalISel/MachineIRBuilder.cpp +++ b/llvm/lib/CodeGen/GlobalISel/MachineIRBuilder.cpp @@ -898,9 +898,8 @@ MachineIRBuilder::buildFPTrunc(const DstOp &Res, const SrcOp &Op, MachineInstrBuilder MachineIRBuilder::buildICmp(CmpInst::Predicate Pred, const DstOp &Res, const SrcOp &Op0, -const SrcOp &Op1, -
[llvm-branch-commits] [clang] [llvm] [LLVM] [Clang] Backport "Support for Gentoo `*t64` triples (64-bit time_t ABIs)" (PR #112364)
@@ -294,7 +294,11 @@ class Triple {
 PAuthTest,
-LastEnvironmentType = PAuthTest
+GNUT64,
+GNUEABIT64,
+GNUEABIHFT64,
+
+LastEnvironmentType = GNUEABIHFT64

alexrp wrote:

I don't *think* a revert at this stage would make much of a difference since the break has already shipped in a release anyway. We'll have to deal with it in Zig regardless. We just need to be a bit more careful with enum changes like this going forward.

https://github.com/llvm/llvm-project/pull/112364
[llvm-branch-commits] [flang] [flang][cuda] Data transfer with descriptor (PR #114302)
https://github.com/clementval created https://github.com/llvm/llvm-project/pull/114302 Use the feature added in #114301 to perform data transfer between data having a descriptor. >From e4c7e31c77bbfda563e4e2c9b591fe2f5cb2c259 Mon Sep 17 00:00:00 2001 From: Valentin Clement Date: Wed, 30 Oct 2024 11:53:12 -0700 Subject: [PATCH] [flang][cuda] Data transfer with descriptor --- flang/runtime/CUDA/memory.cpp | 34 +++-- flang/unittests/Runtime/CUDA/Memory.cpp | 40 + 2 files changed, 72 insertions(+), 2 deletions(-) diff --git a/flang/runtime/CUDA/memory.cpp b/flang/runtime/CUDA/memory.cpp index 4778a4ae77683f..f25d3b531c84f0 100644 --- a/flang/runtime/CUDA/memory.cpp +++ b/flang/runtime/CUDA/memory.cpp @@ -9,10 +9,32 @@ #include "flang/Runtime/CUDA/memory.h" #include "../terminator.h" #include "flang/Runtime/CUDA/common.h" +#include "flang/Runtime/assign.h" #include "cuda_runtime.h" namespace Fortran::runtime::cuda { +static void *MemmoveHostToDevice( +void *dst, const void *src, std::size_t count) { + // TODO: Use cudaMemcpyAsync when we have support for stream. + CUDA_REPORT_IF_ERROR(cudaMemcpy(dst, src, count, cudaMemcpyHostToDevice)); + return dst; +} + +static void *MemmoveDeviceToHost( +void *dst, const void *src, std::size_t count) { + // TODO: Use cudaMemcpyAsync when we have support for stream. + CUDA_REPORT_IF_ERROR(cudaMemcpy(dst, src, count, cudaMemcpyDeviceToHost)); + return dst; +} + +static void *MemmoveDeviceToDevice( +void *dst, const void *src, std::size_t count) { + // TODO: Use cudaMemcpyAsync when we have support for stream. 
+ CUDA_REPORT_IF_ERROR(cudaMemcpy(dst, src, count, cudaMemcpyHostToDevice)); + return dst; +} + extern "C" { void *RTDEF(CUFMemAlloc)( @@ -90,8 +112,16 @@ void RTDEF(CUFDataTransferPtrDesc)(void *addr, Descriptor *desc, void RTDECL(CUFDataTransferDescDesc)(Descriptor *dstDesc, Descriptor *srcDesc, unsigned mode, const char *sourceFile, int sourceLine) { Terminator terminator{sourceFile, sourceLine}; - terminator.Crash( - "not yet implemented: CUDA data transfer between two descriptors"); + MemmoveFct memmoveFct; + if (mode == kHostToDevice) { +memmoveFct = &MemmoveHostToDevice; + } else if (mode == kDeviceToHost) { +memmoveFct = &MemmoveDeviceToHost; + } else if (mode == kDeviceToDevice) { +memmoveFct = &MemmoveDeviceToDevice; + } + Fortran::runtime::Assign( + dstDesc, srcDesc, terminator, MaybeReallocate, memmoveFct); } } } // namespace Fortran::runtime::cuda diff --git a/flang/unittests/Runtime/CUDA/Memory.cpp b/flang/unittests/Runtime/CUDA/Memory.cpp index 157d3cdb531def..ade05e21b70a89 100644 --- a/flang/unittests/Runtime/CUDA/Memory.cpp +++ b/flang/unittests/Runtime/CUDA/Memory.cpp @@ -9,11 +9,17 @@ #include "flang/Runtime/CUDA/memory.h" #include "gtest/gtest.h" #include "../../../runtime/terminator.h" +#include "../tools.h" #include "flang/Common/Fortran.h" +#include "flang/Runtime/CUDA/allocator.h" #include "flang/Runtime/CUDA/common.h" +#include "flang/Runtime/CUDA/descriptor.h" +#include "flang/Runtime/allocatable.h" +#include "flang/Runtime/allocator-registry.h" #include "cuda_runtime.h" +using namespace Fortran::runtime; using namespace Fortran::runtime::cuda; TEST(MemoryCUFTest, SimpleAllocTramsferFree) { @@ -29,3 +35,37 @@ TEST(MemoryCUFTest, SimpleAllocTramsferFree) { EXPECT_EQ(42, host); RTNAME(CUFMemFree)((void *)dev, kMemTypeDevice, __FILE__, __LINE__); } + +static OwningPtr createAllocatable( +Fortran::common::TypeCategory tc, int kind, int rank = 1) { + return Descriptor::Create(TypeCode{tc, kind}, kind, nullptr, rank, nullptr, + 
CFI_attribute_allocatable); +} + +TEST(MemoryCUFTest, CUFDataTransferDescDesc) { + using Fortran::common::TypeCategory; + RTNAME(CUFRegisterAllocator)(); + // INTEGER(4), DEVICE, ALLOCATABLE :: a(:) + auto dev{createAllocatable(TypeCategory::Integer, 4)}; + dev->SetAllocIdx(kDeviceAllocatorPos); + EXPECT_EQ((int)kDeviceAllocatorPos, dev->GetAllocIdx()); + RTNAME(AllocatableSetBounds)(*dev, 0, 1, 10); + RTNAME(AllocatableAllocate) + (*dev, /*hasStat=*/false, /*errMsg=*/nullptr, __FILE__, __LINE__); + EXPECT_TRUE(dev->IsAllocated()); + + // Create temp array to transfer to device. + auto x{MakeArray(std::vector{10}, + std::vector{0, 1, 2, 3, 4, 5, 6, 7, 8, 9})}; + RTNAME(CUFDataTransferDescDesc)(*dev, *x, kHostToDevice, __FILE__, __LINE__); + + // Retrieve data from device. + auto host{MakeArray(std::vector{10}, + std::vector{0, 0, 0, 0, 0, 0, 0, 0, 0, 0})}; + RTNAME(CUFDataTransferDescDesc)( + *host, *dev, kDeviceToHost, __FILE__, __LINE__); + + for (unsigned i = 0; i < 10; ++i) { +EXPECT_EQ(*host->ZeroBasedIndexedElement(i), (std::int32_t)i); + } +} ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch
[llvm-branch-commits] [flang] [flang][cuda] Data transfer with descriptor (PR #114302)
llvmbot wrote:

@llvm/pr-subscribers-flang-runtime

Author: Valentin Clement (バレンタイン クレメン) (clementval)

Changes

Use the feature added in #114301 to perform data transfer between data having a descriptor.

---

Full diff: https://github.com/llvm/llvm-project/pull/114302.diff

2 Files Affected:

- (modified) flang/runtime/CUDA/memory.cpp (+32-2)
- (modified) flang/unittests/Runtime/CUDA/Memory.cpp (+40)

https://github.com/llvm/llvm-project/pull/114302
[llvm-branch-commits] [flang] [flang][cuda] Data transfer with descriptor (PR #114302)
github-actions[bot] wrote:

:warning: C/C++ code formatter, clang-format found issues in your code. :warning:

You can test this locally with the following command:

```bash
git-clang-format --diff a1f81d24c44da15071196faa9e8d466bcbbd7e97 e4c7e31c77bbfda563e4e2c9b591fe2f5cb2c259 --extensions cpp -- flang/runtime/CUDA/memory.cpp flang/unittests/Runtime/CUDA/Memory.cpp
```

View the diff from clang-format here.

```diff
diff --git a/flang/unittests/Runtime/CUDA/Memory.cpp b/flang/unittests/Runtime/CUDA/Memory.cpp
index ade05e21b7..3765fbbb7b 100644
--- a/flang/unittests/Runtime/CUDA/Memory.cpp
+++ b/flang/unittests/Runtime/CUDA/Memory.cpp
@@ -62,8 +62,8 @@ TEST(MemoryCUFTest, CUFDataTransferDescDesc) {
   // Retrieve data from device.
   auto host{MakeArray(std::vector{10},
       std::vector{0, 0, 0, 0, 0, 0, 0, 0, 0, 0})};
-  RTNAME(CUFDataTransferDescDesc)(
-      *host, *dev, kDeviceToHost, __FILE__, __LINE__);
+  RTNAME(CUFDataTransferDescDesc)
+  (*host, *dev, kDeviceToHost, __FILE__, __LINE__);
 
   for (unsigned i = 0; i < 10; ++i) {
     EXPECT_EQ(*host->ZeroBasedIndexedElement(i), (std::int32_t)i);
```

https://github.com/llvm/llvm-project/pull/114302
[llvm-branch-commits] [llvm] [PAC][CodeGen][ELF][AArch64] Support signed GOT with tiny code model (PR #113812)
https://github.com/ilovepi commented:

Overall, this seems fine as a translation of the spec. I still need to take another pass over the tests to be sure, but they seem appropriate so far. All that said, I'm far from an expert on AArch64 minutiae, so a more experienced reviewer will have to provide any final approval.

https://github.com/llvm/llvm-project/pull/113812
[llvm-branch-commits] [llvm] [PAC][CodeGen][ELF][AArch64] Support signed GOT with tiny code model (PR #113812)
@@ -2277,28 +2277,40 @@ void AArch64AsmPrinter::LowerLOADgotAUTH(const MachineInstr &MI) {
   const MachineOperand &GAMO = MI.getOperand(1);
   assert(GAMO.getOffset() == 0);
-  MachineOperand GAHiOp(GAMO);
-  MachineOperand GALoOp(GAMO);
-  GAHiOp.addTargetFlag(AArch64II::MO_PAGE);
-  GALoOp.addTargetFlag(AArch64II::MO_PAGEOFF | AArch64II::MO_NC);
+  if (MI.getParent()->getParent()->getTarget().getCodeModel() ==
+      CodeModel::Tiny) {

ilovepi wrote:

```suggestion
  if (MI.getMF()->getTarget().getCodeModel() == CodeModel::Tiny) {
```

Not sure if there's an easier way to get to the target from MI off the top of my head, but you can get the MachineFunction w/o going through `getParent`.

https://github.com/llvm/llvm-project/pull/113812
[llvm-branch-commits] [llvm] [PAC][CodeGen][ELF][AArch64] Support signed GOT with tiny code model (PR #113812)
https://github.com/ilovepi edited https://github.com/llvm/llvm-project/pull/113812
[llvm-branch-commits] [clang] [llvm] [LLVM] [Clang] Backport "Support for Gentoo `*t64` triples (64-bit time_t ABIs)" (PR #112364)
@@ -294,7 +294,11 @@ class Triple {
 PAuthTest,
-LastEnvironmentType = PAuthTest
+GNUT64,
+GNUEABIT64,
+GNUEABIHFT64,
+
+LastEnvironmentType = GNUEABIHFT64

tru wrote:

I want to understand why the ABI checker didn't flag this. But yes - I will have to read the diffs much more carefully until we know we can trust it. Maybe it would be enough to have an action that flags any changes to the public include files so that they are not easily missed.

https://github.com/llvm/llvm-project/pull/112364
[llvm-branch-commits] [mlir] [MLIR][OpenMP] Emit descriptive errors for all unsupported clauses (PR #114037)
https://github.com/Meinersbur approved this pull request. LGTM https://github.com/llvm/llvm-project/pull/114037
[llvm-branch-commits] [compiler-rt] release/19.x: [compiler-rt] [test] Fix using toolchains that rely on Clang default configs (#113491) (PR #114229)
https://github.com/thesamesam approved this pull request. https://github.com/llvm/llvm-project/pull/114229
[llvm-branch-commits] [llvm] MachineUniformityAnalysis: Improve isConstantOrUndefValuePhi (PR #112866)
https://github.com/petar-avramovic updated https://github.com/llvm/llvm-project/pull/112866 >From 98a00e5a2ed28da3a4608d9c209a04f0cff6fe12 Mon Sep 17 00:00:00 2001 From: Petar Avramovic Date: Wed, 30 Oct 2024 15:41:59 +0100 Subject: [PATCH] MachineUniformityAnalysis: Improve isConstantOrUndefValuePhi Change existing code for G_PHI to match what LLVM-IR version is doing via PHINode::hasConstantOrUndefValue. This is not safe for regular PHI since it may appear with an undef operand and getVRegDef can fail. Most notably this improves number of values that can be allocated to sgpr register bank in AMDGPURegBankSelect. Common case here are phis that appear in structurize-cfg lowering for cycles with multiple exits: Undef incoming value is coming from block that reached cycle exit condition, if other incoming is uniform keep the phi uniform despite the fact it is joining values from pair of blocks that are entered via divergent condition branch. --- llvm/lib/CodeGen/MachineSSAContext.cpp| 27 +- .../AMDGPU/MIR/hidden-diverge-gmir.mir| 28 +++ .../AMDGPU/MIR/hidden-loop-diverge.mir| 4 +- .../AMDGPU/MIR/uses-value-from-cycle.mir | 8 +- .../GlobalISel/divergence-structurizer.mir| 80 -- .../regbankselect-mui-regbanklegalize.mir | 69 --- .../regbankselect-mui-regbankselect.mir | 18 ++-- .../AMDGPU/GlobalISel/regbankselect-mui.ll| 84 ++- .../AMDGPU/GlobalISel/regbankselect-mui.mir | 51 ++- 9 files changed, 191 insertions(+), 178 deletions(-) diff --git a/llvm/lib/CodeGen/MachineSSAContext.cpp b/llvm/lib/CodeGen/MachineSSAContext.cpp index e384187b6e8593..8e13c0916dd9e1 100644 --- a/llvm/lib/CodeGen/MachineSSAContext.cpp +++ b/llvm/lib/CodeGen/MachineSSAContext.cpp @@ -54,9 +54,34 @@ const MachineBasicBlock *MachineSSAContext::getDefBlock(Register value) const { return F->getRegInfo().getVRegDef(value)->getParent(); } +static bool isUndef(const MachineInstr &MI) { + return MI.getOpcode() == TargetOpcode::G_IMPLICIT_DEF || + MI.getOpcode() == TargetOpcode::IMPLICIT_DEF; +} + +/// 
MachineInstr equivalent of PHINode::hasConstantOrUndefValue() for G_PHI. template <> bool MachineSSAContext::isConstantOrUndefValuePhi(const MachineInstr &Phi) { - return Phi.isConstantValuePHI(); + if (!Phi.isPHI()) +return false; + + // In later passes PHI may appear with an undef operand, getVRegDef can fail. + if (Phi.getOpcode() == TargetOpcode::PHI) +return Phi.isConstantValuePHI(); + + // For G_PHI we do equivalent of PHINode::hasConstantOrUndefValue(). + const MachineRegisterInfo &MRI = Phi.getMF()->getRegInfo(); + Register This = Phi.getOperand(0).getReg(); + Register ConstantValue; + for (unsigned i = 1, e = Phi.getNumOperands(); i < e; i += 2) { +Register Incoming = Phi.getOperand(i).getReg(); +if (Incoming != This && !isUndef(*MRI.getVRegDef(Incoming))) { + if (ConstantValue && ConstantValue != Incoming) +return false; + ConstantValue = Incoming; +} + } + return true; } template <> diff --git a/llvm/test/Analysis/UniformityAnalysis/AMDGPU/MIR/hidden-diverge-gmir.mir b/llvm/test/Analysis/UniformityAnalysis/AMDGPU/MIR/hidden-diverge-gmir.mir index ce00edf3363f77..9694a340b5e906 100644 --- a/llvm/test/Analysis/UniformityAnalysis/AMDGPU/MIR/hidden-diverge-gmir.mir +++ b/llvm/test/Analysis/UniformityAnalysis/AMDGPU/MIR/hidden-diverge-gmir.mir @@ -1,24 +1,24 @@ # RUN: llc -mtriple=amdgcn-- -run-pass=print-machine-uniformity -o - %s 2>&1 | FileCheck %s # CHECK-LABEL: MachineUniformityInfo for function: hidden_diverge # CHECK-LABEL: BLOCK bb.0 -# CHECK: DIVERGENT: %{{[0-9]*}}: %{{[0-9]*}}:_(s32) = G_INTRINSIC intrinsic(@llvm.amdgcn.workitem.id.x) -# CHECK: DIVERGENT: %{{[0-9]*}}: %{{[0-9]*}}:_(s1) = G_ICMP intpred(slt) -# CHECK: DIVERGENT: %{{[0-9]*}}: %{{[0-9]*}}:_(s1) = G_XOR %{{[0-9]*}}:_, %{{[0-9]*}}:_ -# CHECK: DIVERGENT: %{{[0-9]*}}: %{{[0-9]*}}:_(s1), %{{[0-9]*}}:_(s64) = G_INTRINSIC_W_SIDE_EFFECTS intrinsic(@llvm.amdgcn.if) -# CHECK: DIVERGENT: %{{[0-9]*}}: %{{[0-9]*}}:_(s1), %{{[0-9]*}}:_(s64) = G_INTRINSIC_W_SIDE_EFFECTS intrinsic(@llvm.amdgcn.if) -# 
CHECK: DIVERGENT: G_BRCOND %{{[0-9]*}}:_(s1), %bb.1 -# CHECK: DIVERGENT: G_BR %bb.2 +# CHECK: DIVERGENT: %{{[0-9]*}}: %{{[0-9]*}}:_(s32) = G_INTRINSIC intrinsic(@llvm.amdgcn.workitem.id.x) +# CHECK: DIVERGENT: %{{[0-9]*}}: %{{[0-9]*}}:_(s1) = G_ICMP intpred(slt) +# CHECK: DIVERGENT: %{{[0-9]*}}: %{{[0-9]*}}:_(s1) = G_XOR %{{[0-9]*}}:_, %{{[0-9]*}}:_ +# CHECK: DIVERGENT: %{{[0-9]*}}: %{{[0-9]*}}:_(s1), %{{[0-9]*}}:_(s64) = G_INTRINSIC_W_SIDE_EFFECTS intrinsic(@llvm.amdgcn.if) +# CHECK: DIVERGENT: %{{[0-9]*}}: %{{[0-9]*}}:_(s1), %{{[0-9]*}}:_(s64) = G_INTRINSIC_W_SIDE_EFFECTS intrinsic(@llvm.amdgcn.if) +# CHECK: DIVERGENT: G_BRCOND %{{[0-9]*}}:_(s1), %bb.1 +# CHECK: DIVERGENT: G_BR %bb.2 # CHECK-LABEL: BLOCK bb.1 # CHECK-LABEL: BLOCK bb.2 -# CHECK: D
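The G_PHI scan in the `MachineSSAContext.cpp` hunk above can be modeled without any LLVM types. The following is only a sketch of that logic, assuming a simplified representation: `Incoming`, its fields, and the use of plain `int` register numbers are hypothetical stand-ins for `Register`, `MachineRegisterInfo::getVRegDef`, and the patch's `isUndef` helper.

```cpp
#include <optional>
#include <vector>

// Hypothetical, simplified model of one incoming value of a G_PHI.
struct Incoming {
  int reg;             // virtual register number of the incoming value
  bool definedByUndef; // true if its def is G_IMPLICIT_DEF / IMPLICIT_DEF
};

// Returns true when every incoming value is either the phi's own result,
// an undef, or one single common register. This mirrors the loop in the
// patch: incoming values that are the phi itself or undef are skipped, and
// a second distinct register makes the phi non-constant.
inline bool isConstantOrUndefValuePhi(int phiResult,
                                      const std::vector<Incoming> &incomings) {
  std::optional<int> constantValue; // no candidate seen yet
  for (const Incoming &in : incomings) {
    if (in.reg == phiResult || in.definedByUndef)
      continue;
    if (constantValue && *constantValue != in.reg)
      return false;
    constantValue = in.reg;
  }
  return true;
}
```

For example, a phi with result register 1 and incoming values `{5, undef}` is treated as constant-or-undef in this model, while `{5, 6}` is not. That is the case the commit message describes for cycles with multiple exits.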
[llvm-branch-commits] [llvm] AMDGPU/GlobalISel: RegBankLegalize rules for load (PR #112882)
https://github.com/petar-avramovic updated https://github.com/llvm/llvm-project/pull/112882 >From 4675f79f28222cef60d1607acb1b682ca3363eb6 Mon Sep 17 00:00:00 2001 From: Petar Avramovic Date: Wed, 30 Oct 2024 15:37:59 +0100 Subject: [PATCH] AMDGPU/GlobalISel: RegBankLegalize rules for load Add IDs for bit width that cover multiple LLTs: B32 B64 etc. "Predicate" wrapper class for bool predicate functions used to write pretty rules. Predicates can be combined using &&, || and !. Lowering for splitting and widening loads. Write rules for loads to not change existing mir tests from old regbankselect. --- .../AMDGPU/AMDGPURegBankLegalizeHelper.cpp| 287 +++- .../AMDGPU/AMDGPURegBankLegalizeHelper.h | 5 + .../AMDGPU/AMDGPURegBankLegalizeRules.cpp | 309 - .../AMDGPU/AMDGPURegBankLegalizeRules.h | 65 +++- .../AMDGPU/GlobalISel/regbankselect-load.mir | 320 +++--- .../GlobalISel/regbankselect-zextload.mir | 9 +- 6 files changed, 929 insertions(+), 66 deletions(-) diff --git a/llvm/lib/Target/AMDGPU/AMDGPURegBankLegalizeHelper.cpp b/llvm/lib/Target/AMDGPU/AMDGPURegBankLegalizeHelper.cpp index 15ccf1a38af9a5..19d8d466e3b12e 100644 --- a/llvm/lib/Target/AMDGPU/AMDGPURegBankLegalizeHelper.cpp +++ b/llvm/lib/Target/AMDGPU/AMDGPURegBankLegalizeHelper.cpp @@ -36,6 +36,83 @@ void RegBankLegalizeHelper::findRuleAndApplyMapping(MachineInstr &MI) { lower(MI, Mapping, WaterfallSgprs); } +void RegBankLegalizeHelper::splitLoad(MachineInstr &MI, + ArrayRef LLTBreakdown, LLT MergeTy) { + MachineFunction &MF = B.getMF(); + assert(MI.getNumMemOperands() == 1); + MachineMemOperand &BaseMMO = **MI.memoperands_begin(); + Register Dst = MI.getOperand(0).getReg(); + const RegisterBank *DstRB = MRI.getRegBankOrNull(Dst); + Register Base = MI.getOperand(1).getReg(); + LLT PtrTy = MRI.getType(Base); + const RegisterBank *PtrRB = MRI.getRegBankOrNull(Base); + LLT OffsetTy = LLT::scalar(PtrTy.getSizeInBits()); + SmallVector LoadPartRegs; + + unsigned ByteOffset = 0; + for (LLT PartTy : LLTBreakdown) { 
+Register BasePlusOffset; +if (ByteOffset == 0) { + BasePlusOffset = Base; +} else { + auto Offset = B.buildConstant({PtrRB, OffsetTy}, ByteOffset); + BasePlusOffset = B.buildPtrAdd({PtrRB, PtrTy}, Base, Offset).getReg(0); +} +auto *OffsetMMO = MF.getMachineMemOperand(&BaseMMO, ByteOffset, PartTy); +auto LoadPart = B.buildLoad({DstRB, PartTy}, BasePlusOffset, *OffsetMMO); +LoadPartRegs.push_back(LoadPart.getReg(0)); +ByteOffset += PartTy.getSizeInBytes(); + } + + if (!MergeTy.isValid()) { +// Loads are of same size, concat or merge them together. +B.buildMergeLikeInstr(Dst, LoadPartRegs); + } else { +// Loads are not all of same size, need to unmerge them to smaller pieces +// of MergeTy type, then merge pieces to Dst. +SmallVector MergeTyParts; +for (Register Reg : LoadPartRegs) { + if (MRI.getType(Reg) == MergeTy) { +MergeTyParts.push_back(Reg); + } else { +auto Unmerge = B.buildUnmerge({DstRB, MergeTy}, Reg); +for (unsigned i = 0; i < Unmerge->getNumOperands() - 1; ++i) + MergeTyParts.push_back(Unmerge.getReg(i)); + } +} +B.buildMergeLikeInstr(Dst, MergeTyParts); + } + MI.eraseFromParent(); +} + +void RegBankLegalizeHelper::widenLoad(MachineInstr &MI, LLT WideTy, + LLT MergeTy) { + MachineFunction &MF = B.getMF(); + assert(MI.getNumMemOperands() == 1); + MachineMemOperand &BaseMMO = **MI.memoperands_begin(); + Register Dst = MI.getOperand(0).getReg(); + const RegisterBank *DstRB = MRI.getRegBankOrNull(Dst); + Register Base = MI.getOperand(1).getReg(); + + MachineMemOperand *WideMMO = MF.getMachineMemOperand(&BaseMMO, 0, WideTy); + auto WideLoad = B.buildLoad({DstRB, WideTy}, Base, *WideMMO); + + if (WideTy.isScalar()) { +B.buildTrunc(Dst, WideLoad); + } else { +SmallVector MergeTyParts; +auto Unmerge = B.buildUnmerge({DstRB, MergeTy}, WideLoad); + +LLT DstTy = MRI.getType(Dst); +unsigned NumElts = DstTy.getSizeInBits() / MergeTy.getSizeInBits(); +for (unsigned i = 0; i < NumElts; ++i) { + MergeTyParts.push_back(Unmerge.getReg(i)); +} +B.buildMergeLikeInstr(Dst, 
MergeTyParts); + } + MI.eraseFromParent(); +} + void RegBankLegalizeHelper::lower(MachineInstr &MI, const RegBankLLTMapping &Mapping, SmallSet &WaterfallSgprs) { @@ -114,6 +191,50 @@ void RegBankLegalizeHelper::lower(MachineInstr &MI, MI.eraseFromParent(); break; } + case SplitLoad: { +LLT DstTy = MRI.getType(MI.getOperand(0).getReg()); +unsigned Size = DstTy.getSizeInBits(); +// Even split to 128-bit loads +if (Size > 128) { + LLT B128; + if (DstTy.isVector()) { +LLT EltTy = DstTy.getElementType(); +B128 = LLT:
[llvm-branch-commits] [flang] [flang][cuda] Data transfer with descriptor (PR #114302)
@@ -9,10 +9,32 @@ #include "flang/Runtime/CUDA/memory.h" #include "../terminator.h" #include "flang/Runtime/CUDA/common.h" +#include "flang/Runtime/assign.h" #include "cuda_runtime.h" namespace Fortran::runtime::cuda { +static void *MemmoveHostToDevice( +void *dst, const void *src, std::size_t count) { + // TODO: Use cudaMemcpyAsync when we have support for stream. + CUDA_REPORT_IF_ERROR(cudaMemcpy(dst, src, count, cudaMemcpyHostToDevice)); + return dst; +} + +static void *MemmoveDeviceToHost( +void *dst, const void *src, std::size_t count) { + // TODO: Use cudaMemcpyAsync when we have support for stream. + CUDA_REPORT_IF_ERROR(cudaMemcpy(dst, src, count, cudaMemcpyDeviceToHost)); + return dst; +} + +static void *MemmoveDeviceToDevice( +void *dst, const void *src, std::size_t count) { + // TODO: Use cudaMemcpyAsync when we have support for stream. + CUDA_REPORT_IF_ERROR(cudaMemcpy(dst, src, count, cudaMemcpyHostToDevice)); Renaud-K wrote: DeviceToDevice? https://github.com/llvm/llvm-project/pull/114302 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
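The review question above concerns the third helper, which passes `cudaMemcpyHostToDevice` although its name says device-to-device. A minimal sketch of the intended mode-to-direction mapping follows. It assumes stand-in enums so that it builds without the CUDA toolkit: `CopyKind` and `copyKindFor` are hypothetical names, not part of the flang runtime, and the numeric values of the mode constants are illustrative only.

```cpp
#include <cassert>
#include <optional>

// Stand-ins for the runtime's transfer modes. The names come from the patch;
// the numeric values here are illustrative only.
enum TransferMode : unsigned { kHostToDevice, kDeviceToHost, kDeviceToDevice };

// Stand-in for cudaMemcpyKind so the sketch builds without the CUDA toolkit.
enum class CopyKind { HostToDevice, DeviceToHost, DeviceToDevice };

// Hypothetical helper: pick the copy direction for a transfer mode.
// Returns std::nullopt for an unrecognised mode rather than leaving the
// result unset.
std::optional<CopyKind> copyKindFor(unsigned mode) {
  switch (mode) {
  case kHostToDevice:
    return CopyKind::HostToDevice;
  case kDeviceToHost:
    return CopyKind::DeviceToHost;
  case kDeviceToDevice:
    return CopyKind::DeviceToDevice; // not HostToDevice, per the review
  default:
    return std::nullopt;
  }
}
```

Keeping the mapping in one place makes a mismatch between a helper's name and its copy direction easier to spot, and it gives an unrecognised mode an explicit result.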
[llvm-branch-commits] [llvm] ValueTracking: Allow getUnderlyingObject to look at vectors (PR #114311)
https://github.com/arsenm created https://github.com/llvm/llvm-project/pull/114311 We can identify some easy vector of pointer cases, such as a getelementptr with a scalar base. >From 6e8c27a281929c1c1941960e0134122c3ccf77b8 Mon Sep 17 00:00:00 2001 From: Matt Arsenault Date: Tue, 29 Oct 2024 15:30:51 -0700 Subject: [PATCH] ValueTracking: Allow getUnderlyingObject to look at vectors We can identify some easy vector of pointer cases, such as a getelementptr with a scalar base. --- llvm/lib/Analysis/ValueTracking.cpp | 4 +-- .../AMDGPU/promote-alloca-vector-gep.ll | 27 ++- 2 files changed, 22 insertions(+), 9 deletions(-) diff --git a/llvm/lib/Analysis/ValueTracking.cpp b/llvm/lib/Analysis/ValueTracking.cpp index aa5142f3362409..dd7cac9f4fc325 100644 --- a/llvm/lib/Analysis/ValueTracking.cpp +++ b/llvm/lib/Analysis/ValueTracking.cpp @@ -6686,11 +6686,11 @@ static bool isSameUnderlyingObjectInLoop(const PHINode *PN, } const Value *llvm::getUnderlyingObject(const Value *V, unsigned MaxLookup) { - if (!V->getType()->isPointerTy()) -return V; for (unsigned Count = 0; MaxLookup == 0 || Count < MaxLookup; ++Count) { if (auto *GEP = dyn_cast(V)) { V = GEP->getPointerOperand(); + if (!V->getType()->isPointerTy()) // Only handle scalar pointer base. 
+return nullptr; } else if (Operator::getOpcode(V) == Instruction::BitCast || Operator::getOpcode(V) == Instruction::AddrSpaceCast) { Value *NewV = cast(V)->getOperand(0); diff --git a/llvm/test/CodeGen/AMDGPU/promote-alloca-vector-gep.ll b/llvm/test/CodeGen/AMDGPU/promote-alloca-vector-gep.ll index 355a2b8796b24d..76e1868b3c4b9e 100644 --- a/llvm/test/CodeGen/AMDGPU/promote-alloca-vector-gep.ll +++ b/llvm/test/CodeGen/AMDGPU/promote-alloca-vector-gep.ll @@ -35,17 +35,30 @@ bb: ret void } -; TODO: Should be able to promote this define amdgpu_kernel void @scalar_alloca_ptr_with_vector_gep_offset_select(i1 %cond) { ; CHECK-LABEL: define amdgpu_kernel void @scalar_alloca_ptr_with_vector_gep_offset_select( ; CHECK-SAME: i1 [[COND:%.*]]) { ; CHECK-NEXT: [[BB:.*:]] -; CHECK-NEXT:[[ALLOCA:%.*]] = alloca [4 x i32], align 4, addrspace(5) -; CHECK-NEXT:[[GETELEMENTPTR0:%.*]] = getelementptr inbounds i8, ptr addrspace(5) [[ALLOCA]], <4 x i64> -; CHECK-NEXT:[[GETELEMENTPTR1:%.*]] = getelementptr inbounds i8, ptr addrspace(5) [[ALLOCA]], <4 x i64> -; CHECK-NEXT:[[SELECT:%.*]] = select i1 [[COND]], <4 x ptr addrspace(5)> [[GETELEMENTPTR0]], <4 x ptr addrspace(5)> [[GETELEMENTPTR1]] -; CHECK-NEXT:[[EXTRACTELEMENT:%.*]] = extractelement <4 x ptr addrspace(5)> [[SELECT]], i64 1 -; CHECK-NEXT:store i32 0, ptr addrspace(5) [[EXTRACTELEMENT]], align 4 +; CHECK-NEXT:[[TMP0:%.*]] = call noalias nonnull dereferenceable(64) ptr addrspace(4) @llvm.amdgcn.dispatch.ptr() +; CHECK-NEXT:[[TMP1:%.*]] = getelementptr inbounds i32, ptr addrspace(4) [[TMP0]], i64 1 +; CHECK-NEXT:[[TMP2:%.*]] = load i32, ptr addrspace(4) [[TMP1]], align 4, !invariant.load [[META0]] +; CHECK-NEXT:[[TMP3:%.*]] = getelementptr inbounds i32, ptr addrspace(4) [[TMP0]], i64 2 +; CHECK-NEXT:[[TMP4:%.*]] = load i32, ptr addrspace(4) [[TMP3]], align 4, !range [[RNG1]], !invariant.load [[META0]] +; CHECK-NEXT:[[TMP5:%.*]] = lshr i32 [[TMP2]], 16 +; CHECK-NEXT:[[TMP6:%.*]] = call range(i32 0, 1024) i32 
@llvm.amdgcn.workitem.id.x() +; CHECK-NEXT:[[TMP7:%.*]] = call range(i32 0, 1024) i32 @llvm.amdgcn.workitem.id.y() +; CHECK-NEXT:[[TMP8:%.*]] = call range(i32 0, 1024) i32 @llvm.amdgcn.workitem.id.z() +; CHECK-NEXT:[[TMP9:%.*]] = mul nuw nsw i32 [[TMP5]], [[TMP4]] +; CHECK-NEXT:[[TMP10:%.*]] = mul i32 [[TMP9]], [[TMP6]] +; CHECK-NEXT:[[TMP11:%.*]] = mul nuw nsw i32 [[TMP7]], [[TMP4]] +; CHECK-NEXT:[[TMP12:%.*]] = add i32 [[TMP10]], [[TMP11]] +; CHECK-NEXT:[[TMP13:%.*]] = add i32 [[TMP12]], [[TMP8]] +; CHECK-NEXT:[[TMP14:%.*]] = getelementptr inbounds [1024 x [4 x i32]], ptr addrspace(3) @scalar_alloca_ptr_with_vector_gep_offset_select.alloca, i32 0, i32 [[TMP13]] +; CHECK-NEXT:[[GETELEMENTPTR0:%.*]] = getelementptr inbounds i8, ptr addrspace(3) [[TMP14]], <4 x i64> +; CHECK-NEXT:[[GETELEMENTPTR1:%.*]] = getelementptr inbounds i8, ptr addrspace(3) [[TMP14]], <4 x i64> +; CHECK-NEXT:[[SELECT:%.*]] = select i1 [[COND]], <4 x ptr addrspace(3)> [[GETELEMENTPTR0]], <4 x ptr addrspace(3)> [[GETELEMENTPTR1]] +; CHECK-NEXT:[[EXTRACTELEMENT:%.*]] = extractelement <4 x ptr addrspace(3)> [[SELECT]], i64 1 +; CHECK-NEXT:store i32 0, ptr addrspace(3) [[EXTRACTELEMENT]], align 4 ; CHECK-NEXT:ret void ; bb: ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] ValueTracking: Allow getUnderlyingObject to look at vectors (PR #114311)
llvmbot wrote: @llvm/pr-subscribers-llvm-analysis Author: Matt Arsenault (arsenm) Changes We can identify some easy vector of pointer cases, such as a getelementptr with a scalar base. --- Full diff: https://github.com/llvm/llvm-project/pull/114311.diff 2 Files Affected: - (modified) llvm/lib/Analysis/ValueTracking.cpp (+2-2) - (modified) llvm/test/CodeGen/AMDGPU/promote-alloca-vector-gep.ll (+20-7) ``diff diff --git a/llvm/lib/Analysis/ValueTracking.cpp b/llvm/lib/Analysis/ValueTracking.cpp index aa5142f33624099..dd7cac9f4fc3259 100644 --- a/llvm/lib/Analysis/ValueTracking.cpp +++ b/llvm/lib/Analysis/ValueTracking.cpp @@ -6686,11 +6686,11 @@ static bool isSameUnderlyingObjectInLoop(const PHINode *PN, } const Value *llvm::getUnderlyingObject(const Value *V, unsigned MaxLookup) { - if (!V->getType()->isPointerTy()) -return V; for (unsigned Count = 0; MaxLookup == 0 || Count < MaxLookup; ++Count) { if (auto *GEP = dyn_cast(V)) { V = GEP->getPointerOperand(); + if (!V->getType()->isPointerTy()) // Only handle scalar pointer base. 
+return nullptr; } else if (Operator::getOpcode(V) == Instruction::BitCast || Operator::getOpcode(V) == Instruction::AddrSpaceCast) { Value *NewV = cast(V)->getOperand(0); diff --git a/llvm/test/CodeGen/AMDGPU/promote-alloca-vector-gep.ll b/llvm/test/CodeGen/AMDGPU/promote-alloca-vector-gep.ll index 355a2b8796b24dc..76e1868b3c4b9e8 100644 --- a/llvm/test/CodeGen/AMDGPU/promote-alloca-vector-gep.ll +++ b/llvm/test/CodeGen/AMDGPU/promote-alloca-vector-gep.ll @@ -35,17 +35,30 @@ bb: ret void } -; TODO: Should be able to promote this define amdgpu_kernel void @scalar_alloca_ptr_with_vector_gep_offset_select(i1 %cond) { ; CHECK-LABEL: define amdgpu_kernel void @scalar_alloca_ptr_with_vector_gep_offset_select( ; CHECK-SAME: i1 [[COND:%.*]]) { ; CHECK-NEXT: [[BB:.*:]] -; CHECK-NEXT:[[ALLOCA:%.*]] = alloca [4 x i32], align 4, addrspace(5) -; CHECK-NEXT:[[GETELEMENTPTR0:%.*]] = getelementptr inbounds i8, ptr addrspace(5) [[ALLOCA]], <4 x i64> -; CHECK-NEXT:[[GETELEMENTPTR1:%.*]] = getelementptr inbounds i8, ptr addrspace(5) [[ALLOCA]], <4 x i64> -; CHECK-NEXT:[[SELECT:%.*]] = select i1 [[COND]], <4 x ptr addrspace(5)> [[GETELEMENTPTR0]], <4 x ptr addrspace(5)> [[GETELEMENTPTR1]] -; CHECK-NEXT:[[EXTRACTELEMENT:%.*]] = extractelement <4 x ptr addrspace(5)> [[SELECT]], i64 1 -; CHECK-NEXT:store i32 0, ptr addrspace(5) [[EXTRACTELEMENT]], align 4 +; CHECK-NEXT:[[TMP0:%.*]] = call noalias nonnull dereferenceable(64) ptr addrspace(4) @llvm.amdgcn.dispatch.ptr() +; CHECK-NEXT:[[TMP1:%.*]] = getelementptr inbounds i32, ptr addrspace(4) [[TMP0]], i64 1 +; CHECK-NEXT:[[TMP2:%.*]] = load i32, ptr addrspace(4) [[TMP1]], align 4, !invariant.load [[META0]] +; CHECK-NEXT:[[TMP3:%.*]] = getelementptr inbounds i32, ptr addrspace(4) [[TMP0]], i64 2 +; CHECK-NEXT:[[TMP4:%.*]] = load i32, ptr addrspace(4) [[TMP3]], align 4, !range [[RNG1]], !invariant.load [[META0]] +; CHECK-NEXT:[[TMP5:%.*]] = lshr i32 [[TMP2]], 16 +; CHECK-NEXT:[[TMP6:%.*]] = call range(i32 0, 1024) i32 
@llvm.amdgcn.workitem.id.x() +; CHECK-NEXT:[[TMP7:%.*]] = call range(i32 0, 1024) i32 @llvm.amdgcn.workitem.id.y() +; CHECK-NEXT:[[TMP8:%.*]] = call range(i32 0, 1024) i32 @llvm.amdgcn.workitem.id.z() +; CHECK-NEXT:[[TMP9:%.*]] = mul nuw nsw i32 [[TMP5]], [[TMP4]] +; CHECK-NEXT:[[TMP10:%.*]] = mul i32 [[TMP9]], [[TMP6]] +; CHECK-NEXT:[[TMP11:%.*]] = mul nuw nsw i32 [[TMP7]], [[TMP4]] +; CHECK-NEXT:[[TMP12:%.*]] = add i32 [[TMP10]], [[TMP11]] +; CHECK-NEXT:[[TMP13:%.*]] = add i32 [[TMP12]], [[TMP8]] +; CHECK-NEXT:[[TMP14:%.*]] = getelementptr inbounds [1024 x [4 x i32]], ptr addrspace(3) @scalar_alloca_ptr_with_vector_gep_offset_select.alloca, i32 0, i32 [[TMP13]] +; CHECK-NEXT:[[GETELEMENTPTR0:%.*]] = getelementptr inbounds i8, ptr addrspace(3) [[TMP14]], <4 x i64> +; CHECK-NEXT:[[GETELEMENTPTR1:%.*]] = getelementptr inbounds i8, ptr addrspace(3) [[TMP14]], <4 x i64> +; CHECK-NEXT:[[SELECT:%.*]] = select i1 [[COND]], <4 x ptr addrspace(3)> [[GETELEMENTPTR0]], <4 x ptr addrspace(3)> [[GETELEMENTPTR1]] +; CHECK-NEXT:[[EXTRACTELEMENT:%.*]] = extractelement <4 x ptr addrspace(3)> [[SELECT]], i64 1 +; CHECK-NEXT:store i32 0, ptr addrspace(3) [[EXTRACTELEMENT]], align 4 ; CHECK-NEXT:ret void ; bb: `` https://github.com/llvm/llvm-project/pull/114311 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] ValueTracking: Allow getUnderlyingObject to look at vectors (PR #114311)
https://github.com/arsenm ready_for_review https://github.com/llvm/llvm-project/pull/114311 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [clang] [llvm] [llvm][fatlto] Drop any CFI related instrumentation after emitting bitcode (PR #112788)
https://github.com/ilovepi updated https://github.com/llvm/llvm-project/pull/112788 >From ad89d61e60bac57cf8c66a974d741377ebe1db30 Mon Sep 17 00:00:00 2001 From: Paul Kirth Date: Fri, 18 Oct 2024 01:59:26 + Subject: [PATCH 1/2] Use new enum in constructor Created using spr 1.3.4 --- llvm/lib/Passes/PassBuilderPipelines.cpp | 7 +-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/llvm/lib/Passes/PassBuilderPipelines.cpp b/llvm/lib/Passes/PassBuilderPipelines.cpp index aec79304ab5c3c..0585e83e59a9ab 100644 --- a/llvm/lib/Passes/PassBuilderPipelines.cpp +++ b/llvm/lib/Passes/PassBuilderPipelines.cpp @@ -1631,8 +1631,11 @@ PassBuilder::buildFatLTODefaultPipeline(OptimizationLevel Level, bool ThinLTO, MPM.addPass(EmbedBitcodePass(ThinLTO, EmitSummary)); // If we're doing FatLTO w/ CFI enabled, we don't want the type tests in the - // object file. - MPM.addPass(LowerTypeTestsPass(nullptr, nullptr, true, true)); + // object code, only in the bitcode section, so drop it before we run + // module optimization and generate machine code. If llvm.type.test() isn't in + // the IR, this won't do anything. 
+ MPM.addPass( + LowerTypeTestsPass(nullptr, nullptr, lowertypetests::DropTestKind::All)); // Use the ThinLTO post-link pipeline with sample profiling if (ThinLTO && PGOOpt && PGOOpt->Action == PGOOptions::SampleUse) >From f331c70196e399d0e0d4ec8fdb76d3a313b25ac3 Mon Sep 17 00:00:00 2001 From: Paul Kirth Date: Wed, 30 Oct 2024 23:54:00 + Subject: [PATCH 2/2] Update test to use cc1 Created using spr 1.3.4 --- clang/test/CodeGen/fat-lto-objects-cfi.cpp | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/clang/test/CodeGen/fat-lto-objects-cfi.cpp b/clang/test/CodeGen/fat-lto-objects-cfi.cpp index 022e74fd9b6f22..628951847053ac 100644 --- a/clang/test/CodeGen/fat-lto-objects-cfi.cpp +++ b/clang/test/CodeGen/fat-lto-objects-cfi.cpp @@ -1,7 +1,7 @@ // REQUIRES: x86-registered-target -// RUN: %clangxx --target=x86_64-unknown-fuchsia -O2 -flto -ffat-lto-objects \ -// RUN: -fsanitize=cfi -fvisibility=hidden -S -emit-llvm -o - %s \ +// RUN: %clang_cc1 -triple x86_64-unknown-fuchsia -O2 -flto -ffat-lto-objects \ +// RUN: -fsanitize=cfi-icall -fsanitize-trap=cfi-icall -fvisibility=hidden -emit-llvm -o - %s \ // RUN: | FileCheck %s // CHECK: llvm.embedded.object ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [clang] [llvm] [LLVM] [Clang] Backport "Support for Gentoo `*t64` triples (64-bit time_t ABIs)" (PR #112364)
@@ -294,7 +294,11 @@ class Triple { PAuthTest, -LastEnvironmentType = PAuthTest +GNUT64, +GNUEABIT64, +GNUEABIHFT64, + +LastEnvironmentType = GNUEABIHFT64 mgorny wrote: For the record, Gentoo has already backported it to 18.x as well (I was literally waiting for the patch to be merged as a confirmation that this approach is good). My alternative idea would be to limit the changes to accept `*t64` but return one of the existing enum values, i.e. `-gnut64` would return `::GNU` and so on. We'd lose the ability to customize clang behavior on it, but at least it won't refuse to work right away. https://github.com/llvm/llvm-project/pull/112364 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
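The alternative described above (accept `*t64` but report an existing enum value) could look roughly like the sketch below. It is an illustration under stated assumptions, not the code in `llvm::Triple`: `Env` and `parseEnvironment` are hypothetical, simplified stand-ins, and only three environments are modelled.

```cpp
#include <cassert>
#include <string_view>

// Hypothetical, simplified stand-in for llvm::Triple::EnvironmentType.
enum class Env { Unknown, GNU, GNUEABI, GNUEABIHF };

// Map an environment component to an existing enumerator, treating the
// Gentoo `t64` suffix as an alias instead of a distinct value.
Env parseEnvironment(std::string_view Name) {
  // Strip a trailing "t64" so that "gnut64" is handled like "gnu".
  constexpr std::string_view Suffix = "t64";
  if (Name.size() > Suffix.size() &&
      Name.substr(Name.size() - Suffix.size()) == Suffix)
    Name.remove_suffix(Suffix.size());

  if (Name == "gnueabihf")
    return Env::GNUEABIHF;
  if (Name == "gnueabi")
    return Env::GNUEABI;
  if (Name == "gnu")
    return Env::GNU;
  return Env::Unknown;
}
```

With this shape the enum needs no new values, at the cost mgorny notes: callers can no longer tell `gnut64` apart from `gnu`, so clang behavior cannot be customized for the 64-bit `time_t` ABI.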
[llvm-branch-commits] [flang] [flang][cuda] Data transfer with descriptor (PR #114302)
https://github.com/clementval updated https://github.com/llvm/llvm-project/pull/114302 >From e4c7e31c77bbfda563e4e2c9b591fe2f5cb2c259 Mon Sep 17 00:00:00 2001 From: Valentin Clement Date: Wed, 30 Oct 2024 11:53:12 -0700 Subject: [PATCH 1/2] [flang][cuda] Data transfer with descriptor --- flang/runtime/CUDA/memory.cpp | 34 +++-- flang/unittests/Runtime/CUDA/Memory.cpp | 40 + 2 files changed, 72 insertions(+), 2 deletions(-) diff --git a/flang/runtime/CUDA/memory.cpp b/flang/runtime/CUDA/memory.cpp index 4778a4ae77683f..f25d3b531c84f0 100644 --- a/flang/runtime/CUDA/memory.cpp +++ b/flang/runtime/CUDA/memory.cpp @@ -9,10 +9,32 @@ #include "flang/Runtime/CUDA/memory.h" #include "../terminator.h" #include "flang/Runtime/CUDA/common.h" +#include "flang/Runtime/assign.h" #include "cuda_runtime.h" namespace Fortran::runtime::cuda { +static void *MemmoveHostToDevice( +void *dst, const void *src, std::size_t count) { + // TODO: Use cudaMemcpyAsync when we have support for stream. + CUDA_REPORT_IF_ERROR(cudaMemcpy(dst, src, count, cudaMemcpyHostToDevice)); + return dst; +} + +static void *MemmoveDeviceToHost( +void *dst, const void *src, std::size_t count) { + // TODO: Use cudaMemcpyAsync when we have support for stream. + CUDA_REPORT_IF_ERROR(cudaMemcpy(dst, src, count, cudaMemcpyDeviceToHost)); + return dst; +} + +static void *MemmoveDeviceToDevice( +void *dst, const void *src, std::size_t count) { + // TODO: Use cudaMemcpyAsync when we have support for stream. 
+ CUDA_REPORT_IF_ERROR(cudaMemcpy(dst, src, count, cudaMemcpyHostToDevice)); + return dst; +} + extern "C" { void *RTDEF(CUFMemAlloc)( @@ -90,8 +112,16 @@ void RTDEF(CUFDataTransferPtrDesc)(void *addr, Descriptor *desc, void RTDECL(CUFDataTransferDescDesc)(Descriptor *dstDesc, Descriptor *srcDesc, unsigned mode, const char *sourceFile, int sourceLine) { Terminator terminator{sourceFile, sourceLine}; - terminator.Crash( - "not yet implemented: CUDA data transfer between two descriptors"); + MemmoveFct memmoveFct; + if (mode == kHostToDevice) { +memmoveFct = &MemmoveHostToDevice; + } else if (mode == kDeviceToHost) { +memmoveFct = &MemmoveDeviceToHost; + } else if (mode == kDeviceToDevice) { +memmoveFct = &MemmoveDeviceToDevice; + } + Fortran::runtime::Assign( + dstDesc, srcDesc, terminator, MaybeReallocate, memmoveFct); } } } // namespace Fortran::runtime::cuda diff --git a/flang/unittests/Runtime/CUDA/Memory.cpp b/flang/unittests/Runtime/CUDA/Memory.cpp index 157d3cdb531def..ade05e21b70a89 100644 --- a/flang/unittests/Runtime/CUDA/Memory.cpp +++ b/flang/unittests/Runtime/CUDA/Memory.cpp @@ -9,11 +9,17 @@ #include "flang/Runtime/CUDA/memory.h" #include "gtest/gtest.h" #include "../../../runtime/terminator.h" +#include "../tools.h" #include "flang/Common/Fortran.h" +#include "flang/Runtime/CUDA/allocator.h" #include "flang/Runtime/CUDA/common.h" +#include "flang/Runtime/CUDA/descriptor.h" +#include "flang/Runtime/allocatable.h" +#include "flang/Runtime/allocator-registry.h" #include "cuda_runtime.h" +using namespace Fortran::runtime; using namespace Fortran::runtime::cuda; TEST(MemoryCUFTest, SimpleAllocTramsferFree) { @@ -29,3 +35,37 @@ TEST(MemoryCUFTest, SimpleAllocTramsferFree) { EXPECT_EQ(42, host); RTNAME(CUFMemFree)((void *)dev, kMemTypeDevice, __FILE__, __LINE__); } + +static OwningPtr createAllocatable( +Fortran::common::TypeCategory tc, int kind, int rank = 1) { + return Descriptor::Create(TypeCode{tc, kind}, kind, nullptr, rank, nullptr, + 
CFI_attribute_allocatable); +} + +TEST(MemoryCUFTest, CUFDataTransferDescDesc) { + using Fortran::common::TypeCategory; + RTNAME(CUFRegisterAllocator)(); + // INTEGER(4), DEVICE, ALLOCATABLE :: a(:) + auto dev{createAllocatable(TypeCategory::Integer, 4)}; + dev->SetAllocIdx(kDeviceAllocatorPos); + EXPECT_EQ((int)kDeviceAllocatorPos, dev->GetAllocIdx()); + RTNAME(AllocatableSetBounds)(*dev, 0, 1, 10); + RTNAME(AllocatableAllocate) + (*dev, /*hasStat=*/false, /*errMsg=*/nullptr, __FILE__, __LINE__); + EXPECT_TRUE(dev->IsAllocated()); + + // Create temp array to transfer to device. + auto x{MakeArray(std::vector{10}, + std::vector{0, 1, 2, 3, 4, 5, 6, 7, 8, 9})}; + RTNAME(CUFDataTransferDescDesc)(*dev, *x, kHostToDevice, __FILE__, __LINE__); + + // Retrieve data from device. + auto host{MakeArray(std::vector{10}, + std::vector{0, 0, 0, 0, 0, 0, 0, 0, 0, 0})}; + RTNAME(CUFDataTransferDescDesc)( + *host, *dev, kDeviceToHost, __FILE__, __LINE__); + + for (unsigned i = 0; i < 10; ++i) { +EXPECT_EQ(*host->ZeroBasedIndexedElement(i), (std::int32_t)i); + } +} >From be6734745eba64a1e05886b30eba64409658b5ae Mon Sep 17 00:00:00 2001 From: Valentin Clement Date: Wed, 30 Oct 2024 14:08:23 -0700 Subject: [PATCH 2/2] fix call --- flang/runtime/CUDA/allocatable.cpp | 1 + flang/runtime/CUDA/memory.cpp | 2 +- 2 files cha
[llvm-branch-commits] [clang] [llvm] [LLVM] [Clang] Backport "Support for Gentoo `*t64` triples (64-bit time_t ABIs)" (PR #112364)
@@ -294,7 +294,11 @@ class Triple { PAuthTest, -LastEnvironmentType = PAuthTest +GNUT64, +GNUEABIT64, +GNUEABIHFT64, + +LastEnvironmentType = GNUEABIHFT64 alexrp wrote: We don't mandate a particular patch version of LLVM because we try to support building Zig with a distro-provided LLVM. Distro packages aren't necessarily going to be on the latest LLVM patch release. https://github.com/llvm/llvm-project/pull/112364 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] 78dcfb3 - Revert "[TLI] Add support for hypot libcall. (#113724)"
Author: gulfemsavrun Date: 2024-10-30T15:09:24-07:00 New Revision: 78dcfb323241e46701ff8d19c8509307bce904bb URL: https://github.com/llvm/llvm-project/commit/78dcfb323241e46701ff8d19c8509307bce904bb DIFF: https://github.com/llvm/llvm-project/commit/78dcfb323241e46701ff8d19c8509307bce904bb.diff LOG: Revert "[TLI] Add support for hypot libcall. (#113724)" This reverts commit feb2d867fac3b6339c169fff97ddf0716fce6f0a. Added: Modified: llvm/include/llvm/Analysis/TargetLibraryInfo.def llvm/lib/Analysis/TargetLibraryInfo.cpp llvm/lib/Transforms/Utils/BuildLibCalls.cpp llvm/test/Transforms/InferFunctionAttrs/annotate.ll llvm/test/tools/llvm-tli-checker/ps4-tli-check.yaml llvm/unittests/Analysis/TargetLibraryInfoTest.cpp Removed: diff --git a/llvm/include/llvm/Analysis/TargetLibraryInfo.def b/llvm/include/llvm/Analysis/TargetLibraryInfo.def index fd53a26ef8fc11b..3e23e398f6a7976 100644 --- a/llvm/include/llvm/Analysis/TargetLibraryInfo.def +++ b/llvm/include/llvm/Analysis/TargetLibraryInfo.def @@ -1671,21 +1671,6 @@ TLI_DEFINE_ENUM_INTERNAL(htons) TLI_DEFINE_STRING_INTERNAL("htons") TLI_DEFINE_SIG_INTERNAL(Int16, Int16) -/// double hypot(double x, double y); -TLI_DEFINE_ENUM_INTERNAL(hypot) -TLI_DEFINE_STRING_INTERNAL("hypot") -TLI_DEFINE_SIG_INTERNAL(Dbl, Dbl, Dbl) - -/// float hypotf(float x, float y); -TLI_DEFINE_ENUM_INTERNAL(hypotf) -TLI_DEFINE_STRING_INTERNAL("hypotf") -TLI_DEFINE_SIG_INTERNAL(Flt, Flt, Flt) - -/// long double hypotl(long double x, long double y); -TLI_DEFINE_ENUM_INTERNAL(hypotl) -TLI_DEFINE_STRING_INTERNAL("hypotl") -TLI_DEFINE_SIG_INTERNAL(LDbl, LDbl, LDbl) - /// int iprintf(const char *format, ...); TLI_DEFINE_ENUM_INTERNAL(iprintf) TLI_DEFINE_STRING_INTERNAL("iprintf") diff --git a/llvm/lib/Analysis/TargetLibraryInfo.cpp b/llvm/lib/Analysis/TargetLibraryInfo.cpp index 7f0b98ab3c1514a..0ee83d217a5001e 100644 --- a/llvm/lib/Analysis/TargetLibraryInfo.cpp +++ b/llvm/lib/Analysis/TargetLibraryInfo.cpp @@ -300,7 +300,6 @@ static void 
initializeLibCalls(TargetLibraryInfoImpl &TLI, const Triple &T, TLI.setUnavailable(LibFunc_expf); TLI.setUnavailable(LibFunc_floorf); TLI.setUnavailable(LibFunc_fmodf); - TLI.setUnavailable(LibFunc_hypotf); TLI.setUnavailable(LibFunc_log10f); TLI.setUnavailable(LibFunc_logf); TLI.setUnavailable(LibFunc_modff); @@ -332,7 +331,6 @@ static void initializeLibCalls(TargetLibraryInfoImpl &TLI, const Triple &T, TLI.setUnavailable(LibFunc_floorl); TLI.setUnavailable(LibFunc_fmodl); TLI.setUnavailable(LibFunc_frexpl); -TLI.setUnavailable(LibFunc_hypotl); TLI.setUnavailable(LibFunc_ldexpl); TLI.setUnavailable(LibFunc_log10l); TLI.setUnavailable(LibFunc_logl); diff --git a/llvm/lib/Transforms/Utils/BuildLibCalls.cpp b/llvm/lib/Transforms/Utils/BuildLibCalls.cpp index e039457f313b29e..5fd4fd78c28a953 100644 --- a/llvm/lib/Transforms/Utils/BuildLibCalls.cpp +++ b/llvm/lib/Transforms/Utils/BuildLibCalls.cpp @@ -1215,9 +1215,6 @@ bool llvm::inferNonMandatoryLibFuncAttrs(Function &F, case LibFunc_fmod: case LibFunc_fmodf: case LibFunc_fmodl: - case LibFunc_hypot: - case LibFunc_hypotf: - case LibFunc_hypotl: case LibFunc_isascii: case LibFunc_isdigit: case LibFunc_labs: diff --git a/llvm/test/Transforms/InferFunctionAttrs/annotate.ll b/llvm/test/Transforms/InferFunctionAttrs/annotate.ll index 452d90aa98d88df..d8266f4c6703dd6 100644 --- a/llvm/test/Transforms/InferFunctionAttrs/annotate.ll +++ b/llvm/test/Transforms/InferFunctionAttrs/annotate.ll @@ -589,15 +589,6 @@ declare ptr @gets(ptr) ; CHECK: declare noundef i32 @gettimeofday(ptr nocapture noundef, ptr nocapture noundef) [[NOFREE_NOUNWIND]] declare i32 @gettimeofday(ptr, ptr) -; CHECK: declare double @hypot(double, double) [[NOFREE_NOUNWIND_WILLRETURN_WRITEONLY]] -declare double @hypot(double, double) - -; CHECK: declare float @hypotf(float, float) [[NOFREE_NOUNWIND_WILLRETURN_WRITEONLY]] -declare float @hypotf(float, float) - -; CHECK: declare x86_fp80 @hypotl(x86_fp80, x86_fp80) [[NOFREE_NOUNWIND_WILLRETURN_WRITEONLY]] 
-declare x86_fp80 @hypotl(x86_fp80, x86_fp80) - ; CHECK: declare i32 @isascii(i32) [[NOFREE_NOUNWIND_WILLRETURN_WRITEONLY]] declare i32 @isascii(i32) diff --git a/llvm/test/tools/llvm-tli-checker/ps4-tli-check.yaml b/llvm/test/tools/llvm-tli-checker/ps4-tli-check.yaml index d52f3c751b06695..408b9c39934286f 100644 --- a/llvm/test/tools/llvm-tli-checker/ps4-tli-check.yaml +++ b/llvm/test/tools/llvm-tli-checker/ps4-tli-check.yaml @@ -602,18 +602,6 @@ DynamicSymbols: Type:STT_FUNC Section: .text Binding: STB_GLOBAL - - Name:hypot -Type:STT_FUNC -Section: .text -Binding: STB_GLOBAL - - Name:hypotf -Type:STT_FUNC -Section:
[llvm-branch-commits] [llvm] ValueTracking: Allow getUnderlyingObject to look at vectors (PR #114311)
arsenm wrote: > [!WARNING] > This pull request is not mergeable via GitHub because a downstack PR is open. Once all requirements are satisfied, merge this PR as a stack on Graphite (https://app.graphite.dev/github/pr/llvm/llvm-project/114311). Learn more: https://graphite.dev/docs/merge-pull-requests * **#114311** 👈 * **#114144** * **#114113** * **#114091** * `main` This stack of pull requests is managed by Graphite. Learn more about stacking: https://stacking.dev/ https://github.com/llvm/llvm-project/pull/114311 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] 3ca951e - Revert "[GlobalISel][AArch64] Legalize G_INSERT_VECTOR_ELT for SVE (#114310)"
Author: Thorsten Schütt Date: 2024-10-31T05:40:53+01:00 New Revision: 3ca951e1f59110cb29b9c03fc1733cab1f6fbc30 URL: https://github.com/llvm/llvm-project/commit/3ca951e1f59110cb29b9c03fc1733cab1f6fbc30 DIFF: https://github.com/llvm/llvm-project/commit/3ca951e1f59110cb29b9c03fc1733cab1f6fbc30.diff LOG: Revert "[GlobalISel][AArch64] Legalize G_INSERT_VECTOR_ELT for SVE (#114310)" This reverts commit 6bf214b7c6d74ec581bc52a9142756a1d1df6df0. Added: Modified: llvm/include/llvm/CodeGen/GlobalISel/LegalizerInfo.h llvm/lib/CodeGen/GlobalISel/LegalityPredicates.cpp llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp llvm/lib/Target/AArch64/GISel/AArch64PostLegalizerLowering.cpp Removed: llvm/test/CodeGen/AArch64/GlobalISel/legalize-vector-insert-elt.mir diff --git a/llvm/include/llvm/CodeGen/GlobalISel/LegalizerInfo.h b/llvm/include/llvm/CodeGen/GlobalISel/LegalizerInfo.h index 6811b37767cb21..6d71c150c8da6b 100644 --- a/llvm/include/llvm/CodeGen/GlobalISel/LegalizerInfo.h +++ b/llvm/include/llvm/CodeGen/GlobalISel/LegalizerInfo.h @@ -273,11 +273,6 @@ inline LegalityPredicate typeIsNot(unsigned TypeIdx, LLT Type) { LegalityPredicate typePairInSet(unsigned TypeIdx0, unsigned TypeIdx1, std::initializer_list> TypesInit); -/// True iff the given types for the given tuple of type indexes is one of the -/// specified type tuple. -LegalityPredicate -typeTupleInSet(unsigned TypeIdx0, unsigned TypeIdx1, unsigned TypeIdx2, - std::initializer_list> TypesInit); /// True iff the given types for the given pair of type indexes is one of the /// specified type pairs. 
LegalityPredicate typePairAndMemDescInSet( @@ -509,15 +504,6 @@ class LegalizeRuleSet { using namespace LegalityPredicates; return actionIf(Action, typePairInSet(typeIdx(0), typeIdx(1), Types)); } - - LegalizeRuleSet & - actionFor(LegalizeAction Action, -std::initializer_list> Types) { -using namespace LegalityPredicates; -return actionIf(Action, -typeTupleInSet(typeIdx(0), typeIdx(1), typeIdx(2), Types)); - } - /// Use the given action when type indexes 0 and 1 is any type pair in the /// given list. /// Action should be an action that requires mutation. @@ -629,12 +615,6 @@ class LegalizeRuleSet { return *this; return actionFor(LegalizeAction::Legal, Types); } - LegalizeRuleSet & - legalFor(bool Pred, std::initializer_list> Types) { -if (!Pred) - return *this; -return actionFor(LegalizeAction::Legal, Types); - } /// The instruction is legal when type index 0 is any type in the given list /// and imm index 0 is anything. LegalizeRuleSet &legalForTypeWithAnyImm(std::initializer_list Types) { diff --git a/llvm/lib/CodeGen/GlobalISel/LegalityPredicates.cpp b/llvm/lib/CodeGen/GlobalISel/LegalityPredicates.cpp index dc7ed6cbe8b7da..8fe48195c610be 100644 --- a/llvm/lib/CodeGen/GlobalISel/LegalityPredicates.cpp +++ b/llvm/lib/CodeGen/GlobalISel/LegalityPredicates.cpp @@ -49,17 +49,6 @@ LegalityPredicate LegalityPredicates::typePairInSet( }; } -LegalityPredicate LegalityPredicates::typeTupleInSet( -unsigned TypeIdx0, unsigned TypeIdx1, unsigned TypeIdx2, -std::initializer_list> TypesInit) { - SmallVector, 4> Types = TypesInit; - return [=](const LegalityQuery &Query) { -std::tuple Match = { -Query.Types[TypeIdx0], Query.Types[TypeIdx1], Query.Types[TypeIdx2]}; -return llvm::is_contained(Types, Match); - }; -} - LegalityPredicate LegalityPredicates::typePairAndMemDescInSet( unsigned TypeIdx0, unsigned TypeIdx1, unsigned MMOIdx, std::initializer_list TypesAndMemDescInit) { diff --git a/llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp 
b/llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp index 7beda0e92a75bc..6024027afaf6ce 100644 --- a/llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp +++ b/llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp @@ -978,10 +978,6 @@ AArch64LegalizerInfo::AArch64LegalizerInfo(const AArch64Subtarget &ST) getActionDefinitionsBuilder(G_INSERT_VECTOR_ELT) .legalIf( typeInSet(0, {v16s8, v8s8, v8s16, v4s16, v4s32, v2s32, v2s64, v2p0})) - .legalFor(HasSVE, {{nxv16s8, s32, s64}, - {nxv8s16, s32, s64}, - {nxv4s32, s32, s64}, - {nxv2s64, s64, s64}}) .moreElementsToNextPow2(0) .widenVectorEltsToVectorMinSize(0, 64) .clampNumElements(0, v8s8, v16s8) diff --git a/llvm/lib/Target/AArch64/GISel/AArch64PostLegalizerLowering.cpp b/llvm/lib/Target/AArch64/GISel/AArch64PostLegalizerLowering.cpp index 0bf0a4bf27c44d..b40fe55fdfaf67 100644 --- a/llvm/lib/Target/AArch64/GISel/AArch64PostLegalizerLowering.cpp +++ b/llvm/lib/Target/AArch64/GISel/AArch64PostLegalizerLowering.cpp @@ -161,8 +161,6 @@ bool matchREV(MachineInstr
[llvm-branch-commits] [llvm] [llvm][fatlto] Drop any CFI related instrumentation after emitting bitcode (PR #112788)
https://github.com/pcc approved this pull request. https://github.com/llvm/llvm-project/pull/112788 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [clang] [llvm] [LLVM] [Clang] Backport "Support for Gentoo `*t64` triples (64-bit time_t ABIs)" (PR #112364)
@@ -294,7 +294,11 @@ class Triple {
 PAuthTest,
-LastEnvironmentType = PAuthTest
+GNUT64,
+GNUEABIT64,
+GNUEABIHFT64,
+
+LastEnvironmentType = GNUEABIHFT64

tstellar wrote:

@alexrp Why would zig still need to deal with it? I don't think it's likely 19.1.3 will be used much once 19.1.4 comes out.

https://github.com/llvm/llvm-project/pull/112364
[llvm-branch-commits] [llvm] GlobalISel: Fix combine duplicating atomic loads (PR #111730)
https://github.com/arsenm updated https://github.com/llvm/llvm-project/pull/111730 >From 41780b3398f46745fabe2ae2d24ca02c6580cf49 Mon Sep 17 00:00:00 2001 From: Matt Arsenault Date: Wed, 9 Oct 2024 22:05:48 +0400 Subject: [PATCH 1/2] GlobalISel: Fix combine duplicating atomic loads The sext_inreg (load) combine was not deleting the old load instruction, and it would never be deleted if volatile or atomic. --- .../lib/CodeGen/GlobalISel/CombinerHelper.cpp | 1 + .../AMDGPU/GlobalISel/atomic_load_flat.ll | 96 --- .../AMDGPU/GlobalISel/atomic_load_global.ll | 51 +++--- .../AMDGPU/GlobalISel/atomic_load_local_2.ll | 36 ++- ...lizer-combiner-sextload-from-sextinreg.mir | 2 - 5 files changed, 40 insertions(+), 146 deletions(-) diff --git a/llvm/lib/CodeGen/GlobalISel/CombinerHelper.cpp b/llvm/lib/CodeGen/GlobalISel/CombinerHelper.cpp index b7ddf9f479ef8e..a5cabd46e23124 100644 --- a/llvm/lib/CodeGen/GlobalISel/CombinerHelper.cpp +++ b/llvm/lib/CodeGen/GlobalISel/CombinerHelper.cpp @@ -1110,6 +1110,7 @@ void CombinerHelper::applySextInRegOfLoad( Builder.buildLoadInstr(TargetOpcode::G_SEXTLOAD, MI.getOperand(0).getReg(), LoadDef->getPointerReg(), *NewMMO); MI.eraseFromParent(); + LoadDef->eraseFromParent(); } /// Return true if 'MI' is a load or a store that may be fold it's address diff --git a/llvm/test/CodeGen/AMDGPU/GlobalISel/atomic_load_flat.ll b/llvm/test/CodeGen/AMDGPU/GlobalISel/atomic_load_flat.ll index 817d1af9c226c8..83912b1e77db20 100644 --- a/llvm/test/CodeGen/AMDGPU/GlobalISel/atomic_load_flat.ll +++ b/llvm/test/CodeGen/AMDGPU/GlobalISel/atomic_load_flat.ll @@ -27,32 +27,12 @@ define i32 @atomic_load_flat_monotonic_i8_zext_to_i32(ptr %ptr) { } define i32 @atomic_load_flat_monotonic_i8_sext_to_i32(ptr %ptr) { -; GFX7-LABEL: atomic_load_flat_monotonic_i8_sext_to_i32: -; GFX7: ; %bb.0: -; GFX7-NEXT:s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0) -; GFX7-NEXT:flat_load_sbyte v2, v[0:1] glc -; GFX7-NEXT:flat_load_ubyte v0, v[0:1] glc -; GFX7-NEXT:s_waitcnt vmcnt(0) 
lgkmcnt(0) -; GFX7-NEXT:v_mov_b32_e32 v0, v2 -; GFX7-NEXT:s_setpc_b64 s[30:31] -; -; GFX8-LABEL: atomic_load_flat_monotonic_i8_sext_to_i32: -; GFX8: ; %bb.0: -; GFX8-NEXT:s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0) -; GFX8-NEXT:flat_load_sbyte v2, v[0:1] glc -; GFX8-NEXT:flat_load_ubyte v0, v[0:1] glc -; GFX8-NEXT:s_waitcnt vmcnt(0) lgkmcnt(0) -; GFX8-NEXT:v_mov_b32_e32 v0, v2 -; GFX8-NEXT:s_setpc_b64 s[30:31] -; -; GFX9-LABEL: atomic_load_flat_monotonic_i8_sext_to_i32: -; GFX9: ; %bb.0: -; GFX9-NEXT:s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0) -; GFX9-NEXT:flat_load_sbyte v2, v[0:1] glc -; GFX9-NEXT:flat_load_ubyte v3, v[0:1] glc -; GFX9-NEXT:s_waitcnt vmcnt(0) lgkmcnt(0) -; GFX9-NEXT:v_mov_b32_e32 v0, v2 -; GFX9-NEXT:s_setpc_b64 s[30:31] +; GCN-LABEL: atomic_load_flat_monotonic_i8_sext_to_i32: +; GCN: ; %bb.0: +; GCN-NEXT:s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0) +; GCN-NEXT:flat_load_sbyte v0, v[0:1] glc +; GCN-NEXT:s_waitcnt vmcnt(0) lgkmcnt(0) +; GCN-NEXT:s_setpc_b64 s[30:31] %load = load atomic i8, ptr %ptr monotonic, align 1 %ext = sext i8 %load to i32 ret i32 %ext @@ -71,32 +51,12 @@ define i16 @atomic_load_flat_monotonic_i8_zext_to_i16(ptr %ptr) { } define i16 @atomic_load_flat_monotonic_i8_sext_to_i16(ptr %ptr) { -; GFX7-LABEL: atomic_load_flat_monotonic_i8_sext_to_i16: -; GFX7: ; %bb.0: -; GFX7-NEXT:s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0) -; GFX7-NEXT:flat_load_sbyte v2, v[0:1] glc -; GFX7-NEXT:flat_load_ubyte v0, v[0:1] glc -; GFX7-NEXT:s_waitcnt vmcnt(0) lgkmcnt(0) -; GFX7-NEXT:v_mov_b32_e32 v0, v2 -; GFX7-NEXT:s_setpc_b64 s[30:31] -; -; GFX8-LABEL: atomic_load_flat_monotonic_i8_sext_to_i16: -; GFX8: ; %bb.0: -; GFX8-NEXT:s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0) -; GFX8-NEXT:flat_load_sbyte v2, v[0:1] glc -; GFX8-NEXT:flat_load_ubyte v0, v[0:1] glc -; GFX8-NEXT:s_waitcnt vmcnt(0) lgkmcnt(0) -; GFX8-NEXT:v_mov_b32_e32 v0, v2 -; GFX8-NEXT:s_setpc_b64 s[30:31] -; -; GFX9-LABEL: atomic_load_flat_monotonic_i8_sext_to_i16: -; GFX9: ; %bb.0: -; GFX9-NEXT:s_waitcnt 
vmcnt(0) expcnt(0) lgkmcnt(0) -; GFX9-NEXT:flat_load_sbyte v2, v[0:1] glc -; GFX9-NEXT:flat_load_ubyte v3, v[0:1] glc -; GFX9-NEXT:s_waitcnt vmcnt(0) lgkmcnt(0) -; GFX9-NEXT:v_mov_b32_e32 v0, v2 -; GFX9-NEXT:s_setpc_b64 s[30:31] +; GCN-LABEL: atomic_load_flat_monotonic_i8_sext_to_i16: +; GCN: ; %bb.0: +; GCN-NEXT:s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0) +; GCN-NEXT:flat_load_sbyte v0, v[0:1] glc +; GCN-NEXT:s_waitcnt vmcnt(0) lgkmcnt(0) +; GCN-NEXT:s_setpc_b64 s[30:31] %load = load atomic i8, ptr %ptr monotonic, align 1 %ext = sext i8 %load to i16 ret i16 %ext @@ -126,32 +86,12 @@ define i32 @atomic_load_flat_monotonic_i16_zext_to_i32(ptr %ptr) { } define i32 @atom
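The commit message above says the combine built the new sign-extending load but never deleted the old one, and that an atomic or volatile load is never cleaned up afterwards. A minimal sketch of that failure mode follows, using a toy instruction list rather than LLVM's `MachineInstr` API; every name in it (`ToyInstr`, `toyCombineSextInRegOfLoad`, `toyDeadCodeElim`) is hypothetical and exists only for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <list>
#include <string>

// Toy stand-in for a machine instruction; this is not an LLVM type.
struct ToyInstr {
  std::string Opcode; // "LOAD", "SEXT_INREG" or "SEXTLOAD"
  bool Atomic;        // atomic accesses must not be removed as dead code
};

using ToyBlock = std::list<ToyInstr>;

// Toy combine: rewrite LOAD + SEXT_INREG into a single SEXTLOAD.
// EraseOldLoad == false models the behaviour before the fix.
void toyCombineSextInRegOfLoad(ToyBlock &BB, bool EraseOldLoad) {
  for (auto It = BB.begin(); It != BB.end(); ++It) {
    auto Next = std::next(It);
    if (It->Opcode != "LOAD" || Next == BB.end() ||
        Next->Opcode != "SEXT_INREG")
      continue;
    // Build the replacement, then erase the extension it replaces.
    BB.insert(Next, ToyInstr{"SEXTLOAD", It->Atomic});
    BB.erase(Next);
    // The one-line fix: the original load has to be erased explicitly.
    if (EraseOldLoad)
      BB.erase(It);
    return;
  }
}

// Toy cleanup: assumes every LOAD left in the block is unused, and drops
// it only when it is not atomic.
void toyDeadCodeElim(ToyBlock &BB) {
  BB.remove_if(
      [](const ToyInstr &I) { return I.Opcode == "LOAD" && !I.Atomic; });
}

// Number of instructions in the block that access memory.
int toyCountMemoryAccesses(const ToyBlock &BB) {
  return static_cast<int>(
      std::count_if(BB.begin(), BB.end(), [](const ToyInstr &I) {
        return I.Opcode == "LOAD" || I.Opcode == "SEXTLOAD";
      }));
}
```

With a non-atomic load the later cleanup hides the omission, which is presumably why it went unnoticed; with an atomic load the stale instruction survives, matching the duplicated `flat_load_sbyte`/`flat_load_ubyte` pairs removed from the test expectations.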
[llvm-branch-commits] [mlir] [mlir][func] Remove `func-bufferize` pass (PR #114152)
@@ -111,9 +111,9 @@ module attributes {transform.with_named_sequence} {
 transform.named_sequence @__transform_main(%arg1: !transform.any_op) {
 %1 = transform.structured.match ops{["func.func"]} in %arg1 : (!transform.any_op) -> !transform.any_op
-// func-bufferize can be applied only to ModuleOps.
+// duplicate-function-elimination can be applied only to ModuleOps.
 // expected-error @below {{pass pipeline failed}}
-transform.apply_registered_pass "func-bufferize" to %1 : (!transform.any_op) -> !transform.any_op
+transform.apply_registered_pass "duplicate-function-elimination" to %1 : (!transform.any_op) -> !transform.any_op

javedabsar1 wrote:

AFAIU, 'func-bufferize' just happened to be used here; it is not really playing any specific role, right?

https://github.com/llvm/llvm-project/pull/114152
[llvm-branch-commits] [clang] [Serialization] Code cleanups and polish 83233 (PR #83237)
ChuanqiXu9 wrote:

> > I tried to take a look at eigen and it looks like the declaration looks
> > well and I had no clue how that happens. A reproducer may be necessary here
> > to proceed. Thanks in advance.
>
> I can reproduce using the following sources and invocations outlined in
> `run.sh`
> [usx95@363d877](https://github.com/usx95/llvm-project/commit/363d877bd317638b197f57c3591860e1688950d5)
>
> ```shell
> > module-reproducer/run.sh
>
> Building sensor_data.h
> Building tensor.h
> Building base.cc
> In module 'sensor_data':
> ../../eigen/Eigen/src/Core/../plugins/CommonCwiseBinaryOps.inc:47:29:
> warning: inline function 'Eigen::operator*' is not defined
> [-Wundefined-inline]
>    47 | EIGEN_MAKE_SCALAR_BINARY_OP(operator*, product)
>       | ^
> ../../eigen/Eigen/src/Geometry/AngleAxis.h:221:35: note: used here
>   221 | Vector3 sin_axis = sin(m_angle) * m_axis;
>       | ^
> 1 warning generated.
> ```
>
> This warning is a new breakage and does not happen without this change
> (ignore the linker failure). Let me know if you can reproduce or need help
> reproducing.

Sorry, could you provide the hash of the commit that avoids the warning? I tried with d54953ef472bfd8d4b503aae7682aa76c49f8cc0 but I still saw the warning. I suspected it might be due to caching, so I added `rm -fr ~/.cache/clang/ModuleCache/` at the top of the script and `rm -f ${sensor_data} && rm -f ${tensor}` at the end, but I still saw the failures. I am using libc++17. What's your environment?

https://github.com/llvm/llvm-project/pull/83237
[llvm-branch-commits] [clang] 565d18d - Revert "[Clang][Sema] Always use latest redeclaration of primary template (#1…"
Author: Felipe de Azevedo Piovezan Date: 2024-10-30T14:03:47-07:00 New Revision: 565d18daf296b9848cf9d1b23fc82892e10eef8c URL: https://github.com/llvm/llvm-project/commit/565d18daf296b9848cf9d1b23fc82892e10eef8c DIFF: https://github.com/llvm/llvm-project/commit/565d18daf296b9848cf9d1b23fc82892e10eef8c.diff LOG: Revert "[Clang][Sema] Always use latest redeclaration of primary template (#1…" This reverts commit 90786adade22784a52856a0e8b545ec6710b47f6. Added: Modified: clang/include/clang/AST/DeclTemplate.h clang/lib/AST/Decl.cpp clang/lib/AST/DeclCXX.cpp clang/lib/AST/DeclTemplate.cpp clang/lib/Sema/SemaDecl.cpp clang/lib/Sema/SemaInit.cpp clang/lib/Sema/SemaTemplateInstantiate.cpp clang/test/AST/ast-dump-decl.cpp clang/test/CXX/temp/temp.spec/temp.expl.spec/p7.cpp Removed: diff --git a/clang/include/clang/AST/DeclTemplate.h b/clang/include/clang/AST/DeclTemplate.h index 0ca3fd48e81cf4..a572e3380f1655 100644 --- a/clang/include/clang/AST/DeclTemplate.h +++ b/clang/include/clang/AST/DeclTemplate.h @@ -857,6 +857,16 @@ class RedeclarableTemplateDecl : public TemplateDecl, /// \endcode bool isMemberSpecialization() const { return Common.getInt(); } + /// Determines whether any redeclaration of this template was + /// a specialization of a member template. + bool hasMemberSpecialization() const { +for (const auto *D : redecls()) { + if (D->isMemberSpecialization()) +return true; +} +return false; + } + /// Note that this member template is a specialization. void setMemberSpecialization() { assert(!isMemberSpecialization() && "already a member specialization"); @@ -1955,7 +1965,13 @@ class ClassTemplateSpecializationDecl : public CXXRecordDecl, /// specialization which was specialized by this. 
llvm::PointerUnion - getSpecializedTemplateOrPartial() const; + getSpecializedTemplateOrPartial() const { +if (const auto *PartialSpec = +SpecializedTemplate.dyn_cast()) + return PartialSpec->PartialSpecialization; + +return SpecializedTemplate.get(); + } /// Retrieve the set of template arguments that should be used /// to instantiate members of the class template or class template partial @@ -2192,6 +2208,17 @@ class ClassTemplatePartialSpecializationDecl return InstantiatedFromMember.getInt(); } + /// Determines whether any redeclaration of this this class template partial + /// specialization was a specialization of a member partial specialization. + bool hasMemberSpecialization() const { +for (const auto *D : redecls()) { + if (cast(D) + ->isMemberSpecialization()) +return true; +} +return false; + } + /// Note that this member template is a specialization. void setMemberSpecialization() { return InstantiatedFromMember.setInt(true); } @@ -2713,7 +2740,13 @@ class VarTemplateSpecializationDecl : public VarDecl, /// Retrieve the variable template or variable template partial /// specialization which was specialized by this. llvm::PointerUnion - getSpecializedTemplateOrPartial() const; + getSpecializedTemplateOrPartial() const { +if (const auto *PartialSpec = +SpecializedTemplate.dyn_cast()) + return PartialSpec->PartialSpecialization; + +return SpecializedTemplate.get(); + } /// Retrieve the set of template arguments that should be used /// to instantiate the initializer of the variable template or variable @@ -2947,6 +2980,18 @@ class VarTemplatePartialSpecializationDecl return InstantiatedFromMember.getInt(); } + /// Determines whether any redeclaration of this this variable template + /// partial specialization was a specialization of a member partial + /// specialization. 
+ bool hasMemberSpecialization() const { +for (const auto *D : redecls()) { + if (cast(D) + ->isMemberSpecialization()) +return true; +} +return false; + } + /// Note that this member template is a specialization. void setMemberSpecialization() { return InstantiatedFromMember.setInt(true); } @@ -3119,9 +3164,6 @@ class VarTemplateDecl : public RedeclarableTemplateDecl { return makeSpecIterator(getSpecializations(), true); } - /// Merge \p Prev with our RedeclarableTemplateDecl::Common. - void mergePrevDecl(VarTemplateDecl *Prev); - // Implement isa/cast/dyncast support static bool classof(const Decl *D) { return classofKind(D->getKind()); } static bool classofKind(Kind K) { return K == VarTemplate; } diff --git a/clang/lib/AST/Decl.cpp b/clang/lib/AST/Decl.cpp index cd173d17263792..86913763ef9ff5 100644 --- a/clang/lib/AST/Decl.cpp +++ b/clang/lib/AST/Decl.cpp @@ -2708,7 +2708,7 @@ VarDecl *VarDecl::getTemplateInstantiationPattern() const { if (isTemplateInstantiation(VDTemplSpec->getTemplateSpecializationKind())) {
[llvm-branch-commits] [clang] [llvm] [LLVM] [Clang] Backport "Support for Gentoo `*t64` triples (64-bit time_t ABIs)" (PR #112364)
https://github.com/alexrp edited https://github.com/llvm/llvm-project/pull/112364
[llvm-branch-commits] [clang] [llvm] [llvm][fatlto] Drop any CFI related instrumentation after emitting bitcode (PR #112788)
https://github.com/ilovepi updated https://github.com/llvm/llvm-project/pull/112788 >From ad89d61e60bac57cf8c66a974d741377ebe1db30 Mon Sep 17 00:00:00 2001 From: Paul Kirth Date: Fri, 18 Oct 2024 01:59:26 + Subject: [PATCH 1/2] Use new enum in constructor Created using spr 1.3.4 --- llvm/lib/Passes/PassBuilderPipelines.cpp | 7 +-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/llvm/lib/Passes/PassBuilderPipelines.cpp b/llvm/lib/Passes/PassBuilderPipelines.cpp index aec79304ab5c3c..0585e83e59a9ab 100644 --- a/llvm/lib/Passes/PassBuilderPipelines.cpp +++ b/llvm/lib/Passes/PassBuilderPipelines.cpp @@ -1631,8 +1631,11 @@ PassBuilder::buildFatLTODefaultPipeline(OptimizationLevel Level, bool ThinLTO, MPM.addPass(EmbedBitcodePass(ThinLTO, EmitSummary)); // If we're doing FatLTO w/ CFI enabled, we don't want the type tests in the - // object file. - MPM.addPass(LowerTypeTestsPass(nullptr, nullptr, true, true)); + // object code, only in the bitcode section, so drop it before we run + // module optimization and generate machine code. If llvm.type.test() isn't in + // the IR, this won't do anything. 
+ MPM.addPass( + LowerTypeTestsPass(nullptr, nullptr, lowertypetests::DropTestKind::All)); // Use the ThinLTO post-link pipeline with sample profiling if (ThinLTO && PGOOpt && PGOOpt->Action == PGOOptions::SampleUse) >From f331c70196e399d0e0d4ec8fdb76d3a313b25ac3 Mon Sep 17 00:00:00 2001 From: Paul Kirth Date: Wed, 30 Oct 2024 23:54:00 + Subject: [PATCH 2/2] Update test to use cc1 Created using spr 1.3.4 --- clang/test/CodeGen/fat-lto-objects-cfi.cpp | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/clang/test/CodeGen/fat-lto-objects-cfi.cpp b/clang/test/CodeGen/fat-lto-objects-cfi.cpp index 022e74fd9b6f22..628951847053ac 100644 --- a/clang/test/CodeGen/fat-lto-objects-cfi.cpp +++ b/clang/test/CodeGen/fat-lto-objects-cfi.cpp @@ -1,7 +1,7 @@ // REQUIRES: x86-registered-target -// RUN: %clangxx --target=x86_64-unknown-fuchsia -O2 -flto -ffat-lto-objects \ -// RUN: -fsanitize=cfi -fvisibility=hidden -S -emit-llvm -o - %s \ +// RUN: %clang_cc1 -triple x86_64-unknown-fuchsia -O2 -flto -ffat-lto-objects \ +// RUN: -fsanitize=cfi-icall -fsanitize-trap=cfi-icall -fvisibility=hidden -emit-llvm -o - %s \ // RUN: | FileCheck %s // CHECK: llvm.embedded.object ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
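The first patch in this series replaces the `true, true` arguments with `lowertypetests::DropTestKind::All`, which makes the call site say what is being dropped. A minimal sketch of that design choice follows. Only `DropTestKind::All` appears in the patch; the other enumerators, their meanings, and all `Toy*` names below are assumptions made for illustration and are not taken from LLVM's headers.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for the enum used in the patch. The semantics given
// to None and Assume here are assumptions for this sketch.
enum class ToyDropTestKind {
  None,   // keep every type test
  Assume, // drop only the tests whose sole use is an assumption
  All     // drop every type test
};

// Toy model of one llvm.type.test call in a module.
struct ToyTypeTest {
  bool OnlyFeedsAssume;
};

// Returns how many type tests would survive lowering under the given policy.
int toyCountSurvivingTests(const std::vector<ToyTypeTest> &Tests,
                           ToyDropTestKind Kind) {
  int Surviving = 0;
  for (const ToyTypeTest &T : Tests) {
    bool Dropped = Kind == ToyDropTestKind::All ||
                   (Kind == ToyDropTestKind::Assume && T.OnlyFeedsAssume);
    if (!Dropped)
      ++Surviving;
  }
  return Surviving;
}
```

A call such as `toyCountSurvivingTests(Tests, ToyDropTestKind::All)` is self-describing, whereas a pair of adjacent `bool` parameters can be swapped without any compiler diagnostic.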
[llvm-branch-commits] [flang] [flang][cuda] Data transfer with descriptor (PR #114302)
https://github.com/Renaud-K approved this pull request. Looks good. Nice way of testing the runtime. https://github.com/llvm/llvm-project/pull/114302
[llvm-branch-commits] [llvm] 70a8e4c - Revert "[DXIL] Add GroupMemoryBarrierWithGroupSync intrinsic (#111884)"
Author: Adam Yang Date: 2024-10-30T15:57:27-07:00 New Revision: 70a8e4c4f4f0eba9f9192fae97654d3a32a731cd URL: https://github.com/llvm/llvm-project/commit/70a8e4c4f4f0eba9f9192fae97654d3a32a731cd DIFF: https://github.com/llvm/llvm-project/commit/70a8e4c4f4f0eba9f9192fae97654d3a32a731cd.diff LOG: Revert "[DXIL] Add GroupMemoryBarrierWithGroupSync intrinsic (#111884)" This reverts commit 9a5b3a1bbca6790602ec3291da850fc4485cc807. Added: Modified: llvm/include/llvm/IR/IntrinsicsDirectX.td llvm/lib/Target/DirectX/DXIL.td llvm/lib/Target/DirectX/DXILOpLowering.cpp llvm/utils/TableGen/DXILEmitter.cpp Removed: llvm/test/CodeGen/DirectX/group_memory_barrier_with_group_sync.ll diff --git a/llvm/include/llvm/IR/IntrinsicsDirectX.td b/llvm/include/llvm/IR/IntrinsicsDirectX.td index dada426368995d..e30d37f69f781e 100644 --- a/llvm/include/llvm/IR/IntrinsicsDirectX.td +++ b/llvm/include/llvm/IR/IntrinsicsDirectX.td @@ -92,6 +92,4 @@ def int_dx_step : DefaultAttrsIntrinsic<[LLVMMatchType<0>], [llvm_anyfloat_ty, L def int_dx_splitdouble : DefaultAttrsIntrinsic<[llvm_anyint_ty, LLVMMatchType<0>], [LLVMScalarOrSameVectorWidth<0, llvm_double_ty>], [IntrNoMem]>; def int_dx_radians : DefaultAttrsIntrinsic<[llvm_anyfloat_ty], [LLVMMatchType<0>], [IntrNoMem]>; - -def int_dx_group_memory_barrier_with_group_sync : DefaultAttrsIntrinsic<[], [], []>; } diff --git a/llvm/lib/Target/DirectX/DXIL.td b/llvm/lib/Target/DirectX/DXIL.td index 263ca50011aa7b..1e8dc63ffa257e 100644 --- a/llvm/lib/Target/DirectX/DXIL.td +++ b/llvm/lib/Target/DirectX/DXIL.td @@ -294,43 +294,6 @@ class Attributes attrs> { list op_attrs = attrs; } -class DXILConstant { - int value = value_; -} - -defset list BarrierModes = { - def BarrierMode_DeviceMemoryBarrier : DXILConstant<2>; - def BarrierMode_DeviceMemoryBarrierWithGroupSync : DXILConstant<3>; - def BarrierMode_GroupMemoryBarrier : DXILConstant<8>; - def BarrierMode_GroupMemoryBarrierWithGroupSync : DXILConstant<9>; - def BarrierMode_AllMemoryBarrier : 
DXILConstant<10>; - def BarrierMode_AllMemoryBarrierWithGroupSync: DXILConstant<11>; -} - -// Intrinsic arg selection -class Arg { - int index = -1; - DXILConstant value; - bit is_i8 = 0; - bit is_i32 = 0; -} -class ArgSelect : Arg { - let index = index_; -} -class ArgI32 : Arg { - let value = value_; - let is_i32 = 1; -} -class ArgI8 : Arg { - let value = value_; - let is_i8 = 1; -} - -class IntrinsicSelect args_> { - Intrinsic intrinsic = intrinsic_; - list args = args_; -} - // Abstraction DXIL Operation class DXILOp { // A short description of the operation @@ -345,9 +308,6 @@ class DXILOp { // LLVM Intrinsic DXIL Operation maps to Intrinsic LLVMIntrinsic = ?; - // Non-trivial LLVM Intrinsics DXIL Operation maps to - list intrinsic_selects = []; - // Result type of the op DXILOpParamType result; @@ -869,17 +829,3 @@ def WaveGetLaneIndex : DXILOp<111, waveGetLaneIndex> { let stages = [Stages]; let attributes = [Attributes]; } - -def Barrier : DXILOp<80, barrier> { - let Doc = "inserts a memory barrier in the shader"; - let intrinsic_selects = [ -IntrinsicSelect< -int_dx_group_memory_barrier_with_group_sync, -[ ArgI32 ]>, - ]; - - let arguments = [Int32Ty]; - let result = VoidTy; - let stages = [Stages]; - let attributes = [Attributes]; -} diff --git a/llvm/lib/Target/DirectX/DXILOpLowering.cpp b/llvm/lib/Target/DirectX/DXILOpLowering.cpp index b5cf1654181c6c..8acc9c1efa08c0 100644 --- a/llvm/lib/Target/DirectX/DXILOpLowering.cpp +++ b/llvm/lib/Target/DirectX/DXILOpLowering.cpp @@ -106,43 +106,17 @@ class OpLowerer { return false; } - struct ArgSelect { -enum class Type { - Index, - I8, - I32, -}; -Type Type = Type::Index; -int Value = -1; - }; - - [[nodiscard]] bool replaceFunctionWithOp(Function &F, dxil::OpCode DXILOp, - ArrayRef ArgSelects) { + [[nodiscard]] + bool replaceFunctionWithOp(Function &F, dxil::OpCode DXILOp) { bool IsVectorArgExpansion = isVectorArgExpansion(F); return replaceFunction(F, [&](CallInst *CI) -> Error { - 
OpBuilder.getIRB().SetInsertPoint(CI); SmallVector Args; - if (ArgSelects.size()) { -for (const ArgSelect &A : ArgSelects) { - switch (A.Type) { - case ArgSelect::Type::Index: -Args.push_back(CI->getArgOperand(A.Value)); -break; - case ArgSelect::Type::I8: -Args.push_back(OpBuilder.getIRB().getInt8((uint8_t)A.Value)); -break; - case ArgSelect::Type::I32: -Args.push_back(OpBuilder.getIRB().getInt32(A.Value)); -break; - default: -llvm_unreachable("Invalid type of intrinsic arg select."); - } -} -
[llvm-branch-commits] [clang] [llvm] [LLVM] [Clang] Backport "Support for Gentoo `*t64` triples (64-bit time_t ABIs)" (PR #112364)
@@ -294,7 +294,11 @@ class Triple {
 PAuthTest,
-LastEnvironmentType = PAuthTest
+GNUT64,
+GNUEABIT64,
+GNUEABIHFT64,
+
+LastEnvironmentType = GNUEABIHFT64

andrewrk wrote:

Well, that used to be the case anyway. @alexrp pointed out in https://github.com/ziglang/zig/pull/21862 that since we now generate LLVM bitcode rather than using the LLVM IRBuilder API, we can delete these strict checks. So, this backport is not relevant to the zig project after all.

https://github.com/llvm/llvm-project/pull/112364
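The "strict checks" mentioned in this comment are not shown in the thread. One common form, assumed here purely for illustration, is a downstream copy of an upstream enum whose `Last` sentinel is compared against the upstream one. The sketch below uses hypothetical `Toy*` names and a shortened enumerator list; it only shows why appending enumerators and moving the sentinel, as the quoted hunk does, would trip such a check.

```cpp
#include <cassert>

// Hypothetical, shortened copy of an upstream enum after the backport:
// three enumerators were appended and the sentinel moved with them.
enum class ToyUpstreamEnv {
  GNU,
  PAuthTest,
  GNUT64,
  GNUEABIT64,
  GNUEABIHFT64,
  Last = GNUEABIHFT64
};

// Hypothetical downstream mirror that was written against the older enum.
enum class ToyDownstreamEnv { GNU, PAuthTest, Last = PAuthTest };

// A strict check of the assumed kind: the mirror is considered in sync only
// when both sentinels have the same numeric value.
constexpr bool toyMirrorsUpstream() {
  return static_cast<int>(ToyDownstreamEnv::Last) ==
         static_cast<int>(ToyUpstreamEnv::Last);
}
```

In this toy setup `toyMirrorsUpstream()` evaluates to `false` until the downstream mirror is updated, which is the kind of breakage a point-release enum change could cause for a project that kept such a check.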
[llvm-branch-commits] [clang] [llvm] [LLVM] [Clang] Backport "Support for Gentoo `*t64` triples (64-bit time_t ABIs)" (PR #112364)
https://github.com/andrewrk edited https://github.com/llvm/llvm-project/pull/112364
[llvm-branch-commits] [llvm] [CodeGen][NewPM] Port RegUsageInfoCollector pass to NPM (PR #113874)
https://github.com/optimisan updated https://github.com/llvm/llvm-project/pull/113874 >From a95b69c07c7804d2e2a10b939a178a191643a41c Mon Sep 17 00:00:00 2001 From: Akshat Oke Date: Mon, 28 Oct 2024 06:22:49 + Subject: [PATCH 1/4] [CodeGen][NewPM] Port RegUsageInfoCollector pass to NPM --- .../llvm/CodeGen/RegUsageInfoCollector.h | 25 llvm/include/llvm/InitializePasses.h | 2 +- llvm/include/llvm/Passes/CodeGenPassBuilder.h | 1 + .../llvm/Passes/MachinePassRegistry.def | 2 +- llvm/lib/CodeGen/CodeGen.cpp | 2 +- llvm/lib/CodeGen/RegUsageInfoCollector.cpp| 60 +-- llvm/lib/Passes/PassBuilder.cpp | 1 + llvm/test/CodeGen/AMDGPU/ipra-regmask.ll | 5 ++ 8 files changed, 76 insertions(+), 22 deletions(-) create mode 100644 llvm/include/llvm/CodeGen/RegUsageInfoCollector.h diff --git a/llvm/include/llvm/CodeGen/RegUsageInfoCollector.h b/llvm/include/llvm/CodeGen/RegUsageInfoCollector.h new file mode 100644 index 00..6b88cc4f99089e --- /dev/null +++ b/llvm/include/llvm/CodeGen/RegUsageInfoCollector.h @@ -0,0 +1,25 @@ +//===- llvm/CodeGen/RegUsageInfoCollector.h -*- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. 
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===--===// + +#ifndef LLVM_CODEGEN_REGUSAGEINFOCOLLECTOR_H +#define LLVM_CODEGEN_REGUSAGEINFOCOLLECTOR_H + +#include "llvm/CodeGen/MachinePassManager.h" + +namespace llvm { + +class RegUsageInfoCollectorPass +: public AnalysisInfoMixin { +public: + PreservedAnalyses run(MachineFunction &MF, +MachineFunctionAnalysisManager &MFAM); +}; + +} // namespace llvm + +#endif // LLVM_CODEGEN_REGUSAGEINFOCOLLECTOR_H diff --git a/llvm/include/llvm/InitializePasses.h b/llvm/include/llvm/InitializePasses.h index edc237f2819818..44b7ba830bb329 100644 --- a/llvm/include/llvm/InitializePasses.h +++ b/llvm/include/llvm/InitializePasses.h @@ -257,7 +257,7 @@ void initializeRegAllocPriorityAdvisorAnalysisPass(PassRegistry &); void initializeRegAllocScoringPass(PassRegistry &); void initializeRegBankSelectPass(PassRegistry &); void initializeRegToMemWrapperPassPass(PassRegistry &); -void initializeRegUsageInfoCollectorPass(PassRegistry &); +void initializeRegUsageInfoCollectorLegacyPass(PassRegistry &); void initializeRegUsageInfoPropagationPass(PassRegistry &); void initializeRegionInfoPassPass(PassRegistry &); void initializeRegionOnlyPrinterPass(PassRegistry &); diff --git a/llvm/include/llvm/Passes/CodeGenPassBuilder.h b/llvm/include/llvm/Passes/CodeGenPassBuilder.h index 8cbc9f71ab26d0..066cd70ec8b996 100644 --- a/llvm/include/llvm/Passes/CodeGenPassBuilder.h +++ b/llvm/include/llvm/Passes/CodeGenPassBuilder.h @@ -53,6 +53,7 @@ #include "llvm/CodeGen/PHIElimination.h" #include "llvm/CodeGen/PreISelIntrinsicLowering.h" #include "llvm/CodeGen/RegAllocFast.h" +#include "llvm/CodeGen/RegUsageInfoCollector.h" #include "llvm/CodeGen/RegisterUsageInfo.h" #include "llvm/CodeGen/ReplaceWithVeclib.h" #include "llvm/CodeGen/SafeStack.h" diff --git a/llvm/include/llvm/Passes/MachinePassRegistry.def b/llvm/include/llvm/Passes/MachinePassRegistry.def index 7db28cb0092525..0ee4794034e98b 100644 --- 
a/llvm/include/llvm/Passes/MachinePassRegistry.def +++ b/llvm/include/llvm/Passes/MachinePassRegistry.def @@ -156,6 +156,7 @@ MACHINE_FUNCTION_PASS("print", MachinePostDominatorTreePrinterPass(dbgs())) MACHINE_FUNCTION_PASS("print", SlotIndexesPrinterPass(dbgs())) MACHINE_FUNCTION_PASS("print", VirtRegMapPrinterPass(dbgs())) +MACHINE_FUNCTION_PASS("reg-usage-collector", RegUsageInfoCollectorPass()) MACHINE_FUNCTION_PASS("require-all-machine-function-properties", RequireAllMachineFunctionPropertiesPass()) MACHINE_FUNCTION_PASS("stack-coloring", StackColoringPass()) @@ -250,7 +251,6 @@ DUMMY_MACHINE_FUNCTION_PASS("prologepilog-code", PrologEpilogCodeInserterPass) DUMMY_MACHINE_FUNCTION_PASS("ra-basic", RABasicPass) DUMMY_MACHINE_FUNCTION_PASS("ra-greedy", RAGreedyPass) DUMMY_MACHINE_FUNCTION_PASS("ra-pbqp", RAPBQPPass) -DUMMY_MACHINE_FUNCTION_PASS("reg-usage-collector", RegUsageInfoCollectorPass) DUMMY_MACHINE_FUNCTION_PASS("reg-usage-propagation", RegUsageInfoPropagationPass) DUMMY_MACHINE_FUNCTION_PASS("regalloc", RegAllocPass) DUMMY_MACHINE_FUNCTION_PASS("regallocscoringpass", RegAllocScoringPass) diff --git a/llvm/lib/CodeGen/CodeGen.cpp b/llvm/lib/CodeGen/CodeGen.cpp index 39fba1d0b527ef..e7e8a121369b75 100644 --- a/llvm/lib/CodeGen/CodeGen.cpp +++ b/llvm/lib/CodeGen/CodeGen.cpp @@ -113,7 +113,7 @@ void llvm::initializeCodeGen(PassRegistry &Registry) { initializeRABasicPass(Registry); initializeRAGreedyPass(Registry); initializeRegAllocFastPass(Reg
[llvm-branch-commits] [llvm] [CodeGen][NewPM] Port RegUsageInfoPropagation pass to NPM (PR #114010)
https://github.com/optimisan updated https://github.com/llvm/llvm-project/pull/114010 >From 3370e24f9e9ec16b6404d7bcf3d72361c46934de Mon Sep 17 00:00:00 2001 From: Akshat Oke Date: Tue, 29 Oct 2024 07:14:30 + Subject: [PATCH 1/2] [CodeGen][NewPM] Port RegUsageInfoPropagation pass to NPM --- .../llvm/CodeGen/RegUsageInfoPropagate.h | 25 +++ llvm/include/llvm/InitializePasses.h | 2 +- llvm/include/llvm/Passes/CodeGenPassBuilder.h | 1 + .../llvm/Passes/MachinePassRegistry.def | 2 +- llvm/lib/CodeGen/CodeGen.cpp | 2 +- llvm/lib/CodeGen/RegUsageInfoPropagate.cpp| 75 +-- llvm/lib/Passes/PassBuilder.cpp | 1 + llvm/test/CodeGen/AArch64/preserve.ll | 4 + 8 files changed, 86 insertions(+), 26 deletions(-) create mode 100644 llvm/include/llvm/CodeGen/RegUsageInfoPropagate.h diff --git a/llvm/include/llvm/CodeGen/RegUsageInfoPropagate.h b/llvm/include/llvm/CodeGen/RegUsageInfoPropagate.h new file mode 100644 index 00..73624015e37d9d --- /dev/null +++ b/llvm/include/llvm/CodeGen/RegUsageInfoPropagate.h @@ -0,0 +1,25 @@ +//===- llvm/CodeGen/RegUsageInfoPropagate.h -*- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. 
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===--===// + +#ifndef LLVM_CODEGEN_REGUSAGEINFOPROPAGATE_H +#define LLVM_CODEGEN_REGUSAGEINFOPROPAGATE_H + +#include "llvm/CodeGen/MachinePassManager.h" + +namespace llvm { + +class RegUsageInfoPropagationPass +: public PassInfoMixin { +public: + PreservedAnalyses run(MachineFunction &MF, +MachineFunctionAnalysisManager &MFAM); +}; + +} // namespace llvm + +#endif // LLVM_CODEGEN_REGUSAGEINFOPROPAGATE_H diff --git a/llvm/include/llvm/InitializePasses.h b/llvm/include/llvm/InitializePasses.h index 44b7ba830bb329..bc209a4e939415 100644 --- a/llvm/include/llvm/InitializePasses.h +++ b/llvm/include/llvm/InitializePasses.h @@ -258,7 +258,7 @@ void initializeRegAllocScoringPass(PassRegistry &); void initializeRegBankSelectPass(PassRegistry &); void initializeRegToMemWrapperPassPass(PassRegistry &); void initializeRegUsageInfoCollectorLegacyPass(PassRegistry &); -void initializeRegUsageInfoPropagationPass(PassRegistry &); +void initializeRegUsageInfoPropagationLegacyPass(PassRegistry &); void initializeRegionInfoPassPass(PassRegistry &); void initializeRegionOnlyPrinterPass(PassRegistry &); void initializeRegionOnlyViewerPass(PassRegistry &); diff --git a/llvm/include/llvm/Passes/CodeGenPassBuilder.h b/llvm/include/llvm/Passes/CodeGenPassBuilder.h index 066cd70ec8b996..9f41cc41a7c926 100644 --- a/llvm/include/llvm/Passes/CodeGenPassBuilder.h +++ b/llvm/include/llvm/Passes/CodeGenPassBuilder.h @@ -54,6 +54,7 @@ #include "llvm/CodeGen/PreISelIntrinsicLowering.h" #include "llvm/CodeGen/RegAllocFast.h" #include "llvm/CodeGen/RegUsageInfoCollector.h" +#include "llvm/CodeGen/RegUsageInfoPropagate.h" #include "llvm/CodeGen/RegisterUsageInfo.h" #include "llvm/CodeGen/ReplaceWithVeclib.h" #include "llvm/CodeGen/SafeStack.h" diff --git a/llvm/include/llvm/Passes/MachinePassRegistry.def b/llvm/include/llvm/Passes/MachinePassRegistry.def index 0ee4794034e98b..6327ab1abd48e9 100644 --- 
a/llvm/include/llvm/Passes/MachinePassRegistry.def +++ b/llvm/include/llvm/Passes/MachinePassRegistry.def @@ -157,6 +157,7 @@ MACHINE_FUNCTION_PASS("print", MACHINE_FUNCTION_PASS("print", SlotIndexesPrinterPass(dbgs())) MACHINE_FUNCTION_PASS("print", VirtRegMapPrinterPass(dbgs())) MACHINE_FUNCTION_PASS("reg-usage-collector", RegUsageInfoCollectorPass()) +MACHINE_FUNCTION_PASS("reg-usage-propagation", RegUsageInfoPropagationPass()) MACHINE_FUNCTION_PASS("require-all-machine-function-properties", RequireAllMachineFunctionPropertiesPass()) MACHINE_FUNCTION_PASS("stack-coloring", StackColoringPass()) @@ -251,7 +252,6 @@ DUMMY_MACHINE_FUNCTION_PASS("prologepilog-code", PrologEpilogCodeInserterPass) DUMMY_MACHINE_FUNCTION_PASS("ra-basic", RABasicPass) DUMMY_MACHINE_FUNCTION_PASS("ra-greedy", RAGreedyPass) DUMMY_MACHINE_FUNCTION_PASS("ra-pbqp", RAPBQPPass) -DUMMY_MACHINE_FUNCTION_PASS("reg-usage-propagation", RegUsageInfoPropagationPass) DUMMY_MACHINE_FUNCTION_PASS("regalloc", RegAllocPass) DUMMY_MACHINE_FUNCTION_PASS("regallocscoringpass", RegAllocScoringPass) DUMMY_MACHINE_FUNCTION_PASS("regbankselect", RegBankSelectPass) diff --git a/llvm/lib/CodeGen/CodeGen.cpp b/llvm/lib/CodeGen/CodeGen.cpp index e7e8a121369b75..013a9b3c9c4ffa 100644 --- a/llvm/lib/CodeGen/CodeGen.cpp +++ b/llvm/lib/CodeGen/CodeGen.cpp @@ -114,7 +114,7 @@ void llvm::initializeCodeGen(PassRegistry &Registry) { initializeRAGreedyPass(Registry); initializeRegAllocFastPass(Registry); initializeRegUsageInfoCollector
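Both ports above add a header with the same shape: a small pass class that derives from a mixin parameterized on the class itself and exposes a `run` method returning a preserved-analyses value. A minimal, self-contained sketch of that curiously recurring template pattern follows. It does not use LLVM headers; every `Toy*` name is hypothetical, and the way the toy mixin obtains the pass name is an assumption of this sketch rather than a description of how LLVM's mixin works.

```cpp
#include <cassert>
#include <map>
#include <string>

// Toy CRTP mixin: the base class reaches into the derived class at compile
// time, so no virtual functions are needed.
template <typename DerivedT> struct ToyPassInfoMixin {
  static std::string name() { return DerivedT::passName(); }
};

// Toy stand-ins for the result and input types of a machine-function pass.
struct ToyPreservedAnalyses {
  bool PreservesAll;
};

struct ToyMachineFunction {
  std::string Name;
  unsigned NumClobberedRegs;
};

using ToyUsageTable = std::map<std::string, unsigned>;

// Toy pass in the shape of the ported classes: derive from the mixin with
// the class itself as the template argument and implement run().
class ToyRegUsageCollectorPass
    : public ToyPassInfoMixin<ToyRegUsageCollectorPass> {
public:
  static const char *passName() { return "reg-usage-collector"; }

  // Records per-function register usage without modifying the function,
  // so it reports that every analysis is preserved.
  ToyPreservedAnalyses run(ToyMachineFunction &MF, ToyUsageTable &Table) {
    Table[MF.Name] = MF.NumClobberedRegs;
    return ToyPreservedAnalyses{true};
  }
};
```

The string used for the toy pass name mirrors the `"reg-usage-collector"` entry that the first patch moves from the dummy list to `MACHINE_FUNCTION_PASS` in `MachinePassRegistry.def`.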
[llvm-branch-commits] [llvm] [DataLayout][LangRef] Split non-integral and unstable pointer properties (PR #105735)
davidchisnall wrote:

I’m concerned about the semantics of unstable. This sounds like it would impact optimisation of memcmp, for example (is it still allowable to optimise away self comparisons?). I wouldn’t want that added to the LangRef without some clearer description of what optimisers *can* assume. That said, given that it’s a property that’s already present for NI pointers for GC, I suppose we’re stuck with it for now.

Note that CHERI LLVM’s use of ptrtoint is largely historical. We worked with the folks doing NI pointers and my plan was to move to them eventually. We didn’t initially have the address-get intrinsic and so we abused ptrtoint for that, but I believe we’ve fixed the places in the front end that do this. When we upstream, we should get rid of that entirely. This also improves optimisation because optimisers treat address-get as non-escaping whereas ptrtoint assumes the pointer may be materialised anywhere else and makes alias analysis conservative.

https://github.com/llvm/llvm-project/pull/105735
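The memcmp question in this comment can be made concrete with a small sketch. The function below compares the object representation of a pointer with itself; on conventional targets the result is always true, so an optimizer may fold the call away. The open question raised above is whether that fold stays valid if a pointer's bit representation is allowed to change between two reads. The function name is hypothetical and the example says nothing about how LLVM actually treats this case.

```cpp
#include <cassert>
#include <cstring>

// Compares the bytes of a pointer value against themselves. With a stable
// pointer representation this is trivially true and can be constant-folded.
bool toyPointerBytesEqualSelf(const int *P) {
  return std::memcmp(&P, &P, sizeof(P)) == 0;
}
```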