[llvm-branch-commits] [llvm] [LoongArch] Avoid scheduling relaxable code sequence and attach relax relocs (PR #121330)
@@ -443,6 +443,89 @@ bool LoongArchInstrInfo::isSchedulingBoundary(const MachineInstr &MI,
     break;
   }
 
+  const auto &STI = MF.getSubtarget();
+  if (STI.hasFeature(LoongArch::FeatureRelax)) {
+    // When linker relaxation enabled, the following instruction patterns are
+    // prohibited from being reordered:
+    //
+    // * pcalau12i $a0, %pc_hi20(s)
+    //   addi.w/d $a0, $a0, %pc_lo12(s)
+    //
+    // * pcalau12i $a0, %got_pc_hi20(s)
+    //   ld.w/d $a0, $a0, %got_pc_lo12(s)
+    //
+    // * pcalau12i $a0, %ie_pc_hi20(s)
+    //   ld.w/d $a0, $a0, %ie_pc_lo12(s)

ylzsx wrote:

I think TLS IE can be scheduled.

https://github.com/llvm/llvm-project/pull/121330
_______________________________________________
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [LoongArch] Avoid scheduling relaxable code sequence and attach relax relocs (PR #121330)
https://github.com/zhaoqi5 updated https://github.com/llvm/llvm-project/pull/121330 >From 85be5541a23a859ad8e50bd75fb7ff35985c5988 Mon Sep 17 00:00:00 2001 From: Qi Zhao Date: Tue, 24 Dec 2024 11:03:23 +0800 Subject: [PATCH 1/2] [LoongArch] Avoid scheduling relaxable code sequence and attach relax relocs If linker relaxation enabled, relaxable code sequence expanded from pseudos should avoid being separated by instruction scheduling. This commit tags scheduling boundary for them to avoid being scheduled. (Except for `tls_le` and `call36/tail36`. Because `tls_le` can be scheduled and have no influence to relax, `call36/tail36` are expanded later in `LoongArchExpandPseudo` pass.) A new mask target-flag is added to attach relax relocs to the relaxable code sequence. (No need to add it for `tls_le` and `call36/tail36` because of the reasons shown above.) Because of this, get "direct" flags is necessary when using their target-flags. In addition, code sequence after being optimized by `MergeBaseOffset` pass may not relaxable any more, so the relax "bitmask" flag should be removed. 
--- .../LoongArch/LoongArchExpandPseudoInsts.cpp | 34 -- .../Target/LoongArch/LoongArchInstrInfo.cpp | 99 - .../lib/Target/LoongArch/LoongArchInstrInfo.h | 3 + .../Target/LoongArch/LoongArchMCInstLower.cpp | 4 +- .../LoongArch/LoongArchMergeBaseOffset.cpp| 30 +- .../LoongArch/LoongArchTargetMachine.cpp | 1 + .../MCTargetDesc/LoongArchBaseInfo.h | 22 .../MCTargetDesc/LoongArchMCCodeEmitter.cpp | 1 + .../CodeGen/LoongArch/linker-relaxation.ll| 102 ++ .../test/CodeGen/LoongArch/mir-relax-flags.ll | 64 +++ .../CodeGen/LoongArch/mir-target-flags.ll | 31 +- 11 files changed, 370 insertions(+), 21 deletions(-) create mode 100644 llvm/test/CodeGen/LoongArch/linker-relaxation.ll create mode 100644 llvm/test/CodeGen/LoongArch/mir-relax-flags.ll diff --git a/llvm/lib/Target/LoongArch/LoongArchExpandPseudoInsts.cpp b/llvm/lib/Target/LoongArch/LoongArchExpandPseudoInsts.cpp index 0218934ea3344a..be60de3d63d061 100644 --- a/llvm/lib/Target/LoongArch/LoongArchExpandPseudoInsts.cpp +++ b/llvm/lib/Target/LoongArch/LoongArchExpandPseudoInsts.cpp @@ -187,18 +187,23 @@ bool LoongArchPreRAExpandPseudo::expandPcalau12iInstPair( MachineInstr &MI = *MBBI; DebugLoc DL = MI.getDebugLoc(); + const auto &STI = MF->getSubtarget(); + bool EnableRelax = STI.hasFeature(LoongArch::FeatureRelax); + Register DestReg = MI.getOperand(0).getReg(); Register ScratchReg = MF->getRegInfo().createVirtualRegister(&LoongArch::GPRRegClass); MachineOperand &Symbol = MI.getOperand(1); BuildMI(MBB, MBBI, DL, TII->get(LoongArch::PCALAU12I), ScratchReg) - .addDisp(Symbol, 0, FlagsHi); + .addDisp(Symbol, 0, + EnableRelax ? LoongArchII::addRelaxFlag(FlagsHi) : FlagsHi); MachineInstr *SecondMI = BuildMI(MBB, MBBI, DL, TII->get(SecondOpcode), DestReg) .addReg(ScratchReg) - .addDisp(Symbol, 0, FlagsLo); + .addDisp(Symbol, 0, + EnableRelax ? 
LoongArchII::addRelaxFlag(FlagsLo) : FlagsLo); if (MI.hasOneMemOperand()) SecondMI->addMemOperand(*MF, *MI.memoperands_begin()); @@ -481,6 +486,7 @@ bool LoongArchPreRAExpandPseudo::expandLoadAddressTLSDesc( unsigned ADD = STI.is64Bit() ? LoongArch::ADD_D : LoongArch::ADD_W; unsigned ADDI = STI.is64Bit() ? LoongArch::ADDI_D : LoongArch::ADDI_W; unsigned LD = STI.is64Bit() ? LoongArch::LD_D : LoongArch::LD_W; + bool EnableRelax = STI.hasFeature(LoongArch::FeatureRelax); Register DestReg = MI.getOperand(0).getReg(); Register Tmp1Reg = @@ -488,7 +494,10 @@ bool LoongArchPreRAExpandPseudo::expandLoadAddressTLSDesc( MachineOperand &Symbol = MI.getOperand(Large ? 2 : 1); BuildMI(MBB, MBBI, DL, TII->get(LoongArch::PCALAU12I), Tmp1Reg) - .addDisp(Symbol, 0, LoongArchII::MO_DESC_PC_HI); + .addDisp(Symbol, 0, + (EnableRelax && !Large) + ? LoongArchII::addRelaxFlag(LoongArchII::MO_DESC_PC_HI) + : LoongArchII::MO_DESC_PC_HI); if (Large) { // Code Sequence: @@ -526,19 +535,28 @@ bool LoongArchPreRAExpandPseudo::expandLoadAddressTLSDesc( // pcalau12i $a0, %desc_pc_hi20(sym) // addi.w/d $a0, $a0, %desc_pc_lo12(sym) // ld.w/d$ra, $a0, %desc_ld(sym) -// jirl $ra, $ra, %desc_ld(sym) -// add.d $dst, $a0, $tp +// jirl $ra, $ra, %desc_call(sym) +// add.w/d $dst, $a0, $tp BuildMI(MBB, MBBI, DL, TII->get(ADDI), LoongArch::R4) .addReg(Tmp1Reg) -.addDisp(Symbol, 0, LoongArchII::MO_DESC_PC_LO); +.addDisp(Symbol, 0, + EnableRelax + ? LoongArchII::addRelaxFlag(LoongArchII::MO_DESC_PC_LO) + : LoongArchII::MO_DESC_PC_LO); } BuildMI(MBB, MBBI, DL, TII->get(LD), LoongArch::R1) .addReg(LoongArch::
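The commit message above describes a "bitmask" relax flag that is OR-ed onto an operand's existing target flag, so the original ("direct") flag has to be masked back out before it is compared. The following is a minimal standalone sketch of that idea, not the patch's code. Only the name `addRelaxFlag` appears in the quoted diff; the numeric values, the mask, and the helpers `getDirectFlags`, `removeRelaxFlag` and `hasRelaxFlag` are assumptions made for illustration and may not match `LoongArchBaseInfo.h`.

```cpp
#include <cstdint>

namespace sketch {
// Hypothetical flag values, chosen only for this illustration.
constexpr unsigned MO_None = 0;
constexpr unsigned MO_PCREL_HI = 1;
constexpr unsigned MO_PCREL_LO = 2;
// Low bits hold the "direct" flag; one higher bit marks "attach a relax reloc".
constexpr unsigned MO_DIRECT_FLAG_MASK = 0x3f;
constexpr unsigned MO_RELAX = 0x40;

// Tag an existing direct flag as relaxable without losing the direct flag.
constexpr unsigned addRelaxFlag(unsigned Flags) { return Flags | MO_RELAX; }

// Clear the relax bit, e.g. after an optimization makes the sequence
// non-relaxable.
constexpr unsigned removeRelaxFlag(unsigned Flags) { return Flags & ~MO_RELAX; }

// Recover the direct flag so it can be compared against MO_PCREL_HI etc.
constexpr unsigned getDirectFlags(unsigned Flags) {
  return Flags & MO_DIRECT_FLAG_MASK;
}

constexpr bool hasRelaxFlag(unsigned Flags) { return (Flags & MO_RELAX) != 0; }
} // namespace sketch
```

With this layout, code that previously compared an operand's target flags directly against `MO_PCREL_HI` would compare `getDirectFlags(Flags)` instead, which is presumably why the commit message says getting the direct flags becomes necessary.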
[llvm-branch-commits] [llvm] [LoongArch] Avoid scheduling relaxable code sequence and attach relax relocs (PR #121330)
@@ -443,6 +443,89 @@ bool LoongArchInstrInfo::isSchedulingBoundary(const MachineInstr &MI,
     break;
   }
 
+  const auto &STI = MF.getSubtarget();
+  if (STI.hasFeature(LoongArch::FeatureRelax)) {
+    // When linker relaxation enabled, the following instruction patterns are
+    // prohibited from being reordered:
+    //
+    // * pcalau12i $a0, %pc_hi20(s)
+    //   addi.w/d $a0, $a0, %pc_lo12(s)
+    //
+    // * pcalau12i $a0, %got_pc_hi20(s)
+    //   ld.w/d $a0, $a0, %got_pc_lo12(s)
+    //
+    // * pcalau12i $a0, %ie_pc_hi20(s)
+    //   ld.w/d $a0, $a0, %ie_pc_lo12(s)

zhaoqi5 wrote:

Great! It was my misunderstanding. Thanks.

https://github.com/llvm/llvm-project/pull/121330
[llvm-branch-commits] [llvm] [LoongArch] Avoid scheduling relaxable code sequence and attach relax relocs (PR #121330)
@@ -187,18 +187,23 @@ bool LoongArchPreRAExpandPseudo::expandPcalau12iInstPair(
   MachineInstr &MI = *MBBI;
   DebugLoc DL = MI.getDebugLoc();
 
+  const auto &STI = MF->getSubtarget();
+  bool EnableRelax = STI.hasFeature(LoongArch::FeatureRelax);
+
   Register DestReg = MI.getOperand(0).getReg();
   Register ScratchReg =
       MF->getRegInfo().createVirtualRegister(&LoongArch::GPRRegClass);
   MachineOperand &Symbol = MI.getOperand(1);
 
   BuildMI(MBB, MBBI, DL, TII->get(LoongArch::PCALAU12I), ScratchReg)
-      .addDisp(Symbol, 0, FlagsHi);
+      .addDisp(Symbol, 0,
+               EnableRelax ? LoongArchII::addRelaxFlag(FlagsHi) : FlagsHi);

zhaoqi5 wrote:

This is indeed better. Thanks.

https://github.com/llvm/llvm-project/pull/121330
[llvm-branch-commits] [llvm] AMDGPU: Reduce 64-bit add width if low bits are known 0 (PR #122049)
https://github.com/cdevadas approved this pull request.

https://github.com/llvm/llvm-project/pull/122049
[llvm-branch-commits] [llvm] [LoongArch] Avoid scheduling relaxable code sequence and attach relax relocs (PR #121330)
https://github.com/zhaoqi5 edited

https://github.com/llvm/llvm-project/pull/121330
[llvm-branch-commits] [flang] [Flang] Introduce FortranSupport (PR #122069)
github-actions[bot] wrote: :warning: C/C++ code formatter, clang-format found issues in your code. :warning: You can test this locally with the following command: ``bash git-clang-format --diff a08aa48fb4955f9d16c6172580505c100076b5d4 2023940bffc9c717e44266134b4f63f04f65f762 --extensions h,cpp -- flang/include/flang/Common/fast-int-set.h flang/include/flang/Evaluate/call.h flang/include/flang/Evaluate/characteristics.h flang/include/flang/Evaluate/common.h flang/include/flang/Evaluate/constant.h flang/include/flang/Evaluate/expression.h flang/include/flang/Evaluate/formatting.h flang/include/flang/Evaluate/intrinsics.h flang/include/flang/Evaluate/shape.h flang/include/flang/Evaluate/target.h flang/include/flang/Evaluate/tools.h flang/include/flang/Evaluate/traverse.h flang/include/flang/Evaluate/type.h flang/include/flang/Evaluate/variable.h flang/include/flang/Frontend/CompilerInvocation.h flang/include/flang/Frontend/FrontendOptions.h flang/include/flang/ISO_Fortran_binding.h flang/include/flang/Lower/AbstractConverter.h flang/include/flang/Lower/Bridge.h flang/include/flang/Lower/CallInterface.h flang/include/flang/Lower/ConvertType.h flang/include/flang/Lower/LoweringOptions.h flang/include/flang/Lower/PFTBuilder.h flang/include/flang/Lower/Support/Utils.h flang/include/flang/Lower/SymbolMap.h flang/include/flang/Optimizer/Builder/FIRBuilder.h flang/include/flang/Optimizer/Builder/PPCIntrinsicCall.h flang/include/flang/Optimizer/Builder/Runtime/RTBuilder.h flang/include/flang/Optimizer/CodeGen/DescriptorModel.h flang/include/flang/Optimizer/Dialect/CUF/Attributes/CUFAttr.h flang/include/flang/Optimizer/Support/TypeCode.h flang/include/flang/Optimizer/Support/Utils.h flang/include/flang/Parser/char-block.h flang/include/flang/Parser/dump-parse-tree.h flang/include/flang/Parser/message.h flang/include/flang/Parser/parse-state.h flang/include/flang/Parser/parse-tree.h flang/include/flang/Parser/parsing.h flang/include/flang/Parser/provenance.h 
flang/include/flang/Parser/source.h flang/include/flang/Parser/user-state.h flang/include/flang/Runtime/allocatable.h flang/include/flang/Runtime/descriptor-consts.h flang/include/flang/Runtime/descriptor.h flang/include/flang/Runtime/io-api.h flang/include/flang/Runtime/pointer.h flang/include/flang/Runtime/random.h flang/include/flang/Runtime/support.h flang/include/flang/Runtime/type-code.h flang/include/flang/Semantics/expression.h flang/include/flang/Semantics/runtime-type-info.h flang/include/flang/Semantics/scope.h flang/include/flang/Semantics/semantics.h flang/include/flang/Semantics/symbol.h flang/include/flang/Semantics/tools.h flang/include/flang/Semantics/type.h flang/include/flang/Tools/CrossToolHelpers.h flang/lib/Evaluate/call.cpp flang/lib/Evaluate/characteristics.cpp flang/lib/Evaluate/fold-implementation.h flang/lib/Evaluate/formatting.cpp flang/lib/Evaluate/intrinsics-library.cpp flang/lib/Evaluate/intrinsics.cpp flang/lib/Evaluate/real.cpp flang/lib/Evaluate/shape.cpp flang/lib/Evaluate/target.cpp flang/lib/Frontend/CompilerInstance.cpp flang/lib/Frontend/CompilerInvocation.cpp flang/lib/Frontend/FrontendActions.cpp flang/lib/Lower/Bridge.cpp flang/lib/Lower/CallInterface.cpp flang/lib/Lower/ConvertExpr.cpp flang/lib/Lower/Mangler.cpp flang/lib/Optimizer/Builder/IntrinsicCall.cpp flang/lib/Optimizer/CodeGen/TypeConverter.cpp flang/lib/Optimizer/Dialect/FIRType.cpp flang/lib/Optimizer/Transforms/AddDebugInfo.cpp flang/lib/Optimizer/Transforms/AssumedRankOpConversion.cpp flang/lib/Optimizer/Transforms/CUFDeviceGlobal.cpp flang/lib/Optimizer/Transforms/CUFGPUToLLVMConversion.cpp flang/lib/Optimizer/Transforms/CUFOpConversion.cpp flang/lib/Optimizer/Transforms/ExternalNameConversion.cpp flang/lib/Optimizer/Transforms/LoopVersioning.cpp flang/lib/Optimizer/Transforms/SimplifyIntrinsics.cpp flang/lib/Optimizer/Transforms/StackReclaim.cpp flang/lib/Optimizer/Transforms/VScaleAttr.cpp flang/lib/Parser/basic-parsers.h flang/lib/Parser/parse-tree.cpp 
flang/lib/Parser/prescan.h flang/lib/Parser/unparse.cpp flang/lib/Semantics/assignment.h flang/lib/Semantics/check-case.cpp flang/lib/Semantics/check-coarray.cpp flang/lib/Semantics/check-cuda.cpp flang/lib/Semantics/check-data.h flang/lib/Semantics/check-do-forall.cpp flang/lib/Semantics/check-return.cpp flang/lib/Semantics/check-select-rank.cpp flang/lib/Semantics/check-select-type.cpp flang/lib/Semantics/check-stop.cpp flang/lib/Semantics/data-to-inits.h flang/lib/Semantics/expression.cpp flang/lib/Semantics/pointer-assignment.cpp flang/lib/Semantics/resolve-labels.cpp flang/lib/Semantics/resolve-names-utils.cpp flang/lib/Semantics/resolve-names.cpp flang/lib/Semantics/rewrite-parse-tree.cpp flang/lib/Semantics/semantics.cpp flang/lib/Semantics/tools.cpp flang/runtime/CUDA/allocator.cpp flang/runtime/ISO_Fortran_binding.cpp flang/runtime/ISO_Fortran_util.h flang/runtime/allocatable.cpp flang/runtime/stat.h
[llvm-branch-commits] [llvm] AMDGPU: Reduce 64-bit add width if low bits are known 0 (PR #122049)
arsenm wrote:

> Why doesn't this fall out naturally from splitting the 64-bit add into 32-bit
> parts and then simplifying each part? Do we leave it as a 64-bit add all the
> way until final instruction selection?

Yes. It gets selected to pseudos which are split in the post-isel hook (I don't remember why these were moved from just split during the actual selection).

https://github.com/llvm/llvm-project/pull/122049
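For readers unfamiliar with the transform named in the PR title, the arithmetic behind it can be sketched in plain C++. This is only an illustration of the identity, not the AMDGPU combine itself: when the low 32 bits of one operand are known to be zero, the low half of the sum is just the other operand's low half and no carry reaches the high half, so a single 32-bit add suffices. The function name `addLowBitsKnownZero` is hypothetical.

```cpp
#include <cstdint>

// Computes A + B (mod 2^64) under the precondition that the low 32 bits of B
// are zero. The caller is responsible for that precondition.
inline uint64_t addLowBitsKnownZero(uint64_t A, uint64_t B) {
  // low(A) + 0 never carries, so the low half passes through unchanged.
  uint32_t Lo = static_cast<uint32_t>(A);
  // Only the high halves need a real (32-bit, wrapping) add.
  uint32_t Hi = static_cast<uint32_t>(A >> 32) + static_cast<uint32_t>(B >> 32);
  return (static_cast<uint64_t>(Hi) << 32) | Lo;
}
```

A full 64-bit add split into halves would normally need an add plus an add-with-carry; under the stated precondition the carry is known to be zero, which is the saving the title refers to.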
[llvm-branch-commits] [llvm] [SelectionDAG][X86] Split <2 x T> vector types for atomic load (PR #120640)
@@ -1391,6 +1394,38 @@ void DAGTypeLegalizer::SplitVectorResult(SDNode *N, unsigned ResNo) {
   SetSplitVector(SDValue(N, ResNo), Lo, Hi);
 }
 
+void DAGTypeLegalizer::SplitVecRes_ATOMIC_LOAD(AtomicSDNode *LD, SDValue &Lo,
+                                               SDValue &Hi) {
+  EVT LoVT, HiVT;
+  SDLoc dl(LD);
+  std::tie(LoVT, HiVT) = DAG.GetSplitDestVTs(LD->getValueType(0));
+
+  SDValue Ch = LD->getChain();
+  SDValue Ptr = LD->getBasePtr();
+  EVT MemoryVT = LD->getMemoryVT();
+
+  EVT LoMemVT, HiMemVT;
+  std::tie(LoMemVT, HiMemVT) = DAG.GetSplitDestVTs(MemoryVT);
+
+  Lo = DAG.getAtomic(ISD::ATOMIC_LOAD, dl, LoMemVT, LoMemVT, Ch, Ptr,

arsenm wrote:

This should create one ATOMIC_LOAD with the bitcast integer type. You then unpack that result into the expected Lo/Hi, not the direct atomic results.

https://github.com/llvm/llvm-project/pull/120640
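The shape of the suggestion above (one atomic load of the whole vector as an integer, then unpacking into the two halves) can be illustrated outside SelectionDAG with `std::atomic`. This is a sketch of the idea only, not the legalizer code: the function name `loadV2I32` is hypothetical, and treating element 0 as the low 32 bits assumes a little-endian lane order.

```cpp
#include <atomic>
#include <cstdint>
#include <utility>

// Loads a conceptual <2 x i32> with a single 64-bit atomic load, then unpacks
// the two lanes from the loaded integer. Two separate 32-bit atomic loads
// would not be atomic as a pair, which is the problem with splitting.
inline std::pair<uint32_t, uint32_t>
loadV2I32(const std::atomic<uint64_t> &Src) {
  uint64_t Bits = Src.load(std::memory_order_acquire); // the only memory access
  uint32_t Lo = static_cast<uint32_t>(Bits);           // lane 0 (assumed low bits)
  uint32_t Hi = static_cast<uint32_t>(Bits >> 32);     // lane 1
  return {Lo, Hi};
}
```

The unpacking happens on an already loaded value, so it adds no further atomicity requirements.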
[llvm-branch-commits] [llvm] [SelectionDAG][X86] Split <2 x T> vector types for atomic load (PR #120640)
@@ -1391,6 +1394,38 @@ void DAGTypeLegalizer::SplitVectorResult(SDNode *N, unsigned ResNo) {
   SetSplitVector(SDValue(N, ResNo), Lo, Hi);
 }
 
+void DAGTypeLegalizer::SplitVecRes_ATOMIC_LOAD(AtomicSDNode *LD, SDValue &Lo,
+                                               SDValue &Hi) {
+  EVT LoVT, HiVT;
+  SDLoc dl(LD);
+  std::tie(LoVT, HiVT) = DAG.GetSplitDestVTs(LD->getValueType(0));
+
+  SDValue Ch = LD->getChain();
+  SDValue Ptr = LD->getBasePtr();
+  EVT MemoryVT = LD->getMemoryVT();
+
+  EVT LoMemVT, HiMemVT;
+  std::tie(LoMemVT, HiMemVT) = DAG.GetSplitDestVTs(MemoryVT);
+
+  Lo = DAG.getAtomic(ISD::ATOMIC_LOAD, dl, LoMemVT, LoMemVT, Ch, Ptr,

arsenm wrote:

The title should also not say "split"; this is forcing the type to a legal integer.

https://github.com/llvm/llvm-project/pull/120640
[llvm-branch-commits] [llvm] [SelectionDAG][X86] Split <2 x T> vector types for atomic load (PR #120640)
https://github.com/arsenm requested changes to this pull request.

https://github.com/llvm/llvm-project/pull/120640
[llvm-branch-commits] [llvm] [SelectionDAG][X86] Split <2 x T> vector types for atomic load (PR #120640)
@@ -1146,6 +1146,9 @@ void DAGTypeLegalizer::SplitVectorResult(SDNode *N, unsigned ResNo) {
     SplitVecRes_STEP_VECTOR(N, Lo, Hi);
     break;
   case ISD::SIGN_EXTEND_INREG: SplitVecRes_InregOp(N, Lo, Hi); break;
+  case ISD::ATOMIC_LOAD:
+    SplitVecRes_ATOMIC_LOAD(cast(N), Lo, Hi);

arsenm wrote:

I don't understand the comment. This is unrelated to the set of legal types or legalization actions. There is no need to touch any patterns.

https://github.com/llvm/llvm-project/pull/120640
[llvm-branch-commits] [clang] [flang] [lld] [llvm] [Flang] LLVM_ENABLE_RUNTIMES=flang-rt (PR #110217)
Meinersbur wrote:

> > It is an old problem, see [#87866
> > (comment)](https://github.com/llvm/llvm-project/pull/87866#issuecomment-2214034671)
>
> Can we raise an issue for this?

Created #122152.

I don't expect anything to come out of it; I think moving to `LLVM_ENABLE_PER_TARGET_RUNTIME_DIR` by default is deliberate.

https://github.com/llvm/llvm-project/pull/110217
[llvm-branch-commits] [flang] [llvm] [Flang][NFC] Move runtime library files to flang-rt. (PR #110298)
https://github.com/Meinersbur edited

https://github.com/llvm/llvm-project/pull/110298
[llvm-branch-commits] [clang] [flang] [lld] [llvm] [Flang] LLVM_ENABLE_RUNTIMES=flang-rt (PR #110217)
https://github.com/Meinersbur edited

https://github.com/llvm/llvm-project/pull/110217
[llvm-branch-commits] [clang] [flang] [lld] [llvm] [Flang] LLVM_ENABLE_RUNTIMES=flang-rt (PR #110217)
https://github.com/Meinersbur marked this pull request as ready for review.

https://github.com/llvm/llvm-project/pull/110217
[llvm-branch-commits] [llvm] [AsmPrinter][TargetLowering]Place a hot jump table into a hot-suffixed section (PR #122215)
https://github.com/mingmingl-llvm created https://github.com/llvm/llvm-project/pull/122215 None >From a2a6f9f5a6f7647f85a230241bf3aa39c4bd65d9 Mon Sep 17 00:00:00 2001 From: mingmingl Date: Wed, 8 Jan 2025 16:53:45 -0800 Subject: [PATCH] [AsmPrinter][TargetLowering]Place a hot jump table into a hot-suffixed section --- llvm/include/llvm/CodeGen/AsmPrinter.h| 8 +- .../CodeGen/TargetLoweringObjectFileImpl.h| 3 + .../llvm/Target/TargetLoweringObjectFile.h| 5 + llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp| 102 +- .../CodeGen/TargetLoweringObjectFileImpl.cpp | 28 +++-- llvm/lib/CodeGen/TargetPassConfig.cpp | 8 +- llvm/lib/Target/TargetLoweringObjectFile.cpp | 6 ++ llvm/test/CodeGen/X86/jump-table-partition.ll | 8 +- 8 files changed, 130 insertions(+), 38 deletions(-) diff --git a/llvm/include/llvm/CodeGen/AsmPrinter.h b/llvm/include/llvm/CodeGen/AsmPrinter.h index c9a88d7b1c015c..9249d5adf3f6f7 100644 --- a/llvm/include/llvm/CodeGen/AsmPrinter.h +++ b/llvm/include/llvm/CodeGen/AsmPrinter.h @@ -453,6 +453,10 @@ class AsmPrinter : public MachineFunctionPass { /// function to the current output stream. virtual void emitJumpTableInfo(); + virtual void emitJumpTables(const std::vector &JumpTableIndices, + MCSection *JumpTableSection, bool JTInDiffSection, + const MachineJumpTableInfo &MJTI); + /// Emit the specified global variable to the .s file. 
virtual void emitGlobalVariable(const GlobalVariable *GV); @@ -892,10 +896,10 @@ class AsmPrinter : public MachineFunctionPass { // Internal Implementation Details //===--===// - void emitJumpTableEntry(const MachineJumpTableInfo *MJTI, + void emitJumpTableEntry(const MachineJumpTableInfo &MJTI, const MachineBasicBlock *MBB, unsigned uid) const; - void emitJumpTableSizesSection(const MachineJumpTableInfo *MJTI, + void emitJumpTableSizesSection(const MachineJumpTableInfo &MJTI, const Function &F) const; void emitLLVMUsedList(const ConstantArray *InitList); diff --git a/llvm/include/llvm/CodeGen/TargetLoweringObjectFileImpl.h b/llvm/include/llvm/CodeGen/TargetLoweringObjectFileImpl.h index a2a9e5d499e527..3d48d380fcb245 100644 --- a/llvm/include/llvm/CodeGen/TargetLoweringObjectFileImpl.h +++ b/llvm/include/llvm/CodeGen/TargetLoweringObjectFileImpl.h @@ -74,6 +74,9 @@ class TargetLoweringObjectFileELF : public TargetLoweringObjectFile { MCSection *getSectionForJumpTable(const Function &F, const TargetMachine &TM) const override; + MCSection * + getSectionForJumpTable(const Function &F, const TargetMachine &TM, + const MachineJumpTableEntry *JTE) const override; MCSection *getSectionForLSDA(const Function &F, const MCSymbol &FnSym, const TargetMachine &TM) const override; diff --git a/llvm/include/llvm/Target/TargetLoweringObjectFile.h b/llvm/include/llvm/Target/TargetLoweringObjectFile.h index 4864ba843f4886..577adc458fcbf1 100644 --- a/llvm/include/llvm/Target/TargetLoweringObjectFile.h +++ b/llvm/include/llvm/Target/TargetLoweringObjectFile.h @@ -27,6 +27,7 @@ class Function; class GlobalObject; class GlobalValue; class MachineBasicBlock; +class MachineJumpTableEntry; class MachineModuleInfo; class Mangler; class MCContext; @@ -132,6 +133,10 @@ class TargetLoweringObjectFile : public MCObjectFileInfo { virtual MCSection *getSectionForJumpTable(const Function &F, const TargetMachine &TM) const; + virtual MCSection * + getSectionForJumpTable(const Function &F, const 
TargetMachine &TM, + const MachineJumpTableEntry *JTE) const; + virtual MCSection *getSectionForLSDA(const Function &, const MCSymbol &, const TargetMachine &) const { return LSDASection; diff --git a/llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp b/llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp index d34fe0e86c7495..b575cd7d993c39 100644 --- a/llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp +++ b/llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp @@ -168,6 +168,11 @@ static cl::opt BBAddrMapSkipEmitBBEntries( "unnecessary for some PGOAnalysisMap features."), cl::Hidden, cl::init(false)); +static cl::opt +EmitStaticDataHotnessSuffix("emit-static-data-hotness-suffix", cl::Hidden, +cl::init(false), cl::ZeroOrMore, +cl::desc("Emit static data hotness suffix")); + static cl::opt EmitJumpTableSizesSection( "emit-jump-table-sizes-section", cl::desc("Emit a section containing jump table addresses and sizes"), @@ -2861,7 +2866,6 @@ void AsmPrinter::emitConstantPool() { // Print assembly representations of the jump tabl
[llvm-branch-commits] [clang] Add documentation for Multilib custom flags (PR #114998)
https://github.com/vhscampos updated https://github.com/llvm/llvm-project/pull/114998 >From 9fcdd1760ea664a618a2c05a18e777940a9d49b6 Mon Sep 17 00:00:00 2001 From: Victor Campos Date: Tue, 5 Nov 2024 14:22:06 + Subject: [PATCH 1/4] Add documentation for Multilib custom flags --- clang/docs/Multilib.rst | 90 + 1 file changed, 90 insertions(+) diff --git a/clang/docs/Multilib.rst b/clang/docs/Multilib.rst index 7637d0db9565b8..85cb789b9847ac 100644 --- a/clang/docs/Multilib.rst +++ b/clang/docs/Multilib.rst @@ -122,6 +122,78 @@ subclass and a suitable base multilib variant is present then the It is the responsibility of layered multilib authors to ensure that headers and libraries in each layer are complete enough to mask any incompatibilities. +Multilib custom flags += + +Introduction + + +The multilib mechanism supports library variants that correspond to target, +code generation or language command-line flags. Examples include ``--target``, +``-mcpu``, ``-mfpu``, ``-mbranch-protection``, ``-fno-rtti``. However, some library +variants are particular to features that do not correspond to any command-line +option. Multithreading and semihosting, for instance, have no associated +compiler option. + +In order to support the selection of variants for which no compiler option +exists, the multilib specification includes the concept of *custom flags*. +These flags have no impact on code generation and are only used in the multilib +processing. + +Multilib custom flags follow this format in the driver invocation: + +:: + + -fmultilib-flag= + +They are fed into the multilib system alongside the remaining flags. + +Custom flag declarations + + +Custom flags can be declared in the YAML file under the *Flags* section. + +.. code-block:: yaml + + Flags: + - Name: multithreaded +Values: +- Name: no-multithreaded + DriverArgs: [-D__SINGLE_THREAD__] +- Name: multithreaded +Default: no-multithreaded + +* Name: the name to categorize a flag. 
+* Values: a list of flag *Value*s (defined below). +* Default: it specifies the name of the value this flag should take if not + specified in the command-line invocation. It must be one value from the Values + field. + +A Default value is useful to save users from specifying custom flags that have a +most commonly used value. + +Each flag *Value* is defined as: + +* Name: name of the value. This is the string to be used in + ``-fmultilib-flag=``. +* DriverArgs: a list of strings corresponding to the extra driver arguments + used to build a library variant that's in accordance to this specific custom + flag value. These arguments are fed back into the driver if this flag *Value* + is enabled. + +The namespace of flag values is common across all flags. This means that flag +value names must be unique. + +Usage of custom flags in the *Variants* specifications +-- + +Library variants should list their requirement on one or more custom flags like +they do for any other flag. Each requirement must be listed as +``-fmultilib-flag=``. + +A variant that does not specify a requirement on one particular flag can be +matched against any value of that flag. + Stability = @@ -222,6 +294,24 @@ For a more comprehensive example see # Flags is a list of one or more strings. Flags: [--target=thumbv7m-none-eabi] + # Custom flag declarations. Each item is a different declaration. + Flags: +# Name of the flag + - Name: multithreaded +# List of custom flag values +Values: + # Name of the custom flag value. To be used in -fmultilib-flag=. +- Name: no-multithreaded + # Extra driver arguments to be printed with -print-multi-lib. Useful for + # specifying extra arguments for building the the associated library + # variant(s). + DriverArgs: [-D__SINGLE_THREAD__] +- Name: multithreaded +# Default flag value. If no value for this flag declaration is used in the +# command-line, the multilib system will use this one. Must be equal to one +# of the flag value names from this flag declaration. 
+Default: no-multithreaded + Design principles = >From 5799eb81ac94ec4131af146bfacdf44a9bebdd71 Mon Sep 17 00:00:00 2001 From: Victor Campos Date: Mon, 25 Nov 2024 15:07:57 + Subject: [PATCH 2/4] Fix doc build warning --- clang/docs/Multilib.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/clang/docs/Multilib.rst b/clang/docs/Multilib.rst index 85cb789b9847ac..48d84087dda01c 100644 --- a/clang/docs/Multilib.rst +++ b/clang/docs/Multilib.rst @@ -164,7 +164,7 @@ Custom flags can be declared in the YAML file under the *Flags* section. Default: no-multithreaded * Name: the name to categorize a flag. -* Values: a list of flag *Value*s (defined below). +* Values: a list of flag Values (defined b
[llvm-branch-commits] [clang] Add documentation for Multilib custom flags (PR #114998)
@@ -122,6 +122,76 @@ subclass and a suitable base multilib variant is present then the
 It is the responsibility of layered multilib authors to ensure that headers and
 libraries in each layer are complete enough to mask any incompatibilities.
 
+Multilib custom flags
+=====================
+
+Introduction
+------------
+
+The multilib mechanism supports library variants that correspond to target,
+code generation or language command-line flags. Examples include ``--target``,
+``-mcpu``, ``-mfpu``, ``-mbranch-protection``, ``-fno-rtti``. However, some library
+variants are particular to features that do not correspond to any command-line
+option. Multithreading and semihosting, for instance, have no associated
+compiler option.
+
+In order to support the selection of variants for which no compiler option
+exists, the multilib specification includes the concept of *custom flags*.
+These flags have no impact on code generation and are only used in the multilib
+processing.
+
+Multilib custom flags follow this format in the driver invocation:
+
+::
+
+  -fmultilib-flag=
+
+They are fed into the multilib system alongside the remaining flags.
+
+Custom flag declarations
+------------------------
+
+Custom flags can be declared in the YAML file under the *Flags* section.
+
+.. code-block:: yaml
+
+  Flags:
+  - Name: multithreaded
+    Values:
+    - Name: no-multithreaded
+      MacroDefines: [__SINGLE_THREAD__]
+    - Name: multithreaded
+    Default: no-multithreaded
+
+* Name: the name to categorize a flag.
+* Values: a list of flag Values (defined below).
+* Default: it specifies the name of the value this flag should take if not
+  specified in the command-line invocation. It must be one value from the Values
+  field.
+
+A Default value is useful to save users from specifying custom flags that have a

vhscampos wrote:

Thanks. Fixed.

https://github.com/llvm/llvm-project/pull/114998
[llvm-branch-commits] [clang] Add documentation for Multilib custom flags (PR #114998)
https://github.com/lenary approved this pull request.

LGTM

https://github.com/llvm/llvm-project/pull/114998
[llvm-branch-commits] [clang] [flang] [lld] [llvm] [Flang] LLVM_ENABLE_RUNTIMES=flang-rt (PR #110217)
@@ -0,0 +1,232 @@ +#===-- CMakeLists.txt --===# +# +# Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +# See https://llvm.org/LICENSE.txt for license information. +# SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +# +#======# +# +# Build instructions for the flang-rt library. This is file is intended to be +# included using the LLVM_ENABLE_RUNTIMES mechanism. +# +#======# + +if (NOT LLVM_RUNTIMES_BUILD) + message(FATAL_ERROR "Use this CMakeLists.txt from LLVM's runtimes build system. + Example: +cmake /runtimes -DLLVM_ENABLE_RUNTIMES=flang-rt +") +endif () + +set(LLVM_SUBPROJECT_TITLE "Flang-RT") +set(FLANG_RT_SOURCE_DIR "${CMAKE_CURRENT_SOURCE_DIR}") +set(FLANG_RT_BINARY_DIR "${CMAKE_CURRENT_BINARY_DIR}") +set(FLANG_SOURCE_DIR "${CMAKE_CURRENT_SOURCE_DIR}/../flang") + + +# CMake 3.24 is the first version of CMake that directly recognizes Flang. +# LLVM's requirement is only CMake 3.20, teach CMake 3.20-3.23 how to use Flang. +if (CMAKE_VERSION VERSION_LESS "3.24") + cmake_path(GET CMAKE_Fortran_COMPILER STEM _Fortran_COMPILER_STEM) + if (_Fortran_COMPILER_STEM STREQUAL "flang-new" OR _Fortran_COMPILER_STEM STREQUAL "flang") +include(CMakeForceCompiler) +CMAKE_FORCE_Fortran_COMPILER("${CMAKE_Fortran_COMPILER}" "LLVMFlang") + +set(CMAKE_Fortran_COMPILER_ID "LLVMFlang") +set(CMAKE_Fortran_COMPILER_VERSION "${LLVM_VERSION_MAJOR}.${LLVM_VERSION_MINOR}") + +set(CMAKE_Fortran_SUBMODULE_SEP "-") +set(CMAKE_Fortran_SUBMODULE_EXT ".mod") + +set(CMAKE_Fortran_PREPROCESS_SOURCE + " -cpp-E > ") + +set(CMAKE_Fortran_FORMAT_FIXED_FLAG "-ffixed-form") +set(CMAKE_Fortran_FORMAT_FREE_FLAG "-ffree-form") + +set(CMAKE_Fortran_MODDIR_FLAG "-module-dir") + +set(CMAKE_Fortran_COMPILE_OPTIONS_PREPROCESS_ON "-cpp") +set(CMAKE_Fortran_COMPILE_OPTIONS_PREPROCESS_OFF "-nocpp") +set(CMAKE_Fortran_POSTPROCESS_FLAG "-ffixed-line-length-72") + +set(CMAKE_Fortran_COMPILE_OPTIONS_TARGET "--target=") + +set(CMAKE_Fortran_LINKER_WRAPPER_FLAG "-Wl,") 
+set(CMAKE_Fortran_LINKER_WRAPPER_FLAG_SEP ",") + endif () +endif () +enable_language(Fortran) + + +list(APPEND CMAKE_MODULE_PATH +"${FLANG_RT_SOURCE_DIR}/cmake/modules" +"${FLANG_SOURCE_DIR}/cmake/modules" + ) +include(AddFlangRT) +include(GetToolchainDirs) +include(FlangCommon) +include(HandleCompilerRT) +include(ExtendPath) +include(GNUInstallDirs) + + + +# Build Mode Introspection # + + +# Determine whether we are in the runtimes/runtimes-bins directory of a +# bootstrap build. +set(LLVM_TREE_AVAILABLE OFF) +if (LLVM_LIBRARY_OUTPUT_INTDIR AND LLVM_RUNTIME_OUTPUT_INTDIR AND PACKAGE_VERSION) + set(LLVM_TREE_AVAILABLE ON) +endif() + +# Path to LLVM development tools (FileCheck, llvm-lit, not, ...) +set(LLVM_TOOLS_DIR "${LLVM_BINARY_DIR}/bin") + +# Determine build and install paths. +# The build path is absolute, but the install dir is relative, CMake's install +# command has to apply CMAKE_INSTALL_PREFIX itself. +if (LLVM_TREE_AVAILABLE) + # In a bootstrap build emit the libraries into a default search path in the + # build directory of the just-built compiler. This allows using the + # just-built compiler without specifying paths to runtime libraries. + # + # Despite Clang in the name, get_clang_resource_dir does not depend on Clang + # being added to the build. Flang uses the same resource dir as clang. + include(GetClangResourceDir) + get_clang_resource_dir(FLANG_RT_OUTPUT_RESOURCE_DIR PREFIX "${LLVM_LIBRARY_OUTPUT_INTDIR}/..") jhuber6 wrote: The `lib64/` thing seems weird. Is anything else installed there? `-DLLVM_ENABLE_PER_TARGET_RUNTIME_DIR=ON` is the default for Linux and what should lead to having `x86_64-unknown-linux-gnu` there, but I've never seen `lib64/` be qualified there as well. https://github.com/llvm/llvm-project/pull/110217 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [clang] [flang] [lld] [llvm] [Flang] LLVM_ENABLE_RUNTIMES=flang-rt (PR #110217)
Meinersbur wrote: > I think with >1600 commits and >300kLoC changes, something went wrong here > with the merging. As mentioned by myself and others, it would be good to > rebase this and condense the commits that belong into #110298 resp. this one. This happens when I push a branch that has `origin/HEAD` merged into it before doing the same with the PR that it is based on. In this case I pushed the wrong branch, the branch of one of the PRs that has already been merged. Sorry about that. I have now written a script that pushes in the right order, so I hope this doesn't happen anymore. It seems that I will not be able to apply https://github.com/llvm/llvm-zorg/pull/333 quickly enough to minimize the time the buildbots are red. I am currently working on keeping the old CMake code that builds the runtime so that the buildbot builders can be updated iteratively. This has the side effect that I need to keep `flang/runtime/CMakeLists.txt` working, which I originally wanted to avoid, but it also allows splitting this patch into smaller PRs. https://github.com/llvm/llvm-project/pull/110217 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [clang] [llvm] [mlir] [OMPIRBuilder] Introduce struct to hold default kernel teams/threads (PR #116050)
https://github.com/ergawy deleted https://github.com/llvm/llvm-project/pull/116050 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [clang] [llvm] [mlir] [OMPIRBuilder] Introduce struct to hold default kernel teams/threads (PR #116050)
@@ -6182,9 +6182,12 @@ TEST_F(OpenMPIRBuilderTest, TargetRegion) { TargetRegionEntryInfo EntryInfo("func", 42, 4711, 17); OpenMPIRBuilder::LocationDescription OmpLoc({Builder.saveIP(), DL}); - OpenMPIRBuilder::InsertPointOrErrorTy AfterIP = OMPBuilder.createTarget( - OmpLoc, /*IsOffloadEntry=*/true, Builder.saveIP(), Builder.saveIP(), - EntryInfo, -1, 0, Inputs, GenMapInfoCB, BodyGenCB, SimpleArgAccessorCB); + OpenMPIRBuilder::TargetKernelDefaultAttrs DefaultAttrs = { + /*MaxTeams=*/{-1}, /*MinTeams=*/0, /*MaxThreads=*/{0}, /*MinThreads=*/0}; ergawy wrote: This set of values is used in multiple locations to "default" construct `TargetKernelDefaultAttrs`, would it make sense to have this set of values as default values in the struct? I might be missing why we need the current default struct values. https://github.com/llvm/llvm-project/pull/116050 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [mlir] [MLIR][OpenMP] Add host_eval clause to omp.target (PR #116049)
https://github.com/skatrak updated https://github.com/llvm/llvm-project/pull/116049 >From bd7fa379968210047a25e031a8385ff0c43a3fb7 Mon Sep 17 00:00:00 2001 From: Sergio Afonso Date: Fri, 8 Nov 2024 12:00:45 + Subject: [PATCH] [MLIR][OpenMP] Add host_eval clause to omp.target This patch adds the `host_eval` clause to the `omp.target` operation. Additionally, it updates its op verifier to make sure all uses of block arguments defined by this clause fall within one of the few cases where they are allowed. MLIR to LLVM IR translation fails on translation of this clause with a not-yet-implemented error. --- mlir/docs/Dialects/OpenMPDialect/_index.md| 58 - .../mlir/Dialect/OpenMP/OpenMPDialect.h | 1 + mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td | 33 ++- mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp | 206 +- .../OpenMP/OpenMPToLLVMIRTranslation.cpp | 5 + mlir/test/Dialect/OpenMP/invalid.mlir | 94 +++- mlir/test/Dialect/OpenMP/ops.mlir | 54 - mlir/test/Target/LLVMIR/openmp-todo.mlir | 14 ++ 8 files changed, 446 insertions(+), 19 deletions(-) diff --git a/mlir/docs/Dialects/OpenMPDialect/_index.md b/mlir/docs/Dialects/OpenMPDialect/_index.md index 03d5b95217cce0..b651b3c06485c6 100644 --- a/mlir/docs/Dialects/OpenMPDialect/_index.md +++ b/mlir/docs/Dialects/OpenMPDialect/_index.md @@ -298,7 +298,8 @@ introduction of private copies of the same underlying variable defined outside the MLIR operation the clause is attached to. Currently, clauses with this property can be classified into three main categories: - Map-like clauses: `host_eval` (compiler internal, not defined by the OpenMP - specification), `map`, `use_device_addr` and `use_device_ptr`. + specification: [see more](#host-evaluated-clauses-in-target-regions)), `map`, + `use_device_addr` and `use_device_ptr`. - Reduction-like clauses: `in_reduction`, `reduction` and `task_reduction`. - Privatization clauses: `private`. @@ -523,3 +524,58 @@ omp.parallel ... 
{ omp.terminator } {omp.composite} ``` + +## Host-Evaluated Clauses in Target Regions + +The `omp.target` operation, which represents the OpenMP `target` construct, is +marked with the `IsolatedFromAbove` trait. This means that, inside of its +region, no MLIR values defined outside of the op itself can be used. This is +consistent with the OpenMP specification of the `target` construct, which +mandates that all host device values used inside of the `target` region must +either be privatized (data-sharing) or mapped (data-mapping). + +Normally, clauses applied to a construct are evaluated before entering that +construct. Further, in some cases, the OpenMP specification stipulates that +clauses be evaluated _on the host device_ on entry to a parent `target` +construct. In particular, the `num_teams` and `thread_limit` clauses of the +`teams` construct must be evaluated on the host device if it's nested inside or +combined with a `target` construct. + +Additionally, the runtime library targeted by the MLIR to LLVM IR translation of +the OpenMP dialect supports the optimized launch of SPMD kernels (i.e. +`target teams distribute parallel {do,for}` in OpenMP), which requires +specifying in advance what the total trip count of the loop is. Consequently, it +is also beneficial to evaluate the trip count on the host device prior to the +kernel launch. + +These host-evaluated values in MLIR would need to be placed outside of the +`omp.target` region and also attached to the corresponding nested operations, +which is not possible because of the `IsolatedFromAbove` trait. The solution +implemented to address this problem has been to introduce the `host_eval` +argument to the `omp.target` operation. It works similarly to a `map` clause, +but its only intended use is to forward host-evaluated values to their +corresponding operation inside of the region. Any uses outside of the previously +described result in a verifier error. + +```mlir +// Initialize %0, %1, %2, %3... 
+omp.target host_eval(%0 -> %nt, %1 -> %lb, %2 -> %ub, %3 -> %step : i32, i32, i32, i32) { + omp.teams num_teams(to %nt : i32) { +omp.parallel { + omp.distribute { +omp.wsloop { + omp.loop_nest (%iv) : i32 = (%lb) to (%ub) step (%step) { +// ... +omp.yield + } + omp.terminator +} {omp.composite} +omp.terminator + } {omp.composite} + omp.terminator +} {omp.composite} +omp.terminator + } + omp.terminator +} +``` diff --git a/mlir/include/mlir/Dialect/OpenMP/OpenMPDialect.h b/mlir/include/mlir/Dialect/OpenMP/OpenMPDialect.h index bee21432196e42..248ac2eb72c61a 100644 --- a/mlir/include/mlir/Dialect/OpenMP/OpenMPDialect.h +++ b/mlir/include/mlir/Dialect/OpenMP/OpenMPDialect.h @@ -22,6 +22,7 @@ #include "mlir/IR/SymbolTable.h" #include "mlir/Interfaces/ControlFlowInterfaces.h" #include "mlir/Interfaces/SideEffectInterfaces.h" +#
[llvm-branch-commits] [llvm] AMDGPU: Start considering new atomicrmw metadata on integer operations (PR #122138)
llvmbot wrote: @llvm/pr-subscribers-backend-amdgpu Author: Matt Arsenault (arsenm) Changes Start considering !amdgpu.no.remote.memory.access and !amdgpu.no.fine.grained.host.memory metadata when deciding to expand integer atomic operations. This does not yet attempt to accurately handle fadd/fmin/fmax, which are trickier and require migrating the old "amdgpu-unsafe-fp-atomics" attribute. --- Patch is 1.43 MiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/122138.diff 28 Files Affected: - (modified) llvm/lib/Target/AMDGPU/SIISelLowering.cpp (+53-12) - (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/atomicrmw_udec_wrap.ll (+31-30) - (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/atomicrmw_uinc_wrap.ll (+2521-614) - (modified) llvm/test/CodeGen/AMDGPU/acc-ldst.ll (+4-2) - (modified) llvm/test/CodeGen/AMDGPU/atomic_optimizations_global_pointer.ll (+14-12) - (modified) llvm/test/CodeGen/AMDGPU/dag-divergence-atomic.ll (+17-17) - (modified) llvm/test/CodeGen/AMDGPU/flat_atomics.ll (+87-85) - (modified) llvm/test/CodeGen/AMDGPU/flat_atomics_i32_system.ll (+100-926) - (modified) llvm/test/CodeGen/AMDGPU/flat_atomics_i64.ll (+82-80) - (modified) llvm/test/CodeGen/AMDGPU/flat_atomics_i64_noprivate.ll (+3758-1362) - (modified) llvm/test/CodeGen/AMDGPU/flat_atomics_i64_system.ll (+258-1374) - (modified) llvm/test/CodeGen/AMDGPU/flat_atomics_i64_system_noprivate.ll (+100-1212) - (modified) llvm/test/CodeGen/AMDGPU/global-saddr-atomics.ll (+82-80) - (modified) llvm/test/CodeGen/AMDGPU/global_atomics.ll (+319-79) - (modified) llvm/test/CodeGen/AMDGPU/global_atomics_i32_system.ll (+98-984) - (modified) llvm/test/CodeGen/AMDGPU/global_atomics_i64.ll (+82-80) - (modified) llvm/test/CodeGen/AMDGPU/global_atomics_i64_system.ll (+109-1221) - (modified) llvm/test/CodeGen/AMDGPU/idemponent-atomics.ll (+42-28) - (modified) llvm/test/CodeGen/AMDGPU/move-to-valu-atomicrmw.ll (+4-2) - (modified) llvm/test/CodeGen/AMDGPU/shl_add_ptr_global.ll (+1-1) - 
(modified) llvm/test/Transforms/AtomicExpand/AMDGPU/expand-atomic-i16.ll (+534-159) - (modified) llvm/test/Transforms/AtomicExpand/AMDGPU/expand-atomic-i32-agent.ll (+990-49) - (modified) llvm/test/Transforms/AtomicExpand/AMDGPU/expand-atomic-i32-system.ll (+30-330) - (modified) llvm/test/Transforms/AtomicExpand/AMDGPU/expand-atomic-i64-agent.ll (+990-49) - (modified) llvm/test/Transforms/AtomicExpand/AMDGPU/expand-atomic-i64-system.ll (+30-330) - (modified) llvm/test/Transforms/AtomicExpand/AMDGPU/expand-atomic-i8.ll (+209-6) - (modified) llvm/test/Transforms/AtomicExpand/AMDGPU/expand-atomicrmw-flat-noalias-addrspace.ll (+130-8) - (modified) llvm/test/Transforms/AtomicExpand/AMDGPU/expand-atomicrmw-integer-ops-0-to-add-0.ll (+43-21) ``diff diff --git a/llvm/lib/Target/AMDGPU/SIISelLowering.cpp b/llvm/lib/Target/AMDGPU/SIISelLowering.cpp index 513251e398ad4d..5fa8e1532096f7 100644 --- a/llvm/lib/Target/AMDGPU/SIISelLowering.cpp +++ b/llvm/lib/Target/AMDGPU/SIISelLowering.cpp @@ -16621,19 +16621,60 @@ SITargetLowering::shouldExpandAtomicRMWInIR(AtomicRMWInst *RMW) const { case AtomicRMWInst::UDecWrap: { if (AMDGPU::isFlatGlobalAddrSpace(AS) || AS == AMDGPUAS::BUFFER_FAT_POINTER) { - // Always expand system scope atomics. - if (HasSystemScope) { -if (Op == AtomicRMWInst::Sub || Op == AtomicRMWInst::Or || -Op == AtomicRMWInst::Xor) { - // Atomic sub/or/xor do not work over PCI express, but atomic add - // does. InstCombine transforms these with 0 to or, so undo that. - if (Constant *ConstVal = dyn_cast(RMW->getValOperand()); - ConstVal && ConstVal->isNullValue()) -return AtomicExpansionKind::Expand; -} - -return AtomicExpansionKind::CmpXChg; + // On most subtargets, for atomicrmw operations other than add/xchg, + // whether or not the instructions will behave correctly depends on where + // the address physically resides and what interconnect is used in the + // system configuration. 
On some targets the instruction will nop, + // and in others synchronization will only occur at degraded device scope. + // + // If the allocation is known local to the device, the instructions should + // work correctly. + if (RMW->hasMetadata("amdgpu.no.remote.memory")) +return atomicSupportedIfLegalIntType(RMW); + + // If fine-grained remote memory works at device scope, we don't need to + // do anything. + if (!HasSystemScope && + Subtarget->supportsAgentScopeFineGrainedRemoteMemoryAtomics()) +return atomicSupportedIfLegalIntType(RMW); + + // If we are targeting a remote allocated address, it depends what kind of + // allocation the address belongs to. + // + // If the allocation is fine-grained (in host memory, or in PCIe peer + // device memory), the operation will fa
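The two checks visible in the truncated diff above can be modelled in isolation. The following is only a sketch under stated assumptions: `ExpansionKindSketch`, `IntAtomicQuery`, `supportedIfLegalIntType` and `decideFromVisibleChecks` are hypothetical names, the subtarget and metadata queries are reduced to plain booleans, and the behavior assumed for the legal-integer helper (keep the native instruction for legal widths, otherwise use a cmpxchg loop) is a guess that the quoted text does not confirm. Whatever the patch does after the second check is cut off in the message, so the sketch returns no decision there rather than guessing.

```cpp
#include <cassert>
#include <optional>

// Hypothetical mirror of the expansion kinds named in the quoted diff.
enum class ExpansionKindSketch { None, CmpXChg, Expand };

// Inputs to the two quoted checks, reduced to plain booleans.
struct IntAtomicQuery {
  bool KnownDeviceLocal;        // the "no remote memory" metadata is present
  bool HasSystemScope;          // the atomic uses system scope
  bool AgentScopeFineGrainedOK; // stands in for the subtarget query
  bool IsLegalIntType;          // assumption: integer width the target handles
};

// Assumption: legal integer widths keep the native atomic instruction,
// anything else falls back to a compare-exchange loop.
inline ExpansionKindSketch supportedIfLegalIntType(const IntAtomicQuery &Q) {
  return Q.IsLegalIntType ? ExpansionKindSketch::None
                          : ExpansionKindSketch::CmpXChg;
}

// Returns a decision only when one of the two visible checks fires.
// std::nullopt means "decided by code that is truncated in the message".
inline std::optional<ExpansionKindSketch>
decideFromVisibleChecks(const IntAtomicQuery &Q) {
  // Check 1: the allocation is known to be local to the device.
  if (Q.KnownDeviceLocal)
    return supportedIfLegalIntType(Q);
  // Check 2: not system scope, and fine-grained remote memory works at
  // agent scope on this subtarget.
  if (!Q.HasSystemScope && Q.AgentScopeFineGrainedOK)
    return supportedIfLegalIntType(Q);
  return std::nullopt;
}
```

Returning `std::optional` keeps the sketch honest about the cut-off: a system-scope atomic without the metadata reaches neither check, and the sketch says nothing about how it is expanded.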
[llvm-branch-commits] [llvm] AMDGPU: Start considering new atomicrmw metadata on integer operations (PR #122138)
arsenm wrote: > [!WARNING] > This pull request is not mergeable via GitHub because a downstack PR is > open. Once all requirements are satisfied, merge this PR as a stack on > Graphite (https://app.graphite.dev/github/pr/llvm/llvm-project/122138). > Learn more: https://graphite.dev/docs/merge-pull-requests * **#122138** 👈 (View in Graphite: https://app.graphite.dev/github/pr/llvm/llvm-project/122138) * **#122137** (https://app.graphite.dev/github/pr/llvm/llvm-project/122137) * `main` This stack of pull requests is managed by Graphite (https://graphite.dev). Learn more about stacking: https://stacking.dev/ https://github.com/llvm/llvm-project/pull/122138 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] AMDGPU: Start considering new atomicrmw metadata on integer operations (PR #122138)
https://github.com/arsenm ready_for_review https://github.com/llvm/llvm-project/pull/122138 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [clang] [llvm] [mlir] [OMPIRBuilder] Introduce struct to hold default kernel teams/threads (PR #116050)
https://github.com/skatrak updated https://github.com/llvm/llvm-project/pull/116050 >From f73a439832c4e8454274b7677570d190231dcf46 Mon Sep 17 00:00:00 2001 From: Sergio Afonso Date: Fri, 8 Nov 2024 15:46:48 + Subject: [PATCH 1/2] [OMPIRBuilder] Introduce struct to hold default kernel teams/threads This patch introduces the `OpenMPIRBuilder::TargetKernelDefaultAttrs` structure used to simplify passing default and constant values for number of teams and threads, and possibly other target kernel-related information in the future. This is used to forward values passed to `createTarget` to `createTargetInit`, which previously used a default unrelated set of values. --- clang/lib/CodeGen/CGOpenMPRuntime.cpp | 13 ++-- clang/lib/CodeGen/CGOpenMPRuntime.h | 9 +-- clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp | 9 +-- .../llvm/Frontend/OpenMP/OMPIRBuilder.h | 39 ++ llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp | 71 +++ .../Frontend/OpenMPIRBuilderTest.cpp | 29 .../OpenMP/OpenMPToLLVMIRTranslation.cpp | 11 +-- .../LLVMIR/omptarget-region-device-llvm.mlir | 2 +- 8 files changed, 102 insertions(+), 81 deletions(-) diff --git a/clang/lib/CodeGen/CGOpenMPRuntime.cpp b/clang/lib/CodeGen/CGOpenMPRuntime.cpp index 30c3834de139c3..1cb3bab454c26a 100644 --- a/clang/lib/CodeGen/CGOpenMPRuntime.cpp +++ b/clang/lib/CodeGen/CGOpenMPRuntime.cpp @@ -5881,10 +5881,13 @@ void CGOpenMPRuntime::emitUsesAllocatorsFini(CodeGenFunction &CGF, void CGOpenMPRuntime::computeMinAndMaxThreadsAndTeams( const OMPExecutableDirective &D, CodeGenFunction &CGF, -int32_t &MinThreadsVal, int32_t &MaxThreadsVal, int32_t &MinTeamsVal, -int32_t &MaxTeamsVal) { +llvm::OpenMPIRBuilder::TargetKernelDefaultAttrs &Attrs) { + assert(Attrs.MaxTeams.size() == 1 && Attrs.MaxThreads.size() == 1 && + "invalid default attrs structure"); + int32_t &MaxTeamsVal = Attrs.MaxTeams.front(); + int32_t &MaxThreadsVal = Attrs.MaxThreads.front(); - getNumTeamsExprForTargetDirective(CGF, D, MinTeamsVal, MaxTeamsVal); + 
getNumTeamsExprForTargetDirective(CGF, D, Attrs.MinTeams, MaxTeamsVal); getNumThreadsExprForTargetDirective(CGF, D, MaxThreadsVal, /*UpperBoundOnly=*/true); @@ -5902,12 +5905,12 @@ void CGOpenMPRuntime::computeMinAndMaxThreadsAndTeams( else continue; - MinThreadsVal = std::max(MinThreadsVal, AttrMinThreadsVal); + Attrs.MinThreads = std::max(Attrs.MinThreads, AttrMinThreadsVal); if (AttrMaxThreadsVal > 0) MaxThreadsVal = MaxThreadsVal > 0 ? std::min(MaxThreadsVal, AttrMaxThreadsVal) : AttrMaxThreadsVal; - MinTeamsVal = std::max(MinTeamsVal, AttrMinBlocksVal); + Attrs.MinTeams = std::max(Attrs.MinTeams, AttrMinBlocksVal); if (AttrMaxBlocksVal > 0) MaxTeamsVal = MaxTeamsVal > 0 ? std::min(MaxTeamsVal, AttrMaxBlocksVal) : AttrMaxBlocksVal; diff --git a/clang/lib/CodeGen/CGOpenMPRuntime.h b/clang/lib/CodeGen/CGOpenMPRuntime.h index 8ab5ee70a19fa2..3791bb71592350 100644 --- a/clang/lib/CodeGen/CGOpenMPRuntime.h +++ b/clang/lib/CodeGen/CGOpenMPRuntime.h @@ -313,12 +313,9 @@ class CGOpenMPRuntime { llvm::OpenMPIRBuilder OMPBuilder; /// Helper to determine the min/max number of threads/teams for \p D. - void computeMinAndMaxThreadsAndTeams(const OMPExecutableDirective &D, - CodeGenFunction &CGF, - int32_t &MinThreadsVal, - int32_t &MaxThreadsVal, - int32_t &MinTeamsVal, - int32_t &MaxTeamsVal); + void computeMinAndMaxThreadsAndTeams( + const OMPExecutableDirective &D, CodeGenFunction &CGF, + llvm::OpenMPIRBuilder::TargetKernelDefaultAttrs &Attrs); /// Helper to emit outlined function for 'target' directive. /// \param D Directive to emit. 
diff --git a/clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp b/clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp index 756f0482b8ea72..659783a813c83e 100644 --- a/clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp +++ b/clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp @@ -744,14 +744,11 @@ void CGOpenMPRuntimeGPU::emitNonSPMDKernel(const OMPExecutableDirective &D, void CGOpenMPRuntimeGPU::emitKernelInit(const OMPExecutableDirective &D, CodeGenFunction &CGF, EntryFunctionState &EST, bool IsSPMD) { - int32_t MinThreadsVal = 1, MaxThreadsVal = -1, MinTeamsVal = 1, - MaxTeamsVal = -1; - computeMinAndMaxThreadsAndTeams(D, CGF, MinThreadsVal, MaxThreadsVal, - MinTeamsVal, MaxTeamsVal); + llvm::OpenMPIRBuilder::TargetKernelDefaultAttrs Attrs; + computeMinAndMaxThreadsAndTeams(D, CGF, Attrs); C
[llvm-branch-commits] [clang] [llvm] [mlir] [OMPIRBuilder] Introduce struct to hold default kernel teams/threads (PR #116050)
@@ -2726,15 +2740,11 @@ class OpenMPIRBuilder { /// /// \param Loc The insert and source location description. /// \param IsSPMD Flag to indicate if the kernel is an SPMD kernel or not. - /// \param MinThreads Minimal number of threads, or 0. - /// \param MaxThreads Maximal number of threads, or 0. - /// \param MinTeams Minimal number of teams, or 0. - /// \param MaxTeams Maximal number of teams, or 0. - InsertPointTy createTargetInit(const LocationDescription &Loc, bool IsSPMD, - int32_t MinThreadsVal = 0, - int32_t MaxThreadsVal = 0, - int32_t MinTeamsVal = 0, - int32_t MaxTeamsVal = 0); + /// \param Attrs Structure containing the default numbers of threads and teams + ///to launch the kernel with. + InsertPointTy createTargetInit( + const LocationDescription &Loc, bool IsSPMD, + const llvm::OpenMPIRBuilder::TargetKernelDefaultAttrs &Attrs); skatrak wrote: Good point, I agree. Added now. https://github.com/llvm/llvm-project/pull/116050 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [clang] [llvm] [mlir] [OMPIRBuilder] Introduce struct to hold default kernel teams/threads (PR #116050)
@@ -6182,9 +6182,12 @@ TEST_F(OpenMPIRBuilderTest, TargetRegion) { TargetRegionEntryInfo EntryInfo("func", 42, 4711, 17); OpenMPIRBuilder::LocationDescription OmpLoc({Builder.saveIP(), DL}); - OpenMPIRBuilder::InsertPointOrErrorTy AfterIP = OMPBuilder.createTarget( - OmpLoc, /*IsOffloadEntry=*/true, Builder.saveIP(), Builder.saveIP(), - EntryInfo, -1, 0, Inputs, GenMapInfoCB, BodyGenCB, SimpleArgAccessorCB); + OpenMPIRBuilder::TargetKernelDefaultAttrs DefaultAttrs = { + /*MaxTeams=*/{-1}, /*MinTeams=*/0, /*MaxThreads=*/{0}, /*MinThreads=*/0}; skatrak wrote: The defaults in the new struct are basically what you would expect: max values set to "unset" (each can be unset (<0), runtime-evaluated (0) or constant (>0)) and min values set to 1. I believe that set of defaults makes sense, and it also matches what clang initially sets the corresponding attributes to. As for not overriding the defaults in these tests, `MaxThreads < 0` causes the OMPIRBuilder to query the default grid size based on the target triple, whereas 0 won't. Querying that triggers an assert if the triple is not one of the supported offloading targets, so at least that one attribute can't be left unchanged unless we change the target triple of the OMPIRBuilder too. More generally, though, I think nothing in this PR requires updating these tests, so I just set all of the values to what they already were before the struct was introduced rather than adapting them to its defaults. I hope that makes sense to you, but let me know if you don't agree. https://github.com/llvm/llvm-project/pull/116050 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
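The three states of the max values described in the reply above can be sketched as follows. This is an illustrative stand-in, not the struct from the patch: `KernelDefaultAttrsSketch`, `classifyMax` and `mergeMax` are hypothetical names, `std::vector` replaces whatever container the real struct uses, and the concrete default of -1 is an assumption derived from "unset (<0)". The field order follows the test initializer quoted in the diff, and `mergeMax` mirrors the clamping pattern visible in the `computeMinAndMaxThreadsAndTeams` hunk earlier in this thread.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for OpenMPIRBuilder::TargetKernelDefaultAttrs.
// Assumed defaults: max values "unset" (-1), min values 1.
struct KernelDefaultAttrsSketch {
  std::vector<int32_t> MaxTeams = {-1};
  int32_t MinTeams = 1;
  std::vector<int32_t> MaxThreads = {-1};
  int32_t MinThreads = 1;
};

// Classify a max value into the three states named in the reply:
// unset (<0), runtime-evaluated (0) or constant (>0).
inline std::string classifyMax(int32_t Value) {
  if (Value < 0)
    return "unset";
  if (Value == 0)
    return "runtime-evaluated";
  return "constant";
}

// Combine the current max with an attribute-provided bound: a positive
// bound tightens an existing positive max, or replaces a non-positive one.
// A non-positive bound leaves the current value untouched.
inline int32_t mergeMax(int32_t Current, int32_t AttrMax) {
  if (AttrMax <= 0)
    return Current;
  return Current > 0 ? std::min(Current, AttrMax) : AttrMax;
}
```

Under these assumptions, the test initializer quoted above (`MaxTeams` of -1, `MaxThreads` of 0) corresponds to an unset team count and a runtime-evaluated thread count, which is consistent with the explanation that `MaxThreads < 0` would trigger the default grid size query while 0 does not.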
[llvm-branch-commits] [llvm] AMDGPU: Reduce 64-bit add width if low bits are known 0 (PR #122049)
arsenm wrote: ### Merge activity * **Jan 8, 10:24 AM EST**: A user started a stack merge that includes this pull request via [Graphite](https://app.graphite.dev/github/pr/llvm/llvm-project/122049). https://github.com/llvm/llvm-project/pull/122049 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [flang] [llvm] [flang][OpenMP] Parse WHEN, OTHERWISE, MATCH clauses plus METADIRECTIVE (PR #121817)
https://github.com/kparzysz updated https://github.com/llvm/llvm-project/pull/121817 >From 5f534c559ca1bb7911b484264582d1a5078bdcb8 Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Thu, 12 Dec 2024 15:26:26 -0600 Subject: [PATCH 1/7] [flang][OpenMP] Parse WHEN, OTHERWISE, MATCH clauses plus METADIRECTIVE Parse METADIRECTIVE as a standalone executable directive at the moment. This will allow testing the parser code. There is no lowering, not even clause conversion yet. There is also no verification of the allowed values for trait sets, trait properties. --- flang/include/flang/Parser/dump-parse-tree.h | 5 + flang/include/flang/Parser/parse-tree.h | 41 - flang/lib/Lower/OpenMP/Clauses.cpp | 21 ++- flang/lib/Lower/OpenMP/Clauses.h | 1 + flang/lib/Lower/OpenMP/OpenMP.cpp| 6 + flang/lib/Parser/openmp-parsers.cpp | 26 ++- flang/lib/Parser/unparse.cpp | 12 ++ flang/lib/Semantics/check-omp-structure.cpp | 9 + flang/lib/Semantics/check-omp-structure.h| 3 + flang/lib/Semantics/resolve-directives.cpp | 5 + flang/test/Parser/OpenMP/metadirective.f90 | 165 +++ llvm/include/llvm/Frontend/OpenMP/OMP.td | 9 +- 12 files changed, 296 insertions(+), 7 deletions(-) create mode 100644 flang/test/Parser/OpenMP/metadirective.f90 diff --git a/flang/include/flang/Parser/dump-parse-tree.h b/flang/include/flang/Parser/dump-parse-tree.h index a61d7973dd5c36..94ca7c67cbd52e 100644 --- a/flang/include/flang/Parser/dump-parse-tree.h +++ b/flang/include/flang/Parser/dump-parse-tree.h @@ -476,6 +476,11 @@ class ParseTreeDumper { NODE(parser, NullInit) NODE(parser, ObjectDecl) NODE(parser, OldParameterStmt) + NODE(parser, OmpMetadirectiveDirective) + NODE(parser, OmpMatchClause) + NODE(parser, OmpOtherwiseClause) + NODE(parser, OmpWhenClause) + NODE(OmpWhenClause, Modifier) NODE(parser, OmpDirectiveSpecification) NODE(parser, OmpTraitPropertyName) NODE(parser, OmpTraitScore) diff --git a/flang/include/flang/Parser/parse-tree.h b/flang/include/flang/Parser/parse-tree.h index 
697bddfaf16150..113ff3380ba22c 100644 --- a/flang/include/flang/Parser/parse-tree.h +++ b/flang/include/flang/Parser/parse-tree.h @@ -3964,6 +3964,7 @@ struct OmpBindClause { // data-sharing-attribute -> //SHARED | NONE | // since 4.5 //PRIVATE | FIRSTPRIVATE// since 5.0 +// See also otherwise-clause. struct OmpDefaultClause { ENUM_CLASS(DataSharingAttribute, Private, Firstprivate, Shared, None) WRAPPER_CLASS_BOILERPLATE(OmpDefaultClause, DataSharingAttribute); @@ -4184,6 +4185,16 @@ struct OmpMapClause { std::tuple t; }; +// Ref: [5.0:58-60], [5.1:63-68], [5.2:194-195] +// +// match-clause -> +//MATCH (context-selector-specification)// since 5.0 +struct OmpMatchClause { + // The context-selector is an argument. + WRAPPER_CLASS_BOILERPLATE( + OmpMatchClause, traits::OmpContextSelectorSpecification); +}; + // Ref: [5.2:217-218] // message-clause -> //MESSAGE("message-text") @@ -4214,6 +4225,17 @@ struct OmpOrderClause { std::tuple t; }; +// Ref: [5.0:56-57], [5.1:60-62], [5.2:191] +// +// otherwise-clause -> +//DEFAULT ([directive-specification]) // since 5.0, until 5.1 +// otherwise-clause -> +//OTHERWISE ([directive-specification])]// since 5.2 +struct OmpOtherwiseClause { + WRAPPER_CLASS_BOILERPLATE( + OmpOtherwiseClause, std::optional); +}; + // Ref: [4.5:46-50], [5.0:74-78], [5.1:92-96], [5.2:229-230] // // proc-bind-clause -> @@ -4299,6 +4321,17 @@ struct OmpUpdateClause { std::variant u; }; +// Ref: [5.0:56-57], [5.1:60-62], [5.2:190-191] +// +// when-clause -> +//WHEN (context-selector : +//[directive-specification])// since 5.0 +struct OmpWhenClause { + TUPLE_CLASS_BOILERPLATE(OmpWhenClause); + MODIFIER_BOILERPLATE(OmpContextSelector); + std::tuple> t; +}; + // OpenMP Clauses struct OmpClause { UNION_CLASS_BOILERPLATE(OmpClause); @@ -4323,6 +4356,12 @@ struct OmpClauseList { // --- Directives and constructs +struct OmpMetadirectiveDirective { + TUPLE_CLASS_BOILERPLATE(OmpMetadirectiveDirective); + std::tuple t; + CharBlock source; +}; + // Ref: [5.1:89-90], 
[5.2:216] // // nothing-directive -> @@ -4696,7 +4735,7 @@ struct OpenMPStandaloneConstruct { CharBlock source; std::variant + OpenMPDepobjConstruct, OmpMetadirectiveDirective> u; }; diff --git a/flang/lib/Lower/OpenMP/Clauses.cpp b/flang/lib/Lower/OpenMP/Clauses.cpp index b424e209d56da9..d60171552087fa 100644 --- a/flang/lib/Lower/OpenMP/Clauses.cpp +++ b/flang/lib/Lower/OpenMP/Clauses.cpp @@ -230,9 +230,9 @@ MAKE_EMPTY_CLASS(Threadprivate, Threadprivate); MAKE_INCOMPLETE_CLASS(AdjustArgs, AdjustArgs); MAKE_INCOMPLETE_CLASS(AppendArgs, AppendArgs); -MAKE_INCOMPLETE_CLASS(Match, Match); +// MAKE_INCOMPLETE_CLASS(Match, Mat
[llvm-branch-commits] [flang] [flang][OpenMP] Parsing context selectors for METADIRECTIVE (PR #121815)
https://github.com/kparzysz updated https://github.com/llvm/llvm-project/pull/121815 >From 215c7e6133bf07d005ac7483b8faf797e319a1fa Mon Sep 17 00:00:00 2001 From: Krzysztof Parzyszek Date: Thu, 12 Dec 2024 15:26:26 -0600 Subject: [PATCH] [flang][OpenMP] Parsing context selectors for METADIRECTIVE This is just adding parsers for context selectors. There are no tests because there is no way to execute these parsers yet. --- flang/include/flang/Parser/characters.h | 2 + flang/include/flang/Parser/dump-parse-tree.h | 14 ++ flang/include/flang/Parser/parse-tree.h | 136 +++ flang/lib/Parser/openmp-parsers.cpp | 78 +++ flang/lib/Parser/token-parsers.h | 4 + flang/lib/Parser/unparse.cpp | 38 ++ flang/lib/Semantics/check-omp-structure.cpp | 8 ++ flang/lib/Semantics/check-omp-structure.h| 3 + flang/lib/Semantics/resolve-directives.cpp | 6 + 9 files changed, 289 insertions(+) diff --git a/flang/include/flang/Parser/characters.h b/flang/include/flang/Parser/characters.h index df188d674b9eeb..dbdc058c44995a 100644 --- a/flang/include/flang/Parser/characters.h +++ b/flang/include/flang/Parser/characters.h @@ -180,6 +180,8 @@ inline constexpr bool IsValidFortranTokenCharacter(char ch) { case '>': case '[': case ']': + case '{': // Used in OpenMP context selector specification + case '}': // return true; default: return IsLegalIdentifierStart(ch) || IsDecimalDigit(ch); diff --git a/flang/include/flang/Parser/dump-parse-tree.h b/flang/include/flang/Parser/dump-parse-tree.h index 3331520922bc63..a61d7973dd5c36 100644 --- a/flang/include/flang/Parser/dump-parse-tree.h +++ b/flang/include/flang/Parser/dump-parse-tree.h @@ -476,6 +476,20 @@ class ParseTreeDumper { NODE(parser, NullInit) NODE(parser, ObjectDecl) NODE(parser, OldParameterStmt) + NODE(parser, OmpDirectiveSpecification) + NODE(parser, OmpTraitPropertyName) + NODE(parser, OmpTraitScore) + NODE(parser, OmpTraitPropertyExtension) + NODE(OmpTraitPropertyExtension, ExtensionValue) + NODE(parser, OmpTraitProperty) + NODE(parser, 
OmpTraitSelectorName) + NODE_ENUM(OmpTraitSelectorName, Value) + NODE(parser, OmpTraitSelector) + NODE(OmpTraitSelector, Properties) + NODE(parser, OmpTraitSetSelectorName) + NODE_ENUM(OmpTraitSetSelectorName, Value) + NODE(parser, OmpTraitSetSelector) + NODE(parser, OmpContextSelectorSpecification) NODE(parser, OmpMapper) NODE(parser, OmpMapType) NODE_ENUM(OmpMapType, Value) diff --git a/flang/include/flang/Parser/parse-tree.h b/flang/include/flang/Parser/parse-tree.h index 941d70d3876291..697bddfaf16150 100644 --- a/flang/include/flang/Parser/parse-tree.h +++ b/flang/include/flang/Parser/parse-tree.h @@ -3453,6 +3453,17 @@ WRAPPER_CLASS(PauseStmt, std::optional); // --- Common definitions +struct OmpClause; +struct OmpClauseList; + +struct OmpDirectiveSpecification { + TUPLE_CLASS_BOILERPLATE(OmpDirectiveSpecification); + std::tuple>> + t; + CharBlock source; +}; + // 2.1 Directives or clauses may accept a list or extended-list. // A list item is a variable, array section or common block name (enclosed // in slashes). An extended list item is a list item or a procedure Name. @@ -3474,6 +3485,128 @@ WRAPPER_CLASS(OmpObjectList, std::list); #define MODIFIERS() std::optional> +inline namespace traits { +// trait-property-name -> +//identifier | string-literal +struct OmpTraitPropertyName { + WRAPPER_CLASS_BOILERPLATE(OmpTraitPropertyName, std::string); +}; + +// trait-score -> +//SCORE(non-negative-const-integer-expression) +struct OmpTraitScore { + WRAPPER_CLASS_BOILERPLATE(OmpTraitScore, ScalarIntExpr); +}; + +// trait-property-extension -> +//trait-property-name (trait-property-value, ...) +// trait-property-value -> +//trait-property-name | +//scalar-integer-expression | +//trait-property-extension +// +// The grammar in OpenMP 5.2+ spec is ambiguous, the above is a different +// version (but equivalent) that doesn't have ambiguities. 
+// The ambiguity is in +// trait-property: +// trait-property-name <- (a) +// trait-property-clause +// trait-property-expression<- (b) +// trait-property-extension <- this conflicts with (a) and (b) +// trait-property-extension: +// trait-property-name <- conflict with (a) +// identifier(trait-property-extension[, trait-property-extension[, ...]]) +// constant integer expression <- conflict with (b) +// +struct OmpTraitPropertyExtension { + TUPLE_CLASS_BOILERPLATE(OmpTraitPropertyExtension); + struct ExtensionValue { +UNION_CLASS_BOILERPLATE(ExtensionValue); +std::variant> +u; + }; + using ExtensionList = std::list; + std::tuple t; +}; + +// trait-property -> +//trait-property-name | OmpClause | +//trait-property-expression | trait-property-extension +// trait-property-expression -> +//scala
[llvm-branch-commits] [clang] [flang] [lld] [llvm] [Flang] LLVM_ENABLE_RUNTIMES=flang-rt (PR #110217)
https://github.com/Meinersbur edited https://github.com/llvm/llvm-project/pull/110217 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [DataLayout][LangRef] Split non-integral and unstable pointer properties (PR #105735)
@@ -650,48 +650,127 @@ literal types are uniqued in recent versions of LLVM. .. _nointptrtype: -Non-Integral Pointer Type -- +Non-Integral and Unstable Pointer Types +--- -Note: non-integral pointer types are a work in progress, and they should be -considered experimental at this time. +Note: non-integral/unstable pointer types are a work in progress, and they +should be considered experimental at this time. LLVM IR optionally allows the frontend to denote pointers in certain address -spaces as "non-integral" via the :ref:`datalayout string`. -Non-integral pointer types represent pointers that have an *unspecified* bitwise -representation; that is, the integral representation may be target dependent or -unstable (not backed by a fixed integer). +spaces as "non-integral" or "unstable" (or both "non-integral" and "unstable") +via the :ref:`datalayout string`. + +The exact implications of these properties are target-specific, but the +following IR semantics and restrictions to optimization passes apply: + +Unstable pointer representation +^^^ + +Pointers in this address space have an *unspecified* bitwise representation +(i.e. not backed by a fixed integer). The bitwise pattern of such pointers is +allowed to change in a target-specific way. For example, this could be a pointer +type used with copying garbage collection where the garbage collector could +update the pointer at any time in the collection sweep. ``inttoptr`` and ``ptrtoint`` instructions have the same semantics as for integral (i.e. normal) pointers in that they convert integers to and from -corresponding pointer types, but there are additional implications to be -aware of. Because the bit-representation of a non-integral pointer may -not be stable, two identical casts of the same operand may or may not +corresponding pointer types, but there are additional implications to be aware +of. 
+ +For "unstable" pointer representations, the bit-representation of the pointer +may not be stable, so two identical casts of the same operand may or may not return the same value. Said differently, the conversion to or from the -non-integral type depends on environmental state in an implementation +"unstable" pointer type depends on environmental state in an implementation defined manner. - If the frontend wishes to observe a *particular* value following a cast, the generated IR must fence with the underlying environment in an implementation defined manner. (In practice, this tends to require ``noinline`` routines for such operations.) From the perspective of the optimizer, ``inttoptr`` and ``ptrtoint`` for -non-integral types are analogous to ones on integral types with one +"unstable" pointer types are analogous to ones on integral types with one key exception: the optimizer may not, in general, insert new dynamic occurrences of such casts. If a new cast is inserted, the optimizer would need to either ensure that a) all possible values are valid, or b) appropriate fencing is inserted. Since the appropriate fencing is implementation defined, the optimizer can't do the latter. The former is challenging as many commonly expected properties, such as -``ptrtoint(v)-ptrtoint(v) == 0``, don't hold for non-integral types. +``ptrtoint(v)-ptrtoint(v) == 0``, don't hold for "unstable" pointer types. Similar restrictions apply to intrinsics that might examine the pointer bits, such as :ref:`llvm.ptrmask`. -The alignment information provided by the frontend for a non-integral pointer +The alignment information provided by the frontend for an "unstable" pointer (typically using attributes or metadata) must be valid for every possible representation of the pointer. +Non-integral pointer representation +^^^ + +Pointers are not represented as just an address, but may instead include +additional metadata such as bounds information or a temporal identifier. 
+Examples include AMDGPU buffer descriptors with a 128-bit fat pointer and a +32-bit offset, or CHERI capabilities that contain bounds, permissions and a +type field (as well as an out-of-band validity bit, see next section). +In general, valid non-integral pointers cannot be created from just an integer +value: while ``inttoptr`` yields a deterministic bitwise pattern, the resulting +value is not guaranteed to be a valid dereferenceable pointer. + +In most cases pointers with a non-integral representation behave exactly the +same as an integral pointer, the only difference is that it is not possible to +create a pointer just from an address. + +"Non-integral" pointers also impose restrictions on transformation passes, but +in general these are less restrictive than for "unstable" pointers. The main +difference compared to integral pointers is that ``inttoptr`` instructions +should not be inserted by passes as they may not be able to create a vali
[llvm-branch-commits] [llvm] [Flang-RT] Build libflang_rt.so (PR #121782)
https://github.com/Meinersbur updated https://github.com/llvm/llvm-project/pull/121782 >From a3037ab5557dcc4a4deb5bb40f801ca9770e3854 Mon Sep 17 00:00:00 2001 From: Michael Kruse Date: Mon, 6 Jan 2025 16:44:08 +0100 Subject: [PATCH 1/2] Add FLANG_RT_ENABLE_STATIC and FLANG_RT_ENABLE_SHARED --- flang-rt/CMakeLists.txt | 30 ++ flang-rt/cmake/modules/AddFlangRT.cmake | 291 -- .../cmake/modules/AddFlangRTOffload.cmake | 8 +- flang-rt/cmake/modules/GetToolchainDirs.cmake | 254 +++ flang-rt/lib/flang_rt/CMakeLists.txt | 20 +- flang-rt/test/CMakeLists.txt | 2 +- flang-rt/test/lit.cfg.py | 2 +- 7 files changed, 366 insertions(+), 241 deletions(-) diff --git a/flang-rt/CMakeLists.txt b/flang-rt/CMakeLists.txt index 7b3d22e454a108..7effa6012a078f 100644 --- a/flang-rt/CMakeLists.txt +++ b/flang-rt/CMakeLists.txt @@ -113,6 +113,15 @@ cmake_path(NORMAL_PATH FLANG_RT_OUTPUT_RESOURCE_DIR) cmake_path(NORMAL_PATH FLANG_RT_INSTALL_RESOURCE_PATH) # Determine subdirectories for build output and install destinations. +# FIXME: For the libflang_rt.so, the toolchain resource lib dir is not a good +#destination because it is not a ld.so default search path. +#The machine where the executable is eventually executed may not be the +#machine where the Flang compiler and its resource dir is installed, so +#setting RPath by the driver is not an solution. It should belong into +#/usr/lib//libflang_rt.so, like e.g. libgcc_s.so. +#But the linker as invoked by the Flang driver also requires +#libflang_rt.so to be found when linking and the resource lib dir is +#the only reliable location. 
get_toolchain_library_subdir(toolchain_lib_subdir) extend_path(FLANG_RT_OUTPUT_RESOURCE_LIB_DIR "${FLANG_RT_OUTPUT_RESOURCE_DIR}" "${toolchain_lib_subdir}") extend_path(FLANG_RT_INSTALL_RESOURCE_LIB_PATH "${FLANG_RT_INSTALL_RESOURCE_PATH}" "${toolchain_lib_subdir}") @@ -130,6 +139,27 @@ cmake_path(NORMAL_PATH FLANG_RT_INSTALL_RESOURCE_LIB_PATH) option(FLANG_RT_INCLUDE_TESTS "Generate build targets for the flang-rt unit and regression-tests." "${LLVM_INCLUDE_TESTS}") +option(FLANG_RT_ENABLE_STATIC "Build Flang-RT as a static library." ON) +if (WIN32) + # Windows DLL currently not implemented. + set(FLANG_RT_ENABLE_SHARED OFF) +else () + # TODO: Enable by default to increase test coverage, and which version of the + # library should be the user's choice anyway. + # Currently, the Flang driver adds `-L"libdir" -lflang_rt` as linker + # argument, which leaves the choice which library to use to the linker. + # Since most linkers prefer the shared library, this would constitute a + # breaking change unless the driver is changed. + option(FLANG_RT_ENABLE_SHARED "Build Flang-RT as a shared library." OFF) +endif () +if (NOT FLANG_RT_ENABLE_STATIC AND NOT FLANG_RT_ENABLE_SHARED) + message(FATAL_ERROR " + Must build at least one type of library + (FLANG_RT_ENABLE_STATIC=ON, FLANG_RT_ENABLE_SHARED=ON, or both) +") +endif () + + set(FLANG_RT_EXPERIMENTAL_OFFLOAD_SUPPORT "" CACHE STRING "Compile Flang-RT with GPU support (CUDA or OpenMP)") set_property(CACHE FLANG_RT_EXPERIMENTAL_OFFLOAD_SUPPORT PROPERTY STRINGS "" diff --git a/flang-rt/cmake/modules/AddFlangRT.cmake b/flang-rt/cmake/modules/AddFlangRT.cmake index 1f8b5111433825..5f493a80c35f20 100644 --- a/flang-rt/cmake/modules/AddFlangRT.cmake +++ b/flang-rt/cmake/modules/AddFlangRT.cmake @@ -16,7 +16,8 @@ # STATIC # Build a static (.a/.lib) library # OBJECT -# Create only object files without static/dynamic library +# Always create an object library. +# Without SHARED/STATIC, build only the object library. 
# INSTALL_WITH_TOOLCHAIN # Install library into Clang's resource directory so it can be found by the # Flang driver during compilation, including tests @@ -44,17 +45,73 @@ function (add_flangrt_library name) ") endif () - # Forward libtype to add_library - set(extra_args "") - if (ARG_SHARED) -list(APPEND extra_args SHARED) + # Internal names of libraries. If called with just single type option, use + # the default name for it. Name of targets must only depend on function + # arguments to be predictable for callers. + set(name_static "${name}.static") + set(name_shared "${name}.shared") + set(name_object "obj.${name}") + if (ARG_STATIC AND NOT ARG_SHARED) +set(name_static "${name}") + elseif (NOT ARG_STATIC AND ARG_SHARED) +set(name_shared "${name}") + elseif (NOT ARG_STATIC AND NOT ARG_SHARED AND ARG_OBJECT) +set(name_object "${name}") + elseif (NOT ARG_STATIC AND NOT ARG_SHARED AND NOT ARG_OBJECT) +# Only one of them will actually be built. +set(name_static "${name}") +set(name_shared "${name}") + endif () + + # Determine what to build. If not explicitly
[llvm-branch-commits] [clang] [flang] [lld] [llvm] [Flang] LLVM_ENABLE_RUNTIMES=flang-rt (PR #110217)
Meinersbur wrote:

> The library is present under
>
> ```
> $PREFIX/flang-rt/lib/x86_64-unknown-linux-gnu/libflang_rt.a
> ```
>
> (where it got installed through the changes in this PR, without any specific
> overrides).

If `$PREFIX` is `CMAKE_INSTALL_DIR`, it is the wrong location. It should be in `$PREFIX/lib/clang/20/lib/x86_64-unknown-linux-gnu/libflang_rt.a`. I don't see how the `flang_rt` dir could get in there. Can you try with the latest update?

If `$PREFIX` is CMake's build directory, are you using a runtimes-standalone (non-bootstrap) build and running flang from there? Then `$PREFIX` is different for flang and flang-rt. `flang` looks into its own build directory only. You will have to pass `-L` pointing to that directory or install them into the same `CMAKE_INSTALL_DIR`.

https://github.com/llvm/llvm-project/pull/110217
[llvm-branch-commits] [clang] [flang] [lld] [llvm] [Flang] LLVM_ENABLE_RUNTIMES=flang-rt (PR #110217)
@@ -0,0 +1,232 @@ +#===-- CMakeLists.txt --===# +# +# Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +# See https://llvm.org/LICENSE.txt for license information. +# SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +# +#======# +# +# Build instructions for the flang-rt library. This is file is intended to be +# included using the LLVM_ENABLE_RUNTIMES mechanism. +# +#======# + +if (NOT LLVM_RUNTIMES_BUILD) + message(FATAL_ERROR "Use this CMakeLists.txt from LLVM's runtimes build system. + Example: +cmake /runtimes -DLLVM_ENABLE_RUNTIMES=flang-rt +") +endif () + +set(LLVM_SUBPROJECT_TITLE "Flang-RT") +set(FLANG_RT_SOURCE_DIR "${CMAKE_CURRENT_SOURCE_DIR}") +set(FLANG_RT_BINARY_DIR "${CMAKE_CURRENT_BINARY_DIR}") +set(FLANG_SOURCE_DIR "${CMAKE_CURRENT_SOURCE_DIR}/../flang") + + +# CMake 3.24 is the first version of CMake that directly recognizes Flang. +# LLVM's requirement is only CMake 3.20, teach CMake 3.20-3.23 how to use Flang. +if (CMAKE_VERSION VERSION_LESS "3.24") + cmake_path(GET CMAKE_Fortran_COMPILER STEM _Fortran_COMPILER_STEM) + if (_Fortran_COMPILER_STEM STREQUAL "flang-new" OR _Fortran_COMPILER_STEM STREQUAL "flang") +include(CMakeForceCompiler) +CMAKE_FORCE_Fortran_COMPILER("${CMAKE_Fortran_COMPILER}" "LLVMFlang") + +set(CMAKE_Fortran_COMPILER_ID "LLVMFlang") +set(CMAKE_Fortran_COMPILER_VERSION "${LLVM_VERSION_MAJOR}.${LLVM_VERSION_MINOR}") + +set(CMAKE_Fortran_SUBMODULE_SEP "-") +set(CMAKE_Fortran_SUBMODULE_EXT ".mod") + +set(CMAKE_Fortran_PREPROCESS_SOURCE + " -cpp-E > ") + +set(CMAKE_Fortran_FORMAT_FIXED_FLAG "-ffixed-form") +set(CMAKE_Fortran_FORMAT_FREE_FLAG "-ffree-form") + +set(CMAKE_Fortran_MODDIR_FLAG "-module-dir") + +set(CMAKE_Fortran_COMPILE_OPTIONS_PREPROCESS_ON "-cpp") +set(CMAKE_Fortran_COMPILE_OPTIONS_PREPROCESS_OFF "-nocpp") +set(CMAKE_Fortran_POSTPROCESS_FLAG "-ffixed-line-length-72") + +set(CMAKE_Fortran_COMPILE_OPTIONS_TARGET "--target=") + +set(CMAKE_Fortran_LINKER_WRAPPER_FLAG "-Wl,") 
+set(CMAKE_Fortran_LINKER_WRAPPER_FLAG_SEP ",") + endif () +endif () +enable_language(Fortran) + + +list(APPEND CMAKE_MODULE_PATH +"${FLANG_RT_SOURCE_DIR}/cmake/modules" +"${FLANG_SOURCE_DIR}/cmake/modules" + ) +include(AddFlangRT) +include(GetToolchainDirs) +include(FlangCommon) +include(HandleCompilerRT) +include(ExtendPath) +include(GNUInstallDirs) + + + +# Build Mode Introspection # + + +# Determine whether we are in the runtimes/runtimes-bins directory of a +# bootstrap build. +set(LLVM_TREE_AVAILABLE OFF) +if (LLVM_LIBRARY_OUTPUT_INTDIR AND LLVM_RUNTIME_OUTPUT_INTDIR AND PACKAGE_VERSION) + set(LLVM_TREE_AVAILABLE ON) +endif() + +# Path to LLVM development tools (FileCheck, llvm-lit, not, ...) +set(LLVM_TOOLS_DIR "${LLVM_BINARY_DIR}/bin") + +# Determine build and install paths. +# The build path is absolute, but the install dir is relative, CMake's install +# command has to apply CMAKE_INSTALL_PREFIX itself. +if (LLVM_TREE_AVAILABLE) + # In a bootstrap build emit the libraries into a default search path in the + # build directory of the just-built compiler. This allows using the + # just-built compiler without specifying paths to runtime libraries. + # + # Despite Clang in the name, get_clang_resource_dir does not depend on Clang + # being added to the build. Flang uses the same resource dir as clang. + include(GetClangResourceDir) + get_clang_resource_dir(FLANG_RT_OUTPUT_RESOURCE_DIR PREFIX "${LLVM_LIBRARY_OUTPUT_INTDIR}/..")

Meinersbur wrote:

`lib64` is determined by [`CMAKE_INSTALL_LIBDIR`](https://cmake.org/cmake/help/latest/module/GNUInstallDirs.html) which seems the correct way to do it. If clang does not recognize it, I will need to hardcode `lib`.

https://github.com/llvm/llvm-project/pull/110217
[llvm-branch-commits] [llvm] AMDGPU: Reduce 64-bit add width if low bits are known 0 (PR #122049)
jayfoad wrote:

Why doesn't this fall out naturally from splitting the 64-bit add into 32-bit parts and then simplifying each part? Do we leave it as a 64-bit add all the way until final instruction selection?

https://github.com/llvm/llvm-project/pull/122049
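For readers following the question above, here is a minimal C++ sketch of the arithmetic being discussed. It is not the AMDGPU lowering and does not reflect how the backend implements the combine; the helper names `add64_split` and `add64_low_zero` are hypothetical. It only shows why a 64-bit add split into 32-bit halves simplifies when the low 32 bits of one operand are known to be zero: the low half is a plain copy, no carry is produced, and a single 32-bit add of the high halves remains.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: a generic 64-bit add written as two 32-bit adds,
// with the carry out of the low half fed into the high half.
static uint64_t add64_split(uint64_t a, uint64_t b) {
  uint32_t a_lo = static_cast<uint32_t>(a);
  uint32_t a_hi = static_cast<uint32_t>(a >> 32);
  uint32_t b_lo = static_cast<uint32_t>(b);
  uint32_t b_hi = static_cast<uint32_t>(b >> 32);

  uint32_t lo = a_lo + b_lo;                // wraps modulo 2^32
  uint32_t carry = (lo < a_lo) ? 1u : 0u;   // unsigned overflow check
  uint32_t hi = a_hi + b_hi + carry;
  return (static_cast<uint64_t>(hi) << 32) | lo;
}

// Hypothetical helper: the simplified form, valid only when the low
// 32 bits of b are zero. Adding zero to the low half cannot carry, so
// only the high halves need an add.
static uint64_t add64_low_zero(uint64_t a, uint64_t b) {
  assert((b & 0xffffffffULL) == 0 && "low 32 bits of b must be zero");
  uint32_t lo = static_cast<uint32_t>(a);   // low half passes through
  uint32_t hi = static_cast<uint32_t>(a >> 32) + static_cast<uint32_t>(b >> 32);
  return (static_cast<uint64_t>(hi) << 32) | lo;
}
```

Both helpers agree with a native `uint64_t` addition whenever the precondition on `b` holds, including when the high-half add wraps.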
[llvm-branch-commits] [mlir] [MLIR][OpenMP] Add host_eval clause to omp.target (PR #116049)
@@ -1166,9 +1166,10 @@ def TargetOp : OpenMP_Op<"target", traits = [
 ], clauses = [
 // TODO: Complete clause list (defaultmap, uses_allocators).
 OpenMP_AllocateClause, OpenMP_DependClause, OpenMP_DeviceClause,
-OpenMP_HasDeviceAddrClause, OpenMP_IfClause, OpenMP_InReductionClause,
-OpenMP_IsDevicePtrClause, OpenMP_MapClauseSkip,
-OpenMP_NowaitClause, OpenMP_PrivateClause, OpenMP_ThreadLimitClause
+OpenMP_HasDeviceAddrClause, OpenMP_HostEvalClause, OpenMP_IfClause,

anchuraj wrote:

I don't think it is a `Pseudo` clause; I think it's more of an `Internal` / `Private` clause. I am not able to come up with better alternatives. Since you have documented the clause well, please feel free to skip the suggestion.

https://github.com/llvm/llvm-project/pull/116049