[llvm-branch-commits] [llvm] [LoongArch] Avoid scheduling relaxable code sequence and attach relax relocs (PR #121330)

2025-01-08 Thread Zhaoxin Yang via llvm-branch-commits


@@ -443,6 +443,89 @@ bool LoongArchInstrInfo::isSchedulingBoundary(const MachineInstr &MI,
 break;
   }
 
+  const auto &STI = MF.getSubtarget();
+  if (STI.hasFeature(LoongArch::FeatureRelax)) {
+// When linker relaxation enabled, the following instruction patterns are
+// prohibited from being reordered:
+//
+// * pcalau12i $a0, %pc_hi20(s)
+//   addi.w/d $a0, $a0, %pc_lo12(s)
+//
+// * pcalau12i $a0, %got_pc_hi20(s)
+//   ld.w/d $a0, $a0, %got_pc_lo12(s)
+//
+// * pcalau12i $a0, %ie_pc_hi20(s)
+//   ld.w/d $a0, $a0, %ie_pc_lo12(s)

ylzsx wrote:

I think the TLS IE sequence can be scheduled.

https://github.com/llvm/llvm-project/pull/121330
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [LoongArch] Avoid scheduling relaxable code sequence and attach relax relocs (PR #121330)

2025-01-08 Thread via llvm-branch-commits

https://github.com/zhaoqi5 updated 
https://github.com/llvm/llvm-project/pull/121330

>From 85be5541a23a859ad8e50bd75fb7ff35985c5988 Mon Sep 17 00:00:00 2001
From: Qi Zhao 
Date: Tue, 24 Dec 2024 11:03:23 +0800
Subject: [PATCH 1/2] [LoongArch] Avoid scheduling relaxable code sequence and
 attach relax relocs

If linker relaxation is enabled, relaxable code sequences expanded
from pseudos should not be separated by instruction scheduling.
This commit marks them as scheduling boundaries so that they are not
reordered. (`tls_le` and `call36/tail36` are excepted: `tls_le` can be
scheduled without affecting relaxation, and `call36/tail36` are
expanded later, in the `LoongArchExpandPseudo` pass.)

A new mask target-flag is added to attach relax relocs to the
relaxable code sequences. (It is not needed for `tls_le` and
`call36/tail36`, for the reasons given above.) Because of this,
the "direct" flags must be extracted when using their target-flags.
In addition, a code sequence optimized by the `MergeBaseOffset`
pass may no longer be relaxable, so the relax "bitmask" flag should
be removed from it.
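
As a rough illustration of the mask-flag scheme described above (not the code
from this patch), the sketch below keeps the "direct" flag in the low bits and
uses one high bit as the relax mask. The enumerator values, the `sketch`
namespace, and every helper except `addRelaxFlag` (which the diff calls as
`LoongArchII::addRelaxFlag`) are assumptions made up for the example; the real
definitions are in `LoongArchBaseInfo.h` and may differ.

```cpp
// Hypothetical flag layout, for illustration only.
namespace sketch {
enum : unsigned {
  MO_None = 0,
  MO_PCREL_HI = 1,
  MO_PCREL_LO = 2,
  MO_GOT_PC_HI = 3,
  // One high bit acts as the relax "bitmask"; it never overlaps a direct flag.
  MO_RELAX = 1u << 7,
  MO_DIRECT_FLAG_MASK = MO_RELAX - 1,
};

// Mark an operand as part of a relaxable sequence.
constexpr unsigned addRelaxFlag(unsigned Flags) { return Flags | MO_RELAX; }

// Drop the mark, e.g. once a later pass has made the sequence non-relaxable.
constexpr unsigned removeRelaxFlag(unsigned Flags) { return Flags & ~MO_RELAX; }

// Recover the direct flag that selects the relocation kind.
constexpr unsigned getDirectFlags(unsigned Flags) {
  return Flags & MO_DIRECT_FLAG_MASK;
}

constexpr bool hasRelaxFlag(unsigned Flags) { return (Flags & MO_RELAX) != 0; }

// Adding the mask must not change which relocation the operand lowers to.
static_assert(getDirectFlags(addRelaxFlag(MO_PCREL_HI)) == MO_PCREL_HI,
              "direct flag survives the relax mask");
static_assert(removeRelaxFlag(addRelaxFlag(MO_PCREL_LO)) == MO_PCREL_LO,
              "removing the mask restores the original flags");
} // namespace sketch
```

With a layout like this, code that switches on the target flag would compare
against `getDirectFlags(Flags)` rather than the raw value, which is presumably
why the commit message says the "direct" flags have to be extracted first.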
---
 .../LoongArch/LoongArchExpandPseudoInsts.cpp  |  34 --
 .../Target/LoongArch/LoongArchInstrInfo.cpp   |  99 -
 .../lib/Target/LoongArch/LoongArchInstrInfo.h |   3 +
 .../Target/LoongArch/LoongArchMCInstLower.cpp |   4 +-
 .../LoongArch/LoongArchMergeBaseOffset.cpp|  30 +-
 .../LoongArch/LoongArchTargetMachine.cpp  |   1 +
 .../MCTargetDesc/LoongArchBaseInfo.h  |  22 
 .../MCTargetDesc/LoongArchMCCodeEmitter.cpp   |   1 +
 .../CodeGen/LoongArch/linker-relaxation.ll| 102 ++
 .../test/CodeGen/LoongArch/mir-relax-flags.ll |  64 +++
 .../CodeGen/LoongArch/mir-target-flags.ll |  31 +-
 11 files changed, 370 insertions(+), 21 deletions(-)
 create mode 100644 llvm/test/CodeGen/LoongArch/linker-relaxation.ll
 create mode 100644 llvm/test/CodeGen/LoongArch/mir-relax-flags.ll

diff --git a/llvm/lib/Target/LoongArch/LoongArchExpandPseudoInsts.cpp b/llvm/lib/Target/LoongArch/LoongArchExpandPseudoInsts.cpp
index 0218934ea3344a..be60de3d63d061 100644
--- a/llvm/lib/Target/LoongArch/LoongArchExpandPseudoInsts.cpp
+++ b/llvm/lib/Target/LoongArch/LoongArchExpandPseudoInsts.cpp
@@ -187,18 +187,23 @@ bool LoongArchPreRAExpandPseudo::expandPcalau12iInstPair(
   MachineInstr &MI = *MBBI;
   DebugLoc DL = MI.getDebugLoc();
 
+  const auto &STI = MF->getSubtarget();
+  bool EnableRelax = STI.hasFeature(LoongArch::FeatureRelax);
+
   Register DestReg = MI.getOperand(0).getReg();
   Register ScratchReg =
   MF->getRegInfo().createVirtualRegister(&LoongArch::GPRRegClass);
   MachineOperand &Symbol = MI.getOperand(1);
 
   BuildMI(MBB, MBBI, DL, TII->get(LoongArch::PCALAU12I), ScratchReg)
-  .addDisp(Symbol, 0, FlagsHi);
+  .addDisp(Symbol, 0,
+   EnableRelax ? LoongArchII::addRelaxFlag(FlagsHi) : FlagsHi);
 
   MachineInstr *SecondMI =
   BuildMI(MBB, MBBI, DL, TII->get(SecondOpcode), DestReg)
   .addReg(ScratchReg)
-  .addDisp(Symbol, 0, FlagsLo);
+  .addDisp(Symbol, 0,
+   EnableRelax ? LoongArchII::addRelaxFlag(FlagsLo) : FlagsLo);
 
   if (MI.hasOneMemOperand())
 SecondMI->addMemOperand(*MF, *MI.memoperands_begin());
@@ -481,6 +486,7 @@ bool LoongArchPreRAExpandPseudo::expandLoadAddressTLSDesc(
   unsigned ADD = STI.is64Bit() ? LoongArch::ADD_D : LoongArch::ADD_W;
   unsigned ADDI = STI.is64Bit() ? LoongArch::ADDI_D : LoongArch::ADDI_W;
   unsigned LD = STI.is64Bit() ? LoongArch::LD_D : LoongArch::LD_W;
+  bool EnableRelax = STI.hasFeature(LoongArch::FeatureRelax);
 
   Register DestReg = MI.getOperand(0).getReg();
   Register Tmp1Reg =
@@ -488,7 +494,10 @@ bool LoongArchPreRAExpandPseudo::expandLoadAddressTLSDesc(
   MachineOperand &Symbol = MI.getOperand(Large ? 2 : 1);
 
   BuildMI(MBB, MBBI, DL, TII->get(LoongArch::PCALAU12I), Tmp1Reg)
-  .addDisp(Symbol, 0, LoongArchII::MO_DESC_PC_HI);
+  .addDisp(Symbol, 0,
+   (EnableRelax && !Large)
+   ? LoongArchII::addRelaxFlag(LoongArchII::MO_DESC_PC_HI)
+   : LoongArchII::MO_DESC_PC_HI);
 
   if (Large) {
 // Code Sequence:
@@ -526,19 +535,28 @@ bool LoongArchPreRAExpandPseudo::expandLoadAddressTLSDesc(
 // pcalau12i $a0, %desc_pc_hi20(sym)
 // addi.w/d  $a0, $a0, %desc_pc_lo12(sym)
 // ld.w/d$ra, $a0, %desc_ld(sym)
-// jirl  $ra, $ra, %desc_ld(sym)
-// add.d $dst, $a0, $tp
+// jirl  $ra, $ra, %desc_call(sym)
+// add.w/d   $dst, $a0, $tp
 BuildMI(MBB, MBBI, DL, TII->get(ADDI), LoongArch::R4)
 .addReg(Tmp1Reg)
-.addDisp(Symbol, 0, LoongArchII::MO_DESC_PC_LO);
+.addDisp(Symbol, 0,
+ EnableRelax
+ ? LoongArchII::addRelaxFlag(LoongArchII::MO_DESC_PC_LO)
+ : LoongArchII::MO_DESC_PC_LO);
   }
 
   BuildMI(MBB, MBBI, DL, TII->get(LD), LoongArch::R1)
   .addReg(LoongArch::

[llvm-branch-commits] [llvm] [LoongArch] Avoid scheduling relaxable code sequence and attach relax relocs (PR #121330)

2025-01-08 Thread via llvm-branch-commits


@@ -443,6 +443,89 @@ bool LoongArchInstrInfo::isSchedulingBoundary(const MachineInstr &MI,
 break;
   }
 
+  const auto &STI = MF.getSubtarget();
+  if (STI.hasFeature(LoongArch::FeatureRelax)) {
+// When linker relaxation enabled, the following instruction patterns are
+// prohibited from being reordered:
+//
+// * pcalau12i $a0, %pc_hi20(s)
+//   addi.w/d $a0, $a0, %pc_lo12(s)
+//
+// * pcalau12i $a0, %got_pc_hi20(s)
+//   ld.w/d $a0, $a0, %got_pc_lo12(s)
+//
+// * pcalau12i $a0, %ie_pc_hi20(s)
+//   ld.w/d $a0, $a0, %ie_pc_lo12(s)

zhaoqi5 wrote:

Great! It was my misunderstanding. Thanks.

https://github.com/llvm/llvm-project/pull/121330


[llvm-branch-commits] [llvm] [LoongArch] Avoid scheduling relaxable code sequence and attach relax relocs (PR #121330)

2025-01-08 Thread via llvm-branch-commits


@@ -187,18 +187,23 @@ bool LoongArchPreRAExpandPseudo::expandPcalau12iInstPair(
   MachineInstr &MI = *MBBI;
   DebugLoc DL = MI.getDebugLoc();
 
+  const auto &STI = MF->getSubtarget();
+  bool EnableRelax = STI.hasFeature(LoongArch::FeatureRelax);
+
   Register DestReg = MI.getOperand(0).getReg();
   Register ScratchReg =
   MF->getRegInfo().createVirtualRegister(&LoongArch::GPRRegClass);
   MachineOperand &Symbol = MI.getOperand(1);
 
   BuildMI(MBB, MBBI, DL, TII->get(LoongArch::PCALAU12I), ScratchReg)
-  .addDisp(Symbol, 0, FlagsHi);
+  .addDisp(Symbol, 0,
+   EnableRelax ? LoongArchII::addRelaxFlag(FlagsHi) : FlagsHi);

zhaoqi5 wrote:

This is indeed better. Thanks.

https://github.com/llvm/llvm-project/pull/121330


[llvm-branch-commits] [llvm] AMDGPU: Reduce 64-bit add width if low bits are known 0 (PR #122049)

2025-01-08 Thread Christudasan Devadasan via llvm-branch-commits

https://github.com/cdevadas approved this pull request.


https://github.com/llvm/llvm-project/pull/122049


[llvm-branch-commits] [llvm] [LoongArch] Avoid scheduling relaxable code sequence and attach relax relocs (PR #121330)

2025-01-08 Thread via llvm-branch-commits

https://github.com/zhaoqi5 edited 
https://github.com/llvm/llvm-project/pull/121330


[llvm-branch-commits] [flang] [Flang] Introduce FortranSupport (PR #122069)

2025-01-08 Thread via llvm-branch-commits

github-actions[bot] wrote:

:warning: C/C++ code formatter, clang-format found issues in your code. :warning:

You can test this locally with the following command:


``bash
git-clang-format --diff a08aa48fb4955f9d16c6172580505c100076b5d4 
2023940bffc9c717e44266134b4f63f04f65f762 --extensions h,cpp -- 
flang/include/flang/Common/fast-int-set.h flang/include/flang/Evaluate/call.h 
flang/include/flang/Evaluate/characteristics.h 
flang/include/flang/Evaluate/common.h flang/include/flang/Evaluate/constant.h 
flang/include/flang/Evaluate/expression.h 
flang/include/flang/Evaluate/formatting.h 
flang/include/flang/Evaluate/intrinsics.h flang/include/flang/Evaluate/shape.h 
flang/include/flang/Evaluate/target.h flang/include/flang/Evaluate/tools.h 
flang/include/flang/Evaluate/traverse.h flang/include/flang/Evaluate/type.h 
flang/include/flang/Evaluate/variable.h 
flang/include/flang/Frontend/CompilerInvocation.h 
flang/include/flang/Frontend/FrontendOptions.h 
flang/include/flang/ISO_Fortran_binding.h 
flang/include/flang/Lower/AbstractConverter.h 
flang/include/flang/Lower/Bridge.h flang/include/flang/Lower/CallInterface.h 
flang/include/flang/Lower/ConvertType.h 
flang/include/flang/Lower/LoweringOptions.h 
flang/include/flang/Lower/PFTBuilder.h 
flang/include/flang/Lower/Support/Utils.h flang/include/flang/Lower/SymbolMap.h 
flang/include/flang/Optimizer/Builder/FIRBuilder.h 
flang/include/flang/Optimizer/Builder/PPCIntrinsicCall.h 
flang/include/flang/Optimizer/Builder/Runtime/RTBuilder.h 
flang/include/flang/Optimizer/CodeGen/DescriptorModel.h 
flang/include/flang/Optimizer/Dialect/CUF/Attributes/CUFAttr.h 
flang/include/flang/Optimizer/Support/TypeCode.h 
flang/include/flang/Optimizer/Support/Utils.h 
flang/include/flang/Parser/char-block.h 
flang/include/flang/Parser/dump-parse-tree.h 
flang/include/flang/Parser/message.h flang/include/flang/Parser/parse-state.h 
flang/include/flang/Parser/parse-tree.h flang/include/flang/Parser/parsing.h 
flang/include/flang/Parser/provenance.h flang/include/flang/Parser/source.h 
flang/include/flang/Parser/user-state.h 
flang/include/flang/Runtime/allocatable.h 
flang/include/flang/Runtime/descriptor-consts.h 
flang/include/flang/Runtime/descriptor.h flang/include/flang/Runtime/io-api.h 
flang/include/flang/Runtime/pointer.h flang/include/flang/Runtime/random.h 
flang/include/flang/Runtime/support.h flang/include/flang/Runtime/type-code.h 
flang/include/flang/Semantics/expression.h 
flang/include/flang/Semantics/runtime-type-info.h 
flang/include/flang/Semantics/scope.h flang/include/flang/Semantics/semantics.h 
flang/include/flang/Semantics/symbol.h flang/include/flang/Semantics/tools.h 
flang/include/flang/Semantics/type.h 
flang/include/flang/Tools/CrossToolHelpers.h flang/lib/Evaluate/call.cpp 
flang/lib/Evaluate/characteristics.cpp flang/lib/Evaluate/fold-implementation.h 
flang/lib/Evaluate/formatting.cpp flang/lib/Evaluate/intrinsics-library.cpp 
flang/lib/Evaluate/intrinsics.cpp flang/lib/Evaluate/real.cpp 
flang/lib/Evaluate/shape.cpp flang/lib/Evaluate/target.cpp 
flang/lib/Frontend/CompilerInstance.cpp 
flang/lib/Frontend/CompilerInvocation.cpp 
flang/lib/Frontend/FrontendActions.cpp flang/lib/Lower/Bridge.cpp 
flang/lib/Lower/CallInterface.cpp flang/lib/Lower/ConvertExpr.cpp 
flang/lib/Lower/Mangler.cpp flang/lib/Optimizer/Builder/IntrinsicCall.cpp 
flang/lib/Optimizer/CodeGen/TypeConverter.cpp 
flang/lib/Optimizer/Dialect/FIRType.cpp 
flang/lib/Optimizer/Transforms/AddDebugInfo.cpp 
flang/lib/Optimizer/Transforms/AssumedRankOpConversion.cpp 
flang/lib/Optimizer/Transforms/CUFDeviceGlobal.cpp 
flang/lib/Optimizer/Transforms/CUFGPUToLLVMConversion.cpp 
flang/lib/Optimizer/Transforms/CUFOpConversion.cpp 
flang/lib/Optimizer/Transforms/ExternalNameConversion.cpp 
flang/lib/Optimizer/Transforms/LoopVersioning.cpp 
flang/lib/Optimizer/Transforms/SimplifyIntrinsics.cpp 
flang/lib/Optimizer/Transforms/StackReclaim.cpp 
flang/lib/Optimizer/Transforms/VScaleAttr.cpp flang/lib/Parser/basic-parsers.h 
flang/lib/Parser/parse-tree.cpp flang/lib/Parser/prescan.h 
flang/lib/Parser/unparse.cpp flang/lib/Semantics/assignment.h 
flang/lib/Semantics/check-case.cpp flang/lib/Semantics/check-coarray.cpp 
flang/lib/Semantics/check-cuda.cpp flang/lib/Semantics/check-data.h 
flang/lib/Semantics/check-do-forall.cpp flang/lib/Semantics/check-return.cpp 
flang/lib/Semantics/check-select-rank.cpp 
flang/lib/Semantics/check-select-type.cpp flang/lib/Semantics/check-stop.cpp 
flang/lib/Semantics/data-to-inits.h flang/lib/Semantics/expression.cpp 
flang/lib/Semantics/pointer-assignment.cpp 
flang/lib/Semantics/resolve-labels.cpp 
flang/lib/Semantics/resolve-names-utils.cpp 
flang/lib/Semantics/resolve-names.cpp 
flang/lib/Semantics/rewrite-parse-tree.cpp flang/lib/Semantics/semantics.cpp 
flang/lib/Semantics/tools.cpp flang/runtime/CUDA/allocator.cpp 
flang/runtime/ISO_Fortran_binding.cpp flang/runtime/ISO_Fortran_util.h 
flang/runtime/allocatable.cpp flang/runtime/stat.h 

[llvm-branch-commits] [llvm] AMDGPU: Reduce 64-bit add width if low bits are known 0 (PR #122049)

2025-01-08 Thread Matt Arsenault via llvm-branch-commits

arsenm wrote:

> Why doesn't this fall out naturally from splitting the 64-bit add into 32-bit 
> parts and then simplifying each part? Do we leave it as a 64-bit add all the 
> way until final instruction selection?

Yes. It gets selected to pseudos, which are split in the post-isel hook (I don't 
remember why this was moved away from just splitting during the actual selection).
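
To make the question concrete, here is a small standalone sketch of the
arithmetic behind the PR title. It is not the DAG combine from the patch, and
the function names are invented for the example. A 64-bit add split into
32-bit halves needs a carry from the low add into the high add; if the low
32 bits of one operand are known to be zero, the low add can never carry, so
only the high 32-bit add remains.

```cpp
#include <cassert>
#include <cstdint>

// Full 64-bit add built from two 32-bit adds with a carry between them.
inline std::uint64_t add64Split(std::uint64_t A, std::uint64_t B) {
  std::uint32_t LoA = static_cast<std::uint32_t>(A);
  std::uint32_t LoB = static_cast<std::uint32_t>(B);
  std::uint32_t HiA = static_cast<std::uint32_t>(A >> 32);
  std::uint32_t HiB = static_cast<std::uint32_t>(B >> 32);
  std::uint32_t Lo = LoA + LoB;               // wraps modulo 2^32
  std::uint32_t Carry = Lo < LoA ? 1u : 0u;   // unsigned overflow check
  std::uint32_t Hi = HiA + HiB + Carry;
  return (static_cast<std::uint64_t>(Hi) << 32) | Lo;
}

// Narrowed form, valid only when the low 32 bits of B are zero: the low half
// of the result is just the low half of A, and no carry can be produced.
inline std::uint64_t add64LowZero(std::uint64_t A, std::uint64_t B) {
  assert(static_cast<std::uint32_t>(B) == 0 && "low half of B must be zero");
  std::uint32_t Hi = static_cast<std::uint32_t>(A >> 32) +
                     static_cast<std::uint32_t>(B >> 32);
  return (static_cast<std::uint64_t>(Hi) << 32) | static_cast<std::uint32_t>(A);
}
```

Both functions agree with a plain `A + B` whenever the precondition holds, for
example with `A = 0x00000001FFFFFFFF` and `B = 0x0000000100000000`.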

https://github.com/llvm/llvm-project/pull/122049


[llvm-branch-commits] [llvm] [SelectionDAG][X86] Split <2 x T> vector types for atomic load (PR #120640)

2025-01-08 Thread Matt Arsenault via llvm-branch-commits


@@ -1391,6 +1394,38 @@ void DAGTypeLegalizer::SplitVectorResult(SDNode *N, unsigned ResNo) {
 SetSplitVector(SDValue(N, ResNo), Lo, Hi);
 }
 
+void DAGTypeLegalizer::SplitVecRes_ATOMIC_LOAD(AtomicSDNode *LD, SDValue &Lo,
+   SDValue &Hi) {
+  EVT LoVT, HiVT;
+  SDLoc dl(LD);
+  std::tie(LoVT, HiVT) = DAG.GetSplitDestVTs(LD->getValueType(0));
+
+  SDValue Ch = LD->getChain();
+  SDValue Ptr = LD->getBasePtr();
+  EVT MemoryVT = LD->getMemoryVT();
+
+  EVT LoMemVT, HiMemVT;
+  std::tie(LoMemVT, HiMemVT) = DAG.GetSplitDestVTs(MemoryVT);
+
+  Lo = DAG.getAtomic(ISD::ATOMIC_LOAD, dl, LoMemVT, LoMemVT, Ch, Ptr,

arsenm wrote:

This should create one ATOMIC_LOAD with the bitcast integer type. You then 
unpack that result into the expected Lo/Hi, rather than producing Lo/Hi 
directly as atomic results.
https://github.com/llvm/llvm-project/pull/120640


[llvm-branch-commits] [llvm] [SelectionDAG][X86] Split <2 x T> vector types for atomic load (PR #120640)

2025-01-08 Thread Matt Arsenault via llvm-branch-commits


@@ -1391,6 +1394,38 @@ void DAGTypeLegalizer::SplitVectorResult(SDNode *N, unsigned ResNo) {
 SetSplitVector(SDValue(N, ResNo), Lo, Hi);
 }
 
+void DAGTypeLegalizer::SplitVecRes_ATOMIC_LOAD(AtomicSDNode *LD, SDValue &Lo,
+   SDValue &Hi) {
+  EVT LoVT, HiVT;
+  SDLoc dl(LD);
+  std::tie(LoVT, HiVT) = DAG.GetSplitDestVTs(LD->getValueType(0));
+
+  SDValue Ch = LD->getChain();
+  SDValue Ptr = LD->getBasePtr();
+  EVT MemoryVT = LD->getMemoryVT();
+
+  EVT LoMemVT, HiMemVT;
+  std::tie(LoMemVT, HiMemVT) = DAG.GetSplitDestVTs(MemoryVT);
+
+  Lo = DAG.getAtomic(ISD::ATOMIC_LOAD, dl, LoMemVT, LoMemVT, Ch, Ptr,

arsenm wrote:

The title should also not say "split"; this is forcing the type to a legal integer.

https://github.com/llvm/llvm-project/pull/120640


[llvm-branch-commits] [llvm] [SelectionDAG][X86] Split <2 x T> vector types for atomic load (PR #120640)

2025-01-08 Thread Matt Arsenault via llvm-branch-commits

https://github.com/arsenm requested changes to this pull request.


https://github.com/llvm/llvm-project/pull/120640


[llvm-branch-commits] [llvm] [SelectionDAG][X86] Split <2 x T> vector types for atomic load (PR #120640)

2025-01-08 Thread Matt Arsenault via llvm-branch-commits


@@ -1146,6 +1146,9 @@ void DAGTypeLegalizer::SplitVectorResult(SDNode *N, unsigned ResNo) {
 SplitVecRes_STEP_VECTOR(N, Lo, Hi);
 break;
   case ISD::SIGN_EXTEND_INREG: SplitVecRes_InregOp(N, Lo, Hi); break;
+  case ISD::ATOMIC_LOAD:
+    SplitVecRes_ATOMIC_LOAD(cast<AtomicSDNode>(N), Lo, Hi);

arsenm wrote:

I don't understand the comment. This is unrelated to the set of legal types or 
legalization actions. There is no need to touch any patterns 

https://github.com/llvm/llvm-project/pull/120640


[llvm-branch-commits] [clang] [flang] [lld] [llvm] [Flang] LLVM_ENABLE_RUNTIMES=flang-rt (PR #110217)

2025-01-08 Thread Michael Kruse via llvm-branch-commits

Meinersbur wrote:

> > It is an old problem, see [#87866 
> > (comment)](https://github.com/llvm/llvm-project/pull/87866#issuecomment-2214034671)
> 
> Can we raise an issue for this?

Created #122152

I don't expect anything to come out of it; I think moving to 
`LLVM_ENABLE_PER_TARGET_RUNTIME_DIR` by default is deliberate.

https://github.com/llvm/llvm-project/pull/110217


[llvm-branch-commits] [flang] [llvm] [Flang][NFC] Move runtime library files to flang-rt. (PR #110298)

2025-01-08 Thread Michael Kruse via llvm-branch-commits

https://github.com/Meinersbur edited 
https://github.com/llvm/llvm-project/pull/110298


[llvm-branch-commits] [clang] [flang] [lld] [llvm] [Flang] LLVM_ENABLE_RUNTIMES=flang-rt (PR #110217)

2025-01-08 Thread Michael Kruse via llvm-branch-commits

https://github.com/Meinersbur edited 
https://github.com/llvm/llvm-project/pull/110217


[llvm-branch-commits] [clang] [flang] [lld] [llvm] [Flang] LLVM_ENABLE_RUNTIMES=flang-rt (PR #110217)

2025-01-08 Thread Michael Kruse via llvm-branch-commits

https://github.com/Meinersbur ready_for_review 
https://github.com/llvm/llvm-project/pull/110217


[llvm-branch-commits] [llvm] [AsmPrinter][TargetLowering]Place a hot jump table into a hot-suffixed section (PR #122215)

2025-01-08 Thread Mingming Liu via llvm-branch-commits

https://github.com/mingmingl-llvm created 
https://github.com/llvm/llvm-project/pull/122215

None

>From a2a6f9f5a6f7647f85a230241bf3aa39c4bd65d9 Mon Sep 17 00:00:00 2001
From: mingmingl 
Date: Wed, 8 Jan 2025 16:53:45 -0800
Subject: [PATCH] [AsmPrinter][TargetLowering]Place a hot jump table into a
 hot-suffixed section

---
 llvm/include/llvm/CodeGen/AsmPrinter.h|   8 +-
 .../CodeGen/TargetLoweringObjectFileImpl.h|   3 +
 .../llvm/Target/TargetLoweringObjectFile.h|   5 +
 llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp| 102 +-
 .../CodeGen/TargetLoweringObjectFileImpl.cpp  |  28 +++--
 llvm/lib/CodeGen/TargetPassConfig.cpp |   8 +-
 llvm/lib/Target/TargetLoweringObjectFile.cpp  |   6 ++
 llvm/test/CodeGen/X86/jump-table-partition.ll |   8 +-
 8 files changed, 130 insertions(+), 38 deletions(-)

diff --git a/llvm/include/llvm/CodeGen/AsmPrinter.h b/llvm/include/llvm/CodeGen/AsmPrinter.h
index c9a88d7b1c015c..9249d5adf3f6f7 100644
--- a/llvm/include/llvm/CodeGen/AsmPrinter.h
+++ b/llvm/include/llvm/CodeGen/AsmPrinter.h
@@ -453,6 +453,10 @@ class AsmPrinter : public MachineFunctionPass {
   /// function to the current output stream.
   virtual void emitJumpTableInfo();
 
+  virtual void emitJumpTables(const std::vector &JumpTableIndices,
+  MCSection *JumpTableSection, bool JTInDiffSection,
+  const MachineJumpTableInfo &MJTI);
+
   /// Emit the specified global variable to the .s file.
   virtual void emitGlobalVariable(const GlobalVariable *GV);
 
@@ -892,10 +896,10 @@ class AsmPrinter : public MachineFunctionPass {
   // Internal Implementation Details
   //===--===//
 
-  void emitJumpTableEntry(const MachineJumpTableInfo *MJTI,
+  void emitJumpTableEntry(const MachineJumpTableInfo &MJTI,
   const MachineBasicBlock *MBB, unsigned uid) const;
 
-  void emitJumpTableSizesSection(const MachineJumpTableInfo *MJTI,
+  void emitJumpTableSizesSection(const MachineJumpTableInfo &MJTI,
  const Function &F) const;
 
   void emitLLVMUsedList(const ConstantArray *InitList);
diff --git a/llvm/include/llvm/CodeGen/TargetLoweringObjectFileImpl.h b/llvm/include/llvm/CodeGen/TargetLoweringObjectFileImpl.h
index a2a9e5d499e527..3d48d380fcb245 100644
--- a/llvm/include/llvm/CodeGen/TargetLoweringObjectFileImpl.h
+++ b/llvm/include/llvm/CodeGen/TargetLoweringObjectFileImpl.h
@@ -74,6 +74,9 @@ class TargetLoweringObjectFileELF : public TargetLoweringObjectFile {
 
   MCSection *getSectionForJumpTable(const Function &F,
 const TargetMachine &TM) const override;
+  MCSection *
+  getSectionForJumpTable(const Function &F, const TargetMachine &TM,
+ const MachineJumpTableEntry *JTE) const override;
   MCSection *getSectionForLSDA(const Function &F, const MCSymbol &FnSym,
const TargetMachine &TM) const override;
 
diff --git a/llvm/include/llvm/Target/TargetLoweringObjectFile.h b/llvm/include/llvm/Target/TargetLoweringObjectFile.h
index 4864ba843f4886..577adc458fcbf1 100644
--- a/llvm/include/llvm/Target/TargetLoweringObjectFile.h
+++ b/llvm/include/llvm/Target/TargetLoweringObjectFile.h
@@ -27,6 +27,7 @@ class Function;
 class GlobalObject;
 class GlobalValue;
 class MachineBasicBlock;
+class MachineJumpTableEntry;
 class MachineModuleInfo;
 class Mangler;
 class MCContext;
@@ -132,6 +133,10 @@ class TargetLoweringObjectFile : public MCObjectFileInfo {
 
   virtual MCSection *getSectionForJumpTable(const Function &F,
 const TargetMachine &TM) const;
+  virtual MCSection *
+  getSectionForJumpTable(const Function &F, const TargetMachine &TM,
+ const MachineJumpTableEntry *JTE) const;
+
   virtual MCSection *getSectionForLSDA(const Function &, const MCSymbol &,
const TargetMachine &) const {
 return LSDASection;
diff --git a/llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp b/llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
index d34fe0e86c7495..b575cd7d993c39 100644
--- a/llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
+++ b/llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
 static cl::opt<bool> BBAddrMapSkipEmitBBEntries(
  "unnecessary for some PGOAnalysisMap features."),
 cl::Hidden, cl::init(false));
 
+static cl::opt<bool>
+EmitStaticDataHotnessSuffix("emit-static-data-hotness-suffix", cl::Hidden,
+cl::init(false), cl::ZeroOrMore,
+cl::desc("Emit static data hotness suffix"));
+
 static cl::opt EmitJumpTableSizesSection(
 "emit-jump-table-sizes-section",
 cl::desc("Emit a section containing jump table addresses and sizes"),
@@ -2861,7 +2866,6 @@ void AsmPrinter::emitConstantPool() {
 // Print assembly representations of the jump tabl

[llvm-branch-commits] [clang] Add documentation for Multilib custom flags (PR #114998)

2025-01-08 Thread Victor Campos via llvm-branch-commits

https://github.com/vhscampos updated 
https://github.com/llvm/llvm-project/pull/114998

>From 9fcdd1760ea664a618a2c05a18e777940a9d49b6 Mon Sep 17 00:00:00 2001
From: Victor Campos 
Date: Tue, 5 Nov 2024 14:22:06 +
Subject: [PATCH 1/4] Add documentation for Multilib custom flags

---
 clang/docs/Multilib.rst | 90 +
 1 file changed, 90 insertions(+)

diff --git a/clang/docs/Multilib.rst b/clang/docs/Multilib.rst
index 7637d0db9565b8..85cb789b9847ac 100644
--- a/clang/docs/Multilib.rst
+++ b/clang/docs/Multilib.rst
@@ -122,6 +122,78 @@ subclass and a suitable base multilib variant is present then the
 It is the responsibility of layered multilib authors to ensure that headers and
 libraries in each layer are complete enough to mask any incompatibilities.
 
+Multilib custom flags
+=====================
+
+Introduction
+------------
+
+The multilib mechanism supports library variants that correspond to target,
+code generation or language command-line flags. Examples include ``--target``,
+``-mcpu``, ``-mfpu``, ``-mbranch-protection``, ``-fno-rtti``. However, some library
+variants are particular to features that do not correspond to any command-line
+option. Multithreading and semihosting, for instance, have no associated
+compiler option.
+
+In order to support the selection of variants for which no compiler option
+exists, the multilib specification includes the concept of *custom flags*.
+These flags have no impact on code generation and are only used in the multilib
+processing.
+
+Multilib custom flags follow this format in the driver invocation:
+
+::
+
+  -fmultilib-flag=<value>
+
+They are fed into the multilib system alongside the remaining flags.
+
+Custom flag declarations
+------------------------
+
+Custom flags can be declared in the YAML file under the *Flags* section.
+
+.. code-block:: yaml
+
+  Flags:
+  - Name: multithreaded
+Values:
+- Name: no-multithreaded
+  DriverArgs: [-D__SINGLE_THREAD__]
+- Name: multithreaded
+Default: no-multithreaded
+
+* Name: the name to categorize a flag.
+* Values: a list of flag *Value*s (defined below).
+* Default: it specifies the name of the value this flag should take if not
+  specified in the command-line invocation. It must be one value from the Values
+  field.
+
+A Default value is useful to save users from specifying custom flags that have a
+most commonly used value.
+
+Each flag *Value* is defined as:
+
+* Name: name of the value. This is the string to be used in
+  ``-fmultilib-flag=<value>``.
+* DriverArgs: a list of strings corresponding to the extra driver arguments
+  used to build a library variant that's in accordance to this specific custom
+  flag value. These arguments are fed back into the driver if this flag *Value*
+  is enabled.
+
+The namespace of flag values is common across all flags. This means that flag
+value names must be unique.
+
+Usage of custom flags in the *Variants* specifications
+------------------------------------------------------
+
+Library variants should list their requirement on one or more custom flags like
+they do for any other flag. Each requirement must be listed as
+``-fmultilib-flag=<value>``.
+
+A variant that does not specify a requirement on one particular flag can be
+matched against any value of that flag.
+
 Stability
 =========
 
@@ -222,6 +294,24 @@ For a more comprehensive example see
 # Flags is a list of one or more strings.
 Flags: [--target=thumbv7m-none-eabi]
 
+  # Custom flag declarations. Each item is a different declaration.
+  Flags:
+# Name of the flag
+  - Name: multithreaded
+# List of custom flag values
+Values:
+  # Name of the custom flag value. To be used in -fmultilib-flag=<value>.
+- Name: no-multithreaded
+  # Extra driver arguments to be printed with -print-multi-lib. Useful for
+  # specifying extra arguments for building the associated library
+  # variant(s).
+  DriverArgs: [-D__SINGLE_THREAD__]
+- Name: multithreaded
+# Default flag value. If no value for this flag declaration is used in the
+# command-line, the multilib system will use this one. Must be equal to one
+# of the flag value names from this flag declaration.
+Default: no-multithreaded
+
 Design principles
 =================
 

>From 5799eb81ac94ec4131af146bfacdf44a9bebdd71 Mon Sep 17 00:00:00 2001
From: Victor Campos 
Date: Mon, 25 Nov 2024 15:07:57 +
Subject: [PATCH 2/4] Fix doc build warning

---
 clang/docs/Multilib.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/clang/docs/Multilib.rst b/clang/docs/Multilib.rst
index 85cb789b9847ac..48d84087dda01c 100644
--- a/clang/docs/Multilib.rst
+++ b/clang/docs/Multilib.rst
@@ -164,7 +164,7 @@ Custom flags can be declared in the YAML file under the *Flags* section.
 Default: no-multithreaded
 
 * Name: the name to categorize a flag.
-* Values: a list of flag *Value*s (defined below).
+* Values: a list of flag Values (defined b

[llvm-branch-commits] [clang] Add documentation for Multilib custom flags (PR #114998)

2025-01-08 Thread Victor Campos via llvm-branch-commits


@@ -122,6 +122,76 @@ subclass and a suitable base multilib variant is present then the
 It is the responsibility of layered multilib authors to ensure that headers and
 libraries in each layer are complete enough to mask any incompatibilities.
 
+Multilib custom flags
+=====================
+
+Introduction
+------------
+
+The multilib mechanism supports library variants that correspond to target,
+code generation or language command-line flags. Examples include ``--target``,
+``-mcpu``, ``-mfpu``, ``-mbranch-protection``, ``-fno-rtti``. However, some library
+variants are particular to features that do not correspond to any command-line
+option. Multithreading and semihosting, for instance, have no associated
+compiler option.
+
+In order to support the selection of variants for which no compiler option
+exists, the multilib specification includes the concept of *custom flags*.
+These flags have no impact on code generation and are only used in the multilib
+processing.
+
+Multilib custom flags follow this format in the driver invocation:
+
+::
+
+  -fmultilib-flag=<value>
+
+They are fed into the multilib system alongside the remaining flags.
+
+Custom flag declarations
+------------------------
+
+Custom flags can be declared in the YAML file under the *Flags* section.
+
+.. code-block:: yaml
+
+  Flags:
+  - Name: multithreaded
+Values:
+- Name: no-multithreaded
+  MacroDefines: [__SINGLE_THREAD__]
+- Name: multithreaded
+Default: no-multithreaded
+
+* Name: the name to categorize a flag.
+* Values: a list of flag Values (defined below).
+* Default: it specifies the name of the value this flag should take if not
+  specified in the command-line invocation. It must be one value from the Values
+  field.
+
+A Default value is useful to save users from specifying custom flags that have a

vhscampos wrote:

Thanks. Fixed.

https://github.com/llvm/llvm-project/pull/114998


[llvm-branch-commits] [clang] Add documentation for Multilib custom flags (PR #114998)

2025-01-08 Thread Sam Elliott via llvm-branch-commits

https://github.com/lenary approved this pull request.

LGTM

https://github.com/llvm/llvm-project/pull/114998


[llvm-branch-commits] [clang] [flang] [lld] [llvm] [Flang] LLVM_ENABLE_RUNTIMES=flang-rt (PR #110217)

2025-01-08 Thread Joseph Huber via llvm-branch-commits


@@ -0,0 +1,232 @@
+#===-- CMakeLists.txt 
--===#
+#
+# Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+# See https://llvm.org/LICENSE.txt for license information.
+# SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+#
+#======#
+#
+# Build instructions for the flang-rt library. This is file is intended to be
+# included using the LLVM_ENABLE_RUNTIMES mechanism.
+#
+#======#
+
+if (NOT LLVM_RUNTIMES_BUILD)
+  message(FATAL_ERROR "Use this CMakeLists.txt from LLVM's runtimes build system.
+  Example:
+cmake /runtimes -DLLVM_ENABLE_RUNTIMES=flang-rt
+")
+endif ()
+
+set(LLVM_SUBPROJECT_TITLE "Flang-RT")
+set(FLANG_RT_SOURCE_DIR "${CMAKE_CURRENT_SOURCE_DIR}")
+set(FLANG_RT_BINARY_DIR "${CMAKE_CURRENT_BINARY_DIR}")
+set(FLANG_SOURCE_DIR "${CMAKE_CURRENT_SOURCE_DIR}/../flang")
+
+
+# CMake 3.24 is the first version of CMake that directly recognizes Flang.
+# LLVM's requirement is only CMake 3.20, teach CMake 3.20-3.23 how to use Flang.
+if (CMAKE_VERSION VERSION_LESS "3.24")
+  cmake_path(GET CMAKE_Fortran_COMPILER STEM _Fortran_COMPILER_STEM)
+  if (_Fortran_COMPILER_STEM STREQUAL "flang-new" OR _Fortran_COMPILER_STEM STREQUAL "flang")
+include(CMakeForceCompiler)
+CMAKE_FORCE_Fortran_COMPILER("${CMAKE_Fortran_COMPILER}" "LLVMFlang")
+
+set(CMAKE_Fortran_COMPILER_ID "LLVMFlang")
+set(CMAKE_Fortran_COMPILER_VERSION 
"${LLVM_VERSION_MAJOR}.${LLVM_VERSION_MINOR}")
+
+set(CMAKE_Fortran_SUBMODULE_SEP "-")
+set(CMAKE_Fortran_SUBMODULE_EXT ".mod")
+
+set(CMAKE_Fortran_PREPROCESS_SOURCE
+  " -cpp-E  
> ")
+
+set(CMAKE_Fortran_FORMAT_FIXED_FLAG "-ffixed-form")
+set(CMAKE_Fortran_FORMAT_FREE_FLAG "-ffree-form")
+
+set(CMAKE_Fortran_MODDIR_FLAG "-module-dir")
+
+set(CMAKE_Fortran_COMPILE_OPTIONS_PREPROCESS_ON "-cpp")
+set(CMAKE_Fortran_COMPILE_OPTIONS_PREPROCESS_OFF "-nocpp")
+set(CMAKE_Fortran_POSTPROCESS_FLAG "-ffixed-line-length-72")
+
+set(CMAKE_Fortran_COMPILE_OPTIONS_TARGET "--target=")
+
+set(CMAKE_Fortran_LINKER_WRAPPER_FLAG "-Wl,")
+set(CMAKE_Fortran_LINKER_WRAPPER_FLAG_SEP ",")
+  endif ()
+endif ()
+enable_language(Fortran)
+
+
+list(APPEND CMAKE_MODULE_PATH
+"${FLANG_RT_SOURCE_DIR}/cmake/modules"
+"${FLANG_SOURCE_DIR}/cmake/modules"
+  )
+include(AddFlangRT)
+include(GetToolchainDirs)
+include(FlangCommon)
+include(HandleCompilerRT)
+include(ExtendPath)
+include(GNUInstallDirs)
+
+
+
+# Build Mode Introspection #
+
+
+# Determine whether we are in the runtimes/runtimes-bins directory of a
+# bootstrap build.
+set(LLVM_TREE_AVAILABLE OFF)
+if (LLVM_LIBRARY_OUTPUT_INTDIR AND LLVM_RUNTIME_OUTPUT_INTDIR AND 
PACKAGE_VERSION)
+  set(LLVM_TREE_AVAILABLE ON)
+endif()
+
+# Path to LLVM development tools (FileCheck, llvm-lit, not, ...)
+set(LLVM_TOOLS_DIR "${LLVM_BINARY_DIR}/bin")
+
+# Determine build and install paths.
+# The build path is absolute, but the install dir is relative, CMake's install
+# command has to apply CMAKE_INSTALL_PREFIX itself.
+if (LLVM_TREE_AVAILABLE)
+  # In a bootstrap build emit the libraries into a default search path in the
+  # build directory of the just-built compiler. This allows using the
+  # just-built compiler without specifying paths to runtime libraries.
+  #
+  # Despite Clang in the name, get_clang_resource_dir does not depend on Clang
+  # being added to the build. Flang uses the same resource dir as clang.
+  include(GetClangResourceDir)
+  get_clang_resource_dir(FLANG_RT_OUTPUT_RESOURCE_DIR PREFIX 
"${LLVM_LIBRARY_OUTPUT_INTDIR}/..")

jhuber6 wrote:

The `lib64/` thing seems weird. Is anything else installed there? 
`-DLLVM_ENABLE_PER_TARGET_RUNTIME_DIR=ON` is the default for Linux and is what 
should lead to having `x86_64-unknown-linux-gnu` there, but I've never seen 
`lib64/` be qualified there as well.

https://github.com/llvm/llvm-project/pull/110217


[llvm-branch-commits] [clang] [flang] [lld] [llvm] [Flang] LLVM_ENABLE_RUNTIMES=flang-rt (PR #110217)

2025-01-08 Thread Michael Kruse via llvm-branch-commits

Meinersbur wrote:

> I think with >1600 commits and >300kLoC changes, something went wrong here 
> with the merging. As mentioned by myself and others, it would be good to 
> rebase this and condense the commits that belong into #110298 resp. this one.

This happens when I push a branch that has `origin/HEAD` merged into it before 
doing the same with the PR that it is based on. In this case I pushed the wrong 
branch, one of the PRs that has already been merged. Sorry about that. I have 
now written a script that pushes in the right order, so I hope this doesn't 
happen again.

It seems that I will not be able to apply 
https://github.com/llvm/llvm-zorg/pull/333 in a timely manner to minimize the 
time the buildbots are red. I am currently working on keeping the old CMake 
code that builds the runtime so that the buildbot builders can be updated 
iteratively. This has the side effect that I need to keep 
`flang/runtime/CMakeLists.txt` working, which I originally wanted to avoid, but 
it also allows splitting this patch into smaller PRs.

https://github.com/llvm/llvm-project/pull/110217


[llvm-branch-commits] [clang] [llvm] [mlir] [OMPIRBuilder] Introduce struct to hold default kernel teams/threads (PR #116050)

2025-01-08 Thread Kareem Ergawy via llvm-branch-commits

https://github.com/ergawy deleted 
https://github.com/llvm/llvm-project/pull/116050


[llvm-branch-commits] [clang] [llvm] [mlir] [OMPIRBuilder] Introduce struct to hold default kernel teams/threads (PR #116050)

2025-01-08 Thread Kareem Ergawy via llvm-branch-commits


@@ -6182,9 +6182,12 @@ TEST_F(OpenMPIRBuilderTest, TargetRegion) {
 
   TargetRegionEntryInfo EntryInfo("func", 42, 4711, 17);
   OpenMPIRBuilder::LocationDescription OmpLoc({Builder.saveIP(), DL});
-  OpenMPIRBuilder::InsertPointOrErrorTy AfterIP = OMPBuilder.createTarget(
-  OmpLoc, /*IsOffloadEntry=*/true, Builder.saveIP(), Builder.saveIP(),
-  EntryInfo, -1, 0, Inputs, GenMapInfoCB, BodyGenCB, SimpleArgAccessorCB);
+  OpenMPIRBuilder::TargetKernelDefaultAttrs DefaultAttrs = {
+  /*MaxTeams=*/{-1}, /*MinTeams=*/0, /*MaxThreads=*/{0}, /*MinThreads=*/0};

ergawy wrote:

This set of values is used in multiple locations to "default"-construct 
`TargetKernelDefaultAttrs`. Would it make sense to use this set of values as 
the default values in the struct? I might be missing why we need the current 
default struct values.

https://github.com/llvm/llvm-project/pull/116050


[llvm-branch-commits] [mlir] [MLIR][OpenMP] Add host_eval clause to omp.target (PR #116049)

2025-01-08 Thread Sergio Afonso via llvm-branch-commits

https://github.com/skatrak updated 
https://github.com/llvm/llvm-project/pull/116049

>From bd7fa379968210047a25e031a8385ff0c43a3fb7 Mon Sep 17 00:00:00 2001
From: Sergio Afonso 
Date: Fri, 8 Nov 2024 12:00:45 +
Subject: [PATCH] [MLIR][OpenMP] Add host_eval clause to omp.target

This patch adds the `host_eval` clause to the `omp.target` operation.
Additionally, it updates its op verifier to make sure all uses of block
arguments defined by this clause fall within one of the few cases where they
are allowed.

MLIR to LLVM IR translation fails on translation of this clause with a
not-yet-implemented error.
---
 mlir/docs/Dialects/OpenMPDialect/_index.md|  58 -
 .../mlir/Dialect/OpenMP/OpenMPDialect.h   |   1 +
 mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td |  33 ++-
 mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp  | 206 +-
 .../OpenMP/OpenMPToLLVMIRTranslation.cpp  |   5 +
 mlir/test/Dialect/OpenMP/invalid.mlir |  94 +++-
 mlir/test/Dialect/OpenMP/ops.mlir |  54 -
 mlir/test/Target/LLVMIR/openmp-todo.mlir  |  14 ++
 8 files changed, 446 insertions(+), 19 deletions(-)

diff --git a/mlir/docs/Dialects/OpenMPDialect/_index.md 
b/mlir/docs/Dialects/OpenMPDialect/_index.md
index 03d5b95217cce0..b651b3c06485c6 100644
--- a/mlir/docs/Dialects/OpenMPDialect/_index.md
+++ b/mlir/docs/Dialects/OpenMPDialect/_index.md
@@ -298,7 +298,8 @@ introduction of private copies of the same underlying 
variable defined outside
 the MLIR operation the clause is attached to. Currently, clauses with this
 property can be classified into three main categories:
   - Map-like clauses: `host_eval` (compiler internal, not defined by the OpenMP
-  specification), `map`, `use_device_addr` and `use_device_ptr`.
+  specification: [see more](#host-evaluated-clauses-in-target-regions)), `map`,
+  `use_device_addr` and `use_device_ptr`.
   - Reduction-like clauses: `in_reduction`, `reduction` and `task_reduction`.
   - Privatization clauses: `private`.
 
@@ -523,3 +524,58 @@ omp.parallel ... {
   omp.terminator
 } {omp.composite}
 ```
+
+## Host-Evaluated Clauses in Target Regions
+
+The `omp.target` operation, which represents the OpenMP `target` construct, is
+marked with the `IsolatedFromAbove` trait. This means that, inside of its
+region, no MLIR values defined outside of the op itself can be used. This is
+consistent with the OpenMP specification of the `target` construct, which
+mandates that all host device values used inside of the `target` region must
+either be privatized (data-sharing) or mapped (data-mapping).
+
+Normally, clauses applied to a construct are evaluated before entering that
+construct. Further, in some cases, the OpenMP specification stipulates that
+clauses be evaluated _on the host device_ on entry to a parent `target`
+construct. In particular, the `num_teams` and `thread_limit` clauses of the
+`teams` construct must be evaluated on the host device if it's nested inside or
+combined with a `target` construct.
+
+Additionally, the runtime library targeted by the MLIR to LLVM IR translation of
+the OpenMP dialect supports the optimized launch of SPMD kernels (i.e.
+`target teams distribute parallel {do,for}` in OpenMP), which requires
+specifying in advance what the total trip count of the loop is. Consequently, it
+is also beneficial to evaluate the trip count on the host device prior to the
+kernel launch.
+
+These host-evaluated values in MLIR would need to be placed outside of the
+`omp.target` region and also attached to the corresponding nested operations,
+which is not possible because of the `IsolatedFromAbove` trait. The solution
+implemented to address this problem has been to introduce the `host_eval`
+argument to the `omp.target` operation. It works similarly to a `map` clause,
+but its only intended use is to forward host-evaluated values to their
+corresponding operation inside of the region. Any uses outside of the previously
+described result in a verifier error.
+
+```mlir
+// Initialize %0, %1, %2, %3...
+omp.target host_eval(%0 -> %nt, %1 -> %lb, %2 -> %ub, %3 -> %step : i32, i32, i32, i32) {
+  omp.teams num_teams(to %nt : i32) {
+    omp.parallel {
+      omp.distribute {
+        omp.wsloop {
+          omp.loop_nest (%iv) : i32 = (%lb) to (%ub) step (%step) {
+            // ...
+            omp.yield
+          }
+          omp.terminator
+        } {omp.composite}
+        omp.terminator
+      } {omp.composite}
+      omp.terminator
+    } {omp.composite}
+    omp.terminator
+  }
+  omp.terminator
+}
+```
diff --git a/mlir/include/mlir/Dialect/OpenMP/OpenMPDialect.h 
b/mlir/include/mlir/Dialect/OpenMP/OpenMPDialect.h
index bee21432196e42..248ac2eb72c61a 100644
--- a/mlir/include/mlir/Dialect/OpenMP/OpenMPDialect.h
+++ b/mlir/include/mlir/Dialect/OpenMP/OpenMPDialect.h
@@ -22,6 +22,7 @@
 #include "mlir/IR/SymbolTable.h"
 #include "mlir/Interfaces/ControlFlowInterfaces.h"
 #include "mlir/Interfaces/SideEffectInterfaces.h"
+#

[llvm-branch-commits] [llvm] AMDGPU: Start considering new atomicrmw metadata on integer operations (PR #122138)

2025-01-08 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-backend-amdgpu

Author: Matt Arsenault (arsenm)


Changes

Start considering !amdgpu.no.remote.memory.access and
!amdgpu.no.fine.grained.host.memory metadata when deciding to expand
integer atomic operations. This does not yet attempt to accurately
handle fadd/fmin/fmax, which are trickier and require migrating the
old "amdgpu-unsafe-fp-atomics" attribute.
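
As a rough illustration of the decision order described above, here is a minimal standalone C++ sketch. It assumes plain booleans in place of the real `AtomicRMWInst` and subtarget queries, and every name in it (`AtomicQuerySketch`, `DecisionSketch`, `classifyIntegerAtomic`) is hypothetical rather than part of the patch. It models only the two early checks visible in the truncated diff and lumps everything else into a single "needs further checks" result.

```cpp
#include <cassert>

// Hypothetical, simplified inputs. The real code inspects an AtomicRMWInst
// and the subtarget instead of plain booleans.
struct AtomicQuerySketch {
  bool HasNoRemoteMemoryMetadata;  // the "no remote memory" metadata is present
  bool HasSystemScope;             // the atomic uses system scope
  bool AgentScopeFineGrainedWorks; // subtarget capability (assumed flag)
};

// Hypothetical result type; the patch itself returns an AtomicExpansionKind.
enum class DecisionSketch { SupportedIfLegalIntType, NeedsFurtherChecks };

inline DecisionSketch classifyIntegerAtomic(const AtomicQuerySketch &Q) {
  // Memory known to be local to the device: the instruction should work.
  if (Q.HasNoRemoteMemoryMetadata)
    return DecisionSketch::SupportedIfLegalIntType;

  // Below system scope, some subtargets handle fine-grained remote memory.
  if (!Q.HasSystemScope && Q.AgentScopeFineGrainedWorks)
    return DecisionSketch::SupportedIfLegalIntType;

  // The remaining cases depend on logic that is cut off in the diff below,
  // so this sketch does not try to model them.
  return DecisionSketch::NeedsFurtherChecks;
}
```

For example, `classifyIntegerAtomic({true, true, false})` reports the operation as supported even at system scope, because the metadata check comes first.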

---

Patch is 1.43 MiB, truncated to 20.00 KiB below, full version: 
https://github.com/llvm/llvm-project/pull/122138.diff


28 Files Affected:

- (modified) llvm/lib/Target/AMDGPU/SIISelLowering.cpp (+53-12) 
- (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/atomicrmw_udec_wrap.ll (+31-30)
- (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/atomicrmw_uinc_wrap.ll (+2521-614)
- (modified) llvm/test/CodeGen/AMDGPU/acc-ldst.ll (+4-2)
- (modified) llvm/test/CodeGen/AMDGPU/atomic_optimizations_global_pointer.ll (+14-12)
- (modified) llvm/test/CodeGen/AMDGPU/dag-divergence-atomic.ll (+17-17)
- (modified) llvm/test/CodeGen/AMDGPU/flat_atomics.ll (+87-85)
- (modified) llvm/test/CodeGen/AMDGPU/flat_atomics_i32_system.ll (+100-926)
- (modified) llvm/test/CodeGen/AMDGPU/flat_atomics_i64.ll (+82-80)
- (modified) llvm/test/CodeGen/AMDGPU/flat_atomics_i64_noprivate.ll (+3758-1362)
- (modified) llvm/test/CodeGen/AMDGPU/flat_atomics_i64_system.ll (+258-1374)
- (modified) llvm/test/CodeGen/AMDGPU/flat_atomics_i64_system_noprivate.ll (+100-1212)
- (modified) llvm/test/CodeGen/AMDGPU/global-saddr-atomics.ll (+82-80)
- (modified) llvm/test/CodeGen/AMDGPU/global_atomics.ll (+319-79)
- (modified) llvm/test/CodeGen/AMDGPU/global_atomics_i32_system.ll (+98-984)
- (modified) llvm/test/CodeGen/AMDGPU/global_atomics_i64.ll (+82-80)
- (modified) llvm/test/CodeGen/AMDGPU/global_atomics_i64_system.ll (+109-1221)
- (modified) llvm/test/CodeGen/AMDGPU/idemponent-atomics.ll (+42-28)
- (modified) llvm/test/CodeGen/AMDGPU/move-to-valu-atomicrmw.ll (+4-2)
- (modified) llvm/test/CodeGen/AMDGPU/shl_add_ptr_global.ll (+1-1)
- (modified) llvm/test/Transforms/AtomicExpand/AMDGPU/expand-atomic-i16.ll (+534-159)
- (modified) llvm/test/Transforms/AtomicExpand/AMDGPU/expand-atomic-i32-agent.ll (+990-49)
- (modified) llvm/test/Transforms/AtomicExpand/AMDGPU/expand-atomic-i32-system.ll (+30-330)
- (modified) llvm/test/Transforms/AtomicExpand/AMDGPU/expand-atomic-i64-agent.ll (+990-49)
- (modified) llvm/test/Transforms/AtomicExpand/AMDGPU/expand-atomic-i64-system.ll (+30-330)
- (modified) llvm/test/Transforms/AtomicExpand/AMDGPU/expand-atomic-i8.ll (+209-6)
- (modified) llvm/test/Transforms/AtomicExpand/AMDGPU/expand-atomicrmw-flat-noalias-addrspace.ll (+130-8)
- (modified) llvm/test/Transforms/AtomicExpand/AMDGPU/expand-atomicrmw-integer-ops-0-to-add-0.ll (+43-21)


``diff
diff --git a/llvm/lib/Target/AMDGPU/SIISelLowering.cpp 
b/llvm/lib/Target/AMDGPU/SIISelLowering.cpp
index 513251e398ad4d..5fa8e1532096f7 100644
--- a/llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+++ b/llvm/lib/Target/AMDGPU/SIISelLowering.cpp
@@ -16621,19 +16621,60 @@ 
SITargetLowering::shouldExpandAtomicRMWInIR(AtomicRMWInst *RMW) const {
   case AtomicRMWInst::UDecWrap: {
 if (AMDGPU::isFlatGlobalAddrSpace(AS) ||
 AS == AMDGPUAS::BUFFER_FAT_POINTER) {
-  // Always expand system scope atomics.
-  if (HasSystemScope) {
-if (Op == AtomicRMWInst::Sub || Op == AtomicRMWInst::Or ||
-Op == AtomicRMWInst::Xor) {
-  // Atomic sub/or/xor do not work over PCI express, but atomic add
-  // does. InstCombine transforms these with 0 to or, so undo that.
-  if (Constant *ConstVal = dyn_cast<Constant>(RMW->getValOperand());
-  ConstVal && ConstVal->isNullValue())
-return AtomicExpansionKind::Expand;
-}
-
-return AtomicExpansionKind::CmpXChg;
+  // On most subtargets, for atomicrmw operations other than add/xchg,
+  // whether or not the instructions will behave correctly depends on where
+  // the address physically resides and what interconnect is used in the
+  // system configuration. On some some targets the instruction will nop,
+  // and in others synchronization will only occur at degraded device scope.
+  //
+  // If the allocation is known local to the device, the instructions should
+  // work correctly.
+  if (RMW->hasMetadata("amdgpu.no.remote.memory"))
+return atomicSupportedIfLegalIntType(RMW);
+
+  // If fine-grained remote memory works at device scope, we don't need to
+  // do anything.
+  if (!HasSystemScope &&
+  Subtarget->supportsAgentScopeFineGrainedRemoteMemoryAtomics())
+return atomicSupportedIfLegalIntType(RMW);
+
+  // If we are targeting a remote allocated address, it depends what kind of
+  // allocation the address belongs to.
+  //
+  // If the allocation is fine-grained (in host memory, or in PCIe peer
+  // device memory), the operation will fa

[llvm-branch-commits] [llvm] AMDGPU: Start considering new atomicrmw metadata on integer operations (PR #122138)

2025-01-08 Thread Matt Arsenault via llvm-branch-commits

arsenm wrote:

> [!WARNING]
> This pull request is not mergeable via GitHub because a downstack PR is 
> open. Once all requirements are satisfied, merge this PR as a stack on 
> Graphite.

* **#122138** 👈 (View in Graphite)
* **#122137**
* `main`

This stack of pull requests is managed by Graphite. Learn more about stacking.


https://github.com/llvm/llvm-project/pull/122138


[llvm-branch-commits] [llvm] AMDGPU: Start considering new atomicrmw metadata on integer operations (PR #122138)

2025-01-08 Thread Matt Arsenault via llvm-branch-commits

https://github.com/arsenm ready_for_review 
https://github.com/llvm/llvm-project/pull/122138


[llvm-branch-commits] [clang] [llvm] [mlir] [OMPIRBuilder] Introduce struct to hold default kernel teams/threads (PR #116050)

2025-01-08 Thread Sergio Afonso via llvm-branch-commits

https://github.com/skatrak updated 
https://github.com/llvm/llvm-project/pull/116050

>From f73a439832c4e8454274b7677570d190231dcf46 Mon Sep 17 00:00:00 2001
From: Sergio Afonso 
Date: Fri, 8 Nov 2024 15:46:48 +
Subject: [PATCH 1/2] [OMPIRBuilder] Introduce struct to hold default kernel
 teams/threads

This patch introduces the `OpenMPIRBuilder::TargetKernelDefaultAttrs` structure
used to simplify passing default and constant values for number of teams and
threads, and possibly other target kernel-related information in the future.

This is used to forward values passed to `createTarget` to `createTargetInit`,
which previously used a default unrelated set of values.
---
 clang/lib/CodeGen/CGOpenMPRuntime.cpp | 13 ++--
 clang/lib/CodeGen/CGOpenMPRuntime.h   |  9 +--
 clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp  |  9 +--
 .../llvm/Frontend/OpenMP/OMPIRBuilder.h   | 39 ++
 llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp | 71 +++
 .../Frontend/OpenMPIRBuilderTest.cpp  | 29 
 .../OpenMP/OpenMPToLLVMIRTranslation.cpp  | 11 +--
 .../LLVMIR/omptarget-region-device-llvm.mlir  |  2 +-
 8 files changed, 102 insertions(+), 81 deletions(-)

diff --git a/clang/lib/CodeGen/CGOpenMPRuntime.cpp 
b/clang/lib/CodeGen/CGOpenMPRuntime.cpp
index 30c3834de139c3..1cb3bab454c26a 100644
--- a/clang/lib/CodeGen/CGOpenMPRuntime.cpp
+++ b/clang/lib/CodeGen/CGOpenMPRuntime.cpp
@@ -5881,10 +5881,13 @@ void 
CGOpenMPRuntime::emitUsesAllocatorsFini(CodeGenFunction &CGF,
 
 void CGOpenMPRuntime::computeMinAndMaxThreadsAndTeams(
 const OMPExecutableDirective &D, CodeGenFunction &CGF,
-int32_t &MinThreadsVal, int32_t &MaxThreadsVal, int32_t &MinTeamsVal,
-int32_t &MaxTeamsVal) {
+llvm::OpenMPIRBuilder::TargetKernelDefaultAttrs &Attrs) {
+  assert(Attrs.MaxTeams.size() == 1 && Attrs.MaxThreads.size() == 1 &&
+ "invalid default attrs structure");
+  int32_t &MaxTeamsVal = Attrs.MaxTeams.front();
+  int32_t &MaxThreadsVal = Attrs.MaxThreads.front();
 
-  getNumTeamsExprForTargetDirective(CGF, D, MinTeamsVal, MaxTeamsVal);
+  getNumTeamsExprForTargetDirective(CGF, D, Attrs.MinTeams, MaxTeamsVal);
   getNumThreadsExprForTargetDirective(CGF, D, MaxThreadsVal,
   /*UpperBoundOnly=*/true);
 
@@ -5902,12 +5905,12 @@ void CGOpenMPRuntime::computeMinAndMaxThreadsAndTeams(
   else
 continue;
 
-  MinThreadsVal = std::max(MinThreadsVal, AttrMinThreadsVal);
+  Attrs.MinThreads = std::max(Attrs.MinThreads, AttrMinThreadsVal);
   if (AttrMaxThreadsVal > 0)
 MaxThreadsVal = MaxThreadsVal > 0
 ? std::min(MaxThreadsVal, AttrMaxThreadsVal)
 : AttrMaxThreadsVal;
-  MinTeamsVal = std::max(MinTeamsVal, AttrMinBlocksVal);
+  Attrs.MinTeams = std::max(Attrs.MinTeams, AttrMinBlocksVal);
   if (AttrMaxBlocksVal > 0)
 MaxTeamsVal = MaxTeamsVal > 0 ? std::min(MaxTeamsVal, AttrMaxBlocksVal)
   : AttrMaxBlocksVal;
diff --git a/clang/lib/CodeGen/CGOpenMPRuntime.h 
b/clang/lib/CodeGen/CGOpenMPRuntime.h
index 8ab5ee70a19fa2..3791bb71592350 100644
--- a/clang/lib/CodeGen/CGOpenMPRuntime.h
+++ b/clang/lib/CodeGen/CGOpenMPRuntime.h
@@ -313,12 +313,9 @@ class CGOpenMPRuntime {
   llvm::OpenMPIRBuilder OMPBuilder;
 
   /// Helper to determine the min/max number of threads/teams for \p D.
-  void computeMinAndMaxThreadsAndTeams(const OMPExecutableDirective &D,
-   CodeGenFunction &CGF,
-   int32_t &MinThreadsVal,
-   int32_t &MaxThreadsVal,
-   int32_t &MinTeamsVal,
-   int32_t &MaxTeamsVal);
+  void computeMinAndMaxThreadsAndTeams(
+  const OMPExecutableDirective &D, CodeGenFunction &CGF,
+  llvm::OpenMPIRBuilder::TargetKernelDefaultAttrs &Attrs);
 
   /// Helper to emit outlined function for 'target' directive.
   /// \param D Directive to emit.
diff --git a/clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp 
b/clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp
index 756f0482b8ea72..659783a813c83e 100644
--- a/clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp
+++ b/clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp
@@ -744,14 +744,11 @@ void CGOpenMPRuntimeGPU::emitNonSPMDKernel(const 
OMPExecutableDirective &D,
 void CGOpenMPRuntimeGPU::emitKernelInit(const OMPExecutableDirective &D,
 CodeGenFunction &CGF,
 EntryFunctionState &EST, bool IsSPMD) {
-  int32_t MinThreadsVal = 1, MaxThreadsVal = -1, MinTeamsVal = 1,
-  MaxTeamsVal = -1;
-  computeMinAndMaxThreadsAndTeams(D, CGF, MinThreadsVal, MaxThreadsVal,
-  MinTeamsVal, MaxTeamsVal);
+  llvm::OpenMPIRBuilder::TargetKernelDefaultAttrs Attrs;
+  computeMinAndMaxThreadsAndTeams(D, CGF, Attrs);
 
   C

[llvm-branch-commits] [clang] [llvm] [mlir] [OMPIRBuilder] Introduce struct to hold default kernel teams/threads (PR #116050)

2025-01-08 Thread Sergio Afonso via llvm-branch-commits


@@ -2726,15 +2740,11 @@ class OpenMPIRBuilder {
   ///
   /// \param Loc The insert and source location description.
   /// \param IsSPMD Flag to indicate if the kernel is an SPMD kernel or not.
-  /// \param MinThreads Minimal number of threads, or 0.
-  /// \param MaxThreads Maximal number of threads, or 0.
-  /// \param MinTeams Minimal number of teams, or 0.
-  /// \param MaxTeams Maximal number of teams, or 0.
-  InsertPointTy createTargetInit(const LocationDescription &Loc, bool IsSPMD,
- int32_t MinThreadsVal = 0,
- int32_t MaxThreadsVal = 0,
- int32_t MinTeamsVal = 0,
- int32_t MaxTeamsVal = 0);
+  /// \param Attrs Structure containing the default numbers of threads and 
teams
+  ///to launch the kernel with.
+  InsertPointTy createTargetInit(
+  const LocationDescription &Loc, bool IsSPMD,
+  const llvm::OpenMPIRBuilder::TargetKernelDefaultAttrs &Attrs);

skatrak wrote:

Good point, I agree. Added now.

https://github.com/llvm/llvm-project/pull/116050


[llvm-branch-commits] [clang] [llvm] [mlir] [OMPIRBuilder] Introduce struct to hold default kernel teams/threads (PR #116050)

2025-01-08 Thread Sergio Afonso via llvm-branch-commits


@@ -6182,9 +6182,12 @@ TEST_F(OpenMPIRBuilderTest, TargetRegion) {
 
   TargetRegionEntryInfo EntryInfo("func", 42, 4711, 17);
   OpenMPIRBuilder::LocationDescription OmpLoc({Builder.saveIP(), DL});
-  OpenMPIRBuilder::InsertPointOrErrorTy AfterIP = OMPBuilder.createTarget(
-  OmpLoc, /*IsOffloadEntry=*/true, Builder.saveIP(), Builder.saveIP(),
-  EntryInfo, -1, 0, Inputs, GenMapInfoCB, BodyGenCB, SimpleArgAccessorCB);
+  OpenMPIRBuilder::TargetKernelDefaultAttrs DefaultAttrs = {
+  /*MaxTeams=*/{-1}, /*MinTeams=*/0, /*MaxThreads=*/{0}, /*MinThreads=*/0};

skatrak wrote:

Defaults in the new struct represent basically what you would expect: max 
values representing "unset" (since these can be either unset (<0), 
runtime-evaluated (0) or constant (>0)) and min values set to 1. I believe that 
set of defaults makes sense, and it also matches what Clang initially sets the 
corresponding attributes to.

As for not overriding the defaults in these tests, `MaxThreads < 0` causes the 
OMPIRBuilder to query the default grid size based on the target triple, whereas 
0 won't. Querying that triggers an assert if the triple is not one of the 
supported offloading targets, so at least that one attribute can't be left 
unchanged unless we change the target triple of the OMPIRBuilder too. But, more 
generally, I think there is nothing in this PR that causes a need to update 
these tests, so I just set all of the values to what they already were before 
the struct was introduced rather than adapting them to its defaults.

I hope that makes sense to you, but let me know if you don't agree.
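
To make the three states concrete, here is a minimal standalone sketch. It assumes a simplified stand-in for `TargetKernelDefaultAttrs` built on `std::vector`; the names `KernelDefaultAttrsSketch`, `BoundKind`, `classifyMaxBound` and `needsDefaultGridSizeQuery` are hypothetical and not part of the PR. It only mirrors the conventions described above: negative means unset, zero means runtime-evaluated, positive means constant.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for OpenMPIRBuilder::TargetKernelDefaultAttrs.
// Defaults follow the description above: max bounds start "unset" (-1)
// and min bounds start at 1.
struct KernelDefaultAttrsSketch {
  std::vector<int32_t> MaxTeams = {-1};
  int32_t MinTeams = 1;
  std::vector<int32_t> MaxThreads = {-1};
  int32_t MinThreads = 1;
};

// The three states a max bound can be in.
enum class BoundKind { Unset, RuntimeEvaluated, Constant };

inline BoundKind classifyMaxBound(int32_t Value) {
  if (Value < 0)
    return BoundKind::Unset;
  if (Value == 0)
    return BoundKind::RuntimeEvaluated;
  return BoundKind::Constant;
}

// Only an unset MaxThreads would fall back to a per-target default grid
// size; this is the query the tests avoid by keeping MaxThreads at 0.
inline bool needsDefaultGridSizeQuery(const KernelDefaultAttrsSketch &Attrs) {
  assert(Attrs.MaxThreads.size() == 1 && "sketch handles a single dimension");
  return classifyMaxBound(Attrs.MaxThreads.front()) == BoundKind::Unset;
}
```

With the struct defaults, `needsDefaultGridSizeQuery` returns true. Setting `MaxThreads` to `{0}`, as the tests do, makes it return false, which is why that one attribute is overridden there.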

https://github.com/llvm/llvm-project/pull/116050


[llvm-branch-commits] [llvm] AMDGPU: Reduce 64-bit add width if low bits are known 0 (PR #122049)

2025-01-08 Thread Matt Arsenault via llvm-branch-commits

arsenm wrote:

### Merge activity

* **Jan 8, 10:24 AM EST**: A user started a stack merge that includes this pull 
request via 
[Graphite](https://app.graphite.dev/github/pr/llvm/llvm-project/122049).


https://github.com/llvm/llvm-project/pull/122049


[llvm-branch-commits] [flang] [llvm] [flang][OpenMP] Parse WHEN, OTHERWISE, MATCH clauses plus METADIRECTIVE (PR #121817)

2025-01-08 Thread Krzysztof Parzyszek via llvm-branch-commits

https://github.com/kparzysz updated 
https://github.com/llvm/llvm-project/pull/121817

>From 5f534c559ca1bb7911b484264582d1a5078bdcb8 Mon Sep 17 00:00:00 2001
From: Krzysztof Parzyszek 
Date: Thu, 12 Dec 2024 15:26:26 -0600
Subject: [PATCH 1/7] [flang][OpenMP] Parse WHEN, OTHERWISE, MATCH clauses plus
 METADIRECTIVE

Parse METADIRECTIVE as a standalone executable directive at the moment.
This will allow testing the parser code.

There is no lowering, not even clause conversion yet. There is also no
verification of the allowed values for trait sets, trait properties.
---
 flang/include/flang/Parser/dump-parse-tree.h |   5 +
 flang/include/flang/Parser/parse-tree.h  |  41 -
 flang/lib/Lower/OpenMP/Clauses.cpp   |  21 ++-
 flang/lib/Lower/OpenMP/Clauses.h |   1 +
 flang/lib/Lower/OpenMP/OpenMP.cpp|   6 +
 flang/lib/Parser/openmp-parsers.cpp  |  26 ++-
 flang/lib/Parser/unparse.cpp |  12 ++
 flang/lib/Semantics/check-omp-structure.cpp  |   9 +
 flang/lib/Semantics/check-omp-structure.h|   3 +
 flang/lib/Semantics/resolve-directives.cpp   |   5 +
 flang/test/Parser/OpenMP/metadirective.f90   | 165 +++
 llvm/include/llvm/Frontend/OpenMP/OMP.td |   9 +-
 12 files changed, 296 insertions(+), 7 deletions(-)
 create mode 100644 flang/test/Parser/OpenMP/metadirective.f90

diff --git a/flang/include/flang/Parser/dump-parse-tree.h 
b/flang/include/flang/Parser/dump-parse-tree.h
index a61d7973dd5c36..94ca7c67cbd52e 100644
--- a/flang/include/flang/Parser/dump-parse-tree.h
+++ b/flang/include/flang/Parser/dump-parse-tree.h
@@ -476,6 +476,11 @@ class ParseTreeDumper {
   NODE(parser, NullInit)
   NODE(parser, ObjectDecl)
   NODE(parser, OldParameterStmt)
+  NODE(parser, OmpMetadirectiveDirective)
+  NODE(parser, OmpMatchClause)
+  NODE(parser, OmpOtherwiseClause)
+  NODE(parser, OmpWhenClause)
+  NODE(OmpWhenClause, Modifier)
   NODE(parser, OmpDirectiveSpecification)
   NODE(parser, OmpTraitPropertyName)
   NODE(parser, OmpTraitScore)
diff --git a/flang/include/flang/Parser/parse-tree.h 
b/flang/include/flang/Parser/parse-tree.h
index 697bddfaf16150..113ff3380ba22c 100644
--- a/flang/include/flang/Parser/parse-tree.h
+++ b/flang/include/flang/Parser/parse-tree.h
@@ -3964,6 +3964,7 @@ struct OmpBindClause {
 // data-sharing-attribute ->
 //SHARED | NONE |   // since 4.5
 //PRIVATE | FIRSTPRIVATE// since 5.0
+// See also otherwise-clause.
 struct OmpDefaultClause {
   ENUM_CLASS(DataSharingAttribute, Private, Firstprivate, Shared, None)
   WRAPPER_CLASS_BOILERPLATE(OmpDefaultClause, DataSharingAttribute);
@@ -4184,6 +4185,16 @@ struct OmpMapClause {
   std::tuple t;
 };
 
+// Ref: [5.0:58-60], [5.1:63-68], [5.2:194-195]
+//
+// match-clause ->
+//MATCH (context-selector-specification)// since 5.0
+struct OmpMatchClause {
+  // The context-selector is an argument.
+  WRAPPER_CLASS_BOILERPLATE(
+  OmpMatchClause, traits::OmpContextSelectorSpecification);
+};
+
 // Ref: [5.2:217-218]
 // message-clause ->
 //MESSAGE("message-text")
@@ -4214,6 +4225,17 @@ struct OmpOrderClause {
   std::tuple t;
 };
 
+// Ref: [5.0:56-57], [5.1:60-62], [5.2:191]
+//
+// otherwise-clause ->
+//DEFAULT ([directive-specification])   // since 5.0, until 5.1
+// otherwise-clause ->
+//OTHERWISE ([directive-specification])]// since 5.2
+struct OmpOtherwiseClause {
+  WRAPPER_CLASS_BOILERPLATE(
+  OmpOtherwiseClause, std::optional);
+};
+
 // Ref: [4.5:46-50], [5.0:74-78], [5.1:92-96], [5.2:229-230]
 //
 // proc-bind-clause ->
@@ -4299,6 +4321,17 @@ struct OmpUpdateClause {
   std::variant u;
 };
 
+// Ref: [5.0:56-57], [5.1:60-62], [5.2:190-191]
+//
+// when-clause ->
+//WHEN (context-selector :
+//[directive-specification])// since 5.0
+struct OmpWhenClause {
+  TUPLE_CLASS_BOILERPLATE(OmpWhenClause);
+  MODIFIER_BOILERPLATE(OmpContextSelector);
+  std::tuple> t;
+};
+
 // OpenMP Clauses
 struct OmpClause {
   UNION_CLASS_BOILERPLATE(OmpClause);
@@ -4323,6 +4356,12 @@ struct OmpClauseList {
 
 // --- Directives and constructs
 
+struct OmpMetadirectiveDirective {
+  TUPLE_CLASS_BOILERPLATE(OmpMetadirectiveDirective);
+  std::tuple t;
+  CharBlock source;
+};
+
 // Ref: [5.1:89-90], [5.2:216]
 //
 // nothing-directive ->
@@ -4696,7 +4735,7 @@ struct OpenMPStandaloneConstruct {
   CharBlock source;
   std::variant
+  OpenMPDepobjConstruct, OmpMetadirectiveDirective>
   u;
 };
 
diff --git a/flang/lib/Lower/OpenMP/Clauses.cpp b/flang/lib/Lower/OpenMP/Clauses.cpp
index b424e209d56da9..d60171552087fa 100644
--- a/flang/lib/Lower/OpenMP/Clauses.cpp
+++ b/flang/lib/Lower/OpenMP/Clauses.cpp
@@ -230,9 +230,9 @@ MAKE_EMPTY_CLASS(Threadprivate, Threadprivate);
 
 MAKE_INCOMPLETE_CLASS(AdjustArgs, AdjustArgs);
 MAKE_INCOMPLETE_CLASS(AppendArgs, AppendArgs);
-MAKE_INCOMPLETE_CLASS(Match, Match);
+// MAKE_INCOMPLETE_CLASS(Match, Mat

[llvm-branch-commits] [flang] [flang][OpenMP] Parsing context selectors for METADIRECTIVE (PR #121815)

2025-01-08 Thread Krzysztof Parzyszek via llvm-branch-commits

https://github.com/kparzysz updated 
https://github.com/llvm/llvm-project/pull/121815

>From 215c7e6133bf07d005ac7483b8faf797e319a1fa Mon Sep 17 00:00:00 2001
From: Krzysztof Parzyszek 
Date: Thu, 12 Dec 2024 15:26:26 -0600
Subject: [PATCH] [flang][OpenMP] Parsing context selectors for METADIRECTIVE

This is just adding parsers for context selectors. There are no tests
because there is no way to execute these parsers yet.
---
 flang/include/flang/Parser/characters.h  |   2 +
 flang/include/flang/Parser/dump-parse-tree.h |  14 ++
 flang/include/flang/Parser/parse-tree.h  | 136 +++
 flang/lib/Parser/openmp-parsers.cpp  |  78 +++
 flang/lib/Parser/token-parsers.h |   4 +
 flang/lib/Parser/unparse.cpp |  38 ++
 flang/lib/Semantics/check-omp-structure.cpp  |   8 ++
 flang/lib/Semantics/check-omp-structure.h|   3 +
 flang/lib/Semantics/resolve-directives.cpp   |   6 +
 9 files changed, 289 insertions(+)

diff --git a/flang/include/flang/Parser/characters.h b/flang/include/flang/Parser/characters.h
index df188d674b9eeb..dbdc058c44995a 100644
--- a/flang/include/flang/Parser/characters.h
+++ b/flang/include/flang/Parser/characters.h
@@ -180,6 +180,8 @@ inline constexpr bool IsValidFortranTokenCharacter(char ch) {
   case '>':
   case '[':
   case ']':
+  case '{': // Used in OpenMP context selector specification
+  case '}': //
 return true;
   default:
 return IsLegalIdentifierStart(ch) || IsDecimalDigit(ch);
diff --git a/flang/include/flang/Parser/dump-parse-tree.h b/flang/include/flang/Parser/dump-parse-tree.h
index 3331520922bc63..a61d7973dd5c36 100644
--- a/flang/include/flang/Parser/dump-parse-tree.h
+++ b/flang/include/flang/Parser/dump-parse-tree.h
@@ -476,6 +476,20 @@ class ParseTreeDumper {
   NODE(parser, NullInit)
   NODE(parser, ObjectDecl)
   NODE(parser, OldParameterStmt)
+  NODE(parser, OmpDirectiveSpecification)
+  NODE(parser, OmpTraitPropertyName)
+  NODE(parser, OmpTraitScore)
+  NODE(parser, OmpTraitPropertyExtension)
+  NODE(OmpTraitPropertyExtension, ExtensionValue)
+  NODE(parser, OmpTraitProperty)
+  NODE(parser, OmpTraitSelectorName)
+  NODE_ENUM(OmpTraitSelectorName, Value)
+  NODE(parser, OmpTraitSelector)
+  NODE(OmpTraitSelector, Properties)
+  NODE(parser, OmpTraitSetSelectorName)
+  NODE_ENUM(OmpTraitSetSelectorName, Value)
+  NODE(parser, OmpTraitSetSelector)
+  NODE(parser, OmpContextSelectorSpecification)
   NODE(parser, OmpMapper)
   NODE(parser, OmpMapType)
   NODE_ENUM(OmpMapType, Value)
diff --git a/flang/include/flang/Parser/parse-tree.h b/flang/include/flang/Parser/parse-tree.h
index 941d70d3876291..697bddfaf16150 100644
--- a/flang/include/flang/Parser/parse-tree.h
+++ b/flang/include/flang/Parser/parse-tree.h
@@ -3453,6 +3453,17 @@ WRAPPER_CLASS(PauseStmt, std::optional);
 
 // --- Common definitions
 
+struct OmpClause;
+struct OmpClauseList;
+
+struct OmpDirectiveSpecification {
+  TUPLE_CLASS_BOILERPLATE(OmpDirectiveSpecification);
+  std::tuple>>
+  t;
+  CharBlock source;
+};
+
 // 2.1 Directives or clauses may accept a list or extended-list.
 // A list item is a variable, array section or common block name (enclosed
 // in slashes). An extended list item is a list item or a procedure Name.
@@ -3474,6 +3485,128 @@ WRAPPER_CLASS(OmpObjectList, std::list);
 
 #define MODIFIERS() std::optional>
 
+inline namespace traits {
+// trait-property-name ->
+//identifier | string-literal
+struct OmpTraitPropertyName {
+  WRAPPER_CLASS_BOILERPLATE(OmpTraitPropertyName, std::string);
+};
+
+// trait-score ->
+//SCORE(non-negative-const-integer-expression)
+struct OmpTraitScore {
+  WRAPPER_CLASS_BOILERPLATE(OmpTraitScore, ScalarIntExpr);
+};
+
+// trait-property-extension ->
+//trait-property-name (trait-property-value, ...)
+// trait-property-value ->
+//trait-property-name |
+//scalar-integer-expression |
+//trait-property-extension
+//
+// The grammar in OpenMP 5.2+ spec is ambiguous, the above is a different
+// version (but equivalent) that doesn't have ambiguities.
+// The ambiguity is in
+//   trait-property:
+//  trait-property-name  <- (a)
+//  trait-property-clause
+//  trait-property-expression<- (b)
+//  trait-property-extension <- this conflicts with (a) and (b)
+//   trait-property-extension:
+//  trait-property-name  <- conflict with (a)
+//  identifier(trait-property-extension[, trait-property-extension[, ...]])
+//  constant integer expression  <- conflict with (b)
+//
+struct OmpTraitPropertyExtension {
+  TUPLE_CLASS_BOILERPLATE(OmpTraitPropertyExtension);
+  struct ExtensionValue {
+UNION_CLASS_BOILERPLATE(ExtensionValue);
+std::variant>
+u;
+  };
+  using ExtensionList = std::list;
+  std::tuple t;
+};
+
+// trait-property ->
+//trait-property-name | OmpClause |
+//trait-property-expression | trait-property-extension
+// trait-property-expression ->
+//scala

[llvm-branch-commits] [clang] [flang] [lld] [llvm] [Flang] LLVM_ENABLE_RUNTIMES=flang-rt (PR #110217)

2025-01-08 Thread Michael Kruse via llvm-branch-commits

https://github.com/Meinersbur edited 
https://github.com/llvm/llvm-project/pull/110217
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [DataLayout][LangRef] Split non-integral and unstable pointer properties (PR #105735)

2025-01-08 Thread Alexander Richardson via llvm-branch-commits


@@ -650,48 +650,127 @@ literal types are uniqued in recent versions of LLVM.
 
 .. _nointptrtype:
 
-Non-Integral Pointer Type
--
+Non-Integral and Unstable Pointer Types
+---
 
-Note: non-integral pointer types are a work in progress, and they should be
-considered experimental at this time.
+Note: non-integral/unstable pointer types are a work in progress, and they
+should be considered experimental at this time.
 
 LLVM IR optionally allows the frontend to denote pointers in certain address
-spaces as "non-integral" via the :ref:`datalayout string`.
-Non-integral pointer types represent pointers that have an *unspecified* bitwise
-representation; that is, the integral representation may be target dependent or
-unstable (not backed by a fixed integer).
+spaces as "non-integral" or "unstable" (or both "non-integral" and "unstable")
+via the :ref:`datalayout string`.
+
+The exact implications of these properties are target-specific, but the
+following IR semantics and restrictions to optimization passes apply:
+
+Unstable pointer representation
+^^^
+
+Pointers in this address space have an *unspecified* bitwise representation
+(i.e. not backed by a fixed integer). The bitwise pattern of such pointers is
+allowed to change in a target-specific way. For example, this could be a pointer
+type used with copying garbage collection where the garbage collector could
+update the pointer at any time in the collection sweep.
 
 ``inttoptr`` and ``ptrtoint`` instructions have the same semantics as for
 integral (i.e. normal) pointers in that they convert integers to and from
-corresponding pointer types, but there are additional implications to be
-aware of.  Because the bit-representation of a non-integral pointer may
-not be stable, two identical casts of the same operand may or may not
+corresponding pointer types, but there are additional implications to be aware
+of.
+
+For "unstable" pointer representations, the bit-representation of the pointer
+may not be stable, so two identical casts of the same operand may or may not
 return the same value.  Said differently, the conversion to or from the
-non-integral type depends on environmental state in an implementation
+"unstable" pointer type depends on environmental state in an implementation
 defined manner.
-
 If the frontend wishes to observe a *particular* value following a cast, the
 generated IR must fence with the underlying environment in an implementation
 defined manner. (In practice, this tends to require ``noinline`` routines for
 such operations.)
 
 From the perspective of the optimizer, ``inttoptr`` and ``ptrtoint`` for
-non-integral types are analogous to ones on integral types with one
+"unstable" pointer types are analogous to ones on integral types with one
 key exception: the optimizer may not, in general, insert new dynamic
 occurrences of such casts.  If a new cast is inserted, the optimizer would
 need to either ensure that a) all possible values are valid, or b)
 appropriate fencing is inserted.  Since the appropriate fencing is
 implementation defined, the optimizer can't do the latter.  The former is
 challenging as many commonly expected properties, such as
-``ptrtoint(v)-ptrtoint(v) == 0``, don't hold for non-integral types.
+``ptrtoint(v)-ptrtoint(v) == 0``, don't hold for "unstable" pointer types.
 Similar restrictions apply to intrinsics that might examine the pointer bits,
 such as :ref:`llvm.ptrmask`.
 
-The alignment information provided by the frontend for a non-integral pointer
+The alignment information provided by the frontend for an "unstable" pointer
 (typically using attributes or metadata) must be valid for every possible
 representation of the pointer.
 
+Non-integral pointer representation
+^^^
+
+Pointers are not represented as just an address, but may instead include
+additional metadata such as bounds information or a temporal identifier.
+Examples include AMDGPU buffer descriptors with a 128-bit fat pointer and a
+32-bit offset, or CHERI capabilities that contain bounds, permissions and a
+type field (as well as an out-of-band validity bit, see next section).
+In general, valid non-integral pointers cannot be created from just an integer
+value: while ``inttoptr`` yields a deterministic bitwise pattern, the resulting
+value is not guaranteed to be a valid dereferenceable pointer.
+
+In most cases pointers with a non-integral representation behave exactly the
+same as an integral pointer, the only difference is that it is not possible to
+create a pointer just from an address.
+
+"Non-integral" pointers also impose restrictions on transformation passes, but
+in general these are less restrictive than for "unstable" pointers. The main
+difference compared to integral pointers is that ``inttoptr`` instructions
+should not be inserted by passes as they may not be able to create a vali

[llvm-branch-commits] [llvm] [Flang-RT] Build libflang_rt.so (PR #121782)

2025-01-08 Thread Michael Kruse via llvm-branch-commits

https://github.com/Meinersbur updated 
https://github.com/llvm/llvm-project/pull/121782

>From a3037ab5557dcc4a4deb5bb40f801ca9770e3854 Mon Sep 17 00:00:00 2001
From: Michael Kruse 
Date: Mon, 6 Jan 2025 16:44:08 +0100
Subject: [PATCH 1/2] Add FLANG_RT_ENABLE_STATIC and FLANG_RT_ENABLE_SHARED

---
 flang-rt/CMakeLists.txt   |  30 ++
 flang-rt/cmake/modules/AddFlangRT.cmake   | 291 --
 .../cmake/modules/AddFlangRTOffload.cmake |   8 +-
 flang-rt/cmake/modules/GetToolchainDirs.cmake | 254 +++
 flang-rt/lib/flang_rt/CMakeLists.txt  |  20 +-
 flang-rt/test/CMakeLists.txt  |   2 +-
 flang-rt/test/lit.cfg.py  |   2 +-
 7 files changed, 366 insertions(+), 241 deletions(-)

diff --git a/flang-rt/CMakeLists.txt b/flang-rt/CMakeLists.txt
index 7b3d22e454a108..7effa6012a078f 100644
--- a/flang-rt/CMakeLists.txt
+++ b/flang-rt/CMakeLists.txt
@@ -113,6 +113,15 @@ cmake_path(NORMAL_PATH FLANG_RT_OUTPUT_RESOURCE_DIR)
 cmake_path(NORMAL_PATH FLANG_RT_INSTALL_RESOURCE_PATH)
 
 # Determine subdirectories for build output and install destinations.
+# FIXME: For the libflang_rt.so, the toolchain resource lib dir is not a good
+#destination because it is not a ld.so default search path.
+#The machine where the executable is eventually executed may not be the
+#machine where the Flang compiler and its resource dir is installed, so
+#setting RPath by the driver is not a solution. It should belong in
+#/usr/lib//libflang_rt.so, like e.g. libgcc_s.so.
+#But the linker as invoked by the Flang driver also requires
+#libflang_rt.so to be found when linking and the resource lib dir is
+#the only reliable location.
 get_toolchain_library_subdir(toolchain_lib_subdir)
 extend_path(FLANG_RT_OUTPUT_RESOURCE_LIB_DIR "${FLANG_RT_OUTPUT_RESOURCE_DIR}" "${toolchain_lib_subdir}")
 extend_path(FLANG_RT_INSTALL_RESOURCE_LIB_PATH "${FLANG_RT_INSTALL_RESOURCE_PATH}" "${toolchain_lib_subdir}")
@@ -130,6 +139,27 @@ cmake_path(NORMAL_PATH FLANG_RT_INSTALL_RESOURCE_LIB_PATH)
 option(FLANG_RT_INCLUDE_TESTS "Generate build targets for the flang-rt unit and regression-tests." "${LLVM_INCLUDE_TESTS}")
 
 
+option(FLANG_RT_ENABLE_STATIC "Build Flang-RT as a static library." ON)
+if (WIN32)
+  # Windows DLL currently not implemented.
+  set(FLANG_RT_ENABLE_SHARED OFF)
+else ()
+  # TODO: Enable by default to increase test coverage, and which version of the
+  #   library should be the user's choice anyway.
+  #   Currently, the Flang driver adds `-L"libdir" -lflang_rt` as linker
+  #   argument, which leaves the choice which library to use to the linker.
+  #   Since most linkers prefer the shared library, this would constitute a
+  #   breaking change unless the driver is changed.
+  option(FLANG_RT_ENABLE_SHARED "Build Flang-RT as a shared library." OFF)
+endif ()
+if (NOT FLANG_RT_ENABLE_STATIC AND NOT FLANG_RT_ENABLE_SHARED)
+  message(FATAL_ERROR "
+  Must build at least one type of library
+  (FLANG_RT_ENABLE_STATIC=ON, FLANG_RT_ENABLE_SHARED=ON, or both)
+")
+endif ()
+
+
 set(FLANG_RT_EXPERIMENTAL_OFFLOAD_SUPPORT "" CACHE STRING "Compile Flang-RT with GPU support (CUDA or OpenMP)")
 set_property(CACHE FLANG_RT_EXPERIMENTAL_OFFLOAD_SUPPORT PROPERTY STRINGS
 ""
diff --git a/flang-rt/cmake/modules/AddFlangRT.cmake b/flang-rt/cmake/modules/AddFlangRT.cmake
index 1f8b5111433825..5f493a80c35f20 100644
--- a/flang-rt/cmake/modules/AddFlangRT.cmake
+++ b/flang-rt/cmake/modules/AddFlangRT.cmake
@@ -16,7 +16,8 @@
 #   STATIC
 # Build a static (.a/.lib) library
 #   OBJECT
-# Create only object files without static/dynamic library
+# Always create an object library.
+# Without SHARED/STATIC, build only the object library.
 #   INSTALL_WITH_TOOLCHAIN
 # Install library into Clang's resource directory so it can be found by the
 # Flang driver during compilation, including tests
@@ -44,17 +45,73 @@ function (add_flangrt_library name)
   ")
   endif ()
 
-  # Forward libtype to add_library
-  set(extra_args "")
-  if (ARG_SHARED)
-list(APPEND extra_args SHARED)
+  # Internal names of libraries. If called with just single type option, use
+  # the default name for it. Name of targets must only depend on function
+  # arguments to be predictable for callers.
+  set(name_static "${name}.static")
+  set(name_shared "${name}.shared")
+  set(name_object "obj.${name}")
+  if (ARG_STATIC AND NOT ARG_SHARED)
+set(name_static "${name}")
+  elseif (NOT ARG_STATIC AND ARG_SHARED)
+set(name_shared "${name}")
+  elseif (NOT ARG_STATIC AND NOT ARG_SHARED AND ARG_OBJECT)
+set(name_object "${name}")
+  elseif (NOT ARG_STATIC AND NOT ARG_SHARED AND NOT ARG_OBJECT)
+# Only one of them will actually be built.
+set(name_static "${name}")
+set(name_shared "${name}")
+  endif ()
+
+  # Determine what to build. If not explicitly

[llvm-branch-commits] [clang] [flang] [lld] [llvm] [Flang] LLVM_ENABLE_RUNTIMES=flang-rt (PR #110217)

2025-01-08 Thread Michael Kruse via llvm-branch-commits

Meinersbur wrote:

> The library is present under
> 
> ```
> $PREFIX/flang-rt/lib/x86_64-unknown-linux-gnu/libflang_rt.a
> ```
> 
> (where it got installed through the changes in this PR, without any specific 
> overrides).

If `$PREFIX` is `CMAKE_INSTALL_DIR`, that is the wrong location. It should be in 
`$PREFIX/lib/clang/20/lib/x86_64-unknown-linux-gnu/libflang_rt.a`. I don't see 
how the `flang_rt` dir could get in there. Can you try with the latest update?

If `$PREFIX` is CMake's build directory, are you using a runtimes-standalone 
(non-bootstrap) build and running flang from there? Then `$PREFIX` for flang and 
flang-rt are different. `flang` looks into its own build dir only. You will have 
to pass `-L` pointing to that directory or install them into the same `CMAKE_INSTALL_DIR`.

https://github.com/llvm/llvm-project/pull/110217


[llvm-branch-commits] [clang] [flang] [lld] [llvm] [Flang] LLVM_ENABLE_RUNTIMES=flang-rt (PR #110217)

2025-01-08 Thread Michael Kruse via llvm-branch-commits


@@ -0,0 +1,232 @@
+#===-- CMakeLists.txt --===#
+#
+# Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+# See https://llvm.org/LICENSE.txt for license information.
+# SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+#
+#======#
+#
+# Build instructions for the flang-rt library. This file is intended to be
+# included using the LLVM_ENABLE_RUNTIMES mechanism.
+#
+#======#
+
+if (NOT LLVM_RUNTIMES_BUILD)
+  message(FATAL_ERROR "Use this CMakeLists.txt from LLVM's runtimes build system.
+  Example:
+cmake /runtimes -DLLVM_ENABLE_RUNTIMES=flang-rt
+")
+endif ()
+
+set(LLVM_SUBPROJECT_TITLE "Flang-RT")
+set(FLANG_RT_SOURCE_DIR "${CMAKE_CURRENT_SOURCE_DIR}")
+set(FLANG_RT_BINARY_DIR "${CMAKE_CURRENT_BINARY_DIR}")
+set(FLANG_SOURCE_DIR "${CMAKE_CURRENT_SOURCE_DIR}/../flang")
+
+
+# CMake 3.24 is the first version of CMake that directly recognizes Flang.
+# LLVM's requirement is only CMake 3.20, teach CMake 3.20-3.23 how to use Flang.
+if (CMAKE_VERSION VERSION_LESS "3.24")
+  cmake_path(GET CMAKE_Fortran_COMPILER STEM _Fortran_COMPILER_STEM)
+  if (_Fortran_COMPILER_STEM STREQUAL "flang-new" OR _Fortran_COMPILER_STEM STREQUAL "flang")
+include(CMakeForceCompiler)
+CMAKE_FORCE_Fortran_COMPILER("${CMAKE_Fortran_COMPILER}" "LLVMFlang")
+
+set(CMAKE_Fortran_COMPILER_ID "LLVMFlang")
+set(CMAKE_Fortran_COMPILER_VERSION "${LLVM_VERSION_MAJOR}.${LLVM_VERSION_MINOR}")
+
+set(CMAKE_Fortran_SUBMODULE_SEP "-")
+set(CMAKE_Fortran_SUBMODULE_EXT ".mod")
+
+set(CMAKE_Fortran_PREPROCESS_SOURCE
+  " -cpp-E  
> ")
+
+set(CMAKE_Fortran_FORMAT_FIXED_FLAG "-ffixed-form")
+set(CMAKE_Fortran_FORMAT_FREE_FLAG "-ffree-form")
+
+set(CMAKE_Fortran_MODDIR_FLAG "-module-dir")
+
+set(CMAKE_Fortran_COMPILE_OPTIONS_PREPROCESS_ON "-cpp")
+set(CMAKE_Fortran_COMPILE_OPTIONS_PREPROCESS_OFF "-nocpp")
+set(CMAKE_Fortran_POSTPROCESS_FLAG "-ffixed-line-length-72")
+
+set(CMAKE_Fortran_COMPILE_OPTIONS_TARGET "--target=")
+
+set(CMAKE_Fortran_LINKER_WRAPPER_FLAG "-Wl,")
+set(CMAKE_Fortran_LINKER_WRAPPER_FLAG_SEP ",")
+  endif ()
+endif ()
+enable_language(Fortran)
+
+
+list(APPEND CMAKE_MODULE_PATH
+"${FLANG_RT_SOURCE_DIR}/cmake/modules"
+"${FLANG_SOURCE_DIR}/cmake/modules"
+  )
+include(AddFlangRT)
+include(GetToolchainDirs)
+include(FlangCommon)
+include(HandleCompilerRT)
+include(ExtendPath)
+include(GNUInstallDirs)
+
+
+
+# Build Mode Introspection #
+
+
+# Determine whether we are in the runtimes/runtimes-bins directory of a
+# bootstrap build.
+set(LLVM_TREE_AVAILABLE OFF)
+if (LLVM_LIBRARY_OUTPUT_INTDIR AND LLVM_RUNTIME_OUTPUT_INTDIR AND PACKAGE_VERSION)
+  set(LLVM_TREE_AVAILABLE ON)
+endif()
+
+# Path to LLVM development tools (FileCheck, llvm-lit, not, ...)
+set(LLVM_TOOLS_DIR "${LLVM_BINARY_DIR}/bin")
+
+# Determine build and install paths.
+# The build path is absolute, but the install dir is relative, CMake's install
+# command has to apply CMAKE_INSTALL_PREFIX itself.
+if (LLVM_TREE_AVAILABLE)
+  # In a bootstrap build emit the libraries into a default search path in the
+  # build directory of the just-built compiler. This allows using the
+  # just-built compiler without specifying paths to runtime libraries.
+  #
+  # Despite Clang in the name, get_clang_resource_dir does not depend on Clang
+  # being added to the build. Flang uses the same resource dir as clang.
+  include(GetClangResourceDir)
+  get_clang_resource_dir(FLANG_RT_OUTPUT_RESOURCE_DIR PREFIX "${LLVM_LIBRARY_OUTPUT_INTDIR}/..")

Meinersbur wrote:

`lib64` is determined by 
[`CMAKE_INSTALL_LIBDIR`](https://cmake.org/cmake/help/latest/module/GNUInstallDirs.html), 
which seems to be the correct way to do it. If clang does not recognize it, I will 
need to hardcode `lib`.

https://github.com/llvm/llvm-project/pull/110217


[llvm-branch-commits] [llvm] AMDGPU: Reduce 64-bit add width if low bits are known 0 (PR #122049)

2025-01-08 Thread Jay Foad via llvm-branch-commits

jayfoad wrote:

Why doesn't this fall out naturally from splitting the 64-bit add into 32-bit 
parts and then simplifying each part? Do we leave it as a 64-bit add all the 
way until final instruction selection?
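
For readers following along, the arithmetic behind the PR title can be sketched as below. This is only an illustration of the identity, not the code in the patch: the helper name `add64_low_zero` is hypothetical, and it assumes the low 32 bits of the second operand are known to be zero. In that case the low half of the sum is just the low half of the first operand, no carry can reach the high half, and a single 32-bit add of the high halves is enough.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not from the patch): computes a + b for 64-bit values
// under the assumption that the low 32 bits of b are zero.
uint64_t add64_low_zero(uint64_t a, uint64_t b) {
  assert((b & 0xFFFFFFFFu) == 0 && "low 32 bits of b must be zero");
  // Low half: lo(a) + 0, so it passes through and produces no carry.
  uint32_t lo = static_cast<uint32_t>(a);
  // High half: a plain 32-bit add, wrapping modulo 2^32 like the 64-bit add.
  uint32_t hi = static_cast<uint32_t>(a >> 32) + static_cast<uint32_t>(b >> 32);
  return (static_cast<uint64_t>(hi) << 32) | lo;
}
```

For any `a`, and any `b` whose low word is zero, the result should equal the ordinary wrapping 64-bit sum `a + b`.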

https://github.com/llvm/llvm-project/pull/122049


[llvm-branch-commits] [mlir] [MLIR][OpenMP] Add host_eval clause to omp.target (PR #116049)

2025-01-08 Thread Anchu Rajendran S via llvm-branch-commits


@@ -1166,9 +1166,10 @@ def TargetOp : OpenMP_Op<"target", traits = [
   ], clauses = [
 // TODO: Complete clause list (defaultmap, uses_allocators).
 OpenMP_AllocateClause, OpenMP_DependClause, OpenMP_DeviceClause,
-OpenMP_HasDeviceAddrClause, OpenMP_IfClause, OpenMP_InReductionClause,
-OpenMP_IsDevicePtrClause, OpenMP_MapClauseSkip,
-OpenMP_NowaitClause, OpenMP_PrivateClause, OpenMP_ThreadLimitClause
+OpenMP_HasDeviceAddrClause, OpenMP_HostEvalClause, OpenMP_IfClause,

anchuraj wrote:

I don't think it is a `Pseudo` clause; I think it's more of an `Internal` / 
`Private` clause. I am not able to come up with better alternatives. Since you 
have documented the clause well, please feel free to skip the suggestion.

https://github.com/llvm/llvm-project/pull/116049