[llvm-branch-commits] [X86][NewPM] Port lower-amx-intrinsics to NewPM (PR #165113)

2025-11-01 Thread Aiden Grossman via llvm-branch-commits

https://github.com/boomanaiden154 updated 
https://github.com/llvm/llvm-project/pull/165113


___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [LVer][profcheck] explicitly set unknown branch weights for the versioned/unversioned selector (PR #164507)

2025-11-01 Thread Florian Hahn via llvm-branch-commits


@@ -109,8 +110,13 @@ void LoopVersioning::versionLoop(
   // Insert the conditional branch based on the result of the memchecks.
   Instruction *OrigTerm = RuntimeCheckBB->getTerminator();
   Builder.SetInsertPoint(OrigTerm);
-  Builder.CreateCondBr(RuntimeCheck, NonVersionedLoop->getLoopPreheader(),
-   VersionedLoop->getLoopPreheader());
+  auto *BI =
+  Builder.CreateCondBr(RuntimeCheck, NonVersionedLoop->getLoopPreheader(),
+   VersionedLoop->getLoopPreheader());
+  // We don't know what the probability of executing the versioned vs the
+  // unversioned variants is.
+  setExplicitlyUnknownBranchWeightsIfProfiled(
+  *BI, *BI->getParent()->getParent(), DEBUG_TYPE);

fhahn wrote:

Actually, looks like the argument can be removed altogether 
https://github.com/llvm/llvm-project/pull/166028

https://github.com/llvm/llvm-project/pull/164507


[llvm-branch-commits] [llvm] [LVer][profcheck] explicitly set unknown branch weights for the versioned/unversioned selector (PR #164507)

2025-11-01 Thread Florian Hahn via llvm-branch-commits


@@ -109,8 +110,13 @@ void LoopVersioning::versionLoop(
   // Insert the conditional branch based on the result of the memchecks.
   Instruction *OrigTerm = RuntimeCheckBB->getTerminator();
   Builder.SetInsertPoint(OrigTerm);
-  Builder.CreateCondBr(RuntimeCheck, NonVersionedLoop->getLoopPreheader(),
-   VersionedLoop->getLoopPreheader());
+  auto *BI =
+  Builder.CreateCondBr(RuntimeCheck, NonVersionedLoop->getLoopPreheader(),
+   VersionedLoop->getLoopPreheader());
+  // We don't know what the probability of executing the versioned vs the
+  // unversioned variants is.
+  setExplicitlyUnknownBranchWeightsIfProfiled(
+  *BI, *BI->getParent()->getParent(), DEBUG_TYPE);

fhahn wrote:

Or not, looks like InstCombine passes disconnected instructions

https://github.com/llvm/llvm-project/pull/164507


[llvm-branch-commits] [libcxx] Add Testing Configuration for LLVM libc (PR #165120)

2025-11-01 Thread Aiden Grossman via llvm-branch-commits

https://github.com/boomanaiden154 updated 
https://github.com/llvm/llvm-project/pull/165120




[llvm-branch-commits] [llvm] [LVer][profcheck] explicitly set unknown branch weights for the versioned/unversioned selector (PR #164507)

2025-11-01 Thread Florian Hahn via llvm-branch-commits


@@ -109,8 +110,13 @@ void LoopVersioning::versionLoop(
   // Insert the conditional branch based on the result of the memchecks.
   Instruction *OrigTerm = RuntimeCheckBB->getTerminator();
   Builder.SetInsertPoint(OrigTerm);
-  Builder.CreateCondBr(RuntimeCheck, NonVersionedLoop->getLoopPreheader(),
-   VersionedLoop->getLoopPreheader());
+  auto *BI =
+  Builder.CreateCondBr(RuntimeCheck, NonVersionedLoop->getLoopPreheader(),
+   VersionedLoop->getLoopPreheader());
+  // We don't know what the probability of executing the versioned vs the
+  // unversioned variants is.
+  setExplicitlyUnknownBranchWeightsIfProfiled(
+  *BI, *BI->getParent()->getParent(), DEBUG_TYPE);

fhahn wrote:

```suggestion
  *BI, *BI->getFunction(), DEBUG_TYPE);
```



https://github.com/llvm/llvm-project/pull/164507


[llvm-branch-commits] [X86][NewPM] Port lower-amx-intrinsics to NewPM (PR #165113)

2025-11-01 Thread Aiden Grossman via llvm-branch-commits


@@ -179,7 +179,18 @@ FunctionPass *createX86LowerAMXTypeLegacyPass();
 
 /// The pass transforms amx intrinsics to scalar operation if the function has
 /// optnone attribute or it is O0.
-FunctionPass *createX86LowerAMXIntrinsicsPass();
+class X86LowerAMXIntrinsicsPass
+: public PassInfoMixin<X86LowerAMXIntrinsicsPass> {
+private:
+  const TargetMachine *TM;
+
+public:
+  X86LowerAMXIntrinsicsPass(const TargetMachine *TM) : TM(TM) {}
+  PreservedAnalyses run(Function &F, FunctionAnalysisManager &FAM);
+  static bool isRequired() { return true; }

boomanaiden154 wrote:

I'm not sure we should be using a backend pass enabled at O0 to remove 
redundant stack load/stores if those are supposed to be cleaned up in the 
middle end.

Either way, this patch just intends to port the pass, not fix any latent issues 
or clean up any latent tech debt of this sort.

https://github.com/llvm/llvm-project/pull/165113


[llvm-branch-commits] [llvm] [Instrumentor] Allow printing a runtime stub (PR #138978)

2025-11-01 Thread Kevin Sala Penades via llvm-branch-commits

https://github.com/kevinsala updated 
https://github.com/llvm/llvm-project/pull/138978

From 371483a750a456459d054a56787b40e946ab2890 Mon Sep 17 00:00:00 2001
From: Kevin Sala 
Date: Tue, 6 May 2025 22:48:41 -0700
Subject: [PATCH 1/2] [Instrumentor] Allow printing a runtime stub

---
 .../llvm/Transforms/IPO/Instrumentor.h|  16 ++
 .../Transforms/IPO/InstrumentorStubPrinter.h  |  32 +++
 llvm/lib/Transforms/IPO/CMakeLists.txt|   1 +
 llvm/lib/Transforms/IPO/Instrumentor.cpp  |   3 +
 .../IPO/InstrumentorStubPrinter.cpp   | 210 ++
 .../Instrumentor/bad_rt_config.json   | 105 +
 .../Instrumentor/default_config.json  |   2 +
 .../Instrumentation/Instrumentor/default_rt   |  37 +++
 .../Instrumentor/generate_bad_rt.ll   |   3 +
 .../Instrumentor/generate_rt.ll   |   2 +
 .../Instrumentor/load_store_config.json   |   2 +-
 .../load_store_noreplace_config.json  |   2 +-
 .../Instrumentor/rt_config.json   | 105 +
 13 files changed, 518 insertions(+), 2 deletions(-)
 create mode 100644 llvm/include/llvm/Transforms/IPO/InstrumentorStubPrinter.h
 create mode 100644 llvm/lib/Transforms/IPO/InstrumentorStubPrinter.cpp
 create mode 100644 llvm/test/Instrumentation/Instrumentor/bad_rt_config.json
 create mode 100644 llvm/test/Instrumentation/Instrumentor/default_rt
 create mode 100644 llvm/test/Instrumentation/Instrumentor/generate_bad_rt.ll
 create mode 100644 llvm/test/Instrumentation/Instrumentor/generate_rt.ll
 create mode 100644 llvm/test/Instrumentation/Instrumentor/rt_config.json

diff --git a/llvm/include/llvm/Transforms/IPO/Instrumentor.h b/llvm/include/llvm/Transforms/IPO/Instrumentor.h
index 26445d221d00f..e6d5f717072a2 100644
--- a/llvm/include/llvm/Transforms/IPO/Instrumentor.h
+++ b/llvm/include/llvm/Transforms/IPO/Instrumentor.h
@@ -116,6 +116,18 @@ struct IRTCallDescription {
InstrumentorIRBuilderTy &IIRB, const DataLayout &DL,
InstrumentationCaches &ICaches);
 
+  /// Create a string representation of the function declaration in C. Two
+  /// strings are returned: the function definition with direct arguments and
+  /// the function with any indirect argument.
+  std::pair<std::string, std::string>
+  createCSignature(const InstrumentationConfig &IConf) const;
+
+  /// Create a string representation of the function definition in C. The
+  /// function body implements a stub and only prints the passed arguments. Two
+  /// strings are returned: the function definition with direct arguments and
+  /// the function with any indirect argument.
+  std::pair<std::string, std::string> createCBodies() const;
+
   /// Return whether the \p IRTA argument can be replaced.
   bool isReplacable(IRTArg &IRTA) const {
 return (IRTA.Flags & (IRTArg::REPLACABLE | IRTArg::REPLACABLE_CUSTOM));
@@ -334,6 +346,9 @@ struct InstrumentationConfig {
   InstrumentationConfig() : SS(StringAllocator) {
 RuntimePrefix = BaseConfigurationOption::getStringOption(
 *this, "runtime_prefix", "The runtime API prefix.", "__instrumentor_");
+RuntimeStubsFile = BaseConfigurationOption::getStringOption(
+*this, "runtime_stubs_file",
+"The file into which runtime stubs should be written.", "");
 TargetRegex = BaseConfigurationOption::getStringOption(
 *this, "target_regex",
 "Regular expression to be matched against the module target. "
@@ -380,6 +395,7 @@ struct InstrumentationConfig {
 
   /// The base configuration options.
   BaseConfigurationOption *RuntimePrefix;
+  BaseConfigurationOption *RuntimeStubsFile;
   BaseConfigurationOption *TargetRegex;
   BaseConfigurationOption *HostEnabled;
   BaseConfigurationOption *GPUEnabled;
diff --git a/llvm/include/llvm/Transforms/IPO/InstrumentorStubPrinter.h b/llvm/include/llvm/Transforms/IPO/InstrumentorStubPrinter.h
new file mode 100644
index 0..6e1e24d5fef9e
--- /dev/null
+++ b/llvm/include/llvm/Transforms/IPO/InstrumentorStubPrinter.h
@@ -0,0 +1,32 @@
+//===- Transforms/IPO/InstrumentorStubPrinter.h ---===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===--===//
+//
+// A generator of Instrumentor's runtime stubs.
+//
+//===--===//
+
+#ifndef LLVM_TRANSFORMS_IPO_INSTRUMENTOR_STUB_PRINTER_H
+#define LLVM_TRANSFORMS_IPO_INSTRUMENTOR_STUB_PRINTER_H
+
+#include "llvm/ADT/StringRef.h"
+#include "llvm/IR/Module.h"
+#include "llvm/Transforms/IPO/Instrumentor.h"
+
+namespace llvm {
+namespace instrumentor {
+
+/// Print a runtime stub file with the implementation of the instrumentation
+/// runtime functions corresponding to the instrumentation opportunities
+/// enabled.
+void pri

[llvm-branch-commits] Revert "[X86] Narrow BT/BTC/BTR/BTS compare + RMW patterns on very large integers (#165540)" (PR #165979)

2025-11-01 Thread Vitaly Buka via llvm-branch-commits

https://github.com/vitalybuka edited 
https://github.com/llvm/llvm-project/pull/165979


[llvm-branch-commits] Revert "[X86] Narrow BT/BTC/BTR/BTS compare + RMW patterns on very large integers (#165540)" (PR #165979)

2025-11-01 Thread Vitaly Buka via llvm-branch-commits

https://github.com/vitalybuka created 
https://github.com/llvm/llvm-project/pull/165979

This reverts commit a55a7207c7e4d98dad32e8d53dd5964ee833edd9.





[llvm-branch-commits] Revert "[X86] Narrow BT/BTC/BTR/BTS compare + RMW patterns on very large integers (#165540)" (PR #165979)

2025-11-01 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-backend-x86

Author: Vitaly Buka (vitalybuka)


Changes

This reverts commit a55a7207c7e4d98dad32e8d53dd5964ee833edd9.


---

Patch is 338.85 KiB, truncated to 20.00 KiB below, full version: 
https://github.com/llvm/llvm-project/pull/165979.diff


2 Files Affected:

- (modified) llvm/lib/Target/X86/X86ISelLowering.cpp (+2-112) 
- (modified) llvm/test/CodeGen/X86/bittest-big-integer.ll (+6331-994) 


``diff
diff --git a/llvm/lib/Target/X86/X86ISelLowering.cpp b/llvm/lib/Target/X86/X86ISelLowering.cpp
index 6f75a2eb7075a..c5fb5535d0057 100644
--- a/llvm/lib/Target/X86/X86ISelLowering.cpp
+++ b/llvm/lib/Target/X86/X86ISelLowering.cpp
@@ -53344,80 +53344,6 @@ static SDValue combineMaskedStore(SDNode *N, SelectionDAG &DAG,
   return SDValue();
 }
 
-// Look for a RMW operation that only touches one bit of a larger than legal
-// type and fold it to a BTC/BTR/BTS pattern acting on a single i32 sub value.
-static SDValue narrowBitOpRMW(StoreSDNode *St, const SDLoc &DL,
-  SelectionDAG &DAG,
-  const X86Subtarget &Subtarget) {
-  using namespace SDPatternMatch;
-
-  // Only handle normal stores and its chain was a matching normal load.
-  auto *Ld = dyn_cast<LoadSDNode>(St->getChain());
-  if (!ISD::isNormalStore(St) || !St->isSimple() || !Ld ||
-  !ISD::isNormalLoad(Ld) || !Ld->isSimple() ||
-  Ld->getBasePtr() != St->getBasePtr() ||
-  Ld->getOffset() != St->getOffset())
-return SDValue();
-
-  SDValue LoadVal(Ld, 0);
-  SDValue StoredVal = St->getValue();
-  EVT VT = StoredVal.getValueType();
-
-  // Only narrow larger than legal scalar integers.
-  if (!VT.isScalarInteger() ||
-  VT.getSizeInBits() <= (Subtarget.is64Bit() ? 64 : 32))
-return SDValue();
-
-  // BTR: X & ~(1 << ShAmt)
-  // BTS: X | (1 << ShAmt)
-  // BTC: X ^ (1 << ShAmt)
-  SDValue ShAmt;
-  if (!StoredVal.hasOneUse() ||
-  !(sd_match(StoredVal, m_And(m_Specific(LoadVal),
-  m_Not(m_Shl(m_One(), m_Value(ShAmt))))) ||
-sd_match(StoredVal,
- m_Or(m_Specific(LoadVal), m_Shl(m_One(), m_Value(ShAmt)))) ||
-sd_match(StoredVal,
- m_Xor(m_Specific(LoadVal), m_Shl(m_One(), m_Value(ShAmt))))))
-return SDValue();
-
-  // Ensure the shift amount is in bounds.
-  KnownBits KnownAmt = DAG.computeKnownBits(ShAmt);
-  if (KnownAmt.getMaxValue().uge(VT.getSizeInBits()))
-return SDValue();
-
-  // Split the shift into an alignment shift that moves the active i32 block to
-  // the bottom bits for truncation and a modulo shift that can act on the i32.
-  EVT AmtVT = ShAmt.getValueType();
-  SDValue AlignAmt = DAG.getNode(ISD::AND, DL, AmtVT, ShAmt,
- DAG.getSignedConstant(-32LL, DL, AmtVT));
-  SDValue ModuloAmt =
-  DAG.getNode(ISD::AND, DL, AmtVT, ShAmt, DAG.getConstant(31, DL, AmtVT));
-
-  // Compute the byte offset for the i32 block that is changed by the RMW.
-  // combineTruncate will adjust the load for us in a similar way.
-  EVT PtrVT = St->getBasePtr().getValueType();
-  SDValue PtrBitOfs = DAG.getZExtOrTrunc(AlignAmt, DL, PtrVT);
-  SDValue PtrByteOfs = DAG.getNode(ISD::SRL, DL, PtrVT, PtrBitOfs,
-   DAG.getShiftAmountConstant(3, PtrVT, DL));
-  SDValue NewPtr = DAG.getMemBasePlusOffset(St->getBasePtr(), PtrByteOfs, DL,
-SDNodeFlags::NoUnsignedWrap);
-
-  // Reconstruct the BTC/BTR/BTS pattern for the i32 block and store.
-  SDValue X = DAG.getNode(ISD::SRL, DL, VT, LoadVal, AlignAmt);
-  X = DAG.getNode(ISD::TRUNCATE, DL, MVT::i32, X);
-
-  SDValue Mask =
-  DAG.getNode(ISD::SHL, DL, MVT::i32, DAG.getConstant(1, DL, MVT::i32),
-  DAG.getZExtOrTrunc(ModuloAmt, DL, MVT::i8));
-  if (StoredVal.getOpcode() == ISD::AND)
-Mask = DAG.getNOT(DL, Mask, MVT::i32);
-
-  SDValue Res = DAG.getNode(StoredVal.getOpcode(), DL, MVT::i32, X, Mask);
-  return DAG.getStore(St->getChain(), DL, Res, NewPtr, St->getPointerInfo(),
-  Align(), St->getMemOperand()->getFlags());
-}
-
 static SDValue combineStore(SDNode *N, SelectionDAG &DAG,
 TargetLowering::DAGCombinerInfo &DCI,
 const X86Subtarget &Subtarget) {
@@ -53644,9 +53570,6 @@ static SDValue combineStore(SDNode *N, SelectionDAG &DAG,
 }
   }
 
-  if (SDValue R = narrowBitOpRMW(St, dl, DAG, Subtarget))
-return R;
-
   // Convert store(cmov(load(p), x, CC), p) to cstore(x, p, CC)
   // store(cmov(x, load(p), CC), p) to cstore(x, p, InvertCC)
   if ((VT == MVT::i16 || VT == MVT::i32 || VT == MVT::i64) &&
@@ -54579,9 +54502,8 @@ static SDValue combineTruncate(SDNode *N, SelectionDAG &DAG,
   // truncation, see if we can convert the shift into a pointer offset instead.
   // Limit this to normal (non-ext) scalar integer loads.
   if (SrcVT.isScalarInteger() && Src.getOpcode() == ISD::SRL 
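
The comments in the removed `narrowBitOpRMW` describe splitting the shift amount into an alignment part (`ShAmt & -32`) that selects an i32 block and a modulo part (`ShAmt & 31`) that selects the bit inside it, with the block's byte offset being the alignment part shifted right by 3. The following is only a standalone sketch of that arithmetic, not the LLVM code: it assumes a 256-bit value held in little-endian memory, and `Bytes`, `BitOp`, `load32le`, `store32le`, `bitOpReference`, `bitOpNarrowed`, and `narrowedMatchesReference` are all hypothetical names introduced for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical model: a 256-bit integer stored in memory in little-endian
// byte order, as it would be on x86.
using Bytes = std::array<std::uint8_t, 32>;

enum class BitOp { Reset, Set, Complement }; // roughly BTR, BTS, BTC

// Little-endian 32-bit load/store helpers (illustration only).
std::uint32_t load32le(const Bytes &M, std::size_t Ofs) {
  return std::uint32_t(M[Ofs]) | (std::uint32_t(M[Ofs + 1]) << 8) |
         (std::uint32_t(M[Ofs + 2]) << 16) | (std::uint32_t(M[Ofs + 3]) << 24);
}

void store32le(Bytes &M, std::size_t Ofs, std::uint32_t V) {
  for (std::size_t I = 0; I < 4; ++I)
    M[Ofs + I] = std::uint8_t(V >> (8 * I));
}

// Reference behavior: modify bit ShAmt of the whole wide value.
void bitOpReference(Bytes &M, unsigned ShAmt, BitOp Op) {
  std::uint8_t Mask = std::uint8_t(1u << (ShAmt % 8));
  std::uint8_t &B = M[ShAmt / 8];
  switch (Op) {
  case BitOp::Reset:      B = std::uint8_t(B & ~Mask); break; // X & ~(1 << ShAmt)
  case BitOp::Set:        B = std::uint8_t(B | Mask);  break; // X | (1 << ShAmt)
  case BitOp::Complement: B = std::uint8_t(B ^ Mask);  break; // X ^ (1 << ShAmt)
  }
}

// Narrowed form: touch only the 32-bit block that contains the bit.
void bitOpNarrowed(Bytes &M, unsigned ShAmt, BitOp Op) {
  unsigned AlignAmt = ShAmt & ~31u;    // bit offset of the i32 block
  unsigned ModuloAmt = ShAmt & 31u;    // bit index inside that block
  std::size_t ByteOfs = AlignAmt >> 3; // bits to bytes
  std::uint32_t X = load32le(M, ByteOfs);
  std::uint32_t Mask = std::uint32_t(1) << ModuloAmt;
  switch (Op) {
  case BitOp::Reset:      X &= ~Mask; break;
  case BitOp::Set:        X |= Mask;  break;
  case BitOp::Complement: X ^= Mask;  break;
  }
  store32le(M, ByteOfs, X);
}

// Compare both forms for every in-bounds shift amount and every operation.
bool narrowedMatchesReference() {
  Bytes Init;
  for (std::size_t I = 0; I < Init.size(); ++I)
    Init[I] = std::uint8_t(I * 37 + 11); // arbitrary non-trivial bit pattern
  for (unsigned ShAmt = 0; ShAmt < 256; ++ShAmt)
    for (BitOp Op : {BitOp::Reset, BitOp::Set, BitOp::Complement}) {
      Bytes A = Init, B = Init;
      bitOpReference(A, ShAmt, Op);
      bitOpNarrowed(B, ShAmt, Op);
      if (A != B)
        return false;
    }
  return true;
}
```

Calling `narrowedMatchesReference()` checks the two forms against each other for shift amounts 0 through 255. The sketch only covers in-bounds shift amounts, mirroring the bounds check the removed code performs with `computeKnownBits` before narrowing.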

[llvm-branch-commits] [llvm] [openmp] [OpenMP][Offload] Add offload runtime support for dyn_groupprivate clause (PR #152831)

2025-11-01 Thread Kevin Sala Penades via llvm-branch-commits

https://github.com/kevinsala updated 
https://github.com/llvm/llvm-project/pull/152831

From fa3c7425ae9e5ffea83841f2be61b0f494b99038 Mon Sep 17 00:00:00 2001
From: Kevin Sala 
Date: Fri, 8 Aug 2025 11:25:14 -0700
Subject: [PATCH 1/4] [OpenMP][Offload] Add offload runtime support for
 dyn_groupprivate clause

---
 offload/DeviceRTL/include/DeviceTypes.h   |   4 +
 offload/DeviceRTL/include/Interface.h |   2 +-
 offload/DeviceRTL/include/State.h |   2 +-
 offload/DeviceRTL/src/Kernel.cpp  |  14 +-
 offload/DeviceRTL/src/State.cpp   |  48 +-
 offload/include/Shared/APITypes.h |   6 +-
 offload/include/Shared/Environment.h  |   4 +-
 offload/include/device.h  |   3 +
 offload/include/omptarget.h   |   7 +-
 offload/libomptarget/OpenMP/API.cpp   |  14 ++
 offload/libomptarget/device.cpp   |   6 +
 offload/libomptarget/exports  |   1 +
 .../amdgpu/dynamic_hsa/hsa_ext_amd.h  |   1 +
 offload/plugins-nextgen/amdgpu/src/rtl.cpp|  34 +++--
 .../common/include/PluginInterface.h  |  33 +++-
 .../common/src/PluginInterface.cpp|  86 ---
 .../plugins-nextgen/cuda/dynamic_cuda/cuda.h  |   1 +
 offload/plugins-nextgen/cuda/src/rtl.cpp  |  37 +++--
 offload/plugins-nextgen/host/src/rtl.cpp  |   4 +-
 .../offloading/dyn_groupprivate_strict.cpp| 141 ++
 openmp/runtime/src/include/omp.h.var  |  10 ++
 openmp/runtime/src/kmp_csupport.cpp   |   9 ++
 openmp/runtime/src/kmp_stub.cpp   |  16 ++
 23 files changed, 418 insertions(+), 65 deletions(-)
 create mode 100644 offload/test/offloading/dyn_groupprivate_strict.cpp

diff --git a/offload/DeviceRTL/include/DeviceTypes.h b/offload/DeviceRTL/include/DeviceTypes.h
index 2e5d92380f040..a43b506d6879e 100644
--- a/offload/DeviceRTL/include/DeviceTypes.h
+++ b/offload/DeviceRTL/include/DeviceTypes.h
@@ -163,4 +163,8 @@ typedef enum omp_allocator_handle_t {
 
 ///}
 
+enum omp_access_t {
+  omp_access_cgroup = 0,
+};
+
 #endif
diff --git a/offload/DeviceRTL/include/Interface.h b/offload/DeviceRTL/include/Interface.h
index c4bfaaa2404b4..672afea206785 100644
--- a/offload/DeviceRTL/include/Interface.h
+++ b/offload/DeviceRTL/include/Interface.h
@@ -222,7 +222,7 @@ struct KernelEnvironmentTy;
 int8_t __kmpc_is_spmd_exec_mode();
 
 int32_t __kmpc_target_init(KernelEnvironmentTy &KernelEnvironment,
-   KernelLaunchEnvironmentTy &KernelLaunchEnvironment);
+   KernelLaunchEnvironmentTy *KernelLaunchEnvironment);
 
 void __kmpc_target_deinit();
 
diff --git a/offload/DeviceRTL/include/State.h b/offload/DeviceRTL/include/State.h
index db396dae6e445..17c3c6f2d3e42 100644
--- a/offload/DeviceRTL/include/State.h
+++ b/offload/DeviceRTL/include/State.h
@@ -116,7 +116,7 @@ extern Local ThreadStates;
 
 /// Initialize the state machinery. Must be called by all threads.
 void init(bool IsSPMD, KernelEnvironmentTy &KernelEnvironment,
-  KernelLaunchEnvironmentTy &KernelLaunchEnvironment);
+  KernelLaunchEnvironmentTy *KernelLaunchEnvironment);
 
 /// Return the kernel and kernel launch environment associated with the current
 /// kernel. The former is static and contains compile time information that
diff --git a/offload/DeviceRTL/src/Kernel.cpp b/offload/DeviceRTL/src/Kernel.cpp
index 467e44a65276c..58e9a09105a76 100644
--- a/offload/DeviceRTL/src/Kernel.cpp
+++ b/offload/DeviceRTL/src/Kernel.cpp
@@ -34,8 +34,8 @@ enum OMPTgtExecModeFlags : unsigned char {
 };
 
 static void
-inititializeRuntime(bool IsSPMD, KernelEnvironmentTy &KernelEnvironment,
-KernelLaunchEnvironmentTy &KernelLaunchEnvironment) {
+initializeRuntime(bool IsSPMD, KernelEnvironmentTy &KernelEnvironment,
+  KernelLaunchEnvironmentTy *KernelLaunchEnvironment) {
   // Order is important here.
   synchronize::init(IsSPMD);
   mapping::init(IsSPMD);
@@ -80,17 +80,17 @@ extern "C" {
 /// \param Ident   Source location identification, can be NULL.
 ///
 int32_t __kmpc_target_init(KernelEnvironmentTy &KernelEnvironment,
-   KernelLaunchEnvironmentTy &KernelLaunchEnvironment) {
+   KernelLaunchEnvironmentTy *KernelLaunchEnvironment) {
   ConfigurationEnvironmentTy &Configuration = KernelEnvironment.Configuration;
   bool IsSPMD = Configuration.ExecMode & OMP_TGT_EXEC_MODE_SPMD;
   bool UseGenericStateMachine = Configuration.UseGenericStateMachine;
   if (IsSPMD) {
-inititializeRuntime(/*IsSPMD=*/true, KernelEnvironment,
-KernelLaunchEnvironment);
+initializeRuntime(/*IsSPMD=*/true, KernelEnvironment,
+  KernelLaunchEnvironment);
 synchronize::threadsAligned(atomic::relaxed);
   } else {
-inititializeRuntime(/*IsSPMD=*/false, KernelEnvironment,
-KernelLaunchEnv