[Lldb-commits] [flang] [clang] [lldb] [libc] [compiler-rt] [clang-tools-extra] [lld] [llvm] [libcxx] [openmp] Gcc 75 libomptarget type convert (PR #75562)
@@ -47,7 +47,9 @@ PluginAdaptorTy::create(const std::string &Name) {
       new PluginAdaptorTy(Name, std::move(LibraryHandler)));
   if (auto Err = PluginAdaptor->init())
     return Err;
-  return PluginAdaptor;

jhuber6 wrote: Does putting `std::move` here not work?

https://github.com/llvm/llvm-project/pull/75562 ___ lldb-commits mailing list lldb-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits
[Lldb-commits] [lld] [libc] [clang-tools-extra] [compiler-rt] [lldb] [llvm] [flang] [libcxx] [openmp] [clang] [PGO][OpenMP] Instrumentation for GPU devices (PR #76587)
@@ -163,3 +163,87 @@ Error GenericGlobalHandlerTy::readGlobalFromImage(GenericDeviceTy &Device,
   return Plugin::success();
 }
+
+bool GenericGlobalHandlerTy::hasProfilingGlobals(GenericDeviceTy &Device,
+                                                 DeviceImageTy &Image) {
+  GlobalTy global(getInstrProfNamesVarName().str(), 0);
+  if (auto Err = getGlobalMetadataFromImage(Device, Image, global)) {
+    consumeError(std::move(Err));
+    return false;
+  }
+  return true;
+}
+
+Expected<GPUProfGlobals>
+GenericGlobalHandlerTy::readProfilingGlobals(GenericDeviceTy &Device,
+                                             DeviceImageTy &Image) {
+  GPUProfGlobals profdata;
+  auto ELFObj = getELFObjectFile(Image);
+  if (!ELFObj)
+    return ELFObj.takeError();
+  profdata.targetTriple = ELFObj->makeTriple();

jhuber6 wrote: Made a patch in https://github.com/llvm/llvm-project/pull/76992 and https://github.com/llvm/llvm-project/pull/76970 to make this actually work.

https://github.com/llvm/llvm-project/pull/76587
[Lldb-commits] [libc] [clang] [lld] [clang-tools-extra] [compiler-rt] [flang] [lldb] [libcxx] [llvm] [openmp] [PGO][OpenMP] Instrumentation for GPU devices (PR #76587)
@@ -163,3 +163,87 @@ Error GenericGlobalHandlerTy::readGlobalFromImage(GenericDeviceTy &Device,
   return Plugin::success();
 }
+
+bool GenericGlobalHandlerTy::hasProfilingGlobals(GenericDeviceTy &Device,
+                                                 DeviceImageTy &Image) {
+  GlobalTy global(getInstrProfNamesVarName().str(), 0);
+  if (auto Err = getGlobalMetadataFromImage(Device, Image, global)) {
+    consumeError(std::move(Err));
+    return false;
+  }
+  return true;
+}
+
+Expected<GPUProfGlobals>
+GenericGlobalHandlerTy::readProfilingGlobals(GenericDeviceTy &Device,
+                                             DeviceImageTy &Image) {
+  GPUProfGlobals profdata;
+  auto ELFObj = getELFObjectFile(Image);
+  if (!ELFObj)
+    return ELFObj.takeError();
+  profdata.targetTriple = ELFObj->makeTriple();
+  // Iterate through elf symbols
+  for (auto &sym : ELFObj->symbols()) {
+    if (auto name = sym.getName()) {
+      // Check if given current global is a profiling global based
+      // on name
+      if (name->equals(getInstrProfNamesVarName())) {
+        // Read in profiled function names
+        std::vector<char> chars(sym.getSize() / sizeof(char), ' ');

jhuber6 wrote: Why are we turning this into a vector of chars? Also isn't `sizeof(char)` pretty much always going to be `1`?

https://github.com/llvm/llvm-project/pull/76587
[Lldb-commits] [compiler-rt] [lldb] [openmp] [llvm] [clang-tools-extra] [lld] [flang] [clang] [libcxx] [libc] [PGO][OpenMP] Instrumentation for GPU devices (PR #76587)
@@ -163,3 +163,87 @@ Error GenericGlobalHandlerTy::readGlobalFromImage(GenericDeviceTy &Device,
   return Plugin::success();
 }
+
+bool GenericGlobalHandlerTy::hasProfilingGlobals(GenericDeviceTy &Device,
+                                                 DeviceImageTy &Image) {
+  GlobalTy global(getInstrProfNamesVarName().str(), 0);
+  if (auto Err = getGlobalMetadataFromImage(Device, Image, global)) {
+    consumeError(std::move(Err));
+    return false;
+  }
+  return true;
+}
+
+Expected<GPUProfGlobals>
+GenericGlobalHandlerTy::readProfilingGlobals(GenericDeviceTy &Device,
+                                             DeviceImageTy &Image) {
+  GPUProfGlobals profdata;
+  auto ELFObj = getELFObjectFile(Image);
+  if (!ELFObj)
+    return ELFObj.takeError();
+  profdata.targetTriple = ELFObj->makeTriple();
+  // Iterate through elf symbols
+  for (auto &sym : ELFObj->symbols()) {
+    if (auto name = sym.getName()) {
+      // Check if given current global is a profiling global based
+      // on name
+      if (name->equals(getInstrProfNamesVarName())) {
+        // Read in profiled function names
+        std::vector<char> chars(sym.getSize() / sizeof(char), ' ');
+        GlobalTy NamesGlobal(name->str(), sym.getSize(), chars.data());
+        if (auto Err = readGlobalFromDevice(Device, Image, NamesGlobal))
+          return Err;
+        std::string names(chars.begin(), chars.end());
+        profdata.names = std::move(names);
+      } else if (name->starts_with(getInstrProfCountersVarPrefix())) {
+        // Read global variable profiling counts
+        std::vector<int64_t> counts(sym.getSize() / sizeof(int64_t), 0);
+        GlobalTy CountGlobal(name->str(), sym.getSize(), counts.data());
+        if (auto Err = readGlobalFromDevice(Device, Image, CountGlobal))
+          return Err;
+        profdata.counts.push_back(std::move(counts));
+      } else if (name->starts_with(getInstrProfDataVarPrefix())) {
+        // Read profiling data for this global variable
+        __llvm_profile_data data{};
+        GlobalTy DataGlobal(name->str(), sym.getSize(), &data);
+        if (auto Err = readGlobalFromDevice(Device, Image, DataGlobal))
+          return Err;
+        profdata.data.push_back(std::move(data));
+      }
+    }
+  }
+  return profdata;
+}

jhuber6 wrote: LLVM style for everything here.

https://github.com/llvm/llvm-project/pull/76587
[Lldb-commits] [compiler-rt] [clang] [clang-tools-extra] [flang] [llvm] [libcxx] [lld] [lldb] [libc] [openmp] [PGO][OpenMP] Instrumentation for GPU devices (PR #76587)
@@ -58,6 +60,22 @@ class GlobalTy {
   void setPtr(void *P) { Ptr = P; }
 };
 
+typedef void *IntPtrT;

jhuber6 wrote: What's the utility of this?

https://github.com/llvm/llvm-project/pull/76587
[Lldb-commits] [lld] [clang-tools-extra] [openmp] [flang] [libc] [libcxx] [llvm] [lldb] [compiler-rt] [clang] [PGO][OpenMP] Instrumentation for GPU devices (PR #76587)
@@ -58,6 +60,22 @@ class GlobalTy {
   void setPtr(void *P) { Ptr = P; }
 };
 
+typedef void *IntPtrT;
+struct __llvm_profile_data {
+#define INSTR_PROF_DATA(Type, LLVMType, Name, Initializer) Type Name;
+#include "llvm/ProfileData/InstrProfData.inc"
+};
+
+/// PGO profiling data extracted from a GPU device
+struct GPUProfGlobals {
+  std::string names;
+  std::vector<std::vector<int64_t>> counts;
+  std::vector<__llvm_profile_data> data;
+  Triple targetTriple;
+

jhuber6 wrote: These should probably use LLVM structs. E.g. `StringRef` if the name is a constant string with stable storage, and `SmallVector`. I'd really appreciate some descriptions of how this is supposed to look and how it interacts with the existing profile data.

https://github.com/llvm/llvm-project/pull/76587
[Lldb-commits] [lld] [libcxx] [clang-tools-extra] [compiler-rt] [clang] [flang] [llvm] [libc] [openmp] [lldb] [PGO][OpenMP] Instrumentation for GPU devices (PR #76587)
@@ -163,3 +163,87 @@ Error GenericGlobalHandlerTy::readGlobalFromImage(GenericDeviceTy &Device,
   return Plugin::success();
 }
+
+bool GenericGlobalHandlerTy::hasProfilingGlobals(GenericDeviceTy &Device,
+                                                 DeviceImageTy &Image) {
+  GlobalTy global(getInstrProfNamesVarName().str(), 0);
+  if (auto Err = getGlobalMetadataFromImage(Device, Image, global)) {
+    consumeError(std::move(Err));
+    return false;
+  }
+  return true;
+}
+
+Expected<GPUProfGlobals>
+GenericGlobalHandlerTy::readProfilingGlobals(GenericDeviceTy &Device,
+                                             DeviceImageTy &Image) {
+  GPUProfGlobals profdata;

jhuber6 wrote:
```suggestion
  GPUProfGlobals ProfData;
```
LLVM style. Also not a fan of the name. Maybe `DeviceProfileData` or something.

https://github.com/llvm/llvm-project/pull/76587
[Lldb-commits] [libc] [openmp] [compiler-rt] [libcxx] [clang-tools-extra] [lld] [llvm] [clang] [flang] [lldb] [PGO][OpenMP] Instrumentation for GPU devices (PR #76587)
@@ -163,3 +163,87 @@ Error GenericGlobalHandlerTy::readGlobalFromImage(GenericDeviceTy &Device,
   return Plugin::success();
 }
+
+bool GenericGlobalHandlerTy::hasProfilingGlobals(GenericDeviceTy &Device,
+                                                 DeviceImageTy &Image) {
+  GlobalTy global(getInstrProfNamesVarName().str(), 0);
+  if (auto Err = getGlobalMetadataFromImage(Device, Image, global)) {
+    consumeError(std::move(Err));
+    return false;
+  }
+  return true;
+}
+
+Expected<GPUProfGlobals>
+GenericGlobalHandlerTy::readProfilingGlobals(GenericDeviceTy &Device,
+                                             DeviceImageTy &Image) {
+  GPUProfGlobals profdata;
+  auto ELFObj = getELFObjectFile(Image);
+  if (!ELFObj)
+    return ELFObj.takeError();
+  profdata.targetTriple = ELFObj->makeTriple();
+  // Iterate through elf symbols
+  for (auto &sym : ELFObj->symbols()) {
+    if (auto name = sym.getName()) {
+      // Check if given current global is a profiling global based
+      // on name
+      if (name->equals(getInstrProfNamesVarName())) {
+        // Read in profiled function names
+        std::vector<char> chars(sym.getSize() / sizeof(char), ' ');
+        GlobalTy NamesGlobal(name->str(), sym.getSize(), chars.data());
+        if (auto Err = readGlobalFromDevice(Device, Image, NamesGlobal))
+          return Err;
+        std::string names(chars.begin(), chars.end());
+        profdata.names = std::move(names);
+      } else if (name->starts_with(getInstrProfCountersVarPrefix())) {

jhuber6 wrote: Is the `getInstrProfCountersVarPrefix` function preexisting? I don't see it defined in this patch set.

https://github.com/llvm/llvm-project/pull/76587
[Lldb-commits] [lld] [lldb] [clang-tools-extra] [compiler-rt] [flang] [llvm] [clang] [libcxx] [openmp] [libc] [PGO][OpenMP] Instrumentation for GPU devices (PR #76587)
@@ -163,3 +163,87 @@ Error GenericGlobalHandlerTy::readGlobalFromImage(GenericDeviceTy &Device,
   return Plugin::success();
 }
+
+bool GenericGlobalHandlerTy::hasProfilingGlobals(GenericDeviceTy &Device,
+                                                 DeviceImageTy &Image) {
+  GlobalTy global(getInstrProfNamesVarName().str(), 0);
+  if (auto Err = getGlobalMetadataFromImage(Device, Image, global)) {
+    consumeError(std::move(Err));
+    return false;
+  }
+  return true;
+}
+
+Expected<GPUProfGlobals>
+GenericGlobalHandlerTy::readProfilingGlobals(GenericDeviceTy &Device,
+                                             DeviceImageTy &Image) {
+  GPUProfGlobals profdata;
+  auto ELFObj = getELFObjectFile(Image);
+  if (!ELFObj)
+    return ELFObj.takeError();
+  profdata.targetTriple = ELFObj->makeTriple();
+  // Iterate through elf symbols
+  for (auto &sym : ELFObj->symbols()) {
+    if (auto name = sym.getName()) {
+      // Check if given current global is a profiling global based
+      // on name
+      if (name->equals(getInstrProfNamesVarName())) {
+        // Read in profiled function names
+        std::vector<char> chars(sym.getSize() / sizeof(char), ' ');
+        GlobalTy NamesGlobal(name->str(), sym.getSize(), chars.data());

jhuber6 wrote: Okay, we're reading a string back from the device? What's the purpose of that? Also, just so you know, the ELF will only contain the correct size if it's emitted as an array. E.g.
```
const char a[] = "a"; // strlen("a") + 1 in ELF
const char *b = "b";  // sizeof(char *) in ELF
```

https://github.com/llvm/llvm-project/pull/76587
[Lldb-commits] [openmp] [lld] [clang-tools-extra] [libcxx] [llvm] [flang] [libc] [clang] [lldb] [compiler-rt] [PGO][OpenMP] Instrumentation for GPU devices (PR #76587)
@@ -163,3 +163,87 @@ Error GenericGlobalHandlerTy::readGlobalFromImage(GenericDeviceTy &Device,
   return Plugin::success();
 }
+
+bool GenericGlobalHandlerTy::hasProfilingGlobals(GenericDeviceTy &Device,
+                                                 DeviceImageTy &Image) {
+  GlobalTy global(getInstrProfNamesVarName().str(), 0);
+  if (auto Err = getGlobalMetadataFromImage(Device, Image, global)) {
+    consumeError(std::move(Err));
+    return false;
+  }
+  return true;
+}
+
+Expected<GPUProfGlobals>
+GenericGlobalHandlerTy::readProfilingGlobals(GenericDeviceTy &Device,
+                                             DeviceImageTy &Image) {
+  GPUProfGlobals profdata;
+  auto ELFObj = getELFObjectFile(Image);
+  if (!ELFObj)
+    return ELFObj.takeError();
+  profdata.targetTriple = ELFObj->makeTriple();
+  // Iterate through elf symbols
+  for (auto &sym : ELFObj->symbols()) {
+    if (auto name = sym.getName()) {

jhuber6 wrote: This is incorrect. If this returns an error, it will exit the `if`, call the destructor, and then crash the program because the error was not handled.

https://github.com/llvm/llvm-project/pull/76587
[Lldb-commits] [clang-tools-extra] [libc] [lldb] [openmp] [clang] [llvm] [flang] [compiler-rt] [libcxx] [lld] [PGO][OpenMP] Instrumentation for GPU devices (PR #76587)
@@ -58,6 +60,22 @@ class GlobalTy {
   void setPtr(void *P) { Ptr = P; }
 };
 
+typedef void *IntPtrT;
+struct __llvm_profile_data {
+#define INSTR_PROF_DATA(Type, LLVMType, Name, Initializer) Type Name;
+#include "llvm/ProfileData/InstrProfData.inc"
+};
+
+/// PGO profiling data extracted from a GPU device
+struct GPUProfGlobals {
+  std::string names;
+  std::vector<std::vector<int64_t>> counts;
+  std::vector<__llvm_profile_data> data;
+  Triple targetTriple;
+

jhuber6 wrote: All of them. `SmallVector` is basically a `std::vector` with small-size optimizations, like `std::string`.

https://github.com/llvm/llvm-project/pull/76587
[Lldb-commits] [clang-tools-extra] [libc] [lldb] [openmp] [clang] [llvm] [flang] [compiler-rt] [libcxx] [lld] [PGO][OpenMP] Instrumentation for GPU devices (PR #76587)
@@ -58,6 +60,22 @@ class GlobalTy {
   void setPtr(void *P) { Ptr = P; }
 };
 
+typedef void *IntPtrT;

jhuber6 wrote: Okay, you should use the C++ `using` keyword instead of C's `typedef`.

https://github.com/llvm/llvm-project/pull/76587
[Lldb-commits] [clang] [libc] [lld] [lldb] [clang-tools-extra] [llvm] [compiler-rt] [flang] [libcxx] [openmp] [PGO][OpenMP] Instrumentation for GPU devices (PR #76587)
https://github.com/jhuber6 edited https://github.com/llvm/llvm-project/pull/76587
[Lldb-commits] [clang] [libc] [lld] [lldb] [clang-tools-extra] [llvm] [compiler-rt] [flang] [libcxx] [openmp] [PGO][OpenMP] Instrumentation for GPU devices (PR #76587)
@@ -163,3 +163,87 @@ Error GenericGlobalHandlerTy::readGlobalFromImage(GenericDeviceTy &Device,
   return Plugin::success();
 }
+
+bool GenericGlobalHandlerTy::hasProfilingGlobals(GenericDeviceTy &Device,
+                                                 DeviceImageTy &Image) {
+  GlobalTy global(getInstrProfNamesVarName().str(), 0);
+  if (auto Err = getGlobalMetadataFromImage(Device, Image, global)) {
+    consumeError(std::move(Err));
+    return false;
+  }
+  return true;
+}
+
+Expected<GPUProfGlobals>
+GenericGlobalHandlerTy::readProfilingGlobals(GenericDeviceTy &Device,
+                                             DeviceImageTy &Image) {
+  GPUProfGlobals profdata;
+  auto ELFObj = getELFObjectFile(Image);
+  if (!ELFObj)
+    return ELFObj.takeError();
+  profdata.targetTriple = ELFObj->makeTriple();
+  // Iterate through elf symbols
+  for (auto &sym : ELFObj->symbols()) {
+    if (auto name = sym.getName()) {
+      // Check if given current global is a profiling global based
+      // on name
+      if (name->equals(getInstrProfNamesVarName())) {
+        // Read in profiled function names
+        std::vector<char> chars(sym.getSize() / sizeof(char), ' ');
+        GlobalTy NamesGlobal(name->str(), sym.getSize(), chars.data());

jhuber6 wrote: Okay, this should use `SmallVector` as well. Don't bother dividing by the size, because the one reported from the ELF is absolute. Then just make the data inside `uint8_t`.

https://github.com/llvm/llvm-project/pull/76587
[Lldb-commits] [clang-tools-extra] [flang] [libcxx] [lld] [compiler-rt] [lldb] [clang] [llvm] [libc] [openmp] [PGO][OpenMP] Instrumentation for GPU devices (PR #76587)
@@ -58,6 +60,22 @@ class GlobalTy {
   void setPtr(void *P) { Ptr = P; }
 };
 
+typedef void *IntPtrT;
+struct __llvm_profile_data {
+#define INSTR_PROF_DATA(Type, LLVMType, Name, Initializer) Type Name;
+#include "llvm/ProfileData/InstrProfData.inc"
+};
+
+/// PGO profiling data extracted from a GPU device
+struct GPUProfGlobals {
+  std::string names;
+  std::vector<std::vector<int64_t>> counts;
+  std::vector<__llvm_profile_data> data;
+  Triple targetTriple;
+

jhuber6 wrote: That's confusing; how would using a `std::vector` not have that problem as well? I'll need to look into that.

https://github.com/llvm/llvm-project/pull/76587
[Lldb-commits] [clang] [libc] [lldb] [llvm] [mlir] [compiler-rt] [lld] [libcxx] [Driver] Test ignored target-specific options for AMDGPU/NVPTX (PR #79222)
@@ -0,0 +1,7 @@
+/// Some target-specific options are ignored for GPU, so %clang exits with code 0.
+// DEFINE: %{gpu_opts} = --cuda-gpu-arch=sm_60 --cuda-path=%S/Inputs/CUDA/usr/local/cuda --no-cuda-version-check
+// DEFINE: %{check} = %clang -### -c %{gpu_opts} -mcmodel=medium %s
+// RUN: %{check} -fbasic-block-sections=all

jhuber6 wrote: Offloading compilation for these single-source languages pretty much just combines one "host" compilation job with N "device" compilation jobs. Doing `--offload-device-only` and `--offload-host-only` simply does one part of that. There are probably some flags that behave differently depending on which end you're compiling on, so maybe it would be useful for separating that behavior if needed.

https://github.com/llvm/llvm-project/pull/79222
[Lldb-commits] [llvm] [lldb] [lld] [compiler-rt] [clang] [mlir] [libc] [libcxx] [Driver] Test ignored target-specific options for AMDGPU/NVPTX (PR #79222)
@@ -0,0 +1,7 @@
+/// Some target-specific options are ignored for GPU, so %clang exits with code 0.
+// DEFINE: %{gpu_opts} = --cuda-gpu-arch=sm_60 --cuda-path=%S/Inputs/CUDA/usr/local/cuda --no-cuda-version-check
+// DEFINE: %{check} = %clang -### -c %{gpu_opts} -mcmodel=medium %s
+// RUN: %{check} -fbasic-block-sections=all
+
+// REDEFINE: %{gpu_opts} = -x hip --rocm-path=%S/Inputs/rocm -nogpulib

jhuber6 wrote: Should probably include `-nogpuinc` as well. That's the best way to avoid spurious failures due to the lack of a local CUDA / ROCm installation. Maybe in the future LLVM-based offloading won't depend on so much external stuff.

https://github.com/llvm/llvm-project/pull/79222
[Lldb-commits] [libc] [clang] [openmp] [lld] [clang-tools-extra] [lldb] [libcxx] [compiler-rt] [mlir] [llvm] [pstl] [Driver] Test ignored target-specific options for AMDGPU/NVPTX (PR #79222)
https://github.com/jhuber6 approved this pull request. https://github.com/llvm/llvm-project/pull/79222
[Lldb-commits] [lldb] [pstl] [llvm] [mlir] [libc] [compiler-rt] [libcxx] [openmp] [clang-tools-extra] [clang] [lld] [Driver] Test ignored target-specific options for AMDGPU/NVPTX (PR #79222)
@@ -0,0 +1,5 @@
+/// Some target-specific options are ignored for GPU, so %clang exits with code 0.
+// DEFINE: %{check} = %clang -### -c -mcmodel=medium

jhuber6 wrote: Probably depends on the option we're testing. We could do both.

https://github.com/llvm/llvm-project/pull/79222
[Lldb-commits] [lldb] [clang] [openmp] [compiler-rt] [lld] [llvm] [libc] [libcxx] [clang-tools-extra] [mlir] [pstl] [Driver] Test ignored target-specific options for AMDGPU/NVPTX (PR #79222)
jhuber6 wrote: Maybe need to specify `--target=x86_64-unknown-linux-gnu` in the test?

https://github.com/llvm/llvm-project/pull/79222
[Lldb-commits] [compiler-rt] [flang] [libcxx] [clang] [llvm] [clang-tools-extra] [lldb] [lld] [libc] [NVPTX] Add support for -march=native in standalone NVPTX (PR #79373)
https://github.com/jhuber6 updated https://github.com/llvm/llvm-project/pull/79373

>From 145b7bc932ce3ffa46545cd7af29b1c93981429c Mon Sep 17 00:00:00 2001
From: Joseph Huber
Date: Wed, 24 Jan 2024 15:34:00 -0600
Subject: [PATCH 1/3] [NVPTX] Add support for -march=native in standalone NVPTX

Summary:
We support `--target=nvptx64-nvidia-cuda` as a way to target the NVPTX
architecture from standard CPU. This patch simply uses the existing support
for handling `--offload-arch=native` to also apply to the standalone
toolchain.

---
 clang/lib/Driver/ToolChains/Cuda.cpp       | 61 +-
 clang/lib/Driver/ToolChains/Cuda.h         | 10 ++--
 clang/test/Driver/nvptx-cuda-system-arch.c |  5 ++
 3 files changed, 45 insertions(+), 31 deletions(-)

diff --git a/clang/lib/Driver/ToolChains/Cuda.cpp b/clang/lib/Driver/ToolChains/Cuda.cpp
index 1462576ca870e6f..6215c43b5fc96bd 100644
--- a/clang/lib/Driver/ToolChains/Cuda.cpp
+++ b/clang/lib/Driver/ToolChains/Cuda.cpp
@@ -738,9 +738,18 @@ NVPTXToolChain::TranslateArgs(const llvm::opt::DerivedArgList &Args,
     if (!llvm::is_contained(*DAL, A))
       DAL->append(A);
 
-  if (!DAL->hasArg(options::OPT_march_EQ))
+  if (!DAL->hasArg(options::OPT_march_EQ)) {
     DAL->AddJoinedArg(nullptr, Opts.getOption(options::OPT_march_EQ),
                       CudaArchToString(CudaArch::CudaDefault));
+  } else if (DAL->getLastArgValue(options::OPT_march_EQ) == "native") {
+    auto GPUsOrErr = getSystemGPUArchs(Args);
+    if (!GPUsOrErr)
+      getDriver().Diag(diag::err_drv_undetermined_gpu_arch)
+          << getArchName() << llvm::toString(GPUsOrErr.takeError()) << "-march";
+    else
+      DAL->AddJoinedArg(nullptr, Opts.getOption(options::OPT_march_EQ),
+                        Args.MakeArgString(GPUsOrErr->front()));
+  }
 
   return DAL;
 }
@@ -783,6 +792,31 @@ void NVPTXToolChain::adjustDebugInfoKind(
   }
 }
 
+Expected<SmallVector<std::string>>
+NVPTXToolChain::getSystemGPUArchs(const ArgList &Args) const {
+  // Detect NVIDIA GPUs availible on the system.
+  std::string Program;
+  if (Arg *A = Args.getLastArg(options::OPT_nvptx_arch_tool_EQ))
+    Program = A->getValue();
+  else
+    Program = GetProgramPath("nvptx-arch");
+
+  auto StdoutOrErr = executeToolChainProgram(Program);
+  if (!StdoutOrErr)
+    return StdoutOrErr.takeError();
+
+  SmallVector<std::string> GPUArchs;
+  for (StringRef Arch : llvm::split((*StdoutOrErr)->getBuffer(), "\n"))
+    if (!Arch.empty())
+      GPUArchs.push_back(Arch.str());
+
+  if (GPUArchs.empty())
+    return llvm::createStringError(std::error_code(),
+                                   "No NVIDIA GPU detected in the system");
+
+  return std::move(GPUArchs);
+}
+
 /// CUDA toolchain. Our assembler is ptxas, and our "linker" is fatbinary,
 /// which isn't properly a linker but nonetheless performs the step of stitching
 /// together object files from the assembler into a single blob.
@@ -948,31 +982,6 @@ CudaToolChain::TranslateArgs(const llvm::opt::DerivedArgList &Args,
   return DAL;
 }
 
-Expected<SmallVector<std::string>>
-CudaToolChain::getSystemGPUArchs(const ArgList &Args) const {
-  // Detect NVIDIA GPUs availible on the system.
-  std::string Program;
-  if (Arg *A = Args.getLastArg(options::OPT_nvptx_arch_tool_EQ))
-    Program = A->getValue();
-  else
-    Program = GetProgramPath("nvptx-arch");
-
-  auto StdoutOrErr = executeToolChainProgram(Program);
-  if (!StdoutOrErr)
-    return StdoutOrErr.takeError();
-
-  SmallVector<std::string> GPUArchs;
-  for (StringRef Arch : llvm::split((*StdoutOrErr)->getBuffer(), "\n"))
-    if (!Arch.empty())
-      GPUArchs.push_back(Arch.str());
-
-  if (GPUArchs.empty())
-    return llvm::createStringError(std::error_code(),
-                                   "No NVIDIA GPU detected in the system");
-
-  return std::move(GPUArchs);
-}
-
 Tool *NVPTXToolChain::buildAssembler() const {
   return new tools::NVPTX::Assembler(*this);
 }
diff --git a/clang/lib/Driver/ToolChains/Cuda.h b/clang/lib/Driver/ToolChains/Cuda.h
index 8a053f3393e1206..43c17ba7c0ba03d 100644
--- a/clang/lib/Driver/ToolChains/Cuda.h
+++ b/clang/lib/Driver/ToolChains/Cuda.h
@@ -168,6 +168,11 @@ class LLVM_LIBRARY_VISIBILITY NVPTXToolChain : public ToolChain {
   unsigned GetDefaultDwarfVersion() const override { return 2; }
   unsigned getMaxDwarfVersion() const override { return 2; }
 
+  /// Uses nvptx-arch tool to get arch of the system GPU. Will return error
+  /// if unable to find one.
+  virtual Expected<SmallVector<std::string>>
+  getSystemGPUArchs(const llvm::opt::ArgList &Args) const override;
+
   CudaInstallationDetector CudaInstallation;
 
 protected:
@@ -223,11 +228,6 @@ class LLVM_LIBRARY_VISIBILITY CudaToolChain : public NVPTXToolChain {
 
   const ToolChain &HostTC;
 
-  /// Uses nvptx-arch tool to get arch of the system GPU. Will return error
-  /// if unable to find one.
-  virtual Expected<SmallVector<std::string>>
-  getSystemGPUArchs(const llvm::opt::ArgList &Args) const override;
-
 protected:
   Tool *buildAssembler() const override; // ptxas
   Tool *buildLinker() const override;
[Lldb-commits] [lld] [lldb] [llvm] [compiler-rt] [clang-tools-extra] [libc] [clang] [flang] [libcxx] [NVPTX] Add support for -march=native in standalone NVPTX (PR #79373)
jhuber6 wrote:

> On the other hand, I'd be OK with providing --offload-arch=native translating
> into "compile for all present GPU variants", with a possibility to further
> adjust the selected set with the usual --no-offload-arch-foo, if the user
> needs to. This will at least produce code that will run on the machine where
> it's built, be somewhat consistent and is still adjustable by the user when
> the default choice will inevitably be wrong.

This is what we already do, but this is somewhat tangential. I've updated this patch to present the warning in the case of multiple GPUs being detected, so I don't think there's a concern here with the user being confused. If they have two GPUs, the warning will tell them which one it's using with the correct `sm_` value to specify it manually if they so wish. If there is only one GPU on the system, it should be obvious that it's going to be targeted.

https://github.com/llvm/llvm-project/pull/79373
[Lldb-commits] [clang] [lld] [libcxx] [flang] [compiler-rt] [libc] [clang-tools-extra] [llvm] [lldb] [NVPTX] Add support for -march=native in standalone NVPTX (PR #79373)
jhuber6 wrote:

> User confusion is only part of the issue here. With any single GPU choice we
> would still potentially produce a nonworking binary, if our GPU choice does
> not match what the user wants.
>
> "all GPUs" has the advantage of always producing the binary that's guaranteed
> to work. Granted, in the case of multiple GPUs it comes with the compilation
> time overhead, but I think it's a better trade-off than compiling faster, but
> not working. If the overhead is unacceptable, then we can tweak the build,
> but in that case, the user may as well just specify the desired architectures
> explicitly.

I think the semantics of `native` on other architectures are clear enough here. This combined with the fact that using `-march=native` will error out in the case of no GPUs available, or give a warning if more than one GPU is available, should be sufficiently clear what it's doing. This obviously falls apart if you compile with `-march=native` and then move it off of the system you compiled it for, but the same applies for standard x64 binaries I feel.

Realistically, very, very few casual users are going to be using direct NVPTX targeting. The current use-case is for building tests directly for the GPU without needing to handle calling `amdgpu-arch` and `nvptx-arch` manually in CMake. If I had this in, then I could simplify a lot of CMake code in my `libc` project by just letting the compiler handle the autodetection. Then one less random program dependency is removed from the build process. AMDGPU already has `-mcpu=native` so I'd like NVPTX to match if possible.

https://github.com/llvm/llvm-project/pull/79373
[Lldb-commits] [clang] [clang-tools-extra] [lldb] [libc] [libcxx] [lld] [llvm] [flang] [compiler-rt] [NVPTX] Add support for -march=native in standalone NVPTX (PR #79373)
jhuber6 wrote:

> > I think the semantics of native on other architectures are clear enough
> > here.
>
> I don't think we have the same idea about that. Let's spell it out, so
> there's no confusion.
>
> [GCC manual](https://gcc.gnu.org/onlinedocs/gcc/x86-Options.html#index-march-16) says:
>
> > Using -march=native enables all instruction subsets supported by the local
> > machine (hence the result might not run on different machines)
>
> The way I read it "all instruction subsets supported by the local machine"
> would be what all-GPUs strategy would do. The binary is expected to run on
> all GPU architecture variants available on the machine.
>
> Granted, gcc was not written with GPUs in mind, but it's a good baseline for
> establishing existing conventions for the meaning of `-march=native`.

This more or less depends on what your definition of "local machine" is when it comes to a system augmented with GPUs. The verbiage of "**the** local machine" implies an assumption that there is only one, which I personally find consistent with just selecting the first GPU found on the system. There is ambiguity in how we should treat this in the case of multiple GPUs, but that's what the warning message is for. It informs the user that the "native" architecture is somewhat ambiguous and that the first one was selected. Further, our current default makes sense, because it corresponds to Device ID zero in CUDA, which means that unless you change the environment via `CUDA_VISIBLE_DEVICES` or something, it will work on the default device.

So, in the case there is one device, the behavior is consistent with `-march=native`. In the case where there are two, we make an implicit decision to target the first GPU and inform the user. This method of compilation is not like CUDA, so we can't target all the GPUs at the same time. This will be useful in cases where we want to write code that simply targets a GPU that will "work". We have CMake code around LLVM already to do this, so it would be nice to get rid of that.

https://github.com/llvm/llvm-project/pull/79373
[Lldb-commits] [flang] [clang] [clang-tools-extra] [llvm] [compiler-rt] [libcxx] [libc] [lldb] [lld] [NVPTX] Add support for -march=native in standalone NVPTX (PR #79373)
jhuber6 wrote:

> > This method of compilation is not like CUDA, so we can't target all the
> > GPUs at the same time.
>
> I think this is the key fact I was missing. If the patch is only for a
> standalone compilation which does not do multi-GPU compilation in principle,
> then your approach makes sense.
>
> I was arguing from the normal offloading which does have ability to target
> multiple GPUs.

Yes, this is more similar to OpenCL or just regular CPU compilation, where we have a single job that creates a simple executable, terminal application style. So given a single target, the desire is to "pick me the one that will work on the default CUDA device without me needing to check" type thing.

https://github.com/llvm/llvm-project/pull/79373
[Lldb-commits] [lld] [lldb] [libcxx] [compiler-rt] [clang-tools-extra] [llvm] [libc] [clang] [flang] [NVPTX] Add support for -march=native in standalone NVPTX (PR #79373)
jhuber6 wrote:

> > This method of compilation is not like CUDA, so we can't target all the GPUs at the same time.
>
> Can you clarify for me -- what are you compiling where it's impossible to target multiple GPUs in the binary? I'm confused because Art is understanding that it's not CUDA, but we're modifying the CUDA driver here?

The idea is to simply compile C / C++ code directly targeting NVPTX rather than going through offloading languages like CUDA or OpenMP. This is more or less what cross-compiling is. We specify `--target=nvptx64-nvidia-cuda`, which instructs the compiler to cross-compile the C / C++ targeting NVPTX. This results in a workflow that is very close to compiling a standard executable, by design. This is mostly related to my work on the LLVM C library for GPUs, [which I did a talk on that goes into more detail](https://www.youtube.com/watch?v=_LLGc48GYHc).

Right now, with the LLVM `libc` infrastructure, I can do the following on my AMD GPU:

```c
#include <stdio.h>

int main() { puts("Hello World!"); }
```

And compile and run it, more or less:

```
$ clang hello.c --target=amdgcn-amd-amdhsa -mcpu=native -flto -lc crt1.o
$ amdhsa_loader a.out
Hello World!
```

This works with AMD currently, and I want it to work for NVPTX so I can remove some ugly, annoying code in the `libc` project. This is how I'm running unit tests targeting the GPU in that project, which need to run on the user's GPU. I'd rather just use `-march=native` than detect it manually in CMake. https://github.com/llvm/llvm-project/pull/79373
[Lldb-commits] [lld] [lldb] [libcxx] [compiler-rt] [clang-tools-extra] [llvm] [libc] [clang] [flang] [NVPTX] Add support for -march=native in standalone NVPTX (PR #79373)
jhuber6 wrote:

> I...think I understand.
>
> Is the output of this compilation step a cubin, then?

Yes, it will spit out a simple `cubin` instead of a fatbinary. The NVIDIA toolchain is much worse about this stuff than the AMD one, but in general it works. You can check with `-###` or whatever, like in https://godbolt.org/z/zWf5jezYP. https://github.com/llvm/llvm-project/pull/79373
[Lldb-commits] [flang] [clang] [libc] [compiler-rt] [clang-tools-extra] [llvm] [lld] [lldb] [libcxx] [NVPTX] Add support for -march=native in standalone NVPTX (PR #79373)
jhuber6 wrote:

> Got it, okay, thanks.
>
> Since this change only applies to `--target=nvptx64-nvidia-cuda`, fine by me. Thanks for putting up with our scrutiny. :)

No problem, I probably should've been clearer in my commit messages. https://github.com/llvm/llvm-project/pull/79373
[Lldb-commits] [clang-tools-extra] [llvm] [libc] [clang] [libcxx] [lldb] [lld] [flang] [compiler-rt] [NVPTX] Add support for -march=native in standalone NVPTX (PR #79373)
https://github.com/jhuber6 closed https://github.com/llvm/llvm-project/pull/79373
[Lldb-commits] [lldb] [libc][libm][GPU] Add missing vendor entrypoints to the GPU version of `libm` (PR #66034)
@@ -0,0 +1,17 @@
+//===-- Implementation of the GPU lroundf function

jhuber6 wrote: This and some other headers need to be reformatted. https://github.com/llvm/llvm-project/pull/66034
[Lldb-commits] [lldb] [libc][libm][GPU] Add missing vendor entrypoints to the GPU version of `libm` (PR #66034)
@@ -0,0 +1,21 @@
+//===-- Implementation of the GPU atan2f function

jhuber6 wrote: Fix header https://github.com/llvm/llvm-project/pull/66034
[Lldb-commits] [lldb] [libc][libm][GPU] Add missing vendor entrypoints to the GPU version of `libm` (PR #66034)
https://github.com/jhuber6 approved this pull request. LG after fixing the header formatting. https://github.com/llvm/llvm-project/pull/66034
[Lldb-commits] [lldb] [libc][libm][GPU] Add missing vendor entrypoints to the GPU version of `libm` (PR #66034)
@@ -0,0 +1,19 @@
+//===-- Implementation of the GPU logbf function

jhuber6 wrote: Fix header https://github.com/llvm/llvm-project/pull/66034
[Lldb-commits] [lldb] [libc][libm][GPU] Add missing vendor entrypoints to the GPU version of `libm` (PR #66034)
https://github.com/jhuber6 edited https://github.com/llvm/llvm-project/pull/66034
[Lldb-commits] [clang] [clang-tools-extra] [flang] [lldb] [llvm] [mlir] [polly] Add static to command line option (cl::opt) (PR #126243)
jhuber6 wrote:

> So I should, like, open 5 branches and make a PR for each?
>
> As for the cases mentioned, when I tried to compile all projects (compile with -DLLVM_ENABLE_PROJECTS=all), I did encounter such cases when compiling and I've fixed them all. Now all projects can be successfully built, so I think it should be fine?

Yes, one PR for each project, please. Also refer to the failing CI for what's broken, e.g.

```
/var/lib/buildkite-agent/builds/linux-56-59b8f5d88-96x6s-1/llvm-project/github-pull-requests/flang/lib/Lower/OpenMP/Utils.cpp:30:28: error: static declaration of 'treatIndexAsSection' follows non-static declaration
static llvm::cl::opt treatIndexAsSection(
^
/var/lib/buildkite-agent/builds/linux-56-59b8f5d88-96x6s-1/llvm-project/github-pull-requests/flang/lib/Lower/OpenMP/Utils.h:19:28: note: previous declaration is here
extern llvm::cl::opt treatIndexAsSection;
^
```

https://github.com/llvm/llvm-project/pull/126243
[Lldb-commits] [clang] [clang-tools-extra] [flang] [lldb] [llvm] [mlir] [polly] Add static to command line option (cl::opt) (PR #126243)
https://github.com/jhuber6 edited https://github.com/llvm/llvm-project/pull/126243
[Lldb-commits] [clang] [clang-tools-extra] [flang] [lldb] [llvm] [mlir] [polly] Add static to command line option (cl::opt) (PR #126243)
https://github.com/jhuber6 commented: This should definitely be split up. Also, some options are referenced in multiple places, i.e.

```
// foo.h
extern cl::opt <...>

// foo.cpp
cl::opt <...>
use opt

// bar.cpp
use opt
```

https://github.com/llvm/llvm-project/pull/126243