[clang] a1ad988 - [clang] Make header self-contained. NFC.

2024-06-27 Thread Benjamin Kramer via cfe-commits

Author: Benjamin Kramer
Date: 2024-06-27T09:21:37+02:00
New Revision: a1ad98813006cefcdf88336db3f81a15b6bf36fb

URL: https://github.com/llvm/llvm-project/commit/a1ad98813006cefcdf88336db3f81a15b6bf36fb
DIFF: https://github.com/llvm/llvm-project/commit/a1ad98813006cefcdf88336db3f81a15b6bf36fb.diff

LOG: [clang] Make header self-contained. NFC.

Added: 


Modified: 
clang/include/clang/Basic/Thunk.h

Removed: 




diff --git a/clang/include/clang/Basic/Thunk.h b/clang/include/clang/Basic/Thunk.h
index af4afb2d2ac4d..8ff7603e0094d 100644
--- a/clang/include/clang/Basic/Thunk.h
+++ b/clang/include/clang/Basic/Thunk.h
@@ -21,6 +21,7 @@
 namespace clang {
 
 class CXXMethodDecl;
+class Type;
 
 /// A return adjustment.
 struct ReturnAdjustment {



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [compiler-rt] [llvm] [openmp] [PGO][Offload] Profile profraw generation for GPU instrumentation #76587 (PR #93365)

2024-06-27 Thread Ethan Luis McDonough via cfe-commits

https://github.com/EthanLuisMcDonough edited 
https://github.com/llvm/llvm-project/pull/93365


[libcxx] [libcxxabi] [libunwind] [llvm] [runtimes] remove workaround for old CMake when setting `--unwindlib=none` (PR #93429)

2024-06-27 Thread via cfe-commits

https://github.com/h-vetinari updated 
https://github.com/llvm/llvm-project/pull/93429

>From 8c1b899aa174b107fece1edbf99eaf261bdea516 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Martin=20Storsj=C3=B6?= 
Date: Mon, 25 Apr 2022 09:45:22 +0300
Subject: [PATCH 01/11] [runtimes] [CMake] Use CMAKE_REQUIRED_LINK_OPTIONS to
 simplify handling of the --unwindlib=none option

This avoids passing the option unnecessarily to compilation commands
(where it causes warnings).

This fails in practice with libunwind, where setting
CMAKE_TRY_COMPILE_TARGET_TYPE to STATIC_LIBRARY breaks it, as
the option from CMAKE_REQUIRED_LINK_OPTIONS ends up passed to the "ar"
tool too.
---
 libunwind/CMakeLists.txt |  3 +++
 runtimes/CMakeLists.txt  | 22 +-
 2 files changed, 4 insertions(+), 21 deletions(-)

diff --git a/libunwind/CMakeLists.txt b/libunwind/CMakeLists.txt
index b22ade0a7d71e..3d2fadca9d2ec 100644
--- a/libunwind/CMakeLists.txt
+++ b/libunwind/CMakeLists.txt
@@ -221,9 +221,12 @@ add_cxx_compile_flags_if_supported(-EHsc)
 # This leads to libunwind not being built with this flag, which makes
 # libunwind quite useless in this setup.
 set(_previous_CMAKE_TRY_COMPILE_TARGET_TYPE ${CMAKE_TRY_COMPILE_TARGET_TYPE})
+set(_previous_CMAKE_REQUIRED_LINK_OPTIONS ${CMAKE_REQUIRED_LINK_OPTIONS})
 set(CMAKE_TRY_COMPILE_TARGET_TYPE STATIC_LIBRARY)
+set(CMAKE_REQUIRED_LINK_OPTIONS)
 add_compile_flags_if_supported(-funwind-tables)
 set(CMAKE_TRY_COMPILE_TARGET_TYPE ${_previous_CMAKE_TRY_COMPILE_TARGET_TYPE})
+set(CMAKE_REQUIRED_LINK_OPTIONS ${_previous_CMAKE_REQUIRED_LINK_OPTIONS})
 
 if (LIBUNWIND_USES_ARM_EHABI AND NOT CXX_SUPPORTS_FUNWIND_TABLES_FLAG)
   message(SEND_ERROR "The -funwind-tables flag must be supported "
diff --git a/runtimes/CMakeLists.txt b/runtimes/CMakeLists.txt
index 24f4851169591..8f909322c9a98 100644
--- a/runtimes/CMakeLists.txt
+++ b/runtimes/CMakeLists.txt
@@ -116,27 +116,7 @@ filter_prefixed("${CMAKE_ASM_IMPLICIT_INCLUDE_DIRECTORIES}" ${LLVM_BINARY_DIR} C
 # brittle. We should ideally move this to runtimes/CMakeLists.txt.
 llvm_check_compiler_linker_flag(C "--unwindlib=none" CXX_SUPPORTS_UNWINDLIB_EQ_NONE_FLAG)
 if (CXX_SUPPORTS_UNWINDLIB_EQ_NONE_FLAG)
-  set(ORIG_CMAKE_REQUIRED_FLAGS "${CMAKE_REQUIRED_FLAGS}")
-  set(CMAKE_REQUIRED_FLAGS "${CMAKE_REQUIRED_FLAGS} --unwindlib=none")
-  # TODO: When we can require CMake 3.14, we should use
-  # CMAKE_REQUIRED_LINK_OPTIONS here. Until then, we need a workaround:
-  # When using CMAKE_REQUIRED_FLAGS, this option gets added both to
-  # compilation and linking commands. That causes warnings in the
-  # compilation commands during cmake tests. This is normally benign, but
-  # when testing whether -Werror works, that test fails (due to the
-  # preexisting warning).
-  #
-  # Therefore, before we can use CMAKE_REQUIRED_LINK_OPTIONS, check if we
-  # can use --start-no-unused-arguments to silence the warnings about
-  # --unwindlib=none during compilation.
-  #
-  # We must first add --unwindlib=none to CMAKE_REQUIRED_FLAGS above, to
-  # allow this subsequent test to succeed, then rewrite CMAKE_REQUIRED_FLAGS
-  # below.
-  check_c_compiler_flag("--start-no-unused-arguments" C_SUPPORTS_START_NO_UNUSED_ARGUMENTS)
-  if (C_SUPPORTS_START_NO_UNUSED_ARGUMENTS)
-    set(CMAKE_REQUIRED_FLAGS "${ORIG_CMAKE_REQUIRED_FLAGS} --start-no-unused-arguments --unwindlib=none --end-no-unused-arguments")
-  endif()
+  list(APPEND CMAKE_REQUIRED_LINK_OPTIONS "--unwindlib=none")
 endif()
 
 # Disable use of the installed C++ standard library when building runtimes.

>From 816e9e6d81ac12537879406e0495fc80394a1a66 Mon Sep 17 00:00:00 2001
From: "H. Vetinari" 
Date: Thu, 20 Jun 2024 23:18:51 +1100
Subject: [PATCH 02/11] add comment (and CMake issue reference) about
 incompatible options

---
 libunwind/CMakeLists.txt | 4 
 1 file changed, 4 insertions(+)

diff --git a/libunwind/CMakeLists.txt b/libunwind/CMakeLists.txt
index 3d2fadca9d2ec..d84f8fa6ff954 100644
--- a/libunwind/CMakeLists.txt
+++ b/libunwind/CMakeLists.txt
@@ -220,6 +220,10 @@ add_cxx_compile_flags_if_supported(-EHsc)
 #
 # This leads to libunwind not being built with this flag, which makes
 # libunwind quite useless in this setup.
+#
+# NOTE: we need to work around https://gitlab.kitware.com/cmake/cmake/-/issues/23454
+#   because CMAKE_REQUIRED_LINK_OPTIONS (c.f. CXX_SUPPORTS_UNWINDLIB_EQ_NONE_FLAG)
+#   is incompatible with CMAKE_TRY_COMPILE_TARGET_TYPE==STATIC_LIBRARY.
 set(_previous_CMAKE_TRY_COMPILE_TARGET_TYPE ${CMAKE_TRY_COMPILE_TARGET_TYPE})
 set(_previous_CMAKE_REQUIRED_LINK_OPTIONS ${CMAKE_REQUIRED_LINK_OPTIONS})
 set(CMAKE_TRY_COMPILE_TARGET_TYPE STATIC_LIBRARY)

>From 3f917d22bdcd8b398cf7162563547418a056ecec Mon Sep 17 00:00:00 2001
From: "H. Vetinari" 
Date: Thu, 20 Jun 2024 23:18:51 +1100
Subject: [PATCH 03/11] [cmake] move check for `-fno-exceptions` to "safe zone"

w.r.t. interference between CMAKE_REQUIRED_LINK_OPTIONS and static libraries
---
 libunwind/CMakeLists.txt | 

[clang] [analyzer][NFC] Use ArrayRef for input parameters (PR #93203)

2024-06-27 Thread Balazs Benics via cfe-commits


@@ -672,7 +672,7 @@ class StdLibraryFunctionsChecker
 StringRef getNote() const { return Note; }
   };
 
-  using ArgTypes = std::vector>;
+  using ArgTypes = ArrayRef>;

steakhal wrote:

One can argue the same the other way around, just like for pointers.

https://github.com/llvm/llvm-project/pull/93203


[libcxx] [libcxxabi] [libunwind] [llvm] [runtimes] remove workaround for old CMake when setting `--unwindlib=none` (PR #93429)

2024-06-27 Thread via cfe-commits

h-vetinari wrote:

So I've been trying to follow down the rabbit hole of the failing flag checks, 
and it seems the combination of `CXX_SUPPORTS_UNWINDLIB_EQ_NONE_FLAG` plus 
https://gitlab.kitware.com/cmake/cmake/-/issues/23454 has a wider blast radius 
than anticipated.

I'm not claiming that adding the 
`_previous_CMAKE_{REQUIRED_LINK_OPTIONS,TRY_COMPILE_TARGET_TYPE}` dance 
everywhere is the right approach here, but it was - so far - the obvious path 
to just try to get things green again. It's conceivable though that it would be 
easier to simply shift the detection of `CXX_SUPPORTS_UNWINDLIB_EQ_NONE_FLAG` 
until after the other flag checks have been performed? 🤔 

https://github.com/llvm/llvm-project/pull/93429


[clang] [llvm] [OpenMP] Fix stack corruption due to argument mismatch (PR #96386)

2024-06-27 Thread Sushant Gokhale via cfe-commits

https://github.com/sushgokh updated 
https://github.com/llvm/llvm-project/pull/96386

>From af4dc96c25f32b477337cedaeb0a696f75840ac0 Mon Sep 17 00:00:00 2001
From: sgokhale 
Date: Sat, 22 Jun 2024 17:16:24 +0530
Subject: [PATCH] [OpenMP] Fix stack corruption due to argument mismatch

When lowering (#pragma omp target update from), the .omp_task_entry.
function generated by clang sets up 9 arguments for the call to
__tgt_target_data_update_nowait_mapper.

At the same time, inside __tgt_target_data_update_nowait_mapper, the call to
targetData() is converted to a sibcall on the assumption that the caller
passed the number of arguments listed in the signature.

The AArch64 asm sequence for this is as follows (unrelated insns removed):

.omp_task_entry..108:
  sub   sp, sp, #32
  stp   x29, x30, [sp, #16]   // 16-byte Folded Spill
  add   x29, sp, #16
  str   x8, [sp, #8]          // stack canary
  str   xzr, [sp]
  bl    __tgt_target_data_update_nowait_mapper

__tgt_target_data_update_nowait_mapper:
  sub   sp, sp, #32
  stp   x29, x30, [sp, #16]   // 16-byte Folded Spill
  add   x29, sp, #16
  str   x8, [sp, #8]          // stack canary
  // Sibcall argument setup
  adrp  x8, :got:_Z16targetDataUpdateP7ident_tR8DeviceTyiPPvS4_PlS5_S4_S4_R11AsyncInfoTyb
  ldr   x8, [x8, :got_lo12:_Z16targetDataUpdateP7ident_tR8DeviceTyiPPvS4_PlS5_S4_S4_R11AsyncInfoTyb]
  stp   x9, x8, [x29, #16]
  adrp  x8, .L.str.8
  add   x8, x8, :lo12:.L.str.8
  str   x8, [x29, #32]        <== This is the insn that erases $fp

  ldp   x29, x30, [sp, #16]   // 16-byte Folded Reload
  add   sp, sp, #32
  // Sibcall
  b     ZL10targetDataI22TaskAsyncInfoWrapperTyEvP7ident_tliPPvS4_PlS5_S4_S4_PFiS2_R8DeviceTyiS4_S4_S5_S5_S4_S4_R11AsyncInfoTybEPKcSD

On AArch64, the call to __tgt_target_data_update_nowait_mapper in
.omp_task_entry. reserves only a single stack slot for outgoing arguments,
so the callee ends up overwriting $fp, which corrupts the stack. The root
cause is a signature discrepancy: openmp/libomptarget/include/omptarget.h
declares __tgt_target_data_update_nowait_mapper with 13 arguments, while
clang/lib/CodeGen/CGOpenMPRuntime.cpp and
llvm/include/llvm/Frontend/OpenMP/OMPKinds.def use only 9.

This patch modifies the __tgt_target_data_update_nowait_mapper signature
to match the .omp_task_entry. usage (and the other two files mentioned above).
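
As an illustration only (not part of the patch), a mismatch of this kind can
be caught at compile time by comparing the parameter counts of the two views
of the function. The sketch below is hypothetical: nowaitEntry13, NineArgFn
and the parameter types are stand-ins chosen to have 13 and 9 parameters, not
the real libomptarget declarations.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Counts the parameters of a plain function pointer at compile time.
template <typename R, typename... Args>
constexpr std::size_t arity(R (*)(Args...)) {
  return sizeof...(Args);
}

// Hypothetical 13-parameter declaration, standing in for the runtime header.
inline void nowaitEntry13(void *, std::int64_t, std::int32_t, void **, void **,
                          std::int64_t *, std::int64_t *, void **, void **,
                          std::int32_t, void *, std::int32_t, void *) {}

// Hypothetical 9-parameter view, standing in for what the caller assumed.
using NineArgFn = void (*)(void *, std::int64_t, std::int32_t, void **,
                           void **, std::int64_t *, std::int64_t *, void **,
                           void **);

constexpr std::size_t DeclaredArity = arity(&nowaitEntry13);
constexpr std::size_t CallerArity = arity(NineArgFn{});

// A real build would static_assert that this is true; here it is false,
// which is exactly the situation the commit message describes.
constexpr bool SignaturesAgree = DeclaredArity == CallerArity;

// Runtime self-check of the counts used in this sketch.
inline void selfCheck() {
  assert(DeclaredArity == 13);
  assert(CallerArity == 9);
}
```

With both counts available as constants, a single static_assert next to the
call-site table would turn the silent stack corruption into a build error.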

Co-authored-by: Kugan Vivekanandarajah 
---
 clang/lib/CodeGen/CGOpenMPRuntime.cpp | 28 +++--
 .../include/llvm/Frontend/OpenMP/OMPKinds.def | 30 ---
 2 files changed, 44 insertions(+), 14 deletions(-)

diff --git a/clang/lib/CodeGen/CGOpenMPRuntime.cpp b/clang/lib/CodeGen/CGOpenMPRuntime.cpp
index f6d12d46cfc07..fc3ad533666ca 100644
--- a/clang/lib/CodeGen/CGOpenMPRuntime.cpp
+++ b/clang/lib/CodeGen/CGOpenMPRuntime.cpp
@@ -10343,6 +10343,23 @@ void CGOpenMPRuntime::emitTargetDataStandAloneCall(
 MapNamesArray,
 InputInfo.MappersArray.emitRawPointer(CGF)};
 
+// Nowait calls have header declarations that take 13 arguments. Hence, the
+// divergence from the OffloadingArgs definition.
+llvm::Value *NowaitOffloadingArgs[] = {
+RTLoc,
+DeviceID,
+PointerNum,
+InputInfo.BasePointersArray.emitRawPointer(CGF),
+InputInfo.PointersArray.emitRawPointer(CGF),
+InputInfo.SizesArray.emitRawPointer(CGF),
+MapTypesArray,
+MapNamesArray,
+InputInfo.MappersArray.emitRawPointer(CGF),
+llvm::Constant::getNullValue(CGF.Int32Ty),
+llvm::Constant::getNullValue(CGF.VoidPtrTy),
+llvm::Constant::getNullValue(CGF.Int32Ty),
+llvm::Constant::getNullValue(CGF.VoidPtrTy)};
+
 // Select the right runtime function call for each standalone
 // directive.
 const bool HasNowait = D.hasClausesOfKind();
@@ -10430,9 +10447,14 @@ void CGOpenMPRuntime::emitTargetDataStandAloneCall(
   llvm_unreachable("Unexpected standalone target data directive.");
   break;
 }
-CGF.EmitRuntimeCall(
-OMPBuilder.getOrCreateRuntimeFunction(CGM.getModule(), RTLFn),
-OffloadingArgs);
+if (HasNowait)
+  CGF.EmitRuntimeCall(
+  OMPBuilder.getOrCreateRuntimeFunction(CGM.getModule(), RTLFn),
+  NowaitOffloadingArgs);
+else
+  CGF.EmitRuntimeCall(
+  OMPBuilder.getOrCreateRuntimeFunction(CGM.getModule(), RTLFn),
+  OffloadingArgs);
   };
 
   auto &&TargetThenGen = [this, &ThenGen, &D, &InputInfo, &MapTypesArray,
diff --git a/llvm/include/llvm/Frontend/OpenMP/OMPKinds.def b/llvm/include/llvm/Frontend/OpenMP/OMPKinds.def
index fe09bb8177c28..ebd928470109a 100644
--- a/llvm/include/llvm/Frontend/OpenMP/OMPKinds.def
+++ b/llvm/include/llvm/Frontend/OpenMP/OMPKinds.def
@@ -438,19 +438,22 @@ __OMP_RTL(__tgt_target_kernel_nowait, false, Int32, IdentPtr, Int64, Int32,
   Int32, VoidPtr, KernelArgsPtr, Int32, VoidPtr, Int32, VoidPtr)
 __OMP_RTL(__tgt_target_data_begin_mapper, false, Void, IdentPtr, Int64, Int32, VoidPtrPtr,
   VoidPtrPtr, Int64Ptr, Int64Ptr, VoidPtrPtr, VoidPtrPtr)

[clang] Support `guarded_by` attribute and related attributes inside C structs and support late parsing them (PR #95455)

2024-06-27 Thread Pierre d'Herbemont via cfe-commits

pdherbemont wrote:

I think this still needs review from @delcypher and @rapidsna 

https://github.com/llvm/llvm-project/pull/95455


[clang] [Driver] Support using toolchain libc and libc++ for baremetal (PR #96736)

2024-06-27 Thread Petr Hosek via cfe-commits

https://github.com/petrhosek updated 
https://github.com/llvm/llvm-project/pull/96736

>From db5ae584cc00717d667d423a99d71a8d3ac46805 Mon Sep 17 00:00:00 2001
From: Petr Hosek 
Date: Mon, 10 Jun 2024 20:27:52 +
Subject: [PATCH 1/2] [Driver] Support using toolchain libc and libc++ for
 baremetal

We want to support using a complete Clang/LLVM toolchain that includes
LLVM libc and libc++ for baremetal targets. To do so, we need the driver
to add the necessary include paths.
---
 clang/include/clang/Driver/ToolChain.h|  3 +
 clang/lib/Driver/ToolChain.cpp|  6 ++
 clang/lib/Driver/ToolChains/BareMetal.cpp | 63 ---
 .../Inputs/basic_baremetal_tree/bin/.keep |  0
 .../include/armv6m-unknown-none-eabi/.keep|  0
 .../armv6m-unknown-none-eabi/c++/v1/.keep |  0
 .../basic_baremetal_tree/include/c++/v1/.keep |  0
 .../lib/armv6m-unknown-none-eabi/.keep|  0
 clang/test/Driver/baremetal.cpp   | 16 -
 9 files changed, 78 insertions(+), 10 deletions(-)
 create mode 100644 clang/test/Driver/Inputs/basic_baremetal_tree/bin/.keep
 create mode 100644 
clang/test/Driver/Inputs/basic_baremetal_tree/include/armv6m-unknown-none-eabi/.keep
 create mode 100644 
clang/test/Driver/Inputs/basic_baremetal_tree/include/armv6m-unknown-none-eabi/c++/v1/.keep
 create mode 100644 
clang/test/Driver/Inputs/basic_baremetal_tree/include/c++/v1/.keep
 create mode 100644 
clang/test/Driver/Inputs/basic_baremetal_tree/lib/armv6m-unknown-none-eabi/.keep

diff --git a/clang/include/clang/Driver/ToolChain.h b/clang/include/clang/Driver/ToolChain.h
index 1f93bd612e9b0..ece1384d5d3c0 100644
--- a/clang/include/clang/Driver/ToolChain.h
+++ b/clang/include/clang/Driver/ToolChain.h
@@ -526,6 +526,9 @@ class ToolChain {
   // Returns target specific standard library path if it exists.
   std::optional<std::string> getStdlibPath() const;
 
+  // Returns target specific standard library include path if it exists.
+  std::optional<std::string> getStdlibIncludePath() const;
+
   // Returns /lib// or /lib/.
   // This is used by runtimes (such as OpenMP) to find arch-specific libraries.
   virtual path_list getArchSpecificLibPaths() const;
diff --git a/clang/lib/Driver/ToolChain.cpp b/clang/lib/Driver/ToolChain.cpp
index 40ab2e91125d1..04021cc0a8f3f 100644
--- a/clang/lib/Driver/ToolChain.cpp
+++ b/clang/lib/Driver/ToolChain.cpp
@@ -811,6 +811,12 @@ std::optional<std::string> ToolChain::getStdlibPath() const {
   return getTargetSubDirPath(P);
 }
 
+std::optional<std::string> ToolChain::getStdlibIncludePath() const {
+  SmallString<128> P(D.Dir);
+  llvm::sys::path::append(P, "..", "include");
+  return getTargetSubDirPath(P);
+}
+
 ToolChain::path_list ToolChain::getArchSpecificLibPaths() const {
   path_list Paths;
 
diff --git a/clang/lib/Driver/ToolChains/BareMetal.cpp b/clang/lib/Driver/ToolChains/BareMetal.cpp
index dd365e62e084e..4eb333efe2314 100644
--- a/clang/lib/Driver/ToolChains/BareMetal.cpp
+++ b/clang/lib/Driver/ToolChains/BareMetal.cpp
@@ -270,15 +270,19 @@ void BareMetal::AddClangSystemIncludeArgs(const ArgList 
&DriverArgs,
 addSystemInclude(DriverArgs, CC1Args, Dir.str());
   }
 
-  if (!DriverArgs.hasArg(options::OPT_nostdlibinc)) {
-const SmallString<128> SysRoot(computeSysRoot());
-if (!SysRoot.empty()) {
-  for (const Multilib &M : getOrderedMultilibs()) {
-SmallString<128> Dir(SysRoot);
-llvm::sys::path::append(Dir, M.includeSuffix());
-llvm::sys::path::append(Dir, "include");
-addSystemInclude(DriverArgs, CC1Args, Dir.str());
-  }
+  if (DriverArgs.hasArg(options::OPT_nostdlibinc))
+return;
+
+  if (std::optional<std::string> Path = getStdlibIncludePath())
+addSystemInclude(DriverArgs, CC1Args, *Path);
+
+  const SmallString<128> SysRoot(computeSysRoot());
+  if (!SysRoot.empty()) {
+for (const Multilib &M : getOrderedMultilibs()) {
+  SmallString<128> Dir(SysRoot);
+  llvm::sys::path::append(Dir, M.includeSuffix());
+  llvm::sys::path::append(Dir, "include");
+  addSystemInclude(DriverArgs, CC1Args, Dir.str());
 }
   }
 }
@@ -296,6 +300,47 @@ void BareMetal::AddClangCXXStdlibIncludeArgs(const ArgList 
&DriverArgs,
 return;
 
   const Driver &D = getDriver();
+  std::string Target = getTripleString();
+
+  auto AddCXXIncludePath = [&](StringRef Path) {
+std::string Version = detectLibcxxVersion(Path);
+if (Version.empty())
+  return;
+
+// First add the per-target multilib include dir.
+if (!SelectedMultilibs.empty() && !SelectedMultilibs.back().isDefault()) {
+  const Multilib &M = SelectedMultilibs.back();
+  SmallString<128> TargetDir(Path);
+  llvm::sys::path::append(TargetDir, Target, M.gccSuffix(), "c++", 
Version);
+  if (getVFS().exists(TargetDir)) {
+addSystemInclude(DriverArgs, CC1Args, TargetDir);
+  }
+}
+
+// Second add the per-target include dir.
+SmallString<128> TargetDir(Path);
+llvm::sys::path::append(TargetDir, Target, "c++", Version);
+if 

[clang] [Driver] Support using toolchain libc and libc++ for baremetal (PR #96736)

2024-06-27 Thread Petr Hosek via cfe-commits


@@ -296,6 +300,47 @@ void BareMetal::AddClangCXXStdlibIncludeArgs(const ArgList 
&DriverArgs,
 return;
 
   const Driver &D = getDriver();
+  std::string Target = getTripleString();
+
+  auto AddCXXIncludePath = [&](StringRef Path) {
+std::string Version = detectLibcxxVersion(Path);
+if (Version.empty())
+  return;
+
+// First add the per-target multilib include dir.
+if (!SelectedMultilibs.empty() && !SelectedMultilibs.back().isDefault()) {
+  const Multilib &M = SelectedMultilibs.back();
+  SmallString<128> TargetDir(Path);
+  llvm::sys::path::append(TargetDir, Target, M.gccSuffix(), "c++", 
Version);
+  if (getVFS().exists(TargetDir)) {
+addSystemInclude(DriverArgs, CC1Args, TargetDir);
+  }

petrhosek wrote:

Done

https://github.com/llvm/llvm-project/pull/96736


[clang] [Clang] Access tls_guard via llvm.threadlocal.address (PR #96633)

2024-06-27 Thread via cfe-commits


@@ -1059,9 +1059,15 @@ 
CodeGenFunction::GenerateCXXGlobalInitFunc(llvm::Function *Fn,
 if (Guard.isValid()) {
   // If we have a guard variable, check whether we've already performed
   // these initializations. This happens for TLS initialization functions.
-  llvm::Value *GuardVal = Builder.CreateLoad(Guard);
-  llvm::Value *Uninit = Builder.CreateIsNull(GuardVal,
- "guard.uninitialized");
+  Address GuardAddr = Guard;

nikola-tesic-ns wrote:

The `Guard` is a `ConstantAddress`, so I cannot change it; that's why I 
introduced a new variable. If you have a suggestion, I would be happy to adapt 
the code.
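
To make the constraint concrete, here is a minimal sketch with toy types. It assumes a simplified model in which `ConstantAddress` derives from `Address` and only accepts constants; the classes below are hypothetical stand-ins, not clang's real `Address` hierarchy, and `pickGuardAddress` is an invented helper.

```cpp
#include <cassert>

// Toy value hierarchy: every Constant is a Value, but not the reverse.
struct Value { int Id = 0; };
struct Constant : Value {};

// Hypothetical simplification of a general-purpose address wrapper.
class Address {
  const Value *Ptr;

public:
  explicit Address(const Value *P) : Ptr(P) {}
  const Value *getPointer() const { return Ptr; }
  // Returns a copy that points somewhere else.
  Address withPointer(const Value *P) const { return Address(P); }
};

// Hypothetical simplification: can only ever hold a constant pointer.
class ConstantAddress : public Address {
public:
  explicit ConstantAddress(const Constant *C) : Address(C) {}
};

// Mirrors the shape of the change: widen to Address, then optionally repoint
// at a pointer that is only known at run time (e.g. an intrinsic's result).
inline Address pickGuardAddress(const ConstantAddress &Guard,
                                const Value *RuntimePtr) {
  assert(Guard.getPointer() && "guard must be valid");
  Address GuardAddr = Guard; // copy into the more general base type
  if (RuntimePtr)
    GuardAddr = GuardAddr.withPointer(RuntimePtr);
  return GuardAddr;
}
```

In this model the non-constant pointer cannot be stored back into `Guard`, so a second variable of the base type is the natural place for it.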

https://github.com/llvm/llvm-project/pull/96633


[clang] [Driver] Support using toolchain libc and libc++ for baremetal (PR #96736)

2024-06-27 Thread Petr Hosek via cfe-commits


@@ -296,6 +300,47 @@ void BareMetal::AddClangCXXStdlibIncludeArgs(const ArgList 
&DriverArgs,
 return;
 
   const Driver &D = getDriver();
+  std::string Target = getTripleString();
+
+  auto AddCXXIncludePath = [&](StringRef Path) {

petrhosek wrote:

No, we also need `this` for `detectLibcxxVersion`, `DriverArgs` and `CC1Args`.
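
For illustration, assuming the question was whether the capture list could be narrowed, the sketch below shows a `[&]` lambda inside a member function that uses a member function plus two local references. `ToyToolChain`, `detectVersion`, `addIncludes` and `collectIncludes` are hypothetical names, not the clang driver API.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a toolchain class.
class ToyToolChain {
public:
  // Member function that the lambda can only call through `this`.
  std::string detectVersion(const std::string &Path) const {
    return Path.empty() ? "" : "v1";
  }

  // Appends include directories for Target to Out.
  void addIncludes(const std::string &Target,
                   std::vector<std::string> &Out) const {
    assert(!Target.empty() && "target triple expected");
    // [&] implicitly captures `this`, `Target` and `Out`; the explicit
    // equivalent would be [this, &Target, &Out].
    auto AddPath = [&](const std::string &Path) {
      std::string Version = detectVersion(Path); // needs `this`
      if (Version.empty())
        return;
      Out.push_back(Path + "/" + Target + "/c++/" + Version);
    };
    AddPath("include");
    AddPath(""); // skipped, no version detected
  }
};

// Small wrapper to exercise the sketch.
inline std::vector<std::string> collectIncludes(const std::string &Target) {
  std::vector<std::string> Out;
  ToyToolChain().addIncludes(Target, Out);
  return Out;
}
```

Spelling out every capture is possible, but with three or more captured names the default `[&]` is arguably easier to read for a short-lived local lambda.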

https://github.com/llvm/llvm-project/pull/96736


[clang] [Driver] Support using toolchain libc and libc++ for baremetal (PR #96736)

2024-06-27 Thread Petr Hosek via cfe-commits


@@ -296,6 +300,47 @@ void BareMetal::AddClangCXXStdlibIncludeArgs(const ArgList 
&DriverArgs,
 return;
 
   const Driver &D = getDriver();
+  std::string Target = getTripleString();
+
+  auto AddCXXIncludePath = [&](StringRef Path) {
+std::string Version = detectLibcxxVersion(Path);
+if (Version.empty())
+  return;
+
+// First add the per-target multilib include dir.
+if (!SelectedMultilibs.empty() && !SelectedMultilibs.back().isDefault()) {
+  const Multilib &M = SelectedMultilibs.back();
+  SmallString<128> TargetDir(Path);
+  llvm::sys::path::append(TargetDir, Target, M.gccSuffix(), "c++", 
Version);

petrhosek wrote:

Done.

https://github.com/llvm/llvm-project/pull/96736


[clang] [Driver] Support using toolchain libc and libc++ for baremetal (PR #96736)

2024-06-27 Thread Petr Hosek via cfe-commits


@@ -296,6 +300,47 @@ void BareMetal::AddClangCXXStdlibIncludeArgs(const ArgList 
&DriverArgs,
 return;
 
   const Driver &D = getDriver();
+  std::string Target = getTripleString();
+
+  auto AddCXXIncludePath = [&](StringRef Path) {
+std::string Version = detectLibcxxVersion(Path);
+if (Version.empty())
+  return;
+
+// First add the per-target multilib include dir.
+if (!SelectedMultilibs.empty() && !SelectedMultilibs.back().isDefault()) {
+  const Multilib &M = SelectedMultilibs.back();
+  SmallString<128> TargetDir(Path);
+  llvm::sys::path::append(TargetDir, Target, M.gccSuffix(), "c++", 
Version);
+  if (getVFS().exists(TargetDir)) {
+addSystemInclude(DriverArgs, CC1Args, TargetDir);
+  }
+}
+
+// Second add the per-target include dir.
+SmallString<128> TargetDir(Path);
+llvm::sys::path::append(TargetDir, Target, "c++", Version);
+if (getVFS().exists(TargetDir))
+  addSystemInclude(DriverArgs, CC1Args, TargetDir);
+
+// Third the generic one.
+SmallString<128> Dir(Path);

petrhosek wrote:

Done.

https://github.com/llvm/llvm-project/pull/96736


[clang] [Clang] Access tls_guard via llvm.threadlocal.address (PR #96633)

2024-06-27 Thread via cfe-commits


@@ -1070,13 +1076,26 @@ 
CodeGenFunction::GenerateCXXGlobalInitFunc(llvm::Function *Fn,
   // Mark as initialized before initializing anything else. If the
   // initializers use previously-initialized thread_local vars, that's
   // probably supposed to be OK, but the standard doesn't say.
-  Builder.CreateStore(llvm::ConstantInt::get(GuardVal->getType(),1), 
Guard);
-
-  // The guard variable can't ever change again.
-  EmitInvariantStart(
-  Guard.getPointer(),
-  CharUnits::fromQuantity(
-  CGM.getDataLayout().getTypeAllocSize(GuardVal->getType())));
+  if (auto *GV = dyn_cast(Guard.getPointer()))
+// Get the thread-local address via intrinsic.
+if (GV->isThreadLocal())
+  GuardAddr = GuardAddr.withPointer(
+  Builder.CreateThreadLocalAddress(GV), NotKnownNonNull);
+  Builder.CreateStore(llvm::ConstantInt::get(GuardVal->getType(), 1),
+  GuardAddr);
+
+  // Emit invariant start for TLS guard address.
+  if (CGM.getCodeGenOpts().OptimizationLevel > 0) {
+uint64_t Width =
+CGM.getDataLayout().getTypeAllocSize(GuardVal->getType());
+llvm::Value *TLSAddr = Guard.getPointer();
+if (auto *GV = dyn_cast(Guard.getPointer()))
+  // Get the thread-local address via intrinsic.
+  if (GV->isThreadLocal())
+TLSAddr = Builder.CreateThreadLocalAddress(GV);
+Builder.CreateInvariantStart(

nikola-tesic-ns wrote:

I am not sure I understood this, sorry.

https://github.com/llvm/llvm-project/pull/96633


[clang] [Clang] Access tls_guard via llvm.threadlocal.address (PR #96633)

2024-06-27 Thread via cfe-commits


@@ -1059,9 +1059,15 @@ 
CodeGenFunction::GenerateCXXGlobalInitFunc(llvm::Function *Fn,
 if (Guard.isValid()) {
   // If we have a guard variable, check whether we've already performed
   // these initializations. This happens for TLS initialization functions.
-  llvm::Value *GuardVal = Builder.CreateLoad(Guard);
-  llvm::Value *Uninit = Builder.CreateIsNull(GuardVal,
- "guard.uninitialized");
+  Address GuardAddr = Guard;
+  if (auto *GV = dyn_cast(Guard.getPointer()))

nikola-tesic-ns wrote:

There is a code pattern where this "guarded initialization" is done for a 
non-TLS variable ([partitions.cpp test](https://github.com/nextsilicon/next-llvm-project/blob/b36811cc9baf1c72de2fa1c8b5d8fc30bae9a15c/clang/test/CodeGenCXX/partitions.cpp)). 
That's the reason I've introduced these checks.
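
For readers unfamiliar with the pattern, the guarded initialization emitted by this function behaves roughly like the hand-written C++ below. This is only a sketch of the semantics for the thread-local case; `guardedInit` and `initCount` are hypothetical names, and the real code is emitted as LLVM IR rather than written like this.

```cpp
#include <cassert>

// Hypothetical per-thread counter standing in for the real initializers.
inline int &initCount() {
  static thread_local int Count = 0;
  return Count;
}

// Runs the initializers at most once per thread.
inline void guardedInit() {
  static thread_local unsigned char Guard = 0; // plays the role of the guard
  if (Guard != 0) // already initialized on this thread
    return;
  Guard = 1;     // mark as initialized before running any initializer
  ++initCount(); // stand-in for the actual initializers
  assert(initCount() == 1 && "initializers ran more than once");
}
```

Calling `guardedInit()` twice on the same thread leaves `initCount()` at 1. For a non-thread-local guard the same check-then-set shape applies, which is why the patch tests `isThreadLocal()` before going through the intrinsic.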

https://github.com/llvm/llvm-project/pull/96633
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Clang] Access tls_guard via llvm.threadlocal.address (PR #96633)

2024-06-27 Thread Chuanqi Xu via cfe-commits


@@ -1059,9 +1059,15 @@ 
CodeGenFunction::GenerateCXXGlobalInitFunc(llvm::Function *Fn,
 if (Guard.isValid()) {
   // If we have a guard variable, check whether we've already performed
   // these initializations. This happens for TLS initialization functions.
-  llvm::Value *GuardVal = Builder.CreateLoad(Guard);
-  llvm::Value *Uninit = Builder.CreateIsNull(GuardVal,
- "guard.uninitialized");
+  Address GuardAddr = Guard;

ChuanqiXu9 wrote:

OK, I didn't look into the context.

https://github.com/llvm/llvm-project/pull/96633


[clang] [Clang] Access tls_guard via llvm.threadlocal.address (PR #96633)

2024-06-27 Thread Chuanqi Xu via cfe-commits


@@ -1070,13 +1076,26 @@ 
CodeGenFunction::GenerateCXXGlobalInitFunc(llvm::Function *Fn,
   // Mark as initialized before initializing anything else. If the
   // initializers use previously-initialized thread_local vars, that's
   // probably supposed to be OK, but the standard doesn't say.
-  Builder.CreateStore(llvm::ConstantInt::get(GuardVal->getType(),1), 
Guard);
-
-  // The guard variable can't ever change again.
-  EmitInvariantStart(
-  Guard.getPointer(),
-  CharUnits::fromQuantity(
-  CGM.getDataLayout().getTypeAllocSize(GuardVal->getType())));
+  if (auto *GV = dyn_cast(Guard.getPointer()))
+// Get the thread-local address via intrinsic.
+if (GV->isThreadLocal())
+  GuardAddr = GuardAddr.withPointer(
+  Builder.CreateThreadLocalAddress(GV), NotKnownNonNull);
+  Builder.CreateStore(llvm::ConstantInt::get(GuardVal->getType(), 1),
+  GuardAddr);
+
+  // Emit invariant start for TLS guard address.
+  if (CGM.getCodeGenOpts().OptimizationLevel > 0) {
+uint64_t Width =
+CGM.getDataLayout().getTypeAllocSize(GuardVal->getType());
+llvm::Value *TLSAddr = Guard.getPointer();
+if (auto *GV = dyn_cast(Guard.getPointer()))
+  // Get the thread-local address via intrinsic.
+  if (GV->isThreadLocal())
+TLSAddr = Builder.CreateThreadLocalAddress(GV);
+Builder.CreateInvariantStart(

ChuanqiXu9 wrote:

I mean, it used `EmitInvariantStart` but now it uses `CreateInvariantStart`. 
(Sorry, I meant to say "API".)

https://github.com/llvm/llvm-project/pull/96633


[clang] Run PreStmt/PostStmt checker for GCCAsmStmt (PR #95409)

2024-06-27 Thread via cfe-commits

https://github.com/T-Gruber edited 
https://github.com/llvm/llvm-project/pull/95409


[libcxx] [libcxxabi] [libunwind] [llvm] [runtimes] remove workaround for old CMake when setting `--unwindlib=none` (PR #93429)

2024-06-27 Thread Martin Storsjö via cfe-commits

mstorsjo wrote:

> So I've been trying to follow down the rabbit hole of the failing flag 
> checks, and it seems the combination of `CXX_SUPPORTS_UNWINDLIB_EQ_NONE_FLAG` 
> plus https://gitlab.kitware.com/cmake/cmake/-/issues/23454 has a wider blast 
> radius than anticipated.
> 
> I'm not claiming that adding the 
> `_previous_CMAKE_{REQUIRED_LINK_OPTIONS,TRY_COMPILE_TARGET_TYPE}` dance 
> everywhere is the right approach here, but it was - so far - the obvious path 
> to just try to get things green again. It's conceivable though that it would 
> be easier to simply shift the detection of 
> `CXX_SUPPORTS_UNWINDLIB_EQ_NONE_FLAG` until after the other flag checks have 
> been performed? 🤔

That's probably not possible...

The point is that when bootstrapping a new sysroot from scratch (i.e. building 
the initial libunwind etc), in a configuration where libunwind is linked in 
automatically, every test that tries to do linking will fail (as it implicitly 
tries to link in libunwind, which does not exist yet). Therefore, we need to 
add `--unwindlib=none` as an additional linker flag, as soon as possible, so 
that all following cmake checks will get the right result.

Also, in general, setting `CMAKE_TRY_COMPILE_TARGET_TYPE` to `STATIC_LIBRARY` 
in too wide a context will also give false positive checks, for cases where we 
intentionally want to check whether linking some library works and is found. 
But perhaps the way you do it here, adding it in a narrow context only when 
doing specific checks, is the right way? I'm not sure...

So that cmake issue seems to be really, really unfortunate here. :-( I wonder 
if the cure is worse than the disease here - and if it would be better to just 
keep what we have now - and simplify it only if cmake adds something like 
`CMAKE_REQUIRED_DYNAMIC_LINK_OPTIONS` or so.

https://github.com/llvm/llvm-project/pull/93429


[clang] [Clang] Access tls_guard via llvm.threadlocal.address (PR #96633)

2024-06-27 Thread via cfe-commits


@@ -1070,13 +1076,26 @@ 
CodeGenFunction::GenerateCXXGlobalInitFunc(llvm::Function *Fn,
   // Mark as initialized before initializing anything else. If the
   // initializers use previously-initialized thread_local vars, that's
   // probably supposed to be OK, but the standard doesn't say.
-  Builder.CreateStore(llvm::ConstantInt::get(GuardVal->getType(),1), 
Guard);
-
-  // The guard variable can't ever change again.
-  EmitInvariantStart(
-  Guard.getPointer(),
-  CharUnits::fromQuantity(
-  CGM.getDataLayout().getTypeAllocSize(GuardVal->getType())));
+  if (auto *GV = dyn_cast(Guard.getPointer()))
+// Get the thread-local address via intrinsic.
+if (GV->isThreadLocal())
+  GuardAddr = GuardAddr.withPointer(
+  Builder.CreateThreadLocalAddress(GV), NotKnownNonNull);
+  Builder.CreateStore(llvm::ConstantInt::get(GuardVal->getType(), 1),
+  GuardAddr);
+
+  // Emit invariant start for TLS guard address.
+  if (CGM.getCodeGenOpts().OptimizationLevel > 0) {
+uint64_t Width =
+CGM.getDataLayout().getTypeAllocSize(GuardVal->getType());
+llvm::Value *TLSAddr = Guard.getPointer();
+if (auto *GV = dyn_cast(Guard.getPointer()))
+  // Get the thread-local address via intrinsic.
+  if (GV->isThreadLocal())
+TLSAddr = Builder.CreateThreadLocalAddress(GV);
+Builder.CreateInvariantStart(

nikola-tesic-ns wrote:

OK, but `EmitInvariantStart` expects a Constant value, which `TLSAddr` cannot 
be if we set it conditionally. (But maybe it should be set unconditionally.)
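
For readers less familiar with the guard being discussed, the following is a minimal source-level sketch of what a thread-local init guard does. It is not the IR Clang emits; `tls_guard`, `tls_value`, and `init_count` are hypothetical names chosen for illustration. It only shows the ordering described in the quoted comment: the guard is set before the initializers run.

```cpp
// Hypothetical stand-ins, for illustration only; not Clang's generated code.
thread_local bool tls_guard = false; // per-thread "already initialized" flag
thread_local int tls_value = 0;      // the thread_local variable being set up
thread_local int init_count = 0;     // counts how often the initializer ran

// Set the guard first, then run the initializers, so that an initializer
// which touches another thread_local does not re-enter this function.
void tls_init() {
  if (tls_guard)
    return;
  tls_guard = true;
  tls_value = 42;
  ++init_count;
}

// Every access goes through the init function, as a TLS wrapper would.
int get_tls_value() {
  tls_init();
  return tls_value;
}
```

Because the guard itself is thread-local, its address is only known per thread at run time, which is presumably why the patch obtains it through the thread-local-address intrinsic before storing to it.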

https://github.com/llvm/llvm-project/pull/96633


[libcxx] [libcxxabi] [libunwind] [llvm] [runtimes] remove workaround for old CMake when setting `--unwindlib=none` (PR #93429)

2024-06-27 Thread via cfe-commits

h-vetinari wrote:

> So that cmake issue seems to be really, really unfortunate here. :-( I wonder 
> if the cure is worse than the disease here [...]

Yup, that's a distinct possibility IMO...

> [...] and if it would be better to just keep what we have now - and simplify 
> it only if cmake adds something like `CMAKE_REQUIRED_DYNAMIC_LINK_OPTIONS` or 
> so.

It would probably make sense to report back on the CMake issue how big the 
fallout from this is. Perhaps the CMake devs would reconsider, or at least take 
it as an indicator of the need for `CMAKE_REQUIRED_DYNAMIC_LINK_OPTIONS`. 
I think you understand the problem space much better than I do (I'm mostly 
stumbling around in a dark room TBH), so if you could do that, that would be 
great!

https://github.com/llvm/llvm-project/pull/93429


[clang] [flang] [llvm] Re-land: "[AArch64] Add ability to list extensions enabled for a target" (#95805) (PR #96795)

2024-06-27 Thread David Spickett via cfe-commits


@@ -343,7 +350,9 @@ bool isX18ReservedByDefault(const Triple &TT);
 // themselves, they are sequential (0, 1, 2, 3, ...).
 uint64_t getCpuSupportsMask(ArrayRef FeatureStrs);
 
-void PrintSupportedExtensions(StringMap DescMap);
+void PrintSupportedExtensions();
+
+void printEnabledExtensions(std::set EnabledFeatureNames);

DavidSpickett wrote:

Might as well be const correct too if you're changing it.
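
One reading of this suggestion is to take the set by const reference instead of by value. The sketch below uses `std::string` and a hypothetical helper, `joinEnabledExtensions`, because the element type of the real parameter is not visible in the quoted declaration; it is an illustration, not the signature from the patch.

```cpp
#include <set>
#include <string>

// Hypothetical stand-in for the function under review. Taking the set by
// const reference avoids copying it on each call and documents that the
// callee does not modify the caller's container.
std::string joinEnabledExtensions(
    const std::set<std::string> &EnabledFeatureNames) {
  std::string Out;
  for (const std::string &Name : EnabledFeatureNames) {
    if (!Out.empty())
      Out += ' ';
    Out += Name; // std::set iterates in sorted order
  }
  return Out;
}
```

A by-value `std::set` parameter copies every element each time the function is called, which is rarely intended for a function that only prints.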

https://github.com/llvm/llvm-project/pull/96795


[clang] [Clang][AST] Let DeclPrinter print trailing requires expressions for template parameters (PR #96864)

2024-06-27 Thread Younan Zhang via cfe-commits

https://github.com/zyn0217 created 
https://github.com/llvm/llvm-project/pull/96864

As discussed in 
https://github.com/llvm/llvm-project/pull/96084#discussion_r1654629993, it 
would be nice to present these trailing constraints on template parameters when 
printing CTAD decls through a DeclPrinter.

>From a5c33bd413d8150d1688240c6b5253b1760cafe1 Mon Sep 17 00:00:00 2001
From: Younan Zhang 
Date: Thu, 27 Jun 2024 15:59:48 +0800
Subject: [PATCH] [Clang][AST] Let DeclPrinter print trailing requires
 expressions for template parameters

As discussed in 
https://github.com/llvm/llvm-project/pull/96084#discussion_r1654629993,
it would be nice to present these trailing constraints on template
parameters when printing CTAD decls through a DeclPrinter.
---
 clang/docs/ReleaseNotes.rst|  1 +
 clang/lib/AST/DeclPrinter.cpp  | 10 ++
 clang/test/PCH/cxx2a-requires-expr.cpp | 17 +
 3 files changed, 28 insertions(+)

diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index 69aea6c21ad39..03b1daa6597cd 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -99,6 +99,7 @@ AST Dumping Potentially Breaking Changes
 
 
 - The text ast-dumper has improved printing of TemplateArguments.
+- The text decl-dumper prints template parameters' trailing requires 
expressions now.
 
 Clang Frontend Potentially Breaking Changes
 ---
diff --git a/clang/lib/AST/DeclPrinter.cpp b/clang/lib/AST/DeclPrinter.cpp
index 0cf4e64f83b8d..0a081e7e07ca8 100644
--- a/clang/lib/AST/DeclPrinter.cpp
+++ b/clang/lib/AST/DeclPrinter.cpp
@@ -1189,6 +1189,16 @@ void DeclPrinter::printTemplateParameters(const 
TemplateParameterList *Params,
   Out << '>';
   if (!OmitTemplateKW)
 Out << ' ';
+
+  if (const Expr *RequiresClause = Params->getRequiresClause()) {
+if (OmitTemplateKW)
+  Out << ' ';
+Out << "requires ";
+RequiresClause->printPretty(Out, nullptr, Policy, Indentation, "\n",
+&Context);
+if (!OmitTemplateKW)
+  Out << ' ';
+  }
 }
 
 void DeclPrinter::printTemplateArguments(ArrayRef Args,
diff --git a/clang/test/PCH/cxx2a-requires-expr.cpp 
b/clang/test/PCH/cxx2a-requires-expr.cpp
index 7f8f258a0f8f3..936f601685463 100644
--- a/clang/test/PCH/cxx2a-requires-expr.cpp
+++ b/clang/test/PCH/cxx2a-requires-expr.cpp
@@ -22,3 +22,20 @@ bool f() {
 requires C || (C || C);
   };
 }
+
+namespace trailing_requires_expression {
+
+template  requires C && C2
+// CHECK: template  requires C && C2 void g();
+void g();
+
+template  requires C || C2
+// CHECK: template  requires C || C2 constexpr int h = 
sizeof(T);
+constexpr int h = sizeof(T);
+
+template  requires C
+// CHECK:  template  requires C class i {
+// CHECK-NEXT: };
+class i {};
+
+}



[clang] [Clang][AST] Let DeclPrinter print trailing requires expressions for template parameters (PR #96864)

2024-06-27 Thread via cfe-commits

llvmbot wrote:




@llvm/pr-subscribers-clang

Author: Younan Zhang (zyn0217)


Changes

As discussed in 
https://github.com/llvm/llvm-project/pull/96084#discussion_r1654629993, it 
would be nice to present these trailing constraints on template parameters when 
printing CTAD decls through a DeclPrinter.

---
Full diff: https://github.com/llvm/llvm-project/pull/96864.diff


3 Files Affected:

- (modified) clang/docs/ReleaseNotes.rst (+1) 
- (modified) clang/lib/AST/DeclPrinter.cpp (+10) 
- (modified) clang/test/PCH/cxx2a-requires-expr.cpp (+17) 


``diff
diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index 69aea6c21ad39..03b1daa6597cd 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -99,6 +99,7 @@ AST Dumping Potentially Breaking Changes
 
 
 - The text ast-dumper has improved printing of TemplateArguments.
+- The text decl-dumper prints template parameters' trailing requires 
expressions now.
 
 Clang Frontend Potentially Breaking Changes
 ---
diff --git a/clang/lib/AST/DeclPrinter.cpp b/clang/lib/AST/DeclPrinter.cpp
index 0cf4e64f83b8d..0a081e7e07ca8 100644
--- a/clang/lib/AST/DeclPrinter.cpp
+++ b/clang/lib/AST/DeclPrinter.cpp
@@ -1189,6 +1189,16 @@ void DeclPrinter::printTemplateParameters(const 
TemplateParameterList *Params,
   Out << '>';
   if (!OmitTemplateKW)
 Out << ' ';
+
+  if (const Expr *RequiresClause = Params->getRequiresClause()) {
+if (OmitTemplateKW)
+  Out << ' ';
+Out << "requires ";
+RequiresClause->printPretty(Out, nullptr, Policy, Indentation, "\n",
+&Context);
+if (!OmitTemplateKW)
+  Out << ' ';
+  }
 }
 
 void DeclPrinter::printTemplateArguments(ArrayRef Args,
diff --git a/clang/test/PCH/cxx2a-requires-expr.cpp 
b/clang/test/PCH/cxx2a-requires-expr.cpp
index 7f8f258a0f8f3..936f601685463 100644
--- a/clang/test/PCH/cxx2a-requires-expr.cpp
+++ b/clang/test/PCH/cxx2a-requires-expr.cpp
@@ -22,3 +22,20 @@ bool f() {
 requires C || (C || C);
   };
 }
+
+namespace trailing_requires_expression {
+
+template  requires C && C2
+// CHECK: template  requires C && C2 void g();
+void g();
+
+template  requires C || C2
+// CHECK: template  requires C || C2 constexpr int h = 
sizeof(T);
+constexpr int h = sizeof(T);
+
+template  requires C
+// CHECK:  template  requires C class i {
+// CHECK-NEXT: };
+class i {};
+
+}

``




https://github.com/llvm/llvm-project/pull/96864


[clang] [clang] CTAD alias: fix transformation for require-clause expr Part2. (PR #93533)

2024-06-27 Thread Haojian Wu via cfe-commits

https://github.com/hokein updated 
https://github.com/llvm/llvm-project/pull/93533

>From 14817083f75f9615e9df4c905e09bc4e9b199336 Mon Sep 17 00:00:00 2001
From: Haojian Wu 
Date: Fri, 17 May 2024 15:28:48 +0200
Subject: [PATCH 1/2] [clang] CTAD alias: fix transformation for require-clause
 expr Part2.

In the https://github.com/llvm/llvm-project/pull/90961 fix, we missed a
case where the undeduced template parameters of the underlying deduction
guide are not transformed, which leaves incorrect depth/index
information and causes a crash when evaluating the constraints.

This patch fixes this missing case.

Fixes #92596
Fixes #92212
---
 clang/lib/Sema/SemaTemplate.cpp  | 32 
 clang/test/AST/ast-dump-ctad-alias.cpp   | 25 +++
 clang/test/SemaCXX/cxx20-ctad-type-alias.cpp | 25 +++
 3 files changed, 76 insertions(+), 6 deletions(-)

diff --git a/clang/lib/Sema/SemaTemplate.cpp b/clang/lib/Sema/SemaTemplate.cpp
index e36ee2d5a46cf..3869f789da78b 100644
--- a/clang/lib/Sema/SemaTemplate.cpp
+++ b/clang/lib/Sema/SemaTemplate.cpp
@@ -2779,6 +2779,7 @@ Expr *
 buildAssociatedConstraints(Sema &SemaRef, FunctionTemplateDecl *F,
TypeAliasTemplateDecl *AliasTemplate,
ArrayRef DeduceResults,
+   unsigned UndeducedTemplateParameterStartIndex,
Expr *IsDeducible) {
   Expr *RC = F->getTemplateParameters()->getRequiresClause();
   if (!RC)
@@ -2839,8 +2840,22 @@ buildAssociatedConstraints(Sema &SemaRef, 
FunctionTemplateDecl *F,
 
   for (unsigned Index = 0; Index < DeduceResults.size(); ++Index) {
 const auto &D = DeduceResults[Index];
-if (D.isNull())
+if (D.isNull()) { // non-deduced template parameters of f
+  auto TP = F->getTemplateParameters()->getParam(Index);
+  MultiLevelTemplateArgumentList Args;
+  Args.setKind(TemplateSubstitutionKind::Rewrite);
+  Args.addOuterTemplateArguments(TemplateArgsForBuildingRC);
+  // Rebuild the template parameter with updated depth and index.
+  NamedDecl *NewParam = transformTemplateParameter(
+  SemaRef, F->getDeclContext(), TP, Args,
+  /*NewIndex=*/UndeducedTemplateParameterStartIndex++,
+  getTemplateParameterDepth(TP) + AdjustDepth);
+
+  assert(TemplateArgsForBuildingRC[Index].isNull());
+  TemplateArgsForBuildingRC[Index] = Context.getCanonicalTemplateArgument(
+  Context.getInjectedTemplateArg(NewParam));
   continue;
+}
 TemplateArgumentLoc Input =
 SemaRef.getTrivialTemplateArgumentLoc(D, QualType(), SourceLocation{});
 TemplateArgumentLoc Output;
@@ -2856,9 +2871,11 @@ buildAssociatedConstraints(Sema &SemaRef, 
FunctionTemplateDecl *F,
   MultiLevelTemplateArgumentList ArgsForBuildingRC;
   ArgsForBuildingRC.setKind(clang::TemplateSubstitutionKind::Rewrite);
   ArgsForBuildingRC.addOuterTemplateArguments(TemplateArgsForBuildingRC);
-  // For 2), if the underlying F is instantiated from a member template, we 
need
-  // the entire template argument list, as the constraint AST in the
-  // require-clause of F remains completely uninstantiated.
+  // For 2), if the underlying function template F is nested in a class 
template
+  // (either instantiated from an explicitly-written deduction guide, or
+  // synthesized from a constructor), we need the entire template argument 
list,
+  // as the constraint AST in the require-clause of F remains completely
+  // uninstantiated.
   //
   // For example:
   //   template  // depth 0
@@ -2881,7 +2898,8 @@ buildAssociatedConstraints(Sema &SemaRef, 
FunctionTemplateDecl *F,
   // We add the outer template arguments which is [int] to the multi-level arg
   // list to ensure that the occurrence U in `C` will be replaced with int
   // during the substitution.
-  if (F->getInstantiatedFromMemberTemplate()) {
+  if (F->getLexicalDeclContext()->getDeclKind() ==
+  clang::Decl::ClassTemplateSpecialization) {
 auto OuterLevelArgs = SemaRef.getTemplateInstantiationArgs(
 F, F->getLexicalDeclContext(),
 /*Final=*/false, /*Innermost=*/std::nullopt,
@@ -3099,6 +3117,7 @@ BuildDeductionGuideForTypeAlias(Sema &SemaRef,
 Context.getInjectedTemplateArg(NewParam));
 TransformedDeducedAliasArgs[AliasTemplateParamIdx] = NewTemplateArgument;
   }
+  unsigned UndeducedTemplateParameterStartIndex = FPrimeTemplateParams.size();
   //   ...followed by the template parameters of f that were not deduced
   //   (including their default template arguments)
   for (unsigned FTemplateParamIdx : NonDeducedTemplateParamsInFIndex) {
@@ -3168,7 +3187,8 @@ BuildDeductionGuideForTypeAlias(Sema &SemaRef,
 Expr *IsDeducible = buildIsDeducibleConstraint(
 SemaRef, AliasTemplate, FPrime->getReturnType(), FPrimeTemplateParams);
 Expr *RequiresClause = buildAssociatedConstraints(
-SemaRef, F, AliasTemplate, DeduceResults, IsDeducible);
+SemaRef,

[clang] [clang] CTAD alias: fix transformation for require-clause expr Part2. (PR #93533)

2024-06-27 Thread Haojian Wu via cfe-commits

https://github.com/hokein edited https://github.com/llvm/llvm-project/pull/93533


[clang] [clang] CTAD alias: fix transformation for require-clause expr Part2. (PR #93533)

2024-06-27 Thread Haojian Wu via cfe-commits

https://github.com/hokein commented:

thanks for the review.

https://github.com/llvm/llvm-project/pull/93533


[clang] [clang] CTAD alias: fix transformation for require-clause expr Part2. (PR #93533)

2024-06-27 Thread Haojian Wu via cfe-commits


@@ -2840,8 +2841,22 @@ buildAssociatedConstraints(Sema &SemaRef, 
FunctionTemplateDecl *F,
 
   for (unsigned Index = 0; Index < DeduceResults.size(); ++Index) {
 const auto &D = DeduceResults[Index];
-if (D.isNull())
+if (D.isNull()) { // non-deduced template parameters of f
+  auto TP = F->getTemplateParameters()->getParam(Index);
+  MultiLevelTemplateArgumentList Args;
+  Args.setKind(TemplateSubstitutionKind::Rewrite);
+  Args.addOuterTemplateArguments(TemplateArgsForBuildingRC);
+  // Rebuild the template parameter with updated depth and index.
+  NamedDecl *NewParam = transformTemplateParameter(
+  SemaRef, F->getDeclContext(), TP, Args,
+  /*NewIndex=*/UndeducedTemplateParameterStartIndex++,
+  getTemplateParameterDepth(TP) + AdjustDepth);

hokein wrote:

Done.

https://github.com/llvm/llvm-project/pull/93533


[clang] [clang] CTAD alias: fix transformation for require-clause expr Part2. (PR #93533)

2024-06-27 Thread Haojian Wu via cfe-commits


@@ -2882,7 +2899,8 @@ buildAssociatedConstraints(Sema &SemaRef, 
FunctionTemplateDecl *F,
   // We add the outer template arguments which is [int] to the multi-level arg
   // list to ensure that the occurrence U in `C` will be replaced with int
   // during the substitution.
-  if (F->getInstantiatedFromMemberTemplate()) {
+  if (F->getLexicalDeclContext()->getDeclKind() ==
+  clang::Decl::ClassTemplateSpecialization) {

hokein wrote:

Not needed. The F here is an instantiated template (either from an explicit 
deduction guide within a class or a constructor), so its DeclContext cannot be 
ClassTemplatePartialSpecialization. I made a code comment to clarify it.

https://github.com/llvm/llvm-project/pull/93533


[clang] [clang] CTAD alias: fix transformation for require-clause expr Part2. (PR #93533)

2024-06-27 Thread Haojian Wu via cfe-commits


@@ -2840,8 +2841,22 @@ buildAssociatedConstraints(Sema &SemaRef, 
FunctionTemplateDecl *F,
 
   for (unsigned Index = 0; Index < DeduceResults.size(); ++Index) {
 const auto &D = DeduceResults[Index];
-if (D.isNull())
+if (D.isNull()) { // non-deduced template parameters of f
+  auto TP = F->getTemplateParameters()->getParam(Index);

hokein wrote:

Done

https://github.com/llvm/llvm-project/pull/93533


[clang] [clang] CTAD alias: fix transformation for require-clause expr Part2. (PR #93533)

2024-06-27 Thread Haojian Wu via cfe-commits


@@ -3100,6 +3118,7 @@ BuildDeductionGuideForTypeAlias(Sema &SemaRef,
 Context.getInjectedTemplateArg(NewParam));
 TransformedDeducedAliasArgs[AliasTemplateParamIdx] = NewTemplateArgument;
   }
+  unsigned UndeducedTemplateParameterStartIndex = FPrimeTemplateParams.size();

hokein wrote:

Done.

https://github.com/llvm/llvm-project/pull/93533


[clang] [llvm] [Pipelines] Move IPSCCP after inliner pipeline (PR #96620)

2024-06-27 Thread via cfe-commits

https://github.com/sihuan updated 
https://github.com/llvm/llvm-project/pull/96620

>From abf211c35e39efc5d8f30019e10a14766985c185 Mon Sep 17 00:00:00 2001
From: SiHuaN 
Date: Tue, 25 Jun 2024 18:04:33 +0800
Subject: [PATCH 1/3] [Pipelines] Move IPSCCP after inliner pipeline

Moving the Interprocedural Constant Propagation (IPSCCP) pass to run after the
inliner pipeline can enhance optimization effectiveness. Performance uplift
for SPEC2017:548.exchange2_r on rv64gc is over 40%.
---
 llvm/lib/Passes/PassBuilderPipelines.cpp | 28 
 1 file changed, 14 insertions(+), 14 deletions(-)

diff --git a/llvm/lib/Passes/PassBuilderPipelines.cpp 
b/llvm/lib/Passes/PassBuilderPipelines.cpp
index 926515c9508a9..82e2690f4f441 100644
--- a/llvm/lib/Passes/PassBuilderPipelines.cpp
+++ b/llvm/lib/Passes/PassBuilderPipelines.cpp
@@ -1118,20 +1118,6 @@ 
PassBuilder::buildModuleSimplificationPipeline(OptimizationLevel Level,
 
   invokePipelineEarlySimplificationEPCallbacks(MPM, Level);
 
-  // Interprocedural constant propagation now that basic cleanup has occurred
-  // and prior to optimizing globals.
-  // FIXME: This position in the pipeline hasn't been carefully considered in
-  // years, it should be re-analyzed.
-  MPM.addPass(IPSCCPPass(
-  IPSCCPOptions(/*AllowFuncSpec=*/
-Level != OptimizationLevel::Os &&
-Level != OptimizationLevel::Oz &&
-!isLTOPreLink(Phase))));
-
-  // Attach metadata to indirect call sites indicating the set of functions
-  // they may target at run-time. This should follow IPSCCP.
-  MPM.addPass(CalledValuePropagationPass());
-
   // Optimize globals to try and fold them into constants.
   MPM.addPass(GlobalOptPass());
 
@@ -1204,6 +1190,20 @@ 
PassBuilder::buildModuleSimplificationPipeline(OptimizationLevel Level,
   else
 MPM.addPass(buildInlinerPipeline(Level, Phase));
 
+  // Interprocedural constant propagation after the inliner pipeline yields
+  // better optimization results.
+  // FIXME: This position in the pipeline hasn't been carefully considered in
+  // years, it should be re-analyzed.
+  MPM.addPass(IPSCCPPass(
+  IPSCCPOptions(/*AllowFuncSpec=*/
+Level != OptimizationLevel::Os &&
+Level != OptimizationLevel::Oz &&
+!isLTOPreLink(Phase))));
+
+  // Attach metadata to indirect call sites indicating the set of functions
+  // they may target at run-time. This should follow IPSCCP.
+  MPM.addPass(CalledValuePropagationPass());
+
   // Remove any dead arguments exposed by cleanups, constant folding globals,
   // and argument promotion.
   MPM.addPass(DeadArgumentEliminationPass());

>From 1f8eef8e2b98eadc79f7c89456f701e24b956716 Mon Sep 17 00:00:00 2001
From: SiHuaN 
Date: Thu, 27 Jun 2024 15:11:38 +0800
Subject: [PATCH 2/3] Restore IPSCCP Pass to its original position and repeat
 it after the inliner pipeline

---
 llvm/lib/Passes/PassBuilderPipelines.cpp | 22 ++
 1 file changed, 14 insertions(+), 8 deletions(-)

diff --git a/llvm/lib/Passes/PassBuilderPipelines.cpp 
b/llvm/lib/Passes/PassBuilderPipelines.cpp
index 82e2690f4f441..5659c116e9c95 100644
--- a/llvm/lib/Passes/PassBuilderPipelines.cpp
+++ b/llvm/lib/Passes/PassBuilderPipelines.cpp
@@ -1118,6 +1118,20 @@ 
PassBuilder::buildModuleSimplificationPipeline(OptimizationLevel Level,
 
   invokePipelineEarlySimplificationEPCallbacks(MPM, Level);
 
+  // Interprocedural constant propagation now that basic cleanup has occurred
+  // and prior to optimizing globals.
+  // FIXME: This position in the pipeline hasn't been carefully considered in
+  // years, it should be re-analyzed.
+  MPM.addPass(IPSCCPPass(
+  IPSCCPOptions(/*AllowFuncSpec=*/
+Level != OptimizationLevel::Os &&
+Level != OptimizationLevel::Oz &&
+!isLTOPreLink(Phase))));
+
+  // Attach metadata to indirect call sites indicating the set of functions
+  // they may target at run-time. This should follow IPSCCP.
+  MPM.addPass(CalledValuePropagationPass());
+
   // Optimize globals to try and fold them into constants.
   MPM.addPass(GlobalOptPass());
 
@@ -1190,20 +1204,12 @@ 
PassBuilder::buildModuleSimplificationPipeline(OptimizationLevel Level,
   else
 MPM.addPass(buildInlinerPipeline(Level, Phase));
 
-  // Interprocedural constant propagation after the inliner pipeline yields
-  // better optimization results.
-  // FIXME: This position in the pipeline hasn't been carefully considered in
-  // years, it should be re-analyzed.
   MPM.addPass(IPSCCPPass(
   IPSCCPOptions(/*AllowFuncSpec=*/
 Level != OptimizationLevel::Os &&
 Level != OptimizationLevel::Oz &&
 !isLTOPreLink(Phase))));
 
-  // Attach metadata to indirect call sites indi

[clang] [llvm] [Pipelines] Move IPSCCP after inliner pipeline (PR #96620)

2024-06-27 Thread via cfe-commits

llvmbot wrote:




@llvm/pr-subscribers-clang

Author: SiHuaN (sihuan)


Changes

This patch significantly improves the performance of LLVM on the 
SPEC2017:548.exchange2_r benchmark, with a performance uplift of over 40% on 
rv64gc.

During our investigation into the significant performance disparity between GCC 
and LLVM on the SPEC2017:548.exchange2_r benchmark on RISC-V, we identified 
that the primary difference stems from constant propagation optimization. 
In GCC, the hotspot function `digits_2` is split into several parts:
```console
$ objdump -D exchange2_r_gcc | grep "digits_2.*:$"
0001d480 <__brute_force_MOD_digits_2.isra.0>:
0001f0f6 <__brute_force_MOD_digits_2.constprop.7.isra.0>:
0001fdd0 <__brute_force_MOD_digits_2.constprop.6.isra.0>:
00020900 <__brute_force_MOD_digits_2.constprop.5.isra.0>:
000211c4 <__brute_force_MOD_digits_2.constprop.4.isra.0>:
00022002 <__brute_force_MOD_digits_2.constprop.3.isra.0>:
00022d6a <__brute_force_MOD_digits_2.constprop.2.isra.0>:
00023898 <__brute_force_MOD_digits_2.constprop.1.isra.0>:
```
However, in LLVM, this function is not split:
```console
$ objdump -D exchange2_r_llvm | grep "digits_2.*:$"
000115a0 <_QMbrute_forcePdigits_2>:
```
By applying this patch, LLVM now exhibits similar behavior, resulting in a 
substantial performance uplift.
```console
$ objdump -D exchange2_r_patched_llvm | grep "digits_2.*:$"
00011ab0 <_QMbrute_forcePdigits_2>:
00018a4e <_QMbrute_forcePdigits_2.specialized.1>:
00019820 <_QMbrute_forcePdigits_2.specialized.2>:
0001a436 <_QMbrute_forcePdigits_2.specialized.3>:
0001ae78 <_QMbrute_forcePdigits_2.specialized.4>:
0001ba8e <_QMbrute_forcePdigits_2.specialized.5>:
0001c7e6 <_QMbrute_forcePdigits_2.specialized.6>:
0001d072 <_QMbrute_forcePdigits_2.specialized.7>:
0001dad0 <_QMbrute_forcePdigits_2.specialized.8>:
```
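
As a rough illustration of what the `.specialized.N` clones above represent, the sketch below writes out function specialization by hand. The names `sum_multiples` and `sum_multiples_specialized_2` are hypothetical and unrelated to the benchmark; the example assumes a call site that always passes the constant `2` for `step`.

```cpp
// Hypothetical kernel standing in for a hot function such as `digits_2`.
// Assumes step > 0; a non-positive step would never terminate.
int sum_multiples(int step, int limit) {
  int total = 0;
  for (int i = 0; i < limit; i += step)
    total += i;
  return total;
}

// Hand-written equivalent of a clone specialized for step == 2. With the
// constant propagated into the body, later passes can simplify the loop
// further than they could for the generic version.
int sum_multiples_specialized_2(int limit) {
  int total = 0;
  for (int i = 0; i < limit; i += 2)
    total += i;
  return total;
}

// A call site like this one, with a constant argument, is what makes the
// specialized clone applicable.
int caller(int limit) { return sum_multiples(2, limit); }
```

The optimizer creates such clones only when its cost model judges the code-size increase worthwhile, which is consistent with the patch keeping specialization disabled at `Os` and `Oz`.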
And we used `perf stat` to measure the instruction count for `exchange2_r 0` on 
rv64gc, as shown in the table below:
| Compiler | Instructions |
|---|---|
| GCC #d28ea8e5 | 55,965,728,914 |
| LLVM #62d44fbd | 105,416,890,241 |
| LLVM #62d44fbd with this patch | 62,693,427,761 |

Additionally, I performed tests on x86_64, yielding similar results:
| Compiler | cpu_atom instructions |
|---|---|
| LLVM #62d44fbd | 100,147,914,793 |
| LLVM #62d44fbd with this patch | 53,077,337,115 |



---
Full diff: https://github.com/llvm/llvm-project/pull/96620.diff


12 Files Affected:

- (modified) clang/test/CodeGen/attr-counted-by.c (+2-2) 
- (modified) llvm/lib/Passes/PassBuilderPipelines.cpp (+6) 
- (modified) llvm/test/Other/new-pm-defaults.ll (+1) 
- (modified) llvm/test/Other/new-pm-thinlto-postlink-defaults.ll (+2-1) 
- (modified) llvm/test/Other/new-pm-thinlto-postlink-pgo-defaults.ll (+1) 
- (modified) llvm/test/Other/new-pm-thinlto-postlink-samplepgo-defaults.ll (+1) 
- (modified) llvm/test/Other/new-pm-thinlto-prelink-defaults.ll (+1) 
- (modified) llvm/test/Other/new-pm-thinlto-prelink-pgo-defaults.ll (+1) 
- (modified) llvm/test/Other/new-pm-thinlto-prelink-samplepgo-defaults.ll (+1) 
- (modified) 
llvm/test/Transforms/PhaseOrdering/AArch64/constraint-elimination-placement.ll 
(+7-25) 
- (modified) llvm/test/Transforms/PhaseOrdering/dce-after-argument-promotion.ll 
(+3-4) 
- (modified) 
llvm/test/Transforms/PhaseOrdering/deletion-of-loops-that-became-side-effect-free.ll
 (+15-40) 


``diff
diff --git a/clang/test/CodeGen/attr-counted-by.c 
b/clang/test/CodeGen/attr-counted-by.c
index 79922eb4159f1..8d0e39d0e3dad 100644
--- a/clang/test/CodeGen/attr-counted-by.c
+++ b/clang/test/CodeGen/attr-counted-by.c
@@ -639,7 +639,7 @@ void test6(struct anon_struct *p, int index) {
   p->array[index] = __builtin_dynamic_object_size(p->array, 1);
 }
 
-// SANITIZE-WITH-ATTR-LABEL: define dso_local i64 @test6_bdos(
+// SANITIZE-WITH-ATTR-LABEL: define dso_local range(i64 0, -3) i64 @test6_bdos(
 // SANITIZE-WITH-ATTR-SAME: ptr nocapture noundef readonly [[P:%.*]]) 
local_unnamed_addr #[[ATTR2]] {
 // SANITIZE-WITH-ATTR-NEXT:  entry:
 // SANITIZE-WITH-ATTR-NEXT:[[DOT_COUNTED_BY_GEP:%.*]] = getelementptr 
inbounds i8, ptr [[P]], i64 8
@@ -649,7 +649,7 @@ void test6(struct anon_struct *p, int index) {
 // SANITIZE-WITH-ATTR-NEXT:[[TMP1:%.*]] = select i1 [[DOTINV]], i64 0, i64 
[[TMP0]]
 // SANITIZE-WITH-ATTR-NEXT:ret i64 [[TMP1]]
 //
-// NO-SANITIZE-WITH-ATTR-LABEL: define dso_local i64 @test6_bdos(
+// NO-SANITIZE-WITH-ATTR-LABEL: define dso_local range(i64 0, -3) i64 
@test6_bdos(
 // NO-SANITIZE-WITH-ATTR-SAME: ptr nocapture noundef readonly [[P:%.*]]) 
local_unnamed_addr #[[ATTR2]] {
 // NO-SANITIZE-WITH-ATTR-NEXT:  entry:
 // NO-SANITIZE-WITH-ATTR-NEXT:[[DOT_COUNTED_BY_GEP:%.*]] = getelementptr 
inbounds i8, ptr [[P]], i64 8
diff --git a/llvm/lib/Passes/PassBuilderPipelines.cpp 
b/llvm/lib/Passes/PassBuilderPipelines.cpp
index 926515c9

[clang] [llvm] [PAC][ELF][AArch64] Encode signed GOT flag in PAuth core info (PR #96159)

2024-06-27 Thread James Henderson via cfe-commits

https://github.com/jh7370 commented:

I'm not at all familiar with this PAuth stuff, but don't you need a test case 
for where the new value is set (currently they all seem to be unset, if I'm 
interpreting things correctly)?

https://github.com/llvm/llvm-project/pull/96159


[clang] [llvm] [Pipelines] Move IPSCCP after inliner pipeline (PR #96620)

2024-06-27 Thread via cfe-commits

https://github.com/sihuan updated 
https://github.com/llvm/llvm-project/pull/96620

>From abf211c35e39efc5d8f30019e10a14766985c185 Mon Sep 17 00:00:00 2001
From: SiHuaN 
Date: Tue, 25 Jun 2024 18:04:33 +0800
Subject: [PATCH 1/4] [Pipelines] Move IPSCCP after inliner pipeline

Moving the Interprocedural Constant Propagation (IPSCCP) pass to run after the
inliner pipeline can enhance optimization effectiveness. Performance uplift
for SPEC2017:548.exchange2_r on rv64gc is over 40%.
---
 llvm/lib/Passes/PassBuilderPipelines.cpp | 28 
 1 file changed, 14 insertions(+), 14 deletions(-)

diff --git a/llvm/lib/Passes/PassBuilderPipelines.cpp 
b/llvm/lib/Passes/PassBuilderPipelines.cpp
index 926515c9508a9..82e2690f4f441 100644
--- a/llvm/lib/Passes/PassBuilderPipelines.cpp
+++ b/llvm/lib/Passes/PassBuilderPipelines.cpp
@@ -1118,20 +1118,6 @@ 
PassBuilder::buildModuleSimplificationPipeline(OptimizationLevel Level,
 
   invokePipelineEarlySimplificationEPCallbacks(MPM, Level);
 
-  // Interprocedural constant propagation now that basic cleanup has occurred
-  // and prior to optimizing globals.
-  // FIXME: This position in the pipeline hasn't been carefully considered in
-  // years, it should be re-analyzed.
-  MPM.addPass(IPSCCPPass(
-  IPSCCPOptions(/*AllowFuncSpec=*/
-Level != OptimizationLevel::Os &&
-Level != OptimizationLevel::Oz &&
-!isLTOPreLink(Phase))));
-
-  // Attach metadata to indirect call sites indicating the set of functions
-  // they may target at run-time. This should follow IPSCCP.
-  MPM.addPass(CalledValuePropagationPass());
-
   // Optimize globals to try and fold them into constants.
   MPM.addPass(GlobalOptPass());
 
@@ -1204,6 +1190,20 @@ 
PassBuilder::buildModuleSimplificationPipeline(OptimizationLevel Level,
   else
 MPM.addPass(buildInlinerPipeline(Level, Phase));
 
+  // Interprocedural constant propagation after the inliner pipeline yields
+  // better optimization results.
+  // FIXME: This position in the pipeline hasn't been carefully considered in
+  // years, it should be re-analyzed.
+  MPM.addPass(IPSCCPPass(
+  IPSCCPOptions(/*AllowFuncSpec=*/
+Level != OptimizationLevel::Os &&
+Level != OptimizationLevel::Oz &&
+!isLTOPreLink(Phase))));
+
+  // Attach metadata to indirect call sites indicating the set of functions
+  // they may target at run-time. This should follow IPSCCP.
+  MPM.addPass(CalledValuePropagationPass());
+
   // Remove any dead arguments exposed by cleanups, constant folding globals,
   // and argument promotion.
   MPM.addPass(DeadArgumentEliminationPass());

>From 1f8eef8e2b98eadc79f7c89456f701e24b956716 Mon Sep 17 00:00:00 2001
From: SiHuaN 
Date: Thu, 27 Jun 2024 15:11:38 +0800
Subject: [PATCH 2/4] Restore IPSCCP Pass to its original position and repeat
 it after the inliner pipeline

---
 llvm/lib/Passes/PassBuilderPipelines.cpp | 22 ++
 1 file changed, 14 insertions(+), 8 deletions(-)

diff --git a/llvm/lib/Passes/PassBuilderPipelines.cpp 
b/llvm/lib/Passes/PassBuilderPipelines.cpp
index 82e2690f4f441..5659c116e9c95 100644
--- a/llvm/lib/Passes/PassBuilderPipelines.cpp
+++ b/llvm/lib/Passes/PassBuilderPipelines.cpp
@@ -1118,6 +1118,20 @@ 
PassBuilder::buildModuleSimplificationPipeline(OptimizationLevel Level,
 
   invokePipelineEarlySimplificationEPCallbacks(MPM, Level);
 
+  // Interprocedural constant propagation now that basic cleanup has occurred
+  // and prior to optimizing globals.
+  // FIXME: This position in the pipeline hasn't been carefully considered in
+  // years, it should be re-analyzed.
+  MPM.addPass(IPSCCPPass(
+  IPSCCPOptions(/*AllowFuncSpec=*/
+Level != OptimizationLevel::Os &&
+Level != OptimizationLevel::Oz &&
+!isLTOPreLink(Phase))));
+
+  // Attach metadata to indirect call sites indicating the set of functions
+  // they may target at run-time. This should follow IPSCCP.
+  MPM.addPass(CalledValuePropagationPass());
+
   // Optimize globals to try and fold them into constants.
   MPM.addPass(GlobalOptPass());
 
@@ -1190,20 +1204,12 @@ 
PassBuilder::buildModuleSimplificationPipeline(OptimizationLevel Level,
   else
 MPM.addPass(buildInlinerPipeline(Level, Phase));
 
-  // Interprocedural constant propagation after the inliner pipeline yields
-  // better optimization results.
-  // FIXME: This position in the pipeline hasn't been carefully considered in
-  // years, it should be re-analyzed.
   MPM.addPass(IPSCCPPass(
   IPSCCPOptions(/*AllowFuncSpec=*/
 Level != OptimizationLevel::Os &&
 Level != OptimizationLevel::Oz &&
 !isLTOPreLink(Phase))));
 
-  // Attach metadata to indirect call sites indi

[clang] [Clang][AST] Let DeclPrinter print trailing requires expressions for template parameters (PR #96864)

2024-06-27 Thread Haojian Wu via cfe-commits


@@ -1189,6 +1189,16 @@ void DeclPrinter::printTemplateParameters(const 
TemplateParameterList *Params,
   Out << '>';
   if (!OmitTemplateKW)
 Out << ' ';
+
+  if (const Expr *RequiresClause = Params->getRequiresClause()) {

hokein wrote:

If I read the code correctly, it looks like we can move this code to Line 1190 
(immediately before the `if (!OmitTemplateKW)` above)? Then we can get rid of 
all the `OmitTemplateKW` logic inside this if branch.
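
The suggestion can be checked with a small stand-alone sketch. It models the printer with `std::ostringstream` and a plain string for the requires clause; `printAsPosted` and `printAsSuggested` are hypothetical names, and neither function is the actual `DeclPrinter` code.

```cpp
#include <sstream>
#include <string>

// Shape of the patch as posted: the requires clause is printed after the
// trailing-space check, so it needs its own OmitTemplateKW handling.
std::string printAsPosted(bool OmitTemplateKW,
                          const std::string &RequiresClause) {
  std::ostringstream Out;
  Out << '>';
  if (!OmitTemplateKW)
    Out << ' ';
  if (!RequiresClause.empty()) {
    if (OmitTemplateKW)
      Out << ' ';
    Out << "requires " << RequiresClause;
    if (!OmitTemplateKW)
      Out << ' ';
  }
  return Out.str();
}

// Shape of the reviewer's suggestion: print the clause first, then let the
// existing check emit the trailing space once.
std::string printAsSuggested(bool OmitTemplateKW,
                             const std::string &RequiresClause) {
  std::ostringstream Out;
  Out << '>';
  if (!RequiresClause.empty())
    Out << " requires " << RequiresClause;
  if (!OmitTemplateKW)
    Out << ' ';
  return Out.str();
}
```

Under these assumptions both variants produce the same text for every combination of `OmitTemplateKW` and presence of a clause, so the reordering would be a pure simplification.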

https://github.com/llvm/llvm-project/pull/96864


[clang] [Clang] Bring initFeatureMap back to AArch64TargetInfo. (PR #96832)

2024-06-27 Thread Tomas Matheson via cfe-commits

tmatheson-arm wrote:

And please add a test to cover whatever broke.

https://github.com/llvm/llvm-project/pull/96832


[clang] [llvm] [X86][CodeGen] security check cookie execute only when needed (PR #95904)

2024-06-27 Thread via cfe-commits

https://github.com/mahesh-attarde updated 
https://github.com/llvm/llvm-project/pull/95904

>From 6d6619f8f7a37906ac45791487a4d63b51a48ad1 Mon Sep 17 00:00:00 2001
From: mahesh-attarde 
Date: Wed, 12 Jun 2024 06:15:51 -0700
Subject: [PATCH 1/5] added regcall strct by reg support

---
 clang/lib/CodeGen/Targets/X86.cpp | 20 
 clang/test/CodeGen/regcall3.c | 53 +++
 2 files changed, 73 insertions(+)
 create mode 100644 clang/test/CodeGen/regcall3.c

diff --git a/clang/lib/CodeGen/Targets/X86.cpp 
b/clang/lib/CodeGen/Targets/X86.cpp
index 43dadf5e724ac..506d106ad65b0 100644
--- a/clang/lib/CodeGen/Targets/X86.cpp
+++ b/clang/lib/CodeGen/Targets/X86.cpp
@@ -148,6 +148,7 @@ class X86_32ABIInfo : public ABIInfo {
 
   Class classify(QualType Ty) const;
   ABIArgInfo classifyReturnType(QualType RetTy, CCState &State) const;
+
   ABIArgInfo classifyArgumentType(QualType RetTy, CCState &State,
   unsigned ArgIndex) const;
 
@@ -1306,6 +1307,8 @@ class X86_64ABIInfo : public ABIInfo {
unsigned &NeededSSE,
unsigned &MaxVectorWidth) const;
 
+  bool DoesRegcallStructFitInReg(QualType Ty) const;
+
   bool IsIllegalVectorType(QualType Ty) const;
 
   /// The 0.98 ABI revision clarified a lot of ambiguities,
@@ -2830,6 +2833,20 @@ X86_64ABIInfo::classifyArgumentType(QualType Ty, 
unsigned freeIntRegs,
   return ABIArgInfo::getDirect(ResType);
 }
 
+bool X86_64ABIInfo::DoesRegcallStructFitInReg(QualType Ty) const {
+  auto RT = Ty->castAs<RecordType>();
+  // For Integer class, Max GPR Size is 64
+  if (getContext().getTypeSize(Ty) > 64)
+return false;
+  // Struct At hand must not have other non Builtin types
+  for (const auto *FD : RT->getDecl()->fields()) {
+QualType MTy = FD->getType();
+if (!MTy->isBuiltinType())
+  return false;
+  }
+  return true;
+}
+
 ABIArgInfo
 X86_64ABIInfo::classifyRegCallStructTypeImpl(QualType Ty, unsigned &NeededInt,
  unsigned &NeededSSE,
@@ -2837,6 +2854,9 @@ X86_64ABIInfo::classifyRegCallStructTypeImpl(QualType Ty, 
unsigned &NeededInt,
   auto RT = Ty->getAs<RecordType>();
   assert(RT && "classifyRegCallStructType only valid with struct types");
 
+  if (DoesRegcallStructFitInReg(Ty))
+return classifyArgumentType(Ty, UINT_MAX, NeededInt, NeededSSE, true, true);
+
   if (RT->getDecl()->hasFlexibleArrayMember())
 return getIndirectReturnResult(Ty);
 
diff --git a/clang/test/CodeGen/regcall3.c b/clang/test/CodeGen/regcall3.c
new file mode 100644
index 0..1c83407220861
--- /dev/null
+++ b/clang/test/CodeGen/regcall3.c
@@ -0,0 +1,53 @@
+// RUN: %clang_cc1 -S %s -o - -ffreestanding -triple=x86_64-unknown-linux-gnu | FileCheck %s --check-prefixes=LINUX64
+
+#include 
+struct struct1 { int x; int y; };
+void __regcall v6(int a, float b, struct struct1 c) {}
+
+void v6_caller(){
+struct struct1 c0;
+c0.x = 0xa0a0; c0.y = 0xb0b0;
+int x= 0xf0f0, y = 0x0f0f;
+v6(x,y,c0);
+}
+
+// LINUX64-LABEL: __regcall3__v6
+// LINUX64: movq   %rcx, -8(%rsp)
+// LINUX64: movl   %eax, -12(%rsp)
+// LINUX64: movss  %xmm0, -16(%rsp)
+
+// LINUX64-LABEL: v6_caller
+// LINUX64: movl   $41120, 16(%rsp)# imm = 0xA0A0
+// LINUX64: movl   $45232, 20(%rsp)# imm = 0xB0B0
+// LINUX64: movl   $61680, 12(%rsp)# imm = 0xF0F0
+// LINUX64: movl   $3855, 8(%rsp)  # imm = 0xF0F
+// LINUX64: movl   12(%rsp), %eax
+// LINUX64: cvtsi2ssl  8(%rsp), %xmm0
+// LINUX64: movq   16(%rsp), %rcx
+// LINUX64: callq  .L__regcall3__v6$local
+
+
+struct struct2 { int x; float y; };
+void __regcall v31(int a, float b, struct struct2 c) {}
+
+void v31_caller(){
+struct struct2 c0;
+c0.x = 0xa0a0; c0.y = 0xb0b0;
+int x= 0xf0f0, y = 0x0f0f;
+v31(x,y,c0);
+}
+
+// LINUX64: __regcall3__v31:# @__regcall3__v31
+// LINUX64:movq%rcx, -8(%rsp)
+// LINUX64:movl%eax, -12(%rsp)
+// LINUX64:movss   %xmm0, -16(%rsp)
+// LINUX64: v31_caller: # @v31_caller
+// LINUX64:movl$41120, 16(%rsp)# imm = 0xA0A0
+// LINUX64:movss   .LCPI3_0(%rip), %xmm0   # xmm0 = [4.5232E+4,0.0E+0,0.0E+0,0.0E+0]
+// LINUX64:movss   %xmm0, 20(%rsp)
+// LINUX64:movl$61680, 12(%rsp)# imm = 0xF0F0
+// LINUX64:movl$3855, 8(%rsp)  # imm = 0xF0F
+// LINUX64:movl12(%rsp), %eax
+// LINUX64:cvtsi2ssl   8(%rsp), %xmm0
+// LINUX64:movq16(%rsp), %rcx
+// LINUX64:callq   .L__regcall3__v31$local

>From 8bdd245edd8dca9477d6541401737f2aeaf6e820 Mon Sep 17 00:00:00 2001
From: mahesh-attarde 
Date: Tue, 18 Jun 2024 03:33:02 -0700
Subject: [PATCH 2/5] selectively call security cookie check

---
 llvm/lib/Target/X86/CMakeLists.txt|   1 +
 llvm/lib/Target/X

[clang] [flang] [llvm] Re-land: "[AArch64] Add ability to list extensions enabled for a target" (#95805) (PR #96795)

2024-06-27 Thread Lucas Duarte Prates via cfe-commits


@@ -343,7 +350,9 @@ bool isX18ReservedByDefault(const Triple &TT);
 // themselves, they are sequential (0, 1, 2, 3, ...).
 uint64_t getCpuSupportsMask(ArrayRef FeatureStrs);
 
-void PrintSupportedExtensions(StringMap DescMap);
+void PrintSupportedExtensions();
+
+void printEnabledExtensions(std::set EnabledFeatureNames);

pratlucas wrote:

Thanks for catching this! Not sure how I missed it 🤦 

https://github.com/llvm/llvm-project/pull/96795


[clang] [compiler-rt] [XRay] Add support for instrumentation of DSOs on x86_64 (PR #90959)

2024-06-27 Thread Sebastian Kreutzer via cfe-commits

sebastiankreutzer wrote:

@androm3da @MaskRay 
I'm tagging you because I'm having trouble getting feedback on this PR, and you 
seem to be the most recent contributors to XRay.
Would one of you be willing to review it? 
Any other pointers on whom to contact would also be much appreciated. 

https://github.com/llvm/llvm-project/pull/90959


[clang] [clang] Emitting a warning if optimizations are enabled with sanitizers (PR #95934)

2024-06-27 Thread Sandeep Kosuri via cfe-commits


@@ -1038,3 +1038,10 @@
 // RUN: not %clang --target=aarch64-none-elf -fsanitize=dataflow %s -### 2>&1 | FileCheck %s -check-prefix=UNSUPPORTED-BAREMETAL
 // RUN: not %clang --target=arm-arm-none-eabi -fsanitize=shadow-call-stack %s -### 2>&1 | FileCheck %s -check-prefix=UNSUPPORTED-BAREMETAL
 // UNSUPPORTED-BAREMETAL: unsupported option '-fsanitize={{.*}}' for target
+
+// RUN: %clang -O0 -O1 -fsanitize=address %s -### 2>&1 | FileCheck %s -check-prefix=CHECK-SAN-OPT-WARN
+// RUN: %clang -Ofast -fsanitize=address %s -### 2>&1 | FileCheck %s -check-prefix=CHECK-SAN-OPT-WARN
+// RUN: %clang -O3 -fsanitize=address %s -### 2>&1 | FileCheck %s -check-prefix=CHECK-SAN-OPT-WARN
+// RUN: %clang -O2 -fsanitize=thread %s -### 2>&1 | FileCheck %s -check-prefix=CHECK-SAN-OPT-WARN
+// RUN: %clang -O1 -fsanitize=thread %s -### 2>&1 | FileCheck %s -check-prefix=CHECK-SAN-OPT-WARN
+// CHECK-SAN-OPT-WARN: warning: enabling optimizations with sanitizers may potentially reduce effectiveness

sandeepkosuri wrote:

```suggestion
// CHECK-SAN-OPT-WARN: warning: enabling optimizations may reduce the effectiveness of sanitizers
```

https://github.com/llvm/llvm-project/pull/95934


[clang] [clang] Emitting a warning if optimizations are enabled with sanitizers (PR #95934)

2024-06-27 Thread Sandeep Kosuri via cfe-commits


@@ -477,6 +477,8 @@ def warn_drv_disabling_vptr_no_rtti_default : Warning<
 def warn_drv_object_size_disabled_O0 : Warning<
   "the object size sanitizer has no effect at -O0, but is explicitly enabled: %0">,
   InGroup, DefaultWarnNoWerror;
+def warn_sanitizer_with_optimization : Warning<
+  "enabling optimizations with sanitizers may potentially reduce effectiveness">;

sandeepkosuri wrote:

```suggestion
  "enabling optimizations may reduce the effectiveness of sanitizers">;
```

https://github.com/llvm/llvm-project/pull/95934


[clang] [clang][Interp] Merge ByteCodeExprGen and ByteCodeStmtGen (PR #83683)

2024-06-27 Thread Timm Baeder via cfe-commits

https://github.com/tbaederr updated 
https://github.com/llvm/llvm-project/pull/83683

>From 74550f244eed465d4f0db1787eecb73a09d5881a Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Timm=20B=C3=A4der?= 
Date: Sat, 2 Mar 2024 17:00:26 +0100
Subject: [PATCH] [clang][Interp] Merge ByteCode{Stmt,Expr}Gen

---
 clang/lib/AST/CMakeLists.txt  |3 +-
 clang/lib/AST/Interp/ByteCodeStmtGen.cpp  |  734 
 clang/lib/AST/Interp/ByteCodeStmtGen.h|   91 --
 .../{ByteCodeExprGen.cpp => Compiler.cpp} | 1002 ++---
 .../Interp/{ByteCodeExprGen.h => Compiler.h}  |   93 +-
 clang/lib/AST/Interp/Context.cpp  |   13 +-
 clang/lib/AST/Interp/EvalEmitter.h|1 +
 clang/lib/AST/Interp/Program.cpp  |1 -
 8 files changed, 911 insertions(+), 1027 deletions(-)
 delete mode 100644 clang/lib/AST/Interp/ByteCodeStmtGen.cpp
 delete mode 100644 clang/lib/AST/Interp/ByteCodeStmtGen.h
 rename clang/lib/AST/Interp/{ByteCodeExprGen.cpp => Compiler.cpp} (81%)
 rename clang/lib/AST/Interp/{ByteCodeExprGen.h => Compiler.h} (87%)

diff --git a/clang/lib/AST/CMakeLists.txt b/clang/lib/AST/CMakeLists.txt
index 0328666d59b1f..ceaad8d3c5a86 100644
--- a/clang/lib/AST/CMakeLists.txt
+++ b/clang/lib/AST/CMakeLists.txt
@@ -65,8 +65,7 @@ add_clang_library(clangAST
   FormatString.cpp
   InheritViz.cpp
   Interp/ByteCodeEmitter.cpp
-  Interp/ByteCodeExprGen.cpp
-  Interp/ByteCodeStmtGen.cpp
+  Interp/Compiler.cpp
   Interp/Context.cpp
   Interp/Descriptor.cpp
   Interp/Disasm.cpp
diff --git a/clang/lib/AST/Interp/ByteCodeStmtGen.cpp 
b/clang/lib/AST/Interp/ByteCodeStmtGen.cpp
deleted file mode 100644
index 0618ec1aa8f58..0
--- a/clang/lib/AST/Interp/ByteCodeStmtGen.cpp
+++ /dev/null
@@ -1,734 +0,0 @@
-//===--- ByteCodeStmtGen.cpp - Code generator for expressions ---*- C++ 
-*-===//
-//
-// Part of the LLVM Project, under the Apache License v2.0 with LLVM 
Exceptions.
-// See https://llvm.org/LICENSE.txt for license information.
-// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
-//
-//===--===//
-
-#include "ByteCodeStmtGen.h"
-#include "ByteCodeEmitter.h"
-#include "Context.h"
-#include "Function.h"
-#include "PrimType.h"
-
-using namespace clang;
-using namespace clang::interp;
-
-namespace clang {
-namespace interp {
-
-/// Scope managing label targets.
-template  class LabelScope {
-public:
-  virtual ~LabelScope() {  }
-
-protected:
-  LabelScope(ByteCodeStmtGen *Ctx) : Ctx(Ctx) {}
-  /// ByteCodeStmtGen instance.
-  ByteCodeStmtGen *Ctx;
-};
-
-/// Sets the context for break/continue statements.
-template  class LoopScope final : public LabelScope {
-public:
-  using LabelTy = typename ByteCodeStmtGen::LabelTy;
-  using OptLabelTy = typename ByteCodeStmtGen::OptLabelTy;
-
-  LoopScope(ByteCodeStmtGen *Ctx, LabelTy BreakLabel,
-LabelTy ContinueLabel)
-  : LabelScope(Ctx), OldBreakLabel(Ctx->BreakLabel),
-OldContinueLabel(Ctx->ContinueLabel) {
-this->Ctx->BreakLabel = BreakLabel;
-this->Ctx->ContinueLabel = ContinueLabel;
-  }
-
-  ~LoopScope() {
-this->Ctx->BreakLabel = OldBreakLabel;
-this->Ctx->ContinueLabel = OldContinueLabel;
-  }
-
-private:
-  OptLabelTy OldBreakLabel;
-  OptLabelTy OldContinueLabel;
-};
-
-// Sets the context for a switch scope, mapping labels.
-template  class SwitchScope final : public LabelScope {
-public:
-  using LabelTy = typename ByteCodeStmtGen::LabelTy;
-  using OptLabelTy = typename ByteCodeStmtGen::OptLabelTy;
-  using CaseMap = typename ByteCodeStmtGen::CaseMap;
-
-  SwitchScope(ByteCodeStmtGen *Ctx, CaseMap &&CaseLabels,
-  LabelTy BreakLabel, OptLabelTy DefaultLabel)
-  : LabelScope(Ctx), OldBreakLabel(Ctx->BreakLabel),
-OldDefaultLabel(this->Ctx->DefaultLabel),
-OldCaseLabels(std::move(this->Ctx->CaseLabels)) {
-this->Ctx->BreakLabel = BreakLabel;
-this->Ctx->DefaultLabel = DefaultLabel;
-this->Ctx->CaseLabels = std::move(CaseLabels);
-  }
-
-  ~SwitchScope() {
-this->Ctx->BreakLabel = OldBreakLabel;
-this->Ctx->DefaultLabel = OldDefaultLabel;
-this->Ctx->CaseLabels = std::move(OldCaseLabels);
-  }
-
-private:
-  OptLabelTy OldBreakLabel;
-  OptLabelTy OldDefaultLabel;
-  CaseMap OldCaseLabels;
-};
-
-} // namespace interp
-} // namespace clang
-
-template 
-bool ByteCodeStmtGen::emitLambdaStaticInvokerBody(
-const CXXMethodDecl *MD) {
-  assert(MD->isLambdaStaticInvoker());
-  assert(MD->hasBody());
-  assert(cast(MD->getBody())->body_empty());
-
-  const CXXRecordDecl *ClosureClass = MD->getParent();
-  const CXXMethodDecl *LambdaCallOp = ClosureClass->getLambdaCallOperator();
-  assert(ClosureClass->captures_begin() == ClosureClass->captures_end());
-  const Function *Func = this->getFunction(LambdaCallOp);
-  if (!Func)
-return false;
-  assert(Func->hasThisPointer());
-  assert(Func->getNumParams() == (MD->getNumParams() + 1 + F

[clang] clang/AMDGPU: Use atomicrmw for ds fmin/fmax builtins (PR #96738)

2024-06-27 Thread Matt Arsenault via cfe-commits

https://github.com/arsenm updated 
https://github.com/llvm/llvm-project/pull/96738

>From 5f614809ac4ffa5e29a01c7e9410d91eadcbe6f2 Mon Sep 17 00:00:00 2001
From: Matt Arsenault 
Date: Tue, 11 Jun 2024 10:40:27 +0200
Subject: [PATCH 1/2] clang/AMDGPU: Use atomicrmw for ds fmin/fmax builtins

---
 clang/lib/CodeGen/CGBuiltin.cpp   | 40 ---
 clang/test/CodeGenCUDA/builtins-amdgcn.cu |  8 +--
 .../test/CodeGenCUDA/builtins-spirv-amdgcn.cu |  8 +--
 .../test/CodeGenOpenCL/builtins-amdgcn-vi.cl  | 66 ++-
 4 files changed, 86 insertions(+), 36 deletions(-)

diff --git a/clang/lib/CodeGen/CGBuiltin.cpp b/clang/lib/CodeGen/CGBuiltin.cpp
index 96dcf6283f9f8..98c2f70664ec7 100644
--- a/clang/lib/CodeGen/CGBuiltin.cpp
+++ b/clang/lib/CodeGen/CGBuiltin.cpp
@@ -18632,28 +18632,6 @@ Value *CodeGenFunction::EmitAMDGPUBuiltinExpr(unsigned 
BuiltinID,
 Function *F = CGM.getIntrinsic(Intrin, { Src0->getType() });
 return Builder.CreateCall(F, { Src0, Builder.getFalse() });
   }
-  case AMDGPU::BI__builtin_amdgcn_ds_fminf:
-  case AMDGPU::BI__builtin_amdgcn_ds_fmaxf: {
-Intrinsic::ID Intrin;
-switch (BuiltinID) {
-case AMDGPU::BI__builtin_amdgcn_ds_fminf:
-  Intrin = Intrinsic::amdgcn_ds_fmin;
-  break;
-case AMDGPU::BI__builtin_amdgcn_ds_fmaxf:
-  Intrin = Intrinsic::amdgcn_ds_fmax;
-  break;
-}
-llvm::Value *Src0 = EmitScalarExpr(E->getArg(0));
-llvm::Value *Src1 = EmitScalarExpr(E->getArg(1));
-llvm::Value *Src2 = EmitScalarExpr(E->getArg(2));
-llvm::Value *Src3 = EmitScalarExpr(E->getArg(3));
-llvm::Value *Src4 = EmitScalarExpr(E->getArg(4));
-llvm::Function *F = CGM.getIntrinsic(Intrin, { Src1->getType() });
-llvm::FunctionType *FTy = F->getFunctionType();
-llvm::Type *PTy = FTy->getParamType(0);
-Src0 = Builder.CreatePointerBitCastOrAddrSpaceCast(Src0, PTy);
-return Builder.CreateCall(F, { Src0, Src1, Src2, Src3, Src4 });
-  }
   case AMDGPU::BI__builtin_amdgcn_global_atomic_fadd_f64:
   case AMDGPU::BI__builtin_amdgcn_global_atomic_fadd_f32:
   case AMDGPU::BI__builtin_amdgcn_global_atomic_fadd_v2f16:
@@ -19087,11 +19065,13 @@ Value 
*CodeGenFunction::EmitAMDGPUBuiltinExpr(unsigned BuiltinID,
   case AMDGPU::BI__builtin_amdgcn_atomic_inc64:
   case AMDGPU::BI__builtin_amdgcn_atomic_dec32:
   case AMDGPU::BI__builtin_amdgcn_atomic_dec64:
-  case AMDGPU::BI__builtin_amdgcn_ds_faddf:
   case AMDGPU::BI__builtin_amdgcn_ds_atomic_fadd_f64:
   case AMDGPU::BI__builtin_amdgcn_ds_atomic_fadd_f32:
   case AMDGPU::BI__builtin_amdgcn_ds_atomic_fadd_v2f16:
-  case AMDGPU::BI__builtin_amdgcn_ds_atomic_fadd_v2bf16: {
+  case AMDGPU::BI__builtin_amdgcn_ds_atomic_fadd_v2bf16:
+  case AMDGPU::BI__builtin_amdgcn_ds_faddf:
+  case AMDGPU::BI__builtin_amdgcn_ds_fminf:
+  case AMDGPU::BI__builtin_amdgcn_ds_fmaxf: {
 llvm::AtomicRMWInst::BinOp BinOp;
 switch (BuiltinID) {
 case AMDGPU::BI__builtin_amdgcn_atomic_inc32:
@@ -19109,6 +19089,12 @@ Value *CodeGenFunction::EmitAMDGPUBuiltinExpr(unsigned 
BuiltinID,
 case AMDGPU::BI__builtin_amdgcn_ds_atomic_fadd_v2bf16:
   BinOp = llvm::AtomicRMWInst::FAdd;
   break;
+case AMDGPU::BI__builtin_amdgcn_ds_fminf:
+  BinOp = llvm::AtomicRMWInst::FMin;
+  break;
+case AMDGPU::BI__builtin_amdgcn_ds_fmaxf:
+  BinOp = llvm::AtomicRMWInst::FMax;
+  break;
 }
 
 Address Ptr = CheckAtomicAlignment(*this, E);
@@ -19118,8 +19104,10 @@ Value *CodeGenFunction::EmitAMDGPUBuiltinExpr(unsigned 
BuiltinID,
 
 bool Volatile;
 
-if (BuiltinID == AMDGPU::BI__builtin_amdgcn_ds_faddf) {
-  // __builtin_amdgcn_ds_faddf has an explicit volatile argument
+if (BuiltinID == AMDGPU::BI__builtin_amdgcn_ds_faddf ||
+BuiltinID == AMDGPU::BI__builtin_amdgcn_ds_fminf ||
+BuiltinID == AMDGPU::BI__builtin_amdgcn_ds_fmaxf) {
+  // __builtin_amdgcn_ds_faddf/fminf/fmaxf has an explicit volatile 
argument
   Volatile =
   cast<ConstantInt>(EmitScalarExpr(E->getArg(4)))->getZExtValue();
 } else {
diff --git a/clang/test/CodeGenCUDA/builtins-amdgcn.cu 
b/clang/test/CodeGenCUDA/builtins-amdgcn.cu
index 132cbd27b08fc..2e88afac813f4 100644
--- a/clang/test/CodeGenCUDA/builtins-amdgcn.cu
+++ b/clang/test/CodeGenCUDA/builtins-amdgcn.cu
@@ -98,7 +98,7 @@ __global__
 // CHECK-NEXT:[[X_ASCAST:%.*]] = addrspacecast ptr addrspace(5) [[X]] to 
ptr
 // CHECK-NEXT:store float [[SRC:%.*]], ptr [[SRC_ADDR_ASCAST]], align 4
 // CHECK-NEXT:[[TMP0:%.*]] = load float, ptr [[SRC_ADDR_ASCAST]], align 4
-// CHECK-NEXT:[[TMP1:%.*]] = call contract float 
@llvm.amdgcn.ds.fmax.f32(ptr addrspace(3) @_ZZ12test_ds_fmaxfE6shared, float 
[[TMP0]], i32 0, i32 0, i1 false)
+// CHECK-NEXT:[[TMP1:%.*]] = atomicrmw fmax ptr addrspace(3) 
@_ZZ12test_ds_fmaxfE6shared, float [[TMP0]] monotonic, align 4
 // CHECK-NEXT:store volatile float [[TMP1]], ptr [[X_ASCAST]], align 4
 // CHECK-NEXT:ret void
 //
@@ -142,7 +142,7 @@ __global__ voi

[clang] [Clang][AST] Let DeclPrinter print trailing requires expressions for template parameters (PR #96864)

2024-06-27 Thread Younan Zhang via cfe-commits


@@ -1189,6 +1189,16 @@ void DeclPrinter::printTemplateParameters(const TemplateParameterList *Params,
   Out << '>';
   if (!OmitTemplateKW)
 Out << ' ';
+
+  if (const Expr *RequiresClause = Params->getRequiresClause()) {

zyn0217 wrote:

Yeah, you're right, will do that shortly.

https://github.com/llvm/llvm-project/pull/96864


[clang] [flang] [llvm] Re-land: "[AArch64] Add ability to list extensions enabled for a target" (#95805) (PR #96795)

2024-06-27 Thread Lucas Duarte Prates via cfe-commits


@@ -21,7 +21,7 @@
 
 // RUN: %clang --target=aarch64 -march=armv8a+fp16fml -### -c %s 2>&1 | 
FileCheck -check-prefix=GENERICV8A-FP16FML %s

pratlucas wrote:

The check for target validity doesn't run when using `-###` in the command 
line. E.g.:
```
$ ../build/bin/clang -target foo -c test.c -###
clang version 19.0.0git
Target: foo
Thread model: posix
InstalledDir: /Users/lucpra01/Workspace/opensource/build/bin
Build config: +tsan
 (in-process)
 "/Users/lucpra01/Workspace/opensource/build/bin/clang-19" "-cc1" "-triple" 
"foo" "-emit-obj" "-disable-free" "-clear-ast-before-backend" 
"-disable-llvm-verifier" "-discard-value-names" "-main-file-name" "test.c" 
"-mrelocation-model" "static" "-mframe-pointer=all" "-fmath-errno" 
"-ffp-contract=on" "-fno-rounding-math" "-mconstructor-aliases" 
"-debugger-tuning=gdb" 
"-fdebug-compilation-dir=/Users/lucpra01/Workspace/opensource/test" 
"-target-linker-version" "1053.12" 
"-fcoverage-compilation-dir=/Users/lucpra01/Workspace/opensource/test" 
"-resource-dir" "/Users/lucpra01/Workspace/opensource/build/lib/clang/19" 
"-ferror-limit" "19" "-fgnuc-version=4.2.1" "-fskip-odr-check-in-gmf" 
"-fcolor-diagnostics" "-faddrsig" "-o" "test.o" "-x" "c" "test.c"
$ echo $?
0
```

As adding the `// REQUIRES:` directive would reduce our current test coverage, 
I chose to add an `%if aarch64-registered-target` condition only to the 
relevant `// RUN:` lines instead.

https://github.com/llvm/llvm-project/pull/96795


[clang] [flang] [llvm] Re-land: "[AArch64] Add ability to list extensions enabled for a target" (#95805) (PR #96795)

2024-06-27 Thread Lucas Duarte Prates via cfe-commits


@@ -315,37 +315,37 @@
 // RUN: %clang -target aarch64 -mcpu=thunderx2t99 -### -c %s 2>&1 | FileCheck 
-check-prefix=CHECK-MCPU-THUNDERX2T99 %s
 // RUN: %clang -target aarch64 -mcpu=a64fx -### -c %s 2>&1 | FileCheck 
-check-prefix=CHECK-MCPU-A64FX %s
 // RUN: %clang -target aarch64 -mcpu=carmel -### -c %s 2>&1 | FileCheck 
-check-prefix=CHECK-MCPU-CARMEL %s
-// CHECK-MCPU-APPLE-A7: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" 
"-target-feature" "+zcm" "-target-feature" "+zcz" "-target-feature" "+v8a" 
"-target-feature" "+aes"{{.*}} "-target-feature" "+fp-armv8" "-target-feature" 
"+perfmon" "-target-feature" "+sha2" "-target-feature" "+neon"

pratlucas wrote:

Ditto.

https://github.com/llvm/llvm-project/pull/96795


[clang] 8a43dc3 - [clang][Sema] Move the initializer lifetime checking code from SemaInit.cpp to a new place, NFC (#96758)

2024-06-27 Thread via cfe-commits

Author: Haojian Wu
Date: 2024-06-27T10:56:06+02:00
New Revision: 8a43dc3efdd9bfba0bea32061ef2f3397a968eb9

URL: 
https://github.com/llvm/llvm-project/commit/8a43dc3efdd9bfba0bea32061ef2f3397a968eb9
DIFF: 
https://github.com/llvm/llvm-project/commit/8a43dc3efdd9bfba0bea32061ef2f3397a968eb9.diff

LOG: [clang][Sema] Move the initializer lifetime checking code from 
SemaInit.cpp to a new place, NFC (#96758)

This is a refactoring change for better code isolation and reuse, the
first step to extend it for assignments.

Added: 
clang/lib/Sema/CheckExprLifetime.cpp
clang/lib/Sema/CheckExprLifetime.h

Modified: 
clang/lib/Sema/CMakeLists.txt
clang/lib/Sema/SemaInit.cpp

Removed: 




diff  --git a/clang/lib/Sema/CMakeLists.txt b/clang/lib/Sema/CMakeLists.txt
index f152d243d39a5..980a83d4431aa 100644
--- a/clang/lib/Sema/CMakeLists.txt
+++ b/clang/lib/Sema/CMakeLists.txt
@@ -15,6 +15,7 @@ clang_tablegen(OpenCLBuiltins.inc -gen-clang-opencl-builtins
 
 add_clang_library(clangSema
   AnalysisBasedWarnings.cpp
+  CheckExprLifetime.cpp
   CodeCompleteConsumer.cpp
   DeclSpec.cpp
   DelayedDiagnostic.cpp

diff  --git a/clang/lib/Sema/CheckExprLifetime.cpp 
b/clang/lib/Sema/CheckExprLifetime.cpp
new file mode 100644
index 0..54e2f1c22536d
--- /dev/null
+++ b/clang/lib/Sema/CheckExprLifetime.cpp
@@ -0,0 +1,1259 @@
+//===--- CheckExprLifetime.cpp 
===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM 
Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===--===//
+
+#include "CheckExprLifetime.h"
+#include "clang/AST/Expr.h"
+#include "clang/Sema/Sema.h"
+#include "llvm/ADT/PointerIntPair.h"
+
+namespace clang::sema {
+namespace {
+enum LifetimeKind {
+  /// The lifetime of a temporary bound to this entity ends at the end of the
+  /// full-expression, and that's (probably) fine.
+  LK_FullExpression,
+
+  /// The lifetime of a temporary bound to this entity is extended to the
+  /// lifetime of the entity itself.
+  LK_Extended,
+
+  /// The lifetime of a temporary bound to this entity probably ends too soon,
+  /// because the entity is allocated in a new-expression.
+  LK_New,
+
+  /// The lifetime of a temporary bound to this entity ends too soon, because
+  /// the entity is a return object.
+  LK_Return,
+
+  /// The lifetime of a temporary bound to this entity ends too soon, because
+  /// the entity is the result of a statement expression.
+  LK_StmtExprResult,
+
+  /// This is a mem-initializer: if it would extend a temporary (other than via
+  /// a default member initializer), the program is ill-formed.
+  LK_MemInitializer,
+};
+using LifetimeResult =
+llvm::PointerIntPair;
+} // namespace
+
+/// Determine the declaration which an initialized entity ultimately refers to,
+/// for the purpose of lifetime-extending a temporary bound to a reference in
+/// the initialization of \p Entity.
+static LifetimeResult
+getEntityLifetime(const InitializedEntity *Entity,
+  const InitializedEntity *InitField = nullptr) {
+  // C++11 [class.temporary]p5:
+  switch (Entity->getKind()) {
+  case InitializedEntity::EK_Variable:
+//   The temporary [...] persists for the lifetime of the reference
+return {Entity, LK_Extended};
+
+  case InitializedEntity::EK_Member:
+// For subobjects, we look at the complete object.
+if (Entity->getParent())
+  return getEntityLifetime(Entity->getParent(), Entity);
+
+//   except:
+// C++17 [class.base.init]p8:
+//   A temporary expression bound to a reference member in a
+//   mem-initializer is ill-formed.
+// C++17 [class.base.init]p11:
+//   A temporary expression bound to a reference member from a
+//   default member initializer is ill-formed.
+//
+// The context of p11 and its example suggest that it's only the use of a
+// default member initializer from a constructor that makes the program
+// ill-formed, not its mere existence, and that it can even be used by
+// aggregate initialization.
+return {Entity, Entity->isDefaultMemberInitializer() ? LK_Extended
+ : LK_MemInitializer};
+
+  case InitializedEntity::EK_Binding:
+// Per [dcl.decomp]p3, the binding is treated as a variable of reference
+// type.
+return {Entity, LK_Extended};
+
+  case InitializedEntity::EK_Parameter:
+  case InitializedEntity::EK_Parameter_CF_Audited:
+//   -- A temporary bound to a reference parameter in a function call
+//  persists until the completion of the full-expression containing
+//  the call.
+return {nullptr, LK_FullExpression};
+
+  case InitializedEntity::EK_TemplateParameter:
+

[clang] [clang][Sema] Move the initializer lifetime checking code from SemaInit.cpp to a new place, NFC (PR #96758)

2024-06-27 Thread Haojian Wu via cfe-commits

https://github.com/hokein closed https://github.com/llvm/llvm-project/pull/96758


[clang] [clang][analyzer] Improve PointerSubChecker (PR #96501)

2024-06-27 Thread Donát Nagy via cfe-commits

NagyDonat wrote:

> The warning message may be still misleading if the LHS or RHS "arrays" are 
> non-array variables.

I think that the warning message is OK: "Subtraction of two pointers that do 
not point into the same array is undefined behavior." -- this also covers the 
case when one or both of the pointers do not point to arrays. (It doesn't 
mention the corner case that it's also valid to subtract two identical pointers 
that point to a non-array value, but that's completely irrelevant in practice, 
so wouldn't be a helpful suggestion.)

> (or detect if `offsetof` can be used and include it in the message)?

I think that would be a waste of time, because it's very rare that a project 
manually reimplements `offsetof` -- I think it only appears in `vim` because 
it's a very old codebase. (Also, developers who play with this kind of low-level 
trickery should be familiar with the standard and understand what the problem 
is.)
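
To make the distinction above concrete, here is a minimal sketch (not taken 
from the PR or from `vim`). The `Record` type and both helper functions are 
hypothetical names introduced only for illustration. It contrasts a pointer 
subtraction that stays inside one array with the standard `offsetof` macro, 
which is the portable way to get a member offset without subtracting 
unrelated addresses.

```cpp
#include <cstddef>

// Hypothetical record type, used only for illustration.
struct Record {
  char tag;
  int value;
};

// Well-defined: both pointers point into the same array object, so the
// result is the distance between them counted in elements.
constexpr std::ptrdiff_t elementsBetween() {
  int data[8] = {};
  const int *first = &data[2];
  const int *last = &data[6];
  return last - first;
}

// The standard macro yields the member offset in bytes without any
// hand-written pointer subtraction.
constexpr std::size_t valueOffset() { return offsetof(Record, value); }

static_assert(elementsBetween() == 4, "distance is counted in elements");
static_assert(valueOffset() >= sizeof(char), "value is placed after tag");
```

The exact value of `valueOffset()` depends on the target's padding rules, so 
the sketch only asserts a lower bound.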


https://github.com/llvm/llvm-project/pull/96501


[clang] [llvm] [X86][CodeGen] security check cookie execute only when needed (PR #95904)

2024-06-27 Thread via cfe-commits

mahesh-attarde wrote:

> @mahesh-attarde please can you rebase against trunk - I've cleaned up the 
> test checks to help with the codegen diff

done.

https://github.com/llvm/llvm-project/pull/95904


[clang] [llvm] [X86][CodeGen] security check cookie execute only when needed (PR #95904)

2024-06-27 Thread via cfe-commits

mahesh-attarde wrote:

ping @MaskRay @RKSimon 

https://github.com/llvm/llvm-project/pull/95904


[clang] [Clang][AST] Let DeclPrinter print trailing requires expressions for template parameters (PR #96864)

2024-06-27 Thread Younan Zhang via cfe-commits

https://github.com/zyn0217 updated 
https://github.com/llvm/llvm-project/pull/96864

>From a5c33bd413d8150d1688240c6b5253b1760cafe1 Mon Sep 17 00:00:00 2001
From: Younan Zhang 
Date: Thu, 27 Jun 2024 15:59:48 +0800
Subject: [PATCH 1/2] [Clang][AST] Let DeclPrinter print trailing requires
 expressions for template parameters

As discussed in 
https://github.com/llvm/llvm-project/pull/96084#discussion_r1654629993,
it would be nice to present these trailing constraints on template
parameters when printing CTAD decls through a DeclPrinter.
---
 clang/docs/ReleaseNotes.rst|  1 +
 clang/lib/AST/DeclPrinter.cpp  | 10 ++
 clang/test/PCH/cxx2a-requires-expr.cpp | 17 +
 3 files changed, 28 insertions(+)

diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index 69aea6c21ad39..03b1daa6597cd 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -99,6 +99,7 @@ AST Dumping Potentially Breaking Changes
 
 
 - The text ast-dumper has improved printing of TemplateArguments.
+- The text decl-dumper prints template parameters' trailing requires 
expressions now.
 
 Clang Frontend Potentially Breaking Changes
 ---
diff --git a/clang/lib/AST/DeclPrinter.cpp b/clang/lib/AST/DeclPrinter.cpp
index 0cf4e64f83b8d..0a081e7e07ca8 100644
--- a/clang/lib/AST/DeclPrinter.cpp
+++ b/clang/lib/AST/DeclPrinter.cpp
@@ -1189,6 +1189,16 @@ void DeclPrinter::printTemplateParameters(const 
TemplateParameterList *Params,
   Out << '>';
   if (!OmitTemplateKW)
 Out << ' ';
+
+  if (const Expr *RequiresClause = Params->getRequiresClause()) {
+if (OmitTemplateKW)
+  Out << ' ';
+Out << "requires ";
+RequiresClause->printPretty(Out, nullptr, Policy, Indentation, "\n",
+&Context);
+if (!OmitTemplateKW)
+  Out << ' ';
+  }
 }
 
 void DeclPrinter::printTemplateArguments(ArrayRef Args,
diff --git a/clang/test/PCH/cxx2a-requires-expr.cpp 
b/clang/test/PCH/cxx2a-requires-expr.cpp
index 7f8f258a0f8f3..936f601685463 100644
--- a/clang/test/PCH/cxx2a-requires-expr.cpp
+++ b/clang/test/PCH/cxx2a-requires-expr.cpp
@@ -22,3 +22,20 @@ bool f() {
 requires C || (C || C);
   };
 }
+
+namespace trailing_requires_expression {
+
+template  requires C && C2
+// CHECK: template  requires C && C2 void g();
+void g();
+
+template  requires C || C2
+// CHECK: template  requires C || C2 constexpr int h = 
sizeof(T);
+constexpr int h = sizeof(T);
+
+template  requires C
+// CHECK:  template  requires C class i {
+// CHECK-NEXT: };
+class i {};
+
+}

>From 432f3fdb6d0817fad15b87a6d166e0ada9748f89 Mon Sep 17 00:00:00 2001
From: Younan Zhang 
Date: Thu, 27 Jun 2024 16:56:19 +0800
Subject: [PATCH 2/2] Address feedback

---
 clang/lib/AST/DeclPrinter.cpp | 11 ---
 1 file changed, 4 insertions(+), 7 deletions(-)

diff --git a/clang/lib/AST/DeclPrinter.cpp b/clang/lib/AST/DeclPrinter.cpp
index 0a081e7e07ca8..26773a69ab9ac 100644
--- a/clang/lib/AST/DeclPrinter.cpp
+++ b/clang/lib/AST/DeclPrinter.cpp
@@ -1187,18 +1187,15 @@ void DeclPrinter::printTemplateParameters(const 
TemplateParameterList *Params,
   }
 
   Out << '>';
-  if (!OmitTemplateKW)
-Out << ' ';
 
   if (const Expr *RequiresClause = Params->getRequiresClause()) {
-if (OmitTemplateKW)
-  Out << ' ';
-Out << "requires ";
+Out << " requires ";
 RequiresClause->printPretty(Out, nullptr, Policy, Indentation, "\n",
 &Context);
-if (!OmitTemplateKW)
-  Out << ' ';
   }
+
+  if (!OmitTemplateKW)
+Out << ' ';
 }
 
 void DeclPrinter::printTemplateArguments(ArrayRef Args,

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [PAC][ELF][AArch64] Encode signed GOT flag in PAuth core info (PR #96159)

2024-06-27 Thread Daniil Kovalev via cfe-commits

kovdan01 wrote:

> I'm not at all familiar with this PAuth stuff, but don't you need a test case 
> for where the new value is set (currently they all seem to be unset, if I'm 
> interpreting things correctly)?

I'm not sure I understood your question correctly. In particular, I'm not 
sure what the phrase "the new value is set" means. Could you please add a bit 
more detail to your question?

If you are talking about the 
llvm/test/tools/llvm-readobj/ELF/AArch64/aarch64-feature-pauth.s and 
llvm/test/CodeGen/AArch64/note-gnu-property-elf-pauthabi.ll tests checking 
version value 0x55, which does not imply signed GOT enabled: we just can't test 
all 2^8=256 combinations of flags, so we test values which look like 
0b10101... But I can add a test for version value 0xAA, which would set the 
opposite flags compared to 0x55.
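
The coverage argument above (0x55 plus 0xAA exercising every flag both ways) can be checked mechanically. The following is only a sketch: `coversEachBitBothWays` is a hypothetical helper, not part of the patch, and the 8-bit width is taken from the 2^8 figure mentioned above.

```cpp
#include <cstdint>

// Hypothetical helper (not from the patch): returns true if each of the low
// NumBits bits is set in at least one pattern and clear in at least one.
bool coversEachBitBothWays(const uint64_t *Patterns, int Count, int NumBits) {
  uint64_t SeenSet = 0;   // bits observed as 1 in some pattern
  uint64_t SeenClear = 0; // bits observed as 0 in some pattern
  for (int I = 0; I < Count; ++I) {
    SeenSet |= Patterns[I];
    SeenClear |= ~Patterns[I];
  }
  // Mask selecting the low NumBits bits; avoid shifting by the full width.
  const uint64_t Mask =
      NumBits >= 64 ? ~uint64_t{0} : ((uint64_t{1} << NumBits) - 1);
  return (SeenSet & Mask) == Mask && (SeenClear & Mask) == Mask;
}
```

With the two values together the check succeeds for 8 bits, while 0x55 alone leaves the odd-numbered bits never set, which is the gap the 0xAA test would close.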

https://github.com/llvm/llvm-project/pull/96159


[clang] [llvm] [RISCV] Add support for getHostCPUFeatures using hwprobe (PR #94352)

2024-06-27 Thread Yingwei Zheng via cfe-commits


@@ -2002,6 +2003,76 @@ bool sys::getHostCPUFeatures(StringMap &Features) {
 
   return true;
 }
+#elif defined(__linux__) && defined(__riscv)
+// struct riscv_hwprobe
+struct RISCVHwProbe {
+  int64_t Key;
+  uint64_t Value;
+};
+bool sys::getHostCPUFeatures(StringMap &Features) {
+  RISCVHwProbe Query[]{{/*RISCV_HWPROBE_KEY_BASE_BEHAVIOR=*/3, 0},
+   {/*RISCV_HWPROBE_KEY_IMA_EXT_0=*/4, 0}};
+  int Ret = syscall(/*__NR_riscv_hwprobe=*/258, /*pairs=*/Query,

dtcxzyw wrote:

Currently `sys::getHostCPUFeatures` has three callers:
+ clang -> `riscv::getRISCVTargetFeatures`
+ llvm-tools -> `codegen::getFeaturesStr`
+ JIT users -> `JITTargetMachineBuilder::detectHost`

I don't think there are any opportunities to reuse the result.
BTW, https://github.com/llvm/llvm-project/pull/85790 may benefit from the vDSO 
symbol, but it implements caching itself.

I didn't use the glibc call `__riscv_hwprobe` since `sys/hwprobe.h` was 
unavailable on my RV board :(
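
For readers without `sys/hwprobe.h`, the raw-syscall approach in the hunk above can be sketched roughly as follows. The syscall number 258 and the key/value pair layout come from the quoted diff; the remaining arguments (`pair_count`, `cpu_count`, `cpus`, `flags`) are my assumption based on the Linux `riscv_hwprobe` interface, and `HwProbePair` / `probeHost` are hypothetical names. On any other platform the function simply reports failure.

```cpp
#include <cstddef>
#include <cstdint>
#if defined(__linux__) && defined(__riscv)
#include <sys/syscall.h>
#include <unistd.h>
#endif

// Mirrors the key/value pair layout used in the quoted hunk.
struct HwProbePair {
  int64_t Key;
  uint64_t Value;
};

// Hypothetical wrapper: asks the kernel to fill in Value for each Key.
// Returns false when the syscall fails or the platform is not Linux/RISC-V.
bool probeHost(HwProbePair *Pairs, size_t Count) {
#if defined(__linux__) && defined(__riscv)
  // Assumed argument order: pairs, pair_count, cpu_count, cpus, flags.
  // cpu_count = 0 with cpus = nullptr is assumed to mean "all online CPUs".
  long Ret = syscall(/*__NR_riscv_hwprobe=*/258, Pairs, Count,
                     static_cast<size_t>(0), static_cast<void *>(nullptr),
                     0u);
  return Ret == 0;
#else
  (void)Pairs;
  (void)Count;
  return false;
#endif
}
```

A caller would pass an array such as `{{3, 0}, {4, 0}}` (the two keys used in the patch) and read back the `Value` fields only when the function returns true.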


https://github.com/llvm/llvm-project/pull/94352


[clang] [llvm] [RISCV] Add support for getHostCPUFeatures using hwprobe (PR #94352)

2024-06-27 Thread Yingwei Zheng via cfe-commits


@@ -83,8 +83,14 @@ void riscv::getRISCVTargetFeatures(const Driver &D, const 
llvm::Triple &Triple,
   // and other features (ex. mirco architecture feature) from mcpu
   if (Arg *A = Args.getLastArg(options::OPT_mcpu_EQ)) {
 StringRef CPU = A->getValue();
-if (CPU == "native")
+if (CPU == "native") {
   CPU = llvm::sys::getHostCPUName();
+  llvm::StringMap HostFeatures;
+  if (llvm::sys::getHostCPUFeatures(HostFeatures))
+for (auto &F : HostFeatures)
+  Features.push_back(
+  Args.MakeArgString((F.second ? "+" : "-") + F.first()));
+}

dtcxzyw wrote:

@wangpc-pp @topperc
Are there any equivalents of the helper `printMArch`?
https://github.com/llvm/llvm-project/blob/ba60d8a11af2cdd7e80e2fd968cdf52adcabf5a1/llvm/utils/TableGen/RISCVTargetDefEmitter.cpp#L90-L123


https://github.com/llvm/llvm-project/pull/94352


[clang] 2033b1c - [CodeGen] Don't write coverage to source directory in test

2024-06-27 Thread Benjamin Kramer via cfe-commits

Author: Benjamin Kramer
Date: 2024-06-27T11:21:42+02:00
New Revision: 2033b1cf16f040e1369d8efba8439dcd3e36ed31

URL: 
https://github.com/llvm/llvm-project/commit/2033b1cf16f040e1369d8efba8439dcd3e36ed31
DIFF: 
https://github.com/llvm/llvm-project/commit/2033b1cf16f040e1369d8efba8439dcd3e36ed31.diff

LOG: [CodeGen] Don't write coverage to source directory in test

Added: 


Modified: 
clang/test/CodeGen/coverage-target-attr.c

Removed: 




diff  --git a/clang/test/CodeGen/coverage-target-attr.c 
b/clang/test/CodeGen/coverage-target-attr.c
index 8c8e6ee1c3b69..d46299f5bee22 100644
--- a/clang/test/CodeGen/coverage-target-attr.c
+++ b/clang/test/CodeGen/coverage-target-attr.c
@@ -1,4 +1,4 @@
-// RUN: %clang_cc1 -emit-llvm -coverage-notes-file=test.gcno 
-coverage-data-file=test.gcda -triple aarch64-linux-android30 -target-cpu 
generic -target-feature +tagged-globals -fsanitize=hwaddress %s -o %t
+// RUN: %clang_cc1 -emit-llvm -coverage-notes-file=/dev/null 
-coverage-data-file=/dev/null -triple aarch64-linux-android30 -target-cpu 
generic -target-feature +tagged-globals -fsanitize=hwaddress %s -o %t
 // RUN: FileCheck %s < %t
 
 // CHECK: define internal void @__llvm_gcov_writeout() unnamed_addr 
[[ATTR:#[0-9]+]]





[clang] [llvm] [PAC][ELF][AArch64] Encode signed GOT flag in PAuth core info (PR #96159)

2024-06-27 Thread James Henderson via cfe-commits

jh7370 wrote:

> > I'm not at all familiar with this PAuth stuff, but don't you need a test 
> > case for where the new value is set (currently they all seem to be unset, 
> > if I'm interpreting things correctly)?
> 
> @jh7370 I'm not sure if I understood your question correctly - particularly, 
> I'm not sure what does the phrase "the new value is set" mean. Could you 
> please add a bit more details in your question?
> 
> If you are talking about 
> llvm/test/tools/llvm-readobj/ELF/AArch64/aarch64-feature-pauth.s and 
> llvm/test/CodeGen/AArch64/note-gnu-property-elf-pauthabi.ll tests checking 
> version value 0x55 which does not imply signed GOT enabled, we just can't 
> test 2^8=256 combinations of flags, so we test values which look like 
> 0b10101... But I can add a test for version value 0xAA which would set 
> opposite flags compared to 0x55.

I was referring to this line from the description:

> llvm-readobj: print `PointerAuthELFGOT` or `!PointerAuthELFGOT` in version 
> description of llvm_linux platform depending on whether the flag is set.

In my opinion, if you don't test the first of those two cases, you might as 
well not have implemented behaviour for it. I'd always test "all flags set" and 
"no flags set" cases (or some variant that effectively tests this, e.g. 0xff 
and ~0xff). Of course, if it's not practical, that's fine.

https://github.com/llvm/llvm-project/pull/96159


[clang] [Clang][AST] Let DeclPrinter print trailing requires expressions for template parameters (PR #96864)

2024-06-27 Thread Haojian Wu via cfe-commits

https://github.com/hokein approved this pull request.

thanks, looks good.

https://github.com/llvm/llvm-project/pull/96864


[clang] [llvm] [PAC][ELF][AArch64] Encode signed GOT flag in PAuth core info (PR #96159)

2024-06-27 Thread Daniil Kovalev via cfe-commits

kovdan01 wrote:

> I was referring to this line from the description:
> 
> > llvm-readobj: print `PointerAuthELFGOT` or `!PointerAuthELFGOT` in version 
> > description of llvm_linux platform depending on whether the flag is set.
> 
> In my opinion, if you don't test the first of those two cases, you might as 
> well not have implemented behaviour for it. I'd always test "all flags set" 
> and "no flags set" cases (or some variant that effectively tests this, e.g. 
> 0xff and ~0xff). Of course, if it's not practical, that's fine.
> 
> To be clear, I'm not suggesting testing every possible combination of flags, 
> just each flag individually set/not set.

@jh7370 Thanks for the explanation! That's a reasonable point; I'll add the 
corresponding test cases, thanks.

https://github.com/llvm/llvm-project/pull/96159


[clang] [flang] [Flang-new][OpenMP] Add offload related flags for AMDGPU (PR #96742)

2024-06-27 Thread Dominik Adamski via cfe-commits


@@ -333,6 +333,9 @@ void Flang::AddAMDGPUTargetArgs(const ArgList &Args,
 StringRef Val = A->getValue();
 CmdArgs.push_back(Args.MakeArgString("-mcode-object-version=" + Val));
   }
+
+  const ToolChain &TC = getToolChain();
+  TC.addClangTargetOptions(Args, CmdArgs, Action::OffloadKind::OFK_OpenMP);

DominikAdamski wrote:

Hi,
thanks for the feedback. I would like to share my observations with you:

1. Clang does not verify how we use these flags, and it accepts them for 
non-GPU targets.
2. These flags can be reused by other vendors. For example, clang adds the 
`mlink-builtin-bitcode` option for OpenMP Nvidia GPU [as well](https://github.com/llvm/llvm-project/blob/ba60d8a11af2cdd7e80e2fd968cdf52adcabf5a1/clang/test/Driver/openmp-offload-gpu.c#L92).

https://github.com/llvm/llvm-project/pull/96742


[clang] [flang] [Flang-new][OpenMP] Add offload related flags for AMDGPU (PR #96742)

2024-06-27 Thread Tom Eccles via cfe-commits


@@ -333,6 +333,9 @@ void Flang::AddAMDGPUTargetArgs(const ArgList &Args,
 StringRef Val = A->getValue();
 CmdArgs.push_back(Args.MakeArgString("-mcode-object-version=" + Val));
   }
+
+  const ToolChain &TC = getToolChain();
+  TC.addClangTargetOptions(Args, CmdArgs, Action::OffloadKind::OFK_OpenMP);

tblah wrote:

Does that mean that this change would also lead to adding these flags when 
building for Nvidia GPU with flang?

https://github.com/llvm/llvm-project/pull/96742


[clang] [llvm] [OpenMP] OpenMP 5.1 "assume" directive parsing support (PR #92731)

2024-06-27 Thread Julian Brown via cfe-commits


@@ -0,0 +1,31 @@
+// RUN: %clang_cc1 -fopenmp -x c++ -std=c++11 -ast-print %s | FileCheck %s
+// expected-no-diagnostics
+
+extern int bar(int);
+
+int foo(int arg)
+{
+  #pragma omp assume no_openmp_routines
+  {
+auto fn = [](int x) { return bar(x); };
+// CHECK: auto fn = [](int x) {
+return fn(5);
+  }
+}
+
+class C {
+public:
+  int foo(int a);
+};
+
+// We're really just checking that this parses.  All the assumptions are thrown
+// away immediately for now.
+int C::foo(int a)
+{
+  #pragma omp assume holds(sizeof(T) == 8) absent(parallel)
+  {
+auto fn = [](int x) { return bar(x); };
+// CHECK: auto fn = [](int x) {
+return fn(5);
+  }
+}

jtb20 wrote:

Understood! There is indeed a vscode option to add the missing newline (it 
appears to be turned off by default for some bizarre reason). I'll push a new 
version with them in.

https://github.com/llvm/llvm-project/pull/92731


[clang] [compiler-rt] [XRay] Add support for instrumentation of DSOs on x86_64 (PR #90959)

2024-06-27 Thread Brian Cain via cfe-commits

androm3da wrote:

> @androm3da @MaskRay I'm tagging you because I'm having trouble to get 
> feedback to this PR, and you seem to be the most recent contributors to XRay. 
> Would one of you be willing to review it? Any other pointers on who to get in 
> touch with are also much appreciated.

I'm happy to take a look - but I'm traveling this week and won't be able to 
until this weekend. 

https://github.com/llvm/llvm-project/pull/90959


[clang] [clang][analyzer] Improve documentation of checker 'cplusplus.Move' (NFC) (PR #96295)

2024-06-27 Thread Balázs Kéri via cfe-commits

https://github.com/balazske updated 
https://github.com/llvm/llvm-project/pull/96295

From 0c57ad1ca36a841dff700eb98f878475e0243b88 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Bal=C3=A1zs=20K=C3=A9ri?= 
Date: Fri, 21 Jun 2024 12:13:02 +0200
Subject: [PATCH 1/3] [clang][analyzer] Improve documentation of checker
 'cplusplus.Move' (NFC)

---
 clang/docs/analyzer/checkers.rst  | 39 +--
 .../clang/StaticAnalyzer/Checkers/Checkers.td | 21 +++---
 2 files changed, 40 insertions(+), 20 deletions(-)

diff --git a/clang/docs/analyzer/checkers.rst b/clang/docs/analyzer/checkers.rst
index b8d5f372bdf61..445f434e1e6ce 100644
--- a/clang/docs/analyzer/checkers.rst
+++ b/clang/docs/analyzer/checkers.rst
@@ -420,21 +420,52 @@ around, such as ``std::string_view``.
 
 cplusplus.Move (C++)
 
-Method calls on a moved-from object and copying a moved-from object will be 
reported.
-
+Find use-after-move bugs in C++. This includes method calls on moved-from
+objects, assignment of a moved-from object, and repeated move of a moved-from
+object.
 
 .. code-block:: cpp
 
-  struct A {
+ struct A {
void foo() {}
  };
 
- void f() {
+ void f1() {
A a;
A b = std::move(a); // note: 'a' became 'moved-from' here
a.foo();// warn: method call on a 'moved-from' object 'a'
  }
 
+ void f2() {
+   A a;
+   A b = std::move(a);
+   A c(std::move(a)); // warn: move of an already moved-from object
+ }
+
+ void f3() {
+   A a;
+   A b = std::move(a);
+   b = a; // warn: copy of moved-from object
+ }
+
+The checker option ``WarnOn`` controls on what objects the use-after-move is
+checked. The most strict value is ``KnownsOnly``, in this mode only objects are
+checked whose type is known to be move-unsafe. These include most STL objects
+(but excluding move-safe ones) and smart pointers. With option value
+``KnownsAndLocals`` local variables (of any type) are additionally checked. The
+idea behind this is that local variables are usually not tempting to be re-used
+so an use after move is more likely a bug than with member variables. With
+option value ``All`` any use-after move condition is checked on all kinds of
+variables, excluding global variables and known move-safe cases. Default value
+is ``KnownsAndLocals``.
+
+Call of methods named ``empty()`` or ``isEmpty()`` are allowed on moved-from
+objects because these methods are considered as move-safe. Functions called
+``reset()``, ``destroy()``, ``clear()``, ``assign``, ``resize``,  ``shrink`` 
are
+treated as state-reset functions and are allowed on moved-from objects, these
+make the object valid again. This applies to any type of object (not only STL
+ones).
+
 .. _cplusplus-NewDelete:
 
 cplusplus.NewDelete (C++)
diff --git a/clang/include/clang/StaticAnalyzer/Checkers/Checkers.td 
b/clang/include/clang/StaticAnalyzer/Checkers/Checkers.td
index 429c334a0b24b..6e224a4e098ad 100644
--- a/clang/include/clang/StaticAnalyzer/Checkers/Checkers.td
+++ b/clang/include/clang/StaticAnalyzer/Checkers/Checkers.td
@@ -686,22 +686,11 @@ def MoveChecker: Checker<"Move">,
   CheckerOptions<[
 CmdLineOption
   ]>,

From 866655581a1e1f0779542737a3f9d427a8d067b6 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Bal=C3=A1zs=20K=C3=A9ri?= 
Date: Fri, 21 Jun 2024 16:35:09 +0200
Subject: [PATCH 2/3] using bullet point list for option values

---
 clang/docs/analyzer/checkers.rst | 26 +++---
 1 file changed, 15 insertions(+), 11 deletions(-)

diff --git a/clang/docs/analyzer/checkers.rst b/clang/docs/analyzer/checkers.rst
index 445f434e1e6ce..42c097d973d53 100644
--- a/clang/docs/analyzer/checkers.rst
+++ b/clang/docs/analyzer/checkers.rst
@@ -449,17 +449,21 @@ object.
  }
 
 The checker option ``WarnOn`` controls on what objects the use-after-move is
-checked. The most strict value is ``KnownsOnly``, in this mode only objects are
-checked whose type is known to be move-unsafe. These include most STL objects
-(but excluding move-safe ones) and smart pointers. With option value
-``KnownsAndLocals`` local variables (of any type) are additionally checked. The
-idea behind this is that local variables are usually not tempting to be re-used
-so an use after move is more likely a bug than with member variables. With
-option value ``All`` any use-after move condition is checked on all kinds of
-variables, excluding global variables and known move-safe cases. Default value
-is ``KnownsAndLocals``.
-
-Call of methods named ``empty()`` or ``isEmpty()`` are allowed on moved-from
+checked:
+
+* The most strict value is ``KnownsOnly``, in this mode only objects are
+  checked whose type is known to be move-unsafe. These include most STL objects
+  (but excluding move-safe ones) and smart pointers.
+* With option value ``KnownsAndLocals`` local variables (of any type) are
+  additionally checked. The idea behind this is that local variables are
+  usually not tempting to be re-used so an use after move is more likely a bug
+  than with

[clang] [llvm] [RISCV] Add support for getHostCPUFeatures using hwprobe (PR #94352)

2024-06-27 Thread Pengcheng Wang via cfe-commits


@@ -83,8 +83,14 @@ void riscv::getRISCVTargetFeatures(const Driver &D, const 
llvm::Triple &Triple,
   // and other features (ex. mirco architecture feature) from mcpu
   if (Arg *A = Args.getLastArg(options::OPT_mcpu_EQ)) {
 StringRef CPU = A->getValue();
-if (CPU == "native")
+if (CPU == "native") {
   CPU = llvm::sys::getHostCPUName();
+  llvm::StringMap HostFeatures;
+  if (llvm::sys::getHostCPUFeatures(HostFeatures))
+for (auto &F : HostFeatures)
+  Features.push_back(
+  Args.MakeArgString((F.second ? "+" : "-") + F.first()));
+}

wangpc-pp wrote:

> @wangpc-pp @topperc Are there any equivalents of the helper `printMArch`?

No, I think there isn't. You may need to write a helper via 
`RISCVISAInfo::parseFeatures` and `RISCVISAInfo::toString()`.
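
For reference, the `+`/`-` feature strings built in the hunk above can be illustrated without LLVM types. This is only a sketch: `std::map<std::string, bool>` stands in for `llvm::StringMap<bool>`, `toFeatureFlags` is a hypothetical name, and it does not use `RISCVISAInfo` at all; a real helper would feed these strings into it as suggested above.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for the loop in the quoted hunk: turns
// {feature name -> enabled} into "+name" / "-name" strings.
std::vector<std::string>
toFeatureFlags(const std::map<std::string, bool> &HostFeatures) {
  std::vector<std::string> Flags;
  Flags.reserve(HostFeatures.size());
  for (const auto &F : HostFeatures)
    Flags.push_back((F.second ? "+" : "-") + F.first);
  return Flags;
}
```

Unlike `llvm::StringMap`, `std::map` iterates in sorted key order, so the output order here is deterministic.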


https://github.com/llvm/llvm-project/pull/94352


[clang] [flang] [Flang-new][OpenMP] Add offload related flags for AMDGPU (PR #96742)

2024-06-27 Thread Dominik Adamski via cfe-commits

DominikAdamski wrote:

> > fcuda-is-device flag is not used by Flang currently. In the future it will 
> > be needed for Flang equivalent functions: 
> > AMDGPUTargetCodeGenInfo::getGlobalVarAddressSpace 
> > AMDGPUTargetInfo::getTargetDefines .
> 
> I don't follow - why would anything related to CUDA be relevant here?

Clang for AMDGPU supports OpenMP and 
[HIP](https://clang.llvm.org/docs/HIPSupport.html), and it reuses the same code 
for both. For example, the `-fcuda-is-device` flag needs to be checked for 
[legacy HIP host code](https://github.com/llvm/llvm-project/blob/2033b1cf16f040e1369d8efba8439dcd3e36ed31/clang/lib/Basic/Targets/AMDGPU.cpp#L278).
I would like to reuse the same part of the AMD GPU toolchain for Flang.

https://github.com/llvm/llvm-project/pull/96742


[clang] [llvm] [AArch64][NEON] Add intrinsics for LUTI (PR #96883)

2024-06-27 Thread via cfe-commits

https://github.com/Lukacma created 
https://github.com/llvm/llvm-project/pull/96883

This patch adds intrinsics for NEON LUTI2 and LUTI4 instructions as specified 
in the [ACLE proposal](https://github.com/ARM-software/acle/pull/324)

>From cb2ebe232013576f57f8f26b9156fccd75d7d38f Mon Sep 17 00:00:00 2001
From: Marian Lukac 
Date: Thu, 27 Jun 2024 09:38:17 +
Subject: [PATCH] [AArch64][NEON] Add intrinsics for LUTI

---
 clang/include/clang/Basic/arm_neon.td |  16 +
 clang/lib/CodeGen/CGBuiltin.cpp   |  54 +++
 clang/test/CodeGen/aarch64-neon-luti.c| 433 ++
 llvm/include/llvm/IR/IntrinsicsAArch64.td |  19 +
 .../lib/Target/AArch64/AArch64InstrFormats.td |  14 +-
 llvm/lib/Target/AArch64/AArch64InstrInfo.td   |  70 +++
 llvm/test/CodeGen/AArch64/neon-luti.ll| 207 +
 7 files changed, 806 insertions(+), 7 deletions(-)
 create mode 100644 clang/test/CodeGen/aarch64-neon-luti.c
 create mode 100644 llvm/test/CodeGen/AArch64/neon-luti.ll

diff --git a/clang/include/clang/Basic/arm_neon.td 
b/clang/include/clang/Basic/arm_neon.td
index 6390ba3f9fe5e..0dd76ce32fc20 100644
--- a/clang/include/clang/Basic/arm_neon.td
+++ b/clang/include/clang/Basic/arm_neon.td
@@ -2096,3 +2096,19 @@ let ArchGuard = "defined(__aarch64__) || 
defined(__arm64ec__)", TargetGuard = "r
   def VLDAP1_LANE : WInst<"vldap1_lane", ".(c*!).I", "QUlQlUlldQdPlQPl">;
   def VSTL1_LANE  : WInst<"vstl1_lane", "v*(.!)I", "QUlQlUlldQdPlQPl">;
 }
+
+//Lookup table read with 2-bit/4-bit indices
+let ArchGuard = "defined(__aarch64__)", TargetGuard = "lut" in {
+  def VLUTI2_B: SInst<"vluti2_lane","Q.(qU)I",   "cUcPcQcQUcQPc">;
+  def VLUTI2_B_Q  : SInst<"vluti2_laneq",   "Q.(QU)I",   "cUcPcQcQUcQPc">;
+  def VLUTI2_H: SInst<"vluti2_lane","Q.(qU<)I",   "sUsPshQsQUsQPsQh">;
+  def VLUTI2_H_Q  : SInst<"vluti2_laneq",   "Q.(QU<)I",   "sUsPshQsQUsQPsQh">; 
 
+  def VLUTI4_B: SInst<"vluti4_laneq","..UI",   "QcQUcQPc">;
+  def VLUTI4_H_X2 : SInst<"vluti4_laneq_x2", ".2(U<)I", "QsQUsQPsQh">;
+
+  let ArchGuard = "defined(__aarch64__)", TargetGuard= "lut,bf16" in {
+def VLUTI2_BF  : SInst<"vluti2_lane", "Q.(qU<)I",   "bQb">;  
+def VLUTI2_BF_Q: SInst<"vluti2_laneq","Q.(QU<)I",   "bQb">;  
+def VLUTI4_BF_X2   : SInst<"vluti4_laneq_x2", ".2(U<)I", "Qb">;
+  }
+}
diff --git a/clang/lib/CodeGen/CGBuiltin.cpp b/clang/lib/CodeGen/CGBuiltin.cpp
index 511e1fd4016d7..f9ac6c9dc8504 100644
--- a/clang/lib/CodeGen/CGBuiltin.cpp
+++ b/clang/lib/CodeGen/CGBuiltin.cpp
@@ -13357,6 +13357,60 @@ Value 
*CodeGenFunction::EmitAArch64BuiltinExpr(unsigned BuiltinID,
 Int = Intrinsic::aarch64_neon_suqadd;
 return EmitNeonCall(CGM.getIntrinsic(Int, Ty), Ops, "vuqadd");
   }
+
+  case NEON::BI__builtin_neon_vluti2_lane_bf16:
+  case NEON::BI__builtin_neon_vluti2_lane_f16:
+  case NEON::BI__builtin_neon_vluti2_lane_p16:
+  case NEON::BI__builtin_neon_vluti2_lane_p8:
+  case NEON::BI__builtin_neon_vluti2_lane_s16:
+  case NEON::BI__builtin_neon_vluti2_lane_s8:
+  case NEON::BI__builtin_neon_vluti2_lane_u16:
+  case NEON::BI__builtin_neon_vluti2_lane_u8:
+  case NEON::BI__builtin_neon_vluti2_laneq_bf16:
+  case NEON::BI__builtin_neon_vluti2_laneq_f16:
+  case NEON::BI__builtin_neon_vluti2_laneq_p16:
+  case NEON::BI__builtin_neon_vluti2_laneq_p8:
+  case NEON::BI__builtin_neon_vluti2_laneq_s16:
+  case NEON::BI__builtin_neon_vluti2_laneq_s8:
+  case NEON::BI__builtin_neon_vluti2_laneq_u16:
+  case NEON::BI__builtin_neon_vluti2_laneq_u8:
+  case NEON::BI__builtin_neon_vluti2q_lane_bf16:
+  case NEON::BI__builtin_neon_vluti2q_lane_f16:
+  case NEON::BI__builtin_neon_vluti2q_lane_p16:
+  case NEON::BI__builtin_neon_vluti2q_lane_p8:
+  case NEON::BI__builtin_neon_vluti2q_lane_s16:
+  case NEON::BI__builtin_neon_vluti2q_lane_s8:
+  case NEON::BI__builtin_neon_vluti2q_lane_u16:
+  case NEON::BI__builtin_neon_vluti2q_lane_u8:
+  case NEON::BI__builtin_neon_vluti2q_laneq_bf16:
+  case NEON::BI__builtin_neon_vluti2q_laneq_f16:
+  case NEON::BI__builtin_neon_vluti2q_laneq_p16:
+  case NEON::BI__builtin_neon_vluti2q_laneq_p8:
+  case NEON::BI__builtin_neon_vluti2q_laneq_s16:
+  case NEON::BI__builtin_neon_vluti2q_laneq_s8:
+  case NEON::BI__builtin_neon_vluti2q_laneq_u16:
+  case NEON::BI__builtin_neon_vluti2q_laneq_u8: {
+Int = Intrinsic::aarch64_neon_vluti2_lane;
+llvm::Type *Tys[3];
+Tys[0] = Ty;
+Tys[1] = Ops[0]->getType();
+Tys[2] = Ops[1]->getType();
+return EmitNeonCall(CGM.getIntrinsic(Int, Tys), Ops, "vluti2_lane");
+  }
+  case NEON::BI__builtin_neon_vluti4q_laneq_p8:
+  case NEON::BI__builtin_neon_vluti4q_laneq_s8:
+  case NEON::BI__builtin_neon_vluti4q_laneq_u8: {
+Int = Intrinsic::aarch64_neon_vluti4q_laneq;
+return EmitNeonCall(CGM.getIntrinsic(Int, Ty), Ops, "vluti4q_laneq");
+  }
+  case NEON::BI__builtin_neon_vluti4q_laneq_bf16_x2:
+  case NEON::BI__builtin_neon_vluti4q_laneq_f16_x2:
+  case NEON::BI__builtin_neon_vluti4q_l

[clang] [llvm] [AArch64][NEON] Add intrinsics for LUTI (PR #96883)

2024-06-27 Thread via cfe-commits

llvmbot wrote:




@llvm/pr-subscribers-clang

Author: None (Lukacma)


Changes

This patch adds intrinsics for NEON LUTI2 and LUTI4 instructions as specified 
in the [ACLE proposal](https://github.com/ARM-software/acle/pull/324)

---

Patch is 45.96 KiB, truncated to 20.00 KiB below, full version: 
https://github.com/llvm/llvm-project/pull/96883.diff


7 Files Affected:

- (modified) clang/include/clang/Basic/arm_neon.td (+16) 
- (modified) clang/lib/CodeGen/CGBuiltin.cpp (+54) 
- (added) clang/test/CodeGen/aarch64-neon-luti.c (+433) 
- (modified) llvm/include/llvm/IR/IntrinsicsAArch64.td (+19) 
- (modified) llvm/lib/Target/AArch64/AArch64InstrFormats.td (+7-7) 
- (modified) llvm/lib/Target/AArch64/AArch64InstrInfo.td (+70) 
- (added) llvm/test/CodeGen/AArch64/neon-luti.ll (+207) 


``diff
diff --git a/clang/include/clang/Basic/arm_neon.td 
b/clang/include/clang/Basic/arm_neon.td
index 6390ba3f9fe5e..0dd76ce32fc20 100644
--- a/clang/include/clang/Basic/arm_neon.td
+++ b/clang/include/clang/Basic/arm_neon.td
@@ -2096,3 +2096,19 @@ let ArchGuard = "defined(__aarch64__) || 
defined(__arm64ec__)", TargetGuard = "r
   def VLDAP1_LANE : WInst<"vldap1_lane", ".(c*!).I", "QUlQlUlldQdPlQPl">;
   def VSTL1_LANE  : WInst<"vstl1_lane", "v*(.!)I", "QUlQlUlldQdPlQPl">;
 }
+
+//Lookup table read with 2-bit/4-bit indices
+let ArchGuard = "defined(__aarch64__)", TargetGuard = "lut" in {
+  def VLUTI2_B: SInst<"vluti2_lane","Q.(qU)I",   "cUcPcQcQUcQPc">;
+  def VLUTI2_B_Q  : SInst<"vluti2_laneq",   "Q.(QU)I",   "cUcPcQcQUcQPc">;
+  def VLUTI2_H: SInst<"vluti2_lane","Q.(qU<)I",   "sUsPshQsQUsQPsQh">;
+  def VLUTI2_H_Q  : SInst<"vluti2_laneq",   "Q.(QU<)I",   "sUsPshQsQUsQPsQh">; 
 
+  def VLUTI4_B: SInst<"vluti4_laneq","..UI",   "QcQUcQPc">;
+  def VLUTI4_H_X2 : SInst<"vluti4_laneq_x2", ".2(U<)I", "QsQUsQPsQh">;
+
+  let ArchGuard = "defined(__aarch64__)", TargetGuard= "lut,bf16" in {
+def VLUTI2_BF  : SInst<"vluti2_lane", "Q.(qU<)I",   "bQb">;  
+def VLUTI2_BF_Q: SInst<"vluti2_laneq","Q.(QU<)I",   "bQb">;  
+def VLUTI4_BF_X2   : SInst<"vluti4_laneq_x2", ".2(U<)I", "Qb">;
+  }
+}
diff --git a/clang/lib/CodeGen/CGBuiltin.cpp b/clang/lib/CodeGen/CGBuiltin.cpp
index 511e1fd4016d7..f9ac6c9dc8504 100644
--- a/clang/lib/CodeGen/CGBuiltin.cpp
+++ b/clang/lib/CodeGen/CGBuiltin.cpp
@@ -13357,6 +13357,60 @@ Value 
*CodeGenFunction::EmitAArch64BuiltinExpr(unsigned BuiltinID,
 Int = Intrinsic::aarch64_neon_suqadd;
 return EmitNeonCall(CGM.getIntrinsic(Int, Ty), Ops, "vuqadd");
   }
+
+  case NEON::BI__builtin_neon_vluti2_lane_bf16:
+  case NEON::BI__builtin_neon_vluti2_lane_f16:
+  case NEON::BI__builtin_neon_vluti2_lane_p16:
+  case NEON::BI__builtin_neon_vluti2_lane_p8:
+  case NEON::BI__builtin_neon_vluti2_lane_s16:
+  case NEON::BI__builtin_neon_vluti2_lane_s8:
+  case NEON::BI__builtin_neon_vluti2_lane_u16:
+  case NEON::BI__builtin_neon_vluti2_lane_u8:
+  case NEON::BI__builtin_neon_vluti2_laneq_bf16:
+  case NEON::BI__builtin_neon_vluti2_laneq_f16:
+  case NEON::BI__builtin_neon_vluti2_laneq_p16:
+  case NEON::BI__builtin_neon_vluti2_laneq_p8:
+  case NEON::BI__builtin_neon_vluti2_laneq_s16:
+  case NEON::BI__builtin_neon_vluti2_laneq_s8:
+  case NEON::BI__builtin_neon_vluti2_laneq_u16:
+  case NEON::BI__builtin_neon_vluti2_laneq_u8:
+  case NEON::BI__builtin_neon_vluti2q_lane_bf16:
+  case NEON::BI__builtin_neon_vluti2q_lane_f16:
+  case NEON::BI__builtin_neon_vluti2q_lane_p16:
+  case NEON::BI__builtin_neon_vluti2q_lane_p8:
+  case NEON::BI__builtin_neon_vluti2q_lane_s16:
+  case NEON::BI__builtin_neon_vluti2q_lane_s8:
+  case NEON::BI__builtin_neon_vluti2q_lane_u16:
+  case NEON::BI__builtin_neon_vluti2q_lane_u8:
+  case NEON::BI__builtin_neon_vluti2q_laneq_bf16:
+  case NEON::BI__builtin_neon_vluti2q_laneq_f16:
+  case NEON::BI__builtin_neon_vluti2q_laneq_p16:
+  case NEON::BI__builtin_neon_vluti2q_laneq_p8:
+  case NEON::BI__builtin_neon_vluti2q_laneq_s16:
+  case NEON::BI__builtin_neon_vluti2q_laneq_s8:
+  case NEON::BI__builtin_neon_vluti2q_laneq_u16:
+  case NEON::BI__builtin_neon_vluti2q_laneq_u8: {
+Int = Intrinsic::aarch64_neon_vluti2_lane;
+llvm::Type *Tys[3];
+Tys[0] = Ty;
+Tys[1] = Ops[0]->getType();
+Tys[2] = Ops[1]->getType();
+return EmitNeonCall(CGM.getIntrinsic(Int, Tys), Ops, "vluti2_lane");
+  }
+  case NEON::BI__builtin_neon_vluti4q_laneq_p8:
+  case NEON::BI__builtin_neon_vluti4q_laneq_s8:
+  case NEON::BI__builtin_neon_vluti4q_laneq_u8: {
+Int = Intrinsic::aarch64_neon_vluti4q_laneq;
+return EmitNeonCall(CGM.getIntrinsic(Int, Ty), Ops, "vluti4q_laneq");
+  }
+  case NEON::BI__builtin_neon_vluti4q_laneq_bf16_x2:
+  case NEON::BI__builtin_neon_vluti4q_laneq_f16_x2:
+  case NEON::BI__builtin_neon_vluti4q_laneq_p16_x2:
+  case NEON::BI__builtin_neon_vluti4q_laneq_s16_x2:
+  case NEON::BI__builtin_neon_vluti4q_laneq_u16_x2: {
+Int = Intrinsic::aarch64_neon_vluti4q_laneq_x2;
+return EmitNeonCall(CGM.

[clang] [llvm] [AArch64][NEON] Add intrinsics for LUTI (PR #96883)

2024-06-27 Thread via cfe-commits

llvmbot wrote:




@llvm/pr-subscribers-backend-aarch64

Author: None (Lukacma)


Changes

This patch adds intrinsics for NEON LUTI2 and LUTI4 instructions as specified 
in the [ACLE proposal](https://github.com/ARM-software/acle/pull/324)

---

Patch is 45.96 KiB, truncated to 20.00 KiB below, full version: 
https://github.com/llvm/llvm-project/pull/96883.diff


7 Files Affected:

- (modified) clang/include/clang/Basic/arm_neon.td (+16) 
- (modified) clang/lib/CodeGen/CGBuiltin.cpp (+54) 
- (added) clang/test/CodeGen/aarch64-neon-luti.c (+433) 
- (modified) llvm/include/llvm/IR/IntrinsicsAArch64.td (+19) 
- (modified) llvm/lib/Target/AArch64/AArch64InstrFormats.td (+7-7) 
- (modified) llvm/lib/Target/AArch64/AArch64InstrInfo.td (+70) 
- (added) llvm/test/CodeGen/AArch64/neon-luti.ll (+207) 


``diff
diff --git a/clang/include/clang/Basic/arm_neon.td 
b/clang/include/clang/Basic/arm_neon.td
index 6390ba3f9fe5e..0dd76ce32fc20 100644
--- a/clang/include/clang/Basic/arm_neon.td
+++ b/clang/include/clang/Basic/arm_neon.td
@@ -2096,3 +2096,19 @@ let ArchGuard = "defined(__aarch64__) || 
defined(__arm64ec__)", TargetGuard = "r
   def VLDAP1_LANE : WInst<"vldap1_lane", ".(c*!).I", "QUlQlUlldQdPlQPl">;
   def VSTL1_LANE  : WInst<"vstl1_lane", "v*(.!)I", "QUlQlUlldQdPlQPl">;
 }
+
+//Lookup table read with 2-bit/4-bit indices
+let ArchGuard = "defined(__aarch64__)", TargetGuard = "lut" in {
+  def VLUTI2_B: SInst<"vluti2_lane","Q.(qU)I",   "cUcPcQcQUcQPc">;
+  def VLUTI2_B_Q  : SInst<"vluti2_laneq",   "Q.(QU)I",   "cUcPcQcQUcQPc">;
+  def VLUTI2_H: SInst<"vluti2_lane","Q.(qU<)I",   "sUsPshQsQUsQPsQh">;
+  def VLUTI2_H_Q  : SInst<"vluti2_laneq",   "Q.(QU<)I",   "sUsPshQsQUsQPsQh">; 
 
+  def VLUTI4_B: SInst<"vluti4_laneq","..UI",   "QcQUcQPc">;
+  def VLUTI4_H_X2 : SInst<"vluti4_laneq_x2", ".2(U<)I", "QsQUsQPsQh">;
+
+  let ArchGuard = "defined(__aarch64__)", TargetGuard= "lut,bf16" in {
+def VLUTI2_BF  : SInst<"vluti2_lane", "Q.(qU<)I",   "bQb">;  
+def VLUTI2_BF_Q: SInst<"vluti2_laneq","Q.(QU<)I",   "bQb">;  
+def VLUTI4_BF_X2   : SInst<"vluti4_laneq_x2", ".2(U<)I", "Qb">;
+  }
+}
diff --git a/clang/lib/CodeGen/CGBuiltin.cpp b/clang/lib/CodeGen/CGBuiltin.cpp
index 511e1fd4016d7..f9ac6c9dc8504 100644
--- a/clang/lib/CodeGen/CGBuiltin.cpp
+++ b/clang/lib/CodeGen/CGBuiltin.cpp
@@ -13357,6 +13357,60 @@ Value 
*CodeGenFunction::EmitAArch64BuiltinExpr(unsigned BuiltinID,
 Int = Intrinsic::aarch64_neon_suqadd;
 return EmitNeonCall(CGM.getIntrinsic(Int, Ty), Ops, "vuqadd");
   }
+
+  case NEON::BI__builtin_neon_vluti2_lane_bf16:
+  case NEON::BI__builtin_neon_vluti2_lane_f16:
+  case NEON::BI__builtin_neon_vluti2_lane_p16:
+  case NEON::BI__builtin_neon_vluti2_lane_p8:
+  case NEON::BI__builtin_neon_vluti2_lane_s16:
+  case NEON::BI__builtin_neon_vluti2_lane_s8:
+  case NEON::BI__builtin_neon_vluti2_lane_u16:
+  case NEON::BI__builtin_neon_vluti2_lane_u8:
+  case NEON::BI__builtin_neon_vluti2_laneq_bf16:
+  case NEON::BI__builtin_neon_vluti2_laneq_f16:
+  case NEON::BI__builtin_neon_vluti2_laneq_p16:
+  case NEON::BI__builtin_neon_vluti2_laneq_p8:
+  case NEON::BI__builtin_neon_vluti2_laneq_s16:
+  case NEON::BI__builtin_neon_vluti2_laneq_s8:
+  case NEON::BI__builtin_neon_vluti2_laneq_u16:
+  case NEON::BI__builtin_neon_vluti2_laneq_u8:
+  case NEON::BI__builtin_neon_vluti2q_lane_bf16:
+  case NEON::BI__builtin_neon_vluti2q_lane_f16:
+  case NEON::BI__builtin_neon_vluti2q_lane_p16:
+  case NEON::BI__builtin_neon_vluti2q_lane_p8:
+  case NEON::BI__builtin_neon_vluti2q_lane_s16:
+  case NEON::BI__builtin_neon_vluti2q_lane_s8:
+  case NEON::BI__builtin_neon_vluti2q_lane_u16:
+  case NEON::BI__builtin_neon_vluti2q_lane_u8:
+  case NEON::BI__builtin_neon_vluti2q_laneq_bf16:
+  case NEON::BI__builtin_neon_vluti2q_laneq_f16:
+  case NEON::BI__builtin_neon_vluti2q_laneq_p16:
+  case NEON::BI__builtin_neon_vluti2q_laneq_p8:
+  case NEON::BI__builtin_neon_vluti2q_laneq_s16:
+  case NEON::BI__builtin_neon_vluti2q_laneq_s8:
+  case NEON::BI__builtin_neon_vluti2q_laneq_u16:
+  case NEON::BI__builtin_neon_vluti2q_laneq_u8: {
+Int = Intrinsic::aarch64_neon_vluti2_lane;
+llvm::Type *Tys[3];
+Tys[0] = Ty;
+Tys[1] = Ops[0]->getType();
+Tys[2] = Ops[1]->getType();
+return EmitNeonCall(CGM.getIntrinsic(Int, Tys), Ops, "vluti2_lane");
+  }
+  case NEON::BI__builtin_neon_vluti4q_laneq_p8:
+  case NEON::BI__builtin_neon_vluti4q_laneq_s8:
+  case NEON::BI__builtin_neon_vluti4q_laneq_u8: {
+Int = Intrinsic::aarch64_neon_vluti4q_laneq;
+return EmitNeonCall(CGM.getIntrinsic(Int, Ty), Ops, "vluti4q_laneq");
+  }
+  case NEON::BI__builtin_neon_vluti4q_laneq_bf16_x2:
+  case NEON::BI__builtin_neon_vluti4q_laneq_f16_x2:
+  case NEON::BI__builtin_neon_vluti4q_laneq_p16_x2:
+  case NEON::BI__builtin_neon_vluti4q_laneq_s16_x2:
+  case NEON::BI__builtin_neon_vluti4q_laneq_u16_x2: {
+Int = Intrinsic::aarch64_neon_vluti4q_laneq_x2;
+return EmitNeo

[clang] [llvm] [OpenMP] OpenMP 5.1 "assume" directive parsing support (PR #92731)

2024-06-27 Thread Julian Brown via cfe-commits

jtb20 wrote:

> > > don't you need more code in AST?
> > 
> > 
> > Sorry, I don't quite understand the question! Could you elaborate a little 
> > please?
> 
> I was thinking maybe you need changes in AST related files, like 
> `ASTWriter.cpp`, but that might be not needed as this is adding a new 
> directive.

At the moment, since the "assume" directive is parsed but then immediately 
discarded, I don't think anything else is needed.

Actually the existing "assumes" support is reused -- the bit in SemaOpenMP.cpp 
adds the "assume" assumptions to the OMPAssumeScoped "stack". For "assumes", 
it's done like that so (e.g. top-level) declarations between begin/end 
"assumes" can be modified according to the assumptions on that stack. For 
"assume", once some use is made for the assumptions, that might turn out to not 
be the most useful representation. That can be revisited later though, I think.

https://github.com/llvm/llvm-project/pull/92731


[clang] [llvm] [CLANG][LLVM][AArch64]Add SME2.1 intrinsics for MOVAZ tile to vector,… (PR #88499)

2024-06-27 Thread via cfe-commits

https://github.com/CarolineConcatto updated 
https://github.com/llvm/llvm-project/pull/88499

>From a4d4a0ff71f5086c9fdf43e332b9752074eb42dc Mon Sep 17 00:00:00 2001
From: Caroline Concatto 
Date: Thu, 11 Apr 2024 16:10:16 +
Subject: [PATCH 1/3] [CLANG][LLVM][AArch64]Add SME2.1 intrinsics for MOVAZ
 tile to vector, single

According to the specification in
ARM-software/acle#309 this adds the intrinsics

// And similarly for u8.
svint8_t svreadz_hor_za8_s8(uint64_t tile, uint32_t slice)
__arm_streaming __arm_inout("za");

// And similarly for u16, bf16 and f16.
svint16_t svreadz_hor_za16_s16(uint64_t tile, uint32_t slice)
__arm_streaming __arm_inout("za");

// And similarly for u32 and f32.
svint32_t svreadz_hor_za32_s32(uint64_t tile, uint32_t slice)
__arm_streaming __arm_inout("za");

// And similarly for u64 and f64.
svint64_t svreadz_hor_za64_s64(uint64_t tile, uint32_t slice)
__arm_streaming __arm_inout("za");

// And similarly for s16, s32, s64, u8, u16, u32, u64, bf16, f16, f32, f64
svint8_t svreadz_hor_za128_s8(uint64_t tile, uint32_t slice)
__arm_streaming __arm_inout("za");
---
 clang/include/clang/Basic/arm_sme.td  |  18 +
 .../acle_sme2p1_movaz.c   | 410 
 .../acle_sme2p1_imm.cpp   |  21 +
 llvm/include/llvm/IR/IntrinsicsAArch64.td |  12 +-
 .../Target/AArch64/AArch64ISelLowering.cpp|  37 ++
 llvm/lib/Target/AArch64/AArch64ISelLowering.h |   3 +
 .../lib/Target/AArch64/AArch64SMEInstrInfo.td |   3 +-
 llvm/lib/Target/AArch64/SMEInstrFormats.td|  93 +++-
 .../AArch64/sme2p1-intrinsics-movaz.ll| 445 +-
 9 files changed, 1021 insertions(+), 21 deletions(-)
 create mode 100644 
clang/test/Sema/aarch64-sme2p1-intrinsics/acle_sme2p1_imm.cpp

diff --git a/clang/include/clang/Basic/arm_sme.td 
b/clang/include/clang/Basic/arm_sme.td
index 5f757b40e8fd9..a5677802193af 100644
--- a/clang/include/clang/Basic/arm_sme.td
+++ b/clang/include/clang/Basic/arm_sme.td
@@ -787,4 +787,22 @@ defm SVREADZ_ZA16_X4 : ZAReadz<"za16", "4", "sUshb", 
"aarch64_sme_readz", [ImmCh
 defm SVREADZ_ZA32_X4 : ZAReadz<"za32", "4", "iUif",  "aarch64_sme_readz", 
[ImmCheck<0, ImmCheck0_3>]>;
 defm SVREADZ_ZA64_X4 : ZAReadz<"za64", "4", "lUld",  "aarch64_sme_readz", 
[ImmCheck<0, ImmCheck0_7>]>;
 
+
+multiclass ZAReadz 
ch> {
+  let SMETargetGuard = "sme2p1" in {
+def NAME # _H : SInst<"svreadz_hor_" # n_suffix # "_{d}", "dim", t,
+  MergeNone, i_prefix # "_horiz",
+  [IsStreaming, IsInOutZA], ch>;
+
+def NAME # _V : SInst<"svreadz_ver_" # n_suffix # "_{d}", "dim", t,
+  MergeNone, i_prefix # "_vert",
+  [IsStreaming, IsInOutZA], ch>;
+  }
+}
+
+defm SVREADZ_ZA8 : ZAReadz<"za8", "cUc", "aarch64_sme_readz", [ImmCheck<0, 
ImmCheck0_0>]>;
+defm SVREADZ_ZA16 : ZAReadz<"za16", "sUshb", "aarch64_sme_readz", [ImmCheck<0, 
ImmCheck0_1>]>;
+defm SVREADZ_ZA32 : ZAReadz<"za32", "iUif", "aarch64_sme_readz", [ImmCheck<0, 
ImmCheck0_3>]>;
+defm SVREADZ_ZA64 : ZAReadz<"za64", "lUld", "aarch64_sme_readz", [ImmCheck<0, 
ImmCheck0_7>]>;
+defm SVREADZ_ZA128 : ZAReadz<"za128", "csilUcUiUsUlbhfd", 
"aarch64_sme_readz_q", [ImmCheck<0, ImmCheck0_15>]>;
 } // let SVETargetGuard = InvalidMode
diff --git a/clang/test/CodeGen/aarch64-sme2p1-intrinsics/acle_sme2p1_movaz.c 
b/clang/test/CodeGen/aarch64-sme2p1-intrinsics/acle_sme2p1_movaz.c
index d0c7230ade761..7c9067a5ceece 100644
--- a/clang/test/CodeGen/aarch64-sme2p1-intrinsics/acle_sme2p1_movaz.c
+++ b/clang/test/CodeGen/aarch64-sme2p1-intrinsics/acle_sme2p1_movaz.c
@@ -1413,3 +1413,413 @@ svfloat64x4_t test_svreadz_ver_za64_f64_x4(uint32_t 
slice) __arm_streaming __arm
 {
return svreadz_ver_za64_f64_vg4(7, slice);
 }
+
+// CHECK-LABEL: define dso_local <vscale x 16 x i8> @test_svreadz_hor_za8_s8(
+// CHECK-SAME: i32 noundef [[SLICE:%.*]]) #[[ATTR0:[0-9]+]] {
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:    [[TMP0:%.*]] = tail call <vscale x 16 x i8> 
@llvm.aarch64.sme.readz.horiz.nxv16i8(i32 0, i32 [[SLICE]])
+// CHECK-NEXT:    ret <vscale x 16 x i8> [[TMP0]]
+//
+// CPP-CHECK-LABEL: define dso_local <vscale x 16 x i8> 
@_Z23test_svreadz_hor_za8_s8j(
+// CPP-CHECK-SAME: i32 noundef [[SLICE:%.*]]) #[[ATTR0:[0-9]+]] {
+// CPP-CHECK-NEXT:  entry:
+// CPP-CHECK-NEXT:    [[TMP0:%.*]] = tail call <vscale x 16 x i8> 
@llvm.aarch64.sme.readz.horiz.nxv16i8(i32 0, i32 [[SLICE]])
+// CPP-CHECK-NEXT:    ret <vscale x 16 x i8> [[TMP0]]
+//
+svint8_t test_svreadz_hor_za8_s8(uint32_t slice) __arm_streaming 
__arm_inout("za")
+{
+   return svreadz_hor_za8_s8(0, slice);
+}
+
+// CHECK-LABEL: define dso_local <vscale x 16 x i8> @test_svreadz_hor_za8_u8(
+// CHECK-SAME: i32 noundef [[SLICE:%.*]]) #[[ATTR0]] {
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:    [[TMP0:%.*]] = tail call <vscale x 16 x i8> 
@llvm.aarch64.sme.readz.horiz.nxv16i8(i32 0, i32 [[SLICE]])
+// CHECK-NEXT:    ret <vscale x 16 x i8> [[TMP0]]
+//
+// CPP-CHECK-LABEL: define dso_local <vscale x 16 x i8> 
@_Z23test_svreadz_hor_za8_u8j(
+// CPP-CHECK-SAME: i32 noundef [[SLICE:%.*]]) #[[ATTR0]] {
+// CPP-CHECK-NEXT:  entry:
+// CPP-CHECK-NEXT

[clang] [flang] [Flang-new][OpenMP] Add offload related flags for AMDGPU (PR #96742)

2024-06-27 Thread Dominik Adamski via cfe-commits


@@ -333,6 +333,9 @@ void Flang::AddAMDGPUTargetArgs(const ArgList &Args,
 StringRef Val = A->getValue();
 CmdArgs.push_back(Args.MakeArgString("-mcode-object-version=" + Val));
   }
+
+  const ToolChain &TC = getToolChain();
+  TC.addClangTargetOptions(Args, CmdArgs, Action::OffloadKind::OFK_OpenMP);

DominikAdamski wrote:

No. My change does not affect Nvidia GPU support.

Flang and Clang share the same LLVM backend, which consumes the generated LLVM IR. 
For AMD GPUs we need to embed bitcode definitions of the GPU math functions. The AMD 
toolchain adds all required options to the compiler invocation for AMD GPUs and, 
IMO, can be reused between Flang and Clang.

I don't know whether Nvidia also wants to reuse its toolchain between Clang and 
Flang to fully support OpenMP offloading.

https://github.com/llvm/llvm-project/pull/96742


[clang] [Sema] LambdaScopeForCallOperatorInstantiationRAII - fix typo in early out logic (PR #96888)

2024-06-27 Thread Simon Pilgrim via cfe-commits

https://github.com/RKSimon created 
https://github.com/llvm/llvm-project/pull/96888

We should be checking for a failed dyn_cast on the ParentFD result - not the 
loop invariant FD root value.

Seems to have been introduced in #65193

Noticed by static analyser (I have no specific test case).
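
The bug pattern described above can be sketched outside of Clang. This is a minimal illustration only: `Context`, `Function`, `Namespace`, and `countEnclosingFunctions` are hypothetical stand-ins, not the classes or code in `SemaLambda.cpp`. The point is that the early-out has to test the value the cast just produced, not the pointer the walk started from.

```cpp
#include <cassert>

// Hypothetical stand-ins for declaration contexts; not Clang's classes.
struct Context {
  Context *parent = nullptr;
  virtual ~Context() = default;
};
struct Function : Context {};
struct Namespace : Context {};

// Counts how many Function contexts enclose `start`.
// The early-out tests `parent`, the value just produced by the cast.
// Testing `start` instead would never fire inside the loop, because
// `start` does not change, and a null `parent` would then be used.
int countEnclosingFunctions(Function *start) {
  int count = 0;
  Function *parent = start;
  while (parent) {
    parent = dynamic_cast<Function *>(parent->parent);
    if (!parent) // check the cast result, not the loop-invariant `start`
      break;
    ++count;
  }
  return count;
}
```

For a function nested in another function that itself sits in a namespace, `countEnclosingFunctions` returns 1: the walk stops as soon as the cast of the namespace context fails.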

>From 3194b593fbb50ee20b9f7d73beef4472657e6e00 Mon Sep 17 00:00:00 2001
From: Simon Pilgrim 
Date: Thu, 27 Jun 2024 11:09:32 +0100
Subject: [PATCH] [Sema] LambdaScopeForCallOperatorInstantiationRAII - fix typo
 in early out logic

We should be checking for a failed dyn_cast on the ParentFD result - not the 
loop invariant FD root value.

Seems to have been introduced in #65193

Noticed by static analyser (I have no specific test case).
---
 clang/lib/Sema/SemaLambda.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/clang/lib/Sema/SemaLambda.cpp b/clang/lib/Sema/SemaLambda.cpp
index e9476a0c93c5d..ca9c7cb9faadf 100644
--- a/clang/lib/Sema/SemaLambda.cpp
+++ b/clang/lib/Sema/SemaLambda.cpp
@@ -2391,7 +2391,7 @@ Sema::LambdaScopeForCallOperatorInstantiationRAII::
   Pattern =
   dyn_cast<FunctionDecl>(getLambdaAwareParentOfDeclContext(Pattern));
 
-  if (!FD || !Pattern)
+  if (!ParentFD || !Pattern)
 break;
 
   SemaRef.addInstantiatedParametersToScope(ParentFD, Pattern, Scope, 
MLTAL);



[clang] [Sema] LambdaScopeForCallOperatorInstantiationRAII - fix typo in early out logic (PR #96888)

2024-06-27 Thread via cfe-commits

llvmbot wrote:




@llvm/pr-subscribers-clang

Author: Simon Pilgrim (RKSimon)


Changes

We should be checking for a failed dyn_cast on the ParentFD result - not the 
loop invariant FD root value.

Seems to have been introduced in #65193

Noticed by static analyser (I have no specific test case).

---
Full diff: https://github.com/llvm/llvm-project/pull/96888.diff


1 Files Affected:

- (modified) clang/lib/Sema/SemaLambda.cpp (+1-1) 


```diff
diff --git a/clang/lib/Sema/SemaLambda.cpp b/clang/lib/Sema/SemaLambda.cpp
index e9476a0c93c5d..ca9c7cb9faadf 100644
--- a/clang/lib/Sema/SemaLambda.cpp
+++ b/clang/lib/Sema/SemaLambda.cpp
@@ -2391,7 +2391,7 @@ Sema::LambdaScopeForCallOperatorInstantiationRAII::
   Pattern =
   dyn_cast<FunctionDecl>(getLambdaAwareParentOfDeclContext(Pattern));
 
-  if (!FD || !Pattern)
+  if (!ParentFD || !Pattern)
 break;
 
   SemaRef.addInstantiatedParametersToScope(ParentFD, Pattern, Scope, 
MLTAL);

```




https://github.com/llvm/llvm-project/pull/96888


[clang] [lldb] [clang][lldb] Don't assert structure layout correctness for layouts provided by LLDB (PR #93809)

2024-06-27 Thread Michael Buch via cfe-commits

Michael137 wrote:

> Here's the smallest patch that would put explicit alignment on any packed 
> structure:
> 
> ```
> diff --git a/clang/lib/CodeGen/CGDebugInfo.cpp 
> b/clang/lib/CodeGen/CGDebugInfo.cpp
> index a072475ba770..bbb13ddd593b 100644
> --- a/clang/lib/CodeGen/CGDebugInfo.cpp
> +++ b/clang/lib/CodeGen/CGDebugInfo.cpp
> @@ -64,7 +64,7 @@ static uint32_t getTypeAlignIfRequired(const Type *Ty, 
> const ASTContext &Ctx) {
>// MaxFieldAlignmentAttr is the attribute added to types
>// declared after #pragma pack(n).
>if (auto *Decl = Ty->getAsRecordDecl())
> -if (Decl->hasAttr<MaxFieldAlignmentAttr>())
> +if (Decl->hasAttr<MaxFieldAlignmentAttr>() || 
> Decl->hasAttr<PackedAttr>())
>return TI.Align;
>  
>return 0;
> ```
> 
> But I don't think that's the right approach - I think what we should do is 
> compute the natural alignment of the structure, then compare that to the 
> actual alignment - and if they differ, we should put an explicit alignment on 
> the structure. This avoids the risk that other alignment-influencing effects 
> might be missed (and avoids the case of putting alignment on a structure 
> that, when packed, just has the same alignment anyway - which is a minor 
> issue, but nice to get right (eg: packed struct of a single char probably 
> shouldn't have an explicit alignment - since it's the same as the implicit 
> alignment anyway))

Thanks for the analysis! If we can emit alignment for packed attributes 
consistently, then we can probably get rid of most of the `InferAlignment` logic 
in the `RecordLayoutBuilder` (it seems to me most of that logic was introduced 
there for the purpose of packed structs), which would address the issue I saw 
with laying out `[[no_unique_address]]` fields. Trying this now.
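
The "compare natural alignment with actual alignment" idea from the quoted comment can be sketched in plain C++. This is an illustration under stated assumptions, not the CGDebugInfo change: `needsExplicitAlignment` and the four structs are hypothetical, and `__attribute__((packed))` is the GCC/Clang spelling.

```cpp
#include <cstddef>

// The same fields, laid out naturally and packed.
struct Natural { char tag; int value; };
struct __attribute__((packed)) Packed { char tag; int value; };

// A packed struct holding a single char: packing changes nothing here.
struct NaturalChar { char tag; };
struct __attribute__((packed)) PackedChar { char tag; };

// Hypothetical helper: an explicit alignment would only be worth
// recording when the actual alignment differs from the natural one.
constexpr bool needsExplicitAlignment(std::size_t actual, std::size_t natural) {
  return actual != natural;
}

static_assert(needsExplicitAlignment(alignof(Packed), alignof(Natural)),
              "packing lowers the alignment of {char, int}");
static_assert(!needsExplicitAlignment(alignof(PackedChar), alignof(NaturalChar)),
              "a packed single-char struct keeps its natural alignment");
```

The second assertion is the single-char case mentioned in the quote: the packed and unpacked forms share the same alignment, so nothing extra would need to be emitted.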

https://github.com/llvm/llvm-project/pull/93809


[clang] [Clang] Extend lifetime bound analysis to support assignments (PR #96475)

2024-06-27 Thread Haojian Wu via cfe-commits

https://github.com/hokein edited https://github.com/llvm/llvm-project/pull/96475


[clang] [llvm] [X86][CodeGen] security check cookie execute only when needed (PR #95904)

2024-06-27 Thread Simon Pilgrim via cfe-commits


@@ -9,95 +9,6 @@
 @"\01LC" = internal constant [11 x i8] c"buf == %s\0A\00";  [#uses=1]
 
 define void @test(ptr %a) nounwind ssp {
-; MSVC-X86-LABEL: test:

RKSimon wrote:

where did the test checks go?

https://github.com/llvm/llvm-project/pull/95904


[clang] [llvm] [X86][CodeGen] security check cookie execute only when needed (PR #95904)

2024-06-27 Thread Simon Pilgrim via cfe-commits


@@ -114,250 +25,93 @@ return:; preds = %entry
 declare void @escape(ptr)
 
 define void @test_vla(i32 %n) nounwind ssp {
-; MSVC-X86-LABEL: test_vla:
-; MSVC-X86:   # %bb.0:
-; MSVC-X86-NEXT:pushl %ebp
-; MSVC-X86-NEXT:movl %esp, %ebp
-; MSVC-X86-NEXT:pushl %eax
-; MSVC-X86-NEXT:movl 8(%ebp), %eax
-; MSVC-X86-NEXT:movl ___security_cookie, %ecx
-; MSVC-X86-NEXT:xorl %ebp, %ecx
-; MSVC-X86-NEXT:movl %ecx, -4(%ebp)
-; MSVC-X86-NEXT:shll $2, %eax
-; MSVC-X86-NEXT:calll __chkstk
-; MSVC-X86-NEXT:movl %esp, %eax
-; MSVC-X86-NEXT:pushl %eax
-; MSVC-X86-NEXT:calll _escape
-; MSVC-X86-NEXT:addl $4, %esp
-; MSVC-X86-NEXT:movl -4(%ebp), %ecx
-; MSVC-X86-NEXT:xorl %ebp, %ecx
-; MSVC-X86-NEXT:calll @__security_check_cookie@4
-; MSVC-X86-NEXT:movl %ebp, %esp
-; MSVC-X86-NEXT:popl %ebp
-; MSVC-X86-NEXT:retl
-;
-; MSVC-X64-LABEL: test_vla:
-; MSVC-X64:   # %bb.0:
-; MSVC-X64-NEXT:pushq %rbp
-; MSVC-X64-NEXT:subq $16, %rsp
-; MSVC-X64-NEXT:leaq {{[0-9]+}}(%rsp), %rbp
-; MSVC-X64-NEXT:movq __security_cookie(%rip), %rax
-; MSVC-X64-NEXT:xorq %rbp, %rax
-; MSVC-X64-NEXT:movq %rax, -8(%rbp)
-; MSVC-X64-NEXT:movl %ecx, %eax
-; MSVC-X64-NEXT:leaq 15(,%rax,4), %rax
-; MSVC-X64-NEXT:andq $-16, %rax
-; MSVC-X64-NEXT:callq __chkstk
-; MSVC-X64-NEXT:subq %rax, %rsp
-; MSVC-X64-NEXT:movq %rsp, %rcx
-; MSVC-X64-NEXT:subq $32, %rsp
-; MSVC-X64-NEXT:callq escape
-; MSVC-X64-NEXT:addq $32, %rsp
-; MSVC-X64-NEXT:movq -8(%rbp), %rcx
-; MSVC-X64-NEXT:xorq %rbp, %rcx
-; MSVC-X64-NEXT:subq $32, %rsp
-; MSVC-X64-NEXT:callq __security_check_cookie
-; MSVC-X64-NEXT:movq %rbp, %rsp
-; MSVC-X64-NEXT:popq %rbp
-; MSVC-X64-NEXT:retq
-;
-; MSVC-X86-O0-LABEL: test_vla:
-; MSVC-X86-O0:   # %bb.0:
-; MSVC-X86-O0-NEXT:pushl %ebp
-; MSVC-X86-O0-NEXT:movl %esp, %ebp
-; MSVC-X86-O0-NEXT:pushl %eax
-; MSVC-X86-O0-NEXT:movl 8(%ebp), %eax
-; MSVC-X86-O0-NEXT:movl ___security_cookie, %ecx
-; MSVC-X86-O0-NEXT:xorl %ebp, %ecx
-; MSVC-X86-O0-NEXT:movl %ecx, -4(%ebp)
-; MSVC-X86-O0-NEXT:shll $2, %eax
-; MSVC-X86-O0-NEXT:calll __chkstk
-; MSVC-X86-O0-NEXT:movl %esp, %eax
-; MSVC-X86-O0-NEXT:subl $4, %esp
-; MSVC-X86-O0-NEXT:movl %eax, (%esp)
-; MSVC-X86-O0-NEXT:calll _escape
-; MSVC-X86-O0-NEXT:addl $4, %esp
-; MSVC-X86-O0-NEXT:movl -4(%ebp), %ecx
-; MSVC-X86-O0-NEXT:xorl %ebp, %ecx
-; MSVC-X86-O0-NEXT:calll @__security_check_cookie@4
-; MSVC-X86-O0-NEXT:movl %ebp, %esp
-; MSVC-X86-O0-NEXT:popl %ebp
-; MSVC-X86-O0-NEXT:retl
-;
-; MSVC-X64-O0-LABEL: test_vla:
-; MSVC-X64-O0:   # %bb.0:
-; MSVC-X64-O0-NEXT:pushq %rbp
-; MSVC-X64-O0-NEXT:subq $16, %rsp
-; MSVC-X64-O0-NEXT:leaq {{[0-9]+}}(%rsp), %rbp
-; MSVC-X64-O0-NEXT:movq __security_cookie(%rip), %rax
-; MSVC-X64-O0-NEXT:xorq %rbp, %rax
-; MSVC-X64-O0-NEXT:movq %rax, -8(%rbp)
-; MSVC-X64-O0-NEXT:movl %ecx, %eax
-; MSVC-X64-O0-NEXT:# kill: def $rax killed $eax
-; MSVC-X64-O0-NEXT:leaq 15(,%rax,4), %rax
-; MSVC-X64-O0-NEXT:andq $-16, %rax
-; MSVC-X64-O0-NEXT:callq __chkstk
-; MSVC-X64-O0-NEXT:subq %rax, %rsp
-; MSVC-X64-O0-NEXT:movq %rsp, %rcx
-; MSVC-X64-O0-NEXT:subq $32, %rsp
-; MSVC-X64-O0-NEXT:callq escape
-; MSVC-X64-O0-NEXT:addq $32, %rsp
-; MSVC-X64-O0-NEXT:movq -8(%rbp), %rcx
-; MSVC-X64-O0-NEXT:xorq %rbp, %rcx
-; MSVC-X64-O0-NEXT:subq $32, %rsp
-; MSVC-X64-O0-NEXT:callq __security_check_cookie
-; MSVC-X64-O0-NEXT:movq %rbp, %rsp
-; MSVC-X64-O0-NEXT:popq %rbp
-; MSVC-X64-O0-NEXT:retq
   %vla = alloca i32, i32 %n
   call void @escape(ptr %vla)
   ret void
 }
 
+; MSVC-X86-LABEL: _test_vla:
+; MSVC-X86: pushl %ebp
+; MSVC-X86: movl %esp, %ebp
+; MSVC-X86: movl ___security_cookie, %[[REG1:[^ ]*]]
+; MSVC-X86: xorl %ebp, %[[REG1]]
+; MSVC-X86: movl %[[REG1]], [[SLOT:-[0-9]*]](%ebp)
+; MSVC-X86: calll __chkstk
+; MSVC-X86: pushl
+; MSVC-X86: calll _escape
+; MSVC-X86: movl [[SLOT]](%ebp), %ecx
+; MSVC-X86: xorl %ebp, %ecx
+; MSVC-X86: calll @__security_check_cookie@4
+; MSVC-X86: movl %ebp, %esp
+; MSVC-X86: popl %ebp
+; MSVC-X86: retl
+
+; MSVC-X64-LABEL: test_vla:
+; MSVC-X64: pushq %rbp
+; MSVC-X64: subq $16, %rsp
+; MSVC-X64: leaq 16(%rsp), %rbp
+; MSVC-X64: movq __security_cookie(%rip), %[[REG1:[^ ]*]]
+; MSVC-X64: xorq %rbp, %[[REG1]]
+; MSVC-X64: movq %[[REG1]], [[SLOT:-[0-9]*]](%rbp)
+; MSVC-X64: callq __chkstk
+; MSVC-X64: callq escape
+; MSVC-X64: movq [[SLOT]](%rbp), %rcx
+; MSVC-X64: xorq %rbp, %rcx
+; MSVC-X64: callq __security_check_cookie
+; MSVC-X64: retq

RKSimon wrote:

These look to have been manually re-added instead of using the 
update_llc_test_checks script

https://github.com/llvm/llvm-project/pull/95904

[clang] [llvm] [Pipelines] Move IPSCCP after inliner pipeline (PR #96620)

2024-06-27 Thread Yingwei Zheng via cfe-commits

dtcxzyw wrote:

> This patch causes some significant performance regressions on llvm-test-suite 
> (rv64gc-O3-thinlto):
> 
> | Name | Before | After | Ratio |
> |---|---|---|---|
> | SingleSource/Benchmarks/Shootout/Shootout-random | 2.150161677 | 3.300161641 | +53.5% |
> | SingleSource/Benchmarks/Polybench/linear-algebra/kernels/trisolv/trisolv | 0.111845159 | 0.145389494 | +30.0% |
> | SingleSource/Benchmarks/Adobe-C++/functionobjects | 5.489498263 | 6.827863965 | +24.4% |

It has been fixed. But this patch didn't show a positive net effect :(
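
For reference, the Ratio column in the quoted results appears to be the relative change of the After value against the Before value. A small sketch of that arithmetic follows; `percentChange` and `roughlyEqual` are hypothetical helper names, and the formula is inferred from the numbers rather than stated in the thread.

```cpp
// Relative change of `after` versus `before`, in percent.
constexpr double percentChange(double before, double after) {
  return (after / before - 1.0) * 100.0;
}

// True when a and b differ by less than tol.
constexpr bool roughlyEqual(double a, double b, double tol) {
  return (a > b ? a - b : b - a) < tol;
}

// Shootout-random: 2.150161677 -> 3.300161641 is reported as +53.5%.
static_assert(roughlyEqual(percentChange(2.150161677, 3.300161641), 53.5, 0.05),
              "matches the reported ratio after rounding");
```

The same formula reproduces the other two rows (+30.0% and +24.4%) to within rounding.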



https://github.com/llvm/llvm-project/pull/96620


[clang] [llvm] [X86][CodeGen] security check cookie execute only when needed (PR #95904)

2024-06-27 Thread Simon Pilgrim via cfe-commits


@@ -82,6 +94,8 @@ define void @tailcall_unrelated_frame() sspreq {
 ; LINUX-NEXT:  .LBB1_2: # %CallStackCheckFailBlk
 ; LINUX-NEXT:.cfi_def_cfa_offset 16
 ; LINUX-NEXT:callq __stack_chk_fail@PLT
+
+

RKSimon wrote:

superfluous

https://github.com/llvm/llvm-project/pull/95904


[clang] [llvm] [llvm][AArch64] Move Apple aliases into the CpuAlias map (PR #96249)

2024-06-27 Thread Tomas Matheson via cfe-commits

https://github.com/tmatheson-arm approved this pull request.

LGTM, I've added some thoughts but it's fine as it is.

https://github.com/llvm/llvm-project/pull/96249


[clang] [llvm] [llvm][AArch64] Move Apple aliases into the CpuAlias map (PR #96249)

2024-06-27 Thread Tomas Matheson via cfe-commits

https://github.com/tmatheson-arm edited 
https://github.com/llvm/llvm-project/pull/96249


[clang] [llvm] [llvm][AArch64] Move Apple aliases into the CpuAlias map (PR #96249)

2024-06-27 Thread Tomas Matheson via cfe-commits


@@ -5,11 +5,11 @@
 
 // RUN: not %clang_cc1 -triple arm64--- -target-cpu not-a-cpu -fsyntax-only %s 
2>&1 | FileCheck %s --check-prefix AARCH64
 // AARCH64: error: unknown target CPU 'not-a-cpu'
-// AARCH64-NEXT: note: valid target CPU values are: generic, cortex-a35, 
cortex-a34, cortex-a53, cortex-a55, cortex-a510, cortex-a520, cortex-a520ae, 
cortex-a57, cortex-a65, cortex-a65ae, cortex-a72, cortex-a73, cortex-a75, 
cortex-a76, cortex-a76ae, cortex-a77, cortex-a78, cortex-a78ae, cortex-a78c, 
cortex-a710, cortex-a715, cortex-a720, cortex-a720ae, cortex-a725, cortex-r82, 
cortex-r82ae, cortex-x1, cortex-x1c, cortex-x2, cortex-x3, cortex-x4, 
cortex-x925, neoverse-e1, neoverse-n1, neoverse-n2, neoverse-n3, 
neoverse-512tvb, neoverse-v1, neoverse-v2, neoverse-v3, neoverse-v3ae, 
exynos-m3, exynos-m4, exynos-m5, falkor, saphira, kryo, thunderx, thunderxt88, 
thunderxt81, thunderxt83, thunderx2t99, thunderx3t110, tsv110, cyclone, 
apple-a7, apple-a8, apple-a9, apple-a10, apple-a11, apple-a12, apple-s4, 
apple-s5, apple-a13, apple-a14, apple-m1, apple-a15, apple-m2, apple-a16, 
apple-m3, apple-a17, apple-m4, a64fx, carmel, ampere1, ampere1a, ampere1b, 
oryon-1, cobalt-100, grace{{$}}
+// AARCH64-NEXT: note: valid target CPU values are: a64fx, ampere1, ampere1a, 
ampere1b, apple-a10, apple-a11, apple-a12, apple-a13, apple-a14, apple-a15, 
apple-a16, apple-a17, apple-a7, apple-a8, apple-a9, apple-m1, apple-m2, 
apple-m3, apple-m4, apple-s4, apple-s5, carmel, cobalt-100, cortex-a34, 
cortex-a35, cortex-a510, cortex-a520, cortex-a520ae, cortex-a53, cortex-a55, 
cortex-a57, cortex-a65, cortex-a65ae, cortex-a710, cortex-a715, cortex-a72, 
cortex-a720, cortex-a720ae, cortex-a725, cortex-a73, cortex-a75, cortex-a76, 
cortex-a76ae, cortex-a77, cortex-a78, cortex-a78ae, cortex-a78c, cortex-r82, 
cortex-r82ae, cortex-x1, cortex-x1c, cortex-x2, cortex-x3, cortex-x4, 
cortex-x925, cyclone, exynos-m3, exynos-m4, exynos-m5, falkor, generic, grace, 
kryo, neoverse-512tvb, neoverse-e1, neoverse-n1, neoverse-n2, neoverse-n3, 
neoverse-v1, neoverse-v2, neoverse-v3, neoverse-v3ae, oryon-1, saphira, 
thunderx, thunderx2t99, thunderx3t110, thunderxt81, thunderxt83, thunderxt88, 
tsv110{{$}}

tmatheson-arm wrote:

Split into multiple lines?

https://github.com/llvm/llvm-project/pull/96249


[clang] [llvm] [llvm][AArch64] Move Apple aliases into the CpuAlias map (PR #96249)

2024-06-27 Thread Tomas Matheson via cfe-commits


@@ -304,8 +304,21 @@ struct Alias {
   StringRef Name;
 };
 
-inline constexpr Alias CpuAliases[] = {{"cobalt-100", "neoverse-n2"},
-   {"grace", "neoverse-v2"}};
+inline constexpr Alias CpuAliases[] = {
+{"cobalt-100", "neoverse-n2"},
+{"grace", "neoverse-v2"},
+// Support cyclone as an alias for apple-a7 so we can still LTO old 
bitcode.

tmatheson-arm wrote:

If you really want this to work only for bitcode (and not appear on `-mcpu`), 
could it be handled in the bitcode importer? Same for "apple-latest"?

https://github.com/llvm/llvm-project/pull/96249


[clang] [llvm] [llvm][AArch64] Move Apple aliases into the CpuAlias map (PR #96249)

2024-06-27 Thread Tomas Matheson via cfe-commits


@@ -88,10 +88,14 @@ StringRef AArch64::getArchExtFeature(StringRef ArchExt) {
 
 void AArch64::fillValidCPUArchList(SmallVectorImpl<StringRef> &Values) {
   for (const auto &C : CpuInfos)
-  Values.push_back(C.Name);
+Values.push_back(C.Name);
 
   for (const auto &Alias : CpuAliases)
-Values.push_back(Alias.AltName);
+// The apple-latest alias is backend only, do not expose it to clang's 
-mcpu.
+if (Alias.AltName != "apple-latest")

tmatheson-arm wrote:

I don't love this special case, but I'm not sure what to do about it.
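
One conceivable way to avoid comparing against a name is to mark such aliases in the table itself. The sketch below is purely illustrative and is not proposed anywhere in this thread: the `BackendOnly` field, the `userVisibleAliases` function, and the `placeholder-cpu` target are all assumptions (the patch's `Alias` struct has only `AltName` and `Name`, and the excerpt does not show what `apple-latest` maps to).

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical alias record. BackendOnly is an assumption for
// illustration; the struct in the patch has only AltName and Name.
struct Alias {
  const char *AltName;
  const char *Name;
  bool BackendOnly;
};

constexpr Alias CpuAliases[] = {
    {"cobalt-100", "neoverse-n2", false},
    {"grace", "neoverse-v2", false},
    {"cyclone", "apple-a7", false},
    // Placeholder target: the excerpt does not show the real mapping.
    {"apple-latest", "placeholder-cpu", true},
};

// Collects the alias names that would be offered as -mcpu values,
// skipping any entry flagged as backend-only.
std::vector<std::string> userVisibleAliases() {
  std::vector<std::string> Values;
  for (const Alias &A : CpuAliases)
    if (!A.BackendOnly)
      Values.push_back(A.AltName);
  return Values;
}
```

With this table, `userVisibleAliases()` yields `cobalt-100`, `grace`, and `cyclone`, and the filter no longer depends on a string literal.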

https://github.com/llvm/llvm-project/pull/96249


[clang] [flang] [Flang-new][OpenMP] Add offload related flags for AMDGPU (PR #96742)

2024-06-27 Thread Andrzej Warzyński via cfe-commits


@@ -333,6 +333,9 @@ void Flang::AddAMDGPUTargetArgs(const ArgList &Args,
 StringRef Val = A->getValue();
 CmdArgs.push_back(Args.MakeArgString("-mcode-object-version=" + Val));
   }
+
+  const ToolChain &TC = getToolChain();
+  TC.addClangTargetOptions(Args, CmdArgs, Action::OffloadKind::OFK_OpenMP);

banach-space wrote:

> Clang does not verify how we use these flags and it accepts them for non-GPU 
> target.

It's OK to make Flang "stricter" if we believe that's the right thing to do ;-) 
(I think that generating useful error/warning messages like "don't mix these 
flags - that's not supported" would be a good thing)

> IMO can be reused between Flang and Clang

Are there any plans to extract that logic and share it somewhere?

> I don't know if Nvidia also want to reuse their toolchain between Clang and 
> Flang to fully support OpenMP offloading.

Who could be the right person to ask?

https://github.com/llvm/llvm-project/pull/96742


[clang] [llvm] [llvm][AArch64] Move Apple aliases into the CpuAlias map (PR #96249)

2024-06-27 Thread Tomas Matheson via cfe-commits


@@ -304,8 +304,21 @@ struct Alias {
   StringRef Name;
 };
 
-inline constexpr Alias CpuAliases[] = {{"cobalt-100", "neoverse-n2"},
-   {"grace", "neoverse-v2"}};
+inline constexpr Alias CpuAliases[] = {

tmatheson-arm wrote:

We should tablegen this too.

https://github.com/llvm/llvm-project/pull/96249


[clang] [clang][analyzer] Improve documentation of checker 'cplusplus.Move' (NFC) (PR #96295)

2024-06-27 Thread Balázs Kéri via cfe-commits

balazske wrote:

I fixed a test that contained the entire option help description. I think this 
is not needed, so I removed it and kept only the first line of the description.

https://github.com/llvm/llvm-project/pull/96295


[clang] [llvm] [PAC][ELF][AArch64] Encode signed GOT flag in PAuth core info (PR #96159)

2024-06-27 Thread Daniil Kovalev via cfe-commits

https://github.com/kovdan01 updated 
https://github.com/llvm/llvm-project/pull/96159

>From 4eeb1b4e82941681b6cafda8579d136e3e7cb09f Mon Sep 17 00:00:00 2001
From: Daniil Kovalev 
Date: Tue, 18 Jun 2024 15:37:18 +0300
Subject: [PATCH 1/2] [PAC][ELF][AArch64] Encode signed GOT flag in PAuth core
 info

Treat 7th bit of version value for llvm_linux platform as signed GOT flag.

- clang: define `PointerAuthELFGOT` LangOption and set 7th bit of
  `aarch64-elf-pauthabi-version` LLVM module flag correspondingly;

- llvm-readobj: print `PointerAuthELFGOT` or `!PointerAuthELFGOT` in version
  description of llvm_linux platform depending on whether the flag is set.
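
A minimal sketch of the bit encoding described above, not the patch's own code.
The bit positions are taken from the ELF.h hunk in this patch; the shortened
enumerator names and the helpers `setFlag`, `hasFlag` and `example` are
hypothetical and exist only for illustration. Note that "7th bit" here means
bit position 7 (mask 0x80), counting from zero.

```cpp
#include <cassert>
#include <cstdint>

// Bit positions as listed in the ELF.h hunk (names shortened for the sketch).
enum : unsigned {
  kVPtrAddrDiscr = 4,
  kVPtrTypeDiscr = 5,
  kInitFini = 6,
  kGOT = 7, // the new signed GOT flag
};

// Returns Version with the given bit set when Enabled is true.
constexpr std::uint64_t setFlag(std::uint64_t Version, unsigned Bit,
                                bool Enabled) {
  return Version | (static_cast<std::uint64_t>(Enabled) << Bit);
}

// Returns true when the given bit is set in Version.
constexpr bool hasFlag(std::uint64_t Version, unsigned Bit) {
  return ((Version >> Bit) & 1u) != 0;
}

// Hypothetical usage: 0x55 is the version value used in the updated test.
inline void example() {
  std::uint64_t Version = 0x55;
  assert(!hasFlag(Version, kGOT));   // reported as !PointerAuthELFGOT
  Version = setFlag(Version, kGOT, true);
  assert(Version == 0xD5);           // 0x55 | 0x80
  assert(hasFlag(Version, kGOT));
}
```

With the flag clear, the version value 0x55 from the test stays unchanged,
which is consistent with the test expectation only gaining a
`!PointerAuthELFGOT` entry.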
---
 clang/include/clang/Basic/LangOptions.def  |  1 +
 clang/lib/CodeGen/CodeGenModule.cpp|  6 --
 llvm/include/llvm/BinaryFormat/ELF.h   |  3 ++-
 .../AArch64/note-gnu-property-elf-pauthabi.ll  |  2 +-
 .../ELF/AArch64/aarch64-feature-pauth.s| 18 +-
 llvm/tools/llvm-readobj/ELFDumper.cpp  |  3 ++-
 6 files changed, 19 insertions(+), 14 deletions(-)

diff --git a/clang/include/clang/Basic/LangOptions.def 
b/clang/include/clang/Basic/LangOptions.def
index 6dd6b5614f44c..bc99dad5cd55e 100644
--- a/clang/include/clang/Basic/LangOptions.def
+++ b/clang/include/clang/Basic/LangOptions.def
@@ -168,6 +168,7 @@ LANGOPT(PointerAuthAuthTraps, 1, 0, "pointer authentication failure traps")
 LANGOPT(PointerAuthVTPtrAddressDiscrimination, 1, 0, "incorporate address discrimination in authenticated vtable pointers")
 LANGOPT(PointerAuthVTPtrTypeDiscrimination, 1, 0, "incorporate type discrimination in authenticated vtable pointers")
 LANGOPT(PointerAuthInitFini, 1, 0, "sign function pointers in init/fini arrays")
+LANGOPT(PointerAuthELFGOT, 1, 0, "authenticate pointers from GOT")
 
 LANGOPT(DoubleSquareBracketAttributes, 1, 0, "'[[]]' attributes extension for all language standard modes")
 LANGOPT(ExperimentalLateParseAttributes, 1, 0, "experimental late parsing of attributes")
diff --git a/clang/lib/CodeGen/CodeGenModule.cpp 
b/clang/lib/CodeGen/CodeGenModule.cpp
index dd4a665ebc78b..feac291e01b50 100644
--- a/clang/lib/CodeGen/CodeGenModule.cpp
+++ b/clang/lib/CodeGen/CodeGenModule.cpp
@@ -1210,8 +1210,10 @@ void CodeGenModule::Release() {
   (LangOpts.PointerAuthVTPtrTypeDiscrimination
<< AARCH64_PAUTH_PLATFORM_LLVM_LINUX_VERSION_VPTRTYPEDISCR) |
   (LangOpts.PointerAuthInitFini
-   << AARCH64_PAUTH_PLATFORM_LLVM_LINUX_VERSION_INITFINI);
-  static_assert(AARCH64_PAUTH_PLATFORM_LLVM_LINUX_VERSION_INITFINI ==
+   << AARCH64_PAUTH_PLATFORM_LLVM_LINUX_VERSION_INITFINI) |
+  (LangOpts.PointerAuthELFGOT
+   << AARCH64_PAUTH_PLATFORM_LLVM_LINUX_VERSION_GOT);
+  static_assert(AARCH64_PAUTH_PLATFORM_LLVM_LINUX_VERSION_GOT ==
 AARCH64_PAUTH_PLATFORM_LLVM_LINUX_VERSION_LAST,
 "Update when new enum items are defined");
   if (PAuthABIVersion != 0) {
diff --git a/llvm/include/llvm/BinaryFormat/ELF.h 
b/llvm/include/llvm/BinaryFormat/ELF.h
index dfba180149916..2aa37bbed6656 100644
--- a/llvm/include/llvm/BinaryFormat/ELF.h
+++ b/llvm/include/llvm/BinaryFormat/ELF.h
@@ -1774,8 +1774,9 @@ enum : unsigned {
   AARCH64_PAUTH_PLATFORM_LLVM_LINUX_VERSION_VPTRADDRDISCR = 4,
   AARCH64_PAUTH_PLATFORM_LLVM_LINUX_VERSION_VPTRTYPEDISCR = 5,
   AARCH64_PAUTH_PLATFORM_LLVM_LINUX_VERSION_INITFINI = 6,
+  AARCH64_PAUTH_PLATFORM_LLVM_LINUX_VERSION_GOT = 7,
   AARCH64_PAUTH_PLATFORM_LLVM_LINUX_VERSION_LAST =
-  AARCH64_PAUTH_PLATFORM_LLVM_LINUX_VERSION_INITFINI,
+  AARCH64_PAUTH_PLATFORM_LLVM_LINUX_VERSION_GOT,
 };
 
 // x86 processor feature bits.
diff --git a/llvm/test/CodeGen/AArch64/note-gnu-property-elf-pauthabi.ll 
b/llvm/test/CodeGen/AArch64/note-gnu-property-elf-pauthabi.ll
index 728cffeba02a2..fb69a12b2f906 100644
--- a/llvm/test/CodeGen/AArch64/note-gnu-property-elf-pauthabi.ll
+++ b/llvm/test/CodeGen/AArch64/note-gnu-property-elf-pauthabi.ll
@@ -27,7 +27,7 @@
 ; OBJ: Displaying notes found in: .note.gnu.property
 ; OBJ-NEXT:   Owner Data size  Description
 ; OBJ-NEXT:   GNU   0x0018 NT_GNU_PROPERTY_TYPE_0 (property note)
-; OBJ-NEXT:   AArch64 PAuth ABI core info: platform 0x1002 (llvm_linux), version 0x55 (PointerAuthIntrinsics, !PointerAuthCalls, PointerAuthReturns, !PointerAuthAuthTraps, PointerAuthVTPtrAddressDiscrimination, !PointerAuthVTPtrTypeDiscrimination, PointerAuthInitFini)
+; OBJ-NEXT:   AArch64 PAuth ABI core info: platform 0x1002 (llvm_linux), version 0x55 (PointerAuthIntrinsics, !PointerAuthCalls, PointerAuthReturns, !PointerAuthAuthTraps, PointerAuthVTPtrAddressDiscrimination, !PointerAuthVTPtrTypeDiscrimination, PointerAuthInitFini, !PointerAuthELFGOT)
 
 ; ERR: either both or no 'aarch64-elf-pauthabi-platform' and 'aarch64-elf-pauthabi-version' module flags must be present
 
diff --git a/llvm/test/tools/llvm-readobj/ELF/AArch64/

[clang] [llvm] [Pipelines] Move IPSCCP after inliner pipeline (PR #96620)

2024-06-27 Thread via cfe-commits

goldsteinn wrote:

> > This patch causes some significant performance regressions on 
> > llvm-test-suite (rv64gc-O3-thinlto):
> >
> > | Name | Before | After | Ratio |
> > | --- | --- | --- | --- |
> > | SingleSource/Benchmarks/Shootout/Shootout-random | 2.150161677 | 3.300161641 | +53.5% |
> > | SingleSource/Benchmarks/Polybench/linear-algebra/kernels/trisolv/trisolv | 0.111845159 | 0.145389494 | +30.0% |
> > | SingleSource/Benchmarks/Adobe-C++/functionobjects | 5.489498263 | 6.827863965 | +24.4% |
> 
> It has been fixed. But this patch didn't show a positive net effect :(

Does that mean it has a negative net effect, or is it neutral (in which case the 
original motivating case should be enough)?

https://github.com/llvm/llvm-project/pull/96620


[clang] [compiler-rt] [XRay] Add support for instrumentation of DSOs on x86_64 (PR #90959)

2024-06-27 Thread Sebastian Kreutzer via cfe-commits

sebastiankreutzer wrote:

> > @androm3da @MaskRay I'm tagging you because I'm having trouble to get 
> > feedback to this PR, and you seem to be the most recent contributors to 
> > XRay. Would one of you be willing to review it? Any other pointers on who 
> > to get in touch with are also much appreciated.
> 
> I'm happy to take a look - but I'm traveling this week and won't be able to 
> until this weekend.

That'd be great! There is no rush.

https://github.com/llvm/llvm-project/pull/90959


[clang] [flang] [Flang-new][OpenMP] Add offload related flags for AMDGPU (PR #96742)

2024-06-27 Thread Andrzej Warzyński via cfe-commits

banach-space wrote:

> Clang for AMDGPU supports OpenMP and 
> [HIP](https://clang.llvm.org/docs/HIPSupport.html) and it reuses the same 
> code. For example `-fcuda-is-device` flag needs to be checked for [legacy HIP 
> host 
> code](https://github.com/llvm/llvm-project/blob/2033b1cf16f040e1369d8efba8439dcd3e36ed31/clang/lib/Basic/Targets/AMDGPU.cpp#L278).
>  

Thanks! I'm still puzzled though:

> In the future it will be needed for Flang equivalent functions: 
> AMDGPUTargetCodeGenInfo::getGlobalVarAddressSpace 
> AMDGPUTargetInfo::getTargetDefines

Why would `-fcuda-is-device` be required? From your link I gather that the AMD 
logic in Clang simply makes sure that `-fcuda-is-device` wasn't used?

> I would like to reuse the same part of the AMD GPU toolchain for Flang.

That would be great - what's the plan here then? Simply to rely on the code in 
Clang? Also, note that that's `TargetInfo` (which lives in `clangBasic`) rather 
than `Toolchain` (that lives in `clangDriver`). This is actually key because it 
makes the coupling between Flang and Clang even stronger. 

https://github.com/llvm/llvm-project/pull/96742


[clang] [clang][NFC] Use range-based for loops (PR #96831)

2024-06-27 Thread Vlad Serebrennikov via cfe-commits

https://github.com/Endilll edited 
https://github.com/llvm/llvm-project/pull/96831


[clang] [clang][NFC] Use range-based for loops (PR #96831)

2024-06-27 Thread Vlad Serebrennikov via cfe-commits

https://github.com/Endilll commented:

This looks good overall, but I have minor suggestions.

https://github.com/llvm/llvm-project/pull/96831


[clang] [clang][NFC] Use range-based for loops (PR #96831)

2024-06-27 Thread Vlad Serebrennikov via cfe-commits


@@ -2056,40 +2056,40 @@ void CXXRecordDecl::completeDefinition() {
   completeDefinition(nullptr);
 }
 
+static bool hasPureVirtualFinalOverrider(
+const CXXRecordDecl &RD, const CXXFinalOverriderMap *FinalOverriders) {
+  auto ExistsIn = [](const CXXFinalOverriderMap &FinalOverriders) {
+for (const auto &[_, M] : FinalOverriders) {
+  for (const auto &[_, SO] : M) {

Endilll wrote:

While we're at it, can we use more descriptive names than `M` and `SO`? I'm not 
even sure what the latter means. I also have reservations about `auto` here.
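
To illustrate the naming point, here is a minimal standalone sketch of nested
structured bindings with descriptive names. It does not use Clang's
`CXXFinalOverriderMap`; the types `Overrider`, `OverridersPerSubobject` and
`FinalOverriderTable` and all function names are hypothetical stand-ins chosen
for this example only.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-ins for the real Clang types.
struct Overrider {
  std::string Name;
  bool IsPure;
};
using OverridersPerSubobject = std::map<unsigned, std::vector<Overrider>>;
using FinalOverriderTable = std::map<std::string, OverridersPerSubobject>;

// Returns true if any overrider in the table is pure.
bool hasPureOverrider(const FinalOverriderTable &Table) {
  for (const auto &[MethodName, PerSubobject] : Table) {
    for (const auto &[SubobjectIndex, Overriders] : PerSubobject) {
      for (const Overrider &Candidate : Overriders)
        if (Candidate.IsPure)
          return true;
    }
  }
  return false;
}

// Builds a small table with one pure and one non-pure overrider.
FinalOverriderTable makeExampleTable() {
  FinalOverriderTable Table;
  Table["draw"][0].push_back({"Shape::draw", true});
  Table["name"][0].push_back({"Circle::name", false});
  return Table;
}

// Hypothetical usage of the helpers above.
inline void usageExample() {
  assert(hasPureOverrider(makeExampleTable()));
  assert(!hasPureOverrider(FinalOverriderTable{}));
}
```

Naming every binding makes the shape of the nested map visible at the loop
header, at the cost of slightly longer lines than the `_` placeholders.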

https://github.com/llvm/llvm-project/pull/96831


[clang] [llvm] [Pipelines] Move IPSCCP after inliner pipeline (PR #96620)

2024-06-27 Thread Florian Hahn via cfe-commits

https://github.com/fhahn commented:

Running IPSCCP twice seems like quite a heavy hammer; I'd expect a noticeable 
compile-time impact.

I'd recommend trying to extract a reproducer from your motivating use case and 
checking why IPSCCP cannot perform the desired optimization before inlining. Note 
that I think we run SCCP after inlining, which is the non-IP version of IPSCCP.

https://github.com/llvm/llvm-project/pull/96620
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Clang] Extend lifetime bound analysis to support assignments (PR #96475)

2024-06-27 Thread Gábor Horváth via cfe-commits


@@ -964,11 +966,26 @@ static bool pathOnlyInitializesGslPointer(IndirectLocalPath &Path) {
   return false;
 }
 
-void checkExprLifetime(Sema &SemaRef, const InitializedEntity &Entity,
+void checkExprLifetime(Sema &SemaRef, const CheckingEntity &CEntity,
Expr *Init) {
-  LifetimeResult LR = getEntityLifetime(&Entity);
-  LifetimeKind LK = LR.getInt();
-  const InitializedEntity *ExtendingEntity = LR.getPointer();
+  LifetimeKind LK = LK_FullExpression;
+
+  const AssignedEntity *AEntity = nullptr;
+  // Local variables for initialized entity.
+  const InitializedEntity *InitEntity = nullptr;
+  const InitializedEntity *ExtendingEntity = nullptr;
+  if (auto IEntityP = std::get_if<const InitializedEntity *>(&CEntity)) {
+InitEntity = *IEntityP;
+auto LTResult = getEntityLifetime(InitEntity);
+LK = LTResult.getInt();
+ExtendingEntity = LTResult.getPointer();
+  } else if (auto AEntityP = std::get_if<const AssignedEntity *>(&CEntity)) {
+AEntity = *AEntityP;
+if (AEntity->LHS->getType()->isPointerType()) // builtin pointer type
+  LK = LK_Extended;

Xazax-hun wrote:

I am a bit confused here, could you elaborate why we want `LK_Extended` here? 
As far as I remember, assignments are not doing lifetime extension. 
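
For reference, a small standalone sketch of the general language rule in
question (it says nothing about how the patch itself should classify the
case): binding a temporary to a const reference at initialization extends its
lifetime, while assigning a pointer obtained from a temporary does not. The
`Tracker` type, the `DestroyedCount` counter and the helper functions are
hypothetical and exist only for this illustration.

```cpp
#include <cassert>

// Counts destructor runs so the example can observe when temporaries die.
int DestroyedCount = 0;

struct Tracker {
  ~Tracker() { ++DestroyedCount; }
};

// Returns the address of its argument; the caller must not outlive it.
const Tracker *addressOf(const Tracker &T) { return &T; }

// Initialization of a const reference: the temporary lives until Ref's
// scope ends, so nothing has been destroyed when the count is read.
int destroyedAfterReferenceInit() {
  DestroyedCount = 0;
  const Tracker &Ref = Tracker{};
  (void)Ref;
  return DestroyedCount;
}

// Assignment to a pointer: the temporary is destroyed at the end of the
// full expression, so Ptr dangles and must never be dereferenced.
int destroyedAfterPointerAssignment() {
  DestroyedCount = 0;
  const Tracker *Ptr = nullptr;
  Ptr = addressOf(Tracker{});
  (void)Ptr;
  return DestroyedCount;
}

// Hypothetical usage of the two helpers above.
inline void usageExample() {
  assert(destroyedAfterReferenceInit() == 0);
  assert(destroyedAfterPointerAssignment() == 1);
}
```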

https://github.com/llvm/llvm-project/pull/96475


[clang] [Clang] Extend lifetime bound analysis to support assignments (PR #96475)

2024-06-27 Thread Gábor Horváth via cfe-commits

https://github.com/Xazax-hun edited 
https://github.com/llvm/llvm-project/pull/96475

