[clang] a1ad988 - [clang] Make header self-contained. NFC.
Author: Benjamin Kramer
Date: 2024-06-27T09:21:37+02:00
New Revision: a1ad98813006cefcdf88336db3f81a15b6bf36fb

URL: https://github.com/llvm/llvm-project/commit/a1ad98813006cefcdf88336db3f81a15b6bf36fb
DIFF: https://github.com/llvm/llvm-project/commit/a1ad98813006cefcdf88336db3f81a15b6bf36fb.diff

LOG: [clang] Make header self-contained. NFC.

Added:

Modified:
clang/include/clang/Basic/Thunk.h

Removed:

diff --git a/clang/include/clang/Basic/Thunk.h b/clang/include/clang/Basic/Thunk.h
index af4afb2d2ac4d..8ff7603e0094d 100644
--- a/clang/include/clang/Basic/Thunk.h
+++ b/clang/include/clang/Basic/Thunk.h
@@ -21,6 +21,7 @@ namespace clang {
 class CXXMethodDecl;
+class Type;
 /// A return adjustment.
 struct ReturnAdjustment {

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
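For readers unfamiliar with why a single forward declaration can make a header self-contained, here is a minimal sketch. It assumes the header only needs `Type` through a pointer; `sketch::AdjustmentInfo` and its fields are hypothetical stand-ins and are not the contents of Thunk.h.

```cpp
#include <cassert>
#include <cstdint>

namespace sketch {

// Forward declaration: sufficient for pointers and references, because the
// compiler does not need the size or the members of Type at this point.
class Type;

// Hypothetical struct (not from Thunk.h) that stores a pointer to Type.
struct AdjustmentInfo {
  std::int64_t NonVirtual = 0;
  const Type *ThisType = nullptr; // a pointer to an incomplete type is fine

  bool isEmpty() const { return NonVirtual == 0 && ThisType == nullptr; }
};

} // namespace sketch
```

Without the `class Type;` line, a translation unit that includes only this header would fail to compile unless some earlier include happened to declare `Type`.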
[clang] [compiler-rt] [llvm] [openmp] [PGO][Offload] Profile profraw generation for GPU instrumentation #76587 (PR #93365)
https://github.com/EthanLuisMcDonough edited https://github.com/llvm/llvm-project/pull/93365
[libcxx] [libcxxabi] [libunwind] [llvm] [runtimes] remove workaround for old CMake when setting `--unwindlib=none` (PR #93429)
https://github.com/h-vetinari updated https://github.com/llvm/llvm-project/pull/93429

>From 8c1b899aa174b107fece1edbf99eaf261bdea516 Mon Sep 17 00:00:00 2001
From: Martin Storsjö
Date: Mon, 25 Apr 2022 09:45:22 +0300
Subject: [PATCH 01/11] [runtimes] [CMake] Use CMAKE_REQUIRED_LINK_OPTIONS to simplify handling of the --unwindlib=none option

This avoids passing the option unnecessarily to compilation commands
(where it causes warnings).

This fails in practice with libunwind, where setting
CMAKE_TRY_COMPILE_TARGET_TYPE to STATIC_LIBRARY breaks it, as the option
from CMAKE_REQUIRED_LINK_OPTIONS ends up passed to the "ar" tool too.
---
 libunwind/CMakeLists.txt | 3 +++
 runtimes/CMakeLists.txt | 22 +-
 2 files changed, 4 insertions(+), 21 deletions(-)

diff --git a/libunwind/CMakeLists.txt b/libunwind/CMakeLists.txt
index b22ade0a7d71e..3d2fadca9d2ec 100644
--- a/libunwind/CMakeLists.txt
+++ b/libunwind/CMakeLists.txt
@@ -221,9 +221,12 @@ add_cxx_compile_flags_if_supported(-EHsc)
 # This leads to libunwind not being built with this flag, which makes
 # libunwind quite useless in this setup.
 set(_previous_CMAKE_TRY_COMPILE_TARGET_TYPE ${CMAKE_TRY_COMPILE_TARGET_TYPE})
+set(_previous_CMAKE_REQUIRED_LINK_OPTIONS ${CMAKE_REQUIRED_LINK_OPTIONS})
 set(CMAKE_TRY_COMPILE_TARGET_TYPE STATIC_LIBRARY)
+set(CMAKE_REQUIRED_LINK_OPTIONS)
 add_compile_flags_if_supported(-funwind-tables)
 set(CMAKE_TRY_COMPILE_TARGET_TYPE ${_previous_CMAKE_TRY_COMPILE_TARGET_TYPE})
+set(CMAKE_REQUIRED_LINK_OPTIONS ${_previous_CMAKE_REQUIRED_LINK_OPTIONS})
 if (LIBUNWIND_USES_ARM_EHABI AND NOT CXX_SUPPORTS_FUNWIND_TABLES_FLAG)
 message(SEND_ERROR "The -funwind-tables flag must be supported "
diff --git a/runtimes/CMakeLists.txt b/runtimes/CMakeLists.txt
index 24f4851169591..8f909322c9a98 100644
--- a/runtimes/CMakeLists.txt
+++ b/runtimes/CMakeLists.txt
@@ -116,27 +116,7 @@ filter_prefixed("${CMAKE_ASM_IMPLICIT_INCLUDE_DIRECTORIES}" ${LLVM_BINARY_DIR} C
 # brittle. We should ideally move this to runtimes/CMakeLists.txt.
 llvm_check_compiler_linker_flag(C "--unwindlib=none" CXX_SUPPORTS_UNWINDLIB_EQ_NONE_FLAG)
 if (CXX_SUPPORTS_UNWINDLIB_EQ_NONE_FLAG)
- set(ORIG_CMAKE_REQUIRED_FLAGS "${CMAKE_REQUIRED_FLAGS}")
- set(CMAKE_REQUIRED_FLAGS "${CMAKE_REQUIRED_FLAGS} --unwindlib=none")
- # TODO: When we can require CMake 3.14, we should use
- # CMAKE_REQUIRED_LINK_OPTIONS here. Until then, we need a workaround:
- # When using CMAKE_REQUIRED_FLAGS, this option gets added both to
- # compilation and linking commands. That causes warnings in the
- # compilation commands during cmake tests. This is normally benign, but
- # when testing whether -Werror works, that test fails (due to the
- # preexisting warning).
- #
- # Therefore, before we can use CMAKE_REQUIRED_LINK_OPTIONS, check if we
- # can use --start-no-unused-arguments to silence the warnings about
- # --unwindlib=none during compilation.
- #
- # We must first add --unwindlib=none to CMAKE_REQUIRED_FLAGS above, to
- # allow this subsequent test to succeed, then rewrite CMAKE_REQUIRED_FLAGS
- # below.
- check_c_compiler_flag("--start-no-unused-arguments" C_SUPPORTS_START_NO_UNUSED_ARGUMENTS)
- if (C_SUPPORTS_START_NO_UNUSED_ARGUMENTS)
-set(CMAKE_REQUIRED_FLAGS "${ORIG_CMAKE_REQUIRED_FLAGS} --start-no-unused-arguments --unwindlib=none --end-no-unused-arguments")
- endif()
+ list(APPEND CMAKE_REQUIRED_LINK_OPTIONS "--unwindlib=none")
 endif()
 # Disable use of the installed C++ standard library when building runtimes.

>From 816e9e6d81ac12537879406e0495fc80394a1a66 Mon Sep 17 00:00:00 2001
From: "H. Vetinari"
Date: Thu, 20 Jun 2024 23:18:51 +1100
Subject: [PATCH 02/11] add comment (and CMake issue reference) about incompatible options

---
 libunwind/CMakeLists.txt | 4
 1 file changed, 4 insertions(+)

diff --git a/libunwind/CMakeLists.txt b/libunwind/CMakeLists.txt
index 3d2fadca9d2ec..d84f8fa6ff954 100644
--- a/libunwind/CMakeLists.txt
+++ b/libunwind/CMakeLists.txt
@@ -220,6 +220,10 @@ add_cxx_compile_flags_if_supported(-EHsc)
 #
 # This leads to libunwind not being built with this flag, which makes
 # libunwind quite useless in this setup.
+#
+# NOTE: we need to work around https://gitlab.kitware.com/cmake/cmake/-/issues/23454
+# because CMAKE_REQUIRED_LINK_OPTIONS (c.f. CXX_SUPPORTS_UNWINDLIB_EQ_NONE_FLAG)
+# is incompatible with CMAKE_TRY_COMPILE_TARGET_TYPE==STATIC_LIBRARY.
 set(_previous_CMAKE_TRY_COMPILE_TARGET_TYPE ${CMAKE_TRY_COMPILE_TARGET_TYPE})
 set(_previous_CMAKE_REQUIRED_LINK_OPTIONS ${CMAKE_REQUIRED_LINK_OPTIONS})
 set(CMAKE_TRY_COMPILE_TARGET_TYPE STATIC_LIBRARY)

>From 3f917d22bdcd8b398cf7162563547418a056ecec Mon Sep 17 00:00:00 2001
From: "H. Vetinari"
Date: Thu, 20 Jun 2024 23:18:51 +1100
Subject: [PATCH 03/11] [cmake] move check for `-fno-exceptions` to "safe zone" w.r.t. interference between CMAKE_REQUIRED_LINK_OPTIONS and static libraries

---
 libunwind/CMakeLists.txt |
[clang] [analyzer][NFC] Use ArrayRef for input parameters (PR #93203)
@@ -672,7 +672,7 @@ class StdLibraryFunctionsChecker
 StringRef getNote() const { return Note; }
 };
- using ArgTypes = std::vector>;
+ using ArgTypes = ArrayRef>;

steakhal wrote:

One can argue the same the other way around, just like for pointers.

https://github.com/llvm/llvm-project/pull/93203
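The hunk above swaps an owning container for a non-owning view. As a rough illustration of that trade-off, here is a minimal sketch; `IntView` and `sum` are hypothetical names, and `IntView` is only a simplified stand-in for a view type such as `llvm::ArrayRef`, not the checker's code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified stand-in for a non-owning view: it stores a
// pointer and a length, never the elements themselves.
struct IntView {
  const int *Data = nullptr;
  std::size_t Size = 0;

  IntView() = default;
  // Implicit on purpose, so a vector can be passed where a view is expected.
  IntView(const std::vector<int> &V) : Data(V.data()), Size(V.size()) {}

  const int *begin() const { return Data; }
  const int *end() const { return Data + Size; }
};

// Taking the view by value copies two words, not the caller's elements.
inline int sum(IntView Values) {
  int Total = 0;
  for (int V : Values)
    Total += V;
  return Total;
}
```

A view is cheap to pass as an input parameter, but it must not outlive the container it points into; that lifetime question is presumably what makes a view-typed alias worth discussing in review.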
[libcxx] [libcxxabi] [libunwind] [llvm] [runtimes] remove workaround for old CMake when setting `--unwindlib=none` (PR #93429)
h-vetinari wrote:

So I've been trying to follow down the rabbit hole of the failing flag checks, and it seems the combination of `CXX_SUPPORTS_UNWINDLIB_EQ_NONE_FLAG` plus https://gitlab.kitware.com/cmake/cmake/-/issues/23454 has a wider blast radius than anticipated.

I'm not claiming that adding the `_previous_CMAKE_{REQUIRED_LINK_OPTIONS,TRY_COMPILE_TARGET_TYPE}` dance everywhere is the right approach here, but it was - so far - the obvious path to just try to get things green again. It's conceivable though that it would be easier to simply shift the detection of `CXX_SUPPORTS_UNWINDLIB_EQ_NONE_FLAG` until after the other flag checks have been performed? 🤔

https://github.com/llvm/llvm-project/pull/93429
[clang] [llvm] [OpenMP] Fix stack corruption due to argument mismatch (PR #96386)
https://github.com/sushgokh updated https://github.com/llvm/llvm-project/pull/96386

>From af4dc96c25f32b477337cedaeb0a696f75840ac0 Mon Sep 17 00:00:00 2001
From: sgokhale
Date: Sat, 22 Jun 2024 17:16:24 +0530
Subject: [PATCH] [OpenMP] Fix stack corruption due to argument mismatch

While lowering (#pragma omp target update from), clang's generated
.omp_task_entry. sets up 9 arguments when calling
__tgt_target_data_update_nowait_mapper.

At the same time, in __tgt_target_data_update_nowait_mapper, the call to
targetData() is converted to a sibcall assuming it has the argument count
listed in the signature.

The AArch64 asm sequence for this is as follows (unrelated insns removed):

.omp_task_entry..108:
 sub sp, sp, #32
 stp x29, x30, sp, #16 // 16-byte Folded Spill
 add x29, sp, #16
 str x8, sp, #8 // stack canary
 str xzr, [sp]
 bl __tgt_target_data_update_nowait_mapper

__tgt_target_data_update_nowait_mapper:
 sub sp, sp, #32
 stp x29, x30, sp, #16 // 16-byte Folded Spill
 add x29, sp, #16
 str x8, sp, #8 // stack canary
 // Sibcall argument setup
 adrp x8, :got:_Z16targetDataUpdateP7ident_tR8DeviceTyiPPvS4_PlS5_S4_S4_R11AsyncInfoTyb
 ldr x8, [x8, :got_lo12:_Z16targetDataUpdateP7ident_tR8DeviceTyiPPvS4_PlS5_S4_S4_R11AsyncInfoTyb]
 stp x9, x8, x29, #16
 adrp x8, .L.str.8
 add x8, x8, :lo12:.L.str.8
 str x8, x29, #32 <== This is the insn that erases $fp
 ldp x29, x30, sp, #16 // 16-byte Folded Reload
 add sp, sp, #32
 // Sibcall
 b ZL10targetDataI22TaskAsyncInfoWrapperTyEvP7ident_tliPPvS4_PlS5_S4_S4_PFiS2_R8DeviceTyiS4_S4_S5_S5_S4_S4_R11AsyncInfoTybEPKcSD

On AArch64, the call to __tgt_target_data_update_nowait_mapper in
.omp_task_entry. sets up only a single space on the stack, and this results
in overwriting $fp and subsequent stack corruption.

This issue can be attributed to a discrepancy in the
__tgt_target_data_update_nowait_mapper signature: the declaration in
openmp/libomptarget/include/omptarget.h takes 13 arguments, while
clang/lib/CodeGen/CGOpenMPRuntime.cpp and
llvm/include/llvm/Frontend/OpenMP/OMPKinds.def use only 9 arguments.

This patch modifies the __tgt_target_data_update_nowait_mapper signature to
match the .omp_task_entry usage (and the other 2 files mentioned above).

Co-authored-by: Kugan Vivekanandarajah
---
 clang/lib/CodeGen/CGOpenMPRuntime.cpp | 28 +++--
 .../include/llvm/Frontend/OpenMP/OMPKinds.def | 30 ---
 2 files changed, 44 insertions(+), 14 deletions(-)

diff --git a/clang/lib/CodeGen/CGOpenMPRuntime.cpp b/clang/lib/CodeGen/CGOpenMPRuntime.cpp
index f6d12d46cfc07..fc3ad533666ca 100644
--- a/clang/lib/CodeGen/CGOpenMPRuntime.cpp
+++ b/clang/lib/CodeGen/CGOpenMPRuntime.cpp
@@ -10343,6 +10343,23 @@ void CGOpenMPRuntime::emitTargetDataStandAloneCall(
 MapNamesArray,
 InputInfo.MappersArray.emitRawPointer(CGF)};
+// Nowait calls have header declarations that take 13 arguments. Hence, the
+// divergence from the OffloadingArgs definition.
+llvm::Value *NowaitOffloadingArgs[] = {
+RTLoc,
+DeviceID,
+PointerNum,
+InputInfo.BasePointersArray.emitRawPointer(CGF),
+InputInfo.PointersArray.emitRawPointer(CGF),
+InputInfo.SizesArray.emitRawPointer(CGF),
+MapTypesArray,
+MapNamesArray,
+InputInfo.MappersArray.emitRawPointer(CGF),
+llvm::Constant::getNullValue(CGF.Int32Ty),
+llvm::Constant::getNullValue(CGF.VoidPtrTy),
+llvm::Constant::getNullValue(CGF.Int32Ty),
+llvm::Constant::getNullValue(CGF.VoidPtrTy)};
+
 // Select the right runtime function call for each standalone
 // directive.
 const bool HasNowait = D.hasClausesOfKind();
@@ -10430,9 +10447,14 @@ void CGOpenMPRuntime::emitTargetDataStandAloneCall(
 llvm_unreachable("Unexpected standalone target data directive.");
 break;
 }
-CGF.EmitRuntimeCall(
-OMPBuilder.getOrCreateRuntimeFunction(CGM.getModule(), RTLFn),
-OffloadingArgs);
+if (HasNowait)
+ CGF.EmitRuntimeCall(
+ OMPBuilder.getOrCreateRuntimeFunction(CGM.getModule(), RTLFn),
+ NowaitOffloadingArgs);
+else
+ CGF.EmitRuntimeCall(
+ OMPBuilder.getOrCreateRuntimeFunction(CGM.getModule(), RTLFn),
+ OffloadingArgs);
 };
 auto &&TargetThenGen = [this, &ThenGen, &D, &InputInfo, &MapTypesArray,
diff --git a/llvm/include/llvm/Frontend/OpenMP/OMPKinds.def b/llvm/include/llvm/Frontend/OpenMP/OMPKinds.def
index fe09bb8177c28..ebd928470109a 100644
--- a/llvm/include/llvm/Frontend/OpenMP/OMPKinds.def
+++ b/llvm/include/llvm/Frontend/OpenMP/OMPKinds.def
@@ -438,19 +438,22 @@ __OMP_RTL(__tgt_target_kernel_nowait, false, Int32, IdentPtr, Int64, Int32,
 Int32, VoidPtr, KernelArgsPtr, Int32, VoidPtr, Int32, VoidPtr)
 __OMP_RTL(__tgt_target_data_begin_mapper, false, Void, IdentPtr, Int64, Int32,
 VoidPtrPtr, VoidPtrPtr, Int64Ptr, Int64Ptr, VoidPtrPtr, VoidPtrPtr)
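To make the 9-versus-13 argument split in this patch easier to follow, here is a small model of the selection logic. It is only a sketch: arguments are represented by descriptive strings, the function names are hypothetical, and the four trailing entries are labelled by type only because the patch itself passes them as null values.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical model: each runtime-call argument is represented by a label.
using ArgList = std::vector<std::string>;

// The nine arguments used by the blocking entry points.
inline ArgList makeOffloadingArgs() {
  return {"RTLoc",        "DeviceID",   "PointerNum",
          "BasePointers", "Pointers",   "Sizes",
          "MapTypes",     "MapNames",   "Mappers"};
}

// The nowait entry points are declared with four more parameters, so the
// caller appends four null placeholders (i32, ptr, i32, ptr) to match.
inline ArgList makeNowaitOffloadingArgs() {
  ArgList Args = makeOffloadingArgs();
  const ArgList Extra = {"null i32", "null ptr", "null i32", "null ptr"};
  Args.insert(Args.end(), Extra.begin(), Extra.end());
  return Args;
}

// Mirrors the if/else in the patch: pick the list that matches the callee.
inline ArgList selectArgs(bool HasNowait) {
  return HasNowait ? makeNowaitOffloadingArgs() : makeOffloadingArgs();
}
```

The point of the sketch is that caller and callee must agree on the argument count; when they disagree, the callee may treat caller stack slots as incoming arguments, which is the corruption the commit message describes.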
[clang] Support `guarded_by` attribute and related attributes inside C structs and support late parsing them (PR #95455)
pdherbemont wrote:

I think this still needs review from @delcypher and @rapidsna

https://github.com/llvm/llvm-project/pull/95455
[clang] [Driver] Support using toolchain libc and libc++ for baremetal (PR #96736)
https://github.com/petrhosek updated https://github.com/llvm/llvm-project/pull/96736

>From db5ae584cc00717d667d423a99d71a8d3ac46805 Mon Sep 17 00:00:00 2001
From: Petr Hosek
Date: Mon, 10 Jun 2024 20:27:52 +
Subject: [PATCH 1/2] [Driver] Support using toolchain libc and libc++ for baremetal

We want to support using a complete Clang/LLVM toolchain that includes
LLVM libc and libc++ for baremetal targets. To do so, we need the driver
to add the necessary include paths.
---
 clang/include/clang/Driver/ToolChain.h| 3 +
 clang/lib/Driver/ToolChain.cpp| 6 ++
 clang/lib/Driver/ToolChains/BareMetal.cpp | 63 ---
 .../Inputs/basic_baremetal_tree/bin/.keep | 0
 .../include/armv6m-unknown-none-eabi/.keep| 0
 .../armv6m-unknown-none-eabi/c++/v1/.keep | 0
 .../basic_baremetal_tree/include/c++/v1/.keep | 0
 .../lib/armv6m-unknown-none-eabi/.keep| 0
 clang/test/Driver/baremetal.cpp | 16 -
 9 files changed, 78 insertions(+), 10 deletions(-)
 create mode 100644 clang/test/Driver/Inputs/basic_baremetal_tree/bin/.keep
 create mode 100644 clang/test/Driver/Inputs/basic_baremetal_tree/include/armv6m-unknown-none-eabi/.keep
 create mode 100644 clang/test/Driver/Inputs/basic_baremetal_tree/include/armv6m-unknown-none-eabi/c++/v1/.keep
 create mode 100644 clang/test/Driver/Inputs/basic_baremetal_tree/include/c++/v1/.keep
 create mode 100644 clang/test/Driver/Inputs/basic_baremetal_tree/lib/armv6m-unknown-none-eabi/.keep

diff --git a/clang/include/clang/Driver/ToolChain.h b/clang/include/clang/Driver/ToolChain.h
index 1f93bd612e9b0..ece1384d5d3c0 100644
--- a/clang/include/clang/Driver/ToolChain.h
+++ b/clang/include/clang/Driver/ToolChain.h
@@ -526,6 +526,9 @@ class ToolChain {
 // Returns target specific standard library path if it exists.
 std::optional getStdlibPath() const;
+ // Returns target specific standard library include path if it exists.
+ std::optional getStdlibIncludePath() const;
+
 // Returns /lib// or /lib/.
 // This is used by runtimes (such as OpenMP) to find arch-specific libraries.
 virtual path_list getArchSpecificLibPaths() const;
diff --git a/clang/lib/Driver/ToolChain.cpp b/clang/lib/Driver/ToolChain.cpp
index 40ab2e91125d1..04021cc0a8f3f 100644
--- a/clang/lib/Driver/ToolChain.cpp
+++ b/clang/lib/Driver/ToolChain.cpp
@@ -811,6 +811,12 @@ std::optional ToolChain::getStdlibPath() const {
 return getTargetSubDirPath(P);
 }
+std::optional ToolChain::getStdlibIncludePath() const {
+ SmallString<128> P(D.Dir);
+ llvm::sys::path::append(P, "..", "include");
+ return getTargetSubDirPath(P);
+}
+
 ToolChain::path_list ToolChain::getArchSpecificLibPaths() const {
 path_list Paths;
diff --git a/clang/lib/Driver/ToolChains/BareMetal.cpp b/clang/lib/Driver/ToolChains/BareMetal.cpp
index dd365e62e084e..4eb333efe2314 100644
--- a/clang/lib/Driver/ToolChains/BareMetal.cpp
+++ b/clang/lib/Driver/ToolChains/BareMetal.cpp
@@ -270,15 +270,19 @@ void BareMetal::AddClangSystemIncludeArgs(const ArgList &DriverArgs,
 addSystemInclude(DriverArgs, CC1Args, Dir.str());
 }
- if (!DriverArgs.hasArg(options::OPT_nostdlibinc)) {
-const SmallString<128> SysRoot(computeSysRoot());
-if (!SysRoot.empty()) {
- for (const Multilib &M : getOrderedMultilibs()) {
-SmallString<128> Dir(SysRoot);
-llvm::sys::path::append(Dir, M.includeSuffix());
-llvm::sys::path::append(Dir, "include");
-addSystemInclude(DriverArgs, CC1Args, Dir.str());
- }
+ if (DriverArgs.hasArg(options::OPT_nostdlibinc))
+return;
+
+ if (std::optional Path = getStdlibIncludePath())
+addSystemInclude(DriverArgs, CC1Args, *Path);
+
+ const SmallString<128> SysRoot(computeSysRoot());
+ if (!SysRoot.empty()) {
+for (const Multilib &M : getOrderedMultilibs()) {
+ SmallString<128> Dir(SysRoot);
+ llvm::sys::path::append(Dir, M.includeSuffix());
+ llvm::sys::path::append(Dir, "include");
+ addSystemInclude(DriverArgs, CC1Args, Dir.str());
 }
 }
 }
@@ -296,6 +300,47 @@ void BareMetal::AddClangCXXStdlibIncludeArgs(const ArgList &DriverArgs,
 return;
 const Driver &D = getDriver();
+ std::string Target = getTripleString();
+
+ auto AddCXXIncludePath = [&](StringRef Path) {
+std::string Version = detectLibcxxVersion(Path);
+if (Version.empty())
+ return;
+
+// First add the per-target multilib include dir.
+if (!SelectedMultilibs.empty() && !SelectedMultilibs.back().isDefault()) {
+ const Multilib &M = SelectedMultilibs.back();
+ SmallString<128> TargetDir(Path);
+ llvm::sys::path::append(TargetDir, Target, M.gccSuffix(), "c++", Version);
+ if (getVFS().exists(TargetDir)) {
+addSystemInclude(DriverArgs, CC1Args, TargetDir);
+ }
+}
+
+// Second add the per-target include dir.
+SmallString<128> TargetDir(Path);
+llvm::sys::path::append(TargetDir, Target, "c++", Version);
+if
[clang] [Driver] Support using toolchain libc and libc++ for baremetal (PR #96736)
@@ -296,6 +300,47 @@ void BareMetal::AddClangCXXStdlibIncludeArgs(const ArgList &DriverArgs,
 return;
 const Driver &D = getDriver();
+ std::string Target = getTripleString();
+
+ auto AddCXXIncludePath = [&](StringRef Path) {
+std::string Version = detectLibcxxVersion(Path);
+if (Version.empty())
+ return;
+
+// First add the per-target multilib include dir.
+if (!SelectedMultilibs.empty() && !SelectedMultilibs.back().isDefault()) {
+ const Multilib &M = SelectedMultilibs.back();
+ SmallString<128> TargetDir(Path);
+ llvm::sys::path::append(TargetDir, Target, M.gccSuffix(), "c++", Version);
+ if (getVFS().exists(TargetDir)) {
+addSystemInclude(DriverArgs, CC1Args, TargetDir);
+ }

petrhosek wrote:

Done

https://github.com/llvm/llvm-project/pull/96736
[clang] [Clang] Access tls_guard via llvm.threadlocal.address (PR #96633)
@@ -1059,9 +1059,15 @@ CodeGenFunction::GenerateCXXGlobalInitFunc(llvm::Function *Fn,
 if (Guard.isValid()) {
 // If we have a guard variable, check whether we've already performed
 // these initializations. This happens for TLS initialization functions.
- llvm::Value *GuardVal = Builder.CreateLoad(Guard);
- llvm::Value *Uninit = Builder.CreateIsNull(GuardVal,
- "guard.uninitialized");
+ Address GuardAddr = Guard;

nikola-tesic-ns wrote:

The `Guard` is a `ConstantAddress`, so I cannot change it; that's why I introduced a new variable. If you have a suggestion, I would be happy to adapt the code.

https://github.com/llvm/llvm-project/pull/96633
[clang] [Driver] Support using toolchain libc and libc++ for baremetal (PR #96736)
@@ -296,6 +300,47 @@ void BareMetal::AddClangCXXStdlibIncludeArgs(const ArgList &DriverArgs,
 return;
 const Driver &D = getDriver();
+ std::string Target = getTripleString();
+
+ auto AddCXXIncludePath = [&](StringRef Path) {

petrhosek wrote:

No, we also need `this` for `detectLibcxxVersion`, `DriverArgs` and `CC1Args`.

https://github.com/llvm/llvm-project/pull/96736
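The reply above concerns what the `[&]` lambda has to capture. The following is a minimal sketch of the same situation outside of clang; `IncludeCollector`, `detectVersion` and `collect` are hypothetical names, and `detectVersion` only stands in for a member helper such as `detectLibcxxVersion`.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical class, used only to illustrate the capture question.
class IncludeCollector {
  std::vector<std::string> Dirs; // member state, reached through `this`

  // Hypothetical stand-in for a member helper; returns "" when nothing is found.
  std::string detectVersion(const std::string &Path) const {
    return Path.empty() ? "" : "v1";
  }

public:
  std::size_t collect(const std::vector<std::string> &Roots,
                      const std::string &Target) {
    // [&] captures the local Target by reference and `this` implicitly.
    // Capturing only Target would not compile, because the body calls a
    // member function and modifies a data member.
    auto Add = [&](const std::string &Path) {
      std::string Version = detectVersion(Path);
      if (Version.empty())
        return;
      Dirs.push_back(Path + "/" + Target + "/c++/" + Version);
    };
    for (const std::string &Root : Roots)
      Add(Root);
    return Dirs.size();
  }

  const std::vector<std::string> &dirs() const { return Dirs; }
};
```

An explicit capture list such as `[this, &Target]` would also work here; the default capture is simply shorter when several locals and members are used.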
[clang] [Driver] Support using toolchain libc and libc++ for baremetal (PR #96736)
@@ -296,6 +300,47 @@ void BareMetal::AddClangCXXStdlibIncludeArgs(const ArgList &DriverArgs,
 return;
 const Driver &D = getDriver();
+ std::string Target = getTripleString();
+
+ auto AddCXXIncludePath = [&](StringRef Path) {
+std::string Version = detectLibcxxVersion(Path);
+if (Version.empty())
+ return;
+
+// First add the per-target multilib include dir.
+if (!SelectedMultilibs.empty() && !SelectedMultilibs.back().isDefault()) {
+ const Multilib &M = SelectedMultilibs.back();
+ SmallString<128> TargetDir(Path);
+ llvm::sys::path::append(TargetDir, Target, M.gccSuffix(), "c++", Version);

petrhosek wrote:

Done.

https://github.com/llvm/llvm-project/pull/96736
[clang] [Driver] Support using toolchain libc and libc++ for baremetal (PR #96736)
@@ -296,6 +300,47 @@ void BareMetal::AddClangCXXStdlibIncludeArgs(const ArgList &DriverArgs,
 return;
 const Driver &D = getDriver();
+ std::string Target = getTripleString();
+
+ auto AddCXXIncludePath = [&](StringRef Path) {
+std::string Version = detectLibcxxVersion(Path);
+if (Version.empty())
+ return;
+
+// First add the per-target multilib include dir.
+if (!SelectedMultilibs.empty() && !SelectedMultilibs.back().isDefault()) {
+ const Multilib &M = SelectedMultilibs.back();
+ SmallString<128> TargetDir(Path);
+ llvm::sys::path::append(TargetDir, Target, M.gccSuffix(), "c++", Version);
+ if (getVFS().exists(TargetDir)) {
+addSystemInclude(DriverArgs, CC1Args, TargetDir);
+ }
+}
+
+// Second add the per-target include dir.
+SmallString<128> TargetDir(Path);
+llvm::sys::path::append(TargetDir, Target, "c++", Version);
+if (getVFS().exists(TargetDir))
+ addSystemInclude(DriverArgs, CC1Args, TargetDir);
+
+// Third the generic one.
+SmallString<128> Dir(Path);

petrhosek wrote:

Done.

https://github.com/llvm/llvm-project/pull/96736
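The three comments in this hunk (multilib dir first, per-target dir second, generic dir third) describe a search order. Here is a sketch of that ordering using plain strings. `joinPath` and `cxxIncludeCandidates` are hypothetical helpers, the layout of the generic directory (`<root>/c++/<version>`) is an assumption because the hunk is cut off at that point, and the existence checks the driver performs through its VFS are left out.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: appends non-empty components separated by '/'.
inline std::string joinPath(std::string Base,
                            const std::vector<std::string> &Parts) {
  for (const std::string &P : Parts) {
    if (P.empty())
      continue;
    if (!Base.empty() && Base.back() != '/')
      Base += '/';
    Base += P;
  }
  return Base;
}

// Returns candidate libc++ include directories, most specific first.
// An empty MultilibSuffix models "no non-default multilib selected".
inline std::vector<std::string>
cxxIncludeCandidates(const std::string &Root, const std::string &Target,
                     const std::string &MultilibSuffix,
                     const std::string &Version) {
  std::vector<std::string> Out;
  if (!MultilibSuffix.empty()) // 1. per-target multilib dir
    Out.push_back(joinPath(Root, {Target, MultilibSuffix, "c++", Version}));
  Out.push_back(joinPath(Root, {Target, "c++", Version})); // 2. per-target dir
  Out.push_back(joinPath(Root, {"c++", Version}));         // 3. generic dir (assumed layout)
  return Out;
}
```

Ordering matters because earlier system include directories take precedence, so target-specific headers such as a per-target `__config_site` would be found before the generic ones.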
[clang] [Clang] Access tls_guard via llvm.threadlocal.address (PR #96633)
@@ -1070,13 +1076,26 @@ CodeGenFunction::GenerateCXXGlobalInitFunc(llvm::Function *Fn,
 // Mark as initialized before initializing anything else. If the
 // initializers use previously-initialized thread_local vars, that's
 // probably supposed to be OK, but the standard doesn't say.
- Builder.CreateStore(llvm::ConstantInt::get(GuardVal->getType(),1), Guard);
-
- // The guard variable can't ever change again.
- EmitInvariantStart(
- Guard.getPointer(),
- CharUnits::fromQuantity(
- CGM.getDataLayout().getTypeAllocSize(GuardVal->getType(;
+ if (auto *GV = dyn_cast(Guard.getPointer()))
+// Get the thread-local address via intrinsic.
+if (GV->isThreadLocal())
+ GuardAddr = GuardAddr.withPointer(
+ Builder.CreateThreadLocalAddress(GV), NotKnownNonNull);
+ Builder.CreateStore(llvm::ConstantInt::get(GuardVal->getType(), 1),
+ GuardAddr);
+
+ // Emit invariant start for TLS guard address.
+ if (CGM.getCodeGenOpts().OptimizationLevel > 0) {
+uint64_t Width =
+CGM.getDataLayout().getTypeAllocSize(GuardVal->getType());
+llvm::Value *TLSAddr = Guard.getPointer();
+if (auto *GV = dyn_cast(Guard.getPointer()))
+ // Get the thread-local address via intrinsic.
+ if (GV->isThreadLocal())
+TLSAddr = Builder.CreateThreadLocalAddress(GV);
+Builder.CreateInvariantStart(

nikola-tesic-ns wrote:

I am not sure I understood this, sorry.

https://github.com/llvm/llvm-project/pull/96633
[clang] [Clang] Access tls_guard via llvm.threadlocal.address (PR #96633)
@@ -1059,9 +1059,15 @@ CodeGenFunction::GenerateCXXGlobalInitFunc(llvm::Function *Fn,
 if (Guard.isValid()) {
 // If we have a guard variable, check whether we've already performed
 // these initializations. This happens for TLS initialization functions.
- llvm::Value *GuardVal = Builder.CreateLoad(Guard);
- llvm::Value *Uninit = Builder.CreateIsNull(GuardVal,
- "guard.uninitialized");
+ Address GuardAddr = Guard;
+ if (auto *GV = dyn_cast(Guard.getPointer()))

nikola-tesic-ns wrote:

There is a code pattern where this "guarded initialization" is done for a non-TLS var ([partitions.cpp test](https://github.com/nextsilicon/next-llvm-project/blob/b36811cc9baf1c72de2fa1c8b5d8fc30bae9a15c/clang/test/CodeGenCXX/partitions.cpp)). That's the reason I've introduced these checks.

https://github.com/llvm/llvm-project/pull/96633
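For context, the "guarded initialization" this thread keeps referring to can be written by hand at the source level roughly as follows. This is a sketch of the pattern only, not the IR clang emits; `Guard`, `InitCount` and `tlsInit` are hypothetical names.

```cpp
#include <cassert>

namespace sketch {

// Per-thread guard: false until this thread has run its initializers.
thread_local bool Guard = false;
// Stand-in for the effect of the real initializers.
thread_local int InitCount = 0;

// Models a TLS init function: safe to call any number of times.
inline void tlsInit() {
  if (Guard)
    return;     // already initialized on this thread
  Guard = true; // mark as initialized before running any initializer
  ++InitCount;  // the actual initializers would run here
}

} // namespace sketch
```

Setting the guard before running the initializers matches the comment in the hunk ("Mark as initialized before initializing anything else"), so a re-entrant call made from an initializer returns immediately.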
[clang] [Clang] Access tls_guard via llvm.threadlocal.address (PR #96633)
@@ -1059,9 +1059,15 @@ CodeGenFunction::GenerateCXXGlobalInitFunc(llvm::Function *Fn,
 if (Guard.isValid()) {
 // If we have a guard variable, check whether we've already performed
 // these initializations. This happens for TLS initialization functions.
- llvm::Value *GuardVal = Builder.CreateLoad(Guard);
- llvm::Value *Uninit = Builder.CreateIsNull(GuardVal,
- "guard.uninitialized");
+ Address GuardAddr = Guard;

ChuanqiXu9 wrote:

OK, I didn't look into the context.

https://github.com/llvm/llvm-project/pull/96633
[clang] [Clang] Access tls_guard via llvm.threadlocal.address (PR #96633)
@@ -1070,13 +1076,26 @@ CodeGenFunction::GenerateCXXGlobalInitFunc(llvm::Function *Fn,
 // Mark as initialized before initializing anything else. If the
 // initializers use previously-initialized thread_local vars, that's
 // probably supposed to be OK, but the standard doesn't say.
- Builder.CreateStore(llvm::ConstantInt::get(GuardVal->getType(),1), Guard);
-
- // The guard variable can't ever change again.
- EmitInvariantStart(
- Guard.getPointer(),
- CharUnits::fromQuantity(
- CGM.getDataLayout().getTypeAllocSize(GuardVal->getType(;
+ if (auto *GV = dyn_cast(Guard.getPointer()))
+// Get the thread-local address via intrinsic.
+if (GV->isThreadLocal())
+ GuardAddr = GuardAddr.withPointer(
+ Builder.CreateThreadLocalAddress(GV), NotKnownNonNull);
+ Builder.CreateStore(llvm::ConstantInt::get(GuardVal->getType(), 1),
+ GuardAddr);
+
+ // Emit invariant start for TLS guard address.
+ if (CGM.getCodeGenOpts().OptimizationLevel > 0) {
+uint64_t Width =
+CGM.getDataLayout().getTypeAllocSize(GuardVal->getType());
+llvm::Value *TLSAddr = Guard.getPointer();
+if (auto *GV = dyn_cast(Guard.getPointer()))
+ // Get the thread-local address via intrinsic.
+ if (GV->isThreadLocal())
+TLSAddr = Builder.CreateThreadLocalAddress(GV);
+Builder.CreateInvariantStart(

ChuanqiXu9 wrote:

I mean, it used `EmitInvariantStart` but now it uses `CreateInvariantStart`. (Sorry, I meant to use API)

https://github.com/llvm/llvm-project/pull/96633
[clang] Run PreStmt/PostStmt checker for GCCAsmStmt (PR #95409)
https://github.com/T-Gruber edited https://github.com/llvm/llvm-project/pull/95409
[libcxx] [libcxxabi] [libunwind] [llvm] [runtimes] remove workaround for old CMake when setting `--unwindlib=none` (PR #93429)
mstorsjo wrote:

> So I've been trying to follow down the rabbit hole of the failing flag
> checks, and it seems the combination of `CXX_SUPPORTS_UNWINDLIB_EQ_NONE_FLAG`
> plus https://gitlab.kitware.com/cmake/cmake/-/issues/23454 has a wider blast
> radius than anticipated.
>
> I'm not claiming that adding the
> `_previous_CMAKE_{REQUIRED_LINK_OPTIONS,TRY_COMPILE_TARGET_TYPE}` dance
> everywhere is the right approach here, but it was - so far - the obvious path
> to just try to get things green again. It's conceivable though that it would
> be easier to simply shift the detection of
> `CXX_SUPPORTS_UNWINDLIB_EQ_NONE_FLAG` until after the other flag checks have
> been performed? 🤔

That's probably not possible... The point is that when bootstrapping a new sysroot from scratch (i.e. building the initial libunwind etc), in a configuration where libunwind is linked in automatically, every test that tries to do linking will fail (as it implicitly tries to link in libunwind, which does not exist yet). Therefore, we need to add `--unwindlib=none` as an additional linker flag, as soon as possible, so that all following cmake checks will get the right result.

Also, in general, setting `CMAKE_TRY_COMPILE_TARGET_TYPE` to `STATIC_LIBRARY` in too wide a context will also give false positive checks, for cases where we intentionally want to check whether linking some library works and is found.

But perhaps the way you do it here, adding it in a narrow context only when doing specific checks, is the right way? I'm not sure...

So that cmake issue seems to be really, really unfortunate here. :-( I wonder if the cure is worse than the disease here - and if it would be better to just keep what we have now - and simplify it only if cmake adds something like `CMAKE_REQUIRED_DYNAMIC_LINK_OPTIONS` or so.

https://github.com/llvm/llvm-project/pull/93429
[clang] [Clang] Access tls_guard via llvm.threadlocal.address (PR #96633)
@@ -1070,13 +1076,26 @@ CodeGenFunction::GenerateCXXGlobalInitFunc(llvm::Function *Fn,
 // Mark as initialized before initializing anything else. If the
 // initializers use previously-initialized thread_local vars, that's
 // probably supposed to be OK, but the standard doesn't say.
- Builder.CreateStore(llvm::ConstantInt::get(GuardVal->getType(),1), Guard);
-
- // The guard variable can't ever change again.
- EmitInvariantStart(
- Guard.getPointer(),
- CharUnits::fromQuantity(
- CGM.getDataLayout().getTypeAllocSize(GuardVal->getType(;
+ if (auto *GV = dyn_cast(Guard.getPointer()))
+// Get the thread-local address via intrinsic.
+if (GV->isThreadLocal())
+ GuardAddr = GuardAddr.withPointer(
+ Builder.CreateThreadLocalAddress(GV), NotKnownNonNull);
+ Builder.CreateStore(llvm::ConstantInt::get(GuardVal->getType(), 1),
+ GuardAddr);
+
+ // Emit invariant start for TLS guard address.
+ if (CGM.getCodeGenOpts().OptimizationLevel > 0) {
+uint64_t Width =
+CGM.getDataLayout().getTypeAllocSize(GuardVal->getType());
+llvm::Value *TLSAddr = Guard.getPointer();
+if (auto *GV = dyn_cast(Guard.getPointer()))
+ // Get the thread-local address via intrinsic.
+ if (GV->isThreadLocal())
+TLSAddr = Builder.CreateThreadLocalAddress(GV);
+Builder.CreateInvariantStart(

nikola-tesic-ns wrote:

Ok, well `EmitInvariantStart` expects a Constant value, which `TLSAddr` cannot be if we are going to set it conditionally. (But maybe it should be set unconditionally.)

https://github.com/llvm/llvm-project/pull/96633
[libcxx] [libcxxabi] [libunwind] [llvm] [runtimes] remove workaround for old CMake when setting `--unwindlib=none` (PR #93429)
h-vetinari wrote:

> So that cmake issue seems to be really, really unfortunate here. :-( I wonder
> if the cure is worse than the disease here [...]

Yup, that's a distinct possibility IMO...

> [...] and if it would be better to just keep what we have now - and simplify
> it only if cmake adds something like `CMAKE_REQUIRED_DYNAMIC_LINK_OPTIONS` or
> so.

It would probably make sense to report back on the CMake issue how big the fallout from this is? Perhaps the CMake devs would reconsider, or at least take it as an indicator for the necessity of `CMAKE_REQUIRED_DYNAMIC_LINK_OPTIONS`?

I think you understand the problem space much better than me (I'm mostly stumbling around in a dark room TBH), so if you could do that that would be great!

https://github.com/llvm/llvm-project/pull/93429
[clang] [flang] [llvm] Re-land: "[AArch64] Add ability to list extensions enabled for a target" (#95805) (PR #96795)
@@ -343,7 +350,9 @@ bool isX18ReservedByDefault(const Triple &TT);
 // themselves, they are sequential (0, 1, 2, 3, ...).
 uint64_t getCpuSupportsMask(ArrayRef FeatureStrs);

-void PrintSupportedExtensions(StringMap DescMap);
+void PrintSupportedExtensions();
+
+void printEnabledExtensions(std::set EnabledFeatureNames);

DavidSpickett wrote:

Might as well be const correct too if you're changing it.

https://github.com/llvm/llvm-project/pull/96795
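One reading of the "const correct" remark is that the set should be taken by const reference rather than by value, so the callee neither copies nor mutates the caller's data. The sketch below illustrates that idea only; `formatEnabledExtensions` is a hypothetical stand-in, and `std::string` is an assumed element type because the template argument of `std::set` did not survive in the quoted hunk.

```cpp
#include <set>
#include <sstream>
#include <string>

// Hypothetical stand-in for a function like printEnabledExtensions.
// Taking the set by const reference avoids a copy at every call site and
// documents that the callee does not modify the caller's set.
// std::string is an assumption; the real element type is not shown above.
std::string formatEnabledExtensions(
    const std::set<std::string> &EnabledFeatureNames) {
  std::ostringstream OS;
  // std::set iterates in sorted order, so the output is deterministic.
  for (const std::string &Name : EnabledFeatureNames)
    OS << Name << '\n';
  return OS.str();
}
```

Returning a string instead of writing to a stream is only a convenience for checking the result; it is not what the patch does.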
[clang] [Clang][AST] Let DeclPrinter print trailing requires expressions for template parameters (PR #96864)
https://github.com/zyn0217 created https://github.com/llvm/llvm-project/pull/96864 As discussed in https://github.com/llvm/llvm-project/pull/96084#discussion_r1654629993, it would be nice to present these trailing constraints on template parameters when printing CTAD decls through a DeclPrinter. >From a5c33bd413d8150d1688240c6b5253b1760cafe1 Mon Sep 17 00:00:00 2001 From: Younan Zhang Date: Thu, 27 Jun 2024 15:59:48 +0800 Subject: [PATCH] [Clang][AST] Let DeclPrinter print trailing requires expressions for template parameters As discussed in https://github.com/llvm/llvm-project/pull/96084#discussion_r1654629993, it would be nice to present these trailing constraints on template parameters when printing CTAD decls through a DeclPrinter. --- clang/docs/ReleaseNotes.rst| 1 + clang/lib/AST/DeclPrinter.cpp | 10 ++ clang/test/PCH/cxx2a-requires-expr.cpp | 17 + 3 files changed, 28 insertions(+) diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst index 69aea6c21ad39..03b1daa6597cd 100644 --- a/clang/docs/ReleaseNotes.rst +++ b/clang/docs/ReleaseNotes.rst @@ -99,6 +99,7 @@ AST Dumping Potentially Breaking Changes - The text ast-dumper has improved printing of TemplateArguments. +- The text decl-dumper prints template parameters' trailing requires expressions now. 
Clang Frontend Potentially Breaking Changes --- diff --git a/clang/lib/AST/DeclPrinter.cpp b/clang/lib/AST/DeclPrinter.cpp index 0cf4e64f83b8d..0a081e7e07ca8 100644 --- a/clang/lib/AST/DeclPrinter.cpp +++ b/clang/lib/AST/DeclPrinter.cpp @@ -1189,6 +1189,16 @@ void DeclPrinter::printTemplateParameters(const TemplateParameterList *Params, Out << '>'; if (!OmitTemplateKW) Out << ' '; + + if (const Expr *RequiresClause = Params->getRequiresClause()) { +if (OmitTemplateKW) + Out << ' '; +Out << "requires "; +RequiresClause->printPretty(Out, nullptr, Policy, Indentation, "\n", +&Context); +if (!OmitTemplateKW) + Out << ' '; + } } void DeclPrinter::printTemplateArguments(ArrayRef Args, diff --git a/clang/test/PCH/cxx2a-requires-expr.cpp b/clang/test/PCH/cxx2a-requires-expr.cpp index 7f8f258a0f8f3..936f601685463 100644 --- a/clang/test/PCH/cxx2a-requires-expr.cpp +++ b/clang/test/PCH/cxx2a-requires-expr.cpp @@ -22,3 +22,20 @@ bool f() { requires C || (C || C); }; } + +namespace trailing_requires_expression { + +template requires C && C2 +// CHECK: template requires C && C2 void g(); +void g(); + +template requires C || C2 +// CHECK: template requires C || C2 constexpr int h = sizeof(T); +constexpr int h = sizeof(T); + +template requires C +// CHECK: template requires C class i { +// CHECK-NEXT: }; +class i {}; + +} ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [Clang][AST] Let DeclPrinter print trailing requires expressions for template parameters (PR #96864)
llvmbot wrote: @llvm/pr-subscribers-clang Author: Younan Zhang (zyn0217) Changes As discussed in https://github.com/llvm/llvm-project/pull/96084#discussion_r1654629993, it would be nice to present these trailing constraints on template parameters when printing CTAD decls through a DeclPrinter. --- Full diff: https://github.com/llvm/llvm-project/pull/96864.diff 3 Files Affected: - (modified) clang/docs/ReleaseNotes.rst (+1) - (modified) clang/lib/AST/DeclPrinter.cpp (+10) - (modified) clang/test/PCH/cxx2a-requires-expr.cpp (+17) ``diff diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst index 69aea6c21ad39..03b1daa6597cd 100644 --- a/clang/docs/ReleaseNotes.rst +++ b/clang/docs/ReleaseNotes.rst @@ -99,6 +99,7 @@ AST Dumping Potentially Breaking Changes - The text ast-dumper has improved printing of TemplateArguments. +- The text decl-dumper prints template parameters' trailing requires expressions now. Clang Frontend Potentially Breaking Changes --- diff --git a/clang/lib/AST/DeclPrinter.cpp b/clang/lib/AST/DeclPrinter.cpp index 0cf4e64f83b8d..0a081e7e07ca8 100644 --- a/clang/lib/AST/DeclPrinter.cpp +++ b/clang/lib/AST/DeclPrinter.cpp @@ -1189,6 +1189,16 @@ void DeclPrinter::printTemplateParameters(const TemplateParameterList *Params, Out << '>'; if (!OmitTemplateKW) Out << ' '; + + if (const Expr *RequiresClause = Params->getRequiresClause()) { +if (OmitTemplateKW) + Out << ' '; +Out << "requires "; +RequiresClause->printPretty(Out, nullptr, Policy, Indentation, "\n", +&Context); +if (!OmitTemplateKW) + Out << ' '; + } } void DeclPrinter::printTemplateArguments(ArrayRef Args, diff --git a/clang/test/PCH/cxx2a-requires-expr.cpp b/clang/test/PCH/cxx2a-requires-expr.cpp index 7f8f258a0f8f3..936f601685463 100644 --- a/clang/test/PCH/cxx2a-requires-expr.cpp +++ b/clang/test/PCH/cxx2a-requires-expr.cpp @@ -22,3 +22,20 @@ bool f() { requires C || (C || C); }; } + +namespace trailing_requires_expression { + +template requires C && C2 +// CHECK: template 
requires C && C2 void g(); +void g(); + +template requires C || C2 +// CHECK: template requires C || C2 constexpr int h = sizeof(T); +constexpr int h = sizeof(T); + +template requires C +// CHECK: template requires C class i { +// CHECK-NEXT: }; +class i {}; + +} `` https://github.com/llvm/llvm-project/pull/96864 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang] CTAD alias: fix transformation for require-clause expr Part2. (PR #93533)
https://github.com/hokein updated https://github.com/llvm/llvm-project/pull/93533 >From 14817083f75f9615e9df4c905e09bc4e9b199336 Mon Sep 17 00:00:00 2001 From: Haojian Wu Date: Fri, 17 May 2024 15:28:48 +0200 Subject: [PATCH 1/2] [clang] CTAD alias: fix transformation for require-clause expr Part2. In the https://github.com/llvm/llvm-project/pull/90961 fix, we miss a case where the undeduced template parameters of the underlying deduction guide is not transformed, which leaves incorrect depth/index information, and causes crash when evaluating the constraints. This patch fix this missing case. Fixes #92596 Fixes #92212 --- clang/lib/Sema/SemaTemplate.cpp | 32 clang/test/AST/ast-dump-ctad-alias.cpp | 25 +++ clang/test/SemaCXX/cxx20-ctad-type-alias.cpp | 25 +++ 3 files changed, 76 insertions(+), 6 deletions(-) diff --git a/clang/lib/Sema/SemaTemplate.cpp b/clang/lib/Sema/SemaTemplate.cpp index e36ee2d5a46cf..3869f789da78b 100644 --- a/clang/lib/Sema/SemaTemplate.cpp +++ b/clang/lib/Sema/SemaTemplate.cpp @@ -2779,6 +2779,7 @@ Expr * buildAssociatedConstraints(Sema &SemaRef, FunctionTemplateDecl *F, TypeAliasTemplateDecl *AliasTemplate, ArrayRef DeduceResults, + unsigned UndeducedTemplateParameterStartIndex, Expr *IsDeducible) { Expr *RC = F->getTemplateParameters()->getRequiresClause(); if (!RC) @@ -2839,8 +2840,22 @@ buildAssociatedConstraints(Sema &SemaRef, FunctionTemplateDecl *F, for (unsigned Index = 0; Index < DeduceResults.size(); ++Index) { const auto &D = DeduceResults[Index]; -if (D.isNull()) +if (D.isNull()) { // non-deduced template parameters of f + auto TP = F->getTemplateParameters()->getParam(Index); + MultiLevelTemplateArgumentList Args; + Args.setKind(TemplateSubstitutionKind::Rewrite); + Args.addOuterTemplateArguments(TemplateArgsForBuildingRC); + // Rebuild the template parameter with updated depth and index. 
+ NamedDecl *NewParam = transformTemplateParameter( + SemaRef, F->getDeclContext(), TP, Args, + /*NewIndex=*/UndeducedTemplateParameterStartIndex++, + getTemplateParameterDepth(TP) + AdjustDepth); + + assert(TemplateArgsForBuildingRC[Index].isNull()); + TemplateArgsForBuildingRC[Index] = Context.getCanonicalTemplateArgument( + Context.getInjectedTemplateArg(NewParam)); continue; +} TemplateArgumentLoc Input = SemaRef.getTrivialTemplateArgumentLoc(D, QualType(), SourceLocation{}); TemplateArgumentLoc Output; @@ -2856,9 +2871,11 @@ buildAssociatedConstraints(Sema &SemaRef, FunctionTemplateDecl *F, MultiLevelTemplateArgumentList ArgsForBuildingRC; ArgsForBuildingRC.setKind(clang::TemplateSubstitutionKind::Rewrite); ArgsForBuildingRC.addOuterTemplateArguments(TemplateArgsForBuildingRC); - // For 2), if the underlying F is instantiated from a member template, we need - // the entire template argument list, as the constraint AST in the - // require-clause of F remains completely uninstantiated. + // For 2), if the underlying function template F is nested in a class template + // (either instantiated from an explicitly-written deduction guide, or + // synthesized from a constructor), we need the entire template argument list, + // as the constraint AST in the require-clause of F remains completely + // uninstantiated. // // For example: // template // depth 0 @@ -2881,7 +2898,8 @@ buildAssociatedConstraints(Sema &SemaRef, FunctionTemplateDecl *F, // We add the outer template arguments which is [int] to the multi-level arg // list to ensure that the occurrence U in `C` will be replaced with int // during the substitution. 
- if (F->getInstantiatedFromMemberTemplate()) { + if (F->getLexicalDeclContext()->getDeclKind() == + clang::Decl::ClassTemplateSpecialization) { auto OuterLevelArgs = SemaRef.getTemplateInstantiationArgs( F, F->getLexicalDeclContext(), /*Final=*/false, /*Innermost=*/std::nullopt, @@ -3099,6 +3117,7 @@ BuildDeductionGuideForTypeAlias(Sema &SemaRef, Context.getInjectedTemplateArg(NewParam)); TransformedDeducedAliasArgs[AliasTemplateParamIdx] = NewTemplateArgument; } + unsigned UndeducedTemplateParameterStartIndex = FPrimeTemplateParams.size(); // ...followed by the template parameters of f that were not deduced // (including their default template arguments) for (unsigned FTemplateParamIdx : NonDeducedTemplateParamsInFIndex) { @@ -3168,7 +3187,8 @@ BuildDeductionGuideForTypeAlias(Sema &SemaRef, Expr *IsDeducible = buildIsDeducibleConstraint( SemaRef, AliasTemplate, FPrime->getReturnType(), FPrimeTemplateParams); Expr *RequiresClause = buildAssociatedConstraints( -SemaRef, F, AliasTemplate, DeduceResults, IsDeducible); +SemaRef,
[clang] [clang] CTAD alias: fix transformation for require-clause expr Part2. (PR #93533)
https://github.com/hokein edited https://github.com/llvm/llvm-project/pull/93533
[clang] [clang] CTAD alias: fix transformation for require-clause expr Part2. (PR #93533)
https://github.com/hokein commented: thanks for the review. https://github.com/llvm/llvm-project/pull/93533
[clang] [clang] CTAD alias: fix transformation for require-clause expr Part2. (PR #93533)
@@ -2840,8 +2841,22 @@ buildAssociatedConstraints(Sema &SemaRef, FunctionTemplateDecl *F,
   for (unsigned Index = 0; Index < DeduceResults.size(); ++Index) {
     const auto &D = DeduceResults[Index];
-    if (D.isNull())
+    if (D.isNull()) { // non-deduced template parameters of f
+      auto TP = F->getTemplateParameters()->getParam(Index);
+      MultiLevelTemplateArgumentList Args;
+      Args.setKind(TemplateSubstitutionKind::Rewrite);
+      Args.addOuterTemplateArguments(TemplateArgsForBuildingRC);
+      // Rebuild the template parameter with updated depth and index.
+      NamedDecl *NewParam = transformTemplateParameter(
+          SemaRef, F->getDeclContext(), TP, Args,
+          /*NewIndex=*/UndeducedTemplateParameterStartIndex++,
+          getTemplateParameterDepth(TP) + AdjustDepth);

hokein wrote:

Done.

https://github.com/llvm/llvm-project/pull/93533
[clang] [clang] CTAD alias: fix transformation for require-clause expr Part2. (PR #93533)
@@ -2882,7 +2899,8 @@ buildAssociatedConstraints(Sema &SemaRef, FunctionTemplateDecl *F,
   // We add the outer template arguments which is [int] to the multi-level arg
   // list to ensure that the occurrence U in `C` will be replaced with int
   // during the substitution.
-  if (F->getInstantiatedFromMemberTemplate()) {
+  if (F->getLexicalDeclContext()->getDeclKind() ==
+      clang::Decl::ClassTemplateSpecialization) {

hokein wrote:

Not needed. The F here is an instantiated template (either from an explicit deduction guide within a class or a constructor), so its DeclContext cannot be ClassTemplatePartialSpecialization. I made a code comment to clarify it.

https://github.com/llvm/llvm-project/pull/93533
[clang] [clang] CTAD alias: fix transformation for require-clause expr Part2. (PR #93533)
@@ -2840,8 +2841,22 @@ buildAssociatedConstraints(Sema &SemaRef, FunctionTemplateDecl *F,
   for (unsigned Index = 0; Index < DeduceResults.size(); ++Index) {
     const auto &D = DeduceResults[Index];
-    if (D.isNull())
+    if (D.isNull()) { // non-deduced template parameters of f
+      auto TP = F->getTemplateParameters()->getParam(Index);

hokein wrote:

Done

https://github.com/llvm/llvm-project/pull/93533
[clang] [clang] CTAD alias: fix transformation for require-clause expr Part2. (PR #93533)
@@ -3100,6 +3118,7 @@ BuildDeductionGuideForTypeAlias(Sema &SemaRef,
         Context.getInjectedTemplateArg(NewParam));
     TransformedDeducedAliasArgs[AliasTemplateParamIdx] = NewTemplateArgument;
   }
+  unsigned UndeducedTemplateParameterStartIndex = FPrimeTemplateParams.size();

hokein wrote:

Done.

https://github.com/llvm/llvm-project/pull/93533
[clang] [llvm] [Pipelines] Move IPSCCP after inliner pipeline (PR #96620)
https://github.com/sihuan updated https://github.com/llvm/llvm-project/pull/96620 >From abf211c35e39efc5d8f30019e10a14766985c185 Mon Sep 17 00:00:00 2001 From: SiHuaN Date: Tue, 25 Jun 2024 18:04:33 +0800 Subject: [PATCH 1/3] [Pipelines] Move IPSCCP after inliner pipeline Moving the Interprocedural Constant Propagation (IPSCCP) pass to run after the inliner pipeline can enhance optimization effectiveness. Performance uplift for SPEC2017:548.exchange2_r on rv64gc is over 40%. --- llvm/lib/Passes/PassBuilderPipelines.cpp | 28 1 file changed, 14 insertions(+), 14 deletions(-) diff --git a/llvm/lib/Passes/PassBuilderPipelines.cpp b/llvm/lib/Passes/PassBuilderPipelines.cpp index 926515c9508a9..82e2690f4f441 100644 --- a/llvm/lib/Passes/PassBuilderPipelines.cpp +++ b/llvm/lib/Passes/PassBuilderPipelines.cpp @@ -1118,20 +1118,6 @@ PassBuilder::buildModuleSimplificationPipeline(OptimizationLevel Level, invokePipelineEarlySimplificationEPCallbacks(MPM, Level); - // Interprocedural constant propagation now that basic cleanup has occurred - // and prior to optimizing globals. - // FIXME: This position in the pipeline hasn't been carefully considered in - // years, it should be re-analyzed. - MPM.addPass(IPSCCPPass( - IPSCCPOptions(/*AllowFuncSpec=*/ -Level != OptimizationLevel::Os && -Level != OptimizationLevel::Oz && -!isLTOPreLink(Phase; - - // Attach metadata to indirect call sites indicating the set of functions - // they may target at run-time. This should follow IPSCCP. - MPM.addPass(CalledValuePropagationPass()); - // Optimize globals to try and fold them into constants. MPM.addPass(GlobalOptPass()); @@ -1204,6 +1190,20 @@ PassBuilder::buildModuleSimplificationPipeline(OptimizationLevel Level, else MPM.addPass(buildInlinerPipeline(Level, Phase)); + // Interprocedural constant propagation after the inliner pipeline yields + // better optimization results. + // FIXME: This position in the pipeline hasn't been carefully considered in + // years, it should be re-analyzed. 
+ MPM.addPass(IPSCCPPass( + IPSCCPOptions(/*AllowFuncSpec=*/ +Level != OptimizationLevel::Os && +Level != OptimizationLevel::Oz && +!isLTOPreLink(Phase; + + // Attach metadata to indirect call sites indicating the set of functions + // they may target at run-time. This should follow IPSCCP. + MPM.addPass(CalledValuePropagationPass()); + // Remove any dead arguments exposed by cleanups, constant folding globals, // and argument promotion. MPM.addPass(DeadArgumentEliminationPass()); >From 1f8eef8e2b98eadc79f7c89456f701e24b956716 Mon Sep 17 00:00:00 2001 From: SiHuaN Date: Thu, 27 Jun 2024 15:11:38 +0800 Subject: [PATCH 2/3] Restore IPSCCP Pass to its original position and repeat it after the inliner pipeline --- llvm/lib/Passes/PassBuilderPipelines.cpp | 22 ++ 1 file changed, 14 insertions(+), 8 deletions(-) diff --git a/llvm/lib/Passes/PassBuilderPipelines.cpp b/llvm/lib/Passes/PassBuilderPipelines.cpp index 82e2690f4f441..5659c116e9c95 100644 --- a/llvm/lib/Passes/PassBuilderPipelines.cpp +++ b/llvm/lib/Passes/PassBuilderPipelines.cpp @@ -1118,6 +1118,20 @@ PassBuilder::buildModuleSimplificationPipeline(OptimizationLevel Level, invokePipelineEarlySimplificationEPCallbacks(MPM, Level); + // Interprocedural constant propagation now that basic cleanup has occurred + // and prior to optimizing globals. + // FIXME: This position in the pipeline hasn't been carefully considered in + // years, it should be re-analyzed. + MPM.addPass(IPSCCPPass( + IPSCCPOptions(/*AllowFuncSpec=*/ +Level != OptimizationLevel::Os && +Level != OptimizationLevel::Oz && +!isLTOPreLink(Phase; + + // Attach metadata to indirect call sites indicating the set of functions + // they may target at run-time. This should follow IPSCCP. + MPM.addPass(CalledValuePropagationPass()); + // Optimize globals to try and fold them into constants. 
MPM.addPass(GlobalOptPass()); @@ -1190,20 +1204,12 @@ PassBuilder::buildModuleSimplificationPipeline(OptimizationLevel Level, else MPM.addPass(buildInlinerPipeline(Level, Phase)); - // Interprocedural constant propagation after the inliner pipeline yields - // better optimization results. - // FIXME: This position in the pipeline hasn't been carefully considered in - // years, it should be re-analyzed. MPM.addPass(IPSCCPPass( IPSCCPOptions(/*AllowFuncSpec=*/ Level != OptimizationLevel::Os && Level != OptimizationLevel::Oz && !isLTOPreLink(Phase; - // Attach metadata to indirect call sites indi
[clang] [llvm] [Pipelines] Move IPSCCP after inliner pipeline (PR #96620)
llvmbot wrote:

@llvm/pr-subscribers-clang

Author: SiHuaN (sihuan)

Changes

This patch significantly improves the performance of LLVM for the SPEC2017:548.exchange2_r benchmark, with a performance uplift of over 40% on the rv64gc.

During our investigation into the significant performance disparity between GCC and LLVM on the SPEC2017:548.exchange2_r benchmark on RISC-V, we identified that the primary difference stems from constant propagation optimization.

In GCC, the hotspot function `digits_2` is split into several parts:

```console
$ objdump -D exchange2_r_gcc | grep "digits_2.*:$"
0001d480 <__brute_force_MOD_digits_2.isra.0>:
0001f0f6 <__brute_force_MOD_digits_2.constprop.7.isra.0>:
0001fdd0 <__brute_force_MOD_digits_2.constprop.6.isra.0>:
00020900 <__brute_force_MOD_digits_2.constprop.5.isra.0>:
000211c4 <__brute_force_MOD_digits_2.constprop.4.isra.0>:
00022002 <__brute_force_MOD_digits_2.constprop.3.isra.0>:
00022d6a <__brute_force_MOD_digits_2.constprop.2.isra.0>:
00023898 <__brute_force_MOD_digits_2.constprop.1.isra.0>:
```

However, in LLVM, this function is not split:

```console
$ objdump -D exchange2_r_llvm | grep "digits_2.*:$"
000115a0 <_QMbrute_forcePdigits_2>:
```

By applying this patch, LLVM now exhibits similar behavior, resulting in a substantial performance uplift.

```console
$ objdump -D exchange2_r_patched_llvm | grep "digits_2.*:$"
00011ab0 <_QMbrute_forcePdigits_2>:
00018a4e <_QMbrute_forcePdigits_2.specialized.1>:
00019820 <_QMbrute_forcePdigits_2.specialized.2>:
0001a436 <_QMbrute_forcePdigits_2.specialized.3>:
0001ae78 <_QMbrute_forcePdigits_2.specialized.4>:
0001ba8e <_QMbrute_forcePdigits_2.specialized.5>:
0001c7e6 <_QMbrute_forcePdigits_2.specialized.6>:
0001d072 <_QMbrute_forcePdigits_2.specialized.7>:
0001dad0 <_QMbrute_forcePdigits_2.specialized.8>:
```

And we used `perf stat` to measure the instruction count for `exchange2_r 0` on rv64gc, as shown in the table below:

| Compiler | Instructions |
|---|---|
| GCC #d28ea8e5 | 55,965,728,914 |
| LLVM #62d44fbd | 105,416,890,241 |
| LLVM #62d44fbd with this patch | 62,693,427,761 |

Additionally, I performed tests on x86_64, yielding similar results:

| Compiler | cpu_atom instructions |
|---|---|
| LLVM #62d44fbd | 100,147,914,793 |
| LLVM #62d44fbd with this patch | 53,077,337,115 |

---

Full diff: https://github.com/llvm/llvm-project/pull/96620.diff

12 Files Affected:

- (modified) clang/test/CodeGen/attr-counted-by.c (+2-2)
- (modified) llvm/lib/Passes/PassBuilderPipelines.cpp (+6)
- (modified) llvm/test/Other/new-pm-defaults.ll (+1)
- (modified) llvm/test/Other/new-pm-thinlto-postlink-defaults.ll (+2-1)
- (modified) llvm/test/Other/new-pm-thinlto-postlink-pgo-defaults.ll (+1)
- (modified) llvm/test/Other/new-pm-thinlto-postlink-samplepgo-defaults.ll (+1)
- (modified) llvm/test/Other/new-pm-thinlto-prelink-defaults.ll (+1)
- (modified) llvm/test/Other/new-pm-thinlto-prelink-pgo-defaults.ll (+1)
- (modified) llvm/test/Other/new-pm-thinlto-prelink-samplepgo-defaults.ll (+1)
- (modified) llvm/test/Transforms/PhaseOrdering/AArch64/constraint-elimination-placement.ll (+7-25)
- (modified) llvm/test/Transforms/PhaseOrdering/dce-after-argument-promotion.ll (+3-4)
- (modified) llvm/test/Transforms/PhaseOrdering/deletion-of-loops-that-became-side-effect-free.ll (+15-40)

``diff
diff --git 
a/clang/test/CodeGen/attr-counted-by.c b/clang/test/CodeGen/attr-counted-by.c index 79922eb4159f1..8d0e39d0e3dad 100644 --- a/clang/test/CodeGen/attr-counted-by.c +++ b/clang/test/CodeGen/attr-counted-by.c @@ -639,7 +639,7 @@ void test6(struct anon_struct *p, int index) { p->array[index] = __builtin_dynamic_object_size(p->array, 1); } -// SANITIZE-WITH-ATTR-LABEL: define dso_local i64 @test6_bdos( +// SANITIZE-WITH-ATTR-LABEL: define dso_local range(i64 0, -3) i64 @test6_bdos( // SANITIZE-WITH-ATTR-SAME: ptr nocapture noundef readonly [[P:%.*]]) local_unnamed_addr #[[ATTR2]] { // SANITIZE-WITH-ATTR-NEXT: entry: // SANITIZE-WITH-ATTR-NEXT:[[DOT_COUNTED_BY_GEP:%.*]] = getelementptr inbounds i8, ptr [[P]], i64 8 @@ -649,7 +649,7 @@ void test6(struct anon_struct *p, int index) { // SANITIZE-WITH-ATTR-NEXT:[[TMP1:%.*]] = select i1 [[DOTINV]], i64 0, i64 [[TMP0]] // SANITIZE-WITH-ATTR-NEXT:ret i64 [[TMP1]] // -// NO-SANITIZE-WITH-ATTR-LABEL: define dso_local i64 @test6_bdos( +// NO-SANITIZE-WITH-ATTR-LABEL: define dso_local range(i64 0, -3) i64 @test6_bdos( // NO-SANITIZE-WITH-ATTR-SAME: ptr nocapture noundef readonly [[P:%.*]]) local_unnamed_addr #[[ATTR2]] { // NO-SANITIZE-WITH-ATTR-NEXT: entry: // NO-SANITIZE-WITH-ATTR-NEXT:[[DOT_COUNTED_BY_GEP:%.*]] = getelementptr inbounds i8, ptr [[P]], i64 8 diff --git a/llvm/lib/Passes/PassBuilderPipelines.cpp b/llvm/lib/Passes/PassBuilderPipelines.cpp index 926515c9
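As a quick cross-check of the instruction counts quoted in the llvmbot summary for PR #96620, the relative reduction can be computed directly from the two `perf stat` figures. The helper below is a hypothetical sketch, not part of the patch. It measures the drop in retired instructions, which is related to but not the same quantity as the "over 40%" performance uplift mentioned in the commit message.

```cpp
#include <cstdint>

// Hypothetical helper: percentage by which `After` is smaller than `Before`.
// Inputs are raw instruction counts such as those reported by `perf stat`.
double instructionReductionPercent(std::uint64_t Before, std::uint64_t After) {
  return 100.0 *
         (1.0 - static_cast<double>(After) / static_cast<double>(Before));
}
```

With the rv64gc figures (105,416,890,241 before, 62,693,427,761 after) this gives a reduction of about 40.5%, and with the x86_64 figures (100,147,914,793 before, 53,077,337,115 after) about 47.0%.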
[clang] [llvm] [PAC][ELF][AArch64] Encode signed GOT flag in PAuth core info (PR #96159)
https://github.com/jh7370 commented: I'm not at all familiar with this PAuth stuff, but don't you need a test case for where the new value is set (currently they all seem to be unset, if I'm interpreting things correctly)? https://github.com/llvm/llvm-project/pull/96159
[clang] [llvm] [Pipelines] Move IPSCCP after inliner pipeline (PR #96620)
https://github.com/sihuan updated https://github.com/llvm/llvm-project/pull/96620 >From abf211c35e39efc5d8f30019e10a14766985c185 Mon Sep 17 00:00:00 2001 From: SiHuaN Date: Tue, 25 Jun 2024 18:04:33 +0800 Subject: [PATCH 1/4] [Pipelines] Move IPSCCP after inliner pipeline Moving the Interprocedural Constant Propagation (IPSCCP) pass to run after the inliner pipeline can enhance optimization effectiveness. Performance uplift for SPEC2017:548.exchange2_r on rv64gc is over 40%. --- llvm/lib/Passes/PassBuilderPipelines.cpp | 28 1 file changed, 14 insertions(+), 14 deletions(-) diff --git a/llvm/lib/Passes/PassBuilderPipelines.cpp b/llvm/lib/Passes/PassBuilderPipelines.cpp index 926515c9508a9..82e2690f4f441 100644 --- a/llvm/lib/Passes/PassBuilderPipelines.cpp +++ b/llvm/lib/Passes/PassBuilderPipelines.cpp @@ -1118,20 +1118,6 @@ PassBuilder::buildModuleSimplificationPipeline(OptimizationLevel Level, invokePipelineEarlySimplificationEPCallbacks(MPM, Level); - // Interprocedural constant propagation now that basic cleanup has occurred - // and prior to optimizing globals. - // FIXME: This position in the pipeline hasn't been carefully considered in - // years, it should be re-analyzed. - MPM.addPass(IPSCCPPass( - IPSCCPOptions(/*AllowFuncSpec=*/ -Level != OptimizationLevel::Os && -Level != OptimizationLevel::Oz && -!isLTOPreLink(Phase; - - // Attach metadata to indirect call sites indicating the set of functions - // they may target at run-time. This should follow IPSCCP. - MPM.addPass(CalledValuePropagationPass()); - // Optimize globals to try and fold them into constants. MPM.addPass(GlobalOptPass()); @@ -1204,6 +1190,20 @@ PassBuilder::buildModuleSimplificationPipeline(OptimizationLevel Level, else MPM.addPass(buildInlinerPipeline(Level, Phase)); + // Interprocedural constant propagation after the inliner pipeline yields + // better optimization results. + // FIXME: This position in the pipeline hasn't been carefully considered in + // years, it should be re-analyzed. 
+ MPM.addPass(IPSCCPPass( + IPSCCPOptions(/*AllowFuncSpec=*/ +Level != OptimizationLevel::Os && +Level != OptimizationLevel::Oz && +!isLTOPreLink(Phase; + + // Attach metadata to indirect call sites indicating the set of functions + // they may target at run-time. This should follow IPSCCP. + MPM.addPass(CalledValuePropagationPass()); + // Remove any dead arguments exposed by cleanups, constant folding globals, // and argument promotion. MPM.addPass(DeadArgumentEliminationPass()); >From 1f8eef8e2b98eadc79f7c89456f701e24b956716 Mon Sep 17 00:00:00 2001 From: SiHuaN Date: Thu, 27 Jun 2024 15:11:38 +0800 Subject: [PATCH 2/4] Restore IPSCCP Pass to its original position and repeat it after the inliner pipeline --- llvm/lib/Passes/PassBuilderPipelines.cpp | 22 ++ 1 file changed, 14 insertions(+), 8 deletions(-) diff --git a/llvm/lib/Passes/PassBuilderPipelines.cpp b/llvm/lib/Passes/PassBuilderPipelines.cpp index 82e2690f4f441..5659c116e9c95 100644 --- a/llvm/lib/Passes/PassBuilderPipelines.cpp +++ b/llvm/lib/Passes/PassBuilderPipelines.cpp @@ -1118,6 +1118,20 @@ PassBuilder::buildModuleSimplificationPipeline(OptimizationLevel Level, invokePipelineEarlySimplificationEPCallbacks(MPM, Level); + // Interprocedural constant propagation now that basic cleanup has occurred + // and prior to optimizing globals. + // FIXME: This position in the pipeline hasn't been carefully considered in + // years, it should be re-analyzed. + MPM.addPass(IPSCCPPass( + IPSCCPOptions(/*AllowFuncSpec=*/ +Level != OptimizationLevel::Os && +Level != OptimizationLevel::Oz && +!isLTOPreLink(Phase; + + // Attach metadata to indirect call sites indicating the set of functions + // they may target at run-time. This should follow IPSCCP. + MPM.addPass(CalledValuePropagationPass()); + // Optimize globals to try and fold them into constants. 
MPM.addPass(GlobalOptPass()); @@ -1190,20 +1204,12 @@ PassBuilder::buildModuleSimplificationPipeline(OptimizationLevel Level, else MPM.addPass(buildInlinerPipeline(Level, Phase)); - // Interprocedural constant propagation after the inliner pipeline yields - // better optimization results. - // FIXME: This position in the pipeline hasn't been carefully considered in - // years, it should be re-analyzed. MPM.addPass(IPSCCPPass( IPSCCPOptions(/*AllowFuncSpec=*/ Level != OptimizationLevel::Os && Level != OptimizationLevel::Oz && !isLTOPreLink(Phase; - // Attach metadata to indirect call sites indi
[clang] [Clang][AST] Let DeclPrinter print trailing requires expressions for template parameters (PR #96864)
@@ -1189,6 +1189,16 @@ void DeclPrinter::printTemplateParameters(const TemplateParameterList *Params,
   Out << '>';
   if (!OmitTemplateKW)
     Out << ' ';
+
+  if (const Expr *RequiresClause = Params->getRequiresClause()) {

hokein wrote:

If I read the code correctly, it looks like we can move this code to Line 1190 (immediately before the above `if (!OmitTemplateKW)`)? Then we can get rid of all the `OmitTemplateKW` logic inside this if branch.

https://github.com/llvm/llvm-project/pull/96864
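The reordering suggested above can be checked with a small stand-alone model. This is a sketch under assumptions, not clang code: `printTailAsPosted` and `printTailAsSuggested` are hypothetical names, and a plain `const std::string *` (null when there is no clause) stands in for the `const Expr *` returned by `getRequiresClause()`. The second function is one possible reading of the suggestion, with the leading space folded into the `" requires "` literal.

```cpp
#include <sstream>
#include <string>

// Model of the tail of printTemplateParameters as posted in the patch:
// the requires-clause is printed after the optional trailing space, so the
// branch has to compensate for OmitTemplateKW on both sides of the clause.
std::string printTailAsPosted(bool OmitTemplateKW,
                              const std::string *RequiresClause) {
  std::ostringstream Out;
  Out << '>';
  if (!OmitTemplateKW)
    Out << ' ';
  if (RequiresClause) {
    if (OmitTemplateKW)
      Out << ' ';
    Out << "requires " << *RequiresClause;
    if (!OmitTemplateKW)
      Out << ' ';
  }
  return Out.str();
}

// Model of the suggested ordering: print the clause right after '>' and
// let the existing OmitTemplateKW check emit the single trailing space.
std::string printTailAsSuggested(bool OmitTemplateKW,
                                 const std::string *RequiresClause) {
  std::ostringstream Out;
  Out << '>';
  if (RequiresClause)
    Out << " requires " << *RequiresClause;
  if (!OmitTemplateKW)
    Out << ' ';
  return Out.str();
}
```

In this model both functions produce the same text for all four combinations of `OmitTemplateKW` and clause presence, which is consistent with the claim that the extra `OmitTemplateKW` checks inside the branch become unnecessary.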
[clang] [Clang] Bring initFeatureMap back to AArch64TargetInfo. (PR #96832)
tmatheson-arm wrote: And please add a test to cover whatever broke. https://github.com/llvm/llvm-project/pull/96832
[clang] [llvm] [X86][CodeGen] security check cookie execute only when needed (PR #95904)
https://github.com/mahesh-attarde updated https://github.com/llvm/llvm-project/pull/95904 >From 6d6619f8f7a37906ac45791487a4d63b51a48ad1 Mon Sep 17 00:00:00 2001 From: mahesh-attarde Date: Wed, 12 Jun 2024 06:15:51 -0700 Subject: [PATCH 1/5] added regcall strct by reg support --- clang/lib/CodeGen/Targets/X86.cpp | 20 clang/test/CodeGen/regcall3.c | 53 +++ 2 files changed, 73 insertions(+) create mode 100644 clang/test/CodeGen/regcall3.c diff --git a/clang/lib/CodeGen/Targets/X86.cpp b/clang/lib/CodeGen/Targets/X86.cpp index 43dadf5e724ac..506d106ad65b0 100644 --- a/clang/lib/CodeGen/Targets/X86.cpp +++ b/clang/lib/CodeGen/Targets/X86.cpp @@ -148,6 +148,7 @@ class X86_32ABIInfo : public ABIInfo { Class classify(QualType Ty) const; ABIArgInfo classifyReturnType(QualType RetTy, CCState &State) const; + ABIArgInfo classifyArgumentType(QualType RetTy, CCState &State, unsigned ArgIndex) const; @@ -1306,6 +1307,8 @@ class X86_64ABIInfo : public ABIInfo { unsigned &NeededSSE, unsigned &MaxVectorWidth) const; + bool DoesRegcallStructFitInReg(QualType Ty) const; + bool IsIllegalVectorType(QualType Ty) const; /// The 0.98 ABI revision clarified a lot of ambiguities, @@ -2830,6 +2833,20 @@ X86_64ABIInfo::classifyArgumentType(QualType Ty, unsigned freeIntRegs, return ABIArgInfo::getDirect(ResType); } +bool X86_64ABIInfo::DoesRegcallStructFitInReg(QualType Ty) const { + auto RT = Ty->castAs(); + // For Integer class, Max GPR Size is 64 + if (getContext().getTypeSize(Ty) > 64) +return false; + // Struct At hand must not have other non Builtin types + for (const auto *FD : RT->getDecl()->fields()) { +QualType MTy = FD->getType(); +if (!MTy->isBuiltinType()) + return false; + } + return true; +} + ABIArgInfo X86_64ABIInfo::classifyRegCallStructTypeImpl(QualType Ty, unsigned &NeededInt, unsigned &NeededSSE, @@ -2837,6 +2854,9 @@ X86_64ABIInfo::classifyRegCallStructTypeImpl(QualType Ty, unsigned &NeededInt, auto RT = Ty->getAs(); assert(RT && "classifyRegCallStructType only valid 
with struct types"); + if (DoesRegcallStructFitInReg(Ty)) +return classifyArgumentType(Ty, UINT_MAX, NeededInt, NeededSSE, true, true); + if (RT->getDecl()->hasFlexibleArrayMember()) return getIndirectReturnResult(Ty); diff --git a/clang/test/CodeGen/regcall3.c b/clang/test/CodeGen/regcall3.c new file mode 100644 index 0..1c83407220861 --- /dev/null +++ b/clang/test/CodeGen/regcall3.c @@ -0,0 +1,53 @@ +// RUN: %clang_cc1 -S %s -o - -ffreestanding -triple=x86_64-unknown-linux-gnu | FileCheck %s --check-prefixes=LINUX64 + +#include +struct struct1 { int x; int y; }; +void __regcall v6(int a, float b, struct struct1 c) {} + +void v6_caller(){ +struct struct1 c0; +c0.x = 0xa0a0; c0.y = 0xb0b0; +int x= 0xf0f0, y = 0x0f0f; +v6(x,y,c0); +} + +// LINUX64-LABEL: __regcall3__v6 +// LINUX64: movq %rcx, -8(%rsp) +// LINUX64: movl %eax, -12(%rsp) +// LINUX64: movss %xmm0, -16(%rsp) + +// LINUX64-LABEL: v6_caller +// LINUX64: movl $41120, 16(%rsp)# imm = 0xA0A0 +// LINUX64: movl $45232, 20(%rsp)# imm = 0xB0B0 +// LINUX64: movl $61680, 12(%rsp)# imm = 0xF0F0 +// LINUX64: movl $3855, 8(%rsp) # imm = 0xF0F +// LINUX64: movl 12(%rsp), %eax +// LINUX64: cvtsi2ssl 8(%rsp), %xmm0 +// LINUX64: movq 16(%rsp), %rcx +// LINUX64: callq .L__regcall3__v6$local + + +struct struct2 { int x; float y; }; +void __regcall v31(int a, float b, struct struct2 c) {} + +void v31_caller(){ +struct struct2 c0; +c0.x = 0xa0a0; c0.y = 0xb0b0; +int x= 0xf0f0, y = 0x0f0f; +v31(x,y,c0); +} + +// LINUX64: __regcall3__v31:# @__regcall3__v31 +// LINUX64:movq%rcx, -8(%rsp) +// LINUX64:movl%eax, -12(%rsp) +// LINUX64:movss %xmm0, -16(%rsp) +// LINUX64: v31_caller: # @v31_caller +// LINUX64:movl$41120, 16(%rsp)# imm = 0xA0A0 +// LINUX64:movss .LCPI3_0(%rip), %xmm0 # xmm0 = [4.5232E+4,0.0E+0,0.0E+0,0.0E+0] +// LINUX64:movss %xmm0, 20(%rsp) +// LINUX64:movl$61680, 12(%rsp)# imm = 0xF0F0 +// LINUX64:movl$3855, 8(%rsp) # imm = 0xF0F +// LINUX64:movl12(%rsp), %eax +// LINUX64:cvtsi2ssl 8(%rsp), %xmm0 +// 
LINUX64:movq16(%rsp), %rcx +// LINUX64:callq .L__regcall3__v31$local >From 8bdd245edd8dca9477d6541401737f2aeaf6e820 Mon Sep 17 00:00:00 2001 From: mahesh-attarde Date: Tue, 18 Jun 2024 03:33:02 -0700 Subject: [PATCH 2/5] selectively call security cookie check --- llvm/lib/Target/X86/CMakeLists.txt| 1 + llvm/lib/Target/X
[clang] [flang] [llvm] Re-land: "[AArch64] Add ability to list extensions enabled for a target" (#95805) (PR #96795)
@@ -343,7 +350,9 @@ bool isX18ReservedByDefault(const Triple &TT); // themselves, they are sequential (0, 1, 2, 3, ...). uint64_t getCpuSupportsMask(ArrayRef FeatureStrs); -void PrintSupportedExtensions(StringMap DescMap); +void PrintSupportedExtensions(); + +void printEnabledExtensions(std::set EnabledFeatureNames); pratlucas wrote: Thanks for catching this! Not sure how I missed it 🤦 https://github.com/llvm/llvm-project/pull/96795 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [compiler-rt] [XRay] Add support for instrumentation of DSOs on x86_64 (PR #90959)
sebastiankreutzer wrote: @androm3da @MaskRay I'm tagging you because I'm having trouble getting feedback on this PR, and you seem to be the most recent contributors to XRay. Would one of you be willing to review it? Any other pointers on whom to get in touch with are also much appreciated. https://github.com/llvm/llvm-project/pull/90959
[clang] [clang] Emitting a warning if optimizations are enabled with sanitizers (PR #95934)
@@ -1038,3 +1038,10 @@ // RUN: not %clang --target=aarch64-none-elf -fsanitize=dataflow %s -### 2>&1 | FileCheck %s -check-prefix=UNSUPPORTED-BAREMETAL // RUN: not %clang --target=arm-arm-none-eabi -fsanitize=shadow-call-stack %s -### 2>&1 | FileCheck %s -check-prefix=UNSUPPORTED-BAREMETAL // UNSUPPORTED-BAREMETAL: unsupported option '-fsanitize={{.*}}' for target + +// RUN: %clang -O0 -O1 -fsanitize=address %s -### 2>&1 | FileCheck %s -check-prefix=CHECK-SAN-OPT-WARN +// RUN: %clang -Ofast -fsanitize=address %s -### 2>&1 | FileCheck %s -check-prefix=CHECK-SAN-OPT-WARN +// RUN: %clang -O3 -fsanitize=address %s -### 2>&1 | FileCheck %s -check-prefix=CHECK-SAN-OPT-WARN +// RUN: %clang -O2 -fsanitize=thread %s -### 2>&1 | FileCheck %s -check-prefix=CHECK-SAN-OPT-WARN +// RUN: %clang -O1 -fsanitize=thread %s -### 2>&1 | FileCheck %s -check-prefix=CHECK-SAN-OPT-WARN +// CHECK-SAN-OPT-WARN: warning: enabling optimizations with sanitizers may potentially reduce effectiveness sandeepkosuri wrote: ```suggestion // CHECK-SAN-OPT-WARN: warning: enabling optimizations may reduce the effectiveness of sanitizers ``` https://github.com/llvm/llvm-project/pull/95934 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang] Emitting a warning if optimizations are enabled with sanitizers (PR #95934)
@@ -477,6 +477,8 @@ def warn_drv_disabling_vptr_no_rtti_default : Warning< def warn_drv_object_size_disabled_O0 : Warning< "the object size sanitizer has no effect at -O0, but is explicitly enabled: %0">, InGroup, DefaultWarnNoWerror; +def warn_sanitizer_with_optimization : Warning< + "enabling optimizations with sanitizers may potentially reduce effectiveness">; sandeepkosuri wrote: ```suggestion "enabling optimizations may reduce the effectiveness of sanitizers">; ``` https://github.com/llvm/llvm-project/pull/95934 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang][Interp] Merge ByteCodeExprGen and ByteCodeStmtGen (PR #83683)
https://github.com/tbaederr updated https://github.com/llvm/llvm-project/pull/83683 >From 74550f244eed465d4f0db1787eecb73a09d5881a Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Timm=20B=C3=A4der?= Date: Sat, 2 Mar 2024 17:00:26 +0100 Subject: [PATCH] [clang][Interp] Merge ByteCode{Stmt,Expr}Gen --- clang/lib/AST/CMakeLists.txt |3 +- clang/lib/AST/Interp/ByteCodeStmtGen.cpp | 734 clang/lib/AST/Interp/ByteCodeStmtGen.h| 91 -- .../{ByteCodeExprGen.cpp => Compiler.cpp} | 1002 ++--- .../Interp/{ByteCodeExprGen.h => Compiler.h} | 93 +- clang/lib/AST/Interp/Context.cpp | 13 +- clang/lib/AST/Interp/EvalEmitter.h|1 + clang/lib/AST/Interp/Program.cpp |1 - 8 files changed, 911 insertions(+), 1027 deletions(-) delete mode 100644 clang/lib/AST/Interp/ByteCodeStmtGen.cpp delete mode 100644 clang/lib/AST/Interp/ByteCodeStmtGen.h rename clang/lib/AST/Interp/{ByteCodeExprGen.cpp => Compiler.cpp} (81%) rename clang/lib/AST/Interp/{ByteCodeExprGen.h => Compiler.h} (87%) diff --git a/clang/lib/AST/CMakeLists.txt b/clang/lib/AST/CMakeLists.txt index 0328666d59b1f..ceaad8d3c5a86 100644 --- a/clang/lib/AST/CMakeLists.txt +++ b/clang/lib/AST/CMakeLists.txt @@ -65,8 +65,7 @@ add_clang_library(clangAST FormatString.cpp InheritViz.cpp Interp/ByteCodeEmitter.cpp - Interp/ByteCodeExprGen.cpp - Interp/ByteCodeStmtGen.cpp + Interp/Compiler.cpp Interp/Context.cpp Interp/Descriptor.cpp Interp/Disasm.cpp diff --git a/clang/lib/AST/Interp/ByteCodeStmtGen.cpp b/clang/lib/AST/Interp/ByteCodeStmtGen.cpp deleted file mode 100644 index 0618ec1aa8f58..0 --- a/clang/lib/AST/Interp/ByteCodeStmtGen.cpp +++ /dev/null @@ -1,734 +0,0 @@ -//===--- ByteCodeStmtGen.cpp - Code generator for expressions ---*- C++ -*-===// -// -// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. -// See https://llvm.org/LICENSE.txt for license information. 
-// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception -// -//===--===// - -#include "ByteCodeStmtGen.h" -#include "ByteCodeEmitter.h" -#include "Context.h" -#include "Function.h" -#include "PrimType.h" - -using namespace clang; -using namespace clang::interp; - -namespace clang { -namespace interp { - -/// Scope managing label targets. -template class LabelScope { -public: - virtual ~LabelScope() { } - -protected: - LabelScope(ByteCodeStmtGen *Ctx) : Ctx(Ctx) {} - /// ByteCodeStmtGen instance. - ByteCodeStmtGen *Ctx; -}; - -/// Sets the context for break/continue statements. -template class LoopScope final : public LabelScope { -public: - using LabelTy = typename ByteCodeStmtGen::LabelTy; - using OptLabelTy = typename ByteCodeStmtGen::OptLabelTy; - - LoopScope(ByteCodeStmtGen *Ctx, LabelTy BreakLabel, -LabelTy ContinueLabel) - : LabelScope(Ctx), OldBreakLabel(Ctx->BreakLabel), -OldContinueLabel(Ctx->ContinueLabel) { -this->Ctx->BreakLabel = BreakLabel; -this->Ctx->ContinueLabel = ContinueLabel; - } - - ~LoopScope() { -this->Ctx->BreakLabel = OldBreakLabel; -this->Ctx->ContinueLabel = OldContinueLabel; - } - -private: - OptLabelTy OldBreakLabel; - OptLabelTy OldContinueLabel; -}; - -// Sets the context for a switch scope, mapping labels. 
-template class SwitchScope final : public LabelScope { -public: - using LabelTy = typename ByteCodeStmtGen::LabelTy; - using OptLabelTy = typename ByteCodeStmtGen::OptLabelTy; - using CaseMap = typename ByteCodeStmtGen::CaseMap; - - SwitchScope(ByteCodeStmtGen *Ctx, CaseMap &&CaseLabels, - LabelTy BreakLabel, OptLabelTy DefaultLabel) - : LabelScope(Ctx), OldBreakLabel(Ctx->BreakLabel), -OldDefaultLabel(this->Ctx->DefaultLabel), -OldCaseLabels(std::move(this->Ctx->CaseLabels)) { -this->Ctx->BreakLabel = BreakLabel; -this->Ctx->DefaultLabel = DefaultLabel; -this->Ctx->CaseLabels = std::move(CaseLabels); - } - - ~SwitchScope() { -this->Ctx->BreakLabel = OldBreakLabel; -this->Ctx->DefaultLabel = OldDefaultLabel; -this->Ctx->CaseLabels = std::move(OldCaseLabels); - } - -private: - OptLabelTy OldBreakLabel; - OptLabelTy OldDefaultLabel; - CaseMap OldCaseLabels; -}; - -} // namespace interp -} // namespace clang - -template -bool ByteCodeStmtGen::emitLambdaStaticInvokerBody( -const CXXMethodDecl *MD) { - assert(MD->isLambdaStaticInvoker()); - assert(MD->hasBody()); - assert(cast(MD->getBody())->body_empty()); - - const CXXRecordDecl *ClosureClass = MD->getParent(); - const CXXMethodDecl *LambdaCallOp = ClosureClass->getLambdaCallOperator(); - assert(ClosureClass->captures_begin() == ClosureClass->captures_end()); - const Function *Func = this->getFunction(LambdaCallOp); - if (!Func) -return false; - assert(Func->hasThisPointer()); - assert(Func->getNumParams() == (MD->getNumParams() + 1 + F
[clang] clang/AMDGPU: Use atomicrmw for ds fmin/fmax builtins (PR #96738)
https://github.com/arsenm updated https://github.com/llvm/llvm-project/pull/96738 >From 5f614809ac4ffa5e29a01c7e9410d91eadcbe6f2 Mon Sep 17 00:00:00 2001 From: Matt Arsenault Date: Tue, 11 Jun 2024 10:40:27 +0200 Subject: [PATCH 1/2] clang/AMDGPU: Use atomicrmw for ds fmin/fmax builtins --- clang/lib/CodeGen/CGBuiltin.cpp | 40 --- clang/test/CodeGenCUDA/builtins-amdgcn.cu | 8 +-- .../test/CodeGenCUDA/builtins-spirv-amdgcn.cu | 8 +-- .../test/CodeGenOpenCL/builtins-amdgcn-vi.cl | 66 ++- 4 files changed, 86 insertions(+), 36 deletions(-) diff --git a/clang/lib/CodeGen/CGBuiltin.cpp b/clang/lib/CodeGen/CGBuiltin.cpp index 96dcf6283f9f8..98c2f70664ec7 100644 --- a/clang/lib/CodeGen/CGBuiltin.cpp +++ b/clang/lib/CodeGen/CGBuiltin.cpp @@ -18632,28 +18632,6 @@ Value *CodeGenFunction::EmitAMDGPUBuiltinExpr(unsigned BuiltinID, Function *F = CGM.getIntrinsic(Intrin, { Src0->getType() }); return Builder.CreateCall(F, { Src0, Builder.getFalse() }); } - case AMDGPU::BI__builtin_amdgcn_ds_fminf: - case AMDGPU::BI__builtin_amdgcn_ds_fmaxf: { -Intrinsic::ID Intrin; -switch (BuiltinID) { -case AMDGPU::BI__builtin_amdgcn_ds_fminf: - Intrin = Intrinsic::amdgcn_ds_fmin; - break; -case AMDGPU::BI__builtin_amdgcn_ds_fmaxf: - Intrin = Intrinsic::amdgcn_ds_fmax; - break; -} -llvm::Value *Src0 = EmitScalarExpr(E->getArg(0)); -llvm::Value *Src1 = EmitScalarExpr(E->getArg(1)); -llvm::Value *Src2 = EmitScalarExpr(E->getArg(2)); -llvm::Value *Src3 = EmitScalarExpr(E->getArg(3)); -llvm::Value *Src4 = EmitScalarExpr(E->getArg(4)); -llvm::Function *F = CGM.getIntrinsic(Intrin, { Src1->getType() }); -llvm::FunctionType *FTy = F->getFunctionType(); -llvm::Type *PTy = FTy->getParamType(0); -Src0 = Builder.CreatePointerBitCastOrAddrSpaceCast(Src0, PTy); -return Builder.CreateCall(F, { Src0, Src1, Src2, Src3, Src4 }); - } case AMDGPU::BI__builtin_amdgcn_global_atomic_fadd_f64: case AMDGPU::BI__builtin_amdgcn_global_atomic_fadd_f32: case AMDGPU::BI__builtin_amdgcn_global_atomic_fadd_v2f16: @@ -19087,11 
+19065,13 @@ Value *CodeGenFunction::EmitAMDGPUBuiltinExpr(unsigned BuiltinID, case AMDGPU::BI__builtin_amdgcn_atomic_inc64: case AMDGPU::BI__builtin_amdgcn_atomic_dec32: case AMDGPU::BI__builtin_amdgcn_atomic_dec64: - case AMDGPU::BI__builtin_amdgcn_ds_faddf: case AMDGPU::BI__builtin_amdgcn_ds_atomic_fadd_f64: case AMDGPU::BI__builtin_amdgcn_ds_atomic_fadd_f32: case AMDGPU::BI__builtin_amdgcn_ds_atomic_fadd_v2f16: - case AMDGPU::BI__builtin_amdgcn_ds_atomic_fadd_v2bf16: { + case AMDGPU::BI__builtin_amdgcn_ds_atomic_fadd_v2bf16: + case AMDGPU::BI__builtin_amdgcn_ds_faddf: + case AMDGPU::BI__builtin_amdgcn_ds_fminf: + case AMDGPU::BI__builtin_amdgcn_ds_fmaxf: { llvm::AtomicRMWInst::BinOp BinOp; switch (BuiltinID) { case AMDGPU::BI__builtin_amdgcn_atomic_inc32: @@ -19109,6 +19089,12 @@ Value *CodeGenFunction::EmitAMDGPUBuiltinExpr(unsigned BuiltinID, case AMDGPU::BI__builtin_amdgcn_ds_atomic_fadd_v2bf16: BinOp = llvm::AtomicRMWInst::FAdd; break; +case AMDGPU::BI__builtin_amdgcn_ds_fminf: + BinOp = llvm::AtomicRMWInst::FMin; + break; +case AMDGPU::BI__builtin_amdgcn_ds_fmaxf: + BinOp = llvm::AtomicRMWInst::FMax; + break; } Address Ptr = CheckAtomicAlignment(*this, E); @@ -19118,8 +19104,10 @@ Value *CodeGenFunction::EmitAMDGPUBuiltinExpr(unsigned BuiltinID, bool Volatile; -if (BuiltinID == AMDGPU::BI__builtin_amdgcn_ds_faddf) { - // __builtin_amdgcn_ds_faddf has an explicit volatile argument +if (BuiltinID == AMDGPU::BI__builtin_amdgcn_ds_faddf || +BuiltinID == AMDGPU::BI__builtin_amdgcn_ds_fminf || +BuiltinID == AMDGPU::BI__builtin_amdgcn_ds_fmaxf) { + // __builtin_amdgcn_ds_faddf/fminf/fmaxf has an explicit volatile argument Volatile = cast(EmitScalarExpr(E->getArg(4)))->getZExtValue(); } else { diff --git a/clang/test/CodeGenCUDA/builtins-amdgcn.cu b/clang/test/CodeGenCUDA/builtins-amdgcn.cu index 132cbd27b08fc..2e88afac813f4 100644 --- a/clang/test/CodeGenCUDA/builtins-amdgcn.cu +++ b/clang/test/CodeGenCUDA/builtins-amdgcn.cu @@ -98,7 +98,7 @@ __global__ // 
CHECK-NEXT:[[X_ASCAST:%.*]] = addrspacecast ptr addrspace(5) [[X]] to ptr // CHECK-NEXT:store float [[SRC:%.*]], ptr [[SRC_ADDR_ASCAST]], align 4 // CHECK-NEXT:[[TMP0:%.*]] = load float, ptr [[SRC_ADDR_ASCAST]], align 4 -// CHECK-NEXT:[[TMP1:%.*]] = call contract float @llvm.amdgcn.ds.fmax.f32(ptr addrspace(3) @_ZZ12test_ds_fmaxfE6shared, float [[TMP0]], i32 0, i32 0, i1 false) +// CHECK-NEXT:[[TMP1:%.*]] = atomicrmw fmax ptr addrspace(3) @_ZZ12test_ds_fmaxfE6shared, float [[TMP0]] monotonic, align 4 // CHECK-NEXT:store volatile float [[TMP1]], ptr [[X_ASCAST]], align 4 // CHECK-NEXT:ret void // @@ -142,7 +142,7 @@ __global__ voi
[clang] [Clang][AST] Let DeclPrinter print trailing requires expressions for template parameters (PR #96864)
@@ -1189,6 +1189,16 @@ void DeclPrinter::printTemplateParameters(const TemplateParameterList *Params, Out << '>'; if (!OmitTemplateKW) Out << ' '; + + if (const Expr *RequiresClause = Params->getRequiresClause()) { zyn0217 wrote: Yeah, you're right, will do that shortly. https://github.com/llvm/llvm-project/pull/96864
[clang] [flang] [llvm] Re-land: "[AArch64] Add ability to list extensions enabled for a target" (#95805) (PR #96795)
@@ -21,7 +21,7 @@ // RUN: %clang --target=aarch64 -march=armv8a+fp16fml -### -c %s 2>&1 | FileCheck -check-prefix=GENERICV8A-FP16FML %s pratlucas wrote: The check for target validity doesn't run when using `-###` in the command line. E.g.:
```
$ ../build/bin/clang -target foo -c test.c -###
clang version 19.0.0git
Target: foo
Thread model: posix
InstalledDir: /Users/lucpra01/Workspace/opensource/build/bin
Build config: +tsan
 (in-process)
 "/Users/lucpra01/Workspace/opensource/build/bin/clang-19" "-cc1" "-triple" "foo" "-emit-obj" "-disable-free" "-clear-ast-before-backend" "-disable-llvm-verifier" "-discard-value-names" "-main-file-name" "test.c" "-mrelocation-model" "static" "-mframe-pointer=all" "-fmath-errno" "-ffp-contract=on" "-fno-rounding-math" "-mconstructor-aliases" "-debugger-tuning=gdb" "-fdebug-compilation-dir=/Users/lucpra01/Workspace/opensource/test" "-target-linker-version" "1053.12" "-fcoverage-compilation-dir=/Users/lucpra01/Workspace/opensource/test" "-resource-dir" "/Users/lucpra01/Workspace/opensource/build/lib/clang/19" "-ferror-limit" "19" "-fgnuc-version=4.2.1" "-fskip-odr-check-in-gmf" "-fcolor-diagnostics" "-faddrsig" "-o" "test.o" "-x" "c" "test.c"
$ echo $?
0
```
As adding the `// REQUIRES:` directive would reduce our current test coverage, I chose to add an `%if aarch64-registered-target` condition only to the relevant `// RUN:` lines instead. https://github.com/llvm/llvm-project/pull/96795
[clang] [flang] [llvm] Re-land: "[AArch64] Add ability to list extensions enabled for a target" (#95805) (PR #96795)
@@ -315,37 +315,37 @@ // RUN: %clang -target aarch64 -mcpu=thunderx2t99 -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-MCPU-THUNDERX2T99 %s // RUN: %clang -target aarch64 -mcpu=a64fx -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-MCPU-A64FX %s // RUN: %clang -target aarch64 -mcpu=carmel -### -c %s 2>&1 | FileCheck -check-prefix=CHECK-MCPU-CARMEL %s -// CHECK-MCPU-APPLE-A7: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-feature" "+zcm" "-target-feature" "+zcz" "-target-feature" "+v8a" "-target-feature" "+aes"{{.*}} "-target-feature" "+fp-armv8" "-target-feature" "+perfmon" "-target-feature" "+sha2" "-target-feature" "+neon" pratlucas wrote: Ditto. https://github.com/llvm/llvm-project/pull/96795 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] 8a43dc3 - [clang][Sema] Move the initializer lifetime checking code from SemaInit.cpp to a new place, NFC (#96758)
Author: Haojian Wu Date: 2024-06-27T10:56:06+02:00 New Revision: 8a43dc3efdd9bfba0bea32061ef2f3397a968eb9 URL: https://github.com/llvm/llvm-project/commit/8a43dc3efdd9bfba0bea32061ef2f3397a968eb9 DIFF: https://github.com/llvm/llvm-project/commit/8a43dc3efdd9bfba0bea32061ef2f3397a968eb9.diff LOG: [clang][Sema] Move the initializer lifetime checking code from SemaInit.cpp to a new place, NFC (#96758) This is a refactoring change for better code isolation and reuse, the first step to extend it for assignments. Added: clang/lib/Sema/CheckExprLifetime.cpp clang/lib/Sema/CheckExprLifetime.h Modified: clang/lib/Sema/CMakeLists.txt clang/lib/Sema/SemaInit.cpp Removed: diff --git a/clang/lib/Sema/CMakeLists.txt b/clang/lib/Sema/CMakeLists.txt index f152d243d39a5..980a83d4431aa 100644 --- a/clang/lib/Sema/CMakeLists.txt +++ b/clang/lib/Sema/CMakeLists.txt @@ -15,6 +15,7 @@ clang_tablegen(OpenCLBuiltins.inc -gen-clang-opencl-builtins add_clang_library(clangSema AnalysisBasedWarnings.cpp + CheckExprLifetime.cpp CodeCompleteConsumer.cpp DeclSpec.cpp DelayedDiagnostic.cpp diff --git a/clang/lib/Sema/CheckExprLifetime.cpp b/clang/lib/Sema/CheckExprLifetime.cpp new file mode 100644 index 0..54e2f1c22536d --- /dev/null +++ b/clang/lib/Sema/CheckExprLifetime.cpp @@ -0,0 +1,1259 @@ +//===--- CheckExprLifetime.cpp ===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===--===// + +#include "CheckExprLifetime.h" +#include "clang/AST/Expr.h" +#include "clang/Sema/Sema.h" +#include "llvm/ADT/PointerIntPair.h" + +namespace clang::sema { +namespace { +enum LifetimeKind { + /// The lifetime of a temporary bound to this entity ends at the end of the + /// full-expression, and that's (probably) fine. 
+ LK_FullExpression, + + /// The lifetime of a temporary bound to this entity is extended to the + /// lifeitme of the entity itself. + LK_Extended, + + /// The lifetime of a temporary bound to this entity probably ends too soon, + /// because the entity is allocated in a new-expression. + LK_New, + + /// The lifetime of a temporary bound to this entity ends too soon, because + /// the entity is a return object. + LK_Return, + + /// The lifetime of a temporary bound to this entity ends too soon, because + /// the entity is the result of a statement expression. + LK_StmtExprResult, + + /// This is a mem-initializer: if it would extend a temporary (other than via + /// a default member initializer), the program is ill-formed. + LK_MemInitializer, +}; +using LifetimeResult = +llvm::PointerIntPair; +} // namespace + +/// Determine the declaration which an initialized entity ultimately refers to, +/// for the purpose of lifetime-extending a temporary bound to a reference in +/// the initialization of \p Entity. +static LifetimeResult +getEntityLifetime(const InitializedEntity *Entity, + const InitializedEntity *InitField = nullptr) { + // C++11 [class.temporary]p5: + switch (Entity->getKind()) { + case InitializedEntity::EK_Variable: +// The temporary [...] persists for the lifetime of the reference +return {Entity, LK_Extended}; + + case InitializedEntity::EK_Member: +// For subobjects, we look at the complete object. +if (Entity->getParent()) + return getEntityLifetime(Entity->getParent(), Entity); + +// except: +// C++17 [class.base.init]p8: +// A temporary expression bound to a reference member in a +// mem-initializer is ill-formed. +// C++17 [class.base.init]p11: +// A temporary expression bound to a reference member from a +// default member initializer is ill-formed. 
+// +// The context of p11 and its example suggest that it's only the use of a +// default member initializer from a constructor that makes the program +// ill-formed, not its mere existence, and that it can even be used by +// aggregate initialization. +return {Entity, Entity->isDefaultMemberInitializer() ? LK_Extended + : LK_MemInitializer}; + + case InitializedEntity::EK_Binding: +// Per [dcl.decomp]p3, the binding is treated as a variable of reference +// type. +return {Entity, LK_Extended}; + + case InitializedEntity::EK_Parameter: + case InitializedEntity::EK_Parameter_CF_Audited: +// -- A temporary bound to a reference parameter in a function call +// persists until the completion of the full-expression containing +// the call. +return {nullptr, LK_FullExpression}; + + case InitializedEntity::EK_TemplateParameter: +
[clang] [clang][Sema] Move the initializer lifetime checking code from SemaInit.cpp to a new place, NFC (PR #96758)
https://github.com/hokein closed https://github.com/llvm/llvm-project/pull/96758
[clang] [clang][analyzer] Improve PointerSubChecker (PR #96501)
NagyDonat wrote: > The warning message may be still misleading if the LHS or RHS "arrays" are > non-array variables. I think that the warning message is OK: "Subtraction of two pointers that do not point into the same array is undefined behavior." -- this also covers the case when one or both of the pointers do not point to arrays. (It doesn't mention the corner case that it's also valid to subtract two identical pointers that point to a non-array value, but that's completely irrelevant in practice, so wouldn't be a helpful suggestion.) > (or detect if `offsetof` can be used and include it in the message)? I think that would be a waste of time, because it's very rare that a project manually reimplements `offsetof` -- I think it only appears in `vim` because it's a very old codebase. (Also developers who play with this kind of low-level trickery should be familiar with the standard and understand what the problem is.) https://github.com/llvm/llvm-project/pull/96501
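To make the rule quoted in the warning message concrete, here is a minimal sketch. It is not taken from the PR: the struct `Pair` and the helpers `elementDistance` and `offsetOfSecond` are made-up names used only for illustration. It contrasts a well-defined subtraction inside one array with the standard `offsetof` macro, which is what a hand-rolled member-offset computation would normally be replaced with.

```cpp
#include <cstddef>

// Hypothetical struct, used only for illustration.
struct Pair {
  int first;
  double second;
};

// Well-defined: both pointers point into the same array
// (or one past its last element).
std::ptrdiff_t elementDistance(const int *begin, const int *end) {
  return end - begin;
}

// Instead of subtracting pointers to two different objects or members,
// offsetof yields the byte offset without any pointer arithmetic.
std::size_t offsetOfSecond() { return offsetof(Pair, second); }
```

Subtracting, say, `&p.second` from `&p.first` after casting both to `char *` is the kind of manual `offsetof` reimplementation the comment above refers to; the sketch deliberately avoids it.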
[clang] [llvm] [X86][CodeGen] security check cookie execute only when needed (PR #95904)
mahesh-attarde wrote: > @mahesh-attarde please can you rebase against trunk - I've cleaned up the > test checks to help with the codegen diff done. https://github.com/llvm/llvm-project/pull/95904
[clang] [llvm] [X86][CodeGen] security check cookie execute only when needed (PR #95904)
mahesh-attarde wrote: ping @MaskRay @RKSimon https://github.com/llvm/llvm-project/pull/95904
[clang] [Clang][AST] Let DeclPrinter print trailing requires expressions for template parameters (PR #96864)
https://github.com/zyn0217 updated https://github.com/llvm/llvm-project/pull/96864 >From a5c33bd413d8150d1688240c6b5253b1760cafe1 Mon Sep 17 00:00:00 2001 From: Younan Zhang Date: Thu, 27 Jun 2024 15:59:48 +0800 Subject: [PATCH 1/2] [Clang][AST] Let DeclPrinter print trailing requires expressions for template parameters As discussed in https://github.com/llvm/llvm-project/pull/96084#discussion_r1654629993, it would be nice to present these trailing constraints on template parameters when printing CTAD decls through a DeclPrinter. --- clang/docs/ReleaseNotes.rst| 1 + clang/lib/AST/DeclPrinter.cpp | 10 ++ clang/test/PCH/cxx2a-requires-expr.cpp | 17 + 3 files changed, 28 insertions(+) diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst index 69aea6c21ad39..03b1daa6597cd 100644 --- a/clang/docs/ReleaseNotes.rst +++ b/clang/docs/ReleaseNotes.rst @@ -99,6 +99,7 @@ AST Dumping Potentially Breaking Changes - The text ast-dumper has improved printing of TemplateArguments. +- The text decl-dumper prints template parameters' trailing requires expressions now. 
Clang Frontend Potentially Breaking Changes --- diff --git a/clang/lib/AST/DeclPrinter.cpp b/clang/lib/AST/DeclPrinter.cpp index 0cf4e64f83b8d..0a081e7e07ca8 100644 --- a/clang/lib/AST/DeclPrinter.cpp +++ b/clang/lib/AST/DeclPrinter.cpp @@ -1189,6 +1189,16 @@ void DeclPrinter::printTemplateParameters(const TemplateParameterList *Params, Out << '>'; if (!OmitTemplateKW) Out << ' '; + + if (const Expr *RequiresClause = Params->getRequiresClause()) { +if (OmitTemplateKW) + Out << ' '; +Out << "requires "; +RequiresClause->printPretty(Out, nullptr, Policy, Indentation, "\n", +&Context); +if (!OmitTemplateKW) + Out << ' '; + } } void DeclPrinter::printTemplateArguments(ArrayRef Args, diff --git a/clang/test/PCH/cxx2a-requires-expr.cpp b/clang/test/PCH/cxx2a-requires-expr.cpp index 7f8f258a0f8f3..936f601685463 100644 --- a/clang/test/PCH/cxx2a-requires-expr.cpp +++ b/clang/test/PCH/cxx2a-requires-expr.cpp @@ -22,3 +22,20 @@ bool f() { requires C || (C || C); }; } + +namespace trailing_requires_expression { + +template requires C && C2 +// CHECK: template requires C && C2 void g(); +void g(); + +template requires C || C2 +// CHECK: template requires C || C2 constexpr int h = sizeof(T); +constexpr int h = sizeof(T); + +template requires C +// CHECK: template requires C class i { +// CHECK-NEXT: }; +class i {}; + +} >From 432f3fdb6d0817fad15b87a6d166e0ada9748f89 Mon Sep 17 00:00:00 2001 From: Younan Zhang Date: Thu, 27 Jun 2024 16:56:19 +0800 Subject: [PATCH 2/2] Address feedback --- clang/lib/AST/DeclPrinter.cpp | 11 --- 1 file changed, 4 insertions(+), 7 deletions(-) diff --git a/clang/lib/AST/DeclPrinter.cpp b/clang/lib/AST/DeclPrinter.cpp index 0a081e7e07ca8..26773a69ab9ac 100644 --- a/clang/lib/AST/DeclPrinter.cpp +++ b/clang/lib/AST/DeclPrinter.cpp @@ -1187,18 +1187,15 @@ void DeclPrinter::printTemplateParameters(const TemplateParameterList *Params, } Out << '>'; - if (!OmitTemplateKW) -Out << ' '; if (const Expr *RequiresClause = Params->getRequiresClause()) { -if 
(OmitTemplateKW) - Out << ' '; -Out << "requires "; +Out << " requires "; RequiresClause->printPretty(Out, nullptr, Policy, Indentation, "\n", &Context); -if (!OmitTemplateKW) - Out << ' '; } + + if (!OmitTemplateKW) +Out << ' '; } void DeclPrinter::printTemplateArguments(ArrayRef Args, ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
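The control flow that patch 2/2 settles on can be sketched outside of Clang as follows. This is only an approximation under stated assumptions: `printTemplateHeader` is a hypothetical stand-in for `DeclPrinter::printTemplateParameters`, plain `std::string` arguments replace the real `TemplateParameterList` and `Expr`, and the leading `template <` output is assumed rather than shown in the hunks above.

```cpp
#include <string>

// Hypothetical stand-in for DeclPrinter::printTemplateParameters.
// Params:         the already-formatted parameter list, e.g. "typename T".
// RequiresClause: the already-formatted constraint, empty if there is none.
std::string printTemplateHeader(const std::string &Params,
                                const std::string &RequiresClause,
                                bool OmitTemplateKW) {
  std::string Out;
  if (!OmitTemplateKW)
    Out += "template "; // assumed; not part of the quoted hunks
  Out += '<';
  Out += Params;
  Out += '>';

  // After the review feedback: always emit " requires ..." directly
  // after '>', independent of OmitTemplateKW.
  if (!RequiresClause.empty()) {
    Out += " requires ";
    Out += RequiresClause;
  }

  // The trailing space is emitted once, after the optional clause.
  if (!OmitTemplateKW)
    Out += ' ';
  return Out;
}
```

Compared with patch 1/2, the two `OmitTemplateKW` branches inside the requires block disappear because the leading space is folded into `" requires "` and the trailing space moves below the block.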
[clang] [llvm] [PAC][ELF][AArch64] Encode signed GOT flag in PAuth core info (PR #96159)
kovdan01 wrote: > I'm not at all familiar with this PAuth stuff, but don't you need a test case > for where the new value is set (currently they all seem to be unset, if I'm > interpreting things correctly)? I'm not sure I understood your question correctly - in particular, I'm not sure what the phrase "the new value is set" means. Could you please add a bit more detail to your question? If you are talking about the llvm/test/tools/llvm-readobj/ELF/AArch64/aarch64-feature-pauth.s and llvm/test/CodeGen/AArch64/note-gnu-property-elf-pauthabi.ll tests checking version value 0x55, which does not imply signed GOT enabled: we just can't test 2^8=256 combinations of flags, so we test values which look like 0b10101... But I can add a test for version value 0xAA, which would set the opposite flags compared to 0x55. https://github.com/llvm/llvm-project/pull/96159
[clang] [llvm] [RISCV] Add support for getHostCPUFeatures using hwprobe (PR #94352)
@@ -2002,6 +2003,76 @@ bool sys::getHostCPUFeatures(StringMap &Features) { return true; } +#elif defined(__linux__) && defined(__riscv) +// struct riscv_hwprobe +struct RISCVHwProbe { + int64_t Key; + uint64_t Value; +}; +bool sys::getHostCPUFeatures(StringMap &Features) { + RISCVHwProbe Query[]{{/*RISCV_HWPROBE_KEY_BASE_BEHAVIOR=*/3, 0}, + {/*RISCV_HWPROBE_KEY_IMA_EXT_0=*/4, 0}}; + int Ret = syscall(/*__NR_riscv_hwprobe=*/258, /*pairs=*/Query, dtcxzyw wrote: Currently `sys::getHostCPUFeatures` has three callers: + clang -> `riscv::getRISCVTargetFeatures` + llvm-tools -> `codegen::getFeaturesStr` + JIT users -> `JITTargetMachineBuilder::detectHost` I don't think there are any opportunities to reuse the result. BTW, https://github.com/llvm/llvm-project/pull/85790 may benefit from the vDSO symbol, but it implements caching itself. I didn't use the glibc call `__riscv_hwprobe` since `sys/hwprobe.h` was unavailable on my RV board :( https://github.com/llvm/llvm-project/pull/94352 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [RISCV] Add support for getHostCPUFeatures using hwprobe (PR #94352)
@@ -83,8 +83,14 @@ void riscv::getRISCVTargetFeatures(const Driver &D, const llvm::Triple &Triple, // and other features (ex. mirco architecture feature) from mcpu if (Arg *A = Args.getLastArg(options::OPT_mcpu_EQ)) { StringRef CPU = A->getValue(); -if (CPU == "native") +if (CPU == "native") { CPU = llvm::sys::getHostCPUName(); + llvm::StringMap HostFeatures; + if (llvm::sys::getHostCPUFeatures(HostFeatures)) +for (auto &F : HostFeatures) + Features.push_back( + Args.MakeArgString((F.second ? "+" : "-") + F.first())); +} dtcxzyw wrote: @wangpc-pp @topperc Are there any equivalents of the helper `printMArch`? https://github.com/llvm/llvm-project/blob/ba60d8a11af2cdd7e80e2fd968cdf52adcabf5a1/llvm/utils/TableGen/RISCVTargetDefEmitter.cpp#L90-L123 https://github.com/llvm/llvm-project/pull/94352 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
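The loop added in the hunk above turns each detected host feature into a `+name` or `-name` driver feature string. A minimal sketch of that transformation, written without LLVM headers, might look like this. `FeatureMap` and `toFeatureFlags` are hypothetical names, and `std::map` is substituted for `llvm::StringMap<bool>` purely so the example compiles on its own (note that `std::map` iterates in sorted key order, which `StringMap` does not guarantee).

```cpp
#include <map>
#include <string>
#include <vector>

// Stand-in for llvm::StringMap<bool>; an assumption made so the sketch
// builds without LLVM.
using FeatureMap = std::map<std::string, bool>;

// Convert {feature name -> enabled?} into "+name" / "-name" strings,
// mirroring the loop in the quoted hunk.
std::vector<std::string> toFeatureFlags(const FeatureMap &HostFeatures) {
  std::vector<std::string> Flags;
  for (const auto &F : HostFeatures)
    Flags.push_back((F.second ? "+" : "-") + F.first);
  return Flags;
}
```

In the real driver code the strings are allocated through `Args.MakeArgString` so that they outlive the loop; the sketch sidesteps that by returning owned `std::string` values.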
[clang] 2033b1c - [CodeGen] Don't write coverage to source directory in test
Author: Benjamin Kramer Date: 2024-06-27T11:21:42+02:00 New Revision: 2033b1cf16f040e1369d8efba8439dcd3e36ed31 URL: https://github.com/llvm/llvm-project/commit/2033b1cf16f040e1369d8efba8439dcd3e36ed31 DIFF: https://github.com/llvm/llvm-project/commit/2033b1cf16f040e1369d8efba8439dcd3e36ed31.diff LOG: [CodeGen] Don't write coverage to source directory in test Added: Modified: clang/test/CodeGen/coverage-target-attr.c Removed: diff --git a/clang/test/CodeGen/coverage-target-attr.c b/clang/test/CodeGen/coverage-target-attr.c index 8c8e6ee1c3b69..d46299f5bee22 100644 --- a/clang/test/CodeGen/coverage-target-attr.c +++ b/clang/test/CodeGen/coverage-target-attr.c @@ -1,4 +1,4 @@ -// RUN: %clang_cc1 -emit-llvm -coverage-notes-file=test.gcno -coverage-data-file=test.gcda -triple aarch64-linux-android30 -target-cpu generic -target-feature +tagged-globals -fsanitize=hwaddress %s -o %t +// RUN: %clang_cc1 -emit-llvm -coverage-notes-file=/dev/null -coverage-data-file=/dev/null -triple aarch64-linux-android30 -target-cpu generic -target-feature +tagged-globals -fsanitize=hwaddress %s -o %t // RUN: FileCheck %s < %t // CHECK: define internal void @__llvm_gcov_writeout() unnamed_addr [[ATTR:#[0-9]+]] ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [PAC][ELF][AArch64] Encode signed GOT flag in PAuth core info (PR #96159)
jh7370 wrote: > > I'm not at all familiar with this PAuth stuff, but don't you need a test > > case for where the new value is set (currently they all seem to be unset, > > if I'm interpreting things correctly)? > > @jh7370 I'm not sure if I understood your question correctly - particularly, > I'm not sure what does the phrase "the new value is set" mean. Could you > please add a bit more details in your question? > > If you are talking about > llvm/test/tools/llvm-readobj/ELF/AArch64/aarch64-feature-pauth.s and > llvm/test/CodeGen/AArch64/note-gnu-property-elf-pauthabi.ll tests checking > version value 0x55 which does not imply signed GOT enabled, we just can't > test 2^8=256 combinations of flags, so we test values which look like > 0b10101... But I can add a test for version value 0xAA which would set > opposite flags compared to 0x55. I was referring to this line from the description: > llvm-readobj: print `PointerAuthELFGOT` or `!PointerAuthELFGOT` in version > description of llvm_linux platform depending on whether the flag is set. In my opinion, if you don't test the first of those two cases, you might as well not have implemented behaviour for it. I'd always test "all flags set" and "no flags set" cases (or some variant that effectively tests this, e.g. 0xff and ~0xff). Of course, if it's not practical, that's fine. https://github.com/llvm/llvm-project/pull/96159 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [Clang][AST] Let DeclPrinter print trailing requires expressions for template parameters (PR #96864)
https://github.com/hokein approved this pull request.

thanks, looks good.

https://github.com/llvm/llvm-project/pull/96864
[clang] [llvm] [PAC][ELF][AArch64] Encode signed GOT flag in PAuth core info (PR #96159)
kovdan01 wrote:

> I was referring to this line from the description:
>
> > llvm-readobj: print `PointerAuthELFGOT` or `!PointerAuthELFGOT` in version description of llvm_linux platform depending on whether the flag is set.
>
> In my opinion, if you don't test the first of those two cases, you might as well not have implemented behaviour for it. I'd always test "all flags set" and "no flags set" cases (or some variant that effectively tests this, e.g. 0xff and ~0xff). Of course, if it's not practical, that's fine.
>
> To be clear, I'm not suggesting testing every possible combination of flags, just each flag individually set/not set.

@jh7370 Thanks for explanation! It's a reasonable point, I'll add corresponding test cases, thanks.

https://github.com/llvm/llvm-project/pull/96159
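The coverage argument in this exchange can be sketched in a few lines. This is not code from the PR: the helper name `coversEveryFlagBothWays` is hypothetical, and the 8-bit flag width is taken from the "2^8=256 combinations" remark in the thread. It checks whether a set of test values exercises every flag bit both set and clear, which is what 0x55 combined with 0xAA provides.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of LLVM): returns true if, across all test
// values, each of the low `NumFlags` bits is seen at least once set and at
// least once clear.
bool coversEveryFlagBothWays(const std::vector<std::uint64_t> &Values,
                             unsigned NumFlags) {
  std::uint64_t SeenSet = 0;   // bits observed as 1 in some value
  std::uint64_t SeenClear = 0; // bits observed as 0 in some value
  for (std::uint64_t V : Values) {
    SeenSet |= V;
    SeenClear |= ~V;
  }
  // Mask selecting only the flag bits under test.
  const std::uint64_t Mask = NumFlags >= 64
                                 ? ~std::uint64_t{0}
                                 : ((std::uint64_t{1} << NumFlags) - 1);
  return (SeenSet & Mask) == Mask && (SeenClear & Mask) == Mask;
}
```

With this helper, `{0x55}` alone leaves half of the flags untested in the "set" state, while `{0x55, 0xAA}` or `{0xFF, 0x00}` covers every flag in both states without enumerating all 256 combinations.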
[clang] [flang] [Flang-new][OpenMP] Add offload related flags for AMDGPU (PR #96742)
@@ -333,6 +333,9 @@ void Flang::AddAMDGPUTargetArgs(const ArgList &Args,
     StringRef Val = A->getValue();
     CmdArgs.push_back(Args.MakeArgString("-mcode-object-version=" + Val));
   }
+
+  const ToolChain &TC = getToolChain();
+  TC.addClangTargetOptions(Args, CmdArgs, Action::OffloadKind::OFK_OpenMP);

DominikAdamski wrote:

Hi, thanks for the feedback. I would like to share my observations with you:
1. Clang does not verify how we use these flags and it accepts them for non-GPU target.
2. These flags can be reused by other vendors. For example clang adds `mlink-builtin-bitcode` option for OpenMP Nvidia GPU [as well](https://github.com/llvm/llvm-project/blob/ba60d8a11af2cdd7e80e2fd968cdf52adcabf5a1/clang/test/Driver/openmp-offload-gpu.c#L92).

https://github.com/llvm/llvm-project/pull/96742
[clang] [flang] [Flang-new][OpenMP] Add offload related flags for AMDGPU (PR #96742)
@@ -333,6 +333,9 @@ void Flang::AddAMDGPUTargetArgs(const ArgList &Args,
     StringRef Val = A->getValue();
     CmdArgs.push_back(Args.MakeArgString("-mcode-object-version=" + Val));
   }
+
+  const ToolChain &TC = getToolChain();
+  TC.addClangTargetOptions(Args, CmdArgs, Action::OffloadKind::OFK_OpenMP);

tblah wrote:

Does that mean that this change would also lead to adding these flags when building for Nvidia GPU with flang?

https://github.com/llvm/llvm-project/pull/96742
[clang] [llvm] [OpenMP] OpenMP 5.1 "assume" directive parsing support (PR #92731)
@@ -0,0 +1,31 @@
+// RUN: %clang_cc1 -fopenmp -x c++ -std=c++11 -ast-print %s | FileCheck %s
+// expected-no-diagnostics
+
+extern int bar(int);
+
+int foo(int arg)
+{
+  #pragma omp assume no_openmp_routines
+  {
+    auto fn = [](int x) { return bar(x); };
+// CHECK: auto fn = [](int x) {
+    return fn(5);
+  }
+}
+
+class C {
+public:
+  int foo(int a);
+};
+
+// We're really just checking that this parses. All the assumptions are thrown
+// away immediately for now.
+int C::foo(int a)
+{
+  #pragma omp assume holds(sizeof(T) == 8) absent(parallel)
+  {
+    auto fn = [](int x) { return bar(x); };
+// CHECK: auto fn = [](int x) {
+    return fn(5);
+  }
+}

jtb20 wrote:

Understood! There is indeed a vscode option to add the missing newline (it appears to be turned off by default for some bizarre reason). I'll push a new version with them in.

https://github.com/llvm/llvm-project/pull/92731
[clang] [compiler-rt] [XRay] Add support for instrumentation of DSOs on x86_64 (PR #90959)
androm3da wrote:

> @androm3da @MaskRay I'm tagging you because I'm having trouble to get feedback to this PR, and you seem to be the most recent contributors to XRay. Would one of you be willing to review it? Any other pointers on who to get in touch with are also much appreciated.

I'm happy to take a look - but I'm traveling this week and won't be able to until this weekend.

https://github.com/llvm/llvm-project/pull/90959
[clang] [clang][analyzer] Improve documentation of checker 'cplusplus.Move' (NFC) (PR #96295)
https://github.com/balazske updated https://github.com/llvm/llvm-project/pull/96295 From 0c57ad1ca36a841dff700eb98f878475e0243b88 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Bal=C3=A1zs=20K=C3=A9ri?= Date: Fri, 21 Jun 2024 12:13:02 +0200 Subject: [PATCH 1/3] [clang][analyzer] Improve documentation of checker 'cplusplus.Move' (NFC) --- clang/docs/analyzer/checkers.rst | 39 +-- .../clang/StaticAnalyzer/Checkers/Checkers.td | 21 +++--- 2 files changed, 40 insertions(+), 20 deletions(-) diff --git a/clang/docs/analyzer/checkers.rst b/clang/docs/analyzer/checkers.rst index b8d5f372bdf61..445f434e1e6ce 100644 --- a/clang/docs/analyzer/checkers.rst +++ b/clang/docs/analyzer/checkers.rst @@ -420,21 +420,52 @@ around, such as ``std::string_view``. cplusplus.Move (C++) -Method calls on a moved-from object and copying a moved-from object will be reported. - +Find use-after-move bugs in C++. This includes method calls on moved-from +objects, assignment of a moved-from object, and repeated move of a moved-from +object. .. code-block:: cpp - struct A { + struct A { void foo() {} }; - void f() { + void f1() { A a; A b = std::move(a); // note: 'a' became 'moved-from' here a.foo();// warn: method call on a 'moved-from' object 'a' } + void f2() { + A a; + A b = std::move(a); + A c(std::move(a)); // warn: move of an already moved-from object + } + + void f3() { + A a; + A b = std::move(a); + b = a; // warn: copy of moved-from object + } + +The checker option ``WarnOn`` controls on what objects the use-after-move is +checked. The most strict value is ``KnownsOnly``, in this mode only objects are +checked whose type is known to be move-unsafe. These include most STL objects +(but excluding move-safe ones) and smart pointers. With option value +``KnownsAndLocals`` local variables (of any type) are additionally checked. The +idea behind this is that local variables are usually not tempting to be re-used +so an use after move is more likely a bug than with member variables. 
With +option value ``All`` any use-after move condition is checked on all kinds of +variables, excluding global variables and known move-safe cases. Default value +is ``KnownsAndLocals``. + +Call of methods named ``empty()`` or ``isEmpty()`` are allowed on moved-from +objects because these methods are considered as move-safe. Functions called +``reset()``, ``destroy()``, ``clear()``, ``assign``, ``resize``, ``shrink`` are +treated as state-reset functions and are allowed on moved-from objects, these +make the object valid again. This applies to any type of object (not only STL +ones). + .. _cplusplus-NewDelete: cplusplus.NewDelete (C++) diff --git a/clang/include/clang/StaticAnalyzer/Checkers/Checkers.td b/clang/include/clang/StaticAnalyzer/Checkers/Checkers.td index 429c334a0b24b..6e224a4e098ad 100644 --- a/clang/include/clang/StaticAnalyzer/Checkers/Checkers.td +++ b/clang/include/clang/StaticAnalyzer/Checkers/Checkers.td @@ -686,22 +686,11 @@ def MoveChecker: Checker<"Move">, CheckerOptions<[ CmdLineOption ]>, From 866655581a1e1f0779542737a3f9d427a8d067b6 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Bal=C3=A1zs=20K=C3=A9ri?= Date: Fri, 21 Jun 2024 16:35:09 +0200 Subject: [PATCH 2/3] using bullet point list for option values --- clang/docs/analyzer/checkers.rst | 26 +++--- 1 file changed, 15 insertions(+), 11 deletions(-) diff --git a/clang/docs/analyzer/checkers.rst b/clang/docs/analyzer/checkers.rst index 445f434e1e6ce..42c097d973d53 100644 --- a/clang/docs/analyzer/checkers.rst +++ b/clang/docs/analyzer/checkers.rst @@ -449,17 +449,21 @@ object. } The checker option ``WarnOn`` controls on what objects the use-after-move is -checked. The most strict value is ``KnownsOnly``, in this mode only objects are -checked whose type is known to be move-unsafe. These include most STL objects -(but excluding move-safe ones) and smart pointers. With option value -``KnownsAndLocals`` local variables (of any type) are additionally checked. 
The -idea behind this is that local variables are usually not tempting to be re-used -so an use after move is more likely a bug than with member variables. With -option value ``All`` any use-after move condition is checked on all kinds of -variables, excluding global variables and known move-safe cases. Default value -is ``KnownsAndLocals``. - -Call of methods named ``empty()`` or ``isEmpty()`` are allowed on moved-from +checked: + +* The most strict value is ``KnownsOnly``, in this mode only objects are + checked whose type is known to be move-unsafe. These include most STL objects + (but excluding move-safe ones) and smart pointers. +* With option value ``KnownsAndLocals`` local variables (of any type) are + additionally checked. The idea behind this is that local variables are + usually not tempting to be re-used so an use after move is more likely a bug + than with
[clang] [llvm] [RISCV] Add support for getHostCPUFeatures using hwprobe (PR #94352)
@@ -83,8 +83,14 @@ void riscv::getRISCVTargetFeatures(const Driver &D, const llvm::Triple &Triple,
   // and other features (ex. mirco architecture feature) from mcpu
   if (Arg *A = Args.getLastArg(options::OPT_mcpu_EQ)) {
     StringRef CPU = A->getValue();
-    if (CPU == "native")
+    if (CPU == "native") {
       CPU = llvm::sys::getHostCPUName();
+      llvm::StringMap<bool> HostFeatures;
+      if (llvm::sys::getHostCPUFeatures(HostFeatures))
+        for (auto &F : HostFeatures)
+          Features.push_back(
+              Args.MakeArgString((F.second ? "+" : "-") + F.first()));
+    }

wangpc-pp wrote:

> @wangpc-pp @topperc Are there any equivalents of the helper `printMArch`?

No, I think there isn't. You may need to write a helper via `RISCVISAInfo::parseFeatures` and `RISCVISAInfo::toString()`.

https://github.com/llvm/llvm-project/pull/94352
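The loop in the hunk above turns a map of host features into `+name` / `-name` strings. A minimal standalone sketch of that transformation follows; it deliberately uses `std::map` as a stand-in for the LLVM string map, and the names `FeatureMap` and `toFeatureFlags` are hypothetical, so it only mirrors the shape of the logic rather than the driver code itself.

```cpp
#include <map>
#include <string>
#include <vector>

// Stand-in for the LLVM string map used in the hunk: feature name -> enabled?
using FeatureMap = std::map<std::string, bool>;

// Hypothetical helper mirroring the loop in the hunk: each host feature
// becomes "+name" if it is enabled and "-name" if it is disabled.
std::vector<std::string> toFeatureFlags(const FeatureMap &HostFeatures) {
  std::vector<std::string> Flags;
  Flags.reserve(HostFeatures.size());
  for (const auto &F : HostFeatures)
    Flags.push_back((F.second ? "+" : "-") + F.first);
  return Flags;
}
```

Because `std::map` iterates in key order, the output order here is alphabetical; the real driver code makes no such promise.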
[clang] [flang] [Flang-new][OpenMP] Add offload related flags for AMDGPU (PR #96742)
DominikAdamski wrote:

> > fcuda-is-device flag is not used by Flang currently. In the future it will be needed for Flang equivalent functions:
> > AMDGPUTargetCodeGenInfo::getGlobalVarAddressSpace
> > AMDGPUTargetInfo::getTargetDefines .
>
> I don't follow - why would anything related to CUDA be relevant here?

Clang for AMDGPU supports OpenMP and [HIP](https://clang.llvm.org/docs/HIPSupport.html) and it reuses the same code. For example `-fcuda-is-device` flag needs to be checked for [legacy HIP host code](https://github.com/llvm/llvm-project/blob/2033b1cf16f040e1369d8efba8439dcd3e36ed31/clang/lib/Basic/Targets/AMDGPU.cpp#L278). I would like to reuse the same part of the AMD GPU toolchain for Flang.

https://github.com/llvm/llvm-project/pull/96742
[clang] [llvm] [AArch64][NEON] Add intrinsics for LUTI (PR #96883)
https://github.com/Lukacma created https://github.com/llvm/llvm-project/pull/96883 This patch adds intrinsics for NEON LUTI2 and LUTI4 instructions as specified in the [ACLE proposal](https://github.com/ARM-software/acle/pull/324) >From cb2ebe232013576f57f8f26b9156fccd75d7d38f Mon Sep 17 00:00:00 2001 From: Marian Lukac Date: Thu, 27 Jun 2024 09:38:17 + Subject: [PATCH] [AArch64][NEON] Add intrinsics for LUTI --- clang/include/clang/Basic/arm_neon.td | 16 + clang/lib/CodeGen/CGBuiltin.cpp | 54 +++ clang/test/CodeGen/aarch64-neon-luti.c| 433 ++ llvm/include/llvm/IR/IntrinsicsAArch64.td | 19 + .../lib/Target/AArch64/AArch64InstrFormats.td | 14 +- llvm/lib/Target/AArch64/AArch64InstrInfo.td | 70 +++ llvm/test/CodeGen/AArch64/neon-luti.ll| 207 + 7 files changed, 806 insertions(+), 7 deletions(-) create mode 100644 clang/test/CodeGen/aarch64-neon-luti.c create mode 100644 llvm/test/CodeGen/AArch64/neon-luti.ll diff --git a/clang/include/clang/Basic/arm_neon.td b/clang/include/clang/Basic/arm_neon.td index 6390ba3f9fe5e..0dd76ce32fc20 100644 --- a/clang/include/clang/Basic/arm_neon.td +++ b/clang/include/clang/Basic/arm_neon.td @@ -2096,3 +2096,19 @@ let ArchGuard = "defined(__aarch64__) || defined(__arm64ec__)", TargetGuard = "r def VLDAP1_LANE : WInst<"vldap1_lane", ".(c*!).I", "QUlQlUlldQdPlQPl">; def VSTL1_LANE : WInst<"vstl1_lane", "v*(.!)I", "QUlQlUlldQdPlQPl">; } + +//Lookup table read with 2-bit/4-bit indices +let ArchGuard = "defined(__aarch64__)", TargetGuard = "lut" in { + def VLUTI2_B: SInst<"vluti2_lane","Q.(qU)I", "cUcPcQcQUcQPc">; + def VLUTI2_B_Q : SInst<"vluti2_laneq", "Q.(QU)I", "cUcPcQcQUcQPc">; + def VLUTI2_H: SInst<"vluti2_lane","Q.(qU<)I", "sUsPshQsQUsQPsQh">; + def VLUTI2_H_Q : SInst<"vluti2_laneq", "Q.(QU<)I", "sUsPshQsQUsQPsQh">; + def VLUTI4_B: SInst<"vluti4_laneq","..UI", "QcQUcQPc">; + def VLUTI4_H_X2 : SInst<"vluti4_laneq_x2", ".2(U<)I", "QsQUsQPsQh">; + + let ArchGuard = "defined(__aarch64__)", TargetGuard= "lut,bf16" in { +def VLUTI2_BF : 
SInst<"vluti2_lane", "Q.(qU<)I", "bQb">; +def VLUTI2_BF_Q: SInst<"vluti2_laneq","Q.(QU<)I", "bQb">; +def VLUTI4_BF_X2 : SInst<"vluti4_laneq_x2", ".2(U<)I", "Qb">; + } +} diff --git a/clang/lib/CodeGen/CGBuiltin.cpp b/clang/lib/CodeGen/CGBuiltin.cpp index 511e1fd4016d7..f9ac6c9dc8504 100644 --- a/clang/lib/CodeGen/CGBuiltin.cpp +++ b/clang/lib/CodeGen/CGBuiltin.cpp @@ -13357,6 +13357,60 @@ Value *CodeGenFunction::EmitAArch64BuiltinExpr(unsigned BuiltinID, Int = Intrinsic::aarch64_neon_suqadd; return EmitNeonCall(CGM.getIntrinsic(Int, Ty), Ops, "vuqadd"); } + + case NEON::BI__builtin_neon_vluti2_lane_bf16: + case NEON::BI__builtin_neon_vluti2_lane_f16: + case NEON::BI__builtin_neon_vluti2_lane_p16: + case NEON::BI__builtin_neon_vluti2_lane_p8: + case NEON::BI__builtin_neon_vluti2_lane_s16: + case NEON::BI__builtin_neon_vluti2_lane_s8: + case NEON::BI__builtin_neon_vluti2_lane_u16: + case NEON::BI__builtin_neon_vluti2_lane_u8: + case NEON::BI__builtin_neon_vluti2_laneq_bf16: + case NEON::BI__builtin_neon_vluti2_laneq_f16: + case NEON::BI__builtin_neon_vluti2_laneq_p16: + case NEON::BI__builtin_neon_vluti2_laneq_p8: + case NEON::BI__builtin_neon_vluti2_laneq_s16: + case NEON::BI__builtin_neon_vluti2_laneq_s8: + case NEON::BI__builtin_neon_vluti2_laneq_u16: + case NEON::BI__builtin_neon_vluti2_laneq_u8: + case NEON::BI__builtin_neon_vluti2q_lane_bf16: + case NEON::BI__builtin_neon_vluti2q_lane_f16: + case NEON::BI__builtin_neon_vluti2q_lane_p16: + case NEON::BI__builtin_neon_vluti2q_lane_p8: + case NEON::BI__builtin_neon_vluti2q_lane_s16: + case NEON::BI__builtin_neon_vluti2q_lane_s8: + case NEON::BI__builtin_neon_vluti2q_lane_u16: + case NEON::BI__builtin_neon_vluti2q_lane_u8: + case NEON::BI__builtin_neon_vluti2q_laneq_bf16: + case NEON::BI__builtin_neon_vluti2q_laneq_f16: + case NEON::BI__builtin_neon_vluti2q_laneq_p16: + case NEON::BI__builtin_neon_vluti2q_laneq_p8: + case NEON::BI__builtin_neon_vluti2q_laneq_s16: + case NEON::BI__builtin_neon_vluti2q_laneq_s8: + 
case NEON::BI__builtin_neon_vluti2q_laneq_u16: + case NEON::BI__builtin_neon_vluti2q_laneq_u8: { +Int = Intrinsic::aarch64_neon_vluti2_lane; +llvm::Type *Tys[3]; +Tys[0] = Ty; +Tys[1] = Ops[0]->getType(); +Tys[2] = Ops[1]->getType(); +return EmitNeonCall(CGM.getIntrinsic(Int, Tys), Ops, "vluti2_lane"); + } + case NEON::BI__builtin_neon_vluti4q_laneq_p8: + case NEON::BI__builtin_neon_vluti4q_laneq_s8: + case NEON::BI__builtin_neon_vluti4q_laneq_u8: { +Int = Intrinsic::aarch64_neon_vluti4q_laneq; +return EmitNeonCall(CGM.getIntrinsic(Int, Ty), Ops, "vluti4q_laneq"); + } + case NEON::BI__builtin_neon_vluti4q_laneq_bf16_x2: + case NEON::BI__builtin_neon_vluti4q_laneq_f16_x2: + case NEON::BI__builtin_neon_vluti4q_l
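For readers unfamiliar with the phrase "lookup table read with 2-bit indices" used in the patch, here is a conceptual scalar model. It is an assumption-laden sketch, not the semantics of the LUTI2 instruction or of the ACLE intrinsics: the function name `lookup2Bit`, the byte-sized elements, and the low-bits-first index order are all chosen for illustration only.

```cpp
#include <array>
#include <cstdint>
#include <vector>

// Conceptual scalar model only. Each input byte packs four 2-bit indices
// (assumed order: least significant bits first); every index selects one of
// the four table entries.
std::vector<std::uint8_t> lookup2Bit(const std::array<std::uint8_t, 4> &Table,
                                     const std::vector<std::uint8_t> &Packed) {
  std::vector<std::uint8_t> Out;
  Out.reserve(Packed.size() * 4);
  for (std::uint8_t Byte : Packed)
    for (unsigned I = 0; I < 4; ++I)
      Out.push_back(Table[(Byte >> (2 * I)) & 0x3]);
  return Out;
}
```

The 4-bit variant would follow the same idea with a 16-entry table and two indices per byte.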
[clang] [llvm] [AArch64][NEON] Add intrinsics for LUTI (PR #96883)
llvmbot wrote: @llvm/pr-subscribers-clang Author: None (Lukacma) Changes This patch adds intrinsics for NEON LUTI2 and LUTI4 instructions as specified in the [ACLE proposal](https://github.com/ARM-software/acle/pull/324) --- Patch is 45.96 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/96883.diff 7 Files Affected: - (modified) clang/include/clang/Basic/arm_neon.td (+16) - (modified) clang/lib/CodeGen/CGBuiltin.cpp (+54) - (added) clang/test/CodeGen/aarch64-neon-luti.c (+433) - (modified) llvm/include/llvm/IR/IntrinsicsAArch64.td (+19) - (modified) llvm/lib/Target/AArch64/AArch64InstrFormats.td (+7-7) - (modified) llvm/lib/Target/AArch64/AArch64InstrInfo.td (+70) - (added) llvm/test/CodeGen/AArch64/neon-luti.ll (+207) ``diff diff --git a/clang/include/clang/Basic/arm_neon.td b/clang/include/clang/Basic/arm_neon.td index 6390ba3f9fe5e..0dd76ce32fc20 100644 --- a/clang/include/clang/Basic/arm_neon.td +++ b/clang/include/clang/Basic/arm_neon.td @@ -2096,3 +2096,19 @@ let ArchGuard = "defined(__aarch64__) || defined(__arm64ec__)", TargetGuard = "r def VLDAP1_LANE : WInst<"vldap1_lane", ".(c*!).I", "QUlQlUlldQdPlQPl">; def VSTL1_LANE : WInst<"vstl1_lane", "v*(.!)I", "QUlQlUlldQdPlQPl">; } + +//Lookup table read with 2-bit/4-bit indices +let ArchGuard = "defined(__aarch64__)", TargetGuard = "lut" in { + def VLUTI2_B: SInst<"vluti2_lane","Q.(qU)I", "cUcPcQcQUcQPc">; + def VLUTI2_B_Q : SInst<"vluti2_laneq", "Q.(QU)I", "cUcPcQcQUcQPc">; + def VLUTI2_H: SInst<"vluti2_lane","Q.(qU<)I", "sUsPshQsQUsQPsQh">; + def VLUTI2_H_Q : SInst<"vluti2_laneq", "Q.(QU<)I", "sUsPshQsQUsQPsQh">; + def VLUTI4_B: SInst<"vluti4_laneq","..UI", "QcQUcQPc">; + def VLUTI4_H_X2 : SInst<"vluti4_laneq_x2", ".2(U<)I", "QsQUsQPsQh">; + + let ArchGuard = "defined(__aarch64__)", TargetGuard= "lut,bf16" in { +def VLUTI2_BF : SInst<"vluti2_lane", "Q.(qU<)I", "bQb">; +def VLUTI2_BF_Q: SInst<"vluti2_laneq","Q.(QU<)I", "bQb">; +def VLUTI4_BF_X2 : 
SInst<"vluti4_laneq_x2", ".2(U<)I", "Qb">; + } +} diff --git a/clang/lib/CodeGen/CGBuiltin.cpp b/clang/lib/CodeGen/CGBuiltin.cpp index 511e1fd4016d7..f9ac6c9dc8504 100644 --- a/clang/lib/CodeGen/CGBuiltin.cpp +++ b/clang/lib/CodeGen/CGBuiltin.cpp @@ -13357,6 +13357,60 @@ Value *CodeGenFunction::EmitAArch64BuiltinExpr(unsigned BuiltinID, Int = Intrinsic::aarch64_neon_suqadd; return EmitNeonCall(CGM.getIntrinsic(Int, Ty), Ops, "vuqadd"); } + + case NEON::BI__builtin_neon_vluti2_lane_bf16: + case NEON::BI__builtin_neon_vluti2_lane_f16: + case NEON::BI__builtin_neon_vluti2_lane_p16: + case NEON::BI__builtin_neon_vluti2_lane_p8: + case NEON::BI__builtin_neon_vluti2_lane_s16: + case NEON::BI__builtin_neon_vluti2_lane_s8: + case NEON::BI__builtin_neon_vluti2_lane_u16: + case NEON::BI__builtin_neon_vluti2_lane_u8: + case NEON::BI__builtin_neon_vluti2_laneq_bf16: + case NEON::BI__builtin_neon_vluti2_laneq_f16: + case NEON::BI__builtin_neon_vluti2_laneq_p16: + case NEON::BI__builtin_neon_vluti2_laneq_p8: + case NEON::BI__builtin_neon_vluti2_laneq_s16: + case NEON::BI__builtin_neon_vluti2_laneq_s8: + case NEON::BI__builtin_neon_vluti2_laneq_u16: + case NEON::BI__builtin_neon_vluti2_laneq_u8: + case NEON::BI__builtin_neon_vluti2q_lane_bf16: + case NEON::BI__builtin_neon_vluti2q_lane_f16: + case NEON::BI__builtin_neon_vluti2q_lane_p16: + case NEON::BI__builtin_neon_vluti2q_lane_p8: + case NEON::BI__builtin_neon_vluti2q_lane_s16: + case NEON::BI__builtin_neon_vluti2q_lane_s8: + case NEON::BI__builtin_neon_vluti2q_lane_u16: + case NEON::BI__builtin_neon_vluti2q_lane_u8: + case NEON::BI__builtin_neon_vluti2q_laneq_bf16: + case NEON::BI__builtin_neon_vluti2q_laneq_f16: + case NEON::BI__builtin_neon_vluti2q_laneq_p16: + case NEON::BI__builtin_neon_vluti2q_laneq_p8: + case NEON::BI__builtin_neon_vluti2q_laneq_s16: + case NEON::BI__builtin_neon_vluti2q_laneq_s8: + case NEON::BI__builtin_neon_vluti2q_laneq_u16: + case NEON::BI__builtin_neon_vluti2q_laneq_u8: { +Int = 
Intrinsic::aarch64_neon_vluti2_lane; +llvm::Type *Tys[3]; +Tys[0] = Ty; +Tys[1] = Ops[0]->getType(); +Tys[2] = Ops[1]->getType(); +return EmitNeonCall(CGM.getIntrinsic(Int, Tys), Ops, "vluti2_lane"); + } + case NEON::BI__builtin_neon_vluti4q_laneq_p8: + case NEON::BI__builtin_neon_vluti4q_laneq_s8: + case NEON::BI__builtin_neon_vluti4q_laneq_u8: { +Int = Intrinsic::aarch64_neon_vluti4q_laneq; +return EmitNeonCall(CGM.getIntrinsic(Int, Ty), Ops, "vluti4q_laneq"); + } + case NEON::BI__builtin_neon_vluti4q_laneq_bf16_x2: + case NEON::BI__builtin_neon_vluti4q_laneq_f16_x2: + case NEON::BI__builtin_neon_vluti4q_laneq_p16_x2: + case NEON::BI__builtin_neon_vluti4q_laneq_s16_x2: + case NEON::BI__builtin_neon_vluti4q_laneq_u16_x2: { +Int = Intrinsic::aarch64_neon_vluti4q_laneq_x2; +return EmitNeonCall(CGM.
[clang] [llvm] [AArch64][NEON] Add intrinsics for LUTI (PR #96883)
llvmbot wrote: @llvm/pr-subscribers-backend-aarch64 Author: None (Lukacma) Changes This patch adds intrinsics for NEON LUTI2 and LUTI4 instructions as specified in the [ACLE proposal](https://github.com/ARM-software/acle/pull/324) --- Patch is 45.96 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/96883.diff 7 Files Affected: - (modified) clang/include/clang/Basic/arm_neon.td (+16) - (modified) clang/lib/CodeGen/CGBuiltin.cpp (+54) - (added) clang/test/CodeGen/aarch64-neon-luti.c (+433) - (modified) llvm/include/llvm/IR/IntrinsicsAArch64.td (+19) - (modified) llvm/lib/Target/AArch64/AArch64InstrFormats.td (+7-7) - (modified) llvm/lib/Target/AArch64/AArch64InstrInfo.td (+70) - (added) llvm/test/CodeGen/AArch64/neon-luti.ll (+207) ``diff diff --git a/clang/include/clang/Basic/arm_neon.td b/clang/include/clang/Basic/arm_neon.td index 6390ba3f9fe5e..0dd76ce32fc20 100644 --- a/clang/include/clang/Basic/arm_neon.td +++ b/clang/include/clang/Basic/arm_neon.td @@ -2096,3 +2096,19 @@ let ArchGuard = "defined(__aarch64__) || defined(__arm64ec__)", TargetGuard = "r def VLDAP1_LANE : WInst<"vldap1_lane", ".(c*!).I", "QUlQlUlldQdPlQPl">; def VSTL1_LANE : WInst<"vstl1_lane", "v*(.!)I", "QUlQlUlldQdPlQPl">; } + +//Lookup table read with 2-bit/4-bit indices +let ArchGuard = "defined(__aarch64__)", TargetGuard = "lut" in { + def VLUTI2_B: SInst<"vluti2_lane","Q.(qU)I", "cUcPcQcQUcQPc">; + def VLUTI2_B_Q : SInst<"vluti2_laneq", "Q.(QU)I", "cUcPcQcQUcQPc">; + def VLUTI2_H: SInst<"vluti2_lane","Q.(qU<)I", "sUsPshQsQUsQPsQh">; + def VLUTI2_H_Q : SInst<"vluti2_laneq", "Q.(QU<)I", "sUsPshQsQUsQPsQh">; + def VLUTI4_B: SInst<"vluti4_laneq","..UI", "QcQUcQPc">; + def VLUTI4_H_X2 : SInst<"vluti4_laneq_x2", ".2(U<)I", "QsQUsQPsQh">; + + let ArchGuard = "defined(__aarch64__)", TargetGuard= "lut,bf16" in { +def VLUTI2_BF : SInst<"vluti2_lane", "Q.(qU<)I", "bQb">; +def VLUTI2_BF_Q: SInst<"vluti2_laneq","Q.(QU<)I", "bQb">; +def VLUTI4_BF_X2 : 
SInst<"vluti4_laneq_x2", ".2(U<)I", "Qb">; + } +} diff --git a/clang/lib/CodeGen/CGBuiltin.cpp b/clang/lib/CodeGen/CGBuiltin.cpp index 511e1fd4016d7..f9ac6c9dc8504 100644 --- a/clang/lib/CodeGen/CGBuiltin.cpp +++ b/clang/lib/CodeGen/CGBuiltin.cpp @@ -13357,6 +13357,60 @@ Value *CodeGenFunction::EmitAArch64BuiltinExpr(unsigned BuiltinID, Int = Intrinsic::aarch64_neon_suqadd; return EmitNeonCall(CGM.getIntrinsic(Int, Ty), Ops, "vuqadd"); } + + case NEON::BI__builtin_neon_vluti2_lane_bf16: + case NEON::BI__builtin_neon_vluti2_lane_f16: + case NEON::BI__builtin_neon_vluti2_lane_p16: + case NEON::BI__builtin_neon_vluti2_lane_p8: + case NEON::BI__builtin_neon_vluti2_lane_s16: + case NEON::BI__builtin_neon_vluti2_lane_s8: + case NEON::BI__builtin_neon_vluti2_lane_u16: + case NEON::BI__builtin_neon_vluti2_lane_u8: + case NEON::BI__builtin_neon_vluti2_laneq_bf16: + case NEON::BI__builtin_neon_vluti2_laneq_f16: + case NEON::BI__builtin_neon_vluti2_laneq_p16: + case NEON::BI__builtin_neon_vluti2_laneq_p8: + case NEON::BI__builtin_neon_vluti2_laneq_s16: + case NEON::BI__builtin_neon_vluti2_laneq_s8: + case NEON::BI__builtin_neon_vluti2_laneq_u16: + case NEON::BI__builtin_neon_vluti2_laneq_u8: + case NEON::BI__builtin_neon_vluti2q_lane_bf16: + case NEON::BI__builtin_neon_vluti2q_lane_f16: + case NEON::BI__builtin_neon_vluti2q_lane_p16: + case NEON::BI__builtin_neon_vluti2q_lane_p8: + case NEON::BI__builtin_neon_vluti2q_lane_s16: + case NEON::BI__builtin_neon_vluti2q_lane_s8: + case NEON::BI__builtin_neon_vluti2q_lane_u16: + case NEON::BI__builtin_neon_vluti2q_lane_u8: + case NEON::BI__builtin_neon_vluti2q_laneq_bf16: + case NEON::BI__builtin_neon_vluti2q_laneq_f16: + case NEON::BI__builtin_neon_vluti2q_laneq_p16: + case NEON::BI__builtin_neon_vluti2q_laneq_p8: + case NEON::BI__builtin_neon_vluti2q_laneq_s16: + case NEON::BI__builtin_neon_vluti2q_laneq_s8: + case NEON::BI__builtin_neon_vluti2q_laneq_u16: + case NEON::BI__builtin_neon_vluti2q_laneq_u8: { +Int = 
Intrinsic::aarch64_neon_vluti2_lane; +llvm::Type *Tys[3]; +Tys[0] = Ty; +Tys[1] = Ops[0]->getType(); +Tys[2] = Ops[1]->getType(); +return EmitNeonCall(CGM.getIntrinsic(Int, Tys), Ops, "vluti2_lane"); + } + case NEON::BI__builtin_neon_vluti4q_laneq_p8: + case NEON::BI__builtin_neon_vluti4q_laneq_s8: + case NEON::BI__builtin_neon_vluti4q_laneq_u8: { +Int = Intrinsic::aarch64_neon_vluti4q_laneq; +return EmitNeonCall(CGM.getIntrinsic(Int, Ty), Ops, "vluti4q_laneq"); + } + case NEON::BI__builtin_neon_vluti4q_laneq_bf16_x2: + case NEON::BI__builtin_neon_vluti4q_laneq_f16_x2: + case NEON::BI__builtin_neon_vluti4q_laneq_p16_x2: + case NEON::BI__builtin_neon_vluti4q_laneq_s16_x2: + case NEON::BI__builtin_neon_vluti4q_laneq_u16_x2: { +Int = Intrinsic::aarch64_neon_vluti4q_laneq_x2; +return EmitNeo
[clang] [llvm] [OpenMP] OpenMP 5.1 "assume" directive parsing support (PR #92731)
jtb20 wrote:

> > > don't you need more code in AST?
> >
> > Sorry, I don't quite understand the question! Could you elaborate a little please?
>
> I was thinking maybe you need changes in AST related files, like `ASTWriter.cpp`, but that might be not needed as this is adding a new directive.

At the moment, since the "assume" directive is parsed but then immediately discarded, I don't think anything else is needed. Actually the existing "assumes" support is reused -- the bit in SemaOpenMP.cpp adds the "assume" assumptions to the OMPAssumeScoped "stack". For "assumes", it's done like that so (e.g. top-level) declarations between begin/end "assumes" can be modified according to the assumptions on that stack. For "assume", once some use is made for the assumptions, that might turn out to not be the most useful representation. That can be revisited later though, I think.

https://github.com/llvm/llvm-project/pull/92731
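The difference between the two directive forms mentioned in this reply can be sketched as follows. The function names are hypothetical, and the example assumes a compiler that either accepts the OpenMP 5.1 `assume` directive when OpenMP is enabled or ignores the pragmas when it is not; it says nothing about how Clang represents the assumptions internally.

```cpp
// Helper used inside the assumed regions; plain C++, no OpenMP runtime calls.
static int twice(int X) { return 2 * X; }

// Declaration-scoped form: the assumption applies to the declarations
// between the begin/end pair.
#pragma omp begin assumes no_openmp_routines
int scopedAssumes(int X) { return twice(X) + 1; }
#pragma omp end assumes

// Block-scoped form: the assumption applies only to the structured block
// that follows the directive.
int blockAssume(int X) {
  int Result = 0;
#pragma omp assume no_openmp_routines
  { Result = twice(X) + 1; }
  return Result;
}
```

Both functions compute the same value; the pragmas only convey information to the compiler and do not change observable behaviour.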
[clang] [llvm] [CLANG][LLVM][AArch64]Add SME2.1 intrinsics for MOVAZ tile to vector,… (PR #88499)
https://github.com/CarolineConcatto updated https://github.com/llvm/llvm-project/pull/88499 >From a4d4a0ff71f5086c9fdf43e332b9752074eb42dc Mon Sep 17 00:00:00 2001 From: Caroline Concatto Date: Thu, 11 Apr 2024 16:10:16 + Subject: [PATCH 1/3] [CLANG][LLVM][AArch64]Add SME2.1 intrinsics for MOVAZ tile to vector, single According to the specification in ARM-software/acle#309 this adds the intrinsics // And similarly for u8. svint8_t svreadz_hor_za8_s8(uint64_t tile, uint32_t slice) __arm_streaming __arm_inout("za"); // And similarly for u16, bf16 and f16. svint16_t svreadz_hor_za16_s16(uint64_t tile, uint32_t slice) __arm_streaming __arm_inout("za"); // And similarly for u32 and f32. svint32_t svreadz_hor_za32_s32(uint64_t tile, uint32_t slice) __arm_streaming __arm_inout("za"); // And similarly for u64 and f64. svint64_t svreadz_hor_za64_s64(uint64_t tile, uint32_t slice) __arm_streaming __arm_inout("za"); // And similarly for s16, s32, s64, u8, u16, u32, u64, bf16, f16, f32, f64 svint8_t svreadz_hor_za128_s8(uint64_t tile, uint32_t slice) __arm_streaming __arm_inout("za"); --- clang/include/clang/Basic/arm_sme.td | 18 + .../acle_sme2p1_movaz.c | 410 .../acle_sme2p1_imm.cpp | 21 + llvm/include/llvm/IR/IntrinsicsAArch64.td | 12 +- .../Target/AArch64/AArch64ISelLowering.cpp| 37 ++ llvm/lib/Target/AArch64/AArch64ISelLowering.h | 3 + .../lib/Target/AArch64/AArch64SMEInstrInfo.td | 3 +- llvm/lib/Target/AArch64/SMEInstrFormats.td| 93 +++- .../AArch64/sme2p1-intrinsics-movaz.ll| 445 +- 9 files changed, 1021 insertions(+), 21 deletions(-) create mode 100644 clang/test/Sema/aarch64-sme2p1-intrinsics/acle_sme2p1_imm.cpp diff --git a/clang/include/clang/Basic/arm_sme.td b/clang/include/clang/Basic/arm_sme.td index 5f757b40e8fd9..a5677802193af 100644 --- a/clang/include/clang/Basic/arm_sme.td +++ b/clang/include/clang/Basic/arm_sme.td @@ -787,4 +787,22 @@ defm SVREADZ_ZA16_X4 : ZAReadz<"za16", "4", "sUshb", "aarch64_sme_readz", [ImmCh defm SVREADZ_ZA32_X4 : ZAReadz<"za32", "4", 
"iUif", "aarch64_sme_readz", [ImmCheck<0, ImmCheck0_3>]>; defm SVREADZ_ZA64_X4 : ZAReadz<"za64", "4", "lUld", "aarch64_sme_readz", [ImmCheck<0, ImmCheck0_7>]>; + +multiclass ZAReadz ch> { + let SMETargetGuard = "sme2p1" in { +def NAME # _H : SInst<"svreadz_hor_" # n_suffix # "_{d}", "dim", t, + MergeNone, i_prefix # "_horiz", + [IsStreaming, IsInOutZA], ch>; + +def NAME # _V : SInst<"svreadz_ver_" # n_suffix # "_{d}", "dim", t, + MergeNone, i_prefix # "_vert", + [IsStreaming, IsInOutZA], ch>; + } +} + +defm SVREADZ_ZA8 : ZAReadz<"za8", "cUc", "aarch64_sme_readz", [ImmCheck<0, ImmCheck0_0>]>; +defm SVREADZ_ZA16 : ZAReadz<"za16", "sUshb", "aarch64_sme_readz", [ImmCheck<0, ImmCheck0_1>]>; +defm SVREADZ_ZA32 : ZAReadz<"za32", "iUif", "aarch64_sme_readz", [ImmCheck<0, ImmCheck0_3>]>; +defm SVREADZ_ZA64 : ZAReadz<"za64", "lUld", "aarch64_sme_readz", [ImmCheck<0, ImmCheck0_7>]>; +defm SVREADZ_ZA128 : ZAReadz<"za128", "csilUcUiUsUlbhfd", "aarch64_sme_readz_q", [ImmCheck<0, ImmCheck0_15>]>; } // let SVETargetGuard = InvalidMode diff --git a/clang/test/CodeGen/aarch64-sme2p1-intrinsics/acle_sme2p1_movaz.c b/clang/test/CodeGen/aarch64-sme2p1-intrinsics/acle_sme2p1_movaz.c index d0c7230ade761..7c9067a5ceece 100644 --- a/clang/test/CodeGen/aarch64-sme2p1-intrinsics/acle_sme2p1_movaz.c +++ b/clang/test/CodeGen/aarch64-sme2p1-intrinsics/acle_sme2p1_movaz.c @@ -1413,3 +1413,413 @@ svfloat64x4_t test_svreadz_ver_za64_f64_x4(uint32_t slice) __arm_streaming __arm { return svreadz_ver_za64_f64_vg4(7, slice); } + +// CHECK-LABEL: define dso_local @test_svreadz_hor_za8_s8( +// CHECK-SAME: i32 noundef [[SLICE:%.*]]) #[[ATTR0:[0-9]+]] { +// CHECK-NEXT: entry: +// CHECK-NEXT:[[TMP0:%.*]] = tail call @llvm.aarch64.sme.readz.horiz.nxv16i8(i32 0, i32 [[SLICE]]) +// CHECK-NEXT:ret [[TMP0]] +// +// CPP-CHECK-LABEL: define dso_local @_Z23test_svreadz_hor_za8_s8j( +// CPP-CHECK-SAME: i32 noundef [[SLICE:%.*]]) #[[ATTR0:[0-9]+]] { +// CPP-CHECK-NEXT: entry: +// CPP-CHECK-NEXT:[[TMP0:%.*]] = tail 
call @llvm.aarch64.sme.readz.horiz.nxv16i8(i32 0, i32 [[SLICE]]) +// CPP-CHECK-NEXT:ret [[TMP0]] +// +svint8_t test_svreadz_hor_za8_s8(uint32_t slice) __arm_streaming __arm_inout("za") +{ + return svreadz_hor_za8_s8(0, slice); +} + +// CHECK-LABEL: define dso_local @test_svreadz_hor_za8_u8( +// CHECK-SAME: i32 noundef [[SLICE:%.*]]) #[[ATTR0]] { +// CHECK-NEXT: entry: +// CHECK-NEXT:[[TMP0:%.*]] = tail call @llvm.aarch64.sme.readz.horiz.nxv16i8(i32 0, i32 [[SLICE]]) +// CHECK-NEXT:ret [[TMP0]] +// +// CPP-CHECK-LABEL: define dso_local @_Z23test_svreadz_hor_za8_u8j( +// CPP-CHECK-SAME: i32 noundef [[SLICE:%.*]]) #[[ATTR0]] { +// CPP-CHECK-NEXT: entry: +// CPP-CHECK-NEXT
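The `ImmCheck` ranges in the `arm_sme.td` part of this patch (0..0 for za8, 0..1 for za16, 0..3 for za32, 0..7 for za64, 0..15 for za128) follow a simple pattern, sketched below. The helper names `maxTileIndex` and `isValidTile` are hypothetical and the formula is inferred from those ranges rather than quoted from the patch.

```cpp
// Hypothetical helper: highest valid tile index for a ZA element width.
// Inferred pattern: element width in bytes, minus one.
constexpr unsigned maxTileIndex(unsigned ElementBits) {
  return ElementBits / 8 - 1;
}

// Returns true if `Tile` falls inside the immediate range for that width.
constexpr bool isValidTile(unsigned ElementBits, unsigned Tile) {
  return Tile <= maxTileIndex(ElementBits);
}

// Compile-time checks against the ranges listed in the patch.
static_assert(maxTileIndex(8) == 0, "za8: ImmCheck0_0");
static_assert(maxTileIndex(16) == 1, "za16: ImmCheck0_1");
static_assert(maxTileIndex(32) == 3, "za32: ImmCheck0_3");
static_assert(maxTileIndex(64) == 7, "za64: ImmCheck0_7");
static_assert(maxTileIndex(128) == 15, "za128: ImmCheck0_15");
```

This matches the test shown above, where `svreadz_hor_za8_s8` is called with tile 0, the only value the za8 range allows.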
[clang] [llvm] [CLANG][LLVM][AArch64]Add SME2.1 intrinsics for MOVAZ tile to vector,… (PR #88499)
https://github.com/CarolineConcatto updated https://github.com/llvm/llvm-project/pull/88499 >From a4d4a0ff71f5086c9fdf43e332b9752074eb42dc Mon Sep 17 00:00:00 2001 From: Caroline Concatto Date: Thu, 11 Apr 2024 16:10:16 + Subject: [PATCH 1/4] [CLANG][LLVM][AArch64]Add SME2.1 intrinsics for MOVAZ tile to vector, single According to the specification in ARM-software/acle#309 this adds the intrinsics // And similarly for u8. svint8_t svreadz_hor_za8_s8(uint64_t tile, uint32_t slice) __arm_streaming __arm_inout("za"); // And similarly for u16, bf16 and f16. svint16_t svreadz_hor_za16_s16(uint64_t tile, uint32_t slice) __arm_streaming __arm_inout("za"); // And similarly for u32 and f32. svint32_t svreadz_hor_za32_s32(uint64_t tile, uint32_t slice) __arm_streaming __arm_inout("za"); // And similarly for u64 and f64. svint64_t svreadz_hor_za64_s64(uint64_t tile, uint32_t slice) __arm_streaming __arm_inout("za"); // And similarly for s16, s32, s64, u8, u16, u32, u64, bf16, f16, f32, f64 svint8_t svreadz_hor_za128_s8(uint64_t tile, uint32_t slice) __arm_streaming __arm_inout("za"); --- clang/include/clang/Basic/arm_sme.td | 18 + .../acle_sme2p1_movaz.c | 410 .../acle_sme2p1_imm.cpp | 21 + llvm/include/llvm/IR/IntrinsicsAArch64.td | 12 +- .../Target/AArch64/AArch64ISelLowering.cpp| 37 ++ llvm/lib/Target/AArch64/AArch64ISelLowering.h | 3 + .../lib/Target/AArch64/AArch64SMEInstrInfo.td | 3 +- llvm/lib/Target/AArch64/SMEInstrFormats.td| 93 +++- .../AArch64/sme2p1-intrinsics-movaz.ll| 445 +- 9 files changed, 1021 insertions(+), 21 deletions(-) create mode 100644 clang/test/Sema/aarch64-sme2p1-intrinsics/acle_sme2p1_imm.cpp diff --git a/clang/include/clang/Basic/arm_sme.td b/clang/include/clang/Basic/arm_sme.td index 5f757b40e8fd9..a5677802193af 100644 --- a/clang/include/clang/Basic/arm_sme.td +++ b/clang/include/clang/Basic/arm_sme.td @@ -787,4 +787,22 @@ defm SVREADZ_ZA16_X4 : ZAReadz<"za16", "4", "sUshb", "aarch64_sme_readz", [ImmCh defm SVREADZ_ZA32_X4 : ZAReadz<"za32", "4", 
"iUif", "aarch64_sme_readz", [ImmCheck<0, ImmCheck0_3>]>; defm SVREADZ_ZA64_X4 : ZAReadz<"za64", "4", "lUld", "aarch64_sme_readz", [ImmCheck<0, ImmCheck0_7>]>; + +multiclass ZAReadz ch> { + let SMETargetGuard = "sme2p1" in { +def NAME # _H : SInst<"svreadz_hor_" # n_suffix # "_{d}", "dim", t, + MergeNone, i_prefix # "_horiz", + [IsStreaming, IsInOutZA], ch>; + +def NAME # _V : SInst<"svreadz_ver_" # n_suffix # "_{d}", "dim", t, + MergeNone, i_prefix # "_vert", + [IsStreaming, IsInOutZA], ch>; + } +} + +defm SVREADZ_ZA8 : ZAReadz<"za8", "cUc", "aarch64_sme_readz", [ImmCheck<0, ImmCheck0_0>]>; +defm SVREADZ_ZA16 : ZAReadz<"za16", "sUshb", "aarch64_sme_readz", [ImmCheck<0, ImmCheck0_1>]>; +defm SVREADZ_ZA32 : ZAReadz<"za32", "iUif", "aarch64_sme_readz", [ImmCheck<0, ImmCheck0_3>]>; +defm SVREADZ_ZA64 : ZAReadz<"za64", "lUld", "aarch64_sme_readz", [ImmCheck<0, ImmCheck0_7>]>; +defm SVREADZ_ZA128 : ZAReadz<"za128", "csilUcUiUsUlbhfd", "aarch64_sme_readz_q", [ImmCheck<0, ImmCheck0_15>]>; } // let SVETargetGuard = InvalidMode diff --git a/clang/test/CodeGen/aarch64-sme2p1-intrinsics/acle_sme2p1_movaz.c b/clang/test/CodeGen/aarch64-sme2p1-intrinsics/acle_sme2p1_movaz.c index d0c7230ade761..7c9067a5ceece 100644 --- a/clang/test/CodeGen/aarch64-sme2p1-intrinsics/acle_sme2p1_movaz.c +++ b/clang/test/CodeGen/aarch64-sme2p1-intrinsics/acle_sme2p1_movaz.c @@ -1413,3 +1413,413 @@ svfloat64x4_t test_svreadz_ver_za64_f64_x4(uint32_t slice) __arm_streaming __arm { return svreadz_ver_za64_f64_vg4(7, slice); } + +// CHECK-LABEL: define dso_local @test_svreadz_hor_za8_s8( +// CHECK-SAME: i32 noundef [[SLICE:%.*]]) #[[ATTR0:[0-9]+]] { +// CHECK-NEXT: entry: +// CHECK-NEXT:[[TMP0:%.*]] = tail call @llvm.aarch64.sme.readz.horiz.nxv16i8(i32 0, i32 [[SLICE]]) +// CHECK-NEXT:ret [[TMP0]] +// +// CPP-CHECK-LABEL: define dso_local @_Z23test_svreadz_hor_za8_s8j( +// CPP-CHECK-SAME: i32 noundef [[SLICE:%.*]]) #[[ATTR0:[0-9]+]] { +// CPP-CHECK-NEXT: entry: +// CPP-CHECK-NEXT:[[TMP0:%.*]] = tail 
call @llvm.aarch64.sme.readz.horiz.nxv16i8(i32 0, i32 [[SLICE]]) +// CPP-CHECK-NEXT:ret [[TMP0]] +// +svint8_t test_svreadz_hor_za8_s8(uint32_t slice) __arm_streaming __arm_inout("za") +{ + return svreadz_hor_za8_s8(0, slice); +} + +// CHECK-LABEL: define dso_local @test_svreadz_hor_za8_u8( +// CHECK-SAME: i32 noundef [[SLICE:%.*]]) #[[ATTR0]] { +// CHECK-NEXT: entry: +// CHECK-NEXT:[[TMP0:%.*]] = tail call @llvm.aarch64.sme.readz.horiz.nxv16i8(i32 0, i32 [[SLICE]]) +// CHECK-NEXT:ret [[TMP0]] +// +// CPP-CHECK-LABEL: define dso_local @_Z23test_svreadz_hor_za8_u8j( +// CPP-CHECK-SAME: i32 noundef [[SLICE:%.*]]) #[[ATTR0]] { +// CPP-CHECK-NEXT: entry: +// CPP-CHECK-NEXT
[clang] [flang] [Flang-new][OpenMP] Add offload related flags for AMDGPU (PR #96742)
@@ -333,6 +333,9 @@ void Flang::AddAMDGPUTargetArgs(const ArgList &Args, StringRef Val = A->getValue(); CmdArgs.push_back(Args.MakeArgString("-mcode-object-version=" + Val)); } + + const ToolChain &TC = getToolChain(); + TC.addClangTargetOptions(Args, CmdArgs, Action::OffloadKind::OFK_OpenMP); DominikAdamski wrote: No. My change does not imply any changes to Nvidia GPU support. Flang and Clang share the same LLVM backend, which consumes the generated LLVM IR. For AMD GPUs we need to embed bitcode definitions of GPU math functions. The AMD toolchain adds all required options to the compiler invocation for AMD GPUs and, IMO, can be reused between Flang and Clang. I don't know whether Nvidia also wants to reuse their toolchain between Clang and Flang to fully support OpenMP offloading. https://github.com/llvm/llvm-project/pull/96742 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [Sema] LambdaScopeForCallOperatorInstantiationRAII - fix typo in early out logic (PR #96888)
https://github.com/RKSimon created https://github.com/llvm/llvm-project/pull/96888 We should be checking for a failed dyn_cast on the ParentFD result - not the loop invariant FD root value. Seems to have been introduced in #65193 Noticed by static analyser (I have no specific test case). >From 3194b593fbb50ee20b9f7d73beef4472657e6e00 Mon Sep 17 00:00:00 2001 From: Simon Pilgrim Date: Thu, 27 Jun 2024 11:09:32 +0100 Subject: [PATCH] [Sema] LambdaScopeForCallOperatorInstantiationRAII - fix typo in early out logic We should be checking for a failed dyn_cast on the ParentFD result - not the loop invariant FD root value. Seems to have been introduced in #65193 Noticed by static analyser (I have no specific test case). --- clang/lib/Sema/SemaLambda.cpp | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/clang/lib/Sema/SemaLambda.cpp b/clang/lib/Sema/SemaLambda.cpp index e9476a0c93c5d..ca9c7cb9faadf 100644 --- a/clang/lib/Sema/SemaLambda.cpp +++ b/clang/lib/Sema/SemaLambda.cpp @@ -2391,7 +2391,7 @@ Sema::LambdaScopeForCallOperatorInstantiationRAII:: Pattern = dyn_cast(getLambdaAwareParentOfDeclContext(Pattern)); - if (!FD || !Pattern) + if (!ParentFD || !Pattern) break; SemaRef.addInstantiatedParametersToScope(ParentFD, Pattern, Scope, MLTAL); ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [Sema] LambdaScopeForCallOperatorInstantiationRAII - fix typo in early out logic (PR #96888)
llvmbot wrote: @llvm/pr-subscribers-clang Author: Simon Pilgrim (RKSimon) Changes We should be checking for a failed dyn_cast on the ParentFD result - not the loop invariant FD root value. Seems to have been introduced in #65193 Noticed by static analyser (I have no specific test case). --- Full diff: https://github.com/llvm/llvm-project/pull/96888.diff 1 Files Affected: - (modified) clang/lib/Sema/SemaLambda.cpp (+1-1) ``diff diff --git a/clang/lib/Sema/SemaLambda.cpp b/clang/lib/Sema/SemaLambda.cpp index e9476a0c93c5d..ca9c7cb9faadf 100644 --- a/clang/lib/Sema/SemaLambda.cpp +++ b/clang/lib/Sema/SemaLambda.cpp @@ -2391,7 +2391,7 @@ Sema::LambdaScopeForCallOperatorInstantiationRAII:: Pattern = dyn_cast(getLambdaAwareParentOfDeclContext(Pattern)); - if (!FD || !Pattern) + if (!ParentFD || !Pattern) break; SemaRef.addInstantiatedParametersToScope(ParentFD, Pattern, Scope, MLTAL); `` https://github.com/llvm/llvm-project/pull/96888 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
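The early-out fix described above can be illustrated outside of clang. The following is only a minimal sketch, assuming a toy class hierarchy: `Context`, `Function`, `Namespace`, and `countEnclosingFunctions` are hypothetical stand-ins, and `dynamic_cast` plays the role of `dyn_cast`. The point it shows is that the loop must test the freshly cast `ParentFD`; testing the loop-invariant `FD` would never trigger the break.

```cpp
#include <cassert>

// Hypothetical stand-ins for a declaration-context hierarchy (not clang's types).
struct Context {
  virtual ~Context() = default;
  Context *Parent = nullptr;
};
struct Function : Context {};
struct Namespace : Context {};

// Counts how many directly enclosing contexts of FD are functions,
// stopping at the first parent that is not a function.
int countEnclosingFunctions(Function *FD) {
  assert(FD && "caller must pass a valid function");
  int Count = 0;
  Context *Current = FD;
  while (Current->Parent) {
    // The cast result is what may be null on each iteration.
    auto *ParentFD = dynamic_cast<Function *>(Current->Parent);
    if (!ParentFD) // checking !FD here would be a no-op: FD never changes
      break;
    ++Count;
    Current = ParentFD;
  }
  return Count;
}
```

With the mistaken `!FD` check, the loop body would go on to use a null `ParentFD` as soon as a non-function parent is reached.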
[clang] [lldb] [clang][lldb] Don't assert structure layout correctness for layouts provided by LLDB (PR #93809)
Michael137 wrote: > Here's the smallest patch that would put explicit alignment on any packed > structure: > > ``` > diff --git a/clang/lib/CodeGen/CGDebugInfo.cpp > b/clang/lib/CodeGen/CGDebugInfo.cpp > index a072475ba770..bbb13ddd593b 100644 > --- a/clang/lib/CodeGen/CGDebugInfo.cpp > +++ b/clang/lib/CodeGen/CGDebugInfo.cpp > @@ -64,7 +64,7 @@ static uint32_t getTypeAlignIfRequired(const Type *Ty, > const ASTContext &Ctx) { >// MaxFieldAlignmentAttr is the attribute added to types >// declared after #pragma pack(n). >if (auto *Decl = Ty->getAsRecordDecl()) > -if (Decl->hasAttr()) > +if (Decl->hasAttr() || > Decl->hasAttr()) >return TI.Align; > >return 0; > ``` > > But I don't think that's the right approach - I think what we should do is > compute the natural alignment of the structure, then compare that to the > actual alignment - and if they differ, we should put an explicit alignment on > the structure. This avoids the risk that other alignment-influencing effects > might be missed (and avoids the case of putting alignment on a structure > that, when packed, just has the same alignment anyway - which is a minor > issue, but nice to get right (eg: packed struct of a single char probably > shouldn't have an explicit alignment - since it's the same as the implicit > alignment anyway)) Thanks for the analysis! If we can emit alignment for packed attributes consistently then we can probably get rid of most of the `InferAlignment` logic in the `RecordLayoutBuilder` (it seems to me most of that logic was introduced there for the purpose of packed structs), which would address the issue I saw with laying out `[[no_unique_address]]` fields. Trying this now. https://github.com/llvm/llvm-project/pull/93809 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
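The quoted suggestion (compare the natural alignment with the actual one and emit an explicit alignment only when they differ) can be sketched as follows. This is an illustration under stated assumptions, not the CGDebugInfo implementation: `explicitAlignIfRequired` and the four structs are hypothetical, the "natural" alignment is approximated by an unpacked twin of each struct, and `__attribute__((packed))` is the GCC/Clang spelling.

```cpp
#include <cassert>
#include <cstdint>

// Returns the alignment to emit explicitly, or 0 when the actual alignment
// already equals the natural one and no explicit value is needed.
std::uint32_t explicitAlignIfRequired(std::uint32_t NaturalAlign,
                                      std::uint32_t ActualAlign) {
  assert(NaturalAlign != 0 && ActualAlign != 0 && "alignments are non-zero");
  return NaturalAlign == ActualAlign ? 0 : ActualAlign;
}

// Unpacked and packed twins; packing lowers the alignment of the first pair
// but leaves the single-char pair unchanged.
struct Plain { char C; int I; };
struct __attribute__((packed)) Packed { char C; int I; };
struct PlainChar { char C; };
struct __attribute__((packed)) PackedChar { char C; };
```

For the `Plain`/`Packed` pair the two alignments differ, so an explicit alignment would be emitted; for the single-char pair they match, which is the case the quoted comment says should stay implicit.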
[clang] [Clang] Extend lifetime bound analysis to support assignments (PR #96475)
https://github.com/hokein edited https://github.com/llvm/llvm-project/pull/96475 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [X86][CodeGen] security check cookie execute only when needed (PR #95904)
@@ -9,95 +9,6 @@ @"\01LC" = internal constant [11 x i8] c"buf == %s\0A\00"; [#uses=1] define void @test(ptr %a) nounwind ssp { -; MSVC-X86-LABEL: test: RKSimon wrote: where did the test checks go? https://github.com/llvm/llvm-project/pull/95904 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [X86][CodeGen] security check cookie execute only when needed (PR #95904)
@@ -114,250 +25,93 @@ return:; preds = %entry declare void @escape(ptr) define void @test_vla(i32 %n) nounwind ssp { -; MSVC-X86-LABEL: test_vla: -; MSVC-X86: # %bb.0: -; MSVC-X86-NEXT:pushl %ebp -; MSVC-X86-NEXT:movl %esp, %ebp -; MSVC-X86-NEXT:pushl %eax -; MSVC-X86-NEXT:movl 8(%ebp), %eax -; MSVC-X86-NEXT:movl ___security_cookie, %ecx -; MSVC-X86-NEXT:xorl %ebp, %ecx -; MSVC-X86-NEXT:movl %ecx, -4(%ebp) -; MSVC-X86-NEXT:shll $2, %eax -; MSVC-X86-NEXT:calll __chkstk -; MSVC-X86-NEXT:movl %esp, %eax -; MSVC-X86-NEXT:pushl %eax -; MSVC-X86-NEXT:calll _escape -; MSVC-X86-NEXT:addl $4, %esp -; MSVC-X86-NEXT:movl -4(%ebp), %ecx -; MSVC-X86-NEXT:xorl %ebp, %ecx -; MSVC-X86-NEXT:calll @__security_check_cookie@4 -; MSVC-X86-NEXT:movl %ebp, %esp -; MSVC-X86-NEXT:popl %ebp -; MSVC-X86-NEXT:retl -; -; MSVC-X64-LABEL: test_vla: -; MSVC-X64: # %bb.0: -; MSVC-X64-NEXT:pushq %rbp -; MSVC-X64-NEXT:subq $16, %rsp -; MSVC-X64-NEXT:leaq {{[0-9]+}}(%rsp), %rbp -; MSVC-X64-NEXT:movq __security_cookie(%rip), %rax -; MSVC-X64-NEXT:xorq %rbp, %rax -; MSVC-X64-NEXT:movq %rax, -8(%rbp) -; MSVC-X64-NEXT:movl %ecx, %eax -; MSVC-X64-NEXT:leaq 15(,%rax,4), %rax -; MSVC-X64-NEXT:andq $-16, %rax -; MSVC-X64-NEXT:callq __chkstk -; MSVC-X64-NEXT:subq %rax, %rsp -; MSVC-X64-NEXT:movq %rsp, %rcx -; MSVC-X64-NEXT:subq $32, %rsp -; MSVC-X64-NEXT:callq escape -; MSVC-X64-NEXT:addq $32, %rsp -; MSVC-X64-NEXT:movq -8(%rbp), %rcx -; MSVC-X64-NEXT:xorq %rbp, %rcx -; MSVC-X64-NEXT:subq $32, %rsp -; MSVC-X64-NEXT:callq __security_check_cookie -; MSVC-X64-NEXT:movq %rbp, %rsp -; MSVC-X64-NEXT:popq %rbp -; MSVC-X64-NEXT:retq -; -; MSVC-X86-O0-LABEL: test_vla: -; MSVC-X86-O0: # %bb.0: -; MSVC-X86-O0-NEXT:pushl %ebp -; MSVC-X86-O0-NEXT:movl %esp, %ebp -; MSVC-X86-O0-NEXT:pushl %eax -; MSVC-X86-O0-NEXT:movl 8(%ebp), %eax -; MSVC-X86-O0-NEXT:movl ___security_cookie, %ecx -; MSVC-X86-O0-NEXT:xorl %ebp, %ecx -; MSVC-X86-O0-NEXT:movl %ecx, -4(%ebp) -; MSVC-X86-O0-NEXT:shll $2, %eax -; MSVC-X86-O0-NEXT:calll __chkstk 
-; MSVC-X86-O0-NEXT:movl %esp, %eax -; MSVC-X86-O0-NEXT:subl $4, %esp -; MSVC-X86-O0-NEXT:movl %eax, (%esp) -; MSVC-X86-O0-NEXT:calll _escape -; MSVC-X86-O0-NEXT:addl $4, %esp -; MSVC-X86-O0-NEXT:movl -4(%ebp), %ecx -; MSVC-X86-O0-NEXT:xorl %ebp, %ecx -; MSVC-X86-O0-NEXT:calll @__security_check_cookie@4 -; MSVC-X86-O0-NEXT:movl %ebp, %esp -; MSVC-X86-O0-NEXT:popl %ebp -; MSVC-X86-O0-NEXT:retl -; -; MSVC-X64-O0-LABEL: test_vla: -; MSVC-X64-O0: # %bb.0: -; MSVC-X64-O0-NEXT:pushq %rbp -; MSVC-X64-O0-NEXT:subq $16, %rsp -; MSVC-X64-O0-NEXT:leaq {{[0-9]+}}(%rsp), %rbp -; MSVC-X64-O0-NEXT:movq __security_cookie(%rip), %rax -; MSVC-X64-O0-NEXT:xorq %rbp, %rax -; MSVC-X64-O0-NEXT:movq %rax, -8(%rbp) -; MSVC-X64-O0-NEXT:movl %ecx, %eax -; MSVC-X64-O0-NEXT:# kill: def $rax killed $eax -; MSVC-X64-O0-NEXT:leaq 15(,%rax,4), %rax -; MSVC-X64-O0-NEXT:andq $-16, %rax -; MSVC-X64-O0-NEXT:callq __chkstk -; MSVC-X64-O0-NEXT:subq %rax, %rsp -; MSVC-X64-O0-NEXT:movq %rsp, %rcx -; MSVC-X64-O0-NEXT:subq $32, %rsp -; MSVC-X64-O0-NEXT:callq escape -; MSVC-X64-O0-NEXT:addq $32, %rsp -; MSVC-X64-O0-NEXT:movq -8(%rbp), %rcx -; MSVC-X64-O0-NEXT:xorq %rbp, %rcx -; MSVC-X64-O0-NEXT:subq $32, %rsp -; MSVC-X64-O0-NEXT:callq __security_check_cookie -; MSVC-X64-O0-NEXT:movq %rbp, %rsp -; MSVC-X64-O0-NEXT:popq %rbp -; MSVC-X64-O0-NEXT:retq %vla = alloca i32, i32 %n call void @escape(ptr %vla) ret void } +; MSVC-X86-LABEL: _test_vla: +; MSVC-X86: pushl %ebp +; MSVC-X86: movl %esp, %ebp +; MSVC-X86: movl ___security_cookie, %[[REG1:[^ ]*]] +; MSVC-X86: xorl %ebp, %[[REG1]] +; MSVC-X86: movl %[[REG1]], [[SLOT:-[0-9]*]](%ebp) +; MSVC-X86: calll __chkstk +; MSVC-X86: pushl +; MSVC-X86: calll _escape +; MSVC-X86: movl [[SLOT]](%ebp), %ecx +; MSVC-X86: xorl %ebp, %ecx +; MSVC-X86: calll @__security_check_cookie@4 +; MSVC-X86: movl %ebp, %esp +; MSVC-X86: popl %ebp +; MSVC-X86: retl + +; MSVC-X64-LABEL: test_vla: +; MSVC-X64: pushq %rbp +; MSVC-X64: subq $16, %rsp +; MSVC-X64: leaq 16(%rsp), %rbp +; 
MSVC-X64: movq __security_cookie(%rip), %[[REG1:[^ ]*]] +; MSVC-X64: xorq %rbp, %[[REG1]] +; MSVC-X64: movq %[[REG1]], [[SLOT:-[0-9]*]](%rbp) +; MSVC-X64: callq __chkstk +; MSVC-X64: callq escape +; MSVC-X64: movq [[SLOT]](%rbp), %rcx +; MSVC-X64: xorq %rbp, %rcx +; MSVC-X64: callq __security_check_cookie +; MSVC-X64: retq RKSimon wrote: These look to have been manually re-added instead of using the update_llc_test_checks script https://github.com/llvm/llvm-project/pull/95904 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://
[clang] [llvm] [Pipelines] Move IPSCCP after inliner pipeline (PR #96620)
dtcxzyw wrote: > This patch causes some significant performance regressions on llvm-test-suite > (rv64gc-O3-thinlto): > > Name Before After Ratio > SingleSource/Benchmarks/Shootout/Shootout-random 2.150161677 > 3.300161641 + 53.5% > SingleSource/Benchmarks/Polybench/linear-algebra/kernels/trisolv/trisolv > 0.111845159 0.145389494 +30.0% > SingleSource/Benchmarks/Adobe-C++/functionobjects 5.489498263 > 6.827863965 +24.4% It has been fixed. But this patch didn't show a positive net effect :( https://github.com/llvm/llvm-project/pull/96620 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [X86][CodeGen] security check cookie execute only when needed (PR #95904)
@@ -82,6 +94,8 @@ define void @tailcall_unrelated_frame() sspreq { ; LINUX-NEXT: .LBB1_2: # %CallStackCheckFailBlk ; LINUX-NEXT:.cfi_def_cfa_offset 16 ; LINUX-NEXT:callq __stack_chk_fail@PLT + + RKSimon wrote: superfluous https://github.com/llvm/llvm-project/pull/95904 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [llvm][AArch64] Move Apple aliases into the CpuAlias map (PR #96249)
https://github.com/tmatheson-arm approved this pull request. LGTM, I've added some thoughts but it's fine as it is. https://github.com/llvm/llvm-project/pull/96249 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [llvm][AArch64] Move Apple aliases into the CpuAlias map (PR #96249)
https://github.com/tmatheson-arm edited https://github.com/llvm/llvm-project/pull/96249 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [llvm][AArch64] Move Apple aliases into the CpuAlias map (PR #96249)
@@ -5,11 +5,11 @@ // RUN: not %clang_cc1 -triple arm64--- -target-cpu not-a-cpu -fsyntax-only %s 2>&1 | FileCheck %s --check-prefix AARCH64 // AARCH64: error: unknown target CPU 'not-a-cpu' -// AARCH64-NEXT: note: valid target CPU values are: generic, cortex-a35, cortex-a34, cortex-a53, cortex-a55, cortex-a510, cortex-a520, cortex-a520ae, cortex-a57, cortex-a65, cortex-a65ae, cortex-a72, cortex-a73, cortex-a75, cortex-a76, cortex-a76ae, cortex-a77, cortex-a78, cortex-a78ae, cortex-a78c, cortex-a710, cortex-a715, cortex-a720, cortex-a720ae, cortex-a725, cortex-r82, cortex-r82ae, cortex-x1, cortex-x1c, cortex-x2, cortex-x3, cortex-x4, cortex-x925, neoverse-e1, neoverse-n1, neoverse-n2, neoverse-n3, neoverse-512tvb, neoverse-v1, neoverse-v2, neoverse-v3, neoverse-v3ae, exynos-m3, exynos-m4, exynos-m5, falkor, saphira, kryo, thunderx, thunderxt88, thunderxt81, thunderxt83, thunderx2t99, thunderx3t110, tsv110, cyclone, apple-a7, apple-a8, apple-a9, apple-a10, apple-a11, apple-a12, apple-s4, apple-s5, apple-a13, apple-a14, apple-m1, apple-a15, apple-m2, apple-a16, apple-m3, apple-a17, apple-m4, a64fx, carmel, ampere1, ampere1a, ampere1b, oryon-1, cobalt-100, grace{{$}} +// AARCH64-NEXT: note: valid target CPU values are: a64fx, ampere1, ampere1a, ampere1b, apple-a10, apple-a11, apple-a12, apple-a13, apple-a14, apple-a15, apple-a16, apple-a17, apple-a7, apple-a8, apple-a9, apple-m1, apple-m2, apple-m3, apple-m4, apple-s4, apple-s5, carmel, cobalt-100, cortex-a34, cortex-a35, cortex-a510, cortex-a520, cortex-a520ae, cortex-a53, cortex-a55, cortex-a57, cortex-a65, cortex-a65ae, cortex-a710, cortex-a715, cortex-a72, cortex-a720, cortex-a720ae, cortex-a725, cortex-a73, cortex-a75, cortex-a76, cortex-a76ae, cortex-a77, cortex-a78, cortex-a78ae, cortex-a78c, cortex-r82, cortex-r82ae, cortex-x1, cortex-x1c, cortex-x2, cortex-x3, cortex-x4, cortex-x925, cyclone, exynos-m3, exynos-m4, exynos-m5, falkor, generic, grace, kryo, neoverse-512tvb, neoverse-e1, neoverse-n1, neoverse-n2, 
neoverse-n3, neoverse-v1, neoverse-v2, neoverse-v3, neoverse-v3ae, oryon-1, saphira, thunderx, thunderx2t99, thunderx3t110, thunderxt81, thunderxt83, thunderxt88, tsv110{{$}} tmatheson-arm wrote: Split into multiple lines? https://github.com/llvm/llvm-project/pull/96249 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [llvm][AArch64] Move Apple aliases into the CpuAlias map (PR #96249)
@@ -304,8 +304,21 @@ struct Alias { StringRef Name; }; -inline constexpr Alias CpuAliases[] = {{"cobalt-100", "neoverse-n2"}, - {"grace", "neoverse-v2"}}; +inline constexpr Alias CpuAliases[] = { +{"cobalt-100", "neoverse-n2"}, +{"grace", "neoverse-v2"}, +// Support cyclone as an alias for apple-a7 so we can still LTO old bitcode. tmatheson-arm wrote: If you really want this to work only for bitcode (and not appear on `-mcpu`), could it be handled in the bitcode importer? Same for "apple-latest"? https://github.com/llvm/llvm-project/pull/96249 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [llvm][AArch64] Move Apple aliases into the CpuAlias map (PR #96249)
@@ -88,10 +88,14 @@ StringRef AArch64::getArchExtFeature(StringRef ArchExt) { void AArch64::fillValidCPUArchList(SmallVectorImpl &Values) { for (const auto &C : CpuInfos) - Values.push_back(C.Name); +Values.push_back(C.Name); for (const auto &Alias : CpuAliases) -Values.push_back(Alias.AltName); +// The apple-latest alias is backend only, do not expose it to clang's -mcpu. +if (Alias.AltName != "apple-latest") tmatheson-arm wrote: I don't love this special case. But, not sure what to do about it. https://github.com/llvm/llvm-project/pull/96249 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
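The special case under discussion can be sketched in standalone C++. This is a simplified model, assuming `std::string` in place of `StringRef` and a three-entry CPU table; the target listed for `apple-latest` is a placeholder, since the patch excerpt does not show what it maps to.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

struct Alias {
  std::string AltName; // the alias spelling
  std::string Name;    // the CPU it resolves to
};

// Tiny illustrative tables, not the real CpuInfos/CpuAliases contents.
const std::vector<std::string> CpuNames = {"neoverse-n2", "neoverse-v2",
                                           "apple-a7"};
const std::vector<Alias> CpuAliases = {
    {"cobalt-100", "neoverse-n2"},
    {"grace", "neoverse-v2"},
    {"cyclone", "apple-a7"},
    {"apple-latest", "apple-a7"}, // placeholder target, assumed for the sketch
};

// Lists every name a user may pass, hiding the backend-only alias.
std::vector<std::string> validCpuNames() {
  std::vector<std::string> Values;
  for (const auto &Name : CpuNames)
    Values.push_back(Name);
  for (const auto &A : CpuAliases)
    if (A.AltName != "apple-latest")
      Values.push_back(A.AltName);
  return Values;
}
```

One alternative to the string comparison, in the spirit of the review comment, would be a per-alias visibility flag in the table; that is a design idea only and not something the patch proposes.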
[clang] [flang] [Flang-new][OpenMP] Add offload related flags for AMDGPU (PR #96742)
@@ -333,6 +333,9 @@ void Flang::AddAMDGPUTargetArgs(const ArgList &Args, StringRef Val = A->getValue(); CmdArgs.push_back(Args.MakeArgString("-mcode-object-version=" + Val)); } + + const ToolChain &TC = getToolChain(); + TC.addClangTargetOptions(Args, CmdArgs, Action::OffloadKind::OFK_OpenMP); banach-space wrote: > Clang does not verify how we use these flags and it accepts them for non-GPU > target. It's OK to make Flang "stricter" if we believe that's the right thing to do ;-) (I think that generating useful error/warning messages like "don't mix these flags - that's not supported" would be a good thing) > IMO can be reused between Flang and Clang Are there any plans to extract that logic and share it somewhere? > I don't know if Nvidia also want to reuse their toolchain between Clang and > Flang to fully support OpenMP offloading. Who could be the right person to ask? https://github.com/llvm/llvm-project/pull/96742 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [llvm][AArch64] Move Apple aliases into the CpuAlias map (PR #96249)
@@ -304,8 +304,21 @@ struct Alias { StringRef Name; }; -inline constexpr Alias CpuAliases[] = {{"cobalt-100", "neoverse-n2"}, - {"grace", "neoverse-v2"}}; +inline constexpr Alias CpuAliases[] = { tmatheson-arm wrote: We should tablegen this too. https://github.com/llvm/llvm-project/pull/96249 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang][analyzer] Improve documentation of checker 'cplusplus.Move' (NFC) (PR #96295)
balazske wrote: I fixed a test that contained the entire option help description. I don't think the full text is needed there, so I removed it and kept only the first line of the description. https://github.com/llvm/llvm-project/pull/96295 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [PAC][ELF][AArch64] Encode signed GOT flag in PAuth core info (PR #96159)
https://github.com/kovdan01 updated https://github.com/llvm/llvm-project/pull/96159 >From 4eeb1b4e82941681b6cafda8579d136e3e7cb09f Mon Sep 17 00:00:00 2001 From: Daniil Kovalev Date: Tue, 18 Jun 2024 15:37:18 +0300 Subject: [PATCH 1/2] [PAC][ELF][AArch64] Encode signed GOT flag in PAuth core info Treat 7th bit of version value for llvm_linux platform as signed GOT flag. - clang: define `PointerAuthELFGOT` LangOption and set 7th bit of `aarch64-elf-pauthabi-version` LLVM module flag correspondingly; - llvm-readobj: print `PointerAuthELFGOT` or `!PointerAuthELFGOT` in version description of llvm_linux platform depending on whether the flag is set. --- clang/include/clang/Basic/LangOptions.def | 1 + clang/lib/CodeGen/CodeGenModule.cpp| 6 -- llvm/include/llvm/BinaryFormat/ELF.h | 3 ++- .../AArch64/note-gnu-property-elf-pauthabi.ll | 2 +- .../ELF/AArch64/aarch64-feature-pauth.s| 18 +- llvm/tools/llvm-readobj/ELFDumper.cpp | 3 ++- 6 files changed, 19 insertions(+), 14 deletions(-) diff --git a/clang/include/clang/Basic/LangOptions.def b/clang/include/clang/Basic/LangOptions.def index 6dd6b5614f44c..bc99dad5cd55e 100644 --- a/clang/include/clang/Basic/LangOptions.def +++ b/clang/include/clang/Basic/LangOptions.def @@ -168,6 +168,7 @@ LANGOPT(PointerAuthAuthTraps, 1, 0, "pointer authentication failure traps") LANGOPT(PointerAuthVTPtrAddressDiscrimination, 1, 0, "incorporate address discrimination in authenticated vtable pointers") LANGOPT(PointerAuthVTPtrTypeDiscrimination, 1, 0, "incorporate type discrimination in authenticated vtable pointers") LANGOPT(PointerAuthInitFini, 1, 0, "sign function pointers in init/fini arrays") +LANGOPT(PointerAuthELFGOT, 1, 0, "authenticate pointers from GOT") LANGOPT(DoubleSquareBracketAttributes, 1, 0, "'[[]]' attributes extension for all language standard modes") LANGOPT(ExperimentalLateParseAttributes, 1, 0, "experimental late parsing of attributes") diff --git a/clang/lib/CodeGen/CodeGenModule.cpp b/clang/lib/CodeGen/CodeGenModule.cpp 
index dd4a665ebc78b..feac291e01b50 100644 --- a/clang/lib/CodeGen/CodeGenModule.cpp +++ b/clang/lib/CodeGen/CodeGenModule.cpp @@ -1210,8 +1210,10 @@ void CodeGenModule::Release() { (LangOpts.PointerAuthVTPtrTypeDiscrimination << AARCH64_PAUTH_PLATFORM_LLVM_LINUX_VERSION_VPTRTYPEDISCR) | (LangOpts.PointerAuthInitFini - << AARCH64_PAUTH_PLATFORM_LLVM_LINUX_VERSION_INITFINI); - static_assert(AARCH64_PAUTH_PLATFORM_LLVM_LINUX_VERSION_INITFINI == + << AARCH64_PAUTH_PLATFORM_LLVM_LINUX_VERSION_INITFINI) | + (LangOpts.PointerAuthELFGOT + << AARCH64_PAUTH_PLATFORM_LLVM_LINUX_VERSION_GOT); + static_assert(AARCH64_PAUTH_PLATFORM_LLVM_LINUX_VERSION_GOT == AARCH64_PAUTH_PLATFORM_LLVM_LINUX_VERSION_LAST, "Update when new enum items are defined"); if (PAuthABIVersion != 0) { diff --git a/llvm/include/llvm/BinaryFormat/ELF.h b/llvm/include/llvm/BinaryFormat/ELF.h index dfba180149916..2aa37bbed6656 100644 --- a/llvm/include/llvm/BinaryFormat/ELF.h +++ b/llvm/include/llvm/BinaryFormat/ELF.h @@ -1774,8 +1774,9 @@ enum : unsigned { AARCH64_PAUTH_PLATFORM_LLVM_LINUX_VERSION_VPTRADDRDISCR = 4, AARCH64_PAUTH_PLATFORM_LLVM_LINUX_VERSION_VPTRTYPEDISCR = 5, AARCH64_PAUTH_PLATFORM_LLVM_LINUX_VERSION_INITFINI = 6, + AARCH64_PAUTH_PLATFORM_LLVM_LINUX_VERSION_GOT = 7, AARCH64_PAUTH_PLATFORM_LLVM_LINUX_VERSION_LAST = - AARCH64_PAUTH_PLATFORM_LLVM_LINUX_VERSION_INITFINI, + AARCH64_PAUTH_PLATFORM_LLVM_LINUX_VERSION_GOT, }; // x86 processor feature bits. 
diff --git a/llvm/test/CodeGen/AArch64/note-gnu-property-elf-pauthabi.ll b/llvm/test/CodeGen/AArch64/note-gnu-property-elf-pauthabi.ll index 728cffeba02a2..fb69a12b2f906 100644 --- a/llvm/test/CodeGen/AArch64/note-gnu-property-elf-pauthabi.ll +++ b/llvm/test/CodeGen/AArch64/note-gnu-property-elf-pauthabi.ll @@ -27,7 +27,7 @@ ; OBJ: Displaying notes found in: .note.gnu.property ; OBJ-NEXT: Owner Data size Description ; OBJ-NEXT: GNU 0x0018 NT_GNU_PROPERTY_TYPE_0 (property note) -; OBJ-NEXT: AArch64 PAuth ABI core info: platform 0x1002 (llvm_linux), version 0x55 (PointerAuthIntrinsics, !PointerAuthCalls, PointerAuthReturns, !PointerAuthAuthTraps, PointerAuthVTPtrAddressDiscrimination, !PointerAuthVTPtrTypeDiscrimination, PointerAuthInitFini) +; OBJ-NEXT: AArch64 PAuth ABI core info: platform 0x1002 (llvm_linux), version 0x55 (PointerAuthIntrinsics, !PointerAuthCalls, PointerAuthReturns, !PointerAuthAuthTraps, PointerAuthVTPtrAddressDiscrimination, !PointerAuthVTPtrTypeDiscrimination, PointerAuthInitFini, !PointerAuthELFGOT) ; ERR: either both or no 'aarch64-elf-pauthabi-platform' and 'aarch64-elf-pauthabi-version' module flags must be present diff --git a/llvm/test/tools/llvm-readobj/ELF/AArch64/
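The version encoding described in the patch can be sketched as a plain bit field. Bit positions 4 to 7 below are the values shown in the `ELF.h` hunk; positions 0 to 3 are inferred from the order of the flags in the llvm-readobj output and are an assumption, as are the shortened enumerator names and the `PAuthOptions`, `encodeVersion`, and `hasFlag` helpers.

```cpp
#include <cassert>
#include <cstdint>

enum : unsigned {
  VERSION_INTRINSICS = 0, // assumed from the readobj flag order
  VERSION_CALLS = 1,      // assumed
  VERSION_RETURNS = 2,    // assumed
  VERSION_AUTHTRAPS = 3,  // assumed
  VERSION_VPTRADDRDISCR = 4,
  VERSION_VPTRTYPEDISCR = 5,
  VERSION_INITFINI = 6,
  VERSION_GOT = 7,
  VERSION_LAST = VERSION_GOT,
};

// Hypothetical mirror of the relevant language options.
struct PAuthOptions {
  bool Intrinsics = false;
  bool Calls = false;
  bool Returns = false;
  bool AuthTraps = false;
  bool VTPtrAddressDiscrimination = false;
  bool VTPtrTypeDiscrimination = false;
  bool InitFini = false;
  bool ELFGOT = false;
};

// Packs each option into its bit of the version value.
std::uint64_t encodeVersion(const PAuthOptions &O) {
  return (std::uint64_t(O.Intrinsics) << VERSION_INTRINSICS) |
         (std::uint64_t(O.Calls) << VERSION_CALLS) |
         (std::uint64_t(O.Returns) << VERSION_RETURNS) |
         (std::uint64_t(O.AuthTraps) << VERSION_AUTHTRAPS) |
         (std::uint64_t(O.VTPtrAddressDiscrimination) << VERSION_VPTRADDRDISCR) |
         (std::uint64_t(O.VTPtrTypeDiscrimination) << VERSION_VPTRTYPEDISCR) |
         (std::uint64_t(O.InitFini) << VERSION_INITFINI) |
         (std::uint64_t(O.ELFGOT) << VERSION_GOT);
}

// Tests a single flag in an encoded version value.
bool hasFlag(std::uint64_t Version, unsigned Bit) {
  assert(Bit <= VERSION_LAST && "unknown flag bit");
  return (Version >> Bit) & 1;
}
```

Under this layout the `0x55` value in the test expectation decodes to every even-numbered flag set and the GOT flag clear, which matches the trailing `!PointerAuthELFGOT` in the updated check line.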
[clang] [llvm] [Pipelines] Move IPSCCP after inliner pipeline (PR #96620)
goldsteinn wrote: > > This patch causes some significant performance regressions on > > llvm-test-suite (rv64gc-O3-thinlto): > > Name Before After Ratio > > SingleSource/Benchmarks/Shootout/Shootout-random 2.150161677 > > 3.300161641 +53.5% > > SingleSource/Benchmarks/Polybench/linear-algebra/kernels/trisolv/trisolv > > 0.111845159 0.145389494 +30.0% > > SingleSource/Benchmarks/Adobe-C++/functionobjects 5.489498263 > > 6.827863965 +24.4% > > It has been fixed. But this patch didn't show a positive net effect :( Does that mean it has a negative net effect, or is it neutral (in which case the original motivating case should be enough)? https://github.com/llvm/llvm-project/pull/96620 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [compiler-rt] [XRay] Add support for instrumentation of DSOs on x86_64 (PR #90959)
sebastiankreutzer wrote: > > @androm3da @MaskRay I'm tagging you because I'm having trouble to get > > feedback to this PR, and you seem to be the most recent contributors to > > XRay. Would one of you be willing to review it? Any other pointers on who > > to get in touch with are also much appreciated. > > I'm happy to take a look - but I'm traveling this week and won't be able to > until this weekend. That'd be great! There is no rush. https://github.com/llvm/llvm-project/pull/90959 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [flang] [Flang-new][OpenMP] Add offload related flags for AMDGPU (PR #96742)
banach-space wrote: > Clang for AMDGPU supports OpenMP and > [HIP](https://clang.llvm.org/docs/HIPSupport.html) and it reuses the same > code. For example `-fcuda-is-device` flag needs to be checked for [legacy HIP > host > code](https://github.com/llvm/llvm-project/blob/2033b1cf16f040e1369d8efba8439dcd3e36ed31/clang/lib/Basic/Targets/AMDGPU.cpp#L278). > Thanks! I'm still puzzled though: > In the future it will be needed for Flang equivalent functions: > AMDGPUTargetCodeGenInfo::getGlobalVarAddressSpace > AMDGPUTargetInfo::getTargetDefines Why would `-fcuda-is-device` be required? From your link I gather that the AMD logic in Clang simply makes sure that `-fcuda-is-device` wasn't used? > I would like to reuse the same part of the AMD GPU toolchain for Flang. That would be great - what's the plan here then? Simply to rely on the code in Clang? Also, note that that's `TargetInfo` (which lives in `clangBasic`) rather than `Toolchain` (that lives in `clangDriver`). This is actually key because it makes the coupling between Flang and Clang even stronger. https://github.com/llvm/llvm-project/pull/96742 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang][NFC] Use range-based for loops (PR #96831)
https://github.com/Endilll edited https://github.com/llvm/llvm-project/pull/96831 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang][NFC] Use range-based for loops (PR #96831)
https://github.com/Endilll commented: This looks good overall, but I have minor suggestions. https://github.com/llvm/llvm-project/pull/96831 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang][NFC] Use range-based for loops (PR #96831)
@@ -2056,40 +2056,40 @@ void CXXRecordDecl::completeDefinition() { completeDefinition(nullptr); } +static bool hasPureVirtualFinalOverrider( +const CXXRecordDecl &RD, const CXXFinalOverriderMap *FinalOverriders) { + auto ExistsIn = [](const CXXFinalOverriderMap &FinalOverriders) { +for (const auto &[_, M] : FinalOverriders) { + for (const auto &[_, SO] : M) { Endilll wrote: While we're at it, can we use more descriptive names than `M` and `SO`? I'm not even sure what the latter means. I also have reservations towards `auto` here. https://github.com/llvm/llvm-project/pull/96831 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
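The naming concern can be made concrete with a small C++17 sketch. The types below are simplified, hypothetical stand-ins for the nested final-overrider map (a map from method to a map from subobject number to a list of overriders); the real element types are not visible in the excerpt, so the binding names here are only a guess at what `M` and `SO` stand for.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Simplified stand-ins, assumed for illustration only.
struct Overrider {
  bool IsPure;
};
using OverridersBySubobject = std::map<unsigned, std::vector<Overrider>>;
using FinalOverriderMap = std::map<std::string, OverridersBySubobject>;

// Returns true if any final overrider in the map is pure virtual.
bool hasPureVirtualFinalOverrider(const FinalOverriderMap &FinalOverriders) {
  for (const auto &[Method, PerSubobject] : FinalOverriders) {
    (void)Method; // only the mapped value is needed
    for (const auto &[SubobjectNumber, Overriders] : PerSubobject) {
      (void)SubobjectNumber;
      assert(!Overriders.empty() && "each subobject has an overrider");
      for (const Overrider &O : Overriders)
        if (O.IsPure)
          return true;
    }
  }
  return false;
}
```

Spelling out the bound names makes each nesting level self-describing, which may also soften the objection to `auto`, since the structured-binding syntax requires it.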
[clang] [llvm] [Pipelines] Move IPSCCP after inliner pipeline (PR #96620)
https://github.com/fhahn commented: Running IPSCCP twice seems like quite a heavy hammer; I'd expect a noticeable compile-time impact. I'd recommend extracting a reproducer from your motivating use case and checking why IPSCCP cannot perform the desired optimization before inlining. Note that, I think, we run SCCP after inlining, which is the non-interprocedural version of IPSCCP. https://github.com/llvm/llvm-project/pull/96620
[clang] [Clang] Extend lifetime bound analysis to support assignments (PR #96475)
@@ -964,11 +966,26 @@ static bool pathOnlyInitializesGslPointer(IndirectLocalPath &Path) {
   return false;
 }

-void checkExprLifetime(Sema &SemaRef, const InitializedEntity &Entity,
+void checkExprLifetime(Sema &SemaRef, const CheckingEntity &CEntity,
                        Expr *Init) {
-  LifetimeResult LR = getEntityLifetime(&Entity);
-  LifetimeKind LK = LR.getInt();
-  const InitializedEntity *ExtendingEntity = LR.getPointer();
+  LifetimeKind LK = LK_FullExpression;
+
+  const AssignedEntity *AEntity = nullptr;
+  // Local variables for initialized entity.
+  const InitializedEntity *InitEntity = nullptr;
+  const InitializedEntity *ExtendingEntity = nullptr;
+  if (auto IEntityP = std::get_if<const InitializedEntity *>(&CEntity)) {
+    InitEntity = *IEntityP;
+    auto LTResult = getEntityLifetime(InitEntity);
+    LK = LTResult.getInt();
+    ExtendingEntity = LTResult.getPointer();
+  } else if (auto AEntityP = std::get_if<const AssignedEntity *>(&CEntity)) {
+    AEntity = *AEntityP;
+    if (AEntity->LHS->getType()->isPointerType()) // builtin pointer type
+      LK = LK_Extended;

Xazax-hun wrote:

I am a bit confused here, could you elaborate on why we want `LK_Extended` here? As far as I remember, assignments do not perform lifetime extension.

https://github.com/llvm/llvm-project/pull/96475
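For readers unfamiliar with the dispatch pattern in this hunk, the following is a minimal sketch of selecting a branch with `std::get_if` on a `std::variant` of pointers. It assumes `CheckingEntity` is a variant over two entity pointer types, which the quoted hunk does not show. The struct fields, the `LifetimeKind` enumerators, and `pickLifetimeKind` are all hypothetical and greatly simplified; this is not Clang's implementation, and it takes no position on whether `Extended` is the right kind for assignments.

```cpp
#include <variant>

// Hypothetical, simplified stand-ins; not Clang's real declarations.
enum class LifetimeKind { FullExpression, Extended };

struct InitializedEntity {
  LifetimeKind Kind; // stands in for what a lifetime query would return
};
struct AssignedEntity {
  bool LHSIsPointer; // stands in for a check on the LHS type
};

// Assumption: the checked entity is one of two pointer alternatives.
using CheckingEntity =
    std::variant<const InitializedEntity *, const AssignedEntity *>;

LifetimeKind pickLifetimeKind(const CheckingEntity &CEntity) {
  // std::get_if returns a pointer to the held alternative, or nullptr
  // if the variant currently holds a different alternative.
  if (const auto *InitP = std::get_if<const InitializedEntity *>(&CEntity))
    return (*InitP)->Kind;
  if (const auto *AssignP = std::get_if<const AssignedEntity *>(&CEntity))
    if ((*AssignP)->LHSIsPointer)
      return LifetimeKind::Extended; // mirrors the branch under review
  return LifetimeKind::FullExpression; // default, as in the hunk
}
```

Because each alternative is itself a pointer, `std::get_if` yields a pointer to a pointer, hence the extra dereference before accessing a member.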
[clang] [Clang] Extend lifetime bound analysis to support assignments (PR #96475)
https://github.com/Xazax-hun edited https://github.com/llvm/llvm-project/pull/96475