[llvm-branch-commits] [openmp] [OpenMP][AArch64] Fix branch protection in microtasks (#102317) (PR #103491)

2024-08-20 Thread Tobias Hieta via llvm-branch-commits

https://github.com/tru updated https://github.com/llvm/llvm-project/pull/103491

>From 38a591de66a86aaf523f78f8266a2d5f01a1b106 Mon Sep 17 00:00:00 2001
From: Tulio Magno Quites Machado Filho 
Date: Tue, 13 Aug 2024 15:34:41 -0300
Subject: [PATCH] [OpenMP][AArch64] Fix branch protection in microtasks
 (#102317)

Start __kmp_invoke_microtask with PACBTI_C in order to identify the
function as a valid branch target. Before returning, the return address
in LR is authenticated with SP as the modifier (PACBTI_RET).
Also add the BTI and PAC markers to z_Linux_asm.S.

With this patch, libomp.so can now be generated with DT_AARCH64_BTI_PLT
when built with -mbranch-protection=standard.

The implementation is based on the code available in compiler-rt.

(cherry picked from commit 0aa22dcd2f6ec5f46b8ef18fee88066463734935)
---
 openmp/runtime/src/z_Linux_asm.S | 53 
 1 file changed, 53 insertions(+)

diff --git a/openmp/runtime/src/z_Linux_asm.S b/openmp/runtime/src/z_Linux_asm.S
index 5b614e26a8337e..223ad091030e77 100644
--- a/openmp/runtime/src/z_Linux_asm.S
+++ b/openmp/runtime/src/z_Linux_asm.S
@@ -176,6 +176,53 @@ KMP_PREFIX_UNDERSCORE(\proc):
 .endm
 # endif // KMP_OS_DARWIN
 
+# if KMP_OS_LINUX
+// BTI and PAC gnu property note
+#  define NT_GNU_PROPERTY_TYPE_0 5
+#  define GNU_PROPERTY_AARCH64_FEATURE_1_AND 0xc0000000
+#  define GNU_PROPERTY_AARCH64_FEATURE_1_BTI 1
+#  define GNU_PROPERTY_AARCH64_FEATURE_1_PAC 2
+
+#  define GNU_PROPERTY(type, value)                                            \
+  .pushsection .note.gnu.property, "a";                                        \
+  .p2align 3;                                                                  \
+  .word 4;                                                                     \
+  .word 16;                                                                    \
+  .word NT_GNU_PROPERTY_TYPE_0;                                                \
+  .asciz "GNU";                                                                \
+  .word type;                                                                  \
+  .word 4;                                                                     \
+  .word value;                                                                 \
+  .word 0;                                                                     \
+  .popsection
+# endif
+
+# if defined(__ARM_FEATURE_BTI_DEFAULT)
+#  define BTI_FLAG GNU_PROPERTY_AARCH64_FEATURE_1_BTI
+# else
+#  define BTI_FLAG 0
+# endif
+# if __ARM_FEATURE_PAC_DEFAULT & 3
+#  define PAC_FLAG GNU_PROPERTY_AARCH64_FEATURE_1_PAC
+# else
+#  define PAC_FLAG 0
+# endif
+
+# if (BTI_FLAG | PAC_FLAG) != 0
+#  if PAC_FLAG != 0
+#   define PACBTI_C hint #25
+#   define PACBTI_RET hint #29
+#  else
+#   define PACBTI_C hint #34
+#   define PACBTI_RET
+#  endif
+#  define GNU_PROPERTY_BTI_PAC \
+GNU_PROPERTY(GNU_PROPERTY_AARCH64_FEATURE_1_AND, BTI_FLAG | PAC_FLAG)
+# else
+#  define PACBTI_C
+#  define PACBTI_RET
+#  define GNU_PROPERTY_BTI_PAC
+# endif
 #endif // (KMP_OS_LINUX || KMP_OS_DARWIN || KMP_OS_WINDOWS) && (KMP_ARCH_AARCH64 || KMP_ARCH_AARCH64_32 || KMP_ARCH_ARM)
 
 .macro COMMON name, size, align_power
@@ -1296,6 +1343,7 @@ __tid = 8
 // mark_begin;
.text
PROC __kmp_invoke_microtask
+   PACBTI_C
 
stp x29, x30, [sp, #-16]!
 # if OMPT_SUPPORT
@@ -1359,6 +1407,7 @@ KMP_LABEL(kmp_1):
ldp x19, x20, [sp], #16
 # endif
ldp x29, x30, [sp], #16
+   PACBTI_RET
ret
 
DEBUG_INFO __kmp_invoke_microtask
@@ -2472,3 +2521,7 @@ __kmp_unnamed_critical_addr:
 .4byte .gomp_critical_user_
 .size __kmp_unnamed_critical_addr, 4
 #endif
+
+#if KMP_OS_LINUX && (KMP_ARCH_AARCH64 || KMP_ARCH_AARCH64_32)
+GNU_PROPERTY_BTI_PAC
+#endif
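
The `GNU_PROPERTY` macro in this patch emits a 32-byte ELF note. As a reading aid, the sketch below rebuilds the same bytes in C++ so that the field order is easier to follow. The helper names `appendWord`, `buildGnuPropertyNote` and `demoGnuPropertyNote` are hypothetical and not part of libomp. The sketch assumes a little-endian target and uses 0xc0000000 for `GNU_PROPERTY_AARCH64_FEATURE_1_AND`, the value defined by the AArch64 ELF ABI.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Constants as used by the patch; the property type follows the AArch64 ELF ABI.
constexpr std::uint32_t NT_GNU_PROPERTY_TYPE_0 = 5;
constexpr std::uint32_t GNU_PROPERTY_AARCH64_FEATURE_1_AND = 0xc0000000u;
constexpr std::uint32_t GNU_PROPERTY_AARCH64_FEATURE_1_BTI = 1;
constexpr std::uint32_t GNU_PROPERTY_AARCH64_FEATURE_1_PAC = 2;

// Hypothetical helper: append one 32-bit word in little-endian order,
// which is what `.word` produces on a little-endian AArch64 target.
inline void appendWord(std::vector<std::uint8_t> &out, std::uint32_t w) {
  for (int i = 0; i < 4; ++i)
    out.push_back(static_cast<std::uint8_t>((w >> (8 * i)) & 0xFFu));
}

// Hypothetical helper mirroring GNU_PROPERTY(type, value) field by field.
inline std::vector<std::uint8_t> buildGnuPropertyNote(std::uint32_t type,
                                                      std::uint32_t value) {
  std::vector<std::uint8_t> note;
  appendWord(note, 4);                        // n_namesz: "GNU" plus NUL
  appendWord(note, 16);                       // n_descsz: one padded property
  appendWord(note, NT_GNU_PROPERTY_TYPE_0);   // n_type
  const char name[4] = {'G', 'N', 'U', '\0'}; // .asciz "GNU"
  for (char c : name)
    note.push_back(static_cast<std::uint8_t>(c));
  appendWord(note, type);  // pr_type
  appendWord(note, 4);     // pr_datasz
  appendWord(note, value); // pr_data: feature bits
  appendWord(note, 0);     // padding up to 8-byte alignment
  assert(note.size() == 32); // 12-byte header + 4-byte name + 16-byte desc
  return note;
}

// Usage: the note emitted when both BTI and PAC are enabled.
inline void demoGnuPropertyNote() {
  const std::vector<std::uint8_t> note = buildGnuPropertyNote(
      GNU_PROPERTY_AARCH64_FEATURE_1_AND,
      GNU_PROPERTY_AARCH64_FEATURE_1_BTI | GNU_PROPERTY_AARCH64_FEATURE_1_PAC);
  assert(note[24] == 3); // BTI (1) | PAC (2)
}
```

A linker that sees this note in every input object can mark the output as BTI-compatible, which is what makes `DT_AARCH64_BTI_PLT` possible for libomp.so.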

___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [openmp] 38a591d - [OpenMP][AArch64] Fix branch protection in microtasks (#102317)

2024-08-20 Thread Tobias Hieta via llvm-branch-commits

Author: Tulio Magno Quites Machado Filho
Date: 2024-08-20T09:14:30+02:00
New Revision: 38a591de66a86aaf523f78f8266a2d5f01a1b106

URL: https://github.com/llvm/llvm-project/commit/38a591de66a86aaf523f78f8266a2d5f01a1b106
DIFF: https://github.com/llvm/llvm-project/commit/38a591de66a86aaf523f78f8266a2d5f01a1b106.diff

LOG: [OpenMP][AArch64] Fix branch protection in microtasks (#102317)

Start __kmp_invoke_microtask with PACBTI_C in order to identify the
function as a valid branch target. Before returning, the return address
in LR is authenticated with SP as the modifier (PACBTI_RET).
Also add the BTI and PAC markers to z_Linux_asm.S.

With this patch, libomp.so can now be generated with DT_AARCH64_BTI_PLT
when built with -mbranch-protection=standard.

The implementation is based on the code available in compiler-rt.

(cherry picked from commit 0aa22dcd2f6ec5f46b8ef18fee88066463734935)

Added: 


Modified: 
openmp/runtime/src/z_Linux_asm.S

Removed: 




diff  --git a/openmp/runtime/src/z_Linux_asm.S b/openmp/runtime/src/z_Linux_asm.S
index 5b614e26a8337e..223ad091030e77 100644
--- a/openmp/runtime/src/z_Linux_asm.S
+++ b/openmp/runtime/src/z_Linux_asm.S
@@ -176,6 +176,53 @@ KMP_PREFIX_UNDERSCORE(\proc):
 .endm
 # endif // KMP_OS_DARWIN
 
+# if KMP_OS_LINUX
+// BTI and PAC gnu property note
+#  define NT_GNU_PROPERTY_TYPE_0 5
+#  define GNU_PROPERTY_AARCH64_FEATURE_1_AND 0xc0000000
+#  define GNU_PROPERTY_AARCH64_FEATURE_1_BTI 1
+#  define GNU_PROPERTY_AARCH64_FEATURE_1_PAC 2
+
+#  define GNU_PROPERTY(type, value)                                            \
+  .pushsection .note.gnu.property, "a";                                        \
+  .p2align 3;                                                                  \
+  .word 4;                                                                     \
+  .word 16;                                                                    \
+  .word NT_GNU_PROPERTY_TYPE_0;                                                \
+  .asciz "GNU";                                                                \
+  .word type;                                                                  \
+  .word 4;                                                                     \
+  .word value;                                                                 \
+  .word 0;                                                                     \
+  .popsection
+# endif
+
+# if defined(__ARM_FEATURE_BTI_DEFAULT)
+#  define BTI_FLAG GNU_PROPERTY_AARCH64_FEATURE_1_BTI
+# else
+#  define BTI_FLAG 0
+# endif
+# if __ARM_FEATURE_PAC_DEFAULT & 3
+#  define PAC_FLAG GNU_PROPERTY_AARCH64_FEATURE_1_PAC
+# else
+#  define PAC_FLAG 0
+# endif
+
+# if (BTI_FLAG | PAC_FLAG) != 0
+#  if PAC_FLAG != 0
+#   define PACBTI_C hint #25
+#   define PACBTI_RET hint #29
+#  else
+#   define PACBTI_C hint #34
+#   define PACBTI_RET
+#  endif
+#  define GNU_PROPERTY_BTI_PAC \
+GNU_PROPERTY(GNU_PROPERTY_AARCH64_FEATURE_1_AND, BTI_FLAG | PAC_FLAG)
+# else
+#  define PACBTI_C
+#  define PACBTI_RET
+#  define GNU_PROPERTY_BTI_PAC
+# endif
 #endif // (KMP_OS_LINUX || KMP_OS_DARWIN || KMP_OS_WINDOWS) && (KMP_ARCH_AARCH64 || KMP_ARCH_AARCH64_32 || KMP_ARCH_ARM)
 
 .macro COMMON name, size, align_power
@@ -1296,6 +1343,7 @@ __tid = 8
 // mark_begin;
.text
PROC __kmp_invoke_microtask
+   PACBTI_C
 
stp x29, x30, [sp, #-16]!
 # if OMPT_SUPPORT
@@ -1359,6 +1407,7 @@ KMP_LABEL(kmp_1):
ldp x19, x20, [sp], #16
 # endif
ldp x29, x30, [sp], #16
+   PACBTI_RET
ret
 
DEBUG_INFO __kmp_invoke_microtask
@@ -2472,3 +2521,7 @@ __kmp_unnamed_critical_addr:
 .4byte .gomp_critical_user_
 .size __kmp_unnamed_critical_addr, 4
 #endif
+
+#if KMP_OS_LINUX && (KMP_ARCH_AARCH64 || KMP_ARCH_AARCH64_32)
+GNU_PROPERTY_BTI_PAC
+#endif
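
The preprocessor ladder that selects `PACBTI_C` and `PACBTI_RET` can be restated as a small decision function. The sketch below is only a model of that ladder: `BranchProtectionHints`, `selectHints` and `demoSelectHints` are hypothetical names, and the two parameters stand in for the `__ARM_FEATURE_BTI_DEFAULT` and `__ARM_FEATURE_PAC_DEFAULT` macros. In the hint space, `hint #25` is PACIASP, `hint #29` is AUTIASP and `hint #34` is BTI c. PACIASP is itself a valid landing pad for calls, which is why no separate BTI instruction is needed when PAC is enabled.

```cpp
#include <cassert>
#include <string>

// Hypothetical result type: the text each macro expands to
// (an empty string means the macro expands to nothing).
struct BranchProtectionHints {
  std::string entry; // expansion of PACBTI_C
  std::string ret;   // expansion of PACBTI_RET
};

// Hypothetical model of the #if ladder in the patch.
//   btiDefault: true if __ARM_FEATURE_BTI_DEFAULT is defined
//   pacDefault: value of __ARM_FEATURE_PAC_DEFAULT (0 if undefined)
inline BranchProtectionHints selectHints(bool btiDefault, unsigned pacDefault) {
  const bool pac = (pacDefault & 3u) != 0; // same test as the patch
  if (!btiDefault && !pac)
    return {"", ""};                 // no branch protection requested
  if (pac)
    return {"hint #25", "hint #29"}; // PACIASP on entry, AUTIASP before ret
  return {"hint #34", ""};           // BTI c only, nothing to authenticate
}

// Usage: -mbranch-protection=standard typically enables both BTI and PAC.
inline void demoSelectHints() {
  const BranchProtectionHints h = selectHints(true, 1);
  assert(h.entry == "hint #25");
  assert(h.ret == "hint #29");
}
```

Using the `hint` encodings rather than the mnemonics keeps the file assemblable by toolchains that do not know the extensions, and the instructions execute as NOPs on cores that lack them.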





[llvm-branch-commits] [openmp] [OpenMP][AArch64] Fix branch protection in microtasks (#102317) (PR #103491)

2024-08-20 Thread Tobias Hieta via llvm-branch-commits

https://github.com/tru closed https://github.com/llvm/llvm-project/pull/103491


[llvm-branch-commits] [clang] [clang][driver][clang-cl] Fix unused argument warning for `/std:c++20` for precompiled module inputs to `clang-cl` (PR #102438)

2024-08-20 Thread Tobias Hieta via llvm-branch-commits

https://github.com/tru updated https://github.com/llvm/llvm-project/pull/102438

>From 6fcbfb8ebc9650a2ea184aac244d067efdbe441e Mon Sep 17 00:00:00 2001
From: Sharadh Rajaraman 
Date: Mon, 19 Aug 2024 12:17:58 +0100
Subject: [PATCH] [clang][driver] `TY_ModuleFile` should be a 'CXX' file type

---
 clang/lib/Driver/Types.cpp  | 4 +++-
 clang/test/Driver/cl-cxx20-modules.cppm | 8 
 2 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/clang/lib/Driver/Types.cpp b/clang/lib/Driver/Types.cpp
index a7b6b9000e1d2b..2b9b391c19c9fd 100644
--- a/clang/lib/Driver/Types.cpp
+++ b/clang/lib/Driver/Types.cpp
@@ -242,7 +242,9 @@ bool types::isCXX(ID Id) {
   case TY_CXXHUHeader:
   case TY_PP_CXXHeaderUnit:
   case TY_ObjCXXHeader: case TY_PP_ObjCXXHeader:
-  case TY_CXXModule: case TY_PP_CXXModule:
+  case TY_CXXModule:
+  case TY_PP_CXXModule:
+  case TY_ModuleFile:
   case TY_PP_CLCXX:
   case TY_CUDA: case TY_PP_CUDA: case TY_CUDA_DEVICE:
   case TY_HIP:
diff --git a/clang/test/Driver/cl-cxx20-modules.cppm b/clang/test/Driver/cl-cxx20-modules.cppm
index 06df929c42342f..43dbf517485a05 100644
--- a/clang/test/Driver/cl-cxx20-modules.cppm
+++ b/clang/test/Driver/cl-cxx20-modules.cppm
@@ -1,3 +1,6 @@
+// RUN: rm -rf %t
+// RUN: split-file %s %t
+
 // RUN: %clang_cl /std:c++20 --precompile -### -- %s 2>&1 | FileCheck --check-prefix=PRECOMPILE %s
 // PRECOMPILE: -emit-module-interface
 
@@ -6,3 +9,8 @@
 
 // RUN: %clang_cl /std:c++20 --fprebuilt-module-path=. -### -- %s 2>&1 | FileCheck --check-prefix=FPREBUILT %s
 // FPREBUILT: -fprebuilt-module-path=.
+
+// RUN: %clang_cl %t/test.pcm /std:c++20 -### 2>&1 | FileCheck --check-prefix=CPP20WARNING %t/test.pcm
+
+//--- test.pcm
+// CPP20WARNING-NOT: clang-cl: warning: argument unused during compilation: '/std:c++20' [-Wunused-command-line-argument]
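
The effect of adding `TY_ModuleFile` to `types::isCXX` can be illustrated with a reduced model. Everything below is a hypothetical stand-in: the `TypeID` enum lists only a handful of IDs, and `stdFlagIsConsumed` encodes the assumption that the driver only consumes `/std:c++20` for inputs it classifies as C++, which would explain why a `.pcm` input previously triggered the unused-argument warning.

```cpp
#include <cassert>

// Hypothetical, heavily reduced stand-in for the driver's type IDs.
enum class TypeID { C, CXX, CXXModule, PP_CXXModule, ModuleFile, Object };

// Model of the classification before the patch: module files are not C++.
inline bool isCXXBeforePatch(TypeID Id) {
  switch (Id) {
  case TypeID::CXX:
  case TypeID::CXXModule:
  case TypeID::PP_CXXModule:
    return true;
  default:
    return false;
  }
}

// Model of the classification after the patch.
inline bool isCXX(TypeID Id) {
  switch (Id) {
  case TypeID::CXX:
  case TypeID::CXXModule:
  case TypeID::PP_CXXModule:
  case TypeID::ModuleFile: // the case added by the patch
    return true;
  default:
    return false;
  }
}

// Assumption for illustration: a language-standard flag counts as "used"
// only when the input is classified as C++.
inline bool stdFlagIsConsumed(TypeID input) { return isCXX(input); }

// Usage: a precompiled module input now consumes the flag.
inline void demoModuleFileIsCXX() {
  assert(!isCXXBeforePatch(TypeID::ModuleFile));
  assert(stdFlagIsConsumed(TypeID::ModuleFile));
}
```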



[llvm-branch-commits] [clang] 6fcbfb8 - [clang][driver] `TY_ModuleFile` should be a 'CXX' file type

2024-08-20 Thread Tobias Hieta via llvm-branch-commits

Author: Sharadh Rajaraman
Date: 2024-08-20T09:15:54+02:00
New Revision: 6fcbfb8ebc9650a2ea184aac244d067efdbe441e

URL: https://github.com/llvm/llvm-project/commit/6fcbfb8ebc9650a2ea184aac244d067efdbe441e
DIFF: https://github.com/llvm/llvm-project/commit/6fcbfb8ebc9650a2ea184aac244d067efdbe441e.diff

LOG: [clang][driver] `TY_ModuleFile` should be a 'CXX' file type

Added: 


Modified: 
clang/lib/Driver/Types.cpp
clang/test/Driver/cl-cxx20-modules.cppm

Removed: 




diff  --git a/clang/lib/Driver/Types.cpp b/clang/lib/Driver/Types.cpp
index a7b6b9000e1d2b..2b9b391c19c9fd 100644
--- a/clang/lib/Driver/Types.cpp
+++ b/clang/lib/Driver/Types.cpp
@@ -242,7 +242,9 @@ bool types::isCXX(ID Id) {
   case TY_CXXHUHeader:
   case TY_PP_CXXHeaderUnit:
   case TY_ObjCXXHeader: case TY_PP_ObjCXXHeader:
-  case TY_CXXModule: case TY_PP_CXXModule:
+  case TY_CXXModule:
+  case TY_PP_CXXModule:
+  case TY_ModuleFile:
   case TY_PP_CLCXX:
   case TY_CUDA: case TY_PP_CUDA: case TY_CUDA_DEVICE:
   case TY_HIP:

diff  --git a/clang/test/Driver/cl-cxx20-modules.cppm b/clang/test/Driver/cl-cxx20-modules.cppm
index 06df929c42342f..43dbf517485a05 100644
--- a/clang/test/Driver/cl-cxx20-modules.cppm
+++ b/clang/test/Driver/cl-cxx20-modules.cppm
@@ -1,3 +1,6 @@
+// RUN: rm -rf %t
+// RUN: split-file %s %t
+
 // RUN: %clang_cl /std:c++20 --precompile -### -- %s 2>&1 | FileCheck --check-prefix=PRECOMPILE %s
 // PRECOMPILE: -emit-module-interface
 
@@ -6,3 +9,8 @@
 
 // RUN: %clang_cl /std:c++20 --fprebuilt-module-path=. -### -- %s 2>&1 | FileCheck --check-prefix=FPREBUILT %s
 // FPREBUILT: -fprebuilt-module-path=.
+
+// RUN: %clang_cl %t/test.pcm /std:c++20 -### 2>&1 | FileCheck --check-prefix=CPP20WARNING %t/test.pcm
+
+//--- test.pcm
+// CPP20WARNING-NOT: clang-cl: warning: argument unused during compilation: '/std:c++20' [-Wunused-command-line-argument]





[llvm-branch-commits] [openmp] [OpenMP][AArch64] Fix branch protection in microtasks (#102317) (PR #103491)

2024-08-20 Thread via llvm-branch-commits

github-actions[bot] wrote:

@tuliom (or anyone else): if you would like to add a note about this fix to the
release notes (completely optional), please reply to this comment with a one- or
two-sentence description of the fix. When you are done, please add the
release:note label to this PR.

https://github.com/llvm/llvm-project/pull/103491


[llvm-branch-commits] [clang] [clang][driver][clang-cl] Fix unused argument warning for `/std:c++20` for precompiled module inputs to `clang-cl` (PR #102438)

2024-08-20 Thread Tobias Hieta via llvm-branch-commits

https://github.com/tru closed https://github.com/llvm/llvm-project/pull/102438


[llvm-branch-commits] [llvm] release/19.x: [Mips] Fix fast isel for i16 bswap. (#103398) (PR #104745)

2024-08-20 Thread Tobias Hieta via llvm-branch-commits

https://github.com/tru updated https://github.com/llvm/llvm-project/pull/104745

>From 9545ef53ebe8be2a53ef6f84626f52bed73c82ba Mon Sep 17 00:00:00 2001
From: Craig Topper 
Date: Fri, 16 Aug 2024 14:54:51 -0700
Subject: [PATCH] [Mips] Fix fast isel for i16 bswap. (#103398)

We need to mask the SRL result to 8 bits before ORing in the SLL. This
is needed in case bits 23:16 of the input aren't zero. They will have
been shifted into bits 15:8.

We don't need to AND the result with 0xFFFF. It's ok if the upper 16
bits of the register are garbage.

Fixes #103035.

(cherry picked from commit ebe7265b142f370f0a563fece5db22f57383ba2d)
---
 llvm/lib/Target/Mips/MipsFastISel.cpp  | 4 ++--
 llvm/test/CodeGen/Mips/Fast-ISel/bswap1.ll | 4 ++--
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/llvm/lib/Target/Mips/MipsFastISel.cpp b/llvm/lib/Target/Mips/MipsFastISel.cpp
index bd8ef43da625c3..64a0e9321598ff 100644
--- a/llvm/lib/Target/Mips/MipsFastISel.cpp
+++ b/llvm/lib/Target/Mips/MipsFastISel.cpp
@@ -1608,8 +1608,8 @@ bool MipsFastISel::fastLowerIntrinsicCall(const IntrinsicInst *II) {
 }
 emitInst(Mips::SLL, TempReg[0]).addReg(SrcReg).addImm(8);
 emitInst(Mips::SRL, TempReg[1]).addReg(SrcReg).addImm(8);
-emitInst(Mips::OR, TempReg[2]).addReg(TempReg[0]).addReg(TempReg[1]);
-emitInst(Mips::ANDi, DestReg).addReg(TempReg[2]).addImm(0xFFFF);
+emitInst(Mips::ANDi, TempReg[2]).addReg(TempReg[1]).addImm(0xFF);
+emitInst(Mips::OR, DestReg).addReg(TempReg[0]).addReg(TempReg[2]);
 updateValueMap(II, DestReg);
 return true;
   }
diff --git a/llvm/test/CodeGen/Mips/Fast-ISel/bswap1.ll b/llvm/test/CodeGen/Mips/Fast-ISel/bswap1.ll
index bd762a0e1d741f..ce664c78e86c2a 100644
--- a/llvm/test/CodeGen/Mips/Fast-ISel/bswap1.ll
+++ b/llvm/test/CodeGen/Mips/Fast-ISel/bswap1.ll
@@ -21,8 +21,8 @@ define void @b16() {
 
   ; 32R1:   sll   $[[TMP1:[0-9]+]], $[[A_VAL]], 8
   ; 32R1:   srl   $[[TMP2:[0-9]+]], $[[A_VAL]], 8
-  ; 32R1:   or    $[[TMP3:[0-9]+]], $[[TMP1]], $[[TMP2]]
-  ; 32R1:   andi  $[[TMP4:[0-9]+]], $[[TMP3]], 65535
+  ; 32R1:   andi  $[[TMP3:[0-9]+]], $[[TMP2]], 255
+  ; 32R1:   or    $[[RESULT:[0-9]+]], $[[TMP1]], $[[TMP3]]
 
   ; 32R2:   wsbh  $[[RESULT:[0-9]+]], $[[A_VAL]]
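
The difference between the old and new instruction sequences is easiest to see on a concrete register value. The sketch below models both sequences on a 32-bit register in C++. The function names are hypothetical, and the old mask is taken to be 0xFFFF, matching the `andi ..., 65535` line removed from the test.

```cpp
#include <cassert>
#include <cstdint>

// Sequence before the patch: sll, srl, or, andi 0xFFFF.
inline std::uint32_t bswap16Old(std::uint32_t src) {
  const std::uint32_t t0 = src << 8; // low byte moves to bits 15:8
  const std::uint32_t t1 = src >> 8; // bits 23:16 also land in bits 15:8
  const std::uint32_t t2 = t0 | t1;  // the two contributions collide here
  return t2 & 0xFFFFu;
}

// Sequence after the patch: sll, srl, andi 0xFF, or.
inline std::uint32_t bswap16New(std::uint32_t src) {
  const std::uint32_t t0 = src << 8;
  const std::uint32_t t1 = src >> 8;
  const std::uint32_t t2 = t1 & 0xFFu; // keep only the original high byte
  return t0 | t2;                      // upper 16 bits may hold garbage
}

// Usage: an i16 value 0x1234 held in a register whose bits 23:16 are 0xAB.
inline void demoBswap16() {
  const std::uint32_t src = 0x00AB1234u;
  assert((bswap16Old(src) & 0xFFFFu) == 0xBF12u); // wrong: 0x34 | 0xAB = 0xBF
  assert((bswap16New(src) & 0xFFFFu) == 0x3412u); // correct byte swap
}
```

Only the low 16 bits of the result are compared, in line with the commit message's point that the upper half of the register is allowed to be garbage.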
 



[llvm-branch-commits] [llvm] release/19.x: [Mips] Fix fast isel for i16 bswap. (#103398) (PR #104745)

2024-08-20 Thread Tobias Hieta via llvm-branch-commits

https://github.com/tru closed https://github.com/llvm/llvm-project/pull/104745


[llvm-branch-commits] [llvm] 9545ef5 - [Mips] Fix fast isel for i16 bswap. (#103398)

2024-08-20 Thread Tobias Hieta via llvm-branch-commits

Author: Craig Topper
Date: 2024-08-20T09:16:16+02:00
New Revision: 9545ef53ebe8be2a53ef6f84626f52bed73c82ba

URL: https://github.com/llvm/llvm-project/commit/9545ef53ebe8be2a53ef6f84626f52bed73c82ba
DIFF: https://github.com/llvm/llvm-project/commit/9545ef53ebe8be2a53ef6f84626f52bed73c82ba.diff

LOG: [Mips] Fix fast isel for i16 bswap. (#103398)

We need to mask the SRL result to 8 bits before ORing in the SLL. This
is needed in case bits 23:16 of the input aren't zero. They will have
been shifted into bits 15:8.

We don't need to AND the result with 0xFFFF. It's ok if the upper 16
bits of the register are garbage.

Fixes #103035.

(cherry picked from commit ebe7265b142f370f0a563fece5db22f57383ba2d)

Added: 


Modified: 
llvm/lib/Target/Mips/MipsFastISel.cpp
llvm/test/CodeGen/Mips/Fast-ISel/bswap1.ll

Removed: 




diff  --git a/llvm/lib/Target/Mips/MipsFastISel.cpp b/llvm/lib/Target/Mips/MipsFastISel.cpp
index bd8ef43da625c3..64a0e9321598ff 100644
--- a/llvm/lib/Target/Mips/MipsFastISel.cpp
+++ b/llvm/lib/Target/Mips/MipsFastISel.cpp
@@ -1608,8 +1608,8 @@ bool MipsFastISel::fastLowerIntrinsicCall(const IntrinsicInst *II) {
 }
 emitInst(Mips::SLL, TempReg[0]).addReg(SrcReg).addImm(8);
 emitInst(Mips::SRL, TempReg[1]).addReg(SrcReg).addImm(8);
-emitInst(Mips::OR, TempReg[2]).addReg(TempReg[0]).addReg(TempReg[1]);
-emitInst(Mips::ANDi, DestReg).addReg(TempReg[2]).addImm(0xFFFF);
+emitInst(Mips::ANDi, TempReg[2]).addReg(TempReg[1]).addImm(0xFF);
+emitInst(Mips::OR, DestReg).addReg(TempReg[0]).addReg(TempReg[2]);
 updateValueMap(II, DestReg);
 return true;
   }

diff  --git a/llvm/test/CodeGen/Mips/Fast-ISel/bswap1.ll b/llvm/test/CodeGen/Mips/Fast-ISel/bswap1.ll
index bd762a0e1d741f..ce664c78e86c2a 100644
--- a/llvm/test/CodeGen/Mips/Fast-ISel/bswap1.ll
+++ b/llvm/test/CodeGen/Mips/Fast-ISel/bswap1.ll
@@ -21,8 +21,8 @@ define void @b16() {
 
   ; 32R1:   sll   $[[TMP1:[0-9]+]], $[[A_VAL]], 8
   ; 32R1:   srl   $[[TMP2:[0-9]+]], $[[A_VAL]], 8
-  ; 32R1:   or    $[[TMP3:[0-9]+]], $[[TMP1]], $[[TMP2]]
-  ; 32R1:   andi  $[[TMP4:[0-9]+]], $[[TMP3]], 65535
+  ; 32R1:   andi  $[[TMP3:[0-9]+]], $[[TMP2]], 255
+  ; 32R1:   or    $[[RESULT:[0-9]+]], $[[TMP1]], $[[TMP3]]
 
   ; 32R2:   wsbh  $[[RESULT:[0-9]+]], $[[A_VAL]]
 





[llvm-branch-commits] [clang] [clang][driver][clang-cl] Fix unused argument warning for `/std:c++20` for precompiled module inputs to `clang-cl` (PR #102438)

2024-08-20 Thread via llvm-branch-commits

github-actions[bot] wrote:

@sharadhr (or anyone else): if you would like to add a note about this fix to
the release notes (completely optional), please reply to this comment with a
one- or two-sentence description of the fix. When you are done, please add the
release:note label to this PR.

https://github.com/llvm/llvm-project/pull/102438


[llvm-branch-commits] [llvm] 8595e91 - Add some brief LLVM 19 release notes for Pointer Authentication ABI support.

2024-08-20 Thread Tobias Hieta via llvm-branch-commits

Author: Anton Korobeynikov
Date: 2024-08-20T09:16:42+02:00
New Revision: 8595e91b16dadc33fbb321cfd30b77f43f64e10e

URL: https://github.com/llvm/llvm-project/commit/8595e91b16dadc33fbb321cfd30b77f43f64e10e
DIFF: https://github.com/llvm/llvm-project/commit/8595e91b16dadc33fbb321cfd30b77f43f64e10e.diff

LOG: Add some brief LLVM 19 release notes for Pointer Authentication ABI support.

Added: 


Modified: 
clang/docs/ReleaseNotes.rst
llvm/docs/ReleaseNotes.rst

Removed: 




diff  --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index b56e7177846d99..17ddbfe910f878 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -1191,11 +1191,13 @@ Arm and AArch64 Support
   improvements for most targets. We have not changed the default behavior for
   ARMv6, but may revisit that decision in the future. Users can restore the old
   behavior with -m[no-]unaligned-access.
+
 - An alias identifier (rdma) has been added for targeting the AArch64
   Architecture Extension which uses Rounding Doubling Multiply Accumulate
   instructions (rdm). The identifier is available on the command line as
   a feature modifier for -march and -mcpu as well as via target attributes
   like ``target_version`` or ``target_clones``.
+
 - Support has been added for the following processors (-mcpu identifiers in parenthesis):
 * Arm Cortex-R52+ (cortex-r52plus).
 * Arm Cortex-R82AE (cortex-r82ae).
@@ -1213,6 +1215,12 @@ Arm and AArch64 Support
   objects. It doesn't cause any code generation changes, as the code generated
   by clang is already compatible with GCS.
 
+ - Experimental support has been added for pointer authentication ABI for С/C++.
+
+ - Pointer authentication ABI could be enabled for AArch64 Linux via
+   ``-mabi=pauthtest`` option or via specifying ``pauthtest`` environment part of
+   target triple.
+
 Android Support
 ^^^
 

diff  --git a/llvm/docs/ReleaseNotes.rst b/llvm/docs/ReleaseNotes.rst
index a81caa160883d8..60b6c6e786df89 100644
--- a/llvm/docs/ReleaseNotes.rst
+++ b/llvm/docs/ReleaseNotes.rst
@@ -80,6 +80,11 @@ Changes to the LLVM IR
 removed. The next argument has been changed from byte index to bit
 index.
 * Added ``llvm.experimental.vector.compress`` intrinsic.
+* Added special kind of `constant expressions
+  `_ to
+  represent pointers with signature embedded into it.
+* Added `pointer authentication operand bundles
+  
`_. 
 
 Changes to LLVM infrastructure
 --
@@ -125,6 +130,15 @@ Changes to the AArch64 Backend
   when specified via ``-march=`` or an ``-mcpu=`` that supports them.  The
   attribute ``"target-features"="+v9a"`` no longer implies ``"+sve"`` and
   ``"+sve2"`` respectively.
+* Added support for ELF pointer authentication relocations as specified in
+  `PAuth ABI Extension to ELF
+  
`_.
+* Added codegeneration, ELF object file and linker support for authenticated
+  call lowering, signed constants and emission of signing scheme details in
+  ``GNU_PROPERTY_AARCH64_FEATURE_PAUTH`` property of ``.note.gnu.property``
+  section.
+* Added codegeneration support for ``llvm.ptrauth.auth`` and
+  ``llvm.ptrauth.resign`` intrinsics.
 
 Changes to the AMDGPU Backend
 -





[llvm-branch-commits] [clang] [llvm] Add some brief LLVM 19 release notes for Pointer Authentication ABI support (PR #104657)

2024-08-20 Thread Tobias Hieta via llvm-branch-commits

https://github.com/tru updated https://github.com/llvm/llvm-project/pull/104657

>From 8595e91b16dadc33fbb321cfd30b77f43f64e10e Mon Sep 17 00:00:00 2001
From: Anton Korobeynikov 
Date: Fri, 16 Aug 2024 18:09:53 -0700
Subject: [PATCH] Add some brief LLVM 19 release notes for Pointer
 Authentication ABI support.

---
 clang/docs/ReleaseNotes.rst |  8 
 llvm/docs/ReleaseNotes.rst  | 14 ++
 2 files changed, 22 insertions(+)

diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index b56e7177846d99..17ddbfe910f878 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -1191,11 +1191,13 @@ Arm and AArch64 Support
   improvements for most targets. We have not changed the default behavior for
   ARMv6, but may revisit that decision in the future. Users can restore the old
   behavior with -m[no-]unaligned-access.
+
 - An alias identifier (rdma) has been added for targeting the AArch64
   Architecture Extension which uses Rounding Doubling Multiply Accumulate
   instructions (rdm). The identifier is available on the command line as
   a feature modifier for -march and -mcpu as well as via target attributes
   like ``target_version`` or ``target_clones``.
+
 - Support has been added for the following processors (-mcpu identifiers in parenthesis):
 * Arm Cortex-R52+ (cortex-r52plus).
 * Arm Cortex-R82AE (cortex-r82ae).
@@ -1213,6 +1215,12 @@ Arm and AArch64 Support
   objects. It doesn't cause any code generation changes, as the code generated
   by clang is already compatible with GCS.
 
+ - Experimental support has been added for pointer authentication ABI for С/C++.
+
+ - Pointer authentication ABI could be enabled for AArch64 Linux via
+   ``-mabi=pauthtest`` option or via specifying ``pauthtest`` environment part of
+   target triple.
+
 Android Support
 ^^^
 
diff --git a/llvm/docs/ReleaseNotes.rst b/llvm/docs/ReleaseNotes.rst
index a81caa160883d8..60b6c6e786df89 100644
--- a/llvm/docs/ReleaseNotes.rst
+++ b/llvm/docs/ReleaseNotes.rst
@@ -80,6 +80,11 @@ Changes to the LLVM IR
 removed. The next argument has been changed from byte index to bit
 index.
 * Added ``llvm.experimental.vector.compress`` intrinsic.
+* Added special kind of `constant expressions
+  `_ to
+  represent pointers with signature embedded into it.
+* Added `pointer authentication operand bundles
+  
`_. 
 
 Changes to LLVM infrastructure
 --
@@ -125,6 +130,15 @@ Changes to the AArch64 Backend
   when specified via ``-march=`` or an ``-mcpu=`` that supports them.  The
   attribute ``"target-features"="+v9a"`` no longer implies ``"+sve"`` and
   ``"+sve2"`` respectively.
+* Added support for ELF pointer authentication relocations as specified in
+  `PAuth ABI Extension to ELF
+  
`_.
+* Added codegeneration, ELF object file and linker support for authenticated
+  call lowering, signed constants and emission of signing scheme details in
+  ``GNU_PROPERTY_AARCH64_FEATURE_PAUTH`` property of ``.note.gnu.property``
+  section.
+* Added codegeneration support for ``llvm.ptrauth.auth`` and
+  ``llvm.ptrauth.resign`` intrinsics.
 
 Changes to the AMDGPU Backend
 -



[llvm-branch-commits] [clang] [llvm] Add some brief LLVM 19 release notes for Pointer Authentication ABI support (PR #104657)

2024-08-20 Thread Tobias Hieta via llvm-branch-commits

https://github.com/tru closed https://github.com/llvm/llvm-project/pull/104657


[llvm-branch-commits] [llvm] release/19.x: [Mips] Fix fast isel for i16 bswap. (#103398) (PR #104745)

2024-08-20 Thread via llvm-branch-commits

github-actions[bot] wrote:

@nikic (or anyone else): if you would like to add a note about this fix to the
release notes (completely optional), please reply to this comment with a one- or
two-sentence description of the fix. When you are done, please add the
release:note label to this PR.

https://github.com/llvm/llvm-project/pull/104745


[llvm-branch-commits] [llvm] release/19.x: [BOLT] Fix relocations handling (PR #102741)

2024-08-20 Thread Tobias Hieta via llvm-branch-commits

https://github.com/tru updated https://github.com/llvm/llvm-project/pull/102741

>From bb46c721211b901f7ab34551e4bb240308203da9 Mon Sep 17 00:00:00 2001
From: Vladislav Khmelevsky 
Date: Sat, 27 Jul 2024 23:07:59 +0400
Subject: [PATCH] release/19.x: [BOLT] Fix relocations handling

Backport https://github.com/llvm/llvm-project/commit/097ddd3565f830e6cb9d0bb8ca66844b7f3f3cbb
---
 bolt/lib/Rewrite/RewriteInstance.cpp   |   2 +-
 bolt/test/AArch64/Inputs/build_id.ldscript |   9 +
 bolt/test/AArch64/build_id.c   |  25 ++
 bolt/test/X86/Inputs/build_id.yaml | 326 +
 bolt/test/X86/build_id.test|   8 +
 5 files changed, 369 insertions(+), 1 deletion(-)
 create mode 100644 bolt/test/AArch64/Inputs/build_id.ldscript
 create mode 100644 bolt/test/AArch64/build_id.c
 create mode 100644 bolt/test/X86/Inputs/build_id.yaml
 create mode 100644 bolt/test/X86/build_id.test

diff --git a/bolt/lib/Rewrite/RewriteInstance.cpp b/bolt/lib/Rewrite/RewriteInstance.cpp
index 5a2c2637eb01ff..3d24936271bf8f 100644
--- a/bolt/lib/Rewrite/RewriteInstance.cpp
+++ b/bolt/lib/Rewrite/RewriteInstance.cpp
@@ -2612,7 +2612,7 @@ void RewriteInstance::handleRelocation(const SectionRef &RelocatedSection,
   Expected SectionName = Section->getName();
   if (SectionName && !SectionName->empty())
 ReferencedSection = BC->getUniqueSectionByName(*SectionName);
-} else if (ReferencedSymbol && ContainingBF &&
+} else if (BC->isRISCV() && ReferencedSymbol && ContainingBF &&
(cantFail(Symbol.getFlags()) & SymbolRef::SF_Absolute)) {
   // This might be a relocation for an ABS symbols like __global_pointer$ on
   // RISC-V
diff --git a/bolt/test/AArch64/Inputs/build_id.ldscript b/bolt/test/AArch64/Inputs/build_id.ldscript
new file mode 100644
index 00..0af8e960f491b3
--- /dev/null
+++ b/bolt/test/AArch64/Inputs/build_id.ldscript
@@ -0,0 +1,9 @@
+SECTIONS
+{
+  PROVIDE (__executable_start = SEGMENT_START("text-segment", 0x40)); . = SEGMENT_START("text-segment", 0x40) + SIZEOF_HEADERS;
+  .note.gnu.build-id (0x400400):
+   {
+build_id_note = ABSOLUTE(.);
+*(.note.gnu.build-id)
+   }
+}
diff --git a/bolt/test/AArch64/build_id.c b/bolt/test/AArch64/build_id.c
new file mode 100644
index 00..01e433c7ca8fd9
--- /dev/null
+++ b/bolt/test/AArch64/build_id.c
@@ -0,0 +1,25 @@
+// This test checks that referencing build_id through GOT table
+// would result in GOT access after disassembly, not directly
+// to build_id address.
+
+// RUN: %clang %cflags -fuse-ld=lld -Wl,-T,%S/Inputs/build_id.ldscript -Wl,-q \
+// RUN:   -Wl,--no-relax -Wl,--build-id=sha1 %s -o %t.exe
+// RUN: llvm-bolt -print-disasm --print-only=get_build_id %t.exe -o %t.bolt | \
+// RUN:   FileCheck %s
+
+// CHECK: adrp [[REG:x[0-28]+]], __BOLT_got_zero
+// CHECK: ldr x{{.*}}, [[[REG]], :lo12:__BOLT_got_zero{{.*}}]
+
+struct build_id_note {
+  char pad[16];
+  char hash[20];
+};
+
+extern const struct build_id_note build_id_note;
+
+__attribute__((noinline)) char get_build_id() { return build_id_note.hash[0]; }
+
+int main() {
+  get_build_id();
+  return 0;
+}
diff --git a/bolt/test/X86/Inputs/build_id.yaml b/bolt/test/X86/Inputs/build_id.yaml
new file mode 100644
index 00..af012904ff9507
--- /dev/null
+++ b/bolt/test/X86/Inputs/build_id.yaml
@@ -0,0 +1,326 @@
+--- !ELF
+FileHeader:
+  Class:   ELFCLASS64
+  Data:ELFDATA2LSB
+  Type:ET_EXEC
+  Machine: EM_X86_64
+  Entry:   0x4010A0
+ProgramHeaders:
+  - Type:PT_PHDR
+Flags:   [ PF_R ]
+VAddr:   0x400040
+Align:   0x8
+Offset:  0x40
+  - Type:PT_INTERP
+Flags:   [ PF_R ]
+FirstSec:.interp
+LastSec: .interp
+VAddr:   0x400444
+Offset:  0x444
+  - Type:PT_LOAD
+Flags:   [ PF_X, PF_R ]
+FirstSec:.init
+LastSec: .fini
+VAddr:   0x401000
+Align:   0x1000
+Offset:  0x1000
+  - Type:PT_LOAD
+Flags:   [ PF_R ]
+FirstSec:.rodata
+LastSec: .rodata
+VAddr:   0x402000
+Align:   0x1000
+Offset:  0x2000
+  - Type:PT_LOAD
+Flags:   [ PF_W, PF_R ]
+FirstSec:.init_array
+LastSec: .bss
+VAddr:   0x403DD8
+Align:   0x1000
+Offset:  0x2DD8
+  - Type:PT_DYNAMIC
+Flags:   [ PF_W, PF_R ]
+FirstSec:.dynamic
+LastSec: .dynamic
+VAddr:   0x403DE8
+Align:   0x8
+Offset:  0x2DE8
+  - Type:PT_NOTE
+Flags:   [ PF_R ]
+FirstSec:.note.gnu.build-id
+LastSec: .note.ABI-tag
+VAddr:   0x400400
+Align:   0x4
+Offset:  0x400
+Sectio

[llvm-branch-commits] [llvm] release/19.x: [BOLT] Fix relocations handling (PR #102741)

2024-08-20 Thread Tobias Hieta via llvm-branch-commits

https://github.com/tru closed https://github.com/llvm/llvm-project/pull/102741
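
The one-line change to `RewriteInstance::handleRelocation` in this PR narrows the absolute-symbol branch to RISC-V. A minimal sketch of the condition before and after follows. `RelocContext` and both predicate names are hypothetical, and the sketch assumes that a linker-script assignment such as `build_id_note = ABSOLUTE(.)` yields a symbol flagged as absolute, which is what the AArch64 test appears to rely on.

```cpp
#include <cassert>

// Hypothetical flattened view of the state the changed condition consults.
struct RelocContext {
  bool isRISCV;               // BC->isRISCV()
  bool hasReferencedSymbol;   // ReferencedSymbol != nullptr
  bool hasContainingFunction; // ContainingBF != nullptr
  bool symbolIsAbsolute;      // flags & SymbolRef::SF_Absolute
};

// Condition before the patch: any target takes the ABS-symbol path.
inline bool takesAbsSymbolPathOld(const RelocContext &c) {
  return c.hasReferencedSymbol && c.hasContainingFunction &&
         c.symbolIsAbsolute;
}

// Condition after the patch: only RISC-V takes it, since the path exists
// for symbols like __global_pointer$.
inline bool takesAbsSymbolPathNew(const RelocContext &c) {
  return c.isRISCV && takesAbsSymbolPathOld(c);
}

// Usage: an absolute symbol referenced from a function on AArch64.
inline void demoAbsSymbolPath() {
  const RelocContext aarch64BuildId{false, true, true, true};
  assert(takesAbsSymbolPathOld(aarch64BuildId));  // old: special-cased
  assert(!takesAbsSymbolPathNew(aarch64BuildId)); // new: regular handling
}
```

With the narrower condition, the AArch64 `build_id_note` reference falls through to the regular handling, so the disassembly keeps the GOT access that the test checks for.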


[llvm-branch-commits] [llvm] release/19.x: [BOLT] Fix relocations handling (PR #102741)

2024-08-20 Thread via llvm-branch-commits

github-actions[bot] wrote:

@yota9 (or anyone else): if you would like to add a note about this fix to the
release notes (completely optional), please reply to this comment with a one- or
two-sentence description of the fix. When you are done, please add the
release:note label to this PR.

https://github.com/llvm/llvm-project/pull/102741


[llvm-branch-commits] [clang] [llvm] [AArch64] Fix a bug where user could not disable certain architecture features (PR #104752)

2024-08-20 Thread Tobias Hieta via llvm-branch-commits

https://github.com/tru updated https://github.com/llvm/llvm-project/pull/104752

>From 263965ebe237e2f82d714a12a8c9338b46237a33 Mon Sep 17 00:00:00 2001
From: Tomas Matheson 
Date: Sat, 17 Aug 2024 13:36:40 +0100
Subject: [PATCH] [AArch64] Add a check for invalid default features (#104435)

This adds a check that all ExtensionWithMArch which are marked as
implied features for an architecture are also present in the list of
default features. It doesn't make sense to have something mandatory but
not on by default.

There were a number of existing cases that violated this rule, and some
changes to which features are mandatory (indicated by the Implies
field).

This resulted in a bug where if a feature was marked as `Implies` but
was not added to `DefaultExt`, then for `-march=base_arch+nofeat` the
Driver would consider `feat` to have never been added and therefore
would do nothing to disable it (no `-target-feature -feat` would be
added, but the backend would enable the feature by default because of
`Implies`). See
clang/test/Driver/aarch64-negative-modifiers-for-default-features.c.

Note that the processor definitions do not respect the architecture
DefaultExts. These apply only when specifying `-march=`. So when a feature is moved from `Implies` to `DefaultExts` on
the Architecture definition, the feature needs to be added to all
processor definitions (that are based on that architecture) in order to
preserve the existing behaviour. I have checked the TRMs for many cases
(see specific commit messages) but in other cases I have just kept the
current behaviour and not tried to fix it.
---
 clang/test/CodeGen/aarch64-targetattr.c   | 12 +--
 ...-negative-modifiers-for-default-features.c | 12 +++
 clang/test/Driver/arm-sb.c|  2 +-
 .../aarch64-apple-a12.c   |  1 -
 .../aarch64-apple-a13.c   |  1 -
 .../aarch64-apple-a14.c   |  1 -
 .../aarch64-apple-a15.c   |  1 -
 .../aarch64-apple-a16.c   |  1 -
 .../aarch64-apple-a17.c   |  1 -
 .../aarch64-apple-m4.c|  2 -
 .../aarch64-cortex-r82.c  |  1 -
 .../aarch64-cortex-r82ae.c|  1 -
 llvm/lib/Target/AArch64/AArch64Features.td| 19 ++--
 llvm/lib/Target/AArch64/AArch64Processors.td  | 46 +++--
 llvm/test/MC/AArch64/arm64-system-encoding.s  |  2 +-
 llvm/test/MC/AArch64/armv8.5a-ssbs-error.s|  2 +-
 llvm/test/MC/AArch64/armv8.5a-ssbs.s  |  2 +-
 .../MC/Disassembler/AArch64/armv8.5a-ssbs.txt |  2 +-
 .../AArch64/basic-a64-instructions.txt|  2 +-
 .../TargetParser/TargetParserTest.cpp | 95 +++
 llvm/utils/TableGen/ARMTargetDefEmitter.cpp   | 32 ++-
 21 files changed, 154 insertions(+), 84 deletions(-)
 create mode 100644 
clang/test/Driver/aarch64-negative-modifiers-for-default-features.c

diff --git a/clang/test/CodeGen/aarch64-targetattr.c 
b/clang/test/CodeGen/aarch64-targetattr.c
index 4f891f938b6186..d6227be2ebef83 100644
--- a/clang/test/CodeGen/aarch64-targetattr.c
+++ b/clang/test/CodeGen/aarch64-targetattr.c
@@ -195,19 +195,19 @@ void minusarch() {}
 // CHECK: attributes #[[ATTR0]] = { noinline nounwind optnone 
"no-trapping-math"="true" "stack-protector-buffer-size"="8" 
"target-features"="+crc,+fp-armv8,+lse,+neon,+ras,+rdm,+v8.1a,+v8.2a,+v8a" }
 // CHECK: attributes #[[ATTR1]] = { noinline nounwind optnone 
"no-trapping-math"="true" "stack-protector-buffer-size"="8" 
"target-features"="+crc,+fp-armv8,+fullfp16,+lse,+neon,+ras,+rdm,+sve,+v8.1a,+v8.2a,+v8a"
 }
 // CHECK: attributes #[[ATTR2]] = { noinline nounwind optnone 
"no-trapping-math"="true" "stack-protector-buffer-size"="8" 
"target-features"="+crc,+fp-armv8,+fullfp16,+lse,+neon,+ras,+rdm,+sve,+sve2,+v8.1a,+v8.2a,+v8a"
 }
-// CHECK: attributes #[[ATTR3]] = { noinline nounwind optnone 
"no-trapping-math"="true" "stack-protector-buffer-size"="8" 
"target-features"="+bf16,+complxnum,+crc,+dotprod,+fp-armv8,+fp16fml,+fullfp16,+i8mm,+jsconv,+lse,+neon,+pauth,+ras,+rcpc,+rdm,+sve,+sve2,+v8.1a,+v8.2a,+v8.3a,+v8.4a,+v8.5a,+v8.6a,+v8a"
 }
-// CHECK: attributes #[[ATTR4]] = { noinline nounwind optnone 
"no-trapping-math"="true" "stack-protector-buffer-size"="8" 
"target-cpu"="cortex-a710" 
"target-features"="+bf16,+complxnum,+crc,+dotprod,+ete,+flagm,+fp-armv8,+fp16fml,+fullfp16,+i8mm,+jsconv,+lse,+mte,+neon,+pauth,+perfmon,+ras,+rcpc,+rdm,+sb,+sve,+sve2,+sve2-bitperm,+trbe,+v8.1a,+v8.2a,+v8.3a,+v8.4a,+v8.5a,+v8a,+v9a"
 }
+// CHECK: attributes #[[ATTR3]] = { noinline nounwind optnone 
"no-trapping-math"="true" "stack-protector-buffer-size"="8" 
"target-features"="+bf16,+bti,+ccidx,+complxnum,+crc,+dit,+dotprod,+flagm,+fp-armv8,+fp16fml,+fullfp16,+i8mm,+jsconv,+lse,+neon,+pauth,+predres,+ras,+rcpc,+rdm,+sb,+ssbs,+sve,+sve2,+v8.1a,+v8.2a,+v8.3a,+v8.4a,+v8.5a,+v8.6a,+v8a"
 }
+// CHECK: attributes #[[ATTR4]] = { noinline nounwind optnone 
"no-trapping-math"=

[llvm-branch-commits] [llvm] 263965e - [AArch64] Add a check for invalid default features (#104435)

2024-08-20 Thread Tobias Hieta via llvm-branch-commits

Author: Tomas Matheson
Date: 2024-08-20T09:18:22+02:00
New Revision: 263965ebe237e2f82d714a12a8c9338b46237a33

URL: 
https://github.com/llvm/llvm-project/commit/263965ebe237e2f82d714a12a8c9338b46237a33
DIFF: 
https://github.com/llvm/llvm-project/commit/263965ebe237e2f82d714a12a8c9338b46237a33.diff

LOG: [AArch64] Add a check for invalid default features (#104435)

This adds a check that all ExtensionWithMArch which are marked as
implied features for an architecture are also present in the list of
default features. It doesn't make sense to have something mandatory but
not on by default.

There were a number of existing cases that violated this rule, and some
changes to which features are mandatory (indicated by the Implies
field).

This resulted in a bug where if a feature was marked as `Implies` but
was not added to `DefaultExt`, then for `-march=base_arch+nofeat` the
Driver would consider `feat` to have never been added and therefore
would do nothing to disable it (no `-target-feature -feat` would be
added, but the backend would enable the feature by default because of
`Implies`). See
clang/test/Driver/aarch64-negative-modifiers-for-default-features.c.

Note that the processor definitions do not respect the architecture
DefaultExts. These apply only when specifying `-march=`. So when a feature is moved from `Implies` to `DefaultExts` on
the Architecture definition, the feature needs to be added to all
processor definitions (that are based on that architecture) in order to
preserve the existing behaviour. I have checked the TRMs for many cases
(see specific commit messages) but in other cases I have just kept the
current behaviour and not tried to fix it.
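
As an illustration of the mismatch described in this log, here is a minimal
C++ sketch. It is not the actual clang driver or TargetParser code. The names
ArchModel, driverTargetFeatures, backendFeatures and stillEnabledAfterNoFeat,
and the feature name "feat", are hypothetical. It assumes the driver emits
"-feat" only for features it believes were enabled, while the backend starts
from the Implies list.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical, simplified model of an architecture definition.
struct ArchModel {
  std::set<std::string> DefaultExts; // what the driver treats as on by default
  std::set<std::string> Implies;     // what the backend enables for the arch
};

// Driver side: "-feat" is emitted only if the driver thinks "feat" was on.
std::vector<std::string> driverTargetFeatures(const ArchModel &Arch,
                                              const std::string &Disabled) {
  std::set<std::string> Enabled = Arch.DefaultExts;
  std::vector<std::string> Out;
  if (Enabled.erase(Disabled)) // erase returns the number of removed elements
    Out.push_back("-" + Disabled);
  for (const std::string &F : Enabled)
    Out.push_back("+" + F);
  return Out;
}

// Backend side: start from Implies, then apply the driver's +/- features.
std::set<std::string> backendFeatures(const ArchModel &Arch,
                                      const std::vector<std::string> &TF) {
  std::set<std::string> On = Arch.Implies;
  for (const std::string &F : TF) {
    if (F[0] == '+')
      On.insert(F.substr(1));
    else if (F[0] == '-')
      On.erase(F.substr(1));
  }
  return On;
}

// True if "+nofeat" failed to disable the feature in this model.
bool stillEnabledAfterNoFeat(const ArchModel &Arch, const std::string &Feat) {
  return backendFeatures(Arch, driverTargetFeatures(Arch, Feat)).count(Feat) != 0;
}
```

In this model, a feature listed only in Implies stays enabled after being
disabled on the command line, and listing it in DefaultExts as well makes the
driver emit the negative feature so the backend turns it off.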

Added: 
clang/test/Driver/aarch64-negative-modifiers-for-default-features.c

Modified: 
clang/test/CodeGen/aarch64-targetattr.c
clang/test/Driver/arm-sb.c
clang/test/Driver/print-enabled-extensions/aarch64-apple-a12.c
clang/test/Driver/print-enabled-extensions/aarch64-apple-a13.c
clang/test/Driver/print-enabled-extensions/aarch64-apple-a14.c
clang/test/Driver/print-enabled-extensions/aarch64-apple-a15.c
clang/test/Driver/print-enabled-extensions/aarch64-apple-a16.c
clang/test/Driver/print-enabled-extensions/aarch64-apple-a17.c
clang/test/Driver/print-enabled-extensions/aarch64-apple-m4.c
clang/test/Driver/print-enabled-extensions/aarch64-cortex-r82.c
clang/test/Driver/print-enabled-extensions/aarch64-cortex-r82ae.c
llvm/lib/Target/AArch64/AArch64Features.td
llvm/lib/Target/AArch64/AArch64Processors.td
llvm/test/MC/AArch64/arm64-system-encoding.s
llvm/test/MC/AArch64/armv8.5a-ssbs-error.s
llvm/test/MC/AArch64/armv8.5a-ssbs.s
llvm/test/MC/Disassembler/AArch64/armv8.5a-ssbs.txt
llvm/test/MC/Disassembler/AArch64/basic-a64-instructions.txt
llvm/unittests/TargetParser/TargetParserTest.cpp
llvm/utils/TableGen/ARMTargetDefEmitter.cpp

Removed: 




diff  --git a/clang/test/CodeGen/aarch64-targetattr.c 
b/clang/test/CodeGen/aarch64-targetattr.c
index 4f891f938b6186..d6227be2ebef83 100644
--- a/clang/test/CodeGen/aarch64-targetattr.c
+++ b/clang/test/CodeGen/aarch64-targetattr.c
@@ -195,19 +195,19 @@ void minusarch() {}
 // CHECK: attributes #[[ATTR0]] = { noinline nounwind optnone 
"no-trapping-math"="true" "stack-protector-buffer-size"="8" 
"target-features"="+crc,+fp-armv8,+lse,+neon,+ras,+rdm,+v8.1a,+v8.2a,+v8a" }
 // CHECK: attributes #[[ATTR1]] = { noinline nounwind optnone 
"no-trapping-math"="true" "stack-protector-buffer-size"="8" 
"target-features"="+crc,+fp-armv8,+fullfp16,+lse,+neon,+ras,+rdm,+sve,+v8.1a,+v8.2a,+v8a"
 }
 // CHECK: attributes #[[ATTR2]] = { noinline nounwind optnone 
"no-trapping-math"="true" "stack-protector-buffer-size"="8" 
"target-features"="+crc,+fp-armv8,+fullfp16,+lse,+neon,+ras,+rdm,+sve,+sve2,+v8.1a,+v8.2a,+v8a"
 }
-// CHECK: attributes #[[ATTR3]] = { noinline nounwind optnone 
"no-trapping-math"="true" "stack-protector-buffer-size"="8" 
"target-features"="+bf16,+complxnum,+crc,+dotprod,+fp-armv8,+fp16fml,+fullfp16,+i8mm,+jsconv,+lse,+neon,+pauth,+ras,+rcpc,+rdm,+sve,+sve2,+v8.1a,+v8.2a,+v8.3a,+v8.4a,+v8.5a,+v8.6a,+v8a"
 }
-// CHECK: attributes #[[ATTR4]] = { noinline nounwind optnone 
"no-trapping-math"="true" "stack-protector-buffer-size"="8" 
"target-cpu"="cortex-a710" 
"target-features"="+bf16,+complxnum,+crc,+dotprod,+ete,+flagm,+fp-armv8,+fp16fml,+fullfp16,+i8mm,+jsconv,+lse,+mte,+neon,+pauth,+perfmon,+ras,+rcpc,+rdm,+sb,+sve,+sve2,+sve2-bitperm,+trbe,+v8.1a,+v8.2a,+v8.3a,+v8.4a,+v8.5a,+v8a,+v9a"
 }
+// CHECK: attributes #[[ATTR3]] = { noinline nounwind optnone 
"no-trapping-math"="true" "stack-protector-buffer-size"="8" 
"target-features"="+bf16,+bti,+ccidx,+complxnum,+crc,+dit,+dotprod,+flagm,+fp-armv8,+fp16fml,+fullfp16,+i8mm,+jsconv,+lse,+neon,+pauth,+predres,+ras,+rcpc,+rdm,+sb,+ssbs,+sve,+sve2,+v8.1a,+v8.2a,+v8.3a,+v8.4a,+v8.5a,+v8.6a,+v8a"
 }
+// CH

[llvm-branch-commits] [clang] [llvm] [AArch64] Fix a bug where user could not disable certain architecture features (PR #104752)

2024-08-20 Thread Tobias Hieta via llvm-branch-commits

https://github.com/tru closed https://github.com/llvm/llvm-project/pull/104752


[llvm-branch-commits] [llvm] release/19.x: [GlobalISel] Bail out early for big-endian (#103310) (PR #104823)

2024-08-20 Thread Tobias Hieta via llvm-branch-commits

https://github.com/tru updated https://github.com/llvm/llvm-project/pull/104823

>From c1336c9e3bd6c0887ead386043c547b3a3ed76a9 Mon Sep 17 00:00:00 2001
From: David Green 
Date: Mon, 19 Aug 2024 18:50:47 +0100
Subject: [PATCH] [GlobalISel] Bail out early for big-endian (#103310)

If we continue through the function we can currently hit crashes. We can
bail out early and fall back to SDAG.

Fixes #103032

(cherry picked from commit 05d17a1c705e1053f95b90aa37d91ce4f94a9287)
---
 llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp  |  1 +
 llvm/lib/Target/ARM/ARMCallLowering.cpp   |  9 
 llvm/lib/Target/ARM/ARMCallLowering.h |  2 ++
 .../AArch64/GlobalISel/endian_fallback.ll | 21 +++
 .../ARM/GlobalISel/arm-irtranslator.ll|  2 +-
 .../ARM/GlobalISel/arm-param-lowering.ll  |  2 +-
 6 files changed, 35 insertions(+), 2 deletions(-)
 create mode 100644 llvm/test/CodeGen/AArch64/GlobalISel/endian_fallback.ll

diff --git a/llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp 
b/llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp
index 68a8a273a1b479..eb010afd41b6b7 100644
--- a/llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp
+++ b/llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp
@@ -3889,6 +3889,7 @@ bool IRTranslator::runOnMachineFunction(MachineFunction 
&CurMF) {
F.getSubprogram(), &F.getEntryBlock());
 R << "unable to translate in big endian mode";
 reportTranslationError(*MF, *TPC, *ORE, R);
+return false;
   }
 
   // Release the per-function state when we return, whether we succeeded or 
not.
diff --git a/llvm/lib/Target/ARM/ARMCallLowering.cpp 
b/llvm/lib/Target/ARM/ARMCallLowering.cpp
index 9cc162d041f48b..883808ae981f5e 100644
--- a/llvm/lib/Target/ARM/ARMCallLowering.cpp
+++ b/llvm/lib/Target/ARM/ARMCallLowering.cpp
@@ -50,6 +50,13 @@
 
 using namespace llvm;
 
+// Whether Big-endian GISel is enabled, defaults to off, can be enabled for
+// testing.
+static cl::opt<bool>
+EnableGISelBigEndian("enable-arm-gisel-bigendian", cl::Hidden,
+ cl::init(false),
+ cl::desc("Enable Global-ISel Big Endian Lowering"));
+
 ARMCallLowering::ARMCallLowering(const ARMTargetLowering &TLI)
 : CallLowering(&TLI) {}
 
@@ -539,3 +546,5 @@ bool ARMCallLowering::lowerCall(MachineIRBuilder 
&MIRBuilder, CallLoweringInfo &
 
   return true;
 }
+
+bool ARMCallLowering::enableBigEndian() const { return EnableGISelBigEndian; }
\ No newline at end of file
diff --git a/llvm/lib/Target/ARM/ARMCallLowering.h 
b/llvm/lib/Target/ARM/ARMCallLowering.h
index 38095617fb4f35..32c95a044d7b7e 100644
--- a/llvm/lib/Target/ARM/ARMCallLowering.h
+++ b/llvm/lib/Target/ARM/ARMCallLowering.h
@@ -42,6 +42,8 @@ class ARMCallLowering : public CallLowering {
   bool lowerCall(MachineIRBuilder &MIRBuilder,
  CallLoweringInfo &Info) const override;
 
+  bool enableBigEndian() const override;
+
 private:
   bool lowerReturnVal(MachineIRBuilder &MIRBuilder, const Value *Val,
   ArrayRef VRegs,
diff --git a/llvm/test/CodeGen/AArch64/GlobalISel/endian_fallback.ll 
b/llvm/test/CodeGen/AArch64/GlobalISel/endian_fallback.ll
new file mode 100644
index 00..6c27b4dd85f9b0
--- /dev/null
+++ b/llvm/test/CodeGen/AArch64/GlobalISel/endian_fallback.ll
@@ -0,0 +1,21 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py 
UTC_ARGS: --version 5
+; RUN: llc -mtriple aarch64-unknown-linux-musl -O0 -global-isel-abort=2 < %s 
2>&1 | FileCheck %s --check-prefix=CHECK-LE
+; RUN: llc -mtriple aarch64_be-unknown-linux-musl -O0 -global-isel-abort=2 < 
%s 2>&1 | FileCheck %s --check-prefix=CHECK-BE
+
+; Make sure we fall-back to SDAG for BE targets.
+
+; CHECK-LE-NOT: warning: Instruction selection used fallback path for foo
+; CHECK-BE: warning: Instruction selection used fallback path for foo
+
+define <4 x i6> @foo(float %0, <4 x i6> %1) {
+; CHECK-LE-LABEL: foo:
+; CHECK-LE:   // %bb.0:
+; CHECK-LE-NEXT:fmov d0, d1
+; CHECK-LE-NEXT:ret
+;
+; CHECK-BE-LABEL: foo:
+; CHECK-BE:   // %bb.0:
+; CHECK-BE-NEXT:fmov d0, d1
+; CHECK-BE-NEXT:ret
+  ret <4 x i6> %1
+}
diff --git a/llvm/test/CodeGen/ARM/GlobalISel/arm-irtranslator.ll 
b/llvm/test/CodeGen/ARM/GlobalISel/arm-irtranslator.ll
index 411cf78b621f8b..dc1d4b289c2ab8 100644
--- a/llvm/test/CodeGen/ARM/GlobalISel/arm-irtranslator.ll
+++ b/llvm/test/CodeGen/ARM/GlobalISel/arm-irtranslator.ll
@@ -1,5 +1,5 @@
 ; RUN: llc -mtriple arm-unknown -mattr=+vfp2,+v4t -global-isel 
-stop-after=irtranslator -verify-machineinstrs %s -o - | FileCheck %s 
-check-prefix=CHECK -check-prefix=LITTLE
-; RUN: llc -mtriple armeb-unknown -mattr=+vfp2,+v4t -global-isel 
-global-isel-abort=0 -stop-after=irtranslator -verify-machineinstrs %s -o - | 
FileCheck %s -check-prefix=CHECK -check-prefix=BIG
+; RUN: llc -mtriple armeb-unknown -mattr=+vfp2,+v4t -global-isel 
-global-isel-abort=0 -enable-arm-gisel-bigendian -stop-after=irtranslator 
-verify-

[llvm-branch-commits] [llvm] c1336c9 - [GlobalISel] Bail out early for big-endian (#103310)

2024-08-20 Thread Tobias Hieta via llvm-branch-commits

Author: David Green
Date: 2024-08-20T09:19:52+02:00
New Revision: c1336c9e3bd6c0887ead386043c547b3a3ed76a9

URL: 
https://github.com/llvm/llvm-project/commit/c1336c9e3bd6c0887ead386043c547b3a3ed76a9
DIFF: 
https://github.com/llvm/llvm-project/commit/c1336c9e3bd6c0887ead386043c547b3a3ed76a9.diff

LOG: [GlobalISel] Bail out early for big-endian (#103310)

If we continue through the function we can currently hit crashes. We can
bail out early and fall back to SDAG.

Fixes #103032

(cherry picked from commit 05d17a1c705e1053f95b90aa37d91ce4f94a9287)
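
The control-flow change can be pictured with a small C++ sketch. This is a
simplified model under stated assumptions, not the IRTranslator code: the
type TranslateResult and the function translateFunction are hypothetical,
and the two boolean parameters stand in for the target's endianness and the
target's opt-in to big-endian support.

```cpp
#include <string>
#include <vector>

// Hypothetical result type for this sketch; not an LLVM class.
struct TranslateResult {
  bool Translated = false;          // true if the main path ran to completion
  std::vector<std::string> Remarks; // diagnostics reported along the way
};

TranslateResult translateFunction(bool IsBigEndian, bool BigEndianEnabled) {
  TranslateResult R;
  if (IsBigEndian && !BigEndianEnabled) {
    R.Remarks.push_back("unable to translate in big endian mode");
    return R; // bail out here so the caller can use its fallback path
  }
  // ... work that assumes a supported configuration would go here ...
  R.Translated = true;
  return R;
}
```

Without the early return, the model would report the problem and then carry
on into code that assumes a supported configuration, which is the situation
the log message describes.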

Added: 
llvm/test/CodeGen/AArch64/GlobalISel/endian_fallback.ll

Modified: 
llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp
llvm/lib/Target/ARM/ARMCallLowering.cpp
llvm/lib/Target/ARM/ARMCallLowering.h
llvm/test/CodeGen/ARM/GlobalISel/arm-irtranslator.ll
llvm/test/CodeGen/ARM/GlobalISel/arm-param-lowering.ll

Removed: 




diff  --git a/llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp 
b/llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp
index 68a8a273a1b479..eb010afd41b6b7 100644
--- a/llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp
+++ b/llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp
@@ -3889,6 +3889,7 @@ bool IRTranslator::runOnMachineFunction(MachineFunction 
&CurMF) {
F.getSubprogram(), &F.getEntryBlock());
 R << "unable to translate in big endian mode";
 reportTranslationError(*MF, *TPC, *ORE, R);
+return false;
   }
 
   // Release the per-function state when we return, whether we succeeded or 
not.

diff  --git a/llvm/lib/Target/ARM/ARMCallLowering.cpp 
b/llvm/lib/Target/ARM/ARMCallLowering.cpp
index 9cc162d041f48b..883808ae981f5e 100644
--- a/llvm/lib/Target/ARM/ARMCallLowering.cpp
+++ b/llvm/lib/Target/ARM/ARMCallLowering.cpp
@@ -50,6 +50,13 @@
 
 using namespace llvm;
 
+// Whether Big-endian GISel is enabled, defaults to off, can be enabled for
+// testing.
+static cl::opt<bool>
+EnableGISelBigEndian("enable-arm-gisel-bigendian", cl::Hidden,
+ cl::init(false),
+ cl::desc("Enable Global-ISel Big Endian Lowering"));
+
 ARMCallLowering::ARMCallLowering(const ARMTargetLowering &TLI)
 : CallLowering(&TLI) {}
 
@@ -539,3 +546,5 @@ bool ARMCallLowering::lowerCall(MachineIRBuilder 
&MIRBuilder, CallLoweringInfo &
 
   return true;
 }
+
+bool ARMCallLowering::enableBigEndian() const { return EnableGISelBigEndian; }
\ No newline at end of file

diff  --git a/llvm/lib/Target/ARM/ARMCallLowering.h 
b/llvm/lib/Target/ARM/ARMCallLowering.h
index 38095617fb4f35..32c95a044d7b7e 100644
--- a/llvm/lib/Target/ARM/ARMCallLowering.h
+++ b/llvm/lib/Target/ARM/ARMCallLowering.h
@@ -42,6 +42,8 @@ class ARMCallLowering : public CallLowering {
   bool lowerCall(MachineIRBuilder &MIRBuilder,
  CallLoweringInfo &Info) const override;
 
+  bool enableBigEndian() const override;
+
 private:
   bool lowerReturnVal(MachineIRBuilder &MIRBuilder, const Value *Val,
   ArrayRef VRegs,

diff  --git a/llvm/test/CodeGen/AArch64/GlobalISel/endian_fallback.ll 
b/llvm/test/CodeGen/AArch64/GlobalISel/endian_fallback.ll
new file mode 100644
index 00..6c27b4dd85f9b0
--- /dev/null
+++ b/llvm/test/CodeGen/AArch64/GlobalISel/endian_fallback.ll
@@ -0,0 +1,21 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py 
UTC_ARGS: --version 5
+; RUN: llc -mtriple aarch64-unknown-linux-musl -O0 -global-isel-abort=2 < %s 
2>&1 | FileCheck %s --check-prefix=CHECK-LE
+; RUN: llc -mtriple aarch64_be-unknown-linux-musl -O0 -global-isel-abort=2 < 
%s 2>&1 | FileCheck %s --check-prefix=CHECK-BE
+
+; Make sure we fall-back to SDAG for BE targets.
+
+; CHECK-LE-NOT: warning: Instruction selection used fallback path for foo
+; CHECK-BE: warning: Instruction selection used fallback path for foo
+
+define <4 x i6> @foo(float %0, <4 x i6> %1) {
+; CHECK-LE-LABEL: foo:
+; CHECK-LE:   // %bb.0:
+; CHECK-LE-NEXT:fmov d0, d1
+; CHECK-LE-NEXT:ret
+;
+; CHECK-BE-LABEL: foo:
+; CHECK-BE:   // %bb.0:
+; CHECK-BE-NEXT:fmov d0, d1
+; CHECK-BE-NEXT:ret
+  ret <4 x i6> %1
+}

diff  --git a/llvm/test/CodeGen/ARM/GlobalISel/arm-irtranslator.ll 
b/llvm/test/CodeGen/ARM/GlobalISel/arm-irtranslator.ll
index 411cf78b621f8b..dc1d4b289c2ab8 100644
--- a/llvm/test/CodeGen/ARM/GlobalISel/arm-irtranslator.ll
+++ b/llvm/test/CodeGen/ARM/GlobalISel/arm-irtranslator.ll
@@ -1,5 +1,5 @@
 ; RUN: llc -mtriple arm-unknown -mattr=+vfp2,+v4t -global-isel 
-stop-after=irtranslator -verify-machineinstrs %s -o - | FileCheck %s 
-check-prefix=CHECK -check-prefix=LITTLE
-; RUN: llc -mtriple armeb-unknown -mattr=+vfp2,+v4t -global-isel 
-global-isel-abort=0 -stop-after=irtranslator -verify-machineinstrs %s -o - | 
FileCheck %s -check-prefix=CHECK -check-prefix=BIG
+; RUN: llc -mtriple armeb-unknown -mattr=+vfp2,+v4t -global-isel 
-global-isel-abort=0 -enable-arm-gisel-bigendian -st

[llvm-branch-commits] [llvm] release/19.x: [GlobalISel] Bail out early for big-endian (#103310) (PR #104823)

2024-08-20 Thread Tobias Hieta via llvm-branch-commits

https://github.com/tru closed https://github.com/llvm/llvm-project/pull/104823


[llvm-branch-commits] [lld] release/19.x: [LLD] [MinGW] Recognize the -rpath option (#102886) (PR #104843)

2024-08-20 Thread Tobias Hieta via llvm-branch-commits

https://github.com/tru updated https://github.com/llvm/llvm-project/pull/104843

>From 6dbc0e236b3e3a651302d079d1c64934976bc0b3 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Martin=20Storsj=C3=B6?= 
Date: Sun, 18 Aug 2024 00:44:16 +0300
Subject: [PATCH] [LLD] [MinGW] Recognize the -rpath option (#102886)

GNU ld silently accepts the -rpath option for Windows targets, as a
no-op.

This has led to some build systems (and users) passing this option
while building for Windows/MinGW, even if Windows doesn't have any
concept like rpath.

Older versions of Conan did include -rpath in the pkg-config files it
generated, see e.g.

https://github.com/conan-io/conan/blob/17c58f0c61931f9de218ac571cd97a8e0befa68e/conans/client/generators/pkg_config.py#L104-L114
and
https://github.com/conan-io/conan/blob/17c58f0c61931f9de218ac571cd97a8e0befa68e/conans/client/build/compiler_flags.py#L26-L34
- and see https://github.com/mstorsjo/llvm-mingw/issues/300 for user
reports about this issue.

Recognize the option in LLD for MinGW targets, to improve drop-in
compatibility compared to GNU ld, but produce a warning to alert users
that the option really has no effect for these targets.

(cherry picked from commit 69f76c782b554a004078af6909c19a11e3846415)
---
 lld/MinGW/Driver.cpp   | 3 +++
 lld/MinGW/Options.td   | 3 +++
 lld/test/MinGW/driver.test | 6 ++
 3 files changed, 12 insertions(+)

diff --git a/lld/MinGW/Driver.cpp b/lld/MinGW/Driver.cpp
index 35fd478a21905e..c7d7b9cfca386f 100644
--- a/lld/MinGW/Driver.cpp
+++ b/lld/MinGW/Driver.cpp
@@ -448,6 +448,9 @@ bool link(ArrayRef argsArr, llvm::raw_ostream 
&stdoutOS,
   add("-errorlimit:" + s);
   }
 
+  if (auto *a = args.getLastArg(OPT_rpath))
+warn("parameter " + a->getSpelling() + " has no effect on PE/COFF 
targets");
+
   for (auto *a : args.filtered(OPT_mllvm))
 add("-mllvm:" + StringRef(a->getValue()));
 
diff --git a/lld/MinGW/Options.td b/lld/MinGW/Options.td
index 56f67e3dd96c42..7bd5fb80749da2 100644
--- a/lld/MinGW/Options.td
+++ b/lld/MinGW/Options.td
@@ -243,6 +243,9 @@ defm: EqNoHelp<"sysroot">;
 def: F<"sort-common">;
 def: F<"start-group">;
 
+// Ignored options, that produce warnings
+defm rpath: EqNoHelp<"rpath">;
+
 // Ignore GCC collect2 LTO plugin related options. Note that we don't support
 // GCC LTO, but GCC collect2 passes these options even in non-LTO mode.
 def: J<"plugin-opt=-fresolution=">;
diff --git a/lld/test/MinGW/driver.test b/lld/test/MinGW/driver.test
index cbccd09793c2f0..0dab66b613c774 100644
--- a/lld/test/MinGW/driver.test
+++ b/lld/test/MinGW/driver.test
@@ -446,3 +446,9 @@ RUN: ld.lld -### foo.o -m i386pep --build-id=none 2>&1 | 
FileCheck -check-prefix
 RUN: ld.lld -### foo.o -m i386pep -s 2>&1 | FileCheck 
-check-prefix=NO_BUILD_ID %s
 RUN: ld.lld -### foo.o -m i386pep -S 2>&1 | FileCheck 
-check-prefix=NO_BUILD_ID %s
 NO_BUILD_ID: -build-id:no
+
+RUN: ld.lld -### foo.o -m i386pep -rpath foo 2>&1 | FileCheck 
-check-prefix=WARN_RPATH %s
+RUN: ld.lld -### foo.o -m i386pep --rpath foo 2>&1 | FileCheck 
-check-prefix=WARN_RPATH %s
+RUN: ld.lld -### foo.o -m i386pep -rpath=foo 2>&1 | FileCheck 
-check-prefix=WARN_RPATH %s
+RUN: ld.lld -### foo.o -m i386pep --rpath=foo 2>&1 | FileCheck 
-check-prefix=WARN_RPATH %s
+WARN_RPATH: warning: parameter -{{-?}}rpath has no effect on PE/COFF targets



[llvm-branch-commits] [lld] 6dbc0e2 - [LLD] [MinGW] Recognize the -rpath option (#102886)

2024-08-20 Thread Tobias Hieta via llvm-branch-commits

Author: Martin Storsjö
Date: 2024-08-20T09:20:29+02:00
New Revision: 6dbc0e236b3e3a651302d079d1c64934976bc0b3

URL: 
https://github.com/llvm/llvm-project/commit/6dbc0e236b3e3a651302d079d1c64934976bc0b3
DIFF: 
https://github.com/llvm/llvm-project/commit/6dbc0e236b3e3a651302d079d1c64934976bc0b3.diff

LOG: [LLD] [MinGW] Recognize the -rpath option (#102886)

GNU ld silently accepts the -rpath option for Windows targets, as a
no-op.

This has led to some build systems (and users) passing this option
while building for Windows/MinGW, even if Windows doesn't have any
concept like rpath.

Older versions of Conan did include -rpath in the pkg-config files it
generated, see e.g.

https://github.com/conan-io/conan/blob/17c58f0c61931f9de218ac571cd97a8e0befa68e/conans/client/generators/pkg_config.py#L104-L114
and
https://github.com/conan-io/conan/blob/17c58f0c61931f9de218ac571cd97a8e0befa68e/conans/client/build/compiler_flags.py#L26-L34
- and see https://github.com/mstorsjo/llvm-mingw/issues/300 for user
reports about this issue.

Recognize the option in LLD for MinGW targets, to improve drop-in
compatibility compared to GNU ld, but produce a warning to alert users
that the option really has no effect for these targets.

(cherry picked from commit 69f76c782b554a004078af6909c19a11e3846415)
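
The "accept but warn" behaviour can be sketched in a few lines of C++. This
is a standalone illustration, not LLD's option-table machinery: the function
rpathWarnings is hypothetical, and it assumes the four spellings exercised
by the test (-rpath X, --rpath X, -rpath=X, --rpath=X) and a single warning
for the last occurrence.

```cpp
#include <cstddef>
#include <initializer_list>
#include <string>
#include <vector>

// Hypothetical helper: scans the arguments, consumes any rpath option and
// its value, and returns the warning(s) to print. Nothing else is done with
// the rpath value, since it has no effect on PE/COFF targets.
std::vector<std::string> rpathWarnings(const std::vector<std::string> &Args) {
  std::string Last; // spelling of the last rpath option seen, if any
  for (std::size_t I = 0; I < Args.size(); ++I) {
    for (const std::string Name : {"--rpath", "-rpath"}) {
      if (Args[I] == Name) { // separate form: the value is the next argument
        Last = Name;
        ++I;
        break;
      }
      if (Args[I].rfind(Name + "=", 0) == 0) { // joined form: -rpath=value
        Last = Name;
        break;
      }
    }
  }
  std::vector<std::string> Warnings;
  if (!Last.empty())
    Warnings.push_back("parameter " + Last +
                       " has no effect on PE/COFF targets");
  return Warnings;
}
```

Calling rpathWarnings with any of the four spellings yields one warning in
this sketch, and a command line without the option yields none.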

Added: 


Modified: 
lld/MinGW/Driver.cpp
lld/MinGW/Options.td
lld/test/MinGW/driver.test

Removed: 




diff  --git a/lld/MinGW/Driver.cpp b/lld/MinGW/Driver.cpp
index 35fd478a21905e..c7d7b9cfca386f 100644
--- a/lld/MinGW/Driver.cpp
+++ b/lld/MinGW/Driver.cpp
@@ -448,6 +448,9 @@ bool link(ArrayRef argsArr, llvm::raw_ostream 
&stdoutOS,
   add("-errorlimit:" + s);
   }
 
+  if (auto *a = args.getLastArg(OPT_rpath))
+warn("parameter " + a->getSpelling() + " has no effect on PE/COFF 
targets");
+
   for (auto *a : args.filtered(OPT_mllvm))
 add("-mllvm:" + StringRef(a->getValue()));
 

diff  --git a/lld/MinGW/Options.td b/lld/MinGW/Options.td
index 56f67e3dd96c42..7bd5fb80749da2 100644
--- a/lld/MinGW/Options.td
+++ b/lld/MinGW/Options.td
@@ -243,6 +243,9 @@ defm: EqNoHelp<"sysroot">;
 def: F<"sort-common">;
 def: F<"start-group">;
 
+// Ignored options, that produce warnings
+defm rpath: EqNoHelp<"rpath">;
+
 // Ignore GCC collect2 LTO plugin related options. Note that we don't support
 // GCC LTO, but GCC collect2 passes these options even in non-LTO mode.
 def: J<"plugin-opt=-fresolution=">;

diff  --git a/lld/test/MinGW/driver.test b/lld/test/MinGW/driver.test
index cbccd09793c2f0..0dab66b613c774 100644
--- a/lld/test/MinGW/driver.test
+++ b/lld/test/MinGW/driver.test
@@ -446,3 +446,9 @@ RUN: ld.lld -### foo.o -m i386pep --build-id=none 2>&1 | 
FileCheck -check-prefix
 RUN: ld.lld -### foo.o -m i386pep -s 2>&1 | FileCheck 
-check-prefix=NO_BUILD_ID %s
 RUN: ld.lld -### foo.o -m i386pep -S 2>&1 | FileCheck 
-check-prefix=NO_BUILD_ID %s
 NO_BUILD_ID: -build-id:no
+
+RUN: ld.lld -### foo.o -m i386pep -rpath foo 2>&1 | FileCheck 
-check-prefix=WARN_RPATH %s
+RUN: ld.lld -### foo.o -m i386pep --rpath foo 2>&1 | FileCheck 
-check-prefix=WARN_RPATH %s
+RUN: ld.lld -### foo.o -m i386pep -rpath=foo 2>&1 | FileCheck 
-check-prefix=WARN_RPATH %s
+RUN: ld.lld -### foo.o -m i386pep --rpath=foo 2>&1 | FileCheck 
-check-prefix=WARN_RPATH %s
+WARN_RPATH: warning: parameter -{{-?}}rpath has no effect on PE/COFF targets





[llvm-branch-commits] [lld] release/19.x: [LLD] [MinGW] Recognize the -rpath option (#102886) (PR #104843)

2024-08-20 Thread Tobias Hieta via llvm-branch-commits

https://github.com/tru closed https://github.com/llvm/llvm-project/pull/104843


[llvm-branch-commits] [clang] 3ffa542 - [C++23] Fix infinite recursion (Clang 19.x regression) (#104829)

2024-08-20 Thread Tobias Hieta via llvm-branch-commits

Author: Aaron Ballman
Date: 2024-08-20T09:20:54+02:00
New Revision: 3ffa5421ca657c04d4df170307c1f9a3c6293003

URL: 
https://github.com/llvm/llvm-project/commit/3ffa5421ca657c04d4df170307c1f9a3c6293003
DIFF: 
https://github.com/llvm/llvm-project/commit/3ffa5421ca657c04d4df170307c1f9a3c6293003.diff

LOG: [C++23] Fix infinite recursion (Clang 19.x regression) (#104829)

d469794d0cdfd2fea50a6ce0c0e33abb242d744c was fixing an issue with
triggering vtable instantiations, but it accidentally introduced
infinite recursion when the type to be checked is the same as the type
used in a base specifier or field declaration.

Fixes #104802

(cherry picked from commit 435cb0dc5eca08cdd8d9ed0d887fa1693cc2bf33)
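
The recursion problem can be reproduced with a small C++ model. This is a
sketch under stated assumptions, not the Sema code: TypeNode and
allConstexprDtors are hypothetical names, pointer identity stands in for the
canonical unqualified type comparison in the patch, and only direct
self-reference is guarded against.

```cpp
#include <vector>

// Hypothetical stand-in for a class type; not clang's CXXRecordDecl.
struct TypeNode {
  bool HasConstexprDtor = true;
  std::vector<const TypeNode *> Members; // types of bases and fields
};

// Returns true if T and the types it contains all have constexpr destructors.
// Members that are T itself are skipped; without the "M != T" test a type
// that names itself as a base or field would recurse without end.
bool allConstexprDtors(const TypeNode *T) {
  if (!T->HasConstexprDtor)
    return false;
  for (const TypeNode *M : T->Members)
    if (M != T && !allConstexprDtors(M))
      return false;
  return true;
}
```

A node that lists itself among its members, like the invalid "class foo {
foo a; }" case in the test, terminates in this model because the
self-reference is skipped rather than visited again.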

Added: 


Modified: 
clang/lib/Sema/SemaDeclCXX.cpp
clang/test/SemaCXX/gh102293.cpp

Removed: 




diff  --git a/clang/lib/Sema/SemaDeclCXX.cpp b/clang/lib/Sema/SemaDeclCXX.cpp
index ecf8754143a49e..92c47be67339e9 100644
--- a/clang/lib/Sema/SemaDeclCXX.cpp
+++ b/clang/lib/Sema/SemaDeclCXX.cpp
@@ -7056,11 +7056,16 @@ void Sema::CheckCompletedCXXClass(Scope *S, 
CXXRecordDecl *Record) {
 if (!RD->hasConstexprDestructor())
   return false;
 
+QualType CanUnqualT = T.getCanonicalType().getUnqualifiedType();
 for (const CXXBaseSpecifier &B : RD->bases())
-  if (!Check(B.getType(), Check))
+  if (B.getType().getCanonicalType().getUnqualifiedType() !=
+  CanUnqualT &&
+  !Check(B.getType(), Check))
 return false;
 for (const FieldDecl *FD : RD->fields())
-  if (!Check(FD->getType(), Check))
+  if (FD->getType().getCanonicalType().getUnqualifiedType() !=
+  CanUnqualT &&
+  !Check(FD->getType(), Check))
 return false;
 return true;
   };

diff  --git a/clang/test/SemaCXX/gh102293.cpp b/clang/test/SemaCXX/gh102293.cpp
index 30629fc03bf6a9..d4218cc13dcecd 100644
--- a/clang/test/SemaCXX/gh102293.cpp
+++ b/clang/test/SemaCXX/gh102293.cpp
@@ -1,5 +1,4 @@
 // RUN: %clang_cc1 -std=c++23 -fsyntax-only -verify %s
-// expected-no-diagnostics
 
 template  static void destroy() {
 T t;
@@ -20,3 +19,29 @@ struct S : HasVT {
   HasD<> v;
 };
 
+// Ensure we don't get infinite recursion from the check, however. See GH104802
+namespace GH104802 {
+class foo {   // expected-note {{definition of 'GH104802::foo' is not 
complete until the closing '}'}}
+  foo a;  // expected-error {{field has incomplete type 'foo'}}
+
+  virtual int c();
+};
+
+class bar {   // expected-note {{definition of 'GH104802::bar' is not 
complete until the closing '}'}}
+  const bar a;// expected-error {{field has incomplete type 'const bar'}}
+
+  virtual int c();
+};
+
+class baz {   // expected-note {{definition of 'GH104802::baz' is not 
complete until the closing '}'}}
+  typedef class baz blech;
+  blech a;// expected-error {{field has incomplete type 'blech' (aka 
'GH104802::baz')}}
+
+  virtual int c();
+};
+
+class quux : quux { // expected-error {{base class has incomplete type}} \
+ expected-note {{definition of 'GH104802::quux' is not 
complete until the closing '}'}}
+  virtual int c();
+};
+}





[llvm-branch-commits] [clang] release/19.x: [C++23] Fix infinite recursion (Clang 19.x regression) (#104829) (PR #104858)

2024-08-20 Thread Tobias Hieta via llvm-branch-commits

https://github.com/tru closed https://github.com/llvm/llvm-project/pull/104858


[llvm-branch-commits] [llvm] release/19.x: [llvm][CodeGen] Address the issue discovered In window scheduling (#101665) (PR #102881)

2024-08-20 Thread Tobias Hieta via llvm-branch-commits

https://github.com/tru updated https://github.com/llvm/llvm-project/pull/102881

>From bee19707eb9154a84d6052beb056f62ad69637fc Mon Sep 17 00:00:00 2001
From: Kai Yan 
Date: Wed, 24 Jul 2024 12:06:35 +0800
Subject: [PATCH 1/5] [llvm][CodeGen] Added missing initialization failure
 information for window scheduler (#99449)

Added missing initialization failure information for window scheduler.
---
 llvm/lib/CodeGen/WindowScheduler.cpp        | 5 ++++-
 llvm/test/CodeGen/Hexagon/swp-ws-fail-2.mir | 1 +
 2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/llvm/lib/CodeGen/WindowScheduler.cpp b/llvm/lib/CodeGen/WindowScheduler.cpp
index 0777480499e55b..3fe8a1aaafd128 100644
--- a/llvm/lib/CodeGen/WindowScheduler.cpp
+++ b/llvm/lib/CodeGen/WindowScheduler.cpp
@@ -232,8 +232,11 @@ bool WindowScheduler::initialize() {
   return false;
 }
 for (auto &Def : MI.all_defs())
-  if (Def.isReg() && Def.getReg().isPhysical())
+  if (Def.isReg() && Def.getReg().isPhysical()) {
+LLVM_DEBUG(dbgs() << "Physical registers are not supported in "
+ "window scheduling!\n");
 return false;
+  }
   }
   if (SchedInstrNum <= WindowRegionLimit) {
 LLVM_DEBUG(dbgs() << "There are too few MIs in the window region!\n");
diff --git a/llvm/test/CodeGen/Hexagon/swp-ws-fail-2.mir b/llvm/test/CodeGen/Hexagon/swp-ws-fail-2.mir
index 601b98dca8e20b..be75301b016ed9 100644
--- a/llvm/test/CodeGen/Hexagon/swp-ws-fail-2.mir
+++ b/llvm/test/CodeGen/Hexagon/swp-ws-fail-2.mir
@@ -3,6 +3,7 @@
 # RUN: -window-sched=force -filetype=null -verify-machineinstrs 2>&1 \
 # RUN: | FileCheck %s
 
+# CHECK: Physical registers are not supported in window scheduling!
 # CHECK: The WindowScheduler failed to initialize!
 
 ---

>From bb80679f48b89bf3bab7240f0bf4b7b96f2394d1 Mon Sep 17 00:00:00 2001
From: Kai Yan 
Date: Wed, 24 Jul 2024 12:11:58 +0800
Subject: [PATCH 2/5] [llvm][CodeGen] Added a new restriction for II by pragma
 in window scheduler (#99448)

Added a new restriction for window scheduling.
Window scheduling is disabled when llvm.loop.pipeline.initiationinterval
is set.
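
As a rough illustration of the resulting decision (a standalone sketch, not the code in MachinePipeliner.cpp), the function below mirrors the three cases listed in the patch. The free function `shouldUseWindowScheduler`, its parameter names, and the `WS_Off` enumerator are assumptions made for this example; only `WS_On` and `WS_Force` appear in the diff.

```cpp
#include <cassert>

// Hypothetical stand-in for the scheduler option; WS_Off is assumed here.
enum class WindowSchedulingFlag { WS_Off, WS_On, WS_Force };

// Returns true when window scheduling would be attempted.
constexpr bool shouldUseWindowScheduler(WindowSchedulingFlag Option,
                                        bool IISetByPragma, bool SMSChanged) {
  // Case 3 from the patch: a pragma-specified II disables window scheduling.
  if (IISetByPragma)
    return false;
  // Cases 1 and 2: forced on, or on and the swing modulo scheduler did not
  // already change the loop.
  return Option == WindowSchedulingFlag::WS_Force ||
         (Option == WindowSchedulingFlag::WS_On && !SMSChanged);
}

// Small usage example: the pragma check wins even over WS_Force.
inline void windowSchedulerExample() {
  assert(!shouldUseWindowScheduler(WindowSchedulingFlag::WS_Force, true, false));
  assert(shouldUseWindowScheduler(WindowSchedulingFlag::WS_On, false, false));
}
```

The point of putting the pragma check first is that it takes priority over `-window-sched=force`, which is what the new test exercises.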
---
 llvm/lib/CodeGen/MachinePipeliner.cpp | 12 ++-
 ...swp-ws-pragma-initiation-interval-fail.mir | 83 +++
 2 files changed, 93 insertions(+), 2 deletions(-)
 create mode 100644 llvm/test/CodeGen/Hexagon/swp-ws-pragma-initiation-interval-fail.mir

diff --git a/llvm/lib/CodeGen/MachinePipeliner.cpp b/llvm/lib/CodeGen/MachinePipeliner.cpp
index 497e282bb97682..5c68711ff61938 100644
--- a/llvm/lib/CodeGen/MachinePipeliner.cpp
+++ b/llvm/lib/CodeGen/MachinePipeliner.cpp
@@ -528,8 +528,16 @@ bool MachinePipeliner::useSwingModuloScheduler() {
 }
 
 bool MachinePipeliner::useWindowScheduler(bool Changed) {
-  // WindowScheduler does not work when it is off or when SwingModuloScheduler
-  // is successfully scheduled.
+  // WindowScheduler does not work for following cases:
+  // 1. when it is off.
+  // 2. when SwingModuloScheduler is successfully scheduled.
+  // 3. when pragma II is enabled.
+  if (II_setByPragma) {
+LLVM_DEBUG(dbgs() << "Window scheduling is disabled when "
+ "llvm.loop.pipeline.initiationinterval is set.\n");
+return false;
+  }
+
   return WindowSchedulingOption == WindowSchedulingFlag::WS_Force ||
  (WindowSchedulingOption == WindowSchedulingFlag::WS_On && !Changed);
 }
diff --git a/llvm/test/CodeGen/Hexagon/swp-ws-pragma-initiation-interval-fail.mir b/llvm/test/CodeGen/Hexagon/swp-ws-pragma-initiation-interval-fail.mir
new file mode 100644
index 00000000000000..6e69a76290fb1d
--- /dev/null
+++ b/llvm/test/CodeGen/Hexagon/swp-ws-pragma-initiation-interval-fail.mir
@@ -0,0 +1,83 @@
+# RUN: llc  --march=hexagon %s -run-pass=pipeliner -debug-only=pipeliner \
+# RUN: -window-sched=force -filetype=null -verify-machineinstrs 2>&1 \
+# RUN: | FileCheck %s
+# REQUIRES: asserts
+
+# Test that checks no window scheduler is performed if the II set by pragma was
+# enabled
+
+# CHECK: Window scheduling is disabled when llvm.loop.pipeline.initiationinterval is set.
+
+--- |
+  define void @test_pragma_ii_fail(ptr %a0, i32 %a1) {
+  b0:
+%v0 = icmp sgt i32 %a1, 1
+br i1 %v0, label %b1, label %b4
+
+  b1:   ; preds = %b0
+%v1 = load i32, ptr %a0, align 4
+%v2 = add i32 %v1, 10
+%v4 = add i32 %a1, -1
+%cgep = getelementptr i32, ptr %a0, i32 1
+br label %b2
+
+  b2:   ; preds = %b2, %b1
+%v5 = phi i32 [ %v12, %b2 ], [ %v4, %b1 ]
+%v6 = phi ptr [ %cgep2, %b2 ], [ %cgep, %b1 ]
+%v7 = phi i32 [ %v10, %b2 ], [ %v2, %b1 ]
+store i32 %v7, ptr %v6, align 4
+%v8 = add i32 %v7, 10
+%cgep1 = getelementptr i32, ptr %v6, i32 -1
+store i32 %v8, ptr %cgep1, align 4
+%v10 = add i32 %v7, 10
+%v12 = add i32 %v5, -1
+%v13 = icmp eq i32 %v12, 0
+%cgep2 = getelementptr i32, ptr %v6, i32 1
+br i

[llvm-branch-commits] [llvm] release/19.x: [llvm][CodeGen] Address the issue discovered In window scheduling (#101665) (PR #102881)

2024-08-20 Thread Tobias Hieta via llvm-branch-commits

tru wrote:

I can merge this PR as it is, but I need someone to review it, especially 
since it will land in the last possible window before -final. This has to be 
solid and not cause issues after the merge. Since this touches Hexagon, I would 
like @SundeepKushwaha to chime in.

https://github.com/llvm/llvm-project/pull/102881


[llvm-branch-commits] [clang] [AArch64][ARM] Add a release note about _BitInt (PR #101521)

2024-08-20 Thread Tobias Hieta via llvm-branch-commits

tru wrote:

The conflict needs to be fixed.

https://github.com/llvm/llvm-project/pull/101521


[llvm-branch-commits] [llvm] release/19.x: [SLP]Fix PR104422: Wrong value truncation (PR #104747)

2024-08-20 Thread Tobias Hieta via llvm-branch-commits

tru wrote:

@nikic who can review this? @fhahn ?

https://github.com/llvm/llvm-project/pull/104747


[llvm-branch-commits] [clang] 64b8514 - Reland [C++20] [Modules] [Itanium ABI] Generate the vtable in the mod… (#102287)

2024-08-20 Thread Tobias Hieta via llvm-branch-commits

Author: Chuanqi Xu
Date: 2024-08-20T09:28:20+02:00
New Revision: 64b8514e6c1a663660fbb93ec7f623b3e40a2020

URL: https://github.com/llvm/llvm-project/commit/64b8514e6c1a663660fbb93ec7f623b3e40a2020
DIFF: https://github.com/llvm/llvm-project/commit/64b8514e6c1a663660fbb93ec7f623b3e40a2020.diff

LOG: Reland [C++20] [Modules] [Itanium ABI] Generate the vtable in the mod… (#102287)

Reland https://github.com/llvm/llvm-project/pull/75912

The differences between this PR and
https://github.com/llvm/llvm-project/pull/75912 are:

- Fixed a regression in `Decl::isInAnotherModuleUnit()` in DeclBase.cpp
pointed out by @mizvekov and added the corresponding test.
- Fixed the regression on Windows
https://github.com/llvm/llvm-project/issues/97447. The changes are in
`CodeGenModule::getVTableLinkage` from
`clang/lib/CodeGen/CGVTables.cpp`. According to feedback from MSVC
devs, the linkage of vtables won't be affected by modules, so I simply
skipped the case for MSVC.

Given this is more or less fundamental to the use of modules, I hope we
can backport this to 19.x.

(cherry picked from commit 847f9cb0e868c8ec34f9aa86fdf846f8c4e0388b)

Added: 
clang/test/CodeGenCXX/pr70585.cppm
clang/test/Modules/pr97313.cppm
clang/test/Modules/static-func-in-private.cppm
clang/test/Modules/vtable-windows.cppm

Modified: 
clang/include/clang/AST/DeclBase.h
clang/include/clang/Serialization/ASTBitCodes.h
clang/include/clang/Serialization/ASTReader.h
clang/include/clang/Serialization/ASTWriter.h
clang/lib/AST/ASTContext.cpp
clang/lib/AST/DeclBase.cpp
clang/lib/CodeGen/CGVTables.cpp
clang/lib/CodeGen/ItaniumCXXABI.cpp
clang/lib/Sema/SemaDecl.cpp
clang/lib/Sema/SemaDeclCXX.cpp
clang/lib/Serialization/ASTReader.cpp
clang/lib/Serialization/ASTReaderDecl.cpp
clang/lib/Serialization/ASTWriter.cpp
clang/lib/Serialization/ASTWriterDecl.cpp
clang/test/CodeGenCXX/modules-vtable.cppm

Removed: 




diff  --git a/clang/include/clang/AST/DeclBase.h b/clang/include/clang/AST/DeclBase.h
index 40f01abf384e92..2a4bd0f9c2fda3 100644
--- a/clang/include/clang/AST/DeclBase.h
+++ b/clang/include/clang/AST/DeclBase.h
@@ -670,6 +670,13 @@ class alignas(8) Decl {
   /// Whether this declaration comes from another module unit.
   bool isInAnotherModuleUnit() const;
 
+  /// Whether this declaration comes from the same module unit being compiled.
+  bool isInCurrentModuleUnit() const;
+
+  /// Whether the definition of the declaration should be emitted in external
+  /// sources.
+  bool shouldEmitInExternalSource() const;
+
   /// Whether this declaration comes from explicit global module.
   bool isFromExplicitGlobalModule() const;
 

diff  --git a/clang/include/clang/Serialization/ASTBitCodes.h b/clang/include/clang/Serialization/ASTBitCodes.h
index 5dd0ba33f8a9c2..9b7e3af0e449b5 100644
--- a/clang/include/clang/Serialization/ASTBitCodes.h
+++ b/clang/include/clang/Serialization/ASTBitCodes.h
@@ -721,6 +721,9 @@ enum ASTRecordTypes {
 
   /// Record code for \#pragma clang unsafe_buffer_usage begin/end
   PP_UNSAFE_BUFFER_USAGE = 69,
+
+  /// Record code for vtables to emit.
+  VTABLES_TO_EMIT = 70,
 };
 
 /// Record types used within a source manager block.

diff  --git a/clang/include/clang/Serialization/ASTReader.h b/clang/include/clang/Serialization/ASTReader.h
index 76e51ac7ab9792..671520a3602b3b 100644
--- a/clang/include/clang/Serialization/ASTReader.h
+++ b/clang/include/clang/Serialization/ASTReader.h
@@ -790,6 +790,11 @@ class ASTReader
   /// the consumer eagerly.
   SmallVector EagerlyDeserializedDecls;
 
+  /// The IDs of all vtables to emit. The referenced declarations are passed
+  /// to the consumers' HandleVTable eagerly after passing
+  /// EagerlyDeserializedDecls.
+  SmallVector VTablesToEmit;
+
   /// The IDs of all tentative definitions stored in the chain.
   ///
   /// Sema keeps track of all tentative definitions in a TU because it has to
@@ -1500,6 +1505,7 @@ class ASTReader
   bool isConsumerInterestedIn(Decl *D);
   void PassInterestingDeclsToConsumer();
   void PassInterestingDeclToConsumer(Decl *D);
+  void PassVTableToConsumer(CXXRecordDecl *RD);
 
   void finishPendingActions();
   void diagnoseOdrViolations();

diff  --git a/clang/include/clang/Serialization/ASTWriter.h b/clang/include/clang/Serialization/ASTWriter.h
index a0e475ec9f862c..71a7c28047e318 100644
--- a/clang/include/clang/Serialization/ASTWriter.h
+++ b/clang/include/clang/Serialization/ASTWriter.h
@@ -500,6 +500,10 @@ class ASTWriter : public ASTDeserializationListener,
   std::vector NonAffectingRanges;
   std::vector NonAffectingOffsetAdjustments;
 
+  /// A list of classes which need to emit the VTable in the corresponding
+  /// object file.
+  llvm::SmallVector PendingEmittingVTables;
+
   /// Computes input files that didn't affect compilation of the current module,
   /// and initializes data struc

[llvm-branch-commits] [clang] release/19.x: Reland [C++20] [Modules] [Itanium ABI] Generate the vtable in the mod… (#102287) (PR #102561)

2024-08-20 Thread Tobias Hieta via llvm-branch-commits

https://github.com/tru updated https://github.com/llvm/llvm-project/pull/102561

>From 64b8514e6c1a663660fbb93ec7f623b3e40a2020 Mon Sep 17 00:00:00 2001
From: Chuanqi Xu 
Date: Thu, 8 Aug 2024 13:14:09 +0800
Subject: [PATCH] Reland [C++20] [Modules] [Itanium ABI] Generate the vtable in the mod… (#102287)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Reland https://github.com/llvm/llvm-project/pull/75912

The differences between this PR and
https://github.com/llvm/llvm-project/pull/75912 are:

- Fixed a regression in `Decl::isInAnotherModuleUnit()` in DeclBase.cpp
pointed out by @mizvekov and added the corresponding test.
- Fixed the regression on Windows
https://github.com/llvm/llvm-project/issues/97447. The changes are in
`CodeGenModule::getVTableLinkage` from
`clang/lib/CodeGen/CGVTables.cpp`. According to feedback from MSVC
devs, the linkage of vtables won't be affected by modules, so I simply
skipped the case for MSVC.

Given this is more or less fundamental to the use of modules, I hope we
can backport this to 19.x.

(cherry picked from commit 847f9cb0e868c8ec34f9aa86fdf846f8c4e0388b)
---
 clang/include/clang/AST/DeclBase.h|   7 ++
 .../include/clang/Serialization/ASTBitCodes.h |   3 +
 clang/include/clang/Serialization/ASTReader.h |   6 +
 clang/include/clang/Serialization/ASTWriter.h |   7 ++
 clang/lib/AST/ASTContext.cpp  |   3 +-
 clang/lib/AST/DeclBase.cpp|  34 +++--
 clang/lib/CodeGen/CGVTables.cpp   |  56 +
 clang/lib/CodeGen/ItaniumCXXABI.cpp   |   3 +
 clang/lib/Sema/SemaDecl.cpp   |   9 ++
 clang/lib/Sema/SemaDeclCXX.cpp|  14 ++-
 clang/lib/Serialization/ASTReader.cpp |  11 ++
 clang/lib/Serialization/ASTReaderDecl.cpp |   7 ++
 clang/lib/Serialization/ASTWriter.cpp |  33 -
 clang/lib/Serialization/ASTWriterDecl.cpp |   6 +
 clang/test/CodeGenCXX/modules-vtable.cppm |  31 +++--
 clang/test/CodeGenCXX/pr70585.cppm|  47 +++
 clang/test/Modules/pr97313.cppm   | 118 ++
 .../test/Modules/static-func-in-private.cppm  |   8 ++
 clang/test/Modules/vtable-windows.cppm|  26 
 19 files changed, 374 insertions(+), 55 deletions(-)
 create mode 100644 clang/test/CodeGenCXX/pr70585.cppm
 create mode 100644 clang/test/Modules/pr97313.cppm
 create mode 100644 clang/test/Modules/static-func-in-private.cppm
 create mode 100644 clang/test/Modules/vtable-windows.cppm

diff --git a/clang/include/clang/AST/DeclBase.h b/clang/include/clang/AST/DeclBase.h
index 40f01abf384e92..2a4bd0f9c2fda3 100644
--- a/clang/include/clang/AST/DeclBase.h
+++ b/clang/include/clang/AST/DeclBase.h
@@ -670,6 +670,13 @@ class alignas(8) Decl {
   /// Whether this declaration comes from another module unit.
   bool isInAnotherModuleUnit() const;
 
+  /// Whether this declaration comes from the same module unit being compiled.
+  bool isInCurrentModuleUnit() const;
+
+  /// Whether the definition of the declaration should be emitted in external
+  /// sources.
+  bool shouldEmitInExternalSource() const;
+
   /// Whether this declaration comes from explicit global module.
   bool isFromExplicitGlobalModule() const;
 
diff --git a/clang/include/clang/Serialization/ASTBitCodes.h b/clang/include/clang/Serialization/ASTBitCodes.h
index 5dd0ba33f8a9c2..9b7e3af0e449b5 100644
--- a/clang/include/clang/Serialization/ASTBitCodes.h
+++ b/clang/include/clang/Serialization/ASTBitCodes.h
@@ -721,6 +721,9 @@ enum ASTRecordTypes {
 
   /// Record code for \#pragma clang unsafe_buffer_usage begin/end
   PP_UNSAFE_BUFFER_USAGE = 69,
+
+  /// Record code for vtables to emit.
+  VTABLES_TO_EMIT = 70,
 };
 
 /// Record types used within a source manager block.
diff --git a/clang/include/clang/Serialization/ASTReader.h b/clang/include/clang/Serialization/ASTReader.h
index 76e51ac7ab9792..671520a3602b3b 100644
--- a/clang/include/clang/Serialization/ASTReader.h
+++ b/clang/include/clang/Serialization/ASTReader.h
@@ -790,6 +790,11 @@ class ASTReader
   /// the consumer eagerly.
   SmallVector EagerlyDeserializedDecls;
 
+  /// The IDs of all vtables to emit. The referenced declarations are passed
+  /// to the consumers' HandleVTable eagerly after passing
+  /// EagerlyDeserializedDecls.
+  SmallVector VTablesToEmit;
+
   /// The IDs of all tentative definitions stored in the chain.
   ///
   /// Sema keeps track of all tentative definitions in a TU because it has to
@@ -1500,6 +1505,7 @@ class ASTReader
   bool isConsumerInterestedIn(Decl *D);
   void PassInterestingDeclsToConsumer();
   void PassInterestingDeclToConsumer(Decl *D);
+  void PassVTableToConsumer(CXXRecordDecl *RD);
 
   void finishPendingActions();
   void diagnoseOdrViolations();
diff --git a/clang/include/clang/Serialization/ASTWriter.h b/clang/include/clang/Serialization/ASTWriter.h
index

[llvm-branch-commits] [clang] release/19.x: Reland [C++20] [Modules] [Itanium ABI] Generate the vtable in the mod… (#102287) (PR #102561)

2024-08-20 Thread Tobias Hieta via llvm-branch-commits

https://github.com/tru closed https://github.com/llvm/llvm-project/pull/102561


[llvm-branch-commits] [libunwind] release/19.x: [libunwind] Add GCS support for AArch64 (#99335) (PR #101888)

2024-08-20 Thread Tobias Hieta via llvm-branch-commits

https://github.com/tru updated https://github.com/llvm/llvm-project/pull/101888

>From 7e7e8125cfabf7daf5de63612e6f2c646dd8cad3 Mon Sep 17 00:00:00 2001
From: John Brawn 
Date: Sun, 4 Aug 2024 13:27:12 +0100
Subject: [PATCH 1/3] [libunwind] Add GCS support for AArch64 (#99335)

AArch64 GCS (Guarded Control Stack) is similar enough to CET that we can
re-use the existing code that is guarded by _LIBUNWIND_USE_CET, so long
as we also add defines to locate the GCS stack and pop the entries from
it. We also need the jumpto function to exit using br instead of ret, to
prevent it from popping the GCS stack.

GCS support is enabled using the LIBUNWIND_ENABLE_GCS cmake option. This
enables -mbranch-protection=standard, which enables GCS. For the places
we need to use GCS instructions we use the target attribute, as there's
not a command-line option to enable a specific architecture extension.

(cherry picked from commit b32aac4358c1f6639de7c453656cd74fbab75d71)
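
To make the reasoning above concrete, here is a toy model (not libunwind code, and not real GCS instructions) of why an unwinder has to pop entries before resuming in an outer frame. The type `ToyGuardedStack` and the helper `toyUnwind` are hypothetical names invented for this sketch; the only behaviour assumed is that a call pushes a return address and a return must match the top entry.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy model of a guarded control stack: calls push a return address,
// returns pop one and must match it.
struct ToyGuardedStack {
  std::vector<std::uintptr_t> entries;

  void call(std::uintptr_t returnAddress) { entries.push_back(returnAddress); }

  // Models a checked return: succeeds only if the target matches the top entry.
  bool ret(std::uintptr_t target) {
    if (entries.empty() || entries.back() != target)
      return false; // real hardware would raise an exception here
    entries.pop_back();
    return true;
  }
};

// Models what an unwinder has to do before resuming in an outer frame:
// discard one entry per frame that is being skipped.
inline bool toyUnwind(ToyGuardedStack &gcs, std::size_t framesSkipped) {
  if (framesSkipped > gcs.entries.size())
    return false;
  gcs.entries.resize(gcs.entries.size() - framesSkipped);
  return true;
}

// Usage example: three nested calls, then unwinding past two frames.
inline void toyGuardedStackExample() {
  ToyGuardedStack gcs;
  gcs.call(0x100);
  gcs.call(0x200);
  gcs.call(0x300);
  assert(!gcs.ret(0x100));    // stale entries still on top: mismatch
  assert(toyUnwind(gcs, 2));  // pop the two skipped frames
  assert(gcs.ret(0x100));     // now the outer return address matches
}
```

In this model, a plain jump to the landing pad corresponds to never calling `ret` at all, which is the effect the commit message describes for exiting jumpto with br.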
---
 libunwind/CMakeLists.txt  |  8 +
 libunwind/src/Registers.hpp   |  7 +
 libunwind/src/UnwindCursor.hpp|  6 ++--
 libunwind/src/UnwindLevel1.c  | 31 +--
 libunwind/src/UnwindRegistersRestore.S|  2 +-
 libunwind/src/cet_unwind.h| 18 +++
 libunwind/test/CMakeLists.txt |  1 +
 .../test/configs/llvm-libunwind-merged.cfg.in |  3 ++
 .../test/configs/llvm-libunwind-shared.cfg.in |  3 ++
 .../test/configs/llvm-libunwind-static.cfg.in |  3 ++
 10 files changed, 75 insertions(+), 7 deletions(-)

diff --git a/libunwind/CMakeLists.txt b/libunwind/CMakeLists.txt
index b22ade0a7d71eb..28d67b0fef92cc 100644
--- a/libunwind/CMakeLists.txt
+++ b/libunwind/CMakeLists.txt
@@ -37,6 +37,7 @@ if (LIBUNWIND_BUILD_32_BITS)
 endif()
 
 option(LIBUNWIND_ENABLE_CET "Build libunwind with CET enabled." OFF)
+option(LIBUNWIND_ENABLE_GCS "Build libunwind with GCS enabled." OFF)
 option(LIBUNWIND_ENABLE_ASSERTIONS "Enable assertions independent of build mode." ON)
 option(LIBUNWIND_ENABLE_PEDANTIC "Compile with pedantic enabled." ON)
 option(LIBUNWIND_ENABLE_WERROR "Fail and stop if a warning is triggered." OFF)
@@ -188,6 +189,13 @@ if (LIBUNWIND_ENABLE_CET)
   endif()
 endif()
 
+if (LIBUNWIND_ENABLE_GCS)
+  add_compile_flags_if_supported(-mbranch-protection=standard)
+  if (NOT CXX_SUPPORTS_MBRANCH_PROTECTION_EQ_STANDARD_FLAG)
+message(SEND_ERROR "Compiler doesn't support GCS -mbranch-protection option!")
+  endif()
+endif()
+
 if (WIN32)
   # The headers lack matching dllexport attributes (_LIBUNWIND_EXPORT);
   # silence the warning instead of cluttering the headers (which aren't
diff --git a/libunwind/src/Registers.hpp b/libunwind/src/Registers.hpp
index d11ddb3426d522..861e6b5f6f2c58 100644
--- a/libunwind/src/Registers.hpp
+++ b/libunwind/src/Registers.hpp
@@ -1815,6 +1815,13 @@ inline const char *Registers_ppc64::getRegisterName(int regNum) {
 /// process.
 class _LIBUNWIND_HIDDEN Registers_arm64;
 extern "C" void __libunwind_Registers_arm64_jumpto(Registers_arm64 *);
+
+#if defined(_LIBUNWIND_USE_GCS)
+extern "C" void *__libunwind_cet_get_jump_target() {
+  return reinterpret_cast<void *>(&__libunwind_Registers_arm64_jumpto);
+}
+#endif
+
 class _LIBUNWIND_HIDDEN Registers_arm64 {
 public:
   Registers_arm64();
diff --git a/libunwind/src/UnwindCursor.hpp b/libunwind/src/UnwindCursor.hpp
index 758557337899ed..06e654197351df 100644
--- a/libunwind/src/UnwindCursor.hpp
+++ b/libunwind/src/UnwindCursor.hpp
@@ -471,7 +471,7 @@ class _LIBUNWIND_HIDDEN AbstractUnwindCursor {
   }
 #endif
 
-#if defined(_LIBUNWIND_USE_CET)
+#if defined(_LIBUNWIND_USE_CET) || defined(_LIBUNWIND_USE_GCS)
   virtual void *get_registers() {
 _LIBUNWIND_ABORT("get_registers not implemented");
   }
@@ -954,7 +954,7 @@ class UnwindCursor : public AbstractUnwindCursor{
   virtual uintptr_t getDataRelBase();
 #endif
 
-#if defined(_LIBUNWIND_USE_CET)
+#if defined(_LIBUNWIND_USE_CET) || defined(_LIBUNWIND_USE_GCS)
   virtual void *get_registers() { return &_registers; }
 #endif
 
@@ -3005,7 +3005,7 @@ bool UnwindCursor::isReadableAddr(const pint_t addr) const {
 }
 #endif
 
-#if defined(_LIBUNWIND_USE_CET)
+#if defined(_LIBUNWIND_USE_CET) || defined(_LIBUNWIND_USE_GCS)
 extern "C" void *__libunwind_cet_get_registers(unw_cursor_t *cursor) {
   AbstractUnwindCursor *co = (AbstractUnwindCursor *)cursor;
   return co->get_registers();
diff --git a/libunwind/src/UnwindLevel1.c b/libunwind/src/UnwindLevel1.c
index 48e7bc3b9e00ec..7e785f4d31e716 100644
--- a/libunwind/src/UnwindLevel1.c
+++ b/libunwind/src/UnwindLevel1.c
@@ -44,7 +44,7 @@
 // _LIBUNWIND_POP_CET_SSP is used to adjust CET shadow stack pointer and we
 // directly jump to __libunwind_Registers_x86/x86_64_jumpto instead of using
 // a regular function call to avoid pushing to CET shadow stack again.
-#if !defined(_LIBUNWIND_USE_CET)
+#if !defined(_LIBUNWIND_USE_CET) && !defined(_LIBUNWIND_USE_GCS)
 #define __

[llvm-branch-commits] [libunwind] 7e7e812 - [libunwind] Add GCS support for AArch64 (#99335)

2024-08-20 Thread Tobias Hieta via llvm-branch-commits

Author: John Brawn
Date: 2024-08-20T09:29:19+02:00
New Revision: 7e7e8125cfabf7daf5de63612e6f2c646dd8cad3

URL: https://github.com/llvm/llvm-project/commit/7e7e8125cfabf7daf5de63612e6f2c646dd8cad3
DIFF: https://github.com/llvm/llvm-project/commit/7e7e8125cfabf7daf5de63612e6f2c646dd8cad3.diff

LOG: [libunwind] Add GCS support for AArch64 (#99335)

AArch64 GCS (Guarded Control Stack) is similar enough to CET that we can
re-use the existing code that is guarded by _LIBUNWIND_USE_CET, so long
as we also add defines to locate the GCS stack and pop the entries from
it. We also need the jumpto function to exit using br instead of ret, to
prevent it from popping the GCS stack.

GCS support is enabled using the LIBUNWIND_ENABLE_GCS cmake option. This
enables -mbranch-protection=standard, which enables GCS. For the places
we need to use GCS instructions we use the target attribute, as there's
not a command-line option to enable a specific architecture extension.

(cherry picked from commit b32aac4358c1f6639de7c453656cd74fbab75d71)

Added: 


Modified: 
libunwind/CMakeLists.txt
libunwind/src/Registers.hpp
libunwind/src/UnwindCursor.hpp
libunwind/src/UnwindLevel1.c
libunwind/src/UnwindRegistersRestore.S
libunwind/src/cet_unwind.h
libunwind/test/CMakeLists.txt
libunwind/test/configs/llvm-libunwind-merged.cfg.in
libunwind/test/configs/llvm-libunwind-shared.cfg.in
libunwind/test/configs/llvm-libunwind-static.cfg.in

Removed: 




diff  --git a/libunwind/CMakeLists.txt b/libunwind/CMakeLists.txt
index b22ade0a7d71eb..28d67b0fef92cc 100644
--- a/libunwind/CMakeLists.txt
+++ b/libunwind/CMakeLists.txt
@@ -37,6 +37,7 @@ if (LIBUNWIND_BUILD_32_BITS)
 endif()
 
 option(LIBUNWIND_ENABLE_CET "Build libunwind with CET enabled." OFF)
+option(LIBUNWIND_ENABLE_GCS "Build libunwind with GCS enabled." OFF)
 option(LIBUNWIND_ENABLE_ASSERTIONS "Enable assertions independent of build mode." ON)
 option(LIBUNWIND_ENABLE_PEDANTIC "Compile with pedantic enabled." ON)
 option(LIBUNWIND_ENABLE_WERROR "Fail and stop if a warning is triggered." OFF)
@@ -188,6 +189,13 @@ if (LIBUNWIND_ENABLE_CET)
   endif()
 endif()
 
+if (LIBUNWIND_ENABLE_GCS)
+  add_compile_flags_if_supported(-mbranch-protection=standard)
+  if (NOT CXX_SUPPORTS_MBRANCH_PROTECTION_EQ_STANDARD_FLAG)
+message(SEND_ERROR "Compiler doesn't support GCS -mbranch-protection option!")
+  endif()
+endif()
+
 if (WIN32)
   # The headers lack matching dllexport attributes (_LIBUNWIND_EXPORT);
   # silence the warning instead of cluttering the headers (which aren't

diff  --git a/libunwind/src/Registers.hpp b/libunwind/src/Registers.hpp
index d11ddb3426d522..861e6b5f6f2c58 100644
--- a/libunwind/src/Registers.hpp
+++ b/libunwind/src/Registers.hpp
@@ -1815,6 +1815,13 @@ inline const char *Registers_ppc64::getRegisterName(int regNum) {
 /// process.
 class _LIBUNWIND_HIDDEN Registers_arm64;
 extern "C" void __libunwind_Registers_arm64_jumpto(Registers_arm64 *);
+
+#if defined(_LIBUNWIND_USE_GCS)
+extern "C" void *__libunwind_cet_get_jump_target() {
+  return reinterpret_cast<void *>(&__libunwind_Registers_arm64_jumpto);
+}
+#endif
+
 class _LIBUNWIND_HIDDEN Registers_arm64 {
 public:
   Registers_arm64();

diff  --git a/libunwind/src/UnwindCursor.hpp b/libunwind/src/UnwindCursor.hpp
index 758557337899ed..06e654197351df 100644
--- a/libunwind/src/UnwindCursor.hpp
+++ b/libunwind/src/UnwindCursor.hpp
@@ -471,7 +471,7 @@ class _LIBUNWIND_HIDDEN AbstractUnwindCursor {
   }
 #endif
 
-#if defined(_LIBUNWIND_USE_CET)
+#if defined(_LIBUNWIND_USE_CET) || defined(_LIBUNWIND_USE_GCS)
   virtual void *get_registers() {
 _LIBUNWIND_ABORT("get_registers not implemented");
   }
@@ -954,7 +954,7 @@ class UnwindCursor : public AbstractUnwindCursor{
   virtual uintptr_t getDataRelBase();
 #endif
 
-#if defined(_LIBUNWIND_USE_CET)
+#if defined(_LIBUNWIND_USE_CET) || defined(_LIBUNWIND_USE_GCS)
   virtual void *get_registers() { return &_registers; }
 #endif
 
@@ -3005,7 +3005,7 @@ bool UnwindCursor::isReadableAddr(const pint_t addr) const {
 }
 #endif
 
-#if defined(_LIBUNWIND_USE_CET)
+#if defined(_LIBUNWIND_USE_CET) || defined(_LIBUNWIND_USE_GCS)
 extern "C" void *__libunwind_cet_get_registers(unw_cursor_t *cursor) {
   AbstractUnwindCursor *co = (AbstractUnwindCursor *)cursor;
   return co->get_registers();

diff  --git a/libunwind/src/UnwindLevel1.c b/libunwind/src/UnwindLevel1.c
index 48e7bc3b9e00ec..7e785f4d31e716 100644
--- a/libunwind/src/UnwindLevel1.c
+++ b/libunwind/src/UnwindLevel1.c
@@ -44,7 +44,7 @@
 // _LIBUNWIND_POP_CET_SSP is used to adjust CET shadow stack pointer and we
 // directly jump to __libunwind_Registers_x86/x86_64_jumpto instead of using
 // a regular function call to avoid pushing to CET shadow stack again.
-#if !defined(_LIBUNWIND_USE_CET)
+#if !defined(_LIBUNWIND_USE_CET) && !defined(_LIBUNWIND_USE_GCS)
 #define __unw_phase2_resume(cursor, fn)  

[llvm-branch-commits] [libunwind] c3da16b - [libunwind] Be more careful about enabling GCS (#101973)

2024-08-20 Thread Tobias Hieta via llvm-branch-commits

Author: John Brawn
Date: 2024-08-20T09:29:19+02:00
New Revision: c3da16b094511e42022e534b5eb665dbc3f8db0f

URL: https://github.com/llvm/llvm-project/commit/c3da16b094511e42022e534b5eb665dbc3f8db0f
DIFF: https://github.com/llvm/llvm-project/commit/c3da16b094511e42022e534b5eb665dbc3f8db0f.diff

LOG: [libunwind] Be more careful about enabling GCS (#101973)

We need both GCS to be enabled by the compiler (which we do by checking
if __ARM_FEATURE_GCS_DEFAULT is defined) and for arm_acle.h to define
the GCS intrinsics. Check the latter by checking if _CHKFEAT_GCS is
defined.

(cherry picked from commit c649194a71b47431f2eb2e041435d564e3b51072)
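
The two-step check described above can be sketched as a standalone header fragment. This is only an illustration: the `SKETCH_*` macros and the `k*` constants are names invented here, and `__aarch64__` is used in place of libunwind's own `_LIBUNWIND_TARGET_AARCH64` so that the fragment compiles on its own.

```cpp
#include <cassert>

// Step 1: did the compiler enable GCS for this translation unit?
#if defined(__aarch64__) && defined(__ARM_FEATURE_GCS_DEFAULT)
#define SKETCH_COMPILER_ENABLES_GCS 1
#include <arm_acle.h>
#else
#define SKETCH_COMPILER_ENABLES_GCS 0
#endif

// Step 2: does arm_acle.h actually provide the GCS intrinsics?
#if SKETCH_COMPILER_ENABLES_GCS && defined(_CHKFEAT_GCS)
#define SKETCH_USE_GCS 1
#else
#define SKETCH_USE_GCS 0
#endif

constexpr bool kCompilerEnablesGcs = SKETCH_COMPILER_ENABLES_GCS != 0;
constexpr bool kUseGcs = SKETCH_USE_GCS != 0;

// GCS code paths are only compiled in when both conditions hold.
static_assert(!kUseGcs || kCompilerEnablesGcs,
              "GCS use requires the compiler to enable GCS");

// Usage example: callers branch on kUseGcs instead of the raw feature macro.
inline void gcsGateExample() { assert(!kUseGcs || kCompilerEnablesGcs); }
```

With an older arm_acle.h the second step fails, so the GCS paths are left out instead of producing a compile error on the missing intrinsics.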

Added: 


Modified: 
libunwind/src/cet_unwind.h

Removed: 




diff  --git a/libunwind/src/cet_unwind.h b/libunwind/src/cet_unwind.h
index 45c11973cb7fa3..47d7616a7322c3 100644
--- a/libunwind/src/cet_unwind.h
+++ b/libunwind/src/cet_unwind.h
@@ -39,9 +39,13 @@
 // need to guard any use of GCS instructions with __chkfeat though, as GCS may
 // not be enabled.
 #if defined(_LIBUNWIND_TARGET_AARCH64) && defined(__ARM_FEATURE_GCS_DEFAULT)
-#define _LIBUNWIND_USE_GCS 1
 #include <arm_acle.h>
 
+// We can only use GCS if arm_acle.h defines the GCS intrinsics.
+#ifdef _CHKFEAT_GCS
+#define _LIBUNWIND_USE_GCS 1
+#endif
+
 #define _LIBUNWIND_POP_CET_SSP(x)                                              \
   do {                                                                         \
 if (__chkfeat(_CHKFEAT_GCS)) {                                                 \





[llvm-branch-commits] [libunwind] 72d2932 - [libunwind] Fix problems caused by combining BTI and GCS (#102322)

2024-08-20 Thread Tobias Hieta via llvm-branch-commits

Author: John Brawn
Date: 2024-08-20T09:29:19+02:00
New Revision: 72d2932da5a7c70885a1fdfaa809ff1ede0984ff

URL: https://github.com/llvm/llvm-project/commit/72d2932da5a7c70885a1fdfaa809ff1ede0984ff
DIFF: https://github.com/llvm/llvm-project/commit/72d2932da5a7c70885a1fdfaa809ff1ede0984ff.diff

LOG: [libunwind] Fix problems caused by combining BTI and GCS (#102322)

The libunwind assembly files need adjustment in order to work correctly
when BTI and GCS are both enabled (which will be the case when
using -mbranch-protection=standard):
* __libunwind_Registers_arm64_jumpto can't use br to jump to the return
location; instead we need to use gcspush then ret.
* Because we indirectly call __libunwind_Registers_arm64_jumpto, it needs
to start with bti jc.
* We need to set the GCS GNU property bit when it's enabled.

-

Co-authored-by: Daniel Kiss 
(cherry picked from commit 39529107b46032ef0875ac5b809ab5b60cd15a40)
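
The GNU property values chosen in the assembly.h change of this commit (7, 4 and 3) can be read as a bit mask. The sketch below restates that selection as a small function; `gnuPropertyBits` and the `k*` constants are hypothetical names for this example, and the function is an illustration of the preprocessor logic, not a replacement for it.

```cpp
#include <cassert>
#include <cstdint>

// Feature bits of GNU_PROPERTY_AARCH64_FEATURE_1_AND.
constexpr std::uint32_t kBti = 1;
constexpr std::uint32_t kPac = 2;
constexpr std::uint32_t kGcs = 4;

// Mirrors the #if chain in the patch: bti/gcs stand for
// __ARM_FEATURE_BTI_DEFAULT and __ARM_FEATURE_GCS_DEFAULT being defined.
// A result of 0 means no property note is emitted.
constexpr std::uint32_t gnuPropertyBits(bool bti, bool gcs) {
  if (bti && gcs)
    return kBti | kPac | kGcs; // 7
  if (gcs)
    return kGcs;               // 4
  if (bti)
    return kBti | kPac;        // 3
  return 0;
}

// Usage example: -mbranch-protection=standard turns on both BTI and GCS.
inline void gnuPropertyExample() { assert(gnuPropertyBits(true, true) == 7u); }
```

Note that, as in the patch, the PAC bit is set whenever BTI is enabled and not when only GCS is enabled.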

Added: 


Modified: 
libunwind/src/UnwindRegistersRestore.S
libunwind/src/assembly.h

Removed: 




diff  --git a/libunwind/src/UnwindRegistersRestore.S b/libunwind/src/UnwindRegistersRestore.S
index e1d6e17549880b..9d34c7909ed371 100644
--- a/libunwind/src/UnwindRegistersRestore.S
+++ b/libunwind/src/UnwindRegistersRestore.S
@@ -629,6 +629,10 @@ Lnovec:
 
 #elif defined(__aarch64__)
 
+#if defined(__ARM_FEATURE_GCS_DEFAULT)
+.arch_extension gcs
+#endif
+
 //
 // extern "C" void __libunwind_Registers_arm64_jumpto(Registers_arm64 *);
 //
@@ -680,7 +684,17 @@ DEFINE_LIBUNWIND_FUNCTION(__libunwind_Registers_arm64_jumpto)
   ldr    x16, [x0, #0x0F8]
   ldp    x0, x1,  [x0, #0x000]  // restore x0,x1
   mov    sp,x16                 // restore sp
-  br     x30                    // jump to pc
+#if defined(__ARM_FEATURE_GCS_DEFAULT)
+  // If GCS is enabled we need to push the address we're returning to onto the
+  // GCS stack. We can't just return using br, as there won't be a BTI landing
+  // pad instruction at the destination.
+  mov  x16, #1
+  chkfeat  x16
+  cbnz x16, Lnogcs
+  gcspushm x30
+Lnogcs:
+#endif
+  ret    x30                    // jump to pc
 
 #elif defined(__arm__) && !defined(__APPLE__)
 

diff  --git a/libunwind/src/assembly.h b/libunwind/src/assembly.h
index fb07d04071af3d..f8e83e138eff50 100644
--- a/libunwind/src/assembly.h
+++ b/libunwind/src/assembly.h
@@ -82,7 +82,22 @@
 #define PPC64_OPD2
 #endif
 
-#if defined(__aarch64__) && defined(__ARM_FEATURE_BTI_DEFAULT)
+#if defined(__aarch64__)
+#if defined(__ARM_FEATURE_GCS_DEFAULT) && defined(__ARM_FEATURE_BTI_DEFAULT)
+// Set BTI, PAC, and GCS gnu property bits
+#define GNU_PROPERTY 7
+// We indirectly branch to __libunwind_Registers_arm64_jumpto from
+// __unw_phase2_resume, so we need to use bti jc.
+#define AARCH64_BTI bti jc
+#elif defined(__ARM_FEATURE_GCS_DEFAULT)
+// Set GCS gnu property bit
+#define GNU_PROPERTY 4
+#elif defined(__ARM_FEATURE_BTI_DEFAULT)
+// Set BTI and PAC gnu property bits
+#define GNU_PROPERTY 3
+#define AARCH64_BTI bti c
+#endif
+#ifdef GNU_PROPERTY
   .pushsection ".note.gnu.property", "a" SEPARATOR                             \
   .balign 8 SEPARATOR                                                          \
   .long 4 SEPARATOR                                                            \
@@ -91,12 +106,12 @@
   .asciz "GNU" SEPARATOR                                                       \
   .long 0xc0000000 SEPARATOR /* GNU_PROPERTY_AARCH64_FEATURE_1_AND */          \
   .long 4 SEPARATOR                                                            \
-  .long 3 SEPARATOR /* GNU_PROPERTY_AARCH64_FEATURE_1_BTI AND */               \
-/* GNU_PROPERTY_AARCH64_FEATURE_1_PAC */                                       \
+  .long GNU_PROPERTY SEPARATOR                                                 \
   .long 0 SEPARATOR                                                            \
   .popsection SEPARATOR
-#define AARCH64_BTI  bti c
-#else
+#endif
+#endif
+#if !defined(AARCH64_BTI)
 #define AARCH64_BTI
 #endif
 





[llvm-branch-commits] [libunwind] release/19.x: [libunwind] Add GCS support for AArch64 (#99335) (PR #101888)

2024-08-20 Thread Tobias Hieta via llvm-branch-commits

https://github.com/tru closed https://github.com/llvm/llvm-project/pull/101888


[llvm-branch-commits] [clang] [llvm] [AArch64] Fix a bug where user could not disable certain architecture features (PR #104752)

2024-08-20 Thread via llvm-branch-commits

github-actions[bot] wrote:

@tmatheson-arm (or anyone else). If you would like to add a note about this fix 
in the release notes (completely optional). Please reply to this comment with a 
one or two sentence description of the fix.  When you are done, please add the 
release:note label to this PR. 

https://github.com/llvm/llvm-project/pull/104752


[llvm-branch-commits] [llvm] release/19.x: [GlobalISel] Bail out early for big-endian (#103310) (PR #104823)

2024-08-20 Thread via llvm-branch-commits

github-actions[bot] wrote:

@davemgreen (or anyone else). If you would like to add a note about this fix in 
the release notes (completely optional). Please reply to this comment with a 
one or two sentence description of the fix.  When you are done, please add the 
release:note label to this PR. 

https://github.com/llvm/llvm-project/pull/104823


[llvm-branch-commits] [lld] release/19.x: [LLD] [MinGW] Recognize the -rpath option (#102886) (PR #104843)

2024-08-20 Thread via llvm-branch-commits

github-actions[bot] wrote:

@mstorsjo (or anyone else). If you would like to add a note about this fix in 
the release notes (completely optional). Please reply to this comment with a 
one or two sentence description of the fix.  When you are done, please add the 
release:note label to this PR. 

https://github.com/llvm/llvm-project/pull/104843


[llvm-branch-commits] [clang] release/19.x: [C++23] Fix infinite recursion (Clang 19.x regression) (#104829) (PR #104858)

2024-08-20 Thread via llvm-branch-commits

github-actions[bot] wrote:

@AaronBallman (or anyone else). If you would like to add a note about this fix 
in the release notes (completely optional). Please reply to this comment with a 
one or two sentence description of the fix.  When you are done, please add the 
release:note label to this PR. 

https://github.com/llvm/llvm-project/pull/104858


[llvm-branch-commits] [clang] release/19.x: Reland [C++20] [Modules] [Itanium ABI] Generate the vtable in the mod… (#102287) (PR #102561)

2024-08-20 Thread via llvm-branch-commits

github-actions[bot] wrote:

@ChuanqiXu9 (or anyone else). If you would like to add a note about this fix in 
the release notes (completely optional). Please reply to this comment with a 
one or two sentence description of the fix.  When you are done, please add the 
release:note label to this PR. 

https://github.com/llvm/llvm-project/pull/102561


[llvm-branch-commits] [libunwind] release/19.x: [libunwind] Add GCS support for AArch64 (#99335) (PR #101888)

2024-08-20 Thread via llvm-branch-commits

github-actions[bot] wrote:

@john-brawn-arm (or anyone else). If you would like to add a note about this 
fix in the release notes (completely optional). Please reply to this comment 
with a one or two sentence description of the fix.  When you are done, please 
add the release:note label to this PR. 

https://github.com/llvm/llvm-project/pull/101888


[llvm-branch-commits] [flang] [WIP][flang] Introduce HLFIR lowerings to omp.workshare_loop_nest (PR #104748)

2024-08-20 Thread Ivan R. Ivanov via llvm-branch-commits

https://github.com/ivanradanov updated 
https://github.com/llvm/llvm-project/pull/104748

>From a45ef32ecf6483bdb65954c4283ea493494cea77 Mon Sep 17 00:00:00 2001
From: Ivan Radanov Ivanov 
Date: Tue, 20 Aug 2024 16:57:25 +0900
Subject: [PATCH 1/6] Update test

---
 .../Transforms/OpenMP/lower-workshare.mlir| 42 +++
 1 file changed, 16 insertions(+), 26 deletions(-)

diff --git a/flang/test/Transforms/OpenMP/lower-workshare.mlir 
b/flang/test/Transforms/OpenMP/lower-workshare.mlir
index 9347863dc4a609..c189e54aaeb0d4 100644
--- a/flang/test/Transforms/OpenMP/lower-workshare.mlir
+++ b/flang/test/Transforms/OpenMP/lower-workshare.mlir
@@ -103,28 +103,23 @@ func.func @wsfunc(%arg0: !fir.ref>) {
 // CHECK: %[[VAL_10:.*]]:2 = hlfir.declare %[[VAL_0]](%[[VAL_9]]) 
{uniq_name = "array"} : (!fir.ref>, !fir.shape<1>) -> 
(!fir.ref>, !fir.ref>)
 // CHECK: %[[VAL_11:.*]] = fir.load %[[VAL_1]] : 
!fir.ref>>
 // CHECK: %[[VAL_12:.*]]:2 = hlfir.declare %[[VAL_11]](%[[VAL_9]]) 
{uniq_name = ".tmp.array"} : (!fir.heap>, !fir.shape<1>) -> 
(!fir.heap>, !fir.heap>)
-// CHECK: %[[VAL_13:.*]] = arith.constant true
-// CHECK: %[[VAL_14:.*]] = arith.constant 1 : index
+// CHECK: %[[VAL_13:.*]] = arith.constant 1 : index
 // CHECK: omp.wsloop {
-// CHECK:   omp.loop_nest (%[[VAL_15:.*]]) : index = (%[[VAL_14]]) 
to (%[[VAL_7]]) inclusive step (%[[VAL_14]]) {
-// CHECK: %[[VAL_16:.*]] = hlfir.designate %[[VAL_10]]#0 
(%[[VAL_15]])  : (!fir.ref>, index) -> !fir.ref
-// CHECK: %[[VAL_17:.*]] = fir.load %[[VAL_16]] : !fir.ref
-// CHECK: %[[VAL_18:.*]] = arith.subi %[[VAL_17]], %[[VAL_8]] 
: i32
-// CHECK: %[[VAL_19:.*]] = hlfir.designate %[[VAL_12]]#0 
(%[[VAL_15]])  : (!fir.heap>, index) -> !fir.ref
-// CHECK: hlfir.assign %[[VAL_18]] to %[[VAL_19]] 
temporary_lhs : i32, !fir.ref
+// CHECK:   omp.loop_nest (%[[VAL_14:.*]]) : index = (%[[VAL_13]]) 
to (%[[VAL_7]]) inclusive step (%[[VAL_13]]) {
+// CHECK: %[[VAL_15:.*]] = hlfir.designate %[[VAL_10]]#0 
(%[[VAL_14]])  : (!fir.ref>, index) -> !fir.ref
+// CHECK: %[[VAL_16:.*]] = fir.load %[[VAL_15]] : !fir.ref
+// CHECK: %[[VAL_17:.*]] = arith.subi %[[VAL_16]], %[[VAL_8]] 
: i32
+// CHECK: %[[VAL_18:.*]] = hlfir.designate %[[VAL_12]]#0 
(%[[VAL_14]])  : (!fir.heap>, index) -> !fir.ref
+// CHECK: hlfir.assign %[[VAL_17]] to %[[VAL_18]] 
temporary_lhs : i32, !fir.ref
 // CHECK: omp.yield
 // CHECK:   }
 // CHECK:   omp.terminator
 // CHECK: }
 // CHECK: omp.single nowait {
-// CHECK:   %[[VAL_20:.*]] = fir.undefined 
tuple>, i1>
-// CHECK:   %[[VAL_21:.*]] = fir.insert_value %[[VAL_20]], 
%[[VAL_13]], [1 : index] : (tuple>, i1>, i1) -> 
tuple>, i1>
 // CHECK:   hlfir.assign %[[VAL_12]]#0 to %[[VAL_10]]#0 : 
!fir.heap>, !fir.ref>
 // CHECK:   fir.freemem %[[VAL_12]]#0 : 
!fir.heap>
 // CHECK:   omp.terminator
 // CHECK: }
-// CHECK: %[[VAL_22:.*]] = fir.undefined 
tuple>, i1>
-// CHECK: %[[VAL_23:.*]] = fir.insert_value %[[VAL_22]], 
%[[VAL_13]], [1 : index] : (tuple>, i1>, i1) -> 
tuple>, i1>
 // CHECK: omp.barrier
 // CHECK: omp.terminator
 // CHECK:   }
@@ -168,31 +163,26 @@ func.func @wsfunc(%arg0: !fir.ref>) {
 // CHECK:   %[[VAL_12:.*]]:2 = hlfir.declare %[[VAL_0]](%[[VAL_11]]) 
{uniq_name = "array"} : (!fir.ref>, !fir.shape<1>) -> 
(!fir.ref>, !fir.ref>)
 // CHECK:   %[[VAL_13:.*]] = fir.load %[[VAL_2]] : 
!fir.ref>>
 // CHECK:   %[[VAL_14:.*]]:2 = hlfir.declare %[[VAL_13]](%[[VAL_11]]) 
{uniq_name = ".tmp.array"} : (!fir.heap>, !fir.shape<1>) -> 
(!fir.heap>, !fir.heap>)
-// CHECK:   %[[VAL_15:.*]] = arith.constant true
-// CHECK:   %[[VAL_16:.*]] = arith.constant 1 : index
+// CHECK:   %[[VAL_15:.*]] = arith.constant 1 : index
 // CHECK:   omp.wsloop {
-// CHECK: omp.loop_nest (%[[VAL_17:.*]]) : index = (%[[VAL_16]]) 
to (%[[VAL_10]]) inclusive step (%[[VAL_16]]) {
-// CHECK:   %[[VAL_18:.*]] = hlfir.designate %[[VAL_12]]#0 
(%[[VAL_17]])  : (!fir.ref>, index) -> !fir.ref
-// CHECK:   %[[VAL_19:.*]] = fir.load %[[VAL_18]] : !fir.ref
-// CHECK:   %[[VAL_20:.*]] = fir.load %[[VAL_1]] : !fir.ref
-// CHECK:   %[[VAL_21:.*]] = arith.subi %[[VAL_19]], %[[VAL_20]] : 
i32
-// CHECK:   %[[VAL_22:.*]] = arith.subi %[[VAL_21]], %[[VAL_9]] : 
i32
-// CHECK:   %[[VAL_23:.*]] = hlfir.designate %[[VAL_14]]#0 
(%[[VAL_17]])  : (!fir.heap>, index) -> !fir.ref
-// CHECK:   hlfir.assign %[[VAL_22]] to %[[VAL_23]] temporary_lhs 
: i32, !fir.ref
+// CHECK: omp.loop_n

[llvm-branch-commits] [compiler-rt] [sanitizer_common] Fix internal_*stat on Linux/sparc64 (PR #101236)

2024-08-20 Thread Rainer Orth via llvm-branch-commits

https://github.com/rorth closed https://github.com/llvm/llvm-project/pull/101236


[llvm-branch-commits] [compiler-rt] [sanitizer_common] Fix internal_*stat on Linux/sparc64 (PR #101236)

2024-08-20 Thread Rainer Orth via llvm-branch-commits

rorth wrote:

This can now be closed: one part (PR #101012) has already been merged and the 
necessary rest is now PR #104916.

https://github.com/llvm/llvm-project/pull/101236


[llvm-branch-commits] [llvm] [WIP] AMDGPU: Handle v_add* in eliminateFrameIndex (PR #102346)

2024-08-20 Thread via llvm-branch-commits

github-actions[bot] wrote:




:warning: C/C++ code formatter, clang-format found issues in your code. 
:warning:



You can test this locally with the following command:


```bash
git-clang-format --diff b9a85cef41e26f6f468c87a00f2072ea798ec628 \
  af30ca7d438747ce58dd79ef293a9914ceef592f --extensions cpp -- \
  llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp
```





View the diff from clang-format here.


```diff
diff --git a/llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp b/llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp
index 0427b021d4..505802c679 100644
--- a/llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp
+++ b/llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp
@@ -2534,15 +2534,11 @@ bool SIRegisterInfo::eliminateFrameIndex(MachineBasicBlock::iterator MI,
 if (isVGPRClass(getPhysRegBaseClass(MaterializedReg))) {
   // If we know we have a VGPR already, it's more likely the other
   // operand is a legal vsrc0.
-  AddI32
-.add(*OtherOp)
-.addReg(MaterializedReg, MaterializedRegFlags);
+  AddI32.add(*OtherOp).addReg(MaterializedReg, MaterializedRegFlags);
 } else {
   // Commute operands to avoid violating VOP2 restrictions. This will
   // typically happen when using scratch.
-  AddI32
-.addReg(MaterializedReg, MaterializedRegFlags)
-.add(*OtherOp);
+  AddI32.addReg(MaterializedReg, MaterializedRegFlags).add(*OtherOp);
 }
 
 if (MI->getOpcode() == AMDGPU::V_ADD_CO_U32_e64 ||

```




https://github.com/llvm/llvm-project/pull/102346


[llvm-branch-commits] [llvm] [MC][NFC] Reduce Address2ProbesMap size (PR #102904)

2024-08-20 Thread Amir Ayupov via llvm-branch-commits

aaupov wrote:

> Tested End-to-End on llvm-profgen on a heavy workload(ported all the stacked 
> PR) : The running time is neutral, the maximum RSS is reduced by 3GB (from 
> 70GB to 67GB)   cc @WenleiHe 

To double-check: did you test with or without dwarf-correlation? I tested once 
with it; as expected, pseudo probe parsing wasn't engaged, so there was no effect.

https://github.com/llvm/llvm-project/pull/102904


[llvm-branch-commits] [DirectX] Implement metadata lowering for resources (PR #104447)

2024-08-20 Thread Xiang Li via llvm-branch-commits


@@ -13,27 +13,52 @@
 #include "DXILShaderFlags.h"
 #include "DirectX.h"
 #include "llvm/ADT/StringSet.h"
+#include "llvm/Analysis/DXILResource.h"
 #include "llvm/IR/Constants.h"
 #include "llvm/IR/Metadata.h"
 #include "llvm/IR/Module.h"
+#include "llvm/InitializePasses.h"
 #include "llvm/Pass.h"
 #include "llvm/TargetParser/Triple.h"
 
 using namespace llvm;
 using namespace llvm::dxil;
 
-static void emitResourceMetadata(Module &M,
+static void emitResourceMetadata(Module &M, const DXILResourceMap &DRM,
  const dxil::Resources &MDResources) {
-  Metadata *SRVMD = nullptr, *UAVMD = nullptr, *CBufMD = nullptr,
-   *SmpMD = nullptr;
-  bool HasResources = false;
+  LLVMContext &Context = M.getContext();
+
+  SmallVector SRVs, UAVs, CBufs, Smps;
+  for (auto [_, RI] : DRM) {
+switch (RI.getResourceClass()) {

python3kgae wrote:

For a resource array like `Buffer B[10]` where only `B[2]` and `B[5]` are used, 
will `B[2]` and `B[5]` both be in the DRM and get the same RI?
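
For context, the loop quoted above walks the resource map and sorts each entry into one of four lists by resource class. A minimal standalone sketch of that bucketing is below; `ResourceClass`, `ResourceInfo`, `Buckets` and `bucketByClass` are hypothetical stand-ins written for illustration and are not the real `llvm::dxil` types. It does not answer the array-element question, it only shows the shape of the loop.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-ins; not the real llvm::dxil types.
enum class ResourceClass { SRV, UAV, CBuffer, Sampler };

struct ResourceInfo {
  std::string Name;
  ResourceClass Class;
};

// One list per resource class, echoing SRVs/UAVs/CBufs/Smps in the patch.
struct Buckets {
  std::vector<std::string> SRVs, UAVs, CBufs, Smps;
};

// Sort every resource into the list that matches its class.
Buckets bucketByClass(const std::vector<ResourceInfo> &Resources) {
  Buckets B;
  for (const ResourceInfo &RI : Resources) {
    switch (RI.Class) {
    case ResourceClass::SRV:
      B.SRVs.push_back(RI.Name);
      break;
    case ResourceClass::UAV:
      B.UAVs.push_back(RI.Name);
      break;
    case ResourceClass::CBuffer:
      B.CBufs.push_back(RI.Name);
      break;
    case ResourceClass::Sampler:
      B.Smps.push_back(RI.Name);
      break;
    }
  }
  return B;
}
```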

https://github.com/llvm/llvm-project/pull/104447


[llvm-branch-commits] [llvm] [mlir] [OpenMP]Update use_device_clause lowering (PR #101707)

2024-08-20 Thread Akash Banerjee via llvm-branch-commits

https://github.com/TIFitis updated 
https://github.com/llvm/llvm-project/pull/101707

>From 3a2afe783bfd65c981424fb14d2b0f42ea0b6618 Mon Sep 17 00:00:00 2001
From: Akash Banerjee 
Date: Fri, 2 Aug 2024 17:11:21 +0100
Subject: [PATCH 1/2] [OpenMP]Update use_device_clause lowering

This patch updates the use_device_ptr and use_device_addr clauses to use the 
mapInfoOps for lowering. This allows all the types that are handled by the map 
clauses such as derived types to also be supported by the use_device_clauses.

This is patch 2/2 in a series of patches.
---
 llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp |   2 +-
 .../OpenMP/OpenMPToLLVMIRTranslation.cpp  | 284 ++
 mlir/test/Target/LLVMIR/omptarget-llvm.mlir   |  16 +-
 .../openmp-target-use-device-nested.mlir  |  27 ++
 4 files changed, 194 insertions(+), 135 deletions(-)
 create mode 100644 mlir/test/Target/LLVMIR/openmp-target-use-device-nested.mlir

diff --git a/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp 
b/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
index 83fec194d73904..f5d94069ad6f4c 100644
--- a/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
+++ b/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
@@ -6357,7 +6357,7 @@ OpenMPIRBuilder::InsertPointTy 
OpenMPIRBuilder::createTargetData(
   // Disable TargetData CodeGen on Device pass.
   if (Config.IsTargetDevice.value_or(false)) {
 if (BodyGenCB)
-  Builder.restoreIP(BodyGenCB(Builder.saveIP(), BodyGenTy::NoPriv));
+  Builder.restoreIP(BodyGenCB(CodeGenIP, BodyGenTy::NoPriv));
 return Builder.saveIP();
   }
 
diff --git 
a/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp 
b/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
index 458d05d5059db7..78c460c50cbe5e 100644
--- a/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
+++ b/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
@@ -2110,6 +2110,8 @@ getRefPtrIfDeclareTarget(mlir::Value value,
 struct MapInfoData : llvm::OpenMPIRBuilder::MapInfosTy {
   llvm::SmallVector IsDeclareTarget;
   llvm::SmallVector IsAMember;
+  // Identify if mapping was added by mapClause or use_device clauses.
+  llvm::SmallVector IsAMapping;
   llvm::SmallVector MapClause;
   llvm::SmallVector OriginalValue;
   // Stripped off array/pointer to get the underlying
@@ -2193,62 +2195,125 @@ llvm::Value *getSizeInBytes(DataLayout &dl, const 
mlir::Type &type,
   return builder.getInt64(dl.getTypeSizeInBits(type) / 8);
 }
 
-void collectMapDataFromMapVars(MapInfoData &mapData,
-   llvm::SmallVectorImpl &mapVars,
-   LLVM::ModuleTranslation &moduleTranslation,
-   DataLayout &dl, llvm::IRBuilderBase &builder) {
+void collectMapDataFromMapOperands(
+MapInfoData &mapData, llvm::SmallVectorImpl &mapVars,
+LLVM::ModuleTranslation &moduleTranslation, DataLayout &dl,
+llvm::IRBuilderBase &builder,
+const llvm::ArrayRef &useDevPtrOperands = {},
+const llvm::ArrayRef &useDevAddrOperands = {}) {
+  // Process MapOperands
   for (mlir::Value mapValue : mapVars) {
-if (auto mapOp = mlir::dyn_cast_if_present(
-mapValue.getDefiningOp())) {
-  mlir::Value offloadPtr =
-  mapOp.getVarPtrPtr() ? mapOp.getVarPtrPtr() : mapOp.getVarPtr();
-  mapData.OriginalValue.push_back(
-  moduleTranslation.lookupValue(offloadPtr));
-  mapData.Pointers.push_back(mapData.OriginalValue.back());
-
-  if (llvm::Value *refPtr =
-  getRefPtrIfDeclareTarget(offloadPtr,
-   moduleTranslation)) { // declare target
-mapData.IsDeclareTarget.push_back(true);
-mapData.BasePointers.push_back(refPtr);
-  } else { // regular mapped variable
-mapData.IsDeclareTarget.push_back(false);
-mapData.BasePointers.push_back(mapData.OriginalValue.back());
-  }
+auto mapOp = mlir::cast(mapValue.getDefiningOp());
+mlir::Value offloadPtr =
+mapOp.getVarPtrPtr() ? mapOp.getVarPtrPtr() : mapOp.getVarPtr();
+mapData.OriginalValue.push_back(moduleTranslation.lookupValue(offloadPtr));
+mapData.Pointers.push_back(mapData.OriginalValue.back());
+
+if (llvm::Value *refPtr =
+getRefPtrIfDeclareTarget(offloadPtr,
+ moduleTranslation)) { // declare target
+  mapData.IsDeclareTarget.push_back(true);
+  mapData.BasePointers.push_back(refPtr);
+} else { // regular mapped variable
+  mapData.IsDeclareTarget.push_back(false);
+  mapData.BasePointers.push_back(mapData.OriginalValue.back());
+}
 
-  mapData.BaseType.push_back(
-  moduleTranslation.convertType(mapOp.getVarType()));
-  mapData.Sizes.push_back(
-  getSizeInBytes(dl, mapOp.getVarType(), mapOp, 
mapData.Pointers.back(),
- mapData.BaseType.back(), builder, moduleTranslation));
-  mapData.MapClause.push_back(mapOp.getO

[llvm-branch-commits] [llvm] [mlir] [OpenMP]Update use_device_clause lowering (PR #101707)

2024-08-20 Thread Akash Banerjee via llvm-branch-commits

TIFitis wrote:

Thanks @Dinistro for the comments, I've addressed them in the latest revision.

https://github.com/llvm/llvm-project/pull/101707


[llvm-branch-commits] [mlir] [mlir][OpenMP] Convert reduction alloc region to LLVMIR (PR #102524)

2024-08-20 Thread Tom Eccles via llvm-branch-commits

tblah wrote:

ping for review

https://github.com/llvm/llvm-project/pull/102524


[llvm-branch-commits] [flang] [flang][OpenMP] use reduction alloc region (PR #102525)

2024-08-20 Thread Tom Eccles via llvm-branch-commits

tblah wrote:

ping for review

https://github.com/llvm/llvm-project/pull/102525


[llvm-branch-commits] [llvm] release/19.x: [AArch64] Don't replace dst of SWP instructions with (X|W)ZR (#102139) (PR #102316)

2024-08-20 Thread Luke Geeson via llvm-branch-commits

lukeg101 wrote:

Prevents a concurrency-related compiler bug (a reordering bug introduced by 
LLVM) that arises when optimisations rewrite the destination register of SWP 
instructions to be the zero register when compiling an atomic exchange 
operation. For more information on this bug and how it was found, please see: 
https://lukegeeson.com/publications/2024-03-05-CGO/
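
As a rough illustration of the source pattern involved, the sketch below performs an atomic exchange and discards the previous value, which is the kind of code where a backend could be tempted to write the result to the zero register. The names `flag` and `publish` are hypothetical, and this is not a reproducer: whether the exchange lowers to an SWP instruction depends on the target features and compiler version.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical shared flag used only for this illustration.
std::atomic<int> flag{0};

// Atomic exchange whose old value is deliberately unused. The ordering
// requested here must still be honoured even though the result is dropped.
void publish() {
  (void)flag.exchange(1, std::memory_order_acq_rel);
}
```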



https://github.com/llvm/llvm-project/pull/102316


[llvm-branch-commits] [compiler-rt] [sanitizer_common] Make sanitizer_linux.cpp kernel_stat* handling Lin… (PR #104916)

2024-08-20 Thread Rainer Orth via llvm-branch-commits

rorth wrote:

The Solaris/sparcv9 build just completed successfully: no regressions relative 
to rc2.

https://github.com/llvm/llvm-project/pull/104916


[llvm-branch-commits] [mlir] [mlir][OpenMP] Convert reduction alloc region to LLVMIR (PR #102524)

2024-08-20 Thread Mats Petersson via llvm-branch-commits


@@ -594,45 +594,85 @@ convertOmpOrderedRegion(Operation &opInst, 
llvm::IRBuilderBase &builder,
 
 /// Allocate space for privatized reduction variables.
 template 
-static void allocByValReductionVars(
-T loop, ArrayRef reductionArgs, llvm::IRBuilderBase 
&builder,
-LLVM::ModuleTranslation &moduleTranslation,
-llvm::OpenMPIRBuilder::InsertPointTy &allocaIP,
-SmallVectorImpl &reductionDecls,
-SmallVectorImpl &privateReductionVariables,
-DenseMap &reductionVariableMap,
-llvm::ArrayRef isByRefs) {
+static LogicalResult
+allocReductionVars(T loop, ArrayRef reductionArgs,
+   llvm::IRBuilderBase &builder,
+   LLVM::ModuleTranslation &moduleTranslation,
+   llvm::OpenMPIRBuilder::InsertPointTy &allocaIP,
+   SmallVectorImpl &reductionDecls,
+   SmallVectorImpl &privateReductionVariables,
+   DenseMap &reductionVariableMap,
+   llvm::ArrayRef isByRefs) {
   llvm::IRBuilderBase::InsertPointGuard guard(builder);
   builder.SetInsertPoint(allocaIP.getBlock()->getTerminator());
 
+  // delay creating stores until after all allocas
+  SmallVector> storesToCreate;
+  storesToCreate.reserve(loop.getNumReductionVars());
+
   for (std::size_t i = 0; i < loop.getNumReductionVars(); ++i) {
-if (isByRefs[i])
-  continue;
-llvm::Value *var = builder.CreateAlloca(
-moduleTranslation.convertType(reductionDecls[i].getType()));
-moduleTranslation.mapValue(reductionArgs[i], var);
-privateReductionVariables[i] = var;
-reductionVariableMap.try_emplace(loop.getReductionVars()[i], var);
+Region &allocRegion = reductionDecls[i].getAllocRegion();
+if (isByRefs[i]) {
+  if (allocRegion.empty())

Leporacanthicus wrote:

What does allocRegion empty mean here? It's not been created? If so, where does 
the alloca go?

https://github.com/llvm/llvm-project/pull/102524


[llvm-branch-commits] [mlir] [mlir][OpenMP] Convert reduction alloc region to LLVMIR (PR #102524)

2024-08-20 Thread Mats Petersson via llvm-branch-commits

https://github.com/Leporacanthicus edited 
https://github.com/llvm/llvm-project/pull/102524


[llvm-branch-commits] [flang] [flang][OpenMP] use reduction alloc region (PR #102525)

2024-08-20 Thread Mats Petersson via llvm-branch-commits

https://github.com/Leporacanthicus approved this pull request.

LGTM. Probably good to have a second approval tho'.

https://github.com/llvm/llvm-project/pull/102525


[llvm-branch-commits] [llvm] AMDGPU: Remove flat/global atomic fadd v2bf16 intrinsics (PR #97050)

2024-08-20 Thread Matt Arsenault via llvm-branch-commits

arsenm wrote:

ping

https://github.com/llvm/llvm-project/pull/97050


[llvm-branch-commits] [libcxx] [libc++][format][4/7] Improves std::format_to performance. (PR #101823)

2024-08-20 Thread Louis Dionne via llvm-branch-commits


@@ -722,6 +724,95 @@ class _LIBCPP_TEMPLATE_VIS __allocating_buffer : public 
__output_buffer<_CharT>
   }
 };
 
+// A buffer that directly writes to the underlying buffer.
+template 
+class _LIBCPP_TEMPLATE_VIS __direct_iterator_buffer : public 
__output_buffer<_CharT> {
+public:
+  [[nodiscard]] _LIBCPP_HIDE_FROM_ABI explicit __direct_iterator_buffer(_OutIt 
__out_it)
+  : __output_buffer<_CharT>{std::__unwrap_iter(__out_it), __buffer_size, 
__prepare_write}, __out_it_(__out_it) {}
+
+  [[nodiscard]] _LIBCPP_HIDE_FROM_ABI _OutIt __out_it() && { return __out_it_ 
+ this->__size(); }
+
+private:
+  // The function format_to expects a buffer large enough for the output. The
+  // function format_to_n has its own helper class that restricts the number of
+  // write options. So this function class can pretend to have an infinite
+  // buffer.
+  static constexpr size_t __buffer_size = -1;
+
+  _OutIt __out_it_;
+
+  _LIBCPP_HIDE_FROM_ABI static void
+  __prepare_write([[maybe_unused]] __output_buffer<_CharT>& __buffer, 
[[maybe_unused]] size_t __size_hint) {
+std::__throw_length_error("__direct_iterator_buffer");
+  }
+};
+
+// A buffer that writes its output to the end of a container.
+template 
+class _LIBCPP_TEMPLATE_VIS __container_inserter_buffer : public 
__output_buffer<_CharT> {
+public:
+  [[nodiscard]] _LIBCPP_HIDE_FROM_ABI explicit 
__container_inserter_buffer(_OutIt __out_it)
+  : __output_buffer<_CharT>{__buffer_, __buffer_size, __prepare_write}, 
__container_{__out_it.__get_container()} {}
+
+  [[nodiscard]] _LIBCPP_HIDE_FROM_ABI auto __out_it() && {
+__container_->insert(__container_->end(), __buffer_, __buffer_ + 
this->__size());
+return std::back_inserter(*__container_);
+  }
+
+private:
+  typename __back_insert_iterator_container<_OutIt>::type* __container_;
+
+  // This class uses a fixed size buffer and appends the elements in
+  // __buffer_size chunks. An alternative would be to use an allocating buffer
+  // and append the output in one write operation. Benchmarking showed no
+  // performance difference.
+  static constexpr size_t __buffer_size = 256;
+  _CharT __buffer_[__buffer_size];
+
+  _LIBCPP_HIDE_FROM_ABI void __prepare_write() {
+__container_->insert(__container_->end(), __buffer_, __buffer_ + 
this->__size());
+this->__buffer_flused();
+  }
+
+  _LIBCPP_HIDE_FROM_ABI static void
+  __prepare_write(__output_buffer<_CharT>& __buffer, [[maybe_unused]] size_t 
__size_hint) {
+static_cast<__container_inserter_buffer<_OutIt, 
_CharT>&>(__buffer).__prepare_write();
+  }
+};
+
+// A buffer that writes to an iterator.
+//
+// Unlinke the __container_inserter_buffer this class' perfomance does benefit

ldionne wrote:

```suggestion
// Unlike the __container_inserter_buffer this class' performance does benefit
```
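
The `__container_inserter_buffer` quoted above collects characters in a fixed 256-element array and appends them to the container one chunk at a time. A simplified sketch of that idea, assuming a `std::string` target, is below; `ChunkedAppender` is a hypothetical class written for illustration and is not the libc++ implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical, simplified analogue of a fixed-size buffer that appends
// to a container in chunks. Not the libc++ class.
class ChunkedAppender {
public:
  explicit ChunkedAppender(std::string &out) : out_(&out) {}

  // Store one character, flushing first if the buffer is already full.
  void push_back(char c) {
    if (size_ == kBufferSize)
      flush();
    buffer_[size_++] = c;
  }

  // Append the buffered characters to the container and reset the buffer.
  void flush() {
    out_->insert(out_->end(), buffer_, buffer_ + size_);
    size_ = 0;
  }

private:
  static constexpr std::size_t kBufferSize = 256;
  std::string *out_;
  char buffer_[kBufferSize];
  std::size_t size_ = 0;
};
```

The caller has to invoke `flush()` once at the end, which plays the role that `__out_it()` has in the quoted patch.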

https://github.com/llvm/llvm-project/pull/101823


[llvm-branch-commits] [llvm] AMDGPU: Remove flat/global atomic fadd v2bf16 intrinsics (PR #97050)

2024-08-20 Thread Yaxun Liu via llvm-branch-commits

https://github.com/yxsamliu approved this pull request.


https://github.com/llvm/llvm-project/pull/97050


[llvm-branch-commits] [llvm] [mlir] [OpenMP]Update use_device_clause lowering (PR #101707)

2024-08-20 Thread Christian Ulmann via llvm-branch-commits

Dinistro wrote:

As I'm not too familiar with these parts of MLIR, I'll not be able to review 
this properly. Does someone else have capacity to do a pass on the logic of 
this?

https://github.com/llvm/llvm-project/pull/101707


[llvm-branch-commits] [clang] release/19.x: [clang][modules] Built-in modules are not correctly enabled for Mac Catalyst (#104872) (PR #105093)

2024-08-20 Thread via llvm-branch-commits

https://github.com/llvmbot created 
https://github.com/llvm/llvm-project/pull/105093

Backport 4c5ef6690040383956461828457ac27f7f912edb 
b9864387d9d00e1d4888181460d05dbc92364d75

Requested by: @ian-twilightcoder

>From 1c36e2afffa74d2c25e0c74d96c1a0ef84e70aa0 Mon Sep 17 00:00:00 2001
From: Ian Anderson 
Date: Fri, 9 Aug 2024 07:06:11 -0700
Subject: [PATCH 1/2] Fix a unit test input file (#102567)

I forgot to update the version info in the SDKSettings file when I
updated it to the real version relevant to the test.

(cherry picked from commit 4c5ef6690040383956461828457ac27f7f912edb)
---
 clang/test/Driver/Inputs/MacOSX15.0.sdk/SDKSettings.json | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/clang/test/Driver/Inputs/MacOSX15.0.sdk/SDKSettings.json 
b/clang/test/Driver/Inputs/MacOSX15.0.sdk/SDKSettings.json
index 77b70e1a83c19c..ced45d5c219962 100644
--- a/clang/test/Driver/Inputs/MacOSX15.0.sdk/SDKSettings.json
+++ b/clang/test/Driver/Inputs/MacOSX15.0.sdk/SDKSettings.json
@@ -1 +1 @@
-{"Version":"990.0", "MaximumDeploymentTarget": "99.0.99"}
+{"Version":"15.0", "MaximumDeploymentTarget": "15.0.99"}

>From 6b75d57acd8d679c55cdc7261da06805bd416b10 Mon Sep 17 00:00:00 2001
From: Ian Anderson 
Date: Tue, 20 Aug 2024 03:29:11 -0700
Subject: [PATCH 2/2] [clang][modules] Built-in modules are not correctly
 enabled for Mac Catalyst (#104872)

Mac Catalyst is the iOS platform, but it builds against the macOS SDK
and so it needs to be checking the macOS SDK version instead of the iOS
one. Add tests against a greater-than SDK version just to make sure this
works beyond the initially supporting SDKs.

(cherry picked from commit b9864387d9d00e1d4888181460d05dbc92364d75)
---
 clang/lib/Driver/ToolChains/Darwin.cpp | 10 +-
 .../test/Driver/Inputs/MacOSX15.1.sdk/SDKSettings.json |  1 +
 clang/test/Driver/darwin-builtin-modules.c |  3 +++
 3 files changed, 13 insertions(+), 1 deletion(-)
 create mode 100644 clang/test/Driver/Inputs/MacOSX15.1.sdk/SDKSettings.json

diff --git a/clang/lib/Driver/ToolChains/Darwin.cpp 
b/clang/lib/Driver/ToolChains/Darwin.cpp
index 17d57b2f7eedab..e576efaf5ca884 100644
--- a/clang/lib/Driver/ToolChains/Darwin.cpp
+++ b/clang/lib/Driver/ToolChains/Darwin.cpp
@@ -2953,7 +2953,15 @@ static bool sdkSupportsBuiltinModules(
   case Darwin::MacOS:
 return SDKVersion >= VersionTuple(15U);
   case Darwin::IPhoneOS:
-return SDKVersion >= VersionTuple(18U);
+switch (TargetEnvironment) {
+case Darwin::MacCatalyst:
+  // Mac Catalyst uses `-target arm64-apple-ios18.0-macabi` so the platform
+  // is iOS, but it builds with the macOS SDK, so it's the macOS SDK 
version
+  // that's relevant.
+  return SDKVersion >= VersionTuple(15U);
+default:
+  return SDKVersion >= VersionTuple(18U);
+}
   case Darwin::TvOS:
 return SDKVersion >= VersionTuple(18U);
   case Darwin::WatchOS:
diff --git a/clang/test/Driver/Inputs/MacOSX15.1.sdk/SDKSettings.json 
b/clang/test/Driver/Inputs/MacOSX15.1.sdk/SDKSettings.json
new file mode 100644
index 00000000000000..d46295b2ab5a17
--- /dev/null
+++ b/clang/test/Driver/Inputs/MacOSX15.1.sdk/SDKSettings.json
@@ -0,0 +1 @@
+{"Version":"15.1", "MaximumDeploymentTarget": "15.1.99"}
diff --git a/clang/test/Driver/darwin-builtin-modules.c 
b/clang/test/Driver/darwin-builtin-modules.c
index ec515133be8aba..4564d7317d7abe 100644
--- a/clang/test/Driver/darwin-builtin-modules.c
+++ b/clang/test/Driver/darwin-builtin-modules.c
@@ -8,5 +8,8 @@
 
 // RUN: %clang -isysroot %S/Inputs/MacOSX15.0.sdk -target x86_64-apple-macos14.0 -### %s 2>&1 | FileCheck --check-prefix=CHECK_FUTURE %s
 // RUN: %clang -isysroot %S/Inputs/MacOSX15.0.sdk -target x86_64-apple-macos15.0 -### %s 2>&1 | FileCheck --check-prefix=CHECK_FUTURE %s
+// RUN: %clang -isysroot %S/Inputs/MacOSX15.0.sdk -target x86_64-apple-ios18.0-macabi -### %s 2>&1 | FileCheck --check-prefix=CHECK_FUTURE %s
+// RUN: %clang -isysroot %S/Inputs/MacOSX15.1.sdk -target x86_64-apple-macos15.1 -darwin-target-variant x86_64-apple-ios18.1-macabi -### %s 2>&1 | FileCheck --check-prefix=CHECK_FUTURE %s
+// RUN: %clang -isysroot %S/Inputs/MacOSX15.1.sdk -target x86_64-apple-ios18.1-macabi -darwin-target-variant x86_64-apple-macos15.1 -### %s 2>&1 | FileCheck --check-prefix=CHECK_FUTURE %s
 // RUN: %clang -isysroot %S/Inputs/DriverKit23.0.sdk -target arm64-apple-driverkit23.0 -### %s 2>&1 | FileCheck --check-prefix=CHECK_FUTURE %s
 // CHECK_FUTURE-NOT: -fbuiltin-headers-in-system-modules
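
For readers following the Darwin.cpp hunk, the decision it encodes can be modeled in isolation. The sketch below is a simplified stand-in, not the driver code: the `Platform` and `Environment` enums, the plain integer major version, and the names `supportsBuiltinModules` and `checkCatalystExample` are assumptions made for illustration, and only the cases visible in the hunk are covered.

```cpp
#include <cassert>

// Hypothetical stand-ins for the driver's platform and environment enums.
enum class Platform { MacOS, IPhoneOS, TvOS };
enum class Environment { Native, Simulator, MacCatalyst };

// Models the version check from the hunk using only the SDK major version.
// Mac Catalyst targets report the iOS platform but build against the macOS
// SDK, so the macOS threshold (15) applies instead of the iOS one (18).
bool supportsBuiltinModules(Platform P, Environment E, unsigned SDKMajor) {
  switch (P) {
  case Platform::MacOS:
    return SDKMajor >= 15;
  case Platform::IPhoneOS:
    if (E == Environment::MacCatalyst)
      return SDKMajor >= 15;
    return SDKMajor >= 18;
  case Platform::TvOS:
    return SDKMajor >= 18;
  }
  return false; // not reached for the enumerators above
}

// Self-check: SDK major version 15 is enough for a Catalyst target,
// but the same number is too low when the target is plain iOS.
void checkCatalystExample() {
  assert(supportsBuiltinModules(Platform::IPhoneOS, Environment::MacCatalyst, 15));
  assert(!supportsBuiltinModules(Platform::IPhoneOS, Environment::Native, 15));
}
```

Before the change, the model's Catalyst case would have used the iOS threshold of 18, which a macOS 15 SDK can never satisfy.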

___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] release/19.x: [clang][modules] Built-in modules are not correctly enabled for Mac Catalyst (#104872) (PR #105093)

2024-08-20 Thread via llvm-branch-commits

https://github.com/llvmbot milestoned 
https://github.com/llvm/llvm-project/pull/105093


[llvm-branch-commits] [clang] release/19.x: [clang][modules] Built-in modules are not correctly enabled for Mac Catalyst (#104872) (PR #105093)

2024-08-20 Thread via llvm-branch-commits

llvmbot wrote:



@llvm/pr-subscribers-clang

@llvm/pr-subscribers-clang-driver

Author: None (llvmbot)


Changes

Backport 4c5ef6690040383956461828457ac27f7f912edb 
b9864387d9d00e1d4888181460d05dbc92364d75

Requested by: @ian-twilightcoder

---
Full diff: https://github.com/llvm/llvm-project/pull/105093.diff


4 Files Affected:

- (modified) clang/lib/Driver/ToolChains/Darwin.cpp (+9-1) 
- (modified) clang/test/Driver/Inputs/MacOSX15.0.sdk/SDKSettings.json (+1-1) 
- (added) clang/test/Driver/Inputs/MacOSX15.1.sdk/SDKSettings.json (+1) 
- (modified) clang/test/Driver/darwin-builtin-modules.c (+3) 


``diff
diff --git a/clang/lib/Driver/ToolChains/Darwin.cpp b/clang/lib/Driver/ToolChains/Darwin.cpp
index 17d57b2f7eedab..e576efaf5ca884 100644
--- a/clang/lib/Driver/ToolChains/Darwin.cpp
+++ b/clang/lib/Driver/ToolChains/Darwin.cpp
@@ -2953,7 +2953,15 @@ static bool sdkSupportsBuiltinModules(
   case Darwin::MacOS:
 return SDKVersion >= VersionTuple(15U);
   case Darwin::IPhoneOS:
-return SDKVersion >= VersionTuple(18U);
+switch (TargetEnvironment) {
+case Darwin::MacCatalyst:
+  // Mac Catalyst uses `-target arm64-apple-ios18.0-macabi` so the platform
+  // is iOS, but it builds with the macOS SDK, so it's the macOS SDK version
+  // that's relevant.
+  return SDKVersion >= VersionTuple(15U);
+default:
+  return SDKVersion >= VersionTuple(18U);
+}
   case Darwin::TvOS:
 return SDKVersion >= VersionTuple(18U);
   case Darwin::WatchOS:
diff --git a/clang/test/Driver/Inputs/MacOSX15.0.sdk/SDKSettings.json b/clang/test/Driver/Inputs/MacOSX15.0.sdk/SDKSettings.json
index 77b70e1a83c19c..ced45d5c219962 100644
--- a/clang/test/Driver/Inputs/MacOSX15.0.sdk/SDKSettings.json
+++ b/clang/test/Driver/Inputs/MacOSX15.0.sdk/SDKSettings.json
@@ -1 +1 @@
-{"Version":"990.0", "MaximumDeploymentTarget": "99.0.99"}
+{"Version":"15.0", "MaximumDeploymentTarget": "15.0.99"}
diff --git a/clang/test/Driver/Inputs/MacOSX15.1.sdk/SDKSettings.json b/clang/test/Driver/Inputs/MacOSX15.1.sdk/SDKSettings.json
new file mode 100644
index 00..d46295b2ab5a17
--- /dev/null
+++ b/clang/test/Driver/Inputs/MacOSX15.1.sdk/SDKSettings.json
@@ -0,0 +1 @@
+{"Version":"15.1", "MaximumDeploymentTarget": "15.1.99"}
diff --git a/clang/test/Driver/darwin-builtin-modules.c b/clang/test/Driver/darwin-builtin-modules.c
index ec515133be8aba..4564d7317d7abe 100644
--- a/clang/test/Driver/darwin-builtin-modules.c
+++ b/clang/test/Driver/darwin-builtin-modules.c
@@ -8,5 +8,8 @@
 
 // RUN: %clang -isysroot %S/Inputs/MacOSX15.0.sdk -target x86_64-apple-macos14.0 -### %s 2>&1 | FileCheck --check-prefix=CHECK_FUTURE %s
 // RUN: %clang -isysroot %S/Inputs/MacOSX15.0.sdk -target x86_64-apple-macos15.0 -### %s 2>&1 | FileCheck --check-prefix=CHECK_FUTURE %s
+// RUN: %clang -isysroot %S/Inputs/MacOSX15.0.sdk -target x86_64-apple-ios18.0-macabi -### %s 2>&1 | FileCheck --check-prefix=CHECK_FUTURE %s
+// RUN: %clang -isysroot %S/Inputs/MacOSX15.1.sdk -target x86_64-apple-macos15.1 -darwin-target-variant x86_64-apple-ios18.1-macabi -### %s 2>&1 | FileCheck --check-prefix=CHECK_FUTURE %s
+// RUN: %clang -isysroot %S/Inputs/MacOSX15.1.sdk -target x86_64-apple-ios18.1-macabi -darwin-target-variant x86_64-apple-macos15.1 -### %s 2>&1 | FileCheck --check-prefix=CHECK_FUTURE %s
 // RUN: %clang -isysroot %S/Inputs/DriverKit23.0.sdk -target arm64-apple-driverkit23.0 -### %s 2>&1 | FileCheck --check-prefix=CHECK_FUTURE %s
 // CHECK_FUTURE-NOT: -fbuiltin-headers-in-system-modules

``




https://github.com/llvm/llvm-project/pull/105093


[llvm-branch-commits] [llvm] release/19.x: [SPARC] Remove assertions in printOperand for inline asm operands (#104692) (PR #105096)

2024-08-20 Thread via llvm-branch-commits

https://github.com/llvmbot milestoned 
https://github.com/llvm/llvm-project/pull/105096


[llvm-branch-commits] [llvm] release/19.x: [SPARC] Remove assertions in printOperand for inline asm operands (#104692) (PR #105096)

2024-08-20 Thread via llvm-branch-commits

https://github.com/llvmbot created 
https://github.com/llvm/llvm-project/pull/105096

Backport 576b7a781aac6b1d60a72248894b50e565e9185a

Requested by: @s-barannikov

>From 7113e7dccfc2e326abd49919a3361266c6443fe2 Mon Sep 17 00:00:00 2001
From: Koakuma 
Date: Tue, 20 Aug 2024 20:05:06 +0700
Subject: [PATCH] [SPARC] Remove assertions in printOperand for inline asm
 operands (#104692)

Inline asm operands could contain any kind of relocation, so remove the
checks.

Fixes https://github.com/llvm/llvm-project/issues/103493

(cherry picked from commit 576b7a781aac6b1d60a72248894b50e565e9185a)
---
 llvm/lib/Target/Sparc/SparcAsmPrinter.cpp | 51 ---
 llvm/test/CodeGen/SPARC/inlineasm.ll  | 10 +
 2 files changed, 10 insertions(+), 51 deletions(-)

diff --git a/llvm/lib/Target/Sparc/SparcAsmPrinter.cpp b/llvm/lib/Target/Sparc/SparcAsmPrinter.cpp
index 6855471840e9db..71ec01aeb011ca 100644
--- a/llvm/lib/Target/Sparc/SparcAsmPrinter.cpp
+++ b/llvm/lib/Target/Sparc/SparcAsmPrinter.cpp
@@ -314,57 +314,6 @@ void SparcAsmPrinter::printOperand(const MachineInstr *MI, int opNum,
   const MachineOperand &MO = MI->getOperand (opNum);
   SparcMCExpr::VariantKind TF = (SparcMCExpr::VariantKind) MO.getTargetFlags();
 
-#ifndef NDEBUG
-  // Verify the target flags.
-  if (MO.isGlobal() || MO.isSymbol() || MO.isCPI()) {
-if (MI->getOpcode() == SP::CALL)
-  assert(TF == SparcMCExpr::VK_Sparc_None &&
- "Cannot handle target flags on call address");
-else if (MI->getOpcode() == SP::SETHIi)
-  assert((TF == SparcMCExpr::VK_Sparc_HI
-  || TF == SparcMCExpr::VK_Sparc_H44
-  || TF == SparcMCExpr::VK_Sparc_HH
-  || TF == SparcMCExpr::VK_Sparc_LM
-  || TF == SparcMCExpr::VK_Sparc_TLS_GD_HI22
-  || TF == SparcMCExpr::VK_Sparc_TLS_LDM_HI22
-  || TF == SparcMCExpr::VK_Sparc_TLS_LDO_HIX22
-  || TF == SparcMCExpr::VK_Sparc_TLS_IE_HI22
-  || TF == SparcMCExpr::VK_Sparc_TLS_LE_HIX22) &&
- "Invalid target flags for address operand on sethi");
-else if (MI->getOpcode() == SP::TLS_CALL)
-  assert((TF == SparcMCExpr::VK_Sparc_None
-  || TF == SparcMCExpr::VK_Sparc_TLS_GD_CALL
-  || TF == SparcMCExpr::VK_Sparc_TLS_LDM_CALL) &&
- "Cannot handle target flags on tls call address");
-else if (MI->getOpcode() == SP::TLS_ADDrr)
-  assert((TF == SparcMCExpr::VK_Sparc_TLS_GD_ADD
-  || TF == SparcMCExpr::VK_Sparc_TLS_LDM_ADD
-  || TF == SparcMCExpr::VK_Sparc_TLS_LDO_ADD
-  || TF == SparcMCExpr::VK_Sparc_TLS_IE_ADD) &&
- "Cannot handle target flags on add for TLS");
-else if (MI->getOpcode() == SP::TLS_LDrr)
-  assert(TF == SparcMCExpr::VK_Sparc_TLS_IE_LD &&
- "Cannot handle target flags on ld for TLS");
-else if (MI->getOpcode() == SP::TLS_LDXrr)
-  assert(TF == SparcMCExpr::VK_Sparc_TLS_IE_LDX &&
- "Cannot handle target flags on ldx for TLS");
-else if (MI->getOpcode() == SP::XORri)
-  assert((TF == SparcMCExpr::VK_Sparc_TLS_LDO_LOX10
-  || TF == SparcMCExpr::VK_Sparc_TLS_LE_LOX10) &&
- "Cannot handle target flags on xor for TLS");
-else
-  assert((TF == SparcMCExpr::VK_Sparc_LO
-  || TF == SparcMCExpr::VK_Sparc_M44
-  || TF == SparcMCExpr::VK_Sparc_L44
-  || TF == SparcMCExpr::VK_Sparc_HM
-  || TF == SparcMCExpr::VK_Sparc_TLS_GD_LO10
-  || TF == SparcMCExpr::VK_Sparc_TLS_LDM_LO10
-  || TF == SparcMCExpr::VK_Sparc_TLS_IE_LO10 ) &&
- "Invalid target flags for small address operand");
-  }
-#endif
-
-
   bool CloseParen = SparcMCExpr::printVariantKind(O, TF);
 
   switch (MO.getType()) {
diff --git a/llvm/test/CodeGen/SPARC/inlineasm.ll b/llvm/test/CodeGen/SPARC/inlineasm.ll
index 14ea0a2a126027..e2853f03a002e6 100644
--- a/llvm/test/CodeGen/SPARC/inlineasm.ll
+++ b/llvm/test/CodeGen/SPARC/inlineasm.ll
@@ -152,3 +152,13 @@ define i64 @test_twinword(){
   %1 = tail call i64 asm sideeffect "rd %asr5, ${0:L} \0A\09 srlx ${0:L}, 32, ${0:H}", "={i0}"()
   ret i64 %1
 }
+
+; CHECK-LABEL: test_symbol:
+; CHECK: ba,a brtarget
+define void @test_symbol() {
+Entry:
+  call void asm sideeffect "ba,a ${0}", "X"(ptr @brtarget)
+  unreachable
+}
+
+declare void @brtarget()
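
Why the removed block was a problem for inline asm can be shown with a reduced model. This is a sketch under assumptions, not the SPARC backend: the `Opcode` and `Flag` enums and the function `oldCheckPasses` are hypothetical, and each branch keeps only one representative flag from the original lists. It also assumes that an operand printed for inline asm matches none of the opcodes named in the hunk and carries no relocation flag, so it would fall through to the final `else` branch.

```cpp
#include <cassert>

// Hypothetical, reduced versions of the opcode and target-flag enums.
enum class Opcode { Call, Sethi, InlineAsm };
enum class Flag { None, Hi, Lo };

// Models the removed debug-only check: the accepted flag depends on the
// opcode, and every opcode not listed explicitly must carry a low-part flag.
bool oldCheckPasses(Opcode Op, Flag TF) {
  if (Op == Opcode::Call)
    return TF == Flag::None;
  if (Op == Opcode::Sethi)
    return TF == Flag::Hi;
  return TF == Flag::Lo; // catch-all branch for every other opcode
}

// Self-check: a flag-less symbol operand on an inline asm instruction is
// rejected by the catch-all branch, while a sethi with a high-part flag passes.
void checkInlineAsmExample() {
  assert(!oldCheckPasses(Opcode::InlineAsm, Flag::None));
  assert(oldCheckPasses(Opcode::Sethi, Flag::Hi));
}
```

Under those assumptions the catch-all branch is what would reject an operand like the one in `test_symbol`, which is consistent with the patch deleting the checks rather than extending the lists.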



[llvm-branch-commits] [llvm] release/19.x: [SPARC] Remove assertions in printOperand for inline asm operands (#104692) (PR #105096)

2024-08-20 Thread via llvm-branch-commits

llvmbot wrote:

@s-barannikov What do you think about merging this PR to the release branch?

https://github.com/llvm/llvm-project/pull/105096


[llvm-branch-commits] [libcxx] [libc++][format][3/7] Improves std::format performance. (PR #101817)

2024-08-20 Thread Louis Dionne via llvm-branch-commits


@@ -452,9 +452,9 @@ format_to(_OutIt __out_it, wformat_string<_Args...> __fmt, _Args&&... __args) {
 // fires too eagerly, see http://llvm.org/PR61563.
 template 
 [[nodiscard]] _LIBCPP_ALWAYS_INLINE inline _LIBCPP_HIDE_FROM_ABI string vformat(string_view __fmt, format_args __args) {
-  string __res;
-  std::vformat_to(std::back_inserter(__res), __fmt, __args);
-  return __res;
+  __format::__allocating_buffer __buffer;

ldionne wrote:

It would actually be really nice if `vformat_to(std::back_inserter(string), 
...)` were just as performant as using the allocating buffer. It seems to me 
that if using the `__allocating_buffer` directly is faster, there may be a 
problem with next patch's `buffer_selector` which would basically be picking a 
less efficient approach for the case of `std::back_inserter`.

https://github.com/llvm/llvm-project/pull/101817


[llvm-branch-commits] [llvm] release/19.x: [SPARC] Remove assertions in printOperand for inline asm operands (#104692) (PR #105096)

2024-08-20 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-backend-sparc

Author: None (llvmbot)


Changes

Backport 576b7a781aac6b1d60a72248894b50e565e9185a

Requested by: @s-barannikov

---
Full diff: https://github.com/llvm/llvm-project/pull/105096.diff


2 Files Affected:

- (modified) llvm/lib/Target/Sparc/SparcAsmPrinter.cpp (-51) 
- (modified) llvm/test/CodeGen/SPARC/inlineasm.ll (+10) 


``diff
diff --git a/llvm/lib/Target/Sparc/SparcAsmPrinter.cpp b/llvm/lib/Target/Sparc/SparcAsmPrinter.cpp
index 6855471840e9db..71ec01aeb011ca 100644
--- a/llvm/lib/Target/Sparc/SparcAsmPrinter.cpp
+++ b/llvm/lib/Target/Sparc/SparcAsmPrinter.cpp
@@ -314,57 +314,6 @@ void SparcAsmPrinter::printOperand(const MachineInstr *MI, int opNum,
   const MachineOperand &MO = MI->getOperand (opNum);
   SparcMCExpr::VariantKind TF = (SparcMCExpr::VariantKind) MO.getTargetFlags();
 
-#ifndef NDEBUG
-  // Verify the target flags.
-  if (MO.isGlobal() || MO.isSymbol() || MO.isCPI()) {
-if (MI->getOpcode() == SP::CALL)
-  assert(TF == SparcMCExpr::VK_Sparc_None &&
- "Cannot handle target flags on call address");
-else if (MI->getOpcode() == SP::SETHIi)
-  assert((TF == SparcMCExpr::VK_Sparc_HI
-  || TF == SparcMCExpr::VK_Sparc_H44
-  || TF == SparcMCExpr::VK_Sparc_HH
-  || TF == SparcMCExpr::VK_Sparc_LM
-  || TF == SparcMCExpr::VK_Sparc_TLS_GD_HI22
-  || TF == SparcMCExpr::VK_Sparc_TLS_LDM_HI22
-  || TF == SparcMCExpr::VK_Sparc_TLS_LDO_HIX22
-  || TF == SparcMCExpr::VK_Sparc_TLS_IE_HI22
-  || TF == SparcMCExpr::VK_Sparc_TLS_LE_HIX22) &&
- "Invalid target flags for address operand on sethi");
-else if (MI->getOpcode() == SP::TLS_CALL)
-  assert((TF == SparcMCExpr::VK_Sparc_None
-  || TF == SparcMCExpr::VK_Sparc_TLS_GD_CALL
-  || TF == SparcMCExpr::VK_Sparc_TLS_LDM_CALL) &&
- "Cannot handle target flags on tls call address");
-else if (MI->getOpcode() == SP::TLS_ADDrr)
-  assert((TF == SparcMCExpr::VK_Sparc_TLS_GD_ADD
-  || TF == SparcMCExpr::VK_Sparc_TLS_LDM_ADD
-  || TF == SparcMCExpr::VK_Sparc_TLS_LDO_ADD
-  || TF == SparcMCExpr::VK_Sparc_TLS_IE_ADD) &&
- "Cannot handle target flags on add for TLS");
-else if (MI->getOpcode() == SP::TLS_LDrr)
-  assert(TF == SparcMCExpr::VK_Sparc_TLS_IE_LD &&
- "Cannot handle target flags on ld for TLS");
-else if (MI->getOpcode() == SP::TLS_LDXrr)
-  assert(TF == SparcMCExpr::VK_Sparc_TLS_IE_LDX &&
- "Cannot handle target flags on ldx for TLS");
-else if (MI->getOpcode() == SP::XORri)
-  assert((TF == SparcMCExpr::VK_Sparc_TLS_LDO_LOX10
-  || TF == SparcMCExpr::VK_Sparc_TLS_LE_LOX10) &&
- "Cannot handle target flags on xor for TLS");
-else
-  assert((TF == SparcMCExpr::VK_Sparc_LO
-  || TF == SparcMCExpr::VK_Sparc_M44
-  || TF == SparcMCExpr::VK_Sparc_L44
-  || TF == SparcMCExpr::VK_Sparc_HM
-  || TF == SparcMCExpr::VK_Sparc_TLS_GD_LO10
-  || TF == SparcMCExpr::VK_Sparc_TLS_LDM_LO10
-  || TF == SparcMCExpr::VK_Sparc_TLS_IE_LO10 ) &&
- "Invalid target flags for small address operand");
-  }
-#endif
-
-
   bool CloseParen = SparcMCExpr::printVariantKind(O, TF);
 
   switch (MO.getType()) {
diff --git a/llvm/test/CodeGen/SPARC/inlineasm.ll b/llvm/test/CodeGen/SPARC/inlineasm.ll
index 14ea0a2a126027..e2853f03a002e6 100644
--- a/llvm/test/CodeGen/SPARC/inlineasm.ll
+++ b/llvm/test/CodeGen/SPARC/inlineasm.ll
@@ -152,3 +152,13 @@ define i64 @test_twinword(){
   %1 = tail call i64 asm sideeffect "rd %asr5, ${0:L} \0A\09 srlx ${0:L}, 32, ${0:H}", "={i0}"()
   ret i64 %1
 }
+
+; CHECK-LABEL: test_symbol:
+; CHECK: ba,a brtarget
+define void @test_symbol() {
+Entry:
+  call void asm sideeffect "ba,a ${0}", "X"(ptr @brtarget)
+  unreachable
+}
+
+declare void @brtarget()

``




https://github.com/llvm/llvm-project/pull/105096


[llvm-branch-commits] [clang] [llvm] Add AIX and PPC Clang/LLVM release notes for LLVM 19. (PR #105099)

2024-08-20 Thread Amy Kwan via llvm-branch-commits

https://github.com/amy-kwan milestoned 
https://github.com/llvm/llvm-project/pull/105099


[llvm-branch-commits] [clang] [llvm] Add AIX and PPC Clang/LLVM release notes for LLVM 19. (PR #105099)

2024-08-20 Thread Amy Kwan via llvm-branch-commits

https://github.com/amy-kwan created 
https://github.com/llvm/llvm-project/pull/105099

This PR adds AIX and PPC Clang/LLVM release notes for LLVM 19 to the 
`release/19.x` branch.

>From 1aa3221f169f8be0fbe6156d97543c326f6ef97a Mon Sep 17 00:00:00 2001
From: Amy Kwan 
Date: Tue, 20 Aug 2024 10:30:09 -0500
Subject: [PATCH] Add AIX/PPC Clang/LLVM release notes for LLVM 19.

---
 clang/docs/ReleaseNotes.rst | 17 +
 llvm/docs/ReleaseNotes.rst  | 14 ++
 2 files changed, 31 insertions(+)

diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index 17ddbfe910f878..b68b823ae6761d 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -1276,6 +1276,14 @@ RISC-V Support
   accesses may be created. ``-m[no-]strict-align`` applies to both scalar and
   vector.
 
+PowerPC Support
+^^^
+
+- Clang now emits errors for impossible ``__attribute__((musttail))``.
+- Added support for ``-mcpu=[pwr11 | power11]`` and ``-mtune=[pwr11 | power11]``.
+- Added support for ``builtin_cpu_supports`` on AIX, along with a subset of
+  features that can be queried.
+
 CUDA/HIP Language Changes
 ^
 
@@ -1294,6 +1302,14 @@ AIX Support
   base is encoded as an immediate operand.
   This access sequence is not used for TLS variables larger than 32KB, and is
   currently only supported on 64-bit mode.
+- Introduced the options ``-mtocdata/-mno-tocdata`` to enable/disable TOC data
+  transformations for the listed suitable variables.
+- Introduced the ``-maix-shared-lib-tls-model-opt`` option to enable the tuning
+  of changing local-dynamic mode access(es) to initial-exec access(es) at the
+  function level on 64-bit mode.
+- Clang now emits errors for ``-gdwarf-5``.
+- Added the support of the OpenMP runtime libomp on AIX. OpenMP applications can be
+  compiled with ``-fopenmp`` and execute on AIX.
 
 NetBSD Support
 ^^
@@ -1451,6 +1467,7 @@ OpenMP Support
 --
 
 - Added support for the `[[omp::assume]]` attribute.
+- AIX added an include directory for ``omp.h`` at ``/opt/IBM/openxlCSDK/include/openmp``.
 
 Additional Information
 ==
diff --git a/llvm/docs/ReleaseNotes.rst b/llvm/docs/ReleaseNotes.rst
index 60b6c6e786df89..ac7bdf723a168d 100644
--- a/llvm/docs/ReleaseNotes.rst
+++ b/llvm/docs/ReleaseNotes.rst
@@ -113,6 +113,8 @@ Changes to TableGen
 Changes to Interprocedural Optimizations
 
 
+* Hot cold region splitting analysis improvements for overlapping cold regions.
+
 Changes to the AArch64 Backend
 --
 
@@ -194,6 +196,16 @@ Changes to the MIPS Backend
 Changes to the PowerPC Backend
 --
 
+* PPC big-endian Linux now supports ``-fpatchable-function-entry``.
+* PPC AIX now supports local-dynamic TLS mode.
+* PPC AIX saves the Git revision in binaries when built with LLVM_APPEND_VC_REV=ON.
+* PPC AIX now supports toc-data attribute for large code model.
+* PPC AIX now supports passing arguments by value having greater alignment than
+  the pointer size. Currently only compatible with the IBM XL C compiler.
+* Add support for the per global code model attribute on AIX.
+* Support spilling non-volatile registers for traceback table accuracy on AIX.
+* Codegen improvements and bug fixes.
+
 Changes to the RISC-V Backend
 -
 
@@ -436,6 +448,8 @@ Changes to the LLVM tools
   be disabled by ``--no-verify-note-sections``. (`#90458
   `).
 
+* llvm-objdump now supports the ``--file-headers`` option for XCOFF object files.
+
 Changes to LLDB
 -
 



[llvm-branch-commits] [clang] [llvm] Add AIX and PPC Clang/LLVM release notes for LLVM 19. (PR #105099)

2024-08-20 Thread Amy Kwan via llvm-branch-commits

https://github.com/amy-kwan edited 
https://github.com/llvm/llvm-project/pull/105099


[llvm-branch-commits] [clang] [llvm] Add AIX and PPC Clang/LLVM release notes for LLVM 19. (PR #105099)

2024-08-20 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-clang

Author: Amy Kwan (amy-kwan)


Changes

This PR adds AIX and PPC Clang/LLVM release notes for LLVM 19 to the 
`release/19.x` branch.

---
Full diff: https://github.com/llvm/llvm-project/pull/105099.diff


2 Files Affected:

- (modified) clang/docs/ReleaseNotes.rst (+17) 
- (modified) llvm/docs/ReleaseNotes.rst (+14) 


``diff
diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index 17ddbfe910f878..b68b823ae6761d 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -1276,6 +1276,14 @@ RISC-V Support
   accesses may be created. ``-m[no-]strict-align`` applies to both scalar and
   vector.
 
+PowerPC Support
+^^^
+
+- Clang now emits errors for impossible ``__attribute__((musttail))``.
+- Added support for ``-mcpu=[pwr11 | power11]`` and ``-mtune=[pwr11 | power11]``.
+- Added support for ``builtin_cpu_supports`` on AIX, along with a subset of
+  features that can be queried.
+
 CUDA/HIP Language Changes
 ^
 
@@ -1294,6 +1302,14 @@ AIX Support
   base is encoded as an immediate operand.
   This access sequence is not used for TLS variables larger than 32KB, and is
   currently only supported on 64-bit mode.
+- Introduced the options ``-mtocdata/-mno-tocdata`` to enable/disable TOC data
+  transformations for the listed suitable variables.
+- Introduced the ``-maix-shared-lib-tls-model-opt`` option to enable the tuning
+  of changing local-dynamic mode access(es) to initial-exec access(es) at the
+  function level on 64-bit mode.
+- Clang now emits errors for ``-gdwarf-5``.
+- Added the support of the OpenMP runtime libomp on AIX. OpenMP applications can be
+  compiled with ``-fopenmp`` and execute on AIX.
 
 NetBSD Support
 ^^
@@ -1451,6 +1467,7 @@ OpenMP Support
 --
 
 - Added support for the `[[omp::assume]]` attribute.
+- AIX added an include directory for ``omp.h`` at ``/opt/IBM/openxlCSDK/include/openmp``.
 
 Additional Information
 ==
diff --git a/llvm/docs/ReleaseNotes.rst b/llvm/docs/ReleaseNotes.rst
index 60b6c6e786df89..ac7bdf723a168d 100644
--- a/llvm/docs/ReleaseNotes.rst
+++ b/llvm/docs/ReleaseNotes.rst
@@ -113,6 +113,8 @@ Changes to TableGen
 Changes to Interprocedural Optimizations
 
 
+* Hot cold region splitting analysis improvements for overlapping cold regions.
+
 Changes to the AArch64 Backend
 --
 
@@ -194,6 +196,16 @@ Changes to the MIPS Backend
 Changes to the PowerPC Backend
 --
 
+* PPC big-endian Linux now supports ``-fpatchable-function-entry``.
+* PPC AIX now supports local-dynamic TLS mode.
+* PPC AIX saves the Git revision in binaries when built with LLVM_APPEND_VC_REV=ON.
+* PPC AIX now supports toc-data attribute for large code model.
+* PPC AIX now supports passing arguments by value having greater alignment than
+  the pointer size. Currently only compatible with the IBM XL C compiler.
+* Add support for the per global code model attribute on AIX.
+* Support spilling non-volatile registers for traceback table accuracy on AIX.
+* Codegen improvements and bug fixes.
+
 Changes to the RISC-V Backend
 -
 
@@ -436,6 +448,8 @@ Changes to the LLVM tools
   be disabled by ``--no-verify-note-sections``. (`#90458
   `).
 
+* llvm-objdump now supports the ``--file-headers`` option for XCOFF object files.
+
 Changes to LLDB
 -
 

``




https://github.com/llvm/llvm-project/pull/105099


[llvm-branch-commits] [llvm] release/19.x: [SPARC] Remove assertions in printOperand for inline asm operands (#104692) (PR #105096)

2024-08-20 Thread via llvm-branch-commits

https://github.com/llvmbot updated 
https://github.com/llvm/llvm-project/pull/105096

>From 92740defcd22962d8f97fb9e2f0dc1e3b770531f Mon Sep 17 00:00:00 2001
From: Koakuma 
Date: Tue, 20 Aug 2024 20:05:06 +0700
Subject: [PATCH] [SPARC] Remove assertions in printOperand for inline asm
 operands (#104692)

Inline asm operands could contain any kind of relocation, so remove the
checks.

Fixes https://github.com/llvm/llvm-project/issues/103493

(cherry picked from commit 576b7a781aac6b1d60a72248894b50e565e9185a)
---
 llvm/lib/Target/Sparc/SparcAsmPrinter.cpp | 51 ---
 llvm/test/CodeGen/SPARC/inlineasm.ll  | 10 +
 2 files changed, 10 insertions(+), 51 deletions(-)

diff --git a/llvm/lib/Target/Sparc/SparcAsmPrinter.cpp b/llvm/lib/Target/Sparc/SparcAsmPrinter.cpp
index 6855471840e9db..71ec01aeb011ca 100644
--- a/llvm/lib/Target/Sparc/SparcAsmPrinter.cpp
+++ b/llvm/lib/Target/Sparc/SparcAsmPrinter.cpp
@@ -314,57 +314,6 @@ void SparcAsmPrinter::printOperand(const MachineInstr *MI, int opNum,
   const MachineOperand &MO = MI->getOperand (opNum);
   SparcMCExpr::VariantKind TF = (SparcMCExpr::VariantKind) MO.getTargetFlags();
 
-#ifndef NDEBUG
-  // Verify the target flags.
-  if (MO.isGlobal() || MO.isSymbol() || MO.isCPI()) {
-if (MI->getOpcode() == SP::CALL)
-  assert(TF == SparcMCExpr::VK_Sparc_None &&
- "Cannot handle target flags on call address");
-else if (MI->getOpcode() == SP::SETHIi)
-  assert((TF == SparcMCExpr::VK_Sparc_HI
-  || TF == SparcMCExpr::VK_Sparc_H44
-  || TF == SparcMCExpr::VK_Sparc_HH
-  || TF == SparcMCExpr::VK_Sparc_LM
-  || TF == SparcMCExpr::VK_Sparc_TLS_GD_HI22
-  || TF == SparcMCExpr::VK_Sparc_TLS_LDM_HI22
-  || TF == SparcMCExpr::VK_Sparc_TLS_LDO_HIX22
-  || TF == SparcMCExpr::VK_Sparc_TLS_IE_HI22
-  || TF == SparcMCExpr::VK_Sparc_TLS_LE_HIX22) &&
- "Invalid target flags for address operand on sethi");
-else if (MI->getOpcode() == SP::TLS_CALL)
-  assert((TF == SparcMCExpr::VK_Sparc_None
-  || TF == SparcMCExpr::VK_Sparc_TLS_GD_CALL
-  || TF == SparcMCExpr::VK_Sparc_TLS_LDM_CALL) &&
- "Cannot handle target flags on tls call address");
-else if (MI->getOpcode() == SP::TLS_ADDrr)
-  assert((TF == SparcMCExpr::VK_Sparc_TLS_GD_ADD
-  || TF == SparcMCExpr::VK_Sparc_TLS_LDM_ADD
-  || TF == SparcMCExpr::VK_Sparc_TLS_LDO_ADD
-  || TF == SparcMCExpr::VK_Sparc_TLS_IE_ADD) &&
- "Cannot handle target flags on add for TLS");
-else if (MI->getOpcode() == SP::TLS_LDrr)
-  assert(TF == SparcMCExpr::VK_Sparc_TLS_IE_LD &&
- "Cannot handle target flags on ld for TLS");
-else if (MI->getOpcode() == SP::TLS_LDXrr)
-  assert(TF == SparcMCExpr::VK_Sparc_TLS_IE_LDX &&
- "Cannot handle target flags on ldx for TLS");
-else if (MI->getOpcode() == SP::XORri)
-  assert((TF == SparcMCExpr::VK_Sparc_TLS_LDO_LOX10
-  || TF == SparcMCExpr::VK_Sparc_TLS_LE_LOX10) &&
- "Cannot handle target flags on xor for TLS");
-else
-  assert((TF == SparcMCExpr::VK_Sparc_LO
-  || TF == SparcMCExpr::VK_Sparc_M44
-  || TF == SparcMCExpr::VK_Sparc_L44
-  || TF == SparcMCExpr::VK_Sparc_HM
-  || TF == SparcMCExpr::VK_Sparc_TLS_GD_LO10
-  || TF == SparcMCExpr::VK_Sparc_TLS_LDM_LO10
-  || TF == SparcMCExpr::VK_Sparc_TLS_IE_LO10 ) &&
- "Invalid target flags for small address operand");
-  }
-#endif
-
-
   bool CloseParen = SparcMCExpr::printVariantKind(O, TF);
 
   switch (MO.getType()) {
diff --git a/llvm/test/CodeGen/SPARC/inlineasm.ll b/llvm/test/CodeGen/SPARC/inlineasm.ll
index 14ea0a2a126027..e2853f03a002e6 100644
--- a/llvm/test/CodeGen/SPARC/inlineasm.ll
+++ b/llvm/test/CodeGen/SPARC/inlineasm.ll
@@ -152,3 +152,13 @@ define i64 @test_twinword(){
   %1 = tail call i64 asm sideeffect "rd %asr5, ${0:L} \0A\09 srlx ${0:L}, 32, ${0:H}", "={i0}"()
   ret i64 %1
 }
+
+; CHECK-LABEL: test_symbol:
+; CHECK: ba,a brtarget
+define void @test_symbol() {
+Entry:
+  call void asm sideeffect "ba,a ${0}", "X"(ptr @brtarget)
+  unreachable
+}
+
+declare void @brtarget()



[llvm-branch-commits] [libcxx] [libc++][format][3/7] Improves std::format performance. (PR #101817)

2024-08-20 Thread Louis Dionne via llvm-branch-commits


@@ -58,23 +59,156 @@ namespace __format {
 /// This helper is used together with the @ref back_insert_iterator to offer
 /// type-erasure for the formatting functions. This reduces the number to
 /// template instantiations.
+///
+/// The design of the class is being changed to improve performance and do some
+/// code cleanups.
+/// The original design (as shipped up to LLVM-19) uses the following design:
+/// - There is an external object that connects the buffer to the output.
+/// - The class constructor stores a function pointer to a grow function and a
+///   type-erased pointer to the object that does the grow.
+/// - When writing data to the buffer would exceed the external buffer's
+///   capacity it requests the external buffer to flush its contents.
+///
+/// The new design tries to solve some issues with the current design:
+/// - The buffer used is a fixed-size buffer, benchmarking shows that using a
+///   dynamic allocated buffer has performance benefits.
+/// - Implementing P3107R5 "Permit an efficient implementation of std::print"
+///   is not trivial with the current buffers. Using the code from this series
+///   makes it trivial.
+///
+/// This class is ABI-tagged, still the new design does not change the size of
+/// objects of this class.
+///
+/// The new design contains information regarding format_to_n changes, these
+/// will be implemented in a follow-up patch.
+///
+/// The new design is the following.
+/// - There is an external object that connects the buffer to the output.
+/// - This buffer object:
+///   - inherits publicly from this class.
+///   - has a static or dynamic buffer.
+///   - has a static member function to make space in its buffer write
+/// operations. This can be done by increasing the size of the internal
+/// buffer or by writing the contents of the buffer to the output iterator.
+///
+/// This member function is a constructor argument, so its name is not
+/// fixed. The code uses the name __prepare_write.
+/// - The number of output code units can be limited by a __max_output_size
+///   object. This is used in format_to_n. This object:
+///   - Contains the maximum number of code units to be written.
+///   - Contains the number of code units that are requested to be written.
+/// This number is returned to the user of format_to_n.
+///   - The write functions call the object's __request_write member function.
+/// This function:
+/// - Updates the number of code units that are requested to be written.
+/// - Returns the number of code units that can be written without
+///   exceeding the maximum number of code units to be written.
+///
+/// Documentation for the buffer usage members:
+/// - __ptr_ the start of the buffer.
+/// - __capacity_ the number of code units that can be written.
+///   This means [__ptr_, __ptr_ + __capacity_) is a valid range to write to.
+/// - __size_ the number of code units written in the buffer. The next code
+///   unit will be written at __ptr_ + __size_. This __size_ may NOT contain
+///   the total number of code units written by the __output_buffer. Whether or
+///   not it does depends on the sub-class used. Typically the total number of
+///   code units written is not interesting. It is interesting for format_to_n
+///   which has its own way to track this number.
+///
+/// Documentation for the buffer changes function:
+/// The subclasses have a function with the following signature:
+///
+///   static void __prepare_write(
+/// __output_buffer<_CharT>& __buffer, size_t __code_units);
+///
+/// This function is called when a write function writes more code units than
+/// the buffer's available space. When a __max_output_size object is provided
+/// the number of code units is the number of code units returned from
+/// __max_output_size::__request_write function.
+///
+/// - The __buffer contains *this. Since the class containing this function
+///   inherits from __output_buffer it's save to cast it to the subclass being
+///   used.
+/// - The __code_units is the number of code units the caller will write + 1.
+///   - This value does not take the available space of the buffer into account.
+///   - The push_back function is more efficient when writing before resizing,
+/// this means the buffer should always have room for one code unit. Hence
+/// the + 1 is the size.
+/// - When the function returns there is room for at least one code unit. There
+///   is no requirement there is room for __code_units code units:
+///   - The class has some "bulk" operations. For example, __copy which copies
+/// the contents of a basic_string_view to the output. If the sub-class has
+/// a fixed size buffer the size of the basic_string_view may be larger
+/// than the buffer. In that case it's impossible to honor the requested
+/// size.
+///   - The at least one code unit makes sure the entire output can be written.
+/// (Obviously making room on

[llvm-branch-commits] [libcxx] [libc++][format][3/7] Improves std::format performance. (PR #101817)

2024-08-20 Thread Louis Dionne via llvm-branch-commits


@@ -58,23 +59,156 @@ namespace __format {
 /// This helper is used together with the @ref back_insert_iterator to offer
 /// type-erasure for the formatting functions. This reduces the number to
 /// template instantiations.
+///
+/// The design of the class is being changed to improve performance and do some
+/// code cleanups.
+/// The original design (as shipped up to LLVM-19) uses the following design:
+/// - There is an external object that connects the buffer to the output.
+/// - The class constructor stores a function pointer to a grow function and a
+///   type-erased pointer to the object that does the grow.
+/// - When writing data to the buffer would exceed the external buffer's
+///   capacity it requests the external buffer to flush its contents.
+///
+/// The new design tries to solve some issues with the current design:
+/// - The buffer used is a fixed-size buffer, benchmarking shows that using a
+///   dynamic allocated buffer has performance benefits.
+/// - Implementing P3107R5 "Permit an efficient implementation of std::print"
+///   is not trivial with the current buffers. Using the code from this series
+///   makes it trivial.
+///
+/// This class is ABI-tagged, still the new design does not change the size of
+/// objects of this class.
+///
+/// The new design contains information regarding format_to_n changes, these
+/// will be implemented in a follow-up patch.
+///
+/// The new design is the following.
+/// - There is an external object that connects the buffer to the output.
+/// - This buffer object:
+///   - inherits publicly from this class.
+///   - has a static or dynamic buffer.
+///   - has a static member function to make space in its buffer for write
+/// operations. This can be done by increasing the size of the internal
+/// buffer or by writing the contents of the buffer to the output iterator.
+///
+/// This member function is a constructor argument, so its name is not
+/// fixed. The code uses the name __prepare_write.
+/// - The number of output code units can be limited by a __max_output_size
+///   object. This is used in format_to_n. This object:
+///   - Contains the maximum number of code units to be written.
+///   - Contains the number of code units that are requested to be written.
+/// This number is returned to the user of format_to_n.
+///   - The write functions call the object's __request_write member function.
+/// This function:
+/// - Updates the number of code units that are requested to be written.
+/// - Returns the number of code units that can be written without
+///   exceeding the maximum number of code units to be written.
+///
+/// Documentation for the buffer usage members:
+/// - __ptr_ the start of the buffer.
+/// - __capacity_ the number of code units that can be written.
+///   This means [__ptr_, __ptr_ + __capacity_) is a valid range to write to.
+/// - __size_ the number of code units written in the buffer. The next code
+///   unit will be written at __ptr_ + __size_. This __size_ may NOT contain
+///   the total number of code units written by the __output_buffer. Whether or
+///   not it does depends on the sub-class used. Typically the total number of
+///   code units written is not interesting. It is interesting for format_to_n
+///   which has its own way to track this number.
+///
+/// Documentation for the buffer changes function:
+/// The subclasses have a function with the following signature:
+///
+///   static void __prepare_write(
+/// __output_buffer<_CharT>& __buffer, size_t __code_units);
+///
+/// This function is called when a write function writes more code units than
+/// the buffer's available space. When a __max_output_size object is provided
+/// the number of code units is the number of code units returned from
+/// __max_output_size::__request_write function.
+///
+/// - The __buffer contains *this. Since the class containing this function
+///   inherits from __output_buffer it's safe to cast it to the subclass being
+///   used.
+/// - The __code_units is the number of code units the caller will write + 1.
+///   - This value does not take the available space of the buffer into account.
+///   - The push_back function is more efficient when writing before resizing,
+/// this means the buffer should always have room for one code unit. Hence
+/// the + 1 is the size.
+/// - When the function returns there is room for at least one code unit. There
+///   is no requirement there is room for __code_units code units:
+///   - The class has some "bulk" operations. For example, __copy which copies
+/// the contents of a basic_string_view to the output. If the sub-class has
+/// a fixed size buffer the size of the basic_string_view may be larger
+/// than the buffer. In that case it's impossible to honor the requested
+/// size.
+///   - The at least one code unit makes sure the entire output can be written.
+/// (Obviously making room on

[llvm-branch-commits] [libcxx] [libc++][format][4/7] Improves std::format_to performance. (PR #101823)

2024-08-20 Thread Louis Dionne via llvm-branch-commits

https://github.com/ldionne requested changes to this pull request.


https://github.com/llvm/llvm-project/pull/101823
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [libcxx] [libc++][format][4/7] Improves std::format_to performance. (PR #101823)

2024-08-20 Thread Louis Dionne via llvm-branch-commits


@@ -722,6 +724,95 @@ class _LIBCPP_TEMPLATE_VIS __allocating_buffer : public 
__output_buffer<_CharT>
   }
 };
 
+// A buffer that directly writes to the underlying buffer.
+template 
+class _LIBCPP_TEMPLATE_VIS __direct_iterator_buffer : public 
__output_buffer<_CharT> {
+public:
+  [[nodiscard]] _LIBCPP_HIDE_FROM_ABI explicit __direct_iterator_buffer(_OutIt 
__out_it)
+  : __output_buffer<_CharT>{std::__unwrap_iter(__out_it), __buffer_size, 
__prepare_write}, __out_it_(__out_it) {}
+
+  [[nodiscard]] _LIBCPP_HIDE_FROM_ABI _OutIt __out_it() && { return __out_it_ 
+ this->__size(); }
+
+private:
+  // The function format_to expects a buffer large enough for the output. The
+  // function format_to_n has its own helper class that restricts the number of
+  // write options. So this function class can pretend to have an infinite
+  // buffer.
+  static constexpr size_t __buffer_size = -1;
+
+  _OutIt __out_it_;
+
+  _LIBCPP_HIDE_FROM_ABI static void
+  __prepare_write([[maybe_unused]] __output_buffer<_CharT>& __buffer, 
[[maybe_unused]] size_t __size_hint) {
+std::__throw_length_error("__direct_iterator_buffer");
+  }
+};
+
+// A buffer that writes its output to the end of a container.
+template 
+class _LIBCPP_TEMPLATE_VIS __container_inserter_buffer : public 
__output_buffer<_CharT> {
+public:
+  [[nodiscard]] _LIBCPP_HIDE_FROM_ABI explicit 
__container_inserter_buffer(_OutIt __out_it)
+  : __output_buffer<_CharT>{__buffer_, __buffer_size, __prepare_write}, 
__container_{__out_it.__get_container()} {}
+
+  [[nodiscard]] _LIBCPP_HIDE_FROM_ABI auto __out_it() && {
+__container_->insert(__container_->end(), __buffer_, __buffer_ + 
this->__size());
+return std::back_inserter(*__container_);
+  }
+
+private:
+  typename __back_insert_iterator_container<_OutIt>::type* __container_;
+
+  // This class uses a fixed size buffer and appends the elements in
+  // __buffer_size chunks. An alternative would be to use an allocating buffer
+  // and append the output in one write operation. Benchmarking showed no
+  // performance difference.
+  static constexpr size_t __buffer_size = 256;
+  _CharT __buffer_[__buffer_size];
+
+  _LIBCPP_HIDE_FROM_ABI void __prepare_write() {
+__container_->insert(__container_->end(), __buffer_, __buffer_ + 
this->__size());
+this->__buffer_flused();

ldionne wrote:

```suggestion
this->__buffer_flushed();
```

https://github.com/llvm/llvm-project/pull/101823


[llvm-branch-commits] [libcxx] [libc++][format][4/7] Improves std::format_to performance. (PR #101823)

2024-08-20 Thread Louis Dionne via llvm-branch-commits


@@ -722,6 +724,95 @@ class _LIBCPP_TEMPLATE_VIS __allocating_buffer : public 
__output_buffer<_CharT>
   }
 };
 
+// A buffer that directly writes to the underlying buffer.
+template 

ldionne wrote:

```suggestion
template 
```

That way, you don't need to `__unwrap_iter` below and we can keep the bounds 
information when there is some.

https://github.com/llvm/llvm-project/pull/101823


[llvm-branch-commits] [libcxx] [libc++][format][4/7] Improves std::format_to performance. (PR #101823)

2024-08-20 Thread Louis Dionne via llvm-branch-commits


@@ -722,6 +724,95 @@ class _LIBCPP_TEMPLATE_VIS __allocating_buffer : public 
__output_buffer<_CharT>
   }
 };
 
+// A buffer that directly writes to the underlying buffer.
+template 
+class _LIBCPP_TEMPLATE_VIS __direct_iterator_buffer : public 
__output_buffer<_CharT> {
+public:
+  [[nodiscard]] _LIBCPP_HIDE_FROM_ABI explicit __direct_iterator_buffer(_OutIt 
__out_it)
+  : __output_buffer<_CharT>{std::__unwrap_iter(__out_it), __buffer_size, 
__prepare_write}, __out_it_(__out_it) {}
+
+  [[nodiscard]] _LIBCPP_HIDE_FROM_ABI _OutIt __out_it() && { return __out_it_ 
+ this->__size(); }
+
+private:
+  // The function format_to expects a buffer large enough for the output. The
+  // function format_to_n has its own helper class that restricts the number of
+  // write options. So this function class can pretend to have an infinite
+  // buffer.
+  static constexpr size_t __buffer_size = -1;
+
+  _OutIt __out_it_;
+
+  _LIBCPP_HIDE_FROM_ABI static void
+  __prepare_write([[maybe_unused]] __output_buffer<_CharT>& __buffer, 
[[maybe_unused]] size_t __size_hint) {
+std::__throw_length_error("__direct_iterator_buffer");
+  }
+};
+
+// A buffer that writes its output to the end of a container.
+template 
+class _LIBCPP_TEMPLATE_VIS __container_inserter_buffer : public 
__output_buffer<_CharT> {
+public:
+  [[nodiscard]] _LIBCPP_HIDE_FROM_ABI explicit 
__container_inserter_buffer(_OutIt __out_it)
+  : __output_buffer<_CharT>{__buffer_, __buffer_size, __prepare_write}, 
__container_{__out_it.__get_container()} {}
+
+  [[nodiscard]] _LIBCPP_HIDE_FROM_ABI auto __out_it() && {
+__container_->insert(__container_->end(), __buffer_, __buffer_ + 
this->__size());
+return std::back_inserter(*__container_);
+  }
+
+private:
+  typename __back_insert_iterator_container<_OutIt>::type* __container_;
+
+  // This class uses a fixed size buffer and appends the elements in
+  // __buffer_size chunks. An alternative would be to use an allocating buffer
+  // and append the output in one write operation. Benchmarking showed no
+  // performance difference.
+  static constexpr size_t __buffer_size = 256;
+  _CharT __buffer_[__buffer_size];
+
+  _LIBCPP_HIDE_FROM_ABI void __prepare_write() {

ldionne wrote:

I find it confusing to reuse the same function name for `__prepare_write`. 
Could we simply inline this function inside 
`__prepare_write(__output_buffer<_CharT>&, size_t)` below?

```c++
static void __prepare_write(__output_buffer<_CharT>& __buffer, [[maybe_unused]] size_t __size_hint) {
  __container_inserter_buffer<_OutIt, _CharT>& __self =
      static_cast<__container_inserter_buffer<_OutIt, _CharT>&>(__buffer);
  __self.__container_->insert(__self.__container_->end(), __self.__buffer_, __self.__buffer_ + __self.__size());
  __self.__buffer_flushed();
}
```
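
For readers unfamiliar with the pattern, here is a minimal, self-contained
sketch of a static callback that casts the base reference back to the derived
class. All names (`OutputBuffer`, `ContainerInserterBuffer`, `format_into`,
`finish`) are hypothetical stand-ins, not libc++'s internal API, and the buffer
is kept tiny only to force several flushes.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical type-erased base: it knows only a pointer, a capacity, and a
// callback that must make room when the buffer is full.
class OutputBuffer {
public:
  using PrepareWrite = void (*)(OutputBuffer&, std::size_t);

  OutputBuffer(char* ptr, std::size_t capacity, PrepareWrite prepare)
      : ptr_(ptr), capacity_(capacity), prepare_(prepare) {}

  void push_back(char c) {
    if (size_ == capacity_)
      prepare_(*this, 1); // ask the derived class to make room
    assert(size_ < capacity_);
    ptr_[size_++] = c;
  }

  std::size_t size() const { return size_; }

protected:
  // Called by the derived class once the buffered data has been written out.
  void buffer_flushed() { size_ = 0; }

private:
  char* ptr_;
  std::size_t capacity_;
  std::size_t size_ = 0;
  PrepareWrite prepare_;
};

// Hypothetical derived class: a single static function does the flush, so
// there is no second overload with the same name.
class ContainerInserterBuffer : public OutputBuffer {
public:
  explicit ContainerInserterBuffer(std::string& out)
      : OutputBuffer(buffer_, kBufferSize, prepare_write), container_(&out) {}

  // Flushes whatever is still buffered; call once after the last write.
  void finish() { prepare_write(*this, 0); }

private:
  static constexpr std::size_t kBufferSize = 4;
  std::string* container_;
  char buffer_[kBufferSize];

  static void prepare_write(OutputBuffer& buffer, std::size_t /*size_hint*/) {
    // Safe because this callback is only ever registered by this class.
    auto& self = static_cast<ContainerInserterBuffer&>(buffer);
    self.container_->append(self.buffer_, self.size());
    self.buffer_flushed();
  }
};

// Writes `text` through the buffer and returns what reached the container.
std::string format_into(const std::string& text) {
  std::string out;
  ContainerInserterBuffer buf(out);
  for (char c : text)
    buf.push_back(c);
  buf.finish();
  return out;
}
```

The base class never needs to know the derived type; only the callback does,
which is why a single static function is enough.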

https://github.com/llvm/llvm-project/pull/101823


[llvm-branch-commits] [libcxx] [libc++][format][4/7] Improves std::format_to performance. (PR #101823)

2024-08-20 Thread Louis Dionne via llvm-branch-commits


@@ -722,6 +724,95 @@ class _LIBCPP_TEMPLATE_VIS __allocating_buffer : public 
__output_buffer<_CharT>
   }
 };
 
+// A buffer that directly writes to the underlying buffer.
+template 
+class _LIBCPP_TEMPLATE_VIS __direct_iterator_buffer : public 
__output_buffer<_CharT> {
+public:
+  [[nodiscard]] _LIBCPP_HIDE_FROM_ABI explicit __direct_iterator_buffer(_OutIt 
__out_it)
+  : __output_buffer<_CharT>{std::__unwrap_iter(__out_it), __buffer_size, 
__prepare_write}, __out_it_(__out_it) {}
+
+  [[nodiscard]] _LIBCPP_HIDE_FROM_ABI _OutIt __out_it() && { return __out_it_ 
+ this->__size(); }
+
+private:
+  // The function format_to expects a buffer large enough for the output. The
+  // function format_to_n has its own helper class that restricts the number of
+  // write options. So this function class can pretend to have an infinite
+  // buffer.
+  static constexpr size_t __buffer_size = -1;
+
+  _OutIt __out_it_;
+
+  _LIBCPP_HIDE_FROM_ABI static void
+  __prepare_write([[maybe_unused]] __output_buffer<_CharT>& __buffer, 
[[maybe_unused]] size_t __size_hint) {
+std::__throw_length_error("__direct_iterator_buffer");
+  }
+};
+
+// A buffer that writes its output to the end of a container.
+template 
+class _LIBCPP_TEMPLATE_VIS __container_inserter_buffer : public 
__output_buffer<_CharT> {
+public:
+  [[nodiscard]] _LIBCPP_HIDE_FROM_ABI explicit 
__container_inserter_buffer(_OutIt __out_it)
+  : __output_buffer<_CharT>{__buffer_, __buffer_size, __prepare_write}, 
__container_{__out_it.__get_container()} {}
+
+  [[nodiscard]] _LIBCPP_HIDE_FROM_ABI auto __out_it() && {
+__container_->insert(__container_->end(), __buffer_, __buffer_ + 
this->__size());
+return std::back_inserter(*__container_);
+  }
+
+private:
+  typename __back_insert_iterator_container<_OutIt>::type* __container_;
+
+  // This class uses a fixed size buffer and appends the elements in
+  // __buffer_size chunks. An alternative would be to use an allocating buffer
+  // and append the output in one write operation. Benchmarking showed no

ldionne wrote:

```suggestion
  // and append the output in a single write operation. Benchmarking showed no
```

https://github.com/llvm/llvm-project/pull/101823


[llvm-branch-commits] [clang] release/19.x: [clang][modules] Built-in modules are not correctly enabled for Mac Catalyst (#104872) (PR #105093)

2024-08-20 Thread Cyndy Ishida via llvm-branch-commits

https://github.com/cyndyishida approved this pull request.


https://github.com/llvm/llvm-project/pull/105093


[llvm-branch-commits] [libcxx] [libc++][format][3/7] Improves std::format performance. (PR #101817)

2024-08-20 Thread Louis Dionne via llvm-branch-commits


@@ -499,11 +652,78 @@ struct _LIBCPP_TEMPLATE_VIS __format_to_n_buffer final
   _LIBCPP_HIDE_FROM_ABI auto __make_output_iterator() { return 
this->__output_.__make_output_iterator(); }
 
   _LIBCPP_HIDE_FROM_ABI format_to_n_result<_OutIt> __result() && {
-this->__output_.__flush();
+this->__output_.__flush(0);
 return {std::move(this->__writer_).__out_it(), this->__size_};
   }
 };
 
+// * * * LLVM-20 classes * * *
+
+// A dynamically growing buffer.
+template <__fmt_char_type _CharT>
+class _LIBCPP_TEMPLATE_VIS __allocating_buffer : public 
__output_buffer<_CharT> {
+public:
+  __allocating_buffer(const __allocating_buffer&)= delete;
+  __allocating_buffer& operator=(const __allocating_buffer&) = delete;
+
+  [[nodiscard]] _LIBCPP_HIDE_FROM_ABI __allocating_buffer()
+  : __output_buffer<_CharT>{__buffer_, __buffer_size_, __prepare_write} {}
+
+  _LIBCPP_HIDE_FROM_ABI ~__allocating_buffer() {
+if (__ptr_ != __buffer_) {
+  ranges::destroy_n(__ptr_, this->__size());
+  allocator_traits<_Alloc>::deallocate(__alloc_, __ptr_, 
this->__capacity());
+}
+  }
+
+  [[nodiscard]] _LIBCPP_HIDE_FROM_ABI basic_string_view<_CharT> __view() { 
return {__ptr_, this->__size()}; }
+
+private:
+  // At the moment the allocator is hard-coded. There might be reasons to have
+  // an allocator trait in the future. This ensures forward compatibility.
+  using _Alloc = allocator<_CharT>;
+  _LIBCPP_NO_UNIQUE_ADDRESS _Alloc __alloc_;
+
+  // Since allocating is expensive the class has a small internal buffer. When
+  // its capacity is exceeded a dynamic buffer will be allocated.
+  static constexpr size_t __buffer_size_ = 256;
+  _CharT __buffer_[__buffer_size_];

ldionne wrote:

After looking at the patches later in the series, I am not certain that 
implementing this as a `vector` (i.e. elements beyond `size()` not having 
their lifetime started yet), as these comments suggest, is the best approach. 
I think your current approach may be better.

However, I think that the usage of `back_insert_iterator<__output_buffer>` + 
`__output_buffer::push_back` is misleading -- that's what caused me to think 
this should behave as a `vector` when in reality this is a different beast. If 
instead we had a method called `__push_back` (or `__insert_back`) and we used 
our own small iterator type to output to the buffer, it would make the design 
of this class clearer. It would also put `__push_back`/`__insert_back` on the 
same level as the other "insertion" functions like `__fill` and others. So IMO 
the current design might be fine lifetime-wise, but I think it would be nice to 
avoid using `back_insert_iterator<__output_buffer>` since `__output_buffer` is 
fundamentally not a container.

I understand that using `back_inserter` greatly simplifies the boilerplate 
needed, but IMO a dedicated iterator is still the right design.
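
To make the suggestion concrete, here is a rough sketch of what a small 
dedicated output iterator could look like. The names (`Buffer`, 
`BufferIterator`, `insert_back`, `copy_through_iterator`) are hypothetical and 
this is not the libc++ implementation; it only shows how little boilerplate the 
iterator itself needs.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <string>

// Hypothetical buffer that is deliberately not a container: it exposes an
// explicit insert_back operation instead of a container-style push_back.
class Buffer {
public:
  void insert_back(char c) { data_.push_back(c); }
  const std::string& view() const { return data_; }

private:
  std::string data_;
};

// Minimal output iterator that forwards every assignment to insert_back.
class BufferIterator {
public:
  using iterator_category = std::output_iterator_tag;
  using value_type = void;
  using difference_type = std::ptrdiff_t;
  using pointer = void;
  using reference = void;

  explicit BufferIterator(Buffer& buffer) : buffer_(&buffer) {}

  BufferIterator& operator=(char c) {
    buffer_->insert_back(c);
    return *this;
  }
  // Dereference and increment are no-ops, as in std::back_insert_iterator.
  BufferIterator& operator*() { return *this; }
  BufferIterator& operator++() { return *this; }
  BufferIterator operator++(int) { return *this; }

private:
  Buffer* buffer_;
};

// Copies `text` into a Buffer through the iterator and returns the result.
std::string copy_through_iterator(const std::string& text) {
  Buffer buffer;
  std::copy(text.begin(), text.end(), BufferIterator(buffer));
  assert(buffer.view().size() == text.size());
  return buffer.view();
}
```

With an iterator like this, the buffer no longer has to pretend to satisfy the 
container interface that `back_insert_iterator` expects.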

https://github.com/llvm/llvm-project/pull/101817


[llvm-branch-commits] [llvm] [MC][NFC] Reduce Address2ProbesMap size (PR #102904)

2024-08-20 Thread Lei Wang via llvm-branch-commits

wlei-llvm wrote:

> > Tested End-to-End on llvm-profgen on a heavy workload(ported all the 
> > stacked PR) : The running time is neutral, the maximum RSS is reduced by 
> > 3GB (from 70GB to 67GB)   cc @WenleiHe
> 
> To double-check: did you test with or without dwarf-correlation? I tested 
> once with it, expectedly pseudo probe parsing wasn't engaged, so there was no 
> effect.

Yes, I double-checked: it's a pseudo-probed binary, not dwarf-correlation. Note 
that for the end-to-end test, pseudo-probe decoding is not the bottleneck (it 
accounts for < 1% of the run time), so a neutral run time is more or less 
expected. The goal was to test whether the Address2ProbesMap `find` could 
regress the run time.

https://github.com/llvm/llvm-project/pull/102904


[llvm-branch-commits] [llvm] [MC][NFC] Reduce Address2ProbesMap size (PR #102904)

2024-08-20 Thread Lei Wang via llvm-branch-commits

wlei-llvm wrote:


> > To double-check: did you test with or without dwarf-correlation? I tested 
> > once with it, expectedly pseudo probe parsing wasn't engaged, so there was 
> > no effect.
> 
> Yes, I double-checked it's a pseudo-probed binary not dwarf-correlation. Note 
> that for end to end test, the pseudo-probe decoding is not the 
> bottleneck(only account for < 1% run time), so neutral for run time is kind 
> of expected. I was to test if the Address2ProbesMap `find` could regress the 
> run time.

For RSS, it's probably because llvm-profgen decodes pseudo-probes only for 
profiled functions (https://reviews.llvm.org/D121643), so the baseline memory 
is already reduced. 

In https://github.com/llvm/llvm-project/pull/102789, I tested pseudo-probe 
decoding for all functions. 


https://github.com/llvm/llvm-project/pull/102904


[llvm-branch-commits] [compiler-rt] 177a5c2 - Revert "[compiler-rt][fuzzer] implements SetThreadName for fuchsia. (#99953)"

2024-08-20 Thread via llvm-branch-commits

Author: David CARLIER
Date: 2024-08-20T18:41:48+01:00
New Revision: 177a5c23c998699d93ecc64154b111a8adabb66e

URL: 
https://github.com/llvm/llvm-project/commit/177a5c23c998699d93ecc64154b111a8adabb66e
DIFF: 
https://github.com/llvm/llvm-project/commit/177a5c23c998699d93ecc64154b111a8adabb66e.diff

LOG: Revert "[compiler-rt][fuzzer] implements SetThreadName for fuchsia. 
(#99953)"

This reverts commit 31cc4ccdea92a4fee6a327a07251bb0ed1e6a933.

Added: 


Modified: 
compiler-rt/lib/fuzzer/FuzzerUtilFuchsia.cpp

Removed: 




diff  --git a/compiler-rt/lib/fuzzer/FuzzerUtilFuchsia.cpp 
b/compiler-rt/lib/fuzzer/FuzzerUtilFuchsia.cpp
index eefa0b326e51b5..fe79e1908d6029 100644
--- a/compiler-rt/lib/fuzzer/FuzzerUtilFuchsia.cpp
+++ b/compiler-rt/lib/fuzzer/FuzzerUtilFuchsia.cpp
@@ -607,13 +607,7 @@ size_t PageSize() {
 }
 
 void SetThreadName(std::thread &thread, const std::string &name) {
-  if (name.size() > 31)
-name.resize(31);
-  zx_status_t s;
-  if ((s = zx_object_set_property(thread.native_handle(), ZX_PROP_NAME,
-  name.c_str(), name.size())) != ZX_OK)
-Printf("SetThreadName for name %s failed: %s", name.c_str(),
-   zx_status_get_string(s));
+  // TODO ?
 }
 
 } // namespace fuzzer





[llvm-branch-commits] [llvm] [ctx_prof] Profile flatterner (PR #104539)

2024-08-20 Thread Mircea Trofin via llvm-branch-commits

https://github.com/mtrofin updated 
https://github.com/llvm/llvm-project/pull/104539

>From c0eb05f775a88fdf343d52b7af7fcc5eb4b2497e Mon Sep 17 00:00:00 2001
From: Mircea Trofin 
Date: Thu, 15 Aug 2024 19:03:30 -0700
Subject: [PATCH] [ctx_prof] Profile flatterner

---
 llvm/include/llvm/Analysis/CtxProfAnalysis.h  |  7 ++
 llvm/lib/Analysis/CtxProfAnalysis.cpp | 40 
 .../Analysis/CtxProfAnalysis/full-cycle.ll| 65 ++-
 llvm/test/Analysis/CtxProfAnalysis/load.ll|  5 ++
 4 files changed, 114 insertions(+), 3 deletions(-)

diff --git a/llvm/include/llvm/Analysis/CtxProfAnalysis.h 
b/llvm/include/llvm/Analysis/CtxProfAnalysis.h
index 43587d953fc4ca..f9c204ea8d7744 100644
--- a/llvm/include/llvm/Analysis/CtxProfAnalysis.h
+++ b/llvm/include/llvm/Analysis/CtxProfAnalysis.h
@@ -9,6 +9,8 @@
 #ifndef LLVM_ANALYSIS_CTXPROFANALYSIS_H
 #define LLVM_ANALYSIS_CTXPROFANALYSIS_H
 
+#include "llvm/ADT/DenseMap.h"
+#include "llvm/IR/GlobalValue.h"
 #include "llvm/IR/InstrTypes.h"
 #include "llvm/IR/IntrinsicInst.h"
 #include "llvm/IR/PassManager.h"
@@ -18,6 +20,9 @@ namespace llvm {
 
 class CtxProfAnalysis;
 
+using CtxProfFlatProfile =
+DenseMap>;
+
 /// The instrumented contextual profile, produced by the CtxProfAnalysis.
 class PGOContextualProfile {
   friend class CtxProfAnalysis;
@@ -65,6 +70,8 @@ class PGOContextualProfile {
 return 
FuncInfo.find(getDefinedFunctionGUID(F))->second.NextCallsiteIndex++;
   }
 
+  const CtxProfFlatProfile flatten() const;
+
   bool invalidate(Module &, const PreservedAnalyses &PA,
   ModuleAnalysisManager::Invalidator &) {
 // Check whether the analysis has been explicitly invalidated. Otherwise,
diff --git a/llvm/lib/Analysis/CtxProfAnalysis.cpp 
b/llvm/lib/Analysis/CtxProfAnalysis.cpp
index 51663196b13070..837ffc43a30235 100644
--- a/llvm/lib/Analysis/CtxProfAnalysis.cpp
+++ b/llvm/lib/Analysis/CtxProfAnalysis.cpp
@@ -184,6 +184,14 @@ PreservedAnalyses CtxProfAnalysisPrinterPass::run(Module 
&M,
   OS << "\nCurrent Profile:\n";
   OS << formatv("{0:2}", JSONed);
   OS << "\n";
+  OS << "\nFlat Profile:\n";
+  auto Flat = C.flatten();
+  for (const auto &[Guid, Counters] : Flat) {
+OS << Guid << " : ";
+for (auto V : Counters)
+  OS << V << " ";
+OS << "\n";
+  }
   return PreservedAnalyses::all();
 }
 
@@ -193,3 +201,35 @@ InstrProfCallsite 
*CtxProfAnalysis::getCallsiteInstrumentation(CallBase &CB) {
   return IPC;
   return nullptr;
 }
+
+static void
+preorderVisit(const PGOCtxProfContext::CallTargetMapTy &Profiles,
+  function_ref Visitor) {
+  std::function Traverser =
+  [&](const auto &Ctx) {
+Visitor(Ctx);
+for (const auto &[_, SubCtxSet] : Ctx.callsites())
+  for (const auto &[__, Subctx] : SubCtxSet)
+Traverser(Subctx);
+  };
+  for (const auto &[_, P] : Profiles)
+Traverser(P);
+}
+
+const CtxProfFlatProfile PGOContextualProfile::flatten() const {
+  assert(Profiles.has_value());
+  CtxProfFlatProfile Flat;
+  preorderVisit(*Profiles, [&](const PGOCtxProfContext &Ctx) {
+auto [It, Ins] = Flat.insert({Ctx.guid(), {}});
+if (Ins) {
+  llvm::append_range(It->second, Ctx.counters());
+} else {
+  assert(It->second.size() == Ctx.counters().size() &&
+ "All contexts corresponding to a function should have the exact "
+ "same nr of counters.");
+  for (size_t I = 0, E = It->second.size(); I < E; ++I)
+It->second[I] += Ctx.counters()[I];
+}
+  });
+  return Flat;
+}
diff --git a/llvm/test/Analysis/CtxProfAnalysis/full-cycle.ll 
b/llvm/test/Analysis/CtxProfAnalysis/full-cycle.ll
index 0cdf82bd96efcb..06ba8b3542f7d5 100644
--- a/llvm/test/Analysis/CtxProfAnalysis/full-cycle.ll
+++ b/llvm/test/Analysis/CtxProfAnalysis/full-cycle.ll
@@ -4,6 +4,9 @@
 ; RUN: split-file %s %t
 ;
 ; Test that the GUID metadata survives through thinlink.
+; Also test that the flattener works correctly. f2 is called in 2 places, with
+; different counter values, and we expect resulting flat profile to be the sum
+; (of values at the same index).
 ;
 ; RUN: llvm-ctxprof-util fromJSON --input=%t/profile.json 
--output=%t/profile.ctxprofdata
 ;
@@ -17,7 +20,9 @@
 ; RUN: llvm-lto2 run %t/m1.bc %t/m2.bc -o %t/ -thinlto-distributed-indexes \
 ; RUN:  -use-ctx-profile=%t/profile.ctxprofdata \
 ; RUN:  -r %t/m1.bc,f1,plx \
+; RUN:  -r %t/m1.bc,f3,plx \
 ; RUN:  -r %t/m2.bc,f1 \
+; RUN:  -r %t/m2.bc,f3 \
 ; RUN:  -r %t/m2.bc,entrypoint,plx
 ; RUN: opt 
--passes='function-import,require,print' \
 ; RUN:  -summary-file=%t/m2.bc.thinlto.bc 
-use-ctx-profile=%t/profile.ctxprofdata %t/m2.bc \
@@ -38,6 +43,11 @@ define void @f1() #0 {
   ret void
 }
 
+define void @f3() #0 {
+  call void @f2()
+  ret void
+}
+
 attributes #0 = { noinline }
 !0 = !{ i64 3087265239403591524 }
 
@@ -48,9 +58,11 @@ target triple = "x86_64-pc-linux-gnu"
 source_filename = "random_path/m2.cc"
 
 declare void @f1()
+declare void @f3()
 

[llvm-branch-commits] [DirectX] Encapsulate DXILOpLowering's state into a class. NFC (PR #104248)

2024-08-20 Thread Justin Bogner via llvm-branch-commits

https://github.com/bogner updated 
https://github.com/llvm/llvm-project/pull/104248




[llvm-branch-commits] [DirectX] Lower `@llvm.dx.handle.fromBinding` to DXIL ops (PR #104251)

2024-08-20 Thread Justin Bogner via llvm-branch-commits

https://github.com/bogner updated 
https://github.com/llvm/llvm-project/pull/104251




[llvm-branch-commits] [DirectX] Lower `@llvm.dx.handle.fromBinding` to DXIL ops (PR #104251)

2024-08-20 Thread Justin Bogner via llvm-branch-commits


@@ -119,6 +123,119 @@ class OpLowerer {
 });
   }
 
+  Value *createTmpHandleCast(Value *V, Type *Ty) {
+Function *CastFn = Intrinsic::getDeclaration(&M, Intrinsic::dx_cast_handle,
+ {Ty, V->getType()});
+CallInst *Cast = OpBuilder.getIRB().CreateCall(CastFn, {V});
+CleanupCasts.push_back(Cast);
+return Cast;
+  }
+
+  void cleanupHandleCasts() {
+SmallVector ToRemove;
+SmallVector CastFns;
+
+for (CallInst *Cast : CleanupCasts) {
+  CastFns.push_back(Cast->getCalledFunction());
+  // All of the ops should be using `dx.types.Handle` at this point, so if
+  // we're not producing that we should be part of a pair. Track this so we
+  // can remove it at the end.
+  if (Cast->getType() != OpBuilder.getHandleType()) {
+ToRemove.push_back(Cast);
+continue;
+  }
+  // Otherwise, we're the second handle in a pair. Forward the arguments 
and
+  // remove the (second) cast.
+  CallInst *Def = cast(Cast->getOperand(0));
+  assert(Def->getIntrinsicID() == Intrinsic::dx_cast_handle &&
+ "Unbalanced pair of temporary handle casts");
+  Cast->replaceAllUsesWith(Def->getOperand(0));
+  Cast->eraseFromParent();
+}
+for (CallInst *Cast : ToRemove) {
+  assert(Cast->user_empty() && "Temporary handle cast still has users");
+  Cast->eraseFromParent();
+}
+llvm::sort(CastFns);
+CastFns.erase(llvm::unique(CastFns), CastFns.end());
+for (Function *F : CastFns)
+  F->eraseFromParent();

bogner wrote:

Above we add the cast function to the list every time we see a call to a cast - 
this means we could end up with duplicates if the same cast function is called 
multiple times. So, we need to deduplicate the list before we start erasing the 
functions from the module.
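
The hazard can be illustrated outside LLVM with a small standard-library 
sketch. Here `erase_each_once` and the integer IDs are hypothetical stand-ins 
for the cast functions, and `std::sort`/`std::unique` play the role of 
`llvm::sort`/`llvm::unique`; erasing an ID twice would be the analogue of 
calling `eraseFromParent()` on an already-deleted function.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Erases each listed ID from `live` exactly once and returns how many distinct
// IDs were erased. The list may contain repeats, so it is sorted and
// deduplicated before anything is erased.
std::size_t erase_each_once(std::set<int>& live, std::vector<int> to_erase) {
  std::sort(to_erase.begin(), to_erase.end());
  to_erase.erase(std::unique(to_erase.begin(), to_erase.end()),
                 to_erase.end());

  std::size_t erased = 0;
  for (int id : to_erase) {
    // Without the deduplication above, a repeated ID would fail this check.
    assert(live.count(id) == 1 && "each ID must still be live when erased");
    live.erase(id);
    ++erased;
  }
  return erased;
}
```

Sorting first matters because `std::unique` only removes adjacent duplicates.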

https://github.com/llvm/llvm-project/pull/104251


[llvm-branch-commits] [llvm] [MC][NFC] Reduce Address2ProbesMap size (PR #102904)

2024-08-20 Thread via llvm-branch-commits

https://github.com/WenleiHe commented:

> For RSS, it's probably because in llvm-profgen, we decoded the pseudo-probe 
> only for profiled function(https://reviews.llvm.org/D121643), so the baseline 
> memory is decreased.

@aaupov actually, do we need to decode profile for functions without a profile? 
There is no profile to be matched in this case, and there is no 
optimization/reordering to be done either, hence no need to rewrite probes? 

The change itself looks good to land though. 

https://github.com/llvm/llvm-project/pull/102904


[llvm-branch-commits] [clang] clang/AMDGPU: Emit atomicrmw for global/flat fadd v2bf16 builtins (PR #96875)

2024-08-20 Thread Matt Arsenault via llvm-branch-commits

arsenm wrote:

### Merge activity

* **Aug 20, 2:53 PM EDT**: @arsenm started a stack merge that includes this 
pull request via 
[Graphite](https://app.graphite.dev/github/pr/llvm/llvm-project/96875).


https://github.com/llvm/llvm-project/pull/96875

