Re: [PATCH] lto: pass -pthread to AM_LDFLAGS [PR 106118]

2022-07-01 Thread Martin Liška
>
> OK
> 

I've just pushed that as 51debf7f857.

Martin


Re: [PATCH] Remove legacy -gz=zlib-gnu

2022-07-01 Thread Andrew Pinski via Gcc-patches
On Thu, Jun 30, 2022 at 11:58 PM Fangrui Song via Gcc-patches
 wrote:
>
> From: Fangrui Song 
>
> SHF_COMPRESSED style zlib has been supported since binutils 2.26
> and the legacy zlib-gnu option hasn't gained adoption.
> According to Debian Code Search (`gz=zlib-gnu`), no project uses
> -gz=zlib-gnu (valgrind has a configure to use -gz=zlib).
> Remove support for the legacy zlib-gnu and simplify configure.ac by
> removing zlib-gnu ld/as check.

A couple of things: first, you are missing a changelog.
Second, why remove something which is still working?
Third, why not just make gz=zlib-gnu an alias for gz=zlib instead, so
that if someone used it before it will still work? We try not to remove
options; we have them emit a warning and be ignored (or mapped to the
closest option).
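
The alias-with-warning idea can be sketched as follows. This is a minimal
standalone model, not GCC's option machinery: the names DebugCompression and
parse_gz are hypothetical, and the warning text is made up for illustration.

```cpp
#include <cassert>
#include <iostream>
#include <sstream>
#include <string>

// Hypothetical model of the values accepted by -gz=<type>.
enum class DebugCompression { None, Zlib };

// Parse the argument of -gz=<type>.  The legacy spelling "zlib-gnu" is
// accepted as an alias for "zlib" and a warning is written to `diag`.
// Returns false for an unknown type and leaves `out` untouched.
inline bool parse_gz(const std::string& arg, DebugCompression& out,
                     std::ostream& diag)
{
    if (arg == "none") {
        out = DebugCompression::None;
        return true;
    }
    if (arg == "zlib") {
        out = DebugCompression::Zlib;
        return true;
    }
    if (arg == "zlib-gnu") {
        // Assumed wording; a real driver would use its own diagnostics.
        diag << "warning: -gz=zlib-gnu is deprecated; using -gz=zlib\n";
        out = DebugCompression::Zlib;
        return true;
    }
    return false;
}
```

With this shape, old command lines keep working, but the output silently
changes format, which is the trade-off being debated in this thread.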

Thanks,
Andrew

> ---
>  gcc/common.opt  |  3 ---
>  gcc/configure   | 33 ++---
>  gcc/configure.ac| 29 -
>  gcc/doc/invoke.texi | 11 +--
>  gcc/gcc.cc  | 22 ++
>  5 files changed, 17 insertions(+), 81 deletions(-)
>
> diff --git a/gcc/common.opt b/gcc/common.opt
> index e7a51e882ba..8754d93d545 100644
> --- a/gcc/common.opt
> +++ b/gcc/common.opt
> @@ -3424,9 +3424,6 @@ Enum(compressed_debug_sections) String(none) Value(0)
>  EnumValue
>  Enum(compressed_debug_sections) String(zlib) Value(1)
>
> -EnumValue
> -Enum(compressed_debug_sections) String(zlib-gnu) Value(2)
> -
>  gz
>  Common Driver
>  Generate compressed debug sections.
> diff --git a/gcc/configure b/gcc/configure
> index 62872d132ea..ca87e875e9d 100755
> --- a/gcc/configure
> +++ b/gcc/configure
> @@ -19674,7 +19674,7 @@ else
>lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
>lt_status=$lt_dlunknown
>cat > conftest.$ac_ext <<_LT_EOF
> -#line 19679 "configure"
> +#line 19677 "configure"
>  #include "confdefs.h"
>
>  #if HAVE_DLFCN_H
> @@ -19780,7 +19780,7 @@ else
>lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
>lt_status=$lt_dlunknown
>cat > conftest.$ac_ext <<_LT_EOF
> -#line 19785 "configure"
> +#line 19783 "configure"
>  #include "confdefs.h"
>
>  #if HAVE_DLFCN_H
> @@ -29711,20 +29711,13 @@ else
> if $gcc_cv_as --compress-debug-sections -o conftest.o conftest.s 2>&1 | 
> grep -i warning > /dev/null
> then
>   gcc_cv_as_compress_debug=0
> -   # Since binutils 2.26, gas supports --compress-debug-sections=type,
> +   # Since binutils 2.26, gas supports --compress-debug-sections=zlib,
> # defaulting to the ELF gABI format.
> -   elif $gcc_cv_as --compress-debug-sections=zlib-gnu -o conftest.o 
> conftest.s > /dev/null 2>&1
> +   elif $gcc_cv_as --compress-debug-sections=zlib -o conftest.o conftest.s > 
> /dev/null 2>&1
> then
>   gcc_cv_as_compress_debug=2
>   gcc_cv_as_compress_debug_option="--compress-debug-sections"
>   gcc_cv_as_no_compress_debug_option="--nocompress-debug-sections"
> -   # Before binutils 2.26, gas only supported --compress-debug-options and
> -   # emitted the traditional GNU format.
> -   elif $gcc_cv_as --compress-debug-sections -o conftest.o conftest.s > 
> /dev/null 2>&1
> -   then
> - gcc_cv_as_compress_debug=1
> - gcc_cv_as_compress_debug_option="--compress-debug-sections"
> - gcc_cv_as_no_compress_debug_option="--nocompress-debug-sections"
> else
>   gcc_cv_as_compress_debug=0
> fi
> @@ -30238,42 +30231,28 @@ $as_echo "$gcc_cv_ld_eh_gc_sections_bug" >&6; }
>
>  { $as_echo "$as_me:${as_lineno-$LINENO}: checking linker for compressed 
> debug sections" >&5
>  $as_echo_n "checking linker for compressed debug sections... " >&6; }
> -# gold/gld support compressed debug sections since binutils 2.19/2.21
> -# In binutils 2.26, gld gained support for the ELF gABI format.
> +# GNU ld/gold support --compressed-debug-sections=zlib since binutils 2.26.
>  if test $in_tree_ld = yes ; then
>gcc_cv_ld_compress_debug=0
>if test $ld_is_mold = yes; then
>  gcc_cv_ld_compress_debug=3
>  gcc_cv_ld_compress_debug_option="--compress-debug-sections"
> -  elif test "$gcc_cv_gld_major_version" -eq 2 -a "$gcc_cv_gld_minor_version" 
> -ge 19 -o "$gcc_cv_gld_major_version" -gt 2 \
> - && test $in_tree_ld_is_elf = yes && test $ld_is_gold = yes; then
> -gcc_cv_ld_compress_debug=2
> -gcc_cv_ld_compress_debug_option="--compress-debug-sections"
>elif test "$gcc_cv_gld_major_version" -eq 2 -a "$gcc_cv_gld_minor_version" 
> -ge 26 -o "$gcc_cv_gld_major_version" -gt 2 \
>   && test $in_tree_ld_is_elf = yes && test $ld_is_gold = no; then
>  gcc_cv_ld_compress_debug=3
>  gcc_cv_ld_compress_debug_option="--compress-debug-sections"
> -  elif test "$gcc_cv_gld_major_version" -eq 2 -a "$gcc_cv_gld_minor_version" 
> -ge 21 -o "$gcc_cv_gld_major_version" -gt 2 \
> - && test $in_tree_ld_is_elf = yes; then
> -gcc_cv_ld_compress_debug=1
>fi
>  elif echo "$ld_ver" | grep GNU > /dev/null; then
>if test $ld_is_mold = yes; then
>  gcc_cv_ld_compress_d

Re: [PATCH] Remove legacy -gz=zlib-gnu

2022-07-01 Thread Fangrui Song via Gcc-patches

On 2022-07-01, Andrew Pinski wrote:
> On Thu, Jun 30, 2022 at 11:58 PM Fangrui Song via Gcc-patches
>  wrote:
>>
>> From: Fangrui Song 
>>
>> SHF_COMPRESSED style zlib has been supported since binutils 2.26
>> and the legacy zlib-gnu option hasn't gained adoption.
>> According to Debian Code Search (`gz=zlib-gnu`), no project uses
>> -gz=zlib-gnu (valgrind has a configure to use -gz=zlib).
>> Remove support for the legacy zlib-gnu and simplify configure.ac by
>> removing zlib-gnu ld/as check.
>
> A couple of things: first, you are missing a changelog.

Sorry.

> Second, why remove something which is still working?

It's unused and its existence causes confusion: the paradox of choice.
People may assume that the legacy format is well supported, but newer DWARF
consumers may not support it.

The other motivation is to clean it up a bit.  I foresee that someone
will add --compress-debug-sections=zstd to binutils, and configure.ac and
gcc/gcc.cc would become messier.

> Third, why not just make gz=zlib-gnu an alias for gz=zlib instead, so
> that if someone used it before it will still work? We try not to remove
> options; we have them emit a warning and be ignored (or mapped to the
> closest option).

Changing the semantics of -gz=zlib-gnu would be even more confusing.

> Thanks,
> Andrew



---
 gcc/common.opt  |  3 ---
 gcc/configure   | 33 ++---
 gcc/configure.ac| 29 -
 gcc/doc/invoke.texi | 11 +--
 gcc/gcc.cc  | 22 ++
 5 files changed, 17 insertions(+), 81 deletions(-)

diff --git a/gcc/common.opt b/gcc/common.opt
index e7a51e882ba..8754d93d545 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -3424,9 +3424,6 @@ Enum(compressed_debug_sections) String(none) Value(0)
 EnumValue
 Enum(compressed_debug_sections) String(zlib) Value(1)

-EnumValue
-Enum(compressed_debug_sections) String(zlib-gnu) Value(2)
-
 gz
 Common Driver
 Generate compressed debug sections.
diff --git a/gcc/configure b/gcc/configure
index 62872d132ea..ca87e875e9d 100755
--- a/gcc/configure
+++ b/gcc/configure
@@ -19674,7 +19674,7 @@ else
   lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
   lt_status=$lt_dlunknown
   cat > conftest.$ac_ext <<_LT_EOF
-#line 19679 "configure"
+#line 19677 "configure"
 #include "confdefs.h"

 #if HAVE_DLFCN_H
@@ -19780,7 +19780,7 @@ else
   lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
   lt_status=$lt_dlunknown
   cat > conftest.$ac_ext <<_LT_EOF
-#line 19785 "configure"
+#line 19783 "configure"
 #include "confdefs.h"

 #if HAVE_DLFCN_H
@@ -29711,20 +29711,13 @@ else
if $gcc_cv_as --compress-debug-sections -o conftest.o conftest.s 2>&1 | grep 
-i warning > /dev/null
then
  gcc_cv_as_compress_debug=0
-   # Since binutils 2.26, gas supports --compress-debug-sections=type,
+   # Since binutils 2.26, gas supports --compress-debug-sections=zlib,
# defaulting to the ELF gABI format.
-   elif $gcc_cv_as --compress-debug-sections=zlib-gnu -o conftest.o conftest.s > 
/dev/null 2>&1
+   elif $gcc_cv_as --compress-debug-sections=zlib -o conftest.o conftest.s > 
/dev/null 2>&1
then
  gcc_cv_as_compress_debug=2
  gcc_cv_as_compress_debug_option="--compress-debug-sections"
  gcc_cv_as_no_compress_debug_option="--nocompress-debug-sections"
-   # Before binutils 2.26, gas only supported --compress-debug-options and
-   # emitted the traditional GNU format.
-   elif $gcc_cv_as --compress-debug-sections -o conftest.o conftest.s > /dev/null 
2>&1
-   then
- gcc_cv_as_compress_debug=1
- gcc_cv_as_compress_debug_option="--compress-debug-sections"
- gcc_cv_as_no_compress_debug_option="--nocompress-debug-sections"
else
  gcc_cv_as_compress_debug=0
fi
@@ -30238,42 +30231,28 @@ $as_echo "$gcc_cv_ld_eh_gc_sections_bug" >&6; }

 { $as_echo "$as_me:${as_lineno-$LINENO}: checking linker for compressed debug 
sections" >&5
 $as_echo_n "checking linker for compressed debug sections... " >&6; }
-# gold/gld support compressed debug sections since binutils 2.19/2.21
-# In binutils 2.26, gld gained support for the ELF gABI format.
+# GNU ld/gold support --compressed-debug-sections=zlib since binutils 2.26.
 if test $in_tree_ld = yes ; then
   gcc_cv_ld_compress_debug=0
   if test $ld_is_mold = yes; then
 gcc_cv_ld_compress_debug=3
 gcc_cv_ld_compress_debug_option="--compress-debug-sections"
-  elif test "$gcc_cv_gld_major_version" -eq 2 -a "$gcc_cv_gld_minor_version" -ge 19 -o 
"$gcc_cv_gld_major_version" -gt 2 \
- && test $in_tree_ld_is_elf = yes && test $ld_is_gold = yes; then
-gcc_cv_ld_compress_debug=2
-gcc_cv_ld_compress_debug_option="--compress-debug-sections"
   elif test "$gcc_cv_gld_major_version" -eq 2 -a "$gcc_cv_gld_minor_version" -ge 26 -o 
"$gcc_cv_gld_major_version" -gt 2 \
  && test $in_tree_ld_is_elf = yes && test $ld_is_gold = no; then
 gcc_cv_ld_compress_debug=3
 gcc_cv_ld_compress_debug_option="--compress-debug-sections"
-  elif test "$gcc_cv_gld_major_version" -eq 2 -a "

[PATCH] wide-int: Fix up wi::shifted_mask [PR106144]

2022-07-01 Thread Jakub Jelinek via Gcc-patches
Hi!

As the following self-test testcase shows, wi::shifted_mask sometimes
doesn't create canonicalized wide_ints, which then fail to compare equal
to canonicalized wide_ints with the same value.
In particular, wi::mask (128, false, 128) gives { -1 } with len 1 and prec 128,
while wi::shifted_mask (0, 128, false, 128) gives { -1, -1 } with len 2
and prec 128.
The problem is that the code is written with the assumption that there are
3 bit blocks (or 2 if start is 0), but doesn't consider the possibility
where there are 2 bit blocks (or 1 if start is 0) where the highest block
isn't present.  In that case, there is the optional block of negate ? 0 : -1
elts, followed by just one elt (either one from the if (shift) or just
negate ? -1 : 0) and the rest is implicit sign-extension.
Only if end < prec there is 1 or more bits above it that have different bit
value and so we need to emit all the elts till end and then one more elt.

if (end == prec) would work too, because we have:
  if (width > prec - start)
    width = prec - start;
  unsigned int end = start + width;
so end <= prec is guaranteed; I don't know which form is preferred.
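
To illustrate why { -1 } and { -1, -1 } are the same value but fail to
compare equal, here is a toy model of the canonical form.  It is only a
sketch under simplifying assumptions: blocks are 64 bits, the precision is a
multiple of 64, and the names canonicalize and naive_equal are hypothetical
(this is not the wide-int.cc implementation).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy representation: 64-bit blocks, least significant first.  Blocks above
// the stored length are implied by sign-extending the top stored block.
using Blocks = std::vector<std::int64_t>;

// Drop top blocks that only repeat the sign of the block below them, so
// that each value has exactly one (shortest) representation.
inline Blocks canonicalize(Blocks val)
{
    std::size_t len = val.size();
    while (len > 1) {
        std::int64_t sign_of_lower = val[len - 2] < 0 ? -1 : 0;
        if (val[len - 1] != sign_of_lower)
            break;  // the top block carries information, keep it
        --len;
    }
    val.resize(len);
    return val;
}

// Comparison that assumes both operands are already canonical:
// it compares the length and then the blocks.
inline bool naive_equal(const Blocks& a, const Blocks& b)
{
    return a == b;
}
```

In this model, a mask of all ones built as one block and one built as two
blocks only compare equal after canonicalize has been applied to both.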

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2022-07-01  Jakub Jelinek  

PR middle-end/106144
* wide-int.cc (wi::shifted_mask): If end >= prec, return right after
emitting element for shift or if shift is 0 first element after start.
(wide_int_cc_tests): Add tests for equivalency of wi::mask and
wi::shifted_mask with 0 start.

--- gcc/wide-int.cc.jj  2022-01-11 23:11:23.592273263 +0100
+++ gcc/wide-int.cc 2022-06-30 20:41:25.506292687 +0200
@@ -842,6 +842,13 @@ wi::shifted_mask (HOST_WIDE_INT *val, un
val[i++] = negate ? block : ~block;
 }
 
+  if (end >= prec)
+{
+  if (!shift)
+   val[i++] = negate ? 0 : -1;
+  return i;
+}
+
   while (i < end / HOST_BITS_PER_WIDE_INT)
 /* 111 */
 val[i++] = negate ? 0 : -1;
@@ -2583,6 +2590,10 @@ wide_int_cc_tests ()
   run_all_wide_int_tests  ();
   test_overflow ();
   test_round_for_mask ();
+  ASSERT_EQ (wi::mask (128, false, 128),
+wi::shifted_mask (0, 128, false, 128));
+  ASSERT_EQ (wi::mask (128, true, 128),
+wi::shifted_mask (0, 128, true, 128));
 }
 
 } // namespace selftest

Jakub



Re: [PATCH v2] Enable __memcmpeq after seeing __memcmpeq prototype

2022-07-01 Thread Richard Biener via Gcc-patches
On Mon, Jun 20, 2022 at 5:44 PM H.J. Lu  wrote:
>
> extern int __memcmpeq (const void *, const void *, size_t);
>
> was added to GLIBC 2.35.  Expand BUILT_IN_MEMCMP_EQ to __memcmpeq
> after seeing __memcmpeq prototype

Can you instead use builtin_decl_declared_p (), see how frontends
set that via set_builtin_decl_declared_p?
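
As a rough illustration of that suggestion, the sketch below keeps a
per-builtin "declared" flag instead of a dedicated global.  It is a
standalone toy model, not GCC code: Builtin, BuiltinTable and
memcmp_eq_target are hypothetical names that only mimic the idea of a
frontend setting the flag and the expander querying it.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical stand-ins for the builtin function codes involved.
enum Builtin : std::size_t { MEMCMP, MEMCMPEQ, BUILTIN_COUNT };

// One "was a prototype seen?" flag per builtin, set by the frontend.
struct BuiltinTable {
    std::array<bool, BUILTIN_COUNT> declared{};

    void set_declared(Builtin b, bool value) { declared[b] = value; }
    bool is_declared(Builtin b) const { return declared[b]; }
};

// Choose the library call used for an equality-only memcmp:
// prefer __memcmpeq only when its prototype has been seen.
inline Builtin memcmp_eq_target(const BuiltinTable& table)
{
    return table.is_declared(MEMCMPEQ) ? MEMCMPEQ : MEMCMP;
}
```

The point of the design is that no new global such as
have_memcmpeq_prototype is needed; the existing per-builtin state answers
the question.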

> gcc/
>
> * builtins.cc (have_memcmpeq_prototype): New.
> (expand_builtin): Issue an error for BUILT_IN___MEMCMPEQ if
> there is no __memcmpeq prototype.  Expand BUILT_IN_MEMCMP_EQ
> to BUILT_IN___MEMCMP_EQ if there is __memcmpeq prototype.
> * builtins.def (BUILT_IN___MEMCMPEQ): New.
> * builtins.h (have_memcmpeq_prototype): New.
>
> gcc/c/
>
> * c-decl.cc (diagnose_mismatched_decls): Set
> have_memcmpeq_prototype to true after seeing __memcmpeq prototype.
>
> gcc/cp/
>
> *  decl.cc (duplicate_decls): Set have_memcmpeq_prototype to true
> after seeing __memcmpeq prototype.
>
> gcc/testsuite/
>
> * c-c++-common/memcmpeq-1.c: New test.
> * c-c++-common/memcmpeq-2.c: Likewise.
> * c-c++-common/memcmpeq-3.c: Likewise.
> * c-c++-common/memcmpeq-4.c: Likewise.
> * c-c++-common/memcmpeq-5.c: Likewise.
> * c-c++-common/memcmpeq-6.c: Likewise.
> * c-c++-common/memcmpeq.h: Likewise.
> ---
>  gcc/builtins.cc | 17 -
>  gcc/builtins.def|  3 +++
>  gcc/builtins.h  |  3 +++
>  gcc/c/c-decl.cc | 25 ++---
>  gcc/cp/decl.cc  |  5 +
>  gcc/testsuite/c-c++-common/memcmpeq-1.c | 11 +++
>  gcc/testsuite/c-c++-common/memcmpeq-2.c | 11 +++
>  gcc/testsuite/c-c++-common/memcmpeq-3.c | 11 +++
>  gcc/testsuite/c-c++-common/memcmpeq-4.c | 11 +++
>  gcc/testsuite/c-c++-common/memcmpeq-5.c | 11 +++
>  gcc/testsuite/c-c++-common/memcmpeq-6.c | 10 ++
>  gcc/testsuite/c-c++-common/memcmpeq.h   | 11 +++
>  12 files changed, 121 insertions(+), 8 deletions(-)
>  create mode 100644 gcc/testsuite/c-c++-common/memcmpeq-1.c
>  create mode 100644 gcc/testsuite/c-c++-common/memcmpeq-2.c
>  create mode 100644 gcc/testsuite/c-c++-common/memcmpeq-3.c
>  create mode 100644 gcc/testsuite/c-c++-common/memcmpeq-4.c
>  create mode 100644 gcc/testsuite/c-c++-common/memcmpeq-5.c
>  create mode 100644 gcc/testsuite/c-c++-common/memcmpeq-6.c
>  create mode 100644 gcc/testsuite/c-c++-common/memcmpeq.h
>
> diff --git a/gcc/builtins.cc b/gcc/builtins.cc
> index 971b18c3745..96e283e5847 100644
> --- a/gcc/builtins.cc
> +++ b/gcc/builtins.cc
> @@ -104,6 +104,9 @@ builtin_info_type builtin_info[(int)END_BUILTINS];
>  /* Non-zero if __builtin_constant_p should be folded right away.  */
>  bool force_folding_builtin_constant_p;
>
> +/* True if there is a __memcmpeq prototype.  */
> +bool have_memcmpeq_prototype;
> +
>  static int target_char_cast (tree, char *);
>  static int apply_args_size (void);
>  static int apply_result_size (void);
> @@ -7392,6 +7395,15 @@ expand_builtin (tree exp, rtx target, rtx subtarget, 
> machine_mode mode,
> return target;
>break;
>
> +case BUILT_IN___MEMCMPEQ:
> +  if (!have_memcmpeq_prototype)
> +   {
> + error ("use of %<__builtin___memcmpeq ()%> without "
> +"%<__memcmpeq%> prototype");
> + return const0_rtx;
> +   }
> +  break;
> +
>  /* Expand it as BUILT_IN_MEMCMP_EQ first. If not successful, change it
> back to a BUILT_IN_STRCMP. Remember to delete the 3rd parameter
> when changing it to a strcmp call.  */
> @@ -7445,7 +7457,10 @@ expand_builtin (tree exp, rtx target, rtx subtarget, 
> machine_mode mode,
> return target;
>if (fcode == BUILT_IN_MEMCMP_EQ)
> {
> - tree newdecl = builtin_decl_explicit (BUILT_IN_MEMCMP);
> + tree newdecl = builtin_decl_explicit
> +   (have_memcmpeq_prototype
> +? BUILT_IN___MEMCMPEQ
> +: BUILT_IN_MEMCMP);
>   TREE_OPERAND (exp, 1) = build_fold_addr_expr (newdecl);
> }
>break;
> diff --git a/gcc/builtins.def b/gcc/builtins.def
> index 005976f34e9..95642c6acdf 100644
> --- a/gcc/builtins.def
> +++ b/gcc/builtins.def
> @@ -965,6 +965,9 @@ DEF_BUILTIN_STUB (BUILT_IN_ALLOCA_WITH_ALIGN_AND_MAX, 
> "__builtin_alloca_with_ali
> equality with zero.  */
>  DEF_BUILTIN_STUB (BUILT_IN_MEMCMP_EQ, "__builtin_memcmp_eq")
>
> +/* Similar to BUILT_IN_MEMCMP_EQ, but is mapped to __memcmpeq.  */
> +DEF_EXT_LIB_BUILTIN (BUILT_IN___MEMCMPEQ, "__memcmpeq", 
> BT_FN_INT_CONST_PTR_CONST_PTR_SIZE, ATTR_PURE_NOTHROW_NONNULL_LEAF)
> +
>  /* An internal version of strcmp/strncmp, used when the result is only
> tested for equality with zero.  */
>  DEF_BUILTIN_STUB (BUILT_IN_STRCMP_EQ, "__builtin_strcmp_eq")
> diff --git a/gcc/builtins.h b/gcc/builtins.h
> index 5ad8

[PATCH] Mips: Resolve build issues for the n32 ABI

2022-07-01 Thread Dimitrije Milosevic
Building ASAN for the n32 MIPS ABI currently fails, for a few reasons:
- defined(__mips64), which is set solely based on the architecture type
  (32-bit/64-bit), was still used in some places.  It is therefore replaced
  with SANITIZER_MIPS64, which takes the ABI into account as well:
  defined(__mips64) && _MIPS_SIM == _ABI64.
- The n32 ABI still uses 64-bit *Linux* system calls, even though the word
  size is 32 bits.
- After the transition to canonical system calls
  (https://reviews.llvm.org/D124212), the n32 ABI still didn't use them, even
  though they are supported, as per
  https://github.com/torvalds/linux/blob/master/arch/mips/kernel/syscalls/syscall_n32.tbl.
- struct_kernel_stat_sz was not updated after being changed in LLVM's source
  tree.

See https://reviews.llvm.org/D127098.
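
The two conditions described above can be modelled as plain predicates.
This is only a sketch: the Target struct and both function names are
hypothetical, and the real code expresses the same logic with preprocessor
conditionals on a Linux target.

```cpp
#include <cassert>

// Hypothetical description of a Linux target, standing in for the
// predefined macros tested by the preprocessor conditionals.
struct Target {
    bool x86_64;     // __x86_64__ is defined (includes x32)
    bool powerpc64;  // __powerpc64__ is defined
    int wordsize;    // SANITIZER_WORDSIZE
    bool mips_n32;   // __mips__ is defined and _MIPS_SIM == _ABIN32
};

// Model of SANITIZER_LINUX_USES_64BIT_SYSCALLS after the patch:
// n32 has a 32-bit word size but still uses the 64-bit syscalls.
constexpr bool uses_64bit_syscalls(const Target& t)
{
    return t.x86_64 || t.powerpc64 || t.wordsize == 64 || t.mips_n32;
}

// Model of SANITIZER_MIPS64 after the patch: __mips64 alone is not
// enough, the ABI must also be the 64-bit one.
constexpr bool is_sanitizer_mips64(bool has_mips64_macro, bool abi_is_abi64)
{
    return has_mips64_macro && abi_is_abi64;
}
```

Under this model an n32 target defines __mips64 yet is not classified as
SANITIZER_MIPS64, while it still selects the 64-bit syscall path.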

libsanitizer/ChangeLog:

* sanitizer_common/sanitizer_linux.cpp (defined): Resolve
ASAN build issues for the n32 ABI.
* sanitizer_common/sanitizer_platform.h (defined): Likewise.
* sanitizer_common/sanitizer_platform_limits_posix.h: Likewise.

---

 libsanitizer/sanitizer_common/sanitizer_linux.cpp   | 17 
++---
 libsanitizer/sanitizer_common/sanitizer_platform.h  |  2 +-
 libsanitizer/sanitizer_common/sanitizer_platform_limits_posix.h |  2 +-
 3 files changed, 12 insertions(+), 9 deletions(-)

diff --git a/libsanitizer/sanitizer_common/sanitizer_linux.cpp 
b/libsanitizer/sanitizer_common/sanitizer_linux.cpp
index e2c32d679ad..5ba033492e7 100644
--- a/libsanitizer/sanitizer_common/sanitizer_linux.cpp
+++ b/libsanitizer/sanitizer_common/sanitizer_linux.cpp
@@ -34,7 +34,7 @@
 // format. Struct kernel_stat is defined as 'struct stat' in asm/stat.h. To
 // access stat from asm/stat.h, without conflicting with definition in
 // sys/stat.h, we use this trick.
-#if defined(__mips64)
+#if SANITIZER_MIPS64
 #include 
 #include 
 #define stat kernel_stat
@@ -124,8 +124,9 @@ const int FUTEX_WAKE_PRIVATE = FUTEX_WAKE | 
FUTEX_PRIVATE_FLAG;
 // Are we using 32-bit or 64-bit Linux syscalls?
 // x32 (which defines __x86_64__) has SANITIZER_WORDSIZE == 32
 // but it still needs to use 64-bit syscalls.
-#if SANITIZER_LINUX && (defined(__x86_64__) || defined(__powerpc64__) ||   
\
-SANITIZER_WORDSIZE == 64)
+#if SANITIZER_LINUX && (defined(__x86_64__) || defined(__powerpc64__) || \
+SANITIZER_WORDSIZE == 64 ||  \
+(defined(__mips__) && _MIPS_SIM == _ABIN32))
 # define SANITIZER_LINUX_USES_64BIT_SYSCALLS 1
 #else
 # define SANITIZER_LINUX_USES_64BIT_SYSCALLS 0
@@ -289,7 +290,7 @@ static void stat64_to_stat(struct stat64 *in, struct stat 
*out) {
 }
 #endif

-#if defined(__mips64)
+#if SANITIZER_MIPS64
 // Undefine compatibility macros from 
 // so that they would not clash with the kernel_stat
 // st_[a|m|c]time fields
@@ -343,7 +344,8 @@ uptr internal_stat(const char *path, void *buf) {
 #if SANITIZER_FREEBSD
   return internal_syscall(SYSCALL(fstatat), AT_FDCWD, (uptr)path, (uptr)buf, 
0);
 #elif SANITIZER_LINUX
-#  if SANITIZER_WORDSIZE == 64 || SANITIZER_X32
+#  if SANITIZER_WORDSIZE == 64 || SANITIZER_X32 || \
+  (defined(__mips__) && _MIPS_SIM == _ABIN32)
   return internal_syscall(SYSCALL(newfstatat), AT_FDCWD, (uptr)path, (uptr)buf,
   0);
 #  else
@@ -366,7 +368,8 @@ uptr internal_lstat(const char *path, void *buf) {
   return internal_syscall(SYSCALL(fstatat), AT_FDCWD, (uptr)path, (uptr)buf,
   AT_SYMLINK_NOFOLLOW);
 #elif SANITIZER_LINUX
-#  if defined(_LP64) || SANITIZER_X32
+#  if defined(_LP64) || SANITIZER_X32 || \
+  (defined(__mips__) && _MIPS_SIM == _ABIN32)
   return internal_syscall(SYSCALL(newfstatat), AT_FDCWD, (uptr)path, (uptr)buf,
   AT_SYMLINK_NOFOLLOW);
 #  else
@@ -1053,7 +1056,7 @@ uptr GetMaxVirtualAddress() {
   return (1ULL << (MostSignificantSetBitIndex(GET_CURRENT_FRAME()) + 1)) - 1;
 #elif SANITIZER_RISCV64
   return (1ULL << 38) - 1;
-# elif defined(__mips64)
+# elif SANITIZER_MIPS64
   return (1ULL << 40) - 1;  // 0x00ffUL;
 # elif defined(__s390x__)
   return (1ULL << 53) - 1;  // 0x001fUL;
diff --git a/libsanitizer/sanitizer_common/sanitizer_platform.h 
b/libsanitizer/sanitizer_common/sanitizer_platform.h
index 8fe0d831431..8bd9a327623 100644
--- a/libsanitizer/sanitizer_common/sanitizer_platform.h
+++ b/libsanitizer/sanitizer_common/sanitizer_platform.h
@@ -159,7 +159,7 @@

 #if defined(__mips__)
 #  define SANITIZER_MIPS 1
-#  if defined(__mips64)
+#  if defined(__mips64) && _MIPS_SIM == _ABI64
 #define SANITIZER_MIPS32 0
 #define SANITIZER_MIPS64 1
 #  else
diff --git a/libsanitizer/sanitizer_common/sanitizer_platform_limits_posix.h 
b/libsanitizer/sanitizer_common/sanitizer_platform_limits_posix.h
index 89772a7e5c0..62a99035db3 100644
--- a/libsanitizer/sanitizer_c

[PATCH] libsanitizer: Fix linkage errors for cross toolchains

2022-07-01 Thread Dimitrije Milosevic
When we use cross toolchains in which the GCC libraries are not installed
within a designated system root, the shared sanitizer libraries link against
the libstdc++.so* in the same directory.  This directory, however, is not in
RPATH, so attempting to build a dynamically linked application with
-fsanitize=... gives a linkage error.
More information can be found here:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69839.
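
The patch relies on adjacent string literals being concatenated by the
preprocessor, so appending LIBSAN_RPATH to a spec macro simply extends the
spec string.  A minimal sketch of that composition, using the macro bodies
from the gcc.cc hunk (the libasan_spec helper is hypothetical and exists
only to make the result inspectable):

```cpp
#include <cassert>
#include <string>

// Macro bodies as they appear in the gcc/gcc.cc hunk of this patch.
#define LIBSAN_RPATH " %:include(libsanitizer.spec)%(link_libsan_rpath)"
#define STATIC_LIBASAN_LIBS \
  " %{static-libasan|static:%:include(libsanitizer.spec)%(link_libasan)}"

// The fallback definition: three adjacent literals become one string.
#define LIBASAN_SPEC "-lasan" STATIC_LIBASAN_LIBS LIBSAN_RPATH

// Hypothetical helper returning the composed spec for inspection.
inline std::string libasan_spec()
{
    return LIBASAN_SPEC;
}
```

The resulting string ends with the %(link_libsan_rpath) spec, whose value
comes from libsanitizer.spec at link time.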


gcc/ChangeLog:
* gcc.cc (LIBSAN_RPATH): New macro.
(LIBASAN_SPEC): Add LIBSAN_RPATH.
(LIBUBSAN_SPEC): Likewise.
(LIBTSAN_SPEC): Likewise.
(LIBLSAN_SPEC): Likewise.

libsanitizer/ChangeLog:
* configure.ac (link_libsan_rpath): New config variable.
* libsanitizer.spec.in (link_libsan_rpath): New spec.
* configure (link_libsan_rpath): New config variable.
* Makefile.in (link_libsan_rpath): Define new Makefile variable.
* asan/Makefile.in: Likewise.
* interception/Makefile.in: Likewise.
* libbacktrace/Makefile.in: Likewise.
* lsan/Makefile.in: Likewise.
* sanitizer_common/Makefile.in: Likewise.
* tsan/Makefile.in: Likewise.
* ubsan/Makefile.in: Likewise.
* hwasan/Makefile.in: Likewise.

---

 gcc/gcc.cc| 20 
 libsanitizer/Makefile.in  |  1 +
 libsanitizer/asan/Makefile.in |  1 +
 libsanitizer/configure| 10 --
 libsanitizer/configure.ac |  7 +++
 libsanitizer/hwasan/Makefile.in   |  1 +
 libsanitizer/interception/Makefile.in |  1 +
 libsanitizer/libbacktrace/Makefile.in |  1 +
 libsanitizer/libsanitizer.spec.in |  2 ++
 libsanitizer/lsan/Makefile.in |  1 +
 libsanitizer/sanitizer_common/Makefile.in |  1 +
 libsanitizer/tsan/Makefile.in |  1 +
 libsanitizer/ubsan/Makefile.in|  1 +
 13 files changed, 38 insertions(+), 10 deletions(-)


diff --git a/gcc/gcc.cc b/gcc/gcc.cc
index 299e09c4f54..37ff75f1ad5 100644
--- a/gcc/gcc.cc
+++ b/gcc/gcc.cc
@@ -738,17 +738,21 @@ proper position among the other output files.  */
 #define STACK_SPLIT_SPEC " %{fsplit-stack: --wrap=pthread_create}"
 #endif

+#ifndef LIBSAN_RPATH
+#define LIBSAN_RPATH " %:include(libsanitizer.spec)%(link_libsan_rpath)"
+#endif
+
 #ifndef LIBASAN_SPEC
 #define STATIC_LIBASAN_LIBS \
   " %{static-libasan|static:%:include(libsanitizer.spec)%(link_libasan)}"
 #ifdef LIBASAN_EARLY_SPEC
-#define LIBASAN_SPEC STATIC_LIBASAN_LIBS
+#define LIBASAN_SPEC STATIC_LIBASAN_LIBS LIBSAN_RPATH
 #elif defined(HAVE_LD_STATIC_DYNAMIC)
 #define LIBASAN_SPEC "%{static-libasan:" LD_STATIC_OPTION \
 "} -lasan %{static-libasan:" LD_DYNAMIC_OPTION "}" \
 STATIC_LIBASAN_LIBS
 #else
-#define LIBASAN_SPEC "-lasan" STATIC_LIBASAN_LIBS
+#define LIBASAN_SPEC "-lasan" STATIC_LIBASAN_LIBS LIBSAN_RPATH
 #endif
 #endif

@@ -778,13 +782,13 @@ proper position among the other output files.  */
 #define STATIC_LIBTSAN_LIBS \
   " %{static-libtsan|static:%:include(libsanitizer.spec)%(link_libtsan)}"
 #ifdef LIBTSAN_EARLY_SPEC
-#define LIBTSAN_SPEC STATIC_LIBTSAN_LIBS
+#define LIBTSAN_SPEC STATIC_LIBTSAN_LIBS LIBSAN_RPATH
 #elif defined(HAVE_LD_STATIC_DYNAMIC)
 #define LIBTSAN_SPEC "%{static-libtsan:" LD_STATIC_OPTION \
 "} -ltsan %{static-libtsan:" LD_DYNAMIC_OPTION "}" \
 STATIC_LIBTSAN_LIBS
 #else
-#define LIBTSAN_SPEC "-ltsan" STATIC_LIBTSAN_LIBS
+#define LIBTSAN_SPEC "-ltsan" STATIC_LIBTSAN_LIBS LIBSAN_RPATH
 #endif
 #endif

@@ -793,7 +797,7 @@ proper position among the other output files.  */
 #endif

 #ifndef LIBLSAN_SPEC
-#define STATIC_LIBLSAN_LIBS \
+#define STATIC_LIBLSAN_LIBS LIBSAN_RPATH \
   " %{static-liblsan|static:%:include(libsanitizer.spec)%(link_liblsan)}"
 #ifdef LIBLSAN_EARLY_SPEC
 #define LIBLSAN_SPEC STATIC_LIBLSAN_LIBS
@@ -802,7 +806,7 @@ proper position among the other output files.  */
 "} -llsan %{static-liblsan:" LD_DYNAMIC_OPTION "}" \
 STATIC_LIBLSAN_LIBS
 #else
-#define LIBLSAN_SPEC "-llsan" STATIC_LIBLSAN_LIBS
+#define LIBLSAN_SPEC "-llsan" STATIC_LIBLSAN_LIBS LIBSAN_RPATH
 #endif
 #endif

@@ -816,9 +820,9 @@ proper position among the other output files.  */
 #ifdef HAVE_LD_STATIC_DYNAMIC
 #define LIBUBSAN_SPEC "%{static-libubsan:" LD_STATIC_OPTION \
 "} -lubsan %{static-libubsan:" LD_DYNAMIC_OPTION "}" \
-STATIC_LIBUBSAN_LIBS
+STATIC_LIBUBSAN_LIBS LIBSAN_RPATH
 #else
-#define LIBUBSAN_SPEC "-lubsan" STATIC_LIBUBSAN_LIBS
+#define LIBUBSAN_SPEC "-lubsan" STATIC_LIBUBSAN_LIBS LIBSAN_RPATH
 #endif
 #endif

diff --git a/libsanitizer/Makefile.in b/libsanitizer/Makefile.in
index 65e7f2e9553..ef71407a512 100644
--- a/libsanitizer/Makefile.in
+++ b/libsanitizer/Makefile.in
@@ -333,6 +333,7 @@ libexecdir = @libexecdir@
 link_libasan = @l

Re: [PATCH] Mips: Resolve build issues for the n32 ABI

2022-07-01 Thread Xi Ruoyao via Gcc-patches
Please configure your mail client to send the patch as plain text (not
HTML, and don't replace a tab with 8 spaces, etc.).  If that's not
possible, send the patch as an attachment.

Also, it's better to separate the changes in
https://reviews.llvm.org/D127098 and struct_kernel_stat_sz into two
patches: one that just says "cherry-pick LLVM commit aabbccdd...", and
another that changes struct_kernel_stat_sz.  That would make reviewing and
tracking the changes easier.

On Fri, 2022-07-01 at 08:18 +, Dimitrije Milosevic wrote:
> Building the ASAN for the n32 MIPS ABI currently fails, due to a few reasons:
> - defined(__mips64), which is set solely based on the architecture type 
> (32-bit/64-bit), was still used in some places.
> Therefore, defined(__mips64) is swapped with SANITIZER_MIPS64, which takes 
> the ABI into account as well - defined(__mips64) && _MIPS_SIM == ABI64.
> - The n32 ABI still uses 64-bit *Linux* system calls, even though the word 
> size is 32 bits.
> - After the transition to canonical system calls 
> (https://reviews.llvm.org/D124212), the n32 ABI still didn't use them, even 
> though they are supported,
> as per 
> https://github.com/torvalds/linux/blob/master/arch/mips/kernel/syscalls/syscall_n32.tbl.
> - struct_kernel_stat_sz was not updated after being changed in LLVM's source 
> tree.
> 
> See https://reviews.llvm.org/D127098.
>     
> libsanitizer/ChangeLog:
> 
>         * sanitizer_common/sanitizer_linux.cpp (defined): Resolve
>         ASAN build issues for the n32 ABI.
>         * sanitizer_common/sanitizer_platform.h (defined): Likewise.
>         * sanitizer_common/sanitizer_platform_limits_posix.h: Likewise.
> 
> ---
> 
>  libsanitizer/sanitizer_common/sanitizer_linux.cpp               | 17 
> ++---
>  libsanitizer/sanitizer_common/sanitizer_platform.h              |  2 +-
>  libsanitizer/sanitizer_common/sanitizer_platform_limits_posix.h |  2 +-
>  3 files changed, 12 insertions(+), 9 deletions(-)
> 
> diff --git a/libsanitizer/sanitizer_common/sanitizer_linux.cpp 
> b/libsanitizer/sanitizer_common/sanitizer_linux.cpp
> index e2c32d679ad..5ba033492e7 100644
> --- a/libsanitizer/sanitizer_common/sanitizer_linux.cpp
> +++ b/libsanitizer/sanitizer_common/sanitizer_linux.cpp
> @@ -34,7 +34,7 @@
>  // format. Struct kernel_stat is defined as 'struct stat' in asm/stat.h. To
>  // access stat from asm/stat.h, without conflicting with definition in
>  // sys/stat.h, we use this trick.
> -#if defined(__mips64)
> +#if SANITIZER_MIPS64
>  #include 
>  #include 
>  #define stat kernel_stat
> @@ -124,8 +124,9 @@ const int FUTEX_WAKE_PRIVATE = FUTEX_WAKE | 
> FUTEX_PRIVATE_FLAG;
>  // Are we using 32-bit or 64-bit Linux syscalls?
>  // x32 (which defines __x86_64__) has SANITIZER_WORDSIZE == 32
>  // but it still needs to use 64-bit syscalls.
> -#if SANITIZER_LINUX && (defined(__x86_64__) || defined(__powerpc64__) ||     
>   \
> -                        SANITIZER_WORDSIZE == 64)
> +#if SANITIZER_LINUX && (defined(__x86_64__) || defined(__powerpc64__) || \
> +                        SANITIZER_WORDSIZE == 64 ||                      \
> +                        (defined(__mips__) && _MIPS_SIM == _ABIN32))
>  # define SANITIZER_LINUX_USES_64BIT_SYSCALLS 1
>  #else
>  # define SANITIZER_LINUX_USES_64BIT_SYSCALLS 0
> @@ -289,7 +290,7 @@ static void stat64_to_stat(struct stat64 *in, struct stat 
> *out) {
>  }
>  #endif
>  
> -#if defined(__mips64)
> +#if SANITIZER_MIPS64
>  // Undefine compatibility macros from 
>  // so that they would not clash with the kernel_stat
>  // st_[a|m|c]time fields
> @@ -343,7 +344,8 @@ uptr internal_stat(const char *path, void *buf) {
>  #if SANITIZER_FREEBSD
>    return internal_syscall(SYSCALL(fstatat), AT_FDCWD, (uptr)path, (uptr)buf, 
> 0);
>  #    elif SANITIZER_LINUX
> -#      if SANITIZER_WORDSIZE == 64 || SANITIZER_X32
> +#      if SANITIZER_WORDSIZE == 64 || SANITIZER_X32 || \
> +          (defined(__mips__) && _MIPS_SIM == _ABIN32)
>    return internal_syscall(SYSCALL(newfstatat), AT_FDCWD, (uptr)path, 
> (uptr)buf,
>                            0);
>  #      else
> @@ -366,7 +368,8 @@ uptr internal_lstat(const char *path, void *buf) {
>    return internal_syscall(SYSCALL(fstatat), AT_FDCWD, (uptr)path, (uptr)buf,
>                            AT_SYMLINK_NOFOLLOW);
>  #    elif SANITIZER_LINUX
> -#      if defined(_LP64) || SANITIZER_X32
> +#      if defined(_LP64) || SANITIZER_X32 ||         \
> +          (defined(__mips__) && _MIPS_SIM == _ABIN32)
>    return internal_syscall(SYSCALL(newfstatat), AT_FDCWD, (uptr)path, 
> (uptr)buf,
>                            AT_SYMLINK_NOFOLLOW);
>  #      else
> @@ -1053,7 +1056,7 @@ uptr GetMaxVirtualAddress() {
>    return (1ULL << (MostSignificantSetBitIndex(GET_CURRENT_FRAME()) + 1)) - 1;
>  #elif SANITIZER_RISCV64
>    return (1ULL << 38) - 1;
> -# elif defined(__mips64)
> +# elif SANITIZER_MIPS64
>    return (1ULL << 40) - 1;  // 0x00ffUL;
>  # elif defined(__s390x__)
>    return (1ULL << 53) - 1;

Re: PING^1 [PATCH] inline: Rebuild target option node for caller [PR105459]

2022-07-01 Thread Richard Biener via Gcc-patches
On Thu, Jun 23, 2022 at 4:03 AM Kewen.Lin  wrote:
>
> Hi,
>
> Gentle ping https://gcc.gnu.org/pipermail/gcc-patches/2022-June/596212.html
>
> BR,
> Kewen
>
> on 2022/6/6 14:20, Kewen.Lin via Gcc-patches wrote:
> > Hi,
> >
> > PR105459 exposes one issue in inline_call handling that when it
> > decides to copy FP flags from callee to caller and rebuild the
> > optimization node for caller fndecl, it's possible that the target
> > option node is also necessary to be rebuilt.  Without updating
> > target option node early, it can make nodes share the same target
> > option node wrongly, later when we want to unshare it somewhere
> > (like in target hook) it can get unexpected results, like ICE on
> > uninitialized secondary member of target globals exposed in this PR.

I think that

  /* Reload global optimization flags.  */
  if (reload_optimization_node && DECL_STRUCT_FUNCTION (to->decl) == cfun)
set_cfun (cfun, true);

is supposed to do that via ix86_set_current_function which will eventually
re-build the target optimization node exactly for this reason.

But with LTO we arrive here during WPA time only and there cfun is NULL
(and so is DECL_STRUCT_FUNCTION (to->decl)), so the target doesn't
get the chance to fix things up here.

Now, it should be fine to delay this fixup until we set the cfun at LTRANS
time but there we run into

  if (old_tree != new_tree)
{
  cl_target_option_restore (&global_options, &global_options_set,
TREE_TARGET_OPTION (new_tree));
...
}
  else if (flag_unsafe_math_optimizations
   != TREE_TARGET_OPTION (new_tree)->x_ix86_unsafe_math_optimizations
   || (flag_excess_precision
   != TREE_TARGET_OPTION (new_tree)->x_ix86_excess_precision))
{
... FIXUP! ...

and old_tree != new_tree disables the fixup.

When we refactor the above to always consider the FP flag change (so apply it
lazily), then this fixes the testcase in the PR as well.  Thus something like
the attached.
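
A minimal sketch of that control-flow change, using toy stand-ins rather
than the real i386 code.  TargetNode, State and the two needs_fixup_*
functions are hypothetical names for illustration only; the point is just
that an else-if hides the flag comparison whenever the node changed.

```cpp
#include <cassert>

// Toy stand-in for a target option node; only the shadowed FP flag matters.
struct TargetNode { bool unsafe_math; };

// Toy stand-in for the global state consulted when switching functions.
struct State {
  bool flag_unsafe_math;      // value taken from the optimization node
  const TargetNode *current;  // target node currently in effect
};

// Existing shape: the flag comparison is only reached when the node is
// unchanged, so a changed node hides a stale flag.
bool needs_fixup_else_if(const State &s, const TargetNode *next) {
  if (s.current != next)
    return false;  // restore path taken, comparison skipped
  return s.flag_unsafe_math != next->unsafe_math;
}

// Refactored shape: the comparison is made on every switch.
bool needs_fixup_always(const State &s, const TargetNode *next) {
  return s.flag_unsafe_math != next->unsafe_math;
}

// Small self-check: a changed node with a stale flag is missed by the
// else-if shape but caught by the refactored one.
void self_check() {
  TargetNode a{false}, b{false};
  State s{true, &a};
  assert(!needs_fixup_else_if(s, &b));
  assert(needs_fixup_always(s, &b));
}
```

In this model the two shapes agree whenever the node is unchanged; they
differ only in the changed-node case, which is the one the PR hits.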

Ideally this stuff would be refactored to a target hook that can work without
the set_cfun, also working towards merging the target and optimization node
since they have to be kept in sync ...

I think your proposed patch makes another variant through the maze to
do something at WPA time but that makes it all even more complicated :/

Sorry for the delay btw.

Folks - any other opinions?

Thanks,
Richard.

> > Commit r12-3721 makes it get exact fp_expression info and causes
> > more optimization chances then exposes this issue.  Commit r11-5855
> > introduces two target options to shadow flag_excess_precision and
> > flag_unsafe_math_optimizations and shows the need to rebuild target
> > node in inline_call when optimization node changes.
> >
> > As commented in PR105459, I tried to postpone init_function_start
> > in cgraph_node::expand, but abandoned it since I thought it just
> > concealed the issue.  And I also tried to adjust the target node
> > when current function switching, but failed since we get the NULL
> > cfun and fndecl in WPA phase.
> >
> > Bootstrapped and regtested on x86_64-redhat-linux, powerpc64-linux-gnu
> > P8 and powerpc64le-linux-gnu P9.
> >
> > Any thoughts?  Is it OK for trunk?
> >
> > BR,
> > Kewen
> > -
> >
> >   PR tree-optimization/105459
> >
> > gcc/ChangeLog:
> >
> >   * ipa-inline-transform.cc (inline_call): Rebuild target option node
> >   once optimization node gets rebuilt.
> >
> > gcc/testsuite/ChangeLog:
> >
> >   * gcc.dg/lto/pr105459_0.c: New test.
> > ---
> >  gcc/ipa-inline-transform.cc   | 50 +--
> >  gcc/testsuite/gcc.dg/lto/pr105459_0.c | 35 +++
> >  2 files changed, 83 insertions(+), 2 deletions(-)
> >  create mode 100644 gcc/testsuite/gcc.dg/lto/pr105459_0.c
> >
> > diff --git a/gcc/ipa-inline-transform.cc b/gcc/ipa-inline-transform.cc
> > index 07288e57c73..edba58377f4 100644
> > --- a/gcc/ipa-inline-transform.cc
> > +++ b/gcc/ipa-inline-transform.cc
> > @@ -52,6 +52,7 @@ along with GCC; see the file COPYING3.  If not see
> >  #include "ipa-modref.h"
> >  #include "symtab-thunks.h"
> >  #include "symtab-clones.h"
> > +#include "target.h"
> >
> >  int ncalls_inlined;
> >  int nfunctions_inlined;
> > @@ -469,8 +470,53 @@ inline_call (struct cgraph_edge *e, bool 
> > update_original,
> >  }
> >
> >/* Reload global optimization flags.  */
> > -  if (reload_optimization_node && DECL_STRUCT_FUNCTION (to->decl) == cfun)
> > -set_cfun (cfun, true);
> > +  if (reload_optimization_node)
> > +{
> > +  /* Only need to check and update target option node
> > +  when target_option_default_node is not NULL.  */
> > +  if (target_option_default_node)
> > + {
> > +   /* Save the current context for optimization and target option
> > +  node.  */
> > +   tree old_optimize
> > + = build_optimization_node (&global_options, &global_options_set);
> > +   tree old_target_opt
> > + 

Re: [PATCH] configure: When host-shared, pass --with-pic to in-tree lib configs.

2022-07-01 Thread Richard Biener via Gcc-patches
On Sun, Jun 26, 2022 at 5:28 PM Iain Sandoe via Gcc-patches
 wrote:
>
> If we are building PIC/PIE host executables, and we are building dependent
> libs (e.g. GMP) in-tree those libs need to be configured to generate PIC code.
>
> At present, if an --enable-host-shared build is attempted on ELF platforms,
> with in-tree dependents, the build will fail with incompatible relocations.
> One can append --with-pic to the configure, but then that is applied
> everywhere, not just on the libraries that need it.
>
> Tested on  x86_64-linux-gnu "--enable-host-shared" and compared with an
> "--enable-host-shared --with-pic" version,
>
> OK for master?
> comments?

Looks reasonable to me, so go ahead (for trunk).

Richard.

> thanks
> Iain
>
> Signed-off-by: Iain Sandoe 
>
> ChangeLog:
>
> * Makefile.def: Pass host_libs_picflag to host dependent library
> configures.
> * Makefile.in: Regenerate.
> * configure: Regenerate.
> * configure.ac (host_libs_picflag): New configure variable set to
> '--with-pic' when building 'host_shared'.
> ---
>  Makefile.def |  15 +++---
>  Makefile.in  | 140 +--
>  configure|  11 
>  configure.ac |  10 
>  4 files changed, 99 insertions(+), 77 deletions(-)
>
> diff --git a/Makefile.def b/Makefile.def
> index 72d58549645..92239aebb57 100644
> --- a/Makefile.def
> +++ b/Makefile.def
> @@ -50,7 +50,7 @@ host_modules= { module= gcc; bootstrap=true;
> extra_make_flags="$(EXTRA_GCC_FLAGS)"; };
>  host_modules= { module= gmp; lib_path=.libs; bootstrap=true;
> // Work around in-tree gmp configure bug with missing flex.
> -   extra_configure_flags='--disable-shared LEX="touch lex.yy.c"';
> +   extra_configure_flags='--disable-shared LEX="touch lex.yy.c" @host_libs_picflag@';
> extra_make_flags='AM_CFLAGS="-DNO_ASM"';
> no_install= true;
> // none-*-* disables asm optimizations, bootstrap-testing
> @@ -60,21 +60,22 @@ host_modules= { module= gmp; lib_path=.libs; 
> bootstrap=true;
> // different from host for target.
> target="none-${host_vendor}-${host_os}"; };
>  host_modules= { module= mpfr; lib_path=src/.libs; bootstrap=true;
> -   extra_configure_flags='--disable-shared @extra_mpfr_configure_flags@';
> +   extra_configure_flags='--disable-shared @extra_mpfr_configure_flags@ @host_libs_picflag@';
> extra_make_flags='AM_CFLAGS="-DNO_ASM"';
> no_install= true; };
>  host_modules= { module= mpc; lib_path=src/.libs; bootstrap=true;
> -   extra_configure_flags='--disable-shared @extra_mpc_gmp_configure_flags@ @extra_mpc_mpfr_configure_flags@ --disable-maintainer-mode';
> +   extra_configure_flags='--disable-shared @extra_mpc_gmp_configure_flags@ @extra_mpc_mpfr_configure_flags@  @host_libs_picflag@ --disable-maintainer-mode';
> no_install= true; };
>  host_modules= { module= isl; lib_path=.libs; bootstrap=true;
> -   extra_configure_flags='--disable-shared @extra_isl_gmp_configure_flags@';
> +   extra_configure_flags='--disable-shared @extra_isl_gmp_configure_flags@  @host_libs_picflag@';
> extra_make_flags='V=1';
> no_install= true; };
>  host_modules= { module= libelf; lib_path=.libs; bootstrap=true;
> -   extra_configure_flags='--disable-shared';
> +   extra_configure_flags='--disable-shared  @host_libs_picflag@';
> no_install= true; };
>  host_modules= { module= gold; bootstrap=true; };
>  host_modules= { module= gprof; };
> +// intl acts on 'host_shared' directly, and does not support --with-pic.
>  host_modules= { module= intl; bootstrap=true; };
>  host_modules= { module= tcl;
>  missing=mostlyclean; };
> @@ -110,7 +111,7 @@ host_modules= { module= libiberty-linker-plugin; 
> bootstrap=true;
>  // We abuse missing to avoid installing anything for libiconv.
>  host_modules= { module= libiconv;
> bootstrap=true;
> -   extra_configure_flags='--disable-shared';
> +   extra_configure_flags='--disable-shared  @host_libs_picflag@';
> no_install= true;
> missing= pdf;
> missing= html;
> @@ -125,7 +126,7 @@ host_modules= { module= sim; };
>  host_modules= { module= texinfo; no_install= true; };
>  host_modules= { module= zlib; no_install=true; no_check=true;
> bootstrap=true;
> -   extra_configure_flags='@extra_host_zlib_configure_flags@';};
> +   extra_configure_flags='@extra_host_zlib_configure_flags@ @host_libs_picflag@';};
>  host_modules= { module= gnulib; };
>  host_modules= { module= gdbsupport; };
>  host_modules= { module= gdbserver; };
>
> diff --git a/configure.ac b/configure.ac
> index d

Re: [PATCH] libsanitizer: Fix linkage errors for cross toolchains

2022-07-01 Thread Xi Ruoyao via Gcc-patches
Again, please send the patch as plain text.

On Fri, 2022-07-01 at 08:18 +, Dimitrije Milosevic wrote:
> When we use cross toolchains, in which the GCC libraries are not
> installed within a designated system root, the shared sanitizer
> libraries link against libstdc++.so* within the same libraries.
> This directory, however, is not in RPATH, so attempting to build a
> dynamically linked application with -fsanitize=... gives a linkage
> error.
> More information can be found here:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69839.

Hmm... Is anyone really using a cross compiler without a sysroot?  PR
69839 is already closed as INVALID.  If you don't want to install GCC
runtime libraries into the sysroot (for example, for multiple GCC builds
using a shared sysroot), we have --enable-version-specific-runtime-libs
but unfortunately it's broken (PR 32415).

Please explain the actual use case so that we can work out whether there
is a better solution.

> gcc/ChangeLog:
>   * gcc.c (LIBSAN_RPATH): New macro.
>   (LIBASAN_SPEC): Add LIBSAN_RPATH.
>   (LIBUBSAN_SPEC): Likewise.
>  (LIBTSAN_SPEC): Likewise.
>  (LIBLSAN_SPEC): Likewise.
> 
> libsanitizer/ChangeLog:
>   * configure.ac (link_libsan_rpath): New config variable.
>   * libsanitizer.spec.in (link_libsan_rpath): New spec.
>   * configure (link_libsan_rpath): New config variable.
>   * Makefile.in (link_libsan_rpath): Define new Makefile
> variable.
>   * asan/Makefile.in: Likewise.
>   * interception/Makefile.in: Likewise.
>   * libbacktrace/Makefile.in: Likewise.
>   * lsan/Makefile.in: Likewise.
>   * sanitizer_common/Makefile.in: Likewise.
>   * tsan/Makefile.in: Likewise.
>   * ubsan/Makefile.in: Likewise.
>   * hwasan/Makefile.in: Likewise.
> 
> ---
> 
>  gcc/gcc.cc                                | 20 
>  libsanitizer/Makefile.in                  |  1 +
>  libsanitizer/asan/Makefile.in             |  1 +
>  libsanitizer/configure                    | 10 --
>  libsanitizer/configure.ac                 |  7 +++
>  libsanitizer/hwasan/Makefile.in           |  1 +
>  libsanitizer/interception/Makefile.in     |  1 +
>  libsanitizer/libbacktrace/Makefile.in     |  1 +
>  libsanitizer/libsanitizer.spec.in         |  2 ++
>  libsanitizer/lsan/Makefile.in             |  1 +
>  libsanitizer/sanitizer_common/Makefile.in |  1 +
>  libsanitizer/tsan/Makefile.in             |  1 +
>  libsanitizer/ubsan/Makefile.in            |  1 +
>  13 files changed, 38 insertions(+), 10 deletions(-)
> 
> 
> diff --git a/gcc/gcc.cc b/gcc/gcc.cc
> index 299e09c4f54..37ff75f1ad5 100644
> --- a/gcc/gcc.cc
> +++ b/gcc/gcc.cc
> @@ -738,17 +738,21 @@ proper position among the other output files.  */
>  #define STACK_SPLIT_SPEC " %{fsplit-stack: --wrap=pthread_create}"
>  #endif
>  
> +#ifndef LIBSAN_RPATH
> +#define LIBSAN_RPATH " %:include(libsanitizer.spec)%(link_libsan_rpath)"
> +#endif
> +
>  #ifndef LIBASAN_SPEC
>  #define STATIC_LIBASAN_LIBS \
>    " %{static-
> libasan|static:%:include(libsanitizer.spec)%(link_libasan)}"
>  #ifdef LIBASAN_EARLY_SPEC
> -#define LIBASAN_SPEC STATIC_LIBASAN_LIBS
> +#define LIBASAN_SPEC STATIC_LIBASAN_LIBS LIBSAN_RPATH
>  #elif defined(HAVE_LD_STATIC_DYNAMIC)
>  #define LIBASAN_SPEC "%{static-libasan:" LD_STATIC_OPTION \
>                      "} -lasan %{static-libasan:" LD_DYNAMIC_OPTION
> "}" \
>                      STATIC_LIBASAN_LIBS
>  #else
> -#define LIBASAN_SPEC "-lasan" STATIC_LIBASAN_LIBS
> +#define LIBASAN_SPEC "-lasan" STATIC_LIBASAN_LIBS LIBSAN_RPATH
>  #endif
>  #endif
>  
> @@ -778,13 +782,13 @@ proper position among the other output files.  */
>  #define STATIC_LIBTSAN_LIBS \
>    " %{static-
> libtsan|static:%:include(libsanitizer.spec)%(link_libtsan)}"
>  #ifdef LIBTSAN_EARLY_SPEC
> -#define LIBTSAN_SPEC STATIC_LIBTSAN_LIBS
> +#define LIBTSAN_SPEC STATIC_LIBTSAN_LIBS LIBSAN_RPATH
>  #elif defined(HAVE_LD_STATIC_DYNAMIC)
>  #define LIBTSAN_SPEC "%{static-libtsan:" LD_STATIC_OPTION \
>                      "} -ltsan %{static-libtsan:" LD_DYNAMIC_OPTION
> "}" \
>                      STATIC_LIBTSAN_LIBS
>  #else
> -#define LIBTSAN_SPEC "-ltsan" STATIC_LIBTSAN_LIBS
> +#define LIBTSAN_SPEC "-ltsan" STATIC_LIBTSAN_LIBS LIBSAN_RPATH
>  #endif
>  #endif
>  
> @@ -793,7 +797,7 @@ proper position among the other output files.  */
>  #endif
>  
>  #ifndef LIBLSAN_SPEC
> -#define STATIC_LIBLSAN_LIBS \
> +#define STATIC_LIBLSAN_LIBS LIBSAN_RPATH \
>    " %{static-
> liblsan|static:%:include(libsanitizer.spec)%(link_liblsan)}"
>  #ifdef LIBLSAN_EARLY_SPEC
>  #define LIBLSAN_SPEC STATIC_LIBLSAN_LIBS
> @@ -802,7 +806,7 @@ proper position among the other output files.  */
>                      "} -llsan %{static-liblsan:" LD_DYNAMIC_OPTION
> "}" \
>                      STATIC_LIBLSAN_LIBS
>  #else
> -#define LIBLSAN_SPEC "-llsan" STATIC_LIBLSAN_LIBS
> +#define LIBLSAN_SPEC "-llsan" STATIC_LIBLSAN_LIBS LIBSAN_RPATH

Re: [PATCH] wide-int: Fix up wi::shifted_mask [PR106144]

2022-07-01 Thread Richard Sandiford via Gcc-patches
Jakub Jelinek  writes:
> Hi!
>
> As the following self-test testcase shows, wi::shifted_mask sometimes
> doesn't create canonicalized wide_ints, which then fail to compare equal
> to canonicalized wide_ints with the same value.
> In particular, wi::mask (128, false, 128) gives { -1 } with len 1 and
> prec 128,
> while wi::shifted_mask (0, 128, false, 128) gives { -1, -1 } with len 2
> and prec 128.
> The problem is that the code is written with the assumption that there are
> 3 bit blocks (or 2 if start is 0), but doesn't consider the possibility
> where there are 2 bit blocks (or 1 if start is 0) where the highest block
> isn't present.  In that case, there is the optional block of negate ? 0 : -1
> elts, followed by just one elt (either one from the if (shift) or just
> negate ? -1 : 0) and the rest is implicit sign-extension.
> Only if end < prec there is 1 or more bits above it that have different bit
> value and so we need to emit all the elts till end and then one more elt.
>
> if (end == prec) would work too, because we have:
>   if (width > prec - start)
> width = prec - start;
>   unsigned int end = start + width;
> so end is guaranteed to be end <= prec, dunno what is preferred.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
>
> 2022-07-01  Jakub Jelinek  
>
>   PR middle-end/106144
>   * wide-int.cc (wi::shifted_mask): If end >= prec, return right after
>   emitting element for shift or if shift is 0 first element after start.
>   (wide_int_cc_tests): Add tests for equivalency of wi::mask and
>   wi::shifted_mask with 0 start.

OK, thanks, but could you also remove the "end < prec" condition from:

  else if (end < prec)
val[i++] = negate ? -1 : 0;
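
To make the canonical-form issue concrete, here is a toy model rather than
the real wide-int.cc code.  It assumes a value is stored as 64-bit blocks,
least significant first, with everything above the last stored block being
implicit sign-extension, and that the precision is a multiple of 64.
canonical_len is a hypothetical helper, not a GCC function.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Return the shortest length whose implicit sign-extension still
// reproduces every block of VAL.  A top block is redundant when it equals
// the sign-extension of the block below it.
std::size_t canonical_len(const std::vector<std::int64_t> &val) {
  std::size_t len = val.size();
  while (len > 1) {
    std::int64_t top = val[len - 1];
    std::int64_t sign_of_lower = val[len - 2] < 0 ? -1 : 0;
    if (top != sign_of_lower)
      break;  // the top block carries information, keep it
    --len;
  }
  return len;
}

// The two encodings from the report denote the same value, but only the
// one-block form is canonical in this model.
void self_check() {
  assert(canonical_len({-1}) == 1);
  assert(canonical_len({-1, -1}) == 1);
}
```

Under these assumptions { -1, -1 } at precision 128 shrinks to { -1 },
which is why a block-by-block comparison against the len 1 form fails
unless the producer stops emitting blocks once the rest is pure
sign-extension.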

Richard

> --- gcc/wide-int.cc.jj 2022-01-11 23:11:23.592273263 +0100
> +++ gcc/wide-int.cc   2022-06-30 20:41:25.506292687 +0200
> @@ -842,6 +842,13 @@ wi::shifted_mask (HOST_WIDE_INT *val, un
>   val[i++] = negate ? block : ~block;
>  }
>  
> +  if (end >= prec)
> +{
> +  if (!shift)
> + val[i++] = negate ? 0 : -1;
> +  return i;
> +}
> +
>while (i < end / HOST_BITS_PER_WIDE_INT)
>  /* 111 */
>  val[i++] = negate ? 0 : -1;
> @@ -2583,6 +2590,10 @@ wide_int_cc_tests ()
>run_all_wide_int_tests  ();
>test_overflow ();
>test_round_for_mask ();
> +  ASSERT_EQ (wi::mask (128, false, 128),
> +  wi::shifted_mask (0, 128, false, 128));
> +  ASSERT_EQ (wi::mask (128, true, 128),
> +  wi::shifted_mask (0, 128, true, 128));
>  }
>  
>  } // namespace selftest
>
>   Jakub


Re: [PATCH] tree-optimization/106131 - wrong code with FRE rewriting

2022-07-01 Thread Richard Sandiford via Gcc-patches
Richard Biener via Gcc-patches  writes:
> The following makes sure to not use the original TBAA type for
> looking up a value across an aggregate copy when we had to offset
> the read.
>
> Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed to trunk.
>
> 2022-06-30  Richard Biener  
>
>   PR tree-optimization/106131
>   * tree-ssa-sccvn.cc (vn_reference_lookup_3): Force alias-set
>   zero when offsetting the read looking through an aggregate
>   copy.
>
>   * g++.dg/torture/pr106131.C: New testcase.
> ---
>  gcc/testsuite/g++.dg/torture/pr106131.C | 34 +
>  gcc/tree-ssa-sccvn.cc   | 16 +---
>  2 files changed, 46 insertions(+), 4 deletions(-)
>  create mode 100644 gcc/testsuite/g++.dg/torture/pr106131.C
>
> diff --git a/gcc/testsuite/g++.dg/torture/pr106131.C 
> b/gcc/testsuite/g++.dg/torture/pr106131.C
> new file mode 100644
> index 000..e110f4a8fe6
> --- /dev/null
> +++ b/gcc/testsuite/g++.dg/torture/pr106131.C
> @@ -0,0 +1,34 @@
> +// { dg-do run { target c++11 } }
> +
> +struct Pair {
> +int a, b;
> +Pair(const Pair &) = default;
> +Pair(int _a, int _b) : a(_a), b(_b) {}
> +Pair &operator=(const Pair &z) {
> + a = z.a;
> + b = z.b;
> + return *this;
> +}
> +};
> +
> +const int &max(const int &a, const int &b)
> +{
> +  return a < b ? b : a;
> +}
> +
> +int foo(Pair x, Pair y)
> +{
> +  return max(x.b, y.b);
> +}
> +
> +int main()
> +{
> +  auto f = new Pair[3] {{0, -11}, {0, -8}, {0, 2}};
> +  for (int i = 0; i < 1; i++) {
> +  f[i] = f[0];
> +  if(i == 0)
> + f[i] = f[2];
> +  if (foo(f[i], f[1]) != 2)
> + __builtin_abort();
> +  }
> +}
> diff --git a/gcc/tree-ssa-sccvn.cc b/gcc/tree-ssa-sccvn.cc
> index 9deedeac378..76d92895a3a 100644
> --- a/gcc/tree-ssa-sccvn.cc
> +++ b/gcc/tree-ssa-sccvn.cc
> @@ -3243,12 +3243,12 @@ vn_reference_lookup_3 (ao_ref *ref, tree vuse, void 
> *data_,
>poly_int64 extra_off = 0;
>if (j == 0 && i >= 0
> && lhs_ops[0].opcode == MEM_REF
> -   && maybe_ne (lhs_ops[0].off, -1))
> +   && known_ne (lhs_ops[0].off, -1))
>   {
> if (known_eq (lhs_ops[0].off, vr->operands[i].off))
>   i--, j--;
> else if (vr->operands[i].opcode == MEM_REF
> -&& maybe_ne (vr->operands[i].off, -1))
> +&& known_ne (vr->operands[i].off, -1))

These changes don't look right.  If -1 is a special marker,
it should be tested with known_eq (positive) or maybe_ne (negative).

In other words, we should enter the if body if off is not the
compile-time constant -1.
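
As an illustration of the difference, here is a toy model, not GCC's
poly-int.h.  It assumes a value of the form c0 + c1 * x, where x is a
runtime quantity known only to be a non-negative integer.  Poly and the
four predicates below are simplified stand-ins that merely borrow the
familiar names.

```cpp
#include <cassert>

// Toy value c0 + c1 * x with x >= 0 unknown at compile time.
struct Poly { long c0, c1; };

// Equal to K for every x: only possible for a compile-time constant.
bool known_eq(Poly a, long k) { return a.c1 == 0 && a.c0 == k; }

// Different from K for at least one x.
bool maybe_ne(Poly a, long k) { return !known_eq(a, k); }

// Equal to K for at least one x >= 0.
bool maybe_eq(Poly a, long k) {
  if (a.c1 == 0)
    return a.c0 == k;
  long d = k - a.c0;
  return d % a.c1 == 0 && d / a.c1 >= 0;
}

// Different from K for every x.
bool known_ne(Poly a, long k) { return !maybe_eq(a, k); }

// The marker itself, and a variable offset that happens to hit -1 at x == 1.
void self_check() {
  Poly marker{-1, 0};
  Poly variable{-17, 16};
  assert(!maybe_ne(marker, -1) && !known_ne(marker, -1));
  assert(maybe_ne(variable, -1) && !known_ne(variable, -1));
}
```

In this model the two tests agree for every compile-time constant and
differ only for a variable offset that could equal -1 for some x: maybe_ne
accepts it as "not the marker", while known_ne rejects it.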

Thanks,
Richard

>   {
> extra_off = vr->operands[i].off - lhs_ops[0].off;
> i--, j--;
> @@ -3275,6 +3275,7 @@ vn_reference_lookup_3 (ao_ref *ref, tree vuse, void 
> *data_,
>copy_reference_ops_from_ref (rhs1, &rhs);
>  
>/* Apply an extra offset to the inner MEM_REF of the RHS.  */
> +  bool force_no_tbaa = false;
>if (maybe_ne (extra_off, 0))
>   {
> if (rhs.length () < 2)
> @@ -3287,6 +3288,10 @@ vn_reference_lookup_3 (ao_ref *ref, tree vuse, void 
> *data_,
> rhs[ix].op0 = int_const_binop (PLUS_EXPR, rhs[ix].op0,
>build_int_cst (TREE_TYPE (rhs[ix].op0),
>   extra_off));
> +   /* When we have offsetted the RHS, reading only parts of it,
> +  we can no longer use the original TBAA type, force alias-set
> +  zero.  */
> +   force_no_tbaa = true;
>   }
>  
>/* Save the operands since we need to use the original ones for
> @@ -3339,8 +3344,11 @@ vn_reference_lookup_3 (ao_ref *ref, tree vuse, void 
> *data_,
>/* Adjust *ref from the new operands.  */
>ao_ref rhs1_ref;
>ao_ref_init (&rhs1_ref, rhs1);
> -  if (!ao_ref_init_from_vn_reference (&r, ao_ref_alias_set (&rhs1_ref),
> -   ao_ref_base_alias_set (&rhs1_ref),
> +  if (!ao_ref_init_from_vn_reference (&r,
> +   force_no_tbaa ? 0
> +   : ao_ref_alias_set (&rhs1_ref),
> +   force_no_tbaa ? 0
> +   : ao_ref_base_alias_set (&rhs1_ref),
> vr->type, vr->operands))
>   return (void *)-1;
>/* This can happen with bitfields.  */


Re: PING^1 [PATCH] inline: Rebuild target option node for caller [PR105459]

2022-07-01 Thread Jan Hubicka via Gcc-patches
> On Thu, Jun 23, 2022 at 4:03 AM Kewen.Lin  wrote:
> >
> > Hi,
> >
> > Gentle ping https://gcc.gnu.org/pipermail/gcc-patches/2022-June/596212.html
> >
> > BR,
> > Kewen
> >
> > on 2022/6/6 14:20, Kewen.Lin via Gcc-patches wrote:
> > > Hi,
> > >
> > > PR105459 exposes one issue in inline_call handling that when it
> > > decides to copy FP flags from callee to caller and rebuild the
> > > optimization node for caller fndecl, it's possible that the target
> > > option node is also necessary to be rebuilt.  Without updating
> > > target option node early, it can make nodes share the same target
> > > option node wrongly, later when we want to unshare it somewhere
> > > (like in target hook) it can get unexpected results, like ICE on
> > > uninitialized secondary member of target globals exposed in this PR.
> 
> I think that
> 
>   /* Reload global optimization flags.  */
>   if (reload_optimization_node && DECL_STRUCT_FUNCTION (to->decl) == cfun)
> set_cfun (cfun, true);
> 
> is supposed to do that via ix86_set_current_function which will eventually
> re-build the target optimization node exactly for this reason.
> 
> But with LTO we arrive here during WPA time only and there cfun is NULL
> (and so is DECL_STRUCT_FUNCTION (to->decl)), so the target doesn't
> get the chance to fix things up here.

I see this logic was added by Martin in 2017 and it indeed looks a bit
odd, but I suppose it is intended primarily for the early inliner, where
cfun is non-NULL and we really need to update global state.
> 
> Now, it should be fine to delay this fixup until we set the cfun at LTRANS
> time but there we run into
> 
>   if (old_tree != new_tree)
> {
>   cl_target_option_restore (&global_options, &global_options_set,
> TREE_TARGET_OPTION (new_tree));
> ...
> }
>   else if (flag_unsafe_math_optimizations
>!= TREE_TARGET_OPTION (new_tree)->x_ix86_unsafe_math_optimizations
>|| (flag_excess_precision
>!= TREE_TARGET_OPTION (new_tree)->x_ix86_excess_precision))
> {
> ... FIXUP! ...
> 
> and old_tree != new_tree disables the fixup.
> 
> When we refactor the above to always consider the FP flag change (so apply it
> lazily), then this fixes the testcase in the PR as well.  Thus something like
> the attached.
> 
> Ideally this stuff would be refactored to a target hook that can work without
> the set_cfun, also working towards merging the target and optimization node
> since they have to be kept in sync ...
> 
> I think your proposed patch makes another variant through the maze to
> do something at WPA time but that makes it all even more complicated :/
> 
> Sorry for the delay btw.
> 
> Folks - any other opinions?

Your patch looks reasonable to me... Indeed, working on nodes directly
would be nicer, but that means bigger surgery in the optimization
handling, right?

Thanks,
Honza
> 
> Thanks,
> Richard.
> 
> > > Commit r12-3721 makes it get exact fp_expression info and causes
> > > more optimization chances then exposes this issue.  Commit r11-5855
> > > introduces two target options to shadow flag_excess_precision and
> > > flag_unsafe_math_optimizations and shows the need to rebuild target
> > > node in inline_call when optimization node changes.
> > >
> > > As commented in PR105459, I tried to postpone init_function_start
> > > in cgraph_node::expand, but abandoned it since I thought it just
> > > concealed the issue.  And I also tried to adjust the target node
> > > when current function switching, but failed since we get the NULL
> > > cfun and fndecl in WPA phase.
> > >
> > > Bootstrapped and regtested on x86_64-redhat-linux, powerpc64-linux-gnu
> > > P8 and powerpc64le-linux-gnu P9.
> > >
> > > Any thoughts?  Is it OK for trunk?
> > >
> > > BR,
> > > Kewen
> > > -
> > >
> > >   PR tree-optimization/105459
> > >
> > > gcc/ChangeLog:
> > >
> > >   * ipa-inline-transform.cc (inline_call): Rebuild target option node
> > >   once optimization node gets rebuilt.
> > >
> > > gcc/testsuite/ChangeLog:
> > >
> > >   * gcc.dg/lto/pr105459_0.c: New test.
> > > ---
> > >  gcc/ipa-inline-transform.cc   | 50 +--
> > >  gcc/testsuite/gcc.dg/lto/pr105459_0.c | 35 +++
> > >  2 files changed, 83 insertions(+), 2 deletions(-)
> > >  create mode 100644 gcc/testsuite/gcc.dg/lto/pr105459_0.c
> > >
> > > diff --git a/gcc/ipa-inline-transform.cc b/gcc/ipa-inline-transform.cc
> > > index 07288e57c73..edba58377f4 100644
> > > --- a/gcc/ipa-inline-transform.cc
> > > +++ b/gcc/ipa-inline-transform.cc
> > > @@ -52,6 +52,7 @@ along with GCC; see the file COPYING3.  If not see
> > >  #include "ipa-modref.h"
> > >  #include "symtab-thunks.h"
> > >  #include "symtab-clones.h"
> > > +#include "target.h"
> > >
> > >  int ncalls_inlined;
> > >  int nfunctions_inlined;
> > > @@ -469,8 +470,53 @@ inline_call (struct cgraph_edge *e, bool 
> > > update_original,
> > >  }
> > >
> > >/* Reload global opti

Re: [PATCH] tree-optimization/106131 - wrong code with FRE rewriting

2022-07-01 Thread Richard Biener via Gcc-patches
On Fri, 1 Jul 2022, Richard Sandiford wrote:

> Richard Biener via Gcc-patches  writes:
> > The following makes sure to not use the original TBAA type for
> > looking up a value across an aggregate copy when we had to offset
> > the read.
> >
> > Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed to trunk.
> >
> > 2022-06-30  Richard Biener  
> >
> > PR tree-optimization/106131
> > * tree-ssa-sccvn.cc (vn_reference_lookup_3): Force alias-set
> > zero when offsetting the read looking through an aggregate
> > copy.
> >
> > * g++.dg/torture/pr106131.C: New testcase.
> > ---
> >  gcc/testsuite/g++.dg/torture/pr106131.C | 34 +
> >  gcc/tree-ssa-sccvn.cc   | 16 +---
> >  2 files changed, 46 insertions(+), 4 deletions(-)
> >  create mode 100644 gcc/testsuite/g++.dg/torture/pr106131.C
> >
> > diff --git a/gcc/testsuite/g++.dg/torture/pr106131.C 
> > b/gcc/testsuite/g++.dg/torture/pr106131.C
> > new file mode 100644
> > index 000..e110f4a8fe6
> > --- /dev/null
> > +++ b/gcc/testsuite/g++.dg/torture/pr106131.C
> > @@ -0,0 +1,34 @@
> > +// { dg-do run { target c++11 } }
> > +
> > +struct Pair {
> > +int a, b;
> > +Pair(const Pair &) = default;
> > +Pair(int _a, int _b) : a(_a), b(_b) {}
> > +Pair &operator=(const Pair &z) {
> > +   a = z.a;
> > +   b = z.b;
> > +   return *this;
> > +}
> > +};
> > +
> > +const int &max(const int &a, const int &b)
> > +{
> > +  return a < b ? b : a;
> > +}
> > +
> > +int foo(Pair x, Pair y)
> > +{
> > +  return max(x.b, y.b);
> > +}
> > +
> > +int main()
> > +{
> > +  auto f = new Pair[3] {{0, -11}, {0, -8}, {0, 2}};
> > +  for (int i = 0; i < 1; i++) {
> > +  f[i] = f[0];
> > +  if(i == 0)
> > +   f[i] = f[2];
> > +  if (foo(f[i], f[1]) != 2)
> > +   __builtin_abort();
> > +  }
> > +}
> > diff --git a/gcc/tree-ssa-sccvn.cc b/gcc/tree-ssa-sccvn.cc
> > index 9deedeac378..76d92895a3a 100644
> > --- a/gcc/tree-ssa-sccvn.cc
> > +++ b/gcc/tree-ssa-sccvn.cc
> > @@ -3243,12 +3243,12 @@ vn_reference_lookup_3 (ao_ref *ref, tree vuse, void 
> > *data_,
> >poly_int64 extra_off = 0;
> >if (j == 0 && i >= 0
> >   && lhs_ops[0].opcode == MEM_REF
> > - && maybe_ne (lhs_ops[0].off, -1))
> > + && known_ne (lhs_ops[0].off, -1))
> > {
> >   if (known_eq (lhs_ops[0].off, vr->operands[i].off))
> > i--, j--;
> >   else if (vr->operands[i].opcode == MEM_REF
> > -  && maybe_ne (vr->operands[i].off, -1))
> > +  && known_ne (vr->operands[i].off, -1))
> 
> These changes don't look right.  If -1 is a special marker,
> it should be tested with known_eq (positive) or maybe_ne (negative).
> 
> In other words, we should enter the if body if off is not the
> compile-time constant -1.

Hmm, to me 'known_ne' was visually more correct (only if 'off' is
an actual offset we may treat it as such).  Yes, -1 is a special
marker but still.  Practically of course
known_ne (poly, integer constant) == maybe_ne (poly, integer_constant)?
That is, if we'd have a poly-int that might be -1 we should still
not use that as 'off'?

I can of course revert that hunk.

Thanks,
Richard.

> Thanks,
> Richard
> 
> > {
> >   extra_off = vr->operands[i].off - lhs_ops[0].off;
> >   i--, j--;
> > @@ -3275,6 +3275,7 @@ vn_reference_lookup_3 (ao_ref *ref, tree vuse, void 
> > *data_,
> >copy_reference_ops_from_ref (rhs1, &rhs);
> >  
> >/* Apply an extra offset to the inner MEM_REF of the RHS.  */
> > +  bool force_no_tbaa = false;
> >if (maybe_ne (extra_off, 0))
> > {
> >   if (rhs.length () < 2)
> > @@ -3287,6 +3288,10 @@ vn_reference_lookup_3 (ao_ref *ref, tree vuse, void 
> > *data_,
> >   rhs[ix].op0 = int_const_binop (PLUS_EXPR, rhs[ix].op0,
> >  build_int_cst (TREE_TYPE (rhs[ix].op0),
> > extra_off));
> > + /* When we have offsetted the RHS, reading only parts of it,
> > +we can no longer use the original TBAA type, force alias-set
> > +zero.  */
> > + force_no_tbaa = true;
> > }
> >  
> >/* Save the operands since we need to use the original ones for
> > @@ -3339,8 +3344,11 @@ vn_reference_lookup_3 (ao_ref *ref, tree vuse, void 
> > *data_,
> >/* Adjust *ref from the new operands.  */
> >ao_ref rhs1_ref;
> >ao_ref_init (&rhs1_ref, rhs1);
> > -  if (!ao_ref_init_from_vn_reference (&r, ao_ref_alias_set (&rhs1_ref),
> > - ao_ref_base_alias_set (&rhs1_ref),
> > +  if (!ao_ref_init_from_vn_reference (&r,
> > + force_no_tbaa ? 0
> > + : ao_ref_alias_set (&rhs1_ref),
> > + force_no_tbaa ? 0
> > + : ao_ref_base_alias_set (&rhs1_ref),
> >  

Re: PING^1 [PATCH] inline: Rebuild target option node for caller [PR105459]

2022-07-01 Thread Kewen.Lin via Gcc-patches
Hi Richi,

Thanks for the insightful comments!

on 2022/7/1 16:40, Richard Biener wrote:
> On Thu, Jun 23, 2022 at 4:03 AM Kewen.Lin  wrote:
>>
>> Hi,
>>
>> Gentle ping https://gcc.gnu.org/pipermail/gcc-patches/2022-June/596212.html
>>
>> BR,
>> Kewen
>>
>> on 2022/6/6 14:20, Kewen.Lin via Gcc-patches wrote:
>>> Hi,
>>>
>>> PR105459 exposes one issue in inline_call handling that when it
>>> decides to copy FP flags from callee to caller and rebuild the
>>> optimization node for caller fndecl, it's possible that the target
>>> option node is also necessary to be rebuilt.  Without updating
>>> target option node early, it can make nodes share the same target
>>> option node wrongly, later when we want to unshare it somewhere
>>> (like in target hook) it can get unexpected results, like ICE on
>>> uninitialized secondary member of target globals exposed in this PR.
> 
> I think that
> 
>   /* Reload global optimization flags.  */
>   if (reload_optimization_node && DECL_STRUCT_FUNCTION (to->decl) == cfun)
> set_cfun (cfun, true);
> 
> is supposed to do that via ix86_set_current_function which will eventually
> re-build the target optimization node exactly for this reason.
> 
> But with LTO we arrive here during WPA time only and there cfun is NULL
> (and so is DECL_STRUCT_FUNCTION (to->decl)), so the target doesn't
> get the chance to fix things up here.
> 

Yes, when I read this code, I was wondering whether we could do something
similar to get the hook to update the target node, but as you pointed out,
cfun is NULL here.  :(

> Now, it should be fine to delay this fixup until we set the cfun at LTRANS
> time but there we run into
> 
>   if (old_tree != new_tree)
> {
>   cl_target_option_restore (&global_options, &global_options_set,
> TREE_TARGET_OPTION (new_tree));
> ...
> }
>   else if (flag_unsafe_math_optimizations
>!= TREE_TARGET_OPTION (new_tree)->x_ix86_unsafe_math_optimizations
>|| (flag_excess_precision
>!= TREE_TARGET_OPTION (new_tree)->x_ix86_excess_precision))
> {
> ... FIXUP! ...
> 
> and old_tree != new_tree disables the fixup.
> 
> When we refactor the above to always consider the FP flag change (so apply it
> lazily), then this fixes the testcase in the PR as well.  Thus something like
> the attached.

Good idea!  Previously, following the code for reload_optimization_node, I
thought it would be good to bring the target node information up to date at the
same time, but your proposal of delaying the fixup until LTRANS looks better.
IIUC, WPA passes won't care whether this information is out of date.
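
The control-flow point under discussion can be illustrated with a toy model. Everything below is hypothetical: the structs and both functions are stand-ins, not the i386 back end and not the attached patch. The sketch only shows why an `else if` skips the FP-flag fixup whenever the target node changed, and how checking the flags unconditionally (and therefore lazily, once cfun is set) avoids that.

```cpp
// Toy stand-ins for the global optimization flags and for the copies of
// those flags shadowed in a target option node.  None of these names are GCC's.
struct TargetNode {
  bool unsafe_math;
  int excess_precision;
};

struct GlobalFlags {
  bool unsafe_math;
  int excess_precision;
};

// True if the shadowed copies in the target node are stale.
inline bool fp_flags_differ(const GlobalFlags &g, const TargetNode &t) {
  return g.unsafe_math != t.unsafe_math
         || g.excess_precision != t.excess_precision;
}

// Shape of the existing logic: the fixup sits in an "else if", so it is
// only reached when the node is unchanged.
inline bool fixup_runs_before(const TargetNode *old_node,
                              const TargetNode *new_node,
                              const GlobalFlags &g) {
  if (old_node != new_node)
    return false;  // restore path taken, fixup skipped
  return fp_flags_differ(g, *new_node);
}

// Shape of the proposed refactoring: always consider the FP flag change,
// whether or not the node changed.
inline bool fixup_runs_after(const TargetNode * /*old_node*/,
                             const TargetNode *new_node,
                             const GlobalFlags &g) {
  return fp_flags_differ(g, *new_node);
}
```

With two distinct nodes whose shadowed flags are stale, `fixup_runs_before` reports that no fixup happens while `fixup_runs_after` reports that it does.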

> 
> Ideally this stuff would be refactored to a target hook that can work without
> the set_cfun, also working towards merging the target and optimization node
> since they have to be kept in sync ...
> 
> I think your proposed patch makes another variant through the maze to
> do something at WPA time but that makes it all even more complicated :/
> 
> Sorry for the delay btw.
> 

No problem!  Thanks again!!

BR,
Kewen


Re: [PATCH] tree-optimization/106131 - wrong code with FRE rewriting

2022-07-01 Thread Richard Sandiford via Gcc-patches
Richard Biener  writes:
> On Fri, 1 Jul 2022, Richard Sandiford wrote:
>
>> Richard Biener via Gcc-patches  writes:
>> > The following makes sure to not use the original TBAA type for
>> > looking up a value across an aggregate copy when we had to offset
>> > the read.
>> >
>> > Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed to trunk.
>> >
>> > 2022-06-30  Richard Biener  
>> >
>> >PR tree-optimization/106131
>> >* tree-ssa-sccvn.cc (vn_reference_lookup_3): Force alias-set
>> >zero when offsetting the read looking through an aggregate
>> >copy.
>> >
>> >* g++.dg/torture/pr106131.C: New testcase.
>> > ---
>> >  gcc/testsuite/g++.dg/torture/pr106131.C | 34 +
>> >  gcc/tree-ssa-sccvn.cc   | 16 +---
>> >  2 files changed, 46 insertions(+), 4 deletions(-)
>> >  create mode 100644 gcc/testsuite/g++.dg/torture/pr106131.C
>> >
>> > diff --git a/gcc/testsuite/g++.dg/torture/pr106131.C 
>> > b/gcc/testsuite/g++.dg/torture/pr106131.C
>> > new file mode 100644
>> > index 000..e110f4a8fe6
>> > --- /dev/null
>> > +++ b/gcc/testsuite/g++.dg/torture/pr106131.C
>> > @@ -0,0 +1,34 @@
>> > +// { dg-do run { target c++11 } }
>> > +
>> > +struct Pair {
>> > +int a, b;
>> > +Pair(const Pair &) = default;
>> > +Pair(int _a, int _b) : a(_a), b(_b) {}
>> > +Pair &operator=(const Pair &z) {
>> > +  a = z.a;
>> > +  b = z.b;
>> > +  return *this;
>> > +}
>> > +};
>> > +
>> > +const int &max(const int &a, const int &b)
>> > +{
>> > +  return a < b ? b : a;
>> > +}
>> > +
>> > +int foo(Pair x, Pair y)
>> > +{
>> > +  return max(x.b, y.b);
>> > +}
>> > +
>> > +int main()
>> > +{
>> > +  auto f = new Pair[3] {{0, -11}, {0, -8}, {0, 2}};
>> > +  for (int i = 0; i < 1; i++) {
>> > +  f[i] = f[0];
>> > +  if(i == 0)
>> > +  f[i] = f[2];
>> > +  if (foo(f[i], f[1]) != 2)
>> > +  __builtin_abort();
>> > +  }
>> > +}
>> > diff --git a/gcc/tree-ssa-sccvn.cc b/gcc/tree-ssa-sccvn.cc
>> > index 9deedeac378..76d92895a3a 100644
>> > --- a/gcc/tree-ssa-sccvn.cc
>> > +++ b/gcc/tree-ssa-sccvn.cc
>> > @@ -3243,12 +3243,12 @@ vn_reference_lookup_3 (ao_ref *ref, tree vuse, 
>> > void *data_,
>> >poly_int64 extra_off = 0;
>> >if (j == 0 && i >= 0
>> >  && lhs_ops[0].opcode == MEM_REF
>> > -&& maybe_ne (lhs_ops[0].off, -1))
>> > +&& known_ne (lhs_ops[0].off, -1))
>> >{
>> >  if (known_eq (lhs_ops[0].off, vr->operands[i].off))
>> >i--, j--;
>> >  else if (vr->operands[i].opcode == MEM_REF
>> > - && maybe_ne (vr->operands[i].off, -1))
>> > + && known_ne (vr->operands[i].off, -1))
>> 
>> These changes don't look right.  If -1 is a special marker,
>> it should be tested with known_eq (positive) or maybe_ne (negative).
>> 
>> In other words, we should enter the if body if off is not the
>> compile-time constant -1.
>
> Hmm, to me 'known_ne' was visually more correct (only if 'off' is
> an actual offset we may treat it as such).

That's also the intention behind maybe_ne though.  The point is that
-1 isn't really a number on a scale [M, N] (M < -1, N > -1).  It's
just a C way of representing "nothing" in an "X or nothing" (std::optional).

So whether an offset happens to be numerically equal to -1 at runtime
isn't relevant.  A runtime off is an X in the "X or nothing" and so
needs to be treated in the same way as other Xs.

To put it another way: known_ne is harder to prove than maybe_ne.
known_ne must return false if it can't prove the arguments are
different for all possible combinations of indeterminates.
maybe_ne is instead the direct opposite of known_eq.

Another analogy might be: suppose that we used a -1 INTEGER_CST
as a special marker.  (Not a good choice, but bear with me.)
If we wanted to test whether a tree was this special marker,
we'd use integer_minus_onep.  To test that it isn't the marker
we'd use !integer_minus_onep.  known_ne would instead be the
equivalent of:

   integer_zerop (fold_build2 (boolean_type_node, NE_EXPR,
   x, ...-1 node...))

and so a general x would be treated in the same way as -1.

> Yes, -1 is a special
> marker but still.  Practically of course
> known_ne (poly, integer constant) == maybe_ne (poly, integer_constant)?

Not in general.  E.g. if VL is the SVE vector length in bytes:

  maybe_ne (VL, 16)

is true (VL can be [1,16]*16) but:

  known_ne (VL, 16)

is false (VL might be 16 but might not).
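
The distinction can be checked with a small self-contained model. This is a sketch under stated assumptions, not GCC's poly-int.h: `Poly` is a hypothetical two-coefficient value c0 + c1 * x with a single runtime indeterminate x assumed to be an integer >= 0, and the four predicates are reimplemented here from their descriptions above. The offset `-17 + 16x` is an invented example of a runtime offset that could happen to evaluate to -1.

```cpp
// Simplified model: value = c0 + c1 * x, with x an integer >= 0 that is
// only known at run time.  Illustrative only; not GCC's poly_int.
struct Poly {
  long c0;
  long c1;
};

// True if a and b are equal for every permitted x.
constexpr bool known_eq(Poly a, Poly b) {
  return a.c0 == b.c0 && a.c1 == b.c1;
}

// Direct opposite of known_eq: a and b differ for at least one x.
constexpr bool maybe_ne(Poly a, Poly b) { return !known_eq(a, b); }

// True if a and b are equal for at least one permitted x.
constexpr bool maybe_eq(Poly a, Poly b) {
  if (a.c1 == b.c1)
    return a.c0 == b.c0;
  // Solve a.c0 + a.c1 * x == b.c0 + b.c1 * x for an integer x >= 0.
  long num = b.c0 - a.c0;
  long den = a.c1 - b.c1;
  return num % den == 0 && num / den >= 0;
}

// Must hold for every permitted x, so it is the harder property to prove.
constexpr bool known_ne(Poly a, Poly b) { return !maybe_eq(a, b); }

constexpr Poly VL{16, 16};            // SVE vector length in bytes: 16 + 16x
constexpr Poly sixteen{16, 0};        // the constant 16
constexpr Poly marker{-1, 0};         // the compile-time constant -1
constexpr Poly const_off{8, 0};       // an ordinary constant offset
constexpr Poly runtime_off{-17, 16};  // hypothetical runtime offset

static_assert(maybe_ne(VL, sixteen));
static_assert(!known_ne(VL, sixteen));

// maybe_ne accepts every value that is not literally the marker ...
static_assert(maybe_ne(const_off, marker));
static_assert(maybe_ne(runtime_off, marker));
static_assert(!maybe_ne(marker, marker));

// ... whereas known_ne rejects the runtime offset together with the marker.
static_assert(known_ne(const_off, marker));
static_assert(!known_ne(runtime_off, marker));
```

In this model `maybe_ne (off, -1)` plays the role of "off is not the marker", while `known_ne (off, -1)` also rejects a genuine runtime offset merely because it could evaluate to -1 for some x.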

> That is, if we'd have a poly-int that might be -1 we should still
> not use that as 'off'?

known_ne might be safe in this particular instance if no runtime offset
would ever evaluate to -1.  But IMO it's harder to reason about and
less obviously safe.

Thanks,
Richard

> I can of course revert that hunk.
>
> Thanks,
> Richard.
>
>> Thanks,
>> Richard
>> 
>> >{
>> >  extra_off = vr->operands[i].off - lhs_ops[0].off;
>> >  i--, j--;
>> > @@ -3275,6 +3275,7 @@ vn_reference_lookup_3 (ao

[Committed] Add constraints to new andn_doubleword_bmi pattern in i386.md.

2022-07-01 Thread Roger Sayle

Many thanks to Uros for spotting that I'd forgotten to add constraints
to the new define_insn_and_split *andn_doubleword_bmi when moving it
from pre-reload to post-reload.  I've pushed this obvious fix after a
make bootstrap on x86_64-pc-linux-gnu.  Sorry for the inconvenience to
anyone building the tree with a non-default architecture that enables
BMI.


2022-07-01  Roger Sayle  
Uroš Bizjak  

gcc/ChangeLog
* config/i386/i386.md (*andn3_doubleword_bmi): Add constraints
to post-reload define_insn_and_split.


Roger
--

diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index 3401814..352a21c 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -10405,10 +10405,10 @@
 })
 
 (define_insn_and_split "*andn3_doubleword_bmi"
-  [(set (match_operand: 0 "register_operand")
+  [(set (match_operand: 0 "register_operand" "=r")
(and:
- (not: (match_operand: 1 "register_operand"))
- (match_operand: 2 "nonimmediate_operand")))
+ (not: (match_operand: 1 "register_operand" "0"))
+ (match_operand: 2 "nonimmediate_operand" "ro")))
(clobber (reg:CC FLAGS_REG))]
   "TARGET_BMI"
   "#"


Re: [PATCH][AArch64] Implement ACLE Data Intrinsics

2022-07-01 Thread Andre Vieira (lists) via Gcc-patches


On 29/06/2022 08:18, Richard Sandiford wrote:

>> +  break;
>> +case AARCH64_RBIT:
>> +case AARCH64_RBITL:
>> +case AARCH64_RBITLL:
>> +  if (mode == SImode)
>> +   icode = CODE_FOR_aarch64_rbitsi;
>> +  else
>> +   icode = CODE_FOR_aarch64_rbitdi;
>> +  break;
>> +default:
>> +  gcc_unreachable ();
>> +}
>> +  expand_insn (icode, 2, ops);
>> +  return target;
> This needs to return ops[0].value instead, since "target" just suggests
> a possible location.
>
> Could you add tests for a memory source and memory destination, e.g.:
>
> void test_clz_mem (uint32_t *a)
> {
>    *a = __clz (*a);
> }
>
> Without tests like that, these comments probably just sound like a paper
> exercise, but they should make a difference for memory sources (previous
> review) and memory destinations (this round).

I had locally tested it (with rev though because clz doesn't use that
code) and strangely it does seem to work for the memory destinations,
but that's just a simple test.
It could very well go wrong with some more complex codegen, so I'll just
take your word and use ops[0].value.

And yeah I didn't add the tests at the time, don't really know why, I'll
chuck it down to laziness :P



>> diff --git a/gcc/config/aarch64/arm_acle.h b/gcc/config/aarch64/arm_acle.h
>> index 9775a48c65825b424d3eb442384f5ab87b734fd7..a044bc74553fcf2a49b71290083f3f072fd5a2ce 100644
>> --- a/gcc/config/aarch64/arm_acle.h
>> +++ b/gcc/config/aarch64/arm_acle.h
>> @@ -28,6 +28,7 @@
>>   #define _GCC_ARM_ACLE_H
>>
>>   #include 
>> +#include 
>>
>>   #pragma GCC aarch64 "arm_acle.h"
>>
>> @@ -35,6 +36,58 @@
>>   extern "C" {
>>   #endif
>>
>> +#define _GCC_ARM_ACLE_ROR_FN(NAME, TYPE)  \
>> +__extension__ extern __inline TYPE   \
>> +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))  \
>> +NAME (TYPE __value, uint32_t __rotate)   \
>> +{\
>> +  size_t __size = sizeof (TYPE) * __CHAR_BIT__;  \
>> +  __rotate = __rotate % __size;  \
>> +  return __value >> __rotate | __value << ((__size - __rotate) % __size); \
>> +}
>> +
>> +_GCC_ARM_ACLE_ROR_FN (__ror, uint32_t)
>> +_GCC_ARM_ACLE_ROR_FN (__rorl, unsigned long)
>> +_GCC_ARM_ACLE_ROR_FN (__rorll, uint64_t)
>> +
>> +#undef _GCC_ARM_ACLE_ROR_FN
>> +
>> +#define _GCC_ARM_ACLE_DATA_FN(NAME, BUILTIN, ITYPE, RTYPE) \
>> +__extension__ extern __inline RTYPE\
>> +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) \
>> +__##NAME (ITYPE __value)   \
>> +{  \
>> +  return __builtin_##BUILTIN (__value);\
>> +}
>> +
>> +_GCC_ARM_ACLE_DATA_FN (clz, clz, uint32_t, unsigned int)
>> +_GCC_ARM_ACLE_DATA_FN (clzl, clzl, unsigned long, unsigned int)
>> +_GCC_ARM_ACLE_DATA_FN (clzll, clzll, uint64_t, unsigned int)
>> +_GCC_ARM_ACLE_DATA_FN (cls, clrsb, uint32_t, unsigned int)
>> +_GCC_ARM_ACLE_DATA_FN (clsl, clrsbl, unsigned long, unsigned int)
>> +_GCC_ARM_ACLE_DATA_FN (clsll, clrsbll, uint64_t, unsigned int)
>> +_GCC_ARM_ACLE_DATA_FN (rev16, aarch64_rev16, uint32_t, uint32_t)
>> +_GCC_ARM_ACLE_DATA_FN (rev16l, aarch64_rev16l, unsigned long, unsigned long)
>> +_GCC_ARM_ACLE_DATA_FN (rev16ll, aarch64_rev16ll, uint64_t, uint64_t)
>> +_GCC_ARM_ACLE_DATA_FN (rbit, aarch64_rbit, uint32_t, uint32_t)
>> +_GCC_ARM_ACLE_DATA_FN (rbitl, aarch64_rbitl, unsigned long, unsigned long)
>> +_GCC_ARM_ACLE_DATA_FN (rbitll, aarch64_rbitll, uint64_t, uint64_t)
>> +_GCC_ARM_ACLE_DATA_FN (revsh, bswap16, int16_t, uint16_t)
> The return type should be int16_t.

Nice catch!

> The clz and cls tests have the old return types (same as the argument
> types), but I guess that's a good thing, since it shows that we avoid
> the redundant zero-extend in clzll and clsll.

Yeah I noticed that too when I was adding the mem tests, but I did
change them though because at the time it just felt like an oversight,
though I too was pleasantly surprised GCC was managing to avoid the
zero-extending :)
I then saw your comment and made me wonder whether I should keep the
wrong return types in... I haven't but happy to change them back if you
think it's a nice 'test' to have.


Regards,
Andre
diff --git a/gcc/config/aarch64/aarch64-builtins.cc b/gcc/config/aarch64/aarch64-builtins.cc
index e0a741ac663188713e21f457affa57217d074783..bb5d97c8fc6402635270df851a949cabeecaa5e8 100644
--- a/gcc/config/aarch64/aarch64-builtins.cc
+++ b/gcc/config/aarch64/aarch64-builtins.cc
@@ -613,6 +613,12 @@ enum aarch64_builtins
   AARCH64_LS64_BUILTIN_ST64B,
   AARCH64_LS64_BUILTIN_ST64BV,
   AARCH64_LS64_BUILTIN_ST64BV0,
+  AARCH64_REV16,
+  AARCH64_REV16L,
+  AARCH64_REV16LL,
+  AARCH64_RBIT,
+  AARCH64_RBITL,
+  AARCH64_RBITLL,
   AARCH64_BUILTIN_MAX
 };
 
@@ -1664,10 +1670,41 @@ aarch64_init_ls64_builtins (void)
   = aarch64_general_add_builtin (data[i].name, data[i].type, data[i].code);
 }

[committed] gcn: Remove useless register keyword

2022-07-01 Thread Tobias Burnus

This silences the compile warnings:
  config/gcn/gcn-protos.h:78:61: warning: ISO C++17 does not allow 'register' storage class specifier [-Wregister]

and is in line with many other commits that remove 'register'.
I also note that the definition in config/gcn/gcn.cc already uses
  print_operand_address (FILE *file, rtx mem)
and the "print_operand_address" prototypes/definitions in other config/*
directories also don't have 'register' ...

Committed as r13-1385-g7780dc5b2d02785186583fc8eced3c9e3aec4552

Tobias
-
Siemens Electronic Design Automation GmbH; Address: Arnulfstraße 201, 80634
Munich; limited liability company; Managing Directors: Thomas
Heurung, Frank Thürauf; Registered office: Munich; Registration court:
Munich, HRB 106955
commit 7780dc5b2d02785186583fc8eced3c9e3aec4552
Author: Tobias Burnus 
Date:   Fri Jul 1 12:21:24 2022 +0200

gcn: Remove useless register keyword

gcc/ChangeLog:

* config/gcn/gcn-protos.h (print_operand_address): Remove register
keyword on 'rtx addr' argument.

diff --git a/gcc/config/gcn/gcn-protos.h b/gcc/config/gcn/gcn-protos.h
index 61791c90445..38197b929fd 100644
--- a/gcc/config/gcn/gcn-protos.h
+++ b/gcc/config/gcn/gcn-protos.h
@@ -75,7 +75,7 @@ extern bool gcn_valid_move_p (machine_mode, rtx, rtx);
 extern rtx gcn_vec_constant (machine_mode, int);
 extern rtx gcn_vec_constant (machine_mode, rtx);
 extern bool gcn_vgpr_move_p (rtx, rtx);
-extern void print_operand_address (FILE *file, register rtx addr);
+extern void print_operand_address (FILE *file, rtx addr);
 extern void print_operand (FILE *file, rtx x, int code);
 extern bool regno_ok_for_index_p (int);
 


Re: [PATCH 1/12] arm: Make mbranch-protection opts parsing common to AArch32/64

2022-07-01 Thread Richard Earnshaw via Gcc-patches




On 28/04/2022 10:08, Andrea Corallo via Gcc-patches wrote:

Hi all,

This change refactors all the mbranch-protection option parsing code and
types to make it common to both AArch32 and AArch64 backends.

This change also pulls in some supporting types from AArch64 to make
it common (aarch_parse_opt_result).

The significant changes in this patch are the movement of all branch
protection parsing routines from aarch64.c to aarch-common.c and
supporting data types and static data structures.

This patch also pre-declares variables and types required in the
aarch32 back-end for moved variables for function sign scope and key
to prepare for the impending series of patches that support parsing
the feature mbranch-protection in the aarch32 back-end.

This patch implements the changes requested and was pre-approved here
.

gcc/ChangeLog:

* common/config/aarch64/aarch64-common.cc: Include aarch-common.h.
(all_architectures): Fix comment.
(aarch64_parse_extension): Rename return type, enum value names.
* config/aarch64/aarch64-c.cc (aarch64_update_cpp_builtins): Rename
factored out aarch_ra_sign_scope and aarch_ra_sign_key variables.
Also rename corresponding enum values.
* config/aarch64/aarch64-opts.h (aarch64_function_type): Factor
out aarch64_function_type and move it to common code as
aarch_function_type in aarch-common.h.
* config/aarch64/aarch64-protos.h: Include common types header,
move out types aarch64_parse_opt_result and aarch64_key_type to
aarch-common.h
* config/aarch64/aarch64.cc: Move mbranch-protection parsing types
and functions out into aarch-common.h and aarch-common.cc.  Fix up
all the name changes resulting from the move.
* config/aarch64/aarch64.md: Fix up aarch64_ra_sign_key type name change
and enum value.
* config/aarch64/aarch64.opt: Include aarch-common.h to import
type move.  Fix up name changes from factoring out common code and
data.
* config/arm/aarch-common-protos.h: Export factored out routines to both
backends.
* config/arm/aarch-common.cc: Include newly factored out types.  Move all
mbranch-protection code and data structures from aarch64.cc.
* config/arm/aarch-common.h: New header that declares types shared
between aarch32 and aarch64 backends.
* config/arm/arm-protos.h: Declare types and variables that are
made common to aarch64 and aarch32 backends - aarch_ra_sign_key,
aarch_ra_sign_scope and aarch_enable_bti.

Co-Authored-By: Tejas Belagod  



OK, but please wait for the rest of this series to be approved before 
applying.


R.


Re: [PATCH 2/12] arm: Add Armv8.1-M Mainline target feature +pacbti

2022-07-01 Thread Richard Earnshaw via Gcc-patches




On 28/04/2022 10:37, Andrea Corallo via Gcc-patches wrote:

This patch adds the -march feature +pacbti to Armv8.1-M Mainline.

This feature enables pointer signing and authentication instructions
on M-class architectures.

Pre-approved here
.

gcc/Changelog:

* config/arm/arm.h (TARGET_HAVE_PACBTI): New macro.
* config/arm/arm-cpus.in (pacbti): New feature.
* doc/invoke.texi (Arm Options): Document it.

Co-Authored-By: Tejas Belagod  



OK.


Re: [PATCH 3/12] arm: Add option -mbranch-protection

2022-07-01 Thread Richard Earnshaw via Gcc-patches




On 28/04/2022 10:38, Andrea Corallo via Gcc-patches wrote:

[PATCH 3/12] arm: Add option -mbranch-protection

Add -mbranch-protection option.  This option enables the
code-generation of pointer signing and authentication instructions in
function prologues and epilogues.

gcc/ChangeLog:

* config/arm/arm.c (arm_configure_build_target): Parse and validate
-mbranch-protection option and initialize appropriate data structures.
* config/arm/arm.opt (-mbranch-protection): New option.
* doc/invoke.texi (Arm Options): Document it.

Co-Authored-By: Tejas Belagod  
Co-Authored-By: Richard Earnshaw 


+@item
+-mbranch-protection=@var{none}|@var{standard}|@var{pac-ret}[+@var{leaf}][+@var{bti}]|@var{bti}[+@var{pac-ret}[+@var{leaf}]]
+@opindex mbranch-protection
+Enable branch protection features (armv8.1-m.main only).
+@samp{none} generate code without branch protection or return address
+signing.
+@samp{standard[+@var{leaf}]} generate code with all branch protection
+features enabled at their standard level.
+@samp{pac-ret[+@var{leaf}]} generate code with return address signing
+set to its standard level, which is to sign all functions that save
+the return address to memory.
+@samp{leaf} When return address signing is enabled, also sign leaf
+functions even if they do not write the return address to memory.
++@samp{bti} Add landing-pad instructions at the permitted targets of
+indirect branch instructions.
+
+If the @samp{+pacbti} architecture extension is not enabled, then all
+branch protection and return address signing operations are
+constrained to use only the instructions defined in the
+architectural-NOP space. The generated code will remain
+backwards-compatible with earlier versions of the architecture, but
+the additional security can be enabled at run time on processors that
+support the @samp{PACBTI} extension.
+
+Branch target enforcement using BTI can only be enabled at runtime if
+all code in the application has been compiled with at least
+@samp{-mbranch-protection=bti}.
+
+The default is to generate code without branch protection or return
+address signing.

This needs to make it clear that -mbranch-protection != none is only 
supported on armv8-m.main or later.


R.


Re: [Committed] Add constraints to new andn_doubleword_bmi pattern in i386.md.

2022-07-01 Thread Uros Bizjak via Gcc-patches
On Fri, Jul 1, 2022 at 12:17 PM Roger Sayle  wrote:
>
>
> Many thanks to Uros for spotting that I'd forgotten to add constraints
> to the new define_insn_and_split *andn_doubleword_bmi when moving it
> from pre-reload to post-reload.  I've pushed this obvious fix after a
> make bootstrap on x86_64-pc-linux-gnu.  Sorry for the inconvenience to
> anyone building the tree with a non-default architecture that enables
> BMI.
>
>
> 2022-07-01  Roger Sayle  
> Uroš Bizjak  
>
> gcc/ChangeLog
> * config/i386/i386.md (*andn3_doubleword_bmi): Add constraints
> to post-reload define_insn_and_split.

-  (not: (match_operand: 1 "register_operand"))
-  (match_operand: 2 "nonimmediate_operand")))
+  (not: (match_operand: 1 "register_operand" "0"))

This constraint can be "r", ANDN is not destructive.

+  (match_operand: 2 "nonimmediate_operand" "ro")))

Uros.
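
For reference, the operation being matched computes `~op1 & op2`, and the BMI ANDN instruction writes its result to a separate destination register, which is why operand 1 does not have to be tied to the output. A minimal C++ model of the doubleword form, split into two word-sized operations, follows. `DWord`, `andn`, and `andn_doubleword` are hypothetical names used only for illustration; they are not taken from i386.md.

```cpp
#include <cstdint>

// Hypothetical doubleword value made of two machine words.
struct DWord {
  std::uint64_t lo;
  std::uint64_t hi;
};

// Word-sized ANDN: the result is a fresh value, neither input is modified.
constexpr std::uint64_t andn(std::uint64_t op1, std::uint64_t op2) {
  return ~op1 & op2;
}

// Doubleword ANDN expressed as two independent word-sized ANDNs.
constexpr DWord andn_doubleword(DWord op1, DWord op2) {
  return DWord{andn(op1.lo, op2.lo), andn(op1.hi, op2.hi)};
}

static_assert(andn(0xF0, 0xFF) == 0x0F);
static_assert(andn(0, 0xFF) == 0xFF);
```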


[committed] libstdc++: Add nodiscard attribute to filesystem operations

2022-07-01 Thread Jonathan Wakely via Gcc-patches
Tested x86_64-linux, pushed to trunk.

-- >8 --

Some of these are not truly "pure" because they access the file system,
e.g. exists and file_size, but they do not modify anything and are only
useful for the return value.

If you really want to use one of those functions just to check whether
an error is reported (either via an exception or an error_code&
argument) you can still do so, but you need to cast the discarded result
to void.  Several tests need such a change, because they were indeed
only calling the functions to check for expected errors.
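
As a sketch of what such a call site looks like, the helper below calls a nodiscard query purely to see whether it reports an error. `file_size_reports_error` is a hypothetical name chosen for this example and is not part of the patch or of the testsuite.

```cpp
#include <filesystem>
#include <system_error>

namespace fs = std::filesystem;

// Returns true if asking for the size of `p` reports an error via error_code.
// The size itself is not wanted here, so the [[nodiscard]] result is
// explicitly cast to void to show that discarding it is intentional.
inline bool file_size_reports_error(const fs::path &p) {
  std::error_code ec;
  (void) fs::file_size(p, ec);
  return static_cast<bool>(ec);
}
```

Without the cast, a compiler that honours the attribute would be expected to warn about the ignored return value.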

libstdc++-v3/ChangeLog:

* include/bits/fs_ops.h: Add nodiscard to all pure functions.
* include/experimental/bits/fs_ops.h: Likewise.
* testsuite/27_io/filesystem/operations/all.cc: Do not discard
results of absolute and canonical.
* testsuite/27_io/filesystem/operations/absolute.cc: Cast
discarded result to void.
* testsuite/27_io/filesystem/operations/canonical.cc: Likewise.
* testsuite/27_io/filesystem/operations/exists.cc: Likewise.
* testsuite/27_io/filesystem/operations/is_empty.cc: Likewise.
* testsuite/27_io/filesystem/operations/read_symlink.cc:
Likewise.
* testsuite/27_io/filesystem/operations/status.cc: Likewise.
* testsuite/27_io/filesystem/operations/symlink_status.cc:
Likewise.
* testsuite/27_io/filesystem/operations/temp_directory_path.cc:
Likewise.
* testsuite/experimental/filesystem/operations/canonical.cc:
Likewise.
* testsuite/experimental/filesystem/operations/exists.cc:
Likewise.
* testsuite/experimental/filesystem/operations/is_empty.cc:
Likewise.
* testsuite/experimental/filesystem/operations/read_symlink.cc:
Likewise.
* testsuite/experimental/filesystem/operations/temp_directory_path.cc:
Likewise.
---
 libstdc++-v3/include/bits/fs_ops.h| 79 +++
 .../include/experimental/bits/fs_ops.h| 71 +
 .../27_io/filesystem/operations/absolute.cc   |  2 +-
 .../27_io/filesystem/operations/all.cc|  8 +-
 .../27_io/filesystem/operations/canonical.cc  |  4 +-
 .../27_io/filesystem/operations/exists.cc |  2 +-
 .../27_io/filesystem/operations/is_empty.cc   |  4 +-
 .../filesystem/operations/read_symlink.cc |  2 +-
 .../27_io/filesystem/operations/status.cc |  2 +-
 .../filesystem/operations/symlink_status.cc   |  2 +-
 .../operations/temp_directory_path.cc |  4 +-
 .../filesystem/operations/canonical.cc|  6 +-
 .../filesystem/operations/exists.cc   |  2 +-
 .../filesystem/operations/is_empty.cc |  4 +-
 .../filesystem/operations/read_symlink.cc |  2 +-
 .../operations/temp_directory_path.cc |  4 +-
 16 files changed, 174 insertions(+), 24 deletions(-)

diff --git a/libstdc++-v3/include/bits/fs_ops.h b/libstdc++-v3/include/bits/fs_ops.h
index 0281c6540d0..1ae8fe12374 100644
--- a/libstdc++-v3/include/bits/fs_ops.h
+++ b/libstdc++-v3/include/bits/fs_ops.h
@@ -44,10 +44,16 @@ namespace filesystem
*  @{
*/
 
+  [[nodiscard]]
   path absolute(const path& __p);
+
+  [[nodiscard]]
   path absolute(const path& __p, error_code& __ec);
 
+  [[nodiscard]]
   path canonical(const path& __p);
+
+  [[nodiscard]]
   path canonical(const path& __p, error_code& __ec);
 
   inline void
@@ -100,25 +106,34 @@ namespace filesystem
   void create_symlink(const path& __to, const path& __new_symlink,
  error_code& __ec) noexcept;
 
+  [[nodiscard]]
   path current_path();
+
+  [[nodiscard]]
   path current_path(error_code& __ec);
+
   void current_path(const path& __p);
   void current_path(const path& __p, error_code& __ec) noexcept;
 
+  [[nodiscard]]
   bool
   equivalent(const path& __p1, const path& __p2);
 
+  [[nodiscard]]
   bool
   equivalent(const path& __p1, const path& __p2, error_code& __ec) noexcept;
 
+  [[nodiscard]]
   inline bool
   exists(file_status __s) noexcept
   { return status_known(__s) && __s.type() != file_type::not_found; }
 
+  [[nodiscard]]
   inline bool
   exists(const path& __p)
   { return exists(status(__p)); }
 
+  [[nodiscard]]
   inline bool
   exists(const path& __p, error_code& __ec) noexcept
   {
@@ -131,63 +146,85 @@ namespace filesystem
 return false;
   }
 
+  [[nodiscard]]
   uintmax_t file_size(const path& __p);
+
+  [[nodiscard]]
   uintmax_t file_size(const path& __p, error_code& __ec) noexcept;
 
+  [[nodiscard]]
   uintmax_t hard_link_count(const path& __p);
+
+  [[nodiscard]]
   uintmax_t hard_link_count(const path& __p, error_code& __ec) noexcept;
 
+  [[nodiscard]]
   inline bool
   is_block_file(file_status __s) noexcept
   { return __s.type() == file_type::block; }
 
+  [[nodiscard]]
   inline bool
   is_block_file(const path& __p)
   { return is_block_file(status(__p)); }
 
+  [[nodiscard]]
   inline bool
   is_block_file(const path& __p, error_code& __ec) noexcept
   

Re: [PATCH][AArch64] Implement ACLE Data Intrinsics

2022-07-01 Thread Richard Sandiford via Gcc-patches
"Andre Vieira (lists)"  writes:
> On 29/06/2022 08:18, Richard Sandiford wrote:
>>> +  break;
>>> +case AARCH64_RBIT:
>>> +case AARCH64_RBITL:
>>> +case AARCH64_RBITLL:
>>> +  if (mode == SImode)
>>> +   icode = CODE_FOR_aarch64_rbitsi;
>>> +  else
>>> +   icode = CODE_FOR_aarch64_rbitdi;
>>> +  break;
>>> +default:
>>> +  gcc_unreachable ();
>>> +}
>>> +  expand_insn (icode, 2, ops);
>>> +  return target;
>> This needs to return ops[0].value instead, since "target" just suggests
>> a possible location.
>>
>> Could you add tests for a memory source and memory destination, e.g.:
>>
>> void test_clz_mem (uint32_t *a)
>> {
>>*a = __clz (*a);
>> }
>>
>> Without tests like that, these comments probably just sound like a paper
>> exercise, but they should make a difference for memory sources (previous
>> review) and memory destinations (this round).
> I had locally tested it (with rev though because clz doesn't use that 
> code) and strangely it does seem to work for the memory destinations, 
> but that's just a simple test.
> It could very well go wrong with some more complex codegen, so I'll just 
> take your word and use ops[0].value.
>
> And yeah I didn't add the tests at the time, don't really know why, I'll 
> chuck it down to laziness :P
>>
>>> diff --git a/gcc/config/aarch64/arm_acle.h b/gcc/config/aarch64/arm_acle.h
>>> index 
>>> 9775a48c65825b424d3eb442384f5ab87b734fd7..a044bc74553fcf2a49b71290083f3f072fd5a2ce
>>>  100644
>>> --- a/gcc/config/aarch64/arm_acle.h
>>> +++ b/gcc/config/aarch64/arm_acle.h
>>> @@ -28,6 +28,7 @@
>>>   #define _GCC_ARM_ACLE_H
>>>   
>>>   #include 
>>> +#include 
>>>   
>>>   #pragma GCC aarch64 "arm_acle.h"
>>>   
>>> @@ -35,6 +36,58 @@
>>>   extern "C" {
>>>   #endif
>>>   
>>> +#define _GCC_ARM_ACLE_ROR_FN(NAME, TYPE) \
>>> +__extension__ extern __inline TYPE   \
>>> +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
>>>   \
>>> +NAME (TYPE __value, uint32_t __rotate) 
>>>   \
>>> +{\
>>> +  size_t __size = sizeof (TYPE) * __CHAR_BIT__;
>>>   \
>>> +  __rotate = __rotate % __size;
>>>   \
>>> +  return __value >> __rotate | __value << ((__size - __rotate) % __size); \
>>> +}
>>> +
>>> +_GCC_ARM_ACLE_ROR_FN (__ror, uint32_t)
>>> +_GCC_ARM_ACLE_ROR_FN (__rorl, unsigned long)
>>> +_GCC_ARM_ACLE_ROR_FN (__rorll, uint64_t)
>>> +
>>> +#undef _GCC_ARM_ACLE_ROR_FN
>>> +
>>> +#define _GCC_ARM_ACLE_DATA_FN(NAME, BUILTIN, ITYPE, RTYPE) \
>>> +__extension__ extern __inline RTYPE\
>>> +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) \
>>> +__##NAME (ITYPE __value)   \
>>> +{  \
>>> +  return __builtin_##BUILTIN (__value);\
>>> +}
>>> +
>>> +_GCC_ARM_ACLE_DATA_FN (clz, clz, uint32_t, unsigned int)
>>> +_GCC_ARM_ACLE_DATA_FN (clzl, clzl, unsigned long, unsigned int)
>>> +_GCC_ARM_ACLE_DATA_FN (clzll, clzll, uint64_t, unsigned int)
>>> +_GCC_ARM_ACLE_DATA_FN (cls, clrsb, uint32_t, unsigned int)
>>> +_GCC_ARM_ACLE_DATA_FN (clsl, clrsbl, unsigned long, unsigned int)
>>> +_GCC_ARM_ACLE_DATA_FN (clsll, clrsbll, uint64_t, unsigned int)
>>> +_GCC_ARM_ACLE_DATA_FN (rev16, aarch64_rev16, uint32_t, uint32_t)
>>> +_GCC_ARM_ACLE_DATA_FN (rev16l, aarch64_rev16l, unsigned long, unsigned 
>>> long)
>>> +_GCC_ARM_ACLE_DATA_FN (rev16ll, aarch64_rev16ll, uint64_t, uint64_t)
>>> +_GCC_ARM_ACLE_DATA_FN (rbit, aarch64_rbit, uint32_t, uint32_t)
>>> +_GCC_ARM_ACLE_DATA_FN (rbitl, aarch64_rbitl, unsigned long, unsigned long)
>>> +_GCC_ARM_ACLE_DATA_FN (rbitll, aarch64_rbitll, uint64_t, uint64_t)
>>> +_GCC_ARM_ACLE_DATA_FN (revsh, bswap16, int16_t, uint16_t)
>> The return type should be int16_t.
> Nice catch!
>> The clz and cls tests have the old return types (same as the argument
>> types), but I guess that's a good thing, since it shows that we avoid
>> the redundant zero-extend in clzll and clsll.
> Yeah I noticed that too when I was adding the mem tests, but I did 
> change them though because at the time it just felt like an oversight, 
> though I too was pleasantly surprised GCC was managing to avoid the 
> zero-extending :)
> I then saw your comment and made me wonder whether I should keep the 
> wrong return types in... I haven't but happy to change them back if you 
> think it's a nice 'test' to have.

I thought it was OK/useful as it was, but I don't mind either way.
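
On the `revsh` return type raised earlier in this exchange, a small sketch shows why the signedness matters once the result is widened. It assumes a compiler that provides `__builtin_bswap16` (GCC and Clang do). `revsh_unsigned` and `revsh_signed` are hypothetical helper names, not the ACLE intrinsic itself.

```cpp
#include <cstdint>

// Byte-reverse a 16-bit value and return it as unsigned.  When widened to
// int, the result is always in the range [0, 65535].
inline std::uint16_t revsh_unsigned(std::int16_t v) {
  return __builtin_bswap16(static_cast<std::uint16_t>(v));
}

// Same byte reversal, returned as int16_t so that it sign-extends when
// widened.  The narrowing conversion wraps modulo 2^16 on GCC and Clang
// (and is defined to do so since C++20).
inline std::int16_t revsh_signed(std::int16_t v) {
  return static_cast<std::int16_t>(
      __builtin_bswap16(static_cast<std::uint16_t>(v)));
}
```

For the input 0x0080 the swapped bit pattern is 0x8000, which the unsigned variant reports as 32768 and the signed variant as -32768.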

BTW, while trying it out locally, I noticed:

  aarch64_init_data_intrinsics

was called from the wrong place.  Since it's adding normal __builtin
functions, it should be called from aarch64_general_init_builtins
instead of handle_arm_acle_h.

handle_arm

Re: PING^1 [PATCH] inline: Rebuild target option node for caller [PR105459]

2022-07-01 Thread Richard Biener via Gcc-patches
On Fri, Jul 1, 2022 at 11:20 AM Jan Hubicka  wrote:
>
> > On Thu, Jun 23, 2022 at 4:03 AM Kewen.Lin  wrote:
> > >
> > > Hi,
> > >
> > > Gentle ping 
> > > https://gcc.gnu.org/pipermail/gcc-patches/2022-June/596212.html
> > >
> > > BR,
> > > Kewen
> > >
> > > on 2022/6/6 14:20, Kewen.Lin via Gcc-patches wrote:
> > > > Hi,
> > > >
> > > > PR105459 exposes one issue in inline_call handling that when it
> > > > decides to copy FP flags from callee to caller and rebuild the
> > > > optimization node for caller fndecl, it's possible that the target
> > > > option node is also necessary to be rebuilt.  Without updating
> > > > target option node early, it can make nodes share the same target
> > > > option node wrongly, later when we want to unshare it somewhere
> > > > (like in target hook) it can get unexpected results, like ICE on
> > > > uninitialized secondary member of target globals exposed in this PR.
> >
> > I think that
> >
> >   /* Reload global optimization flags.  */
> >   if (reload_optimization_node && DECL_STRUCT_FUNCTION (to->decl) == cfun)
> > set_cfun (cfun, true);
> >
> > is supposed to do that via ix86_set_current_function which will eventually
> > re-build the target optimization node exactly for this reason.
> >
> > But with LTO we arrive here during WPA time only and there cfun is NULL
> > (and so is DECL_STRUCT_FUNCTION (to->decl)), so the target doesn't
> > get the chance to fix things up here.
>
> I see this logic was added by Martin in 2017 and it indeed looks bit
> odd, but I suppose it is intended primarily for early inliner where cfun
> is non-NULL and we really need to update global state.
> >
> > Now, it should be fine to delay this fixup until we set the cfun at LTRANS
> > time but there we run into
> >
> >   if (old_tree != new_tree)
> > {
> >   cl_target_option_restore (&global_options, &global_options_set,
> > TREE_TARGET_OPTION (new_tree));
> > ...
> > }
> >   else if (flag_unsafe_math_optimizations
> >!= TREE_TARGET_OPTION 
> > (new_tree)->x_ix86_unsafe_math_optimizations
> >|| (flag_excess_precision
> >!= TREE_TARGET_OPTION (new_tree)->x_ix86_excess_precision))
> > {
> > ... FIXUP! ...
> >
> > and old_tree != new_tree disables the fixup.
> >
> > When we refactor the above to always consider the FP flag change (so apply 
> > it
> > lazily), then this fixes the testcase in the PR as well.  Thus something 
> > like
> > the attached.
> >
> > Ideally this stuff would be refactored to a target hook that can work 
> > without
> > the set_cfun, also working towards merging the target and optimization node
> > since they have to be kept in sync ...
> >
> > I think your proposed patch makes another variant through the maze to
> > do something at WPA time but that makes it all even more complicated :/
> >
> > Sorry for the delay btw.
> >
> > Folks - any other opinions?
>
> Your patch looks reasonable to me... Indeed working on nodes directly
> would be nicer, but that means bigger surgery in the optimization
> handling right?

Yeah, nothing I feel like doing right now ...

I'm going to give the patch full testing now.

Richard.

>
> Thanks,
> Honza
> >
> > Thanks,
> > Richard.
> >
> > > > Commit r12-3721 makes it get exact fp_expression info and causes
> > > > more optimization chances then exposes this issue.  Commit r11-5855
> > > > introduces two target options to shadow flag_excess_precision and
> > > > flag_unsafe_math_optimizations and shows the need to rebuild target
> > > > node in inline_call when optimization node changes.
> > > >
> > > > As commented in PR105459, I tried to postpone init_function_start
> > > > in cgraph_node::expand, but abandoned it since I thought it just
> > > > concealed the issue.  And I also tried to adjust the target node
> > > > when current function switching, but failed since we get the NULL
> > > > cfun and fndecl in WPA phase.
> > > >
> > > > Bootstrapped and regtested on x86_64-redhat-linux, powerpc64-linux-gnu
> > > > P8 and powerpc64le-linux-gnu P9.
> > > >
> > > > Any thoughts?  Is it OK for trunk?
> > > >
> > > > BR,
> > > > Kewen
> > > > -
> > > >
> > > >   PR tree-optimization/105459
> > > >
> > > > gcc/ChangeLog:
> > > >
> > > >   * ipa-inline-transform.cc (inline_call): Rebuild target option 
> > > > node
> > > >   once optimization node gets rebuilt.
> > > >
> > > > gcc/testsuite/ChangeLog:
> > > >
> > > >   * gcc.dg/lto/pr105459_0.c: New test.
> > > > ---
> > > >  gcc/ipa-inline-transform.cc   | 50 +--
> > > >  gcc/testsuite/gcc.dg/lto/pr105459_0.c | 35 +++
> > > >  2 files changed, 83 insertions(+), 2 deletions(-)
> > > >  create mode 100644 gcc/testsuite/gcc.dg/lto/pr105459_0.c
> > > >
> > > > diff --git a/gcc/ipa-inline-transform.cc b/gcc/ipa-inline-transform.cc
> > > > index 07288e57c73..edba58377f4 100644
> > > > --- a/gcc/ipa-inline-transform.cc
> > > > +++ b/gcc/ipa-inli

[PATCH 1/2] Make sure checking code is conditional in VN

2022-07-01 Thread Richard Biener via Gcc-patches
VN has checking code with gcc_unreachable ().  The following makes
it cheaper by instead guarding the side-effect with flag_checking.

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.

2022-07-01  Richard Biener  

* tree-ssa-sccvn.cc (vn_nary_op_insert_into): Make
checking dominance check conditional on flag_checking.
---
 gcc/tree-ssa-sccvn.cc | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/gcc/tree-ssa-sccvn.cc b/gcc/tree-ssa-sccvn.cc
index 76d92895a3a..c40c45ed840 100644
--- a/gcc/tree-ssa-sccvn.cc
+++ b/gcc/tree-ssa-sccvn.cc
@@ -4243,9 +4243,10 @@ vn_nary_op_insert_into (vn_nary_op_t vno, vn_nary_op_table_type *table)
  if (dominated_by_p (CDI_DOMINATORS, vno_bb, val_bb))
/* Value registered with more generic predicate.  */
return *slot;
- else if (dominated_by_p (CDI_DOMINATORS, val_bb, vno_bb))
+ else if (flag_checking)
/* Shouldn't happen, we insert in RPO order.  */
-   gcc_unreachable ();
+   gcc_assert (!dominated_by_p (CDI_DOMINATORS,
+val_bb, vno_bb));
}
  /* Append value.  */
  *next = (vn_pval *) obstack_alloc (&vn_tables_obstack,
-- 
2.35.3
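
The pattern in this patch can be sketched in isolation. The following is only an illustration, not the GCC code: flag_checking is modelled as a plain global, and expensive_invariant, check_calls and insert_value are hypothetical names. The point it shows is that the costly predicate moves out of the else-if condition, where it would always be evaluated, and into a branch that runs only when checking is enabled.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Stand-in for GCC's flag_checking; an assumption of this sketch.  */
bool flag_checking = true;

/* Counts how often the costly predicate runs (hypothetical helper).  */
int check_calls = 0;

/* Hypothetical costly invariant, standing in for dominated_by_p.  */
static bool
expensive_invariant (int a, int b)
{
  ++check_calls;
  return a != b;
}

/* Returns 0 on the cheap early-out path and 1 otherwise.  The costly
   invariant is evaluated only when checking is enabled.  */
int
insert_value (int a, int b)
{
  if (a < b)
    return 0;
  else if (flag_checking)
    {
      /* Plays the role of gcc_assert: abort if the invariant fails.  */
      if (!expensive_invariant (a, b))
        abort ();
    }
  return 1;
}
```

With flag_checking cleared, insert_value never calls the predicate, which is the saving the patch is after.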



[PATCH 2/2] Revert maybe_ne -> known_ne change in vn_reference_lookup_3

2022-07-01 Thread Richard Biener via Gcc-patches
This reverts the change as discussed.

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.

2022-07-01  Richard Biener  

* tree-ssa-sccvn.cc (vn_reference_lookup_3): Revert
back to using maybe_ne (off, -1).
---
 gcc/tree-ssa-sccvn.cc | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/tree-ssa-sccvn.cc b/gcc/tree-ssa-sccvn.cc
index c40c45ed840..f41d5031365 100644
--- a/gcc/tree-ssa-sccvn.cc
+++ b/gcc/tree-ssa-sccvn.cc
@@ -3243,12 +3243,12 @@ vn_reference_lookup_3 (ao_ref *ref, tree vuse, void *data_,
   poly_int64 extra_off = 0;
   if (j == 0 && i >= 0
  && lhs_ops[0].opcode == MEM_REF
- && known_ne (lhs_ops[0].off, -1))
+ && maybe_ne (lhs_ops[0].off, -1))
{
  if (known_eq (lhs_ops[0].off, vr->operands[i].off))
i--, j--;
  else if (vr->operands[i].opcode == MEM_REF
-  && known_ne (vr->operands[i].off, -1))
+  && maybe_ne (vr->operands[i].off, -1))
{
  extra_off = vr->operands[i].off - lhs_ops[0].off;
  i--, j--;
-- 
2.35.3
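
For readers unfamiliar with the two predicates, here is a toy model of how "maybe" and "known" comparisons differ. It is a sketch under stated assumptions, not GCC's poly_int code: a value is modelled as c0 + c1 * x for an unknown runtime x >= 0, and every toy_ name is hypothetical. It does not attempt to explain why the change was reverted, since the thread only says "as discussed".

```c
#include <stdbool.h>

/* Toy polynomial value: c0 + c1 * x, where x >= 0 is known only at
   run time.  This is a model for illustration, not GCC's poly_int.  */
struct toy_poly { long c0, c1; };

/* Equal for every possible x.  */
bool
toy_known_eq (struct toy_poly a, struct toy_poly b)
{
  return a.c0 == b.c0 && a.c1 == b.c1;
}

/* Different for at least one x.  */
bool
toy_maybe_ne (struct toy_poly a, struct toy_poly b)
{
  return !toy_known_eq (a, b);
}

/* Equal for at least one x >= 0.  */
bool
toy_maybe_eq (struct toy_poly a, struct toy_poly b)
{
  long d0 = a.c0 - b.c0, d1 = a.c1 - b.c1;
  if (d1 == 0)
    return d0 == 0;
  /* a == b exactly when x == -d0 / d1, which must be a non-negative
     integer.  */
  return d0 % d1 == 0 && -d0 / d1 >= 0;
}

/* Different for every possible x.  */
bool
toy_known_ne (struct toy_poly a, struct toy_poly b)
{
  return !toy_maybe_eq (a, b);
}
```

For compile-time constants the two predicates agree. They diverge only for values that depend on x: in this model {-5, 4} equals -1 when x is 1, so toy_maybe_ne against {-1, 0} is true while toy_known_ne is false.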


Re: [PATCH 2/2] Revert maybe_ne -> known_ne change in vn_reference_lookup_3

2022-07-01 Thread Richard Sandiford via Gcc-patches
Richard Biener via Gcc-patches  writes:
> This reverts the change as discussed.

Thanks!

> Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.
>
> 2022-07-01  Richard Biener  
>
>   * tree-ssa-sccvn.cc (vn_reference_lookup_3): Revert
>   back to using maybe_ne (off, -1).
> ---
>  gcc/tree-ssa-sccvn.cc | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/gcc/tree-ssa-sccvn.cc b/gcc/tree-ssa-sccvn.cc
> index c40c45ed840..f41d5031365 100644
> --- a/gcc/tree-ssa-sccvn.cc
> +++ b/gcc/tree-ssa-sccvn.cc
> @@ -3243,12 +3243,12 @@ vn_reference_lookup_3 (ao_ref *ref, tree vuse, void 
> *data_,
>poly_int64 extra_off = 0;
>if (j == 0 && i >= 0
> && lhs_ops[0].opcode == MEM_REF
> -   && known_ne (lhs_ops[0].off, -1))
> +   && maybe_ne (lhs_ops[0].off, -1))
>   {
> if (known_eq (lhs_ops[0].off, vr->operands[i].off))
>   i--, j--;
> else if (vr->operands[i].opcode == MEM_REF
> -&& known_ne (vr->operands[i].off, -1))
> +&& maybe_ne (vr->operands[i].off, -1))
>   {
> extra_off = vr->operands[i].off - lhs_ops[0].off;
> i--, j--;


Mips: Fix kernel_stat structure size

2022-07-01 Thread Dimitrije Milosevic
Fix kernel_stat structure size for non-Android 32-bit Mips.
LLVM currently has this value for the kernel_stat structure size,
as per compiler-rt/lib/sanitizer_common/sanitizer_platform_limits_posix.h.
This also resolves one of the build issues for non-Android 32-bit Mips.

libsanitizer/ChangeLog:

* sanitizer_common/sanitizer_platform_limits_posix.h: Fix
kernel_stat structure size for non-Android 32-bit Mips.

---

 libsanitizer/sanitizer_common/sanitizer_platform_limits_posix.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/libsanitizer/sanitizer_common/sanitizer_platform_limits_posix.h b/libsanitizer/sanitizer_common/sanitizer_platform_limits_posix.h
index 89772a7e5c0..62a99035db3 100644
--- a/libsanitizer/sanitizer_common/sanitizer_platform_limits_posix.h
+++ b/libsanitizer/sanitizer_common/sanitizer_platform_limits_posix.h
@@ -83,7 +83,7 @@ const unsigned struct_kernel_stat64_sz = 104;
 #elif defined(__mips__)
 const unsigned struct_kernel_stat_sz = SANITIZER_ANDROID
? FIRST_32_SECOND_64(104, 128)
-   : FIRST_32_SECOND_64(144, 216);
+   : FIRST_32_SECOND_64(160, 216);
 const unsigned struct_kernel_stat64_sz = 104;
 #elif defined(__s390__) && !defined(__s390x__)
 const unsigned struct_kernel_stat_sz = 64;

---
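
As a rough sketch of how the constant touched by this patch is selected, the snippet below mimics the FIRST_32_SECOND_64 helper. It is a simplification: keying on UINTPTR_MAX is an assumption of this sketch (the real sanitizer macro keys on its own word-size definition), and TOY_ANDROID and toy_kernel_stat_sz are hypothetical names.

```c
#include <stdint.h>

/* Simplified stand-in for the sanitizer macro of the same name: pick
   the first value on 32-bit hosts and the second on 64-bit hosts.
   Keying on UINTPTR_MAX is an assumption of this sketch.  */
#if UINTPTR_MAX > 0xffffffffu
#  define FIRST_32_SECOND_64(a, b) (b)
#else
#  define FIRST_32_SECOND_64(a, b) (a)
#endif

/* Hypothetical flag; the real SANITIZER_ANDROID is set by the build.  */
#define TOY_ANDROID 0

/* Mirrors the shape of the patched definition: 160 is the new value for
   the non-Android 32-bit case, 216 the unchanged 64-bit one.  */
const unsigned toy_kernel_stat_sz = TOY_ANDROID
                                        ? FIRST_32_SECOND_64 (104, 128)
                                        : FIRST_32_SECOND_64 (160, 216);
```

Only the first argument of the non-Android branch changes in the patch, so 64-bit builds are unaffected.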

[PATCH] Mips: Resolve build issues for the n32 ABI

2022-07-01 Thread Dimitrije Milosevic
Building ASAN for the n32 MIPS ABI currently fails, for a few reasons:
- defined(__mips64), which is set solely based on the architecture type
  (32-bit/64-bit), was still used in some places. Therefore,
  defined(__mips64) is replaced with SANITIZER_MIPS64, which takes the ABI
  into account as well: defined(__mips64) && _MIPS_SIM == _ABI64.
- The n32 ABI still uses 64-bit *Linux* system calls, even though the word
  size is 32 bits.
- After the transition to canonical system calls
  (https://reviews.llvm.org/D124212), the n32 ABI still didn't use them,
  even though they are supported, as per
  https://github.com/torvalds/linux/blob/master/arch/mips/kernel/syscalls/syscall_n32.tbl.

See https://reviews.llvm.org/D127098.
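
The ABI distinction described above can be sketched with a small compile-time classifier. This is an illustration only: toy_mips_abi and toy_uses_64bit_syscalls are hypothetical names, and using _LP64 in place of the sanitizer's SANITIZER_WORDSIZE == 64 test is an assumption of this sketch. _MIPS_SIM, _ABIN32 and _ABI64 are macros predefined by compilers targeting MIPS; on any other target the code falls through to the generic branches.

```c
/* Label for the MIPS ABI this file is compiled for.  */
const char *
toy_mips_abi (void)
{
#if defined(__mips__) && defined(_MIPS_SIM) && defined(_ABI64) \
    && _MIPS_SIM == _ABI64
  return "n64";          /* the case the patch maps to SANITIZER_MIPS64 */
#elif defined(__mips__) && defined(_MIPS_SIM) && defined(_ABIN32) \
    && _MIPS_SIM == _ABIN32
  return "n32";          /* __mips64 is defined, but the word size is 32 */
#elif defined(__mips__)
  return "o32 or other";
#else
  return "not MIPS";
#endif
}

/* Nonzero when the 64-bit Linux syscall table would be used under the
   rule the patch introduces (n32 added to the 64-bit group).  */
int
toy_uses_64bit_syscalls (void)
{
#if defined(__x86_64__) || defined(__powerpc64__) || defined(_LP64) \
    || (defined(__mips__) && defined(_MIPS_SIM) && defined(_ABIN32) \
        && _MIPS_SIM == _ABIN32)
  return 1;
#else
  return 0;
#endif
}
```

Testing defined(__mips64) alone would put n32 and n64 in the same bucket, which is the mix-up the first bullet describes.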

libsanitizer/ChangeLog:

* sanitizer_common/sanitizer_linux.cpp (defined): Resolve
ASAN build issues for the Mips n32 ABI.
* sanitizer_common/sanitizer_platform.h (defined): Likewise.

---

 libsanitizer/sanitizer_common/sanitizer_linux.cpp  | 17 ++---
 libsanitizer/sanitizer_common/sanitizer_platform.h |  2 +-
 2 files changed, 11 insertions(+), 8 deletions(-)

diff --git a/libsanitizer/sanitizer_common/sanitizer_linux.cpp b/libsanitizer/sanitizer_common/sanitizer_linux.cpp
index e2c32d679ad..5ba033492e7 100644
--- a/libsanitizer/sanitizer_common/sanitizer_linux.cpp
+++ b/libsanitizer/sanitizer_common/sanitizer_linux.cpp
@@ -34,7 +34,7 @@
 // format. Struct kernel_stat is defined as 'struct stat' in asm/stat.h. To
 // access stat from asm/stat.h, without conflicting with definition in
 // sys/stat.h, we use this trick.
-#if defined(__mips64)
+#if SANITIZER_MIPS64
 #include 
 #include 
 #define stat kernel_stat
@@ -124,8 +124,9 @@ const int FUTEX_WAKE_PRIVATE = FUTEX_WAKE | FUTEX_PRIVATE_FLAG;
 // Are we using 32-bit or 64-bit Linux syscalls?
 // x32 (which defines __x86_64__) has SANITIZER_WORDSIZE == 32
 // but it still needs to use 64-bit syscalls.
-#if SANITIZER_LINUX && (defined(__x86_64__) || defined(__powerpc64__) || \
-SANITIZER_WORDSIZE == 64)
+#if SANITIZER_LINUX && (defined(__x86_64__) || defined(__powerpc64__) || \
+SANITIZER_WORDSIZE == 64 ||  \
+(defined(__mips__) && _MIPS_SIM == _ABIN32))
 # define SANITIZER_LINUX_USES_64BIT_SYSCALLS 1
 #else
 # define SANITIZER_LINUX_USES_64BIT_SYSCALLS 0
@@ -289,7 +290,7 @@ static void stat64_to_stat(struct stat64 *in, struct stat *out) {
 }
 #endif
 
-#if defined(__mips64)
+#if SANITIZER_MIPS64
 // Undefine compatibility macros from 
 // so that they would not clash with the kernel_stat
 // st_[a|m|c]time fields
@@ -343,7 +344,8 @@ uptr internal_stat(const char *path, void *buf) {
 #if SANITIZER_FREEBSD
   return internal_syscall(SYSCALL(fstatat), AT_FDCWD, (uptr)path, (uptr)buf, 
0);
 #elif SANITIZER_LINUX
-#  if SANITIZER_WORDSIZE == 64 || SANITIZER_X32
+#  if SANITIZER_WORDSIZE == 64 || SANITIZER_X32 || \
+  (defined(__mips__) && _MIPS_SIM == _ABIN32)
   return internal_syscall(SYSCALL(newfstatat), AT_FDCWD, (uptr)path, (uptr)buf,
   0);
 #  else
@@ -366,7 +368,8 @@ uptr internal_lstat(const char *path, void *buf) {
   return internal_syscall(SYSCALL(fstatat), AT_FDCWD, (uptr)path, (uptr)buf,
   AT_SYMLINK_NOFOLLOW);
 #elif SANITIZER_LINUX
-#  if defined(_LP64) || SANITIZER_X32
+#  if defined(_LP64) || SANITIZER_X32 || \
+  (defined(__mips__) && _MIPS_SIM == _ABIN32)
   return internal_syscall(SYSCALL(newfstatat), AT_FDCWD, (uptr)path, (uptr)buf,
   AT_SYMLINK_NOFOLLOW);
 #  else
@@ -1053,7 +1056,7 @@ uptr GetMaxVirtualAddress() {
   return (1ULL << (MostSignificantSetBitIndex(GET_CURRENT_FRAME()) + 1)) - 1;
 #elif SANITIZER_RISCV64
   return (1ULL << 38) - 1;
-# elif defined(__mips64)
+# elif SANITIZER_MIPS64
   return (1ULL << 40) - 1;  // 0x00ffUL;
 # elif defined(__s390x__)
   return (1ULL << 53) - 1;  // 0x001fUL;
diff --git a/libsanitizer/sanitizer_common/sanitizer_platform.h b/libsanitizer/sanitizer_common/sanitizer_platform.h
index 8fe0d831431..8bd9a327623 100644
--- a/libsanitizer/sanitizer_common/sanitizer_platform.h
+++ b/libsanitizer/sanitizer_common/sanitizer_platform.h
@@ -159,7 +159,7 @@
 
 #if defined(__mips__)
 #  define SANITIZER_MIPS 1
-#  if defined(__mips64)
+#  if defined(__mips64) && _MIPS_SIM == _ABI64
 #define SANITIZER_MIPS32 0
 #define SANITIZER_MIPS64 1
 #  else

---

[PATCH] libsanitizer: Fix linkage errors for cross toolchains

2022-07-01 Thread Dimitrije Milosevic
When we use cross toolchains in which the GCC libraries are not installed
within a designated system root, the shared sanitizer libraries link against
the libstdc++.so* located in the same directory. This directory, however, is
not in RPATH, so attempting to build a dynamically linked application with
-fsanitize=... gives a linkage error.
More information can be found here:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69839.

GCC, even when configured with --with-sysroot, doesn't by default install
libstdc++.so* within the sysroot, as GCC's installation process isn't designed
to help construct sysroot trees. This has to be done manually.
Furthermore, if we are using a multiarch/multilib configuration (mips-mti*,
for example), we may not even want to install them within the sysroot.
I would love to hear your thoughts on this, as I'm not sure myself that this
is the best solution.

gcc/ChangeLog:

* gcc.cc (LIBSAN_RPATH): New macro.
(LIBASAN_SPEC): Add LIBSAN_RPATH.
(LIBTSAN_SPEC): Likewise.
(LIBLSAN_SPEC): Likewise.
(LIBUBSAN_SPEC): Likewise.

libsanitizer/ChangeLog:

* Makefile.in: New Makefile variable.
* asan/Makefile.in: Likewise.
* configure: Regenerate.
* configure.ac: New config variable.
* hwasan/Makefile.in: New Makefile variable.
* interception/Makefile.in: Likewise.
* libbacktrace/Makefile.in: Likewise.
* libsanitizer.spec.in: New spec.
* lsan/Makefile.in: New Makefile variable.
* sanitizer_common/Makefile.in: Likewise.
* tsan/Makefile.in: Likewise.
* ubsan/Makefile.in: Likewise.

---

 gcc/gcc.cc| 20 
 libsanitizer/Makefile.in  |  1 +
 libsanitizer/asan/Makefile.in |  1 +
 libsanitizer/configure| 10 --
 libsanitizer/configure.ac |  7 +++
 libsanitizer/hwasan/Makefile.in   |  1 +
 libsanitizer/interception/Makefile.in |  1 +
 libsanitizer/libbacktrace/Makefile.in |  1 +
 libsanitizer/libsanitizer.spec.in |  2 ++
 libsanitizer/lsan/Makefile.in |  1 +
 libsanitizer/sanitizer_common/Makefile.in |  1 +
 libsanitizer/tsan/Makefile.in |  1 +
 libsanitizer/ubsan/Makefile.in|  1 +
 13 files changed, 38 insertions(+), 10 deletions(-)

diff --git a/gcc/gcc.cc b/gcc/gcc.cc
index 299e09c4f54..0d2d361b9a4 100644
--- a/gcc/gcc.cc
+++ b/gcc/gcc.cc
@@ -738,17 +738,21 @@ proper position among the other output files.  */
 #define STACK_SPLIT_SPEC " %{fsplit-stack: --wrap=pthread_create}"
 #endif
 
+#ifndef LIBSAN_RPATH
+#define LIBSAN_RPATH " %:include(libsanitizer.spec)%(link_libsan_rpath)"
+#endif
+
 #ifndef LIBASAN_SPEC
 #define STATIC_LIBASAN_LIBS \
   " %{static-libasan|static:%:include(libsanitizer.spec)%(link_libasan)}"
 #ifdef LIBASAN_EARLY_SPEC
-#define LIBASAN_SPEC STATIC_LIBASAN_LIBS
+#define LIBASAN_SPEC STATIC_LIBASAN_LIBS LIBSAN_RPATH
 #elif defined(HAVE_LD_STATIC_DYNAMIC)
 #define LIBASAN_SPEC "%{static-libasan:" LD_STATIC_OPTION \
 "} -lasan %{static-libasan:" LD_DYNAMIC_OPTION "}" \
 STATIC_LIBASAN_LIBS
 #else
-#define LIBASAN_SPEC "-lasan" STATIC_LIBASAN_LIBS
+#define LIBASAN_SPEC "-lasan" STATIC_LIBASAN_LIBS LIBSAN_RPATH
 #endif
 #endif
 
@@ -778,13 +782,13 @@ proper position among the other output files.  */
 #define STATIC_LIBTSAN_LIBS \
   " %{static-libtsan|static:%:include(libsanitizer.spec)%(link_libtsan)}"
 #ifdef LIBTSAN_EARLY_SPEC
-#define LIBTSAN_SPEC STATIC_LIBTSAN_LIBS
+#define LIBTSAN_SPEC STATIC_LIBTSAN_LIBS LIBSAN_RPATH
 #elif defined(HAVE_LD_STATIC_DYNAMIC)
 #define LIBTSAN_SPEC "%{static-libtsan:" LD_STATIC_OPTION \
 "} -ltsan %{static-libtsan:" LD_DYNAMIC_OPTION "}" \
 STATIC_LIBTSAN_LIBS
 #else
-#define LIBTSAN_SPEC "-ltsan" STATIC_LIBTSAN_LIBS
+#define LIBTSAN_SPEC "-ltsan" STATIC_LIBTSAN_LIBS LIBSAN_RPATH
 #endif
 #endif
 
@@ -796,13 +800,13 @@ proper position among the other output files.  */
 #define STATIC_LIBLSAN_LIBS \
   " %{static-liblsan|static:%:include(libsanitizer.spec)%(link_liblsan)}"
 #ifdef LIBLSAN_EARLY_SPEC
-#define LIBLSAN_SPEC STATIC_LIBLSAN_LIBS
+#define LIBLSAN_SPEC STATIC_LIBLSAN_LIBS LIBSAN_RPATH
 #elif defined(HAVE_LD_STATIC_DYNAMIC)
 #define LIBLSAN_SPEC "%{static-liblsan:" LD_STATIC_OPTION \
 "} -llsan %{static-liblsan:" LD_DYNAMIC_OPTION "}" \
 STATIC_LIBLSAN_LIBS
 #else
-#define LIBLSAN_SPEC "-llsan" STATIC_LIBLSAN_LIBS
+#define LIBLSAN_SPEC "-llsan" STATIC_LIBLSAN_LIBS LIBSAN_RPATH
 #endif
 #endif
 
@@ -816,9 +820,9 @@ proper position among the other output files.  */
 #ifdef HAVE_LD_STATIC_DYNAMIC
 #define LIBUBSAN_SPEC "%{static-libubsan:" LD_STATIC_OPTION \
 "} -lubsan %{static-libubsan:" LD_DYNAMIC

Re: Mips: Fix kernel_stat structure size

2022-07-01 Thread Xi Ruoyao via Gcc-patches
On Fri, 2022-07-01 at 12:40 +, Dimitrije Milosevic wrote:
> Fix kernel_stat structure size for non-Android 32-bit Mips.
> LLVM currently has this value for the kernel_stat structure size,
> as per compiler-rt/lib/sanitizer_common/sanitizer_platform_limits_posix.h.
> This also resolves one of the build issues for non-Android 32-bit Mips.

nit: the ChangeLog file name should not be indented in the commit
message, and there should be one tab (instead of 8 spaces) before
the content.  Like:

libsanitizer/ChangeLog:

* sanitizer_common/sanitizer_platform_limits_posix.h: Fix
kernel_stat structure size for non-Android 32-bit Mips.

Patch content LGTM as it just changes our code to match the upstream,
but I don't have the privilege to approve the change.  Richard?

>     libsanitizer/ChangeLog:
>     
>     * sanitizer_common/sanitizer_platform_limits_posix.h: Fix
>     kernel_stat structure size for non-Android 32-bit Mips.
> 
> ---
> 
>  libsanitizer/sanitizer_common/sanitizer_platform_limits_posix.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/libsanitizer/sanitizer_common/sanitizer_platform_limits_posix.h 
> b/libsanitizer/sanitizer_common/sanitizer_platform_limits_posix.h
> index 89772a7e5c0..62a99035db3 100644
> --- a/libsanitizer/sanitizer_common/sanitizer_platform_limits_posix.h
> +++ b/libsanitizer/sanitizer_common/sanitizer_platform_limits_posix.h
> @@ -83,7 +83,7 @@ const unsigned struct_kernel_stat64_sz = 104;
>  #elif defined(__mips__)
>  const unsigned struct_kernel_stat_sz = SANITIZER_ANDROID
>     ? FIRST_32_SECOND_64(104, 128)
> -   : FIRST_32_SECOND_64(144, 216);
> +   : FIRST_32_SECOND_64(160, 216);
>  const unsigned struct_kernel_stat64_sz = 104;
>  #elif defined(__s390__) && !defined(__s390x__)
>  const unsigned struct_kernel_stat_sz = 64;
> 
> ---

-- 
Xi Ruoyao 
School of Aerospace Science and Technology, Xidian University


Re: [GCC 13][PATCH] PR101836: Add a new option -fstrict-flex-array[=n] and use it in __builtin_object_size

2022-07-01 Thread Qing Zhao via Gcc-patches


> On Jul 1, 2022, at 2:49 AM, Richard Biener  wrote:
> 
> On Thu, Jun 30, 2022 at 9:30 PM Qing Zhao  wrote:
>> 
>> 
>> 
>>> On Jun 30, 2022, at 1:03 PM, Jakub Jelinek  wrote:
>>> 
>>> On Thu, Jun 30, 2022 at 03:31:00PM +, Qing Zhao wrote:
> No, that’s not true.  A FIELD_DECL is only shared for cv variants of a 
> structure.
 
 Sorry for my dump questions:
 
 1. What do you mean by “cv variants” of a structure?
>>> 
>>> const/volatile qualified variants.  So
>> Okay. I see. thanks.
>>> 
 2. For the following example:
 
 struct AX { int n; short ax[];};
>>> 
>>> struct AX, const struct AX, volatile const struct AX etc. types will share
>>> the FIELD_DECLs.
>> 
>> Okay.
>>> 
 struct UX {struct AX b; int m;};
 
 Are there two different FIELD_DECLs in the IR, one for AX.ax, the other 
 one is for UX.b.ax?
>>> 
>>> No, there are just n and ax FIELD_DECLs with DECL_CONTEXT of struct AX and
>>> b and m FIELD_DECLs with DECL_CONTEXT of struct UX.
>> 
>> Ah, right.
>> 
>> 
>>> 
>>> But, what is important is that when some FIELD_DECL is last in some
>>> structure and has array type, it doesn't mean it should have an
>>> unconstrained length.
>>> In the above case, when struct AX is followed by some other member, it
>>> acts as a strict short ax[0]; field (even when that is an exception), one
>>> can take the address of &UX.b.ax[0], but can't dereference that, or &UX.b.ax[1].
>> 
>> So, is this a GNU extension. I see that CLANG gives a warning by default and 
>> GCC gives a warning when specify -pedantic:
>> [opc@qinzhao-ol8u3-x86 trailing_array]$ cat t3.c
>> struct AX
>> {
>>  int n;
>>  short ax[];
>> };
>> 
>> struct UX
>> {
>>  struct AX b;
>>  int m;
>> };
>> 
>> void warn_ax_local (struct AX *p, struct UX *q)
>> {
>>  p->ax[2] = 0;
>>  q->b.ax[2] = 0;
>> }
>> [opc@qinzhao-ol8u3-x86 trailing_array]$ clang -O2 -Wall t3.c -S
>> t3.c:9:13: warning: field 'b' with variable sized type 'struct AX' not at 
>> the end of a struct or class is a GNU extension 
>> [-Wgnu-variable-sized-type-not-at-end]
>>  struct AX b;
>>^
>> [opc@qinzhao-ol8u3-x86 trailing_array]$ gcc -O2 -Wall t3.c -pedantic -S
>> t3.c:9:13: warning: invalid use of structure with flexible array member 
>> [-Wpedantic]
>>9 |   struct AX b;
>>  | ^
>> 
>> But, Yes, I agree, even though this is only a GNU extension, We still need 
>> to handle it and accept it as legal code.
>> 
>> Then, yes, I also agree that encoding the info of is_flexible_array into 
>> FIELD_DECL is not good.
> 
> Which is why I suggested to encode 'not_flexible_array'.  This way the
> FE can mark all a[1] this way in some mode
> but leave a[] as possibly flexarray (depending on context).

Then FE marking (not_flexible_array) cannot do the complete job of deciding
whether a field array is a flexible array member or not; the middle end still
needs to check the “context” (i.e., whether the array ref is at the end of a
structure).

So only FE marking + middle-end “context checking” together will decide a REAL
flex array?

If so, compared to the current implementation, which has all the checking in
the middle end, what’s the major benefit of moving part of the checking into
the FE and leaving the other part in the middle end?

> 
>> How about encoding the info of “has_flexible_array” into the enclosing 
>> RECORD_TYPE or UNION_TYPE node?
> 
> But that has the same issue.  Consider
> 
> struct A { int n; int a[1]; };
> 
> where a is considered possibly a flexarray vs.
> 
> struct B { struct A a; int b; };
> 
> where B.a would be not considered to have a flexarray (again note
> 'possibly' vs. 'actually does').
> 
> Also
> 
> struct A a;
> 
> has 'a' as _not_ having a flexarray (because it's size is statically
> allocated) but
> 
> struct A *a;
> struct B *b;
> 
> a->a[n];
> 
> as possibly accessing the flexarray portion of *a while
> 
> b->a.a[n]
> 
> is not accessing a flexarray because there's a member after a in b.
> 
> For your original proposal it's really the field declaration itself
> which changes so annotating the FIELD_DECL
> seems correct to me.

Then the middle end still needs to check the context, combined
with the “not_flexible_array” flag that is encoded in FIELD_DECL,
to make the final decision?

Thanks.

Qing
> 
>> For example, in the above example,  the RECORD_TYPE for “struct AX” will be 
>> marked as “has_flexible_array”, but that for “struct UX” will not.
>> 
>>> 
>>> I believe pedantically flexible array members in such cases don't
>>> necessarily mean zero length array, could be longer, e.g. for the usual
>>> x86_64 alignments
>>> struct BX { long long n; short o; short ax[]; };
>>> struct VX { struct BX b; int m; };
>>> I think it acts as short ax[3]; because the padding at the end of struct BX
>>> is so long that 3 short elements fit in there.
>>> While if one uses
>>> struct BX bx = { 1LL, 2, { 3, 4, 5, 6, 7, 8, 9, 10 } };
>>> (a GNU extension), then it acts as short ax[11]; - the initializer i

Re: [GCC 13][PATCH] PR101836: Add a new option -fstrict-flex-array[=n] and use it in __builtin_object_size

2022-07-01 Thread Richard Biener via Gcc-patches
On Fri, Jul 1, 2022 at 2:55 PM Qing Zhao  wrote:
>
>
>
> > On Jul 1, 2022, at 2:49 AM, Richard Biener  
> > wrote:
> >
> > On Thu, Jun 30, 2022 at 9:30 PM Qing Zhao  wrote:
> >>
> >>
> >>
> >>> On Jun 30, 2022, at 1:03 PM, Jakub Jelinek  wrote:
> >>>
> >>> On Thu, Jun 30, 2022 at 03:31:00PM +, Qing Zhao wrote:
> > No, that’s not true.  A FIELD_DECL is only shared for cv variants of a 
> > structure.
> 
>  Sorry for my dump questions:
> 
>  1. What do you mean by “cv variants” of a structure?
> >>>
> >>> const/volatile qualified variants.  So
> >> Okay. I see. thanks.
> >>>
>  2. For the following example:
> 
>  struct AX { int n; short ax[];};
> >>>
> >>> struct AX, const struct AX, volatile const struct AX etc. types will share
> >>> the FIELD_DECLs.
> >>
> >> Okay.
> >>>
>  struct UX {struct AX b; int m;};
> 
>  Are there two different FIELD_DECLs in the IR, one for AX.ax, the other 
>  one is for UX.b.ax?
> >>>
> >>> No, there are just n and ax FIELD_DECLs with DECL_CONTEXT of struct AX and
> >>> b and m FIELD_DECLs with DECL_CONTEXT of struct UX.
> >>
> >> Ah, right.
> >>
> >>
> >>>
> >>> But, what is important is that when some FIELD_DECL is last in some
> >>> structure and has array type, it doesn't mean it should have an
> >>> unconstrained length.
> >>> In the above case, when struct AX is followed by some other member, it
> >>> acts as a strict short ax[0]; field (even when that is an exception), one
> >>> can take the address of &UX.b.ax[0], but can't dereference that, or
> >>> &UX.b.ax[1].
> >>
> >> So, is this a GNU extension. I see that CLANG gives a warning by default 
> >> and GCC gives a warning when specify -pedantic:
> >> [opc@qinzhao-ol8u3-x86 trailing_array]$ cat t3.c
> >> struct AX
> >> {
> >>  int n;
> >>  short ax[];
> >> };
> >>
> >> struct UX
> >> {
> >>  struct AX b;
> >>  int m;
> >> };
> >>
> >> void warn_ax_local (struct AX *p, struct UX *q)
> >> {
> >>  p->ax[2] = 0;
> >>  q->b.ax[2] = 0;
> >> }
> >> [opc@qinzhao-ol8u3-x86 trailing_array]$ clang -O2 -Wall t3.c -S
> >> t3.c:9:13: warning: field 'b' with variable sized type 'struct AX' not at 
> >> the end of a struct or class is a GNU extension 
> >> [-Wgnu-variable-sized-type-not-at-end]
> >>  struct AX b;
> >>^
> >> [opc@qinzhao-ol8u3-x86 trailing_array]$ gcc -O2 -Wall t3.c -pedantic -S
> >> t3.c:9:13: warning: invalid use of structure with flexible array member 
> >> [-Wpedantic]
> >>9 |   struct AX b;
> >>  | ^
> >>
> >> But, Yes, I agree, even though this is only a GNU extension, We still need 
> >> to handle it and accept it as legal code.
> >>
> >> Then, yes, I also agree that encoding the info of is_flexible_array into 
> >> FIELD_DECL is not good.
> >
> > Which is why I suggested to encode 'not_flexible_array'.  This way the
> > FE can mark all a[1] this way in some mode
> > but leave a[] as possibly flexarray (depending on context).
>
> Then, FE marking (not_flexible_array) can not do the complete job to mark
> whether a field array is flexible array member or not,  Middle end still need 
> to
> check the “context” (i.e, whether the array ref is at the end of a structure?)

Yes, but at the very "root" the frontend gets to say whether char[1]
is possibly a flexarray or if only char[] is.

> So, only FE marking + Middle-end “context checking” together will decide a 
> REAL flex array?
>
> If so, comparing to the current implemenation to have all the checking in 
> middle-end, what’s the
> major benefit of moving part of the checking into FE, and leaving the other 
> part in middle-end?

Because a frontend might decide based on language rules that char[1]
is never a flexarray, and in particular it can decide to do that only
for user-declared structures.  The latter is difficult for the
middle-end, where some aggregates are built by the middle-end (gcov)
or the targets.

> >
> >> How about encoding the info of “has_flexible_array” into the enclosing 
> >> RECORD_TYPE or UNION_TYPE node?
> >
> > But that has the same issue.  Consider
> >
> > struct A { int n; int a[1]; };
> >
> > where a is considered possibly a flexarray vs.
> >
> > struct B { struct A a; int b; };
> >
> > where B.a would be not considered to have a flexarray (again note
> > 'possibly' vs. 'actually does').
> >
> > Also
> >
> > struct A a;
> >
> > has 'a' as _not_ having a flexarray (because it's size is statically
> > allocated) but
> >
> > struct A *a;
> > struct B *b;
> >
> > a->a[n];
> >
> > as possibly accessing the flexarray portion of *a while
> >
> > b->a.a[n]
> >
> > is not accessing a flexarray because there's a member after a in b.
> >
> > For your original proposal it's really the field declaration itself
> > which changes so annotating the FIELD_DECL
> > seems correct to me.
>
> Then middle-end still need to check the context, and combined
> with the “not_flexible_array” flag that is encoded in FIELD_DECL
>  to make the final decision?

Yes.

> T

Re: [PATCH] Mips: Resolve build issues for the n32 ABI

2022-07-01 Thread Xi Ruoyao via Gcc-patches
On Fri, 2022-07-01 at 12:40 +, Dimitrije Milosevic wrote:
> Building the ASAN for the n32 MIPS ABI currently fails, due to a few reasons:
> - defined(__mips64), which is set solely based on the architecture type 
> (32-bit/64-bit), 
> was still used in some places. Therefore, defined(__mips64) is swapped with 
> SANITIZER_MIPS64, 
> which takes the ABI into account as well - defined(__mips64) && 
> _MIPS_SIM == ABI64.
> - The n32 ABI still uses 64-bit *Linux* system calls, even though the word 
> size is 32 bits.
> - After the transition to canonical system calls 
> (https://reviews.llvm.org/D124212), the n32 ABI still didn't use them,
> even though they are supported,
> as per 
> https://github.com/torvalds/linux/blob/master/arch/mips/kernel/syscalls/syscall_n32.tbl.
> 
> See https://reviews.llvm.org/D127098.
> 
>     libsanitizer/ChangeLog:
>     
>     * sanitizer_common/sanitizer_linux.cpp (defined): Resolve
>     ASAN build issues for the Mips n32 ABI.
>     * sanitizer_common/sanitizer_platform.h (defined): Likewise.

LGTM (with the ChangeLog format fixed), but I think you need to commit
this into the LLVM repository first.  And in the commit message you should
say something like "cherry-pick 0011aabb... from upstream".  Then we
still require approval from a maintainer.

> ---
> 
>  libsanitizer/sanitizer_common/sanitizer_linux.cpp  | 17 ++---
>  libsanitizer/sanitizer_common/sanitizer_platform.h |  2 +-
>  2 files changed, 11 insertions(+), 8 deletions(-)
> 
> diff --git a/libsanitizer/sanitizer_common/sanitizer_linux.cpp 
> b/libsanitizer/sanitizer_common/sanitizer_linux.cpp
> index e2c32d679ad..5ba033492e7 100644
> --- a/libsanitizer/sanitizer_common/sanitizer_linux.cpp
> +++ b/libsanitizer/sanitizer_common/sanitizer_linux.cpp
> @@ -34,7 +34,7 @@
>  // format. Struct kernel_stat is defined as 'struct stat' in asm/stat.h. To
>  // access stat from asm/stat.h, without conflicting with definition in
>  // sys/stat.h, we use this trick.
> -#if defined(__mips64)
> +#if SANITIZER_MIPS64
>  #include 
>  #include 
>  #define stat kernel_stat
> @@ -124,8 +124,9 @@ const int FUTEX_WAKE_PRIVATE = FUTEX_WAKE | 
> FUTEX_PRIVATE_FLAG;
>  // Are we using 32-bit or 64-bit Linux syscalls?
>  // x32 (which defines __x86_64__) has SANITIZER_WORDSIZE == 32
>  // but it still needs to use 64-bit syscalls.
> -#if SANITIZER_LINUX && (defined(__x86_64__) || defined(__powerpc64__) || 
>   \
> -    SANITIZER_WORDSIZE == 64)
> +#if SANITIZER_LINUX && (defined(__x86_64__) || defined(__powerpc64__) || \
> +    SANITIZER_WORDSIZE == 64 ||  \
> +    (defined(__mips__) && _MIPS_SIM == _ABIN32))
>  # define SANITIZER_LINUX_USES_64BIT_SYSCALLS 1
>  #else
>  # define SANITIZER_LINUX_USES_64BIT_SYSCALLS 0
> @@ -289,7 +290,7 @@ static void stat64_to_stat(struct stat64 *in, struct stat 
> *out) {
>  }
>  #endif
>  
> -#if defined(__mips64)
> +#if SANITIZER_MIPS64
>  // Undefine compatibility macros from 
>  // so that they would not clash with the kernel_stat
>  // st_[a|m|c]time fields
> @@ -343,7 +344,8 @@ uptr internal_stat(const char *path, void *buf) {
>  #if SANITIZER_FREEBSD
>    return internal_syscall(SYSCALL(fstatat), AT_FDCWD, (uptr)path, (uptr)buf, 
> 0);
>  #    elif SANITIZER_LINUX
> -#  if SANITIZER_WORDSIZE == 64 || SANITIZER_X32
> +#  if SANITIZER_WORDSIZE == 64 || SANITIZER_X32 || \
> +  (defined(__mips__) && _MIPS_SIM == _ABIN32)
>    return internal_syscall(SYSCALL(newfstatat), AT_FDCWD, (uptr)path, 
> (uptr)buf,
>    0);
>  #  else
> @@ -366,7 +368,8 @@ uptr internal_lstat(const char *path, void *buf) {
>    return internal_syscall(SYSCALL(fstatat), AT_FDCWD, (uptr)path, (uptr)buf,
>    AT_SYMLINK_NOFOLLOW);
>  #    elif SANITIZER_LINUX
> -#  if defined(_LP64) || SANITIZER_X32
> +#  if defined(_LP64) || SANITIZER_X32 || \
> +  (defined(__mips__) && _MIPS_SIM == _ABIN32)
>    return internal_syscall(SYSCALL(newfstatat), AT_FDCWD, (uptr)path, 
> (uptr)buf,
>    AT_SYMLINK_NOFOLLOW);
>  #  else
> @@ -1053,7 +1056,7 @@ uptr GetMaxVirtualAddress() {
>    return (1ULL << (MostSignificantSetBitIndex(GET_CURRENT_FRAME()) + 1)) - 1;
>  #elif SANITIZER_RISCV64
>    return (1ULL << 38) - 1;
> -# elif defined(__mips64)
> +# elif SANITIZER_MIPS64
>    return (1ULL << 40) - 1;  // 0x00ffUL;
>  # elif defined(__s390x__)
>    return (1ULL << 53) - 1;  // 0x001fUL;
> diff --git a/libsanitizer/sanitizer_common/sanitizer_platform.h 
> b/libsanitizer/sanitizer_common/sanitizer_platform.h
> index 8fe0d831431..8bd9a327623 100644
> --- a/libsanitizer/sanitizer_common/sanitizer_platform.h
> +++ b/libsanitizer/sanitizer_common/sanitizer_platform.h
> @@ -159,7 +159,7 @@
>  
>  #if defined(__mips__)
>  #  define SANITIZER_MIPS 1
> -#  if defined(__mips64)
> +#  if defined(__mips

Re: [GCC 13][PATCH] PR101836: Add a new option -fstrict-flex-array[=n] and use it in __builtin_object_size

2022-07-01 Thread Jakub Jelinek via Gcc-patches
On Fri, Jul 01, 2022 at 12:55:08PM +, Qing Zhao wrote:
> If so, comparing to the current implementation to have all the checking in 
> middle-end, what’s the 
> major benefit of moving part of the checking into FE, and leaving the other 
> part in middle-end?

The point is recording early what FIELD_DECLs could be vs. can't possibly be
treated like flexible array members and just use that flag in the decisions
in the current routines in addition to what it is doing.

Jakub



Re: [PATCH 4/12] arm: Add testsuite library support for PACBTI target

2022-07-01 Thread Richard Earnshaw via Gcc-patches




On 28/04/2022 10:40, Andrea Corallo via Gcc-patches wrote:

Add targeting-checking entities for PACBTI in testsuite
framework.

Pre-approved with the requested changes here
.

gcc/testsuite/ChangeLog:

* testsuite/lib/target-supports.exp:
(check_effective_target_arm_pacbti_hw): New.
* doc/sourcebuild.texi: Document arm_pacbti_hw.

Co-Authored-By: Tejas Belagod  


+proc check_effective_target_arm_pacbti_hw {} {
+return [check_runtime arm_pacbti_hw_available {
+   __attribute__ ((naked)) int
+   main (void)
+   {
+ asm ("pac r12, lr, sp");

So the armv8-m Arm ARM says that this instruction is in the NOP space 
and that it is undefined if we aren't armv8-m.main or higher.


+ asm ("mov r0, #0");
+ asm ("autg r12, lr, sp");

This isn't in the nop space, but the Arm ARM says it is unpredictable if 
the extension isn't present.  Unfortunately, that means this isn't a 
particularly reliable way of detecting that the PACBTI feature is present.


However, I can't think offhand of a more reliable way of testing this 
since reading the feature register ID_ISAR5 is not possible when in 
unprivileged mode.


So I think we'll have to live with this.

+ asm ("bx lr");
+   }
+} ""]

OK.

R.


[Patch][v5] OpenMP: Move omp requires checks to libgomp

2022-07-01 Thread Tobias Burnus

Attached is the updated patch. Main changes:
- File names shown that violate the requires-clause-must-be-same requirement
  Taken from the offload_vars/funcs context (if available), otherwise
  (that's no 'omp target'/'omp declare target' but just 'omp target update/data'
  in the TU), the *.o file name is used.
(thanks to richi + jakub for the suggestions!)
- Uses GOMP_register_var to pass the mask to libgomp
(and no longer a weak variable)
- 'omp declare target' is not regarded as being used -> pending OpenMP lang 
spec clarification
- 'omp target update' is for C/C++
- Properly handle is used by-target constructs for Fortran
- Save requires (and empty offload table) in the *.o file, even if it is only
  using 'omp target (enter/exit) data'

Thanks go to Jakub for many useful suggestions!

Tested without offloading configured and with nvptx and amdgcn offloading (all 
on x86_64-gnu-linux).

OK? Or does anyone have more useful suggestions?

Tobias
-
Siemens Electronic Design Automation GmbH; Address: Arnulfstraße 201, 80634 
München; limited liability company; Managing Directors: Thomas 
Heurung, Frank Thürauf; Registered office: München; Court of registration: 
München, HRB 106955
OpenMP: Move omp requires checks to libgomp

Handle reverse_offload, unified_address, and unified_shared_memory
requirements in libgomp by saving them alongside the offload table.
When the device lto1 runs, it extracts the data for mkoffload. The
latter then passes the value on to GOMP_offload_register_ver.

lto1 (either the host one, with -flto [+ ENABLE_OFFLOADING], or the
offload-device lto1) also does the consistency check,
erroring out when the 'omp requires' clause use is inconsistent.

For all in-principle supported devices, if a requirement cannot be fulfilled,
the device is excluded from the (supported) devices list. Currently, none of
those requirements are marked as supported for any of the non-host devices.
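
As a rough illustration of the two checks described above (not the code in
the attached patch), the following C sketch models the 'omp requires' mask
as a bit set. The REQ_* values and both function names are hypothetical;
the patch adds GOMP_REQUIRES_* defines to gomp-constants.h, but their
actual values are not shown in this message.

```c
#include <assert.h>

/* Hypothetical bit values, chosen only for this sketch.  */
#define REQ_UNIFIED_ADDRESS        0x10u
#define REQ_UNIFIED_SHARED_MEMORY  0x20u
#define REQ_REVERSE_OFFLOAD        0x80u

/* Consistency check: every translation unit must carry the same mask.
   Returns 1 and stores the common mask in *COMBINED on success,
   0 when two units disagree.  */
static int
requires_consistent (const unsigned *tu_masks, int n, unsigned *combined)
{
  assert (n >= 0);
  for (int i = 1; i < n; i++)
    if (tu_masks[i] != tu_masks[0])
      return 0;
  *combined = n > 0 ? tu_masks[0] : 0;
  return 1;
}

/* Device filtering: report -1 when devices exist but some requested
   requirement is not in SUPPORTED_MASK.  With SUPPORTED_MASK == 0 this
   models the current state, where no non-host device supports any of
   the requirements.  */
static int
plugin_num_devices (int physical_devices, unsigned requires_mask,
                    unsigned supported_mask)
{
  if (physical_devices > 0 && (requires_mask & ~supported_mask) != 0)
    return -1;
  return physical_devices;
}
```

With a supported mask of 0, any non-zero requires mask excludes an otherwise
available device, which matches the "return -1 when device available but
omp_requires_mask != 0" entry in the libgomp ChangeLog below.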

gcc/c/ChangeLog:

	* c-parser.cc (c_parser_omp_target_data, c_parser_omp_target_update,
	c_parser_omp_target_enter_data, c_parser_omp_target_exit_data): Set
	OMP_REQUIRES_TARGET_USED.
	(c_parser_omp_requires): Remove sorry.

gcc/ChangeLog:

	* config/gcn/mkoffload.cc (process_asm): Write '#include '.
	(process_obj): Pass omp_requires_mask to GOMP_offload_register_ver.
	(main): Ask lto1 to obtain omp_requires_mask and pass it on.
	* config/nvptx/mkoffload.cc (process, main): Likewise.
	* lto-cgraph.cc (omp_requires_to_name): New.
	(input_offload_tables): Save omp_requires_mask.
	(output_offload_tables): Read it, check for consistency,
	save value for mkoffload.
	* omp-low.cc (lower_omp_target): Force output_offloadtables
	call for OMP_REQUIRES_TARGET_USED.

gcc/cp/ChangeLog:

	* parser.cc (cp_parser_omp_target_data,
	cp_parser_omp_target_enter_data, cp_parser_omp_target_exit_data,
	cp_parser_omp_target_update): Set OMP_REQUIRES_TARGET_USED.
	(cp_parser_omp_requires): Remove sorry.

gcc/fortran/ChangeLog:

	* openmp.cc (gfc_match_omp_requires): Remove sorry.
	* parse.cc (decode_omp_directive): Don't regard 'declare target'
	as target usage for 'omp requires'; add more flags to
	omp_requires_mask.

include/ChangeLog:

	* gomp-constants.h (GOMP_VERSION): Bump to 2.
	(GOMP_REQUIRES_UNIFIED_ADDRESS, GOMP_REQUIRES_UNIFIED_SHARED_MEMORY,
	GOMP_REQUIRES_REVERSE_OFFLOAD): New defines.

libgomp/ChangeLog:

	* libgomp-plugin.h (GOMP_OFFLOAD_get_num_devices): Add
	omp_requires_mask arg.
	* plugin/plugin-gcn.c (GOMP_OFFLOAD_get_num_devices): Likewise;
	return -1 when device available but omp_requires_mask != 0.
	* plugin/plugin-nvptx.c (GOMP_OFFLOAD_get_num_devices): Likewise.
	* oacc-host.c (host_get_num_devices, host_openacc_get_property):
	Update call.
	* oacc-init.c (resolve_device, acc_init_1, acc_shutdown_1,
	goacc_attach_host_thread_to_device, acc_get_num_devices,
	acc_set_device_num, get_property_any): Likewise.
	* target.c (omp_requires_mask): New global var.
	(gomp_requires_to_name): New.
	(GOMP_offload_register_ver): Handle passed omp_requires_mask.
	(gomp_target_init): Handle omp_requires_mask.
	* libgomp.texi (OpenMP 5.0): Update requires impl. status.
	(OpenMP 5.1): Add a missed item.
	(OpenMP 5.2): Mark linear-clause change as supported in C/C++.
	* testsuite/libgomp.c-c++-common/requires-1-aux.c: New test.
	* testsuite/libgomp.c-c++-common/requires-1.c: New test.
	* testsuite/libgomp.c-c++-common/requires-2-aux.c: New test.
	* testsuite/libgomp.c-c++-common/requires-2.c: New test.
	* testsuite/libgomp.c-c++-common/requires-3-aux.c: New test.
	* testsuite/libgomp.c-c++-common/requires-3.c: New test.
	* testsuite/libgomp.c-c++-common/requires-4-aux.c: New test.
	* testsuite/libgomp.c-c++-common/requires-4.c: New test.
	* testsuite/libgomp.c-c++-common/requires-5-aux.c: New test.
	* testsuite/libgomp.c-c++-common/requires-5.c: New test.
	* testsuite/libgomp.c-c++-common/requires-6.c: New test.
	* testsuite/libgomp.c-c++-common/requires-7-aux.c: New test.
	* testsuite/libg

[PATCH] target/105459 - allow delayed target option node fixup

2022-07-01 Thread Richard Biener via Gcc-patches
The following avoids the need to massage the target optimization
node at WPA time when we fixup the optimization node, copying
FP related flags from callee to caller.  The target is already
set up to fixup, but that only works when not switching between
functions.  After fixing that the fixup is then done at LTRANS
time when materializing the function.

Bootstrapped and tested on x86_64-unknown-linux-gnu, ok?

Thanks,
Richard.

2022-07-01  Richard Biener  

PR target/105459
* config/i386/i386-options.cc (ix86_set_current_function):
Rebuild the target optimization node whenever necessary,
not only when the optimization node didn't change.

* gcc.dg/lto/pr105459_0.c: New testcase.
---
 gcc/config/i386/i386-options.cc   | 32 ++--
 gcc/testsuite/gcc.dg/lto/pr105459_0.c | 35 +++
 2 files changed, 48 insertions(+), 19 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/lto/pr105459_0.c

diff --git a/gcc/config/i386/i386-options.cc b/gcc/config/i386/i386-options.cc
index e11f68186f5..acb2291e70f 100644
--- a/gcc/config/i386/i386-options.cc
+++ b/gcc/config/i386/i386-options.cc
@@ -3232,28 +3232,22 @@ ix86_set_current_function (tree fndecl)
   if (new_tree == NULL_TREE)
 new_tree = target_option_default_node;
 
-  if (old_tree != new_tree)
+  bool fp_flag_change
+= (flag_unsafe_math_optimizations
+   != TREE_TARGET_OPTION (new_tree)->x_ix86_unsafe_math_optimizations
+   || (flag_excess_precision
+  != TREE_TARGET_OPTION (new_tree)->x_ix86_excess_precision));
+  if (old_tree != new_tree || fp_flag_change)
 {
   cl_target_option_restore (&global_options, &global_options_set,
TREE_TARGET_OPTION (new_tree));
-  if (TREE_TARGET_GLOBALS (new_tree))
-   restore_target_globals (TREE_TARGET_GLOBALS (new_tree));
-  else if (new_tree == target_option_default_node)
-   restore_target_globals (&default_target_globals);
-  else
-   TREE_TARGET_GLOBALS (new_tree) = save_target_globals_default_opts ();
-}
-  else if (flag_unsafe_math_optimizations
-  != TREE_TARGET_OPTION (new_tree)->x_ix86_unsafe_math_optimizations
-  || (flag_excess_precision
-  != TREE_TARGET_OPTION (new_tree)->x_ix86_excess_precision))
-{
-  cl_target_option_restore (&global_options, &global_options_set,
-   TREE_TARGET_OPTION (new_tree));
-  ix86_excess_precision = flag_excess_precision;
-  ix86_unsafe_math_optimizations = flag_unsafe_math_optimizations;
-  DECL_FUNCTION_SPECIFIC_TARGET (fndecl) = new_tree
-   = build_target_option_node (&global_options, &global_options_set);
+  if (fp_flag_change)
+   {
+ ix86_excess_precision = flag_excess_precision;
+ ix86_unsafe_math_optimizations = flag_unsafe_math_optimizations;
+ DECL_FUNCTION_SPECIFIC_TARGET (fndecl) = new_tree
+   = build_target_option_node (&global_options, &global_options_set);
+   }
   if (TREE_TARGET_GLOBALS (new_tree))
restore_target_globals (TREE_TARGET_GLOBALS (new_tree));
   else if (new_tree == target_option_default_node)
diff --git a/gcc/testsuite/gcc.dg/lto/pr105459_0.c 
b/gcc/testsuite/gcc.dg/lto/pr105459_0.c
new file mode 100644
index 000..c799e6ef23d
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/lto/pr105459_0.c
@@ -0,0 +1,35 @@
+/* { dg-lto-do link } */
+/* { dg-lto-options { { -flto -O1 } } } */
+
+double m;
+int n;
+
+__attribute__ ((optimize ("-funsafe-math-optimizations")))
+void
+bar (int x)
+{
+  n = x;
+  m = n;
+}
+
+__attribute__ ((flatten))
+void
+foo (int x)
+{
+  bar (x);
+}
+
+void
+quux (void)
+{
+  ++n;
+}
+
+int
+main (void)
+{
+  foo (0);
+  quux ();
+
+  return 0;
+}
-- 
2.35.3


[PATCH] Avoid unused sbitmap in update_ssa

2022-07-01 Thread Richard Biener via Gcc-patches
The following avoids copying and using blocks_to_update to
the interesting_blocks sbitmap when doing update_ssa as it is
unused besides the redundant query in the domwalk.

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.

* tree-into-ssa.cc (rewrite_update_dom_walker::before_dom_children):
Do not look at interesting_blocks which is a copy of
blocks_to_update.
(update_ssa): Do not initialize it.
(pass_build_ssa::execute): Set interesting_blocks to NULL
after releasing it.
---
 gcc/tree-into-ssa.cc | 22 ++
 1 file changed, 6 insertions(+), 16 deletions(-)

diff --git a/gcc/tree-into-ssa.cc b/gcc/tree-into-ssa.cc
index c4e40e8fb08..c90651c3a89 100644
--- a/gcc/tree-into-ssa.cc
+++ b/gcc/tree-into-ssa.cc
@@ -2214,15 +2214,11 @@ rewrite_update_dom_walker::before_dom_children 
(basic_block bb)
 }
 
   /* Step 2.  Rewrite every variable used in each statement in the block.  */
-  if (bitmap_bit_p (interesting_blocks, bb->index))
-{
-  gcc_checking_assert (bitmap_bit_p (blocks_to_update, bb->index));
-  for (gimple_stmt_iterator gsi = gsi_start_bb (bb); !gsi_end_p (gsi); )
-   if (rewrite_update_stmt (gsi_stmt (gsi), gsi))
- gsi_remove (&gsi, true);
-   else
- gsi_next (&gsi);
-}
+  for (gimple_stmt_iterator gsi = gsi_start_bb (bb); !gsi_end_p (gsi); )
+if (rewrite_update_stmt (gsi_stmt (gsi), gsi))
+  gsi_remove (&gsi, true);
+else
+  gsi_next (&gsi);
 
   /* Step 3.  Update PHI nodes.  */
   rewrite_update_phi_arguments (bb);
@@ -2460,6 +2456,7 @@ pass_build_ssa::execute (function *fun)
   free (dfs);
 
   sbitmap_free (interesting_blocks);
+  interesting_blocks = NULL;
 
   fini_ssa_renamer ();
 
@@ -3503,15 +3500,8 @@ update_ssa (unsigned update_flags)
 get_var_info (sym)->info.current_def = NULL_TREE;
 
   /* Now start the renaming process at START_BB.  */
-  interesting_blocks = sbitmap_alloc (last_basic_block_for_fn (cfun));
-  bitmap_clear (interesting_blocks);
-  EXECUTE_IF_SET_IN_BITMAP (blocks_to_update, 0, i, bi)
-bitmap_set_bit (interesting_blocks, i);
-
   rewrite_blocks (start_bb, REWRITE_UPDATE);
 
-  sbitmap_free (interesting_blocks);
-
   /* Debugging dumps.  */
   if (dump_file)
 {
-- 
2.35.3


Re: [GCC 13][PATCH] PR101836: Add a new option -fstrict-flex-array[=n] and use it in __builtin_object_size

2022-07-01 Thread Qing Zhao via Gcc-patches


> On Jul 1, 2022, at 8:58 AM, Richard Biener  wrote:
> 
> On Fri, Jul 1, 2022 at 2:55 PM Qing Zhao  wrote:
>> 
>> 
>> 
>>> On Jul 1, 2022, at 2:49 AM, Richard Biener  
>>> wrote:
>>> 
>>> On Thu, Jun 30, 2022 at 9:30 PM Qing Zhao  wrote:
 
 
 
> On Jun 30, 2022, at 1:03 PM, Jakub Jelinek  wrote:
> 
> On Thu, Jun 30, 2022 at 03:31:00PM +, Qing Zhao wrote:
>>> No, that’s not true.  A FIELD_DECL is only shared for cv variants of a 
>>> structure.
>> 
>> Sorry for my dumb questions:
>> 
>> 1. What do you mean by “cv variants” of a structure?
> 
> const/volatile qualified variants.  So
 Okay. I see. thanks.
> 
>> 2. For the following example:
>> 
>> struct AX { int n; short ax[];};
> 
> struct AX, const struct AX, volatile const struct AX etc. types will share
> the FIELD_DECLs.
 
 Okay.
> 
>> struct UX {struct AX b; int m;};
>> 
>> Are there two different FIELD_DECLs in the IR, one for AX.ax, the other 
>> one is for UX.b.ax?
> 
> No, there are just n and ax FIELD_DECLs with DECL_CONTEXT of struct AX and
> b and m FIELD_DECLs with DECL_CONTEXT of struct UX.
 
 Ah, right.
 
 
> 
> But, what is important is that when some FIELD_DECL is last in some
> structure and has array type, it doesn't mean it should have an
> unconstrained length.
> In the above case, when struct AX is followed by some other member, it
> acts as a strict short ax[0]; field (even when that is an exception), one
> can take the address &UX.b.ax[0], but can't dereference that, or 
> &UX.b.ax[1].
 
 So, is this a GNU extension? I see that Clang gives a warning by default 
 and GCC gives a warning when -pedantic is specified:
 [opc@qinzhao-ol8u3-x86 trailing_array]$ cat t3.c
 struct AX
 {
 int n;
 short ax[];
 };
 
 struct UX
 {
 struct AX b;
 int m;
 };
 
 void warn_ax_local (struct AX *p, struct UX *q)
 {
 p->ax[2] = 0;
 q->b.ax[2] = 0;
 }
 [opc@qinzhao-ol8u3-x86 trailing_array]$ clang -O2 -Wall t3.c -S
 t3.c:9:13: warning: field 'b' with variable sized type 'struct AX' not at 
 the end of a struct or class is a GNU extension 
 [-Wgnu-variable-sized-type-not-at-end]
 struct AX b;
   ^
 [opc@qinzhao-ol8u3-x86 trailing_array]$ gcc -O2 -Wall t3.c -pedantic -S
 t3.c:9:13: warning: invalid use of structure with flexible array member 
 [-Wpedantic]
   9 |   struct AX b;
 | ^
 
 But, Yes, I agree, even though this is only a GNU extension, We still need 
 to handle it and accept it as legal code.
 
 Then, yes, I also agree that encoding the info of is_flexible_array into 
 FIELD_DECL is not good.
>>> 
>>> Which is why I suggested to encode 'not_flexible_array'.  This way the
>>> FE can mark all a[1] this way in some mode
>>> but leave a[] as possibly flexarray (depending on context).
>> 
>> Then, FE marking (not_flexible_array) can not do the complete job to mark
>> whether a field array is flexible array member or not,  Middle end still 
>> need to
>> check the “context” (i.e, whether the array ref is at the end of a 
>> structure?)
> 
> Yes, but at the very "root" the frontend gets to say whether char[1]
> is possibly
> flexarray or if only char[] is.

Okay. 
> 
>> So, only FE marking + Middle-end “context checking” together will decide a 
>> REAL flex array?
>> 
>> If so, comparing to the current implementation to have all the checking in 
>> middle-end, what’s the
>> major benefit of moving part of the checking into FE, and leaving the other 
>> part in middle-end?
> 
> Because a frontend might decide based on language rules that char[1]
> is never a flexarray and
> in particular it can decide to do that only for user declared
> structures.  In particular the latter is
> difficult for the middle-end where some aggregates are built by the
> middle-end (gcov) or the
> targets.

That makes sense. 
> 
>>> 
 How about encoding the info of “has_flexible_array” into the enclosing 
 RECORD_TYPE or UNION_TYPE node?
>>> 
>>> But that has the same issue.  Consider
>>> 
>>> struct A { int n; int a[1]; };
>>> 
>>> where a is considered possibly a flexarray vs.
>>> 
>>> struct B { struct A a; int b; };
>>> 
>>> where B.a would be not considered to have a flexarray (again note
>>> 'possibly' vs. 'actually does').
>>> 
>>> Also
>>> 
>>> struct A a;
>>> 
>>> has 'a' as _not_ having a flexarray (because its size is statically
>>> allocated) but
>>> 
>>> struct A *a;
>>> struct B *b;
>>> 
>>> a->a[n];
>>> 
>>> as possibly accessing the flexarray portion of *a while
>>> 
>>> b->a.a[n]
>>> 
>>> is not accessing a flexarray because there's a member after a in b.
>>> 
>>> For your original proposal it's really the field declaration itself
>>> which changes so annotating the FIELD_DECL
>>> seems correct to me.
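
The struct A / struct B case quoted above can be checked with offsetof
alone. This is only a sketch of the layout argument, not GCC code; the
helper name is made up for illustration, and nothing here actually indexes
past the declared bound.

```c
#include <assert.h>
#include <stddef.h>

struct A { int n; int a[1]; };    /* trailing a[1]: possibly a flexarray */
struct B { struct A a; int b; };  /* here struct A is followed by b      */

/* struct B has to hold a whole struct A plus the extra member.  */
static_assert (sizeof (struct B) > sizeof (struct A),
               "B contains A plus one more member");

/* Byte offset, within struct B, of the first byte past the declared
   a.a[0], i.e. where a hypothetical a.a[1] would have to live.  */
static size_t
end_of_inner_array (void)
{
  return offsetof (struct B, a) + offsetof (struct A, a) + sizeof (int[1]);
}
```

On common ABIs end_of_inner_array () equals offsetof (struct B, b), so an
access b->a.a[1] would land on b->b rather than on extra trailing storage.
That is the reason B.a is not treated as carrying a flexarray, while a
bare struct A * still might.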
>> 

Re: [GCC 13][PATCH] PR101836: Add a new option -fstrict-flex-array[=n] and use it in __builtin_object_size

2022-07-01 Thread Qing Zhao via Gcc-patches


> On Jul 1, 2022, at 8:59 AM, Jakub Jelinek  wrote:
> 
> On Fri, Jul 01, 2022 at 12:55:08PM +, Qing Zhao wrote:
>> If so, comparing to the current implementation to have all the checking in 
>> middle-end, what’s the 
>> major benefit of moving part of the checking into FE, and leaving the other 
>> part in middle-end?
> 
> The point is recording early what FIELD_DECLs could be vs. can't possibly be
> treated like flexible array members and just use that flag in the decisions
> in the current routines in addition to what it is doing.

Okay. 

Based on the discussion so far, I will do the following:

1. Add a new flag “DECL_NOT_FLEXARRAY” to FIELD_DECL;
2. In the C/C++ FE, set the new flag “DECL_NOT_FLEXARRAY” for a FIELD_DECL
   based on [0], [1], [], the option -fstrict-flex-array, and whether it’s
   the last field of the DECL_CONTEXT;
3. In the middle end, add a new utility routine is_flexible_array_member_p,
   which uses DECL_NOT_FLEXARRAY + array_at_struct_end_p to decide whether
   the array reference is a real flexible array member reference.
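
A toy model of how steps 2 and 3 could fit together is sketched below. It
does not use GCC's tree representation: struct toy_field, fe_mark and
toy_is_flexible_array_member_p are hypothetical names, and the meaning
assigned to each strictness level (0: any trailing array; 1: [], [0], [1];
2: [], [0]; 3: [] only) is an assumption made for illustration, since the
levels are not spelled out in this thread.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a FIELD_DECL of array type.  */
struct toy_field
{
  int declared_elems;   /* -1 for [], otherwise the declared bound   */
  bool last_in_struct;  /* last field of its DECL_CONTEXT?           */
  bool not_flexarray;   /* models the proposed DECL_NOT_FLEXARRAY    */
};

/* Front-end step: record early whether the field can never be treated
   as a flexible array member at the given strictness LEVEL.  */
static void
fe_mark (struct toy_field *f, int level)
{
  bool may_flex;

  assert (level >= 0);
  if (!f->last_in_struct)
    may_flex = false;
  else
    switch (level)
      {
      case 0:  may_flex = true; break;
      case 1:  may_flex = f->declared_elems <= 1; break;
      case 2:  may_flex = f->declared_elems <= 0; break;
      default: may_flex = f->declared_elems < 0; break;
      }
  f->not_flexarray = !may_flex;
}

/* Middle-end step: combine the recorded flag with the context check.
   REF_AT_STRUCT_END models what array_at_struct_end_p would report for
   one particular array reference.  */
static bool
toy_is_flexible_array_member_p (const struct toy_field *f,
                                bool ref_at_struct_end)
{
  return !f->not_flexarray && ref_at_struct_end;
}
```

The split mirrors the discussion: the flag only ever says "definitely
not", and the per-reference context (such as the struct UX case, where the
enclosing object has a member after the array) still has to be checked
later.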


The middle end is currently quite a mess: array_at_struct_end_p,
component_ref_size, and all the phases that use these routines need to be
updated, plus new test cases for each of the phases.


So, I still plan to separate the patch set into 2 parts:

  Part A: the above 1 + 2 + 3, and use these new utilities in
          tree-object-size.cc to resolve PR101836 first.
          Then the kernel can use __FORTIFY_SOURCE correctly;

  Part B: update all other phases with the new utilities + new test cases
          + resolving regressions.

Let me know if you have any comments or suggestions.

Thanks a lot for all your help.

Qing

> 
>   Jakub
> 



Re: [RFC] trailing_wide_ints with runtime variable lengths

2022-07-01 Thread Aldy Hernandez via Gcc-patches
FYI...if no one has anything to say, I'd like to formally post this for
review.

So OK for trunk?
Aldy

On Wed, Jun 29, 2022, 11:22 Aldy Hernandez  wrote:

> Currently global ranges are stored in SSA_NAME_RANGE_INFO as a pair of
> wide_int-like objects along with the nonzero bits.  We frequently lose
> precision when streaming out our higher resolution iranges.  The plan
> was always to store the full irange between passes.  However, as was
> originally discussed eons ago:
>
> https://gcc.gnu.org/pipermail/gcc-patches/2017-May/475139.html
>
> ...we need a memory efficient way of saving iranges, preferably using
> the trailing_wide_ints idiom.
>
> The problem with doing so is that trailing_wide_ints assume a
> compile-time specified number of elements.  For irange, we need to
> determine the size at run-time.
>
> One solution is to adapt trailing_wide_ints such that N is the maximum
> number of elements allowed, and allow setting the actual number at
> run-time (defaulting to N).  The attached patch does this, while
> requiring no changes to existing users.
>
> It uses a byte to store the number of elements in the
> trailing_wide_ints control word.  The control word is currently a
> 16-bit precision, an 8-bit max-length, and the rest is used for
> m_len[N].  On a 64-bit architecture, this allows for 5 elements in
> m_len without having to use an extra word.  With this patch, m_len[]
> would be smaller by one byte (4) before consuming the padding.  This
> shouldn't be a problem as the only users of trailing_wide_ints use N=2
> for NUM_POLY_INT_COEFFS in aarch64, and N=3 for range_info_def.
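
The byte arithmetic in the paragraph above can be checked with a simplified
layout. The structs below are stand-ins, not the real trailing_wide_ints
(which wraps each length in a one-member struct and stores HOST_WIDE_INT
values); the 8-byte alignment of the payload is an assumption that models a
64-bit host.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Header before the patch: 16-bit precision, 8-bit max length, and up
   to five per-element lengths fit in one 64-bit word.  */
struct ctl_before
{
  uint16_t precision;
  uint8_t max_len;
  uint8_t len[5];
  _Alignas (8) uint64_t val[1];
};

/* Header with the extra element count: only four lengths still fit.  */
struct ctl_after
{
  uint16_t precision;
  uint8_t max_len;
  uint8_t num_elements;
  uint8_t len[4];
  _Alignas (8) uint64_t val[1];
};

/* A fifth length now spills into the next word.  */
struct ctl_after_five
{
  uint16_t precision;
  uint8_t max_len;
  uint8_t num_elements;
  uint8_t len[5];
  _Alignas (8) uint64_t val[1];
};

static_assert (offsetof (struct ctl_before, val) == 8,
               "five lengths fit in the first word");
static_assert (offsetof (struct ctl_after, val) == 8,
               "four lengths plus the count fit in the first word");
static_assert (offsetof (struct ctl_after_five, val) == 16,
               "a fifth length costs one more word");
```

Since the existing users mentioned above need only N=2 and N=3, both stay
within the first word in this model.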
>
> For irange, my plan is to use one more word to fit a maximum of 12
> elements (the above 4 plus 8 more).  This would allow for 6 pairs of
> sub-ranges which would be more than adequate for our needs.  In
> previous tests we found that 99% of ranges fit within 3-4 pairs.  More
> precisely, this would allow for 5 pairs, plus the nonzero bits, plus a
> spare wide-int for future development.
>
> Ultimately this means that streaming an irange would consume one more
> word than what we currently do for range_info_def.  IMO this is a nice
> trade-off considering we started storing a slew of wide-ints directly
> ;-).
>
> I'm not above rolling an altogether different approach, but would
> prefer to use the existing trailing infrastructure since it's mostly
> what we need.
>
> Thoughts?
>
> p.s. Tested and benchmarked on x86-64 Linux.  There was no discernible
> performance change in our benchmark suite.
>
> gcc/ChangeLog:
>
> * wide-int.h (struct trailing_wide_ints): Add m_num_elements.
> (trailing_wide_ints::set_precision): Add num_elements argument.
> (trailing_wide_ints::extra_size): Same.
> ---
>  gcc/wide-int.h | 42 +-
>  1 file changed, 29 insertions(+), 13 deletions(-)
>
> diff --git a/gcc/wide-int.h b/gcc/wide-int.h
> index 8041b6104f9..f68ccf0a0c5 100644
> --- a/gcc/wide-int.h
> +++ b/gcc/wide-int.h
> @@ -1373,10 +1373,13 @@ namespace wi
>  : public int_traits  {};
>  }
>
> -/* An array of N wide_int-like objects that can be put at the end of
> -   a variable-sized structure.  Use extra_size to calculate how many
> -   bytes beyond the sizeof need to be allocated.  Use set_precision
> -   to initialize the structure.  */
> +/* A variable-lengthed array of wide_int-like objects that can be put
> +   at the end of a variable-sized structure.  The number of objects is
> +   at most N and can be set at runtime by using set_precision().
> +
> +   Use extra_size to calculate how many bytes beyond the
> +   sizeof need to be allocated.  Use set_precision to initialize the
> +   structure.  */
>  template 
>  struct GTY((user)) trailing_wide_ints
>  {
> @@ -1387,6 +1390,9 @@ private:
>/* The shared maximum length of each number.  */
>unsigned char m_max_len;
>
> +  /* The number of elements.  */
> +  unsigned char m_num_elements;
> +
>/* The current length of each number.
>   Avoid char array so the whole structure is not a typeless storage
>   that will, in turn, turn off TBAA on gimple, trees and RTL.  */
> @@ -1399,12 +1405,15 @@ private:
>  public:
>typedef WIDE_INT_REF_FOR (trailing_wide_int_storage) const_reference;
>
> -  void set_precision (unsigned int);
> +  void set_precision (unsigned int precision, unsigned int num_elements =
> N);
>unsigned int get_precision () const { return m_precision; }
> +  unsigned int num_elements () const { return m_num_elements; }
>trailing_wide_int operator [] (unsigned int);
>const_reference operator [] (unsigned int) const;
> -  static size_t extra_size (unsigned int);
> -  size_t extra_size () const { return extra_size (m_precision); }
> +  static size_t extra_size (unsigned int precision,
> +   unsigned int num_elements = N);
> +  size_t extra_size () const { return extra_size (m_precision,
> + m_n

Re: [PATCH 4/12] arm: Add testsuite library support for PACBTI target

2022-07-01 Thread Richard Earnshaw via Gcc-patches




On 01/07/2022 14:03, Richard Earnshaw via Gcc-patches wrote:



On 28/04/2022 10:40, Andrea Corallo via Gcc-patches wrote:

Add targeting-checking entities for PACBTI in testsuite
framework.

Pre-approved with the requested changes here
.

gcc/testsuite/ChangeLog:

* testsuite/lib/target-supports.exp:
(check_effective_target_arm_pacbti_hw): New.
* doc/sourcebuild.texi: Document arm_pacbti_hw.

Co-Authored-By: Tejas Belagod  


+proc check_effective_target_arm_pacbti_hw {} {
+    return [check_runtime arm_pacbti_hw_available {
+    __attribute__ ((naked)) int
+    main (void)
+    {
+  asm ("pac r12, lr, sp");

So the armv8-m Arm ARM says that this instruction is in the NOP space 
and that it is undefined if we aren't armv8-m.main or higher.


+  asm ("mov r0, #0");
+  asm ("autg r12, lr, sp");

This isn't in the nop space, but the Arm ARM says it is unpredictable if 
the extension isn't present.  Unfortunately, that means this isn't a 
particularly reliable way of detecting that the PACBTI feature is present.


However, I can't think offhand of a more reliable way of testing this 
since reading the feature register ID_ISAR5 is not possible when in 
unprivileged mode.


So I think we'll have to live with this.

+  asm ("bx lr");
+    }
+    } ""]

OK.



Or perhaps not. The test does not try to add the right options to enable 
PAC/BTI if those aren't in the default selection for the current 
testsuite run.


Perhaps we also need some additional tests to work out what architecture 
options to add (if any) to ensure the test will at least assemble.



R.

R.


Re: Mips: Fix kernel_stat structure size

2022-07-01 Thread Dimitrije Milosevic
Thanks Xi. Forgive me as I'm not that familiar with the coding standards 
when submitting patches for a review.
Here is the updated version of the patch.

Fix kernel_stat structure size for non-Android 32-bit Mips.
LLVM currently has this value for the kernel_stat structure size,
as per compiler-rt/lib/sanitizer_common/sanitizer_platform_limits_posix.h.
This also resolves one of the build issues for non-Android 32-bit Mips.
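
For readers unfamiliar with the macro used in the hunk below, here is a
simplified stand-in for FIRST_32_SECOND_64. libsanitizer keys the real
macro on SANITIZER_WORDSIZE; this sketch assumes the pointer width is an
adequate proxy, and the toy_* names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Pick the first argument on a 32-bit word size, the second on 64-bit.  */
#if UINTPTR_MAX > 0xffffffffu
# define FIRST_32_SECOND_64(a, b) (b)
#else
# define FIRST_32_SECOND_64(a, b) (a)
#endif

static_assert (FIRST_32_SECOND_64 (1, 2) == (sizeof (void *) == 8 ? 2 : 1),
               "selection follows the pointer width");

/* Values from the patch: only the 32-bit alternative changes.  */
static const unsigned toy_kernel_stat_sz_old = FIRST_32_SECOND_64 (144, 216);
static const unsigned toy_kernel_stat_sz_new = FIRST_32_SECOND_64 (160, 216);
```

On a 64-bit build both constants evaluate to 216, so the change is visible
only where the 32-bit alternative is selected.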

libsanitizer/ChangeLog:

* sanitizer_common/sanitizer_platform_limits_posix.h: Fix
kernel_stat structure size for non-Android 32-bit Mips.

---

 libsanitizer/sanitizer_common/sanitizer_platform_limits_posix.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/libsanitizer/sanitizer_common/sanitizer_platform_limits_posix.h 
b/libsanitizer/sanitizer_common/sanitizer_platform_limits_posix.h
index 89772a7e5c0..62a99035db3 100644
--- a/libsanitizer/sanitizer_common/sanitizer_platform_limits_posix.h
+++ b/libsanitizer/sanitizer_common/sanitizer_platform_limits_posix.h
@@ -83,7 +83,7 @@ const unsigned struct_kernel_stat64_sz = 104;
 #elif defined(__mips__)
 const unsigned struct_kernel_stat_sz = SANITIZER_ANDROID
? FIRST_32_SECOND_64(104, 128)
-   : FIRST_32_SECOND_64(144, 216);
+   : FIRST_32_SECOND_64(160, 216);
 const unsigned struct_kernel_stat64_sz = 104;
 #elif defined(__s390__) && !defined(__s390x__)
 const unsigned struct_kernel_stat_sz = 64;

---

Re: [PATCH 5/12] arm: Implement target feature macros for PACBTI

2022-07-01 Thread Richard Earnshaw via Gcc-patches




On 28/04/2022 10:42, Andrea Corallo via Gcc-patches wrote:

This patch implements target feature macros when PACBTI is enabled
through the -march option or -mbranch-protection.  The target feature
macros __ARM_FEATURE_PAC_DEFAULT and __ARM_FEATURE_BTI_DEFAULT are
specified in ARM ACLE

__ARM_FEATURE_PAUTH and __ARM_FEATURE_BTI are specified in the
pull-request .

Approved here
.

gcc/ChangeLog:

* config/arm/arm-c.c (arm_cpu_builtins): Define
__ARM_FEATURE_BTI_DEFAULT, __ARM_FEATURE_PAC_DEFAULT,
__ARM_FEATURE_PAUTH and __ARM_FEATURE_BTI.


This bit is OK.



gcc/testsuite/ChangeLog:

* gcc.target/arm/acle/pacbti-m-predef-2.c: New test.
* gcc.target/arm/acle/pacbti-m-predef-4.c: New test.
* gcc.target/arm/acle/pacbti-m-predef-5.c: New test.



These are all execution tests.  I think we also need some compile-only 
tests so that we get better coverage when the target does not directly 
support PACBTI.


We also need some tests for the defines when targeting armv8-m.main and 
some tests for checking __ARM_FEATURE_BTI and __ARM_FEATURE_PAC (the 
tests here check only the '..._DEFAULT' macros).



Co-Authored-By: Tejas Belagod  



R.


Re: [PATCH] Mips: Resolve build issues for the n32 ABI

2022-07-01 Thread Dimitrije Milosevic
Thanks Xi. Forgive me as I'm not that familiar with the coding standards
when submitting patches for a review.
Here is the updated version of the patch.

Building ASAN for the n32 MIPS ABI currently fails, for a few reasons:
- defined(__mips64), which is set solely based on the architecture type
  (32-bit/64-bit), was still used in some places. Therefore,
  defined(__mips64) is swapped with SANITIZER_MIPS64, which takes the ABI
  into account as well: defined(__mips64) && _MIPS_SIM == _ABI64.
- The n32 ABI still uses 64-bit *Linux* system calls, even though the word
  size is 32 bits.
- After the transition to canonical system calls
  (https://reviews.llvm.org/D124212), the n32 ABI still didn't use them,
  even though they are supported, as per
  https://github.com/torvalds/linux/blob/master/arch/mips/kernel/syscalls/syscall_n32.tbl.
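
The distinction drawn in the list above can be sketched with the
compiler's predefined macros. The TOY_* macros and toy_mips_abi_name are
hypothetical names for this illustration only; __mips__, __mips64,
_MIPS_SIM, _ABIN32 and _ABI64 are the macros the patch itself tests.

```c
#include <assert.h>

#if defined(__mips__)
# if defined(__mips64) && _MIPS_SIM == _ABI64
   /* n64: 64-bit architecture and 64-bit ABI.  */
#  define TOY_ABI_NAME "mips n64"
#  define TOY_MIPS64 1
#  define TOY_MIPS_64BIT_SYSCALLS 1
# elif _MIPS_SIM == _ABIN32
   /* n32: __mips64 is defined, but pointers are 32 bits wide and the
      64-bit Linux syscall interface is still used.  */
#  define TOY_ABI_NAME "mips n32"
#  define TOY_MIPS64 0
#  define TOY_MIPS_64BIT_SYSCALLS 1
# else
   /* Everything else is treated as o32 in this sketch.  */
#  define TOY_ABI_NAME "mips o32"
#  define TOY_MIPS64 0
#  define TOY_MIPS_64BIT_SYSCALLS 0
# endif
#else
# define TOY_ABI_NAME "not mips"
# define TOY_MIPS64 0
# define TOY_MIPS_64BIT_SYSCALLS 0
#endif

/* The n64 classification always implies the 64-bit syscall interface.  */
static_assert (TOY_MIPS64 == 0 || TOY_MIPS_64BIT_SYSCALLS == 1,
               "n64 implies 64-bit syscalls");

static const char *
toy_mips_abi_name (void)
{
  return TOY_ABI_NAME;
}
```

Testing defined(__mips64) alone would put n32 in the first branch, which is
the misclassification the patch removes.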

See https://reviews.llvm.org/D127098.

libsanitizer/ChangeLog:

* sanitizer_common/sanitizer_linux.cpp (defined): Resolve
ASAN build issues for the Mips n32 ABI.
* sanitizer_common/sanitizer_platform.h (defined): Likewise.

---

 libsanitizer/sanitizer_common/sanitizer_linux.cpp  | 17 ++---
 libsanitizer/sanitizer_common/sanitizer_platform.h |  2 +-
 2 files changed, 11 insertions(+), 8 deletions(-)

diff --git a/libsanitizer/sanitizer_common/sanitizer_linux.cpp 
b/libsanitizer/sanitizer_common/sanitizer_linux.cpp
index e2c32d679ad..5ba033492e7 100644
--- a/libsanitizer/sanitizer_common/sanitizer_linux.cpp
+++ b/libsanitizer/sanitizer_common/sanitizer_linux.cpp
@@ -34,7 +34,7 @@
 // format. Struct kernel_stat is defined as 'struct stat' in asm/stat.h. To
 // access stat from asm/stat.h, without conflicting with definition in
 // sys/stat.h, we use this trick.
-#if defined(__mips64)
+#if SANITIZER_MIPS64
 #include 
 #include 
 #define stat kernel_stat
@@ -124,8 +124,9 @@ const int FUTEX_WAKE_PRIVATE = FUTEX_WAKE | 
FUTEX_PRIVATE_FLAG;
 // Are we using 32-bit or 64-bit Linux syscalls?
 // x32 (which defines __x86_64__) has SANITIZER_WORDSIZE == 32
 // but it still needs to use 64-bit syscalls.
-#if SANITIZER_LINUX && (defined(__x86_64__) || defined(__powerpc64__) ||   
\
-SANITIZER_WORDSIZE == 64)
+#if SANITIZER_LINUX && (defined(__x86_64__) || defined(__powerpc64__) || \
+SANITIZER_WORDSIZE == 64 ||  \
+(defined(__mips__) && _MIPS_SIM == _ABIN32))
 # define SANITIZER_LINUX_USES_64BIT_SYSCALLS 1
 #else
 # define SANITIZER_LINUX_USES_64BIT_SYSCALLS 0
@@ -289,7 +290,7 @@ static void stat64_to_stat(struct stat64 *in, struct stat *out) {
 }
 #endif
 
-#if defined(__mips64)
+#if SANITIZER_MIPS64
 // Undefine compatibility macros from 
 // so that they would not clash with the kernel_stat
 // st_[a|m|c]time fields
@@ -343,7 +344,8 @@ uptr internal_stat(const char *path, void *buf) {
 #if SANITIZER_FREEBSD
   return internal_syscall(SYSCALL(fstatat), AT_FDCWD, (uptr)path, (uptr)buf, 0);
 #elif SANITIZER_LINUX
-#  if SANITIZER_WORDSIZE == 64 || SANITIZER_X32
+#  if SANITIZER_WORDSIZE == 64 || SANITIZER_X32 || \
+  (defined(__mips__) && _MIPS_SIM == _ABIN32)
   return internal_syscall(SYSCALL(newfstatat), AT_FDCWD, (uptr)path, (uptr)buf,
   0);
 #  else
@@ -366,7 +368,8 @@ uptr internal_lstat(const char *path, void *buf) {
   return internal_syscall(SYSCALL(fstatat), AT_FDCWD, (uptr)path, (uptr)buf,
   AT_SYMLINK_NOFOLLOW);
 #elif SANITIZER_LINUX
-#  if defined(_LP64) || SANITIZER_X32
+#  if defined(_LP64) || SANITIZER_X32 || \
+  (defined(__mips__) && _MIPS_SIM == _ABIN32)
   return internal_syscall(SYSCALL(newfstatat), AT_FDCWD, (uptr)path, (uptr)buf,
   AT_SYMLINK_NOFOLLOW);
 #  else
@@ -1053,7 +1056,7 @@ uptr GetMaxVirtualAddress() {
   return (1ULL << (MostSignificantSetBitIndex(GET_CURRENT_FRAME()) + 1)) - 1;
 #elif SANITIZER_RISCV64
   return (1ULL << 38) - 1;
-# elif defined(__mips64)
+# elif SANITIZER_MIPS64
   return (1ULL << 40) - 1;  // 0x000000ffffffffffUL;
 # elif defined(__s390x__)
   return (1ULL << 53) - 1;  // 0x001fffffffffffffUL;
diff --git a/libsanitizer/sanitizer_common/sanitizer_platform.h b/libsanitizer/sanitizer_common/sanitizer_platform.h
index 8fe0d831431..8bd9a327623 100644
--- a/libsanitizer/sanitizer_common/sanitizer_platform.h
+++ b/libsanitizer/sanitizer_common/sanitizer_platform.h
@@ -159,7 +159,7 @@
 
 #if defined(__mips__)
 #  define SANITIZER_MIPS 1
-#  if defined(__mips64)
+#  if defined(__mips64) && _MIPS_SIM == _ABI64
 #define SANITIZER_MIPS32 0
 #define SANITIZER_MIPS64 1
 #  else

---

Re: [Patch][v5] OpenMP: Move omp requires checks to libgomp

2022-07-01 Thread Jakub Jelinek via Gcc-patches
On Fri, Jul 01, 2022 at 03:06:05PM +0200, Tobias Burnus wrote:
> --- a/gcc/fortran/parse.cc
> +++ b/gcc/fortran/parse.cc
> @@ -1168,7 +1168,8 @@ decode_omp_directive (void)
>  }
>switch (ret)
>  {
> -case ST_OMP_DECLARE_TARGET:
> +/* Set omp_target_seen; exclude ST_OMP_DECLARE_TARGET.
> +   FIXME: Get clarification, cf. OpenMP Spec Issue #3240.  */
>  case ST_OMP_TARGET:
>  case ST_OMP_TARGET_DATA:
>  case ST_OMP_TARGET_ENTER_DATA:
> @@ -6879,11 +6880,14 @@ done:
>  
>/* Fixup for external procedures and resolve 'omp requires'.  */
>int omp_requires;
> +  bool omp_target_seen;
>omp_requires = 0;
> +  omp_target_seen = false;
>for (gfc_current_ns = gfc_global_ns_list; gfc_current_ns;
> gfc_current_ns = gfc_current_ns->sibling)
>  {
>omp_requires |= gfc_current_ns->omp_requires;
> +  omp_target_seen |= gfc_current_ns->omp_target_seen;
>gfc_check_externals (gfc_current_ns);
>  }
>for (gfc_current_ns = gfc_global_ns_list; gfc_current_ns;
> @@ -6908,6 +6912,22 @@ done:
>break;
>  }
>  
> +  if (omp_target_seen)
> +omp_requires_mask = (enum omp_requires) (omp_requires_mask
> +  | OMP_REQUIRES_TARGET_USED);
> +  if (omp_requires & OMP_REQ_REVERSE_OFFLOAD)
> +omp_requires_mask = (enum omp_requires) (omp_requires_mask
> +  | OMP_REQUIRES_REVERSE_OFFLOAD);
> +  if (omp_requires & OMP_REQ_UNIFIED_ADDRESS)
> +omp_requires_mask = (enum omp_requires) (omp_requires_mask
> +  | OMP_REQUIRES_UNIFIED_ADDRESS);
> +  if (omp_requires & OMP_REQ_UNIFIED_SHARED_MEMORY)
> +omp_requires_mask
> +   = (enum omp_requires) (omp_requires_mask
> +  | OMP_REQUIRES_UNIFIED_SHARED_MEMORY);
> +  if (omp_requires & OMP_REQ_DYNAMIC_ALLOCATORS)
> +omp_requires_mask = (enum omp_requires) (omp_requires_mask
> +  | OMP_REQUIRES_DYNAMIC_ALLOCATORS);
>/* Do the parse tree dump.  */
>gfc_current_ns = flag_dump_fortran_original ? gfc_global_ns_list : NULL;

Will Fortran diagnose:
subroutine foo
!$omp requires unified_shared_memory
!$omp target
!$omp end target
end subroutine foo
subroutine bar
!$omp requires reverse_offload
!$omp target
!$omp end target
end subroutine bar

or just merge it from the different namespaces?
This is something that can be handled separately if it isn't resolved
and might need clarification from omp-lang.

> @@ -1764,6 +1781,20 @@ input_symtab (void)
>  }
>  }
>  
> +static void
> +omp_requires_to_name (char *buf, size_t size, unsigned int requires_mask)
> +{
> +  char *end = buf + size, *p = buf;
> +  if (requires_mask & GOMP_REQUIRES_UNIFIED_ADDRESS)
> +p += snprintf (p, end - p, "unified_address");
> +  if (requires_mask & GOMP_REQUIRES_UNIFIED_SHARED_MEMORY)
> +p += snprintf (p, end - p, "%sunified_shared_memory",
> +(p == buf ? "" : ", "));
> +  if (requires_mask & GOMP_REQUIRES_REVERSE_OFFLOAD)
> +p += snprintf (p, end - p, "%sreverse_offload",
> +(p == buf ? "" : ", "));

So, what does this print if requires_mask is 0 (or just the target used bit
set but not unified_address, unified_shared_memory nor reverse_offload)?
Say in case of:
a.c:
#pragma omp requires unified_address
void foo (void) {
#pragma omp target
;
}
b.c:
void bar (void) {
#pragma omp target
;
}
gcc -fopenmp -shared -o a.so a.c b.c
?
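
The question is what ends up in the buffer when none of the three tested bits is set: in the quoted helper no snprintf call runs at all, so the buffer is never written. One possible way to make that case well defined is sketched below in C. The bit values and the "none" placeholder are assumptions for illustration only, not the real GOMP_REQUIRES_* constants or the wording a final patch would use.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical bit values; the real GOMP_REQUIRES_* constants may differ. */
enum {
  REQ_UNIFIED_ADDRESS = 0x10,
  REQ_UNIFIED_SHARED_MEMORY = 0x20,
  REQ_REVERSE_OFFLOAD = 0x80,
  REQ_TARGET_USED = 0x200
};

/* Append NAME to the list in BUF, writing ", " first if BUF is non-empty. */
static void append_name(char *buf, size_t size, const char *name) {
  size_t used = strlen(buf);
  assert(used < size);
  snprintf(buf + used, size - used, "%s%s", used ? ", " : "", name);
}

/* Build a comma-separated list of the clauses set in MASK.  BUF is always
   initialised, and an empty list is replaced by a placeholder string. */
static void requires_to_name(char *buf, size_t size, unsigned mask) {
  assert(buf != NULL && size > 0);
  buf[0] = '\0';
  if (mask & REQ_UNIFIED_ADDRESS)
    append_name(buf, size, "unified_address");
  if (mask & REQ_UNIFIED_SHARED_MEMORY)
    append_name(buf, size, "unified_shared_memory");
  if (mask & REQ_REVERSE_OFFLOAD)
    append_name(buf, size, "reverse_offload");
  if (buf[0] == '\0')
    snprintf(buf, size, "none"); /* assumed placeholder text */
}
```

With this shape, the b.c case above (target used, no requires clause) yields a defined string rather than whatever happened to be on the stack.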

> @@ -1810,6 +1847,54 @@ input_offload_tables (bool do_force_output)
>may be no refs to var_decl in offload LTO mode.  */
> if (do_force_output)
>   varpool_node::get (var_decl)->force_output = 1;
> +   tmp_decl = var_decl;
> + }
> +   else if (tag == LTO_symtab_edge)
> + {
> +   static bool error_emitted = false;
> +   HOST_WIDE_INT val = streamer_read_hwi (ib);
> +
> +   if (omp_requires_mask == 0)
> + {
> +   omp_requires_mask = (omp_requires) val;
> +   requires_decl = tmp_decl;
> +   requires_fn = file_data->file_name;

And similarly here: if some device construct is seen but no requires
directive is, I'm not sure whether in this version val would be 0 or
something with the TARGET_USED bit set.  In the latter case, the only problem
is what gets printed for no requires or for just atomic-related requires.  In
the former case, due to the == 0 check, mixing 0 followed by non-zero would
be ignored, but mixing non-zero followed by 0 wouldn't be.
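
The order dependence can be reproduced with a small C model of the quoted logic: the first non-zero value is remembered, and every later value is compared against it. The function name and the example bit value are hypothetical; only the control flow mirrors the hunk.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Returns true if merging VALS in order would report a mismatch.
   A remembered mask of 0 is treated as "nothing seen yet". */
static bool merge_reports_mismatch(const unsigned *vals, size_t n) {
  assert(vals != NULL || n == 0);
  unsigned mask = 0;
  for (size_t i = 0; i < n; i++) {
    if (mask == 0)
      mask = vals[i];   /* a leading 0 is silently overwritten */
    else if (mask != vals[i])
      return true;      /* a trailing 0 is diagnosed */
  }
  return false;
}
```

So {0, 0x10} merges quietly while {0x10, 0} reports a mismatch, although both describe the same pair of translation units.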

> + }
> +   else if (omp_requires_mask != val && !error_emitted)
> + {
> +   char buf[64], buf2[64];

Perhaps cleaner would be to size the buffers as
sizeof ("unified_address,unified_shared_memory,reverse_offload")
64 is more, but just a wild guess and if further clauses are added later,
it might be too small.
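
A minimal C sketch of that sizing idea follows. The macro and function names are hypothetical. One assumption is made loudly here: the literal uses the ", " separator from the quoted snprintf calls, because the literal and the formatting code have to agree or the buffer ends up a few bytes short.

```c
#include <assert.h>
#include <string.h>

/* Longest possible list, using the ", " separator from the quoted hunk. */
#define ALL_REQUIRES_NAMES \
  "unified_address, unified_shared_memory, reverse_offload"

/* sizeof on a string literal includes the terminating NUL, so this is
   exactly the space needed for the longest list. */
enum { REQUIRES_BUF_SIZE = sizeof(ALL_REQUIRES_NAMES) };

static size_t requires_buf_size(void) {
  assert(REQUIRES_BUF_SIZE == strlen(ALL_REQUIRES_NAMES) + 1);
  return REQUIRES_BUF_SIZE;
}
```

If another clause name is added later, only the literal needs to grow and the buffer size follows automatically.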

> +(p == buf ? "" : ", "));
> +  if (

Re: [PATCH 6/12] arm: Add pointer authentication for stack-unwinding runtime

2022-07-01 Thread Richard Earnshaw via Gcc-patches




On 28/04/2022 10:44, Andrea Corallo via Gcc-patches wrote:

This patch adds authentication for when the stack is unwound when an
exception is taken.  All the changes here are done to the runtime code
in libgcc's unwinder code for Arm target. All the changes are guarded
under defined (__ARM_FEATURE_PAC_DEFAULT) and activated only if the
+pacbti feature is switched on for the architecture. This means that
switching on the target feature via -march or -mcpu is sufficient and
-mbranch-protection need not be enabled. This ensures that the
unwinder is authenticated only if the PACBTI instructions are
available in the non-NOP space as it uses AUTG.  Just generating
PAC/AUT instructions using -mbranch-protection will not enable
authentication on the unwinder.

Pre-approved with the requested changes here
.

gcc/ChangeLog:

* ginclude/unwind-arm-common.h (_Unwind_VRS_RegClass): Introduce
new pseudo register class _UVRSC_PAC.
* libgcc/config/arm/pr-support.c (__gnu_unwind_execute): Decode
exception opcode (0xb4) for saving RA_AUTH_CODE and authenticate
with AUTG if found.
* libgcc/config/arm/unwind-arm.c (struct pseudo_regs): New.
(phase1_vrs): Introduce new field to store pseudo-reg state.
(phase2_vrs): Likewise.
(_Unwind_VRS_Get): Load pseudo register state from virtual reg set.
(_Unwind_VRS_Set): Store pseudo register state to virtual reg set.
(_Unwind_VRS_Pop): Load pseudo register value from stack into VRS.

Co-Authored-By: Tejas Belagod  



Ok.

R.


Re: [PATCH 7/12] arm: Emit build attributes for PACBTI target feature

2022-07-01 Thread Richard Earnshaw via Gcc-patches




On 28/04/2022 10:45, Andrea Corallo via Gcc-patches wrote:

This patch emits assembler directives for PACBTI build attributes as
defined by the ABI.



gcc/ChangeLog:

* config/arm/arm.c (arm_file_start): Emit EABI attributes for
Tag_PAC_extension, Tag_BTI_extension, Tag_BTI_use, Tag_PACRET_use.


This bit is OK.



gcc/testsuite/ChangeLog:

* gcc.target/arm/acle/pacbti-m-predef-1.c: New test.
* gcc.target/arm/acle/pacbti-m-predef-3: Likewise.
* gcc.target/arm/acle/pacbti-m-predef-6.c: Likewise.
* gcc.target/arm/acle/pacbti-m-predef-7.c: Likewise.


These tests contain directives like:

+/* { dg-additional-options " -mbranch-protection=pac-ret+bti --save-temps" } */


But they don't check that the architecture permits this (it has to be 
armv8-m.main or later).




Co-Authored-By: Tejas Belagod  



R.


[pushed] c++: dependent generic lambda template-id [PR106024]

2022-07-01 Thread Jason Merrill via Gcc-patches
We were wrongly looking up the generic lambda op() in a dependent scope, and
then trying to look up its instantiation at substitution time, but lambdas
aren't instantiated, so we crashed.  The fix is to not look into dependent
class scopes.

But this created trouble with wrongly trying to use a template from the
enclosing scope when we aren't actually looking at a template-argument-list,
in template/lookup18.C, so let's avoid that.

Tested x86_64-pc-linux-gnu, applying to trunk and 12.

PR c++/106024

gcc/cp/ChangeLog:

* parser.cc (missing_template_diag): Factor out...
(cp_parser_id_expression): ...from here.
(cp_parser_lookup_name): Don't look in dependent object_type.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/lambda-generic10.C: New test.
---
 gcc/cp/parser.cc  | 23 ++-
 gcc/testsuite/g++.dg/cpp2a/lambda-generic10.C | 14 +++
 2 files changed, 36 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp2a/lambda-generic10.C

diff --git a/gcc/cp/parser.cc b/gcc/cp/parser.cc
index da2f370cdca..357fde557c7 100644
--- a/gcc/cp/parser.cc
+++ b/gcc/cp/parser.cc
@@ -30676,9 +30676,11 @@ cp_parser_lookup_name (cp_parser *parser, tree name,
 }
   else if (object_type)
 {
+  bool dep = dependent_scope_p (object_type);
+
   /* Look up the name in the scope of the OBJECT_TYPE, unless the
 OBJECT_TYPE is not a class.  */
-  if (CLASS_TYPE_P (object_type))
+  if (!dep && CLASS_TYPE_P (object_type))
/* If the OBJECT_TYPE is a template specialization, it may
   be instantiated during name lookup.  In that case, errors
   may be issued.  Even if we rollback the current tentative
@@ -30702,6 +30704,25 @@ cp_parser_lookup_name (cp_parser *parser, tree name,
: is_template ? LOOK_want::TYPE
: prefer_type_arg (tag_type));
 
+  /* If we did unqualified lookup of a dependent member-qualified name and
+found something, do we want to use it?  P1787 clarified that we need
+to look in the object scope first even if it's dependent, but for now
+let's still use it in some cases.
+FIXME remember unqualified lookup result to use if member lookup fails
+at instantiation time.  */
+  if (decl && dep && is_template)
+   {
+ saved_token_sentinel toks (parser->lexer, STS_ROLLBACK);
+ /* Only use the unqualified class template lookup if we're actually
+looking at a template arg list.  */
+ if (!cp_parser_skip_entire_template_parameter_list (parser))
+   decl = NULL_TREE;
+ /* And only use the unqualified lookup if we're looking at ::.  */
+ if (decl
+ && !cp_lexer_next_token_is (parser->lexer, CPP_SCOPE))
+   decl = NULL_TREE;
+   }
+
   /* If we know we're looking for a type (e.g. A in p->A::x),
 mock up a typename.  */
   if (!decl && object_type && tag_type != none_type
diff --git a/gcc/testsuite/g++.dg/cpp2a/lambda-generic10.C b/gcc/testsuite/g++.dg/cpp2a/lambda-generic10.C
new file mode 100644
index 000..47a87bbfbd7
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp2a/lambda-generic10.C
@@ -0,0 +1,14 @@
+// PR c++/106024
+// { dg-do compile { target c++20 } }
+
+void sink(...);
+template  void f()
+{
+  sink ([]  (int...) { return 1; }
+.operator()(args...)...); // { dg-warning "-Wmissing-template-keyword" }
+} // { dg-prune-output {expected '\)'} }
+
+int main()
+{
+  f<1,2,3>();
+}

base-commit: 63abe04999283582b258adf60da6c19d541ebc68
-- 
2.27.0



[pushed] c++: add fixup to missing .template warning

2022-07-01 Thread Jason Merrill via Gcc-patches
I experimented with giving this diagnostic in another place, which didn't
work out, but we can still benefit from adding the fixup.

Tested x86_64-pc-linux-gnu, applying to trunk.

gcc/cp/ChangeLog:

* parser.cc (missing_template_diag): Split out...
(cp_parser_id_expression): ...from here.
---
 gcc/cp/parser.cc | 21 ++---
 1 file changed, 18 insertions(+), 3 deletions(-)

diff --git a/gcc/cp/parser.cc b/gcc/cp/parser.cc
index 357fde557c7..f6bc8db8581 100644
--- a/gcc/cp/parser.cc
+++ b/gcc/cp/parser.cc
@@ -6093,6 +6093,23 @@ cp_parser_primary_expression (cp_parser *parser,
   /*decltype*/false, idk);
 }
 
+/* Complain about missing template keyword when naming a dependent
+   member template.  */
+
+static void
+missing_template_diag (location_t loc, diagnostic_t diag_kind = DK_WARNING)
+{
+  if (warning_suppressed_at (loc, OPT_Wmissing_template_keyword))
+return;
+
+  gcc_rich_location richloc (loc);
+  richloc.add_fixit_insert_before ("template");
+  emit_diagnostic (diag_kind, &richloc, OPT_Wmissing_template_keyword,
+  "expected %qs keyword before dependent "
+  "template name", "template");
+  suppress_warning_at (loc, OPT_Wmissing_template_keyword);
+}
+
 /* Parse an id-expression.
 
id-expression:
@@ -6268,9 +6285,7 @@ cp_parser_id_expression (cp_parser *parser,
 operator.  */
  && (cp_lexer_peek_token (parser->lexer)->type
  <= CPP_LAST_PUNCTUATOR))
-   warning_at (token->location, OPT_Wmissing_template_keyword,
-   "expected %qs keyword before dependent "
-   "template name", "template");
+   missing_template_diag (token->location);
 }
 
   return id;

base-commit: 07ac550393d00fcadcee21b44abee6bb30c93949
-- 
2.27.0



Re: [PATCH 8/12 V2] arm: Introduce multilibs for PACBTI target feature

2022-07-01 Thread Richard Earnshaw via Gcc-patches




On 01/06/2022 13:32, Andrea Corallo via Gcc-patches wrote:

Hi all,

second iteration of the previous patch adding the following new
multilibs:

thumb/v8.1-m.main+pacbti/mbranch-protection/nofp
thumb/v8.1-m.main+pacbti+dp/mbranch-protection/soft
thumb/v8.1-m.main+pacbti+dp/mbranch-protection/hard
thumb/v8.1-m.main+pacbti+fp/mbranch-protection/soft
thumb/v8.1-m.main+pacbti+fp/mbranch-protection/hard
thumb/v8.1-m.main+pacbti+mve/mbranch-protection/hard

To trigger the following compiler flags:

-mthumb -march=armv8.1-m.main+pacbti -mbranch-protection=standard -mfloat-abi=soft
-mthumb -march=armv8.1-m.main+pacbti+fp -mbranch-protection=standard -mfloat-abi=softfp
-mthumb -march=armv8.1-m.main+pacbti+fp -mbranch-protection=standard -mfloat-abi=hard
-mthumb -march=armv8.1-m.main+pacbti+fp.dp -mbranch-protection=standard -mfloat-abi=softfp
-mthumb -march=armv8.1-m.main+pacbti+fp.dp -mbranch-protection=standard -mfloat-abi=hard
-mthumb -march=armv8.1-m.main+pacbti+mve -mbranch-protection=standard -mfloat-abi=hard

gcc/ChangeLog:

* config/arm/t-rmprofile: Add multilib rules for march +pacbti
   and mbranch-protection.



+# Map all mbranch-protection values other than 'none' to 'standard'.
+MULTILIB_MATCHES   += mbranch-protection?standard=mbranch-protection?bti
+MULTILIB_MATCHES	+= mbranch-protection?standard=mbranch-protection?pac-ret
+MULTILIB_MATCHES	+= mbranch-protection?standard=mbranch-protection?pac-ret+leaf
+MULTILIB_MATCHES	+= mbranch-protection?standard=mbranch-protection?pac-ret+bti
+MULTILIB_MATCHES	+= mbranch-protection?standard=mbranch-protection?pac-ret+leaf+bti
+MULTILIB_MATCHES	+= mbranch-protection?standard=mbranch-protection?bti+pac-ret
+MULTILIB_MATCHES	+= mbranch-protection?standard=mbranch-protection?bti+pac-ret+leaf

+

The documentation mentions -mbranch-protection=standard+leaf, so you're 
missing a mapping for that.



OK with that change.

R.


[pushed] c++: tweak resolve_args change

2022-07-01 Thread Jason Merrill via Gcc-patches
I don't know why I used tf_error instead of complain here.

PR c++/105779

gcc/cp/ChangeLog:

* call.cc (resolve_args): Use complain.
---
 gcc/cp/call.cc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/cp/call.cc b/gcc/cp/call.cc
index f1dd8377628..fc98552fda2 100644
--- a/gcc/cp/call.cc
+++ b/gcc/cp/call.cc
@@ -4675,7 +4675,7 @@ resolve_args (vec *args, tsubst_flags_t complain)
 
   /* Force auto deduction now.  Omit tf_warning to avoid redundant
 deprecated warning on deprecated-14.C.  */
-  if (!mark_single_function (arg, tf_error))
+  if (!mark_single_function (arg, complain & ~tf_warning))
return NULL;
 }
   return args;

base-commit: 288c6cce0277e03e08b324283b6a015a70066bb7
-- 
2.27.0



Re: [PATCH 8/12 V2] arm: Introduce multilibs for PACBTI target feature

2022-07-01 Thread Richard Earnshaw via Gcc-patches




On 01/07/2022 15:54, Richard Earnshaw via Gcc-patches wrote:



On 01/06/2022 13:32, Andrea Corallo via Gcc-patches wrote:

Hi all,

second iteration of the previous patch adding the following new
multilibs:

thumb/v8.1-m.main+pacbti/mbranch-protection/nofp
thumb/v8.1-m.main+pacbti+dp/mbranch-protection/soft
thumb/v8.1-m.main+pacbti+dp/mbranch-protection/hard
thumb/v8.1-m.main+pacbti+fp/mbranch-protection/soft
thumb/v8.1-m.main+pacbti+fp/mbranch-protection/hard
thumb/v8.1-m.main+pacbti+mve/mbranch-protection/hard

To trigger the following compiler flags:

-mthumb -march=armv8.1-m.main+pacbti -mbranch-protection=standard -mfloat-abi=soft
-mthumb -march=armv8.1-m.main+pacbti+fp -mbranch-protection=standard -mfloat-abi=softfp
-mthumb -march=armv8.1-m.main+pacbti+fp -mbranch-protection=standard -mfloat-abi=hard
-mthumb -march=armv8.1-m.main+pacbti+fp.dp -mbranch-protection=standard -mfloat-abi=softfp
-mthumb -march=armv8.1-m.main+pacbti+fp.dp -mbranch-protection=standard -mfloat-abi=hard
-mthumb -march=armv8.1-m.main+pacbti+mve -mbranch-protection=standard -mfloat-abi=hard


gcc/ChangeLog:

* config/arm/t-rmprofile: Add multilib rules for march +pacbti
   and mbranch-protection.



+# Map all mbranch-protection values other than 'none' to 'standard'.
+MULTILIB_MATCHES    += mbranch-protection?standard=mbranch-protection?bti
+MULTILIB_MATCHES	+= mbranch-protection?standard=mbranch-protection?pac-ret
+MULTILIB_MATCHES	+= mbranch-protection?standard=mbranch-protection?pac-ret+leaf
+MULTILIB_MATCHES	+= mbranch-protection?standard=mbranch-protection?pac-ret+bti
+MULTILIB_MATCHES	+= mbranch-protection?standard=mbranch-protection?pac-ret+leaf+bti
+MULTILIB_MATCHES	+= mbranch-protection?standard=mbranch-protection?bti+pac-ret
+MULTILIB_MATCHES	+= mbranch-protection?standard=mbranch-protection?bti+pac-ret+leaf

+

The documentation mentions -mbranch-protection=standard+leaf, so you're 
missing a mapping for that.



OK with that change.

R.


Oh, and please add some tests to gcc/testsuite/gcc.target/arm/multilib.exp

R.


Re: [RFC] trailing_wide_ints with runtime variable lengths

2022-07-01 Thread Jakub Jelinek via Gcc-patches
On Fri, Jul 01, 2022 at 04:12:55PM +0200, Aldy Hernandez wrote:
> > --- a/gcc/wide-int.h
> > +++ b/gcc/wide-int.h
> > @@ -1373,10 +1373,13 @@ namespace wi
> >  : public int_traits  {};
> >  }
> >
> > -/* An array of N wide_int-like objects that can be put at the end of
> > -   a variable-sized structure.  Use extra_size to calculate how many
> > -   bytes beyond the sizeof need to be allocated.  Use set_precision
> > -   to initialize the structure.  */
> > +/* A variable-lengthed array of wide_int-like objects that can be put
> > +   at the end of a variable-sized structure.  The number of objects is
> > +   at most N and can be set at runtime by using set_precision().
> > +
> > +   Use extra_size to calculate how many bytes beyond the
> > +   sizeof need to be allocated.  Use set_precision to initialize the
> > +   structure.  */
> >  template 
> >  struct GTY((user)) trailing_wide_ints
> >  {
> > @@ -1387,6 +1390,9 @@ private:
> >/* The shared maximum length of each number.  */
> >unsigned char m_max_len;
> >
> > +  /* The number of elements.  */
> > +  unsigned char m_num_elements;

IMNSHO you certainly don't want to change the existing trailing_wide_ints
like this; you don't want to unnecessarily grow existing
trailing_wide_ints users (e.g. const_poly_int_def).

My brief understanding of wide-int.h is that in some cases stuff like this
is implied from template parameters or exact class instantiation and in
other cases it is present explicitly and class inheritance is used to hide
that stuff nicely.

So, you are looking for something like trailing_wide_ints but where that
N is actually a runtime value?  Then e.g. the
  struct {unsigned char len;} m_len[N];
member can't work properly either, because it isn't constant size.
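
To illustrate the layout problem: once the element count is a runtime value, neither the values nor the per-element lengths can be ordinary fixed-size members, so both have to live in trailing storage. The following is only a C sketch with hypothetical names and a deliberately simple layout (values first, then one length byte per element); it is not a proposal for wide-int.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical header followed by trailing storage. */
struct trailing_ints {
  unsigned char max_len;       /* words reserved per element */
  unsigned char num_elements;  /* chosen at run time */
  long words[];                /* num_elements * max_len values, then lengths */
};

/* Bytes needed beyond sizeof (struct trailing_ints): the value words
   plus one length byte per element. */
static size_t trailing_ints_extra_size(unsigned num, unsigned max_len) {
  return (size_t)num * max_len * sizeof(long) + num;
}

/* Allocate a zero-initialised object; returns NULL on allocation failure. */
static struct trailing_ints *trailing_ints_new(unsigned char num,
                                               unsigned char max_len) {
  struct trailing_ints *t =
      calloc(1, sizeof *t + trailing_ints_extra_size(num, max_len));
  if (t != NULL) {
    t->max_len = max_len;
    t->num_elements = num;
  }
  return t;
}

/* First word of element I. */
static long *trailing_ints_words(struct trailing_ints *t, unsigned i) {
  assert(i < t->num_elements);
  return t->words + (size_t)i * t->max_len;
}

/* Per-element lengths, stored right after the value words. */
static unsigned char *trailing_ints_lens(struct trailing_ints *t) {
  return (unsigned char *)(t->words + (size_t)t->num_elements * t->max_len);
}
```

The extra header byte and the run-time offset computation are exactly the costs that existing fixed-N users would pay if the current class were changed in place.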

Jakub



Re: [PATCH 9/12] arm: Make libgcc bti compatible

2022-07-01 Thread Richard Earnshaw via Gcc-patches




On 28/04/2022 10:48, Andrea Corallo via Gcc-patches wrote:

This change adds bti instructions at the beginning of Arm-specific
libgcc hand-written assembly routines.

2022-03-31  Andrea Corallo  

* libgcc/config/arm/crti.S (FUNC_START): Add bti instruction if
necessary.
* libgcc/config/arm/lib1funcs.S (THUMB_FUNC_START, FUNC_START):
Likewise.



+#if defined(__ARM_FEATURE_BTI)

Wouldn't it be better to use __ARM_FEATURE_BTI_DEFAULT?  That way we 
only get BTI instructions in multilib variants that have asked for BTI.


R.


Re: [PATCH] c++: warn about using keywords as identifiers [PR106111]

2022-07-01 Thread Jason Merrill via Gcc-patches

On 6/29/22 12:11, Marek Polacek wrote:

In C++03, -Wc++11-compat should warn about

   int constexpr;

since 'constexpr' is a keyword in C++11.  Jonathan reports that
we don't emit a similar warning for 'alignas' or 'alignof', and,
as I found out, 'thread_local'.

Similarly, we don't warn for most C++20 keywords.  That happens
because RID_LAST_CXX20 hasn't been updated in a while.
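
The effect of a stale end marker on a range check can be shown with a small C model. The enum below is hypothetical: its names and ordering are illustrative only and much shorter than the real enum rid in c-common.h.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical keyword ids in declaration order. */
enum kw {
  KW_CONSTINIT,
  KW_CONSTEVAL,
  KW_CONCEPT,
  KW_REQUIRES,
  KW_CO_AWAIT,
  KW_CO_YIELD,
  KW_CO_RETURN,
  KW_FIRST_CXX20 = KW_CONSTINIT,
  KW_LAST_CXX20_STALE = KW_CONSTINIT, /* end marker that was never moved */
  KW_LAST_CXX20 = KW_CO_RETURN        /* end marker after the fix */
};

static bool in_range(enum kw k, enum kw first, enum kw last) {
  assert(first <= last);
  return k >= first && k <= last;
}

/* Range test with the stale marker: only the first keyword is covered. */
static bool warns_before_fix(enum kw k) {
  return in_range(k, KW_FIRST_CXX20, KW_LAST_CXX20_STALE);
}

/* Range test with the updated marker: the whole group is covered. */
static bool warns_after_fix(enum kw k) {
  return in_range(k, KW_FIRST_CXX20, KW_LAST_CXX20);
}
```

Keywords that sit outside the contiguous range entirely, as alignof, alignas and thread_local do for the C++11 check, cannot be fixed by moving a marker and need explicit tests instead.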

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?


OK.


PR c++/106111

gcc/c-family/ChangeLog:

* c-common.h (enum rid): Update RID_LAST_CXX20.

gcc/cp/ChangeLog:

* parser.cc (cp_lexer_get_preprocessor_token): Also warn about
RID_ALIGNOF, RID_ALIGNAS, RID_THREAD.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/keywords1.C: New test.
* g++.dg/cpp2a/keywords1.C: New test.
---
  gcc/c-family/c-common.h|  2 +-
  gcc/cp/parser.cc   | 10 +++---
  gcc/testsuite/g++.dg/cpp0x/keywords1.C | 15 +++
  gcc/testsuite/g++.dg/cpp2a/keywords1.C | 12 
  4 files changed, 35 insertions(+), 4 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/cpp0x/keywords1.C
  create mode 100644 gcc/testsuite/g++.dg/cpp2a/keywords1.C

diff --git a/gcc/c-family/c-common.h b/gcc/c-family/c-common.h
index 47442c95a53..a1e6a55158d 100644
--- a/gcc/c-family/c-common.h
+++ b/gcc/c-family/c-common.h
@@ -271,7 +271,7 @@ enum rid
RID_FIRST_CXX11 = RID_CONSTEXPR,
RID_LAST_CXX11 = RID_STATIC_ASSERT,
RID_FIRST_CXX20 = RID_CONSTINIT,
-  RID_LAST_CXX20 = RID_CONSTINIT,
+  RID_LAST_CXX20 = RID_CO_RETURN,
RID_FIRST_AT = RID_AT_ENCODE,
RID_LAST_AT = RID_AT_IMPLEMENTATION,
RID_FIRST_PQ = RID_IN,
diff --git a/gcc/cp/parser.cc b/gcc/cp/parser.cc
index da2f370cdca..cc6525e0509 100644
--- a/gcc/cp/parser.cc
+++ b/gcc/cp/parser.cc
@@ -890,10 +890,14 @@ cp_lexer_get_preprocessor_token (unsigned flags, cp_token *token)
else
{
if (warn_cxx11_compat
-  && C_RID_CODE (token->u.value) >= RID_FIRST_CXX11
-  && C_RID_CODE (token->u.value) <= RID_LAST_CXX11)
+ && ((C_RID_CODE (token->u.value) >= RID_FIRST_CXX11
+  && C_RID_CODE (token->u.value) <= RID_LAST_CXX11)
+ /* These are outside the CXX11 range.  */
+ || C_RID_CODE (token->u.value) == RID_ALIGNOF
+ || C_RID_CODE (token->u.value) == RID_ALIGNAS
+ || C_RID_CODE (token->u.value)== RID_THREAD))
  {
-  /* Warn about the C++0x keyword (but still treat it as
+ /* Warn about the C++11 keyword (but still treat it as
   an identifier).  */
  warning_at (token->location, OPT_Wc__11_compat,
  "identifier %qE is a keyword in C++11",
diff --git a/gcc/testsuite/g++.dg/cpp0x/keywords1.C b/gcc/testsuite/g++.dg/cpp0x/keywords1.C
new file mode 100644
index 000..2b2ab6404ea
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/keywords1.C
@@ -0,0 +1,15 @@
+// PR c++/106111
+// { dg-do compile { target c++98_only } }
+// { dg-options "-Wc++11-compat" }
+
+int alignof; // { dg-warning "is a keyword in C\\\+\\\+11" }
+int alignas; // { dg-warning "is a keyword in C\\\+\\\+11" }
+int constexpr; // { dg-warning "is a keyword in C\\\+\\\+11" }
+int decltype; // { dg-warning "is a keyword in C\\\+\\\+11" }
+int noexcept; // { dg-warning "is a keyword in C\\\+\\\+11" }
+int nullptr; // { dg-warning "is a keyword in C\\\+\\\+11" }
+int static_assert; // { dg-warning "is a keyword in C\\\+\\\+11" }
+int thread_local; // { dg-warning "is a keyword in C\\\+\\\+11" }
+int _Alignas;
+int _Alignof;
+int _Thread_local;
diff --git a/gcc/testsuite/g++.dg/cpp2a/keywords1.C b/gcc/testsuite/g++.dg/cpp2a/keywords1.C
new file mode 100644
index 000..7f4dba2d3b7
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp2a/keywords1.C
@@ -0,0 +1,12 @@
+// PR c++/106111
+// { dg-do compile { target c++17_down } }
+// { dg-options "-Wc++20-compat -Wc++11-compat -Wc++14-compat -Wc++17-compat" }
+
+int constinit; // { dg-warning "is a keyword in C\\\+\\\+20" }
+int consteval; // { dg-warning "is a keyword in C\\\+\\\+20" }
+int requires; // { dg-warning "is a keyword in C\\\+\\\+20" }
+int concept; // { dg-warning "is a keyword in C\\\+\\\+20" }
+int co_await; // { dg-warning "is a keyword in C\\\+\\\+20" }
+int co_yield; // { dg-warning "is a keyword in C\\\+\\\+20" }
+int co_return; // { dg-warning "is a keyword in C\\\+\\\+20" }
+int char8_t; // { dg-warning "is a keyword in C\\\+\\\+20" }

base-commit: b01c075e7e6d84da846c2ff9087433a30ebeb0d2




RFA: Another Rust demangler recursion limit

2022-07-01 Thread Nick Clifton via Gcc-patches
Hi Jeff,

  [I am sending this to you directly since you seem to be the only one
  reviewing these patches].

  Hot on the heels of the fix for the recursion problem in demangle_const
  a binutils user has filed another PoC that exposes a problem in
  demangle_path_maybe_open_generics():

https://sourceware.org/bugzilla/show_bug.cgi?id=29312#c1

  I have redirected them to file a bug report with the gcc system, but in
  the hopes of getting a fix in quickly I am also attaching a patch
  here.  It just does the obvious thing of adding a recursion counter
  and limit to the function.
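
  The counter-and-limit pattern can be sketched on a toy recursive parser in C. Nothing below is taken from rust-demangle.c: the grammar (balanced parentheses), the struct, and the MAX_DEPTH value are all hypothetical and only mirror the shape of the change, including the single exit label that undoes the increment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_DEPTH 1024 /* hypothetical limit */

struct parser {
  const char *s;
  size_t pos;
  unsigned depth;
  bool errored;
};

/* Parse zero or more balanced "(...)" groups, recursing once per
   nesting level and giving up when MAX_DEPTH is exceeded. */
static void parse_groups(struct parser *p) {
  if (p->errored)
    return;
  if (++p->depth > MAX_DEPTH) {
    p->errored = true;
    goto done;
  }
  while (p->s[p->pos] == '(') {
    p->pos++;
    parse_groups(p);
    if (p->errored)
      goto done;
    if (p->s[p->pos] != ')') {
      p->errored = true;
      goto done;
    }
    p->pos++;
  }
done:
  --p->depth; /* every increment is paired with exactly one decrement */
}

/* True if S consists only of balanced groups within the depth limit. */
static bool parse_ok(const char *s) {
  assert(s != NULL);
  struct parser p = { s, 0, 0, false };
  parse_groups(&p);
  return !p.errored && s[p.pos] == '\0';
}
```

  Input that nests deeper than the limit is rejected as an error instead of exhausting the stack, which is the behaviour the patch is after.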

Cheers
  Nick

diff --git a/libiberty/rust-demangle.c b/libiberty/rust-demangle.c
index 36afcfae278..d6daf23af27 100644
--- a/libiberty/rust-demangle.c
+++ b/libiberty/rust-demangle.c
@@ -1082,6 +1082,18 @@ demangle_path_maybe_open_generics (struct rust_demangler *rdm)
   if (rdm->errored)
 return open;
 
+  if (rdm->recursion != RUST_NO_RECURSION_LIMIT)
+{
+  ++ rdm->recursion;
+  if (rdm->recursion > RUST_MAX_RECURSION_COUNT)
+	{
+	  /* FIXME: There ought to be a way to report
+	 that the recursion limit has been reached.  */
+	  rdm->errored = 1;
+	  goto end_of_func;
+	}
+}
+
   if (eat (rdm, 'B'))
 {
   backref = parse_integer_62 (rdm);
@@ -1107,6 +1119,11 @@ demangle_path_maybe_open_generics (struct rust_demangler *rdm)
 }
   else
 demangle_path (rdm, 0);
+
+ end_of_func:
+  if (rdm->recursion != RUST_NO_RECURSION_LIMIT)
+-- rdm->recursion;
+
   return open;
 }
 


[PATCH] i386: Use "r" constraint in *andn3_doubleword_bmi

2022-07-01 Thread Uros Bizjak via Gcc-patches
ANDN is non-destructive, so use "r" instead of "0" for its operand 1 constraint.

2022-07-01  Uroš Bizjak  

gcc/ChangeLog:

* config/i386/i386.md (*andn3_doubleword_bmi):
Use "r" constraint for operand 1.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

Pushed to master.

Uros.
diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index 352a21c585c..20c3b9a4122 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -10407,7 +10407,7 @@ (define_split
 (define_insn_and_split "*andn3_doubleword_bmi"
   [(set (match_operand: 0 "register_operand" "=r")
(and:
- (not: (match_operand: 1 "register_operand" "0"))
+ (not: (match_operand: 1 "register_operand" "r"))
  (match_operand: 2 "nonimmediate_operand" "ro")))
(clobber (reg:CC FLAGS_REG))]
   "TARGET_BMI"


Re: [PATCH] x86: Support 2/4/8 byte constant vector stores

2022-07-01 Thread Uros Bizjak via Gcc-patches
On Thu, Jun 30, 2022 at 4:50 PM H.J. Lu  wrote:
>
> 1. Add a predicate for constant vectors which can be converted to integer
> constants suitable for constant integer stores.  For a 8-byte constant
> vector, the converted 64-bit integer must be valid for store with 64-bit
> immediate, which is a 64-bit integer sign-extended from a 32-bit integer.
> 2. Add a new pattern to allow 2-byte, 4-byte and 8-byte constant vector
> stores, like
>
> (set (mem:V2HI (reg:DI 84))
>  (const_vector:V2HI [(const_int 0 [0]) (const_int 1 [0x1])]))
>
> 3. After reload, convert constant vector stores to constant integer
> stores, like
>
> (set (mem:SI (reg:DI 5 di [84]))
>  (const_int 65536 [0x10000]))
>
> For
>
> void
> foo (short * c)
> {
>   c[0] = 0;
>   c[1] = 1;
> }
>
> it generates
>
>         movl    $65536, (%rdi)
>
> instead of
>
>         movl    .LC0(%rip), %eax
>         movl    %eax, (%rdi)
>
> gcc/
>
> PR target/106022
> * config/i386/i386-protos.h (ix86_convert_const_vector_to_integer):
> New.
> * config/i386/i386.cc (ix86_convert_const_vector_to_integer):
> New.
> * config/i386/mmx.md (V_16_32_64): New.
> (*mov_imm): New patterns for stores with 16-bit, 32-bit
> and 64-bit constant vector.
> * config/i386/predicates.md (x86_64_const_vector_operand): New.
>
> gcc/testsuite/
>
> PR target/106022
> * gcc.target/i386/pr106022-1.c: New test.
> * gcc.target/i386/pr106022-2.c: Likewise.
> * gcc.target/i386/pr106022-3.c: Likewise.
> * gcc.target/i386/pr106022-4.c: Likewise.

OK.

Thanks,
Uros.
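
For readers following along, the arithmetic behind the conversion can be sketched in standalone C. This is not the GCC implementation (which works on wide_int and RTL); the function names are hypothetical. It packs the vector elements with element 0 in the low bits, as in the V2HI example where {0, 1} becomes 65536, and then checks the 8-byte condition that the result is a 32-bit value sign-extended to 64 bits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Pack N elements of ELEM_BITS bits each into one integer, with
   element 0 in the least significant bits. */
static uint64_t pack_elements(const uint64_t *elems, int n, int elem_bits) {
  assert(elem_bits > 0 && elem_bits < 64 && n * elem_bits <= 64);
  uint64_t mask = (UINT64_C(1) << elem_bits) - 1;
  uint64_t val = 0;
  for (int i = 0; i < n; i++)
    val |= (elems[i] & mask) << (i * elem_bits);
  return val;
}

/* True if VAL, viewed as a 64-bit pattern, equals some 32-bit integer
   sign-extended to 64 bits (upper 33 bits all zero or all one). */
static bool fits_sign_extended_imm32(uint64_t val) {
  return val <= UINT64_C(0x7fffffff) || val >= UINT64_C(0xffffffff80000000);
}
```

A V2SI constant such as {0, 1} packs to 0x100000000, which fails the second check, so it could not be stored with a single 64-bit immediate store.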

> ---
>  gcc/config/i386/i386-protos.h  |  2 +
>  gcc/config/i386/i386.cc| 47 ++
>  gcc/config/i386/mmx.md | 37 +
>  gcc/config/i386/predicates.md  | 11 +
>  gcc/testsuite/gcc.target/i386/pr106022-1.c | 13 ++
>  gcc/testsuite/gcc.target/i386/pr106022-2.c | 14 +++
>  gcc/testsuite/gcc.target/i386/pr106022-3.c | 14 +++
>  gcc/testsuite/gcc.target/i386/pr106022-4.c | 14 +++
>  8 files changed, 152 insertions(+)
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr106022-1.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr106022-2.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr106022-3.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr106022-4.c
>
> diff --git a/gcc/config/i386/i386-protos.h b/gcc/config/i386/i386-protos.h
> index 3596ce81ecf..cf847751ac5 100644
> --- a/gcc/config/i386/i386-protos.h
> +++ b/gcc/config/i386/i386-protos.h
> @@ -122,6 +122,8 @@ extern void ix86_expand_unary_operator (enum rtx_code, machine_mode,
> rtx[]);
>  extern rtx ix86_build_const_vector (machine_mode, bool, rtx);
>  extern rtx ix86_build_signbit_mask (machine_mode, bool, bool);
> +extern HOST_WIDE_INT ix86_convert_const_vector_to_integer (rtx,
> +  machine_mode);
>  extern void ix86_split_convert_uns_si_sse (rtx[]);
>  extern void ix86_expand_convert_uns_didf_sse (rtx, rtx);
>  extern void ix86_expand_convert_uns_sixf_sse (rtx, rtx);
> diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
> index b15b4893bb9..0cfe9962f75 100644
> --- a/gcc/config/i386/i386.cc
> +++ b/gcc/config/i386/i386.cc
> @@ -15723,6 +15723,53 @@ ix86_build_signbit_mask (machine_mode mode, bool vect, bool invert)
>return force_reg (vec_mode, v);
>  }
>
> +/* Return HOST_WIDE_INT for const vector OP in MODE.  */
> +
> +HOST_WIDE_INT
> +ix86_convert_const_vector_to_integer (rtx op, machine_mode mode)
> +{
> +  if (GET_MODE_SIZE (mode) > UNITS_PER_WORD)
> +gcc_unreachable ();
> +
> +  int nunits = GET_MODE_NUNITS (mode);
> +  wide_int val = wi::zero (GET_MODE_BITSIZE (mode));
> +  machine_mode innermode = GET_MODE_INNER (mode);
> +  unsigned int innermode_bits = GET_MODE_BITSIZE (innermode);
> +
> +  switch (mode)
> +{
> +case E_V2QImode:
> +case E_V4QImode:
> +case E_V2HImode:
> +case E_V8QImode:
> +case E_V4HImode:
> +case E_V2SImode:
> +  for (int i = 0; i < nunits; ++i)
> +   {
> + int v = INTVAL (XVECEXP (op, 0, i));
> + wide_int wv = wi::shwi (v, innermode_bits);
> + val = wi::insert (val, wv, innermode_bits * i, innermode_bits);
> +   }
> +  break;
> +case E_V2HFmode:
> +case E_V4HFmode:
> +case E_V2SFmode:
> +  for (int i = 0; i < nunits; ++i)
> +   {
> + rtx x = XVECEXP (op, 0, i);
> + int v = real_to_target (NULL, CONST_DOUBLE_REAL_VALUE (x),
> + REAL_MODE_FORMAT (innermode));
> + wide_int wv = wi::shwi (v, innermode_bits);
> + val = wi::insert (val, wv, innermode_bits * i, innermode_bits);
> +   }
> +  break;
> +default:
> +  gcc_unreachable ();
> +}
> +
> +  return val.to_shwi ();
> +}
> +
>  /* Return TRUE or FALSE dependi
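
For readers following along, the packing done by ix86_convert_const_vector_to_integer in the hunk above can be sketched in plain C. This is only an illustration under assumptions: pack_elements and its parameters are hypothetical names, plain 64-bit arithmetic stands in for wide_int, and element 0 is placed in the least significant bits, mirroring the innermode_bits * i offset passed to wi::insert.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the wide_int based loop: pack N elements of
   BITS bits each into one 64-bit value, element 0 in the low bits.  */
static uint64_t
pack_elements (const int64_t *elts, int n, unsigned bits)
{
  assert (n >= 0 && bits >= 1 && bits <= 64 && (uint64_t) n * bits <= 64);

  /* Mask that keeps only the low BITS bits of each (possibly negative)
     element, like truncating to the inner mode.  */
  uint64_t mask = bits == 64 ? ~(uint64_t) 0 : (((uint64_t) 1 << bits) - 1);
  uint64_t val = 0;

  for (int i = 0; i < n; ++i)
    val |= ((uint64_t) elts[i] & mask) << (bits * i);
  return val;
}
```

With two 16-bit elements {1, -1}, the second element occupies bits 16..31, so the packed value has 0xFFFF in its upper half and 1 in its lower half.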

Re: [GCC 13][PATCH] PR101836: Add a new option -fstrict-flex-array[=n] and use it in __builtin_object_size

2022-07-01 Thread Martin Sebor via Gcc-patches

On 7/1/22 08:01, Qing Zhao wrote:




On Jul 1, 2022, at 8:59 AM, Jakub Jelinek  wrote:

On Fri, Jul 01, 2022 at 12:55:08PM +, Qing Zhao wrote:

If so, compared to the current implementation, which has all the checking in 
the middle-end, what’s the
major benefit of moving part of the checking into the FE and leaving the other 
part in the middle-end?


The point is recording early what FIELD_DECLs could be vs. can't possibly be
treated like flexible array members and just use that flag in the decisions
in the current routines in addition to what it is doing.


Okay.

Based on the discussion so far, I will do the following:

1. Add a new flag “DECL_NOT_FLEXARRAY” to FIELD_DECL;
2. In the C/C++ FE, set the new flag “DECL_NOT_FLEXARRAY” for a FIELD_DECL based on 
[0], [1],
 [] and the option -fstrict-flex-array, and on whether it’s the last field of 
the DECL_CONTEXT.
3. In the middle end, add a new utility routine is_flexible_array_member_p, which 
uses
 DECL_NOT_FLEXARRAY + array_at_struct_end_p to decide whether the array
 reference is a real flexible array member reference.
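
Step 2 of the plan above could be sketched roughly as follows. This is a hypothetical stand-alone model, not front-end code: decl_not_flexarray, BOUND_NONE and the meaning assigned to each strictness level are assumptions made for illustration (level 0 accepts any trailing array, level 1 accepts [], [0] and [1], level 2 accepts [] and [0], level 3 accepts only []).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical marker for a field declared with '[]' (no bound).  */
#define BOUND_NONE (-1L)

/* Return true if the field can NOT be treated as a flexible array member,
   i.e. the value the proposed DECL_NOT_FLEXARRAY flag would record.
   LEVEL models the argument of the proposed option (assumed semantics).  */
static bool
decl_not_flexarray (bool last_field, long bound, int level)
{
  assert (level >= 0 && level <= 3);

  /* Only the last field of its containing struct can be flexible.  */
  if (!last_field)
    return true;

  switch (level)
    {
    case 0:
      return false;
    case 1:
      return !(bound == BOUND_NONE || bound == 0 || bound == 1);
    case 2:
      return !(bound == BOUND_NONE || bound == 0);
    default:
      return bound != BOUND_NONE;
    }
}
```

The middle-end routine from step 3 would then only need to combine this recorded flag with the existing array_at_struct_end_p check.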


The middle end is currently quite a mess: array_at_struct_end_p, component_ref_size, 
and all the phases that
use these routines need to be updated, plus new test cases for each of the 
phases.


So, I still plan to separate the patch set into 2 parts:

   Part A:the above 1 + 2 + 3,  and use these new utilities in 
tree-object-size.cc to resolve PR101836 first.
  Then kernel can use __FORTIFY_SOURCE correctly;

   Part B:update all other phases with the new utilities + new testing 
cases + resolving regressions.

Let me know if you have any comments or suggestions.


It might be worth considering whether it should be possible to control
the "flexible array" property separately for each trailing array member
via either a #pragma or an attribute in headers that can't change
the struct layout but that need to be usable in programs compiled with
stricter -fstrict-flex-array=N settings.

Martin



Thanks a lot for all your help.

Qing



Jakub







[Patch] OpenMP: Handle tofrom with target enter/exit data

2022-07-01 Thread Tobias Burnus

Needed a break and some success. Hence, I implemented this useful and simple 
OpenMP 5.2
feature.

OK for trunk?

Tobias
-
Siemens Electronic Design Automation GmbH; Address: Arnulfstraße 201, 80634 
Munich; limited liability company; Managing Directors: Thomas 
Heurung, Frank Thürauf; Registered office: Munich; Registry court: 
Munich, HRB 106955
OpenMP: Handle tofrom with target enter/exit data

In 5.2, a map clause can be map-entering or map-exiting,
either of which may contain 'tofrom'. The main reason for this is
to permit 'map(x)' with 'omp target enter/exit data',
which avoids having to specify 'to:/from:' explicitly. (OpenMP
defaults to 'tofrom'.)
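
A minimal usage sketch of what this permits, assuming a compiler that implements this OpenMP 5.2 behaviour (the function name and array size are made up for illustration; built without OpenMP the pragmas are ignored and the result is the same):

```c
#define N 4

/* Hypothetical example: map an array with plain 'map(a)' on the
   enter/exit data directives instead of spelling out 'to:' / 'from:'.  */
static int
increment_on_device (int start)
{
  int a[N];
  for (int i = 0; i < N; ++i)
    a[i] = start + i;

  /* No map-type given, so the default 'tofrom' applies; on an
     enter-data directive it is handled like 'to'.  */
  #pragma omp target enter data map(a)

  /* The array is already present on the device, so this region only
     updates the device copy.  */
  #pragma omp target map(a)
  for (int i = 0; i < N; ++i)
    a[i] += 1;

  /* On an exit-data directive the default 'tofrom' is handled like
     'from', copying the updated values back.  */
  #pragma omp target exit data map(a)

  int sum = 0;
  for (int i = 0; i < N; ++i)
    sum += a[i];
  return sum;
}
```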

gcc/c/ChangeLog:

	* c-parser.cc (c_parser_omp_target_enter_data,
	c_parser_omp_target_exit_data): Accept tofrom
	map-type modifier but use 'to' / 'from' internally.

gcc/cp/ChangeLog:

	* parser.cc (cp_parser_omp_target_enter_data,
	cp_parser_omp_target_exit_data): Accept tofrom
	map-type modifier but use 'to' / 'from' internally.


gcc/fortran/ChangeLog:

	* dump-parse-tree.cc (show_omp_namelist): For the map-type,
	also handle the always modifier and release/delete.
	* openmp.cc (resolve_omp_clauses): Accept tofrom
	map-type modifier for target enter/exit data,
	but use 'to' / 'from' internally.

libgomp/ChangeLog:

	* libgomp.texi (OpenMP 5.2): Mark target enter/exit data
	with tofrom as implemented.

gcc/testsuite/ChangeLog:

	* c-c++-common/gomp/target-data-2.c: New test.
	* c-c++-common/gomp/target-data-3.c: New test.
	* gfortran.dg/gomp/target-data-1.f90: New test.
	* gfortran.dg/gomp/target-data-2.f90: New test.

 gcc/c/c-parser.cc| 22 +++---
 gcc/cp/parser.cc | 22 +++---
 gcc/fortran/dump-parse-tree.cc   |  5 +
 gcc/fortran/openmp.cc| 20 
 gcc/testsuite/c-c++-common/gomp/target-data-2.c  | 20 
 gcc/testsuite/c-c++-common/gomp/target-data-3.c  | 17 +
 gcc/testsuite/gfortran.dg/gomp/target-data-1.f90 | 17 +
 gcc/testsuite/gfortran.dg/gomp/target-data-2.f90 | 14 ++
 libgomp/libgomp.texi |  2 +-
 9 files changed, 128 insertions(+), 11 deletions(-)

diff --git a/gcc/c/c-parser.cc b/gcc/c/c-parser.cc
index 1704a52be12..97e3b23b5d2 100644
--- a/gcc/c/c-parser.cc
+++ b/gcc/c/c-parser.cc
@@ -21072,6 +21072,14 @@ c_parser_omp_target_enter_data (location_t loc, c_parser *parser,
 	  case GOMP_MAP_ALLOC:
 	map_seen = 3;
 	break;
+	  case GOMP_MAP_TOFROM:
+	OMP_CLAUSE_SET_MAP_KIND (*pc, GOMP_MAP_TO);
+	map_seen = 3;
+	break;
+	  case GOMP_MAP_ALWAYS_TOFROM:
+	OMP_CLAUSE_SET_MAP_KIND (*pc, GOMP_MAP_ALWAYS_TO);
+	map_seen = 3;
+	break;
 	  case GOMP_MAP_FIRSTPRIVATE_POINTER:
 	  case GOMP_MAP_ALWAYS_POINTER:
 	  case GOMP_MAP_ATTACH_DETACH:
@@ -21080,7 +21088,7 @@ c_parser_omp_target_enter_data (location_t loc, c_parser *parser,
 	map_seen |= 1;
 	error_at (OMP_CLAUSE_LOCATION (*pc),
 		  "%<#pragma omp target enter data%> with map-type other "
-		  "than %<to%> or %<alloc%> on %<map%> clause");
+		  "than %<to%>, %<tofrom%> or %<alloc%> on %<map%> clause");
 	*pc = OMP_CLAUSE_CHAIN (*pc);
 	continue;
 	  }
@@ -21159,6 +21167,14 @@ c_parser_omp_target_exit_data (location_t loc, c_parser *parser,
 	  case GOMP_MAP_DELETE:
 	map_seen = 3;
 	break;
+	  case GOMP_MAP_TOFROM:
+	OMP_CLAUSE_SET_MAP_KIND (*pc, GOMP_MAP_FROM);
+	map_seen = 3;
+	break;
+	  case GOMP_MAP_ALWAYS_TOFROM:
+	OMP_CLAUSE_SET_MAP_KIND (*pc, GOMP_MAP_ALWAYS_FROM);
+	map_seen = 3;
+	break;
 	  case GOMP_MAP_FIRSTPRIVATE_POINTER:
 	  case GOMP_MAP_ALWAYS_POINTER:
 	  case GOMP_MAP_ATTACH_DETACH:
@@ -21167,8 +21183,8 @@ c_parser_omp_target_exit_data (location_t loc, c_parser *parser,
 	map_seen |= 1;
 	error_at (OMP_CLAUSE_LOCATION (*pc),
 		  "%<#pragma omp target exit data%> with map-type other "
-		  "than %<from%>, %<release%> or %<delete%> on %<map%>"
-		  " clause");
+		  "than %<from%>, %<tofrom%>, %<release%> or %<delete%> "
+		  "on %<map%> clause");
 	*pc = OMP_CLAUSE_CHAIN (*pc);
 	continue;
 	  }
diff --git a/gcc/cp/parser.cc b/gcc/cp/parser.cc
index da2f370cdca..e8376253a60 100644
--- a/gcc/cp/parser.cc
+++ b/gcc/cp/parser.cc
@@ -44405,6 +44405,14 @@ cp_parser_omp_target_enter_data (cp_parser *parser, cp_token *pragma_tok,
 	  case GOMP_MAP_ALLOC:
 	map_seen = 3;
 	break;
+	  case GOMP_MAP_TOFROM:
+	OMP_CLAUSE_SET_MAP_KIND (*pc, GOMP_MAP_TO);
+	map_seen = 3;
+	break;
+	  case GOMP_MAP_ALWAYS_TOFROM:
+	OMP_CLAUSE_SET_MAP_KIND (*pc, GOMP_MAP_ALWAYS_TO);
+	map_seen = 3;
+	break;
 	  case GOMP_MAP_FIRSTPRIVATE_POINTER:
 	  case GOMP_MAP_FIRSTPRIVATE_REFERENCE:
 	  case GOMP_MAP_ALWAYS_POINTER:
@@ -44414,7 +44422,7 @@ cp_parser_omp_target_enter_data (cp_parser *parser, cp_token *pragma_tok,
 	map_seen |= 1;
 	error_at (OMP_CLAUSE_LOCATION (*pc),
 		  "%<#pragma omp target enter

Re: [PATCH 10/12 V2] arm: Implement cortex-M return signing address codegen

2022-07-01 Thread Richard Earnshaw via Gcc-patches




On 28/06/2022 10:17, Andrea Corallo via Gcc-patches wrote:

Hi all,

second version of this patch enabling return address signing and
verification based on Armv8.1-M Pointer Authentication [1].

To sign the return address, we use the PAC R12, LR, SP instruction
upon function entry.  This is signing LR using SP and storing the
result in R12.  R12 will be pushed into the stack.

During function epilogue R12 will be popped and AUT R12, LR, SP will
be used to verify that the content of LR is still valid before return.

Here is an example of a PAC-instrumented function prologue and epilogue:

void foo (void);

int main()
{
   foo ();
   return 0;
}

Compiled with '-march=armv8.1-m.main -mbranch-protection=pac-ret
-mthumb' translates into:

main:
        pac     ip, lr, sp
        push    {r3, r7, ip, lr}
        add     r7, sp, #0
        bl      foo
        movs    r3, #0
        mov     r0, r3
        pop     {r3, r7, ip, lr}
        aut     ip, lr, sp
        bx      lr

The patch also takes care of generating a PACBTI instruction in place
of the sequence BTI+PAC when Branch Target Identification is enabled
as well.

Ex. the previous example compiled with '-march=armv8.1-m.main
-mbranch-protection=pac-ret+bti -mthumb' translates into:

main:
        pacbti  ip, lr, sp
        push    {r3, r7, ip, lr}
        add     r7, sp, #0
        bl      foo
        movs    r3, #0
        mov     r0, r3
        pop     {r3, r7, ip, lr}
        aut     ip, lr, sp
        bx      lr

Following previous upstream suggestions, a test for varargs has been
added and '-mtpcs-frame' is treated as incompatible with the return
address signing feature being introduced.

[1] 


gcc/Changelog

* config/arm/arm.c: (arm_compute_frame_layout)
(arm_expand_prologue, thumb2_expand_return, arm_expand_epilogue)
(arm_conditional_register_usage): Update for pac codegen.
(arm_current_function_pac_enabled_p): New function.
* config/arm/arm.md (pac_ip_lr_sp, pacbti_ip_lr_sp, aut_ip_lr_sp):
Add new patterns.
* config/arm/unspecs.md (UNSPEC_PAC_IP_LR_SP)
(UNSPEC_PACBTI_IP_LR_SP, UNSPEC_AUT_IP_LR_SP): Add unspecs.

gcc/testsuite/Changelog

* gcc.target/arm/pac.h : New file.
* gcc.target/arm/pac-1.c : New test case.
* gcc.target/arm/pac-2.c : Likewise.
* gcc.target/arm/pac-3.c : Likewise.
* gcc.target/arm/pac-4.c : Likewise.
* gcc.target/arm/pac-5.c : Likewise.
* gcc.target/arm/pac-6.c : Likewise.
* gcc.target/arm/pac-7.c : Likewise.
* gcc.target/arm/pac-8.c : Likewise.



@@ -21139,6 +21139,14 @@ arm_compute_save_core_reg_mask (void)

   save_reg_mask |= arm_compute_save_reg0_reg12_mask ();

+  if (arm_current_function_pac_enabled_p ())
+{
+  if (TARGET_TPCS_FRAME
+ || (TARGET_TPCS_LEAF_FRAME && crtl->is_leaf))
+   error ("TPCS incompatible with return address signing.");
+  save_reg_mask |= 1 << IP_REGNUM;
+}
+

This is the wrong place for a test like this as it will be generated 
every time this function is called (which might be more than once per 
compiled function).


However, TPCS frames are only supported for 'thumb-1' code and PACBTI 
needs armv8-m.main (ie Thumb-2), so the test is really pretty pointless 
at the moment.  I think we should just drop the error.


@@ -22302,7 +22310,7 @@ arm_emit_multi_reg_pop (unsigned long 
saved_regs_mask)

 par = emit_insn (par);

   REG_NOTES (par) = dwarf;
-  if (!return_in_pc)
+  if (!return_in_pc && !frame_pointer_needed)
 arm_add_cfa_adjust_cfa_note (par, UNITS_PER_WORD * num_regs,
 stack_pointer_rtx, stack_pointer_rtx);
 }

What's this hunk for?  It doesn't seem related to the PAC changes.  Is 
this some generic bug?  If so, it should be pulled out into a separate 
patch.  If not, it needs some comment as to why we do it this way.


@@ -23352,6 +23360,11 @@ output_probe_stack_range (rtx reg1, rtx reg2)
   return "";
 }

+static bool  aarch_bti_enabled ()
+{
+  return false;
+}
+

GNU style requires the function name to be in the first column:

static bool
aarch_bti_enabled ()
{
  ...

@@ -23431,11 +23444,12 @@ arm_expand_prologue (void)
   /* The static chain register is the same as the IP register.  If it is
  clobbered when creating the frame, we need to save and restore 
it.  */

   clobber_ip = IS_NESTED (func_type)
-  && ((TARGET_APCS_FRAME && frame_pointer_needed && TARGET_ARM)
-  || ((flag_stack_check == STATIC_BUILTIN_STACK_CHECK
-   || flag_stack_clash_protection)
-  && !df_regs_ever_live_p (LR_REGNUM)
-  && arm_r3_live_at_start_p ()));
+&& (((TARGET_APCS_FRAME && frame_pointer_needed && TARGET_ARM)
+|| ((flag_stack_chec

Re: [Patch] OpenMP: Handle tofrom with target enter/exit data

2022-07-01 Thread Jakub Jelinek via Gcc-patches
On Fri, Jul 01, 2022 at 05:41:27PM +0200, Tobias Burnus wrote:
> --- a/gcc/fortran/dump-parse-tree.cc
> +++ b/gcc/fortran/dump-parse-tree.cc
> @@ -1414,6 +1414,11 @@ show_omp_namelist (int list_type, gfc_omp_namelist *n)
> case OMP_MAP_TO: fputs ("to:", dumpfile); break;
> case OMP_MAP_FROM: fputs ("from:", dumpfile); break;
> case OMP_MAP_TOFROM: fputs ("tofrom:", dumpfile); break;
> +   case OMP_MAP_ALWAYS_TO: fputs ("always,to:", dumpfile); break;
> +   case OMP_MAP_ALWAYS_FROM: fputs ("always,from:", dumpfile); break;
> +   case OMP_MAP_ALWAYS_TOFROM: fputs ("always,tofrom:", dumpfile); break;
> +   case OMP_MAP_DELETE: fputs ("always,tofrom:", dumpfile); break;
> +   case OMP_MAP_RELEASE: fputs ("always,tofrom:", dumpfile); break;

Pasto in the last 2 lines?

Other than that LGTM.

Jakub
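
The two lines flagged above presumably want their own map-type names. A stand-alone sketch of the intended mapping, with a hypothetical enum and helper (these are not the gfortran OMP_MAP_* definitions, and the "delete:" / "release:" spellings are an assumption based on the ChangeLog entry mentioning release/delete):

```c
/* Hypothetical stand-ins for the gfortran map operations.  */
enum map_op
{
  MAP_TO, MAP_FROM, MAP_TOFROM,
  MAP_ALWAYS_TO, MAP_ALWAYS_FROM, MAP_ALWAYS_TOFROM,
  MAP_DELETE, MAP_RELEASE
};

/* Return the prefix a dump routine would print for OP.  */
static const char *
map_prefix (enum map_op op)
{
  switch (op)
    {
    case MAP_TO: return "to:";
    case MAP_FROM: return "from:";
    case MAP_TOFROM: return "tofrom:";
    case MAP_ALWAYS_TO: return "always,to:";
    case MAP_ALWAYS_FROM: return "always,from:";
    case MAP_ALWAYS_TOFROM: return "always,tofrom:";
    /* The cases the review comment is about: each gets its own name
       instead of a copy of "always,tofrom:".  */
    case MAP_DELETE: return "delete:";
    case MAP_RELEASE: return "release:";
    }
  return "?";
}
```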



Re: [PATCH v2] c++: fix broken copy elision with nested TARGET_EXPRs [PR105550]

2022-07-01 Thread Jason Merrill via Gcc-patches

On 6/24/22 20:30, Marek Polacek wrote:

On Thu, Jun 02, 2022 at 05:08:54PM -0400, Jason Merrill wrote:

On 5/26/22 11:01, Marek Polacek wrote:

In this problem, we are failing to properly perform copy elision with
a conditional operator, so this:

constexpr A a = true ? A{} : A{};

fails with:

error: 'A{((const A*)(&))}' is not a constant expression

The whole initializer is

TARGET_EXPR }> : TARGET_EXPR }>>

where the outermost TARGET_EXPR is elided, but not the nested ones.
Then we end up replacing the PLACEHOLDER_EXPRs with the temporaries the
TARGET_EXPRs represent, which is precisely what should *not* happen with
copy elision.

I've tried the approach of tweaking ctx->object, but I ran into a gazillion
problems with that.  I thought that I would let cxx_eval_constant_expression
/TARGET_EXPR create a new object only when ctx->object was null, then
adjust setting of ctx->object in places like cxx_bind_parameters_in_call
and cxx_eval_component_reference but that failed completely.  Sometimes
ctx->object has to be reset, sometimes it cannot be reset, 'this' needed
special handling, etc.  I gave up.

But now that we have potential_prvalue_result_of, a simple but less

elegant solution is the following.  I thought about setting a flag on
a TARGET_EXPR to avoid adding ctx.full_expr, but a new flag would be
overkill and using TARGET_EXPR_DIRECT_INIT_P broke things.


Sorry it's taken me so long to get back to this.
  

This doesn't seem like a general solution; the same issue would also apply
to ?: of TARGET_EXPR when it's a subexpression rather than the full
expression, like f(true ? A{} : B{}).


You're right.
  

Another simple approach, but more general, would be to routinely strip
TARGET_EXPR from the operands of ?: like we do in various other places in
constexpr.c.


How about this, then?

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?


OK.


-- >8 --
In this problem, we are failing to properly perform copy elision with
a conditional operator, so this:

   constexpr A a = true ? A{} : A{};

fails with:

   error: 'A{((const A*)(&))}' is not a constant expression

The whole initializer is

   TARGET_EXPR }> : TARGET_EXPR }>>

where the outermost TARGET_EXPR is elided, but not the nested ones.
Then we end up replacing the PLACEHOLDER_EXPRs with the temporaries the
TARGET_EXPRs represent, which is precisely what should *not* happen with
copy elision.

I've tried the approach of tweaking ctx->object, but I ran into a gazillion
problems with that.  I thought that I would let cxx_eval_constant_expression
/TARGET_EXPR create a new object only when ctx->object was null, then
adjust setting of ctx->object in places like cxx_bind_parameters_in_call
and cxx_eval_component_reference but that failed completely.  Sometimes
ctx->object has to be reset, sometimes it cannot be reset, 'this' needed
special handling, etc.  I gave up.

Instead, this patch strips TARGET_EXPRs from the operands of ?: like
we do in various other places in constexpr.c.

PR c++/105550

gcc/cp/ChangeLog:

* constexpr.cc (cxx_eval_conditional_expression): Strip TARGET_EXPRs.

gcc/testsuite/ChangeLog:

* g++.dg/cpp1y/nsdmi-aggr16.C: Remove FIXME.
* g++.dg/cpp1y/nsdmi-aggr17.C: Remove FIXME.
* g++.dg/cpp0x/constexpr-elision1.C: New test.
* g++.dg/cpp1y/constexpr-elision1.C: New test.
---
  gcc/cp/constexpr.cc   |  7 +++
  .../g++.dg/cpp0x/constexpr-elision1.C | 16 ++
  .../g++.dg/cpp1y/constexpr-elision1.C | 53 +++
  gcc/testsuite/g++.dg/cpp1y/nsdmi-aggr16.C |  5 +-
  gcc/testsuite/g++.dg/cpp1y/nsdmi-aggr17.C |  5 +-
  5 files changed, 80 insertions(+), 6 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/cpp0x/constexpr-elision1.C
  create mode 100644 gcc/testsuite/g++.dg/cpp1y/constexpr-elision1.C

diff --git a/gcc/cp/constexpr.cc b/gcc/cp/constexpr.cc
index 0dc94d9445d..5f7fc6f8f0c 100644
--- a/gcc/cp/constexpr.cc
+++ b/gcc/cp/constexpr.cc
@@ -3507,6 +3507,13 @@ cxx_eval_conditional_expression (const constexpr_ctx 
*ctx, tree t,
  val = TREE_OPERAND (t, 1);
if (TREE_CODE (t) == IF_STMT && !val)
  val = void_node;
+  /* A TARGET_EXPR may be nested inside another TARGET_EXPR, but still
+ serve as the initializer for the same object as the outer TARGET_EXPR,
+ as in
+   A a = true ? A{} : A{};
+ so strip the inner TARGET_EXPR so we don't materialize a temporary.  */
+  if (TREE_CODE (val) == TARGET_EXPR)
+val = TARGET_EXPR_INITIAL (val);
return cxx_eval_constant_expression (ctx, val, lval, non_constant_p,
   overflow_p, jump_target);
  }
diff --git a/gcc/testsuite/g++.dg/cpp0x/constexpr-elision1.C 
b/gcc/testsuite/g++.dg/cpp0x/constexpr-elision1.C
new file mode 100644
index 000..9e7b9ec3405
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/constexpr-elision1.C
@@ -0,0 +1,16 @@
+// PR c++/105550
+// { dg-do compile { target c++11 }

Re: [PATCH 11/12] aarch64: Make bti pass generic so it can be used by the arm backend

2022-07-01 Thread Richard Earnshaw via Gcc-patches




On 28/04/2022 10:51, Andrea Corallo via Gcc-patches wrote:

Hi all,

this patch splits and restructures the aarch64 bti pass code in order
to have it usable by the arm backend as well.  These changes have no
functional impact.

Best Regards

   Andrea

gcc/Changelog

* config.gcc (aarch64*-*-*): Rename 'aarch64-bti-insert.o' into
'aarch-bti-insert.o'.
* config/aarch64/aarch64-protos.h: Remove 'aarch64_bti_enabled'
proto.
* config/aarch64/aarch64.cc (aarch_bti_enabled): Rename.
(aarch_bti_j_insn_p, aarch_pac_insn_p): New functions.
(aarch64_output_mi_thunk)
(aarch64_print_patchable_function_entry)
(aarch64_file_end_indicate_exec_stack): Update renamed function
calls to renamed functions.
* config/aarch64/t-aarch64 (aarch-bti-insert.o): Update target.
* config/arm/aarch-bti-insert.cc: New file including and
generalizing code from aarch64-bti-insert.cc.
* config/arm/aarch-common-protos.h: Update.
* config/arm/arm-passes.def: New file.



Is this patch fully stand-alone?  It adds arm-passes.def, which adds a 
reference to pass_insert_bti, but that isn't fully wired up until the 
next patch.


R.


Re: [PATCH] aarch64: Fix pure/const function attributes for intrinsics

2022-07-01 Thread Andrew Carlotti via Gcc-patches
On Fri, Jul 01, 2022 at 08:42:15AM +0200, Richard Biener wrote:
> On Thu, Jun 30, 2022 at 6:04 PM Andrew Carlotti via Gcc-patches
>  wrote:
> > diff --git a/gcc/config/aarch64/aarch64-builtins.cc 
> > b/gcc/config/aarch64/aarch64-builtins.cc
> > index 
> > e0a741ac663188713e21f457affa57217d074783..877f54aab787862794413259cd36ca0fb7bd49c5
> >  100644
> > --- a/gcc/config/aarch64/aarch64-builtins.cc
> > +++ b/gcc/config/aarch64/aarch64-builtins.cc
> > @@ -1085,9 +1085,9 @@ aarch64_get_attributes (unsigned int f, machine_mode 
> > mode)
> >if (!aarch64_modifies_global_state_p (f, mode))
> >  {
> >if (aarch64_reads_global_state_p (f, mode))
> > -   attrs = aarch64_add_attribute ("pure", attrs);
> > -  else
> > attrs = aarch64_add_attribute ("const", attrs);
> > +  else
> > +   attrs = aarch64_add_attribute ("pure", attrs);
> 
> that looks backwards.  'pure' allows read of global memory while
> 'const' does not.  Is
> aarch64_reads_global_state_p really backwards?

Oh - the thing that's backwards is my understanding of what "pure" and
"const" mean. Their meanings as GCC function attributes seem to be
approximately the opposite way round to their meanings in general usage.
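
For reference, a small sketch of the distinction as GCC function attributes, under the reading given in the review comment above ('pure' may read global memory, 'const' may not). The function names are made up, and attribute_for is only a hypothetical mirror of the selection logic being discussed, not the aarch64 back-end code:

```c
#include <stdbool.h>
#include <stddef.h>

static int scale = 3;

/* 'pure': no side effects, but the result may depend on global memory.  */
static __attribute__ ((pure)) int
scaled (int x)
{
  return x * scale;
}

/* 'const': the result depends only on the arguments.  */
static __attribute__ ((const)) int
square (int x)
{
  return x * x;
}

/* Hypothetical mirror of the attribute choice: a function that reads
   global state can be at most 'pure'; one that does not can be 'const'.  */
static const char *
attribute_for (bool modifies_global_state, bool reads_global_state)
{
  if (modifies_global_state)
    return NULL;
  return reads_global_state ? "pure" : "const";
}
```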


Re: [PATCH 12/12 V2] arm: implement bti injection

2022-07-01 Thread Richard Earnshaw via Gcc-patches




On 28/06/2022 10:21, Andrea Corallo via Gcc-patches wrote:

Hi all,

second iteration of this patch enabling the Armv8.1-M Branch Target
Identification mechanism [1].

This is achieved by using the bti pass made common with AArch64.

The pass iterates through the instructions and adds the necessary BTI
instructions at the beginning of every function and at every landing
pad targeted by indirect jumps.

Best Regards

   Andrea

[1]


gcc/ChangeLog

2022-04-07  Andrea Corallo  

* config.gcc (arm*-*-*): Add 'aarch-bti-insert.o' object.
* config/arm/arm-protos.h: Update.
* config/arm/arm.cc (aarch_bti_enabled, aarch_bti_j_insn_p)
(aarch_pac_insn_p, aarch_gen_bti_c, aarch_gen_bti_j): New
functions.
* config/arm/arm.md (bti_nop): New insn.
* config/arm/t-arm (PASSES_EXTRA): Add 'arm-passes.def'.
(aarch-bti-insert.o): New target.
* config/arm/unspecs.md (UNSPEC_BTI_NOP): New unspec.
* config/arm/aarch-bti-insert.cc (rest_of_insert_bti): Update
to verify arch compatibility.

gcc/testsuite/ChangeLog

2022-04-07  Andrea Corallo  

* gcc.target/arm/bti-1.c: New testcase.
* gcc.target/arm/bti-2.c: Likewise.


@@ -104,6 +105,14 @@ rest_of_insert_bti (void)
   rtx_insn *insn;
   basic_block bb;

+#if defined (TARGET_32BIT) || defined (TARGET_THUMB1)

See the comment about errors in response to patch 10.  I'd simply expect 
the gate function to be disabled when we can't support PAC, so we should 
never get here.



+  if (!arm_arch8)
+{
+  error ("This architecture does not support branch protection 
instructions");

+  goto exit;
+}
+#endif
+
...
+
+rtx aarch_gen_bti_c (void)
+{
+  return gen_bti_nop ();
+}
+
+rtx aarch_gen_bti_j (void)
+{
+  return gen_bti_nop ();
+}
+

Function names should start a new line... Thus:

rtx
aarch_gen_bti_c (void)

etc.

+(define_insn "bti_nop"
+  [(unspec_volatile [(const_int 0)] UNSPEC_BTI_NOP)]
+  "TARGET_THUMB2"

This isn't quite right.  We need v8-m.main as the baseline architecture 
for the NOPs to behave as NOPs.


+  "bti"
+  [(set_attr "type" "mov_reg")])
+

How to deal with architectural NOPs is an interesting question.  I think 
really, for the scheduler we need to describe each newly defined NOP 
separately, then in the scheduling descriptions we can handle all 
unimplemented NOPs by grouping them together for that architecture, 
whilst describing more accurately how to handle them on CPUs where they 
acquire a defined behaviour.


diff --git a/gcc/config.gcc b/gcc/config.gcc
index 2021bdf9d2f..004e1dfa8d8 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -353,7 +353,7 @@ arc*-*-*)
;;
 arm*-*-*)
cpu_type=arm
-   extra_objs="arm-builtins.o arm-mve-builtins.o aarch-common.o"
+	extra_objs="arm-builtins.o arm-mve-builtins.o aarch-common.o 
aarch-bti-insert.o"


--- a/gcc/config/arm/t-arm
+++ b/gcc/config/arm/t-arm
@@ -175,3 +175,13 @@ arm-d.o: $(srcdir)/config/arm/arm-d.cc
 arm-common.o: arm-cpu-cdata.h

 driver-arm.o: arm-native.h
+
+PASSES_EXTRA += $(srcdir)/config/arm/arm-passes.def

See comment on patch 11.  Perhaps the right thing to do is to move the 
hunk that adds arm-passes.def into this patch.


Re: [Patch][v5] OpenMP: Move omp requires checks to libgomp

2022-07-01 Thread Tobias Burnus

On 01.07.22 16:34, Jakub Jelinek wrote:

On Fri, Jul 01, 2022 at 03:06:05PM +0200, Tobias Burnus wrote:
[...]
Will Fortran diagnose:
subroutine foo
!$omp requires unified_shared_memory
!$omp target
!$omp end target
end subroutine foo
subroutine bar
!$omp requires reverse_offload
!$omp target
!$omp end target
end subroutine bar

or just merge it from the different namespaces?


This is done in openmp.cc during parsing. The merging you quoted (in parse.cc) 
happens
after the whole input file has been parsed and resolved. For your test case, the
following error is shown:

test.f90:1:15:

1 |  subroutine foo
  |   1
Error: Program unit at (1) has OpenMP device constructs/routines but does not 
set !$OMP REQUIRES REVERSE_OFFLOAD but other program units do
test.f90:6:14:

6 | subroutine bar
  |  1
Error: Program unit at (1) has OpenMP device constructs/routines but does not 
set !$OMP REQUIRES UNIFIED_SHARED_MEMORY but other program units do



@@ -1764,6 +1781,20 @@ input_symtab (void)

  }
  }

+static void
+omp_requires_to_name (char *buf, size_t size, unsigned int requires_mask)
+{
+  char *end = buf + size, *p = buf;
+  if (requires_mask & GOMP_REQUIRES_UNIFIED_ADDRESS)
+p += snprintf (p, end - p, "unified_address");
+  if (requires_mask & GOMP_REQUIRES_UNIFIED_SHARED_MEMORY)
+p += snprintf (p, end - p, "%sunified_shared_memory",
+   (p == buf ? "" : ", "));
+  if (requires_mask & GOMP_REQUIRES_REVERSE_OFFLOAD)
+p += snprintf (p, end - p, "%sreverse_offload",
+   (p == buf ? "" : ", "));

So, what does this print if requires_mask is 0 (or just the target used bit
set but not unified_address, unified_shared_memory nor reverse_offload)?


Well, that's what libgomp/testsuite/libgomp.c-c++-common/requires-2.c (+ 
*-2-aux.c)
tests:

/* { dg-error "OpenMP 'requires' directive with non-identical clauses in multiple compilation 
units: 'unified_shared_memory' vs. ''" "" { target *-*-* } 0 }  */

I hope the '' vs. 'unified_shared_memory' is clear, but I am open to a better 
wording if you have one.

Note that both:
  no 'omp requires'
and
  'omp requires' with other clauses (such as the atomic ones or 
dynamic_allocators)
will lead to 0. Thus, if the wording is changed, it should fit for both cases.


@@ -1810,6 +1847,54 @@ input_offload_tables (bool do_force_output)
  may be no refs to var_decl in offload LTO mode.  */
   if (do_force_output)
 varpool_node::get (var_decl)->force_output = 1;
+  tmp_decl = var_decl;
+}
+  else if (tag == LTO_symtab_edge)
+{
+  static bool error_emitted = false;
+  HOST_WIDE_INT val = streamer_read_hwi (ib);
+
+  if (omp_requires_mask == 0)
+{
+  omp_requires_mask = (omp_requires) val;
+  requires_decl = tmp_decl;
+  requires_fn = file_data->file_name;

And similarly here, if some device construct is seen but requires
directive isn't, not sure if in this version val would be 0 or something
with the TARGET_USED bit set.  In the latter case, only what is printed
for no requires or just atomic related requires is a problem, in the former
case due to the == 0 check mixing of 0 with non-zero would be ignored
but mixing of non-zero with 0 wouldn't be.


Here: 0 = "unset" in the sense that neither TARGET_USED nor USM/UA/RO was
specified. If any of those is set, we get != 0.

For mkoffload, the single results are merged and TARGET_USED is stripped,
such that it is either 0 or a combination of USM/UA/RO.
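
A rough model of the merge described here, as it appears in the quoted input_offload_tables hunk. The bit values and function names are hypothetical; only the control flow (first non-zero mask wins, any later differing value is a mismatch, TARGET_USED is stripped before writing the result for mkoffload) follows the patch text:

```c
#include <stdbool.h>

/* Hypothetical bit assignments, for illustration only.  */
enum
{
  REQ_UNIFIED_ADDRESS = 1 << 0,
  REQ_UNIFIED_SHARED_MEMORY = 1 << 1,
  REQ_REVERSE_OFFLOAD = 1 << 2,
  REQ_TARGET_USED = 1 << 3
};

/* Merge the mask VAL of one translation unit into *MERGED.
   Return false if VAL conflicts with what was recorded before.  */
static bool
merge_requires (unsigned *merged, unsigned val)
{
  if (*merged == 0)
    {
      *merged = val;
      return true;
    }
  return *merged == val;
}

/* The value handed on to mkoffload: the clauses without TARGET_USED.  */
static unsigned
mask_for_mkoffload (unsigned merged)
{
  return merged & ~(unsigned) REQ_TARGET_USED;
}
```

Note that in this model a later 0 after a non-zero mask counts as a mismatch while an initial 0 is silently replaced, which is the asymmetry raised in the review.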


+}
+  else if (omp_requires_mask != val && !error_emitted)
+{
+  char buf[64], buf2[64];

Perhaps cleaner would be to size the buffers as
sizeof ("unified_address,unified_shared_memory,reverse_offload")
64 is more, but just a wild guess and if further clauses are added later,
it might be too small.


I concur – except that ',' should be ', '.
(Likewise in libgomp/target.c)
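
A stand-alone sketch of the two points above: sizing the buffer from the longest possible string (with ', ' as separator) and producing a defined, empty result for a mask of 0. The bit values and the names requires_to_name / REQUIRES_NAME_BUFSIZE are hypothetical, and the explicit initial termination is an assumption that is not visible in the quoted hunk:

```c
#include <stdio.h>

/* Hypothetical bit assignments, for illustration only.  */
enum
{
  REQ_UNIFIED_ADDRESS = 1 << 0,
  REQ_UNIFIED_SHARED_MEMORY = 1 << 1,
  REQ_REVERSE_OFFLOAD = 1 << 2
};

/* Large enough for all three clause names plus separators and the
   terminating NUL; grows automatically if the literal is extended.  */
#define REQUIRES_NAME_BUFSIZE \
  sizeof ("unified_address, unified_shared_memory, reverse_offload")

static void
requires_to_name (char *buf, size_t size, unsigned mask)
{
  char *end = buf + size, *p = buf;

  /* Make the result well defined (an empty string) when no bit is set.  */
  if (size > 0)
    *p = '\0';
  if (mask & REQ_UNIFIED_ADDRESS)
    p += snprintf (p, (size_t) (end - p), "unified_address");
  if (mask & REQ_UNIFIED_SHARED_MEMORY)
    p += snprintf (p, (size_t) (end - p), "%sunified_shared_memory",
		   (p == buf ? "" : ", "));
  if (mask & REQ_REVERSE_OFFLOAD)
    p += snprintf (p, (size_t) (end - p), "%sreverse_offload",
		   (p == buf ? "" : ", "));
}
```

With a buffer of REQUIRES_NAME_BUFSIZE bytes no combination of the three bits can truncate, so p never moves past end.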


@@ -1821,6 +1906,18 @@ input_offload_tables (bool do_force_output)

lto_destroy_simple_input_block (file_data, LTO_section_offload_table,
   ib, data, len);
  }
+#ifdef ACCEL_COMPILER
+  char *omp_requires_file = getenv ("GCC_OFFLOAD_OMP_REQUIRES_FILE");
+  if (omp_requires_file == NULL || omp_requires_file[0] == '\0')
+fatal_error (input_location, "GCC_OFFLOAD_OMP_REQUIRES_FILE unset");
+  FILE *f = fopen (omp_requires_file, "wb");
+  if (!f)
+fatal_error (input_location, "Cannot open omp_requires file %qs",
+ omp_requires_file);
+  uint32_t req_mask = omp_requires_mask & ~OMP_REQUIRES_TARGET_USED;

Perhaps it is better to also store the TARGET_USED bit and on the library
side completely ignore values of 0.


For the compiler side, we need to distinguish no requires vs. some
requires when checking multiple TU (to distinguish it from TU which do
not use target constructs).

But for libgomp only the result counts: no require

Re: [RFC] trailing_wide_ints with runtime variable lengths

2022-07-01 Thread Aldy Hernandez via Gcc-patches
On Fri, Jul 1, 2022 at 4:58 PM Jakub Jelinek  wrote:
>
> On Fri, Jul 01, 2022 at 04:12:55PM +0200, Aldy Hernandez wrote:
> > > --- a/gcc/wide-int.h
> > > +++ b/gcc/wide-int.h
> > > @@ -1373,10 +1373,13 @@ namespace wi
> > >  : public int_traits  {};
> > >  }
> > >
> > > -/* An array of N wide_int-like objects that can be put at the end of
> > > -   a variable-sized structure.  Use extra_size to calculate how many
> > > -   bytes beyond the sizeof need to be allocated.  Use set_precision
> > > -   to initialize the structure.  */
> > > +/* A variable-lengthed array of wide_int-like objects that can be put
> > > +   at the end of a variable-sized structure.  The number of objects is
> > > +   at most N and can be set at runtime by using set_precision().
> > > +
> > > +   Use extra_size to calculate how many bytes beyond the
> > > +   sizeof need to be allocated.  Use set_precision to initialize the
> > > +   structure.  */
> > >  template 
> > >  struct GTY((user)) trailing_wide_ints
> > >  {
> > > @@ -1387,6 +1390,9 @@ private:
> > >/* The shared maximum length of each number.  */
> > >unsigned char m_max_len;
> > >
> > > +  /* The number of elements.  */
> > > +  unsigned char m_num_elements;
>
> IMNSHO you certainly don't want to change like this existing
> trailing_wide_ints, you don't want to grow unnecessarily existing
> trailing_wide_ints users (e.g. const_poly_int_def).

That's precisely what I avoided...touching existing trailing_wide_ints
users.  As I explained, there is no cost to either const_poly_int_def
or range_info_def (though I'm about to nuke the latter).  There is
some padding that is currently used by m_len[N], and I just took a
byte out to represent the run-time length.  That would affect
trailing_wide_int users that have N > 4, but there are none.  The use in
const_poly_int_def uses 2 (and range_info_def uses 3):

#define NUM_POLY_INT_COEFFS 2

struct GTY((variable_size)) const_poly_int_def {
  trailing_wide_ints <NUM_POLY_INT_COEFFS> coeffs;
};

>
> My brief understanding of wide-int.h is that in some cases stuff like this
> is implied from template parameters or exact class instantiation and in
> other cases it is present explicitly and class inheritance is used to hide
> that stuff nicely.

Yeah, it took me a while to decipher it, but I did read it :).

>
> So, you are looking for something like trailing_wide_ints but where that
> N is actually a runtime value?  Then e.g. the
>   struct {unsigned char len;} m_len[N];
> member can't work properly either, because it isn't constant size.

What my patch does is store the run-time length in the aforementioned
byte, while defaulting to N/MAX.  There is no size difference (or code
changes) for existing users.  With my change, set_precision() and
extra_size() now take a runtime parameter, but it defaults to N and is
inlined, so there is no penalty for existing users.  I benchmarked to
make sure :).
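
To make the idea concrete, here is a plain C sketch of a trailing array whose element count is chosen at run time but bounded by a compile-time maximum. It is only loosely modelled on the description above: the struct layout, field names and helpers are hypothetical and do not reproduce wide-int.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define MAX_ELEMENTS 4		/* compile-time upper bound, playing the role of N */
#define BITS_PER_WORD 64

struct trailing_ints
{
  unsigned short precision;	/* bits per element */
  unsigned char max_len;	/* words reserved per element */
  unsigned char num_elements;	/* run-time element count, <= MAX_ELEMENTS */
  unsigned char len[MAX_ELEMENTS];	/* words currently used by each element */
  long long val[];		/* trailing storage, sized at allocation time */
};

/* Bytes needed beyond sizeof (struct trailing_ints).  */
static size_t
extra_size (unsigned precision, unsigned num_elements)
{
  unsigned max_len = (precision + BITS_PER_WORD - 1) / BITS_PER_WORD;
  return (size_t) num_elements * max_len * sizeof (long long);
}

/* Allocate and initialize a structure holding NUM_ELEMENTS integers of
   PRECISION bits each; returns NULL if allocation fails.  */
static struct trailing_ints *
trailing_ints_alloc (unsigned precision, unsigned num_elements)
{
  assert (num_elements <= MAX_ELEMENTS);
  struct trailing_ints *t
    = calloc (1, sizeof *t + extra_size (precision, num_elements));
  if (!t)
    return NULL;
  t->precision = (unsigned short) precision;
  t->max_len
    = (unsigned char) ((precision + BITS_PER_WORD - 1) / BITS_PER_WORD);
  t->num_elements = (unsigned char) num_elements;
  return t;
}

/* First word of element I.  */
static long long *
element (struct trailing_ints *t, unsigned i)
{
  assert (i < t->num_elements);
  return &t->val[(size_t) i * t->max_len];
}
```

A caller that always passes MAX_ELEMENTS gets the fixed-size behaviour; a caller that passes fewer elements simply allocates less trailing storage.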

I could hack up a variable_length_wide_int for what I want, but I'd
end up duplicating a lot of the trailing_wide_int_storage, etc.
Another option would be to stream out the HOST_WIDE_INTs in the
tree_int_cst and reconstruct things myself, but that smells of
reinventing the wheel.

Is there another way of allocating an n-bit wide-int at run-time?  I'm
happy to entertain other alternatives...
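
A minimal sketch of the idea described above, assuming a simplified
layout: a template whose N is only the maximum, with the actual element
count passed at run time and defaulting to N.  The type
trailing_ints_sketch and its members are hypothetical stand-ins for
illustration, not the trailing_wide_ints code from wide-int.h, and the
size computation ignores the first element that the real structure
embeds.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for trailing_wide_ints <N>; not the GCC class.
template <int N>
struct trailing_ints_sketch
{
  std::uint16_t precision;    // bits in each element
  std::uint8_t max_len;       // 64-bit blocks reserved per element
  std::uint8_t num_elements;  // run-time element count, at most N

  static unsigned blocks_for (unsigned prec) { return (prec + 63) / 64; }

  // Existing callers omit NUM and get the compile-time maximum N.
  void set_precision (unsigned prec, unsigned num = N)
  {
    assert (num <= unsigned (N));
    precision = static_cast<std::uint16_t> (prec);
    max_len = static_cast<std::uint8_t> (blocks_for (prec));
    num_elements = static_cast<std::uint8_t> (num);
  }

  // Bytes of element storage for NUM elements of precision PREC.
  static std::size_t extra_size (unsigned prec, unsigned num = N)
  { return std::size_t (num) * blocks_for (prec) * sizeof (std::uint64_t); }
};
```

Because the extra parameter has a default, a call such as
set_precision (128) keeps its old meaning, while a caller that knows it
needs fewer elements can pass the count and allocate less.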

Aldy



Re: RFA: Another Rust demangler recursion limit

2022-07-01 Thread Jeff Law via Gcc-patches




On 7/1/2022 9:12 AM, Nick Clifton wrote:

Hi Jeff,

   [I am sending this to you directly since you seem to be the only one
   reviewing these patches].

   Hot on the heels of the fix for the recursion problem in demangle_const
   a binutils user has filed another PoC that exposes a problem in
   demangle_path_maybe_open_generics():

https://sourceware.org/bugzilla/show_bug.cgi?id=29312#c1

   I have redirected them to file a bug report with the gcc system, but in
   the hopes of getting a fix in quickly I am also attaching a patch
   here.  It just does the obvious thing of adding a recursion counter
   and limit to the function.

OK.  And yes, I wish someone else was looking at this stuff.  Rust isn't
really on my radar right now...


jeff
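
For readers unfamiliar with the technique, a recursion counter and limit
can be sketched as follows.  This is an illustration on a toy
parenthesis parser, assuming a limit of 256; parser_sketch, parse_group
and RECURSION_LIMIT_SKETCH are hypothetical names and none of this is
the rust-demangle.c code.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical parser state; DEPTH counts the active recursive calls.
struct parser_sketch
{
  const char *input;
  std::size_t pos;
  unsigned depth;
};

// Assumed limit, chosen only for the example.
static const unsigned RECURSION_LIMIT_SKETCH = 256;

// Parse zero or more "(...)" groups.  Returns false on malformed input
// or when the nesting would exceed the limit.
static bool
parse_group (parser_sketch *p)
{
  if (p->depth >= RECURSION_LIMIT_SKETCH)
    return false;               // refuse to recurse any deeper
  p->depth++;
  bool ok = true;
  while (ok && p->input[p->pos] == '(')
    {
      p->pos++;
      ok = parse_group (p) && p->input[p->pos] == ')';
      if (ok)
        p->pos++;
    }
  p->depth--;                   // balanced with the increment above
  return ok;
}

static bool
parse_sketch (const std::string &s)
{
  parser_sketch p = { s.c_str (), 0, 0 };
  return parse_group (&p) && p.pos == s.size ();
}
```

Hostile input with thousands of nested groups is then rejected after a
bounded number of frames instead of exhausting the stack.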


Re: [Patch][v5] OpenMP: Move omp requires checks to libgomp

2022-07-01 Thread Jakub Jelinek via Gcc-patches
On Fri, Jul 01, 2022 at 06:31:48PM +0200, Tobias Burnus wrote:
> This is done in openmp.cc during parsing. The merging you quoted (in
> parse.cc) happens after the whole input file has been parsed and
> resolved. For your test case, the following error is shown:
> 
> test.f90:1:15:
> 
> 1 |  subroutine foo
>   |   1
> Error: Program unit at (1) has OpenMP device constructs/routines but does not 
> set !$OMP REQUIRES REVERSE_OFFLOAD but other program units do
> test.f90:6:14:
> 
> 6 | subroutine bar
>   |  1
> Error: Program unit at (1) has OpenMP device constructs/routines but does not 
> set !$OMP REQUIRES UNIFIED_SHARED_MEMORY but other program units do

Great.

> > @@ -1764,6 +1781,20 @@ input_symtab (void)
> > >   }
> > >   }
> > > 
> > > +static void
> > > +omp_requires_to_name (char *buf, size_t size, unsigned int requires_mask)
> > > +{
> > > +  char *end = buf + size, *p = buf;
> > > +  if (requires_mask & GOMP_REQUIRES_UNIFIED_ADDRESS)
> > > +p += snprintf (p, end - p, "unified_address");
> > > +  if (requires_mask & GOMP_REQUIRES_UNIFIED_SHARED_MEMORY)
> > > +p += snprintf (p, end - p, "%sunified_shared_memory",
> > > +   (p == buf ? "" : ", "));
> > > +  if (requires_mask & GOMP_REQUIRES_REVERSE_OFFLOAD)
> > > +p += snprintf (p, end - p, "%sreverse_offload",
> > > +   (p == buf ? "" : ", "));
> > So, what does this print if requires_mask is 0 (or just the target used bit
> > set but not unified_address, unified_shared_memory nor reverse_offload)?
> 
> Well, that's what libgomp/testsuite/libgomp.c-c++-common/requires-2.c
> (+ *-2-aux.c) tests:
> 
> /* { dg-error "OpenMP 'requires' directive with non-identical clauses in 
> multiple compilation units: 'unified_shared_memory' vs. ''" "" { target *-*-* 
> } 0 }  */
> 
> I hope the '' vs. 'unified_shared_memory' is clear - but if you have a better 
> wording.

I must be missing how that works.  Because the buf in the callers is
uninitialized and this function doesn't store anything there if
requires_mask == 0.
Perhaps you're just lucky and the stack contains '\0' there?
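
One way to make the zero-mask case well defined, sketched under
assumptions: store the terminating NUL before testing any bit.  The
function name and the three bit values below are placeholders chosen
for the example; the real GOMP_REQUIRES_* constants are defined
elsewhere and may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstring>

// Placeholder bit values for illustration only.
enum
{
  REQ_UNIFIED_ADDRESS_SKETCH = 1 << 0,
  REQ_UNIFIED_SHARED_MEMORY_SKETCH = 1 << 1,
  REQ_REVERSE_OFFLOAD_SKETCH = 1 << 2
};

// Assumes BUF is large enough for all three names; std::snprintf
// returns the untruncated length, so an undersized buffer would need
// extra checks that this sketch leaves out.
static void
requires_to_name_sketch (char *buf, std::size_t size, unsigned mask)
{
  char *end = buf + size, *p = buf;
  if (size == 0)
    return;
  *p = '\0';  // defined result even when no bit below is set
  if (mask & REQ_UNIFIED_ADDRESS_SKETCH)
    p += std::snprintf (p, static_cast<std::size_t> (end - p),
                        "unified_address");
  if (mask & REQ_UNIFIED_SHARED_MEMORY_SKETCH)
    p += std::snprintf (p, static_cast<std::size_t> (end - p),
                        "%sunified_shared_memory", p == buf ? "" : ", ");
  if (mask & REQ_REVERSE_OFFLOAD_SKETCH)
    p += std::snprintf (p, static_cast<std::size_t> (end - p),
                        "%sreverse_offload", p == buf ? "" : ", ");
}
```

With the early store, a caller that passes an uninitialized stack
buffer and a mask of 0 gets an empty string rather than whatever
happened to be on the stack.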

> Note that both:
>   no 'omp requires'
> and
>   'omp requires' with other clauses (such as the atomic ones or 
> dynamic_allocators)
> will lead to 0. Thus, if the wording is changed, it should fit for both cases.

Maybe it would be better to simply use different error message for the
0 vs. non-0 case, canonicalized to non-0 vs. 0 order so that it is just
2 messages vs. 3 and wording like
"OpenMP 'requires' directive with '' clauses specified only in some 
compilation units"
note: specified here ...
note: but not here ...

> > > +  if (omp_requires_mask == 0)
> > > +{
> > > +  omp_requires_mask = (omp_requires) val;
> > > +  requires_decl = tmp_decl;
> > > +  requires_fn = file_data->file_name;
> > And similarly here, if some device construct is seen but requires
> > directive isn't, not sure if in this version val would be 0 or something
> > with the TARGET_USED bit set.  In the latter case, only what is printed
> > for no requires or just atomic related requires is a problem, in the former
> > case due to the == 0 check mixing of 0 with non-zero would be ignored
> > but mixing of non-zero with 0 wouldn't be.
> 
> Here: 0 = "unset" in the sense that neither TARGET_USE nor USM/UA/RO was
> specified. If any of those is set, we get != 0.

Ok.
> 
> For mkoffload, the single results are merged - and TARGET_USE is stripped,
> such that it is either 0 or a combination of USM/UA/RO

I'd find it clearer if we never stripped that, so that even the library knows.
The details will depend on the resolution of #3240.
Whether say declare target and no device constructs and device related API
calls etc. force it too or not.  If not, you could get 0 even if you are
actually registering something, just not target regions.
If anything that will lead to GOMP_offload_register_ver actually means
TARGET_USED, then it isn't necessary.  But even if it isn't necessary,
e.g. for backwards compatibility with GOMP_VERSION == 1 it will be easier
to have that bit in.  0 will then mean older gcc built library or binary.

> > > +}
> > > +  else if (omp_requires_mask != val && !error_emitted)
> > > +{
> > > +  char buf[64], buf2[64];
> > Perhaps cleaner would be to size the buffers as
> > sizeof ("unified_address,unified_shared_memory,reverse_offload")
> > 64 is more, but just a wild guess and if further clauses are added later,
> > it might be too small.
> 
> I concur – except that ',' should be ', '.
> (Likewise in libgomp/target.c)

Good catch.
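
The sizing suggestion, with the ", " separator, could look roughly like
this.  The names all_requires_sketch and requires_buf_sketch are
hypothetical; the point is only that sizeof on the longest possible
string (including its NUL) replaces the guessed 64.

```cpp
#include <cassert>
#include <cstring>

// Longest string the formatter can produce, separators included.
static const char all_requires_sketch[]
  = "unified_address, unified_shared_memory, reverse_offload";

// A buffer type sized from that string; sizeof counts the trailing NUL.
typedef char requires_buf_sketch[sizeof (all_requires_sketch)];
```

If a further clause name is added later, extending the literal grows
every buffer declared with this type.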

> > Is
> > c.c:
> > #pragma omp requires unified_shared_memory
> > d.c:
> > void baz (void) {
> >#pragma omp target
> >;
> > }
> > ok?
> 
> This one is *already* streamed out as it creates a symbol and entry in
> in offload_functions (baz.omp_fn.0).
> 
> The code is rather 

[og12] [committed] Port remaining OG11 patches

2022-07-01 Thread Kwok Cheung Yeung

Hello

I have committed the following forward-ports from OG11 onto the 
devel/omp/gcc-12 branch to bring OG12 up-to-parity with OG11 again.


12d14a9a255 amdgcn: libgomp plugin USM implementation
687640af27a amdgcn, openmp: Auto-detect USM mode and set HSA_XNACK
cbc3dd01de8 amdgcn: Support XNACK mode
1d4e24c9fb4 Fix gfortran.dg/gomp/num-teams-2.f90

Kwok

From 1d4e24c9fb40d4df7e742d7631a29329d5fb7c84 Mon Sep 17 00:00:00 2001
From: Tobias Burnus 
Date: Mon, 27 Jun 2022 13:26:43 +0200
Subject: [PATCH 1/4] Fix gfortran.dg/gomp/num-teams-2.f90

OG11 contrary to mainline issues an error for resolve_positive_int_expr
(-> OG11 commit a14b3f29681da1d2465e15f98b8cf8d5c64a2c3c). Update
testcase accordingly.

gcc/testsuite/
* gfortran.dg/gomp/num-teams-2.f90: Use dg-error not dg-warning.
---
 gcc/testsuite/ChangeLog.omp|  4 
 gcc/testsuite/gfortran.dg/gomp/num-teams-2.f90 | 12 ++--
 2 files changed, 10 insertions(+), 6 deletions(-)

diff --git a/gcc/testsuite/ChangeLog.omp b/gcc/testsuite/ChangeLog.omp
index 301085fba5d..99ed6f0e467 100644
--- a/gcc/testsuite/ChangeLog.omp
+++ b/gcc/testsuite/ChangeLog.omp
@@ -1,3 +1,7 @@
+2022-06-27  Tobias Burnus  
+
+   * gfortran.dg/gomp/num-teams-2.f90: Use dg-error not dg-warning.
+
 2022-05-12  Jakub Jelinek  
 
Backport from mainline:
diff --git a/gcc/testsuite/gfortran.dg/gomp/num-teams-2.f90 
b/gcc/testsuite/gfortran.dg/gomp/num-teams-2.f90
index e7814a11a5a..f3031481d4a 100644
--- a/gcc/testsuite/gfortran.dg/gomp/num-teams-2.f90
+++ b/gcc/testsuite/gfortran.dg/gomp/num-teams-2.f90
@@ -9,13 +9,13 @@ subroutine foo (i)
   !$omp teams num_teams (6 : 4)! { dg-warning "NUM_TEAMS lower 
bound at .1. larger than upper bound at .2." }
   !$omp end teams
 
-  !$omp teams num_teams (-7)   ! { dg-warning "INTEGER expression of 
NUM_TEAMS clause at .1. must be positive" }
+  !$omp teams num_teams (-7)   ! { dg-error "INTEGER expression of 
NUM_TEAMS clause at .1. must be positive" }
   !$omp end teams
 
-  !$omp teams num_teams (i : -7)   ! { dg-warning "INTEGER 
expression of NUM_TEAMS clause at .1. must be positive" }
+  !$omp teams num_teams (i : -7)   ! { dg-error "INTEGER 
expression of NUM_TEAMS clause at .1. must be positive" }
   !$omp end teams
 
-  !$omp teams num_teams (-7 : 8)   ! { dg-warning "INTEGER 
expression of NUM_TEAMS clause at .1. must be positive" }
+  !$omp teams num_teams (-7 : 8)   ! { dg-error "INTEGER 
expression of NUM_TEAMS clause at .1. must be positive" }
   !$omp end teams
 end
 
@@ -25,13 +25,13 @@ subroutine bar (i)
   !$omp target teams num_teams (6 : 4) ! { dg-warning "NUM_TEAMS lower bound 
at .1. larger than upper bound at .2." }
   !$omp end target teams
 
-  !$omp target teams num_teams (-7)! { dg-warning "INTEGER expression of 
NUM_TEAMS clause at .1. must be positive" }
+  !$omp target teams num_teams (-7)! { dg-error "INTEGER expression of 
NUM_TEAMS clause at .1. must be positive" }
   !$omp end target teams
 
-  !$omp target teams num_teams (i : -7)! { dg-warning "INTEGER 
expression of NUM_TEAMS clause at .1. must be positive" }
+  !$omp target teams num_teams (i : -7)! { dg-error "INTEGER 
expression of NUM_TEAMS clause at .1. must be positive" }
   !$omp end target teams
 
-  !$omp target teams num_teams (-7 : 8)! { dg-warning "INTEGER 
expression of NUM_TEAMS clause at .1. must be positive" }
+  !$omp target teams num_teams (-7 : 8)! { dg-error "INTEGER 
expression of NUM_TEAMS clause at .1. must be positive" }
   !$omp end target teams
 end
 end module
-- 
2.25.1

From cbc3dd01de8788587a2b641efcb838058303b5ab Mon Sep 17 00:00:00 2001
From: Andrew Stubbs 
Date: Fri, 10 Jun 2022 15:15:49 +0100
Subject: [PATCH 2/4] amdgcn: Support XNACK mode

The XNACK feature allows memory load instructions to restart safely following
a page-miss interrupt.  This is useful for shared-memory devices, like APUs,
and to implement OpenMP Unified Shared Memory.

To support the feature we must be able to set the appropriate meta-data and
set the load instructions to early-clobber.  When the port supports scheduling
of s_waitcnt instructions there will be further requirements.

gcc/ChangeLog:

* config/gcn/gcn-hsa.h (XNACKOPT): New macro.
(ASM_SPEC): Use XNACKOPT.
* config/gcn/gcn-opts.h (enum sram_ecc_type): Rename to ...
(enum hsaco_attr_type): ... this, and generalize the names.
(TARGET_XNACK): New macro.
* config/gcn/gcn-valu.md (gather_insn_1offset):
Add xnack compatible alternatives.
(gather_insn_2offsets): Likewise.
* config/gcn/gcn.c (gcn_option_override): Permit -mxnack for devices
other than Fiji.
(gcn_expand_epilogue): Remove early-clobber problems.
(output_file_start): Emit xnack attributes.
(gcn_hsa_declare_function_name): Obey -mxnack setting.
* config/gcn/gcn.md (xnac

Re: [RFC] trailing_wide_ints with runtime variable lengths

2022-07-01 Thread Jakub Jelinek via Gcc-patches
On Fri, Jul 01, 2022 at 06:47:48PM +0200, Aldy Hernandez wrote:
> > So, you are looking for something like trailing_wide_ints but where that
> > N is actually a runtime value?  Then e.g. the
> >   struct {unsigned char len;} m_len[N];
> > member can't work properly either, because it isn't constant size.
> 
> What my patch does is store the run-time length in the aforementioned
> byte, while defaulting to N/MAX.  There is no size difference (or code
> changes) for existing users.  With my change, set_precision() and
> extra_size() now take a runtime parameter, but it defaults to N and is
> inlined, so there is no penalty for existing users.  I benchmarked to
> make sure :).

So, you still use N = 3 but can sometimes store say 255 wide_ints in there?
In that case, m_len[N] provides lengths just for the first 3, no?

Anyway, you really want feedback from Richard Sandiford IMHO...

Jakub



Re: [RFC] trailing_wide_ints with runtime variable lengths

2022-07-01 Thread Aldy Hernandez via Gcc-patches
On Fri, Jul 1, 2022, 18:58 Jakub Jelinek  wrote:

> On Fri, Jul 01, 2022 at 06:47:48PM +0200, Aldy Hernandez wrote:
> > > So, you are looking for something like trailing_wide_ints but where
> that
> > > N is actually a runtime value?  Then e.g. the
> > >   struct {unsigned char len;} m_len[N];
> > > member can't work properly either, because it isn't constant size.
> >
> > What my patch does is store the run-time length in the aforementioned
> > byte, while defaulting to N/MAX.  There is no size difference (or code
> > changes) for existing users.  With my change, set_precision() and
> > extra_size() now take a runtime parameter, but it defaults to N and is
> > inlined, so there is no penalty for existing users.  I benchmarked to
> > make sure :).
>
> So, you still use N = 3 but can sometimes store say 255 wide_ints in there?
> In that case, m_len[N] provides lengths just for the first 3, no?
>

You can still say N=255 and things continue to work as they do now, since
m_len[] is still statically determined. The only difference is that before,
the size of the structure would be 2+1+255+sizeof(int) whereas now it would
be 1 more because of the byte I'm using for num_elements.

The only time this would be an issue would be for say N=253 because the
size of everything except the ints is currently 256 (2+1+253), which is 64
bit aligned, whereas with my patch it would be 257 which takes an
additional word.

This is all theoretical because there are no users of trailing wide ints
greater than 3.
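
The size arithmetic in this thread can be checked with a small helper.
This is a sketch under assumptions taken from the discussion: 2 bytes
of precision, 1 byte of maximum length, an optional 1-byte element
count, N one-byte m_len entries, and rounding up to an 8-byte word.
control_bytes_sketch is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>

// Bytes of control data ahead of the wide-int blocks, rounded up to a
// 64-bit word.  WITH_COUNT_BYTE models the proposed m_num_elements.
static std::size_t
control_bytes_sketch (unsigned n, bool with_count_byte)
{
  std::size_t raw = 2 + 1 + (with_count_byte ? 1 : 0) + n;
  return (raw + 7) / 8 * 8;
}
```

Under these assumptions N=2 and N=3 stay at 8 bytes with or without the
count byte, N=5 grows from 8 to 16, N=253 grows from 256 to 264, and
N=12 with the count byte is exactly 16.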

I guess what I'm really doing is changing the semantics of
trailing_wide_ints<N> to trailing_wide_ints<MAX>, and what used to be the
number of elements is determined at run time, but the default is MAX so
everything continues to work the same for the current users.

Does that make sense?

Aldy


> Anyway, you really want feedback from Richard Sandiford IMHO...
>
> Jakub
>
>


Re: [RFC] trailing_wide_ints with runtime variable lengths

2022-07-01 Thread Richard Sandiford via Gcc-patches
Aldy Hernandez via Gcc-patches  writes:
> Currently global ranges are stored in SSA_NAME_RANGE_INFO as a pair of
> wide_int-like objects along with the nonzero bits.  We frequently lose
> precision when streaming out our higher resolution iranges.  The plan
> was always to store the full irange between passes.  However, as was
> originally discussed eons ago:
>
>   https://gcc.gnu.org/pipermail/gcc-patches/2017-May/475139.html
>
> ...we need a memory efficient way of saving iranges, preferably using
> the trailing_wide_ints idiom.
>
> The problem with doing so is that trailing_wide_ints assume a
> compile-time specified number of elements.  For irange, we need to
> determine the size at run-time.
>
> One solution is to adapt trailing_wide_ints such that N is the maximum
> number of elements allowed, and allow setting the actual number at
> run-time (defaulting to N).  The attached patch does this, while
> requiring no changes to existing users.
>
> It uses a byte to store the number of elements in the
> trailing_wide_ints control word.  The control word is currently a
> 16-bit precision, an 8-bit max-length, and the rest is used for
> m_len[N].  On a 64-bit architecture, this allows for 5 elements in
> m_len without having to use an extra word.  With this patch, m_len[]
> would be smaller by one byte (4) before consuming the padding.  This
> shouldn't be a problem as the only users of trailing_wide_ints use N=2
> for NUM_POLY_INT_COEFFS in aarch64, and N=3 for range_info_def.
>
> For irange, my plan is to use one more word to fit a maximum of 12
> elements (the above 4 plus 8 more).  This would allow for 6 pairs of
> sub-ranges which would be more than adequate for our needs.  In
> previous tests we found that 99% of ranges fit within 3-4 pairs.  More
> precisely, this would allow for 5 pairs, plus the nonzero bits, plus a
> spare wide-int for future development.
>
> Ultimately this means that streaming an irange would consume one more
> word than what we currently do for range_info_def.  IMO this is a nice
> trade-off considering we started storing a slew of wide-ints directly
> ;-).
>
> I'm not above rolling an altogether different approach, but would
> prefer to use the existing trailing infrastructure since it's mostly
> what we need.
>
> Thoughts?
>
> p.s. Tested and benchmarked on x86-64 Linux.  There was no discernible
> performance change in our benchmark suite.

Thanks for the discussion downthread.  I had some of the same questions
as Jakub, but you've answered them :-)

> gcc/ChangeLog:
>
>   * wide-int.h (struct trailing_wide_ints): Add m_num_elements.
>   (trailing_wide_ints::set_precision): Add num_elements argument.
>   (trailing_wide_ints::extra_size): Same.
> ---
>  gcc/wide-int.h | 42 +-
>  1 file changed, 29 insertions(+), 13 deletions(-)
>
> diff --git a/gcc/wide-int.h b/gcc/wide-int.h
> index 8041b6104f9..f68ccf0a0c5 100644
> --- a/gcc/wide-int.h
> +++ b/gcc/wide-int.h
> @@ -1373,10 +1373,13 @@ namespace wi
>  : public int_traits  {};
>  }
>  
> -/* An array of N wide_int-like objects that can be put at the end of
> -   a variable-sized structure.  Use extra_size to calculate how many
> -   bytes beyond the sizeof need to be allocated.  Use set_precision
> -   to initialize the structure.  */
> +/* A variable-lengthed array of wide_int-like objects that can be put

s/lengthed/length/

> +   at the end of a variable-sized structure.  The number of objects is
> +   at most N and can be set at runtime by using set_precision().
> +
> +   Use extra_size to calculate how many bytes beyond the
> +   sizeof need to be allocated.  Use set_precision to initialize the
> +   structure.  */
>  template 
>  struct GTY((user)) trailing_wide_ints
>  {
> @@ -1387,6 +1390,9 @@ private:
>/* The shared maximum length of each number.  */
>unsigned char m_max_len;
>  
> +  /* The number of elements.  */
> +  unsigned char m_num_elements;
> +
>/* The current length of each number.
>   Avoid char array so the whole structure is not a typeless storage
>   that will, in turn, turn off TBAA on gimple, trees and RTL.  */
> @@ -1399,12 +1405,15 @@ private:
>  public:
>typedef WIDE_INT_REF_FOR (trailing_wide_int_storage) const_reference;
>  
> -  void set_precision (unsigned int);
> +  void set_precision (unsigned int precision, unsigned int num_elements = N);
>unsigned int get_precision () const { return m_precision; }
> +  unsigned int num_elements () const { return m_num_elements; }
>trailing_wide_int operator [] (unsigned int);
>const_reference operator [] (unsigned int) const;
> -  static size_t extra_size (unsigned int);
> -  size_t extra_size () const { return extra_size (m_precision); }
> +  static size_t extra_size (unsigned int precision,
> + unsigned int num_elements = N);
> +  size_t extra_size () const { return extra_size (m_precision,
> +  

Re: [RFC] trailing_wide_ints with runtime variable lengths

2022-07-01 Thread Jakub Jelinek via Gcc-patches
On Fri, Jul 01, 2022 at 07:43:28PM +0200, Aldy Hernandez wrote:
> You can still say N=255 and things continue to work as they do now, since
> m_len[] is still statically determined. The only difference is that before,
> the size of the structure would be 2+1+255+sizeof(int) whereas now it would
> be 1 more because of the byte I'm using for num_elements.

So, what N do you want to use for SSA_NAME_RANGE_INFO?
N=255 wouldn't be very space efficient especially if the common case is a
single range or two.
For such cases making e.g. m_len not an embedded array, but pointer to
somewhere after the HOST_WIDE_INT array in the same allocation would be
better.
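
A rough sketch of that alternative layout, under assumptions: one
allocation holding a small header, then the 64-bit blocks, then the
per-element lengths after the blocks.  Instead of storing a pointer it
computes the location of the lengths from the header, which is a
variation on the suggestion.  range_storage_sketch and its helpers are
hypothetical and not GCC code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

struct range_storage_sketch
{
  std::uint16_t precision;
  std::uint8_t max_len;       // 64-bit blocks per element
  std::uint8_t num_elements;  // run-time element count
  // Followed in the same allocation by:
  //   std::uint64_t vals[num_elements * max_len];
  //   unsigned char lens[num_elements];

  // The blocks start at the first 8-byte boundary after the header.
  static std::size_t vals_offset () { return 8; }
  static std::size_t lens_offset (unsigned n, unsigned max_len)
  { return vals_offset () + std::size_t (n) * max_len * sizeof (std::uint64_t); }
  static std::size_t alloc_size (unsigned n, unsigned max_len)
  { return lens_offset (n, max_len) + n; }

  std::uint64_t *vals ()
  {
    return reinterpret_cast<std::uint64_t *>
      (reinterpret_cast<unsigned char *> (this) + vals_offset ());
  }
  unsigned char *lens ()
  {
    return reinterpret_cast<unsigned char *> (this)
           + lens_offset (num_elements, max_len);
  }
};

static_assert (sizeof (range_storage_sketch) <= 8, "header fits in one word");

// Allocate zeroed storage for N elements of PRECISION bits each.
static range_storage_sketch *
alloc_range_storage_sketch (unsigned precision, unsigned n)
{
  unsigned max_len = (precision + 63) / 64;
  void *mem = std::calloc (1, range_storage_sketch::alloc_size (n, max_len));
  if (!mem)
    return nullptr;
  range_storage_sketch *s = static_cast<range_storage_sketch *> (mem);
  s->precision = static_cast<std::uint16_t> (precision);
  s->max_len = static_cast<std::uint8_t> (max_len);
  s->num_elements = static_cast<std::uint8_t> (n);
  return s;
}
```

With this layout the cost per element is max_len blocks plus one length
byte, so a single range or two pays only for what it stores, whatever
the maximum is.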

Jakub



Re: [PATCH] sanitizer: Fix hwasan related option conflicts [PR106132]

2022-07-01 Thread Jeff Law via Gcc-patches




On 6/29/2022 6:04 AM, Martin Liška wrote:
Split report_conflicting_sanitizer_options(..., SANITIZE_ADDRESS | 
SANITIZE_HWADDRESS)

call into 2 calls as we don't have any option that would be
address+hwaddress (that conflicts as well).

PR sanitizer/106132

gcc/ChangeLog:

* opts.cc (finish_options): Use 2 calls to
    report_conflicting_sanitizer_options.

gcc/testsuite/ChangeLog:

* c-c++-common/hwasan/arguments-3.c: Cover new ICE.

OK
jeff



Re: Ping^2: [PATCH v2] diagnostics: Honor #pragma GCC diagnostic in the preprocessor [PR53431]

2022-07-01 Thread Jason Merrill via Gcc-patches

On 6/29/22 12:59, Jason Merrill wrote:

On 6/23/22 13:03, Lewis Hyatt via Gcc-patches wrote:

Hello-

https://gcc.gnu.org/pipermail/gcc-patches/2022-May/595556.html
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53431#c49

Would a C++ maintainer have some time to take a look at this patch
please? I feel like the PR is still worth resolving. If this doesn't
seem like a good way, I am happy to try another -- would really
appreciate any feedback. Thanks!


Thanks for your persistence, I'll take a look now.

Incidentally, when pinging it's often useful to ping someone from 
MAINTAINERS directly, as well as the list.  I think your last ping got 
eaten by some trouble Red Hat email was having at the time.


The cp_token_is_module_directive cleanup is OK.


+  bool skip_this_pragma;


This member seems to be equivalent to
  in_pragma && !should_output_pragmas ()
Maybe it could be a member function instead of a data member?
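
A sketch of that suggestion, with hypothetical names: the struct
pragma_state_sketch and its output_pragmas field stand in for whatever
state the patch keeps and for the result of should_output_pragmas ().

```cpp
#include <cassert>

struct pragma_state_sketch
{
  bool in_pragma;
  bool output_pragmas;  // stand-in for should_output_pragmas ()

  // Derived on demand, so it cannot drift out of sync with the
  // two values it depends on.
  bool skip_this_pragma () const
  { return in_pragma && !output_pragmas; }
};
```

The design point is that a computed member needs no separate updates
wherever in_pragma changes.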

More soon.


Looks good, just a few minor comments:


+  PD_KIND_INVALID,
+  PD_KIND_PUSH,
+  PD_KIND_POP,
+  PD_KIND_IGNORED_ATTRIBUTES,
+  PD_KIND_DIAGNOSTIC,


The PD_KIND_ prefix seems unnecessary for a scoped enum.
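
For illustration, assuming the enum is made an enum class: the
enumerator names are already qualified by the enum's own name at every
use, so the prefix adds nothing.  pd_kind_sketch and
pd_kind_name_sketch are hypothetical names, not the ones in the patch.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical scoped version of the kinds listed above.
enum class pd_kind_sketch
{
  invalid,
  push,
  pop,
  ignored_attributes,
  diagnostic
};

static const char *
pd_kind_name_sketch (pd_kind_sketch kind)
{
  switch (kind)
    {
    case pd_kind_sketch::invalid: return "invalid";
    case pd_kind_sketch::push: return "push";
    case pd_kind_sketch::pop: return "pop";
    case pd_kind_sketch::ignored_attributes: return "ignored_attributes";
    case pd_kind_sketch::diagnostic: return "diagnostic";
    }
  return "invalid";
}
```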


+/* When preprocessing only, pragma_lex() is not available, so obtain the tokens
+   directly from libcpp.  */
+static void
+pragma_diagnostic_lex_pp (pragma_diagnostic_data *result)


Hmm, we could make a temporary lexer, but I guess this is short enough 
that the duplication is OK.



+/* Similar, for the portion of a diagnostic pragma that was parsed
+   internally and so not seen by our token streamer.  */


Can we rewind after parsing so that the token streamer sees it?


+  if (early && arg)
+{
+  /* This warning is needed because of PR71603 - popping the diagnostic
+state does not currently reset changes to option arguments, only
+changes to the option dispositions.  */
+  warning_at (data.loc_option, OPT_Wpragmas,
+ "a diagnostic pragma attempting to modify a preprocessor"
+ " option argument may not work as expected");
+}


Maybe only warn if we subsequently see a pop?


+/* Handle #pragma gcc diagnostic, which needs to be done during preprocessing
+   for the case of preprocessing-related diagnostics.  */
+static void
+cp_lexer_handle_early_pragma (cp_lexer *lexer)


Let's mention in the comment that this is called before appending the 
CPP_PRAGMA_EOL to the lexer buffer.


Jason



Re: [pushed] c++: Include -Woverloaded-virtual in -Wall [PR87729]

2022-07-01 Thread Stephan Bergmann via Gcc-patches

On 6/25/22 00:26, Jason Merrill via Gcc-patches wrote:

This seems like a good warning to have in -Wall, as requested.  But as
pointed out in PR20423, some users want a warning only when a derived
function doesn't override any base function.  So let's put that lesser
version in -Wall (and -Woverloaded-virtual=1) while leaving the semantics
for the existing option the same.


This now causes


$ cat test.cc
struct S1 {};
struct S2: S1 { virtual ~S2(); };
struct S3 { virtual ~S3(); };
struct S4: S2, S3 { virtual ~S4(); };



$ g++ -Woverloaded-virtual -fsyntax-only test.cc
test.cc:3:21: warning: ‘virtual S3::~S3()’ was hidden [-Woverloaded-virtual=]
3 | struct S3 { virtual ~S3(); };
  | ^
test.cc:4:29: note:   by ‘virtual S4::~S4()’
4 | struct S4: S2, S3 { virtual ~S4(); };
  | ^
test.cc:3:21: warning: ‘virtual S3::~S3()’ was hidden [-Woverloaded-virtual=]
3 | struct S3 { virtual ~S3(); };
  | ^
test.cc:4:29: note:   by ‘virtual S4::~S4()’
4 | struct S4: S2, S3 { virtual ~S4(); };
  | ^
test.cc:3:21: warning: ‘virtual S3::~S3()’ was hidden [-Woverloaded-virtual=]
3 | struct S3 { virtual ~S3(); };
  | ^
test.cc:4:29: note:   by ‘virtual S4::~S4()’
4 | struct S4: S2, S3 { virtual ~S4(); };
  | ^




Re: [RFC] trailing_wide_ints with runtime variable lengths

2022-07-01 Thread Aldy Hernandez via Gcc-patches
On Fri, Jul 1, 2022 at 8:53 PM Jakub Jelinek  wrote:
>
> On Fri, Jul 01, 2022 at 07:43:28PM +0200, Aldy Hernandez wrote:
> > You can still say N=255 and things continue to work as they do now, since
> > m_len[] is still statically determined. The only difference is that before,
> > the size of the structure would be 2+1+255+sizeof(int) whereas now it would
> > be 1 more because of the byte I'm using for num_elements.
>
> So, what N do you want to use for SSA_NAME_RANGE_INFO?
> N=255 wouldn't be very space efficient especially if the common case is a
> single range or two.
> For such cases making e.g. m_len not an embedded array, but pointer to
> somewhere after the HOST_WIDE_INT array in the same allocation would be
> better.

As I mentioned in my original post, 12.  This means that I'm taking
the 4 bytes that are left over from the current padding plus 8
(64-bits).  My trailing wide int structure for SSA_NAME_RANGE_INFO
will be one word larger than what is currently there.  But we'll be
able to store up to 5 pairs plus one for the nonzero bits plus one for
future development (5*2 + 1 + 1 = 12), all without going over the 64
bit alignment.

This is a theoretical max, in reality as I mentioned, 99% of ranges
calculated in infinite precision by the ranger fit into 3-4 pairs.

Aldy



Go patch committed: Avoid C++20 keyword requires

2022-07-01 Thread Ian Lance Taylor via Gcc-patches
This patch to the Go frontend renames "requires" to "needs" to avoid
the C++20 keyword.  Bootstrapped on x86_64-pc-linux-gnu.  Committed to
mainline.
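
As background, a small sketch of why the rename is needed: in C++20
"requires" is a keyword, so a declaration such as
"std::vector<std::string> requires;" no longer parses, while "needs" is
an ordinary identifier.  The function collect_needs_sketch below is
hypothetical and unrelated to the Go frontend sources.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Record NAME in *NEEDS.  The parameter would have been spelled
// "requires" before C++20 reserved that word.
static bool
collect_needs_sketch (const std::string &name,
                      std::vector<std::string> *needs)
{
  if (name.empty ())
    return false;
  needs->push_back (name);
  return true;
}
```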

Ian
9d44418664ec8c3e59365901e3ec02e488d9e01c
diff --git a/gcc/go/gofrontend/MERGE b/gcc/go/gofrontend/MERGE
index 0d49e9e70c6..65f64e0fbfb 100644
--- a/gcc/go/gofrontend/MERGE
+++ b/gcc/go/gofrontend/MERGE
@@ -1,4 +1,4 @@
-548720bca6bff21ebc9aba22249d9ce45bbd90c7
+ac438edc5335f69c95df9342f43712ad2f61ad66
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
diff --git a/gcc/go/gofrontend/gogo.cc b/gcc/go/gofrontend/gogo.cc
index e13df0da22b..67b91fab4ca 100644
--- a/gcc/go/gofrontend/gogo.cc
+++ b/gcc/go/gofrontend/gogo.cc
@@ -5302,16 +5302,16 @@ Gogo::write_c_header()
   Named_object* no = types.front();
   types.pop_front();
 
-  std::vector requires;
+  std::vector needs;
   std::vector declare;
-  if (!no->type_value()->struct_type()->can_write_to_c_header(&requires,
+  if (!no->type_value()->struct_type()->can_write_to_c_header(&needs,
  &declare))
continue;
 
   bool ok = true;
   for (std::vector::const_iterator pr
-= requires.begin();
-  pr != requires.end() && ok;
+= needs.begin();
+  pr != needs.end() && ok;
   ++pr)
{
  for (std::list::const_iterator pt = types.begin();
@@ -5342,10 +5342,10 @@ Gogo::write_c_header()
  if (*pd == no)
continue;
 
- std::vector drequires;
+ std::vector dneeds;
  std::vector ddeclare;
  if (!(*pd)->type_value()->struct_type()->
- can_write_to_c_header(&drequires, &ddeclare))
+ can_write_to_c_header(&dneeds, &ddeclare))
continue;
 
  bool done = false;
diff --git a/gcc/go/gofrontend/types.cc b/gcc/go/gofrontend/types.cc
index e82be6840aa..4995283bb60 100644
--- a/gcc/go/gofrontend/types.cc
+++ b/gcc/go/gofrontend/types.cc
@@ -6967,7 +6967,7 @@ Struct_type::do_import(Import* imp)
 
 bool
 Struct_type::can_write_to_c_header(
-std::vector* requires,
+std::vector* needs,
 std::vector* declare) const
 {
   const Struct_field_list* fields = this->fields_;
@@ -6978,7 +6978,7 @@ Struct_type::can_write_to_c_header(
p != fields->end();
++p)
 {
-  if (!this->can_write_type_to_c_header(p->type(), requires, declare))
+  if (!this->can_write_type_to_c_header(p->type(), needs, declare))
return false;
   if (Gogo::message_name(p->field_name()) == "_")
sinks++;
@@ -6993,7 +6993,7 @@ Struct_type::can_write_to_c_header(
 bool
 Struct_type::can_write_type_to_c_header(
 const Type* t,
-std::vector* requires,
+std::vector* needs,
 std::vector* declare) const
 {
   t = t->forwarded();
@@ -7027,13 +7027,13 @@ Struct_type::can_write_type_to_c_header(
   return true;
 
 case TYPE_STRUCT:
-  return t->struct_type()->can_write_to_c_header(requires, declare);
+  return t->struct_type()->can_write_to_c_header(needs, declare);
 
 case TYPE_ARRAY:
   if (t->is_slice_type())
return true;
   return this->can_write_type_to_c_header(t->array_type()->element_type(),
- requires, declare);
+ needs, declare);
 
 case TYPE_NAMED:
   {
@@ -7049,10 +7049,10 @@ Struct_type::can_write_type_to_c_header(
// We will accept empty struct fields, but not print them.
if (t->struct_type()->total_field_count() == 0)
  return true;
-   requires->push_back(no);
-   return t->struct_type()->can_write_to_c_header(requires, declare);
+   needs->push_back(no);
+   return t->struct_type()->can_write_to_c_header(needs, declare);
  }
-   return this->can_write_type_to_c_header(t->base(), requires, declare);
+   return this->can_write_type_to_c_header(t->base(), needs, declare);
   }
 
 case TYPE_CALL_MULTIPLE_RESULT:
@@ -7150,9 +7150,9 @@ Struct_type::write_field_to_c_header(std::ostream& os, 
const std::string& name,
 
 case TYPE_POINTER:
   {
-   std::vector requires;
+   std::vector needs;
std::vector declare;
-   if (!this->can_write_type_to_c_header(t->points_to(), &requires,
+   if (!this->can_write_type_to_c_header(t->points_to(), &needs,
  &declare))
  os << "void*";
else


Re: [RFC] trailing_wide_ints with runtime variable lengths

2022-07-01 Thread Aldy Hernandez via Gcc-patches
BTW, I don't know if it got lost in all my patches, but we already
have an irange allocator that given an irange, returns a chunk of
memory holding a clone of that irange squished into its minimum
representable pairs (see vrange_allocator and friends).  So we won't
ever be storing 255 or something equally absurd like I had proposed
years ago :).  We'll be storing the smallest representable range
inside a trailing_wide_int.

Aldy

On Fri, Jul 1, 2022 at 10:31 PM Aldy Hernandez  wrote:
>
> On Fri, Jul 1, 2022 at 8:53 PM Jakub Jelinek  wrote:
> >
> > On Fri, Jul 01, 2022 at 07:43:28PM +0200, Aldy Hernandez wrote:
> > > You can still say N=255 and things continue to work as they do now, since
> > > m_len[] is still statically determined. The only difference is that 
> > > before,
> > > the size of the structure would be 2+1+255+sizeof(int) whereas now it 
> > > would
> > > be 1 more because of the byte I'm using for num_elements.
> >
> > So, what N do you want to use for SSA_NAME_RANGE_INFO?
> > N=255 wouldn't be very space efficient especially if the common case is a
> > single range or two.
> > For such cases making e.g. m_len not an embedded array, but pointer to
> > somewhere after the HOST_WIDE_INT array in the same allocation would be
> > better.
>
> As I mentioned in my original post, 12.  This means that I'm taking
> the 4 bytes that are left over from the current padding plus 8
> (64-bits).  My trailing wide int structure for SSA_NAME_RANGE_INFO
> will be one word larger than what is currently there.  But we'll be
> able to store up to 5 pairs plus one for the nonzero bits plus one for
> future development (5*2 + 1 + 1 = 12), all without going over the 64
> bit alignment.
>
> This is a theoretical max, in reality as I mentioned, 99% of ranges
> calculated in infinite precision by the ranger fit into 3-4 pairs.
>
> Aldy



[PATCH] tree-sra: Fix union handling in build_reconstructed_reference (PR 105860)

2022-07-01 Thread Martin Jambor
Hi,

As the testcase in PR 105860 shows, the code that tries to re-use the
handled_component chains in SRA can be horribly confused by unions,
where it thinks it has found a compatible structure under which it can
chain the references, but in fact it found the type it was looking
for elsewhere in a union and generated a write to a completely wrong
part of an aggregate.

I don't remember whether the plan was to support unions at all in
build_reconstructed_reference but it can work, to an extent, if we
make sure that we start the search only outside the outermost union,
which is what the patch does (and the extra testcase verifies).
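
As a rough illustration of the search described above, the following sketch models a reference chain as a linked list instead of GCC trees. The struct and function names are hypothetical; the loop only mirrors the shape of the one added to build_reconstructed_reference in the patch below and is not the real implementation.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a handled_component chain such as
   rv.U.var_1._s2._s1, linked from the outermost component inward.  */
struct ref
{
  const char *name;
  bool operand_is_union;  /* is the type of the inner operand a union?  */
  struct ref *operand;    /* inner reference, NULL at the base object  */
};

/* Walk toward the base and remember the last component whose operand
   has union type, i.e. the component just below the outermost union.
   Without any union the original expression is returned.  */
static const struct ref *
start_below_outermost_union (const struct ref *expr)
{
  const struct ref *start = expr;
  while (expr->operand != NULL)
    {
      if (expr->operand_is_union)
        start = expr;
      expr = expr->operand;
    }
  return start;
}
```

For a chain modelling rv.U.var_1._s2._s1 the function returns the var_1 component, so the type search can no longer wander into the sibling member var_2 of the same union.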

Bootstrapped and tested on x86_64-linux.  OK for trunk and then for
release branches?

Thanks,

Martin


gcc/ChangeLog:

2022-07-01  Martin Jambor  

PR tree-optimization/105860
* tree-sra.cc (build_reconstructed_reference): Start expr
traversal only just below the outermost union.

gcc/testsuite/ChangeLog:

2022-07-01  Martin Jambor  

PR tree-optimization/105860
* gcc.dg/tree-ssa/alias-access-path-13.c: New test.
* gcc.dg/tree-ssa/pr105860.c: Likewise.
---
 .../gcc.dg/tree-ssa/alias-access-path-13.c| 31 +
 gcc/testsuite/gcc.dg/tree-ssa/pr105860.c  | 63 +++
 gcc/tree-sra.cc   | 13 +++-
 3 files changed, 106 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/alias-access-path-13.c
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr105860.c

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/alias-access-path-13.c b/gcc/testsuite/gcc.dg/tree-ssa/alias-access-path-13.c
new file mode 100644
index 000..e502a97bc75
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/alias-access-path-13.c
@@ -0,0 +1,31 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-fre1" } */
+
+struct inn
+{
+  int val;
+};
+
+union foo
+{
+  struct inn inn;
+  long int baz;
+} *fooptr;
+
+struct bar
+{
+  union foo foo;
+  int val2;
+} *barptr;
+
+int
+test ()
+{
+  union foo foo;
+  foo.inn.val = 0;
+  barptr->val2 = 123;
+  *fooptr = foo;
+  return barptr->val2;
+}
+
+/* { dg-final { scan-tree-dump-times "return 123" 1 "fre1"} } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr105860.c b/gcc/testsuite/gcc.dg/tree-ssa/pr105860.c
new file mode 100644
index 000..77bcb4a6739
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr105860.c
@@ -0,0 +1,63 @@
+/* { dg-do run } */
+/* { dg-options "-O1" } */
+
+struct S1  {
+unsigned int _0;
+unsigned int _1;
+} ;
+struct S2  {
+struct S1 _s1;
+unsigned long _x2;
+} ;
+
+struct ufld_type1  {
+unsigned int _u1t;
+struct S2 _s2;
+} ;
+
+struct ufld_type2  {
+unsigned int _u2t;
+struct S1 _s1;
+} ;
+struct parm_type {
+union {
+struct ufld_type1 var_1;
+struct ufld_type2 var_2;
+} U;
+};
+
+struct parm_type  bad_function( struct parm_type arg0 )
+{
+struct parm_type rv;
+struct S2 var4;
+switch( arg0.U.var_2._u2t ) {
+case 4294967041:
+var4._s1 = arg0.U.var_1._s2._s1;
+rv.U.var_1._u1t = 4294967041;
+rv.U.var_1._s2 = var4;
+break;
+case 4294967043:
+rv.U.var_2._u2t = 4294967043;
+rv.U.var_2._s1 = arg0.U.var_2._s1;
+break;
+default:
+break;
+}
+return rv;
+}
+
+int main() {
+struct parm_type val;
+struct parm_type out;
+val.U.var_2._u2t = 4294967043;
+val.U.var_2._s1._0 = 0x01010101;
+val.U.var_2._s1._1 = 0x02020202;
+out = bad_function(val);
+   if (val.U.var_2._u2t != 4294967043)
+ __builtin_abort ();
+if (out.U.var_2._s1._0 != 0x01010101)
+ __builtin_abort ();
+if (val.U.var_2._s1._1 != 0x02020202 )
+ __builtin_abort ();
+   return 0;
+}
diff --git a/gcc/tree-sra.cc b/gcc/tree-sra.cc
index 081c51b58a4..099e8dbe873 100644
--- a/gcc/tree-sra.cc
+++ b/gcc/tree-sra.cc
@@ -1667,7 +1667,18 @@ build_ref_for_offset (location_t loc, tree base, poly_int64 offset,
 static tree
 build_reconstructed_reference (location_t, tree base, struct access *model)
 {
-  tree expr = model->expr, prev_expr = NULL;
+  tree expr = model->expr;
+  /* We have to make sure to start just below the outermost union.  */
+  tree start_expr = expr;
+  while (handled_component_p (expr))
+{
+  if (TREE_CODE (TREE_TYPE (TREE_OPERAND (expr, 0))) == UNION_TYPE)
+   start_expr = expr;
+  expr = TREE_OPERAND (expr, 0);
+}
+
+  expr = start_expr;
+  tree prev_expr = NULL_TREE;
   while (!types_compatible_p (TREE_TYPE (expr), TREE_TYPE (base)))
 {
   if (!handled_component_p (expr))
-- 
2.36.1



Re: [Patch][v5] OpenMP: Move omp requires checks to libgomp

2022-07-01 Thread Tobias Burnus

Updated version attached. I hope I got everything right, but I am starting
to get tired, so I am not 100% sure.

On 01.07.22 18:55, Jakub Jelinek wrote:

Perhaps you're just lucky and the stack contains '\0' there?

Probably.

Maybe it would be better to simply use different error message for the
0 vs. non-0 case,

Done so.

For mkoffload, the single results are merged - and TARGET_USE is stripped,
such that it is either 0 or a combination of USM/UA/RO

I'd find it clearer if we never stripped that, so that even the library knows.

I have done so – and I concur that the check works then better in
libgomp as well.

Pedantically reading current standard probably yes, but perhaps again
something to be discussed.  The question is what the requires directive
in that case would do, nothing at all as there are no device constructs
etc.?

Isn't there a device construct – which happens to be empty?

In d.c there is.  But in c.c there isn't.
So, the question is if the directive in c.c is just completely ignored
(ok, aside from semantic checking) or if it should mean that if it is
specified there, it must be specified elsewhere where device constructs etc.
are used too.


Good question. The current code follows the wording of the spec and
ignores it. I think that's fine but still feels a bit odd.

Question: If it is not the same, should there just be a message to
stderr (gomp_error) or should libgomp abort (gomp_fatal)?

I'd say gomp_fatal.

Done so - it makes life easier.

Tobias
-
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 
München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas 
Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht 
München, HRB 106955
OpenMP: Move omp requires checks to libgomp

Handle reverse_offload, unified_address, and unified_shared_memory
requirements in libgomp by saving them alongside the offload table.
When the device lto1 runs, it extracts the data for mkoffload. The
latter then passes the value on to GOMP_offload_register_ver.

lto1 (either the host one, with -flto [+ ENABLE_OFFLOADING], or the
offload-device lto1) also does the consistency check, erroring out
when the 'omp requires' clause use is inconsistent.

For all in-principle supported devices, if a requirement cannot be fulfilled,
the device is excluded from the (supported) devices list. Currently, none of
those requirements are marked as supported for any of the non-host devices.
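
To make the flow above more concrete, here is a minimal sketch of a requirements mask, a consistency check across translation units, and the device filtering. The enumerator values and both function names are assumptions for illustration; the real GOMP_REQUIRES_* constants are defined by the patch in gomp-constants.h, and the real checks in lto-cgraph.cc and libgomp may differ in detail.

```c
/* Hypothetical bit values, not the ones from gomp-constants.h.  */
enum
{
  REQ_UNIFIED_ADDRESS = 1u << 0,
  REQ_UNIFIED_SHARED_MEMORY = 1u << 1,
  REQ_REVERSE_OFFLOAD = 1u << 2,
  REQ_TARGET_USED = 1u << 3
};

/* Merge the mask of one translation unit into *MERGED.  Units without
   device constructs (no REQ_TARGET_USED) are ignored, as discussed in
   the thread.  Returns 0 on success and -1 on inconsistent clauses.  */
static int
merge_requires (unsigned *merged, unsigned tu_mask)
{
  if ((tu_mask & REQ_TARGET_USED) == 0)
    return 0;
  if (*merged == 0)
    {
      *merged = tu_mask;
      return 0;
    }
  return *merged == tu_mask ? 0 : -1;
}

/* Plugin side, simplified: while no requirement is supported, a
   present device is reported as unusable (-1) whenever any clause
   bit is set.  */
static int
usable_devices (int present, unsigned mask)
{
  unsigned clauses = mask & ~(unsigned) REQ_TARGET_USED;
  return (present > 0 && clauses != 0) ? -1 : present;
}
```

In this sketch a mismatch simply returns -1; per the discussion above, the library side reports such a mismatch with gomp_fatal.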

gcc/c/ChangeLog:

	* c-parser.cc (c_parser_omp_target_data, c_parser_omp_target_update,
	c_parser_omp_target_enter_data, c_parser_omp_target_exit_data): Set
	OMP_REQUIRES_TARGET_USED.
	(c_parser_omp_requires): Remove sorry.

gcc/ChangeLog:

	* config/gcn/mkoffload.cc (process_asm): Write '#include '.
	(process_obj): Pass omp_requires_mask to GOMP_offload_register_ver.
	(main): Ask lto1 to obtain omp_requires_mask and pass it on.
	* config/nvptx/mkoffload.cc (process, main): Likewise.
	* lto-cgraph.cc (omp_requires_to_name): New.
	(input_offload_tables): Save omp_requires_mask.
	(output_offload_tables): Read it, check for consistency,
	save value for mkoffload.
	* omp-low.cc (lower_omp_target): Force output_offloadtables
	call for OMP_REQUIRES_TARGET_USED.

gcc/cp/ChangeLog:

	* parser.cc (cp_parser_omp_target_data,
	cp_parser_omp_target_enter_data, cp_parser_omp_target_exit_data,
	cp_parser_omp_target_update): Set OMP_REQUIRES_TARGET_USED.
	(cp_parser_omp_requires): Remove sorry.

gcc/fortran/ChangeLog:

	* openmp.cc (gfc_match_omp_requires): Remove sorry.
	* parse.cc (decode_omp_directive): Don't regard 'declare target'
	as target usage for 'omp requires'; add more flags to
	omp_requires_mask.

include/ChangeLog:

	* gomp-constants.h (GOMP_VERSION): Bump to 2.
	(GOMP_REQUIRES_UNIFIED_ADDRESS, GOMP_REQUIRES_UNIFIED_SHARED_MEMORY,
	GOMP_REQUIRES_REVERSE_OFFLOAD, GOMP_REQUIRES_TARGET_USED):
	New defines.

libgomp/ChangeLog:

	* libgomp-plugin.h (GOMP_OFFLOAD_get_num_devices): Add
	omp_requires_mask arg.
	* plugin/plugin-gcn.c (GOMP_OFFLOAD_get_num_devices): Likewise;
	return -1 when device available but omp_requires_mask != 0.
	* plugin/plugin-nvptx.c (GOMP_OFFLOAD_get_num_devices): Likewise.
	* oacc-host.c (host_get_num_devices, host_openacc_get_property):
	Update call.
	* oacc-init.c (resolve_device, acc_init_1, acc_shutdown_1,
	goacc_attach_host_thread_to_device, acc_get_num_devices,
	acc_set_device_num, get_property_any): Likewise.
	* target.c (omp_requires_mask): New global var.
	(gomp_requires_to_name): New.
	(GOMP_offload_register_ver): Handle passed omp_requires_mask.
	(gomp_target_init): Handle omp_requires_mask.
	* libgomp.texi (OpenMP 5.0): Update requires impl. status.
	(OpenMP 5.1): Add a missed item.
	(OpenMP 5.2): Mark linear-clause change as supported in C/C++.
	* testsuite/libgomp.c-c++-common/requires-1-aux.c: New test.
	* testsuite/libgomp.c-c++-common/requires-1.c: New test.
	* testsuite/libgomp.c-c++-common/requires-2-aux.c: New test.
	* te

[r13-1395 Regression] FAIL: gfortran.dg/check_bits_2.f90 -O1 output pattern test on Linux/x86_64

2022-07-01 Thread skpandey--- via Gcc-patches
On Linux/x86_64,

f843bea4ca5613cb713f8b9313daa3938f254a05 is the first bad commit
commit f843bea4ca5613cb713f8b9313daa3938f254a05
Author: Uros Bizjak 
Date:   Fri Jul 1 17:25:03 2022 +0200

i386: Use "r" constraint in *andn3_doubleword_bmi

caused

FAIL: gfortran.dg/check_bits_2.f90   -O1  output pattern test

with GCC configured with

../../gcc/configure 
--prefix=/local/skpandey/gccwork/toolwork/gcc-bisect-master/master/r13-1395/usr 
--enable-clocale=gnu --with-system-zlib --with-demangler-in-ld 
--with-fpmath=sse --enable-languages=c,c++,fortran --enable-cet --without-isl 
--enable-libmpx x86_64-linux --disable-bootstrap

To reproduce:

$ cd {build_dir}/gcc && make check RUNTESTFLAGS="dg.exp=gfortran.dg/check_bits_2.f90 --target_board='unix{-m32\ -march=cascadelake}'"

(Please do not reply to this email; for questions about this report, contact me
at skpgkp2 at gmail dot com)


[committed] libstdc++: Add missing prerequisite to generated header [PR106162]

2022-07-01 Thread Jonathan Wakely via Gcc-patches
Tested powerpc64le-linux, pushed to trunk.
This should be backported too.

-- >8 --

The ${host_builddir}/largefile-config.h header can't be written until
its parent directory has been created, so it needs to have the creation
of that directory as a prerequisite.

libstdc++-v3/ChangeLog:

PR libstdc++/106162
* include/Makefile.am (largefile-config.h): Add
stamp-${host_alias} prerequisite.
* include/Makefile.in: Regenerate.
---
 libstdc++-v3/include/Makefile.am | 2 +-
 libstdc++-v3/include/Makefile.in | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/libstdc++-v3/include/Makefile.am b/libstdc++-v3/include/Makefile.am
index b46def7ff9f..069a16ec769 100644
--- a/libstdc++-v3/include/Makefile.am
+++ b/libstdc++-v3/include/Makefile.am
@@ -1286,7 +1286,7 @@ stamp-float128:
 endif
 
 # This header is not installed, it's only used to build libstdc++ itself.
-${host_builddir}/largefile-config.h: ${CONFIG_HEADER}
+${host_builddir}/largefile-config.h: ${CONFIG_HEADER} stamp-${host_alias}
@rm -f $@.tmp
@-grep 'define _DARWIN_USE_64_BIT_INODE' ${CONFIG_HEADER} >> $@.tmp
@-grep 'define _FILE_OFFSET_BITS' ${CONFIG_HEADER} >> $@.tmp
diff --git a/libstdc++-v3/include/Makefile.in b/libstdc++-v3/include/Makefile.in
index f844008a7c5..36eff73139d 100644
--- a/libstdc++-v3/include/Makefile.in
+++ b/libstdc++-v3/include/Makefile.in
@@ -1780,7 +1780,7 @@ stamp-host: ${host_headers} ${bits_host_headers} ${ext_host_headers} ${host_head
 @ENABLE_FLOAT128_FALSE@echo 'undef _GLIBCXX_USE_FLOAT128' > stamp-float128
 
 # This header is not installed, it's only used to build libstdc++ itself.
-${host_builddir}/largefile-config.h: ${CONFIG_HEADER}
+${host_builddir}/largefile-config.h: ${CONFIG_HEADER} stamp-${host_alias}
@rm -f $@.tmp
@-grep 'define _DARWIN_USE_64_BIT_INODE' ${CONFIG_HEADER} >> $@.tmp
@-grep 'define _FILE_OFFSET_BITS' ${CONFIG_HEADER} >> $@.tmp
-- 
2.36.1



DSE patch RFA: Don't delete trapping insn

2022-07-01 Thread Ian Lance Taylor via Gcc-patches
The DSE pass can delete a dead store even if the instruction can trap.
That is incorrect when using -fnon-call-exceptions
-fno-delete-dead-exceptions.  This led to a bug report against gccgo:
https://go.dev/issue/53012.  However, the bug is not specific to Go.

This patch fixes the problem in a simple way, and includes a C++
testcase.  Bootstrapped and ran C, C++, and Go tests on
x86_64-pc-linux-gnu.
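
The rule the patch enforces can be stated as a one-line predicate. The sketch below uses plain booleans with hypothetical names in place of stmt_could_throw_p and fun->can_delete_dead_exceptions; it only restates the condition visible in the diff further down.

```c
#include <stdbool.h>

/* A dead store may be removed unless it could throw and the user asked
   for dead exceptions to be preserved (-fno-delete-dead-exceptions).  */
static bool
may_delete_dead_store (bool could_throw, bool can_delete_dead_exceptions)
{
  return !could_throw || can_delete_dead_exceptions;
}
```

Only the combination "could throw" and "must not delete dead exceptions" keeps the store alive; every other case behaves as before.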

OK for mainline?

Ian
From bae426745756896ec0ea27c9e12469c53b88d538 Mon Sep 17 00:00:00 2001
From: Ian Lance Taylor 
Date: Fri, 1 Jul 2022 14:51:45 -0700
Subject: [PATCH] tree-optimization: only DSE trapping insn if
 -fdelete-dead-exceptions

gcc/ChangeLog:

* tree-ssa-dse.cc (dse_optimize_stmt): Only delete a trapping
statement if -fdelete-dead-exceptions.

gcc/testsuite/ChangeLog:

* g++.dg/torture/except-1.C: New test.
---
 gcc/testsuite/g++.dg/torture/except-1.C | 44 +
 gcc/tree-ssa-dse.cc |  3 +-
 2 files changed, 46 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/g++.dg/torture/except-1.C

diff --git a/gcc/testsuite/g++.dg/torture/except-1.C b/gcc/testsuite/g++.dg/torture/except-1.C
new file mode 100644
index 000..7050a33cc27
--- /dev/null
+++ b/gcc/testsuite/g++.dg/torture/except-1.C
@@ -0,0 +1,44 @@
+// { dg-do run { target { i?86-*-linux* i?86-*-gnu* x86_64-*-linux* } } }
+// { dg-additional-options "-fexceptions -fnon-call-exceptions -fno-delete-dead-exceptions" }
+
+#include 
+#include 
+#include 
+
+static void
+sighandler (int signo, siginfo_t* si, void* uc)
+{
+  throw (5);
+}
+
+struct S { void *p1, *p2; };
+
+struct S v;
+
+__attribute__ ((noinline))
+int
+dosegv ()
+{
+  struct S *p = 0;
+  struct S s __attribute__((unused)) = *p;
+  return 0;
+}
+
+int main ()
+{
+  struct sigaction sa;
+
+  memset (&sa, 0, sizeof sa);
+  sa.sa_sigaction = sighandler;
+  sigaction (SIGSEGV, &sa, NULL);
+  sigaction (SIGBUS, &sa, NULL);
+
+  try {
+dosegv ();
+  }
+  catch (int x) {
+return (x != 5);
+  }
+
+  return 1;
+}
diff --git a/gcc/tree-ssa-dse.cc b/gcc/tree-ssa-dse.cc
index 62efafe384d..8d1739a4510 100644
--- a/gcc/tree-ssa-dse.cc
+++ b/gcc/tree-ssa-dse.cc
@@ -1463,7 +1463,8 @@ dse_optimize_stmt (function *fun, gimple_stmt_iterator *gsi, sbitmap live_bytes)
   gimple_call_set_lhs (stmt, NULL_TREE);
   update_stmt (stmt);
 }
-  else
+  else if (!stmt_could_throw_p (fun, stmt)
+  || fun->can_delete_dead_exceptions)
 delete_dead_or_redundant_assignment (gsi, "dead", need_eh_cleanup,
 need_ab_cleanup);
 }
-- 
2.37.0.rc0.161.g10f37bed90-goog



Re: Ping^2: [PATCH v2] diagnostics: Honor #pragma GCC diagnostic in the preprocessor [PR53431]

2022-07-01 Thread Lewis Hyatt via Gcc-patches
On Fri, Jul 1, 2022 at 3:59 PM Jason Merrill  wrote:
>
> On 6/29/22 12:59, Jason Merrill wrote:
> > On 6/23/22 13:03, Lewis Hyatt via Gcc-patches wrote:
> >> Hello-
> >>
> >> https://gcc.gnu.org/pipermail/gcc-patches/2022-May/595556.html
> >> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53431#c49
> >>
> >> Would a C++ maintainer have some time to take a look at this patch
> >> please? I feel like the PR is still worth resolving. If this doesn't
> >> seem like a good way, I am happy to try another -- would really
> >> appreciate any feedback. Thanks!
> >
> > Thanks for your persistence, I'll take a look now.
> >
> > Incidentally, when pinging it's often useful to ping someone from
> > MAINTAINERS directly, as well as the list.  I think your last ping got
> > eaten by some trouble Red Hat email was having at the time.
> >
> > The cp_token_is_module_directive cleanup is OK.

Thank you very much for the advice and for going through the patch, I
really appreciate it. I went ahead and pushed the small cleanup patch.
I have responses to your comments on the main patch below too.

> >
> >> +  bool skip_this_pragma;
> >
> > This member seems to be equivalent to
> >   in_pragma && !should_output_pragmas ()
> > Maybe it could be a member function instead of a data member?
> >

Yeah, makes sense, although I hope that by implementing your
suggestion below regarding rewinding the tokens for preprocessing,
then this can be removed anyway.

> > More soon.
>
> Looks good, just a few minor comments:
>
> > +  PD_KIND_INVALID,
> > +  PD_KIND_PUSH,
> > +  PD_KIND_POP,
> > +  PD_KIND_IGNORED_ATTRIBUTES,
> > +  PD_KIND_DIAGNOSTIC,
>
> The PD_KIND_ prefix seems unnecessary for a scoped enum.
>

Sure, will shorten it to PK_ instead.

> > +/* When preprocessing only, pragma_lex() is not available, so obtain the 
> > tokens
> > +   directly from libcpp.  */
> > +static void
> > +pragma_diagnostic_lex_pp (pragma_diagnostic_data *result)
>
> Hmm, we could make a temporary lexer, but I guess this is short enough
> that the duplication is OK.
>

I see. It would look like a version of pragma_lex() (the one in
c-parser.cc) which took a c_parser* argument so it wouldn't need to
use the global the_parser. I didn't consider this because I was
starting from Manuel's prototype patch on the PR
(https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53431#c10), which was
doing the parsing in libcpp itself. Perhaps it would make sense to
move to this approach in the future, if it became necessary sometime
to lex some other pragmas during preprocessing?

> > +/* Similar, for the portion of a diagnostic pragma that was parsed
> > +   internally and so not seen by our token streamer.  */
>
> Can we rewind after parsing so that the token streamer sees it?
>

Oh that's an interesting idea. It would avoid some potential issues.
For instance, I have another patch submitted to fix PR55971
(https://gcc.gnu.org/bugzilla/show_bug.cgi?id=55971#c8), which is that
you can't put raw strings containing newlines into a preprocessing
directive. It occurs to me now, once that's applied, then I think a
#pragma GCC diagnostic with such a raw string (useless though it would
be) would not get output correctly by gcc -E with this current patch's
approach. An edge case certainly, but would be nice to get it right
also, and your approach would automatically handle it. I'm going to
explore this now and then follow up with a new version of the patch.

> > +  if (early && arg)
> > +{
> > +  /* This warning is needed because of PR71603 - popping the diagnostic
> > +state does not currently reset changes to option arguments, only
> > +changes to the option dispositions.  */
> > +  warning_at (data.loc_option, OPT_Wpragmas,
> > + "a diagnostic pragma attempting to modify a preprocessor"
> > + " option argument may not work as expected");
> > +}
>
> Maybe only warn if we subsequently see a pop?
>

Right, that makes sense. Changing the option does work just fine until
you try to pop it. But actually this warning was kinda an
afterthought, now I just checked and at the time I wrote it, there was
only one option it could possibly apply to, since it needs to be an
option that's both for libcpp, and taking an argument, which was
-Wnormalized=. In the meantime one more has been added, -Wbidi-chars=.
Rather than add more complicated logic to remember to warn on pop for
these 2 options only, feels like maybe it would be better to either
just fix PR71603 (which I can work on sometime), or add this warning
for all options, not just libcpp options, which I guess means it
should go inside the implementation of pop... So in either case feels
like it's not really relevant to this patch and I'd propose just to
remove it for now, and then address it subsequently?

> > +/* Handle #pragma gcc diagnostic, which needs to be done during 
> > preprocessing
> > +   for the case of preprocessing-related diagnostics.  */
> > +stat

Re: [PATCH] c++: generic targs and identity substitution [PR105956]

2022-07-01 Thread Jason Merrill via Gcc-patches

On 6/29/22 13:42, Patrick Palka wrote:

In r13-1045-gcb7fd1ea85feea I assumed that substitution into generic
DECL_TI_ARGS corresponds to an identity mapping of the given arguments,
and hence it's safe to always elide such substitution.  But this PR
demonstrates that such a substitution isn't always the identity mapping,
in particular when there's an ARGUMENT_PACK_SELECT argument, which gets
handled specially during substitution:

   * when substituting an APS into a template parameter, we strip the
 APS to its underlying argument;
   * and when substituting an APS into a pack expansion, we strip the
 APS to its underlying argument pack.


Ah, right.  For instance, in variadic96.C we have

10  template < typename... T >
11  struct derived
12: public base< T, derived< T... > >...

so when substituting into the base-specifier, we're approaching it from 
the outside in, so when we get to the inner T... we need some way to 
find the T pack again.  It might be possible to remove the need for APS 
by substituting inner pack expansions before outer ones, which could 
improve worst-case complexity, but I don't know how relevant that is in 
real code; I imagine most inner pack expansions are as simple as this one.



In this testcase, when expanding the pack expansion pattern (idx + Ns)...
with Ns={0,1}, we specialize idx twice, first with Ns=APS<0,{0,1}> and
then Ns=APS<1,{0,1}>.  The DECL_TI_ARGS of idx are the generic template
arguments of the enclosing class template impl, so before r13-1045,
we'd substitute into its DECL_TI_ARGS which gave Ns={0,1} as desired.
But after r13-1045, we elide this substitution and end up attempting to
hash the original Ns argument, an APS, which ICEs.

So this patch partially reverts this part of r13-1045.  I considered
using preserve_args in this case instead, but that'd break the
static_assert in the testcase because preserve_args always strips APS to
its underlying argument, but here we want to strip it to its underlying
argument pack, so we'd incorrectly end up forming the specializations
impl<0>::idx and impl<1>::idx instead of impl<0,1>::idx.

Although we can't elide the substitution into DECL_TI_ARGS in light of
ARGUMENT_PACK_SELECT, it should still be safe to elide template argument
coercion in the case of a non-template decl, which this patch preserves.

It's unfortunate that we need to remove this optimization just because
it doesn't hold for one special tree code.  So this patch implements a
heuristic in tsubst_template_args to avoid allocating a new TREE_VEC if
the substituted elements are identical to those of a level from ARGS.
It turns out that about 30% of all calls to tsubst_template_args benefit
from this optimization, and it reduces memory usage by about 1.5% for
e.g. stdc++.h (relative to r13-1045).  (This is the maybe_reuse stuff,
the rest of the changes to tsubst_template_args are just drive-by
cleanups.)
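
The reuse heuristic described above can be sketched outside the compiler as follows. Plain int arrays stand in for TREE_VECs and the function name is hypothetical; the patch itself works on trees and, per its ChangeLog, compares the substituted elements against a level from 'args'.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* If the substituted elements ELTS are identical to an existing LEVEL,
   return LEVEL itself instead of allocating a new vector.  Otherwise
   return a freshly allocated copy, which the caller must free.  */
static const int *
reuse_or_copy (const int *elts, size_t len,
               const int *level, size_t level_len)
{
  if (len == level_len && memcmp (elts, level, len * sizeof *elts) == 0)
    return level;

  int *copy = malloc (len * sizeof *copy);
  if (copy != NULL)
    memcpy (copy, elts, len * sizeof *copy);
  return copy;
}
```

The saving comes from the first branch: when the result is pointer-identical to an existing vector, no allocation happens at all.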

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
trunk?  Patch generated with -w to ignore noisy whitespace changes.

PR c++/105956

gcc/cp/ChangeLog:

* pt.cc (tsubst_template_args): Move variable declarations
closer to their first use.  Replace 'orig_t' with 'r'.  Rename
'need_new' to 'const_subst_p'.  Heuristically detect if the
substituted elements are identical to that of a level from
'args' and avoid allocating a new TREE_VEC if so.
(tsubst_decl) : Revert
r13-1045-gcb7fd1ea85feea change for avoiding substitution into
DECL_TI_ARGS, but still avoid coercion in this case.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/variadic183.C: New test.
---
  gcc/cp/pt.cc | 113 ++-
  gcc/testsuite/g++.dg/cpp0x/variadic183.C |  14 +++
  2 files changed, 85 insertions(+), 42 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/cpp0x/variadic183.C

diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
index 8672da123f4..7898834faa6 100644
--- a/gcc/cp/pt.cc
+++ b/gcc/cp/pt.cc
@@ -27,6 +27,7 @@ along with GCC; see the file COPYING3.  If not see
   Fixed by: C++20 modules.  */
  
  #include "config.h"

+#define INCLUDE_ALGORITHM // for std::equal
  #include "system.h"
  #include "coretypes.h"
  #include "cp-tree.h"
@@ -13544,17 +13545,22 @@ tsubst_argument_pack (tree orig_arg, tree args, tsubst_flags_t complain,
  tree
  tsubst_template_args (tree t, tree args, tsubst_flags_t complain, tree in_decl)
  {
-  tree orig_t = t;
-  int len, need_new = 0, i, expanded_len_adjust = 0, out;
-  tree *elts;
-
if (t == error_mark_node)
  return error_mark_node;
  
-  len = TREE_VEC_LENGTH (t);

-  elts = XALLOCAVEC (tree, len);
+  const int len = TREE_VEC_LENGTH (t);
+  tree *elts = XALLOCAVEC (tree, len);
+  int expanded_len_adjust = 0;
  
-  for (i = 0; i < len; i++)

+  /* True iff no element of T was changed by the substitution.  */
+  bool const_subst_p = true;
+
+  /* If MAYBE_REUSE is non-NULL, as an optim

Go patch committed: Use correct init order for multi-value init

2022-07-01 Thread Ian Lance Taylor via Gcc-patches
This patch to the Go frontend uses the correct initialization order
for code like

var a = c
var b, c = x.(bool)

The global c is initialized by the preinit of b, but we were missing a
dependency of c on b, so a would be initialized to the zero value of c
rather than the correct value.

Simply adding the dependency of c on b didn't work because the preinit
of b refers to c, so that appeared circular.  So this patch changes
the init order to skip dependencies that only appear on the left hand
side of assignments in preinit blocks.

Doing that didn't work because the write barrier pass can transform "a
= b" into code like "gcWriteBarrier(&a, b)" that is not obviously a
simple assignment.  So this patch moves the collection of dependencies
to just after lowering, before the write barriers are inserted.

Making those changes permits relaxing the requirement that we don't
warn about self-dependency in preinit blocks, so now we correctly warn
for

var a, b any = b.(bool)

The test case is https://go.dev/cl/415238.

This fixes https://go.dev/issue/53619.

Bootstrapped and ran Go testsuite on x86_64-pc-linux-gnu.  Committed
to mainline.

Ian
23361df71a68478dde7c4aa516ba69f199556a2c
diff --git a/gcc/go/gofrontend/MERGE b/gcc/go/gofrontend/MERGE
index 65f64e0fbfb..7b1d3011fff 100644
--- a/gcc/go/gofrontend/MERGE
+++ b/gcc/go/gofrontend/MERGE
@@ -1,4 +1,4 @@
-ac438edc5335f69c95df9342f43712ad2f61ad66
+6479d5976c5d848ec6f5843041275723a6b0
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
diff --git a/gcc/go/gofrontend/go.cc b/gcc/go/gofrontend/go.cc
index 404cb124549..1512770af29 100644
--- a/gcc/go/gofrontend/go.cc
+++ b/gcc/go/gofrontend/go.cc
@@ -146,6 +146,9 @@ go_parse_input_files(const char** filenames, unsigned int filename_count,
   if (only_check_syntax)
 return;
 
+  // Record global variable initializer dependencies.
+  ::gogo->record_global_init_refs();
+
   // Do simple deadcode elimination.
   ::gogo->remove_deadcode();
 
diff --git a/gcc/go/gofrontend/gogo.cc b/gcc/go/gofrontend/gogo.cc
index 67b91fab4ca..9197eef3e38 100644
--- a/gcc/go/gofrontend/gogo.cc
+++ b/gcc/go/gofrontend/gogo.cc
@@ -1086,8 +1086,8 @@ class Find_vars : public Traverse
 
  public:
   Find_vars()
-: Traverse(traverse_expressions),
-  vars_(), seen_objects_()
+: Traverse(traverse_expressions | traverse_statements),
+  vars_(), seen_objects_(), lhs_is_ref_(false)
   { }
 
   // An iterator through the variables found, after the traversal.
@@ -1104,11 +1104,16 @@ class Find_vars : public Traverse
   int
   expression(Expression**);
 
+  int
+  statement(Block*, size_t* index, Statement*);
+
  private:
   // Accumulated variables.
   Vars vars_;
   // Objects we have already seen.
   Seen_objects seen_objects_;
+  // Whether an assignment to a variable counts as a reference.
+  bool lhs_is_ref_;
 };
 
 // Collect global variables referenced by EXPR.  Look through function
@@ -1164,7 +1169,11 @@ Find_vars::expression(Expression** pexpr)
  if (ins.second)
{
  // This is the first time we have seen this name.
- if (f->func_value()->block()->traverse(this) == TRAVERSE_EXIT)
+ bool hold = this->lhs_is_ref_;
+ this->lhs_is_ref_ = true;
+ int r = f->func_value()->block()->traverse(this);
+ this->lhs_is_ref_ = hold;
+ if (r == TRAVERSE_EXIT)
return TRAVERSE_EXIT;
}
}
@@ -1192,6 +1201,29 @@ Find_vars::expression(Expression** pexpr)
   return TRAVERSE_CONTINUE;
 }
 
+// Check a statement while searching for variables.  This is where we
+// skip variables on the left hand side of assigments if appropriate.
+
+int
+Find_vars::statement(Block*, size_t*, Statement* s)
+{
+  if (this->lhs_is_ref_)
+return TRAVERSE_CONTINUE;
+  Assignment_statement* as = s->assignment_statement();
+  if (as == NULL)
+return TRAVERSE_CONTINUE;
+
+  // Only traverse subexpressions of the LHS.
+  if (as->lhs()->traverse_subexpressions(this) == TRAVERSE_EXIT)
+return TRAVERSE_EXIT;
+
+  Expression* rhs = as->rhs();
+  if (Expression::traverse(&rhs, this) == TRAVERSE_EXIT)
+return TRAVERSE_EXIT;
+
+  return TRAVERSE_SKIP_COMPONENTS;
+}
+
 // Return true if EXPR, PREINIT, or DEP refers to VAR.
 
 static bool
@@ -1230,11 +1262,11 @@ class Var_init
 {
  public:
   Var_init()
-: var_(NULL), init_(NULL), refs_(NULL), dep_count_(0)
+: var_(NULL), init_(NULL), dep_count_(0)
   { }
 
   Var_init(Named_object* var, Bstatement* init)
-: var_(var), init_(init), refs_(NULL), dep_count_(0)
+: var_(var), init_(init), dep_count_(0)
   { }
 
   // Return the variable.
@@ -1247,19 +1279,6 @@ class Var_init
   init() const
   { return this->init_; }
 
-  // Add a reference.
-  void
-  add_ref(Named_object* var);
-
-  // The variables which this variable's initializers refer to.
-  const std::vector*
-  refs()
-  { return this->refs_; 

Re: DSE patch RFA: Don't delete trapping insn

2022-07-01 Thread Jeff Law via Gcc-patches




On 7/1/2022 4:04 PM, Ian Lance Taylor via Gcc-patches wrote:

The DSE pass can delete a dead store even if the instruction can trap.
That is incorrect when using -fnon-call-exceptions
-fno-delete-dead-exceptions.  This led to a bug report against gccgo:
https://go.dev/issue/53012.  However, the bug is not specific to Go.

This patch fixes the problem in a simple way, and includes a C++
testcase.  Bootstrapped and ran C, C++, and Go tests on
x86_64-pc-linux-gnu.

OK for mainline?

OK
jeff



Re: [PATCH]middle-end Add optimized float addsub without needing VEC_PERM_EXPR.

2022-07-01 Thread Jeff Law via Gcc-patches




On 6/17/2022 2:33 PM, Andrew Pinski via Gcc-patches wrote:

On Thu, Jun 16, 2022 at 3:59 AM Tamar Christina via Gcc-patches
 wrote:

Hi All,

For IEEE 754 floating point formats we can replace a sequence of alternating
+/- with fneg of a wider type followed by an fadd.  This eliminates the need for
using a permutation.  This patch adds a match.pd rule to recognize and do this
rewriting.
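
The idea in the paragraph above can be tried in scalar C. The sketch assumes IEEE 754 binary32 floats and models the wider element as a 64-bit integer whose high half holds lane 1, as it would on a little-endian vector layout. The function names are hypothetical; the patch itself operates on GIMPLE vectors, not on code like this.

```c
#include <stdint.h>
#include <string.h>

static uint32_t
float_bits (float f)
{
  uint32_t u;
  memcpy (&u, &f, sizeof u);
  return u;
}

static float
bits_float (uint32_t u)
{
  float f;
  memcpy (&f, &u, sizeof f);
  return f;
}

/* Compute { a[0] + b[0], a[1] - b[1] } without selecting lanes:
   view the pair b as one 64-bit element, flip its sign bit (which is
   the sign bit of lane 1), then do a plain addition on both lanes.  */
static void
addsub_pair (const float a[2], const float b[2], float out[2])
{
  uint64_t wide = (uint64_t) float_bits (b[0])
                  | ((uint64_t) float_bits (b[1]) << 32);
  wide ^= UINT64_C (1) << 63;   /* "negate" the wider element  */
  out[0] = a[0] + bits_float ((uint32_t) wide);
  out[1] = a[1] + bits_float ((uint32_t) (wide >> 32));
}
```

Which lane ends up subtracted depends on where the wide sign bit lands, which is one reason the replies in this thread ask for the floating-point format and the target to be checked.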

I don't think this is correct. You don't check the format of the
floating point to make sure this is valid (e.g. REAL_MODE_FORMAT's
signbit_rw/signbit_ro field).
Also would just be better if you do the xor in integer mode (using
signbit_rw field for the correct bit)?
And then making sure the target optimizes the xor to the neg
instruction when needed?
Whether the xor trick is better would be highly target dependent.
That seems like it's better left for expansion to figure out
since we have target costing information at that point.


Jeff



Re: [PATCH]middle-end Add optimized float addsub without needing VEC_PERM_EXPR.

2022-07-01 Thread Jeff Law via Gcc-patches




On 6/20/2022 5:56 AM, Richard Biener via Gcc-patches wrote:



Note one option would be to emit a multiply with { 1, -1, 1, -1 } on
GIMPLE where then targets could opt-in to handle this via a DFmode
negate via a combine pattern?  Not sure if this can be even done
starting from the vec-perm RTL IL.

FWIW, FP multiply is the same cost as FP add/sub on our target.


I fear whether (neg:V2DF (subreg:V2DF (reg:V4SF))) is a good idea
will heavily depend on the target CPU (not only the ISA).  For RISC-V
for example I think the DF lanes do not overlap with two SF lanes
(so same with gcn I think).
Absolutely.  I've regularly seen introduction of subregs like that
ultimately result in the SUBREG_REG object getting dumped into memory
rather than being allocated into a register.  It could well be a problem
with our port; I haven't started chasing it down yet.


One such case where that came up recently was the addition of something 
like this to simplify-rtx.  Basically in some cases we can turn a 
VEC_SELECT into a SUBREG, so I had this little hack in simplify-rtx that 
I was playing with:

+  /* If we have a VEC_SELECT of a SUBREG try to change the SUBREG so
+     that we eliminate the VEC_SELECT.  */
+  if (GET_CODE (op0) == SUBREG
+      && subreg_lowpart_p (op0)
+      && VECTOR_MODE_P (GET_MODE (op0))
+      && GET_MODE_INNER (GET_MODE (op0)) == mode
+      && XVECLEN (trueop1, 0) == 1
+      && CONST_INT_P (XVECEXP (trueop1, 0, 0)))
+    {
+      return simplify_gen_subreg (mode, SUBREG_REG (op0),
+                                  GET_MODE (SUBREG_REG (op0)),
+                                  INTVAL (XVECEXP (trueop1, 0, 0)) * 8);
+    }


Seemed like a no-brainer win, but in reality it made things worse pretty 
consistently.
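
As a rough illustration of why the hack multiplies the selected index by 8,
here is a small C++ model.  It is not RTL.  It treats a vector register as a
block of bytes in memory order on a little-endian host, and the helper names
(element_byte_offset, select_lane) are hypothetical.  Selecting lane N of a
vector of 8-byte elements is then the same as reading 8 bytes at byte offset
N * 8.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: byte offset of lane INDEX when every lane is
// ELEM_SIZE bytes wide.  The quoted hack hardcodes ELEM_SIZE as 8.
static std::size_t element_byte_offset (std::size_t index,
                                        std::size_t elem_size)
{
  return index * elem_size;
}

// Hypothetical helper: read one lane through its byte offset, the way a
// subreg-style access would, instead of indexing the vector directly.
static double select_lane (const double *vec, std::size_t nelts,
                           std::size_t index)
{
  assert (index < nelts);
  const unsigned char *bytes
    = reinterpret_cast<const unsigned char *> (vec);
  double out;
  std::memcpy (&out, bytes + element_byte_offset (index, sizeof (double)),
               sizeof out);
  return out;
}
```

A general version would presumably scale by the size of the element mode
rather than by a literal 8; that is an assumption on my part, not something
stated in the message above.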


jeff


Re: Ping^2: [PATCH v2] diagnostics: Honor #pragma GCC diagnostic in the preprocessor [PR53431]

2022-07-01 Thread Jason Merrill via Gcc-patches

On 7/1/22 18:05, Lewis Hyatt wrote:

On Fri, Jul 1, 2022 at 3:59 PM Jason Merrill  wrote:


On 6/29/22 12:59, Jason Merrill wrote:

On 6/23/22 13:03, Lewis Hyatt via Gcc-patches wrote:

Hello-

https://gcc.gnu.org/pipermail/gcc-patches/2022-May/595556.html
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53431#c49

Would a C++ maintainer have some time to take a look at this patch
please? I feel like the PR is still worth resolving. If this doesn't
seem like a good way, I am happy to try another -- would really
appreciate any feedback. Thanks!


Thanks for your persistence, I'll take a look now.

Incidentally, when pinging it's often useful to ping someone from
MAINTAINERS directly, as well as the list.  I think your last ping got
eaten by some trouble Red Hat email was having at the time.

The cp_token_is_module_directive cleanup is OK.


Thank you very much for the advice and for going through the patch, I
really appreciate it. I went ahead and pushed the small cleanup patch.
I have responses to your comments on the main patch below too.




+  bool skip_this_pragma;


This member seems to be equivalent to
   in_pragma && !should_output_pragmas ()
Maybe it could be a member function instead of a data member?



Yeah, makes sense, although I hope that by implementing your
suggestion below regarding rewinding the tokens for preprocessing,
this can be removed anyway.
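
A minimal sketch of the member-function form suggested above, for
illustration only.  The struct lexer_state_sketch and its output_pragmas
flag are hypothetical stand-ins; only the names in_pragma,
should_output_pragmas and skip_this_pragma come from the thread.

```cpp
#include <cassert>

// Hypothetical stand-in for the lexer state under review.
struct lexer_state_sketch
{
  bool in_pragma = false;
  bool output_pragmas = false;   // assumed backing flag for this sketch

  bool should_output_pragmas () const { return output_pragmas; }

  // Computed on demand, so it cannot drift out of sync with in_pragma
  // the way a separately stored data member could.
  bool skip_this_pragma () const
  {
    return in_pragma && !should_output_pragmas ();
  }
};
```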


More soon.


Looks good, just a few minor comments:


+  PD_KIND_INVALID,
+  PD_KIND_PUSH,
+  PD_KIND_POP,
+  PD_KIND_IGNORED_ATTRIBUTES,
+  PD_KIND_DIAGNOSTIC,


The PD_KIND_ prefix seems unnecessary for a scoped enum.



Sure, will shorten it to PK_ instead.
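
To show why a prefix is redundant on a scoped enum, here is a small C++
sketch.  The enum name pragma_diagnostic_kind and the helper changes_stack
are hypothetical; the enumerators mirror the quoted patch with the prefix
dropped entirely, which is one possible reading of the comment rather than
what was agreed.

```cpp
#include <cassert>

// Hypothetical enum name; enumerators of an enum class do not leak into
// the enclosing scope, so every use is already qualified by the type.
enum class pragma_diagnostic_kind
{
  INVALID,
  PUSH,
  POP,
  IGNORED_ATTRIBUTES,
  DIAGNOSTIC
};

// Hypothetical helper showing the qualified spelling at a use site.
static bool changes_stack (pragma_diagnostic_kind kind)
{
  return kind == pragma_diagnostic_kind::PUSH
         || kind == pragma_diagnostic_kind::POP;
}
```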


+/* When preprocessing only, pragma_lex() is not available, so obtain the tokens
+   directly from libcpp.  */
+static void
+pragma_diagnostic_lex_pp (pragma_diagnostic_data *result)


Hmm, we could make a temporary lexer, but I guess this is short enough
that the duplication is OK.


I see. It would look like a version of pragma_lex() (the one in
c-parser.cc) which took a c_parser* argument so it wouldn't need to
use the global the_parser.


Or creates the_parser itself and feeds it tokens somewhat like 
cp_parser_handle_statement_omp_attributes.



I didn't consider this because I was
starting from Manuel's prototype patch on the PR
(https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53431#c10), which was
doing the parsing in libcpp itself. Perhaps it would make sense to
move to this approach in the future, if it became necessary sometime
to lex some other pragmas during preprocessing?


Sure.


+/* Similar, for the portion of a diagnostic pragma that was parsed
+   internally and so not seen by our token streamer.  */


Can we rewind after parsing so that the token streamer sees it?



Oh, that's an interesting idea. It would avoid some potential issues.
For instance, I have another patch submitted to fix PR55971
(https://gcc.gnu.org/bugzilla/show_bug.cgi?id=55971#c8), which is that
you can't put raw strings containing newlines into a preprocessing
directive. It occurs to me now that, once that's applied, a
#pragma GCC diagnostic with such a raw string (useless though it would
be) would not get output correctly by gcc -E with this current patch's
approach. An edge case certainly, but it would be nice to get it right
too, and your approach would handle it automatically. I'm going to
explore this now and then follow up with a new version of the patch.


+  if (early && arg)
+    {
+      /* This warning is needed because of PR71603 - popping the diagnostic
+         state does not currently reset changes to option arguments, only
+         changes to the option dispositions.  */
+      warning_at (data.loc_option, OPT_Wpragmas,
+                  "a diagnostic pragma attempting to modify a preprocessor"
+                  " option argument may not work as expected");
+    }


Maybe only warn if we subsequently see a pop?



Right, that makes sense. Changing the option does work just fine until
you try to pop it. But actually this warning was kind of an
afterthought. I just checked, and at the time I wrote it there was
only one option it could possibly apply to, since it needs to be an
option that is both for libcpp and takes an argument: -Wnormalized=.
In the meantime one more has been added, -Wbidi-chars=.
Rather than add more complicated logic to remember to warn on pop for
these 2 options only, it feels like it would be better either to
fix PR71603 (which I can work on sometime), or to add this warning
for all options, not just libcpp options, which I guess means it
should go inside the implementation of pop... In either case it
is not really relevant to this patch, so I'd propose just
removing it for now and addressing it subsequently?


Sounds good.


+/* Handle #pragma gcc diagnostic, which needs to be done during preprocessing
+   for the case of preprocessing-related diagnostics.  */
+static void
+cp_lexer_handle_early_pragm