date:20220721

[PATCH] Add alias disambiguation for vectorizer load/store IFNs

2022-07-21 Thread Richard Biener via Gcc-patches

The following adds support for MASK_STORE, MASK_LOAD and friends
to call_may_clobber_ref_p and ref_maybe_used_by_call_p.  Since
they all use a special argument to specify TBAA, they are not really
suited for fnspec handling, hence the manual support.
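
As a rough illustration of the kind of query this enables, here is a
minimal sketch, not the GCC implementation.  It assumes concrete
addresses instead of points-to information, and a toy TBAA rule in
which alias set 0 conflicts with everything and two non-zero sets
conflict only when equal.  MemRef, alias_sets_conflict, ranges_overlap
and masked_store_may_clobber are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for an alias-oracle reference: a byte range
// plus an alias set.  Not GCC's ao_ref.
struct MemRef {
  std::uintptr_t base;  // start address of the access
  std::uint64_t size;   // number of bytes that may be touched
  int alias_set;        // 0 means "conflicts with everything"
};

// Toy TBAA rule (an assumption): non-zero sets conflict only if equal.
inline bool alias_sets_conflict(int a, int b) {
  return a == 0 || b == 0 || a == b;
}

// Half-open ranges [base, base + size) overlap test.
inline bool ranges_overlap(const MemRef &a, const MemRef &b) {
  return a.base < b.base + b.size && b.base < a.base + a.size;
}

// Conservative query: may a masked store of vec_size bytes at ptr
// clobber ref?  The mask is ignored, so every lane is assumed active.
inline bool masked_store_may_clobber(std::uintptr_t ptr,
                                     std::uint64_t vec_size,
                                     int store_alias_set,
                                     const MemRef &ref, bool tbaa_p) {
  assert(vec_size > 0);
  // With TBAA disabled the store is treated as alias set 0.
  MemRef lhs{ptr, vec_size, tbaa_p ? store_alias_set : 0};
  if (!ranges_overlap(lhs, ref))
    return false;
  return alias_sets_conflict(lhs.alias_set, ref.alias_set);
}
```

In this model a 32-byte masked store at 0x1000 cannot clobber a 4-byte
reference at 0x1020, and with TBAA enabled it cannot clobber an
overlapping reference whose alias set differs.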

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.

PR106365 shows this can be important, though in many of the cases seen
there we miss PTA info.

* tree-ssa-alias.cc (ref_maybe_used_by_call_p_1): Special-case
store internal functions and IFN_MASK_LOAD, IFN_LEN_LOAD
and IFN_MASK_LOAD_LANES.
(call_may_clobber_ref_p_1): Special-case IFN_MASK_STORE,
IFN_LEN_STORE and IFN_MASK_STORE_LANES.
---
 gcc/tree-ssa-alias.cc | 49 +--
 1 file changed, 47 insertions(+), 2 deletions(-)

diff --git a/gcc/tree-ssa-alias.cc b/gcc/tree-ssa-alias.cc
index 782266bdad8..390cd875074 100644
--- a/gcc/tree-ssa-alias.cc
+++ b/gcc/tree-ssa-alias.cc
@@ -47,6 +47,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "print-tree.h"
 #include "tree-ssa-alias-compare.h"
 #include "builtins.h"
+#include "internal-fn.h"
 
 /* Broad overview of how alias analysis on gimple works:
 
@@ -2793,8 +2794,38 @@ ref_maybe_used_by_call_p_1 (gcall *call, ao_ref *ref, 
bool tbaa_p)
   if (ref->volatile_p)
 return true;
 
-  callee = gimple_call_fndecl (call);
+  if (gimple_call_internal_p (call))
+switch (gimple_call_internal_fn (call))
+  {
+  case IFN_MASK_STORE:
+  case IFN_SCATTER_STORE:
+  case IFN_MASK_SCATTER_STORE:
+  case IFN_LEN_STORE:
+   return false;
+  case IFN_MASK_STORE_LANES:
+   goto process_args;
+  case IFN_MASK_LOAD:
+  case IFN_LEN_LOAD:
+  case IFN_MASK_LOAD_LANES:
+   {
+ ao_ref rhs_ref;
+ tree lhs = gimple_call_lhs (call);
+ if (lhs)
+   {
+ ao_ref_init_from_ptr_and_size (&rhs_ref,
+gimple_call_arg (call, 0),
+TYPE_SIZE_UNIT (TREE_TYPE (lhs)));
+ rhs_ref.ref_alias_set = rhs_ref.base_alias_set
+   = tbaa_p ? get_deref_alias_set (TREE_TYPE
+   (gimple_call_arg (call, 1))) : 0;
+ return refs_may_alias_p_1 (ref, &rhs_ref, tbaa_p);
+   }
+ break;
+   }
+  default:;
+  }
 
+  callee = gimple_call_fndecl (call);
   if (callee != NULL_TREE)
 {
   struct cgraph_node *node = cgraph_node::get (callee);
@@ -3005,7 +3036,7 @@ call_may_clobber_ref_p_1 (gcall *call, ao_ref *ref, bool 
tbaa_p)
   & (ECF_PURE|ECF_CONST|ECF_LOOPING_CONST_OR_PURE|ECF_NOVOPS))
 return false;
   if (gimple_call_internal_p (call))
-switch (gimple_call_internal_fn (call))
+switch (auto fn = gimple_call_internal_fn (call))
   {
/* Treat these internal calls like ECF_PURE for aliasing,
   they don't write to any memory the program should care about.
@@ -3018,6 +3049,20 @@ call_may_clobber_ref_p_1 (gcall *call, ao_ref *ref, bool 
tbaa_p)
   case IFN_UBSAN_PTR:
   case IFN_ASAN_CHECK:
return false;
+  case IFN_MASK_STORE:
+  case IFN_LEN_STORE:
+  case IFN_MASK_STORE_LANES:
+   {
+ tree rhs = gimple_call_arg (call,
+ internal_fn_stored_value_index (fn));
+ ao_ref lhs_ref;
+ ao_ref_init_from_ptr_and_size (&lhs_ref, gimple_call_arg (call, 0),
+TYPE_SIZE_UNIT (TREE_TYPE (rhs)));
+ lhs_ref.ref_alias_set = lhs_ref.base_alias_set
+   = tbaa_p ? get_deref_alias_set
+  (TREE_TYPE (gimple_call_arg (call, 1))) : 0;
+ return refs_may_alias_p_1 (ref, &lhs_ref, tbaa_p);
+   }
   default:
break;
   }
-- 
2.35.3

[PATCH v2] RTEMS: Add -ftls-model=local-exec to multilibs

2022-07-21 Thread Sebastian Huber

Use the local-exec TLS model for all multilibs of all RTEMS targets with proper
TLS support.
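
For readers unfamiliar with TLS models, the same model can be
requested for a single variable with the tls_model variable
attribute, instead of for a whole multilib.  A minimal sketch,
assuming a GCC-compatible compiler and a target with TLS support;
call_count and bump_call_count are hypothetical names.

```cpp
#include <cassert>

// Per-thread counter pinned to the local-exec TLS model via the
// tls_model variable attribute (a GCC extension).
static thread_local int call_count
    __attribute__((tls_model("local-exec"))) = 0;

// Increment the calling thread's counter `times` times and return
// the new value.
int bump_call_count(int times) {
  assert(times >= 0);
  for (int i = 0; i < times; ++i)
    ++call_count;
  return call_count;
}
```

Local-exec is only valid for variables defined in the executable
itself, which is why it is applied here to a file-local variable.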

gcc/ChangeLog:

* config.gcc (aarch64-*-rtems*): Extend tmake_file.
* config/arm/t-rtems (MULTILIB_EXTRA_OPTS): Define to use
-ftls-model=local-exec.
* config/i386/t-rtems (MULTILIB_EXTRA_OPTS): Likewise.
* config/m68k/t-rtems (MULTILIB_EXTRA_OPTS): Likewise.
* config/microblaze/t-rtems (MULTILIB_EXTRA_OPTS): Likewise.
* config/nios2/t-rtems (MULTILIB_EXTRA_OPTS): Likewise.
* config/riscv/t-rtems (MULTILIB_EXTRA_OPTS): Likewise.
* config/rs6000/t-rtems (MULTILIB_EXTRA_OPTS): Likewise.
* config/sparc/t-rtems (MULTILIB_EXTRA_OPTS): Likewise.
* config/aarch64/t-aarch64-rtems: New file.
---
v2:

* Include aarch64. This required a new RTEMS-specific file.

 gcc/config.gcc |  1 +
 gcc/config/aarch64/t-aarch64-rtems | 20 
 gcc/config/arm/t-rtems |  1 +
 gcc/config/i386/t-rtems|  1 +
 gcc/config/m68k/t-rtems|  1 +
 gcc/config/microblaze/t-rtems  |  1 +
 gcc/config/nios2/t-rtems   |  1 +
 gcc/config/riscv/t-rtems   |  2 ++
 gcc/config/rs6000/t-rtems  |  1 +
 gcc/config/sparc/t-rtems   |  2 ++
 10 files changed, 31 insertions(+)
 create mode 100644 gcc/config/aarch64/t-aarch64-rtems

diff --git a/gcc/config.gcc b/gcc/config.gcc
index 4e3b15bb5e9..c8041723d2a 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -1093,6 +1093,7 @@ aarch64*-*-elf | aarch64*-*-fuchsia* | aarch64*-*-rtems*)
 ;;
aarch64-*-rtems*)
tm_file="${tm_file} aarch64/rtems.h rtems.h"
+   tmake_file="${tmake_file} aarch64/t-aarch64-rtems"
;;
esac
case $target in
diff --git a/gcc/config/aarch64/t-aarch64-rtems 
b/gcc/config/aarch64/t-aarch64-rtems
new file mode 100644
index 000..049ea4fa7c0
--- /dev/null
+++ b/gcc/config/aarch64/t-aarch64-rtems
@@ -0,0 +1,20 @@
+# Machine description for AArch64 architecture.
+#  Copyright (C) 2022 Free Software Foundation, Inc.
+#
+#  This file is part of GCC.
+#
+#  GCC is free software; you can redistribute it and/or modify it
+#  under the terms of the GNU General Public License as published by
+#  the Free Software Foundation; either version 3, or (at your option)
+#  any later version.
+#
+#  GCC is distributed in the hope that it will be useful, but
+#  WITHOUT ANY WARRANTY; without even the implied warranty of
+#  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+#  General Public License for more details.
+#
+#  You should have received a copy of the GNU General Public License
+#  along with GCC; see the file COPYING3.  If not see
+#  .
+
+MULTILIB_EXTRA_OPTS = ftls-model=local-exec
diff --git a/gcc/config/arm/t-rtems b/gcc/config/arm/t-rtems
index b2fcf572bca..aaf11355b11 100644
--- a/gcc/config/arm/t-rtems
+++ b/gcc/config/arm/t-rtems
@@ -8,6 +8,7 @@ MULTILIB_EXCEPTIONS =
 MULTILIB_REUSE =
 MULTILIB_MATCHES   =
 MULTILIB_REQUIRED  =
+MULTILIB_EXTRA_OPTS= ftls-model=local-exec
 
 # Enumeration of multilibs
 
diff --git a/gcc/config/i386/t-rtems b/gcc/config/i386/t-rtems
index 692c99484b3..83b95a6e53d 100644
--- a/gcc/config/i386/t-rtems
+++ b/gcc/config/i386/t-rtems
@@ -24,3 +24,4 @@ MULTILIB_MATCHES += march?pentium=march?k6 
march?pentiumpro=march?athlon
 MULTILIB_EXCEPTIONS = \
 march=pentium/*msoft-float* \
 march=pentiumpro/*msoft-float*
+MULTILIB_EXTRA_OPTS = ftls-model=local-exec
diff --git a/gcc/config/m68k/t-rtems b/gcc/config/m68k/t-rtems
index 0997afebc94..53a585e3018 100644
--- a/gcc/config/m68k/t-rtems
+++ b/gcc/config/m68k/t-rtems
@@ -7,3 +7,4 @@ M68K_MLIB_CPU += && (match(MLIB, "^68") \
 || MLIB == "5329" \
 || MLIB == "5407" \
 || MLIB == "5475")
+MULTILIB_EXTRA_OPTS = ftls-model=local-exec
diff --git a/gcc/config/microblaze/t-rtems b/gcc/config/microblaze/t-rtems
index d0c38261aaa..c9c9716ab62 100644
--- a/gcc/config/microblaze/t-rtems
+++ b/gcc/config/microblaze/t-rtems
@@ -1 +1,2 @@
 # Custom multilibs for RTEMS
+MULTILIB_EXTRA_OPTS = ftls-model=local-exec
diff --git a/gcc/config/nios2/t-rtems b/gcc/config/nios2/t-rtems
index beda8328bd2..3c9fbc69c83 100644
--- a/gcc/config/nios2/t-rtems
+++ b/gcc/config/nios2/t-rtems
@@ -8,6 +8,7 @@ MULTILIB_EXCEPTIONS =
 MULTILIB_REUSE =
 MULTILIB_MATCHES   =
 MULTILIB_REQUIRED  =
+MULTILIB_EXTRA_OPTS= ftls-model=local-exec
 
 # Enumeration of multilibs
 
diff --git a/gcc/config/riscv/t-rtems b/gcc/config/riscv/t-rtems
index 41f5927fc87..bb49e559ec5 100644
--- a/gcc/config/riscv/t-rtems
+++ b/gcc/config/riscv/t-rtems
@@ -1,3 +1,5 @@
+MULTILIB_EXTRA_OPTS= ftls-model=local-exec
+
 MULTILIB_OPTIONS   =
 MULTILIB_DIRNAMES  =
 
diff --git a/gcc/config/rs6000/t-rtems b/gcc/config/rs6000/t-rtems
index 4f8c147be3e..ba71

[GCC 12] libstdc++: Fix lifetime bugs for non-TLS eh_globals [PR105880]

2022-07-21 Thread Sebastian Huber

From: Jonathan Wakely 

This ensures that the single-threaded fallback buffer eh_globals is not
destroyed during program termination, using the same immortalization
technique used for error category objects.

Also ensure that init._M_init can still be read after init has been
destroyed, by making it a static data member.
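
The immortalization idea can be sketched in isolation as follows.
This is a simplified illustration, not the libstdc++ code: it omits
constexpr construction and the constant-initialization attribute.
Tracked, immortal, destroyed_count and the two helper functions are
hypothetical names.

```cpp
#include <cassert>

int destroyed_count = 0;  // incremented by every Tracked destructor run

struct Tracked {
  int value = 42;
  ~Tracked() { ++destroyed_count; }
};

// The object lives in an anonymous union, so the wrapper's destructor
// does not destroy it implicitly and its storage stays usable.
template <typename T>
struct immortal {
  union {
    unsigned char unused;
    T obj;
  };
  immortal() : obj() {}
  ~immortal() { /* intentionally empty: obj is never destroyed */ }
};

// Number of Tracked destructors run when a plain object leaves scope.
int destructors_run_for_plain() {
  int before = destroyed_count;
  {
    Tracked t;
    assert(t.value == 42);
  }
  return destroyed_count - before;
}

// The same measurement for an object wrapped in immortal<>.
int destructors_run_for_immortal() {
  int before = destroyed_count;
  {
    immortal<Tracked> t;
    assert(t.obj.value == 42);
  }
  return destroyed_count - before;
}
```

The plain object runs one destructor on scope exit and the wrapped
one runs none, which is the property the fallback buffer relies on
during program termination.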

libstdc++-v3/ChangeLog:

PR libstdc++/105880
* libsupc++/eh_globals.cc (eh_globals): Ensure constant init and
prevent destruction during termination.
(__eh_globals_init::_M_init): Replace with static member _S_init.
(__cxxabiv1::__cxa_get_globals_fast): Update.
(__cxxabiv1::__cxa_get_globals): Likewise.

(cherry picked from commit 1e65f2ed99024f23c56f7b6a961898bcaa882a92)
---

Would it be acceptable to backport this fix to GCC 12?

 libstdc++-v3/libsupc++/eh_globals.cc | 51 
 1 file changed, 37 insertions(+), 14 deletions(-)

diff --git a/libstdc++-v3/libsupc++/eh_globals.cc 
b/libstdc++-v3/libsupc++/eh_globals.cc
index 3a003b89edf..768425c0f40 100644
--- a/libstdc++-v3/libsupc++/eh_globals.cc
+++ b/libstdc++-v3/libsupc++/eh_globals.cc
@@ -64,8 +64,26 @@ __cxxabiv1::__cxa_get_globals() _GLIBCXX_NOTHROW
 
 #else
 
-// Single-threaded fallback buffer.
-static __cxa_eh_globals eh_globals;
+#if __has_cpp_attribute(clang::require_constant_initialization)
+#  define __constinit [[clang::require_constant_initialization]]
+#endif
+
+namespace
+{
+  struct constant_init
+  {
+union {
+  unsigned char unused;
+  __cxa_eh_globals obj;
+};
+constexpr constant_init() : obj() { }
+
+~constant_init() { /* do nothing, union member is not destroyed */ }
+  };
+
+  // Single-threaded fallback buffer.
+  __constinit constant_init eh_globals;
+}
 
 #if __GTHREADS
 
@@ -90,32 +108,37 @@ eh_globals_dtor(void* ptr)
 struct __eh_globals_init
 {
   __gthread_key_t  _M_key;
-  bool _M_init;
+  static bool  _S_init;
 
-  __eh_globals_init() : _M_init(false)
-  { 
+  __eh_globals_init()
+  {
 if (__gthread_active_p())
-  _M_init = __gthread_key_create(&_M_key, eh_globals_dtor) == 0; 
+  _S_init = __gthread_key_create(&_M_key, eh_globals_dtor) == 0;
   }
 
   ~__eh_globals_init()
   {
-if (_M_init)
+if (_S_init)
   __gthread_key_delete(_M_key);
-_M_init = false;
+_S_init = false;
   }
+
+  __eh_globals_init(const __eh_globals_init&) = delete;
+  __eh_globals_init& operator=(const __eh_globals_init&) = delete;
 };
 
+bool __eh_globals_init::_S_init = false;
+
 static __eh_globals_init init;
 
 extern "C" __cxa_eh_globals*
 __cxxabiv1::__cxa_get_globals_fast() _GLIBCXX_NOTHROW
 {
   __cxa_eh_globals* g;
-  if (init._M_init)
+  if (init._S_init)
 g = static_cast<__cxa_eh_globals*>(__gthread_getspecific(init._M_key));
   else
-g = &eh_globals;
+g = &eh_globals.obj;
   return g;
 }
 
@@ -123,7 +146,7 @@ extern "C" __cxa_eh_globals*
 __cxxabiv1::__cxa_get_globals() _GLIBCXX_NOTHROW
 {
   __cxa_eh_globals* g;
-  if (init._M_init)
+  if (init._S_init)
 {
   g = static_cast<__cxa_eh_globals*>(__gthread_getspecific(init._M_key));
   if (!g)
@@ -140,7 +163,7 @@ __cxxabiv1::__cxa_get_globals() _GLIBCXX_NOTHROW
}
 }
   else
-g = &eh_globals;
+g = &eh_globals.obj;
   return g;
 }
 
@@ -148,11 +171,11 @@ __cxxabiv1::__cxa_get_globals() _GLIBCXX_NOTHROW
 
 extern "C" __cxa_eh_globals*
 __cxxabiv1::__cxa_get_globals_fast() _GLIBCXX_NOTHROW
-{ return &eh_globals; }
+{ return &eh_globals.obj; }
 
 extern "C" __cxa_eh_globals*
 __cxxabiv1::__cxa_get_globals() _GLIBCXX_NOTHROW
-{ return &eh_globals; }
+{ return &eh_globals.obj; }
 
 #endif
 
-- 
2.35.3

[PATCH] docs: update abi version info

2022-07-21 Thread Kim Kuparinen via Gcc-patches

Synchronize gcc/common.opts and gcc/doc/invoke.texi w.r.t. -fabi-version, and
correct -fabi-compat-version from ABIv11 to ABIv13, since it was changed in
a37e8ce3b66325f0c6de55c80d50ac1664c3d0eb.

gcc/ChangeLog:

* doc/invoke.texi: update abi version info
---
 gcc/doc/invoke.texi | 11 +--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 94689be28..2bf1f3fd3 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -2926,10 +2926,17 @@ change in version 12.
 Version 14, which first appeared in G++ 10, corrects the mangling of
 the nullptr expression.
 
-Version 15, which first appeared in G++ 11, changes the mangling of
+Version 15, which first appeared in G++ 10.3, corrects G++ 10 ABI
+tag regression.
+
+Version 16, which first appeared in G++ 11, changes the mangling of
 @code{__alignof__} to be distinct from that of @code{alignof}, and
 dependent operator names.
 
+Version 17, which first appeared in G++ 12, fixes layout of classes
+that inherit from aggregate classes with default member initializers
+in C++14 and up.
+
 See also @option{-Wabi}.
 
 @item -fabi-compat-version=@var{n}
@@ -2939,7 +2946,7 @@ works around mangling changes by creating an alias with 
the correct
 mangled name when defining a symbol with an incorrect mangled name.
 This switch specifies which ABI version to use for the alias.
 
-With @option{-fabi-version=0} (the default), this defaults to 11 (GCC 7
+With @option{-fabi-version=0} (the default), this defaults to 13 (GCC 8.2
 compatibility).  If another ABI version is explicitly selected, this
 defaults to 0.  For compatibility with GCC versions 3.2 through 4.9,
 use @option{-fabi-compat-version=2}.
-- 
2.30.2

Re: [PATCH] RISC-V: Add RTX costs for `if_then_else' expressions

2022-07-21 Thread Kito Cheng via Gcc-patches

Hi Maciej:

LGTM, thanks for modeling this in the cost model!

On Tue, Jul 19, 2022 at 12:48 AM Maciej W. Rozycki  wrote:
>
> Fix a performance regression from commit 391500af1932 ("Do not ignore
> costs of jump insns in combine."), a part of the m68k series for MODE_CC
> conversion (),
> observed in soft-fp code in libgcc used by some of the embench-iot
> benchmarks.
>
> The immediate origin of the regression is the middle end, which in the
> absence of cost information from the backend estimates the cost of an
> RTL expression by assuming a single machine instruction for each of the
> expression's subexpressions.
>
> So for `if_then_else', which takes 3 operands, the estimated cost is 3
> instructions (i.e. 12 units) even though a branch instruction evaluates
> it in a single machine cycle (ignoring the cost of actually taking the
> branch of course, which is handled elsewhere).  Consequently an insn
> sequence like:
>
> (insn 595 594 596 43 (set (reg:DI 305)
> (lshiftrt:DI (reg/v:DI 160 [ R_f ])
> (const_int 55 [0x37]))) ".../libgcc/soft-fp/adddf3.c":46:3 216 
> {lshrdi3}
>  (nil))
> (insn 596 595 597 43 (set (reg:DI 304)
> (and:DI (reg:DI 305)
> (const_int 1 [0x1]))) ".../libgcc/soft-fp/adddf3.c":46:3 109 
> {anddi3}
>  (expr_list:REG_DEAD (reg:DI 305)
> (nil)))
> (jump_insn 597 596 598 43 (set (pc)
> (if_then_else (eq (reg:DI 304)
> (const_int 0 [0]))
> (label_ref:DI 1644)
> (pc))) ".../libgcc/soft-fp/adddf3.c":46:3 237 {*branchdi}
>  (expr_list:REG_DEAD (reg:DI 304)
> (int_list:REG_BR_PROB 536870916 (nil)))
>  -> 1644)
>
> no longer gets combined (as of the commit referred to) into:
>
> (note 595 594 596 43 NOTE_INSN_DELETED)
> (note 596 595 597 43 NOTE_INSN_DELETED)
> (jump_insn 597 596 598 43 (parallel [
> (set (pc)
> (if_then_else (eq (zero_extract:DI (reg/v:DI 160 [ R_f ])
> (const_int 1 [0x1])
> (const_int 55 [0x37]))
> (const_int 0 [0]))
> (label_ref:DI 1644)
> (pc)))
> (clobber (scratch:DI))
> ]) ".../libgcc/soft-fp/adddf3.c":46:3 243 {*branch_on_bitdi}
>  (int_list:REG_BR_PROB 536870916 (nil))
>  -> 1644)
>
> This is because the new cost is incorrectly calculated as 28 units while
> the cost of the original 3 instructions was 24:
>
> rejecting combination of insns 595, 596 and 597
> original costs 4 + 4 + 16 = 24
> replacement cost 28
>
> Before the commit referred to, the cost of a jump instruction was
> ignored and considered 0 (i.e. unknown), and a sequence of instructions
> of a known cost used to win:
>
> allowing combination of insns 595, 596 and 597
> original costs 4 + 4 + 0 = 0
> replacement cost 28
>
> Add the missing costs for the 3 variants of `if_then_else' expressions
> we currently define in the backend.
>
> With the fix in place the cost of this particular `if_then_else' pattern
> is 2 instructions or 8 units (because of the shift operation) and
> therefore the ultimate cost of the original 3 RTL insns will work out at
> 16 units (4 + 4 + 8), however the replacement single RTL insn will cost
> 8 units only.
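
The arithmetic in the quoted description can be checked with a small
model.  This is a simplified sketch of the accept/reject decision as
the logs above describe it, not the code in combine.  It assumes one
simple instruction costs 4 units and that a cost of 0 means unknown;
costs_n_insns and combination_allowed are hypothetical names.

```cpp
#include <cassert>
#include <initializer_list>

// Cost of n simple instructions, at 4 units per instruction.
constexpr int costs_n_insns(int n) { return n * 4; }

// Simplified decision: if any original cost is unknown (0) the
// comparison is skipped and the combination is allowed; otherwise
// the replacement must not cost more than the originals combined.
bool combination_allowed(std::initializer_list<int> original_costs,
                         int replacement_cost) {
  int total = 0;
  for (int cost : original_costs) {
    assert(cost >= 0);
    if (cost == 0)
      return true;  // unknown original cost: nothing to compare against
    total += cost;
  }
  return replacement_cost <= total;
}
```

With the figures from the logs, combination_allowed({4, 4, 16}, 28)
is rejected and combination_allowed({4, 4, 0}, 28) is allowed; with
the costs given for the fix, combination_allowed({4, 4, 8}, 8) is
allowed.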
>
> gcc/
> * config/riscv/riscv.cc (riscv_rtx_costs) : New
> case.
> ---
>  gcc/config/riscv/riscv.cc |   27 +++
>  1 file changed, 27 insertions(+)
>
> gcc-riscv-rtx-costs-if-then-else.diff
> Index: gcc/gcc/config/riscv/riscv.cc
> ===
> --- gcc.orig/gcc/config/riscv/riscv.cc
> +++ gcc/gcc/config/riscv/riscv.cc
> @@ -1853,6 +1853,33 @@ riscv_rtx_costs (rtx x, machine_mode mod
>/* Otherwise use the default handling.  */
>return false;
>
> +case IF_THEN_ELSE:
> +  if (TARGET_SFB_ALU
> + && register_operand (XEXP (x, 1), mode)
> + && sfb_alu_operand (XEXP (x, 2), mode)
> + && comparison_operator (XEXP (x, 0), VOIDmode))
> +   {
> + /* For predicated conditional-move operations we assume the cost
> +of a single instruction even though there are actually two.  */
> + *total = COSTS_N_INSNS (1);
> + return true;
> +   }
> +  else if (LABEL_REF_P (XEXP (x, 1)) && XEXP (x, 2) == pc_rtx)
> +   {
> + if (equality_operator (XEXP (x, 0), mode)
> + && GET_CODE (XEXP (XEXP (x, 0), 0)) == ZERO_EXTRACT)
> +   {
> + *total = COSTS_N_INSNS (SINGLE_SHIFT_COST + 1);
> + return true;
> +   }
> + if (order_operator (XEXP (x, 0), mode))
> +   {
> + *total = COSTS_N_INSNS (1);
> + return true;
> +   }
> +   }
> +  return false;
> +
>  case NOT:
>*total = COSTS_N_INSNS (GET_MODE_SIZE (mode) > UNITS_PER_WORD ? 2 : 1);
>retur

[PATCH] Teach VN about masked/len stores

2022-07-21 Thread Richard Biener via Gcc-patches

The following teaches VN to handle reads from .MASK_STORE and
.LEN_STORE.  For this push_partial_def is extended first for
convenience so we don't have to handle the full def case in the
caller (possibly other paths can be simplified then).  Also
the partial definition stored value can have an offset applied
so we don't have to build a fake RHS when we register the pieces
of an existing store.
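
To illustrate what reading from a masked store means at the lane
level, here is a minimal model.  It is not the push_partial_def
machinery: each lane is treated as an independent partial definition
that either provides a known value or nothing.
value_after_masked_store is a hypothetical name.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <optional>

// Value a later load of `lane` can take from a masked store, or
// nullopt when the lane was masked off and memory keeps its old,
// unknown contents.
template <std::size_t N>
std::optional<int> value_after_masked_store(
    const std::array<bool, N> &mask,
    const std::array<int, N> &stored, std::size_t lane) {
  assert(lane < N);
  if (mask[lane])
    return stored[lane];
  return std::nullopt;
}
```

With an alternating mask starting at false and stored values 0..7,
lane 7 yields 7 while lane 0 yields nothing, which mirrors the
out[7]/out[0] expectations in the testcase added by the patch.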

Bootstrapped and tested on x86_64-unknown-linux-gnu, Kewen is
going to test on powerpc.

I'm not sure whether it's worth it (or easily possible) to
handle .MASK_STORE_LANES.  I think handling the constant def case
might be possible, but since it has an intrinsic permute it might
make more sense to rewrite the constant def case into a .MASK_STORE?
(Does the mask apply to the destination memory order or the source
lane order?)

PR tree-optimization/106365
* tree-ssa-sccvn.cc (pd_data::rhs_off): New field determining
the offset to start encoding of RHS from.
(vn_walk_cb_data::vn_walk_cb_data): Initialize it.
(vn_walk_cb_data::push_partial_def): Allow the first partial
definition to be fully providing the def.  Offset RHS
before encoding if requested.
(vn_reference_lookup_3): Initialize def_rhs everywhere.
Add support for .MASK_STORE and .LEN_STORE (partial) definitions.

* gcc.target/i386/vec-maskstore-vn.c: New testcase.
---
 .../gcc.target/i386/vec-maskstore-vn.c|  30 +++
 gcc/tree-ssa-sccvn.cc | 255 ++
 2 files changed, 228 insertions(+), 57 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/vec-maskstore-vn.c

diff --git a/gcc/testsuite/gcc.target/i386/vec-maskstore-vn.c 
b/gcc/testsuite/gcc.target/i386/vec-maskstore-vn.c
new file mode 100644
index 000..98213905ece
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/vec-maskstore-vn.c
@@ -0,0 +1,30 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -mavx2 -fdump-tree-fre5" } */
+
+void __attribute__((noinline,noclone))
+foo (int *out, int *res)
+{
+  int mask[] = { 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1 };
+  int i;
+  for (i = 0; i < 16; ++i)
+{
+  if (mask[i])
+out[i] = i;
+}
+  int o0 = out[0];
+  int o7 = out[7];
+  int o14 = out[14];
+  int o15 = out[15];
+  res[0] = o0;
+  res[2] = o7;
+  res[4] = o14;
+  res[6] = o15;
+}
+
+/* Vectorization produces .MASK_STORE, unrolling will unroll the two
+   vector iterations.  FRE5 after that should be able to CSE
+   out[7] and out[15], but leave out[0] and out[14] alone.  */
+/* { dg-final { scan-tree-dump " = o0_\[0-9\]+;" "fre5" } } */
+/* { dg-final { scan-tree-dump " = 7;" "fre5" } } */
+/* { dg-final { scan-tree-dump " = o14_\[0-9\]+;" "fre5" } } */
+/* { dg-final { scan-tree-dump " = 15;" "fre5" } } */
diff --git a/gcc/tree-ssa-sccvn.cc b/gcc/tree-ssa-sccvn.cc
index f41d5031365..7d947b55a27 100644
--- a/gcc/tree-ssa-sccvn.cc
+++ b/gcc/tree-ssa-sccvn.cc
@@ -1790,6 +1790,7 @@ struct pd_range
 struct pd_data
 {
   tree rhs;
+  HOST_WIDE_INT rhs_off;
   HOST_WIDE_INT offset;
   HOST_WIDE_INT size;
 };
@@ -1816,6 +1817,7 @@ struct vn_walk_cb_data
unsigned int pos = 0, prec = w.get_precision ();
pd_data pd;
pd.rhs = build_constructor (NULL_TREE, NULL);
+   pd.rhs_off = 0;
/* When bitwise and with a constant is done on a memory load,
   we don't really need all the bits to be defined or defined
   to constants, we don't really care what is in the position
@@ -1976,6 +1978,7 @@ vn_walk_cb_data::push_partial_def (pd_data pd,
 
   bool pd_constant_p = (TREE_CODE (pd.rhs) == CONSTRUCTOR
|| CONSTANT_CLASS_P (pd.rhs));
+  pd_range *r;
   if (partial_defs.is_empty ())
 {
   /* If we get a clobber upfront, fail.  */
@@ -1989,65 +1992,70 @@ vn_walk_cb_data::push_partial_def (pd_data pd,
   first_set = set;
   first_base_set = base_set;
   last_vuse_ptr = NULL;
-  /* Continue looking for partial defs.  */
-  return NULL;
-}
-
-  if (!known_ranges)
-{
-  /* ???  Optimize the case where the 2nd partial def completes things.  */
-  gcc_obstack_init (&ranges_obstack);
-  known_ranges = splay_tree_new_with_allocator (pd_range_compare, 0, 0,
-   pd_tree_alloc,
-   pd_tree_dealloc, this);
-  splay_tree_insert (known_ranges,
-(splay_tree_key)&first_range.offset,
-(splay_tree_value)&first_range);
-}
-
-  pd_range newr = { pd.offset, pd.size };
-  splay_tree_node n;
-  pd_range *r;
-  /* Lookup the predecessor of offset + 1 and see if we need to merge.  */
-  HOST_WIDE_INT loffset = newr.offset + 1;
-  if ((n = splay_tree_predecessor (known_ranges, (splay_tree_key)&loffset))
-  && ((r = (pd_range *)n->value), true)
-  && ranges_known_overlap_p (r->offset, r->size + 1,
-

Re: [PATCH 1/1 V5] RISC-V: Support Zmmul extension

2022-07-21 Thread Kito Cheng via Gcc-patches

LGTM, will merge once the binutils part is merged.

On Wed, Jul 13, 2022 at 10:14 AM  wrote:
>
> From: LiaoShihua 
>
> gcc/ChangeLog:
>
> * common/config/riscv/riscv-common.cc: Add Zmmul.
> * config/riscv/riscv-opts.h (MASK_ZMMUL): New.
> (TARGET_ZMMUL): Ditto.
> * config/riscv/riscv.cc (riscv_option_override):Ditto.
> * config/riscv/riscv.md: Add Zmmul
> * config/riscv/riscv.opt: Ditto.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/riscv/zmmul-1.c: New test.
> * gcc.target/riscv/zmmul-2.c: New test.
>
> ---
>  gcc/common/config/riscv/riscv-common.cc  |  3 +++
>  gcc/config/riscv/riscv-opts.h|  3 +++
>  gcc/config/riscv/riscv.cc|  8 +--
>  gcc/config/riscv/riscv.md| 28 
>  gcc/config/riscv/riscv.opt   |  3 +++
>  gcc/testsuite/gcc.target/riscv/zmmul-1.c | 20 +
>  gcc/testsuite/gcc.target/riscv/zmmul-2.c | 20 +
>  7 files changed, 69 insertions(+), 16 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/riscv/zmmul-1.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/zmmul-2.c
>
> diff --git a/gcc/common/config/riscv/riscv-common.cc 
> b/gcc/common/config/riscv/riscv-common.cc
> index 0e5be2ce105..20acc590b30 100644
> --- a/gcc/common/config/riscv/riscv-common.cc
> +++ b/gcc/common/config/riscv/riscv-common.cc
> @@ -193,6 +193,8 @@ static const struct riscv_ext_version 
> riscv_ext_version_table[] =
>{"zvl32768b", ISA_SPEC_CLASS_NONE, 1, 0},
>{"zvl65536b", ISA_SPEC_CLASS_NONE, 1, 0},
>
> +  {"zmmul", ISA_SPEC_CLASS_NONE, 1, 0},
> +
>/* Terminate the list.  */
>{NULL, ISA_SPEC_CLASS_NONE, 0, 0}
>  };
> @@ -1148,6 +1150,7 @@ static const riscv_ext_flag_table_t 
> riscv_ext_flag_table[] =
>{"zvl32768b", &gcc_options::x_riscv_zvl_flags, MASK_ZVL32768B},
>{"zvl65536b", &gcc_options::x_riscv_zvl_flags, MASK_ZVL65536B},
>
> +  {"zmmul", &gcc_options::x_riscv_zm_subext, MASK_ZMMUL},
>
>{NULL, NULL, 0}
>  };
> diff --git a/gcc/config/riscv/riscv-opts.h b/gcc/config/riscv/riscv-opts.h
> index 1e153b3a6e7..9c7d69a6ea3 100644
> --- a/gcc/config/riscv/riscv-opts.h
> +++ b/gcc/config/riscv/riscv-opts.h
> @@ -153,6 +153,9 @@ enum stack_protector_guard {
>  #define TARGET_ZICBOM ((riscv_zicmo_subext & MASK_ZICBOM) != 0)
>  #define TARGET_ZICBOP ((riscv_zicmo_subext & MASK_ZICBOP) != 0)
>
> +#define MASK_ZMMUL  (1 << 0)
> +#define TARGET_ZMMUL((riscv_zm_subext & MASK_ZMMUL) != 0)
> +
>  /* Bit of riscv_zvl_flags will set contintuly, N-1 bit will set if N-bit is
> set, e.g. MASK_ZVL64B has set then MASK_ZVL32B is set, so we can use
> popcount to caclulate the minimal VLEN.  */
> diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
> index 2e83ca07394..9ad4181f35f 100644
> --- a/gcc/config/riscv/riscv.cc
> +++ b/gcc/config/riscv/riscv.cc
> @@ -4999,10 +4999,14 @@ riscv_option_override (void)
>/* The presence of the M extension implies that division instructions
>   are present, so include them unless explicitly disabled.  */
>if (TARGET_MUL && (target_flags_explicit & MASK_DIV) == 0)
> -target_flags |= MASK_DIV;
> +if(!TARGET_ZMMUL)
> +  target_flags |= MASK_DIV;
>else if (!TARGET_MUL && TARGET_DIV)
>  error ("%<-mdiv%> requires %<-march%> to subsume the % extension");
> -
> +
> +  if(TARGET_ZMMUL && !TARGET_MUL && TARGET_DIV)
> +warning (0, "%<-mdiv%> cannot be used when % extension is 
> present");
> +
>/* Likewise floating-point division and square root.  */
>if (TARGET_HARD_FLOAT && (target_flags_explicit & MASK_FDIV) == 0)
>  target_flags |= MASK_FDIV;
> diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
> index 308b64dd30d..d4e171464ea 100644
> --- a/gcc/config/riscv/riscv.md
> +++ b/gcc/config/riscv/riscv.md
> @@ -763,7 +763,7 @@
>[(set (match_operand:SI  0 "register_operand" "=r")
> (mult:SI (match_operand:SI 1 "register_operand" " r")
>  (match_operand:SI 2 "register_operand" " r")))]
> -  "TARGET_MUL"
> +  "TARGET_ZMMUL || TARGET_MUL"
>{ return TARGET_64BIT ? "mulw\t%0,%1,%2" : "mul\t%0,%1,%2"; }
>[(set_attr "type" "imul")
> (set_attr "mode" "SI")])
> @@ -772,7 +772,7 @@
>[(set (match_operand:DI  0 "register_operand" "=r")
> (mult:DI (match_operand:DI 1 "register_operand" " r")
>  (match_operand:DI 2 "register_operand" " r")))]
> -  "TARGET_MUL && TARGET_64BIT"
> +  "TARGET_ZMMUL || TARGET_MUL && TARGET_64BIT"
>"mul\t%0,%1,%2"
>[(set_attr "type" "imul")
> (set_attr "mode" "DI")])
> @@ -782,7 +782,7 @@
> (mult:GPR (match_operand:GPR 1 "register_operand" " r")
>   (match_operand:GPR 2 "register_operand" " r")))
> (label_ref (match_operand 3 "" ""))]
> -  "TARGET_MUL"
> +  "TARGET_ZMMUL || TARGET_MUL"
>  {
>if (TARGET_64BIT && mode == SImode)
>  {
> @@ -827,7 +827,7 @@
> (mult:GPR (m

[PATCH 8/12 V3] arm: Introduce multilibs for PACBTI target feature

2022-07-21 Thread Andrea Corallo via Gcc-patches

Richard Earnshaw  writes:

[...]

>> The documentation mentions -mbranch-protection=standard+leaf, so
>> you're missing a mapping for that.
>> OK with that change.
>> R.
>
> Oh, and please add some tests to gcc/testsuite/gcc.target/arm/multilib.exp
>
> R.

Hi Richard,

thanks, here is the updated patch.

PS I've also added three mlibarch -> march matches that were missing.

BR

  Andrea

>From bbd0efb375c08981be7632319b24830196429e9b Mon Sep 17 00:00:00 2001
From: Andrea Corallo 
Date: Mon, 6 Dec 2021 11:42:59 +0100
Subject: [PATCH] [PATCH 8/12] arm: Introduce multilibs for PACBTI target
 feature

This patch adds the following new multilibs:

thumb/v8.1-m.main+pacbti/mbranch-protection/nofp
thumb/v8.1-m.main+pacbti+dp/mbranch-protection/soft
thumb/v8.1-m.main+pacbti+dp/mbranch-protection/hard
thumb/v8.1-m.main+pacbti+fp/mbranch-protection/soft
thumb/v8.1-m.main+pacbti+fp/mbranch-protection/hard
thumb/v8.1-m.main+pacbti+mve/mbranch-protection/hard

Triggering the following compiler flags:

-mthumb -march=armv8.1-m.main+pacbti -mbranch-protection=standard 
-mfloat-abi=soft
-mthumb -march=armv8.1-m.main+pacbti+fp -mbranch-protection=standard 
-mfloat-abi=softfp
-mthumb -march=armv8.1-m.main+pacbti+fp -mbranch-protection=standard 
-mfloat-abi=hard
-mthumb -march=armv8.1-m.main+pacbti+fp.dp -mbranch-protection=standard 
-mfloat-abi=softfp
-mthumb -march=armv8.1-m.main+pacbti+fp.dp -mbranch-protection=standard 
-mfloat-abi=hard
-mthumb -march=armv8.1-m.main+pacbti+mve -mbranch-protection=standard 
-mfloat-abi=hard

gcc/

* config/arm/t-rmprofile: Add multilib rules for march +pacbti
  and mbranch-protection.

gcc/testsuite/

* gcc.target/arm/multilib.exp: Add pacbti related entries.
---
 gcc/config/arm/t-rmprofile| 29 +--
 gcc/testsuite/gcc.target/arm/multilib.exp |  6 +
 2 files changed, 33 insertions(+), 2 deletions(-)

diff --git a/gcc/config/arm/t-rmprofile b/gcc/config/arm/t-rmprofile
index eb321e832f1..c50bf4b3557 100644
--- a/gcc/config/arm/t-rmprofile
+++ b/gcc/config/arm/t-rmprofile
@@ -27,8 +27,11 @@
 
 # Arch and FPU variants to build libraries with
 
-MULTI_ARCH_OPTS_RM = 
march=armv6s-m/march=armv7-m/march=armv7e-m/march=armv7e-m+fp/march=armv7e-m+fp.dp/march=armv8-m.base/march=armv8-m.main/march=armv8-m.main+fp/march=armv8-m.main+fp.dp/march=armv8.1-m.main+mve
-MULTI_ARCH_DIRS_RM = v6-m v7-m v7e-m v7e-m+fp v7e-m+dp v8-m.base v8-m.main 
v8-m.main+fp v8-m.main+dp v8.1-m.main+mve
+MULTI_ARCH_OPTS_RM = 
march=armv6s-m/march=armv7-m/march=armv7e-m/march=armv7e-m+fp/march=armv7e-m+fp.dp/march=armv8-m.base/march=armv8-m.main/march=armv8-m.main+fp/march=armv8-m.main+fp.dp/march=armv8.1-m.main+mve/march=armv8.1-m.main+pacbti/march=armv8.1-m.main+pacbti+fp/march=armv8.1-m.main+pacbti+fp.dp/march=armv8.1-m.main+pacbti+mve
+MULTI_ARCH_DIRS_RM = v6-m v7-m v7e-m v7e-m+fp v7e-m+dp v8-m.base v8-m.main 
v8-m.main+fp v8-m.main+dp v8.1-m.main+mve v8.1-m.main+pacbti 
v8.1-m.main+pacbti+fp v8.1-m.main+pacbti+dp v8.1-m.main+pacbti+mve
+
+MULTI_ARCH_OPTS_RM += mbranch-protection=standard
+MULTI_ARCH_DIRS_RM += mbranch-protection
 
 # Base M-profile (no fp)
 MULTILIB_REQUIRED  += mthumb/march=armv6s-m/mfloat-abi=soft
@@ -50,6 +53,14 @@ MULTILIB_REQUIRED+= 
mthumb/march=armv8-m.main+fp.dp/mfloat-abi=hard
 MULTILIB_REQUIRED  += mthumb/march=armv8-m.main+fp.dp/mfloat-abi=softfp
 MULTILIB_REQUIRED  += mthumb/march=armv8.1-m.main+mve/mfloat-abi=hard
 
+MULTILIB_REQUIRED  += 
mthumb/march=armv8.1-m.main+pacbti/mbranch-protection=standard/mfloat-abi=soft
+MULTILIB_REQUIRED  += 
mthumb/march=armv8.1-m.main+pacbti+fp/mbranch-protection=standard/mfloat-abi=softfp
+MULTILIB_REQUIRED  += 
mthumb/march=armv8.1-m.main+pacbti+fp/mbranch-protection=standard/mfloat-abi=hard
+MULTILIB_REQUIRED  += 
mthumb/march=armv8.1-m.main+pacbti+fp.dp/mbranch-protection=standard/mfloat-abi=softfp
+MULTILIB_REQUIRED  += 
mthumb/march=armv8.1-m.main+pacbti+fp.dp/mbranch-protection=standard/mfloat-abi=hard
+MULTILIB_REQUIRED  += 
mthumb/march=armv8.1-m.main+pacbti+mve/mbranch-protection=standard/mfloat-abi=hard
+
+
 # Arch Matches
 MULTILIB_MATCHES   += march?armv6s-m=march?armv6-m
 
@@ -87,9 +98,23 @@ MULTILIB_MATCHES += $(foreach FP, $(v8_1m_sp_variants), \
 MULTILIB_MATCHES += $(foreach FP, $(v8_1m_dp_variants), \
 
march?armv8-m.main+fp.dp=mlibarch?armv8.1-m.main$(FP))
 
+# Map all mbranch-protection values other than 'none' to 'standard'.
+MULTILIB_MATCHES   += mbranch-protection?standard=mbranch-protection?bti
+MULTILIB_MATCHES   += mbranch-protection?standard=mbranch-protection?pac-ret
+MULTILIB_MATCHES   += mbranch-protection?standard=mbranch-protection?pac-ret+leaf
+MULTILIB_MATCHES   += mbranch-protection?standard=mbranch-protection?pac-ret+bti
+MULTILIB_MATCHES   += mbranch-protection?standard=mbranch-protection?pac-ret+leaf+bti
+MULTILIB_MATCHES   += mbranch-pro

[PATCH] tree-optimization/106365 - DSE of LEN_STORE and MASK_STORE

2022-07-21 Thread Richard Biener via Gcc-patches

The following enhances DSE to handle LEN_STORE (optimally) and
MASK_STORE (conservatively).

Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.
Kewen is testing on powerpc.  Handling MASK_STORE_LANES in
a similar way to MASK_STORE is probably possible but I couldn't
figure out a way to generate one for testing.  STORE_LANES is
probably handled already since it's ECF_CONST.
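
To illustrate the must-def versus may-def distinction, here is a minimal
C model.  It is not GCC code: the struct and all helper names are
hypothetical, and it assumes no read of the range happens between the
two stores.  A LEN_STORE writes a known number of bytes, so its extent
is exact; a MASK_STORE may write any subset of the vector, so only the
full vector size is a safe upper bound.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a store's byte extent; not a GCC data structure.  */
struct extent
{
  size_t offset;   /* first byte possibly written */
  size_t size;     /* number of bytes possibly written */
  bool must_def;   /* true if every byte in the range is written */
};

/* LEN_STORE writes exactly LEN - BIAS bytes starting at OFFSET.  */
static struct extent
len_store_extent (size_t offset, size_t len, size_t bias)
{
  struct extent e = { offset, len - bias, true };
  return e;
}

/* MASK_STORE writes some subset of a VECSIZE-byte vector: may-def only.  */
static struct extent
mask_store_extent (size_t offset, size_t vecsize)
{
  struct extent e = { offset, vecsize, false };
  return e;
}

/* EARLIER is dead if LATER definitely writes every byte EARLIER may
   write (assuming nothing reads the range in between).  */
static bool
killed_by (struct extent earlier, struct extent later)
{
  return later.must_def
	 && later.offset <= earlier.offset
	 && earlier.offset + earlier.size <= later.offset + later.size;
}

int
dse_model_self_check (void)
{
  struct extent masked = mask_store_extent (0, 16);
  struct extent full = len_store_extent (0, 16, 0);
  struct extent part = len_store_extent (0, 8, 0);
  assert (killed_by (masked, full));   /* full overwrite kills the masked store */
  assert (!killed_by (masked, part));  /* a partial overwrite does not */
  assert (!killed_by (full, masked));  /* a may-def cannot kill anything */
  return 1;
}
```

The conservative extent is safe for the store being removed because a
later store has to cover all of it; it would not be safe for the store
doing the killing.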

PR tree-optimization/106365
* tree-ssa-dse.cc (initialize_ao_ref_for_dse): Handle
LEN_STORE, add mode to initialize a may-def and handle
MASK_STORE that way.
(dse_optimize_stmt): Query may-defs.  Handle internal
functions LEN_STORE and MASK_STORE similar to how
we handle memory builtins but without byte tracking.
---
 gcc/tree-ssa-dse.cc | 55 +
 1 file changed, 51 insertions(+), 4 deletions(-)

diff --git a/gcc/tree-ssa-dse.cc b/gcc/tree-ssa-dse.cc
index 8d1739a4510..34cfd1a8802 100644
--- a/gcc/tree-ssa-dse.cc
+++ b/gcc/tree-ssa-dse.cc
@@ -93,7 +93,9 @@ static bitmap need_eh_cleanup;
 static bitmap need_ab_cleanup;
 
 /* STMT is a statement that may write into memory.  Analyze it and
-   initialize WRITE to describe how STMT affects memory.
+   initialize WRITE to describe how STMT affects memory.  When
+   MAY_DEF_OK is true then the function initializes WRITE to what
+   the stmt may define.
 
Return TRUE if the statement was analyzed, FALSE otherwise.
 
@@ -101,7 +103,7 @@ static bitmap need_ab_cleanup;
can be achieved by analyzing more statements.  */
 
 static bool
-initialize_ao_ref_for_dse (gimple *stmt, ao_ref *write)
+initialize_ao_ref_for_dse (gimple *stmt, ao_ref *write, bool may_def_ok = false)
 {
   /* It's advantageous to handle certain mem* functions.  */
   if (gimple_call_builtin_p (stmt, BUILT_IN_NORMAL))
@@ -146,6 +148,32 @@ initialize_ao_ref_for_dse (gimple *stmt, ao_ref *write)
  break;
}
 }
+  else if (is_gimple_call (stmt)
+  && gimple_call_internal_p (stmt))
+{
+  switch (gimple_call_internal_fn (stmt))
+   {
+   case IFN_LEN_STORE:
+ ao_ref_init_from_ptr_and_size
+ (write, gimple_call_arg (stmt, 0),
+  int_const_binop (MINUS_EXPR,
+   gimple_call_arg (stmt, 2),
+   gimple_call_arg (stmt, 4)));
+ return true;
+   case IFN_MASK_STORE:
+ /* We cannot initialize a must-def ao_ref (in all cases) but we
+can provide a may-def variant.  */
+ if (may_def_ok)
+   {
+ ao_ref_init_from_ptr_and_size
+ (write, gimple_call_arg (stmt, 0),
+ TYPE_SIZE_UNIT (TREE_TYPE (gimple_call_arg (stmt, 2))));
+ return true;
+   }
+ break;
+   default:;
+   }
+}
   else if (tree lhs = gimple_get_lhs (stmt))
 {
   if (TREE_CODE (lhs) != SSA_NAME)
@@ -1328,8 +1356,10 @@ dse_optimize_stmt (function *fun, gimple_stmt_iterator *gsi, sbitmap live_bytes)
 
   ao_ref ref;
   /* If this is not a store we can still remove dead call using
- modref summary.  */
-  if (!initialize_ao_ref_for_dse (stmt, &ref))
+ modref summary.  Note we specifically allow ref to be initialized
+ to a conservative may-def since we are looking for followup stores
+ to kill all of it.  */
+  if (!initialize_ao_ref_for_dse (stmt, &ref, true))
 {
   dse_optimize_call (gsi, live_bytes);
   return;
@@ -1398,6 +1428,23 @@ dse_optimize_stmt (function *fun, gimple_stmt_iterator *gsi, sbitmap live_bytes)
  return;
}
 }
+  else if (is_gimple_call (stmt)
+  && gimple_call_internal_p (stmt))
+{
+  switch (gimple_call_internal_fn (stmt))
+   {
+   case IFN_LEN_STORE:
+   case IFN_MASK_STORE:
+ {
+   enum dse_store_status store_status;
+   store_status = dse_classify_store (&ref, stmt, false, live_bytes);
+   if (store_status == DSE_STORE_DEAD)
+ delete_dead_or_redundant_call (gsi, "dead");
+   return;
+ }
+   default:;
+   }
+}
 
   bool by_clobber_p = false;
 
-- 
2.35.3

[committed] MAINTAINERS: Add myself as Ada front end co-maintainer

2022-07-21 Thread Marc Poulhiès via Gcc-patches

Add myself as Ada front end co-maintainer.

Committed as f4ed610d02aaf8cfcdcb5cf03e0cde65f1f5f890.

ChangeLog:
* MAINTAINERS: Add myself as Ada front end co-maintainer.
---
 MAINTAINERS | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index 7a7ad42ced3..e2db0cfe18b 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -157,6 +157,7 @@ cygwin, mingw-w64   Jonathan Yong   
<10wa...@gmail.com>
 C front end/ISO C99Joseph Myers
 Ada front end  Arnaud Charlet  
 Ada front end  Eric Botcazou   
+Ada front end  Marc Poulhiès   
 Ada front end  Pierre-Marie de Rodat   
 c++Jason Merrill   
 c++Nathan Sidwell  
@@ -581,7 +582,6 @@ Nicolas Pitre   

 Michael Ploujnikov 
 Paul Pluzhnikov
 Antoniu Pop
-Marc Poulhiès  
 Siddhesh Poyarekar 
 Vidya Praveen  
 Thomas Preud'homme 
-- 
2.25.1

[PATCH 9/12 V2] arm: Make libgcc bti compatible

2022-07-21 Thread Andrea Corallo via Gcc-patches

Richard Earnshaw  writes:

> On 28/04/2022 10:48, Andrea Corallo via Gcc-patches wrote:
>> This change adds bti instructions at the beginning of arm-specific
>> libgcc hand-written assembly routines.
>> 2022-03-31  Andrea Corallo  
>>  * libgcc/config/arm/crti.S (FUNC_START): Add bti instruction if
>>  necessary.
>>  * libgcc/config/arm/lib1funcs.S (THUMB_FUNC_START, FUNC_START):
>>  Likewise.
>> 
>
> +#if defined(__ARM_FEATURE_BTI)
>
> Wouldn't it be better to use __ARM_FEATURE_BTI_DEFAULT?  That way we
> only get BTI instructions in multilib variants that have asked for
> BTI.
>
> R.

Hi Richard,

good point, yes I think so.

Please find attached the updated patch.

BR

  Andrea

>From 6975c9ddbc8a4b790a765589c6fd07fea92173e5 Mon Sep 17 00:00:00 2001
From: Andrea Corallo 
Date: Tue, 8 Feb 2022 10:58:31 +0100
Subject: [PATCH] [PATCH 9/12] arm: Make libgcc bti compatible

This change adds bti instructions at the beginning of arm-specific
libgcc hand-written assembly routines.

2022-03-31  Andrea Corallo  

* libgcc/config/arm/crti.S (FUNC_START): Add bti instruction if
necessary.
* libgcc/config/arm/lib1funcs.S (THUMB_FUNC_START, FUNC_START):
Likewise.
---
 libgcc/config/arm/crti.S  | 4 +++-
 libgcc/config/arm/lib1funcs.S | 6 ++
 2 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/libgcc/config/arm/crti.S b/libgcc/config/arm/crti.S
index 0192972a7e6..4098353af1c 100644
--- a/libgcc/config/arm/crti.S
+++ b/libgcc/config/arm/crti.S
@@ -51,7 +51,9 @@
 .macro FUNC_START
 #ifdef __thumb__
.thumb
-   
+#if defined(__ARM_FEATURE_BTI_DEFAULT)
+   bti
+#endif
push{r3, r4, r5, r6, r7, lr}
 #else
.arm
diff --git a/libgcc/config/arm/lib1funcs.S b/libgcc/config/arm/lib1funcs.S
index 8c39c9f20a2..de98edcc300 100644
--- a/libgcc/config/arm/lib1funcs.S
+++ b/libgcc/config/arm/lib1funcs.S
@@ -345,6 +345,9 @@ LSYM(Ldiv0):
TYPE(\name)
.thumb_func
 SYM (\name):
+#if defined(__ARM_FEATURE_BTI_DEFAULT)
+   bti
+#endif
 .endm
 
 /* Function start macros.  Variants for ARM and Thumb.  */
@@ -372,6 +375,9 @@ SYM (\name):
THUMB_FUNC
THUMB_SYNTAX
 SYM (__\name):
+#if defined(__ARM_FEATURE_BTI_DEFAULT)
+   bti
+#endif
 .endm
 
 .macro ARM_SYM_START name
-- 
2.25.1

Re: [PATCH v3] RISC-V/testsuite: constraint some of tests to hard_float

2022-07-21 Thread Kito Cheng via Gcc-patches

> On 5/29/22 20:50, Kito Cheng via Gcc-patches wrote:
> > Committed, thanks!
>
> Can this be backported to gcc-12 please.

I'd like to say yes, but 9ddd44b58649d1d ("RISC-V: Provide `fmin'/`fmax'
RTL pattern") only exists on the trunk, and the
gcc.target/riscv/pr105666.c part was already fixed [1] during the backport.
Anyway, that reminds me that there is another patch for the testcase [2],
and a backport of that to the gcc 12 branch.

[1] https://gcc.gnu.org/git/?p=gcc.git;a=commitdiff;h=e919fae01b89fa6d7fc742d140bb15dc2600becb;hp=682d238f32a2aca993747f9c1ddf2b6f4c5fb536
[2] https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=b18e5d7e5f9df69759f0fbc2bed91d5e51313e79

Re: [PATCH v1 1/3] RISC-V: Split "(a & (1 << BIT_NO)) ? 0 : -1" to bexti + addi

2022-07-21 Thread Kito Cheng via Gcc-patches

Hi Philipp:

This patch series LGTM, but please introduce a new pseudo when
can_create_pseudo_p, as we discussed in
https://gcc.gnu.org/pipermail/gcc-patches/2022-June/596305.html. You
can commit with that change and send a [committed] patch mail :)

On Thu, Jun 16, 2022 at 5:32 PM Philipp Tomsich
 wrote:
>
> Kito,
>
> Looks like this series fell by the wayside (possibly, because it
> didn't have a cover-letter and was easier to miss)?
>
> Thanks,
> Philipp.
>
> On Wed, 25 May 2022 at 00:52, Philipp Tomsich  
> wrote:
> >
> > Consider creating a polarity-reversed mask from a set-bit (i.e., if
> > the bit is set, produce all-zeros; otherwise: all-ones).  Using Zbs,
> > this can be expressed as bexti, followed by an addi of minus-one.  To
> > enable the combiner to discover this opportunity, we need to split the
> > canonical expression for "(a & (1 << BIT_NO)) ? 0 : -1" into a form
> > combinable into bexti.
> >
> > Consider the function:
> > long f(long a)
> > {
> >   return (a & (1 << BIT_NO)) ? 0 : -1;
> > }
> > This produces the following sequence prior to this change:
> > andi    a0,a0,16
> > seqz    a0,a0
> > neg     a0,a0
> > ret
> > Following this change, it results in:
> > bexti   a0,a0,4
> > addi    a0,a0,-1
> > ret
> >
> > Signed-off-by: Philipp Tomsich 
> >
> > gcc/ChangeLog:
> >
> > * config/riscv/bitmanip.md: Add a splitter to generate
> >   polarity-reversed masks from a set bit using bexti + addi.
> >
> > gcc/testsuite/ChangeLog:
> >
> > * gcc.target/riscv/zbs-bexti.c: New test.
> >
> > ---
> >
> >  gcc/config/riscv/bitmanip.md   | 13 +
> >  gcc/testsuite/gcc.target/riscv/zbs-bexti.c | 14 ++
> >  2 files changed, 27 insertions(+)
> >  create mode 100644 gcc/testsuite/gcc.target/riscv/zbs-bexti.c
> >
> > diff --git a/gcc/config/riscv/bitmanip.md b/gcc/config/riscv/bitmanip.md
> > index 0ab9ffe3c0b..ea5dea13cfb 100644
> > --- a/gcc/config/riscv/bitmanip.md
> > +++ b/gcc/config/riscv/bitmanip.md
> > @@ -340,3 +340,16 @@ (define_insn "*bexti"
> >"TARGET_ZBS"
> >"bexti\t%0,%1,%2"
> >[(set_attr "type" "bitmanip")])
> > +
> > +;; We can create a polarity-reversed mask (i.e. bit N -> { set = 0, clear = -1 })
> > +;; using a bext(i) followed by an addi instruction.
> > +;; This splits the canonical representation of "(a & (1 << BIT_NO)) ? 0 : -1".
> > +(define_split
> > +  [(set (match_operand:GPR 0 "register_operand")
> > +       (neg:GPR (eq:GPR (zero_extract:GPR (match_operand:GPR 1 "register_operand")
> > +                                          (const_int 1)
> > +                                          (match_operand 2))
> > +                        (const_int 0))))]
> > +  "TARGET_ZBS"
> > +  [(set (match_dup 0) (zero_extract:GPR (match_dup 1) (const_int 1) (match_dup 2)))
> > +   (set (match_dup 0) (plus:GPR (match_dup 0) (const_int -1)))])
> > diff --git a/gcc/testsuite/gcc.target/riscv/zbs-bexti.c b/gcc/testsuite/gcc.target/riscv/zbs-bexti.c
> > new file mode 100644
> > index 000..99e3b58309c
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/riscv/zbs-bexti.c
> > @@ -0,0 +1,14 @@
> > +/* { dg-do compile } */
> > +/* { dg-options "-march=rv64gc_zbs -mabi=lp64 -O2" } */
> > +
> > +/* bexti */
> > +#define BIT_NO  4
> > +
> > +long
> > +foo0 (long a)
> > +{
> > +  return (a & (1 << BIT_NO)) ? 0 : -1;
> > +}
> > +
> > +/* { dg-final { scan-assembler "bexti" } } */
> > +/* { dg-final { scan-assembler "addi" } } */
> > --
> > 2.34.1
> >
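
The identity that the quoted splitter relies on can be checked in plain
C.  The following is a minimal sketch rather than anything from the
patch: the function names are made up for illustration, and the
shift-and-mask stands in for what bexti computes.

```c
#include <assert.h>

/* Reference form: 0 if bit N of A is set, -1 (all-ones) otherwise.  */
long
mask_ref (long a, unsigned n)
{
  return (a & (1L << n)) ? 0 : -1;
}

/* Extract bit N (the bexti step), then add -1 (the addi step).  */
long
mask_bext_addi (long a, unsigned n)
{
  long bit = (long) (((unsigned long) a >> n) & 1UL);
  return bit - 1;
}

int
bexti_self_check (void)
{
  const long samples[] = { 0, 1, 16, 17, -1, -17, 0x7fffffffL };
  for (unsigned i = 0; i < sizeof samples / sizeof samples[0]; i++)
    for (unsigned n = 0; n < 31; n++)
      assert (mask_ref (samples[i], n) == mask_bext_addi (samples[i], n));
  return 1;
}
```

The extracted bit is 0 or 1, so subtracting one yields -1 or 0, which is
the polarity-reversed mask.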

Re: [PATCH] Teach VN about masked/len stores

2022-07-21 Thread Kewen.Lin via Gcc-patches

Hi Richi,

on 2022/7/21 17:01, Richard Biener via Gcc-patches wrote:
> The following teaches VN to handle reads from .MASK_STORE and
> .LEN_STORE.  For this push_partial_def is extended first for
> convenience so we don't have to handle the full def case in the
> caller (possibly other paths can be simplified then).  Also
> the partial definition stored value can have an offset applied
> so we don't have to build a fake RHS when we register the pieces
> of an existing store.
> 
> Bootstrapped and tested on x86_64-unknown-linux-gnu, Kewen is
> going to test on powerpc.
I tested these three related patches together:
  - Add alias disambiguation for vectorizer load/store IFNs
  - Teach VN about masked/len stores
  - tree-optimization/106365 - DSE of LEN_STORE and MASK_STORE

and confirmed that they were bootstrapped and regtested on
powerpc64le-linux-gnu Power10 (with vector-with-length support).

BR,
Kewen

Re: [PATCH] libsanitizer: cherry-pick 9cf13067cb5088626ba7 from upstream

2022-07-21 Thread Richard Biener via Gcc-patches

On Mon, Jul 11, 2022 at 10:05 PM Martin Liška  wrote:
>
> I'm going to push the following cherry-pick which fixes libasan
> build with top-of-tree glibc.

Can you also push this to active branches please?

> Martin
>
> 9cf13067cb5088626ba7ee1ec4c42ec59c7995a0 [sanitizer] Remove #include 
>  to resolve fsconfig_command/mount_attr conflict with glibc 2.36
> ---
>  .../sanitizer_platform_limits_posix.cpp| 10 ++
>  1 file changed, 6 insertions(+), 4 deletions(-)
>
> diff --git 
> a/libsanitizer/sanitizer_common/sanitizer_platform_limits_posix.cpp 
> b/libsanitizer/sanitizer_common/sanitizer_platform_limits_posix.cpp
> index 8ed3e92d270..97fd07acf9d 100644
> --- a/libsanitizer/sanitizer_common/sanitizer_platform_limits_posix.cpp
> +++ b/libsanitizer/sanitizer_common/sanitizer_platform_limits_posix.cpp
> @@ -73,7 +73,9 @@
>  #include 
>  #include 
>  #include 
> +#if SANITIZER_ANDROID
>  #include 
> +#endif
>  #include 
>  #include 
>  #include 
> @@ -869,10 +871,10 @@ unsigned struct_ElfW_Phdr_sz = sizeof(Elf_Phdr);
>unsigned IOCTL_EVIOCGPROP = IOCTL_NOT_PRESENT;
>unsigned IOCTL_EVIOCSKEYCODE_V2 = IOCTL_NOT_PRESENT;
>  #endif
> -  unsigned IOCTL_FS_IOC_GETFLAGS = FS_IOC_GETFLAGS;
> -  unsigned IOCTL_FS_IOC_GETVERSION = FS_IOC_GETVERSION;
> -  unsigned IOCTL_FS_IOC_SETFLAGS = FS_IOC_SETFLAGS;
> -  unsigned IOCTL_FS_IOC_SETVERSION = FS_IOC_SETVERSION;
> +  unsigned IOCTL_FS_IOC_GETFLAGS = _IOR('f', 1, long);
> +  unsigned IOCTL_FS_IOC_GETVERSION = _IOR('v', 1, long);
> +  unsigned IOCTL_FS_IOC_SETFLAGS = _IOW('f', 2, long);
> +  unsigned IOCTL_FS_IOC_SETVERSION = _IOW('v', 2, long);
>unsigned IOCTL_GIO_CMAP = GIO_CMAP;
>unsigned IOCTL_GIO_FONT = GIO_FONT;
>unsigned IOCTL_GIO_UNIMAP = GIO_UNIMAP;
> --
> 2.36.1
>
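
As a sanity check on the replacement constants in the quoted hunk, the
sketch below compares them against the kernel's own definitions.  It
assumes a Linux host where the FS_IOC_* macros come from the kernel UAPI
header <linux/fs.h>; the function name is hypothetical.

```c
#include <assert.h>
#include <sys/ioctl.h>
#include <linux/fs.h>

/* Check that the open-coded request numbers equal the kernel macros.  */
int
fs_ioctl_self_check (void)
{
  assert (FS_IOC_GETFLAGS == _IOR ('f', 1, long));
  assert (FS_IOC_GETVERSION == _IOR ('v', 1, long));
  assert (FS_IOC_SETFLAGS == _IOW ('f', 2, long));
  assert (FS_IOC_SETVERSION == _IOW ('v', 2, long));
  return 1;
}
```

Open-coding the values lets the sanitizer source keep the same request
numbers without depending on the header that supplies the macros.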

Re: libgo patch committed: Don't include in sysinfo.c

2022-07-21 Thread Richard Biener via Gcc-patches

On Wed, Jul 13, 2022 at 6:03 PM Ian Lance Taylor via Gcc-patches
 wrote:
>
> This libgo patch stops including  when building
> gen-sysinfo.go.  Removing this doesn't change anything at least with
> glibc 2.33.  The include was added in https://go.dev/cl/6100049 but
> it's not clear why.  This should fix GCC PR 106266.  Bootstrapped and
> ran Go testsuite on x86_64-pc-linux-gnu.  Committed to mainline.

Btw, active branches are affected the same way - can you please backport?

> Ian

[Patch] OpenMP: Support reverse offload (middle end part)

2022-07-21 Thread Tobias Burnus


This patch does three things:

(a) It removes a 'sorry' for 'device(ancestor:1)' and passes
GOMP_DEVICE_HOST_FALLBACK as device number.

This is sufficient for full "reverse" offload support with ENABLE_OFFLOADING
being false - and -foffload=disable. And for simple hello-world cases.


On the libgomp side, the 'requires reverse_offload' currently implies that
the initial device is the only device. While that's all fine, this change
is insufficient if offloading devices are enabled during compilation as:


(b.1) The offload-device lto1 should not see the content of the ancestor:1
target region and all the calls it does. If it does, there will be link errors
for functions not available and it also would pointlessly increase the code size.

Thus, the second part is to create an empty function for devices and a full
version for the host.

The general idea is: The device version can be used as lookup pointer in the
offload_funcs table; thus, we both need a function on the device and a call to
GOMP_target_ext.

It turned out to be quite difficult, as changing a FUNCTION_DECL late in the
processing is not that easy, nor is removing it after all analysis has been
done. I hope the current version is not too hackish, and maybe someone has
an idea how best to avoid assembling the 'nonhost' version on the host.
(Not critical, as it is small (having an empty body), but it would still be
nicer not to write it to the .s file.)


(b.2) The omp-offload.cc assert showed that cloning and inlining happened
for the included libgomp example. While inlining should be okay (of
'subroutine m2_tg_fn' (and for C/C++ 'tg_fn')), cloning will break
the offload_func table lookup and, hence, had to be excluded → "noclone".
I think it could also affect non-ancestor:1 code, but I did not try to
create an example.


(c) Prepare for actual reverse offloading
While (b) already does some prep work for real offloading, at least one more
step is needed: so that the function pointer can be used for the
offload_func table lookup, it has to be passed to libgomp.

Currently, the 'fn' argument is nullified in on-device calls to GOMP_target_ext.
The third part of this patch nullifies it now only for non-reverse offloads.
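
For readers unfamiliar with the construct, here is a minimal sketch of
what user code for reverse offload looks like.  It is not one of the
testcases from this patch; the function names are hypothetical.  Without
-fopenmp the pragmas are ignored and everything runs on the host; with
OpenMP and the host-fallback path described in (a), the result should be
the same.

```c
#include <assert.h>

/* Reverse offload has to be announced at file scope.  */
#pragma omp requires reverse_offload

int
reverse_offload_demo (void)
{
  int x = 0;

  /* Outer region: runs on an offload device if one is used, else on the host.  */
  #pragma omp target map(tofrom: x)
  {
    x = 1;

    /* Inner region: device(ancestor: 1) sends execution back to the host.  */
    #pragma omp target device(ancestor: 1) map(tofrom: x)
    {
      x += 41;
    }
  }
  return x;
}

int
reverse_offload_self_check (void)
{
  assert (reverse_offload_demo () == 42);
  return 1;
}
```

The inner region is the part that the offload-device compiler should not
see, as (b.1) explains.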

OK for mainline?

 * * *

Next steps: Implement reverse offloading for devices. In theory, this only
requires libgomp work, but let's see what else will be required.

Tobias
-
Siemens Electronic Design Automation GmbH; Address: Arnulfstraße 201, 80634 
Munich; limited liability company; Managing Directors: Thomas 
Heurung, Frank Thürauf; Registered office: Munich; Registration court: 
Munich, HRB 106955
OpenMP: Support reverse offload (middle end part)

gcc/ChangeLog:

	* internal-fn.cc (expand_GOMP_TARGET_REV): New.
	* internal-fn.def (GOMP_TARGET_REV): New.
	* lto-cgraph.cc (lto_output_node, verify_node_partition): Mark
	'omp target device_ancestor_host' as in_other_partition and don't
	error if absent.
	* omp-low.cc (create_omp_child_function): Mark as 'noclone'.
	* omp-expand.cc (expand_omp_target): For reverse offload, remove
	sorry, use device = GOMP_DEVICE_HOST_FALLBACK and create
	empty-body nohost function.
	* omp-offload.cc (execute_omp_device_lower): Handle
	IFN_GOMP_TARGET_REV.
	(pass_omp_target_link::execute): For ACCEL_COMPILER, don't
	nullify fn argument for reverse offload

libgomp/ChangeLog:

	* libgomp.texi (OpenMP 5.0): Mark 'ancestor' as implemented but
	refer to 'requires'.
	* testsuite/libgomp.c-c++-common/reverse-offload-1-aux.c: New test.
	* testsuite/libgomp.c-c++-common/reverse-offload-1.c: New test.
	* testsuite/libgomp.fortran/reverse-offload-1-aux.f90: New test.
	* testsuite/libgomp.fortran/reverse-offload-1.f90: New test.

gcc/testsuite/ChangeLog:

	* c-c++-common/gomp/reverse-offload-1.c: Remove dg-sorry.
	* c-c++-common/gomp/target-device-ancestor-4.c: Likewise.
	* gfortran.dg/gomp/target-device-ancestor-4.f90: Likewise.
	* gfortran.dg/gomp/target-device-ancestor-5.f90: Likewise.

 gcc/internal-fn.cc |   8 ++
 gcc/internal-fn.def|   1 +
 gcc/lto-cgraph.cc  |  20 +++-
 gcc/omp-expand.cc  | 107 +++--
 gcc/omp-low.cc |   4 +-
 gcc/omp-offload.cc |  50 ++
 .../c-c++-common/gomp/reverse-offload-1.c  |   2 +-
 .../c-c++-common/gomp/target-device-ancestor-4.c   |   2 +-
 .../gfortran.dg/gomp/target-device-ancestor-4.f90  |   2 +-
 .../gfortran.dg/gomp/target-device-ancestor-5.f90  |   2 +-
 libgomp/libgomp.texi   |   2 +-
 .../libgomp.c-c++-common/reverse-offload-1-aux.c   |  10 ++
 .../libgomp.c-c++-common/reverse-offload-1.c   |  83 
 .../libgomp.fortran/reverse-offload-1-aux.f90  |  12 +++
 .../libgomp.fortran/reverse-offload-1.f90  |  88 +
 15 files changed, 375 in

Re: [PATCH 5/12 V2] arm: Implement target feature macros for PACBTI

2022-07-21 Thread Richard Earnshaw via Gcc-patches





On 12/07/2022 16:45, Andrea Corallo via Gcc-patches wrote:

Richard Earnshaw  writes:


On 28/04/2022 10:42, Andrea Corallo via Gcc-patches wrote:

This patch implements target feature macros when PACBTI is enabled
through the -march option or -mbranch-protection.  The target feature
macros __ARM_FEATURE_PAC_DEFAULT and __ARM_FEATURE_BTI_DEFAULT are
specified in ARM ACLE

__ARM_FEATURE_PAUTH and __ARM_FEATURE_BTI are specified in the
pull-request .
Approved here
.
gcc/ChangeLog:
* config/arm/arm-c.c (arm_cpu_builtins): Define
__ARM_FEATURE_BTI_DEFAULT, __ARM_FEATURE_PAC_DEFAULT,
__ARM_FEATURE_PAUTH and __ARM_FEATURE_BTI.


This bit is OK.


gcc/testsuite/ChangeLog:
* gcc.target/arm/acle/pacbti-m-predef-2.c: New test.
* gcc.target/arm/acle/pacbti-m-predef-4.c: New test.
* gcc.target/arm/acle/pacbti-m-predef-5.c: New test.



These are all execution tests.  I think we also need some compile-only
tests so that we get better coverage when the target does not directly
support PACBTI.

We also need some tests for the defines when targeting armv8-m.main
and some tests for checking __ARM_FEATURE_BTI and __ARM_FEATURE_PAC
(the tests here check only the '..._DEFAULT' macros).


Hi Richard & all,

please find attached the updated version of this patch.

Best Regards

   Andrea

gcc/ChangeLog:

* config/arm/arm-c.c (arm_cpu_builtins): Define
__ARM_FEATURE_BTI_DEFAULT, __ARM_FEATURE_PAC_DEFAULT,
__ARM_FEATURE_PAUTH and __ARM_FEATURE_BTI.

gcc/testsuite/ChangeLog:

* gcc.target/arm/acle/pacbti-m-predef-2.c: New test.
* gcc.target/arm/acle/pacbti-m-predef-4.c: Likewise.
* gcc.target/arm/acle/pacbti-m-predef-5.c: Likewise.
* gcc.target/arm/acle/pacbti-m-predef-8.c: Likewise.
* gcc.target/arm/acle/pacbti-m-predef-9.c: Likewise.
* gcc.target/arm/acle/pacbti-m-predef-10.c: Likewise.
* gcc.target/arm/acle/pacbti-m-predef-11.c: Likewise.
* gcc.target/arm/acle/pacbti-m-predef-12.c: Likewise.



diff --git a/gcc/testsuite/gcc.target/arm/acle/pacbti-m-predef-10.c b/gcc/testsuite/gcc.target/arm/acle/pacbti-m-predef-10.c

new file mode 100644
index 000..311cf572dd9
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/acle/pacbti-m-predef-10.c
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+/* { dg-additional-options " -mbranch-protection=bti+pac-ret" } */

This is not enough.  For example, if the testsuite is being run with 
"-march=armv6-m" as the testrun options, we'll get an error that will 
cause a test failure.  You need to run a pre-test rule that validates 
that adding -mbranch-protection is safe.


+++ b/gcc/testsuite/gcc.target/arm/acle/pacbti-m-predef-11.c
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+/* { dg-options "-march=armv8.1-m.main+pacbti" } */

Similarly here, this would conflict with, for example, "-marm" as test 
options.


+++ b/gcc/testsuite/gcc.target/arm/acle/pacbti-m-predef-5.c
@@ -0,0 +1,24 @@
+
+/* { dg-do run } */

Blank line at the start of the test.

The other tests have similar issues.

R.

Re: [PATCH 7/12 V2] arm: Emit build attributes for PACBTI target feature

2022-07-21 Thread Richard Earnshaw via Gcc-patches





On 13/07/2022 09:58, Andrea Corallo via Gcc-patches wrote:

Richard Earnshaw  writes:


On 28/04/2022 10:45, Andrea Corallo via Gcc-patches wrote:

This patch emits assembler directives for PACBTI build attributes as
defined by the
ABI.

gcc/ChangeLog:
* config/arm/arm.c (arm_file_start): Emit EABI attributes for
Tag_PAC_extension, Tag_BTI_extension, TAG_BTI_use, TAG_PACRET_use.


This bit is OK.


gcc/testsuite/ChangeLog:
* gcc.target/arm/acle/pacbti-m-predef-1.c: New test.
* gcc.target/arm/acle/pacbti-m-predef-3: Likewise.
* gcc.target/arm/acle/pacbti-m-predef-6.c: Likewise.
* gcc.target/arm/acle/pacbti-m-predef-7.c: Likewise.


These tests contain directives like:

+/* { dg-additional-options " -mbranch-protection=pac-ret+bti --save-temps" } */

But they don't check that the architecture permits this (it has to be
armv8-m.main or later).


Hi Richard & all,

please find attached the updated patch.

BR

  Andrea



The tests in this patch have similar issues to my previous reply.  You 
need to make sure that adding options will not cause a conflict with 
other options added by the test driver.


R.

Re: [PATCH 8/12 V3] arm: Introduce multilibs for PACBTI target feature

2022-07-21 Thread Richard Earnshaw via Gcc-patches





On 21/07/2022 10:04, Andrea Corallo via Gcc-patches wrote:

Richard Earnshaw  writes:

[...]


The documentation mentions -mbranch-protection=standard+leaf, so
you're missing a mapping for that.
OK with that change.
R.


Oh, and please add some tests to gcc/testsuite/gcc.target/arm/multilib.exp

R.


Hi Richard,

thanks, here is the updated patch.

PS I've also added three mlibarch -> march matches that were missing.

BR

   Andrea



+MULTILIB_REQUIRED	+= mthumb/march=armv8.1-m.main+pacbti+fp/mbranch-protection=standard/mfloat-abi=hard
+MULTILIB_REQUIRED	+= mthumb/march=armv8.1-m.main+pacbti+fp.dp/mbranch-protection=standard/mfloat-abi=softfp
+MULTILIB_REQUIRED	+= mthumb/march=armv8.1-m.main+pacbti+fp.dp/mbranch-protection=standard/mfloat-abi=hard
+MULTILIB_REQUIRED	+= mthumb/march=armv8.1-m.main+pacbti+mve/mbranch-protection=standard/mfloat-abi=hard

+
+
 # Arch Matches
 MULTILIB_MATCHES   += march?armv6s-m=march?armv6-m

Just one blank line between sections.

Otherwise OK.

R.

[PATCH] tree-optimization/106379 - add missing ~(a ^ b) folding for _Bool

2022-07-21 Thread Richard Biener via Gcc-patches

The following makes sure to fold ~(a ^ b) to a == b for truth
values (but not vectors, we'd have to check for vector support of
equality).  That turns the PR106379 testcase into a ranger one.

Note that while we arrive at ~(a ^ b) in a convoluted way from
original !a == !b one can eventually write the expression this
way directly as well.
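
The equivalence is easy to check exhaustively for truth values.  The
sketch below is a one-bit model written for illustration (the function
names are hypothetical): in C, ~ operates on the promoted int, so the
result is masked back to the low bit to mimic a 1-bit boolean type.

```c
#include <assert.h>
#include <stdbool.h>

/* One-bit model of ~(a ^ b) on truth values.  */
bool
not_xor_1bit (bool a, bool b)
{
  return (~(a ^ b)) & 1;
}

int
truth_fold_self_check (void)
{
  for (int a = 0; a <= 1; a++)
    for (int b = 0; b <= 1; b++)
      {
	assert (not_xor_1bit (a, b) == (a == b));  /* ~(a ^ b) is a == b */
	assert ((!a == !b) == (a == b));           /* the original source form */
      }
  return 1;
}
```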

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.

PR tree-optimization/106379
* match.pd (~(a ^ b) -> a == b): New pattern.

* gcc.dg/pr106379-1.c: New testcase.
---
 gcc/match.pd  | 6 ++
 gcc/testsuite/gcc.dg/pr106379-1.c | 9 +
 2 files changed, 15 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/pr106379-1.c

diff --git a/gcc/match.pd b/gcc/match.pd
index 8bbc0dbd5cd..88a1a5aa9cc 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -1938,6 +1938,12 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
  (if (tree_nop_conversion_p (type, TREE_TYPE (@0)))
   (bit_not (bit_xor (view_convert @0) @1))))
 
+/* ~(a ^ b) is a == b for truth valued a and b.  */
+(simplify
+ (bit_not (bit_xor:s truth_valued_p@0 truth_valued_p@1))
+ (if (!VECTOR_TYPE_P (type))
+  (convert (eq @0 @1))))
+
 /* (x & ~m) | (y & m) -> ((x ^ y) & m) ^ x */
 (simplify
  (bit_ior:c (bit_and:cs @0 (bit_not @2)) (bit_and:cs @1 @2))
diff --git a/gcc/testsuite/gcc.dg/pr106379-1.c b/gcc/testsuite/gcc.dg/pr106379-1.c
new file mode 100644
index 000..7f2575e02dc
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr106379-1.c
@@ -0,0 +1,9 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fdump-tree-forwprop1" } */
+
+_Bool foo (_Bool a, _Bool b)
+{
+  return !a == !b;
+}
+
+/* { dg-final { scan-tree-dump "\[ab\]_\[0-9\]+\\(D\\) == \[ba\]_\[0-9\]+\\(D\\)" "forwprop1" } } */
-- 
2.35.3

Re: [PATCH 9/12 V2] arm: Make libgcc bti compatible

2022-07-21 Thread Richard Earnshaw via Gcc-patches


On 21/07/2022 10:17, Andrea Corallo via Gcc-patches wrote:

Richard Earnshaw  writes:


On 28/04/2022 10:48, Andrea Corallo via Gcc-patches wrote:

This change adds bti instructions at the beginning of arm-specific
libgcc hand-written assembly routines.
2022-03-31  Andrea Corallo  
* libgcc/config/arm/crti.S (FUNC_START): Add bti instruction if
necessary.
* libgcc/config/arm/lib1funcs.S (THUMB_FUNC_START, FUNC_START):
Likewise.



+#if defined(__ARM_FEATURE_BTI)

Wouldn't it be better to use __ARM_FEATURE_BTI_DEFAULT?  That way we
only get BTI instructions in multilib variants that have asked for
BTI.

R.


Hi Richard,

good point, yes I think so.

Please find attached the updated patch.

BR

   Andrea



I've been pondering this patch.  The way it is implemented would put a 
BTI instruction at the start of every assembler routine in libgcc.  But 
the vast majority of functions in libgcc cannot have their address 
taken, so a BTI isn't needed (BTI is only needed when an indirect jump 
could be used).  So I wonder if we really need to do this so aggressively?


Perhaps a better approach would be to define a macro (e.g. MAYBEBTI) which 
expands to a BTI if the compilation requires it and to nothing otherwise, 
and then manually insert that in any functions that really need this (if any).
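
A rough sketch of that idea follows, written as plain C so it can be
compiled anywhere.  Everything in it is an assumption made for
illustration: the real macro would live in the .S sources and expand to
an instruction rather than a string, and only the
__ARM_FEATURE_BTI_DEFAULT test is taken from the thread.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical MAYBEBTI: the landing-pad text, or nothing.  */
#if defined(__ARM_FEATURE_BTI_DEFAULT)
# define MAYBEBTI "bti"
#else
# define MAYBEBTI ""
#endif

/* Text to place at an entry point that may be reached through an
   indirect branch; other entry points would simply not use the macro.  */
const char *
entry_landing_pad (void)
{
  return MAYBEBTI;
}

int
maybe_bti_self_check (void)
{
#if defined(__ARM_FEATURE_BTI_DEFAULT)
  assert (strcmp (entry_landing_pad (), "bti") == 0);
#else
  assert (entry_landing_pad ()[0] == '\0');
#endif
  return 1;
}
```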


R.

Re: libgo patch committed: Don't include in sysinfo.c

2022-07-21 Thread Martin Liška

On 7/21/22 12:19, Richard Biener via Gcc-patches wrote:
> On Wed, Jul 13, 2022 at 6:03 PM Ian Lance Taylor via Gcc-patches
>  wrote:
>>
>> This libgo patch stops including  when building
>> gen-sysinfo.go.  Removing this doesn't change anything at least with
>> glibc 2.33.  The include was added in https://go.dev/cl/6100049 but
>> it's not clear why.  This should fix GCC PR 106266.  Bootstrapped and
>> ran Go testsuite on x86_64-pc-linux-gnu.  Committed to mainline.
> 
> Btw, active branches are affected the same way - can you please backport?

I've just done that.

Martin

> 
>> Ian

Re: [PATCH] libsanitizer: cherry-pick 9cf13067cb5088626ba7 from upstream

2022-07-21 Thread Martin Liška

On 7/21/22 12:18, Richard Biener wrote:
> Can you also push this to active branches please?

Sure, I've just done that.

Cheers,
Martin

Re: [Patch] OpenMP: Support reverse offload (middle end part)

2022-07-21 Thread Tobias Burnus


Oops, too quick: that was the wrong patch file. I had found an issue related
to 'noclone' (duplicated entries, dg-scan-dump issues with OpenACC) but ended
up attaching the wrong file.  Changes: omp-low.cc and
gcc/testsuite/*/goacc/. The rest is the same.

Tobias

On 21.07.22 12:55, Tobias Burnus wrote:

This patch does three things:

(a) It removes a 'sorry' for 'device(ancestor:1)' and passes
GOMP_DEVICE_HOST_FALLBACK as device number.

This is sufficient for full "reverse" offload support with ENABLE_OFFLOADING
being false - and -foffload=disable. And for simple hello-world cases.


On the libgomp side, the 'requires reverse_offload' currently implies that
the initial device is the only device. While that's all fine, this change
is insufficient if offloading devices are enabled during compilation as:


(b.1) The offload-device lto1 should not see the content of the ancestor:1
target region and all the calls it does. If it does, there will be link errors
for functions not available and it also would pointlessly increase the code size.

Thus, the second part is to create an empty function for devices and a full
version for the host.

The general idea is: The device version can be used as lookup pointer in the
offload_funcs table; thus, we both need a function on the device and a call to
GOMP_target_ext.

It turned out to be quite difficult, as changing a FUNCTION_DECL late in the
processing is not that easy, nor is removing it after all analysis has been
done. I hope the current version is not too hackish, and maybe someone has
an idea how best to avoid assembling the 'nonhost' version on the host.
(Not critical, as it is small (having an empty body), but it would still be
nicer not to write it to the .s file.)


(b.2) The omp-offload.cc assert showed that cloning and inlining happened
for the included libgomp example. While inlining should be okay (of
'subroutine m2_tg_fn' (and for C/C++ 'tg_fn')), cloning will break
the offload_func table lookup and, hence, had to be excluded → "noclone".
I think it could also affect non-ancestor:1 code, but I did not try to
create an example.


(c) Prepare for actual reverse offloading
While (b) already does some prep work for real offloading, at least one more
step is needed: so that the function pointer can be used for the
offload_func table lookup, it has to be passed to libgomp.

Currently, the 'fn' argument is nullified in on-device calls to GOMP_target_ext.
The third part of this patch nullifies it now only for non-reverse offloads.

OK for mainline?

 * * *

Next steps: Implement reverse offloading for devices. In theory, this only
requires libgomp work, but let's see what else will be required.

Tobias

OpenMP: Support reverse offload (middle end part)

gcc/ChangeLog:

	* internal-fn.cc (expand_GOMP_TARGET_REV): New.
	* internal-fn.def (GOMP_TARGET_REV): New.
	* lto-cgraph.cc (lto_output_node, verify_node_partition): Mark
	'omp target device_ancestor_host' as in_other_partition and don't
	error if absent.
	* omp-low.cc (create_omp_child_function): Mark as 'noclone'.
	* omp-expand.cc (expand_omp_target): For reverse offload, remove
	sorry, use device = GOMP_DEVICE_HOST_FALLBACK and create
	empty-body nohost function.
	* omp-offload.cc (execute_omp_device_lower): Handle
	IFN_GOMP_TARGET_REV.
	(pass_omp_target_link::execute): For ACCEL_COMPILER, don't
	nullify fn argument for reverse offload

libgomp/ChangeLog:

	* libgomp.texi (OpenMP 5.0): Mark 'ancestor' as implemented but
	refer to 'requires'.
	* testsuite/libgomp.c-c++-common/reverse-offload-1-aux.c: New test.
	* testsuite/libgomp.c-c++-common/reverse-offload-1.c: New test.
	* testsuite/libgomp.fortran/reverse-offload-1-aux.f90: New test.
	* testsuite/libgomp.fortran/reverse-offload-1.f90: New test.

gcc/testsuite/ChangeLog:

	* c-c++-common/gomp/reverse-offload-1.c: Remove dg-sorry.
	* c-c++-common/gomp/target-device-ancestor-4.c: Likewise.
	* gfortran.dg/gomp/target-device-ancestor-4.f90: Likewise.
	* gfortran.dg/gomp/target-device-ancestor-5.f90: Likewise.
	* c-c++-common/goacc/classify-kernels-parloops.c: Add 'noclone' to
	scan-tree-dump-times.
	* c-c++-common/goacc/classify-kernels-unparallelized-parloops.c:
	Likewise.
	* c-c++-common/goacc/classify-kernels-unparallelized.c: Likewise.
	* c-c++-common/goacc/classify-kernels.c: Likewise.
	* c-c++-common/goacc/classify-parallel.c: Likewise.
	* c-c++-common/goacc/classify-serial.c: Likewise.
	* c-c++-common/goacc/kernels-counter-vars-function-scope.c: Likewise.
	* c-c++-common/goacc/kernels-loop-2.c: Likewise.
	* c-c++-common/goacc/kernels-loop-3.c: Likewise.
	* c-c++-common/goacc/kernels-loop-data-2.c: Likewise.
	* c-c++-common/goacc/kernels-loop-data-enter-exit-2.c: Likewise.
	* c-c++-common/goacc/ke

[PATCH] c++: CTAD from initializer list [PR106366]

2022-07-21 Thread Patrick Palka via Gcc-patches

During CTAD, we currently perform the first phase of overload resolution
from [over.match.list] only if the class template has a list constructor.
But according to [over.match.class.deduct]/4 it should be enough to just
have a guide that looks like a list constructor (which is a more general
criterion in light of user-defined guides).
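
For readers less familiar with the two phases of [over.match.list], ordinary
(non-CTAD) list-initialization shows the same preference that this patch
extends to list-like deduction guides. The following is only an illustrative
sketch: `Probe` and the two helper functions are hypothetical names and are
not part of the patch or its testcase.

```cpp
#include <cstddef>
#include <initializer_list>

// Hypothetical type for illustration only; not taken from the patch.
struct Probe {
  int phase;          // 1: initializer-list constructor, 2: ordinary constructor
  std::size_t count;  // number of elements or arguments seen

  // Candidate for the first phase: the braced list is a single argument.
  Probe(std::initializer_list<int> il) : phase(1), count(il.size()) {}

  // Candidate for the second phase: list elements are separate arguments.
  Probe(int, int) : phase(2), count(2) {}
};

// Braces: an initializer-list constructor is viable, so phase one wins.
int phase_for_braces() { return Probe{1, 2}.phase; }

// Parentheses: no list-initialization, so only the (int, int) constructor fits.
int phase_for_parens() { return Probe(1, 2).phase; }
```

In the CTAD case the candidates are deduction guides rather than constructors,
which is why the patch no longer keys the first phase on TYPE_HAS_LIST_CTOR.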

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
trunk?

PR c++/106366

gcc/cp/ChangeLog:

* pt.cc (do_class_deduction): Don't consider TYPE_HAS_LIST_CTOR
when setting try_list_ctor.  Reset args even when try_list_ctor
is true and there are no list candidates.  Call resolve_args on
the reset args.

gcc/testsuite/ChangeLog:

* g++.dg/cpp1z/class-deduction112.C: New test.
---
 gcc/cp/pt.cc  | 25 +--
 .../g++.dg/cpp1z/class-deduction112.C | 14 +++
 2 files changed, 26 insertions(+), 13 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp1z/class-deduction112.C

diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
index 718dfa5bfa8..0f26d6f5bce 100644
--- a/gcc/cp/pt.cc
+++ b/gcc/cp/pt.cc
@@ -30250,8 +30250,8 @@ do_class_deduction (tree ptype, tree tmpl, tree init,
   else if (BRACE_ENCLOSED_INITIALIZER_P (init))
 {
   list_init_p = true;
-  try_list_ctor = TYPE_HAS_LIST_CTOR (type);
-  if (try_list_ctor && CONSTRUCTOR_NELTS (init) == 1
+  try_list_ctor = true;
+  if (CONSTRUCTOR_NELTS (init) == 1
  && !CONSTRUCTOR_IS_DESIGNATED_INIT (init))
{
  /* As an exception, the first phase in 16.3.1.7 (considering the
@@ -30310,26 +30310,25 @@ do_class_deduction (tree ptype, tree tmpl, tree init,
 
   tree fndecl = error_mark_node;
 
-  /* If this is list-initialization and the class has a list constructor, first
+  /* If this is list-initialization and the class has a list guide, first
  try deducing from the list as a single argument, as [over.match.list].  */
-  tree list_cands = NULL_TREE;
-  if (try_list_ctor && cands)
-for (lkp_iterator iter (cands); iter; ++iter)
-  {
-   tree dg = *iter;
+  if (try_list_ctor)
+{
+  tree list_cands = NULL_TREE;
+  for (tree dg : lkp_range (cands))
if (is_list_ctor (dg))
  list_cands = lookup_add (dg, list_cands);
-  }
-  if (list_cands)
-{
-  fndecl = perform_dguide_overload_resolution (list_cands, args, tf_none);
-
+  if (list_cands)
+   fndecl = perform_dguide_overload_resolution (list_cands, args, tf_none);
   if (fndecl == error_mark_node)
{
  /* That didn't work, now try treating the list as a sequence of
 arguments.  */
  release_tree_vector (args);
  args = make_tree_vector_from_ctor (init);
+ args = resolve_args (args, complain);
+ if (args == NULL)
+   return error_mark_node;
}
 }
 
diff --git a/gcc/testsuite/g++.dg/cpp1z/class-deduction112.C 
b/gcc/testsuite/g++.dg/cpp1z/class-deduction112.C
new file mode 100644
index 000..8da5868ff98
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp1z/class-deduction112.C
@@ -0,0 +1,14 @@
+// PR c++/106366
+// { dg-do compile { target c++17 } }
+
+#include <initializer_list>
+
+template
+struct A { A(...); };
+
+template
+A(std::initializer_list) -> A;
+
+A a{1,2,3};
+using type = decltype(a);
+using type = A;
-- 
2.37.1.208.ge72d93e88c

Re: [PATCH] c++: CTAD from initializer list [PR106366]

2022-07-21 Thread Patrick Palka via Gcc-patches

On Thu, 21 Jul 2022, Patrick Palka wrote:

> During CTAD, we currently perform the first phase of overload resolution
> from [over.match.list] only if the class template has a list constructor.
> But according to [over.match.class.deduct]/4 it should be enough to just
> have a guide that looks like a list constructor (which is a more general
> criterion in light of user-defined guides).
> 
> Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
> trunk?
> 
>   PR c++/106366
> 
> gcc/cp/ChangeLog:
> 
>   * pt.cc (do_class_deduction): Don't consider TYPE_HAS_LIST_CTOR
>   when setting try_list_ctor.  Reset args even when try_list_ctor
>   is true and there are no list candidates.  Call resolve_args on
>   the reset args.
> 
> gcc/testsuite/ChangeLog:
> 
>   * g++.dg/cpp1z/class-deduction112.C: New test.
> ---
>  gcc/cp/pt.cc  | 25 +--
>  .../g++.dg/cpp1z/class-deduction112.C | 14 +++
>  2 files changed, 26 insertions(+), 13 deletions(-)
>  create mode 100644 gcc/testsuite/g++.dg/cpp1z/class-deduction112.C
> 
> diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
> index 718dfa5bfa8..0f26d6f5bce 100644
> --- a/gcc/cp/pt.cc
> +++ b/gcc/cp/pt.cc
> @@ -30250,8 +30250,8 @@ do_class_deduction (tree ptype, tree tmpl, tree init,
>else if (BRACE_ENCLOSED_INITIALIZER_P (init))
>  {
>list_init_p = true;
> -  try_list_ctor = TYPE_HAS_LIST_CTOR (type);
> -  if (try_list_ctor && CONSTRUCTOR_NELTS (init) == 1
> +  try_list_ctor = true;

I suppose try_list_cand would be a more appropriate name for this
variable now, consider that fixed.

> +  if (CONSTRUCTOR_NELTS (init) == 1
> && !CONSTRUCTOR_IS_DESIGNATED_INIT (init))
>   {
> /* As an exception, the first phase in 16.3.1.7 (considering the
> @@ -30310,26 +30310,25 @@ do_class_deduction (tree ptype, tree tmpl, tree 
> init,
>  
>tree fndecl = error_mark_node;
>  
> -  /* If this is list-initialization and the class has a list constructor, 
> first
> +  /* If this is list-initialization and the class has a list guide, first
>   try deducing from the list as a single argument, as [over.match.list].  
> */
> -  tree list_cands = NULL_TREE;
> -  if (try_list_ctor && cands)
> -for (lkp_iterator iter (cands); iter; ++iter)
> -  {
> - tree dg = *iter;
> +  if (try_list_ctor)
> +{
> +  tree list_cands = NULL_TREE;
> +  for (tree dg : lkp_range (cands))
>   if (is_list_ctor (dg))
> list_cands = lookup_add (dg, list_cands);
> -  }
> -  if (list_cands)
> -{
> -  fndecl = perform_dguide_overload_resolution (list_cands, args, 
> tf_none);
> -
> +  if (list_cands)
> + fndecl = perform_dguide_overload_resolution (list_cands, args, tf_none);
>if (fndecl == error_mark_node)
>   {
> /* That didn't work, now try treating the list as a sequence of
>arguments.  */
> release_tree_vector (args);
> args = make_tree_vector_from_ctor (init);
> +   args = resolve_args (args, complain);
> +   if (args == NULL)
> + return error_mark_node;
>   }
>  }
>  
> diff --git a/gcc/testsuite/g++.dg/cpp1z/class-deduction112.C 
> b/gcc/testsuite/g++.dg/cpp1z/class-deduction112.C
> new file mode 100644
> index 000..8da5868ff98
> --- /dev/null
> +++ b/gcc/testsuite/g++.dg/cpp1z/class-deduction112.C
> @@ -0,0 +1,14 @@
> +// PR c++/106366
> +// { dg-do compile { target c++17 } }
> +
> +#include <initializer_list>
> +
> +template
> +struct A { A(...); };
> +
> +template
> +A(std::initializer_list) -> A;
> +
> +A a{1,2,3};
> +using type = decltype(a);
> +using type = A;
> -- 
> 2.37.1.208.ge72d93e88c
> 
>

Re: [PATCH 1/1 V5] RISC-V: Support Zmmul extension

2022-07-21 Thread Palmer Dabbelt


On Thu, 21 Jul 2022 02:03:35 PDT (-0700), gcc-patches@gcc.gnu.org wrote:

LGTM, will merge once the binutils part is merged.


+Nelson, in case he's already planning on handling those.  If not then 
they're not in my inbox, so just poke me if you want me to review them.


Also some comments on the patch below.



On Wed, Jul 13, 2022 at 10:14 AM  wrote:


From: LiaoShihua 

gcc/ChangeLog:

* common/config/riscv/riscv-common.cc: Add Zmmul.
* config/riscv/riscv-opts.h (MASK_ZMMUL): New.
(TARGET_ZMMUL): Ditto.
* config/riscv/riscv.cc (riscv_option_override):Ditto.
* config/riscv/riscv.md: Add Zmmul
* config/riscv/riscv.opt: Ditto.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/zmmul-1.c: New test.
* gcc.target/riscv/zmmul-2.c: New test.

---
 gcc/common/config/riscv/riscv-common.cc  |  3 +++
 gcc/config/riscv/riscv-opts.h|  3 +++
 gcc/config/riscv/riscv.cc|  8 +--
 gcc/config/riscv/riscv.md| 28 
 gcc/config/riscv/riscv.opt   |  3 +++
 gcc/testsuite/gcc.target/riscv/zmmul-1.c | 20 +
 gcc/testsuite/gcc.target/riscv/zmmul-2.c | 20 +
 7 files changed, 69 insertions(+), 16 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/zmmul-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zmmul-2.c

diff --git a/gcc/common/config/riscv/riscv-common.cc 
b/gcc/common/config/riscv/riscv-common.cc
index 0e5be2ce105..20acc590b30 100644
--- a/gcc/common/config/riscv/riscv-common.cc
+++ b/gcc/common/config/riscv/riscv-common.cc
@@ -193,6 +193,8 @@ static const struct riscv_ext_version 
riscv_ext_version_table[] =
   {"zvl32768b", ISA_SPEC_CLASS_NONE, 1, 0},
   {"zvl65536b", ISA_SPEC_CLASS_NONE, 1, 0},

+  {"zmmul", ISA_SPEC_CLASS_NONE, 1, 0},
+
   /* Terminate the list.  */
   {NULL, ISA_SPEC_CLASS_NONE, 0, 0}
 };
@@ -1148,6 +1150,7 @@ static const riscv_ext_flag_table_t 
riscv_ext_flag_table[] =
   {"zvl32768b", &gcc_options::x_riscv_zvl_flags, MASK_ZVL32768B},
   {"zvl65536b", &gcc_options::x_riscv_zvl_flags, MASK_ZVL65536B},

+  {"zmmul", &gcc_options::x_riscv_zm_subext, MASK_ZMMUL},

   {NULL, NULL, 0}
 };
diff --git a/gcc/config/riscv/riscv-opts.h b/gcc/config/riscv/riscv-opts.h
index 1e153b3a6e7..9c7d69a6ea3 100644
--- a/gcc/config/riscv/riscv-opts.h
+++ b/gcc/config/riscv/riscv-opts.h
@@ -153,6 +153,9 @@ enum stack_protector_guard {
 #define TARGET_ZICBOM ((riscv_zicmo_subext & MASK_ZICBOM) != 0)
 #define TARGET_ZICBOP ((riscv_zicmo_subext & MASK_ZICBOP) != 0)

+#define MASK_ZMMUL  (1 << 0)
+#define TARGET_ZMMUL((riscv_zm_subext & MASK_ZMMUL) != 0)
+
 /* Bit of riscv_zvl_flags will set contintuly, N-1 bit will set if N-bit is
set, e.g. MASK_ZVL64B has set then MASK_ZVL32B is set, so we can use
popcount to caclulate the minimal VLEN.  */
diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 2e83ca07394..9ad4181f35f 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -4999,10 +4999,14 @@ riscv_option_override (void)
   /* The presence of the M extension implies that division instructions
  are present, so include them unless explicitly disabled.  */
   if (TARGET_MUL && (target_flags_explicit & MASK_DIV) == 0)
-target_flags |= MASK_DIV;
+if(!TARGET_ZMMUL)
+  target_flags |= MASK_DIV;


Not sure if I'm missing something here, but that doesn't look right: it 
would mean that "-march=rv32im_zmmul" ends up without divide 
instructions.  I think it's fine to just leave this as it was, we're not 
setting TARGET_MUL from "-march...zmmul...", so this should all be OK.



   else if (!TARGET_MUL && TARGET_DIV)
 error ("%<-mdiv%> requires %<-march%> to subsume the %<M%> extension");
-
+
+  if(TARGET_ZMMUL && !TARGET_MUL && TARGET_DIV)
+warning (0, "%<-mdiv%> cannot be used when % extension is 
present");


That should already be getting caught by the check above, but even so 
it's not quite the right error: "-march=rv32im_zmmul -mdiv" is fine, 
it's just something like "-march=rv32i_zmmul -mdiv" that's the problem.
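
The two review comments above can be sketched as a small standalone model of
the pre-patch logic. This is a hedged illustration, not GCC code:
`resolve_div`, `DivResult` and the boolean parameters are hypothetical
stand-ins for TARGET_MUL, TARGET_ZMMUL, target_flags_explicit and MASK_DIV.

```cpp
// Hypothetical result of resolving -mdiv against -march.
struct DivResult {
  bool div_enabled;  // models MASK_DIV being set afterwards
  bool error;        // models the "-mdiv requires -march to subsume M" error
};

// has_m:         -march contains the M extension (models TARGET_MUL)
// has_zmmul:     -march contains Zmmul (models TARGET_ZMMUL)
// div_explicit:  -mdiv or -mno-div was given on the command line
// div_requested: the value requested when div_explicit is true
DivResult resolve_div(bool has_m, bool has_zmmul, bool div_explicit,
                      bool div_requested) {
  // Zmmul is deliberately not consulted: it does not set TARGET_MUL, so the
  // original check already gives the desired answers.
  (void) has_zmmul;

  bool div = div_explicit && div_requested;
  if (has_m && !div_explicit)
    div = true;          // M implies divide unless explicitly disabled
  else if (!has_m && div)
    return {div, true};  // e.g. -march=rv32i_zmmul -mdiv
  return {div, false};
}
```

Under this model "-march=rv32im_zmmul" keeps its divide instructions and
"-march=rv32im_zmmul -mdiv" is accepted, while "-march=rv32i_zmmul -mdiv" is
rejected by the existing check.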



+
   /* Likewise floating-point division and square root.  */
   if (TARGET_HARD_FLOAT && (target_flags_explicit & MASK_FDIV) == 0)
 target_flags |= MASK_FDIV;
diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
index 308b64dd30d..d4e171464ea 100644
--- a/gcc/config/riscv/riscv.md
+++ b/gcc/config/riscv/riscv.md
@@ -763,7 +763,7 @@
   [(set (match_operand:SI  0 "register_operand" "=r")
(mult:SI (match_operand:SI 1 "register_operand" " r")
 (match_operand:SI 2 "register_operand" " r")))]
-  "TARGET_MUL"
+  "TARGET_ZMMUL || TARGET_MUL"
   { return TARGET_64BIT ? "mulw\t%0,%1,%2" : "mul\t%0,%1,%2"; }
   [(set_attr "type" "imul")
(set_attr "mode" "SI")])
@@ -772,7 +772,7 @@
   [(set (match_operand:DI  0 "register_operand" "=r")
(mult:DI (match_operand:DI 1 "register_operand" " r")

Re: [PATCH] rs6000/test: Update some cases with -mdejagnu-tune

2022-07-21 Thread Segher Boessenkool

Hi!

On Wed, Jul 20, 2022 at 05:31:11PM +0800, Kewen.Lin wrote:
> As PR106345 shows, some test cases should be updated with
> -mdejagnu-tune, since their test points are sensitive to
> rs6000_tune, such as: group_ending_nop, loop align (ic),
> float conversion cost etc.

It does not make sense to require -mdejagnu-tune= if -mdejagnu-cpu= is
already given?  What is the failure case?

> This patch is to replace -mdejagnu-cpu with -mdejagnu-tune
> or append -mdejagnu-tune (keep the original -mdejagnu-cpu
> when it's required) accordingly.

It is *always* required.  Testcases with -mtune= but unspecified -mcpu=
make no sense.

> --- a/gcc/testsuite/gcc.target/powerpc/compress-float-ppc-pic.c
> +++ b/gcc/testsuite/gcc.target/powerpc/compress-float-ppc-pic.c
> @@ -1,5 +1,5 @@
>  /* { dg-do compile { target powerpc_fprs } } */
> -/* { dg-options "-O2 -fpic -mdejagnu-cpu=power5" } */
> +/* { dg-options "-O2 -fpic -mdejagnu-cpu=power5 -mdejagnu-tune=power5" } */
>  /* { dg-require-effective-target fpic } */

This should only make a difference if you have -mtune= in your
RUNTEST_FLAGS, and you shouldn't do silly things like that.  I suspect
you see it in other cases, and those are actual bugs then, that need
actual fixing instead of sweeping under the carpet.

The testcase suggests this is with a compiler configured with
--with-cpu= --with-tune=, which should just work, and -mcpu= should
override both of those!

Segher

[PATCH] x86: Add ix86_ifunc_ref_local_ok

2022-07-21 Thread H.J. Lu via Gcc-patches

We can't always use the PLT entry as the function address for local IFUNC
functions.  When the PIC register is needed for PLT call, indirect call
via the PLT entry will fail since the PIC register may not be set up
properly for indirect call.  Add ix86_ifunc_ref_local_ok to return false
when the PLT entry can't be used as local IFUNC function pointers.
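
As a rough illustration of the condition the new hook implements, the
predicate can be modelled outside of GCC. In this sketch `CodeModel` and the
parameter names are hypothetical stand-ins for ix86_cmodel, flag_pic and
TARGET_64BIT; only the boolean expression mirrors the patch.

```cpp
// Hypothetical subset of code models; only LargePic matters for the predicate.
enum class CodeModel { Small, Medium, Large, SmallPic, MediumPic, LargePic };

// pic:      models flag_pic != 0
// is_64bit: models TARGET_64BIT
// cmodel:   models ix86_cmodel
bool ifunc_ref_local_ok(bool pic, bool is_64bit, CodeModel cmodel) {
  // The PLT entry is usable as the address of a local IFUNC function when no
  // PIC register is needed for the PLT call: either non-PIC code, or 64-bit
  // code outside the large PIC model.
  return !pic || (is_64bit && cmodel != CodeModel::LargePic);
}
```

With these assumptions, 32-bit PIC code and 64-bit large-model PIC code are
the cases where the hook returns false.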

gcc/

PR target/83782
* config/i386/i386.cc (ix86_ifunc_ref_local_ok): New.
(TARGET_IFUNC_REF_LOCAL_OK): Use it.

gcc/testsuite/

PR target/83782
* gcc.target/i386/pr83782-1.c: Require non-ia32.
* gcc.target/i386/pr83782-2.c: Likewise.
* gcc.target/i386/pr83782-3.c: New test.
---
 gcc/config/i386/i386.cc   | 15 ++-
 gcc/testsuite/gcc.target/i386/pr83782-1.c |  8 +++---
 gcc/testsuite/gcc.target/i386/pr83782-2.c |  4 +--
 gcc/testsuite/gcc.target/i386/pr83782-3.c | 32 +++
 4 files changed, 50 insertions(+), 9 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr83782-3.c

diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
index e03f86d4a23..5e30dc884bf 100644
--- a/gcc/config/i386/i386.cc
+++ b/gcc/config/i386/i386.cc
@@ -16070,6 +16070,19 @@ ix86_call_use_plt_p (rtx call_op)
   return true;
 }
 
+/* Implement TARGET_IFUNC_REF_LOCAL_OK.  If this hook returns true,
+   the PLT entry will be used as the function address for local IFUNC
+   functions.  When the PIC register is needed for PLT call, indirect
+   call via the PLT entry will fail since the PIC register may not be
+   set up properly for indirect call.  In this case, we should return
+   false.  */
+
+static bool
+ix86_ifunc_ref_local_ok (void)
+{
+  return !flag_pic || (TARGET_64BIT && ix86_cmodel != CM_LARGE_PIC);
+}
+
 /* Return true if the function being called was marked with attribute
"noplt" or using -fno-plt and we are compiling for non-PIC.  We need
to handle the non-PIC case in the backend because there is no easy
@@ -24953,7 +24966,7 @@ ix86_libgcc_floating_mode_supported_p
   ix86_get_multilib_abi_name
 
 #undef TARGET_IFUNC_REF_LOCAL_OK
-#define TARGET_IFUNC_REF_LOCAL_OK hook_bool_void_true
+#define TARGET_IFUNC_REF_LOCAL_OK ix86_ifunc_ref_local_ok
 
 #if !TARGET_MACHO && !TARGET_DLLIMPORT_DECL_ATTRIBUTES
 # undef TARGET_ASM_RELOC_RW_MASK
diff --git a/gcc/testsuite/gcc.target/i386/pr83782-1.c 
b/gcc/testsuite/gcc.target/i386/pr83782-1.c
index ce97b12e65d..85674346aec 100644
--- a/gcc/testsuite/gcc.target/i386/pr83782-1.c
+++ b/gcc/testsuite/gcc.target/i386/pr83782-1.c
@@ -1,4 +1,4 @@
-/* { dg-do compile } */
+/* { dg-do compile { target { ! ia32 } } } */
 /* { dg-require-ifunc "" } */
 /* { dg-options "-O2 -fpic" } */
 
@@ -20,7 +20,5 @@ bar(void)
   return foo;
 }
 
-/* { dg-final { scan-assembler {leal[ \t]foo@GOTOFF\(%[^,]*\),[ \t]%eax} { 
target ia32 } } } */
-/* { dg-final { scan-assembler {lea(?:l|q)[ \t]foo\(%rip\),[ \t]%(?:e|r)ax} { 
target { ! ia32 } } } } */
-/* { dg-final { scan-assembler-not "foo@GOT\\\(" { target ia32 } } } */
-/* { dg-final { scan-assembler-not "foo@GOTPCREL\\\(" { target { ! ia32 } } } 
} */
+/* { dg-final { scan-assembler {lea(?:l|q)[ \t]foo\(%rip\),[ \t]%(?:e|r)ax} } 
} */
+/* { dg-final { scan-assembler-not "foo@GOTPCREL\\\(" } } */
diff --git a/gcc/testsuite/gcc.target/i386/pr83782-2.c 
b/gcc/testsuite/gcc.target/i386/pr83782-2.c
index e25d258bbda..a654ded771f 100644
--- a/gcc/testsuite/gcc.target/i386/pr83782-2.c
+++ b/gcc/testsuite/gcc.target/i386/pr83782-2.c
@@ -1,4 +1,4 @@
-/* { dg-do compile } */
+/* { dg-do compile { target { ! ia32 } } } */
 /* { dg-require-ifunc "" } */
 /* { dg-options "-O2 -fpic" } */
 
@@ -20,7 +20,5 @@ bar(void)
   return foo;
 }
 
-/* { dg-final { scan-assembler {leal[ \t]foo@GOTOFF\(%[^,]*\),[ \t]%eax} { 
target ia32 } } } */
 /* { dg-final { scan-assembler {lea(?:l|q)[ \t]foo\(%rip\),[ \t]%(?:e|r)ax} { 
target { ! ia32 } } } } */
-/* { dg-final { scan-assembler-not "foo@GOT\\\(" { target ia32 } } } */
 /* { dg-final { scan-assembler-not "foo@GOTPCREL\\\(" { target { ! ia32 } } } 
} */
diff --git a/gcc/testsuite/gcc.target/i386/pr83782-3.c 
b/gcc/testsuite/gcc.target/i386/pr83782-3.c
new file mode 100644
index 000..1536481cb79
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr83782-3.c
@@ -0,0 +1,32 @@
+/* { dg-do run }  */
+/* { dg-require-ifunc "" } */
+/* { dg-require-effective-target pie } */
+/* { dg-options "-fpie -pie" } */
+
+#include <stdio.h>
+
+static int __attribute__((noinline))
+implementation (void)
+{
+  printf ("'ere I am JH\n");
+  return 0;
+}
+
+static __typeof__ (implementation) *resolver (void)
+{
+  return (void *)implementation;
+}
+
+extern int magic (void) __attribute__ ((ifunc ("resolver")));
+
+__attribute__ ((weak))
+int
+call_magic (int (*ptr) (void))
+{
+  return ptr ();
+}
+
+int main ()
+{
+  return call_magic (magic);
+}
-- 
2.36.1

Re: libgo patch committed: Don't include in sysinfo.c

2022-07-21 Thread Ian Lance Taylor via Gcc-patches

On Thu, Jul 21, 2022 at 4:53 AM Martin Liška  wrote:
>
> On 7/21/22 12:19, Richard Biener via Gcc-patches wrote:
> > On Wed, Jul 13, 2022 at 6:03 PM Ian Lance Taylor via Gcc-patches
> >  wrote:
> >>
> >> This libgo patch stops including  when building
> >> gen-sysinfo.go.  Removing this doesn't change anything at least with
> >> glibc 2.33.  The include was added in https://go.dev/cl/6100049 but
> >> it's not clear why.  This should fix GCC PR 106266.  Bootstrapped and
> >> ran Go testsuite on x86_64-pc-linux-gnu.  Committed to mainline.
> >
> > Btw, active branches are affected the same way - can you please backport?
>
> I've just done that.

Thanks.

Ian

[pushed] MAINTAINERS: Add myself to Write After Approval

2022-07-21 Thread Sam Feifer via Gcc-patches

ChangeLog:

* MAINTAINERS (Write After Approval): Add myself.
---
 MAINTAINERS | 1 +
 1 file changed, 1 insertion(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index e2db0cfe18b..46c9e48a497 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -398,6 +398,7 @@ Chris Fairles   

 Alessandro Fanfarillo  
 Changpeng Fang 
 David Faust
+Sam Feifer 
 Li Feng
 Thomas Fitzsimmons 
 Alexander Fomin


base-commit: 24eae97625e9423e7344f6d7eb6bc2435a62fffd
-- 
2.31.1

[PATCH] Fortran: fix invalid rank error in ASSOCIATED when rank is remapped [PR77652]

2022-07-21 Thread Harald Anlauf via Gcc-patches

Dear all,

the rank check for ASSOCIATED (POINTER, TARGET) did not allow all
rank combinations that were allowed in pointer assignment for
newer versions of the Fortran standard (F2008+).  Fix the logic.
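
The intended rank rules can be summarised with a small model, written here in
C++ purely for illustration. The names (`FortranStd`,
`associated_ranks_accepted`) are hypothetical and are not gfortran's; the
model also ignores that gfc_notify_std depends on the full set of allowed
standards rather than on a single level.

```cpp
// Hypothetical standard levels; F2008 also stands for later standards.
enum class FortranStd { F95, F2003, F2008 };

const int kAssumedRank = -1;  // models pointer->rank == -1

// Returns true if ASSOCIATED (POINTER, TARGET) passes the rank check.
bool associated_ranks_accepted(int pointer_rank, int target_rank,
                               FortranStd level) {
  // Assumed-rank pointers and equal ranks are always fine.
  if (pointer_rank == kAssumedRank || pointer_rank == target_rank)
    return true;
  // Rank remapping from a target that is not rank 1 needs F2008 or later.
  if (target_rank != 1)
    return level == FortranStd::F2008;
  // Rank-1 target remapped to another rank: allowed from F2003 on.
  return level != FortranStd::F95;
}
```

This matches the two new tests as far as the model goes: with -std=f2003 a
rank-2 pointer with a rank-1 target is accepted, whereas a rank-1 pointer with
a rank-2 target is diagnosed.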

Regtested on x86_64-pc-linux-gnu.  OK for mainline?

Thanks,
Harald

From 338b43aefece04435d32f961c33d217aaa511095 Mon Sep 17 00:00:00 2001
From: Harald Anlauf 
Date: Thu, 21 Jul 2022 22:02:58 +0200
Subject: [PATCH] Fortran: fix invalid rank error in ASSOCIATED when rank is
 remapped [PR77652]

gcc/fortran/ChangeLog:

	PR fortran/77652
	* check.cc (gfc_check_associated): Make the rank check of POINTER
	vs. TARGET match the selected Fortran standard.

gcc/testsuite/ChangeLog:

	PR fortran/77652
	* gfortran.dg/associated_target_9a.f90: New test.
	* gfortran.dg/associated_target_9b.f90: New test.
---
 gcc/fortran/check.cc  | 16 +--
 .../gfortran.dg/associated_target_9a.f90  | 27 +++
 .../gfortran.dg/associated_target_9b.f90  | 15 +++
 3 files changed, 56 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gfortran.dg/associated_target_9a.f90
 create mode 100644 gcc/testsuite/gfortran.dg/associated_target_9b.f90

diff --git a/gcc/fortran/check.cc b/gcc/fortran/check.cc
index 91d87a1b2c1..6d3a4701950 100644
--- a/gcc/fortran/check.cc
+++ b/gcc/fortran/check.cc
@@ -1502,8 +1502,20 @@ gfc_check_associated (gfc_expr *pointer, gfc_expr *target)
 t = false;
   /* F2018 C838 explicitly allows an assumed-rank variable as the first
  argument of intrinsic inquiry functions.  */
-  if (pointer->rank != -1 && !rank_check (target, 0, pointer->rank))
-t = false;
+  if (pointer->rank != -1 && pointer->rank != target->rank)
+{
+  if (target->rank != 1)
+	{
+	  if (!gfc_notify_std (GFC_STD_F2008, "Rank remapping target is not "
+			   "rank 1 at %L", &target->where))
+	t = false;
+	}
+  else if ((gfc_option.allow_std & GFC_STD_F2003) == 0)
+	{
+	  if (!rank_check (target, 0, pointer->rank))
+	t = false;
+	}
+}
   if (target->rank > 0 && target->ref)
 {
   for (i = 0; i < target->rank; i++)
diff --git a/gcc/testsuite/gfortran.dg/associated_target_9a.f90 b/gcc/testsuite/gfortran.dg/associated_target_9a.f90
new file mode 100644
index 000..708645d5bcb
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/associated_target_9a.f90
@@ -0,0 +1,27 @@
+! { dg-do run }
+! { dg-options "-std=f2018" }
+! PR fortran/77652 - Invalid rank error in ASSOCIATED when rank is remapped
+! Contributed by Paul Thomas
+
+program p
+  real, dimension(100),  target  :: array
+  real, dimension(:,:),  pointer :: matrix
+  real, dimension(20,5), target  :: array2
+  real, dimension(:),pointer :: matrix2
+  matrix(1:20,1:5) => array
+  matrix2(1:100)   => array2
+  !
+  ! F2018:16.9.16, ASSOCIATED (POINTER [, TARGET])
+  ! Case(v): If TARGET is present and is an array target, the result is
+  ! true if and only if POINTER is associated with a target that has
+  ! the same shape as TARGET, ...
+  if (associated (matrix, array )) stop 1
+  if (associated (matrix2,array2)) stop 2
+  call check (matrix2, array2)
+contains
+  subroutine check (ptr, tgt)
+real, pointer :: ptr(..)
+real, target  :: tgt(:,:)
+if (associated (ptr, tgt)) stop 3
+  end subroutine check
+end
diff --git a/gcc/testsuite/gfortran.dg/associated_target_9b.f90 b/gcc/testsuite/gfortran.dg/associated_target_9b.f90
new file mode 100644
index 000..ca62ab155c0
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/associated_target_9b.f90
@@ -0,0 +1,15 @@
+! { dg-do compile }
+! { dg-options "-std=f2003" }
+! PR fortran/77652 - Invalid rank error in ASSOCIATED when rank is remapped
+! Contributed by Paul Thomas
+
+subroutine s
+  real, dimension(100),  target  :: array
+  real, dimension(:,:),  pointer :: matrix
+  real, dimension(20,5), target  :: array2
+  real, dimension(:),pointer :: matrix2
+! matrix(1:20,1:5) => array
+! matrix2(1:100)   => array2
+  print *, associated (matrix, array ) ! Technically legal F2003
+  print *, associated (matrix2,array2) ! { dg-error "is not rank 1" }
+end
--
2.35.3

Re: [PATCH] Remove setting -mblock-ops-vector-pair on power10.

2022-07-21 Thread Segher Boessenkool

On Thu, Jul 21, 2022 at 02:42:29AM -0400, Michael Meissner wrote:
> Testing has shown that using the load vector pair and store vector pair
> instructions for block moves has some performance issues on power10.  This
> patch does not set this option by default.  If it is a win in other
> machines in the future, this flag can be set in the ISA options.

This would make rs6000_isa_flags an even bigger misnomer than it already
is, sigh.

>   * config/rs6000/rs6000.cc (rs6000_option_override_internal):
>   Do not enable -mblock-ops-vector-pair by default on power10.

Do not wrap lines early, especially if that would mean leaving a colon
at the end of a line.  Changelog lines are 80 positions long (including
the leading tab, which counts as eight).

> --- a/gcc/config/rs6000/rs6000.cc
> +++ b/gcc/config/rs6000/rs6000.cc
> @@ -4139,17 +4139,6 @@ rs6000_option_override_internal (bool global_init_p)
>   rs6000_isa_flags &= ~OPTION_MASK_BLOCK_OPS_UNALIGNED_VSX;
>  }
>  
> -  if (!(rs6000_isa_flags_explicit & OPTION_MASK_BLOCK_OPS_VECTOR_PAIR))
> -{
> -  /* Do not generate lxvp and stxvp on power10 since there are some
> -  performance issues.  */
> -  if (TARGET_MMA && TARGET_EFFICIENT_UNALIGNED_VSX
> -   && rs6000_tune != PROCESSOR_POWER10)
> - rs6000_isa_flags |= OPTION_MASK_BLOCK_OPS_VECTOR_PAIR;
> -  else
> - rs6000_isa_flags &= ~OPTION_MASK_BLOCK_OPS_VECTOR_PAIR;
> -}

How does this implement what the changelog says it does?  With what it
does the changelog should instead say to not touch it at all (your patch
also disables the code that disables it!)

It isn't clear what you intended: what your changelog says, or what the
code does.

Segher

[pushed] c++: defaulted ctor with DMI in union [PR94823]

2022-07-21 Thread Jason Merrill via Gcc-patches

CWG2084 clarifies that a variant member with a non-trivial constructor does
not make the union's defaulted default constructor deleted if another
variant member has a default member initializer.
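
The rule can be sketched as a two-pass scan over the variant members, loosely
following the shape of the change to walk_field_subobs. This is a simplified,
hypothetical model (the names below are not GCC's), and it ignores every other
reason a defaulted default constructor can be deleted.

```cpp
#include <vector>

// Hypothetical, simplified description of one variant member of a union.
struct VariantMember {
  bool has_dmi;                  // has a default member initializer
  bool nontrivial_default_ctor;  // its type has a non-trivial default ctor
};

// Returns true if, under the CWG2084 rule alone, the union's defaulted
// default constructor would be deleted.
bool defaulted_default_ctor_deleted(const std::vector<VariantMember>& members) {
  // First pass: if some member has a DMI, only that member is initialized,
  // so the other members are not checked at all.
  for (const VariantMember& m : members)
    if (m.has_dmi)
      return false;

  // Second pass: no DMI was found, so every member is checked.
  for (const VariantMember& m : members)
    if (m.nontrivial_default_ctor)
      return true;
  return false;
}

// Models `union C { A a; int b = 0; };` where A has a user-provided ctor.
bool example_with_dmi() {
  std::vector<VariantMember> members = {{false, true}, {true, false}};
  return defaulted_default_ctor_deleted(members);
}

// The same union without the initializer on `b`.
bool example_without_dmi() {
  std::vector<VariantMember> members = {{false, true}, {false, false}};
  return defaulted_default_ctor_deleted(members);
}
```

In this model the union with the DMI keeps a usable default constructor, and
dropping the DMI makes it deleted.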

Tested x86_64-pc-linux-gnu, applying to trunk.

DR 2084
PR c++/94823

gcc/cp/ChangeLog:

* method.cc (walk_field_subobs): Fix DMI in union case.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/nsdmi-union7.C: New test.
---
 gcc/cp/method.cc  | 35 ---
 gcc/testsuite/g++.dg/cpp0x/nsdmi-union7.C | 13 +
 2 files changed, 44 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp0x/nsdmi-union7.C

diff --git a/gcc/cp/method.cc b/gcc/cp/method.cc
index f2050f6e970..573ef016f82 100644
--- a/gcc/cp/method.cc
+++ b/gcc/cp/method.cc
@@ -2315,8 +2315,19 @@ walk_field_subobs (tree fields, special_function_kind 
sfk, tree fnname,
   bool diag, int flags, tsubst_flags_t complain,
   bool dtor_from_ctor)
 {
-  tree field;
-  for (field = fields; field; field = DECL_CHAIN (field))
+  if (!fields)
+return;
+
+  tree ctx = DECL_CONTEXT (fields);
+
+  /* CWG2084: A defaulted default ctor for a union with a DMI only initializes
+ that member, so don't check other members.  */
+  enum { unknown, no, yes }
+  only_dmi_mem = (sfk == sfk_constructor && TREE_CODE (ctx) == UNION_TYPE
+ ? unknown : no);
+
+ again:
+  for (tree field = fields; field; field = DECL_CHAIN (field))
 {
   tree mem_type, argtype, rval;
 
@@ -2331,9 +2342,18 @@ walk_field_subobs (tree fields, special_function_kind 
sfk, tree fnname,
 asking if this is deleted, don't even look up the function; we don't
 want an error about a deleted function we aren't actually calling.  */
   if (sfk == sfk_destructor && deleted_p == NULL
- && TREE_CODE (DECL_CONTEXT (field)) == UNION_TYPE)
+ && TREE_CODE (ctx) == UNION_TYPE)
break;
 
+  if (only_dmi_mem != no)
+   {
+ if (DECL_INITIAL (field))
+   only_dmi_mem = yes;
+ else
+   /* Don't check this until we know there's no DMI.  */
+   continue;
+   }
+
   mem_type = strip_array_types (TREE_TYPE (field));
   if (SFK_ASSIGN_P (sfk))
{
@@ -2416,7 +2436,7 @@ walk_field_subobs (tree fields, special_function_kind 
sfk, tree fnname,
  if (constexpr_p
  && cxx_dialect < cxx20
  && !CLASS_TYPE_P (mem_type)
- && TREE_CODE (DECL_CONTEXT (field)) != UNION_TYPE)
+ && TREE_CODE (ctx) != UNION_TYPE)
{
  *constexpr_p = false;
  if (diag)
@@ -2465,6 +2485,13 @@ walk_field_subobs (tree fields, special_function_kind 
sfk, tree fnname,
   process_subob_fn (rval, sfk, spec_p, trivial_p, deleted_p,
constexpr_p, diag, field, dtor_from_ctor);
 }
+
+  /* We didn't find a DMI in this union, now check all the members.  */
+  if (only_dmi_mem == unknown)
+{
+  only_dmi_mem = no;
+  goto again;
+}
 }
 
 /* Base walker helper for synthesized_method_walk.  Inspect a direct
diff --git a/gcc/testsuite/g++.dg/cpp0x/nsdmi-union7.C 
b/gcc/testsuite/g++.dg/cpp0x/nsdmi-union7.C
new file mode 100644
index 000..c840ddf5fbc
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/nsdmi-union7.C
@@ -0,0 +1,13 @@
+// PR c++/94823
+// { dg-do compile { target c++11 } }
+
+struct A{
+  A(){}
+};
+union C{
+  A a;
+  int b = 0;
+};
+int main(){
+  C c;
+}

base-commit: f4ed610d02aaf8cfcdcb5cf03e0cde65f1f5f890
-- 
2.31.1

[pushed] c++: defaulted friend op== [PR106361]

2022-07-21 Thread Jason Merrill via Gcc-patches

Now non-member functions can be defaulted, so this assert is wrong.
move_signature_fn_p already checks for ctor or op=.

Tested x86_64-pc-linux-gnu, applying to trunk.

PR c++/106361

gcc/cp/ChangeLog:

* decl.cc (move_fn_p): Remove assert.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/spaceship-eq14.C: New test.
---
 gcc/cp/decl.cc  |  2 --
 gcc/testsuite/g++.dg/cpp2a/spaceship-eq14.C | 17 +
 2 files changed, 17 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp2a/spaceship-eq14.C

diff --git a/gcc/cp/decl.cc b/gcc/cp/decl.cc
index aa6cf3c6c2e..70ad681467e 100644
--- a/gcc/cp/decl.cc
+++ b/gcc/cp/decl.cc
@@ -15022,8 +15022,6 @@ copy_fn_p (const_tree d)
 bool
 move_fn_p (const_tree d)
 {
-  gcc_assert (DECL_FUNCTION_MEMBER_P (d));
-
   if (cxx_dialect == cxx98)
 /* There are no move constructors if we are in C++98 mode.  */
 return false;
diff --git a/gcc/testsuite/g++.dg/cpp2a/spaceship-eq14.C 
b/gcc/testsuite/g++.dg/cpp2a/spaceship-eq14.C
new file mode 100644
index 000..896e5232bf6
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp2a/spaceship-eq14.C
@@ -0,0 +1,17 @@
+// PR c++/106361
+// { dg-do compile { target c++20 } }
+
+struct foo {
+  int x;
+};
+
+struct bar {
+  foo f;   // { dg-error "operator==" }
+  friend bool operator==(const bar& a, const bar& b);
+};
+
+bool operator==(const bar& a, const bar& b) = default;
+
+int main() {
+  return bar{} == bar{};   // { dg-error "deleted" }
+}

base-commit: df118d7ba138cacb17203d4a1b5f27730347cc77
-- 
2.31.1

[pushed] match.pd: Add new abs pattern [PR94920]

2022-07-21 Thread Sam Feifer via Gcc-patches

This patch is intended to fix a missed optimization in match.pd. It optimizes 
(x >= 0 ? x : 0) + (x <= 0 ? -x : 0) to just abs(x). Additionally, the pattern 
(x <= 0 ? -x : 0) now gets optimized to max(-x, 0), which helps with the other 
simplification rule.
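
As a rough illustration of why the two forms are interchangeable, here is a minimal C sketch that compares the expression from the PR against a plain absolute value over a small range. The helper names (abs_via_max, abs_ref, abs_forms_agree) are made up for this example and are not part of the patch; the check assumes x is never INT_MIN and that hi is below INT_MAX.

```c
/* The expression from the PR: max(x, 0) + max(-x, 0).  */
static unsigned int
abs_via_max (int x)
{
  return (x >= 0 ? x : 0) + (x <= 0 ? -x : 0);
}

/* Reference absolute value; x == INT_MIN is assumed not to occur.  */
static unsigned int
abs_ref (int x)
{
  return x < 0 ? -x : x;
}

/* Return 1 if both forms agree for every x in [lo, hi], else 0.
   hi is assumed to be smaller than INT_MAX so the loop terminates.  */
static int
abs_forms_agree (int lo, int hi)
{
  for (int x = lo; x <= hi; x++)
    if (abs_via_max (x) != abs_ref (x))
      return 0;
  return 1;
}
```

At most one of the two conditional arms is nonzero for any given x, which is why the sum collapses to a single abs.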

Tests are also included to be added to the testsuite.

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

PR tree-optimization/94920

gcc/ChangeLog:

* match.pd (x >= 0 ? x : 0) + (x <= 0 ? -x : 0): New simplification.
   (x <= 0 ? -x : 0): New simplification.

gcc/testsuite/ChangeLog:

* g++.dg/pr94920-1.C: New test.
* g++.dg/pr94920.C: New test.
* gcc.dg/pr94920-2.c: New test.
---
 gcc/match.pd | 10 +
 gcc/testsuite/g++.dg/pr94920-1.C | 17 +
 gcc/testsuite/g++.dg/pr94920.C   | 63 
 gcc/testsuite/gcc.dg/pr94920-2.c | 15 
 4 files changed, 105 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/pr94920-1.C
 create mode 100644 gcc/testsuite/g++.dg/pr94920.C
 create mode 100644 gcc/testsuite/gcc.dg/pr94920-2.c

diff --git a/gcc/match.pd b/gcc/match.pd
index 88a1a5aa9cc..9736393061a 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -339,6 +339,11 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
  (if (REAL_VALUE_NEGATIVE (TREE_REAL_CST (@0)))
   (COPYSIGN_ALL (negate @0) @1)))
 
+/* (x >= 0 ? x : 0) + (x <= 0 ? -x : 0) -> abs x.  */
+(simplify
+  (plus:c (max @0 integer_zerop) (max (negate @0) integer_zerop))
+  (abs @0))
+
 /* X * 1, X / 1 -> X.  */
 (for op (mult trunc_div ceil_div floor_div round_div exact_div)
   (simplify
@@ -3425,6 +3430,11 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
   && (GIMPLE || !TREE_SIDE_EFFECTS (@1)))
   (cond (convert:boolean_type_node @2) @1 @0)))
 
+/* (x <= 0 ? -x : 0) -> max(-x, 0).  */
+(simplify
+  (cond (le @0 integer_zerop@1) (negate@2 @0) integer_zerop@1)
+  (max @2 @1))
+
 /* Simplifications of shift and rotates.  */
 
 (for rotate (lrotate rrotate)
diff --git a/gcc/testsuite/g++.dg/pr94920-1.C b/gcc/testsuite/g++.dg/pr94920-1.C
new file mode 100644
index 000..6c6483eab2d
--- /dev/null
+++ b/gcc/testsuite/g++.dg/pr94920-1.C
@@ -0,0 +1,17 @@
+/* PR tree-optimization/94920 */
+/* { dg-do run } */
+
+#include "pr94920.C"
+
+int main() {
+
+if (foo(0) != 0
+|| foo(-42) != 42
+|| foo(42) != 42
+|| baz(-10) != 10
+|| baz(-10) != 10) {
+__builtin_abort();
+}
+
+return 0;
+}
diff --git a/gcc/testsuite/g++.dg/pr94920.C b/gcc/testsuite/g++.dg/pr94920.C
new file mode 100644
index 000..925ec4f42f1
--- /dev/null
+++ b/gcc/testsuite/g++.dg/pr94920.C
@@ -0,0 +1,63 @@
+/* PR tree-optimization/94920 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+
+typedef int __attribute__((vector_size(4*sizeof(int)))) vint;
+
+/* Same form as PR.  */
+__attribute__((noipa)) unsigned int foo(int x) {
+return (x >= 0 ? x : 0) + (x <= 0 ? -x : 0);
+}
+
+/* Test for forward propagation.  */
+__attribute__((noipa)) unsigned int corge(int x) {
+int w = (x >= 0 ? x : 0);
+int y = -x;
+int z = (y >= 0 ? y : 0);
+return w + z;
+}
+
+/* Vector case.  */
+__attribute__((noipa)) vint thud(vint x) {
+vint t = (x >= 0 ? x : 0) ;
+vint xx = -x;
+vint t1 =  (xx >= 0 ? xx : 0);
+return t + t1;
+}
+
+/* Signed function.  */
+__attribute__((noipa)) int bar(int x) {
+return (x >= 0 ? x : 0) + (x <= 0 ? -x : 0);
+}
+
+/* Commutative property.  */
+__attribute__((noipa)) unsigned int baz(int x) {
+return (x <= 0 ? -x : 0) + (x >= 0 ? x : 0);
+}
+
+/* Flipped order for max expressions.  */
+__attribute__((noipa)) unsigned int quux(int x) {
+return (0 <= x ? x : 0) + (0 >= x ? -x : 0);
+}
+
+/* Not zero so should not optimize.  */
+__attribute__((noipa)) unsigned int waldo(int x) {
+return (x >= 4 ? x : 4) + (x <= 4 ? -x : 4);
+}
+
+/* Not zero so should not optimize.  */
+__attribute__((noipa)) unsigned int fred(int x) {
+return (x >= -4 ? x : -4) + (x <= -4 ? -x : -4);
+}
+
+/* Incorrect pattern.  */
+__attribute__((noipa)) unsigned int goo(int x) {
+return (x <= 0 ? x : 0) + (x >= 0 ? -x : 0);
+}
+
+/* Incorrect pattern.  */
+__attribute__((noipa)) int qux(int x) {
+return (x >= 0 ? x : 0) + (x >= 0 ? x : 0);
+}
+
+/* { dg-final {scan-tree-dump-times " ABS_EXPR " 6 "optimized" } } */
diff --git a/gcc/testsuite/gcc.dg/pr94920-2.c b/gcc/testsuite/gcc.dg/pr94920-2.c
new file mode 100644
index 000..a2d23324cfa
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr94920-2.c
@@ -0,0 +1,15 @@
+/* PR tree-optimization/94920 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+
+/* Form from PR.  */
+__attribute__((noipa)) unsigned int foo(int x) {
+return x <= 0 ? -x : 0;
+}
+
+/* Changed order.  */
+__attribute__((noipa)) unsigned int bar(int x) {
+return 0 >= x ? -x : 0;
+}
+
+/* { dg-final {scan-tree-dump-times " MAX_EXPR " 2 "optimized" } } */

[committed] analyzer: fix -Wanalyzer-va-list-exhausted false +ve on va_arg in subroutine [PR106383]

2022-07-21 Thread David Malcolm via Gcc-patches

Successfully bootstrapped & regtested on x86_64-pc-linux-gnu.
Pushed to trunk as r13-1786-gb852aa7f265424.

gcc/analyzer/ChangeLog:
PR analyzer/106383
* varargs.cc (region_model::impl_call_va_arg): When determining if
we're doing interprocedural analysis, use the stack depth of the
frame in which va_start was called, rather than the current stack
depth.

gcc/testsuite/ChangeLog:
PR analyzer/106383
* gcc.dg/analyzer/stdarg-3.c: New test.

Signed-off-by: David Malcolm 
---
 gcc/analyzer/varargs.cc  |  4 +-
 gcc/testsuite/gcc.dg/analyzer/stdarg-3.c | 57 
 2 files changed, 59 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/analyzer/stdarg-3.c

diff --git a/gcc/analyzer/varargs.cc b/gcc/analyzer/varargs.cc
index c92a56dd2f9..c45585ce457 100644
--- a/gcc/analyzer/varargs.cc
+++ b/gcc/analyzer/varargs.cc
@@ -971,7 +971,7 @@ region_model::impl_call_va_arg (const call_details &cd)
  const frame_region *frame_reg = arg_reg->get_frame_region ();
  unsigned next_arg_idx = arg_reg->get_index ();
 
- if (get_stack_depth () > 1)
+ if (frame_reg->get_stack_depth () > 1)
{
  /* The interprocedural case: the called frame will have been
 populated with any variadic aruguments.
@@ -1009,7 +1009,7 @@ region_model::impl_call_va_arg (const call_details &cd)
 any specific var_arg_regions populated within it.
 We already have a conjured_svalue for the result, so leave
 it untouched.  */
- gcc_assert (get_stack_depth () == 1);
+ gcc_assert (frame_reg->get_stack_depth () == 1);
}
 
  if (saw_problem)
diff --git a/gcc/testsuite/gcc.dg/analyzer/stdarg-3.c b/gcc/testsuite/gcc.dg/analyzer/stdarg-3.c
new file mode 100644
index 000..68146147adb
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/analyzer/stdarg-3.c
@@ -0,0 +1,57 @@
+typedef __builtin_va_list va_list;
+
+struct printf_spec {
+  unsigned int type;
+};
+
+int
+format_decode(const char *fmt, struct printf_spec *spec);
+
+static int vbin_printf(const char *fmt, va_list args) {
+  struct printf_spec spec;
+  int width = 0;
+
+  while (*fmt) {
+int read = format_decode(fmt, &spec);
+
+fmt += read;
+
+switch (spec.type) {
+case 0:
+  break;
+case 1:
+  width = __builtin_va_arg(args, int); /* { dg-bogus "-Wanalyzer-va-list-exhausted" } */
+  break;
+}
+  }
+
+  return width;
+}
+
+int bprintf(const char *fmt, ...) {
+  va_list args;
+  int ret;
+
+  __builtin_va_start(args, fmt);
+  ret = vbin_printf(fmt, args);
+  __builtin_va_end(args);
+
+  return ret;
+}
+
+static int called_by_test_2 (va_list args)
+{
+  return __builtin_va_arg(args, int); /* { dg-bogus "-Wanalyzer-va-list-exhausted" } */
+}
+
+int test_2 (const char *fmt, ...)
+{
+  va_list args;
+  int ret;
+
+  __builtin_va_start (args, fmt);
+  ret = called_by_test_2 (args);
+  __builtin_va_end (args);
+
+  return ret;
+}
-- 
2.26.3

Re: [pushed] match.pd: Add new abs pattern [PR94920]

2022-07-21 Thread Marek Polacek via Gcc-patches

On Thu, Jul 21, 2022 at 05:28:34PM -0400, Sam Feifer via Gcc-patches wrote:
> This patch is intended to fix a missed optimization in match.pd. It optimizes 
> (x >= 0 ? x : 0) + (x <= 0 ? -x : 0) to just abs(x). Additionally, the 
> pattern (x <= 0 ? -x : 0) now gets optimized to max(-x, 0), which helps with 
> the other simplification rule.
> 
> Tests are also included to be added to the testsuite.
 
To clarify, this patch has been approved by Richi in an earlier
thread, so...

> Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

...this line isn't really asking for a review.
 
>   PR tree-optimization/94920
> 
> gcc/ChangeLog:
> 
>   * match.pd (x >= 0 ? x : 0) + (x <= 0 ? -x : 0): New simplification.
>  (x <= 0 ? -x : 0): New simplification.

The second line ought to be formatted 8 spaces to the left, so that
it's aligned with '* match.pd'.
 
Marek

Re: [PATCH] rs6000/test: Fix empty TU in some cases of effective targets

2022-07-21 Thread Segher Boessenkool

On Wed, Jul 20, 2022 at 05:32:01PM +0800, Kewen.Lin wrote:
> As the failure of test case gcc.target/powerpc/pr92398.p9-.c in
> PR106345 shows, some test sources for some powerpc effective
> targets use empty translation unit wrongly.  The test sources
> could go with options like "-ansi -pedantic-errors", then those
> effective target checkings will fail unexpectedly with the
> error messages like:
> 
>   error: ISO C forbids an empty translation unit [-Wpedantic]
> 
> This patch is to fix empty TUs with one dummy variable definition
> accordingly.

You can also use
  enum{a};
which is shorter, but more importantly does not generate any code.
You can also do
  extern int dummy;
of course -- same idea, no definitions, only declarations.

> I'll push this soon if no objections.

> @@ -6523,6 +6531,7 @@ proc check_effective_target_ppc_float128 { } {
>   #ifndef __FLOAT128__
> nope no good
>   #endif
> + int dummy;

At least put it in #else then?  Or just do things a bit more elegantly
(do a dummy function around this for example).


Segher

Re: [PATCH 1/1 V5] RISC-V: Support Zmmul extension

2022-07-21 Thread Kito Cheng via Gcc-patches

On Fri, Jul 22, 2022 at 2:43 AM Palmer Dabbelt  wrote:
>
> On Thu, 21 Jul 2022 02:03:35 PDT (-0700), gcc-patches@gcc.gnu.org wrote:
> > LGTM, will merge once the binutils part is merged.
>
> +Nelson, in case he's already planning on handling those.  If not then
> they're not in my inbox, so just poke me if you want me to review them.
>
> Also some comments on the patch below.
>
> >
> > On Wed, Jul 13, 2022 at 10:14 AM  wrote:
> >>
> >> From: LiaoShihua 
> >>
> >> gcc/ChangeLog:
> >>
> >> * common/config/riscv/riscv-common.cc: Add Zmmul.
> >> * config/riscv/riscv-opts.h (MASK_ZMMUL): New.
> >> (TARGET_ZMMUL): Ditto.
> >> * config/riscv/riscv.cc (riscv_option_override):Ditto.
> >> * config/riscv/riscv.md: Add Zmmul
> >> * config/riscv/riscv.opt: Ditto.
> >>
> >> gcc/testsuite/ChangeLog:
> >>
> >> * gcc.target/riscv/zmmul-1.c: New test.
> >> * gcc.target/riscv/zmmul-2.c: New test.
> >>
> >> ---
> >>  gcc/common/config/riscv/riscv-common.cc  |  3 +++
> >>  gcc/config/riscv/riscv-opts.h|  3 +++
> >>  gcc/config/riscv/riscv.cc|  8 +--
> >>  gcc/config/riscv/riscv.md| 28 
> >>  gcc/config/riscv/riscv.opt   |  3 +++
> >>  gcc/testsuite/gcc.target/riscv/zmmul-1.c | 20 +
> >>  gcc/testsuite/gcc.target/riscv/zmmul-2.c | 20 +
> >>  7 files changed, 69 insertions(+), 16 deletions(-)
> >>  create mode 100644 gcc/testsuite/gcc.target/riscv/zmmul-1.c
> >>  create mode 100644 gcc/testsuite/gcc.target/riscv/zmmul-2.c
> >>
> >> diff --git a/gcc/common/config/riscv/riscv-common.cc 
> >> b/gcc/common/config/riscv/riscv-common.cc
> >> index 0e5be2ce105..20acc590b30 100644
> >> --- a/gcc/common/config/riscv/riscv-common.cc
> >> +++ b/gcc/common/config/riscv/riscv-common.cc
> >> @@ -193,6 +193,8 @@ static const struct riscv_ext_version 
> >> riscv_ext_version_table[] =
> >>{"zvl32768b", ISA_SPEC_CLASS_NONE, 1, 0},
> >>{"zvl65536b", ISA_SPEC_CLASS_NONE, 1, 0},
> >>
> >> +  {"zmmul", ISA_SPEC_CLASS_NONE, 1, 0},
> >> +
> >>/* Terminate the list.  */
> >>{NULL, ISA_SPEC_CLASS_NONE, 0, 0}
> >>  };
> >> @@ -1148,6 +1150,7 @@ static const riscv_ext_flag_table_t 
> >> riscv_ext_flag_table[] =
> >>{"zvl32768b", &gcc_options::x_riscv_zvl_flags, MASK_ZVL32768B},
> >>{"zvl65536b", &gcc_options::x_riscv_zvl_flags, MASK_ZVL65536B},
> >>
> >> +  {"zmmul", &gcc_options::x_riscv_zm_subext, MASK_ZMMUL},
> >>
> >>{NULL, NULL, 0}
> >>  };
> >> diff --git a/gcc/config/riscv/riscv-opts.h b/gcc/config/riscv/riscv-opts.h
> >> index 1e153b3a6e7..9c7d69a6ea3 100644
> >> --- a/gcc/config/riscv/riscv-opts.h
> >> +++ b/gcc/config/riscv/riscv-opts.h
> >> @@ -153,6 +153,9 @@ enum stack_protector_guard {
> >>  #define TARGET_ZICBOM ((riscv_zicmo_subext & MASK_ZICBOM) != 0)
> >>  #define TARGET_ZICBOP ((riscv_zicmo_subext & MASK_ZICBOP) != 0)
> >>
> >> +#define MASK_ZMMUL  (1 << 0)
> >> +#define TARGET_ZMMUL((riscv_zm_subext & MASK_ZMMUL) != 0)
> >> +
> >>  /* Bit of riscv_zvl_flags will set contintuly, N-1 bit will set if N-bit 
> >> is
> >> set, e.g. MASK_ZVL64B has set then MASK_ZVL32B is set, so we can use
> >> popcount to caclulate the minimal VLEN.  */
> >> diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
> >> index 2e83ca07394..9ad4181f35f 100644
> >> --- a/gcc/config/riscv/riscv.cc
> >> +++ b/gcc/config/riscv/riscv.cc
> >> @@ -4999,10 +4999,14 @@ riscv_option_override (void)
> >>/* The presence of the M extension implies that division instructions
> >>   are present, so include them unless explicitly disabled.  */
> >>if (TARGET_MUL && (target_flags_explicit & MASK_DIV) == 0)
> >> -target_flags |= MASK_DIV;
> >> +if(!TARGET_ZMMUL)
> >> +  target_flags |= MASK_DIV;
>
> Not sure if I'm missing something here, but that doesn't look right: it
> would mean that "-march=rv32im_zmmul" ends up without divide
> instructions.  I think it's fine to just leave this as it was, we're not
> setting TARGET_MUL from "-march...zmmul...", so this should all be OK.

Oh, yeah, I missed that; that should just be kept as it is.
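
To make the point above concrete, here is a small stand-alone C model of the original (unpatched) override logic. It is only a sketch: DEMO_MASK_MUL, DEMO_MASK_DIV and demo_option_override are made-up names, not the real riscv.cc definitions. Since Zmmul does not set the MUL flag, the original check already behaves correctly for both rv32im_zmmul and rv32i_zmmul.

```c
/* Hypothetical stand-ins for MASK_MUL / MASK_DIV.  */
#define DEMO_MASK_MUL (1u << 0)
#define DEMO_MASK_DIV (1u << 1)

/* Model of the original check: if the M extension is present and the
   user did not explicitly set or clear the div flag, enable division.
   explicit_flags has a bit set for every flag the user gave explicitly.  */
static unsigned int
demo_option_override (unsigned int flags, unsigned int explicit_flags)
{
  if ((flags & DEMO_MASK_MUL) && (explicit_flags & DEMO_MASK_DIV) == 0)
    flags |= DEMO_MASK_DIV;
  return flags;
}
```

With M present and nothing explicit, division is turned on; without M (the Zmmul-only case in this model) the flags are left alone.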


>
> >>else if (!TARGET_MUL && TARGET_DIV)
> >>  error ("%<-mdiv%> requires %<-march%> to subsume the %<M%> extension");
> >> -
> >> +
> >> +  if(TARGET_ZMMUL && !TARGET_MUL && TARGET_DIV)
> >> +warning (0, "%<-mdiv%> cannot be used when % extension is 
> >> present");
>
> That should already be getting caught by the check above, but even so
> it's not quite the right error: "-march=rv32im_zmmul -mdiv" is fine,
> it's just something like "-march=rv32i_zmmul -mdiv" that's the problem.
>
> >> +
> >>/* Likewise floating-point division and square root.  */
> >>if (TARGET_HARD_FLOAT && (target_flags_explicit & MASK_FDIV) == 0)
> >>  target_flags |= MASK_FDIV;
> >> diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
> >> index 308b64dd30d..d4e171464ea 100644
> >> --- a/gc

Re: [PATCH] rs6000/test: Fix empty TU in some cases of effective targets

2022-07-21 Thread Kewen.Lin via Gcc-patches

Hi Segher,

Thanks for the comments!

on 2022/7/22 06:09, Segher Boessenkool wrote:
> On Wed, Jul 20, 2022 at 05:32:01PM +0800, Kewen.Lin wrote:
>> As the failure of test case gcc.target/powerpc/pr92398.p9-.c in
>> PR106345 shows, some test sources for some powerpc effective
>> targets use empty translation unit wrongly.  The test sources
>> could go with options like "-ansi -pedantic-errors", then those
>> effective target checkings will fail unexpectedly with the
>> error messages like:
>>
>>   error: ISO C forbids an empty translation unit [-Wpedantic]
>>
>> This patch is to fix empty TUs with one dummy variable definition
>> accordingly.
> 
> You can also use
>   enum{a};
> which is shorter, but more importantly does not generate any code.
> You can also do
>   extern int dummy;
> of course -- same idea, no definitions, only declarations.
> 

The "int dummy" I used follows some existing practice.  IMHO, in this
context it doesn't matter whether it generates code or not: any of
these alternatives still generates an assembly or object file, and
the generated file gets removed after the check.

May I still keep this "int dummy" to align with existing practices?

>> I'll push this soon if no objections.
> 
>> @@ -6523,6 +6531,7 @@ proc check_effective_target_ppc_float128 { } {
>>  #ifndef __FLOAT128__
>>nope no good
>>  #endif
>> +int dummy;
> 
> At least put it in #else then?  Or just do things a bit more elegantly
> (do a dummy function around this for example).
> 

OK.  Since it still emits an error even without "#else", I didn't bother
to add it.  I will add it, and update the "nope no good" to "#error
doesn't have float128 support".

BR,
Kewen

Re: [PATCH] rs6000/test: Fix empty TU in some cases of effective targets

2022-07-21 Thread Segher Boessenkool

Hi!

On Fri, Jul 22, 2022 at 08:41:43AM +0800, Kewen.Lin wrote:
> Hi Segher,
> 
> Thanks for the comments!

Always.

> >> This patch is to fix empty TUs with one dummy variable definition
> >> accordingly.
> > 
> > You can also use
> >   enum{a};
> > which is shorter, but more importantly does not generate any code.
> > You can also do
> >   extern int dummy;
> > of course -- same idea, no definitions, only declarations.
> 
> The used "int dummy" follows some existing practices, IMHO in this
> context it doesn't matter that it will generate code or not, any of
> these alternatives still generates an assembly or object file, but
> the generated file gets removed after the checking.

It doesn't matter here, sure.  But it is certainly simple enough to make
it "extern int dummy" instead, not giving a bad example for future cases
where it may matter :-)

> May I still keep this "int dummy" to align with existing practices?

Of course, it was just advice.  If things are wrong (in my opinion that
is!), I'll say so.

> > At least put it in #else then?  Or just do things a bit more elegantly
> > (do a dummy function around this for example).
> 
> OK, since it can still emit error even without "#else", I didn't bother
> to add it.  I will add it, and update the "nope no good" to "#error
> doesn't have float128 support".

Just say

===
void nope (void)
{
#ifndef __FLOAT128__
nope no good
#endif
}
===

which works in all cases?

Less maintenance is a good thing :-)
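
For illustration, a minimal sketch of the function-wrapped form with a stand-in macro, so it can be compiled anywhere. DEMO_FEATURE and feature_probe are hypothetical names (the real check uses __FLOAT128__ and a void function); the int return value is only there so that a caller can observe that the probe compiled.

```c
/* Stand-in for a target macro such as __FLOAT128__ (hypothetical name).  */
#define DEMO_FEATURE 1

/* The function definition keeps the translation unit non-empty whether
   or not the feature macro is defined, so -pedantic-errors has nothing
   to complain about when the feature is present.  */
int
feature_probe (void)
{
#ifndef DEMO_FEATURE
  nope no good   /* deliberately invalid: forces a compile error */
#endif
  return 1;
}
```

Removing the #define turns the body into a syntax error, which is the "feature missing" answer for the effective-target check.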


Segher

Re: [PATCH] rs6000/test: Fix empty TU in some cases of effective targets

2022-07-21 Thread Kewen.Lin via Gcc-patches

Hi!

on 2022/7/22 09:02, Segher Boessenkool wrote:
> Hi!
> 
> On Fri, Jul 22, 2022 at 08:41:43AM +0800, Kewen.Lin wrote:
>> Hi Segher,
>>
>> Thanks for the comments!
> 
> Always.
> 
>>>> This patch is to fix empty TUs with one dummy variable definition
>>>> accordingly.
>>>
>>> You can also use
>>>   enum{a};
>>> which is shorter, but more importantly does not generate any code.
>>> You can also do
>>>   extern int dummy;
>>> of course -- same idea, no definitions, only declarations.
>>
>> The used "int dummy" follows some existing practices, IMHO in this
>> context it doesn't matter that it will generate code or not, any of
>> these alternatives still generates an assembly or object file, but
>> the generated file gets removed after the checking.
> 
> It doesn't matter here, sure.  But it is certainly simple enough to make
> it "extern int dummy" instead, not giving a bad example for future cases
> where it may matter :-)
> 

OK.

>> May I still keep this "int dummy" to align with existing practices?
> 
> Of course, it was just advice.  If things are wrong (in my opinion that
> is!), I'll say so.
> 

Got it, thanks!  :)

>>> At least put it in #else then?  Or just do things a bit more elegantly
>>> (do a dummy function around this for example).
>>
>> OK, since it can still emit error even without "#else", I didn't bother
>> to add it.  I will add it, and update the "nope no good" to "#error
>> doesn't have float128 support".
> 
> Just say
> 
> ===
> void nope (void)
> {
> #ifndef __FLOAT128__
>   nope no good
> #endif
> }
> ===
> 
> which works in all cases?

Yeah, good idea, I'll make a new version of patch based on this.

Thanks again!

BR,
Kewen

> 
> Less maintenance is a good thing :-)
> 
> 
> Segher

[r13-1786 Regression] FAIL: gcc.dg/analyzer/stdarg-3.c (test for excess errors) on Linux/x86_64

2022-07-21 Thread skpandey--- via Gcc-patches

On Linux/x86_64,

b852aa7f265424c8e2036899da5d8306ff06a16c is the first bad commit
commit b852aa7f265424c8e2036899da5d8306ff06a16c
Author: David Malcolm 
Date:   Thu Jul 21 17:29:26 2022 -0400

analyzer: fix -Wanalyzer-va-list-exhausted false +ve on va_arg in 
subroutine [PR106383]

caused

FAIL: gcc.dg/analyzer/stdarg-3.c (test for excess errors)

with GCC configured with

../../gcc/configure 
--prefix=/local/skpandey/gccwork/toolwork/gcc-bisect-master/master/r13-1786/usr 
--enable-clocale=gnu --with-system-zlib --with-demangler-in-ld 
--with-fpmath=sse --enable-languages=c,c++,fortran --enable-cet --without-isl 
--enable-libmpx x86_64-linux --disable-bootstrap

To reproduce:

$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="analyzer.exp=gcc.dg/analyzer/stdarg-3.c 
--target_board='unix{-m32}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="analyzer.exp=gcc.dg/analyzer/stdarg-3.c 
--target_board='unix{-m32\ -march=cascadelake}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="analyzer.exp=gcc.dg/analyzer/stdarg-3.c 
--target_board='unix{-m64}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="analyzer.exp=gcc.dg/analyzer/stdarg-3.c 
--target_board='unix{-m64\ -march=cascadelake}'"

(Please do not reply to this email, for question about this report, contact me 
at skpgkp2 at gmail dot com)

[PATCH] Adjust testcase.

2022-07-21 Thread liuhongt via Gcc-patches

r13-1762-gf9d4c3b45c5ed5f45c8089c990dbd4e181929c3d lowers complex type
moves to scalars, but testcase pr23911 scans for a __complex__
constant, which is then never present, so adjust the testcase to scan
for the IMAGPART/REALPART_EXPR constants separately.

Pushed as obvious patch.

gcc/testsuite/ChangeLog

PR tree-optimization/106010
* gcc.dg/pr23911.c: Scan IMAGPART/REALPART_EXPR = ** instead
of __complex__ since COMPLEX_CST is lower to scalars.
---
 gcc/testsuite/gcc.dg/pr23911.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.dg/pr23911.c b/gcc/testsuite/gcc.dg/pr23911.c
index 3fa041222de..691f3507db2 100644
--- a/gcc/testsuite/gcc.dg/pr23911.c
+++ b/gcc/testsuite/gcc.dg/pr23911.c
@@ -16,5 +16,6 @@ test (void)
 
 /* After DCE2 which runs after FRE, the expressions should be fully
constant folded.  There should be no loads from b left.  */
-/* { dg-final { scan-tree-dump-times "__complex__ \\\(1.0e\\\+0, 0.0\\\)" 2 "dce3" } } */
+/* { dg-final { scan-tree-dump-times {(?n)REALPART_EXPR.*= 1\.0e\+0} 2 "dce3" } } */
+/* { dg-final { scan-tree-dump-times {(?n)IMAGPART_EXPR.*= 0\.0} 2 "dce3" } } */
 /* { dg-final { scan-tree-dump-times "= b" 0 "dce3" } } */
-- 
2.18.1

Re: [PATCH] rs6000/test: Update some cases with -mdejagnu-tune

2022-07-21 Thread Kewen.Lin via Gcc-patches

Hi Segher,

Thanks for the comments!

on 2022/7/22 02:48, Segher Boessenkool wrote:
> Hi!
> 
> On Wed, Jul 20, 2022 at 05:31:11PM +0800, Kewen.Lin wrote:
>> As PR106345 shows, some test cases should be updated with
>> -mdejagnu-tune, since their test points are sensitive to
>> rs6000_tune, such as: group_ending_nop, loop align (ic),
>> float conversion cost etc.
> 
> It does not make sense to require -mdejagnu-tune= if -mdejagnu-cpu= is
> already given?  What is the failure case?
> 

I think the cpu setting only determines the tune setting when the tune
setting isn't explicitly provided, as in:

  if (rs6000_tune_index >= 0)
    tune_index = rs6000_tune_index;
  else if (cpu_index >= 0)
    rs6000_tune_index = tune_index = cpu_index;

As PR106345 shows, GCC can use an explicit tune setting when it's
configured with one; even if "-mdejagnu-cpu=" is given, it doesn't
override the explicitly given tune, so we need an explicit
"-mdejagnu-tune=".
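
To spell out the precedence, here is a tiny stand-alone model of that selection. It is only a sketch: demo_effective_tune is a made-up name, a negative index means "not given", and the default_index fallback is my assumption about what happens when neither is set (the snippet quoted above does not show that branch).

```c
/* Hypothetical model of the tune selection: an explicit tune (from
   -mtune= or a configured --with-tune=) wins, otherwise the cpu choice
   is used, otherwise an assumed default.  */
static int
demo_effective_tune (int tune_index, int cpu_index, int default_index)
{
  if (tune_index >= 0)
    return tune_index;
  if (cpu_index >= 0)
    return cpu_index;
  return default_index;
}
```

With a configured tune of 9 and a testcase cpu of 7 the result is 9, which is the situation seen in the PR.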

One failure example is gcc.target/powerpc/loop_align.c

See function rs6000_loop_align:

/* Implement LOOP_ALIGN. */
align_flags
rs6000_loop_align (rtx label)
{

...

  /* Align small loops to 32 bytes to fit in an icache sector, otherwise return
     default. */
  if (ninsns > 4 && ninsns <= 8
      && (rs6000_tune == PROCESSOR_POWER4
          || rs6000_tune == PROCESSOR_POWER5
          || rs6000_tune == PROCESSOR_POWER6
          || rs6000_tune == PROCESSOR_POWER7
          || rs6000_tune == PROCESSOR_POWER8))
    return align_flags (5);
  else
    return align_loops;

Although the test case has adopted the option "-mdejagnu-cpu=power7",
the configured "--with-tune-64=power9" takes effect and makes it
return align_loops instead of align_flags (5).
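
A simplified model of that decision, just to show how the tune value rather than the cpu value drives the result. The enum and function names are hypothetical, and only two processors are modelled; the return value is the log2 of the alignment, so 5 means 32 bytes.

```c
/* Hypothetical stand-ins for two rs6000_tune values.  */
enum demo_tune { DEMO_TUNE_POWER7, DEMO_TUNE_POWER9 };

/* Return log2 of the loop alignment: 5 (32 bytes) for small loops when
   tuning for the older processor, else the caller-supplied default.  */
static int
demo_loop_align_log (int ninsns, enum demo_tune tune, int default_log)
{
  if (ninsns > 4 && ninsns <= 8 && tune == DEMO_TUNE_POWER7)
    return 5;
  return default_log;
}
```

So a 6-insn loop gets 32-byte alignment only when the effective tune is the older processor, whatever -mcpu= says.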

>> This patch is to replace -mdejagnu-cpu with -mdejagnu-tune
>> or append -mdejagnu-tune (keep the original -mdejagnu-cpu
>> when it's required) accordingly.
> 
> It is *always* required.  Testcases with -mtune= but unspecified -mcpu=
> make no sense.
> 

The loop_align.c testing made me think that if we know the insn count for
the loop is in the range (4,8] on all cpus, then the cpu setting doesn't
matter.

I think I get your point: it's risky to assume that, even if it works
for all existing cpus.  I will update with an explicit -mdejagnu-cpu here.

>> --- a/gcc/testsuite/gcc.target/powerpc/compress-float-ppc-pic.c
>> +++ b/gcc/testsuite/gcc.target/powerpc/compress-float-ppc-pic.c
>> @@ -1,5 +1,5 @@
>>  /* { dg-do compile { target powerpc_fprs } } */
>> -/* { dg-options "-O2 -fpic -mdejagnu-cpu=power5" } */
>> +/* { dg-options "-O2 -fpic -mdejagnu-cpu=power5 -mdejagnu-tune=power5" } */
>>  /* { dg-require-effective-target fpic } */
> 
> This should only make a difference if you have -mtune= in your
> RUNTEST_FLAGS, and you shouldn't do silly things like that.  I suspect
> you see it in other cases, and those are actual bugs then, that need
> actual fixing instead of sweeping under the carpet.
> 

Unfortunately it's due to the explicit tune setting in configuration.

> The testcase suggests this is with a compiler configured with
> --with-cpu= --with-tune=, which should just work, and -mcpu= should
> override both of those!
> 

Unfortunately -mcpu= (-mdejagnu-cpu=) doesn't actually override here.

BR,
Kewen
