[PATCH] swap: Fix incorrect lane extraction by vec_extract() [PR106770]

2023-01-04 Thread Surya Kumari Jangala via Gcc-patches
swap: Fix incorrect lane extraction by vec_extract() [PR106770]

In the routine rs6000_analyze_swaps(), special handling of swappable
instructions is done even if the webs that contain the swappable
instructions are not optimized, i.e., the webs do not contain any
permuting load/store instructions along with the associated register
swap instructions. Doing special handling in such webs will result in
the extracted lane being adjusted unnecessarily for vec_extract.

Modifying swappable instructions is also incorrect in webs where
loads/stores on quad word aligned addresses are changed to lvx/stvx.
Similarly, in webs where swap(load(vector constant)) instructions are
replaced with load(swapped vector constant), the swappable
instructions should not be modified.
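
To make the failure mode concrete, here is a toy model in plain C (not GCC code). It assumes a two-doubleword register, and the names `v2di`, `swap_dw` and `extract` are made up for illustration. The point is only that rewriting lane n to 1 - n is correct when the register really holds swapped data, and wrong when the swaps were left in place.

```c
#include <assert.h>

/* Toy model of a register holding two doublewords (hypothetical type,
   not a GCC data structure).  */
typedef struct { long long dw[2]; } v2di;

/* Model of the doubleword swap performed by a register swap insn.  */
static v2di swap_dw(v2di v)
{
    v2di r = { { v.dw[1], v.dw[0] } };
    return r;
}

/* Extract a lane.  LANE_ADJUSTED models the special handling that
   rewrites lane n to 1 - n because the pass assumes the register
   contents are swapped.  */
static long long extract(v2di reg, unsigned lane, int lane_adjusted)
{
    assert(lane < 2);
    return reg.dw[lane_adjusted ? 1 - lane : lane];
}
```

In an optimized web the register holds swap_dw (v), so the adjusted extract returns v.dw[lane].  In a web that was not optimized the register still holds v, and the same adjustment returns the other lane.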

2023-01-04  Surya Kumari Jangala  

gcc/
PR rtl-optimization/106770
* rs6000-p8swap.cc (rs6000_analyze_swaps): .

gcc/testsuite/
PR rtl-optimization/106770
* gcc.target/powerpc/pr106770.c: New test.
---

diff --git a/gcc/config/rs6000/rs6000-p8swap.cc b/gcc/config/rs6000/rs6000-p8swap.cc
index 19fbbfb67dc..7ed39251df9 100644
--- a/gcc/config/rs6000/rs6000-p8swap.cc
+++ b/gcc/config/rs6000/rs6000-p8swap.cc
@@ -179,6 +179,9 @@ class swap_web_entry : public web_entry_base
   unsigned int special_handling : 4;
   /* Set if the web represented by this entry cannot be optimized.  */
   unsigned int web_not_optimizable : 1;
+  /* Set if the web represented by this entry has been optimized, ie,
+ register swaps of permuting loads/stores have been removed.  */
+  unsigned int web_is_optimized : 1;
   /* Set if this insn should be deleted.  */
   unsigned int will_delete : 1;
 };
@@ -2627,22 +2630,43 @@ rs6000_analyze_swaps (function *fun)
   /* For each load and store in an optimizable web (which implies
  the loads and stores are permuting), find the associated
  register swaps and mark them for removal.  Due to various
- optimizations we may mark the same swap more than once.  Also
- perform special handling for swappable insns that require it.  */
+ optimizations we may mark the same swap more than once. Fix up
+ the non-permuting loads and stores by converting them into
+ permuting ones.  */
   for (i = 0; i < e; ++i)
 if ((insn_entry[i].is_load || insn_entry[i].is_store)
&& insn_entry[i].is_swap)
   {
swap_web_entry* root_entry
  = (swap_web_entry*)((&insn_entry[i])->unionfind_root ());
-   if (!root_entry->web_not_optimizable)
+   if (!root_entry->web_not_optimizable) {
  mark_swaps_for_removal (insn_entry, i);
+  root_entry->web_is_optimized = true;
+}
   }
-else if (insn_entry[i].is_swappable && insn_entry[i].special_handling)
+else if (insn_entry[i].is_swappable
+ && (insn_entry[i].special_handling == SH_NOSWAP_LD ||
+ insn_entry[i].special_handling == SH_NOSWAP_ST))
+  {
+swap_web_entry* root_entry
+  = (swap_web_entry*)((&insn_entry[i])->unionfind_root ());
+if (!root_entry->web_not_optimizable) {
+  handle_special_swappables (insn_entry, i);
+  root_entry->web_is_optimized = true;
+}
+  }
+
+  /* Perform special handling for swappable insns that require it. 
+ Note that special handling should be done only for those 
+ swappable insns that are present in webs optimized above.  */
+  for (i = 0; i < e; ++i)
+if (insn_entry[i].is_swappable && insn_entry[i].special_handling &&
+!(insn_entry[i].special_handling == SH_NOSWAP_LD || 
+  insn_entry[i].special_handling == SH_NOSWAP_ST))
   {
swap_web_entry* root_entry
  = (swap_web_entry*)((&insn_entry[i])->unionfind_root ());
-   if (!root_entry->web_not_optimizable)
+   if (root_entry->web_is_optimized)
  handle_special_swappables (insn_entry, i);
   }
 
diff --git a/gcc/testsuite/gcc.target/powerpc/pr106770.c b/gcc/testsuite/gcc.target/powerpc/pr106770.c
new file mode 100644
index 000..84e9aead975
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr106770.c
@@ -0,0 +1,20 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_p8vector_ok } */
+/* { dg-options "-mdejagnu-cpu=power8 -O3 " } */
+/* { dg-final { scan-assembler-times "xxpermdi" 2 } } */
+
+/* Test case to resolve PR106770  */
+
+#include <altivec.h>
+
+int cmp2(double a, double b)
+{
+vector double va = vec_promote(a, 1);
+vector double vb = vec_promote(b, 1);
+vector long long vlt = (vector long long)vec_cmplt(va, vb);
+vector long long vgt = (vector long long)vec_cmplt(vb, va);
+vector signed long long vr = vec_sub(vlt, vgt);
+
+return vec_extract(vr, 1);
+}
+


Re: [PATCH] expr.cc: avoid unexpected side effects in expand_expr_divmod optimization

2023-01-04 Thread Jose E. Marchesi via Gcc-patches


ping.
Would this be a good approach for fixing the issue?

> Hi Jakub.
>
>> On Thu, Dec 08, 2022 at 02:02:36PM +0100, Jose E. Marchesi wrote:
>>> So, I guess the right fix would be to call assemble_external_libcall
>>> during final?  The `.global FOO' directive would be generated
>>> immediately before the call sequence, but I guess that would be ok.
>>
>> During final only if all the targets can deal with the effects of
>> assemble_external_libcall being done in the middle of emitting assembly
>> for the function.
>>
>> Otherwise, it could be e.g. done in the first loop of shorten_branches.
>>
>> Note, in calls.cc it is done only for emit_library_call_value_1
>> and not for emit_call_1, so if we do it late, we need to be able to find
>> out what call is to a libcall and what is to a normal call.  If there is
>> no way to differentiate it right now, perhaps we need some flag somewhere,
>> say on a SYMBOL_REF.  And then assemble_external_libcall either only
>> if such a SYMBOL_REF appears in CALL_INSN or sibcall JUMP_INSN, or
>> perhaps anywhere in the function and its constant pool.
>
> All right, the quick-and-dirty patch below seems to DTRT with simple
> examples.
>
> First, when libcalls are generated.  Note only one .global is generated
> for all calls, and it is in about the same position as before:
>
>   $ cat foo.c
>   int foo(unsigned int len, int flag)
>   {
> if (flag)
>   return (((long)len) * 234 / 5);
> return (((long)len) * 2 / 5);
>   }
>   $ cc1 -O2 foo.c
>   $ cat foo.s
>   .file   "foo.c"
>   .text
>   .global __divdi3
>   .align  3
>   .global foo
>   .type   foo, @function
>   foo:
>   mov32   %r1,%r1
>   lsh %r2,32
>   jne %r2,0,.L5
>   mov %r2,5
>   lsh %r1,1
>   call    __divdi3
>   lsh %r0,32
>   arsh    %r0,32
>   exit
>   .L5:
>   mov %r2,5
>   mul %r1,234
>   call    __divdi3
>   lsh %r0,32
>   arsh    %r0,32
>   exit
>   .size   foo, .-foo
>   .ident  "GCC: (GNU) 13.0.0 20221207 (experimental)"
>
> Second, when libcalls are tried by expand_moddiv in a sequence, but then
> discarded and not linked in the main sequence:
>
>   $ cat foo.c
>   int foo(unsigned int len, int flag)
>   {
> if (flag)
>   return (((long)len) * 234 / 5);
> return (((long)len) * 2 / 5);
>   }
>   $ cc1 -O2 foo.c
>   $ cat foo.s
>   .file   "foo.c"
>   .text
>   .align  3
>   .global foo
>   .type   foo, @function
>   foo:
>   mov32   %r0,%r1
>   lsh %r2,32
>   jne %r2,0,.L5
>   add %r0,%r0
>   div %r0,5
>   lsh %r0,32
>   arsh    %r0,32
>   exit
>   .L5:
>   mul %r0,234
>   div %r0,5
>   lsh %r0,32
>   arsh    %r0,32
>   exit
>   .size   foo, .-foo
>   .ident  "GCC: (GNU) 13.0.0 20221207 (experimental)"
>
> Note the .global now is not generated, as desired.
>
> As you can see below, I am adding a new RTX flag `is_libcall', with
> written form "/l".
>
> Before I get into serious testing etc, can you please confirm whether
> this is the right approach or not?
>
> In particular, I am a little bit concerned about the expectation I am
> using that the target of the `call' instruction emitted by emit_call_1
> is always a (MEM (SYMBOL_REF ...)) when it is passed a SYMBOL_REF as the
> first argument (`fun' in emit_library_call_value_1).
>
> Thanks.
>
> diff --git a/gcc/calls.cc b/gcc/calls.cc
> index 6dd6f73e978..6c4a3725272 100644
> --- a/gcc/calls.cc
> +++ b/gcc/calls.cc
> @@ -4370,10 +4370,6 @@ emit_library_call_value_1 (int retval, rtx orgfun, rtx 
> value,
>   || argvec[i].partial != 0)
>update_stack_alignment_for_call (&argvec[i].locate);
>  
> -  /* If this machine requires an external definition for library
> - functions, write one out.  */
> -  assemble_external_libcall (fun);
> -
>original_args_size = args_size;
>args_size.constant = (aligned_upper_bound (args_size.constant
>+ stack_pointer_delta,
> @@ -4717,6 +4713,9 @@ emit_library_call_value_1 (int retval, rtx orgfun, rtx 
> value,
>  valreg,
>  old_inhibit_defer_pop + 1, call_fusage, flags, args_so_far);
>  
> +  /* Mark the emitted call as a libcall with the new flag.  */
> +  RTL_LIBCALL_P (last_call_insn ()) = 1;
> +
>if (flag_ipa_ra)
>  {
>rtx datum = orgfun;
> diff --git a/gcc/final.cc b/gcc/final.cc
> index eea572238f6..df57de5afd0 100644
> --- a/gcc/final.cc
> +++ b/gcc/final.cc
> @@ -815,6 +815,8 @@ make_pass_compute_alignments (gcc::context *ctxt)
> reorg.cc, since the branch splitting exposes new instructions with delay
> slots.  */
>  
> +static rtx call_from_call_insn (rtx_call_insn *insn);
> +
>  void
>  shorten_branches (rtx_insn *first)
>  {
> @@ -850,6 +852,24 @@ shorten_branches (rtx_insn *first)
>for (insn = get_insns (), i = 1; insn; insn = NEXT_INS

[PATCH] ubsan: Avoid narrowing of multiply for -fsanitize=signed-integer-overflow [PR108256]

2023-01-04 Thread Jakub Jelinek via Gcc-patches
Hi!

We shouldn't narrow multiplications originally done in signed types,
because the original multiplication might overflow but the narrowed
one will be done in unsigned arithmetic and will never overflow.
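
As a rough illustration of why the narrowing hides the overflow, the sketch below computes the same products three ways.  It is not part of the patch; the helper names are invented, and it assumes a 16-bit unsigned short and a 32-bit int, as on the lp64 and ilp32 targets the test runs on.

```c
#include <assert.h>
#include <limits.h>

/* Exact product of two unsigned short values, computed in a wider type.  */
static long long exact_product(unsigned short x, unsigned short y)
{
    return (long long)x * (long long)y;
}

/* Would `x * y` overflow once both operands are promoted to int?  */
static int overflows_int(unsigned short x, unsigned short y)
{
    /* The wide type must really be wider than int for this check.  */
    assert(sizeof(int) < sizeof(long long));
    return exact_product(x, y) > INT_MAX;
}

/* The narrowed form: multiply in unsigned arithmetic (which wraps and
   never overflows) and keep only the low bits of the result.  */
static unsigned short narrowed_product(unsigned short x, unsigned short y)
{
    return (unsigned short)((unsigned)x * (unsigned)y);
}
```

Both forms return the same unsigned short value, but only the int multiplication has an overflow for the sanitizer to report.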

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2023-01-04  Jakub Jelinek  

PR sanitizer/108256
* convert.cc (do_narrow): Punt for MULT_EXPR if original
type doesn't wrap around and -fsanitize=signed-integer-overflow
is on.
* fold-const.cc (fold_unary_loc) : Likewise.

* c-c++-common/ubsan/pr108256.c: New test.

--- gcc/convert.cc.jj   2023-01-02 09:32:25.123245723 +0100
+++ gcc/convert.cc  2023-01-03 10:02:36.309706050 +0100
@@ -384,6 +384,14 @@ do_narrow (location_t loc,
   && sanitize_flags_p (SANITIZE_SI_OVERFLOW))
 return NULL_TREE;
 
+  /* Similarly for multiplication, but in that case it can be
+ problematic even if typex is unsigned type - 0xffff * 0xffff
+ overflows in int.  */
+  if (ex_form == MULT_EXPR
+  && !TYPE_OVERFLOW_WRAPS (TREE_TYPE (expr))
+  && sanitize_flags_p (SANITIZE_SI_OVERFLOW))
+return NULL_TREE;
+
   /* But now perhaps TYPEX is as wide as INPREC.
  In that case, do nothing special here.
  (Otherwise would recurse infinitely in convert.  */
--- gcc/fold-const.cc.jj2023-01-02 09:32:32.756135438 +0100
+++ gcc/fold-const.cc   2023-01-03 10:30:05.492239455 +0100
@@ -9574,7 +9574,9 @@ fold_unary_loc (location_t loc, enum tre
   if (INTEGRAL_TYPE_P (type)
  && TREE_CODE (op0) == MULT_EXPR
  && INTEGRAL_TYPE_P (TREE_TYPE (op0))
- && TYPE_PRECISION (type) < TYPE_PRECISION (TREE_TYPE (op0)))
+ && TYPE_PRECISION (type) < TYPE_PRECISION (TREE_TYPE (op0))
+ && (TYPE_OVERFLOW_WRAPS (TREE_TYPE (op0))
+ || !sanitize_flags_p (SANITIZE_SI_OVERFLOW)))
{
  /* Be careful not to introduce new overflows.  */
  tree mult_type;
--- gcc/testsuite/c-c++-common/ubsan/pr108256.c.jj  2023-01-03 10:14:49.064284638 +0100
+++ gcc/testsuite/c-c++-common/ubsan/pr108256.c 2023-01-03 10:43:58.838326443 +0100
@@ -0,0 +1,27 @@
+/* PR sanitizer/108256 */
+/* { dg-do run { target { lp64 || ilp32 } } } */
+/* { dg-options "-fsanitize=signed-integer-overflow" } */
+
+unsigned short
+foo (unsigned short x, unsigned short y)
+{
+  return x * y;
+}
+
+unsigned short
+bar (unsigned short x, unsigned short y)
+{
+  int r = x * y;
+  return r;
+}
+
+int
+main ()
+{
+  volatile unsigned short a = foo (0xffff, 0xffff);
+  volatile unsigned short b = bar (0xfffe, 0xfffe);
+  return 0;
+}
+
+/* { dg-output "signed integer overflow: 65535 \\\* 65535 cannot be represented in type 'int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]*signed integer overflow: 65534 \\\* 65534 cannot be represented in type 'int'" } */

Jakub



[PATCH] vrp: Handle pointers in maybe_set_nonzero_bits [PR108253]

2023-01-04 Thread Jakub Jelinek via Gcc-patches
Hi!

maybe_set_nonzero_bits calls set_nonzero_bits which asserts that
var doesn't have pointer type.  While we could punt for those
cases, I think we can handle at least some easy cases.
Earlier in maybe_set_nonzero_bits we've checked this is on
(var & cst) == 0
edge and the other edge is __builtin_unreachable, so if cst
is say 3 as in the testcase, we want to turn it into 4 byte alignment
of the pointer.
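
The mask-to-alignment step can be sketched in plain C as below.  This is only an approximation of the idea: it uses unsigned long where the patch uses wide_int, and `ctz_ul` and `align_from_mask` are hypothetical helpers, not GCC functions.

```c
#include <assert.h>
#include <limits.h>

/* Count the trailing zero bits of a nonzero value.  */
static unsigned ctz_ul(unsigned long v)
{
    unsigned n = 0;
    assert(v != 0);
    while ((v & 1UL) == 0) {
        v >>= 1;
        ++n;
    }
    return n;
}

/* Given CST from a known-true (var & CST) == 0, return the byte
   alignment this implies for var, or 0 if nothing useful follows.  */
static unsigned align_from_mask(unsigned long cst)
{
    unsigned long w = ~cst;
    unsigned bits;

    if (w == 0)
        return 0;               /* every bit masked: not handled here */
    bits = ctz_ul(w);           /* number of low bits known to be zero */
    if (bits == 0 || bits >= sizeof(unsigned) * CHAR_BIT)
        return 0;
    return 1U << bits;
}
```

With cst == 3 the two low bits are known to be zero, which gives 4 byte alignment; a mask such as 4 says nothing about bit 0 and yields no alignment.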

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2023-01-04  Jakub Jelinek  

PR tree-optimization/108253
* tree-vrp.cc (maybe_set_nonzero_bits): Handle var with pointer
types.

* g++.dg/opt/pr108253.C: New test.

--- gcc/tree-vrp.cc.jj  2023-01-02 09:32:53.634833769 +0100
+++ gcc/tree-vrp.cc 2023-01-03 15:57:51.613239761 +0100
@@ -789,8 +789,22 @@ maybe_set_nonzero_bits (edge e, tree var
return;
 }
   cst = gimple_assign_rhs2 (stmt);
-  set_nonzero_bits (var, wi::bit_and_not (get_nonzero_bits (var),
- wi::to_wide (cst)));
+  if (POINTER_TYPE_P (TREE_TYPE (var)))
+{
+  struct ptr_info_def *pi = SSA_NAME_PTR_INFO (var);
+  if (pi && pi->misalign)
+   return;
+  wide_int w = wi::bit_not (wi::to_wide (cst));
+  unsigned int bits = wi::ctz (w);
+  if (bits == 0 || bits >= HOST_BITS_PER_INT)
+   return;
+  unsigned int align = 1U << bits;
+  if (pi == NULL || pi->align < align)
+   set_ptr_info_alignment (get_ptr_info (var), align, 0);
+}
+  else
+set_nonzero_bits (var, wi::bit_and_not (get_nonzero_bits (var),
+   wi::to_wide (cst)));
 }
 
 /* Searches the case label vector VEC for the index *IDX of the CASE_LABEL
--- gcc/testsuite/g++.dg/opt/pr108253.C.jj  2023-01-03 16:02:16.366438488 +0100
+++ gcc/testsuite/g++.dg/opt/pr108253.C 2023-01-03 16:02:33.549191780 +0100
@@ -0,0 +1,48 @@
+// PR tree-optimization/108253
+// { dg-do compile { target c++11 } }
+// { dg-options "-O2" }
+
+struct S
+{
+  int *s;
+  S () : s (new int) {}
+  S (const S &r) noexcept : s (r.s) { __atomic_fetch_add (r.s, 1, 4); }
+};
+struct T
+{
+  explicit T (const S &x) : t (x) {}
+  const S t;
+};
+struct U
+{
+  operator int () const { new T (u); return 0; }
+  S u;
+};
+bool foo (int matcher);
+unsigned long bar (unsigned long pos, unsigned long end_pos);
+struct V
+{
+  alignas (4) char v[4];
+};
+struct W
+{
+  void baz ()
+  {
+if (!w) __builtin_abort ();
+if (reinterpret_cast <__UINTPTR_TYPE__> (w->v) % 4 != 0) __builtin_abort ();
+__builtin_unreachable ();
+  }
+  [[gnu::noinline]] void qux (unsigned long) { if (!w) bar (0, x); } 
+  V *w = nullptr;
+  unsigned x = 0;
+};
+
+void
+test ()
+{
+  W w;
+  U t;
+  if (!foo (t))
+w.baz ();
+  w.qux (0);
+}

Jakub



Re: [PATCH] ubsan: Avoid narrowing of multiply for -fsanitize=signed-integer-overflow [PR108256]

2023-01-04 Thread Richard Biener via Gcc-patches



> On 04.01.2023 at 10:09, Jakub Jelinek via Gcc-patches wrote:
> 
> Hi!
> 
> We shouldn't narrow multiplications originally done in signed types,
> because the original multiplication might overflow but the narrowed
> one will be done in unsigned arithmetic and will never overflow.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Ok. Richard 

> 2023-01-04  Jakub Jelinek  
> 
>PR sanitizer/108256
>* convert.cc (do_narrow): Punt for MULT_EXPR if original
>type doesn't wrap around and -fsanitize=signed-integer-overflow
>is on.
>* fold-const.cc (fold_unary_loc) : Likewise.
> 
>* c-c++-common/ubsan/pr108256.c: New test.
> 
> --- gcc/convert.cc.jj2023-01-02 09:32:25.123245723 +0100
> +++ gcc/convert.cc2023-01-03 10:02:36.309706050 +0100
> @@ -384,6 +384,14 @@ do_narrow (location_t loc,
>   && sanitize_flags_p (SANITIZE_SI_OVERFLOW))
> return NULL_TREE;
> 
> +  /* Similarly for multiplication, but in that case it can be
> + problematic even if typex is unsigned type - 0xffff * 0xffff
> + overflows in int.  */
> +  if (ex_form == MULT_EXPR
> +  && !TYPE_OVERFLOW_WRAPS (TREE_TYPE (expr))
> +  && sanitize_flags_p (SANITIZE_SI_OVERFLOW))
> +return NULL_TREE;
> +
>   /* But now perhaps TYPEX is as wide as INPREC.
>  In that case, do nothing special here.
>  (Otherwise would recurse infinitely in convert.  */
> --- gcc/fold-const.cc.jj2023-01-02 09:32:32.756135438 +0100
> +++ gcc/fold-const.cc2023-01-03 10:30:05.492239455 +0100
> @@ -9574,7 +9574,9 @@ fold_unary_loc (location_t loc, enum tre
>   if (INTEGRAL_TYPE_P (type)
>  && TREE_CODE (op0) == MULT_EXPR
>  && INTEGRAL_TYPE_P (TREE_TYPE (op0))
> -  && TYPE_PRECISION (type) < TYPE_PRECISION (TREE_TYPE (op0)))
> +  && TYPE_PRECISION (type) < TYPE_PRECISION (TREE_TYPE (op0))
> +  && (TYPE_OVERFLOW_WRAPS (TREE_TYPE (op0))
> +  || !sanitize_flags_p (SANITIZE_SI_OVERFLOW)))
>{
>  /* Be careful not to introduce new overflows.  */
>  tree mult_type;
> --- gcc/testsuite/c-c++-common/ubsan/pr108256.c.jj  2023-01-03 10:14:49.064284638 +0100
> +++ gcc/testsuite/c-c++-common/ubsan/pr108256.c 2023-01-03 10:43:58.838326443 +0100
> @@ -0,0 +1,27 @@
> +/* PR sanitizer/108256 */
> +/* { dg-do run { target { lp64 || ilp32 } } } */
> +/* { dg-options "-fsanitize=signed-integer-overflow" } */
> +
> +unsigned short
> +foo (unsigned short x, unsigned short y)
> +{
> +  return x * y;
> +}
> +
> +unsigned short
> +bar (unsigned short x, unsigned short y)
> +{
> +  int r = x * y;
> +  return r;
> +}
> +
> +int
> +main ()
> +{
> +  volatile unsigned short a = foo (0xffff, 0xffff);
> +  volatile unsigned short b = bar (0xfffe, 0xfffe);
> +  return 0;
> +}
> +
> +/* { dg-output "signed integer overflow: 65535 \\\* 65535 cannot be represented in type 'int'\[^\n\r]*(\n|\r\n|\r)" } */
> +/* { dg-output "\[^\n\r]*signed integer overflow: 65534 \\\* 65534 cannot be represented in type 'int'" } */
> 
>Jakub
> 


[PATCH] rs6000: Don't use optimize_function_for_speed_p too early [PR108184]

2023-01-04 Thread Kewen.Lin via Gcc-patches
Hi,

As Honza pointed out in [1], the current uses of
optimize_function_for_speed_p in rs6000_option_override_internal
are too early: the results of optimize_function_for_{speed,size}_p
can change later, for example due to profile feedback or the
handling of some function attributes.

This patch moves the optimize_function_for_speed_p checks to the
places where the corresponding flags are used, which follows
existing practice.  Maybe we can cache the result somewhere at an
appropriate time, but that is a separate matter.
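
The difference between querying early and querying at the use site can be shown with a small toy model.  Nothing here is GCC code: `toy_function`, `toy_flags` and the helper names are invented, and a single integer field stands in for everything that profile feedback or attributes might change.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for per-function state that can still change
   after option processing.  */
struct toy_function { int optimize_for_size; };

/* Hypothetical stand-in for the target flags set at option time.  */
struct toy_flags { int fusion_sign; };

static int toy_optimize_for_speed_p(const struct toy_function *fn)
{
    assert(fn != NULL);
    return !fn->optimize_for_size;
}

/* Early approach: bake the speed query into the flag at option time.  */
static struct toy_flags override_early(const struct toy_function *fn)
{
    struct toy_flags f = { toy_optimize_for_speed_p(fn) };
    return f;
}

/* Late approach: the flag records only the option state ...  */
static struct toy_flags override_late(void)
{
    struct toy_flags f = { 1 };
    return f;
}

/* ... and the speed query is repeated at every use site.  */
static int use_fusion_sign(struct toy_flags f, const struct toy_function *fn)
{
    return f.fusion_sign && toy_optimize_for_speed_p(fn);
}
```

If the function turns out to be optimized for size after option processing, the early flag stays set and is stale, while the use-site check sees the current state.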

[1] https://gcc.gnu.org/pipermail/gcc-patches/2022-November/607527.html

Bootstrapped and regtested on powerpc64-linux-gnu P8 and
powerpc64le-linux-gnu P9 and P10.

I'm going to push this soon if there are no objections.

BR,
Kewen
-
PR target/108184

gcc/ChangeLog:

* config/rs6000/rs6000.cc (rs6000_option_override_internal): Remove
all optimize_function_for_speed_p uses.
* config/rs6000/predicates.md (fusion_gpr_mem_load): Call
optimize_function_for_speed_p along with TARGET_P8_FUSION_SIGN.
(fusion_gpr_load_p): Likewise.
(expand_fusion_gpr_load): Likewise.
(rs6000_call_aix): Call optimize_function_for_speed_p along with
TARGET_SAVE_TOC_INDIRECT.

gcc/testsuite/ChangeLog:

* gcc.target/powerpc/pr108184.c: New test.
---
 gcc/config/rs6000/predicates.md |  4 +++-
 gcc/config/rs6000/rs6000.cc | 16 ++--
 gcc/testsuite/gcc.target/powerpc/pr108184.c | 15 +++
 3 files changed, 28 insertions(+), 7 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pr108184.c

diff --git a/gcc/config/rs6000/predicates.md b/gcc/config/rs6000/predicates.md
index b1fcc69bb60..11f1779e7bf 100644
--- a/gcc/config/rs6000/predicates.md
+++ b/gcc/config/rs6000/predicates.md
@@ -1878,7 +1878,9 @@ (define_predicate "fusion_gpr_mem_load"

   /* Handle sign/zero extend.  */
   if (GET_CODE (op) == ZERO_EXTEND
-  || (TARGET_P8_FUSION_SIGN && GET_CODE (op) == SIGN_EXTEND))
+  || (TARGET_P8_FUSION_SIGN
+ && GET_CODE (op) == SIGN_EXTEND
+ && optimize_function_for_speed_p (cfun)))
 {
   op = XEXP (op, 0);
   mode = GET_MODE (op);
diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index eb7ad5e954f..bbf829f45d9 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -3978,8 +3978,7 @@ rs6000_option_override_internal (bool global_init_p)
   /* If we can shrink-wrap the TOC register save separately, then use
  -msave-toc-indirect unless explicitly disabled.  */
   if ((rs6000_isa_flags_explicit & OPTION_MASK_SAVE_TOC_INDIRECT) == 0
-  && flag_shrink_wrap_separate
-  && optimize_function_for_speed_p (cfun))
+  && flag_shrink_wrap_separate)
 rs6000_isa_flags |= OPTION_MASK_SAVE_TOC_INDIRECT;

   /* Enable power8 fusion if we are tuning for power8, even if we aren't
@@ -4013,7 +4012,6 @@ rs6000_option_override_internal (bool global_init_p)
  zero extending load, and an explicit sign extension.  */
   if (TARGET_P8_FUSION
   && !(rs6000_isa_flags_explicit & OPTION_MASK_P8_FUSION_SIGN)
-  && optimize_function_for_speed_p (cfun)
   && optimize >= 3)
 rs6000_isa_flags |= OPTION_MASK_P8_FUSION_SIGN;

@@ -25604,7 +25602,9 @@ rs6000_call_aix (rtx value, rtx func_desc, rtx tlsarg, rtx cookie)

  /* Can we optimize saving the TOC in the prologue or
 do we need to do it at every call?  */
- if (TARGET_SAVE_TOC_INDIRECT && !cfun->calls_alloca)
+ if (TARGET_SAVE_TOC_INDIRECT
+ && !cfun->calls_alloca
+ && optimize_function_for_speed_p (cfun))
cfun->machine->save_toc_in_prologue = true;
  else
{
@@ -27471,7 +27471,9 @@ fusion_gpr_load_p (rtx addis_reg,   /* register set via addis.  */

   /* Allow sign/zero extension.  */
   if (GET_CODE (mem) == ZERO_EXTEND
-  || (GET_CODE (mem) == SIGN_EXTEND && TARGET_P8_FUSION_SIGN))
+  || (GET_CODE (mem) == SIGN_EXTEND
+ && TARGET_P8_FUSION_SIGN
+ && optimize_function_for_speed_p (cfun)))
 mem = XEXP (mem, 0);

   if (!MEM_P (mem))
@@ -27535,7 +27537,9 @@ expand_fusion_gpr_load (rtx *operands)
   enum rtx_code extend = UNKNOWN;

   if (GET_CODE (orig_mem) == ZERO_EXTEND
-  || (TARGET_P8_FUSION_SIGN && GET_CODE (orig_mem) == SIGN_EXTEND))
+  || (TARGET_P8_FUSION_SIGN
+ && GET_CODE (orig_mem) == SIGN_EXTEND
+ && optimize_function_for_speed_p (cfun)))
 {
   extend = GET_CODE (orig_mem);
   orig_mem = XEXP (orig_mem, 0);
diff --git a/gcc/testsuite/gcc.target/powerpc/pr108184.c b/gcc/testsuite/gcc.target/powerpc/pr108184.c
new file mode 100644
index 000..8f1e91d9258
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr108184.c
@@ -0,0 +1,15 @@
+/* Only possible to fuse sign extended loads with the addis when
+   optimize >= 3 and Power8 fusion takes effects.  */
+/* { dg-options 

[PATCH] rs6000: Make P10_FUSION honour tuning setting

2023-01-04 Thread Kewen.Lin via Gcc-patches
Hi,

We noticed this issue when Segher reviewed the patch for
PR104024.  When there is no explicit setting for option
-mpower10-fusion, we enable OPTION_MASK_P10_FUSION for
TARGET_POWER10.  That is not right; it should honour the
tuning setting instead.
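
A minimal sketch of the intended rule, using made-up names rather than the rs6000 definitions (`TOY_MASK_P10_FUSION`, `toy_cpu` and `resolve_p10_fusion` are all hypothetical): an explicit user setting always wins, otherwise the fusion bit follows the tuning target, not the ISA level.

```c
/* Hypothetical stand-ins for the tuning target and the option mask.  */
enum toy_cpu { TOY_POWER9, TOY_POWER10 };
#define TOY_MASK_P10_FUSION 0x1u

/* Return ISA_FLAGS with the fusion bit resolved.  EXPLICIT_FLAGS has
   the bit set when the user passed the option explicitly.  */
static unsigned resolve_p10_fusion(unsigned isa_flags,
                                   unsigned explicit_flags,
                                   enum toy_cpu tune)
{
    if (!(explicit_flags & TOY_MASK_P10_FUSION)) {
        if (tune == TOY_POWER10)
            isa_flags |= TOY_MASK_P10_FUSION;   /* tuning for Power10 */
        else
            isa_flags &= ~TOY_MASK_P10_FUSION;  /* any other tuning */
    }
    return isa_flags;
}
```

Under this rule, tuning for Power10 enables fusion even when the bit was initially clear, and tuning for something else clears a bit that was only set implicitly.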

This patch fixes it accordingly.  It was bootstrapped and
regtested on powerpc64-linux-gnu P8 and powerpc64le-linux-gnu
P9.  On powerpc64le-linux-gnu P10 it had one regression failure
in the test case gcc.target/powerpc/pr105586.c.  I looked into
it, confirmed that a latent bug was exposed, and filed a
separate bug, PR108273, for it.

I'm going to push this soon if there are no objections.

BR,
Kewen
-
gcc/ChangeLog:

* config/rs6000/rs6000.cc (rs6000_option_override_internal): Make
OPTION_MASK_P10_FUSION implicit setting honour Power10 tuning setting.
* config/rs6000/rs6000-cpus.def (ISA_3_1_MASKS_SERVER): Remove
OPTION_MASK_P10_FUSION.
---
 gcc/config/rs6000/rs6000-cpus.def |  3 +--
 gcc/config/rs6000/rs6000.cc   | 12 +---
 2 files changed, 10 insertions(+), 5 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-cpus.def b/gcc/config/rs6000/rs6000-cpus.def
index c3825bcccd8..4d5544e927a 100644
--- a/gcc/config/rs6000/rs6000-cpus.def
+++ b/gcc/config/rs6000/rs6000-cpus.def
@@ -84,8 +84,7 @@

 #define ISA_3_1_MASKS_SERVER   (ISA_3_0_MASKS_SERVER   \
 | OPTION_MASK_POWER10  \
-| OTHER_POWER10_MASKS  \
-| OPTION_MASK_P10_FUSION)
+| OTHER_POWER10_MASKS)

 /* Flags that need to be turned off if -mno-power9-vector.  */
 #define OTHER_P9_VECTOR_MASKS  (OPTION_MASK_FLOAT128_HW\
diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index 88c865b6b4b..6fa084c0807 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -4378,9 +4378,15 @@ rs6000_option_override_internal (bool global_init_p)
   rs6000_isa_flags &= ~OPTION_MASK_MMA;
 }

-  if (TARGET_POWER10
-  && (rs6000_isa_flags_explicit & OPTION_MASK_P10_FUSION) == 0)
-rs6000_isa_flags |= OPTION_MASK_P10_FUSION;
+  /* Enable power10 fusion if we are tuning for power10, even if we aren't
+ generating power10 instructions.  */
+  if (!(rs6000_isa_flags_explicit & OPTION_MASK_P10_FUSION))
+{
+  if (processor_target_table[tune_index].processor == PROCESSOR_POWER10)
+   rs6000_isa_flags |= OPTION_MASK_P10_FUSION;
+  else
+   rs6000_isa_flags &= ~OPTION_MASK_P10_FUSION;
+}

   /* MMA requires SIMD support as ISA 3.1 claims and our implementation
  such as "*movoo" uses vector pair access which use VSX registers.
--
2.27.0


[PATCH] generic-match-head: Don't assume GENERIC folding is done only early [PR108237]

2023-01-04 Thread Jakub Jelinek via Gcc-patches
Hi!

We ICE on the following testcase, because a valid V2DImode
!= comparison is folded into an unsupported V2DImode > comparison.
The match.pd pattern which does this looks like:
/* Transform comparisons of the form (X & Y) CMP 0 to X CMP2 Z
   where ~Y + 1 == pow2 and Z = ~Y.  */
(for cst (VECTOR_CST INTEGER_CST)
 (for cmp (eq ne)
  icmp (le gt)
  (simplify
   (cmp (bit_and:c@2 @0 cst@1) integer_zerop)
(with { tree csts = bitmask_inv_cst_vector_p (@1); }
 (if (csts && (VECTOR_TYPE_P (TREE_TYPE (@1)) || single_use (@2)))
  (with { auto optab = VECTOR_TYPE_P (TREE_TYPE (@1))
 ? optab_vector : optab_default;
  tree utype = unsigned_type_for (TREE_TYPE (@1)); }
   (if (target_supports_op_p (utype, icmp, optab)
|| (optimize_vectors_before_lowering_p ()
&& (!target_supports_op_p (type, cmp, optab)
|| !target_supports_op_p (type, BIT_AND_EXPR, optab))))
(if (TYPE_UNSIGNED (TREE_TYPE (@1)))
 (icmp @0 { csts; })
 (icmp (view_convert:utype @0) { csts; })))))))))
and that optimize_vectors_before_lowering_p () guarded stuff there
already deals with this problem, not trying to fold a supported comparison
into a non-supported one.  The reason it doesn't work in this case is that
it isn't GIMPLE folding which does this, but GENERIC folding done during
forwprop4 - forward_propagate_into_comparison -> forward_propagate_into_comparison_1
-> combine_cond_expr_cond -> fold_binary_loc -> generic_simplify
and we simply assumed that GENERIC folding happens only before
gimplification.

The following patch fixes that by checking cfun properties instead of
always returning true in those cases.
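
The shape of the new predicates can be modelled outside GCC as follows.  This is a sketch only: `toy_function` and `TOY_PROP_LVEC` are invented stand-ins for cfun and PROP_gimple_lvec, and the bit value is arbitrary.

```c
#include <stddef.h>

/* Hypothetical property bit meaning "vector lowering has been done".  */
#define TOY_PROP_LVEC 0x2u

/* Hypothetical stand-in for the current function's pass properties.  */
struct toy_function { unsigned curr_properties; };

/* True when there is no current function (e.g. folding done by a
   front end) or the lowering property has not been set yet.  */
static int toy_before_vector_lowering_p(const struct toy_function *fn)
{
    return fn == NULL || (fn->curr_properties & TOY_PROP_LVEC) == 0;
}
```

The NULL case keeps the old behaviour for folding that happens without a current function; once the property bit is set, the predicate turns false and the guarded transformation is no longer applied.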

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2023-01-04  Jakub Jelinek  

PR middle-end/108237
* generic-match-head.cc: Include tree-pass.h.
(canonicalize_math_p, optimize_vectors_before_lowering_p): Define
to false if cfun and cfun->curr_properties has PROP_gimple_opt_math
resp. PROP_gimple_lvec property set.

* gcc.c-torture/compile/pr108237.c: New test.

--- gcc/generic-match-head.cc.jj2023-01-02 09:32:42.954988078 +0100
+++ gcc/generic-match-head.cc   2023-01-03 17:31:07.627941369 +0100
@@ -40,6 +40,7 @@ along with GCC; see the file COPYING3.
 #include "tm.h"
 #include "tree-eh.h"
 #include "langhooks.h"
+#include "tree-pass.h"
 
 /* Routine to determine if the types T1 and T2 are effectively
the same for GENERIC.  If T1 or T2 is not a type, the test
@@ -71,7 +72,7 @@ single_use (tree t ATTRIBUTE_UNUSED)
 static inline bool
 canonicalize_math_p ()
 {
-  return true;
+  return !cfun || (cfun->curr_properties & PROP_gimple_opt_math) == 0;
 }
 
 /* Return true if math operations that are beneficial only after
@@ -90,7 +91,7 @@ canonicalize_math_after_vectorization_p
 static inline bool
 optimize_vectors_before_lowering_p ()
 {
-  return true;
+  return !cfun || (cfun->curr_properties & PROP_gimple_lvec) == 0;
 }
 
 /* Return true if successive divisions can be optimized.
--- gcc/testsuite/gcc.c-torture/compile/pr108237.c.jj   2023-01-03 17:35:37.411068635 +0100
+++ gcc/testsuite/gcc.c-torture/compile/pr108237.c  2023-01-03 17:35:22.490282820 +0100
@@ -0,0 +1,14 @@
+/* PR middle-end/108237 */
+
+typedef unsigned char __attribute__((__vector_size__ (1))) U;
+typedef unsigned long long __attribute__((__vector_size__ (16))) V;
+
+U u;
+V v;
+
+V
+foo (void)
+{
+  V w = v != ((unsigned char) ((unsigned char) u == u) & v);
+  return w;
+}

Jakub



[PATCH] c++: Error recovery in merge_default_template_args [PR108206]

2023-01-04 Thread Jakub Jelinek via Gcc-patches
Hi!

We ICE on the following testcase during error recovery: both new_parm
and old_parm are error_mark_node, and the ICE is on
  error ("redefinition of default argument for %q+#D", new_parm);
  inform (DECL_SOURCE_LOCATION (old_parm),
  "original definition appeared here");
where we don't print anything useful for new_parm and ICE trying to
access DECL_SOURCE_LOCATION of old_parm.  I think we shouldn't diagnose
anything when either of the parms is erroneous.  GCC 11, before
merge_default_template_args was added, was doing
  if (TREE_VEC_ELT (tmpl_parms, i) == error_mark_node
  || TREE_VEC_ELT (parms, i) == error_mark_node)
continue;

  tmpl_parm = TREE_VALUE (TREE_VEC_ELT (tmpl_parms, i));
  if (error_operand_p (tmpl_parm))
return false;
in redeclare_class_template.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2023-01-04  Jakub Jelinek  

PR c++/108206
* decl.cc (merge_default_template_args): Return false if either
new_parm or old_parm are erroneous.

* g++.dg/template/pr108206.C: New test.

--- gcc/cp/decl.cc.jj   2022-12-22 11:09:52.026181629 +0100
+++ gcc/cp/decl.cc  2023-01-03 18:11:25.202223528 +0100
@@ -1556,6 +1556,8 @@ merge_default_template_args (tree new_pa
   tree old_parm = TREE_VALUE (TREE_VEC_ELT (old_parms, i));
   tree& new_default = TREE_PURPOSE (TREE_VEC_ELT (new_parms, i));
   tree& old_default = TREE_PURPOSE (TREE_VEC_ELT (old_parms, i));
+  if (error_operand_p (new_parm) || error_operand_p (old_parm))
+   return false;
   if (new_default != NULL_TREE && old_default != NULL_TREE)
{
  auto_diagnostic_group d;
--- gcc/testsuite/g++.dg/template/pr108206.C.jj 2023-01-03 18:14:18.768730450 +0100
+++ gcc/testsuite/g++.dg/template/pr108206.C    2023-01-03 18:13:40.427281176 +0100
@@ -0,0 +1,5 @@
+// PR c++/108206
+// { dg-do compile { target c++11 } }
+
+template  void foo (T1); // { dg-error "'X' has not been declared" }
+template  void foo (T2); // { dg-error "'X' has not been declared" }

Jakub



Re: [PATCH] generic-match-head: Don't assume GENERIC folding is done only early [PR108237]

2023-01-04 Thread Richard Biener via Gcc-patches



> On 04.01.2023 at 10:22, Jakub Jelinek wrote:
> 
> Hi!
> 
> We ICE on the following testcase, because a valid V2DImode
> != comparison is folded into an unsupported V2DImode > comparison.
> The match.pd pattern which does this looks like:
> /* Transform comparisons of the form (X & Y) CMP 0 to X CMP2 Z
>   where ~Y + 1 == pow2 and Z = ~Y.  */
> (for cst (VECTOR_CST INTEGER_CST)
> (for cmp (eq ne)
>  icmp (le gt)
>  (simplify
>   (cmp (bit_and:c@2 @0 cst@1) integer_zerop)
>(with { tree csts = bitmask_inv_cst_vector_p (@1); }
> (if (csts && (VECTOR_TYPE_P (TREE_TYPE (@1)) || single_use (@2)))
>  (with { auto optab = VECTOR_TYPE_P (TREE_TYPE (@1))
> ? optab_vector : optab_default;
>  tree utype = unsigned_type_for (TREE_TYPE (@1)); }
>   (if (target_supports_op_p (utype, icmp, optab)
>|| (optimize_vectors_before_lowering_p ()
>&& (!target_supports_op_p (type, cmp, optab)
>|| !target_supports_op_p (type, BIT_AND_EXPR, optab))))
>(if (TYPE_UNSIGNED (TREE_TYPE (@1)))
> (icmp @0 { csts; })
> (icmp (view_convert:utype @0) { csts; })))))))))
> and that optimize_vectors_before_lowering_p () guarded stuff there
> already deals with this problem, not trying to fold a supported comparison
> into a non-supported one.  The reason it doesn't work in this case is that
> it isn't GIMPLE folding which does this, but GENERIC folding done during
> forwprop4 - forward_propagate_into_comparison -> 
> forward_propagate_into_comparison_1
> -> combine_cond_expr_cond -> fold_binary_loc -> generic_simplify
> and we simply assumed that GENERIC folding happens only before
> gimplification.
> 
> The following patch fixes that by checking cfun properties instead of
> always returning true in those cases.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Ok

Richard 

> 2023-01-04  Jakub Jelinek  
> 
>PR middle-end/108237
>* generic-match-head.cc: Include tree-pass.h.
>(canonicalize_math_p, optimize_vectors_before_lowering_p): Define
>to false if cfun and cfun->curr_properties has PROP_gimple_opt_math
>resp. PROP_gimple_lvec property set.
> 
>* gcc.c-torture/compile/pr108237.c: New test.
> 
> --- gcc/generic-match-head.cc.jj 2023-01-02 09:32:42.954988078 +0100
> +++ gcc/generic-match-head.cc 2023-01-03 17:31:07.627941369 +0100
> @@ -40,6 +40,7 @@ along with GCC; see the file COPYING3.
> #include "tm.h"
> #include "tree-eh.h"
> #include "langhooks.h"
> +#include "tree-pass.h"
> 
> /* Routine to determine if the types T1 and T2 are effectively
>the same for GENERIC.  If T1 or T2 is not a type, the test
> @@ -71,7 +72,7 @@ single_use (tree t ATTRIBUTE_UNUSED)
> static inline bool
> canonicalize_math_p ()
> {
> -  return true;
> +  return !cfun || (cfun->curr_properties & PROP_gimple_opt_math) == 0;
> }
> 
> /* Return true if math operations that are beneficial only after
> @@ -90,7 +91,7 @@ canonicalize_math_after_vectorization_p
> static inline bool
> optimize_vectors_before_lowering_p ()
> {
> -  return true;
> +  return !cfun || (cfun->curr_properties & PROP_gimple_lvec) == 0;
> }
> 
> /* Return true if successive divisions can be optimized.
> --- gcc/testsuite/gcc.c-torture/compile/pr108237.c.jj 2023-01-03 17:35:37.411068635 +0100
> +++ gcc/testsuite/gcc.c-torture/compile/pr108237.c 2023-01-03 17:35:22.490282820 +0100
> @@ -0,0 +1,14 @@
> +/* PR middle-end/108237 */
> +
> +typedef unsigned char __attribute__((__vector_size__ (1))) U;
> +typedef unsigned long long __attribute__((__vector_size__ (16))) V;
> +
> +U u;
> +V v;
> +
> +V
> +foo (void)
> +{
> +  V w = v != ((unsigned char) ((unsigned char) u == u) & v);
> +  return w;
> +}
> 
>Jakub
> 
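The shape of the fix quoted above can be shown in isolation. The following is a minimal sketch, not GCC code: `function_state`, `PROP_LVEC_DONE` and `PROP_OPT_MATH_DONE` are hypothetical stand-ins for `struct function`, `PROP_gimple_lvec` and `PROP_gimple_opt_math`, and the bit values are invented for illustration. It only demonstrates that each predicate stays true while no function is current or the relevant pass has not yet run, and becomes false afterwards.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for GCC's property bits; the real definitions
// live in tree-pass.h and are not reproduced here.
constexpr std::uint32_t PROP_LVEC_DONE = 1u << 0;      // models PROP_gimple_lvec
constexpr std::uint32_t PROP_OPT_MATH_DONE = 1u << 1;  // models PROP_gimple_opt_math

// Hypothetical model of the per-function state consulted by the predicates.
struct function_state
{
  std::uint32_t curr_properties = 0;
};

// True when there is no current function, or when vector lowering has not
// run yet for it.
inline bool
optimize_vectors_before_lowering (const function_state *fn)
{
  return !fn || (fn->curr_properties & PROP_LVEC_DONE) == 0;
}

// True when there is no current function, or when the math canonicalization
// point has not been passed yet.
inline bool
canonicalize_math (const function_state *fn)
{
  return !fn || (fn->curr_properties & PROP_OPT_MATH_DONE) == 0;
}
```

With this shape, a fold that runs late (as in the forwprop4 case described in the message) sees the lowering bit set and no longer takes the "before lowering" path.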


Re: [PATCH] docs: fix Var documentation for .opt files

2023-01-04 Thread Gerald Pfeifer
On Wed, 28 Dec 2022, Martin Liška wrote:
> The Var documentation was somehow wrongly split into 2 pieces.
> 
>   PR middle-end/107966

And on top of that, those two bits you are merging were not
sorted alphabetically - which your patch also addresses. :-)

> gcc/ChangeLog:
> 
>   * doc/options.texi: Fix Var documentation in internal manual.

Looks good to me; thank you!

Gerald


Re: [PATCH] rs6000: Don't use optimize_function_for_speed_p too early [PR108184]

2023-01-04 Thread Segher Boessenkool
On Wed, Jan 04, 2023 at 05:20:14PM +0800, Kewen.Lin wrote:
> As Honza pointed out in [1], the current uses of function
> optimize_function_for_speed_p in rs6000_option_override_internal
> are too early, since the query results from the functions
> optimize_function_for_{speed,size}_p could be changed later due
> to profile feedback, function attribute handling, etc.
> 
> This patch is to move optimize_function_for_speed_p to all the
> use places of the corresponding flags, which follows the existing
> practices.  Maybe we can cache it somewhere at an appropriate
> timing, but that's another thing.

> @@ -25604,7 +25602,9 @@ rs6000_call_aix (rtx value, rtx func_desc, rtx tlsarg, rtx cookie)
> 
> /* Can we optimize saving the TOC in the prologue or
>do we need to do it at every call?  */
> -   if (TARGET_SAVE_TOC_INDIRECT && !cfun->calls_alloca)
> +   if (TARGET_SAVE_TOC_INDIRECT
> +   && !cfun->calls_alloca
> +   && optimize_function_for_speed_p (cfun))
>   cfun->machine->save_toc_in_prologue = true;

Is this correct?  If so, it really needs a separate testcase.

The rest looks good.  Thanks!


Segher


Re: [PATCH] vrp: Handle pointers in maybe_set_nonzero_bits [PR108253]

2023-01-04 Thread Aldy Hernandez via Gcc-patches
OK.

On Wed, Jan 4, 2023, 10:13 Jakub Jelinek  wrote:

> Hi!
>
> maybe_set_nonzero_bits calls set_nonzero_bits which asserts that
> var doesn't have pointer type.  While we could punt for those
> cases, I think we can handle at least some easy cases.
> Earlier in maybe_set_nonzero_bits we've checked this is on
> (var & cst) == 0
> edge and the other edge is __builtin_unreachable, so if cst
> is say 3 as in the testcase, we want to turn it into 4 byte alignment
> of the pointer.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
>
> 2023-01-04  Jakub Jelinek  
>
> PR tree-optimization/108253
> * tree-vrp.cc (maybe_set_nonzero_bits): Handle var with pointer
> types.
>
> * g++.dg/opt/pr108253.C: New test.
>
> --- gcc/tree-vrp.cc.jj  2023-01-02 09:32:53.634833769 +0100
> +++ gcc/tree-vrp.cc 2023-01-03 15:57:51.613239761 +0100
> @@ -789,8 +789,22 @@ maybe_set_nonzero_bits (edge e, tree var
> return;
>  }
>cst = gimple_assign_rhs2 (stmt);
> -  set_nonzero_bits (var, wi::bit_and_not (get_nonzero_bits (var),
> - wi::to_wide (cst)));
> +  if (POINTER_TYPE_P (TREE_TYPE (var)))
> +{
> +  struct ptr_info_def *pi = SSA_NAME_PTR_INFO (var);
> +  if (pi && pi->misalign)
> +   return;
> +  wide_int w = wi::bit_not (wi::to_wide (cst));
> +  unsigned int bits = wi::ctz (w);
> +  if (bits == 0 || bits >= HOST_BITS_PER_INT)
> +   return;
> +  unsigned int align = 1U << bits;
> +  if (pi == NULL || pi->align < align)
> +   set_ptr_info_alignment (get_ptr_info (var), align, 0);
> +}
> +  else
> +set_nonzero_bits (var, wi::bit_and_not (get_nonzero_bits (var),
> +   wi::to_wide (cst)));
>  }
>
>  /* Searches the case label vector VEC for the index *IDX of the CASE_LABEL
> --- gcc/testsuite/g++.dg/opt/pr108253.C.jj  2023-01-03 16:02:16.366438488 +0100
> +++ gcc/testsuite/g++.dg/opt/pr108253.C 2023-01-03 16:02:33.549191780 +0100
> @@ -0,0 +1,48 @@
> +// PR tree-optimization/108253
> +// { dg-do compile { target c++11 } }
> +// { dg-options "-O2" }
> +
> +struct S
> +{
> +  int *s;
> +  S () : s (new int) {}
> +  S (const S &r) noexcept : s (r.s) { __atomic_fetch_add (r.s, 1, 4); }
> +};
> +struct T
> +{
> +  explicit T (const S &x) : t (x) {}
> +  const S t;
> +};
> +struct U
> +{
> +  operator int () const { new T (u); return 0; }
> +  S u;
> +};
> +bool foo (int matcher);
> +unsigned long bar (unsigned long pos, unsigned long end_pos);
> +struct V
> +{
> +  alignas (4) char v[4];
> +};
> +struct W
> +{
> +  void baz ()
> +  {
> +if (!w) __builtin_abort ();
> +if (reinterpret_cast <__UINTPTR_TYPE__> (w->v) % 4 != 0) __builtin_abort ();
> +__builtin_unreachable ();
> +  }
> +  [[gnu::noinline]] void qux (unsigned long) { if (!w) bar (0, x); }
> +  V *w = nullptr;
> +  unsigned x = 0;
> +};
> +
> +void
> +test ()
> +{
> +  W w;
> +  U t;
> +  if (!foo (t))
> +w.baz ();
> +  w.qux (0);
> +}
>
> Jakub
>
>
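The arithmetic in the quoted hunk (invert the mask, count trailing zeros, turn that into a power-of-two alignment) can be checked on plain integers. This is a sketch under stated assumptions: `alignment_from_mask` and `count_trailing_zeros` are hypothetical helper names, a 64-bit integer stands in for GCC's `wide_int`, and the all-ones mask is handled explicitly because the portable loop below requires a nonzero input.

```cpp
#include <cassert>
#include <climits>
#include <cstdint>

// Count trailing zero bits of a nonzero value (portable loop, standing in
// for wi::ctz in the quoted patch).
inline unsigned
count_trailing_zeros (std::uint64_t w)
{
  assert (w != 0);
  unsigned n = 0;
  while ((w & 1) == 0)
    {
      w >>= 1;
      ++n;
    }
  return n;
}

// Given that (ptr & cst) == 0 is known to hold, return the implied
// alignment in bytes, or 0 when nothing useful can be concluded.
inline unsigned
alignment_from_mask (std::uint64_t cst)
{
  std::uint64_t w = ~cst;
  if (w == 0)
    return 0;  // every bit is masked; no alignment is derived in this sketch
  unsigned bits = count_trailing_zeros (w);
  // bits == 0 means bit 0 is not covered by the mask; a huge count would
  // overflow the unsigned shift below.
  if (bits == 0 || bits >= sizeof (unsigned) * CHAR_BIT)
    return 0;
  return 1u << bits;
}
```

For the testcase's mask of 3, the inverted value has two trailing zeros, which gives the 4-byte alignment mentioned in the message. A mask such as 2 says nothing about bit 0, so no alignment is derived.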


[committed] libstdc++: Fix std::array::data() to be a constant expression [PR108258]

2023-01-04 Thread Jonathan Wakely via Gcc-patches
Tested x86_64-linux. Pushed to trunk.

-- >8 --

When I refactored the __array_traits helper I broke this.

libstdc++-v3/ChangeLog:

PR libstdc++/108258
* include/std/array (__array_traits::operator T*()): Add
constexpr.
* testsuite/23_containers/array/element_access/constexpr_c++17.cc: Check
std::array::data().
---
 libstdc++-v3/include/std/array|  2 +-
 .../array/element_access/constexpr_c++17.cc   | 19 ---
 2 files changed, 17 insertions(+), 4 deletions(-)

diff --git a/libstdc++-v3/include/std/array b/libstdc++-v3/include/std/array
index e26390e6f80..c50a201b032 100644
--- a/libstdc++-v3/include/std/array
+++ b/libstdc++-v3/include/std/array
@@ -69,7 +69,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
// Conversion to a pointer produces a null pointer.
__attribute__((__always_inline__))
-   operator _Tp*() const noexcept { return nullptr; }
+   constexpr operator _Tp*() const noexcept { return nullptr; }
  };
 
  using _Is_swappable = true_type;
diff --git a/libstdc++-v3/testsuite/23_containers/array/element_access/constexpr_c++17.cc b/libstdc++-v3/testsuite/23_containers/array/element_access/constexpr_c++17.cc
index b6878fd0c59..b92aa5c04e2 100644
--- a/libstdc++-v3/testsuite/23_containers/array/element_access/constexpr_c++17.cc
+++ b/libstdc++-v3/testsuite/23_containers/array/element_access/constexpr_c++17.cc
@@ -34,21 +34,34 @@ constexpr std::size_t test01()
   auto v2  = a.at(2);
   auto v3  = a.front();
   auto v4  = a.back();
-  return v1 + v2 + v3 + v4;
+  auto v5 = *a.data();
+  return v1 + v2 + v3 + v4 + v5;
 }
 
 static_assert( test01() == (55 + 66 + 0 + 2) );
 
 constexpr std::size_t test02()
 {
-  // array
+  // const array
   typedef std::array array_type;
   const array_type a = { { 0, 55, 66, 99, 4115, 2 } };
   auto v1  = a[1];
   auto v2  = a.at(2);
   auto v3  = a.front();
   auto v4  = a.back();
-  return v1 + v2 + v3 + v4;
+  auto v5 = *a.data();
+  return v1 + v2 + v3 + v4 + v5;
 }
 
 static_assert( test02() == (55 + 66 + 0 + 2) );
+
+constexpr bool test_zero()
+{
+  // zero-sized array (PR libstdc++/108258)
+  std::array a{};
+  auto v4 = a.data();
+  // The standard says this is unspecified, it's null for our implementation:
+  return a.data() == nullptr;
+}
+
+static_assert( test_zero() );
-- 
2.39.0



Re: [PATCH] Modula-2, testsuite: No 96 bit floating type on Darwin.

2023-01-04 Thread Gaius Mulley via Gcc-patches
Iain Sandoe  writes:

> Tested on x86_64  and aarch64 Darwin,
> OK for master?
> thanks
> Iain
>
> --- 8< ---
>
> The realbitscast.mod is currently failing on x86_64 and aarch64
> Darwin since they do not have a 96b floating type.  Disable the
> type for all Darwin arches.
>
> gcc/testsuite/ChangeLog:
>
>   * gm2/iso/pass/realbitscast.mod: Disable REAL96 on Darwin.
> ---
>  gcc/testsuite/gm2/iso/pass/realbitscast.mod | 5 ++---
>  1 file changed, 2 insertions(+), 3 deletions(-)
>
> diff --git a/gcc/testsuite/gm2/iso/pass/realbitscast.mod b/gcc/testsuite/gm2/iso/pass/realbitscast.mod
> index 4da5cee..c6b70eb032c 100644
> --- a/gcc/testsuite/gm2/iso/pass/realbitscast.mod
> +++ b/gcc/testsuite/gm2/iso/pass/realbitscast.mod
> @@ -28,11 +28,10 @@ FROM SYSTEM IMPORT CAST, WORD ;
>  #elif defined(__ppc__)
>  #   undef HAVE_REAL96
>  #elif defined(__ia64)
> -#   undef HAVE_REAL69
> -#elif defined(__APPLE__) && defined(__i386__)
>  #   undef HAVE_REAL96
>  #elif defined(__APPLE__)
> -#   define HAVE_REAL96
> +(* No 96 bit floating type on Apple platforms *)
> +#   undef HAVE_REAL96
>  #endif

Hi Iain,

sure yes LGTM,

regards,
Gaius


Re: [PATCH] modula-2, doc: Build dvi, ps and pdf doc in the gcc/doc directory.

2023-01-04 Thread Gaius Mulley via Gcc-patches
Iain Sandoe  writes:

> Tested on darwin21 with "make m2.pdf" and "make m2.dvi".
> OK for trunk?
> thanks.
> Iain
>
> --- 8< ---
>
> This also uses the configured $(TEXI2DVI) and $(TEXI2PDF) to deal with those
> targets (since we cannot assume to know what the user might have installed).
>
> gcc/m2/ChangeLog:
>
>   * Make-lang.in (dvi, ps, pdf): Build in the gcc/doc directory, also
>   use the configured tools for texi -> dvi and texi -> pdf.
> ---
>  gcc/m2/Make-lang.in | 14 +-
>  1 file changed, 9 insertions(+), 5 deletions(-)
>
> diff --git a/gcc/m2/Make-lang.in b/gcc/m2/Make-lang.in
> index a134d3aca92..6c2b095bb7d 100644
> --- a/gcc/m2/Make-lang.in
> +++ b/gcc/m2/Make-lang.in
> @@ -140,18 +140,22 @@ $(DESTDIR)$(man1dir)/$(GM2_INSTALL_NAME)$(man1ext): doc/m2.1 installdirs
>   -$(INSTALL_DATA) $< $@
>   -chmod a-x $@
>  
> -m2.dvi: $(TEXISRC)
> +m2.dvi: doc/m2.dvi
> +
> +doc/m2.dvi: $(TEXISRC)
>   $(TEXI2DVI) -I $(objdir)/m2 -I $(srcdir)/doc/include $(srcdir)/doc/gm2.texi -o $@
>  
> -m2.ps: m2.dvi
> +doc/m2.ps: doc/m2.dvi
>   dvips -o $@ $<
>  
> -m2.pdf: m2.ps
> - gs -q -dBATCH -dNOPAUSE -sDEVICE=pdfwrite -sOutputFile=$@ $<
> +m2.pdf: doc/m2.pdf
> +
> +doc/m2.pdf: $(TEXISRC)
> + $(TEXI2PDF) -I $(objdir)/m2 -I $(srcdir)/doc/include $(srcdir)/doc/gm2.texi -o $@
>  
>  .INTERMEDIATE: m2.pod
>  
> -m2.pod: doc/gm2.texi $(TEXISRC)
> +m2.pod: $(TEXISRC)
>   -$(TEXI2POD) -I $(objdir)/m2 -D m2 < $< > $@
>  
>  doc/m2.info: $(TEXISRC)

thanks LGTM -

regards,
Gaius


Re: [PATCH] modula-2: Fix registration of modules via constructors [PR108183].

2023-01-04 Thread Gaius Mulley via Gcc-patches
Iain Sandoe  writes:

>  When I first made this patch I had a question as to what should be
>  done for registration CTORs generated by the compiler for .mod files.
>  I've now answered that question (the code that makes the GCC decl
>  has also been updated in a separately posted patch).
>  
> tested on x86_64-linux-gnu, x86_64, aarch64-darwin21,
> OK for master?
> Thanks,
> Iain
>  
>  --- 8< ---
>
> This reworks the mechanism used for module registration to use init-
> time constructors.  The order of registration is not important, the
> actual initialization dependency tree will be computed early in the
> execution (all that matters is that we have registered before that).
>
> This fixes a potential issue in which the external name known to the
> m2 system is of the form _M2_XX_ctor() but the C++ code was
> producing a static variable instance with the same name.
>
> Signed-off-by: Iain Sandoe 
>
>   PR modula2/108183
>
> gcc/m2/ChangeLog:
>
>   * gm2-libs-ch/UnixArgs.cc (_M2_UnixArgs_ctor): Rework to use
>   an extern "C" function with 'constructor' attribute.
>   * gm2-libs-ch/dtoa.cc (_M2_dtoa_ctor): Likewise.
>   * gm2-libs-ch/ldtoa.cc (_M2_ldtoa_ctor): Likewise.
>
> libgm2/ChangeLog:
>
>   * libm2cor/KeyBoardLEDs.cc (_M2_KeyBoardLEDs_ctor): Rework to use
>   an extern "C" function with 'constructor' attribute.
>   * libm2iso/ErrnoCategory.cc (_M2_ErrnoCategory_ctor): Likewise.
>   * libm2iso/RTco.cc (_M2_RTco_ctor): Likewise.
>   * libm2pim/Selective.cc (_M2_Selective_ctor): Likewise.
>   * libm2pim/SysExceptions.cc (_M2_SysExceptions_ctor): Likewise.
>   * libm2pim/UnixArgs.cc (_M2_UnixArgs_ctor): Likewise.
>   * libm2pim/cgetopt.cc (_M2_cgetopt_ctor): Likewise.
>   * libm2pim/dtoa.cc (_M2_dtoa_ctor): Likewise.
>   * libm2pim/errno.cc (_M2_errno_ctor): Likewise.
>   * libm2pim/ldtoa.cc (_M2_ldtoa_ctor): Likewise.
>   * libm2pim/sckt.cc (_M2_sckt_ctor): Likewise.
>   * libm2pim/termios.cc (_M2_termios_ctor): Likewise.
>   * libm2pim/wrapc.c: Add a new line to the file end.

all LGTM  (together with the other GCC decl patch),

thanks,
Gaius
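The registration scheme described in the quoted message (an `extern "C"` function carrying the `constructor` attribute, instead of a C++ static object) can be sketched as follows. This is an illustration under assumptions, not the libgm2 code: `registered_modules` and `example_module_ctor` are hypothetical names, and the `constructor` attribute is a GCC/Clang extension rather than standard C++.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical registry.  A function-local static avoids depending on the
// initialization order of namespace-scope objects.
inline std::vector<std::string> &
registered_modules ()
{
  static std::vector<std::string> modules;
  return modules;
}

// GCC/Clang extension: this function runs before main().  extern "C" gives
// it an unmangled symbol name, so the symbol is a function with that name
// rather than a static object that happens to share it.
extern "C" __attribute__ ((constructor)) void
example_module_ctor (void)
{
  registered_modules ().push_back ("Example");
}
```

As the message notes, the order in which such constructors run does not matter for this scheme; all that is required is that every module has registered before the dependency tree is computed.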


Re: [PATCH] modula-2: Module registration constructors need to be visible [PR108259].

2023-01-04 Thread Gaius Mulley via Gcc-patches
Iain Sandoe  writes:

> Tested on x86_64-linux-gnu, x86_64,aarch64-darwin21.
> There remain issues with shared libraries, but the link fails are fixed
> by this.
>
> OK for master?
> Thanks
> Iain
>
> --- 8< ---
>
> In the current design the main executable links explicitly to the module
> registration constructors that it uses.  This means that they must be
> visible in shared libraries.
>
>   PR modula2/108259
>
> gcc/m2/ChangeLog:
>
>   * gm2-gcc/m2decl.cc (m2decl_DeclareModuleCtor): Make module
>   registration constructors visible.
> ---
>  gcc/m2/gm2-gcc/m2decl.cc | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/gcc/m2/gm2-gcc/m2decl.cc b/gcc/m2/gm2-gcc/m2decl.cc
> index 62bfefd2530..d849f8aefc4 100644
> --- a/gcc/m2/gm2-gcc/m2decl.cc
> +++ b/gcc/m2/gm2-gcc/m2decl.cc
> @@ -276,7 +276,7 @@ m2decl_DeclareModuleCtor (tree decl)
>/* Declare module_ctor ().  */
>TREE_PUBLIC (decl) = 1;
>DECL_ARTIFICIAL (decl) = 1;
> -  DECL_VISIBILITY (decl) = VISIBILITY_HIDDEN;
> +  DECL_VISIBILITY (decl) = VISIBILITY_DEFAULT;
>DECL_VISIBILITY_SPECIFIED (decl) = 1;
>DECL_STATIC_CONSTRUCTOR (decl) = 1;
>return decl;

LGTM thanks,

Gaius


Re: [PATCH] modula-2, driver: Implement handling for -static-libgm2.

2023-01-04 Thread Gaius Mulley via Gcc-patches
Iain Sandoe  writes:

> tested on x86_64-linux-gnu, x86_64,aarch64-darwin21,
> OK for trunk?
> thanks,
> Iain
>
> --- 8< ---
>
> This was unimplemented so far.
>
> gcc/ChangeLog:
>
>   * common.opt: Add -static-libgm2.
>   * config/darwin.h (LINK_SPEC): Handle static-libgm2.
>
> gcc/m2/ChangeLog:
>
>   * gm2spec.cc (lang_specific_driver): Handle static-libgm2.
> ---
>  gcc/common.opt  |  4 
>  gcc/config/darwin.h |  7 ++-
>  gcc/m2/gm2spec.cc   | 24 +++-
>  3 files changed, 33 insertions(+), 2 deletions(-)
>

yes LGTM - it was unimplemented - thanks!

regards,
Gaius


Re: [PATCH] rs6000: Don't use optimize_function_for_speed_p too early [PR108184]

2023-01-04 Thread Kewen.Lin via Gcc-patches
Hi Segher,

Thanks for the comments.

on 2023/1/4 18:46, Segher Boessenkool wrote:
> On Wed, Jan 04, 2023 at 05:20:14PM +0800, Kewen.Lin wrote:
>> As Honza pointed out in [1], the current uses of function
>> optimize_function_for_speed_p in rs6000_option_override_internal
>> are too early, since the query results from the functions
>> optimize_function_for_{speed,size}_p could be changed later due
>> to profile feedback, function attribute handling, etc.
>>
>> This patch is to move optimize_function_for_speed_p to all the
>> use places of the corresponding flags, which follows the existing
>> practices.  Maybe we can cache it somewhere at an appropriate
>> timing, but that's another thing.
> 
>> @@ -25604,7 +25602,9 @@ rs6000_call_aix (rtx value, rtx func_desc, rtx tlsarg, rtx cookie)
>>
>>/* Can we optimize saving the TOC in the prologue or
>>   do we need to do it at every call?  */
>> -  if (TARGET_SAVE_TOC_INDIRECT && !cfun->calls_alloca)
>> +  if (TARGET_SAVE_TOC_INDIRECT
>> +  && !cfun->calls_alloca
>> +  && optimize_function_for_speed_p (cfun))
>>  cfun->machine->save_toc_in_prologue = true;
> 
> Is this correct?  If so, it really needs a separate testcase.
> 

Yes, it just moves the condition from:

--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -3978,8 +3978,7 @@ rs6000_option_override_internal (bool global_init_p)
   /* If we can shrink-wrap the TOC register save separately, then use
  -msave-toc-indirect unless explicitly disabled.  */
   if ((rs6000_isa_flags_explicit & OPTION_MASK_SAVE_TOC_INDIRECT) == 0
-  && flag_shrink_wrap_separate
-  && optimize_function_for_speed_p (cfun))
+  && flag_shrink_wrap_separate)
 rs6000_isa_flags |= OPTION_MASK_SAVE_TOC_INDIRECT;

here.

I tried to find a test case before, but could not find one that is not fragile.
I also thought the associated test case already demonstrates why the use of
optimize_function_for_{speed,size}_p is too early in function
rs6000_option_override_internal, so I gave up then.  Are you worried that the
change could be reverted unexpectedly in the future with no test case to catch it?


BR,
Kewen
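The point under discussion, moving the speed query from option processing to the use site, can be modeled in a few lines. This is a sketch with hypothetical names (`fn_state`, `target_flags`, `override_options`, `save_toc_in_prologue`); it does not reproduce the rs6000 code and only shows why a late query sees changes that an early, cached answer would miss.

```cpp
#include <cassert>

// Hypothetical per-function state; the speed/size answer may change after
// option processing (for example because of profile feedback).
struct fn_state
{
  bool optimize_for_speed = true;
  bool calls_alloca = false;
};

// Hypothetical target flag set computed once during option processing.
struct target_flags
{
  bool save_toc_indirect = false;
};

// Option processing: decide only what is knowable this early.
inline target_flags
override_options (bool shrink_wrap_separate)
{
  target_flags flags;
  flags.save_toc_indirect = shrink_wrap_separate;
  return flags;
}

// Use site: ask about speed versus size for the function being compiled now.
inline bool
save_toc_in_prologue (const target_flags &flags, const fn_state &fn)
{
  return flags.save_toc_indirect && !fn.calls_alloca && fn.optimize_for_speed;
}
```

Two functions compiled with the same flags can then get different answers, which is the behavior the moved condition is meant to allow.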


[PATCH V4] Use reg mode to move sub blocks for parameters and returns

2023-01-04 Thread Jiufu Guo via Gcc-patches
Hi,

When assigning a parameter to a variable, or assigning a variable to
a return value of struct type, a "block move" may be used to expand
the assignment if the parameter/return value is passed through registers
and has BLK mode.
In this case, it is better to move the blocks in the natural mode of
the registers.
This creates more opportunities for other optimization passes (cse,
dse, xprop).

For example (similar to the code in PR65421):

typedef struct SA {double a[3];} A;
A ret_arg_pt (A *a) {return *a;} // on ppc64le, expect only 3 lfd(s)
A ret_arg (A a) {return a;} // just empty fun body
void st_arg (A a, A *p) {*p = a;} //only 3 stfd(s)

This patch checks the "from" and "to" of an assignment in
"expand_assignment"; if it involves a parameter or return value that may
be passed via registers, the natural mode of the register is used to move
sub-blocks for the assignment.

This patch may still be useful even if we change the behavior of
parameter setup or adopt SRA-like code in the expander.

Compared with the previous version:
https://gcc.gnu.org/pipermail/gcc-patches/2022-December/608081.html
this patch updates the code slightly and merges/adds test cases.
I also checked the cases with large structs or non-homogeneous structs
to confirm that it does not degrade the code.
Bootstrap and regtest pass on ppc64{,le} and x86_64.
Is this ok for trunk?

BR,
Jeff (Jiufu)
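The idea of moving a block in register-sized pieces, with a fallback when the size is not a multiple of the piece size, can be sketched at the source level. This is only an analogy to the RTL expansion described above: `move_in_sub_blocks` is a hypothetical helper operating on bytes with `memcpy`, not the `move_sub_blocks` function from the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Copy SIZE bytes from FROM to TO.  When SIZE is a multiple of SUB_SIZE,
// copy one register-sized piece at a time and return the number of pieces;
// otherwise fall back to a single block copy and return 0.
inline std::size_t
move_in_sub_blocks (void *to, const void *from, std::size_t size,
                    std::size_t sub_size)
{
  assert (sub_size != 0);
  if (size % sub_size != 0)
    {
      std::memcpy (to, from, size);
      return 0;
    }
  auto *dst = static_cast<unsigned char *> (to);
  auto *src = static_cast<const unsigned char *> (from);
  std::size_t pieces = size / sub_size;
  for (std::size_t i = 0; i < pieces; i++)
    std::memcpy (dst + i * sub_size, src + i * sub_size, sub_size);
  return pieces;
}

// Same layout as the example struct in the message.
struct A { double a[3]; };
```

For `A`, copying in `double`-sized pieces takes three moves, which corresponds to the three `lfd`/`stfd` instructions the message expects on ppc64le.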

PR target/65421

gcc/ChangeLog:

* cfgexpand.cc (expand_used_vars): Update to mark DECL_USEDBY_RETURN_P
for returns.
* expr.cc (move_sub_blocks): New function.
(expand_assignment): Update to call move_sub_blocks for returns or
parameters.
* function.cc (assign_parm_setup_block): Update to mark
DECL_REGS_TO_STACK_P for parameter.
* tree-core.h (struct tree_decl_common): Add comment.
* tree.h (DECL_USEDBY_RETURN_P): New define.
(DECL_REGS_TO_STACK_P): New define.

gcc/testsuite/ChangeLog:

* gcc.target/powerpc/pr65421-1.c: New test.
* gcc.target/powerpc/pr65421.c: New test.

---
 gcc/cfgexpand.cc | 14 
 gcc/expr.cc  | 77 
 gcc/function.cc  |  3 +
 gcc/tree-core.h  |  4 +-
 gcc/tree.h   |  9 +++
 gcc/testsuite/gcc.target/powerpc/pr65421-1.c |  6 ++
 gcc/testsuite/gcc.target/powerpc/pr65421.c   | 33 +
 7 files changed, 145 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pr65421-1.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pr65421.c

diff --git a/gcc/cfgexpand.cc b/gcc/cfgexpand.cc
index dd29c03..09b8ec64cea 100644
--- a/gcc/cfgexpand.cc
+++ b/gcc/cfgexpand.cc
@@ -2158,6 +2158,20 @@ expand_used_vars (bitmap forced_stack_vars)
 frame_phase = off ? align - off : 0;
   }
 
+  /* Collect VARs on returns.  */
+  if (DECL_RESULT (current_function_decl))
+{
+  edge_iterator ei;
+  edge e;
+  FOR_EACH_EDGE (e, ei, EXIT_BLOCK_PTR_FOR_FN (cfun)->preds)
+   if (greturn *ret = safe_dyn_cast <greturn *> (last_stmt (e->src)))
+ {
+   tree val = gimple_return_retval (ret);
+   if (val && VAR_P (val))
+ DECL_USEDBY_RETURN_P (val) = 1;
+ }
+}
+
   /* Set TREE_USED on all variables in the local_decls.  */
   FOR_EACH_LOCAL_DECL (cfun, i, var)
 TREE_USED (var) = 1;
diff --git a/gcc/expr.cc b/gcc/expr.cc
index d9407432ea5..afcec6f3c10 100644
--- a/gcc/expr.cc
+++ b/gcc/expr.cc
@@ -5559,6 +5559,51 @@ mem_ref_refers_to_non_mem_p (tree ref)
   return non_mem_decl_p (base);
 }
 
+/* Sub routine of expand_assignment, invoked when assigning from a
+   parameter or assigning to a return val on struct type which may
+   be passed through registers.  The mode of register is used to
+   move the content for the assignment.
+
+   This routine generates code for expression FROM which is BLKmode,
+   and moves the generated content to TO_RTX by sub-blocks in SUB_MODE.  */
+
+static void
+move_sub_blocks (rtx to_rtx, tree from, machine_mode sub_mode, bool nontemporal)
+{
+  gcc_assert (MEM_P (to_rtx));
+
+  HOST_WIDE_INT size = MEM_SIZE (to_rtx).to_constant ();
+  HOST_WIDE_INT sub_size = GET_MODE_SIZE (sub_mode).to_constant ();
+  HOST_WIDE_INT len = size / sub_size;
+
+  /* It is not profitable to move through sub-modes if the size is not
+ a multiple of the register mode size.  */
+  if ((size % sub_size) != 0)
+{
+  push_temp_slots ();
+  rtx result = store_expr (from, to_rtx, 0, nontemporal, false);
+  preserve_temp_slots (result);
+  pop_temp_slots ();
+  return;
+}
+
+  push_temp_slots ();
+
+  rtx from_rtx = expand_expr (from, NULL_RTX, GET_MODE (to_rtx), EXPAND_NORMAL);
+  for (int i = 0; i < len; i++)
+{
+  rtx temp = gen_reg_rtx (sub_mode);
+  rtx src = adjust_address (from_rtx, sub_mode, sub_size * i);
+  rtx dest = adjust_address (to_rtx, 

[PATCH] RISC-V: Refine Phase 3 of VSETVL PASS

2023-01-04 Thread juzhe . zhong
From: Ju-Zhe Zhong 

gcc/ChangeLog:

* config/riscv/riscv-vsetvl.cc (can_backward_propagate_p): Fix for null 
iter_bb.
(vector_insn_info::set_demand_info): New function.
(pass_vsetvl::emit_local_forward_vsetvls): Adjust for refinement of 
Phase 3.
(pass_vsetvl::merge_successors): Ditto.
(pass_vsetvl::compute_global_backward_infos): Ditto.
(pass_vsetvl::backward_demand_fusion): Ditto.
(pass_vsetvl::forward_demand_fusion): Ditto.
(pass_vsetvl::demand_fusion): New function.
(pass_vsetvl::lazy_vsetvl): Adjust for refinement of phase 3.
* config/riscv/riscv-vsetvl.h: New function declaration.

---
 gcc/config/riscv/riscv-vsetvl.cc | 138 ---
 gcc/config/riscv/riscv-vsetvl.h  |   1 +
 2 files changed, 128 insertions(+), 11 deletions(-)

diff --git a/gcc/config/riscv/riscv-vsetvl.cc b/gcc/config/riscv/riscv-vsetvl.cc
index 52f0195980a..d42cfa91d63 100644
--- a/gcc/config/riscv/riscv-vsetvl.cc
+++ b/gcc/config/riscv/riscv-vsetvl.cc
@@ -43,7 +43,7 @@ along with GCC; see the file COPYING3.  If not see
 -  Phase 2 - Emit vsetvl instructions within each basic block according to
demand, compute and save ANTLOC && AVLOC of each block.
 
--  Phase 3 - Backward demanded info propagation and fusion across blocks.
+-  Phase 3 - Backward && forward demanded info propagation and fusion across blocks.
 
 -  Phase 4 - Lazy code motion including: compute local properties,
pre_edge_lcm and vsetvl insertion && delete edges for LCM results.
@@ -434,8 +434,12 @@ can_backward_propagate_p (const function_info *ssa, const basic_block cfg_bb,
set_info *ultimate_def = look_through_degenerate_phi (set);
const basic_block ultimate_bb = ultimate_def->bb ()->cfg_bb ();
FOR_BB_BETWEEN (iter_bb, ultimate_bb, def->bb ()->cfg_bb (), next_bb)
- if (iter_bb->index == cfg_bb->index)
-   return true;
+ {
+   if (!iter_bb)
+ break;
+   if (iter_bb->index == cfg_bb->index)
+ return true;
+ }
 
return false;
   };
@@ -1172,6 +1176,19 @@ vector_insn_info::parse_insn (insn_info *insn)
 m_demands[DEMAND_MASK_POLICY] = true;
 }
 
+void
+vector_insn_info::set_demand_info (const vector_insn_info &other)
+{
+  set_sew (other.get_sew ());
+  set_vlmul (other.get_vlmul ());
+  set_ratio (other.get_ratio ());
+  set_ta (other.get_ta ());
+  set_ma (other.get_ma ());
+  set_avl_info (other.get_avl_info ());
+  for (size_t i = 0; i < NUM_DEMAND; i++)
+m_demands[i] = other.demand_p ((enum demand_type) i);
+}
+
 void
 vector_insn_info::demand_vl_vtype ()
 {
@@ -1691,8 +1708,10 @@ private:
   void emit_local_forward_vsetvls (const bb_info *);
 
   /* Phase 3.  */
-  void merge_successors (const basic_block, const basic_block);
-  void compute_global_backward_infos (void);
+  bool merge_successors (const basic_block, const basic_block);
+  bool backward_demand_fusion (void);
+  bool forward_demand_fusion (void);
+  void demand_fusion (void);
 
   /* Phase 4.  */
   void prune_expressions (void);
@@ -1866,7 +1885,7 @@ pass_vsetvl::emit_local_forward_vsetvls (const bb_info *bb)
 }
 
 /* Merge all successors of Father except child node.  */
-void
+bool
 pass_vsetvl::merge_successors (const basic_block father,
   const basic_block child)
 {
@@ -1877,7 +1896,8 @@ pass_vsetvl::merge_successors (const basic_block father,
  || father_info.local_dem.empty_p ());
   gcc_assert (father_info.reaching_out.dirty_p ()
  || father_info.reaching_out.empty_p ());
-
+  
+  bool changed_p = false;
   FOR_EACH_EDGE (e, ei, father->succs)
 {
   const basic_block succ = e->dest;
@@ -1907,12 +1927,15 @@ pass_vsetvl::merge_successors (const basic_block father,
 
   father_info.local_dem = new_info;
   father_info.reaching_out = new_info;
+  changed_p = true;
 }
+
+  return changed_p;
 }
 
 /* Compute global backward demanded info.  */
-void
-pass_vsetvl::compute_global_backward_infos (void)
+bool
+pass_vsetvl::backward_demand_fusion (void)
 {
   /* We compute global infos by backward propagation.
  We want to have better performance in these following cases:
@@ -1939,6 +1962,7 @@ pass_vsetvl::compute_global_backward_infos (void)
   We backward propagate the first VSETVL into e32,mf2 so that we
   could be able to eliminate the second VSETVL in LCM.  */
 
+  bool changed_p = false;
   for (const bb_info *bb : crtl->ssa->reverse_bbs ())
 {
   basic_block cfg_bb = bb->cfg_bb ();
@@ -1982,9 +2006,10 @@ pass_vsetvl::compute_global_backward_infos (void)
  block_info.reaching_out.set_dirty ();
  block_info.reaching_out.set_dirty_pat (new_pat);
  block_info.local_dem = block_info.reaching_out;
+ changed_p = true;
}
 
- merge_successors (e->src, cfg_bb);
+
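The diff above is cut off before the body of `demand_fusion`, so how it combines the two new `bool`-returning steps is not shown. One plausible reading, offered purely as an assumption, is a fixed-point loop that repeats the backward and forward steps until neither reports a change. The sketch below illustrates that pattern with a hypothetical `run_to_fixed_point` helper and an invented round limit; it is not taken from the patch.

```cpp
#include <cassert>
#include <functional>

// Run the backward and forward steps until neither reports a change.
// Returns the number of rounds executed; MAX_ROUNDS is a safety bound
// added for this sketch.
inline unsigned
run_to_fixed_point (const std::function<bool ()> &backward,
                    const std::function<bool ()> &forward,
                    unsigned max_rounds)
{
  unsigned rounds = 0;
  bool changed = true;
  while (changed && rounds < max_rounds)
    {
      // Evaluate both steps every round; `b || f` on the calls themselves
      // would skip the forward step whenever the backward step changed
      // something.
      bool b = backward ();
      bool f = forward ();
      changed = b || f;
      ++rounds;
    }
  return rounds;
}
```

Returning a "changed" flag from each step, as the patched signatures do, is what makes this kind of driver loop possible.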

[PATCH V2] extract DF/SF/SI/HI/QI subreg from parameter word on stack

2023-01-04 Thread Jiufu Guo via Gcc-patches
Hi,

This patch fixes an issue with parameter access when the
parameter has struct type, is passed through integer registers, and
a floating-point member is accessed, as in the code below:

typedef struct DF {double a[4]; long l; } DF;
double foo_df (DF arg){return arg.a[3];}

On ppc64le, with trunk gcc, "std 6,-24(1) ; lfd 1,-24(1)" is
generated, while the instruction "mtvsrd 1, 6" would be enough for
this case.

This patch updates the behavior when loading floating-point members of a
parameter: if the member is stored via an integer register,
it is first loaded in integer mode and then converted to floating-point
mode.

Compared with the previous patch:
https://gcc.gnu.org/pipermail/gcc-patches/2022-December/608872.html
The previous version supported conversion from DImode to DF/SF; this
version also supports conversion from DImode to SI/HI/QI modes.

I also tried to enhance CSE/DSE for this issue.  But because of their
limitations (e.g. CSE does not like new pseudos, DSE is not good
across blocks), some cases (such as the one in this patch) cannot be handled.

Bootstrap and regtest pass on ppc64{,le}.
Is this ok for trunk?  Thanks for comments!


BR,
Jeff (Jiufu)
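The two operations described above, reinterpreting a full word as a floating-point value and extracting a narrower integer field by shifting and truncating, can be illustrated on ordinary integers. This is a sketch with hypothetical helper names (`word_to_double`, `double_to_word`, `extract_u32`); it assumes a 64-bit `double`, numbers bytes from the low end of the word, and is not the RTL-level code from the patch.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Reinterpret the 64 bits of WORD as a double (bit copy, no value conversion).
inline double
word_to_double (std::uint64_t word)
{
  static_assert (sizeof (double) == sizeof (std::uint64_t),
                 "this sketch assumes a 64-bit double");
  double d;
  std::memcpy (&d, &word, sizeof d);
  return d;
}

// Inverse of word_to_double: the bit pattern of D as an integer word.
inline std::uint64_t
double_to_word (double d)
{
  std::uint64_t w;
  std::memcpy (&w, &d, sizeof w);
  return w;
}

// Extract a 32-bit field that starts BYTE_OFF bytes above the low end of
// WORD: shift right, then keep the low part.
inline std::uint32_t
extract_u32 (std::uint64_t word, unsigned byte_off)
{
  assert (byte_off == 0 || byte_off == 4);
  return static_cast<std::uint32_t> (word >> (byte_off * 8));
}
```

The bit copy plays the role of the register-to-register move mentioned in the message, avoiding a store followed by a load of the same bytes.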


PR target/108073

gcc/ChangeLog:

* expr.cc (extract_subreg_from_loading_word): New function.
(expand_expr_real_1): Call extract_subreg_from_loading_word.

gcc/testsuite/ChangeLog:

* g++.target/powerpc/pr102024.C: Updated.
* gcc.target/powerpc/pr108073.c: New test.

---
 gcc/expr.cc | 76 +
 gcc/testsuite/g++.target/powerpc/pr102024.C |  2 +-
 gcc/testsuite/gcc.target/powerpc/pr108073.c | 30 
 3 files changed, 107 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pr108073.c

diff --git a/gcc/expr.cc b/gcc/expr.cc
index d9407432ea5..6de4a985c8b 100644
--- a/gcc/expr.cc
+++ b/gcc/expr.cc
@@ -10631,6 +10631,69 @@ stmt_is_replaceable_p (gimple *stmt)
   return false;
 }
 
+/* Return the content of the memory slot SOURCE as MODE.
+   SOURCE is based on BASE. BASE is a memory block that is stored via words.
+
+   To get the content from SOURCE:
+   first load the word from the memory which covers the SOURCE slot;
+   next return the word's subreg which offsets to SOURCE slot;
+   then convert to MODE as necessary.  */
+
+static rtx
+extract_subreg_from_loading_word (machine_mode mode, rtx source, rtx base)
+{
+  rtx src_base = XEXP (source, 0);
+  poly_uint64 offset = MEM_OFFSET (source);
+
+  if (GET_CODE (src_base) == PLUS && CONSTANT_P (XEXP (src_base, 1)))
+{
+  offset += INTVAL (XEXP (src_base, 1));
+  src_base = XEXP (src_base, 0);
+}
+
+  if (!rtx_equal_p (XEXP (base, 0), src_base))
+return NULL_RTX;
+
+  /* Subreg(DI,n) -> DF/SF/SI/HI/QI */
+  poly_uint64 word_size = GET_MODE_SIZE (word_mode);
+  poly_uint64 mode_size = GET_MODE_SIZE (mode);
+  poly_uint64 byte_off;
+  unsigned int start;
+  machine_mode int_mode;
+  if (known_ge (word_size, mode_size) && multiple_p (word_size, mode_size)
+  && int_mode_for_mode (mode).exists (&int_mode)
+  && can_div_trunc_p (offset, word_size, &start, &byte_off)
+  && multiple_p (byte_off, mode_size))
+{
+  rtx word_mem = copy_rtx (source);
+  PUT_MODE (word_mem, word_mode);
+  word_mem = adjust_address (word_mem, word_mode, -byte_off);
+
+  rtx word_reg = gen_reg_rtx (word_mode);
+  emit_move_insn (word_reg, word_mem);
+
+  poly_uint64 low_off = subreg_lowpart_offset (int_mode, word_mode);
+  if (!known_eq (byte_off, low_off))
+   {
+ poly_uint64 shift_bytes = known_gt (byte_off, low_off)
+ ? byte_off - low_off
+ : low_off - byte_off;
+ word_reg = expand_shift (RSHIFT_EXPR, word_mode, word_reg,
+  shift_bytes * BITS_PER_UNIT, word_reg, 0);
+   }
+
+  rtx int_subreg = gen_lowpart (int_mode, word_reg);
+  if (mode == int_mode)
+   return int_subreg;
+
+  rtx int_mode_reg = gen_reg_rtx (int_mode);
+  emit_move_insn (int_mode_reg, int_subreg);
+  return gen_lowpart (mode, int_mode_reg);
+}
+
+  return NULL_RTX;
+}
+
 rtx
 expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
enum expand_modifier modifier, rtx *alt_rtl,
@@ -11812,6 +11875,19 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
&& modifier != EXPAND_WRITE)
  op0 = flip_storage_order (mode1, op0);
 
+   /* Accessing sub-field of struct parameter which passed via integer
+  registers.  */
+   if (mode == mode1 && TREE_CODE (tem) == PARM_DECL
+   && DECL_INCOMING_RTL (tem) && REG_P (DECL_INCOMING_RTL (tem))
+   && GET_MODE (DECL_INCOMING_RTL (tem)) == BLKmode && MEM_P (op0)
+   && MEM_OFFSET_KNOWN_P (op0))
+ {
+   rtx subreg
+ = extract_subreg_from_loading_word (mode, op0, DECL_RTL (tem));
+ 
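
The word-load-then-subreg scheme described in the comment on extract_subreg_from_loading_word can be illustrated outside the compiler. What follows is only a sketch in plain C++, not GCC code: it assumes a little-endian host and 8-byte words, and the name load_u32_via_word is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Illustrative only: read a 4-byte field at byte OFFSET from a block that is
// stored as 8-byte words, by loading the covering word and taking a piece of
// it.  Assumes a little-endian host, where the low part of a word sits at
// offset 0, so the field is moved to the low part with a right shift.
std::uint32_t load_u32_via_word (const unsigned char *base, std::size_t offset)
{
  const std::size_t word_size = sizeof (std::uint64_t);
  const std::size_t field_size = sizeof (std::uint32_t);

  std::size_t start = offset / word_size;     // index of the covering word
  std::size_t byte_off = offset % word_size;  // field offset inside that word

  // Mirrors the multiple_p (byte_off, mode_size) check in the patch: the
  // field must not straddle two words.
  assert (byte_off % field_size == 0);

  std::uint64_t word;
  std::memcpy (&word, base + start * word_size, word_size);

  word >>= byte_off * 8;                      // bring the field to the low part
  return static_cast<std::uint32_t> (word);   // keep only the low part
}
```

For a 16-byte buffer holding the byte values 0 through 15, a call with offset 12 loads the second word and shifts it right by 32 bits before truncating. A big-endian host would need the shift amount computed from the other end of the word, which is what the subreg_lowpart_offset comparison in the patch accounts for.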

[PATCH] RISC-V: Add testcases for IMM (0 ~ 31) AVL

2023-01-04 Thread juzhe . zhong
From: Ju-Zhe Zhong 

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/vsetvl/imm_bb_prop-1.c: New test.
* gcc.target/riscv/rvv/vsetvl/imm_bb_prop-10.c: New test.
* gcc.target/riscv/rvv/vsetvl/imm_bb_prop-11.c: New test.
* gcc.target/riscv/rvv/vsetvl/imm_bb_prop-12.c: New test.
* gcc.target/riscv/rvv/vsetvl/imm_bb_prop-13.c: New test.
* gcc.target/riscv/rvv/vsetvl/imm_bb_prop-2.c: New test.
* gcc.target/riscv/rvv/vsetvl/imm_bb_prop-3.c: New test.
* gcc.target/riscv/rvv/vsetvl/imm_bb_prop-4.c: New test.
* gcc.target/riscv/rvv/vsetvl/imm_bb_prop-5.c: New test.
* gcc.target/riscv/rvv/vsetvl/imm_bb_prop-6.c: New test.
* gcc.target/riscv/rvv/vsetvl/imm_bb_prop-7.c: New test.
* gcc.target/riscv/rvv/vsetvl/imm_bb_prop-8.c: New test.
* gcc.target/riscv/rvv/vsetvl/imm_bb_prop-9.c: New test.
* gcc.target/riscv/rvv/vsetvl/imm_conflict-1.c: New test.
* gcc.target/riscv/rvv/vsetvl/imm_conflict-2.c: New test.
* gcc.target/riscv/rvv/vsetvl/imm_conflict-3.c: New test.
* gcc.target/riscv/rvv/vsetvl/imm_conflict-4.c: New test.
* gcc.target/riscv/rvv/vsetvl/imm_conflict-5.c: New test.
* gcc.target/riscv/rvv/vsetvl/imm_loop_invariant-1.c: New test.
* gcc.target/riscv/rvv/vsetvl/imm_loop_invariant-10.c: New test.
* gcc.target/riscv/rvv/vsetvl/imm_loop_invariant-11.c: New test.
* gcc.target/riscv/rvv/vsetvl/imm_loop_invariant-12.c: New test.
* gcc.target/riscv/rvv/vsetvl/imm_loop_invariant-13.c: New test.
* gcc.target/riscv/rvv/vsetvl/imm_loop_invariant-14.c: New test.
* gcc.target/riscv/rvv/vsetvl/imm_loop_invariant-15.c: New test.
* gcc.target/riscv/rvv/vsetvl/imm_loop_invariant-16.c: New test.
* gcc.target/riscv/rvv/vsetvl/imm_loop_invariant-17.c: New test.
* gcc.target/riscv/rvv/vsetvl/imm_loop_invariant-2.c: New test.
* gcc.target/riscv/rvv/vsetvl/imm_loop_invariant-3.c: New test.
* gcc.target/riscv/rvv/vsetvl/imm_loop_invariant-4.c: New test.
* gcc.target/riscv/rvv/vsetvl/imm_loop_invariant-5.c: New test.
* gcc.target/riscv/rvv/vsetvl/imm_loop_invariant-6.c: New test.
* gcc.target/riscv/rvv/vsetvl/imm_loop_invariant-7.c: New test.
* gcc.target/riscv/rvv/vsetvl/imm_loop_invariant-8.c: New test.
* gcc.target/riscv/rvv/vsetvl/imm_loop_invariant-9.c: New test.
* gcc.target/riscv/rvv/vsetvl/imm_switch-1.c: New test.
* gcc.target/riscv/rvv/vsetvl/imm_switch-2.c: New test.
* gcc.target/riscv/rvv/vsetvl/imm_switch-3.c: New test.
* gcc.target/riscv/rvv/vsetvl/imm_switch-4.c: New test.
* gcc.target/riscv/rvv/vsetvl/imm_switch-5.c: New test.
* gcc.target/riscv/rvv/vsetvl/imm_switch-6.c: New test.
* gcc.target/riscv/rvv/vsetvl/imm_switch-7.c: New test.
* gcc.target/riscv/rvv/vsetvl/imm_switch-8.c: New test.
* gcc.target/riscv/rvv/vsetvl/imm_switch-9.c: New test.

---
 .../riscv/rvv/vsetvl/imm_bb_prop-1.c  |  32 +++
 .../riscv/rvv/vsetvl/imm_bb_prop-10.c |  42 
 .../riscv/rvv/vsetvl/imm_bb_prop-11.c |  42 
 .../riscv/rvv/vsetvl/imm_bb_prop-12.c |  31 +++
 .../riscv/rvv/vsetvl/imm_bb_prop-13.c |  29 +++
 .../riscv/rvv/vsetvl/imm_bb_prop-2.c  |  29 +++
 .../riscv/rvv/vsetvl/imm_bb_prop-3.c  |  22 ++
 .../riscv/rvv/vsetvl/imm_bb_prop-4.c  |  25 +++
 .../riscv/rvv/vsetvl/imm_bb_prop-5.c  |  33 +++
 .../riscv/rvv/vsetvl/imm_bb_prop-6.c  |  30 +++
 .../riscv/rvv/vsetvl/imm_bb_prop-7.c  |  31 +++
 .../riscv/rvv/vsetvl/imm_bb_prop-8.c  |  37 
 .../riscv/rvv/vsetvl/imm_bb_prop-9.c  |  37 
 .../riscv/rvv/vsetvl/imm_conflict-1.c |  22 ++
 .../riscv/rvv/vsetvl/imm_conflict-2.c |  22 ++
 .../riscv/rvv/vsetvl/imm_conflict-3.c |  26 +++
 .../riscv/rvv/vsetvl/imm_conflict-4.c |  38 
 .../riscv/rvv/vsetvl/imm_conflict-5.c |  45 
 .../riscv/rvv/vsetvl/imm_loop_invariant-1.c   | 195 ++
 .../riscv/rvv/vsetvl/imm_loop_invariant-10.c  |  41 
 .../riscv/rvv/vsetvl/imm_loop_invariant-11.c  |  41 
 .../riscv/rvv/vsetvl/imm_loop_invariant-12.c  |  28 +++
 .../riscv/rvv/vsetvl/imm_loop_invariant-13.c  |  30 +++
 .../riscv/rvv/vsetvl/imm_loop_invariant-14.c  |  31 +++
 .../riscv/rvv/vsetvl/imm_loop_invariant-15.c  |  32 +++
 .../riscv/rvv/vsetvl/imm_loop_invariant-16.c  |  29 +++
 .../riscv/rvv/vsetvl/imm_loop_invariant-17.c  |  23 +++
 .../riscv/rvv/vsetvl/imm_loop_invariant-2.c   | 168 +++
 .../riscv/rvv/vsetvl/imm_loop_invariant-3.c   | 141 +
 .../riscv/rvv/vsetvl/imm_loop_invariant-4.c   |  77 +++
 .../riscv/rvv/vsetvl/imm_loop_invariant-5.c   | 114 ++
 .../riscv/rvv/vsetvl/imm_loop_invariant-6.c   |  64 ++
 .../riscv/rvv/vsetvl/imm_loop_invariant-7.c   |  39 
 .../riscv/rvv/vsetvl/

Re: [PATCH] rs6000: Don't use optimize_function_for_speed_p too early [PR108184]

2023-01-04 Thread Segher Boessenkool
Hi!

On Wed, Jan 04, 2023 at 08:15:03PM +0800, Kewen.Lin wrote:
> on 2023/1/4 18:46, Segher Boessenkool wrote:
> >> @@ -25604,7 +25602,9 @@ rs6000_call_aix (rtx value, rtx func_desc, rtx tlsarg, rtx cookie)
> >>
> >>  /* Can we optimize saving the TOC in the prologue or
> >> do we need to do it at every call?  */
> >> -if (TARGET_SAVE_TOC_INDIRECT && !cfun->calls_alloca)
> >> +if (TARGET_SAVE_TOC_INDIRECT
> >> +&& !cfun->calls_alloca
> >> +&& optimize_function_for_speed_p (cfun))
> >>cfun->machine->save_toc_in_prologue = true;
> > 
> > Is this correct?  If so, it really needs a separate testcase.
> 
> Yes, it just moves the condition from:
> 
> --- a/gcc/config/rs6000/rs6000.cc
> +++ b/gcc/config/rs6000/rs6000.cc
> @@ -3978,8 +3978,7 @@ rs6000_option_override_internal (bool global_init_p)
>/* If we can shrink-wrap the TOC register save separately, then use
>   -msave-toc-indirect unless explicitly disabled.  */
>if ((rs6000_isa_flags_explicit & OPTION_MASK_SAVE_TOC_INDIRECT) == 0
> -  && flag_shrink_wrap_separate
> -  && optimize_function_for_speed_p (cfun))
> +  && flag_shrink_wrap_separate)
>  rs6000_isa_flags |= OPTION_MASK_SAVE_TOC_INDIRECT;
> 
> here.

That "just" reinforces that this really needs a testcase!  It is all
action at a distance, none of this is trivial (if it was there would
not be a bug here in the first place, of course).

> I tried to find one test case before, but failed to find one which is not
> fragile to test.  And I thought the associated test case has demonstrated why
> the use of optimize_function_for_{speed,size}_p is too early in function
> rs6000_option_override_internal, so I gave up then.  Do you worry that we
> could revert it unexpectedly in future and no sensitive test case is on it?

I worry that it might contradict what some other code does.  I also
worry that it just is not a sensible thing to do.

I do not worry that your patch is not an improvement.  But the resulting
code more clearly (than the original) is problematic.  Where is r2 saved
to the frame if save_toc_in_prologue is false?


Segher


Re: [PATCH] c++, TLS: Support cross-tu static initialization for targets without alias support [PR106435].

2023-01-04 Thread Jason Merrill via Gcc-patches

On 1/3/23 18:17, Iain Sandoe wrote:




On 3 Jan 2023, at 22:22, Jason Merrill  wrote:

On 12/7/22 10:39, Iain Sandoe wrote:

  This has been tested on x86_64 and arm64 Darwin and on x86_64 linux gnu.
  The basic patch is live in the homebrew macOS support and so has had quite
  wide coverage on non-trivial codebases.
OK for master?
  Iain
Since this actually fixes wrong code, I wonder if we should also consider
  back-porting.
--- >8 ---
The description below relates to the code path when TARGET_SUPPORTS_ALIASES is
false; current operation is maintained for targets with alias support and any
new support code should be DCEd in that case.
--
Currently, cross-tu static initialisation is not supported for targets without
alias support.
The patch adds support by building a shim function in place of the alias for
these targets; the shim simply calls the generic initialiser.  Although this is
slightly less efficient than the alias, in practice (for targets that allow
sibcalls) the penalty is a single jump when code is optimised.
 From the perspective of a TU referencing an extern TLS variable, there is no
way to determine if it requires a guarded dynamic init.  So, in the referencing
TU, we build a weak reference to the potential init and check at runtime if the
init is present before calling it.  This strategy is fine for targets that have
ELF semantics, but fails at link time for Mach-O (which does not permit the
reference to be undefined in the static link).
The actual initialiser call is contained in a wrapper function, and to resolve
the Mach-O linker issue, in the TU that is referencing the var, we now generate
both the wrapper _and_ a weak definition of a dummy init function.  In the case
that there _is_ a dynamic init (in a different TU), that version will be non-weak
and will override the weak dummy one.


IIUC, this isn't reliable in general; in specific, I believe that the glibc 
dynamic loader no longer prefers strong definitions to weak ones.


Neither does Darwin’s dynamic loader; this implementation works there because 
the static linker _will_ override the weak def with a strong one.  IIUC, 
binutils ld does this too.

If we need this to work between DSOs then that potentially presents a problem 
(for Darwin the DSO is identified so that the symbol will be found in the 
library that resolved it in the static link, [but that can be defeated by 
forcing “flat linking”]), I am not sure if glibc dynamic loader would do 
something similar (although this code path is not taken on ELF targets since 
they have the symbol aliases).


Perhaps on targets that don't allow weakrefs to be unbound,


Darwin would allow it if we were able to tell the static linker that the symbol 
is permitted to be undefined - but since we don’t know the symbol’s name 
outside the FE, that is not going to fly.


Can you elaborate on this?


we should unconditionally emit the init function where the variable is defined, 
even if it does nothing, and unconditionally call it from the wrapper?


OK. that seems a safer option .. I will have to look at it when I have a chance.

thanks
Iain



In the case that we have a trivial
static init (so no init in any other TU) the weak-defined dummy init will be
called (a single return insn for optimised code).  We mitigate the call to
the dummy init by reworking the wrapper code-gen path to remove the test for
the weak reference function (as it will always be true) since the static linker
will now determine the function to be called.
Signed-off-by: Iain Sandoe 
PR c++/106435
gcc/c-family/ChangeLog:
* c-opts.cc (c_common_post_options): Allow fextern-tls-init for targets
without alias support.
gcc/cp/ChangeLog:
* decl2.cc (get_tls_init_fn): Allow targets without alias support.
(handle_tls_init): Emit aliases for single init functions where the
target supports this, otherwise emit a stub function that calls the
main tls init function.  (generate_tls_dummy_init): New.
gcc/testsuite/ChangeLog:
* g++.dg/cpp0x/pr106435-b.cc: New file.
* g++.dg/cpp0x/pr106435.C: New test.
* g++.dg/cpp0x/pr106435.h: New file.
---
  gcc/c-family/c-opts.cc   |  2 +-
  gcc/cp/decl2.cc  | 80 
  gcc/testsuite/g++.dg/cpp0x/pr106435-b.cc | 22 +++
  gcc/testsuite/g++.dg/cpp0x/pr106435.C| 24 +++
  gcc/testsuite/g++.dg/cpp0x/pr106435.h| 27 
  5 files changed, 142 insertions(+), 13 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/cpp0x/pr106435-b.cc
  create mode 100644 gcc/testsuite/g++.dg/cpp0x/pr106435.C
  create mode 100644 gcc/testsuite/g++.dg/cpp0x/pr106435.h
diff --git a/gcc/c-family/c-opts.cc b/gcc/c-family/c-opts.cc
index 70745aa4e7c..064645f980d 100644
--- a/gcc/c-family/c-opts.cc
+++ b/gcc/c-family/c-opts.cc
@@ -1070,7 +1070,7 @@ c_common_post_options (const char **pfilename)
  if (flag_extern_tls_init)
  {
-  if (!TARG

Re: [PATCH] c++: Error recovery in merge_default_template_args [PR108206]

2023-01-04 Thread Jason Merrill via Gcc-patches

On 1/4/23 04:32, Jakub Jelinek wrote:

Hi!

We ICE on the following testcase during error recovery, both new_parm
and old_parm are error_mark_node, the ICE is on
   error ("redefinition of default argument for %q+#D", new_parm);
   inform (DECL_SOURCE_LOCATION (old_parm),
   "original definition appeared here");
where we don't print anything useful for new_parm and ICE trying to
access DECL_SOURCE_LOCATION of old_parm.  I think we shouldn't diagnose
anything when either of the parms is erroneous, GCC 11 before
merge_default_template_args has been added was doing
   if (TREE_VEC_ELT (tmpl_parms, i) == error_mark_node
   || TREE_VEC_ELT (parms, i) == error_mark_node)
 continue;

   tmpl_parm = TREE_VALUE (TREE_VEC_ELT (tmpl_parms, i));
   if (error_operand_p (tmpl_parm))
 return false;
in redeclare_class_template.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?


OK.


2023-01-04  Jakub Jelinek  

PR c++/108206
* decl.cc (merge_default_template_args): Return false if either
new_parm or old_parm are erroneous.

* g++.dg/template/pr108206.C: New test.

--- gcc/cp/decl.cc.jj   2022-12-22 11:09:52.026181629 +0100
+++ gcc/cp/decl.cc  2023-01-03 18:11:25.202223528 +0100
@@ -1556,6 +1556,8 @@ merge_default_template_args (tree new_pa
tree old_parm = TREE_VALUE (TREE_VEC_ELT (old_parms, i));
tree& new_default = TREE_PURPOSE (TREE_VEC_ELT (new_parms, i));
tree& old_default = TREE_PURPOSE (TREE_VEC_ELT (old_parms, i));
+  if (error_operand_p (new_parm) || error_operand_p (old_parm))
+   return false;
if (new_default != NULL_TREE && old_default != NULL_TREE)
{
  auto_diagnostic_group d;
--- gcc/testsuite/g++.dg/template/pr108206.C.jj 2023-01-03 18:14:18.768730450 +0100
+++ gcc/testsuite/g++.dg/template/pr108206.C    2023-01-03 18:13:40.427281176 +0100
@@ -0,0 +1,5 @@
+// PR c++/108206
+// { dg-do compile { target c++11 } }
+
+template  void foo (T1);   // { dg-error "'X' has not been declared" }
+template  void foo (T2);   // { dg-error "'X' has not been declared" }

Jakub





Re: [PATCH] libstdc++: Export the __gnu_cxx::zoneinfo_dir_override symbol.

2023-01-04 Thread Jonathan Wakely via Gcc-patches
On Sat, 24 Dec 2022 at 12:21, Iain Sandoe  wrote:
>
>
>
> > On 24 Dec 2022, at 12:12, Jonathan Wakely wrote:
> >
> >
> >
> > On Sat, 24 Dec 2022, 11:35 Iain Sandoe via Libstdc++, 
> >  wrote:
> >  If this is not the right place to export the symbol (or you do not want
> >  to export it in the general case), I can always add a platform-specific
> >  file for it.  So far, tested on x86_64-darwin21, wider testing will
> >  follow over the holidays.
> >
> >  OK for trunk?
> >
> > I'd like to check if this causes the undefined weak symbol to be exported 
> > on ELF,
>
> I’d expect so, since it’s in the common file.

hppa-hp-hpux* wants this symbol to be defined anyway (see PR 108228),
so please push your patch to trunk.


>
> > but I suppose that doesn't really cause any harm if it is. The symbol name 
> > is in our own namespace so can't clash with user symbols. We can't declare 
> > that function in a header, because "zoneinfo_dir_override" is not a 
> > reserved name so could clash with user macros (we could prefix it with 
> > underscores, but since it's possible to override it without the library 
> > providing a declaration, I think it would be "nicer" to not use an ugly 
> > reserved name for something users are supposed to define themselves).
>
> I can also investigate the alternate solution for Darwin - where we pass 
> -U,symbolname to the linker.
> In that case, we do not need to provide a weak def. in the library (it looks 
> more ELF-like) but the symbol
> does still need to be exported - however that could be Darwin-local too as noted 
> above.
>
> (none of this is urgent, bootstrap is fixed - I was just poking at the 
> problems while they were fresh in
>  my mind;  although the fails are removed when I configure to an installation 
> with tzdata.zi, there are
>  still fails to deal with when using the system installation .. so I’m not 
> ’there’ yet .. )
>
> cheers
> Iain
>
> >
> >
> >
> >
> >  Iain
> >
> >  --- 8< ---
> >
> > This symbol needs to be visible in the library interface for Darwin
> > to override it with a user-provided one.
> >
> > Signed-off-by: Iain Sandoe 
> >
> > libstdc++-v3/ChangeLog:
> >
> > * config/abi/pre/gnu.ver (GLIBCXX_3.4):
> > Add __gnu_cxx::zoneinfo_dir_override().
> > ---
> >  libstdc++-v3/config/abi/pre/gnu.ver | 3 +++
> >  1 file changed, 3 insertions(+)
> >
> > diff --git a/libstdc++-v3/config/abi/pre/gnu.ver 
> > b/libstdc++-v3/config/abi/pre/gnu.ver
> > index 570ffca8710..bd4ab450652 100644
> > --- a/libstdc++-v3/config/abi/pre/gnu.ver
> > +++ b/libstdc++-v3/config/abi/pre/gnu.ver
> > @@ -1104,6 +1104,9 @@ GLIBCXX_3.4 {
> >  # std::uncaught_exception()
> >  _ZSt18uncaught_exceptionv;
> >
> > +# __gnu_cxx::zoneinfo_dir_override()
> > +_ZN9__gnu_cxx21zoneinfo_dir_overrideEv
> > +
> ># DO NOT DELETE THIS LINE.  Port-specific symbols, if any, will be here.
> >
> >local:
> > --
> > 2.37.1 (Apple Git-137.1)
> >
>



Re: [PATCH] c++, TLS: Support cross-tu static initialization for targets without alias support [PR106435].

2023-01-04 Thread Iain Sandoe



> On 4 Jan 2023, at 15:03, Jason Merrill  wrote:
> 
> On 1/3/23 18:17, Iain Sandoe wrote:
>>> On 3 Jan 2023, at 22:22, Jason Merrill  wrote:
>>> 
>>> On 12/7/22 10:39, Iain Sandoe wrote:
>>>> This has been tested on x86_64 and arm64 Darwin and on x86_64 linux gnu.
>>>> The basic patch is live in the homebrew macOS support and so has had quite
>>>> wide coverage on non-trivial codebases.
>>>> OK for master?
>>>> Iain
>>>> Since this actually fixes wrong code, I wonder if we should also consider
>>>> back-porting.
>>>> --- >8 ---
>>>> The description below relates to the code path when TARGET_SUPPORTS_ALIASES is
>>>> false; current operation is maintained for targets with alias support and any
>>>> new support code should be DCEd in that case.
>>>> --
>>>> Currently, cross-tu static initialisation is not supported for targets without
>>>> alias support.
>>>> The patch adds support by building a shim function in place of the alias for
>>>> these targets; the shim simply calls the generic initialiser.  Although this is
>>>> slightly less efficient than the alias, in practice (for targets that allow
>>>> sibcalls) the penalty is a single jump when code is optimised.
>>>> From the perspective of a TU referencing an extern TLS variable, there is no
>>>> way to determine if it requires a guarded dynamic init.  So, in the referencing
>>>> TU, we build a weak reference to the potential init and check at runtime if the
>>>> init is present before calling it.  This strategy is fine for targets that have
>>>> ELF semantics, but fails at link time for Mach-O (which does not permit the
>>>> reference to be undefined in the static link).
>>>> The actual initialiser call is contained in a wrapper function, and to resolve
>>>> the Mach-O linker issue, in the TU that is referencing the var, we now generate
>>>> both the wrapper _and_ a weak definition of a dummy init function.  In the case
>>>> that there _is_ a dynamic init (in a different TU), that version will be non-weak
>>>> and will override the weak dummy one.
>>> 
>>> IIUC, this isn't reliable in general; in specific, I believe that the glibc 
>>> dynamic loader no longer prefers strong definitions to weak ones.
>> Neither does Darwin’s dynamic loader; this implementation works there because 
>> the static linker _will_ override the weak def with a strong one.  IIUC, 
>> binutils ld does this too.
>> If we need this to work between DSOs then that potentially presents a 
>> problem (for Darwin the DSO is identified so that the symbol will be found 
>> in the library that resolved it in the static link, [but that can be 
>> defeated by forcing “flat linking”]), I am not sure if glibc dynamic loader 
>> would do something similar (although this code path is not taken on ELF 
>> targets since they have the symbol aliases).
>>> Perhaps on targets that don't allow weakrefs to be unbound,
>> Darwin would allow it if we were able to tell the static linker that the 
>> symbol is permitted to be undefined - but since we don’t know the symbol’s 
>> name outside the FE, that is not going to fly.
> 
> Can you elaborate on this?

At runtime Mach-O is much the same as ELF w.r.t weak references, the difference 
comes at static link time when (by default) Darwin’s linker requires all 
symbols to have a definition.

Darwin’s static linker has three mechanisms for allowing a weak reference (in 
each case, at runtime, the symbol reference will be NULL if no definition is 
present - so ELF-like at that point):

1. A definition is supplied on the link line (usually in a DSO) and the DSO is 
defined as a weak library, which means it is permitted to be absent at runtime 
[the usage we are thinking of here is not what this facility was designed for, 
but it can work].

2. We put -Wl,-undefined,dynamic_lookup on the link line, which allows any 
symbol to be undefined - it is a massive sledgehammer and not at all 
recommended for general code (we have to use it for things like plugins that 
need to resolve many symbols at runtime from their host).  NOTE that it also 
seems to be incompatible with some modern fixups on arm64 (i.e. it looks like 
Apple do not intend to guarantee it will work in the future).

3. An individual symbol may be specified as “allowed to be undefined” by 
passing -Wl,-U,_symbol

It was the third case I was thinking of - but I cannot see how to obtain the 
symbols easily (If we can identify them, we could arrange to emit them into 
some special section and then fish them out using simple-object in collect2 and 
apply to the generated link line).  However, this does not seem like a phase3/4 
kind of change (and I do not currently have much^W any spare time either).

——

The simpler approach I was using was to provide a dummy weak definition that is 
always available, but which will be overridden by the static linker if a 
non-weak definition is provided at that time (whi

Re: [PATCH] c++, TLS: Support cross-tu static initialization for targets without alias support [PR106435].

2023-01-04 Thread Jason Merrill via Gcc-patches

On 1/4/23 10:30, Iain Sandoe wrote:




On 4 Jan 2023, at 15:03, Jason Merrill  wrote:

On 1/3/23 18:17, Iain Sandoe wrote:

On 3 Jan 2023, at 22:22, Jason Merrill  wrote:

On 12/7/22 10:39, Iain Sandoe wrote:

  This has been tested on x86_64 and arm64 Darwin and on x86_64 linux gnu.
  The basic patch is live in the homebrew macOS support and so has had quite
  wide coverage on non-trivial codebases.
OK for master?
  Iain
Since this actually fixes wrong code, I wonder if we should also consider
  back-porting.
--- >8 ---
The description below relates to the code path when TARGET_SUPPORTS_ALIASES is
false; current operation is maintained for targets with alias support and any
new support code should be DCEd in that case.
--
Currently, cross-tu static initialisation is not supported for targets without
alias support.
The patch adds support by building a shim function in place of the alias for
these targets; the shim simply calls the generic initialiser.  Although this is
slightly less efficient than the alias, in practice (for targets that allow
sibcalls) the penalty is a single jump when code is optimised.
 From the perspective of a TU referencing an extern TLS variable, there is no
way to determine if it requires a guarded dynamic init.  So, in the referencing
TU, we build a weak reference to the potential init and check at runtime if the
init is present before calling it.  This strategy is fine for targets that have
ELF semantics, but fails at link time for Mach-O (which does not permit the
reference to be undefined in the static link).
The actual initialiser call is contained in a wrapper function, and to resolve
the Mach-O linker issue, in the TU that is referencing the var, we now generate
both the wrapper _and_ a weak definition of a dummy init function.  In the case
that there _is_ a dynamic init (in a different TU), that version will be non-weak
and will override the weak dummy one.


IIUC, this isn't reliable in general; in specific, I believe that the glibc 
dynamic loader no longer prefers strong definitions to weak ones.

Neither does Darwin’s dynamic loader; this implementation works there because 
the static linker _will_ override the weak def with a strong one.  IIUC, 
binutils ld does this too.
If we need this to work between DSOs then that potentially presents a problem 
(for Darwin the DSO is identified so that the symbol will be found in the 
library that resolved it in the static link, [but that can be defeated by 
forcing “flat linking”]), I am not sure if glibc dynamic loader would do 
something similar (although this code path is not taken on ELF targets since 
they have the symbol aliases).

Perhaps on targets that don't allow weakrefs to be unbound,

Darwin would allow it if we were able to tell the static linker that the symbol 
is permitted to be undefined - but since we don’t know the symbol’s name 
outside the FE, that is not going to fly.


Can you elaborate on this?


At runtime Mach-O is much the same as ELF w.r.t weak references, the difference 
comes at static link time when (by default) Darwin’s linker requires all 
symbols to have a definition.

Darwin’s static linker has three mechanisms for allowing a weak reference (in 
each case, at runtime, the symbol reference will be NULL if no definition is 
present - so ELF-like at that point):

1. A definition is supplied on the link line (usually in a DSO) and the DSO is 
defined as a weak library, which means it is permitted to be absent at runtime 
[the usage we are thinking of here is not what this facility was designed for, 
but it can work].

2. We put -Wl,-undefined,dynamic_lookup on the link line, which allows any 
symbol to be undefined - it is a massive sledgehammer and not at all 
recommended for general code (we have to use it for things like plugins that 
need to resolve many symbols at runtime from their host).  NOTE that it also 
seems to be incompatible with some modern fixups on arm64 (i.e. it looks like 
Apple do not intend to guarantee it will work in the future).

3. An individual symbol may be specified as “allowed to be undefined” by 
passing -Wl,-U,_symbol

It was the third case I was thinking of - but I cannot see how to obtain the 
symbols easily (If we can identify them, we could arrange to emit them into 
some special section and then fish them out using simple-object in collect2 and 
apply to the generated link line).  However, this does not seem like a phase3/4 
kind of change (and I do not currently have much^W any spare time either).


Aha, thanks.  We shouldn't need to build a list in a special section: 
collect2 could look for _ZTH* symbol references and add -U options for them.


Jason
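
The weak dummy-init scheme from the quoted patch description can be sketched in ordinary C++ as follows. This is an illustration only, assuming a GCC-compatible compiler that supports __attribute__ ((weak)); example_var, example_tls_init and example_tls_wrapper are hypothetical names, not the _ZTH* symbols the front end actually generates.

```cpp
#include <cassert>

// Stand-in for the extern TLS variable.  It is defined here so the sketch is
// a single TU; it has a constant initializer, so no dynamic init exists.
thread_local int example_var = 7;

// Weak dummy init, as the referencing TU would emit it.  A non-weak
// definition in another TU, if present at static link time, would be chosen
// by the static linker instead of this one.
extern "C" __attribute__ ((weak)) void example_tls_init ()
{
}

// Wrapper: because some definition always exists, the init is called
// unconditionally rather than after testing a weak reference for null.
int &example_tls_wrapper ()
{
  example_tls_init ();
  return example_var;
}

// Minimal check of the case with no dynamic init anywhere.
void example_check ()
{
  assert (example_tls_wrapper () == 7);
}
```

In this single-TU form only the dummy is ever called, which is the trivial-init case from the description. The override case needs a second TU providing a non-weak example_tls_init, and as discussed above it relies on the static linker, not the dynamic loader, preferring that definition.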



[PATCH] c++: mark_single_function and SFINAE [PR108282]

2023-01-04 Thread Patrick Palka via Gcc-patches
We typically ignore mark_used failure when in a non-SFINAE context for the
sake of better error recovery.  But in mark_single_function we're
instead ignoring mark_used failure in a SFINAE context, which ends up
causing the second static_assert here to incorrectly fail.
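
The intended condition can be modelled outside the compiler as a small truth function. This is a sketch only: model_flags and model_mark_single_function are hypothetical stand-ins, with model_tf_error playing the role of the tf_error bit (set when errors are reported, clear in a SFINAE context).

```cpp
#include <cassert>

// Hypothetical stand-in for the tsubst flags; only the bit used here.
enum model_flags { model_tf_none = 0, model_tf_error = 1 };

// Model of the fixed condition: a mark_used failure makes the function fail
// only when errors are suppressed (SFINAE context).  When errors are being
// reported, the failure is ignored for the sake of error recovery.
bool model_mark_single_function (bool single_fn, bool mark_used_ok,
                                 int complain)
{
  if (single_fn && !mark_used_ok && !(complain & model_tf_error))
    return false;
  return true;
}

// The two cases the description contrasts.
void model_check ()
{
  assert (!model_mark_single_function (true, false, model_tf_none));
  assert (model_mark_single_function (true, false, model_tf_error));
}
```

Before the fix the last test was (complain & tf_error) without the negation, so the SFINAE case returned true and the failure went unnoticed.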

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
trunk/12?

PR c++/108282

gcc/cp/ChangeLog:

* decl2.cc (mark_single_function): Ignore mark_used failure
only in a non-SFINAE context rather than in a SFINAE one.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/concepts-requires34.C: New test.
---
 gcc/cp/decl2.cc   |  2 +-
 .../g++.dg/cpp2a/concepts-requires34.C| 19 +++
 2 files changed, 20 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp2a/concepts-requires34.C

diff --git a/gcc/cp/decl2.cc b/gcc/cp/decl2.cc
index f95529a5c9a..00ed64d1691 100644
--- a/gcc/cp/decl2.cc
+++ b/gcc/cp/decl2.cc
@@ -5600,7 +5600,7 @@ mark_single_function (tree expr, tsubst_flags_t complain)
 
   if (is_overloaded_fn (expr) == 1
   && !mark_used (expr, complain)
-  && (complain & tf_error))
+  && !(complain & tf_error))
 return false;
   return true;
 }
diff --git a/gcc/testsuite/g++.dg/cpp2a/concepts-requires34.C 
b/gcc/testsuite/g++.dg/cpp2a/concepts-requires34.C
new file mode 100644
index 000..670a6dab31a
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp2a/concepts-requires34.C
@@ -0,0 +1,19 @@
+// PR c++/108282
+// { dg-do compile { target c++20 } }
+
+template
+concept TEST = requires { T::TT; };
+
+struct C { };
+
+template
+struct B {
+  static inline void TT() requires TEST;
+};
+
+int main() {
+  static_assert( !TEST );
+  static_assert( !TEST> );
+
+  B::TT();  // { dg-error "no match" }
+}
-- 
2.39.0.158.g2b4f5a4e4b



Re: [PATCH] libstdc++: Export the __gnu_cxx::zoneinfo_dir_override symbol.

2023-01-04 Thread Jakub Jelinek via Gcc-patches
On Wed, Jan 04, 2023 at 03:17:42PM +, Jonathan Wakely via Gcc-patches wrote:
> On Sat, 24 Dec 2022 at 12:21, Iain Sandoe  wrote:
> >
> >
> >
> > > On 24 Dec 2022, at 12:12, Jonathan Wakely wrote:
> > >
> > >
> > >
> > > On Sat, 24 Dec 2022, 11:35 Iain Sandoe via Libstdc++, 
> > >  wrote:
> > >  If this is not the right place to export the symbol (or you do not want
> > >  to export it in the general case), I can always add a platform-specific
> > >  file for it.  So far, tested on x86_64-darwin21, wider testing will
> > >  follow over the holidays.
> > >
> > >  OK for trunk?
> > >
> > > I'd like to check if this causes the undefined weak symbol to be exported 
> > > on ELF,
> >
> > I’d expect so, since it’s in the common file.
> 
> hppa-hp-hpux* wants this symbol to be defined anyway (see PR 108228),
> so please push your patch to trunk.

Isn't it wrong though to export it with GLIBCXX_3.4 symbol version?
I mean, if it wasn't exported in GCC 3.4 libstdc++.so, then it shouldn't
be in that symver.  Perhaps GLIBCXX_3.4.31 instead?

Jakub



Avoid quadratic behaviour of symbol renaming

2023-01-04 Thread Jan Hubicka via Gcc-patches
Hi,
LTO partitioning renames symbols that end up in the same partition
and clash in assembler name.  This is done for "ordinary" symbols (such
as static functions) but also for symbols that are kept only as master
clones holding bodies of functions to be specialized later.
This is done only because we stream bodies to named sections, and a clash
in names would mean that ltrans would load the wrong body and crash.

Martin recently added a bit to stream bodies for clones that are needed,
since this makes it easier to keep track of which summaries are output.  This,
however, triggers mass renaming of inline clones, which is very slow
and unnecessary since their bodies are never streamed.

Bootstrapped/regtested x86_64-linux, committed.

gcc/lto/ChangeLog:

2023-01-04  Jan Hubicka  

* lto-partition.cc (may_need_named_section_p): Clones with no body
need no renaming.

diff --git a/gcc/lto/lto-partition.cc b/gcc/lto/lto-partition.cc
index 654d67f272e..b96d1dd473d 100644
--- a/gcc/lto/lto-partition.cc
+++ b/gcc/lto/lto-partition.cc
@@ -1035,15 +1035,18 @@ promote_symbol (symtab_node *node)
 /* Return true if NODE needs named section even if it won't land in
the partition symbol table.
 
-   FIXME: we should really not use named sections for inline clones
-   and master clones.  */
+   FIXME: we should really not use named sections for master clones.  */
 
 static bool
 may_need_named_section_p (lto_symtab_encoder_t encoder, symtab_node *node)
 {
   struct cgraph_node *cnode = dyn_cast <cgraph_node *> (node);
+  /* We do not need to handle variables since we never clone them.  */
   if (!cnode)
 return false;
+  /* Only master clones will have bodies streamed.  */
+  if (cnode->clone_of)
+return false;
   if (node->real_symbol_p ())
 return false;
   return (!encoder


Re: [PATCH] libstdc++: Export the __gnu_cxx::zoneinfo_dir_override symbol.

2023-01-04 Thread Jonathan Wakely via Gcc-patches
On Wed, 4 Jan 2023 at 17:14, Jakub Jelinek wrote:
>
> On Wed, Jan 04, 2023 at 03:17:42PM +, Jonathan Wakely via Gcc-patches 
> wrote:
> > On Sat, 24 Dec 2022 at 12:21, Iain Sandoe  wrote:
> > >
> > >
> > >
> > > > On 24 Dec 2022, at 12:12, Jonathan Wakely wrote:
> > > >
> > > >
> > > >
> > > > On Sat, 24 Dec 2022, 11:35 Iain Sandoe via Libstdc++, 
> > > >  wrote:
> > > >  If this is not the right place to export the symbol (or you do not want
> > > >  to export it in the general case), I can always add a platform-specific
> > > >  file for it.  So far, tested on x86_64-darwin21, wider testing will
> > > >  follow over the holidays.
> > > >
> > > >  OK for trunk?
> > > >
> > > > I'd like to check if this causes the undefined weak symbol to be 
> > > > exported on ELF,
> > >
> > > I’d expect so, since it’s in the common file.
> >
> > hppa-hp-hpux* wants this symbol to be defined anyway (see PR 108228),
> > so please push your patch to trunk.
>
> Isn't it wrong though to export it with GLIBCXX_3.4 symbol version?
> I mean, if it wasn't exported in GCC 3.4 libstdc++.so, then it shouldn't
> be in that symver.  Perhaps GLIBCXX_3.4.31 instead?

Oops, yes! I didn't notice that! I'll move it.

Thanks for catching that.



[ping][PATCH] cp: warn uninitialized const/ref in base class [PR80681]

2023-01-04 Thread Charlie Sale via Gcc-patches
On this example:
```
struct Fine {
private:
const int f;
};

struct BuggyA {
const int a;
int &b;
};

struct BuggyB : private BuggyA {
};
```
g++ currently emits:
```
test.cc:3:19: warning: non-static const member ‘const int Fine::f’ in class 
without a constructor [-Wuninitialized]
3 | const int f;
  |
```
(relevant godbolt: https://godbolt.org/z/KGMK6e1zc)
The issue here is that g++ misses the uninitialized const and ref members in 
BuggyA that are inherited as
private in BuggyB. It should warn about those members when checking BuggyB.

With this patch, g++ emits the following:
```
test.cc:3:19: warning: non-static const member ‘const int Fine::f’ in class 
without a constructor [-Wuninitialized]
3 | const int f;
  |   ^
test.cc:7:19: warning: while processing ‘BuggyB’: non-static const member 
‘const int BuggyA::a’ in class without a constructor [-Wuninitialized]
7 | const int a;
  |   ^
test.cc:7:19: note: ‘BuggyB’ inherits ‘BuggyA’ as private, so all fields 
contained within ‘BuggyA’ are private to ‘BuggyB’
test.cc:8:14: warning: while processing ‘BuggyB’: non-static reference ‘int& 
BuggyA::b’ in class without a constructor [-Wuninitialized]
8 | int &b;
  |  ^
test.cc:8:14: note: ‘BuggyB’ inherits ‘BuggyA’ as private, so all fields 
contained within ‘BuggyA’ are private to ‘BuggyB’
```
Now, the compiler warns about the uninitialized members.

In terms of testing, I added three tests:
- a status quo test that makes sure that the existing warning behavior
  works
- A simple test based off of the PR
- Another example with multiple inheritance
- A final example with multiple levels of inheritance.

These tests all pass. I also bootstrapped the project without any
regressions.

PR c++/80681

gcc/cp/ChangeLog:

* class.cc (warn_uninitialized_const_and_ref): Extract warn logic
  into new func, add inform for inheritance warning
(check_bases_and_members): Move warn logic to
  warn_uninitialized_const_and_ref, check subclasses for warnings
  as well

gcc/testsuite/ChangeLog:

* g++.dg/pr80681-1.C: New test.

Signed-off-by: Charlie Sale 
---
 gcc/cp/class.cc  | 110 +--
 gcc/testsuite/g++.dg/pr80681-1.C |  51 ++
 2 files changed, 142 insertions(+), 19 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/pr80681-1.C

diff --git a/gcc/cp/class.cc b/gcc/cp/class.cc
index aebcb53739e..72172bea6ad 100644
--- a/gcc/cp/class.cc
+++ b/gcc/cp/class.cc
@@ -6018,6 +6018,76 @@ explain_non_literal_class (tree t)
 }
 }
 
+
+/* Warn for private const or reference class members that cannot be initialized
+   due to the class not having a default constructor.  If a child type is
+   provided, then we are checking class_type's members in case they cannot be
+   initialized by child_type.  If child_type is null, then we simply check
+   class_type.  */
+static void
+warn_uninitialized_const_and_ref (tree class_type, tree child_type)
+{
+  /* Check the fields on this class type.  */
+  tree field;
+  for (field = TYPE_FIELDS (class_type); field; field = DECL_CHAIN (field))
+{
+  /* We only want to check variable declarations.
+   Exclude fields that are not field decls or are not initialized.  */
+  if (TREE_CODE (field) != FIELD_DECL
+ || DECL_INITIAL (field) != NULL_TREE)
+   continue;
+
+  tree type = TREE_TYPE (field);
+
+  if (TYPE_REF_P (type))
+   {
+ if (child_type != nullptr)
+  {
+   /* Show parent class while processing.  */
+   auto_diagnostic_group d;
+   warning_at (DECL_SOURCE_LOCATION (field),
+   OPT_Wuninitialized, "while processing %qE: "
+   "non-static reference %q#D in class without a constructor",
+   child_type, field);
+   inform (DECL_SOURCE_LOCATION (field),
+   "%qE inherits %qE as private, so all fields "
+   "contained within %qE are private to %qE",
+   child_type, class_type, class_type, child_type);
+  }
+ else
+  {
+   warning_at (DECL_SOURCE_LOCATION (field),
+   OPT_Wuninitialized, "non-static reference %q#D "
+   "in class without a constructor", field);
+  }
+   }
+  else if (CP_TYPE_CONST_P (type)
+  && (!CLASS_TYPE_P (type)
+  || !TYPE_HAS_DEFAULT_CONSTRUCTOR (type)))
+   {
+ if (child_type)
+  {
+   /* ditto.  */
+   auto_diagnostic_group d;
+   warning_at (DECL_SOURCE_LOCATION (field),
+   OPT_Wuninitialized, "while processing %qE: "
+   "non-static const member %q#D in class "
+   "without a constructor", child_type, field);
+   inform (DECL_SOURCE_LOCATION (field),
+   "%qE inherits %qE as private, so all fields "

Re: [PATCH] vrp: Handle pointers in maybe_set_nonzero_bits [PR108253]

2023-01-04 Thread Aldy Hernandez via Gcc-patches
On PTO until Monday but thinking out loud here

Shouldn't we put this code in set_nonzero_bits instead, and leave maybe*
alone? That way any possible setters may benefit from your change?

Also, I haven't looked (AFK), but does this change work with the global range
getter (get_global_range_query...)?

Thoughts?

Aldy


On Wed, Jan 4, 2023, 10:13 Jakub Jelinek  wrote:

> Hi!
>
> maybe_set_nonzero_bits calls set_nonzero_bits which asserts that
> var doesn't have pointer type.  While we could punt for those
> cases, I think we can handle at least some easy cases.
> Earlier in maybe_set_nonzero_bits we've checked this is on
> (var & cst) == 0
> edge and the other edge is __builtin_unreachable, so if cst
> is say 3 as in the testcase, we want to turn it into 4 byte alignment
> of the pointer.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
>
> 2023-01-04  Jakub Jelinek  
>
> PR tree-optimization/108253
> * tree-vrp.cc (maybe_set_nonzero_bits): Handle var with pointer
> types.
>
> * g++.dg/opt/pr108253.C: New test.
>
> --- gcc/tree-vrp.cc.jj  2023-01-02 09:32:53.634833769 +0100
> +++ gcc/tree-vrp.cc 2023-01-03 15:57:51.613239761 +0100
> @@ -789,8 +789,22 @@ maybe_set_nonzero_bits (edge e, tree var
> return;
>  }
>cst = gimple_assign_rhs2 (stmt);
> -  set_nonzero_bits (var, wi::bit_and_not (get_nonzero_bits (var),
> - wi::to_wide (cst)));
> +  if (POINTER_TYPE_P (TREE_TYPE (var)))
> +{
> +  struct ptr_info_def *pi = SSA_NAME_PTR_INFO (var);
> +  if (pi && pi->misalign)
> +   return;
> +  wide_int w = wi::bit_not (wi::to_wide (cst));
> +  unsigned int bits = wi::ctz (w);
> +  if (bits == 0 || bits >= HOST_BITS_PER_INT)
> +   return;
> +  unsigned int align = 1U << bits;
> +  if (pi == NULL || pi->align < align)
> +   set_ptr_info_alignment (get_ptr_info (var), align, 0);
> +}
> +  else
> +set_nonzero_bits (var, wi::bit_and_not (get_nonzero_bits (var),
> +   wi::to_wide (cst)));
>  }
>
>  /* Searches the case label vector VEC for the index *IDX of the CASE_LABEL
> --- gcc/testsuite/g++.dg/opt/pr108253.C.jj  2023-01-03
> 16:02:16.366438488 +0100
> +++ gcc/testsuite/g++.dg/opt/pr108253.C 2023-01-03 16:02:33.549191780 +0100
> @@ -0,0 +1,48 @@
> +// PR tree-optimization/108253
> +// { dg-do compile { target c++11 } }
> +// { dg-options "-O2" }
> +
> +struct S
> +{
> +  int *s;
> +  S () : s (new int) {}
> +  S (const S &r) noexcept : s (r.s) { __atomic_fetch_add (r.s, 1, 4); }
> +};
> +struct T
> +{
> +  explicit T (const S &x) : t (x) {}
> +  const S t;
> +};
> +struct U
> +{
> +  operator int () const { new T (u); return 0; }
> +  S u;
> +};
> +bool foo (int matcher);
> +unsigned long bar (unsigned long pos, unsigned long end_pos);
> +struct V
> +{
> +  alignas (4) char v[4];
> +};
> +struct W
> +{
> +  void baz ()
> +  {
> +if (!w) __builtin_abort ();
> +if (reinterpret_cast <__UINTPTR_TYPE__> (w->v) % 4 != 0)
> __builtin_abort ();
> +__builtin_unreachable ();
> +  }
> +  [[gnu::noinline]] void qux (unsigned long) { if (!w) bar (0, x); }
> +  V *w = nullptr;
> +  unsigned x = 0;
> +};
> +
> +void
> +test ()
> +{
> +  W w;
> +  U t;
> +  if (!foo (t))
> +w.baz ();
> +  w.qux (0);
> +}
>
> Jakub
>
>
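
The mask-to-alignment step in the quoted patch can be tried in isolation.
The sketch below is an illustration only: alignment_from_mask is a
hypothetical helper on plain 64-bit integers, not the wide_int code in
tree-vrp.cc, and the 32-bit cap stands in for HOST_BITS_PER_INT.

```cpp
#include <cstdint>

// Given a mask CST for which (ptr & CST) == 0 is known to hold, return the
// byte alignment this guarantees, or 0 if nothing useful can be derived.
unsigned alignment_from_mask(std::uint64_t cst) {
  std::uint64_t inverted = ~cst;
  if (inverted == 0)
    return 0;                      // every bit is masked: punt
  unsigned bits = 0;               // count trailing zeros of ~cst
  while ((inverted & 1) == 0) {
    inverted >>= 1;
    ++bits;
  }
  if (bits == 0 || bits >= 32)     // 32 is an assumed stand-in for the cap
    return 0;
  return 1u << bits;
}
```

For the testcase's mask of 3, ~3 has two trailing zero bits, so the pointer
is known to be 4-byte aligned.  A mask of 2 leaves bit 0 unknown, so no
alignment is derived.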


Re: [PATCH] c++: mark_single_function and SFINAE [PR108282]

2023-01-04 Thread Jason Merrill via Gcc-patches

On 1/4/23 11:37, Patrick Palka wrote:

We typically ignore mark_used failure when in a non-SFINAE context for
sake of better error recovery.  But in mark_single_function we're
instead ignoring mark_used failure in a SFINAE context, which ends up
causing the second static_assert here to incorrectly fail.
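
The intended behaviour can be written as a small truth table.  This is a
simplified model, not the code in decl2.cc: kReportErrors is a hypothetical
stand-in for the tf_error bit, and the is_overloaded_fn check is left out.

```cpp
// Hypothetical flag standing in for GCC's tf_error bit.  When set, errors
// are reported to the user (non-SFINAE context); when clear, a failure is
// a silent substitution failure (SFINAE context).
constexpr unsigned kReportErrors = 1u;

// Model of the fixed condition; mark_used_ok is the result of mark_used.
bool single_function_ok(bool mark_used_ok, unsigned complain) {
  if (!mark_used_ok && !(complain & kReportErrors))
    return false;  // SFINAE context: propagate the failure
  return true;     // success, or the error was already reported: carry on
}
```

Before the fix the second operand was (complain & tf_error), so the failure
was swallowed in exactly the SFINAE case, where it needs to be propagated.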

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
trunk/12?


OK.


PR c++/108282

gcc/cp/ChangeLog:

* decl2.cc (mark_single_function): Ignore mark_used failure
only in a non-SFINAE context rather than in a SFINAE one.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/concepts-requires34.C: New test.
---
  gcc/cp/decl2.cc   |  2 +-
  .../g++.dg/cpp2a/concepts-requires34.C| 19 +++
  2 files changed, 20 insertions(+), 1 deletion(-)
  create mode 100644 gcc/testsuite/g++.dg/cpp2a/concepts-requires34.C

diff --git a/gcc/cp/decl2.cc b/gcc/cp/decl2.cc
index f95529a5c9a..00ed64d1691 100644
--- a/gcc/cp/decl2.cc
+++ b/gcc/cp/decl2.cc
@@ -5600,7 +5600,7 @@ mark_single_function (tree expr, tsubst_flags_t complain)
  
if (is_overloaded_fn (expr) == 1

&& !mark_used (expr, complain)
-  && (complain & tf_error))
+  && !(complain & tf_error))
  return false;
return true;
  }
diff --git a/gcc/testsuite/g++.dg/cpp2a/concepts-requires34.C 
b/gcc/testsuite/g++.dg/cpp2a/concepts-requires34.C
new file mode 100644
index 000..670a6dab31a
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp2a/concepts-requires34.C
@@ -0,0 +1,19 @@
+// PR c++/108282
+// { dg-do compile { target c++20 } }
+
+template
+concept TEST = requires { T::TT; };
+
+struct C { };
+
+template
+struct B {
+  static inline void TT() requires TEST;
+};
+
+int main() {
+  static_assert( !TEST );
+  static_assert( !TEST> );
+
+  B::TT();  // { dg-error "no match" }
+}




Re: [PATCH] gccrs: avoid printing to stderr in selftest::rust_flatten_list

2023-01-04 Thread David Malcolm via Gcc-patches
On Mon, 2023-01-02 at 13:47 +0100, Arthur Cohen wrote:
> Hi David,
> 
> Sorry for the delayed reply!
> 
> On 12/16/22 18:01, David Malcolm wrote:
> > Successfully bootstrapped & regtested on x86_64-pc-linux-gnu.
> > 
> > OK for trunk?
> > 
> > gcc/rust/ChangeLog:
> > * resolve/rust-ast-resolve-item.cc
> > (selftest::rust_flatten_list):
> > Remove output to stderr.

For reference, the stderr spewage was:

foo::bar::baz
foo::bar::bul


> > 
> > Signed-off-by: David Malcolm 
> > ---
> >   gcc/rust/resolve/rust-ast-resolve-item.cc | 3 ---
> >   1 file changed, 3 deletions(-)
> > 
> > diff --git a/gcc/rust/resolve/rust-ast-resolve-item.cc
> > b/gcc/rust/resolve/rust-ast-resolve-item.cc
> > index 0c38f28d530..1276e845acc 100644
> > --- a/gcc/rust/resolve/rust-ast-resolve-item.cc
> > +++ b/gcc/rust/resolve/rust-ast-resolve-item.cc
> > @@ -1202,9 +1202,6 @@ rust_flatten_list (void)
> >     auto paths = std::vector ();
> >     Rust::Resolver::flatten_list (list, paths);
> >   
> > -  for (auto &path : paths)
> > -    fprintf (stderr, "%s\n", path.as_string ().c_str ());
> > -
> >     ASSERT_TRUE (!paths.empty ());
> >     ASSERT_EQ (paths.size (), 2);
> >     ASSERT_EQ (paths[0].get_segments ()[0].as_string (), "foo");
> 
> Looks good to me. OK for trunk :)
> 
> Thanks for taking the time!

I was about to push this and
  https://gcc.gnu.org/pipermail/gcc-patches/2022-December/608645.html
to trunk (after retesting against all the changes since before my
break), but you mentioned today on IRC something about merger issues:

 ibuclaw: I'm looking at the remaining issues for updating
GCC's master with our most recent gccrs commits. I've come to the
conclusion that it would be easier for me to upstream your target
changes while I'm upstreaming/submitting all of the missing commits
 would that suit you?
 it would just be me sending in your commits, but I
wouldn't be author on them or anything of course
 this would enable us to time them properly within the
rest of the commits, so there'd be no conflicts or anything of the sort
 dmalcolm: same question to you, actually :)

Can I go ahead and push my two commits to trunk, or do you want to do
it?  (and if so, do you want them e.g. as PRs against your github
branch?)

Dave


> 
> All the best,



Re: [PATCH] modula-2, driver: Implement handling for -static-libgm2.

2023-01-04 Thread Iain Sandoe
Hi Gaius,

> On 4 Jan 2023, at 12:11, Gaius Mulley  wrote:
> 
> Iain Sandoe  writes:
> 
>> tested on x86_64-linux-gnu, x86_64,aarch64-darwin21,

> 
> yes LGTM - it was unimplemented - thanks!

My apologies, when I came to apply this I realised that I posted the wrong
version of the patch - omitting the documentation changes.

Here is the version with (albeit basic) documentation.
Still OK for master?
Iain

[PATCH] modula-2, driver: Implement handling for -static-libgm2.

This was unimplemented so far.

gcc/ChangeLog:

* common.opt: Add -static-libgm2.
* config/darwin.h (LINK_SPEC): Handle static-libgm2.
* doc/gm2.texi: Document static-libgm2.
* gcc.cc (driver_handle_option): Allow static-libgm2.

gcc/m2/ChangeLog:

* gm2spec.cc (lang_specific_driver): Handle static-libgm2.
* lang.opt: Add static-libgm2.
---
 gcc/common.opt  |  4 
 gcc/config/darwin.h |  7 ++-
 gcc/doc/gm2.texi|  4 
 gcc/gcc.cc  | 12 +++-
 gcc/m2/gm2spec.cc   | 24 +++-
 gcc/m2/lang.opt |  4 
 6 files changed, 48 insertions(+), 7 deletions(-)

diff --git a/gcc/common.opt b/gcc/common.opt
index 97a78030228..d0371aec8db 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -3622,6 +3622,10 @@ static-libgfortran
 Driver
 ; Documented for Fortran, but always accepted by driver.
 
+static-libgm2
+Driver
+; Documented for Modula-2, but always accepted by driver.
+
 static-libphobos
 Driver
 ; Documented for D, but always accepted by driver.
diff --git a/gcc/config/darwin.h b/gcc/config/darwin.h
index efe3187cd96..e6f76e598e6 100644
--- a/gcc/config/darwin.h
+++ b/gcc/config/darwin.h
@@ -447,7 +447,12 @@ extern GTY(()) int darwin_ms_struct;
%{static|static-libgcc|static-libphobos:%:replace-outfile(-lgphobos 
libgphobos.a%s)}\

%{static|static-libgcc|static-libstdc++|static-libgfortran:%:replace-outfile(-lgomp
 libgomp.a%s)}\
%{static|static-libgcc|static-libstdc++:%:replace-outfile(-lstdc++ 
libstdc++.a%s)}\
-   %{force_cpusubtype_ALL:-arch %(darwin_arch)} \
+   %{static|static-libgm2:%:replace-outfile(-lm2pim libm2pim.a%s)}\
+   %{static|static-libgm2:%:replace-outfile(-lm2iso libm2iso.a%s)}\
+   %{static|static-libgm2:%:replace-outfile(-lm2min libm2min.a%s)}\
+   %{static|static-libgm2:%:replace-outfile(-lm2log libm2log.a%s)}\
+   %{static|static-libgm2:%:replace-outfile(-lm2cor libm2cor.a%s)}\
+  %{force_cpusubtype_ALL:-arch %(darwin_arch)} \
%{!force_cpusubtype_ALL:-arch %(darwin_subarch)} "\
LINK_SYSROOT_SPEC \
   "%{mmacosx-version-min=*:-macosx_version_min %*} \
diff --git a/gcc/doc/gm2.texi b/gcc/doc/gm2.texi
index 513fdd3ec7f..18cb798c6cd 100644
--- a/gcc/doc/gm2.texi
+++ b/gcc/doc/gm2.texi
@@ -573,6 +573,10 @@ the they provide the base modules which all other dialects 
utilize.
 The option @samp{-fno-libs=-} disables the @samp{gm2} driver from
 modifying the search and library paths.
 
+@item -static-libgm2
+On systems that provide the m2 runtimes as both shared and static libraries,
+this option forces the use of the static version.
+
 @c flocation=
 @c Modula-2 Joined
 @c set all location values to a specific value (internal switch)
diff --git a/gcc/gcc.cc b/gcc/gcc.cc
index 91313a8516d..d629ca5e424 100644
--- a/gcc/gcc.cc
+++ b/gcc/gcc.cc
@@ -4540,12 +4540,14 @@ driver_handle_option (struct gcc_options *opts,
 case OPT_static_libgfortran:
 case OPT_static_libquadmath:
 case OPT_static_libphobos:
+case OPT_static_libgm2:
 case OPT_static_libstdc__:
-  /* These are always valid, since gcc.cc itself understands the
-first two, gfortranspec.cc understands -static-libgfortran,
-d-spec.cc understands -static-libphobos, g++spec.cc
-understands -static-libstdc++ and libgfortran.spec handles
--static-libquadmath.  */
+  /* These are always valid; gcc.cc itself understands the first two
+gfortranspec.cc understands -static-libgfortran,
+libgfortran.spec handles -static-libquadmath,
+d-spec.cc understands -static-libphobos,
+gm2spec.cc understands -static-libgm2,
+and g++spec.cc understands -static-libstdc++.  */
   validated = true;
   break;
 
diff --git a/gcc/m2/gm2spec.cc b/gcc/m2/gm2spec.cc
index b9a5c4e79bb..583723da416 100644
--- a/gcc/m2/gm2spec.cc
+++ b/gcc/m2/gm2spec.cc
@@ -586,6 +586,9 @@ lang_specific_driver (struct cl_decoded_option 
**in_decoded_options,
   /* Should the driver perform a link?  */
   bool linking = true;
 
+  /* Should the driver link the shared gm2 libs?  */
+  bool shared_libgm2 = true;
+
   /* "-lm" or "-lmath" if it appears on the command line.  */
   const struct cl_decoded_option *saw_math = NULL;
 
@@ -595,7 +598,8 @@ lang_specific_driver (struct cl_decoded_option 
**in_decoded_options,
   /* By default, we throw on the math library if we have one.  */
   int need_math = (MATH_LIBRARY[0] != '\0');
 
-  /* 1 if we should add -lpthread to the command-line.  */
+  /* 1 if we sho

Re: [PATCH] PR 107189 Remove useless _Alloc_node

2023-01-04 Thread François Dumont via Gcc-patches

Still no chance to review ?

On 14/11/22 18:56, François Dumont wrote:

Gentle reminder.

Sorry if I should have committed it as trivial but I cannot do it 
anymore now that I asked :-)



On 12/10/22 22:18, François Dumont wrote:

libstdc++: Remove _Alloc_node instance in _Rb_tree [PR107189]

    libstdc++-v3/ChangeLog:

    PR libstdc++/107189
    * include/bits/stl_tree.h 
(_Rb_tree<>::_M_insert_range_equal): Remove
    unused _Alloc_node instance.

Ok to commit ?

François



diff --git a/libstdc++-v3/include/bits/stl_tree.h b/libstdc++-v3/include/bits/stl_tree.h
index a4de6141765..33d25089a1d 100644
--- a/libstdc++-v3/include/bits/stl_tree.h
+++ b/libstdc++-v3/include/bits/stl_tree.h
@@ -1123,7 +1123,6 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 	__enable_if_t::value>
 	_M_insert_range_equal(_InputIterator __first, _InputIterator __last)
 	{
-	  _Alloc_node __an(*this);
 	  for (; __first != __last; ++__first)
 	_M_emplace_equal(*__first);
 	}


[PATCH modula2] Add missing declarations to gcc/m2/gm2-libs-min/M2RTS.def

2023-01-04 Thread Gaius Mulley via Gcc-patches


Bootstrapped on gnu/linux x86_64

Ok for trunk?

thanks
Gaius

- o< - o< - o< - o< - o<


Add missing declarations to gcc/m2/gm2-libs-min/M2RTS.def

This patch adds two missing procedures to gcc/m2/gm2-libs-min/M2RTS.def
required for linking (the procedures are already present in the pim and
iso M2RTS.def).  The patch also includes test code, changes to
gcc/testsuite/lib/gm2.exp and an expect tcl script to test the min
libraries.


gcc/m2/ChangeLog:

* gm2-libs-min/M2RTS.def (ConstructModules): New procedure
declaration.
(DeconstructModules): New procedure declaration.

gcc/testsuite:

* lib/gm2.exp (gm2_init_minx): New procedure.
(gm2_init_min): New procedure calls gm2_init_minx with
dialect flags.
* gm2/link/min/pass/tiny.mod: New test case.
* gm2/link/min/pass/link-min-pass.exp: New file.


diff --git a/gcc/m2/gm2-libs-min/M2RTS.def b/gcc/m2/gm2-libs-min/M2RTS.def
index e3e13b7b554..147024ebe78 100644
--- a/gcc/m2/gm2-libs-min/M2RTS.def
+++ b/gcc/m2/gm2-libs-min/M2RTS.def
@@ -38,6 +38,10 @@ TYPE
all these procedures do nothing except satisfy the linker.
 *)

+PROCEDURE ConstructModules (applicationmodule: ADDRESS;
+argc: INTEGER; argv, envp: ADDRESS) ;
+PROCEDURE DeconstructModules (applicationmodule: ADDRESS;
+  argc: INTEGER; argv, envp: ADDRESS) ;
 PROCEDURE RegisterModule (name: ADDRESS;
   init, fini:  ArgCVEnvP;
   dependencies: PROC) ;
diff --git a/gcc/testsuite/lib/gm2.exp b/gcc/testsuite/lib/gm2.exp
index 450cb4c2d35..9eba195291a 100644
--- a/gcc/testsuite/lib/gm2.exp
+++ b/gcc/testsuite/lib/gm2.exp
@@ -496,3 +496,42 @@ proc gm2_init_cor { {path ""} args } {
 gm2_link_lib "m2cor m2pim m2iso"
 gm2_init {*}${theIpath} -fpim {*}${theLpath} {*}${args};
 }
+
+
+#
+#  gm2_init_minx - set the default libraries to choose MIN library and
+#  choose Modula-2, dialect.
+#
+#
+
+proc gm2_init_minx { dialect {path ""} args } {
+global srcdir;
+global gccpath;
+
+set gm2src ${srcdir}/../m2;
+
+send_log "srcdir is $srcdir\n"
+send_log "gccpath is $gccpath\n"
+send_log "gm2src is $gm2src\n"
+
+set minIpath "${gccpath}/libgm2/libm2min";
+set minLpath "${gccpath}/libgm2/libm2min/.libs";
+
+set theIpath "-I${minIpath}";
+set theLpath "-L${minLpath}";
+
+if { $path != "" } then {
+   append theIpath " -I"
+   append theIpath ${path}
+}
+gm2_init {*}${theIpath} {*}${dialect} {*}${theLpath} {*}${args};
+}
+
+#
+#  gm2_init_min - set the default libraries to choose MIN libraries
+# and pim dialect.
+#
+
+proc gm2_init_min { {path ""} args } {
+gm2_init_minx -fpim {*}${path} {*}${args};
+}
diff --git a/gcc/testsuite/gm2/link/min/pass/link-min-pass.exp 
b/gcc/testsuite/gm2/link/min/pass/link-min-pass.exp
new file mode 100644
index 000..88d180ec31e
--- /dev/null
+++ b/gcc/testsuite/gm2/link/min/pass/link-min-pass.exp
@@ -0,0 +1,37 @@
+# Expect driver script for GCC Regression Tests
+# Copyright (C) 2023 Free Software Foundation, Inc.
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with GCC; see the file COPYING3.  If not see
+# .
+
+# This file was written by Gaius Mulley (gaiusm...@gmail.com)
+# for GNU Modula-2.
+
+if $tracelevel then {
+strace $tracelevel
+}
+
+# load support procs
+load_lib gm2-torture.exp
+
+gm2_init_min "${srcdir}/gm2/min/pass"
+
+foreach testcase [lsort [glob -nocomplain $srcdir/$subdir/*.mod]] {
+# If we're only testing specific files and this isn't one of them, skip it.
+if ![runtest_file_p $runtests $testcase] then {
+   continue
+}
+
+gm2-torture $testcase
+}
diff --git a/gcc/testsuite/gm2/link/min/pass/tiny.mod 
b/gcc/testsuite/gm2/link/min/pass/tiny.mod
new file mode 100644
index 000..e1165edbe4a
--- /dev/null
+++ b/gcc/testsuite/gm2/link/min/pass/tiny.mod
@@ -0,0 +1,7 @@
+MODULE tiny ;
+
+(* Does nothing at all, but it should link if -flibs=min is used.  *)
+
+BEGIN
+
+END tiny.


[committed] libstdc++: Fix std::chrono::hh_mm_ss with unsigned rep [PR108265]

2023-01-04 Thread Jonathan Wakely via Gcc-patches
Tested x86_64-linux. Pushed to trunk, backports to gcc-11 and gcc-12
will follow.

-- >8 --

libstdc++-v3/ChangeLog:

PR libstdc++/108265
* include/std/chrono (hh_mm_ss): Do not use chrono::abs if
duration rep is unsigned.
* testsuite/std/time/hh_mm_ss/1.cc: Check unsigned rep.
---
 libstdc++-v3/include/std/chrono   | 15 ---
 libstdc++-v3/testsuite/std/time/hh_mm_ss/1.cc | 16 
 2 files changed, 28 insertions(+), 3 deletions(-)

diff --git a/libstdc++-v3/include/std/chrono b/libstdc++-v3/include/std/chrono
index 27f391a1455..e7fd6ed57ab 100644
--- a/libstdc++-v3/include/std/chrono
+++ b/libstdc++-v3/include/std/chrono
@@ -2294,7 +2294,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
}
 
constexpr
-   hh_mm_ss(_Duration __d, bool __is_neg) noexcept
+   hh_mm_ss(_Duration __d, bool __is_neg)
: _M_h (duration_cast(__d)),
  _M_m (duration_cast(__d - hours())),
  _M_s (duration_cast(__d - hours() - minutes())),
@@ -2307,6 +2307,15 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
_M_ss._M_r = duration_cast(__ss).count();
}
 
+   static constexpr _Duration
+   _S_abs(_Duration __d)
+   {
+ if constexpr (numeric_limits::is_signed)
+   return chrono::abs(__d);
+ else
+   return __d;
+   }
+
   public:
static constexpr unsigned fractional_width = {_S_fractional_width()};
 
@@ -2318,8 +2327,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
constexpr hh_mm_ss() noexcept = default;
 
constexpr explicit
-   hh_mm_ss(_Duration __d) noexcept
-   : hh_mm_ss(chrono::abs(__d), __d < _Duration::zero())
+   hh_mm_ss(_Duration __d)
+   : hh_mm_ss(_S_abs(__d), __d < _Duration::zero())
{ }
 
constexpr bool
diff --git a/libstdc++-v3/testsuite/std/time/hh_mm_ss/1.cc 
b/libstdc++-v3/testsuite/std/time/hh_mm_ss/1.cc
index 3f8a838c477..3c61145317c 100644
--- a/libstdc++-v3/testsuite/std/time/hh_mm_ss/1.cc
+++ b/libstdc++-v3/testsuite/std/time/hh_mm_ss/1.cc
@@ -115,3 +115,19 @@ size()
   struct S4 { long long h; char m, s; bool neg; double ss; };
   static_assert(sizeof(hh_mm_ss>) == sizeof(S4));
 }
+
+constexpr void
+unsigned_rep()
+{
+  using namespace std::chrono;
+
+  constexpr duration ms(3690001);
+
+  constexpr hh_mm_ss hms(ms); // PR libstdc++/108265
+  static_assert( ! hms.is_negative() );
+  static_assert( hms.to_duration() == milliseconds(ms.count()) );
+  static_assert( hms.hours() == 1h );
+  static_assert( hms.minutes() == 1min );
+  static_assert( hms.seconds() == 30s );
+  static_assert( hms.subseconds() == 1ms );
+}
-- 
2.39.0



[committed] libstdc++: Only use std::atomic if lock free [PR108228]

2023-01-04 Thread Jonathan Wakely via Gcc-patches
Tested x86_64-linux. Pushed to trunk.

-- >8 --

This fixes linker errors for hppa-hp-hpux11.11 due to an undefined weak
symbol and the use of atomic operations that require libatomic.

The weak symbol can simply be defined, which we already do for darwin.

The std::atomic<_Node*> is only an optimization, so can be avoided for
targets where the underlying atomic ops aren't available without help
from libatomic. The accesses to the std::atomic<_Node*> can be
abstracted behind a new API for getting and setting the cached value,
and then the atomics can be used conditionally.
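
A reduced sketch of that abstraction follows.  Node, list_head and
cache_list_head are hypothetical names here, and unlike the patch (where the
fallback setter is a no-op because the owning pointer is updated elsewhere)
the fallback setter below stores the pointer itself so that the example is
self-contained.

```cpp
#include <atomic>
#include <mutex>

struct Node { int value; };  // hypothetical list node

#if ATOMIC_POINTER_LOCK_FREE == 2
// Pointers are always lock-free: keep a cached raw pointer in an atomic.
static std::atomic<Node*> head_cache{nullptr};

Node* list_head() noexcept
{ return head_cache.load(std::memory_order_acquire); }

void cache_list_head(Node* n) noexcept
{ head_cache.store(n, std::memory_order_release); }
#else
// Fall back to a mutex-protected pointer so no libatomic calls are needed.
static std::mutex list_mutex;
static Node* head_owner = nullptr;

Node* list_head()
{
  std::lock_guard<std::mutex> l(list_mutex);
  return head_owner;
}

void cache_list_head(Node* n)
{
  std::lock_guard<std::mutex> l(list_mutex);
  head_owner = n;
}
#endif
```

Callers only ever use the two accessor functions, so the choice between the
atomic and the mutex is invisible to them.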

libstdc++-v3/ChangeLog:

PR libstdc++/108228
PR libstdc++/108235
* config/abi/pre/gnu.ver: Move zoneinfo_dir_override export to
the latest symbol version.
* src/c++20/tzdb.cc (USE_ATOMIC_SHARED_PTR): Define to 0 if
atomic<_Node*> is not always lock free.
(USE_ATOMIC_LIST_HEAD): New macro.
[__hpux__] (__gnu_cxx::zoneinfo_dir_override()): Provide
definition of weak symbol.
(tzdb_list::_Node::_S_head): Rename to _S_head_cache.
(tzdb_list::_Node::_S_list_head): New function for accessing
list head efficiently.
(tzdb_list::_Node::_S_cache_list_head): New function for
updating _S_list_head.
---
 libstdc++-v3/config/abi/pre/gnu.ver |  4 +-
 libstdc++-v3/src/c++20/tzdb.cc  | 97 +++--
 2 files changed, 64 insertions(+), 37 deletions(-)

diff --git a/libstdc++-v3/config/abi/pre/gnu.ver 
b/libstdc++-v3/config/abi/pre/gnu.ver
index dc46f670b78..a5559a1c374 100644
--- a/libstdc++-v3/config/abi/pre/gnu.ver
+++ b/libstdc++-v3/config/abi/pre/gnu.ver
@@ -1104,9 +1104,6 @@ GLIBCXX_3.4 {
 # std::uncaught_exception()
 _ZSt18uncaught_exceptionv;
 
-# __gnu_cxx::zoneinfo_dir_override()
-_ZN9__gnu_cxx21zoneinfo_dir_overrideEv;
-
   # DO NOT DELETE THIS LINE.  Port-specific symbols, if any, will be here.
 
   local:
@@ -2505,6 +2502,7 @@ GLIBCXX_3.4.31 {
 _ZNKSt6chrono9tzdb_list14const_iteratordeEv;
 _ZNSt6chrono9tzdb_list14const_iteratorppEv;
 _ZNSt6chrono9tzdb_list14const_iteratorppEi;
+_ZN9__gnu_cxx21zoneinfo_dir_overrideEv;
 
 } GLIBCXX_3.4.30;
 
diff --git a/libstdc++-v3/src/c++20/tzdb.cc b/libstdc++-v3/src/c++20/tzdb.cc
index 2a4e213d3d9..6772517d55a 100644
--- a/libstdc++-v3/src/c++20/tzdb.cc
+++ b/libstdc++-v3/src/c++20/tzdb.cc
@@ -34,14 +34,22 @@
 #include   // mutex
 #include  // filesystem::read_symlink
 
-#ifdef __GTHREADS
-# if _WIN32
+#ifndef __GTHREADS
+# define USE_ATOMIC_SHARED_PTR 0
+#elif _WIN32
 // std::mutex cannot be constinit, so Windows must use atomic>.
-#  define USE_ATOMIC_SHARED_PTR 1
-# else
+# define USE_ATOMIC_SHARED_PTR 1
+#elif ATOMIC_POINTER_LOCK_FREE < 2
+# define USE_ATOMIC_SHARED_PTR 0
+#else
 // TODO benchmark atomic> vs mutex.
-#  define USE_ATOMIC_SHARED_PTR 1
-# endif
+# define USE_ATOMIC_SHARED_PTR 1
+#endif
+
+#if defined __GTHREADS && ATOMIC_POINTER_LOCK_FREE == 2
+# define USE_ATOMIC_LIST_HEAD 1
+#else
+# define USE_ATOMIC_LIST_HEAD 0
 #endif
 
 #if ! __cpp_constinit
@@ -64,7 +72,7 @@ namespace __gnu_cxx
 #else
   [[gnu::weak]] const char* zoneinfo_dir_override();
 
-#ifdef __APPLE__
+#if defined(__APPLE__) || defined(__hpux__)
   // Need a weak definition for Mach-O.
   [[gnu::weak]] const char* zoneinfo_dir_override()
   { return _GLIBCXX_ZONEINFO_DIR; }
@@ -76,6 +84,15 @@ namespace std::chrono
 {
   namespace
   {
+#if ! USE_ATOMIC_SHARED_PTR
+#ifndef __GTHREADS
+// Dummy no-op mutex type for single-threaded targets.
+struct mutex { void lock() { } void unlock() { } };
+#endif
+/// XXX std::mutex::mutex() not constexpr on Windows, so can't be constinit
+constinit mutex list_mutex;
+#endif
+
 struct Rule;
   }
 
@@ -103,8 +120,29 @@ namespace std::chrono
 // This is the owning reference to the first tzdb in the list.
 static head_ptr _S_head_owner;
 
+#if USE_ATOMIC_LIST_HEAD
 // Lock-free access to the head of the list.
-static atomic<_Node*> _S_head;
+static atomic<_Node*> _S_head_cache;
+
+static _Node*
+_S_list_head(memory_order ord) noexcept
+{ return _S_head_cache.load(ord); }
+
+static void
+_S_cache_list_head(_Node* new_head) noexcept
+{ _S_head_cache = new_head; }
+#else
+static _Node*
+_S_list_head(memory_order)
+{
+  lock_guard l(list_mutex);
+  return _S_head_owner.get();
+}
+
+static void
+_S_cache_list_head(_Node*) noexcept
+{ }
+#endif
 
 static const tzdb& _S_init_tzdb();
 static const tzdb& _S_replace_head(shared_ptr<_Node>, shared_ptr<_Node>);
@@ -122,8 +160,10 @@ namespace std::chrono
   // Shared pointer to the first Node in the list.
   constinit tzdb_list::_Node::head_ptr 
tzdb_list::_Node::_S_head_owner{nullptr};
 
+#if USE_ATOMIC_LIST_HEAD
   // Lock-free access to the first Node in the list.
-  constinit atomic tzdb_list::_Node::_S_head{nullptr};
+  constinit atomic tzdb_list::_Node::_S_head_cache{nullptr};
+#e

[committed] libstdc++: Support single components in name of chrono::current_zone() [PR108211]

2023-01-04 Thread Jonathan Wakely via Gcc-patches
Tested x86_64-linux. Pushed to trunk.

-- >8 --

We currently only handle the case where /etc/localtime is a symlink to a
path like ".../Etc/UTC" and fail for ".../UTC". This makes both work.
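
The lookup order can be illustrated with a standalone function.  This is a
sketch only: zone_from_symlink_target and the std::set of names are
hypothetical stand-ins for do_locate_zone and the tzdb's zones and links.

```cpp
#include <filesystem>
#include <iterator>
#include <optional>
#include <set>
#include <string>

// Try the last path component first, then the last two joined with '/'.
std::optional<std::string>
zone_from_symlink_target(const std::filesystem::path& target,
                         const std::set<std::string>& zones) {
  auto first = target.begin(), last = target.end();
  if (std::distance(first, last) > 2) {
    --last;
    std::string name = last->string();   // e.g. "UTC"
    if (zones.count(name))
      return name;
    --last;
    name = last->string() + '/' + name;  // e.g. "Etc/UTC"
    if (zones.count(name))
      return name;
  }
  return std::nullopt;
}
```

With a target of /usr/share/zoneinfo/UTC the first lookup succeeds; with
/usr/share/zoneinfo/Europe/London the first lookup ("London") fails and the
second ("Europe/London") succeeds.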

libstdc++-v3/ChangeLog:

PR libstdc++/108211
* src/c++20/tzdb.cc (chrono::current_zone()): Check for zone
using only last component of the name.
---
 libstdc++-v3/src/c++20/tzdb.cc | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/libstdc++-v3/src/c++20/tzdb.cc b/libstdc++-v3/src/c++20/tzdb.cc
index 6772517d55a..9103b159400 100644
--- a/libstdc++-v3/src/c++20/tzdb.cc
+++ b/libstdc++-v3/src/c++20/tzdb.cc
@@ -1501,8 +1501,11 @@ namespace std::chrono
if (std::distance(first, last) > 2)
  {
--last;
-   string name = std::prev(last)->string() + '/';
-   name += last->string();
+   string name = last->string();
+   if (auto tz = do_locate_zone(this->zones, this->links, name))
+ return tz;
+   --last;
+   name = last->string() + '/' + name;
if (auto tz = do_locate_zone(this->zones, this->links, name))
  return tz;
  }
-- 
2.39.0
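
The lookup order introduced by the patch above can be sketched in isolation. This is a minimal illustration only, not the libstdc++ code: `zone_table` and `zone_from_symlink_target` are hypothetical names, and a plain `std::set` of names stands in for the `zones` and `links` tables that `do_locate_zone` searches.

```cpp
#include <cassert>
#include <filesystem>
#include <iterator>
#include <optional>
#include <set>
#include <string>

// Hypothetical stand-in for the tzdb zone and link tables.
using zone_table = std::set<std::string>;

// Given the target of a symlink such as ".../zoneinfo/Etc/UTC" or
// ".../zoneinfo/UTC", return the zone name it encodes, or nullopt if
// neither the last component nor the last two components are known.
std::optional<std::string>
zone_from_symlink_target(const std::filesystem::path& target,
                         const zone_table& known)
{
  auto first = target.begin(), last = target.end();
  if (std::distance(first, last) > 2)
    {
      --last;
      std::string name = last->string();   // e.g. "UTC"
      if (known.count(name))
        return name;
      --last;
      name = last->string() + '/' + name;  // e.g. "Etc/UTC"
      if (known.count(name))
        return name;
    }
  return std::nullopt;
}
```

With a table containing both "UTC" and "Etc/UTC", a target ending in ".../Etc/UTC" resolves to "UTC", because the single-component name is tried first; the two-component name is only used when the short one is unknown.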



Re: [PATCH 4/n] modula-2, driver: Handle static-libstd++ for targets without static/dynamic

2023-01-04 Thread Gaius Mulley via Gcc-patches
Iain Sandoe  writes:

> Oops pressed ‘send’ too soon - this is part of the series for Darwin:
>
> There are several modula-2 issues on Darwin, some blocking bootstrap on
> one or more system versions.
>
> This has been tested on powerpc/i686-darwin9 .. x86_64-darwin10,17,21 and
> the prototype aarch64-darwin branch on darwin21.
>
> OK for trunk?
> thanks
> Iain

sure, LGTM

regards,
Gaius

>> On 30 Dec 2022, at 10:58, Iain Sandoe  wrote:
>> 
>> This follows the pattern used in C++ and D drivers to pass -static-libstdc++
>> onto the target driver to allow spec substitution of static libraries.
>> 
>> NOTE: The general handling of Bstatic/dynamic and the possible use of static
>> libgm2 libraries is unimplemented in this driver so far.  It seems likely
>> that the driver construction could be greatly simplified if the modula-2
>> runtimes were combined into fewer (hopefully, one) libraries.
>> 
>> Signed-off-by: Iain Sandoe 
>> 
>> gcc/m2/ChangeLog:
>> 
>>  * gm2spec.cc (lang_specific_driver): Pass -static-libstdc++ on to
>>  the target driver if the linker does not support Bstatic/dynamic.
>> ---
>> gcc/m2/gm2spec.cc | 5 +
>> 1 file changed, 5 insertions(+)
>> 
>> diff --git a/gcc/m2/gm2spec.cc b/gcc/m2/gm2spec.cc
>> index 680dd3602ef..b9a5c4e79bb 100644
>> --- a/gcc/m2/gm2spec.cc
>> +++ b/gcc/m2/gm2spec.cc
>> @@ -767,7 +767,12 @@ lang_specific_driver (struct cl_decoded_option 
>> **in_decoded_options,
>> 
>>  case OPT_static_libstdc__:
>>library = library >= 0 ? 2 : library;
>> +#ifdef HAVE_LD_STATIC_DYNAMIC
>> +  /* Remove -static-libstdc++ from the command only if target supports
>> + LD_STATIC_DYNAMIC.  When not supported, it is left in so that a
>> + back-end target can use outfile substitution.  */
>>args[i] |= SKIPOPT;
>> +#endif
>>break;
>> 
>>  case OPT_stdlib_:
>> -- 
>> 2.37.1 (Apple Git-137.1)
>> 
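
The effect of the quoted change can be modelled outside the driver. This is a rough sketch under stated assumptions: `options_forwarded` is a hypothetical helper, and the runtime flag `ld_static_dynamic` stands in for the compile-time `HAVE_LD_STATIC_DYNAMIC` macro; the real driver marks the option with `SKIPOPT` rather than building a new vector.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical model: return the options that would still be passed on
// to the target driver after the language driver has looked at them.
std::vector<std::string>
options_forwarded(const std::vector<std::string>& args,
                  bool ld_static_dynamic)
{
  std::vector<std::string> out;
  for (const auto& a : args)
    {
      // With -Bstatic/-Bdynamic support the language driver consumes
      // -static-libstdc++ itself; without it the option is left in so
      // the target spec can substitute the static library.
      if (a == "-static-libstdc++" && ld_static_dynamic)
        continue;
      out.push_back(a);
    }
  return out;
}
```

On a target without static/dynamic linker switches, the option survives and a spec such as Darwin's `replace-outfile` can act on it.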


Re: [PATCH] modula-2, driver: Implement handling for -static-libgm2.

2023-01-04 Thread Gaius Mulley via Gcc-patches
Iain Sandoe  writes:

> Hi Gaius,
>
>> On 4 Jan 2023, at 12:11, Gaius Mulley  wrote:
>> 
>> Iain Sandoe  writes:
>> 
>>> tested on x86_64-linux-gnu, x86_64,aarch64-darwin21,
>
>> 
>> yes LGTM - it was unimplemented - thanks!
>
> My apologies, when I came to apply this I realised that I posted the wrong
> version of the patch - omitting the documentation changes.
>
> Here is the version with (albeit basic) documentation.
> Still OK for master?
> Iain

Hi Iain,

yes LGTM

thanks,
Gaius

> [PATCH] modula-2, driver: Implement handling for -static-libgm2.
>
> This was unimplemented so far.
>
> gcc/ChangeLog:
>
>   * common.opt: Add -static-libgm2.
>   * config/darwin.h (LINK_SPEC): Handle static-libgm2.
>   * doc/gm2.texi: Document static-libgm2.
>   * gcc.cc (driver_handle_option): Allow static-libgm2.
>
> gcc/m2/ChangeLog:
>
>   * gm2spec.cc (lang_specific_driver): Handle static-libgm2.
>   * lang.opt: Add static-libgm2.
> ---
>  gcc/common.opt  |  4 
>  gcc/config/darwin.h |  7 ++-
>  gcc/doc/gm2.texi|  4 
>  gcc/gcc.cc  | 12 +++-
>  gcc/m2/gm2spec.cc   | 24 +++-
>  gcc/m2/lang.opt |  4 
>  6 files changed, 48 insertions(+), 7 deletions(-)
>
> diff --git a/gcc/common.opt b/gcc/common.opt
> index 97a78030228..d0371aec8db 100644
> --- a/gcc/common.opt
> +++ b/gcc/common.opt
> @@ -3622,6 +3622,10 @@ static-libgfortran
>  Driver
>  ; Documented for Fortran, but always accepted by driver.
>  
> +static-libgm2
> +Driver
> +; Documented for Modula-2, but always accepted by driver.
> +
>  static-libphobos
>  Driver
>  ; Documented for D, but always accepted by driver.
> diff --git a/gcc/config/darwin.h b/gcc/config/darwin.h
> index efe3187cd96..e6f76e598e6 100644
> --- a/gcc/config/darwin.h
> +++ b/gcc/config/darwin.h
> @@ -447,7 +447,12 @@ extern GTY(()) int darwin_ms_struct;
>     %{static|static-libgcc|static-libphobos:%:replace-outfile(-lgphobos libgphobos.a%s)}\
>     %{static|static-libgcc|static-libstdc++|static-libgfortran:%:replace-outfile(-lgomp libgomp.a%s)}\
>     %{static|static-libgcc|static-libstdc++:%:replace-outfile(-lstdc++ libstdc++.a%s)}\
> -   %{force_cpusubtype_ALL:-arch %(darwin_arch)} \
> +   %{static|static-libgm2:%:replace-outfile(-lm2pim libm2pim.a%s)}\
> +   %{static|static-libgm2:%:replace-outfile(-lm2iso libm2iso.a%s)}\
> +   %{static|static-libgm2:%:replace-outfile(-lm2min libm2min.a%s)}\
> +   %{static|static-libgm2:%:replace-outfile(-lm2log libm2log.a%s)}\
> +   %{static|static-libgm2:%:replace-outfile(-lm2cor libm2cor.a%s)}\
> +  %{force_cpusubtype_ALL:-arch %(darwin_arch)} \
> %{!force_cpusubtype_ALL:-arch %(darwin_subarch)} "\
> LINK_SYSROOT_SPEC \
>"%{mmacosx-version-min=*:-macosx_version_min %*} \
> diff --git a/gcc/doc/gm2.texi b/gcc/doc/gm2.texi
> index 513fdd3ec7f..18cb798c6cd 100644
> --- a/gcc/doc/gm2.texi
> +++ b/gcc/doc/gm2.texi
> @@ -573,6 +573,10 @@ the they provide the base modules which all other 
> dialects utilize.
>  The option @samp{-fno-libs=-} disables the @samp{gm2} driver from
>  modifying the search and library paths.
>  
> +@item -static-libgm2
> +On systems that provide the m2 runtimes as both shared and static libraries,
> +this option forces the use of the static version.
> +
>  @c flocation=
>  @c Modula-2 Joined
>  @c set all location values to a specific value (internal switch)
> diff --git a/gcc/gcc.cc b/gcc/gcc.cc
> index 91313a8516d..d629ca5e424 100644
> --- a/gcc/gcc.cc
> +++ b/gcc/gcc.cc
> @@ -4540,12 +4540,14 @@ driver_handle_option (struct gcc_options *opts,
>  case OPT_static_libgfortran:
>  case OPT_static_libquadmath:
>  case OPT_static_libphobos:
> +case OPT_static_libgm2:
>  case OPT_static_libstdc__:
> -  /* These are always valid, since gcc.cc itself understands the
> -  first two, gfortranspec.cc understands -static-libgfortran,
> -  d-spec.cc understands -static-libphobos, g++spec.cc
> -  understands -static-libstdc++ and libgfortran.spec handles
> -  -static-libquadmath.  */
> +  /* These are always valid; gcc.cc itself understands the first two
> +  gfortranspec.cc understands -static-libgfortran,
> +  libgfortran.spec handles -static-libquadmath,
> +  d-spec.cc understands -static-libphobos,
> +  gm2spec.cc understands -static-libgm2,
> +  and g++spec.cc understands -static-libstdc++.  */
>validated = true;
>break;
>  
> diff --git a/gcc/m2/gm2spec.cc b/gcc/m2/gm2spec.cc
> index b9a5c4e79bb..583723da416 100644
> --- a/gcc/m2/gm2spec.cc
> +++ b/gcc/m2/gm2spec.cc
> @@ -586,6 +586,9 @@ lang_specific_driver (struct cl_decoded_option 
> **in_decoded_options,
>/* Should the driver perform a link?  */
>bool linking = true;
>  
> +  /* Should the driver link the shared gm2 libs?  */
> +  bool shared_libgm2 = true;
> +
>/* "-lm" or "-lmath" if it appears on the command line.  */
>const struct cl_decoded_option *saw_math = NULL;
>  
> @@ -

[PATCH 0/1] Update installation docs with gmplib link

2023-01-04 Thread Benson Muite via Gcc-patches
Improved patch formatting

Benson Muite (1):
  Add link to gmplib.org

 gcc/doc/install.texi | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

-- 
2.39.0



[PATCH 1/1] Add link to gmplib.org

2023-01-04 Thread Benson Muite via Gcc-patches
Link is missing from install documentation

Signed-off-by: Benson Muite 
---
 gcc/doc/install.texi | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/gcc/doc/install.texi b/gcc/doc/install.texi
index ccc8d15fd08..18e8709a169 100644
--- a/gcc/doc/install.texi
+++ b/gcc/doc/install.texi
@@ -396,7 +396,8 @@ install the libraries.
 @table @asis
 @item GNU Multiple Precision Library (GMP) version 4.3.2 (or later)
 
-Necessary to build GCC@.  If a GMP source distribution is found in a
+Necessary to build GCC@.  It can be downloaded from
+@uref{https://gmplib.org/}.  If a GMP source distribution is found in a
 subdirectory of your GCC sources named @file{gmp}, it will be built
 together with GCC.  Alternatively, if GMP is already installed but it
 is not in your library search path, you will have to configure with the
-- 
2.39.0



Re: [PATCH] rs6000: Don't use optimize_function_for_speed_p too early [PR108184]

2023-01-04 Thread Kewen.Lin via Gcc-patches
on 2023/1/4 22:02, Segher Boessenkool wrote:
> Hi!
> 
> On Wed, Jan 04, 2023 at 08:15:03PM +0800, Kewen.Lin wrote:
>> on 2023/1/4 18:46, Segher Boessenkool wrote:
>>>> @@ -25604,7 +25602,9 @@ rs6000_call_aix (rtx value, rtx func_desc, rtx tlsarg, rtx cookie)
>>>>
>>>>   /* Can we optimize saving the TOC in the prologue or
>>>>      do we need to do it at every call?  */
>>>> -  if (TARGET_SAVE_TOC_INDIRECT && !cfun->calls_alloca)
>>>> +  if (TARGET_SAVE_TOC_INDIRECT
>>>> +      && !cfun->calls_alloca
>>>> +      && optimize_function_for_speed_p (cfun))
>>>>     cfun->machine->save_toc_in_prologue = true;
>>>
>>> Is this correct?  If so, it really needs a separate testcase.
>>
>> Yes, it just moves the condition from:
>>
>> --- a/gcc/config/rs6000/rs6000.cc
>> +++ b/gcc/config/rs6000/rs6000.cc
>> @@ -3978,8 +3978,7 @@ rs6000_option_override_internal (bool global_init_p)
>>/* If we can shrink-wrap the TOC register save separately, then use
>>   -msave-toc-indirect unless explicitly disabled.  */
>>if ((rs6000_isa_flags_explicit & OPTION_MASK_SAVE_TOC_INDIRECT) == 0
>> -  && flag_shrink_wrap_separate
>> -  && optimize_function_for_speed_p (cfun))
>> +  && flag_shrink_wrap_separate)
>>  rs6000_isa_flags |= OPTION_MASK_SAVE_TOC_INDIRECT;
>>
>> here.
> 
> That "just" reinforces that this really needs a testcase!  It is all
> action at a distance, none of this is trivial (if it was there would
> not be a bug here in the first place, of course).

OK, I'll make a test case for it. :)

> 
>> I tried to find one test case before, but failed to find one which is not
>> fragile to test.  And I thought the associated test case has demonstrated
>> why the use of optimize_function_for_{speed,size}_p is too early in function
>> rs6000_option_override_internal, so I gave up then.  Do you worry that we
>> could revert it unexpectedly in future and no sensitive test case is on it?
> 
> I worry that it might contradict what some other code does.  I also
> worry that it just is not a sensible thing to do.
> 
> I do not worry that your patch is not an improvement.  But the resulting
> code more clearly (than the original) is problematic.  Where is r2 saved
> to the frame if save_toc_in_prologue is false?

If save_toc_in_prologue is false, r2 is saved to the frame at each indirect
call.  Currently separate shrink-wrapping checks save_toc_in_prologue to
decide whether to treat the TOC save as one component.  I think that's why
we enable save-toc-indirect implicitly (which in turn leads to
save_toc_in_prologue being set) if it's not specified explicitly and
separate shrink-wrapping is enabled.
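
To make the two decision points concrete, here is a simplified model of the logic after the patch. It is only a sketch: `option_state`, `function_state` and both functions are hypothetical stand-ins, not GCC's real data structures, and the real code only ever sets save_toc_in_prologue to true rather than computing a value.

```cpp
#include <cassert>

// Hypothetical, simplified stand-ins for the state discussed above.
struct option_state
{
  bool save_toc_indirect_explicit;  // user gave -m[no-]save-toc-indirect
  bool save_toc_indirect;           // its value when given explicitly
  bool shrink_wrap_separate;        // separate shrink-wrapping enabled
};

struct function_state
{
  bool calls_alloca;
  bool optimize_for_speed;
};

// Option-override time: no per-function query is made here any more.
bool effective_save_toc_indirect(const option_state& o)
{
  if (!o.save_toc_indirect_explicit && o.shrink_wrap_separate)
    return true;                    // enabled implicitly
  return o.save_toc_indirect;
}

// Call-expansion time: the speed check now lives with the other
// per-function conditions.
bool sets_save_toc_in_prologue(const option_state& o,
                               const function_state& f)
{
  return effective_save_toc_indirect(o)
         && !f.calls_alloca
         && f.optimize_for_speed;
}
```

When the second function returns false, the model corresponds to the case described above where r2 is saved at each indirect call instead of once in the prologue.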

BR,
Kewen


RE: [PATCH][X86_64] Separate znver4 insn reservations from older znvers

2023-01-04 Thread Joshi, Tejas Sanjay via Gcc-patches
[Public]

Hello,

> OK,
> thanks!
> Honza

Thanks! We have pushed the patch.

Regards,
Tejas