[PATCH][pushed] ARM: fix -Wformat= error

2022-01-14 Thread Martin Liška

gcc/ChangeLog:

* common/config/arm/arm-common.c (arm_target_mode): Fix
warning: unterminated quoting directive [-Wformat=].
---
 gcc/common/config/arm/arm-common.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/common/config/arm/arm-common.c 
b/gcc/common/config/arm/arm-common.c
index 6a898d8554b..2ca640b5b95 100644
--- a/gcc/common/config/arm/arm-common.c
+++ b/gcc/common/config/arm/arm-common.c
@@ -296,7 +296,7 @@ arm_target_mode (int argc, const char **argv)
cpu = argv[1];
   else
fatal_error (input_location, "unrecognized option passed to %%:"
-"%>");
+"%");
   argc -= 2;
   argv += 2;
 }
--
2.34.1



Re: [PATCH] [i386] GLC tuning: Break false dependency for dest register.

2022-01-14 Thread Uros Bizjak via Gcc-patches
On Fri, Jan 14, 2022 at 6:46 AM Hongyu Wang  wrote:
>
> > No, the approach is wrong. You have to solve output clearing on RTL
> > level, please look at how e.g. tzcnt false dep is solved:
>
> Actually we have considered such approach before, but we found we need
> to break original define_insn to remove the mask/rounding subst,
> since define_split could not adopt subst, and that would add 6 more
> define_insn_and_split and 4 define_insn for each instruction. We think
> such approach would introduce too much redundant code.

Are there any technical obstacles to introduce subst to
define_{,insn_and_}split?

Uros.


Re: [PATCH] Fix tree-optimization/101941: IPA splitting out function with error attribute

2022-01-14 Thread Martin Liška

PING^1

May I please ping this so that we can can Linux kernel as soon as possible?
We would benefit from that for GCC 12.1.0 release.

Thanks,
Martin

On 11/19/21 14:07, Richard Biener via Gcc-patches wrote:

On Fri, Nov 19, 2021 at 12:50 PM Andrew Pinski  wrote:


On Fri, Nov 19, 2021 at 2:16 AM Richard Biener via Gcc-patches
 wrote:


On Wed, Nov 17, 2021 at 7:25 AM apinski--- via Gcc-patches
 wrote:


From: Andrew Pinski 

The Linux kernel started to fail compile when the jump threader was improved
(r12-2591-g2e96b5f14e4025691). This failure was due to the IPA splitting code
decided now to split off the basic block which contained two functions,
one of those functions included the error attribute on them.  This patch fixes
the problem by disallowing basic blocks from being split which contain functions
that have either the error or warning attribute on them.

The two new testcases are to make sure we still split the function for other
places if we reject the one case.


Hmm, it's only problematic if the immediate(?) controlling condition of the
error/warning annotated call is not in the split part, no?

No, if we have something like:
   if (p_size < 32) {
 if (ctx != 0) {
   __write_overflow();
}
 fortify_panic(__func__);
  }
We would still run into the problem if we just disable the splitting
for the innermost bb and split off the 3 other bb's
I have a testcase where the above would fail if we decide only to make
sure the split point is not after ctx !=0 check.


Interestingly we for example don't avoid splitting away __builtin_constant_p 
either.


__builtin_constant_p is handled a different way already; in
check_forbidden_calls we set forbidden_dominators to include the bb
where the builtin_constant_p would have been true.
And then when we consider the split, we reject if the entry point
would have been dominated by one of those bbs.
This was PR49642.


I see.  So how's error/warning different - don't we want to forbit split points
that dominate those calls itself?




Thanks,
Andrew Pinski




Richard.


OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.

 PR tree-optimization/101941

gcc/ChangeLog:

 * ipa-split.c (visit_bb): Disallow function calls where
 the function has either error or warning attribute.

gcc/testsuite/ChangeLog:

 * gcc.c-torture/compile/pr101941-1.c: New test.
 * gcc.dg/tree-ssa/pr101941-1.c: New test.
---
  gcc/ipa-split.c   | 12 -
  .../gcc.c-torture/compile/pr101941-1.c| 44 +
  gcc/testsuite/gcc.dg/tree-ssa/pr101941-1.c| 48 +++
  3 files changed, 103 insertions(+), 1 deletion(-)
  create mode 100644 gcc/testsuite/gcc.c-torture/compile/pr101941-1.c
  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr101941-1.c

diff --git a/gcc/ipa-split.c b/gcc/ipa-split.c
index c68577d04a9..070e894ef31 100644
--- a/gcc/ipa-split.c
+++ b/gcc/ipa-split.c
@@ -873,7 +873,7 @@ visit_bb (basic_block bb, basic_block return_bb,
gimple *stmt = gsi_stmt (bsi);
tree op;
ssa_op_iter iter;
-  tree decl;
+  tree decl = NULL_TREE;

if (is_gimple_debug (stmt))
 continue;
@@ -927,6 +927,16 @@ visit_bb (basic_block bb, basic_block return_bb,
 break;
   }

+  /* If a function call and that function has either the
+warning or error attribute on it, don't split.  */
+  if (decl && (lookup_attribute ("warning", DECL_ATTRIBUTES (decl))
+  || lookup_attribute ("error", DECL_ATTRIBUTES (decl
+   {
+ if (dump_file && (dump_flags & TDF_DETAILS))
+   fprintf (dump_file, "Cannot split: warning or error attribute.\n");
+ can_split = false;
+   }
+
FOR_EACH_SSA_TREE_OPERAND (op, stmt, iter, SSA_OP_DEF)
 bitmap_set_bit (set_ssa_names, SSA_NAME_VERSION (op));
FOR_EACH_SSA_TREE_OPERAND (op, stmt, iter, SSA_OP_USE)
diff --git a/gcc/testsuite/gcc.c-torture/compile/pr101941-1.c 
b/gcc/testsuite/gcc.c-torture/compile/pr101941-1.c
new file mode 100644
index 000..ab3bbea8ed7
--- /dev/null
+++ b/gcc/testsuite/gcc.c-torture/compile/pr101941-1.c
@@ -0,0 +1,44 @@
+/* { dg-additional-options "-fconserve-stack" } */
+struct crypto_aes_ctx {
+  char key_dec[128];
+};
+
+int rfc4106_set_hash_subkey_hash_subkey;
+
+void __write_overflow(void)__attribute__((__error__("")));
+void __write_overflow1(void);
+void aes_encrypt(void*);
+
+void fortify_panic(const char*) __attribute__((__noreturn__)) ;
+
+char *rfc4106_set_hash_subkey(struct crypto_aes_ctx *ctx) {
+  void *a = &ctx->key_dec[0];
+  unsigned p_size =  __builtin_object_size(a, 0);
+#ifdef __OPTIMIZE__
+  if (p_size < 16) {
+__write_overflow1();
+fortify_panic(__func__);
+  }
+  if (p_size < 32) {
+__write_overflow();
+fortify_panic(__func__);
+  }
+#endif
+  aes_encrypt(ctx);
+  return ctx->key_dec;
+}
+
+char *(*gg)(struct crypto_aes_ctx *) = rfc4106

Re: [PATCH] [i386] GLC tuning: Break false dependency for dest register.

2022-01-14 Thread Uros Bizjak via Gcc-patches
On Fri, Jan 14, 2022 at 7:11 AM Hongyu Wang  wrote:
>
> > > No, the approach is wrong. You have to solve output clearing on RTL
> > > level, please look at how e.g. tzcnt false dep is solved:
> >
> > Actually we have considered such approach before, but we found we need
> > to break original define_insn to remove the mask/rounding subst,
> > since define_split could not adopt subst, and that would add 6 more
> > define_insn_and_split and 4 define_insn for each instruction. We think
> > such approach would introduce too much redundant code.
> >
> > Do you think the code size increment is acceptable?
>
> Also that 100+ more patterns increases maintenance effort. If we split
> them at epilogue_complete stage,
> it seems not much difference to put it under output template...

In the proposed patch, if the output register is also mentioned in the
input, then clearing before insn will clear the value in the input
register. The solution in the i386.md also takes care of this issue.

Uros.


Re: [PATCH] Fix tree-optimization/101941: IPA splitting out function with error attribute

2022-01-14 Thread Jan Hubicka via Gcc-patches
> > > > > --- a/gcc/ipa-split.c
> > > > > +++ b/gcc/ipa-split.c
> > > > > @@ -873,7 +873,7 @@ visit_bb (basic_block bb, basic_block return_bb,
> > > > > gimple *stmt = gsi_stmt (bsi);
> > > > > tree op;
> > > > > ssa_op_iter iter;
> > > > > -  tree decl;
> > > > > +  tree decl = NULL_TREE;
> > > > > 
> > > > > if (is_gimple_debug (stmt))
> > > > >  continue;
> > > > > @@ -927,6 +927,16 @@ visit_bb (basic_block bb, basic_block return_bb,
Decl is initialized in
  if (gimple_code (stmt) == GIMPLE_CALL
  && (decl = gimple_call_fndecl (stmt)) != NULL_TREE
  && fndecl_built_in_p (decl, BUILT_IN_NORMAL))

I think this is confusing. I would change it to
  if (gimple_code (stmt) == GIMPLE_CALL
  && (decl = gimple_call_fndecl (stmt)) != NULL_TREE
{
  if (fndecl_built_in_p (decl, BUILT_IN_NORMAL))
... existing code ...
  if (decl && (lookup_attribute ("warning", DECL_ATTRIBUTES (decl))
   || lookup_attribute ("error", DECL_ATTRIBUTES (decl
... your code ...
}
OK with that change.
Honza
> > > > >  break;
> > > > >}
> > > > > 
> > > > > +  /* If a function call and that function has either the
> > > > > +warning or error attribute on it, don't split.  */
> > > > > +  if (decl && (lookup_attribute ("warning", DECL_ATTRIBUTES 
> > > > > (decl))
> > > > > +  || lookup_attribute ("error", DECL_ATTRIBUTES 
> > > > > (decl
> > > > > +   {
> > > > > + if (dump_file && (dump_flags & TDF_DETAILS))
> > > > > +   fprintf (dump_file, "Cannot split: warning or error 
> > > > > attribute.\n");
> > > > > + can_split = false;
> > > > > +   }
> > > > > +
> > > > > FOR_EACH_SSA_TREE_OPERAND (op, stmt, iter, SSA_OP_DEF)
> > > > >  bitmap_set_bit (set_ssa_names, SSA_NAME_VERSION (op));
> > > > > FOR_EACH_SSA_TREE_OPERAND (op, stmt, iter, SSA_OP_USE)
> > > > > diff --git a/gcc/testsuite/gcc.c-torture/compile/pr101941-1.c 
> > > > > b/gcc/testsuite/gcc.c-torture/compile/pr101941-1.c
> > > > > new file mode 100644
> > > > > index 000..ab3bbea8ed7
> > > > > --- /dev/null
> > > > > +++ b/gcc/testsuite/gcc.c-torture/compile/pr101941-1.c
> > > > > @@ -0,0 +1,44 @@
> > > > > +/* { dg-additional-options "-fconserve-stack" } */
> > > > > +struct crypto_aes_ctx {
> > > > > +  char key_dec[128];
> > > > > +};
> > > > > +
> > > > > +int rfc4106_set_hash_subkey_hash_subkey;
> > > > > +
> > > > > +void __write_overflow(void)__attribute__((__error__("")));
> > > > > +void __write_overflow1(void);
> > > > > +void aes_encrypt(void*);
> > > > > +
> > > > > +void fortify_panic(const char*) __attribute__((__noreturn__)) ;
> > > > > +
> > > > > +char *rfc4106_set_hash_subkey(struct crypto_aes_ctx *ctx) {
> > > > > +  void *a = &ctx->key_dec[0];
> > > > > +  unsigned p_size =  __builtin_object_size(a, 0);
> > > > > +#ifdef __OPTIMIZE__
> > > > > +  if (p_size < 16) {
> > > > > +__write_overflow1();
> > > > > +fortify_panic(__func__);
> > > > > +  }
> > > > > +  if (p_size < 32) {
> > > > > +__write_overflow();
> > > > > +fortify_panic(__func__);
> > > > > +  }
> > > > > +#endif
> > > > > +  aes_encrypt(ctx);
> > > > > +  return ctx->key_dec;
> > > > > +}
> > > > > +
> > > > > +char *(*gg)(struct crypto_aes_ctx *) = rfc4106_set_hash_subkey;
> > > > > +
> > > > > +void a(void)
> > > > > +{
> > > > > +  struct crypto_aes_ctx ctx;
> > > > > +  rfc4106_set_hash_subkey(&ctx);
> > > > > +}
> > > > > +void b(void)
> > > > > +{
> > > > > +  struct crypto_aes_ctx ctx;
> > > > > +  ctx.key_dec[0] = 0;
> > > > > +  rfc4106_set_hash_subkey(&ctx);
> > > > > +}
> > > > > +
> > > > > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr101941-1.c 
> > > > > b/gcc/testsuite/gcc.dg/tree-ssa/pr101941-1.c
> > > > > new file mode 100644
> > > > > index 000..21c1d1ec466
> > > > > --- /dev/null
> > > > > +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr101941-1.c
> > > > > @@ -0,0 +1,48 @@
> > > > > +/* { dg-do compile } */
> > > > > +/* { dg-options "-O2 -fconserve-stack -fdump-tree-optimized" } */
> > > > > +struct crypto_aes_ctx {
> > > > > +  char key_dec[128];
> > > > > +};
> > > > > +
> > > > > +int rfc4106_set_hash_subkey_hash_subkey;
> > > > > +
> > > > > +void __write_overflow(void)__attribute__((__error__("")));
> > > > > +void __write_overflow1(void);
> > > > > +void aes_encrypt(void*);
> > > > > +
> > > > > +void fortify_panic(const char*) __attribute__((__noreturn__)) ;
> > > > > +
> > > > > +char *rfc4106_set_hash_subkey(struct crypto_aes_ctx *ctx) {
> > > > > +  void *a = &ctx->key_dec[0];
> > > > > +  unsigned p_size =  __builtin_object_size(a, 0);
> > > > > +#ifdef __OPTIMIZE__
> > > > > +  if (p_size < 16) {
> > > > > +__write_overflow1();
> > > > > +fortify_panic(__func__);
> > > > > +  }
> > > > > +  if (p_size < 32) {
> > > > > +__write_overflow();
> > > > > +fort

RE: [PATCH] x86_64: Improvements to arithmetic right shifts of V1TImode values.

2022-01-14 Thread Roger Sayle

Hi Uros,
Here's a revised version of this patch incorporating your suggestion of using
force_reg instead of emit_move_insn to a pseudo allocated by gen_reg_rtx.
I also took the opportunity to transition the rest of the function (and clean-up
those around it) to use this preferred idiom.

This patch has been tested on x86_64-pc-linux-gnu with a make bootstrap
and make -k check with no new failures.  OK for mainline?


2022-01-14  Roger Sayle  
Uroš Bizjak  

gcc/ChangeLog
* config/i386/i386-expand.c (ix86_expand_v1ti_to_ti): Use force_reg.
(ix86_expand_ti_to_v1ti): Use force_reg.
(ix86_expand_v1ti_shift): Use force_reg.
(ix86_expand_v1ti_rotate): Use force_reg.
(ix86_expand_v1ti_ashiftrt): Provide new three operation
implementations for shifts by 111..126 bits.  Use force_reg.


Thanks again,
Roger
--

> -Original Message-
> From: Uros Bizjak 
> Sent: 12 January 2022 19:18
> To: Roger Sayle 
> Cc: GCC Patches 
> Subject: Re: [PATCH] x86_64: Improvements to arithmetic right shifts of
> V1TImode values.
> 
> On Tue, Jan 11, 2022 at 2:26 PM Roger Sayle 
> wrote:
> >
> >
> > This patch to the i386 backend's ix86_expand_v1ti_ashiftrt provides
> > improved (shorter) implementations of V1TI mode arithmetic right
> > shifts for constant amounts between 111 and 126 bits.  The
> > significance of this range is that this functionality is useful for
> > (eventually) providing sign extension from HImode and QImode to V1TImode.
> >
> > For example, x>>112 (to sign extend a 16-bit value), was previously
> > generated as a four operation sequence:
> >
> > movdqa  %xmm0, %xmm1// word7 6 5 4 3 2 1 0
> > psrad   $31, %xmm0  // V8HI = [S,S,?,?,?,?,?,?]
> > psrad   $16, %xmm1  // V8HI = [S,X,?,?,?,?,?,?]
> > punpckhqdq  %xmm0, %xmm1// V8HI = [S,S,?,?,S,X,?,?]
> > pshufd  $253, %xmm1, %xmm0  // V8HI = [S,S,S,S,S,S,S,X]
> >
> > with this patch, we now generates a three operation sequence:
> >
> > psrad   $16, %xmm0  // V8HI = [S,X,?,?,?,?,?,?]
> > pshufhw $254, %xmm0, %xmm0  // V8HI = [S,S,S,X,?,?,?,?]
> > pshufd  $254, %xmm0, %xmm0  // V8HI = [S,S,S,S,S,S,S,X]
> >
> > The correctness of generated code is confirmed by the existing
> > run-time test gcc.target/i386/sse2-v1ti-ashiftrt-1.c in the testsuite.
> > This idiom is safe to use for shifts by 127, but that case gets
> > handled by a two operation sequence earlier in this function.
> >
> >
> > This patch has been tested on x86_64-pc-linux-gnu with a make
> > bootstrap and make -k check with no new failures.  OK for mainline?
> >
> >
> > 2022-01-11  Roger Sayle  
> >
> > gcc/ChangeLog
> > * config/i386/i386-expand.c (ix86_expand_v1ti_ashiftrt): Provide
> > new three operation implementations for shifts by 111..126 bits.
> 
> +  if (bits >= 111)
> +{
> +  /* Three operations.  */
> +  rtx tmp1 = gen_reg_rtx (V4SImode);
> +  rtx tmp2 = gen_reg_rtx (V4SImode);
> +  emit_move_insn (tmp1, gen_lowpart (V4SImode, op1));
> +  emit_insn (gen_ashrv4si3 (tmp2, tmp1, GEN_INT (bits - 96)));
> 
> This can be written as:
> 
> rtx tmp1 = force_reg (V4SImode, gen_lowpart (V4SImode, op1)); emit_insn
> (gen_ashrv4i3 (tmp2, tmp1, GEN_INT ...));
> 
> +  rtx tmp3 = gen_reg_rtx (V8HImode);
> +  rtx tmp4 = gen_reg_rtx (V8HImode);
> +  emit_move_insn (tmp3, gen_lowpart (V8HImode, tmp2));
> +  emit_insn (gen_sse2_pshufhw (tmp4, tmp3, GEN_INT (0xfe)));
> 
> Here in a similar way...
> 
> +  rtx tmp5 = gen_reg_rtx (V4SImode);
> +  rtx tmp6 = gen_reg_rtx (V4SImode);
> +  emit_move_insn (tmp5, gen_lowpart (V4SImode, tmp4));
> +  emit_insn (gen_sse2_pshufd (tmp6, tmp5, GEN_INT (0xfe)));
> 
> ... also here.
> 
> +  rtx tmp7 = gen_reg_rtx (V1TImode);
> +  emit_move_insn (tmp7, gen_lowpart (V1TImode, tmp6));
> +  emit_move_insn (operands[0], tmp7);
> 
> And here a simple:
> 
> emit_move_insn (operands[0], gen_lowpart (V1TImode, tmp6);
> 
> +  return;
> +}
> +
> 
> Uros.
diff --git a/gcc/config/i386/i386-expand.c b/gcc/config/i386/i386-expand.c
index 0318f12..c740d6e 100644
--- a/gcc/config/i386/i386-expand.c
+++ b/gcc/config/i386/i386-expand.c
@@ -6211,8 +6211,7 @@ ix86_expand_v1ti_to_ti (rtx x)
   rtx result = gen_reg_rtx (TImode);
   if (TARGET_SSE2)
 {
-  rtx temp = gen_reg_rtx (V2DImode);
-  emit_move_insn (temp, gen_lowpart (V2DImode, x));
+  rtx temp = force_reg (V2DImode, gen_lowpart (V2DImode, x));
   rtx lo = gen_lowpart (DImode, result);
   emit_insn (gen_vec_extractv2didi (lo, temp, const0_rtx));
   rtx hi = gen_highpart (DImode, result);
@@ -6227,18 +6226,16 @@ ix86_expand_v1ti_to_ti (rtx x)
 static rtx
 ix86_expand_ti_to_v1ti (rtx x)
 {
-  rtx result = gen_reg_rtx (V1TImode);
   if (TARGET_SSE2)
 {
   rtx lo = gen_lowpart (DImode, x);
   rtx hi = gen_highpart (DImode,

Re: [PATCH] [gfortran] Add support for allocate clause (OpenMP 5.0).

2022-01-14 Thread Thomas Schwinge
Hi Abid!

(Remember to CC  for 'gcc/fortran/' etc. changes.)


On 2022-01-11T22:31:54+, Hafiz Abid Qadeer  wrote:
> --- /dev/null
> +++ b/gcc/testsuite/gfortran.dg/gomp/allocate-2.f90
> @@ -0,0 +1,45 @@
> +! { dg-do compile }
> +
> +module omp_lib_kinds
> +  use iso_c_binding, only: c_int, c_intptr_t
> +  implicit none
> +  private :: c_int, c_intptr_t
> +  integer, parameter :: omp_allocator_handle_kind = c_intptr_t
> +
> +end module
> +
> +subroutine foo(x)
> +  use omp_lib_kinds
> +  implicit none
> +  integer  :: x
> +
> +  !$omp task allocate (x) ! { dg-error "'x' specified in 'allocate' clause 
> at .1. but not in an explicit privatization clause" }
> +  x=1
> +  !$omp end task
> +
> +  !$omp parallel allocate (x) ! { dg-error "'x' specified in 'allocate' 
> clause at .1. but not in an explicit privatization clause" }
> +  x=2
> +  !$omp end parallel
> +
> +  !$omp parallel allocate (x) shared (x) ! { dg-error "'x' specified in 
> 'allocate' clause at .1. but not in an explicit privatization clause" }
> +  x=3
> +  !$omp end parallel
> +
> +  !$omp parallel private (x) allocate (x) allocate (x) ! { dg-warning "'x' 
> appears more than once in 'allocate' clauses at .1." }
> +  x=4
> +  !$omp end parallel
> +
> +  !$omp parallel private (x) allocate (x, x) ! { dg-warning "'x' appears 
> more than once in 'allocate' clauses at .1." }
> +  x=5
> +  !$omp end parallel
> +
> +  !$omp parallel allocate (0: x) private(x) ! { dg-error "Expected integer 
> expression of the 'omp_allocator_handle_kind' kind at .1." }

We do for x86_64 default '-m64', but for '-m32' and '-mx32' compilation,
we're not seeing this latter diagnostic:

PASS: gfortran.dg/gomp/allocate-1.f90   -O  (test for excess errors)
PASS: gfortran.dg/gomp/allocate-2.f90   -O   (test for errors, line 16)
PASS: gfortran.dg/gomp/allocate-2.f90   -O   (test for errors, line 20)
PASS: gfortran.dg/gomp/allocate-2.f90   -O   (test for errors, line 24)
FAIL: gfortran.dg/gomp/allocate-2.f90   -O   (test for errors, line 36)
PASS: gfortran.dg/gomp/allocate-2.f90   -O   (test for errors, line 40)
PASS: gfortran.dg/gomp/allocate-2.f90   -O   (test for warnings, line 28)
PASS: gfortran.dg/gomp/allocate-2.f90   -O   (test for warnings, line 32)
PASS: gfortran.dg/gomp/allocate-2.f90   -O  (test for excess errors)

I suppose the reason is unintended congruence of data types?  Would it
work to make 'x' a floating-point data type, for example -- or is this
meant to explicitly check certain integer data type characteristics?


Grüße
 Thomas


> +  x=6
> +  !$omp end parallel
> +
> +  !$omp parallel private (x) allocate (0.1 : x) ! { dg-error "Expected 
> integer expression of the 'omp_allocator_handle_kind' kind at .1." }
> +  x=7
> +  !$omp end parallel
> +
> +end subroutine
-
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 
München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas 
Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht 
München, HRB 106955


Re: [PATCH] x86_64: Improvements to arithmetic right shifts of V1TImode values.

2022-01-14 Thread Uros Bizjak via Gcc-patches
On Fri, Jan 14, 2022 at 10:00 AM Roger Sayle  wrote:
>
>
> Hi Uros,
> Here's a revised version of this patch incorporating your suggestion of using
> force_reg instead of emit_move_insn to a pseudo allocated by gen_reg_rtx.
> I also took the opportunity to transition the rest of the function (and 
> clean-up
> those around it) to use this preferred idiom.
>
> This patch has been tested on x86_64-pc-linux-gnu with a make bootstrap
> and make -k check with no new failures.  OK for mainline?
>
>
> 2022-01-14  Roger Sayle  
> Uroš Bizjak  
>
> gcc/ChangeLog
> * config/i386/i386-expand.c (ix86_expand_v1ti_to_ti): Use force_reg.
> (ix86_expand_ti_to_v1ti): Use force_reg.
> (ix86_expand_v1ti_shift): Use force_reg.
> (ix86_expand_v1ti_rotate): Use force_reg.
> (ix86_expand_v1ti_ashiftrt): Provide new three operation
> implementations for shifts by 111..126 bits.  Use force_reg.

LGTM, as far as I can review the patch due to code churn...

Thanks,
Uros.

>
> Thanks again,
> Roger
> --
>
> > -Original Message-
> > From: Uros Bizjak 
> > Sent: 12 January 2022 19:18
> > To: Roger Sayle 
> > Cc: GCC Patches 
> > Subject: Re: [PATCH] x86_64: Improvements to arithmetic right shifts of
> > V1TImode values.
> >
> > On Tue, Jan 11, 2022 at 2:26 PM Roger Sayle 
> > wrote:
> > >
> > >
> > > This patch to the i386 backend's ix86_expand_v1ti_ashiftrt provides
> > > improved (shorter) implementations of V1TI mode arithmetic right
> > > shifts for constant amounts between 111 and 126 bits.  The
> > > significance of this range is that this functionality is useful for
> > > (eventually) providing sign extension from HImode and QImode to V1TImode.
> > >
> > > For example, x>>112 (to sign extend a 16-bit value), was previously
> > > generated as a four operation sequence:
> > >
> > > movdqa  %xmm0, %xmm1// word7 6 5 4 3 2 1 0
> > > psrad   $31, %xmm0  // V8HI = [S,S,?,?,?,?,?,?]
> > > psrad   $16, %xmm1  // V8HI = [S,X,?,?,?,?,?,?]
> > > punpckhqdq  %xmm0, %xmm1// V8HI = [S,S,?,?,S,X,?,?]
> > > pshufd  $253, %xmm1, %xmm0  // V8HI = [S,S,S,S,S,S,S,X]
> > >
> > > with this patch, we now generates a three operation sequence:
> > >
> > > psrad   $16, %xmm0  // V8HI = [S,X,?,?,?,?,?,?]
> > > pshufhw $254, %xmm0, %xmm0  // V8HI = [S,S,S,X,?,?,?,?]
> > > pshufd  $254, %xmm0, %xmm0  // V8HI = [S,S,S,S,S,S,S,X]
> > >
> > > The correctness of generated code is confirmed by the existing
> > > run-time test gcc.target/i386/sse2-v1ti-ashiftrt-1.c in the testsuite.
> > > This idiom is safe to use for shifts by 127, but that case gets
> > > handled by a two operation sequence earlier in this function.
> > >
> > >
> > > This patch has been tested on x86_64-pc-linux-gnu with a make
> > > bootstrap and make -k check with no new failures.  OK for mainline?
> > >
> > >
> > > 2022-01-11  Roger Sayle  
> > >
> > > gcc/ChangeLog
> > > * config/i386/i386-expand.c (ix86_expand_v1ti_ashiftrt): Provide
> > > new three operation implementations for shifts by 111..126 bits.
> >
> > +  if (bits >= 111)
> > +{
> > +  /* Three operations.  */
> > +  rtx tmp1 = gen_reg_rtx (V4SImode);
> > +  rtx tmp2 = gen_reg_rtx (V4SImode);
> > +  emit_move_insn (tmp1, gen_lowpart (V4SImode, op1));
> > +  emit_insn (gen_ashrv4si3 (tmp2, tmp1, GEN_INT (bits - 96)));
> >
> > This can be written as:
> >
> > rtx tmp1 = force_reg (V4SImode, gen_lowpart (V4SImode, op1)); emit_insn
> > (gen_ashrv4i3 (tmp2, tmp1, GEN_INT ...));
> >
> > +  rtx tmp3 = gen_reg_rtx (V8HImode);
> > +  rtx tmp4 = gen_reg_rtx (V8HImode);
> > +  emit_move_insn (tmp3, gen_lowpart (V8HImode, tmp2));
> > +  emit_insn (gen_sse2_pshufhw (tmp4, tmp3, GEN_INT (0xfe)));
> >
> > Here in a similar way...
> >
> > +  rtx tmp5 = gen_reg_rtx (V4SImode);
> > +  rtx tmp6 = gen_reg_rtx (V4SImode);
> > +  emit_move_insn (tmp5, gen_lowpart (V4SImode, tmp4));
> > +  emit_insn (gen_sse2_pshufd (tmp6, tmp5, GEN_INT (0xfe)));
> >
> > ... also here.
> >
> > +  rtx tmp7 = gen_reg_rtx (V1TImode);
> > +  emit_move_insn (tmp7, gen_lowpart (V1TImode, tmp6));
> > +  emit_move_insn (operands[0], tmp7);
> >
> > And here a simple:
> >
> > emit_move_insn (operands[0], gen_lowpart (V1TImode, tmp6);
> >
> > +  return;
> > +}
> > +
> >
> > Uros.


Re: [PATCH] libcpp: Implement -Wbidi-chars for CVE-2021-42574 [PR103026]

2022-01-14 Thread Stephan Bergmann via Gcc-patches

On 30/11/2021 16:27, Marek Polacek wrote:

I guess we were concerned with programs that generate other programs.
Maybe UCNs should be ignored by default.  There's still time to adjust
the behavior.


Is there any update on this?  Shall I file a bug?  As-is, -Wbidi-chars 
is unusable for building LibreOffice and (esp. in combination with 
 "C++ preprocessor 
ignores #pragma GCC diagnostic") has to be explicitly disabled globally.




[PATCH] Start using check-MAINTAINERS.py instead of legacy maintainers-verify.sh.

2022-01-14 Thread Martin Liška

Let's use the new Python script where I added new
function check_effective_target_python3 that can be used
for python3 detection.

Ready to be installed?
Thanks,
Martin

contrib/ChangeLog:

* maintainers-verify.sh: Removed.

gcc/testsuite/ChangeLog:

* gcc.src/maintainers.exp: Start using check-MAINTAINERS.py.
* lib/target-supports.exp: Add check_effective_target_python3.
---
 contrib/maintainers-verify.sh | 45 ---
 gcc/testsuite/gcc.src/maintainers.exp | 18 ++-
 gcc/testsuite/lib/target-supports.exp | 12 +++
 3 files changed, 23 insertions(+), 52 deletions(-)
 delete mode 100755 contrib/maintainers-verify.sh

diff --git a/contrib/maintainers-verify.sh b/contrib/maintainers-verify.sh
deleted file mode 100755
index 226c158fdaa..000
--- a/contrib/maintainers-verify.sh
+++ /dev/null
@@ -1,45 +0,0 @@
-#!/bin/sh
-
-# Copyright (C) 2018 Free Software Foundation, Inc.
-#
-# This file is part of GCC.
-#
-# GCC is free software; you can redistribute it and/or modify
-# it under the terms of the GNU General Public License as published by
-# the Free Software Foundation; either version 3, or (at your option)
-# any later version.
-#
-# GCC is distributed in the hope that it will be useful,
-# but WITHOUT ANY WARRANTY; without even the implied warranty of
-# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
-# GNU General Public License for more details.
-#
-# You should have received a copy of the GNU General Public License
-# along with GCC; see the file COPYING.  If not, write to
-# the Free Software Foundation, 51 Franklin Street, Fifth Floor,
-# Boston, MA 02110-1301, USA.
-
-if [ "$1" != "" ]; then
-f="$1"
-else
-f=./MAINTAINERS
-fi
-
-grep @ $f \
-| sed 's/[\t][\t]*/\t/g' \
-| awk -F '\t' \
- "
-{
-  if (NF == 2) {
-name=\$1
-email=\$2
-if (names[name] == 1) {
-printf \"Redundant in write approval: %s\n\", name
-}
-  } else if (NF == 3 ) {
-name=\$2
-email=\$3
-names[name] = 1
-  }
-}
-"
diff --git a/gcc/testsuite/gcc.src/maintainers.exp 
b/gcc/testsuite/gcc.src/maintainers.exp
index 9ce84f76ac1..63361dd5e4b 100644
--- a/gcc/testsuite/gcc.src/maintainers.exp
+++ b/gcc/testsuite/gcc.src/maintainers.exp
@@ -14,22 +14,27 @@
 # along with GCC; see the file COPYING3.  If not see
 # .
 
-load_lib "gcc-defs.exp"

+load_lib "target-supports.exp"
 
 proc gcc_src_run_maintainers_verify_sh {} {

-set script maintainers-verify.sh
+set script check-MAINTAINERS.py
 
 global srcdir

 set rootdir $srcdir/../..
 set contrib $rootdir/contrib
-
 set maintainers $rootdir/MAINTAINERS
 
-set verify_output [exec $contrib/$script $maintainers]

-if { "$verify_output"  == "" } {
+if { ![check_effective_target_python3] } {
+  unsupported "$script python3 is missing"
+  return
+}
+
+set result [remote_exec host $contrib/$script $maintainers]
+set status [lindex $result 0]
+if { $status == 0 } then {
pass "$script"
 } else {
-   send_log "$verify_output\n"
+   send_log "$result\n"
fail "$script"
 }
 }
@@ -40,4 +45,3 @@ if ![gcc_parallel_test_run_p maintainers] {
 gcc_parallel_test_enable 0
 gcc_src_run_maintainers_verify_sh
 gcc_parallel_test_enable 1
-
diff --git a/gcc/testsuite/lib/target-supports.exp 
b/gcc/testsuite/lib/target-supports.exp
index c1ad97c6bd2..12c5ca5da33 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -11722,3 +11722,15 @@ proc check_effective_target___OPTIMIZE__ {} {
#endif
 } [current_compiler_flags]]
 }
+
+# Return 1 if python3 is available.
+
+proc check_effective_target_python3 { } {
+set result [remote_exec host "python3 --version"]
+set status [lindex $result 0]
+if { $status == 0 } then {
+return 1;
+} else {
+return 0;
+}
+}
--
2.34.1



Re: [vect] PR103997: Fix epilogue mode skipping

2022-01-14 Thread Andre Vieira (lists) via Gcc-patches



On 14/01/2022 07:08, Richard Biener wrote:

On Thu, 13 Jan 2022, Andre Vieira (lists) wrote:


On 13/01/2022 14:25, Richard Biener wrote:

On Thu, 13 Jan 2022, Andre Vieira (lists) wrote:


On 13/01/2022 12:36, Richard Biener wrote:

On Thu, 13 Jan 2022, Andre Vieira (lists) wrote:


This time to the list too (sorry for double email)

Hi,

The original patch '[vect] Re-analyze all modes for epilogues', skipped
modes
that should not be skipped since it used the vector mode provided by
autovectorize_vector_modes to derive the minimum VF required for it.
However,
those modes should only really be used to dictate vector size, so instead
this
patch looks for the mode in 'used_vector_modes' with the largest element
size,
and constructs a vector mode with the smae size as the current
vector_modes[mode_i]. Since we are using the largest element size the
NUNITs
for this mode is the smallest possible VF required for an epilogue with
this
mode and should thus skip only the modes we are certain can not be used.

Passes bootstrap and regression on x86_64 and aarch64.

Clearly

+ /* To make sure we are conservative as to what modes we skip, we
+should use check the smallest possible NUNITS which would be
+derived from the mode in USED_VECTOR_MODES with the largest
+element size.  */
+ scalar_mode max_elsize_mode = GET_MODE_INNER
(vector_modes[mode_i]);
+ for (vec_info::mode_set::iterator i =
+   first_loop_vinfo->used_vector_modes.begin ();
+ i != first_loop_vinfo->used_vector_modes.end (); ++i)
+   {
+ if (VECTOR_MODE_P (*i)
+ && GET_MODE_SIZE (GET_MODE_INNER (*i))
+ > GET_MODE_SIZE (max_elsize_mode))
+   max_elsize_mode = GET_MODE_INNER (*i);
+   }

can be done once before iterating over the modes for the epilogue.

True, I'll start with QImode instead of the inner of vector_modes[mode_i]
too
since we can't guarantee the mode is a VECTOR_MODE_P and it is actually
better
too since we can't possible guarantee the element size of the
USED_VECTOR_MODES is smaller than that of the first vector mode...


Richard maybe knows whether we should take care to look at the
size of the vector mode as well since related_vector_mode when
passed 0 as nunits produces a vector mode with the same size
as vector_modes[mode_i] but not all used_vector_modes may be
of the same size

I suspect that should be fine though, since if we use the largest element
size
of all used_vector_modes then that should gives us the least possible
number
of NUNITS and thus only conservatively skip. That said, that does assume
that
no vector mode used may be larger than the size of the loop's vector_mode.
Can
I assume that?

No idea, but I would lean towards a no ;)  I think the loops vector_mode
doesn't have to match vector_modes[mode_i] either, does it?  At least
autodetected_vector_mode will be not QImode based.

The mode doesn't but both vector modes have to be the same vector size surely,
I'm not referring to the element size here.
What I was trying to ask was whether all vector modes in used_vector_modes had
the same vector size as the loops vector mode (and the vector_modes[mode_i] it
originated from).

Definitely not I think.
Hmmm I'm still struggling to understand what we use that initial 
vector_mode for then. I thought it was a combination of limiting vector 
size (By that I mean NUNITS * element size) and ISA choice.
If we can use vector modes within a loop with a size different from the 
initial one, is it at least a guarantee that the input vector mode's 
size is an upper bound to the sizes of the modes in used_vector_modes?

(and you probably also want to exclude
VECTOR_BOOLEAN_TYPE_P from the search?)

Yeah I think so too, thanks!

I keep going back to thinking (as I brought up in the bugzilla ticket),
maybe
we ought to only skip if the NUNITS of the vector mode with the same vector
size as vector_modes[mode_i] is larger than first_info_vf, or just don't
skip
at all...

The question is how much work we do before realizing the chosen mode
cannot be used because there's not enough iterations?  Maybe we can
improve there easily?

IIUC the VF can change depending on whether we decide to use SLP, so really we
can only check if after we have determined whether or not to use SLP, so
either:
* When SLP fully succeeds, so somewhere between the last 'goto again;' and
return success, but there is very little left to do there
* When SLP fails: here we could save on some work.

Hmm, yeah.  Guess it's quite expensive then in the end so worth to
avoid doing useless stuff.  I do wonder whether we could cache
analysis fails (and VFs in case of success but worse cost) of the
main loop analysis.
Hmmm, a quick look doesn't show any cases where the main loop may fail 
where epilogue may succeed (other than costing). So this could save on 
some analysis. Though it is potentially a band-aid if we end up having 

[PATCH] nvptx: Add support for 64-bit mul.hi (and other) instructions.

2022-01-14 Thread Roger Sayle

Now that the middle-end MULT_HIGHPART_EXPR pieces are in place, this
patch adds support for nvptx's mul.hi.s64 and mul.hi.u64 instructions,
as previously reviewed (provisionally pre-approved) back in August 2020:
https://gcc.gnu.org/pipermail/gcc-patches/2020-August/551373.html
Since then a few things have changed, so this patch uses the new
SMUL_HIGHPART and UMUL_HIGHPART RTX expressions, but the test cases
remain the same.  Like the x86_64 backend, this patch retains the
"trunc" forms of these instructions (while the RTL optimizers/combine
may still generate them).

Given that we're rapidly approaching stage 4, I also took the liberty
of including support in nvptx.md for a few other instructions.  With
the new 64-bit highpart multiplication instructions added above, we
can now provide a define_expand for efficient 64-bit (to 128-bit)
widening multiplications.  This patch also adds support for nvptx's
testp.infinite instruction (for implementing __builtin_isinf) and
the not.pred instruction.

As an example of the code generation improvements, the function
int foo(double x) { return __builtin_isinf(x); }
previously generated with -O2:

mov.f64 %r26, %ar0;
abs.f64 %r28, %r26;
setp.leu.f64%r31, %r28, 0d7fef;
selp.u32%r30, 1, 0, %r31;
mov.u32 %r29, %r30;
cvt.u16.u8  %r35, %r29;
mov.u16 %r33, %r35;
xor.b16 %r32, %r33, 1;
cvt.u32.u16 %r34, %r32;
cvt.u32.u8  %value, %r34;

and with this patch now generates:

mov.f64 %r23, %ar0;
testp.infinite.f64  %r24, %r23;
selp.u32%value, 1, 0, %r24;

This patch has been tested on nvptx-none hosted on x86_64-pc-linux-gnu
(including newlib) with a make and make -k check with no new failures.
Ok for mainline?


2022-01-14  Roger Sayle  

gcc/ChangeLog
* config/nvptx/nvptx.md (UNSPEC_ISINF): New UNSPEC.
(one_cmplbi2): New define_insn for not.pred.
(mulditi3): New define_expand for signed widening multiply.
(umulditi3): New define_expand for unsigned widening multiply.
(smul3_highpart): New define_insn for signed highpart mult.
(umul3_highpart): New define_insn for unsigned highpart mult.
(*smulhi3_highpart_2): Renamed from smulhi3_highpart.
(*smulsi3_highpart_2): Renamed from smulsi3_highpart.
(*umulhi3_highpart_2): Renamed from umulhi3_highpart.
(*umulsi3_highpart_2): Renamed from umulsi3_highpart.
(*setcc_from_not_bi): New define_insn.
(*setcc_isinf): New define_insn for testp.infinite.
(isinf2): New define_expand.

gcc/testsuite/ChangeLog
* gcc.target/nvptx/mul-hi64.c: New test case.
* gcc.target/nvptx/umul-hi64.c: New test case.
* gcc.target/nvptx/mul-wide64.c: New test case.
* gcc.target/nvptx/umul-wide64.c: New test case.
* gcc.target/nvptx/isinf.c: New test case.


Thanks in advance,
Roger
--

diff --git a/gcc/config/nvptx/nvptx.md b/gcc/config/nvptx/nvptx.md
index ce74672..a36ea59 100644
--- a/gcc/config/nvptx/nvptx.md
+++ b/gcc/config/nvptx/nvptx.md
@@ -27,6 +27,7 @@
UNSPEC_SIN
UNSPEC_COS
UNSPEC_TANH
+   UNSPEC_ISINF
 
UNSPEC_FPINT_FLOOR
UNSPEC_FPINT_BTRUNC
@@ -592,6 +593,12 @@
   ""
   "%.\\tnot.b%T0\\t%0, %1;")
 
+(define_insn "one_cmplbi2"
+  [(set (match_operand:BI 0 "nvptx_register_operand" "=R")
+   (not:BI (match_operand:BI 1 "nvptx_register_operand" "R")))]
+  ""
+  "%.\\tnot.pred\\t%0, %1;")
+
 (define_insn "*cnot2"
   [(set (match_operand:HSDIM 0 "nvptx_register_operand" "=R")
(eq:HSDIM (match_operand:HSDIM 1 "nvptx_register_operand" "R")
@@ -667,7 +674,57 @@
   ""
   "%.\\tmul.wide.u32\\t%0, %1, %2;")
 
-(define_insn "smulhi3_highpart"
+(define_expand "mulditi3"
+  [(set (match_operand:TI 0 "nvptx_register_operand")
+   (mult:TI (sign_extend:TI
+ (match_operand:DI 1 "nvptx_register_operand"))
+(sign_extend:DI
+ (match_operand:DI 2 "nvptx_nonmemory_operand"]
+  ""
+{
+  rtx hi = gen_reg_rtx (DImode);
+  rtx lo = gen_reg_rtx (DImode);
+  emit_insn (gen_smuldi3_highpart (hi, operands[1], operands[2]));
+  emit_insn (gen_muldi3 (lo, operands[1], operands[2]));
+  emit_move_insn (gen_highpart (DImode, operands[0]), hi);
+  emit_move_insn (gen_lowpart (DImode, operands[0]), lo);
+  DONE;
+})
+
+(define_expand "umulditi3"
+  [(set (match_operand:TI 0 "nvptx_register_operand")
+   (mult:TI (zero_extend:TI
+ (match_operand:DI 1 "nvptx_register_operand"))
+(zero_extend:DI
+ (match_operand:DI 2 "nvptx_nonmemory_operand"]
+  ""
+{
+  rtx hi = gen_reg_rtx (DImode);
+  rtx lo = gen_reg_rtx (DImode);
+  emit_insn (gen_umuldi3_highpart (hi, operands[1], operands[2]));
+  emit_insn (gen_muldi3 (lo, operands[1], operands[2]));
+  emit_move_in

Re: [vect] PR103997: Fix epilogue mode skipping

2022-01-14 Thread Richard Biener via Gcc-patches
On Fri, 14 Jan 2022, Andre Vieira (lists) wrote:

> 
> On 14/01/2022 07:08, Richard Biener wrote:
> > On Thu, 13 Jan 2022, Andre Vieira (lists) wrote:
> >
> >> On 13/01/2022 14:25, Richard Biener wrote:
> >>> On Thu, 13 Jan 2022, Andre Vieira (lists) wrote:
> >>>
>  On 13/01/2022 12:36, Richard Biener wrote:
> > On Thu, 13 Jan 2022, Andre Vieira (lists) wrote:
> >
> >> This time to the list too (sorry for double email)
> >>
> >> Hi,
> >>
> >> The original patch '[vect] Re-analyze all modes for epilogues', skipped
> >> modes
> >> that should not be skipped since it used the vector mode provided by
> >> autovectorize_vector_modes to derive the minimum VF required for it.
> >> However,
> >> those modes should only really be used to dictate vector size, so
> >> instead
> >> this
> >> patch looks for the mode in 'used_vector_modes' with the largest
> >> element
> >> size,
> >> and constructs a vector mode with the smae size as the current
> >> vector_modes[mode_i]. Since we are using the largest element size the
> >> NUNITs
> >> for this mode is the smallest possible VF required for an epilogue with
> >> this
> >> mode and should thus skip only the modes we are certain can not be
> >> used.
> >>
> >> Passes bootstrap and regression on x86_64 and aarch64.
> > Clearly
> >
> > + /* To make sure we are conservative as to what modes we skip,
> > we
> > +should use check the smallest possible NUNITS which would
> > be
> > +derived from the mode in USED_VECTOR_MODES with the largest
> > +element size.  */
> > + scalar_mode max_elsize_mode = GET_MODE_INNER
> > (vector_modes[mode_i]);
> > + for (vec_info::mode_set::iterator i =
> > +   first_loop_vinfo->used_vector_modes.begin ();
> > + i != first_loop_vinfo->used_vector_modes.end (); ++i)
> > +   {
> > + if (VECTOR_MODE_P (*i)
> > + && GET_MODE_SIZE (GET_MODE_INNER (*i))
> > + > GET_MODE_SIZE (max_elsize_mode))
> > +   max_elsize_mode = GET_MODE_INNER (*i);
> > +   }
> >
> > can be done once before iterating over the modes for the epilogue.
>  True, I'll start with QImode instead of the inner of vector_modes[mode_i]
>  too
>  since we can't guarantee the mode is a VECTOR_MODE_P and it is actually
>  better
>  too since we can't possible guarantee the element size of the
>  USED_VECTOR_MODES is smaller than that of the first vector mode...
> 
> > Richard maybe knows whether we should take care to look at the
> > size of the vector mode as well since related_vector_mode when
> > passed 0 as nunits produces a vector mode with the same size
> > as vector_modes[mode_i] but not all used_vector_modes may be
> > of the same size
>  I suspect that should be fine though, since if we use the largest element
>  size
>  of all used_vector_modes then that should gives us the least possible
>  number
>  of NUNITS and thus only conservatively skip. That said, that does assume
>  that
>  no vector mode used may be larger than the size of the loop's
>  vector_mode.
>  Can
>  I assume that?
> >>> No idea, but I would lean towards a no ;)  I think the loops vector_mode
> >>> doesn't have to match vector_modes[mode_i] either, does it?  At least
> >>> autodetected_vector_mode will be not QImode based.
> >> The mode doesn't but both vector modes have to be the same vector size
> >> surely,
> >> I'm not referring to the element size here.
> >> What I was trying to ask was whether all vector modes in used_vector_modes
> >> had
> >> the same vector size as the loops vector mode (and the vector_modes[mode_i]
> >> it
> >> originated from).
> > Definitely not I think.
> Hmmm I'm still struggling to understand what we use that initial vector_mode
> for then. I thought it was a combination of limiting vector size (By that I
> mean NUNITS * element size) and ISA choice.
> If we can use vector modes within a loop with a size different from the
> initial one, is it at least a guarantee that the input vector mode's size is
> an upper bound to the sizes of the modes in used_vector_modes?

I don't think so - via related_vector_mode we can get to both smaller
and larger vector types (if they are supported).  The input vector mode
really just determines what we use as first guess.  But maybe for
loop vectorization that's really all - Richard should know more here.
But since this is all to be used as heuristic there's no need to be
precise (and I think it's impossible to be).

> > (and you probably also want to exclude
> > VECTOR_BOOLEAN_TYPE_P from the search?)
>  Yeah I think so too, thanks!
> 
>  I keep going back to thinking (as I brought up in the bugzilla

RE: [AArch32]: correct dot-product RTL patterns.

2022-01-14 Thread Kyrylo Tkachov via Gcc-patches
Hi Tamar,

Sorry for the delay.

> -Original Message-
> From: Tamar Christina 
> Sent: Tuesday, January 11, 2022 7:10 AM
> To: gcc-patches@gcc.gnu.org
> Cc: nd ; Ramana Radhakrishnan
> ; Richard Earnshaw
> ; ni...@redhat.com; Kyrylo Tkachov
> 
> Subject: RE: [AArch32]: correct dot-product RTL patterns.
> 
> ping
> 
> > -Original Message-
> > From: Tamar Christina
> > Sent: Tuesday, December 21, 2021 12:31 PM
> > To: gcc-patches@gcc.gnu.org
> > Cc: nd ; Ramana Radhakrishnan
> > ; Richard Earnshaw
> > ; ni...@redhat.com; Kyrylo Tkachov
> > 
> > Subject: [AArch32]: correct dot-product RTL patterns.
> >
> > Hi All,
> >
> > The previous fix for this problem was wrong due to a subtle difference
> > between where NEON expects the RMW values and where intrinsics
> expects
> > them.
> >
> > The insn pattern is modeled after the intrinsics and so needs an expand for
> > the vectorizer optab to switch the RTL.
> >
> > However operand[3] is not expected to be written to so the current pattern
> > is bogus.
> >
> > Instead we use the expand to shuffle around the RTL.
> >
> > The vectorizer expects operands[3] and operands[0] to be the same but the
> > aarch64 intrinsics expanders expect operands[0] and operands[1] to be the
> > same.
> >
> > This also fixes some issues with big-endian, each dot product performs 4 8-
> > byte multiplications.  However compared to AArch64 we don't enter lanes
> in
> > GCC lane indexed in AArch32 aside from loads/stores.  This means no lane
> > remappings are done in arm-builtins.c and so none should be done at the
> > instruction side.
> >
> > There are some other instructions that need inspections as I think there are
> > more incorrect ones.
> >
> > Third there was a bug in the ACLE specication for dot product which has
> now
> > been fixed[1].  This means some intrinsics were missing and are added by
> > this patch.
> >
> > Bootstrapped and regtested on arm-none-linux-gnueabihf and no issues.
> >
> > Ok for master? and active branches after some stew?
> >
> > [1] https://github.com/ARM-software/acle/releases/tag/r2021Q3
> >
> > Thanks,
> > Tamar
> >
> > gcc/ChangeLog:
> >
> > * config/arm/arm_neon.h (vdot_laneq_u32, vdotq_laneq_u32,
> > vdot_laneq_s32, vdotq_laneq_s32): New.
> > * config/arm/arm_neon_builtins.def (sdot_laneq, udot_laneq: New.
> > * config/arm/neon.md (neon_dot): New.
> > (dot_prod): Re-order rtl.
> > (neon_dot_lane): Fix rtl order and endiannes.
> > (neon_dot_laneq): New.
> >
> > gcc/testsuite/ChangeLog:
> >
> > * gcc.target/arm/simd/vdot-compile.c: Add new cases.
> > * gcc.target/arm/simd/vdot-exec.c: Likewise.
> >

Ok but...

> > --- inline copy of patch --
> > diff --git a/gcc/config/arm/arm_neon.h b/gcc/config/arm/arm_neon.h
> index
> >
> 3364b37f69dfc33082388246c03149d9ad66a634..af6ac63dc3b47830d92f199d
> 93
> > 153ff510f658e9 100644
> > --- a/gcc/config/arm/arm_neon.h
> > +++ b/gcc/config/arm/arm_neon.h
> > @@ -18243,6 +18243,35 @@ vdotq_lane_s32 (int32x4_t __r, int8x16_t
> __a,
> > int8x8_t __b, const int __index)
> >return __builtin_neon_sdot_lanev16qi (__r, __a, __b, __index);  }
> >
> > +__extension__ extern __inline uint32x2_t __attribute__
> > +((__always_inline__, __gnu_inline__, __artificial__))
> > +vdot_laneq_u32 (uint32x2_t __r, uint8x8_t __a, uint8x16_t __b, const
> > +int __index) {
> > +  return __builtin_neon_udot_laneqv8qi_s (__r, __a, __b, __index);
> > +}
> > +
> > +__extension__ extern __inline uint32x4_t __attribute__
> > +((__always_inline__, __gnu_inline__, __artificial__))
> > +vdotq_laneq_u32 (uint32x4_t __r, uint8x16_t __a, uint8x16_t __b,
> > +   const int __index)
> > +{
> > +  return __builtin_neon_udot_laneqv16qi_s (__r, __a, __b, __index);
> > +}
> > +
> > +__extension__ extern __inline int32x2_t __attribute__
> > +((__always_inline__, __gnu_inline__, __artificial__))
> > +vdot_laneq_s32 (int32x2_t __r, int8x8_t __a, int8x16_t __b, const int
> > +__index) {
> > +  return __builtin_neon_sdot_laneqv8qi (__r, __a, __b, __index); }
> > +
> > +__extension__ extern __inline int32x4_t __attribute__
> > +((__always_inline__, __gnu_inline__, __artificial__))
> > +vdotq_laneq_s32 (int32x4_t __r, int8x16_t __a, int8x16_t __b, const int
> > +__index) {
> > +  return __builtin_neon_sdot_laneqv16qi (__r, __a, __b, __index); }
> > +
> >  #pragma GCC pop_options
> >  #endif
> >
> > diff --git a/gcc/config/arm/arm_neon_builtins.def
> > b/gcc/config/arm/arm_neon_builtins.def
> > index
> >
> fafb5c6fc51c16679ead1afda7cccfea8264fd15..f83dd4327c16c0af68f72eb6d9c
> a
> > 8cf21e2e56b5 100644
> > --- a/gcc/config/arm/arm_neon_builtins.def
> > +++ b/gcc/config/arm/arm_neon_builtins.def
> > @@ -342,6 +342,8 @@ VAR2 (TERNOP, sdot, v8qi, v16qi)
> >  VAR2 (UTERNOP, udot, v8qi, v16qi)
> >  VAR2 (MAC_LANE, sdot_lane, v8qi, v16qi)
> >  VAR2 (UMAC_LANE, udot_lane, v8qi, v16qi)
> > +VAR2 (MAC_LANE, sdot_laneq, v8qi, v16qi)
> > +VAR2 (UMAC_LANE, udot_laneq, v8qi, v16qi)
> >
> >  VAR1 (USTERNOP, u

Re: [PATCH v5 1/1] [ARM] Add support for TLS register based stack protector canary access

2022-01-14 Thread Ard Biesheuvel via Gcc-patches
(+ Richard Earnshaw)

On Wed, 12 Jan 2022 at 19:29, Ard Biesheuvel  wrote:
>
> On Wed, 17 Nov 2021 at 18:12, Ard Biesheuvel  wrote:
> >
> > (+ Ramana)
> >
>
> Ping?
>
> > On Mon, 15 Nov 2021 at 19:04, Ard Biesheuvel  wrote:
> > >
> > > Add support for accessing the stack canary value via the TLS register,
> > > so that multiple threads running in the same address space can use
> > > distinct canary values. This is intended for the Linux kernel running in
> > > SMP mode, where processes entering the kernel are essentially threads
> > > running the same program concurrently: using a global variable for the
> > > canary in that context is problematic because it can never be rotated,
> > > and so the OS is forced to use the same value as long as it remains up.
> > >
> > > Using the TLS register to index the stack canary helps with this, as it
> > > allows each CPU to context switch the TLS register along with the rest
> > > of the process, permitting each process to use its own value for the
> > > stack canary.
> > >
> > > 2021-11-15 Ard Biesheuvel 
> > >
> > > * config/arm/arm-opts.h (enum stack_protector_guard): New
> > > * config/arm/arm-protos.h (arm_stack_protect_tls_canary_mem):
> > > New
> > > * config/arm/arm.c (TARGET_STACK_PROTECT_GUARD): Define
> > > (arm_option_override_internal): Handle and put in error checks
> > > for stack protector guard options.
> > > (arm_option_reconfigure_globals): Likewise
> > > (arm_stack_protect_tls_canary_mem): New
> > > (arm_stack_protect_guard): New
> > > * config/arm/arm.md (stack_protect_set): New
> > > (stack_protect_set_tls): Likewise
> > > (stack_protect_test): Likewise
> > > (stack_protect_test_tls): Likewise
> > > (reload_tp_hard): Likewise
> > > * config/arm/arm.opt (-mstack-protector-guard): New
> > > (-mstack-protector-guard-offset): New.
> > > * doc/invoke.texi: Document new options
> > >
> > > gcc/testsuite/ChangeLog:
> > >
> > > * gcc.target/arm/stack-protector-7.c: New test.
> > > * gcc.target/arm/stack-protector-8.c: New test.
> > >
> > > Signed-off-by: Ard Biesheuvel 
> > > ---
> > >  gcc/config/arm/arm-opts.h|  6 ++
> > >  gcc/config/arm/arm-protos.h  |  2 +
> > >  gcc/config/arm/arm.c | 55 +++
> > >  gcc/config/arm/arm.md| 71 
> > > +++-
> > >  gcc/config/arm/arm.opt   | 22 ++
> > >  gcc/doc/invoke.texi  | 11 +++
> > >  gcc/testsuite/gcc.target/arm/stack-protector-7.c | 10 +++
> > >  gcc/testsuite/gcc.target/arm/stack-protector-8.c |  5 ++
> > >  8 files changed, 180 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/gcc/config/arm/arm-opts.h b/gcc/config/arm/arm-opts.h
> > > index 5c4b62f404f7..581ba3c4fbbb 100644
> > > --- a/gcc/config/arm/arm-opts.h
> > > +++ b/gcc/config/arm/arm-opts.h
> > > @@ -69,4 +69,10 @@ enum arm_tls_type {
> > >TLS_GNU,
> > >TLS_GNU2
> > >  };
> > > +
> > > +/* Where to get the canary for the stack protector.  */
> > > +enum stack_protector_guard {
> > > +  SSP_TLSREG,  /* per-thread canary in TLS register */
> > > +  SSP_GLOBAL   /* global canary */
> > > +};
> > >  #endif
> > > diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h
> > > index 9b1f61394ad7..d8d605920c97 100644
> > > --- a/gcc/config/arm/arm-protos.h
> > > +++ b/gcc/config/arm/arm-protos.h
> > > @@ -195,6 +195,8 @@ extern void arm_split_atomic_op (enum rtx_code, rtx, 
> > > rtx, rtx, rtx, rtx, rtx);
> > >  extern rtx arm_load_tp (rtx);
> > >  extern bool arm_coproc_builtin_available (enum unspecv);
> > >  extern bool arm_coproc_ldc_stc_legitimate_address (rtx);
> > > +extern rtx arm_stack_protect_tls_canary_mem (bool);
> > > +
> > >
> > >  #if defined TREE_CODE
> > >  extern void arm_init_cumulative_args (CUMULATIVE_ARGS *, tree, rtx, 
> > > tree);
> > > diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
> > > index a5b403eb3e49..e5077348ce07 100644
> > > --- a/gcc/config/arm/arm.c
> > > +++ b/gcc/config/arm/arm.c
> > > @@ -829,6 +829,9 @@ static const struct attribute_spec 
> > > arm_attribute_table[] =
> > >
> > >  #undef TARGET_MD_ASM_ADJUST
> > >  #define TARGET_MD_ASM_ADJUST arm_md_asm_adjust
> > > +
> > > +#undef TARGET_STACK_PROTECT_GUARD
> > > +#define TARGET_STACK_PROTECT_GUARD arm_stack_protect_guard
> > >
> > >  /* Obstack for minipool constant handling.  */
> > >  static struct obstack minipool_obstack;
> > > @@ -3176,6 +3179,26 @@ arm_option_override_internal (struct gcc_options 
> > > *opts,
> > >if (TARGET_THUMB2_P (opts->x_target_flags))
> > >  opts->x_inline_asm_unified = true;
> > >
> > > +  if (arm_stack_protector_guard == SSP_GLOBAL
> > > +  && opts->x_arm_stack_protector_guard_offset_str)
> > > +{
> > > +  error ("incom

[committed] libstdc++: Document new std::random_device tokens

2022-01-14 Thread Jonathan Wakely via Gcc-patches
Pushed to trunk.

libstdc++-v3/ChangeLog:

* doc/xml/manual/status_cxx2011.xml: Document new tokens
accepted by std::random_device constructor.
* doc/html/manual/status.html: Regenerate.
---
 libstdc++-v3/doc/html/manual/status.html  | 13 
 .../doc/xml/manual/status_cxx2011.xml | 30 +++
 2 files changed, 43 insertions(+)

diff --git a/libstdc++-v3/doc/xml/manual/status_cxx2011.xml 
b/libstdc++-v3/doc/xml/manual/status_cxx2011.xml
index 88844f8f0cc..1eeb2d1ccd7 100644
--- a/libstdc++-v3/doc/xml/manual/status_cxx2011.xml
+++ b/libstdc++-v3/doc/xml/manual/status_cxx2011.xml
@@ -2720,6 +2720,36 @@ particular release.
   the assembler and CPU support the corresponding instruction.
 
   
+  
+"darn"
+
+  Use the Power ISA-3.0 DARN
+  ("Deliver A Random Number")
+  instruction to read from an on-chip hardware random number generator.
+  This token is only valid for 64-bit powerpc targets when both
+  the assembler and CPU support the corresponding instruction.
+
+  
+  
+"hw"
+"hardware"
+
+  Use any available CPU
+  instruction to read from an on-chip hardware random number generator.
+  This is equivalent to trying each of the following and using the
+  first that is supported:
+  "rdseed"
+  "rdrand"
+  "darn"
+
+  
+  
+"arc4random"
+"getentropy"
+
+  Use the named C library function, if available on the target.
+
+  
   
 "/dev/urandom"
 "/dev/random"
-- 
2.31.1



[committed] libstdc++: Use std::construct_at in std::common_iterator [PR103992]

2022-01-14 Thread Jonathan Wakely via Gcc-patches
Tested powerpc64le-linux, pushed to trunk.


This should have been done as part of the LWG 3574 changes.

libstdc++-v3/ChangeLog:

PR libstdc++/103992
* include/bits/stl_iterator.h (common_iterator): Use
std::construct_at instead of placement new.
* testsuite/24_iterators/common_iterator/1.cc: Check copy
construction is usable in constant expressions.
---
 libstdc++-v3/include/bits/stl_iterator.h |  8 
 .../testsuite/24_iterators/common_iterator/1.cc  | 16 
 2 files changed, 20 insertions(+), 4 deletions(-)

diff --git a/libstdc++-v3/include/bits/stl_iterator.h 
b/libstdc++-v3/include/bits/stl_iterator.h
index 24a71ea55af..24c106e0f70 100644
--- a/libstdc++-v3/include/bits/stl_iterator.h
+++ b/libstdc++-v3/include/bits/stl_iterator.h
@@ -1907,14 +1907,14 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
if constexpr (is_trivially_default_constructible_v<_It>)
  _M_it = std::move(__x._M_it);
else
- ::new((void*)std::__addressof(_M_it)) _It(__x._M_it);
+ std::construct_at(std::__addressof(_M_it), __x._M_it);
  }
else if (_M_index == 1)
  {
if constexpr (is_trivially_default_constructible_v<_Sent>)
  _M_sent = std::move(__x._M_sent);
else
- ::new((void*)std::__addressof(_M_sent)) _Sent(__x._M_sent);
+ std::construct_at(std::__addressof(_M_sent), __x._M_sent);
  }
   }
 
@@ -1928,14 +1928,14 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
  if constexpr (is_trivially_default_constructible_v<_It>)
_M_it = std::move(__x._M_it);
  else
-   ::new((void*)std::__addressof(_M_it)) _It(__x._M_it);
+   std::construct_at(std::__addressof(_M_it), __x._M_it);
}
   else if (_M_index == 1)
{
  if constexpr (is_trivially_default_constructible_v<_Sent>)
_M_sent = std::move(__x._M_sent);
  else
-   ::new((void*)std::__addressof(_M_sent)) _Sent(__x._M_sent);
+   std::construct_at(std::__addressof(_M_sent), __x._M_sent);
}
 }
 
diff --git a/libstdc++-v3/testsuite/24_iterators/common_iterator/1.cc 
b/libstdc++-v3/testsuite/24_iterators/common_iterator/1.cc
index 484d0cc2301..365ee89c02e 100644
--- a/libstdc++-v3/testsuite/24_iterators/common_iterator/1.cc
+++ b/libstdc++-v3/testsuite/24_iterators/common_iterator/1.cc
@@ -157,6 +157,22 @@ test04()
   VERIFY( x.i == 2 );
 }
 
+constexpr bool
+test_pr103992()
+{
+  using C1 = std::common_iterator,
+ std::unreachable_sentinel_t>;
+  using C2 = std::common_iterator,
+ std::unreachable_sentinel_t>;
+  C1 c1;
+  C2 c2 = c1;
+  C1 c3 = c1;
+
+  return true;
+}
+
+static_assert( test_pr103992() );
+
 int
 main()
 {
-- 
2.31.1



[committed] libstdc++: Ignore cv-quals when std::allocator constructs

2022-01-14 Thread Jonathan Wakely via Gcc-patches
Tested powerpc64le-linux, pushed to trunk.


When I added the std::allocator_traits>
specialization it broke code like this:

  std::allocate_shared(std::allocator());

The problem is that allocator_traits>::construct(a, p)
now uses std::_Construct(p), which only does a static_cast(p) and
so fails if the pointer has cv-quals.

This changes std::_Construct (and the related std::_Construct_novalue)
to use a C-style cast to (void*) which matches the effects of the
"voidify" helper in the C++20 standard.

libstdc++-v3/ChangeLog:

* include/bits/stl_construct.h (_Construct, _Construct_novalue):
Also cast away cv-qualifiers when converting pointer to void.
* testsuite/20_util/allocator/void.cc: Test construct function
with cv-qualified types.
---
 libstdc++-v3/include/bits/stl_construct.h|  4 ++--
 libstdc++-v3/testsuite/20_util/allocator/void.cc | 15 +++
 2 files changed, 17 insertions(+), 2 deletions(-)

diff --git a/libstdc++-v3/include/bits/stl_construct.h 
b/libstdc++-v3/include/bits/stl_construct.h
index 7c5fd4c9cf7..9531222809c 100644
--- a/libstdc++-v3/include/bits/stl_construct.h
+++ b/libstdc++-v3/include/bits/stl_construct.h
@@ -116,7 +116,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
  return;
}
 #endif
-  ::new(static_cast(__p)) _Tp(std::forward<_Args>(__args)...);
+  ::new((void*)__p) _Tp(std::forward<_Args>(__args)...);
 }
 #else
   template
@@ -132,7 +132,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   template
 inline void
 _Construct_novalue(_T1* __p)
-{ ::new(static_cast(__p)) _T1; }
+{ ::new((void*)__p) _T1; }
 
   template
 _GLIBCXX20_CONSTEXPR void
diff --git a/libstdc++-v3/testsuite/20_util/allocator/void.cc 
b/libstdc++-v3/testsuite/20_util/allocator/void.cc
index 52e1fef3700..5cdf0be012c 100644
--- a/libstdc++-v3/testsuite/20_util/allocator/void.cc
+++ b/libstdc++-v3/testsuite/20_util/allocator/void.cc
@@ -87,8 +87,23 @@ static_assert( 
std::is_same::const_pointer, const void*>(),
 "const_pointer is const void*" );
 #endif // C++20
 
+void
+test02()
+{
+  std::allocator av;
+  int* p = std::allocator().allocate(1);
+  const int* c = p;
+  std::allocator_traits>::construct(av, c, 0);
+  volatile int* v = p;
+  std::allocator_traits>::construct(av, v, 0);
+  const volatile int* cv = p;
+  std::allocator_traits>::construct(av, cv, 0);
+  std::allocator().deallocate(p, 1);
+}
+
 int
 main()
 {
   test01();
+  test02();
 }
-- 
2.31.1



[committed] libstdc++: Add C++20 std::make_shared enhancements (P0674R1)

2022-01-14 Thread Jonathan Wakely via Gcc-patches
Tested powerpc64-le-linux, pushed to trunk.
N.B. we had a patch from JeanHeyd two years ago adding the first part of
this (P0674R1), but this is much simpler, works without concepts, and
also adds make_shared_for_overwrite.



This adds the overloads of std::make_shared and std::allocate_shared for
creating arrays, added to C++20 by P0674R1.

It also adds std::make_shared_for_overwrite, added to C++20 by P1020R1
(and renamed by P1973R1). The std::make_unique_for_overwite overloads
are already supported.

The original std::make_shared overload is changed to construct a
shared_ptr directly instead of calling std::allocate_shared. This
removes a function call at runtime, and avoids having to do overload
resolution for std::allocate_shared, now that there are five overloads
of it.

Allocating a shared array is done by a new __shared_count constructor.
An array is allocated with space for additional elements at the end and
an instance of new _Sp_counted_array class template is constructed in
that unused capacity.

The non-array form of std::make_shared_for_overwrite uses the same
__shared_count constructor as the original std::make_shared overload,
but a new partial specialization of _Sp_counted_ptr_inplace is selected
when the allocator's value_type is the new _Sp_overwrite_tag type. That
new partial specialization default-initializes its contained object and
destroys it with a destructor call rather than using the allocator.

Despite being C++20 features, this implementation only uses concepts
conditionally, with workarounds when they are not supported. This allows
it to work with older non-GCC compilers (Clang 9 and icc 2021). At some
point we can simplify the code by removing the workarounds.

libstdc++-v3/ChangeLog:

* include/bits/shared_ptr.h (__cpp_lib_shared_ptr_weak_type):
Correct type of macro value.
(shared_ptr): Add additional friend declarations.
(make_shared, allocate_shared): Constrain existing overloads and
remove static_assert.
* include/bits/shared_ptr_base.h (__cpp_lib_smart_ptr_for_overwrite):
New macro.
(_Sp_counted_ptr_inplace): New partial
specialization for use with make_shared_for_overwrite.
(__cpp_lib_shared_ptr_arrays): Update value for C++20.
(_Sp_counted_array_base): New class template.
(_Sp_counted_array): New class template.
(__shared_count(_Tp*&, const _Sp_counted_array_base&, _Init)):
New constructor for allocating shared arrays.
(__shared_ptr(const _Sp_counted_array_base&, _Init)): Likewise.
* include/std/version (__cpp_lib_shared_ptr_weak_type): Correct
type.
(__cpp_lib_shared_ptr_arrays): Update value for C++20.
(__cpp_lib_smart_ptr_for_overwrite): New macro.
* testsuite/20_util/shared_ptr/creation/99006.cc: Adjust
expected errors.
* testsuite/20_util/shared_ptr/creation/array.cc: New test.
* testsuite/20_util/shared_ptr/creation/overwrite.cc: New test.
* testsuite/20_util/shared_ptr/creation/version.cc: New test.
* testsuite/20_util/unique_ptr/creation/for_overwrite.cc: Check
feature test macro. Test non-trivial default-initialization.
---
 libstdc++-v3/include/bits/shared_ptr.h| 271 +++-
 libstdc++-v3/include/bits/shared_ptr_base.h   | 299 +-
 libstdc++-v3/include/std/version  |   6 +-
 .../20_util/shared_ptr/creation/99006.cc  |   7 +-
 .../20_util/shared_ptr/creation/array.cc  | 224 +
 .../20_util/shared_ptr/creation/overwrite.cc  | 143 +
 .../20_util/shared_ptr/creation/version.cc|  18 ++
 .../unique_ptr/creation/for_overwrite.cc  |  25 +-
 8 files changed, 975 insertions(+), 18 deletions(-)
 create mode 100644 libstdc++-v3/testsuite/20_util/shared_ptr/creation/array.cc
 create mode 100644 
libstdc++-v3/testsuite/20_util/shared_ptr/creation/overwrite.cc
 create mode 100644 
libstdc++-v3/testsuite/20_util/shared_ptr/creation/version.cc

diff --git a/libstdc++-v3/include/bits/shared_ptr.h 
b/libstdc++-v3/include/bits/shared_ptr.h
index 5fdfcb2c4e1..f44008ed2b1 100644
--- a/libstdc++-v3/include/bits/shared_ptr.h
+++ b/libstdc++-v3/include/bits/shared_ptr.h
@@ -100,6 +100,57 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 #endif
 }
 
+  /// @cond undocumented
+
+  // Constraint for overloads taking non-array types.
+#if __cpp_concepts && __cpp_lib_type_trait_variable_templates
+  template
+requires (!is_array_v<_Tp>)
+using _NonArray = _Tp;
+#else
+  template
+using _NonArray = __enable_if_t::value, _Tp>;
+#endif
+
+#if __cpp_lib_shared_ptr_arrays >= 201707L
+  // Constraint for overloads taking array types with unknown bound, U[].
+#if __cpp_concepts
+  template
+requires is_array_v<_Tp> && (extent_v<_Tp> == 0)
+using _UnboundedArray = _Tp;
+#else
+  template
+using _UnboundedArray
+  = __enable_if_t<__is_array_unknown_bounds<_Tp>::value, _Tp>;
+#endif
+
+  // Const

[committed] arc: Update stack size computation when accumulator registers are available.

2022-01-14 Thread Claudiu Zissulescu via Gcc-patches
When accumulator registers are available in a processor, they need to
be save onto stack durring interrupts.  We were already doing so, but
the stack size was wrongly computed in the case other than ARC600.

gcc/

* config/arc/arc.c (arc_compute_frame_size): Remove condition when
computin checking accumulator regs.
(arc_expand_prologue): Update comments.
(arc_expand_epilogue): Likewise.

Signed-off-by: Claudiu Zissulescu 
---
 gcc/config/arc/arc.c | 9 -
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/gcc/config/arc/arc.c b/gcc/config/arc/arc.c
index 6de364e9ce0..3096a9bef44 100644
--- a/gcc/config/arc/arc.c
+++ b/gcc/config/arc/arc.c
@@ -2897,9 +2897,8 @@ arc_compute_frame_size (void)
  cfun, TARGET_DPFP))
 reg_size += UNITS_PER_WORD * 2;
 
-  /* Check for special MLO/MHI case used by ARC600' MUL64
- extension.  */
-  if (arc_must_save_register (R58_REG, cfun, TARGET_MUL64_SET))
+  /* Check if R58 is used.  */
+  if (arc_must_save_register (R58_REG, cfun, true))
 reg_size += UNITS_PER_WORD * 2;
 
   /* 4) Calculate extra size made up of the blink + fp size.  */
@@ -3878,7 +3877,7 @@ arc_expand_prologue (void)
}
 }
 
-  /* Save ARC600' MUL64 registers.  */
+  /* Save accumulator registers.  */
   if (arc_must_save_register (R58_REG, cfun, true))
 frame_size_to_allocate -= arc_save_callee_saves (3ULL << 58,
 false, false, 0, false);
@@ -3971,7 +3970,7 @@ arc_expand_epilogue (int sibcall_p)
   first_offset = 0;
 }
 
-  /* Restore ARC600' MUL64 registers.  */
+  /* Restore accumulator registers.  */
   if (arc_must_save_register (R58_REG, cfun, true))
 {
   rtx insn;
-- 
2.31.1



[committed] arc: Add DWARF2 alternate CFA column.

2022-01-14 Thread Claudiu Zissulescu via Gcc-patches
Add DWARF 2 CFA column which tracks the return address from a signal
handler context.  This value must not correspond to a hard register
and must be out of the range of DWARF_FRAME_REGNUM().

gcc/
* config/arc/arc.h (DWARF_FRAME_REGNUM): Update definition.
(DWARF_FRAME_RETURN_COLUMN): Use RETURN_ADDR_REGNUM macro.
(INCOMING_RETURN_ADDR_RTX): Likewise.
(DWARF_ALT_FRAME_RETURN_COLUMN): Define.

gcc/testsuite/
* gcc.target/arc/cancel-1.c: New file.

libgcc/
* config/arc/linux-unwind.h (arc_fallback_frame_state): Use
DWARF_ALT_FRAME_RETURN_COLUMN macro.

Signed-off-by: Claudiu Zissulescu 
---
 gcc/config/arc/arc.h| 11 ++---
 gcc/testsuite/gcc.target/arc/cancel-1.c | 31 +
 libgcc/config/arc/linux-unwind.h|  9 +++
 3 files changed, 44 insertions(+), 7 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/arc/cancel-1.c

diff --git a/gcc/config/arc/arc.h b/gcc/config/arc/arc.h
index 78b5000eeb7..539a1662084 100644
--- a/gcc/config/arc/arc.h
+++ b/gcc/config/arc/arc.h
@@ -1356,7 +1356,7 @@ do { \
: (REGNO))
 
 /* Use gcc hard register numbering for eh_frame.  */
-#define DWARF_FRAME_REGNUM(REG) (REG)
+#define DWARF_FRAME_REGNUM(REG) ((REG) < 144 ? REG : INVALID_REGNUM)
 
 /* Map register numbers held in the call frame info that gcc has
collected using DWARF_FRAME_REGNUM to those that should be output
@@ -1370,9 +1370,14 @@ do { \
   : 57 + !!TARGET_MULMAC_32BY16_SET) /* MLO */ \
: (REGNO))
 
-#define DWARF_FRAME_RETURN_COLUMN  DWARF_FRAME_REGNUM (31)
+/* The DWARF 2 CFA column which tracks the return address.  */
+#define DWARF_FRAME_RETURN_COLUMN RETURN_ADDR_REGNUM
+#define INCOMING_RETURN_ADDR_RTX gen_rtx_REG (Pmode, RETURN_ADDR_REGNUM)
 
-#define INCOMING_RETURN_ADDR_RTX  gen_rtx_REG (Pmode, 31)
+/* The DWARF 2 CFA column which tracks the return address from a signal handler
+   context.  This value must not correspond to a hard register and must be out
+   of the range of DWARF_FRAME_REGNUM().  */
+#define DWARF_ALT_FRAME_RETURN_COLUMN 144
 
 /* Frame info.  */
 
diff --git a/gcc/testsuite/gcc.target/arc/cancel-1.c 
b/gcc/testsuite/gcc.target/arc/cancel-1.c
new file mode 100644
index 000..e050c538157
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arc/cancel-1.c
@@ -0,0 +1,31 @@
+/* Test for cleanups with pthread_cancel.  Any issue with libgcc's unwinder
+   will cause this test to spin in pthread_join.  */
+
+/* { dg-do run } */
+/* { dg-require-effective-target pthread } */
+/* { dg-options "-pthread" } */
+
+#include 
+#include 
+#include 
+
+void *thread_loop (void *)
+{
+  while (1)
+{
+  printf("worker: loop\n");
+  sleep(1);
+}
+}
+
+int main ()
+{
+  pthread_t thread;
+
+  pthread_create (&thread, 0, thread_loop, 0);
+  sleep(5);
+  pthread_cancel (thread);
+  pthread_join (thread, 0);
+
+  return 0;
+}
diff --git a/libgcc/config/arc/linux-unwind.h b/libgcc/config/arc/linux-unwind.h
index 1d8c0c55883..be42a3163b2 100644
--- a/libgcc/config/arc/linux-unwind.h
+++ b/libgcc/config/arc/linux-unwind.h
@@ -120,10 +120,11 @@ arc_fallback_frame_state (struct _Unwind_Context *context,
= ((_Unwind_Ptr) &(regs[i])) - new_cfa;
 }
 
-  fs->regs.reg[31].how = REG_SAVED_VAL_OFFSET;
-  fs->regs.reg[31].loc.offset = ((_Unwind_Ptr) (regs[ret])) - new_cfa;
-
-  fs->retaddr_column = 31;
+  fs->signal_frame = 1;
+  fs->retaddr_column = __LIBGCC_DWARF_ALT_FRAME_RETURN_COLUMN__;
+  fs->regs.reg[fs->retaddr_column].how = REG_SAVED_VAL_OFFSET;
+  fs->regs.reg[fs->retaddr_column].loc.offset =
+((_Unwind_Ptr) (regs[ret])) - new_cfa;
 
   return _URC_NO_REASON;
 }
-- 
2.31.1



Re: [patch] Fix reverse SSO issues in IPA-SRA

2022-01-14 Thread Eric Botcazou via Gcc-patches
> Yes, it must still be copied.

OK, revised patch attached, with testcases but they fail only on the 10 and 11 
branches because of a change in the heuristics apparently.


* ipa-param-manipulation.c (ipa_dump_adjusted_parameters): Dump
reverse flag as "reverse" for the sake of consistency.
* ipa-sra.c: Fix copyright year.
(ipa_sra_function_summaries::duplicate): Copy the reverse flag.
(dump_isra_access): Tweak dump line.
(isra_write_node_summary): Write the reverse flag.
(isra_read_node_info): Read it.
(pull_accesses_from_callee): Test its consistency and copy it.

testsuite/
* gnat.dg/lto25.adb: New test.
* gnat.dg/opt96.adb: Likewise.
* gnat.dg/opt96_pkg.ads, gnat.dg/opt96_pkg.adb: New helper.

-- 
Eric Botcazoudiff --git a/gcc/ipa-param-manipulation.c b/gcc/ipa-param-manipulation.c
index 4973bfb67dd..fa6815e0941 100644
--- a/gcc/ipa-param-manipulation.c
+++ b/gcc/ipa-param-manipulation.c
@@ -228,7 +228,7 @@ ipa_dump_adjusted_parameters (FILE *f,
 	  fprintf (f, " prefix: %s",
 		   ipa_param_prefixes[apm->param_prefix_index]);
 	  if (apm->reverse)
-	fprintf (f, ", reverse-sso");
+	fprintf (f, ", reverse");
 	  break;
 	}
   fprintf (f, "\n");
diff --git a/gcc/ipa-sra.c b/gcc/ipa-sra.c
index 45030a17c07..f300a328f2b 100644
--- a/gcc/ipa-sra.c
+++ b/gcc/ipa-sra.c
@@ -1,6 +1,5 @@
 /* Interprocedural scalar replacement of aggregates
-   Copyright (C) 2008-2022 Free Software Foundation, Inc.
-
+   Copyright (C) 2019-2022 Free Software Foundation, Inc.
Contributed by Martin Jambor 
 
 This file is part of GCC.
@@ -21,7 +20,7 @@ along with GCC; see the file COPYING3.  If not see
 
 /* IPA-SRA is an interprocedural pass that removes unused function return
values (turning functions returning a value which is never used into void
-   functions), removes unused function parameters.  It can also replace an
+   functions) and removes unused function parameters.  It can also replace an
aggregate parameter by a set of other parameters representing part of the
original, turning those passed by reference into new ones which pass the
value directly.
@@ -57,7 +56,6 @@ along with GCC; see the file COPYING3.  If not see
ipa-param-manipulation.h for more details.  */
 
 
-
 #include "config.h"
 #include "system.h"
 #include "coretypes.h"
@@ -93,7 +91,7 @@ static void ipa_sra_summarize_function (cgraph_node *);
 #define ISRA_ARG_SIZE_LIMIT_BITS 16
 #define ISRA_ARG_SIZE_LIMIT (1 << ISRA_ARG_SIZE_LIMIT_BITS)
 /* How many parameters can feed into a call actual argument and still be
-   tracked. */
+   tracked.  */
 #define IPA_SRA_MAX_PARAM_FLOW_LEN 7
 
 /* Structure describing accesses to a specific portion of an aggregate
@@ -122,7 +120,7 @@ struct GTY(()) param_access
  transformed function - initially not set for portions of formal parameters
  that are only used as actual function arguments passed to callees.  */
   unsigned certain : 1;
-  /* Set if the access has a reversed scalar storage order.  */
+  /* Set if the access has reverse scalar storage order.  */
   unsigned reverse : 1;
 };
 
@@ -156,7 +154,7 @@ struct gensum_param_access
  arguments to a function call that can be tracked.  */
   bool nonarg;
 
-  /* Set if the access has a reversed scalar storage order.  */
+  /* Set if the access has reverse scalar storage order.  */
   bool reverse;
 };
 
@@ -219,8 +217,8 @@ struct gensum_param_desc
 };
 
 /* Properly deallocate accesses of DESC.  TODO: Since this data structure is
-   not in GC memory, this is not necessary and we can consider removing the
-   function.  */
+   allocated in GC memory, this is not necessary and we can consider removing
+   the function.  */
 
 static void
 free_param_decl_accesses (isra_param_desc *desc)
@@ -275,9 +273,9 @@ public:
   unsigned m_queued : 1;
 };
 
-/* Clean up and deallocate isra_func_summary points to.  TODO: Since this data
-   structure is not in GC memory, this is not necessary and we can consider
-   removing the destructor.  */
+/* Deallocate the memory pointed to by isra_func_summary.  TODO: Since this
+   data structure is allocated in GC memory, this is not necessary and we can
+   consider removing the destructor.  */
 
 isra_func_summary::~isra_func_summary ()
 {
@@ -287,7 +285,6 @@ isra_func_summary::~isra_func_summary ()
   vec_free (m_parameters);
 }
 
-
 /* Mark the function as not a candidate for any IPA-SRA transformation.  Return
true if it was a candidate until now.  */
 
@@ -297,6 +294,7 @@ isra_func_summary::zap ()
   bool ret = m_candidate;
   m_candidate = false;
 
+  /* TODO: see the destructor above.  */
   unsigned len = vec_safe_length (m_parameters);
   for (unsigned i = 0; i < len; ++i)
 free_param_decl_accesses (&(*m_parameters)[i]);
@@ -306,7 +304,7 @@ isra_func_summary::zap ()
 }
 
 /* Structure to describe which formal parameters feed into a particular actual
-   arguments.  */
+   argument.  *

Re: [AArch64] Enable generation of FRINTNZ instructions

2022-01-14 Thread Richard Sandiford via Gcc-patches
"Andre Vieira (lists)"  writes:
> diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
> index 
> 19e89ae502bc2f51db64667b236c1cb669718b02..3b0e4e0875b4392ab6833568b207580ef597a98f
>  100644
> --- a/gcc/doc/md.texi
> +++ b/gcc/doc/md.texi
> @@ -6191,6 +6191,15 @@ operands; otherwise, it may not.
>  
>  This pattern is not allowed to @code{FAIL}.
>  
> +@cindex @code{ftrunc@var{m}@var{n}2} instruction pattern
> +@item @samp{ftrunc@var{m}@var{n}2}
> +Truncate operand 1 to a @var{n} mode signed integer, towards zero, and store
> +the result in operand 0. Both operands have mode @var{m}, which is a scalar 
> or
> +vector floating-point mode.  An exception must be raised if operand 1 does 
> not
> +fit in a @var{n} mode signed integer as it would have if the truncation
> +happened through separate floating point to integer conversion.
> +
> +

Nit: just one blank line here.

>  @cindex @code{round@var{m}2} instruction pattern
>  @item @samp{round@var{m}2}
>  Round operand 1 to the nearest integer, rounding away from zero in the
> diff --git a/gcc/doc/sourcebuild.texi b/gcc/doc/sourcebuild.texi
> index 
> 6095a35cd4565fdb7d758104e80fe6411230f758..a56bbb775572fa72379854f90a01ad543557e29a
>  100644
> --- a/gcc/doc/sourcebuild.texi
> +++ b/gcc/doc/sourcebuild.texi
> @@ -2286,6 +2286,10 @@ Like @code{aarch64_sve_hw}, but also test for an exact 
> hardware vector length.
>  @item aarch64_fjcvtzs_hw
>  AArch64 target that is able to generate and execute armv8.3-a FJCVTZS
>  instruction.
> +
> +@item aarch64_frintzx_ok
> +AArch64 target that is able to generate the Armv8.5-a FRINT32Z, FRINT64Z,
> +FRINT32X and FRINT64X instructions.
>  @end table
>  
>  @subsubsection MIPS-specific attributes
> diff --git a/gcc/internal-fn.c b/gcc/internal-fn.c
> index 
> b24102a5990bea4cbb102069f7a6df497fc81ebf..9047b486f41948059a7a7f1ccc4032410a369139
>  100644
> --- a/gcc/internal-fn.c
> +++ b/gcc/internal-fn.c
> @@ -130,6 +130,7 @@ init_internal_fns ()
>  #define fold_left_direct { 1, 1, false }
>  #define mask_fold_left_direct { 1, 1, false }
>  #define check_ptrs_direct { 0, 0, false }
> +#define ftrunc_int_direct { 0, 1, true }
>  
>  const direct_internal_fn_info direct_internal_fn_array[IFN_LAST + 1] = {
>  #define DEF_INTERNAL_FN(CODE, FLAGS, FNSPEC) not_direct,
> @@ -156,6 +157,29 @@ get_multi_vector_move (tree array_type, convert_optab 
> optab)
>return convert_optab_handler (optab, imode, vmode);
>  }
>  
> +/* Expand FTRUNC_INT call STMT using optab OPTAB.  */
> +
> +static void
> +expand_ftrunc_int_optab_fn (internal_fn, gcall *stmt, convert_optab optab)
> +{
> +  class expand_operand ops[2];
> +  tree lhs, float_type, int_type;
> +  rtx target, op;
> +
> +  lhs = gimple_call_lhs (stmt);
> +  target = expand_normal (lhs);
> +  op = expand_normal (gimple_call_arg (stmt, 0));
> +
> +  float_type = TREE_TYPE (lhs);
> +  int_type = element_type (gimple_call_arg (stmt, 1));

Sorry for the run-around, but now that we don't (need to) vectorise
the second argument, I think we can drop this element_type.  That in
turn means that…

> +
> +  create_output_operand (&ops[0], target, TYPE_MODE (float_type));
> +  create_input_operand (&ops[1], op, TYPE_MODE (float_type));
> +
> +  expand_insn (convert_optab_handler (optab, TYPE_MODE (float_type),
> +   TYPE_MODE (int_type)), 2, ops);
> +}
> +
>  /* Expand LOAD_LANES call STMT using optab OPTAB.  */
>  
>  static void
> @@ -3747,6 +3771,15 @@ multi_vector_optab_supported_p (convert_optab optab, 
> tree_pair types,
> != CODE_FOR_nothing);
>  }
>  
> +static bool
> +direct_ftrunc_int_optab_supported_p (convert_optab optab, tree_pair types,
> +  optimization_type opt_type)
> +{
> +  return (convert_optab_handler (optab, TYPE_MODE (types.first),
> + TYPE_MODE (element_type (types.second)),
> + opt_type) != CODE_FOR_nothing);
> +}
> +

…this can use convert_optab_supported_p.

> diff --git a/gcc/testsuite/gcc.target/aarch64/frintnz.c 
> b/gcc/testsuite/gcc.target/aarch64/frintnz.c
> new file mode 100644
> index 
> ..008e1cf9f4a1b0148128c65c9ea0d1bb111467b7
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/aarch64/frintnz.c
> @@ -0,0 +1,91 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -march=armv8.5-a" } */
> +/* { dg-require-effective-target aarch64_frintnzx_ok } */

Is this just a cut-&-pasto from a run test?  If not, why do we need both
this and the dg-options?  It feels like one on its own should be enough,
with the dg-options being better.

The test looks OK without this line.

> +/* { dg-final { check-function-bodies "**" "" } } */
> +
> +/*
> +** f1:
> +**   frint32zs0, s0
> +**   ret
> +*/
> +float
> +f1 (float x)
> +{
> +  int y = x;
> +  return (float) y;
> +}
> +
> +/*
> +** f2:
> +**   frint64zs0, s0
> +**   ret
> +*/
> +float
> +f2 (float x)
> +{
> +  long long int y = x;
> +  return (float) y

Re: [PATCH] rs6000: Use known constant for GET_MODE_NUNITS and similar

2022-01-14 Thread Kewen.Lin via Gcc-patches
on 2022/1/13 下午11:15, David Edelsohn wrote:
> On Thu, Jan 13, 2022 at 7:40 AM Kewen.Lin  wrote:
>>
>> Hi David,
>>
>> on 2022/1/13 上午11:12, David Edelsohn wrote:
>>> On Wed, Jan 12, 2022 at 8:56 PM Kewen.Lin  wrote:

 Hi,

 This patch is to clean up some codes with GET_MODE_UNIT_SIZE or
 GET_MODE_NUNITS, which can use known constant instead.
>>>
>>> I'll let Segher decide, but often the additional code is useful
>>> self-documentation instead of magic constants.  Or at least the change
>>> requires comments documenting the derivation of the constants
>>> currently described by the code itself.
>>>
>>
>> Thanks for the comments, I added some comments as suggested, also removed
>> the whole "altivec_vreveti2" since I noticed it's useless, it's not used
>> by any built-in functions and even unused in the commit db042e1603db50573.
>>
>> The updated version has been tested as before.
> 
> As we have discussed offline, the comments need to be clarified and expanded.
> 
> And the removal of altivec_vreveti2 should be confirmed with Carl
> Love, who added the pattern less than a year ago. There may be another
> patch planning to use it.
> 

Thanks for the suggestions David!  The comments have been updated, and Carl
also helped to confirm the altivec_vreveti2 pattern is not planned for any
future work and looks reasonable to remove.

Does this updated version look good to you?

BR,
Kewen
-
gcc/ChangeLog:

* config/rs6000/altivec.md (altivec_vreveti2): Remove.
* config/rs6000/vsx.md (*vsx_extract_si, *vsx_extract_si_float_df,
*vsx_extract_si_float_, *vsx_insert_extract_v4sf_p9): Use
known constant values to simplify code.
---
 gcc/config/rs6000/altivec.md | 25 -
 gcc/config/rs6000/vsx.md | 16 
 2 files changed, 12 insertions(+), 29 deletions(-)

diff --git a/gcc/config/rs6000/altivec.md b/gcc/config/rs6000/altivec.md
index 950b17862c4..4412175a0dc 100644
--- a/gcc/config/rs6000/altivec.md
+++ b/gcc/config/rs6000/altivec.md
@@ -3950,31 +3950,6 @@
   DONE;
 })

-;; Vector reverse elements
-(define_expand "altivec_vreveti2"
-  [(set (match_operand:TI 0 "register_operand" "=v")
-   (unspec:TI [(match_operand:TI 1 "register_operand" "v")]
- UNSPEC_VREVEV))]
-  "TARGET_ALTIVEC"
-{
-  int i, j, size, num_elements;
-  rtvec v = rtvec_alloc (16);
-  rtx mask = gen_reg_rtx (V16QImode);
-
-  size = GET_MODE_UNIT_SIZE (TImode);
-  num_elements = GET_MODE_NUNITS (TImode);
-
-  for (j = 0; j < num_elements; j++)
-for (i = 0; i < size; i++)
-  RTVEC_ELT (v, i + j * size)
-   = GEN_INT (i + (num_elements - 1 - j) * size);
-
-  emit_insn (gen_vec_initv16qiqi (mask, gen_rtx_PARALLEL (V16QImode, v)));
-  emit_insn (gen_altivec_vperm_ti (operands[0], operands[1],
-operands[1], mask));
-  DONE;
-})
-
 ;; Vector reverse elements for V16QI V8HI V4SI V4SF
 (define_expand "altivec_vreve2"
   [(set (match_operand:VEC_K 0 "register_operand" "=v")
diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md
index acd729d1687..ee748ff4ebd 100644
--- a/gcc/config/rs6000/vsx.md
+++ b/gcc/config/rs6000/vsx.md
@@ -3854,8 +3854,10 @@
   rtx vec_tmp = operands[3];
   int value;

+  /* Adjust index for LE element ordering, the below minuend 3 is computed by
+ GET_MODE_NUNITS (V4SImode) - 1.  */
   if (!BYTES_BIG_ENDIAN)
-element = GEN_INT (GET_MODE_NUNITS (V4SImode) - 1 - INTVAL (element));
+element = GEN_INT (3 - INTVAL (element));

   /* If the value is in the correct position, we can avoid doing the VSPLT
  instruction.  */
@@ -4230,8 +4232,10 @@
   rtx v4si_tmp = operands[3];
   int value;

+  /* Adjust index for LE element ordering, the below minuend 3 is computed by
+ GET_MODE_NUNITS (V4SImode) - 1.  */
   if (!BYTES_BIG_ENDIAN)
-element = GEN_INT (GET_MODE_NUNITS (V4SImode) - 1 - INTVAL (element));
+element = GEN_INT (3 - INTVAL (element));

   /* If the value is in the correct position, we can avoid doing the VSPLT
  instruction.  */
@@ -4273,8 +4277,10 @@
   rtx df_tmp = operands[4];
   int value;

+  /* Adjust index for LE element ordering, the below minuend 3 is computed by
+ GET_MODE_NUNITS (V4SImode) - 1.  */
   if (!BYTES_BIG_ENDIAN)
-element = GEN_INT (GET_MODE_NUNITS (V4SImode) - 1 - INTVAL (element));
+element = GEN_INT (3 - INTVAL (element));

   /* If the value is in the correct position, we can avoid doing the VSPLT
  instruction.  */
@@ -4466,8 +4472,10 @@
 {
   int ele = INTVAL (operands[4]);

+  /* Adjust index for LE element ordering, the below minuend 3 is computed by
+ GET_MODE_NUNITS (V4SFmode) - 1.  */
   if (!BYTES_BIG_ENDIAN)
-ele = GET_MODE_NUNITS (V4SFmode) - 1 - ele;
+ele = 3 - ele;

   operands[4] = GEN_INT (GET_MODE_SIZE (SFmode) * ele);
   return "xxinsertw %x0,%x2,%4";



[committed] libgfortran: Partly revert my r12-6498 change to fix Solaris build [PR104006]

2022-01-14 Thread Jakub Jelinek via Gcc-patches
Hi!

In r12-6498 I've added $(version_dep) to BUILT_SOURCES, previously version_dep
on Linux used to be a file in $(srcdir), but with my changes it is a generated
file in the object directory (preprocessed version of the $(srcdir) file)
and I thought generated files belong to BUILT_SOURCES so that they are
cleaned up etc.
For Linux that is fine, but it broke parallel builds on Solaris.
BUILT_SOURCES is a special variable for automake where automake ensures
that for make all, make check and make install all those $(BUILT_SOURCES)
are generated before actually starting building in parallel the various
object files.  That way we can avoid hacks like:
$(patsubst %.F90,%.lo,$(notdir $(filter %.F90,$(prereq_SRC: kinds.inc 
c99_protos.inc
$(patsubst %.c,%.lo,$(notdir $(filter %.c,$(prereq_SRC: kinds.h
$(patsubst %.f90,%.lo,selected_real_kind.f90): selected_real_kind.inc
$(patsubst %.f90,%.lo,selected_int_kind.f90): selected_int_kind.inc
$(patsubst %.F90,%.lo,ieee_exceptions.F90): fpu-target.inc
$(patsubst %.F90,%.lo,ieee_arithmetic.F90): fpu-target.inc ieee_exceptions.lo
$(patsubst %.c,%.lo,fpu.c): fpu-target.h
which makes those dependencies explicit but hides it from automake, so that it
doesn't throw away its rules for those object files.
On Solaris, $(version_dep) contains gfortran.ver and gfortran.ver-sun.
gfortran.ver is like on Linux, it can be in $(BUILT_SOURCES), but unfortunately
gfortran.ver-sun depends on all the object files being compiled already,
so if gfortran.ver-sun appears in $(BUILT_SOURCES), the BUILT_SOURCES function
of ensuring the generated files are generated before building object files
is gone, almost everything is built before all-am is entered and there
are no explicit dependencies that e.g. *.F90 files depend on
kinds.inc etc.

So, this change reverts that mistake and instead adds $(version_dep) to
what is removed during make clean (clean-local in particular).

The reversion committed under the reversion of our own patches policy,
the clean-local change considered obvious, committed to trunk.

2022-01-14  Jakub Jelinek  

PR libfortran/104006
* Makefile.am (BUILT_SOURCES): Don't include $(version_dep).
(clean-local): Remove $(version_dep).
* Makefile.in: Regenerated.

--- libgfortran/Makefile.am.jj  2022-01-13 17:44:40.503962317 +0100
+++ libgfortran/Makefile.am 2022-01-13 23:37:52.876004924 +0100
@@ -1118,7 +1118,7 @@ ieee_arithmetic.mod: ieee_arithmetic.lo
:
 
 BUILT_SOURCES=$(gfor_built_src) $(gfor_built_specific_src) \
-   $(gfor_built_specific2_src) $(gfor_misc_specifics) $(version_dep)
+   $(gfor_built_specific2_src) $(gfor_misc_specifics)
 
 prereq_SRC = $(gfor_src) $(gfor_built_src) $(gfor_io_src) \
$(gfor_helper_src) $(gfor_ieee_src) $(gfor_io_headers) 
$(gfor_specific_src)
@@ -1356,7 +1356,7 @@ $(gfor_misc_specifics): m4/misc_specific
 endif
 
 clean-local:
-   -rm -rf include
+   -rm -rf include $(version_dep)
 
 EXTRA_DIST = $(m4_files)
 
--- libgfortran/Makefile.in.jj  2022-01-13 17:44:51.468807363 +0100
+++ libgfortran/Makefile.in 2022-01-13 23:37:58.729923341 +0100
@@ -1652,7 +1652,7 @@ intrinsics/random_init.f90
 
 BUILT_SOURCES = $(gfor_built_src) $(gfor_built_specific_src) \
$(gfor_built_specific2_src) $(gfor_misc_specifics) \
-   $(version_dep) $(am__append_7)
+   $(am__append_7)
 prereq_SRC = $(gfor_src) $(gfor_built_src) $(gfor_io_src) \
$(gfor_helper_src) $(gfor_ieee_src) $(gfor_io_headers) 
$(gfor_specific_src)
 
@@ -7857,7 +7857,7 @@ include/ISO_Fortran_binding.h: $(srcdir)
 @MAINTAINER_MODE_TRUE@ $(M4) -Dfile=$@ -I$(srcdir)/m4 misc_specifics.m4 > $@
 
 clean-local:
-   -rm -rf include
+   -rm -rf include $(version_dep)
 
 # target overrides
 -include $(tmake_file)

Jakub



Re: [PATCH] libstdc++: Add attribute to features deprecated in C++17 [PR91260]

2022-01-14 Thread Jonathan Wakely via Gcc-patches
On Wed, 12 Jan 2022 at 09:34, Jonathan Wakely via Libstdc++ <
libstd...@gcc.gnu.org> wrote:

> This passes testing (with -std=gnu++98/11/17/20) but is quite a large
> patch for this late in stage 3. Does anybody object to doing this now?
> The bugs it fixes were closed as INVALID because we're not actually
> *required* to remove these or deprecate them. But users are right to
> complain about us silently accepting use of things like std::bind1st in
> C++20 mode. Probably WONTFIX would have been better, and in fact we can
> fix them (as this patch does). If we don't do this, users will keep
> reporting bugs about it, but it could wait for stage 1 if needed.
>

Pushed to trunk.


> === >8 === >8 === >8 ===
>
> There are a lot of things in the C++ standard library which were
> deprecated in C++11, and more in C++17.  Some of them were removed after
> deprecation and are no longer present in the standard at all. We have
> not removed these from libstdc++ because keeping them as non-standard
> extensions is conforming, and avoids gratuitously breaking user code,
> and in some cases we need to keep using them to avoid ABI changes. But
> we should at least give a warning for using them. That has not been done
> previously because of the library's own uses of them (e.g. the
> std::iterator class template used as a base class).
>
> This adds deprecated attributes to the relevant components, and then
> goes through the whole library to add diagnostic pragmas where needed to
> suppress warnings about our internal uses of them. The tests are updated
> to either expect the additional warnings, or to suppress them where we
> aren't interested in them.
>
> libstdc++-v3/ChangeLog:
>
> PR libstdc++/91260
> PR libstdc++/91383
> PR libstdc++/95065
> * include/backward/binders.h (bind1st, bind2nd): Add deprecated
> attribute.
> * include/bits/refwrap.h (_Maybe_unary_or_binary_function):
> Disable deprecated warnings for base classes.
> (_Reference_wrapper_base): Likewise.
> * include/bits/shared_ptr_base.h (_Sp_owner_less): Likewise.
> * include/bits/stl_bvector.h (_Bit_iterator_base): Likewise.
> * include/bits/stl_function.h (unary_function, binary_function):
> Add deprecated attribute.
> (unary_negate, not1, binary_negate, not2, ptr_fun)
> (pointer_to_unary_function, pointer_to_binary_function)
> (mem_fun_t, const_mem_fun_t, mem_fun_ref_t, const_mem_fun_ref_t)
> (mem_fun1_t, const_mem_fun1_t, mem_fun_ref1_t)
> (const_mem_fun1_ref_t, mem_fun, mem_fun_ref): Add deprecated
> attributes.
> * include/bits/stl_iterator.h: Disable deprecated warnings for
> std::iterator base classes.
> * include/bits/stl_iterator_base_types.h (iterator): Add
> deprecated attribute.
> * include/bits/stl_map.h (map::value_compare): Disable
> deprecated warnings for base class.
> * include/bits/stl_multimap.h (multimap::value_compare):
> Likewise.
> * include/bits/stl_raw_storage_iter.h (raw_storage_iterator):
> Add deprecated attribute.
> * include/bits/stl_tempbuf.h (get_temporary_buffer): Likewise.
> * include/bits/stream_iterator.h: Disable deprecated warnings.
> * include/bits/streambuf_iterator.h: Likewise.
> * include/ext/bitmap_allocator.h: Remove unary_function base
> classes.
> * include/ext/functional: Disable deprecated warnings.
> * include/ext/rope: Likewise.
> * include/ext/throw_allocator.h: Likewise.
> * include/std/type_traits (result_of): Add deprecated attribute.
> * include/tr1/functional: Disable deprecated warnings.
> * include/tr1/functional_hash.h: Likewise.
> * testsuite/20_util/function_objects/binders/1.cc: Add
> -Wno-disable-deprecations.
> * testsuite/20_util/function_objects/binders/3113.cc: Likewise.
> * testsuite/20_util/function_objects/constexpr.cc: Add
> dg-warning.
> * testsuite/20_util/raw_storage_iterator/base.cc: Likewise.
> * testsuite/20_util/raw_storage_iterator/dr2127.cc: Likewise.
> *
> testsuite/20_util/raw_storage_iterator/requirements/base_classes.cc:
> Likewise.
> *
> testsuite/20_util/raw_storage_iterator/requirements/explicit_instantiation/1.cc:
> Likewise.
> * testsuite/20_util/raw_storage_iterator/requirements/typedefs.cc:
> Likewise.
> * testsuite/20_util/reference_wrapper/24803.cc:
> Likewise.
> * testsuite/20_util/reference_wrapper/typedefs.cc: Enable for
> C++20 and check for absence of nested types.
> * testsuite/20_util/shared_ptr/comparison/less.cc: Remove
> std::binary_function base class.
> * testsuite/20_util/temporary_buffer.cc: Add dg-warning.
> * testsuite/21_strings/basic_string/cons/char/69092.cc: Remove
>  

[PATCH] libstdc++: Fix 22_locale/numpunct/members/char/3.cc execution test

2022-01-14 Thread Uros Bizjak via Gcc-patches
The test fails on Fedora 33+ because nl_NL locale got thousands
separator defined.  Use one of ar_SA, bg_BG, bs_BA, pt_PT
or plain C locale instead.

2022-01-14  Uroš Bizjak  

libstdc++-v3/ChangeLog:

   * testsuite/22_locale/numpunct/members/char/3.cc (test02):
   Use pt_PT locale instead of nl_NL.

Tested on x86_64-linux-gnu (Fedora 33).

OK for master?

Uros.
diff --git a/libstdc++-v3/testsuite/22_locale/numpunct/members/char/3.cc 
b/libstdc++-v3/testsuite/22_locale/numpunct/members/char/3.cc
index 6e4959860cd..4cf4b24407d 100644
--- a/libstdc++-v3/testsuite/22_locale/numpunct/members/char/3.cc
+++ b/libstdc++-v3/testsuite/22_locale/numpunct/members/char/3.cc
@@ -28,8 +28,8 @@ void test02()
 {
   using namespace std;
 
-  // nl_NL chosen because it has no thousands separator (at this time).
-  locale loc_it = locale(ISO_8859(15,nl_NL));
+  // pt_PT chosen because it has no thousands separator (at this time).
+  locale loc_it = locale(ISO_8859(15,pt_PT));
 
   const numpunct& nump_it = use_facet >(loc_it); 
 


Re: [PATCH] [gfortran] Add support for allocate clause (OpenMP 5.0).

2022-01-14 Thread Tobias Burnus

Hi all,

On 14.01.22 10:10, Thomas Schwinge wrote:

+  integer  :: x
...
+  !$omp parallel allocate (0: x) private(x) ! { dg-error "Expected integer 
expression of the 'omp_allocator_handle_kind' kind at .1." }

We do for x86_64 default '-m64', but for '-m32' and '-mx32' compilation,
we're not seeing this latter diagnostic:
 FAIL: gfortran.dg/gomp/allocate-2.f90   -O   (test for errors, line 36)

I suppose the reason is unintended congruence of data types?  Would it
work to make 'x' a floating-point data type, for example -- or is this
meant to explicitly check certain integer data type characteristics?


Alternatively, you could use 'integer(kind=1)' (which is a 1-byte/8-bits
type.) I assume we do not have any platform which still uses 8-bit
pointers and supports libgomp :-)

Tobias

-
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 
München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas 
Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht 
München, HRB 106955


Re: [PATCH] [gfortran] Add support for allocate clause (OpenMP 5.0).

2022-01-14 Thread Jakub Jelinek via Gcc-patches
On Fri, Jan 14, 2022 at 12:45:54PM +0100, Tobias Burnus wrote:
> On 14.01.22 10:10, Thomas Schwinge wrote:
> > > +  integer  :: x
> > > ...
> > > +  !$omp parallel allocate (0: x) private(x) ! { dg-error "Expected 
> > > integer expression of the 'omp_allocator_handle_kind' kind at .1." }
> > We do for x86_64 default '-m64', but for '-m32' and '-mx32' compilation,
> > we're not seeing this latter diagnostic:
> >  FAIL: gfortran.dg/gomp/allocate-2.f90   -O   (test for errors, line 36)
> > 
> > I suppose the reason is unintended congruence of data types?  Would it
> > work to make 'x' a floating-point data type, for example -- or is this
> > meant to explicitly check certain integer data type characteristics?
> 
> Alternatively, you could use 'integer(kind=1)' (which is a 1-byte/8-bits
> type.) I assume we do not have any platform which still uses 8-bit
> pointers and supports libgomp :-)

If we want to check intptr_t, we should guard the dg-error with
"" { target { lp64 || llp64 } }
or so.
Otherwise yes, we can add some other kind and hope it is not the
same as omp_allocator_handle_kind.  Or we can do both,
keep the current one with the target lp64 || llp64 and
add another one with some integer(kind=1).

Jakub



Re: [PATCH] disable aggressive_loop_optimizations until niter ready

2022-01-14 Thread Richard Biener via Gcc-patches
On Fri, 14 Jan 2022, Jiufu Guo wrote:

> Richard Biener  writes:
> 
> > On Thu, 13 Jan 2022, guojiufu wrote:
> >
> >> On 2022-01-03 22:30, Richard Biener wrote:
> >> > On Wed, 22 Dec 2021, Jiufu Guo wrote:
> >> > 
> >> >> Hi,
> >> >> ...
> >> >> 
> >> >> Bootstrap and regtest pass on ppc64* and x86_64.  Is this ok for trunk?
> >> > 
> >> > So this is a optimality fix, not a correctness one?  I suppose the
> >> > estimates are computed/used from scev_probably_wraps_p via
> >> > loop_exits_before_overflow and ultimatively chrec_convert.
> >> > 
> >> > We have a call cycle here,
> >> > 
> >> > estimate_numbers_of_iterations -> number_of_latch_executions ->
> >> > ... -> estimate_numbers_of_iterations
> >> > 
> >> > where the first estimate_numbers_of_iterations will make sure
> >> > the later call will immediately return.
> >> 
> >> Hi Richard,
> >> Thanks for your comments! And sorry for the late reply.
> >> 
> >> In estimate_numbers_of_iterations, there is a guard to make sure
> >> the second call to estimate_numbers_of_iterations returns
> >> immediately.
> >> 
> >> Exactly as you said, it relates to scev_probably_wraps_p calls
> >> loop_exits_before_overflow.
> >> 
> >> The issue is: the first calling to estimate_numbers_of_iterations
> >> maybe inside number_of_latch_executions.
> >> 
> >> > 
> >> > I'm not sure what your patch tries to do - it seems to tackle
> >> > the case where we enter the cycle via number_of_latch_executions?
> >> > Why do we get "non-final" values?  idx_infer_loop_bounds resorts
> >> 
> >> Right, when the call cycle starts from number_of_latch_execution,
> >> the issue may occur:
> >> 
> >> number_of_latch_executions(*1st call)->..->
> >> analyze_scalar_evolution(IVs 1st) ->..follow_ssa_edge_expr..->
> >> loop_exits_before_overflow->
> >> estimate_numbers_of_iterations (*1st call)->
> >> number_of_latch_executions(*2nd call)->..->
> >> analyze_scalar_evolution(IVs 2nd)->..loop_exits_before_overflow->
> >> estimate_numbers_of_iterations(*2nd call)
> >> 
> >> The second calling to estimate_numbers_of_iterations returns quickly.
> >> And then, in the first calling to estimate_numbers_of_iterations,
> >> infer_loop_bounds_from_undefined is invoked.
> >> 
> >> And, function "infer_loop_bounds_from_undefined" instantiate/analyze
> >> SCEV for each SSA in the loop.
> >> *Here the issue occur*, these SCEVs are based on the interim IV's
> >> SCEV which come from "analyze_scalar_evolution(IVs 2nd)",
> >> and those IV's SCEV will be overridden by up level
> >> "analyze_scalar_evolution(IVs 1st)".
> >
> > OK, so indeed analyze_scalar_evolution is not protected against
> > recursive invocation on the same SSA name (though it definitely
> > doesn't expect to do that).  We could fix that by pre-seeding
> > the cache conservatively in analyze_scalar_evolution or by
> > not overwriting the cached result of the recursive invocation.
> >
> > But to re-iterate an unanswered question, is this a correctness issue
> > or an optimization issue?
> 
> Hi Richard,
> 
> Thanks for your time and patience on review this!
> 
> I would say it is an optimization issue for the current code,
> it does not fix known error.
> 
> The patch could help compiling-time.  Another benefit, this patch
> would be able to improve some scev(s) if the scev is cached in
> infer_loop_bounds_from_undefined under the call stack where IV's
> SCEV is under analyzing.
> 
> Yes, in analyze_scalar_evolution call chain, it may recursive on
> same SSA name.
> While outer level analyze_scalar_evolution 'may' get better
> chrec(POLYNOMIAL_CHREC), inner one may get other scev (e.g.
> conversion).  I'm even wondering this recursive is intended :).
> It may help to handle the chick-egg issue(wrap vs. niter).
> 
> >
> >> To handle this issue, disabling flag_aggressive_loop_optimizations
> >> inside number_of_latch_executions is one method.
> >> To avoid the issue in other cases, e.g. the call cycle starts from
> >> number_of_iterations_exit or number_of_iterations_exit_assumptions,
> >> this patch disable flag_aggressive_loop_optimizations inside
> >> number_of_iterations_exit_assumptions.
> >
> > But disabling flag_aggressive_loop_optimizations is a very
> > non-intuitive way of avoiding recursive calls.  I'd rather
> > avoid those in a similar way estimate_numbers_of_iterations does,
> > for example with
> >
> > diff --git a/gcc/tree-scalar-evolution.c b/gcc/tree-scalar-evolution.c
> > index 61d72c278a1..cc1e510b6c2 100644
> > --- a/gcc/tree-scalar-evolution.c
> > +++ b/gcc/tree-scalar-evolution.c
> > @@ -2807,7 +2807,7 @@ number_of_latch_executions (class loop *loop)
> >if (dump_file && (dump_flags & TDF_SCEV))
> >  fprintf (dump_file, "(number_of_iterations_in_loop = \n");
> >  
> > -  res = chrec_dont_know;
> > +  loop->nb_iterations = res = chrec_dont_know;
> >exit = single_exit (loop);
> >  
> >if (exit && number_of_iterations_exit (loop, exit, &niter_desc, false))
> >
> > though this doesn't seem to improve the SCEV analy

Re: [PATCH] libstdc++: Fix 22_locale/numpunct/members/char/3.cc execution test

2022-01-14 Thread Jonathan Wakely via Gcc-patches
On Fri, 14 Jan 2022 at 11:41, Uros Bizjak via Libstdc++ <
libstd...@gcc.gnu.org> wrote:

> The test fails on Fedora 33+ because nl_NL locale got thousands
> separator defined.  Use one of ar_SA, bg_BG, bs_BA, pt_PT
> or plain C locale instead.
>
> 2022-01-14  Uroš Bizjak  
>
> libstdc++-v3/ChangeLog:
>
>* testsuite/22_locale/numpunct/members/char/3.cc (test02):
>Use pt_PT locale instead of nl_NL.
>
> Tested on x86_64-linux-gnu (Fedora 33).
>
> OK for master?
>
>
Please also change the dg-require-namedlocale directive on the first line.

OK with that change, thanks.


Re: [PATCH] [gfortran] Add support for allocate clause (OpenMP 5.0).

2022-01-14 Thread Tobias Burnus

On 14.01.22 12:55, Jakub Jelinek via Fortran wrote:

If we want to check intptr_t, we should guard the dg-error with
"" { target { lp64 || llp64 } }
or so.


Well, if we want to use intptr_t, we could use be explicitly as with:

  use iso_c_binding, only: c_intptr_t
  ! use omp_lib, only: omp_allocator_handle_kind
  ...  ('implicit none' etc.)
  integer, parameter :: omp_allocator_handle_kind = c_intptr_t
  ...
  integer(kind=omp_allocator_handle_kind)

(@Abid: The 'use omp_lib' line is commented as in gcc/testsuite/*/gomp/,
the OpenMP module/header from libgomp is not available - and then a
stub parameter is created.)


Otherwise yes, we can add some other kind and hope it is not the
same as omp_allocator_handle_kind.  Or we can do both,
keep the current one with the target lp64 || llp64 and
add another one with some integer(kind=1).


For just testing something invalid, I think it makes more sense to just
set it to kind=1.

For checking the valid value, using c_intptr_t seems to make more sense
than restricting it to (l)l64.

Tobias

-
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 
München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas 
Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht 
München, HRB 106955


Re: [patch] Fix reverse SSO issues in IPA-SRA

2022-01-14 Thread Martin Jambor
Hi Eric,

On Tue, Jan 11 2022, Eric Botcazou via Gcc-patches wrote:
> Hi,
>
> we recently received the report that the IPA-SRA pass introduced in GCC 10 
> does not always play nice with the reverse scalar storage order that can be 
> used in structures/records/unions.  Reading the code, the pass apparently 
> correctly detects it but fails to propagate the information to the rewriting 
> phase in some cases and, in particular, does not stream it for LTO.
>
> The attached patch is a tentative fix for these issues spotted essentially by 
> code reading.  It also contains various minor tweaks left and right.
>
> Bootstrapped/regtested on x86-64/Linux, OK for mainline, 11 and 10 branches?
>
>
> 2022-01-11  Eric Botcazou  
>
>   * ipa-param-manipulation.c (ipa_dump_adjusted_parameters): Dump
>   reverse flag as "reverse" for the sake of consistency.
>   * ipa-sra.c: Fix copyright year.
>   (ipa_sra_function_summaries::duplicate): Copy the reverse flag.
>   (dump_isra_access): Remove confusing dump line.
>   (isra_write_node_summary): Write the reverse flag.
>   (isra_read_node_info): Read it.
>   (pull_accesses_from_callee): Copy it.

great, thanks, the patch is OK.

Martin


Re: [Patch][V4][Patch 2/2]Enable -Wuninitialized + -ftrivial-auto-var-init for address taken variables.

2022-01-14 Thread Richard Biener via Gcc-patches
On Thu, Jan 13, 2022 at 2:45 AM Qing Zhao  wrote:
>
> Hi, Richard,
>
> This is the updated version for the second patch, which is mainly the change 
> for "Enable -Wuninitialized + -ftrivial-auto-var-init for  address taken 
> variables”.
>
> In this update, I mainly made the following change:
>
> 1.  Delete “repl_var”, use the var_def_stmt, i.e, the call to .DEFERRED_INIT 
> to record the warning suppressed info.
> 2. Add and change the comments in multiple places to make the change more 
> readable.
>
> Now, for the deleted variable, we will get the necessary info to report 
> warning from the call to .DEFERRED_INIT.
>
> A. the name string of DECL from the 3rd parameter of the call;
> B. the location of the DECL from the location of the call;
> C. the call can also be used to hold the information on whether the 
> warning
>has been issued or not to suppress warning messages when needed;
>
> Please see the detailed description below for the problem and solution of 
> this patch.
>
> This patch has been bootstrapped and regressing tested on both X86 and 
> aarch64.
>
> Okay for GCC12?

I think the change to split "%qD is used uninitialized" is bad for i8n
though it seems
like none of the strings passed to warn_uninit are currently marked
for localization.
I suppose there's lazy matching with the exact same strings passed to
warning_at around like 641 but after your change those will no longer match up,
at least for the "%qs ..." case constructed.  I think a better way
(for i8n) would be
to pass down a flag whether it is "may" or "is" and have the full translatable
strings literally passed to warning_at at the expense of some code duplication.
Actually the extra flag is unnecessary, the OPT_W... we pass is already fully
specifying that.

Instead of doing

+  if (gimple_call_internal_p (var_def_stmt, IFN_DEFERRED_INIT))
+   {
+ /* Ignore the call to .DEFERRED_INIT that define the original
+var itself as the following case:
+   temp = .DEFERRED_INIT (4, 2, â~@~\alt_reloc");
+   alt_reloc = temp;
+In order to avoid generating warning for the fake usage
+at alt_reloc = temp.
...

I thought of many options, none really very appealing so I guess we have to
go with this for now.

So OK with the i8n thing sorted out - please post one hopefully last update
for the patch.

Thanks for your patience,
Richard.

> thanks.
>
> Qing.
>
> 
> Enable -Wuninitialized + -ftrivial-auto-var-init for address
>  taken variables.
>
> With -ftrivial-auto-var-init, the address taken auto variable is replaced with
> a temporary variable during gimplification, and the original auto variable 
> might
> be eliminated by compiler optimization completely. As a result, the current
> uninitialized warning analysis cannot get enough information from the IR,
> therefore the uninitialized warnings for address taken variable cannot be
> issued based on the current implemenation of -ftrival-auto-var-init.
>
> For more info please refer to:
> https://gcc.gnu.org/pipermail/gcc-patches/2021-August/577431.html
>
> In order to improve this situation, we can improve uninitialized analysis
> for address taken auto variables with -ftrivial-auto-var-init as following:
>
> for the following stmt:
>
> _1 = .DEFERRED_INIT (4, 2, &"alt_reloc"[0]);
> if (_1 != 0)
>
> The original variable DECL has been eliminated from the IR, all the necessary
> information that is needed for reporting the warnings for DECL can be acquired
> from the call to .DEFERRED_INIT.
>
> A. the name string of DECL from the 3rd parameter of the call;
> B. the location of the DECL from the location of the call;
> C. the call can also be used to hold the information on whether the 
> warning
>has been issued or not to suppress warning messages when needed;
>
> The current testing cases for uninitialized warnings + -ftrivial-auto-var-init
> are adjusted to reflect the fact that we can issue warnings for address taken
> variables.
>
> gcc/ChangeLog:
>
> 2022-01-12  qing zhao  
>
> * tree-ssa-uninit.c (warn_uninit): Handle .DEFERRED_INIT call with an
> anonymous SSA_NAME specially.
> (check_defs): Likewise.
> (warn_uninit_phi_uses): Adjust the message format for warn_uninit.
> (warn_uninitialized_vars): Likewise.
> (warn_uninitialized_phi): Likewise
>
> gcc/testsuite/ChangeLog:
>
> 2022-01-12  qing zhao  
>
> * gcc.dg/auto-init-uninit-16.c (testfunc): Delete xfail to reflect
> the fact that address taken variable can be warned.
> * gcc.dg/auto-init-uninit-34.c (warn_scalar_1): Likewise.
> (warn_scalar_2): Likewise.
> * gcc.dg/auto-init-uninit-37.c (T1): Likewise.
> (T2): Likewise.
> * gcc.dg/auto-init-uninit-B.c (baz): Likewise.
>
> The complete patch is attached:
>


[PATCH] testsuite: Remove executable mode from source files.

2022-01-14 Thread Martin Liška

Pushed to master as obvious.

Martin

gcc/testsuite/ChangeLog:

* g++.dg/vect/pr99149.cc: Remove executable mode.
* g++.dg/vect/pr99220.cc: Likewise.
* g++.target/i386/avx512vl-pr100738-1.C: Likewise.
* g++.target/i386/pr100738-1.C: Likewise.
* gcc.target/aarch64/advsimd-intrinsics/bfdot-1.c: Likewise.
* gcc.target/aarch64/advsimd-intrinsics/bfdot-2.c: Likewise.
* gcc.target/aarch64/advsimd-intrinsics/bfdot-3.c: Likewise.
* gcc.target/aarch64/advsimd-intrinsics/vdot-3-1.c: Likewise.
* gcc.target/aarch64/advsimd-intrinsics/vdot-3-2.c: Likewise.
* gcc.target/aarch64/advsimd-intrinsics/vdot-3-3.c: Likewise.
* gcc.target/aarch64/advsimd-intrinsics/vdot-3-4.c: Likewise.
* gcc.target/i386/pr101796-1.c: Likewise.
* gcc.target/i386/pr94790-1.c: Likewise.
* gcc.target/i386/pr94790-2.c: Likewise.
* gcc.target/powerpc/mma-double-test.c: Likewise.
* gcc.target/powerpc/mma-single-test.c: Likewise.
* gdc.test/fail_compilation/constraints_aggr.d: Likewise.
* gdc.test/fail_compilation/constraints_defs.d: Likewise.
* gdc.test/fail_compilation/constraints_func1.d: Likewise.
* gdc.test/fail_compilation/constraints_func2.d: Likewise.
* gdc.test/fail_compilation/constraints_func3.d: Likewise.
* gdc.test/fail_compilation/constraints_func4.d: Likewise.
* gdc.test/fail_compilation/constraints_tmpl.d: Likewise.
* gdc.test/fail_compilation/imports/constraints.d: Likewise.
---
 gcc/testsuite/g++.dg/vect/pr99149.cc   | 0
 gcc/testsuite/g++.dg/vect/pr99220.cc   | 0
 gcc/testsuite/g++.target/i386/avx512vl-pr100738-1.C| 0
 gcc/testsuite/g++.target/i386/pr100738-1.C | 0
 gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/bfdot-1.c  | 0
 gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/bfdot-2.c  | 0
 gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/bfdot-3.c  | 0
 gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vdot-3-1.c | 0
 gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vdot-3-2.c | 0
 gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vdot-3-3.c | 0
 gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vdot-3-4.c | 0
 gcc/testsuite/gcc.target/i386/pr101796-1.c | 0
 gcc/testsuite/gcc.target/i386/pr94790-1.c  | 0
 gcc/testsuite/gcc.target/i386/pr94790-2.c  | 0
 gcc/testsuite/gcc.target/powerpc/mma-double-test.c | 0
 gcc/testsuite/gcc.target/powerpc/mma-single-test.c | 0
 gcc/testsuite/gdc.test/fail_compilation/constraints_aggr.d | 0
 gcc/testsuite/gdc.test/fail_compilation/constraints_defs.d | 0
 gcc/testsuite/gdc.test/fail_compilation/constraints_func1.d| 0
 gcc/testsuite/gdc.test/fail_compilation/constraints_func2.d| 0
 gcc/testsuite/gdc.test/fail_compilation/constraints_func3.d| 0
 gcc/testsuite/gdc.test/fail_compilation/constraints_func4.d| 0
 gcc/testsuite/gdc.test/fail_compilation/constraints_tmpl.d | 0
 gcc/testsuite/gdc.test/fail_compilation/imports/constraints.d  | 0
 24 files changed, 0 insertions(+), 0 deletions(-)
 mode change 100755 => 100644 gcc/testsuite/g++.dg/vect/pr99149.cc
 mode change 100755 => 100644 gcc/testsuite/g++.dg/vect/pr99220.cc
 mode change 100755 => 100644 
gcc/testsuite/g++.target/i386/avx512vl-pr100738-1.C
 mode change 100755 => 100644 gcc/testsuite/g++.target/i386/pr100738-1.C
 mode change 100755 => 100644 
gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/bfdot-1.c
 mode change 100755 => 100644 
gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/bfdot-2.c
 mode change 100755 => 100644 
gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/bfdot-3.c
 mode change 100755 => 100644 
gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vdot-3-1.c
 mode change 100755 => 100644 
gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vdot-3-2.c
 mode change 100755 => 100644 
gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vdot-3-3.c
 mode change 100755 => 100644 
gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vdot-3-4.c
 mode change 100755 => 100644 gcc/testsuite/gcc.target/i386/pr101796-1.c
 mode change 100755 => 100644 gcc/testsuite/gcc.target/i386/pr94790-1.c
 mode change 100755 => 100644 gcc/testsuite/gcc.target/i386/pr94790-2.c
 mode change 100755 => 100644 gcc/testsuite/gcc.target/powerpc/mma-double-test.c
 mode change 100755 => 100644 gcc/testsuite/gcc.target/powerpc/mma-single-test.c
 mode change 100755 => 100644 
gcc/testsuite/gdc.test/fail_compilation/constraints_aggr.d
 mode change 100755 => 100644 
gcc/testsuite/gdc.test/fail_compilation/constraints_defs.d
 mode change 100755 => 100644 
gcc/testsuite/gdc.test/fail_compilation/constraints_func1.d
 mode change 100755 => 100644 
gcc/testsuite/gdc.test/fail_compilation/constraints_func2.d
 mode change 100755 => 100644 
gcc/testsuite/gdc.test/fail_compi

[committed] vect: Check partial vector param for supports_partial_vectors [PR104015]

2022-01-14 Thread Kewen.Lin via Gcc-patches
Hi,

As described in PR104015, the function partial_vectors_supported_p
mainly checks optabs for partial vectors support query, but we
still have one parameter param_vect_partial_vector_usage to control
the capability.

Power9 introduces vector with length instructions (for
len_load/len_store) but we don't enable partial vector on it by
default. It should be considered as not supporting partial vector by
default. This fix is to make the flag supports_partial_vectors check
param_vect_partial_vector_usage accordingly.

This was approved by Richard in the PR, thanks Richard and Andre for
the comments, pushed via r12-6582.

BR,
Kewen
-
gcc/ChangeLog:

PR tree-optimization/104015
* tree-vect-loop.c (vect_analyze_loop): Check
param_vect_partial_vector_usage for supports_partial_vectors.

---
diff --git a/gcc/tree-vect-loop.c b/gcc/tree-vect-loop.c
index ba67de490bb..49d53fb3383 100644
--- a/gcc/tree-vect-loop.c
+++ b/gcc/tree-vect-loop.c
@@ -3026,7 +3026,8 @@ vect_analyze_loop (class loop *loop, vec_info_shared 
*shared)
   vector_modes[0] = autodetected_vector_mode;
   mode_i = 0;

-  bool supports_partial_vectors = partial_vectors_supported_p ();
+  bool supports_partial_vectors =
+partial_vectors_supported_p () && param_vect_partial_vector_usage != 0;
   poly_uint64 first_vinfo_vf = LOOP_VINFO_VECT_FACTOR (first_loop_vinfo);

   while (1)



Re: [PATCH v3 00/15] ARM/MVE use vectors of boolean for predicates

2022-01-14 Thread Christophe Lyon via Gcc-patches
Hi,

I hadn't realized we are moving to stage 4 this week-end :-(

The PRs I'm fixing are P3, but without these fixes MVE support is badly
broken, so I think I would be really good to fix that before the buggy
version becomes part of an actual release.
Anyway I posted v1 of the patches during stage1, so it should still be OK
if they are accepted as-is ?

Thanks,

Christophe

On Thu, Jan 13, 2022 at 3:58 PM Christophe Lyon via Gcc-patches <
gcc-patches@gcc.gnu.org> wrote:

>
> This is v3 of this patch series, fixing issues I discovered before
> committing v2 (which had been approved).
>
> Thanks a lot to Richard Sandiford for his help.
>
> The changes v2 -> v3 are:
>
> Patch 4: Fix arm_hard_regno_nregs and CLASS_MAX_NREGS to support VPR.
>
> Patch 7: Changes to the underlying representation of vectors of
> booleans to account for the different expectations between AArch64/SVE
> and Arm/MVE.
>
> Patch 8: Re-use and extend existing thumb2_movhi* patterns instead of
> duplicating them in mve_mov. This requires the introduction of a
> new constraint to match a constant vector of booleans. Add a new RTL
> test.
>
> Patch 9: Introduce check_effective_target_arm_mve and skip
> gcc.dg/signbit-2.c, because with MVE there is no fallback architecture
> unlike SVE or AVX512.
>
> Patch 12: Update less load/store MVE builtins
> (mve_vldrdq_gather_base_z_v2di,
> mve_vldrdq_gather_offset_z_v2di,
> mve_vldrdq_gather_shifted_offset_z_v2di,
> mve_vstrdq_scatter_base_p_v2di,
> mve_vstrdq_scatter_offset_p_v2di,
> mve_vstrdq_scatter_offset_p_v2di_insn,
> mve_vstrdq_scatter_shifted_offset_p_v2di,
> mve_vstrdq_scatter_shifted_offset_p_v2di_insn,
> mve_vstrdq_scatter_base_wb_p_v2di,
> mve_vldrdq_gather_base_wb_z_v2di,
> mve_vldrdq_gather_base_nowb_z_v2di,
> mve_vldrdq_gather_base_wb_z_v2di_insn) for which we keep HI mode
> for vpr_register_operand.
>
> Patch 13: No need to update
> gcc.target/arm/acle/cde-mve-full-assembly.c anymore since we re-use
> the mov pattern that emits '@ movhi' in the assembly.
>
> Patch 15: This is a new patch to fix a problem I noticed during this
> v2->v3 update.
>
>
>
> I'll squash patch 2 with patch 9 and patch 3 with patch 8.
>
> Original text:
>
> This patch series addresses PR 100757 and 101325 by representing
> vectors of predicates (MVE VPR.P0 register) as vectors of booleans
> rather than using HImode.
>
> As this implies a lot of mostly mechanical changes, I have tried to
> split the patches in a way that should help reviewers, but the split
> is a bit artificial.
>
> Patches 1-3 add new tests.
>
> Patches 4-6 are small independent improvements.
>
> Patch 7 implements the predicate qualifier, but does not change any
> builtin yet.
>
> Patch 8 is the first of the two main patches, and uses the new
> qualifier to describe the vcmp and vpsel builtins that are useful for
> auto-vectorization of comparisons.
>
> Patch 9 is the second main patch, which fixes the vcond_mask expander.
>
> Patches 10-13 convert almost all the remaining builtins with HI
> operands to use the predicate qualifier.  After these, there are still
> a few builtins with HI operands left, about which I am not sure: vctp,
> vpnot, load-gather and store-scatter with v2di operands.  In fact,
> patches 11/12 update some STR/LDR qualifiers in a way that breaks
> these v2di builtins although existing tests still pass.
>
> Christophe Lyon (15):
>   arm: Add new tests for comparison vectorization with Neon and MVE
>   arm: Add tests for PR target/100757
>   arm: Add tests for PR target/101325
>   arm: Add GENERAL_AND_VPR_REGS regclass
>   arm: Add support for VPR_REG in arm_class_likely_spilled_p
>   arm: Fix mve_vmvnq_n_ argument mode
>   arm: Implement MVE predicates as vectors of booleans
>   arm: Implement auto-vectorized MVE comparisons with vectors of boolean
> predicates
>   arm: Fix vcond_mask expander for MVE (PR target/100757)
>   arm: Convert remaining MVE vcmp builtins to predicate qualifiers
>   arm: Convert more MVE builtins to predicate qualifiers
>   arm: Convert more load/store MVE builtins to predicate qualifiers
>   arm: Convert more MVE/CDE builtins to predicate qualifiers
>   arm: Add VPR_REG to ALL_REGS
>   arm: Fix constraint check for V8HI in mve_vector_mem_operand
>
>  gcc/config/aarch64/aarch64-modes.def  |   8 +-
>  gcc/config/arm/arm-builtins.c | 224 +++--
>  gcc/config/arm/arm-builtins.h |   4 +-
>  gcc/config/arm/arm-modes.def  |   8 +
>  gcc/config/arm/arm-protos.h   |   4 +-
>  gcc/config/arm/arm-simd-builtin-types.def |   4 +
>  gcc/config/arm/arm.c  | 169 ++--
>  gcc/config/arm/arm.h  |   9 +-
>  gcc/config/arm/arm_mve_builtins.def   | 746 
>  gcc/config/arm/constraints.md |   6 +
>  gcc/config/arm/iterators.md   |   6 +
>  gcc/config/arm/mve.md | 795 ++
>  gcc/config/arm/neon.md

Re: [PATCH] libcpp: Implement -Wbidi-chars for CVE-2021-42574 [PR103026]

2022-01-14 Thread Marek Polacek via Gcc-patches
On Fri, Jan 14, 2022 at 10:23:16AM +0100, Stephan Bergmann wrote:
> On 30/11/2021 16:27, Marek Polacek wrote:
> > I guess we were concerned with programs that generate other programs.
> > Maybe UCNs should be ignored by default.  There's still time to adjust
> > the behavior.
> 
> Is there any update on this?  Shall I file a bug?  As-is, -Wbidi-chars is
> unusable for building LibreOffice and (esp. in combination with
>  "C++ preprocessor
> ignores #pragma GCC diagnostic") has to be explicitly disabled globally.

No update, it wasn't clear to me what the action should be here.

Please do file a bug.  I think I'll just have to adjust the warning to ignore
UCNs.

Marek



Re: [PATCH v3 00/15] ARM/MVE use vectors of boolean for predicates

2022-01-14 Thread Richard Biener via Gcc-patches
On Fri, Jan 14, 2022 at 2:18 PM Christophe Lyon via Gcc-patches
 wrote:
>
> Hi,
>
> I hadn't realized we are moving to stage 4 this week-end :-(
>
> The PRs I'm fixing are P3, but without these fixes MVE support is badly
> broken, so I think I would be really good to fix that before the buggy
> version becomes part of an actual release.
> Anyway I posted v1 of the patches during stage1, so it should still be OK
> if they are accepted as-is ?

In the end it's up to the target maintainers to weight the risk of breakage
vs. the risk of not usefulness ;)  But stage3 is where the "was posted
during stage1"
rule can easily apply - at some point we have to stop with such general ruling.

Richard.

> Thanks,
>
> Christophe
>
> On Thu, Jan 13, 2022 at 3:58 PM Christophe Lyon via Gcc-patches <
> gcc-patches@gcc.gnu.org> wrote:
>
> >
> > This is v3 of this patch series, fixing issues I discovered before
> > committing v2 (which had been approved).
> >
> > Thanks a lot to Richard Sandiford for his help.
> >
> > The changes v2 -> v3 are:
> >
> > Patch 4: Fix arm_hard_regno_nregs and CLASS_MAX_NREGS to support VPR.
> >
> > Patch 7: Changes to the underlying representation of vectors of
> > booleans to account for the different expectations between AArch64/SVE
> > and Arm/MVE.
> >
> > Patch 8: Re-use and extend existing thumb2_movhi* patterns instead of
> > duplicating them in mve_mov. This requires the introduction of a
> > new constraint to match a constant vector of booleans. Add a new RTL
> > test.
> >
> > Patch 9: Introduce check_effective_target_arm_mve and skip
> > gcc.dg/signbit-2.c, because with MVE there is no fallback architecture
> > unlike SVE or AVX512.
> >
> > Patch 12: Update less load/store MVE builtins
> > (mve_vldrdq_gather_base_z_v2di,
> > mve_vldrdq_gather_offset_z_v2di,
> > mve_vldrdq_gather_shifted_offset_z_v2di,
> > mve_vstrdq_scatter_base_p_v2di,
> > mve_vstrdq_scatter_offset_p_v2di,
> > mve_vstrdq_scatter_offset_p_v2di_insn,
> > mve_vstrdq_scatter_shifted_offset_p_v2di,
> > mve_vstrdq_scatter_shifted_offset_p_v2di_insn,
> > mve_vstrdq_scatter_base_wb_p_v2di,
> > mve_vldrdq_gather_base_wb_z_v2di,
> > mve_vldrdq_gather_base_nowb_z_v2di,
> > mve_vldrdq_gather_base_wb_z_v2di_insn) for which we keep HI mode
> > for vpr_register_operand.
> >
> > Patch 13: No need to update
> > gcc.target/arm/acle/cde-mve-full-assembly.c anymore since we re-use
> > the mov pattern that emits '@ movhi' in the assembly.
> >
> > Patch 15: This is a new patch to fix a problem I noticed during this
> > v2->v3 update.
> >
> >
> >
> > I'll squash patch 2 with patch 9 and patch 3 with patch 8.
> >
> > Original text:
> >
> > This patch series addresses PR 100757 and 101325 by representing
> > vectors of predicates (MVE VPR.P0 register) as vectors of booleans
> > rather than using HImode.
> >
> > As this implies a lot of mostly mechanical changes, I have tried to
> > split the patches in a way that should help reviewers, but the split
> > is a bit artificial.
> >
> > Patches 1-3 add new tests.
> >
> > Patches 4-6 are small independent improvements.
> >
> > Patch 7 implements the predicate qualifier, but does not change any
> > builtin yet.
> >
> > Patch 8 is the first of the two main patches, and uses the new
> > qualifier to describe the vcmp and vpsel builtins that are useful for
> > auto-vectorization of comparisons.
> >
> > Patch 9 is the second main patch, which fixes the vcond_mask expander.
> >
> > Patches 10-13 convert almost all the remaining builtins with HI
> > operands to use the predicate qualifier.  After these, there are still
> > a few builtins with HI operands left, about which I am not sure: vctp,
> > vpnot, load-gather and store-scatter with v2di operands.  In fact,
> > patches 11/12 update some STR/LDR qualifiers in a way that breaks
> > these v2di builtins although existing tests still pass.
> >
> > Christophe Lyon (15):
> >   arm: Add new tests for comparison vectorization with Neon and MVE
> >   arm: Add tests for PR target/100757
> >   arm: Add tests for PR target/101325
> >   arm: Add GENERAL_AND_VPR_REGS regclass
> >   arm: Add support for VPR_REG in arm_class_likely_spilled_p
> >   arm: Fix mve_vmvnq_n_ argument mode
> >   arm: Implement MVE predicates as vectors of booleans
> >   arm: Implement auto-vectorized MVE comparisons with vectors of boolean
> > predicates
> >   arm: Fix vcond_mask expander for MVE (PR target/100757)
> >   arm: Convert remaining MVE vcmp builtins to predicate qualifiers
> >   arm: Convert more MVE builtins to predicate qualifiers
> >   arm: Convert more load/store MVE builtins to predicate qualifiers
> >   arm: Convert more MVE/CDE builtins to predicate qualifiers
> >   arm: Add VPR_REG to ALL_REGS
> >   arm: Fix constraint check for V8HI in mve_vector_mem_operand
> >
> >  gcc/config/aarch64/aarch64-modes.def  |   8 +-
> >  gcc/config/arm/arm-builtins.c | 224 +++--
> >  gcc/config/arm/arm-builtins.h |   4 +-
> >  gcc/config/arm/arm-m

Re: [PATCH] [i386] GLC tuning: Break false dependency for dest register.

2022-01-14 Thread Hongyu Wang via Gcc-patches
> Are there any technical obstacles to introduce subst to
> define_{,insn_and_}split?

gccint says: define_subst can be used only in define_insn and
define_expand, it cannot be used in other expressions (e.g. in
define_insn_and_split).
I have no idea how to implement it in current infrastructure.

> In the proposed patch, if the output register is also mentioned in the
> input, then clearing before insn will clear the value in the input
> register. The solution in the i386.md also takes care of this issue.
>

For this, I think we can add REGNO checks for operands in condition
(which means there is true dependency).

Uros Bizjak  于2022年1月14日周五 16:37写道:
>
> On Fri, Jan 14, 2022 at 7:11 AM Hongyu Wang  wrote:
> >
> > > > No, the approach is wrong. You have to solve output clearing on RTL
> > > > level, please look at how e.g. tzcnt false dep is solved:
> > >
> > > Actually we have considered such approach before, but we found we need
> > > to break original define_insn to remove the mask/rounding subst,
> > > since define_split could not adopt subst, and that would add 6 more
> > > define_insn_and_split and 4 define_insn for each instruction. We think
> > > such approach would introduce too much redundant code.
> > >
> > > Do you think the code size increment is acceptable?
> >
> > Also that 100+ more patterns increases maintenance effort. If we split
> > them at epilogue_complete stage,
> > it seems not much difference to put it under output template...
>
> In the proposed patch, if the output register is also mentioned in the
> input, then clearing before insn will clear the value in the input
> register. The solution in the i386.md also takes care of this issue.
>
> Uros.


Re: [PATCH] rs6000: Use known constant for GET_MODE_NUNITS and similar

2022-01-14 Thread David Edelsohn via Gcc-patches
On Fri, Jan 14, 2022 at 5:42 AM Kewen.Lin  wrote:
>
> on 2022/1/13 下午11:15, David Edelsohn wrote:
> > On Thu, Jan 13, 2022 at 7:40 AM Kewen.Lin  wrote:
> >>
> >> Hi David,
> >>
> >> on 2022/1/13 上午11:12, David Edelsohn wrote:
> >>> On Wed, Jan 12, 2022 at 8:56 PM Kewen.Lin  wrote:
> 
>  Hi,
> 
>  This patch is to clean up some codes with GET_MODE_UNIT_SIZE or
>  GET_MODE_NUNITS, which can use known constant instead.
> >>>
> >>> I'll let Segher decide, but often the additional code is useful
> >>> self-documentation instead of magic constants.  Or at least the change
> >>> requires comments documenting the derivation of the constants
> >>> currently described by the code itself.
> >>>
> >>
> >> Thanks for the comments, I added some comments as suggested, also removed
> >> the whole "altivec_vreveti2" since I noticed it's useless, it's not used
> >> by any built-in functions and even unused in the commit db042e1603db50573.
> >>
> >> The updated version has been tested as before.
> >
> > As we have discussed offline, the comments need to be clarified and 
> > expanded.
> >
> > And the removal of altivec_vreveti2 should be confirmed with Carl
> > Love, who added the pattern less than a year ago. There may be another
> > patch planning to use it.
> >
>
> Thanks for the suggestions David!  The comments have been updated, and Carl
> also helped to confirm the altivec_vreveti2 pattern is not planned for any
> future work and looks reasonable to remove.
>
> Does this updated version look good to you?

The revised patch is okay.

Thanks, David

>
> BR,
> Kewen
> -
> gcc/ChangeLog:
>
> * config/rs6000/altivec.md (altivec_vreveti2): Remove.
> * config/rs6000/vsx.md (*vsx_extract_si, 
> *vsx_extract_si_float_df,
> *vsx_extract_si_float_, *vsx_insert_extract_v4sf_p9): Use
> known constant values to simplify code.
> ---
>  gcc/config/rs6000/altivec.md | 25 -
>  gcc/config/rs6000/vsx.md | 16 
>  2 files changed, 12 insertions(+), 29 deletions(-)
>
> diff --git a/gcc/config/rs6000/altivec.md b/gcc/config/rs6000/altivec.md
> index 950b17862c4..4412175a0dc 100644
> --- a/gcc/config/rs6000/altivec.md
> +++ b/gcc/config/rs6000/altivec.md
> @@ -3950,31 +3950,6 @@
>DONE;
>  })
>
> -;; Vector reverse elements
> -(define_expand "altivec_vreveti2"
> -  [(set (match_operand:TI 0 "register_operand" "=v")
> -   (unspec:TI [(match_operand:TI 1 "register_operand" "v")]
> - UNSPEC_VREVEV))]
> -  "TARGET_ALTIVEC"
> -{
> -  int i, j, size, num_elements;
> -  rtvec v = rtvec_alloc (16);
> -  rtx mask = gen_reg_rtx (V16QImode);
> -
> -  size = GET_MODE_UNIT_SIZE (TImode);
> -  num_elements = GET_MODE_NUNITS (TImode);
> -
> -  for (j = 0; j < num_elements; j++)
> -for (i = 0; i < size; i++)
> -  RTVEC_ELT (v, i + j * size)
> -   = GEN_INT (i + (num_elements - 1 - j) * size);
> -
> -  emit_insn (gen_vec_initv16qiqi (mask, gen_rtx_PARALLEL (V16QImode, v)));
> -  emit_insn (gen_altivec_vperm_ti (operands[0], operands[1],
> -operands[1], mask));
> -  DONE;
> -})
> -
>  ;; Vector reverse elements for V16QI V8HI V4SI V4SF
>  (define_expand "altivec_vreve2"
>[(set (match_operand:VEC_K 0 "register_operand" "=v")
> diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md
> index acd729d1687..ee748ff4ebd 100644
> --- a/gcc/config/rs6000/vsx.md
> +++ b/gcc/config/rs6000/vsx.md
> @@ -3854,8 +3854,10 @@
>rtx vec_tmp = operands[3];
>int value;
>
> +  /* Adjust index for LE element ordering, the below minuend 3 is computed by
> + GET_MODE_NUNITS (V4SImode) - 1.  */
>if (!BYTES_BIG_ENDIAN)
> -element = GEN_INT (GET_MODE_NUNITS (V4SImode) - 1 - INTVAL (element));
> +element = GEN_INT (3 - INTVAL (element));
>
>/* If the value is in the correct position, we can avoid doing the VSPLT
>   instruction.  */
> @@ -4230,8 +4232,10 @@
>rtx v4si_tmp = operands[3];
>int value;
>
> +  /* Adjust index for LE element ordering, the below minuend 3 is computed by
> + GET_MODE_NUNITS (V4SImode) - 1.  */
>if (!BYTES_BIG_ENDIAN)
> -element = GEN_INT (GET_MODE_NUNITS (V4SImode) - 1 - INTVAL (element));
> +element = GEN_INT (3 - INTVAL (element));
>
>/* If the value is in the correct position, we can avoid doing the VSPLT
>   instruction.  */
> @@ -4273,8 +4277,10 @@
>rtx df_tmp = operands[4];
>int value;
>
> +  /* Adjust index for LE element ordering, the below minuend 3 is computed by
> + GET_MODE_NUNITS (V4SImode) - 1.  */
>if (!BYTES_BIG_ENDIAN)
> -element = GEN_INT (GET_MODE_NUNITS (V4SImode) - 1 - INTVAL (element));
> +element = GEN_INT (3 - INTVAL (element));
>
>/* If the value is in the correct position, we can avoid doing the VSPLT
>   instruction.  */
> @@ -4466,8 +4472,10 @@
>  {
>int ele = INTVAL (operands[4]);
>
> +  /* Adjust index for LE element ordering, the below minuend 3 is computed by
> + GET_MOD

[PATCH][pushed] testsuite: rename files in c-c++-common.

2022-01-14 Thread Martin Liška

Pushed, there should be only .c files in the sub-folder.

Martin

gcc/testsuite/ChangeLog:

* c-c++-common/Walloca-larger-than.C: Moved to...
* c-c++-common/Walloca-larger-than.c: ...here.
---
 .../c-c++-common/{Walloca-larger-than.C => Walloca-larger-than.c} | 0
 1 file changed, 0 insertions(+), 0 deletions(-)
 rename gcc/testsuite/c-c++-common/{Walloca-larger-than.C => 
Walloca-larger-than.c} (100%)

diff --git a/gcc/testsuite/c-c++-common/Walloca-larger-than.C 
b/gcc/testsuite/c-c++-common/Walloca-larger-than.c
similarity index 100%
rename from gcc/testsuite/c-c++-common/Walloca-larger-than.C
rename to gcc/testsuite/c-c++-common/Walloca-larger-than.c
--
2.34.1



RE: [PATCH v3 00/15] ARM/MVE use vectors of boolean for predicates

2022-01-14 Thread Kyrylo Tkachov via Gcc-patches
Hi Christophe, Richard,

> -Original Message-
> From: Gcc-patches  bounces+kyrylo.tkachov=arm@gcc.gnu.org> On Behalf Of Richard
> Biener via Gcc-patches
> Sent: Friday, January 14, 2022 1:33 PM
> To: Christophe Lyon 
> Cc: GCC Patches 
> Subject: Re: [PATCH v3 00/15] ARM/MVE use vectors of boolean for
> predicates
> 
> On Fri, Jan 14, 2022 at 2:18 PM Christophe Lyon via Gcc-patches
>  wrote:
> >
> > Hi,
> >
> > I hadn't realized we are moving to stage 4 this week-end :-(
> >
> > The PRs I'm fixing are P3, but without these fixes MVE support is badly
> > broken, so I think I would be really good to fix that before the buggy
> > version becomes part of an actual release.
> > Anyway I posted v1 of the patches during stage1, so it should still be OK
> > if they are accepted as-is ?
> 
> In the end it's up to the target maintainers to weight the risk of breakage
> vs. the risk of not usefulness ;)  But stage3 is where the "was posted
> during stage1"
> rule can easily apply - at some point we have to stop with such general 
> ruling.
> 

Thanks, that's in line with my interpretation.
These patches resolve some nasty brokenness in the MVE support that I'm keen to 
see fixed and from what I can tell the patches shouldn't have a large effect on 
non-MVE code.
So the risk vs reward balance for the arm port as a whole looks good to me.
Andre has kindly agreed to help review the patches and I'll also try to get to 
them today and next week so that we can get them in early stage4.

Thanks,
Kyrill

> Richard.
> 
> > Thanks,
> >
> > Christophe
> >
> > On Thu, Jan 13, 2022 at 3:58 PM Christophe Lyon via Gcc-patches <
> > gcc-patches@gcc.gnu.org> wrote:
> >
> > >
> > > This is v3 of this patch series, fixing issues I discovered before
> > > committing v2 (which had been approved).
> > >
> > > Thanks a lot to Richard Sandiford for his help.
> > >
> > > The changes v2 -> v3 are:
> > >
> > > Patch 4: Fix arm_hard_regno_nregs and CLASS_MAX_NREGS to support
> VPR.
> > >
> > > Patch 7: Changes to the underlying representation of vectors of
> > > booleans to account for the different expectations between AArch64/SVE
> > > and Arm/MVE.
> > >
> > > Patch 8: Re-use and extend existing thumb2_movhi* patterns instead of
> > > duplicating them in mve_mov. This requires the introduction of a
> > > new constraint to match a constant vector of booleans. Add a new RTL
> > > test.
> > >
> > > Patch 9: Introduce check_effective_target_arm_mve and skip
> > > gcc.dg/signbit-2.c, because with MVE there is no fallback architecture
> > > unlike SVE or AVX512.
> > >
> > > Patch 12: Update less load/store MVE builtins
> > > (mve_vldrdq_gather_base_z_v2di,
> > > mve_vldrdq_gather_offset_z_v2di,
> > > mve_vldrdq_gather_shifted_offset_z_v2di,
> > > mve_vstrdq_scatter_base_p_v2di,
> > > mve_vstrdq_scatter_offset_p_v2di,
> > > mve_vstrdq_scatter_offset_p_v2di_insn,
> > > mve_vstrdq_scatter_shifted_offset_p_v2di,
> > > mve_vstrdq_scatter_shifted_offset_p_v2di_insn,
> > > mve_vstrdq_scatter_base_wb_p_v2di,
> > > mve_vldrdq_gather_base_wb_z_v2di,
> > > mve_vldrdq_gather_base_nowb_z_v2di,
> > > mve_vldrdq_gather_base_wb_z_v2di_insn) for which we keep HI
> mode
> > > for vpr_register_operand.
> > >
> > > Patch 13: No need to update
> > > gcc.target/arm/acle/cde-mve-full-assembly.c anymore since we re-use
> > > the mov pattern that emits '@ movhi' in the assembly.
> > >
> > > Patch 15: This is a new patch to fix a problem I noticed during this
> > > v2->v3 update.
> > >
> > >
> > >
> > > I'll squash patch 2 with patch 9 and patch 3 with patch 8.
> > >
> > > Original text:
> > >
> > > This patch series addresses PR 100757 and 101325 by representing
> > > vectors of predicates (MVE VPR.P0 register) as vectors of booleans
> > > rather than using HImode.
> > >
> > > As this implies a lot of mostly mechanical changes, I have tried to
> > > split the patches in a way that should help reviewers, but the split
> > > is a bit artificial.
> > >
> > > Patches 1-3 add new tests.
> > >
> > > Patches 4-6 are small independent improvements.
> > >
> > > Patch 7 implements the predicate qualifier, but does not change any
> > > builtin yet.
> > >
> > > Patch 8 is the first of the two main patches, and uses the new
> > > qualifier to describe the vcmp and vpsel builtins that are useful for
> > > auto-vectorization of comparisons.
> > >
> > > Patch 9 is the second main patch, which fixes the vcond_mask expander.
> > >
> > > Patches 10-13 convert almost all the remaining builtins with HI
> > > operands to use the predicate qualifier.  After these, there are still
> > > a few builtins with HI operands left, about which I am not sure: vctp,
> > > vpnot, load-gather and store-scatter with v2di operands.  In fact,
> > > patches 11/12 update some STR/LDR qualifiers in a way that breaks
> > > these v2di builtins although existing tests still pass.
> > >
> > > Christophe Lyon (15):
> > >   arm: Add new tests for comparison vectorization with Neon and MVE
> > >   arm: Add te

Re: [PATCH] libcpp: Implement -Wbidi-chars for CVE-2021-42574 [PR103026]

2022-01-14 Thread Stephan Bergmann via Gcc-patches

On 14/01/2022 14:28, Marek Polacek wrote:

On Fri, Jan 14, 2022 at 10:23:16AM +0100, Stephan Bergmann wrote:

On 30/11/2021 16:27, Marek Polacek wrote:

I guess we were concerned with programs that generate other programs.
Maybe UCNs should be ignored by default.  There's still time to adjust
the behavior.


Is there any update on this?  Shall I file a bug?  As-is, -Wbidi-chars is
unusable for building LibreOffice and (esp. in combination with
 "C++ preprocessor
ignores #pragma GCC diagnostic") has to be explicitly disabled globally.


No update, it wasn't clear to me what the action should be here.

Please do file a bug.  I think I'll just have to adjust the warning to ignore
UCNs.


 "-Wbidi-chars 
should not warn about UCNs"




[PATCH] i386: Mark some of strict_low_part insn constraints earlyclobbered

2022-01-14 Thread Uros Bizjak via Gcc-patches
While there is practically impossible that input registers are matched
with in-out register, better mark the output operand of the split alternative
as earlyclobbered - we do output early to the output operand when
the insn is split.

2022-01-14  Uroš Bizjak  

gcc/ChangeLog:

* config/i386/i386.md (*add_1_slp"):
Mark alternative 1 output operand earlyclobbered.
(*sub_1_slp): Ditto.
(*and_1_slp): Ditto.
(*_1_slp): Ditto.
(*neg_1_slp): Ditto.
(*one_cmpl_1_slp): Ditto.
(*ashl3_1_slp): Ditto.
(*3_1_slp): Ditto.
(*3_1_slp): Ditto.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

Pushed to master.

Uros.
diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index c2acb1dbd90..bd965df79df 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -5902,7 +5902,7 @@
 
 ;; Alternative 1 is needed to work around LRA limitation, see PR82524.
 (define_insn_and_split "*add_1_slp"
-  [(set (strict_low_part (match_operand:SWI12 0 "register_operand" "+,"))
+  [(set (strict_low_part (match_operand:SWI12 0 "register_operand" 
"+,&"))
(plus:SWI12 (match_operand:SWI12 1 "nonimmediate_operand" "%0,!")
(match_operand:SWI12 2 "general_operand" "mn,mn")))
(clobber (reg:CC FLAGS_REG))]
@@ -6856,7 +6856,7 @@
 
 ;; Alternative 1 is needed to work around LRA limitation, see PR82524.
 (define_insn_and_split "*sub_1_slp"
-  [(set (strict_low_part (match_operand:SWI12 0 "register_operand" "+,"))
+  [(set (strict_low_part (match_operand:SWI12 0 "register_operand" 
"+,&"))
(minus:SWI12 (match_operand:SWI12 1 "register_operand" "0,!")
 (match_operand:SWI12 2 "general_operand" "mn,mn")))
(clobber (reg:CC FLAGS_REG))]
@@ -9905,7 +9905,7 @@
 
 ;; Alternative 1 is needed to work around LRA limitation, see PR82524.
 (define_insn_and_split "*and_1_slp"
-  [(set (strict_low_part (match_operand:SWI12 0 "register_operand" "+,"))
+  [(set (strict_low_part (match_operand:SWI12 0 "register_operand" 
"+,&"))
(and:SWI12 (match_operand:SWI12 1 "nonimmediate_operand" "%0,!")
   (match_operand:SWI12 2 "general_operand" "mn,mn")))
(clobber (reg:CC FLAGS_REG))]
@@ -10543,7 +10543,7 @@
 
 ;; Alternative 1 is needed to work around LRA limitation, see PR82524.
 (define_insn_and_split "*_1_slp"
-  [(set (strict_low_part (match_operand:SWI12 0 "register_operand" "+,"))
+  [(set (strict_low_part (match_operand:SWI12 0 "register_operand" 
"+,&"))
(any_or:SWI12 (match_operand:SWI12 1 "nonimmediate_operand" "%0,!")
  (match_operand:SWI12 2 "general_operand" "mn,mn")))
(clobber (reg:CC FLAGS_REG))]
@@ -10897,7 +10897,7 @@
 
 ;; Alternative 1 is needed to work around LRA limitation, see PR82524.
 (define_insn_and_split "*neg_1_slp"
-  [(set (strict_low_part (match_operand:SWI12 0 "register_operand" "+,"))
+  [(set (strict_low_part (match_operand:SWI12 0 "register_operand" 
"+,&"))
(neg:SWI12 (match_operand:SWI12 1 "register_operand" "0,!")))
(clobber (reg:CC FLAGS_REG))]
   "!TARGET_PARTIAL_REG_STALL || optimize_function_for_size_p (cfun)"
@@ -11490,7 +11490,7 @@
 
 ;; Alternative 1 is needed to work around LRA limitation, see PR82524.
 (define_insn_and_split "*one_cmpl_1_slp"
-  [(set (strict_low_part (match_operand:SWI12 0 "register_operand" "+,"))
+  [(set (strict_low_part (match_operand:SWI12 0 "register_operand" 
"+,&"))
(not:SWI12 (match_operand:SWI12 1 "register_operand" "0,!")))]
   "!TARGET_PARTIAL_REG_STALL || optimize_function_for_size_p (cfun)"
   "@
@@ -12186,7 +12186,7 @@
 
 ;; Alternative 1 is needed to work around LRA limitation, see PR82524.
 (define_insn_and_split "*ashl3_1_slp"
-  [(set (strict_low_part (match_operand:SWI12 0 "register_operand" "+,"))
+  [(set (strict_low_part (match_operand:SWI12 0 "register_operand" 
"+,&"))
(ashift:SWI12 (match_operand:SWI12 1 "register_operand" "0,!")
  (match_operand:QI 2 "nonmemory_operand" "cI,cI")))
(clobber (reg:CC FLAGS_REG))]
@@ -13063,7 +13063,7 @@
 
 ;; Alternative 1 is needed to work around LRA limitation, see PR82524.
 (define_insn_and_split "*3_1_slp"
-  [(set (strict_low_part (match_operand:SWI12 0 "register_operand" "+,"))
+  [(set (strict_low_part (match_operand:SWI12 0 "register_operand" 
"+,&"))
(any_shiftrt:SWI12 (match_operand:SWI12 1 "register_operand" "0,!")
   (match_operand:QI 2 "nonmemory_operand" "cI,cI")))
(clobber (reg:CC FLAGS_REG))]
@@ -13607,7 +13607,7 @@
 
 ;; Alternative 1 is needed to work around LRA limitation, see PR82524.
 (define_insn_and_split "*3_1_slp"
-  [(set (strict_low_part (match_operand:SWI12 0 "register_operand" "+,"))
+  [(set (strict_low_part (match_operand:SWI12 0 "register_operand" 
"+,&"))
(any_rotate:SWI12 (match_operand:SWI12 1 "register_operand" "0,!")
  (match_operand:QI 2 "nonmemory_operand" "cI,cI")))
(clobber (reg:CC FLAGS_REG))]


[PATCH][pushed] testsuite: fix dh-warning typo.

2022-01-14 Thread Martin Liška

gcc/testsuite/ChangeLog:

* c-c++-common/Walloca-larger-than.c: Fix typo.
---
 gcc/testsuite/c-c++-common/Walloca-larger-than.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/testsuite/c-c++-common/Walloca-larger-than.c 
b/gcc/testsuite/c-c++-common/Walloca-larger-than.c
index 464ea242a7a..e7851eabb46 100644
--- a/gcc/testsuite/c-c++-common/Walloca-larger-than.c
+++ b/gcc/testsuite/c-c++-common/Walloca-larger-than.c
@@ -16,6 +16,6 @@ void warn_alloca_too_large (void)
 {
   sink (alloca (1));
   sink (alloca (128));
-  sink (alloca (129));// { dh-warning "\\\[-Walloca-larger-than" }
-  sink (alloca (1024));   // { dh-warning "\\\[-Walloca-larger-than" }
+  sink (alloca (129));// { dg-warning "\\\[-Walloca-larger-than" }
+  sink (alloca (1024));   // { dg-warning "\\\[-Walloca-larger-than" }
 }
--
2.34.1



Re: [PATCH] git-backport: support renamed .cc files in commit message.

2022-01-14 Thread Martin Liška

On 1/14/22 08:44, Bernhard Reutner-Fischer wrote:

On Wed, 12 Jan 2022 16:54:46 +0100
Martin Liška  wrote:


+def replace_file_in_changelog(lines, filename):
+if not filename.endswith('.cc'):
+return
+
+# consider all componenets of a path: gcc/ipa-icf.cc
+while filename:
+for i, line in enumerate(lines):
+if filename in line:
+line = line.replace(filename, filename[:-1])
+lines[i] = line
+return
+parts = filename.split('/')
+if len(parts) == 1:
+return
+filename = '/'.join(parts[1:])
+


I think you mean os.sep instead of the hardcoded slash.
But i'd use os.path.split and os.path.join


Hello.

Well, these are all paths from a git commit message. And we require unix-style
of paths for all ChangeLog entries. So it should be correct.

Martin



thanks,




Re: [PATCH] testsuite: Robustify aarch64/simd tests against more aggressive DCE

2022-01-14 Thread Marc Poulhies via Gcc-patches
Eric Botcazou  writes:

>> The new variables seem to be unused, so I think slightly stronger
>> DCE could remove the calls even after the patch.  Perhaps the containing
>> functions should take an int32x4_t *ptr or something, with the calls
>> assigning to different ptr[] indices.
>
> We run a minimal DCE pass at -O0 in our compiler to eliminate all the garbage 
> generated by the gimplifier for variable-sized types (people care about code 
> size at -O0 in specific contexts) but it does not touch anything written by 
> the user (and debugging is unaffected of course).  Given that the builtins 
> are 
> pure functions and the arguments have no side effects, it eliminates the 
> calls, but adding a LHS blocks that because this minimal DCE pass preserves 
> anything user-related, in particular assignments to user variables.
>
>> I think it would be better to do that using new calls though,
>> and xfail the existing ones when they no longer work.  For example:
>> 
>>   /* { dg-error "lane -1 out of range 0 - 7" "" {target *-*-*} 0 } */
>>   vqdmlal_high_laneq_s16 (int32x4_a, int16x8_b, int16x8_c, -1);
>>   /* { dg-error "lane -1 out of range 0 - 7" "" {target *-*-*} 0 } */
>>   ptr[0] = vqdmlal_high_laneq_s16 (int32x4_a, int16x8_b, int16x8_c, -1);
>> 
>> That way we don't lose the existing tests.
>
> Frankly I'm not quite sure of what we can lose by adding a LHS here, can you 
> elaborate a bit?  We would need a solution that works out of the box with our 
> compiler in the future, i.e. without having to tweak 50 testcases again.

Hi Richard,

Thank for your reply !

As Éric, I'm also wondering why having LHS in the existing tests would
make us loose them. I guess I'm not familiar enough with this part of
the testsuite and I'm missing something.

Thanks,
Marc


gcc/configure: out of date

2022-01-14 Thread Martin Liška

Hello.

I noticed that when I run:
ACLOCAL=~/bin/automake-1.15.1/bin/aclocal  
AUTOMAKE=~/bin/automake-1.15.1/bin/automake autoconf

in gcc subfolder I get the following diff:

diff --git a/gcc/configure b/gcc/configure
index d19059e13cc..ff570f73ef5 100755
--- a/gcc/configure
+++ b/gcc/configure
@@ -5352,7 +5352,26 @@ else
   GDC="$ac_cv_prog_GDC"
 fi
 
+{ $as_echo "$as_me:${as_lineno-$LINENO}: checking whether the D compiler works" >&5

+$as_echo_n "checking whether the D compiler works... " >&6; }
+if ${acx_cv_d_compiler_works+:} false; then :
+  $as_echo_n "(cached) " >&6
+else
+  cat >conftest.d <&1 || echo failure`
+  if test x"$errors" = x && test -f conftest.$ac_objext; then
+acx_cv_d_compiler_works=yes
+  fi
+  rm -f conftest.*
+fi
+fi
+{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $acx_cv_d_compiler_works" >&5
+$as_echo "$acx_cv_d_compiler_works" >&6; }
+if test "x$GDC" != xno && test x$acx_cv_d_compiler_works != xno; then
   have_gdc=yes
 else
   have_gdc=no
@@ -19640,7 +19659,7 @@ else
   lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
   lt_status=$lt_dlunknown
   cat > conftest.$ac_ext <<_LT_EOF
-#line 19643 "configure"
+#line 19662 "configure"
 #include "confdefs.h"
 
 #if HAVE_DLFCN_H

@@ -19746,7 +19765,7 @@ else
   lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
   lt_status=$lt_dlunknown
   cat > conftest.$ac_ext <<_LT_EOF
-#line 19749 "configure"
+#line 19768 "configure"
 #include "confdefs.h"
 
 #if HAVE_DLFCN_H


Am I correct that somebody forgot to re-generate the file?

Martin


Re: [PATCH] [i386] GLC tuning: Break false dependency for dest register.

2022-01-14 Thread Uros Bizjak via Gcc-patches
On Fri, Jan 14, 2022 at 2:44 PM Hongyu Wang  wrote:
>
> > Are there any technical obstacles to introduce subst to
> > define_{,insn_and_}split?
>
> gccint says: define_subst can be used only in define_insn and
> define_expand, it cannot be used in other expressions (e.g. in
> define_insn_and_split).

Hm, hm ... annoying ...

> I have no idea how to implement it in current infrastructure.
>
> > In the proposed patch, if the output register is also mentioned in the
> > input, then clearing before insn will clear the value in the input
> > register. The solution in the i386.md also takes care of this issue.
> >
>
> For this, I think we can add REGNO checks for operands in condition
> (which means there is true dependency).

Let's go in your direction, considering the limitations of current
infrastructure.
+{
+  if (TARGET_DEST_FALSE_DEPENDENCY
+  && get_attr_dest_false_dep (insn) ==
+DEST_FALSE_DEP_TRUE)
+output_asm_insn ("vxorps\t{%x0, %x0, %x0}", operands);
+  return "vgetmant\t{%2, %1,
%0|%0, %1, %2}";
+}

There is no need to pass the information via attributes. IMO, you
shoud use subst attribute directly in the condition:

{
  if (TARGET_DEST_FALSE_DEPENDENCY
  && 
  && !reg_mentioned_p (operands[0], operands[1]))
   output_asm_insn ("vxorps\t{%x0, %x0, %x0}", operands);
  return ".";
}

Assume the above works, so please:

- rename TARGET_DEST_FALSE_DEPENDENCY to something less generic, maybe
following existing BMI example with TARGET_AVOID_FALSE_DEP_FOR_AVX512F
- rename "mask3_dest_false_dep_attr" to "mask3_false_dep_for_avx512f_cond"

Please note reg_mentioned_p in the above condition. This function
returns nonzero if register op0 appears somewhere within op1 and is
critical for the correct operation of your patch.

Uros.


Re: gcc/configure: out of date

2022-01-14 Thread Martin Liška

On 1/14/22 16:46, Martin Liška wrote:

Hello.

I noticed that when I run:
ACLOCAL=~/bin/automake-1.15.1/bin/aclocal  
AUTOMAKE=~/bin/automake-1.15.1/bin/automake autoconf

in gcc subfolder I get the following diff:


Have it, started with:


commit 7c6ae994fb587c19ca14aebe18dbc9aca83be609
Author: Iain Buclaw 
Date:   Thu Dec 16 23:56:16 2021 +0100

config: Add check whether D compiler works (PR103528)

As well as checking for the existence of a GDC compiler, also validate

that it has also been built with libphobos, otherwise warn or fail with
the message that GDC is required to build d.

config/ChangeLog:

PR d/103528

* acx.m4 (ACX_PROG_GDC): Add check whether D compiler works.

ChangeLog:

* configure: Regenerate.


I'm going to push the patch I pasted in the previous email.

Martin


Re: [Patch][V4][Patch 2/2]Enable -Wuninitialized + -ftrivial-auto-var-init for address taken variables.

2022-01-14 Thread Qing Zhao


> On Jan 14, 2022, at 6:45 AM, Richard Biener  
> wrote:
> 
> On Thu, Jan 13, 2022 at 2:45 AM Qing Zhao  wrote:
>> 
>> Hi, Richard,
>> 
>> This is the updated version for the second patch, which is mainly the change 
>> for "Enable -Wuninitialized + -ftrivial-auto-var-init for  address taken 
>> variables”.
>> 
>> In this update, I mainly made the following change:
>> 
>> 1.  Delete “repl_var”, use the var_def_stmt, i.e, the call to .DEFERRED_INIT 
>> to record the warning suppressed info.
>> 2. Add and change the comments in multiple places to make the change more 
>> readable.
>> 
>> Now, for the deleted variable, we will get the necessary info to report 
>> warning from the call to .DEFERRED_INIT.
>> 
>>A. the name string of DECL from the 3rd parameter of the call;
>>B. the location of the DECL from the location of the call;
>>C. the call can also be used to hold the information on whether the 
>> warning
>>   has been issued or not to suppress warning messages when needed;
>> 
>> Please see the detailed description below for the problem and solution of 
>> this patch.
>> 
>> This patch has been bootstrapped and regressing tested on both X86 and 
>> aarch64.
>> 
>> Okay for GCC12?
> 
> I think the change to split "%qD is used uninitialized" is bad for i8n
> though it seems
> like none of the strings passed to warn_uninit are currently marked
> for localization.
> I suppose there's lazy matching with the exact same strings passed to
> warning_at around like 641 but after your change those will no longer match 
> up,

Silly question: What does the above “641” mean?

> at least for the "%qs ..." case constructed.  I think a better way
> (for i8n) would be
> to pass down a flag whether it is "may" or "is" and have the full translatable
> strings literally passed to warning_at at the expense of some code 
> duplication.
> Actually the extra flag is unnecessary, the OPT_W... we pass is already fully
> specifying that.

The only issue with the above change is:  among the  4 calls to “warn_uninit” 
as following:

   if (use_stmt)
warn_uninit (OPT_Wuninitialized, def, SSA_NAME_VAR (def),
-"%qD is used uninitialized", use_stmt);
+"is used uninitialized", use_stmt);
 }
 }

@@ -932,10 +1015,10 @@ warn_uninitialized_vars (bool wmaybe_uninit)
  tree use = USE_FROM_PTR (use_p);
  if (wlims.always_executed)
warn_uninit (OPT_Wuninitialized, use, SSA_NAME_VAR (use),
-"%qD is used uninitialized", stmt);
+"is used uninitialized", stmt);
  else if (wmaybe_uninit)
warn_uninit (OPT_Wmaybe_uninitialized, use, SSA_NAME_VAR (use),
-"%qD may be used uninitialized", stmt);
+"may be used uninitialized", stmt);
}

  /* For limiting the alias walk below we count all
@@ -1182,7 +1265,7 @@ warn_uninitialized_phi (gphi *phi, vec *worklist,

   warn_uninit (OPT_Wmaybe_uninitialized, uninit_op,
   SSA_NAME_VAR (uninit_op),
-  "%qD may be used uninitialized in this function",
+  "may be used uninitialized in this function",
   uninit_use_stmt, loc);

All the strings passed map well with the OPT_W… except the last one, since the 
last one has an additional string “in this function” at the end.
However, since the last call has the last argument “loc” been NULL, maybe we 
can use this to distinguish.


> 
> Instead of doing
> 
> +  if (gimple_call_internal_p (var_def_stmt, IFN_DEFERRED_INIT))
> +   {
> + /* Ignore the call to .DEFERRED_INIT that define the original
> +var itself as the following case:
> +   temp = .DEFERRED_INIT (4, 2, â~@~\alt_reloc");
> +   alt_reloc = temp;
> +In order to avoid generating warning for the fake usage
> +at alt_reloc = temp.
> ...
> 
> I thought of many options, none really very appealing so I guess we have to
> go with this for now.

I agree with you, this is really ugly, I am not very comfortable myself either. 
But looks like no better way to resolve it….
> 
> So OK with the i8n thing sorted out - please post one hopefully last update
> for the patch.

Will do it.

> 
> Thanks for your patience,

Thank you!

Qing
> Richard.
> 
>> thanks.
>> 
>> Qing.
>> 
>> 
>> Enable -Wuninitialized + -ftrivial-auto-var-init for address
>> taken variables.
>> 
>> With -ftrivial-auto-var-init, the address taken auto variable is replaced 
>> with
>> a temporary variable during gimplification, and the original auto variable 
>> might
>> be eliminated by compiler optimization completely. As a result, the current
>> uninitialized warning analysis cannot get enough information from the IR,
>> therefore the uninitialized warnings for address taken variable cannot be
>> issued based on the current implemenation of -

[arm] MVE: Relax addressing modes for full loads and stores

2022-01-14 Thread Andre Vieira (lists) via Gcc-patches

Hi Christophe,

This patch relaxes the addressing modes for the mve full load and stores 
(by full loads and stores I mean non-widening or narrowing loads and 
stores resp). The code before was requiring a LO_REGNUM for these, where 
this is only a requirement if the load is widening or the store narrowing.


So with this your patch should not be necessary.

Regression tested on arm-none-eabi-gcc.  Can you please confirm this 
fixes the issue you were seeing too?


gcc/ChangeLog:

    * config/arm/arm.h (MVE_STN_LDW_MODE): New MACRO.
    * config/arm/arm.c (mve_vector_mem_operand): Relax constraint on
    base register for non widening loads or narrowing stores.


Kind Regards,
Andre Vieiradiff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h
index 
dacce2b7f086eeffb0cd36b26f102f77130a92fa..f39786d0f9e19e81841a45f6d7e92e408272fe23
 100644
--- a/gcc/config/arm/arm.h
+++ b/gcc/config/arm/arm.h
@@ -1099,6 +1099,10 @@ extern const int arm_arch_cde_coproc_bits[];
   ((MODE) == V2DImode ||(MODE) == V4SImode || (MODE) == V8HImode \
|| (MODE) == V16QImode)
 
+/* Modes used in MVE's narrowing stores or widening loads.  */
+#define MVE_STN_LDW_MODE(MODE) \
+  ((MODE) == V4QImode || (MODE) == V8QImode || (MODE) == V4HImode)
+
 #define VALID_MVE_SF_MODE(MODE) \
   ((MODE) == V8HFmode || (MODE) == V4SFmode || (MODE) == V2DFmode)
 
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 
bb75921f32df6185711d5304c044ce67a2d5671e..f5e09cb00b5478546d29c05cc885aeaa89501d39
 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -13438,27 +13438,28 @@ mve_vector_mem_operand (machine_mode mode, rtx op, 
bool strict)
  case E_V16QImode:
  case E_V8QImode:
  case E_V4QImode:
-   if (abs (val) <= 127)
- return (reg_no < LAST_ARM_REGNUM && reg_no != SP_REGNUM)
-   || reg_no >= FIRST_PSEUDO_REGISTER;
-   return FALSE;
+   if (abs (val) > 127)
+ return FALSE;
+   break;
  case E_V8HImode:
  case E_V8HFmode:
  case E_V4HImode:
  case E_V4HFmode:
-   if (val % 2 == 0 && abs (val) <= 254)
- return reg_no <= LAST_LO_REGNUM
-   || reg_no >= FIRST_PSEUDO_REGISTER;
-   return FALSE;
+   if (val % 2 != 0 || abs (val) > 254)
+ return FALSE;
+   break;
  case E_V4SImode:
  case E_V4SFmode:
-   if (val % 4 == 0 && abs (val) <= 508)
- return (reg_no < LAST_ARM_REGNUM && reg_no != SP_REGNUM)
-   || reg_no >= FIRST_PSEUDO_REGISTER;
-   return FALSE;
+   if (val % 4 != 0 || abs (val) > 508)
+ return FALSE;
+   break;
  default:
return FALSE;
}
+  return reg_no >= FIRST_PSEUDO_REGISTER
+   || (MVE_STN_LDW_MODE (mode)
+   ? reg_no <= LAST_LO_REGNUM
+   : (reg_no < LAST_ARM_REGNUM && reg_no != SP_REGNUM));
 }
   return FALSE;
 }


[COMMITTED] test to verify -Wformat-overflow uses context-sensitive ranges

2022-01-14 Thread Martin Sebor via Gcc-patches

Converting the strlen/sprintf pass to Ranger has considerably
improved the accuracy of -Wformat-overflow warnings: they can avoid
triggering for safe input even at -O0 while at the same time detect
provable overflow.  The conversion didn't come with any tests so in
r12-6591 I committed one that verifies both of these improvements.

https://gcc.gnu.org/pipermail/gcc-cvs/2022-January/359299.html

Martin


Re: [Patch][V4][Patch 2/2]Enable -Wuninitialized + -ftrivial-auto-var-init for address taken variables.

2022-01-14 Thread Martin Sebor via Gcc-patches

On 1/14/22 09:30, Qing Zhao wrote:




On Jan 14, 2022, at 6:45 AM, Richard Biener  wrote:

On Thu, Jan 13, 2022 at 2:45 AM Qing Zhao  wrote:


Hi, Richard,

This is the updated version for the second patch, which is mainly the change for 
"Enable -Wuninitialized + -ftrivial-auto-var-init for  address taken variables”.

In this update, I mainly made the following change:

1.  Delete “repl_var”, use the var_def_stmt, i.e, the call to .DEFERRED_INIT to 
record the warning suppressed info.
2. Add and change the comments in multiple places to make the change more 
readable.

Now, for the deleted variable, we will get the necessary info to report warning 
from the call to .DEFERRED_INIT.

A. the name string of DECL from the 3rd parameter of the call;
B. the location of the DECL from the location of the call;
C. the call can also be used to hold the information on whether the warning
   has been issued or not to suppress warning messages when needed;

Please see the detailed description below for the problem and solution of this 
patch.

This patch has been bootstrapped and regressing tested on both X86 and aarch64.

Okay for GCC12?


I think the change to split "%qD is used uninitialized" is bad for i8n
though it seems
like none of the strings passed to warn_uninit are currently marked
for localization.
I suppose there's lazy matching with the exact same strings passed to
warning_at around like 641 but after your change those will no longer match up,


Silly question: What does the above “641” mean?


At around line 641 :)




at least for the "%qs ..." case constructed.  I think a better way
(for i8n) would be
to pass down a flag whether it is "may" or "is" and have the full translatable
strings literally passed to warning_at at the expense of some code duplication.
Actually the extra flag is unnecessary, the OPT_W... we pass is already fully
specifying that.


The only issue with the above change is:  among the  4 calls to “warn_uninit” 
as following:

if (use_stmt)
 warn_uninit (OPT_Wuninitialized, def, SSA_NAME_VAR (def),
-"%qD is used uninitialized", use_stmt);
+"is used uninitialized", use_stmt);
  }
  }

@@ -932,10 +1015,10 @@ warn_uninitialized_vars (bool wmaybe_uninit)
   tree use = USE_FROM_PTR (use_p);
   if (wlims.always_executed)
 warn_uninit (OPT_Wuninitialized, use, SSA_NAME_VAR (use),
-"%qD is used uninitialized", stmt);
+"is used uninitialized", stmt);
   else if (wmaybe_uninit)
 warn_uninit (OPT_Wmaybe_uninitialized, use, SSA_NAME_VAR (use),
-"%qD may be used uninitialized", stmt);
+"may be used uninitialized", stmt);
 }

   /* For limiting the alias walk below we count all
@@ -1182,7 +1265,7 @@ warn_uninitialized_phi (gphi *phi, vec *worklist,

warn_uninit (OPT_Wmaybe_uninitialized, uninit_op,
SSA_NAME_VAR (uninit_op),
-  "%qD may be used uninitialized in this function",
+  "may be used uninitialized in this function",
uninit_use_stmt, loc);

All the strings passed map well with the OPT_W… except the last one, since the 
last one has an additional string “in this function” at the end.
However, since the last call has the last argument “loc” been NULL, maybe we 
can use this to distinguish.


Now that diagnostics are prefixed by something like "In function 'foo':
the "in this function" part is superfluous and could be removed from
all warning messages.

When there's no location (i.e., loc is UNKNOWN_LOCATION) the called
code tries to derive a location from the context.  When it can't, it
won't point anywhere useful, so if that case ever comes up here it
should probably be handled by using the location of the curly brace
closing the function definition.

Martin



Instead of doing

+  if (gimple_call_internal_p (var_def_stmt, IFN_DEFERRED_INIT))
+   {
+ /* Ignore the call to .DEFERRED_INIT that define the original
+var itself as the following case:
+   temp = .DEFERRED_INIT (4, 2, â~@~\alt_reloc");
+   alt_reloc = temp;
+In order to avoid generating warning for the fake usage
+at alt_reloc = temp.
...

I thought of many options, none really very appealing so I guess we have to
go with this for now.


I agree with you, this is really ugly, I am not very comfortable myself either. 
But looks like no better way to resolve it….


So OK with the i8n thing sorted out - please post one hopefully last update
for the patch.


Will do it.



Thanks for your patience,


Thank you!

Qing

Richard.


thanks.

Qing.


Enable -Wuninitialized + -ftrivial-auto-var-init for address
taken variables.

With -ftrivial-auto-var-init, the address taken auto variable is replaced with
a temp

Re: [RFC PATCH] tree-ssa-sink: do not sink to in front of setjmp

2022-01-14 Thread Alexander Monakov via Gcc-patches
> I approved the initial sink patch (maybe not clearly enough).

I wasn't entirely happy with that patch. The new version solves this better.

> Can you open
> a bugreport about the missing CFG verification and list the set of FAILs
> (all errors in some passes similar to the one you fixed in sinking I guess)?
> It indeed sounds like something to tackle during next stage1 (unless you
> already narrowed down the culprit to a single offender...)

Most of the failures were related to transactional memory, and the rest are
seemingly solved by forbidding duplication of returns_twice calls. In reply
to this email I'm sending three patches, the first is a revised patch for
tree-ssa-sink, the second forbids duplication of setjmp-like calls, and the
third implements the checks in verify_flow_info:

  tree-ssa-sink: do not sink to in front of setjmp
  tree-cfg: do not duplicate returns_twice calls
  tree-cfg: check placement of returns_twice calls

 gcc/testsuite/gcc.dg/setjmp-7.c | 13 +++
 gcc/tree-cfg.c  | 40 +++--
 gcc/tree-ssa-sink.c |  6 +
 3 files changed, 57 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/setjmp-7.c

-- 
2.33.1



[PATCH 1/3] tree-ssa-sink: do not sink to in front of setjmp

2022-01-14 Thread Alexander Monakov via Gcc-patches
gcc/ChangeLog:

* tree-ssa-sink.c (select_best_block): Punt if selected block
has incoming abnormal edges.

gcc/testsuite/ChangeLog:

* gcc.dg/setjmp-7.c: New test.
---
 gcc/testsuite/gcc.dg/setjmp-7.c | 13 +
 gcc/tree-ssa-sink.c |  6 ++
 2 files changed, 19 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/setjmp-7.c

diff --git a/gcc/testsuite/gcc.dg/setjmp-7.c b/gcc/testsuite/gcc.dg/setjmp-7.c
new file mode 100644
index 0..44b5bcbfa
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/setjmp-7.c
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fno-guess-branch-probability -w" } */
+/* { dg-require-effective-target indirect_jumps } */
+
+struct __jmp_buf_tag { };
+typedef struct __jmp_buf_tag jmp_buf[1];
+struct globals { jmp_buf listingbuf; };
+extern struct globals *const ptr_to_globals;
+void foo()
+{
+if ( _setjmp ( ((*ptr_to_globals).listingbuf )))
+   ;
+}
diff --git a/gcc/tree-ssa-sink.c b/gcc/tree-ssa-sink.c
index 66d7ae89e..016ecbaec 100644
--- a/gcc/tree-ssa-sink.c
+++ b/gcc/tree-ssa-sink.c
@@ -208,6 +208,12 @@ select_best_block (basic_block early_bb,
   temp_bb = get_immediate_dominator (CDI_DOMINATORS, temp_bb);
 }
 
+  /* Placing a statement before a setjmp-like function would be invalid
+ (it cannot be reevaluated when execution follows an abnormal edge).
+ If we selected a block with abnormal predecessors, just punt.  */
+  if (bb_has_abnormal_pred (best_bb))
+return early_bb;
+
   /* If we found a shallower loop nest, then we always consider that
  a win.  This will always give us the most control dependent block
  within that loop nest.  */
-- 
2.33.1



[PATCH 2/3] tree-cfg: do not duplicate returns_twice calls

2022-01-14 Thread Alexander Monakov via Gcc-patches
A returns_twice call may have associated abnormal edges that correspond
to the "second return" from the call. If the call is duplicated, the
copies of those edges also need to be abnormal, but e.g. tracer does not
enforce that. Just prohibit the (unlikely to be useful) duplication.

gcc/ChangeLog:

* tree-cfg.c (gimple_can_duplicate_bb_p): Reject blocks with
calls that may return twice.
---
 gcc/tree-cfg.c | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/gcc/tree-cfg.c b/gcc/tree-cfg.c
index b7fe313b7..a99f1acb4 100644
--- a/gcc/tree-cfg.c
+++ b/gcc/tree-cfg.c
@@ -6304,12 +6304,15 @@ gimple_can_duplicate_bb_p (const_basic_block bb)
 {
   gimple *g = gsi_stmt (gsi);
 
-  /* An IFN_GOMP_SIMT_ENTER_ALLOC/IFN_GOMP_SIMT_EXIT call must be
+  /* Prohibit duplication of returns_twice calls, otherwise associated
+abnormal edges also need to be duplicated properly.
+An IFN_GOMP_SIMT_ENTER_ALLOC/IFN_GOMP_SIMT_EXIT call must be
 duplicated as part of its group, or not at all.
 The IFN_GOMP_SIMT_VOTE_ANY and IFN_GOMP_SIMT_XCHG_* are part of such a
 group, so the same holds there.  */
   if (is_gimple_call (g)
- && (gimple_call_internal_p (g, IFN_GOMP_SIMT_ENTER_ALLOC)
+ && (gimple_call_flags (g) & ECF_RETURNS_TWICE
+ || gimple_call_internal_p (g, IFN_GOMP_SIMT_ENTER_ALLOC)
  || gimple_call_internal_p (g, IFN_GOMP_SIMT_EXIT)
  || gimple_call_internal_p (g, IFN_GOMP_SIMT_VOTE_ANY)
  || gimple_call_internal_p (g, IFN_GOMP_SIMT_XCHG_BFLY)
-- 
2.33.1



[PATCH 3/3] tree-cfg: check placement of returns_twice calls

2022-01-14 Thread Alexander Monakov via Gcc-patches
When a returns_twice call has an associated abnormal edge, the edge
corresponds to the "second return" from the call. It wouldn't make sense
if any executable statements appeared between the call and the
destination of the edge (they wouldn't be re-executed upon the "second
return"), so verify that.

gcc/ChangeLog:

* tree-cfg.c (gimple_verify_flow_info): Check placement of
returns_twice calls.
---
 gcc/tree-cfg.c | 33 +
 1 file changed, 33 insertions(+)

diff --git a/gcc/tree-cfg.c b/gcc/tree-cfg.c
index a99f1acb4..70bd31d3d 100644
--- a/gcc/tree-cfg.c
+++ b/gcc/tree-cfg.c
@@ -5644,6 +5644,7 @@ gimple_verify_flow_info (void)
}
 
   /* Verify that body of basic block BB is free of control flow.  */
+  bool seen_nondebug_stmt = false;
   for (; !gsi_end_p (gsi); gsi_next (&gsi))
{
  gimple *stmt = gsi_stmt (gsi);
@@ -5664,6 +5665,38 @@ gimple_verify_flow_info (void)
 gimple_label_label (label_stmt), bb->index);
  err = 1;
}
+
+ /* Check that no statements appear between a returns_twice call
+and its associated abnormal edge.  */
+ if (gimple_code (stmt) == GIMPLE_CALL
+ && gimple_call_flags (stmt) & ECF_RETURNS_TWICE)
+   {
+ const char *misplaced = NULL;
+ /* TM is an exception: it points abnormal edges just after the
+call that starts a transaction, i.e. it must end the BB.  */
+ if (gimple_call_builtin_p (stmt, BUILT_IN_TM_START))
+   {
+ if (single_succ_p (bb)
+ && bb_has_abnormal_pred (single_succ (bb))
+ && !gsi_one_nondebug_before_end_p (gsi))
+   misplaced = "not last";
+   }
+ else
+   {
+ if (seen_nondebug_stmt
+ && bb_has_abnormal_pred (bb))
+   misplaced = "not first";
+   }
+ if (misplaced)
+   {
+ error ("returns_twice call is %s in basic block %d",
+misplaced, bb->index);
+ print_gimple_stmt (stderr, stmt, 0, TDF_SLIM);
+ err = 1;
+   }
+   }
+ if (!is_gimple_debug (stmt))
+   seen_nondebug_stmt = true;
}
 
   gsi = gsi_last_nondebug_bb (bb);
-- 
2.33.1



Re: [Patch][V4][Patch 2/2]Enable -Wuninitialized + -ftrivial-auto-var-init for address taken variables.

2022-01-14 Thread Qing Zhao


> On Jan 14, 2022, at 12:11 PM, Martin Sebor  wrote:
> 
> On 1/14/22 09:30, Qing Zhao wrote:
>>> On Jan 14, 2022, at 6:45 AM, Richard Biener  
>>> wrote:
>>> 
>>> On Thu, Jan 13, 2022 at 2:45 AM Qing Zhao  wrote:
 
 Hi, Richard,
 
 This is the updated version for the second patch, which is mainly the 
 change for "Enable -Wuninitialized + -ftrivial-auto-var-init for  address 
 taken variables”.
 
 In this update, I mainly made the following change:
 
 1.  Delete “repl_var”, use the var_def_stmt, i.e, the call to 
 .DEFERRED_INIT to record the warning suppressed info.
 2. Add and change the comments in multiple places to make the change more 
 readable.
 
 Now, for the deleted variable, we will get the necessary info to report 
 warning from the call to .DEFERRED_INIT.
 
A. the name string of DECL from the 3rd parameter of the call;
B. the location of the DECL from the location of the call;
C. the call can also be used to hold the information on whether the 
 warning
   has been issued or not to suppress warning messages when needed;
 
 Please see the detailed description below for the problem and solution of 
 this patch.
 
 This patch has been bootstrapped and regressing tested on both X86 and 
 aarch64.
 
 Okay for GCC12?
>>> 
>>> I think the change to split "%qD is used uninitialized" is bad for i8n
>>> though it seems
>>> like none of the strings passed to warn_uninit are currently marked
>>> for localization.
>>> I suppose there's lazy matching with the exact same strings passed to
>>> warning_at around like 641 but after your change those will no longer match 
>>> up,
>> Silly question: What does the above “641” mean?
> 
> At around line 641 :)

I thought of this, but I didn’t find a meaningful statement around line 641 in 
tree-ssa-uninit.c…

 636 found_alloc = true;
 637   break;
 638 }
 639 
 640   if (!is_gimple_assign (def_stmt))
 641 break;
 642 
 643   tree_code code = gimple_assign_rhs_code (def_stmt);
 644   if (code != ADDR_EXPR && code != POINTER_PLUS_EXPR)
 645 break;
 
> 
>>> at least for the "%qs ..." case constructed.  I think a better way
>>> (for i8n) would be
>>> to pass down a flag whether it is "may" or "is" and have the full 
>>> translatable
>>> strings literally passed to warning_at at the expense of some code 
>>> duplication.
>>> Actually the extra flag is unnecessary, the OPT_W... we pass is already 
>>> fully
>>> specifying that.
>> The only issue with the above change is:  among the  4 calls to 
>> “warn_uninit” as following:
>>if (use_stmt)
>> warn_uninit (OPT_Wuninitialized, def, SSA_NAME_VAR (def),
>> -"%qD is used uninitialized", use_stmt);
>> +"is used uninitialized", use_stmt);
>>  }
>>  }
>> @@ -932,10 +1015,10 @@ warn_uninitialized_vars (bool wmaybe_uninit)
>>   tree use = USE_FROM_PTR (use_p);
>>   if (wlims.always_executed)
>> warn_uninit (OPT_Wuninitialized, use, SSA_NAME_VAR (use),
>> -"%qD is used uninitialized", stmt);
>> +"is used uninitialized", stmt);
>>   else if (wmaybe_uninit)
>> warn_uninit (OPT_Wmaybe_uninitialized, use, SSA_NAME_VAR 
>> (use),
>> -"%qD may be used uninitialized", stmt);
>> +"may be used uninitialized", stmt);
>> }
>>   /* For limiting the alias walk below we count all
>> @@ -1182,7 +1265,7 @@ warn_uninitialized_phi (gphi *phi, vec 
>> *worklist,
>>warn_uninit (OPT_Wmaybe_uninitialized, uninit_op,
>>SSA_NAME_VAR (uninit_op),
>> -  "%qD may be used uninitialized in this function",
>> +  "may be used uninitialized in this function",
>>uninit_use_stmt, loc);
>> All the strings passed map well with the OPT_W… except the last one, since 
>> the last one has an additional string “in this function” at the end.
>> However, since the last call has the last argument “loc” been NULL, maybe we 
>> can use this to distinguish.
> 
> Now that diagnostics are prefixed by something like "In function 'foo':
> the "in this function" part is superfluous and could be removed from
> all warning messages.

Okay, so, you mean that it’s safe to just remove “in this function” from the 
last callsite?

> 
> When there's no location (i.e., loc is UNKNOWN_LOCATION) the called
> code tries to derive a location from the context.  When it can't, it
> won't point anywhere useful, so if that case ever comes up here it
> should probably be handled by using the location of the curly brace
> closing the function definition.

You mean the following in the routine warn_uninit:

 /* Use either the location of the read statement or that of

Re: [PATCH] RISC-V: Document the degree of position independence that medany affords

2022-01-14 Thread Palmer Dabbelt

On Thu, 13 Jan 2022 21:54:45 PST (-0800), gcc-patches@gcc.gnu.org wrote:

On Fri, Jan 14, 2022 at 4:42 AM Palmer Dabbelt  wrote:


The code generated by -mcmodel=medany is defined to be
position-independent, but is not guarnteed to function correctly when
linked into position-independent executables or libraries.  See the
recent discussion at the psABI specification [1] for more details.

[1]: https://github.com/riscv-non-isa/riscv-elf-psabi-doc/issues/245

gcc/ChangeLog:

* doc/invoke.texi: Document the degree of position independence
that -mcmodel=medany affords.

Signed-off-by: Palmer Dabbelt 
---
 gcc/doc/invoke.texi | 4 
 1 file changed, 4 insertions(+)

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 5504971ea81..eaba12bb61f 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -27568,6 +27568,10 @@ Generate code for the medium-any code model. The 
program and its statically
 defined symbols must be within any single 2 GiB address range. Programs can be
 statically or dynamically linked.

+The code generated by the medium-any code model is position-independent, but is
+not guarnteed to function correctly when linked into position-independent
+executables or libraries.


Typo: guarnteed -> guaranteed


Thanks.  IDK why I still can't figure out how to spell that word ;)


I think it would be more helpful from a user perspective if a hint to
the solution
(i.e. use -fPIC) would be there. What about something like this:
"""
The code generated by the medium-any code model is position-independent.
However, to link such code into position-independent executables or libraries,
the corresponding flags to enable position-independent code generation
still need to be provided (e.g. -fPIC or -fPIE).
"""


I'd tried to make this more detailed, but everything I came up with was 
too verbose.  What you've got reads well, but it's not entirely 
accurate:


* A lot  of distros are defaulting to PIE now and there's "-fpic" and 
 "-fpie", so it's really more of a  "maybe pass -fpic, -fpie, -fPIE or 
 -fPIC but definitely don't pass the -no versions of those" 
 requirement.
* The code sequences generated by medany (without any of those PIC/pie 
 options enabled) can be linked into position-independent executables, 
 it's just might not result in symbol addresses that resolve as 
 expected.


Given that those constraints are well documented elsewhere (the GCC 
documentation for the arguments, and the psABI threads for the 
compiler/linker thing) I don't think it's strictly necessary to include 
everything here.  The rest of these blurbs are pretty terse and require 
users to understand these non-local definitions, so it's certainly in 
line with what's there.


That said, I do think this one is pretty tricky (it certainly tricked 
me), so if you or someone else has a better way of describing the actual 
constraints directly here then I'm all ears -- I couldn't come up with 
anything, though.


[As an aside, when I had the -fPIC blurb in medany I'd also added it to 
medlow as it's exactly the same there, but that also seemed too 
verbose.]



 @item -mexplicit-relocs
 @itemx -mno-exlicit-relocs
 Use or do not use assembler relocation operators when dealing with symbolic
--
2.32.0


[pushed] vect: Fix uninitialised variable PR104026

2022-01-14 Thread Richard Sandiford via Gcc-patches
As noted by Tobias in the PR, the loop_vec_info constructor wasn't
initializing the new partial_load_store_bias field.

Tested on aarch64-linux-gnu and pushed as obvious.

Richard


gcc/
PR middle-end/104026
* tree-vect-loop.c (_loop_vec_info::_loop_vec_info): Initialize
partial_load_store_bias.
---
 gcc/tree-vect-loop.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/gcc/tree-vect-loop.c b/gcc/tree-vect-loop.c
index 1dbdb9e5fa0..f1410b03fb3 100644
--- a/gcc/tree-vect-loop.c
+++ b/gcc/tree-vect-loop.c
@@ -846,6 +846,7 @@ _loop_vec_info::_loop_vec_info (class loop *loop_in, 
vec_info_shared *shared)
 can_use_partial_vectors_p (param_vect_partial_vector_usage != 0),
 using_partial_vectors_p (false),
 epil_using_partial_vectors_p (false),
+partial_load_store_bias (0),
 peeling_for_gaps (false),
 peeling_for_niter (false),
 no_data_dependencies (false),
-- 
2.25.1



Re: [Patch][V4][Patch 2/2]Enable -Wuninitialized + -ftrivial-auto-var-init for address taken variables.

2022-01-14 Thread Martin Sebor via Gcc-patches

On 1/14/22 11:29, Qing Zhao wrote:




On Jan 14, 2022, at 12:11 PM, Martin Sebor  wrote:

On 1/14/22 09:30, Qing Zhao wrote:

On Jan 14, 2022, at 6:45 AM, Richard Biener  wrote:

On Thu, Jan 13, 2022 at 2:45 AM Qing Zhao  wrote:


Hi, Richard,

This is the updated version for the second patch, which is mainly the change for 
"Enable -Wuninitialized + -ftrivial-auto-var-init for  address taken variables”.

In this update, I mainly made the following change:

1.  Delete “repl_var”, use the var_def_stmt, i.e, the call to .DEFERRED_INIT to 
record the warning suppressed info.
2. Add and change the comments in multiple places to make the change more 
readable.

Now, for the deleted variable, we will get the necessary info to report warning 
from the call to .DEFERRED_INIT.

A. the name string of DECL from the 3rd parameter of the call;
B. the location of the DECL from the location of the call;
C. the call can also be used to hold the information on whether the warning
   has been issued or not to suppress warning messages when needed;

Please see the detailed description below for the problem and solution of this 
patch.

This patch has been bootstrapped and regressing tested on both X86 and aarch64.

Okay for GCC12?


I think the change to split "%qD is used uninitialized" is bad for i8n
though it seems
like none of the strings passed to warn_uninit are currently marked
for localization.
I suppose there's lazy matching with the exact same strings passed to
warning_at around like 641 but after your change those will no longer match up,

Silly question: What does the above “641” mean?


At around line 641 :)


I thought of this, but I didn’t find a meaningful statement around line 641 in 
tree-ssa-uninit.c…

  636 found_alloc = true;
  637   break;
  638 }
  639
  640   if (!is_gimple_assign (def_stmt))
  641 break;
  642
  643   tree_code code = gimple_assign_rhs_code (def_stmt);
  644   if (code != ADDR_EXPR && code != POINTER_PLUS_EXPR)
  645 break;
  



at least for the "%qs ..." case constructed.  I think a better way
(for i8n) would be
to pass down a flag whether it is "may" or "is" and have the full translatable
strings literally passed to warning_at at the expense of some code duplication.
Actually the extra flag is unnecessary, the OPT_W... we pass is already fully
specifying that.

The only issue with the above change is:  among the  4 calls to “warn_uninit” 
as following:
if (use_stmt)
 warn_uninit (OPT_Wuninitialized, def, SSA_NAME_VAR (def),
-"%qD is used uninitialized", use_stmt);
+"is used uninitialized", use_stmt);
  }
  }
@@ -932,10 +1015,10 @@ warn_uninitialized_vars (bool wmaybe_uninit)
   tree use = USE_FROM_PTR (use_p);
   if (wlims.always_executed)
 warn_uninit (OPT_Wuninitialized, use, SSA_NAME_VAR (use),
-"%qD is used uninitialized", stmt);
+"is used uninitialized", stmt);
   else if (wmaybe_uninit)
 warn_uninit (OPT_Wmaybe_uninitialized, use, SSA_NAME_VAR (use),
-"%qD may be used uninitialized", stmt);
+"may be used uninitialized", stmt);
 }
   /* For limiting the alias walk below we count all
@@ -1182,7 +1265,7 @@ warn_uninitialized_phi (gphi *phi, vec *worklist,
warn_uninit (OPT_Wmaybe_uninitialized, uninit_op,
SSA_NAME_VAR (uninit_op),
-  "%qD may be used uninitialized in this function",
+  "may be used uninitialized in this function",
uninit_use_stmt, loc);
All the strings passed map well with the OPT_W… except the last one, since the 
last one has an additional string “in this function” at the end.
However, since the last call has the last argument “loc” been NULL, maybe we 
can use this to distinguish.


Now that diagnostics are prefixed by something like "In function 'foo':
the "in this function" part is superfluous and could be removed from
all warning messages.


Okay, so, you mean that it’s safe to just remove “in this function” from the 
last callsite?


Yes, I would remove it.  It might require some cleanup in the test
suite but I wouldn't expect it to be too bad.





When there's no location (i.e., loc is UNKNOWN_LOCATION) the called
code tries to derive a location from the context.  When it can't, it
won't point anywhere useful, so if that case ever comes up here it
should probably be handled by using the location of the curly brace
closing the function definition.


You mean the following in the routine warn_uninit:

  /* Use either the location of the read statement or that of the PHI
  argument, or that of the uninitialized variable, in that order,
  whichever is valid.  */
   location_t location = UNKNOWN_LOCATION;
   if (gimple_has_location 

Re: gcc/configure: out of date

2022-01-14 Thread Iain Buclaw via Gcc-patches
Excerpts from Martin Liška's message of Januar 14, 2022 4:51 pm:
> On 1/14/22 16:46, Martin Liška wrote:
>> Hello.
>> 
>> I noticed that when I run:
>> ACLOCAL=~/bin/automake-1.15.1/bin/aclocal  
>> AUTOMAKE=~/bin/automake-1.15.1/bin/automake autoconf
>> 
>> in gcc subfolder I get the following diff:
> 
> Have it, started with:
> 

Good spot! Thanks.

Iain.


Re: [PATCH] cprop_hardreg: Workaround for narrow mode != lowpart targets

2022-01-14 Thread Andreas Krebbel via Gcc-patches
On 1/14/22 08:37, Richard Biener wrote:
...
> Can the gist of this bug be put into the GCC bugzilla so the rev can
> refer to it? 
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104034

> Can we have a testcase even?
The testcase from Jakub is in the BZ. However, since it doesn't fail with head 
I didn't try to
include it in my patch.

> I'm not quite understanding the problem but is it that, say,
> 
>  (subreg:DI (reg:V2DI ..) 0)
> 
> isn't the same as
> 
>  (lowpart:DI (reg:V2DI ...) 0)

(reg:DI v0) does not match the lower order bits of (reg:TI v0)

> ?  The regcprop code looks more like asking whether the larger reg
> is a composition of multiple other hardregs and will return the specific
> hardreg corresponding to the lowpart - so like if on s390 the vector
> registers overlap with some other regset.  But then doing the actual
> accesses via the other regset regs doesn't actually work?  Isn't the
> backend then lying to us (aka the mode_change_ok returns the
> wrong answer)?

can_change_mode_class should do the right thing. We return false in case 
somebody wants to change TI
to DI for a vector register. However, the hook never gets called like this from 
regcprop. regcprop
only asks whether it is ok to change (reg:TI r8) to (reg:DI r8) and that's 
indeed ok.

Before cprop we have:

(insn 175 176 174 3 (set (reg/v:TI 16 %f0 [orig:69 __comp ] [69])
(reg:TI 8 %r8)) -1
 (nil))


(insn 155 124 156 3 (set (reg:DI 6 %r6 [ __comp ])
(reg:DI 16 %f0)) 1277 {*movdi_64}
 (nil))


(insn 156 155 128 3 (set (reg:DI 7 %r7 [orig:69 __comp+8 ] [69])
(unspec:DI [
(reg:V2DI 16 %f0)
(const_int 1 [0x1])
] UNSPEC_VEC_EXTRACT)) 409 {*vec_extractv2di}
 (expr_list:REG_DEAD (reg:V2DI 16 %f0)
(nil)))

So a copy of reg pair r8/r9 is kept in v0==f0. The problem comes from cprop 
assuming that (reg:DI
f0) refers to the low part of f0 and as a consequence replaces (reg:DI 16 %f0) 
with (reg:DI 9 %r9)
what would be the DImode lowpart of (reg:TI r8)

Insn 155 and 156 are the result of applying the following splitter:

; Split a VR -> GPR TImode move into 2 vector load GR from VR element.
; For the higher order bits we do simply a DImode move while the
; second part is done via vec extract.  Both will end up as vlgvg.
(define_split
  [(set (match_operand:TI 0 "register_operand" "")
(match_operand:TI 1 "register_operand" ""))]
  "TARGET_VX && reload_completed
   && GENERAL_REG_P (operands[0])
   && VECTOR_REG_P (operands[1])"
  [(set (match_dup 2) (match_dup 4))
   (set (match_dup 3) (unspec:DI [(match_dup 5) (const_int 1)]
 UNSPEC_VEC_EXTRACT))]
{
  operands[2] = operand_subword (operands[0], 0, 0, TImode);
  operands[3] = operand_subword (operands[0], 1, 0, TImode);
  operands[4] = gen_rtx_REG (DImode, REGNO (operands[1]));
  operands[5] = gen_rtx_REG (V2DImode, REGNO (operands[1]));
})

Introducing the (reg:DI 16 %f0) access to the TImode VR is something the middle 
end is not expected
to do - because we prevent it in can_change_mode_class. However, I don't see 
anything wrong with
doing that in the splitter. In our backend this is well-defined as being the 
first element in the
vector register - the high part of the TImode vector register value.

Unfortunately it confuses cprop :(

Andreas



> 
> How does the stage1 fix, aka "rewrite" of cprop, look like?  How can we
> be sure this hack isn't still present in 10 years from now?
> 
> Thanks,
> Richard.
> 
>> Bootstrapped and regression-tested on s390x.
>>
>> Ok?
>>
>> gcc/ChangeLog:
>>
>> * target.def (narrow_mode_refers_low_part_p): Add new target hook.
>> * config/s390/s390.c (s390_narrow_mode_refers_low_part_p):
>> Implement new target hook for IBM Z.
>> (TARGET_NARROW_MODE_REFERS_LOW_PART_P): New macro.
>> * regcprop.c (maybe_mode_change): Disable transformation depending
>> on the new target hook.
>> ---
>>  gcc/config/s390/s390.c | 14 ++
>>  gcc/regcprop.c |  3 ++-
>>  gcc/target.def | 12 +++-
>>  3 files changed, 27 insertions(+), 2 deletions(-)
>>
>> diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c
>> index 056002e4a4a..aafc6d63be6 100644
>> --- a/gcc/config/s390/s390.c
>> +++ b/gcc/config/s390/s390.c
>> @@ -10488,6 +10488,18 @@ s390_hard_regno_mode_ok (unsigned int regno, 
>> machine_mode mode)
>>return false;
>>  }
>>
>> +/* Implement TARGET_NARROW_MODE_REFERS_LOW_PART_P.  */
>> +
>> +static bool
>> +s390_narrow_mode_refers_low_part_p (unsigned int regno)
>> +{
>> +  if (reg_classes_intersect_p (VEC_REGS, REGNO_REG_CLASS (regno)))
>> +return false;
>> +
>> +  return true;
>> +}
>> +
>> +
>>  /* Implement TARGET_MODES_TIEABLE_P.  */
>>
>>  static bool
>> @@ -17472,6 +17484,8 @@ s390_vectorize_vec_perm_const (machine_mode vmode, 
>> rtx target, rtx op0, rtx op1,
>>  #undef TARGET_VECTORIZE_VEC_PERM_CONST
>>  #define TARGET_VECTORIZE_VEC_PERM_CONST

Re: [PATCH] cprop_hardreg: Workaround for narrow mode != lowpart targets

2022-01-14 Thread Andrew Pinski via Gcc-patches
On Fri, Jan 14, 2022 at 11:42 AM Andreas Krebbel via Gcc-patches
 wrote:
>
> On 1/14/22 08:37, Richard Biener wrote:
> ...
> > Can the gist of this bug be put into the GCC bugzilla so the rev can
> > refer to it?
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104034
>
> > Can we have a testcase even?
> The testcase from Jakub is in the BZ. However, since it doesn't fail with 
> head I didn't try to
> include it in my patch.

So I also noticed there was PR 101260 which had the exact same
analysis as this patch with respect to s390 and TI mode splitting and
regcprop.
So I marked PR 104034 as a duplicate of PR 101260.

Thanks,
Andrew

>
> > I'm not quite understanding the problem but is it that, say,
> >
> >  (subreg:DI (reg:V2DI ..) 0)
> >
> > isn't the same as
> >
> >  (lowpart:DI (reg:V2DI ...) 0)
>
> (reg:DI v0) does not match the lower order bits of (reg:TI v0)
>
> > ?  The regcprop code looks more like asking whether the larger reg
> > is a composition of multiple other hardregs and will return the specific
> > hardreg corresponding to the lowpart - so like if on s390 the vector
> > registers overlap with some other regset.  But then doing the actual
> > accesses via the other regset regs doesn't actually work?  Isn't the
> > backend then lying to us (aka the mode_change_ok returns the
> > wrong answer)?
>
> can_change_mode_class should do the right thing. We return false in case 
> somebody wants to change TI
> to DI for a vector register. However, the hook never gets called like this 
> from regcprop. regcprop
> only asks whether it is ok to change (reg:TI r8) to (reg:DI r8) and that's 
> indeed ok.
>
> Before cprop we have:
>
> (insn 175 176 174 3 (set (reg/v:TI 16 %f0 [orig:69 __comp ] [69])
> (reg:TI 8 %r8)) -1
>  (nil))
>
>
> (insn 155 124 156 3 (set (reg:DI 6 %r6 [ __comp ])
> (reg:DI 16 %f0)) 1277 {*movdi_64}
>  (nil))
>
>
> (insn 156 155 128 3 (set (reg:DI 7 %r7 [orig:69 __comp+8 ] [69])
> (unspec:DI [
> (reg:V2DI 16 %f0)
> (const_int 1 [0x1])
> ] UNSPEC_VEC_EXTRACT)) 409 {*vec_extractv2di}
>  (expr_list:REG_DEAD (reg:V2DI 16 %f0)
> (nil)))
>
> So a copy of reg pair r8/r9 is kept in v0==f0. The problem comes from cprop 
> assuming that (reg:DI
> f0) refers to the low part of f0 and as a consequence replaces (reg:DI 16 
> %f0) with (reg:DI 9 %r9)
> what would be the DImode lowpart of (reg:TI r8)
>
> Insn 155 and 156 are the result of applying the following splitter:
>
> ; Split a VR -> GPR TImode move into 2 vector load GR from VR element.
> ; For the higher order bits we do simply a DImode move while the
> ; second part is done via vec extract.  Both will end up as vlgvg.
> (define_split
>   [(set (match_operand:TI 0 "register_operand" "")
> (match_operand:TI 1 "register_operand" ""))]
>   "TARGET_VX && reload_completed
>&& GENERAL_REG_P (operands[0])
>&& VECTOR_REG_P (operands[1])"
>   [(set (match_dup 2) (match_dup 4))
>(set (match_dup 3) (unspec:DI [(match_dup 5) (const_int 1)]
>  UNSPEC_VEC_EXTRACT))]
> {
>   operands[2] = operand_subword (operands[0], 0, 0, TImode);
>   operands[3] = operand_subword (operands[0], 1, 0, TImode);
>   operands[4] = gen_rtx_REG (DImode, REGNO (operands[1]));
>   operands[5] = gen_rtx_REG (V2DImode, REGNO (operands[1]));
> })
>
> Introducing the (reg:DI 16 %f0) access to the TImode VR is something the 
> middle end is not expected
> to do - because we prevent it in can_change_mode_class. However, I don't see 
> anything wrong with
> doing that in the splitter. In our backend this is well-defined as being the 
> first element in the
> vector register - the high part of the TImode vector register value.
>
> Unfortunately it confuses cprop :(
>
> Andreas
>
>
>
> >
> > How does the stage1 fix, aka "rewrite" of cprop, look like?  How can we
> > be sure this hack isn't still present in 10 years from now?
> >
> > Thanks,
> > Richard.
> >
> >> Bootstrapped and regression-tested on s390x.
> >>
> >> Ok?
> >>
> >> gcc/ChangeLog:
> >>
> >> * target.def (narrow_mode_refers_low_part_p): Add new target hook.
> >> * config/s390/s390.c (s390_narrow_mode_refers_low_part_p):
> >> Implement new target hook for IBM Z.
> >> (TARGET_NARROW_MODE_REFERS_LOW_PART_P): New macro.
> >> * regcprop.c (maybe_mode_change): Disable transformation depending
> >> on the new target hook.
> >> ---
> >>  gcc/config/s390/s390.c | 14 ++
> >>  gcc/regcprop.c |  3 ++-
> >>  gcc/target.def | 12 +++-
> >>  3 files changed, 27 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c
> >> index 056002e4a4a..aafc6d63be6 100644
> >> --- a/gcc/config/s390/s390.c
> >> +++ b/gcc/config/s390/s390.c
> >> @@ -10488,6 +10488,18 @@ s390_hard_regno_mode_ok (unsigned int regno, 
> >> machine_mode mode)
> >>return false;
> >>  }
> >>
> >> +/* Implement 

Re: [PATCH] cprop_hardreg: Workaround for narrow mode != lowpart targets

2022-01-14 Thread Andreas Krebbel via Gcc-patches
On 1/14/22 20:41, Andreas Krebbel via Gcc-patches wrote:
> On 1/14/22 08:37, Richard Biener wrote:
> ...
>> Can the gist of this bug be put into the GCC bugzilla so the rev can
>> refer to it? 
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104034
> 
>> Can we have a testcase even?
> The testcase from Jakub is in the BZ. However, since it doesn't fail with 
> head I didn't try to
> include it in my patch.
> 
>> I'm not quite understanding the problem but is it that, say,
>>
>>  (subreg:DI (reg:V2DI ..) 0)
>>
>> isn't the same as
>>
>>  (lowpart:DI (reg:V2DI ...) 0)
> 
> (reg:DI v0) does not match the lower order bits of (reg:TI v0)
> 
>> ?  The regcprop code looks more like asking whether the larger reg
>> is a composition of multiple other hardregs and will return the specific
>> hardreg corresponding to the lowpart - so like if on s390 the vector
>> registers overlap with some other regset.  But then doing the actual
>> accesses via the other regset regs doesn't actually work?  Isn't the
>> backend then lying to us (aka the mode_change_ok returns the
>> wrong answer)?
> 
> can_change_mode_class should do the right thing. We return false in case 
> somebody wants to change TI
> to DI for a vector register. However, the hook never gets called like this 
> from regcprop. regcprop
> only asks whether it is ok to change (reg:TI r8) to (reg:DI r8) and that's 
> indeed ok.

After writing this I'm wondering whether this would be a better fix:

diff --git a/gcc/regcprop.c b/gcc/regcprop.c
index 18132425ab2..b6a3f4e3804 100644
--- a/gcc/regcprop.c
+++ b/gcc/regcprop.c
@@ -402,7 +402,8 @@ maybe_mode_change (machine_mode orig_mode, machine_mode 
copy_mode,

   if (orig_mode == new_mode)
 return gen_raw_REG (new_mode, regno);
-  else if (mode_change_ok (orig_mode, new_mode, regno))
+  else if (mode_change_ok (orig_mode, new_mode, regno)
+   && mode_change_ok (copy_mode, new_mode, copy_regno))
 {
   int copy_nregs = hard_regno_nregs (copy_regno, copy_mode);
   int use_nregs = hard_regno_nregs (copy_regno, new_mode);


Andreas


Re: [Patch][V4][Patch 2/2]Enable -Wuninitialized + -ftrivial-auto-var-init for address taken variables.

2022-01-14 Thread Qing Zhao via Gcc-patches


> On Jan 14, 2022, at 12:58 PM, Martin Sebor  wrote:
> 
> On 1/14/22 11:29, Qing Zhao wrote:
>>> On Jan 14, 2022, at 12:11 PM, Martin Sebor  wrote:
>>> 
>>> On 1/14/22 09:30, Qing Zhao wrote:
> On Jan 14, 2022, at 6:45 AM, Richard Biener  
> wrote:
> 
> On Thu, Jan 13, 2022 at 2:45 AM Qing Zhao  wrote:
>> 
>> Hi, Richard,
>> 
>> This is the updated version for the second patch, which is mainly the 
>> change for "Enable -Wuninitialized + -ftrivial-auto-var-init for  
>> address taken variables”.
>> 
>> In this update, I mainly made the following change:
>> 
>> 1.  Delete “repl_var”, use the var_def_stmt, i.e, the call to 
>> .DEFERRED_INIT to record the warning suppressed info.
>> 2. Add and change the comments in multiple places to make the change 
>> more readable.
>> 
>> Now, for the deleted variable, we will get the necessary info to report 
>> warning from the call to .DEFERRED_INIT.
>> 
>>A. the name string of DECL from the 3rd parameter of the call;
>>B. the location of the DECL from the location of the call;
>>C. the call can also be used to hold the information on whether the 
>> warning
>>   has been issued or not to suppress warning messages when needed;
>> 
>> Please see the detailed description below for the problem and solution 
>> of this patch.
>> 
>> This patch has been bootstrapped and regressing tested on both X86 and 
>> aarch64.
>> 
>> Okay for GCC12?
> 
> I think the change to split "%qD is used uninitialized" is bad for i8n
> though it seems
> like none of the strings passed to warn_uninit are currently marked
> for localization.
> I suppose there's lazy matching with the exact same strings passed to
> warning_at around like 641 but after your change those will no longer 
> match up,
 Silly question: What does the above “641” mean?
>>> 
>>> At around line 641 :)
>> I thought of this, but I didn’t find a meaningful statement around line 641 
>> in tree-ssa-uninit.c…
>>  636 found_alloc = true;
>>  637   break;
>>  638 }
>>  639
>>  640   if (!is_gimple_assign (def_stmt))
>>  641 break;
>>  642
>>  643   tree_code code = gimple_assign_rhs_code (def_stmt);
>>  644   if (code != ADDR_EXPR && code != POINTER_PLUS_EXPR)
>>  645 break;
>>  
>>> 
> at least for the "%qs ..." case constructed.  I think a better way
> (for i8n) would be
> to pass down a flag whether it is "may" or "is" and have the full 
> translatable
> strings literally passed to warning_at at the expense of some code 
> duplication.
> Actually the extra flag is unnecessary, the OPT_W... we pass is already 
> fully
> specifying that.
 The only issue with the above change is:  among the  4 calls to 
 “warn_uninit” as following:
if (use_stmt)
 warn_uninit (OPT_Wuninitialized, def, SSA_NAME_VAR (def),
 -"%qD is used uninitialized", use_stmt);
 +"is used uninitialized", use_stmt);
  }
  }
 @@ -932,10 +1015,10 @@ warn_uninitialized_vars (bool wmaybe_uninit)
   tree use = USE_FROM_PTR (use_p);
   if (wlims.always_executed)
 warn_uninit (OPT_Wuninitialized, use, SSA_NAME_VAR (use),
 -"%qD is used uninitialized", stmt);
 +"is used uninitialized", stmt);
   else if (wmaybe_uninit)
 warn_uninit (OPT_Wmaybe_uninitialized, use, SSA_NAME_VAR 
 (use),
 -"%qD may be used uninitialized", stmt);
 +"may be used uninitialized", stmt);
 }
   /* For limiting the alias walk below we count all
 @@ -1182,7 +1265,7 @@ warn_uninitialized_phi (gphi *phi, vec 
 *worklist,
warn_uninit (OPT_Wmaybe_uninitialized, uninit_op,
SSA_NAME_VAR (uninit_op),
 -  "%qD may be used uninitialized in this function",
 +  "may be used uninitialized in this function",
uninit_use_stmt, loc);
 All the strings passed map well with the OPT_W… except the last one, since 
 the last one has an additional string “in this function” at the end.
 However, since the last call has the last argument “loc” been NULL, maybe 
 we can use this to distinguish.
>>> 
>>> Now that diagnostics are prefixed by something like "In function 'foo':
>>> the "in this function" part is superfluous and could be removed from
>>> all warning messages.
>> Okay, so, you mean that it’s safe to just remove “in this function” from the 
>> last callsite?
> 
> Yes, I would remove it.  It might require some cleanup in the test
> suite but I wouldn't expect it to be 

[PATCH, committed] PR fortran/99256 - ICE in variable_check, at fortran/check.c:1012

2022-01-14 Thread Harald Anlauf via Gcc-patches
Dear all,

this is a rather satisfying mini-patch which removes code to fix a bug.

The intrinsics MOVE_ALLOC, C_F_POINTER, and C_F_PROCPOINTER require
deferred checks of part of their actual argument types which may be of
"any" type.  This however excludes alternate return specifiers which
therefore must be unconditionally rejected for all standard intrinsics.

OK'ed by Steve (see PR), and committed to mainline as obvious after
regtesting on x86_64-pc-linux-gnu.

Thanks,
Harald

From 70e24c9682ddbcade0301665bccd8e7f928d0082 Mon Sep 17 00:00:00 2001
From: Harald Anlauf 
Date: Fri, 14 Jan 2022 21:48:15 +0100
Subject: [PATCH] Fortran: always reject alternate return specifier as argument
 of intrinsics

The intrinsics MOVE_ALLOC, C_F_POINTER, and C_F_PROCPOINTER require
deferred checks of part of their actual argument types which may be of
"any" type.  This however excludes alternate return specifiers which
therefore must be unconditionally rejected for all standard intrinsics.

gcc/fortran/ChangeLog:

	PR fortran/99256
	* intrinsic.c: Do not check formal argument type when checking
	arguments of intrinsics for alternate return specifiers.

gcc/testsuite/ChangeLog:

	PR fortran/99256
	* gfortran.dg/altreturn_11.f90: New test.
---
 gcc/fortran/intrinsic.c|  2 +-
 gcc/testsuite/gfortran.dg/altreturn_11.f90 | 15 +++
 2 files changed, 16 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gfortran.dg/altreturn_11.f90

diff --git a/gcc/fortran/intrinsic.c b/gcc/fortran/intrinsic.c
index a7ecdb401ef..9746cd5ddb6 100644
--- a/gcc/fortran/intrinsic.c
+++ b/gcc/fortran/intrinsic.c
@@ -4420,7 +4420,7 @@ do_sort:
   FOR_EACH_VEC_ELT (dummy_args, idx, f)
 {
   a = ordered_actual_args[idx];
-  if (a && a->label != NULL && f->ts.type)
+  if (a && a->label != NULL)
 	{
 	  gfc_error ("ALTERNATE RETURN not permitted at %L", where);
 	  return false;
diff --git a/gcc/testsuite/gfortran.dg/altreturn_11.f90 b/gcc/testsuite/gfortran.dg/altreturn_11.f90
new file mode 100644
index 000..be42971d781
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/altreturn_11.f90
@@ -0,0 +1,15 @@
+! { dg-do compile }
+! { dg-prune-output "Obsolescent feature: Alternate-return argument" }
+! PR fortran/99256 - ICE in variable_check
+! Contributed by G.Steimetz
+
+program test
+  use iso_c_binding
+  type(c_ptr):: i
+  type(c_funptr) :: p
+  call move_alloc (*1, *2) ! { dg-error "ALTERNATE RETURN" }
+  call c_f_pointer (i, *1) ! { dg-error "ALTERNATE RETURN" }
+  call c_f_procpointer (p, *2) ! { dg-error "ALTERNATE RETURN" }
+1 continue
+2 stop
+end
--
2.31.1



Re: [PATCH 11/11] libstdc++: Fix ODR issues with different -m flags

2022-01-14 Thread Jonathan Wakely via Gcc-patches
On Mon, 15 Nov 2021 at 08:57, Matthias Kretz  wrote:

> ping. OK to push?
>

Sorry for the delay - this is OK for trunk.



> On Tuesday, 8 June 2021 14:12:23 CET Matthias Kretz wrote:
> > From: Matthias Kretz 
> >
> > Explicitly support use of the stdx::simd implementation in situations
> > where the user links TUs that were compiled with different -m flags. In
> > general, this is always a (quasi) ODR violation for inline functions
> > because at least codegen may differ in important ways. However, in the
> > resulting executable only one (unspecified which one) of them might be
> > used. For simd we want to support users to compile code multiple times,
> > with different -m flags and have a runtime dispatch to the TU matching
> > the target CPU. But if internal functions are not inlined this may lead
> > to unexpected performance loss or execution of illegal instructions.
> > Therefore, inline functions that are not marked as always_inline must
> > use an additional template parameter somewhere in their name, to
> > disambiguate between the different -m translations.
> >
> > Signed-off-by: Matthias Kretz 
> >
> > libstdc++-v3/ChangeLog:
> >
> >   * include/experimental/bits/simd.h: Move feature detection bools
> >   and add __have_avx512bitalg, __have_avx512vbmi2,
> >   __have_avx512vbmi, __have_avx512ifma, __have_avx512cd,
> >   __have_avx512vnni, __have_avx512vpopcntdq.
> >   (__detail::__machine_flags): New function which returns a unique
> >   uint64 depending on relevant -m and -f flags.
> >   (__detail::__odr_helper): New type alias for either an anonymous
> >   type or a type specialized with the __machine_flags number.
> >   (_SimdIntOperators): Change template parameters from _Impl to
> >   _Tp, _Abi because _Impl now has an __odr_helper parameter which
> >   may be _OdrEnforcer from the anonymous namespace, which makes
> >   for a bad base class.
> >   (many): Either add __odr_helper template parameter or mark as
> >   always_inline.
> >   * include/experimental/bits/simd_detail.h: Add defines for
> >   AVX512BITALG, AVX512VBMI2, AVX512VBMI, AVX512IFMA, AVX512CD,
> >   AVX512VNNI, AVX512VPOPCNTDQ, and AVX512VP2INTERSECT.
> >   * include/experimental/bits/simd_builtin.h: Add __odr_helper
> >   template parameter or mark as always_inline.
> >   * include/experimental/bits/simd_fixed_size.h: Ditto.
> >   * include/experimental/bits/simd_math.h: Ditto.
> >   * include/experimental/bits/simd_scalar.h: Ditto.
> >   * include/experimental/bits/simd_neon.h: Add __odr_helper
> >   template parameter.
> >   * include/experimental/bits/simd_ppc.h: Ditto.
> >   * include/experimental/bits/simd_x86.h: Ditto.
> > ---
> >  libstdc++-v3/include/experimental/bits/simd.h | 380 --
> >  .../include/experimental/bits/simd_builtin.h  |  41 +-
> >  .../include/experimental/bits/simd_detail.h   |  40 ++
> >  .../experimental/bits/simd_fixed_size.h   |  39 +-
> >  .../include/experimental/bits/simd_math.h |  45 ++-
> >  .../include/experimental/bits/simd_neon.h |   4 +-
> >  .../include/experimental/bits/simd_ppc.h  |   4 +-
> >  .../include/experimental/bits/simd_scalar.h   |  71 +++-
> >  .../include/experimental/bits/simd_x86.h  |   4 +-
> >  9 files changed, 440 insertions(+), 188 deletions(-)
>
> --
> ──
>  Dr. Matthias Kretz   https://mattkretz.github.io
>  GSI Helmholtz Centre for Heavy Ion Research   https://gsi.de
>  stdₓ::simd
> ──
>
>
>
>


Re: [PATCH v2 RFC] c++: add color to function decl printing

2022-01-14 Thread Jonathan Wakely via Gcc-patches
On Fri, 14 Jan 2022 at 21:49, David Malcolm wrote:

>
> Jonathan, did you try the v2 patch?
>

No, sorry.


Re: [PATCH] libstdc++: Implement C++20 atomic and atomic

2022-01-14 Thread Jonathan Wakely via Gcc-patches
On Sat, 18 Sept 2021 at 05:12, Thomas Rodgers 
wrote:

> From: Thomas Rodgers 
>
> Let's try this one instead.
>
> Signed-off-by: Thomas Rodgers 
>

If you're doing DCO "Signed-off-by" commits you don't need FSF copyright
notices in the new tests.

I no longer put any copyright notices in the tests, because it's largely
pointless. They're usually not interesting and don't do anything that
anybody is going to want to steal to incorporate into non-GPL code. Your
new 20_util/shared_ptr/atomic/atomic_shared_ptr.cc is non-trivial, and
maybe interesting, but the two that just check feature test macros are not
worth putting licence headers and copyright notices on.



> libstdc++-v3/ChangeLog:
> * acinclude.m4: Update ABI version.
> * config/abi/pre/gnu.ver (GLIBCXX_3.4.21): Do not match new
> _Sp_locker
> constructor.
> (GLIBCXX_3.4.30): Export _Sp_locker::_M_wait/_M_notify and new
> constructor.
> * include/bits/shared_ptr_atomic.h: define
> __cpp_lib_atomic_shared_ptr
> feature test macro.
> (_Sp_locker::_Sp_locker(const void*, bool): New constructor.
> (_Sp_locker::_M_wait()), _Sp_locker::_M_notify()): New methods.
> (_Sp_impl): New type.
> (atomic>): New partial template specialization.
> (atomic>): New partial template specialization.
> * include/std/version: define __cpp_lib_atomic_shared_ptr feature
> test macro.
> * doc/xml/manual/abi.xml: New ABI version.
> * src/c++11/Makefile.am: Compile src/c++11/shared_ptr.cc
> -std=gnu++20.
> * src/c++11/Makefile.in: Regenerate.
> * src/c++11/shared_ptr.cc (_Sp_locker::_Sp_locker(const void*,
> bool),
> _Sp_locker::_M_wait(), _Sp_locker::_M_notify(): Implement.
> * testsuite/20_util/shared_ptr/atomic/4.cc: New test.
> * testsuite/20_util/shared_ptr/atomic/5.cc: Likewise.
> * testsuite/20_util/shared_ptr/atomic/atomic_shared_ptr.cc:
> Likewise.
> * testuite/util/testsuite_abi.cc: New ABI version.
> ---
>  libstdc++-v3/acinclude.m4 |   2 +-
>  libstdc++-v3/config/abi/pre/gnu.ver   |  12 +-
>  libstdc++-v3/configure|   2 +-
>  libstdc++-v3/doc/xml/manual/abi.xml   |   1 +
>  libstdc++-v3/include/bits/shared_ptr_atomic.h | 309 ++
>  libstdc++-v3/include/std/version  |   1 +
>  libstdc++-v3/src/c++11/Makefile.am|   6 +
>  libstdc++-v3/src/c++11/Makefile.in|   6 +
>  libstdc++-v3/src/c++11/shared_ptr.cc  |  86 -
>  .../testsuite/20_util/shared_ptr/atomic/4.cc  |  28 ++
>  .../testsuite/20_util/shared_ptr/atomic/5.cc  |  28 ++
>  .../shared_ptr/atomic/atomic_shared_ptr.cc| 159 +
>  libstdc++-v3/testsuite/util/testsuite_abi.cc  |   3 +-
>  13 files changed, 637 insertions(+), 6 deletions(-)
>  create mode 100644 libstdc++-v3/testsuite/20_util/shared_ptr/atomic/4.cc
>  create mode 100644 libstdc++-v3/testsuite/20_util/shared_ptr/atomic/5.cc
>  create mode 100644
> libstdc++-v3/testsuite/20_util/shared_ptr/atomic/atomic_shared_ptr.cc
>
>
[...]


> +  void
> +  store(value_type __r, memory_order __o = memory_order_seq_cst)
> noexcept
> +  { _M_impl._M_store(move(__r), __o); }
>

Every move(x) needs to be qualified as std::move(x) to avoid ADL.



>
> +
> +// { dg-options "-std=gnu++2a" }
> +// { dg-do run { target c++2a } }
>

gnu++20 and c++20 has been OK for a while, there's no need to use the 2a
forms.

Other than that, the patch looks OK, but I have a modified version based on
your patch just using atomic ops, no mutexes, which I'll post shortly.


[committed] analyzer: fix ICE in taint checker on unary ops [PR104029]

2022-01-14 Thread David Malcolm via Gcc-patches
gcc/analyzer/ChangeLog:
PR analyzer/104029
* sm-taint.cc (taint_state_machine::alt_get_inherited_state):
Remove gcc_unreachable from default case for unary ops.

gcc/testsuite/ChangeLog:
PR analyzer/104029
* gcc.dg/analyzer/pr104029.c: New test.
* gcc.dg/analyzer/taint-ops.c: New test.
---
 gcc/analyzer/sm-taint.cc  |   1 -
 gcc/testsuite/gcc.dg/analyzer/pr104029.c  | 115 ++
 gcc/testsuite/gcc.dg/analyzer/taint-ops.c | 106 
 3 files changed, 221 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.dg/analyzer/pr104029.c
 create mode 100644 gcc/testsuite/gcc.dg/analyzer/taint-ops.c

diff --git a/gcc/analyzer/sm-taint.cc b/gcc/analyzer/sm-taint.cc
index 54c7e6015ab..3a46256b020 100644
--- a/gcc/analyzer/sm-taint.cc
+++ b/gcc/analyzer/sm-taint.cc
@@ -649,7 +649,6 @@ taint_state_machine::alt_get_inherited_state (const 
sm_state_map &map,
  return arg_state;
}
  default:
-   gcc_unreachable ();
break;
  }
   }
diff --git a/gcc/testsuite/gcc.dg/analyzer/pr104029.c 
b/gcc/testsuite/gcc.dg/analyzer/pr104029.c
new file mode 100644
index 000..adf15ed356f
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/analyzer/pr104029.c
@@ -0,0 +1,115 @@
+// TODO: remove need for this option
+/* { dg-additional-options "-fanalyzer-checker=taint" } */
+
+typedef __SIZE_TYPE__ size_t;
+typedef const void *t_comptype;
+typedef int (*t_compfunc)(t_comptype, t_comptype);
+
+extern int *__errno_location(void)
+  __attribute__((__nothrow__, __leaf__,__const__));
+extern void free(void *__ptr)
+  __attribute__((__nothrow__, __leaf__));
+extern void *my_malloc1(const char *file, int line, size_t size);
+
+int heapsort(void *vbase, size_t nmemb, size_t size, t_compfunc compar) {
+  char tmp, *tmp1, *tmp2, *abase, *k, *p, *t;
+  size_t cnt, i, j, l;
+
+  if (nmemb <= 1)
+return (0);
+
+  if (!size) {
+(*__errno_location()) = 22;
+return (-1);
+  }
+
+  k = my_malloc1(__FILE__, __LINE__, size);
+
+  abase = (char *)vbase - size;
+
+  for (l = nmemb / 2 + 1; --l;) {
+for (i = l; (j = i * 2) <= nmemb; i = j) {
+  p = abase + j * size;
+  if (j < nmemb && compar(p, p + size) < 0) {
+p += size;
+++j;
+  }
+  t = abase + i * size;
+  if (compar(p, t) <= 0)
+break;
+  {
+cnt = size;
+do {
+  tmp = *t;
+  *t++ = *p;
+  *p++ = tmp;
+} while (--cnt);
+  };
+}
+  };
+
+  while (nmemb > 1) {
+{
+  cnt = size;
+  tmp1 = k;
+  tmp2 = abase + nmemb * size;
+  do {
+*tmp1++ = *tmp2++;
+  } while (--cnt);
+};
+{
+  cnt = size;
+  tmp1 = abase + nmemb * size;
+  tmp2 = abase + size;
+  do {
+*tmp1++ = *tmp2++;
+  } while (--cnt);
+};
+--nmemb;
+{
+  for (i = 1; (j = i * 2) <= nmemb; i = j) {
+p = abase + j * size;
+if (j < nmemb && compar(p, p + size) < 0) {
+  p += size;
+  ++j;
+}
+t = abase + i * size;
+{
+  cnt = size;
+  tmp1 = t;
+  tmp2 = p;
+  do {
+*tmp1++ = *tmp2++;
+  } while (--cnt);
+};
+  }
+  for (;;) {
+j = i;
+i = j / 2;
+p = abase + j * size;
+t = abase + i * size;
+if (j == 1 || compar(k, t) < 0) {
+  {
+cnt = size;
+tmp1 = p;
+tmp2 = k;
+do {
+  *tmp1++ = *tmp2++;
+} while (--cnt);
+  };
+  break;
+}
+{
+  cnt = size;
+  tmp1 = p;
+  tmp2 = t;
+  do {
+*tmp1++ = *tmp2++;
+  } while (--cnt);
+};
+  }
+};
+  }
+  free(k);
+  return (0);
+}
diff --git a/gcc/testsuite/gcc.dg/analyzer/taint-ops.c 
b/gcc/testsuite/gcc.dg/analyzer/taint-ops.c
new file mode 100644
index 000..729dbe53a0c
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/analyzer/taint-ops.c
@@ -0,0 +1,106 @@
+/* { dg-additional-options "-fanalyzer-checker=taint" } */
+// TODO: remove need for this option
+/* This test can probably be removed when -fanalyzer enables
+   the taint checker by default.  */
+
+#include "analyzer-decls.h"
+
+void
+test_1 (char a)
+{
+  char b = -a;
+}
+
+/* Copies of code from data-model-1.c.  */
+
+void test_20 (int i, int j)
+{
+  __analyzer_eval (i + 1); /* { dg-warning "UNKNOWN" } */
+  __analyzer_eval (i + j); /* { dg-warning "UNKNOWN" } */
+
+  __analyzer_eval (i - 1); /* { dg-warning "UNKNOWN" } */
+  __analyzer_eval (i - j); /* { dg-warning "UNKNOWN" } */
+
+  __analyzer_eval (i * 2); /* { dg-warning "UNKNOWN" } */
+  __analyzer_eval (i * j); /* { dg-warning "UNKNOWN" } */
+
+  __analyzer_eval (i / 2); /* { dg-warning "UNKNOWN" } */
+  __analyzer_eval (i / j); /* { dg-warning "UNKNOWN" } */
+
+  __analyzer_eval (i % 2); /* { dg-war

[committed] analyzer: fix ICE when combining taint states has_ub and has_lb

2022-01-14 Thread David Malcolm via Gcc-patches
Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
Pushed to trunk as cc3b67e40140ec79f86e79a96d7fdd169b84faaf.

gcc/analyzer/ChangeLog:
* sm-taint.cc (taint_state_machine::combine_states): Handle combination
of has_ub and has_lb.

gcc/testsuite/ChangeLog:
* gcc.dg/analyzer/taint-merger.c: New test.
---
 gcc/analyzer/sm-taint.cc | 14 +++--
 gcc/testsuite/gcc.dg/analyzer/taint-merger.c | 57 
 2 files changed, 66 insertions(+), 5 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/analyzer/taint-merger.c

diff --git a/gcc/analyzer/sm-taint.cc b/gcc/analyzer/sm-taint.cc
index 3a46256b020..357456593ff 100644
--- a/gcc/analyzer/sm-taint.cc
+++ b/gcc/analyzer/sm-taint.cc
@@ -860,15 +860,19 @@ taint_state_machine::combine_states (state_t s0, state_t 
s1) const
 return s0;
   if (s0 == m_tainted || s1 == m_tainted)
 return m_tainted;
-  if (s0 == m_stop)
-return s1;
-  if (s1 == m_stop)
-return s0;
   if (s0 == m_start)
 return s1;
   if (s1 == m_start)
 return s0;
-  gcc_unreachable ();
+  if (s0 == m_stop)
+return s1;
+  if (s1 == m_stop)
+return s0;
+  /* The only remaining combinations are one of has_ub and has_lb
+ (in either order).  */
+  gcc_assert ((s0 == m_has_lb && s1 == m_has_ub)
+ || (s0 == m_has_ub && s1 == m_has_lb));
+  return m_tainted;
 }
 
 /* Check for calls to external functions marked with
diff --git a/gcc/testsuite/gcc.dg/analyzer/taint-merger.c 
b/gcc/testsuite/gcc.dg/analyzer/taint-merger.c
new file mode 100644
index 000..e4e48f3db03
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/analyzer/taint-merger.c
@@ -0,0 +1,57 @@
+/* { dg-additional-options "-fanalyzer-checker=taint" } */
+// TODO: remove need for this option
+
+#include "analyzer-decls.h"
+
+int v_start;
+
+__attribute__((tainted_args))
+void test (int v_tainted, int v_has_lb, int v_has_ub, int v_stop)
+{
+  /* Get each var into the 5 different taintedness states.  */
+  if (v_has_lb < 10)
+return;
+  if (v_has_ub > 100)
+return;
+  if (v_stop < 0 || v_stop > 100)
+return;
+
+  /* Verify that we have the taintedness states we expect.  */
+
+  __analyzer_dump_state ("taint", v_start); /* { dg-warning "state: 'start'" } 
*/
+  __analyzer_dump_state ("taint", v_tainted); /* { dg-warning "state: 
'tainted'" } */
+  __analyzer_dump_state ("taint", v_has_lb); /* { dg-warning "state: 'has_lb'" 
} */
+  __analyzer_dump_state ("taint", v_has_ub); /* { dg-warning "state: 'has_ub'" 
} */
+  __analyzer_dump_state ("taint", v_stop); /* { dg-warning "state: 'stop'" } */
+
+  /* Check all combinations of taintedness state.  */
+  __analyzer_dump_state ("taint", v_start + v_start); /* { dg-warning "state: 
'start'" } */
+  __analyzer_dump_state ("taint", v_start + v_tainted); /* { dg-warning 
"state: 'tainted'" } */
+  __analyzer_dump_state ("taint", v_start + v_has_lb); /* { dg-warning "state: 
'has_lb'" } */
+  __analyzer_dump_state ("taint", v_start + v_has_ub); /* { dg-warning "state: 
'has_ub'" } */
+  __analyzer_dump_state ("taint", v_start + v_stop); /* { dg-warning "state: 
'stop'" } */
+
+  __analyzer_dump_state ("taint", v_tainted + v_start); /* { dg-warning 
"state: 'tainted'" } */
+  __analyzer_dump_state ("taint", v_tainted + v_tainted); /* { dg-warning 
"state: 'tainted'" } */
+  __analyzer_dump_state ("taint", v_tainted + v_has_lb); /* { dg-warning 
"state: 'tainted'" } */
+  __analyzer_dump_state ("taint", v_tainted + v_has_ub); /* { dg-warning 
"state: 'tainted'" } */
+  __analyzer_dump_state ("taint", v_tainted + v_stop); /* { dg-warning "state: 
'tainted'" } */
+
+  __analyzer_dump_state ("taint", v_has_lb + v_start); /* { dg-warning "state: 
'has_lb'" } */
+  __analyzer_dump_state ("taint", v_has_lb + v_tainted); /* { dg-warning 
"state: 'tainted'" } */
+  __analyzer_dump_state ("taint", v_has_lb + v_has_lb); /* { dg-warning 
"state: 'has_lb'" } */
+  __analyzer_dump_state ("taint", v_has_lb + v_has_ub); /* { dg-warning 
"state: 'tainted'" } */
+  __analyzer_dump_state ("taint", v_has_lb + v_stop); /* { dg-warning "state: 
'has_lb'" } */
+
+  __analyzer_dump_state ("taint", v_has_ub + v_start); /* { dg-warning "state: 
'has_ub'" } */
+  __analyzer_dump_state ("taint", v_has_ub + v_tainted); /* { dg-warning 
"state: 'tainted'" } */
+  __analyzer_dump_state ("taint", v_has_ub + v_has_lb); /* { dg-warning 
"state: 'tainted'" } */
+  __analyzer_dump_state ("taint", v_has_ub + v_has_ub); /* { dg-warning 
"state: 'has_ub'" } */
+  __analyzer_dump_state ("taint", v_has_ub + v_stop); /* { dg-warning "state: 
'has_ub'" } */
+
+  __analyzer_dump_state ("taint", v_stop + v_start); /* { dg-warning "state: 
'stop'" } */
+  __analyzer_dump_state ("taint", v_stop + v_tainted); /* { dg-warning "state: 
'tainted'" } */
+  __analyzer_dump_state ("taint", v_stop + v_has_lb); /* { dg-warning "state: 
'has_lb'" } */
+  __analyzer_dump_state ("taint", v_stop + v_has_ub); /* { dg-warning "state: 
'has_ub'" } */
+

[PATCH] widening_mul, i386: Improve spaceship expansion on x86 [PR103973]

2022-01-14 Thread Jakub Jelinek via Gcc-patches
Hi!

C++20:
#include 
auto cmp4way(double a, double b)
{
  return a <=> b;
}
expands to:
ucomisd %xmm1, %xmm0
jp  .L8
movl$0, %eax
jne .L8
.L2:
ret
.p2align 4,,10
.p2align 3
.L8:
comisd  %xmm0, %xmm1
movl$-1, %eax
ja  .L2
ucomisd %xmm1, %xmm0
setbe   %al
addl$1, %eax
ret
That is 3 comparisons of the same operands.
The following patch improves it to just one comparison:
comisd  %xmm1, %xmm0
jp  .L4
seta%al
movl$0, %edx
leal-1(%rax,%rax), %eax
cmove   %edx, %eax
ret
.L4:
movl$2, %eax
ret
While a <=> b expands to a == b ? 0 : a < b ? -1 : a > b ? 1 : 2
where the first comparison is equality and this shouldn't raise
exceptions on qNaN operands, if the operands aren't equal (which
includes unordered cases), then it immediately performs < or >
comparison and that raises exceptions even on qNaNs, so we can just
perform a single comparison that raises exceptions on qNaN.
As the 4 different cases are encoded as
ZF CF PF
1  1  1  a unordered b
0  0  0  a > b
0  1  0  a < b
1  0  0  a == b
we can emit optimal sequence of comparions, first jp
for the unordered case, then je for the == case and finally jb
for the < case.

The patch pattern recognizes spaceship-like comparisons during
widening_mul if the spaceship optab is implemented, and replaces
thoose comparisons with comparisons of .SPACESHIP ifn which returns
-1/0/1/2 based on the comparison.  This seems to work well both for the
case of just returning the -1/0/1/2 (when we have just a common
successor with a PHI) or when the different cases are handled with
various other basic blocks.  The testcases cover both of those cases,
the latter with different function calls in those.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2022-01-14  Jakub Jelinek  

PR target/103973
* tree-cfg.h (cond_only_block_p): Declare.
* tree-ssa-phiopt.c (cond_only_block_p): Move function to ...
* tree-cfg.c (cond_only_block_p): ... here.  No longer static.
* optabs.def (spaceship_optab): New optab.
* internal-fn.def (SPACESHIP): New internal function.
* internal-fn.h (expand_SPACESHIP): Declare.
* internal-fn.c (expand_PHI): Formatting fix.
(expand_SPACESHIP): New function.
* tree-ssa-math-opts.c (optimize_spaceship): New function.
(math_opts_dom_walker::after_dom_children): Use it.
* config/i386/i386.md (spaceship3): New define_expand.
* config/i386/i386-protos.h (ix86_expand_fp_spaceship): Declare.
* config/i386/i386-expand.c (ix86_expand_fp_spaceship): New function.
* doc/md.texi (spaceship@var{m}3): Document.

* gcc.target/i386/pr103973-1.c: New test.
* gcc.target/i386/pr103973-2.c: New test.
* gcc.target/i386/pr103973-3.c: New test.
* gcc.target/i386/pr103973-4.c: New test.
* g++.target/i386/pr103973-1.C: New test.
* g++.target/i386/pr103973-2.C: New test.
* g++.target/i386/pr103973-3.C: New test.
* g++.target/i386/pr103973-4.C: New test.

--- gcc/tree-cfg.h.jj   2022-01-13 22:29:07.414943450 +0100
+++ gcc/tree-cfg.h  2022-01-14 12:59:42.147866622 +0100
@@ -111,6 +111,7 @@ extern basic_block gimple_switch_label_b
 extern basic_block gimple_switch_default_bb (function *, gswitch *);
 extern edge gimple_switch_edge (function *, gswitch *, unsigned);
 extern edge gimple_switch_default_edge (function *, gswitch *);
+extern bool cond_only_block_p (basic_block);
 
 /* Return true if the LHS of a call should be removed.  */
 
--- gcc/tree-ssa-phiopt.c.jj2022-01-13 22:29:07.514942041 +0100
+++ gcc/tree-ssa-phiopt.c   2022-01-14 12:59:42.146866637 +0100
@@ -1958,31 +1958,6 @@ minmax_replacement (basic_block cond_bb,
   return true;
 }
 
-/* Return true if the only executable statement in BB is a GIMPLE_COND.  */
-
-static bool
-cond_only_block_p (basic_block bb)
-{
-  /* BB must have no executable statements.  */
-  gimple_stmt_iterator gsi = gsi_after_labels (bb);
-  if (phi_nodes (bb))
-return false;
-  while (!gsi_end_p (gsi))
-{
-  gimple *stmt = gsi_stmt (gsi);
-  if (is_gimple_debug (stmt))
-   ;
-  else if (gimple_code (stmt) == GIMPLE_NOP
-  || gimple_code (stmt) == GIMPLE_PREDICT
-  || gimple_code (stmt) == GIMPLE_COND)
-   ;
-  else
-   return false;
-  gsi_next (&gsi);
-}
-  return true;
-}
-
 /* Attempt to optimize (x <=> y) cmp 0 and similar comparisons.
For strong ordering <=> try to match something like:
  :  // cond3_bb (== cond2_bb)
--- gcc/tree-cfg.c.jj   2022-01-13 22:29:07.399943661 +0100
+++ gcc/tree-cfg.c  2022-01-14 12:59:42.148866608 +0100
@@ -9410,6 +9410,31 @@ gimple_switch_default_edge (function *if
   return gimple_switch_edge (ifun, gs, 0);
 }
 

Re: [committed] analyzer: fix ICE in taint checker on unary ops [PR104029]

2022-01-14 Thread David Malcolm via Gcc-patches
On Fri, 2022-01-14 at 17:53 -0500, David Malcolm wrote:
> gcc/analyzer/ChangeLog:
> PR analyzer/104029
> * sm-taint.cc (taint_state_machine::alt_get_inherited_state):
> Remove gcc_unreachable from default case for unary ops.
> 
> gcc/testsuite/ChangeLog:
> PR analyzer/104029
> * gcc.dg/analyzer/pr104029.c: New test.
> * gcc.dg/analyzer/taint-ops.c: New test.

Forgot to add:

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
Pushed to trunk as r12-6597-g8931adfa0530590d21e74e5c7a1f8d26df575775.

[...snip...]



[committed] ada: Fix up handling of ghost units [PR104027]

2022-01-14 Thread Jakub Jelinek via Gcc-patches
Hi!

As reported, libgnat-12.so gets PT_GNU_STACK RWE, which means it doesn't
work in some SELinux configurations.
This is caused by the a-nbnbig.o file, which is a ghost unit and since
r12-5670 the FE emits an object file for it, but exits before compile_file
has a chance to finalize it e.g. with targetm.asm_out.file_end ()
that emits the .note.GNU-stack section on various linux targets.

Fixed by not existing but instead returning early to the caller.

Bootstrapped/regtested on x86_64-linux, preapproved in the PR,
committed to trunk.

2022-01-14  Andrew Pinski  

PR ada/104027
* gnat1drv.adb (Gnat1drv): After Back_End.Gen_Or_Update_Object_File
goto End_Of_Program.

--- gcc/ada/gnat1drv.adb
+++ gcc/ada/gnat1drv.adb
@@ -1429,6 +1429,11 @@ begin
 Ecode := E_Success;
 Back_End.Gen_Or_Update_Object_File;
 
+--  Use a goto instead of calling Exit_Program so that finalization
+--  occurs normally.
+
+goto End_Of_Program;
+
  --  Otherwise the unit is missing a crucial piece that prevents code
  --  generation.
 

Jakub



[PATCH] c++: ICE with noexcept and canonical types [PR101715]

2022-01-14 Thread Marek Polacek via Gcc-patches
This is a "canonical types differ for identical types" ICE, which started
with r11-4682.  It's a bit tricky to explain.  Consider:

  template  struct S {
S bar() noexcept(T::value);  // #1
S foo() noexcept(T::value);  // #2
  };

  template  S S::foo() noexcept(T::value) {}  // #3

We ICE because #3 and #2 have the same type, but their canonical types
differ: TYPE_CANONICAL (#3) == #2 but TYPE_CANONICAL (#2) == #1.

The member functions #1 and #2 have the same type.  However, since their
noexcept-specifier is deferred, when parsing them, we create a variant for
both of them, because DEFERRED_PARSE cannot be compared.  In other words,
build_cp_fntype_variant's

  tree v = TYPE_MAIN_VARIANT (type);
  for (; v; v = TYPE_NEXT_VARIANT (v))
if (cp_check_qualified_type (v, type, type_quals, rqual, raises, late))
  return v;

will *not* find an existing variant when creating a method_type for #2, so we
have to create a new one.

But then we perform delayed parsing and call fixup_deferred_exception_variants
for #1 and #2.  f_d_e_v will replace TYPE_RAISES_EXCEPTIONS with the newly
parsed noexcept-specifier.  It also sets TYPE_CANONICAL (#2) to #1.  Both
noexcepts turned out to be the same, so now we have two equivalent variants in
the list!  I.e.,

+-+  +-+  +-+
|  main   |  |  #2 |  |  #1 |
| S S::(S*) |->| S S::(S*) |->| S S::(S*) |->NULL
|-|  |  noex(T::value) |  |  noex(T::value) |
+-+  +-+  +-+

Then we get to #3.  As for #1 and #2, grokdeclarator calls build_memfn_type,
which ends up calling build_cp_fntype_variant, which will use the loop
above to look for an existing variant.  The first one that matches
cp_check_qualified_type will be used, so we use #2 rather than #1, and the
TYPE_CANONICAL mismatch follows.  Hopefully that makes sense.

As for the fix, I didn't think I could rewrite the method_type #2 with #1
because the type may have escaped via decltype.  So my approach is to
elide #2 from the list, so when looking for a matching variant, we always
find #1 (#2 remains live though, which admittedly sounds sort of dodgy).

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk/11?

PR c++/101715

gcc/cp/ChangeLog:

* tree.c (fixup_deferred_exception_variants): Remove duplicate
variants after parsing the exception specifications.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/noexcept72.C: New test.
* g++.dg/cpp0x/noexcept73.C: New test.
---
 gcc/cp/tree.c   | 16 +++-
 gcc/testsuite/g++.dg/cpp0x/noexcept72.C | 21 +
 gcc/testsuite/g++.dg/cpp0x/noexcept73.C | 13 +
 3 files changed, 49 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp0x/noexcept72.C
 create mode 100644 gcc/testsuite/g++.dg/cpp0x/noexcept73.C

diff --git a/gcc/cp/tree.c b/gcc/cp/tree.c
index 7f7de86b4e8..2efad49e7c1 100644
--- a/gcc/cp/tree.c
+++ b/gcc/cp/tree.c
@@ -2804,8 +2804,9 @@ fixup_deferred_exception_variants (tree type, tree raises)
 
   /* Though sucky, this walk will process the canonical variants
  first.  */
+  tree prev = NULL_TREE;
   for (tree variant = TYPE_MAIN_VARIANT (type);
-   variant; variant = TYPE_NEXT_VARIANT (variant))
+   variant; prev = variant, variant = TYPE_NEXT_VARIANT (variant))
 if (TYPE_RAISES_EXCEPTIONS (variant) == original)
   {
gcc_checking_assert (variant != TYPE_MAIN_VARIANT (type));
@@ -2827,6 +2828,19 @@ fixup_deferred_exception_variants (tree type, tree 
raises)
  v = build_cp_fntype_variant (TYPE_CANONICAL (variant),
   rqual, cr, false);
TYPE_CANONICAL (variant) = v;
+
+   /* If VARIANT became a duplicate (cp_check_qualified_type-wise)
+  of an existing variant in the variant list of TYPE after we
+  have parsed its exception specification, elide it.  Otherwise,
+  build_cp_fntype_variant would use it, leading to "canonical
+  types differ for identical types."  */
+   for (v = TYPE_MAIN_VARIANT (type); v; v = TYPE_NEXT_VARIANT (v))
+ if (v != variant
+ /* The main variant will not have TYPE_RAISES_EXCEPTIONS
+so PREV should never be null.  */
+ && cp_check_qualified_type (v, variant, var_quals,
+ rqual, cr, false))
+   TYPE_NEXT_VARIANT (prev) = TYPE_NEXT_VARIANT (variant);
  }
else
  TYPE_RAISES_EXCEPTIONS (variant) = raises;
diff --git a/gcc/testsuite/g++.dg/cpp0x/noexcept72.C 
b/gcc/testsuite/g++.dg/cpp0x/noexcept72.C
new file mode 100644
index 000..f1455b3b46b
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/noexcept72.C
@@ -0,0 +1,21 @@
+// PR c++/101715
+// { 

[PATCH] libstdc++: Add missing free functions for atomic_flag [PR103934]

2022-01-14 Thread Thomas Rodgers via Gcc-patches

From c2b74fd7cf2668d288f46da42565e5eb954e5e1f Mon Sep 17 00:00:00 2001
From: Thomas Rodgers 
Date: Fri, 14 Jan 2022 18:30:27 -0800
Subject: [PATCH] libstdc++: Add missing free functions for atomic_flag
 [PR103934]

libstdc++-v3/ChangeLog:

	PR103934
	* include/std/atomic: Add missing free functions.
	* testsuite/29_atomics/atomic_flag/wait_notify/1.cc:
	Add test case to cover missing atomic_flag free functions.
---
 libstdc++-v3/include/std/atomic   | 39 +++
 .../29_atomics/atomic_flag/wait_notify/1.cc   | 27 +++--
 2 files changed, 63 insertions(+), 3 deletions(-)

diff --git a/libstdc++-v3/include/std/atomic b/libstdc++-v3/include/std/atomic
index 9df17704f7e..92c96a9b047 100644
--- a/libstdc++-v3/include/std/atomic
+++ b/libstdc++-v3/include/std/atomic
@@ -1216,6 +1216,26 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 memory_order __m) noexcept
   { return __a->test_and_set(__m); }
 
+#if __cpp_lib_atomic_flag_test
+  inline bool
+  atomic_flag_test(const atomic_flag* __a) noexcept
+  { return __a->test(); }
+
+  inline bool
+  atomic_flag_test(const volatile atomic_flag* __a) noexcept
+  { return __a->test(); }
+
+  inline bool
+  atomic_flag_test_explicit(const atomic_flag* __a,
+			memory_order __m) noexcept
+  { return __a->test(__m); }
+
+  inline bool
+  atomic_flag_test_explicit(const volatile atomic_flag* __a,
+			memory_order __m) noexcept
+  { return __a->test(__m); }
+#endif
+
   inline void
   atomic_flag_clear_explicit(atomic_flag* __a, memory_order __m) noexcept
   { __a->clear(__m); }
@@ -1241,6 +1261,25 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   atomic_flag_clear(volatile atomic_flag* __a) noexcept
   { atomic_flag_clear_explicit(__a, memory_order_seq_cst); }
 
+#if __cpp_lib_atomic_wait
+  inline void
+  atomic_flag_wait(const atomic_flag* __a, bool __old) noexcept
+  { __a->wait(__old); }
+
+  inline void
+  atomic_flag_wait_explicit(const atomic_flag* __a, bool __old,
+		   std::memory_order __m) noexcept
+  { __a->wait(__old, __m); }
+
+  inline void
+  atomic_flag_notify_one(const atomic_flag* __a) noexcept
+  { __a->notify_one(); }
+
+  inline void
+  atomic_flag_notify_all(const atomic_flag* __a) noexcept
+  { __a->notify_all(); }
+#endif // __cpp_lib_atomic_wait
+
 
   template
 using __atomic_val_t = typename atomic<_Tp>::value_type;
diff --git a/libstdc++-v3/testsuite/29_atomics/atomic_flag/wait_notify/1.cc b/libstdc++-v3/testsuite/29_atomics/atomic_flag/wait_notify/1.cc
index 87a104059ff..1050b72a1c6 100644
--- a/libstdc++-v3/testsuite/29_atomics/atomic_flag/wait_notify/1.cc
+++ b/libstdc++-v3/testsuite/29_atomics/atomic_flag/wait_notify/1.cc
@@ -26,8 +26,8 @@
 
 #include 
 
-int
-main()
+void
+test01()
 {
   std::atomic_flag a;
   VERIFY( !a.test() );
@@ -39,5 +39,26 @@ main()
 });
   a.wait(false);
   t.join();
-  return 0;
+}
+
+void
+test02()
+{
+  std::atomic_flag a;
+  VERIFY( !std::atomic_flag_test(&a) );
+  std::atomic_flag_wait(&a, true);
+  std::thread t([&]
+{
+  std::atomic_flag_test_and_set(&a);
+  std::atomic_flag_notify_one(&a);
+});
+  std::atomic_flag_wait(&a, false);
+  t.join();
+}
+
+int
+main()
+{
+  test01();
+  test02();
 }
-- 
2.31.1



Re: [PATCH v3 RFC] c++: add color to function decl printing

2022-01-14 Thread Jason Merrill via Gcc-patches

On 1/14/22 16:49, David Malcolm wrote:

On Mon, 2021-12-13 at 09:58 -0500, Jason Merrill via Gcc-patches wrote:

On 12/13/21 06:02, Jonathan Wakely wrote:

On Sun, 12 Dec 2021 at 05:39, Jason Merrill mailto:ja...@redhat.com>> wrote:
  >
  > In reading C++ diagnostics, it's often hard to find the name of
the
function
  > in the middle of the template header, return type, parameters, and
template
  > arguments.  So let's colorize it, and maybe the template argument
bindings
  > while we're at it.


Thanks for the patch; sorry for not responding before.

Overall I like that patch, with some reservations...


  >
  > I've somewhat arbitrarily chosen bold green for the function name,
and
  > non-bold magenta for the template arguments.  I'm not at all
attached to
  > these choices.


I tried the patch with gnome terminals "light" and "dark" themes, and
the colors seemed readable with both color schemes.


  >
  > A side-effect is that when this happens in a quote (i.e. %qD), the
  > rest of the quote after the function name is no longer bold.  I
think
that's
  > acceptable; returning to the bold would require maintaining a
colorize stack
  > instead of the on/off controls we have now.


I was going to grumble about this, but having tried it on some
examples, I think it's actually an improvement in readability to drop
the emboldening, in that it reduces the "wall of bold text" seen.

Having tried it on some examples, I think the patch as a whole is a
definite readability win, in that it breaks up long stretches of bold
text into useful colorized parts; the result seems much easier to
decipher at a glance.  I wonder to what extent this is a poor-man's
syntax highlighting?


I think it definitely is.  Going farther in that direction would make 
sense it future.



Did you try any other variants of the highlighting?  This approach
seems to work well, FWIW, I wonder if others might work better, or if
this is a local maxima in terms of readability.


Do you have any particular variants in mind?  This was motivated by my 
having trouble finding the name of the function in diagnostic output, so 
that's the main thing I wanted to highlight.



I'm taking the liberty of attaching a screenshot (137K) showing
before/after the patch for the sake of discussion.


  >
  > Any thoughts?


I was also concerned about how this would interact with the template
type diff printing from f012c8ef4b35dcee9b5a3807868d050812d5b3b9, but I
did a few tests and it seems to work OK.

I only tested in lightly for a few minutes, so it could do with some
more testing.



I thought I was going to love this ... and it's a nice little
improvement, but I didn't love it as much as I expected.

Is it intentional that only the last function in the instantiation
stack
gets colourized? In this example the function on line 390 isn't
highlighted:

/home/jwakely/gcc/12/include/c++/12.0.0/bits/ranges_base.h:390:12:
  required
by substitution of 'template  requires
(is_bounded_array_v::type>) ||
(__member_size*<_Tp>) || (__adl_size<_Tp>) ||
(__sentinel_size<_Tp>)
constexpr auto std::ranges::__cust_access::_Size::operator()(_Tp&&)
const [with _Tp = adl::Footie (&)[]]*'


Oops, I needed to change subst_to_string as well.  Updated patch
below.


Jonathan, did you try the v2 patch?




Aside: it's a little odd that the first caret diagnostic there only
highlights the word "operator" and not the name of the function,
"operator()".


Yes, I imagine we need to adjust DECL_SOURCE_LOCATION to use a range
instead of assuming the location of the first token is sufficient.

Jason


[...]


diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 9b4371b9213..cdfddd75343 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -4803,6 +4803,14 @@ SGR substring for location information,

@samp{file:line} or

  @vindex quote GCC_COLORS @r{capability}
  SGR substring for information printed within quotes.
  
+@item fnname=

+@vindex fnname GCC_COLORS @r{capability}
+SGR substring for names of C++ functions.
+
+@item targs=
+@vindex targs GCC_COLORS @r{capability}
+SGR substring for C++ function template parameter bindings.
+
  @item fixit-insert=
  @vindex fixit-insert GCC_COLORS @r{capability}
  SGR substring for fix-it hints suggesting text to


There's a section of the docs for -fdiagnostics-color a little above
this, starting
   "The default @env{GCC_COLORS} is"
which needs to be updated whenever we add new color capabilities.


Fixed.


diff --git a/gcc/cp/error.c b/gcc/cp/error.c
index 323643d1e6f..0008ee2ee8d 100644
--- a/gcc/cp/error.c
+++ b/gcc/cp/error.c
@@ -1,4 +1,4 @@
-/* Call-backs for C++ error reporting.
+/* Call-backs for -*- C++ -*- error reporting.


This change isn't called out in the ChangeLog.  Is it deliberate?


Yes, an incidental change to tell emacs that it's a C++ source file 
despite the .c suffix.  Probably not important, since it seems we're 
about to rename the .c files.



[...]


@@ -1158,6 +1163,22 @@ dump_simple_decl (cxx_pretty_pr

Re: [PATCH v2 RFA] diagnostic: avoid repeating include path

2022-01-14 Thread Jason Merrill via Gcc-patches

On 1/13/22 17:30, David Malcolm wrote:

On Thu, 2022-01-13 at 17:08 -0500, Jason Merrill wrote:

When a sequence of diagnostic messages bounces back and forth
repeatedly
between two includes, as with

  #include 
  std::map m ("123", "456");

The output is quite a bit longer than necessary because we dump the
include
path each time it changes.  I'd think we could print the include path
once
for each header file, and then expect that the user can look earlier
in the
output if they're wondering.

Tested x86_64-pc-linux-gnu, OK for trunk?

gcc/ChangeLog:

 * diagnostic.c (includes_seen): New.
 (diagnostic_report_current_module): Use it.
---
  gcc/diagnostic.c | 12 +++-
  1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/gcc/diagnostic.c b/gcc/diagnostic.c
index 58139427d01..e56441a2dbf 100644
--- a/gcc/diagnostic.c
+++ b/gcc/diagnostic.c
@@ -700,6 +700,16 @@ set_last_module (diagnostic_context *context,
const line_map_ordinary *map)
    context->last_module = map;
  }
  
+/* Only dump the "In file included from..." stack once for each

file.  */
+
+static bool
+includes_seen (const line_map_ordinary *map)
+{
+  using hset = hash_set;
+  static hset *set = new hset;
+  return set->add (map);
+}


Overall, I like the idea, but...

- the patch works at the level of line_map_ordinary instances, rather
than header files.  There are various ways in which a single header
file can have multiple line maps e.g. due to very long lines, or
including another file, etc.  I think it makes sense to do it at the
per-file level, assuming we aren't in a horrible situation where a
header is being included repeatedly, with different effects.  So maybe
this ought to look at what include directive led to this map, i.e.
looking at the ord_map->included_from field, and having a
hash_set ?


Done.  This version doesn't affect printing of the module import path 
yet, pending more consideration of whether we always want to identify 
the module it comes from or just the file/line is enough.



- there's no test coverage, but it's probably not feasible to write
DejaGnu tests for this, given the way prune.exp's prune_gcc_output
strips these strings.  Maybe a dg directive to selectively disable the
pertinent pruning operations in prune_gcc_output???  Gah...

- global state is a pet peeve of mine; can the above state be put
inside the diagnostic_context instead?   (perhaps via a pointer to a
wrapper class to avoid requiring all users of diagnostic.h to include
hash-set.h?).


It seems that using hash_set directly doesn't break any users.
From a6137bc1154bdf237449382f98cd3945a701103f Mon Sep 17 00:00:00 2001
From: Jason Merrill 
Date: Fri, 17 Dec 2021 05:45:02 -0500
Subject: [PATCH] diagnostic: avoid repeating include path
To: gcc-patches@gcc.gnu.org

When a sequence of diagnostic messages bounces back and forth repeatedly
between two includes, as with

 #include 
 std::map m ("123", "456");

The output is quite a bit longer than necessary because we dump the include
path each time it changes.  I'd think we could print the include path once
for each header file, and then expect that the user can look earlier in the
output if they're wondering.

gcc/ChangeLog:

	* diagnostic.h (struct diagnostic_context): Add includes_seen.
	* diagnostic.c (diagnostic_initialize): Initialize it.
	(diagnostic_finish): Clean it up.
	(includes_seen): New function.
	(diagnostic_report_current_module): Use it.

gcc/testsuite/ChangeLog:

	* c-c++-common/cpp/line-2.c: Only expect includes once.
	* c-c++-common/cpp/line-3.c: Likewise.
---
 gcc/diagnostic.h|  4 +++
 gcc/diagnostic.c| 36 +++--
 gcc/testsuite/c-c++-common/cpp/line-2.c |  2 +-
 gcc/testsuite/c-c++-common/cpp/line-3.c |  2 +-
 4 files changed, 40 insertions(+), 4 deletions(-)

diff --git a/gcc/diagnostic.h b/gcc/diagnostic.h
index 6739028a931..ccaa33b5817 100644
--- a/gcc/diagnostic.h
+++ b/gcc/diagnostic.h
@@ -387,6 +387,10 @@ struct diagnostic_context
  the BLOCK_SUPERCONTEXT() chain hanging off the LOCATION_BLOCK()
  of a diagnostic's location.  */
   void (*set_locations_cb)(diagnostic_context *, diagnostic_info *);
+
+  /* Include files that diagnostic_report_current_module has already listed the
+ include path for.  */
+  hash_set *includes_seen;
 };
 
 static inline void
diff --git a/gcc/diagnostic.c b/gcc/diagnostic.c
index 58139427d01..5c02ff05882 100644
--- a/gcc/diagnostic.c
+++ b/gcc/diagnostic.c
@@ -237,6 +237,7 @@ diagnostic_initialize (diagnostic_context *context, int n_opts)
   context->begin_group_cb = NULL;
   context->end_group_cb = NULL;
   context->final_cb = default_diagnostic_final_cb;
+  context->includes_seen = NULL;
 }
 
 /* Maybe initialize the color support. We require clients to do this
@@ -329,6 +330,12 @@ diagnostic_finish (diagnostic_context *context)
   delete context->edit_context_ptr;
   context->edit_context_ptr = NULL;
 }
+
+  if (context->inc