Re: [PATCH v2 RFA] diagnostic: add permerror variants with opt

2023-10-04 Thread Florian Weimer
* Jason Merrill:

> @@ -6159,6 +6153,18 @@ errors by @option{-pedantic-errors}.  For instance:
>  -Wwrite-strings @r{(C++11 or later)}
>  }
>  
> +@opindex fpermissive
> +@item -fpermissive
> +Downgrade some required diagnostics about nonconformant code from
> +errors to warnings.  Thus, using @option{-fpermissive} allows some
> +nonconforming code to compile.  Some C++ diagnostics are controlled
> +only by this flag, but it also downgrades some diagnostics that have
> +their own flag:
> +
> +@gccoptlist{
> +-Wnarrowing @r{(C++)}
> +}
> +
>  @opindex Wall
>  @opindex Wno-all
>  @item -Wall

Does compiling with -Wno-narrowing also accept the obsolete constructs?
The documentation isn't clear about it.  The existing test
gcc/testsuite/g++.dg/cpp0x/initlist55.C suggests it's possible.  Maybe
add an explicit example to the documentation?

What about the impact of -w?

As far as the internal API is concerned, will there be a way to query
whether -Wno-narrowing etc. have been specified?  We could use this in
the C frontend where we might want to run different code in some cases.
Without implicit ints, for example, we know that identifiers at certain
positions must be types, so we can type spelling hints to errors.

Thanks,
Florian



Re: [PATCH] match.pd: Fix up a ? cst1 : cst2 regression on signed bool [PR111668]

2023-10-04 Thread Richard Biener
On Tue, 3 Oct 2023, Jakub Jelinek wrote:

> Hi!
> 
> My relatively recent changes to these simplifiers to avoid
> doing build_nonstandard_integer_type (primarily for BITINT_TYPE)
> broke PR111668, a recurrence of the PR110487 bug.
> I thought the build_nonstandard_integer_type isn't ever needed there,
> but there is one special case where it is.
> For the a ? -1 : 0 and a ? 0 : -1 simplifications there are actually
> 3 different cases.  One is for signed 1-bit precision types (signed
> kind of implied from integer_all_onesp, because otherwise it would
> match integer_onep earlier), where the simplifier wierdly was matching
> them using the a ? powerof2cst : 0 -> a << (log2(powerof2cst))
> simplification and then another simplifier optimizing away the left shift
> when log2(powerof2cst) was 0.  Another one is signed BOOLEAN_TYPE with
> precision > 1, where indeed we shouldn't be doing the negation in type,
> because it isn't well defined in that type, the type only has 2 valid
> values, 0 and -1.  As an alternative, we could also e.g. cast to
> signed 1-bit precision BOOLEAN_TYPE and then extend to type.
> And the last case is what we were doing for types which have both 1 and -1
> (all all ones) as valid values (i.e. all signed/unsigned ENUMERAL_TYPEs,
> INTEGRAL_TYPEs and BITINT_TYPEs with precision > 1).
> 
> The following patch avoids the hops through << 0 for 1-bit precision
> and uses build_nonstandard_integer_type solely for the BOOLEAN_TYPE types
> (where we have a guarantee the precision is reasonably small, nothing ought
> to be created 129+ bit precision BOOLEAN_TYPEs).
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

OK.

> 2023-10-03  Jakub Jelinek  
> 
>   PR tree-optimization/111668
>   * match.pd (a ? CST1 : CST2): Handle the a ? -1 : 0 and
>   a ? 0 : -1 cases before the powerof2cst cases and differentiate
>   between 1-bit precision types, larger precision boolean types
>   and other integral types.  Fix comment pastos and formatting.
> 
> --- gcc/match.pd.jj   2023-10-02 09:42:01.657836005 +0200
> +++ gcc/match.pd  2023-10-03 10:33:30.817614648 +0200
> @@ -5100,36 +5100,53 @@ (define_operator_list SYNC_FETCH_AND_AND
>   (switch
>(if (integer_zerop (@2))
> (switch
> -/* a ? 1 : 0 -> a if 0 and 1 are integral types. */
> +/* a ? 1 : 0 -> a if 0 and 1 are integral types.  */
>  (if (integer_onep (@1))
>   (convert (convert:boolean_type_node @0)))
> +/* a ? -1 : 0 -> -a.  */
> +(if (INTEGRAL_TYPE_P (type) && integer_all_onesp (@1))
> + (if (TYPE_PRECISION (type) == 1)
> +  /* For signed 1-bit precision just cast bool to the type.  */
> +  (convert (convert:boolean_type_node @0))
> +  (if (TREE_CODE (type) == BOOLEAN_TYPE)
> +   (with {
> +   tree intt = build_nonstandard_integer_type (TYPE_PRECISION (type),
> +   TYPE_UNSIGNED (type));
> + }
> + (convert (negate (convert:intt (convert:boolean_type_node @0)
> +   (negate (convert:type (convert:boolean_type_node @0))
>  /* a ? powerof2cst : 0 -> a << (log2(powerof2cst)) */
>  (if (INTEGRAL_TYPE_P (type) && integer_pow2p (@1))
>   (with {
> tree shift = build_int_cst (integer_type_node, tree_log2 (@1));
>}
> -  (lshift (convert (convert:boolean_type_node @0)) { shift; })))
> -/* a ? -1 : 0 -> -a.  No need to check the TYPE_PRECISION not being 1
> -   here as the powerof2cst case above will handle that case correctly.  
> */
> -(if (INTEGRAL_TYPE_P (type) && integer_all_onesp (@1))
> - (negate (convert:type (convert:boolean_type_node @0))
> +  (lshift (convert (convert:boolean_type_node @0)) { shift; })
>(if (integer_zerop (@1))
> (switch
> -/* a ? 0 : 1 -> !a. */
> +/* a ? 0 : 1 -> !a.  */
>  (if (integer_onep (@2))
>   (convert (bit_xor (convert:boolean_type_node @0) { boolean_true_node; 
> })))
> -/* a ? powerof2cst : 0 -> (!a) << (log2(powerof2cst)) */
> +/* a ? 0 : -1 -> -(!a).  */
> +(if (INTEGRAL_TYPE_P (type) && integer_all_onesp (@2))
> + (if (TYPE_PRECISION (type) == 1)
> +  /* For signed 1-bit precision just cast bool to the type.  */
> +  (convert (bit_xor (convert:boolean_type_node @0) { boolean_true_node; 
> }))
> +  (if (TREE_CODE (type) == BOOLEAN_TYPE)
> +   (with {
> +   tree intt = build_nonstandard_integer_type (TYPE_PRECISION (type),
> +   TYPE_UNSIGNED (type));
> + }
> + (convert (negate (convert:intt (bit_xor (convert:boolean_type_node @0)
> + { boolean_true_node; })
> +   (negate (convert:type (bit_xor (convert:boolean_type_node @0)
> +   { boolean_true_node; }))
> +/* a ? 0 : powerof2cst -> (!a) << (log2(powerof2cst)) */
>  (if (INTEGRAL_TYPE_P (type) && integer_pow2p (@2))
>   (w

Re: gcc-patches From rewriting mailman settings (Was: [Linaro-TCWG-CI] gcc patch #75674: FAIL: 68 regressions)

2023-10-04 Thread Mark Wielaard
Hi Gerald,

On Wed, Oct 04, 2023 at 12:17:48AM +0200, Gerald Pfeifer wrote:
> On Tue, 19 Sep 2023, Mark Wielaard wrote:
> >> Although there were some positive responses (on list and on irc) it is
> >> sometimes hard to know if there really is consensus for these kind of
> >> infrastructure tweaks. But I believe there is at least no sustained
> >> opposition to changing the gcc-patches mailman setting as proposed
> >> above.
> > This change is now done for gcc-patches.
> 
> Yeah, yeah, yeah. Thank you!

Thanks to the fsf-tech team who explained the setup they are using for
lists.gnu.org and everybody helping to test in the bug:
https://sourceware.org/bugzilla/show_bug.cgi?id=29713

It seems to work well, but it does mean disabling most of the things
mailman can do, like filtering out HTML attachements, adjusting
headers, adding footers or subject prefixes, etc. And we did break at
least one workflow for people who were using DKIM signing and the
Errors-To header.

> >> And if there are no complaints at Cauldron we could do the same for
> >> the other patch lists the week after.
> 
> Sadly I missed Cauldron - have there been any complaints there?

No, the opposite. People seemed happy and there were some examples
shown where (on other lists, like binutils) From rewriting caused
issues for some tools relying on patchwork.sourceware.org.

> Can you adjust the g...@gcc.gnu.org list and others @gcc.gnu.org as well?
> I for one would love to see that.

Being somewhat conservative doing that in steps. Last week we switched
the other gcc lists that receive patches (gcc-rust, libstdc++, fortran
and jit). This week looking at non-gcc lists on sourceware that use
patchwork (newlib, binutils, elfutils-devel, gdb-patches and
libabigail). Then if nothing breaks horribly more general lists if
people want.

Cheers,

Mark


[PATCH] RISC-V: THead: Fix missing CFI directives for th.sdd in prologue.

2023-10-04 Thread Xianmiao Qu
From: quxm 

When generating CFI directives for the store-pair instruction,
if we add two parallel REG_FRAME_RELATED_EXPR expr_lists like
  (expr_list:REG_FRAME_RELATED_EXPR (set (mem/c:DI (plus:DI (reg/f:DI 2 sp)
(const_int 8 [0x8])) [1  S8 A64])
(reg:DI 1 ra))
  (expr_list:REG_FRAME_RELATED_EXPR (set (mem/c:DI (reg/f:DI 2 sp) [1  S8 A64])
(reg:DI 8 s0))
only the first expr_list will be recognized by dwarf2out_frame_debug
funciton. So, here we generate a SEQUENCE expression of REG_FRAME_RELATED_EXPR,
which includes two sub-expressions of RTX_FRAME_RELATED_P. Then the
dwarf2out_frame_debug_expr function will iterate through all the sub-expressions
and generate the corresponding CFI directives.

gcc/
* config/riscv/thead.cc (th_mempair_save_regs): Fix missing CFI
directives for store-pair instruction.

gcc/testsuite/
* gcc.target/riscv/xtheadmempair-4.c: New test.
---
 gcc/config/riscv/thead.cc | 11 +++
 .../gcc.target/riscv/xtheadmempair-4.c| 29 +++
 2 files changed, 35 insertions(+), 5 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/xtheadmempair-4.c

diff --git a/gcc/config/riscv/thead.cc b/gcc/config/riscv/thead.cc
index 507c912bc39..be0cd7c1276 100644
--- a/gcc/config/riscv/thead.cc
+++ b/gcc/config/riscv/thead.cc
@@ -366,14 +366,15 @@ th_mempair_save_regs (rtx operands[4])
 {
   rtx set1 = gen_rtx_SET (operands[0], operands[1]);
   rtx set2 = gen_rtx_SET (operands[2], operands[3]);
+  rtx dwarf = gen_rtx_SEQUENCE (VOIDmode, rtvec_alloc (2));
   rtx insn = emit_insn (gen_rtx_PARALLEL (VOIDmode, gen_rtvec (2, set1, 
set2)));
   RTX_FRAME_RELATED_P (insn) = 1;
 
-  REG_NOTES (insn) = alloc_EXPR_LIST (REG_FRAME_RELATED_EXPR,
- copy_rtx (set1), REG_NOTES (insn));
-
-  REG_NOTES (insn) = alloc_EXPR_LIST (REG_FRAME_RELATED_EXPR,
- copy_rtx (set2), REG_NOTES (insn));
+  XVECEXP (dwarf, 0, 0) = copy_rtx (set1);
+  XVECEXP (dwarf, 0, 1) = copy_rtx (set2);
+  RTX_FRAME_RELATED_P (XVECEXP (dwarf, 0, 0)) = 1;
+  RTX_FRAME_RELATED_P (XVECEXP (dwarf, 0, 1)) = 1;
+  add_reg_note (insn, REG_FRAME_RELATED_EXPR, dwarf);
 }
 
 /* Similar like riscv_restore_reg, but restores two registers from memory
diff --git a/gcc/testsuite/gcc.target/riscv/xtheadmempair-4.c 
b/gcc/testsuite/gcc.target/riscv/xtheadmempair-4.c
new file mode 100644
index 000..9aef4e15f8d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/xtheadmempair-4.c
@@ -0,0 +1,29 @@
+/* { dg-do compile } */
+/* { dg-skip-if "" { *-*-* } { "-O0" "-O1" "-g" "-Oz" "-Os" "-flto" } } */
+/* { dg-options "-march=rv64gc_xtheadmempair -mtune=thead-c906 
-funwind-tables" { target { rv64 } } } */
+/* { dg-options "-march=rv32gc_xtheadmempair -mtune=thead-c906 
-funwind-tables" { target { rv32 } } } */
+
+extern void bar (void);
+
+void foo (void)
+{
+  asm volatile (";my clobber list"
+   : : : "s0");
+  bar ();
+  asm volatile (";my clobber list"
+   : : : "s0");
+}
+
+/* { dg-final { scan-assembler-times "th.sdd\t" 1 { target { rv64 } } } } */
+/* { dg-final { scan-assembler ".cfi_offset 8, -16" { target { rv64 } } } } */
+/* { dg-final { scan-assembler ".cfi_offset 1, -8" { target { rv64 } } } } */
+
+/* { dg-final { scan-assembler-times "th.swd\t" 1 { target { rv32 } } } } */
+/* { dg-final { scan-assembler ".cfi_offset 8, -8" { target { rv32 } } } } */
+/* { dg-final { scan-assembler ".cfi_offset 1, -4" { target { rv32 } } } } */
+
+/* { dg-final { scan-assembler ".cfi_restore 1" } } */
+/* { dg-final { scan-assembler ".cfi_restore 8" } } */
+
+/* { dg-final { scan-assembler-times "th.ldd\t" 1 { target { rv64 } } } } */
+/* { dg-final { scan-assembler-times "th.lwd\t" 1 { target { rv32 } } } } */
-- 
2.17.1



Re: [PATCH] ifcvt/vect: Emit COND_ADD for conditional scalar reduction.

2023-10-04 Thread Robin Dapp
Hi Tamar,

> I can't approve but hope you don't mind the review,

Not at all, greatly appreciated.

I incorporated all your remarks apart from this:

> Isn't vec_opmask NULL for SLP? You probably need to read it from
> vec_defs for the COND_EXPR

Above that I gcc_assert (!slp_node) for the IFN_COND case.  It doesn't
seem to be hit during testsuite runs.  I also didn't manage to create
an example that would trigger it.  When "conditionalizing" an SLP
fold-left reduction we don't seem to vectorize for a different reason.
Granted, I didn't look very closely at that reason :)

Bootstrap and testsuite are currently running with the attached v2
on x86, aarch64 and powerpc.

Besides, when thinking about which COND_OPs we expect I tried to loosen
the restrictions in if-conv by allowing MAX_EXPR and MIN_EXPR.  The
emitted code on riscv looks correct but I hit a bootstrap ICE on x86
so omitted it for now.

Regards
 Robin


Subject: [PATCH v2] ifcvt/vect: Emit COND_ADD for conditional scalar
 reduction.

As described in PR111401 we currently emit a COND and a PLUS expression
for conditional reductions.  This makes it difficult to combine both
into a masked reduction statement later.
This patch improves that by directly emitting a COND_ADD during ifcvt and
adjusting some vectorizer code to handle it.

It also makes neutral_op_for_reduction return -0 if HONOR_SIGNED_ZEROS
is true.

gcc/ChangeLog:

PR middle-end/111401
* internal-fn.cc (cond_fn_p): New function.
* internal-fn.h (cond_fn_p): Define.
* tree-if-conv.cc (convert_scalar_cond_reduction): Emit COND_ADD
if supported.
(predicate_scalar_phi): Add whitespace.
* tree-vect-loop.cc (fold_left_reduction_fn): Add IFN_COND_ADD.
(neutral_op_for_reduction): Return -0 for PLUS.
(vect_is_simple_reduction): Don't count else operand in
COND_ADD.
(vectorize_fold_left_reduction): Add COND_ADD handling.
(vectorizable_reduction): Don't count else operand in COND_ADD.
(vect_transform_reduction): Add COND_ADD handling.
* tree-vectorizer.h (neutral_op_for_reduction): Add default
parameter.

gcc/testsuite/ChangeLog:

* gcc.dg/vect/vect-cond-reduc-in-order-2-signed-zero.c: New test.
* gcc.target/riscv/rvv/autovec/cond/pr111401.c: New test.
---
 gcc/internal-fn.cc|  17 +++
 gcc/internal-fn.h |   1 +
 .../vect-cond-reduc-in-order-2-signed-zero.c  | 141 ++
 .../riscv/rvv/autovec/cond/pr111401.c | 139 +
 gcc/tree-if-conv.cc   |  63 ++--
 gcc/tree-vect-loop.cc | 129 
 gcc/tree-vectorizer.h |   2 +-
 7 files changed, 450 insertions(+), 42 deletions(-)
 create mode 100644 
gcc/testsuite/gcc.dg/vect/vect-cond-reduc-in-order-2-signed-zero.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/pr111401.c

diff --git a/gcc/internal-fn.cc b/gcc/internal-fn.cc
index 61d5a9e4772..9b38dc0cef4 100644
--- a/gcc/internal-fn.cc
+++ b/gcc/internal-fn.cc
@@ -4245,6 +4245,23 @@ first_commutative_argument (internal_fn fn)
 }
 }
 
+/* Return true if this CODE describes a conditional (masked) internal_fn.  */
+
+bool
+cond_fn_p (code_helper code)
+{
+  if (!code.is_fn_code ())
+return false;
+
+  if (!internal_fn_p ((combined_fn) code))
+return false;
+
+  internal_fn fn = as_internal_fn ((combined_fn) code);
+
+  return conditional_internal_fn_code (fn) != ERROR_MARK;
+}
+
+
 /* Return true if this CODE describes an internal_fn that returns a vector with
elements twice as wide as the element size of the input vectors.  */
 
diff --git a/gcc/internal-fn.h b/gcc/internal-fn.h
index 99de13a0199..f1cc9db29c0 100644
--- a/gcc/internal-fn.h
+++ b/gcc/internal-fn.h
@@ -219,6 +219,7 @@ extern bool commutative_ternary_fn_p (internal_fn);
 extern int first_commutative_argument (internal_fn);
 extern bool associative_binary_fn_p (internal_fn);
 extern bool widening_fn_p (code_helper);
+extern bool cond_fn_p (code_helper code);
 
 extern bool set_edom_supported_p (void);
 
diff --git a/gcc/testsuite/gcc.dg/vect/vect-cond-reduc-in-order-2-signed-zero.c 
b/gcc/testsuite/gcc.dg/vect/vect-cond-reduc-in-order-2-signed-zero.c
new file mode 100644
index 000..7b46e7d8a2a
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/vect-cond-reduc-in-order-2-signed-zero.c
@@ -0,0 +1,141 @@
+/* Make sure a -0 stays -0 when we perform a conditional reduction.  */
+/* { dg-do run } */
+/* { dg-require-effective-target vect_double } */
+/* { dg-add-options ieee } */
+/* { dg-additional-options "-std=gnu99 -fno-fast-math" } */
+
+#include "tree-vect.h"
+
+#include 
+
+#define N (VECTOR_BITS * 17)
+
+double __attribute__ ((noinline, noclone))
+reduc_plus_double (double *restrict a, double init, int *cond, int n)
+{
+  double res = init;
+  for (int i = 0; i < n; i++)
+if (cond[i])
+  r

Re: [PATCH] gimple-match: Do not try UNCOND optimization with COND_LEN.

2023-10-04 Thread Robin Dapp
Ping^2.

I realize it's not very elegant as of now.  If there's a better/shorter way
to solve this feel free to suggest :) 

Regards
 Robin


Re: [PATCH 1/2] testsuite: Add dg-require-atomic-exchange non-atomic code

2023-10-04 Thread Jonathan Wakely
On Wed, 4 Oct 2023 at 03:56, Hans-Peter Nilsson  wrote:
>
> > From: Christophe Lyon 
> > Date: Tue, 3 Oct 2023 15:20:39 +0200
>
> > Maybe we need a new variant of dg-require-thread-fence ?
>
> Yes: many of the dg-require-thread-fence users need
> something stronger.  Tested arm-eabi together with the next
> patch (2/2) with
> RUNTESTFLAGS=--target_board=arm-sim/-mthumb/-march=armv6s-m/-mtune=cortex-m0/-mfloat-abi=soft/-mfpu=auto\
> conformance.exp=29_atomics/\*
>
> (Incidentally, in the patch context is seen
> dg-require-atomic-builtins which is a misnomer: it should
> rather be named "dg-require-lock-atomic-builtins-free".)

dg-require-lock-free-atomic-builtins or
dg-require-atomic-builtins-lock-free, surely?


>
> Ok to commit?
>
> -- >8 --
> Some targets (armv6) support inline atomic load and store,
> i.e. dg-require-thread-fence matches, but not atomic like
> atomic exchange.  This directive will replace uses of
> dg-require-thread-fence where an atomic exchange operation
> is actually used.
>
> * testsuite/lib/dg-options.exp (dg-require-atomic-exchange): New proc.
> * testsuite/lib/libstdc++.exp (check_v3_target_atomic_exchange): 
> Ditto.

OK

> ---
>  libstdc++-v3/testsuite/lib/dg-options.exp |  9 ++
>  libstdc++-v3/testsuite/lib/libstdc++.exp  | 35 +++
>  2 files changed, 44 insertions(+)
>
> diff --git a/libstdc++-v3/testsuite/lib/dg-options.exp 
> b/libstdc++-v3/testsuite/lib/dg-options.exp
> index 84ad0c65330b..b13c2f244c63 100644
> --- a/libstdc++-v3/testsuite/lib/dg-options.exp
> +++ b/libstdc++-v3/testsuite/lib/dg-options.exp
> @@ -133,6 +133,15 @@ proc dg-require-thread-fence { args } {
>  return
>  }
>
> +proc dg-require-atomic-exchange { args } {
> +if { ![ check_v3_target_atomic_exchange ] } {
> +   upvar dg-do-what dg-do-what
> +   set dg-do-what [list [lindex ${dg-do-what} 0] "N" "P"]
> +   return
> +}
> +return
> +}
> +
>  proc dg-require-atomic-builtins { args } {
>  if { ![ check_v3_target_atomic_builtins ] } {
> upvar dg-do-what dg-do-what
> diff --git a/libstdc++-v3/testsuite/lib/libstdc++.exp 
> b/libstdc++-v3/testsuite/lib/libstdc++.exp
> index 608056e5068e..481f81711074 100644
> --- a/libstdc++-v3/testsuite/lib/libstdc++.exp
> +++ b/libstdc++-v3/testsuite/lib/libstdc++.exp
> @@ -1221,6 +1221,41 @@ proc check_v3_target_thread_fence { } {
>  }]
>  }
>
> +proc check_v3_target_atomic_exchange { } {
> +return [check_v3_target_prop_cached et_atomic_exchange {
> +   global cxxflags
> +   global DEFAULT_CXXFLAGS
> +
> +   # Set up and link a C++11 test program that depends
> +   # on atomic exchange be available for "int".
> +   set src atomic_exchange[pid].cc
> +
> +   set f [open $src "w"]
> +   puts $f "
> +int i, j, k;
> +   int main() {
> +   __atomic_exchange (&i, &j, &k, __ATOMIC_SEQ_CST);
> +   return 0;
> +   }"
> +   close $f
> +
> +   set cxxflags_saved $cxxflags
> +   set cxxflags "$cxxflags $DEFAULT_CXXFLAGS -Werror -std=gnu++11"
> +
> +   set lines [v3_target_compile $src /dev/null executable ""]
> +   set cxxflags $cxxflags_saved
> +   file delete $src
> +
> +   if [string match "" $lines] {
> +   # No error message, linking succeeded.
> +   return 1
> +   } else {
> +   verbose "check_v3_target_atomic_exchange: compilation failed" 2
> +   return 0
> +   }
> +}]
> +}
> +
>  # Return 1 if atomics_bool and atomic_int are always lock-free, 0 otherwise.
>  proc check_v3_target_atomic_builtins { } {
>  return [check_v3_target_prop_cached et_atomic_builtins {
> --
> 2.30.2
>
>
> >
> > Thanks,
> >
> > Christophe
> >
> >
> > Ok to commit?
> > >
> > > -- >8 --
> > > Make __atomic_test_and_set consistent with other __atomic_ and __sync_
> > > builtins: call a matching library function instead of emitting
> > > non-atomic code when the target has no direct insn support.
> > >
> > > There's special-case code handling targetm.atomic_test_and_set_trueval
> > > != 1 trying a modified maybe_emit_sync_lock_test_and_set.  Previously,
> > > if that worked but its matching emit_store_flag_force returned NULL,
> > > we'd segfault later on.  Now that the caller handles NULL, gcc_assert
> > > here instead.
> > >
> > > While the referenced PR:s are ARM-specific, the issue is general.
> > >
> > > PR target/107567
> > > PR target/109166
> > > * builtins.cc (expand_builtin)  > > BUILT_IN_ATOMIC_TEST_AND_SET>:
> > > Handle failure from expand_builtin_atomic_test_and_set.
> > > * optabs.cc (expand_atomic_test_and_set): When all attempts fail 
> > > to
> > > generate atomic code through target support, return NULL
> > > instead of emitting non-atomic code.  Also, for code handling
> > > targetm.atomic_test_and_set_trueval != 1, gcc_assert result
> > > from calling emit_store_flag_force instead of returning NULL.
> > > 

Re: [PATCH] Makefile.tpl: disable -Werror for feedback stage [PR111663]

2023-10-04 Thread Richard Biener
On Mon, Oct 2, 2023 at 2:06 PM Sergei Trofimovich  wrote:
>
> From: Sergei Trofimovich 
>
> Without the change profiled bootstrap fails for various warnings on
> master branch as:
>
> $ ../gcc/configure
> $ make profiledbootstrap
> ...
> gcc/genmodes.cc: In function ‘int main(int, char**)’:
> gcc/genmodes.cc:2152:1: error: ‘gcc/build/genmodes.gcda’ profile count 
> data file not found [-Werror=missing-profile]
> ...
> gcc/gengtype-parse.cc: In function ‘void parse_error(const char*, ...)’:
> gcc/gengtype-parse.cc:142:21: error: ‘%s’ directive argument is null 
> [-Werror=format-overflow=]
>
> The change removes -Werror just like autofeedback does today.

I think that makes sense, OK if nobody objects.

Richard.

> /
>
> PR bootstrap/111663
> * Makefile.tpl (STAGEfeedback_CONFIGURE_FLAGS): Disable -Werror.
> * Makefile.in: Regenerate.
> ---
>  Makefile.in  | 4 
>  Makefile.tpl | 4 
>  2 files changed, 8 insertions(+)
>
> diff --git a/Makefile.in b/Makefile.in
> index 2f136839c35..e0e3c4c8fe8 100644
> --- a/Makefile.in
> +++ b/Makefile.in
> @@ -638,6 +638,10 @@ STAGEtrain_TFLAGS = $(filter-out 
> -fchecking=1,$(STAGE3_TFLAGS))
>
>  STAGEfeedback_CFLAGS = $(STAGE4_CFLAGS) -fprofile-use 
> -fprofile-reproducible=parallel-runs
>  STAGEfeedback_TFLAGS = $(STAGE4_TFLAGS)
> +# Disable warnings as errors for a few reasons:
> +# - sources for gen* binaries do not have .gcda files available
> +# - inlining decisions generate extra warnings
> +STAGEfeedback_CONFIGURE_FLAGS = $(filter-out 
> --enable-werror-always,$(STAGE_CONFIGURE_FLAGS))
>
>  STAGEautoprofile_CFLAGS = $(filter-out -gtoggle,$(STAGE2_CFLAGS)) -g
>  STAGEautoprofile_TFLAGS = $(STAGE2_TFLAGS)
> diff --git a/Makefile.tpl b/Makefile.tpl
> index 5872dd03f2c..8b7783bb4f1 100644
> --- a/Makefile.tpl
> +++ b/Makefile.tpl
> @@ -561,6 +561,10 @@ STAGEtrain_TFLAGS = $(filter-out 
> -fchecking=1,$(STAGE3_TFLAGS))
>
>  STAGEfeedback_CFLAGS = $(STAGE4_CFLAGS) -fprofile-use 
> -fprofile-reproducible=parallel-runs
>  STAGEfeedback_TFLAGS = $(STAGE4_TFLAGS)
> +# Disable warnings as errors for a few reasons:
> +# - sources for gen* binaries do not have .gcda files available
> +# - inlining decisions generate extra warnings
> +STAGEfeedback_CONFIGURE_FLAGS = $(filter-out 
> --enable-werror-always,$(STAGE_CONFIGURE_FLAGS))
>
>  STAGEautoprofile_CFLAGS = $(filter-out -gtoggle,$(STAGE2_CFLAGS)) -g
>  STAGEautoprofile_TFLAGS = $(STAGE2_TFLAGS)
> --
> 2.42.0
>


[Patch, fortran] PR111674 - [13/14 regression] Failure to finalize an allocatable subobject of a non-finalizable type

2023-10-04 Thread Paul Richard Thomas
This was fixed as 'obvious' with an off-list OK, posted on the PR, from
Harald. Applied to 13-branch and trunk then closed as fixed.

Cheers

Paul

Fortran: Alloc comp of non-finalizable type not finalized [PR111674]

2023-10-04  Paul Thomas  

gcc/fortran
PR fortran/37336
PR fortran/111674
* trans-expr.cc (gfc_trans_scalar_assign): Finalize components
on deallocation if derived type is not finalizable.

gcc/testsuite/
PR fortran/37336
PR fortran/111674
* gfortran.dg/allocate_with_source_25.f90: Final count in tree
dump reverts from 4 to original 6.
* gfortran.dg/finalize_38.f90: Add test for fix of PR111674.


Re: [PATCH 2/2] testsuite: Replace many dg-require-thread-fence with dg-require-atomic-exchange

2023-10-04 Thread Jonathan Wakely
On Wed, 4 Oct 2023 at 04:11, Hans-Peter Nilsson  wrote:
>
> > From: Christophe Lyon 
> > Date: Tue, 3 Oct 2023 15:20:39 +0200
>
> > The patch passed almost all our CI configurations, except arm-eabi when
> > testing with
> >  -mthumb/-march=armv6s-m/-mtune=cortex-m0/-mfloat-abi=soft/-mfpu=auto
> > where is causes these failures:
> > FAIL: 29_atomics/atomic_flag/clear/1.cc -std=gnu++17 (test for excess
> > errors)
> > UNRESOLVED: 29_atomics/atomic_flag/clear/1.cc -std=gnu++17 compilation
> > failed to produce executable
> > FAIL: 29_atomics/atomic_flag/cons/value_init.cc -std=gnu++20 (test for
> > excess errors)
> > UNRESOLVED: 29_atomics/atomic_flag/cons/value_init.cc -std=gnu++20
> > compilation failed to produce executable
> > FAIL: 29_atomics/atomic_flag/cons/value_init.cc -std=gnu++26 (test for
> > excess errors)
> > UNRESOLVED: 29_atomics/atomic_flag/cons/value_init.cc -std=gnu++26
> > compilation failed to produce executable
> > FAIL: 29_atomics/atomic_flag/test_and_set/explicit.cc -std=gnu++17 (test
> > for excess errors)
> > UNRESOLVED: 29_atomics/atomic_flag/test_and_set/explicit.cc -std=gnu++17
> > compilation failed to produce executable
> > FAIL: 29_atomics/atomic_flag/test_and_set/implicit.cc -std=gnu++17 (test
> > for excess errors)
> > UNRESOLVED: 29_atomics/atomic_flag/test_and_set/implicit.cc -std=gnu++17
> > compilation failed to produce executable
> >
> > The linker error is:
> > undefined reference to `__atomic_test_and_set'
>
> Here's 2/2, fixing those regressions by, after code
> inspection, gating those test-case users of
> dg-require-thread-fence that actually use not just
> atomic-load/store functionality, but a form of
> compare-exchange, including the test-and-set cases listed
> above as testsuite regressions.  That neatly includes the
> regressions above.

The new dg-require proc checks for __atomic_exchange, which is not the
same as compare-exchange, and not the same as test-and-set on
atomic_flag. Does it just happen to be true for arm that the presence
of __atomic_exchange also implies the other atomic operations?

The new proc also only tests it for int, which presumably means none
of these tests rely on atomics for long long being present. Some of
the tests use atomics on pointers, which should work for ILP32 targets
if atomics work on int, but the new dg-require-atomic-exchange isn't
sufficient to check that it will work for pointers on LP64 targets.

Maybe it happens to work today for the targets where we have issues,
but we seem to be assuming that there will be no LP64 targets where
these atomics are absent for 64-bit pointers. Are there supported
risc-v ISA subsets where that isn't true? And we're assuming that
__atomic_exchange being present implies all other ops are present,
which seems like a bad assumption to me. I would be more confident
testing for __atomic_compare_exchange, because with a wait-free CAS
you can implement any other atomic operation, but the same isn't true
for __atomic_exchange.

>
> Again, other libstdc++ test-cases should likely also use
> this gate, from what I see of "undefined references" in
> libstdc++.log.
>
> Tested together with 1/2.
> Ok to commit?
>
> (N.B. there was a stray suffix "non-atomic code" in the
> subject of 1/2; that's just a typo which will not be
> committed.)
>
> -- >8 --
> These tests actually use a form of atomic exchange, not just
> atomic loading and storing.  Some target have the latter,
> but not the former, yielding linker errors for missing
> library functions (and not supported by libatomic).
>
> This change is just for existing uses of
> dg-require-thread-fence.  It does not fix any other tests
> that should also be gated on dg-require-atomic-exchange.
>
> * testsuite/29_atomics/atomic/compare_exchange_padding.cc,
> testsuite/29_atomics/atomic_flag/clear/1.cc,
> testsuite/29_atomics/atomic_flag/cons/value_init.cc,
> testsuite/29_atomics/atomic_flag/test_and_set/explicit.cc,
> testsuite/29_atomics/atomic_flag/test_and_set/implicit.cc,
> testsuite/29_atomics/atomic_ref/compare_exchange_padding.cc,
> testsuite/29_atomics/atomic_ref/generic.cc,
> testsuite/29_atomics/atomic_ref/integral.cc,
> testsuite/29_atomics/atomic_ref/pointer.cc: Replace
> dg-require-thread-fence with dg-require-atomic-exchange.
> ---
>  .../testsuite/29_atomics/atomic/compare_exchange_padding.cc | 2 +-
>  libstdc++-v3/testsuite/29_atomics/atomic_flag/clear/1.cc| 2 +-
>  .../testsuite/29_atomics/atomic_flag/cons/value_init.cc | 2 +-
>  .../testsuite/29_atomics/atomic_flag/test_and_set/explicit.cc   | 2 +-
>  .../testsuite/29_atomics/atomic_flag/test_and_set/implicit.cc   | 2 +-
>  .../testsuite/29_atomics/atomic_ref/compare_exchange_padding.cc | 2 +-
>  libstdc++-v3/testsuite/29_atomics/atomic_ref/generic.cc | 2 +-
>  libstdc++-v3/testsuite/29_atomics/atomic_ref/integral.cc| 2 +-
>  libstdc++-v3/testsuite/29_atomics/atomic_ref/po

Re: [PATCH] RISC-V: THead: Fix missing CFI directives for th.sdd in prologue.

2023-10-04 Thread 瞿仙淼
On 10/4/23 15:49, Xianmiao Qu wrote:
> 
> From: quxm 
> 
I'm sorry for posting the wrong username and email. If someone helps me merge 
the code later, please delete this line for me (Just use the username and email 
from the email I am currently sending, Xianmiao Qu 
).
Thanks,
Xianmiao


Re: [ARC PATCH] Split SImode shifts pre-reload on !TARGET_BARREL_SHIFTER.

2023-10-04 Thread Claudiu Zissulescu Ianculescu
Hi Roger,

The patch as it is passed the validation, and it is in general OK.
Although it doesn't address the elephant in the room, namely
output_shift function, it is a welcome cleanup.
I would like you to split the patch in two. One which deals with
improvements on shifts in absence of a barrel shifter, and one which
addresses the default instruction length, as they can be seen as
separate work. Please feel free to commit resulting patches to the
mainline.

Thank you for your contribution,
Claudiu

On Thu, Sep 28, 2023 at 2:27 PM Roger Sayle  wrote:
>
>
> Hi Claudiu,
> It was great meeting up with you and the Synopsys ARC team at the
> GNU tools Cauldron in Cambridge.
>
> This patch is the first in a series to improve SImode and DImode
> shifts and rotates in the ARC backend.  This first piece splits
> SImode shifts, for !TARGET_BARREL_SHIFTER targets, after combine
> and before reload, in the split1 pass, as suggested by the FIXME
> comment above output_shift in arc.cc.  To do this I've copied the
> implementation of the x86_pre_reload_split function from i386
> backend, and renamed it arc_pre_reload_split.
>
> Although the actual implementations of shifts remain the same
> (as in output_shift), having them as explicit instructions in
> the RTL stream allows better scheduling and use of compact forms
> when available.  The benefits can be seen in two short examples
> below.
>
> For the function:
> unsigned int foo(unsigned int x, unsigned int y) {
>   return y << 2;
> }
>
> GCC with -O2 -mcpu=em would previously generate:
> foo:add r1,r1,r1
> add r1,r1,r1
> j_s.d   [blink]
> mov_s   r0,r1   ;4
> and with this patch now generates:
> foo:asl_s r0,r1
> j_s.d   [blink]
> asl_s r0,r0
>
> Notice the original (from shift_si3's output_shift) requires the
> shift sequence to be monolithic with the same destination register
> as the source (requiring an extra mov_s).  The new version can
> eliminate this move, and schedule the second asl in the branch
> delay slot of the return.
>
> For the function:
> int x,y,z;
>
> void bar()
> {
>   x <<= 3;
>   y <<= 3;
>   z <<= 3;
> }
>
> GCC -O2 -mcpu=em currently generates:
> bar:push_s  r13
> ld.as   r12,[gp,@x@sda] ;23
> ld.as   r3,[gp,@y@sda]  ;23
> mov r2,0
> add3 r12,r2,r12
> mov r2,0
> add3 r3,r2,r3
> ld.as   r2,[gp,@z@sda]  ;23
> st.as   r12,[gp,@x@sda] ;26
> mov r13,0
> add3 r2,r13,r2
> st.as   r3,[gp,@y@sda]  ;26
> st.as   r2,[gp,@z@sda]  ;26
> j_s.d   [blink]
> pop_s   r13
>
> where each shift by 3, uses ARC's add3 instruction, which is similar
> to x86's lea implementing x = (y<<3) + z, but requires the value zero
> to be placed in a temporary register "z".  Splitting this before reload
> allows these pseudos to be shared/reused.  With this patch, we get
>
> bar:ld.as   r2,[gp,@x@sda]  ;23
> mov_s   r3,0;3
> add3r2,r3,r2
> ld.as   r3,[gp,@y@sda]  ;23
> st.as   r2,[gp,@x@sda]  ;26
> ld.as   r2,[gp,@z@sda]  ;23
> mov_s   r12,0   ;3
> add3r3,r12,r3
> add3r2,r12,r2
> st.as   r3,[gp,@y@sda]  ;26
> st.as   r2,[gp,@z@sda]  ;26
> j_s [blink]
>
> Unfortunately, register allocation means that we only share two of the
> three "mov_s z,0", but this is sufficient to reduce register pressure
> enough to avoid spilling r13 in the prologue/epilogue.
>
> This patch also contains a (latent?) bug fix.  The implementation of
> the default insn "length" attribute, assumes instructions of type
> "shift" have two input operands and accesses operands[2], hence
> specializations of shifts that don't have a operands[2], need to be
> categorized as type "unary" (which results in the correct length).
>
> This patch has been tested on a cross-compiler to arc-elf (hosted on
> x86_64-pc-linux-gnu), but because I've an incomplete tool chain many
> of the regression test fail, but there are no new failures with new
> test cases added below.  If you can confirm that there are no issues
> from additional testing, is this OK for mainline?
>
> Finally a quick technical question.  ARC's zero overhead loops require
> at least two instructions in the loop, so currently the backend's
> implementation of shr20 pads the loop body with a "nop".
>
> lshr20: mov.f lp_count, 20
> lpnz2f
> lsr r0,r0
> nop
> 2:  # end single insn loop
> j_s [blink]
>
> could this be more efficiently implemented as:
>
> lshr20: mov lp_count, 10
> lp 2f
> lsr_s r0,r0
> lsr_s r0,r0
> 2:  # end single insn loop
> j_s [blink]
>
> i.e. half the number of iterations, but doing twice as much useful
> work in each iteration?  Or might the nop be free on advanced
> microarchitectures, and/or the consecutive dependent shifts cause
> a pipeline stall?  It would be nice to fuse loops t

Re: [PATCH] __atomic_test_and_set: Fall back to library, not non-atomic code

2023-10-04 Thread Christophe Lyon
On Wed, 4 Oct 2023 at 02:49, Hans-Peter Nilsson  wrote:

> (Just before sending, I noticed you replied off-list; I
> won't add back gcc-patches to cc here myself, but please
> feel free to do it, if you choose to reply.)
>

Sorry, it was a typo of mine, I meant to reply to the list


>
> > From: Christophe Lyon 
> > Date: Tue, 3 Oct 2023 18:06:21 +0200
>
> > On Tue, 3 Oct 2023 at 17:16, Hans-Peter Nilsson  wrote:
> >
> > > > From: Christophe Lyon 
> > > > Date: Tue, 3 Oct 2023 15:20:39 +0200
> > >
> > > > The patch passed almost all our CI configurations, except arm-eabi
> when
> > > > testing with
> > > >  -mthumb/-march=armv6s-m/-mtune=cortex-m0/-mfloat-abi=soft/-mfpu=auto
> > > > where is causes these failures:
> > > > FAIL: 29_atomics/atomic_flag/clear/1.cc -std=gnu++17 (test for excess
> > > > errors)
> > > > UNRESOLVED: 29_atomics/atomic_flag/clear/1.cc -std=gnu++17
> compilation
> > > > failed to produce executable
> [...]
> > > For which set of multilibs in that set, do you get these
> > > errors?  I'm guessing -march=armv6s-m, but I'm checking.
> > >
> >
> > Not sure to understand the question?
>
> By your "testing with
> -mthumb/-march=armv6s-m/-mtune=cortex-m0/-mfloat-abi=soft/-mfpu=auto"
> I presume you mean "testing with make check
>
> 'RUNTESTFLAGS=--target_board=arm-sim/-mthumb/-march=armv6s-m/-mtune=cortex-m0/-mfloat-abi=soft/-mfpu=auto'
> (as that's where you usually put that expression)
>
Yes


>
> - but I misremembered what "/" means in RUNTESTFLAGS (it's
> combining options in one single set of options passed to
> gcc, not delimiting separate options used to form
> combinations).  Sorry for the confusion!
>
I see.

Actually I always find "multilib" confusing in the context of running the
tests, where in fact we generally mean we add some flags in
RUNTESTFLAGS and/or --target_board.
To me, multilib refers to the set of lib variants we build, but I noticed
"multilib" is often used in the context of running the tests...


>
> > > > Maybe we need a new variant of dg-require-thread-fence ?
> > >
> > > Perhaps.
>
> Or rather: most certainly.  IIUC, ARMv6 (or whatever you
> prefer to call it) can load and store atomically, but only
> as separate events; it can't atomically exchange a stored
> value and therefore arm-eabi calls out to a library
> function.
>
In this case, it's armv6s-m (which is different from armv6)
And yes, it seems so, as your patch showed, assuming there's
no bug in the target description ;-)


> I think I'll help and replace the obvious uses of
> dg-require-thread-fence where actually an atomic exchange is
> required; replacing those with a new directive
> dg-require-atomic-exchange.  That will however not be *all*
> places where such a guard should be added.
>
Indeed.


> I also see lots of undefined references to *other* outlined
> atomic builtins, for example __atomic_compare_exchange_4 and
> __atomic_fetch_add_4.  Though, those likely aren't
> regressions.  I understand you focus on regressions here.
>
Yes, my reply to your patch was meant to look at the regressions.

As a separate action, I plan to look at the remaining existing such
failures.


> By the way, how do you test; what simulator, what baseboard
> file?  Please share!  Also, please send *some*
> contrib/test_summary reports for arm-eabi to
> gcc-testresults@ every now and then.  (But also, please
>
We use qemu, with qemu.exp from:
https://git.linaro.org/toolchain/abe.git/tree/config/boards/qemu.exp
nothing fancy ;-)



> don't post multiple results several times a day for similar
> configurations.  Looking at you, powerpc people!)
>
We have plans to restart sending such results, like I was doing several
years ago
(with many results every day, too ;-))


> I can't test *anything* newer than default arm-eabi (armv4t)
> on arm-sim (the one next to gdb), or else execution tests
> get lost and time out while also complaining about "Unknown
> machine type".  I noticed there's a qemu.exp in ToT dejagnu,
> but it doesn't work for arm-eabi for at least two reasons.
> (I might get to that yak later, I just take it as a sign
> that qemu-arm isn't what I look for.)

We do use qemu-arm, depending on how you want to test,
maybe you need to add a -cpu flag such that it supports
the required instructions.


> >  Unless of course, there's a multilib combination
>
> > for which you *can* emit the proper atomic spell; missing it
> > > because the need for it, has been hidden!
> > >
> > > (At first I thought it was related to caching the
> > > thread-fence property across multilib testing, but I don't
> > > think that was correct.)
> > >
> > Not sure what you mean? We run the tests for a single multilib here
> > (mthumb/-march=armv6s-m/-mtune=cortex-m0/-mfloat-abi=soft/-mfpu=auto)
> > so the cached value should always be correct?
>
> Yeah, part of my RUNTESTFLAGS confusion per above, please
> ignore that.
>
> brgds, H-P
>


[PATCH] ipa/111643 - clarify flatten attribute documentation

2023-10-04 Thread Richard Biener
The following clarifies the flatten attribute documentation to mention
the inlining applies also to calls formed as part of inlining earlier
calls but not calls to the function itself.

Will push this tomorrow or so if there are no better suggestions
on the wording.

PR ipa/111643
* doc/extend.texi (attribute flatten): Clarify.
---
 gcc/doc/extend.texi | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index b4770f1a149..645c76f23e9 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -3109,7 +3109,9 @@ file descriptor opened with @code{O_RDONLY}.
 @cindex @code{flatten} function attribute
 @item flatten
 Generally, inlining into a function is limited.  For a function marked with
-this attribute, every call inside this function is inlined, if possible.
+this attribute, every call inside this function is inlined including the
+calls such inlining introduces to the function (but not recursive calls
+to the function itself), if possible.
 Functions declared with attribute @code{noinline} and similar are not
 inlined.  Whether the function itself is considered for inlining depends
 on its size and the current inlining parameters.
-- 
2.35.3


PR111648: Fix wrong code-gen due to incorrect VEC_PERM_EXPR folding

2023-10-04 Thread Prathamesh Kulkarni
Hi,
The attached patch attempts to fix PR111648.
As mentioned in PR, the issue is when a1 is a multiple of vector
length, we end up creating following encoding in result: { base_elem,
arg[0], arg[1], ... } (assuming S = 1),
where arg is chosen input vector, which is incorrect, since the
encoding originally in arg would be: { arg[0], arg[1], arg[2], ... }

For the test-case mentioned in PR, vectorizer pass creates
VEC_PERM_EXPR where:
arg0: { -16, -9, -10, -11 }
arg1: { -12, -5, -6, -7 }
sel = { 3, 4, 5, 6 }

arg0, arg1 and sel are encoded with npatterns = 1 and nelts_per_pattern = 3.
Since a1 = 4 and arg_len = 4, it ended up creating the result with
following encoding:
res = { arg0[3], arg1[0], arg1[1] } // npatterns = 1, nelts_per_pattern = 3
  = { -11, -12, -5 }

So for res[3], it used S = (-5) - (-12) = 7
And hence computed it as -5 + 7 = 2.
instead of selecting arg1[2], ie, -6.

The patch tweaks valid_mask_for_fold_vec_perm_cst_p to punt if a1 is a multiple
of vector length, so a1 ... ae select elements only from stepped part
of the pattern
from input vector and return false for this case.

Since the vectors are VLS, fold_vec_perm_cst then sets:
res_npatterns = res_nelts
res_nelts_per_pattern  = 1
which seems to fix the issue by encoding all the elements.

The patch resulted in Case 4 and Case 5 failing from test_nunits_min_2 because
they used sel = { 0, 0, 1, ... } and {len, 0, 1, ... } respectively,
which used a1 = 0, and thus selected arg1[0].

I removed Case 4 because it was already covered in test_nunits_min_4,
and moved Case 5 to test_nunits_min_4, with sel = { len, 1, 2, ... }
and added a new Case 9 to test for this issue.

Passes bootstrap+test on aarch64-linux-gnu with and without SVE,
and on x86_64-linux-gnu.
Does the patch look OK ?

Thanks,
Prathamesh
[PR111648] Fix wrong code-gen due to incorrect VEC_PERM_EXPR folding.

gcc/ChangeLog:
PR tree-optimization/111648
* fold-const.cc (valid_mask_for_fold_vec_perm_cst_p): Punt if a1
is a multiple of vector length.
(test_nunits_min_2): Remove Case 4 and move Case 5 to ...
(test_nunits_min_4): ... here and rename case numbers. Also add
Case 9.

gcc/testsuite/ChangeLog:
PR tree-optimization/111648
* gcc.dg/vect/pr111648.c: New test.


diff --git a/gcc/fold-const.cc b/gcc/fold-const.cc
index 4f8561509ff..c5f421d6b76 100644
--- a/gcc/fold-const.cc
+++ b/gcc/fold-const.cc
@@ -10682,8 +10682,8 @@ valid_mask_for_fold_vec_perm_cst_p (tree arg0, tree 
arg1,
  return false;
}
 
-  /* Ensure that the stepped sequence always selects from the same
-input pattern.  */
+  /* Ensure that the stepped sequence always selects from the stepped
+part of same input pattern.  */
   unsigned arg_npatterns
= ((q1 & 1) == 0) ? VECTOR_CST_NPATTERNS (arg0)
  : VECTOR_CST_NPATTERNS (arg1);
@@ -10694,6 +10694,20 @@ valid_mask_for_fold_vec_perm_cst_p (tree arg0, tree 
arg1,
*reason = "step is not multiple of npatterns";
  return false;
}
+
+  /* If a1 is a multiple of len, it will select base element of input
+vector resulting in following encoding:
+{ base_elem, arg[0], arg[1], ... } where arg is the chosen input
+vector. This encoding is not originally present in arg, since it's
+defined as:
+{ arg[0], arg[1], arg[2], ... }.  */
+
+  if (multiple_p (a1, arg_len))
+   {
+ if (reason)
+   *reason = "selecting base element of input vector";
+ return false;
+   }
 }
 
   return true;
@@ -17425,47 +17439,6 @@ test_nunits_min_2 (machine_mode vmode)
tree expected_res[] = { ARG0(0), ARG1(0), ARG0(1), ARG1(1) };
validate_res (2, 2, res, expected_res);
   }
-
-  /* Case 4: mask = {0, 0, 1, ...} // (1, 3)
-Test that the stepped sequence of the pattern selects from
-same input pattern. Since input vectors have npatterns = 2,
-and step (a2 - a1) = 1, step is not a multiple of npatterns
-in input vector. So return NULL_TREE.  */
-  {
-   tree arg0 = build_vec_cst_rand (vmode, 2, 3, 1);
-   tree arg1 = build_vec_cst_rand (vmode, 2, 3, 1);
-   poly_uint64 len = TYPE_VECTOR_SUBPARTS (TREE_TYPE (arg0));
-
-   vec_perm_builder builder (len, 1, 3);
-   poly_uint64 mask_elems[] = { 0, 0, 1 };
-   builder_push_elems (builder, mask_elems);
-
-   vec_perm_indices sel (builder, 2, len);
-   const char *reason;
-   tree res = fold_vec_perm_cst (TREE_TYPE (arg0), arg0, arg1, sel,
- &reason);
-   ASSERT_TRUE (res == NULL_TREE);
-   ASSERT_TRUE (!strcmp (reason, "step is not multiple of npatterns"));
-  }
-
-  /* Case 5: mask = {len, 0, 1, ...} // (1, 3)
-Test that stepped sequence of the pattern selects from arg0.
-res = { arg1[0], arg0[0], arg0[1], ... } // (1, 3)  */
-  {
-   tree 

Re: [PATCH7/8] vect: Add TARGET_SIMD_CLONE_ADJUST_RET_OR_PARAM

2023-10-04 Thread Andre Vieira (lists)




On 30/08/2023 14:04, Richard Biener wrote:

On Wed, 30 Aug 2023, Andre Vieira (lists) wrote:


This patch adds a new target hook to enable us to adapt the types of return
and parameters of simd clones.  We use this in two ways, the first one is to
make sure we can create valid SVE types, including the SVE type attribute,
when creating a SVE simd clone, even when the target options do not support
SVE.  We are following the same behaviour seen with x86 that creates simd
clones according to the ABI rules when no simdlen is provided, even if that
simdlen is not supported by the current target options.  Note that this
doesn't mean the simd clone will be used in auto-vectorization.


You are not documenting the bool parameter of the new hook.

What's wrong with doing the adjustment in TARGET_SIMD_CLONE_ADJUST?


simd_clone_adjust_argument_types is called after that hook, so by the 
time we call TARGET_SIMD_CLONE_ADJUST the types are still in scalar, not 
vector.  The same is true for the return type one.


Also the changes to the types need to be taken into consideration in 
'adjustments' I think.


PS: I hope the subject line survived, my email client is having a bit of 
a wobble this morning... it's what you get for updating software :(


Re: [PATCH7/8] vect: Add TARGET_SIMD_CLONE_ADJUST_RET_OR_PARAM

2023-10-04 Thread Richard Biener
On Wed, 4 Oct 2023, Andre Vieira (lists) wrote:

> 
> 
> On 30/08/2023 14:04, Richard Biener wrote:
> > On Wed, 30 Aug 2023, Andre Vieira (lists) wrote:
> > 
> >> This patch adds a new target hook to enable us to adapt the types of return
> >> and parameters of simd clones.  We use this in two ways, the first one is
> >> to
> >> make sure we can create valid SVE types, including the SVE type attribute,
> >> when creating a SVE simd clone, even when the target options do not support
> >> SVE.  We are following the same behaviour seen with x86 that creates simd
> >> clones according to the ABI rules when no simdlen is provided, even if that
> >> simdlen is not supported by the current target options.  Note that this
> >> doesn't mean the simd clone will be used in auto-vectorization.
> > 
> > You are not documenting the bool parameter of the new hook.
> > 
> > What's wrong with doing the adjustment in TARGET_SIMD_CLONE_ADJUST?
> 
> simd_clone_adjust_argument_types is called after that hook, so by the time we
> call TARGET_SIMD_CLONE_ADJUST the types are still in scalar, not vector.  The
> same is true for the return type one.
> 
> Also the changes to the types need to be taken into consideration in
> 'adjustments' I think.

Nothing in the three existing implementations of TARGET_SIMD_CLONE_ADJUST
relies on this ordering I think, how about moving the hook invocation 
after simd_clone_adjust_argument_types?

Richard.

> PS: I hope the subject line survived, my email client is having a bit of a
> wobble this morning... it's what you get for updating software :(


Re: Check that passes do not forget to define profile

2023-10-04 Thread Jan Hubicka
> Hi Honza,
> 
> My current patch set for AArch64 VLA omp codegen started failing on
> gcc.dg/gomp/pr87898.c after this. I traced it back to
> 'move_sese_region_to_fn' in tree/cfg.cc not setting count for the bb
> created.
> 
> I was able to 'fix' it locally by setting the count of the new bb to the
> accumulation of e->count () of all the entry_endges (if initialized). I'm
> however not even close to certain that's the right approach, attached patch
> for illustration.
> 
> Kind regards,
> Andre
> diff --git a/gcc/tree-cfg.cc b/gcc/tree-cfg.cc

> index 
> ffab7518b1568b58e610e26feb9e3cab18ddb3c2..32fc47ae683164bf8fac477fbe6e2c998382e754
>  100644
> --- a/gcc/tree-cfg.cc
> +++ b/gcc/tree-cfg.cc
> @@ -8160,11 +8160,15 @@ move_sese_region_to_fn (struct function *dest_cfun, 
> basic_block entry_bb,
>bb = create_empty_bb (entry_pred[0]);
>if (current_loops)
>  add_bb_to_loop (bb, loop);
> +  profile_count count = profile_count::zero ();
>for (i = 0; i < num_entry_edges; i++)
>  {
>e = make_edge (entry_pred[i], bb, entry_flag[i]);
>e->probability = entry_prob[i];
> +  if (e->count ().initialized_p ())
> + count += e->count ();
>  }
> +  bb->count = count;

This looks generally right - if you create a BB you need to set its
count and unless it has self-loop that is the sum of counts of
incommping edges.

However the initialized_p check should be unnecessary: if one of entry
edges to BB is uninitialized, the + operation will make bb count
uninitialized too, which is OK.

Honza
>  
>for (i = 0; i < num_exit_edges; i++)
>  {



[Patch] libgomp.texi: Clarify that no other OpenMP context selectors are implemented

2023-10-04 Thread Tobias Burnus

I got confused myself when reading

https://gcc.gnu.org/onlinedocs/libgomp/OpenMP-Context-Selectors.html

Especially with regards to other platforms like PowerPC.

It turned out that the list is complete, kind of. For 'arch' and 'isa'
those are the only ones - if we want to have more, it has to be
implemented (→ cf. PR105640).

For 'kind': The host compiler always matches 'kind=host'; this also
applies to AMD GCN and Nvidia PTX - when compiled as stand-alone
compiler. Those two also match 'kind=gpu' (both as stand-alone
compiler(*) and for offloading).

The attached documentation patch attempts to clarify this for both users
– and for implementers, for the latter, there is also a comment in the
.texi with more details.

Comments, suggestions, remarks?

Tobias

(*) This could be changed by checking for "#ifndef ACCEL_COMPILER" in
gcc/config/{gcn/nvptx}/ but that does not seem to be worthwhile.
-
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 
München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas 
Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht 
München, HRB 106955
libgomp.texi: Clarify that no other OpenMP context selectors are implemented

libgomp/ChangeLog:

	* libgomp.texi (OpenMP Context Selectors): Clarify 'kind' trait
	and that other target archs have no 'arch'/'isa' traits implemented.

diff --git a/libgomp/libgomp.texi b/libgomp/libgomp.texi
index f21557c3c52..d24f590fd84 100644
--- a/libgomp/libgomp.texi
+++ b/libgomp/libgomp.texi
@@ -4982,18 +4982,27 @@ smaller number.  On non-host devices, the value of the
 
 @code{vendor} is always @code{gnu}. References are to the GCC manual.
 
-@multitable @columnfractions .60 .10 .25
-@headitem @code{arch} @tab @code{kind} @tab @code{isa}
+@c NOTE: Only the following selectors have been implemented. To add
+@c additional traits for target architecture, TARGET_OMP_DEVICE_KIND_ARCH_ISA
+@c has to be implemented; cf. also PR target/105640.
+@c For offload devices, add *additionally* gcc/config/*/t-omp-device.
+
+For the host compiler, @code{kind} always matches @code{host}; for the
+offloading architectures AMD GCN and Nvidia PTX, @code{kind} always matches
+@code{gpu}.  For the x86 family of computers, AMD GCN and Nvidia PTX
+the following traits are supported in addition; while OpenMP is supported
+on more architectures, GCC currently does not match any @code{arch} or
+@code{isa} traits for those.
+
+@multitable @columnfractions .65 .30
+@headitem @code{arch} @tab @code{isa}
 @item @code{x86}, @code{x86_64}, @code{i386}, @code{i486},
   @code{i586}, @code{i686}, @code{ia32}
-  @tab @code{host}
   @tab See @code{-m...} flags in ``x86 Options'' (without @code{-m})
 @item @code{amdgcn}, @code{gcn}
-  @tab @code{gpu}
   @tab See @code{-march=} in ``AMD GCN Options''@footnote{Additionally,
   @code{gfx803} is supported as an alias for @code{fiji}.}
 @item @code{nvptx}
-  @tab @code{gpu}
   @tab See @code{-march=} in ``Nvidia PTX Options''
 @end multitable
 


Re: [Patch] libgomp.texi: Clarify that no other OpenMP context selectors are implemented

2023-10-04 Thread Jakub Jelinek
On Wed, Oct 04, 2023 at 01:08:15PM +0200, Tobias Burnus wrote:
> I got confused myself when reading
> 
> https://gcc.gnu.org/onlinedocs/libgomp/OpenMP-Context-Selectors.html
> 
> Especially with regards to other platforms like PowerPC.
> 
> It turned out that the list is complete, kind of. For 'arch' and 'isa'
> those are the only ones - if we want to have more, it has to be
> implemented (→ cf. PR105640).
> 
> For 'kind': The host compiler always matches 'kind=host'; this also
> applies to AMD GCN and Nvidia PTX - when compiled as stand-alone
> compiler. Those two also match 'kind=gpu' (both as stand-alone
> compiler(*) and for offloading).
> 
> The attached documentation patch attempts to clarify this for both users
> – and for implementers, for the latter, there is also a comment in the
> .texi with more details.
> 
> Comments, suggestions, remarks?
> 
> Tobias
> 
> (*) This could be changed by checking for "#ifndef ACCEL_COMPILER" in
> gcc/config/{gcn/nvptx}/ but that does not seem to be worthwhile.
> -
> Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 
> München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas 
> Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht 
> München, HRB 106955

> libgomp.texi: Clarify that no other OpenMP context selectors are implemented
> 
> libgomp/ChangeLog:
> 
>   * libgomp.texi (OpenMP Context Selectors): Clarify 'kind' trait
>   and that other target archs have no 'arch'/'isa' traits implemented.

LGTM.

Jakub



[committed,gcc-11] libstdc++: Fix testsuite failures with -O0

2023-10-04 Thread Jonathan Wakely
I've pushed this to gcc-11 after testing on x86_64-linux.

-- >8 --

Backport the prune.exp change from r12-4425-g1595fe44e11a96 to fix two
testsuite failures when testing with -O0:
FAIL: 20_util/uses_allocator/69293_neg.cc (test for excess errors)
FAIL: 20_util/uses_allocator/cons_neg.cc (test for excess errors)

Also force some 20_util/integer_comparisons/ xfail tests to use -O2 so
that the errors match the dg-error directives.

libstdc++-v3/ChangeLog:

* testsuite/20_util/integer_comparisons/greater_equal_neg.cc:
Add -O2 to dg-options.
* testsuite/20_util/integer_comparisons/greater_neg.cc:
Likewise.
* testsuite/20_util/integer_comparisons/less_equal_neg.cc:
Likewise.
* testsuite/lib/prune.exp: Prune 'in constexpr expansion'.
---
 .../testsuite/20_util/integer_comparisons/greater_equal_neg.cc  | 2 +-
 .../testsuite/20_util/integer_comparisons/greater_neg.cc| 2 +-
 .../testsuite/20_util/integer_comparisons/less_equal_neg.cc | 2 +-
 libstdc++-v3/testsuite/lib/prune.exp| 1 +
 4 files changed, 4 insertions(+), 3 deletions(-)

diff --git 
a/libstdc++-v3/testsuite/20_util/integer_comparisons/greater_equal_neg.cc 
b/libstdc++-v3/testsuite/20_util/integer_comparisons/greater_equal_neg.cc
index 62633262948..028dce3df51 100644
--- a/libstdc++-v3/testsuite/20_util/integer_comparisons/greater_equal_neg.cc
+++ b/libstdc++-v3/testsuite/20_util/integer_comparisons/greater_equal_neg.cc
@@ -15,7 +15,7 @@
 // with this library; see the file COPYING3.  If not see
 // .
 
-// { dg-options "-std=gnu++2a" }
+// { dg-options "-std=gnu++2a -O2" }
 // { dg-do compile { target c++2a } }
 
 #include 
diff --git a/libstdc++-v3/testsuite/20_util/integer_comparisons/greater_neg.cc 
b/libstdc++-v3/testsuite/20_util/integer_comparisons/greater_neg.cc
index 48cb64d5676..f0422bc1948 100644
--- a/libstdc++-v3/testsuite/20_util/integer_comparisons/greater_neg.cc
+++ b/libstdc++-v3/testsuite/20_util/integer_comparisons/greater_neg.cc
@@ -15,7 +15,7 @@
 // with this library; see the file COPYING3.  If not see
 // .
 
-// { dg-options "-std=gnu++2a" }
+// { dg-options "-std=gnu++2a -O2" }
 // { dg-do compile { target c++2a } }
 
 #include 
diff --git 
a/libstdc++-v3/testsuite/20_util/integer_comparisons/less_equal_neg.cc 
b/libstdc++-v3/testsuite/20_util/integer_comparisons/less_equal_neg.cc
index a16b36a83c9..3bd5d6480fe 100644
--- a/libstdc++-v3/testsuite/20_util/integer_comparisons/less_equal_neg.cc
+++ b/libstdc++-v3/testsuite/20_util/integer_comparisons/less_equal_neg.cc
@@ -15,7 +15,7 @@
 // with this library; see the file COPYING3.  If not see
 // .
 
-// { dg-options "-std=gnu++2a" }
+// { dg-options "-std=gnu++2a -O2" }
 // { dg-do compile { target c++2a } }
 
 #include 
diff --git a/libstdc++-v3/testsuite/lib/prune.exp 
b/libstdc++-v3/testsuite/lib/prune.exp
index 6c905631f16..2ebfb922ef4 100644
--- a/libstdc++-v3/testsuite/lib/prune.exp
+++ b/libstdc++-v3/testsuite/lib/prune.exp
@@ -46,6 +46,7 @@ proc libstdc++-dg-prune { system text } {
 regsub -all "(^|\n)\[^\n\]*(: )?At (top level|global scope):\[^\n\]*" 
$text "" text
 regsub -all "(^|\n)\[^\n\]*:   (recursively )?required \[^\n\]*" $text "" 
text
 regsub -all "(^|\n)\[^\n\]*:   . skipping \[0-9\]* instantiation contexts 
\[^\n\]*" $text "" text
+regsub -all "(^|\n)\[^\n\]*:   in .constexpr. expansion \[^\n\]*" $text "" 
text
 regsub -all "(^|\n)inlined from \[^\n\]*" $text "" text
 # Why doesn't GCC need these to strip header context?
 regsub -all "(^|\n)In file included from \[^\n\]*" $text "" text
-- 
2.41.0



Re: [PATCH] LoongArch: Reimplement multilib build option handling.

2023-10-04 Thread Jan-Benedict Glaw
Hi!

On Wed, 2023-09-13 17:52:14 +0800, Yang Yujie  wrote:
> Library build options from --with-multilib-list used to be processed with
> *self_spec, which missed the driver's initial canonicalization.  This
> caused limitations on CFLAGS override and the use of driver-only options
> like -m[no]-lsx.
> 
> The problem is solved by promoting the injection rules of --with-multilib-list
> options to the first element of DRIVER_SELF_SPECS, to make them execute before
> the canonialization.  The library-build options are also hard-coded in
> the driver and can be used conveniently by the builders of other non-gcc
> libraries via the use of -fmultiflags.
> 
> Bootstrapped and tested on loongarch64-linux-gnu.

Seems this breaks for me with

../gcc/configure [...] --enable-werror-always --enable-languages=all 
--disable-gcov --disable-shared --disable-threads 
--target=loongarch64-linux-gnuf32 --without-headers
make V=1 all-gcc


See eg. 
http://toolchain.lug-owl.de/laminar/jobs/gcc-loongarch64-linux-gnuf32/44 :

/var/lib/laminar/run/gcc-loongarch64-linux-gnuf32/44/local-toolchain-install/bin/g++
 -c   -g -O2   -DIN_GCC -DCROSS_DIRECTORY_STRUCTURE   -fno-exceptions -fno-rtti 
-fasynchronous-unwind-tables -W -Wall -Wno-narrowing -Wwrite-strings 
-Wcast-qual -Wmissing-format-attribute -Wconditionally-supported 
-Woverloaded-virtual -pedantic -Wno-long-long -Wno-variadic-macros 
-Wno-overlength-strings -Werror -fno-common  -DHAVE_CONFIG_H  -DGENERATOR_FILE 
-I. -Ibuild -I../../gcc/gcc -I../../gcc/gcc/build -I../../gcc/gcc/../include  
-I../../gcc/gcc/../libcpp/include  \
 -o build/genpreds.o ../../gcc/gcc/genpreds.cc
In file included from ../../gcc/gcc/config/loongarch/loongarch.h:53,
 from ./tm.h:50,
 from ../../gcc/gcc/genpreds.cc:26:
../../gcc/gcc/config/loongarch/loongarch-driver.h:82:10: fatal error: 
loongarch-multilib.h: No such file or directory
   82 | #include "loongarch-multilib.h"
  |  ^~
compilation terminated.
make[1]: *** [Makefile:2966: build/genpreds.o] Error 1
make[1]: Leaving directory 
'/var/lib/laminar/run/gcc-loongarch64-linux-gnuf32/44/toolchain-build/gcc'
make: *** [Makefile:4659: all-gcc] Error 2


So it failed to execute the t-multilib fragment? Happens for all my
loongarch compilation tests:

http://toolchain.lug-owl.de/laminar/jobs/gcc-loongarch64-linux/45
http://toolchain.lug-owl.de/laminar/jobs/gcc-loongarch64-linux-gnuf32/44
http://toolchain.lug-owl.de/laminar/jobs/gcc-loongarch64-linux-gnuf64/44
http://toolchain.lug-owl.de/laminar/jobs/gcc-loongarch64-linux-gnusf/44

And when this is fixed, it might be a nice idea to have a
--with-multilib-list config in ./contrib/config-list.mk .

MfG, JBG

-- 


signature.asc
Description: PGP signature


Re: [PATCH] RISC-V: Fix the riscv_legitimize_poly_move issue on targets where the minimal VLEN exceeds 512.

2023-10-04 Thread 钟居哲
I think the "max poly value" is the LMUL 1 mode coeffs[1]

See int vlenb = BYTES_PER_RISCV_VECTOR.coeffs[1];

So I think bump max_power to exact_log2 (64); is not enough.
since we adjust the LMUL 1 mode size according to TARGET_MIN_VLEN.

I suspect the testcase you append in this patch will fail with 
-march=rv64gcv_zvl4096b.

I think it's more reasonable bump the max power into 
BYTES_PER_RISCV_VECTOR.coeffs[1]

It should be more reasonable:

+  int max_power = exact_log2 (64);

change it into 

+  int max_power = exact_log2 (BYTES_PER_RISCV_VECTOR.coeffs[0]);



juzhe.zh...@rivai.ai
 
From: Kito Cheng
Date: 2023-10-03 10:27
To: gcc-patches; kito.cheng; palmer; jeffreyalaw; rdapp; juzhe.zhong
CC: Kito Cheng
Subject: [PATCH] RISC-V: Fix the riscv_legitimize_poly_move issue on targets 
where the minimal VLEN exceeds 512.
riscv_legitimize_poly_move was expected to ensure the poly value is at most 32
times smaller than the minimal VLEN (32 being derived from '4096 / 128').
This assumption held when our mode modeling was not so precisely defined.
However, now that we have modeled the mode size according to the correct minimal
VLEN info, the size difference between different RVV modes can be up to 64
times. For instance, comparing RVVMF64BI and RVVMF1BI, the sizes are [1, 1]
versus [64, 64] respectively.
 
gcc/ChangeLog:
 
* config/riscv/riscv.cc (riscv_legitimize_poly_move): Bump
max_power to 64.
 
gcc/testsuite/ChangeLog:
 
* g++.target/riscv/rvv/autovec/bug-01.C: New.
* g++.target/riscv/rvv/rvv.exp: Add autovec folder.
---
gcc/config/riscv/riscv.cc |  7 ++--
.../g++.target/riscv/rvv/autovec/bug-01.C | 33 +++
gcc/testsuite/g++.target/riscv/rvv/rvv.exp|  3 ++
3 files changed, 40 insertions(+), 3 deletions(-)
create mode 100644 gcc/testsuite/g++.target/riscv/rvv/autovec/bug-01.C
 
diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index d5446b63dbf..6468702e3a3 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -2386,9 +2386,10 @@ riscv_legitimize_poly_move (machine_mode mode, rtx dest, 
rtx tmp, rtx src)
 }
   else
 {
-  /* FIXME: We currently DON'T support TARGET_MIN_VLEN > 4096.  */
-  int max_power = exact_log2 (4096 / 128);
-  for (int i = 0; i < max_power; i++)
+  /* The size difference between different RVV modes can be up to 64 times.
+ e.g. RVVMF64BI vs RVVMF1BI on zvl512b, which is [1, 1] vs [64, 64].  */
+  int max_power = exact_log2 (64);
+  for (int i = 0; i <= max_power; i++)
{
  int possible_div_factor = 1 << i;
  if (factor % (vlenb / possible_div_factor) == 0)
diff --git a/gcc/testsuite/g++.target/riscv/rvv/autovec/bug-01.C 
b/gcc/testsuite/g++.target/riscv/rvv/autovec/bug-01.C
new file mode 100644
index 000..fd10009ddbe
--- /dev/null
+++ b/gcc/testsuite/g++.target/riscv/rvv/autovec/bug-01.C
@@ -0,0 +1,33 @@
+/* { dg-options "-march=rv64gcv_zvl512b -mabi=lp64d -O3" } */
+
+class c {
+public:
+  int e();
+  void j();
+};
+float *d;
+class k {
+  int f;
+
+public:
+  k(int m) : f(m) {}
+  float g;
+  float h;
+  void n(int m) {
+for (int i; i < m; i++) {
+  d[0] = d[1] = d[2] = g;
+  d[3] = h;
+  d += f;
+}
+  }
+};
+c l;
+void o() {
+  int b = l.e();
+  k a(b);
+  for (;;)
+if (b == 4) {
+  l.j();
+  a.n(2);
+}
+}
diff --git a/gcc/testsuite/g++.target/riscv/rvv/rvv.exp 
b/gcc/testsuite/g++.target/riscv/rvv/rvv.exp
index 249530580d7..c30d6e93144 100644
--- a/gcc/testsuite/g++.target/riscv/rvv/rvv.exp
+++ b/gcc/testsuite/g++.target/riscv/rvv/rvv.exp
@@ -40,5 +40,8 @@ set CFLAGS "-march=$gcc_march -O3"
dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/base/*.C]] \
"" $CFLAGS
+dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/autovec/*.\[C\]]] \
+"" $CFLAGS
+
# All done.
dg-finish
-- 
2.40.1
 
 


[PATCH v5] Add condition coverage profiling

2023-10-04 Thread Jørgen Kvalsvik
This is an extended set of patches with both review feedback from the v4 set,
as well as a few serious bug fixes and improvements, plus test cases.

There has only been a small-ish adjustment to the algorithm itself, in
expression isolation step. Because of gotos this might need to run multiple
times if it is detected that the expression subgraph has more than two
neighbors (in which case it *must* be more than a single Boolean expression).
This fixes some problems that mostly happen under optimization, but it also
helps correctness by having extra checks that what we found satisifies some
fundamental properties. Most other bugs were implementation errors.

I have broken the work up into smaller pieces that fix a single issue
each, including addressing review feedback. I hope this will make the patch set
easier to review and check how comments are addressed, but it also means that
most of the patch set does not have a the proper commit format etc. I plan to
squash most, if not all, of the commits before merge where this will be
addressed.

In September I proposed [1] an idea on how to fix the if (a) if (b) {} counted
as if (a && b) problem. If such a fix should be implemented no real changes
will have to happen to the tree profiling, the only change necessary to the
tree profiling is updating the test to reflect the counting.

To test the compiler I have built all dependencies (libraries, programs) to
build git, systemd, and openjdk, which includes llvm, perl, python, openssl,
icu, valgrind, and elfutils. This list is non-exhaustive. All these programs
now *build* with coverage enabled without failing (hitting an assert), but I
have not verified counting for these programs.

[1] https://gcc.gnu.org/pipermail/gcc-patches/2023-September/631254.html

---

 gcc/builtins.cc|2 +-
 gcc/collect2.cc|7 +-
 gcc/common.opt |9 +
 gcc/doc/gcov.texi  |   38 +
 gcc/doc/invoke.texi|   21 +
 gcc/gcc.cc |4 +-
 gcc/gcov-counter.def   |3 +
 gcc/gcov-dump.cc   |   24 +
 gcc/gcov-io.h  |3 +
 gcc/gcov.cc|  209 -
 gcc/ipa-inline.cc  |2 +-
 gcc/ipa-split.cc   |3 +-
 gcc/passes.cc  |3 +-
 gcc/profile.cc |   92 +-
 gcc/testsuite/g++.dg/gcov/gcov-18.C|  246 ++
 gcc/testsuite/gcc.misc-tests/gcov-19.c | 1471 
 gcc/testsuite/gcc.misc-tests/gcov-20.c |   22 +
 gcc/testsuite/gcc.misc-tests/gcov-21.c |   16 +
 gcc/testsuite/gcc.misc-tests/gcov-22.c |   71 ++
 gcc/testsuite/gcc.misc-tests/gcov-23.c |  149 
 gcc/testsuite/lib/gcov.exp |  191 -
 gcc/tree-profile.cc| 1123 +++-
 libgcc/libgcov-merge.c |5 +
 23 files changed, 3688 insertions(+), 26 deletions(-)




[PATCH 02/22] Add "Condition coverage profiling" term to --help

2023-10-04 Thread Jørgen Kvalsvik
From: Jørgen Kvalsvik 

---
 gcc/common.opt | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/gcc/common.opt b/gcc/common.opt
index 94b1b358585..cd769ad95e0 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -864,7 +864,8 @@ Warn in case a function ends earlier than it begins due to 
an invalid linenum ma
 
 Wcoverage-too-many-conditions
 Common Var(warn_too_many_conditions) Init(1) Warning
-Warn when a conditional has too many terms and coverage gives up.
+Warn when a conditional has too many terms and condition coverage profiling
+gives up instrumenting the expression.
 
 Wmissing-profile
 Common Var(warn_missing_profile) Init(1) Warning
-- 
2.30.2



[PATCH 03/22] Mention relevant flags in condition coverage docs

2023-10-04 Thread Jørgen Kvalsvik
From: Jørgen Kvalsvik 

---
 gcc/doc/gcov.texi   |  3 ++-
 gcc/doc/invoke.texi | 14 --
 2 files changed, 10 insertions(+), 7 deletions(-)

diff --git a/gcc/doc/gcov.texi b/gcc/doc/gcov.texi
index 10cfdcf24aa..f6db593a62a 100644
--- a/gcc/doc/gcov.texi
+++ b/gcc/doc/gcov.texi
@@ -175,7 +175,8 @@ the percentage of branches taken.
 Write condition coverage to the output file, and write condition summary info
 to the standard output.  This option allows you to see if the conditions in
 your program at least once had an independent effect on the outcome of the
-boolean expression (modified condition/decision coverage).
+boolean expression (modified condition/decision coverage).  This requires you
+to compile the source with @option{-fprofile-conditions}.
 
 @item -d
 @itemx --display-progress
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 21419b9a442..fbe6fa5c825 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -6478,9 +6478,10 @@ Completely disabling the warning is not recommended.
 @opindex Wno-coverage-too-many-conditions
 @opindex Wcoverage-too-many-conditions
 @item -Wno-coverage-too-many-conditions
-Warn in case a condition have too many terms and GCC gives up coverage.
-Coverage is given up when there are more terms in the conditional than there
-are bits in a @code{gcov_type_unsigned}.  This warning is enabled by default.
+Warn if @option{-fprofile-conditions} is used and an expression have too many
+terms and GCC gives up coverage.  Coverage is given up when there are more
+terms in the conditional than there are bits in a @code{gcov_type_unsigned}.
+This warning is enabled by default.
 
 @opindex Wno-coverage-invalid-line-number
 @opindex Wcoverage-invalid-line-number
@@ -16638,9 +16639,10 @@ E.g. @code{gcc a.c b.c -o binary} would generate 
@file{binary-a.gcda} and
 @item -fprofile-conditions
 @opindex fprofile-conditions
 Add code so that program conditions are instrumented.  During execution the
-program records what terms in a conditional contributes to a decision.  The
-data may be used to verify that all terms in a booleans are tested and have an
-effect on the outcome of a condition.
+program records what terms in a conditional contributes to a decision, which
+can be used to verify that all terms in a booleans are tested and have an
+independent effect on the outcome of a decision.  The result can be read with
+@code{gcov --conditions}.
 
 @xref{Cross-profiling}.
 
-- 
2.30.2



Re: [PATCH7/8] vect: Add TARGET_SIMD_CLONE_ADJUST_RET_OR_PARAM

2023-10-04 Thread Andre Vieira (lists)




On 04/10/2023 11:41, Richard Biener wrote:

On Wed, 4 Oct 2023, Andre Vieira (lists) wrote:




On 30/08/2023 14:04, Richard Biener wrote:

On Wed, 30 Aug 2023, Andre Vieira (lists) wrote:


This patch adds a new target hook to enable us to adapt the types of return
and parameters of simd clones.  We use this in two ways, the first one is
to
make sure we can create valid SVE types, including the SVE type attribute,
when creating a SVE simd clone, even when the target options do not support
SVE.  We are following the same behaviour seen with x86 that creates simd
clones according to the ABI rules when no simdlen is provided, even if that
simdlen is not supported by the current target options.  Note that this
doesn't mean the simd clone will be used in auto-vectorization.


You are not documenting the bool parameter of the new hook.

What's wrong with doing the adjustment in TARGET_SIMD_CLONE_ADJUST?


simd_clone_adjust_argument_types is called after that hook, so by the time we
call TARGET_SIMD_CLONE_ADJUST the types are still in scalar, not vector.  The
same is true for the return type one.

Also the changes to the types need to be taken into consideration in
'adjustments' I think.


Nothing in the three existing implementations of TARGET_SIMD_CLONE_ADJUST
relies on this ordering I think, how about moving the hook invocation
after simd_clone_adjust_argument_types?



But that wouldn't change the 'ipa_param_body_adjustments' for when we 
have a function definition and we need to redo the body.

Richard.


PS: I hope the subject line survived, my email client is having a bit of a
wobble this morning... it's what you get for updating software :(


[PATCH 05/22] Describe condition_info

2023-10-04 Thread Jørgen Kvalsvik
From: Jørgen Kvalsvik 

---
 gcc/gcov.cc | 5 +
 1 file changed, 5 insertions(+)

diff --git a/gcc/gcov.cc b/gcc/gcov.cc
index 21a6da1a7fa..274f2fc5d9f 100644
--- a/gcc/gcov.cc
+++ b/gcc/gcov.cc
@@ -135,6 +135,8 @@ public:
   vector lines;
 };
 
+/* Describes a single conditional expression and the (recorded) conditions
+ * shown to independently affect the outcome.  */
 class condition_info
 {
 public:
@@ -142,9 +144,12 @@ public:
 
   int popcount () const;
 
+  /* Bitsets storing the independently significant outcomes for true and false,
+   * respectively.  */
   gcov_type_unsigned truev;
   gcov_type_unsigned falsev;
 
+  /* Number of terms in the expression; if (x) -> 1, if (x && y) -> 2 etc.  */
   unsigned n_terms;
 };
 
-- 
2.30.2



[PATCH 04/22] Describe, remove ATTRIBUTE_UNUSED from tag_conditions

2023-10-04 Thread Jørgen Kvalsvik
From: Jørgen Kvalsvik 

---
 gcc/gcov-dump.cc | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/gcc/gcov-dump.cc b/gcc/gcov-dump.cc
index 65fa2d7b44b..d85fa98ddd5 100644
--- a/gcc/gcov-dump.cc
+++ b/gcc/gcov-dump.cc
@@ -394,14 +394,14 @@ tag_arcs (const char *filename ATTRIBUTE_UNUSED,
 }
 }
 
+/* Print number of conditions (not outcomes, i.e. if (x && y) is 2, not 4) */
 static void
-tag_conditions (const char *filename ATTRIBUTE_UNUSED,
-   unsigned tag ATTRIBUTE_UNUSED, int length ATTRIBUTE_UNUSED,
+tag_conditions (const char *filename, unsigned /* tag */, int length,
unsigned depth)
 {
   unsigned n_conditions = GCOV_TAG_CONDS_NUM (length);
 
-  printf (" %u conditionals", n_conditions);
+  printf (" %u conditions", n_conditions);
   if (flag_dump_contents)
 {
   for (unsigned ix = 0; ix != n_conditions; ix++)
-- 
2.30.2



[PATCH 01/22] Add condition coverage profiling

2023-10-04 Thread Jørgen Kvalsvik
From: Jørgen Kvalsvik 

This patch adds support in gcc+gcov for modified condition/decision
coverage (MC/DC) with the -fprofile-conditions flag. MC/DC is a type of
test/code coverage and it is particularly important in the avation and
automotive industries for safety-critical applications. MC/DC it is
required for or recommended by:

* DO-178C for the most critical software (Level A) in avionics
* IEC 61508 for SIL 4
* ISO 26262-6 for ASIL D

>From the SQLite webpage:

Two methods of measuring test coverage were described above:
"statement" and "branch" coverage. There are many other test
coverage metrics besides these two. Another popular metric is
"Modified Condition/Decision Coverage" or MC/DC. Wikipedia defines
MC/DC as follows:

* Each decision tries every possible outcome.
* Each condition in a decision takes on every possible outcome.
* Each entry and exit point is invoked.
* Each condition in a decision is shown to independently affect
  the outcome of the decision.

In the C programming language where && and || are "short-circuit"
operators, MC/DC and branch coverage are very nearly the same thing.
The primary difference is in boolean vector tests. One can test for
any of several bits in bit-vector and still obtain 100% branch test
coverage even though the second element of MC/DC - the requirement
that each condition in a decision take on every possible outcome -
might not be satisfied.

https://sqlite.org/testing.html#mcdc

Wahlen, Heimdahl, and De Silva "Efficient Test Coverage Measurement for
MC/DC" describes an algorithm for adding instrumentation by carrying
over information from the AST, but my algorithm analyses the the control
flow graph to instrument for coverage. This has the benefit of being
programming language independent and faithful to compiler decisions
and transformations. I have only tested it on primarily in C and C++,
see testsuite/gcc.misc-tests and testsuite/g++.dg, and run some manual
tests using D, Rust, and go. D and rust mostly behave as you would
expect, although sometimes conditions are really expanded and therefore
instrumented in another module than the one with the source, which also
applies to the branch coverage. It does not work as expected for go as
the go front end evaluates multi-conditional expressions by folding
results into temporaries.

Like Wahlen et al this implementation records coverage in fixed-size
bitsets which gcov knows how to interpret. This is very fast, but
introduces a limit on the number of terms in a single boolean
expression, the number of bits in a gcov_unsigned_type (which is
typedef'd to uint64_t), so for most practical purposes this would be
acceptable. This limitation is in the implementation and not the
algorithm, so support for more conditions can be added by also
introducing arbitrary-sized bitsets.

For space overhead, the instrumentation needs two accumulators
(gcov_unsigned_type) per condition in the program which will be written
to the gcov file. In addition, every function gets a pair of local
accumulators, but these accmulators are reused between conditions in the
same function.

For time overhead, there is a zeroing of the local accumulators for
every condition and one or two bitwise operation on every edge taken in
the an expression.

In action it looks pretty similar to the branch coverage. The -g short
opt carries no significance, but was chosen because it was an available
option with the upper-case free too.

gcov --conditions:

3:   17:void fn (int a, int b, int c, int d) {
3:   18:if ((a && (b || c)) && d)
conditions covered 3/8
condition  0 not covered (true)
condition  0 not covered (false)
condition  1 not covered (true)
condition  2 not covered (true)
condition  3 not covered (true)
1:   19:x = 1;
-:   20:else
2:   21:x = 2;
3:   22:}

gcov --conditions --json-format:

"conditions": [
{
"not_covered_false": [
0
],
"count": 8,
"covered": 3,
"not_covered_true": [
0,
1,
2,
3
]
}
],

Some expressions, mostly those without else-blocks, are effectively
"rewritten" in the CFG construction making the algorithm unable to
distinguish them:

and.c:

if (a && b && c)
x = 1;

ifs.c:

if (a)
if (b)
if (c)
x = 1;

gcc will build the same graph for both these programs, and gcov will
report boths as 3-term expressions. It is vital that it is not
interpreted the other way around (which is consistent with the shape of
the graph) because otherwise the masking would be wrong for the and.c
program which is a more severe error. While surprising, users would
probably expect some minor rewriting of semantically-identical
expressions.

and.c.gcov

[PATCH 06/22] Use popcount_hwi rather than builtin

2023-10-04 Thread Jørgen Kvalsvik
From: Jørgen Kvalsvik 

---
 gcc/gcov.cc | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/gcc/gcov.cc b/gcc/gcov.cc
index 274f2fc5d9f..35be97cf5ac 100644
--- a/gcc/gcov.cc
+++ b/gcc/gcov.cc
@@ -46,6 +46,7 @@ along with Gcov; see the file COPYING3.  If not see
 #include "color-macros.h"
 #include "pretty-print.h"
 #include "json.h"
+#include "hwint.h"
 
 #include 
 #include 
@@ -159,7 +160,7 @@ condition_info::condition_info (): truev (0), falsev (0), 
n_terms (0)
 
 int condition_info::popcount () const
 {
-return __builtin_popcountll (truev) + __builtin_popcountll (falsev);
+return popcount_hwi (truev) + popcount_hwi (falsev);
 }
 
 /* Describes a basic block. Contains lists of arcs to successor and
-- 
2.30.2



[PATCH 10/22] Prune search for boolean expr on goto, return

2023-10-04 Thread Jørgen Kvalsvik
The search for the initial candidate set for a decision is a delicate
affair in complex CFGs with gotos (mostly from returns). Gotos and
returns in the then-block could create graphs where the goto/return edge
would also be searched for candidates for G, even though they can never
be a part of the expression being searched for.

Nodes may have complex predecessor edges which should be silently
ignored. Complex edges signify some odd or abnormal control flow, and
the flow of evaluation of terms in a boolean expression certainly does
not fall into this category.
---
 gcc/testsuite/gcc.misc-tests/gcov-19.c | 53 --
 gcc/tree-profile.cc| 33 +++-
 2 files changed, 82 insertions(+), 4 deletions(-)

diff --git a/gcc/testsuite/gcc.misc-tests/gcov-19.c 
b/gcc/testsuite/gcc.misc-tests/gcov-19.c
index 73e268ab15e..662f4ca2864 100644
--- a/gcc/testsuite/gcc.misc-tests/gcov-19.c
+++ b/gcc/testsuite/gcc.misc-tests/gcov-19.c
@@ -312,6 +312,30 @@ begin:
goto then3;
 }
 
+void
+noop () {}
+
+int
+mcdc007d (int a, int b, int c, int d, int e)
+{
+noop ();
+if (a)  /* conditions(1/2) true(0) */
+   /* conditions(end) */
+{
+   if (b || c) /* conditions(0/4) true(0 1) false(0 1) */
+   /* conditions(end) */
+   x = 2;
+   if (d)  /* conditions(0/2) true(0) false(0) */
+   /* conditions(end) */
+   return 1;
+}
+if (e)  /* conditions(1/2) false(0) */
+   /* conditions(end) */
+   return 0;
+
+return 2;
+}
+
 /* while loop */
 void
 mcdc008a (int a)
@@ -557,9 +581,6 @@ mcdc017a (int a)
 } while (a > 0); /* conditions(2/2) */
 }
 
-void
-noop () {}
-
 void
 mcdc017b (int a, int b)
 {
@@ -845,6 +866,29 @@ mcdc022d (int a)
x = a + 1;
 }
 
+/* Adapted from openssl-3.0.1/crypto/cmp/cmp_msg.c ossl_cmp_error_new ().
+   Without proper limiting of the initial candidate search this misidentified
+   { ...; if (fn ()) goto err; } if (c) goto err; as a 2-term expression.  */
+void
+mcdc022e (int a, int b, int c, int d)
+{
+if (a || b) /* conditions(1/4) true(0) false(0 1) */
+   /* conditions(end) */
+{
+   if (always (c)) /* conditions(1/2) false(0) */
+   /* conditions(end) */
+   goto err;
+}
+
+if (d)  /* conditions(0/2) true(0) false(0) */
+   /* conditions(end) */
+   goto err;
+return;
+
+err:
+noop ();
+}
+
 /* 023 specifically tests that masking works correctly, which gets complicated
fast with a mix of operators and deep subexpressions.  These tests violates
the style guide slightly to emphasize the nesting.  They all share the same
@@ -1128,6 +1172,8 @@ int main ()
 mcdc007c (0, 1, 1);
 mcdc007c (1, 0, 1);
 
+mcdc007d (0, 1, 0, 1, 1);
+
 mcdc008a (0);
 
 mcdc008b (0);
@@ -1232,6 +1278,7 @@ int main ()
 mcdc022c (1);
 
 mcdc022d (1);
+mcdc022e (0, 1, 1, 0);
 
 mcdc023a (0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1);
 mcdc023b (0, 0, 1, 0, 1, 1, 1, 0, 1, 1, 1, 1, 1);
diff --git a/gcc/tree-profile.cc b/gcc/tree-profile.cc
index adab0f59c07..f98bb1f76c8 100644
--- a/gcc/tree-profile.cc
+++ b/gcc/tree-profile.cc
@@ -618,11 +618,40 @@ masking_vectors (conds_ctx& ctx, array_slice 
blocks,
 }
 }
 
+/* Check that all predecessors are conditional and belong to the current
+   expression.  This check is necessary in the presence of gotos, setjmp and
+   other complicated control flow that creates extra edges and creates odd
+   reachable paths from mid-expression terms and paths escaping nested
+   expressions.  If a node has an incoming non-complex edge (after contraction)
+   it can not be a part of a single, multi-term conditional expression.
+
+   If the expr[i] is set then nodes[i] is reachable from the leftmost operand
+   and b is a viable candidate.  Otherwise, this has to be an independent but
+   following expression.
+ */
+bool
+all_preds_conditional_p (basic_block b, const sbitmap expr)
+{
+for (edge e : b->preds)
+{
+   e = contract_edge_up (e);
+   if (!(e->flags & (EDGE_CONDITION | EDGE_COMPLEX)))
+   return false;
+
+   if (!bitmap_bit_p (expr, e->src->index))
+   return false;
+}
+return true;
+}
+
 /* Find the nodes reachable from p by following only (possibly contracted)
condition edges and ignoring DFS back edges.  From a high level this is
partitioning the CFG into subgraphs by removing all non-condition edges and
selecting a single connected subgraph.  This creates a cut C = (G, G') where
-   G is the returned explicitly by this function.
+   G is the returned explicitly by this function and forms the candidate set
+   for an expression.  All nodes in an expression should be connected only by
+   true|false edges, so a node with a non-conditional predecessor must be a
+   part of a different expression and in G', not G.
 
It is assumed that all paths from p go through 

[PATCH 16/22] Rename pathological -> setjmp

2023-10-04 Thread Jørgen Kvalsvik
---
 gcc/testsuite/gcc.misc-tests/gcov-22.c | 16 
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/gcc/testsuite/gcc.misc-tests/gcov-22.c 
b/gcc/testsuite/gcc.misc-tests/gcov-22.c
index 28b7de66022..3737235d40e 100644
--- a/gcc/testsuite/gcc.misc-tests/gcov-22.c
+++ b/gcc/testsuite/gcc.misc-tests/gcov-22.c
@@ -16,24 +16,24 @@ int identity(int x) { return x; }
 
__builtin_setjmp did not trigger this, so we need setjmp from libc.  */
 void
-pathological001 (int a, int b, int c)
+setjmp001 (int a, int b, int c)
 {
 if (a)  /* conditions(1/2) true(0) */
-   /* conditions(end) */
+   /* conditions(end) */
noop ();
 
 if (b)  /* conditions(1/2) false(0) */
-   /* conditions(end) */
+   /* conditions(end) */
noop ();
 
 if (c && setjmp (buf))  /* conditions(1/4) true(0 1) false(1) */
-   /* conditions(end) */
+   /* conditions(end) */
noop ();
 }
 
-///* Adapted from freetype-2.13.0 gxvalid/gxvmod.c classic_kern_validate */
+/* Adapted from freetype-2.13.0 gxvalid/gxvmod.c classic_kern_validate */
 int
-pathological002 (int a)
+setjmp002 (int a)
 {
 int error = identity(a);
 
@@ -63,8 +63,8 @@ Exit:
 int
 main ()
 {
-pathological001 (0, 1, 0);
-pathological002 (0);
+setjmp001 (0, 1, 0);
+setjmp002 (0);
 }
 
 
-- 
2.30.2



[PATCH 15/22] Fix candidate, neighborhood set reduction phase

2023-10-04 Thread Jørgen Kvalsvik
This phase did not at all behave like advertised, which in turn lead to
ICEs for the new tests.
---
 gcc/testsuite/gcc.misc-tests/gcov-19.c | 35 +++---
 gcc/tree-profile.cc| 29 +++--
 2 files changed, 54 insertions(+), 10 deletions(-)

diff --git a/gcc/testsuite/gcc.misc-tests/gcov-19.c 
b/gcc/testsuite/gcc.misc-tests/gcov-19.c
index 1c671f7e186..5daa06cb7f4 100644
--- a/gcc/testsuite/gcc.misc-tests/gcov-19.c
+++ b/gcc/testsuite/gcc.misc-tests/gcov-19.c
@@ -4,6 +4,9 @@
 /* some side effect to stop branches from being pruned */
 int x = 0;
 
+int id  (int x) { return  x; }
+int inv (int x) { return !x; }
+
 /* || works */
 void
 mcdc001a (int a, int b)
@@ -395,6 +398,15 @@ mcdc009a (int a, int b)
x = a--;
 }
 
+/* Multi-term loop condition with empty body, which can give neighborhoods of
+   size 1.  */
+void
+mcdc009b (int a, int b)
+{
+while (a-- > 0 && b) {} /* conditions(2/4) true(0 1) */
+   /* conditions(end) */
+}
+
 /* for loop */
 void
 mcdc010a (int a, int b)
@@ -537,6 +549,21 @@ mcdc015c (int a, int b)
 }
 }
 
+/* Early returns, gotos can create candidate sets where the neighborhood
+   internally shares dominator nodes that are not the first-in-expression which
+   implies the neighborhood belongs to some other boolean expression.  When
+   this happens, the candidate set must be properly tidied up.  */
+void
+mcdc015d (int a, int b, int c)
+{
+if (a) return;  /* conditions(1/2) false(0) */
+   /* conditions(end) */
+if (id (b)) return; /* conditions(0/2) true(0) false(0) */
+   /* conditions(end) */
+if (id (c)) return; /* conditions(0/2) true(0) false(0) */
+   /* conditions(end) */
+}
+
 
 /* check nested loops */
 void
@@ -639,9 +666,6 @@ mcdc017c (int a, int b)
 } while (n-- > 0); /* conditions(2/2) */
 }
 
-int id  (int x) { return  x; }
-int inv (int x) { return !x; }
-
 /* collection of odd cases lifted-and-adapted from real-world code */
 int mcdc018a (int a, int b, int c, int d, int e, int f, int g, int len)
 {
@@ -1332,6 +1356,9 @@ int main ()
 mcdc009a (0, 0);
 mcdc009a (1, 1);
 
+mcdc009b (0, 0);
+mcdc009b (1, 0);
+
 mcdc010a (0, 0);
 mcdc010a (0, 9);
 mcdc010a (2, 1);
@@ -1369,6 +1396,8 @@ int main ()
 mcdc015c (0, 5);
 mcdc015c (6, 1);
 
+mcdc015d (1, 0, 0);
+
 mcdc016a (5, 5);
 
 mcdc016b (5, 5);
diff --git a/gcc/tree-profile.cc b/gcc/tree-profile.cc
index b75736a22a0..98593f53862 100644
--- a/gcc/tree-profile.cc
+++ b/gcc/tree-profile.cc
@@ -759,6 +759,10 @@ isolate_expression (conds_ctx &ctx, basic_block p, 
vec& out)
if (bitmap_count_bits (expr) == 2)
break;
 
+   /* This can happen for loops with no body.  */
+   if (bitmap_count_bits (expr) == 1 && bb_loop_header_p (p))
+   break;
+
/* If the neighborhood does not change between iterations (a fixed
   point) we cannot understand the graph properly, and this would loop
   infinitely.  If this should happen, we should bail out and give up
@@ -768,21 +772,32 @@ isolate_expression (conds_ctx &ctx, basic_block p, 
vec& out)
gcc_assert (false);
bitmap_copy (prev, expr);
 
-   for (unsigned i = 0; i != NG.length () - 1; i++)
+   const unsigned NGlen = NG.length ();
+   for (unsigned i = 0; i != NGlen - 1; i++)
{
-   for (unsigned j = i+1; j != NG.length (); j++)
+   for (unsigned j = i+1; j != NGlen; j++)
{
-   auto b = nearest_common_dominator (CDI_DOMINATORS, NG[i], 
NG[j]);
+   basic_block b = nearest_common_dominator (CDI_DOMINATORS, 
NG[i], NG[j]);
if (ctx.index_map[b->index] > ctx.index_map[p->index])
{
-   bitmap_clear_bit (reachable, NG[i]->index);
-   bitmap_clear_bit (reachable, NG[j]->index);
-   bitmap_set_bit (reachable, b->index);
+   bitmap_clear_bit (expr, NG[i]->index);
+   bitmap_clear_bit (expr, NG[j]->index);
+   bitmap_set_bit (expr, b->index);
+   NG.safe_push (b);
+   /* It could be that the predecessor edge from the ancestor
+  was contracted past, in which case we need to mark it to
+  make sure its ancestors_of do not immediately terminate. 
 */
+   for (edge e : b->preds)
+   if (ctx.index_map[e->src->index] > 
ctx.index_map[p->index])
+   bitmap_set_bit (reachable, e->src->index);
}
}
}
-   bitmap_copy (expr, reachable);
+   for (unsigned i = 0; i != NG.length (); i++)
+   if (!bitmap_bit_p (expr, NG[i]->index))
+   NG.unordered_remove (i--);
 
+   bitmap_copy (expr, reachable);
for (const basic_block neighbor : N

[PATCH 07/22] Describe add_condition_counts

2023-10-04 Thread Jørgen Kvalsvik
From: Jørgen Kvalsvik 

---
 gcc/gcov.cc | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/gcc/gcov.cc b/gcc/gcov.cc
index 35be97cf5ac..95e0a11bc08 100644
--- a/gcc/gcov.cc
+++ b/gcc/gcov.cc
@@ -2580,6 +2580,8 @@ add_branch_counts (coverage_info *coverage, const 
arc_info *arc)
 }
 }
 
+/* Increment totals in COVERAGE according to to block BLOCK.  */
+
 static void
 add_condition_counts (coverage_info *coverage, const block_info *block)
 {
-- 
2.30.2



[PATCH 12/22] Do two-phase filtering in expr isolation

2023-10-04 Thread Jørgen Kvalsvik
In complex control flow graphs where gotos and returns escape the inner
then-blocks and the right else-blocks are omitted, inner conditions
would erroneously be considered a part of the outer. By using the
observation that the neighborhood of a proper boolean expression should
always be the two outcome nodes we can repeat the search and ancestor
filtering to pinpoint the proper expression. This is helped by replacing
every pair of nodes that share some dominator in the candidate set that
is not the leading term, as that dominator must be the leading term in
some other boolean expression than the one we are looking for.
---
 gcc/testsuite/gcc.misc-tests/gcov-19.c | 125 +
 gcc/testsuite/gcc.misc-tests/gcov-22.c |  31 ++
 gcc/tree-profile.cc|  96 ++-
 3 files changed, 251 insertions(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.misc-tests/gcov-19.c 
b/gcc/testsuite/gcc.misc-tests/gcov-19.c
index 5b5c1c275b0..8d6eb610af2 100644
--- a/gcc/testsuite/gcc.misc-tests/gcov-19.c
+++ b/gcc/testsuite/gcc.misc-tests/gcov-19.c
@@ -1036,6 +1036,10 @@ mcdc023g (int a, int b, int c, int d, int e, int f, int 
g, int h, int i, int k,
x = b + c;
 }
 
+/* Gotos, return, labels can make odd graphs.  It is important that conditions
+   are assigned to the right expression, and that there are no miscounts.  In
+   these tests values may be re-used, as checking things like masking an
+   independence is done in other test cases and not so useful here.  */
 void
 mcdc024a (int a, int b)
 {
@@ -1117,6 +1121,123 @@ label4:
 }
 }
 
+int
+mcdc024d (int a, int b, int c)
+{
+/* Graphs can get complicated with the innermost returns and else-less if,
+   so we must make sure these conditions are counted correctly.  */
+if (a)  /* conditions(1/2) true(0) */
+   /* conditions(end) */
+{
+   if (b)  /* conditions(0/2) true(0) false(0) */
+   /* conditions(end) */
+   {
+   if (c)  /* conditions(0/2) true(0) false(0) */
+   /* conditions(end) */
+   return 1;
+   else
+   return 2;
+   }
+
+   if (a)  /* conditions(0/2) true(0) false(0) */
+   /* conditions(end) */
+   return 3;
+}
+
+return 5;
+}
+
+/* Nested else-less ifs with inner returns needs to be counted right, which
+   puts some pressure on the expression isolation.  The fallthrough from inner
+   expressions into the next if cause the cfg to have edges from deeper in
+   subexpression to the next block sequence, which can confuse the expression
+   isolation. */
+int
+mcdc024e (int a, int b, int c)
+{
+if (a)  /* conditions(1/2) true(0) */
+   /* conditions(end) */
+{
+   if (b)  /* conditions(0/2) true(0) false(0) */
+   /* conditions(end) */
+   {
+   if (c)  /* conditions(0/2) true(0) false(0) */
+   /* conditions(end) */
+   {
+   if (a)  /* conditions(0/2) true(0) false(0) */
+   /* conditions(end) */
+   return 1;
+   else
+   return 2;
+   }
+
+   if (a)  /* conditions(0/2) true(0) false(0) */
+   /* conditions(end) */
+   return 3;
+   }
+
+   if (b)  /* conditions(0/2) true(0) false(0) */
+   /* conditions(end) */
+   return 4;
+}
+return 5;
+}
+
+int
+mcdc024f (int a, int b, int c)
+{
+if (b)  /* conditions(1/2) true(0) */
+   /* conditions(end) */
+   return 0;
+
+if (a)  /* conditions(1/2) true(0) */
+   /* conditions(end) */
+{
+   if (b)  /* conditions(0/2) true(0) false(0) */
+   /* conditions(end) */
+   {
+   b += 2;
+   if (b & 0xFF)   /* conditions(0/2) true(0) false(0) */
+   /* conditions(end) */
+   c++;
+
+   return c;
+   }
+   c += 10;
+}
+}
+
+
+int
+mcdc024g (int a, int b, int c)
+{
+if (b)  /* conditions(1/2) true(0) */
+   /* conditions(end) */
+   goto inner;
+
+if (a)  /* conditions(1/2) true(0) */
+   /* conditions(end) */
+   ++a;
+
+
+if (a)  /* conditions(1/2) true(0) */
+   /* conditions(end) */
+{
+   if (b)  /* conditions(0/2) true(0) false(0) */
+   /* conditions(end) */
+   {
+inner:
+   b += 2;
+   if (b & 0xFF)   /* conditions(0/2) true(0) false(0) */
+   /* conditions(end) */
+   c++;
+
+   return c;
+   }
+   c += 10;
+}
+}
+
 int main ()
 {
 mcdc001a (0, 1);
@@ -1313,6 +1434,10 @@ int main ()
 mcdc024a (0, 0);
 mcdc024b (0, 0);
 mcdc024c (0, 0);
+mcdc024d (0, 0, 0);
+mcdc024e (0, 0, 0);
+mcdc024f (0, 0, 0);
+mcdc024g (0, 0, 0);
 }
 
 /* { dg-final { run-gcov conditions { --conditions gcov-19.c } } } */
diff --git a/gc

[PATCH 18/22] Don't contract into random edge in multi-succ node

2023-10-04 Thread Jørgen Kvalsvik
A check was missing for is-single when contracting, so if a
multi-successor node that was not a condition node (e.g. a switch) a
random edge would be picked and contracted into.
---
 gcc/testsuite/gcc.misc-tests/gcov-23.c | 48 ++
 gcc/tree-profile.cc|  4 +++
 2 files changed, 52 insertions(+)

diff --git a/gcc/testsuite/gcc.misc-tests/gcov-23.c 
b/gcc/testsuite/gcc.misc-tests/gcov-23.c
index c4afc5e700d..856e97f5088 100644
--- a/gcc/testsuite/gcc.misc-tests/gcov-23.c
+++ b/gcc/testsuite/gcc.misc-tests/gcov-23.c
@@ -3,6 +3,7 @@
 int id (int);
 int idp (int *);
 int err;
+char c;
 
 /* This becomes problematic only under optimization for the case when the
compiler cannot inline the function but have to generate a call.  It is not
@@ -72,4 +73,51 @@ exit:
 return a;
 }
 
+/* Adapted from icu4c-73.1 common/ucase.cpp ucase_getLocaleCase ().  */
+int
+mcdc003 (const char *locale)
+{
+/* extern, so its effect won't be optimized out.  */
+c = *locale++;
+if (c == 'z') /* conditions(0/2) true(0) false(0) */
+ /* conditions(end) */
+{
+   return 1;
+}
+else if (c >= 'a') /* conditions(0/2) true(0) false(0) */
+ /* conditions(end) */
+{
+   if (id (c)) /* conditions(0/2) true(0) false(0) */
+   /* conditions(end) */
+   c = *locale++;
+}
+else
+{
+   if (c == 'T')
+   {
+   if (id (c)) /* conditions(0/2) true(0) false(0) */
+   /* conditions(end) */
+   c = *locale++;
+   if (id (c)) /* conditions(0/2) true(0) false(0) */
+   /* conditions(end) */
+   c = *locale++;
+   }
+   else if (c == 'L')
+   c = *locale++;
+   else if (c == 'E')
+   c = *locale++;
+   else if (c == 'N')
+   c = *locale++;
+   else if (c == 'H')
+   {
+   c = *locale++;
+   if (id (c)) /* conditions(0/2) true(0) false(0) */
+   /* conditions(end) */
+   c = *locale++;
+   }
+}
+
+return 1;
+}
+
 /* { dg-final { run-gcov conditions { --conditions gcov-23.c } } } */
diff --git a/gcc/tree-profile.cc b/gcc/tree-profile.cc
index 26e1924d444..ce679130ca0 100644
--- a/gcc/tree-profile.cc
+++ b/gcc/tree-profile.cc
@@ -297,6 +297,10 @@ contract_edge (edge e, sbitmap G = nullptr)
return source;
if (block_conditional_p (dest))
return e;
+   /* This happens for switches, and must be checked after the 
is-conditional
+  (which is also not single).  */
+   if (!single (dest->succs))
+   return source;
 
if (G)
bitmap_set_bit (G, dest->index);
-- 
2.30.2



[PATCH 08/22] Describe output_conditions

2023-10-04 Thread Jørgen Kvalsvik
From: Jørgen Kvalsvik 

---
 gcc/gcov.cc | 4 
 1 file changed, 4 insertions(+)

diff --git a/gcc/gcov.cc b/gcc/gcov.cc
index 95e0a11bc08..62eac76a971 100644
--- a/gcc/gcov.cc
+++ b/gcc/gcov.cc
@@ -3045,6 +3045,10 @@ accumulate_line_counts (source_info *src)
   }
 }
 
+/* Output information about the conditions in block BINFO.  The output includes
+ * a summary (n/m outcomes covered) and a list of the missing (uncovered)
+ * outcomes.  */
+
 static void
 output_conditions (FILE *gcov_file, const block_info *binfo)
 {
-- 
2.30.2



[PATCH 19/22] Beautify assert

2023-10-04 Thread Jørgen Kvalsvik
---
 gcc/tree-profile.cc | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/tree-profile.cc b/gcc/tree-profile.cc
index ce679130ca0..f329be84ad2 100644
--- a/gcc/tree-profile.cc
+++ b/gcc/tree-profile.cc
@@ -784,8 +784,8 @@ isolate_expression (conds_ctx &ctx, basic_block p, 
vec& out)
   infinitely.  If this should happen, we should bail out and give up
   instrumentation for the function altogether.  It is possible no such
   CFGs exist, so for now this is an assert.  */
-   if (bitmap_equal_p (prev, expr) || bitmap_count_bits (expr) < 2)
-   gcc_assert (false);
+   gcc_assert (!bitmap_equal_p (prev, expr));
+   gcc_assert (bitmap_count_bits (expr) > 2);
bitmap_copy (prev, expr);
 
const unsigned NGlen = NG.length ();
-- 
2.30.2



[PATCH 13/22] Handle split-outcome with intrusive flag

2023-10-04 Thread Jørgen Kvalsvik
When the coverage and branch profiling preprocessing happens it can
choose to split certain nodes with labels to accuractely track coverage.
These shapes were recongized, but this causes some expressions to be
counted wrong as this graph may be isomorphic with graphs that have not
been split, for example for this snippet:

if (b)
return 0;

if (a)
{
if (b)
{
b += 2;
if (b & 0xFF)
c++;
else
c--;

return c;
}
else
{
a++;
}
c += 10;
}
else
{
b++;
}

This is a nice approach since the need for detecting this pattern only
comes from the coverage instrumentation splitting the edge in the first
place.
---
 gcc/profile.cc |  8 
 gcc/testsuite/gcc.misc-tests/gcov-19.c |  7 +++
 gcc/tree-profile.cc| 24 
 3 files changed, 15 insertions(+), 24 deletions(-)

diff --git a/gcc/profile.cc b/gcc/profile.cc
index 4407bb0683d..00975661b0d 100644
--- a/gcc/profile.cc
+++ b/gcc/profile.cc
@@ -1257,6 +1257,9 @@ branch_prob (bool thunk)
  basic_block new_bb = split_edge (e);
  edge ne = single_succ_edge (new_bb);
  ne->goto_locus = e->goto_locus;
+ /* Mark the edge with IGNORE so condition coverage knows that
+the edge split occurred and this should be contracted.  */
+ ne->flags |= EDGE_IGNORE;
}
  if ((e->flags & (EDGE_ABNORMAL | EDGE_ABNORMAL_CALL))
   && e->dest != EXIT_BLOCK_PTR_FOR_FN (cfun))
@@ -1608,6 +1611,11 @@ branch_prob (bool thunk)
   /* Commit changes done by instrumentation.  */
   gsi_commit_edge_inserts ();
 
+  /* Unset all EDGE_IGNORE set in this pass.  */
+  FOR_EACH_BB_FN (bb, cfun)
+for (edge e : bb->succs)
+  e->flags &= ~EDGE_IGNORE;
+
   coverage_end_function (lineno_checksum, cfg_checksum);
   if (flag_branch_probabilities
   && (profile_status_for_fn (cfun) == PROFILE_READ))
diff --git a/gcc/testsuite/gcc.misc-tests/gcov-19.c 
b/gcc/testsuite/gcc.misc-tests/gcov-19.c
index 8d6eb610af2..1c671f7e186 100644
--- a/gcc/testsuite/gcc.misc-tests/gcov-19.c
+++ b/gcc/testsuite/gcc.misc-tests/gcov-19.c
@@ -299,12 +299,11 @@ then:
 return;
 begin:
 /* Evaluates to if (a || b || c) x = 1 */
-if (a) /* conditions(5/6) true(2) */
-   /* conditions(end) */
+if (a) /* conditions(2/2) */
goto then;
-else if (b)
+else if (b)/* conditions(2/2) */
goto then;
-else if (c)
+else if (c) /* conditions(1/2) true(0) */
goto then;
 }
 
diff --git a/gcc/tree-profile.cc b/gcc/tree-profile.cc
index ab96b872db3..3fc78ff8ace 100644
--- a/gcc/tree-profile.cc
+++ b/gcc/tree-profile.cc
@@ -344,32 +344,16 @@ contract_edge_up (edge e)
 |  \|
 T   F
 
-This function recognizes this shape and returns the "merges" the split
-outcome block by returning their common successor.  In all other cases it 
is
-the identity function.  */
+When this split happens it flags the edge with EDGE_IGNORE.  */
 basic_block
 merge_split_outcome (basic_block b)
 {
 if (!single (b->succs))
return b;
-if (!single (b->preds))
-   return b;
-
-const unsigned flag = single_edge (b->preds)->flags & EDGE_CONDITION;
-if (!flag)
-   return b;
-
 edge e = single_edge (b->succs);
-for (edge pred : e->dest->preds)
-{
-   if (!(pred->flags & EDGE_FALLTHRU))
-   return b;
-   if (!single (pred->src->preds))
-   return b;
-   if (!(single_edge (pred->src->preds)->flags & flag))
-   return b;
-}
-return e->dest;
+if (e->flags & EDGE_IGNORE)
+   return e->dest;
+return b;
 }
 
 
-- 
2.30.2



[PATCH 20/22] Don't try to reduce NG from dominators

2023-10-04 Thread Jørgen Kvalsvik
The presence of gotos already makes this iffy, but it is already not
necessary as reducing by refining the G should cover it.
---
 gcc/tree-profile.cc | 19 ---
 1 file changed, 19 deletions(-)

diff --git a/gcc/tree-profile.cc b/gcc/tree-profile.cc
index f329be84ad2..d1d7265cd1c 100644
--- a/gcc/tree-profile.cc
+++ b/gcc/tree-profile.cc
@@ -788,25 +788,6 @@ isolate_expression (conds_ctx &ctx, basic_block p, 
vec& out)
gcc_assert (bitmap_count_bits (expr) > 2);
bitmap_copy (prev, expr);
 
-   const unsigned NGlen = NG.length ();
-   for (unsigned i = 0; i != NGlen - 1; i++)
-   {
-   for (unsigned j = i+1; j != NGlen; j++)
-   {
-   basic_block b = nearest_common_dominator (CDI_DOMINATORS, 
NG[i], NG[j]);
-   if (ctx.index_map[b->index] > ctx.index_map[p->index])
-   {
-   bitmap_clear_bit (expr, NG[i]->index);
-   bitmap_clear_bit (expr, NG[j]->index);
-   bitmap_set_bit (expr, b->index);
-   NG.safe_push (b);
-   }
-   }
-   }
-   for (unsigned i = 0; i != NG.length (); i++)
-   if (!bitmap_bit_p (expr, NG[i]->index))
-   NG.unordered_remove (i--);
-
bitmap_copy (expr, reachable);
for (const basic_block neighbor : NG)
{
-- 
2.30.2



[PATCH 09/22] Find reachable conditions unbounded by dominators

2023-10-04 Thread Jørgen Kvalsvik
Search for reachable conditions without limiting the search to the nodes
dominated by the left-most term in the expression. Right operands are
conceptually still always dominated by the left-most term, but in
arbitrary programs there may be complex edges that breaks this.

As it turns out, it is not necessary to limit search by the dominator,
and it effectively only served as an optimization.

A new test case has been added with the code that triggered the bug.

gcc/ChangeLog:

* tree-profile.cc (masking_vectors): Tweak comment.
(cond_reachable_from): Remove dominated_by_p check.

gcc/testsuite/ChangeLog:

* gcc.misc-tests/gcov-22.c: New test.
---
 gcc/testsuite/gcc.misc-tests/gcov-22.c | 40 ++
 gcc/tree-profile.cc| 15 +++---
 2 files changed, 44 insertions(+), 11 deletions(-)
 create mode 100644 gcc/testsuite/gcc.misc-tests/gcov-22.c

diff --git a/gcc/testsuite/gcc.misc-tests/gcov-22.c 
b/gcc/testsuite/gcc.misc-tests/gcov-22.c
new file mode 100644
index 000..8aa8867dbf1
--- /dev/null
+++ b/gcc/testsuite/gcc.misc-tests/gcov-22.c
@@ -0,0 +1,40 @@
+/* { dg-options "-fprofile-conditions -ftest-coverage" } */
+/* { dg-do run { target native } } */
+
+#include 
+jmp_buf buf;
+
+void noop() {}
+
+/* This function is a test to verify that the expression isolation does not
+   break on a CFG with the right set of complex edges.  The (_ && setjmp)
+   created complex edges after the function calls and a circular pair of
+   complex edges around the setjmp call.  This triggered a bug when the search
+   for right operands only would consider nodes dominated by the left-most
+   term, as this would only be the case if the complex edges were removed.
+
+   __builtin_setjmp did not trigger this, so we need setjmp from libc.  */
+void
+pathological001 (int a, int b, int c)
+{
+if (a)  /* conditions(1/2) true(0) */
+   /* conditions(end) */
+   noop ();
+
+if (b)  /* conditions(1/2) false(0) */
+   /* conditions(end) */
+   noop ();
+
+if (c && setjmp (buf))  /* conditions(1/4) true(0 1) false(1) */
+   /* conditions(end) */
+   noop ();
+}
+
+int
+main ()
+{
+pathological001 (0, 1, 0);
+}
+
+
+/* { dg-final { run-gcov conditions { --conditions gcov-22.c } } } */
diff --git a/gcc/tree-profile.cc b/gcc/tree-profile.cc
index c8b917afb9a..adab0f59c07 100644
--- a/gcc/tree-profile.cc
+++ b/gcc/tree-profile.cc
@@ -619,10 +619,10 @@ masking_vectors (conds_ctx& ctx, array_slice 
blocks,
 }
 
 /* Find the nodes reachable from p by following only (possibly contracted)
-   condition edges dominated by p and ignore DFS back edges.  From a high level
-   this is partitioning the CFG into subgraphs by removing all non-condition
-   edges and selecting a single connected subgraph.  This creates a cut C = (G,
-   G') where G is the returned explicitly by this function.
+   condition edges and ignoring DFS back edges.  From a high level this is
+   partitioning the CFG into subgraphs by removing all non-condition edges and
+   selecting a single connected subgraph.  This creates a cut C = (G, G') where
+   G is the returned explicitly by this function.
 
It is assumed that all paths from p go through q (q post-dominates p).  p
must always be the first term in an expression and a condition node.
@@ -631,11 +631,6 @@ masking_vectors (conds_ctx& ctx, array_slice 
blocks,
this is a multi-term expression or the first block in the then/else block is
a conditional expression as well.
 
-   Only nodes dominated by p is added - under optimization some blocks may be
-   merged and multiple independent conditions may share the same outcome
-   (making successors misidentified as a right operands), but true right-hand
-   operands are always dominated by the first term.
-
The function outputs both a bitmap and a vector as both are useful to the
caller.  */
 void
@@ -651,8 +646,6 @@ cond_reachable_from (basic_block p, basic_block q, sbitmap 
expr,
basic_block dest = contract_edge (e)->dest;
if (dest == q)
continue;
-   if (!dominated_by_p (CDI_DOMINATORS, dest, p))
-   continue;
if (!block_conditional_p (dest))
continue;
if (bitmap_bit_p (expr, dest->index))
-- 
2.30.2



[PATCH 14/22] Unify expression candidate set refinement logic

2023-10-04 Thread Jørgen Kvalsvik
It always felt odd that the expression candidate set refinement were in
two more-or-less identical phases. It turns out that this is not
necessary and the neighborhood, ancestor filtering, update candidate set
steps can all be done inside a single loop, which terminates when
|NG| == |Outcomes| == 2. This also provides a nice opportunity for some
error sensitivity - if the candidate set cannot be reduced any more, and
the neighborhood |NG| != 2 we error out.

It is unclear if this graph can possibly be constructed, so for now it
uses an assert for sensitivity to it. What this should really do is drop
the instrumentation for the function altogether, and give a warning.
---
 gcc/tree-profile.cc | 139 
 1 file changed, 49 insertions(+), 90 deletions(-)

diff --git a/gcc/tree-profile.cc b/gcc/tree-profile.cc
index 3fc78ff8ace..b75736a22a0 100644
--- a/gcc/tree-profile.cc
+++ b/gcc/tree-profile.cc
@@ -114,9 +114,10 @@ struct conds_ctx
 auto_sbitmap G1;
 auto_sbitmap G2;
 auto_sbitmap G3;
+auto_sbitmap G4;
 
 explicit conds_ctx (unsigned size) noexcept (true) : marks (size),
-G1 (size), G2 (size), G3 (size)
+G1 (size), G2 (size), G3 (size), G4 (size)
 {
bitmap_clear (marks);
 }
@@ -728,13 +729,14 @@ isolate_expression (conds_ctx &ctx, basic_block p, 
vec& out)
 sbitmap expr = ctx.G1;
 sbitmap reachable = ctx.G2;
 sbitmap ancestors = ctx.G3;
+sbitmap prev = ctx.G3;
 bitmap_clear (expr);
 bitmap_clear (reachable);
+bitmap_clear (prev);
 
 vec& G = ctx.B1;
 vec& NG = ctx.B2;
 G.truncate (0);
-NG.truncate (0);
 
 basic_block post = get_immediate_dominator (CDI_POST_DOMINATORS, p);
 cond_reachable_from (p, post, reachable, G);
@@ -744,105 +746,62 @@ isolate_expression (conds_ctx &ctx, basic_block p, 
vec& out)
return;
 }
 
-neighborhood (G, reachable, NG);
-bitmap_copy (expr, reachable);
-
-for (const basic_block neighbor : NG)
-{
-   bitmap_clear (ancestors);
-   for (edge e : neighbor->preds)
-   ancestors_of (e->src, p, reachable, ancestors);
-   bitmap_and (expr, expr, ancestors);
-}
-
-for (const basic_block b : G)
-   if (bitmap_bit_p (expr, b->index))
-   out.safe_push (b);
-out.sort (cmp_index_map, &ctx.index_map);
-
-
-/* In a lot of programs we would be done now, but in complex CFGs with
-   gotos or returns from nested then-blocks we need some post processing.
-
-   Essentially, perform another step of the algorithm - find the
-   neighborhood NG' of the candidate expression B.  If |B| == 2 we have the
-   two outcome nodes and are done.  If not, we know the previous step must
-   have captured something in a nested expression (probably because of a
-   fallthrough from the then-block being merged into the next block.  If
-   two nodes are dominated by some node lower (according to topsort) in the
-   CFG then the dominating node is the first term in some conditional
-   expression, which means it cannot be a part of the expression we're
-   searching for.  We redo the ancestor filtering, but now use a more
-   precise neighborhood that is unaffected.  */
-
-NG.truncate (0);
-neighborhood (out, expr, NG);
-if (NG.length () == 2)
-   return;
-
-/* Names are reused in this section and generally mean the same thing.  */
-bitmap_clear (reachable);
-for (const basic_block b : NG)
-   bitmap_set_bit (reachable, b->index);
-
-/* If a pair of nodes has a shared dominator *that is not the root*
-   then that dominator must be the first term of another boolean
-   expression or some other structure outside the expression.  */
-gcc_assert (!NG.is_empty ());
-for (unsigned i = 0; i != NG.length () - 1; i++)
+while (true)
 {
-   for (unsigned j = i+1; j != NG.length (); j++)
+   NG.truncate (0);
+   neighborhood (G, reachable, NG);
+   gcc_assert (!NG.is_empty ());
+
+   bitmap_clear (expr);
+   for (basic_block b : NG)
+   bitmap_set_bit (expr, b->index);
+
+   if (bitmap_count_bits (expr) == 2)
+   break;
+
+   /* If the neighborhood does not change between iterations (a fixed
+  point) we cannot understand the graph properly, and this would loop
+  infinitely.  If this should happen, we should bail out and give up
+  instrumentation for the function altogether.  It is possible no such
+  CFGs exist, so for now this is an assert.  */
+   if (bitmap_equal_p (prev, expr) || bitmap_count_bits (expr) < 2)
+   gcc_assert (false);
+   bitmap_copy (prev, expr);
+
+   for (unsigned i = 0; i != NG.length () - 1; i++)
{
-   auto b = nearest_common_dominator (CDI_DOMINATORS, NG[i], NG[j]);
-   if (ctx.index_map[b->index] > ctx.index_map[p->index])
+   for (unsigned j = i+1; j != NG.len

[PATCH 11/22] Add test case showing cross-decision fusing

2023-10-04 Thread Jørgen Kvalsvik
Some expressions create CFGs isomorphic to those with multi-term
expressions, e.g. if (a) if (b) {} is equivalent to if (a && b) {}, and
there is no real recovery for this. The test is added to 1. document the
behaviour and 2. detect if the cfg generation changes in a way that
(correctly) splits the ifs.

Note that some arithmetic is performed between the else-if and the
following if. This is not sufficient to create a new basic block, and
all sorts of arithmetic and side effect is possible between terms in an
arbitrary boolean expression anyway, for example if (a++ && --b).
---
 gcc/testsuite/gcc.misc-tests/gcov-19.c | 22 ++
 1 file changed, 22 insertions(+)

diff --git a/gcc/testsuite/gcc.misc-tests/gcov-19.c 
b/gcc/testsuite/gcc.misc-tests/gcov-19.c
index 662f4ca2864..5b5c1c275b0 100644
--- a/gcc/testsuite/gcc.misc-tests/gcov-19.c
+++ b/gcc/testsuite/gcc.misc-tests/gcov-19.c
@@ -146,6 +146,26 @@ mcdc004e (int a, int b, int c)
 }
 }
 
+/* else-if is not immune to the else-less fuse.  This test is also put in as a
+ * detection mechanism - sif this should register as 3 individual decisions
+ * then the test should be updated and fixed to reflect it.  */
+int
+mcdc004f (int a, int b, int c)
+{
+if (a)  /* conditions(1/2) false(0) */
+   /* conditions(end) */
+{
+   x = 1;
+}
+else if (b) /* conditions(0/4) true(0 1) false(0 1) */
+   /* conditions(end) */
+{
+   x = 2;
+   if (c)
+   x = 3;
+}
+}
+
 /* mixing && and || works */
 void
 mcdc005a (int a, int b, int c)
@@ -1137,6 +1157,8 @@ int main ()
 mcdc004e (0, 0, 0);
 mcdc004e (1, 1, 1);
 
+mcdc004f (1, 1, 1);
+
 mcdc005a (1, 0, 1);
 
 mcdc005b (1, 1, 0, 0);
-- 
2.30.2



[PATCH 22/22] Return value on separate line

2023-10-04 Thread Jørgen Kvalsvik
---
 gcc/testsuite/g++.dg/gcov/gcov-18.C | 36 +++--
 1 file changed, 24 insertions(+), 12 deletions(-)

diff --git a/gcc/testsuite/g++.dg/gcov/gcov-18.C 
b/gcc/testsuite/g++.dg/gcov/gcov-18.C
index 310ed5297c0..b58f8450e44 100644
--- a/gcc/testsuite/g++.dg/gcov/gcov-18.C
+++ b/gcc/testsuite/g++.dg/gcov/gcov-18.C
@@ -18,7 +18,8 @@ public:
 int identity (int x) { return x; }
 int throws (int) { throw std::runtime_error("exception"); }
 
-int throw_if (int x)
+int
+throw_if (int x)
 {
 if (x) /* conditions(1/2) true(0) */
   /* conditions(end) */
@@ -30,7 +31,8 @@ int throw_if (int x)
 int x = 0;
 
 /* conditionals work in the presence of non-trivial destructors */
-void mcdc001a (int a)
+void
+mcdc001a (int a)
 {
 nontrivial_destructor v (a);
 
@@ -57,7 +59,8 @@ mcdc002a (int a, int b)
 }
 
 /* conditional in constructor */
-void mcdc003a (int a)
+void
+mcdc003a (int a)
 {
 class C
 {
@@ -77,7 +80,8 @@ void mcdc003a (int a)
 }
 
 /* conditional in destructor */
-void mcdc004a (int a)
+void
+mcdc004a (int a)
 {
 class C
 {
@@ -96,7 +100,8 @@ void mcdc004a (int a)
 }
 
 /* conditional in try */
-void mcdc005a (int a)
+void
+mcdc005a (int a)
 {
 try
 {
@@ -113,7 +118,8 @@ void mcdc005a (int a)
 }
 
 /* conditional in catch */
-void mcdc006a (int a) {
+void
+mcdc006a (int a) {
 try
 {
throws (a);
@@ -128,7 +134,8 @@ void mcdc006a (int a) {
 }
 }
 
-void mcdc006b (int a)
+void
+mcdc006b (int a)
 {
 if (a) /* conditions(1/2) true(0) */
   /* conditions(end) */
@@ -137,7 +144,8 @@ void mcdc006b (int a)
x = 1;
 }
 
-void mcdc006c (int a) try
+void
+mcdc006c (int a) try
 {
 throws (a);
 }
@@ -147,12 +155,14 @@ catch (...) {
 }
 
 /* temporary with destructor as term */
-void mcdc007a (int a, int b)
+void
+mcdc007a (int a, int b)
 {
 x = a && nontrivial_destructor (b); /* conditions(3/4) false(1) 
destructor() */
 }
 
-void mcdc007b (int a, int b)
+void
+mcdc007b (int a, int b)
 {
 if (a || throw_if (b)) /* conditions(3/4) true(1) destructor() */
x = -1;
@@ -160,7 +170,8 @@ void mcdc007b (int a, int b)
x = 1;
 }
 
-void mcdc007c (int a, int b)
+void
+mcdc007c (int a, int b)
 {
 if (throw_if (a) || throw_if (b)) /* conditions(2/4) true(0 1) 
destructor() */
x = -1;
@@ -169,7 +180,8 @@ void mcdc007c (int a, int b)
 }
 
 /* destructor with delete */
-void mcdc008a (int a)
+void
+mcdc008a (int a)
 {
 class C
 {
-- 
2.30.2



[PATCH 17/22] Mark contracted-past nodes in reachable

2023-10-04 Thread Jørgen Kvalsvik
When there was no candidate set reduction phase this was not necessary,
as every neighbor's predecessor would always be in the reachable set.
Now that the graph cut is refined multiple times this may not hold,
which would lead to incorrect termination of the ancestors search when a
node in the candidate set was reduced to neighbor (effectively moving
the cut) as its predecessor may have been contracted by.
---
 gcc/testsuite/gcc.misc-tests/gcov-23.c | 75 ++
 gcc/tree-profile.cc| 41 +-
 2 files changed, 102 insertions(+), 14 deletions(-)
 create mode 100644 gcc/testsuite/gcc.misc-tests/gcov-23.c

diff --git a/gcc/testsuite/gcc.misc-tests/gcov-23.c 
b/gcc/testsuite/gcc.misc-tests/gcov-23.c
new file mode 100644
index 000..c4afc5e700d
--- /dev/null
+++ b/gcc/testsuite/gcc.misc-tests/gcov-23.c
@@ -0,0 +1,75 @@
+/* { dg-options "-fprofile-conditions -ftest-coverage -O2 -c" } */
+
+int id (int);
+int idp (int *);
+int err;
+
+/* This becomes problematic only under optimization for the case when the
+   compiler cannot inline the function but have to generate a call.  It is not
+   really interesting to run, only see if it builds.  Notably, both the
+   function calls and the return values are important to construct a
+   problematic graph.
+
+   This test is also a good example of where optimization makes condition
+   coverage unpredictable, but not unusable.  If this is built without
+   optimization the conditions work as you would expect from reading the
+   source.  */
+/* Adapted from cpio-2.14 gnu/utmens.c lutimens ().  */
+int
+mcdc001 (int *v)
+{
+int adjusted;
+int adjustment_needed = 0;
+
+int *ts = v ? &adjusted : 0; /* conditions(0/4) true(0 1) false(0 1) */
+/* conditions(end) */
+if (ts)
+   adjustment_needed = idp (ts);
+if (adjustment_needed < 0)
+   return -1;
+
+if (adjustment_needed)  /* conditions(0/2) true(0) false(0) */
+   /* conditions(end) */
+{
+   if (adjustment_needed != 3) /* conditions(0/2) true(0) false(0) */
+   /* conditions(end) */
+   return -1;
+   if (ts) /* conditions(0/2) true(0) false(0) */
+   /* conditions(end) */
+   return 0;
+}
+
+if (adjustment_needed && idp (&adjusted)) /* conditions(0/2) true(0) 
false(0) */
+ /* conditions(end) */
+   return -1;
+if (adjusted)   /* conditions(0/2) true(0) false(0) */
+   /* conditions(end) */
+   return idp (ts);
+
+return -1;
+}
+
+/* This failed when the candidate set internal/contracted-past nodes were not
+   properly marked as reachable in the candidate reduction phase.  */
+/* Adapted from cpio-2.14 gnu/mktime.c mktime_internal ().  */
+int
+mcdc002 ()
+{
+int a;
+if (idp (&a)) /* conditions(0/2) true(0) false(0) */
+ /* conditions(end) */
+{
+   if (id (a)) /* conditions(0/2) true(0/2) true(0) false(0) */
+   /* conditions(end) */
+   goto exit;
+
+   if (err) /* conditions(0/2) true(0/2) true(0) false(0) */
+/* conditions(end) */
+   return -1;
+}
+
+exit:
+return a;
+}
+
+/* { dg-final { run-gcov conditions { --conditions gcov-23.c } } } */
diff --git a/gcc/tree-profile.cc b/gcc/tree-profile.cc
index 98593f53862..26e1924d444 100644
--- a/gcc/tree-profile.cc
+++ b/gcc/tree-profile.cc
@@ -279,23 +279,27 @@ single_edge (const vec *edges)
merged.
 
Only chains of single-exit single-entry nodes that end with a condition
-   should be contracted.  */
+   should be contracted.  If the optional bitset G is passed, the intermediate
+   "contracted-past" nodes will be recorded, which is only meaningful if the
+   non-source edge is returned.  */
 edge
-contract_edge (edge e)
+contract_edge (edge e, sbitmap G = nullptr)
 {
 edge source = e;
 while (true)
 {
basic_block dest = e->dest;
-   if (!single (dest->preds))
-   return source;
if (e->flags & EDGE_DFS_BACK)
return source;
-   if (block_conditional_p (dest))
-   return e;
if (dest->index == EXIT_BLOCK)
return source;
+   if (!single (dest->preds))
+   return source;
+   if (block_conditional_p (dest))
+   return e;
 
+   if (G)
+   bitmap_set_bit (G, dest->index);
e = single_edge (dest->succs);
if (!e)
return source;
@@ -673,6 +677,8 @@ cond_reachable_from (basic_block p, basic_block q, sbitmap 
expr,
if (!all_preds_conditional_p (dest, expr))
continue;
 
+   if (dest != e->dest)
+   contract_edge (e, expr);
bitmap_set_bit (expr, dest->index);
out.safe_push (dest);
}
@@ -692,8 +698,14 @@ neighborhood (const vec& blocks, sbitmap G, 
vec& out)

[PATCH 21/22] Walk the cfg in topological order, not depth-first

2023-10-04 Thread Jørgen Kvalsvik
Depth first order is insufficient to process expressions in the right
order when there are setjmps in optimized builds. This would create
complex paths from the root past conditions and into the middle of
boolean expressions. Traversing the graph in topological order restores
the expectation that expressions will be processed top-down.
---
 gcc/testsuite/gcc.misc-tests/gcov-23.c | 26 ++
 gcc/tree-profile.cc|  1 +
 2 files changed, 27 insertions(+)

diff --git a/gcc/testsuite/gcc.misc-tests/gcov-23.c 
b/gcc/testsuite/gcc.misc-tests/gcov-23.c
index 856e97f5088..e5b56a5aa44 100644
--- a/gcc/testsuite/gcc.misc-tests/gcov-23.c
+++ b/gcc/testsuite/gcc.misc-tests/gcov-23.c
@@ -1,5 +1,8 @@
 /* { dg-options "-fprofile-conditions -ftest-coverage -O2 -c" } */
 
+#include 
+jmp_buf buf;
+
 int id (int);
 int idp (int *);
 int err;
@@ -120,4 +123,27 @@ mcdc003 (const char *locale)
 return 1;
 }
 
+/* Adapted from jxl 0.8.2 lib/extras/dec/apng.cc processing_start ().
+   This created a graph where depth-first traversal of the CFG would not
+   process nodes in the wrong order (the extra control inserted from setjmp
+   created a path of complexes from root to !b without going through !a).
+
+This only happened under optimization.  */
+int
+mcdc004 (int a, int b)
+{
+a = id (a);
+b = id (b);
+
+if (a || b) /* conditions(0/4) true(0 1) false(0 1) */
+   /* conditions(end) */
+   return 1;
+
+if (setjmp (buf)) /* conditions(0/2) true(0) false(0) */
+ /* conditions(end) */
+   return 1;
+
+return 0;
+}
+
 /* { dg-final { run-gcov conditions { --conditions gcov-23.c } } } */
diff --git a/gcc/tree-profile.cc b/gcc/tree-profile.cc
index d1d7265cd1c..8bf280dc018 100644
--- a/gcc/tree-profile.cc
+++ b/gcc/tree-profile.cc
@@ -1045,6 +1045,7 @@ find_conditions (struct function *fn)
 int n = dfs_enumerate_from (entry, 0, yes, dfs.address (), nblocks, exit);
 dfs.truncate (n);
 make_index_map (dfs, nblocks, ctx.B1, ctx.index_map);
+dfs.sort (cmp_index_map, &ctx.index_map);
 
 /* Visit all reachable nodes and collect conditions.  DFS order is
important so the first node of a boolean expression is visited first
-- 
2.30.2



[PATCH] wwwdocs: Add ADL to C++ non-bugs

2023-10-04 Thread Jonathan Wakely
We have a long history of INVALID bugs about std functions being
available in the global namespace (PRs 27846, 67566, 82619, 99865,
110602, 111553, probably others). Let's document it.

Also de-prioritize the C++98-only bugs, which are unlikely to affect
anybody nowadays.

OK for wwwdocs?

-- >8 --

Add ADL to C++ non-bugs

Also move the item about C++98 'export' to the end, and update the item
about <: digraphs that only applies to C++98.

---
 htdocs/bugs/index.html | 36 
 1 file changed, 24 insertions(+), 12 deletions(-)

diff --git a/htdocs/bugs/index.html b/htdocs/bugs/index.html
index 813b78c0..41edc561 100644
--- a/htdocs/bugs/index.html
+++ b/htdocs/bugs/index.html
@@ -539,15 +539,15 @@ for details.
 C++
 
 
-export
-Most C++ compilers (G++ included) never implemented C++98
-export, which was removed in C++11, and the keyword reused in
-C++20 by the Modules feature. The C++98 feature was intended to support
-separate compilation of template declarations and
-definitions. Without export, a template definition must be in
-scope to be used. The obvious workaround is simply to place all definitions in
-the header itself. Alternatively, the compilation unit containing template
-definitions may be included from the header.
+Functions can be called without qualifying them with their namespace.
+
+Argument Dependent Lookup (ADL) means that functions can be found in namespaces
+associated with their arguments. This means that move(arg) can
+call std::move if arg is a type defined in namespace
+std, such as std::string or std::vector.
+If std::move is not the function you intended to call, use a
+qualified name such as ::move(arg) or foo::move(arg).
+
 
 Nested classes can access private members and types of the containing
 class.
@@ -597,9 +597,9 @@ handler and catch it in the main thread.
 If you have a class in the global namespace, say named X,
 and want to give it as a template argument to some other class, say
 std::vector, then std::vector<::X>
-fails with a parser error.
+fails with a parser error in C++98/C++03 mode.
 
-The reason is that the standard mandates that the sequence
+The reason is that the C++98 standard mandates that the sequence
 <: is treated as if it were the token [.
 (There are several such combinations of characters - they are called
 digraphs.) Depending on the version, the compiler then reports
@@ -608,7 +608,19 @@ a parse error before the character : (the 
colon before
 
 The simplest way to avoid this is to write std::vector<
 ::X>, i.e. place a space between the opening angle bracket
-and the scope operator.
+and the scope operator, or compile using C++11 or later. Defect report 1104
+changed the parser rules so that <:: works as expected.
+
+
+export
+Most C++ compilers (G++ included) never implemented C++98
+export, which was removed in C++11, and the keyword reused in
+C++20 by the Modules feature. The C++98 feature was intended to support
+separate compilation of template declarations and
+definitions. Without export, a template definition must be in
+scope to be used. The obvious workaround is simply to place all definitions in
+the header itself. Alternatively, the compilation unit containing template
+definitions may be included from the header.
 
 
 Common problems when upgrading the compiler
-- 
2.41.0



Re: [PATCH] ifcvt/vect: Emit COND_ADD for conditional scalar reduction.

2023-10-04 Thread Robin Dapp
> +  gcc_assert (code == IFN_COND_ADD || code == IFN_COND_SUB);

I forgot to add the other IFN_CONDs here before sending.  So with

-  gcc_assert (code == IFN_COND_ADD || code == IFN_COND_SUB);
+  gcc_assert (code == IFN_COND_ADD || code == IFN_COND_SUB
+   || code == IFN_COND_MUL || code == IFN_COND_AND
+   || code == IFN_COND_IOR || code == IFN_COND_XOR);

on top, bootstrap and testsuites on x86, aarch64 and power10 are
unchanged.

Regards
 Robin


Re: [PATCH] RISC-V: Fix the riscv_legitimize_poly_move issue on targets where the minimal VLEN exceeds 512.

2023-10-04 Thread Kito Cheng
钟居哲 於 2023年10月4日 週三,20:20寫道:
>
> I think the "max poly value" is the LMUL 1 mode coeffs[1]
>
> See int vlenb = BYTES_PER_RISCV_VECTOR.coeffs[1];
>
> So I think bump max_power to exact_log2 (64); is not enough.
> since we adjust the LMUL 1 mode size according to TARGET_MIN_VLEN.
>
> I suspect the testcase you append in this patch will fail with 
> -march=rv64gcv_zvl4096b.


There is no type smaller than  [64, 64] in zvl4096b, RVVMF64BI is [64,
64], it’s smallest type, and RVVFM1BI is [512, 512] (size of single
vector reg.) which at most 64x for zvl4096b, so my understanding is
log2(64) is enough :)

and of cause, verified the testcase is work with -march=rv64gcv_zvl4096b


Re: [PATCH] options: Prevent multidimensional arrays

2023-10-04 Thread Kito Cheng
committed to trunk, got approval from Jeff in another mail thread:
https://gcc.gnu.org/pipermail/gcc-patches/2023-October/631873.html :)

On Mon, Oct 2, 2023 at 4:03 PM Kito Cheng  wrote:
>
> Multidimensional arrary is gawk extension, and we accidentally
> introduced that in recent commit[1].
>
> [1] 
> https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=e4a4b8e983bac865eb435b11798e38d633b98942
>
> gcc/ChangeLog:
>
> * opt-read.awk: Drop multidimensional arrays.
> * opth-gen.awk: Ditto.
> ---
>  gcc/opt-read.awk | 4 ++--
>  gcc/opth-gen.awk | 8 
>  2 files changed, 6 insertions(+), 6 deletions(-)
>
> diff --git a/gcc/opt-read.awk b/gcc/opt-read.awk
> index fcf92853957..f74d8478f72 100644
> --- a/gcc/opt-read.awk
> +++ b/gcc/opt-read.awk
> @@ -123,7 +123,7 @@ BEGIN {
> }
> else {
> target_var = opt_args("Var", $0)
> -if (target_var)
> +   if (target_var)
> {
> target_var = opt_args("Var", $1)
> var_index = find_index(target_var, 
> target_vars, n_target_vars)
> @@ -131,7 +131,7 @@ BEGIN {
> {
> target_vars[n_target_vars++] 
> = target_var
> }
> -   
> other_masks[var_index][n_other_mask[var_index]++] = name
> +   other_masks[var_index "," 
> n_other_mask[var_index]++] = name
> }
> else
> {
> diff --git a/gcc/opth-gen.awk b/gcc/opth-gen.awk
> index 70ca3d37719..c4398be2f3a 100644
> --- a/gcc/opth-gen.awk
> +++ b/gcc/opth-gen.awk
> @@ -412,9 +412,9 @@ for (i = 0; i < n_target_vars; i++)
> continue
> for (j = 0; j < n_other_mask[i]; j++)
> {
> -   print "#define MASK_" other_masks[i][j] " (1U << " 
> other_masknum[i][""]++ ")"
> +   print "#define MASK_" other_masks[i "," j] " (1U << " 
> other_masknum[i]++ ")"
> }
> -   if (other_masknum[i][""] > 32)
> +   if (other_masknum[i] > 32)
> print "#error too many target masks for" extra_target_vars[i]
>  }
>
> @@ -437,8 +437,8 @@ for (i = 0; i < n_target_vars; i++)
> continue
> for (j = 0; j < n_other_mask[i]; j++)
> {
> -   print "#define TARGET_" other_masks[i][j] \
> - " ((" target_vars[i] " & MASK_" other_masks[i][j] ") != 
> 0)"
> +   print "#define TARGET_" other_masks[i "," j] \
> + " ((" target_vars[i] " & MASK_" other_masks[i "," j] ") 
> != 0)"
> }
>  }
>  print ""
> --
> 2.40.1
>


Re: Re: [PATCH] RISC-V: Fix the riscv_legitimize_poly_move issue on targets where the minimal VLEN exceeds 512.

2023-10-04 Thread 钟居哲
OK. But could you add a MACRO define

Something like:
#define MAX_POLY_VARIANT 64



juzhe.zh...@rivai.ai
 
From: Kito Cheng
Date: 2023-10-04 21:32
To: 钟居哲
CC: Jeff Law; gcc-patches; kito.cheng; palmer; rdapp
Subject: Re: [PATCH] RISC-V: Fix the riscv_legitimize_poly_move issue on 
targets where the minimal VLEN exceeds 512.
钟居哲 於 2023年10月4日 週三,20:20寫道:
>
> I think the "max poly value" is the LMUL 1 mode coeffs[1]
>
> See int vlenb = BYTES_PER_RISCV_VECTOR.coeffs[1];
>
> So I think bump max_power to exact_log2 (64); is not enough.
> since we adjust the LMUL 1 mode size according to TARGET_MIN_VLEN.
>
> I suspect the testcase you append in this patch will fail with 
> -march=rv64gcv_zvl4096b.
 
 
There is no type smaller than  [64, 64] in zvl4096b, RVVMF64BI is [64,
64], it’s smallest type, and RVVFM1BI is [512, 512] (size of single
vector reg.) which at most 64x for zvl4096b, so my understanding is
log2(64) is enough :)
 
and of cause, verified the testcase is work with -march=rv64gcv_zvl4096b
 


[ping][PATCH v2] Add a GCC Security policy

2023-10-04 Thread Siddhesh Poyarekar

Ping!

On 2023-09-28 07:55, Siddhesh Poyarekar wrote:

Define a security process and exclusions to security issues for GCC and
all components it ships.

Signed-off-by: Siddhesh Poyarekar 
---
  SECURITY.txt | 205 +++
  1 file changed, 205 insertions(+)
  create mode 100644 SECURITY.txt

diff --git a/SECURITY.txt b/SECURITY.txt
new file mode 100644
index 000..14cb31570d3
--- /dev/null
+++ b/SECURITY.txt
@@ -0,0 +1,205 @@
+What is a GCC security bug?
+===
+
+A security bug is one that threatens the security of a system or
+network, or might compromise the security of data stored on it.
+In the context of GCC there are multiple ways in which this might
+happen and some common scenarios are detailed below.
+
+If you're reporting a security issue and feel like it does not fit
+into any of the descriptions below, you're encouraged to reach out
+through the GCC bugzilla or if needed, privately by following the
+instructions in the last two sections of this document.
+
+Compiler drivers, programs, libgccjit and support libraries
+---
+
+The compiler driver processes source code, invokes other programs
+such as the assembler and linker and generates the output result,
+which may be assembly code or machine code.  Compiling untrusted
+sources can result in arbitrary code execution and unconstrained
+resource consumption in the compiler. As a result, compilation of
+such code should be done inside a sandboxed environment to ensure
+that it does not compromise the development environment.
+
+The libgccjit library can, despite the name, be used both for
+ahead-of-time compilation and for just-in-compilation.  In both
+cases it can be used to translate input representations (such as
+source code) in the application context; in the latter case the
+generated code is also run in the application context.
+
+Limitations that apply to the compiler driver, apply here too in
+terms of sanitizing inputs and it is recommended that both the
+compilation *and* execution context of the code are appropriately
+sandboxed to contain the effects of any bugs in libgccjit, the
+application code using it, or its generated code to the sandboxed
+environment.
+
+Libraries such as libiberty, libcc1 and libcpp are not distributed
+for runtime support and have similar challenges to compiler drivers.
+While they are expected to be robust against arbitrary input, they
+should only be used with trusted inputs when linked into the
+compiler.
+
+Libraries such as zlib that bundled into GCC to build it will be
+treated the same as the compiler drivers and programs as far as
+security coverage is concerned.  However if you find an issue in
+these libraries independent of their use in GCC, you should reach
+out to their upstream projects to report them.
+
+As a result, the only case for a potential security issue in the
+compiler is when it generates vulnerable application code for
+trusted input source code that is conforming to the relevant
+programming standard or extensions documented as supported by GCC
+and the algorithm expressed in the source code does not have the
+vulnerability.  The output application code could be considered
+vulnerable if it produces an actual vulnerability in the target
+application, specifically in the following cases:
+
+- The application dereferences an invalid memory location despite
+  the application sources being valid.
+- The application reads from or writes to a valid but incorrect
+  memory location, resulting in an information integrity issue or an
+  information leak.
+- The application ends up running in an infinite loop or with
+  severe degradation in performance despite the input sources having
+  no such issue, resulting in a Denial of Service.  Note that
+  correct but non-performant code is not a security issue candidate,
+  this only applies to incorrect code that may result in performance
+  degradation severe enough to amount to a denial of service.
+- The application crashes due to the generated incorrect code,
+  resulting in a Denial of Service.
+
+Language runtime libraries
+--
+
+GCC also builds and distributes libraries that are intended to be
+used widely to implement runtime support for various programming
+languages.  These include the following:
+
+* libada
+* libatomic
+* libbacktrace
+* libcc1
+* libcody
+* libcpp
+* libdecnumber
+* libffi
+* libgcc
+* libgfortran
+* libgm2
+* libgo
+* libgomp
+* libitm
+* libobjc
+* libphobos
+* libquadmath
+* libssp
+* libstdc++
+
+These libraries are intended to be used in arbitr

[PATCH] RISC-V: Remove @ of vec_series

2023-10-04 Thread Juzhe-Zhong
gcc/ChangeLog:

* config/riscv/autovec.md (@vec_series): Remove @.
(vec_series): Ditto.
* config/riscv/riscv-v.cc (expand_const_vector): Ditto.
(shuffle_decompress_patterns): Ditto.

---
 gcc/config/riscv/autovec.md | 2 +-
 gcc/config/riscv/riscv-v.cc | 6 +++---
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/gcc/config/riscv/autovec.md b/gcc/config/riscv/autovec.md
index d6cf376ebca..056f2c352f6 100644
--- a/gcc/config/riscv/autovec.md
+++ b/gcc/config/riscv/autovec.md
@@ -336,7 +336,7 @@
 ;; - vadd.vx/vadd.vi
 ;; -
 
-(define_expand "@vec_series"
+(define_expand "vec_series"
   [(match_operand:V_VLSI 0 "register_operand")
(match_operand: 1 "reg_or_int_operand")
(match_operand: 2 "reg_or_int_operand")]
diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
index 29e138e1da2..23633a2a74d 100644
--- a/gcc/config/riscv/riscv-v.cc
+++ b/gcc/config/riscv/riscv-v.cc
@@ -1014,7 +1014,7 @@ expand_const_vector (rtx target, rtx src)
   rtx base, step;
   if (const_vec_series_p (src, &base, &step))
 {
-  emit_insn (gen_vec_series (mode, target, base, step));
+  expand_vec_series (target, base, step);
   return;
 }
 
@@ -1171,7 +1171,7 @@ expand_const_vector (rtx target, rtx src)
  rtx step = CONST_VECTOR_ELT (src, 2);
  /* Step 1 - { base1, base1 + step, base1 + step * 2, ... }  */
  rtx tmp = gen_reg_rtx (mode);
- emit_insn (gen_vec_series (mode, tmp, base1, step));
+ expand_vec_series (tmp, base1, step);
  /* Step 2 - { base0, base1, base1 + step, base1 + step * 2, ... }  */
  scalar_mode elem_mode = GET_MODE_INNER (mode);
  if (!rtx_equal_p (base0, const0_rtx))
@@ -3020,7 +3020,7 @@ shuffle_decompress_patterns (struct expand_vec_perm_d *d)
   /* Generate { 0, 1,  } mask.  */
   rtx vid = gen_reg_rtx (sel_mode);
   rtx vid_repeat = gen_reg_rtx (sel_mode);
-  emit_insn (gen_vec_series (sel_mode, vid, const0_rtx, const1_rtx));
+  expand_vec_series (vid, const0_rtx, const1_rtx);
   rtx and_ops[] = {vid_repeat, vid, const1_rtx};
   emit_vlmax_insn (code_for_pred_scalar (AND, sel_mode), BINARY_OP, and_ops);
   rtx const_vec = gen_const_vector_dup (sel_mode, 1);
-- 
2.36.3



Re: [PATCH] RISC-V: THead: Fix missing CFI directives for th.sdd in prologue.

2023-10-04 Thread Christoph Müllner
On Wed, Oct 4, 2023 at 9:49 AM Xianmiao Qu  wrote:
>
> From: quxm 
>
> When generating CFI directives for the store-pair instruction,
> if we add two parallel REG_FRAME_RELATED_EXPR expr_lists like
>   (expr_list:REG_FRAME_RELATED_EXPR (set (mem/c:DI (plus:DI (reg/f:DI 2 sp)
> (const_int 8 [0x8])) [1  S8 A64])
> (reg:DI 1 ra))
>   (expr_list:REG_FRAME_RELATED_EXPR (set (mem/c:DI (reg/f:DI 2 sp) [1  S8 
> A64])
> (reg:DI 8 s0))
> only the first expr_list will be recognized by dwarf2out_frame_debug
> funciton. So, here we generate a SEQUENCE expression of 
> REG_FRAME_RELATED_EXPR,
> which includes two sub-expressions of RTX_FRAME_RELATED_P. Then the
> dwarf2out_frame_debug_expr function will iterate through all the 
> sub-expressions
> and generate the corresponding CFI directives.
>
> gcc/
> * config/riscv/thead.cc (th_mempair_save_regs): Fix missing CFI
> directives for store-pair instruction.
>
> gcc/testsuite/
> * gcc.target/riscv/xtheadmempair-4.c: New test.

LGTM, I've also tested it.

Reviewed-by: Christoph Müllner 
Tested-by: Christoph Müllner 

Thanks!

> ---
>  gcc/config/riscv/thead.cc | 11 +++
>  .../gcc.target/riscv/xtheadmempair-4.c| 29 +++
>  2 files changed, 35 insertions(+), 5 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/riscv/xtheadmempair-4.c
>
> diff --git a/gcc/config/riscv/thead.cc b/gcc/config/riscv/thead.cc
> index 507c912bc39..be0cd7c1276 100644
> --- a/gcc/config/riscv/thead.cc
> +++ b/gcc/config/riscv/thead.cc
> @@ -366,14 +366,15 @@ th_mempair_save_regs (rtx operands[4])
>  {
>rtx set1 = gen_rtx_SET (operands[0], operands[1]);
>rtx set2 = gen_rtx_SET (operands[2], operands[3]);
> +  rtx dwarf = gen_rtx_SEQUENCE (VOIDmode, rtvec_alloc (2));
>rtx insn = emit_insn (gen_rtx_PARALLEL (VOIDmode, gen_rtvec (2, set1, 
> set2)));
>RTX_FRAME_RELATED_P (insn) = 1;
>
> -  REG_NOTES (insn) = alloc_EXPR_LIST (REG_FRAME_RELATED_EXPR,
> - copy_rtx (set1), REG_NOTES (insn));
> -
> -  REG_NOTES (insn) = alloc_EXPR_LIST (REG_FRAME_RELATED_EXPR,
> - copy_rtx (set2), REG_NOTES (insn));
> +  XVECEXP (dwarf, 0, 0) = copy_rtx (set1);
> +  XVECEXP (dwarf, 0, 1) = copy_rtx (set2);
> +  RTX_FRAME_RELATED_P (XVECEXP (dwarf, 0, 0)) = 1;
> +  RTX_FRAME_RELATED_P (XVECEXP (dwarf, 0, 1)) = 1;
> +  add_reg_note (insn, REG_FRAME_RELATED_EXPR, dwarf);
>  }
>
>  /* Similar like riscv_restore_reg, but restores two registers from memory
> diff --git a/gcc/testsuite/gcc.target/riscv/xtheadmempair-4.c 
> b/gcc/testsuite/gcc.target/riscv/xtheadmempair-4.c
> new file mode 100644
> index 000..9aef4e15f8d
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/xtheadmempair-4.c
> @@ -0,0 +1,29 @@
> +/* { dg-do compile } */
> +/* { dg-skip-if "" { *-*-* } { "-O0" "-O1" "-g" "-Oz" "-Os" "-flto" } } */
> +/* { dg-options "-march=rv64gc_xtheadmempair -mtune=thead-c906 
> -funwind-tables" { target { rv64 } } } */
> +/* { dg-options "-march=rv32gc_xtheadmempair -mtune=thead-c906 
> -funwind-tables" { target { rv32 } } } */
> +
> +extern void bar (void);
> +
> +void foo (void)
> +{
> +  asm volatile (";my clobber list"
> +   : : : "s0");
> +  bar ();
> +  asm volatile (";my clobber list"
> +   : : : "s0");
> +}
> +
> +/* { dg-final { scan-assembler-times "th.sdd\t" 1 { target { rv64 } } } } */
> +/* { dg-final { scan-assembler ".cfi_offset 8, -16" { target { rv64 } } } } 
> */
> +/* { dg-final { scan-assembler ".cfi_offset 1, -8" { target { rv64 } } } } */
> +
> +/* { dg-final { scan-assembler-times "th.swd\t" 1 { target { rv32 } } } } */
> +/* { dg-final { scan-assembler ".cfi_offset 8, -8" { target { rv32 } } } } */
> +/* { dg-final { scan-assembler ".cfi_offset 1, -4" { target { rv32 } } } } */
> +
> +/* { dg-final { scan-assembler ".cfi_restore 1" } } */
> +/* { dg-final { scan-assembler ".cfi_restore 8" } } */
> +
> +/* { dg-final { scan-assembler-times "th.ldd\t" 1 { target { rv64 } } } } */
> +/* { dg-final { scan-assembler-times "th.lwd\t" 1 { target { rv32 } } } } */
> --
> 2.17.1
>


Re: [PATCH] RISC-V: Remove @ of vec_series

2023-10-04 Thread Robin Dapp
I'm currently in the process of removing some unused @s.
This is OK.

Regards
 Robin


Re: [PATCH] RISC-V: THead: Fix missing CFI directives for th.sdd in prologue.

2023-10-04 Thread Kito Cheng
LGTM, will commit if nobody commit that tomorrow :p

Christoph Müllner 於 2023年10月4日 週三,23:04寫道:

> On Wed, Oct 4, 2023 at 9:49 AM Xianmiao Qu 
> wrote:
> >
> > From: quxm 
> >
> > When generating CFI directives for the store-pair instruction,
> > if we add two parallel REG_FRAME_RELATED_EXPR expr_lists like
> >   (expr_list:REG_FRAME_RELATED_EXPR (set (mem/c:DI (plus:DI (reg/f:DI 2
> sp)
> > (const_int 8 [0x8])) [1  S8 A64])
> > (reg:DI 1 ra))
> >   (expr_list:REG_FRAME_RELATED_EXPR (set (mem/c:DI (reg/f:DI 2 sp) [1
> S8 A64])
> > (reg:DI 8 s0))
> > only the first expr_list will be recognized by dwarf2out_frame_debug
> > funciton. So, here we generate a SEQUENCE expression of
> REG_FRAME_RELATED_EXPR,
> > which includes two sub-expressions of RTX_FRAME_RELATED_P. Then the
> > dwarf2out_frame_debug_expr function will iterate through all the
> sub-expressions
> > and generate the corresponding CFI directives.
> >
> > gcc/
> > * config/riscv/thead.cc (th_mempair_save_regs): Fix missing CFI
> > directives for store-pair instruction.
> >
> > gcc/testsuite/
> > * gcc.target/riscv/xtheadmempair-4.c: New test.
>
> LGTM, I've also tested it.
>
> Reviewed-by: Christoph Müllner 
> Tested-by: Christoph Müllner 
>
> Thanks!
>
> > ---
> >  gcc/config/riscv/thead.cc | 11 +++
> >  .../gcc.target/riscv/xtheadmempair-4.c| 29 +++
> >  2 files changed, 35 insertions(+), 5 deletions(-)
> >  create mode 100644 gcc/testsuite/gcc.target/riscv/xtheadmempair-4.c
> >
> > diff --git a/gcc/config/riscv/thead.cc b/gcc/config/riscv/thead.cc
> > index 507c912bc39..be0cd7c1276 100644
> > --- a/gcc/config/riscv/thead.cc
> > +++ b/gcc/config/riscv/thead.cc
> > @@ -366,14 +366,15 @@ th_mempair_save_regs (rtx operands[4])
> >  {
> >rtx set1 = gen_rtx_SET (operands[0], operands[1]);
> >rtx set2 = gen_rtx_SET (operands[2], operands[3]);
> > +  rtx dwarf = gen_rtx_SEQUENCE (VOIDmode, rtvec_alloc (2));
> >rtx insn = emit_insn (gen_rtx_PARALLEL (VOIDmode, gen_rtvec (2, set1,
> set2)));
> >RTX_FRAME_RELATED_P (insn) = 1;
> >
> > -  REG_NOTES (insn) = alloc_EXPR_LIST (REG_FRAME_RELATED_EXPR,
> > - copy_rtx (set1), REG_NOTES (insn));
> > -
> > -  REG_NOTES (insn) = alloc_EXPR_LIST (REG_FRAME_RELATED_EXPR,
> > - copy_rtx (set2), REG_NOTES (insn));
> > +  XVECEXP (dwarf, 0, 0) = copy_rtx (set1);
> > +  XVECEXP (dwarf, 0, 1) = copy_rtx (set2);
> > +  RTX_FRAME_RELATED_P (XVECEXP (dwarf, 0, 0)) = 1;
> > +  RTX_FRAME_RELATED_P (XVECEXP (dwarf, 0, 1)) = 1;
> > +  add_reg_note (insn, REG_FRAME_RELATED_EXPR, dwarf);
> >  }
> >
> >  /* Similar like riscv_restore_reg, but restores two registers from
> memory
> > diff --git a/gcc/testsuite/gcc.target/riscv/xtheadmempair-4.c
> b/gcc/testsuite/gcc.target/riscv/xtheadmempair-4.c
> > new file mode 100644
> > index 000..9aef4e15f8d
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/riscv/xtheadmempair-4.c
> > @@ -0,0 +1,29 @@
> > +/* { dg-do compile } */
> > +/* { dg-skip-if "" { *-*-* } { "-O0" "-O1" "-g" "-Oz" "-Os" "-flto" } }
> */
> > +/* { dg-options "-march=rv64gc_xtheadmempair -mtune=thead-c906
> -funwind-tables" { target { rv64 } } } */
> > +/* { dg-options "-march=rv32gc_xtheadmempair -mtune=thead-c906
> -funwind-tables" { target { rv32 } } } */
> > +
> > +extern void bar (void);
> > +
> > +void foo (void)
> > +{
> > +  asm volatile (";my clobber list"
> > +   : : : "s0");
> > +  bar ();
> > +  asm volatile (";my clobber list"
> > +   : : : "s0");
> > +}
> > +
> > +/* { dg-final { scan-assembler-times "th.sdd\t" 1 { target { rv64 } } }
> } */
> > +/* { dg-final { scan-assembler ".cfi_offset 8, -16" { target { rv64 } }
> } } */
> > +/* { dg-final { scan-assembler ".cfi_offset 1, -8" { target { rv64 } }
> } } */
> > +
> > +/* { dg-final { scan-assembler-times "th.swd\t" 1 { target { rv32 } } }
> } */
> > +/* { dg-final { scan-assembler ".cfi_offset 8, -8" { target { rv32 } }
> } } */
> > +/* { dg-final { scan-assembler ".cfi_offset 1, -4" { target { rv32 } }
> } } */
> > +
> > +/* { dg-final { scan-assembler ".cfi_restore 1" } } */
> > +/* { dg-final { scan-assembler ".cfi_restore 8" } } */
> > +
> > +/* { dg-final { scan-assembler-times "th.ldd\t" 1 { target { rv64 } } }
> } */
> > +/* { dg-final { scan-assembler-times "th.lwd\t" 1 { target { rv32 } } }
> } */
> > --
> > 2.17.1
> >
>


[PATCH] libstdc++: Correctly call _string_types function

2023-10-04 Thread Tom Tromey
flake8 points out that the new call to _string_types from
StdExpAnyPrinter.__init__ is not correct -- it needs to be qualified.

libstdc++-v3/ChangeLog:

* python/libstdcxx/v6/printers.py
(StdExpAnyPrinter.__init__): Qualify call to
_string_types.
---
 libstdc++-v3/python/libstdcxx/v6/printers.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/libstdc++-v3/python/libstdcxx/v6/printers.py 
b/libstdc++-v3/python/libstdcxx/v6/printers.py
index 23efbd171ec..9a51f26d8e0 100644
--- a/libstdc++-v3/python/libstdcxx/v6/printers.py
+++ b/libstdc++-v3/python/libstdcxx/v6/printers.py
@@ -1386,7 +1386,7 @@ class StdExpAnyPrinter(SingleObjContainerPrinter):
 # FIXME need to expand 'std::string' so that gdb.lookup_type works
 if 'std::string' in mgrname:
 mgrtypes = []
-for s in _string_types():
+for s in StdExpAnyPrinter._string_types():
 try:
 x = re.sub(r"std::string(?!\w)", s, m.group(1))
 # The following lookup might raise gdb.error if the
-- 
2.40.1



Re: [PATCH] libstdc++: Correctly call _string_types function

2023-10-04 Thread Jonathan Wakely
On Wed, 4 Oct 2023 at 16:10, Tom Tromey  wrote:
>
> flake8 points out that the new call to _string_types from
> StdExpAnyPrinter.__init__ is not correct -- it needs to be qualified.

OK, thanks.


>
> libstdc++-v3/ChangeLog:
>
> * python/libstdcxx/v6/printers.py
> (StdExpAnyPrinter.__init__): Qualify call to
> _string_types.
> ---
>  libstdc++-v3/python/libstdcxx/v6/printers.py | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/libstdc++-v3/python/libstdcxx/v6/printers.py 
> b/libstdc++-v3/python/libstdcxx/v6/printers.py
> index 23efbd171ec..9a51f26d8e0 100644
> --- a/libstdc++-v3/python/libstdcxx/v6/printers.py
> +++ b/libstdc++-v3/python/libstdcxx/v6/printers.py
> @@ -1386,7 +1386,7 @@ class StdExpAnyPrinter(SingleObjContainerPrinter):
>  # FIXME need to expand 'std::string' so that gdb.lookup_type 
> works
>  if 'std::string' in mgrname:
>  mgrtypes = []
> -for s in _string_types():
> +for s in StdExpAnyPrinter._string_types():
>  try:
>  x = re.sub(r"std::string(?!\w)", s, m.group(1))
>  # The following lookup might raise gdb.error if the
> --
> 2.40.1
>


RE: [PATCH] ifcvt/vect: Emit COND_ADD for conditional scalar reduction.

2023-10-04 Thread Tamar Christina
Hi Robin,

> -Original Message-
> From: Robin Dapp 
> Sent: Wednesday, October 4, 2023 8:54 AM
> To: Tamar Christina ; gcc-patches  patc...@gcc.gnu.org>; Richard Biener 
> Cc: rdapp@gmail.com
> Subject: Re: [PATCH] ifcvt/vect: Emit COND_ADD for conditional scalar
> reduction.
> 
> Hi Tamar,
> 
> > I can't approve but hope you don't mind the review,
> 
> Not at all, greatly appreciated.
> 
> I incorporated all your remarks apart from this:
> 
> > Isn't vec_opmask NULL for SLP? You probably need to read it from
> > vec_defs for the COND_EXPR
> 
> Above that I gcc_assert (!slp_node) for the IFN_COND case.  It doesn't seem to
> be hit during testsuite runs.  I also didn't manage to create an example that
> would trigger it.  When "conditionalizing" an SLP fold-left reduction we don't
> seem to vectorize for a different reason.
> Granted, I didn't look very closely at that reason :)

Yeah it looks like it's failing because it can't handle the PHI node reduction 
for
the condition.

So that's fine, I think then we should exit from vectorize_fold_left_reduction
in that case so we avoid the segfault when we start forcing things through
SLP only soon and add single lane SLP support.

So in the

  if (slp_node)
{

Add something like:

If (is_cond_op)
{
  if (dump_enabled_p ())
dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
 "left fold reduction on SLP not supported.\n");
  return false;
}

> 
> Bootstrap and testsuite are currently running with the attached v2 on x86,
> aarch64 and powerpc.
> 
> Besides, when thinking about which COND_OPs we expect I tried to loosen the
> restrictions in if-conv by allowing MAX_EXPR and MIN_EXPR.  The emitted
> code on riscv looks correct but I hit a bootstrap ICE on x86 so omitted it for
> now.
> 
> Regards
>  Robin
> 
> 
> Subject: [PATCH v2] ifcvt/vect: Emit COND_ADD for conditional scalar
> reduction.
> 
> As described in PR111401 we currently emit a COND and a PLUS expression
> for conditional reductions.  This makes it difficult to combine both into a
> masked reduction statement later.
> This patch improves that by directly emitting a COND_ADD during ifcvt and
> adjusting some vectorizer code to handle it.
> 
> It also makes neutral_op_for_reduction return -0 if HONOR_SIGNED_ZEROS is
> true.
> 
> gcc/ChangeLog:
> 
>   PR middle-end/111401
>   * internal-fn.cc (cond_fn_p): New function.
>   * internal-fn.h (cond_fn_p): Define.
>   * tree-if-conv.cc (convert_scalar_cond_reduction): Emit COND_ADD
>   if supported.
>   (predicate_scalar_phi): Add whitespace.
>   * tree-vect-loop.cc (fold_left_reduction_fn): Add IFN_COND_ADD.
>   (neutral_op_for_reduction): Return -0 for PLUS.
>   (vect_is_simple_reduction): Don't count else operand in
>   COND_ADD.
>   (vectorize_fold_left_reduction): Add COND_ADD handling.
>   (vectorizable_reduction): Don't count else operand in COND_ADD.
>   (vect_transform_reduction): Add COND_ADD handling.
>   * tree-vectorizer.h (neutral_op_for_reduction): Add default
>   parameter.
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.dg/vect/vect-cond-reduc-in-order-2-signed-zero.c: New test.
>   * gcc.target/riscv/rvv/autovec/cond/pr111401.c: New test.
> ---
>  gcc/internal-fn.cc|  17 +++
>  gcc/internal-fn.h |   1 +
>  .../vect-cond-reduc-in-order-2-signed-zero.c  | 141 ++
>  .../riscv/rvv/autovec/cond/pr111401.c | 139 +
>  gcc/tree-if-conv.cc   |  63 ++--
>  gcc/tree-vect-loop.cc | 129 
>  gcc/tree-vectorizer.h |   2 +-
>  7 files changed, 450 insertions(+), 42 deletions(-)  create mode 100644
> gcc/testsuite/gcc.dg/vect/vect-cond-reduc-in-order-2-signed-zero.c
>  create mode 100644
> gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/pr111401.c
> 
> diff --git a/gcc/internal-fn.cc b/gcc/internal-fn.cc index
> 61d5a9e4772..9b38dc0cef4 100644
> --- a/gcc/internal-fn.cc
> +++ b/gcc/internal-fn.cc
> @@ -4245,6 +4245,23 @@ first_commutative_argument (internal_fn fn)
>  }
>  }
> 
> +/* Return true if this CODE describes a conditional (masked)
> +internal_fn.  */
> +
> +bool
> +cond_fn_p (code_helper code)
> +{
> +  if (!code.is_fn_code ())
> +return false;
> +
> +  if (!internal_fn_p ((combined_fn) code))
> +return false;
> +
> +  internal_fn fn = as_internal_fn ((combined_fn) code);
> +
> +  return conditional_internal_fn_code (fn) != ERROR_MARK; }
> +
> +
>  /* Return true if this CODE describes an internal_fn that returns a vector 
> with
> elements twice as wide as the element size of the input vectors.  */
> 

The only comment I have is whether you actually need this helper function?
It looks like all the uses of it are in cases you have, or will call 
conditional_internal_fn_code
directly.

e.g. in vect_transform_reduction yo

Re: [PATCH v2] Add a GCC Security policy

2023-10-04 Thread David Edelsohn
Hi, Siddhesh

Thanks for working on this and fine-tuning the original proposed text.  It
looks good to me.  Minor grammatical nit below.

Thanks, David

On Thu, Sep 28, 2023 at 7:59 AM Siddhesh Poyarekar 
wrote:

> On 2023-09-28 12:55, Siddhesh Poyarekar wrote:
> > Define a security process and exclusions to security issues for GCC and
> > all components it ships.
> >
> > Signed-off-by: Siddhesh Poyarekar 
> > ---
>
> Sorry I forgot to summarize changes since the previous version:
>
> - Reworded the introduction so that it doesn't sound like we know *all*
> scenarios and also encourage reporters to reach out.
>
> - Fixed up support and diagnostic libraries sections based on Jakub's
> feedback.
>
> >   SECURITY.txt | 205 +++
> >   1 file changed, 205 insertions(+)
> >   create mode 100644 SECURITY.txt
> >
> > diff --git a/SECURITY.txt b/SECURITY.txt
> > new file mode 100644
> > index 000..14cb31570d3
> > --- /dev/null
> > +++ b/SECURITY.txt
> > @@ -0,0 +1,205 @@
> > +What is a GCC security bug?
> > +===
> > +
> > +A security bug is one that threatens the security of a system or
> > +network, or might compromise the security of data stored on it.
> > +In the context of GCC there are multiple ways in which this might
> > +happen and some common scenarios are detailed below.
> > +
> > +If you're reporting a security issue and feel like it does not fit
> > +into any of the descriptions below, you're encouraged to reach out
> > +through the GCC bugzilla or if needed, privately by following the
>

Some missing, pedantic commas:

through the GCC bugzilla or, if needed, privately, by following the


> > +instructions in the last two sections of this document.
> > +
> > +Compiler drivers, programs, libgccjit and support libraries
> > +---
> > +
> > +The compiler driver processes source code, invokes other programs
> > +such as the assembler and linker and generates the output result,
> > +which may be assembly code or machine code.  Compiling untrusted
> > +sources can result in arbitrary code execution and unconstrained
> > +resource consumption in the compiler. As a result, compilation of
> > +such code should be done inside a sandboxed environment to ensure
> > +that it does not compromise the development environment.
> > +
> > +The libgccjit library can, despite the name, be used both for
> > +ahead-of-time compilation and for just-in-compilation.  In both
> > +cases it can be used to translate input representations (such as
> > +source code) in the application context; in the latter case the
> > +generated code is also run in the application context.
> > +
> > +Limitations that apply to the compiler driver, apply here too in
> > +terms of sanitizing inputs and it is recommended that both the
> > +compilation *and* execution context of the code are appropriately
> > +sandboxed to contain the effects of any bugs in libgccjit, the
> > +application code using it, or its generated code to the sandboxed
> > +environment.
> > +
> > +Libraries such as libiberty, libcc1 and libcpp are not distributed
> > +for runtime support and have similar challenges to compiler drivers.
> > +While they are expected to be robust against arbitrary input, they
> > +should only be used with trusted inputs when linked into the
> > +compiler.
> > +
> > +Libraries such as zlib that bundled into GCC to build it will be
> > +treated the same as the compiler drivers and programs as far as
> > +security coverage is concerned.  However if you find an issue in
> > +these libraries independent of their use in GCC, you should reach
> > +out to their upstream projects to report them.
> > +
> > +As a result, the only case for a potential security issue in the
> > +compiler is when it generates vulnerable application code for
> > +trusted input source code that is conforming to the relevant
> > +programming standard or extensions documented as supported by GCC
> > +and the algorithm expressed in the source code does not have the
> > +vulnerability.  The output application code could be considered
> > +vulnerable if it produces an actual vulnerability in the target
> > +application, specifically in the following cases:
> > +
> > +- The application dereferences an invalid memory location despite
> > +  the application sources being valid.
> > +- The application reads from or writes to a valid but incorrect
> > +  memory location, resulting in an information integrity issue or an
> > +  information leak.
> > +- The application ends up running in an infinite loop or with
> > +  severe degradation in performance despite the input sources having
> > +  no such issue, resulting in a Denial of Service.  Note that
> > +   

Re: [PATCH 2/2] testsuite: Replace many dg-require-thread-fence with dg-require-atomic-exchange

2023-10-04 Thread Hans-Peter Nilsson
> From: Jonathan Wakely 
> Date: Wed, 4 Oct 2023 09:29:43 +0100

> The new dg-require proc checks for __atomic_exchange, which is not the
> same as compare-exchange, and not the same as test-and-set on
> atomic_flag. Does it just happen to be true for arm that the presence
> of __atomic_exchange also implies the other atomic operations?

I missed pointing out that if the target implements
something that emits actual insns for __atomic_exchange
(and/or __atomic_compare_exchange), it has also implemented
test-and-set.  Cf. optabs.cc:expand_atomic_test_and_set (at
r14-4395-g027a94cf32be0b).

Similarly a insn-level __atomic_compare_exchange
implementation (atomic_compare_and_swapM) also does it for
__atomic_exchange.

> The new proc also only tests it for int, which presumably means none
> of these tests rely on atomics for long long being present. Some of
> the tests use atomics on pointers, which should work for ILP32 targets
> if atomics work on int, but the new dg-require-atomic-exchange isn't
> sufficient to check that it will work for pointers on LP64 targets.

Right, I'll amend to test a uintptr_t...

> Maybe it happens to work today for the targets where we have issues,
> but we seem to be assuming that there will be no LP64 targets where
> these atomics are absent for 64-bit pointers. Are there supported
> risc-v ISA subsets where that isn't true?

...but generally, I'd like to leave future amendments like
that to the Next Guy, just like the Previous Guy left
dg-require-thread-fence for me (us) to split up.  But,
perhaps we can prepare for such amendments by giving it a
more specific name: suggesting
dg-require-atomic-cmpxchg-word (bikeshedding opportunity),
testing __atomic_compare_exchange.

IOW, I don't think we should make a distinction, looking for
other operations and sizes at this time.

> And we're assuming that
> __atomic_exchange being present implies all other ops are present,
> which seems like a bad assumption to me. I would be more confident
> testing for __atomic_compare_exchange, because with a wait-free CAS
> you can implement any other atomic operation, but the same isn't true
> for __atomic_exchange.

Yes, I switched them around.

New version coming up.

brgds, H-P


Re: [PATCH 21/22] Walk the cfg in topological order, not depth-first

2023-10-04 Thread Jørgen Kvalsvik

On 04/10/2023 21:39, Jørgen Kvalsvik wrote:

Depth first order is insufficient to process expressions in the right
order when there are setjmps in optimized builds. This would create
complex paths from the root past conditions and into the middle of
boolean expressions. Traversing the graph in topological order restores
the expectation that expressions will be processed top-down.
---
  gcc/testsuite/gcc.misc-tests/gcov-23.c | 26 ++
  gcc/tree-profile.cc|  1 +
  2 files changed, 27 insertions(+)

diff --git a/gcc/testsuite/gcc.misc-tests/gcov-23.c 
b/gcc/testsuite/gcc.misc-tests/gcov-23.c
index 856e97f5088..e5b56a5aa44 100644
--- a/gcc/testsuite/gcc.misc-tests/gcov-23.c
+++ b/gcc/testsuite/gcc.misc-tests/gcov-23.c
@@ -1,5 +1,8 @@
  /* { dg-options "-fprofile-conditions -ftest-coverage -O2 -c" } */
  
+#include 

+jmp_buf buf;
+
  int id (int);
  int idp (int *);
  int err;
@@ -120,4 +123,27 @@ mcdc003 (const char *locale)
  return 1;
  }
  
+/* Adapted from jxl 0.8.2 lib/extras/dec/apng.cc processing_start ().

+   This created a graph where depth-first traversal of the CFG would not
+   process nodes in the wrong order (the extra control inserted from setjmp
+   created a path of complexes from root to !b without going through !a).
+
+This only happened under optimization.  */
+int
+mcdc004 (int a, int b)
+{
+a = id (a);
+b = id (b);
+
+if (a || b) /* conditions(0/4) true(0 1) false(0 1) */
+   /* conditions(end) */
+   return 1;
+
+if (setjmp (buf)) /* conditions(0/2) true(0) false(0) */
+ /* conditions(end) */
+   return 1;
+
+return 0;
+}
+
  /* { dg-final { run-gcov conditions { --conditions gcov-23.c } } } */
diff --git a/gcc/tree-profile.cc b/gcc/tree-profile.cc
index d1d7265cd1c..8bf280dc018 100644
--- a/gcc/tree-profile.cc
+++ b/gcc/tree-profile.cc
@@ -1045,6 +1045,7 @@ find_conditions (struct function *fn)
  int n = dfs_enumerate_from (entry, 0, yes, dfs.address (), nblocks, exit);
  dfs.truncate (n);
  make_index_map (dfs, nblocks, ctx.B1, ctx.index_map);
+dfs.sort (cmp_index_map, &ctx.index_map);
  
  /* Visit all reachable nodes and collect conditions.  DFS order is

 important so the first node of a boolean expression is visited first


This test fails the build because it missed this change:

From ef01096021841bc3335aa1e80f0e3e75be9b24a2 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?J=C3=B8rgen=20Kvalsvik?= 
Date: Thu, 5 Oct 2023 00:14:53 +0900
Subject: [PATCH] Make right operand function to keep 2-term expr

If this is just the variable and not the function call, the comparsion
gets optimized to a|b which reduces the number of conditions and causes
the counting to fail.

The counting is not really an interesting part of this test, verifying
that it doesn't crash is, but it's nice when the numbers line up. An
alternative is to accept the strength reduction and count it as a 1-term
expression instead.
---
 gcc/testsuite/gcc.misc-tests/gcov-23.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/testsuite/gcc.misc-tests/gcov-23.c 
b/gcc/testsuite/gcc.misc-tests/gcov-23.c

index e5b56a5aa44..98e692fcd01 100644
--- a/gcc/testsuite/gcc.misc-tests/gcov-23.c
+++ b/gcc/testsuite/gcc.misc-tests/gcov-23.c
@@ -135,8 +135,8 @@ mcdc004 (int a, int b)
 a = id (a);
 b = id (b);

-if (a || b) /* conditions(0/4) true(0 1) false(0 1) */
-   /* conditions(end) */
+if (a || id (b)) /* conditions(0/4) true(0 1) false(0 1) */
+/* conditions(end) */
return 1;

 if (setjmp (buf)) /* conditions(0/2) true(0) false(0) */
--
2.30.2


Re: [RFC] expr: don't clear SUBREG_PROMOTED_VAR_P flag for a promoted subreg [target/111466]

2023-10-04 Thread Jeff Law




On 9/28/23 15:43, Vineet Gupta wrote:

RISC-V suffers from extraneous sign extensions, despite/given the ABI
guarantee that 32-bit quantities are sign-extended into 64-bit registers,
meaning incoming SI function args need not be explicitly sign extended
(so do SI return values as most ALU insns implicitly sign-extend too.)

Existing REE doesn't seem to handle this well and there are various ideas
floating around to smarten REE about it.

RISC-V also seems to correctly implement middle-end hook PROMOTE_MODE
etc.

Another approach would be to prevent EXPAND from generating the
sign_extend in the first place which this patch tries to do.

The hunk being removed was introduced way back in 1994 as
5069803972 ("expand_expr, case CONVERT_EXPR .. clear the promotion flag")

This survived full testsuite run for RISC-V rv64gc with surprisingly no
fallouts: test results before/after are exactly same.

|   | # of unexpected case / # of unique unexpected 
case
|   |  gcc |  g++ | gfortran |
| rv64imafdc_zba_zbb_zbs_zicond/|  264 /87 |5 / 2 |   72 /12 |
|lp64d/medlow

Granted for something so old to have survived, there must be a valid
reason. Unfortunately the original change didn't have additional
commentary or a test case. That is not to say it can't/won't possibly
break things on other arches/ABIs, hence the RFC for someone to scream
that this is just bonkers, don't do this :-)

I've explicitly CC'ed Jakub and Roger who have last touched subreg
promoted notes in expr.cc for insight and/or screaming ;-)

Thanks to Robin for narrowing this down in an amazing debugging session
@ GNU Cauldron.

[ ... ]
So the data for Alpha was -- no change.  I also put the patch into my 
tester in the hopes that some target, any target would show a difference 
in test results.  It's churned about halfway through the embedded 
targets so far with no regressions.


Given the data in your followup message, this clearly looks valuable and 
so far we don't have any evidence Kenner's old change is useful or 
necessary anymore.


I'm not (yet) comfortable committing this patch, I think the easy 
avenues to getting a case where it's needed are not likely to prove 
fruitful.  So next steps here are for me to spend a bit more time trying 
to understand what cases Kenner was trying to handle.


Jeff


Re: [PATCH 18/22] Don't contract into random edge in multi-succ node

2023-10-04 Thread Jørgen Kvalsvik

On 04/10/2023 21:39, Jørgen Kvalsvik wrote:

A check was missing for is-single when contracting, so if a
multi-successor node that was not a condition node (e.g. a switch) a
random edge would be picked and contracted into.
---
  gcc/testsuite/gcc.misc-tests/gcov-23.c | 48 ++
  gcc/tree-profile.cc|  4 +++
  2 files changed, 52 insertions(+)

diff --git a/gcc/testsuite/gcc.misc-tests/gcov-23.c 
b/gcc/testsuite/gcc.misc-tests/gcov-23.c
index c4afc5e700d..856e97f5088 100644
--- a/gcc/testsuite/gcc.misc-tests/gcov-23.c
+++ b/gcc/testsuite/gcc.misc-tests/gcov-23.c
@@ -3,6 +3,7 @@
  int id (int);
  int idp (int *);
  int err;
+char c;
  
  /* This becomes problematic only under optimization for the case when the

 compiler cannot inline the function but have to generate a call.  It is not
@@ -72,4 +73,51 @@ exit:
  return a;
  }
  
+/* Adapted from icu4c-73.1 common/ucase.cpp ucase_getLocaleCase ().  */

+int
+mcdc003 (const char *locale)
+{
+/* extern, so its effect won't be optimized out.  */
+c = *locale++;
+if (c == 'z') /* conditions(0/2) true(0) false(0) */
+ /* conditions(end) */
+{
+   return 1;
+}
+else if (c >= 'a') /* conditions(0/2) true(0) false(0) */
+ /* conditions(end) */
+{
+   if (id (c)) /* conditions(0/2) true(0) false(0) */
+   /* conditions(end) */
+   c = *locale++;
+}
+else
+{
+   if (c == 'T')
+   {
+   if (id (c)) /* conditions(0/2) true(0) false(0) */
+   /* conditions(end) */
+   c = *locale++;
+   if (id (c)) /* conditions(0/2) true(0) false(0) */
+   /* conditions(end) */
+   c = *locale++;
+   }
+   else if (c == 'L')
+   c = *locale++;
+   else if (c == 'E')
+   c = *locale++;
+   else if (c == 'N')
+   c = *locale++;
+   else if (c == 'H')
+   {
+   c = *locale++;
+   if (id (c)) /* conditions(0/2) true(0) false(0) */
+   /* conditions(end) */
+   c = *locale++;
+   }
+}
+
+return 1;
+}
+
  /* { dg-final { run-gcov conditions { --conditions gcov-23.c } } } */
diff --git a/gcc/tree-profile.cc b/gcc/tree-profile.cc
index 26e1924d444..ce679130ca0 100644
--- a/gcc/tree-profile.cc
+++ b/gcc/tree-profile.cc
@@ -297,6 +297,10 @@ contract_edge (edge e, sbitmap G = nullptr)
return source;
if (block_conditional_p (dest))
return e;
+   /* This happens for switches, and must be checked after the 
is-conditional
+  (which is also not single).  */
+   if (!single (dest->succs))
+   return source;
  
  	if (G)

bitmap_set_bit (G, dest->index);


This test seems to fail on aarch64 (thanks, linaro bot)


FAIL: 24 regressions

regressions.sum:
=== gcc tests ===

Running gcc:gcc.misc-tests/gcov.exp ...
FAIL: gcc.misc-tests/gcov-23.c gcov: 0 failures in line counts, 0 in 
branch percentages, 13 in condition/decision, 0 in return percentages, 0 
in intermediate format

FAIL: gcc.misc-tests/gcov-23.c line 108: unexpected summary 0/2
FAIL: gcc.misc-tests/gcov-23.c line 108: unexpected uncovered term 0 (false)
FAIL: gcc.misc-tests/gcov-23.c line 108: unexpected uncovered term 0 (true)
FAIL: gcc.misc-tests/gcov-23.c line 110: unexpected summary 0/2
FAIL: gcc.misc-tests/gcov-23.c line 110: unexpected uncovered term 0 (false)
FAIL: gcc.misc-tests/gcov-23.c line 110: unexpected uncovered term 0 (true)
... and 19 more entries

Apparently the if-else do not get turned into a switch here. I have a 
look and see if I can reproduce the problem with an explicit switch 
rather than the implied one from if-else, as it is broken in its current 
state.


Re: [PATCH] RISC-V: Remove @ of vec_series

2023-10-04 Thread Jeff Law




On 10/4/23 09:06, Robin Dapp wrote:

I'm currently in the process of removing some unused @s.
This is OK.
Agreed.  And if you or Juzhe have other @ cases that are unused, such 
changes should be considered pre-approved.


Jeff


Re: [PATCH v2] Add a GCC Security policy

2023-10-04 Thread Alexander Monakov


On Thu, 28 Sep 2023, Siddhesh Poyarekar wrote:

> Define a security process and exclusions to security issues for GCC and
> all components it ships.

Some typos and wording suggestions below.

> --- /dev/null
> +++ b/SECURITY.txt
> @@ -0,0 +1,205 @@
> +What is a GCC security bug?
> +===
> +
> +A security bug is one that threatens the security of a system or
> +network, or might compromise the security of data stored on it.
> +In the context of GCC there are multiple ways in which this might
> +happen and some common scenarios are detailed below.
> +
> +If you're reporting a security issue and feel like it does not fit
> +into any of the descriptions below, you're encouraged to reach out
> +through the GCC bugzilla or if needed, privately by following the
> +instructions in the last two sections of this document.
> +
> +Compiler drivers, programs, libgccjit and support libraries
> +---
> +
> +The compiler driver processes source code, invokes other programs
> +such as the assembler and linker and generates the output result,
> +which may be assembly code or machine code.  Compiling untrusted
> +sources can result in arbitrary code execution and unconstrained
> +resource consumption in the compiler. As a result, compilation of
> +such code should be done inside a sandboxed environment to ensure
> +that it does not compromise the development environment.

"... the host environment" seems more appropriate.

> +
> +The libgccjit library can, despite the name, be used both for
> +ahead-of-time compilation and for just-in-compilation.  In both
> +cases it can be used to translate input representations (such as
> +source code) in the application context; in the latter case the
> +generated code is also run in the application context.
> +
> +Limitations that apply to the compiler driver, apply here too in
> +terms of sanitizing inputs and it is recommended that both the

s/sanitizing inputs/trusting inputs/ (I suggested it earlier, just unsure
if you don't agree or it simply fell through the cracks)

> +compilation *and* execution context of the code are appropriately
> +sandboxed to contain the effects of any bugs in libgccjit, the
> +application code using it, or its generated code to the sandboxed
> +environment.
> +
> +Libraries such as libiberty, libcc1 and libcpp are not distributed
> +for runtime support and have similar challenges to compiler drivers.
> +While they are expected to be robust against arbitrary input, they
> +should only be used with trusted inputs when linked into the
> +compiler.
> +
> +Libraries such as zlib that bundled into GCC to build it will be

'are bundled with' (missing 'are', s/into/with/)

> +treated the same as the compiler drivers and programs as far as
> +security coverage is concerned.  However if you find an issue in
> +these libraries independent of their use in GCC, you should reach
> +out to their upstream projects to report them.
> +
> +As a result, the only case for a potential security issue in the
> +compiler is when it generates vulnerable application code for
> +trusted input source code that is conforming to the relevant
> +programming standard or extensions documented as supported by GCC
> +and the algorithm expressed in the source code does not have the
> +vulnerability.  The output application code could be considered
> +vulnerable if it produces an actual vulnerability in the target
> +application, specifically in the following cases:

It seems ambiguous if the list that follows is meant to be an exhaustive
enumeration. I think it is meant to give examples without covering all
possibilities; if that's the case, I would suggest

s/specifically in the following cases/for example/

If I misunderstood and the list is really meant to be exhaustive,
it would be nice to make that clear and perhaps refer the reader
to the second paragraph when their scenario does not fit.

> +
> +- The application dereferences an invalid memory location despite
> +  the application sources being valid.
> +- The application reads from or writes to a valid but incorrect
> +  memory location, resulting in an information integrity issue or an
> +  information leak.
> +- The application ends up running in an infinite loop or with
> +  severe degradation in performance despite the input sources having
> +  no such issue, resulting in a Denial of Service.  Note that
> +  correct but non-performant code is not a security issue candidate,
> +  this only applies to incorrect code that may result in performance
> +  degradation severe enough to amount to a denial of service.
> +- The application crashes due to the generated incorrect code,
> +  resulting in a Denial of Service.
> +
> +Language

Re: [RFC gcc13 backport 0/3] Add Ztso atomic mappings

2023-10-04 Thread Jeff Law




On 10/3/23 16:26, Patrick O'Neill wrote:

I vaugely recall some discussion about backporting the Ztso mappings
along with the RVWMO mappings. Now that the RVWMO mappings have been
backported for 13.3, is there interest in also backporting the Ztso
mappings?

Tested using for regressions using rv32gc/rv64gc glibc.

Jeff Law (1):
   [RISCV][committed] Remove spurious newline in ztso sequence

Patrick O'Neill (2):
   RISC-V: Add Ztso atomic mappings
   RISC-V: Specify -mabi for ztso testcases
I recall discussing Ztso mappings, but not the final conclusion.  I 
think the final decision comes down to the motivation behind the changes.


If they're primarily optimized sequences utilizing the stronger ordering 
guarantees from Ztso, then it's probably not a good candidate for 
gcc-13.  If the primary motivation is to make it easier to port code 
from targets with stronger memory models (ie x86), then the Ztso work is 
a reasonable candidate for backporting.


Jeff


Re: [committed] libstdc++: Define std::numeric_limits<_FloatNN> before C++23

2023-10-04 Thread Stephan Bergmann

On 8/17/23 22:32, Jonathan Wakely via Libstdc++ wrote:

Tested x86_64-linux. Pushed to trunk.

-- >8 --

The extended floating-point types such as _Float32 are supported by GCC
prior to C++23, you just can't use the standard-conforming names from
 to refer to them. This change defines the specializations of
std::numeric_limits for those types for older dialects, not only for
C++23.

libstdc++-v3/ChangeLog:

* include/bits/c++config (__gnu_cxx::__bfloat16_t): Define
whenever __BFLT16_DIG__ is defined, not only for C++23.
* include/std/limits (numeric_limits): Likewise.
(numeric_limits<_Float16>, numeric_limits<_Float32>)
(numeric_limits<_Float64>): Likewise for other extended
floating-point types.
---
  libstdc++-v3/include/bits/c++config |   4 +-
  libstdc++-v3/include/std/limits | 194 +++-
  2 files changed, 103 insertions(+), 95 deletions(-)


[...]

diff --git a/libstdc++-v3/include/std/limits b/libstdc++-v3/include/std/limits
index 52b19ef8264..7a59e7520eb 100644
--- a/libstdc++-v3/include/std/limits
+++ b/libstdc++-v3/include/std/limits
@@ -1890,189 +1890,197 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION

[...]

  __glibcxx_float_n(64)
  #endif
-#ifdef __STDCPP_FLOAT128_T__
+#ifdef __FLT128_DIG__
  __glibcxx_float_n(128)
  #endif
  #undef __glibcxx_float_n

[...]

The above change (from __STDCPP_FLOAT128_T__ to __FLT128_DIG__) now 
started to cause issues with Clang on Clang 18 trunk:


* Clang does not support a _Float128 type.

* Clang does not predefine __STDCPP_FLOAT128_T__.

* But since 
 
"[clang] Predefined macros for float128 support (#67196)", Clang 18 
trunk does predefine __FLT128_DIG__ now.  Which causes



$ cat test.cc
#include 



$ clang++ -fsyntax-only test.cc
In file included from test.cc:1:
/home/sbergman/gcc/trunk/inst/lib/gcc/x86_64-pc-linux-gnu/14.0.0/../../../../include/c++/14.0.0/limits:1995:1:
 error: use of undeclared identifier '_Float128'
 1995 | __glibcxx_float_n(128)
  | ^
/home/sbergman/gcc/trunk/inst/lib/gcc/x86_64-pc-linux-gnu/14.0.0/../../../../include/c++/14.0.0/limits:1903:27:
 note: expanded from macro '__glibcxx_float_n'
 1903 | struct numeric_limits<_Float##BITSIZE>  
\
  |   ^
:36:1: note: expanded from here
   36 | _Float128
  | ^
1 error generated.


(I don't know whether or not it is useful for Clang to predefine 
__FLT128_DIG__ when not providing a _Float128 type.  I assume 
 "ISO/IEC TS 18661-3:2015", as 
referenced by the C++ standard, might be relevant here, but don't know 
that document.  I added Pranav, the author of the relevant Clang commit, 
in cc here.)




Re: [committed] libstdc++: Define std::numeric_limits<_FloatNN> before C++23

2023-10-04 Thread Jonathan Wakely
On Wed, 4 Oct 2023 at 16:54, Stephan Bergmann  wrote:
>
> On 8/17/23 22:32, Jonathan Wakely via Libstdc++ wrote:
> > Tested x86_64-linux. Pushed to trunk.
> >
> > -- >8 --
> >
> > The extended floating-point types such as _Float32 are supported by GCC
> > prior to C++23, you just can't use the standard-conforming names from
> >  to refer to them. This change defines the specializations of
> > std::numeric_limits for those types for older dialects, not only for
> > C++23.
> >
> > libstdc++-v3/ChangeLog:
> >
> >   * include/bits/c++config (__gnu_cxx::__bfloat16_t): Define
> >   whenever __BFLT16_DIG__ is defined, not only for C++23.
> >   * include/std/limits (numeric_limits): Likewise.
> >   (numeric_limits<_Float16>, numeric_limits<_Float32>)
> >   (numeric_limits<_Float64>): Likewise for other extended
> >   floating-point types.
> > ---
> >   libstdc++-v3/include/bits/c++config |   4 +-
> >   libstdc++-v3/include/std/limits | 194 +++-
> >   2 files changed, 103 insertions(+), 95 deletions(-)
> >
> [...]
> > diff --git a/libstdc++-v3/include/std/limits 
> > b/libstdc++-v3/include/std/limits
> > index 52b19ef8264..7a59e7520eb 100644
> > --- a/libstdc++-v3/include/std/limits
> > +++ b/libstdc++-v3/include/std/limits
> > @@ -1890,189 +1890,197 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
> [...]
> >   __glibcxx_float_n(64)
> >   #endif
> > -#ifdef __STDCPP_FLOAT128_T__
> > +#ifdef __FLT128_DIG__
> >   __glibcxx_float_n(128)
> >   #endif
> >   #undef __glibcxx_float_n
> [...]
>
> The above change (from __STDCPP_FLOAT128_T__ to __FLT128_DIG__) now
> started to cause issues with Clang on Clang 18 trunk:
>
> * Clang does not support a _Float128 type.
>
> * Clang does not predefine __STDCPP_FLOAT128_T__.
>
> * But since
> 
> "[clang] Predefined macros for float128 support (#67196)", Clang 18
> trunk does predefine __FLT128_DIG__ now.  Which causes
>
> > $ cat test.cc
> > #include 
>
> > $ clang++ -fsyntax-only test.cc
> > In file included from test.cc:1:
> > /home/sbergman/gcc/trunk/inst/lib/gcc/x86_64-pc-linux-gnu/14.0.0/../../../../include/c++/14.0.0/limits:1995:1:
> >  error: use of undeclared identifier '_Float128'
> >  1995 | __glibcxx_float_n(128)
> >   | ^
> > /home/sbergman/gcc/trunk/inst/lib/gcc/x86_64-pc-linux-gnu/14.0.0/../../../../include/c++/14.0.0/limits:1903:27:
> >  note: expanded from macro '__glibcxx_float_n'
> >  1903 | struct numeric_limits<_Float##BITSIZE>  
> > \
> >   |   ^
> > :36:1: note: expanded from here
> >36 | _Float128
> >   | ^
> > 1 error generated.
>
> (I don't know whether or not it is useful for Clang to predefine
> __FLT128_DIG__ when not providing a _Float128 type.  I assume
>  "ISO/IEC TS 18661-3:2015", as
> referenced by the C++ standard, might be relevant here, but don't know
> that document.  I added Pranav, the author of the relevant Clang commit,
> in cc here.)


It's completely wrong or Clang to define a macro describing properties
of a non-existent type.

This was reported as
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111687 and closed as
INVALID, it needs to be fixed in Clang.


>



Re: [PATCH 2/2] testsuite: Replace many dg-require-thread-fence with dg-require-atomic-exchange

2023-10-04 Thread Jonathan Wakely
On Wed, 4 Oct 2023 at 16:15, Hans-Peter Nilsson  wrote:
>
> > From: Jonathan Wakely 
> > Date: Wed, 4 Oct 2023 09:29:43 +0100
>
> > The new dg-require proc checks for __atomic_exchange, which is not the
> > same as compare-exchange, and not the same as test-and-set on
> > atomic_flag. Does it just happen to be true for arm that the presence
> > of __atomic_exchange also implies the other atomic operations?
>
> I missed pointing out that if the target implements
> something that emits actual insns for __atomic_exchange
> (and/or __atomic_compare_exchange), it has also implemented
> test-and-set.  Cf. optabs.cc:expand_atomic_test_and_set (at
> r14-4395-g027a94cf32be0b).
>
> Similarly a insn-level __atomic_compare_exchange
> implementation (atomic_compare_and_swapM) also does it for
> __atomic_exchange.

Ah great, that makes testing __atomic_exchange good enough then.

>
> > The new proc also only tests it for int, which presumably means none
> > of these tests rely on atomics for long long being present. Some of
> > the tests use atomics on pointers, which should work for ILP32 targets
> > if atomics work on int, but the new dg-require-atomic-exchange isn't
> > sufficient to check that it will work for pointers on LP64 targets.
>
> Right, I'll amend to test a uintptr_t...
>
> > Maybe it happens to work today for the targets where we have issues,
> > but we seem to be assuming that there will be no LP64 targets where
> > these atomics are absent for 64-bit pointers. Are there supported
> > risc-v ISA subsets where that isn't true?
>
> ...but generally, I'd like to leave future amendments like
> that to the Next Guy, just like the Previous Guy left
> dg-require-thread-fence for me (us) to split up.  But,
> perhaps we can prepare for such amendments by giving it a
> more specific name: suggesting
> dg-require-atomic-cmpxchg-word (bikeshedding opportunity),
> testing __atomic_compare_exchange.
>
> IOW, I don't think we should make a distinction, looking for
> other operations and sizes at this time.

Yep, that's fine. Worse is better. Maybe just add a comment before the
definition of the new effective target check noting that we only test
for one atomic op, but the implementation in gcc means that's good
enough.

>
> > And we're assuming that
> > __atomic_exchange being present implies all other ops are present,
> > which seems like a bad assumption to me. I would be more confident
> > testing for __atomic_compare_exchange, because with a wait-free CAS
> > you can implement any other atomic operation, but the same isn't true
> > for __atomic_exchange.
>
> Yes, I switched them around.
>
> New version coming up.

Thanks!


>
> brgds, H-P
>



Re: [PATCH] c++: merge tsubst_copy into tsubst_copy_and_build

2023-10-04 Thread Patrick Palka
On Tue, 3 Oct 2023, Jason Merrill wrote:

> On 10/3/23 08:41, Patrick Palka wrote:
> > On Mon, 2 Oct 2023, Patrick Palka wrote:
> > 
> > > Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look
> > > OK for trunk?
> > > 
> > > -- >8 --
> > > 
> > > The relationship between tsubst_copy_and_build and tsubst_copy (two of
> > > the main template argument substitution routines for expression trees)
> > > is rather hazy.  The former is mostly a superset of the latter, with
> > > some differences.
> > > 
> > > The main difference is that they handle many tree codes differently, but
> > > much of the tree code handling in tsubst_copy appears to be dead code[1].
> > > This is because tsubst_copy only gets directly called in a few places
> > > and mostly on id-expressions.  The interesting exceptions are PARM_DECL,
> > > VAR_DECL, BIT_NOT_EXPR, SCOPE_REF, TEMPLATE_ID_EXPR and IDENTIFIER_NODE:
> > > 
> > >   * for PARM_DECL and VAR_DECL, tsubst_copy_and_build calls tsubst_copy
> > > followed by doing some extra handling of its own
> > >   * for BIT_NOT_EXPR tsubst_copy implicitly handles unresolved destructor
> > > calls (i.e. the first operand is an identifier or a type)
> > >   * for SCOPE_REF, TEMPLATE_ID_EXPR and IDENTIFIER_NODE tsubst_copy
> > > refrains from doing name lookup of the terminal name
> > > 
> > > Other more minor differences are that tsubst_copy exits early when
> > > 'args' is null, and it calls maybe_dependent_member_ref
> 
> That's curious, since what that function does seems like name lookup; I
> wouldn't think we would want to call it when tf_no_name_lookup.

Ah, that makes sense I think.

> 
> > > and finally it dispatches to tsubst for type trees.
> 
> And it looks like you fix the callers to avoid that?

Yes, I'll make a note of that in the commit message.

> 
> > > Thus tsubst_copy is (at this point) similar enough to
> > > tsubst_copy_and_build
> > > that it makes sense to merge the two functions, with the main difference
> > > being the name lookup behavior[2].  So this patch merges tsubst_copy into
> > > tsubst_copy_and_build via a new tsubst tf_no_name_lookup which controls
> > > name lookup and resolution of a (top-level) id-expression.
> > > 
> > > [1]: http://thrifty.mooo.com:8008/gcc-lcov/gcc/cp/pt.cc.gcov.html#17231
> > > [2]: I don't know the history of tsubst_copy but I would guess it was
> > > added before we settled on using processing_template_decl to control
> > > whether our AST building routines perform semantic checking and return
> > > non-templated trees, and so we needed a separate tsubst routine that
> > > avoids semantic checking and always returns a templated tree for e.g.
> > > partial substitution.
> > 
> > Oops, this is wrong -- tsubst_copy_and_build came after tsubst_copy,
> > and was introduced as an optimization with the intent of getting rid
> > of tsubst_copy eventually:
> > https://gcc.gnu.org/pipermail/gcc-patches/2003-January/093659.html
> 
> I wonder if we want to add a small tsubst_name wrapper to call
> tsubst_copy_and_build with tf_no_name_lookup?

Good idea, that'll complement tsubst_scope nicely.

> 
> Can we also merge in tsubst_expr and use that name instead of the unwieldy
> tsubst_copy_and_build?

That'd be nice.  Another idea would be to rename tsubst_expr to
tsubst_stmt and make it disjoint from tsubst_copy_and_build, and then
rename tsubst_copy_and_build to tsubst_expr, to draw a distinction
between statement-like trees (the substitution of which typically has
side effects like calling add_stmt) and expression-like trees (which
don't usually have such side effects).  I can work on that as a
follow-up patch.

Here's v2 which guards the call to maybe_dependent_member_ref and adds
tsubst_name, bootstrapped and regtested on x86_64-pc-linux-gnu:

-- >8 --

Subject: [PATCH] c++: merge tsubst_copy into tsubst_copy_and_build

gcc/cp/ChangeLog:

* cp-tree.h (enum tsubst_flags): Add tf_no_name_lookup.
* pt.cc (tsubst_copy):
(tsubst_pack_expansion): Use tsubst for substituting BASES_TYPE.
(tsubst_decl) : Use tsubst_name instead of
tsubst_copy.
(tsubst) : Use tsubst_copy_and_build
instead of tsubst_copy for substituting
CLASS_PLACEHOLDER_TEMPLATE.
: Use tsubst_name instead of tsubst_copy for
substituting TYPENAME_TYPE_FULLNAME.
(tsubst_name): Define.
(tsubst_qualified_id): Use tsubst_name instead of tsubst_copy
for substituting the component name of a SCOPE_REF.
(tsubst_copy): Remove.
(tsubst_copy_and_build): Clear tf_no_name_lookup at the start,
and remember if it was set.  Call maybe_dependent_member_ref if
tf_no_name_lookup was not set.
: Don't do name lookup if tf_no_name_lookup
was set.
: If tf_no_name_lookup was set, use
tsubst_name instead of tsubst_copy_and_build to substitute the
template and don't finish the template-id.
: Handle identifier 

Re: [committed] libstdc++: Define std::numeric_limits<_FloatNN> before C++23

2023-10-04 Thread Pranav Kant
Thanks for bringing this to my attention. I am working on a fix. Will keep
this thread posted.

Clang *does* define this macro only when float128 type is available. But
the problem seems to be that clang doesn't define _Float128 alias type
which is what's being used here. It only defines __float128 type. Should be
easy to fix.

On Wed, Oct 4, 2023 at 8:56 AM Jonathan Wakely  wrote:

> On Wed, 4 Oct 2023 at 16:54, Stephan Bergmann  wrote:
> >
> > On 8/17/23 22:32, Jonathan Wakely via Libstdc++ wrote:
> > > Tested x86_64-linux. Pushed to trunk.
> > >
> > > -- >8 --
> > >
> > > The extended floating-point types such as _Float32 are supported by GCC
> > > prior to C++23, you just can't use the standard-conforming names from
> > >  to refer to them. This change defines the specializations of
> > > std::numeric_limits for those types for older dialects, not only for
> > > C++23.
> > >
> > > libstdc++-v3/ChangeLog:
> > >
> > >   * include/bits/c++config (__gnu_cxx::__bfloat16_t): Define
> > >   whenever __BFLT16_DIG__ is defined, not only for C++23.
> > >   * include/std/limits (numeric_limits): Likewise.
> > >   (numeric_limits<_Float16>, numeric_limits<_Float32>)
> > >   (numeric_limits<_Float64>): Likewise for other extended
> > >   floating-point types.
> > > ---
> > >   libstdc++-v3/include/bits/c++config |   4 +-
> > >   libstdc++-v3/include/std/limits | 194
> +++-
> > >   2 files changed, 103 insertions(+), 95 deletions(-)
> > >
> > [...]
> > > diff --git a/libstdc++-v3/include/std/limits
> b/libstdc++-v3/include/std/limits
> > > index 52b19ef8264..7a59e7520eb 100644
> > > --- a/libstdc++-v3/include/std/limits
> > > +++ b/libstdc++-v3/include/std/limits
> > > @@ -1890,189 +1890,197 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
> > [...]
> > >   __glibcxx_float_n(64)
> > >   #endif
> > > -#ifdef __STDCPP_FLOAT128_T__
> > > +#ifdef __FLT128_DIG__
> > >   __glibcxx_float_n(128)
> > >   #endif
> > >   #undef __glibcxx_float_n
> > [...]
> >
> > The above change (from __STDCPP_FLOAT128_T__ to __FLT128_DIG__) now
> > started to cause issues with Clang on Clang 18 trunk:
> >
> > * Clang does not support a _Float128 type.
> >
> > * Clang does not predefine __STDCPP_FLOAT128_T__.
> >
> > * But since
> > <
> https://github.com/llvm/llvm-project/commit/457f582ffe23e951380bc345c4c96ec053c09681
> >
> > "[clang] Predefined macros for float128 support (#67196)", Clang 18
> > trunk does predefine __FLT128_DIG__ now.  Which causes
> >
> > > $ cat test.cc
> > > #include 
> >
> > > $ clang++ -fsyntax-only test.cc
> > > In file included from test.cc:1:
> > >
> /home/sbergman/gcc/trunk/inst/lib/gcc/x86_64-pc-linux-gnu/14.0.0/../../../../include/c++/14.0.0/limits:1995:1:
> error: use of undeclared identifier '_Float128'
> > >  1995 | __glibcxx_float_n(128)
> > >   | ^
> > >
> /home/sbergman/gcc/trunk/inst/lib/gcc/x86_64-pc-linux-gnu/14.0.0/../../../../include/c++/14.0.0/limits:1903:27:
> note: expanded from macro '__glibcxx_float_n'
> > >  1903 | struct numeric_limits<_Float##BITSIZE>
>   \
> > >   |   ^
> > > :36:1: note: expanded from here
> > >36 | _Float128
> > >   | ^
> > > 1 error generated.
> >
> > (I don't know whether or not it is useful for Clang to predefine
> > __FLT128_DIG__ when not providing a _Float128 type.  I assume
> >  "ISO/IEC TS 18661-3:2015", as
> > referenced by the C++ standard, might be relevant here, but don't know
> > that document.  I added Pranav, the author of the relevant Clang commit,
> > in cc here.)
>
>
> It's completely wrong or Clang to define a macro describing properties
> of a non-existent type.
>
> This was reported as
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111687 and closed as
> INVALID, it needs to be fixed in Clang.
>
>
> >
>
>


Re: [committed] libstdc++: Define std::numeric_limits<_FloatNN> before C++23

2023-10-04 Thread Jakub Jelinek
On Wed, Oct 04, 2023 at 09:47:34AM -0700, Pranav Kant wrote:
> Thanks for bringing this to my attention. I am working on a fix. Will keep
> this thread posted.
> 
> Clang *does* define this macro only when float128 type is available. But
> the problem seems to be that clang doesn't define _Float128 alias type
> which is what's being used here. It only defines __float128 type. Should be
> easy to fix.

_Float128 (and other _FloatNN types) are standard C23 and C++23 types,
they certainly can't be simple aliases to __float128 or other similar types.
In C++ they mangle differently, have significantly different behavior in
lots of different language details.
So, you need to implement the https://wg21.link/P1467R9 paper, rather than
doing quick hacks.

Jakub



[RFC 0/2] black, isort, and flake8 configuration

2023-10-04 Thread Tom Tromey
This short series adds configuration files for black ("opinionated"
code formatter), isort (import sorter) and flake8 (Python lint) to
libstdc++.

I marked it as RFC since sometimes people don't like black's output.
In gdb we use it -- at first I found some of its decisions a little
odd, but overall it's nice not to have to review for or worry about
the minitia of code formatting.

Tom




[RFC 2/2] libstdc++: Add flake8 configuration

2023-10-04 Thread Tom Tromey
flake8 is a Python linter.  This patch adds a .flake8 configuration
file (flake8 does not use pyproject.toml for some reason) and fixes a
few trivial flake8 warnings.

After this patch, the only remaining flake8 warnings are about unused
imports (there are two - but they are not completely trivial to
remove) and the use of bare "except:".

It is possible to change the flake8 configuration to suppress these
warnings, but I haven't done so here.

libstdc++-v3/ChangeLog:

* python/.flake8: New file.
* python/libstdcxx/v6/__init__.py: Remove blank line.
* python/libstdcxx/v6/printers.py: Reformat two comments.
---
 libstdc++-v3/python/.flake8  | 3 +++
 libstdc++-v3/python/libstdcxx/__init__.py| 1 -
 libstdc++-v3/python/libstdcxx/v6/printers.py | 8 
 3 files changed, 7 insertions(+), 5 deletions(-)
 create mode 100644 libstdc++-v3/python/.flake8

diff --git a/libstdc++-v3/python/.flake8 b/libstdc++-v3/python/.flake8
new file mode 100644
index 000..7ffe4d6e0f7
--- /dev/null
+++ b/libstdc++-v3/python/.flake8
@@ -0,0 +1,3 @@
+[flake8]
+max-line-length = 79
+extend-ignore = E203
diff --git a/libstdc++-v3/python/libstdcxx/__init__.py 
b/libstdc++-v3/python/libstdcxx/__init__.py
index 8b137891791..e69de29bb2d 100644
--- a/libstdc++-v3/python/libstdcxx/__init__.py
+++ b/libstdc++-v3/python/libstdcxx/__init__.py
@@ -1 +0,0 @@
-
diff --git a/libstdc++-v3/python/libstdcxx/v6/printers.py 
b/libstdc++-v3/python/libstdcxx/v6/printers.py
index e26b8b36013..202fa450a91 100644
--- a/libstdc++-v3/python/libstdcxx/v6/printers.py
+++ b/libstdc++-v3/python/libstdcxx/v6/printers.py
@@ -2672,8 +2672,8 @@ class FilteringTypePrinter(object):
 name (str): The typedef-name that will be used instead.
 targ1 (str, optional): The first template argument. Defaults to None.
 
-Checks if a specialization of the class template 'template' is the same 
type
-as the typedef 'name', and prints it as 'name' instead.
+Checks if a specialization of the class template 'template' is the same
+type as the typedef 'name', and prints it as 'name' instead.
 
 e.g. if an instantiation of std::basic_istream is the same type as
 std::istream then print it as std::istream.
@@ -3167,8 +3167,8 @@ def build_libstdcxx_dictionary():
 libstdcxx_printer.add_version(
 'std::chrono::', 'tzdb', StdChronoTzdbPrinter
 )
-# libstdcxx_printer.add_version('std::chrono::(anonymous namespace)', 
'Rule',
-#  StdChronoTimeZoneRulePrinter)
+# libstdcxx_printer.add_version('std::chrono::(anonymous namespace)',
+#   'Rule', StdChronoTimeZoneRulePrinter)
 
 # Extensions.
 libstdcxx_printer.add_version('__gnu_cxx::', 'slist', StdSlistPrinter)
-- 
2.40.1



[RFC 1/2] libstdc++: Use 'black' and 'isort' in pretty printers

2023-10-04 Thread Tom Tromey
This changes libstdc++ to use the 'black' Python formatter.  This
formatter is somewhat standard and fairly comprehensive.  FWIW we use
this in gdb, mainly because it means we don't have to review Python
code for formatting style.

This patch also runs 'isort', which handles sorting the imports.

A new pyproject.tom file is added, so after this you can:

cd .../libstdc++/python
black .
isort .

... and have any changes reformatted.

libstdc++-v3/ChangeLog:
* pyproject.toml: New file.
* python/libstdcxx/v6/printers.py: Reformat.
* python/libstdcxx/v6/xmethods.py: Reformat.
* python/libstdcxx/v6/__init__.py: Reformat.
---
 libstdc++-v3/python/libstdcxx/v6/__init__.py |   5 +
 libstdc++-v3/python/libstdcxx/v6/printers.py | 834 ---
 libstdc++-v3/python/libstdcxx/v6/xmethods.py | 138 +--
 libstdc++-v3/python/pyproject.toml   |   6 +
 4 files changed, 648 insertions(+), 335 deletions(-)
 create mode 100644 libstdc++-v3/python/pyproject.toml

diff --git a/libstdc++-v3/python/libstdcxx/v6/__init__.py 
b/libstdc++-v3/python/libstdcxx/v6/__init__.py
index 8b2cbc60a1b..cb0abf38e6b 100644
--- a/libstdc++-v3/python/libstdcxx/v6/__init__.py
+++ b/libstdc++-v3/python/libstdcxx/v6/__init__.py
@@ -13,19 +13,24 @@
 # You should have received a copy of the GNU General Public License
 # along with this program.  If not, see .
 
+
 # Load the xmethods if GDB supports them.
 def gdb_has_xmethods():
 try:
 import gdb.xmethod
+
 return True
 except ImportError:
 return False
 
+
 def register_libstdcxx_printers(obj):
 # Load the pretty-printers.
 from .printers import register_libstdcxx_printers
+
 register_libstdcxx_printers(obj)
 
 if gdb_has_xmethods():
 from .xmethods import register_libstdcxx_xmethods
+
 register_libstdcxx_xmethods(obj)
diff --git a/libstdc++-v3/python/libstdcxx/v6/printers.py 
b/libstdc++-v3/python/libstdcxx/v6/printers.py
index 7e16a49aeb0..e26b8b36013 100644
--- a/libstdc++-v3/python/libstdcxx/v6/printers.py
+++ b/libstdc++-v3/python/libstdcxx/v6/printers.py
@@ -15,12 +15,13 @@
 # You should have received a copy of the GNU General Public License
 # along with this program.  If not, see .
 
-import gdb
+import datetime
+import errno
 import itertools
 import re
 import sys
-import errno
-import datetime
+
+import gdb
 
 # Python 2 + Python 3 compatibility code
 
@@ -78,6 +79,7 @@ else:
 
 def dst(self, dt):
 return datetime.timedelta(0)
+
 _utc_timezone = UTC()
 
 # Try to use the new-style pretty-printing if available.
@@ -91,6 +93,7 @@ except ImportError:
 _use_type_printing = False
 try:
 import gdb.types
+
 if hasattr(gdb.types, 'TypePrinter'):
 _use_type_printing = True
 except ImportError:
@@ -148,6 +151,7 @@ def lookup_templ_spec(templ, *args):
 pass
 raise e
 
+
 # Use this to find container node types instead of find_type,
 # see https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91997 for details.
 def lookup_node_type(nodename, containertype):
@@ -177,8 +181,9 @@ def lookup_node_type(nodename, containertype):
 except gdb.error:
 # For debug mode containers the node is in std::__cxx1998.
 if is_member_of_namespace(nodename, 'std'):
-if is_member_of_namespace(containertype, 'std::__cxx1998',
-  'std::__debug', '__gnu_debug'):
+if is_member_of_namespace(
+containertype, 'std::__cxx1998', 'std::__debug', '__gnu_debug'
+):
 nodename = nodename.replace('::', '::__cxx1998::', 1)
 try:
 return lookup_templ_spec(nodename, valtype)
@@ -296,7 +301,9 @@ class SharedPointerPrinter(printer_base):
 state = 'expired, weak count %d' % weakcount
 else:
 state = 'use count %d, weak count %d' % (
-usecount, weakcount - 1)
+usecount,
+weakcount - 1,
+)
 return '%s<%s> (%s)' % (self._typename, targ, state)
 
 
@@ -305,13 +312,15 @@ def _tuple_impl_get(val):
 bases = val.type.fields()
 if not bases[-1].is_base_class:
 raise ValueError(
-"Unsupported implementation for std::tuple: %s" % str(val.type))
+"Unsupported implementation for std::tuple: %s" % str(val.type)
+)
 # Get the _Head_base base class:
 head_base = val.cast(bases[-1].type)
 fields = head_base.type.fields()
 if len(fields) == 0:
 raise ValueError(
-"Unsupported implementation for std::tuple: %s" % str(val.type))
+"Unsupported implementation for std::tuple: %s" % str(val.type)
+)
 if fields[0].name == '_M_head_impl':
 # The tuple element is the _Head_base::_M_head_impl data member.
 return head_base['_M_he

Re: [committed] libstdc++: Define std::numeric_limits<_FloatNN> before C++23

2023-10-04 Thread Pranav Kant
I will revert the commit while I work on this. Thanks for the pointers.

On Wed, Oct 4, 2023 at 9:57 AM Jakub Jelinek  wrote:

> On Wed, Oct 04, 2023 at 09:47:34AM -0700, Pranav Kant wrote:
> > Thanks for bringing this to my attention. I am working on a fix. Will
> keep
> > this thread posted.
> >
> > Clang *does* define this macro only when float128 type is available. But
> > the problem seems to be that clang doesn't define _Float128 alias type
> > which is what's being used here. It only defines __float128 type. Should
> be
> > easy to fix.
>
> _Float128 (and other _FloatNN types) are standard C23 and C++23 types,
> they certainly can't be simple aliases to __float128 or other similar
> types.
> In C++ they mangle differently, have significantly different behavior in
> lots of different language details.
> So, you need to implement the https://wg21.link/P1467R9 paper, rather than
> doing quick hacks.
>
> Jakub
>
>


[PATCH v2 1/2] testsuite: Add dg-require-atomic-cmpxchg-word

2023-10-04 Thread Hans-Peter Nilsson
> From: Hans-Peter Nilsson 
> Date: Wed, 4 Oct 2023 17:15:28 +0200

> New version coming up.

Using pointer-sized int instead of int,
__atomic_compare_exchange instead of __atomic_exchange,
renamed to atomic-cmpxchg-word from atomic-exchange, and
updating a comment that already seemed reasonably well
placed.

Tested as with v1 1/2.

Ok to commit?

-- >8 --
Some targets (armv6-m) support inline atomic load and store,
i.e. dg-require-thread-fence matches, but not atomic operations like
compare and exchange.

This directive can be used to replace uses of dg-require-thread-fence
where an atomic operation is actually used.

* testsuite/lib/dg-options.exp (dg-require-atomic-cmpxchg-word):
New proc.
* testsuite/lib/libstdc++.exp (check_v3_target_atomic_cmpxchg_word):
Ditto.
---
 libstdc++-v3/testsuite/lib/dg-options.exp |  9 ++
 libstdc++-v3/testsuite/lib/libstdc++.exp  | 37 +++
 2 files changed, 46 insertions(+)

diff --git a/libstdc++-v3/testsuite/lib/dg-options.exp 
b/libstdc++-v3/testsuite/lib/dg-options.exp
index 84ad0c65330b..850442b6b7c1 100644
--- a/libstdc++-v3/testsuite/lib/dg-options.exp
+++ b/libstdc++-v3/testsuite/lib/dg-options.exp
@@ -133,6 +133,15 @@ proc dg-require-thread-fence { args } {
 return
 }
 
+proc dg-require-atomic-cmpxchg-word { args } {
+if { ![ check_v3_target_atomic_cmpxchg_word ] } {
+   upvar dg-do-what dg-do-what
+   set dg-do-what [list [lindex ${dg-do-what} 0] "N" "P"]
+   return
+}
+return
+}
+
 proc dg-require-atomic-builtins { args } {
 if { ![ check_v3_target_atomic_builtins ] } {
upvar dg-do-what dg-do-what
diff --git a/libstdc++-v3/testsuite/lib/libstdc++.exp 
b/libstdc++-v3/testsuite/lib/libstdc++.exp
index 608056e5068e..4bedb36dc6f9 100644
--- a/libstdc++-v3/testsuite/lib/libstdc++.exp
+++ b/libstdc++-v3/testsuite/lib/libstdc++.exp
@@ -1221,6 +1221,43 @@ proc check_v3_target_thread_fence { } {
 }]
 }
 
+proc check_v3_target_atomic_cmpxchg_word { } {
+return [check_v3_target_prop_cached et_atomic_cmpxchg_word {
+   global cxxflags
+   global DEFAULT_CXXFLAGS
+
+   # Set up and link a C++11 test program that depends on
+   # atomic-compare-exchange being available for a pointer-sized
+   # integer.  It should be sufficient as gcc can derive all
+   # other operations when a target implements this operation.
+   set src atomic_cmpxchg_word[pid].cc
+
+   set f [open $src "w"]
+   puts $f "
+   __UINTPTR_TYPE__ i, j, k;
+   int main() {
+   __atomic_compare_exchange (&i, &j, &k, 1, __ATOMIC_SEQ_CST, 
__ATOMIC_SEQ_CST);
+   return 0;
+   }"
+   close $f
+
+   set cxxflags_saved $cxxflags
+   set cxxflags "$cxxflags $DEFAULT_CXXFLAGS -Werror -std=gnu++11"
+
+   set lines [v3_target_compile $src /dev/null executable ""]
+   set cxxflags $cxxflags_saved
+   file delete $src
+
+   if [string match "" $lines] {
+   # No error message, linking succeeded.
+   return 1
+   } else {
+   verbose "check_v3_target_atomic_cmpxchg_word: compilation failed" 2
+   return 0
+   }
+}]
+}
+
 # Return 1 if atomics_bool and atomic_int are always lock-free, 0 otherwise.
 proc check_v3_target_atomic_builtins { } {
 return [check_v3_target_prop_cached et_atomic_builtins {
-- 
2.30.2



[PATCH v2 2/2] testsuite: Replace many dg-require-thread-fence with dg-require-atomic-cmpxchg-word

2023-10-04 Thread Hans-Peter Nilsson
s/atomic-exchange/atomic-cmpxchg-word/g.
Tested as v1.

Ok to commit?
-- >8 --
These tests actually use a form of atomic compare and exchange
operation, not just atomic loading and storing.  Some targets (not
supported by e.g. libatomic) have atomic loading and storing, but not
compare and exchange, yielding linker errors for missing library
functions.

This change is just for existing uses of
dg-require-thread-fence.  It does not fix any other tests
that should also be gated on dg-require-atomic-cmpxchg-word.

* testsuite/29_atomics/atomic/compare_exchange_padding.cc,
testsuite/29_atomics/atomic_flag/clear/1.cc,
testsuite/29_atomics/atomic_flag/cons/value_init.cc,
testsuite/29_atomics/atomic_flag/test_and_set/explicit.cc,
testsuite/29_atomics/atomic_flag/test_and_set/implicit.cc,
testsuite/29_atomics/atomic_ref/compare_exchange_padding.cc,
testsuite/29_atomics/atomic_ref/generic.cc,
testsuite/29_atomics/atomic_ref/integral.cc,
testsuite/29_atomics/atomic_ref/pointer.cc: Replace
dg-require-thread-fence with dg-require-atomic-cmpxchg-word.
---
 .../testsuite/29_atomics/atomic/compare_exchange_padding.cc | 2 +-
 libstdc++-v3/testsuite/29_atomics/atomic_flag/clear/1.cc| 2 +-
 .../testsuite/29_atomics/atomic_flag/cons/value_init.cc | 2 +-
 .../testsuite/29_atomics/atomic_flag/test_and_set/explicit.cc   | 2 +-
 .../testsuite/29_atomics/atomic_flag/test_and_set/implicit.cc   | 2 +-
 .../testsuite/29_atomics/atomic_ref/compare_exchange_padding.cc | 2 +-
 libstdc++-v3/testsuite/29_atomics/atomic_ref/generic.cc | 2 +-
 libstdc++-v3/testsuite/29_atomics/atomic_ref/integral.cc| 2 +-
 libstdc++-v3/testsuite/29_atomics/atomic_ref/pointer.cc | 2 +-
 9 files changed, 9 insertions(+), 9 deletions(-)

diff --git 
a/libstdc++-v3/testsuite/29_atomics/atomic/compare_exchange_padding.cc 
b/libstdc++-v3/testsuite/29_atomics/atomic/compare_exchange_padding.cc
index 01f7475631e6..859629e625f8 100644
--- a/libstdc++-v3/testsuite/29_atomics/atomic/compare_exchange_padding.cc
+++ b/libstdc++-v3/testsuite/29_atomics/atomic/compare_exchange_padding.cc
@@ -1,5 +1,5 @@
 // { dg-do run { target c++20 } }
-// { dg-require-thread-fence "" }
+// { dg-require-atomic-cmpxchg-word "" }
 // { dg-add-options libatomic }
 
 #include 
diff --git a/libstdc++-v3/testsuite/29_atomics/atomic_flag/clear/1.cc 
b/libstdc++-v3/testsuite/29_atomics/atomic_flag/clear/1.cc
index 89ed381fe057..2e154178dbd7 100644
--- a/libstdc++-v3/testsuite/29_atomics/atomic_flag/clear/1.cc
+++ b/libstdc++-v3/testsuite/29_atomics/atomic_flag/clear/1.cc
@@ -1,5 +1,5 @@
 // { dg-do run { target c++11 } }
-// { dg-require-thread-fence "" }
+// { dg-require-atomic-cmpxchg-word "" }
 
 // Copyright (C) 2009-2023 Free Software Foundation, Inc.
 //
diff --git a/libstdc++-v3/testsuite/29_atomics/atomic_flag/cons/value_init.cc 
b/libstdc++-v3/testsuite/29_atomics/atomic_flag/cons/value_init.cc
index f3f38b54dbcd..6439873be133 100644
--- a/libstdc++-v3/testsuite/29_atomics/atomic_flag/cons/value_init.cc
+++ b/libstdc++-v3/testsuite/29_atomics/atomic_flag/cons/value_init.cc
@@ -16,7 +16,7 @@
 // .
 
 // { dg-do run { target c++20 } }
-// { dg-require-thread-fence "" }
+// { dg-require-atomic-cmpxchg-word "" }
 
 #include 
 #include 
diff --git 
a/libstdc++-v3/testsuite/29_atomics/atomic_flag/test_and_set/explicit.cc 
b/libstdc++-v3/testsuite/29_atomics/atomic_flag/test_and_set/explicit.cc
index 6f723eb5f4e7..6cb1ae2b6dda 100644
--- a/libstdc++-v3/testsuite/29_atomics/atomic_flag/test_and_set/explicit.cc
+++ b/libstdc++-v3/testsuite/29_atomics/atomic_flag/test_and_set/explicit.cc
@@ -1,5 +1,5 @@
 // { dg-do run { target c++11 } }
-// { dg-require-thread-fence "" }
+// { dg-require-atomic-cmpxchg-word "" }
 
 // Copyright (C) 2008-2023 Free Software Foundation, Inc.
 //
diff --git 
a/libstdc++-v3/testsuite/29_atomics/atomic_flag/test_and_set/implicit.cc 
b/libstdc++-v3/testsuite/29_atomics/atomic_flag/test_and_set/implicit.cc
index 6f723eb5f4e7..6cb1ae2b6dda 100644
--- a/libstdc++-v3/testsuite/29_atomics/atomic_flag/test_and_set/implicit.cc
+++ b/libstdc++-v3/testsuite/29_atomics/atomic_flag/test_and_set/implicit.cc
@@ -1,5 +1,5 @@
 // { dg-do run { target c++11 } }
-// { dg-require-thread-fence "" }
+// { dg-require-atomic-cmpxchg-word "" }
 
 // Copyright (C) 2008-2023 Free Software Foundation, Inc.
 //
diff --git 
a/libstdc++-v3/testsuite/29_atomics/atomic_ref/compare_exchange_padding.cc 
b/libstdc++-v3/testsuite/29_atomics/atomic_ref/compare_exchange_padding.cc
index 2a3d1d468c22..25ccd2e94336 100644
--- a/libstdc++-v3/testsuite/29_atomics/atomic_ref/compare_exchange_padding.cc
+++ b/libstdc++-v3/testsuite/29_atomics/atomic_ref/compare_exchange_padding.cc
@@ -1,5 +1,5 @@
 // { dg-do run { target c++20 } }
-// { dg-require-thread-fence "" }
+// { dg-require-atomic-cmpxchg-word "" }
 // { dg-add-options libatomic }
 
 #include 
diff -

Re: [RISC-V]: Re: cpymem for RISCV with v extension

2023-10-04 Thread Patrick O'Neill

Hi Joern,

I'm seeing new failures introduced by this patch 
(9464e72bcc9123b619215af8cfef491772a3ebd9).


On rv64gcv:
FAIL: gcc.dg/pr90263.c scan-assembler memcpy
FAIL: gfortran.fortran-torture/execute/intrinsic_count.f90 execution,  
-O2 -fomit-frame-pointer -finline-functions -funroll-loops


Debug log for intrinsic_count.f90:
spawn riscv64-unknown-linux-gnu-run 
/scratch/tc-testing/tc-410-break/build/build-gcc-linux-stage2/gcc/testsuite/gfortran9/intrinsic_count.x

STOP 2
FAIL: gfortran.fortran-torture/execute/intrinsic_count.f90 execution,  
-O2 -fomit-frame-pointer -finline-functions -funroll-loops


It's worth noting that intrinsic_count.f90 had failures prior to this 
patch for other option combinations:

FAIL: gfortran.fortran-torture/execute/intrinsic_count.f90 execution,  -O2
FAIL: gfortran.fortran-torture/execute/intrinsic_count.f90 execution,  
-O2 -fbounds-check
FAIL: gfortran.fortran-torture/execute/intrinsic_count.f90 execution,  
-O2 -fomit-frame-pointer -finline-functions


Thanks,
Patrick

On 10/1/23 19:43, Joern Rennecke wrote:

On Tue, 15 Aug 2023 at 15:06, Jeff Law  wrote:
  >

On 8/15/23 03:16, juzhe.zh...@rivai.ai wrote:

The new  patch looks reasonable to me now. Thanks for fixing it.

Could you append testcase after finishing test infrastructure ?
I prefer this patch with testcase after infrastructure.

So let's call this an ACK, but ask that Joern not commit until the
testsuite bits are in place.

Beyond the adding of tests, the patch needed some changes because of the
Refactoring of emit_{vlmax,nonvlmax}_xxx functions .
Attached is the committed version.


Re: [RFC] expr: don't clear SUBREG_PROMOTED_VAR_P flag for a promoted subreg [target/111466]

2023-10-04 Thread Jeff Law




On 9/28/23 15:43, Vineet Gupta wrote:

RISC-V suffers from extraneous sign extensions, despite/given the ABI
guarantee that 32-bit quantities are sign-extended into 64-bit registers,
meaning incoming SI function args need not be explicitly sign extended
(so do SI return values as most ALU insns implicitly sign-extend too.)

Existing REE doesn't seem to handle this well and there are various ideas
floating around to smarten REE about it.

RISC-V also seems to correctly implement middle-end hook PROMOTE_MODE
etc.

Another approach would be to prevent EXPAND from generating the
sign_extend in the first place which this patch tries to do.

The hunk being removed was introduced way back in 1994 as
5069803972 ("expand_expr, case CONVERT_EXPR .. clear the promotion flag")

This survived full testsuite run for RISC-V rv64gc with surprisingly no
fallouts: test results before/after are exactly same.

|   | # of unexpected case / # of unique unexpected 
case
|   |  gcc |  g++ | gfortran |
| rv64imafdc_zba_zbb_zbs_zicond/|  264 /87 |5 / 2 |   72 /12 |
|lp64d/medlow

Granted for something so old to have survived, there must be a valid
reason. Unfortunately the original change didn't have additional
commentary or a test case. That is not to say it can't/won't possibly
break things on other arches/ABIs, hence the RFC for someone to scream
that this is just bonkers, don't do this :-)

I've explicitly CC'ed Jakub and Roger who have last touched subreg
promoted notes in expr.cc for insight and/or screaming ;-)

Thanks to Robin for narrowing this down in an amazing debugging session
@ GNU Cauldron.

```
foo2:
sext.w  a6,a1 <-- this goes away
beq a1,zero,.L4
li  a5,0
li  a0,0
.L3:
addwa4,a2,a5
addwa5,a3,a5
addwa0,a4,a0
bltua5,a6,.L3
ret
.L4:
li  a0,0
ret
```

Signed-off-by: Vineet Gupta 
Co-developed-by: Robin Dapp 
---
  gcc/expr.cc   |  7 ---
  gcc/testsuite/gcc.target/riscv/pr111466.c | 15 +++
  2 files changed, 15 insertions(+), 7 deletions(-)
  create mode 100644 gcc/testsuite/gcc.target/riscv/pr111466.c
So mcore-elf is showing something interesting.  With that hunk of Kenner 
code removed, it actually has a few failing tests that flip to passes.



Tests that now work, but didn't before (11 tests):

mcore-sim: gcc.c-torture/execute/pr109986.c   -O1  execution test
mcore-sim: gcc.c-torture/execute/pr109986.c   -O2  execution test
mcore-sim: gcc.c-torture/execute/pr109986.c   -O2 -flto -fno-use-linker-plugin 
-flto-partition=none  execution test
mcore-sim: gcc.c-torture/execute/pr109986.c   -O2 -flto -fuse-linker-plugin 
-fno-fat-lto-objects  execution test
mcore-sim: gcc.c-torture/execute/pr109986.c   -O3 -g  execution test
mcore-sim: gcc.c-torture/execute/pr109986.c   -Os  execution test
mcore-sim: gcc.c-torture/execute/pr84524.c   -O2  execution test
mcore-sim: gcc.c-torture/execute/pr84524.c   -O2 -flto -fno-use-linker-plugin 
-flto-partition=none  execution test
mcore-sim: gcc.c-torture/execute/pr84524.c   -O2 -flto -fuse-linker-plugin 
-fno-fat-lto-objects  execution test
mcore-sim: gcc.dg/tree-ssa/pr84436-5.c execution test
mcore-sim: gcc.dg/tree-ssa/pr84436-5.c execution test


So that's a really interesting result.   If further analysis doesn't 
point the finger at a simulator bug or something like that, then we'll 
have strong evidence that Kenner's change is actively harmful from a 
correctness standpoint.  That would change the calculus here significantly.


Sadly, mcore-elf doesn't have a working gdb IIRC (don't ask how I know 
that!), so I'm going to have to analyze this further with less efficient 
techniques.  BUt definitely interesting news/results.


Jeff


Re: [RFC] expr: don't clear SUBREG_PROMOTED_VAR_P flag for a promoted subreg [target/111466]

2023-10-04 Thread Vineet Gupta

On 10/4/23 10:59, Jeff Law wrote:



On 9/28/23 15:43, Vineet Gupta wrote:

RISC-V suffers from extraneous sign extensions, despite/given the ABI
guarantee that 32-bit quantities are sign-extended into 64-bit 
registers,

meaning incoming SI function args need not be explicitly sign extended
(so do SI return values as most ALU insns implicitly sign-extend too.)

Existing REE doesn't seem to handle this well and there are various 
ideas

floating around to smarten REE about it.

RISC-V also seems to correctly implement middle-end hook PROMOTE_MODE
etc.

Another approach would be to prevent EXPAND from generating the
sign_extend in the first place which this patch tries to do.

The hunk being removed was introduced way back in 1994 as
    5069803972 ("expand_expr, case CONVERT_EXPR .. clear the 
promotion flag")


This survived full testsuite run for RISC-V rv64gc with surprisingly no
fallouts: test results before/after are exactly same.

|   | # of unexpected case / # of unique 
unexpected case
|   |  gcc |  g++ | 
gfortran |
| rv64imafdc_zba_zbb_zbs_zicond/|  264 /    87 |    5 / 2 |   72 
/    12 |

|    lp64d/medlow

Granted for something so old to have survived, there must be a valid
reason. Unfortunately the original change didn't have additional
commentary or a test case. That is not to say it can't/won't possibly
break things on other arches/ABIs, hence the RFC for someone to scream
that this is just bonkers, don't do this :-)

I've explicitly CC'ed Jakub and Roger who have last touched subreg
promoted notes in expr.cc for insight and/or screaming ;-)

Thanks to Robin for narrowing this down in an amazing debugging session
@ GNU Cauldron.

```
foo2:
sext.w    a6,a1 <-- this goes away
beq    a1,zero,.L4
li    a5,0
li    a0,0
.L3:
addw    a4,a2,a5
addw    a5,a3,a5
addw    a0,a4,a0
bltu    a5,a6,.L3
ret
.L4:
li    a0,0
ret
```

Signed-off-by: Vineet Gupta 
Co-developed-by: Robin Dapp 
---
  gcc/expr.cc   |  7 ---
  gcc/testsuite/gcc.target/riscv/pr111466.c | 15 +++
  2 files changed, 15 insertions(+), 7 deletions(-)
  create mode 100644 gcc/testsuite/gcc.target/riscv/pr111466.c
So mcore-elf is showing something interesting.  With that hunk of 
Kenner code removed, it actually has a few failing tests that flip to 
passes.



Tests that now work, but didn't before (11 tests):

mcore-sim: gcc.c-torture/execute/pr109986.c   -O1  execution test
mcore-sim: gcc.c-torture/execute/pr109986.c   -O2  execution test
mcore-sim: gcc.c-torture/execute/pr109986.c   -O2 -flto 
-fno-use-linker-plugin -flto-partition=none  execution test
mcore-sim: gcc.c-torture/execute/pr109986.c   -O2 -flto 
-fuse-linker-plugin -fno-fat-lto-objects  execution test

mcore-sim: gcc.c-torture/execute/pr109986.c   -O3 -g  execution test
mcore-sim: gcc.c-torture/execute/pr109986.c   -Os  execution test
mcore-sim: gcc.c-torture/execute/pr84524.c   -O2  execution test
mcore-sim: gcc.c-torture/execute/pr84524.c   -O2 -flto 
-fno-use-linker-plugin -flto-partition=none  execution test
mcore-sim: gcc.c-torture/execute/pr84524.c   -O2 -flto 
-fuse-linker-plugin -fno-fat-lto-objects  execution test

mcore-sim: gcc.dg/tree-ssa/pr84436-5.c execution test
mcore-sim: gcc.dg/tree-ssa/pr84436-5.c execution test


So that's a really interesting result.   If further analysis doesn't 
point the finger at a simulator bug or something like that, then we'll 
have strong evidence that Kenner's change is actively harmful from a 
correctness standpoint.  That would change the calculus here 
significantly.


Sadly, mcore-elf doesn't have a working gdb IIRC (don't ask how I know 
that!), so I'm going to have to analyze this further with less 
efficient techniques.  BUt definitely interesting news/results.


Really interesting. Can't thank you enough for spending time in chasing 
this down.


-Vineet


Re: [committed] libstdc++: Define std::numeric_limits<_FloatNN> before C++23

2023-10-04 Thread Pranav Kant
I have reverted the change upstream (
https://github.com/llvm/llvm-project/commit/7d21086d0ca4a680e96e0f4cd3e2597ebe027a48
).

On Wed, Oct 4, 2023 at 10:00 AM Pranav Kant  wrote:

> I will revert the commit while I work on this. Thanks for the pointers.
>
> On Wed, Oct 4, 2023 at 9:57 AM Jakub Jelinek  wrote:
>
>> On Wed, Oct 04, 2023 at 09:47:34AM -0700, Pranav Kant wrote:
>> > Thanks for bringing this to my attention. I am working on a fix. Will
>> keep
>> > this thread posted.
>> >
>> > Clang *does* define this macro only when float128 type is available. But
>> > the problem seems to be that clang doesn't define _Float128 alias type
>> > which is what's being used here. It only defines __float128 type.
>> Should be
>> > easy to fix.
>>
>> _Float128 (and other _FloatNN types) are standard C23 and C++23 types,
>> they certainly can't be simple aliases to __float128 or other similar
>> types.
>> In C++ they mangle differently, have significantly different behavior in
>> lots of different language details.
>> So, you need to implement the https://wg21.link/P1467R9 paper, rather
>> than
>> doing quick hacks.
>>
>> Jakub
>>
>>


[PATCH v3] libiberty: Use posix_spawn in pex-unix when available.

2023-10-04 Thread Brendan Shanks
Hi,

This patch implements pex_unix_exec_child using posix_spawn when
available.

This should especially benefit recent macOS (where vfork just calls
fork), but should have equivalent or faster performance on all
platforms.
In addition, the implementation is substantially simpler than the
vfork+exec code path.

Tested on x86_64-linux.

v2: Fix error handling (previously the function would be run twice in
case of error), and don't use a macro that changes control flow.

v3: Match file style for error-handling blocks, don't close
in/out/errdes on error, and check close() for errors.

libiberty/
* configure.ac (AC_CHECK_HEADERS): Add spawn.h.
(checkfuncs): Add posix_spawn, posix_spawnp.
(AC_CHECK_FUNCS): Add posix_spawn, posix_spawnp.
* configure, config.in: Rebuild.
* pex-unix.c [HAVE_POSIX_SPAWN] (pex_unix_exec_child): New function.

Signed-off-by: Brendan Shanks 
---
 libiberty/configure.ac |   8 +-
 libiberty/pex-unix.c   | 168 +
 2 files changed, 173 insertions(+), 3 deletions(-)

diff --git a/libiberty/configure.ac b/libiberty/configure.ac
index 0748c592704..2488b031bc8 100644
--- a/libiberty/configure.ac
+++ b/libiberty/configure.ac
@@ -289,7 +289,7 @@ AC_SUBST_FILE(host_makefile_frag)
 # It's OK to check for header files.  Although the compiler may not be
 # able to link anything, it had better be able to at least compile
 # something.
-AC_CHECK_HEADERS(sys/file.h sys/param.h limits.h stdlib.h malloc.h string.h 
unistd.h strings.h sys/time.h time.h sys/resource.h sys/stat.h sys/mman.h 
fcntl.h alloca.h sys/pstat.h sys/sysmp.h sys/sysinfo.h machine/hal_sysinfo.h 
sys/table.h sys/sysctl.h sys/systemcfg.h stdint.h stdio_ext.h process.h 
sys/prctl.h)
+AC_CHECK_HEADERS(sys/file.h sys/param.h limits.h stdlib.h malloc.h string.h 
unistd.h strings.h sys/time.h time.h sys/resource.h sys/stat.h sys/mman.h 
fcntl.h alloca.h sys/pstat.h sys/sysmp.h sys/sysinfo.h machine/hal_sysinfo.h 
sys/table.h sys/sysctl.h sys/systemcfg.h stdint.h stdio_ext.h process.h 
sys/prctl.h spawn.h)
 AC_HEADER_SYS_WAIT
 AC_HEADER_TIME
 
@@ -412,7 +412,8 @@ funcs="$funcs setproctitle"
 vars="sys_errlist sys_nerr sys_siglist"
 
 checkfuncs="__fsetlocking canonicalize_file_name dup3 getrlimit getrusage \
- getsysinfo gettimeofday on_exit pipe2 psignal pstat_getdynamic 
pstat_getstatic \
+ getsysinfo gettimeofday on_exit pipe2 posix_spawn posix_spawnp psignal \
+ pstat_getdynamic pstat_getstatic \
  realpath setrlimit spawnve spawnvpe strerror strsignal sysconf sysctl \
  sysmp table times wait3 wait4"
 
@@ -435,7 +436,8 @@ if test "x" = "y"; then
 index insque \
 memchr memcmp memcpy memmem memmove memset mkstemps \
 on_exit \
-pipe2 psignal pstat_getdynamic pstat_getstatic putenv \
+pipe2 posix_spawn posix_spawnp psignal \
+pstat_getdynamic pstat_getstatic putenv \
 random realpath rename rindex \
 sbrk setenv setproctitle setrlimit sigsetmask snprintf spawnve spawnvpe \
  stpcpy stpncpy strcasecmp strchr strdup \
diff --git a/libiberty/pex-unix.c b/libiberty/pex-unix.c
index 33b5bce31c2..336799d1125 100644
--- a/libiberty/pex-unix.c
+++ b/libiberty/pex-unix.c
@@ -58,6 +58,9 @@ extern int errno;
 #ifdef HAVE_PROCESS_H
 #include 
 #endif
+#ifdef HAVE_SPAWN_H
+#include 
+#endif
 
 #ifdef vfork /* Autoconf may define this to fork for us. */
 # define VFORK_STRING "fork"
@@ -559,6 +562,171 @@ pex_unix_exec_child (struct pex_obj *obj ATTRIBUTE_UNUSED,
   return (pid_t) -1;
 }
 
+#elif defined(HAVE_POSIX_SPAWN) && defined(HAVE_POSIX_SPAWNP)
+/* Implementation of pex->exec_child using posix_spawn.*/
+
+static pid_t
+pex_unix_exec_child (struct pex_obj *obj ATTRIBUTE_UNUSED,
+int flags, const char *executable,
+char * const * argv, char * const * env,
+int in, int out, int errdes,
+int toclose, const char **errmsg, int *err)
+{
+  int ret;
+  pid_t pid = -1;
+  posix_spawnattr_t attr;
+  posix_spawn_file_actions_t actions;
+  int attr_initialized = 0, actions_initialized = 0;
+
+  *err = 0;
+
+  ret = posix_spawnattr_init (&attr);
+  if (ret)
+{
+  *err = ret;
+  *errmsg = "posix_spawnattr_init";
+  goto exit;
+}
+  attr_initialized = 1;
+
+  /* Use vfork() on glibc <=2.24. */
+#ifdef POSIX_SPAWN_USEVFORK
+  ret = posix_spawnattr_setflags (&attr, POSIX_SPAWN_USEVFORK);
+  if (ret)
+{
+  *err = ret;
+  *errmsg = "posix_spawnattr_setflags";
+  goto exit;
+}
+#endif
+
+  ret = posix_spawn_file_actions_init (&actions);
+  if (ret)
+{
+  *err = ret;
+  *errmsg = "posix_spawn_file_actions_init";
+  goto exit;
+}
+  actions_initialized = 1;
+
+  if (in != STDIN_FILE_NO)
+{
+  ret = posix_spawn_file_actions_adddup2 (&actions, in, STDIN_FILE_NO);
+  if (ret)
+{
+  *err = ret;
+  *errmsg = "posix_spawn_file_actions_adddup2";
+  goto exit;
+}

Re: [PATCH v2] Add a GCC Security policy

2023-10-04 Thread Siddhesh Poyarekar




On 2023-10-04 11:49, Alexander Monakov wrote:


On Thu, 28 Sep 2023, Siddhesh Poyarekar wrote:


Define a security process and exclusions to security issues for GCC and
all components it ships.


Some typos and wording suggestions below.


I've incorporated all your and David's suggestions and pushed it.  Thank 
you for iterating with me on this!


Sid




--- /dev/null
+++ b/SECURITY.txt
@@ -0,0 +1,205 @@
+What is a GCC security bug?
+===
+
+A security bug is one that threatens the security of a system or
+network, or might compromise the security of data stored on it.
+In the context of GCC there are multiple ways in which this might
+happen and some common scenarios are detailed below.
+
+If you're reporting a security issue and feel like it does not fit
+into any of the descriptions below, you're encouraged to reach out
+through the GCC bugzilla or if needed, privately by following the
+instructions in the last two sections of this document.
+
+Compiler drivers, programs, libgccjit and support libraries
+---
+
+The compiler driver processes source code, invokes other programs
+such as the assembler and linker and generates the output result,
+which may be assembly code or machine code.  Compiling untrusted
+sources can result in arbitrary code execution and unconstrained
+resource consumption in the compiler. As a result, compilation of
+such code should be done inside a sandboxed environment to ensure
+that it does not compromise the development environment.


"... the host environment" seems more appropriate.


+
+The libgccjit library can, despite the name, be used both for
+ahead-of-time compilation and for just-in-compilation.  In both
+cases it can be used to translate input representations (such as
+source code) in the application context; in the latter case the
+generated code is also run in the application context.
+
+Limitations that apply to the compiler driver, apply here too in
+terms of sanitizing inputs and it is recommended that both the


s/sanitizing inputs/trusting inputs/ (I suggested it earlier, just unsure
if you don't agree or it simply fell through the cracks)


+compilation *and* execution context of the code are appropriately
+sandboxed to contain the effects of any bugs in libgccjit, the
+application code using it, or its generated code to the sandboxed
+environment.
+
+Libraries such as libiberty, libcc1 and libcpp are not distributed
+for runtime support and have similar challenges to compiler drivers.
+While they are expected to be robust against arbitrary input, they
+should only be used with trusted inputs when linked into the
+compiler.
+
+Libraries such as zlib that bundled into GCC to build it will be


'are bundled with' (missing 'are', s/into/with/)


+treated the same as the compiler drivers and programs as far as
+security coverage is concerned.  However if you find an issue in
+these libraries independent of their use in GCC, you should reach
+out to their upstream projects to report them.
+
+As a result, the only case for a potential security issue in the
+compiler is when it generates vulnerable application code for
+trusted input source code that is conforming to the relevant
+programming standard or extensions documented as supported by GCC
+and the algorithm expressed in the source code does not have the
+vulnerability.  The output application code could be considered
+vulnerable if it produces an actual vulnerability in the target
+application, specifically in the following cases:


It seems ambiguous if the list that follows is meant to be an exhaustive
enumeration. I think it is meant to give examples without covering all
possibilities; if that's the case, I would suggest

s/specifically in the following cases/for example/

If I misunderstood and the list is really meant to be exhaustive,
it would be nice to make that clear and perhaps refer the reader
to the second paragraph when their scenario does not fit.


+
+- The application dereferences an invalid memory location despite
+  the application sources being valid.
+- The application reads from or writes to a valid but incorrect
+  memory location, resulting in an information integrity issue or an
+  information leak.
+- The application ends up running in an infinite loop or with
+  severe degradation in performance despite the input sources having
+  no such issue, resulting in a Denial of Service.  Note that
+  correct but non-performant code is not a security issue candidate,
+  this only applies to incorrect code that may result in performance
+  degradation severe enough to amount to a denial of service.
+- The application crashes due to the generated incorrect code,
+  resulting in a Denial

Re: [PATCH v2 RFA] diagnostic: add permerror variants with opt

2023-10-04 Thread Jason Merrill

On 10/4/23 03:07, Florian Weimer wrote:

* Jason Merrill:


@@ -6159,6 +6153,18 @@ errors by @option{-pedantic-errors}.  For instance:
  -Wwrite-strings @r{(C++11 or later)}
  }
  
+@opindex fpermissive

+@item -fpermissive
+Downgrade some required diagnostics about nonconformant code from
+errors to warnings.  Thus, using @option{-fpermissive} allows some
+nonconforming code to compile.  Some C++ diagnostics are controlled
+only by this flag, but it also downgrades some diagnostics that have
+their own flag:
+
+@gccoptlist{
+-Wnarrowing @r{(C++)}
+}
+
  @opindex Wall
  @opindex Wno-all
  @item -Wall


Does compiling with -Wno-narrowing also accept the obsolete constructs?
The documentation isn't clear about it.  The existing test
gcc/testsuite/g++.dg/cpp0x/initlist55.C suggests it's possible.  Maybe
add an explicit example to the documentation?


The documentation for -Wnarrowing already says that -Wno-narrowing 
suppresses the diagnostic.  It seems clear to me that if there's no 
error, the construct is accepted?



What about the impact of -w?


As I said in the patch comment, -w by itself has no effect. 
-fpermissive -w silences the diagnostic.



As far as the internal API is concerned, will there be a way to query
whether -Wno-narrowing etc. have been specified?


There's warning_enabled_at, but that doesn't account for the different 
rules for permerrors, such as the above -w behavior.  We could add a 
similar function for permerrors, that could also indicate whether the 
diagnostic is a warning or error?


More generally there's OPTION_SET_P.

Jason



Re: [PATCH] wwwdocs: Add ADL to C++ non-bugs

2023-10-04 Thread Jason Merrill

On 10/3/23 10:45, Jonathan Wakely wrote:

We have a long history of INVALID bugs about std functions being
available in the global namespace (PRs 27846, 67566, 82619, 99865,
110602, 111553, probably others). Let's document it.

Also de-prioritize the C++98-only bugs, which are unlikely to affect
anybody nowadays.

OK for wwwdocs?


OK, thanks.

Jason



Re: [RISC-V]: Re: cpymem for RISCV with v extension

2023-10-04 Thread Joern Rennecke
On Wed, 4 Oct 2023 at 18:38, Patrick O'Neill  wrote:
>
> Hi Joern,
>
> I'm seeing new failures introduced by this patch
> (9464e72bcc9123b619215af8cfef491772a3ebd9).
>
> On rv64gcv:
> FAIL: gcc.dg/pr90263.c scan-assembler memcpy

My testing didn't flag this because I used elf targets.  The
expected behaviour now is to use vector instructions for rvv.
so we shouldn't expect memcpy to appear there.  I think the
rvv case is suitably covered by the new tests, so we just
have to avoid the failure here.  Does the attached patch work for you?

> FAIL: gfortran.fortran-torture/execute/intrinsic_count.f90 execution,
> -O2 -fomit-frame-pointer -finline-functions -funroll-loops

There seems to be an issue with my test setup regarding fortran, I'll
have to investigate.
diff --git a/gcc/testsuite/gcc.dg/pr90263.c b/gcc/testsuite/gcc.dg/pr90263.c
index 3222a5331c1..09e0446f45c 100644
--- a/gcc/testsuite/gcc.dg/pr90263.c
+++ b/gcc/testsuite/gcc.dg/pr90263.c
@@ -9,4 +9,4 @@ int *f (int *p, int *q, long n)
 }
 
 /* { dg-final { scan-assembler "mempcpy" { target { i?86-*-* x86_64-*-* } } } 
} */
-/* { dg-final { scan-assembler "memcpy" { target { ! { i?86-*-* x86_64-*-* } } 
} } } */
+/* { dg-final { scan-assembler "memcpy" { target { ! { i?86-*-* x86_64-*-* 
riscv_v } } } } } */


[PATCH 2/1] c++: rename tsubst_copy_and_build and tsubst_expr

2023-10-04 Thread Patrick Palka
Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
trunk?

-- >8 --

After the previous patch, we currently have two tsubst entry points
for expression trees: tsubst_copy_and_build and tsubst_expr.  But the
latter is just a superset of the former that also handles statement
trees.  We could merge the two entry points so that we just have
tsubst_expr, but it seems natural to distinguish statement trees from
expression trees and to maintain a separate entry point for them.

To that end, this this patch renames tsubst_copy_and_build to
tsubst_expr, and renames the current tsubst_expr to tsubst_stmt, which
continues to be a superset of the former (since sometimes expression
trees appear in statement contexts, e.g. a branch of an IF_STMT could be
NOP_EXPR).  Making tsubst_stmt disjoint from tsubst_expr is left as
future work (if deemed desirable).

This patch in turn renames suitable existing uses of tsubst_expr (that
expect to take statement trees) to use tsubst_stmt.  Thus untouched
tsubst_expr calls are implicitly strengthened to expect only expression
trees after this patch.  And since I'm not familiar with the
tsubst_omp_* routines this patch renames all tsubst_expr uses there to
tsubst_stmt to ensure no unintended functional change in that area.
This patch also moves the handling of CO_YIELD_EXPR and CO_AWAIT_EXPR
from tsubst_stmt to tsubst_expr since they're expression trees.

gcc/cp/ChangeLog:

* cp-lang.cc (objcp_tsubst_copy_and_build): Rename to ...
(objcp_tsubst_expr): ... this.
* cp-objcp-common.h (objcp_tsubst_copy_and_build): Rename to ...
(objcp_tsubst_expr): ... this.
* cp-tree.h (tsubst_copy_and_build): Remove declaration.
* init.cc (maybe_instantiate_nsdmi_init): Use tsubst_expr
instead of tsubst_copy_and_build.
* pt.cc (expand_integer_pack): Likewise.
(instantiate_non_dependent_expr_internal): Likewise.
(instantiate_class_template): Use tsubst_stmt instead of
tsubst_expr for STATIC_ASSERT.
(tsubst_function_decl): Adjust tsubst_copy_and_build uses.
(tsubst_arg_types): Likewise.
(tsubst_exception_specification): Likewise.
(tsubst_tree_list): Likewise.
(tsubst): Likewise.
(tsubst_name): Likewise.
(tsubst_omp_clause_decl): Use tsubst_stmt instead of tsubst_expr.
(tsubst_omp_clauses): Likewise.
(tsubst_copy_asm_operands): Adjust tsubst_copy_and_build use.
(tsubst_omp_for_iterator): Use tsubst_stmt instead of tsubst_expr.
(tsubst_expr): Rename to ...
(tsubst_stmt): ... this.
: Move to tsubst_expr.
(tsubst_omp_udr): Use tsubst_stmt instead of tsubst_expr.
(tsubst_non_call_postfix_expression): Adjust tsubst_copy_and_build
use.
(tsubst_lambda_expr): Likewise.  Use tsubst_stmt instead of
tsubst_expr for the body of a lambda.
(tsubst_copy_and_build_call_args): Rename to ...
(tsubst_call_args): ... this.  Adjust tsubst_copy_and_build use.
(tsubst_copy_and_build): Rename to tsubst_expr.  Adjust
tsubst_copy_and_build and tsubst_copy_and_build_call_args use.
: Use tsubst_stmt instead of tsubst_expr.
(maybe_instantiate_noexcept): Adjust tsubst_copy_and_build use.
(instantiate_body): Use tsubst_stmt instead of tsubst_expr for
substituting the function body.
(tsubst_initializer_list): Adjust tsubst_copy_and_build use.

gcc/objcp/ChangeLog:

* objcp-lang.cc (objcp_tsubst_copy_and_build): Rename to ...
(objcp_tsubst_expr): ... this.  Adjust tsubst_copy_and_build
uses.
---
 gcc/cp/cp-lang.cc|   6 +-
 gcc/cp/cp-objcp-common.h |   2 +-
 gcc/cp/cp-tree.h |   1 -
 gcc/cp/init.cc   |   3 +-
 gcc/cp/pt.cc | 177 +--
 gcc/objcp/objcp-lang.cc  |   5 +-
 6 files changed, 85 insertions(+), 109 deletions(-)

diff --git a/gcc/cp/cp-lang.cc b/gcc/cp/cp-lang.cc
index 2f541460c4a..f2ed83de4fb 100644
--- a/gcc/cp/cp-lang.cc
+++ b/gcc/cp/cp-lang.cc
@@ -113,10 +113,8 @@ struct lang_hooks lang_hooks = LANG_HOOKS_INITIALIZER;
 /* The following function does something real, but only in Objective-C++.  */
 
 tree
-objcp_tsubst_copy_and_build (tree /*t*/,
-tree /*args*/,
-tsubst_flags_t /*complain*/,
-tree /*in_decl*/)
+objcp_tsubst_expr (tree /*t*/, tree /*args*/, tsubst_flags_t /*complain*/,
+  tree /*in_decl*/)
 {
   return NULL_TREE;
 }
diff --git a/gcc/cp/cp-objcp-common.h b/gcc/cp/cp-objcp-common.h
index 80893aa1752..1408301a300 100644
--- a/gcc/cp/cp-objcp-common.h
+++ b/gcc/cp/cp-objcp-common.h
@@ -24,7 +24,7 @@ along with GCC; see the file COPYING3.  If not see
 /* In cp/objcp-common.c, cp/cp-lang.cc and objcp/objcp-lang.cc.  */
 
 extern tree cp_get_debug_type (const_tree);
-extern tree objcp_tsubst_copy_and_build 

Re: [PATCH] RISC-V: Fix the riscv_legitimize_poly_move issue on targets where the minimal VLEN exceeds 512.

2023-10-04 Thread Robin Dapp
>> I think the "max poly value" is the LMUL 1 mode coeffs[1]
>>
>> See int vlenb = BYTES_PER_RISCV_VECTOR.coeffs[1];
>>
>> So I think bump max_power to exact_log2 (64); is not enough.
>> since we adjust the LMUL 1 mode size according to TARGET_MIN_VLEN.
>>
>> I suspect the testcase you append in this patch will fail with 
>> -march=rv64gcv_zvl4096b.
> 
> 
> There is no type smaller than  [64, 64] in zvl4096b, RVVMF64BI is [64,
> 64], it’s smallest type, and RVVFM1BI is [512, 512] (size of single
> vector reg.) which at most 64x for zvl4096b, so my understanding is
> log2(64) is enough :)
> 
> and of cause, verified the testcase is work with -march=rv64gcv_zvl4096b

I was wondering if the whole hunk couldn't be condensed into something
like (untested):

  div_factor = wi::ctz (factor) - wi::ctz (vlenb);
  if (div_factor >= 0)
div_factor = 1;
  else
div_factor = 1 << -div_factor;

This would avoid the loop as well.  An assert for the div_factor (not
exceeding a value) could still be added.

Regards
 Robin


Re: [RFC] expr: don't clear SUBREG_PROMOTED_VAR_P flag for a promoted subreg [target/111466]

2023-10-04 Thread Jeff Law




On 10/4/23 12:14, Vineet Gupta wrote:

On 10/4/23 10:59, Jeff Law wrote:



On 9/28/23 15:43, Vineet Gupta wrote:

RISC-V suffers from extraneous sign extensions, despite/given the ABI
guarantee that 32-bit quantities are sign-extended into 64-bit 
registers,

meaning incoming SI function args need not be explicitly sign extended
(so do SI return values as most ALU insns implicitly sign-extend too.)

Existing REE doesn't seem to handle this well and there are various 
ideas

floating around to smarten REE about it.

RISC-V also seems to correctly implement middle-end hook PROMOTE_MODE
etc.

Another approach would be to prevent EXPAND from generating the
sign_extend in the first place which this patch tries to do.

The hunk being removed was introduced way back in 1994 as
    5069803972 ("expand_expr, case CONVERT_EXPR .. clear the 
promotion flag")


This survived full testsuite run for RISC-V rv64gc with surprisingly no
fallouts: test results before/after are exactly same.

|   | # of unexpected case / # of unique 
unexpected case
|   |  gcc |  g++ | 
gfortran |
| rv64imafdc_zba_zbb_zbs_zicond/|  264 /    87 |    5 / 2 |   72 
/    12 |

|    lp64d/medlow

Granted for something so old to have survived, there must be a valid
reason. Unfortunately the original change didn't have additional
commentary or a test case. That is not to say it can't/won't possibly
break things on other arches/ABIs, hence the RFC for someone to scream
that this is just bonkers, don't do this :-)

I've explicitly CC'ed Jakub and Roger who have last touched subreg
promoted notes in expr.cc for insight and/or screaming ;-)

Thanks to Robin for narrowing this down in an amazing debugging session
@ GNU Cauldron.

```
foo2:
sext.w    a6,a1 <-- this goes away
beq    a1,zero,.L4
li    a5,0
li    a0,0
.L3:
addw    a4,a2,a5
addw    a5,a3,a5
addw    a0,a4,a0
bltu    a5,a6,.L3
ret
.L4:
li    a0,0
ret
```

Signed-off-by: Vineet Gupta 
Co-developed-by: Robin Dapp 
---
  gcc/expr.cc   |  7 ---
  gcc/testsuite/gcc.target/riscv/pr111466.c | 15 +++
  2 files changed, 15 insertions(+), 7 deletions(-)
  create mode 100644 gcc/testsuite/gcc.target/riscv/pr111466.c
So mcore-elf is showing something interesting.  With that hunk of 
Kenner code removed, it actually has a few failing tests that flip to 
passes.



Tests that now work, but didn't before (11 tests):

mcore-sim: gcc.c-torture/execute/pr109986.c   -O1  execution test
mcore-sim: gcc.c-torture/execute/pr109986.c   -O2  execution test
mcore-sim: gcc.c-torture/execute/pr109986.c   -O2 -flto 
-fno-use-linker-plugin -flto-partition=none  execution test
mcore-sim: gcc.c-torture/execute/pr109986.c   -O2 -flto 
-fuse-linker-plugin -fno-fat-lto-objects  execution test

mcore-sim: gcc.c-torture/execute/pr109986.c   -O3 -g  execution test
mcore-sim: gcc.c-torture/execute/pr109986.c   -Os  execution test
mcore-sim: gcc.c-torture/execute/pr84524.c   -O2  execution test
mcore-sim: gcc.c-torture/execute/pr84524.c   -O2 -flto 
-fno-use-linker-plugin -flto-partition=none  execution test
mcore-sim: gcc.c-torture/execute/pr84524.c   -O2 -flto 
-fuse-linker-plugin -fno-fat-lto-objects  execution test

mcore-sim: gcc.dg/tree-ssa/pr84436-5.c execution test
mcore-sim: gcc.dg/tree-ssa/pr84436-5.c execution test


So that's a really interesting result.   If further analysis doesn't 
point the finger at a simulator bug or something like that, then we'll 
have strong evidence that Kenner's change is actively harmful from a 
correctness standpoint.  That would change the calculus here 
significantly.


Sadly, mcore-elf doesn't have a working gdb IIRC (don't ask how I know 
that!), so I'm going to have to analyze this further with less 
efficient techniques.  BUt definitely interesting news/results.


Really interesting. Can't thank you enough for spending time in chasing 
this down.
Turns out this is a simulator bug.  The simulator assumed that "long" 
types were 32 bits and implemented sextb as x <<= 24; x >>= 24;  This is 
a fairly common error.  If "long" is 64 bits on the host, then that 
approach doesn't work well.




Jeff


Re: [RISC-V]: Re: cpymem for RISCV with v extension

2023-10-04 Thread Patrick O'Neill

On 10/4/23 12:19, Joern Rennecke wrote:


On Wed, 4 Oct 2023 at 18:38, Patrick O'Neill  wrote:

Hi Joern,

I'm seeing new failures introduced by this patch
(9464e72bcc9123b619215af8cfef491772a3ebd9).

On rv64gcv:
FAIL: gcc.dg/pr90263.c scan-assembler memcpy

My testing didn't flag this because I used elf targets.  The
expected behaviour now is to use vector instructions for rvv.
so we shouldn't expect memcpy to appear there.  I think the
rvv case is suitably covered by the new tests, so we just
have to avoid the failure here.  Does the attached patch work for you?


Thanks for the quick response. I'm glad to hear the behavior is expected :)
The attached patch works, just needed some syntax changes:
ERROR: gcc.dg/pr90263.c: error executing dg-final: syntax error in target selector 
"target i?86-*-* x86_64-*-* riscv_v"
Diff w/ syntax changes:
diff --git a/gcc/testsuite/gcc.dg/pr90263.c b/gcc/testsuite/gcc.dg/pr90263.c
index 3222a5331c1..4044e6b1544 100644
--- a/gcc/testsuite/gcc.dg/pr90263.c
+++ b/gcc/testsuite/gcc.dg/pr90263.c
@@ -9,4 +9,4 @@ int *f (int *p, int *q, long n)
 }

 /* { dg-final { scan-assembler "mempcpy" { target { i?86-*-* x86_64-*-* } } } 
} */
-/* { dg-final { scan-assembler "memcpy" { target { ! { i?86-*-* x86_64-*-* } } 
} } } */
+/* { dg-final { scan-assembler "memcpy" { target { ! { { i?86-*-* x86_64-*-* } 
|| { riscv_v } } } } } } */

I'll send it as a patch shortly.

Patrick


FAIL: gfortran.fortran-torture/execute/intrinsic_count.f90 execution,
-O2 -fomit-frame-pointer -finline-functions -funroll-loops

There seems to be an issue with my test setup regarding fortran, I'll
have to investigate.

[PATCH] RISC-V: xfail gcc.dg/pr90263.c for riscv_v

2023-10-04 Thread Patrick O'Neill
Since r14-4358-g9464e72bcc9 riscv_v targets use vector instructions to
perform a memcpy. We no longer expect memcpy for riscv_v targets.

gcc/testsuite/ChangeLog:

* gcc.dg/pr90263.c: xfail riscv_v targets.

Signed-off-by: Patrick O'Neill 
Co-authored-by: Joern Rennecke 
---
 gcc/testsuite/gcc.dg/pr90263.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.dg/pr90263.c b/gcc/testsuite/gcc.dg/pr90263.c
index 3222a5331c1..13f4a85fa0f 100644
--- a/gcc/testsuite/gcc.dg/pr90263.c
+++ b/gcc/testsuite/gcc.dg/pr90263.c
@@ -9,4 +9,4 @@ int *f (int *p, int *q, long n)
 }

 /* { dg-final { scan-assembler "mempcpy" { target { i?86-*-* x86_64-*-* } } } 
} */
-/* { dg-final { scan-assembler "memcpy" { target { ! { i?86-*-* x86_64-*-* } } 
} } } */
+/* { dg-final { scan-assembler "memcpy" { target { ! { i?86-*-* x86_64-*-* } } 
xfail { riscv_v } } } } */
--
2.34.1



Re: [PATCH v6] Implement new RTL optimizations pass: fold-mem-offsets.

2023-10-04 Thread Jeff Law




On 10/3/23 05:45, Manolis Tsamis wrote:

This is a new RTL pass that tries to optimize memory offset calculations
by moving them from add immediate instructions to the memory loads/stores.
For example it can transform this:

   addi t4,sp,16
   add  t2,a6,t4
   shl  t3,t2,1
   ld   a2,0(t3)
   addi a2,1
   sd   a2,8(t2)

into the following (one instruction less):

   add  t2,a6,sp
   shl  t3,t2,1
   ld   a2,32(t3)
   addi a2,1
   sd   a2,24(t2)

Although there are places where this is done already, this pass is more
powerful and can handle the more difficult cases that are currently not
optimized. Also, it runs late enough and can optimize away unnecessary
stack pointer calculations.

gcc/ChangeLog:

* Makefile.in: Add fold-mem-offsets.o.
* passes.def: Schedule a new pass.
* tree-pass.h (make_pass_fold_mem_offsets): Declare.
* common.opt: New options.
* doc/invoke.texi: Document new option.
* fold-mem-offsets.cc: New file.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/fold-mem-offsets-1.c: New test.
* gcc.target/riscv/fold-mem-offsets-2.c: New test.
* gcc.target/riscv/fold-mem-offsets-3.c: New test.

Signed-off-by: Manolis Tsamis 



So I was ready to ACK, but realized there weren't any testresults for a 
primary platform mentioned.  So I ran this on x86.


It's triggering one regression (code quality).

Specifically gcc.target/i386/pr52146.c

The f-m-o code is slightly worse than without f-m-o.

Without f-m-o we get this:

   9  B88000E0  movl$-18874240, %eax
   9  FE
  10 0005 67C7  movl$0, (%eax)
  10  00
  11 000c C3ret

With f-m-o we get this:

   9  B800  movl$0, %eax
   9  00
  10 0005 67C78080  movl$0, -18874240(%eax)
  10  00E0FE00
  10  00
  11 0010 C3ret


The key being that we don't get rid of the original move instruction, 
nor does the original move instruction get smaller due to simplification 
of its constant.  Additionally, the memory store gets larger.  The net 
is a 4 byte increase in code size.



This is probably a fairly rare scenario and the original bug report was 
for a correctness issue in using addresses in the range 
0x8000..0x in x32.  So I wouldn't lose any sleep if we 
adjusted the test to pass -fno-fold-mem-offsets.  But before doing that 
I wanted to give you the chance to ponder if this is something you'd 
prefer to improve in f-m-o itself.   At some level if the base register 
collapses down to 0, then we could take the offset as a constant address 
and try to recognize that form.  If that fails, then just consider the 
change unprofitable rather than trying to recognize it as reg+d.


Anyway, waiting to hear your thoughts...

If we do a V7, then we need to fix one spelling issue that shows up in 
several places (if we go with the v6 we can just fix it prior to 
committing).  Specifically in several places we need to replace 
"recognised" with "recognized".



jeff


Re: [PATCH] RISC-V: xfail gcc.dg/pr90263.c for riscv_v

2023-10-04 Thread Jeff Law




On 10/4/23 15:57, Patrick O'Neill wrote:

Since r14-4358-g9464e72bcc9 riscv_v targets use vector instructions to
perform a memcpy. We no longer expect memcpy for riscv_v targets.

gcc/testsuite/ChangeLog:

* gcc.dg/pr90263.c: xfail riscv_v targets.
Or rather than XFAIL skip the test?  XFAIL kind of implies its something 
we'd like to fix.  But in this case we don't want a memcpy call as the 
inlined vector implementation is almost certainly better.


You might be able to use riscv_v in a dg-skip-if directive.  Not sure.


Jeff


[PATCH] Support g++ 4.8 as a host compiler.

2023-10-04 Thread Roger Sayle

The recent patch to remove poly_int_pod triggers a bug in g++ 4.8.5's
C++ 11 support which mistakenly believes poly_uint16 has a non-trivial
constructor.  This in turn prohibits it from being used as a member in
a union (rtxunion) that constructed statically, resulting in a (fatal)
error during stage 1.  A workaround is to add an explicit constructor
to the problematic union, which allows mainline to be bootstrapped with
the system compiler on older RedHat 7 systems.

This patch has been tested on x86_64-pc-linux-gnu where it allows a
bootstrap to complete when using g++ 4.8.5 as the host compiler.
Ok for mainline?


2023-10-04  Roger Sayle  

gcc/ChangeLog
* rtl.h (rtx_def::u): Add explicit constructor to workaround
issue using g++ 4.8 as a host compiler.

diff --git a/gcc/rtl.h b/gcc/rtl.h
index 6850281..a7667f5 100644
--- a/gcc/rtl.h
+++ b/gcc/rtl.h
@@ -451,6 +451,9 @@ struct GTY((desc("0"), tag("0"),
 struct fixed_value fv;
 struct hwivec_def hwiv;
 struct const_poly_int_def cpi;
+#if defined(__GNUC__) && GCC_VERSION < 5000
+u () {}
+#endif
   } GTY ((special ("rtx_def"), desc ("GET_CODE (&%0)"))) u;
 };
 


Re: [PATCH] RISC-V: xfail gcc.dg/pr90263.c for riscv_v

2023-10-04 Thread Patrick O'Neill



On 10/4/23 15:14, Jeff Law wrote:



On 10/4/23 15:57, Patrick O'Neill wrote:

Since r14-4358-g9464e72bcc9 riscv_v targets use vector instructions to
perform a memcpy. We no longer expect memcpy for riscv_v targets.

gcc/testsuite/ChangeLog:

* gcc.dg/pr90263.c: xfail riscv_v targets.
Or rather than XFAIL skip the test?  XFAIL kind of implies its 
something we'd like to fix.  But in this case we don't want a memcpy 
call as the inlined vector implementation is almost certainly better.

Ah. Since XFAIL notifies us if a test starts passing (via xpass) I
thought it would help us ensure the test doesn't start passing
on riscv_v. I didn't know it implied something needed to be fixed.

I'll rework it to skip riscv_v targets.

Thanks,
Patrick


You might be able to use riscv_v in a dg-skip-if directive.  Not sure.


Jeff


Re: [PATCH] RISC-V: xfail gcc.dg/pr90263.c for riscv_v

2023-10-04 Thread Jeff Law




On 10/4/23 16:21, Patrick O'Neill wrote:


On 10/4/23 15:14, Jeff Law wrote:



On 10/4/23 15:57, Patrick O'Neill wrote:

Since r14-4358-g9464e72bcc9 riscv_v targets use vector instructions to
perform a memcpy. We no longer expect memcpy for riscv_v targets.

gcc/testsuite/ChangeLog:

* gcc.dg/pr90263.c: xfail riscv_v targets.
Or rather than XFAIL skip the test?  XFAIL kind of implies its 
something we'd like to fix.  But in this case we don't want a memcpy 
call as the inlined vector implementation is almost certainly better.

Ah. Since XFAIL notifies us if a test starts passing (via xpass) I
thought it would help us ensure the test doesn't start passing
on riscv_v. I didn't know it implied something needed to be fixed.

I'll rework it to skip riscv_v targets.

Hopefully that works.

If you wanted a test to verify that we don't go backwards and start 
emitting a memcpy, you can set up a test like


// dg-directives
#include "pr90263.c"

// dg directives for scanning

Where the scanning verifies that we don't have a call to memcpy.  The 
kind of neat thing here is the dg directives in the included file are 
ignored, so you can use the same test sources in multiple ways.


Given this is kindof specific to risc-v, it might make more sense in the 
riscv directory.


Jeff


[PATCH v2] RISC-V: Test memcpy inlined on riscv_v

2023-10-04 Thread Patrick O'Neill
Since r14-4358-g9464e72bcc9 riscv_v targets use vector instructions to
perform a memcpy. We no longer expect memcpy for riscv_v targets.

gcc/testsuite/ChangeLog:

* gcc.dg/pr90263.c: Skip riscv_v targets.
* gcc.target/riscv/rvv/base/pr90263.c: New test.

Signed-off-by: Patrick O'Neill 
Co-authored-by: Joern Rennecke 
---
Changes from v1:
[PATCH] RISC-V: xfail gcc.dg/pr90263.c for riscv_v
- Skip test rather than xfailing for riscv_v.
- Add testcase to ensure memcpy is not emitted on riscv_v.
---
 gcc/testsuite/gcc.dg/pr90263.c| 1 +
 gcc/testsuite/gcc.target/riscv/rvv/base/pr90263.c | 7 +++
 2 files changed, 8 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr90263.c

diff --git a/gcc/testsuite/gcc.dg/pr90263.c b/gcc/testsuite/gcc.dg/pr90263.c
index 3222a5331c1..831e098783b 100644
--- a/gcc/testsuite/gcc.dg/pr90263.c
+++ b/gcc/testsuite/gcc.dg/pr90263.c
@@ -2,6 +2,7 @@
 /* { dg-do compile } */
 /* { dg-options "-O2" } */
 /* { dg-require-effective-target glibc } */
+/* { dg-skip-if "riscv_v uses an inline memcpy routine" { riscv_v } }*/

 int *f (int *p, int *q, long n)
 {
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr90263.c 
b/gcc/testsuite/gcc.target/riscv/rvv/base/pr90263.c
new file mode 100644
index 000..7308428e2c3
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr90263.c
@@ -0,0 +1,7 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+/* { dg-require-effective-target glibc } */
+
+#include "../../../../gcc.dg/pr90263.c"
+
+/* { dg-final { scan-assembler-not "memcpy" { target { riscv_v } } } } */
--
2.34.1



  1   2   >