Re: [PATCH v2] Do not abort compilation when dump file is /dev/*

2021-11-19 Thread Richard Biener via Gcc-patches
On Thu, 18 Nov 2021, Giuliano Belinassi wrote:

> The `configure` scripts generated with autoconf often tests compiler
> features by setting output to `/dev/null`, which then sets the dump
> folder as being /dev/* and the compilation halts with an error because
> GCC cannot create files in /dev/. This is a problem when configure is
> testing for compiler features because it cannot tell if the failure was
> due to unsupported features or any other problem, and disable it even
> if it is working.
> 
> As an example, running configure overriding CFLAGS="-fdump-ipa-clones"
> will result in several compiler-features as being disabled because of
> gcc halting with an error creating files in /dev/*.
> 
> This commit fixes this issue by checking if the output file is
> /dev/null or /dev/zero. In this case we use the current working
> directory for dump output instead of the directory of the output
> file because we cannot write to /dev/*.

Comments below.

> gcc/ChangeLog
> 2021-11-16  Giuliano Belinassi  
> 
>   * gcc.c (process_command): Skip dumpdir override on -o /dev/null
> or -o /dev/zero.
> 
> gcc/testsuite/ChangeLog
> 2021-11-16  Giuliano Belinassi  
> 
>   * gcc.dg/devnull-dump.c: New.
> 
> Signed-off-by: Giuliano Belinassi 
> ---
>  gcc/doc/invoke.texi |  4 +++-
>  gcc/gcc.c   | 19 +++
>  gcc/testsuite/gcc.dg/devnull-dump.c |  7 +++
>  3 files changed, 25 insertions(+), 5 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.dg/devnull-dump.c
> 
> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index 6070288856c..4c056606e0c 100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -1877,7 +1877,9 @@ named @file{dir/bar.*}, combining the given 
> @var{dumppfx} with the
>  default @var{dumpbase} derived from the primary output name.  Dump
>  outputs also take the input name suffix: @file{dir/bar.c.*}.
>  
> -It defaults to the location of the output file; options
> +It defaults to the location of the output file, unless @option{-o /dev/null}
> +or @option{-o /dev/zero} is passed to the compiler: on those cases the 
> @option{cwd}
> +will be used instead. Options

"It defaults to the location of the output file, unless the output
file is a special file like @code{/dev/null}."

>  @option{-save-temps=cwd} and @option{-save-temps=obj} override this
>  default, just like an explicit @option{-dumpdir} option.  In case
>  multiple such options are given, the last one prevails:
> diff --git a/gcc/gcc.c b/gcc/gcc.c
> index 506c2acc282..0173018ac04 100644
> --- a/gcc/gcc.c
> +++ b/gcc/gcc.c
> @@ -5109,10 +5109,21 @@ process_command (unsigned int decoded_options_count,
>  {
>free (dumpdir);
>dumpdir = NULL;
> -  temp = lbasename (output_file);
> -  if (temp != output_file)
> - dumpdir = xstrndup (output_file,
> - strlen (output_file) - strlen (temp));
> +  if (strcmp(output_file, "/dev/null") == 0
> +   || strcmp(output_file, "/dev/zero") == 0)

Please merge this into the condition of the block and update the comment

  /* If -save-temps=obj and -o name, create the prefix to use for %b.
 Otherwise just make -save-temps=obj the same as -save-temps=cwd.  */
  else if (save_temps_flag != SAVE_TEMPS_CWD && output_file != NULL)

please split out

  strcmp(output_file, "/dev/null") == 0
  || strcmp(output_file, "/dev/zero") == 0

into a bool special_output_location (const char *) function.  For
example on Windows the special output location would be NUL
(appearantly any path ending in NUL works like that - huh).

Note it might be possible to stat() the file and simply behave
this way whenever it exists and is not a regular file.  For start
just matching literal /dev/null is good enough IMHO (did you
run into cases with /dev/zero?).

Thanks,
Richard.

> + {
> +   /* If output is /dev/null or /dev/zero, we cannot set the dumpdir to
> +  /dev/ because no files can be created on that directory.  In this
> +  case, we do nothing.  */
> + }
> +  else
> + {
> +   /* Set dumpdir as being the same directory where the output file is.  
> */
> +   temp = lbasename (output_file);
> +   if (temp != output_file)
> + dumpdir = xstrndup (output_file,
> + strlen (output_file) - strlen (temp));
> + }
>  }
>else if (dumpdir)
>  {
> diff --git a/gcc/testsuite/gcc.dg/devnull-dump.c 
> b/gcc/testsuite/gcc.dg/devnull-dump.c
> new file mode 100644
> index 000..378e0901c28
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/devnull-dump.c
> @@ -0,0 +1,7 @@
> +/* { dg-do assemble } */
> +/* { dg-options "-fdump-ipa-clones -o /dev/null" } */
> +
> +int main()
> +{
> +  return 0;
> +}
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Ivo Totev; HRB 36809 (AG Nuernberg)


[PATCH] darwin, d: Support outfile substitution for liphobos

2021-11-19 Thread Iain Buclaw via Gcc-patches
Hi,

This patch fixes a stage2 bootstrap failure in the D front-end on
darwin due to libgphobos being dynamically linked despite
-static-libphobos being on the command line.

In the gdc driver, this takes the previous fix for the Darwin D
bootstrap, and extends it to the -static-libphobos option as well.
Rather than pushing the -static-libphobos option back onto the command
line, the setting of SKIPOPT is instead conditionally removed.  The same
change has been repeated for -static-libstdc++ so there is now no need
to call generate_option to re-add it.

In the gcc driver, -static-libphobos has been added as a common option,
validated, and a new outfile substition added to config/darwin.h to
correctly replace -lgphobos with libgphobos.a.

Bootstrapped and regression tested on x86_64-linux-gnu and
x86_64-apple-darwin20.

OK for mainline?  This would also be fine for gcc-11 release branch too,
as well as earlier releases with D support.

Regards,
Iain.

---
gcc/ChangeLog:

* common.opt (static-libphobos): Add option.
* config/darwin.h (LINK_SPEC): Substitute -lgphobos with libgphobos.a
when linking statically.
* gcc.c (driver_handle_option): Set -static-libphobos as always valid.

gcc/d/ChangeLog:

* d-spec.cc (lang_specific_driver): Set SKIPOPT on -static-libstdc++
and -static-libphobos only when target supports LD_STATIC_DYNAMIC.
Remove generate_option to re-add -static-libstdc++.
---
 gcc/common.opt  |  4 
 gcc/config/darwin.h |  1 +
 gcc/d/d-spec.cc | 18 +++---
 gcc/gcc.c   |  6 --
 4 files changed, 20 insertions(+), 9 deletions(-)

diff --git a/gcc/common.opt b/gcc/common.opt
index db6010e4e20..73c12d933f3 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -3527,6 +3527,10 @@ static-libgfortran
 Driver
 ; Documented for Fortran, but always accepted by driver.
 
+static-libphobos
+Driver
+; Documented for D, but always accepted by driver.
+
 static-libstdc++
 Driver
 
diff --git a/gcc/config/darwin.h b/gcc/config/darwin.h
index 7ed01efa694..c4ddd623e8b 100644
--- a/gcc/config/darwin.h
+++ b/gcc/config/darwin.h
@@ -443,6 +443,7 @@ extern GTY(()) int darwin_ms_struct;
  %:replace-outfile(-lobjc libobjc-gnu.a%s); \
 :%:replace-outfile(-lobjc -lobjc-gnu )}}\
%{static|static-libgcc|static-libgfortran:%:replace-outfile(-lgfortran 
libgfortran.a%s)}\
+   %{static|static-libgcc|static-libphobos:%:replace-outfile(-lgphobos 
libgphobos.a%s)}\

%{static|static-libgcc|static-libstdc++|static-libgfortran:%:replace-outfile(-lgomp
 libgomp.a%s)}\
%{static|static-libgcc|static-libstdc++:%:replace-outfile(-lstdc++ 
libstdc++.a%s)}\
%{force_cpusubtype_ALL:-arch %(darwin_arch)} \
diff --git a/gcc/d/d-spec.cc b/gcc/d/d-spec.cc
index b12d28f1047..73ecac3bbf1 100644
--- a/gcc/d/d-spec.cc
+++ b/gcc/d/d-spec.cc
@@ -253,13 +253,23 @@ lang_specific_driver (cl_decoded_option 
**in_decoded_options,
 
case OPT_static_libstdc__:
  saw_static_libcxx = true;
+#ifndef HAVE_LD_STATIC_DYNAMIC
+ /* Remove -static-libstdc++ from the command only if target supports
+LD_STATIC_DYNAMIC.  When not supported, it is left in so that a
+back-end target can use outfile substitution.  */
  args[i] |= SKIPOPT;
+#endif
  break;
 
case OPT_static_libphobos:
  if (phobos_library != PHOBOS_NOLINK)
phobos_library = PHOBOS_STATIC;
+#ifndef HAVE_LD_STATIC_DYNAMIC
+ /* Remove -static-libphobos from the command only if target supports
+LD_STATIC_DYNAMIC.  When not supported, it is left in so that a
+back-end target can use outfile substitution.  */
  args[i] |= SKIPOPT;
+#endif
  break;
 
case OPT_shared_libphobos:
@@ -460,7 +470,7 @@ lang_specific_driver (cl_decoded_option 
**in_decoded_options,
 #endif
 }
 
-  if (saw_libcxx || need_stdcxx)
+  if (saw_libcxx || saw_static_libcxx || need_stdcxx)
 {
 #ifdef HAVE_LD_STATIC_DYNAMIC
   if (saw_static_libcxx && !static_link)
@@ -468,12 +478,6 @@ lang_specific_driver (cl_decoded_option 
**in_decoded_options,
  generate_option (OPT_Wl_, LD_STATIC_OPTION, 1, CL_DRIVER,
   &new_decoded_options[j++]);
}
-#else
-  /* Push the -static-libstdc++ option back onto the command so that
-a target without LD_STATIC_DYNAMIC can use outfile substitution.  */
-  if (saw_static_libcxx && !static_link)
-   generate_option (OPT_static_libstdc__, NULL, 1, CL_DRIVER,
-&new_decoded_options[j++]);
 #endif
   if (saw_libcxx)
new_decoded_options[j++] = *saw_libcxx;
diff --git a/gcc/gcc.c b/gcc/gcc.c
index 506c2acc282..fea6d049183 100644
--- a/gcc/gcc.c
+++ b/gcc/gcc.c
@@ -4576,10 +4576,12 @@ driver_handle_option (struct gcc_options *opts,
 case OPT_static_libgcc:
 case OPT_shared_libgcc:
 case OPT_static_libgfortran:
+case OPT_

[PATCH] tree-optimization/102436 - restore loop store motion

2021-11-19 Thread Richard Biener via Gcc-patches
This restores a case of conditional store motion we fail to handle
after the rewrite.  We can recognize the special case of all
stores in a loop happening in a single conditionally executed
block which ensures stores are not re-ordered by executing them
in different loop iterations.  Separating out the case avoids
complicating the already complex main path.

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.

2021-11-18  Richard Biener  

PR tree-optimization/102436
* tree-ssa-loop-im.c (execute_sm_if_changed): Add mode
to just create the if structure and return the then block.
(execute_sm): Add flag to indicate the var will re-use
another flag var.
(hoist_memory_references): Support a single conditional
block with all stores as special case.

* gcc.dg/torture/2028-1.c: New testcase.
* gcc.dg/tree-ssa/ssa-lim-18.c: Likewise.
---
 gcc/testsuite/gcc.dg/torture/2028-1.c  |  27 
 gcc/testsuite/gcc.dg/tree-ssa/ssa-lim-18.c |  19 +++
 gcc/tree-ssa-loop-im.c | 162 +++--
 3 files changed, 199 insertions(+), 9 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/torture/2028-1.c
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ssa-lim-18.c

diff --git a/gcc/testsuite/gcc.dg/torture/2028-1.c 
b/gcc/testsuite/gcc.dg/torture/2028-1.c
new file mode 100644
index 000..67ef68420df
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/2028-1.c
@@ -0,0 +1,27 @@
+/* { dg-do run } */
+
+unsigned p[3];
+void __attribute__((noipa))
+bar (float *q, int n, int m)
+{
+  for (int i = 0; i < m; ++i)
+{
+  if (i == n)
+{
+  unsigned a = p[1];
+  p[1] = a + 1;
+  *q = 1.;
+}
+  q++;
+}
+}
+
+int main()
+{
+#if __SIZEOF_FLOAT__ == __SIZEOF_INT__
+  bar ((float *)p, 1, 3);
+  if (((float *)p)[1] != 1.)
+__builtin_abort ();
+#endif
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-lim-18.c 
b/gcc/testsuite/gcc.dg/tree-ssa/ssa-lim-18.c
new file mode 100644
index 000..da19806b712
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-lim-18.c
@@ -0,0 +1,19 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fstrict-aliasing -fdump-tree-lim2-details" } */
+
+unsigned p;
+
+void foo (float *q)
+{
+  for (int i = 0; i < 256; ++i)
+{
+  if (p)
+{
+  unsigned a = p;
+  *(q++) = 1.;
+  p = a + 1;
+}
+}
+}
+
+/* { dg-final { scan-tree-dump-times "Executing store motion" 1 "lim2" } } */
diff --git a/gcc/tree-ssa-loop-im.c b/gcc/tree-ssa-loop-im.c
index 8a81acae115..682406d26c1 100644
--- a/gcc/tree-ssa-loop-im.c
+++ b/gcc/tree-ssa-loop-im.c
@@ -1911,10 +1911,13 @@ first_mem_ref_loc (class loop *loop, im_mem_ref *ref)
}
  }
  if (lsm_flag) <--
-   MEM = lsm;  <--
+   MEM = lsm;  <-- (X)
+
+  In case MEM and TMP_VAR are NULL the function will return the then
+  block so the caller can insert (X) and other related stmts. 
 */
 
-static void
+static basic_block
 execute_sm_if_changed (edge ex, tree mem, tree tmp_var, tree flag,
   edge preheader, hash_set  *flag_bbs,
   edge &append_cond_position, edge &last_cond_fallthru)
@@ -2009,10 +2012,13 @@ execute_sm_if_changed (edge ex, tree mem, tree tmp_var, 
tree flag,
NULL_TREE, NULL_TREE);
   gsi_insert_after (&gsi, stmt, GSI_CONTINUE_LINKING);
 
-  gsi = gsi_start_bb (then_bb);
   /* Insert actual store.  */
-  stmt = gimple_build_assign (unshare_expr (mem), tmp_var);
-  gsi_insert_after (&gsi, stmt, GSI_CONTINUE_LINKING);
+  if (mem)
+{
+  gsi = gsi_start_bb (then_bb);
+  stmt = gimple_build_assign (unshare_expr (mem), tmp_var);
+  gsi_insert_after (&gsi, stmt, GSI_CONTINUE_LINKING);
+}
 
   edge e1 = single_succ_edge (new_bb);
   edge e2 = make_edge (new_bb, then_bb,
@@ -2060,6 +2066,8 @@ execute_sm_if_changed (edge ex, tree mem, tree tmp_var, 
tree flag,
  update_stmt (phi);
}
   }
+
+  return then_bb;
 }
 
 /* When REF is set on the location, set flag indicating the store.  */
@@ -2117,7 +2125,8 @@ struct sm_aux
 
 static void
 execute_sm (class loop *loop, im_mem_ref *ref,
-   hash_map &aux_map, bool maybe_mt)
+   hash_map &aux_map, bool maybe_mt,
+   bool use_other_flag_var)
 {
   gassign *load;
   struct fmt_data fmt_data;
@@ -2146,7 +2155,7 @@ execute_sm (class loop *loop, im_mem_ref *ref,
  || (! flag_store_data_races && ! always_stored)))
 multi_threaded_model_p = true;
 
-  if (multi_threaded_model_p)
+  if (multi_threaded_model_p && !use_other_flag_var)
 aux->store_flag
   = execute_sm_if_changed_flag_set (loop, ref, &aux->flag_bbs);
   else
@@ -2182,7 +2191,7 @@ execute_sm (class loop *loop, im_mem_ref *ref,
   lim_data->tgt_loop = loop;
   gsi_insert_before (&gsi, load, GSI_SAME_STMT);
 
-  if (multi_threaded_model_p)
+  i

[PATCH]middle-end: Handle FMA_CONJ correctly after SLP layout update.

2021-11-19 Thread Tamar Christina via Gcc-patches
Hi All,

Apologies, I got dinged by the i386 regressions bot for a test I didn't have in
my tree at the time I made the previous patch.  The bot was telling me that FMA
stopped working after I strengthened the FMA check in the previous patch.

The reason is that the check is slightly early.  The first check can indeed only
exit early when either node isn't a mult.  However we need to delay till we know
if the node is a MUL or FMA before enforcing that both nodes must be a MULT
since the node to inspect is different if the operation is a MUL or FMA.

Also with the update patch for GCC 11 tree layout update to the new GCC 12 one
I had missed that the difference in which node is conjucated is not symmetrical.

So the test for it can just be testing the inverse order.  It was Currently
no detecting when the first node was conjucated instead of the second one.

This also made me wonder why my own test didn't detect this.  It turns out that
the tests, being copied from the _Float16 ones were incorrectly marked as
xfail.  The _Float16 ones are marked as xfail since C doesn't have a conj
operation for _Float16, which means you get extra type-casts in between.

While you could use the GCC _Complex extension here I opted to mark them xfail
since I wanted to include detection over the widenings next year.

Secondly the double tests were being skipped because Adv. SIMD was missing from
targets supporting Complex Double vectorization.

With these changes all other tests run and pass and only XFAIL ones are
correctly the _Float16 ones.  Sorry for missing this before, testing should now
cover all cases.

Bootstrapped Regtested on aarch64-none-linux-gnu,
x86_64-pc-linux-gnu and no regressions.

Ok for master?

Thanks,
Tamar

gcc/ChangeLog:

PR tree-optimization/103311
* tree-vect-slp-patterns.c (vect_validate_multiplication): Fix CONJ
test to new codegen.
(complex_mul_pattern::matches): Move check downwards.

gcc/testsuite/ChangeLog:

PR tree-optimization/103311
* gcc.dg/vect/complex/fast-math-bb-slp-complex-mla-double.c: Fix it.
* gcc.dg/vect/complex/fast-math-bb-slp-complex-mla-float.c: Likewise.
* gcc.dg/vect/complex/fast-math-bb-slp-complex-mls-double.c: Likewise.
* gcc.dg/vect/complex/fast-math-bb-slp-complex-mls-float.c: Likewise.
* gcc.dg/vect/complex/fast-math-bb-slp-complex-mul-double.c: Likewise.
* gcc.dg/vect/complex/fast-math-bb-slp-complex-mul-float.c: Likewise.
* lib/target-supports.exp
(check_effective_target_vect_complex_add_double): Add Adv. SIMD.

--- inline copy of patch -- 
diff --git 
a/gcc/testsuite/gcc.dg/vect/complex/fast-math-bb-slp-complex-mla-double.c 
b/gcc/testsuite/gcc.dg/vect/complex/fast-math-bb-slp-complex-mla-double.c
index 
462063abc3041d466e8b0261252054a46ef48e15..e13fa3bd31bab016c7a033a7ed590047c757882b
 100644
--- a/gcc/testsuite/gcc.dg/vect/complex/fast-math-bb-slp-complex-mla-double.c
+++ b/gcc/testsuite/gcc.dg/vect/complex/fast-math-bb-slp-complex-mla-double.c
@@ -6,5 +6,5 @@
 #define N 16
 #include "complex-mla-template.c"
 
-/* { dg-final { scan-tree-dump "Found COMPLEX_FMA_CONJ" "slp1" } } */
-/* { dg-final { scan-tree-dump "Found COMPLEX_FMA" "slp1" } } */
+/* { dg-final { scan-tree-dump "Found COMPLEX_FMA_CONJ" "vect" } } */
+/* { dg-final { scan-tree-dump "Found COMPLEX_FMA" "vect" } } */
diff --git 
a/gcc/testsuite/gcc.dg/vect/complex/fast-math-bb-slp-complex-mla-float.c 
b/gcc/testsuite/gcc.dg/vect/complex/fast-math-bb-slp-complex-mla-float.c
index 
a88adc8184ece3d6d61c66a42233fde0f1721a78..2d774315df96a37759021ab64e0bdb4eb6292f6e
 100644
--- a/gcc/testsuite/gcc.dg/vect/complex/fast-math-bb-slp-complex-mla-float.c
+++ b/gcc/testsuite/gcc.dg/vect/complex/fast-math-bb-slp-complex-mla-float.c
@@ -6,5 +6,5 @@
 #define TYPE float
 #define N 16
 #include "complex-mla-template.c"
-/* { dg-final { scan-tree-dump "Found COMPLEX_FMA_CONJ" "slp1" { xfail *-*-* } 
} } */
-/* { dg-final { scan-tree-dump "Found COMPLEX_FMA" "slp1" { xfail *-*-* } } } 
*/
+/* { dg-final { scan-tree-dump "Found COMPLEX_FMA_CONJ" "vect" } } */
+/* { dg-final { scan-tree-dump "Found COMPLEX_FMA" "vect" } } */
diff --git 
a/gcc/testsuite/gcc.dg/vect/complex/fast-math-bb-slp-complex-mls-double.c 
b/gcc/testsuite/gcc.dg/vect/complex/fast-math-bb-slp-complex-mls-double.c
index 
a434fd1f1d3235392a0240b5861c85d39c48d551..bfb62f21f972c5a02af4952d2db1799f37011cad
 100644
--- a/gcc/testsuite/gcc.dg/vect/complex/fast-math-bb-slp-complex-mls-double.c
+++ b/gcc/testsuite/gcc.dg/vect/complex/fast-math-bb-slp-complex-mls-double.c
@@ -6,7 +6,5 @@
 #define N 16
 #include "complex-mls-template.c"
 
-/* { dg-final { scan-tree-dump "Found COMPLEX_ADD_ROT270" "slp1" } } */
-/* { dg-final { scan-tree-dump "Found COMPLEX_FMA" "slp1" } } */
-/* { dg-final { scan-tree-dump "Found COMPLEX_FMS_CONJ" "slp1" } } */
-/* { dg-final { scan-tree-dump "Found COMPLEX_FMS" "slp1" } } */
+/* { dg-final { scan-tree-dump "Found COMPLEX_FMS_CON

[PATCH][committed][libstdc++] Fix ctype changed after newlib update.

2021-11-19 Thread Tamar Christina via Gcc-patches
Hi All,

Newlib changed ctype.h recently[1] by moving the short labels from ctype.h intro
the private namespace in ctype_.h.  This broke embedded builds due to them no
longer being found.  Instead they now expose the long names to match glibc.

This patch now uses the short or long names depending on is the short ones are
defined or not.

[1] 
https://github.com/bminor/newlib/commit/3ba1bd0d9dbc015c14a0aaafcef042f706d1249a

Regtested on aarch64-none-elf and no issues.

Patch was pre-approve by Jonathan.

Thanks,
Tamar

libstdc++-v3/ChangeLog:

PR libstdc++/103305
* config/os/newlib/ctype_base.h (upper, lower, alpha, digit, xdigit,
space, print, graph, cntrl, punct, alnum, blank): Use short or long
names depending on if short ones are defined.

--- inline copy of patch -- 
diff --git a/libstdc++-v3/config/os/newlib/ctype_base.h 
b/libstdc++-v3/config/os/newlib/ctype_base.h
index 
33654d7794a0cff3ba4ebe78e28882d78866dd94..3a7477afdcc76e5a5bc6b0a86d2264c5bc8ab7f8
 100644
--- a/libstdc++-v3/config/os/newlib/ctype_base.h
+++ b/libstdc++-v3/config/os/newlib/ctype_base.h
@@ -41,6 +41,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 // NB: Offsets into ctype::_M_table force a particular size
 // on the mask type. Because of this, we don't use an enum.
 typedef char   mask;
+#if defined _U && defined _L && defined _N && defined _S
 static const mask upper= _U;
 static const mask lower= _L;
 static const mask alpha= _U | _L;
@@ -52,8 +53,24 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 static const mask cntrl= _C;
 static const mask punct= _P;
 static const mask alnum= _U | _L | _N;
-#if __cplusplus >= 201103L
+# if __cplusplus >= 201103L
 static const mask blank= space;
+# endif
+#else
+static const mask upper= _ISupper;
+static const mask lower= _ISlower;
+static const mask alpha= _ISupper | _ISlower;
+static const mask digit= _ISdigit;
+static const mask xdigit   = _ISxdigit | _ISdigit;
+static const mask space= _ISspace;
+static const mask print= _ISpunct | _ISupper | _ISlower | _ISdigit | 
_ISblank;
+static const mask graph= _ISpunct | _ISupper | _ISlower | _ISdigit;
+static const mask cntrl= _IScntrl;
+static const mask punct= _ISpunct;
+static const mask alnum= _ISupper | _ISlower | _ISdigit;
+# if __cplusplus >= 201103L
+static const mask blank=  _ISspace | _ISblank;
+# endif
 #endif
   };
 


-- 
diff --git a/libstdc++-v3/config/os/newlib/ctype_base.h b/libstdc++-v3/config/os/newlib/ctype_base.h
index 33654d7794a0cff3ba4ebe78e28882d78866dd94..3a7477afdcc76e5a5bc6b0a86d2264c5bc8ab7f8 100644
--- a/libstdc++-v3/config/os/newlib/ctype_base.h
+++ b/libstdc++-v3/config/os/newlib/ctype_base.h
@@ -41,6 +41,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 // NB: Offsets into ctype::_M_table force a particular size
 // on the mask type. Because of this, we don't use an enum.
 typedef char 		mask;
+#if defined _U && defined _L && defined _N && defined _S
 static const mask upper	= _U;
 static const mask lower 	= _L;
 static const mask alpha 	= _U | _L;
@@ -52,8 +53,24 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 static const mask cntrl 	= _C;
 static const mask punct 	= _P;
 static const mask alnum 	= _U | _L | _N;
-#if __cplusplus >= 201103L
+# if __cplusplus >= 201103L
 static const mask blank 	= space;
+# endif
+#else
+static const mask upper= _ISupper;
+static const mask lower= _ISlower;
+static const mask alpha= _ISupper | _ISlower;
+static const mask digit= _ISdigit;
+static const mask xdigit   = _ISxdigit | _ISdigit;
+static const mask space= _ISspace;
+static const mask print= _ISpunct | _ISupper | _ISlower | _ISdigit | _ISblank;
+static const mask graph= _ISpunct | _ISupper | _ISlower | _ISdigit;
+static const mask cntrl= _IScntrl;
+static const mask punct= _ISpunct;
+static const mask alnum= _ISupper | _ISlower | _ISdigit;
+# if __cplusplus >= 201103L
+static const mask blank=  _ISspace | _ISblank;
+# endif
 #endif
   };
 



RE: [PATCH] middle-end: fix de-optimizations with bitclear patterns on signed values

2021-11-19 Thread Tamar Christina via Gcc-patches
Ping

> -Original Message-
> From: Tamar Christina
> Sent: Friday, November 12, 2021 7:31 AM
> To: Jakub Jelinek 
> Cc: Jonathan Wakely ; Richard Biener
> ; gcc-patches@gcc.gnu.org; nd 
> Subject: RE: [PATCH] middle-end: fix de-optimizations with bitclear patterns
> on signed values
> 
> 
> 
> > -Original Message-
> > From: Jakub Jelinek 
> > Sent: Thursday, November 4, 2021 4:11 PM
> > To: Tamar Christina 
> > Cc: Jonathan Wakely ; Richard Biener
> > ; gcc-patches@gcc.gnu.org; nd 
> > Subject: Re: [PATCH] middle-end: fix de-optimizations with bitclear
> > patterns on signed values
> >
> > On Thu, Nov 04, 2021 at 12:19:34PM +, Tamar Christina wrote:
> > > I'm not sure the precision matters since if the conversion resulted
> > > in not enough precision such that It influences the compare it would
> > > have
> > been optimized out.
> >
> > You can't really rely on other optimizations being performed.  They
> > will usually happen, but might not because such code only materialized
> > short time ago without folding happening in between, or some debug
> > counters or -
> > fno-* disabling some passes, ...
> 
> Fair point, I have separated out the logic as you requested and added the
> debug fix.
> 
> Bootstrapped Regtested on aarch64-none-linux-gnu, x86_64-pc-linux-gnu
> and no regressions.
> 
> Ok for master?
> 
> Thanks,
> Tamar
> 
> gcc/ChangeLog:
> 
>   * tree-ssa-phiopt.c (spaceship_replacement): Handle new canonical
>   codegen.
> 
> --- inline copy of patch ---
> 
> diff --git a/gcc/tree-ssa-phiopt.c b/gcc/tree-ssa-phiopt.c index
> 0e339c46afa29fa97f90d9bc4394370cd9b4b396..3ad5b23885a37eec0beff229e2
> a96e86658b2d1a 100644
> --- a/gcc/tree-ssa-phiopt.c
> +++ b/gcc/tree-ssa-phiopt.c
> @@ -2038,11 +2038,36 @@ spaceship_replacement (basic_block cond_bb,
> basic_block middle_bb,
>gimple *orig_use_stmt = use_stmt;
>tree orig_use_lhs = NULL_TREE;
>int prec = TYPE_PRECISION (TREE_TYPE (phires));
> -  if (is_gimple_assign (use_stmt)
> -  && gimple_assign_rhs_code (use_stmt) == BIT_AND_EXPR
> -  && TREE_CODE (gimple_assign_rhs2 (use_stmt)) == INTEGER_CST
> -  && (wi::to_wide (gimple_assign_rhs2 (use_stmt))
> -   == wi::shifted_mask (1, prec - 1, false, prec)))
> +  bool is_cast = false;
> +
> +  /* Deal with the case when match.pd has rewritten the (res & ~1) == 0
> + into res <= 1 and has left a type-cast for signed types.  */
> +  if (gimple_assign_cast_p (use_stmt))
> +{
> +  orig_use_lhs = gimple_assign_lhs (use_stmt);
> +  /* match.pd would have only done this for a signed type,
> +  so the conversion must be to an unsigned one.  */
> +  tree ty1 = TREE_TYPE (gimple_assign_rhs1 (use_stmt));
> +  tree ty2 = TREE_TYPE (orig_use_lhs);
> +
> +  if (!TYPE_UNSIGNED (ty2) || !INTEGRAL_TYPE_P (ty2))
> + return false;
> +  if (TYPE_PRECISION (ty1) != TYPE_PRECISION (ty2))
> + return false;
> +  if (SSA_NAME_OCCURS_IN_ABNORMAL_PHI (orig_use_lhs))
> + return false;
> +  if (EDGE_COUNT (phi_bb->preds) != 4)
> + return false;
> +  if (!single_imm_use (orig_use_lhs, &use_p, &use_stmt))
> + return false;
> +
> +  is_cast = true;
> +}
> +  else if (is_gimple_assign (use_stmt)
> +&& gimple_assign_rhs_code (use_stmt) == BIT_AND_EXPR
> +&& TREE_CODE (gimple_assign_rhs2 (use_stmt)) == INTEGER_CST
> +&& (wi::to_wide (gimple_assign_rhs2 (use_stmt))
> +== wi::shifted_mask (1, prec - 1, false, prec)))
>  {
>/* For partial_ordering result operator>= with unspec as second
>argument is (res & 1) == res, folded by match.pd into @@ -2099,7
> +2124,7 @@ spaceship_replacement (basic_block cond_bb, basic_block
> middle_bb,
>|| !tree_fits_shwi_p (rhs)
>|| !IN_RANGE (tree_to_shwi (rhs), -1, 1))
>  return false;
> -  if (orig_use_lhs)
> +  if (orig_use_lhs && !is_cast)
>  {
>if ((cmp != EQ_EXPR && cmp != NE_EXPR) || !integer_zerop (rhs))
>   return false;
> @@ -2310,62 +2335,101 @@ spaceship_replacement (basic_block cond_bb,
> basic_block middle_bb,
>  one_cmp = GT_EXPR;
> 
>enum tree_code res_cmp;
> -  switch (cmp)
> +
> +  if (is_cast)
>  {
> -case EQ_EXPR:
> -  if (integer_zerop (rhs))
> - res_cmp = EQ_EXPR;
> -  else if (integer_minus_onep (rhs))
> - res_cmp = one_cmp == LT_EXPR ? GT_EXPR : LT_EXPR;
> -  else if (integer_onep (rhs))
> - res_cmp = one_cmp;
> -  else
> +  if (TREE_CODE (rhs) != INTEGER_CST)
>   return false;
> -  break;
> -case NE_EXPR:
> -  if (integer_zerop (rhs))
> - res_cmp = NE_EXPR;
> -  else if (integer_minus_onep (rhs))
> - res_cmp = one_cmp == LT_EXPR ? LE_EXPR : GE_EXPR;
> -  else if (integer_onep (rhs))
> - res_cmp = one_cmp == LT_EXPR ? GE_EXPR : LE_EXPR;
> -  else
> - return false;
> -  break;
> -case LT_EXPR:
> -  if (integer_onep (rhs))
> - res_cmp = one_cmp == LT_EXPR ? GE_EXPR : L

Re: [PATCH]middle-end: Handle FMA_CONJ correctly after SLP layout update.

2021-11-19 Thread Richard Biener via Gcc-patches
On Fri, 19 Nov 2021, Tamar Christina wrote:

> Hi All,
> 
> Apologies, I got dinged by the i386 regressions bot for a test I didn't have 
> in
> my tree at the time I made the previous patch.  The bot was telling me that 
> FMA
> stopped working after I strengthened the FMA check in the previous patch.
> 
> The reason is that the check is slightly early.  The first check can indeed 
> only
> exit early when either node isn't a mult.  However we need to delay till we 
> know
> if the node is a MUL or FMA before enforcing that both nodes must be a MULT
> since the node to inspect is different if the operation is a MUL or FMA.
> 
> Also with the update patch for GCC 11 tree layout update to the new GCC 12 one
> I had missed that the difference in which node is conjucated is not 
> symmetrical.
> 
> So the test for it can just be testing the inverse order.  It was Currently
> no detecting when the first node was conjucated instead of the second one.
> 
> This also made me wonder why my own test didn't detect this.  It turns out 
> that
> the tests, being copied from the _Float16 ones were incorrectly marked as
> xfail.  The _Float16 ones are marked as xfail since C doesn't have a conj
> operation for _Float16, which means you get extra type-casts in between.
> 
> While you could use the GCC _Complex extension here I opted to mark them xfail
> since I wanted to include detection over the widenings next year.
> 
> Secondly the double tests were being skipped because Adv. SIMD was missing 
> from
> targets supporting Complex Double vectorization.
> 
> With these changes all other tests run and pass and only XFAIL ones are
> correctly the _Float16 ones.  Sorry for missing this before, testing should 
> now
> cover all cases.
> 
> Bootstrapped Regtested on aarch64-none-linux-gnu,
> x86_64-pc-linux-gnu and no regressions.
> 
> Ok for master?

OK.

Thanks,
Richard.

> Thanks,
> Tamar
> 
> gcc/ChangeLog:
> 
>   PR tree-optimization/103311
>   * tree-vect-slp-patterns.c (vect_validate_multiplication): Fix CONJ
>   test to new codegen.
>   (complex_mul_pattern::matches): Move check downwards.
> 
> gcc/testsuite/ChangeLog:
> 
>   PR tree-optimization/103311
>   * gcc.dg/vect/complex/fast-math-bb-slp-complex-mla-double.c: Fix it.
>   * gcc.dg/vect/complex/fast-math-bb-slp-complex-mla-float.c: Likewise.
>   * gcc.dg/vect/complex/fast-math-bb-slp-complex-mls-double.c: Likewise.
>   * gcc.dg/vect/complex/fast-math-bb-slp-complex-mls-float.c: Likewise.
>   * gcc.dg/vect/complex/fast-math-bb-slp-complex-mul-double.c: Likewise.
>   * gcc.dg/vect/complex/fast-math-bb-slp-complex-mul-float.c: Likewise.
>   * lib/target-supports.exp
>   (check_effective_target_vect_complex_add_double): Add Adv. SIMD.
> 
> --- inline copy of patch -- 
> diff --git 
> a/gcc/testsuite/gcc.dg/vect/complex/fast-math-bb-slp-complex-mla-double.c 
> b/gcc/testsuite/gcc.dg/vect/complex/fast-math-bb-slp-complex-mla-double.c
> index 
> 462063abc3041d466e8b0261252054a46ef48e15..e13fa3bd31bab016c7a033a7ed590047c757882b
>  100644
> --- a/gcc/testsuite/gcc.dg/vect/complex/fast-math-bb-slp-complex-mla-double.c
> +++ b/gcc/testsuite/gcc.dg/vect/complex/fast-math-bb-slp-complex-mla-double.c
> @@ -6,5 +6,5 @@
>  #define N 16
>  #include "complex-mla-template.c"
>  
> -/* { dg-final { scan-tree-dump "Found COMPLEX_FMA_CONJ" "slp1" } } */
> -/* { dg-final { scan-tree-dump "Found COMPLEX_FMA" "slp1" } } */
> +/* { dg-final { scan-tree-dump "Found COMPLEX_FMA_CONJ" "vect" } } */
> +/* { dg-final { scan-tree-dump "Found COMPLEX_FMA" "vect" } } */
> diff --git 
> a/gcc/testsuite/gcc.dg/vect/complex/fast-math-bb-slp-complex-mla-float.c 
> b/gcc/testsuite/gcc.dg/vect/complex/fast-math-bb-slp-complex-mla-float.c
> index 
> a88adc8184ece3d6d61c66a42233fde0f1721a78..2d774315df96a37759021ab64e0bdb4eb6292f6e
>  100644
> --- a/gcc/testsuite/gcc.dg/vect/complex/fast-math-bb-slp-complex-mla-float.c
> +++ b/gcc/testsuite/gcc.dg/vect/complex/fast-math-bb-slp-complex-mla-float.c
> @@ -6,5 +6,5 @@
>  #define TYPE float
>  #define N 16
>  #include "complex-mla-template.c"
> -/* { dg-final { scan-tree-dump "Found COMPLEX_FMA_CONJ" "slp1" { xfail *-*-* 
> } } } */
> -/* { dg-final { scan-tree-dump "Found COMPLEX_FMA" "slp1" { xfail *-*-* } } 
> } */
> +/* { dg-final { scan-tree-dump "Found COMPLEX_FMA_CONJ" "vect" } } */
> +/* { dg-final { scan-tree-dump "Found COMPLEX_FMA" "vect" } } */
> diff --git 
> a/gcc/testsuite/gcc.dg/vect/complex/fast-math-bb-slp-complex-mls-double.c 
> b/gcc/testsuite/gcc.dg/vect/complex/fast-math-bb-slp-complex-mls-double.c
> index 
> a434fd1f1d3235392a0240b5861c85d39c48d551..bfb62f21f972c5a02af4952d2db1799f37011cad
>  100644
> --- a/gcc/testsuite/gcc.dg/vect/complex/fast-math-bb-slp-complex-mls-double.c
> +++ b/gcc/testsuite/gcc.dg/vect/complex/fast-math-bb-slp-complex-mls-double.c
> @@ -6,7 +6,5 @@
>  #define N 16
>  #include "complex-mls-template.c"
>  
> -/* { dg-final { scan-tree-dump "Found COMPLEX_ADD_R

Re: [PATCH] c++/103326 - fix ICE in tsubst with VECTOR_CST

2021-11-19 Thread Jakub Jelinek via Gcc-patches
On Fri, Nov 19, 2021 at 08:57:01AM +0100, Richard Biener wrote:
> This adds missing handling of VECTOR_CST.
> 
> Bootstrap and regtest pending on x86_64-unknown-linux-gnu, OK?
> 
> Thanks,
> Richard.
> 
> 2021-11-19  Richard Biener  
> 
>   PR c++/103326
>   * pt.c (tsubst_copy): Handle VECTOR_CST.
> 
>   * g++.dg/pr103326.C: New testcase.

Ok, thanks.

Jakub



Re: [PATCH] Remove MAY_HAVE_DEBUG_MARKER_STMTS and MAY_HAVE_DEBUG_BIND_STMTS.

2021-11-19 Thread Alexandre Oliva via Gcc-patches
On Oct 18, 2021, Richard Biener via Gcc-patches  wrote:

> On Mon, Oct 18, 2021 at 10:54 AM Martin Liška  wrote:
>> 
>> The macros correspond 1:1 to an option flags and make it harder
>> to find all usages of the flags.
>> 
>> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
>> 
>> Ready to be installed?

> Hmm, they were introduced on purpose

Yup.  Though there is a 1:1 equivalence right now, conceptually other
kinds of debug marker stmts, and of debug bind stmts, could be
introduced, and then the macros would be adjusted to encompass the new
functionality, covering presumably different options as well.

By removing the macros, every use of the options would have to be
reassessed to tell whether it needs to be changed to cover the new
features, or left alone because it's really meant to refer to that
specific option.

So I find the abstraction useful.  However, I don't have plans to add
other kinds of debug stmts, and I don't know of anyone else who does, so
I won't stand in the way if others think removing these abstractions is
a positive change.

-- 
Alexandre Oliva, happy hackerhttps://FSFLA.org/blogs/lxo/
   Free Software Activist   GNU Toolchain Engineer
Disinformation flourishes because many people care deeply about injustice
but very few check the facts.  Ask me about 


Re: [PATCH 2/5] gimple-match: Add a gimple_extract_op function

2021-11-19 Thread Richard Biener via Gcc-patches
On Thu, Nov 18, 2021 at 8:01 PM Richard Sandiford
 wrote:
>
> Richard Biener  writes:
> > On Tue, Nov 16, 2021 at 4:51 PM Richard Sandiford
> >  wrote:
> >>
> >> Richard Biener  writes:
> >> > On Wed, Nov 10, 2021 at 1:46 PM Richard Sandiford via Gcc-patches
> >> >  wrote:
> >> >>
> >> >> code_helper and gimple_match_op seem like generally useful ways
> >> >> of summing up a gimple_assign or gimple_call (or gimple_cond).
> >> >> This patch adds a gimple_extract_op function that can be used
> >> >> for that.
> >> >>
> >> >> Tested on aarch64-linux-gnu and x86_64-linux-gnu.  OK to install?
> >> >>
> >> >> Richard
> >> >>
> >> >>
> >> >> gcc/
> >> >> * gimple-match.h (gimple_extract_op): Declare.
> >> >> * gimple-match.c (gimple_extract): New function, extracted 
> >> >> from...
> >> >> (gimple_simplify): ...here.
> >> >> (gimple_extract_op): New function.
> >> >> ---
> >> >>  gcc/gimple-match-head.c | 261 +++-
> >> >>  gcc/gimple-match.h  |   1 +
> >> >>  2 files changed, 149 insertions(+), 113 deletions(-)
> >> >>
> >> >> diff --git a/gcc/gimple-match-head.c b/gcc/gimple-match-head.c
> >> >> index 9d88b2f8551..4c6e0883ba4 100644
> >> >> --- a/gcc/gimple-match-head.c
> >> >> +++ b/gcc/gimple-match-head.c
> >> >> @@ -890,12 +890,29 @@ try_conditional_simplification (internal_fn ifn, 
> >> >> gimple_match_op *res_op,
> >> >>return true;
> >> >>  }
> >> >>
> >> >> -/* The main STMT based simplification entry.  It is used by the 
> >> >> fold_stmt
> >> >> -   and the fold_stmt_to_constant APIs.  */
> >> >> +/* Common subroutine of gimple_extract_op and gimple_simplify.  Try to
> >> >> +   describe STMT in RES_OP.  Return:
> >> >>
> >> >> -bool
> >> >> -gimple_simplify (gimple *stmt, gimple_match_op *res_op, gimple_seq 
> >> >> *seq,
> >> >> -tree (*valueize)(tree), tree (*top_valueize)(tree))
> >> >> +   - -1 if extraction failed
> >> >> +   - otherwise, 0 if no simplification should take place
> >> >> +   - otherwise, the number of operands for a GIMPLE_ASSIGN or 
> >> >> GIMPLE_COND
> >> >> +   - otherwise, -2 for a GIMPLE_CALL
> >> >> +
> >> >> +   Before recording an operand, call:
> >> >> +
> >> >> +   - VALUEIZE_CONDITION for a COND_EXPR condition
> >> >> +   - VALUEIZE_NAME if the rhs of a GIMPLE_ASSIGN is an SSA_NAME
> >> >
> >> > I think at least VALUEIZE_NAME is unnecessary, see below
> >>
> >> Yeah, it's unnecessary.  The idea was to (try to) ensure that
> >> gimple_simplify keeps all the microoptimisations that it had
> >> previously.  This includes open-coding do_valueize for SSA_NAMEs
> >> and jumping straight to the right gimplify_resimplifyN routine
> >> when the number of operands is already known.
> >>
> >> (The two calls to gimple_extract<> produce different functions
> >> that ought to get inlined into their single callers.  A lot of the
> >> jumps should then be threaded.)
> >>
> >> I can drop all that if you don't think it's worth it though.
> >> Just wanted to double-check first.
> >
> > Yes, I'd prefer the simpler and more understandable variant.
>
> Fair :-)  How does this look?  I pulled in the extra conversions
> you mentioned in the later review since they're useful for the
> new gimple_simplify routine.

OK.

Thanks,
Richard.

> Tested as before.
>
> Richard
>
>
> gcc/
> * gimple-match.h (code_helper): Add functions for querying whether
> the code represents an internal_fn or a built_in_function.
> Provide explicit conversion operators for both cases.
> (gimple_extract_op): Declare.
> * gimple-match-head.c (gimple_extract): New function, extracted 
> from...
> (gimple_simplify): ...here.
> (gimple_extract_op): New function.
> ---
>  gcc/gimple-match-head.c | 218 
>  gcc/gimple-match.h  |  27 +
>  2 files changed, 135 insertions(+), 110 deletions(-)
>
> diff --git a/gcc/gimple-match-head.c b/gcc/gimple-match-head.c
> index 9d88b2f8551..15053d1189e 100644
> --- a/gcc/gimple-match-head.c
> +++ b/gcc/gimple-match-head.c
> @@ -890,12 +890,20 @@ try_conditional_simplification (internal_fn ifn, 
> gimple_match_op *res_op,
>return true;
>  }
>
> -/* The main STMT based simplification entry.  It is used by the fold_stmt
> -   and the fold_stmt_to_constant APIs.  */
> +/* Common subroutine of gimple_extract_op and gimple_simplify.  Try to
> +   describe STMT in RES_OP, returning true on success.  Before recording
> +   an operand, call:
>
> -bool
> -gimple_simplify (gimple *stmt, gimple_match_op *res_op, gimple_seq *seq,
> -tree (*valueize)(tree), tree (*top_valueize)(tree))
> +   - VALUEIZE_CONDITION for a COND_EXPR condition
> +   - VALUEIZE_OP for every other top-level operand
> +
> +   Both routines take a tree argument and returns a tree.  */
> +
> +template
> +inline bool
> +gimple_extract (gimple *stmt, gimple_match_op *res_op,
> +   ValueizeOp valueize_op,
> +  

Re: [PATCH] darwin, d: Support outfile substitution for liphobos

2021-11-19 Thread Iain Sandoe via Gcc-patches
Hi Iain

> On 19 Nov 2021, at 08:32, Iain Buclaw  wrote:

> This patch fixes a stage2 bootstrap failure in the D front-end on
> darwin due to libgphobos being dynamically linked despite
> -static-libphobos being on the command line.
> 
> In the gdc driver, this takes the previous fix for the Darwin D
> bootstrap, and extends it to the -static-libphobos option as well.
> Rather than pushing the -static-libphobos option back onto the command
> line, the setting of SKIPOPT is instead conditionally removed.  The same
> change has been repeated for -static-libstdc++ so there is now no need
> to call generate_option to re-add it.
> 
> In the gcc driver, -static-libphobos has been added as a common option,
> validated, and a new outfile substition added to config/darwin.h to
> correctly replace -lgphobos with libgphobos.a.
> 
> Bootstrapped and regression tested on x86_64-linux-gnu and
> x86_64-apple-darwin20.
> 
> OK for mainline?  This would also be fine for gcc-11 release branch too,
> as well as earlier releases with D support.

the Darwin parts are fine, thanks 

The SKIPOPT in d-spec, presumably means “skip removing this opt”?
otherwise the #ifndef looks odd (because of the static-libgcc|static-libphobos,
darwin.h would do the substitution for -static-libgcc as well, so it’s not a 
100%
test).

Iain

> 
> Regards,
> Iain.
> 
> ---
> gcc/ChangeLog:
> 
>   * common.opt (static-libphobos): Add option.
>   * config/darwin.h (LINK_SPEC): Substitute -lgphobos with libgphobos.a
>   when linking statically.
>   * gcc.c (driver_handle_option): Set -static-libphobos as always valid.
> 
> gcc/d/ChangeLog:
> 
>   * d-spec.cc (lang_specific_driver): Set SKIPOPT on -static-libstdc++
>   and -static-libphobos only when target supports LD_STATIC_DYNAMIC.
>   Remove generate_option to re-add -static-libstdc++.
> ---
> gcc/common.opt  |  4 
> gcc/config/darwin.h |  1 +
> gcc/d/d-spec.cc | 18 +++---
> gcc/gcc.c   |  6 --
> 4 files changed, 20 insertions(+), 9 deletions(-)
> 
> diff --git a/gcc/common.opt b/gcc/common.opt
> index db6010e4e20..73c12d933f3 100644
> --- a/gcc/common.opt
> +++ b/gcc/common.opt
> @@ -3527,6 +3527,10 @@ static-libgfortran
> Driver
> ; Documented for Fortran, but always accepted by driver.
> 
> +static-libphobos
> +Driver
> +; Documented for D, but always accepted by driver.
> +
> static-libstdc++
> Driver
> 
> diff --git a/gcc/config/darwin.h b/gcc/config/darwin.h
> index 7ed01efa694..c4ddd623e8b 100644
> --- a/gcc/config/darwin.h
> +++ b/gcc/config/darwin.h
> @@ -443,6 +443,7 @@ extern GTY(()) int darwin_ms_struct;
>  %:replace-outfile(-lobjc libobjc-gnu.a%s); \
> :%:replace-outfile(-lobjc -lobjc-gnu )}}\
>%{static|static-libgcc|static-libgfortran:%:replace-outfile(-lgfortran 
> libgfortran.a%s)}\
> +   %{static|static-libgcc|static-libphobos:%:replace-outfile(-lgphobos 
> libgphobos.a%s)}\
>
> %{static|static-libgcc|static-libstdc++|static-libgfortran:%:replace-outfile(-lgomp
>  libgomp.a%s)}\
>%{static|static-libgcc|static-libstdc++:%:replace-outfile(-lstdc++ 
> libstdc++.a%s)}\
>%{force_cpusubtype_ALL:-arch %(darwin_arch)} \
> diff --git a/gcc/d/d-spec.cc b/gcc/d/d-spec.cc
> index b12d28f1047..73ecac3bbf1 100644
> --- a/gcc/d/d-spec.cc
> +++ b/gcc/d/d-spec.cc
> @@ -253,13 +253,23 @@ lang_specific_driver (cl_decoded_option 
> **in_decoded_options,
> 
>   case OPT_static_libstdc__:
> saw_static_libcxx = true;
> +#ifndef HAVE_LD_STATIC_DYNAMIC
> +   /* Remove -static-libstdc++ from the command only if target supports
> +  LD_STATIC_DYNAMIC.  When not supported, it is left in so that a
> +  back-end target can use outfile substitution.  */
> args[i] |= SKIPOPT;
> +#endif
> break;
> 
>   case OPT_static_libphobos:
> if (phobos_library != PHOBOS_NOLINK)
>   phobos_library = PHOBOS_STATIC;
> +#ifndef HAVE_LD_STATIC_DYNAMIC
> +   /* Remove -static-libphobos from the command only if target supports
> +  LD_STATIC_DYNAMIC.  When not supported, it is left in so that a
> +  back-end target can use outfile substitution.  */
> args[i] |= SKIPOPT;
> +#endif
> break;
> 
>   case OPT_shared_libphobos:
> @@ -460,7 +470,7 @@ lang_specific_driver (cl_decoded_option 
> **in_decoded_options,
> #endif
> }
> 
> -  if (saw_libcxx || need_stdcxx)
> +  if (saw_libcxx || saw_static_libcxx || need_stdcxx)
> {
> #ifdef HAVE_LD_STATIC_DYNAMIC
>   if (saw_static_libcxx && !static_link)
> @@ -468,12 +478,6 @@ lang_specific_driver (cl_decoded_option 
> **in_decoded_options,
> generate_option (OPT_Wl_, LD_STATIC_OPTION, 1, CL_DRIVER,
>  &new_decoded_options[j++]);
>   }
> -#else
> -  /* Push the -static-libstdc++ option back onto the command so that
> -  a target without LD_STATIC_DYNAMIC can use outfile substitution.  */
> -  if (saw_static_libcx

Re: [PATCH] PR tree-optimization/103254 - Limit depth for all GORI expressions.

2021-11-19 Thread Richard Biener via Gcc-patches
On Thu, Nov 18, 2021 at 8:30 PM Andrew MacLeod via Gcc-patches
 wrote:
>
> At issue here is the dynamic approach we currently use for outgoing edge
> calculations.  It isn't normally a problem, but once you get a very
> large number of possible outgoing values (ie very long unrolled blocks)
> with pairs of values on a statement, and individual queries for each
> one, it becomes exponential if they relate to each other.
>
> So.  We previously introduced a param for PR 100081 which limited the
> depth of logical expressions GORI would pursue because we always make 2
> or 4 queries for each logical expression, which accelerated the
> exponential behaviour much quicker.
>
> This patch reuses that to apply to any statement which has 2 ssa-names,
> as the potential for 2 queries can happen then as well..  leading to
> exponential behaviour.  Each statement we step back through with
> multiple ssa-names decreases the odds of calculating a useful outgoing
> range anyway.  This will remove ridiculous behaviour like this PR
> demonstrates.
>
> Next release I plan to revamp GORI to cache outgoing values, which will
> eliminate all this sort of behaviour, and we can remove all such
> restrictions.
>
> Bootstrapped on x86_64-pc-linux-gnu with no regressions.  OK for trunk?

I think it's reasonable.  SCEV tries to limit the overall number of SSA -> def
visits, capturing deep vs. wide in a more uniform (and single) way.
(--param scev-max-expr-size and the duplicate - huh - --param
scev-max-expr-complexity).

For SCEV running into the limit means fatal fail of the analysis, for ranger
it only means less refinement?  So it might make a difference which path
you visit and which not (just thinking about reproducability of range analysis
when we say, swap operands of a stmt)?

Thanks,
Richard.

> Andrew
>
>
> PS. This also points out something we are investigating with the
> threader.  There are no uses of _14 outside the block, so asking for its
> range is problematic and contributing to excess work and cache filling
> that we don't need to be doing.  The VRP passes do not demonstrate this
> behaviour unless, as observed, we ask for details output which queries
> *all* the names.


Re: [PATCH] Do not abort compilation when dump file is /dev/*

2021-11-19 Thread Alexandre Oliva via Gcc-patches
On Nov 18, 2021, Richard Biener  wrote:

> IMHO a more reasonable thing to do would be to not treat
> -o /dev/null as a source for -dumpdir and friends.  Alex?

+1

I think we already have some special-casing for /dev/null somewhere.

> You did the last re-org, where'd we put such special casing?

I think we're missing something like this, to avoid messing with dumpdir
with -o /dev/null.  We already use the same function when computing
outbase just below this.

diff --git a/gcc/gcc.c b/gcc/gcc.c
index 506c2acc282d6..a986728fb91d6 100644
--- a/gcc/gcc.c
+++ b/gcc/gcc.c
@@ -5098,7 +5098,8 @@ process_command (unsigned int decoded_options_count,
 
   bool explicit_dumpdir = dumpdir;
 
-  if (!save_temps_overrides_dumpdir && explicit_dumpdir)
+  if ((!save_temps_overrides_dumpdir && explicit_dumpdir)
+  || (output_file != NULL && not_actual_file_p (output_file)))
 {
   /* Do nothing.  */
 }


-- 
Alexandre Oliva, happy hackerhttps://FSFLA.org/blogs/lxo/
   Free Software Activist   GNU Toolchain Engineer
Disinformation flourishes because many people care deeply about injustice
but very few check the facts.  Ask me about 


Re: [PATCH] Fix tree-optimization/103314 : Limit folding of (type) X op CST where type is a nop convert to gimple

2021-11-19 Thread Richard Biener via Gcc-patches
On Fri, Nov 19, 2021 at 7:34 AM apinski--- via Gcc-patches
 wrote:
>
> From: Andrew Pinski 
>
> There is some re-association code in fold_binary which conflicts with
> this optimization due keeping around some "constants" which are not
> INTEGER_CST (1 << -1) so we end up in an infinite loop because of that.
> So we need to limit this case to GIMPLE level only.
>
> OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.

OK.

Richard.

> PR tree-optimization/103314
>
> gcc/ChangeLog:
>
> * match.pd ((type) X op CST): Restrict the equal
> TYPE_PRECISION case to GIMPLE only.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.c-torture/compile/pr103314-1.c: New test.
> ---
>  gcc/match.pd | 6 +-
>  gcc/testsuite/gcc.c-torture/compile/pr103314-1.c | 6 ++
>  2 files changed, 11 insertions(+), 1 deletion(-)
>  create mode 100644 gcc/testsuite/gcc.c-torture/compile/pr103314-1.c
>
> diff --git a/gcc/match.pd b/gcc/match.pd
> index 4dc66fb47f2..24a84e3b504 100644
> --- a/gcc/match.pd
> +++ b/gcc/match.pd
> @@ -1619,7 +1619,11 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>   after hoisting the conversion the operation will be narrower.
>   It is also a good if the conversion is a nop as moves the
>   conversion to one side; allowing for combining of the 
> conversions.  */
> -  TYPE_PRECISION (TREE_TYPE (@0)) <= TYPE_PRECISION (type)
> +  TYPE_PRECISION (TREE_TYPE (@0)) < TYPE_PRECISION (type)
> +  /* The conversion check for being a nop can only be done at the 
> gimple
> + level as fold_binary has some re-association code which can 
> conflict
> + with this if there is a "constant" which is not a full 
> INTEGER_CST.  */
> +  || (GIMPLE && TYPE_PRECISION (TREE_TYPE (@0)) == TYPE_PRECISION 
> (type))
>/* It's also a good idea if the conversion is to a non-integer
>   mode.  */
>|| GET_MODE_CLASS (TYPE_MODE (type)) != MODE_INT
> diff --git a/gcc/testsuite/gcc.c-torture/compile/pr103314-1.c 
> b/gcc/testsuite/gcc.c-torture/compile/pr103314-1.c
> new file mode 100644
> index 000..f4a63130421
> --- /dev/null
> +++ b/gcc/testsuite/gcc.c-torture/compile/pr103314-1.c
> @@ -0,0 +1,6 @@
> +/* { dg-options "" } */
> +int main() {
> +  int t = 1;
> +  unsigned c = 0, d1 = t ? 1 ^ c ^ 1 >> (-1) : 0; /* { dg-warning "is 
> negative"  } */
> +  return d1;
> +}
> --
> 2.17.1
>


Re: [PATCH] Do not abort compilation when dump file is /dev/*

2021-11-19 Thread Richard Biener via Gcc-patches
On Fri, 19 Nov 2021, Alexandre Oliva wrote:

> On Nov 18, 2021, Richard Biener  wrote:
> 
> > IMHO a more reasonable thing to do would be to not treat
> > -o /dev/null as a source for -dumpdir and friends.  Alex?
> 
> +1
> 
> I think we already have some special-casing for /dev/null somewhere.

Grepping finds me the following in system.h which is already checked
for in gcc.c in a few places indeed.

/* Provide a default for the HOST_BIT_BUCKET.
   This suffices for POSIX-like hosts.  */

#ifndef HOST_BIT_BUCKET
#define HOST_BIT_BUCKET "/dev/null"
#endif


> > You did the last re-org, where'd we put such special casing?
> 
> I think we're missing something like this, to avoid messing with dumpdir
> with -o /dev/null.  We already use the same function when computing
> outbase just below this.

Ah yeah, not_actual_file_p should do the trick indeed.  Giuliano, can
you update the patch like below?  I think we should still adjust
documentation as you did.

> diff --git a/gcc/gcc.c b/gcc/gcc.c
> index 506c2acc282d6..a986728fb91d6 100644
> --- a/gcc/gcc.c
> +++ b/gcc/gcc.c
> @@ -5098,7 +5098,8 @@ process_command (unsigned int decoded_options_count,
>  
>bool explicit_dumpdir = dumpdir;
>  
> -  if (!save_temps_overrides_dumpdir && explicit_dumpdir)
> +  if ((!save_temps_overrides_dumpdir && explicit_dumpdir)
> +  || (output_file != NULL && not_actual_file_p (output_file)))
>  {
>/* Do nothing.  */
>  }
> 


Re: [PATCH] Remove MAY_HAVE_DEBUG_MARKER_STMTS and MAY_HAVE_DEBUG_BIND_STMTS.

2021-11-19 Thread Richard Biener via Gcc-patches
On Fri, Nov 19, 2021 at 9:59 AM Alexandre Oliva  wrote:
>
> On Oct 18, 2021, Richard Biener via Gcc-patches  
> wrote:
>
> > On Mon, Oct 18, 2021 at 10:54 AM Martin Liška  wrote:
> >>
> >> The macros correspond 1:1 to an option flags and make it harder
> >> to find all usages of the flags.
> >>
> >> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
> >>
> >> Ready to be installed?
>
> > Hmm, they were introduced on purpose
>
> Yup.  Though there is a 1:1 equivalence right now, conceptually other
> kinds of debug marker stmts, and of debug bind stmts, could be
> introduced, and then the macros would be adjusted to encompass the new
> functionality, covering presumably different options as well.
>
> By removing the macros, every use of the options would have to be
> reassessed to tell whether it needs to be changed to cover the new
> features, or left alone because it's really meant to refer to that
> specific option.
>
> So I find the abstraction useful.  However, I don't have plans to add
> other kinds of debug stmts, and I don't know of anyone else who does, so
> I won't stand in the way if others think removing these abstractions is
> a positive change.

Since the patch keeps some abstraction but not all I think we should
keep them all for now.

Richard.

>
> --
> Alexandre Oliva, happy hackerhttps://FSFLA.org/blogs/lxo/
>Free Software Activist   GNU Toolchain Engineer
> Disinformation flourishes because many people care deeply about injustice
> but very few check the facts.  Ask me about 


[PATCH] fortran, debug: Fix up DW_AT_rank [PR103315]

2021-11-19 Thread Jakub Jelinek via Gcc-patches
Hi!

For DW_AT_rank we were emitting
.uleb128 0x4# DW_AT_rank
.byte   0x97# DW_OP_push_object_address
.byte   0x23# DW_OP_plus_uconst
.uleb128 0x1c
.byte   0x6 # DW_OP_deref
on 64-bit and
.uleb128 0x4# DW_AT_rank
.byte   0x97# DW_OP_push_object_address
.byte   0x23# DW_OP_plus_uconst
.uleb128 0x10
.byte   0x6 # DW_OP_deref
on 32-bit.  I think this is wrong, as dtype.rank field in the descriptor
has unsigned char type, not pointer type nor pointer sized integral.
E.g. if we have a
REAL :: a(..)
dummy argument, which is passed as a reference to the function descriptor,
we want to evaluate a->dtype.rank.  The above DWARF expressions perform
*(uintptr_t *)(a + 0x1c)
and
*(uintptr_t *)(a + 0x10)
respectively.  The following patch changes those to:
.uleb128 0x5# DW_AT_rank
.byte   0x97# DW_OP_push_object_address
.byte   0x23# DW_OP_plus_uconst
.uleb128 0x1c
.byte   0x94# DW_OP_deref_size
.byte   0x1
and
.uleb128 0x5# DW_AT_rank
.byte   0x97# DW_OP_push_object_address
.byte   0x23# DW_OP_plus_uconst
.uleb128 0x10
.byte   0x94# DW_OP_deref_size
.byte   0x1
which perform
*(unsigned char *)(a + 0x1c)
and
*(unsigned char *)(a + 0x10)
respectively.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2021-11-19  Jakub Jelinek  

PR debug/103315
* trans-types.c (gfc_get_array_descr_info): Use DW_OP_deref_size 1
instead of DW_OP_deref for DW_AT_rank.

--- gcc/fortran/trans-types.c.jj2021-11-12 15:54:21.0 +0100
+++ gcc/fortran/trans-types.c   2021-11-18 15:13:45.131281198 +0100
@@ -3459,8 +3459,8 @@ gfc_get_array_descr_info (const_tree typ
   if (!integer_zerop (dtype_off))
t = fold_build_pointer_plus (t, rank_off);
 
-  t = build1 (NOP_EXPR, build_pointer_type (gfc_array_index_type), t);
-  t = build1 (INDIRECT_REF, gfc_array_index_type, t);
+  t = build1 (NOP_EXPR, build_pointer_type (TREE_TYPE (field)), t);
+  t = build1 (INDIRECT_REF, TREE_TYPE (field), t);
   info->rank = t;
   t = build0 (PLACEHOLDER_EXPR, TREE_TYPE (dim_off));
   t = size_binop (MULT_EXPR, t, dim_size);

Jakub



Re: [PATCH] Remove MAY_HAVE_DEBUG_MARKER_STMTS and MAY_HAVE_DEBUG_BIND_STMTS.

2021-11-19 Thread Jakub Jelinek via Gcc-patches
On Fri, Nov 19, 2021 at 10:38:41AM +0100, Richard Biener via Gcc-patches wrote:
> > Yup.  Though there is a 1:1 equivalence right now, conceptually other
> > kinds of debug marker stmts, and of debug bind stmts, could be
> > introduced, and then the macros would be adjusted to encompass the new
> > functionality, covering presumably different options as well.
> >
> > By removing the macros, every use of the options would have to be
> > reassessed to tell whether it needs to be changed to cover the new
> > features, or left alone because it's really meant to refer to that
> > specific option.
> >
> > So I find the abstraction useful.  However, I don't have plans to add
> > other kinds of debug stmts, and I don't know of anyone else who does, so
> > I won't stand in the way if others think removing these abstractions is
> > a positive change.
> 
> Since the patch keeps some abstraction but not all I think we should
> keep them all for now.

Yeah, e.g. there has been discussions about deferring some warnings until
later so we omit less often false positives about dead code that we optimize
away, and one suggested approach was to represent those warnings as special
debug stmts.  Those would appear in the IL regardless of the
-fvariable-tracking-assignments option, so MAY_HAVE_DEBUG_STMTS could
become
  debug_nonbind_markers_p || flag_var_tracking_assignments || 
cfun->may_have_warnings
where the last one would be set if we emit any such warning stmts into the
IL.

Jakub



[PATCH] IBM Z: Fix load-and-test peephole2 condition

2021-11-19 Thread Stefan Schulze Frielinghaus via Gcc-patches
For a peephole2 condition variable insn points to the first matched
insn.  In order to refer to the second matched insn use
peep2_next_insn(1) instead.

Update: Added a test case

Bootstrapped and regtested on IBM Z.  Ok for mainline and gcc-{11,10,9}?

gcc/ChangeLog:

* config/s390/s390.md (define_peephole2): Variable insn points
to the first matched insn.  Use peep2_next_insn(1) to refer to
the second matched insn.

gcc/testsuite/ChangeLog:

* gcc.target/s390/2029.c: New test.
---
 gcc/config/s390/s390.md  |  2 +-
 gcc/testsuite/gcc.target/s390/2029.c | 12 
 2 files changed, 13 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/s390/2029.c

diff --git a/gcc/config/s390/s390.md b/gcc/config/s390/s390.md
index 4debdcd1247..c4f92bde061 100644
--- a/gcc/config/s390/s390.md
+++ b/gcc/config/s390/s390.md
@@ -1003,7 +1003,7 @@
(match_operand:GPR 2 "memory_operand"))
(set (reg CC_REGNUM)
(compare (match_dup 0) (match_operand:GPR 1 "const0_operand")))]
-  "s390_match_ccmode(insn, CCSmode) && TARGET_EXTIMM
+  "s390_match_ccmode (peep2_next_insn (1), CCSmode) && TARGET_EXTIMM
&& GENERAL_REG_P (operands[0])
&& satisfies_constraint_T (operands[2])
&& !contains_constant_pool_address_p (operands[2])"
diff --git a/gcc/testsuite/gcc.target/s390/2029.c 
b/gcc/testsuite/gcc.target/s390/2029.c
new file mode 100644
index 000..1a6df4f4b89
--- /dev/null
+++ b/gcc/testsuite/gcc.target/s390/2029.c
@@ -0,0 +1,12 @@
+/* { dg-do run } */
+/* { dg-options "-Os -march=z10" } */
+signed char a;
+int b = -925974181, c;
+unsigned *d = &b;
+int *e = &c;
+int main() {
+  *e = ((217 ^ a) > 585) < *d;
+  if (c != 1)
+__builtin_abort();
+  return 0;
+}
-- 
2.31.1



Re: [PATCH] Loop unswitching: support gswitch statements.

2021-11-19 Thread Richard Biener via Gcc-patches
On Tue, Nov 16, 2021 at 2:53 PM Martin Liška  wrote:
>
> On 11/11/21 08:15, Richard Biener wrote:
> > If you look at simplify_using_entry_checks then this is really really 
> > simple,
> > so I'd try to abstract this, recording sth like a unswitch_predicate where
> > we store the condition we unswitch on plus maybe cache the constant
> > range of a VAR cmp CST variable condition on the true/false edge.  We
> > can then try to simplify each gcond/gswitch based on such an 
> > unswitch_predicate
> > (when we ever scan the loop once to discover all opportunities we'd have a
> > set of unswitch_predicates to try simplifying against).  As said the integer
> > range thing would be an improvement over the current state so even that
> > can be done as followup but I guess for gswitch support that's going to be
> > the thing to use.
>
> I started working on the unswitch_predicate where I recond also 
> true/false-edge irange
> of an expression we unswitch on.
>
> I noticed one significant problem, let's consider:
>
>for (int i = 0; i < size; i++)
>{
>  double tmp;
>
>  if (order == 1)
>tmp = -8 * a[i];
>  else
>{
> if (order == 2)
>   tmp = -4 * b[i];
> else
>   tmp = a[i];
>}
>
>  r[i] = 3.4f * tmp + d[i];
>}
>
> We can end up with first unswitching candidate being 'if (order == 2)' (I 
> have a real benchmark where it happens).
> So I collect ranges and they are [2,2] for true edge and [-INF, 0], [3, INF] 
> (because we came to the condition through order != 1 cond).

You can only record a range from the unswitch condition itself, not
from the path you came, so your false edge range
looks wrong.

> Then the loop is cloned and we have

> if (order == 2)
> loop_version_1
> else
> loop_version_2
>
> but in loop_version_2 we wrongly fold 'if (order == 1)' to false because it's 
> reflected in the range.

Of course for loop_version_1 the range is [2, 2] and for
loop_version_2 it is ~[2, 2] - you
do have to invert the range when folding the "false" version.

> So the question is, can one iterate get_loop_body stmts in some dominator 
> order?

I don't think this would fix anything, but yes, in the end we'd like
to unswitch on "outer" conditions
first.  That would mean iterating blocks in program order, the easiest
way is to use
get_loop_body_in_dom_order which should fix the above testcase but
likely not arbitrary
nested ifs.  For those you could use
rev_post_order_and_mark_dfs_back_seme, but as said
using get_loop_body_in_dom_order should be a strict improvement even
without any other
change (you could propose a patch if you have a testcase that shows a
difference with
otherwise unchanged unswitching, for example choosing the outer condition
instead of the inner with --param max-unswitch-level==0).

Richard.

>
> Thanks,
> Martin
>
>


Re: [RFC] c++: Print function template parms when relevant (was: [PATCH v4] c++: Add gnu::diagnose_as attribute)

2021-11-19 Thread Matthias Kretz
On Thursday, 18 November 2021 20:24:36 CET Jason Merrill wrote:
> On 11/17/21 17:51, Matthias Kretz wrote:
> > Right, I had already added a `gcc_assert (!TMPL_ARGS_HAVE_MULTIPLE_LEVELS
> > (args))` to my new set_non_default_template_args_count function and found
> > cp/ constraint.cc:2896 (get_mapped_args), which calls
> > SET_NON_DEFAULT_TEMPLATE_ARGS_COUNT on the outer TREE_VEC. Was this
> > supposed to apply to all inner TREE_VECs? Or is deleting the line the
> > correct fix?
>
> That should apply to the inner TREE_VECs (and probably use list.length)

Like this?

@@ -2890,10 +2890,11 @@ get_mapped_args (tree map)
   tree level = make_tree_vec (list.length ());
   for (unsigned j = 0; j < list.length(); ++j)
 TREE_VEC_ELT (level, j) = list[j];
+  /* None of the args at any level are defaulted.  */
+  SET_NON_DEFAULT_TEMPLATE_ARGS_COUNT (level, list.length());
   SET_TMPL_ARGS_LEVEL (args, i + 1, level);
   list.release ();
 }
-  SET_NON_DEFAULT_TEMPLATE_ARGS_COUNT (args, 0);

   return args;
 }

> >>> __FUNCTION__ was 'fun' all the time, but __PRETTY_FUNCTION__ was
> >>> 'void fun(T) [with T = int]'.
> >> 
> >> Isn't that true for instantiations, as well?
> > 
> > No, instantiations don't have template args/parms in __FUNCTION__.
> 
> Hmm, that inconsistency seems like a bug, though I'm not sure whether it
> should have the template args or not; I lean toward not.  The standard
> says that the value of __func__ is implementation-defined, the GCC
> manual says it's "the unadorned name of the function".

So you think f1 in testsuite/g++.old-deja/g++.ext/pretty3.C needs to test 
for

  if (strcmp (function, "f1"))
bad = true;
  if (strcmp (pretty, "void f1(T) [with T = int]"))
bad = true;

Or should the latter be "void f1(int)"?

> >>> It's more consistent that __PRETTY_FUNCTION__ contains __FUNCTION__,
> >>> IMHO
> >> 
> >> I suppose, but I don't see that as a strong enough motivation to mix
> >> this up.
> > 
> > What about
> > 
> > template  void f();
> > template <> void f();
> > 
> > With -fpretty-templates shouldn't it print as 'void f() [with T =
> > float]' and 'void f()'? Yes, it's probably too subtle for most users
> > to notice the difference. But I find it's more consistent this way.
> 
> I disagree; the function signature is the same whether a particular
> function is an explicit specialization or an instantiation.

Yes, the call signature is the same. But what it calls is different. There's 
no obvious answer for my stated goal "print template parms wherever they
would appear in the source code as well", since it depends on whether the user 
is interested in recognizing the exact function body that was called.

My motivation for printing a function template specialization differently is:

1. It's a different function definition that's being called. The user (caller) 
might be surprised to realize this is the case as he forgot about the 
specialization and was expecting his change to the general template to make a 
difference.

2. There's no T in

template <> void f() {
  // no T here, but of course I can define one:
  using T = int;
}

so presenting the function that was called as 'void f() [with T = int]' is 
not exactly correct. In this case it wasn't even the case that T was deduced 
to be 'int', which we could argue to be useful information that might get 
lost.

For

template  void f(T);
template <> void f(int);

the whole story is "'void f(int)' was called for 'template  void f(T) 
[with T = int]'". What the user wants to see depends on what is more important 
to fix the bug: that T was deduced to be int, or that the specialization of f 
was called instead of the general template. I'd still go with 'void f(int)', 
though I'd be happier if I had some indication that a template was involved.

> >> Ah, you're trying to omit defaulted parms from the ?  I'm not sure
> >> that's necessary, leaving them out of the [with ...] list should be
> >> sufficient.
> > 
> > I was thinking about all the std::allocator defaults in the standard
> > library. I don't want to see them. E.g. vector::clear() on const
> > object:
> > 
> > error: passing 'const std::vector' as 'this' argument discards
> > qualifiers [...]/stl_vector.h:1498:7: note:   in call to 'void
> > std::vector<_Tp, _Alloc>::clear() [with _Tp = int; _Alloc =
> > std::allocator]'
> > 
> > With my patch the last line becomes
> > [...]/stl_vector.h:1498:7: note:   in call to 'void
> > std::vector<_Tp>::clear() [with _Tp = int]'
> > 
> > 
> > Another case I didn't consider before:
> > 
> > template  struct A {
> > 
> >[[deprecated]] void f(U);
> > 
> > };
> > 
> > A a; a.f(1);
> > 
> > With my patch it prints 'void A::f(U) [with T = float]', with your
> > suggestion 'void A::f(U) [with T = float]'. Both are missing
> > important information in the substitution list, IMHO. Would 'void A > = int>::f(U) [with T = float]' be an improvement? Or should
> > find_typenames (in cp/error.c) find defaulted te

Re: [PATCH] darwin, d: Support outfile substitution for liphobos

2021-11-19 Thread Iain Buclaw via Gcc-patches
Excerpts from Iain Sandoe's message of November 19, 2021 10:21 am:
> Hi Iain
> 
>> On 19 Nov 2021, at 08:32, Iain Buclaw  wrote:
> 
>> This patch fixes a stage2 bootstrap failure in the D front-end on
>> darwin due to libgphobos being dynamically linked despite
>> -static-libphobos being on the command line.
>> 
>> In the gdc driver, this takes the previous fix for the Darwin D
>> bootstrap, and extends it to the -static-libphobos option as well.
>> Rather than pushing the -static-libphobos option back onto the command
>> line, the setting of SKIPOPT is instead conditionally removed.  The same
>> change has been repeated for -static-libstdc++ so there is now no need
>> to call generate_option to re-add it.
>> 
>> In the gcc driver, -static-libphobos has been added as a common option,
>> validated, and a new outfile substition added to config/darwin.h to
>> correctly replace -lgphobos with libgphobos.a.
>> 
>> Bootstrapped and regression tested on x86_64-linux-gnu and
>> x86_64-apple-darwin20.
>> 
>> OK for mainline?  This would also be fine for gcc-11 release branch too,
>> as well as earlier releases with D support.
> 
> the Darwin parts are fine, thanks 
> 
> The SKIPOPT in d-spec, presumably means “skip removing this opt”?
> otherwise the #ifndef looks odd (because of the 
> static-libgcc|static-libphobos,
> darwin.h would do the substitution for -static-libgcc as well, so it’s not a 
> 100%
> test).
> 

The inverse. SKIPOPT means "skip this option", so previously it was
being removed by the driver when constructing the new_decoded_options,
hence why your generate_option addition was necessary before.

Iain.

> Iain
> 
>> 
>> Regards,
>> Iain.
>> 
>> ---
>> gcc/ChangeLog:
>> 
>>  * common.opt (static-libphobos): Add option.
>>  * config/darwin.h (LINK_SPEC): Substitute -lgphobos with libgphobos.a
>>  when linking statically.
>>  * gcc.c (driver_handle_option): Set -static-libphobos as always valid.
>> 
>> gcc/d/ChangeLog:
>> 
>>  * d-spec.cc (lang_specific_driver): Set SKIPOPT on -static-libstdc++
>>  and -static-libphobos only when target supports LD_STATIC_DYNAMIC.
>>  Remove generate_option to re-add -static-libstdc++.
>> ---
>> gcc/common.opt  |  4 
>> gcc/config/darwin.h |  1 +
>> gcc/d/d-spec.cc | 18 +++---
>> gcc/gcc.c   |  6 --
>> 4 files changed, 20 insertions(+), 9 deletions(-)
>> 
>> diff --git a/gcc/common.opt b/gcc/common.opt
>> index db6010e4e20..73c12d933f3 100644
>> --- a/gcc/common.opt
>> +++ b/gcc/common.opt
>> @@ -3527,6 +3527,10 @@ static-libgfortran
>> Driver
>> ; Documented for Fortran, but always accepted by driver.
>> 
>> +static-libphobos
>> +Driver
>> +; Documented for D, but always accepted by driver.
>> +
>> static-libstdc++
>> Driver
>> 
>> diff --git a/gcc/config/darwin.h b/gcc/config/darwin.h
>> index 7ed01efa694..c4ddd623e8b 100644
>> --- a/gcc/config/darwin.h
>> +++ b/gcc/config/darwin.h
>> @@ -443,6 +443,7 @@ extern GTY(()) int darwin_ms_struct;
>>  %:replace-outfile(-lobjc libobjc-gnu.a%s); \
>> :%:replace-outfile(-lobjc -lobjc-gnu )}}\
>>%{static|static-libgcc|static-libgfortran:%:replace-outfile(-lgfortran 
>> libgfortran.a%s)}\
>> +   %{static|static-libgcc|static-libphobos:%:replace-outfile(-lgphobos 
>> libgphobos.a%s)}\
>>
>> %{static|static-libgcc|static-libstdc++|static-libgfortran:%:replace-outfile(-lgomp
>>  libgomp.a%s)}\
>>%{static|static-libgcc|static-libstdc++:%:replace-outfile(-lstdc++ 
>> libstdc++.a%s)}\
>>%{force_cpusubtype_ALL:-arch %(darwin_arch)} \
>> diff --git a/gcc/d/d-spec.cc b/gcc/d/d-spec.cc
>> index b12d28f1047..73ecac3bbf1 100644
>> --- a/gcc/d/d-spec.cc
>> +++ b/gcc/d/d-spec.cc
>> @@ -253,13 +253,23 @@ lang_specific_driver (cl_decoded_option 
>> **in_decoded_options,
>> 
>>  case OPT_static_libstdc__:
>>saw_static_libcxx = true;
>> +#ifndef HAVE_LD_STATIC_DYNAMIC
>> +  /* Remove -static-libstdc++ from the command only if target supports
>> + LD_STATIC_DYNAMIC.  When not supported, it is left in so that a
>> + back-end target can use outfile substitution.  */
>>args[i] |= SKIPOPT;
>> +#endif
>>break;
>> 
>>  case OPT_static_libphobos:
>>if (phobos_library != PHOBOS_NOLINK)
>>  phobos_library = PHOBOS_STATIC;
>> +#ifndef HAVE_LD_STATIC_DYNAMIC
>> +  /* Remove -static-libphobos from the command only if target supports
>> + LD_STATIC_DYNAMIC.  When not supported, it is left in so that a
>> + back-end target can use outfile substitution.  */
>>args[i] |= SKIPOPT;
>> +#endif
>>break;
>> 
>>  case OPT_shared_libphobos:
>> @@ -460,7 +470,7 @@ lang_specific_driver (cl_decoded_option 
>> **in_decoded_options,
>> #endif
>> }
>> 
>> -  if (saw_libcxx || need_stdcxx)
>> +  if (saw_libcxx || saw_static_libcxx || need_stdcxx)
>> {
>> #ifdef HAVE_LD_STATIC_DYNAMIC
>>   if (saw_static_libcxx && !static_link)
>> @@ -468,1

Re: [PATCH] Loop unswitching: support gswitch statements.

2021-11-19 Thread Richard Biener via Gcc-patches
On Tue, Nov 16, 2021 at 3:40 PM Martin Liška  wrote:
>
> On 11/11/21 08:15, Richard Biener wrote:
> > So I'd try to do no functional change first, improving the costing and
> > setting up the transform to simply pick up the stmts to "fold" as discovered
> > during analysis (as I hinted you possibly can use gimple_uid to mark
> > the stmts that simplify, IIRC gimple_uid is preserved during copying.
> > gimple_uid would also scale better than gimple_plf in case we do
> > the analysis for all candidates at once).
>
> Thinking about the analysis. Am I correct that we want to properly calculate
> loop size for true and false edge of a potential gcond before the actually 
> unswitching?

Yes.

> We can do that by finding a first gcond candidate, evaluate (symbolic + 
> irange approache)
> all other gcond in the loop body and use BB_REACHABLE discovery. Similarly to 
> what we do now
> at lines 378-446. Then tree_num_loop_insns can be adjusted for only these 
> reachable blocks.
> Having that, we can calculate # of insns that will live in true/false loops.

So whatever we do here we should record as "this control stmt folds to
{true,false}" (or {true,unknown},
or in future, "this control stmt will lead to edge {e,unknown}"),
recording the simplification
on the true/false loop version in a way we can apply it after the transform.

> Then we can call tree_unswitch_loop and make the gcond folding as we do in 
> the versioned loops.
>
> Is it a step in good direction? Having that we can then extend it to gswitch 
> statements.

One issue I see is that BB_REACHABLE is there only once but you could use
auto_bb_flag reachable_true, reachable_false to distinguish the
true/false loop version
copies.

So yes, I think that sounds reasonable.  At the point we want to
evaluate different
(first) unswitching opportunities against each other storing this only
as BB flag is
likely to hit limits.  When we want to evaluate multiple levels of
unswitching before
doing any transforms even more so (if there are 3 opportunities there'd be
many cases to be considered when going to level 3 ;)).  I _think_ that a sparse
lattice of stmt UID -> edge might do the trick if we change tree_num_loop_insns
do to a DFS walk from the loop header, ignoring not taken edges by
consulting the
lattice.  Or, for speed reason, pre-compute tree_num_loop_insns for each BB
so we just have to sum a different set of BBs rather than walking all
stmts again.

That said, the second step would definitely be to choose the "best" opportunity
on the current level.

Richard.

> Cheers,
> Martin


[PATCH] c++: Avoid adding implicit attributes during apply_late_template_attributes [PR101180]

2021-11-19 Thread Jakub Jelinek via Gcc-patches
Hi!

decl_attributes and its caller cplus_decl_attributes sometimes add
implicit attributes, e.g. optimize attribute if #pragma GCC optimize
is active, target attribute if #pragma GCC target is active, or
e.g. omp declare target attribute if in between #pragma omp declare target
and #pragma omp end declare target.

For templates that seems highly undesirable to me though, they should
get those implicit attributes from the spot the templates were parsed
(and they do get that), then tsubst through copy_node copies those
attributes, but then apply_late_template_attributes can or does add
a new set from the spot where they are instantiated, which can be pretty
random point of first use of the template.

Consider e.g.
#pragma GCC push_options
#pragma GCC target "avx"
template 
inline void foo ()
{
}
#pragma GCC pop_options
#pragma GCC push_options
#pragma GCC target "crc32"
void
bar ()
{
  foo<0> ();
}
#pragma GCC pop_options
testcase where the intention is that foo has avx target attribute
and bar has crc32 target attribute, but we end up with
__attribute__((target ("crc32"), target ("avx")))
on foo<0> (and due to yet another bug actually don't enable avx
in foo<0>).  In this particular case it is a regression caused
by r12-299-ga0fdff3cf33f7284 which apparently calls
cplus_decl_attributes even if attributes != NULL but late_attrs
is NULL, before those changes we didn't call it in those cases.
But, if there is at least one unrelated dependent attribute this
would happen already in older releases.

The following patch fixes that by temporarily overriding the variables
that control the addition of the implicit attributes.

Shall we also change the function so that it doesn't call
cplus_decl_attributes if late_attrs is NULL, or was that change
intentional?

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2021-11-19  Jakub Jelinek  

PR c++/101180
* pt.c (apply_late_template_attributes): Temporarily override
current_optimize_pragma, optimization_current_node,
current_target_pragma and scope_chain->omp_declare_target_attribute,
so that cplus_decl_attributes doesn't add implicit attributes.

* g++.target/i386/pr101180.C: New test.

--- gcc/cp/pt.c.jj  2021-11-18 12:33:22.025628027 +0100
+++ gcc/cp/pt.c 2021-11-18 19:08:52.800539490 +0100
@@ -11722,6 +11722,17 @@ apply_late_template_attributes (tree *de
q = &TREE_CHAIN (*q);
 }
 
+  /* cplus_decl_attributes can add some attributes implicitly.  For templates,
+ those attributes should have been added already when those templates were
+ parsed, and shouldn't be added based on from which context they are
+ first time instantiated.  */
+  auto o1 = make_temp_override (current_optimize_pragma, NULL_TREE);
+  auto o2 = make_temp_override (optimization_current_node,
+   optimization_default_node);
+  auto o3 = make_temp_override (current_target_pragma, NULL_TREE);
+  auto o4 = make_temp_override (scope_chain->omp_declare_target_attribute,
+   NULL);
+
   cplus_decl_attributes (decl_p, late_attrs, attr_flags);
 
   return true;
--- gcc/testsuite/g++.target/i386/pr101180.C.jj 2021-11-18 19:11:50.512009888 
+0100
+++ gcc/testsuite/g++.target/i386/pr101180.C2021-11-18 19:11:33.066258216 
+0100
@@ -0,0 +1,25 @@
+// PR c++/101180
+// { dg-do compile { target c++11 } }
+
+#pragma GCC target "avx"
+template  struct A {};
+#pragma GCC push_options
+#pragma GCC target "avx,avx2,bmi,bmi2,fma,f16c"
+template  using B = A;
+template  struct C;
+template <> struct C {
+  __attribute__((always_inline)) float operator()(long) { return .0f; }
+};
+long d;
+template  void e(B) {
+  T{C()(d)};
+}
+template  void f(T d, FromT) {
+  e(d);
+}
+int g;
+void h() {
+  A i;
+  f(i, g);
+}
+#pragma GCC pop_options

Jakub



Re: [PATCH 4/5] vect: Make reduction code handle calls

2021-11-19 Thread Richard Biener via Gcc-patches
On Tue, Nov 16, 2021 at 5:24 PM Richard Sandiford
 wrote:
>
> Richard Biener via Gcc-patches  writes:
> > On Wed, Nov 10, 2021 at 1:48 PM Richard Sandiford via Gcc-patches
> >  wrote:
> >>
> >> This patch extends the reduction code to handle calls.  So far
> >> it's a structural change only; a later patch adds support for
> >> specific function reductions.
> >>
> >> Most of the patch consists of using code_helper and gimple_match_op
> >> to describe the reduction operations.  The other main change is that
> >> vectorizable_call now needs to handle fully-predicated reductions.
> >>
> >> Tested on aarch64-linux-gnu and x86_64-linux-gnu.  OK to install?
> >>
> >> Richard
> >>
> >>
> >> gcc/
> >> * builtins.h (associated_internal_fn): Declare overload that
> >> takes a (combined_cfn, return type) pair.
> >> * builtins.c (associated_internal_fn): Split new overload out
> >> of original fndecl version.  Also provide an overload that takes
> >> a (combined_cfn, return type) pair.
> >> * internal-fn.h (commutative_binary_fn_p): Declare.
> >> (associative_binary_fn_p): Likewise.
> >> * internal-fn.c (commutative_binary_fn_p): New function,
> >> split out from...
> >> (first_commutative_argument): ...here.
> >> (associative_binary_fn_p): New function.
> >> * gimple-match.h (code_helper): Add a constructor that takes
> >> internal functions.
> >> (commutative_binary_op_p): Declare.
> >> (associative_binary_op_p): Likewise.
> >> (canonicalize_code): Likewise.
> >> (directly_supported_p): Likewise.
> >> (get_conditional_internal_fn): Likewise.
> >> (gimple_build): New overload that takes a code_helper.
> >> * gimple-fold.c (gimple_build): Likewise.
> >> * gimple-match-head.c (commutative_binary_op_p): New function.
> >> (associative_binary_op_p): Likewise.
> >> (canonicalize_code): Likewise.
> >> (directly_supported_p): Likewise.
> >> (get_conditional_internal_fn): Likewise.
> >> * tree-vectorizer.h: Include gimple-match.h.
> >> (neutral_op_for_reduction): Take a code_helper instead of a 
> >> tree_code.
> >> (needs_fold_left_reduction_p): Likewise.
> >> (reduction_fn_for_scalar_code): Likewise.
> >> (vect_can_vectorize_without_simd_p): Declare a nNew overload that 
> >> takes
> >> a code_helper.
> >> * tree-vect-loop.c: Include case-cfn-macros.h.
> >> (fold_left_reduction_fn): Take a code_helper instead of a 
> >> tree_code.
> >> (reduction_fn_for_scalar_code): Likewise.
> >> (neutral_op_for_reduction): Likewise.
> >> (needs_fold_left_reduction_p): Likewise.
> >> (use_mask_by_cond_expr_p): Likewise.
> >> (build_vect_cond_expr): Likewise.
> >> (vect_create_partial_epilog): Likewise.  Use gimple_build rather
> >> than gimple_build_assign.
> >> (check_reduction_path): Handle calls and operate on code_helpers
> >> rather than tree_codes.
> >> (vect_is_simple_reduction): Likewise.
> >> (vect_model_reduction_cost): Likewise.
> >> (vect_find_reusable_accumulator): Likewise.
> >> (vect_create_epilog_for_reduction): Likewise.
> >> (vect_transform_cycle_phi): Likewise.
> >> (vectorizable_reduction): Likewise.  Make more use of
> >> lane_reduc_code_p.
> >> (vect_transform_reduction): Use gimple_extract_op but expect
> >> a tree_code for now.
> >> (vect_can_vectorize_without_simd_p): New overload that takes
> >> a code_helper.
> >> * tree-vect-stmts.c (vectorizable_call): Handle reductions in
> >> fully-masked loops.
> >> * tree-vect-patterns.c (vect_mark_pattern_stmts): Use
> >> gimple_extract_op when updating STMT_VINFO_REDUC_IDX.
> >> ---
> >>  gcc/builtins.c   |  46 -
> >>  gcc/builtins.h   |   1 +
> >>  gcc/gimple-fold.c|   9 +
> >>  gcc/gimple-match-head.c  |  70 +++
> >>  gcc/gimple-match.h   |  20 ++
> >>  gcc/internal-fn.c|  46 -
> >>  gcc/internal-fn.h|   2 +
> >>  gcc/tree-vect-loop.c | 420 +++
> >>  gcc/tree-vect-patterns.c |  23 ++-
> >>  gcc/tree-vect-stmts.c|  66 --
> >>  gcc/tree-vectorizer.h|  10 +-
> >>  11 files changed, 455 insertions(+), 258 deletions(-)
> >>
> >> diff --git a/gcc/builtins.c b/gcc/builtins.c
> >> index 384864bfb3a..03829c03a5a 100644
> >> --- a/gcc/builtins.c
> >> +++ b/gcc/builtins.c
> >> @@ -2139,17 +2139,17 @@ mathfn_built_in_type (combined_fn fn)
> >>  #undef SEQ_OF_CASE_MATHFN
> >>  }
> >>
> >> -/* If BUILT_IN_NORMAL function FNDECL has an associated internal function,
> >> -   return its code, otherwise return IFN_LAST.  Note that this function
> >> -   only tests whether the function is defined in internals.def, not 
> >> whether
> 

Re: [PATCH 4/5] if-conv: Apply VN to hoisted conversions

2021-11-19 Thread Richard Biener via Gcc-patches
On Tue, Nov 16, 2021 at 7:05 PM Richard Sandiford
 wrote:
>
> Richard Biener via Gcc-patches  writes:
> > On Mon, Nov 15, 2021 at 3:00 PM Richard Sandiford
> >  wrote:
> >>
> >> Richard Biener via Gcc-patches  writes:
> >> > On Fri, Nov 12, 2021 at 7:05 PM Richard Sandiford via Gcc-patches
> >> >  wrote:
> >> >>
> >> >> This patch is a prerequisite for a later one.  At the moment,
> >> >> if-conversion converts predicated POINTER_PLUS_EXPRs into
> >> >> non-wrapping forms, which for:
> >> >>
> >> >> … = base + offset
> >> >>
> >> >> becomes:
> >> >>
> >> >> tmp = (unsigned long) base
> >> >> … = tmp + offset
> >> >>
> >> >> It then hoists these conversions out of the loop where possible.
> >> >>
> >> >> However, because “base” is a valid gimple operand, there can be
> >> >> multiple POINTER_PLUS_EXPRs with the same base, which can in turn
> >> >> lead to multiple instances of the same conversion.  The later VN pass
> >> >> is (and I think needs to be) restricted to the new if-converted code,
> >> >> whereas here we're deliberately inserting the conversions before the
> >> >> .LOOP_VECTORIZED condition:
> >> >>
> >> >> /* If we versioned loop then make sure to insert invariant
> >> >>stmts before the .LOOP_VECTORIZED check since the vectorizer
> >> >>will re-use that for things like runtime alias versioning
> >> >>whose condition can end up using those invariants.  */
> >> >>
> >> >> We can therefore enter the vectoriser with redundant conversions.
> >> >>
> >> >> The easiest fix seemed to be to defer the hoisting until after VN.
> >> >> This catches other hoisting opportunities too.
> >> >>
> >> >> Hoisting the code from the (artificial) loop in pr99102.c means
> >> >> that it's no longer worth vectorising.  The patch forces vectorisation
> >> >> instead of relying on the cost model.
> >> >>
> >> >> The patch also reverts pr87007-4.c and pr87007-5.c back to their
> >> >> original forms, undoing changes in 783dc66f9ccb0019c3dad.
> >> >> The code at the time the tests were added was:
> >> >>
> >> >> testl   %edi, %edi
> >> >> je  .L10
> >> >> vxorps  %xmm1, %xmm1, %xmm1
> >> >> vsqrtsd d3(%rip), %xmm1, %xmm0
> >> >> vsqrtsd d2(%rip), %xmm1, %xmm1
> >> >> ...
> >> >> .L10:
> >> >> ret
> >> >>
> >> >> with the operations being hoisted, and the vxorps was specifically
> >> >> wanted (compared to the previous code).  This patch restores the code
> >> >> to that form, with the hoisted operations and the vxorps.
> >> >>
> >> >> Regstrapped on aarch64-linux-gnu and x86_64-linux-gnu.  OK to install?
> >> >>
> >> >> Richard
> >> >>
> >> >>
> >> >> gcc/
> >> >> * tree-if-conv.c: Include tree-eh.h.
> >> >> (predicate_statements): Remove pe argument.  Don't hoist
> >> >> statements here.
> >> >> (combine_blocks): Remove pe argument.
> >> >> (ifcvt_can_hoist, ifcvt_can_hoist_further): New functions.
> >> >> (ifcvt_hoist_invariants): Likewise.
> >> >> (tree_if_conversion): Update call to combine_blocks.  Call
> >> >> ifcvt_hoist_invariants after VN.
> >> >>
> >> >> gcc/testsuite/
> >> >> * gcc.dg/vect/pr99102.c: Add -fno-vect-cost-model.
> >> >>
> >> >> Revert:
> >> >>
> >> >> 2020-09-09  Richard Biener  
> >> >>
> >> >> * gcc.target/i386/pr87007-4.c: Adjust.
> >> >> * gcc.target/i386/pr87007-5.c: Likewise.
> >> >> ---
> >> >>  gcc/testsuite/gcc.dg/vect/pr99102.c   |   2 +-
> >> >>  gcc/testsuite/gcc.target/i386/pr87007-4.c |   2 +-
> >> >>  gcc/testsuite/gcc.target/i386/pr87007-5.c |   2 +-
> >> >>  gcc/tree-if-conv.c| 122 --
> >> >>  4 files changed, 114 insertions(+), 14 deletions(-)
> >> >>
> >> >> diff --git a/gcc/testsuite/gcc.dg/vect/pr99102.c 
> >> >> b/gcc/testsuite/gcc.dg/vect/pr99102.c
> >> >> index 6c1a13f0783..0d030d15c86 100644
> >> >> --- a/gcc/testsuite/gcc.dg/vect/pr99102.c
> >> >> +++ b/gcc/testsuite/gcc.dg/vect/pr99102.c
> >> >> @@ -1,4 +1,4 @@
> >> >> -/* { dg-options "-O2 -ftree-vectorize -fdump-tree-vect-details" } */
> >> >> +/* { dg-options "-O2 -ftree-vectorize -fno-vect-cost-model 
> >> >> -fdump-tree-vect-details" } */
> >> >>  /* { dg-additional-options "-msve-vector-bits=256" { target 
> >> >> aarch64_sve256_hw } } */
> >> >>  long a[44];
> >> >>  short d, e = -7;
> >> >> diff --git a/gcc/testsuite/gcc.target/i386/pr87007-4.c 
> >> >> b/gcc/testsuite/gcc.target/i386/pr87007-4.c
> >> >> index 9c4b8005af3..e91bdcbac44 100644
> >> >> --- a/gcc/testsuite/gcc.target/i386/pr87007-4.c
> >> >> +++ b/gcc/testsuite/gcc.target/i386/pr87007-4.c
> >> >> @@ -15,4 +15,4 @@ foo (int n, int k)
> >> >>d1 = ceil (d3);
> >> >>  }
> >> >>
> >> >> -/* { dg-final { scan-assembler-times "vxorps\[^\n\r\]*xmm\[0-9\]" 0 } 
> >> >> } */
> >> >> +/* { dg-final { scan-assembler-times "vxorps\[^\n\r\]*xmm\[0-9\]" 1 } 
> >> >> } */
> >> >> diff --git a/g

Re: [PATCH] Fix tree-optimization/101941: IPA splitting out function with error attribute

2021-11-19 Thread Richard Biener via Gcc-patches
On Wed, Nov 17, 2021 at 7:25 AM apinski--- via Gcc-patches
 wrote:
>
> From: Andrew Pinski 
>
> The Linux kernel started to fail compile when the jump threader was improved
> (r12-2591-g2e96b5f14e4025691). This failure was due to the IPA splitting code
> decided now to split off the basic block which contained two functions,
> one of those functions included the error attribute on them.  This patch fixes
> the problem by disallowing basic blocks from being split which contain 
> functions
> that have either the error or warning attribute on them.
>
> The two new testcases are to make sure we still split the function for other
> places if we reject the one case.

Hmm, it's only problematic if the immediate(?) controlling condition of the
error/warning annotated call is not in the split part, no?  Interestingly
we for example don't avoid splitting away __builtin_constant_p either.

Richard.

> OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.
>
> PR tree-optimization/101941
>
> gcc/ChangeLog:
>
> * ipa-split.c (visit_bb): Disallow function calls where
> the function has either error or warning attribute.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.c-torture/compile/pr101941-1.c: New test.
> * gcc.dg/tree-ssa/pr101941-1.c: New test.
> ---
>  gcc/ipa-split.c   | 12 -
>  .../gcc.c-torture/compile/pr101941-1.c| 44 +
>  gcc/testsuite/gcc.dg/tree-ssa/pr101941-1.c| 48 +++
>  3 files changed, 103 insertions(+), 1 deletion(-)
>  create mode 100644 gcc/testsuite/gcc.c-torture/compile/pr101941-1.c
>  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr101941-1.c
>
> diff --git a/gcc/ipa-split.c b/gcc/ipa-split.c
> index c68577d04a9..070e894ef31 100644
> --- a/gcc/ipa-split.c
> +++ b/gcc/ipa-split.c
> @@ -873,7 +873,7 @@ visit_bb (basic_block bb, basic_block return_bb,
>gimple *stmt = gsi_stmt (bsi);
>tree op;
>ssa_op_iter iter;
> -  tree decl;
> +  tree decl = NULL_TREE;
>
>if (is_gimple_debug (stmt))
> continue;
> @@ -927,6 +927,16 @@ visit_bb (basic_block bb, basic_block return_bb,
> break;
>   }
>
> +  /* If a function call and that function has either the
> +warning or error attribute on it, don't split.  */
> +  if (decl && (lookup_attribute ("warning", DECL_ATTRIBUTES (decl))
> +  || lookup_attribute ("error", DECL_ATTRIBUTES (decl
> +   {
> + if (dump_file && (dump_flags & TDF_DETAILS))
> +   fprintf (dump_file, "Cannot split: warning or error 
> attribute.\n");
> + can_split = false;
> +   }
> +
>FOR_EACH_SSA_TREE_OPERAND (op, stmt, iter, SSA_OP_DEF)
> bitmap_set_bit (set_ssa_names, SSA_NAME_VERSION (op));
>FOR_EACH_SSA_TREE_OPERAND (op, stmt, iter, SSA_OP_USE)
> diff --git a/gcc/testsuite/gcc.c-torture/compile/pr101941-1.c 
> b/gcc/testsuite/gcc.c-torture/compile/pr101941-1.c
> new file mode 100644
> index 000..ab3bbea8ed7
> --- /dev/null
> +++ b/gcc/testsuite/gcc.c-torture/compile/pr101941-1.c
> @@ -0,0 +1,44 @@
> +/* { dg-additional-options "-fconserve-stack" } */
> +struct crypto_aes_ctx {
> +  char key_dec[128];
> +};
> +
> +int rfc4106_set_hash_subkey_hash_subkey;
> +
> +void __write_overflow(void)__attribute__((__error__("")));
> +void __write_overflow1(void);
> +void aes_encrypt(void*);
> +
> +void fortify_panic(const char*) __attribute__((__noreturn__)) ;
> +
> +char *rfc4106_set_hash_subkey(struct crypto_aes_ctx *ctx) {
> +  void *a = &ctx->key_dec[0];
> +  unsigned p_size =  __builtin_object_size(a, 0);
> +#ifdef __OPTIMIZE__
> +  if (p_size < 16) {
> +__write_overflow1();
> +fortify_panic(__func__);
> +  }
> +  if (p_size < 32) {
> +__write_overflow();
> +fortify_panic(__func__);
> +  }
> +#endif
> +  aes_encrypt(ctx);
> +  return ctx->key_dec;
> +}
> +
> +char *(*gg)(struct crypto_aes_ctx *) = rfc4106_set_hash_subkey;
> +
> +void a(void)
> +{
> +  struct crypto_aes_ctx ctx;
> +  rfc4106_set_hash_subkey(&ctx);
> +}
> +void b(void)
> +{
> +  struct crypto_aes_ctx ctx;
> +  ctx.key_dec[0] = 0;
> +  rfc4106_set_hash_subkey(&ctx);
> +}
> +
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr101941-1.c 
> b/gcc/testsuite/gcc.dg/tree-ssa/pr101941-1.c
> new file mode 100644
> index 000..21c1d1ec466
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr101941-1.c
> @@ -0,0 +1,48 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fconserve-stack -fdump-tree-optimized" } */
> +struct crypto_aes_ctx {
> +  char key_dec[128];
> +};
> +
> +int rfc4106_set_hash_subkey_hash_subkey;
> +
> +void __write_overflow(void)__attribute__((__error__("")));
> +void __write_overflow1(void);
> +void aes_encrypt(void*);
> +
> +void fortify_panic(const char*) __attribute__((__noreturn__)) ;
> +
> +char *rfc4106_set_hash_subkey(struct crypto_aes_ctx *ctx) {
> +  void *a = &ctx->key_dec[0];
> +  unsigned p_size =  

Re: [PATCH] PR tree-optimization/103254 - Limit depth for all GORI expressions.

2021-11-19 Thread Aldy Hernandez via Gcc-patches




On 11/18/21 8:28 PM, Andrew MacLeod wrote:


@@ -376,6 +366,14 @@ range_def_chain::get_def_chain (tree name)
   return NULL;
 }
 
+  // Terminate the def chains if we see too many cascading stmts.

+  if (m_logical_depth == param_ranger_logical_depth)
+return NULL;
+
+  // Increase the depth if we have a pair of ssa-names.
+  if (ssa1 && ssa2)
+m_logical_depth++;
+


Not sure it matters, since this is an internal knob, but if the --param 
now applies to all statements, perhaps we should remove the "logical" 
reference from the knob.


Aldy



Re: [PATCH v1 1/8] bswap: synthesize HImode bswap from SImode or DImode

2021-11-19 Thread Richard Biener via Gcc-patches
On Thu, Nov 11, 2021 at 3:13 PM Philipp Tomsich
 wrote:
>
> The RISC-V Zbb extension adds an XLEN (i.e. SImode for rv32, DImode
> for rv64) bswap instruction (rev8).  While, with the current master,
> SImode is synthesized correctly from DImode, HImode is not.
>
> This change adds an appropriate expansion for a HImode bswap, if a
> wider bswap is available.
>
> Without this change, the following rv64gc_zbb code is generated for
> __builtin_bswap16():
> slliw   a5,a0,8
> zext.h  a0,a0
> srliw   a0,a0,8
> or  a0,a5,a0
> sext.h  a0,a0  // this is a 16bit sign-extension following
>// the byteswap (e.g. on a 'short' function
>// return).
>
> After this change, a bswap (rev8) is used and any extensions are
> combined into the shift-right:
> rev8a0,a0
> sraia0,a0,48   // the sign-extension is combined into the
>// shift; a srli is emitted otherwise...
>
> gcc/ChangeLog:
>
> * optabs.c (expand_unop): support expanding a HImode bswap
>   using SImode or DImode, followed by a shift.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/riscv/zbb-bswap.c: New test.
>
> Signed-off-by: Philipp Tomsich 
> ---
>
>  gcc/optabs.c   |  6 ++
>  gcc/testsuite/gcc.target/riscv/zbb-bswap.c | 22 ++
>  2 files changed, 28 insertions(+)
>  create mode 100644 gcc/testsuite/gcc.target/riscv/zbb-bswap.c
>
> diff --git a/gcc/optabs.c b/gcc/optabs.c
> index 019bbb62882..7a3ffbe4525 100644
> --- a/gcc/optabs.c
> +++ b/gcc/optabs.c
> @@ -3307,6 +3307,12 @@ expand_unop (machine_mode mode, optab unoptab, rtx 
> op0, rtx target,
> return temp;
> }
>
> + /* If we are missing a HImode BSWAP, but have one for SImode or
> +DImode, use a BSWAP followed by a SHIFT.  */
> + temp = widen_bswap (as_a  (mode), op0, target);
> + if (temp)
> +   return temp;
> +

I think it would be more natural to temporarily terminate the HImode case here
and re-open it inside the following

  if (is_a  (mode, &int_mode))
{
  temp = widen_bswap (int_mode, op0, target);
  if (temp)
return temp;

here to handle the ashl/lshr/ior fallback.

Richard.

>   last = get_last_insn ();
>
>   temp1 = expand_binop (mode, ashl_optab, op0,
> diff --git a/gcc/testsuite/gcc.target/riscv/zbb-bswap.c 
> b/gcc/testsuite/gcc.target/riscv/zbb-bswap.c
> new file mode 100644
> index 000..6ee27d9f47a
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/zbb-bswap.c
> @@ -0,0 +1,22 @@
> +/* { dg-do compile } */
> +/* { dg-options "-march=rv64gc_zbb -mabi=lp64 -O2" } */
> +
> +unsigned long
> +func64 (unsigned long i)
> +{
> +  return __builtin_bswap64(i);
> +}
> +
> +unsigned int
> +func32 (unsigned int i)
> +{
> +  return __builtin_bswap32(i);
> +}
> +
> +unsigned short
> +func16 (unsigned short i)
> +{
> +  return __builtin_bswap16(i);
> +}
> +
> +/* { dg-final { scan-assembler-times "rev8" 3 } } */
> --
> 2.32.0
>


Re: [PATCH v1 1/8] bswap: synthesize HImode bswap from SImode or DImode

2021-11-19 Thread Richard Biener via Gcc-patches
On Fri, Nov 19, 2021 at 11:20 AM Richard Biener
 wrote:
>
> On Thu, Nov 11, 2021 at 3:13 PM Philipp Tomsich
>  wrote:
> >
> > The RISC-V Zbb extension adds an XLEN (i.e. SImode for rv32, DImode
> > for rv64) bswap instruction (rev8).  While, with the current master,
> > SImode is synthesized correctly from DImode, HImode is not.
> >
> > This change adds an appropriate expansion for a HImode bswap, if a
> > wider bswap is available.
> >
> > Without this change, the following rv64gc_zbb code is generated for
> > __builtin_bswap16():
> > slliw   a5,a0,8
> > zext.h  a0,a0
> > srliw   a0,a0,8
> > or  a0,a5,a0
> > sext.h  a0,a0  // this is a 16bit sign-extension following
> >// the byteswap (e.g. on a 'short' function
> >// return).
> >
> > After this change, a bswap (rev8) is used and any extensions are
> > combined into the shift-right:
> > rev8a0,a0
> > sraia0,a0,48   // the sign-extension is combined into the
> >// shift; a srli is emitted otherwise...
> >
> > gcc/ChangeLog:
> >
> > * optabs.c (expand_unop): support expanding a HImode bswap
> >   using SImode or DImode, followed by a shift.
> >
> > gcc/testsuite/ChangeLog:
> >
> > * gcc.target/riscv/zbb-bswap.c: New test.
> >
> > Signed-off-by: Philipp Tomsich 
> > ---
> >
> >  gcc/optabs.c   |  6 ++
> >  gcc/testsuite/gcc.target/riscv/zbb-bswap.c | 22 ++
> >  2 files changed, 28 insertions(+)
> >  create mode 100644 gcc/testsuite/gcc.target/riscv/zbb-bswap.c
> >
> > diff --git a/gcc/optabs.c b/gcc/optabs.c
> > index 019bbb62882..7a3ffbe4525 100644
> > --- a/gcc/optabs.c
> > +++ b/gcc/optabs.c
> > @@ -3307,6 +3307,12 @@ expand_unop (machine_mode mode, optab unoptab, rtx 
> > op0, rtx target,
> > return temp;
> > }
> >
> > + /* If we are missing a HImode BSWAP, but have one for SImode or
> > +DImode, use a BSWAP followed by a SHIFT.  */
> > + temp = widen_bswap (as_a  (mode), op0, target);
> > + if (temp)
> > +   return temp;
> > +
>
> I think it would be more natural to temporarily terminate the HImode case here
> and re-open it inside the following
>
>   if (is_a  (mode, &int_mode))
> {
>   temp = widen_bswap (int_mode, op0, target);
>   if (temp)
> return temp;
>
> here to handle the ashl/lshr/ior fallback.

But as Kito says, code generation for more targets would need to be looked at.

Richard.

> Richard.
>
> >   last = get_last_insn ();
> >
> >   temp1 = expand_binop (mode, ashl_optab, op0,
> > diff --git a/gcc/testsuite/gcc.target/riscv/zbb-bswap.c 
> > b/gcc/testsuite/gcc.target/riscv/zbb-bswap.c
> > new file mode 100644
> > index 000..6ee27d9f47a
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/riscv/zbb-bswap.c
> > @@ -0,0 +1,22 @@
> > +/* { dg-do compile } */
> > +/* { dg-options "-march=rv64gc_zbb -mabi=lp64 -O2" } */
> > +
> > +unsigned long
> > +func64 (unsigned long i)
> > +{
> > +  return __builtin_bswap64(i);
> > +}
> > +
> > +unsigned int
> > +func32 (unsigned int i)
> > +{
> > +  return __builtin_bswap32(i);
> > +}
> > +
> > +unsigned short
> > +func16 (unsigned short i)
> > +{
> > +  return __builtin_bswap16(i);
> > +}
> > +
> > +/* { dg-final { scan-assembler-times "rev8" 3 } } */
> > --
> > 2.32.0
> >


Re: [PATCH][_GLIBCXX_DEBUG] Limit performance impact in __erase_nodes_if

2021-11-19 Thread Jonathan Wakely via Gcc-patches
On Thu, 18 Nov 2021 at 22:05, François Dumont via Libstdc++ <
libstd...@gcc.gnu.org> wrote:

> Hi
>
>  Here is a proposal to limit performance impact of _GLIBCXX_DEBUG
> mode on __erase_nodes_if.
>
>  As you can see I am adding erase overloads on the Debug container
> to accept base iterators. So it exposes an additional non-Standard
> method, do you think it is any issue.
>

I think this is fine - I can't think of a way that this could make user
code do the wrong thing.


>
>  Note that I've started doing something similar for the vector/deque
> adding this time erase(_Base_const_iterator, _Base_const_iterator)
> overloads. But in this case it results in ambiguous erase calls, even if
> I remove the _Safe_iterator cast operator to the base iterator.
>
>  libstdc++: [_GLIBCXX_DEBUG] Reduce performance impact on std::erase_if
>
>  Bypass the _GLIBCXX_DEBUG additional checks in
> std::__detail::__erase_node_if used
>  by all implementations of std::erase_if for node based containers.
>
>  libstdc++-v3/ChangeLog:
>
>  * include/bits/erase_if.h (__erase_nodes_if): Add
> _UnsafeContainer template
>  parameter. Use it to get iterators to work with.
>  * include/debug/macros.h (__glibcxx_check_erase2): New.
>  * include/debug/map.h (map<>::erase(_Base_const_iterator)):
> New.
>  (map<>::erase(const_iterator)): Use latter.
>  * include/debug/multimap.h
> (multimap<>::erase(_Base_const_iterator)): New.
>  (multimap<>::erase(const_iterator)): Use latter.
>  * include/debug/multiset.h
> (multiset<>::erase(_Base_const_iterator)): New.
>  (multiset<>::erase(const_iterator)): Use latter.
>  * include/debug/set.h (set<>::erase(_Base_const_iterator)):
> New.
>  (set<>::erase(const_iterator)): Use latter.
>  * include/debug/unordered_map
> (unordered_map<>::erase(_Base_const_iterator)): New.
>  (unordered_multimap<>::erase(const_iterator)): New.
>  * include/debug/unordered_set
> (unordered_set<>::erase(_Base_const_iterator)): New.
>  (unordered_multiset<>::erase(const_iterator)): New.
>  * include/experimental/map (erase_if): Adapt.
>  * include/experimental/set (erase_if): Adapt.
>  * include/experimental/unordered_map (erase_if): Adapt.
>  * include/experimental/unordered_set (erase_if): Adapt.
>  * include/std/map (erase_if): Adapt.
>  * include/std/set (erase_if): Adapt.
>  * include/std/unordered_map (erase_if): Adapt.
>  * include/std/unordered_set (erase_if): Adapt.
>
> Tested under Linux x86_64,
>
> Ok to commit ?
>

OK, thanks.


Re: [PATCH] middle-end: fix de-optimizations with bitclear patterns on signed values

2021-11-19 Thread Jakub Jelinek via Gcc-patches
On Fri, Nov 12, 2021 at 07:30:35AM +, Tamar Christina via Gcc-patches wrote:
> @@ -2099,7 +2124,7 @@ spaceship_replacement (basic_block cond_bb, basic_block 
> middle_bb,
>|| !tree_fits_shwi_p (rhs)
>|| !IN_RANGE (tree_to_shwi (rhs), -1, 1))
>  return false;
> -  if (orig_use_lhs)
> +  if (orig_use_lhs && !is_cast)
>  {
>if ((cmp != EQ_EXPR && cmp != NE_EXPR) || !integer_zerop (rhs))
>   return false;

I actually meant that you'd do the if (is_cast) handling right above
the if (orig_use_lhs), i.e.
  if (is_cast)
{
  if (TREE_CODE (rhs) != INTEGER_CST)
return false;
  /* As for -ffast-math we assume the 2 return to be
 impossible, canonicalize (unsigned) res <= 1U or
 (unsigned) res < 2U into res >= 0 and (unsigned) res > 1U
 or (unsigned) res >= 2U as res < 0.  */
  switch (cmp)
{
case LE_EXPR:
  if (!integer_onep (rhs))
return false;
  cmp = GE_EXPR;
  break;
case LT_EXPR:
  if (wi::ne_p (wi::to_widest (rhs), 2))
return false;
  cmp = GE_EXPR;
  break;
case GT_EXPR:
  if (!integer_onep (rhs))
return false;
  cmp = LT_EXPR;
  break;
case GE_EXPR:
  if (wi::ne_p (wi::to_widest (rhs), 2))
return false;
  cmp = LT_EXPR;
  break;
default:
  return false;
}
  rhs = build_zero_cst (TREE_TYPE (phires));
}
  else if (orig_use_lhs)
...
and keep the code in the following hunk untouched.  Similarly to how
for the BIT_AND_EXPR if (orig_use_lhs), it virtually undoes the match.pd
optimization.

Because in the place you've placed it you're totally ignoring one_cmp,
and I'm pretty sure that is the wrong thing.
one_cmp is computed as:
  /* lhs1 one_cmp rhs1 results in phires of 1.  */
  enum tree_code one_cmp;
  if ((cmp1 == LT_EXPR || cmp1 == LE_EXPR)
  ^ (!integer_onep ((e1->flags & EDGE_TRUE_VALUE) ? arg1 : arg0)))
one_cmp = LT_EXPR;
  else
one_cmp = GT_EXPR;
and it is something unrelated to what actual comparison is done or virtually
done on the phires.

> @@ -2310,62 +2335,101 @@ spaceship_replacement (basic_block cond_bb, 
> basic_block middle_bb,
>  one_cmp = GT_EXPR;
>  
>enum tree_code res_cmp;
> -  switch (cmp)
> +
> +  if (is_cast)
>  {
> -case EQ_EXPR:
> -  if (integer_zerop (rhs))
> - res_cmp = EQ_EXPR;
> -  else if (integer_minus_onep (rhs))
> - res_cmp = one_cmp == LT_EXPR ? GT_EXPR : LT_EXPR;
> -  else if (integer_onep (rhs))
> - res_cmp = one_cmp;
> -  else
> +  if (TREE_CODE (rhs) != INTEGER_CST)
>   return false;
> -  break;
> -case NE_EXPR:
> -  if (integer_zerop (rhs))
> - res_cmp = NE_EXPR;
> -  else if (integer_minus_onep (rhs))
> - res_cmp = one_cmp == LT_EXPR ? LE_EXPR : GE_EXPR;
> -  else if (integer_onep (rhs))
> - res_cmp = one_cmp == LT_EXPR ? GE_EXPR : LE_EXPR;
> -  else
> - return false;
> -  break;
> -case LT_EXPR:
> -  if (integer_onep (rhs))
> - res_cmp = one_cmp == LT_EXPR ? GE_EXPR : LE_EXPR;
> -  else if (integer_zerop (rhs))
> - res_cmp = one_cmp == LT_EXPR ? GT_EXPR : LT_EXPR;
> -  else
> - return false;
> -  break;
> -case LE_EXPR:
> -  if (integer_zerop (rhs))
> - res_cmp = one_cmp == LT_EXPR ? GE_EXPR : LE_EXPR;
> -  else if (integer_minus_onep (rhs))
> - res_cmp = one_cmp == LT_EXPR ? GT_EXPR : LT_EXPR;
> -  else
> - return false;
> -  break;
> -case GT_EXPR:
> -  if (integer_minus_onep (rhs))
> - res_cmp = one_cmp == LT_EXPR ? LE_EXPR : GE_EXPR;
> -  else if (integer_zerop (rhs))
> - res_cmp = one_cmp;
> -  else
> - return false;
> -  break;
> -case GE_EXPR:
> -  if (integer_zerop (rhs))
> - res_cmp = one_cmp == LT_EXPR ? LE_EXPR : GE_EXPR;
> -  else if (integer_onep (rhs))
> - res_cmp = one_cmp;
> -  else
> - return false;
> -  break;
> -default:
> -  gcc_unreachable ();
> +  /* As for -ffast-math we assume the 2 return to be
> +  impossible, canonicalize (unsigned) res <= 1U or
> +  (unsigned) res < 2U into res >= 0 and (unsigned) res > 1U
> +  or (unsigned) res >= 2U as res < 0.  */
> +  switch (cmp)
> + {
> + case LE_EXPR:
> +   if (!integer_onep (rhs))
> + return false;
> +   res_cmp = GE_EXPR;
> +   break;
> + case LT_EXPR:
> +   if (wi::ne_p (wi::to_widest (rhs), 2))
> + return false;
> +   res_cmp = GE_EXPR;
> +   break;
> + case GT_EXPR:
> +   if (!integer_onep (rhs))
> + return false;
> +   res_cmp = LT_EXPR;
> +   break;
> + case GE_EXPR:
> +   if (wi::ne_p (wi::to_widest (rhs), 2))
> + return false;
> +   res_cmp = LT_EXPR;
> +   break;
> + default:
> +   return false;
> + }
> +  rhs = build_zero_

Re: [PATCH][V4] rs6000: Remove unnecessary option manipulation.

2021-11-19 Thread Martin Liška

On 11/18/21 19:59, Segher Boessenkool wrote:

Please resend, without line wrapping (format=flawed).


Done in the original [v4] email, see here:
https://gcc.gnu.org/pipermail/gcc-patches/2021-November/584267.html

Martin


Re: [PATCH] PR tree-optimization/96779 Adding a missing pattern to match.pd

2021-11-19 Thread Richard Biener via Gcc-patches
On Tue, Nov 16, 2021 at 11:51 PM Navid Rahimi via Gcc-patches
 wrote:
>
> Hi GCC community,
>
> This patch will add the missed pattern described in bug 102232 [1] to the 
> match.pd.
>
> Tree-optimization/96779: Adding new optimization to match.pd:
>
> * match.pd (-x == x) -> (x == 0): New optimization.
> * gcc.dg/tree-ssa/pr96779.c: testcase for this optimization.
> * gcc.dg/tree-ssa/pr96779-disabled.c: testcase for this 
> optimization when -fwrapv passed.
>
> 1) https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96779

+/* -x == x -> x == 0 */
+(for cmp (eq ne)
+ (simplify
+  (cmp:c @0 (negate @0))
+   (if (ANY_INTEGRAL_TYPE_P (TREE_TYPE (@0))
+&& !TYPE_OVERFLOW_WRAPS (type))

you need to check TYPE_OVERFLOW_WRAPS on TREE_TYPE (@0),
otherwise you check on boolean.

+(cmp:c @0 { build_zero_cst (TREE_TYPE(@0)); }
+

no need for :c on the result pattern.  Otherwise it looks OK, but how
did you check the patch?

Thanks,
Richard.


> Best wishes,
> Navid.


Re: [PATCH][V4] rs6000: Remove unnecessary option manipulation.

2021-11-19 Thread Segher Boessenkool
On Fri, Nov 19, 2021 at 12:32:09PM +0100, Martin Liška wrote:
> On 11/18/21 19:59, Segher Boessenkool wrote:
> >Please resend, without line wrapping (format=flawed).
> 
> Done in the original [v4] email, see here:
> https://gcc.gnu.org/pipermail/gcc-patches/2021-November/584267.html

Which you didn't send to me.  And the archives we have now don't let me
grab the raw mail from there anymore.  Please send this to me if you
want it reviewed.


Segher


Re: libgo patch committed: redirect mkrsysinfo.sh grep output to /dev/null

2021-11-19 Thread Bernhard Reutner-Fischer via Gcc-patches
On Fri, 28 Oct 2016 10:55:18 -0700
Ian Lance Taylor  wrote:

> This patch to libgo redirects the output of a grep command in
> mkrsysinfo.sh to /dev/null.  The output otherwise appears in the

grep -q exists since at least SUSv2, fwiw.
thanks,

> middle of a build log, where it is harmless but confusing.
> Bootstrapped and ran Go testsuite on x86_64-pc-linux-gnu.  Committed
> to mainline.
> 
> Ian



Re: [PATCH] Fix tree-optimization/101941: IPA splitting out function with error attribute

2021-11-19 Thread Andrew Pinski via Gcc-patches
On Fri, Nov 19, 2021 at 2:16 AM Richard Biener via Gcc-patches
 wrote:
>
> On Wed, Nov 17, 2021 at 7:25 AM apinski--- via Gcc-patches
>  wrote:
> >
> > From: Andrew Pinski 
> >
> > The Linux kernel started to fail compile when the jump threader was improved
> > (r12-2591-g2e96b5f14e4025691). This failure was due to the IPA splitting 
> > code
> > decided now to split off the basic block which contained two functions,
> > one of those functions included the error attribute on them.  This patch 
> > fixes
> > the problem by disallowing basic blocks from being split which contain 
> > functions
> > that have either the error or warning attribute on them.
> >
> > The two new testcases are to make sure we still split the function for other
> > places if we reject the one case.
>
> Hmm, it's only problematic if the immediate(?) controlling condition of the
> error/warning annotated call is not in the split part, no?
No, if we have something like:
  if (p_size < 32) {
if (ctx != 0) {
  __write_overflow();
   }
fortify_panic(__func__);
 }
We would still run into the problem if we just disable the splitting
for the innermost bb and split off the 3 other bb's
I have a testcase where the above would fail if we decide only to make
sure the split point is not after ctx !=0 check.

> Interestingly we for example don't avoid splitting away __builtin_constant_p 
> either.

__builtin_constant_p is handled a different way already; in
check_forbidden_calls we set forbidden_dominators to include the bb
where the builtin_constant_p would have been true.
And then when we consider the split, we reject if the entry point
would have been dominated by one of those bbs.
This was PR49642.


Thanks,
Andrew Pinski


>
> Richard.
>
> > OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.
> >
> > PR tree-optimization/101941
> >
> > gcc/ChangeLog:
> >
> > * ipa-split.c (visit_bb): Disallow function calls where
> > the function has either error or warning attribute.
> >
> > gcc/testsuite/ChangeLog:
> >
> > * gcc.c-torture/compile/pr101941-1.c: New test.
> > * gcc.dg/tree-ssa/pr101941-1.c: New test.
> > ---
> >  gcc/ipa-split.c   | 12 -
> >  .../gcc.c-torture/compile/pr101941-1.c| 44 +
> >  gcc/testsuite/gcc.dg/tree-ssa/pr101941-1.c| 48 +++
> >  3 files changed, 103 insertions(+), 1 deletion(-)
> >  create mode 100644 gcc/testsuite/gcc.c-torture/compile/pr101941-1.c
> >  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr101941-1.c
> >
> > diff --git a/gcc/ipa-split.c b/gcc/ipa-split.c
> > index c68577d04a9..070e894ef31 100644
> > --- a/gcc/ipa-split.c
> > +++ b/gcc/ipa-split.c
> > @@ -873,7 +873,7 @@ visit_bb (basic_block bb, basic_block return_bb,
> >gimple *stmt = gsi_stmt (bsi);
> >tree op;
> >ssa_op_iter iter;
> > -  tree decl;
> > +  tree decl = NULL_TREE;
> >
> >if (is_gimple_debug (stmt))
> > continue;
> > @@ -927,6 +927,16 @@ visit_bb (basic_block bb, basic_block return_bb,
> > break;
> >   }
> >
> > +  /* If a function call and that function has either the
> > +warning or error attribute on it, don't split.  */
> > +  if (decl && (lookup_attribute ("warning", DECL_ATTRIBUTES (decl))
> > +  || lookup_attribute ("error", DECL_ATTRIBUTES (decl
> > +   {
> > + if (dump_file && (dump_flags & TDF_DETAILS))
> > +   fprintf (dump_file, "Cannot split: warning or error 
> > attribute.\n");
> > + can_split = false;
> > +   }
> > +
> >FOR_EACH_SSA_TREE_OPERAND (op, stmt, iter, SSA_OP_DEF)
> > bitmap_set_bit (set_ssa_names, SSA_NAME_VERSION (op));
> >FOR_EACH_SSA_TREE_OPERAND (op, stmt, iter, SSA_OP_USE)
> > diff --git a/gcc/testsuite/gcc.c-torture/compile/pr101941-1.c 
> > b/gcc/testsuite/gcc.c-torture/compile/pr101941-1.c
> > new file mode 100644
> > index 000..ab3bbea8ed7
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.c-torture/compile/pr101941-1.c
> > @@ -0,0 +1,44 @@
> > +/* { dg-additional-options "-fconserve-stack" } */
> > +struct crypto_aes_ctx {
> > +  char key_dec[128];
> > +};
> > +
> > +int rfc4106_set_hash_subkey_hash_subkey;
> > +
> > +void __write_overflow(void)__attribute__((__error__("")));
> > +void __write_overflow1(void);
> > +void aes_encrypt(void*);
> > +
> > +void fortify_panic(const char*) __attribute__((__noreturn__)) ;
> > +
> > +char *rfc4106_set_hash_subkey(struct crypto_aes_ctx *ctx) {
> > +  void *a = &ctx->key_dec[0];
> > +  unsigned p_size =  __builtin_object_size(a, 0);
> > +#ifdef __OPTIMIZE__
> > +  if (p_size < 16) {
> > +__write_overflow1();
> > +fortify_panic(__func__);
> > +  }
> > +  if (p_size < 32) {
> > +__write_overflow();
> > +fortify_panic(__func__);
> > +  }
> > +#endif
> > +  aes_encrypt(ctx);
> > +  return ctx->key_dec;
> > +}
> > +
> > +char *(*gg)(struct crypto_aes_ctx *)

Re: [PATCH] options: Make -Ofast switch off -fsemantic-interposition

2021-11-19 Thread Jan Hubicka via Gcc-patches
> > Hi,
> > 
> > On Fri, Nov 12 2021, Martin Jambor wrote:
> > > Hi,
> > >
> > > using -fno-semantic-interposition has been reported by various people
> > > to bring about considerable speed up at the cost of strict compliance
> > > to the ELF symbol interposition rules  See for example
> > > https://fedoraproject.org/wiki/Changes/PythonNoSemanticInterpositionSpeedup
> > >
> > > As such I believe it should be implied by our -Ofast optimization
> > > level, not only so that benchmarks that can benefit run faster, but
> > > also so that people looking at -Ofast documentation for options that
> > > could speed their programs find it.
> > >
> > > I have verified that with the following patch IPA-CP sees
> > > flag_semantic_interposition set to zero at Ofast and that info and pdf
> > > manual builds fine with the documentation change.  I am bootstrapping
> > > and testing it now in order to comply with submission criteria but I
> > > don't think an Ofast change gets much tested.
> > >
> > > Assuming it passes, is the patch OK?  (If it is, I will also add a note
> > > about it in the "Caveats" section in gcc-12/changes.html of wwwdocs
> > > after I commit the patch.)
> > >
> > 
> > Unfortunately, I was wrong, there are testcases which use the optimize
> > attribute to switch a function to Ofast and those ICE because
> > -fsemantic-interposition is not an optimization flag and only
> > optimization flags can change in an optimize attribute (AFAIK, I only
> > had a quick glance at the results).
> > 
> > I am not sure what is the right way to tackle this, whether to set the
> > flag at Ofast in some nonstandard way or make the flag an optimization
> > flag - probably affecting function definitions, having it affect
> > call-sites seems too fine-grained.  I will try to discuss this on IRC on
> > Monday (and hope such change is still doable early stage3).
> > 
> > Sorry for posting this a bit prematurely,
> 
> Hi,
> 
> This patch turns flag_semantic_interposition to optimization option so
> it can be enabled with per-function granuality.  This is done by adding
> the flag among visibility flags into the symbol table.  This fixes the
> behaviour on the testcase I added to testsuite.
> 
> There are bugs where get_availability misbehaves on partitioned program.
> We can also use the new flag to fix those, but I will do that
> incrementally.
> 
> The -Ofast change should be safe now.

Also forgot to say it explicitly, the patch is OK, so please commit it.

Honza


Re: [PATCH] IBM Z: Fix load-and-test peephole2 condition

2021-11-19 Thread Andreas Krebbel via Gcc-patches
On 11/19/21 10:45, Stefan Schulze Frielinghaus wrote:
...
> diff --git a/gcc/testsuite/gcc.target/s390/20211119.c 
> b/gcc/testsuite/gcc.target/s390/2029.c
> new file mode 100644
> index 000..1a6df4f4b89
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/s390/2029.c
> @@ -0,0 +1,12 @@
> +/* { dg-do run } */
> +/* { dg-options "-Os -march=z10" } */

Although z10 is pretty old we will need an effective target check here. Ok with 
that change.

Thanks!

Andreas


Re: [RFC] c++: Print function template parms when relevant (was: [PATCH v4] c++: Add gnu::diagnose_as attribute)

2021-11-19 Thread Matthias Kretz
On Friday, 19 November 2021 10:53:27 CET Matthias Kretz wrote:
> > >> Ah, you're trying to omit defaulted parms from the ?  I'm not
> > >> sure
> > >> that's necessary, leaving them out of the [with ...] list should be
> > >> sufficient.
> > >
> > > I was thinking about all the std::allocator defaults in the standard
> > > library. I don't want to see them. E.g. vector::clear() on const
> > > object:
> > > 
> > > error: passing 'const std::vector' as 'this' argument discards
> > > qualifiers [...]/stl_vector.h:1498:7: note:   in call to 'void
> > > std::vector<_Tp, _Alloc>::clear() [with _Tp = int; _Alloc =
> > > std::allocator]'
> > > 
> > > With my patch the last line becomes
> > > [...]/stl_vector.h:1498:7: note:   in call to 'void
> > > std::vector<_Tp>::clear() [with _Tp = int]'
> > > 
> > > 
> > > Another case I didn't consider before:
> > > 
> > > template  struct A {
> > > 
> > > [[deprecated]] void f(U);
> > > 
> > > };
> > > 
> > > A a; a.f(1);
> > > 
> > > With my patch it prints 'void A::f(U) [with T = float]', with your
> > > suggestion 'void A::f(U) [with T = float]'. Both are missing
> > > important information in the substitution list, IMHO. Would 'void A > > = int>::f(U) [with T = float]' be an improvement? Or should
> > > find_typenames (in cp/error.c) find defaulted template parms and add
> > > them
> > > to its list? IIUC find_typenames would find all template parms and
> > > couldn't know whether they're defaulted.
> > 
> > That sounds good: omit defaulted parms only if they don't appear in the
> > signature (other than as another default template argument).
> 
> Let me check whether I have the right idea:
> 
> I could extend find_typenames (which walks the complete) tree to look for
> TEMPLATE_TYPE_PARM (and the 3 others I don't recall right now). But since
> that walks the *complete* tree, it'll simply find all parms with no
> indication whether they appear in the signature. Ideas:
> 
> 1. Count occurrences: with 2 occurrences, one of them must be a use in the
> signature.
> 
> 2. Walk only over TYPE_ARG_TYPES (TREE_TYPE (DECL_TEMPLATE_RESULT (fn))) to
> collect TEMPLATE_TYPE_PARMs.

I tried the latter:

@@ -1641,8 +1652,11 @@ dump_substitution (cxx_pretty_printer *pp,   
   
   && !(flags & TFF_NO_TEMPLATE_BINDINGS))  
   
 {
   vec *typenames = t ? find_typenames (t) : NULL; 
   
-  dump_template_bindings (pp, template_parms, template_args, typenames,
   
- flags);   
   
+  tree fn_arguments = TYPE_ARG_TYPES (TREE_TYPE (DECL_TEMPLATE_RESULT 
(t)));  
+  tree used_template_parms = find_template_parameters (fn_arguments,   
   
+  template_parms); 
   
+  dump_template_bindings (pp, template_parms, template_args,   
   
+ used_template_parms, typenames, flags);   
   
 }
 }

Now in dump_template_bindings it skips all defaulted template_parms that are 
not in used_template_parms. Makes this test pass:

template   
  struct id 
  { using type = T; };  

template  
  struct A  
  { 
template  
  [[deprecated]] static void
  f(typename id::type); 
  };

int main()  
{   
  A::f(0); // { dg-warning "'static void A::f\\(typename 
id::type\\) .with U0 = const int&; T0 = int; typename id::type = const 
int&.'" }
}

-- 
──
 Dr. Matthias Kretz   https://mattkretz.github.io
 GSI Helmholtz Centre for Heavy Ion Research   https://gsi.de
 stdₓ::simd
──





Remove gimple_static_chain test disabling modref in ref_maybe_used_in_call_p

2021-11-19 Thread Jan Hubicka via Gcc-patches
Hi,
this patch removes test for function not having call chain guarding
modref use in ref_maybe_used_by_call_p_1.  It never made sense since
modref treats call chain accesses explicitly. It was however copied from
earlier check for ECF_CONST (which seems dubious too, but I would like
to discuss it independelty).

This enables us to detect that memory pointed to static chain (or parts of it)
are unused by the function.

lto-bootstrapped-regtested all lanugages on x86_64-linux. OK?

gcc/ChangeLog:

2021-11-19  Jan Hubicka  

* tree-ssa-alias.c (ref_maybe_used_by_call_p_1): Do not guard modref
by !gimple_call_chain.

gcc/testsuite/ChangeLog:

2021-11-19  Jan Hubicka  

* gcc.dg/tree-ssa/modref-dse-6.c: New test.

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/modref-dse-6.c 
b/gcc/testsuite/gcc.dg/tree-ssa/modref-dse-6.c
new file mode 100644
index 000..d1e45a893ad
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/modref-dse-6.c
@@ -0,0 +1,23 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized"  } */
+int
+main()
+{
+  int a,b;
+  __attribute__ ((noinline))
+  void kill_me()
+  {
+a=1234;
+b=2234;
+  }
+  a=0;
+  b=1234;
+  __attribute__ ((noinline))
+  int reta()
+  {
+return a;
+  }
+  return reta();
+}
+/* { dg-final { scan-tree-dump-not "kill_me" "optimized" } } */
+/* { dg-final { scan-tree-dump-not "1234" "optimized" } } */
diff --git a/gcc/tree-ssa-alias.c b/gcc/tree-ssa-alias.c
index 02bbc87b597..cd6a0b2f67b 100644
--- a/gcc/tree-ssa-alias.c
+++ b/gcc/tree-ssa-alias.c
@@ -2755,7 +2755,7 @@ ref_maybe_used_by_call_p_1 (gcall *call, ao_ref *ref, 
bool tbaa_p)
 
   callee = gimple_call_fndecl (call);
 
-  if (!gimple_call_chain (call) && callee != NULL_TREE)
+  if (callee != NULL_TREE)
 {
   struct cgraph_node *node = cgraph_node::get (callee);
   /* We can not safely optimize based on summary of calle if it does


Re: [PATCH][V4] rs6000: Remove unnecessary option manipulation.

2021-11-19 Thread Martin Liška

On 11/19/21 12:43, Segher Boessenkool wrote:

On Fri, Nov 19, 2021 at 12:32:09PM +0100, Martin Liška wrote:

On 11/18/21 19:59, Segher Boessenkool wrote:

Please resend, without line wrapping (format=flawed).


Done in the original [v4] email, see here:
https://gcc.gnu.org/pipermail/gcc-patches/2021-November/584267.html


Which you didn't send to me.  And the archives we have now don't let me
grab the raw mail from there anymore.  Please send this to me if you
want it reviewed.


Fine. I've just done that.

Martin




Segher





Re: [PATCH] Do not abort compilation when dump file is /dev/*

2021-11-19 Thread Bernhard Reutner-Fischer via Gcc-patches
On Fri, 19 Nov 2021 10:35:26 +0100 (CET)
Richard Biener via Gcc-patches  wrote:

> On Fri, 19 Nov 2021, Alexandre Oliva wrote:
> 
> > On Nov 18, 2021, Richard Biener  wrote:
> >   
> > > IMHO a more reasonable thing to do would be to not treat
> > > -o /dev/null as a source for -dumpdir and friends.  Alex?  
> > 
> > +1
> > 
> > I think we already have some special-casing for /dev/null somewhere.  
> 
> Grepping finds me the following in system.h which is already checked
> for in gcc.c in a few places indeed.
> 
> /* Provide a default for the HOST_BIT_BUCKET.
>This suffices for POSIX-like hosts.  */
> 
> #ifndef HOST_BIT_BUCKET
> #define HOST_BIT_BUCKET "/dev/null"
> #endif
> 
> 
> > > You did the last re-org, where'd we put such special casing?  
> > 
> > I think we're missing something like this, to avoid messing with dumpdir
> > with -o /dev/null.  We already use the same function when computing
> > outbase just below this.  
> 
> Ah yeah, not_actual_file_p should do the trick indeed.  Giuliano, can
> you update the patch like below?  I think we should still adjust
> documentation as you did.

But that wouldn't cater for the general problem that the dumpdir is not
writable, no? Why not just simply check access W_OK of the dumpdir?

Otherwise a dumpdir /dev/full or anyother such path will cause the same
thing i guess.

thanks,
> 
> > diff --git a/gcc/gcc.c b/gcc/gcc.c
> > index 506c2acc282d6..a986728fb91d6 100644
> > --- a/gcc/gcc.c
> > +++ b/gcc/gcc.c
> > @@ -5098,7 +5098,8 @@ process_command (unsigned int decoded_options_count,
> >  
> >bool explicit_dumpdir = dumpdir;
> >  
> > -  if (!save_temps_overrides_dumpdir && explicit_dumpdir)
> > +  if ((!save_temps_overrides_dumpdir && explicit_dumpdir)
> > +  || (output_file != NULL && not_actual_file_p (output_file)))
> >  {
> >/* Do nothing.  */
> >  }
> >   



Re: [PATCH][V4] rs6000: Remove unnecessary option manipulation.

2021-11-19 Thread Martin Liška

On 11/19/21 13:25, Martin Liška wrote:

Fine. I've just done that.


All right, so I can't send an email from my local machine and git imap-send
does not work as it goes through Thunderbird.

So my last attempt is attaching the email so that you can add the .eml file.

Martin--- Begin Message ---
Do not set flag_rename_registers, it's already enabled with 
EnabledBy(funroll-loops)
in the common.opt file. Use EnabledBy for unroll_only_small_loops which
is a canonical approach how can be make option dependencies.

gcc/ChangeLog:

* config/rs6000/rs6000.c (rs6000_override_options_after_change):
Do not set flag_rename_registers and unroll_only_small_loops.
* config/rs6000/rs6000.opt: Use EnabledBy for unroll_only_small_loops.
---
 gcc/config/rs6000/rs6000.c   | 7 +--
 gcc/config/rs6000/rs6000.opt | 2 +-
 2 files changed, 2 insertions(+), 7 deletions(-)

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index e4843eb0f1c..5550113a94c 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -3466,13 +3466,8 @@ rs6000_override_options_after_change (void)
   /* Explicit -funroll-loops turns -munroll-only-small-loops off, and
  turns -frename-registers on.  */
   if ((OPTION_SET_P (flag_unroll_loops) && flag_unroll_loops)
-   || (OPTION_SET_P (flag_unroll_all_loops)
-  && flag_unroll_all_loops))
+   || (OPTION_SET_P (flag_unroll_all_loops) && flag_unroll_all_loops))
 {
-  if (!OPTION_SET_P (unroll_only_small_loops))
-   unroll_only_small_loops = 0;
-  if (!OPTION_SET_P (flag_rename_registers))
-   flag_rename_registers = 1;
   if (!OPTION_SET_P (flag_cunroll_grow_size))
flag_cunroll_grow_size = 1;
 }
diff --git a/gcc/config/rs6000/rs6000.opt b/gcc/config/rs6000/rs6000.opt
index 9d7878f144a..faeb7423ca7 100644
--- a/gcc/config/rs6000/rs6000.opt
+++ b/gcc/config/rs6000/rs6000.opt
@@ -546,7 +546,7 @@ Target Undocumented Var(rs6000_optimize_swaps) Init(1) Save
 Analyze and remove doubleword swaps from VSX computations.
 
 munroll-only-small-loops
-Target Undocumented Var(unroll_only_small_loops) Init(0) Save
+Target Undocumented Var(unroll_only_small_loops) Init(0) Save 
EnabledBy(funroll-loops)
 ; Use conservative small loop unrolling.
 
 mpower9-misc
-- 
2.33.1

--- End Message ---


Re: [PATCH][V4] rs6000: Remove unnecessary option manipulation.

2021-11-19 Thread Segher Boessenkool
On Fri, Nov 19, 2021 at 01:31:21PM +0100, Martin Liška wrote:
> All right, so I can't send an email from my local machine and git imap-send
> does not work as it goes through Thunderbird.

Hrm, painful (for you).  You should figure out how you can do the basics
of the patch-based workflow that we all have to do.

> So my last attempt is attaching the email so that you can add the .eml file.

I cannot do that, but it is text, so it is inlined fine.  Thanks.

This is almost identical to what git format-patch would give you, btw.

> Subject: [PATCH][V4] rs6000: Remove unnecessary option manipulation.

> Do not set flag_rename_registers, it's already enabled with 
> EnabledBy(funroll-loops)
> in the common.opt file. Use EnabledBy for unroll_only_small_loops which
> is a canonical approach how can be make option dependencies.
> 
> gcc/ChangeLog:
> 
>   * config/rs6000/rs6000.c (rs6000_override_options_after_change):
>   Do not set flag_rename_registers and unroll_only_small_loops.

Please don't end changelog lines in ":", it looks like something is
missing.  Changelog lines wrap at 80 cols.

>   * config/rs6000/rs6000.opt: Use EnabledBy for unroll_only_small_loops.

> --- a/gcc/config/rs6000/rs6000.c
> +++ b/gcc/config/rs6000/rs6000.c
> @@ -3466,13 +3466,8 @@ rs6000_override_options_after_change (void)
>/* Explicit -funroll-loops turns -munroll-only-small-loops off, and
>   turns -frename-registers on.  */
>if ((OPTION_SET_P (flag_unroll_loops) && flag_unroll_loops)
> -   || (OPTION_SET_P (flag_unroll_all_loops)
> -&& flag_unroll_all_loops))
> +   || (OPTION_SET_P (flag_unroll_all_loops) && flag_unroll_all_loops))

You should mention this change in the changelog as well.

> -  if (!OPTION_SET_P (unroll_only_small_loops))
> - unroll_only_small_loops = 0;
> -  if (!OPTION_SET_P (flag_rename_registers))
> - flag_rename_registers = 1;
>if (!OPTION_SET_P (flag_cunroll_grow_size))
>   flag_cunroll_grow_size = 1;
>  }
> diff --git a/gcc/config/rs6000/rs6000.opt b/gcc/config/rs6000/rs6000.opt
> index 9d7878f144a..faeb7423ca7 100644
> --- a/gcc/config/rs6000/rs6000.opt
> +++ b/gcc/config/rs6000/rs6000.opt
> @@ -546,7 +546,7 @@ Target Undocumented Var(rs6000_optimize_swaps) Init(1) 
> Save
>  Analyze and remove doubleword swaps from VSX computations.
>  
>  munroll-only-small-loops
> -Target Undocumented Var(unroll_only_small_loops) Init(0) Save
> +Target Undocumented Var(unroll_only_small_loops) Init(0) Save 
> EnabledBy(funroll-loops)
>  ; Use conservative small loop unrolling.

That is the opposite of the original logic.


Segher


Re: Remove gimple_static_chain test disabling modref in ref_maybe_used_in_call_p

2021-11-19 Thread Richard Biener via Gcc-patches
On Fri, 19 Nov 2021, Jan Hubicka wrote:

> Hi,
> this patch removes test for function not having call chain guarding
> modref use in ref_maybe_used_by_call_p_1.  It never made sense since
> modref treats call chain accesses explicitly. It was however copied from
> earlier check for ECF_CONST (which seems dubious too, but I would like
> to discuss it independelty).
> 
> This enables us to detect that memory pointed to static chain (or parts of it)
> are unused by the function.
> 
> lto-bootstrapped-regtested all lanugages on x86_64-linux. OK?

OK.

> gcc/ChangeLog:
> 
> 2021-11-19  Jan Hubicka  
> 
>   * tree-ssa-alias.c (ref_maybe_used_by_call_p_1): Do not guard modref
>   by !gimple_call_chain.
> 
> gcc/testsuite/ChangeLog:
> 
> 2021-11-19  Jan Hubicka  
> 
>   * gcc.dg/tree-ssa/modref-dse-6.c: New test.
> 
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/modref-dse-6.c 
> b/gcc/testsuite/gcc.dg/tree-ssa/modref-dse-6.c
> new file mode 100644
> index 000..d1e45a893ad
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/modref-dse-6.c
> @@ -0,0 +1,23 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fdump-tree-optimized"  } */
> +int
> +main()
> +{
> +  int a,b;
> +  __attribute__ ((noinline))
> +  void kill_me()
> +  {
> +a=1234;
> +b=2234;
> +  }
> +  a=0;
> +  b=1234;
> +  __attribute__ ((noinline))
> +  int reta()
> +  {
> +return a;
> +  }
> +  return reta();
> +}
> +/* { dg-final { scan-tree-dump-not "kill_me" "optimized" } } */
> +/* { dg-final { scan-tree-dump-not "1234" "optimized" } } */
> diff --git a/gcc/tree-ssa-alias.c b/gcc/tree-ssa-alias.c
> index 02bbc87b597..cd6a0b2f67b 100644
> --- a/gcc/tree-ssa-alias.c
> +++ b/gcc/tree-ssa-alias.c
> @@ -2755,7 +2755,7 @@ ref_maybe_used_by_call_p_1 (gcall *call, ao_ref *ref, 
> bool tbaa_p)
>  
>callee = gimple_call_fndecl (call);
>  
> -  if (!gimple_call_chain (call) && callee != NULL_TREE)
> +  if (callee != NULL_TREE)
>  {
>struct cgraph_node *node = cgraph_node::get (callee);
>/* We can not safely optimize based on summary of calle if it does
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Ivo Totev; HRB 36809 (AG Nuernberg)


Re: [PATCH] Do not abort compilation when dump file is /dev/*

2021-11-19 Thread Richard Biener via Gcc-patches
On Fri, 19 Nov 2021, Bernhard Reutner-Fischer wrote:

> On Fri, 19 Nov 2021 10:35:26 +0100 (CET)
> Richard Biener via Gcc-patches  wrote:
> 
> > On Fri, 19 Nov 2021, Alexandre Oliva wrote:
> > 
> > > On Nov 18, 2021, Richard Biener  wrote:
> > >   
> > > > IMHO a more reasonable thing to do would be to not treat
> > > > -o /dev/null as a source for -dumpdir and friends.  Alex?  
> > > 
> > > +1
> > > 
> > > I think we already have some special-casing for /dev/null somewhere.  
> > 
> > Grepping finds me the following in system.h which is already checked
> > for in gcc.c in a few places indeed.
> > 
> > /* Provide a default for the HOST_BIT_BUCKET.
> >This suffices for POSIX-like hosts.  */
> > 
> > #ifndef HOST_BIT_BUCKET
> > #define HOST_BIT_BUCKET "/dev/null"
> > #endif
> > 
> > 
> > > > You did the last re-org, where'd we put such special casing?  
> > > 
> > > I think we're missing something like this, to avoid messing with dumpdir
> > > with -o /dev/null.  We already use the same function when computing
> > > outbase just below this.  
> > 
> > Ah yeah, not_actual_file_p should do the trick indeed.  Giuliano, can
> > you update the patch like below?  I think we should still adjust
> > documentation as you did.
> 
> But that wouldn't cater for the general problem that the dumpdir is not
> writable, no? Why not just simply check access W_OK of the dumpdir?
> 
> Otherwise a dumpdir /dev/full or anyother such path will cause the same
> thing i guess.

I think those cases are OK to diagnose.  Just choosing a
not_actual_file_p output as base to derive the dump directory is
bad.

Richard.


Re: [PATCH][V4] rs6000: Remove unnecessary option manipulation.

2021-11-19 Thread Martin Liška

On 11/19/21 13:45, Segher Boessenkool wrote:

On Fri, Nov 19, 2021 at 01:31:21PM +0100, Martin Liška wrote:

All right, so I can't send an email from my local machine and git imap-send
does not work as it goes through Thunderbird.


Hrm, painful (for you).  You should figure out how you can do the basics
of the patch-based workflow that we all have to do.


Sure, but to be honest others don't have problem with patches attached to an 
email
(with whatever mime type an email client uses).




So my last attempt is attaching the email so that you can add the .eml file.


I cannot do that, but it is text, so it is inlined fine.  Thanks.

This is almost identical to what git format-patch would give you, btw.


Sure, apparently Redirect plugin in TW should work and preserve the email as 
you need.




Subject: [PATCH][V4] rs6000: Remove unnecessary option manipulation.



Do not set flag_rename_registers, it's already enabled with 
EnabledBy(funroll-loops)
in the common.opt file. Use EnabledBy for unroll_only_small_loops which
is a canonical approach how can be make option dependencies.

gcc/ChangeLog:

* config/rs6000/rs6000.c (rs6000_override_options_after_change):
Do not set flag_rename_registers and unroll_only_small_loops.


Please don't end changelog lines in ":", it looks like something is
missing.  Changelog lines wrap at 80 cols.


* config/rs6000/rs6000.opt: Use EnabledBy for unroll_only_small_loops.



--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -3466,13 +3466,8 @@ rs6000_override_options_after_change (void)
/* Explicit -funroll-loops turns -munroll-only-small-loops off, and
   turns -frename-registers on.  */
if ((OPTION_SET_P (flag_unroll_loops) && flag_unroll_loops)
-   || (OPTION_SET_P (flag_unroll_all_loops)
-  && flag_unroll_all_loops))
+   || (OPTION_SET_P (flag_unroll_all_loops) && flag_unroll_all_loops))


You should mention this change in the changelog as well.


-  if (!OPTION_SET_P (unroll_only_small_loops))
-   unroll_only_small_loops = 0;
-  if (!OPTION_SET_P (flag_rename_registers))
-   flag_rename_registers = 1;
if (!OPTION_SET_P (flag_cunroll_grow_size))
flag_cunroll_grow_size = 1;
  }
diff --git a/gcc/config/rs6000/rs6000.opt b/gcc/config/rs6000/rs6000.opt
index 9d7878f144a..faeb7423ca7 100644
--- a/gcc/config/rs6000/rs6000.opt
+++ b/gcc/config/rs6000/rs6000.opt
@@ -546,7 +546,7 @@ Target Undocumented Var(rs6000_optimize_swaps) Init(1) Save
  Analyze and remove doubleword swaps from VSX computations.
  
  munroll-only-small-loops

-Target Undocumented Var(unroll_only_small_loops) Init(0) Save
+Target Undocumented Var(unroll_only_small_loops) Init(0) Save 
EnabledBy(funroll-loops)
  ; Use conservative small loop unrolling.


That is the opposite of the original logic.


You are correct! Forget about the patch, I don't want it merged any longer.

Martin




Segher





Re: [PATCH] Fix tree-optimization/101941: IPA splitting out function with error attribute

2021-11-19 Thread Richard Biener via Gcc-patches
On Fri, Nov 19, 2021 at 12:50 PM Andrew Pinski  wrote:
>
> On Fri, Nov 19, 2021 at 2:16 AM Richard Biener via Gcc-patches
>  wrote:
> >
> > On Wed, Nov 17, 2021 at 7:25 AM apinski--- via Gcc-patches
> >  wrote:
> > >
> > > From: Andrew Pinski 
> > >
> > > The Linux kernel started to fail compile when the jump threader was 
> > > improved
> > > (r12-2591-g2e96b5f14e4025691). This failure was due to the IPA splitting 
> > > code
> > > decided now to split off the basic block which contained two functions,
> > > one of those functions included the error attribute on them.  This patch 
> > > fixes
> > > the problem by disallowing basic blocks from being split which contain 
> > > functions
> > > that have either the error or warning attribute on them.
> > >
> > > The two new testcases are to make sure we still split the function for 
> > > other
> > > places if we reject the one case.
> >
> > Hmm, it's only problematic if the immediate(?) controlling condition of the
> > error/warning annotated call is not in the split part, no?
> No, if we have something like:
>   if (p_size < 32) {
> if (ctx != 0) {
>   __write_overflow();
>}
> fortify_panic(__func__);
>  }
> We would still run into the problem if we just disable the splitting
> for the innermost bb and split off the 3 other bb's
> I have a testcase where the above would fail if we decide only to make
> sure the split point is not after ctx !=0 check.
>
> > Interestingly we for example don't avoid splitting away 
> > __builtin_constant_p either.
>
> __builtin_constant_p is handled a different way already; in
> check_forbidden_calls we set forbidden_dominators to include the bb
> where the builtin_constant_p would have been true.
> And then when we consider the split, we reject if the entry point
> would have been dominated by one of those bbs.
> This was PR49642.

I see.  So how's error/warning different - don't we want to forbit split points
that dominate those calls itself?

>
>
> Thanks,
> Andrew Pinski
>
>
> >
> > Richard.
> >
> > > OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.
> > >
> > > PR tree-optimization/101941
> > >
> > > gcc/ChangeLog:
> > >
> > > * ipa-split.c (visit_bb): Disallow function calls where
> > > the function has either error or warning attribute.
> > >
> > > gcc/testsuite/ChangeLog:
> > >
> > > * gcc.c-torture/compile/pr101941-1.c: New test.
> > > * gcc.dg/tree-ssa/pr101941-1.c: New test.
> > > ---
> > >  gcc/ipa-split.c   | 12 -
> > >  .../gcc.c-torture/compile/pr101941-1.c| 44 +
> > >  gcc/testsuite/gcc.dg/tree-ssa/pr101941-1.c| 48 +++
> > >  3 files changed, 103 insertions(+), 1 deletion(-)
> > >  create mode 100644 gcc/testsuite/gcc.c-torture/compile/pr101941-1.c
> > >  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr101941-1.c
> > >
> > > diff --git a/gcc/ipa-split.c b/gcc/ipa-split.c
> > > index c68577d04a9..070e894ef31 100644
> > > --- a/gcc/ipa-split.c
> > > +++ b/gcc/ipa-split.c
> > > @@ -873,7 +873,7 @@ visit_bb (basic_block bb, basic_block return_bb,
> > >gimple *stmt = gsi_stmt (bsi);
> > >tree op;
> > >ssa_op_iter iter;
> > > -  tree decl;
> > > +  tree decl = NULL_TREE;
> > >
> > >if (is_gimple_debug (stmt))
> > > continue;
> > > @@ -927,6 +927,16 @@ visit_bb (basic_block bb, basic_block return_bb,
> > > break;
> > >   }
> > >
> > > +  /* If a function call and that function has either the
> > > +warning or error attribute on it, don't split.  */
> > > +  if (decl && (lookup_attribute ("warning", DECL_ATTRIBUTES (decl))
> > > +  || lookup_attribute ("error", DECL_ATTRIBUTES (decl
> > > +   {
> > > + if (dump_file && (dump_flags & TDF_DETAILS))
> > > +   fprintf (dump_file, "Cannot split: warning or error 
> > > attribute.\n");
> > > + can_split = false;
> > > +   }
> > > +
> > >FOR_EACH_SSA_TREE_OPERAND (op, stmt, iter, SSA_OP_DEF)
> > > bitmap_set_bit (set_ssa_names, SSA_NAME_VERSION (op));
> > >FOR_EACH_SSA_TREE_OPERAND (op, stmt, iter, SSA_OP_USE)
> > > diff --git a/gcc/testsuite/gcc.c-torture/compile/pr101941-1.c 
> > > b/gcc/testsuite/gcc.c-torture/compile/pr101941-1.c
> > > new file mode 100644
> > > index 000..ab3bbea8ed7
> > > --- /dev/null
> > > +++ b/gcc/testsuite/gcc.c-torture/compile/pr101941-1.c
> > > @@ -0,0 +1,44 @@
> > > +/* { dg-additional-options "-fconserve-stack" } */
> > > +struct crypto_aes_ctx {
> > > +  char key_dec[128];
> > > +};
> > > +
> > > +int rfc4106_set_hash_subkey_hash_subkey;
> > > +
> > > +void __write_overflow(void)__attribute__((__error__("")));
> > > +void __write_overflow1(void);
> > > +void aes_encrypt(void*);
> > > +
> > > +void fortify_panic(const char*) __attribute__((__noreturn__)) ;
> > > +
> > > +char *rfc4106_set_hash_subkey(struct crypto_aes_ctx *ctx) 

Re: [PATCH][wwwdocs] remove references to java in htdocs/projects/beginner.html

2021-11-19 Thread Eric Gallager via Gcc-patches
On Fri, Nov 19, 2021 at 1:48 AM Gerald Pfeifer  wrote:
>
> On Thu, 18 Nov 2021, Eric Gallager wrote:
> > I'd find it easier to just edit the page linked to in wwwdocs instead,
> > so I'm going to start seeing what I can do to update it. I figured I'd
> > start by removing the references to Java in it, since Java has been
> > removed. A patch to do that is attached. Eric Gallager
>
> Cool, thank you!
>
> Please feel free to commit patches like this without asking for
> approval (though I'm happy to review and approve).
>
> Gerald
>

OK thanks; committed as dbaebcd


Re: [PATCH] Remove MAY_HAVE_DEBUG_MARKER_STMTS and MAY_HAVE_DEBUG_BIND_STMTS.

2021-11-19 Thread Martin Liška

On 11/19/21 09:59, Alexandre Oliva wrote:

So I find the abstraction useful.  However, I don't have plans to add
other kinds of debug stmts, and I don't know of anyone else who does, so
I won't stand in the way if others think removing these abstractions is
a positive change.


Hello.

I've already installed the patch based on Jeff's approval (and waiting for 24 
hours)
as 206b22d021d94adbaa79e1d443c87415254b15de.

Martin


[PR102988] harden cond: detach without decls

2021-11-19 Thread Alexandre Oliva via Gcc-patches


When we create copies of SSA_NAMEs to hold "detached" copies of the
values for the hardening tests, we end up with assignments to
SSA_NAMEs that refer to the same decls.  That would be generally
desirable, since it enables the variable to be recognized in dumps,
and makes coalescing more likely if the original variable dies at that
point.  When the decl is a DECL_BY_REFERENCE, the SSA_NAME holds the
address of a parm or result, and it's read-only, so we shouldn't
create assignments to it.  Gimple checkers flag at least the case of
results.

This patch arranges for us to avoid referencing the same decls, which
cures the problem, but retaining the visible association between the
SSA_NAMEs, by using the same identifier for the copy.

Regstrapped on x86_64-linux-gnu, also bootstrapped with both
-fharden-compares and -fharden-conditional-branches enabled by default.
Ok to install?


for  gcc/ChangeLog

PR tree-optimization/102988
* gimple-harden-conditionals.cc (detach_value): Copy SSA_NAME
without decl sharing.

for  gcc/testsuite/ChangeLog

PR tree-optimization/102988
* g++.dg/pr102988.C: New.
---
 gcc/gimple-harden-conditionals.cc |9 -
 gcc/testsuite/g++.dg/pr102988.C   |   17 +
 2 files changed, 25 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/g++.dg/pr102988.C

diff --git a/gcc/gimple-harden-conditionals.cc 
b/gcc/gimple-harden-conditionals.cc
index 8916420d7dfe9..cfa2361d65be0 100644
--- a/gcc/gimple-harden-conditionals.cc
+++ b/gcc/gimple-harden-conditionals.cc
@@ -123,7 +123,14 @@ detach_value (location_t loc, gimple_stmt_iterator *gsip, 
tree val)
   return val;
 }
 
-  tree ret = copy_ssa_name (val);
+  /* Create a SSA "copy" of VAL.  This could be an anonymous
+ temporary, but it's nice to have it named after the corresponding
+ variable.  Alas, when VAL is a DECL_BY_REFERENCE RESULT_DECL,
+ setting (a copy of) it would be flagged by checking, so we don't
+ use copy_ssa_name: we create an anonymous SSA name, and then give
+ it the same identifier (rather than decl) as VAL.  */
+  tree ret = make_ssa_name (TREE_TYPE (val));
+  SET_SSA_NAME_VAR_OR_IDENTIFIER (ret, SSA_NAME_IDENTIFIER (val));
 
   /* Output asm ("" : "=g" (ret) : "0" (val));  */
   vec *inputs = NULL;
diff --git a/gcc/testsuite/g++.dg/pr102988.C b/gcc/testsuite/g++.dg/pr102988.C
new file mode 100644
index 0..05a1a8f67ed9b
--- /dev/null
+++ b/gcc/testsuite/g++.dg/pr102988.C
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fharden-conditional-branches -fchecking=1" } */
+
+/* DECL_BY_REFERENCE RESULT_DECL is read-only, we can't create a copy
+   of its (address) default def and set it.  */
+
+void ll();
+struct k {
+  ~k();
+};
+k ice(k *a)
+{
+  k v;
+  if (&v!= a)
+ll();
+  return v;
+}


-- 
Alexandre Oliva, happy hackerhttps://FSFLA.org/blogs/lxo/
   Free Software Activist   GNU Toolchain Engineer
Disinformation flourishes because many people care deeply about injustice
but very few check the facts.  Ask me about 


[PATCH v3] Do not abort compilation when dump file is /dev/*

2021-11-19 Thread Giuliano Belinassi via Gcc-patches
The `configure` scripts generated with autoconf often tests compiler
features by setting output to `/dev/null`, which then sets the dump
folder as being /dev/* and the compilation halts with an error because
GCC cannot create files in /dev/. This is a problem when configure is
testing for compiler features because it cannot tell if the failure was
due to unsupported features or any other problem, and disable it even
if it is working.

As an example, running configure overriding CFLAGS="-fdump-ipa-clones"
will result in several compiler-features as being disabled because of
gcc halting with an error creating files in /dev/*.

This commit fixes this issue by checking if the output file is
/dev/null or /dev/zero. In this case we use the current working
directory for dump output instead of the directory of the output
file because we cannot write to /dev/*.

gcc/ChangeLog
2021-11-16  Giuliano Belinassi  

* gcc.c (process_command): Skip dumpdir override if file is a
not_actual_file_p.
(not_actual_file_p): Return true if file is /dev/zero as well.
* doc/invoke.texi: Update -dumpdir documentation.

gcc/testsuite/ChangeLog
2021-11-16  Giuliano Belinassi  

* gcc.dg/devnull-dump.c: New.

Signed-off-by: Giuliano Belinassi 
---
 gcc/doc/invoke.texi | 3 ++-
 gcc/gcc.c   | 6 --
 gcc/testsuite/gcc.dg/devnull-dump.c | 7 +++
 3 files changed, 13 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/devnull-dump.c

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 6070288856c..4a17cd4d317 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -1877,7 +1877,8 @@ named @file{dir/bar.*}, combining the given @var{dumppfx} 
with the
 default @var{dumpbase} derived from the primary output name.  Dump
 outputs also take the input name suffix: @file{dir/bar.c.*}.
 
-It defaults to the location of the output file; options
+It defaults to the location of the output file, unless the output
+file is a special file like @code{/dev/null}. Options
 @option{-save-temps=cwd} and @option{-save-temps=obj} override this
 default, just like an explicit @option{-dumpdir} option.  In case
 multiple such options are given, the last one prevails:
diff --git a/gcc/gcc.c b/gcc/gcc.c
index 506c2acc282..43d7cde1be9 100644
--- a/gcc/gcc.c
+++ b/gcc/gcc.c
@@ -5098,7 +5098,8 @@ process_command (unsigned int decoded_options_count,
 
   bool explicit_dumpdir = dumpdir;
 
-  if (!save_temps_overrides_dumpdir && explicit_dumpdir)
+  if ((!save_temps_overrides_dumpdir && explicit_dumpdir)
+  || (output_file && not_actual_file_p (output_file)))
 {
   /* Do nothing.  */
 }
@@ -10716,7 +10717,8 @@ static bool
 not_actual_file_p (const char *name)
 {
   return (strcmp (name, "-") == 0
- || strcmp (name, HOST_BIT_BUCKET) == 0);
+ || strcmp (name, HOST_BIT_BUCKET) == 0
+ || strcmp (name, "/dev/zero") == 0);
 }
 
 /* %:dumps spec function.  Take an optional argument that overrides
diff --git a/gcc/testsuite/gcc.dg/devnull-dump.c 
b/gcc/testsuite/gcc.dg/devnull-dump.c
new file mode 100644
index 000..378e0901c28
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/devnull-dump.c
@@ -0,0 +1,7 @@
+/* { dg-do assemble } */
+/* { dg-options "-fdump-ipa-clones -o /dev/null" } */
+
+int main()
+{
+  return 0;
+}
-- 
2.33.1



[PATCH, v2, OpenMP 5.0] Remove array section base-pointer mapping semantics, and other front-end adjustments (mainline trunk)

2021-11-19 Thread Chung-Lin Tang

Hi Jakub,
attached is a rebased version of this "OpenMP fixes/adjustments" patch.

This version removes some of the (ort == C_ORT_OMP || ort == C_ORT_ACC) stuff 
that's not needed
in handle_omp_array_sections_1 and [c_]finish_omp_clauses.

Note that this is meant to be patched atop of the recent also posted C++ 
PR92120 v5 patch:
https://gcc.gnu.org/pipermail/gcc-patches/2021-November/584602.html

Again, tested without regressions (together with the PR92120 patch), awaiting 
review.

Thanks,
Chung-Lin

(ChangeLog updated below)

On 2021/5/25 9:36 PM, Chung-Lin Tang wrote:


This patch largely implements three pieces of functionality:

(1) Per discussion and clarification on the omp-lang mailing list,
standards conforming behavior for mapping array sections should *NOT* also map 
the base-pointer,
i.e for this code:

 struct S { int *ptr; ... };
 struct S s;
 #pragma omp target enter data map(to: s.ptr[:100])

Currently we generate after gimplify:
#pragma omp target enter data map(struct:s [len: 1]) map(alloc:s.ptr [len: 8]) \
    map(to:*_1 [len: 400]) map(attach:s.ptr [bias: 
0])

which is deemed incorrect. After this patch, the gimplify results are now 
adjusted to:
#pragma omp target enter data map(to:*_1 [len: 400]) map(attach:s.ptr [bias: 0])
(the attach operation is still generated, and if s.ptr is already mapped prior, 
attachment will happen)

The correct way of achieving the base-pointer-also-mapped behavior would be to 
use:
#pragma omp target enter data map(to: s.ptr, s.ptr[:100])

This adjustment in behavior required a number of small adjustments here and 
there in gimplify, including
to accomodate map sequences for C++ references.

There is also a small Fortran front-end patch involved (hence CCing Tobias and 
fortran@).
The new gimplify processing changed behavior in handling 
GOMP_MAP_ALWAYS_POINTER maps such that
the libgomp.fortran/struct-elem-map-1.f90 regressed. It appeared that the 
Fortran FE was generating
a GOMP_MAP_ALWAYS_POINTER for array types, which didn't seem quite correct, and 
the pre-patch behavior
was removing this map anyways. I have a small change in 
trans-openmp.c:gfc_trans_omp_array_section
to not generate the map in this case, and so far no bad test results.

(2) The second part (though kind of related to the first above) are fixes in 
libgomp/target.c
to not overwrite attached pointers when handling device<->host copies, mainly for the 
"always" case.
This behavior is also noted in the 5.0 spec, but not yet properly coded before.

(3) The third is a set of changes to the C/C++ front-ends to extend the allowed 
component access syntax
in map clauses. This is actually mainly an effort to allow SPEC HPC to compile, 
so despite in the long
term the entire map clause syntax parsing is probably going to be revamped, 
we're still adding this in
for now. These changes are enabled for both OpenACC and OpenMP.



2021-11-19  Chung-Lin Tang  

gcc/c/ChangeLog:

* c-parser.c (struct omp_dim): New struct type for use inside
c_parser_omp_variable_list.
(c_parser_omp_variable_list): Allow multiple levels of array and
component accesses in array section base-pointer expression.
(c_parser_omp_clause_to): Set 'allow_deref' to true in call to
c_parser_omp_var_list_parens.
(c_parser_omp_clause_from): Likewise.
* c-typeck.c (handle_omp_array_sections_1): Extend allowed range
of base-pointer expressions involving INDIRECT/MEM/ARRAY_REF and
POINTER_PLUS_EXPR.
(c_finish_omp_clauses): Extend allowed ranged of expressions
involving INDIRECT/MEM/ARRAY_REF and POINTER_PLUS_EXPR.

gcc/cp/ChangeLog:

* parser.c (struct omp_dim): New struct type for use inside
cp_parser_omp_var_list_no_open.
(cp_parser_omp_var_list_no_open): Allow multiple levels of array and
component accesses in array section base-pointer expression.
(cp_parser_omp_all_clauses): Set 'allow_deref' to true in call to
cp_parser_omp_var_list for to/from clauses.
* semantics.c (handle_omp_array_sections_1): Extend allowed range
of base-pointer expressions involving INDIRECT/MEM/ARRAY_REF and
POINTER_PLUS_EXPR.
(handle_omp_array_sections): Adjust pointer map generation of
references.
(finish_omp_clauses): Extend allowed ranged of expressions
involving INDIRECT/MEM/ARRAY_REF and POINTER_PLUS_EXPR.

gcc/fortran/ChangeLog:

* trans-openmp.c (gfc_trans_omp_array_section): Do not generate
GOMP_MAP_ALWAYS_POINTER map for main array maps of ARRAY_TYPE type.

gcc/ChangeLog:

* gimplify.c (extract_base_bit_offset): Add 'tree *offsetp' parameter,
accomodate case where 'offset' return of get_inner_reference is
non-NULL.
(is_or_contains_p): Further robustify conditions.
(omp_target_reorder_clauses): In alloc/to/from sorting phase, also
move follow

Re: [PATCH, PR90030] Fortran OpenMP/OpenACC array mapping alignment fix

2021-11-19 Thread Chung-Lin Tang

Ping.

On 2021/11/4 4:23 PM, Chung-Lin Tang wrote:

Hi Jakub,
As Thomas reported and submitted a patch a while ago:
https://gcc.gnu.org/pipermail/gcc-patches/2019-April/519932.html
https://gcc.gnu.org/pipermail/gcc-patches/2019-May/522738.html

There's an issue with the Fortran front-end when mapping arrays: when
creating the data MEM_REF for the map clause, there's a convention of
casting the referencing pointer to 'c_char *' by
fold_convert (build_pointer_type (char_type_node), ptr).

This causes the alignment passed to the libgomp runtime for array data
hardwared to '1', and causes alignment errors on the offload target
(not always showing up, but can trigger due to slight change of clause
ordering)

This patch is not exactly Thomas' patch from 2019, but does the same
thing. The new libgomp tests are directly reused though. A lot of
scan test adjustment is also included in this patch.

Patch has been tested for no regressions for gfortran and libgomp, is
this okay for trunk?

Thanks,
Chung-Lin

Fortran: fix array alignment for OpenMP/OpenACC target mapping clauses [PR90030]

The Fortran front-end is creating maps of array data with a type of pointer to
char_type_node, which when eventually passed to libgomp during runtime, marks
the passed array with an alignment of 1, which can cause mapping alignment
errors on the offload target.

This patch removes the related fold_convert(build_pointer_type (char_type_node))
calls in fortran/trans-openmp.c, and adds gcc_asserts to ensure pointer type.

2021-11-04  Chung-Lin Tang  
     Thomas Schwinge 

 PR fortran/90030

gcc/fortran/ChangeLog:

 * trans-openmp.c (gfc_omp_finish_clause): Remove fold_convert to pointer
 to char_type_node, add gcc_assert of POINTER_TYPE_P.
 (gfc_trans_omp_array_section): Likewise.
 (gfc_trans_omp_clauses): Likewise.

gcc/testsuite/ChangeLog:

 * gfortran.dg/goacc/finalize-1.f: Adjust scan test.
 * gfortran.dg/gomp/affinity-clause-1.f90: Likewise.
 * gfortran.dg/gomp/affinity-clause-5.f90: Likewise.
 * gfortran.dg/gomp/defaultmap-4.f90: Likewise.
 * gfortran.dg/gomp/defaultmap-5.f90: Likewise.
 * gfortran.dg/gomp/defaultmap-6.f90: Likewise.
 * gfortran.dg/gomp/map-3.f90: Likewise.
 * gfortran.dg/gomp/pr78260-2.f90: Likewise.
 * gfortran.dg/gomp/pr78260-3.f90: Likewise.

libgomp/ChangeLog:

 * testsuite/libgomp.oacc-fortran/pr90030.f90: New test.
 * testsuite/libgomp.fortran/pr90030.f90: New test.


[committed] libphobos: Don't call __gthread_key_delete in the emutls destroy function.

2021-11-19 Thread Iain Buclaw via Gcc-patches
Hi,

This patch fixes a EXC_BAD_ACCESS issue seen on Darwin when the
libphobos DSO gets unloaded.  Based on reading libgcc's emutls
implementation, as it doesn't call __gthread_key_delete directly,
neither should libphobos.

Bootstrapped and regression tested on x86_64-linux-gnu and
x86_64-apple-darwin20, committed to mainline, and backported to the
release branches.

Regards,
Iain

---
libphobos/ChangeLog:

* libdruntime/gcc/emutls.d (emutlsDestroyThread): Don't remove entry
from global array.
(_d_emutls_destroy): Don't call __gthread_key_delete.
---
 libphobos/libdruntime/gcc/emutls.d | 6 --
 1 file changed, 6 deletions(-)

diff --git a/libphobos/libdruntime/gcc/emutls.d 
b/libphobos/libdruntime/gcc/emutls.d
index 4237dc7a3df..462230508ab 100644
--- a/libphobos/libdruntime/gcc/emutls.d
+++ b/libphobos/libdruntime/gcc/emutls.d
@@ -229,9 +229,6 @@ void** emutlsAlloc(shared __emutls_object* obj) nothrow 
@nogc
 extern (C) void emutlsDestroyThread(void* ptr) nothrow @nogc
 {
 auto arr = cast(TlsArray*) ptr;
-emutlsMutex.lock_nothrow();
-emutlsArrays.remove(arr);
-emutlsMutex.unlock_nothrow();
 
 foreach (entry; *arr)
 {
@@ -308,9 +305,6 @@ void _d_emutls_scan(scope void delegate(void* pbeg, void* 
pend) nothrow cb) noth
 // Call this after druntime has been unloaded
 void _d_emutls_destroy() nothrow @nogc
 {
-if (__gthread_key_delete(emutlsKey) != 0)
-abort();
-
 (cast(Mutex) _emutlsMutex.ptr).__dtor();
 destroy(emutlsArrays);
 }
-- 
2.30.2



[committed] libphobos: Increase size of defaultStackPages on OSX X86_64 targets.

2021-11-19 Thread Iain Buclaw via Gcc-patches
Hi,

As of macOS 11, libunwind now requires more stack space than 16k, so
default to a larger stack size. This is only applied to X86 as the
PAGESIZE is still 4k, however on AArch64 it is 16k.

Regression tested on x86_64-linux-gnu and x86_64-apple-darwin20,
committed to mainline and backported to the release branches.

Regards,
Iain.

---
libphobos/ChangeLog:

* libdruntime/core/thread/fiber.d (defaultStackPages): Increase size
on OSX X86_64 targets.
---
 libphobos/libdruntime/core/thread/fiber.d | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/libphobos/libdruntime/core/thread/fiber.d 
b/libphobos/libdruntime/core/thread/fiber.d
index f4c04ce7358..2f90f179edb 100644
--- a/libphobos/libdruntime/core/thread/fiber.d
+++ b/libphobos/libdruntime/core/thread/fiber.d
@@ -595,6 +595,16 @@ class Fiber
 // the existence of debug symbols and other conditions. Avoid causing
 // stack overflows by defaulting to a larger stack size
 enum defaultStackPages = 8;
+else version (OSX)
+{
+version (X86_64)
+// libunwind on macOS 11 now requires more stack space than 16k, so
+// default to a larger stack size. This is only applied to X86 as
+// the PAGESIZE is still 4k, however on AArch64 it is 16k.
+enum defaultStackPages = 8;
+else
+enum defaultStackPages = 4;
+}
 else
 enum defaultStackPages = 4;
 
-- 
2.30.2



Re: [PATCH] Remove MAY_HAVE_DEBUG_MARKER_STMTS and MAY_HAVE_DEBUG_BIND_STMTS.

2021-11-19 Thread Richard Biener via Gcc-patches
On Fri, Nov 19, 2021 at 2:22 PM Martin Liška  wrote:
>
> On 11/19/21 09:59, Alexandre Oliva wrote:
> > So I find the abstraction useful.  However, I don't have plans to add
> > other kinds of debug stmts, and I don't know of anyone else who does, so
> > I won't stand in the way if others think removing these abstractions is
> > a positive change.
>
> Hello.
>
> I've already installed the patch based on Jeff's approval (and waiting for 24 
> hours)
> as 206b22d021d94adbaa79e1d443c87415254b15de.

Can you please revert it?

> Martin


Re: [PR102988] harden cond: detach without decls

2021-11-19 Thread Richard Biener via Gcc-patches
On Fri, Nov 19, 2021 at 2:38 PM Alexandre Oliva via Gcc-patches
 wrote:
>
>
> When we create copies of SSA_NAMEs to hold "detached" copies of the
> values for the hardening tests, we end up with assignments to
> SSA_NAMEs that refer to the same decls.  That would be generally
> desirable, since it enables the variable to be recognized in dumps,
> and makes coalescing more likely if the original variable dies at that
> point.  When the decl is a DECL_BY_REFERENCE, the SSA_NAME holds the
> address of a parm or result, and it's read-only, so we shouldn't
> create assignments to it.  Gimple checkers flag at least the case of
> results.
>
> This patch arranges for us to avoid referencing the same decls, which
> cures the problem, but retaining the visible association between the
> SSA_NAMEs, by using the same identifier for the copy.
>
> Regstrapped on x86_64-linux-gnu, also bootstrapped with both
> -fharden-compares and -fharden-conditional-branches enabled by default.
> Ok to install?

OK.

>
> for  gcc/ChangeLog
>
> PR tree-optimization/102988
> * gimple-harden-conditionals.cc (detach_value): Copy SSA_NAME
> without decl sharing.
>
> for  gcc/testsuite/ChangeLog
>
> PR tree-optimization/102988
> * g++.dg/pr102988.C: New.
> ---
>  gcc/gimple-harden-conditionals.cc |9 -
>  gcc/testsuite/g++.dg/pr102988.C   |   17 +
>  2 files changed, 25 insertions(+), 1 deletion(-)
>  create mode 100644 gcc/testsuite/g++.dg/pr102988.C
>
> diff --git a/gcc/gimple-harden-conditionals.cc 
> b/gcc/gimple-harden-conditionals.cc
> index 8916420d7dfe9..cfa2361d65be0 100644
> --- a/gcc/gimple-harden-conditionals.cc
> +++ b/gcc/gimple-harden-conditionals.cc
> @@ -123,7 +123,14 @@ detach_value (location_t loc, gimple_stmt_iterator 
> *gsip, tree val)
>return val;
>  }
>
> -  tree ret = copy_ssa_name (val);
> +  /* Create a SSA "copy" of VAL.  This could be an anonymous
> + temporary, but it's nice to have it named after the corresponding
> + variable.  Alas, when VAL is a DECL_BY_REFERENCE RESULT_DECL,
> + setting (a copy of) it would be flagged by checking, so we don't
> + use copy_ssa_name: we create an anonymous SSA name, and then give
> + it the same identifier (rather than decl) as VAL.  */
> +  tree ret = make_ssa_name (TREE_TYPE (val));
> +  SET_SSA_NAME_VAR_OR_IDENTIFIER (ret, SSA_NAME_IDENTIFIER (val));
>
>/* Output asm ("" : "=g" (ret) : "0" (val));  */
>vec *inputs = NULL;
> diff --git a/gcc/testsuite/g++.dg/pr102988.C b/gcc/testsuite/g++.dg/pr102988.C
> new file mode 100644
> index 0..05a1a8f67ed9b
> --- /dev/null
> +++ b/gcc/testsuite/g++.dg/pr102988.C
> @@ -0,0 +1,17 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O -fharden-conditional-branches -fchecking=1" } */
> +
> +/* DECL_BY_REFERENCE RESULT_DECL is read-only, we can't create a copy
> +   of its (address) default def and set it.  */
> +
> +void ll();
> +struct k {
> +  ~k();
> +};
> +k ice(k *a)
> +{
> +  k v;
> +  if (&v!= a)
> +ll();
> +  return v;
> +}
>
>
> --
> Alexandre Oliva, happy hackerhttps://FSFLA.org/blogs/lxo/
>Free Software Activist   GNU Toolchain Engineer
> Disinformation flourishes because many people care deeply about injustice
> but very few check the facts.  Ask me about 


Re: [PATCH v3] Do not abort compilation when dump file is /dev/*

2021-11-19 Thread Richard Biener via Gcc-patches
On Fri, 19 Nov 2021, Giuliano Belinassi wrote:

> The `configure` scripts generated with autoconf often tests compiler
> features by setting output to `/dev/null`, which then sets the dump
> folder as being /dev/* and the compilation halts with an error because
> GCC cannot create files in /dev/. This is a problem when configure is
> testing for compiler features because it cannot tell if the failure was
> due to unsupported features or any other problem, and disable it even
> if it is working.
> 
> As an example, running configure overriding CFLAGS="-fdump-ipa-clones"
> will result in several compiler-features as being disabled because of
> gcc halting with an error creating files in /dev/*.
> 
> This commit fixes this issue by checking if the output file is
> /dev/null or /dev/zero. In this case we use the current working
> directory for dump output instead of the directory of the output
> file because we cannot write to /dev/*.
> 
> gcc/ChangeLog
> 2021-11-16  Giuliano Belinassi  
> 
>   * gcc.c (process_command): Skip dumpdir override if file is a
>   not_actual_file_p.
>   (not_actual_file_p): Return true if file is /dev/zero as well.
>   * doc/invoke.texi: Update -dumpdir documentation.
> 
> gcc/testsuite/ChangeLog
> 2021-11-16  Giuliano Belinassi  
> 
>   * gcc.dg/devnull-dump.c: New.
> 
> Signed-off-by: Giuliano Belinassi 
> ---
>  gcc/doc/invoke.texi | 3 ++-
>  gcc/gcc.c   | 6 --
>  gcc/testsuite/gcc.dg/devnull-dump.c | 7 +++
>  3 files changed, 13 insertions(+), 3 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.dg/devnull-dump.c
> 
> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index 6070288856c..4a17cd4d317 100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -1877,7 +1877,8 @@ named @file{dir/bar.*}, combining the given 
> @var{dumppfx} with the
>  default @var{dumpbase} derived from the primary output name.  Dump
>  outputs also take the input name suffix: @file{dir/bar.c.*}.
>  
> -It defaults to the location of the output file; options
> +It defaults to the location of the output file, unless the output
> +file is a special file like @code{/dev/null}. Options
>  @option{-save-temps=cwd} and @option{-save-temps=obj} override this
>  default, just like an explicit @option{-dumpdir} option.  In case
>  multiple such options are given, the last one prevails:
> diff --git a/gcc/gcc.c b/gcc/gcc.c
> index 506c2acc282..43d7cde1be9 100644
> --- a/gcc/gcc.c
> +++ b/gcc/gcc.c
> @@ -5098,7 +5098,8 @@ process_command (unsigned int decoded_options_count,
>  
>bool explicit_dumpdir = dumpdir;
>  
> -  if (!save_temps_overrides_dumpdir && explicit_dumpdir)
> +  if ((!save_temps_overrides_dumpdir && explicit_dumpdir)
> +  || (output_file && not_actual_file_p (output_file)))
>  {
>/* Do nothing.  */
>  }
> @@ -10716,7 +10717,8 @@ static bool
>  not_actual_file_p (const char *name)
>  {
>return (strcmp (name, "-") == 0
> -   || strcmp (name, HOST_BIT_BUCKET) == 0);
> +   || strcmp (name, HOST_BIT_BUCKET) == 0
> +   || strcmp (name, "/dev/zero") == 0);
>  }

OK when you omit the change to include /dev/zero in not_actual_file_p.

Thanks,
Richard.

>  /* %:dumps spec function.  Take an optional argument that overrides
> diff --git a/gcc/testsuite/gcc.dg/devnull-dump.c 
> b/gcc/testsuite/gcc.dg/devnull-dump.c
> new file mode 100644
> index 000..378e0901c28
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/devnull-dump.c
> @@ -0,0 +1,7 @@
> +/* { dg-do assemble } */
> +/* { dg-options "-fdump-ipa-clones -o /dev/null" } */
> +
> +int main()
> +{
> +  return 0;
> +}
> 


Re: [PATCH] Remove MAY_HAVE_DEBUG_MARKER_STMTS and MAY_HAVE_DEBUG_BIND_STMTS.

2021-11-19 Thread Martin Liška

On 11/19/21 15:06, Richard Biener wrote:

Can you please revert it?


Sure, done as 79e9f721d1a6f370ce0534745baeeb5a56da948e.

Martin


Re: [PATCH] PR tree-optimization/103254 - Limit depth for all GORI expressions.

2021-11-19 Thread Andrew MacLeod via Gcc-patches

On 11/19/21 05:15, Aldy Hernandez wrote:



On 11/18/21 8:28 PM, Andrew MacLeod wrote:


@@ -376,6 +366,14 @@ range_def_chain::get_def_chain (tree name)
   return NULL;
 }

+  // Terminate the def chains if we see too many cascading stmts.
+  if (m_logical_depth == param_ranger_logical_depth)
+    return NULL;
+
+  // Increase the depth if we have a pair of ssa-names.
+  if (ssa1 && ssa2)
+    m_logical_depth++;
+


Not sure it matters, since this is an internal knob, but if the 
--param now applies to all statements, perhaps we should remove the 
"logical" reference from the knob.


Aldy

I considered that, but recall the last time I changed a name in my 
--param it affected some LTO reading thing invalidating existing objects 
due to a parameter mismatch I think... So I didn't see the need :-)   So 
instead I now interpret it to mean "Logical analysis depth"  or 
something :-) If no one cares about that, we could drop the logical 
part of it.


Andrew





Re: [PATCH v3] Do not abort compilation when dump file is /dev/*

2021-11-19 Thread Bernhard Reutner-Fischer via Gcc-patches
On Fri, 19 Nov 2021 15:12:35 +0100 (CET)
Richard Biener via Gcc-patches  wrote:

> On Fri, 19 Nov 2021, Giuliano Belinassi wrote:

> > -It defaults to the location of the output file; options
> > +It defaults to the location of the output file, unless the output
> > +file is a special file like @code{/dev/null}. Options

s/a special file like //

I'd say.
thanks,
> >  @option{-save-temps=cwd} and @option{-save-temps=obj} override this
> >  default, just like an explicit @option{-dumpdir} option.  In case
> >  multiple such options are given, the last one prevails:
> > diff --git a/gcc/gcc.c b/gcc/gcc.c
> > index 506c2acc282..43d7cde1be9 100644
> > --- a/gcc/gcc.c
> > +++ b/gcc/gcc.c
> > @@ -5098,7 +5098,8 @@ process_command (unsigned int decoded_options_count,
> >  
> >bool explicit_dumpdir = dumpdir;
> >  
> > -  if (!save_temps_overrides_dumpdir && explicit_dumpdir)
> > +  if ((!save_temps_overrides_dumpdir && explicit_dumpdir)
> > +  || (output_file && not_actual_file_p (output_file)))
> >  {
> >/* Do nothing.  */
> >  }
> > @@ -10716,7 +10717,8 @@ static bool
> >  not_actual_file_p (const char *name)
> >  {
> >return (strcmp (name, "-") == 0
> > - || strcmp (name, HOST_BIT_BUCKET) == 0);
> > + || strcmp (name, HOST_BIT_BUCKET) == 0
> > + || strcmp (name, "/dev/zero") == 0);
> >  }  
> 
> OK when you omit the change to include /dev/zero in not_actual_file_p.


[PATCH 0/3] Add zero cycle move support

2021-11-19 Thread Michael Meissner via Gcc-patches
The next set of 3 patches add zero cycle move support to the Power10.  Zero
cycle moves are where the move to LR/CTR/TAR register that is adjacent to the
jump to LR/CTR/TAR register can be fused together.

At the moment, these set of three patches add support for zero cycle moves for
indirect jumps and switch tables using the CTR register.  Potential zero cycle
moves for doing returns are not currently handled.

In looking at the code, I discovered that just using zero cycle moves isn't as
helpful unless we can eliminate the add instruction before doing the jump.  I
also noticed that the various power10 fusion options are only done if
-mcpu=power10.  I added a patch to do the fusion for -mtune=power10 as well.

I have done bootstraps and make check with these patches installed on both
little endian power9 and little endian power10 systems.  Can I install these
patches?

The following patches will be posted:

1) Patch to add zero cycle move for indirect jumps and switches.

2) Patch to enable p10 fusion for -mtune=power10 in addition to -mcpu=power10.

3) Patch to use absolute addresses for switch tables instead of relative
   addresses if zero cycle fusion is enabled.

-- 
Michael Meissner, IBM
PO Box 98, Ayer, Massachusetts, USA, 01432
email: meiss...@linux.ibm.com


[PATCH 1/3] Add power10 zero cycle moves for switches & indirect jumps

2021-11-19 Thread Michael Meissner via Gcc-patches
Add power10 zero cycle moves for switches.

Power10 will fuse adjacenet 'mtctr' and 'bctr' instructions to form zero
cycle moves.  This code exploits this fusion opportunity.

I have built bootstrapped compilers with this patch on little endian power9 and
power10 systems with no regressions.  Can I install this into the master
branch?

2021-11-19  Michael Meissner  

* config/rs6000/rs6000-cpus.def (ISA_3_1_MASKS_SERVER): Add
support for -mpower10-fusion-zero-cycle.
(POWERPC_MASKS): Likewise.
* config/rs6000/rs6000.c (rs6000_option_override_internal):
Likewise.
* config/rs6000/rs6000.md (indirect_jump): Support zero cycle
moves.
(indirect_jump_zero_cycle): New insns.
(tablejump_normal): Likewise.
(tablejump_absolute): Likewise.
(tablejump_insn_zero_cycle): New insn.
* config/rs6000/rs6000.opt (-mpower10-fusion-zero-cycle): New
debug switch.
---
 gcc/config/rs6000/rs6000-cpus.def |  4 ++-
 gcc/config/rs6000/rs6000.c|  4 +++
 gcc/config/rs6000/rs6000.md   | 52 ---
 gcc/config/rs6000/rs6000.opt  |  4 +++
 4 files changed, 59 insertions(+), 5 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-cpus.def 
b/gcc/config/rs6000/rs6000-cpus.def
index f5812da0184..cc072ee94ea 100644
--- a/gcc/config/rs6000/rs6000-cpus.def
+++ b/gcc/config/rs6000/rs6000-cpus.def
@@ -91,7 +91,8 @@
 | OPTION_MASK_P10_FUSION_LOGADD\
 | OPTION_MASK_P10_FUSION_ADDLOG\
 | OPTION_MASK_P10_FUSION_2ADD  \
-| OPTION_MASK_P10_FUSION_2STORE)
+| OPTION_MASK_P10_FUSION_2STORE\
+| OPTION_MASK_P10_FUSION_ZERO_CYCLE)
 
 /* Flags that need to be turned off if -mno-power9-vector.  */
 #define OTHER_P9_VECTOR_MASKS  (OPTION_MASK_FLOAT128_HW\
@@ -145,6 +146,7 @@
 | OPTION_MASK_P10_FUSION_ADDLOG\
 | OPTION_MASK_P10_FUSION_2ADD  \
 | OPTION_MASK_P10_FUSION_2STORE\
+| OPTION_MASK_P10_FUSION_ZERO_CYCLE\
 | OPTION_MASK_HTM  \
 | OPTION_MASK_ISEL \
 | OPTION_MASK_MFCRF\
diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index e4843eb0f1c..6780304a5eb 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -4497,6 +4497,10 @@ rs6000_option_override_internal (bool global_init_p)
   && (rs6000_isa_flags_explicit & OPTION_MASK_P10_FUSION_2STORE) == 0)
 rs6000_isa_flags |= OPTION_MASK_P10_FUSION_2STORE;
 
+  if (TARGET_POWER10
+  && (rs6000_isa_flags_explicit & OPTION_MASK_P10_FUSION_ZERO_CYCLE) == 0)
+rs6000_isa_flags |= OPTION_MASK_P10_FUSION_ZERO_CYCLE;
+
   /* Turn off vector pair/mma options on non-power10 systems.  */
   else if (!TARGET_POWER10 && TARGET_MMA)
 {
diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index 6bec2bddbde..ea41eb4ada3 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -12988,15 +12988,34 @@ (define_expand "indirect_jump"
 emit_jump_insn (gen_indirect_jump_nospec (Pmode, operands[0], ccreg));
 DONE;
   }
+  if (TARGET_P10_FUSION && TARGET_P10_FUSION_ZERO_CYCLE)
+{
+  emit_jump_insn (gen_indirect_jump_zero_cycle (Pmode, operands[0]));
+  DONE;
+}
 })
 
 (define_insn "*indirect_jump"
   [(set (pc)
(match_operand:P 0 "register_operand" "c,*l"))]
-  "rs6000_speculate_indirect_jumps"
+  "rs6000_speculate_indirect_jumps
+   && !(TARGET_P10_FUSION && TARGET_P10_FUSION_ZERO_CYCLE)"
   "b%T0"
   [(set_attr "type" "jmpreg")])
 
+(define_insn "@indirect_jump_zero_cycle"
+  [(set (pc)
+   (match_operand:P 0 "register_operand" "r,r,!cl"))
+   (clobber (match_scratch:P 1 "=c,*l,X"))]
+  "rs6000_speculate_indirect_jumps && TARGET_P10_FUSION
+   && TARGET_P10_FUSION_ZERO_CYCLE"
+  "@
+   mt%T1 %0\;b%T1
+   mt%T1 %0\;b%T1
+   b%T0"
+  [(set_attr "type" "jmpreg")
+   (set_attr "length" "8,8,4")])
+
 (define_insn "@indirect_jump_nospec"
   [(set (pc) (match_operand:P 0 "register_operand" "c,*l"))
(clobber (match_operand:CC 1 "cc_reg_operand" "=y,y"))]
@@ -13050,7 +13069,11 @@ (define_expand "@tablejump_normal"
   rtx addr = gen_reg_rtx (Pmode);
 
   emit_insn (gen_add3 (addr, off, lab));
-  emit_jump_insn (gen_tablejump_insn_normal (Pmode, addr, operands[1]));
+  rtx insn = (TARGET_P10_FUSION && TARGET_P10_FUSION_ZERO_CYCLE
+ ? gen_tablejump_insn_zero_cycle (Pmode, addr, operands[1])
+ : gen_tablejump_insn_normal (Pmode, addr, operands[1]));
+
+  emit_jump_insn (insn);
   DONE;
 })
 
@@ -13062,7 +13085

[PATCH 2/3] Set power10 fusion if -mtune=power10.

2021-11-19 Thread Michael Meissner via Gcc-patches
Set power10 fusion if -mtune=power10.

In doing the patch for zero cycle moves for switch statements and indirect
jumps, I noticed the fusion support is only done if -mcpu=power10.  This option
enables power10 fusion if we use -mtune=power10.

I have built and run the testsuites on little endian power9 and power10 systems
with no regressions.  Can I install this patch?

2021-11-19  Michael Meissner  

* config/rs6000/rs6000.c (rs6000_option_override_internal): Enable
power10 fusion if -mtune=power10.
(rs6000_opt_masks): Add power10 fusion options.
---
 gcc/config/rs6000/rs6000.c | 25 +
 1 file changed, 17 insertions(+), 8 deletions(-)

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 6780304a5eb..8531cef0337 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -4469,35 +4469,36 @@ rs6000_option_override_internal (bool global_init_p)
   if (TARGET_POWER10 && (rs6000_isa_flags_explicit & OPTION_MASK_MMA) == 0)
 rs6000_isa_flags |= OPTION_MASK_MMA;
 
-  if (TARGET_POWER10
+  /* Enable power10 tuning if either -mcpu=power10 or -mtune=power10.  */
+  if ((TARGET_POWER10 || rs6000_tune == PROCESSOR_POWER10)
   && (rs6000_isa_flags_explicit & OPTION_MASK_P10_FUSION) == 0)
 rs6000_isa_flags |= OPTION_MASK_P10_FUSION;
 
-  if (TARGET_POWER10 &&
+  if (TARGET_P10_FUSION &&
   (rs6000_isa_flags_explicit & OPTION_MASK_P10_FUSION_LD_CMPI) == 0)
 rs6000_isa_flags |= OPTION_MASK_P10_FUSION_LD_CMPI;
 
-  if (TARGET_POWER10
+  if (TARGET_P10_FUSION
   && (rs6000_isa_flags_explicit & OPTION_MASK_P10_FUSION_2LOGICAL) == 0)
 rs6000_isa_flags |= OPTION_MASK_P10_FUSION_2LOGICAL;
 
-  if (TARGET_POWER10
+  if (TARGET_P10_FUSION
   && (rs6000_isa_flags_explicit & OPTION_MASK_P10_FUSION_LOGADD) == 0)
 rs6000_isa_flags |= OPTION_MASK_P10_FUSION_LOGADD;
 
-  if (TARGET_POWER10
+  if (TARGET_P10_FUSION
   && (rs6000_isa_flags_explicit & OPTION_MASK_P10_FUSION_ADDLOG) == 0)
 rs6000_isa_flags |= OPTION_MASK_P10_FUSION_ADDLOG;
 
-  if (TARGET_POWER10
+  if (TARGET_P10_FUSION
   && (rs6000_isa_flags_explicit & OPTION_MASK_P10_FUSION_2ADD) == 0)
 rs6000_isa_flags |= OPTION_MASK_P10_FUSION_2ADD;
 
-  if (TARGET_POWER10
+  if (TARGET_P10_FUSION
   && (rs6000_isa_flags_explicit & OPTION_MASK_P10_FUSION_2STORE) == 0)
 rs6000_isa_flags |= OPTION_MASK_P10_FUSION_2STORE;
 
-  if (TARGET_POWER10
+  if (TARGET_P10_FUSION
   && (rs6000_isa_flags_explicit & OPTION_MASK_P10_FUSION_ZERO_CYCLE) == 0)
 rs6000_isa_flags |= OPTION_MASK_P10_FUSION_ZERO_CYCLE;
 
@@ -24292,6 +24293,14 @@ static struct rs6000_opt_mask const rs6000_opt_masks[] 
=
   { "power9-misc", OPTION_MASK_P9_MISC,false, true  },
   { "power9-vector",   OPTION_MASK_P9_VECTOR,  false, true  },
   { "power10-fusion",  OPTION_MASK_P10_FUSION, false, true  },
+  { "power10-fusion-ld-cmpi",  OPTION_MASK_P10_FUSION_LD_CMPI, false, true  },
+  { "power10-fusion-2logical", OPTION_MASK_P10_FUSION_2LOGICAL,false, true  },
+  { "power10-fusion-logical-add", OPTION_MASK_P10_FUSION_LOGADD,false, true  },
+  { "power10-fusion-add-logical", OPTION_MASK_P10_FUSION_ADDLOG,false, true  },
+  { "power10-fusion-2add", OPTION_MASK_P10_FUSION_2ADD,false, true  },
+  { "power10-fusion-2store",   OPTION_MASK_P10_FUSION_2STORE,  false, true  },
+  { "power10-fusion-zero-cycle", OPTION_MASK_P10_FUSION_ZERO_CYCLE,
+   false, true  },
   { "powerpc-gfxopt",  OPTION_MASK_PPC_GFXOPT, false, true  },
   { "powerpc-gpopt",   OPTION_MASK_PPC_GPOPT,  false, true  },
   { "prefixed",OPTION_MASK_PREFIXED,   false, 
true  },
-- 
2.31.1


-- 
Michael Meissner, IBM
PO Box 98, Ayer, Massachusetts, USA, 01432
email: meiss...@linux.ibm.com


[PATCH 3/3] Use absolute switch table addresses for zero cycle moves.

2021-11-19 Thread Michael Meissner via Gcc-patches
se absolute switch table addresses for zero cycle moves.

This option enables using absolute addresses in switch tables if the
power10 zero cycle move tuning is turned on.  The combination of using
absolute addresses in switch tables along with zero cycle moves seems to
give the best performance.

I have built and run bootstrapped compilers on little endian power9 and power10
systems.  There were no regressions.  Can I install this patch?

2021-11-19  Michael Meissner  

* config/rs6000/rs6000.c (rs6000_option_override_internal): Use
absolute addresses in switch tables if power10 zero cycle move
fusion is enabled.
---
 gcc/config/rs6000/rs6000.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 8531cef0337..dc942765828 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -4502,6 +4502,12 @@ rs6000_option_override_internal (bool global_init_p)
   && (rs6000_isa_flags_explicit & OPTION_MASK_P10_FUSION_ZERO_CYCLE) == 0)
 rs6000_isa_flags |= OPTION_MASK_P10_FUSION_ZERO_CYCLE;
 
+  /* If we enable zero cycle move fusion, also switch to absolute addresses in
+ switch tables.  */
+  if (TARGET_P10_FUSION && TARGET_P10_FUSION_ZERO_CYCLE
+  && !global_options_set.x_rs6000_relative_jumptables)
+rs6000_relative_jumptables = 0;
+
   /* Turn off vector pair/mma options on non-power10 systems.  */
   else if (!TARGET_POWER10 && TARGET_MMA)
 {
-- 
2.31.1


-- 
Michael Meissner, IBM
PO Box 98, Ayer, Massachusetts, USA, 01432
email: meiss...@linux.ibm.com


Re: [PATCH] PR tree-optimization/103254 - Limit depth for all GORI expressions.

2021-11-19 Thread Andrew MacLeod via Gcc-patches

On 11/19/21 04:21, Richard Biener wrote:

On Thu, Nov 18, 2021 at 8:30 PM Andrew MacLeod via Gcc-patches
 wrote:

At issue here is the dynamic approach we currently use for outgoing edge
calculations.  It isn't normally a problem, but once you get a very
large number of possible outgoing values (ie very long unrolled blocks)
with pairs of values on a statement, and individual queries for each
one, it becomes exponential if they relate to each other.

So.  We previously introduced a param for PR 100081 which limited the
depth of logical expressions GORI would pursue because we always make 2
or 4 queries for each logical expression, which accelerated the
exponential behaviour much quicker.

This patch reuses that to apply to any statement which has 2 ssa-names,
as the potential for 2 queries can happen then as well..  leading to
exponential behaviour.  Each statement we step back through with
multiple ssa-names decreases the odds of calculating a useful outgoing
range anyway.  This will remove ridiculous behaviour like this PR
demonstrates.

Next release I plan to revamp GORI to cache outgoing values, which will
eliminate all this sort of behaviour, and we can remove all such
restrictions.

Bootstrapped on x86_64-pc-linux-gnu with no regressions.  OK for trunk?

I think it's reasonable.  SCEV tries to limit the overall number of SSA -> def
visits, capturing deep vs. wide in a more uniform (and single) way.
(--param scev-max-expr-size and the duplicate - huh - --param
scev-max-expr-complexity).

For SCEV running into the limit means fatal fail of the analysis, for ranger
it only means less refinement?  So it might make a difference which path
you visit and which not (just thinking about reproducability of range analysis
when we say, swap operands of a stmt)?

Yes, Its means less refinement for "deeper" dependencies in the block..  
It will not be affected by operand swapping because we will either look 
at dependencies for both operands of a stmt, or neither.   The path you 
visit shouldn't make any difference. Note this has nothing to do with 
generating ranges for those statements, this is only for refining 
outgoing ranges created by the condition at the bottom of the block.


To be precise, we will stop looking for possible range changes once our 
dependency search, which always starts from the bottom of the block, 
encounters 6 stmts with multiple SSA operands. Statements with a single 
SSA operand will continue to build the dependency chains.


so for instance, following just the chain for first operands of this 
somewhat truncated listing:

 <...>
  _137 = (short int) j_131;
  _138 = _129 & _137;
  _139 = (int) _138;
  j_140 = j_131 & _139;    << depth 6 ... we don't go look at j_131 
or _139 for further dependencies

  _146 = (short int) j_140;
  _147 = _138 & _146;
  _148 = (int) _147;
  j_149 = j_140 & _148;   << depth 5
  _155 = (short int) j_149;
  _156 = _147 & _155;
  _157 = (int) _156;
  j_158 = j_149 & _157;   << depth 4
  _164 = (short int) j_158;
  _165 = _156 & _164;
  _166 = (int) _165;
  j_167 = j_158 & _166;  <<-- depth 3
  _1 = (short int) j_167;
  _3 = _1 & _165;    <<-depth 2
  _5 = (int) _3;
  j_22 = _5 & j_167; <<-- depth 1
  j_20 = j_22 + 4;
  if (j_20 == 4)
    goto ;

We'll generate dependencies, and thus able to generate outgoing ranges 
for all these ssa-names still until we hit a dependency depth of 6 (the 
current default), which for one chain comes to an end at


j_140 = j_131 & _139

  We will be able to generate outgoing ranges for j_131 and _139 on the 
edges, but we will not try for any ssa_names used to define those.. ie, 
_138 might not be able to have outgoing ranges generated for it in this 
block.


working the j_20 : [4,4] range on the true edge back thru those 
expressions, once we get that deep the odds of finding something of 
value is remote :-)


We will also only generate these outgoing ranges when they are asked 
for.  Many of these names are not used outside the block (especially 
back the deeper you go), and so never get asked for anyway...  but you 
could :-)


Anyway, this limit in no way introduces any kind of ordering variation..

Andrew

PS for amusement and information overload...  it turns out in this hunk 
of code we end up knowing that globally the range of most of these names 
are [0, 4], and the actual ranges we COULD generate if asked:


3->5  (T) _1 :  short int [0, 4]
3->5  (T) _3 :  short int [0, 4]
3->5  (T) _5 :  int [0, 4]
3->5  (T) j_20 :    int [4, 4]
3->5  (T) j_22 :    int [0, 0]
3->5  (T) _120 :    short int [0, 4]
3->5  (T) _121 :    int [0, 4]
3->5  (T) _128 :    short int [0, 4]
3->5  (T) _129 :    short int [0, 4]
3->5  (T) _130 :    int [0, 4]
3->5  (T) j_131 :   int [0, 4]
3->5  (T) _137 :    short int [0, 4]
3->5  (T) _138 :    short int [0, 4]
3->5  (T) _139 :    int [0, 4]
3->5  (T) j_140 :   int [0, 4]
3->5  (T) _146 :    short int [0, 4]
3->5  (T) _14

PING [PATCH v2] implement -Winfinite-recursion [PR88232]

2021-11-19 Thread Martin Sebor via Gcc-patches

Ping:
https://gcc.gnu.org/pipermail/gcc-patches/2021-November/584205.html

On 11/11/21 2:46 PM, Martin Sebor wrote:

Attached is a v2 of the solution I posted earlier this week
with a few tweaks made after a more careful consideration of
the problem and possible false negatives and positives.

1) It avoids warning for [apparently infinitely recursive] calls
    in noreturn functions where the recursion may be prevented by
    a call to a noreturn function.
2) It avoids warning for calls where the recursion may be prevented
    by a call to a longjmp or siglongjmp.
3) It warns for recursive calls to built-ins in definitions of
    the corresponding library functions (e.g., for a call to
    __builtin_malloc in malloc).
4) It warns for calls to C++ functions even if they call other
    functions that might throw and so break out of the infinite
    recursion.  (E.g., operator new.)  This is the same as Clang.
5) It doesn't warn for calls to C++ functions with the throw
    expression.

Besides these changes to the warning itself, I've also improved
the code a bit by making the workhorse function a member of
the pass so recursive calls don't need to pass as many arguments
to itself.

Retested on x86_64-linux and by building Glibc and Binutils/GDB.

A possible enhancement is to warn for calls to calloc, malloc,
or realloc from within the definition of one of the other two
functions.  That might be a mistake made in code that tries
naively to replace the allocator with its own implementation.

On 11/9/21 9:28 PM, Martin Sebor wrote:

The attached patch adds support to the middle end for detecting
infinitely recursive calls.  The warning is controlled by the new
-Winfinite-recursion option.  The option name is the same as
Clang's.

I scheduled the warning pass to run after early inlining to
detect mutually recursive calls but before tail recursion which
turns some recursive calls into infinite loops and so makes
the two indistinguishable.

The warning detects a superset of problems detected by Clang
(based on its tests).  It detects the problem in PR88232
(the feature request) as well as the one in PR 87742,
an unrelated problem report that was root-caused to bug due
to infinite recursion.

This initial version doesn't attempt to deal with superimposed
symbols, so those might trigger false positives.  I'm not sure
that's something to worry about.

The tests are very light, but those for the exceptional cases
are exceedingly superficial, so it's possible those might harbor
some false positives and negatives.

Tested on x86_64-linux.

Martin







Re: [PATCH v2] c++: Implement -Wuninitialized for mem-initializers (redux) [PR19808]

2021-11-19 Thread Martin Sebor via Gcc-patches

On 11/18/21 6:56 PM, Jason Merrill via Gcc-patches wrote:

On 11/18/21 19:24, Marek Polacek wrote:

On Thu, Nov 18, 2021 at 05:10:47PM -0500, Jason Merrill wrote:

On 11/8/21 18:41, Marek Polacek wrote:

@@ -1311,13 +1462,25 @@ emit_mem_initializers (tree mem_inits)
 if (!COMPLETE_TYPE_P (current_class_type))
   return;
+  /* Keep a set holding fields that are not initialized.  */
+  hash_set uninitialized;
+
+  /* Initially that is all of them.  */
+  if (warn_uninitialized)
+    for (tree f = next_initializable_field (TYPE_FIELDS 
(current_class_type));

+ f != NULL_TREE;
+ f = next_initializable_field (DECL_CHAIN (f)))
+  if (!DECL_ARTIFICIAL (f))
+    uninitialized.add (f);


I wonder about flipping the sense of the set, so that it tracks 
fields that
have been initialized rather than those that haven't; then you 
wouldn't need

this loop.


True, but then I'd have to figure out a new way to signal that we 
don't want
to warn about the current member-initializer-list.  What I mean by 
that is

that when I see e.g. a MODIFY_EXPR or something else with side-effects in
a mem-init, I can just empty the set:

 case MODIFY_EXPR:
 /* Don't attempt to handle statement-expressions, either.  */
 case STATEMENT_LIST:
   uninitialized->empty ();

and then we won't even bother walking the other mem-inits, because
find_uninit_fields has:

   if (!uninitialized->is_empty ())
 {
   // walk_tree ()
 }

and I thought that was pretty elegant.  Of course, I could just add a new
bool member into find_uninit_data and then do what you suggest...  Up to
you, I'm happy to do that too.


No need, the patch is OK as is.


Awesome!  Thank you both and especially Marek for finally making
it happen (after no less than 15 years)! :)

Martin



Jason





Re: [PATCH] c++: Avoid adding implicit attributes during apply_late_template_attributes [PR101180]

2021-11-19 Thread Jason Merrill via Gcc-patches

On 11/19/21 05:06, Jakub Jelinek wrote:

Hi!

decl_attributes and its caller cplus_decl_attributes sometimes add
implicit attributes, e.g. optimize attribute if #pragma GCC optimize
is active, target attribute if #pragma GCC target is active, or
e.g. omp declare target attribute if in between #pragma omp declare target
and #pragma omp end declare target.

For templates that seems highly undesirable to me though, they should
get those implicit attributes from the spot the templates were parsed
(and they do get that), then tsubst through copy_node copies those
attributes, but then apply_late_template_attributes can or does add
a new set from the spot where they are instantiated, which can be pretty
random point of first use of the template.

Consider e.g.
#pragma GCC push_options
#pragma GCC target "avx"
template 
inline void foo ()
{
}
#pragma GCC pop_options
#pragma GCC push_options
#pragma GCC target "crc32"
void
bar ()
{
   foo<0> ();
}
#pragma GCC pop_options
testcase where the intention is that foo has avx target attribute
and bar has crc32 target attribute, but we end up with
__attribute__((target ("crc32"), target ("avx")))
on foo<0> (and due to yet another bug actually don't enable avx
in foo<0>).  In this particular case it is a regression caused
by r12-299-ga0fdff3cf33f7284 which apparently calls
cplus_decl_attributes even if attributes != NULL but late_attrs
is NULL, before those changes we didn't call it in those cases.
But, if there is at least one unrelated dependent attribute this
would happen already in older releases.

The following patch fixes that by temporarily overriding the variables
that control the addition of the implicit attributes.


OK.


Shall we also change the function so that it doesn't call
cplus_decl_attributes if late_attrs is NULL [...]?


Please.


Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2021-11-19  Jakub Jelinek  

PR c++/101180
* pt.c (apply_late_template_attributes): Temporarily override
current_optimize_pragma, optimization_current_node,
current_target_pragma and scope_chain->omp_declare_target_attribute,
so that cplus_decl_attributes doesn't add implicit attributes.

* g++.target/i386/pr101180.C: New test.

--- gcc/cp/pt.c.jj  2021-11-18 12:33:22.025628027 +0100
+++ gcc/cp/pt.c 2021-11-18 19:08:52.800539490 +0100
@@ -11722,6 +11722,17 @@ apply_late_template_attributes (tree *de
q = &TREE_CHAIN (*q);
  }
  
+  /* cplus_decl_attributes can add some attributes implicitly.  For templates,

+ those attributes should have been added already when those templates were
+ parsed, and shouldn't be added based on from which context they are
+ first time instantiated.  */
+  auto o1 = make_temp_override (current_optimize_pragma, NULL_TREE);
+  auto o2 = make_temp_override (optimization_current_node,
+   optimization_default_node);
+  auto o3 = make_temp_override (current_target_pragma, NULL_TREE);
+  auto o4 = make_temp_override (scope_chain->omp_declare_target_attribute,
+   NULL);
+
cplus_decl_attributes (decl_p, late_attrs, attr_flags);
  
return true;

--- gcc/testsuite/g++.target/i386/pr101180.C.jj 2021-11-18 19:11:50.512009888 
+0100
+++ gcc/testsuite/g++.target/i386/pr101180.C2021-11-18 19:11:33.066258216 
+0100
@@ -0,0 +1,25 @@
+// PR c++/101180
+// { dg-do compile { target c++11 } }
+
+#pragma GCC target "avx"
+template  struct A {};
+#pragma GCC push_options
+#pragma GCC target "avx,avx2,bmi,bmi2,fma,f16c"
+template  using B = A;
+template  struct C;
+template <> struct C {
+  __attribute__((always_inline)) float operator()(long) { return .0f; }
+};
+long d;
+template  void e(B) {
+  T{C()(d)};
+}
+template  void f(T d, FromT) {
+  e(d);
+}
+int g;
+void h() {
+  A i;
+  f(i, g);
+}
+#pragma GCC pop_options

Jakub





Re: [PATCH 00/10] __builtin_dynamic_object_size

2021-11-19 Thread Jakub Jelinek via Gcc-patches
On Wed, Nov 10, 2021 at 12:31:26AM +0530, Siddhesh Poyarekar wrote:
> - Instead of bailing out on non-constant sizes with
>   __builtin_object_size, it should be possible to use ranger to
>   get an upper and lower bound on the size expression and use that to
>   implement __builtin_object_size.

I'd prefer not to.  One thing is that VRP heavily relies on UB not happening
in the program, while __bos is typically used to catch UB in those programs.
And, I'm afraid fairly often VRP would result in runtime tests for limits
that aren't useful for security at all.  Say VRP figures out that some
integer isn't negative or doesn't have MSB set etc., suddenly we have range
of [0, INT_MAX] or similar and making that imply __builtin_object_size
INT_MAX rather than ~(size_t) 0 would mean we need to use __*_chk and
compare at runtime, even when it is very unlikely an object would be that
big...
The compiler computes some range, but there is not information on how much
the range actually maps to the actual range the program is using, or when it
is some much larger superset of the actual range (same problem is with
Martin's warnings BTW).  Some unrelated inlined function can perform some
comparison just in case, perhaps some jump threading is done and all of sudden
there is non-VARYING range.

Jakub



[PATCH] libphobos, testsuite: Add prune clauses for two Darwin cases.

2021-11-19 Thread Iain Sandoe via Gcc-patches
Depending on the permutation of CPU, OS version and shared/non-
shared library inclusion, we get can get two warnings from the
external tools (ld64, dsymutil) which are not actually GCC issues
but relate to the external tools.  These are already pruned in
the main testsuite, this adds them to the library.

tested on x86_64,i686-darwin17 where the problem shows up.
OK for master / backports?
thanks
Iain

Signed-off-by: Iain Sandoe 

libphobos/ChangeLog:

* testsuite/lib/libphobos.exp: Prune warnings from external
tool bugs.
---
 libphobos/testsuite/lib/libphobos.exp | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/libphobos/testsuite/lib/libphobos.exp 
b/libphobos/testsuite/lib/libphobos.exp
index 3be2092b12e..2af430a0e45 100644
--- a/libphobos/testsuite/lib/libphobos.exp
+++ b/libphobos/testsuite/lib/libphobos.exp
@@ -90,6 +90,13 @@ proc libphobos-dg-test { prog do_what extra_tool_flags } {
 }
 
 proc libphobos-dg-prune { system text } {
+
+# Ignore harmless warnings from Xcode.
+regsub -all "(^|\n)\[^\n\]*ld: warning: could not create compact unwind 
for\[^\n\]*" $text "" text
+
+# Ignore dsymutil warning (tool bug is actually linker)
+regsub -all "(^|\n)\[^\n\]*could not find object file symbol for 
symbol\[^\n\]*" $text "" text
+
 return $text
 }
 
-- 
2.24.3 (Apple Git-128)



Re: [PATCH 01/10] tree-object-size: Replace magic numbers with enums

2021-11-19 Thread Jakub Jelinek via Gcc-patches
On Wed, Nov 10, 2021 at 12:31:27AM +0530, Siddhesh Poyarekar wrote:
> A simple cleanup to allow inserting dynamic size code more easily.
> 
> gcc/ChangeLog:
> 
>   * tree-object-size.c: New enum.
>   (object_sizes, computed, addr_object_size,
>   compute_builtin_object_size, init_object_sizes,
>   fini_object_sizes, object_sizes_execute): Replace magic numbers
>   with enums.

Ok.

Jakub



[PATCH] libstdc++, testsuite: Add a prune expression for external tool bug.

2021-11-19 Thread Iain Sandoe via Gcc-patches
Depending on the permutation of CPU, OS version and shared/non-
shared library inclusion, we get can get warnings from the external
tools (ld64, dsymutil) which are not actually libstdc++ issues but
relate to the external tools themselves.  This is already pruned
in the main testsuite, this adds it to the library.

tested on x86_64, i686 darwin17 which is one of the platforms where
the issue shows.

OK for master? / eventual backports ?
thanks
Iain

Signed-off-by: Iain Sandoe 

libstdc++-v3/ChangeLog:

* testsuite/lib/prune.exp: Prune dsymutil (ld64) warning.
---
 libstdc++-v3/testsuite/lib/prune.exp | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/libstdc++-v3/testsuite/lib/prune.exp 
b/libstdc++-v3/testsuite/lib/prune.exp
index 334f8218d16..41c4b51cb16 100644
--- a/libstdc++-v3/testsuite/lib/prune.exp
+++ b/libstdc++-v3/testsuite/lib/prune.exp
@@ -66,9 +66,12 @@ proc libstdc++-dg-prune { system text } {
 regsub -all "(^|\n)\[^\n\]*warning: 
DWARFDebugInfoEntry::AppendDependants\[^\n\]*AT_\[^\n\]*_bound\[^\n\]*FORM_ref4\[^\n\]*"
 $text "" text
 regsub -all "(^|\n)\[^\n\]*warning:\[^\n\]*TAG_variable:  
AT_location\[^\n\]*didn't have valid function low pc\[^\n\]*" $text "" text
 
-# Ignore harmless warnings from Xcode 4.0.
+# Ignore harmless warnings from Xcode 4+.
 regsub -all "(^|\n)\[^\n\]*ld: warning: could not create compact unwind 
for\[^\n\]*" $text "" text
 
+# Ignore dsymutil warning (tool bug is actually in the linker)
+regsub -all "(^|\n)\[^\n\]*could not find object file symbol for 
symbol\[^\n\]*" $text "" text
+
 foreach p $additional_prunes {
if { [string length $p] > 0 } {
# Following regexp matches a complete line containing $p.
-- 
2.24.3 (Apple Git-128)



Re: [PATCH] Fix IPA modref ubsan.

2021-11-19 Thread Martin Sebor via Gcc-patches

On 11/18/21 10:10 AM, Jan Hubicka wrote:

I don't know what the guidance is on using vec in IPA passes
but with respect to existing practice elsewhere, there are
existing uses of vec and auto_vec with non-POD types and vec
does work with them  (see the vec_default_construct and
vec_copy_construct templates, for example, whose goal is
to support nontrivial classes).


I see, since 2017 :).  The patch is OK then.
Nontrivial destructors also behave in a sane way these days?


Good question :)

At a minimum, element dtors should be automatically invoked by
the auto_vec dtor (there is an auto-test in vec.c to verify that).

Beyond that, since (unlike auto_vec) a plain vec isn't a container
its users are on their own when it comes to managing the memory of
their elements (i.e., they need to explicitly destroy their elements).

Having said that, as with all retrofits, they could be incomplete.
I see a few examples of where that seems to be the case here:

Calling truncate() on a vec with notrivial elements leaks, so
clients needs to explicitly release those elements.  That
should happen automatically.

Going through vec, I also see calls to memmove in functions
like quick_insert, ordered_remove, and block_remove.  So calling
those functions is not safe on a vec with nontrivial types.

Calling any of the sort functions also may not work correctly
with nontrivial elements (gcc_sort() calls memcpy).  vec should
either prevent that buy refusing to compile or use a safe
(generic) template for that.

So while basic vec uses work with nontrivial types, there are
plenty of bugs :(


OK, sounds bit dangerous but the use here is very simple.


I agree (that it's dangerous).  I expect -Wclass-memaccess will
catch the calls to memcpy/memmove if any of the vec members that
call them are instantiated on a nontrivial type.  Calling sort
will probably not trigger it because the memcpy call is buried
in a .c file (and not in a template).

Regardless, these uses should be still be fixed to work right.
I'll see if I can get to it at some point.

Martin


I remember the non-POD rule mostly from the original David's
implementation of modref that did put non-pods to vector and he took
really long while to work out why it breaks.  And yep, it was before
2017 :)
Honza


Martin



Honza


Martin


The diagnostics should be from
a.parm_offset_known &= m.parm_offset_known;
Becasue both in the parm_map (which is variable m) and access_node
(which is variable a) the parm_offset_known has no meaning when
parm_index == MODREF_UNKNOWN_PARM.

If we want to avoid computing on these, perhaps this will work?

diff --git a/gcc/ipa-modref-tree.h b/gcc/ipa-modref-tree.h
index 0a097349ebd..97736d0d8a4 100644
--- a/gcc/ipa-modref-tree.h
+++ b/gcc/ipa-modref-tree.h
@@ -568,9 +568,13 @@ struct GTY((user)) modref_tree
  : (*parm_map) [a.parm_index];
if (m.parm_index == MODREF_LOCAL_MEMORY_PARM)
  continue;
-   a.parm_offset += m.parm_offset;
-   a.parm_offset_known &= m.parm_offset_known;
a.parm_index = m.parm_index;
+   if (a.parm_index != MODREF_UNKNOWN_PARM)
+ {
+   a.parm_offset_known &= m.parm_offset_known;
+   if (a.parm_offset_known)
+ a.parm_offset += m.parm_offset;
+ }
  }
  }
changed |= insert (base_node->base, ref_node->ref, a,

  /* Index of parameter we translate to.
 Values from special_params enum are permitted too.  */
  int parm_index;
diff --git a/gcc/ipa-modref.c b/gcc/ipa-modref.c
index c94f0589d44..630d202d5cf 100644
--- a/gcc/ipa-modref.c
+++ b/gcc/ipa-modref.c
@@ -5020,8 +5020,7 @@ ipa_merge_modref_summary_after_inlining (cgraph_edge 
*edge)
  auto_vec  parm_map;
  modref_parm_map chain_map;
  /* TODO: Once we get jump functions for static chains we could
-compute this.  */
-  chain_map.parm_index = MODREF_UNKNOWN_PARM;
+compute parm_index.  */
  compute_parm_map (edge, &parm_map);
--
2.33.1









Re: [PATCH 02/10] tree-object-size: Abstract object_sizes array

2021-11-19 Thread Jakub Jelinek via Gcc-patches
On Wed, Nov 10, 2021 at 12:31:28AM +0530, Siddhesh Poyarekar wrote:
> Put all accesses to object_sizes behind functions so that we can add
> dynamic capability more easily.
> 
> gcc/ChangeLog:
> 
>   * tree-object-size.c (object_sizes_grow, object_sizes_release,
>   object_sizes_unknown_p, object_sizes_get, object_size_set_force,
>   object_sizes_set): New functions.
>   (addr_object_size, compute_builtin_object_size,
>   expr_object_size, call_object_size, unknown_object_size,
>   merge_object_sizes, plus_stmt_object_size,
>   cond_expr_object_size, collect_object_sizes_for,
>   check_for_plus_in_loops_1, init_object_sizes,
>   fini_object_sizes): Adjust.

> @@ -975,8 +994,9 @@ collect_object_sizes_for (struct object_size_info *osi, 
> tree var)
>  {
>if (bitmap_set_bit (osi->visited, varno))
>   {
> -   object_sizes[object_size_type][varno]
> - = (object_size_type & 2) ? -1 : 0;
> +   /* Initialize to 0 for maximum size and M1U for minimum size so that
> +  it gets immediately overridden.  */
  object_sizes_set_force (osi, varno, unknown (object_size_type ^ 2));

Shouldn't that be unknown (object_size_type
   ^ OST_MINIMUM)
?
Or, if you redo the series, update the first patch so that it doesn't leave
any & 2 or & 1 etc. around and instead uses the enumerators and redo this
patch?

Otherwise LGTM.

Jakub



[PATCH] PR tree-optimization/103254 - Limit depth for all GORI expressions.

2021-11-19 Thread Andrew MacLeod via Gcc-patches

On 11/19/21 04:21, Richard Biener wrote:

On Thu, Nov 18, 2021 at 8:30 PM Andrew MacLeod via Gcc-patches
 wrote:

At issue here is the dynamic approach we currently use for outgoing edge
calculations.  It isn't normally a problem, but once you get a very
large number of possible outgoing values (ie very long unrolled blocks)
with pairs of values on a statement, and individual queries for each
one, it becomes exponential if they relate to each other.

So.  We previously introduced a param for PR 100081 which limited the
depth of logical expressions GORI would pursue because we always make 2
or 4 queries for each logical expression, which accelerated the
exponential behaviour much quicker.

This patch reuses that to apply to any statement which has 2 ssa-names,
as the potential for 2 queries can happen then as well..  leading to
exponential behaviour.  Each statement we step back through with
multiple ssa-names decreases the odds of calculating a useful outgoing
range anyway.  This will remove ridiculous behaviour like this PR
demonstrates.

Next release I plan to revamp GORI to cache outgoing values, which will
eliminate all this sort of behaviour, and we can remove all such
restrictions.

Bootstrapped on x86_64-pc-linux-gnu with no regressions.  OK for trunk?

I think it's reasonable.  SCEV tries to limit the overall number of SSA -> def


Pushed.




Re: [PATCH 02/10] tree-object-size: Abstract object_sizes array

2021-11-19 Thread Siddhesh Poyarekar

On 11/19/21 21:48, Jakub Jelinek wrote:

On Wed, Nov 10, 2021 at 12:31:28AM +0530, Siddhesh Poyarekar wrote:

Put all accesses to object_sizes behind functions so that we can add
dynamic capability more easily.

gcc/ChangeLog:

* tree-object-size.c (object_sizes_grow, object_sizes_release,
object_sizes_unknown_p, object_sizes_get, object_size_set_force,
object_sizes_set): New functions.
(addr_object_size, compute_builtin_object_size,
expr_object_size, call_object_size, unknown_object_size,
merge_object_sizes, plus_stmt_object_size,
cond_expr_object_size, collect_object_sizes_for,
check_for_plus_in_loops_1, init_object_sizes,
fini_object_sizes): Adjust.



@@ -975,8 +994,9 @@ collect_object_sizes_for (struct object_size_info *osi, 
tree var)
  {
if (bitmap_set_bit (osi->visited, varno))
{
- object_sizes[object_size_type][varno]
-   = (object_size_type & 2) ? -1 : 0;
+ /* Initialize to 0 for maximum size and M1U for minimum size so that
+it gets immediately overridden.  */

  object_sizes_set_force (osi, varno, unknown (object_size_type ^ 2));

Shouldn't that be unknown (object_size_type
   ^ OST_MINIMUM)
?
Or, if you redo the series, update the first patch so that it doesn't leave
any & 2 or & 1 etc. around and instead uses the enumerators and redo this
patch?


Thanks, I'll update 1/10 and redo this on top of it.

Siddhesh


Re: [PATCH 03/10] tree-object-size: Use tree instead of HOST_WIDE_INT

2021-11-19 Thread Jakub Jelinek via Gcc-patches
On Wed, Nov 10, 2021 at 12:31:29AM +0530, Siddhesh Poyarekar wrote:
>   * tree-object-size.h (compute_builtin_object_size): Return tree
>   instead of HOST_WIDE_INT.
>   * builtins.c (fold_builtin_object_size): Adjust.
>   * gimple-fold.c (gimple_fold_builtin_strncat): Likewise.
>   * ubsan.c (instrument_object_size): Likewise.
>   * tree-object-size.c (object_sizes): Change type to vec.
>   (initval): New function.
>   (unknown): Use it.
>   (size_unknown_p, size_initval, size_unknown): New functions.
>   (object_sizes_unknown_p): Use it.
>   (object_sizes_get): Return tree.
>   (object_sizes_initialize): Rename from object_sizes_set_force
>   and set VAL parameter type as tree.
>   (object_sizes_set): Set VAL parameter type as tree and adjust
>   implementation.
>   (size_for_offset): New function.
>   (addr_object_size): Change PSIZE parameter to tree and adjust
>   implementation.
>   (alloc_object_size): Return tree.
>   (compute_builtin_object_size): Return tree in PSIZE.
>   (expr_object_size, call_object_size, unknown_object_size):
>   Adjust for object_sizes_set change.
>   (merge_object_sizes): Set OFFSET type to tree and adjust
>   implementation.
>   (plus_stmt_object_size, cond_expr_object_size,
>   collect_object_sizes_for, check_for_plus_in_loops_1): Adjust for
>   change of type from HOST_WIDE_INT to tree.
>   (object_sizes_execute): Adjust for type change to tree.

> --- a/gcc/builtins.c
> +++ b/gcc/builtins.c
> @@ -10226,7 +10226,7 @@ maybe_emit_sprintf_chk_warning (tree exp, enum 
> built_in_function fcode)
>  static tree
>  fold_builtin_object_size (tree ptr, tree ost)
>  {
> -  unsigned HOST_WIDE_INT bytes;
> +  tree bytes;
>int object_size_type;
>  
>if (!validate_arg (ptr, POINTER_TYPE)
> @@ -10251,17 +10251,15 @@ fold_builtin_object_size (tree ptr, tree ost)
>if (TREE_CODE (ptr) == ADDR_EXPR)
>  {
>compute_builtin_object_size (ptr, object_size_type, &bytes);
> -  if (wi::fits_to_tree_p (bytes, size_type_node))
> - return build_int_cstu (size_type_node, bytes);
> +  return fold_convert (size_type_node, bytes);
>  }
>else if (TREE_CODE (ptr) == SSA_NAME)
>  {
>/* If object size is not known yet, delay folding until
> later.  Maybe subsequent passes will help determining
> it.  */
> -  if (compute_builtin_object_size (ptr, object_size_type, &bytes)
> -   && wi::fits_to_tree_p (bytes, size_type_node))
> - return build_int_cstu (size_type_node, bytes);
> +  if (compute_builtin_object_size (ptr, object_size_type, &bytes))
> + return fold_convert (size_type_node, bytes);

Neither of these are equivalent to what it used to do before.
If some target has e.g. pointers wider than size_t, then previously we could
compute bytes that doesn't fit into size_t and would return NULL which
eventually would result in object_sizes_execute or expand_builtin_object_size
resolving it to the unknown value.  But fold_convert will perform modulo
arithmetics on it, so say if size_t is 24-bit and bytes is 0x101, it
will fold to (size_t) 1.  I think we should still punt if it doesn't fit.
Or do we ensure that what it returns always has sizetype type?

> @@ -2486,13 +2486,14 @@ gimple_fold_builtin_strncat (gimple_stmt_iterator 
> *gsi)
>if (cmpsrc < 0)
>  return false;
>  
> -  unsigned HOST_WIDE_INT dstsize;
> +  tree dstsize;
>  
>bool nowarn = warning_suppressed_p (stmt, OPT_Wstringop_overflow_);
>  
> -  if (!nowarn && compute_builtin_object_size (dst, 1, &dstsize))
> +  if (!nowarn && compute_builtin_object_size (dst, 1, &dstsize)
> +  && tree_fits_uhwi_p (dstsize))
>  {
> -  int cmpdst = compare_tree_int (len, dstsize);
> +  int cmpdst = compare_tree_int (len, tree_to_uhwi (dstsize));

Why jump from tree to UHWI and back?
Check that TREE_CODE (dstsize) == INTEGER_CST and do
tree_int_cst_compare instead?

>  
>if (cmpdst >= 0)
>   {
> @@ -2509,7 +2510,7 @@ gimple_fold_builtin_strncat (gimple_stmt_iterator *gsi)
>   "destination size")
>  : G_("%qD specified bound %E exceeds "
>   "destination size %wu"),
> -fndecl, len, dstsize);
> +fndecl, len, tree_to_uhwi (dstsize));

Similarly, use %E and pass dstsize to it.

> +/* Initial value of object sizes; zero for maximum and SIZE_MAX for minimum
> +   object size.  */
> +
> +static inline unsigned HOST_WIDE_INT
> +initval (int object_size_type)
> +{
> +  return (unsigned HOST_WIDE_INT) -popcount_hwi (object_size_type
> +  & OST_MINIMUM);

This makes the code quite unreadable and on some arches it will be also
much worse (popcount is a libcall on many, and even if not, it is inline
function using a builtin inside of it).  Can't you 

Re: [PATCH] coroutines: Handle initial awaiters with non-void returns [PR 100127].

2021-11-19 Thread Iain Sandoe
Hi Jason,

> On 18 Nov 2021, at 23:42, Iain Sandoe  wrote:
> 
> 
> 
>> On 18 Nov 2021, at 22:13, Jason Merrill via Gcc-patches 
>>  wrote:
>> 
>> On 11/5/21 11:46, Iain Sandoe wrote:
>>> The way in which a C++20 coroutine is specified discards any value

>>>   tree aw_r = TREE_VEC_ELT (vec, 2);
>>> + if (!VOID_TYPE_P (TREE_TYPE (aw_r)))
>>> +   aw_r = build1 (CONVERT_EXPR, void_type_node, aw_r);
>> 
>> Is there a reason not to use convert_to_void?
> 
> no, just me still learning APIs… I’ll do a revised and check it.

So I’m testing this replacement:

  aw_r = convert_to_void (aw_r, ICV_CAST, tf_warning_or_error);

OK if the testing succeeds?
Iain



Re: libgo patch committed: redirect mkrsysinfo.sh grep output to /dev/null

2021-11-19 Thread Ian Lance Taylor via Gcc-patches
On Fri, Nov 19, 2021 at 3:47 AM Bernhard Reutner-Fischer via
Gcc-patches  wrote:
>
> On Fri, 28 Oct 2016 10:55:18 -0700
> Ian Lance Taylor  wrote:
>
> > This patch to libgo redirects the output of a grep command in
> > mkrsysinfo.sh to /dev/null.  The output otherwise appears in the
>
> grep -q exists since at least SUSv2, fwiw.

There has traditionally been confusion between grep -q and grep -s.
Older grep programs supported -s but not -q.  Perhaps that confusion
is all cleared up now on all systems that GCC supports.  I don't know.
I do know that the patch I applied five years ago works everywhere.

Ian


Re: [PATCH] libstdc++, testsuite: Add a prune expression for external tool bug.

2021-11-19 Thread Jonathan Wakely via Gcc-patches
On Fri, 19 Nov 2021 at 16:04, Iain Sandoe via Libstdc++
 wrote:
>
> Depending on the permutation of CPU, OS version and shared/non-
> shared library inclusion, we get can get warnings from the external
> tools (ld64, dsymutil) which are not actually libstdc++ issues but
> relate to the external tools themselves.  This is already pruned
> in the main testsuite, this adds it to the library.
>
> tested on x86_64, i686 darwin17 which is one of the platforms where
> the issue shows.
>
> OK for master? / eventual backports ?

Yes, thanks.


Re: [PATCH] options: Make -Ofast switch off -fsemantic-interposition

2021-11-19 Thread Martin Jambor
Hi,

On Fri, Nov 19 2021, Jan Hubicka wrote:
>> > Hi,
>> > 
>> > On Fri, Nov 12 2021, Martin Jambor wrote:
>> > > Hi,
>> > >
>> > > using -fno-semantic-interposition has been reported by various people
>> > > to bring about considerable speed up at the cost of strict compliance
>> > > to the ELF symbol interposition rules  See for example
>> > > https://fedoraproject.org/wiki/Changes/PythonNoSemanticInterpositionSpeedup
>> > >
>> > > As such I believe it should be implied by our -Ofast optimization
>> > > level, not only so that benchmarks that can benefit run faster, but
>> > > also so that people looking at -Ofast documentation for options that
>> > > could speed their programs find it.
>> > >
>> > > I have verified that with the following patch IPA-CP sees
>> > > flag_semantic_interposition set to zero at Ofast and that info and pdf
>> > > manual builds fine with the documentation change.  I am bootstrapping
>> > > and testing it now in order to comply with submission criteria but I
>> > > don't think an Ofast change gets much tested.
>> > >
>> > > Assuming it passes, is the patch OK?  (If it is, I will also add a note
>> > > about it in the "Caveats" section in gcc-12/changes.html of wwwdocs
>> > > after I commit the patch.)
>> > >
>> > 
>> > Unfortunately, I was wrong, there are testcases which use the optimize
>> > attribute to switch a function to Ofast and those ICE because
>> > -fsemantic-interposition is not an optimization flag and only
>> > optimization flags can change in an optimize attribute (AFAIK, I only
>> > had a quick glance at the results).
>> > 
>> > I am not sure what is the right way to tackle this, whether to set the
>> > flag at Ofast in some nonstandard way or make the flag an optimization
>> > flag - probably affecting function definitions, having it affect
>> > call-sites seems too fine-grained.  I will try to discuss this on IRC on
>> > Monday (and hope such change is still doable early stage3).
>> > 
>> > Sorry for posting this a bit prematurely,
>> 
>> Hi,
>> 
>> This patch turns flag_semantic_interposition to optimization option so
>> it can be enabled with per-function granuality.  This is done by adding
>> the flag among visibility flags into the symbol table.  This fixes the
>> behaviour on the testcase I added to testsuite.
>> 
>> There are bugs where get_availability misbehaves on partitioned program.
>> We can also use the new flag to fix those, but I will do that
>> incrementally.
>> 
>> The -Ofast change should be safe now.
>
> Also forgot to say it explicitly, the patch is OK, so please commit it.
>

Thanks a lot for the enabling patch.  I have committed mine after re-testing.

Martin


[wwwdocs PATCH]: Add a caveat that -Ofast implies -fno-semantic-interposition

2021-11-19 Thread Martin Jambor
Hi,

can I add the following caveat to the gcc-12/changes.html file?

Thanks,

Martin


diff --git a/htdocs/gcc-12/changes.html b/htdocs/gcc-12/changes.html
index 5f0214bd..fd7af717 100644
--- a/htdocs/gcc-12/changes.html
+++ b/htdocs/gcc-12/changes.html
@@ -72,6 +72,8 @@ a work-in-progress.
 be removed in the next release.  All ports now default to emit DWARF
 (version 2 or later) debugging info or are obsoleted.
   
+  The optimization level -Ofast now implies
+-fno-semantic-interposition.
 




Re: [PATCH] rs6000: Add optimizations for _mm_sad_epu8

2021-11-19 Thread Segher Boessenkool
Hi!

On Fri, Oct 22, 2021 at 12:28:49PM -0500, Paul A. Clarke wrote:
> Power9 ISA added `vabsdub` instruction which is realized in the
> `vec_absd` instrinsic.
> 
> Use `vec_absd` for `_mm_sad_epu8` compatibility intrinsic, when
> `_ARCH_PWR9`.
> 
> Also, the realization of `vec_sum2s` on little-endian includes
> two shifts in order to position the input and output to match
> the semantics of `vec_sum2s`:
> - Shift the second input vector left 12 bytes. In the current usage,
>   that vector is `{0}`, so this shift is unnecessary, but is currently
>   not eliminated under optimization.

The vsum2sws implementation uses an unspec, so there is almost no chance
of anything with it being optimised :-(

It rotates it right by 4 bytes btw, it's not a shift.

> - Shift the vector produced by the `vsum2sws` instruction left 4 bytes.
>   The two words within each doubleword of this (shifted) result must then
>   be explicitly swapped to match the semantics of `_mm_sad_epu8`,
>   effectively reversing this shift.  So, this shift (and a susequent swap)
>   are unnecessary, but not currently removed under optimization.

Rotate left by 4 -- same thing once you consider word 0 and 2 are set
to zeroes by the sum2sws.

Not sure why it is not optimised, what do the dump files say?  -dap and
I'd start looking at the combine dump.

> Using `__builtin_altivec_vsum2sws` retains both shifts, so is not an
> option for removing the shifts.
> 
> For little-endian, use the `vsum2sws` instruction directly, and
> eliminate the explicit shift (swap).
> 
> 2021-10-22  Paul A. Clarke  
> 
> gcc
>   * config/rs6000/emmintrin.h (_mm_sad_epu8): Use vec_absd
>   when _ARCH_PWR9, optimize vec_sum2s when LE.

Please don't break changelog lines early.

> -  vmin = vec_min (a, b);
> -  vmax = vec_max (a, b);
> +#ifndef _ARCH_PWR9
> +  __v16qu vmin = vec_min (a, b);
> +  __v16qu vmax = vec_max (a, b);
>vabsdiff = vec_sub (vmax, vmin);
> +#else
> +  vabsdiff = vec_absd (a, b);
> +#endif

So hrm, maybe we should have the vec_absd macro (or the builtin) always,
just expanding to three insns if necessary.

Okay for trunk with approproate changelog and commit message changes.
Thanks!


Segher


Re: [PATCH] gcc: vxworks: fix providing stdint.h header

2021-11-19 Thread Olivier Hainque via Gcc-patches
Hi Rasmus,

> On 12 Nov 2021, at 17:35, Olivier Hainque  wrote:

> We have had to use for stdbool a similar trick as we had
> for stdint (need to preinclude yyvals.h), which we will need to
> propagate somehow. I'm not yet sure how to reconcile that with
> your observations.

>> In file included from 
>> .../gcc-build/powerpc-wrs-vxworks/libstdc++-v3/include/memory:72,
>>from .../gcc-src/libstdc++-v3/include/precompiled/stdc++.h:82:
>> .../gcc-build/powerpc-wrs-vxworks/libstdc++-v3/include/bits/align.h:36:10: 
>> fatal error: stdint.h: No such file or directory
>>  36 | #include  // uintptr_t
>> |  ^~
>> compilation terminated.
>> Makefile:1861: recipe for target 
>> 'powerpc-wrs-vxworks/bits/stdc++.h.gch/O2ggnu++0x.gch' failed
>> make[5]: *** [powerpc-wrs-vxworks/bits/stdc++.h.gch/O2ggnu++0x.gch] Error 1

>> For the approach with an extra makefile fragment to work, that rule
>> would have to fire after stmp-int-hdrs as it does now (i.e., after the
>> common logic has removed stdint.h), but it must also run before we
>> actually start building target libraries that depend on having a
>> stdint.h - and I can't find something reasonable to make the rule a
>> dependency of.

I was just able to get complete build of a gcc-11 based compiler
with only a couple of minor additions on top of what we already have,
part of which I was planning to propagate, including the stdbool trick
similar to the one for stdint.

I didn't hit the error you have and we have never hit it before,
which suggests there might be some significant difference between our
ways to build that could help.

Your error triggers on O2ggnu++0x.gch and we configure with
--disable-libstdcxx-pch.

Can you see if adding this to your configuration options helps
on your end?

Thanks in advance,

Olivier



Re: [PATCH] PR tree-optimization/103254 - Limit depth for all GORI expressions.

2021-11-19 Thread Richard Biener via Gcc-patches
On November 19, 2021 4:00:01 PM GMT+01:00, Andrew MacLeod  
wrote:
>On 11/19/21 04:21, Richard Biener wrote:
>> On Thu, Nov 18, 2021 at 8:30 PM Andrew MacLeod via Gcc-patches
>>  wrote:
>>> At issue here is the dynamic approach we currently use for outgoing edge
>>> calculations.  It isn't normally a problem, but once you get a very
>>> large number of possible outgoing values (ie very long unrolled blocks)
>>> with pairs of values on a statement, and individual queries for each
>>> one, it becomes exponential if they relate to each other.
>>>
>>> So.  We previously introduced a param for PR 100081 which limited the
>>> depth of logical expressions GORI would pursue because we always make 2
>>> or 4 queries for each logical expression, which accelerated the
>>> exponential behaviour much quicker.
>>>
>>> This patch reuses that to apply to any statement which has 2 ssa-names,
>>> as the potential for 2 queries can happen then as well..  leading to
>>> exponential behaviour.  Each statement we step back through with
>>> multiple ssa-names decreases the odds of calculating a useful outgoing
>>> range anyway.  This will remove ridiculous behaviour like this PR
>>> demonstrates.
>>>
>>> Next release I plan to revamp GORI to cache outgoing values, which will
>>> eliminate all this sort of behaviour, and we can remove all such
>>> restrictions.
>>>
>>> Bootstrapped on x86_64-pc-linux-gnu with no regressions.  OK for trunk?
>> I think it's reasonable.  SCEV tries to limit the overall number of SSA -> 
>> def
>> visits, capturing deep vs. wide in a more uniform (and single) way.
>> (--param scev-max-expr-size and the duplicate - huh - --param
>> scev-max-expr-complexity).
>>
>> For SCEV running into the limit means fatal fail of the analysis, for ranger
>> it only means less refinement?  So it might make a difference which path
>> you visit and which not (just thinking about reproducability of range 
>> analysis
>> when we say, swap operands of a stmt)?
>>
>Yes, Its means less refinement for "deeper" dependencies in the block..  
>It will not be affected by operand swapping because we will either look 
>at dependencies for both operands of a stmt, or neither.   The path you 
>visit shouldn't make any difference. Note this has nothing to do with 
>generating ranges for those statements, this is only for refining 
>outgoing ranges created by the condition at the bottom of the block.
>
>To be precise, we will stop looking for possible range changes once our 
>dependency search, which always starts from the bottom of the block, 
>encounters 6 stmts with multiple SSA operands. Statements with a single 
>SSA operand will continue to build the dependency chains.
>
>so for instance, following just the chain for first operands of this 
>somewhat truncated listing:
>  <...>
>   _137 = (short int) j_131;
>   _138 = _129 & _137;
>   _139 = (int) _138;
>   j_140 = j_131 & _139;    << depth 6 ... we don't go look at j_131 
>or _139 for further dependencies
>   _146 = (short int) j_140;
>   _147 = _138 & _146;
>   _148 = (int) _147;
>   j_149 = j_140 & _148;   << depth 5
>   _155 = (short int) j_149;
>   _156 = _147 & _155;
>   _157 = (int) _156;
>   j_158 = j_149 & _157;   << depth 4
>   _164 = (short int) j_158;
>   _165 = _156 & _164;
>   _166 = (int) _165;
>   j_167 = j_158 & _166;  <<-- depth 3
>   _1 = (short int) j_167;
>   _3 = _1 & _165;    <<-depth 2
>   _5 = (int) _3;
>   j_22 = _5 & j_167; <<-- depth 1
>   j_20 = j_22 + 4;
>   if (j_20 == 4)
>     goto ;
>
>We'll generate dependencies, and thus able to generate outgoing ranges 
>for all these ssa-names still until we hit a dependency depth of 6 (the 
>current default), which for one chain comes to an end at
>
>j_140 = j_131 & _139
>
>   We will be able to generate outgoing ranges for j_131 and _139 on the 
>edges, but we will not try for any ssa_names used to define those.. ie, 
>_138 might not be able to have outgoing ranges generated for it in this 
>block.
>
>working the j_20 : [4,4] range on the true edge back thru those 
>expressions, once we get that deep the odds of finding something of 
>value is remote :-)
>
>We will also only generate these outgoing ranges when they are asked 
>for.  Many of these names are not used outside the block (especially 
>back the deeper you go), and so never get asked for anyway...  but you 
>could :-)

I see. I suppose you could prune the list simply by asking whether a Def has a 
single use only (that you just followed) and follow the chain but not generate 
an outgoing range for the Def. Not sure if that will make a big difference in 
compile time. 

>Anyway, this limit in no way introduces any kind of ordering variation..
>
>Andrew
>
>PS for amusement and information overload...  it turns out in this hunk 
>of code we end up knowing that globally the range of most of these names 
>are [0, 4], and the actual ranges we COULD generate if asked:
>
>3->5  (T) _1 :  short int [0, 4]
>3->5  (T) _3 :  short int [0, 

Re: [PATCH] rs6000: Add Power10 optimization for most _mm_movemask*

2021-11-19 Thread Segher Boessenkool
On Thu, Oct 21, 2021 at 12:22:12PM -0500, Paul A. Clarke wrote:
> Power10 ISA added `vextract*` instructions which are realized in the
> `vec_extractm` instrinsic.
> 
> Use `vec_extractm` for `_mm_movemask_ps`, `_mm_movemask_pd`, and
> `_mm_movemask_epi8` compatibility intrinsics, when `_ARCH_PWR10`.
> 
> 2021-10-21  Paul A. Clarke  
> 
> gcc
>   * config/rs6000/xmmintrin.h (_mm_movemask_ps): Use vec_extractm
>   when _ARCH_PWR10.
>   * config/rs6000/emmintrin.h (_mm_movemask_pd): Likewise.
>   (_mm_movemask_epi8): Likewise.

Okay for trunk.  Thanks!


Segher


[committed] libstdc++: Begin lifetime of chars in constexpr std::string [PR103295]

2021-11-19 Thread Jonathan Wakely via Gcc-patches
Tested powerpc64le-linux (and clang on x86_64-linux), pushed to trunk.


Clang gives errors for constexpr std::string because the memory returned
by std::allocator::allocate does not contain any objects yet, and
attempting to set them using char_traits::assign or char_traits::copy
fails with:

assignment to object outside its lifetime is not allowed in a constant 
expression
  *__result = *__first;
^
This adds code to std::char_traits to use std::construct_at to begin
lifetimes when called during constant evaluation. To support
specializations of std::basic_string that don't use std::char_traits
there is now another layer of wrapper around the allocator_traits, so
that the lifetime of characters is begun as soon as the memory is
allocated. By doing it in the char traits and allocator traits, the rest
of basic_string can ignore the problem.

While modifying char_traits::copy and char_traits::assign to begin
lifetimes for the constexpr cases, I also replaced their uses of
std::copy and std::fill_n respectively. That means we don't need
 for char_traits.

libstdc++-v3/ChangeLog:

PR libstdc++/103295
* include/bits/basic_string.h (_Alloc_traits): Replace typedef
with struct for C++20 mode.
* include/bits/basic_string.tcc (_M_replace): Use _Alloc_traits
for allocation.
* include/bits/char_traits.h (__gnu_cxx::char_traits::assign):
Use std::_Construct during constant evaluation.
(__gnu_cxx::char_traits::assign(CharT*, const CharT*, size_t)):
Likewise. Replace std::fill_n with memset or manual loop.
(__gnu_cxx::char_traits::copy): Likewise, replacing std::copy
with memcpy.
* include/ext/vstring.h: Include  for
std::min.
* include/std/string_view: Likewise.
* 
testsuite/21_strings/basic_string/capacity/char/resize_and_overwrite.cc:
Add constexpr test.
---
 libstdc++-v3/include/bits/basic_string.h  | 40 +++-
 libstdc++-v3/include/bits/basic_string.tcc|  7 +-
 libstdc++-v3/include/bits/char_traits.h   | 92 ---
 libstdc++-v3/include/ext/vstring.h|  1 +
 libstdc++-v3/include/std/string_view  |  2 +
 .../capacity/char/resize_and_overwrite.cc | 14 +++
 6 files changed, 141 insertions(+), 15 deletions(-)

diff --git a/libstdc++-v3/include/bits/basic_string.h 
b/libstdc++-v3/include/bits/basic_string.h
index d29c9cdc410..6e7de738308 100644
--- a/libstdc++-v3/include/bits/basic_string.h
+++ b/libstdc++-v3/include/bits/basic_string.h
@@ -87,7 +87,37 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11
 {
   typedef typename __gnu_cxx::__alloc_traits<_Alloc>::template
rebind<_CharT>::other _Char_alloc_type;
+
+#if __cpp_lib_constexpr_string < 201907L
   typedef __gnu_cxx::__alloc_traits<_Char_alloc_type> _Alloc_traits;
+#else
+  template
+   struct _Alloc_traits_impl : __gnu_cxx::__alloc_traits<_Char_alloc_type>
+   {
+ typedef __gnu_cxx::__alloc_traits<_Char_alloc_type> _Base;
+
+ [[__gnu__::__always_inline__]]
+ static constexpr typename _Base::pointer
+ allocate(_Char_alloc_type& __a, typename _Base::size_type __n)
+ {
+   pointer __p = _Base::allocate(__a, __n);
+   if (__builtin_is_constant_evaluated())
+ // Begin the lifetime of characters in allocated storage.
+ for (size_type __i = 0; __i < __n; ++__i)
+   std::construct_at(__builtin_addressof(__p[__i]));
+   return __p;
+ }
+   };
+
+  template
+   struct _Alloc_traits_impl, _Dummy_for_PR85282>
+   : __gnu_cxx::__alloc_traits<_Char_alloc_type>
+   {
+ // std::char_traits begins the lifetime of characters.
+   };
+
+  using _Alloc_traits = _Alloc_traits_impl<_Traits, void>;
+#endif
 
   // Types:
 public:
@@ -485,7 +515,10 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11
   basic_string()
   _GLIBCXX_NOEXCEPT_IF(is_nothrow_default_constructible<_Alloc>::value)
   : _M_dataplus(_M_local_data())
-  { _M_set_length(0); }
+  {
+   _M_use_local_data();
+   _M_set_length(0);
+  }
 
   /**
*  @brief  Construct an empty string using allocator @a a.
@@ -494,7 +527,10 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11
   explicit
   basic_string(const _Alloc& __a) _GLIBCXX_NOEXCEPT
   : _M_dataplus(_M_local_data(), __a)
-  { _M_set_length(0); }
+  {
+   _M_use_local_data();
+   _M_set_length(0);
+  }
 
   /**
*  @brief  Construct string with copy of value of @a __str.
diff --git a/libstdc++-v3/include/bits/basic_string.tcc 
b/libstdc++-v3/include/bits/basic_string.tcc
index 9a54b63b933..374406c0e13 100644
--- a/libstdc++-v3/include/bits/basic_string.tcc
+++ b/libstdc++-v3/include/bits/basic_string.tcc
@@ -490,7 +490,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 #if __cpp_lib_is_constant_evaluated
  if (__builtin_is_constant_evaluated())
  

[committed] libstdc++: Suppress -Wstringop warnings [PR103332]

2021-11-19 Thread Jonathan Wakely via Gcc-patches
Tested x86_64-linux, pushed to trunk.


libstdc++-v3/ChangeLog:

PR libstdc++/103332
PR libstdc++/102958
* testsuite/21_strings/basic_string/capacity/char/1.cc: Add
-Wno-stringop-overflow.
* testsuite/21_strings/basic_string/operators/char/1.cc:
Likewise.
* testsuite/experimental/filesystem/path/factory/u8path-char8_t.cc:
Add -Wno-stringop-overread.
---
 .../testsuite/21_strings/basic_string/capacity/char/1.cc  | 3 +++
 .../testsuite/21_strings/basic_string/operators/char/1.cc | 3 +++
 .../experimental/filesystem/path/factory/u8path-char8_t.cc| 4 +++-
 3 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/libstdc++-v3/testsuite/21_strings/basic_string/capacity/char/1.cc 
b/libstdc++-v3/testsuite/21_strings/basic_string/capacity/char/1.cc
index eea69771f79..e21181bd62c 100644
--- a/libstdc++-v3/testsuite/21_strings/basic_string/capacity/char/1.cc
+++ b/libstdc++-v3/testsuite/21_strings/basic_string/capacity/char/1.cc
@@ -17,6 +17,8 @@
 // with this library; see the file COPYING3.  If not see
 // .
 
+// { dg-options "-Wno-stringop-overflow" }
+
 // 21.3.3 string capacity
 
 #include 
@@ -59,6 +61,7 @@ void test01()
 
   std::string str05(30, 'q');
   std::string str06 = str05;
+  // The following triggers -Wstringop-overflow.  See PR 103332.
   str05 = str06 + str05;
   VERIFY( str05.capacity() >= str05.size() );
   VERIFY( str06.capacity() >= str06.size() );
diff --git a/libstdc++-v3/testsuite/21_strings/basic_string/operators/char/1.cc 
b/libstdc++-v3/testsuite/21_strings/basic_string/operators/char/1.cc
index 3bdac19644f..514cd2dc65d 100644
--- a/libstdc++-v3/testsuite/21_strings/basic_string/operators/char/1.cc
+++ b/libstdc++-v3/testsuite/21_strings/basic_string/operators/char/1.cc
@@ -17,6 +17,8 @@
 // with this library; see the file COPYING3.  If not see
 // .
 
+// { dg-options "-Wno-stringop-overflow" }
+
 // 21.3.6 string operations
 
 #include 
@@ -31,6 +33,7 @@ int test01(void)
   // Should get this:
   // 1:8-chars_8-chars_
   // 2:8-chars_8-chars_
+  // The following triggers -Wstringop-overread.  See PR 103332.
   str1 = std::string("8-chars_") + "8-chars_";
   str1.c_str();
   // printf("1:%s\n", str1.c_str());
diff --git 
a/libstdc++-v3/testsuite/experimental/filesystem/path/factory/u8path-char8_t.cc 
b/libstdc++-v3/testsuite/experimental/filesystem/path/factory/u8path-char8_t.cc
index 26b04cd1cf1..985a5f8fbc1 100644
--- 
a/libstdc++-v3/testsuite/experimental/filesystem/path/factory/u8path-char8_t.cc
+++ 
b/libstdc++-v3/testsuite/experimental/filesystem/path/factory/u8path-char8_t.cc
@@ -15,7 +15,7 @@
 // with this library; see the file COPYING3.  If not see
 // .
 
-// { dg-options "-lstdc++fs -fchar8_t" }
+// { dg-options "-lstdc++fs -fchar8_t -Wno-stringop-overread" }
 // { dg-do run { target c++11 } }
 // { dg-require-filesystem-ts "" }
 
@@ -36,10 +36,12 @@ test01()
   p = fs::u8path(u8"\xf0\x9d\x84\x9e");
   VERIFY( p.u8string() == u8"\U0001D11E" );
 
+  // The following triggers -Wstringop-overread.  See PR 103332.
   std::u8string s1 = u8"filename2";
   p = fs::u8path(s1);
   VERIFY( p.u8string() == u8"filename2" );
 
+  // The following triggers -Wstringop-overread.  See PR 103332.
   std::u8string s2 = u8"filename3";
   p = fs::u8path(s2.begin(), s2.end());
   VERIFY( p.u8string() == u8"filename3" );
-- 
2.31.1



Re: [PATCH] PR tree-optimization/103254 - Limit depth for all GORI expressions.

2021-11-19 Thread Andrew MacLeod via Gcc-patches

On 11/19/21 13:13, Richard Biener wrote:

On November 19, 2021 4:00:01 PM GMT+01:00, Andrew MacLeod  
wrote:

On 11/19/21 04:21, Richard Biener wrote:

On Thu, Nov 18, 2021 at 8:30 PM Andrew MacLeod via Gcc-patches
 wrote:

At issue here is the dynamic approach we currently use for outgoing edge
calculations.  It isn't normally a problem, but once you get a very
large number of possible outgoing values (ie very long unrolled blocks)
with pairs of values on a statement, and individual queries for each
one, it becomes exponential if they relate to each other.

So.  We previously introduced a param for PR 100081 which limited the
depth of logical expressions GORI would pursue because we always make 2
or 4 queries for each logical expression, which accelerated the
exponential behaviour much quicker.

This patch reuses that to apply to any statement which has 2 ssa-names,
as the potential for 2 queries can happen then as well..  leading to
exponential behaviour.  Each statement we step back through with
multiple ssa-names decreases the odds of calculating a useful outgoing
range anyway.  This will remove ridiculous behaviour like this PR
demonstrates.

Next release I plan to revamp GORI to cache outgoing values, which will
eliminate all this sort of behaviour, and we can remove all such
restrictions.

Bootstrapped on x86_64-pc-linux-gnu with no regressions.  OK for trunk?

I think it's reasonable.  SCEV tries to limit the overall number of SSA -> def
visits, capturing deep vs. wide in a more uniform (and single) way.
(--param scev-max-expr-size and the duplicate - huh - --param
scev-max-expr-complexity).

For SCEV running into the limit means fatal fail of the analysis, for ranger
it only means less refinement?  So it might make a difference which path
you visit and which not (just thinking about reproducability of range analysis
when we say, swap operands of a stmt)?


Yes, Its means less refinement for "deeper" dependencies in the block..
It will not be affected by operand swapping because we will either look
at dependencies for both operands of a stmt, or neither.   The path you
visit shouldn't make any difference. Note this has nothing to do with
generating ranges for those statements, this is only for refining
outgoing ranges created by the condition at the bottom of the block.

To be precise, we will stop looking for possible range changes once our
dependency search, which always starts from the bottom of the block,
encounters 6 stmts with multiple SSA operands. Statements with a single
SSA operand will continue to build the dependency chains.

so for instance, following just the chain for first operands of this
somewhat truncated listing:
  <...>
   _137 = (short int) j_131;
   _138 = _129 & _137;
   _139 = (int) _138;
   j_140 = j_131 & _139;    << depth 6 ... we don't go look at j_131
or _139 for further dependencies
   _146 = (short int) j_140;
   _147 = _138 & _146;
   _148 = (int) _147;
   j_149 = j_140 & _148;   << depth 5
   _155 = (short int) j_149;
   _156 = _147 & _155;
   _157 = (int) _156;
   j_158 = j_149 & _157;   << depth 4
   _164 = (short int) j_158;
   _165 = _156 & _164;
   _166 = (int) _165;
   j_167 = j_158 & _166;  <<-- depth 3
   _1 = (short int) j_167;
   _3 = _1 & _165;    <<-depth 2
   _5 = (int) _3;
   j_22 = _5 & j_167; <<-- depth 1
   j_20 = j_22 + 4;
   if (j_20 == 4)
     goto ;

We'll generate dependencies, and thus able to generate outgoing ranges
for all these ssa-names still until we hit a dependency depth of 6 (the
current default), which for one chain comes to an end at

j_140 = j_131 & _139

   We will be able to generate outgoing ranges for j_131 and _139 on the
edges, but we will not try for any ssa_names used to define those.. ie,
_138 might not be able to have outgoing ranges generated for it in this
block.

working the j_20 : [4,4] range on the true edge back thru those
expressions, once we get that deep the odds of finding something of
value is remote :-)

We will also only generate these outgoing ranges when they are asked
for.  Many of these names are not used outside the block (especially
back the deeper you go), and so never get asked for anyway...  but you
could :-)

I see. I suppose you could prune the list simply by asking whether a Def has a 
single use only (that you just followed) and follow the chain but not generate 
an outgoing range for the Def. Not sure if that will make a big difference in 
compile time.

It wouldn't make any difference in compile time.   The export list is 
used to determine which names a block could generate a range for, but if 
there are no other uses, there 's unlikely to be any such direct 
request.  it will not be calculated, except in passing if a previous 
name in the dependency chain is used outside and we have to calculate 
thru it.


Andrew




Re: [PATCH 03/10] tree-object-size: Use tree instead of HOST_WIDE_INT

2021-11-19 Thread Siddhesh Poyarekar

On 11/19/21 22:36, Jakub Jelinek wrote:

On Wed, Nov 10, 2021 at 12:31:29AM +0530, Siddhesh Poyarekar wrote:

* tree-object-size.h (compute_builtin_object_size): Return tree
instead of HOST_WIDE_INT.
* builtins.c (fold_builtin_object_size): Adjust.
* gimple-fold.c (gimple_fold_builtin_strncat): Likewise.
* ubsan.c (instrument_object_size): Likewise.
* tree-object-size.c (object_sizes): Change type to vec.
(initval): New function.
(unknown): Use it.
(size_unknown_p, size_initval, size_unknown): New functions.
(object_sizes_unknown_p): Use it.
(object_sizes_get): Return tree.
(object_sizes_initialize): Rename from object_sizes_set_force
and set VAL parameter type as tree.
(object_sizes_set): Set VAL parameter type as tree and adjust
implementation.
(size_for_offset): New function.
(addr_object_size): Change PSIZE parameter to tree and adjust
implementation.
(alloc_object_size): Return tree.
(compute_builtin_object_size): Return tree in PSIZE.
(expr_object_size, call_object_size, unknown_object_size):
Adjust for object_sizes_set change.
(merge_object_sizes): Set OFFSET type to tree and adjust
implementation.
(plus_stmt_object_size, cond_expr_object_size,
collect_object_sizes_for, check_for_plus_in_loops_1): Adjust for
change of type from HOST_WIDE_INT to tree.
(object_sizes_execute): Adjust for type change to tree.



--- a/gcc/builtins.c
+++ b/gcc/builtins.c
@@ -10226,7 +10226,7 @@ maybe_emit_sprintf_chk_warning (tree exp, enum 
built_in_function fcode)
  static tree
  fold_builtin_object_size (tree ptr, tree ost)
  {
-  unsigned HOST_WIDE_INT bytes;
+  tree bytes;
int object_size_type;
  
if (!validate_arg (ptr, POINTER_TYPE)

@@ -10251,17 +10251,15 @@ fold_builtin_object_size (tree ptr, tree ost)
if (TREE_CODE (ptr) == ADDR_EXPR)
  {
compute_builtin_object_size (ptr, object_size_type, &bytes);
-  if (wi::fits_to_tree_p (bytes, size_type_node))
-   return build_int_cstu (size_type_node, bytes);
+  return fold_convert (size_type_node, bytes);
  }
else if (TREE_CODE (ptr) == SSA_NAME)
  {
/* If object size is not known yet, delay folding until
 later.  Maybe subsequent passes will help determining
 it.  */
-  if (compute_builtin_object_size (ptr, object_size_type, &bytes)
- && wi::fits_to_tree_p (bytes, size_type_node))
-   return build_int_cstu (size_type_node, bytes);
+  if (compute_builtin_object_size (ptr, object_size_type, &bytes))
+   return fold_convert (size_type_node, bytes);


Neither of these are equivalent to what it used to do before.
If some target has e.g. pointers wider than size_t, then previously we could
compute bytes that doesn't fit into size_t and would return NULL which
eventually would result in object_sizes_execute or expand_builtin_object_size
resolving it to the unknown value.  But fold_convert will perform modulo
arithmetics on it, so say if size_t is 24-bit and bytes is 0x101, it
will fold to (size_t) 1.  I think we should still punt if it doesn't fit.
Or do we ensure that what it returns always has sizetype type?


compute_builtin_object_size should always return sizetype.  I probably 
haven't strictly ensured that but I could do that.  Would it be 
necessary to check for fit in sizetype -> size_type_node conversion too? 
 I've been assuming that they'd have the same precision.





@@ -2486,13 +2486,14 @@ gimple_fold_builtin_strncat (gimple_stmt_iterator *gsi)
if (cmpsrc < 0)
  return false;
  
-  unsigned HOST_WIDE_INT dstsize;

+  tree dstsize;
  
bool nowarn = warning_suppressed_p (stmt, OPT_Wstringop_overflow_);
  
-  if (!nowarn && compute_builtin_object_size (dst, 1, &dstsize))

+  if (!nowarn && compute_builtin_object_size (dst, 1, &dstsize)
+  && tree_fits_uhwi_p (dstsize))
  {
-  int cmpdst = compare_tree_int (len, dstsize);
+  int cmpdst = compare_tree_int (len, tree_to_uhwi (dstsize));


Why jump from tree to UHWI and back?
Check that TREE_CODE (dstsize) == INTEGER_CST and do
tree_int_cst_compare instead?

  
if (cmpdst >= 0)

{
@@ -2509,7 +2510,7 @@ gimple_fold_builtin_strncat (gimple_stmt_iterator *gsi)
"destination size")
   : G_("%qD specified bound %E exceeds "
"destination size %wu"),
-  fndecl, len, dstsize);
+  fndecl, len, tree_to_uhwi (dstsize));


Similarly, use %E and pass dstsize to it.



I noticed it while doing the gimple fold patches; I'll fix it.  I need 
to rebase this bit anyway.



+/* Initial value of object sizes; zero for maximum and SIZE_MAX for minimum
+   object size.  */
+
+static inline unsigned HOST_WIDE_INT
+initval (int o

Re: [wwwdocs PATCH]: Add a caveat that -Ofast implies -fno-semantic-interposition

2021-11-19 Thread Gerald Pfeifer
On Fri, 19 Nov 2021, Martin Jambor wrote:
> can I add the following caveat to the gcc-12/changes.html file?

Of course you can. :-)

Actually, we should, and I'm glad you thought of it.

Thank you,
Gerald


Re: [PATCH 03/10] tree-object-size: Use tree instead of HOST_WIDE_INT

2021-11-19 Thread Jakub Jelinek via Gcc-patches
On Sat, Nov 20, 2021 at 12:31:19AM +0530, Siddhesh Poyarekar wrote:
> > Neither of these are equivalent to what it used to do before.
> > If some target has e.g. pointers wider than size_t, then previously we could
> > compute bytes that doesn't fit into size_t and would return NULL which
> > eventually would result in object_sizes_execute or 
> > expand_builtin_object_size
> > resolving it to the unknown value.  But fold_convert will perform modulo
> > arithmetics on it, so say if size_t is 24-bit and bytes is 0x101, it
> > will fold to (size_t) 1.  I think we should still punt if it doesn't fit.
> > Or do we ensure that what it returns always has sizetype type?
> 
> compute_builtin_object_size should always return sizetype.  I probably
> haven't strictly ensured that but I could do that.  Would it be necessary to
> check for fit in sizetype -> size_type_node conversion too?  I've been
> assuming that they'd have the same precision.

sizetype and size_type_node should be indeed always the same precision.

Jakub



Re: [PATCH] libphobos, testsuite: Add prune clauses for two Darwin cases.

2021-11-19 Thread Iain Buclaw via Gcc-patches
Excerpts from Iain Sandoe's message of November 19, 2021 4:59 pm:
> Depending on the permutation of CPU, OS version and shared/non-
> shared library inclusion, we get can get two warnings from the
> external tools (ld64, dsymutil) which are not actually GCC issues
> but relate to the external tools.  These are already pruned in
> the main testsuite, this adds them to the library.
> 
> tested on x86_64,i686-darwin17 where the problem shows up.
> OK for master / backports?
> thanks
> Iain
> 

OK from me.
Iain.


> Signed-off-by: Iain Sandoe 
> 
> libphobos/ChangeLog:
> 
>   * testsuite/lib/libphobos.exp: Prune warnings from external
>   tool bugs.
> ---
>  libphobos/testsuite/lib/libphobos.exp | 7 +++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/libphobos/testsuite/lib/libphobos.exp 
> b/libphobos/testsuite/lib/libphobos.exp
> index 3be2092b12e..2af430a0e45 100644
> --- a/libphobos/testsuite/lib/libphobos.exp
> +++ b/libphobos/testsuite/lib/libphobos.exp
> @@ -90,6 +90,13 @@ proc libphobos-dg-test { prog do_what extra_tool_flags } {
>  }
>  
>  proc libphobos-dg-prune { system text } {
> +
> +# Ignore harmless warnings from Xcode.
> +regsub -all "(^|\n)\[^\n\]*ld: warning: could not create compact unwind 
> for\[^\n\]*" $text "" text
> +
> +# Ignore dsymutil warning (tool bug is actually linker)
> +regsub -all "(^|\n)\[^\n\]*could not find object file symbol for 
> symbol\[^\n\]*" $text "" text
> +
>  return $text
>  }
>  
> -- 
> 2.24.3 (Apple Git-128)
> 
> 


[PATCH] c++: redundant explicit 'this' capture in C++17 [PR100493]

2021-11-19 Thread Patrick Palka via Gcc-patches
As described in detail in the PR, in C++20 implicitly capturing 'this'
via the '=' capture default is deprecated, but in C++17 explicitly
capturing 'this' alongside a '=' capture default is ill-formed.  This
means it's impossible to write a C++17 lambda that captures 'this' and
that also has a '=' capture default in a forward-compatible way with
C++20:

  [=] { this; }  // #1 deprecated in C++20, OK in C++17
 // GCC issues a -Wdeprecated warning in C++20 mode
  [=, this] { }  // #2 ill-formed in C++17, OK in C++20
 // GCC issues an unconditional warning in C++17 mode

This patch resolves this dilemma by downgrading the warning for #2 into
a -pedantic one.  In passing, move it into the -Wc++20-extensions class
of warnings and mention that the construct in question is a C++20 one.

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
trunk/11?

PR c++/100493

gcc/cp/ChangeLog:

* parser.c (cp_parser_lambda_introducer): In C++17, don't
diagnose a redundant 'this' capture alongside a by-copy
capture default unless -pedantic.  Move the diagnostic into
-Wc++20-extensions and improve the wording.

gcc/testsuite/ChangeLog:

* g++.dg/cpp1z/lambda-this1.C: Adjust expected diagnostics.
* g++.dg/cpp1z/lambda-this8.C: New test.
* g++.dg/cpp2a/lambda-this3.C: Compile with -pedantic in C++17
to continue to diagnose redundant 'this' captures.
---
 gcc/cp/parser.c   |  8 +---
 gcc/testsuite/g++.dg/cpp1z/lambda-this1.C |  8 
 gcc/testsuite/g++.dg/cpp1z/lambda-this8.C | 10 ++
 gcc/testsuite/g++.dg/cpp2a/lambda-this3.C |  2 +-
 4 files changed, 20 insertions(+), 8 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp1z/lambda-this8.C

diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index 65f0f112011..30790006ac9 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -11120,10 +11120,12 @@ cp_parser_lambda_introducer (cp_parser* parser, tree 
lambda_expr)
   if (cp_lexer_next_token_is_keyword (parser->lexer, RID_THIS))
{
  location_t loc = cp_lexer_peek_token (parser->lexer)->location;
- if (cxx_dialect < cxx20
+ if (cxx_dialect < cxx20 && pedantic
  && LAMBDA_EXPR_DEFAULT_CAPTURE_MODE (lambda_expr) == CPLD_COPY)
-   pedwarn (loc, 0, "explicit by-copy capture of % redundant "
-"with by-copy capture default");
+   pedwarn (loc, OPT_Wc__20_extensions,
+"explicit by-copy capture of % "
+"with by-copy capture default only available with "
+"%<-std=c++20%> or %<-std=gnu++20%>");
  cp_lexer_consume_token (parser->lexer);
  if (LAMBDA_EXPR_THIS_CAPTURE (lambda_expr))
pedwarn (input_location, 0,
diff --git a/gcc/testsuite/g++.dg/cpp1z/lambda-this1.C 
b/gcc/testsuite/g++.dg/cpp1z/lambda-this1.C
index b13ff8b9fc6..e12330a8291 100644
--- a/gcc/testsuite/g++.dg/cpp1z/lambda-this1.C
+++ b/gcc/testsuite/g++.dg/cpp1z/lambda-this1.C
@@ -18,7 +18,7 @@ struct A {
 auto i = [=] { return a; };// { dg-warning "implicit 
capture" "" { target c++2a } }
 auto j = [&] { return a; };
 // P0409R2 - C++2A lambda capture [=, this]
-auto k = [=, this] { return a; };// { dg-error "explicit by-copy capture 
of 'this' redundant with by-copy capture default" "" { target c++17_down } }
+auto k = [=, this] { return a; };// { dg-error "explicit by-copy capture 
of 'this' with by-copy capture default only available with" "" { target 
c++17_down } }
 auto l = [&, this] { return a; };
 auto m = [=, *this] { return a; };// { dg-error "'*this' capture only 
available with" "" { target c++14_down } }
 auto n = [&, *this] { return a; };// { dg-error "'*this' capture only 
available with" "" { target c++14_down } }
@@ -27,12 +27,12 @@ struct A {
// { dg-error "'*this' capture only 
available with" "" { target c++14_down } .-1 }
 auto q = [=, this, *this] { return a; };// { dg-error "already captured 
'this'" }
// { dg-error "'*this' capture only 
available with" "" { target c++14_down } .-1 }
-   // { dg-error "explicit by-copy 
capture of 'this' redundant with by-copy capture default" "" { target 
c++17_down } .-2 }
+   // { dg-error "explicit by-copy 
capture of 'this' with by-copy capture default only available with" "" { target 
c++17_down } .-2 }
 auto r = [=, this, this] { return a; };// { dg-error "already captured 
'this'" }
-  // { dg-error "explicit by-copy 
capture of 'this' redundant with by-copy capture default" "" { target 
c++17_down } .-1 }
+  // { dg-error "explicit by-copy 
capture of 'this' with by-copy cap

[PATCH] PR fortran/87851 - [9/10/11/12 Regression] Wrong return type for len_trim

2021-11-19 Thread Harald Anlauf via Gcc-patches
Dear Fortranners,

scalariziation of the elemental intrinsic LEN_TRIM was ICEing
when the optional KIND argument was present.

The cleanest solution is to use the infrastructure added by
Mikael's fix for PR97896.  In that case it is a 1-liner.  :-)

That fix is available on mainline and on 11-branch only, though.
My suggestion is to fix the current PR only for the same branches,
leaving the regression unfixed for older ones.

Regtested on x86_64-pc-linux-gnu.  OK for mainline and 11-branch?

Thanks,
Harald

From f700c43fef4a239af25dd30dc22930b1bb1dbe95 Mon Sep 17 00:00:00 2001
From: Harald Anlauf 
Date: Fri, 19 Nov 2021 20:20:44 +0100
Subject: [PATCH] Fortran: fix scalarization for intrinsic LEN_TRIM with
 present KIND argument

gcc/fortran/ChangeLog:

	PR fortran/87851
	* trans-array.c (arg_evaluated_for_scalarization): Add LEN_TRIM to
	list of intrinsics for which an optional KIND argument needs to be
	removed before scalarization.

gcc/testsuite/ChangeLog:

	PR fortran/87851
	* gfortran.dg/len_trim.f90: New test.
---
 gcc/fortran/trans-array.c  |  1 +
 gcc/testsuite/gfortran.dg/len_trim.f90 | 26 ++
 2 files changed, 27 insertions(+)
 create mode 100644 gcc/testsuite/gfortran.dg/len_trim.f90

diff --git a/gcc/fortran/trans-array.c b/gcc/fortran/trans-array.c
index 2090adf01e7..238b1b72385 100644
--- a/gcc/fortran/trans-array.c
+++ b/gcc/fortran/trans-array.c
@@ -11499,6 +11499,7 @@ arg_evaluated_for_scalarization (gfc_intrinsic_sym *function,
   switch (function->id)
 	{
 	  case GFC_ISYM_INDEX:
+	  case GFC_ISYM_LEN_TRIM:
 	if (strcmp ("kind", gfc_dummy_arg_get_name (*dummy_arg)) == 0)
 	  return false;
 	  /* Fallthrough.  */
diff --git a/gcc/testsuite/gfortran.dg/len_trim.f90 b/gcc/testsuite/gfortran.dg/len_trim.f90
new file mode 100644
index 000..63f803960d5
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/len_trim.f90
@@ -0,0 +1,26 @@
+! { dg-do compile }
+! { dg-options "-O -Wall -Wconversion-extra -fdump-tree-original" }
+! { dg-final { scan-tree-dump-not "_gfortran_stop_numeric" "original" } }
+! PR fortran/87851 - return type for len_trim
+
+program main
+  implicit none
+  character(3), parameter :: a(1) = 'aa'
+  character(3):: b= "bb"
+  character(3):: c(1) = 'cc'
+  integer(4), parameter   :: l4(1) = len_trim (a, kind=4)
+  integer(8), parameter   :: l8(1) = len_trim (a, kind=8)
+  integer :: kk(1) = len_trim (a)
+  integer(4)  :: mm(1) = len_trim (a, kind=4)
+  integer(8)  :: nn(1) = len_trim (a, kind=8)
+  kk = len_trim (a)
+  mm = len_trim (a, kind=4)
+  nn = len_trim (a, kind=8)
+  kk = len_trim ([b])
+  mm = len_trim ([b],kind=4)
+  nn = len_trim ([b],kind=8)
+  kk = len_trim (c)
+  mm = len_trim (c, kind=4)
+  nn = len_trim (c, kind=8)
+  if (any (l4 /= 2_4) .or. any (l8 /= 2_8)) stop 1
+end program main
--
2.26.2



[pushed] c++: Fix cpp0x/lambda/lambda-nested9.C with C++11

2021-11-19 Thread Marek Polacek via Gcc-patches
Unfortunately dejagnu doesn't honor #if/#endif, so this test was failing
with -std=c++11:

FAIL: g++.dg/cpp0x/lambda/lambda-nested9.C  -std=c++11  (test for errors, line 
37)

Fixed thus.

Tested x86_64-pc-linux-gnu, applying to trunk.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/lambda/lambda-nested9.C: Adjust dg-error.
---
 gcc/testsuite/g++.dg/cpp0x/lambda/lambda-nested9.C | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-nested9.C 
b/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-nested9.C
index ff7da3b0ce1..8f67f225b13 100644
--- a/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-nested9.C
+++ b/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-nested9.C
@@ -34,7 +34,7 @@ int main() {
 #if __cpp_init_captures
   [j=0] () {
 [&] () mutable {
-  ++j; // { dg-error "read-only" }
+  ++j; // { dg-error "read-only" "" { target c++14 } }
 };
   };
 #endif

base-commit: d4943ce939d9654932624b9ece24c3a474ae4157
-- 
2.33.1



[committed] libstdc++: One more change for Clang to support constexpr std::string [PR103295]

2021-11-19 Thread Jonathan Wakely via Gcc-patches
Tested powerpc64le-linux (and x86_64-linux clang), pushed to trunk.


All writes into the allocated buffer need to be via traits_type::assign
to begin lifetimes.

libstdc++-v3/ChangeLog:

PR libstdc++/103295
* include/bits/basic_string.tcc (_M_construct): Use the
traits assign member to write into allcoated memory.
---
 libstdc++-v3/include/bits/basic_string.tcc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/libstdc++-v3/include/bits/basic_string.tcc 
b/libstdc++-v3/include/bits/basic_string.tcc
index 374406c0e13..6f619a08f70 100644
--- a/libstdc++-v3/include/bits/basic_string.tcc
+++ b/libstdc++-v3/include/bits/basic_string.tcc
@@ -201,7 +201,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
_M_data(__another);
_M_capacity(__capacity);
  }
-   _M_data()[__len++] = *__beg;
+   traits_type::assign(_M_data()[__len++], *__beg);
++__beg;
  }
 
-- 
2.31.1



  1   2   >