[PATCH] bootstrap: Fix building with GCC 4.2 [PR89494]

2020-04-16 Thread Jakub Jelinek via Gcc-patches
Hi!

GCC 4.2 (but I think not the latest tip of GCC 4.2 branch) has broken value
initialization, see PR33916.  The following patch provides a workaround for
that.  Tested with GCC 4.2 on a reduced testcase I've distilled from the
assign_parm_data_one class, which was miscompiled the same way,
and normally bootstrapped/regtested on x86_64-linux and i686-linux with
a recentish system GCC.  Ok for trunk?

2020-04-16  Jakub Jelinek  

PR bootstrap/89494
* function.c (assign_parm_find_data_types): Add workaround for
BROKEN_VALUE_INITIALIZATION compilers.

--- gcc/function.c.jj   2020-01-12 11:54:36.606410497 +0100
+++ gcc/function.c  2020-04-15 14:15:29.269495427 +0200
@@ -2414,7 +2414,15 @@ assign_parm_find_data_types (struct assi
 {
   int unsignedp;
 
+#ifndef BROKEN_VALUE_INITIALIZATION
   *data = assign_parm_data_one ();
+#else
+  /* Old versions of GCC used to miscompile the above by only initializing
+ the members with explicit constructors and copying garbage
+ to the other members.  */
+  assign_parm_data_one zero_data = {};
+  *data = zero_data;
+#endif
 
   /* NAMED_ARG is a misnomer.  We really mean 'non-variadic'. */
   if (!cfun->stdarg)

Jakub
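
For reference, the two initialization forms the patch switches between can be compared in isolation. This is only a sketch, not the GCC code: parm_data and locate_info are hypothetical stand-ins for assign_parm_data_one and one of its members, chosen so that the struct has no constructor of its own but contains a member that does. On a conforming compiler both forms zero the plain members; per the description above, affected GCC 4.2 releases left them as garbage with the first form.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical member type with an explicit constructor.
struct locate_info
{
  int offset;
  locate_info () : offset (0) {}
};

// Hypothetical stand-in for assign_parm_data_one: it has no constructor of
// its own, so value initialization must zero the plain members as well.
struct parm_data
{
  void *passed_type;
  locate_info locate;
  int named_arg;
};

// Form used when value initialization works: assign a value-initialized
// temporary.
void
reset_by_value_init (parm_data *data)
{
  assert (data != NULL);
  *data = parm_data ();
}

// Workaround form from the patch: empty-brace aggregate initialization of
// a named object, followed by a plain copy.
void
reset_by_empty_braces (parm_data *data)
{
  assert (data != NULL);
  parm_data zero_data = {};
  *data = zero_data;
}
```

Both functions are expected to leave every member zeroed; the second one simply avoids the construct that the old compiler mishandled.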



Re: [PATCH] intl: Allow building both with old bison and bison >= 3 [PR92008]

2020-04-16 Thread Richard Biener
On Thu, 16 Apr 2020, Jakub Jelinek wrote:

> Hi!
> 
> bison 3 apparently made a backwards incompatible change, dropped
> YYLEX_PARAM/YYPARSE_PARAM support and instead needs %param or %lex-param
> and %parse-param.  Furthermore, there is no easy way to conditionalize
> on bison version in the *.y files.
> While e.g. glibc bumped bison requirement and just has the bison 3
> compatible version, Richi said there are still systems with older bison
> where we want to build gcc.
> 
> So, this patch instead determines during configure bison version, and
> depending on that when building plural.c (if building it at all) tweaks
> what is passed over to bison if needed.
> 
> Tested with both bison 3 and bison 1.35, in each case with reconfiguring
> intl and building with make all-yes (as in my setup intl isn't normally
> used), plus normally bootstrapped/regtested on x86_64-linux and i686-linux,
> ok for trunk?

OK.

Thanks,
Richard.
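
The version test that the quoted configure.ac hunk performs with sed and a shell case pattern can be sketched outside of autoconf. The helper below is hypothetical (reports_bison3 is not part of the patch) and uses std::regex in place of sed; it only illustrates which version strings end up on the "bison3" side of the check.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical helper: returns true when the text printed by
// `bison --version` names a major version in the range 3..9, the same
// range the `[3-9].*` shell pattern in the configure check accepts.
// Like that pattern, a two-digit major version would not match.
bool
reports_bison3 (const std::string &version_output)
{
  // Same shape as the sed expression: "GNU Bison", anything, a space,
  // then a dotted version number.
  static const std::regex version_re ("GNU Bison.* ([0-9]*\\.[0-9.]*)");
  std::smatch match;
  if (!std::regex_search (version_output, match, version_re))
    return false;

  const std::string version = match[1].str ();
  assert (!version.empty ());  // the pattern always captures at least "."
  return version.size () >= 2
         && version[0] >= '3' && version[0] <= '9'
         && version[1] == '.';
}
```

Output that does not mention GNU Bison at all is treated the same as an old bison, which matches the default of BISON3_YES='#' in the hunk.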

> 2020-04-16  Jakub Jelinek  
> 
>   PR bootstrap/92008
>   * configure.ac: Add check for bison >= 3, AC_DEFINE HAVE_BISON3
>   and AC_SUBST BISON3_YES and BISON3_NO.
>   * Makefile.in (.y.c): Prefix $(YACC) invocation with @BISON3_NO@,
>   add @BISON3_YES@ prefixed rule to adjust the *.y source using sed
>   and adjust output afterwards.
>   * plural-exp.h (PLURAL_PARSE): If HAVE_BISON3 is defined, use
>   struct parse_args * type for arg instead of void *.
>   * plural.y: Add magic /* BISON3 ... */ comments with bison >= 3
>   directives.
>   (YYLEX_PARAM, YYPARSE_PARAM): Don't define if HAVE_BISON3 is defined.
>   (yylex, yyerror): Adjust prototypes and definitions if HAVE_BISON3
>   is defined.
>   * plural.c: Regenerated.
>   * config.h.in: Regenerated.
>   * configure: Regenerated.
> 
> --- intl/configure.ac.jj  2020-01-12 11:54:38.544381258 +0100
> +++ intl/configure.ac 2020-04-15 13:03:05.914359936 +0200
> @@ -47,5 +47,28 @@ case $USE_INCLUDED_LIBINTL in
>  ;;
>  esac
>  
> +BISON3_YES='#'
> +BISON3_NO=
> +if test "$INTLBISON" != :; then
> +  ac_bison3=no
> +  AC_MSG_CHECKING([bison 3 or later])
> +changequote(<<,>>)dnl
> +  ac_prog_version=`$INTLBISON --version 2>&1 | sed -n 's/^.*GNU Bison.* \([0-9]*\.[0-9.]*\).*$/\1/p'`
> +  case $ac_prog_version in
> +[3-9].*)
> +changequote([,])dnl
> +  ac_prog_version="$ac_prog_version, bison3"; ac_bison3=yes;;
> +*) ac_prog_version="$ac_prog_version, old";;
> +  esac
> +  AC_MSG_RESULT([$ac_prog_version])
> +  if test $ac_bison3 = yes; then
> +AC_DEFINE(HAVE_BISON3, 1, [Define if bison 3 or later is used.])
> +BISON3_YES=
> +BISON3_NO='#'
> +  fi
> +fi
> +AC_SUBST(BISON3_YES)
> +AC_SUBST(BISON3_NO)
> +
>  AC_CONFIG_FILES(Makefile config.intl)
>  AC_OUTPUT
> --- intl/Makefile.in.jj   2020-01-12 11:54:38.542381288 +0100
> +++ intl/Makefile.in  2020-04-15 13:16:49.420022007 +0200
> @@ -133,7 +133,11 @@ libintl.h: $(srcdir)/libgnuintl.h
>   $(COMPILE) $<
>  
>  .y.c:
> - $(YACC) $(YFLAGS) --output $@ $<
> +@BISON3_YES@ sed 's,%pure_parser,,;s,^/\* BISON3 \(.*\) \*/$$,\1,' $< > $@.y
> +@BISON3_YES@ $(YACC) $(YFLAGS) --output $@.c $@.y
> +@BISON3_YES@ sed 's/\.c\.y"/.y"/' $@.c > $@
> +@BISON3_YES@ rm -f $@.c $@.y $@.h
> +@BISON3_NO@  $(YACC) $(YFLAGS) --output $@ $<
>   rm -f $*.h
>  
>  INCLUDES = -I. -I$(srcdir)
> --- intl/plural-exp.h.jj  2020-01-11 16:31:56.320274233 +0100
> +++ intl/plural-exp.h 2020-04-15 13:22:26.162972185 +0200
> @@ -1,5 +1,5 @@
>  /* Expression parsing and evaluation for plural form selection.
> -   Copyright (C) 2000, 2001, 2002 Free Software Foundation, Inc.
> +   Copyright (C) 2000-2020 Free Software Foundation, Inc.
> Written by Ulrich Drepper , 2000.
>  
> This program is free software; you can redistribute it and/or modify it
> @@ -111,7 +111,11 @@ struct parse_args
>  
>  extern void FREE_EXPRESSION PARAMS ((struct expression *exp))
>   internal_function;
> +#ifdef HAVE_BISON3
> +extern int PLURAL_PARSE PARAMS ((struct parse_args *arg));
> +#else
>  extern int PLURAL_PARSE PARAMS ((void *arg));
> +#endif
>  extern struct expression GERMANIC_PLURAL attribute_hidden;
>  extern void EXTRACT_PLURAL_EXPRESSION PARAMS ((const char *nullentry,
>  struct expression **pluralp,
> --- intl/plural.y.jj  2020-04-15 10:58:43.890398648 +0200
> +++ intl/plural.y 2020-04-15 13:11:13.819054699 +0200
> @@ -1,6 +1,6 @@
>  %{
>  /* Expression parsing for plural form selection.
> -   Copyright (C) 2000, 2001 Free Software Foundation, Inc.
> +   Copyright (C) 2000-2020 Free Software Foundation, Inc.
> Written by Ulrich Drepper , 2000.
>  
> This program is free software; you can redistribute it and/or modify it
> @@ -40,10 +40,15 @@
>  # define __gettextparse PLURAL_PARSE
>  #endif
>  
> +#ifndef HAVE_BISON3
>  #define YYLEX_PARAM  &((struct parse_args *) arg)->cp
>  #define YYPARSE_PARAM  arg
> +#endif
>  %}
>  %pure_parser
> +/* BISON3 %parse-

Re: [PATCH] bootstrap: Fix building with GCC 4.2 [PR89494]

2020-04-16 Thread Richard Biener
On Thu, 16 Apr 2020, Jakub Jelinek wrote:

> Hi!
> 
> GCC 4.2 (but I think not the latest tip of GCC 4.2 branch) has broken value
> initialization, see PR33916.  The following patch provides a workaround for
> that.  Tested with GCC 4.2 on a reduced testcase I've distilled from the
> assign_parm_data_one class, which was miscompiled the same way,
> and normally bootstrapped/regtested on x86_64-linux and i686-linux with
> a recentish system GCC.  Ok for trunk?

OK.

Richard.

> 2020-04-16  Jakub Jelinek  
> 
>   PR bootstrap/89494
>   * function.c (assign_parm_find_data_types): Add workaround for
>   BROKEN_VALUE_INITIALIZATION compilers.
> 
> --- gcc/function.c.jj 2020-01-12 11:54:36.606410497 +0100
> +++ gcc/function.c  2020-04-15 14:15:29.269495427 +0200
> @@ -2414,7 +2414,15 @@ assign_parm_find_data_types (struct assi
>  {
>    int unsignedp;
>  
> +#ifndef BROKEN_VALUE_INITIALIZATION
>    *data = assign_parm_data_one ();
> +#else
> +  /* Old versions of GCC used to miscompile the above by only initializing
> + the members with explicit constructors and copying garbage
> + to the other members.  */
> +  assign_parm_data_one zero_data = {};
> +  *data = zero_data;
> +#endif
>  
>    /* NAMED_ARG is a misnomer.  We really mean 'non-variadic'. */
>    if (!cfun->stdarg)
> 
>   Jakub
> 
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)


[committed] libstdc++: Fix -Wunused-parameter warning in test

2020-04-16 Thread Jonathan Wakely via Gcc-patches
* testsuite/20_util/unsynchronized_pool_resource/allocate.cc: Remove
name of unused parameter.

Not an important change, just a tiny clean-up.

Tested x86_64-linux and committed to master.

commit c8d88bf26e4c4c0eeddbf6a9dc184f28d4ef85e4
Author: Jonathan Wakely 
Date:   Thu Apr 16 08:44:10 2020 +0100

libstdc++: Fix -Wunused-parameter warning in test

* testsuite/20_util/unsynchronized_pool_resource/allocate.cc: Remove
name of unused parameter.

diff --git a/libstdc++-v3/testsuite/20_util/unsynchronized_pool_resource/allocate.cc b/libstdc++-v3/testsuite/20_util/unsynchronized_pool_resource/allocate.cc
index b3bb955db7c..5bf20cf262c 100644
--- a/libstdc++-v3/testsuite/20_util/unsynchronized_pool_resource/allocate.cc
+++ b/libstdc++-v3/testsuite/20_util/unsynchronized_pool_resource/allocate.cc
@@ -197,7 +197,7 @@ test06()
 void do_deallocate(void* p, std::size_t bytes, std::size_t align)
 { std::pmr::new_delete_resource()->deallocate(p, bytes, align); }
 
-bool do_is_equal(const memory_resource& r) const noexcept
+bool do_is_equal(const memory_resource&) const noexcept
 { return false; }
   };
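
The same clean-up can be shown in a self-contained form. The classes below are hypothetical and only mimic the shape of the test's resource type; they do not use std::pmr. The point is that an override which ignores its argument can leave the parameter unnamed, which keeps -Wunused-parameter quiet without changing the signature.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical minimal interface standing in for a memory resource.
class resource
{
public:
  virtual ~resource () {}
  bool is_equal (const resource &other) const noexcept
  { return do_is_equal (other); }

private:
  virtual bool do_is_equal (const resource &other) const noexcept = 0;
};

// The override ignores its argument, so the parameter is left unnamed;
// giving it a name would trigger -Wunused-parameter when that warning
// is enabled.
class never_equal_resource : public resource
{
  bool do_is_equal (const resource &) const noexcept override
  { return false; }
};

// Small wrapper so the comparison can be exercised through pointers.
bool
resources_equal (const resource *a, const resource *b)
{
  assert (a != NULL && b != NULL);
  return a->is_equal (*b);
}
```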
 


[stage1] [PATCH] Merge dg-options and dg-additional-options if len <= 120 chars.

2020-04-16 Thread Martin Liška

On 4/14/20 1:43 PM, Jakub Jelinek wrote:

Roughly, yes.  A few extra in testcases don't hurt necessarily, but say 160
chars or more is clearly too much.


All right, I made a limit of 120 characters for the changes.

Patch can bootstrap on x86_64-linux-gnu and survives regression tests.

Ready to be installed in next stage1?
Thanks,
Martin
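
The merge rule can be sketched as a small helper. This is an illustration only, not the script used to produce the patch: merge_dg_options is a hypothetical name, the 120-character limit is assumed to apply to the whole resulting directive line, and directives that carry a target selector are not handled.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: builds a single dg-options directive from the
// option strings of a dg-options and a dg-additional-options directive.
// Returns false and leaves *merged untouched when the resulting line
// would be longer than `limit` characters.
bool
merge_dg_options (const std::string &options, const std::string &additional,
                  std::string *merged, std::size_t limit = 120)
{
  assert (merged != NULL);
  const std::string line
    = "/* { dg-options \"" + options + " " + additional + "\" } */";
  if (line.size () > limit)
    return false;
  *merged = line;
  return true;
}
```

A caller would keep the two original directives whenever the helper returns false.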
From 78ccbef9cce09bfcf5801b7928f3dcb8a56d14e7 Mon Sep 17 00:00:00 2001
From: Martin Liska 
Date: Wed, 15 Apr 2020 08:49:45 +0200
Subject: [PATCH] Merge dg-options and dg-additional-options if len <= 120
 chars.

gcc/testsuite/ChangeLog:

2020-04-15  Martin Liska  

	* g++.dg/concepts/diagnostic1.C: Merge dg-options and
	dg-additional-options if len <= 120 chars.
	* g++.dg/cpp1y/new1.C: Likewise.
	* g++.dg/cpp1y/new2.C: Likewise.
	* g++.dg/debug/dwarf2/pr61433.C: Likewise.
	* g++.dg/init/new18.C: Likewise.
	* g++.dg/ipa/devirt-19.C: Likewise.
	* g++.dg/ipa/devirt-52.C: Likewise.
	* g++.dg/ipa/pr44372.C: Likewise.
	* g++.dg/ipa/pr58371.C: Likewise.
	* g++.dg/ipa/pr63587-2.C: Likewise.
	* g++.dg/ipa/pr78211.C: Likewise.
	* g++.dg/opt/dump1.C: Likewise.
	* g++.dg/opt/pr44919.C: Likewise.
	* g++.dg/opt/pr47615.C: Likewise.
	* g++.dg/opt/pr82159-2.C: Likewise.
	* g++.dg/other/pr52048.C: Likewise.
	* g++.dg/pr57662.C: Likewise.
	* g++.dg/pr59510.C: Likewise.
	* g++.dg/pr67989.C: Likewise.
	* g++.dg/pr81194.C: Likewise.
	* g++.dg/pr94314-2.C: Likewise.
	* g++.dg/pr94314-3.C: Likewise.
	* g++.dg/pr94314.C: Likewise.
	* g++.dg/template/canon-type-8.C: Likewise.
	* g++.dg/template/crash107.C: Likewise.
	* g++.dg/template/show-template-tree-3.C: Likewise.
	* g++.dg/tm/cgraph_edge.C: Likewise.
	* g++.dg/torture/20141013.C: Likewise.
	* g++.dg/torture/pr34641.C: Likewise.
	* g++.dg/torture/pr34850.C: Likewise.
	* g++.dg/torture/pr36745.C: Likewise.
	* g++.dg/torture/pr40991.C: Likewise.
	* g++.dg/torture/pr48271.C: Likewise.
	* g++.dg/torture/pr53602.C: Likewise.
	* g++.dg/torture/pr53752.C: Likewise.
	* g++.dg/torture/pr54838.C: Likewise.
	* g++.dg/torture/pr58252.C: Likewise.
	* g++.dg/tree-ssa/pr22444.C: Likewise.
	* g++.dg/tree-ssa/pr24351-3.C: Likewise.
	* g++.dg/tree-ssa/pr27283.C: Likewise.
	* g++.dg/tree-ssa/pr27291.C: Likewise.
	* g++.dg/tree-ssa/pr27548.C: Likewise.
	* g++.dg/tree-ssa/pr42337.C: Likewise.
	* g++.dg/ubsan/pr65583.C: Likewise.
	* g++.old-deja/g++.robertl/eb27.C: Likewise.
	* gcc.dg/tree-ssa/dse-points-to.c: Likewise.
	* gcc.target/aarch64/advsimd-intrinsics/bf16_dup.c: Likewise.
	* gcc.target/arm/armv8_2-fp16-move-1.c: Likewise.
	* gcc.target/arm/pr67989.C: Likewise.
	* gcc.target/arm/simd/vmmla_1.c: Likewise.
	* gcc.target/i386/vect-pr67800.c: Likewise.
	* gcc.target/mips/cfgcleanup-jalr2.c: Likewise.
	* gcc.target/mips/cfgcleanup-jalr3.c: Likewise.
---
 gcc/testsuite/g++.dg/concepts/diagnostic1.C| 3 +--
 gcc/testsuite/g++.dg/cpp1y/new1.C  | 3 +--
 gcc/testsuite/g++.dg/cpp1y/new2.C  | 3 +--
 gcc/testsuite/g++.dg/debug/dwarf2/pr61433.C| 3 +--
 gcc/testsuite/g++.dg/init/new18.C  | 3 +--
 gcc/testsuite/g++.dg/ipa/devirt-19.C   | 3 +--
 gcc/testsuite/g++.dg/ipa/devirt-52.C   | 3 +--
 gcc/testsuite/g++.dg/ipa/pr44372.C | 3 +--
 gcc/testsuite/g++.dg/ipa/pr58371.C | 3 +--
 gcc/testsuite/g++.dg/ipa/pr63587-2.C   | 3 +--
 gcc/testsuite/g++.dg/ipa/pr78211.C | 3 +--
 gcc/testsuite/g++.dg/opt/dump1.C   | 3 +--
 gcc/testsuite/g++.dg/opt/pr44919.C | 3 +--
 gcc/testsuite/g++.dg/opt/pr47615.C | 3 +--
 gcc/testsuite/g++.dg/opt/pr82159-2.C   | 3 +--
 gcc/testsuite/g++.dg/other/pr52048.C   | 3 +--
 gcc/testsuite/g++.dg/pr57662.C | 3 +--
 gcc/testsuite/g++.dg/pr59510.C | 3 +--
 gcc/testsuite/g++.dg/pr67989.C | 3 +--
 gcc/testsuite/g++.dg/pr81194.C | 3 +--
 gcc/testsuite/g++.dg/pr94314-2.C   | 3 +--
 gcc/testsuite/g++.dg/pr94314-3.C   | 3 +--
 gcc/testsuite/g++.dg/pr94314.C | 3 +--
 gcc/testsuite/g++.dg/template/canon-type-8.C   | 3 +--
 gcc/testsuite/g++.dg/template/crash107.C   | 3 +--
 gcc/testsuite/g++.dg/template/show-template-tree-3.C   | 3 +--
 gcc/testsuite/g++.dg/tm/cgraph_edge.C  | 3 +--
 gcc/testsuite/g++.dg/torture/20141013.C| 3 +--
 gcc/testsuite/g++.dg/torture/pr34641.C | 3 +--
 gcc/testsuite/g++.dg/torture/pr34850.C | 3 +--
 gcc/testsuite/g++.dg/torture/pr36745.C | 3 +--
 gcc/testsuite/

Re: [PATCH] pretty-print SSA names

2020-04-16 Thread Richard Biener
On Wed, 15 Apr 2020, David Malcolm wrote:

> On Wed, 2020-04-15 at 14:52 +0200, Richard Biener wrote:
> > This adds the SSA name version to the gdb pretty-printing of SSA
> > names.
> > 
> > (gdb) p (tree)$1
> > $5 = 
> > 
> > Tested (see above...).
> > 
> > OK?
> 
> > Thanks,
> > Richard.
> > 
> > 2020-04-15  Richard Biener  
> > 
> > * gdbhooks.py (TreePrinter): Print SSA_NAME_VERSION of SSA_NAME
> > nodes.
> > ---
> >  gcc/gdbhooks.py | 3 +++
> >  1 file changed, 3 insertions(+)
> > 
> > diff --git a/gcc/gdbhooks.py b/gcc/gdbhooks.py
> > index e9acc373126..f6aef9ceae8 100644
> > --- a/gcc/gdbhooks.py
> > +++ b/gcc/gdbhooks.py
> > @@ -154,6 +154,7 @@ tree_code_dict =
> > gdb.types.make_enum_dict(gdb.lookup_type('enum tree_code'))
> >  # ...and look up specific values for use later:
> >  IDENTIFIER_NODE = tree_code_dict['IDENTIFIER_NODE']
> >  TYPE_DECL = tree_code_dict['TYPE_DECL']
> > +SSA_NAME = tree_code_dict['SSA_NAME']
> >  
> >  # Similarly for "enum tree_code_class" (tree.h):
> >  tree_code_class_dict =
> > gdb.types.make_enum_dict(gdb.lookup_type('enum tree_code_class'))
> > @@ -252,6 +253,8 @@ class TreePrinter:
> >  result += ' %s' %
> > tree_TYPE_NAME.DECL_NAME().IDENTIFIER_POINTER()
> >  if self.node.TREE_CODE() == IDENTIFIER_NODE:
> >  result += ' %s' % self.node.IDENTIFIER_POINTER()
> > +   if self.node.TREE_CODE() == SSA_NAME:
> > +   result += ' %u' % self.gdbval['base']['u']['version']
> 
> Make it an "elif" rather than a plain "if" as the TREE_CODEs are
> mutually exclusive (and the language doesn't have switch statements).

Done.

> Is something up with the indentation?  The "if" ("elif") for SSA_NAME
> should have the same indentation as the "if" for IDENTIFIER_NODE.  Is
> there a stray tab or similar?

Ah, tab vs. spaces.  Changed to all spaces now and pushed.

Richard.


Re: [stage1] [PATCH] Merge dg-options and dg-additional-options if len <= 120 chars.

2020-04-16 Thread Jakub Jelinek via Gcc-patches
On Thu, Apr 16, 2020 at 09:50:02AM +0200, Martin Liška wrote:
> On 4/14/20 1:43 PM, Jakub Jelinek wrote:
> > Roughly, yes.  A few extra in testcases don't hurt necessarily, but say 160
> > chars or more is clearly too much.
> 
> All right, I made a limit of 120 characters for the changes.
> 
> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
> 
> Ready to be installed in next stage1?

LGTM, thanks.

Jakub



Re: [PATCH V2]aarch64: falkor-tag-collision-avoidance.c fix valid_src_p for use of uninitialized value

2020-04-16 Thread Andrea Corallo
Hi all,

I'd like to back-port this to the gcc-9 branch.
This patch is directly based on:

https://gcc.gnu.org/pipermail/gcc-patches/2020-April/543627.html
https://gcc.gnu.org/pipermail/gcc-patches/2020-April/543901.html

Bootstrapped and reg tested.  Ok for release/gcc-9?

Bests
  Andrea

gcc/ChangeLog

2020-??-??  Andrea Corallo  

* config/aarch64/falkor-tag-collision-avoidance.c
(valid_src_p): Check for aarch64_address_info type before
accessing base field.

gcc/testsuite/ChangeLog

2020-??-??  Andrea Corallo  

* gcc.target/aarch64/pr94530.c: New test.
From 1e70131c2c099c1071baba3f40d610f41ff4e9ea Mon Sep 17 00:00:00 2001
From: Andrea Corallo 
Date: Tue, 14 Apr 2020 19:51:54 +0100
Subject: [PATCH] pr94530 gcc-9

---
 gcc/config/aarch64/falkor-tag-collision-avoidance.c | 7 +++
 gcc/testsuite/gcc.target/aarch64/pr94530.c  | 9 +
 2 files changed, 16 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/pr94530.c

diff --git a/gcc/config/aarch64/falkor-tag-collision-avoidance.c b/gcc/config/aarch64/falkor-tag-collision-avoidance.c
index 779dee81f7f4..698d1595d0a6 100644
--- a/gcc/config/aarch64/falkor-tag-collision-avoidance.c
+++ b/gcc/config/aarch64/falkor-tag-collision-avoidance.c
@@ -537,6 +537,13 @@ valid_src_p (rtx src, rtx_insn *insn, struct loop *loop, bool *pre_post,
   if (!aarch64_classify_address (&addr, XEXP (x, 0), mode, true))
 return false;
 
+  if (addr.type != ADDRESS_REG_IMM
+  && addr.type != ADDRESS_REG_WB
+  && addr.type != ADDRESS_REG_REG
+  && addr.type != ADDRESS_REG_UXTW
+  && addr.type != ADDRESS_REG_SXTW)
+return false;
+
   unsigned regno = REGNO (addr.base);
   if (global_regs[regno] || fixed_regs[regno])
 return false;
diff --git a/gcc/testsuite/gcc.target/aarch64/pr94530.c b/gcc/testsuite/gcc.target/aarch64/pr94530.c
new file mode 100644
index ..1f98201c50a8
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/pr94530.c
@@ -0,0 +1,9 @@
+/* { dg-do compile } */
+/* { dg-options "-Os -mcpu=falkor -mpc-relative-literal-loads -mcmodel=large" } */
+
+extern void bar(const char *);
+
+void foo(void) {
+  for (;;)
+bar("");
+}
-- 
2.17.1
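
The pattern behind the fix, checking the kind of a classified address before reading a field that is only set for some kinds, can be sketched on its own. The types below are hypothetical and much simpler than aarch64_address_info; only the shape of the check is taken from the patch.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical, simplified address kinds.  In this model only the first
// two carry a base register.
enum address_kind { KIND_REG_IMM, KIND_REG_REG, KIND_SYMBOLIC };

struct address_info
{
  address_kind kind;
  unsigned base_regno;  // meaningful only for the register-based kinds
};

// Stores the base register number in *regno and returns true, or returns
// false without touching *regno when the address kind has no base.
bool
get_base_regno (const address_info &addr, unsigned *regno)
{
  assert (regno != NULL);
  if (addr.kind != KIND_REG_IMM && addr.kind != KIND_REG_REG)
    return false;
  *regno = addr.base_regno;
  return true;
}
```

Rejecting the other kinds up front means the field is never read when nothing has written it, which is what the added early return in valid_src_p achieves.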



[PATCH] middle-end/94614 - avoid multiword moves to nothing

2020-04-16 Thread Richard Biener


This adjusts emit_move_multi_word to handle moves into paradoxical
subregs parts that are not there and resolve_clobber to handle
such subregs.

Bootstrap & regtest running on x86_64-unknown-linux-gnu.

The testcase involves writing to a register out of bounds so I'm not
sure this is the correct place to paper over this or whether RTL
expansion should have done things differently.

;; MEM[(v4si *)&res] = v_2(D);

(insn 12 9 10 (clobber (subreg:TI (reg/v:DI 113 [ res ]) 0)) "pr94574.c":13:18 -1
 (nil))

(insn 10 12 11 (set (subreg:SI (reg/v:DI 113 [ res ]) 0)
(subreg:SI (reg/v:TI 115 [ v ]) 0)) "pr94574.c":13:18 -1
 (nil))

(insn 11 10 0 (set (subreg:SI (reg/v:DI 113 [ res ]) 4)
(subreg:SI (reg/v:TI 115 [ v ]) 4)) "pr94574.c":13:18 -1
 (nil))

maybe we should simply force regs with out-of-bound accesses to
memory?  The above is the RTL generated after the first half of the
fix.  We still generate

(insn 12 7 10 2 (clobber (subreg:TI (reg/v:DI 113 [ res ]) 0)) "pr94574.c":13:18 -1
 (nil))

which lower-subreg runs into - I did not track down where that
is generated, but I understand the subreg is pointless here?

Comments?  OK?

Richard.

2020-04-16  Richard Biener  

PR middle-end/94614
* expr.c (emit_move_multi_word): Do not generate code when
the destination part is undefined_operand_subword_p.
* lower-subreg.c (resolve_clobber): Look through a paradoxical
subreg.
---
 gcc/expr.c | 5 +
 gcc/lower-subreg.c | 4 
 2 files changed, 9 insertions(+)

diff --git a/gcc/expr.c b/gcc/expr.c
index b97c217e86d..dfbeae71518 100644
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -3692,6 +3692,11 @@ emit_move_multi_word (machine_mode mode, rtx x, rtx y)
   need_clobber = false;
   for (i = 0; i < CEIL (mode_size, UNITS_PER_WORD); i++)
 {
+  /* Do not generate code for a move if it would go entirely
+to the non-existing bits of a paradoxical subreg.  */
+  if (undefined_operand_subword_p (x, i))
+   continue;
+
   rtx xpart = operand_subword (x, i, 1, mode);
   rtx ypart;
 
diff --git a/gcc/lower-subreg.c b/gcc/lower-subreg.c
index a170f0ff93b..a11e535b5bf 100644
--- a/gcc/lower-subreg.c
+++ b/gcc/lower-subreg.c
@@ -1150,6 +1150,10 @@ resolve_clobber (rtx pat, rtx_insn *insn)
   int ret;
 
   reg = XEXP (pat, 0);
+  /* For clobbers we can look through paradoxical subregs which
+ we do not handle in simplify_gen_subreg_concatn.  */
+  if (paradoxical_subreg_p (reg))
+reg = SUBREG_REG (reg);
   if (!resolve_reg_p (reg) && !resolve_subreg_p (reg))
 return false;
 
-- 
2.16.4
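
The effect of the expr.c hunk can be illustrated with a toy model. This is not GCC's undefined_operand_subword_p: a paradoxical subreg is described here only by the byte size of its inner register, the lowpart is assumed to sit at byte 0, and both function names are hypothetical.

```cpp
#include <cassert>

// Toy model: word `i` of the wider outer mode is backed by real bits only
// if it starts inside the inner register.
bool
word_outside_inner_reg (unsigned inner_bytes, unsigned word_bytes, unsigned i)
{
  assert (word_bytes != 0);
  return i * word_bytes >= inner_bytes;
}

// Counts the word moves a multi-word copy of `outer_bytes` would emit once
// the words without backing bits are skipped.
unsigned
count_emitted_moves (unsigned outer_bytes, unsigned inner_bytes,
                     unsigned word_bytes)
{
  assert (word_bytes != 0);
  unsigned moves = 0;
  unsigned words = (outer_bytes + word_bytes - 1) / word_bytes;  // CEIL
  for (unsigned i = 0; i < words; i++)
    {
      // Skip moves that would go entirely to non-existing bits.
      if (word_outside_inner_reg (inner_bytes, word_bytes, i))
        continue;
      moves++;
    }
  return moves;
}
```

With a 16-byte outer mode, an 8-byte inner register and 4-byte words, the model keeps two moves, which matches the two SImode sets in the RTL quoted in the message.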


[PATCH] Do not use HAVE_DOS_BASED_FILE_SYSTEM for Cygwin.

2020-04-16 Thread Martin Liška

Hi.

The patch is a fix for Cygwin, where we should not define HAVE_DOS_BASED_FILE_SYSTEM
and should not use backslashes as the path component separator.

Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
I'm going to install the patch if there are no objections.

Thanks,
Martin

ChangeLog:

2020-04-16  Martin Liska  
Jonathan Yong <10wa...@gmail.com>

PR gcov-profile/94570
* ltmain.sh: Do not define HAVE_DOS_BASED_FILE_SYSTEM
for CYGWIN.

gcc/ChangeLog:

2020-04-16  Martin Liska  
Jonathan Yong <10wa...@gmail.com>

PR gcov-profile/94570
* coverage.c (coverage_init): Use separator properly.

include/ChangeLog:

2020-04-16  Martin Liska  
Jonathan Yong <10wa...@gmail.com>

PR gcov-profile/94570
* filenames.h (defined): Do not define HAVE_DOS_BASED_FILE_SYSTEM
for CYGWIN.
---
 gcc/coverage.c  | 12 ++--
 include/filenames.h |  2 +-
 ltmain.sh   |  2 +-
 3 files changed, 8 insertions(+), 8 deletions(-)


diff --git a/gcc/coverage.c b/gcc/coverage.c
index 30ae84df90f..30ac3540110 100644
--- a/gcc/coverage.c
+++ b/gcc/coverage.c
@@ -1199,6 +1199,11 @@ coverage_obj_finish (vec *ctor)
 void
 coverage_init (const char *filename)
 {
+#if HAVE_DOS_BASED_FILE_SYSTEM
+  const char *separator = "\\";
+#else
+  const char *separator = "/";
+#endif
   int len = strlen (filename);
   int prefix_len = 0;
 
@@ -1215,11 +1220,6 @@ coverage_init (const char *filename)
 	 of filename in order to prevent file path clashing.  */
   if (profile_data_prefix)
 	{
-#if HAVE_DOS_BASED_FILE_SYSTEM
-	  const char *separator = "\\";
-#else
-	  const char *separator = "/";
-#endif
 	  filename = concat (getpwd (), separator, filename, NULL);
 	  filename = mangle_path (filename);
 	  len = strlen (filename);
@@ -1238,7 +1238,7 @@ coverage_init (const char *filename)
   if (profile_data_prefix)
 {
   memcpy (da_file_name, profile_data_prefix, prefix_len);
-  da_file_name[prefix_len++] = '/';
+  da_file_name[prefix_len++] = *separator;
 }
   memcpy (da_file_name + prefix_len, filename, len);
   strcpy (da_file_name + prefix_len + len, GCOV_DATA_SUFFIX);
diff --git a/include/filenames.h b/include/filenames.h
index 1ed441221ac..710d9c72687 100644
--- a/include/filenames.h
+++ b/include/filenames.h
@@ -32,7 +32,7 @@ Foundation, Inc., 51 Franklin Street - Fifth Floor, Boston, MA 02110-1301, USA.
 extern "C" {
 #endif
 
-#if defined(__MSDOS__) || defined(_WIN32) || defined(__OS2__) || defined (__CYGWIN__)
+#if defined(__MSDOS__) || (defined(_WIN32) && ! defined(__CYGWIN__)) || defined(__OS2__)
 #  ifndef HAVE_DOS_BASED_FILE_SYSTEM
 #define HAVE_DOS_BASED_FILE_SYSTEM 1
 #  endif
diff --git a/ltmain.sh b/ltmain.sh
index 79f9ba89af5..8ad183010f0 100644
--- a/ltmain.sh
+++ b/ltmain.sh
@@ -3425,7 +3425,7 @@ int setenv (const char *, const char *, int);
 # define PATH_SEPARATOR ':'
 #endif
 
-#if defined (_WIN32) || defined (__MSDOS__) || defined (__DJGPP__) || \
+#if (defined (_WIN32) && ! defined(__CYGWIN__)) || defined (__MSDOS__) || defined (__DJGPP__) || \
   defined (__OS2__)
 # define HAVE_DOS_BASED_FILE_SYSTEM
 # define FOPEN_WB "wb"
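
A stand-alone sketch of the separator selection follows. The preprocessor condition is the one from the patched include/filenames.h; DEMO_DIR_SEPARATOR and join_path are hypothetical names used only here, and the function is not the code in coverage.c.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Same condition as the patched include/filenames.h: Cygwin defines
// __CYGWIN__ and is deliberately kept on the forward-slash side.
#if defined(__MSDOS__) || (defined(_WIN32) && !defined(__CYGWIN__)) \
    || defined(__OS2__)
# define DEMO_DIR_SEPARATOR "\\"
#else
# define DEMO_DIR_SEPARATOR "/"
#endif

// Joins a directory and a file name with the given separator, which
// defaults to the one selected for the host above.
std::string
join_path (const std::string &dir, const std::string &file,
           const char *separator = DEMO_DIR_SEPARATOR)
{
  assert (separator != NULL && *separator != '\0');
  return dir + separator + file;
}
```

Passing the separator explicitly makes the behaviour easy to check on any host, while the default follows the platform test.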



[PATCH] testsuite: Move misplaced gcc.c-torture/pr92372.c test [PR92372]

2020-04-16 Thread Jakub Jelinek via Gcc-patches
Hi!

This test got committed into a spot where nothing actually tests it.
As there is no main, I assume it was meant to be gcc.c-torture/compile/
test and the test PASSes after moving there (both x86_64-linux and
i686-linux).  Though, it passed before the PR92372 fixes too.

Ok for trunk?

2020-04-16  Jakub Jelinek  

PR ipa/92372
* gcc.c-torture/pr92372.c: Move ...
* gcc.c-torture/compile/pr92372.c: ... here.

diff --git a/gcc/testsuite/gcc.c-torture/pr92372.c b/gcc/testsuite/gcc.c-torture/compile/pr92372.c
similarity index 100%
rename from gcc/testsuite/gcc.c-torture/pr92372.c
rename to gcc/testsuite/gcc.c-torture/compile/pr92372.c

Jakub



Re: [RFC] split pseudos during loop unrolling in RTL unroller

2020-04-16 Thread Richard Biener via Gcc-patches
On Wed, Apr 15, 2020 at 11:23 PM Segher Boessenkool
 wrote:
>
> Hi!
>
> On Wed, Apr 15, 2020 at 08:21:03AM +0200, Richard Biener wrote:
> > On Wed, Apr 15, 2020 at 3:56 AM Jiufu Guo via Gcc-patches
> >  wrote:
> > > As you may know, we have loop unroll pass in RTL which was introduced a few
> > > years ago, and works for a long time.  Currently, this unroller is using the
> > > pseudos in the original body, and then the pseudos are written multiple times.
> > >
> > > It would be a good idea to create new pseudos for those duplicated instructions
> > > during loop unrolling.  This would relax reg dependence, and then provide more
> > > freedom/opportunity for other optimizations, like scheduling, RA...
> >
> > I think there's a separate pass to do something like this, conveniently
> > placed after unrolling.  -fweb, IIRC enabled by default for -funroll-loops
> > unless explicitly disabled.  Related is regrename which is also enabled 
> > then.
> >
> > So where does your patch make a difference?  Is the webizers dataflow 
> > analysis
> > maybe confused by backedges?
>
> Does -fweb handle things set by the last unrolled iteration, used by the
> first unrolled iteration?
>
> On a general note, we shouldn't depend on some pass that may or may not
> clean up the mess we make, when we could just avoid making a mess in the
> first place.

True - but the issue at hand is not trivial given you have to care for
partial defs, uses outside of the loop (or across the backedge), etc.
So there's plenty of things to go "wrong" here.

> The web pass belongs immediately after expand; but ideally, even expand
> would not reuse pseudos anyway.

But for example when lower-subreg decomposes things in a way turning
partial defs into full defs new opportunities to split the web arise.

> Maybe it would be better as some utility routines, not a pass?

Sure, but then when do we apply it?  Ideally scheduling would do
register renaming itself and thus not rely on the used pseudos
(I'm not sure if it tracks false dependences - I guess it must if it
isn't able to rename regs).  That would be a much better place
for improvements?

Richard.

>
> Segher


Re: [PATCH] middle-end/94614 - avoid multiword moves to nothing

2020-04-16 Thread Jakub Jelinek via Gcc-patches
On Thu, Apr 16, 2020 at 10:05:32AM +0200, Richard Biener wrote:
> 2020-04-16  Richard Biener  
> 
>   PR middle-end/94614
>   * expr.c (emit_move_multi_word): Do not generate code when
>   the destination part is undefined_operand_subword_p.
>   * lower-subreg.c (resolve_clobber): Look through a paradoxical
>   subreg.

LGTM.

Jakub



Re: [PATCH] testsuite: Move misplaced gcc.c-torture/pr92372.c test [PR92372]

2020-04-16 Thread Richard Biener via Gcc-patches
On Thu, Apr 16, 2020 at 10:23 AM Jakub Jelinek via Gcc-patches
 wrote:
>
> Hi!
>
> This test got committed into a spot where nothing actually tests it.
> As there is no main, I assume it was meant to be gcc.c-torture/compile/
> test and the test PASSes after moving there (both x86_64-linux and
> i686-linux).  Though, it passed before the PR92372 fixes too.
>
> Ok for trunk?

OK.

> 2020-04-16  Jakub Jelinek  
>
> PR ipa/92372
> * gcc.c-torture/pr92372.c: Move ...
> * gcc.c-torture/compile/pr92372.c: ... here.
>
> diff --git a/gcc/testsuite/gcc.c-torture/pr92372.c b/gcc/testsuite/gcc.c-torture/compile/pr92372.c
> similarity index 100%
> rename from gcc/testsuite/gcc.c-torture/pr92372.c
> rename to gcc/testsuite/gcc.c-torture/compile/pr92372.c
>
> Jakub
>


Re: [PATCH] sra: Fix access verification (PR 94598)

2020-04-16 Thread Martin Jambor
Hi,

On Wed, Apr 15 2020, Richard Biener wrote:
> On Wed, 15 Apr 2020, Martin Jambor wrote:
>
>> Hi,
>> 
>> get_ref_base_and_extent recognizes ARRAY_REFs with variable index but
>> into arrays of length one as constant offset accesses.  However,
>> max_size in such cases is extended to span the whole element.
>
> You mean f[d] gets offset zero and max_size == sizeof (struct a)?
> Or f[d].b doing this, thus size != max_size?

in GCC 10 we scan function body first and only then try to fill in the
gaps if/when doing total scalarization.  So during body scan we
encounter f[d] which gets offset 0, size equal to the element size
64, and max_size equal to array size minus offset.  Because array length
is 1, size of the array is 64 and thus max_size is 64.  SRA sees size ==
max_size and is satisfied (size != max_size generally indicates an array
access that is unusable for SRA).

Then total scalarization comes along.  First, it determines whether it
needs to create an access for f[0] but there already is an access with
offset 0 and size 64 so it reuses that.  But then under this one it
creates sub-accesses for f[d].b and f[d].c - and re-uses the expression 
of their parent for their expression and so the variable index remains.

Then verification happened, ran get_ref_base_and_extent on f[d].b, which
has offset zero, size 32 but max_size is the size of the array minus
offset, so 64.  And SRA verification though this should have never been
a scalarizable access in the first place and complained that
grp_unscalarizable_region was not set.

During total scalarization we could take extra care to create expression
f[0].b and f[0].c instead but at this stage I decided to "fix" it in the
verifier.  ATM it does not affect anything else.  In future we might
actually want to replace the index already during body scan to make
these candidates for same_access_path_p() test so that whenever these
expressions are re-used, the results are friendlier to the alias oracle
- see the TODO in path_comparable_for_same_access.
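
The numbers in this explanation can be reproduced with a toy model. It is not get_ref_base_and_extent: ref_extent and variable_index_extent are hypothetical, all values are in bits, and the model only encodes the rule stated above that max_size is the array size minus the offset.

```cpp
#include <cassert>

// Toy model of the three numbers discussed above, all in bits.
struct ref_extent
{
  long offset;
  long size;
  long max_size;
};

// Reference into a one-element array through a variable index: the offset
// is that of the field within element 0, the size is that of the
// referenced object, and max_size spans the rest of the array.
ref_extent
variable_index_extent (long array_bits, long field_offset_bits, long ref_bits)
{
  assert (field_offset_bits + ref_bits <= array_bits);
  ref_extent e = { field_offset_bits, ref_bits,
                   array_bits - field_offset_bits };
  return e;
}
```

For f[d] the model gives size == max_size == 64, so the access looks scalarizable; for f[d].b it gives size 32 and max_size 64, which is the mismatch the verifier complained about.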

>
>> This confuses SRA verification when SRA also builds its (total
>> scalarization) access structures to describe fields under such array -
>> get_ref_base_and_extent returns different size and max_size for them.
>> 
>> Fixed by not performing the check for total scalarization accesses.
>> The subsequent check then had to be changed to use size and not
>> max_size too, which meant it has to be skipped when the access
>> structure describes a genuine variable array access.
>> 
>> Bootstrapped and tested on x86_64-linux.
>> 
>> OK for trunk?
>
> OK.
>

Thanks, I have just pushed the patch to master.

Martin


>> 
>> 
>> 2020-04-15  Martin Jambor  
>> 
>>  PR tree-optimization/94598
>>  * tree-sra.c (verify_sra_access_forest): Fix verification of total
>>  scalarization accesses under access to one-element arrays.
>> 
>>  testsuite/
>>  * gcc.dg/tree-ssa/pr94598.c: New test.


[PATCH] intl: Unbreak intl build with bison 3 when no regeneration is needed [PR92008]

2020-04-16 Thread Jakub Jelinek via Gcc-patches
Hi!

As Iain reported, my change broke the case when one has bison >= 3,
but make decides there is no reason to regenerate plural.c, unfortunately
that seems to be a scenario I haven't tested.  The problem is that
the pregenerated plural.c has been generated with bison 1.35, but when
config.h says HAVE_BISON3, the code assumes it is the bison3 variant.
What used to work fine is when one has bison >= 3 and plural.c has been
regenerated (e.g. do touch intl/plural.y and it will work), or when
one doesn't have any bison (then nothing is regenerated, but HAVE_BISON3
isn't defined either), or when one has bison < 3 and doesn't need to
regenerate, or when one has bison < 3 and it is regenerated.

The following patch fixes this, by killing the HAVE_BISON3 macro from
config.h, and instead remembering the fact whether plural.c has been created
with bison < 3 or bison >= 3 in a separate new plural-config.h header.
The way this works:
- user doesn't have bison
- user has bison >= 3, but intl/{plural-config.h,plural.c} aren't older than
  intl/plural.y
- user has bison < 3, but intl/{plural-config.h,plural.c} aren't older than
  intl/plural.y
  pregenerated !USE_BISON3 plural.c and plural-config.h from source
  dir is used, nothing in the objdir
- user has bison >= 3 and intl/plural.y is newer
  Makefile generates plural.c and USE_BISON3 plural-config.h in the
  objdir, which is then used in preference to srcdir copies
- user has bison < 3 and intl/plural.y is newer
  Makefile generates plural.c and !USE_BISON3 plural-config.h in the
  objdir, which is then used in preference to srcdir copies
I have tested all these cases and make all-yes worked in all the cases.
If one uses the unsupported ./configure where srcdir == objdir, I guess
(though haven't tested) that it should still work, just it would be nice
if such people didn't try to check in the plural{.c,-config.h} they have
regenerated.
What doesn't work, but didn't work before either (just tested gcc-9 branch
too) is when one doesn't have bison and plural.y is newer than plural.c.
Don't do that ;)
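
A minimal sketch of the idea, assuming the generated header holds nothing but
the (possibly commented-out) USE_BISON3 define as the Makefile rule in this
patch writes it.  The function name plural_parser_flavour is hypothetical and
only illustrates how code that includes the header can follow whichever bison
actually produced plural.c, rather than whichever bison configure found:

```cpp
#include <cassert>
#include <string>

// Stand-in for the contents of a plural-config.h written by a bison < 3
// build.  A bison >= 3 build would write "#define USE_BISON3" instead.
/* #define USE_BISON3 */

// Reports which parser flavour the accompanying plural.c was generated for.
// Illustrative only; nothing with this name exists in intl/.
inline std::string plural_parser_flavour()
{
#ifdef USE_BISON3
  return "bison >= 3";
#else
  return "bison < 3";
#endif
}
```

Because the header travels with plural.c (both pregenerated in srcdir, or both
regenerated in objdir), the two cannot disagree the way config.h and a stale
plural.c could.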

Sorry for the breakage.

Ok for trunk?

2020-04-16  Jakub Jelinek  

PR bootstrap/92008
intl/
* configure.ac: Remove HAVE_BISON3 AC_DEFINE.
* Makefile.in (HEADERS): Add plural-config.h.
(.y.c): Also create plural-config.h.
(dcigettext.o loadmsgcat.o plural.o plural-exp.o): Also depend
on plural-config.h.
(plural-config.h): Depend on plural.c.
* plural-exp.h: Include plural-config.h.  Use USE_BISON3 instead
of HAVE_BISON3.
* plural.y: Use USE_BISON3 instead of HAVE_BISON3.
* configure: Regenerated.
* plural.c: Regenerated.
* config.h.in: Regenerated.
* plural-config.h: Generated.
contrib/
* gcc_update: Add intl/plural.y dependency for intl/plural-config.h.

--- intl/configure.ac.jj  2020-04-16 10:11:49.709094977 +0200
+++ intl/configure.ac   2020-04-16 10:57:24.033935892 +0200
@@ -62,7 +62,6 @@ changequote([,])dnl
   esac
   AC_MSG_RESULT([$ac_prog_version])
   if test $ac_bison3 = yes; then
-AC_DEFINE(HAVE_BISON3, 1, [Define if bison 3 or later is used.])
 BISON3_YES=
 BISON3_NO='#'
   fi
--- intl/Makefile.in.jj 2020-04-16 10:11:49.715094886 +0200
+++ intl/Makefile.in  2020-04-16 11:13:17.134602990 +0200
@@ -57,6 +57,7 @@ HEADERS = \
   gettextP.h \
   hash-string.h \
   loadinfo.h \
+  plural-config.h \
   plural-exp.h \
   eval-plural.h \
   localcharset.h \
@@ -133,10 +134,12 @@ libintl.h: $(srcdir)/libgnuintl.h
$(COMPILE) $<
 
 .y.c:
+@BISON3_YES@   echo '#define USE_BISON3' > $(patsubst %.c,%-config.h,$@)
 @BISON3_YES@   sed 's,%pure_parser,,;s,^/\* BISON3 \(.*\) \*/$$,\1,' $< > $@.y
 @BISON3_YES@   $(YACC) $(YFLAGS) --output $@.c $@.y
 @BISON3_YES@   sed 's/\.c\.y"/.y"/' $@.c > $@
 @BISON3_YES@   rm -f $@.c $@.y $@.h
+@BISON3_NO@   echo '/* #define USE_BISON3 */' > $(patsubst %.c,%-config.h,$@)
 @BISON3_NO@   $(YACC) $(YFLAGS) --output $@ $<
rm -f $*.h
 
@@ -165,7 +168,7 @@ dngettext.o finddomain.o gettext.o intl-
 localealias.o ngettext.o textdomain.o: gettextP.h gmo.h loadinfo.h
 dcigettext.o loadmsgcat.o: hash-string.h
 explodename.o l10nflist.o: loadinfo.h
-dcigettext.o loadmsgcat.o plural.o plural-exp.o: plural-exp.h
+dcigettext.o loadmsgcat.o plural.o plural-exp.o: plural-exp.h plural-config.h
 dcigettext.o: eval-plural.h
 localcharset.o: localcharset.h
 localealias.o localcharset.o relocatable.o: relocatable.h
@@ -242,6 +245,8 @@ $(srcdir)/aclocal.m4: @MAINT@ $(aclocal_
 config.h: stamp-h1
test -f config.h || (rm -f stamp-h1 && $(MAKE) stamp-h1)
 
+plural-config.h: plural.c
+
 stamp-h1: $(srcdir)/config.h.in config.status
-rm -f stamp-h1
$(SHELL) ./config.status config.h
--- intl/plural-exp.h.jj  2020-04-16 10:11:49.728094690 +0200
+++ intl/plural-exp.h   2020-04-16 11:11:45.862973814 +0200
@@ -20,6 +20,8 @@
 #ifndef _PLURAL_EXP_H
 #define 

Re: [PATCH] intl: Unbreak intl build with bison 3 when no regeneration is needed [PR92008]

2020-04-16 Thread Richard Biener
On Thu, 16 Apr 2020, Jakub Jelinek wrote:

> Hi!
> 
> As Iain reported, my change broke the case when one has bison >= 3,
> but make decides there is no reason to regenerate plural.c, unfortunately
> that seems to be a scenario I haven't tested.  The problem is that
> the pregenerated plural.c has been generated with bison 1.35, but when
> config.h says HAVE_BISON3, the code assumes it is the bison3 variant.
> What used to work fine is when one has bison >= 3 and plural.c has been
> regenerated (e.g. do touch intl/plural.y and it will work), or when
> one doesn't have any bison (then nothing is regenerated, but HAVE_BISON3
> isn't defined either), or when one has bison < 3 and doesn't need to
> regenerate, or when one has bison < 3 and it is regenerated.
> 
> The following patch fixes this, by killing the HAVE_BISON3 macro from
> config.h, and instead remembering the fact whether plural.c has been created
> with bison < 3 or bison >= 3 in a separate new plural-config.h header.
> The way this works:
> - user doesn't have bison
> - user has bison >= 3, but intl/{plural-config.h,plural.c} aren't older than 
> intl/plural.y
> - user has bison < 3, but intl/{plural-config.h,plural.c} aren't older than 
> intl/plural.y
>   pregenerated !USE_BISON3 plural.c and plural-config.h from source
>   dir is used, nothing in the objdir
> - user has bison >= 3 and intl/plural.y is newer
>   Makefile generates plural.c and USE_BISON3 plural-config.h in the
>   objdir, which is then used in preference to srcdir copies
> - user has bison < 3 and intl/plural.y is newer
>   Makefile generates plural.c and !USE_BISON3 plural-config.h in the
>   objdir, which is then used in preference to srcdir copies
> I have tested all these cases and make all-yes worked in all the cases.
> If one uses the unsupported ./configure where srcdir == objdir, I guess
> (though haven't tested) that it should still work, just it would be nice
> if such people didn't try to check in the plural{.c,-config.h} they have
> regenerated.
> What doesn't work, but didn't work before either (just tested gcc-9 branch
> too) is when one doesn't have bison and plural.y is newer than plural.c.
> Don't do that ;)
> 
> Sorry for the breakage.
> 
> Ok for trunk?

OK.

> 2020-04-16  Jakub Jelinek  
> 
>   PR bootstrap/92008
> intl/
>   * configure.ac: Remove HAVE_BISON3 AC_DEFINE.
>   * Makefile.in (HEADERS): Add plural-config.h.
>   (.y.c): Also create plural-config.h.
>   (dcigettext.o loadmsgcat.o plural.o plural-exp.o): Also depend
>   on plural-config.h.
>   (plural-config.h): Depend on plural.c.
>   * plural-exp.h: Include plural-config.h.  Use USE_BISON3 instead
>   of HAVE_BISON3.
>   * plural.y: Use USE_BISON3 instead of HAVE_BISON3.
>   * configure: Regenerated.
>   * plural.c: Regenerated.
>   * config.h.in: Regenerated.
>   * plural-config.h: Generated.
> contrib/
>   * gcc_update: Add intl/plural.y dependency for intl/plural-config.h.
> 
> --- intl/configure.ac.jj  2020-04-16 10:11:49.709094977 +0200
> +++ intl/configure.ac 2020-04-16 10:57:24.033935892 +0200
> @@ -62,7 +62,6 @@ changequote([,])dnl
>esac
>AC_MSG_RESULT([$ac_prog_version])
>if test $ac_bison3 = yes; then
> -AC_DEFINE(HAVE_BISON3, 1, [Define if bison 3 or later is used.])
>  BISON3_YES=
>  BISON3_NO='#'
>fi
> --- intl/Makefile.in.jj   2020-04-16 10:11:49.715094886 +0200
> +++ intl/Makefile.in  2020-04-16 11:13:17.134602990 +0200
> @@ -57,6 +57,7 @@ HEADERS = \
>gettextP.h \
>hash-string.h \
>loadinfo.h \
> +  plural-config.h \
>plural-exp.h \
>eval-plural.h \
>localcharset.h \
> @@ -133,10 +134,12 @@ libintl.h: $(srcdir)/libgnuintl.h
>   $(COMPILE) $<
>  
>  .y.c:
> +@BISON3_YES@ echo '#define USE_BISON3' > $(patsubst %.c,%-config.h,$@)
>  @BISON3_YES@ sed 's,%pure_parser,,;s,^/\* BISON3 \(.*\) \*/$$,\1,' $< > $@.y
>  @BISON3_YES@ $(YACC) $(YFLAGS) --output $@.c $@.y
>  @BISON3_YES@ sed 's/\.c\.y"/.y"/' $@.c > $@
>  @BISON3_YES@ rm -f $@.c $@.y $@.h
> +@BISON3_NO@  echo '/* #define USE_BISON3 */' > $(patsubst %.c,%-config.h,$@)
>  @BISON3_NO@  $(YACC) $(YFLAGS) --output $@ $<
>   rm -f $*.h
>  
> @@ -165,7 +168,7 @@ dngettext.o finddomain.o gettext.o intl-
>  localealias.o ngettext.o textdomain.o: gettextP.h gmo.h loadinfo.h
>  dcigettext.o loadmsgcat.o: hash-string.h
>  explodename.o l10nflist.o: loadinfo.h
> -dcigettext.o loadmsgcat.o plural.o plural-exp.o: plural-exp.h
> +dcigettext.o loadmsgcat.o plural.o plural-exp.o: plural-exp.h plural-config.h
>  dcigettext.o: eval-plural.h
>  localcharset.o: localcharset.h
>  localealias.o localcharset.o relocatable.o: relocatable.h
> @@ -242,6 +245,8 @@ $(srcdir)/aclocal.m4: @MAINT@ $(aclocal_
>  config.h: stamp-h1
>   test -f config.h || (rm -f stamp-h1 && $(MAKE) stamp-h1)
>  
> +plural-config.h: plural.c
> +
>  stamp-h1: $(srcdir)/config.h.in config.status
>   -rm -f stamp-h1
>   $(SH

Re: [PR C++ 94426] Lambda linkage

2020-04-16 Thread Iain Sandoe via Gcc-patches
Hi Nathan,

Iain Sandoe  wrote:

> Nathan Sidwell  wrote:
> 
>> My fix for 94147 was confusing no-linkage with internal linkage, at the 
>> language level.  That's wrong. (the std is confusing here, because it 
>> describes linkage of names (which is wrong), and lambdas have no names)
>> 
>> Lambdas with extra-scope, have linkage.  However, at the 
>> implementation-level that linkage is at least as restricted as the linkage 
>> of the extra-scope decl.
>> 
>> Further, when instantiating a variable initialized by a lambda, we must 
>> determine the visibility of the variable itself, before instantiating its 
>> initializer.  If the template arguments are internal (or no-linkage), the 
>> variable will have internal linkage, regardless of the linkage of the 
>> template it is instantiated from.  We need to know that before instantiating 
>> the lambda, so we can restrict its linkage correctly.
>> 
>> I'll commit this in a few days.
> 
> As discussed on irc,
> 
> The testcase for this fails on Darwin, where we don’t use .local or .comm for 
> the var.
> 
> I’ve tested this on x86-64-linux and darwin, 
> but I plan on testing on a few more Darwin boxen,
> OK to apply, if additional testing passes?

That testing revealed some differences in storage description for the variable 
(powerpc 32b Darwin puts it in bss, like Linux, but the remainder of the 
platform versions use .static_data).  However, that’s not the relevant 
observation.

The observation is that the storage and symbols for 

_Z3VARIZ1qvEUlvE_E

have not changed between gcc-9 and trunk.

What has changed is the function that initializes that variable:

_Z4InitIN3VARIZ1qvEUlvE_EUlvE_EEbT_

which was weak / comdat [Linux] or weak / global [Darwin] and is now text-section 
local (which is what I understood to be the intention of the change), so I 
wonder whether the scan-asms are testing what you intended?

How about the following?  IMO, given the observation above, the first two 
tests are not especially useful and could be removed.

The remainder of the amendments cater for USER_LABEL_PREFIX and a different 
spelling for ‘weak’ in the Darwin assembly language.

So - this now tests that the symbol exists, is spelled the way you intend and 
is not weak (or global on Darwin).

WDYT?
Iain

diff --cc gcc/testsuite/g++.dg/cpp0x/lambda/pr94426-2.C
index 3db864c604b,3db864c604b..dd94ea1f325
--- a/gcc/testsuite/g++.dg/cpp0x/lambda/pr94426-2.C
+++ b/gcc/testsuite/g++.dg/cpp0x/lambda/pr94426-2.C
@@@ -16,5 -16,5 +16,11 @@@ void q (
  }
  
  // The instantiation of VAR becomes local
--// { dg-final { scan-assembler {.local_Z3VARIZ1qvEUlvE_E} { target { i?86-*-* x86_64-*-* } } } }
--// { dg-final { scan-assembler {.comm _Z3VARIZ1qvEUlvE_E,1,1} { target { i?86-*-* x86_64-*-* } } } }
++// { dg-final { scan-assembler {.local_Z3VARIZ1qvEUlvE_E} { target { { i?86-*-* x86_64-*-* } && { ! *-*-darwin* } } } } }
++// { dg-final { scan-assembler {.comm _Z3VARIZ1qvEUlvE_E,1,1} { target { { i?86-*-* x86_64-*-* } && { ! *-*-darwin* } } } } }
++
++// The instantiation of VAR becomes local
++// { dg-final { scan-assembler-not {.globl[ \t]+_?_Z4InitIN3VARIZ1qvEUlvE_EUlvE_EEbT_} { target *-*-darwin* } } }
++// { dg-final { scan-assembler-not {.weak(_definition)?[ \t]+_?_Z4InitIN3VARIZ1qvEUlvE_EUlvE_EEbT_} { target  i?86-*-* x86_64-*-* *-*-darwin* } } }
++// Make sure it is defined with the mangling we expect.
++// { dg-final { scan-assembler {_?_Z4InitIN3VARIZ1qvEUlvE_EUlvE_EEbT_:} { target  i?86-*-* x86_64-*-* *-*-darwin* } } }



[PATCH] cleanup graphite results

2020-04-16 Thread Richard Biener
This removes { dg-final { scan-tree-dump "tiled" "graphite" } } scans
from graphite tests that pass or fail depending on the ISL version used.
Note all scans match the actually dumped "not tiled" messages with
ISL 0.12 and not the alternative "tiled by ".  With ISL
0.22 neither is printed because the tiling infrastructure doesn't
expect the new schedule layout (though looking at a few cases it
looks superior, with more dimensions marked as permutable).

Anyway, the scans have nothing to do with interchange and just
add to testsuite noise.

Pushed.

2020-04-16  Richard Biener  

* gcc.dg/graphite/interchange-1.c: Remove scan for tiled.
* gcc.dg/graphite/interchange-10.c: Likewise.
* gcc.dg/graphite/interchange-11.c: Likewise.
* gcc.dg/graphite/interchange-3.c: Likewise.
* gcc.dg/graphite/interchange-4.c: Likewise.
* gcc.dg/graphite/interchange-7.c: Likewise.
* gcc.dg/graphite/interchange-9.c: Likewise.
* gcc.dg/graphite/uns-interchange-9.c: Likewise.
* gfortran.dg/graphite/interchange-3.f90: Likewise.
---
 gcc/testsuite/gcc.dg/graphite/interchange-1.c| 7 ---
 gcc/testsuite/gcc.dg/graphite/interchange-10.c   | 2 --
 gcc/testsuite/gcc.dg/graphite/interchange-11.c   | 2 --
 gcc/testsuite/gcc.dg/graphite/interchange-3.c| 2 --
 gcc/testsuite/gcc.dg/graphite/interchange-4.c| 2 --
 gcc/testsuite/gcc.dg/graphite/interchange-7.c| 2 --
 gcc/testsuite/gcc.dg/graphite/interchange-9.c| 2 --
 gcc/testsuite/gcc.dg/graphite/uns-interchange-9.c| 2 --
 gcc/testsuite/gfortran.dg/graphite/interchange-3.f90 | 2 --
 9 files changed, 23 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/graphite/interchange-1.c 
b/gcc/testsuite/gcc.dg/graphite/interchange-1.c
index b65d4861e68..65a569e7143 100644
--- a/gcc/testsuite/gcc.dg/graphite/interchange-1.c
+++ b/gcc/testsuite/gcc.dg/graphite/interchange-1.c
@@ -48,10 +48,3 @@ main (void)
 
   return 0;
 }
-
-/*FIXME: Between isl 0.12 and isl 0.15 the schedule optimizer needs to print
-something canonical so that it can be checked in the test.  The final code
-generated by both are same in this case but the messaged printed are
-not consistent.  */
-
-/* { dg-final { scan-tree-dump "tiled" "graphite" } } */
diff --git a/gcc/testsuite/gcc.dg/graphite/interchange-10.c 
b/gcc/testsuite/gcc.dg/graphite/interchange-10.c
index a955644dea9..45c248db84d 100644
--- a/gcc/testsuite/gcc.dg/graphite/interchange-10.c
+++ b/gcc/testsuite/gcc.dg/graphite/interchange-10.c
@@ -45,5 +45,3 @@ main (void)
 
   return 0;
 }
-
-/* { dg-final { scan-tree-dump "tiled" "graphite" } } */
diff --git a/gcc/testsuite/gcc.dg/graphite/interchange-11.c 
b/gcc/testsuite/gcc.dg/graphite/interchange-11.c
index 61028225fc4..6ba6907a52e 100644
--- a/gcc/testsuite/gcc.dg/graphite/interchange-11.c
+++ b/gcc/testsuite/gcc.dg/graphite/interchange-11.c
@@ -45,5 +45,3 @@ main (void)
 
   return 0;
 }
-
-/* { dg-final { scan-tree-dump "tiled" "graphite" } } */
diff --git a/gcc/testsuite/gcc.dg/graphite/interchange-3.c 
b/gcc/testsuite/gcc.dg/graphite/interchange-3.c
index 4aec824183a..e8539e2d3d1 100644
--- a/gcc/testsuite/gcc.dg/graphite/interchange-3.c
+++ b/gcc/testsuite/gcc.dg/graphite/interchange-3.c
@@ -46,5 +46,3 @@ main (void)
 
   return 0;
 }
-
-/* { dg-final { scan-tree-dump "tiled" "graphite" } } */
diff --git a/gcc/testsuite/gcc.dg/graphite/interchange-4.c 
b/gcc/testsuite/gcc.dg/graphite/interchange-4.c
index 463ecb5a66d..1370d5f9d54 100644
--- a/gcc/testsuite/gcc.dg/graphite/interchange-4.c
+++ b/gcc/testsuite/gcc.dg/graphite/interchange-4.c
@@ -45,5 +45,3 @@ main (void)
 
   return 0;
 }
-
-/* { dg-final { scan-tree-dump "tiled" "graphite" } } */
diff --git a/gcc/testsuite/gcc.dg/graphite/interchange-7.c 
b/gcc/testsuite/gcc.dg/graphite/interchange-7.c
index 50f7dd7f8e3..b2696dbec08 100644
--- a/gcc/testsuite/gcc.dg/graphite/interchange-7.c
+++ b/gcc/testsuite/gcc.dg/graphite/interchange-7.c
@@ -46,5 +46,3 @@ main (void)
 
   return 0;
 }
-
-/* { dg-final { scan-tree-dump "tiled" "graphite" } } */
diff --git a/gcc/testsuite/gcc.dg/graphite/interchange-9.c 
b/gcc/testsuite/gcc.dg/graphite/interchange-9.c
index 88a357893e9..506b5001f83 100644
--- a/gcc/testsuite/gcc.dg/graphite/interchange-9.c
+++ b/gcc/testsuite/gcc.dg/graphite/interchange-9.c
@@ -43,5 +43,3 @@ main (void)
 
   return 0;
 }
-
-/* { dg-final { scan-tree-dump "tiled" "graphite" } } */
diff --git a/gcc/testsuite/gcc.dg/graphite/uns-interchange-9.c 
b/gcc/testsuite/gcc.dg/graphite/uns-interchange-9.c
index cc108c2bbc3..a8957803207 100644
--- a/gcc/testsuite/gcc.dg/graphite/uns-interchange-9.c
+++ b/gcc/testsuite/gcc.dg/graphite/uns-interchange-9.c
@@ -44,5 +44,3 @@ main (void)
 
   return 0;
 }
-
-/* { dg-final { scan-tree-dump "tiled" "graphite" } } */
diff --git a/gcc/testsuite/gfortran.dg/graphite/interchange-3.f90 
b/gcc/testsuite/gfortran.dg/graphite/interchange-3.f90
index 8070bbb4a8d..d827323acd3 100644
--- a/gcc/testsuite/gfortran

[patch, fortran] Fix PR PR93500

2020-04-16 Thread Thomas Koenig via Gcc-patches

Hello world,

this patch fixes PR93500.  One part of it is due to
what Steve wrote in the patch (returning from resolution when both
operands are NULL), but that still left a nonsensical error.
Returning &gfc_bad_expr when simplifying bounds resulted in the
division by zero error actually reaching the user.

As to why there is an extra error when this is done in the main
program, as compared to a subroutine, I don't know, but I do not
particularly care. What is important is that the first error message
is clear and reaches the user.

Regression-tested. OK for trunk?

Regards

Thomas

2020-04-16  Thomas Koenig  

PR fortran/93500
* resolve.c (resolve_operator): If both operands are
NULL, return false.
* simplify.c (simplify_bound): If a division by zero
was seen during bound simplification, free the
corresponding expression and return &gfc_bad_expr.

2020-04-16  Thomas Koenig  

PR fortran/93500
* arith_divide_3.f90: New test.
diff --git a/gcc/fortran/resolve.c b/gcc/fortran/resolve.c
index 9b95200c241..650837e18c3 100644
--- a/gcc/fortran/resolve.c
+++ b/gcc/fortran/resolve.c
@@ -3986,6 +3986,9 @@ resolve_operator (gfc_expr *e)
 
   op1 = e->value.op.op1;
   op2 = e->value.op.op2;
+  if (op1 == NULL && op2 == NULL)
+return false;
+
   dual_locus_error = false;
 
   /* op1 and op2 cannot both be BOZ.  */
diff --git a/gcc/fortran/simplify.c b/gcc/fortran/simplify.c
index 807565b4e80..fba7f7020be 100644
--- a/gcc/fortran/simplify.c
+++ b/gcc/fortran/simplify.c
@@ -4251,7 +4251,13 @@ simplify_bound (gfc_expr *array, gfc_expr *dim, gfc_expr 
*kind, int upper)
 
  for (j = 0; j < d; j++)
gfc_free_expr (bounds[j]);
- return bounds[d];
+ if (gfc_seen_div0)
+   {
+ gfc_free_expr (bounds[d]);
+ return &gfc_bad_expr;
+   }
+ else
+   return bounds[d];
}
}
 
diff --git a/gcc/testsuite/gfortran.dg/arith_divide_3.f90 
b/gcc/testsuite/gfortran.dg/arith_divide_3.f90
new file mode 100644
index 000..d9eb4a0d590
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/arith_divide_3.f90
@@ -0,0 +1,13 @@
+! { dg-do compile }
+! PR 93500 - this used to cause an ICE
+
+program p
+   integer :: a(min(2,0)/0) ! { dg-error "Division by zero" }
+   integer :: b = lbound(a) ! { dg-error "must be an array" }
+end
+
+subroutine s
+   integer :: a(min(2,0)/0)  ! { dg-error "Division by zero" }
+   integer :: b = lbound(a)
+end
+


[committed] early-remat: Handle sets of multiple candidate regs [PR94605]

2020-04-16 Thread Richard Sandiford
early-remat.c:process_block wasn't handling insns that set multiple
candidate registers, which led to an assertion failure at the end
of the main loop.

Instructions that set two pseudos aren't rematerialisation candidates in
themselves, but we still need to track them if another instruction that
sets the same register is a rematerialisation candidate.
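
To illustrate why a single check per instruction is not enough, here is a
simplified sketch.  It is not the early-remat.c code: the def struct, the
forward walk and the function name are assumptions made for the example.  It
only shows that when two recorded definitions share one instruction, an "if"
consumes one of them and leaves the cursor stuck, while a "while" drains them
all:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified model: each definition records the id of the
// instruction that makes it; definitions are sorted by instruction id.
struct def { int insn; int reg; };

// Walks instructions 0..num_insns-1 and consumes the definitions that belong
// to the current instruction.  Returns how many definitions were never
// consumed (0 means the walk stayed in sync with the definition list).
inline std::size_t unprocessed_defs(const std::vector<def> &defs,
                                    int num_insns, bool handle_multiple)
{
  std::size_t next = 0;
  for (int insn = 0; insn < num_insns; ++insn)
    {
      if (handle_multiple)
        {
          // Drain every definition made by this instruction.
          while (next < defs.size() && defs[next].insn == insn)
            ++next;
        }
      else
        {
          // Consume at most one definition per instruction.
          if (next < defs.size() && defs[next].insn == insn)
            ++next;
        }
    }
  return defs.size() - next;
}
```

With definitions for instructions 0, 1, 1 and 2, the single-check variant
leaves two definitions unconsumed, which is the kind of leftover state an
end-of-loop assertion would catch.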

Tested on aarch64-linux-gnu, aarch64_be-elf and x86_64-linux-gnu.
Pushed as obvious.  I'll backport to GCC 9 and 8 too.

Richard


2020-04-16  Richard Sandiford  

gcc/
PR rtl-optimization/94605
* early-remat.c (early_remat::process_block): Handle insns that
set multiple candidate registers.

gcc/testsuite/
PR rtl-optimization/94605
* gcc.target/aarch64/sve/pr94605.c: New test.
---
 gcc/early-remat.c  |  2 +-
 gcc/testsuite/gcc.target/aarch64/sve/pr94605.c | 12 
 2 files changed, 13 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pr94605.c

diff --git a/gcc/early-remat.c b/gcc/early-remat.c
index 80672cca241..9f5f8541644 100644
--- a/gcc/early-remat.c
+++ b/gcc/early-remat.c
@@ -2020,7 +2020,7 @@ early_remat::process_block (basic_block bb)
}
 
   /* Now process definitions.  */
-  if (next_def && insn == next_def->insn)
+  while (next_def && insn == next_def->insn)
{
  unsigned int gen = canon_candidate (next_candidate);
 
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pr94605.c 
b/gcc/testsuite/gcc.target/aarch64/sve/pr94605.c
new file mode 100644
index 000..593e959e292
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/pr94605.c
@@ -0,0 +1,12 @@
+/* { dg-options "-O2 -msve-vector-bits=256" } */
+
+typedef int v8si __attribute__((vector_size(32)));
+int g (v8si, v8si);
+
+void
+f (void)
+{
+  v8si x = {}, y = {};
+  while (g (x, y))
+asm ("" : "+w" (x), "+w" (y));
+}
-- 
2.17.1



[PATCH 0/19][GCC-8] aarch64: Backport outline atomics

2020-04-16 Thread Andre Vieira (lists)

Hi,

This series backports all the patches and fixes regarding outline 
atomics to the gcc-8 branch.


Bootstrapped the series for aarch64-linux-gnu and regression tested.
Is this OK for gcc-8?

Andre Vieira (19):
aarch64: Add early clobber for aarch64_store_exclusive
aarch64: Simplify LSE cas generation
aarch64: Improve cas generation
aarch64: Improve swp generation
aarch64: Improve atomic-op lse generation
aarch64: Remove early clobber from ATOMIC_LDOP scratch
aarch64: Extend %R for integer registers
aarch64: Implement TImode compare-and-swap
aarch64: Tidy aarch64_split_compare_and_swap
aarch64: Add out-of-line functions for LSE atomics
Add visibility to libfunc constructors
aarch64: Implement -moutline-atomics
Aarch64: Fix shrinkwrapping interactions with atomics (PR92692)
aarch64: Fix store-exclusive in load-operate LSE helpers
aarch64: Configure for sys/auxv.h in libgcc for lse-init.c
aarch64: Fix up aarch64_compare_and_swaphi pattern [PR94368]
aarch64: Fix bootstrap with old binutils [PR93053]
aarch64: Fix ICE due to aarch64_gen_compare_reg_maybe_ze [PR94435]
re PR target/90724 (ICE with __sync_bool_compare_and_swap with 
-march=armv8.2-a+sve)




[PATCH 6/19][GCC-8] aarch64: Remove early clobber from ATOMIC_LDOP scratch

2020-04-16 Thread Andre Vieira (lists)

2020-04-16  Andre Vieira 

    Backport from mainline.
    2018-10-31  Richard Henderson 

    * config/aarch64/atomics.md (aarch64_atomic__lse):
    scratch register need not be early-clobber.  Document the reason
    why we cannot use ST.
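
For reference, the source-level shape of the case that the new comment in
atomics.md describes can be sketched as below.  The function name is
illustrative and not taken from GCC; the two calls are the standard C++
equivalents of the ones quoted in the comment:

```cpp
#include <atomic>
#include <cassert>

// Increments a counter without using the old value, then issues an acquire
// fence.  bump_then_acquire is a hypothetical name for this example.
inline void bump_then_acquire(std::atomic<int> &counter)
{
  // The fetched value is discarded, which is what makes a store-only
  // form of the instruction tempting.
  std::atomic_fetch_add_explicit(&counter, 1, std::memory_order_relaxed);
  // The acquire fence relies on the increment above counting as a read.
  std::atomic_thread_fence(std::memory_order_acquire);
}
```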

diff --git a/gcc/config/aarch64/atomics.md b/gcc/config/aarch64/atomics.md
index 
47a8a40c5b82e349b2caf4e48f9f81577f4c3ed3..d740f4a100b1b624eafdb279f38ac1ce9db587dd
 100644
--- a/gcc/config/aarch64/atomics.md
+++ b/gcc/config/aarch64/atomics.md
@@ -263,6 +263,18 @@
   }
 )
 
+;; It is tempting to want to use ST for relaxed and release
+;; memory models here.  However, that is incompatible with the
+;; C++ memory model for the following case:
+;;
+;; atomic_fetch_add(ptr, 1, memory_order_relaxed);
+;; atomic_thread_fence(memory_order_acquire);
+;;
+;; The problem is that the architecture says that ST (and LD
+;; insns where the destination is XZR) are not regarded as a read.
+;; However we also implement the acquire memory barrier with DMB LD,
+;; and so the ST is not blocked by the barrier.
+
 (define_insn "aarch64_atomic__lse"
   [(set (match_operand:ALLI 0 "aarch64_sync_memory_operand" "+Q")
(unspec_volatile:ALLI
@@ -270,7 +282,7 @@
   (match_operand:ALLI 1 "register_operand" "r")
   (match_operand:SI 2 "const_int_operand")]
   ATOMIC_LDOP))
-   (clobber (match_scratch:ALLI 3 "=&r"))]
+   (clobber (match_scratch:ALLI 3 "=r"))]
   "TARGET_LSE"
   {
enum memmodel model = memmodel_from_int (INTVAL (operands[2]));


[PATCH 4/19][GCC-8] aarch64: Improve swp generation

2020-04-16 Thread Andre Vieira (lists)

Allow zero as an input; fix constraints; avoid unnecessary split.

2020-04-16  Andre Vieira 

    Backport from mainline.
    2018-10-31  Richard Henderson 

    * config/aarch64/aarch64.c (aarch64_emit_atomic_swap): Remove.
    (aarch64_gen_atomic_ldop): Don't call it.
    * config/aarch64/atomics.md (atomic_exchange):
    Use aarch64_reg_or_zero.
    (aarch64_atomic_exchange): Likewise.
    (aarch64_atomic_exchange_lse): Remove split; remove & from
    operand 0; use aarch64_reg_or_zero for input; merge ...
    (aarch64_atomic_swp): ... this and remove.

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 
b6a6e314153ecf4a7ae1b83cfb64e6192197edc5..bac69474598ff19161b72748505151b0d6185a9b
 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -14454,27 +14454,6 @@ aarch64_emit_bic (machine_mode mode, rtx dst, rtx s1, 
rtx s2, int shift)
   emit_insn (gen (dst, s2, shift_rtx, s1));
 }
 
-/* Emit an atomic swap.  */
-
-static void
-aarch64_emit_atomic_swap (machine_mode mode, rtx dst, rtx value,
- rtx mem, rtx model)
-{
-  rtx (*gen) (rtx, rtx, rtx, rtx);
-
-  switch (mode)
-{
-case E_QImode: gen = gen_aarch64_atomic_swpqi; break;
-case E_HImode: gen = gen_aarch64_atomic_swphi; break;
-case E_SImode: gen = gen_aarch64_atomic_swpsi; break;
-case E_DImode: gen = gen_aarch64_atomic_swpdi; break;
-default:
-  gcc_unreachable ();
-}
-
-  emit_insn (gen (dst, mem, value, model));
-}
-
 /* Operations supported by aarch64_emit_atomic_load_op.  */
 
 enum aarch64_atomic_load_op_code
@@ -14587,10 +14566,6 @@ aarch64_gen_atomic_ldop (enum rtx_code code, rtx 
out_data, rtx out_result,
  a SET then emit a swap instruction and finish.  */
   switch (code)
 {
-case SET:
-  aarch64_emit_atomic_swap (mode, out_data, src, mem, model_rtx);
-  return;
-
 case MINUS:
   /* Negate the value and treat it as a PLUS.  */
   {
diff --git a/gcc/config/aarch64/atomics.md b/gcc/config/aarch64/atomics.md
index 
b0e84b8addd809598b3e358a265b86582ce96462..6cc14fbf6c103ab19e6c201333a9eba06b90c469
 100644
--- a/gcc/config/aarch64/atomics.md
+++ b/gcc/config/aarch64/atomics.md
@@ -136,7 +136,7 @@
 (define_expand "atomic_exchange"
  [(match_operand:ALLI 0 "register_operand" "")
   (match_operand:ALLI 1 "aarch64_sync_memory_operand" "")
-  (match_operand:ALLI 2 "register_operand" "")
+  (match_operand:ALLI 2 "aarch64_reg_or_zero" "")
   (match_operand:SI 3 "const_int_operand" "")]
   ""
   {
@@ -156,10 +156,10 @@
 
 (define_insn_and_split "aarch64_atomic_exchange"
   [(set (match_operand:ALLI 0 "register_operand" "=&r");; 
output
-(match_operand:ALLI 1 "aarch64_sync_memory_operand" "+Q")) ;; memory
+(match_operand:ALLI 1 "aarch64_sync_memory_operand" "+Q")) ;; memory
(set (match_dup 1)
 (unspec_volatile:ALLI
-  [(match_operand:ALLI 2 "register_operand" "r")   ;; input
+  [(match_operand:ALLI 2 "aarch64_reg_or_zero" "rZ")   ;; input
(match_operand:SI 3 "const_int_operand" "")];; model
   UNSPECV_ATOMIC_EXCHG))
(clobber (reg:CC CC_REGNUM))
@@ -175,22 +175,25 @@
   }
 )
 
-(define_insn_and_split "aarch64_atomic_exchange_lse"
-  [(set (match_operand:ALLI 0 "register_operand" "=&r")
+(define_insn "aarch64_atomic_exchange_lse"
+  [(set (match_operand:ALLI 0 "register_operand" "=r")
 (match_operand:ALLI 1 "aarch64_sync_memory_operand" "+Q"))
(set (match_dup 1)
 (unspec_volatile:ALLI
-  [(match_operand:ALLI 2 "register_operand" "r")
+  [(match_operand:ALLI 2 "aarch64_reg_or_zero" "rZ")
(match_operand:SI 3 "const_int_operand" "")]
   UNSPECV_ATOMIC_EXCHG))]
   "TARGET_LSE"
-  "#"
-  "&& reload_completed"
-  [(const_int 0)]
   {
-aarch64_gen_atomic_ldop (SET, operands[0], NULL, operands[1],
-operands[2], operands[3]);
-DONE;
+enum memmodel model = memmodel_from_int (INTVAL (operands[3]));
+if (is_mm_relaxed (model))
+  return "swp\t%2, %0, %1";
+else if (is_mm_acquire (model) || is_mm_consume (model))
+  return "swpa\t%2, %0, %1";
+else if (is_mm_release (model))
+  return "swpl\t%2, %0, %1";
+else
+  return "swpal\t%2, %0, %1";
   }
 )
 
@@ -582,28 +585,6 @@
 
 ;; ARMv8.1-A LSE instructions.
 
-;; Atomic swap with memory.
-(define_insn "aarch64_atomic_swp"
- [(set (match_operand:ALLI 0 "register_operand" "+&r")
-   (match_operand:ALLI 1 "aarch64_sync_memory_operand" "+Q"))
-  (set (match_dup 1)
-   (unspec_volatile:ALLI
-[(match_operand:ALLI 2 "register_operand" "r")
- (match_operand:SI 3 "const_int_operand" "")]
-UNSPECV_ATOMIC_SWP))]
-  "TARGET_LSE && reload_completed"
-  {
-enum memmodel model = memmodel_from_int (INTVAL (operands[3]));
-if (is_mm_relaxed (model))
-  return "swp\t%2, %0, %1";
-else if (is_mm_acquire (model) || is_mm_consume (model))
-  return "swpa\t%2, %0, %1";
-else if (is_mm_release 

[PATCH 2/19][GCC-8] aarch64: Simplify LSE cas generation

2020-04-16 Thread Andre Vieira (lists)

The cas insn is a single insn, and if expanded properly need not
be split after reload.  Use the proper inputs for the insn.
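
As a rough model of what the expander now does for TARGET_LSE, assuming
std::atomic as a stand-in for the hardware instruction: the CAS takes the
expected value in a register and overwrites that register with the value
found in memory, so the expected value is copied first and compared
afterwards.  Both helper names below are hypothetical:

```cpp
#include <atomic>
#include <cassert>

// Models a CAS instruction whose expected-value register is also its
// result register.  cas_insn is an illustrative helper, not a GCC function.
inline void cas_insn(std::atomic<int> &mem, int &rval, int newval)
{
  // On failure compare_exchange_strong writes the observed value into rval;
  // on success rval already equals the observed value.
  mem.compare_exchange_strong(rval, newval);
}

// Returns true if the swap happened and stores the observed value.
inline bool compare_and_swap(std::atomic<int> &mem, int oldval, int newval,
                             int *observed)
{
  int rval = oldval;        // copy, so oldval survives the operation
  cas_insn(mem, rval, newval);
  *observed = rval;
  return rval == oldval;    // the compare emitted after the CAS
}
```

If the result of the final comparison is never used, it is ordinary dead code,
which matches the ChangeLog note about emitting the compare during initial
expansion so that it may be deleted.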

2020-04-16  Andre Vieira 

    Backport from mainline.
    2018-10-31  Richard Henderson 

    * config/aarch64/aarch64.c (aarch64_expand_compare_and_swap):
    Force oldval into the rval register for TARGET_LSE; emit the compare
    during initial expansion so that it may be deleted if unused.
    (aarch64_gen_atomic_cas): Remove.
    * config/aarch64/atomics.md (aarch64_compare_and_swap_lse):
    Change =&r to +r for operand 0; use match_dup for operand 2;
    remove is_weak and mod_f operands as unused.  Drop the split
    and merge with...
    (aarch64_atomic_cas): ... this pattern's output; remove.
    (aarch64_compare_and_swap_lse): Similarly.
    (aarch64_atomic_cas): Similarly.

diff --git a/gcc/config/aarch64/aarch64-protos.h 
b/gcc/config/aarch64/aarch64-protos.h
index 
cda2895d28e7496f8fd6c1b365c4bb497b54c323..a03565c3b4e13990dc1a0064f9cbbc38bb109795
 100644
--- a/gcc/config/aarch64/aarch64-protos.h
+++ b/gcc/config/aarch64/aarch64-protos.h
@@ -496,7 +496,6 @@ rtx aarch64_load_tp (rtx);
 
 void aarch64_expand_compare_and_swap (rtx op[]);
 void aarch64_split_compare_and_swap (rtx op[]);
-void aarch64_gen_atomic_cas (rtx, rtx, rtx, rtx, rtx);
 
 bool aarch64_atomic_ldop_supported_p (enum rtx_code);
 void aarch64_gen_atomic_ldop (enum rtx_code, rtx, rtx, rtx, rtx, rtx);
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 
20761578fb6051e600299cd58f245774bd457432..c83a9f7ae78d4ed3da6636fce4d1f57c27048756
 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -14169,17 +14169,19 @@ aarch64_expand_compare_and_swap (rtx operands[])
 {
   rtx bval, rval, mem, oldval, newval, is_weak, mod_s, mod_f, x;
   machine_mode mode, cmp_mode;
-  typedef rtx (*gen_cas_fn) (rtx, rtx, rtx, rtx, rtx, rtx, rtx);
+  typedef rtx (*gen_split_cas_fn) (rtx, rtx, rtx, rtx, rtx, rtx, rtx);
+  typedef rtx (*gen_atomic_cas_fn) (rtx, rtx, rtx, rtx);
   int idx;
-  gen_cas_fn gen;
-  const gen_cas_fn split_cas[] =
+  gen_split_cas_fn split_gen;
+  gen_atomic_cas_fn atomic_gen;
+  const gen_split_cas_fn split_cas[] =
   {
 gen_aarch64_compare_and_swapqi,
 gen_aarch64_compare_and_swaphi,
 gen_aarch64_compare_and_swapsi,
 gen_aarch64_compare_and_swapdi
   };
-  const gen_cas_fn atomic_cas[] =
+  const gen_atomic_cas_fn atomic_cas[] =
   {
 gen_aarch64_compare_and_swapqi_lse,
 gen_aarch64_compare_and_swaphi_lse,
@@ -14238,14 +14240,29 @@ aarch64_expand_compare_and_swap (rtx operands[])
   gcc_unreachable ();
 }
   if (TARGET_LSE)
-gen = atomic_cas[idx];
+{
+  atomic_gen = atomic_cas[idx];
+  /* The CAS insn requires oldval and rval overlap, but we need to
+have a copy of oldval saved across the operation to tell if
+the operation is successful.  */
+  if (mode == QImode || mode == HImode)
+   rval = copy_to_mode_reg (SImode, gen_lowpart (SImode, oldval));
+  else if (reg_overlap_mentioned_p (rval, oldval))
+rval = copy_to_mode_reg (mode, oldval);
+  else
+   emit_move_insn (rval, oldval);
+  emit_insn (atomic_gen (rval, mem, newval, mod_s));
+  aarch64_gen_compare_reg (EQ, rval, oldval);
+}
   else
-gen = split_cas[idx];
-
-  emit_insn (gen (rval, mem, oldval, newval, is_weak, mod_s, mod_f));
+{
+  split_gen = split_cas[idx];
+  emit_insn (split_gen (rval, mem, oldval, newval, is_weak, mod_s, mod_f));
+}
 
   if (mode == QImode || mode == HImode)
-emit_move_insn (operands[1], gen_lowpart (mode, rval));
+rval = gen_lowpart (mode, rval);
+  emit_move_insn (operands[1], rval);
 
   x = gen_rtx_REG (CCmode, CC_REGNUM);
   x = gen_rtx_EQ (SImode, x, const0_rtx);
@@ -14295,42 +14312,6 @@ aarch64_emit_post_barrier (enum memmodel model)
 }
 }
 
-/* Emit an atomic compare-and-swap operation.  RVAL is the destination register
-   for the data in memory.  EXPECTED is the value expected to be in memory.
-   DESIRED is the value to store to memory.  MEM is the memory location.  MODEL
-   is the memory ordering to use.  */
-
-void
-aarch64_gen_atomic_cas (rtx rval, rtx mem,
-   rtx expected, rtx desired,
-   rtx model)
-{
-  rtx (*gen) (rtx, rtx, rtx, rtx);
-  machine_mode mode;
-
-  mode = GET_MODE (mem);
-
-  switch (mode)
-{
-case E_QImode: gen = gen_aarch64_atomic_casqi; break;
-case E_HImode: gen = gen_aarch64_atomic_cashi; break;
-case E_SImode: gen = gen_aarch64_atomic_cassi; break;
-case E_DImode: gen = gen_aarch64_atomic_casdi; break;
-default:
-  gcc_unreachable ();
-}
-
-  /* Move the expected value into the CAS destination register.  */
-  emit_insn (gen_rtx_SET (rval, expected));
-
-  /* Emit the CAS.  */
-  emit_insn (gen (rval, mem, desired, model));
-
-  /* Compare the expected value with the value loaded by the CAS, to establish
- whether the swa

[PATCH 5/19][GCC-8] aarch64: Improve atomic-op lse generation

2020-04-16 Thread Andre Vieira (lists)

Fix constraints; avoid unnecessary split.  Drop the use of the atomic_op
iterator in favor of the ATOMIC_LDOP iterator; this is simpler and more
logical for ldclr aka bic.
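
The remark about "ldclr aka bic" can be illustrated with a small sketch,
assuming a load-and-clear primitive that clears the bits given in its
operand (A & ~B, as in the AARCH64_LDOP_BIC comment in the removed code).
A fetch-and can then be expressed by complementing the operand first.  The
function names are hypothetical and the model is not atomic.

```c
#include <assert.h>
#include <stdint.h>

/* Model of a load-and-clear operation: clears in *MEM the bits that are
   set in MASK and returns the previous contents.  Not atomic.  */
static uint32_t
model_ldclr (uint32_t *mem, uint32_t mask)
{
  assert (mem);
  uint32_t old = *mem;
  *mem = old & ~mask;
  return old;
}

/* Fetch-and built on the clear primitive: A & B == A & ~(~B).  */
static uint32_t
fetch_and_via_clr (uint32_t *mem, uint32_t value)
{
  return model_ldclr (mem, ~value);
}
```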

2020-04-16  Andre Vieira 

    Backport from mainline.
    2018-10-31  Richard Henderson 

    * config/aarch64/aarch64.c (aarch64_emit_bic): Remove.
    (aarch64_atomic_ldop_supported_p): Remove.
    (aarch64_gen_atomic_ldop): Remove.
    * config/aarch64/atomics.md (atomic_):
    Fully expand LSE operations here.
    (atomic_fetch_): Likewise.
    (atomic__fetch): Likewise.
    (aarch64_atomic__lse): Drop atomic_op iterator
    and use ATOMIC_LDOP instead; use register_operand for the input;
    drop the split and emit insns directly.
    (aarch64_atomic_fetch__lse): Likewise.
    (aarch64_atomic__fetch_lse): Remove.
    (aarch64_atomic_load): Remove.

diff --git a/gcc/config/aarch64/aarch64-protos.h 
b/gcc/config/aarch64/aarch64-protos.h
index 
a03565c3b4e13990dc1a0064f9cbbc38bb109795..da68ce0e7d096bf4a512c2b8ef52bf236f8f76f4
 100644
--- a/gcc/config/aarch64/aarch64-protos.h
+++ b/gcc/config/aarch64/aarch64-protos.h
@@ -497,8 +497,6 @@ rtx aarch64_load_tp (rtx);
 void aarch64_expand_compare_and_swap (rtx op[]);
 void aarch64_split_compare_and_swap (rtx op[]);
 
-bool aarch64_atomic_ldop_supported_p (enum rtx_code);
-void aarch64_gen_atomic_ldop (enum rtx_code, rtx, rtx, rtx, rtx, rtx);
 void aarch64_split_atomic_op (enum rtx_code, rtx, rtx, rtx, rtx, rtx, rtx);
 
 bool aarch64_gen_adjusted_ldpstp (rtx *, bool, scalar_mode, RTX_CODE);
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 
bac69474598ff19161b72748505151b0d6185a9b..1068cfd899a759c506e3217e1e2c19cd778b4372
 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -14292,32 +14292,6 @@ aarch64_expand_compare_and_swap (rtx operands[])
   emit_insn (gen_rtx_SET (bval, x));
 }
 
-/* Test whether the target supports using a atomic load-operate instruction.
-   CODE is the operation and AFTER is TRUE if the data in memory after the
-   operation should be returned and FALSE if the data before the operation
-   should be returned.  Returns FALSE if the operation isn't supported by the
-   architecture.  */
-
-bool
-aarch64_atomic_ldop_supported_p (enum rtx_code code)
-{
-  if (!TARGET_LSE)
-return false;
-
-  switch (code)
-{
-case SET:
-case AND:
-case IOR:
-case XOR:
-case MINUS:
-case PLUS:
-  return true;
-default:
-  return false;
-}
-}
-
 /* Emit a barrier, that is appropriate for memory model MODEL, at the end of a
sequence implementing an atomic operation.  */
 
@@ -14435,227 +14409,6 @@ aarch64_split_compare_and_swap (rtx operands[])
 aarch64_emit_post_barrier (model);
 }
 
-/* Emit a BIC instruction.  */
-
-static void
-aarch64_emit_bic (machine_mode mode, rtx dst, rtx s1, rtx s2, int shift)
-{
-  rtx shift_rtx = GEN_INT (shift);
-  rtx (*gen) (rtx, rtx, rtx, rtx);
-
-  switch (mode)
-{
-case E_SImode: gen = gen_and_one_cmpl_lshrsi3; break;
-case E_DImode: gen = gen_and_one_cmpl_lshrdi3; break;
-default:
-  gcc_unreachable ();
-}
-
-  emit_insn (gen (dst, s2, shift_rtx, s1));
-}
-
-/* Operations supported by aarch64_emit_atomic_load_op.  */
-
-enum aarch64_atomic_load_op_code
-{
-  AARCH64_LDOP_PLUS,   /* A + B  */
-  AARCH64_LDOP_XOR,/* A ^ B  */
-  AARCH64_LDOP_OR, /* A | B  */
-  AARCH64_LDOP_BIC /* A & ~B  */
-};
-
-/* Emit an atomic load-operate.  */
-
-static void
-aarch64_emit_atomic_load_op (enum aarch64_atomic_load_op_code code,
-machine_mode mode, rtx dst, rtx src,
-rtx mem, rtx model)
-{
-  typedef rtx (*aarch64_atomic_load_op_fn) (rtx, rtx, rtx, rtx);
-  const aarch64_atomic_load_op_fn plus[] =
-  {
-gen_aarch64_atomic_loadaddqi,
-gen_aarch64_atomic_loadaddhi,
-gen_aarch64_atomic_loadaddsi,
-gen_aarch64_atomic_loadadddi
-  };
-  const aarch64_atomic_load_op_fn eor[] =
-  {
-gen_aarch64_atomic_loadeorqi,
-gen_aarch64_atomic_loadeorhi,
-gen_aarch64_atomic_loadeorsi,
-gen_aarch64_atomic_loadeordi
-  };
-  const aarch64_atomic_load_op_fn ior[] =
-  {
-gen_aarch64_atomic_loadsetqi,
-gen_aarch64_atomic_loadsethi,
-gen_aarch64_atomic_loadsetsi,
-gen_aarch64_atomic_loadsetdi
-  };
-  const aarch64_atomic_load_op_fn bic[] =
-  {
-gen_aarch64_atomic_loadclrqi,
-gen_aarch64_atomic_loadclrhi,
-gen_aarch64_atomic_loadclrsi,
-gen_aarch64_atomic_loadclrdi
-  };
-  aarch64_atomic_load_op_fn gen;
-  int idx = 0;
-
-  switch (mode)
-{
-case E_QImode: idx = 0; break;
-case E_HImode: idx = 1; break;
-case E_SImode: idx = 2; break;
-case E_DImode: idx = 3; break;
-default:
-  gcc_unreachable ();
-}
-
-  switch (code)
-{
-case AARCH64_LDOP_PLUS: gen = plus[idx]; break;
-case AARCH64_LDOP_XOR: gen = eor[idx]; break;
-case AARCH64_LDOP_OR: gen = ior[idx]; break;
-case AARC

[PATCH 1/19][GCC-8] aarch64: Fix up aarch64_compare_and_swaphi pattern [PR94368]

2020-04-16 Thread Andre Vieira (lists)

gcc/ChangeLog:
2020-04-16  Andre Vieira 

    Backport from mainline.
    2018-07-16  Ramana Radhakrishnan 

    * config/aarch64/atomics.md (aarch64_store_exclusive): Add
    early clobber.

diff --git a/gcc/config/aarch64/atomics.md b/gcc/config/aarch64/atomics.md
index 
686e39ff2ee5940e9e93d0c2b802b46ff9f2c4e4..fba5ec6db5832a184b0323e62041f9c473761bae
 100644
--- a/gcc/config/aarch64/atomics.md
+++ b/gcc/config/aarch64/atomics.md
@@ -530,7 +530,7 @@
 )
 
 (define_insn "aarch64_store_exclusive"
-  [(set (match_operand:SI 0 "register_operand" "=r")
+  [(set (match_operand:SI 0 "register_operand" "=&r")
 (unspec_volatile:SI [(const_int 0)] UNSPECV_SX))
(set (match_operand:ALLI 1 "aarch64_sync_memory_operand" "=Q")
 (unspec_volatile:ALLI


[PATCH 3/19] aarch64: Improve cas generation

2020-04-16 Thread Andre Vieira (lists)

Do not zero-extend the input to the cas for subword operations;
instead, use the appropriate zero-extending compare insns.
Correct the predicates and constraints for immediate expected operand.
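
A minimal sketch of the constant case described above, assuming the value
loaded from memory sits zero-extended in a 32-bit register and the expected
value is a constant that may carry sign bits above the subword.  It mirrors
the masking with GET_MODE_MASK in aarch64_gen_compare_reg_maybe_ze; the
helper names mask_expected and subword_equal are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Reduce a constant expected value to a subword of BITS bits so that it
   can be compared against a zero-extended load.  */
static uint32_t
mask_expected (int64_t expected, unsigned bits)
{
  assert (bits == 8 || bits == 16);
  return (uint32_t) ((uint64_t) expected & ((UINT64_C (1) << bits) - 1));
}

/* Compare a zero-extended loaded subword with the expected constant.  */
static bool
subword_equal (uint32_t loaded_zext, int64_t expected, unsigned bits)
{
  return loaded_zext == mask_expected (expected, bits);
}
```

Without the mask, an expected value of -1 for a byte would never compare
equal to the zero-extended 0xff read from memory.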

2020-04-16  Andre Vieira 

    Backport from mainline.
    2018-10-31  Richard Henderson 

    * config/aarch64/aarch64.c (aarch64_gen_compare_reg_maybe_ze): New.
    (aarch64_split_compare_and_swap): Use it.
    (aarch64_expand_compare_and_swap): Likewise.  Remove convert_modes;
    test oldval against the proper predicate.
    * config/aarch64/atomics.md (atomic_compare_and_swap):
    Use nonmemory_operand for expected.
    (cas_short_expected_pred): New.
    (aarch64_compare_and_swap): Use it; use "rn" not "rI" to match.
    (aarch64_compare_and_swap): Use "rn" not "rI" for expected.
    * config/aarch64/predicates.md (aarch64_plushi_immediate): New.
    (aarch64_plushi_operand): New.

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 
c83a9f7ae78d4ed3da6636fce4d1f57c27048756..b6a6e314153ecf4a7ae1b83cfb64e6192197edc5
 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -1524,6 +1524,33 @@ aarch64_gen_compare_reg (RTX_CODE code, rtx x, rtx y)
   return cc_reg;
 }
 
+/* Similarly, but maybe zero-extend Y if Y_MODE < SImode.  */
+
+static rtx
+aarch64_gen_compare_reg_maybe_ze (RTX_CODE code, rtx x, rtx y,
+  machine_mode y_mode)
+{
+  if (y_mode == E_QImode || y_mode == E_HImode)
+{
+  if (CONST_INT_P (y))
+   y = GEN_INT (INTVAL (y) & GET_MODE_MASK (y_mode));
+  else
+   {
+ rtx t, cc_reg;
+ machine_mode cc_mode;
+
+ t = gen_rtx_ZERO_EXTEND (SImode, y);
+ t = gen_rtx_COMPARE (CC_SWPmode, t, x);
+ cc_mode = CC_SWPmode;
+ cc_reg = gen_rtx_REG (cc_mode, CC_REGNUM);
+ emit_set_insn (cc_reg, t);
+ return cc_reg;
+   }
+}
+
+  return aarch64_gen_compare_reg (code, x, y);
+}
+
 /* Build the SYMBOL_REF for __tls_get_addr.  */
 
 static GTY(()) rtx tls_get_addr_libfunc;
@@ -14167,20 +14194,11 @@ aarch64_emit_unlikely_jump (rtx insn)
 void
 aarch64_expand_compare_and_swap (rtx operands[])
 {
-  rtx bval, rval, mem, oldval, newval, is_weak, mod_s, mod_f, x;
-  machine_mode mode, cmp_mode;
-  typedef rtx (*gen_split_cas_fn) (rtx, rtx, rtx, rtx, rtx, rtx, rtx);
+  rtx bval, rval, mem, oldval, newval, is_weak, mod_s, mod_f, x, cc_reg;
+  machine_mode mode, r_mode;
   typedef rtx (*gen_atomic_cas_fn) (rtx, rtx, rtx, rtx);
   int idx;
-  gen_split_cas_fn split_gen;
   gen_atomic_cas_fn atomic_gen;
-  const gen_split_cas_fn split_cas[] =
-  {
-gen_aarch64_compare_and_swapqi,
-gen_aarch64_compare_and_swaphi,
-gen_aarch64_compare_and_swapsi,
-gen_aarch64_compare_and_swapdi
-  };
   const gen_atomic_cas_fn atomic_cas[] =
   {
 gen_aarch64_compare_and_swapqi_lse,
@@ -14198,36 +14216,19 @@ aarch64_expand_compare_and_swap (rtx operands[])
   mod_s = operands[6];
   mod_f = operands[7];
   mode = GET_MODE (mem);
-  cmp_mode = mode;
 
   /* Normally the succ memory model must be stronger than fail, but in the
  unlikely event of fail being ACQUIRE and succ being RELEASE we need to
  promote succ to ACQ_REL so that we don't lose the acquire semantics.  */
-
   if (is_mm_acquire (memmodel_from_int (INTVAL (mod_f)))
   && is_mm_release (memmodel_from_int (INTVAL (mod_s))))
 mod_s = GEN_INT (MEMMODEL_ACQ_REL);
 
-  switch (mode)
+  r_mode = mode;
+  if (mode == QImode || mode == HImode)
 {
-case E_QImode:
-case E_HImode:
-  /* For short modes, we're going to perform the comparison in SImode,
-so do the zero-extension now.  */
-  cmp_mode = SImode;
-  rval = gen_reg_rtx (SImode);
-  oldval = convert_modes (SImode, mode, oldval, true);
-  /* Fall through.  */
-
-case E_SImode:
-case E_DImode:
-  /* Force the value into a register if needed.  */
-  if (!aarch64_plus_operand (oldval, mode))
-   oldval = force_reg (cmp_mode, oldval);
-  break;
-
-default:
-  gcc_unreachable ();
+  r_mode = SImode;
+  rval = gen_reg_rtx (r_mode);
 }
 
   switch (mode)
@@ -14245,27 +14246,49 @@ aarch64_expand_compare_and_swap (rtx operands[])
   /* The CAS insn requires oldval and rval overlap, but we need to
 have a copy of oldval saved across the operation to tell if
 the operation is successful.  */
-  if (mode == QImode || mode == HImode)
-   rval = copy_to_mode_reg (SImode, gen_lowpart (SImode, oldval));
-  else if (reg_overlap_mentioned_p (rval, oldval))
-rval = copy_to_mode_reg (mode, oldval);
+  if (reg_overlap_mentioned_p (rval, oldval))
+rval = copy_to_mode_reg (r_mode, oldval);
   else
-   emit_move_insn (rval, oldval);
+   emit_move_insn (rval, gen_lowpart (r_mode, oldval));
+
   emit_insn (atomic_gen (rval, mem, newval, mod_s));
-  aarch64_gen_compare_reg (EQ, rval, oldval);
+
+  cc

[PATCH 9/19][GCC-8] aarch64: Tidy aarch64_split_compare_and_swap

2020-04-16 Thread Andre Vieira (lists)

2020-04-16  Andre Vieira 

    Backport from mainline.
    2019-09-19  Richard Henderson 

    * config/aarch64/aarch64.c (aarch64_split_compare_and_swap): Unify
    some code paths; use aarch64_gen_compare_reg instead of open-coding.

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 
09e78313489d266daaca9eba3647f150534893f6..2df5bf3db97d9362155c3c8d9c9d7f14c41b9520
 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -14359,13 +14359,11 @@ aarch64_split_compare_and_swap (rtx operands[])
   /* Split after prolog/epilog to avoid interactions with shrinkwrapping.  */
   gcc_assert (epilogue_completed);
 
-  rtx rval, mem, oldval, newval, scratch;
+  rtx rval, mem, oldval, newval, scratch, x, model_rtx;
   machine_mode mode;
   bool is_weak;
   rtx_code_label *label1, *label2;
-  rtx x, cond;
   enum memmodel model;
-  rtx model_rtx;
 
   rval = operands[0];
   mem = operands[1];
@@ -14386,7 +14384,7 @@ aarch64_split_compare_and_swap (rtx operands[])
CBNZ scratch, .label1
 .label2:
CMP rval, 0.  */
-  bool strong_zero_p = !is_weak && oldval == const0_rtx && mode != TImode;
+  bool strong_zero_p = (!is_weak && oldval == const0_rtx && mode != TImode);
 
   label1 = NULL;
   if (!is_weak)
@@ -14399,26 +14397,20 @@ aarch64_split_compare_and_swap (rtx operands[])
   /* The initial load can be relaxed for a __sync operation since a final
  barrier will be emitted to stop code hoisting.  */
   if (is_mm_sync (model))
-aarch64_emit_load_exclusive (mode, rval, mem,
-GEN_INT (MEMMODEL_RELAXED));
+aarch64_emit_load_exclusive (mode, rval, mem, GEN_INT (MEMMODEL_RELAXED));
   else
 aarch64_emit_load_exclusive (mode, rval, mem, model_rtx);
 
   if (strong_zero_p)
-{
-  x = gen_rtx_NE (VOIDmode, rval, const0_rtx);
-  x = gen_rtx_IF_THEN_ELSE (VOIDmode, x,
-   gen_rtx_LABEL_REF (Pmode, label2), pc_rtx);
-  aarch64_emit_unlikely_jump (gen_rtx_SET (pc_rtx, x));
-}
+x = gen_rtx_NE (VOIDmode, rval, const0_rtx);
   else
 {
-  cond = aarch64_gen_compare_reg_maybe_ze (NE, rval, oldval, mode);
-  x = gen_rtx_NE (VOIDmode, cond, const0_rtx);
-  x = gen_rtx_IF_THEN_ELSE (VOIDmode, x,
-   gen_rtx_LABEL_REF (Pmode, label2), pc_rtx);
-  aarch64_emit_unlikely_jump (gen_rtx_SET (pc_rtx, x));
+  rtx cc_reg = aarch64_gen_compare_reg_maybe_ze (NE, rval, oldval, mode);
+  x = gen_rtx_NE (VOIDmode, cc_reg, const0_rtx);
 }
+  x = gen_rtx_IF_THEN_ELSE (VOIDmode, x,
+   gen_rtx_LABEL_REF (Pmode, label2), pc_rtx);
+  aarch64_emit_unlikely_jump (gen_rtx_SET (pc_rtx, x));
 
   aarch64_emit_store_exclusive (mode, scratch, mem, newval, model_rtx);
 
@@ -14430,22 +14422,16 @@ aarch64_split_compare_and_swap (rtx operands[])
   aarch64_emit_unlikely_jump (gen_rtx_SET (pc_rtx, x));
 }
   else
-{
-  cond = gen_rtx_REG (CCmode, CC_REGNUM);
-  x = gen_rtx_COMPARE (CCmode, scratch, const0_rtx);
-  emit_insn (gen_rtx_SET (cond, x));
-}
+aarch64_gen_compare_reg (NE, scratch, const0_rtx);
 
   emit_label (label2);
+
   /* If we used a CBNZ in the exchange loop emit an explicit compare with RVAL
  to set the condition flags.  If this is not used it will be removed by
  later passes.  */
   if (strong_zero_p)
-{
-  cond = gen_rtx_REG (CCmode, CC_REGNUM);
-  x = gen_rtx_COMPARE (CCmode, rval, const0_rtx);
-  emit_insn (gen_rtx_SET (cond, x));
-}
+aarch64_gen_compare_reg (NE, rval, const0_rtx);
+
   /* Emit any final barrier needed for a __sync operation.  */
   if (is_mm_sync (model))
 aarch64_emit_post_barrier (model);


[PATCH 12/19][GCC-8] aarch64: Implement -moutline-atomics

2020-04-16 Thread Andre Vieira (lists)

2020-04-16  Andre Vieira 

    Backport from mainline
    2019-09-19  Richard Henderson 

    * config/aarch64/aarch64.opt (-moutline-atomics): New.
    * config/aarch64/aarch64.c (aarch64_atomic_ool_func): New.
    (aarch64_ool_cas_names, aarch64_ool_swp_names): New.
    (aarch64_ool_ldadd_names, aarch64_ool_ldset_names): New.
    (aarch64_ool_ldclr_names, aarch64_ool_ldeor_names): New.
    (aarch64_expand_compare_and_swap): Honor TARGET_OUTLINE_ATOMICS.
    * config/aarch64/atomics.md (atomic_exchange): Likewise.
    (atomic_): Likewise.
    (atomic_fetch_): Likewise.
    (atomic__fetch): Likewise.
    * doc/invoke.texi: Document -moutline-atomics.

    * gcc.target/aarch64/atomic-op-acq_rel.c: Use -mno-outline-atomics.
    * gcc.target/aarch64/atomic-comp-swap-release-acquire.c: Likewise.
    * gcc.target/aarch64/atomic-op-acquire.c: Likewise.
    * gcc.target/aarch64/atomic-op-char.c: Likewise.
    * gcc.target/aarch64/atomic-op-consume.c: Likewise.
    * gcc.target/aarch64/atomic-op-imm.c: Likewise.
    * gcc.target/aarch64/atomic-op-int.c: Likewise.
    * gcc.target/aarch64/atomic-op-long.c: Likewise.
    * gcc.target/aarch64/atomic-op-relaxed.c: Likewise.
    * gcc.target/aarch64/atomic-op-release.c: Likewise.
    * gcc.target/aarch64/atomic-op-seq_cst.c: Likewise.
    * gcc.target/aarch64/atomic-op-short.c: Likewise.
    * gcc.target/aarch64/atomic_cmp_exchange_zero_reg_1.c: Likewise.
    * gcc.target/aarch64/atomic_cmp_exchange_zero_strong_1.c: Likewise.
    * gcc.target/aarch64/sync-comp-swap.c: Likewise.
    * gcc.target/aarch64/sync-op-acquire.c: Likewise.
    * gcc.target/aarch64/sync-op-full.c: Likewise.

diff --git a/gcc/config/aarch64/aarch64-protos.h 
b/gcc/config/aarch64/aarch64-protos.h
index 
da68ce0e7d096bf4a512c2b8ef52bf236f8f76f4..0f1dc75a27f3fdd2218e57811e208fc28139ac4a
 100644
--- a/gcc/config/aarch64/aarch64-protos.h
+++ b/gcc/config/aarch64/aarch64-protos.h
@@ -548,4 +548,17 @@ rtl_opt_pass *make_pass_fma_steering (gcc::context *ctxt);
 
 poly_uint64 aarch64_regmode_natural_size (machine_mode);
 
+struct atomic_ool_names
+{
+const char *str[5][4];
+};
+
+rtx aarch64_atomic_ool_func(machine_mode mode, rtx model_rtx,
+   const atomic_ool_names *names);
+extern const atomic_ool_names aarch64_ool_swp_names;
+extern const atomic_ool_names aarch64_ool_ldadd_names;
+extern const atomic_ool_names aarch64_ool_ldset_names;
+extern const atomic_ool_names aarch64_ool_ldclr_names;
+extern const atomic_ool_names aarch64_ool_ldeor_names;
+
 #endif /* GCC_AARCH64_PROTOS_H */
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 
2df5bf3db97d9362155c3c8d9c9d7f14c41b9520..21124b5a3479dd388eb767402e080e2181153467
 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -14227,6 +14227,82 @@ aarch64_emit_unlikely_jump (rtx insn)
   add_reg_br_prob_note (jump, profile_probability::very_unlikely ());
 }
 
+/* We store the names of the various atomic helpers in a 5x4 array.
+   Return the libcall function given MODE, MODEL and NAMES.  */
+
+rtx
+aarch64_atomic_ool_func(machine_mode mode, rtx model_rtx,
+   const atomic_ool_names *names)
+{
+  memmodel model = memmodel_base (INTVAL (model_rtx));
+  int mode_idx, model_idx;
+
+  switch (mode)
+{
+case E_QImode:
+  mode_idx = 0;
+  break;
+case E_HImode:
+  mode_idx = 1;
+  break;
+case E_SImode:
+  mode_idx = 2;
+  break;
+case E_DImode:
+  mode_idx = 3;
+  break;
+case E_TImode:
+  mode_idx = 4;
+  break;
+default:
+  gcc_unreachable ();
+}
+
+  switch (model)
+{
+case MEMMODEL_RELAXED:
+  model_idx = 0;
+  break;
+case MEMMODEL_CONSUME:
+case MEMMODEL_ACQUIRE:
+  model_idx = 1;
+  break;
+case MEMMODEL_RELEASE:
+  model_idx = 2;
+  break;
+case MEMMODEL_ACQ_REL:
+case MEMMODEL_SEQ_CST:
+  model_idx = 3;
+  break;
+default:
+  gcc_unreachable ();
+}
+
+  return init_one_libfunc_visibility (names->str[mode_idx][model_idx],
+ VISIBILITY_HIDDEN);
+}
+
+#define DEF0(B, N) \
+  { "__aarch64_" #B #N "_relax", \
+"__aarch64_" #B #N "_acq", \
+"__aarch64_" #B #N "_rel", \
+"__aarch64_" #B #N "_acq_rel" }
+
+#define DEF4(B)  DEF0(B, 1), DEF0(B, 2), DEF0(B, 4), DEF0(B, 8), \
+{ NULL, NULL, NULL, NULL }
+#define DEF5(B)  DEF0(B, 1), DEF0(B, 2), DEF0(B, 4), DEF0(B, 8), DEF0(B, 16)
+
+static const atomic_ool_names aarch64_ool_cas_names = { { DEF5(cas) } };
+const atomic_ool_names aarch64_ool_swp_names = { { DEF4(swp) } };
+const atomic_ool_names aarch64_ool_ldadd_names = { { DEF4(ldadd) } };
+const atomic_ool_names aarch64_ool_ldset_names = { { DEF4(ldset) } };
+const atomic_ool_names aarch64_ool_ldclr_names = { { DEF4(ldclr) } };
+const atomic_ool_names aarch64_ool_ldeor_names = { { DEF4(ldeor) } };
+
+#undef DEF0
+#undef DEF4
+#undef DEF5
+
 /* Expand a compar
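
The helper-name table in the patch above can be tried out in isolation.
The following is a sketch only: it reuses the DEF0/DEF4 macro shapes from
the diff, but the struct name ool_names, the enum ool_order and the lookup
function ool_name (keyed by operand size in bytes instead of machine_mode)
are hypothetical stand-ins.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>  /* strcmp, handy when checking the generated names.  */

/* Five size rows (1, 2, 4, 8, 16 bytes) by four memory-order columns.  */
struct ool_names
{
  const char *str[5][4];
};

#define DEF0(B, N) \
  { "__aarch64_" #B #N "_relax", \
    "__aarch64_" #B #N "_acq", \
    "__aarch64_" #B #N "_rel", \
    "__aarch64_" #B #N "_acq_rel" }

/* Operations without a 16-byte variant get a row of NULLs.  */
#define DEF4(B)  DEF0(B, 1), DEF0(B, 2), DEF0(B, 4), DEF0(B, 8), \
  { NULL, NULL, NULL, NULL }

static const struct ool_names swp_names = { { DEF4(swp) } };

enum ool_order { OOL_RELAX, OOL_ACQ, OOL_REL, OOL_ACQ_REL };

/* Return the helper name for SIZE bytes and ORDER, or NULL if there is
   no such helper.  */
static const char *
ool_name (const struct ool_names *names, unsigned size, enum ool_order order)
{
  int row;

  assert (names);
  switch (size)
    {
    case 1: row = 0; break;
    case 2: row = 1; break;
    case 4: row = 2; break;
    case 8: row = 3; break;
    case 16: row = 4; break;
    default: return NULL;
    }
  return names->str[row][order];
}
```

With this table, ool_name (&swp_names, 4, OOL_ACQ) yields the string
"__aarch64_swp4_acq", and the 16-byte row of swp_names is empty.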

[PATCH 11/19][GCC-8] Add visibility to libfunc constructors

2020-04-16 Thread Andre Vieira (lists)

2020-04-16  Andre Vieira 

    Backport from mainline.
    2018-10-31  Richard Henderson 

    * optabs-libfuncs.c (build_libfunc_function_visibility):
    New, split out from...
    (build_libfunc_function): ... here.
    (init_one_libfunc_visibility): New, split out from ...
    (init_one_libfunc): ... here.

diff --git a/gcc/optabs-libfuncs.h b/gcc/optabs-libfuncs.h
index 
0669ea1fdd7dc666d28fc0407a2288de86b3918b..cf39da36887516193aa789446ef0b6a7c24fb1ef
 100644
--- a/gcc/optabs-libfuncs.h
+++ b/gcc/optabs-libfuncs.h
@@ -63,7 +63,9 @@ void gen_satfract_conv_libfunc (convert_optab, const char *,
 void gen_satfractuns_conv_libfunc (convert_optab, const char *,
   machine_mode, machine_mode);
 
+tree build_libfunc_function_visibility (const char *, symbol_visibility);
 tree build_libfunc_function (const char *);
+rtx init_one_libfunc_visibility (const char *, symbol_visibility);
 rtx init_one_libfunc (const char *);
 rtx set_user_assembler_libfunc (const char *, const char *);
 
diff --git a/gcc/optabs-libfuncs.c b/gcc/optabs-libfuncs.c
index 
bd0df8baa3711febcbdf2745588d5d43519af72b..73a28e9ca7a1e5b1564861071e0923d8b8219d25
 100644
--- a/gcc/optabs-libfuncs.c
+++ b/gcc/optabs-libfuncs.c
@@ -719,10 +719,10 @@ struct libfunc_decl_hasher : ggc_ptr_hash
 /* A table of previously-created libfuncs, hashed by name.  */
 static GTY (()) hash_table *libfunc_decls;
 
-/* Build a decl for a libfunc named NAME.  */
+/* Build a decl for a libfunc named NAME with visibility VIS.  */
 
 tree
-build_libfunc_function (const char *name)
+build_libfunc_function_visibility (const char *name, symbol_visibility vis)
 {
   /* ??? We don't have any type information; pretend this is "int foo ()".  */
   tree decl = build_decl (UNKNOWN_LOCATION, FUNCTION_DECL,
@@ -731,7 +731,7 @@ build_libfunc_function (const char *name)
   DECL_EXTERNAL (decl) = 1;
   TREE_PUBLIC (decl) = 1;
   DECL_ARTIFICIAL (decl) = 1;
-  DECL_VISIBILITY (decl) = VISIBILITY_DEFAULT;
+  DECL_VISIBILITY (decl) = vis;
   DECL_VISIBILITY_SPECIFIED (decl) = 1;
   gcc_assert (DECL_ASSEMBLER_NAME (decl));
 
@@ -742,11 +742,19 @@ build_libfunc_function (const char *name)
   return decl;
 }
 
+/* Build a decl for a libfunc named NAME.  */
+
+tree
+build_libfunc_function (const char *name)
+{
+  return build_libfunc_function_visibility (name, VISIBILITY_DEFAULT);
+}
+
 /* Return a libfunc for NAME, creating one if we don't already have one.
-   The returned rtx is a SYMBOL_REF.  */
+   The decl is given visibility VIS.  The returned rtx is a SYMBOL_REF.  */
 
 rtx
-init_one_libfunc (const char *name)
+init_one_libfunc_visibility (const char *name, symbol_visibility vis)
 {
   tree id, decl;
   hashval_t hash;
@@ -763,12 +771,18 @@ init_one_libfunc (const char *name)
 {
   /* Create a new decl, so that it can be passed to
 targetm.encode_section_info.  */
-  decl = build_libfunc_function (name);
+  decl = build_libfunc_function_visibility (name, vis);
   *slot = decl;
 }
   return XEXP (DECL_RTL (decl), 0);
 }
 
+rtx
+init_one_libfunc (const char *name)
+{
+  return init_one_libfunc_visibility (name, VISIBILITY_DEFAULT);
+}
+
 /* Adjust the assembler name of libfunc NAME to ASMSPEC.  */
 
 rtx


[PATCH 14/19][GCC-8] aarch64: Fix store-exclusive in load-operate LSE helpers

2020-04-16 Thread Andre Vieira (lists)

2020-04-16  Andre Vieira 

    Backport from mainline
    2019-09-25  Richard Henderson 

    PR target/91834
    * config/aarch64/lse.S (LDNM): Ensure STXR output does not
    overlap the inputs.

diff --git a/libgcc/config/aarch64/lse.S b/libgcc/config/aarch64/lse.S
index 
a5f6673596c73c497156a6f128799cc43b400504..c7979382ad7770b61bb1c64d32ba2395963a9d7a
 100644
--- a/libgcc/config/aarch64/lse.S
+++ b/libgcc/config/aarch64/lse.S
@@ -227,8 +227,8 @@ STARTFN NAME(LDNM)
 8: mov s(tmp0), s(0)
 0: LDXR s(0), [x1]
OP  s(tmp1), s(0), s(tmp0)
-   STXR w(tmp1), s(tmp1), [x1]
-   cbnz w(tmp1), 0b
+   STXR w(tmp2), s(tmp1), [x1]
+   cbnz w(tmp2), 0b
ret
 
 ENDFN  NAME(LDNM)


[PATCH 10/19][GCC-8] aarch64: Add out-of-line functions for LSE atomics

2020-04-16 Thread Andre Vieira (lists)

This is the libgcc part of the interface -- providing the functions.
Rationale is provided at the top of libgcc/config/aarch64/lse.S.
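
As a rough illustration of the idea, here is a sketch in portable C11,
assuming each out-of-line helper tests a gating flag on entry and either
uses a single atomic operation or falls back to a retry loop.  The flag
have_fast_atomics stands in for __aarch64_have_lse_atomics, and all
function names are hypothetical; the real helpers are written in assembly.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for the gating symbol.  In the patch the real symbol is set
   by a constructor from the HWCAP bits.  */
static bool have_fast_atomics;

/* Path taken when the single-instruction form is available.  */
static uint32_t
fetch_add_single (_Atomic uint32_t *p, uint32_t v)
{
  return atomic_fetch_add_explicit (p, v, memory_order_relaxed);
}

/* Fallback: retry loop, analogous to a load-exclusive/store-exclusive
   sequence.  */
static uint32_t
fetch_add_loop (_Atomic uint32_t *p, uint32_t v)
{
  uint32_t old = atomic_load_explicit (p, memory_order_relaxed);
  while (!atomic_compare_exchange_weak_explicit (p, &old, old + v,
                                                 memory_order_relaxed,
                                                 memory_order_relaxed))
    ;
  return old;
}

/* Out-of-line entry point: one runtime test selects the implementation.  */
static uint32_t
ool_fetch_add (_Atomic uint32_t *p, uint32_t v)
{
  assert (p);
  return have_fast_atomics ? fetch_add_single (p, v) : fetch_add_loop (p, v);
}
```

Both paths return the previous value and leave the same result in memory,
so callers do not need to know which one was taken.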

2020-04-16  Andre Vieira 

    Backport from mainline
    2019-09-19  Richard Henderson 

    * config/aarch64/lse-init.c: New file.
    * config/aarch64/lse.S: New file.
    * config/aarch64/t-lse: New file.
    * config.host: Add t-lse to all aarch64 tuples.

diff --git a/libgcc/config.host b/libgcc/config.host
index 
b12c86267dac9da8da9e1ab4123d5171c3e07f40..e436ade1a68c6cd918d2f370b14d61682cb9fd59
 100644
--- a/libgcc/config.host
+++ b/libgcc/config.host
@@ -337,23 +337,27 @@ aarch64*-*-elf | aarch64*-*-rtems*)
extra_parts="$extra_parts crtbegin.o crtend.o crti.o crtn.o"
extra_parts="$extra_parts crtfastmath.o"
tmake_file="${tmake_file} ${cpu_type}/t-aarch64"
+   tmake_file="${tmake_file} ${cpu_type}/t-lse t-slibgcc-libgcc"
tmake_file="${tmake_file} ${cpu_type}/t-softfp t-softfp t-crtfm"
md_unwind_header=aarch64/aarch64-unwind.h
;;
 aarch64*-*-freebsd*)
extra_parts="$extra_parts crtfastmath.o"
tmake_file="${tmake_file} ${cpu_type}/t-aarch64"
+   tmake_file="${tmake_file} ${cpu_type}/t-lse t-slibgcc-libgcc"
tmake_file="${tmake_file} ${cpu_type}/t-softfp t-softfp t-crtfm"
md_unwind_header=aarch64/freebsd-unwind.h
;;
 aarch64*-*-fuchsia*)
tmake_file="${tmake_file} ${cpu_type}/t-aarch64"
+   tmake_file="${tmake_file} ${cpu_type}/t-lse t-slibgcc-libgcc"
tmake_file="${tmake_file} ${cpu_type}/t-softfp t-softfp"
;;
 aarch64*-*-linux*)
extra_parts="$extra_parts crtfastmath.o"
md_unwind_header=aarch64/linux-unwind.h
tmake_file="${tmake_file} ${cpu_type}/t-aarch64"
+   tmake_file="${tmake_file} ${cpu_type}/t-lse t-slibgcc-libgcc"
tmake_file="${tmake_file} ${cpu_type}/t-softfp t-softfp t-crtfm"
;;
 alpha*-*-linux*)
diff --git a/libgcc/config/aarch64/lse-init.c b/libgcc/config/aarch64/lse-init.c
new file mode 100644
index 
..33d2914747994a1e07dcae906f0352e64045ab02
--- /dev/null
+++ b/libgcc/config/aarch64/lse-init.c
@@ -0,0 +1,45 @@
+/* Out-of-line LSE atomics for AArch64 architecture, Init.
+   Copyright (C) 2019 Free Software Foundation, Inc.
+   Contributed by Linaro Ltd.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+for more details.
+
+Under Section 7 of GPL version 3, you are granted additional
+permissions described in the GCC Runtime Library Exception, version
+3.1, as published by the Free Software Foundation.
+
+You should have received a copy of the GNU General Public License and
+a copy of the GCC Runtime Library Exception along with this program;
+see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+.  */
+
+/* Define the symbol gating the LSE implementations.  */
+_Bool __aarch64_have_lse_atomics
+  __attribute__((visibility("hidden"), nocommon));
+
+/* Disable initialization of __aarch64_have_lse_atomics during bootstrap.  */
+#ifndef inhibit_libc
+# include 
+
+/* Disable initialization if the system headers are too old.  */
+# if defined(AT_HWCAP) && defined(HWCAP_ATOMICS)
+
+static void __attribute__((constructor))
+init_have_lse_atomics (void)
+{
+  unsigned long hwcap = getauxval (AT_HWCAP);
+  __aarch64_have_lse_atomics = (hwcap & HWCAP_ATOMICS) != 0;
+}
+
+# endif /* HWCAP */
+#endif /* inhibit_libc */
diff --git a/libgcc/config/aarch64/lse.S b/libgcc/config/aarch64/lse.S
new file mode 100644
index 
..a5f6673596c73c497156a6f128799cc43b400504
--- /dev/null
+++ b/libgcc/config/aarch64/lse.S
@@ -0,0 +1,235 @@
+/* Out-of-line LSE atomics for AArch64 architecture.
+   Copyright (C) 2019 Free Software Foundation, Inc.
+   Contributed by Linaro Ltd.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+for more details.
+
+Under Section 7 of GPL version 3, you are granted additional
+permissions described in the GCC Runtime Library Exception, version
+3.1, as published by the Free Software Foundation.
+
+You should have received a copy of the GNU General Public Li

[PATCH 13/19][GCC-8] Aarch64: Fix shrinkwrapping interactions with atomics

2020-04-16 Thread Andre Vieira (lists)

2020-04-16  Andre Vieira 

    Backport from mainline
    2020-01-17  Wilco Dijkstra 

    PR target/92692
    * config/aarch64/atomics.md (aarch64_compare_and_swap):
    Use epilogue_completed rather than reload_completed.

diff --git a/gcc/config/aarch64/atomics.md b/gcc/config/aarch64/atomics.md
index 
28a1dbc4231009333c2e766d9d3aead54a491631..0ee8d2efac05877d610981b719bd02afdf93a832
 100644
--- a/gcc/config/aarch64/atomics.md
+++ b/gcc/config/aarch64/atomics.md
@@ -104,7 +104,7 @@
(clobber (match_scratch:SI 7 "=&r"))]
   ""
   "#"
-  "&& reload_completed"
+  "&& epilogue_completed"
   [(const_int 0)]
   {
 aarch64_split_compare_and_swap (operands);


[PATCH 8/19][GCC-8] aarch64: Implement TImode compare-and-swap

2020-04-16 Thread Andre Vieira (lists)

2020-04-16  Andre Vieira 

    Backport from mainline.
    2019-09-19  Richard Henderson 

    * config/aarch64/aarch64.c (aarch64_gen_compare_reg): Add support
    for NE comparison of TImode values.
    (aarch64_emit_load_exclusive): Add support for TImode.
    (aarch64_emit_store_exclusive): Likewise.
    (aarch64_split_compare_and_swap): Disable strong_zero_p for TImode.
    * config/aarch64/atomics.md (atomic_compare_and_swapti):
    Change iterator from ALLI to ALLI_TI.
    (atomic_compare_and_swapti): New.
    (atomic_compare_and_swapti): New.
    (aarch64_load_exclusive_pair): New.
    (aarch64_store_exclusive_pair): New.
    * config/aarch64/iterators.md (ALLI_TI): New iterator.
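
The NE comparison of TImode values is done on the two 64-bit subwords.  A
sketch of the logic in plain C, assuming a hypothetical two-word struct
u128 in place of a TImode rtx: the low words are compared first, and the
high words only decide the outcome when the low words matched.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical two-word value standing in for a 128-bit operand.  */
struct u128
{
  uint64_t lo, hi;
};

/* Return true if *X and *Y differ.  */
static bool
u128_ne (const struct u128 *x, const struct u128 *y)
{
  assert (x && y);
  /* First compare: low words.  */
  bool lo_equal = x->lo == y->lo;
  /* Second, conditional compare: the high words only matter when the
     low words were equal.  */
  return !(lo_equal && x->hi == y->hi);
}
```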

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 
317571e018c4f96046799675e042cdfaabb5b94a..09e78313489d266daaca9eba3647f150534893f6
 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -1517,10 +1517,33 @@ emit_set_insn (rtx x, rtx y)
 rtx
 aarch64_gen_compare_reg (RTX_CODE code, rtx x, rtx y)
 {
-  machine_mode mode = SELECT_CC_MODE (code, x, y);
-  rtx cc_reg = gen_rtx_REG (mode, CC_REGNUM);
+  machine_mode cmp_mode = GET_MODE (x);
+  machine_mode cc_mode;
+  rtx cc_reg;
 
-  emit_set_insn (cc_reg, gen_rtx_COMPARE (mode, x, y));
+  if (cmp_mode == TImode)
+{
+  gcc_assert (code == NE);
+
+  cc_mode = CCmode;
+  cc_reg = gen_rtx_REG (cc_mode, CC_REGNUM);
+
+  rtx x_lo = operand_subword (x, 0, 0, TImode);
+  rtx y_lo = operand_subword (y, 0, 0, TImode);
+  emit_set_insn (cc_reg, gen_rtx_COMPARE (cc_mode, x_lo, y_lo));
+
+  rtx x_hi = operand_subword (x, 1, 0, TImode);
+  rtx y_hi = operand_subword (y, 1, 0, TImode);
+  emit_insn (gen_ccmpdi (cc_reg, cc_reg, x_hi, y_hi,
+gen_rtx_EQ (cc_mode, cc_reg, const0_rtx),
+GEN_INT (AARCH64_EQ)));
+}
+  else
+{
+  cc_mode = SELECT_CC_MODE (code, x, y);
+  cc_reg = gen_rtx_REG (cc_mode, CC_REGNUM);
+  emit_set_insn (cc_reg, gen_rtx_COMPARE (cc_mode, x, y));
+}
   return cc_reg;
 }
 
@@ -14145,40 +14168,54 @@ static void
 aarch64_emit_load_exclusive (machine_mode mode, rtx rval,
 rtx mem, rtx model_rtx)
 {
-  rtx (*gen) (rtx, rtx, rtx);
-
-  switch (mode)
+  if (mode == TImode)
+emit_insn (gen_aarch64_load_exclusive_pair (gen_lowpart (DImode, rval),
+   gen_highpart (DImode, rval),
+   mem, model_rtx));
+  else
 {
-case E_QImode: gen = gen_aarch64_load_exclusiveqi; break;
-case E_HImode: gen = gen_aarch64_load_exclusivehi; break;
-case E_SImode: gen = gen_aarch64_load_exclusivesi; break;
-case E_DImode: gen = gen_aarch64_load_exclusivedi; break;
-default:
-  gcc_unreachable ();
-}
+  rtx (*gen) (rtx, rtx, rtx);
+
+  switch (mode)
+   {
+   case E_QImode: gen = gen_aarch64_load_exclusiveqi; break;
+   case E_HImode: gen = gen_aarch64_load_exclusivehi; break;
+   case E_SImode: gen = gen_aarch64_load_exclusivesi; break;
+   case E_DImode: gen = gen_aarch64_load_exclusivedi; break;
+   default:
+ gcc_unreachable ();
+   }
 
-  emit_insn (gen (rval, mem, model_rtx));
+  emit_insn (gen (rval, mem, model_rtx));
+}
 }
 
 /* Emit store exclusive.  */
 
 static void
 aarch64_emit_store_exclusive (machine_mode mode, rtx bval,
- rtx rval, rtx mem, rtx model_rtx)
+ rtx mem, rtx rval, rtx model_rtx)
 {
-  rtx (*gen) (rtx, rtx, rtx, rtx);
-
-  switch (mode)
+  if (mode == TImode)
+emit_insn (gen_aarch64_store_exclusive_pair
+  (bval, mem, operand_subword (rval, 0, 0, TImode),
+   operand_subword (rval, 1, 0, TImode), model_rtx));
+  else
 {
-case E_QImode: gen = gen_aarch64_store_exclusiveqi; break;
-case E_HImode: gen = gen_aarch64_store_exclusivehi; break;
-case E_SImode: gen = gen_aarch64_store_exclusivesi; break;
-case E_DImode: gen = gen_aarch64_store_exclusivedi; break;
-default:
-  gcc_unreachable ();
-}
+  rtx (*gen) (rtx, rtx, rtx, rtx);
+
+  switch (mode)
+   {
+   case E_QImode: gen = gen_aarch64_store_exclusiveqi; break;
+   case E_HImode: gen = gen_aarch64_store_exclusivehi; break;
+   case E_SImode: gen = gen_aarch64_store_exclusivesi; break;
+   case E_DImode: gen = gen_aarch64_store_exclusivedi; break;
+   default:
+ gcc_unreachable ();
+   }
 
-  emit_insn (gen (bval, rval, mem, model_rtx));
+  emit_insn (gen (bval, mem, rval, model_rtx));
+}
 }
 
 /* Mark the previous jump instruction as unlikely.  */
@@ -14197,16 +14234,6 @@ aarch64_expand_compare_and_swap (rtx operands[])
 {
   rtx bval, rval, mem, oldval, newval, is_weak, mod_s, mod_f, x, cc_reg;
   machine_mode mode, r_mode;
-  typedef rtx (*gen_atomic_cas_fn) (rtx, rtx, rtx, rtx);
-  in

[PATCH 7/19][GCC-8] aarch64: Extend %R for integer registers

2020-04-16 Thread Andre Vieira (lists)

2020-04-16  Andre Vieira 

    Backport from mainline.
    2019-09-19  Richard Henderson 

    * config/aarch64/aarch64.c (aarch64_print_operand): Allow integer
    registers with %R.

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 
1068cfd899a759c506e3217e1e2c19cd778b4372..317571e018c4f96046799675e042cdfaabb5b94a
 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -6627,7 +6627,7 @@ sizetochar (int size)
  'S/T/U/V':Print a FP/SIMD register name for a register 
list.
The register printed is the FP/SIMD register name
of X + 0/1/2/3 for S/T/U/V.
- 'R':  Print a scalar FP/SIMD register name + 1.
+ 'R':  Print a scalar Integer/FP/SIMD register name + 1.
  'X':  Print bottom 16 bits of integer constant in hex.
  'w/x':Print a general register name or the zero register
(32-bit or 64-bit).
@@ -6813,12 +6813,13 @@ aarch64_print_operand (FILE *f, rtx x, int code)
   break;
 
 case 'R':
-  if (!REG_P (x) || !FP_REGNUM_P (REGNO (x)))
-   {
- output_operand_lossage ("incompatible floating point / vector 
register operand for '%%%c'", code);
- return;
-   }
-  asm_fprintf (f, "q%d", REGNO (x) - V0_REGNUM + 1);
+  if (REG_P (x) && FP_REGNUM_P (REGNO (x)))
+   asm_fprintf (f, "q%d", REGNO (x) - V0_REGNUM + 1);
+  else if (REG_P (x) && GP_REGNUM_P (REGNO (x)))
+   asm_fprintf (f, "x%d", REGNO (x) - R0_REGNUM + 1);
+  else
+   output_operand_lossage ("incompatible register operand for '%%%c'",
+   code);
   break;
 
 case 'X':


[PATCH 15/19][GCC-8] aarch64: Configure for sys/auxv.h in libgcc for lse-init.c

2020-04-16 Thread Andre Vieira (lists)

2020-04-16  Andre Vieira 

    Backport from mainline
    2019-09-25  Richard Henderson 

    PR target/91833
    * config/aarch64/lse-init.c: Include auto-target.h.  Disable
    initialization if !HAVE_SYS_AUXV_H.
    * configure.ac (AC_CHECK_HEADERS): Add sys/auxv.h.
    * config.in, configure: Rebuild.
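
As a rough illustration of what the HAVE_SYS_AUXV_H guard protects, the sketch below queries the hardware capabilities only when the header is usable and otherwise reports that LSE is unavailable.  It is an assumption-laden sketch, not the code in lse-init.c: SKETCH_HAVE_SYS_AUXV_H and sketch_have_lse_atomics are hypothetical names, and __has_include merely stands in for the configure check.

```c
#include <stdbool.h>

/* Stand-in for the configure-provided HAVE_SYS_AUXV_H.  Using
   __has_include here is an assumption of this sketch; the patch
   relies on AC_CHECK_HEADERS instead.  */
#if defined(__linux__) && defined(__has_include)
# if __has_include(<sys/auxv.h>)
#  define SKETCH_HAVE_SYS_AUXV_H 1
# endif
#endif

#ifdef SKETCH_HAVE_SYS_AUXV_H
# include <sys/auxv.h>
#endif

/* HWCAP bit for LSE atomics on AArch64 Linux, defined here in case
   the system headers are too old to provide it.  */
#ifndef HWCAP_ATOMICS
# define HWCAP_ATOMICS (1UL << 8)
#endif

/* Return true if the kernel reports LSE atomics.  Without a usable
   <sys/auxv.h>, or on a non-AArch64 host, conservatively return
   false so that callers fall back to the non-LSE code path.  */
static bool
sketch_have_lse_atomics (void)
{
#if defined(SKETCH_HAVE_SYS_AUXV_H) && defined(__aarch64__)
  return (getauxval (AT_HWCAP) & HWCAP_ATOMICS) != 0;
#else
  return false;
#endif
}
```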

diff --git a/libgcc/config.in b/libgcc/config.in
index 
d634af9d949741e26f5acc2606d40062d491dd8b..59a3d8daf52e72e548d3d9425d6043d5e0c663ad
 100644
--- a/libgcc/config.in
+++ b/libgcc/config.in
@@ -43,6 +43,9 @@
 /* Define to 1 if you have the <string.h> header file. */
 #undef HAVE_STRING_H
 
+/* Define to 1 if you have the <sys/auxv.h> header file. */
+#undef HAVE_SYS_AUXV_H
+
 /* Define to 1 if you have the <sys/stat.h> header file. */
 #undef HAVE_SYS_STAT_H
 
@@ -82,6 +85,11 @@
 /* Define to 1 if the target use emutls for thread-local storage. */
 #undef USE_EMUTLS
 
+/* Enable large inode numbers on Mac OS X 10.5.  */
+#ifndef _DARWIN_USE_64_BIT_INODE
+# define _DARWIN_USE_64_BIT_INODE 1
+#endif
+
 /* Number of bits in a file offset, on hosts where this is settable. */
 #undef _FILE_OFFSET_BITS
 
diff --git a/libgcc/config/aarch64/lse-init.c b/libgcc/config/aarch64/lse-init.c
index 
33d2914747994a1e07dcae906f0352e64045ab02..1a8f4c55213f25c67c8bb8cdc1cc6f1bbe3255cb
 100644
--- a/libgcc/config/aarch64/lse-init.c
+++ b/libgcc/config/aarch64/lse-init.c
@@ -23,12 +23,14 @@ a copy of the GCC Runtime Library Exception along with this 
program;
 see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
 <http://www.gnu.org/licenses/>.  */
 
+#include "auto-target.h"
+
 /* Define the symbol gating the LSE implementations.  */
 _Bool __aarch64_have_lse_atomics
   __attribute__((visibility("hidden"), nocommon));
 
 /* Disable initialization of __aarch64_have_lse_atomics during bootstrap.  */
-#ifndef inhibit_libc
+#if !defined(inhibit_libc) && defined(HAVE_SYS_AUXV_H)
 # include <sys/auxv.h>
 
 /* Disable initialization if the system headers are too old.  */
diff --git a/libgcc/configure b/libgcc/configure
old mode 100644
new mode 100755
index 
b2f3f8708441e473b8e2941c4748748b6c7c40b8..7962cd9b87e1eb67037180e110f7d0de145bb2e1
--- a/libgcc/configure
+++ b/libgcc/configure
@@ -641,6 +641,7 @@ infodir
 docdir
 oldincludedir
 includedir
+runstatedir
 localstatedir
 sharedstatedir
 sysconfdir
@@ -729,6 +730,7 @@ datadir='${datarootdir}'
 sysconfdir='${prefix}/etc'
 sharedstatedir='${prefix}/com'
 localstatedir='${prefix}/var'
+runstatedir='${localstatedir}/run'
 includedir='${prefix}/include'
 oldincludedir='/usr/include'
 docdir='${datarootdir}/doc/${PACKAGE_TARNAME}'
@@ -980,6 +982,15 @@ do
   | -silent | --silent | --silen | --sile | --sil)
 silent=yes ;;
 
+  -runstatedir | --runstatedir | --runstatedi | --runstated \
+  | --runstate | --runstat | --runsta | --runst | --runs \
+  | --run | --ru | --r)
+ac_prev=runstatedir ;;
+  -runstatedir=* | --runstatedir=* | --runstatedi=* | --runstated=* \
+  | --runstate=* | --runstat=* | --runsta=* | --runst=* | --runs=* \
+  | --run=* | --ru=* | --r=*)
+runstatedir=$ac_optarg ;;
+
   -sbindir | --sbindir | --sbindi | --sbind | --sbin | --sbi | --sb)
 ac_prev=sbindir ;;
   -sbindir=* | --sbindir=* | --sbindi=* | --sbind=* | --sbin=* \
@@ -1117,7 +1128,7 @@ fi
 for ac_var in  exec_prefix prefix bindir sbindir libexecdir datarootdir \
datadir sysconfdir sharedstatedir localstatedir includedir \
oldincludedir docdir infodir htmldir dvidir pdfdir psdir \
-   libdir localedir mandir
+   libdir localedir mandir runstatedir
 do
   eval ac_val=\$$ac_var
   # Remove trailing slashes.
@@ -1272,6 +1283,7 @@ Fine tuning of the installation directories:
   --sysconfdir=DIRread-only single-machine data [PREFIX/etc]
   --sharedstatedir=DIRmodifiable architecture-independent data [PREFIX/com]
   --localstatedir=DIR modifiable single-machine data [PREFIX/var]
+  --runstatedir=DIR   modifiable per-process data [LOCALSTATEDIR/run]
   --libdir=DIRobject code libraries [EPREFIX/lib]
   --includedir=DIRC header files [PREFIX/include]
   --oldincludedir=DIR C header files for non-gcc [/usr/include]
@@ -4091,7 +4103,7 @@ else
 We can't simply define LARGE_OFF_T to be 9223372036854775807,
 since some C++ compilers masquerading as C compilers
 incorrectly reject 9223372036854775807.  */
-#define LARGE_OFF_T (((off_t) 1 << 62) - 1 + ((off_t) 1 << 62))
+#define LARGE_OFF_T ((((off_t) 1 << 31) << 31) - 1 + (((off_t) 1 << 31) << 31))
   int off_t_is_large[(LARGE_OFF_T % 2147483629 == 721
   && LARGE_OFF_T % 2147483647 == 1)
  ? 1 : -1];
@@ -4137,7 +4149,7 @@ else
 We can't simply define LARGE_OFF_T to be 9223372036854775807,
 since some C++ compilers masquerading as C compilers
 incorrectly reject 9223372036854775807.  */
-#define LARGE_OFF_T (((off_t) 1 << 62) - 1 + ((off_t) 1 << 62))
+#define LARGE_OFF_T off_t) 1 << 31) << 31) - 1 + (((off_t) 1 << 31

[PATCH 19/19][GCC-8] re PR target/90724 (ICE with __sync_bool_compare_and_swap with -march=armv8.2-a+sve)

2020-04-16 Thread Andre Vieira (lists)

2020-04-16  Andre Vieira 

    Backport from mainline
    2019-08-21  Prathamesh Kulkarni 

    PR target/90724
    * config/aarch64/aarch64.c (aarch64_gen_compare_reg_maybe_ze): Force y
    in reg if it fails aarch64_plus_operand predicate.

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 
6bac63402e508027e77a9f4557cb10c578ea7c2c..0da927be15c339295ef940d6e05a37e95135aa5a
 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -1574,6 +1574,9 @@ aarch64_gen_compare_reg_maybe_ze (RTX_CODE code, rtx x, 
rtx y,
}
 }
 
+  if (!aarch64_plus_operand (y, y_mode))
+y = force_reg (y_mode, y);
+
   return aarch64_gen_compare_reg (code, x, y);
 }
 


[PATCH 16/19][GCC-8] aarch64: Fix up aarch64_compare_and_swaphi pattern [PR94368]

2020-04-16 Thread Andre Vieira (lists)

2020-04-16  Andre Vieira 

    Backport from mainline
    2020-03-31  Jakub Jelinek 

    PR target/94368
    * config/aarch64/constraints.md (Uph): New constraint.
    * config/aarch64/atomics.md (cas_short_expected_imm): New mode attr.
    (aarch64_compare_and_swap): Use it instead of n in operand 2's
    constraint.

    * gcc.dg/pr94368.c: New test.

diff --git a/gcc/config/aarch64/atomics.md b/gcc/config/aarch64/atomics.md
index 
0ee8d2efac05877d610981b719bd02afdf93a832..1005462ae23aa13dbc3013a255aa189096e33366
 100644
--- a/gcc/config/aarch64/atomics.md
+++ b/gcc/config/aarch64/atomics.md
@@ -38,6 +38,8 @@
 
 (define_mode_attr cas_short_expected_pred
   [(QI "aarch64_reg_or_imm") (HI "aarch64_plushi_operand")])
+(define_mode_attr cas_short_expected_imm
+  [(QI "n") (HI "Uph")])
 
 (define_insn_and_split "aarch64_compare_and_swap"
   [(set (reg:CC CC_REGNUM) ;; bool out
@@ -47,7 +49,8 @@
   (match_operand:SHORT 1 "aarch64_sync_memory_operand" "+Q"))) ;; memory
(set (match_dup 1)
 (unspec_volatile:SHORT
-  [(match_operand:SHORT 2 "<cas_short_expected_pred>" "rn") ;; expected
+  [(match_operand:SHORT 2 "<cas_short_expected_pred>"
+ "r<cas_short_expected_imm>")  ;; expected
(match_operand:SHORT 3 "aarch64_reg_or_zero" "rZ")  ;; desired
(match_operand:SI 4 "const_int_operand");; is_weak
(match_operand:SI 5 "const_int_operand");; mod_s
diff --git a/gcc/config/aarch64/constraints.md 
b/gcc/config/aarch64/constraints.md
index 
32a0fa60a198c714f7c0b8b987da6bc26992845d..03626d2faf87e0b038bf3b8602d4feb8ef7d077c
 100644
--- a/gcc/config/aarch64/constraints.md
+++ b/gcc/config/aarch64/constraints.md
@@ -213,6 +213,13 @@
   (and (match_code "const_int")
(match_test "(unsigned) exact_log2 (ival) <= 4")))
 
+(define_constraint "Uph"
+  "@internal
+  A constraint that matches HImode integers zero extendable to
+  SImode plus_operand."
+  (and (match_code "const_int")
+   (match_test "aarch64_plushi_immediate (op, VOIDmode)")))
+
 (define_memory_constraint "Q"
  "A memory address which uses a single base register with no offset."
  (and (match_code "mem")
diff --git a/gcc/testsuite/gcc.dg/pr94368.c b/gcc/testsuite/gcc.dg/pr94368.c
new file mode 100644
index 
..1267b8220983ef1477a8339bdcc6369abaeca592
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr94368.c
@@ -0,0 +1,25 @@
+/* PR target/94368 */
+/* { dg-do compile { target fpic } } */
+/* { dg-options "-fpic -O1 -fcommon" } */
+
+int b, c, d, e, f, h;
+short g;
+int foo (int) __attribute__ ((__const__));
+
+void
+bar (void)
+{
+  while (1)
+{
+  while (1)
+   {
+ __atomic_load_n (&e, 0);
+ if (foo (2))
+   __sync_val_compare_and_swap (&c, 0, f);
+ b = 1;
+ if (h == e)
+   break;
+   }
+  __sync_val_compare_and_swap (&g, -1, f);
+}
+}


[PATCH 18/19][GCC-8] aarch64: Fix ICE due to aarch64_gen_compare_reg_maybe_ze [PR94435]

2020-04-16 Thread Andre Vieira (lists)

The following testcase ICEs, because aarch64_gen_compare_reg_maybe_ze emits
invalid RTL.
For y_mode [QH]Imode it expects y to be of that mode (or CONST_INT that fits
into that mode) and x being SImode; for non-CONST_INT y it zero extends y
into SImode and compares that against x, for CONST_INT y it zero extends y
into SImode.  The problem is that when the zero extended constant isn't
usable directly, it forces it into a REG, but with y_mode mode, and then
compares against y.  That is wrong, because it should force it into a SImode
REG and compare that way.
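
To make the failure mode concrete, here is a small C sketch (not GCC code; the helper names are hypothetical) of why the zero-extended constant needs a wider mode.  Masking -1 to 16 bits gives 65535, which no longer fits in a signed 16-bit value, so it can only be held and compared correctly as a 32-bit quantity.

```c
#include <stdint.h>

/* Zero-extend the low BITS bits of VALUE, mirroring the effect of
   INTVAL (y) & GET_MODE_MASK (y_mode) in the patch.  BITS must be
   between 1 and 63.  */
static int64_t
zero_extend_const_sketch (int64_t value, unsigned bits)
{
  uint64_t mask = (UINT64_C (1) << bits) - 1;
  return (int64_t) ((uint64_t) value & mask);
}

/* Return nonzero if VALUE is representable as a signed BITS-bit
   integer.  BITS must be between 1 and 63.  */
static int
fits_signed_sketch (int64_t value, unsigned bits)
{
  int64_t min = -(INT64_C (1) << (bits - 1));
  int64_t max = (INT64_C (1) << (bits - 1)) - 1;
  return value >= min && value <= max;
}
```

With these helpers, zero_extend_const_sketch (-1, 16) is 65535; fits_signed_sketch (65535, 16) is false while fits_signed_sketch (65535, 32) is true, which is the reason the patch switches y_mode to SImode for CONST_INT operands.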

2020-04-16  Andre Vieira 

    Backport from mainline
    2020-04-02  Jakub Jelinek 

    PR target/94435
    * config/aarch64/aarch64.c (aarch64_gen_compare_reg_maybe_ze): For
    y_mode E_[QH]Imode and y being a CONST_INT, change y_mode to SImode.

    * gcc.target/aarch64/pr94435.c: New test.

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 
21124b5a3479dd388eb767402e080e2181153467..6bac63402e508027e77a9f4557cb10c578ea7c2c
 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -1556,7 +1556,10 @@ aarch64_gen_compare_reg_maybe_ze (RTX_CODE code, rtx x, 
rtx y,
   if (y_mode == E_QImode || y_mode == E_HImode)
 {
   if (CONST_INT_P (y))
-   y = GEN_INT (INTVAL (y) & GET_MODE_MASK (y_mode));
+   {
+ y = GEN_INT (INTVAL (y) & GET_MODE_MASK (y_mode));
+ y_mode = SImode;
+   }
   else
{
  rtx t, cc_reg;
diff --git a/gcc/testsuite/gcc.target/aarch64/pr94435.c 
b/gcc/testsuite/gcc.target/aarch64/pr94435.c
new file mode 100644
index 
..5713c14d5f90b1d42f92d040e9030ecc03c97d51
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/pr94435.c
@@ -0,0 +1,25 @@
+/* PR target/94435 */
+/* { dg-do compile } */
+/* { dg-options "-march=armv8-a+nolse -moutline-atomics" } */
+
+int b, c, d, e, f, h;
+short g;
+int foo (int) __attribute__ ((__const__));
+
+void
+bar (void)
+{
+  while (1)
+{
+  while (1)
+   {
+ __atomic_load_n (&e, 0);
+ if (foo (2))
+   __sync_val_compare_and_swap (&c, 0, f);
+ b = 1;
+ if (h == e)
+   break;
+   }
+  __sync_val_compare_and_swap (&g, -1, f);
+}
+}


[PATCH 17/19][GCC-8] aarch64: Fix bootstrap with old binutils [PR93053]

2020-04-16 Thread Andre Vieira (lists)

As reported in the PR, GCC 10 (and also 9.3.1 but not 9.3.0) fails to build
when using older binutils which lack LSE support, because those instructions
are used in libgcc.
Thanks to Kyrylo's hint, the following patches (hopefully) allow it to build
even with older binutils by using .inst directive if LSE support isn't
available in the assembler.
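
As a sanity check on the .inst approach, the instruction words can be computed by hand.  The C sketch below adds size and memory-ordering bits to the base encodings used in the patch; the bit positions are my reading of the A64 encoding (size in bits 31:30; for CAS, acquire in bit 22 and release in bit 15; for SWP, acquire in bit 23 and release in bit 22) and the function names are hypothetical.

```c
#include <stdint.h>

/* Base encodings as used in the patch.  */
#define CAS_BASE_SKETCH 0x08a07c41u /* cas with s(0), s(1), [x2] */
#define SWP_BASE_SKETCH 0x38208020u /* swp with s(0), s(0), [x1] */

/* Size field in bits 31:30 for an access of SIZE bytes (1, 2, 4 or
   8); any other value is treated as 8 in this sketch.  */
static uint32_t
size_bits_sketch (unsigned size)
{
  switch (size)
    {
    case 1: return 0x00000000u;
    case 2: return 0x40000000u;
    case 4: return 0x80000000u;
    default: return 0xc0000000u;
    }
}

/* Instruction word for CAS: acquire is bit 22, release is bit 15.  */
static uint32_t
cas_word_sketch (unsigned size, int acquire, int release)
{
  return CAS_BASE_SKETCH + size_bits_sketch (size)
         + (acquire ? 0x400000u : 0u) + (release ? 0x008000u : 0u);
}

/* Instruction word for SWP: acquire is bit 23, release is bit 22.  */
static uint32_t
swp_word_sketch (unsigned size, int acquire, int release)
{
  return SWP_BASE_SKETCH + size_bits_sketch (size)
         + (acquire ? 0x800000u : 0u) + (release ? 0x400000u : 0u);
}
```

For example, cas_word_sketch (4, 1, 1) yields 0x88e0fc41, which should correspond to casal w0, w1, [x2], and swp_word_sketch (8, 1, 1) yields 0xf8e08020, which should correspond to swpal x0, x0, [x1].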

2020-04-16 Andre Vieira 

    Backport from mainline
    2020-04-15  Jakub Jelinek 

    PR target/93053
    * configure.ac (LIBGCC_CHECK_AS_LSE): Add HAVE_AS_LSE checking.
    * config/aarch64/lse.S: Include auto-target.h, if HAVE_AS_LSE
    is not defined, use just .arch armv8-a.
    (B, M, N, OPN): Define.
    (COMMENT): New .macro.
    (CAS, CASP, SWP, LDOP): Use .inst directive if HAVE_AS_LSE is not
    defined.  Otherwise, move the operands right after the glue? and
    comment out operands where the macros are used.
    * configure: Regenerated.
    * config.in: Regenerated.

diff --git a/libgcc/config.in b/libgcc/config.in
index 
59a3d8daf52e72e548d3d9425d6043d5e0c663ad..5be5321d2584392bac1ec3af779cd96823212902
 100644
--- a/libgcc/config.in
+++ b/libgcc/config.in
@@ -10,6 +10,9 @@
*/
 #undef HAVE_AS_CFI_SECTIONS
 
+/* Define to 1 if the assembler supports LSE. */
+#undef HAVE_AS_LSE
+
 /* Define to 1 if the target assembler supports thread-local storage. */
 #undef HAVE_CC_TLS
 
diff --git a/libgcc/config/aarch64/lse.S b/libgcc/config/aarch64/lse.S
index 
c7979382ad7770b61bb1c64d32ba2395963a9d7a..f7f1c19587beaec2ccb6371378d54d50139ba1c9
 100644
--- a/libgcc/config/aarch64/lse.S
+++ b/libgcc/config/aarch64/lse.S
@@ -48,8 +48,14 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  If 
not, see
  * separately to minimize code size.
  */
 
+#include "auto-target.h"
+
 /* Tell the assembler to accept LSE instructions.  */
+#ifdef HAVE_AS_LSE
.arch armv8-a+lse
+#else
+   .arch armv8-a
+#endif
 
 /* Declare the symbol gating the LSE implementations.  */
.hidden __aarch64_have_lse_atomics
@@ -58,12 +64,19 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  
If not, see
 #if SIZE == 1
 # define S b
 # define UXT   uxtb
+# define B 0x00000000
 #elif SIZE == 2
 # define S h
 # define UXT   uxth
+# define B 0x40000000
 #elif SIZE == 4 || SIZE == 8 || SIZE == 16
 # define S
 # define UXT   mov
+# if SIZE == 4
+#  define B 0x80000000
+# elif SIZE == 8
+#  define B 0xc0000000
+# endif
 #else
 # error
 #endif
@@ -72,18 +85,26 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  
If not, see
 # define SUFF  _relax
 # define A
 # define L
+# define M 0x000000
+# define N 0x000000
 #elif MODEL == 2
 # define SUFF  _acq
 # define A a
 # define L
+# define M 0x400000
+# define N 0x800000
 #elif MODEL == 3
 # define SUFF  _rel
 # define A
 # define L l
+# define M 0x008000
+# define N 0x400000
 #elif MODEL == 4
 # define SUFF  _acq_rel
 # define A a
 # define L l
+# define M 0x408000
+# define N 0xc00000
 #else
 # error
 #endif
@@ -144,9 +165,13 @@ STARTFNNAME(cas)
JUMP_IF_NOT_LSE 8f
 
 #if SIZE < 16
-#define CASglue4(cas, A, L, S)
+#ifdef HAVE_AS_LSE
+# define CAS   glue4(cas, A, L, S) s(0), s(1), [x2]
+#else
+# define CAS   .inst 0x08a07c41 + B + M
+#endif
 
-   CAS s(0), s(1), [x2]
+   CAS /* s(0), s(1), [x2] */
ret
 
 8: UXT s(tmp0), s(0)
@@ -160,9 +185,13 @@ STARTFNNAME(cas)
 #else
 #define LDXP   glue3(ld, A, xp)
 #define STXP   glue3(st, L, xp)
-#define CASP   glue3(casp, A, L)
+#ifdef HAVE_AS_LSE
+# define CASP  glue3(casp, A, L)   x0, x1, x2, x3, [x4]
+#else
+# define CASP  .inst 0x48207c82 + M
+#endif
 
-   CASPx0, x1, x2, x3, [x4]
+   CASP/* x0, x1, x2, x3, [x4] */
ret
 
 8: mov x(tmp0), x0
@@ -181,12 +210,16 @@ ENDFN NAME(cas)
 #endif
 
 #ifdef L_swp
-#define SWPglue4(swp, A, L, S)
+#ifdef HAVE_AS_LSE
+# define SWP   glue4(swp, A, L, S) s(0), s(0), [x1]
+#else
+# define SWP   .inst 0x38208020 + B + N
+#endif
 
 STARTFNNAME(swp)
JUMP_IF_NOT_LSE 8f
 
-   SWP s(0), s(0), [x1]
+   SWP /* s(0), s(0), [x1] */
ret
 
 8: mov s(tmp0), s(0)
@@ -204,24 +237,32 @@ ENDFN NAME(swp)
 #ifdef L_ldadd
 #define LDNM   ldadd
 #define OP add
+#define OPN 0x0000
 #elif defined(L_ldclr)
 #define LDNM   ldclr
 #define OP bic
+#define OPN0x1000
 #elif defined(L_ldeor)
 #define LDNM   ldeor
 #define OP eor
+#define OPN0x2000
 #elif defined(L_ldset)
 #define LDNM   ldset
 #define OP orr
+#define OPN0x3000
 #else
 #error
 #endif
-#define LDOP   glue4(LDNM, A, L, S)
+#ifdef HAVE_AS_LSE
+# define LDOP  glue4(LDNM, A, L, S)s(0), s(0), [x1]
+#else
+# define LDOP  .inst 0x38200020 + OPN + B + N
+#endif
 
 STARTFNNAME(LDNM)
JUMP_IF_NOT_LSE 8f
 
-   LDOPs(0), s(0), [x1]
+   LDOP 

Re: [PATCH] ipa: Make call redirection detect already adjusted calls (PR 93621)

2020-04-16 Thread Jan Hubicka
> 
> 2020-04-09  Martin Jambor  
> 
>   PR ipa/93621
>   * ipa-inline.h (ipa_saved_clone_sources): Declare.
>   * ipa-inline-transform.c (ipa_saved_clone_sources): New variable.
>   (save_inline_function_body): Link the new body holder with the
>   previous one.
>   * cgraph.c: Include ipa-inline.h.
>   (cgraph_edge::redirect_call_stmt_to_callee): Try to find the decl from
>   the statement in ipa_saved_clone_sources.
>   * cgraphunit.c: Include ipa-inline.h.
>   (expand_all_functions): Free ipa_saved_clone_sources.
As discussed on IRC this is kind of hack - we should keep track of the
cloning paths if we want to allow calls to already modified decls.
However this is OK for gcc 10.
I believe this can trigger wrong code with earlier releases?
> diff --git a/gcc/ipa-inline.h b/gcc/ipa-inline.h
> index 5025b6045fc..c596f77d0e7 100644
> --- a/gcc/ipa-inline.h
> +++ b/gcc/ipa-inline.h
> @@ -65,6 +65,7 @@ void clone_inlined_nodes (struct cgraph_edge *e, bool, 
> bool, int *);
>  
>  extern int ncalls_inlined;
>  extern int nfunctions_inlined;
> +extern function_summary  *ipa_saved_clone_sources;
Please make it point to DECL itself.  While symbols can be removed decls
stays and we only care about decl anyway.

Honza


Re: [PATCH] c++: Error recovery with errenous DECL_INITIAL [PR94475]

2020-04-16 Thread Patrick Palka via Gcc-patches
On Wed, 15 Apr 2020, Jason Merrill wrote:

> On 4/15/20 4:43 PM, Patrick Palka wrote:
> > Oops, consider the typo in the subject line fixed.  Also ...
> > 
> > On Wed, 15 Apr 2020, Patrick Palka wrote:
> > 
> > > Here we're ICE'ing in do_narrow during error-recovery, because ocp_convert
> > > returns error_mark_node after it attempts to reduce a const decl to its
> > > erroneous DECL_INITIAL via scalar_constant_value, and we later pass this
> > > error_mark_node to fold_build2 which isn't prepared to handle
> > > error_mark_nodes.
> > > 
> > > We could fix this ICE in do_narrow by checking if ocp_convert returns
> > > error_mark_node, but for the sake of consistency and for better error
> > > recovery
> > > it seems it'd be better if ocp_convert didn't care that a const decl's
> > > initializer is erroneous and would instead proceed as if the decl was not
> > > const,
> > > which is the approach that this patch takes.
> > > 
> > > Passes 'make check-c++', does this look OK to commit after full bootstrap
> > > and
> > > regtest?
> > > 
> > > gcc/cp/ChangeLog:
> > > 
> > >   PR c++/94475
> > >   * cvt.c (ocp_convert): If the result of scalar_constant_value is
> > >   erroneous, discard it and carry on with the original expression.
> > > 
> > > gcc/testsuite/ChangeLog:
> > > 
> > >   PR c++/94475
> > >   * g++.dg/conversion/err-recover2.C: New test.
> > >   * g++.dg/diagnostic/pr84138.C: Remove now-bogus warning.
> > >   * g++.dg/warn/Wsign-compare-8.C: Remove now-bogus warning.
> > > ---
> > >   gcc/cp/cvt.c   |  6 +++---
> > >   gcc/testsuite/g++.dg/conversion/err-recover2.C | 10 ++
> > >   gcc/testsuite/g++.dg/diagnostic/pr84138.C  |  2 +-
> > >   gcc/testsuite/g++.dg/warn/Wsign-compare-8.C|  2 +-
> > >   4 files changed, 15 insertions(+), 5 deletions(-)
> > >   create mode 100644 gcc/testsuite/g++.dg/conversion/err-recover2.C
> > > 
> > > diff --git a/gcc/cp/cvt.c b/gcc/cp/cvt.c
> > > index a3b80968b33..b94231a6d08 100644
> > > --- a/gcc/cp/cvt.c
> > > +++ b/gcc/cp/cvt.c
> > > @@ -723,10 +723,10 @@ ocp_convert (tree type, tree expr, int convtype, int
> > > flags,
> > > if (!CLASS_TYPE_P (type))
> > >   {
> > > e = mark_rvalue_use (e);
> > > -  e = scalar_constant_value (e);
> > > +  tree v = scalar_constant_value (e);
> > > +  if (!error_operand_p (v))
> > > + e = v;
> > >   }
> > > -  if (error_operand_p (e))
> > > -return error_mark_node;
> > 
> > Removing this error_operand_p check might make an error_mark_node slip
> > through and get processed by the rest of ocp_convert, if the call to
> > mark_rvalue_use above returns error_mark_node.
> > 
> > In light of that, please consider this patch instead which restores that
> > error_operand_p check:
> 
> OK.  I wonder if we want to drop the call to scalar_constant_value entirely in
> GCC 11, and expect that the expression will be folded properly later.

That makes sense to me.  On its own, removing the call entirely causes a
few testsuite regressions, but I haven't looked into any of them yet.

> 
> > -- >8 --
> > 
> > Subject: [PATCH] c++: Error recovery with erroneous DECL_INITIAL [PR94475]
> > 
> > gcc/cp/ChangeLog:
> > 
> > PR c++/94475
> > * cvt.c (ocp_convert): If the result of scalar_constant_value is
> > erroneous, ignore it and use the original expression.
> > 
> > gcc/testsuite/ChangeLog:
> > 
> > PR c++/94475
> > * g++.dg/conversion/err-recover2.C: New test.
> > * g++.dg/diagnostic/pr84138.C: Remove now-bogus warning.
> > * g++.dg/warn/Wsign-compare-8.C: Remove now-bogus warning.
> > ---
> >   gcc/cp/cvt.c   |  4 +++-
> >   gcc/testsuite/g++.dg/conversion/err-recover2.C | 10 ++
> >   gcc/testsuite/g++.dg/diagnostic/pr84138.C  |  2 +-
> >   gcc/testsuite/g++.dg/warn/Wsign-compare-8.C|  2 +-
> >   4 files changed, 15 insertions(+), 3 deletions(-)
> >   create mode 100644 gcc/testsuite/g++.dg/conversion/err-recover2.C
> > 
> > diff --git a/gcc/cp/cvt.c b/gcc/cp/cvt.c
> > index a3b80968b33..656e7fd3ec0 100644
> > --- a/gcc/cp/cvt.c
> > +++ b/gcc/cp/cvt.c
> > @@ -723,7 +723,9 @@ ocp_convert (tree type, tree expr, int convtype, int
> > flags,
> > if (!CLASS_TYPE_P (type))
> >   {
> > e = mark_rvalue_use (e);
> > -  e = scalar_constant_value (e);
> > +  tree v = scalar_constant_value (e);
> > +  if (!error_operand_p (v))
> > +   e = v;
> >   }
> > if (error_operand_p (e))
> >   return error_mark_node;
> > diff --git a/gcc/testsuite/g++.dg/conversion/err-recover2.C
> > b/gcc/testsuite/g++.dg/conversion/err-recover2.C
> > new file mode 100644
> > index 000..437e1a919ea
> > --- /dev/null
> > +++ b/gcc/testsuite/g++.dg/conversion/err-recover2.C
> > @@ -0,0 +1,10 @@
> > +// PR c++/94475
> > +// { dg-do compile }
> > +
> > +unsigned char
> > +sr ()
> > +{
> > +  const unsigned char xz = EI; // { dg-error "not declared" }
> > +
> > +  return xz - (xz >> 1)

[PATCH] Do not modify tab options in vimrc for .py files.

2020-04-16 Thread Martin Liška

On 4/16/20 9:57 AM, Richard Biener wrote:

Ah, tab vs. spaces.  Changed to all spaces now and pushed.


Ah, I've also hit the issue. That's caused by our local vimrc.
We should exclude tab options for .py files.

Ready for master?
Thanks,
Martin
From dc6daf004127f57a7317e492e4a61d3a9848a15b Mon Sep 17 00:00:00 2001
From: Martin Liska 
Date: Thu, 16 Apr 2020 15:17:51 +0200
Subject: [PATCH] Do not modify tab options in vimrc for .py files.

contrib/ChangeLog:

2020-04-16  Martin Liska  

	* vimrc: We do not want to modify tab options
	for Python files.
---
 contrib/vimrc | 14 --
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/contrib/vimrc b/contrib/vimrc
index bbbe1dd449b..fa0208d5beb 100644
--- a/contrib/vimrc
+++ b/contrib/vimrc
@@ -28,17 +28,19 @@
 
 function! SetStyle()
   let l:fname = expand("%:p")
+  let l:ext = fnamemodify(l:fname, ":e")
+  let l:c_exts = ['c', 'h', 'cpp', 'cc', 'C', 'H', 'def', 'java']
   if stridx(l:fname, 'libsanitizer') != -1
 return
   endif
-  setlocal tabstop=8
-  setlocal softtabstop=2
-  setlocal shiftwidth=2
-  setlocal noexpandtab
+  if l:ext != "py"
+setlocal tabstop=8
+setlocal softtabstop=2
+setlocal shiftwidth=2
+setlocal noexpandtab
+  endif
   setlocal textwidth=80
   setlocal formatoptions-=ro formatoptions+=cqlt
-  let l:ext = fnamemodify(l:fname, ":e")
-  let l:c_exts = ['c', 'h', 'cpp', 'cc', 'C', 'H', 'def', 'java']
   if index(l:c_exts, l:ext) != -1
 setlocal cindent
 setlocal cinoptions=>4,n-2,{2,^-2,:2,=2,g0,f0,h2,p4,t0,+2,(0,u0,w1,m0
-- 
2.26.0



[PATCH] rs6000: Fix ICE in decompose_normal_address, at rtlanal.c:6403

2020-04-16 Thread Peter Bergner via Gcc-patches
The ICE in PR93974 is caused by a bug in decompose address not being able to
handle Altivec addresses that use AND: to strip off the bottom address bits.
Rather than modify lra-constraints.c or rtlanal.c to solve this generic
problem this late in the release cycle, I have decided to fix this in target
code by defining the TARGET_CANNOT_SUBSTITUTE_MEM_EQUIV_P target hook to
reject mems with Altivec addresses from being used as equivalent expressions.
I think this is fine, since Altivec addresses are legacy addresses.  I have
confirmed the following patch fixes the ICE and that we still get the same
code generated for the test case below, that we got before my PR93658 patch.
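
For anyone unfamiliar with these addresses: Altivec vector loads and stores ignore the low four bits of the effective address, and the RTL expresses that as an AND of the address with -16.  A minimal C sketch of the same arithmetic (the function name is hypothetical):

```c
#include <stdint.h>

/* Model of an Altivec-style address: the low four bits are stripped,
   so the access is always to a 16-byte-aligned location.  */
static uintptr_t
altivec_effective_address_sketch (uintptr_t addr)
{
  return addr & ~(uintptr_t) 15;
}
```

So 0x1003 and 0x100f both map to 0x1000, while 0x1010 is unchanged.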

This passed bootstrap and regression testing on powerpc64le-linux with no
regressions.  Ok for mainline?

Peter


gcc/
PR rtl-optimization/93974
* config/rs6000/rs6000.c (TARGET_CANNOT_SUBSTITUTE_MEM_EQUIV_P): Define.
(rs6000_cannot_substitute_mem_equiv_p): New function.

gcc/testsuite/
PR rtl-optimization/93974
* g++.dg/pr93974.C: New test.


diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 4defc1ab52b..a723503b4dc 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -1734,6 +1734,9 @@ static const struct attribute_spec 
rs6000_attribute_table[] =
 
 #undef TARGET_MANGLE_DECL_ASSEMBLER_NAME
 #define TARGET_MANGLE_DECL_ASSEMBLER_NAME rs6000_mangle_decl_assembler_name
+
+#undef TARGET_CANNOT_SUBSTITUTE_MEM_EQUIV_P
+#define TARGET_CANNOT_SUBSTITUTE_MEM_EQUIV_P 
rs6000_cannot_substitute_mem_equiv_p
 
 
 /* Processor table.  */
@@ -26375,6 +26378,22 @@ rs6000_predict_doloop_p (struct loop *loop)
   return true;
 }
 
+/* Implement TARGET_CANNOT_SUBSTITUTE_MEM_EQUIV_P.  */
+
+static bool
+rs6000_cannot_substitute_mem_equiv_p (rtx mem)
+{
+  gcc_assert (MEM_P (mem));
+
+  /* curr_insn_transform()'s handling of subregs cannot handle altivec AND:
+ type addresses, so don't allow MEMs with those address types to be
+ substituted as an equivalent expression.  See PR93974 for details.  */
+  if (GET_CODE (XEXP (mem, 0)) == AND)
+return true;
+
+  return false;
+}
+
 struct gcc_target targetm = TARGET_INITIALIZER;
 
 #include "gt-rs6000.h"
diff --git a/gcc/testsuite/g++.dg/pr93974.C b/gcc/testsuite/g++.dg/pr93974.C
new file mode 100644
index 000..ea5f2f817c1
--- /dev/null
+++ b/gcc/testsuite/g++.dg/pr93974.C
@@ -0,0 +1,27 @@
+/* { dg-do compile { target { powerpc*-*-linux* } } } */
+/* { dg-require-effective-target powerpc_p8vector_ok } */
+/* { dg-options "-mdejagnu-cpu=power8 -O3 -fstack-protector-strong" } */
+
+class a {
+  double b[2];
+public:
+  a();
+};
+
+class c {
+public:
+  typedef a d;
+  d m_fn1() {
+a e;
+return e;
+  }
+};
+template <class f> void operator+(f, typename f::d);
+void g() {
+  c connector;
+  for (;;) {
+c cut;
+a h = cut.m_fn1();
+connector + h;
+  }
+}


[PATCH] Avoid illegal argument to verbose in dg-test callback

2020-04-16 Thread Matthias Kretz
From: Matthias Kretz 

If extra_tool_flags starts with a dash, an error like 'ERROR: verbose:
illegal argument: -march=native -O2 -std=c++17' is printed. This is
easily fixed by inserting a double dash before the variable.
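
The double dash is the usual end-of-options marker.  The fix itself is in Tcl, but the idea can be sketched in C with a toy argument scanner (hypothetical, and not how DejaGnu's verbose is implemented): without the marker, a message that starts with a dash is mistaken for an option.

```c
#include <string.h>

/* Return the index of the first operand in ARGV, or -1 if every
   argument looks like an option.  A literal "--" ends option
   processing, so whatever follows it is an operand even when it
   starts with a dash.  */
static int
first_operand_sketch (int argc, const char *const *argv)
{
  int i;

  for (i = 0; i < argc; i++)
    {
      if (strcmp (argv[i], "--") == 0)
        return i + 1 < argc ? i + 1 : -1;
      if (argv[i][0] != '-')
        return i;
    }
  return -1;
}
```

With the arguments "-log" and "-march=native -O2" this finds no operand, whereas "-log", "--", "-march=native -O2" yields index 2, i.e. the flags string is treated as the message.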

* testsuite/lib/libstdc++.exp: Avoid illegal argument to
verbose.
---
 libstdc++-v3/testsuite/lib/libstdc++.exp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/libstdc++-v3/testsuite/lib/libstdc++.exp b/libstdc++-v3/testsuite/lib/libstdc++.exp
index 10a7e748464..7f4532c55b2 100644
--- a/libstdc++-v3/testsuite/lib/libstdc++.exp
+++ b/libstdc++-v3/testsuite/lib/libstdc++.exp
@@ -408,7 +408,7 @@ proc libstdc++-dg-test { prog do_what extra_tool_flags } {
 set options ""
 if { $extra_tool_flags != "" } {
verbose -log "extra_tool_flags are:"
-   verbose -log $extra_tool_flags
+   verbose -log -- $extra_tool_flags
if { [string first "-x c" $extra_tool_flags ] != -1 } {
verbose -log "compiling and executing as C, not C++"
set edit_tool_flags $extra_tool_flags
-- 
──
 Dr. Matthias Kretz   https://mattkretz.github.io
 GSI Helmholtz Centre for Heavy Ion Research   https://gsi.de
 std::experimental::simd  https://github.com/VcDevel/std-simd
──



Re: [PR C++ 94426] Lambda linkage

2020-04-16 Thread Nathan Sidwell

On 4/16/20 6:50 AM, Iain Sandoe wrote:

Hi Nathan,

Iain Sandoe  wrote:


Nathan Sidwell  wrote:


My fix for 94147 was confusing no-linkage with internal linkage, at the 
language level.  That's wrong. (the std is confusing here, because it describes 
linkage of names (which is wrong), and lambdas have no names)

Lambdas with extra-scope, have linkage.  However, at the implementation-level 
that linkage is at least as restricted as the linkage of the extra-scope decl.

Further, when instantiating a variable initialized by a lambda, we must 
determine the visibility of the variable itself, before instantiating its 
initializer.  If the template arguments are internal (or no-linkage), the 
variable will have internal linkage, regardless of the linkage of the template 
it is instantiated from.  We need to know that before instantiating the lambda, 
so we can restrict its linkage correctly.
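
As a rough illustration of the case described above, here is a minimal C++ sketch. The names VAR, Init and q are only inferred from the mangled symbols quoted later in this thread; the source itself is an assumption, not the actual testcase.

```cpp
// Hedged sketch: VAR, Init and q are guessed from the mangled names
// _Z3VARIZ1qvEUlvE_E and _Z4InitIN3VARIZ1qvEUlvE_EUlvE_EEbT_; this is
// not the testcase from the PR.
template <typename T> bool Init (T) { return true; }

// A variable template whose initializer contains a lambda.  The lambda's
// extra scope is VAR<T>, so its linkage can be no wider than VAR<T>'s.
template <typename T> bool VAR = Init ([] {});

bool q ()
{
  // The closure type of a block-scope lambda has no linkage, so the
  // instantiation VAR<decltype (local)> must not be visible to other
  // translation units, whatever the linkage of the VAR template itself.
  auto local = [] {};
  return VAR<decltype (local)>;
}
```

Calling q () instantiates both VAR and the Init specialization that initializes it, which are the two symbols compared between gcc-9 and trunk below.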

I'll commit this in a few days.


As discussed on irc,

The testcase for this fails on Darwin, where we don’t use .local or .comm for 
the var.

I’ve tested this on x86-64-linux and darwin,
but I plan on testing on a few more Darwin boxen,
OK to apply, if additional testing passes?


that testing revealed some differences in storage description for the variable 
(powerpc 32b darwin puts it in bss, like linux, but the remainder of the 
platform versions use .static_data).  However, that’s not the relevant 
observation.

the observation is that the storage and symbols for

_Z3VARIZ1qvEUlvE_E

has not changed between gcc-9 and trunk.

What has changed is the function that initializes that variable:

_Z4InitIN3VARIZ1qvEUlvE_EUlvE_EEbT_

which was weak / comdat [Linux]  weak / global [Darwin] and now is text section 
local (which is what I understood was the intention of the change) .. so I 
wonder if the scan-asms are testing what you intended?


That is indeed the expected change.


how about the following - where IMO, from the observation above, the first two 
tests are not especially useful and could be removed.

the remainder of the amendments cater for USER_LABEL_PREFIX and a different 
spelling for ‘weak’ in the Darwin assembly language.

So - this now tests that the symbol exists, is spelled the way you intend and 
is not weak (or global on Darwin).




WDYT?


Thanks for checking, please apply.

(My secret plan worked! someone generalized the regexps!)


--
Nathan Sidwell


[PATCH] coroutines: Backout mandate for tailcalls at O < 2 [PR94359]

2020-04-16 Thread Iain Sandoe
Hi

For symmetric transfers to work with C++20 coroutines, it is
currently necessary to tail call the callee coroutine from resume
method of the caller coroutine.  However there are several targets
which don't support an indirect tail call to an arbitrary callee.

Unfortunately, the target 'function_ok_for_sibcall' is not usable
from the front end in all cases.  While it is possible to add a new
hook to cover this circumstance, IMO, it is too late in the release
cycle to be sure of getting the setting correct for all targets***.

So, this patch backs out the use of function_ok_for_sibcall () and
the mandate of CALL_EXPR_MUST_TAIL_CALL from the symmetric 
transfer.

Targets that can make indirect tail calls to arbitrary callees will
still be able to make use of the symmetric transfer (without risking
overrunning the stack) for optimization levels >= 2.

The draft standard does not mandate unlimited symmetric transfers,
so removing this is a QOI issue (albeit an important one) rather
than a correctness one.
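
To make the stack-usage concern concrete, here is a small C++ model. It is only a sketch: task, depth_meter and peak_frames are hypothetical names, it does not use real coroutines, and it says nothing about how build_actor_fn is implemented. It counts live "resume" frames when each transfer is an ordinary nested call, and when the caller's frame is released before the callee runs (which is what a tail call provides).

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a suspended coroutine: resuming it hands
// control to NEXT.  These names are illustrative, not GCC internals.
struct task
{
  task *next = nullptr;
};

// Records how many resume frames are live at once.
struct depth_meter
{
  std::size_t live = 0;
  std::size_t peak = 0;
  void enter () { if (++live > peak) peak = live; }
  void leave () { --live; }
};

// No tail call: the caller's frame stays live across the transfer.
void resume_nested (task *t, depth_meter &m)
{
  m.enter ();
  if (t->next)
    resume_nested (t->next, m);
  m.leave ();
}

// Tail-call model: the caller's frame is gone before the callee starts.
// Written as a loop so the behaviour does not depend on the optimizer.
void resume_tail (task *t, depth_meter &m)
{
  while (t)
    {
      m.enter ();
      task *next = t->next;
      m.leave ();
      t = next;
    }
}

// Build a chain of N tasks and report the peak number of live frames.
std::size_t peak_frames (std::size_t n, bool tail)
{
  if (n == 0)
    return 0;
  std::vector<task> chain (n);
  for (std::size_t i = 0; i + 1 < n; ++i)
    chain[i].next = &chain[i + 1];
  depth_meter m;
  if (tail)
    resume_tail (&chain[0], m);
  else
    resume_nested (&chain[0], m);
  return m.peak;
}
```

In this model the nested variant needs one frame per transfer, while the tail variant never holds more than one, which is why an unbounded chain of symmetric transfers is only safe where the tail call is actually made.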

The test is moved and adjusted so that it can be opted into by any
target that supports the necessary tailcall.

(tested on x86_64-linux/darwin, powerpc64-linux, sparc-solaris11)
OK for master?
thanks
Iain

*** that can always be revisited.

gcc/cp/ChangeLog:

2020-04-16  Iain Sandoe  

* coroutines.cc (build_actor_fn): Back out use of
targetm.function_ok_for_sibcall.  Do not mark the resume
call as CALL_EXPR_MUST_TAIL_CALL.

gcc/testsuite/ChangeLog:

2020-04-16  Iain Sandoe  

* g++.dg/coroutines/torture/symmetric-transfer-00-basic.C: Move..
* g++.dg/coroutines/symmetric-transfer-00-basic.C: ..here and
adjust to run at O2 for targets supporting the necessary tail
call.
---
 gcc/cp/coroutines.cc | 16 ++--
 .../{torture => }/symmetric-transfer-00-basic.C  | 13 ++---
 2 files changed, 8 insertions(+), 21 deletions(-)
 rename gcc/testsuite/g++.dg/coroutines/{torture => }/symmetric-transfer-00-basic.C (87%)

diff --git a/gcc/cp/coroutines.cc b/gcc/cp/coroutines.cc
index e4ba642d527..0a8e7521c4f 100644
--- a/gcc/cp/coroutines.cc
+++ b/gcc/cp/coroutines.cc
@@ -2377,21 +2377,9 @@ build_actor_fn (location_t loc, tree coro_frame_type, tree actor, tree fnbody,
 (loc, builtin_decl_explicit (BUILT_IN_CORO_RESUME), 1, addr);
 
   /* In order to support an arbitrary number of coroutine continuations,
- we must tail call them.  However, some targets might not support this
- for indirect calls, or calls between DSOs.
- FIXME: see if there's an alternate strategy for such targets.  */
-  /* Now we have the actual call, and we can mark it as a tail.  */
+ we must tail call them.  However, some targets do not support indirect
+ tail calls to arbitrary callees.  See PR94359.  */
   CALL_EXPR_TAILCALL (resume) = true;
-  /* Temporarily, switch cfun so that we can use the target hook.  */
-  push_struct_function (actor);
-  if (targetm.function_ok_for_sibcall (NULL_TREE, resume))
-{
-  /* ... and for optimisation levels 0..1, which do not normally tail-
-   -call, mark it as requiring a tail-call for correctness.  */
-  if (optimize < 2)
-   CALL_EXPR_MUST_TAIL_CALL (resume) = true;
-}
-  pop_cfun ();
   resume = coro_build_cvt_void_expr_stmt (resume, loc);
   add_stmt (resume);
 
diff --git a/gcc/testsuite/g++.dg/coroutines/torture/symmetric-transfer-00-basic.C b/gcc/testsuite/g++.dg/coroutines/symmetric-transfer-00-basic.C
similarity index 87%
rename from gcc/testsuite/g++.dg/coroutines/torture/symmetric-transfer-00-basic.C
rename to gcc/testsuite/g++.dg/coroutines/symmetric-transfer-00-basic.C
index 6f379c8e77a..b78ae20d9d4 100644
--- a/gcc/testsuite/g++.dg/coroutines/torture/symmetric-transfer-00-basic.C
+++ b/gcc/testsuite/g++.dg/coroutines/symmetric-transfer-00-basic.C
@@ -1,10 +1,9 @@
-// { dg-do run }
-// See PR94359 - some targets are unable to make general indirect tailcalls
-// for example, between different DSOs.
-// { dg-xfail-run-if "" { hppa*-*-hpux11* } }
-// { dg-xfail-run-if "" { ia64-*-linux-gnu } }
-// { dg-xfail-run-if "" { { lp64 && { powerpc*-linux-gnu } } || { *-*-aix* } } }
-// { dg-xfail-run-if "" { sparc*-*-* } }
+// See PR94359, we will need either a general solution to this, or at least
+// some hook for targets to opt in, for now the test will work on targets that
+// can do the tailcall (which would normally be available for O2+)
+
+// { dg-do run { target { i?86-*-linux-gnu x86_64-*-linux-gnu *-*-darwin* } } }
+// { dg-additional-options "-O2" }
 
 #if __has_include(<coroutine>)
 
-- 
2.17.1




Re: [PATCH 0/19][GCC-8] aarch64: Backport outline atomics

2020-04-16 Thread Andre Vieira (lists)

On 16/04/2020 13:24, Andre Vieira (lists) wrote:

Hi,

This series backports all the patches and fixes regarding outline 
atomics to the gcc-8 branch.


Bootstrapped the series for aarch64-linux-gnu and regression tested.
Is this OK for gcc-8?

Andre Vieira (19):
aarch64: Add early clobber for aarch64_store_exclusive
aarch64: Simplify LSE cas generation
aarch64: Improve cas generation
aarch64: Improve swp generation
aarch64: Improve atomic-op lse generation
aarch64: Remove early clobber from ATOMIC_LDOP scratch
aarch64: Extend %R for integer registers
aarch64: Implement TImode compare-and-swap
aarch64: Tidy aarch64_split_compare_and_swap
aarch64: Add out-of-line functions for LSE atomics
Add visibility to libfunc constructors
aarch64: Implement -moutline-atomics
Aarch64: Fix shrinkwrapping interactions with atomics (PR92692)
aarch64: Fix store-exclusive in load-operate LSE helpers
aarch64: Configure for sys/auxv.h in libgcc for lse-init.c
aarch64: Fix up aarch64_compare_and_swaphi pattern [PR94368]
aarch64: Fix bootstrap with old binutils [PR93053]
aarch64: Fix ICE due to aarch64_gen_compare_reg_maybe_ze [PR94435]
re PR target/90724 (ICE with __sync_bool_compare_and_swap with 
-march=armv8.2-a+sve)


Hmm, something went wrong when sending these.  I had tried to make the 
N/19 patches reply to this one, but failed.  I was also pretty sure I 
had CC'ed Kyrill and Richard S.


Adding them now.



Re: [PATCH] Do not modify tab options in vimrc for .py files.

2020-04-16 Thread Alexander Monakov via Gcc-patches



On Thu, 16 Apr 2020, Martin Liška wrote:

> On 4/16/20 9:57 AM, Richard Biener wrote:
> > Ah, tab vs. spaces.  Changed to all spaces now and pushed.
> 
> Ah, I've also hit the issue. That's caused by our local vimrc.
> We should exclude tab options for .py files.

I think your patch is correct. It's possible to also add an
'else' branch with correct settings for Python (expandtab,
sw=sts=4, ts=8).  Have you considered that?

Thanks.
Alexander


Re: [PATCH] Do not modify tab options in vimrc for .py files.

2020-04-16 Thread Martin Liška

On 4/16/20 4:00 PM, Alexander Monakov wrote:



On Thu, 16 Apr 2020, Martin Liška wrote:


On 4/16/20 9:57 AM, Richard Biener wrote:

Ah, tab vs. spaces.  Changed to all spaces now and pushed.


Ah, I've also hit the issue. That's caused by our local vimrc.
We should exclude tab options for .py files.


I think your patch is correct. It's possible to also add an
'else' branch with correct settings for Python (expandtab,
sw=sts=4, ts=8).  Have you considered that?


To be honest I have:
autocmd Filetype python setlocal expandtab tabstop=4 shiftwidth=4 softtabstop=4

in my default vim config.
But I'm wondering what's default for 'python' Filetype?

Thanks,
Martin



Thanks.
Alexander





Re: [PATCH] coroutines: Backout mandate for tailcalls at O < 2 [PR94359]

2020-04-16 Thread Nathan Sidwell

On 4/16/20 9:51 AM, Iain Sandoe wrote:

Hi

For symmetric transfers to work with C++20 coroutines, it is
currently necessary to tail call the callee coroutine from resume
method of the caller coroutine.  However there are several targets
which don't support an indirect tail call to an arbitrary callee.

Unfortunately, the target 'function_ok_for_sibcall' is not usable
from the front end in all cases.  While it is possible to add a new
hook to cover this circumstance, IMO, it is too late in the release
cycle to be sure of getting the setting correct for all targets***.

So, this patch backs out the use of function_ok_for_sibcall () and
the mandate of CALL_EXPR_MUST_TAIL_CALL from the symmetric
transfer.

Targets that can make indirect tail calls to arbitrary callees will
still be able to make use of the symmetric transfer (without risking
overrunning the stack) for optimization levels >= 2.

The draft standard does not mandate unlimited symmetric transfers,
so removing this is a QOI issue (albeit an important one) rather
than a correctness one.

The test is moved and adjusted so that it can be opted into by any
target that supports the necessary tailcall.

(tested on x86_64-linux/darwin, powerpc64-linux, sparc-solaris11)
OK for master?
thanks
Iain


OK.  will you leave 94359 open, or is there a separate bug for the lack 
of generalized symmetric xfer?


nathan


--
Nathan Sidwell


Re: [PATCH] Do not modify tab options in vimrc for .py files.

2020-04-16 Thread Alexander Monakov via Gcc-patches
On Thu, 16 Apr 2020, Martin Liška wrote:

> To be honest I have:
> autocmd Filetype python setlocal expandtab tabstop=4 shiftwidth=4 softtabstop=4
> 
> in my default vim config.
> But I'm wondering what's default for 'python' Filetype?

Since October 2013 Vim ftplugin/python.vim has:

" As suggested by PEP8.
setlocal expandtab shiftwidth=4 softtabstop=4 tabstop=8

So the default is correct. Please disregard my suggestion then,
no need to add an 'else' branch there.

Thanks.
Alexander


Re: [PATCH] ipa: Make call redirection detect already adjusted calls (PR 93621)

2020-04-16 Thread Martin Jambor
Hi,

On Thu, Apr 16 2020, Jan Hubicka wrote:
>> 
>> 2020-04-09  Martin Jambor  
>> 
>>  PR ipa/93621
>>  * ipa-inline.h (ipa_saved_clone_sources): Declare.
>>  * ipa-inline-transform.c (ipa_saved_clone_sources): New variable.
>>  (save_inline_function_body): Link the new body holder with the
>>  previous one.
>>  * cgraph.c: Include ipa-inline.h.
>>  (cgraph_edge::redirect_call_stmt_to_callee): Try to find the decl from
>>  the statement in ipa_saved_clone_sources.
>>  * cgraphunit.c: Include ipa-inline.h.
>>  (expand_all_functions): Free ipa_saved_clone_sources.
> As discussed on IRC this is kind of hack - we should keep track of the
> cloning paths if we want to allow calls to already modified decls.
> However this is OK for gcc 10.

Thanks.


> I believe this can trigger wrong code with earlier releases?

No, the same assert is there in previous releases too, it just tests
!node->clone.combined_args_to_skip instead of !node->clone.param_adjustments.

>> diff --git a/gcc/ipa-inline.h b/gcc/ipa-inline.h
>> index 5025b6045fc..c596f77d0e7 100644
>> --- a/gcc/ipa-inline.h
>> +++ b/gcc/ipa-inline.h
>> @@ -65,6 +65,7 @@ void clone_inlined_nodes (struct cgraph_edge *e, bool, 
>> bool, int *);
>>  
>>  extern int ncalls_inlined;
>>  extern int nfunctions_inlined;
>> +extern function_summary  *ipa_saved_clone_sources;
> Please make it point to DECL itself.  While symbols can be removed decls
> stays and we only care about decl anyway.

OK, the following has passed bootstrap and testing on x86-64-linux.  I'm
running LTO bootstrap now and will commit it if it passes.

Thanks,

Martin


2020-04-16  Martin Jambor  

PR ipa/93621
* ipa-inline.h (ipa_saved_clone_sources): Declare.
* ipa-inline-transform.c (ipa_saved_clone_sources): New variable.
(save_inline_function_body): Link the new body holder with the
previous one.
* cgraph.c: Include ipa-inline.h.
(cgraph_edge::redirect_call_stmt_to_callee): Try to find the decl from
the statement in ipa_saved_clone_sources.
* cgraphunit.c: Include ipa-inline.h.
(expand_all_functions): Free ipa_saved_clone_sources.

testsuite/
* g++.dg/ipa/pr93621.C: New test.
---
 gcc/ChangeLog  | 13 +
 gcc/cgraph.c   | 11 +++
 gcc/cgraphunit.c   |  4 +++-
 gcc/ipa-inline-transform.c | 19 +++
 gcc/ipa-inline.h   |  1 +
 gcc/testsuite/ChangeLog|  5 +
 gcc/testsuite/g++.dg/ipa/pr93621.C | 29 +
 7 files changed, 81 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/g++.dg/ipa/pr93621.C

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 726e629188a..834026c8f16 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,16 @@
+2020-04-16  Martin Jambor  
+
+   PR ipa/93621
+   * ipa-inline.h (ipa_saved_clone_sources): Declare.
+   * ipa-inline-transform.c (ipa_saved_clone_sources): New variable.
+   (save_inline_function_body): Link the new body holder with the
+   previous one.
+   * cgraph.c: Include ipa-inline.h.
+   (cgraph_edge::redirect_call_stmt_to_callee): Try to find the decl from
+   the statement in ipa_saved_clone_sources.
+   * cgraphunit.c: Include ipa-inline.h.
+   (expand_all_functions): Free ipa_saved_clone_sources.
+
 2020-04-13  Martin Sebor  
 
* doc/extend.texi (-Wall): Mention -Wformat-overflow and
diff --git a/gcc/cgraph.c b/gcc/cgraph.c
index ecb234d032f..72d7cb54301 100644
--- a/gcc/cgraph.c
+++ b/gcc/cgraph.c
@@ -63,6 +63,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "attribs.h"
 #include "selftest.h"
 #include "tree-into-ssa.h"
+#include "ipa-inline.h"
 
 /* FIXME: Only for PROP_loops, but cgraph shouldn't have to know about this.  */
 #include "tree-pass.h"
@@ -1470,6 +1471,16 @@ cgraph_edge::redirect_call_stmt_to_callee (cgraph_edge *e)
   || decl == e->callee->decl)
 return e->call_stmt;
 
+  if (decl && ipa_saved_clone_sources)
+{
+  tree *p = ipa_saved_clone_sources->get (e->callee);
+  if (p && decl == *p)
+   {
+ gimple_call_set_fndecl (e->call_stmt, e->callee->decl);
+ return e->call_stmt;
+   }
+}
+
   if (flag_checking && decl)
 {
   cgraph_node *node = cgraph_node::get (decl);
diff --git a/gcc/cgraphunit.c b/gcc/cgraphunit.c
index 0e255f25b7d..a1ace95879a 100644
--- a/gcc/cgraphunit.c
+++ b/gcc/cgraphunit.c
@@ -205,6 +205,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "lto-section-names.h"
 #include "stringpool.h"
 #include "attribs.h"
+#include "ipa-inline.h"
 
 /* Queue of cgraph nodes scheduled to be added into cgraph.  This is a
secondary queue used during optimization to accommodate passes that
@@ -2481,7 +2482,8 @@ expand_all_functions (void)
 
   symtab->process_new_functions ();
   free_gimplify_stack ();
-

[committed] aarch64: Fix mismatched SVE predicate modes [PR94606]

2020-04-16 Thread Richard Sandiford
For this testcase we ended up generating the invalid rtl:

(insn 10 9 11 2 (set (reg:VNx16BI 105)
(and:VNx16BI (xor:VNx16BI (reg:VNx8BI 103)
(reg:VNx16BI 104))
(reg:VNx16BI 104))) "/tmp/bar.c":9:12 -1
 (nil))

Fixed by taking the VNx16BI lowpart.  It's safe to do that here because
the gp (r104) masks out the extra odd-indexed bits.
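
A small C++ model of why the extra bits do not matter. This is a sketch under assumptions: predicates are modelled as 16-bit masks (one bit per byte of a 128-bit vector), and pred16, ptrue_h and pred_eor_z are illustrative names, not GCC functions.

```cpp
#include <cstdint>

// One predicate bit per vector byte; a 128-bit vector gives 16 bits.
using pred16 = std::uint16_t;

// All-true predicate for 2-byte elements: only even-indexed bits are set.
constexpr pred16 ptrue_h = 0x5555;

// Zeroing predicated EOR, matching the shape of the rtl above:
// (and (xor INV GP) GP).
constexpr pred16 pred_eor_z (pred16 gp, pred16 inv)
{
  return static_cast<pred16> ((inv ^ gp) & gp);
}

// Whatever the odd-indexed bits of INV hold, the result is the same,
// because GP has those bits clear.
static_assert (pred_eor_z (ptrue_h, 0x0015)
	       == pred_eor_z (ptrue_h, 0x0015 | 0xAAAA),
	       "odd-indexed bits of the inverted constant are masked out");
```

So reinterpreting the narrower predicate as VNx16BI only changes the mode, not the value that the predicated EOR produces.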

Tested on aarch64-linux-gnu and aarch64_be-elf, pushed.

Richard


2020-04-16  Richard Sandiford  

gcc/
PR target/94606
* config/aarch64/aarch64.c (aarch64_expand_sve_const_pred_eor): Take
the VNx16BI lowpart of the recursively-generated constant.

gcc/testsuite/
PR target/94606
* gcc.dg/vect/pr94606.c: New test.
---
 gcc/config/aarch64/aarch64.c|  1 +
 gcc/testsuite/gcc.dg/vect/pr94606.c | 13 +
 2 files changed, 14 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/vect/pr94606.c

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 4af562a81ea..d0a41c286cd 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -4742,6 +4742,7 @@ aarch64_expand_sve_const_pred_eor (rtx target, rtx_vector_builder &builder,
   /* EOR the result with an ELT_SIZE PTRUE.  */
   rtx mask = aarch64_ptrue_all (elt_size);
   mask = force_reg (VNx16BImode, mask);
+  inv = gen_lowpart (VNx16BImode, inv);
   target = aarch64_target_reg (target, VNx16BImode);
   emit_insn (gen_aarch64_pred_z (XOR, VNx16BImode, target, mask, inv, mask));
   return target;
diff --git a/gcc/testsuite/gcc.dg/vect/pr94606.c b/gcc/testsuite/gcc.dg/vect/pr94606.c
new file mode 100644
index 000..f0e7c4cd0e8
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/pr94606.c
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=armv8.2-a+sve -msve-vector-bits=256" { target aarch64*-*-* } } */
+
+const short mask[] = { 0, 0, 0, 0, 0, 0, 0, 0,
+  0, 0, 0, 1, 1, 1, 1, 1 };
+
+int
+foo (short *restrict x, short *restrict y)
+{
+  for (int i = 0; i < 16; ++i)
+if (mask[i])
+  x[i] += y[i];
+}


Re: [PATCH] rs6000: Fix ICE in decompose_normal_address, at rtlanal.c:6403

2020-04-16 Thread will schmidt via Gcc-patches
On Thu, 2020-04-16 at 08:21 -0500, Peter Bergner via Gcc-patches wrote:
> The ICE in PR93974 is caused by a bug in decompose address not being able to
> handle Altivec addresses that use AND: to strip off the bottom address bits.
> Rather than modify lra-constraints.c or rtlanal.c to solve this generic
> problem this late in the release cycle, I have decided to fix this in target
> code by defining the TARGET_CANNOT_SUBSTITUTE_MEM_EQUIV_P target hook to
> reject mems with Altivec addresses from being used as equivalent expressions.
> I think this is fine, since Altivec addresses are legacy addresses.  I have
> confirmed the following patch fixes the ICE and that we still get the same
> code generated for the test case below, that we got before my PR93658 patch.
> 
> This passed bootstrap and regression testing on powerpc64le-linux with no
> regressions.  Ok for mainline?
> 
> Peter
> 
> 
> gcc/
>   PR rtl-optimization/93974
>   * config/rs6000/rs6000.c (TARGET_CANNOT_SUBSTITUTE_MEM_EQUIV_P): Define.
>   (rs6000_cannot_substitute_mem_equiv_p): New function.
> 
> gcc/testsuite/
>   PR rtl-optimization/93974
>   * g++.dg/pr93974.C: New test.
> 

ok

> diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
> index 4defc1ab52b..a723503b4dc 100644
> --- a/gcc/config/rs6000/rs6000.c
> +++ b/gcc/config/rs6000/rs6000.c
> @@ -1734,6 +1734,9 @@ static const struct attribute_spec 
> rs6000_attribute_table[] =
> 
>  #undef TARGET_MANGLE_DECL_ASSEMBLER_NAME
>  #define TARGET_MANGLE_DECL_ASSEMBLER_NAME rs6000_mangle_decl_assembler_name
> +
> +#undef TARGET_CANNOT_SUBSTITUTE_MEM_EQUIV_P
> +#define TARGET_CANNOT_SUBSTITUTE_MEM_EQUIV_P 
> rs6000_cannot_substitute_mem_equiv_p
>  

ok

>  /* Processor table.  */
> @@ -26375,6 +26378,22 @@ rs6000_predict_doloop_p (struct loop *loop)
>return true;
>  }
> 
> +/* Implement TARGET_CANNOT_SUBSTITUTE_MEM_EQUIV_P.  */
> +
> +static bool
> +rs6000_cannot_substitute_mem_equiv_p (rtx mem)
> +{
> +  gcc_assert (MEM_P (mem));
> +
> +  /* curr_insn_transform()'s handling of subregs cannot handle altivec AND:
> + type addresses, so don't allow MEMs with those address types to be
> + substituted as an equivalent expression.  See PR93974 for details.  */
> +  if (GET_CODE (XEXP (mem, 0)) == AND)
> +return true;
> +
> +  return false;
> +}

ok

> +
>  struct gcc_target targetm = TARGET_INITIALIZER;
> 
>  #include "gt-rs6000.h"
> diff --git a/gcc/testsuite/g++.dg/pr93974.C b/gcc/testsuite/g++.dg/pr93974.C
> new file mode 100644
> index 000..ea5f2f817c1
> --- /dev/null
> +++ b/gcc/testsuite/g++.dg/pr93974.C
> @@ -0,0 +1,27 @@
> +/* { dg-do compile { target { powerpc*-*-linux* } } } */
> +/* { dg-require-effective-target powerpc_p8vector_ok } */
> +/* { dg-options "-mdejagnu-cpu=power8 -O3 -fstack-protector-strong" } */
> +
> +class a {
> +  double b[2];
> +public:
> +  a();
> +};
> +
> +class c {
> +public:
> +  typedef a d;
> +  d m_fn1() {
> +a e;
> +return e;
> +  }
> +};
> +template  void operator+(f, typename f::d);
> +void g() {
> +  c connector;
> +  for (;;) {
> +c cut;
> +a h = cut.m_fn1();
> +connector + h;
> +  }
> +}
ok.


lgtm. :-)

thanks
-Will
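
For readers unfamiliar with the AND: address form discussed in this thread, here is a minimal C++ sketch of what it computes. The function name is hypothetical; the sketch assumes the usual Altivec behaviour where the low four address bits are ignored, which the rtl expresses as an AND of the address with -16.

```cpp
#include <cstdint>

// Assumed model of an Altivec vector load/store address: the access is
// always to the enclosing 16-byte-aligned quadword, so the effective
// address is the raw address with its bottom four bits cleared.
constexpr std::uintptr_t
altivec_effective_address (std::uintptr_t addr)
{
  return addr & ~static_cast<std::uintptr_t> (15);
}

static_assert (altivec_effective_address (0x1007) == 0x1000,
	       "low address bits are stripped");
```

A MEM whose address has this outer AND is what the new hook refuses to use as an equivalent expression.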




Re: [PATCH], PR target/94557, V2, Fix GCC 9.x PowerPC regression due to PR target/93932 back port.

2020-04-16 Thread will schmidt via Gcc-patches
On Wed, 2020-04-15 at 21:37 -0400, Michael Meissner via Gcc-patches wrote:
> Fix regression caused by PR target/93932 backport.
> 
> When I back ported the fix for PR target/93932 to the GCC 9 branch, I
> introduced an unintended regression when the GCC compiler is optimizing the
> vec_extract built-in function, and the vector element is in memory, and the
> index is variable.  This patch masks the vector index so that it does not go
> out of bounds.

Regression tested OK, I assume. :-)


> 
> 2020-04-15  Michael Meissner  
> 
>   PR target/94557
>   * config/rs6000/rs6000.c (rs6000_adjust_vec_address): Fix
>   regression caused by PR target/93932 backport.  Mask variable
>   vector extract index so it does not go beyond the vector when
>   extracting a vector element from memory.
> 


ok

> --- /tmp/4XFFqK_rs6000.c  2020-04-13 15:28:33.514011024 -0500
> +++ gcc/config/rs6000/rs6000.c2020-04-13 14:24:01.296932921 -0500
> @@ -7047,18 +7047,25 @@ rs6000_adjust_vec_address (rtx scalar_re
>  element_offset = GEN_INT (INTVAL (element) * scalar_size);
>else
>  {
> +  /* Mask the element to make sure the element number is between 0 and 
> the
> +  maximum number of elements - 1 so that we don't generate an address
> +  outside the vector.  */
> +  rtx num_ele_m1 = GEN_INT (GET_MODE_NUNITS (GET_MODE (mem)) - 1);
> +  rtx and_op = gen_rtx_AND (Pmode, element, num_ele_m1);
> +  emit_insn (gen_rtx_SET (base_tmp, and_op));
> +
>int byte_shift = exact_log2 (scalar_size);
>gcc_assert (byte_shift >= 0);
> 
>if (byte_shift == 0)
> - element_offset = element;
> + element_offset = base_tmp;
> 
>else
>   {
> if (TARGET_POWERPC64)
> - emit_insn (gen_ashldi3 (base_tmp, element, GEN_INT (byte_shift)));
> + emit_insn (gen_ashldi3 (base_tmp, base_tmp, GEN_INT (byte_shift)));
> else
> - emit_insn (gen_ashlsi3 (base_tmp, element, GEN_INT (byte_shift)));
> + emit_insn (gen_ashlsi3 (base_tmp, base_tmp, GEN_INT (byte_shift)));

ok

> 
> element_offset = base_tmp;
>   }
> 

Below matches the above, looks like the patch is double-pasted into the
email, not a big deal.

lgtm.

thanks,
-Will
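
The arithmetic the patch emits can be sketched in plain C++ as follows. This is an illustration only: element_offset is a hypothetical helper, and it assumes, as the patch does, that the element count and element size are powers of two.

```cpp
#include <cassert>
#include <cstddef>

// Byte offset of element INDEX within a vector of NUNITS elements, each
// (1 << BYTE_SHIFT) bytes wide.  Masking first keeps the offset inside
// the vector even when INDEX is out of range.
std::size_t
element_offset (std::size_t index, std::size_t nunits, unsigned byte_shift)
{
  // NUNITS must be a nonzero power of two for the mask to be valid.
  assert (nunits != 0 && (nunits & (nunits - 1)) == 0);
  std::size_t masked = index & (nunits - 1);
  return masked << byte_shift;
}
```

For a vector of four 4-byte elements, an index of 5 wraps to element 1 (offset 4) instead of producing an address 20 bytes past the start of the vector.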

> --- /tmp/4XFFqK_rs6000.c  2020-04-13 15:28:33.514011024 -0500
> +++ gcc/config/rs6000/rs6000.c2020-04-13 14:24:01.296932921 -0500
> @@ -7047,18 +7047,25 @@ rs6000_adjust_vec_address (rtx scalar_re
>  element_offset = GEN_INT (INTVAL (element) * scalar_size);
>else
>  {
> +  /* Mask the element to make sure the element number is between 0 and 
> the
> +  maximum number of elements - 1 so that we don't generate an address
> +  outside the vector.  */
> +  rtx num_ele_m1 = GEN_INT (GET_MODE_NUNITS (GET_MODE (mem)) - 1);
> +  rtx and_op = gen_rtx_AND (Pmode, element, num_ele_m1);
> +  emit_insn (gen_rtx_SET (base_tmp, and_op));
> +
>int byte_shift = exact_log2 (scalar_size);
>gcc_assert (byte_shift >= 0);
> 
>if (byte_shift == 0)
> - element_offset = element;
> + element_offset = base_tmp;
> 
>else
>   {
> if (TARGET_POWERPC64)
> - emit_insn (gen_ashldi3 (base_tmp, element, GEN_INT (byte_shift)));
> + emit_insn (gen_ashldi3 (base_tmp, base_tmp, GEN_INT (byte_shift)));
> else
> - emit_insn (gen_ashlsi3 (base_tmp, element, GEN_INT (byte_shift)));
> + emit_insn (gen_ashlsi3 (base_tmp, base_tmp, GEN_INT (byte_shift)));
> 
> element_offset = base_tmp;
>   }
> 



arm: Fix regression in constant handling for -mpure-code for v8m

2020-04-16 Thread Christophe Lyon via Gcc-patches
Hi,

In PR94538, Wilco mentioned that my patch to enable -mpure-code for
v6m caused regressions in the code generated for cortex-m23.

Specifically, for
int f3 (void) { return 0x11000000; }
int f3_bis (void) { return 0x12345678; }

we currently generate (-O2 -mcpu=cortex-m23 -mpure-code)
        movs    r0, #17
        lsls    r0, r0, #8
        lsls    r0, r0, #8
        lsls    r0, r0, #8
        bx      lr
and
        movs    r0, #86
        lsls    r0, r0, #8
        adds    r0, r0, #120
        movt    r0, 4660
        bx      lr

The attached patch brings back the original code generation:
        movs    r0, #136        @ 12    [c=4 l=2]  *thumb1_movsi_insn/1
        lsls    r0, r0, #21     @ 9     [c=4 l=2]  *thumb1_ashlsi3/0
        bx      lr
and
        movw    r0, #22136      @ 12    [c=4 l=4]  *thumb1_movsi_insn/2
        movt    r0, 4660        @ 13    [c=4 l=4]  *arm_movtas_ze/1
        bx      lr

This does not address the other problems discussed in the PR, so I'm
not mentioning it in the ChangeLog.
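
As a sanity check on the listings above, the following C++ sketch evaluates each instruction sequence on a 32-bit value. The helper names mirror the mnemonics but are only a simplified model (flags and encoding limits are ignored).

```cpp
#include <cstdint>

// Simplified models of the Thumb instructions used in the listings.
constexpr std::uint32_t movs (std::uint32_t imm8) { return imm8 & 0xff; }
constexpr std::uint32_t lsls (std::uint32_t r, unsigned n) { return r << n; }
constexpr std::uint32_t adds (std::uint32_t r, std::uint32_t imm8)
{ return r + imm8; }
constexpr std::uint32_t movw (std::uint32_t imm16) { return imm16 & 0xffff; }
// movt replaces the top halfword and keeps the bottom one.
constexpr std::uint32_t movt (std::uint32_t r, std::uint32_t imm16)
{ return (r & 0xffff) | ((imm16 & 0xffff) << 16); }

// f3: the regressed sequence and the restored one.
constexpr std::uint32_t f3_regressed = lsls (lsls (lsls (movs (17), 8), 8), 8);
constexpr std::uint32_t f3_restored = lsls (movs (136), 21);

// f3_bis: the regressed sequence and the restored one.
constexpr std::uint32_t f3_bis_regressed
  = movt (adds (lsls (movs (86), 8), 120), 4660);
constexpr std::uint32_t f3_bis_restored = movt (movw (22136), 4660);

static_assert (f3_regressed == f3_restored, "same value, fewer insns");
static_assert (f3_bis_regressed == f3_bis_restored, "same value, fewer insns");
static_assert (f3_bis_restored == 0x12345678, "matches f3_bis");
```

Both pairs produce identical register contents, so the patch changes only the length of the sequence, not the constant that is materialized.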

OK?

Thanks,

Christophe
diff --git a/gcc/config/arm/thumb1.md b/gcc/config/arm/thumb1.md
index 2486163..e2644a9 100644
--- a/gcc/config/arm/thumb1.md
+++ b/gcc/config/arm/thumb1.md
@@ -66,11 +66,11 @@ (define_insn "thumb1_movsi_symbol_ref"
 
 (define_split
   [(set (match_operand:SI 0 "register_operand" "")
-   (match_operand:SI 1 "immediate_operand" ""))]
+   (match_operand:SI 1 "const_int_operand" ""))]
   "TARGET_THUMB1
+   && !TARGET_HAVE_MOVT
&& arm_disable_literal_pool
-   && GET_CODE (operands[1]) == CONST_INT
-   && !satisfies_constraint_I (operands[1])"
+   && !satisfies_constraint_K (operands[1])"
   [(clobber (const_int 0))]
   "
 thumb1_gen_const_int (operands[0], INTVAL (operands[1]));
gcc/ChangeLog:

2020-04-16  Christophe Lyon  

* config/arm/thumb1.md: Fix mov splitter for
arm_disable_literal_pool.



[PR94454] specialization hashtable inconsistencies

2020-04-16 Thread Nathan Sidwell
These patches address 3 separate things I discovered in working on 
pr94454.  As mentioned in the bug, we have disagreement between hash 
value equality and key equality in the specialization table.


Once Iain got a reproducible build on gcc110, I came up with 
pr94454-shim.diff.  This (a) causes all specializations to be hashed 
only by their template, and (b) adds a checking assert to the 
argument comparator, to assert that if the arguments are equal, the 
arguments' hashes are the same.  This triggered the problem, and a bunch 
of other cases using the existing testsuites.  We use comp_template_args 
not only for the hash table, but for specialization ordering and the 
like, hence the protection of that checking_assert for some special cases.


While we could keep the checking assert, the neutering of the hasher to 
increase collisions really kills performance -- some of the libstdc++ 
tests timeout on a non-optimized checking build.  I don't think that 
should be enabled with checking.  There doesn't seem to be an obvious 
existing checking flag to add it to (type_checking is enabled on a 
checking build).
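
The shim idea can be sketched outside the compiler with a standard hash table. Everything below is illustrative: spec, hash_args, shim_hash and checked_equal are hypothetical names, and the real code works on trees rather than integer vectors.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <unordered_set>
#include <vector>

// Stand-in for a specialization: a template id plus its arguments.
struct spec
{
  int tmpl;
  std::vector<int> args;
};

// The "real" argument hash whose consistency we want to verify.
std::size_t hash_args (const std::vector<int> &args)
{
  std::size_t h = 0;
  for (int a : args)
    h = h * 31 + std::hash<int> () (a);
  return h;
}

// Shim hasher: hash by template only, so every specialization of one
// template collides and the comparator gets exercised on all of them.
struct shim_hash
{
  std::size_t operator() (const spec &s) const
  { return std::hash<int> () (s.tmpl); }
};

// Comparator with the checking assert: equal keys must hash equally.
struct checked_equal
{
  bool operator() (const spec &a, const spec &b) const
  {
    bool eq = a.tmpl == b.tmpl && a.args == b.args;
    if (eq)
      assert (hash_args (a.args) == hash_args (b.args));
    return eq;
  }
};

// Insert all SPECS and return how many distinct entries remain.
std::size_t count_distinct (const std::vector<spec> &specs)
{
  std::unordered_set<spec, shim_hash, checked_equal> table;
  for (const spec &s : specs)
    table.insert (s);
  return table.size ();
}
```

Forcing collisions this way turns every lookup into a linear scan of the bucket, which is consistent with the slowdown described above.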


The problem Eric hit in his case was that expression pack expansions 
were comparing equal but hashing differently, causing us to create 
two apparently equal specializations that would sometimes collide. 
I'm not sure why we had some randomness in reproducibility.  I didn't 
locate an uninitialized field, which was one of my hypotheses about the 
cause.  This is fixed by pr94454-pack.diff.  cp_tree_operand_length says 
a pack has 1 operand (for mangling), whereas it actually has 3, only 
two of which are significant for equality.  We must special case that in 
cp_tree_equal.  That new code matches the hasher and the 
type_pack_expansion case in structural_comp_types.


However, I also discovered 2 other problems.

The first is that the hasher was not skipping nodes that 
template_args_equal would.  Fixed by replacing the STRIP_NOPS invocation 
by a bespoke loop.  There's also a change to tpl-tpl-parm hashing, which 
is part of the next problem ...


... we treat tpl-tpl-parms as types.  They're not;  bound-tpl-tpl-parms 
are.  We can get away with them being type-like.  Unfortunately we give 
the original level==orig_level case a canonical type, but the reduced 
cases of level<orig_level do not get one.  That confuses the hasher 
because we'll use TYPE_HASH (CANONICAL_TYPE ()) when we can. 
There's a note in tsubst[TEMPLATE_TEMPLATE_PARM] about why the reduced 
ones cannot have a canonical type. (I didn't feel like questioning that 
assertion at this point.)  So that's the other part of the hash fn change.


Finally, should tpl-tpl-parms ever have a canonical type?  I think we 
can get away with that, because the comparison machinery will do 
structural comparison if one of the nodes requires structural 
comparison.  It seems somewhat skewed though, and pr94454-ttp.diff stops 
tpl-tpl-parms from having canonical types.


In summary:
pr94454-shim.diff - do not apply, keep with bug report
pr94454-arghash.diff - apply, fixes hasher
pr94454-pack.diff - apply, fixes cp_tree_equal
pr94454-ttp.diff - maybe?

Eric's testcase is too huge for the testsuite, and I couldn't get it to 
trigger with arghash.diff and ttp.diff applied but pack.diff not.  I did 
get several fails in the g++ and libstdc++ testsuites with shim.diff 
applied, and none of the fixes.


nathan

--
Nathan Sidwell
2020-04-16  Nathan Sidwell  

	PR 94454 - specialization hash inconsistencies
	* pt.c (iterative_hash_template_arg): Strip nodes as
	template_args_equal does.
	[ARGUMENT_PACK_SELECT, TREE_VEC, CONSTRUCTOR]: Refactor.
	[node_class:TEMPLATE_TEMPLATE_PARM]: Hash by level & index.
	[node_class:default]: Refactor.

diff --git i/gcc/cp/pt.c w/gcc/cp/pt.c
index 0a8ec3198d2..5bc94a85129 100644
--- i/gcc/cp/pt.c
+++ w/gcc/cp/pt.c
@@ -195,6 +195,7 @@ static void set_current_access_from_decl (tree);
 static enum template_base_result get_template_base (tree, tree, tree, tree,
 		bool , tree *);
 static tree try_class_unification (tree, tree, tree, tree, bool);
+static bool class_nttp_const_wrapper_p (tree t);
 static int coerce_template_template_parms (tree, tree, tsubst_flags_t,
 	   tree, tree);
 static bool template_template_parm_bindings_ok_p (tree, tree);
@@ -1737,31 +1741,32 @@ spec_hasher::hash (spec_entry *e)
 }
 
 /* Recursively calculate a hash value for a template argument ARG, for use
-   in the hash tables of template specializations.  */
+   in the hash tables of template specializations.   We must be
+   careful to (at least) skip the same entities template_args_equal
+   does.  */
 
 hashval_t
 iterative_hash_template_arg (tree arg, hashval_t val)
 {
-  unsigned HOST_WIDE_INT i;
-  enum tree_code code;
-  char tclass;
-
   if (arg == NULL_TREE)
 return iterative_hash_object (arg, val);
 
   if (!TYPE_P (arg))
-STRIP_NOPS (arg);
-
-  if (TREE_CODE (arg) == ARGUMENT_PACK_SELECT)
-gcc_unreachable ();
+/* Strip nop-like things, but not th

Re: [PATCH] Do not use HAVE_DOS_BASED_FILE_SYSTEM for Cygwin.

2020-04-16 Thread Maciej W. Rozycki via Gcc-patches
On Thu, 16 Apr 2020, Martin Liška wrote:

> The patch is fix for Cygwin where we should not define 
> HAVE_DOS_BASED_FILE_SYSTEM
> and use back slashes as a path component separator.
[...]
> diff --git a/ltmain.sh b/ltmain.sh
> index 79f9ba89af5..8ad183010f0 100644
> --- a/ltmain.sh
> +++ b/ltmain.sh
> @@ -3425,7 +3425,7 @@ int setenv (const char *, const char *, int);
>  # define PATH_SEPARATOR ':'
>  #endif
>  
> -#if defined (_WIN32) || defined (__MSDOS__) || defined (__DJGPP__) || \
> +#if (defined (_WIN32) && ! defined(__CYGWIN__)) || defined (__MSDOS__) || 
> defined (__DJGPP__) || \
>defined (__OS2__)
>  # define HAVE_DOS_BASED_FILE_SYSTEM
>  # define FOPEN_WB "wb"

 This part needs to go upstream so as to let us avoid local clutter.  Also 
this does not fit in 80 columns and has to be reformatted.
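
For illustration only, one way the condition might be reflowed to stay
within 80 columns is sketched below.  The helper is_dir_separator is a
hypothetical name that does not come from ltmain.sh; it is there just to
show what the macro is meant to control.

```cpp
// Sketch only: the condition from the hunk above, reflowed so that no
// line exceeds 80 columns.  Cygwin is excluded explicitly.
#if (defined (_WIN32) && !defined (__CYGWIN__)) || defined (__MSDOS__) \
    || defined (__DJGPP__) || defined (__OS2__)
# define HAVE_DOS_BASED_FILE_SYSTEM
#endif

// Hypothetical helper (not part of ltmain.sh): backslash counts as a
// path component separator only on DOS-based file systems.
inline bool
is_dir_separator (char c)
{
#ifdef HAVE_DOS_BASED_FILE_SYSTEM
  return c == '/' || c == '\\';
#else
  return c == '/';
#endif
}
```

With Cygwin excluded, only the forward slash would be treated as a
separator there, which is the intent of the patch.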

  Maciej


Re: [PATCH], PR target/94557, V2, Fix GCC 9.x PowerPC regression due to PR target/93932 back port.

2020-04-16 Thread Segher Boessenkool
Hi!

On Wed, Apr 15, 2020 at 09:37:32PM -0400, Michael Meissner wrote:
> Fix regression caused by PR target/93932 backport.
> 
> When I back ported the fix for PR target/93932 to the GCC 9 branch, I put in 
> an
> unintended regression when the GCC compiler is optimizing the vec_extract
> built-in function, and the vector element is in memory, and the index is
> variable.  This patch masks the vector index so that it does not go out of
> bounds.
> 
> 2020-04-15  Michael Meissner  
> 
>   PR target/94557
>   * config/rs6000/rs6000.c (rs6000_adjust_vec_address): Fix
>   regression caused by PR target/93932 backport.  Mask variable
>   vector extract index so it does not go beyond the vector when
>   extracting a vector element from memory.

Much better, thanks!

> --- /tmp/4XFFqK_rs6000.c  2020-04-13 15:28:33.514011024 -0500
> +++ gcc/config/rs6000/rs6000.c2020-04-13 14:24:01.296932921 -0500
> @@ -7047,18 +7047,25 @@ rs6000_adjust_vec_address (rtx scalar_re
>  element_offset = GEN_INT (INTVAL (element) * scalar_size);
>else
>  {
> +  /* Mask the element to make sure the element number is between 0 and 
> the
> +  maximum number of elements - 1 so that we don't generate an address
> +  outside the vector.  */

Hrm, so why do you need to do this here?  It is part of the semantics of
vec_extract, so shouldn't the RTL already have this masking somewhere
when we get here?

Nevertheless, the patch is okay for 9, it certainly won't hurt.  Thanks!


Segher


Re: [PATCH], PR target/94557, V2, Fix GCC 9.x PowerPC regression due to PR target/93932 back port.

2020-04-16 Thread Michael Meissner via Gcc-patches
On Thu, Apr 16, 2020 at 11:31:17AM -0500, Segher Boessenkool wrote:
> Hi!
> 
> On Wed, Apr 15, 2020 at 09:37:32PM -0400, Michael Meissner wrote:
> > Fix regression caused by PR target/93932 backport.
> > 
> > When I back ported the fix for PR target/93932 to the GCC 9 branch, I put 
> > in an
> > unintended regression when the GCC compiler is optimizing the vec_extract
> > built-in function, and the vector element is in memory, and the index is
> > variable.  This patch masks the vector index so that it does not go out of
> > bounds.
> > 
> > 2020-04-15  Michael Meissner  
> > 
> > PR target/94557
> > * config/rs6000/rs6000.c (rs6000_adjust_vec_address): Fix
> > regression caused by PR target/93932 backport.  Mask variable
> > vector extract index so it does not go beyond the vector when
> > extracting a vector element from memory.
> 
> Much better, thanks!
> 
> > --- /tmp/4XFFqK_rs6000.c2020-04-13 15:28:33.514011024 -0500
> > +++ gcc/config/rs6000/rs6000.c  2020-04-13 14:24:01.296932921 -0500
> > @@ -7047,18 +7047,25 @@ rs6000_adjust_vec_address (rtx scalar_re
> >  element_offset = GEN_INT (INTVAL (element) * scalar_size);
> >else
> >  {
> > +  /* Mask the element to make sure the element number is between 0 and 
> > the
> > +maximum number of elements - 1 so that we don't generate an address
> > +outside the vector.  */
> 
> Hrm, so why do you need to do this here?  It is part of the semantics of
> vec_extract, so shouldn't the RTL already have this masking somewhere
> when we get here?

Yes, as we discussed when it went into the master branch, the PowerPC
vec_extract built-in function explicitly requires the masking, rather than it
being undefined.  Currently, the masking is not done when the built-in is
created, but only when it is split into the smaller insns.
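
As a rough illustration of the masking (a sketch in plain C++, not the RTL
the back end emits; extract_masked is a hypothetical name): when the element
count is a power of two, ANDing the index with count - 1 keeps the access
inside the vector.

```cpp
#include <cstddef>

// Sketch only: model a variable-index extract from a vector in memory.
// N is assumed to be a power of two.
template <typename T, std::size_t N>
T
extract_masked (const T (&vec)[N], std::size_t index)
{
  static_assert (N != 0 && (N & (N - 1)) == 0, "N must be a power of two");
  // Masking wraps any out-of-range index back into [0, N - 1], so the
  // computed address never falls outside the vector.
  return vec[index & (N - 1)];
}
```

For a four-element vector, index 5 is masked to 1 and index 4 to 0, so
neither reads past the end.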

What makes this more complicated than normal is that while we have VEC_SELECT
for the case where the index is constant, VEC_SELECT does not work for a
variable index.  So, we have to have a parallel set of insns that use an
UNSPEC for the variable case.

And we need to use UNSPEC (or VEC_SELECT) before combine, so that the combiner
has a chance to build the alternate insn where the vector is in memory, rather
than only doing the extract once the vector is loaded into a register.

> Nevertheless, the patch is okay for 9, it certainly won't hurt.  Thanks!


-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.ibm.com, phone: +1 (978) 899-4797


Re: [PATCH] rs6000: Fix ICE in decompose_normal_address, at rtlanal.c:6403

2020-04-16 Thread Peter Bergner via Gcc-patches
On 4/16/20 8:21 AM, Peter Bergner wrote:
> This passed bootstrap and regression testing on powerpc64le-linux with no
> regressions.  Ok for mainline?

This also just passed bootstrap and regtesting on (BE) powerpc64-linux
running the testsuite in both 32-bit and 64-bit modes, with no regressions.

Peter


Re: [PATCH 0/19][GCC-8] aarch64: Backport outline atomics

2020-04-16 Thread Pop, Sebastian via Gcc-patches
Thanks Andre for the back-port to gcc-8.  Overall the patches look good to me.

Could you please move the patch "[PATCH 13/19][GCC-8] Aarch64: Fix 
shrinkwrapping interactions with atomics (PR92692)"
just after "[PATCH 8/19][GCC-8] aarch64: Implement TImode compare-and-swap"
such that the change that breaks TSAN builds gets fixed right away instead of 
waiting to get the fix after 4 more patches?

I would like to test the patches on Graviton2.
Could you please send me the git format-patch version?
It is hard to extract the patches from the mailing list without the patches 
getting scrambled.

Thanks,
Sebastian

On 4/16/20, 7:24 AM, "Andre Vieira (lists)"  
wrote:
   
Hi,

This series backports all the patches and fixes regarding outline
atomics to the gcc-8 branch.

Bootstrapped the series for aarch64-linux-gnu and regression tested.
Is this OK for gcc-8?

Andre Vieira (19):
aarch64: Add early clobber for aarch64_store_exclusive
aarch64: Simplify LSE cas generation
aarch64: Improve cas generation
aarch64: Improve swp generation
aarch64: Improve atomic-op lse generation
aarch64: Remove early clobber from ATOMIC_LDOP scratch
aarch64: Extend %R for integer registers
aarch64: Implement TImode compare-and-swap
aarch64: Tidy aarch64_split_compare_and_swap
aarch64: Add out-of-line functions for LSE atomics
Add visibility to libfunc constructors
aarch64: Implement -moutline-atomics
Aarch64: Fix shrinkwrapping interactions with atomics (PR92692)
aarch64: Fix store-exclusive in load-operate LSE helpers
aarch64: Configure for sys/auxv.h in libgcc for lse-init.c
aarch64: Fix up aarch64_compare_and_swaphi pattern [PR94368]
aarch64: Fix bootstrap with old binutils [PR93053]
aarch64: Fix ICE due to aarch64_gen_compare_reg_maybe_ze [PR94435]
re PR target/90724 (ICE with __sync_bool_compare_and_swap with
-march=armv8.2-a+sve)





Re: [RFC] split pseudos during loop unrolling in RTL unroller

2020-04-16 Thread Segher Boessenkool
On Thu, Apr 16, 2020 at 10:33:45AM +0200, Richard Biener wrote:
> On Wed, Apr 15, 2020 at 11:23 PM Segher Boessenkool
> > On a general note, we shouldn't depend on some pass that may or may not
> > clean up the mess we make, when we could just avoid making a mess in the
> > first place.
> 
> True - but the issue at hand is not trivial given you have to care for
> partial defs, uses outside of the loop (or across the backedge), etc.
> So there's plenty of things to go "wrong" here.

Certainly.  But *all* RTL passes before RA should leave things in "web
form" (compare to SSA form).  The code in web.c is probably just fine;
but we shouldn't have one web pass, *all* passes should leave things in
a useful form!
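
To make "web form" a bit more concrete, here is a toy sketch in plain C++.
It is not the code in web.c: the insn struct and split_webs are made-up
names, and it assumes straight-line code in which every def is a full def,
so none of the hard cases (partial defs, uses outside the loop or across
the backedge) appear.

```cpp
#include <map>
#include <vector>

// Toy instruction: one destination register and some source registers.
struct insn
{
  int dest;
  std::vector<int> srcs;
};

// Give every redefinition of a register a fresh name and rewrite the
// uses that follow it, so each def-use web gets its own pseudo.
std::vector<insn>
split_webs (std::vector<insn> code, int first_new_reg)
{
  std::map<int, int> current;   // original register -> current name
  int next = first_new_reg;
  for (insn &i : code)
    {
      // Uses refer to whichever name the most recent def received.
      for (int &s : i.srcs)
        {
          auto it = current.find (s);
          if (it != current.end ())
            s = it->second;
        }
      const int orig = i.dest;
      if (current.find (orig) == current.end ())
        current[orig] = orig;   // the first def keeps its name
      else
        {
          current[orig] = next; // a later def starts a new web
          i.dest = next++;
        }
    }
  return code;
}
```

A pass that creates a second, independent def of a pseudo could avoid the
reuse in the first place by allocating the fresh register itself.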

> > The web pass belongs immediately after expand; but ideally, even expand
> > would not reuse pseudos anyway.
> 
> But for example when lower-subreg decomposes things in a way turning
> partial defs into full defs new opportunities to split the web arise.

But that is only for the new registers it creates, or what am I missing?

> > Maybe it would be better as some utility routines, not a pass?
> 
> Sure, but then when do we apply it?

All of the time!  Ideally every pass would leave things in a good shape.
Usually it is cheapest as well as easiest to do things manually, but for
some harder cases, such helper routines can be used.

> Ideally scheduling would to
> register renaming itself and thus not rely on the used pseudos
> (I'm not sure if it tracks false dependences - I guess it must if it
> isn't able to rename regs).  That would be a much better place
> for improvements?

sched2 runs after RA, so it has nothing to do with webs?  And sched1
doesn't do much relevant here (it doesn't move insns much).

I don't see how this is directly related to register renaming either?


Segher


Re: [patch, fortran] Fix PR PR93500

2020-04-16 Thread Fritz Reese via Gcc-patches
On Thu, Apr 16, 2020 at 7:53 AM Thomas Koenig via Fortran
 wrote:
>
> Hello world,
>
> this patch fixes PR PR93500.  One part of it is due to
> what Steve wrote in the patch (returning from resolutions when both
> operands are NULL), but that still left a nonsensical error.
> Returning &gfc_bad_expr when simplifying bounds resulted in the
> division by zero error actually reaching the user.
>
> As to why there is an extra error when this is done in the main
> program, as compared to a subroutine, I don't know, but I do not
> particularly care. What is important is that the first error message
> is clear and reaches the user.
>
> Regression-tested. OK for trunk?

How odd. It seems something in the procedure matching routine fails to
free the symbol node for "a", while this _is_ done for the program
case. A bug for another day...


IMO a clearer test case uses "implicit none", which reveals the
more sensible "Symbol .a. at ... has no IMPLICIT type" (at least for
the program case, where a second error is displayed). One can see
another variation of this with a declaration like "integer,
dimension(lbound(a)) :: c" which gives a similar error "Symbol .a. is
used before it is typed at ...".


Regarding the new code in simplify.c:

  bounds[d] = simplify_bound_dim([...])
  if (bounds[d] == NULL || bounds[d] == &gfc_bad_expr)
{
 [...]
  if (gfc_seen_div0)
{
  gfc_free_expr (bounds[d]);
  return &gfc_bad_expr;
}
[...]

First, it appears if simplify_bound_dim returns &gfc_bad_expr (and a
div/0 occurs) then this code will free &gfc_bad_expr. I'm not sure
whether or not that can actually occur, but it is certainly incorrect,
since &gfc_bad_expr points to static storage. The only other possible
case is bounds[d] == NULL, in which case the free is a no-op. I
suggest removing the free call.
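
To make the hazard concrete, here is a minimal sketch using stand-in types
(expr, bad_expr, free_expr and release_result are hypothetical names, not
gfortran's): a sentinel with static storage has to be filtered out before
anything is handed to a deallocation routine.

```cpp
#include <cstdlib>

struct expr
{
  int value;
};

// Stand-in for a sentinel such as gfc_bad_expr: it has static storage
// and was never obtained from malloc.
static expr bad_expr;

static int frees_done = 0;

// Stand-in deallocation routine: a no-op for NULL, otherwise free ().
static void
free_expr (expr *e)
{
  if (e == nullptr)
    return;
  std::free (e);
  ++frees_done;
}

// Safe pattern: the sentinel is never passed on to free_expr.
static void
release_result (expr *e)
{
  if (e != &bad_expr)
    free_expr (e);
}
```

Passing &bad_expr straight to free_expr would hand free () a pointer it did
not allocate, which is undefined behaviour.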

That being said, it looks like the same error condition can occur with
the lcobound intrinsic. I see code inside simplify_cobound nearly
identical to that in simplify_bound which is not guarded by the new
gfc_seen_div0 check. Someone more familiar with coarrays may be able
to generate a testcase which exhibits the same regression using
lcobound, but I am confident it can occur. This suggests to me that
the check is better placed in simplify_bound_dim, which both
simplify_bound and simplify_cobound call. If simplify_bound_dim
returns &gfc_bad_expr, it appears both routines should continue to do
the right thing already (which would not include freeing
gfc_bad_expr). It is the call to gfc_resolve_array_spec within
simplify_bound_dim which signals the div0, so I believe the div0 check
should be inserted right here (around line 4075). How about the
following patch to simplify.c instead (which appears to have the
fortunate side-effect of fixing a formerly leaked result expression):

diff --git a/gcc/fortran/simplify.c b/gcc/fortran/simplify.c
index d5703e38251..5395694dc67 100644
--- a/gcc/fortran/simplify.c
+++ b/gcc/fortran/simplify.c
@@ -4073,7 +4073,13 @@ simplify_bound_dim (gfc_expr *array, gfc_expr
*kind, int d, int upper,
   gcc_assert (as);

   if (!gfc_resolve_array_spec (as, 0))
-return NULL;
+{
+  gfc_free_expr (result);
+  if (gfc_seen_div0)
+   return &gfc_bad_expr;
+  else
+   return NULL;
+}

   /* The last dimension of an assumed-size array is special.  */
   if ((!coarray && d == as->rank && as->type == AS_ASSUMED_SIZE && !upper)

---
Fritz Reese


[PATCH] c++: Non-type-dependent variadic lambda init-capture [PR94483]

2020-04-16 Thread Patrick Palka via Gcc-patches
In this PR (which I think is misclassified as ice-on-invalid instead of
ice-on-valid), we're ICEing on a use of an 'int... a' template parameter pack as
part of the variadic lambda init-capture [...z=a].

The unexpected thing about this variadic init-capture is that it is not
type-dependent, and so when we call do_auto_deduction from
lambda_capture_field_type it actually resolves its type to 'int' instead of
exiting early like it would do for a type-dependent variadic initializer.  This
later confuses add_capture which, according to one of its comments, assumes that
'type' is always 'auto' for a variadic init-capture.

The simplest fix, and the approach that this patch takes, seems to be to avoid
doing auto deduction in lambda_capture_field_type when the initializer uses
parameter packs, so that we always return 'auto' even in the non-type-dependent
case.

Passes 'make check-c++', does this look OK to commit after full
bootstrap/regtesting?

gcc/cp/ChangeLog:

PR c++/94483
* lambda.c (lambda_capture_field_type): Avoid doing auto deduction if
the explicit initializer has parameter packs.

gcc/testsuite/ChangeLog:

PR c++/94483
* g++.dg/cpp2a/lambda-pack-init5.C: New test.
---
 gcc/cp/lambda.c|  5 -
 gcc/testsuite/g++.dg/cpp2a/lambda-pack-init5.C | 18 ++
 2 files changed, 22 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp2a/lambda-pack-init5.C

diff --git a/gcc/cp/lambda.c b/gcc/cp/lambda.c
index 4f39f99756b..b55c2f85d27 100644
--- a/gcc/cp/lambda.c
+++ b/gcc/cp/lambda.c
@@ -223,7 +223,10 @@ lambda_capture_field_type (tree expr, bool explicit_init_p,
/* Add the reference now, so deduction doesn't lose
   outermost CV qualifiers of EXPR.  */
type = build_reference_type (type);
-  type = do_auto_deduction (type, expr, auto_node);
+  if (uses_parameter_packs (expr))
+   /* Stick with 'auto' even if the type could be deduced.  */;
+  else
+   type = do_auto_deduction (type, expr, auto_node);
 }
   else if (!is_this && type_dependent_expression_p (expr))
 {
diff --git a/gcc/testsuite/g++.dg/cpp2a/lambda-pack-init5.C 
b/gcc/testsuite/g++.dg/cpp2a/lambda-pack-init5.C
new file mode 100644
index 000..492fc479e94
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp2a/lambda-pack-init5.C
@@ -0,0 +1,18 @@
+// PR c++/94483
+// { dg-do compile { target c++2a } }
+
+template<int... a> constexpr auto x1
+  = [...z = -a] (auto F) { return F(z...); };
+
+template<const int&... a> constexpr auto x2
+  = [&...z = a] (auto F) { return F(z...); };
+
+template<int... a> constexpr auto x3
+  = [z = -a] (auto F) { return F(z); }; // { dg-error "packs not expanded" }
+
+
+constexpr auto sum = [] (auto... xs) { return (xs + ... + 0); };
+const int y1 = 1, y2 = 2, y3 = 3;
+
+static_assert(x1<1,2,3>(sum) == -6);
+static_assert(x2<y1,y2,y3>(sum) == 6);
-- 
2.26.1.107.gefe3874640



Re: [PATCH], PR target/94557, V2, Fix GCC 9.x PowerPC regression due to PR target/93932 back port.

2020-04-16 Thread Segher Boessenkool
On Thu, Apr 16, 2020 at 12:45:46PM -0400, Michael Meissner wrote:
> > > +  /* Mask the element to make sure the element number is between 0 
> > > and the
> > > +  maximum number of elements - 1 so that we don't generate an address
> > > +  outside the vector.  */
> > 
> > Hrm, so why do you need to do this here?  It is part of the semantics of
> > vec_extract, so shouldn't the RTL already have this masking somewhere
> > when we get here?
> 
> Yes, as we discussed when it went into the master branch, the PowerPC
> vec_extract built-in function explicitly requires the masking, rather than it
> being undefined.  Currently, the masking is not done when the built-in is
> created, but only when it is split into the smaller insns.

That is very fragile.  Can that not be fixed?

> What makes this more complicated that normal is that while we have VEC_SELECT
> for the case where the index is constant, VEC_SELECT does not work for a
> variable index.

But it could, RTL supports that just fine.


Segher


Re: [PATCH] rs6000: Fix ICE in decompose_normal_address, at rtlanal.c:6403

2020-04-16 Thread Segher Boessenkool
Hi!

On Thu, Apr 16, 2020 at 08:21:07AM -0500, Peter Bergner wrote:
> The ICE in PR93974 is caused by a bug in decompose address not being able to
> handle Altivec addresses the use AND: to strip off the bottom address bits.
> Rather than modify lra-constraints.c or rtlanal.c to solve this generic
> problem this late in the release cycle, I have decided to fix this in target
> code by defining the TARGET_CANNOT_SUBSTITUTE_MEM_EQUIV_P target hook to
> reject mems with Altivec addresses from being used as equivalent expressions.
> I think this is fine, since Altivec addresses are legacy addresses.  I have
> confirmed the following patch fixes the ICE and that we still get the same
> code generated for the test case below, that we got before my PR93658 patch.

Excellent :-)  Just some very minor things:

> --- a/gcc/config/rs6000/rs6000.c
> +++ b/gcc/config/rs6000/rs6000.c
> @@ -1734,6 +1734,9 @@ static const struct attribute_spec 
> rs6000_attribute_table[] =
>  
>  #undef TARGET_MANGLE_DECL_ASSEMBLER_NAME
>  #define TARGET_MANGLE_DECL_ASSEMBLER_NAME rs6000_mangle_decl_assembler_name
> +
> +#undef TARGET_CANNOT_SUBSTITUTE_MEM_EQUIV_P
> +#define TARGET_CANNOT_SUBSTITUTE_MEM_EQUIV_P 
> rs6000_cannot_substitute_mem_equiv_p

This line gets too long, you could split it in two?  (I did say "very
minor", right?)

> --- /dev/null
> +++ b/gcc/testsuite/g++.dg/pr93974.C
> @@ -0,0 +1,27 @@
> +/* { dg-do compile { target { powerpc*-*-linux* } } } */
> +/* { dg-require-effective-target powerpc_p8vector_ok } */
> +/* { dg-options "-mdejagnu-cpu=power8 -O3 -fstack-protector-strong" } */

Is there a reason to do this on Linux only?  If not, you can just do
{ dg-do compile } ?

Okay for trunk, however you choose to resolve those things.  Thank you!


Segher


Re: [PATCH] middle-end/94614 - avoid multiword moves to nothing

2020-04-16 Thread Jeff Law via Gcc-patches
On Thu, 2020-04-16 at 10:05 +0200, Richard Biener wrote:
> This adjusts emit_move_multi_word to handle moves into paradoxical
> subregs parts that are not there and resolve_clobber to handle
> such subregs.
> 
> Bootstrap & regtest running on x86_64-unknown-linux-gnu.
> 
> The testcase involves writing to a register out of bounds so I'm not
> sure this is the correct place to paper over this or whether RTL
> expansion should have done things differently.
> 
> ;; MEM[(v4si *)&res] = v_2(D);
> 
> (insn 12 9 10 (clobber (subreg:TI (reg/v:DI 113 [ res ]) 0)) 
> "pr94574.c":13:18 -1
>  (nil))
> 
> (insn 10 12 11 (set (subreg:SI (reg/v:DI 113 [ res ]) 0)
> (subreg:SI (reg/v:TI 115 [ v ]) 0)) "pr94574.c":13:18 -1
>  (nil))
> 
> (insn 11 10 0 (set (subreg:SI (reg/v:DI 113 [ res ]) 4)
> (subreg:SI (reg/v:TI 115 [ v ]) 4)) "pr94574.c":13:18 -1
>  (nil))
> 
> maybe we should simply force regs with out-of-bound accesses to
> memory?  The above is the RTL generated after the first half of the
> fix.  We still generate
> 
> (insn 12 7 10 2 (clobber (subreg:TI (reg/v:DI 113 [ res ]) 0)) 
> "pr94574.c":13:18 -1  
>  (nil))
> 
> which lower-subreg runs into - I did not track down where that
> is generated, but I understand the subreg is pointless here?
Yea.  The whole point of these clobbers is to indicate to the various analysis
passes that the whole object is clobbered.  In the case of a paradoxical pseudo
we could just strip the subreg.

If we knew there was a set of the SUBREG_REG object, then we could emit the
clobber completely since it carries no useful information at that point (that
is, omit it).

Jeff





[PATCH] wwwdocs: document my changes for gcc 10

2020-04-16 Thread David Malcolm via Gcc-patches
Validates.  The wording could probably use some work.

OK to push to the website repo?

---
 htdocs/gcc-10/changes.html | 37 +++--
 1 file changed, 35 insertions(+), 2 deletions(-)

diff --git a/htdocs/gcc-10/changes.html b/htdocs/gcc-10/changes.html
index 65e2fb3d..acd0c342 100644
--- a/htdocs/gcc-10/changes.html
+++ b/htdocs/gcc-10/changes.html
@@ -98,6 +98,14 @@ a work-in-progress.
  This makes it possible to rebuild program
  with same outcome which is useful, for example, for distribution 
packages.
   
+  https://gcc.gnu.org/onlinedocs/gcc/Static-Analyzer-Options.html";>-fanalyzer
+   enables a new static analysis pass and associated warnings.
+   This pass performs a time-consuming exploration of paths through
+   the code in the hope of detecting various common errors, such as
+   double-free bugs.  This option should be regarded as
+   experimental in this release.  In particular, analysis of non-C
+   code is unlikely to work.
+  
 
   
   
@@ -860,7 +868,23 @@ typedef svbool_t pred512 
__attribute__((arm_sve_vector_bits(512)));
 
 
 
-
+Improvements for plugin authors
+
+  
+GCC diagnostics can now have a chain of events associated with them,
+describing a path through the code that triggers the problem.
+These can be printed by the diagnostics subsystem in various ways,
+controlled by the
+https://gcc.gnu.org/onlinedocs/gcc/Diagnostic-Message-Formatting-Options.html#index-fdiagnostics-path-format";>-fdiagnostics-path-format
+option, or captured in JSON form via
+https://gcc.gnu.org/onlinedocs/gcc/Diagnostic-Message-Formatting-Options.html#index-fdiagnostics-format";>-fdiagnostics-format=json.
+  
+GCC diagnostics can now be associated with
+https://cwe.mitre.org/";>CWE weakness identifiers, which
+will appear on the standard error stream, and in the JSON output from
+https://gcc.gnu.org/onlinedocs/gcc/Diagnostic-Message-Formatting-Options.html#index-fdiagnostics-format";>-fdiagnostics-format=json.
+  
+
 
 
 Other significant improvements
@@ -873,9 +897,18 @@ typedef svbool_t pred512 
__attribute__((arm_sve_vector_bits(512)));
 for overlapping memory moves, consistent with the
 library functions memcpy and memmove.
   
+  
+For many releases, when GCC emits a warning it prints the option
+controlling that warning.  As of GCC 10, that option text is now a
+clickable hyperlink for the documentation of that option (assuming a
+https://gist.github.com/egmontkob/eb114294efbcd5adb1944c9f3cb5feda";>sufficiently
 capable terminal).
+This behavior can be controlled via a new
+https://gcc.gnu.org/onlinedocs/gcc/Diagnostic-Message-Formatting-Options.html#index-fdiagnostics-urls";>-fdiagnostics-urls
+option (along with various environment variables and heuristics
+documented with that option).
+  
 
 
-
 
 
 
-- 
2.21.0



[PATCH] c++: Hard error with tentative parse of declaration [PR88754]

2020-04-16 Thread Patrick Palka via Gcc-patches
In the testcase for this PR, we try to parse the statement

  A(value<0>());

first tentatively as a declaration (with a parenthesized declarator), and during
this tentative parse we end up issuing a hard error from
cp_parser_check_template_parameters about its invalidness as a declaration.

Rather than issuing a hard error, it seems we should instead simulate an error
since we're parsing tentatively.  This would then allow cp_parser_statement to
recover and successfully parse the statement as an expression-statement instead.
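
For reference, the ambiguity can be shown with a small stand-alone sketch
(the counter and demo are made up for illustration and are not part of the
new test): a statement of the form "A (x);" is a declaration, whereas
"A (value<0> ());" cannot be one and therefore has to be parsed as an
expression.

```cpp
struct A
{
  A () { ++count; }
  A (int) { ++count; }
  static int count;   // number of A objects constructed so far
};
int A::count = 0;

template <int N> int value () { return N; }

int
demo ()
{
  A (x);            // Declaration: a variable 'x' of type A, written
                    // with a parenthesized declarator.
  A (value<0> ());  // Not a valid declaration, so this is an
                    // expression-statement constructing a temporary A.
  return A::count;
}
```

Both statements construct one A, so the first call to demo returns 2 with a
compiler that recovers from the tentative parse.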

Passes 'make check-c++', does this look OK to commit after bootstrap/regtesting?

gcc/cp/ChangeLog:

PR c++/88754
* parser.c (cp_parser_check_template_parameters): Before issuing a hard
error, first try simulating an error instead.

gcc/testsuite/ChangeLog:

PR c++/88754
* g++.dg/parse/ambig10.C: New test.
---
 gcc/cp/parser.c  |  4 
 gcc/testsuite/g++.dg/parse/ambig10.C | 20 
 2 files changed, 24 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/parse/ambig10.C

diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index e037e7d8c8e..47e3f2bbd3d 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -28531,6 +28531,10 @@ cp_parser_check_template_parameters (cp_parser* parser,
   if (!template_id_p
   && parser->num_template_parameter_lists == num_templates + 1)
 return true;
+
+  if (cp_parser_simulate_error (parser))
+return false;
+
   /* If there are more template classes than parameter lists, we have
  something like:
 
diff --git a/gcc/testsuite/g++.dg/parse/ambig10.C 
b/gcc/testsuite/g++.dg/parse/ambig10.C
new file mode 100644
index 000..42b04b16923
--- /dev/null
+++ b/gcc/testsuite/g++.dg/parse/ambig10.C
@@ -0,0 +1,20 @@
+// PR c++/88754
+// { dg-do compile }
+
+struct A
+{
+  A(int);
+  void foo();
+};
+
+template<int N> int value() { return N; }
+
+void bar()
+{
+  A(value<0>()).foo();
+  A(value<0>());
+  (A(value<0>())).foo();
+
+  A value<0>; // { dg-error "invalid declaration" }
+  A value<0>(); // { dg-error "invalid declaration" }
+}
-- 
2.26.1.107.gefe3874640



Re: [PATCH v2] rs6000: Don't use HARD_FRAME_POINTER_REGNUM if it's not live in pro_and_epilogue (PR91518)

2020-04-16 Thread Segher Boessenkool
Hi!

On Mon, Apr 13, 2020 at 10:11:43AM +0800, luoxhu wrote:
> frame_pointer_needed is set to true in reload pass setup_can_eliminate,
> but regs_ever_live[31] is false, pro_and_epilogue uses it without live
> check causing CPU2006 465.tonto segment fault of loading from invalid
> addresses due to r31 not saved/restored.  Thus, add HARD_FRAME_POINTER_REGNUM
> live check with frame_pointer_needed_indeed_p when generating pro_and_epilogue
> instructions.

I see.

Can you instead make a boolean variable "frame_pointer_needed_indeed",
that you set somewhere early in *logue processing?  So that we can be
sure that it will not change behind our backs.

>  void
>  rs6000_emit_prologue_components (sbitmap components)
>  {
>rs6000_stack_t *info = rs6000_stack_info ();
> -  rtx ptr_reg = gen_rtx_REG (Pmode, frame_pointer_needed
> -  ? HARD_FRAME_POINTER_REGNUM
> -  : STACK_POINTER_REGNUM);
> +  rtx ptr_reg = gen_rtx_REG (Pmode, frame_pointer_needed_indeed_p ()
> +   ? HARD_FRAME_POINTER_REGNUM
> +   : STACK_POINTER_REGNUM);

Yeah, I got the indent wrong there, thanks for fixing it :-)

These four cases might well be the only four you need to fix here, but
I'll double-check it tomorrow, when I'm awake ;-)

Thanks!


Segher


Re: [PATCH] Fold (add -1; zero_ext; add +1) operations to zero_ext when not zero (PR37451, PR61837)

2020-04-16 Thread Segher Boessenkool
On Wed, Apr 15, 2020 at 10:18:16AM +0100, Richard Sandiford wrote:
> luoxhu--- via Gcc-patches  writes:
> > -count = simplify_gen_binary (PLUS, mode, count, const1_rtx);
> > +{
> > +  /* Fold (add -1; zero_ext; add +1) operations to zero_ext based on 
> > addop0
> > +is never zero, as gimple pass loop ch will do optimization to simplify
> > +the loop to NO loop for loop condition is false.  */
> 
> IMO the code needs to prove this, rather than just assume that previous
> passes have made it so.

Well, it should gcc_assert it, probably.

It is the left-hand side of a+b...  it cannot be 0, because niter always
is simplified!
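
A small worked example of the arithmetic, as a sketch in plain C++ rather
than RTL (unfolded and folded are made-up names): with a 32-bit value
zero-extended to 64 bits, the add/extend/add sequence equals a plain
zero-extension exactly when the input is non-zero.

```cpp
#include <cstdint>

// (add -1; zero_ext; add +1): subtract in 32 bits, widen, then add.
constexpr std::uint64_t
unfolded (std::uint32_t a)
{
  return static_cast<std::uint64_t> (static_cast<std::uint32_t> (a - 1u)) + 1u;
}

// The folded form: a plain zero-extension.
constexpr std::uint64_t
folded (std::uint32_t a)
{
  return static_cast<std::uint64_t> (a);
}

static_assert (unfolded (1) == folded (1), "equal for non-zero input");
static_assert (unfolded (0xffffffffu) == folded (0xffffffffu),
               "equal for non-zero input");
// For zero the 32-bit subtraction wraps, so the results differ:
// 2^32 versus 0.
static_assert (unfolded (0) == (std::uint64_t (1) << 32), "wraps for zero");
static_assert (folded (0) == 0, "plain extension of zero");
```

So the fold is only valid if zero is ruled out, whether by proof or by an
assertion.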


Segher


Re: [PATCH] rs6000: Fix ICE in decompose_normal_address, at rtlanal.c:6403

2020-04-16 Thread Peter Bergner via Gcc-patches
On 4/16/20 5:21 PM, Segher Boessenkool wrote:
>> +#undef TARGET_CANNOT_SUBSTITUTE_MEM_EQUIV_P
>> +#define TARGET_CANNOT_SUBSTITUTE_MEM_EQUIV_P 
>> rs6000_cannot_substitute_mem_equiv_p
> 
> This line gets too long, you could split it in two? 

Done.


>> +/* { dg-do compile { target { powerpc*-*-linux* } } } */
>> +/* { dg-require-effective-target powerpc_p8vector_ok } */
>> +/* { dg-options "-mdejagnu-cpu=power8 -O3 -fstack-protector-strong" } */
> 
> Is there a reason to do this on Linux only?  If not, you can just do
> { dg-do compile } ?

Ok, changed.  I was trying to limit it to POWER and then thought that
no other OS on POWER supports P8 vector, so I added that hunk, but I
guess the dg-requires-effective-target is enough.



> Okay for trunk, however you choose to resolve those things.  Thank you!

Thanks for the review.  I just pushed the updated patch.

Peter



Re: [PATCH v2] rs6000: Don't use HARD_FRAME_POINTER_REGNUM if it's not live in pro_and_epilogue (PR91518)

2020-04-16 Thread luoxhu via Gcc-patches



On 2020/4/17 08:52, Segher Boessenkool wrote:
> Hi!
> 
> On Mon, Apr 13, 2020 at 10:11:43AM +0800, luoxhu wrote:
>> frame_pointer_needed is set to true in reload pass setup_can_eliminate,
>> but regs_ever_live[31] is false, pro_and_epilogue uses it without live
>> check causing CPU2006 465.tonto segment fault of loading from invalid
>> addresses due to r31 not saved/restored.  Thus, add HARD_FRAME_POINTER_REGNUM
>> live check with frame_pointer_needed_indeed_p when generating 
>> pro_and_epilogue
>> instructions.
> 
> I see.
> 
> Can you instead make a boolean variable "frame_pointer_needed_indeed",
> that you set somewhere early in *logue processing?  So that we can be
> sure that it will not change behind our backs.


Thanks, rs6000_emit_prologue seems the proper place to set
frame_pointer_needed_indeed.  But it's strange that hard_frame_pointer_rtx
is marked with a USE in make_prologue_seq; does that also need a check, even
though it does not cause the segfault?  PS: this piece of code is in a
different file.

function.c 
static rtx_insn *
make_prologue_seq (void)
{
  if (!targetm.have_prologue ())
return NULL;

  start_sequence ();
  rtx_insn *seq = targetm.gen_prologue ();
  emit_insn (seq);

  /* Insert an explicit USE for the frame pointer
 if the profiling is on and the frame pointer is required.  */
  if (crtl->profile && frame_pointer_needed)
emit_use (hard_frame_pointer_rtx);
...



Anyway, I have updated the patch as below to address your previous comments:



This bug was exposed by the FRE refactoring in r263875.  Comparing the
FRE dump files shows no obvious change in the function that segfaults,
which indicates that this is a target issue.
frame_pointer_needed is set to true in the reload pass
(setup_can_eliminate), but regs_ever_live[31] is false.
pro_and_epilogue uses r31 without a liveness check, so CPU2006
465.tonto segfaults when loading from invalid addresses, because r31
is not saved/restored.  Thus, add a HARD_FRAME_POINTER_REGNUM liveness
check via frame_pointer_needed_indeed when generating pro_and_epilogue
instructions.

Bootstrapped and regression tested on Power8-LE.  A backport to gcc-9
is required once approved.
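
As a rough illustration of the approach described above (not the actual
rs6000 code), the idea is to compute one boolean early in prologue
generation and have every later user consult it instead of
frame_pointer_needed.  The struct, the constants and the helper names
below are hypothetical stand-ins; only frame_pointer_needed,
df_regs_ever_live_p and the two REGNUM macros come from the patch.

```cpp
// Hypothetical stand-ins for GCC state; these are NOT the real GCC
// definitions, just a small model of the decision the patch makes.
constexpr int kStackPointerRegnum = 1;       // models STACK_POINTER_REGNUM
constexpr int kHardFramePointerRegnum = 31;  // models HARD_FRAME_POINTER_REGNUM

struct FunctionState {
  bool frame_pointer_needed;  // what the reload pass decided
  bool fp_reg_ever_live;      // models df_regs_ever_live_p (r31)
};

// The frame pointer is only "really" needed when it was requested AND
// the register is live, i.e. it will be saved and restored.
inline bool frame_pointer_needed_indeed(const FunctionState &f) {
  return f.frame_pointer_needed && f.fp_reg_ever_live;
}

// Pick the base register used for save/restore addressing.
inline int base_pointer_regnum(const FunctionState &f) {
  return frame_pointer_needed_indeed(f) ? kHardFramePointerRegnum
                                        : kStackPointerRegnum;
}
```

In this model the failing case from the PR is {frame_pointer_needed =
true, fp_reg_ever_live = false}: the old test would pick r31, while the
combined test falls back to the stack pointer.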

gcc/ChangeLog

2020-04-17  Xiong Hu Luo  

PR target/91518
* config/rs6000/rs6000-logue.c (frame_pointer_needed_indeed):
New variable.
(rs6000_emit_prologue_components):
Check with frame_pointer_needed_indeed.
(rs6000_emit_epilogue_components): Likewise.
(rs6000_emit_epilogue): Likewise.
(rs6000_emit_prologue): Set frame_pointer_needed_indeed.
---
 gcc/config/rs6000/rs6000-logue.c | 23 ++-
 1 file changed, 14 insertions(+), 9 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-logue.c b/gcc/config/rs6000/rs6000-logue.c
index 4cbf228eb79..2213d1fa227 100644
--- a/gcc/config/rs6000/rs6000-logue.c
+++ b/gcc/config/rs6000/rs6000-logue.c
@@ -58,6 +58,8 @@ static bool rs6000_save_toc_in_prologue_p (void);
 
 static rs6000_stack_t stack_info;
 
+/* Set if HARD_FRAME_POINTER_REGNUM is really needed.  */
+static bool frame_pointer_needed_indeed = false;
 
 /* Label number of label created for -mrelocatable, to call to so we can
get the address of the GOT section */
@@ -2735,9 +2737,9 @@ void
 rs6000_emit_prologue_components (sbitmap components)
 {
   rs6000_stack_t *info = rs6000_stack_info ();
-  rtx ptr_reg = gen_rtx_REG (Pmode, frame_pointer_needed
-? HARD_FRAME_POINTER_REGNUM
-: STACK_POINTER_REGNUM);
+  rtx ptr_reg = gen_rtx_REG (Pmode, frame_pointer_needed_indeed
+ ? HARD_FRAME_POINTER_REGNUM
+ : STACK_POINTER_REGNUM);
 
   machine_mode reg_mode = Pmode;
   int reg_size = TARGET_32BIT ? 4 : 8;
@@ -2815,9 +2817,9 @@ void
 rs6000_emit_epilogue_components (sbitmap components)
 {
   rs6000_stack_t *info = rs6000_stack_info ();
-  rtx ptr_reg = gen_rtx_REG (Pmode, frame_pointer_needed
-? HARD_FRAME_POINTER_REGNUM
-: STACK_POINTER_REGNUM);
+  rtx ptr_reg = gen_rtx_REG (Pmode, frame_pointer_needed_indeed
+ ? HARD_FRAME_POINTER_REGNUM
+ : STACK_POINTER_REGNUM);
 
   machine_mode reg_mode = Pmode;
   int reg_size = TARGET_32BIT ? 4 : 8;
@@ -2996,7 +2998,10 @@ rs6000_emit_prologue (void)
&& (lookup_attribute ("no_split_stack",
  DECL_ATTRIBUTES (cfun->decl))
== NULL));
- 
+
+  frame_pointer_needed_indeed
+= frame_pointer_needed && df_regs_ever_live_p (HARD_FRAME_POINTER_REGNUM);
+
   /* Offset to top of frame for frame_reg and sp respectively.  */
   HOST_WIDE_INT frame_off = 0;
   HOST_WIDE_INT sp_off = 0;
@@ -3658,7 +3663,7 @@ rs6000_emit_prologue (void)
 }
 
   /* Set frame pointer, if needed.  */
-  if (frame_pointer_needed)
+  if (frame_pointer_needed_indeed)
 {
   insn = emit_mo

Re: [PATCH] RS6000: Use .machine ppc for some CRT files

2020-04-16 Thread Sebastian Huber

Hello Segher,

would you mind having a look at this patch?



Re: [PATCH] reject scalar array initialization with nullptr [PR94510]

2020-04-16 Thread Jason Merrill via Gcc-patches

On 4/15/20 1:30 PM, Martin Sebor wrote:

On 4/13/20 8:43 PM, Jason Merrill wrote:

On 4/12/20 5:49 PM, Martin Sebor wrote:

On 4/10/20 8:52 AM, Jason Merrill wrote:

On 4/9/20 4:23 PM, Martin Sebor wrote:

On 4/9/20 1:32 PM, Jason Merrill wrote:

On 4/9/20 3:24 PM, Martin Sebor wrote:

On 4/9/20 1:03 PM, Jason Merrill wrote:

On 4/8/20 1:23 PM, Martin Sebor wrote:

On 4/7/20 3:36 PM, Marek Polacek wrote:

On Tue, Apr 07, 2020 at 02:46:52PM -0600, Martin Sebor wrote:

On 4/7/20 1:50 PM, Marek Polacek wrote:
On Tue, Apr 07, 2020 at 12:50:48PM -0600, Martin Sebor via Gcc-patches wrote:
Among the numerous regressions introduced by the change committed
to GCC 9 to allow string literals as template arguments is a failure
to recognize the C++ nullptr and GCC's __null constants as pointers.
For one, I didn't realize that nullptr, being a null pointer constant,
doesn't have a pointer type, and two, I didn't think of __null (which
is a special integer constant that NULL sometimes expands to).

The attached patch adjusts the special handling of trailing zero
initializers in reshape_init_array_1 to recognize both kinds of
constants and avoid treating them as zeros of the array integer
element type.  This restores the expected diagnostics when either
constant is used in the initializer list.

Martin


PR c++/94510 - nullptr_t implicitly cast to zero twice in std::array


gcc/cp/ChangeLog:

PR c++/94510
* decl.c (reshape_init_array_1): Exclude mismatches with all kinds
of pointers.

gcc/testsuite/ChangeLog:

PR c++/94510
* g++.dg/init/array57.C: New test.
* g++.dg/init/array58.C: New test.

diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
index a127734af69..692c8ed73f4 100644
--- a/gcc/cp/decl.c
+++ b/gcc/cp/decl.c
@@ -6041,9 +6041,14 @@ reshape_init_array_1 (tree elt_type, tree max_index, reshape_iter *d,
   TREE_CONSTANT (new_init) = false;
 /* Pointers initialized to strings must be treated as non-zero
- even if the string is empty.  */
+ even if the string is empty.  Handle all kinds of pointers,
+ including std::nullptr and GCC's __nullptr, neither of which
+ has a pointer type.  */
 tree init_type = TREE_TYPE (elt_init);
-  if (POINTER_TYPE_P (elt_type) != POINTER_TYPE_P (init_type)
+  bool init_is_ptr = (POINTER_TYPE_P (init_type)
+  || NULLPTR_TYPE_P (init_type)
+  || null_node_p (elt_init));
+  if (POINTER_TYPE_P (elt_type) != init_is_ptr
 || !type_initializer_zero_p (elt_type, elt_init))
   last_nonzero = index;


It looks like this still won't handle e.g. pointers to member functions,
e.g.

struct S { };
int arr[3] = { (void (S::*) ()) 0, 0, 0 };

would still be accepted.  You could use TYPE_PTR_OR_PTRMEM_P instead of
POINTER_TYPE_P to catch this case.


Good catch!  That doesn't fail because unlike null data member pointers
which are represented as -1, member function pointers are represented
as a zero.
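
For background on the representation remark above: under the Itanium
C++ ABI that GCC follows on most targets, a pointer to data member is
stored as an offset, so zero is a valid value (the member at offset 0)
and null has to be something else.  The sketch below only checks the
portable consequence of that, namely that a pointer to the first member
is not null; the struct and function names are made up for the
illustration.

```cpp
struct S {
  int first;   // lives at offset 0 within S
  int second;
  void f() {}
};

// A pointer to the member at offset 0 must still compare unequal to the
// null data member pointer, which is why an offset-based representation
// cannot use 0 for null.
inline bool offset_zero_member_is_not_null() {
  int S::*p = &S::first;
  int S::*null_data = nullptr;
  return p != null_data && null_data == nullptr;
}

// Member function pointers have no such conflict: a null one can be
// represented with a zero function-pointer part.
inline bool null_member_function_pointer_is_null() {
  void (S::*null_fn)() = nullptr;
  void (S::*fn)() = &S::f;
  return null_fn == nullptr && fn != nullptr;
}
```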

I had looked for an API that would answer the question: "is this
expression a pointer?" without having to think of all the different
kinds of them but all I could find was null_node_p().  Is this a rare,
isolated case that having an API like that wouldn't be worth having
or should I add one like in the attached update?

Martin


PR c++/94510 - nullptr_t implicitly cast to zero twice in std::array


gcc/cp/ChangeLog:

PR c++/94510
* decl.c (reshape_init_array_1): Exclude mismatches with all kinds
of pointers.
* gcc/cp/cp-tree.h (null_pointer_constant_p): New function.


(Drop the gcc/cp/.)

+/* Returns true if EXPR is a null pointer constant of any type.  */
+
+inline bool
+null_pointer_constant_p (tree expr)
+{
+  STRIP_ANY_LOCATION_WRAPPER (expr);
+  if (expr == null_node)
+    return true;
+  tree type = TREE_TYPE (expr);
+  if (NULLPTR_TYPE_P (type))
+    return true;
+  if (POINTER_TYPE_P (type))
+    return integer_zerop (expr);
+  return null_member_pointer_value_p (expr);
+}
+


We already have a null_ptr_cst_p so it would be sort of confusing to have
this as well.  But are you really interested in whether it's a null pointer,
not just a pointer?


The goal of the code is to detect a mismatch in "pointerness" between
an initializer expression and the type of the initialized element, so
it needs to know if the expression is a pointer (non-null pointers
are detected in type_initializer_zero_p).  That means testing a number
of IMO unintuitive conditions:

   TYPE_PTR_OR_PTRMEM_P (TREE_TYPE (expr))
   || NULLPTR_TYPE_P (TREE_TYPE (expr))
   || null_node_p (expr)

I don't know if this type of a query is common in the C++ FE but unless
this is an isolated use case then besides fixing the bug I thought it
would be nice to make it easier to get the test above right, or at least
come close to it.
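
To make the "pointerness" point concrete outside the front end, here is
a small standard-library analogue of the three conditions listed above.
It is only an illustration and assumes C++17: is_pointer_like_v is a
made-up name, the traits stand in for TYPE_PTR_OR_PTRMEM_P and
NULLPTR_TYPE_P, and GCC's __null is not modeled because it has an
integer type and can only be recognized by the compiler itself.

```cpp
#include <cstddef>
#include <type_traits>

struct S { int a; void f(); };

// Rough analogue of TYPE_PTR_OR_PTRMEM_P || NULLPTR_TYPE_P on C++ types.
template <typename T>
constexpr bool is_pointer_like_v =
    std::is_pointer_v<T>            // object and function pointers
    || std::is_member_pointer_v<T>  // pointers to data/function members
    || std::is_null_pointer_v<T>;   // std::nullptr_t

// nullptr is a null pointer constant but does not have a pointer type,
// which is the case the original check missed.
static_assert(!std::is_pointer_v<std::nullptr_t>);
static_assert(is_pointer_like_v<std::nullptr_t>);

static_assert(is_pointer_like_v<int *>);
static_assert(is_pointer_like_v<void (S::*)()>);
static_assert(is_pointer_like_v<int S::*>);
static_assert(!is_pointer_like_v<int>);
```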

Since null_pointer_constant_p already exists (but isn't suitable here
because it returns true for plain literal zeros)


Why is that unsuitable?

Re: [RFC] split pseudos during loop unrolling in RTL unroller

2020-04-16 Thread Richard Biener via Gcc-patches
On Thu, Apr 16, 2020 at 7:46 PM Segher Boessenkool
 wrote:
>
> On Thu, Apr 16, 2020 at 10:33:45AM +0200, Richard Biener wrote:
> > On Wed, Apr 15, 2020 at 11:23 PM Segher Boessenkool
> > > On a general note, we shouldn't depend on some pass that may or may not
> > > clean up the mess we make, when we could just avoid making a mess in the
> > > first place.
> >
> > True - but the issue at hand is not trivial given you have to care for
> > partial defs, uses outside of the loop (or across the backedge), etc.
> > So there's plenty of things to go "wrong" here.
>
> Certainly.  But *all* RTL passes before RA should leave things in "web
> form" (compare to SSA form).  The code in web.c is probably just fine;
> but we shouldn't have one web pass, *all* passes should leave things in
> a useful form!

Yeah well, but RTL is not in SSA form and there's no RTL IL verification
in place to track degradation.  And we even work in the opposite way
when expanding to RTL from SSA, coalescing as much as we can ...

> > > The web pass belongs immediately after expand; but ideally, even expand
> > > would not reuse pseudos anyway.
> >
> > But for example when lower-subreg decomposes things in a way turning
> > partial defs into full defs new opportunities to split the web arise.
>
> But that is only for the new registers it creates, or what am I missing?

No idea, I just made up the example to show that maintaining "SSA" RTL
is not automagic.

> > > Maybe it would be better as some utility routines, not a pass?
> >
> > Sure, but then when do we apply it?
>
> All of the time!  Ideally every pass would leave things in a good shape.
> Usually it is cheapest as well as easiest to do things manually, but for
> some harder cases, such helper routines can be used.
>
> > Ideally scheduling would do
> > register renaming itself and thus not rely on the used pseudos
> > (I'm not sure if it tracks false dependences - I guess it must if it
> > isn't able to rename regs).  That would be a much better place
> > for improvements?
>
> sched2 runs after RA, so it has nothing to do with webs?  And sched1
> doesn't do much relevant here (it doesn't move insns much).
>
> I don't see how this is directly related to register renaming either?

If scheduling ignores "false" dependences (anti dependences with
full defs) then, when it schedules across such defs, it needs to perform
renaming.  Maybe I'm using bogus terms here.

Richard.

>
> Segher


Re: [PATCH] Do not modify tab options in vimrc for .py files.

2020-04-16 Thread Martin Liška

On 4/16/20 4:16 PM, Alexander Monakov wrote:

On Thu, 16 Apr 2020, Martin Liška wrote:


To be honest I have:
autocmd Filetype python setlocal expandtab tabstop=4 shiftwidth=4 softtabstop=4

in my default vim config.
But I'm wondering what's default for 'python' Filetype?


Since October 2013 Vim ftplugin/python.vim has:

" As suggested by PEP8.
setlocal expandtab shiftwidth=4 softtabstop=4 tabstop=8


Fine! That's what I expected.



So the default is correct. Please disregard my suggestion then,
no need to add an 'else' branch there.


Thank you for looking it up.

Martin



Thanks.
Alexander