Re: Enabling -frename-registers?

2016-04-27 Thread Eric Botcazou
> @@ -8562,7 +8563,8 @@ debug information format adopted by the
>   make debugging impossible, since variables no longer stay in
>   a ``home register''.
> 
> -Enabled by default with @option{-funroll-loops} and @option{-fpeel-loops}.
> +Enabled by default with @option{-funroll-loops} and @option{-fpeel-loops},
> +and also enabled at levels @option{-O2} and @option{-O3}.

OPT_LEVELS_2_PLUS includes -Os since it's basically -O2.

-- 
Eric Botcazou


Re: [RFC] introduce --param max-lto-partition for having an upper bound on partition size

2016-04-27 Thread Richard Biener
On Wed, 27 Apr 2016, Prathamesh Kulkarni wrote:

> On 26 April 2016 at 16:31, Richard Biener wrote:
> > On Mon, 25 Apr 2016, Prathamesh Kulkarni wrote:
> >
> >> On 6 April 2016 at 14:54, Richard Biener wrote:
> >> > On Wed, 6 Apr 2016, Richard Biener wrote:
> >> >
> >> >> On Wed, 6 Apr 2016, Prathamesh Kulkarni wrote:
> >> >>
> >> >> > On 6 April 2016 at 13:44, Richard Biener wrote:
> >> >> > > On Wed, 6 Apr 2016, Prathamesh Kulkarni wrote:
> >> >> > >
> >> >> > >> On 5 April 2016 at 18:28, Richard Biener wrote:
> >> >> > >> > On Tue, 5 Apr 2016, Prathamesh Kulkarni wrote:
> >> >> > >> >
> >> >> > >> >> On 5 April 2016 at 16:58, Richard Biener wrote:
> >> >> > >> >> > On Tue, 5 Apr 2016, Prathamesh Kulkarni wrote:
> >> >> > >> >> >
> >> >> > >> >> >> On 4 April 2016 at 19:44, Jan Hubicka wrote:
> >> >> > >> >> >> >
> >> >> > >> >> >> >> diff --git a/gcc/lto/lto-partition.c 
> >> >> > >> >> >> >> b/gcc/lto/lto-partition.c
> >> >> > >> >> >> >> index 9eb63c2..bc0c612 100644
> >> >> > >> >> >> >> --- a/gcc/lto/lto-partition.c
> >> >> > >> >> >> >> +++ b/gcc/lto/lto-partition.c
> >> >> > >> >> >> >> @@ -511,9 +511,20 @@ lto_balanced_map (int 
> >> >> > >> >> >> >> n_lto_partitions)
> >> >> > >> >> >> >>varpool_order.qsort (varpool_node_cmp);
> >> >> > >> >> >> >>
> >> >> > >> >> >> >>/* Compute partition size and create the first 
> >> >> > >> >> >> >> partition.  */
> >> >> > >> >> >> >> +  if (PARAM_VALUE (MIN_PARTITION_SIZE) > PARAM_VALUE 
> >> >> > >> >> >> >> (MAX_PARTITION_SIZE))
> >> >> > >> >> >> >> +fatal_error (input_location, "min partition size 
> >> >> > >> >> >> >> cannot be greater than max partition size");
> >> >> > >> >> >> >> +
> >> >> > >> >> >> >>partition_size = total_size / n_lto_partitions;
> >> >> > >> >> >> >>if (partition_size < PARAM_VALUE (MIN_PARTITION_SIZE))
> >> >> > >> >> >> >>  partition_size = PARAM_VALUE (MIN_PARTITION_SIZE);
> >> >> > >> >> >> >> +  else if (partition_size > PARAM_VALUE 
> >> >> > >> >> >> >> (MAX_PARTITION_SIZE))
> >> >> > >> >> >> >> +{
> >> >> > >> >> >> >> +  n_lto_partitions = total_size / PARAM_VALUE 
> >> >> > >> >> >> >> (MAX_PARTITION_SIZE);
> >> >> > >> >> >> >> +  if (total_size % PARAM_VALUE (MAX_PARTITION_SIZE))
> >> >> > >> >> >> >> + n_lto_partitions++;
> >> >> > >> >> >> >> +  partition_size = total_size / n_lto_partitions;
> >> >> > >> >> >> >> +}
> >> >> > >> >> >> >
> >> >> > >> >> >> > lto_balanced_map actually works in a way that looks for
> >> >> > >> >> >> > the cheapest cutpoint in the range
> >> >> > >> >> >> > 3/4*partition_size to 2*partition_size and picks the
> >> >> > >> >> >> > cheapest range.
> >> >> > >> >> >> > Setting partition_size to this value will thus not cause
> >> >> > >> >> >> > the partitioner to produce smaller
> >> >> > >> >> >> > partitions only.  I suppose you could modify the conditional:
> >> >> > >> >> >> >
> >> >> > >> >> >> >   /* Partition is too large, unwind into step when
> >> >> > >> >> >> > best cost was reached and
> >> >> > >> >> >> >  start new partition.  */
> >> >> > >> >> >> >   if (partition->insns > 2 * partition_size)
> >> >> > >> >> >> >
> >> >> > >> >> >> > and/or in the code above set the partition_size to half of
> >> >> > >> >> >> > total_size/max_size.
> >> >> > >> >> >> >
> >> >> > >> >> >> > I know this is somewhat sloppy.  This was really just a
> >> >> > >> >> >> > first-cut implementation
> >> >> > >> >> >> > many years ago. I expected to reimplement it smarter soon,
> >> >> > >> >> >> > but then there was
> >> >> > >> >> >> > never really a need for it (I am trying to avoid late IPA
> >> >> > >> >> >> > optimizations so the
> >> >> > >> >> >> > partitioning decisions should mostly affect compile time
> >> >> > >> >> >> > performance only).
> >> >> > >> >> >> > If ARM is more sensitive to partitioning, perhaps it would
> >> >> > >> >> >> > make sense to try to
> >> >> > >> >> >> > look for something smarter.
> >> >> > >> >> >> >
> >> >> > >> >> >> >> +
> >> >> > >> >> >> >>npartitions = 1;
> >> >> > >> >> >> >>partition = new_partition ("");
> >> >> > >> >> >> >>if (symtab->dump_file)
> >> >> > >> >> >> >> diff --git a/gcc/lto/lto.c b/gcc/lto/lto.c
> >> >> > >> >> >> >> index 9dd513f..294b8a4 100644
> >> >> > >> >> >> >> --- a/gcc/lto/lto.c
> >> >> > >> >> >> >> +++ b/gcc/lto/lto.c
> >> >> > >> >> >> >> @@ -3112,6 +3112,12 @@ do_whole_program_analysis (void)
> >> >> > >> >> >> >>timevar_pop (TV_WHOPR_WPA);
> >> >> > >> >> >> >>
> >> >> > >> >> >> >>timevar_push (TV_WHOPR_PARTITIONING);
> >> >> > >> >> >> >> +
> >> >> > >> >> >> >> +  if (flag_lto_partition != LTO_PARTITION_BALANCED
> >> >> > >> >> >> >> +  && PARAM_VALUE (MAX_PARTITION_SIZE) != INT_MAX)
> >> >> > >> >> >> >> +fatal_error (input_location, "--param 
> >> >> > >> >> >> >> max-lto-partition should only"
> >> >> > >> >> >> >> +  " be used with balanced partitioning\n");
> >> >> > >> >> >> >> +
> >> >> > >> >
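
The clamping arithmetic in the quoted lto_balanced_map hunk can be sketched outside GCC as follows. This is only an illustration, not the patch itself: plan_partitions, PartitionPlan and worst_case_partition are hypothetical names, and plain integers stand in for the PARAM_VALUE lookups of the minimum and maximum partition size.

```cpp
#include <cassert>

// Hypothetical result type; the patch keeps these values in local variables.
struct PartitionPlan {
  long n_partitions;
  long partition_size;
};

// Mirrors the arithmetic of the quoted hunk: start from an even split, raise
// it to the minimum, or add partitions until the even split fits the maximum.
PartitionPlan plan_partitions(long total_size, long n_partitions,
                              long min_size, long max_size) {
  // The patch reports min > max with fatal_error; this sketch just asserts.
  assert(min_size <= max_size);
  assert(n_partitions > 0 && max_size > 0);

  long partition_size = total_size / n_partitions;
  if (partition_size < min_size) {
    partition_size = min_size;
  } else if (partition_size > max_size) {
    n_partitions = total_size / max_size;
    if (total_size % max_size)
      n_partitions++;  // round the partition count up
    partition_size = total_size / n_partitions;
  }
  return {n_partitions, partition_size};
}

// Upper end of the cut window Jan describes (2 * partition_size).
long worst_case_partition(long partition_size) { return 2 * partition_size; }
```

With a total size of 1000, 2 requested partitions and a maximum of 300, the sketch yields 4 partitions with a target of 250 each. Going by Jan's description, the balancer may still let a partition grow to twice the target (500 here), which is why setting the target alone does not bound the result.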

Re: [PING 7, PATCH] PR/68089: C++-11: Ignore "alignas(0)".

2016-04-27 Thread Dominik Vogt
On Mon, Jan 04, 2016 at 12:33:21PM +0100, Dominik Vogt wrote:
> On Fri, Jan 01, 2016 at 05:53:08PM -0700, Martin Sebor wrote:
> > On 12/31/2015 04:50 AM, Dominik Vogt wrote:
> > >The attached patch fixes C++-11 handling of "alignas(0)" which
> > >should be ignored but currently generates an error message.  A
> > >test case is included; the patch has been tested on S390x.  Since
> > >it's a language issue it should be independent of the backend
> > >used.
> > 
> > The patch doesn't handle value-dependent expressions(*).
> 
> > It
> > seems that the problem is in handle_aligned_attribute() calling
> > check_user_alignment() with the second argument (ALLOW_ZERO)
> > set to false.  Calling it with true fixes the problem and handles
> > value-dependent expressions (I haven't done any more testing beyond
> > that).
> 
> Like the attached patch?  (Passes the testsuite on s390x.)
> 
> But wouldn't an "aligned" attribute be added, allowing the backend
> to possibly generate an error or a warning?
> 
> > Also, in the test, I noticed the definition of the first struct
> > is missing the terminating semicolon.
> 
> Yeah.

> gcc/c-family/ChangeLog
> 
>   PR/69089
>   * c-common.c (handle_aligned_attribute): Allow 0 as an argument to the
>   "aligned" attribute.
> 
> gcc/testsuite/ChangeLog
> 
>   PR/69089
>   * g++.dg/cpp0x/alignas5.C: New test.

> From 2461293b9070da74950fd0ae055d1239cc69ce67 Mon Sep 17 00:00:00 2001
> From: Dominik Vogt 
> Date: Wed, 30 Dec 2015 15:08:52 +0100
> Subject: [PATCH] C++-11: Ingore "alignas(0)" instead of generating an
>  error message.
> 
> This is required by the C++-11 standard.
> ---
>  gcc/c-family/c-common.c   |  2 +-
>  gcc/testsuite/g++.dg/cpp0x/alignas5.C | 29 +
>  2 files changed, 30 insertions(+), 1 deletion(-)
>  create mode 100644 gcc/testsuite/g++.dg/cpp0x/alignas5.C
> 
> diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c
> index 653d1dc..9eb25a9 100644
> --- a/gcc/c-family/c-common.c
> +++ b/gcc/c-family/c-common.c
> @@ -7804,7 +7804,7 @@ handle_aligned_attribute (tree *node, tree ARG_UNUSED 
> (name), tree args,
>else if (TYPE_P (*node))
>  type = node, is_type = 1;
>  
> -  if ((i = check_user_alignment (align_expr, false)) == -1
> +  if ((i = check_user_alignment (align_expr, true)) == -1
>|| !check_cxx_fundamental_alignment_constraints (*node, i, flags))
>  *no_add_attrs = true;
>else if (is_type)
> diff --git a/gcc/testsuite/g++.dg/cpp0x/alignas5.C 
> b/gcc/testsuite/g++.dg/cpp0x/alignas5.C
> new file mode 100644
> index 000..f3252a9
> --- /dev/null
> +++ b/gcc/testsuite/g++.dg/cpp0x/alignas5.C
> @@ -0,0 +1,29 @@
> +// PR c++/69089
> +// { dg-do compile { target c++11 } }
> +// { dg-options "-Wno-attributes" }
> +
> +alignas (0) int valid1;
> +alignas (1 - 1) int valid2;
> +struct Tvalid
> +{
> +  alignas (0) int i;
> +  alignas (2 * 0) int j;
> +};
> +
> +alignas (-1) int invalid1; /* { dg-error "not a positive power of 2" } */
> +alignas (1 - 2) int invalid2; /* { dg-error "not a positive power of 2" } */
> +struct Tinvalid
> +{
> +  alignas (-1) int i; /* { dg-error "not a positive power of 2" } */
> +  alignas (2 * 0 - 1) int j; /* { dg-error "not a positive power of 2" } */
> +};
> +
> +template <int N> struct TNvalid1 { alignas (N) int i; };
> +TNvalid1<0> SNvalid1;
> +template <int N> struct TNvalid2 { alignas (N) int i; };
> +TNvalid2<1 - 1> SNvalid2;
> +
> +template <int N> struct TNinvalid1 { alignas (N) int i; }; /* { dg-error 
> "not a positive power of 2" } */
> +TNinvalid1<-1> SNinvalid1;
> +template <int N> struct TNinvalid2 { alignas (N) int i; }; /* { dg-error 
> "not a positive power of 2" } */
> +TNinvalid2<1 - 2> SNinvalid2;
> -- 
> 2.3.0
> 



Ciao

Dominik ^_^  ^_^

-- 

Dominik Vogt
IBM Germany
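
The behaviour the quoted test case expects (a zero alignment is ignored, while negative values and non-powers of two are errors) can be modelled with a small standalone check. This is a sketch only: check_alignment and AlignResult are hypothetical names, not GCC's check_user_alignment, and the allow_zero parameter merely mirrors the ALLOW_ZERO argument discussed above.

```cpp
#include <cassert>

// Hypothetical outcome of validating an alignas argument.
enum class AlignResult { Apply, Ignore, Error };

// Validate a requested alignment.  On Apply, *log2_out receives log2(align).
AlignResult check_alignment(long align, bool allow_zero, int *log2_out) {
  assert(log2_out != nullptr);
  if (align == 0)
    return allow_zero ? AlignResult::Ignore : AlignResult::Error;
  // Negative values and values that are not a power of two are rejected.
  if (align < 0 || (align & (align - 1)) != 0)
    return AlignResult::Error;
  int bits = 0;
  while ((1L << bits) < align)
    ++bits;
  *log2_out = bits;
  return AlignResult::Apply;
}
```

In this model the one-line change in the patch corresponds to passing true instead of false for allow_zero, which turns the zero case from an error into a silent no-op.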




Re: Please include ada-hurd.diff upstream (try2)

2016-04-27 Thread Eric Botcazou
> Attaching the modified ada-hurd.diff. Maybe it is ready for inclusion in
> upstream now?

Upstream already contains the first set of changes though.  Here's what I have 
installed on mainline and 6 branch (not sure it will be in the 6.1 release).


2016-04-27  Svante Signell  

* gcc-interface/Makefile.in (x86 GNU/Hurd): Use s-osinte-gnu.adb.
* s-osinte-gnu.ads: Small tweaks.
* s-osinte-gnu.adb: New file.

-- 
Eric Botcazou
Index: gcc-interface/Makefile.in
===
--- gcc-interface/Makefile.in	(revision 235394)
+++ gcc-interface/Makefile.in	(working copy)
@@ -1426,7 +1426,7 @@ ifeq ($(strip $(filter-out %86 pc gnu,$(
   a-intnam.adshttp://www.gnu.org/licenses/>.  --
+--  --
+-- GNARL was developed by the GNARL team at Florida State University.   --
+-- Extensive contributions were provided by Ada Core Technologies, Inc. --
+--  --
+--
+
+--  This is the GNU/Hurd version of this package.
+
+pragma Polling (Off);
+--  Turn off polling, we do not want ATC polling to take place during
+--  tasking operations. It causes infinite loops and other problems.
+
+--  This package encapsulates all direct interfaces to OS services
+--  that are needed by children of System.
+
+package body System.OS_Interface is
+
+   
+   -- Get_Stack_Base --
+   
+
+   function Get_Stack_Base (thread : pthread_t) return Address is
+  pragma Warnings (Off, thread);
+
+   begin
+  return Null_Address;
+   end Get_Stack_Base;
+
+   --
+   -- pthread_init --
+   --
+
+   procedure pthread_init is
+   begin
+  null;
+   end pthread_init;
+
+   --
+   -- pthread_mutexattr_setprioceiling --
+   --
+
+   function pthread_mutexattr_setprioceiling
+ (attr : access pthread_mutexattr_t;
+  prioceiling : int) return int is
+  pragma Unreferenced (attr, prioceiling);
+   begin
+  return 0;
+   end pthread_mutexattr_setprioceiling;
+
+   --
+   -- pthread_mutexattr_getprioceiling --
+   --
+
+   function pthread_mutexattr_getprioceiling
+ (attr : access pthread_mutexattr_t;
+  prioceiling : access int) return int is
+  pragma Unreferenced (attr, prioceiling);
+   begin
+  return 0;
+   end pthread_mutexattr_getprioceiling;
+
+   ---
+   -- pthread_setschedparam --
+   ---
+
+   function pthread_setschedparam
+ (thread : pthread_t;
+  policy : int;
+  param : access struct_sched_param) return int is
+  pragma Unreferenced (thread, policy, param);
+   begin
+  return 0;
+   end pthread_setschedparam;
+
+   -
+   -- To_Duration --
+   -
+
+   function To_Duration (TS : timespec) return Duration is
+   begin
+  return Duration (TS.tv_sec) + Duration (TS.tv_nsec) / 10#1#E9;
+   end To_Duration;
+
+   
+   -- To_Target_Priority --
+   
+
+   function To_Target_Priority
+ (Prio : System.Any_Priority) return Interfaces.C.int
+   is
+   begin
+  return Interfaces.C.int (Prio);
+   end To_Target_Priority;
+
+   -
+   -- To_Timespec --
+   -
+
+   function To_Timespec (D : Duration) return timespec is
+  S : time_t;
+  F : Duration;
+
+   begin
+  S := time_t (Long_Long_Integer (D));
+  F := D - Duration (S);
+
+  --  If F has negative value due to a round-up, adjust for positive F
+  --  value.
+
+  if F < 0.0 then
+ S := S - 1;
+ F := F + 1.0;
+  end if;
+
+  return timespec'(tv_sec => S,
+   tv_nsec => long (Long_Long_Integer (F * 10#1#E9)));
+   end To_Timespec;
+
+end System.OS_Interface;
Index: s-osinte-gnu.ads
===
--- s-osinte-gnu.ads	(revision 235394)
+++ s-osinte-gnu.ads	(working copy)
@@ -344,8 +344,9 @@ package System.OS_Interface is
--  returns the stack base of the specified thread. Only call this function
--  when Stack_Base_Available is True.
 
-   --  From: /usr/include/unistd.h __getpagesize or getpagesize??
-   function Get_Page_Size return int;
+   --  From: /usr/include/i386-gnu/bits/shm.h __getpagesize or getpagesize??
+   function Get_Page_Size return size_t;
+   function Get_Page_Size return Address;
pragma Import (C, Get_Page_Size, "__getpagesize");
--  Returns the size of a page
 
@@ -498,7 +499,11 @@ package System.OS_Interface is
PTHREAD_PRIO_PROTECT : constant := 2;
PTHREAD_PRI
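
The rounding adjustment in To_Timespec from the patch above can be restated in C++ for readers less familiar with Ada. This is a rough sketch under stated assumptions: double stands in for Ada's fixed-point Duration, std::llround stands in for Ada's round-to-nearest integer conversion, and Timespec, to_timespec and to_duration are illustrative names rather than part of the patch.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Illustrative stand-in for the timespec record used by the Ada package.
struct Timespec {
  std::int64_t tv_sec;
  long tv_nsec;
};

Timespec to_timespec(double d) {
  // Converting to an integer rounds to nearest, so S can end up above D.
  std::int64_t s = std::llround(d);
  double f = d - static_cast<double>(s);

  // If the round-up made F negative, borrow one second to keep F positive.
  if (f < 0.0) {
    s -= 1;
    f += 1.0;
  }

  long nsec = static_cast<long>(std::llround(f * 1e9));
  // With double, F * 1e9 could round up to a full second for some inputs;
  // this sketch treats that as out of scope and asserts instead.
  assert(nsec >= 0 && nsec < 1000000000L);
  return {s, nsec};
}

double to_duration(const Timespec &ts) {
  return static_cast<double>(ts.tv_sec) + static_cast<double>(ts.tv_nsec) / 1e9;
}
```

For 1.75 seconds the conversion first rounds to 2, leaving a fraction of -0.25; the adjustment then gives 1 second and 750000000 nanoseconds.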

Re: [PATCH] Fix up inchash::add_expr to match more closely operand_equal_p (PR sanitizer/70683, take 2)

2016-04-27 Thread Richard Biener
On Wed, 27 Apr 2016, Jakub Jelinek wrote:

> On Tue, Apr 26, 2016 at 03:02:38PM +0200, Jakub Jelinek wrote:
> > I've been using the attached hack; without this patch during x86_64-linux
> > and i686-linux yes,extra,rtl checking bootstraps there were 66931
> > notes (surprisingly only from the ivopts and gimple-ssa-strength-reduction
> > hash tables, no others), with the patch there are none.
> > 
> > Ok for trunk?
> 
> With the checking patch I've just posted, I've found two additional issues.
> 
> One is that the patch treated all tcc_comparison class codes by
> canonicalizing them to the lower of the two codes and perhaps swapping the
> arguments.  That is fine for most of the codes, but not for the commutative
> comparisons, because operand_equal_p will return true e.g. for x != y and
> y != x.  So, we need to treat those as commutative.
> 
> And the second issue is in hashing INTEGER_CSTs.  E.g. on
> builtin-arith-overflow-10.c, operand_equal_p returns non-zero for
> DImode 0x8000 and TImode of the same value, but they weren't
> hashing the same, the former has TREE_INT_CST_NUNITS == 1, the latter
> == 2 (but both have TREE_INT_CST_EXT_NUNITS == 2).
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Ok.

Thanks,
Richard.

> 2016-04-27  Jakub Jelinek  
> 
>   PR sanitizer/70683
>   * tree.h (inchash::add_expr): Add FLAGS argument.
>   * tree.c (inchash::add_expr): Likewise.  If not OEP_ADDRESS_OF,
>   use STRIP_NOPS first.  For INTEGER_CST assert not OEP_ADDRESS_OF.
>   For REAL_CST and !HONOR_SIGNED_ZEROS (t) hash +/- 0 the same.
>   Formatting fix.  Adjust recursive calls.  For tcc_comparison,
>   if swap_tree_comparison (code) is smaller than code, hash that
>   and arguments in the other order.  Hash CONVERT_EXPR the same
>   as NOP_EXPR.  For OEP_ADDRESS_OF hash MEM_REF with 0 offset
>   of ADDR_EXPR of decl as the decl itself.  Add or remove
>   OEP_ADDRESS_OF from recursive flags as needed.  For
>   FMA_EXPR, WIDEN_MULT_{PLUS,MINUS}_EXPR hash the first two
>   operands commutatively and only the third one normally.
>   For internal CALL_EXPR hash in CALL_EXPR_IFN.
> 
> --- gcc/tree.h.jj 2016-04-22 18:21:32.0 +0200
> +++ gcc/tree.h2016-04-26 10:59:50.333534452 +0200
> @@ -4759,7 +4759,7 @@ extern int simple_cst_equal (const_tree,
>  namespace inchash
>  {
>  
> -extern void add_expr (const_tree, hash &);
> +extern void add_expr (const_tree, hash &, unsigned int = 0);
>  
>  }
>  
> --- gcc/tree.c.jj 2016-04-22 18:21:32.0 +0200
> +++ gcc/tree.c2016-04-26 23:00:12.238080960 +0200
> @@ -7769,7 +7769,7 @@ namespace inchash
> This function is intended to produce the same hash for expressions which
> would compare equal using operand_equal_p.  */
>  void
> -add_expr (const_tree t, inchash::hash &hstate)
> +add_expr (const_tree t, inchash::hash &hstate, unsigned int flags)
>  {
>int i;
>enum tree_code code;
> @@ -7781,6 +7781,9 @@ add_expr (const_tree t, inchash::hash &h
>return;
>  }
>  
> +  if (!(flags & OEP_ADDRESS_OF))
> +STRIP_NOPS (t);
> +
>code = TREE_CODE (t);
>  
>switch (code)
> @@ -7791,12 +7794,17 @@ add_expr (const_tree t, inchash::hash &h
>hstate.merge_hash (0);
>return;
>  case INTEGER_CST:
> -  for (i = 0; i < TREE_INT_CST_NUNITS (t); i++)
> +  gcc_checking_assert (!(flags & OEP_ADDRESS_OF));
> +  for (i = 0; i < TREE_INT_CST_EXT_NUNITS (t); i++)
>   hstate.add_wide_int (TREE_INT_CST_ELT (t, i));
>return;
>  case REAL_CST:
>{
> - unsigned int val2 = real_hash (TREE_REAL_CST_PTR (t));
> + unsigned int val2;
> + if (!HONOR_SIGNED_ZEROS (t) && real_zerop (t))
> +   val2 = rvc_zero;
> + else
> +   val2 = real_hash (TREE_REAL_CST_PTR (t));
>   hstate.merge_hash (val2);
>   return;
>}
> @@ -7807,17 +7815,18 @@ add_expr (const_tree t, inchash::hash &h
>   return;
>}
>  case STRING_CST:
> -  hstate.add ((const void *) TREE_STRING_POINTER (t), TREE_STRING_LENGTH 
> (t));
> +  hstate.add ((const void *) TREE_STRING_POINTER (t),
> +   TREE_STRING_LENGTH (t));
>return;
>  case COMPLEX_CST:
> -  inchash::add_expr (TREE_REALPART (t), hstate);
> -  inchash::add_expr (TREE_IMAGPART (t), hstate);
> +  inchash::add_expr (TREE_REALPART (t), hstate, flags);
> +  inchash::add_expr (TREE_IMAGPART (t), hstate, flags);
>return;
>  case VECTOR_CST:
>{
>   unsigned i;
>   for (i = 0; i < VECTOR_CST_NELTS (t); ++i)
> -   inchash::add_expr (VECTOR_CST_ELT (t, i), hstate);
> +   inchash::add_expr (VECTOR_CST_ELT (t, i), hstate, flags);
>   return;
>}
>  case SSA_NAME:
> @@ -7831,16 +7840,17 @@ add_expr (const_tree t, inchash::hash &h
>/* A list of expressions, for a CALL_EXPR or as the elements of a
>VECTOR_CST.  */
>  
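
The first issue Jakub describes (canonicalizing by swapping is fine for ordered comparisons, but EQ and NE must be hashed commutatively) can be illustrated with a toy hash. This is a sketch, not the code in tree.c: Cmp, Comparison, swap_comparison, mix and hash_comparison are made-up names, and small integers stand in for operand hashes.

```cpp
#include <cstdint>
#include <utility>

// Illustrative comparison codes; their order only matters for canonicalization.
enum Cmp { LT, LE, GT, GE, EQ, NE };

struct Comparison {
  Cmp code;
  std::uint64_t lhs;  // stands in for the hash of the first operand
  std::uint64_t rhs;  // stands in for the hash of the second operand
};

// Code that gives the same result once the operands are exchanged.
Cmp swap_comparison(Cmp c) {
  switch (c) {
    case LT: return GT;
    case LE: return GE;
    case GT: return LT;
    case GE: return LE;
    default: return c;  // EQ and NE are their own swap
  }
}

std::uint64_t mix(std::uint64_t h, std::uint64_t v) { return h * 1000003u + v; }

std::uint64_t hash_comparison(Comparison e) {
  Cmp swapped = swap_comparison(e.code);
  if (swapped == e.code) {
    // Commutative comparison: put the operands into a fixed order.
    if (e.rhs < e.lhs)
      std::swap(e.lhs, e.rhs);
  } else if (swapped < e.code) {
    // Ordered comparison: hash the lower of the two codes, operands swapped.
    e.code = swapped;
    std::swap(e.lhs, e.rhs);
  }
  return mix(mix(mix(0, e.code), e.lhs), e.rhs);
}
```

With this scheme x != y and y != x hash the same, as do x > y and y < x, while x < y and y < x are still hashed with their operands in the given order.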

[PATCH] Clean up tests where a later dg-do completely overrides another.

2016-04-27 Thread Dominik Vogt
The attached patch cleans up some (mostly unnecessary) dg-do
directives in the gcc.dg and gcc.target test cases.

Ciao

Dominik ^_^  ^_^

-- 

Dominik Vogt
IBM Germany
gcc/testsuite/ChangeLog

* gcc/testsuite/gcc.dg/cpp/mac-dir-2.c: Remove pointless duplicate
dg-do.
* gcc/testsuite/gcc.dg/pr27003.c: Likewise.
* gcc/testsuite/gcc.dg/tree-ssa/cswtch.c: Likewise.
* gcc/testsuite/gcc.dg/tree-ssa/predcom-2.c: Likewise.
* gcc/testsuite/gcc.dg/tree-ssa/predcom-4.c: Likewise.
* gcc/testsuite/gcc.dg/tree-ssa/predcom-5.c: Likewise.
* gcc.target/arc/mxy.c: Likewise.
* gcc.target/arc/mswape.c: Likewise.
* gcc.target/arc/mrtsc.c: Likewise.
* gcc.target/arc/mcrc.c: Likewise.
* gcc.target/arc/mdsp-packa.c: Likewise.
* gcc.target/arc/mdvbf.c: Likewise.
* gcc.target/arc/mlock.c: Likewise.
* gcc.target/arc/mmac-24.c: Likewise.
* gcc.dg/spec-options.c: Switch order of the two "dg-do run" so that
the test is actually "run" on sh*-*-*.  Order _does_ matter.
From d21d7db706b30be13b23e8e583ecfd4445d1cdf4 Mon Sep 17 00:00:00 2001
From: Dominik Vogt 
Date: Wed, 9 Mar 2016 15:42:23 +0100
Subject: [PATCH] Clean up tests where a later dg-do completely overrides
 another.

In most tests the first dg-do could be simply removed.  In one case the two
lines needed to be swapped so that the condition of the "run" was not
overridden by the later, unconditional "compile".
---
 gcc/testsuite/gcc.dg/cpp/mac-dir-2.c  | 2 --
 gcc/testsuite/gcc.dg/pr27003.c| 1 -
 gcc/testsuite/gcc.dg/spec-options.c   | 2 +-
 gcc/testsuite/gcc.dg/tree-ssa/cswtch.c| 1 -
 gcc/testsuite/gcc.dg/tree-ssa/predcom-2.c | 1 -
 gcc/testsuite/gcc.dg/tree-ssa/predcom-4.c | 1 -
 gcc/testsuite/gcc.dg/tree-ssa/predcom-5.c | 1 -
 gcc/testsuite/gcc.target/arc/mcrc.c   | 1 -
 gcc/testsuite/gcc.target/arc/mlock.c  | 1 -
 gcc/testsuite/gcc.target/arc/mmac-24.c| 1 -
 gcc/testsuite/gcc.target/arc/mrtsc.c  | 1 -
 gcc/testsuite/gcc.target/arc/mswape.c | 1 -
 gcc/testsuite/gcc.target/arc/mxy.c| 1 -
 13 files changed, 1 insertion(+), 14 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/cpp/mac-dir-2.c b/gcc/testsuite/gcc.dg/cpp/mac-dir-2.c
index b31ab3b..4c45d14 100644
--- a/gcc/testsuite/gcc.dg/cpp/mac-dir-2.c
+++ b/gcc/testsuite/gcc.dg/cpp/mac-dir-2.c
@@ -1,7 +1,5 @@
 /* Copyright (C) 2002 Free Software Foundation, Inc.  */
 
-/* { dg-do preprocess } */
-
 /* Source: Neil Booth, 26 Feb 2002.
 
Test that we allow directives in macro arguments.  */
diff --git a/gcc/testsuite/gcc.dg/pr27003.c b/gcc/testsuite/gcc.dg/pr27003.c
index 5e416f4..7d886a0 100644
--- a/gcc/testsuite/gcc.dg/pr27003.c
+++ b/gcc/testsuite/gcc.dg/pr27003.c
@@ -1,4 +1,3 @@
-/* { dg-do compile } */
 /* { dg-do run } */
 /* { dg-options "-Os" } */
 
diff --git a/gcc/testsuite/gcc.dg/spec-options.c b/gcc/testsuite/gcc.dg/spec-options.c
index 1f9d8c1..e3ab23a 100644
--- a/gcc/testsuite/gcc.dg/spec-options.c
+++ b/gcc/testsuite/gcc.dg/spec-options.c
@@ -1,8 +1,8 @@
 /* Check that -mfoo is accepted if defined in a user spec
and that it is not passed on the command line.  */
 /* Must be processed in EXTRA_SPECS to run.  */
-/* { dg-do run { target sh*-*-* } } */
 /* { dg-do compile } */
+/* { dg-do run { target sh*-*-* } } */
 /* { dg-options "-B${srcdir}/gcc.dg --specs=foo.specs -tfoo" } */
 
 extern void abort(void);
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/cswtch.c b/gcc/testsuite/gcc.dg/tree-ssa/cswtch.c
index 80f92f7..5737a0e 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/cswtch.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/cswtch.c
@@ -1,4 +1,3 @@
-/* { dg-do compile } */
 /* { dg-options "-O2 -fdump-tree-switchconv" } */
 /* { dg-do run } */
 
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/predcom-2.c b/gcc/testsuite/gcc.dg/tree-ssa/predcom-2.c
index 7253921..0d92f8e 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/predcom-2.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/predcom-2.c
@@ -1,4 +1,3 @@
-/* { dg-do compile } */
 /* { dg-do run } */
 /* { dg-options "-O2 -funroll-loops --param max-unroll-times=8 -fpredictive-commoning -fdump-tree-pcom-details" } */
 
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/predcom-4.c b/gcc/testsuite/gcc.dg/tree-ssa/predcom-4.c
index 3244c1d..382a464 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/predcom-4.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/predcom-4.c
@@ -1,4 +1,3 @@
-/* { dg-do compile } */
 /* { dg-do run } */
 /* { dg-options "-O2 -funroll-loops --param max-unroll-times=8 -fpredictive-commoning -fdump-tree-pcom-details" } */
 
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/predcom-5.c b/gcc/testsuite/gcc.dg/tree-ssa/predcom-5.c
index 7ad0d79..a3ee1d9 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/predcom-5.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/predcom-5.c
@@ -1,4 +1,3 @@
-/* { dg-do compile } */
 /* { dg-do run } */
 /* { dg-options "-O2 -funroll-loops --param max-unroll-times=8 -fpredictive-commoning -fdump-tree-pcom-details" } */
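
Why the order of the two dg-do lines in spec-options.c matters can be shown with a deliberately simplified model. This is not DejaGnu code: DgDo and effective_action are hypothetical names, target selectors are reduced to a plain prefix match, and the default action of "compile" is an assumption of the sketch.

```cpp
#include <string>
#include <vector>

// One dg-do directive; an empty target prefix means "unconditional".
struct DgDo {
  std::string action;
  std::string target_prefix;
};

// Simplified model: every directive whose target matches replaces the action
// chosen so far, so the last matching dg-do wins.
std::string effective_action(const std::vector<DgDo> &directives,
                             const std::string &host) {
  std::string action = "compile";  // assumed default when nothing matches
  for (const DgDo &d : directives) {
    bool matches =
        d.target_prefix.empty() ||
        host.compare(0, d.target_prefix.size(), d.target_prefix) == 0;
    if (matches)
      action = d.action;
  }
  return action;
}
```

In this model the old order (conditional run first, unconditional compile second) always ends up as compile, even on an sh target. The swapped order gives run on sh and compile everywhere else.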

Fix PR ada/70759

2016-04-27 Thread Eric Botcazou
As discussed on gcc@, this removes the internal_reference_types machinery, 
which is used only by the Ada compiler for a reason probably long obsolete.

Tested on x86_64-suse-linux, applied on the mainline.


2016-04-27  Eric Botcazou  

PR ada/70759
* stor-layout.h (internal_reference_types): Delete.
* stor-layout.c (reference_types_internal): Likewise.
(internal_reference_types): Likewise.
(layout_type) : Adjust.


2016-04-27  Eric Botcazou  

* gcc-interface/misc.c (gnat_init): Do not call
internal_reference_types.


-- 
Eric Botcazou
Index: ada/gcc-interface/misc.c
===
--- ada/gcc-interface/misc.c	(revision 235394)
+++ ada/gcc-interface/misc.c	(working copy)
@@ -369,9 +369,6 @@ gnat_init (void)
   sbitsize_one_node = sbitsize_int (1);
   sbitsize_unit_node = sbitsize_int (BITS_PER_UNIT);
 
-  /* Show that REFERENCE_TYPEs are internal and should be Pmode.  */
-  internal_reference_types ();
-
   /* Register our internal error function.  */
   global_dc->internal_error = &internal_error_function;
 
Index: stor-layout.c
===
--- stor-layout.c	(revision 235394)
+++ stor-layout.c	(working copy)
@@ -49,11 +49,6 @@ tree sizetype_tab[(int) stk_type_kind_la
The value is measured in bits.  */
 unsigned int maximum_field_alignment = TARGET_DEFAULT_PACK_STRUCT * BITS_PER_UNIT;
 
-/* Nonzero if all REFERENCE_TYPEs are internal and hence should be allocated
-   in the address spaces' address_mode, not pointer_mode.   Set only by
-   internal_reference_types called only by a front end.  */
-static int reference_types_internal = 0;
-
 static tree self_referential_size (tree);
 static void finalize_record_size (record_layout_info);
 static void finalize_type_size (tree);
@@ -62,15 +57,6 @@ static int excess_unit_span (HOST_WIDE_I
 			 HOST_WIDE_INT, tree);
 extern void debug_rli (record_layout_info);
 
-/* Show that REFERENCE_TYPES are internal and should use address_mode.
-   Called only by front end.  */
-
-void
-internal_reference_types (void)
-{
-  reference_types_internal = 1;
-}
-
 /* Given a size SIZE that may not be a constant, return a SAVE_EXPR
to serve as the actual size-expression for a type or decl.  */
 
@@ -2245,12 +2231,6 @@ layout_type (tree type)
 case REFERENCE_TYPE:
   {
 	machine_mode mode = TYPE_MODE (type);
-	if (TREE_CODE (type) == REFERENCE_TYPE && reference_types_internal)
-	  {
-	addr_space_t as = TYPE_ADDR_SPACE (TREE_TYPE (type));
-	mode = targetm.addr_space.address_mode (as);
-	  }
-
 	TYPE_SIZE (type) = bitsize_int (GET_MODE_BITSIZE (mode));
 	TYPE_SIZE_UNIT (type) = size_int (GET_MODE_SIZE (mode));
 	TYPE_UNSIGNED (type) = 1;
Index: stor-layout.h
===
--- stor-layout.h	(revision 235394)
+++ stor-layout.h	(working copy)
@@ -22,7 +22,6 @@ along with GCC; see the file COPYING3.
 
 extern void set_min_and_max_values_for_integral_type (tree, int, signop);
 extern void fixup_signed_type (tree);
-extern void internal_reference_types (void);
 extern unsigned int update_alignment_for_field (record_layout_info, tree,
 unsigned int);
 extern record_layout_info start_record_layout (tree);


[PATCH] S/390: Improve documentation of s390_reload_costs.

2016-04-27 Thread Dominik Vogt
The attached patch improves some S/390 function documentation.

Ciao

Dominik ^_^  ^_^

-- 

Dominik Vogt
IBM Germany
gcc/ChangeLog

* config/s390/s390.c (s390_rtx_costs): Update documentation.
From ba5a56e03402a75bb0cc807eb27c57d93ce736e1 Mon Sep 17 00:00:00 2001
From: Dominik Vogt 
Date: Thu, 14 Apr 2016 14:59:40 +0100
Subject: [PATCH] S/390: Improve documentation of s390_reload_costs.

---
 gcc/config/s390/s390.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c
index 4f219be..cb5dd5f 100644
--- a/gcc/config/s390/s390.c
+++ b/gcc/config/s390/s390.c
@@ -3371,8 +3371,10 @@ s390_memory_move_cost (machine_mode mode ATTRIBUTE_UNUSED,
 
 /* Compute a (partial) cost for rtx X.  Return true if the complete
cost has been computed, and false if subexpressions should be
-   scanned.  In either case, *TOTAL contains the cost result.
-   OUTER_CODE contains the code of the superexpression of x.  */
+   scanned.  In either case, *TOTAL contains the cost result.  The
+   initial value of *TOTAL is the default value computed by
+   rtx_cost.  It may be left unmodified.  OUTER_CODE contains the
+   code of the superexpression of x.  */
 
 static bool
 s390_rtx_costs (rtx x, machine_mode mode, int outer_code,
-- 
2.3.0



S/390: Add patterns for rsbg instructions.

2016-04-27 Thread Dominik Vogt
The attached patch adds some patterns using the r*sbg instructions
to the s390 machine description.  Bootstrapped and regression
tested on s390 and s390x.

Ciao

Dominik ^_^  ^_^

-- 

Dominik Vogt
IBM Germany
gcc/ChangeLog

* config/s390/s390.md ("*rsbg__sll")
("*rsbg__srl"): New define_insns.
("*rsbg__srl_bitmask"): Renamed by adding "_bitmask".
("*rsbg__sll_bitmask"): Likewise.
gcc/testsuite/ChangeLog

* gcc.target/s390/md/rXsbg_mode_sXl.c: New test.
* gcc.target/s390/s390.exp (check_effective_target_z10_instructions):
Procedure to check for z10 instruction set.
From 4f6e3da0bcb6adeff8acfaaeeec3261a92bac00c Mon Sep 17 00:00:00 2001
From: Dominik Vogt 
Date: Fri, 4 Mar 2016 09:45:05 +0100
Subject: [PATCH] S/390: Add patterns for rsbg instructions.

Needs new ANDIXOR code iterator.
---
 gcc/config/s390/s390.md   |  34 -
 gcc/testsuite/gcc.target/s390/md/rXsbg_mode_sXl.c | 151 ++
 gcc/testsuite/gcc.target/s390/s390.exp|  11 ++
 3 files changed, 194 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/s390/md/rXsbg_mode_sXl.c

diff --git a/gcc/config/s390/s390.md b/gcc/config/s390/s390.md
index 5a9f1c8..5f3b0f7 100644
--- a/gcc/config/s390/s390.md
+++ b/gcc/config/s390/s390.md
@@ -3965,7 +3965,7 @@
   "rsbg\t%0,%1,%2,%2,%b3"
   [(set_attr "op_type" "RIE")])
 
-(define_insn "*rsbg__srl"
+(define_insn "*rsbg__srl_bitmask"
   [(set (match_operand:GPR 0 "nonimmediate_operand" "=d")
 	(IXOR:GPR
 	  (and:GPR
@@ -3981,7 +3981,7 @@
   "rsbg\t%0,%1,%2,%2,64-%3"
   [(set_attr "op_type" "RIE")])
 
-(define_insn "*rsbg__sll"
+(define_insn "*rsbg__sll_bitmask"
   [(set (match_operand:GPR 0 "nonimmediate_operand" "=d")
 	(IXOR:GPR
 	  (and:GPR
@@ -3997,6 +3997,36 @@
   "rsbg\t%0,%1,%2,%2,%3"
   [(set_attr "op_type" "RIE")])
 
+;; unsigned {int,long} a, b
+;; a = a | (b << const_int)
+;; a = a ^ (b << const_int)
+(define_insn "*rsbg__sll"
+  [(set (match_operand:GPR 0 "nonimmediate_operand" "=d")
+	(IXOR:GPR
+	  (ashift:GPR
+(match_operand:GPR 1 "nonimmediate_operand" "d")
+(match_operand:GPR 2 "nonzero_shift_count_operand" ""))
+	  (match_operand:GPR 3 "nonimmediate_operand" "0")))
+   (clobber (reg:CC CC_REGNUM))]
+  "TARGET_Z10"
+  "rsbg\t%0,%1,64-,63-%2,%2"
+  [(set_attr "op_type" "RIE")])
+
+;; unsigned {int,long} a, b
+;; a = a | (b >> const_int)
+;; a = a ^ (b >> const_int)
+(define_insn "*rsbg__srl"
+  [(set (match_operand:GPR 0 "nonimmediate_operand" "=d")
+	(IXOR:GPR
+	  (lshiftrt:GPR
+(match_operand:GPR 1 "nonimmediate_operand" "d")
+(match_operand:GPR 2 "nonzero_shift_count_operand" ""))
+	  (match_operand:GPR 3 "nonimmediate_operand" "0")))
+   (clobber (reg:CC CC_REGNUM))]
+  "TARGET_Z10"
+  "rsbg\t%0,%1,64-+%2,63,64-%2"
+  [(set_attr "op_type" "RIE")])
+
 ;; These two are generated by combine for s.bf &= val.
 ;; ??? For bitfields smaller than 32-bits, we wind up with SImode
 ;; shifts and ands, which results in some truly awful patterns
diff --git a/gcc/testsuite/gcc.target/s390/md/rXsbg_mode_sXl.c b/gcc/testsuite/gcc.target/s390/md/rXsbg_mode_sXl.c
new file mode 100644
index 000..178a537
--- /dev/null
+++ b/gcc/testsuite/gcc.target/s390/md/rXsbg_mode_sXl.c
@@ -0,0 +1,151 @@
+/* Machine description pattern tests.  */
+
+/*
+{ dg-options "-mzarch -save-temps" }
+
+   Note that dejagnu-1.5.1 has a bug so that the action from the second dg-do
+   always wins, even if the condition is false.  If this test is run on hardware
+   older than z10 with a buggy dejagnu release, the execution part will fail.
+
+{ dg-do assemble { target { ! z10_instructions } } }
+{ dg-do run { target { z10_instructions } } }
+
+   Skip test if -O0, -march=z900, -march=z9-109 or -march=z9-ec is present on
+   the command line:
+
+{ dg-skip-if "" { *-*-* } { "-march=z9*" "-O0" } { "" } }
+
+   Skip test if the -O or the -march= option is missing from the command line
+   because it's difficult to detect the default:
+
+{ dg-skip-if "" { *-*-* } { "*" } { "-O*" } }
+{ dg-skip-if "" { *-*-* } { "*" } { "-march=*" } }
+*/
+
+__attribute__ ((noinline)) unsigned int
+si_sll (unsigned int x)
+{
+  return (x << 1);
+}
+
+__attribute__ ((noinline)) unsigned int
+si_srl (unsigned int x)
+{
+  return (x >> 2);
+}
+
+__attribute__ ((noinline)) unsigned int
+rosbg_si_sll (unsigned int a, unsigned int b)
+{
+  return a | (b << 1);
+}
+/* { dg-final { scan-assembler-times "rosbg\t%r.,%r.,64-32,63-1,1" 1 } } */
+
+__attribute__ ((noinline)) unsigned int
+rosbg_si_srl (unsigned int a, unsigned int b)
+{
+  return a | (b >> 2);
+}
+/* { dg-final { scan-assembler-times "rosbg\t%r.,%r.,64-32\\+2,63,64-2" 1 } } */
+
+__attribute__ ((noinline)) unsigned int
+rxsbg_si_sll (unsigned int a, unsigned int b)
+{
+  return a ^ (b << 1);
+}
+/* { dg-final { scan-assembler-times "rxsbg\t%r.,%r.,64-32,63-1,1" 1 } } */
+
+__attribute__ ((noinline)) unsign

[PATCH] S/390: Add splitter for "and" with complement.

2016-04-27 Thread Dominik Vogt
The attached patch provides some improved patterns for "and with
complement" to the s390 machine description.  Bootstrapped and
regression tested on s390 and s390x.

Ciao

Dominik ^_^  ^_^

-- 

Dominik Vogt
IBM Germany
gcc/ChangeLog

* config/s390/s390.h (REG_OR_SUBREG_P): New helper macro.
* config/s390/s390.c (s390_expand_logical_operator): Force operands
from memory into registers for expressions with register destination.
(s390_logical_operator_si3_ok_p)
(s390_andc_split_ok_p): New functions.
* config/s390/s390-protos.h (s390_logical_operator_si3_ok_p)
(s390_andc_split_ok_p): Add prototypes.
* config/s390/s390.md ("*andc_split", "*andc_split2"): New splitters
for and with complement.
("*andsi3_zarch", "*iorsi3_zarch", "xorsi3"): Call
s390_logical_operator_si3_ok_p.
gcc/testsuite/ChangeLog

* gcc.target/s390/md/andc-splitter-1.c: New test case.
* gcc.target/s390/md/andc-splitter-2.c: Likewise.
From de225e02fe79661642f123fd0505a0bd60f20066 Mon Sep 17 00:00:00 2001
From: Dominik Vogt 
Date: Mon, 14 Mar 2016 17:48:17 +0100
Subject: [PATCH] S/390: Add splitter for "and" with complement.

Force splitting of logical operator expressions ...  with three operands, a
register destination and a memory operand because there are no instructions for
that and combine results in inefficient code.
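
The operand rule behind this can be sketched outside of GCC roughly as follows.  This is only a toy model: `operand_kind` and `si3_ok` are made-up names, and the real predicate works on rtx operands and additionally consults s390_logical_operator_ok_p and reload_completed.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for the operand kinds the predicate distinguishes
   (hypothetical; GCC uses rtx codes, not this enum).  */
enum operand_kind { OP_REG, OP_MEM, OP_CONST_INT };

/* Return false for a three operand expression with a register
   destination where one source is in memory and neither source is a
   constant; accept every other combination.  */
bool
si3_ok (enum operand_kind dst, enum operand_kind src1,
        enum operand_kind src2)
{
  if (dst == OP_REG
      && (src1 == OP_MEM || src2 == OP_MEM)
      && !(src1 == OP_CONST_INT || src2 == OP_CONST_INT))
    return false;

  return true;
}
```

With this model, "reg = mem & reg" is rejected (so the memory operand gets loaded into a register first), while "reg = mem & const" and "mem = mem & reg" are still accepted.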
---
 gcc/config/s390/s390-protos.h  |  2 +
 gcc/config/s390/s390.c | 65 ++
 gcc/config/s390/s390.h |  3 +
 gcc/config/s390/s390.md| 52 -
 gcc/testsuite/gcc.target/s390/md/andc-splitter-1.c | 61 
 gcc/testsuite/gcc.target/s390/md/andc-splitter-2.c | 38 +
 6 files changed, 218 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/s390/md/andc-splitter-1.c
 create mode 100644 gcc/testsuite/gcc.target/s390/md/andc-splitter-2.c

diff --git a/gcc/config/s390/s390-protos.h b/gcc/config/s390/s390-protos.h
index 2ccf0bb..8ba4d5d 100644
--- a/gcc/config/s390/s390-protos.h
+++ b/gcc/config/s390/s390-protos.h
@@ -127,6 +127,8 @@ extern rtx_insn *s390_emit_call (rtx, rtx, rtx, rtx);
 extern void s390_expand_logical_operator (enum rtx_code,
 	  machine_mode, rtx *);
 extern bool s390_logical_operator_ok_p (rtx *);
+extern bool s390_logical_operator_si3_ok_p (rtx *);
+extern bool s390_andc_split_ok_p (rtx *);
 extern void s390_narrow_logical_operator (enum rtx_code, rtx *, rtx *);
 extern void s390_split_access_reg (rtx, rtx *, rtx *);
 
diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c
index cb5dd5f..1a303d8 100644
--- a/gcc/config/s390/s390.c
+++ b/gcc/config/s390/s390.c
@@ -2558,6 +2558,27 @@ s390_expand_logical_operator (enum rtx_code code, machine_mode mode,
 	src2 = gen_rtx_SUBREG (wmode, force_reg (mode, src2), 0);
 }
 
+  /* We have no useful instructions with three operands, the source in memory
+ and the destination in a register.  Reload memory operands to register if
+ necessary.  */
+  if (!s390_logical_operator_si3_ok_p (operands))
+{
+  if (MEM_P (src1))
+	{
+	  rtx temp = gen_reg_rtx (mode);
+
+	  emit_move_insn (temp, src1);
+	  src1 = temp;
+	}
+  if (MEM_P (src2))
+	{
+	  rtx temp = gen_reg_rtx (mode);
+
+	  emit_move_insn (temp, src2);
+	  src2 = temp;
+	}
+}
+
   /* Emit the instruction.  */
   op = gen_rtx_SET (dst, gen_rtx_fmt_ee (code, wmode, src1, src2));
   clob = gen_rtx_CLOBBER (VOIDmode, gen_rtx_REG (CCmode, CC_REGNUM));
@@ -2583,6 +2604,50 @@ s390_logical_operator_ok_p (rtx *operands)
   return true;
 }
 
+/* Rejects operand combinations of logical operations (AND, IOR, XOR) that
+   result in less efficient code later.  */
+
+bool
+s390_logical_operator_si3_ok_p (rtx *operands)
+{
+  if (!s390_logical_operator_ok_p (operands))
+return false;
+  if (reload_completed)
+return true;
+  /* Reject three operand expressions with register destination if one of the
+ sources is a memory operand and the other is not a const_int operand.  */
+  if (REG_OR_SUBREG_P (operands[0])
+  && (MEM_P (operands[1]) || MEM_P (operands[2]))
+  && !(CONST_INT_P (operands[1]) || CONST_INT_P (operands[2])))
+return false;
+
+  return true;
+}
+
+/* Rejects operand combinations of AND operations that result in less efficient
+   code later.  */
+
+bool
+s390_andc_split_ok_p (rtx *operands)
+{
+  if (reload_completed)
+return false;
+  if (!s390_logical_operator_si3_ok_p (operands))
+return false;
+  /* Reject two operand expressions with a memory destination that is identical
+ to one of the source operands and the other operand a register or memory
+ because the splitter would replace the destination with a register yielding
+ an undefined pattern.  */
+  if (MEM_P (operands[0])
+  && (MEM_P (operands[1]) || REG_OR_SUBREG_P (operands[1]))
+  && (MEM

[PING, PATCH] Remove xfail from thread_local-order2.C.

2016-04-27 Thread Dominik Vogt
> g++.dg/tls/thread_local-order2.C no longer fails with Glibc-2.18 or
> newer since this commit:
>
>   2014-08-01  Zifei Tong  
> 
>   * libsupc++/atexit_thread.cc (HAVE___CXA_THREAD_ATEXIT_IMPL): Add
>   _GLIBCXX_ prefix to macro.
> 
>   git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@213504 138bc75d-0d04-0410-96
> 
> https://gcc.gnu.org/ml/gcc-patches/2014-07/msg02091.html
> 
> So, is it time to remove the xfail from the test case?

Ciao

Dominik ^_^  ^_^

-- 

Dominik Vogt
IBM Germany
gcc/testsuite/ChangeLog

* g++.dg/tls/thread_local-order2.C: Remove xfail.
From 0b0abbd2e6d9d8b6857622065bdcbdde31b5ddb0 Mon Sep 17 00:00:00 2001
From: Dominik Vogt 
Date: Wed, 27 Jan 2016 09:54:07 +0100
Subject: [PATCH] Remove xfail from thread_local-order2.C.

This should work with Glibc-2.18 or newer.
---
 gcc/testsuite/g++.dg/tls/thread_local-order2.C | 1 -
 1 file changed, 1 deletion(-)

diff --git a/gcc/testsuite/g++.dg/tls/thread_local-order2.C b/gcc/testsuite/g++.dg/tls/thread_local-order2.C
index f8df917..d3351e6 100644
--- a/gcc/testsuite/g++.dg/tls/thread_local-order2.C
+++ b/gcc/testsuite/g++.dg/tls/thread_local-order2.C
@@ -2,7 +2,6 @@
 // that isn't reverse order of construction.  We need to move
 // __cxa_thread_atexit into glibc to get this right.
 
-// { dg-do run { xfail *-*-* } }
 // { dg-require-effective-target c++11 }
 // { dg-add-options tls }
 // { dg-require-effective-target tls_runtime }
-- 
2.3.0



[PATCH] DWARF: turn dw_loc_descr_node field into hash map for frame offset check

2016-04-27 Thread Pierre-Marie de Rodat
Hello,

As discussed on
, this change
removes a field in the dw_loc_descr_node structure so we can get rid of
the CHECKING_P macro usage.

This field was used to perform consistency checks for frame offset in
DWARF procedures. As a replacement, this commit turns the "visited
nodes" set in resolve_args_picking_1 into a map that remembers for each
dw_loc_descr_node the frame offset associated to it, so that the
consistency check is still operational.
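
To illustrate the idea independently of dwarf2out.c, here is a minimal sketch in plain C.  It assumes a toy operation type (`struct op`) and uses an array indexed by operation number in place of the hash map; `check_offsets` and `UNVISITED` are hypothetical names, not anything from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define UNVISITED (-1)

/* Toy operation: STACK_DELTA is its net effect on the stack depth, NEXT
   the fall-through successor and BRANCH an optional branch target (both
   are indices into the operation array, -1 meaning "none").  */
struct op
{
  int stack_delta;
  int next;
  int branch;
};

/* Depth-first walk over the operations reachable from START.  For each
   operation, remember in FRAME_OFFSETS the offset *before* executing it.
   Stop at already visited operations and report whether the offset seen
   now matches the one recorded earlier.  */
bool
check_offsets (const struct op *ops, int start, int offset,
               int *frame_offsets)
{
  assert (ops != NULL && frame_offsets != NULL);

  for (int i = start; i != -1; i = ops[i].next)
    {
      if (frame_offsets[i] != UNVISITED)
        return frame_offsets[i] == offset;

      frame_offsets[i] = offset;
      offset += ops[i].stack_delta;

      if (ops[i].branch != -1
          && !check_offsets (ops, ops[i].branch, offset, frame_offsets))
        return false;
    }

  return true;
}
```

The caller is expected to fill FRAME_OFFSETS with UNVISITED first.  The point of the sketch is only that one container answers both "did we see this node?" and "with which offset?", which is what makes the separate field unnecessary.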

Bootstrapped and regtested on x86_64-linux. Ok to commit? Thank you in
advance!
---
 gcc/dwarf2out.c | 37 +++--
 gcc/dwarf2out.h |  6 --
 2 files changed, 19 insertions(+), 24 deletions(-)

diff --git a/gcc/dwarf2out.c b/gcc/dwarf2out.c
index 0bbff87..463863d 100644
--- a/gcc/dwarf2out.c
+++ b/gcc/dwarf2out.c
@@ -1325,9 +1325,6 @@ new_loc_descr (enum dwarf_location_atom op, unsigned HOST_WIDE_INT oprnd1,
   dw_loc_descr_ref descr = ggc_cleared_alloc ();
 
   descr->dw_loc_opc = op;
-#if CHECKING_P
-  descr->dw_loc_frame_offset = -1;
-#endif
   descr->dw_loc_oprnd1.val_class = dw_val_class_unsigned_const;
   descr->dw_loc_oprnd1.val_entry = NULL;
   descr->dw_loc_oprnd1.v.val_unsigned = oprnd1;
@@ -15353,12 +15350,14 @@ is_handled_procedure_type (tree type)
  && int_size_in_bytes (type) <= DWARF2_ADDR_SIZE);
 }
 
-/* Helper for resolve_args_picking.  Stop when coming across VISITED nodes.  */
+/* Helper for resolve_args_picking: do the same but stop when coming across
+   visited nodes.  For each node we visit, register in FRAME_OFFSETS the frame
+   offset *before* evaluating the corresponding operation.  */
 
 static bool
 resolve_args_picking_1 (dw_loc_descr_ref loc, unsigned initial_frame_offset,
struct dwarf_procedure_info *dpi,
-   hash_set &visited)
+   hash_map &frame_offsets)
 {
   /* The "frame_offset" identifier is already used to name a macro... */
   unsigned frame_offset_ = initial_frame_offset;
@@ -15366,19 +15365,18 @@ resolve_args_picking_1 (dw_loc_descr_ref loc, unsigned initial_frame_offset,
 
   for (l = loc; l != NULL;)
 {
+  bool existed;
+  unsigned &l_frame_offset = frame_offsets.get_or_insert (l, &existed);
+
   /* If we already met this node, there is nothing to compute anymore.  */
-  if (visited.add (l))
+  if (existed)
{
-#if CHECKING_P
  /* Make sure that the stack size is consistent wherever the execution
 flow comes from.  */
- gcc_assert ((unsigned) l->dw_loc_frame_offset == frame_offset_);
-#endif
+ gcc_assert ((unsigned) l_frame_offset == frame_offset_);
  break;
}
-#if CHECKING_P
-  l->dw_loc_frame_offset = frame_offset_;
-#endif
+  l_frame_offset = frame_offset_;
 
   /* If needed, relocate the picking offset with respect to the frame
 offset. */
@@ -15601,7 +15599,7 @@ resolve_args_picking_1 (dw_loc_descr_ref loc, unsigned initial_frame_offset,
{
case DW_OP_bra:
  if (!resolve_args_picking_1 (l->dw_loc_next, frame_offset_, dpi,
-  visited))
+  frame_offsets))
return false;
  /* Fall through... */
 
@@ -15623,17 +15621,20 @@ resolve_args_picking_1 (dw_loc_descr_ref loc, unsigned initial_frame_offset,
 
 /* Make a DFS over operations reachable through LOC (i.e. follow branch
operations) in order to resolve the operand of DW_OP_pick operations that
-   target DWARF procedure arguments (DPI).  Stop at already visited nodes.
-   INITIAL_FRAME_OFFSET is the frame offset *before* LOC is executed.  Return
-   if all relocations were successful.  */
+   target DWARF procedure arguments (DPI).  INITIAL_FRAME_OFFSET is the frame
+   offset *before* LOC is executed.  Return if all relocations were
+   successful.  */
 
 static bool
 resolve_args_picking (dw_loc_descr_ref loc, unsigned initial_frame_offset,
  struct dwarf_procedure_info *dpi)
 {
-  hash_set visited;
+  /* Associate to all visited operations the frame offset *before* evaluating
+ this operation.  */
+  hash_map frame_offsets;
 
-  return resolve_args_picking_1 (loc, initial_frame_offset, dpi, visited);
+  return resolve_args_picking_1 (loc, initial_frame_offset, dpi,
+frame_offsets);
 }
 
 /* Try to generate a DWARF procedure that computes the same result as FNDECL.
diff --git a/gcc/dwarf2out.h b/gcc/dwarf2out.h
index 91b3d6b..abf0550 100644
--- a/gcc/dwarf2out.h
+++ b/gcc/dwarf2out.h
@@ -239,12 +239,6 @@ struct GTY((chain_next ("%h.dw_loc_next"))) dw_loc_descr_node {
  frame offset.  */
   unsigned int frame_offset_rel : 1;
   int dw_loc_addr;
-#if CHECKING_P
-  /* When translating a function into a DWARF procedure, contains the frame
- offset *before* evaluating this operation.  It is -1 when not yet
- initialized.  

Re: Please include ada-hurd.diff upstream (try2)

2016-04-27 Thread Svante Signell
On Wed, 2016-04-27 at 09:41 +0200, Eric Botcazou wrote:
> > 
> > Attaching the modified ada-hurd.diff. Maybe it is ready for inclusion in
> > upstream now?
> Upstream already contains the first set of changes though.  Here's what I
> have 
> installed on mainline and 6 branch (not sure it will be in the 6.1 release).
> 
> 
> 2016-04-27  Svante Signell  
> 
>   * gcc-interface/Makefile.in (x86 GNU/Hurd): Use s-osinte-gnu.adb.
>   * s-osinte-gnu.ads: Small tweaks.
>   * s-osinte-gnu.adb: New file.

I did not know that, when did that happen? Thanks anyway :) Found the above
commits though, on https://gcc.gnu.org/ml/gcc-cvs/2016-04/


[PATCH] Take known zero bits into account when checking extraction.

2016-04-27 Thread Dominik Vogt
The attached patch is a result of discussing an S/390 issue with
"and with complement" in some cases.

  https://gcc.gnu.org/ml/gcc/2016-03/msg00163.html
  https://gcc.gnu.org/ml/gcc-patches/2016-04/msg01586.html

Combine would merge a ZERO_EXTEND and a SET taking the known zero
bits into account, resulting in an AND.  Later on,
make_compound_operation() fails to replace that with a ZERO_EXTEND
which we get for free on S/390 but leaves the AND, eventually
resulting in two consecutive AND instructions.

The current code in make_compound_operation() that detects
opportunities for ZERO_EXTEND does not work here because it does
not take the known zero bits into account:

  /* If the constant is one less than a power of two, this might be
 representable by an extraction even if no shift is present.
 If it doesn't end up being a ZERO_EXTEND, we will ignore it unless
 we are in a COMPARE.  */
  else if ((i = exact_log2 (UINTVAL (XEXP (x, 1)) + 1)) >= 0)
new_rtx = make_extraction (mode,
   make_compound_operation (XEXP (x, 0),
next_code),
   0, NULL_RTX, i, 1, 0, in_code == COMPARE);

An attempt to use the zero bits in the above conditions resulted
in many situations that generated worse code, so the patch tries
to fix this in a more conservative way.  While the effect is
completely positive on S/390, this will very likely have
unforeseeable consequences on other targets.
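
For readers who want to play with the bit condition the patch checks, here is a small stand-alone sketch.  It is not GCC code: `and_acts_as_zero_extend` is a made-up name, and the three arguments stand in for the AND constant, the result of nonzero_bits () on the inner value, and GET_MODE_MASK () of the narrower mode.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* AND_MASK:  constant operand of the AND.
   NONZERO:   bits of the narrow value that may be nonzero.
   MODE_MASK: all-ones mask of the narrow mode.

   The AND can be treated like a zero extension of the narrow value if
   every bit of the narrow mode is either kept by the constant or already
   known to be zero, and the constant keeps nothing above the narrow
   mode.  */
bool
and_acts_as_zero_extend (uint64_t and_mask, uint64_t nonzero,
                         uint64_t mode_mask)
{
  uint64_t known_zero = ~nonzero & mode_mask;

  return (and_mask | known_zero) == mode_mask;
}
```

For example, with an 8 bit inner value whose top bit is known to be zero, an AND with 0x7f passes the check although 0x7f + 1 is not 0x100; the existing exact_log2 test alone would only see a 7 bit extraction there.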

Bootstrapped and regression tested on s390 and s390x only at the
moment.

Ciao

Dominik ^_^  ^_^

-- 

Dominik Vogt
IBM Germany
gcc/ChangeLog

* combine.c (make_compound_operation): Take known zero bits into
account when checking for possible zero_extend.
From e70e6e469200b53b3f4ae52a766cdd322a4d365d Mon Sep 17 00:00:00 2001
From: Dominik Vogt 
Date: Tue, 12 Apr 2016 09:53:46 +0100
Subject: [PATCH] Take known zero bits into account when checking
 extraction.

Allows AND insns with a const_int operand to be expressed as ZERO_EXTEND if the
operand is a power of 2 - 1 even with the known zero bits masked out.
---
 gcc/combine.c | 33 +
 1 file changed, 33 insertions(+)

diff --git a/gcc/combine.c b/gcc/combine.c
index 1d0e8be..44bb1b3 100644
--- a/gcc/combine.c
+++ b/gcc/combine.c
@@ -7988,6 +7988,39 @@ make_compound_operation (rtx x, enum rtx_code in_code)
 			next_code),
 			   i, NULL_RTX, 1, 1, 0, 1);
 
+  /* If the one operand is a paradoxical subreg of a register or memory and
+	 the constant (limited to the smaller mode) has only zero bits where
+	 the sub expression has known zero bits, this can be expressed as
+	 a zero_extend.  */
+  else if (GET_CODE (XEXP (x, 0)) == SUBREG)
+	{
+	  rtx sub;
+
+	  sub = XEXP (XEXP (x, 0), 0);
+	  machine_mode sub_mode = GET_MODE (sub);
+	  if ((REG_P (sub) || MEM_P (sub))
+	  && GET_MODE_PRECISION (sub_mode) < mode_width
+	  && (UINTVAL (XEXP (x, 1))
+		  | (~nonzero_bits (sub, sub_mode) & GET_MODE_MASK (sub_mode))
+		  ) == GET_MODE_MASK (sub_mode))
+	{
+	  bool speed_p = optimize_insn_for_speed_p ();
+	  rtx temp = gen_rtx_ZERO_EXTEND (mode, sub);
+	  int cost_of_and;
+	  int cost_of_zero_extend;
+
+	  cost_of_and = rtx_cost (x, mode, in_code, 1, speed_p);
+	  cost_of_zero_extend = rtx_cost (temp, mode, in_code, 1, speed_p);
+	  if (cost_of_zero_extend <= cost_of_and)
+		{
+		  new_rtx = make_compound_operation (sub, next_code);
+		  new_rtx = make_extraction (mode, new_rtx, 0, 0,
+	 GET_MODE_PRECISION (sub_mode),
+	 1, 0, in_code == COMPARE);
+		}
+	}
+	}
+
   break;
 
 case LSHIFTRT:
-- 
2.3.0



[PATCH] Fix type field walking in gimplifier unsharing

2016-04-27 Thread Richard Biener

Gimplification is done in-place and thus relies on all processed
trees being unshared.  This is achieved by unshare_body which
in the end uses walk_tree to get at all interesting trees that possibly
need unsharing.

Unfortunately it doesn't really work because walk_tree only walks
types and type-related fields (TYPE_SIZE, TYPE_MIN_VALUE, etc.) in
very narrow circumstances.
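
As a rough illustration of the mark/copy-if-shared scheme (a toy model only, with a hypothetical `struct node` in place of trees and a plain recursive walk in place of walk_tree):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Toy tree node with a "visited" mark, standing in for TREE_VISITED.  */
struct node
{
  int value;
  bool visited;
  struct node *kid[2];
};

/* Return an unmarked deep copy of the subtree rooted at N.  */
static struct node *
deep_copy (const struct node *n)
{
  if (n == NULL)
    return NULL;

  struct node *c = malloc (sizeof *c);
  assert (c != NULL);
  c->value = n->value;
  c->visited = false;
  c->kid[0] = deep_copy (n->kid[0]);
  c->kid[1] = deep_copy (n->kid[1]);
  return c;
}

/* Walk the tree stored in *SLOT.  The first time a node is reached it is
   only marked; when it is reached again through another slot, that slot
   gets a private copy.  */
void
unshare (struct node **slot)
{
  struct node *n = *slot;

  if (n == NULL)
    return;

  if (n->visited)
    {
      *slot = deep_copy (n);
      return;
    }

  n->visited = true;
  unshare (&n->kid[0]);
  unshare (&n->kid[1]);
}
```

The scheme only unshares what the walk actually reaches.  A pointer the walker never follows keeps referring to the shared node, which is the situation described here for the type fields.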

The symptom of failing to unshare trees used in those fields and
in the IL of a function body that is gimplified is that those
trees get modified by the gimplification process which basically
leaks temporary decls into them.

Note that only type sizes that are actually needed for the IL
of a function are gimplified and thus there are referenced
types with still save-expr TYPE_SIZE after gimplification.

Eventually dwarf2out knows to emit debug info for those by
generating dwarf expressions but will surely not be able to do
so if the expressions contain random local decls that may no
longer be there.

With my patch to make the gimplifier use SSA names for temporaries
those may now leak into the non-gimplified TYPE_SIZEs and if they
are later released, SSA names with NULL_TREE type (or even garbage
memory if the SSA name is ggc freed) end up in there.  This crashes
in various ways when those trees are accessed (in dwarf2out, in the
LTO streamer, in other places looking at pointed-to types).

Thus the following patch which makes the gimplifier unsharing
visit all types.

Alternative patches include unsharing trees when we build a
save_expr around them, doing that only when stor-layout does
this to TYPE_SIZE fields (and friends) or try to have an
extra pass over the GENERIC IL to just mark the expressions
in the type fields not walked by the current walking (might
turn out tricky but it would result in the "least" unsharing
thus only unshare the parts we eventually gimplify).

It might be that we just need to declare tree sharing that runs
into the above issue as invalid (they involve Fortran testcases
or ubsan testcases only as far as I can see - interestingly
no Ada testcases are affected).

So - any opinion on the "correct" way to fix this?  Clearly
the gimplifier running into shared trees is a bug.

Thanks,
Richard.

2016-04-27  Richard Biener  

* gimplify.c (copy_if_shared_r): Walk types, type sizes
and bounds.
(unmark_visited_r): Unmark them.

Index: gcc/gimplify.c
===
--- gcc/gimplify.c.orig 2016-04-27 10:29:54.784677194 +0200
+++ gcc/gimplify.c  2016-04-27 10:29:38.708496357 +0200
@@ -832,31 +832,61 @@ copy_if_shared_r (tree *tp, int *walk_su
   tree t = *tp;
   enum tree_code code = TREE_CODE (t);
 
-  /* Skip types, decls, and constants.  But we do want to look at their
- types and the bounds of types.  Mark them as visited so we properly
- unmark their subtrees on the unmark pass.  If we've already seen them,
- don't look down further.  */
-  if (TREE_CODE_CLASS (code) == tcc_type
-  || TREE_CODE_CLASS (code) == tcc_declaration
+  bool was_visited = TREE_VISITED (t);
+
+  /* If the node wasn't visited already mark it so and recurse on its type.  */
+  if (! was_visited)
+{
+  TREE_VISITED (t) = 1;
+
+  /* walk_tree does not descend into TREE_TYPE unless this is a
+DECL_EXPR of a TYPE_DECL.  */
+  if (CODE_CONTAINS_STRUCT (code, TS_TYPED))
+   walk_tree (&TREE_TYPE (t), copy_if_shared_r, data, NULL);
+}
+
+  /* Skip types, decls, and constants for copying.  But we do want to look
+ at their types and the bounds of types.  Mark them as visited so we
+ properly unmark their subtrees on the unmark pass.  If we've already
+ seen them, don't look down further.  */
+  if (TREE_CODE_CLASS (code) == tcc_declaration
   || TREE_CODE_CLASS (code) == tcc_constant)
 {
-  if (TREE_VISITED (t))
+  if (was_visited)
+   *walk_subtrees = 0;
+}
+  else if (TREE_CODE_CLASS (code) == tcc_type)
+{
+  if (was_visited)
*walk_subtrees = 0;
   else
-   TREE_VISITED (t) = 1;
+   {
+ /* walk_type_fields does not walk type sizes or bounds if not
+coming directly from a DECL_EXPR context.  */
+ /* ???  ideally we'd mark all exprs reached via types as visited
+before copy_if_shared so duplicates in types that do not matter
+for gimplification are never unshared.  */
+ walk_tree (&TYPE_SIZE (t), copy_if_shared_r, data, NULL);
+ walk_tree (&TYPE_SIZE_UNIT (t), copy_if_shared_r, data, NULL);
+ if (INTEGRAL_TYPE_P (t)
+ || TREE_CODE (t) == FIXED_POINT_TYPE
+ || TREE_CODE (t) == REAL_TYPE)
+   {
+ walk_tree (&TYPE_MAX_VALUE (t), copy_if_shared_r, data, NULL);
+ walk_tree (&TYPE_MIN_VALUE (t), copy_if_shared_r, data, NULL);
+   }
+   }
 }
 
   /* If this node has been visited already, unshare it and don't look
  any deep

Re: [PATCH] Clean up tests where a later dg-do completely overrides another.

2016-04-27 Thread Bernd Schmidt

On 04/27/2016 09:50 AM, Dominik Vogt wrote:

The attached patch cleans up some (mostly unnecessary) dg-do
directives in the gcc.dg and gcc.target test cases.


Ok except...


* gcc.dg/spec-options.c: Switch order of the two "dg-do run" so that
the test is actually "run" on sh*-*-*.  Order _does_ matter.


No commentary in ChangeLogs.


Bernd



Re: [PATCH] S/390: Add splitter for "and" with complement.

2016-04-27 Thread Dominik Vogt
On Wed, Apr 27, 2016 at 08:58:44AM +0100, Dominik Vogt wrote:
> The attached patch provides some improved patterns for "and with
> complement" to the s390 machine description.  Bootstrapped and
> regression tested on s390 and s390x.

(This patch needs some careful proof reading.  I've made so many
versions of the patch that I may be overlooking something
obvious.)

Ciao

Dominik ^_^  ^_^

-- 

Dominik Vogt
IBM Germany



Re: Fix some i386 testcases for -frename-registers

2016-04-27 Thread Bernd Schmidt

On 04/27/2016 02:10 AM, H.J. Lu wrote:

On Tue, Apr 26, 2016 at 3:11 PM, Bernd Schmidt  wrote:

On 04/26/2016 09:39 PM, H.J. Lu wrote:


make check-gcc RUNTESTFLAGS="--target_board='unix{-mx32}'
i386.exp=avx512vl-vmovdqa64-1.c"



Unfortunately, that doesn't work:

/usr/include/gnu/stubs.h:13:28: fatal error: gnu/stubs-x32.h: No such file
or directory
compilation terminated.

Trying to follow the recipe to get an x32 glibc built fails with the same
error when trying to build an x32 libgcc. I think I'll need you to send me
before/after assembly files (I'm assuming it's -frename-registers which
makes the test fail on x32).



Here are avx512vl-vmovdqa64-1.i, old.s and new.s.


Still somewhat at a loss. Is it trying to match a register name and 
close paren with the '.{5}'? What if you replace that with something 
like '%[re][0-9a-z]*//)'? Or maybe '.{5,6}'?



Bernd


Re: [PATCH] Fix type field walking in gimplifier unsharing

2016-04-27 Thread Eric Botcazou
> Gimplification is done in-place and thus relies on all processed
> trees being unshared.  This is achieved by unshare_body which
> in the end uses walk_tree to get at all interesting trees that possibly
> need unsharing.
> 
> Unfortunately it doesn't really work because walk_tree only walks
> types and type-related fields (TYPE_SIZE, TYPE_MIN_VALUE, etc.) in
> very narrow circumstances.

Right, but well defined and explained:

case DECL_EXPR:
  /* If this is a TYPE_DECL, walk into the fields of the type that it's
 defining.  We only want to walk into these fields of a type in this
 case and not in the general case of a mere reference to the type.

 The criterion is as follows: if the field can be an expression, it
 must be walked only here.  This should be in keeping with the fields
 that are directly gimplified in gimplify_type_sizes in order for the
 mark/copy-if-shared/unmark machinery of the gimplifier to work with
 variable-sized types.

 Note that DECLs get walked as part of processing the BIND_EXPR.  */

> Thus the following patch which makes the gimplifier unsharing
> visit all types.

I think this will generate a lot of useless walking in Ada...

> So - any opinion on the "correct" way to fix this?

Add DECL_EXPRs for the types, that's what is done in Ada.

-- 
Eric Botcazou


Cilk Plus testsuite needs massive cleanup (PR testsuite/70595)

2016-04-27 Thread Rainer Orth
While working on the libcilkrts SPARC port from PR target/68945, I
noticed that the Cilk Plus testsuite has massive need for and potential
of cleanup to easily accommodate non-x86 targets:

* Every single execution test explicitly lists the targets to run on,
  often even twice (in the dg-do target selector and then again when
  adding -lcilkrts via dg-options).  This is completely unmaintainable
  and should be replaced by a target selector.  I'm using the current
  check_libcilkrts_available, renamed to cilkplus_runtime, for that
  purpose.  There's no need to add -lcilkrts at all; -fcilkplus already
  does this when linking.

* Two tests (c-c++-common/cilk-plus/CK/pr63307.c and
  c-c++-common/cilk-plus/SE/ef_error3.c) are pure compile tests and
  don't need a target selector at all.

* This only leaves us with c-c++-common/cilk-plus/SE/ef_error2.c, where
  the expected warning is x86-specific, thus the target selector needs
  to stay.

There's much opportunity for additional cleanup, already mentioned in
the PR, but the current set is enough to successfully run the testsuite
on Solaris/SPARC with the preliminary patch in PR target/68945.  I'll
address the rest in a follow-up.

Tested with the appropriate runtest invocations on i386-pc-solaris2.12
and x86_64-pc-linux-gnu (and also on sparc-sun-solaris2.12 with the
libcilkrts port): with the exception of a line number change for
c-c++-common/cilk-plus/SE/ef_error2.c, results without and with the
patch are identical.

Will commit to mainline in a day or two, giving interested parties an
opportunity to comment.

Rainer


2016-04-05  Rainer Orth  

gcc:
PR testsuite/70595
* doc/sourcebuild.texi (Effective-Target Keywords, Other
attributes): Document cilkplus_runtime.

gcc/testsuite:
PR testsuite/70595
* lib/target-supports.exp (check_libcilkrts_available): Rename to ...
(check_effective_target_cilkplus_runtime): ... this.
* g++.dg/cilk-plus/cilk-plus.exp: Adapt to it.
* gcc.dg/cilk-plus/cilk-plus.exp: Likewise.

* c-c++-common/cilk-plus/CK/cilk-for-2.c: Remove dg-do target selector.
Require cilkplus_runtime.
Don't add -lcilkrts.
* c-c++-common/cilk-plus/CK/cilk-fors.c: Likewise.
* c-c++-common/cilk-plus/CK/cilk_for_grain.c: Likewise.
* c-c++-common/cilk-plus/CK/cilk_for_ptr_iter.c: Likewise.
* c-c++-common/cilk-plus/CK/fib.c: Likewise.
* c-c++-common/cilk-plus/CK/fib_init_expr_xy.c: Likewise.
* c-c++-common/cilk-plus/CK/fib_no_return.c: Likewise.
* c-c++-common/cilk-plus/CK/fib_no_sync.c: Likewise.
* c-c++-common/cilk-plus/CK/nested_cilk_for.c: Likewise.
* c-c++-common/cilk-plus/CK/pr60586.c: Likewise.
* c-c++-common/cilk-plus/CK/pr69826-1.c: Likewise.
* c-c++-common/cilk-plus/CK/pr69826-2.c: Likewise.
* c-c++-common/cilk-plus/CK/spawnee_inline.c: Likewise.
* c-c++-common/cilk-plus/CK/spawner_inline.c: Likewise.
* c-c++-common/cilk-plus/CK/spawning_arg.c: Likewise.
* c-c++-common/cilk-plus/CK/steal_check.c: Likewise.
* c-c++-common/cilk-plus/CK/varargs_test.c: Likewise.
* g++.dg/cilk-plus/CK/catch_exc.cc: Likewise.
* g++.dg/cilk-plus/CK/cilk-for-tplt.cc: Likewise.
* g++.dg/cilk-plus/CK/const_spawn.cc: Likewise.
* g++.dg/cilk-plus/CK/fib-opr-overload.cc: Likewise.
* g++.dg/cilk-plus/CK/fib-tplt.cc: Likewise.
* g++.dg/cilk-plus/CK/for1.cc: Likewise.
* g++.dg/cilk-plus/CK/lambda_spawns.cc: Likewise.
* g++.dg/cilk-plus/CK/lambda_spawns_tplt.cc: Likewise.
* g++.dg/cilk-plus/CK/pr60586.cc: Likewise.
* g++.dg/cilk-plus/CK/pr66326.cc: Likewise.
* g++.dg/cilk-plus/CK/stl_iter.cc: Likewise.
* g++.dg/cilk-plus/CK/stl_rev_iter.cc: Likewise.
* g++.dg/cilk-plus/CK/stl_test.cc: Likewise.

* c-c++-common/cilk-plus/CK/pr63307.c: Remove dg-do target selector.
* c-c++-common/cilk-plus/SE/ef_error3.c: Likewise.

* c-c++-common/cilk-plus/SE/ef_error2.c: Explain target selector.

* c-c++-common/cilk-plus/CK/test__cilk.c: Run if
cilkplus_runtime.

# HG changeset patch
# Parent  f4df0fe5be5412270363b803d28085c2e12e6017
Simplify Cilk+ testsuite

diff --git a/gcc/doc/sourcebuild.texi b/gcc/doc/sourcebuild.texi
--- a/gcc/doc/sourcebuild.texi
+++ b/gcc/doc/sourcebuild.texi
@@ -1878,6 +1878,9 @@ Target supports wide characters.
 @item automatic_stack_alignment
 Target supports automatic stack alignment.
 
+@item cilkplus_runtime
+Target supports the Cilk Plus runtime library.
+
 @item cxa_atexit
 Target uses @code{__cxa_atexit}.
 
diff --git a/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk-for-2.c b/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk-for-2.c
--- a/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk-for-2.c
+++ b/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk-for-2.c
@@ -1,7 +1,7 @@
-/* { dg-do run { target { i?8

Re: C, C++: New warning for memset without multiply by elt size

2016-04-27 Thread Bernd Schmidt

On 04/26/2016 11:23 PM, Martin Sebor wrote:

The documentation for the new option implies that it should warn
for calls to memset where the third argument contains the number
of elements not multiplied by the element size.  But in my (quick)
testing it only warns when the argument is a constant equal to
the number of elements and less than the size of the array.  For
example, neither of the following is diagnosed:

 int a [4];
 __builtin_memset (a, 0, 2 + 2);
 __builtin_memset (a, 0, 4 * 1);
 __builtin_memset (a, 0, 3);
 __builtin_memset (a, 0, 4 * sizeof a);

If it's possible and not too difficult, it would be nice if
the detection logic could be made a bit smarter to also diagnose
these less trivial cases (and matched the documented behavior).
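
For reference, the kind of mistake under discussion can be demonstrated with a small stand-alone program fragment.  This is just a sketch; `zeroed_elements` and `N_ELTS` are made-up names and have nothing to do with the warning's implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { N_ELTS = 4 };

/* Fill a local array of N_ELTS ints with all-ones values, clear LEN
   bytes of it with memset and return how many elements ended up zero.  */
size_t
zeroed_elements (size_t len)
{
  int a[N_ELTS];
  size_t zeroed = 0;

  assert (len <= sizeof a);

  for (size_t i = 0; i < N_ELTS; i++)
    a[i] = -1;

  memset (a, 0, len);

  for (size_t i = 0; i < N_ELTS; i++)
    if (a[i] == 0)
      zeroed++;

  return zeroed;
}
```

Calling it with N_ELTS * sizeof (int) clears every element, while calling it with just N_ELTS (the element count, not multiplied by the element size) leaves part of the array untouched on any target where int is wider than one byte.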


I've thought about some of these cases. The problem is there are 
legitimate cases of calling memset for only part of an array. I wanted 
to start with something that is unlikely to give false positives.


A multiplication by the wrong sizeof would be a nice thing to spot. 
Would you like to work on followup patches? I probably won't get to it 
in a while.



Even beyond that, I also wonder if the warning could also be
issued when writing any constant number of bytes that is not
a multiple of the element size. This would be useful not just
for memset but also for memcpy.  (The premise being that it's
unusual to want to zero out or copy just a few bytes of any
array element and leave the remaining bytes of that element
unchanged.)
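
The check suggested here boils down to a divisibility test on the constant length.  A minimal sketch, with a hypothetical helper name and none of the machinery a real warning would need:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Return true if NBYTES covers a whole number of elements of size
   ELT_SIZE, i.e. the length would not be flagged by such a check.  */
bool
length_is_whole_elements (size_t nbytes, size_t elt_size)
{
  return elt_size != 0 && nbytes % elt_size == 0;
}
```

With 4 byte elements, a length of 12 passes while a length of 3 would be flagged.  Whether a length of zero should count as suspicious is left open in this sketch.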


Probably a good idea.


I also have a comment on the text and content of the warning:

   memset used with length equal to number of elements without
 multiplication with element size

FWIW, multiplication is typically done "by" a number (not with
one).


Fixed.


Here's an idea
for rewording the diagnostic to include this information:

   warning: memset called to set '3' bytes which is not a positive
 multiple of element size in array 'a' with type 'int[3]'
   note: array 'a' is declared here


I'm finding this too verbose, but I guess that's a matter of taste.


Finally, I would be remiss not to mention that the patch has
an instance of trailing space in it (gasp! ;)


Fixed before committing.


Personally,
I'm not bothered by it but it seems like a good opportunity
to highlight that these things happen even to the most careful
of us, and not necessarily as a result of not being careful or
aware of the coding guidelines.  My point is that no amount of
documentation or diligence will prevent these kinds of
problems, and dwelling on them in code reviews isn't the best
use of our time.


The first part is probably true, but we have code review exactly 
_because_ people make mistakes. Without it, the code base would 
degenerate rapidly. So, well spotted.



Let's put in place a tool that takes care of
these nits for us so we can focus on things a tool can't help
us with!


I don't believe in tools for this. Machines are stupid, and I think the
problem is AI-complete. In this case the check-GNU-style script has two 
other complaints about the patch which a human can tell are wrong. 
Enforcing patches to pass a mechanical check would be a mistake IMO, 
since it would force people into contortions trying to placate an 
unintelligent program, making code quality worse.



Bernd


Re: Enabling -frename-registers?

2016-04-27 Thread Bernd Schmidt

On 04/27/2016 09:11 AM, Eric Botcazou wrote:

@@ -8562,7 +8563,8 @@ debug information format adopted by the
   make debugging impossible, since variables no longer stay in
   a ``home register''.

-Enabled by default with @option{-funroll-loops} and @option{-fpeel-loops}.
+Enabled by default with @option{-funroll-loops} and @option{-fpeel-loops},
+and also enabled at levels @option{-O2} and @option{-O3}.


OPT_LEVELS_2_PLUS includes -Os since it's basically -O2.


So, this?


Bernd

Index: invoke.texi
===
--- invoke.texi (revision 235475)
+++ invoke.texi (working copy)
@@ -8574,7 +8574,7 @@ make debugging impossible, since variabl
 a ``home register''.

 Enabled by default with @option{-funroll-loops} and @option{-fpeel-loops},
-and also enabled at levels @option{-O2} and @option{-O3}.
+and also enabled at levels @option{-O2}, @option{-O3} and @option{-Os}.

 @item -fschedule-fusion
 @opindex fschedule-fusion


Re: [PATCH] PR target/70155: Use SSE for TImode load/store

2016-04-27 Thread Ilya Enkovich
2016-04-26 22:50 GMT+03:00 H.J. Lu :
> On Tue, Apr 26, 2016 at 11:42 AM, H.J. Lu  wrote:
>> On Tue, Apr 26, 2016 at 9:33 AM, H.J. Lu  wrote:
>>> On Tue, Apr 26, 2016 at 9:27 AM, Ilya Enkovich  
>>> wrote:
 2016-04-26 19:20 GMT+03:00 Ilya Enkovich :
> 2016-04-26 19:12 GMT+03:00 H.J. Lu :
>> On Tue, Apr 26, 2016 at 9:07 AM, Ilya Enkovich  
>> wrote:
>>> 2016-04-26 18:39 GMT+03:00 H.J. Lu :
 On Tue, Apr 26, 2016 at 8:21 AM, Ilya Enkovich 
  wrote:
> 2016-04-26 18:12 GMT+03:00 H.J. Lu :
>> On Tue, Apr 26, 2016 at 8:05 AM, Ilya Enkovich 
>>  wrote:
>>> 2016-04-26 17:55 GMT+03:00 H.J. Lu :
 On Tue, Apr 26, 2016 at 7:15 AM, Ilya Enkovich 
  wrote:
> 2016-04-26 17:07 GMT+03:00 H.J. Lu :
>> On Mon, Apr 25, 2016 at 9:13 AM, Ilya Enkovich 
>>  wrote:
>>> 2016-04-25 18:27 GMT+03:00 H.J. Lu :

 Ilya, can you take a look?

 Thanks.

 --
 H.J.
>>>
>>> Hi,
>>>
>>> Algorithmic part of the patch looks OK to me except the 
>>> following piece of code.
>>>
>>> +/* Check REF's chain to add new insns into a queue
>>> +   and find registers requiring conversion.  */
>>>
>>> Comment is wrong because you don't have any conversions 
>>> required for
>>> your candidates.
>>
>> I will fix it.
>>
>>> +
>>> +void
>>> +scalar_chain_64::analyze_register_chain (bitmap candidates, 
>>> df_ref ref)
>>> +{
>>> +  df_link *chain;
>>> +
>>> +  gcc_assert (bitmap_bit_p (insns, DF_REF_INSN_UID (ref))
>>> + || bitmap_bit_p (candidates, DF_REF_INSN_UID 
>>> (ref)));
>>> +  add_to_queue (DF_REF_INSN_UID (ref));
>>> +
>>> +  for (chain = DF_REF_CHAIN (ref); chain; chain = chain->next)
>>> +{
>>> +  unsigned uid = DF_REF_INSN_UID (chain->ref);
>>> +
>>> +  if (!NONDEBUG_INSN_P (DF_REF_INSN (chain->ref)))
>>> +   continue;
>>> +
>>> +  if (!DF_REF_REG_MEM_P (chain->ref))
>>> +   continue;
>>>
>>> I believe here you wrongly jump to the next ref intead of 
>>> actually adding it
>>> to a queue.  You may just use
>>>
>>> gcc_assert (!DF_REF_REG_MEM_P (chain->ref));
>>>
>>> because you should'n have a candidate used in address operand.
>>
>> I will update.
>>
>>> +
>>> +  if (bitmap_bit_p (insns, uid))
>>> +   continue;
>>> +
>>> +  if (bitmap_bit_p (candidates, uid))
>>> +   add_to_queue (uid);
>>>
>>> Probably gcc_assert (bitmap_bit_p (candidates, uid)) since no 
>>> uses and defs
>>> out of candidates list are allowed?
>>
>> That would be wrong since there are
>>
>>  while (!bitmap_empty_p (queue))
>> {
>>   insn_uid = bitmap_first_set_bit (queue);
>>   bitmap_clear_bit (queue, insn_uid);
>>   bitmap_clear_bit (candidates, insn_uid);
>>   add_insn (candidates, insn_uid);
>> }
>>
>> An instruction is a candidate and the bit is cleared when
>> analyze_register_chain is called.
>
> You clear candidates bit but the first thing you do in add_insn 
> is set
> insns bit.
> Thus you should hit:
>
>   if (bitmap_bit_p (insns, uid))
> continue;
>
> For handled candidates.
>
> Probably it would be more clear if we keep this clear/set pair
> together?  E.g. move
> bitmap_clear_bit (candidates, insn_uid) to scalar_chain::add_insn.
>

 After we started processing candidates, we only use candidates
 to check if an instruction is a candidate, not to check if an
 instruction is NOT a candidate.
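
The worklist being discussed in this exchange can be sketched in plain C as
follows.  This is a toy model under loud assumptions, not the i386 code:
plain 32-bit masks stand in for GCC bitmaps, the links[] array stands in for
the DF chains, and every name (chain_state, add_insn, build_chain) is
hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define N_INSNS 8

/* Toy state: one bit per instruction UID in each set.  */
struct chain_state
{
  uint32_t candidates;	/* Not yet processed candidates.  */
  uint32_t insns;	/* Instructions already in the chain.  */
  uint32_t queue;	/* Instructions waiting to be added.  */
};

/* Add UID to the chain and queue its linked candidates.  Bit J of
   LINKS[UID] means UID is linked to instruction J.  */
static void
add_insn (struct chain_state *s, const uint32_t *links, unsigned uid)
{
  s->insns |= 1u << uid;
  for (unsigned other = 0; other < N_INSNS; other++)
    {
      if (!(links[uid] & (1u << other)))
	continue;
      /* Already handled: this is the check hit by processed candidates.  */
      if (s->insns & (1u << other))
	continue;
      if (s->candidates & (1u << other))
	s->queue |= 1u << other;
    }
}

/* Build the chain reachable from START; return the set of its insns.  */
static uint32_t
build_chain (uint32_t candidates, const uint32_t *links, unsigned start)
{
  assert (start < N_INSNS);
  struct chain_state s = { candidates, 0, 1u << start };

  while (s.queue)
    {
      unsigned uid = 0;
      while (!(s.queue & (1u << uid)))
	uid++;
      s.queue &= ~(1u << uid);
      s.candidates &= ~(1u << uid);
      add_insn (&s, links, uid);
    }
  return s.insns;
}
```

In this model an instruction that has been popped from the queue is no longer
in candidates but is in insns, so a later visit skips it at the insns check
before the candidates test is reached.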
>>>
>>> I don't see how it's related to what I said.  My point is that
>>> when you analyze added insn you shouldn't reach insns which are both
>>> not in candidates and not in current scalar_chain_64.  That's why I
>>> think you miss an assert in scalar_chain_64::analyze_register_chain.
>>
>> Since all candidates will be processed by

[PATCH] Fix PR70785

2016-04-27 Thread Richard Biener

The following fixes LTO bootstrap with IPA PTA enabled (the only
useful IPA PTA test we have).  The issue was that the cgraph
node with the body may not have used_from_other_partition set
but only one of its aliases (genrecog was miscompiled and the
function was a constructor and one of its aliases generated by
the C++ FE).
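
The idea of the fix can be sketched outside of GCC as follows.  This is a
simplified model, not the cgraph API: toy_node, toy_call_for_node_and_aliases
and the other names are assumptions made up for the illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy symbol node: the node with the body heads a list of its aliases.  */
struct toy_node
{
  bool used_from_other_partition;
  bool externally_visible;
  bool force_output;
  struct toy_node *next_alias;
};

typedef bool (*toy_callback) (struct toy_node *, void *);

/* Call CB on NODE and on each alias; stop early if CB returns true.  */
static bool
toy_call_for_node_and_aliases (struct toy_node *node, toy_callback cb,
			       void *data)
{
  for (struct toy_node *n = node; n != NULL; n = n->next_alias)
    if (cb (n, data))
      return true;
  return false;
}

/* Worker: accumulate the "referred to non-locally" flags into DATA.  */
static bool
toy_accumulate_nonlocal (struct toy_node *node, void *data)
{
  bool *nonlocal_p = data;
  *nonlocal_p |= (node->used_from_other_partition
		  || node->externally_visible
		  || node->force_output);
  return false;
}

/* True if NODE or any of its aliases is referred to non-locally.  */
static bool
toy_nonlocal_p (struct toy_node *node)
{
  bool nonlocal_p = false;
  assert (node != NULL);
  toy_call_for_node_and_aliases (node, toy_accumulate_nonlocal, &nonlocal_p);
  return nonlocal_p;
}
```

Looking only at the node with the body would report it as local even though
one of its aliases is used from another partition, which matches the failure
mode described above.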

LTO bootstrapped with IPA PTA enabled on x86_64-unknown-linux-gnu,
testing in progress.

Richard.

2016-04-27  Richard Biener  

PR ipa/70785
* tree-ssa-structalias.c (refered_from_nonlocal_fn): New
function accumulating used_from_other_partition, externally_visible
and force_output from aliases.
(refered_from_nonlocal_var): Likewise.
(ipa_pta_execute): Use call_for_symbol_and_aliases to accumulate
node flags properly.

Index: gcc/tree-ssa-structalias.c
===
--- gcc/tree-ssa-structalias.c  (revision 235443)
+++ gcc/tree-ssa-structalias.c  (working copy)
@@ -7486,7 +7486,7 @@ struct pt_solution ipa_escaped_pt
   = { true, false, false, false, false, false, false, false, NULL };
 
 /* Associate node with varinfo DATA. Worker for
-   cgraph_for_node_and_aliases.  */
+   cgraph_for_symbol_thunks_and_aliases.  */
 static bool
 associate_varinfo_to_alias (struct cgraph_node *node, void *data)
 {
@@ -7496,6 +7496,29 @@ associate_varinfo_to_alias (struct cgrap
   return false;
 }
 
+/* Compute whether node is refered to non-locally.  Worker for
+   cgraph_for_symbol_thunks_and_aliases.  */
+static bool
+refered_from_nonlocal_fn (struct cgraph_node *node, void *data)
+{
+  bool *nonlocal_p = (bool *)data;
+  *nonlocal_p |= (node->used_from_other_partition
+ || node->externally_visible
+ || node->force_output);
+  return false;
+}
+
+/* Same for varpool nodes.  */
+static bool
+refered_from_nonlocal_var (struct varpool_node *node, void *data)
+{
+  bool *nonlocal_p = (bool *)data;
+  *nonlocal_p |= (node->used_from_other_partition
+ || node->externally_visible
+ || node->force_output);
+  return false;
+}
+
 /* Execute the driver for IPA PTA.  */
 static unsigned int
 ipa_pta_execute (void)
@@ -7559,6 +7582,8 @@ ipa_pta_execute (void)
 || node->externally_visible
 || node->force_output
 || node_address_taken);
+  node->call_for_symbol_thunks_and_aliases (refered_from_nonlocal_fn,
+   &nonlocal_p, true);
 
   vi = create_function_info_for (node->decl,
 alias_get_name (node->decl), false,
@@ -7596,6 +7621,8 @@ ipa_pta_execute (void)
   bool nonlocal_p = (var->used_from_other_partition
 || var->externally_visible
 || var->force_output);
+  var->call_for_symbol_and_aliases (refered_from_nonlocal_var,
+   &nonlocal_p, true);
   if (nonlocal_p)
vi->is_ipa_escape_point = true;
 }


Re: [PATCH][AArch64] Improve aarch64_case_values_threshold setting

2016-04-27 Thread Kyrill Tkachov

Hi Wilco,

On 25/04/16 20:21, Wilco Dijkstra wrote:

Evandro Menezes wrote:

I assume that you mean that such improvements are true for
-mcpu=generic, yes?  On which target, A53 or A57 or other?

It's true for any CPU setting. The SPEC results are for Cortex-A57;
however, I wrote a microbenchmark that shows improvements on
all targets I have access to. The GCC switch expansion is awful, so
even with a good indirect predictor it is better to use conditional
branches.


In what way is it awful? If there's something we can do better,
can you file a bug report with a testcase so that we can work on
improving it rather than tweaking a heuristic in the backend?
(Not that I'm against your patch ;)
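
For readers outside the thread, the two shapes being traded off can be
written out by hand in C.  This is only a sketch of the two dispatch styles,
assuming four dense case values; it is not what GCC emits for any particular
switch, and all names are hypothetical.

```c
#include <assert.h>

/* Shape 1: a chain of compare-and-branch tests.  */
static int
classify_branches (int x)
{
  if (x == 0)
    return 10;
  if (x == 1)
    return 20;
  if (x == 2)
    return 30;
  if (x == 3)
    return 40;
  return -1;
}

static int handle0 (void) { return 10; }
static int handle1 (void) { return 20; }
static int handle2 (void) { return 30; }
static int handle3 (void) { return 40; }

/* Shape 2: a bounds check followed by an indirect transfer through a
   table, similar in spirit to a jump table.  */
static int
classify_table (int x)
{
  static int (*const handlers[]) (void)
    = { handle0, handle1, handle2, handle3 };

  if (x < 0 || x >= 4)
    return -1;
  return handlers[x] ();
}
```

Both functions compute the same result; the case-values threshold decides
from how many cases onwards the table form is preferred.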

Cheers,
Kyrill


Wilco








Re: [PATCH] Fix type field walking in gimplifier unsharing

2016-04-27 Thread Richard Biener
On Wed, 27 Apr 2016, Eric Botcazou wrote:

> > Gimplification is done in-place and thus relies on all processed
> > trees being unshared.  This is achieved by unshare_body which
> > in the end uses walk_tree to get at all interesting trees that possibly
> > need unsharing.
> > 
> > Unfortunately it doesn't really work because walk_tree only walks
> > types and type-related fields (TYPE_SIZE, TYPE_MIN_VALUE, etc.) in
> > very narrow circumstances.
> 
> Right, but well defined and explained:
> 
> case DECL_EXPR:
>   /* If this is a TYPE_DECL, walk into the fields of the type that it's
>defining.  We only want to walk into these fields of a type in this
>case and not in the general case of a mere reference to the type.
> 
>The criterion is as follows: if the field can be an expression, it
>must be walked only here.  This should be in keeping with the fields
>that are directly gimplified in gimplify_type_sizes in order for the
>mark/copy-if-shared/unmark machinery of the gimplifier to work with
>variable-sized types.
> 
>Note that DECLs get walked as part of processing the BIND_EXPR.  */
> 
> > Thus the following patch which makes the gimplifier unsharing
> > visit all types.
> 
> I think this will generate a lot of useless walking in Ada...
> 
> > So - any opinion on the "correct" way to fix this?
> 
> Add DECL_EXPRs for the types, that's what done in Ada.

Aww, I was hoping for something that would not require me to fix all
frontends ...

It seems the C frontend does it correctly already - I hit the
ubsan issue for c-c++-common/ubsan/pr59667.c and only for the C++ FE
for example.  Notice how only the pointed-to type is variable-size
here.

C produces

{
  unsigned int len = 1;
  typedef float [0:(sizetype) ((long int) SAVE_EXPR  + 
-1)][0:(sizetype) ((long int) SAVE_EXPR  + -1)];
  float[0:(sizetype) ((long int) SAVE_EXPR  + -1)][0:(sizetype) 
((long int) SAVE_EXPR  + -1)] * P = 0B;

unsigned int len = 1;
typedef float [0:(sizetype) ((long int) SAVE_EXPR  + 
-1)][0:(sizetype) ((long int) SAVE_EXPR  + -1)];
  SAVE_EXPR ;, (void) SAVE_EXPR ;;
float[0:(sizetype) ((long int) SAVE_EXPR  + -1)][0:(sizetype) 
((long int) SAVE_EXPR  + -1)] * P = 0B;
  (*P)[0][0] = 1.0e+0;
  return 0;
}

the decl-expr is the 'typedef' line.  The C++ FE produces

{
  unsigned int len = 1;
  float[0:(sizetype) (SAVE_EXPR <(ssizetype) len + -1>)][0:(sizetype) 
(SAVE_EXPR <(ssizetype) len + -1>)] * P = 0B;

  <>;
  <) + 
1) * (bitsizetype) ((sizetype) (SAVE_EXPR <(ssizetype) len + -1>) + 1)) * 
32) >;
  <)][0:(sizetype) (SAVE_EXPR <(ssizetype) len + -1>)] * P = 0B;>>;
  <;
  return  = 0;
}

notice the lack of a decl-expr here.  It has some weird expr_stmt
here covering the sizes though. Possibly because VLA arrays are a GNU 
extension.

Didn't look into the fortran FE issue but I expect it's similar
(it only occurs for pointers to VLAs as well).
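
As a reminder of the construct involved, a pointer whose pointed-to type is
variable-size looks roughly like this in C.  This is a sketch and not the
pr59667.c testcase: it allocates real storage, and the function name is made
up.

```c
#include <assert.h>
#include <stdlib.h>

/* Allocate a LEN x LEN matrix through a pointer to a VLA type, store
   VALUE in the last element and read it back.  Only the pointed-to
   type is variable-size; the pointer itself has a fixed size.  */
static float
vla_pointer_roundtrip (unsigned int len, float value)
{
  assert (len > 0);
  float (*p)[len][len] = malloc (sizeof (float[len][len]));
  assert (p != NULL);
  (*p)[len - 1][len - 1] = value;
  float result = (*p)[len - 1][len - 1];
  free (p);
  return result;
}
```

The size expression of float[len][len] is what has to be evaluated and
unshared correctly even though no object of the array type is declared.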

I'll try to come up with patches.

Thanks for the hint,
Richard.


Re: [PATCH][AArch64][wwwdocs] Summarise some more AArch64 changes for GCC6

2016-04-27 Thread Kyrill Tkachov


On 25/04/16 02:43, Sandra Loosemore wrote:

On 04/22/2016 03:57 AM, James Greenhalgh wrote:

On Thu, Apr 21, 2016 at 09:15:17AM +0100, Kyrill Tkachov wrote:

Hi all,

Here's a proposed summary of the changes in the AArch64 backend for GCC 6.
If there's anything I've missed it's purely my oversight, feel free to add
entries or suggest improvements.


For me, I'm mostly happy with the wording below (I've tried to be
helpful inline). But I'm not as conscientious at checking grammar as others
in the community. So this is OK from an AArch64 target perspective with
the changes below, but wait a short while to give Gerald or Sandra a chance
to comment.


I haven't done a careful review of the whole section of existing text, but I 
did notice a few things in text not being touched by this patch:


+ 
 The new command line options -march=native,


s/command line options/command-line options/


-mcpu=native and -mtune=native are now
 available on native AArch64 GNU/Linux systems. Specifying
 these options will cause GCC to auto-detect the host CPU and


s/will cause/causes/


 rewrite these options to the optimal setting for that system.


s/rewrite these options to the optimal/choose the/


-   -fpic is now supported by the AArch64 target when 
generating
+   -fpic is now supported when generating
 code for the small code model (-mcmodel=small).  The size 
of
 the global offset table (GOT) is limited to 28KiB under the LP64 SysV 
ABI
 , and 15KiB under the ILP32 SysV ABI.


Move the comma directly after "ABI", not separated by newline and whitespace.



Thanks, I've incorporated your and James' feedback.
Since James ok'd the content of the patch from an AArch64 perspective
I'll commit this later today if I receive no further feedback.

Thanks,
Kyrill


-Sandra



Index: htdocs/gcc-6/changes.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-6/changes.html,v
retrieving revision 1.73
diff -U 3 -r1.73 changes.html
--- htdocs/gcc-6/changes.html	7 Apr 2016 09:38:31 -	1.73
+++ htdocs/gcc-6/changes.html	25 Apr 2016 09:10:25 -
@@ -328,29 +328,90 @@
 AArch64

  
-   The new command line options -march=native,
+   A number of AArch64-specific options have been added.  The most
+   important ones are summarised in this section but for usage
+   instructions please refer to the documentation.
+ 
+ 
+   The new command-line options -march=native,
-mcpu=native and -mtune=native are now
available on native AArch64 GNU/Linux systems.  Specifying
-   these options will cause GCC to auto-detect the host CPU and
-   rewrite these options to the optimal setting for that system.
-   If GCC is unable to detect the host CPU these options have no effect.
+   these options causes GCC to auto-detect the host CPU and
+   choose the optimal setting for that system.
  
  
-   -fpic is now supported by the AArch64 target when generating
+   -fpic is now supported when generating
code for the small code model (-mcmodel=small).  The size of
-   the global offset table (GOT) is limited to 28KiB under the LP64 SysV ABI
-   , and 15KiB under the ILP32 SysV ABI.
+   the global offset table (GOT) is limited to 28KiB under the LP64
+   SysV ABI, and 15KiB under the ILP32 SysV ABI.
  
  
-   The AArch64 port now supports target attributes and pragmas.  Please
-   refer to the https://gcc.gnu.org/onlinedocs/gcc/AArch64-Function-Attributes.html#AArch64-Function-Attributes";>
-   documentation for details of available attributes and
+   Target attributes and pragmas are now supported.  Please
+   refer to the documentation for details of available attributes and
pragmas as well as usage instructions.
  
  
Link-time optimization across translation units with different
target-specific options is now supported.
  
+ 
+   The option -mtls-size= is now supported.  It can be used to
+   specify the bit size of TLS offsets, allowing GCC to generate
+   better TLS instruction sequences.
+ 
+ 
+   The option -fno-plt is now fully functional.
+ 
+ 
+   The ARMv8.1-A architecture and the Large System Extensions are now
+   supported.  They can be used by specifying the
+   -march=armv8.1-a option.  Additionally, the
+   +lse option extension can be used in a similar fashion
+   to other option extensions.
+   The Large System Extensions introduce new instructions that are used
+   in the implementation of atomic operations.
+ 
+ 
+   The ACLE half-precision floating-point type __fp16 is now
+   supported in the C and C++ languages.
+ 
+ 
+   The ARM Cortex-A35 processor is now supported via the
+   -mcpu=cortex-a35 and -mtune=cortex-a35
+   options as well as 

Re: Enabling -frename-registers?

2016-04-27 Thread Eric Botcazou
> Index: invoke.texi
> ===
> --- invoke.texi   (revision 235475)
> +++ invoke.texi   (working copy)
> @@ -8574,7 +8574,7 @@ make debugging impossible, since variabl
>   a ``home register''.
> 
>   Enabled by default with @option{-funroll-loops} and @option{-fpeel-loops},
> -and also enabled at levels @option{-O2} and @option{-O3}.
> +and also enabled at levels @option{-O2}, @option{-O3} and @option{-Os}.
> 
>   @item -fschedule-fusion
>   @opindex fschedule-fusion

Yes, this looks good to me (unless -frename-registers is specifically disabled 
at -Os in some other way, I didn't look into the details).

-- 
Eric Botcazou


[Ada] Small cleanup in gigi

2016-04-27 Thread Eric Botcazou
Tested on x86_64-suse-linux, applied on the mainline.


2016-04-27  Eric Botcazou  

* gcc-interface/gigi.h (gnat_to_gnu_entity): Adjust prototype.
(maybe_pad_type): Adjust comment.
(finish_record_type): Likewise.
(rest_of_record_type_compilation): Likewise.
* gcc-interface/decl.c (gnat_to_gnu_entity): Change DEFINITION type
parameter from integer to boolean.  Adjust recursive calls.
: Use copy_type and remove redundant assignments.
:  Adjust comment.  Remove call to
rest_of_record_type_compilation.  Set TYPE_PADDING_P flag earlier.
Pass false to finish_record_type.  Set the debug type later.
: Remove call to rest_of_record_type_compilation.
(gnat_to_gnu_component_type): Fix formatting.
(gnat_to_gnu_field_decl): Adjust call to gnat_to_gnu_entity.
(gnat_to_gnu_type): Likewise.
* gcc-interface/trans.c (Identifier_to_gnu): Likewise.
(Loop_Statement_to_gnu): Likewise.
(Subprogram_Body_to_gnu): Likewise.
(Exception_Handler_to_gnu_fe_sjlj): Likewise.
(Exception_Handler_to_gnu_gcc): Likewise.
(Compilation_Unit_to_gnu): Likewise.
(gnat_to_gnu): Likewise.
(push_exception_label_stack): Likewise.
(elaborate_all_entities_for_package): Likewise.
(process_freeze_entity): Likewise.
(process_decls): Likewise.
(process_type): Likewise.
* gcc-interface/utils.c (struct deferred_decl_context_node): Tweak.
(maybe_pad_type): Adjust comments.  Set the debug type later.  Remove
call to rest_of_record_type_compilation.
(rest_of_record_type_compilation): Use copy_type.
(copy_type): Use correctly typed constants.
(gnat_signed_or_unsigned_type_for): Use copy_type.
* gcc-interface/utils2.c (nonbinary_modular_operation): Likewise.
(build_goto_raise): Adjust call to gnat_to_gnu_entity.

-- 
Eric Botcazou
Index: gcc-interface/decl.c
===
--- gcc-interface/decl.c	(revision 235394)
+++ gcc-interface/decl.c	(working copy)
@@ -217,15 +217,13 @@ static bool intrin_profiles_compatible_p
initial value (in GCC tree form).  This is optional for a variable.  For
a renamed entity, GNU_EXPR gives the object being renamed.
 
-   DEFINITION is nonzero if this call is intended for a definition.  This is
-   used for separate compilation where it is necessary to know whether an
-   external declaration or a definition must be created if the GCC equivalent
-   was not created previously.  The value of 1 is normally used for a nonzero
-   DEFINITION, but a value of 2 is used in special circumstances, defined in
-   the code.  */
+   DEFINITION is true if this call is intended for a definition.  This is used
+   for separate compilation where it is necessary to know whether an external
+   declaration or a definition must be created if the GCC equivalent was not
+   created previously.  */
 
 tree
-gnat_to_gnu_entity (Entity_Id gnat_entity, tree gnu_expr, int definition)
+gnat_to_gnu_entity (Entity_Id gnat_entity, tree gnu_expr, bool definition)
 {
   /* Contains the kind of the input GNAT node.  */
   const Entity_Kind kind = Ekind (gnat_entity);
@@ -306,7 +304,7 @@ gnat_to_gnu_entity (Entity_Id gnat_entit
 	  || (IN (Ekind (gnat_temp), Subprogram_Kind)
 		  && present_gnu_tree (gnat_temp)
 		  && (current_function_decl
-		  == gnat_to_gnu_entity (gnat_temp, NULL_TREE, 0
+		  == gnat_to_gnu_entity (gnat_temp, NULL_TREE, false
 	{
 	  process_type (gnat_entity);
 	  return get_gnu_tree (gnat_entity);
@@ -337,7 +335,7 @@ gnat_to_gnu_entity (Entity_Id gnat_entit
 	  || No (Freeze_Node (Full_View (gnat_entity)
 	{
 	  gnu_decl
-	= gnat_to_gnu_entity (Full_View (gnat_entity), NULL_TREE, 0);
+	= gnat_to_gnu_entity (Full_View (gnat_entity), NULL_TREE, false);
 	  save_gnu_tree (gnat_entity, NULL_TREE, false);
 	  save_gnu_tree (gnat_entity, gnu_decl, false);
 	}
@@ -485,12 +483,12 @@ gnat_to_gnu_entity (Entity_Id gnat_entit
 		gnu_decl
 		  = gnat_to_gnu_entity (Original_Record_Component
 	(gnat_entity),
-	gnu_expr, 0);
+	gnu_expr, false);
 		saved = true;
 		break;
 	  }
 
-	gnat_to_gnu_entity (Scope (gnat_entity), NULL_TREE, 0);
+	gnat_to_gnu_entity (Scope (gnat_entity), NULL_TREE, false);
 	gnu_decl = get_gnu_tree (gnat_entity);
 	saved = true;
 	break;
@@ -537,7 +535,7 @@ gnat_to_gnu_entity (Entity_Id gnat_entit
 	  && Present (Full_View (gnat_entity)))
 	{
 	  gnu_decl
-	= gnat_to_gnu_entity (Full_View (gnat_entity), gnu_expr, 0);
+	= gnat_to_gnu_entity (Full_View (gnat_entity), gnu_expr, false);
 	  saved = true;
 	  break;
 	}
@@ -598,7 +596,7 @@ gnat_to_gnu_entity (Entity_Id gnat_entit
 	  {
 	if (kind == E_Exception)
 	  gnu_expr = gnat_to_gnu_entity (Renamed_Entity (gnat_entity),
-	 NULL_TREE, 0);
+	 NUL

[Ada] Expression functions need not trigger loading of package body

2016-04-27 Thread Arnaud Charlet
The expression functions introduced in Ada 2012 implicitly come with the
Inline aspect in GNAT.  And, for inter-unit inlining, they were handled by
the inlining machinery as any other inlined subprograms, which means that
they were causing the package body (if any) to be loaded and analyzed.

That's both unnecessary and inefficient, so this patch corrects it as well
as streamlines the implementation of the main entry point for inlining.

The change can be exhibited by the now quiet compilation of Client in:

with P;

procedure Client is

   X : Boolean := P.Foo;

begin
   null;
end Client;
package P is

   function Foo return Boolean is (True);

   procedure Other;

end P;
package body P is

   procedure Other is
   begin
  Unrelated;
   end Other;

end P;

Tested on x86_64-pc-linux-gnu, committed on trunk

2016-04-27  Eric Botcazou  

* inline.adb (Add_Inlined_Body): Overhaul implementation,
robustify handling of -gnatn1, add special treatment for
expression functions.

Index: inline.adb
===
--- inline.adb  (revision 235481)
+++ inline.adb  (working copy)
@@ -6,7 +6,7 @@
 --  --
 -- B o d y  --
 --  --
---  Copyright (C) 1992-2015, Free Software Foundation, Inc. --
+--  Copyright (C) 1992-2016, Free Software Foundation, Inc. --
 --  --
 -- GNAT is free software;  you can  redistribute it  and/or modify it under --
 -- terms of the  GNU General Public License as published  by the Free Soft- --
@@ -390,6 +390,40 @@
  return;
   end if;
 
+  --  Find out whether the call must be inlined. Unless the result is
+  --  Dont_Inline, Must_Inline also creates an edge for the call in the
+  --  callgraph; however, it will not be activated until after Is_Called
+  --  is set on the subprogram.
+
+  Level := Must_Inline;
+
+  if Level = Dont_Inline then
+ return;
+  end if;
+
+  --  If the call was generated by the compiler and is to a subprogram in
+  --  a run-time unit, we need to suppress debugging information for it,
+  --  so that the code that is eventually inlined will not affect the
+  --  debugging of the program. We do not do it if the call comes from
+  --  source because, even if the call is inlined, the user may expect it
+  --  to be present in the debugging information.
+
+  if not Comes_From_Source (N)
+and then In_Extended_Main_Source_Unit (N)
+and then
+  Is_Predefined_File_Name (Unit_File_Name (Get_Source_Unit (E)))
+  then
+ Set_Needs_Debug_Info (E, False);
+  end if;
+
+  --  If the subprogram is an expression function, then there is no need to
+  --  load any package body since the body of the function is in the spec.
+
+  if Is_Expression_Function (E) then
+ Set_Is_Called (E);
+ return;
+  end if;
+
   --  Find unit containing E, and add to list of inlined bodies if needed.
   --  If the body is already present, no need to load any other unit. This
   --  is the case for an initialization procedure, which appears in the
@@ -403,77 +437,48 @@
   --  no enclosing package to retrieve. In this case, it is the body of
   --  the function that will have to be loaded.
 
-  Level := Must_Inline;
+  declare
+ Pack : constant Entity_Id := Get_Code_Unit_Entity (E);
 
-  if Level /= Dont_Inline then
- declare
-Pack : constant Entity_Id := Get_Code_Unit_Entity (E);
+  begin
+ if Pack = E then
+Set_Is_Called (E);
+Inlined_Bodies.Increment_Last;
+Inlined_Bodies.Table (Inlined_Bodies.Last) := E;
 
- begin
---  Ensure that Analyze_Inlined_Bodies will be invoked after
---  completing the analysis of the current unit.
+ elsif Ekind (Pack) = E_Package then
+Set_Is_Called (E);
 
-Inline_Processing_Required := True;
+if Is_Generic_Instance (Pack) then
+   null;
 
-if Pack = E then
+--  Do not inline the package if the subprogram is an init proc
+--  or other internally generated subprogram, because in that
+--  case the subprogram body appears in the same unit that
+--  declares the type, and that body is visible to the back end.
+--  Do not inline it either if it is in the main unit.
+--  Extend the -gnatn2 processing to -gnatn1 for Inline_Always
+--  calls if the back-end takes care of inlining the call.
 
-   --  Library-level inlined function. Add function itself to
- 

[Ada] Reimplementation of interfacing aspects

2016-04-27 Thread Arnaud Charlet
This patch reimplements the handling of Convention, Export, External_Name,
Import, and Link_Name to generate the proper corresponding pragma depending on
which of these aspects are present.

As a result, an exported or imported subprogram with preconditions and/or
postconditions will not cause a crash when the compiler is building the
interfacing wrapper tasked with verifying the assumptions.


-- Source --


--  sorters.ads

pragma SPARK_Mode (On);

package Sorters is
   type Array_Type is array (Positive range <>) of Integer;

   function Perm (A : in Array_Type;
  B : in Array_Type) return Boolean
 with Global => null,
  Ghost  => True,
  Import => True;

   procedure Selection_Sort (Values : in out Array_Type)
 with Depends => (Values => Values),
  Pre => Values'Length >= 1 and then
 Values'Last   <= Positive'Last,
  Post=> (for all J in Values'First .. Values'Last - 1 =>
Values (J) <= Values (J + 1))  and then
  Perm (Values'Old, Values);
end Sorters;

--  sorters.adb

pragma SPARK_Mode (On);

package body Sorters is
   function Perm_Transitive (A, B, C : Array_Type) return Boolean
 with Global => null,
  Post   => (if Perm_Transitive'Result
and then Perm (A, B)
and then Perm (B, C)
 then Perm (A, C)),
  Ghost   => True,
  Import  => True;

   procedure Swap (Values : in out Array_Type;
   X  : in Positive;
   Y  : in Positive)
 with Depends => (Values => (Values, X, Y)),
  Pre => (X in Values'Range and then
  Y in Values'Range and then
  X /= Y),
  Post => Perm (Values'Old, Values)and then
  (Values (X) = Values'Old (Y) and then
   Values (Y) = Values'Old (X) and then
   (for all J in Values'Range =>
  (if J /= X and J /= Y then Values (J) = Values'Old (J
   is
  Values_Old : constant Array_Type := Values
with Ghost => True;
  Temp : Integer;
   begin
  Temp   := Values (X);
  Values (X) := Values (Y);
  Values (Y) := Temp;
  pragma Assume (Perm (Values_Old, Values));
   end Swap;

   function Index_Of_Minimum (Unsorted : in Array_Type) return Positive
 with Pre  => Unsorted'First <= Unsorted'Last,
  Post => Index_Of_Minimum'Result in Unsorted'Range and then
 (for all J in Unsorted'Range =>
  Unsorted (Index_Of_Minimum'Result) <= Unsorted (J))
   is
  Min : Positive;
   begin
  Min := Unsorted'First;
  for Index in Unsorted'First .. Unsorted'Last loop
 pragma Loop_Invariant
   (Min in Unsorted'Range and then
   (for all J in Unsorted'First .. Index - 1 =>
  Unsorted (Min) <= Unsorted (J)));

 if Unsorted (Index) < Unsorted (Min) then
Min := Index;
 end if;
  end loop;
  return Min;
   end Index_Of_Minimum;

   procedure Selection_Sort (Values : in out Array_Type) is
  Values_Last : Array_Type (Values'Range)
with Ghost => True;
  Smallest : Positive;
   begin
  pragma Assume (Perm (Values, Values));
  for Current in Values'First .. Values'Last - 1 loop
 Values_Last := Values;
 Smallest := Index_Of_Minimum (Values (Current .. Values'Last));

 if Smallest /= Current then
Swap (Values => Values,
  X  => Current,
  Y  => Smallest);
 end if;

 pragma Assume
   (Perm_Transitive (Values'Loop_Entry, Values_Last, Values));

 pragma Loop_Invariant (Perm (Values'Loop_Entry, Values));
 pragma Loop_Invariant   
   ((for all J in Current .. Values'Last =>
 Values (Current) <= Values (J)));
 pragma Loop_Invariant  
   ((for all J in Values'First .. Current =>
   Values (J) <= Values (J + 1)));
  end loop;
   end Selection_Sort;
end Sorters;

-- Compilation --

$ gcc -c -gnata sorters.adb

Tested on x86_64-pc-linux-gnu, committed on trunk

2016-04-27  Hristian Kirtchev  

* aspects.ads: Aspects Export and Import do not require delay. They
were classified as delayed aspects, but treated as non-delayed
by the analysis of aspects.
* freeze.adb (Copy_Import_Pragma): New routine.
(Wrap_Imported_Subprogram): Copy the import pragma by first
resetting all semantic fields to avoid an infinite loop when
performing the copy.
* sem_ch13.adb (Analyze_Aspects_At_Freeze_Point): Add
comment on the processing of aspects Export and Import
at the freeze point.
(Analyze_Aspect_Convention): New routine.
(Analyze_Aspect_Export_Import)

Increase default value of lto-min-partition to 10000

2016-04-27 Thread Prathamesh Kulkarni
Hi,
As discussed in the other thread, this patch increases the default value of
lto-min-partition to 10000.  OK to commit if bootstrap+testing passes?

Thanks,
Prathamesh
Index: gcc/params.def
===
--- gcc/params.def  (revision 235478)
+++ gcc/params.def  (working copy)
@@ -1027,7 +1027,7 @@
 DEFPARAM (MIN_PARTITION_SIZE,
  "lto-min-partition",
  "Minimal size of a partition for LTO (in estimated instructions).",
- 1000, 0, 0)
+ 10000, 0, 0)
 
 DEFPARAM (MAX_PARTITION_SIZE,
  "lto-max-partition",




[PATCH] Fix PR70760

2016-04-27 Thread Richard Biener

The following patch fixes an issue in IPA PTA regarding the handling
of DECL_BY_REFERENCE function results at the caller side.  The issue
for the testcase in the PR is that we use the wrong function decl
to look for DECL_RESULT for calls that are an alias (which get
DECL_RESULT released).

But the issue is deeper in that the code also does not handle
indirect calls correctly - to expose a testcase for this the
patch also enables optimistic handling of functions escaping
via their addresses, this is already handled fine after I added
code to parse global initializers correctly.

LTO bootstrapped and tested on x86_64-unknown-linux-gnu with IPA PTA 
enabled, inspected PTA result on the PR's testcase (I failed to create a 
small reproducer).

Bootstrap / regtest running on x86_64-unknown-linux-gnu.

This is the trunk version of the fix; for the branch the
issue was reported against I will refrain from handling address-taken
functions differently.

I hope I deciphered enough of the calls handling to assess that
aggregate_value_p always matches DECL_BY_REFERENCE on DECL_RESULT.
IPA PTA needs to know the GIMPLE representation of the callee's
DECL_RESULT (whether it's a pointer - at the caller side we
still see the non-reference LHS).  And that needs to work for
indirect calls as well.
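
The caller/callee mismatch can be pictured with a hand-written C analogy.
This is only a sketch under assumptions: the struct and function names are
invented, and the second function merely imitates what a by-reference return
looks like after lowering.

```c
#include <assert.h>
#include <stddef.h>

struct big
{
  int v[16];
};

/* Source-level view: the aggregate is returned by value, and the
   caller sees a plain non-reference left-hand side.  */
static struct big
make_big (int seed)
{
  struct big b;
  for (int i = 0; i < 16; i++)
    b.v[i] = seed + i;
  return b;
}

/* Lowered view (imitation): the result is written through a pointer
   to the caller's destination, so the callee's result is a pointer.  */
static void
make_big_by_ref (struct big *result, int seed)
{
  assert (result != NULL);
  for (int i = 0; i < 16; i++)
    result->v[i] = seed + i;
}
```

In the second form the address of the caller's left-hand side flows into the
callee, which is why the constraints for the call have to take the address of
the LHS when the result is returned by reference.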

Richard.

2016-04-27  Richard Biener  

PR ipa/70760
* tree-ssa-structalias.c (find_func_aliases_for_call): Use
aggregate_value_p to determine if a function result is
returned by reference.
(ipa_pta_execute): Functions having their address taken are
not automatically nonlocal.

* g++.dg/ipa/ipa-pta-2.C: New testcase.
* gcc.dg/ipa/ipa-pta-1.c: Adjust.

Index: gcc/tree-ssa-structalias.c
===
*** gcc/tree-ssa-structalias.c  (revision 235478)
--- gcc/tree-ssa-structalias.c  (working copy)
*** find_func_aliases_for_call (struct funct
*** 4641,4652 
  auto_vec lhsc;
  struct constraint_expr rhs;
  struct constraint_expr *lhsp;
  
  get_constraint_for (lhsop, &lhsc);
  rhs = get_function_part_constraint (fi, fi_result);
! if (fndecl
! && DECL_RESULT (fndecl)
! && DECL_BY_REFERENCE (DECL_RESULT (fndecl)))
{
  auto_vec tem;
  tem.quick_push (rhs);
--- 4737,4747 
  auto_vec lhsc;
  struct constraint_expr rhs;
  struct constraint_expr *lhsp;
+ bool aggr_p = aggregate_value_p (lhsop, gimple_call_fntype (t));
  
  get_constraint_for (lhsop, &lhsc);
  rhs = get_function_part_constraint (fi, fi_result);
! if (aggr_p)
{
  auto_vec tem;
  tem.quick_push (rhs);
*** find_func_aliases_for_call (struct funct
*** 4656,4677 
}
  FOR_EACH_VEC_ELT (lhsc, j, lhsp)
process_constraint (new_constraint (*lhsp, rhs));
-   }
  
!   /* If we pass the result decl by reference, honor that.  */
!   if (lhsop
! && fndecl
! && DECL_RESULT (fndecl)
! && DECL_BY_REFERENCE (DECL_RESULT (fndecl)))
!   {
! struct constraint_expr lhs;
! struct constraint_expr *rhsp;
  
! get_constraint_for_address_of (lhsop, &rhsc);
! lhs = get_function_part_constraint (fi, fi_result);
! FOR_EACH_VEC_ELT (rhsc, j, rhsp)
!   process_constraint (new_constraint (lhs, *rhsp));
! rhsc.truncate (0);
}
  
/* If we use a static chain, pass it along.  */
--- 4751,4769 
}
  FOR_EACH_VEC_ELT (lhsc, j, lhsp)
process_constraint (new_constraint (*lhsp, rhs));
  
! /* If we pass the result decl by reference, honor that.  */
! if (aggr_p)
!   {
! struct constraint_expr lhs;
! struct constraint_expr *rhsp;
  
! get_constraint_for_address_of (lhsop, &rhsc);
! lhs = get_function_part_constraint (fi, fi_result);
! FOR_EACH_VEC_ELT (rhsc, j, rhsp)
! process_constraint (new_constraint (lhs, *rhsp));
! rhsc.truncate (0);
!   }
}
  
/* If we use a static chain, pass it along.  */
*** ipa_pta_execute (void)
*** 7686,7715 
  
gcc_assert (!node->clone_of);
  
-   /* When parallelizing a code region, we split the region off into a
-separate function, to be run by several threads in parallel.  So for a
-function foo, we split off a region into a function
-foo._0 (void *foodata), and replace the region with some variant of a
-function call run_on_threads (&foo._0, data).  The '&foo._0' sets the
-address_taken bit for function foo._0, which would make it non-local.
-But for the purpose of ipa-pta, we can regard the run_on_threads call
-as a local call foo._0 (data),  so

Re: Increase default value of lto-min-partition to 10000

2016-04-27 Thread Richard Biener
On Wed, 27 Apr 2016, Prathamesh Kulkarni wrote:

> Hi,
> As discussed in other thread, this patch increases default value for
> lto-min-partition
> to 10000. OK to commit if bootstrap+testing passes ?

Ok.

Thanks,
Richard.


Re: [PATCH] DWARF: turn dw_loc_descr_node field into hash map for frame offset check

2016-04-27 Thread Richard Biener
On Wed, Apr 27, 2016 at 10:03 AM, Pierre-Marie de Rodat
 wrote:
> Hello,
>
> As discussed on
> , this change
> removes a field in the dw_loc_descr_node structure so we can get rid of
> the CHECKING_P macro usage.
>
> This field was used to perform consistency checks for frame offset in
> DWARF procedures. As a replacement, this commit turns the "visited
> nodes" set in resolve_args_picking_1 into a map that remembers for each
> dw_loc_descr_node the frame offset associated to it, so that the
> consistency check is still operational.
>
> Bootstrapped and regtested on x86_64-linux. Ok to commit? Thank you in

Ok.

Thanks,
Richard.

> advance!
> ---
>  gcc/dwarf2out.c | 37 +++--
>  gcc/dwarf2out.h |  6 --
>  2 files changed, 19 insertions(+), 24 deletions(-)
>
> diff --git a/gcc/dwarf2out.c b/gcc/dwarf2out.c
> index 0bbff87..463863d 100644
> --- a/gcc/dwarf2out.c
> +++ b/gcc/dwarf2out.c
> @@ -1325,9 +1325,6 @@ new_loc_descr (enum dwarf_location_atom op, unsigned 
> HOST_WIDE_INT oprnd1,
>dw_loc_descr_ref descr = ggc_cleared_alloc ();
>
>descr->dw_loc_opc = op;
> -#if CHECKING_P
> -  descr->dw_loc_frame_offset = -1;
> -#endif
>descr->dw_loc_oprnd1.val_class = dw_val_class_unsigned_const;
>descr->dw_loc_oprnd1.val_entry = NULL;
>descr->dw_loc_oprnd1.v.val_unsigned = oprnd1;
> @@ -15353,12 +15350,14 @@ is_handled_procedure_type (tree type)
>   && int_size_in_bytes (type) <= DWARF2_ADDR_SIZE);
>  }
>
> -/* Helper for resolve_args_picking.  Stop when coming across VISITED nodes.  
> */
> +/* Helper for resolve_args_picking: do the same but stop when coming across
> +   visited nodes.  For each node we visit, register in FRAME_OFFSETS the 
> frame
> +   offset *before* evaluating the corresponding operation.  */
>
>  static bool
>  resolve_args_picking_1 (dw_loc_descr_ref loc, unsigned initial_frame_offset,
> struct dwarf_procedure_info *dpi,
> -   hash_set<dw_loc_descr_ref> &visited)
> +   hash_map<dw_loc_descr_ref, unsigned> &frame_offsets)
>  {
>/* The "frame_offset" identifier is already used to name a macro... */
>unsigned frame_offset_ = initial_frame_offset;
> @@ -15366,19 +15365,18 @@ resolve_args_picking_1 (dw_loc_descr_ref loc, 
> unsigned initial_frame_offset,
>
>for (l = loc; l != NULL;)
>  {
> +  bool existed;
> +  unsigned &l_frame_offset = frame_offsets.get_or_insert (l, &existed);
> +
>/* If we already met this node, there is nothing to compute anymore.  
> */
> -  if (visited.add (l))
> +  if (existed)
> {
> -#if CHECKING_P
>   /* Make sure that the stack size is consistent wherever the 
> execution
>  flow comes from.  */
> - gcc_assert ((unsigned) l->dw_loc_frame_offset == frame_offset_);
> -#endif
> + gcc_assert ((unsigned) l_frame_offset == frame_offset_);
>   break;
> }
> -#if CHECKING_P
> -  l->dw_loc_frame_offset = frame_offset_;
> -#endif
> +  l_frame_offset = frame_offset_;
>
>/* If needed, relocate the picking offset with respect to the frame
>  offset. */
> @@ -15601,7 +15599,7 @@ resolve_args_picking_1 (dw_loc_descr_ref loc, 
> unsigned initial_frame_offset,
> {
> case DW_OP_bra:
>   if (!resolve_args_picking_1 (l->dw_loc_next, frame_offset_, dpi,
> -  visited))
> +  frame_offsets))
> return false;
>   /* Fall through... */
>
> @@ -15623,17 +15621,20 @@ resolve_args_picking_1 (dw_loc_descr_ref loc, 
> unsigned initial_frame_offset,
>
>  /* Make a DFS over operations reachable through LOC (i.e. follow branch
> operations) in order to resolve the operand of DW_OP_pick operations that
> -   target DWARF procedure arguments (DPI).  Stop at already visited nodes.
> -   INITIAL_FRAME_OFFSET is the frame offset *before* LOC is executed.  Return
> -   if all relocations were successful.  */
> +   target DWARF procedure arguments (DPI).  INITIAL_FRAME_OFFSET is the frame
> +   offset *before* LOC is executed.  Return if all relocations were
> +   successful.  */
>
>  static bool
>  resolve_args_picking (dw_loc_descr_ref loc, unsigned initial_frame_offset,
>   struct dwarf_procedure_info *dpi)
>  {
> -  hash_set<dw_loc_descr_ref> visited;
> +  /* Associate to all visited operations the frame offset *before* evaluating
> + this operation.  */
> +  hash_map<dw_loc_descr_ref, unsigned> frame_offsets;
>
> -  return resolve_args_picking_1 (loc, initial_frame_offset, dpi, visited);
> +  return resolve_args_picking_1 (loc, initial_frame_offset, dpi,
> +frame_offsets);
>  }
>
>  /* Try to generate a DWARF procedure that computes the same result as FNDECL.
> diff --git a/gcc/dwarf2out.h b/gcc/dwarf2out.h
> index 91b3d6b..abf0550 100644
> --- a/gcc/dwarf2out.h
> +++ b/gcc/dwarf2out.h
> @@ -239,12 +239,6 @@ struct GTY((ch

Re: IRA costs tweaks, PR 56069

2016-04-27 Thread Bernd Schmidt

On 04/27/2016 06:02 AM, Jeff Law wrote:

AFAICT the sra-1.c expects to see the incremented value and I'm at a
loss to understand what's really going on here.  Can you give more details?


Yeah, maybe my first impression wasn't very accurate.

When I try to run gdb manually, it just crashes:

(gdb) show version
GNU gdb (Gentoo 7.10.1 vanilla) 7.10.1
(gdb) b 43
Breakpoint 1 at 0x40059b: file sra-1.c, line 43.
(gdb) run
Starting program: /local/src/egcs/bscommit/gcc/a.out

Breakpoint 1, f3 (k=) at sra-1.c:43
43        bar (a.j);  /* { dg-final { gdb-test 43 "a.j" "14" } } */
(gdb) p a.j
Segmentation fault (core dumped)

Here's rtl from the final dump (reg notes and insn codes etc. removed 
where it seemed to help readability):


(note 49 21 39 2 (var_location a$i (const_int 4 [0x4])) 
NOTE_INSN_VAR_LOCATION)

(insn:TI 39 49 2 2 (set (reg:HI 0 ax [orig:97 a$i ] [97])
(const_int 4 [0x4])) sra-1.c:40

(insn 2 39 50 2 (set (reg/v:SI 1 dx [orig:96 k ] [96])
(reg:SI 5 di [ k ])) sra-1.c:38

(note 50 2 12 2 (var_location a$j (plus:HI (reg:HI 1 dx [orig:96 k ] [96])
(const_int 6 [0x6]))) NOTE_INSN_VAR_LOCATION)

(insn:TI 12 50 51 2 (parallel [
(set (reg:HI 0 ax [orig:97 a$i ] [97])
(asm_operands:HI ("") ("=r") 0 [
(reg:HI 0 ax [orig:97 a$i ] [97])
]
 [
(asm_input:HI ("0") sra-1.c:40)
]
 [] sra-1.c:40))
(clobber (reg:CCFP 18 fpsr))
(clobber (reg:CC 17 flags))
]) sra-1.c:40 -1

(note 51 12 52 2 (var_location a$i (reg:HI 0 ax [orig:97 a$i ] [97])) 
NOTE_INSN_VAR_LOCATION)

(note 52 51 15 2 (var_location a$j (plus:HI (reg:HI 1 dx [orig:96 k ] [96])
(const_int 7 [0x7]))) NOTE_INSN_VAR_LOCATION)
(insn:TI 15 52 16 2 (set (reg:SI 2 cx [orig:92 _10 ] [92])
(sign_extend:SI (reg:HI 0 ax [orig:97 a$i ] [97]))) sra-1.c:42

(insn:TI 16 15 53 2 (set (reg:SI 5 di)
(reg:SI 2 cx [orig:92 _10 ] [92])) sra-1.c:42 86

(note 53 16 17 2 (var_location k (reg/v:SI 1 dx [orig:96 k ] [96])) 
NOTE_INSN_VAR_LOCATION)


(call_insn:TI 17 53 54 2 (call (mem:QI (symbol_ref:DI ("bar")))

(note 54 17 41 2 (expr_list:REG_DEP_TRUE (concat:SI (reg:SI 5 di)
(reg:SI 2 cx [orig:92 _10 ] [92]))
(nil)) NOTE_INSN_CALL_ARG_LOCATION)
(insn:TI 41 54 55 2 (parallel [
(set (reg:SI 1 dx [101])
(plus:SI (reg:SI 1 dx [orig:96 k ] [96])
(const_int 7 [0x7])))
(clobber (reg:CC 17 flags))
]) sra-1.c:41 218 {*addsi_1}
(note 55 41 56 2 (var_location k (plus:SI (reg:SI 1 dx [101])
(const_int -7 [0xfff9]))) NOTE_INSN_VAR_LOCATION)
(note 56 55 42 2 (var_location a$j (reg:HI 1 dx [101])) 
NOTE_INSN_VAR_LOCATION)

(insn:TI 42 56 57 2 (parallel [
(set (reg:SI 1 dx [103])
(ashift:SI (reg:SI 1 dx [101])
(const_int 4 [0x4])))
(clobber (reg:CC 17 flags))
]) sra-1.c:41

(note 57 42 58 2 (var_location k (entry_value:SI (reg:SI 5 di [ k ]))) 
NOTE_INSN_VAR_LOCATION)
(note 58 57 23 2 (var_location a$j (plus:HI (subreg:HI (entry_value:SI 
(reg:SI 5 di [ k ])) 0)

(const_int 7 [0x7]))) NOTE_INSN_VAR_LOCATION)

(insn:TI 23 58 24 2 (parallel [
(set (reg:HI 1 dx [104])
(ashiftrt:HI (reg:HI 1 dx [103])
(const_int 4 [0x4])))
(clobber (reg:CC 17 flags))
]) sra-1.c:41
(insn:TI 24 23 59 2 (set (reg:SI 0 ax [orig:93 _12 ] [93])
(sign_extend:SI (reg:HI 1 dx [104]))) sra-1.c:43

(note 59 24 25 2 (var_location a$i (reg:HI 2 cx [orig:92 _10 ] [92])) 
NOTE_INSN_VAR_LOCATION)

(insn:TI 25 59 26 2 (set (reg:SI 5 di)
(reg:SI 0 ax [orig:93 _12 ] [93])) sra-1.c:43 86 {*movsi_internal}
 (nil))

(call_insn:TI 26 25 60 2 (call (mem:QI (symbol_ref:DI ("bar")))


I don't really understand the var-tracking stuff too well, so no idea 
where to go from here. I suppose I'm withdrawing my patch.



Bernd


Re: Move "X +- C1 CMP C2 to X CMP C2 -+ C1" to match.pd

2016-04-27 Thread Richard Biener
On Tue, Apr 26, 2016 at 10:28 PM, Marc Glisse  wrote:
> On Tue, 26 Apr 2016, Richard Biener wrote:
>
>> On Sun, Apr 24, 2016 at 7:14 PM, Marc Glisse  wrote:
>>>
>>> Hello,
>>>
>>> trying to move a first pattern from fold_comparison.
>>>
>>> I first tried without single_use. It brought the number of 'free' in
>>> g++.dg/tree-ssa/pr61034.C down to 11, changed gcc.dg/sms-6.c to only 2
>>> SMS
>>> (I don't think the generated code was worse, maybe even better, but I
>>> don't
>>> know ppc asm), broke Wstrict-overflow-18.c (the optimization moved from
>>> VRP
>>> to CCP if I remember correctly), and caused IVOPTS to make a mess in
>>> guality/pr54693-2.c (much longer code, and many  debug
>>> variables). If someone wants to drop the single_use, they can work on
>>> that
>>> after this patch is in.
>>>
>>> The conditions do not exactly match the ones in fold-const.c, but I guess
>>> they are close. The warning in the constant case was missing in
>>> fold_comparison, but present in VRP, so I had to add it not to regress.
>>>
>>> I don't think we were warning much from match.pd. I can't say that I am a
>>> big fan of those strict overflow warnings, but I expect people would
>>> complain if we just dropped the existing ones when moving the transforms
>>> to
>>> match.pd?
>>
>>
>> I just dropped them for patterns I moved (luckily we didn't have
>> testcases so far ;))
>>
>> If we really want to warn from match.pd then you should do the
>> defer/undefer
>> stuff in fold_stmt itself (same condition I guess) and defer/undefer
>> without
>> warning in gimple_fold_stmt_to_constant_1.
>
>
> Moving it to fold_stmt_1 seems like a good idea, much better than putting it
> in forwprop. Looking at gimple_fold_stmt_to_constant_1 on the other hand, I
> have some concerns. If we do not warn for gimple_fold_stmt_to_constant_1, we
> are going to miss some warnings (I believe there is at least one testcase
> that will break, where VRP currently warns but CCP will come first). On the
> other hand, gimple_fold_stmt_to_constant_1 doesn't do much validation on its
> return value, each caller has their own idea of which results are
> acceptable, so it is only in the (many) callers that we can defer/undefer,
> or we might warn for transformations that we don't actually perform. CCP
> already has the defer/undefer calls around ccp_fold and thus
> gimple_fold_stmt_to_constant_1.

Yeah, the issue with gimple_fold_stmt_to_constant_1 is that it's usually
used in an iteration scheme and thus we may warn multiple times
and for transforms that don't end up being used...

>> So you actually ran into a testcase that expected the warning?
>
>
> Several of them if I remember correctly...

Ugh.

>>> I wanted to restrict the equality case to TYPE_OVERFLOW_WRAPS ||
>>> TYPE_OVERFLOW_UNDEFINED, but that broke 20041114-1.c at -O1 (no strict
>>> overflow), so I went with some kind of complement we use elsewhere.
>>
>>
>> I think I prefer to move things 1:1 (unless sth regresses) and fix bugs in
>> the
>> fold-const.c variant as followup (possibly also adding testcases).
>
>
> Well, yes, but things do indeed regress quite regularly when moving things
> 1:1 :-( At least unless we add a big (if (GENERIC) ...) around the
> transformation.

Sure, just wanted clarification on what changes are just (obvious) bugfixes
and what are needed to not regress in the testsuite.

>> IMHO -fno-strict-overflow needs to go.  It has very wary designed
>> semantics
>> (ops neither wrap nor have undefined overflow).
>
>
> Maybe make it an alias for -fwrapv?

Yes, for example.

Richard.

>
> --
> Marc Glisse


[patch] libstdc++/70767 Define std::numeric_limits in C++98 mode

2016-04-27 Thread Jonathan Wakely

This makes the resolution to DR 559 apply for all dialects.

Tested x86_64-linux, committed to trunk.


commit 078fcaeb799445a24b8fe5872c5553a061e9b697
Author: Jonathan Wakely 
Date:   Wed Apr 27 12:05:37 2016 +0100

libstdc++/70767 Define std::numeric_limits in C++98 mode

	PR libstdc++/70767
	* include/std/limits: Update comments about DRs.
	(numeric_limits<const T>, numeric_limits<volatile T>,
	numeric_limits<const volatile T>): Define unconditionally.

diff --git a/libstdc++-v3/include/std/limits b/libstdc++-v3/include/std/limits
index b25f825..53a183f 100644
--- a/libstdc++-v3/include/std/limits
+++ b/libstdc++-v3/include/std/limits
@@ -307,9 +307,6 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
*  representation of a fundamental type on a given platform.  For
*  non-fundamental types, the functions will return 0 and the data
*  members will all be @c false.
-   *
-   *  _GLIBCXX_RESOLVE_LIB_DEFECTS:  DRs 201 and 184 (hi Gaby!) are
-   *  noted, but not incorporated in this documented (yet).
   */
   template<typename _Tp>
 struct numeric_limits : public __numeric_limits_base
@@ -360,7 +357,9 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   denorm_min() _GLIBCXX_USE_NOEXCEPT { return _Tp(); }
 };
 
-#if __cplusplus >= 201103L
+  // _GLIBCXX_RESOLVE_LIB_DEFECTS
+  // 559. numeric_limits<const T>
+
   template<typename _Tp>
 struct numeric_limits<const _Tp>
 : public numeric_limits<_Tp> { };
@@ -372,10 +371,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   template<typename _Tp>
 struct numeric_limits<const volatile _Tp>
 : public numeric_limits<_Tp> { };
-#endif
 
   // Now there follow 16 explicit specializations.  Yes, 16.  Make sure
-  // you get the count right. (18 in c++0x mode)
+  // you get the count right. (18 in C++11 mode, with char16_t and char32_t.)
+
+  // _GLIBCXX_RESOLVE_LIB_DEFECTS
+  // 184. numeric_limits wording problems
 
   /// numeric_limits specialization.
   template<>


Re: Move "X +- C1 CMP C2 to X CMP C2 -+ C1" to match.pd

2016-04-27 Thread Richard Biener
On Wed, Apr 27, 2016 at 7:34 AM, Marc Glisse  wrote:
> Here is the current patch (passed regtest), after moving defer/undefer from
> forwprop to fold_stmt_1. I am not sure if checking no_warning at the end of
> fold_stmt_1 is safe, or if I should save its value at the beginning of the
> function, in case some of the transformations clear it.

I think you need to check it on the original stmt, it is preserved only
if we re-use the old stmt memory.

>
> On Tue, 26 Apr 2016, Marc Glisse wrote:
>
>> On Tue, 26 Apr 2016, Richard Biener wrote:
>>
>>> On Sun, Apr 24, 2016 at 7:14 PM, Marc Glisse 
>>> wrote:

>>>> Hello,
>>>>
>>>> trying to move a first pattern from fold_comparison.
>>>>
>>>> I first tried without single_use. It brought the number of 'free' in
>>>> g++.dg/tree-ssa/pr61034.C down to 11, changed gcc.dg/sms-6.c to only 2
>>>> SMS
>>>> (I don't think the generated code was worse, maybe even better, but I
>>>> don't
>>>> know ppc asm), broke Wstrict-overflow-18.c (the optimization moved from
>>>> VRP
>>>> to CCP if I remember correctly), and caused IVOPTS to make a mess in
>>>> guality/pr54693-2.c (much longer code, and many  debug
>>>> variables). If someone wants to drop the single_use, they can work on
>>>> that
>>>> after this patch is in.
>>>>
>>>> The conditions do not exactly match the ones in fold-const.c, but I
>>>> guess
>>>> they are close. The warning in the constant case was missing in
>>>> fold_comparison, but present in VRP, so I had to add it not to regress.
>>>>
>>>> I don't think we were warning much from match.pd. I can't say that I am
>>>> a
>>>> big fan of those strict overflow warnings, but I expect people would
>>>> complain if we just dropped the existing ones when moving the transforms
>>>> to
>>>> match.pd?
>>>
>>>
>>> I just dropped them for patterns I moved (luckily we didn't have
>>> testcases so far ;))
>>>
>>> If we really want to warn from match.pd then you should do the
>>> defer/undefer
>>> stuff in fold_stmt itself (same condition I guess) and defer/undefer
>>> without
>>> warning in gimple_fold_stmt_to_constant_1.
>>
>>
>> Moving it to fold_stmt_1 seems like a good idea, much better than putting
>> it in forwprop. Looking at gimple_fold_stmt_to_constant_1 on the other hand,
>> I have some concerns. If we do not warn for gimple_fold_stmt_to_constant_1,
>> we are going to miss some warnings (I believe there is at least one testcase
>> that will break, where VRP currently warns but CCP will come first). On the
>> other hand, gimple_fold_stmt_to_constant_1 doesn't do much validation on its
>> return value, each caller has their own idea of which results are
>> acceptable, so it is only in the (many) callers that we can defer/undefer,
>> or we might warn for transformations that we don't actually perform. CCP
>> already has the defer/undefer calls around ccp_fold and thus
>> gimple_fold_stmt_to_constant_1.
>>
>>> So you actually ran into a testcase that expected the warning?
>>
>>
>> Several of them if I remember correctly...
>>
>>>> I wanted to restrict the equality case to TYPE_OVERFLOW_WRAPS ||
>>>> TYPE_OVERFLOW_UNDEFINED, but that broke 20041114-1.c at -O1 (no strict
>>>> overflow), so I went with some kind of complement we use elsewhere.
>>>
>>>
>>> I think I prefer to move things 1:1 (unless sth regresses) and fix bugs
>>> in the
>>> fold-const.c variant as followup (possibly also adding testcases).
>>
>>
>> Well, yes, but things do indeed regress quite regularly when moving things
>> 1:1 :-( At least unless we add a big (if (GENERIC) ...) around the
>> transformation.
>>
>>> IMHO -fno-strict-overflow needs to go.  It has very wary designed
>>> semantics
>>> (ops neither wrap nor have undefined overflow).
>>
>>
>> Maybe make it an alias for -fwrapv?
>
>
> --
> Marc Glisse
>
> Index: trunk4/gcc/fold-const.c
> ===
> --- trunk4/gcc/fold-const.c (revision 235452)
> +++ trunk4/gcc/fold-const.c (working copy)
> @@ -290,21 +290,21 @@ fold_undefer_and_ignore_overflow_warning
>
>  bool
>  fold_deferring_overflow_warnings_p (void)
>  {
>return fold_deferring_overflow_warnings > 0;
>  }
>
>  /* This is called when we fold something based on the fact that signed
> overflow is undefined.  */
>
> -static void
> +void
>  fold_overflow_warning (const char* gmsgid, enum warn_strict_overflow_code
> wc)
>  {
>if (fold_deferring_overflow_warnings > 0)
>  {
>if (fold_deferred_overflow_warning == NULL
>   || wc < fold_deferred_overflow_code)
> {
>   fold_deferred_overflow_warning = gmsgid;
>   fold_deferred_overflow_code = wc;
> }
> @@ -8366,89 +8366,20 @@ fold_comparison (location_t loc, enum tr
>  {
>const bool equality_code = (code == EQ_EXPR || code == NE_EXPR);
>tree arg0, arg1, tem;
>
>arg0 = op0;
>arg1 = op1;
>
>STRIP_SIGN_NOPS (arg0);
>STRIP_SIGN_NOPS (arg1);
>
> -  /* Transform comparisons of the form X +- C1 CMP C2 t

Re: [PATCH] PR target/70155: Use SSE for TImode load/store

2016-04-27 Thread Uros Bizjak
On Tue, Apr 26, 2016 at 9:50 PM, H.J. Lu  wrote:

>> Here is the updated patch which does that.  Ok for trunk if there
>> is no regressions on x86-64?
>>
>
> CSE works with SSE constants now.  Here is the updated patch.
> OK for trunk if there are no regressions on x86-64?

+static bool
+timode_scalar_to_vector_candidate_p (rtx_insn *insn)
+{
+  rtx def_set = single_set (insn);
+
+  if (!def_set)
+return false;
+
+  if (has_non_address_hard_reg (insn))
+return false;
+
+  rtx src = SET_SRC (def_set);
+  rtx dst = SET_DEST (def_set);
+
+  /* Only TImode load and store are allowed.  */
+  if (GET_MODE (dst) != TImode)
+return false;
+
+  if (MEM_P (dst))
+{
+  /* Check for store.  Only support store from register or standard
+ SSE constants.  */
+  switch (GET_CODE (src))
+ {
+ default:
+  return false;
+
+ case REG:
+  /* For store from register, memory must be aligned or both
+ unaligned load and store are optimal.  */
+  return (!misaligned_operand (dst, TImode)
+  || (TARGET_SSE_UNALIGNED_LOAD_OPTIMAL
+  && TARGET_SSE_UNALIGNED_STORE_OPTIMAL));

Why check TARGET_SSE_UNALIGNED_LOAD_OPTIMAL here? We are moving from a
register here.

+ case CONST_INT:
+  /* For store from standard SSE constant, memory must be
+ aligned or unaligned store is optimal.  */
+  return (standard_sse_constant_p (src, TImode)
+  && (!misaligned_operand (dst, TImode)
+  || TARGET_SSE_UNALIGNED_STORE_OPTIMAL));
+ }
+}
+  else if (MEM_P (src))
+{
+  /* Check for load.  Memory must be aligned or both unaligned
+ load and store are optimal.  */
+  return (GET_CODE (dst) == REG
+  && (!misaligned_operand (src, TImode)
+  || (TARGET_SSE_UNALIGNED_LOAD_OPTIMAL
+  && TARGET_SSE_UNALIGNED_STORE_OPTIMAL)));

Also here. We are loading a register, so there is no point in checking
TARGET_SSE_UNALIGNED_STORE_OPTIMAL.

+}
+
+  return false;
+}
+

+/* Convert INSN from TImode to V1TImode.  */
+
+void
+timode_scalar_chain::convert_insn (rtx_insn *insn)
+{
+  rtx def_set = single_set (insn);
+  rtx src = SET_SRC (def_set);
+  rtx tmp;
+  rtx dst = SET_DEST (def_set);

No need for tmp declaration above ...

+  switch (GET_CODE (dst))
+{
+case REG:
+  tmp = find_reg_equal_equiv_note (insn);

... if you declare it here ...

+  if (tmp)
+ PUT_MODE (XEXP (tmp, 0), V1TImode);

/* FALLTHRU */

+case MEM:
+  PUT_MODE (dst, V1TImode);
+  break;

+case CONST_INT:
+  switch (standard_sse_constant_p (src, TImode))
+ {
+ case 1:
+  src = CONST0_RTX (GET_MODE (dst));
+  tmp = gen_reg_rtx (V1TImode);
+  break;
+ case 2:
+  src = CONSTM1_RTX (GET_MODE (dst));
+  tmp = gen_reg_rtx (V1TImode);
+  break;
+ default:
+  gcc_unreachable ();
+ }
+  if (NONDEBUG_INSN_P (insn))
+ {

... and here. Please generate temp register here.

+  /* Since there are no instructions to store standard SSE
+ constant, temporary register usage is required.  */
+  emit_conversion_insns (gen_rtx_SET (dst, tmp), insn);
+  dst = tmp;
+ }


   /* This needs to be done at start up.  It's convenient to do it here.  */
   register_pass (&insert_vzeroupper_info);
-  register_pass (&stv_info);
+  register_pass (TARGET_64BIT ? &stv_info_64 : &stv_info_32);
 }

stv_info_timode and stv_info_dimode?

Uros.


Re: match.pd: unsigned A - B > A --> A < B

2016-04-27 Thread Richard Biener
On Tue, Apr 26, 2016 at 5:56 PM, Marc Glisse  wrote:
> On Tue, 26 Apr 2016, Richard Biener wrote:
>
>> On Sun, Apr 24, 2016 at 7:42 PM, Marc Glisse  wrote:
>>>
>>> Hello,
>>>
>>> the first part is something that was discussed last stage3, and Jakub
>>> argued
>>> in favor of single_use. The second part is probably less useful, it
>>> notices
>>> that if we manually check for overflow using the result of
>>> IFN_*_OVERFLOW,
>>> then we might as well read that information from the result of that
>>> function.
>>>
>>> Bootstrap+regtest on powerpc64le-unknown-linux-gnu. (hmm, I probably
>>> should
>>> have done it on x86_64 instead, I don't know if the ppc backend has
>>> implemented the overflow functions recently)
>>
>>
>> Ok.  Can you please place the match.pd rules adjacent to the other
>> comparison
>> simplifications rather than at the end?
>
>
> (I did that for the other patch as well)
> I just realized that the *_OVERFLOW internal functions do not work the way I
> expected, the arguments and result can be any combination of unrelated
> types, which makes it unlikely that the transforms are safe as they are
> (could be, but that would be a lot of luck).
>
> The patterns have a comparison between @0 and realpart, so at least those 2
> types are (essentially) the same. To be safe, I would add a condition that
> @0 and @1 have (essentially) the same type. Elsewhere we have a different
> condition for generic/gimple, but for such transforms it doesn't seem
> important to do them on generic.
>
> Here is the new patch, with useless_type_conversion_p added. Is that ok?
> (I could also drop that part of the patch and commit only the part that does
> not involve builtins)

Please use types_match_p () instead - that's substituted with a GENERIC/GIMPLE
variant as needed (via {gimple,generic}-match-head.c).  I'm fine if you want
to disable all this on GENERIC - note that it's preferred to use

#if GIMPLE
...
#endif

to disable whole pattern blocks as otherwise genmatch will still generate
matchers for the GENERIC case but add if (0) in the transform.  This just
bloats code and possibly runtime.

Ok with that change.

Thanks,
Richard.

> I'll do another regtest to be sure.
>
>
>>> 2016-04-25  Marc Glisse  
>>>
>>> gcc/
>>> * match.pd (A - B > A, A + B < A): New transformations.
>>>
>>> gcc/testsuite/
>>> * gcc.dg/tree-ssa/overflow-2.c: New testcase.
>>> * gcc.dg/tree-ssa/minus-ovf.c: Likewise.
>
>
> --
> Marc Glisse
>
> Index: trunk-ovf2/gcc/match.pd
> ===
> --- trunk-ovf2/gcc/match.pd (revision 235448)
> +++ trunk-ovf2/gcc/match.pd (working copy)
> @@ -2501,20 +2501,75 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>   out (gt gt le le)
>   (simplify
>(cmp @0 (plus@2 @0 INTEGER_CST@1))
>(if (TYPE_UNSIGNED (TREE_TYPE (@0))
> && TYPE_OVERFLOW_WRAPS (TREE_TYPE (@0))
> && wi::ne_p (@1, 0)
> && single_use (@2))
> (out @0 { wide_int_to_tree (TREE_TYPE (@0), wi::max_value
>(TYPE_PRECISION (TREE_TYPE (@0)), UNSIGNED) - @1); }
>
> +/* To detect overflow in unsigned A - B, A < B is simpler than A - B > A.
> +   However, the detection logic for SUB_OVERFLOW in tree-ssa-math-opts.c
> +   expects the long form, so we restrict the transformation for now.  */
> +(for cmp (gt le)
> + (simplify
> +  (cmp (minus@2 @0 @1) @0)
> +  (if (single_use (@2)
> +   && ANY_INTEGRAL_TYPE_P (TREE_TYPE (@0))
> +   && TYPE_UNSIGNED (TREE_TYPE (@0))
> +   && TYPE_OVERFLOW_WRAPS (TREE_TYPE (@0)))
> +   (cmp @1 @0
> +(for cmp (lt ge)
> + (simplify
> +  (cmp @0 (minus@2 @0 @1))
> +  (if (single_use (@2)
> +   && ANY_INTEGRAL_TYPE_P (TREE_TYPE (@0))
> +   && TYPE_UNSIGNED (TREE_TYPE (@0))
> +   && TYPE_OVERFLOW_WRAPS (TREE_TYPE (@0)))
> +   (cmp @0 @1
> +
> +/* Testing for overflow is unnecessary if we already know the result.  */
> +(if (GIMPLE)
> + /* A < A - B  */
> + (for cmp (lt ge)
> +  out (ne eq)
> +  (simplify
> +   (cmp @0 (realpart (IFN_SUB_OVERFLOW@2 @0 @1)))
> +   (if (TYPE_UNSIGNED (TREE_TYPE (@0))
> +   && useless_type_conversion_p (TREE_TYPE (@0), TREE_TYPE (@1)))
> +(out (imagpart @2) { build_zero_cst (TREE_TYPE (@0)); }
> + /* A - B > A  */
> + (for cmp (gt le)
> +  out (ne eq)
> +  (simplify
> +   (cmp (realpart (IFN_SUB_OVERFLOW@2 @0 @1)) @0)
> +   (if (TYPE_UNSIGNED (TREE_TYPE (@0))
> +   && useless_type_conversion_p (TREE_TYPE (@0), TREE_TYPE (@1)))
> +(out (imagpart @2) { build_zero_cst (TREE_TYPE (@0)); }
> + /* A + B < A  */
> + (for cmp (lt ge)
> +  out (ne eq)
> +  (simplify
> +   (cmp (realpart (IFN_ADD_OVERFLOW:c@2 @0 @1)) @0)
> +   (if (TYPE_UNSIGNED (TREE_TYPE (@0))
> +   && useless_type_conversion_p (TREE_TYPE (@0), TREE_TYPE (@1)))
> +(out (imagpart @2) { build_zero_cst (TREE_TYPE (@0)); }
> + /* A > A + B  */
> + (for cmp (gt le)
> +  out (ne eq)
> +  (simplify
> +   (cmp @0 (realpart (IFN

Re: [PATCH] Verify that context of local DECLs is the current function

2016-04-27 Thread Martin Jambor
Hi,

On Tue, Apr 26, 2016 at 10:58:22AM +0200, Richard Biener wrote:
> On Mon, Apr 25, 2016 at 3:22 PM, Martin Jambor  wrote:
> > Hi,
> >
> > the patch below moves an assert from expand_expr_real_1 to gimple
> > verification.  It triggers when we do a sloppy job outlining stuff
> > from one function to another (or perhaps inlining too) and leave in
> > the IL of a function a local declaration that belongs to a different
> > function.
> >
> > Like I wrote above, such cases usually ICE in expand anyway, but I
> > think it is worth bailing out sooner, if only because bugs like PR
> > 70348 would not be assigned to me ;-) ...well, actually, I found this
> > helpful when working on OpenMP gridification.
> >
> > In the process, I think that the verifier would not catch a
> > SSA_NAME_IN_FREE_LIST when such an SSA_NAME is a base of a MEM_REF so
> > I added that check too.
> >
> > Bootstrapped and tested on x86_64-linux, OK for trunk?
> >
> > Thanks,
> >
> > Martin
> >
> >
> >
> > 2016-04-21  Martin Jambor  
> >
> > * tree-cfg.c (verify_var_parm_result_decl): New function.
> > (verify_address): Call it on PARM_DECL bases.
> > (verify_expr): Likewise, also verify SSA_NAME bases of MEM_REFs.
> > ---
> >  gcc/tree-cfg.c | 47 +++
> >  1 file changed, 47 insertions(+)
> >
> > diff --git a/gcc/tree-cfg.c b/gcc/tree-cfg.c
> > index 3385164..c917967 100644
> > --- a/gcc/tree-cfg.c
> > +++ b/gcc/tree-cfg.c
> > @@ -2764,6 +2764,23 @@ gimple_split_edge (edge edge_in)
> >return new_bb;
> >  }
> >
> > +/* Verify that a VAR, PARM_DECL or RESULT_DECL T is from the current 
> > function,
> > +   and if not, return true.  If it is, return false.  */
> > +
> > +static bool
> > +verify_var_parm_result_decl (tree t)
> > +{
> > +  tree context = decl_function_context (t);
> > +  if (context != cfun->decl
> > +  && !SCOPE_FILE_SCOPE_P (context)
> > +  && !TREE_STATIC (t)
> > +  && !DECL_EXTERNAL (t))
> > +{
> > +  error ("Local declaration from a different function");
> > +  return true;
> > +}
> > +  return NULL;
> > +}
> >
> >  /* Verify properties of the address expression T with base object BASE.  */
> >
> > @@ -2798,6 +2815,8 @@ verify_address (tree t, tree base)
> > || TREE_CODE (base) == RESULT_DECL))
> >  return NULL_TREE;
> >
> > +  if (verify_var_parm_result_decl (base))
> > +return base;
> 
> Is that necessary?  We recurse after all, so ...
> 
> >if (DECL_GIMPLE_REG_P (base))
> >  {
> >error ("DECL_GIMPLE_REG_P set on a variable with address taken");
> > @@ -2834,6 +2853,13 @@ verify_expr (tree *tp, int *walk_subtrees, void 
> > *data ATTRIBUTE_UNUSED)
> > }
> >break;
> >
> > +case PARM_DECL:
> > +case VAR_DECL:
> > +case RESULT_DECL:
> > +  if (verify_var_parm_result_decl (t))
> > +   return t;
> > +  break;
> > +
> 
> ... should apply.

I made that happen (see below)...

> 
> >  case INDIRECT_REF:
> >error ("INDIRECT_REF in gimple IL");
> >return t;
> > @@ -2852,9 +2878,25 @@ verify_expr (tree *tp, int *walk_subtrees, void 
> > *data ATTRIBUTE_UNUSED)
> >   error ("invalid offset operand of MEM_REF");
> >   return TREE_OPERAND (t, 1);
> > }
> > +  if (TREE_CODE (x) == SSA_NAME)
> > +   {
> > + if (SSA_NAME_IN_FREE_LIST (x))
> > +   {
> > + error ("SSA name in freelist but still referenced");
> > + return x;
> > +   }
> > + if (SSA_NAME_VAR (x))
> > +   x = SSA_NAME_VAR (x);;
> > +   }
> > +  if ((TREE_CODE (x) == PARM_DECL
> > +  || TREE_CODE (x) == VAR_DECL
> > +  || TREE_CODE (x) == RESULT_DECL)
> > + && verify_var_parm_result_decl (x))
> > +   return x;
> 
> please instead try removing *walk_subtrees = 0 ...

That unfortunately leads to the verifier complaining that DECLs which
are in ADDR_EXPRs are not marked as addressable.  So I changed the
code below

> 
> >if (TREE_CODE (x) == ADDR_EXPR
> >   && (x = verify_address (x, TREE_OPERAND (x, 0
> > return x;

to

  if (TREE_CODE (x) == ADDR_EXPR)
{
  tree va = verify_address (x, TREE_OPERAND (x, 0));
  if (va)
return va;
  x = TREE_OPERAND (x, 0);
}
  walk_tree (&x, verify_expr, data, NULL);
  *walk_subtrees = 0;
  break;

> 
> ... we only get some slight duplicate address verification here
> (this copy is stronger than the ADDR_EXPR case).
> 
> > +
> >*walk_subtrees = 0;
> >break;
> >
> > @@ -3010,6 +3052,11 @@ verify_expr (tree *tp, int *walk_subtrees, void 
> > *data ATTRIBUTE_UNUSED)
> >
> >   t = TREE_OPERAND (t, 0);
> > }
> > +  if ((TREE_CODE (t) == PARM_DECL
> > +  || TREE_CODE (t) == VAR_DECL
> > +  || TREE_CODE (t) == RESULT_DECL)
> > + && verify_var_parm_result_decl (t))
> > +   retu

[Ada] Remove superfluous use of secondary stack on object initialization

2016-04-27 Thread Arnaud Charlet
This patch improves on the performance of an object initialization with a
build-in-place function call, when the return type is not a definite type
but has only access discriminants and no controlled components.

The following must execute quietly:

   gcc -c -gnatDG p.adb
   grep secondary_stack p.adb.dg


---
with Discrim; use Discrim;
procedure P is
  I : aliased Integer;
  A_Obj : A_Type := Create (I'Access);
begin
  null;
end;
---
package Discrim is
  type A_Type (IA : access Integer) is limited private;
  function Create (I : access Integer) return A_Type;
private
  type A_Type (IA : access Integer) is limited record
 Not_Dependent_On_IA : Boolean;
  end record;
end;
---
package body Discrim is
  function Create (I : access Integer) return A_Type is
  begin
 return A : A_Type (I) do
A.Not_Dependent_On_IA := False;
 end return;
  end;
end;

Tested on x86_64-pc-linux-gnu, committed on trunk

2016-04-27  Ed Schonberg  

* exp_ch6.adb (Make_Build_In_Place_Call_In_Object_Declaration): If the
return type is an untagged limited record with only access
discriminants and no controlled components, the return value does not
need to use the secondary stack.

Index: exp_ch6.adb
===
--- exp_ch6.adb (revision 235482)
+++ exp_ch6.adb (working copy)
@@ -7783,7 +7783,12 @@
   Result_Subt : Entity_Id;
 
   Definite : Boolean;
-  --  True for definite function result subtype
+  --  True if result subtype is definite, or has a size that does not
+  --  require secondary stack usage (i.e. no variant part or components
+  --  whose type depends on discriminants). In particular, untagged types
+  --  with only access discriminants do not require secondary stack use.
+  --  Note that if the return type is tagged we must always use the sec.
+  --  stack because the call may dispatch on result.
 
begin
   --  Step past qualification or unchecked conversion (the latter can occur
@@ -7818,7 +7823,10 @@
   end if;
 
   Result_Subt := Etype (Function_Id);
-  Definite:= Is_Definite_Subtype (Underlying_Type (Result_Subt));
+  Definite :=
+(Is_Definite_Subtype (Underlying_Type (Result_Subt))
+ and then not Is_Tagged_Type (Result_Subt))
+  or else not Requires_Transient_Scope (Underlying_Type (Result_Subt));
 
   --  Create an access type designating the function's result subtype. We
   --  use the type of the original call because it may be a call to an


[Ada] Spurious End_Error with Get_Line on strings without line terminators

2016-04-27 Thread Arnaud Charlet
This patch fixes a spurious End_Error raised by Text_IO.Get_Line, when the
input line has 499 or 500 characters and does not contain a line terminator.

No short example available.
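
For readers unfamiliar with the runtime, the boundary condition can be
sketched in C. This is only an illustrative analogue, not the GNAT code:
`read_chunk`, `read_line`, `first_line_of` and the 8-character `CHUNK` are
hypothetical names (the Ada function uses a 500-character buffer), and
`fmemopen` is assumed to be available (POSIX.1-2008). The sketch shows why a
chunk that fills exactly, with no terminator seen, needs a peek for end of
file before another read is attempted.

```c
#define _POSIX_C_SOURCE 200809L  /* for fmemopen; an assumption of this sketch */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

enum { CHUNK = 8 };  /* hypothetical size; the Ada runtime buffer is 500 */

/* Read at most CHUNK characters into BUF.  A '\n' is consumed but not
   stored, and sets *SAW_NL.  Returns the number of characters stored, or
   -1 if end of file is hit before anything was read (the analogue of
   raising End_Error).  */
static int
read_chunk (FILE *f, char *buf, int *saw_nl)
{
  int n = 0;
  *saw_nl = 0;
  while (n < CHUNK)
    {
      int ch = getc (f);
      if (ch == EOF)
        return n > 0 ? n : -1;
      if (ch == '\n')
        {
          *saw_nl = 1;
          return n;
        }
      buf[n++] = (char) ch;
    }
  return n;
}

/* Return the next line as a malloc'd string, or NULL on error or when
   there is no data at all.  */
char *
read_line (FILE *f)
{
  char *line = NULL;
  size_t len = 0;

  for (;;)
    {
      char buf[CHUNK];
      int saw_nl;
      int n = read_chunk (f, buf, &saw_nl);
      if (n < 0)
        {
          free (line);
          return NULL;
        }

      char *grown = realloc (line, len + (size_t) n + 1);
      if (grown == NULL)
        {
          free (line);
          return NULL;
        }
      line = grown;
      memcpy (line + len, buf, (size_t) n);
      len += (size_t) n;
      line[len] = '\0';

      if (saw_nl || n < CHUNK)
        return line;

      /* The chunk filled exactly and no terminator was seen.  Peek: if the
         file ends here, the line is complete and must not be treated as an
         error by a further read.  */
      int ch = getc (f);
      if (ch == EOF)
        return line;
      ungetc (ch, f);
    }
}

/* Usage helper: first line of an in-memory, non-empty text.  */
char *
first_line_of (const char *text)
{
  FILE *f = fmemopen ((void *) text, strlen (text), "r");
  assert (f != NULL);
  char *line = read_line (f);
  fclose (f);
  return line;
}
```

Without the peek, an input of exactly `CHUNK` characters and no terminator
would make the next `read_chunk` call return -1, which is the analogue of
the spurious End_Error described above.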

Tested on x86_64-pc-linux-gnu, committed on trunk

2016-04-27  Ed Schonberg  

* a-textio.adb (Get_Line function): Handle properly the case of
a line that has the same length as the buffer (or a multiple
thereof) and there is no line terminator.
* a-tigeli.adb (Get_Line procedure): Do not store an end_of_file
in the string when there is no previous line terminator and we
need at most one additional character.

Index: a-textio.adb
===
--- a-textio.adb(revision 235481)
+++ a-textio.adb(working copy)
@@ -6,7 +6,7 @@
 --  --
 -- B o d y  --
 --  --
---  Copyright (C) 1992-2015, Free Software Foundation, Inc. --
+--  Copyright (C) 1992-2016, Free Software Foundation, Inc. --
 --  --
 -- GNAT is free software;  you can  redistribute it  and/or modify it under --
 -- terms of the  GNU General Public License as published  by the Free Soft- --
@@ -704,9 +704,6 @@
end Get_Line;
 
function Get_Line (File : File_Type) return String is
-  Buffer : String (1 .. 500);
-  Last   : Natural;
-
   function Get_Rest (S : String) return String;
   --  This is a recursive function that reads the rest of the line and
   --  returns it. S is the part read so far.
@@ -732,12 +729,19 @@
  begin
 if Last < Buffer'Last then
return R;
+
 else
return Get_Rest (R);
 end if;
  end;
   end Get_Rest;
 
+  --  Local variables
+
+  Buffer : String (1 .. 500);
+  ch : int;
+  Last   : Natural;
+
--  Start of processing for Get_Line
 
begin
@@ -745,6 +749,22 @@
 
   if Last < Buffer'Last then
  return Buffer (1 .. Last);
+
+  --  If the String has the same length as the buffer, and there is no end
+  --  of line, check whether we are at the end of file, in which case we
+  --  have the full String in the buffer.
+
+  elsif Last = Buffer'Last then
+ ch := Getc (File);
+
+ if ch = EOF then
+return Buffer;
+
+ else
+Ungetc (ch, File);
+return Get_Rest (Buffer (1 .. Last));
+ end if;
+
   else
  return Get_Rest (Buffer (1 .. Last));
   end if;
Index: a-tigeli.adb
===
--- a-tigeli.adb(revision 235481)
+++ a-tigeli.adb(working copy)
@@ -6,7 +6,7 @@
 --  --
 -- B o d y  --
 --  --
---  Copyright (C) 1992-2015, Free Software Foundation, Inc. --
+--  Copyright (C) 1992-2016, Free Software Foundation, Inc. --
 --  --
 -- GNAT is free software;  you can  redistribute it  and/or modify it under --
 -- terms of the  GNU General Public License as published  by the Free Soft- --
@@ -187,9 +187,14 @@
  --  If we get EOF after already reading data, this is an incomplete
  --  last line, in which case no End_Error should be raised.
 
- if ch = EOF and then Last < Item'First then
-raise End_Error;
+ if ch = EOF then
+if  Last < Item'First then
+   raise End_Error;
 
+else  --  All done
+   return;
+end if;
+
  elsif ch /= LM then
 
 --  Buffer really is full without having seen LM, update col


[Ada] Spurious dimensionality errors on multidimensional aggregates.

2016-04-27 Thread Arnaud Charlet
This patch removes spurious dimensionality errors on aggregates for multi-
dimensional arrays of scalar types with dimension specifications.
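
As a rough illustration of what checking a component's dimension vector
against its expected subtype means, here is a small C model. It is a sketch
under stated assumptions, not the compiler's algorithm: the names `dims`,
`quantity`, `q_mul`, `component_ok` and `demo_aggregate` are invented for
this example, and exponents are plain integers, whereas GNAT's dimension
system allows rational exponents.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Seven base dimensions, in the order length, mass, time, current,
   temperature, amount of substance, luminous intensity.  */
enum { N_BASE = 7 };

typedef struct { int exp[N_BASE]; } dims;          /* exponent per base dimension */
typedef struct { double value; dims d; } quantity; /* a value with its dimensions */

static const dims DIM_NONE   = { { 0, 0, 0, 0, 0, 0, 0 } };
static const dims DIM_LENGTH = { { 1, 0, 0, 0, 0, 0, 0 } };
static const dims DIM_MASS   = { { 0, 1, 0, 0, 0, 0, 0 } };

quantity
make_quantity (double value, dims d)
{
  quantity q;
  q.value = value;
  q.d = d;
  return q;
}

/* Multiplying two quantities adds their dimension exponents.  */
quantity
q_mul (quantity a, quantity b)
{
  quantity r;
  r.value = a.value * b.value;
  for (int i = 0; i < N_BASE; i++)
    r.d.exp[i] = a.d.exp[i] + b.d.exp[i];
  return r;
}

bool
dims_equal (dims a, dims b)
{
  return memcmp (a.exp, b.exp, sizeof a.exp) == 0;
}

/* Per-component check: the scalar expression's dimension vector must match
   the dimensions of the expected component type.  */
bool
component_ok (quantity expr, dims expected)
{
  return dims_equal (expr.d, expected);
}

/* Mirrors the two aggregate components in the test: 3.0 * m is accepted
   for a Length component, 3.0 * kg is rejected.  */
void
demo_aggregate (void)
{
  quantity three = make_quantity (3.0, DIM_NONE);
  assert (component_ok (q_mul (three, make_quantity (1.0, DIM_LENGTH)),
                        DIM_LENGTH));
  assert (!component_ok (q_mul (three, make_quantity (1.0, DIM_MASS)),
                         DIM_LENGTH));
}
```

The check is applied to each scalar component individually, which is why it
does not depend on how the enclosing subaggregates are later expanded.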

Compiling dimbug.ads below must yield:

   dimbug.ads:8:33: dimensions mismatch in array aggregate
   dimbug.ads:8:33: expected dimension [L], found [M]

---
with system.Dim.Mks; use System.Dim.Mks;
package dimbug is

 test_array_2 : array (1 .. 3, 1 .. 3) of Length :=
  (others => (others => 3.0 * m));--  OK

 Bad_Array : array (1 .. 3, 1 .. 3) of Length :=
  (others => (others => 3.0 * kg));   --  ERROR
end;

Tested on x86_64-pc-linux-gnu, committed on trunk

2016-04-27  Ed Schonberg  

* sem_dim.ads, sem_dim.adb (Check_Expression_Dimensions): New
procedure to compute the dimension vector of a scalar expression
and compare it with the dimensions of its expected subtype. Used
for the ultimate components of a multidimensional aggregate,
whose components typically are themselves aggregates that are
expanded separately. Previous to this patch, dimensionality
checking on such aggregates generated spurious errors.
* sem_aggr.adb (Resolve_Array_Aggregate): Use
Check_Expression_Dimensions when needed.

Index: sem_aggr.adb
===
--- sem_aggr.adb(revision 235481)
+++ sem_aggr.adb(working copy)
@@ -6,7 +6,7 @@
 --  --
 -- B o d y  --
 --  --
---  Copyright (C) 1992-2015, Free Software Foundation, Inc. --
+--  Copyright (C) 1992-2016, Free Software Foundation, Inc. --
 --  --
 -- GNAT is free software;  you can  redistribute it  and/or modify it under --
 -- terms of the  GNU General Public License as published  by the Free Soft- --
@@ -2052,6 +2052,13 @@
  Set_Parent (Expr, Parent (Expression (Assoc)));
  Analyze (Expr);
 
+ --  Compute its dimensions now, rather than at the end
+ --  of resolution, because in the case of multidimensional
+ --  aggregates subsequent expansion may lead to spurious
+ --  errors.
+
+ Check_Expression_Dimensions (Expr, Component_Typ);
+
  --  If the expression is a literal, propagate this info
  --  to the expression in the association, to enable some
  --  optimizations downstream.
Index: sem_dim.adb
===
--- sem_dim.adb (revision 235481)
+++ sem_dim.adb (working copy)
@@ -6,7 +6,7 @@
 --  --
 -- B o d y  --
 --  --
---  Copyright (C) 2011-2015, Free Software Foundation, Inc. --
+--  Copyright (C) 2011-2016, Free Software Foundation, Inc. --
 --  --
 -- GNAT is free software;  you can  redistribute it  and/or modify it under --
 -- terms of the  GNU General Public License as published  by the Free Soft- --
@@ -1235,10 +1235,12 @@
  --  since it may not be decorated at this point. We also don't want to
  --  issue the same error message multiple times on the same expression
  --  (may happen when an aggregate is converted into a positional
- --  aggregate).
+ --  aggregate). We also must verify that this is a scalar component,
+ --  and not a subaggregate of a multidimensional aggregate.
 
  if Comes_From_Source (Original_Node (Expr))
and then Present (Etype (Expr))
+   and then Is_Numeric_Type (Etype (Expr))
and then Dimensions_Of (Expr) /= Dims_Of_Comp_Typ
and then Sloc (Comp) /= Sloc (Prev (Comp))
  then
@@ -2270,6 +2272,27 @@
   end case;
end Analyze_Dimension_Unary_Op;
 
+   -
+   -- Check_Expression_Dimensions --
+   -
+
+   procedure Check_Expression_Dimensions
+  (Expr : Node_Id;
+   Typ  : Entity_Id)
+   is
+   begin
+  if Is_Floating_Point_Type (Etype (Expr)) then
+ Analyze_Dimension (Expr);
+
+ if Dimensions_Of (Expr) /= Dimensions_Of (Typ) then
+Error_Msg_N ("dimensions mismatch in array aggregate", Expr);
+Error_Msg_N
+  ("\expected dimension " & Dimensions_Msg_Of (Typ)
+   & ", found " & Dimensions_Msg_Of (Expr), Expr);
+ end if;
+  end if;

[Ada] Missing error on illegal use of volatile object

2016-04-27 Thread Arnaud Charlet
This patch updates the resolution of actual parameters to flag all effectively
volatile objects with enabled property Async_Writers or Effective_Reads which
appear in the actual as illegal because the context is interfering.


-- Source --


--  volatiles.ads

package Volatiles with SPARK_Mode is
   type Vol_Int is new Integer with Volatile;

   function Vol_Func_1 (Obj : Vol_Int) return Vol_Int
 with Volatile_Function;

   function Vol_Func_2 (Obj : Vol_Int) return Boolean
 with Volatile_Function;

   Obj : Vol_Int := 0;

   Error_1 : Vol_Int := Obj - Obj;--  Error
   Error_2 : Vol_Int := Vol_Func_1 (1 + Obj); --  Error
   Error_3 : Boolean := Vol_Func_2 (1 + Vol_Func_1 (1 + Obj));--  Error
end Volatiles;


-- Compilation and output --


$ gcc -c volatiles.ads
volatiles.ads:12:25: volatile object cannot appear in this context (SPARK RM
  7.1.3(12))
volatiles.ads:12:31: volatile object cannot appear in this context (SPARK RM
  7.1.3(12))
volatiles.ads:13:41: volatile object cannot appear in this context (SPARK RM
  7.1.3(11))
volatiles.ads:14:57: volatile object cannot appear in this context (SPARK RM
  7.1.3(11))

Tested on x86_64-pc-linux-gnu, committed on trunk

2016-04-27  Hristian Kirtchev  

* sem_res.adb (Flag_Effectively_Volatile_Objects): New routine.
(Resolve_Actuals): Flag effectively volatile objects with enabled
property Async_Writers or Effective_Reads as illegal.
* sem_util.adb (Is_OK_Volatile_Context): Comment reformatting.

Index: sem_res.adb
===
--- sem_res.adb (revision 235481)
+++ sem_res.adb (working copy)
@@ -3107,6 +3107,10 @@
   --  interpretation, but the form of the actual can only be determined
   --  once the primitive operation is identified.
 
+  procedure Flag_Effectively_Volatile_Objects (Expr : Node_Id);
+  --  Emit an error concerning the illegal usage of an effectively volatile
+  --  object in interfering context (SPARK RM 7.13(12)).
+
   procedure Insert_Default;
   --  If the actual is missing in a call, insert in the actuals list
   --  an instance of the default expression. The insertion is always
@@ -3360,6 +3364,55 @@
  end if;
   end Check_Prefixed_Call;
 
+  ---
+  -- Flag_Effectively_Volatile_Objects --
+  ---
+
+  procedure Flag_Effectively_Volatile_Objects (Expr : Node_Id) is
+ function Flag_Object (N : Node_Id) return Traverse_Result;
+ --  Determine whether arbitrary node N denotes an effectively volatile
+ --  object and if it does, emit an error.
+
+ -
+ -- Flag_Object --
+ -
+
+ function Flag_Object (N : Node_Id) return Traverse_Result is
+Id : Entity_Id;
+
+ begin
+--  Do not consider nested function calls because they have already
+--  been processed during their own resolution.
+
+if Nkind (N) = N_Function_Call then
+   return Skip;
+
+elsif Is_Entity_Name (N) and then Present (Entity (N)) then
+   Id := Entity (N);
+
+   if Is_Object (Id)
+ and then Is_Effectively_Volatile (Id)
+ and then (Async_Writers_Enabled (Id)
+or else Effective_Reads_Enabled (Id))
+   then
+  Error_Msg_N
+("volatile object cannot appear in this context (SPARK "
+ & "RM 7.1.3(11))", N);
+  return Skip;
+   end if;
+end if;
+
+return OK;
+ end Flag_Object;
+
+ procedure Flag_Objects is new Traverse_Proc (Flag_Object);
+
+  --  Start of processing for Flag_Effectively_Volatile_Objects
+
+  begin
+ Flag_Objects (Expr);
+  end Flag_Effectively_Volatile_Objects;
+
   
   -- Insert_Default --
   
@@ -3461,7 +3514,6 @@
 then
Set_Is_Controlling_Actual (Actval);
 end if;
-
  end if;
 
  --  If the default expression raises constraint error, then just
@@ -4473,10 +4525,8 @@
 --  they are not standard Ada legality rule. Internally generated
 --  temporaries are ignored.
 
-if SPARK_Mode = On
-  and then Comes_From_Source (A)
-  and then Is_Effectively_Volatile_Object (A)
-then
+if SPARK_Mode = On and then Comes_From_Source (A) then
+
--  An effectively volatile object may act as an actual when the
--  corresponding formal is of a non-scalar effectively volatile
--  type (SPARK RM 7.1.3(11

Re: [PATCH] Verify that context of local DECLs is the current function

2016-04-27 Thread Richard Biener
On Wed, Apr 27, 2016 at 2:23 PM, Martin Jambor  wrote:
> Hi,
>
> On Tue, Apr 26, 2016 at 10:58:22AM +0200, Richard Biener wrote:
>> On Mon, Apr 25, 2016 at 3:22 PM, Martin Jambor  wrote:
>> > Hi,
>> >
>> > the patch below moves an assert from expand_expr_real_1 to gimple
>> > verification.  It triggers when we do a sloppy job outlining stuff
>> > from one function to another (or perhaps inlining too) and leave in
>> > the IL of a function a local declaration that belongs to a different
>> > function.
>> >
>> > Like I wrote above, such cases usually ICE in expand anyway, but I
>> > think it is worth bailing out sooner, if nothing because bugs like PR
>> > 70348 would not be assigned to me ;-) ...well, actually, I found this
>> > helpful when working on OpenMP gridification.
>> >
>> > In the process, I think that the verifier would not catch a
>> > SSA_NAME_IN_FREE_LIST when such an SSA_NAME is a base of a MEM_REF so
>> > I added that check too.
>> >
>> > Bootstrapped and tested on x86_64-linux, OK for trunk?
>> >
>> > Thanks,
>> >
>> > Martin
>> >
>> >
>> >
>> > 2016-04-21  Martin Jambor  
>> >
>> > * tree-cfg.c (verify_var_parm_result_decl): New function.
>> > (verify_address): Call it on PARM_DECL bases.
>> > (verify_expr): Likewise, also verify SSA_NAME bases of MEM_REFs.
>> > ---
>> >  gcc/tree-cfg.c | 47 +++
>> >  1 file changed, 47 insertions(+)
>> >
>> > diff --git a/gcc/tree-cfg.c b/gcc/tree-cfg.c
>> > index 3385164..c917967 100644
>> > --- a/gcc/tree-cfg.c
>> > +++ b/gcc/tree-cfg.c
>> > @@ -2764,6 +2764,23 @@ gimple_split_edge (edge edge_in)
>> >return new_bb;
>> >  }
>> >
>> > +/* Verify that a VAR, PARM_DECL or RESULT_DECL T is from the current 
>> > function,
>> > +   and if not, return true.  If it is, return false.  */
>> > +
>> > +static bool
>> > +verify_var_parm_result_decl (tree t)
>> > +{
>> > +  tree context = decl_function_context (t);
>> > +  if (context != cfun->decl
>> > +  && !SCOPE_FILE_SCOPE_P (context)
>> > +  && !TREE_STATIC (t)
>> > +  && !DECL_EXTERNAL (t))
>> > +{
>> > +  error ("Local declaration from a different function");
>> > +  return true;
>> > +}
>> > +  return NULL;
>> > +}
>> >
>> >  /* Verify properties of the address expression T with base object BASE.  
>> > */
>> >
>> > @@ -2798,6 +2815,8 @@ verify_address (tree t, tree base)
>> > || TREE_CODE (base) == RESULT_DECL))
>> >  return NULL_TREE;
>> >
>> > +  if (verify_var_parm_result_decl (base))
>> > +return base;
>>
>> Is that necessary?  We recurse after all, so ...
>>
>> >if (DECL_GIMPLE_REG_P (base))
>> >  {
>> >error ("DECL_GIMPLE_REG_P set on a variable with address taken");
>> > @@ -2834,6 +2853,13 @@ verify_expr (tree *tp, int *walk_subtrees, void 
>> > *data ATTRIBUTE_UNUSED)
>> > }
>> >break;
>> >
>> > +case PARM_DECL:
>> > +case VAR_DECL:
>> > +case RESULT_DECL:
>> > +  if (verify_var_parm_result_decl (t))
>> > +   return t;
>> > +  break;
>> > +
>>
>> ... should apply.
>
> I made that happen (see below)...
>
>>
>> >  case INDIRECT_REF:
>> >error ("INDIRECT_REF in gimple IL");
>> >return t;
>> > @@ -2852,9 +2878,25 @@ verify_expr (tree *tp, int *walk_subtrees, void 
>> > *data ATTRIBUTE_UNUSED)
>> >   error ("invalid offset operand of MEM_REF");
>> >   return TREE_OPERAND (t, 1);
>> > }
>> > +  if (TREE_CODE (x) == SSA_NAME)
>> > +   {
>> > + if (SSA_NAME_IN_FREE_LIST (x))
>> > +   {
>> > + error ("SSA name in freelist but still referenced");
>> > + return x;
>> > +   }
>> > + if (SSA_NAME_VAR (x))
>> > +   x = SSA_NAME_VAR (x);;
>> > +   }
>> > +  if ((TREE_CODE (x) == PARM_DECL
>> > +  || TREE_CODE (x) == VAR_DECL
>> > +  || TREE_CODE (x) == RESULT_DECL)
>> > + && verify_var_parm_result_decl (x))
>> > +   return x;
>>
>> please instead try removing *walk_subtrees = 0 ...
>
> That unfortunately leads to the verifier complaining that DECLs which
> are in ADDR_EXPRs are not marked as addressable.  So I changed the
> code below
>
>>
>> >if (TREE_CODE (x) == ADDR_EXPR
>> >   && (x = verify_address (x, TREE_OPERAND (x, 0
>> > return x;
>
> to
>
>   if (TREE_CODE (x) == ADDR_EXPR)
> {
>   tree va = verify_address (x, TREE_OPERAND (x, 0));
>   if (va)
> return va;
>   x = TREE_OPERAND (x, 0);
> }
>   walk_tree (&x, verify_expr, data, NULL);
>   *walk_subtrees = 0;
>   break;
>
>>
>> ... we only get some slight duplicate address verification here
>> (this copy is stronger than the ADDR_EXPR case).
>>
>> > +
>> >*walk_subtrees = 0;
>> >break;
>> >
>> > @@ -3010,6 +3052,11 @@ verify_expr (tree *tp, int *walk_subtrees, void 
>> > *data ATTRIBUTE_UNUSED)
>> >
>> >   t = TREE_OPER

Re: [PATCH] operand_equal_p checking (PR sanitizer/70683)

2016-04-27 Thread Richard Biener
On Wed, 27 Apr 2016, Jakub Jelinek wrote:

> On Tue, Apr 26, 2016 at 03:02:38PM +0200, Jakub Jelinek wrote:
> > The debugging hack is too ugly and slows down the compiler (by artificially
> > increasing number of collisions), so it is not appropriate, but perhaps we
> > can add some internal only use OEP_* flag, pass it to the recursive calls
> > of operand_equal_p and if not set and flag_checking, verify
> > iterative_hash_expr equality in the outermost call).
> 
> Here is the corresponding checking patch.  It uncovered two further issues
> in the tree.[ch] patch which I'm going to post momentarily.
> Both patches together bootstrapped/regtested on x86_64-linux and i686-linux,
> ok for trunk?

Ok.

Thanks,
Richard.
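
The invariant being checked (operands that compare equal must hash equal,
verified only at the outermost call) can be modelled with a toy expression
type. This is a minimal sketch, assuming the hypothetical names `expr`,
`expr_hash`, `expr_equal_p` and `EQ_NO_HASH_CHECK`; none of these are GCC
APIs, and unlike the patch the check here is unconditional rather than
guarded by flag_checking.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical internal flag, playing the role of OEP_NO_HASH_CHECK.  */
enum { EQ_NO_HASH_CHECK = 1 };

typedef struct expr
{
  int code;                 /* operation or leaf kind */
  int value;                /* payload for leaves */
  const struct expr *lhs;   /* operands, NULL when absent */
  const struct expr *rhs;
} expr;

/* Structural hash: depends only on fields that expr_equal_p compares.  */
unsigned
expr_hash (const expr *e)
{
  if (e == NULL)
    return 0;
  unsigned h = (unsigned) e->code * 31u + (unsigned) e->value;
  h = h * 31u + expr_hash (e->lhs);
  h = h * 31u + expr_hash (e->rhs);
  return h;
}

/* Structural equality.  The outermost call (flag not set) re-runs the
   comparison with the flag set, so the recursion does no hashing, and then
   asserts that equal operands hash equal.  */
int
expr_equal_p (const expr *a, const expr *b, unsigned flags)
{
  if (!(flags & EQ_NO_HASH_CHECK))
    {
      if (expr_equal_p (a, b, flags | EQ_NO_HASH_CHECK))
        {
          assert (expr_hash (a) == expr_hash (b));
          return 1;
        }
      return 0;
    }

  if (a == b)
    return 1;
  if (a == NULL || b == NULL)
    return 0;
  return a->code == b->code
         && a->value == b->value
         && expr_equal_p (a->lhs, b->lhs, flags)
         && expr_equal_p (a->rhs, b->rhs, flags);
}
```

Passing the flag down keeps the cost to one extra hash of each operand per
outermost comparison, instead of hashing at every level of the recursion.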

> 2016-04-27  Jakub Jelinek  
> 
>   PR sanitizer/70683
>   * tree-core.h (enum operand_equal_flag): Add OEP_NO_HASH_CHECK.
>   * fold-const.c (operand_equal_p): If flag_checking and
>   OEP_NO_HASH_CHECK is not set in flag, recurse with OEP_NO_HASH_CHECK
>   and if it returns non-zero, assert iterative_hash_expr on both
>   args is the same.
> 
> --- gcc/tree-core.h.jj2016-04-22 18:21:55.0 +0200
> +++ gcc/tree-core.h   2016-04-26 17:47:19.875753297 +0200
> @@ -765,7 +765,9 @@ enum operand_equal_flag {
>OEP_ONLY_CONST = 1,
>OEP_PURE_SAME = 2,
>OEP_MATCH_SIDE_EFFECTS = 4,
> -  OEP_ADDRESS_OF = 8
> +  OEP_ADDRESS_OF = 8,
> +  /* Internal within operand_equal_p:  */
> +  OEP_NO_HASH_CHECK = 16
>  };
>  
>  /* Enum and arrays used for tree allocation stats.
> --- gcc/fold-const.c.jj   2016-04-22 18:21:32.0 +0200
> +++ gcc/fold-const.c  2016-04-26 18:30:40.919080701 +0200
> @@ -2749,6 +2749,25 @@ combine_comparisons (location_t loc,
>  int
>  operand_equal_p (const_tree arg0, const_tree arg1, unsigned int flags)
>  {
> +  /* When checking, verify at the outermost operand_equal_p call that
> + if operand_equal_p returns non-zero then ARG0 and ARG1 has the same
> + hash value.  */
> +  if (flag_checking && !(flags & OEP_NO_HASH_CHECK))
> +{
> +  if (operand_equal_p (arg0, arg1, flags | OEP_NO_HASH_CHECK))
> + {
> +   inchash::hash hstate0 (0), hstate1 (0);
> +   inchash::add_expr (arg0, hstate0, flags);
> +   inchash::add_expr (arg1, hstate1, flags);
> +   hashval_t h0 = hstate0.end ();
> +   hashval_t h1 = hstate1.end ();
> +   gcc_assert (h0 == h1);
> +   return 1;
> + }
> +  else
> + return 0;
> +}
> +
>/* If either is ERROR_MARK, they aren't equal.  */
>if (TREE_CODE (arg0) == ERROR_MARK || TREE_CODE (arg1) == ERROR_MARK
>|| TREE_TYPE (arg0) == error_mark_node
> 
> 
>   Jakub
> 
> 

-- 
Richard Biener 
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 
21284 (AG Nuernberg)


[Ada] Missing error on classwide precondition on a generic subprogram

2016-04-27 Thread Arnaud Charlet
A generic subprogram is never a primitive operation, and thus a classwide
condition for it is not legal. This patch diagnoses such an illegal class-
wide condition properly.

Example in ACATS test B611003

Tested on x86_64-pc-linux-gnu, committed on trunk

2016-04-27  Ed Schonberg  

* sem_prag.adb (Analyze_Pre_Post_Condition_In_Decl_Part):
A generic subprogram is never a primitive operation, and thus
a classwide condition for it is not legal.

Index: sem_prag.adb
===
--- sem_prag.adb(revision 235481)
+++ sem_prag.adb(working copy)
@@ -23319,11 +23319,12 @@
   if Class_Present (N) then
 
  --  Verify that a class-wide condition is legal, i.e. the operation is
- --  a primitive of a tagged type.
+ --  a primitive of a tagged type. Note that a generic subprogram is
+ --  not a primitive operation.
 
  Disp_Typ := Find_Dispatching_Type (Spec_Id);
 
- if No (Disp_Typ) then
+ if No (Disp_Typ) or else Is_Generic_Subprogram (Spec_Id) then
 Error_Msg_Name_1 := Original_Aspect_Pragma_Name (N);
 
 if From_Aspect_Specification (N) then


[Ada] Fix runtime build failure on vxworks 653 2.5

2016-04-27 Thread Arnaud Charlet
The current conditional compilation directives for vxworks
lead to a call with a single argument on some versions, and
to a call with two arguments on others.

We currently end up in the single argument case for all
versions of vxworks 653. This was fine for e.g. 2.2. This
isn't fine any more with 2.5, where the underlying mkdir
implementation was changed to expect the second argument.

This change reworks the code so we always issue a call with
two arguments, the second one to be ignored by implementations
that don't expect it.
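
The idea of calling through a pointer of the POSIX function type can be
sketched on an ordinary POSIX host. This is an illustration only:
`posix_mkdir_fn` and `make_dir` are hypothetical names, and on a POSIX
system the declared type of `mkdir` already matches, so no cast is needed
here. The cast in the patch is what copes with VxWorks headers that declare
a one-argument `mkdir`; the patch relies on the extra argument being ignored
in that configuration.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Function type with the POSIX two-argument mkdir signature.  */
typedef int posix_mkdir_fn (const char *name, mode_t mode);

/* Create DIR_NAME with rwx permissions for user, group and others (still
   subject to the process umask).  Returns 0 on success, -1 on failure.  */
int
make_dir (const char *dir_name)
{
  assert (dir_name != NULL);

  /* On a POSIX host the types agree, so the address is taken directly.  */
  posix_mkdir_fn *do_mkdir = &mkdir;
  return do_mkdir (dir_name, S_IRWXU | S_IRWXG | S_IRWXO);
}
```

The caller side then always sees a single two-argument interface, whatever
the underlying system declares.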

Tested on x86_64-pc-linux-gnu, committed on trunk

2016-04-27  Olivier Hainque  

* mkdir.c (__gnat_mkdir): Rework the vxworks section to use a
consistent posix interface on the caller side.

Index: mkdir.c
===
--- mkdir.c (revision 235481)
+++ mkdir.c (working copy)
@@ -6,7 +6,7 @@
  *  *
  *  C Implementation File   *
  *  *
- * Copyright (C) 2002-2014, Free Software Foundation, Inc.  *
+ * Copyright (C) 2002-2016, Free Software Foundation, Inc.  *
  *  *
  * GNAT is free software;  you can  redistribute it  and/or modify it under *
  * terms of the  GNU General Public License as published  by the Free Soft- *
@@ -60,8 +60,18 @@
 int
 __gnat_mkdir (char *dir_name, int encoding ATTRIBUTE_UNUSED)
 {
-#if defined (__vxworks) && !(defined (__RTP__) && ((_WRS_VXWORKS_MAJOR == 7) 
|| (_WRS_VXWORKS_MINOR != 0)))
-  return mkdir (dir_name);
+#if defined (__vxworks)
+
+  /* Pretend that the system mkdir is posix compliant even though it
+ sometimes is not, not expecting the second argument in some
+ configurations (e.g. vxworks 653 2.2, difference from 2.5). The
+ second actual argument will just be ignored in this case.  */
+
+  typedef int posix_mkdir (const char * name, mode_t mode);
+
+  posix_mkdir * vxmkdir = (posix_mkdir *)&mkdir;
+  return vxmkdir (dir_name, S_IRWXU | S_IRWXG | S_IRWXO);
+
 #elif defined (__MINGW32__)
   TCHAR wname [GNAT_MAX_PATH_LEN + 2];
 


[Ada] Crash on illegal use of limited view of type

2016-04-27 Thread Arnaud Charlet
This patch fixes a compiler abort on an illegal program that attempts to
make use of the non-limited view of a type in the private part of a unit that
has a limited_private with_clause on the unit that declared the type.

Compiling unit_test05.adb must yield:

   unit_test05.ads:46:23: invalid use of type before its full declaration

---
limited private with with_private;   
package unit_test05 is

   type private_type is private;

   function public_fn( x : integer )  return private_type;
   --function public_fn( x : integer )  return integer;

PRIVATE
   function  private_fn( x : BOOLEAN := true )
 return with_private.small;

   type private_type is record
  f : with_private.small;
   end record;

end unit_test05;
--
private with  with_private;

package body unit_test05 is

   function public_fn( x : integer )  return private_type is
   --function public_fn( x : integer )  return integer is

  value : private_type;

   begin
  -- the body can see public declarations in with_private
  -- the initialization of W is private.
  --
  value.f := 5;
  IF with_private.W  THEN RETURN value; END IF;

  return value;  
  --return  x + 5;
   end public_fn;

   function private_fn( x : BOOLEAN := true )
return with_private.small is
   begin
  -- the body can see public declarations in with_private
  --
  IF with_private.z THEN RETURN 5; ELSE RETURN 7; END IF;
  --return  x + 5;
   end private_fn;
end unit_test05;
---
package with_private is

   package T1_Pkg is
  type T1 is tagged null record;
  procedure Prim_Proc (X : T1);

  T1_Var : T1;
  procedure Cw_Operand (X : T1'Class := T1_Var);
   end T1_Pkg;

   W : constant Boolean;
   Z : constant Boolean := TRUE;

   type small is new integer;

private
   W : constant Boolean := true;

end with_private;

Tested on x86_64-pc-linux-gnu, committed on trunk

2016-04-27  Ed Schonberg  

* sem_ch10.adb (Build_Limited_View, Decorate_Type): If this
is a limited view of a type, initialize the Limited_Dependents
field to catch misuses of the type in a client unit.

Index: sem_ch10.adb
===
--- sem_ch10.adb(revision 235482)
+++ sem_ch10.adb(working copy)
@@ -84,6 +84,13 @@
--  required in order to avoid passing non-decorated entities to the
--  back-end. Implements Ada 2005 (AI-50217).
 
+   procedure Analyze_Proper_Body (N : Node_Id; Nam : Entity_Id);
+   --  Common processing for all stubs (subprograms, tasks, packages, and
+   --  protected cases). N is the stub to be analyzed. Once the subunit name
+   --  is established, load and analyze. Nam is the non-overloadable entity
+   --  for which the proper body provides a completion. Subprogram stubs are
+   --  handled differently because they can be declarations.
+
procedure Check_Body_Needed_For_SAL (Unit_Name : Entity_Id);
--  Check whether the source for the body of a compilation unit must be
--  included in a standalone library.
@@ -203,13 +210,6 @@
procedure Unchain (E : Entity_Id);
--  Remove single entity from visibility list
 
-   procedure Analyze_Proper_Body (N : Node_Id; Nam : Entity_Id);
-   --  Common processing for all stubs (subprograms, tasks, packages, and
-   --  protected cases). N is the stub to be analyzed. Once the subunit name
-   --  is established, load and analyze. Nam is the non-overloadable entity
-   --  for which the proper body provides a completion. Subprogram stubs are
-   --  handled differently because they can be declarations.
-
procedure sm;
--  A dummy procedure, for debugging use, called just before analyzing the
--  main unit (after dealing with any context clauses).
@@ -1489,7 +1489,7 @@
 
--  Check if the named package (or some ancestor)
--  leaves visible the full-view of the unit given
-   --  in the limited-with clause
+   --  in the limited-with clause.
 
loop
   if Designate_Same_Unit (Lim_Unit_Name,
@@ -5633,16 +5633,20 @@
 
   begin
  --  An unanalyzed type or a shadow entity of a type is treated as an
- --  incomplete type.
+ --  incomplete type, and carries the corresponding attributes.
 
- Set_Ekind (Ent, E_Incomplete_Type);
- Set_Etype (Ent, Ent);
- Set_Full_View (Ent, Empty);
- Set_Is_First_Subtype  (Ent);
- Set_Scope (Ent, Scop);
- Set_Stored_Constraint (Ent, No_Elist);
- Init_Size_Align   (Ent);
+ Set_Ekind  (Ent, E_Incomplete_Type);
+ Set_Etype  (Ent, Ent);
+ Set_Full_View  (Ent, Empty);
+ Set_Is_First_Subtype   (Ent);
+ Set_Scope  (Ent, Scop);
+ Set_Stored_Co

[Ada] Incomplete xref information in ALI file

2016-04-27 Thread Arnaud Charlet
This patch fixes the handling of an object declaration whose type definition is
a class-wide subtype and whose expression is a function call that returns a
classwide type. Previous to this patch the type of the object in the ALI file
appeared as the corresponding base type.

Executing the following:

   gcc -c vars.ads
   grep Some_Var vars.ali

must yield:

4c4*Some_Var{1|3C12}

---

with A;
package Vars is
   Some_Var : A.Base_Type := A.Foo;

   subtype T is Integer;
   V2 : T;
end Vars;
---
package A is
   type Root_Type is tagged null record;
   subtype Base_Type is Root_Type'Class;

   function Foo return Base_Type;
end A;

Tested on x86_64-pc-linux-gnu, committed on trunk

2016-04-27  Ed Schonberg  

* lib-xref.adb (Get_Type_Reference): Handle properly the case
of an object declaration whose type definition is a class-wide
subtype and whose expression is a function call that returns a
classwide type.

Index: lib-xref.adb
===
--- lib-xref.adb(revision 235481)
+++ lib-xref.adb(working copy)
@@ -6,7 +6,7 @@
 --  --
 -- B o d y  --
 --  --
---  Copyright (C) 1998-2015, Free Software Foundation, Inc. --
+--  Copyright (C) 1998-2016, Free Software Foundation, Inc. --
 --  --
 -- GNAT is free software;  you can  redistribute it  and/or modify it under --
 -- terms of the  GNU General Public License as published  by the Free Soft- --
@@ -1467,17 +1467,23 @@
--  initialized with a tag-indeterminate call gets a subtype
--  of the classwide type during expansion. See if the original
--  type in the declaration is named, and return it instead
-   --  of going to the root type.
+   --  of going to the root type. The expression may be a class-
+   --  wide function call whose result is on the secondary stack,
+   --  which forces the declaration to be rewritten as a renaming,
+   --  so examine the source declaration.
 
-   if Ekind (Tref) = E_Class_Wide_Subtype
- and then Nkind (Parent (Ent)) = N_Object_Declaration
- and then
-   Nkind (Original_Node (Object_Definition (Parent (Ent))))
- = N_Identifier
-   then
-  Tref :=
-Entity
-  (Original_Node ((Object_Definition (Parent (Ent)))));
+   if Ekind (Tref) = E_Class_Wide_Subtype then
+  declare
+ Decl : constant Node_Id := Original_Node (Parent (Ent));
+  begin
+ if Nkind (Decl) = N_Object_Declaration
+   and then Is_Entity_Name
+ (Original_Node ((Object_Definition (Decl))))
+ then
+Tref :=
+ Entity ((Original_Node ((Object_Definition (Decl)))));
+ end if;
+  end;
end if;
 
 --  For anything else, exit


Re: [PATCH] PR target/70155: Use SSE for TImode load/store

2016-04-27 Thread H.J. Lu
On Wed, Apr 27, 2016 at 5:03 AM, Uros Bizjak  wrote:
> On Tue, Apr 26, 2016 at 9:50 PM, H.J. Lu  wrote:
>
>>> Here is the updated patch which does that.  Ok for trunk if there
>>> are no regressions on x86-64?
>>>
>>
>> CSE works with SSE constants now.  Here is the updated patch.
>> OK for trunk if there are no regressions on x86-64?
>
> +static bool
> +timode_scalar_to_vector_candidate_p (rtx_insn *insn)
> +{
> +  rtx def_set = single_set (insn);
> +
> +  if (!def_set)
> +return false;
> +
> +  if (has_non_address_hard_reg (insn))
> +return false;
> +
> +  rtx src = SET_SRC (def_set);
> +  rtx dst = SET_DEST (def_set);
> +
> +  /* Only TImode load and store are allowed.  */
> +  if (GET_MODE (dst) != TImode)
> +return false;
> +
> +  if (MEM_P (dst))
> +{
> +  /* Check for store.  Only support store from register or standard
> + SSE constants.  */
> +  switch (GET_CODE (src))
> + {
> + default:
> +  return false;
> +
> + case REG:
> +  /* For store from register, memory must be aligned or both
> + unaligned load and store are optimal.  */
> +  return (!misaligned_operand (dst, TImode)
> +  || (TARGET_SSE_UNALIGNED_LOAD_OPTIMAL
> +  && TARGET_SSE_UNALIGNED_STORE_OPTIMAL));
>
> Why check TARGET_SSE_UNALIGNED_LOAD_OPTIMAL here? We are moving from a
> register here.
>
> + case CONST_INT:
> +  /* For store from standard SSE constant, memory must be
> + aligned or unaligned store is optimal.  */
> +  return (standard_sse_constant_p (src, TImode)
> +  && (!misaligned_operand (dst, TImode)
> +  || TARGET_SSE_UNALIGNED_STORE_OPTIMAL));
> + }
> +}
> +  else if (MEM_P (src))
> +{
> +  /* Check for load.  Memory must be aligned or both unaligned
> + load and store are optimal.  */
> +  return (GET_CODE (dst) == REG
> +  && (!misaligned_operand (src, TImode)
> +  || (TARGET_SSE_UNALIGNED_LOAD_OPTIMAL
> +  && TARGET_SSE_UNALIGNED_STORE_OPTIMAL)));
>
> Also here. We are loading a register, so there is no point to check
> TARGET_SSE_UNALIGNED_STORE_OPTIMAL.
>
> +}
> +
> +  return false;
> +}
> +
>
> +/* Convert INSN from TImode to V1TImode.  */
> +
> +void
> +timode_scalar_chain::convert_insn (rtx_insn *insn)
> +{
> +  rtx def_set = single_set (insn);
> +  rtx src = SET_SRC (def_set);
> +  rtx tmp;
> +  rtx dst = SET_DEST (def_set);
>
> No need for tmp declaration above ...
>
> +  switch (GET_CODE (dst))
> +{
> +case REG:
> +  tmp = find_reg_equal_equiv_note (insn);
>
> ... if you declare it here ...
>
> +  if (tmp)
> + PUT_MODE (XEXP (tmp, 0), V1TImode);
>
> /* FALLTHRU */
>
> +case MEM:
> +  PUT_MODE (dst, V1TImode);
> +  break;
>
> +case CONST_INT:
> +  switch (standard_sse_constant_p (src, TImode))
> + {
> + case 1:
> +  src = CONST0_RTX (GET_MODE (dst));
> +  tmp = gen_reg_rtx (V1TImode);
> +  break;
> + case 2:
> +  src = CONSTM1_RTX (GET_MODE (dst));
> +  tmp = gen_reg_rtx (V1TImode);
> +  break;
> + default:
> +  gcc_unreachable ();
> + }
> +  if (NONDEBUG_INSN_P (insn))
> + {
>
> ... and here. Please generate temp register here.
>
> +  /* Since there are no instructions to store standard SSE
> + constant, temporary register usage is required.  */
> +  emit_conversion_insns (gen_rtx_SET (dst, tmp), insn);
> +  dst = tmp;
> + }
>
>
>/* This needs to be done at start up.  It's convenient to do it here.  */
>register_pass (&insert_vzeroupper_info);
> -  register_pass (&stv_info);
> +  register_pass (TARGET_64BIT ? &stv_info_64 : &stv_info_32);
>  }
>
> stv_info_timode and stv_info_dimode?
>

Here is the updated patch.  OK for trunk if there is no regression?

Thanks.

-- 
H.J.
From 8ae371520d6fa6c338ead544f296ab77fb07bf80 Mon Sep 17 00:00:00 2001
From: "H.J. Lu" 
Date: Mon, 7 Mar 2016 14:44:37 -0800
Subject: [PATCH] Extend STV pass to 64-bit mode

128-bit SSE load and store instructions can be used for load and store
of 128-bit integers if they are the only operations on 128-bit integers.
To convert load and store of 128-bit integers to 128-bit SSE load and
store, the original STV pass, which is designed to convert 64-bit integer
operations to SSE2 operations in 32-bit mode, is extended to 64-bit mode
in the following ways:

1. Class scalar_chain is turned into base class.  The 32-bit specific
member functions are moved to the new derived class, dimode_scalar_chain.
The new derived class, timode_scalar_chain, is added to convert load and
store of 128-bit integers to 128-bit SSE load and store.
2. Add the 64-bit version of scalar_to_vector_candidate_p and
remove_non_convertible_regs.  Only TImode load and store are allowed
for conversion.  If any instruction on the chain of dependent
instructions isn't a TImode load or store, the chain of instructions
won't be converted.
3. In 64-bit mode, we only convert from TImode to V1TImode, which have the
same size.  The difference is that only vector registers are allowed in
V1TImode, so that 128-bit SSE load and store instructions will be used
for load and store of 128-bit integers.
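
As a rough illustration of the alignment question raised in the review
above, here is a minimal standalone sketch in C.  It is not the code
from the patch: struct tuning, enum access_kind and sse_access_ok_p are
hypothetical names, and the two booleans merely stand in for
TARGET_SSE_UNALIGNED_LOAD_OPTIMAL and TARGET_SSE_UNALIGNED_STORE_OPTIMAL.
It assumes the reviewer's suggestion is followed, i.e. a load consults
only the load tuning flag and a store consults only the store tuning flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the two target tuning flags.  */
struct tuning
{
  bool unaligned_load_optimal;
  bool unaligned_store_optimal;
};

enum access_kind { ACCESS_LOAD, ACCESS_STORE };

/* Return true if a 128-bit memory access of kind KIND may be converted
   to an SSE access: either the memory is known to be aligned, or the
   unaligned form of that particular kind of access is cheap.  */
static bool
sse_access_ok_p (enum access_kind kind, bool aligned, const struct tuning *t)
{
  assert (t != NULL);

  if (aligned)
    return true;

  /* A load only cares about unaligned loads, a store only about
     unaligned stores.  */
  return (kind == ACCESS_LOAD
          ? t->unaligned_load_optimal
          : t->unaligned_store_optimal);
}
```

With a tuning where only unaligned loads are cheap, a misaligned load
would be accepted and a misaligned store rejected, while aligned
accesses are accepted either way.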

Re: [PATCH] PR target/70750: [6/7 Regression] Load and call no longer combined for indirect calls on x86

2016-04-27 Thread H.J. Lu
On Thu, Apr 21, 2016 at 11:29 AM, Uros Bizjak  wrote:
> On Thu, Apr 21, 2016 at 7:46 PM, H.J. Lu  wrote:
>> r231923 has
>>
>>  ;; Test for a valid operand for a call instruction.
>>  ;; Allow constant call address operands in Pmode only.
>>  (define_special_predicate "call_insn_operand"
>>(ior (match_test "constant_call_address_operand
>>  (op, mode == VOIDmode ? mode : Pmode)")
>> (match_operand 0 "call_register_no_elim_operand")
>> -   (and (not (match_test "TARGET_X32"))
>> -   (match_operand 0 "memory_operand"
>> +   (ior (and (not (match_test "TARGET_X32"))
>> +(match_operand 0 "sibcall_memory_operand"))
>>^^^ A typo.
>> +   (and (match_test "TARGET_X32 && Pmode == DImode")
>> +(match_operand 0 "GOT_memory_operand")
>>
>> "sibcall_memory_operand" should be "memory_operand".
>>
>> OK for trunk and 6 branch if there is no regression on x86-64?
>
> OK everywhere, but needs RM's approval for branch.

OK to backport for GCC 6 branch?

> Thanks,
> Uros.
>
>> H.J.
>> ---
>> gcc/
>>
>> PR target/70750
>> * config/i386/predicates.md (call_insn_operand): Replace
>> sibcall_memory_operand with memory_operand.
>>
>> gcc/testsuite/
>>
>> PR target/70750
>> * gcc.target/i386/pr70750-1.c: New test.
>> * gcc.target/i386/pr70750-2.c: Likewise.


-- 
H.J.


[Ada] wrong interface type conversion of in-out parameter

2016-04-27 Thread Arnaud Charlet
The compiler silently skips generating the code to perform a type
conversion when all of the following conditions occur: 1) the target
type of the type conversion is an access to a class-wide interface
type; 2) the type conversion is performed when passing an in-out
access type actual to a subprogram; and 3) in the declaration of the
called subprogram the type of that access to interface formal is
visible through a limited-with clause. After this patch the
following test compiles and executes correctly.

package Types is
   type Iface is interface;
   type Ref_Iface is access all Iface'Class;
   procedure Enter (Self : in Iface) is abstract;

   type Parent is abstract tagged null record;
   type Object is new Parent and Iface with null record;
   type Ref_Object is access all Object'Class;

   not overriding
   procedure Some_Primitive (Self : in Object);

   overriding
   procedure Enter (Self : in Object);
end;

with GNAT.IO;
package body Types is
   procedure Some_Primitive(Self : Object) is
  pragma Unreferenced (Self);
   begin
  GNAT.IO.Put_Line ("ERROR: wrong dispatching call");
   end;

   procedure Enter(Self : in Object) is 
  pragma Unreferenced (Self);
   begin
  GNAT.IO.Put("OK");
   end;
end;

limited with Types;  -- [3]
package Do_Test is
   procedure Test (The_Bar : in out Types.Ref_Iface); -- [2]
end;

with Types;
with GNAT.IO; use GNAT.IO;
package body Do_Test is
   procedure Test (The_Bar : in out Types.Ref_Iface) is
   begin
  The_Bar.Enter;
   end;
end;

with Types;
with Do_Test;
procedure Main is
   The_Pub : Types.Ref_Object := new Types.Object;
begin
   Do_Test.Test (Types.Ref_Iface(The_Pub)); -- [1]
end;

Command: gnatmake main.adb; ./main
 Output: OK

Tested on x86_64-pc-linux-gnu, committed on trunk

2016-04-27  Javier Miranda  

* exp_ch6.adb (Add_Call_By_Copy_Code,
Add_Simple_Call_By_Copy_Code, Expand_Actuals): Handle formals
whose type comes from the limited view.

Index: exp_ch6.adb
===
--- exp_ch6.adb (revision 235493)
+++ exp_ch6.adb (working copy)
@@ -1198,14 +1198,14 @@
   ---
 
   procedure Add_Call_By_Copy_Code is
+ Crep  : Boolean;
  Expr  : Node_Id;
+ F_Typ : Entity_Id := Etype (Formal);
+ Indic : Node_Id;
  Init  : Node_Id;
  Temp  : Entity_Id;
- Indic : Node_Id;
+ V_Typ : Entity_Id;
  Var   : Entity_Id;
- F_Typ : constant Entity_Id := Etype (Formal);
- V_Typ : Entity_Id;
- Crep  : Boolean;
 
   begin
  if not Is_Legal_Copy then
@@ -1214,6 +1214,14 @@
 
  Temp := Make_Temporary (Loc, 'T', Actual);
 
+ --  Handle formals whose type comes from the limited view
+
+ if From_Limited_With (F_Typ)
+   and then Has_Non_Limited_View (F_Typ)
+ then
+F_Typ := Non_Limited_View (F_Typ);
+ end if;
+
  --  Use formal type for temp, unless formal type is an unconstrained
  --  array, in which case we don't have to worry about bounds checks,
  --  and we use the actual type, since that has appropriate bounds.
@@ -1221,7 +1229,7 @@
  if Is_Array_Type (F_Typ) and then not Is_Constrained (F_Typ) then
 Indic := New_Occurrence_Of (Etype (Actual), Loc);
  else
-Indic := New_Occurrence_Of (Etype (Formal), Loc);
+Indic := New_Occurrence_Of (F_Typ, Loc);
  end if;
 
  if Nkind (Actual) = N_Type_Conversion then
@@ -1473,20 +1481,28 @@
   --
 
   procedure Add_Simple_Call_By_Copy_Code is
- Temp   : Entity_Id;
  Decl   : Node_Id;
+ F_Typ  : Entity_Id := Etype (Formal);
  Incod  : Node_Id;
+ Indic  : Node_Id;
+ Lhs: Node_Id;
  Outcod : Node_Id;
- Lhs: Node_Id;
  Rhs: Node_Id;
- Indic  : Node_Id;
- F_Typ  : constant Entity_Id := Etype (Formal);
+ Temp   : Entity_Id;
 
   begin
  if not Is_Legal_Copy then
 return;
  end if;
 
+ --  Handle formals whose type comes from the limited view
+
+ if From_Limited_With (F_Typ)
+   and then Has_Non_Limited_View (F_Typ)
+ then
+F_Typ := Non_Limited_View (F_Typ);
+ end if;
+
  --  Use formal type for temp, unless formal type is an unconstrained
  --  array, in which case we don't have to worry about bounds checks,
  --  and we use the actual type, since that has appropriate bounds.
@@ -1494,7 +1510,7 @@
  if Is_Array_Type (F_Typ) and then not Is_Constrained (F_Typ) then
 Indic := New_Occurrence_Of (Etype (Actual), Loc);
  else
-Indic := New_Occurrence_Of (Etype (Formal), Loc);
+Indic := New_Occurrence_Of (F_Typ, Loc);
  end if;
 
  --  Prepare to generat

[Ada] General handling of potential renamings in SPARK

2016-04-27 Thread Arnaud Charlet
This patch implements a mechanism for handling of renamings in SPARK. Since
SPARK cannot handle this form of aliasing, a reference to a renamed object is
replaced by a reference to the object itself.


-- Source --


--  regpat.ads

package Regpat is
   type Program_Data is array (Integer range <>) of Character;

   type Pattern_Matcher is record
  Program : Program_Data (1 .. 16) := (others => ASCII.NUL);
   end record;

   procedure Match (Self : in out Pattern_Matcher);
end Regpat;

--  regpat.adb

package body Regpat is
   type Opcode is (BRANCH);

   function "=" (Left : Character; Right : Opcode) return Boolean is
   begin
  return Character'Pos (Left) = Opcode'Pos (Right);
   end "=";

   procedure Match (Self : in out Pattern_Matcher) is
  Short_Program : Program_Data renames Self.Program;
   begin
  Short_Program (1) := ASCII.NUL;
  pragma Assert (Short_Program (1) /= BRANCH);
   end Match;
end Regpat;


-- Compilation and output --


$ gcc -c -gnatD -gnatd.F regpat.adb > regpat.dg
$ grep "self.program" regpat.dg | wc -l
3

Tested on x86_64-pc-linux-gnu, committed on trunk

2016-04-27  Hristian Kirtchev  

* exp_spark.adb (Expand_Potential_Renaming): Removed.
(Expand_SPARK): Update the call to expand a potential renaming.
(Expand_SPARK_Potential_Renaming): New routine.
* exp_spark.ads (Expand_SPARK_Potential_Renaming): New routine.
* sem.adb Add with and use clauses for Exp_SPARK.
(Analyze): Expand a non-overloaded potential renaming for SPARK.

Index: exp_spark.adb
===
--- exp_spark.adb   (revision 235481)
+++ exp_spark.adb   (working copy)
@@ -6,7 +6,7 @@
 --  --
 -- B o d y  --
 --  --
---  Copyright (C) 1992-2015, Free Software Foundation, Inc. --
+--  Copyright (C) 1992-2016, Free Software Foundation, Inc. --
 --  --
 -- GNAT is free software;  you can  redistribute it  and/or modify it under --
 -- terms of the  GNU General Public License as published  by the Free Soft- --
@@ -42,10 +42,6 @@
procedure Expand_SPARK_N_Object_Renaming_Declaration (N : Node_Id);
--  Perform name evaluation for a renamed object
 
-   procedure Expand_Potential_Renaming (N : Node_Id);
-   --  N denotes a N_Identifier or N_Expanded_Name. If N references a renaming,
-   --  replace N with the renamed object.
-
--
-- Expand_SPARK --
--
@@ -73,7 +69,7 @@
 
  when N_Expanded_Name |
   N_Identifier=>
-Expand_Potential_Renaming (N);
+Expand_SPARK_Potential_Renaming (N);
 
  when N_Object_Renaming_Declaration =>
 Expand_SPARK_N_Object_Renaming_Declaration (N);
@@ -116,41 +112,41 @@
   Evaluate_Name (Name (N));
end Expand_SPARK_N_Object_Renaming_Declaration;
 
-   ---
-   -- Expand_Potential_Renaming --
-   ---
+   -
+   -- Expand_SPARK_Potential_Renaming --
+   -
 
-   procedure Expand_Potential_Renaming (N : Node_Id) is
-  Id : constant Entity_Id  := Entity (N);
+   procedure Expand_SPARK_Potential_Renaming (N : Node_Id) is
   Loc: constant Source_Ptr := Sloc (N);
+  Ren_Id : constant Entity_Id  := Entity (N);
   Typ: constant Entity_Id  := Etype (N);
-  Ren_Id : Node_Id;
+  Obj_Id : Node_Id;
 
begin
   --  Replace a reference to a renaming with the actual renamed object
 
-  if Ekind (Id) in Object_Kind then
- Ren_Id := Renamed_Object (Id);
+  if Ekind (Ren_Id) in Object_Kind then
+ Obj_Id := Renamed_Object (Ren_Id);
 
- if Present (Ren_Id) then
+ if Present (Obj_Id) then
 
 --  The renamed object is an entity when instantiating generics
 --  or inlining bodies. In this case the renaming is part of the
 --  mapping "prologue" which links actuals to formals.
 
-if Nkind (Ren_Id) in N_Entity then
-   Rewrite (N, New_Occurrence_Of (Ren_Id, Loc));
+if Nkind (Obj_Id) in N_Entity then
+   Rewrite (N, New_Occurrence_Of (Obj_Id, Loc));
 
 --  Otherwise the renamed object denotes a name
 
 else
-   Rewrite (N, New_Copy_Tree (Ren_Id));
+   Rewrite (N, New_Copy_Tree (Obj_Id));
Reset_Analyzed_Flags (N);
 end if;
 
 Analyze_And_Resolve (N, Typ);
  end if;
   end if;
-   end Expand_Potential_Renami

[Ada] Detect singular matrices in Solve primitives for vectors and matrices.

2016-04-27 Thread Arnaud Charlet
This patch detects cases when the determinant of a system of equations is
exactly Zero, and raises the proper exception, as mandated by RM G.3.2 (68/2)
and G.3.2 (89/2).
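
For readers who want to see the idea outside of the GNAT sources, the
following is a minimal sketch in C of detecting a singular system
during elimination.  It is an assumption-laden illustration, not a
translation of Forward_Eliminate in s-gearop.adb: the names N,
abs_value and solve are hypothetical, the size is fixed at 3, and
failure is reported through the return value instead of an exception.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define N 3

/* Absolute value helper, to avoid depending on the math library.  */
static double
abs_value (double v)
{
  return v < 0.0 ? -v : v;
}

/* Solve A * X = B by Gaussian elimination with partial pivoting.
   A and B are overwritten.  Return false, leaving X unspecified, when
   the best available pivot is exactly zero, i.e. the matrix is
   singular.  */
static bool
solve (double a[N][N], double b[N], double x[N])
{
  assert (a != NULL && b != NULL && x != NULL);

  for (int col = 0; col < N; col++)
    {
      /* Pick the row with the largest entry in this column.  */
      int pivot = col;
      for (int row = col + 1; row < N; row++)
        if (abs_value (a[row][col]) > abs_value (a[pivot][col]))
          pivot = row;

      /* No usable pivot: the determinant is zero.  */
      if (a[pivot][col] == 0.0)
        return false;

      if (pivot != col)
        {
          for (int k = 0; k < N; k++)
            {
              double tmp = a[col][k];
              a[col][k] = a[pivot][k];
              a[pivot][k] = tmp;
            }
          double tmp = b[col];
          b[col] = b[pivot];
          b[pivot] = tmp;
        }

      /* Eliminate the column below the pivot.  */
      for (int row = col + 1; row < N; row++)
        {
          double factor = a[row][col] / a[col][col];
          for (int k = col; k < N; k++)
            a[row][k] -= factor * a[col][k];
          b[row] -= factor * b[col];
        }
    }

  /* Back substitution on the upper triangular system.  */
  for (int row = N - 1; row >= 0; row--)
    {
      double sum = b[row];
      for (int k = row + 1; k < N; k++)
        sum -= a[row][k] * x[k];
      x[row] = sum / a[row][row];
    }

  return true;
}
```

Called on the 3 x 3 matrix from the test in this message, whose second
column is all zeros, solve finds no non-zero pivot for that column and
returns false.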

Executing the following:

   gnatmake -q debug_solve.adb
   debug_solve

must yield:

System is:
 1.0E+00 x_1 +  0.0E+00 x_2 +  1.0E+00 x_3 =  1.0E+00
 2.0E+00 x_1 +  0.0E+00 x_2 +  2.0E+00 x_3 =  2.0E+00
 3.0E+00 x_1 +  0.0E+00 x_2 +  1.0E+00 x_3 =  2.0E+00
System is:

raised CONSTRAINT_ERROR : debug_solve.float_arrays.Instantiations.Solve:
   matrix is singular

---
with Ada.Numerics.Generic_Real_Arrays;
with Ada.Text_IO;

procedure debug_solve is

   package float_arrays is
  new Ada.Numerics.Generic_Real_Arrays (Real => Float);

   matrix : constant float_arrays.Real_Matrix (1 .. 3, 1 .. 3) :=
   ((1.0, 0.0, 1.0), (2.0, 0.0, 2.0), (3.0, 0.0, 1.0));
   rhs: constant float_arrays.Real_Vector (1 .. 3) := (1.0, 2.0, 2.0);
   solution : float_arrays.Real_Vector (1 .. 3);

begin
   Ada.Text_IO.Put_Line ("System is:");
   Ada.Text_IO.Put_Line
  (Float'Image (matrix (1, 1)) & " x_1 + " &
   Float'Image (matrix (1, 2)) & " x_2 + " &
   Float'Image (matrix (1, 3)) & " x_3 = " & Float'Image(rhs(1)));

   Ada.Text_IO.Put_Line (Float'Image (matrix (2, 1)) &
   " x_1 + " & Float'Image (matrix (2, 2)) &
   " x_2 + " & Float'Image (matrix (2, 3)) &
   " x_3 = " & Float'Image(rhs(2)));

   Ada.Text_IO.Put_Line (Float'Image (matrix (3, 1)) &
  " x_1 + " & Float'Image (matrix (3, 2)) &
  " x_2 + " & Float'Image (matrix (3, 3)) &
  " x_3 = " & Float'Image(rhs(3)));

   Ada.Text_IO.Put_Line ("System is:");
   solution := float_arrays.Solve (A => matrix, X => rhs);
   Ada.Text_IO.Put_Line ("Solution is x = (" & Float'Image (solution (1)) &
   ", " & Float'Image (solution (2)) &
   ", " & Float'Image (solution (3)) & ")");
end;

Tested on x86_64-pc-linux-gnu, committed on trunk

2016-04-27  Ed Schonberg  

* s-gearop.ads (Matrix_Vector_Solution, Matrix_Matrix_Solution):
Add scalar formal object Zero, to allow detection and report
when the matrix is singular.
* s-gearop.adb (Matrix_Vector_Solution, Matrix_Matrix_Solution):
Raise Constraint_Error if the Forward_Eliminate pass has
determined that determinant is Zero.
* a-ngrear.adb (Solve): Add actual for Zero in corresponding
instantiations.
* a-ngcoar.adb (Solve): Ditto.

Index: a-ngrear.adb
===
--- a-ngrear.adb(revision 235481)
+++ a-ngrear.adb(working copy)
@@ -6,7 +6,7 @@
 --  --
 -- B o d y  --
 --  --
---  Copyright (C) 2006-2012, Free Software Foundation, Inc. --
+--  Copyright (C) 2006-2016, Free Software Foundation, Inc. --
 --  --
 -- GNAT is free software;  you can  redistribute it  and/or modify it under --
 -- terms of the  GNU General Public License as published  by the Free Soft- --
@@ -337,10 +337,11 @@
Result_Matrix => Real_Matrix,
Operation => "abs");
 
-  function Solve is
- new Matrix_Vector_Solution (Real'Base, Real_Vector, Real_Matrix);
+  function Solve is new
+Matrix_Vector_Solution (Real'Base, 0.0, Real_Vector, Real_Matrix);
 
-  function Solve is new Matrix_Matrix_Solution (Real'Base, Real_Matrix);
+  function Solve is new
+Matrix_Matrix_Solution (Real'Base, 0.0, Real_Matrix);
 
   function Unit_Matrix is new
 Generic_Array_Operations.Unit_Matrix
Index: s-gearop.adb
===
--- s-gearop.adb(revision 235481)
+++ s-gearop.adb(working copy)
@@ -6,7 +6,7 @@
 --  --
 -- B o d y  --
 --  --
--- Copyright (C) 2006-2012, Free Software Foundation, Inc.  --
+-- Copyright (C) 2006-2016, Free Software Foundation, Inc.  --
 --  --
 -- GNAT is free software;  you can  redistribute it  and/or modify it under --
 -- terms of the  GNU General Public License as published  by the Free Soft- --
@@ -30,9 +30,7 @@
 --
 
 with Ada.Numerics; use Ada.Numerics;
-
 package body System.Generic_Array_Operations is
-
function Check_Unit_Last
  (Index : Integer;
   Or

[Ada] Spurious error on variable in private child package body

2016-04-27 Thread Arnaud Charlet
This patch modifies the contract freezing mechanism to suppress freezing when
the trigger is the body of an instantiated package. This effectively prevents
a spurious attempt to freeze because a) the body instantiation pass does not
have all semantic information and b) freezing has already taken place.


-- Source --


--  gp.ads

generic
   type T is private;

package GP with
   SPARK_Mode => On,
   Elaborate_Body,
   Abstract_State => (State,
  (Atomic_State with External))
is
end GP;

--  gp.adb

package body GP with
   SPARK_Mode => On,
   Refined_State => (State=> GG1,
 Atomic_State => GG2)
is
   GG1 : T;
   GG2 : T with Volatile;
end GP;

--  base.ads

package Base with
   SPARK_Mode => On
is
end Base;

--  base-a.ads

package Base.A with
   SPARK_Mode => On,
   Elaborate_Body,
   Abstract_State => (State,
  (Atomic_State with External))
is
end Base.A;

--  base-a.adb

with Base.A.B;

package body Base.A with
   SPARK_Mode => On,
   Refined_State => (State=> Base.A.B.State,
 Atomic_State => Base.A.B.Atomic_State)
is
end Base.A;

--  base-a-b.ads

private package Base.A.B with
   SPARK_Mode => On,
   Elaborate_Body,
   Abstract_State =>
 ((Statewith   Part_Of => Base.A.State),
  (Atomic_State with External, Part_Of => Base.A.Atomic_State))
is
end Base.A.B;

--  base-a-b.adb

with GP; pragma Elaborate_All (GP);

package body Base.A.B with
   SPARK_Mode => On,
   Refined_State =>
 (State=> (G1, P.State),
  Atomic_State => P.Atomic_State)
is
   G1 : Boolean := False;

   package P is new GP (T => Boolean);
end Base.A.B;

-
-- Compilation --
-

$ gcc -c base-a-b.adb

Tested on x86_64-pc-linux-gnu, committed on trunk

2016-04-27  Hristian Kirtchev  

* sem_ch7.adb (Analyze_Package_Body_Helper): The body of an
instantiated package should not cause freezing of previous contracts.

Index: sem_ch7.adb
===
--- sem_ch7.adb (revision 235481)
+++ sem_ch7.adb (working copy)
@@ -544,35 +544,6 @@
--  Start of processing for Analyze_Package_Body_Helper
 
begin
-  --  A [generic] package body "freezes" the contract of the nearest
-  --  enclosing package body and all other contracts encountered in the
-  --  same declarative part up to and excluding the package body:
-
-  --package body Nearest_Enclosing_Package
-  --  with Refined_State => (State => Constit)
-  --is
-  --   Constit : ...;
-
-  --   package body Freezes_Enclosing_Package_Body
-  -- with Refined_State => (State_2 => Constit_2)
-  --   is
-  --  Constit_2 : ...;
-
-  --  procedure Proc
-  --with Refined_Depends => (Input => (Constit, Constit_2)) ...
-
-  --  This ensures that any annotations referenced by the contract of a
-  --  [generic] subprogram body declared within the current package body
-  --  are available. This form of "freezing" is decoupled from the usual
-  --  Freeze_xxx mechanism because it must also work in the context of
-  --  generics where normal freezing is disabled.
-
-  --  Only bodies coming from source should cause this type of "freezing"
-
-  if Comes_From_Source (N) then
- Analyze_Previous_Contracts (N);
-  end if;
-
   --  Find corresponding package specification, and establish the current
   --  scope. The visible defining entity for the package is the defining
   --  occurrence in the spec. On exit from the package body, all body
@@ -628,6 +599,42 @@
  end if;
   end if;
 
+  --  A [generic] package body "freezes" the contract of the nearest
+  --  enclosing package body and all other contracts encountered in the
+  --  same declarative part up to and excluding the package body:
+
+  --package body Nearest_Enclosing_Package
+  --  with Refined_State => (State => Constit)
+  --is
+  --   Constit : ...;
+
+  --   package body Freezes_Enclosing_Package_Body
+  -- with Refined_State => (State_2 => Constit_2)
+  --   is
+  --  Constit_2 : ...;
+
+  --  procedure Proc
+  --with Refined_Depends => (Input => (Constit, Constit_2)) ...
+
+  --  This ensures that any annotations referenced by the contract of a
+  --  [generic] subprogram body declared within the current package body
+  --  are available. This form of "freezing" is decoupled from the usual
+  --  Freeze_xxx mechanism because it must also work in the context of
+  --  generics where normal freezing is disabled.
+
+  --  Only bodies coming from source should cause this type of "freezing".
+  --  Instantiated generic bodies are excluded because their processing is
+  --  performed in

Re: [PATCH] PR target/70750: [6/7 Regression] Load and call no longer combined for indirect calls on x86

2016-04-27 Thread Jakub Jelinek
On Wed, Apr 27, 2016 at 05:54:42AM -0700, H.J. Lu wrote:
> On Thu, Apr 21, 2016 at 11:29 AM, Uros Bizjak  wrote:
> > On Thu, Apr 21, 2016 at 7:46 PM, H.J. Lu  wrote:
> >> r231923 has
> >>
> >>  ;; Test for a valid operand for a call instruction.
> >>  ;; Allow constant call address operands in Pmode only.
> >>  (define_special_predicate "call_insn_operand"
> >>(ior (match_test "constant_call_address_operand
> >>  (op, mode == VOIDmode ? mode : Pmode)")
> >> (match_operand 0 "call_register_no_elim_operand")
> >> -   (and (not (match_test "TARGET_X32"))
> >> -   (match_operand 0 "memory_operand"
> >> +   (ior (and (not (match_test "TARGET_X32"))
> >> +(match_operand 0 "sibcall_memory_operand"))
> >>^^^ A typo.
> >> +   (and (match_test "TARGET_X32 && Pmode == DImode")
> >> +(match_operand 0 "GOT_memory_operand")
> >>
> >> "sibcall_memory_operand" should be "memory_operand".
> >>
> >> OK for trunk and 6 branch if there is no regression on x86-64?
> >
> > OK everywhere, but needs RM's approval for branch.
> 
> OK to backport for GCC 6 branch?

Okay.

Jakub


[Ada] Informational messages that are not warnings

2016-04-27 Thread Arnaud Charlet
Previously, an informational message that was not a warning was treated as an
error. This patch changes that, so such an informational message is not an
error. There are no such messages in the compiler, so no test is available.
This patch is mainly for use by SPARK, which wants to have informational
messages that are neither warnings nor errors.
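
The new counting rule in the errout.adb hunk of this patch can be
modeled in a few lines of C.  This is only a sketch: struct msg,
struct counts and count_message are hypothetical names, the flags
mirror the Info, Warn, Style and Check components used in the hunk, and
the final "else" branch (anything else counts as an error) is an
assumption, since that part of the routine is not shown in this
message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical mirror of the per-message flags used in the hunk.  */
struct msg
{
  bool info;
  bool warn;
  bool style;
  bool check;
};

struct counts
{
  int info_messages;
  int warnings_detected;
  int check_messages;
  int errors_detected;
};

/* Bump the appropriate statistics counts for message M.  */
static void
count_message (const struct msg *m, struct counts *c)
{
  assert (m != NULL && c != NULL);

  if (m->info)
    {
      c->info_messages++;

      /* Could be (usually is) both "info" and "warning".  */
      if (m->warn)
        c->warnings_detected++;
    }
  else if (m->warn || m->style)
    c->warnings_detected++;
  else if (m->check)
    c->check_messages++;
  else
    /* Assumption: everything else is counted as an error.  */
    c->errors_detected++;
}
```

Under this model an informational message that is not a warning bumps
only the info count, so it no longer shows up as an error.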

Tested on x86_64-pc-linux-gnu, committed on trunk

2016-04-27  Bob Duff  

* errout.ads: Document the fact that informational messages
don't have to be warnings.
* errout.adb (Error_Msg_Internal): In statistics counts, deal
correctly with informational messages that are not warnings.
(Error_Msg_NEL): Remove useless 'if' around Set_Posted, because
Set_Posted already checks for errors and ignores others.
* erroutc.adb (Prescan_Message): Set Is_Serious_Error to False
if Is_Info_Msg; the previous code was assuming that Is_Info_Msg
implies Is_Warning_Msg.
* errutil.adb (Error_Msg): In statistics counts, deal correctly
with informational messages that are not warnings.

Index: errout.adb
===
--- errout.adb  (revision 235481)
+++ errout.adb  (working copy)
@@ -6,7 +6,7 @@
 --  --
 -- B o d y  --
 --  --
---  Copyright (C) 1992-2015, Free Software Foundation, Inc. --
+--  Copyright (C) 1992-2016, Free Software Foundation, Inc. --
 --  --
 -- GNAT is free software;  you can  redistribute it  and/or modify it under --
 -- terms of the  GNU General Public License as published  by the Free Soft- --
@@ -1153,15 +1153,22 @@
  end if;
   end if;
 
-  --  Bump appropriate statistics count
+  --  Bump appropriate statistics counts
 
-  if Errors.Table (Cur_Msg).Warn or else Errors.Table (Cur_Msg).Style then
- Warnings_Detected := Warnings_Detected + 1;
+  if Errors.Table (Cur_Msg).Info then
+ Info_Messages := Info_Messages + 1;
 
- if Errors.Table (Cur_Msg).Info then
-Info_Messages := Info_Messages + 1;
+ --  Could be (usually is) both "info" and "warning"
+
+ if Errors.Table (Cur_Msg).Warn then
+Warnings_Detected := Warnings_Detected + 1;
  end if;
 
+  elsif Errors.Table (Cur_Msg).Warn
+or else Errors.Table (Cur_Msg).Style
+  then
+ Warnings_Detected := Warnings_Detected + 1;
+
   elsif Errors.Table (Cur_Msg).Check then
  Check_Messages := Check_Messages + 1;
 
@@ -1298,9 +1305,7 @@
  Last_Killed := True;
   end if;
 
-  if not (Is_Warning_Msg or Is_Style_Msg) then
- Set_Posted (N);
-  end if;
+  Set_Posted (N);
end Error_Msg_NEL;
 
--
@@ -3077,7 +3082,6 @@
 
begin
   if Is_Serious_Error then
-
  --  We always set Error_Posted on the node itself
 
  Set_Error_Posted (N);
Index: errout.ads
===
--- errout.ads  (revision 235481)
+++ errout.ads  (working copy)
@@ -6,7 +6,7 @@
 --  --
 -- S p e c  --
 --  --
---  Copyright (C) 1992-2015, Free Software Foundation, Inc. --
+--  Copyright (C) 1992-2016, Free Software Foundation, Inc. --
 --  --
 -- GNAT is free software;  you can  redistribute it  and/or modify it under --
 -- terms of the  GNU General Public License as published  by the Free Soft- --
@@ -324,7 +324,7 @@
--  "[restriction warning]" at the end of the warning message. For
--  continuations, use this on each continuation message.
 
-   --Insertion character ?$? (elaboration information messages)
+   --Insertion character ?$? (elaboration informational messages)
--  Like ?, but if the flag Warn_Doc_Switch is True, adds the string
--  "[-gnatel]" at the end of the info message. This is used for the
--  messages generated by the switch -gnatel. For continuations, use
@@ -419,12 +419,13 @@
--  message. Style messages are also considered to be warnings, but
--  they do not get a tag.
 
-   --Insertion sequence "info: " (information message)
+   --Insertion sequence "info: " (informational message)
--  This appears only at the start of the message (and not any of its
--  continuations, if any), and indicates that the message is an info
--  mes

[PATCH] Reduce nesting of parentheses in conditionals generated by genattrtab

2016-04-27 Thread Patrick Palka
On Fri, Mar 4, 2016 at 12:56 PM, Jakub Jelinek  wrote:
> On Fri, Mar 04, 2016 at 06:49:58PM +0100, Bernd Schmidt wrote:
>> On 03/04/2016 06:27 PM, Bernd Schmidt wrote:
>> >On 03/04/2016 06:14 PM, Patrick Palka wrote:
>> >
>> >>I just quickly tested building the generated insn-attrtab.c with and
>> >>without the patch using my host gcc 5.3 compiler and the .s output is
>> >>not the same.
>> >
>> >Hmm, looking at the 003t.original dump it looks like there are
>> >differences in SAVE_EXPRs. Indeed we seem to generate different code for
>> >
>> >int at;
>> >
>> >int foo ()
>> >{
>> >   if (at == 2 || at == 4 || at == 7)
>> > return 1;
>> >   return 0;
>> >}
>> >
>> >int bar ()
>> >{
>> >   if (at == 2 || (at == 4 || at == 7))
>> > return 1;
>> >   return 0;
>> >}
>>
>> Ahh... it's not just different placement of SAVE_EXPRs, it's actually a case
>> of TRUTH_ORIF_EXPR vs. TRUTH_OR_EXPR (the distinction is invisible in the
>> dumps), the latter being created by fold_range_test. That's a bit of a
>> broken optimization what with its inability to see more than two comparisons
>> at a time... we convert one ORIF per function, but a different one.
>
> I think we don't need to guarantee identical assembly, the reason I've
> suggested that was if it passed, it would be much easier to verify.
> Without that, I think it should be bootstrapped at least on one other
> target.  Note the cases you remove the parens aren't just || and &&, but
> most likely also | and & (at least there is some flag whether to print those
> as && or &).  And there is code for the caching of the attributes where the
> result is still usable, I believe the patch doesn't break that, but it
> wouldn't hurt to verify that.

Here's an updated patch that mentions that & and | are also affected.  And I
can't see how this change would possibly affect the attr caching stuff since it
just makes

  (a) OP ((b) OP (c))

get emitted as

  (a) OP (b) OP (c)

For OP = || or &&, the expressions a, b and c will still get evaluated left to
right.  And for OP = | or &, the order of evaluation of a, b and c remains
undefined.
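
To make the transformation concrete, here is a small standalone sketch
in C of the same idea.  It is not the genattrtab code: struct expr,
append and write_expr are hypothetical names, expressions are a toy
binary tree rather than RTL, and the output goes to a string buffer
instead of a FILE.  The only rule taken from the patch is that the
right-hand operand is written without its outer parentheses when it
uses the same operator as its parent.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical expression node: a leaf has OP == NULL and a NAME;
   an inner node has an operator string and two children.  */
struct expr
{
  const char *op;
  const char *name;
  const struct expr *lhs;
  const struct expr *rhs;
};

/* Append S to the NUL-terminated string in BUF (capacity CAP),
   truncating if it does not fit.  */
static void
append (char *buf, size_t cap, const char *s)
{
  size_t len = strlen (buf);
  if (len + 1 < cap)
    snprintf (buf + len, cap - len, "%s", s);
}

/* Write E to BUF, surrounded by parentheses only if EMIT_PARENS.  */
static void
write_expr (char *buf, size_t cap, const struct expr *e, int emit_parens)
{
  assert (e != NULL);

  if (emit_parens)
    append (buf, cap, "(");

  if (e->op == NULL)
    append (buf, cap, e->name);
  else
    {
      write_expr (buf, cap, e->lhs, 1);
      append (buf, cap, " ");
      append (buf, cap, e->op);
      append (buf, cap, " ");

      /* No need for parentheses around the right-hand operand if it
         continues a chain of the same operator.  */
      int same_op = e->rhs->op != NULL && strcmp (e->rhs->op, e->op) == 0;
      write_expr (buf, cap, e->rhs, !same_op);
    }

  if (emit_parens)
    append (buf, cap, ")");
}
```

For the tree a || (b || c) this writes the flat chain
((a) || (b) || (c)), whereas a mixed tree such as a || (b && c) keeps
the inner parentheses.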

Is this OK to commit after testing on x86_64-pc-linux-gnu?

gcc/ChangeLog:

* genattrtab.c (write_test_expr): New parameter EMIT_PARENS
which defaults to true.  Emit an outer pair of parentheses only if
EMIT_PARENS.  When continuing a chain of && or || (or & or |),
don't emit parentheses for the right-hand operand.
---
 gcc/genattrtab.c | 31 +++
 1 file changed, 23 insertions(+), 8 deletions(-)

diff --git a/gcc/genattrtab.c b/gcc/genattrtab.c
index b64d8b9..c956527 100644
--- a/gcc/genattrtab.c
+++ b/gcc/genattrtab.c
@@ -3416,7 +3416,10 @@ find_attrs_to_cache (rtx exp, bool create)
 
 /* Given a piece of RTX, print a C expression to test its truth value to OUTF.
We use AND and IOR both for logical and bit-wise operations, so
-   interpret them as logical unless they are inside a comparison expression.  */
+   interpret them as logical unless they are inside a comparison expression.
+
+   An outermost pair of parentheses is emitted around this C expression unless
+   EMIT_PARENS is false.  */
 
 /* Interpret AND/IOR as bit-wise operations instead of logical.  */
 #define FLG_BITWISE1
@@ -3432,16 +3435,16 @@ find_attrs_to_cache (rtx exp, bool create)
 #define FLG_OUTSIDE_AND8
 
 static unsigned int
-write_test_expr (FILE *outf, rtx exp, unsigned int attrs_cached, int flags)
+write_test_expr (FILE *outf, rtx exp, unsigned int attrs_cached, int flags,
+bool emit_parens = true)
 {
   int comparison_operator = 0;
   RTX_CODE code;
   struct attr_desc *attr;
 
-  /* In order not to worry about operator precedence, surround our part of
- the expression with parentheses.  */
+  if (emit_parens)
+fprintf (outf, "(");
 
-  fprintf (outf, "(");
   code = GET_CODE (exp);
   switch (code)
 {
@@ -3575,8 +3578,18 @@ write_test_expr (FILE *outf, rtx exp, unsigned int attrs_cached, int flags)
  || GET_CODE (XEXP (exp, 1)) == EQ_ATTR
  || (GET_CODE (XEXP (exp, 1)) == NOT
  && GET_CODE (XEXP (XEXP (exp, 1), 0)) == EQ_ATTR)))
-   attrs_cached
- = write_test_expr (outf, XEXP (exp, 1), attrs_cached, flags);
+   {
+ bool need_parens = true;
+
+ /* No need to emit parentheses around the right-hand operand if we are
+continuing a chain of && or || (or & or |).  */
+ if (GET_CODE (XEXP (exp, 1)) == code)
+   need_parens = false;
+
+ attrs_cached
+   = write_test_expr (outf, XEXP (exp, 1), attrs_cached, flags,
+  need_parens);
+   }
   else
write_test_expr (outf, XEXP (exp, 1), attrs_cached,
 flags | comparison_operator);
@@ -3794,7 +3807,9 @@ write_test_expr (FILE *outf, rtx exp, unsigned int attrs_cached, int flags)
 GET_RTX_NAME (code));
 }

[Ada] Legality check of classwide Pre/Postcondition aspect

2016-04-27 Thread Arnaud Charlet
This patch verifies that classwide Pre/Postconditions cannot be applied to
an operation of an untagged synchronized type, given that these are not
primitive operations of a tagged type.

Tested in ACATS b611007.

Tested on x86_64-pc-linux-gnu, committed on trunk

2016-04-27  Ed Schonberg  

* sem_ch13.adb (Analyze_Aspect_Specifications, case Pre/Post):
Check that the classwide version is illegal when the prefix is
an operation of an untagged synchronized type.

Index: sem_ch13.adb
===
--- sem_ch13.adb(revision 235503)
+++ sem_ch13.adb(working copy)
@@ -3129,6 +3129,24 @@
  Pname := Name_Postcondition;
   end if;
 
+  --  Check that the class-wide predicate cannot be applied to
+  --  an operation of a synchronized type that is not a tagged
+  --  type. Other legality checks are performed when analyzing
+  --  the contract of the operation.
+
+  if Class_Present (Aspect)
+and then Is_Concurrent_Type (Current_Scope)
+and then not Is_Tagged_Type (Current_Scope)
+and then Ekind_In (E, E_Entry, E_Function, E_Procedure)
+  then
+ Error_Msg_Name_1 := Original_Aspect_Pragma_Name (Aspect);
+ Error_Msg_N
+   ("aspect % can only be specified for a primitive "
+& "operation of a tagged type", Aspect);
+
+ goto Continue;
+  end if;
+
   --  If the expressions is of the form A and then B, then
   --  we generate separate Pre/Post aspects for the separate
   --  clauses. Since we allow multiple pragmas, there is no


Re: [PATCH] Reduce nesting of parentheses in conditionals generated by genattrtab

2016-04-27 Thread Jakub Jelinek
On Wed, Apr 27, 2016 at 09:23:47AM -0400, Patrick Palka wrote:
> Here's an updated patch that mentions that & and | are also affected.  And I
> can't see how this change would possibly affect the attr caching stuff since it
> just makes
> 
>   (a) OP ((b) OP (c))
> 
> get emitted as
> 
>   (a) OP (b) OP (c)
> 
> For OP = || or &&, the expressions a, b and c will still get evaluated left to
> right.  And for OP = | or &, the order of evaluation of a, b and c remains
> undefined.
> 
> Is this OK to commit after testing on x86_64-pc-linux-gnu?
> 
> gcc/ChangeLog:
> 
>   * genattrtab.c (write_test_expr): New parameter EMIT_PARENS
>   which defaults to true.  Emit an outer pair of parentheses only if
>   EMIT_PARENS.  When continuing a chain of && or || (or & or |),
>   don't emit parentheses for the right-hand operand.

Ok for trunk.

Jakub


Re: [PATCH] PR target/70155: Use SSE for TImode load/store

2016-04-27 Thread Uros Bizjak
On Wed, Apr 27, 2016 at 2:51 PM, H.J. Lu  wrote:
> On Wed, Apr 27, 2016 at 5:03 AM, Uros Bizjak  wrote:
>> On Tue, Apr 26, 2016 at 9:50 PM, H.J. Lu  wrote:
>>
>>>> Here is the updated patch which does that.  Ok for trunk if there
>>>> is no regressions on x86-64?
>>>>
>>>
>>> CSE works with SSE constants now.  Here is the updated patch.
>>> OK for trunk if there are no regressions on x86-64?
>>
>> +static bool
>> +timode_scalar_to_vector_candidate_p (rtx_insn *insn)
>> +{
>> +  rtx def_set = single_set (insn);
>> +
>> +  if (!def_set)
>> +return false;
>> +
>> +  if (has_non_address_hard_reg (insn))
>> +return false;
>> +
>> +  rtx src = SET_SRC (def_set);
>> +  rtx dst = SET_DEST (def_set);
>> +
>> +  /* Only TImode load and store are allowed.  */
>> +  if (GET_MODE (dst) != TImode)
>> +return false;
>> +
>> +  if (MEM_P (dst))
>> +{
>> +  /* Check for store.  Only support store from register or standard
>> + SSE constants.  */
>> +  switch (GET_CODE (src))
>> + {
>> + default:
>> +  return false;
>> +
>> + case REG:
>> +  /* For store from register, memory must be aligned or both
>> + unaligned load and store are optimal.  */
>> +  return (!misaligned_operand (dst, TImode)
>> +  || (TARGET_SSE_UNALIGNED_LOAD_OPTIMAL
>> +  && TARGET_SSE_UNALIGNED_STORE_OPTIMAL));
>>
>> Why check TARGET_SSE_UNALIGNED_LOAD_OPTIMAL here? We are moving from a
>> register here.
>>
>> + case CONST_INT:
>> +  /* For store from standard SSE constant, memory must be
>> + aligned or unaligned store is optimal.  */
>> +  return (standard_sse_constant_p (src, TImode)
>> +  && (!misaligned_operand (dst, TImode)
>> +  || TARGET_SSE_UNALIGNED_STORE_OPTIMAL));
>> + }
>> +}
>> +  else if (MEM_P (src))
>> +{
>> +  /* Check for load.  Memory must be aligned or both unaligned
>> + load and store are optimal.  */
>> +  return (GET_CODE (dst) == REG
>> +  && (!misaligned_operand (src, TImode)
>> +  || (TARGET_SSE_UNALIGNED_LOAD_OPTIMAL
>> +  && TARGET_SSE_UNALIGNED_STORE_OPTIMAL)));
>>
>> Also here. We are loading a register, no point to check
>> TARGET_SSE_UNALIGNED_STORE_OPTIMAL.
>>
>> +}
>> +
>> +  return false;
>> +}
>> +
>>
>> +/* Convert INSN from TImode to V1TImode.  */
>> +
>> +void
>> +timode_scalar_chain::convert_insn (rtx_insn *insn)
>> +{
>> +  rtx def_set = single_set (insn);
>> +  rtx src = SET_SRC (def_set);
>> +  rtx tmp;
>> +  rtx dst = SET_DEST (def_set);
>>
>> No need for tmp declaration above ...
>>
>> +  switch (GET_CODE (dst))
>> +{
>> +case REG:
>> +  tmp = find_reg_equal_equiv_note (insn);
>>
>> ... if you declare it here ...
>>
>> +  if (tmp)
>> + PUT_MODE (XEXP (tmp, 0), V1TImode);
>>
>> /* FALLTHRU */
>>
>> +case MEM:
>> +  PUT_MODE (dst, V1TImode);
>> +  break;
>>
>> +case CONST_INT:
>> +  switch (standard_sse_constant_p (src, TImode))
>> + {
>> + case 1:
>> +  src = CONST0_RTX (GET_MODE (dst));
>> +  tmp = gen_reg_rtx (V1TImode);
>> +  break;
>> + case 2:
>> +  src = CONSTM1_RTX (GET_MODE (dst));
>> +  tmp = gen_reg_rtx (V1TImode);
>> +  break;
>> + default:
>> +  gcc_unreachable ();
>> + }
>> +  if (NONDEBUG_INSN_P (insn))
>> + {
>>
>> ... and here. Please generate temp register here.
>>
>> +  /* Since there are no instructions to store standard SSE
>> + constant, temporary register usage is required.  */
>> +  emit_conversion_insns (gen_rtx_SET (dst, tmp), insn);
>> +  dst = tmp;
>> + }
>>
>>
>>/* This needs to be done at start up.  It's convenient to do it here.  */
>>register_pass (&insert_vzeroupper_info);
>> -  register_pass (&stv_info);
>> +  register_pass (TARGET_64BIT ? &stv_info_64 : &stv_info_32);
>>  }
>>
>> stv_info_timode and stv_info_dimode?
>>
>
> Here is the updated patch.  OK for trunk if there is no regression?

OK with a small improvement:

if (MEM_P (dst))
  {
    /* Check for store.  Destination must be aligned or unaligned
       store is optimal.  */
    if (misaligned_operand (dst, TImode)
        && !TARGET_SSE_UNALIGNED_STORE_OPTIMAL)
      return false;

    /* Only support store from register or standard SSE constants.  */
    switch (GET_CODE (src))
      {
      default:
        return false;

      case REG:
        return true;

      case CONST_INT:
        return standard_sse_constant_p (src, TImode);
      }
  }

+  else if (MEM_P (src))
+{
+  /* Check for load.  Memory must be aligned or unaligned load is
+ optimal.  */
+  return (GET_CODE (dst) == REG

REG_P

+  && (!misaligned_operand (src, TImode)
+  || TARGET_SSE_UNALIGNED_LOAD_OPTIMAL));

Thanks,
Uros.


Re: [PATCH, ARM] Fix gcc.c-torture/execute/loop-2b.c execution failure on cortex-m0

2016-04-27 Thread Thomas Preudhomme
On Thursday 03 March 2016 18:10:38 Thomas Preudhomme wrote:
> On Thursday 03 March 2016 15:32:27 Thomas Preudhomme wrote:
> > On Thursday 03 March 2016 09:44:31 Ramana Radhakrishnan wrote:
> > > On Thu, Mar 3, 2016 at 9:40 AM, Thomas Preudhomme
> > > 
> > >  wrote:
> > > > On Friday 15 January 2016 12:45:04 Ramana Radhakrishnan wrote:
> > > >> On Wed, Dec 16, 2015 at 9:11 AM, Thomas Preud'homme
> > > >> 
> > > >>  wrote:
> > > >> > During reorg pass, thumb1_reorg () is tasked with rewriting mov rd,
> > > >> > rn
> > > >> > to
> > > >> > subs rd, rn, 0 to avoid a comparison against 0 instruction before
> > > >> > doing
> > > >> > a
> > > >> > conditional branch based on it. The actual avoiding of cmp is done
> > > >> > in
> > > >> > cbranchsi4_insn instruction C output template. When the condition
> > > >> > is
> > > >> > met,
> > > >> > the source register (rn) is also propagated into the comparison in
> > > >> > place
> > > >> > the destination register (rd).
> > > >> > 
> > > >> > However, right now thumb1_reorg () only look for a mov followed by
> > > >> > a
> > > >> > cbranchsi but does not check whether the comparison in cbranchsi is
> > > >> > against the constant 0. This is not safe because a non clobbering
> > > >> > instruction could exist between the mov and the comparison that
> > > >> > modifies
> > > >> > the source register. This is what happens here with a post
> > > >> > increment
> > > >> > of
> > > >> > the source register after the mov, which skip the &a[i] == &a[1]
> > > >> > comparison for iteration i == 1.
> > > >> > 
> > > >> > This patch fixes the issue by checking that the comparison is
> > > >> > against
> > > >> > constant 0.
> > > >> > 
> > > >> > ChangeLog entry is as follow:
> > > >> > 
> > > >> > 
> > > >> > *** gcc/ChangeLog ***
> > > >> > 
> > > >> > 2015-12-07  Thomas Preud'homme  
> > > >> > 
> > > >> > * config/arm/arm.c (thumb1_reorg): Check that the
> > > >> > comparison
> > > >> > is
> > > >> > against the constant 0.
> > > >> 
> > > >> OK.
> > > >> 
> > > >> Ramana
> > > >> 
> > > >> > diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
> > > >> > index 42bf272..49c0a06 100644
> > > >> > --- a/gcc/config/arm/arm.c
> > > >> > +++ b/gcc/config/arm/arm.c
> > > >> > @@ -17195,7 +17195,7 @@ thumb1_reorg (void)
> > > >> > 
> > > >> >FOR_EACH_BB_FN (bb, cfun)
> > > >> >
> > > >> >  {
> > > >> >  
> > > >> >rtx dest, src;
> > > >> > 
> > > >> > -  rtx pat, op0, set = NULL;
> > > >> > +  rtx cmp, op0, op1, set = NULL;
> > > >> > 
> > > >> >rtx_insn *prev, *insn = BB_END (bb);
> > > >> >bool insn_clobbered = false;
> > > >> > 
> > > >> > @@ -17208,8 +17208,13 @@ thumb1_reorg (void)
> > > >> > 
> > > >> > continue;
> > > >> >
> > > >> >/* Get the register with which we are comparing.  */
> > > >> > 
> > > >> > -  pat = PATTERN (insn);
> > > >> > -  op0 = XEXP (XEXP (SET_SRC (pat), 0), 0);
> > > >> > +  cmp = XEXP (SET_SRC (PATTERN (insn)), 0);
> > > >> > +  op0 = XEXP (cmp, 0);
> > > >> > +  op1 = XEXP (cmp, 1);
> > > >> > +
> > > >> > +  /* Check that comparison is against ZERO.  */
> > > >> > +  if (!CONST_INT_P (op1) || INTVAL (op1) != 0)
> > > >> > +   continue;
> > > >> > 
> > > >> >/* Find the first flag setting insn before INSN in basic
> > > >> >block
> > > >> >BB.
> > > >> >*/
> > > >> >gcc_assert (insn != BB_HEAD (bb));
> > > >> > 
> > > >> > @@ -17249,7 +17254,7 @@ thumb1_reorg (void)
> > > >> > 
> > > >> >   PATTERN (prev) = gen_rtx_SET (dest, src);
> > > >> >   INSN_CODE (prev) = -1;
> > > >> >   /* Set test register in INSN to dest.  */
> > > >> > 
> > > >> > - XEXP (XEXP (SET_SRC (pat), 0), 0) = copy_rtx (dest);
> > > >> > + XEXP (cmp, 0) = copy_rtx (dest);
> > > >> > 
> > > >> >   INSN_CODE (insn) = -1;
> > > >> > 
> > > >> > }
> > > >> >  
> > > >> >  }
> > > >> > 
> > > >> > Testsuite shows no regression when run for arm-none-eabi with
> > > >> > -mcpu=cortex-m0 -mthumb
> > > > 
> > > > The patch applies cleanly on gcc-5-branch and also show no regression
> > > > when
> > > > run for arm-none-eabi with -mcpu=cortex-m0 -mthumb. Is it ok to
> > > > backport?
> > > 
> > > This deserves a testcase.
> > 
> > The original patch don't have one initially because it fixes a fail of an
> > existing testcase (loop-2b.c). However, the test pass on gcc 5 due to
> > difference in code generation. I'm currently trying to come up with a
> > testcase and will get back at you.
> 
> Sadly I did not manage to come up with a testcase that works on GCC 5. One
> need to reproduce a sequence of the form:
> 
> (set B A)
> (insn clobbering A that is not a set, ie store with post increment)
> (conditional branch between A and something else)
> 
> In that case, thumb1_reorg changes the set into (set B (minus A 0)) which is
> safe but also 

Re: [RFC][PATCH][PR40921] Convert x + (-y * z * z) into x - y * z * z

2016-04-27 Thread Richard Biener
On Sat, Apr 23, 2016 at 3:10 PM, kugan
 wrote:
>
>>> I am not sure I understand this. I tried doing this. If I add  -1 and
>>> rhs1
>>> for the NEGATE_EXPR to ops list,  when it come to rewrite_expr_tree
>>> constant
>>> will be sorted early and would make it hard to generate:
>>>   x + (-y * z * z) => x - y * z * z
>>>
>>> Do you want to swap the constant in MULT_EXPR chain (if present) like in
>>> swap_ops_for_binary_stmt and then create a NEGATE_EXPR ?
>>
>>
>> In addition to linearize_expr handling you need to handle a -1 in the
>> MULT_EXPR
>> chain specially at rewrite_expr_tree time by emitting a NEGATE_EXPR
>> instead
>> of a MULT_EXPR (which also means you should sort the -1 "last").
>
>
> Hi Richard,
>
> Thanks. Here is an attempt which does this.
>
> Regression tested and bootstrapped on x86-64-linux-gnu with no new
> regressions.
>
> Is this OK for trunk?

+ int last = ops.length () - 1;
+ bool negate_result = false;

Do

  oe &last = ops.last ();


+ if (rhs_code == MULT_EXPR
+ && ops.length () > 1
+ && ((TREE_CODE (ops[last]->op) == INTEGER_CST

and last.op here and below

+  && integer_minus_onep (ops[last]->op))
+ || ((TREE_CODE (ops[last]->op) == REAL_CST)
+ && real_equal (&TREE_REAL_CST
(ops[last]->op), &dconstm1

Here the checks !HONOR_SNANS () && (!HONOR_SIGNED_ZEROS ||
!COMPLEX_FLOAT_TYPE_P)
are missing.  The * -1 might appear literally and you are only allowed
to turn it into a negate
under the above conditions.

+   {
+ ops.unordered_remove (last);

use ops.pop ();

+ negate_result = true;

Please move the whole thing under the else { } case of the ops.length
== 0, ops.length == 1 test chain
as you did for the actual emit of the negate.


+ if (negate_result)
+   {
+ tree tmp = make_ssa_name (TREE_TYPE (lhs));
+ gimple_set_lhs (stmt, tmp);
+ gassign *neg_stmt = gimple_build_assign (lhs, NEGATE_EXPR,
+  tmp);
+ gimple_stmt_iterator gsi = gsi_for_stmt (stmt);
+ gsi_insert_after (&gsi, neg_stmt, GSI_NEW_STMT);
+ update_stmt (stmt);

I think that if powi_result is also built you end up using the wrong
stmt so you miss a

stmt = SSA_NAME_DEF_STMT (lhs);

here.  Also see the new_lhs handling of the powi_result case - again
you need something similar here (its handling looks a bit fishy as well -
this all needs some comments and possibly a lot more testcases).

So, please do the above requested changes and verify the 'lhs' issues I pointed
out by trying to add a few more testcases that also cover the case where a powi
is detected in addition to a negation.  Please also add a testcase that catches
(-y) * x * (-z).

Otherwise this now looks good.

Thanks,
Richard.


> Thanks,
> Kugan
>
> 2016-04-23  Kugan Vivekanandarajah  
>
> PR middle-end/40921
> * gcc.dg/tree-ssa/pr40921.c: New test.
>
> gcc/ChangeLog:
>
> 2016-04-23  Kugan Vivekanandarajah  
>
> PR middle-end/40921
> * tree-ssa-reassoc.c (try_special_add_to_ops): New.
> (linearize_expr_tree): Call try_special_add_to_ops.
> (reassociate_bb): Convert MULT_EXPR by (-1) to NEGATE_EXPR.
>


Re: [PATCH] PR target/70155: Use SSE for TImode load/store

2016-04-27 Thread H.J. Lu
On Wed, Apr 27, 2016 at 6:28 AM, Uros Bizjak  wrote:
> On Wed, Apr 27, 2016 at 2:51 PM, H.J. Lu  wrote:
>> On Wed, Apr 27, 2016 at 5:03 AM, Uros Bizjak  wrote:
>>> On Tue, Apr 26, 2016 at 9:50 PM, H.J. Lu  wrote:
>>>
>>>>> Here is the updated patch which does that.  Ok for trunk if there
>>>>> is no regressions on x86-64?
>>>>>
>>>>
>>>> CSE works with SSE constants now.  Here is the updated patch.
>>>> OK for trunk if there are no regressions on x86-64?
>>>
>>> +static bool
>>> +timode_scalar_to_vector_candidate_p (rtx_insn *insn)
>>> +{
>>> +  rtx def_set = single_set (insn);
>>> +
>>> +  if (!def_set)
>>> +return false;
>>> +
>>> +  if (has_non_address_hard_reg (insn))
>>> +return false;
>>> +
>>> +  rtx src = SET_SRC (def_set);
>>> +  rtx dst = SET_DEST (def_set);
>>> +
>>> +  /* Only TImode load and store are allowed.  */
>>> +  if (GET_MODE (dst) != TImode)
>>> +return false;
>>> +
>>> +  if (MEM_P (dst))
>>> +{
>>> +  /* Check for store.  Only support store from register or standard
>>> + SSE constants.  */
>>> +  switch (GET_CODE (src))
>>> + {
>>> + default:
>>> +  return false;
>>> +
>>> + case REG:
>>> +  /* For store from register, memory must be aligned or both
>>> + unaligned load and store are optimal.  */
>>> +  return (!misaligned_operand (dst, TImode)
>>> +  || (TARGET_SSE_UNALIGNED_LOAD_OPTIMAL
>>> +  && TARGET_SSE_UNALIGNED_STORE_OPTIMAL));
>>>
>>> Why check TARGET_SSE_UNALIGNED_LOAD_OPTIMAL here? We are moving from a
>>> register here.
>>>
>>> + case CONST_INT:
>>> +  /* For store from standard SSE constant, memory must be
>>> + aligned or unaligned store is optimal.  */
>>> +  return (standard_sse_constant_p (src, TImode)
>>> +  && (!misaligned_operand (dst, TImode)
>>> +  || TARGET_SSE_UNALIGNED_STORE_OPTIMAL));
>>> + }
>>> +}
>>> +  else if (MEM_P (src))
>>> +{
>>> +  /* Check for load.  Memory must be aligned or both unaligned
>>> + load and store are optimal.  */
>>> +  return (GET_CODE (dst) == REG
>>> +  && (!misaligned_operand (src, TImode)
>>> +  || (TARGET_SSE_UNALIGNED_LOAD_OPTIMAL
>>> +  && TARGET_SSE_UNALIGNED_STORE_OPTIMAL)));
>>>
>>> Also here. We are loading a register, no point to check
>>> TARGET_SSE_UNALIGNED_STORE_OPTIMAL.
>>>
>>> +}
>>> +
>>> +  return false;
>>> +}
>>> +
>>>
>>> +/* Convert INSN from TImode to V1TImode.  */
>>> +
>>> +void
>>> +timode_scalar_chain::convert_insn (rtx_insn *insn)
>>> +{
>>> +  rtx def_set = single_set (insn);
>>> +  rtx src = SET_SRC (def_set);
>>> +  rtx tmp;
>>> +  rtx dst = SET_DEST (def_set);
>>>
>>> No need for tmp declaration above ...
>>>
>>> +  switch (GET_CODE (dst))
>>> +{
>>> +case REG:
>>> +  tmp = find_reg_equal_equiv_note (insn);
>>>
>>> ... if you declare it here ...
>>>
>>> +  if (tmp)
>>> + PUT_MODE (XEXP (tmp, 0), V1TImode);
>>>
>>> /* FALLTHRU */
>>>
>>> +case MEM:
>>> +  PUT_MODE (dst, V1TImode);
>>> +  break;
>>>
>>> +case CONST_INT:
>>> +  switch (standard_sse_constant_p (src, TImode))
>>> + {
>>> + case 1:
>>> +  src = CONST0_RTX (GET_MODE (dst));
>>> +  tmp = gen_reg_rtx (V1TImode);
>>> +  break;
>>> + case 2:
>>> +  src = CONSTM1_RTX (GET_MODE (dst));
>>> +  tmp = gen_reg_rtx (V1TImode);
>>> +  break;
>>> + default:
>>> +  gcc_unreachable ();
>>> + }
>>> +  if (NONDEBUG_INSN_P (insn))
>>> + {
>>>
>>> ... and here. Please generate temp register here.
>>>
>>> +  /* Since there are no instructions to store standard SSE
>>> + constant, temporary register usage is required.  */
>>> +  emit_conversion_insns (gen_rtx_SET (dst, tmp), insn);
>>> +  dst = tmp;
>>> + }
>>>
>>>
>>>/* This needs to be done at start up.  It's convenient to do it here.  */
>>>register_pass (&insert_vzeroupper_info);
>>> -  register_pass (&stv_info);
>>> +  register_pass (TARGET_64BIT ? &stv_info_64 : &stv_info_32);
>>>  }
>>>
>>> stv_info_timode and stv_info_dimode?
>>>
>>
>> Here is the updated patch.  OK for trunk if there is no regression?
>
> OK with a small improvement:
>
> if (MEM_P (dst))
>   {
> /* Check for store.  Destination must be aligned or unaligned
> store is optimal.  */
>
> if (misaligned_operand (dst, TImode) && !TARGET_SSE_UNALIGNED_STORE_OPTIMAL)
>   return false;
>
>   /* Only support store from register or standard SSE constants.  */
>  switch (GET_CODE (src))
>  {
>  default:
>   return false;
>
>  case REG:
>   return true;
>
>  case CONST_INT:
>   return standard_sse_constant_p (src, TImode);
>  }
> }
>
> +  else if (MEM_P (src))
> +{
> +  /* Check for load.  Memory must be aligned or unaligned load is
> + optimal.  */
> +  return (GET_CODE (dst) == REG
>
> REG_P
>
> +  && (!misaligned_operand (src, TImode)
> +  || TARGET_SSE_UNALIGNED_LOAD_OPTIMAL));
>
> Thanks,
> Uros.

This is the patch I will check in.

Thanks.

-- 
H.J.
From 0568f8a300588a8cc8fda4b0f212666714299c32 Mon Sep 17 00:00:00 2001
From: "H.J. Lu" 
Date: Mon, 7 Mar 2016 14:44:37 -0800
Subject: [PATCH] E

New template for 'cpplib' made available

2016-04-27 Thread Translation Project Robot
Hello, gentle maintainer.

This is a message from the Translation Project robot.  (If you have
any questions, send them to .)

A new POT file for textual domain 'cpplib' has been made available
to the language teams for translation.  It is archived as:

http://translationproject.org/POT-files/cpplib-6.1.0.pot

Whenever you have a new distribution with a new version number ready,
containing a newer POT file, please send the URL of that distribution
tarball to the address below.  The tarball may be just a pretest or a
snapshot, it does not even have to compile.  It is just used by the
translators when they need some extra translation context.

Below is the URL which has been provided to the translators of your
package.  Please inform the translation coordinator, at the address
at the bottom, if this information is not current:

ftp://ftp.gnu.org/gnu/gcc/gcc-6.1.0/gcc-6.1.0.tar.bz2

Translated PO files will later be automatically e-mailed to you.

Thank you for all your work,

The Translation Project robot, in the
name of your translation coordinator.




Re: [RFC][PATCH][PR63586] Convert x+x+x+x into 4*x

2016-04-27 Thread Richard Biener
On Sun, Apr 24, 2016 at 12:02 AM, kugan
 wrote:
> Hi Richard,
>
> As you have said in the other email, I tried implementing with the
> add_repeat_to_ops_vec but the whole repeat vector is designed for
> MULT_EXPR chain. I tried changing it but it turned out to be not
> straightforward without lots of re-write. Therefore I tried to implement
> based on your review here. Please tell me what you think.

Hmm, ok.

>>> +/* Transoform repeated addition of same values into multiply with
>>> +   constant.  */
>>>
>>> Transform
>
>
> Done.
>
>>>
>>> +static void
>>> +transform_add_to_multiply (gimple_stmt_iterator *gsi, gimple *stmt,
>>> vec *ops)
>>>
>>> split the long line
>
>
> Done.
>
>>>
>>> op_list looks redundant - ops[start]->op gives you the desired value
>>> already and if you
>>> use a vec> you can have a more C++ish start,end pair.
>>>
>>> +  tree tmp = make_temp_ssa_name (TREE_TYPE (op), NULL,
>>> "reassocmul");
>>> +  gassign *mul_stmt = gimple_build_assign (tmp, MULT_EXPR,
>>> +  op, build_int_cst
>>> (TREE_TYPE(op), count));
>>>
>>> this won't work for floating point or complex numbers - you need to use
>>> sth like
>>> fold_convert (TREE_TYPE (op), build_int_cst (integer_type_node, count));
>
>
> Done.
>
>>>
>>> For FP types you need to guard the transform with
>>> flag_unsafe_math_optimizations
>
>
> Done.
>
>>>
>>> +  gimple_set_location (mul_stmt, gimple_location (stmt));
>>> +  gimple_set_uid (mul_stmt, gimple_uid (stmt));
>>> +  gsi_insert_before (gsi, mul_stmt, GSI_SAME_STMT);
>>>
>>> I think you do not want to set the stmt uid
>
>
> assert in reassoc_stmt_dominates_p (gcc_assert (gimple_uid (s1) &&
> gimple_uid (s2))) is failing. So I tried to add the uid of the adjacent stmt
> and it seems to work.

Hmm, yes, other cases seem to do the same.

>>> and you want to insert the
>>> stmt right
>>> after the def of op (or at the original first add - though you can't
>>> get your hands at
>
>
> Done.

maybe insert_stmt_after will help here; I don't think you got the insertion
logic correct, thus insert_stmt_after (mul_stmt, def_stmt) which I think
misses GIMPLE_NOP handling.  At least

+  if (SSA_NAME_VAR (op) != NULL

huh?  I suppose you could have tested SSA_NAME_IS_DEFAULT_DEF
but just the GIMPLE_NOP def-stmt test should be enough.

+ && gimple_code (def_stmt) == GIMPLE_NOP)
+   {
+ gsi = gsi_after_labels (single_succ (ENTRY_BLOCK_PTR_FOR_FN (cfun)));
+ stmt = gsi_stmt (gsi);
+ gsi_insert_before (&gsi, mul_stmt, GSI_NEW_STMT);

not sure if that is the best insertion point choice, it un-does all
code-sinking done (and no further sinking is run after the last reassoc
pass).  We do know we are handling all uses of op in our chain so inserting
before the plus-expr chain root should work here (thus 'stmt' in the caller
context).  I'd use that here instead.  I think I'd use that unconditionally
even if it works and not bother finding something more optimal.

Apart from this this now looks ok to me.

But the testcases need some work


--- a/gcc/testsuite/gcc.dg/tree-ssa/pr63586-2.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr63586-2.c
@@ -0,0 +1,29 @@
+/* { dg-do compile } */
...
+
+/* { dg-final { scan-tree-dump-times "\\\*" 4 "reassoc1" } } */

I would have expected 3.  Also please check for \\\* 5 for example
to be more specific (and change the cases so you get different constants
for the different functions).

That said, please make the scans more specific.

Thanks,
Richard.


>>> that easily).  You also don't want to set the location to the last stmt
>>> of the
>>> whole add sequence - simply leave it unset.
>>>
>>> +  oe = operand_entry_pool.allocate ();
>>> +  oe->op = tmp;
>>> +  oe->rank = get_rank (op) * count;
>>>
>>> ?  Why that?  oe->rank should be get_rank (tmp).
>>>
>>> +  oe->id = 0;
>>>
>>> other places use next_operand_entry_id++.  I think you want to simply
>>> use add_to_ops_vec (oe, tmp); here for all of the above.
>
>
> Done.
>
>>>
>>> Please return whether you did any optimization and do the
>>> qsort of the operand vector only if you did sth.
>
>
> Done.
>
>
>>> Testcase with FP math missing.  Likewise with complex or vector math.
>>
>>
>> Btw, does it handle associating
>>
>>x + 3 * x + x
>>
>> to
>>
>>5 * x
>>
>> ?
>
>
> Added this to the testcase and verified it is working.
>
> Regression tested and bootstrapped on x86-64-linux-gnu with no new
> regressions.
>
> Is this OK for trunk?
>
> Thanks,
> Kugan
>
>
> gcc/testsuite/ChangeLog:
>
> 2016-04-24  Kugan Vivekanandarajah  
>
> PR middle-end/63586
> * gcc.dg/tree-ssa/pr63586-2.c: New test.
> * gcc.dg/tree-ssa/pr63586.c: New test.
> * gcc.dg/tree-ssa/reassoc-14.c: Adjust multiplication count.
>
> gcc/ChangeLog:
>
> 2016-04-24  Kugan Vivekanandarajah  
>
>
> PR middle-end/63586
> * tree-ssa-reassoc.c (transform_add_to_multiply): New.
> (reassociate_bb

New template for 'gcc' made available

2016-04-27 Thread Translation Project Robot
Hello, gentle maintainer.

This is a message from the Translation Project robot.  (If you have
any questions, send them to .)

A new POT file for textual domain 'gcc' has been made available
to the language teams for translation.  It is archived as:

http://translationproject.org/POT-files/gcc-6.1.0.pot

Whenever you have a new distribution with a new version number ready,
containing a newer POT file, please send the URL of that distribution
tarball to the address below.  The tarball may be just a pretest or a
snapshot, it does not even have to compile.  It is just used by the
translators when they need some extra translation context.

Below is the URL which has been provided to the translators of your
package.  Please inform the translation coordinator, at the address
at the bottom, if this information is not current:

ftp://ftp.gnu.org/gnu/gcc/gcc-6.1.0/gcc-6.1.0.tar.bz2

Translated PO files will later be automatically e-mailed to you.

Thank you for all your work,

The Translation Project robot, in the
name of your translation coordinator.




Re: [PATCH] Fixup nb_iterations_upper_bound adjustment for vectorized loops

2016-04-27 Thread Richard Biener
On Tue, Apr 26, 2016 at 2:29 PM, Ilya Enkovich  wrote:
> 2016-04-22 10:13 GMT+03:00 Richard Biener :
>> On Thu, Apr 21, 2016 at 6:09 PM, Ilya Enkovich  
>> wrote:
>>> Hi,
>>>
>>> Currently when loop is vectorized we adjust its nb_iterations_upper_bound
>>> by dividing it by VF.  This is incorrect since nb_iterations_upper_bound
>>> is upper bound for ( - 1) and therefore simple
>>> dividing it by VF in many cases gives us bounds greater than a real one.
>>> Correct value would be ((nb_iterations_upper_bound + 1) / VF - 1).
>>
>> Yeah, that seems correct.
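
To make the arithmetic concrete, here is a small standalone sketch comparing
the two formulas.  It is not GCC code: the function names are made up for
this example, LATCH_BOUND stands in for nb_iterations_upper_bound (an upper
bound on the iteration count minus one), and clamping at zero when fewer
than VF iterations remain is an assumption of the sketch.

```c
#include <assert.h>

/* Scale the bound by plain division, as described for the current code.  */
static unsigned long
naive_vector_bound (unsigned long latch_bound, unsigned long vf)
{
  assert (vf > 0);
  return latch_bound / vf;
}

/* Convert the bound to an iteration count first, divide by VF, then
   convert back: (latch_bound + 1) / vf - 1.  */
static unsigned long
adjusted_vector_bound (unsigned long latch_bound, unsigned long vf)
{
  assert (vf > 0);
  unsigned long vector_iters = (latch_bound + 1) / vf;
  /* Assumption: clamp at zero instead of wrapping around when the
     vector body cannot run at all.  */
  return vector_iters > 0 ? vector_iters - 1 : 0;
}
```

For a scalar loop with at most 127 iterations (bound 126) and VF = 64, the
naive formula gives 1 while the adjusted one gives 0, i.e. the vector body
runs at most once.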
>>
>>> Also decrement due to peeling for gaps should happen before we scale it
>>> by VF because peeling applies to a scalar loop, not vectorized one.
>>
>> That's not true - PEELING_FOR_GAPs is so that the last _vector_ iteration
>> is peeled as scalar operations.  We do not account for the amount
>> of known prologue peeling (if peeling for alignment and the misalignment
>> is known at compile-time) - that would be peeling of scalar iterations.
>
> My initial patch didn't change anything for PEELING_FOR_GAP and it caused
> a runfail for one of SPEC2006 benchmarks.  My investigation showed number
> of vector iterations calculation doesn't match nb_iterations_upper_bound
> adjustment in a way PEELING_FOR_GAP is accounted.
>
> Looking into vect_generate_tmps_on_preheader I see:
>
> /* If epilogue loop is required because of data accesses with gaps, we
>subtract one iteration from the total number of iterations here for
>correct calculation of RATIO.  */
>
> And then we decrement loop counter before dividing it by VF to compute
> ratio and ratio_mult_vf.  This doesn't match nb_iterations_upper_bound
> update and that's why I fixed it.  This resolved runfail for me.
>
> Thus ratio_mult_vf computation conflicts with your statement we peel a
> vector iteration.

Hum.  I stand corrected.  So yes, we remove the last vector iteration if
there are not already epilogue iterations.

>>
>> But it would be interesting to know why we need the != 0 check - static
>> cost modelling should have disabled vectorization if the vectorized body
>> isn't run.
>>
>>> This patch modifies nb_iterations_upper_bound computation to resolve
>>> these issues.
>>
>> You do not adjust the ->nb_iterations_estimate accordingly.
>>
>>> Running regression testing I got one fail due to optimized loop. Here
>>> is the loop:
>>>
>>> foo (signed char s)
>>> {
>>>   signed char i;
>>>   for (i = 0; i < s; i++)
>>> yy[i] = (signed int) i;
>>> }
>>>
>>> Here we vectorize for AVX512 using VF=64.  Original loop has max 127
>>> iterations and therefore vectorized loop may be executed only once.
>>> With the patch applied compiler detects it and transforms loop into
>>> BB with just stores of constants vectors into yy.  Test was adjusted
>>> to increase number of possible iterations.  A copy of test was added
>>> to check we can optimize out the original loop.
>>>
>>> Bootstrapped and regtested on x86_64-pc-linux-gnu.  OK for trunk?
>>
>> I'd like to see testcases covering the corner-cases - have them have
>> upper bound estimates by adjusting known array sizes and also cover
>> the case of peeling for gaps.
>
> OK, I'll make more tests.

Thanks,
Richard.

> Thanks,
> Ilya
>
>>
>> Richard.
>>


Re: [Patch AArch64] Set TARGET_OMIT_STRUCT_RETURN_REG to true.

2016-04-27 Thread James Greenhalgh
On Tue, Apr 26, 2016 at 02:22:58PM +0100, Ramana Radhakrishnan wrote:
> As $SUBJECT. The reason this caught my eye on aarch64 is because
> the return value register (x0) is not identical to the register in which
> the hidden parameter for AArch64 is set (x8). Thus setting this to true
> seems to be quite reasonable and shaves off 100 odd mov x0, x8's from
> cc1 in a bootstrap build.
> 
> I don't expect this to make a huge impact on performance but as they say
> every little counts.  The AAPCS64 is quite explicit about not requiring that
> the contents of x8 be kept live.
> 
> Bootstrapped and regression tested on aarch64.
> 
> Ok to apply ?

OK.

Thanks,
James



[PATCH][AArch64] Define WORD_REGISTER_OPERATIONS to zero and comment why

2016-04-27 Thread Kyrill Tkachov

Hi all,

WORD_REGISTER_OPERATIONS is currently cryptically commented out in aarch64.h.
In reality, we cannot define it to 1 for aarch64 because operations narrower
than word_mode (DImode for aarch64) don't behave like word_mode if they use
the W-form of the registers. They'll be performed in SImode in that case.
This patch adds a comment to that effect.
Longer term, I think it should be possible to teach the midend about this
behaviour on aarch64 (maybe re-define WORD_REGISTER_OPERATIONS to something
like NARROWEST_NATURAL_INT_MODE?) to take advantage of these semantics,
but in the meantime this should clear up the current situation.
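
As a rough illustration of why the macro cannot be 1, the following C
sketch models an add through the X-form and the W-form of a register.  It
relies on the architectural behaviour that writing a W register zeroes
bits 63:32 of the corresponding X register; the helper names are invented
for this example and nothing here is taken from the backend.

```c
#include <stdint.h>

/* Model of a 64-bit (X-form) add: the whole word takes part.  */
static uint64_t
add_x_form (uint64_t a, uint64_t b)
{
  return a + b;
}

/* Model of a 32-bit (W-form) add: computed in 32 bits, and the upper
   half of the destination register ends up as zero.  */
static uint64_t
add_w_form (uint64_t a, uint64_t b)
{
  return (uint64_t) (uint32_t) ((uint32_t) a + (uint32_t) b);
}
```

For a = 0xffffffff and b = 1 the X-form model yields 0x100000000 while the
W-form model yields 0, so a narrow operation done in a W register is not
equivalent to the same operation done in word_mode.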

Bootstrapped and tested on aarch64.
This patch shouldn't have any functional changes.
Does the wording look ok for trunk?

Thanks,
Kyrill

2015-04-27  Kyrylo Tkachov  

* config/aarch64/aarch64.h (WORD_REGISTER_OPERATIONS): Define to 0
and explain why in a comment.
diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h
index e2ead511076a2192eb79b79ec0a72777f82af35c..61c56b17efc09b65eeaa5441ab916ab7e0c8a969 100644
--- a/gcc/config/aarch64/aarch64.h
+++ b/gcc/config/aarch64/aarch64.h
@@ -708,7 +708,12 @@ do {	 \
 #define USE_STORE_PRE_INCREMENT(MODE)   0
 #define USE_STORE_PRE_DECREMENT(MODE)   0
 
-/* ?? #define WORD_REGISTER_OPERATIONS  */
+/* WORD_REGISTER_OPERATIONS does not hold for AArch64.
+   The assigned word_mode is DImode but operations narrower than SImode
+   behave as 32-bit operations if using the W-form of the registers rather
+   than as word_mode (64-bit) operations as WORD_REGISTER_OPERATIONS
+   expects.  */
+#define WORD_REGISTER_OPERATIONS 0
 
 /* Define if loading from memory in MODE, an integral mode narrower than
BITS_PER_WORD will either zero-extend or sign-extend.  The value of this


[PATCH][AArch64] Simplify ashl3 expander for SHORT modes

2016-04-27 Thread Kyrill Tkachov

Hi all,

The ashl3 expander for QI and HI modes is needlessly obfuscated.
The 2nd operand predicate accepts nonmemory_operand but the expand code
FAILs if it's not a CONST_INT. We can just demand a const_int_operand in
the predicate and remove the extra CONST_INT check.
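
For anyone comparing the removed and added lines in the hunk below, here
is a small C sketch of the two shift-count masking expressions that appear
there.  It assumes GET_MODE_BITSIZE is the width of the mode in bits and
GET_MODE_MASK is a mask with all bits of the mode set; the function names
are hypothetical stand-ins, not GCC macros.

```c
/* Stand-in for INTVAL (op) & (GET_MODE_BITSIZE (mode) - 1),
   as in the removed lines.  BITSIZE is 8 for QImode, 16 for HImode.  */
static long
mask_by_bitsize (long count, unsigned int bitsize)
{
  return count & (long) (bitsize - 1);
}

/* Stand-in for INTVAL (op) & GET_MODE_MASK (mode),
   as in the added line.  */
static long
mask_by_mode_mask (long count, unsigned int bitsize)
{
  return count & ((1L << bitsize) - 1);
}
```

The two agree for counts smaller than the mode width.  For a 16-bit mode
and a count of 16, the first gives 0 (which takes the plain-move path)
and the second gives 16.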

Looking at git blame, it seems it was written that way as a result of some
other refactoring a few years back for an unrelated change.

Bootstrapped and tested on aarch64-none-linux-gnu.
Ok for trunk?

Thanks,
Kyrill

2016-04-27  Kyrylo Tkachov  

* config/aarch64/aarch64.md (ashl3, SHORT modes):
Use const_int_operand for operand 2 predicate.  Simplify expand code
as a result.
diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index d7a669e40f9d4ae863c3e48b73f0eebdecea340d..c08e89bc4eb7b51dbb1e5f893238824caeb5f317 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -3770,22 +3770,16 @@ (define_expand "3"
 (define_expand "ashl3"
   [(set (match_operand:SHORT 0 "register_operand")
 	(ashift:SHORT (match_operand:SHORT 1 "register_operand")
-		  (match_operand:QI 2 "nonmemory_operand")))]
+		  (match_operand:QI 2 "const_int_operand")))]
   ""
   {
-if (CONST_INT_P (operands[2]))
-  {
-operands[2] = GEN_INT (INTVAL (operands[2])
-   & (GET_MODE_BITSIZE (mode) - 1));
+operands[2] = GEN_INT (INTVAL (operands[2]) & GET_MODE_MASK (mode));
 
-if (operands[2] == const0_rtx)
-  {
-	emit_insn (gen_mov (operands[0], operands[1]));
-	DONE;
-  }
+if (operands[2] == const0_rtx)
+  {
+	emit_insn (gen_mov (operands[0], operands[1]));
+	DONE;
   }
-else
-  FAIL;
   }
 )
 


[PATCH][AArch64] Delete obsolete CC_ZESWP and CC_SESWP CC modes

2016-04-27 Thread Kyrill Tkachov

Hi all,

The CC_ZESWP and CC_SESWP modes are not used anywhere and seem to be a remnant
of some old code that was removed. The various compare+extend patterns in
aarch64.md don't use these modes. So it should be safe to remove them to avoid
future confusion.

Bootstrapped and tested on aarch64.

Ok for trunk?

Thanks,
Kyrill

2016-04-27  Kyrylo Tkachov  

* config/aarch64/aarch64-modes.def (CC_ZESWP, CC_SESWP): Delete.
* config/aarch64/aarch64.c (aarch64_select_cc_mode): Remove condition
that returns CC_SESWPmode and CC_ZESWPmode.
(aarch64_get_condition_code_1): Remove handling of CC_ZESWPmode
and CC_SESWPmode.
(aarch64_rtx_costs): Likewise.
diff --git a/gcc/config/aarch64/aarch64-modes.def b/gcc/config/aarch64/aarch64-modes.def
index 7de0b3f2fec1024946e40c66088b5b48675c4b7a..de8227f0ce47f4268761047d4e7bc46627c34bc7 100644
--- a/gcc/config/aarch64/aarch64-modes.def
+++ b/gcc/config/aarch64/aarch64-modes.def
@@ -21,8 +21,6 @@
 CC_MODE (CCFP);
 CC_MODE (CCFPE);
 CC_MODE (CC_SWP);
-CC_MODE (CC_ZESWP); /* zero-extend LHS (but swap to make it RHS).  */
-CC_MODE (CC_SESWP); /* sign-extend LHS (but swap to make it RHS).  */
 CC_MODE (CC_NZ);/* Only N and Z bits of condition flags are valid.  */
 CC_MODE (CC_Z); /* Only Z bit of condition flags is valid.  */
 CC_MODE (CC_C); /* Only C bit of condition flags is valid.  */
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 466712e0c1e9c99ed76cb55728e9eeb6783eaa13..5ca2ae820335e9e656fb2c1b929f903bf0287f19 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -4192,14 +4192,6 @@ aarch64_select_cc_mode (RTX_CODE code, rtx x, rtx y)
   && GET_CODE (x) == NEG)
 return CC_Zmode;
 
-  /* A compare of a mode narrower than SI mode against zero can be done
- by extending the value in the comparison.  */
-  if ((GET_MODE (x) == QImode || GET_MODE (x) == HImode)
-  && y == const0_rtx)
-/* Only use sign-extension if we really need it.  */
-return ((code == GT || code == GE || code == LE || code == LT)
-	? CC_SESWPmode : CC_ZESWPmode);
-
   /* A test for unsigned overflow.  */
   if ((GET_MODE (x) == DImode || GET_MODE (x) == TImode)
   && code == NE
@@ -4268,8 +4260,6 @@ aarch64_get_condition_code_1 (enum machine_mode mode, enum rtx_code comp_code)
   break;
 
 case CC_SWPmode:
-case CC_ZESWPmode:
-case CC_SESWPmode:
   switch (comp_code)
 	{
 	case NE: return AARCH64_NE;
@@ -6402,10 +6392,6 @@ aarch64_rtx_costs (rtx x, machine_mode mode, int outer ATTRIBUTE_UNUSED,
   /* TODO: A write to the CC flags possibly costs extra, this
 	 needs encoding in the cost tables.  */
 
-  /* CC_ZESWPmode supports zero extend for free.  */
-  if (mode == CC_ZESWPmode && GET_CODE (op0) == ZERO_EXTEND)
-op0 = XEXP (op0, 0);
-
 	  mode = GET_MODE (op0);
   /* ANDS.  */
   if (GET_CODE (op0) == AND)


Re: [RFC] Update gmp/mpfr/mpc minimum versions

2016-04-27 Thread Bernd Edlinger
On 26.04.2016 21:28, Marc Glisse wrote:
> On Tue, 26 Apr 2016, Bernd Edlinger wrote:
>
>> For instance PR libstdc++/69881: gmp.h did this:
>>
>> #define __need_size_t  /* tell gcc stddef.h we only want size_t */
>> #include  /* for size_t */
>>
>> I've persuaded Jonathan to work around that in libstdc++.
>>
>> Of course the in-tree build does work with less versions than
>> otherwise.
>
> IIUC, the bug only shows up if you compile in C++11 or later, so
> basically g++-6 or later, and there is a workaround in libstdc++
> starting from version 6 that means that it doesn't cause any problem. So
> there might be a problem if someone tries to build gcc using CXX='g++-5
> -std=c++11' or CXX='clang++ -stdlib=libc++' on a glibc system (I don't
> think others use __need_size_t?), but those are rather odd cases.
>

Yea, but Jonathan did not like this workaround at all, and my personal
preference would also just have been a better error message for this
clearly invalid code.


Bernd.


Contents of PO file 'cpplib-6.1.0.uk.po'

2016-04-27 Thread Translation Project Robot


cpplib-6.1.0.uk.po.gz
Description: Binary data
The Translation Project robot, in the
name of your translation coordinator.



New Ukrainian PO file for 'cpplib' (version 6.1.0)

2016-04-27 Thread Translation Project Robot
Hello, gentle maintainer.

This is a message from the Translation Project robot.

A revised PO file for textual domain 'cpplib' has been submitted
by the Ukrainian team of translators.  The file is available at:

http://translationproject.org/latest/cpplib/uk.po

(This file, 'cpplib-6.1.0.uk.po', has just now been sent to you in
a separate email.)

All other PO files for your package are available in:

http://translationproject.org/latest/cpplib/

Please consider including all of these in your next release, whether
official or a pretest.

Whenever you have a new distribution with a new version number ready,
containing a newer POT file, please send the URL of that distribution
tarball to the address below.  The tarball may be just a pretest or a
snapshot, it does not even have to compile.  It is just used by the
translators when they need some extra translation context.

The following HTML page has been updated:

http://translationproject.org/domain/cpplib.html

If any question arises, please contact the translation coordinator.

Thank you for all your work,

The Translation Project robot, in the
name of your translation coordinator.




[PATCH][ARM] Fix costing of sign-extending load in rtx costs

2016-04-27 Thread Kyrill Tkachov

Hi all,

Another costs issue that came out of the investigation for PR 65932 is that
sign-extending loads get a higher cost than they should in the arm backend.
The problem is that when handling a sign-extend of a MEM we add the cost
of the load_sign_extend cost field and then recursively add the cost of the
inner MEM rtx, which is bogus. This will end up adding an extra load cost on it.

The solution in this patch is to just remove that recursive step.
With this patch, from various CSE dumps I see much more sane costs assigned
to these expressions (such as 12 instead of 32 or higher).
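
A toy C model of the double counting, for illustration only.  The
COSTS_N_INSNS scaling matches GCC's definition, but the two helper
functions and any per-core numbers fed to them are hypothetical and
simplify what arm_new_rtx_costs does.

```c
/* Same scaling as GCC's COSTS_N_INSNS.  */
#define COSTS_N_INSNS(N) ((N) * 4)

/* Before: the inner MEM is costed recursively and the
   load_sign_extend field is added on top of that.  */
static int
sign_extend_load_cost_before (int inner_mem_cost, int load_sign_extend)
{
  return inner_mem_cost + load_sign_extend;
}

/* After: only the load_sign_extend field is counted for the load.  */
static int
sign_extend_load_cost_after (int load_sign_extend)
{
  return load_sign_extend;
}
```

With made-up inputs of COSTS_N_INSNS (5) for the inner MEM and
COSTS_N_INSNS (3) for load_sign_extend, the first helper returns 32 and
the second 12.  These inputs were picked only to land near the figures
quoted above and do not come from any real cost table.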

Bootstrapped and tested on arm-none-linux-gnueabihf.

Ok for trunk?

Thanks,
Kyrill

2016-04-27  Kyrylo Tkachov  

* config/arm/arm.c (arm_new_rtx_costs, SIGN_EXTEND case):
Don't add cost of inner memory when handling sign-extended
loads.
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 7781b4b449ed48a8d902802d8e6a5c8e1ae7793f..7f2babe7339de3586de190bbe2cf8112919dd96f 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -10911,8 +10911,6 @@ arm_new_rtx_costs (rtx x, enum rtx_code code, enum rtx_code outer_code,
   if ((arm_arch4 || GET_MODE (XEXP (x, 0)) == SImode)
 	  && MEM_P (XEXP (x, 0)))
 	{
-	  *cost = rtx_cost (XEXP (x, 0), VOIDmode, code, 0, speed_p);
-
 	  if (mode == DImode)
 	*cost += COSTS_N_INSNS (1);
 


Re: [AArch64] Emit division using the Newton series

2016-04-27 Thread James Greenhalgh
> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
> index b7086dd..21af809 100644
> --- a/gcc/config/aarch64/aarch64.c
> +++ b/gcc/config/aarch64/aarch64.c
> @@ -414,7 +414,8 @@ static const struct tune_params generic_tunings =
>0, /* max_case_values.  */
>0, /* cache_line_size.  */
>tune_params::AUTOPREFETCHER_OFF,   /* autoprefetcher_model.  */
> -  (AARCH64_EXTRA_TUNE_NONE)  /* tune_flags.  */
> +  (AARCH64_EXTRA_TUNE_NONE), /* tune_flags.  */
> +  (AARCH64_APPROX_NONE)  /* approx_div_modes.  */
>  };
>  
>  static const struct tune_params cortexa35_tunings =
> @@ -439,7 +440,8 @@ static const struct tune_params cortexa35_tunings =
>0, /* max_case_values.  */
>0, /* cache_line_size.  */
>tune_params::AUTOPREFETCHER_WEAK,  /* autoprefetcher_model.  */
> -  (AARCH64_EXTRA_TUNE_NONE)  /* tune_flags.  */
> +  (AARCH64_EXTRA_TUNE_NONE), /* tune_flags.  */
> +  (AARCH64_APPROX_NONE)  /* approx_div_modes.  */
>  };
>  
>  static const struct tune_params cortexa53_tunings =
> @@ -464,7 +466,8 @@ static const struct tune_params cortexa53_tunings =
>0, /* max_case_values.  */
>0, /* cache_line_size.  */
>tune_params::AUTOPREFETCHER_WEAK,  /* autoprefetcher_model.  */
> -  (AARCH64_EXTRA_TUNE_NONE)  /* tune_flags.  */
> +  (AARCH64_EXTRA_TUNE_NONE), /* tune_flags.  */
> +  (AARCH64_APPROX_NONE)  /* approx_div_modes.  */
>  };
>  
>  static const struct tune_params cortexa57_tunings =
> @@ -489,7 +492,8 @@ static const struct tune_params cortexa57_tunings =
>0, /* max_case_values.  */
>0, /* cache_line_size.  */
>tune_params::AUTOPREFETCHER_WEAK,  /* autoprefetcher_model.  */
> -  (AARCH64_EXTRA_TUNE_RENAME_FMA_REGS)   /* tune_flags.  */
> +  (AARCH64_EXTRA_TUNE_RENAME_FMA_REGS),  /* tune_flags.  */
> +  (AARCH64_APPROX_NONE)  /* approx_div_modes.  */
>  };
>  
>  static const struct tune_params cortexa72_tunings =
> @@ -514,7 +518,8 @@ static const struct tune_params cortexa72_tunings =
>0, /* max_case_values.  */
>0, /* cache_line_size.  */
>tune_params::AUTOPREFETCHER_OFF,   /* autoprefetcher_model.  */
> -  (AARCH64_EXTRA_TUNE_NONE)  /* tune_flags.  */
> +  (AARCH64_EXTRA_TUNE_NONE), /* tune_flags.  */
> +  (AARCH64_APPROX_NONE)  /* approx_div_modes.  */
>  };
>  
>  static const struct tune_params exynosm1_tunings =
> @@ -538,7 +543,8 @@ static const struct tune_params exynosm1_tunings =
>48,/* max_case_values.  */
>64,/* cache_line_size.  */
>tune_params::AUTOPREFETCHER_WEAK, /* autoprefetcher_model.  */
> -  (AARCH64_EXTRA_TUNE_APPROX_RSQRT) /* tune_flags.  */
> +  (AARCH64_EXTRA_TUNE_APPROX_RSQRT), /* tune_flags.  */
> +  (AARCH64_APPROX_NONE) /* approx_div_modes.  */
>  };
>  
>  static const struct tune_params thunderx_tunings =
> @@ -562,7 +568,8 @@ static const struct tune_params thunderx_tunings =
>0, /* max_case_values.  */
>0, /* cache_line_size.  */
>tune_params::AUTOPREFETCHER_OFF,   /* autoprefetcher_model.  */
> -  (AARCH64_EXTRA_TUNE_NONE)  /* tune_flags.  */
> +  (AARCH64_EXTRA_TUNE_NONE), /* tune_flags.  */
> +  (AARCH64_APPROX_NONE)  /* approx_div_modes.  */
>  };
>  
>  static const struct tune_params xgene1_tunings =
> @@ -586,7 +593,8 @@ static const struct tune_params xgene1_tunings =
>0, /* max_case_values.  */
>0, /* cache_line_size.  */
>tune_params::AUTOPREFETCHER_OFF,   /* autoprefetcher_model.  */
> -  (AARCH64_EXTRA_TUNE_APPROX_RSQRT)  /* tune_flags.  */
> +  (AARCH64_EXTRA_TUNE_APPROX_RSQRT), /* tune_flags.  */
> +  (AARCH64_APPROX_NONE)  /* approx_div_modes.  */
>  };

So this is off for all cores currently supported by GCC?

I'm not sure I understand why we should take this if it will immediately
be dead code?
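
For readers following the thread, here is a minimal C sketch of the
technique named in the subject line: approximate 1/d with Newton-Raphson
steps and multiply by the numerator.  This is not the sequence the patch
emits.  On AArch64 the initial estimate and the refinement step would
presumably come from instructions such as FRECPE and FRECPS; the portable
linear estimate, the fixed count of four steps and the function names
below are assumptions made purely for the example.

```c
/* Hypothetical sketch, not GCC's expansion: approximate 1/d with
   Newton-Raphson steps x' = x * (2 - d * x).  */
static double
approx_recip (double d)
{
  double sign = 1.0, scale = 1.0, x;
  int i;

  if (d < 0.0)
    {
      sign = -1.0;
      d = -d;
    }

  /* This sketch does not model zero, infinity or NaN.  */
  if (!(d > 0.0) || d * 0.5 == d)
    return 0.0;

  /* Bring d into [0.5, 1) using only multiplications by powers of two,
     remembering the factor so that 1/d_original = scale * (1/d).  */
  while (d >= 1.0)
    {
      d *= 0.5;
      scale *= 0.5;
    }
  while (d < 0.5)
    {
      d *= 2.0;
      scale *= 2.0;
    }

  /* Linear initial estimate; relative error at most 1/17 on [0.5, 1).  */
  x = 48.0 / 17.0 - 32.0 / 17.0 * d;

  /* Each step roughly squares the relative error.  */
  for (i = 0; i < 4; i++)
    x = x * (2.0 - d * x);

  return sign * scale * x;
}

static double
approx_div (double n, double d)
{
  return n * approx_recip (d);
}
```

The result is an approximation and is not guaranteed to be correctly
rounded, which is why this kind of expansion is normally gated on a
tuning flag and relaxed floating-point options.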

Thanks,
James



Re: [arm-embedded][PATCH, GCC/ARM, 2/3] Error out for incompatible ARM multilibs

2016-04-27 Thread Thomas Preudhomme
Ping?

Best regards,

Thomas

On Thursday 17 December 2015 17:32:48 Thomas Preud'homme wrote:
> Hi,
> 
> We decided to apply the following patch to the ARM embedded 5 branch.
> 
> Best regards,
> 
> Thomas
> 
> > -Original Message-
> > From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-
> > ow...@gcc.gnu.org] On Behalf Of Thomas Preud'homme
> > Sent: Wednesday, December 16, 2015 7:59 PM
> > To: gcc-patches@gcc.gnu.org; Richard Earnshaw; Ramana Radhakrishnan;
> > Kyrylo Tkachov
> > Subject: [PATCH, GCC/ARM, 2/3] Error out for incompatible ARM
> > multilibs
> > 
> > Currently in config.gcc, only the first multilib in a multilib list is
> > checked for validity and the following elements are ignored due to the
> > break which only breaks out of loop in shell. A loop is also done over
> > the multilib list elements despite no combination being legal. This patch
> > reworks the code to address both issues.
> > 
> > ChangeLog entry is as follows:
> > 
> > 
> > 2015-11-24  Thomas Preud'homme  
> > 
> > * config.gcc: Error out when conflicting multilib is detected.  Do
> > not
> > loop over multilibs since no combination is legal.
> > 
> > diff --git a/gcc/config.gcc b/gcc/config.gcc
> > index 59aee2c..be3c720 100644
> > --- a/gcc/config.gcc
> > +++ b/gcc/config.gcc
> > @@ -3772,38 +3772,40 @@ case "${target}" in
> > 
> > # Add extra multilibs
> > if test "x$with_multilib_list" != x; then
> > 
> > arm_multilibs=`echo $with_multilib_list | sed -e
> > 
> > 's/,/ /g'`
> > -   for arm_multilib in ${arm_multilibs}; do
> > -   case ${arm_multilib} in
> > -   aprofile)
> > +   case ${arm_multilibs} in
> > +   aprofile)
> > 
> > # Note that arm/t-aprofile is a
> > # stand-alone make file fragment to be
> > # used only with itself.  We do not
> > # specifically use the
> > # TM_MULTILIB_OPTION framework
> > 
> > because
> > 
> > # this shorthand is more
> > 
> > -   # pragmatic. Additionally it is only
> > -   # designed to work without any
> > -   # with-cpu, with-arch with-mode
> > +   # pragmatic.
> > +   tmake_profile_file="arm/t-aprofile"
> > +   ;;
> > +   default)
> > +   ;;
> > +   *)
> > +   echo "Error: --with-multilib-
> > list=${with_multilib_list} not supported." 1>&2
> > +   exit 1
> > +   ;;
> > +   esac
> > +
> > +   if test "x${tmake_profile_file}" != x ; then
> > +   # arm/t-aprofile is only designed to work
> > +   # without any with-cpu, with-arch, with-
> > mode,
> > 
> > # with-fpu or with-float options.
> > 
> > -   if test "x$with_arch" != x \
> > -   || test "x$with_cpu" != x \
> > -   || test "x$with_float" != x \
> > -   || test "x$with_fpu" != x \
> > -   || test "x$with_mode" != x ;
> > then
> > -   echo "Error: You cannot use
> > any of --with-arch/cpu/fpu/float/mode with --with-multilib-list=aprofile"
> > 1>&2
> > -   exit 1
> > -   fi
> > -   tmake_file="${tmake_file}
> > arm/t-aprofile"
> > -   break
> > -   ;;
> > -   default)
> > -   ;;
> > -   *)
> > -   echo "Error: --with-multilib-
> > list=${with_multilib_list} not supported." 1>&2
> > -   exit 1
> > -   ;;
> > -   esac
> > -   done
> > +   if test "x$with_arch" != x \
> > +   || test "x$with_cpu" != x \
> > +   || test "x$with_float" != x \
> > +   || test "x$with_fpu" != x \
> > +   || test "x$with_mode" != x ; then
> > +   echo "Error: You cannot use any of --
> > with-arch/cpu/fpu/float/mode with --with-multilib-list=${arm_multilib}"
> > 1>&2
> > +   exit 1
> > +   fi
> > +
> > +   tmake_file=

Re: [AArch64] Emit square root using the Newton series

2016-04-27 Thread James Greenhalgh
On Tue, Apr 12, 2016 at 01:14:51PM -0500, Evandro Menezes wrote:
> On 04/05/16 17:30, Evandro Menezes wrote:
> >On 04/05/16 13:37, Wilco Dijkstra wrote:
> >>I can't get any of these to work... Not only do I get a large
> >>number of collisions and duplicated
> >>code between these patches, when I try to resolve them, all I
> >>get is crashes whenever I try
> >>to use sqrt (even rsqrt stopped working). Do you have a patchset
> >>that applies cleanly so I can
> >>try all approximation routines?
> >
> >The original patches should be independent of each other, so
> >indeed they duplicate code.
> >
> >This patch suite should be suitable for testing.

Take a look at other patch sets posted to this list for examples of how
to make review easier.

Please send a series of emails tagged:

[Patch 0/3 AArch64] Add infrastructure for more approximate FP operations
[PATCH 1/3 AArch64] Add more choices for the reciprocal square root 
approximation
[PATCH 2/3 AArch64] Emit square root using the Newton series
[PATCH 3/3 AArch64] Emit division using the Newton series

One patch per email, with the dependencies explicit like this, is
infinitely easier to follow than the current structure of your patch set.

I'm not trying to be pedantic for the sake of it, I'm genuinely unsure where
the latest patch versions currently are and how I should apply them to a
clean tree for review.

Thanks,
James



Re: [PING^3] Re: [PATCH 1/4] Add gcc-auto-profile script

2016-04-27 Thread Andi Kleen
Andi Kleen  writes:

Ping^3 for the patch series!

> Andi Kleen  writes:
>
> Ping^2 for the patch series!
>
>> Andi Kleen  writes:
>>
>> Ping for the patch series!
>>
>>> From: Andi Kleen 
>>>
>>> Using autofdo is currently somewhat difficult. It requires using the
>>> model specific branches taken event, which differs on different CPUs.
>>> The example shown in the manual requires a special patched version of
>>> perf that is non standard, and also will likely not work everywhere.
>>>
>>> This patch adds a new gcc-auto-profile script that figures out the
>>> correct event and runs perf. The script is installed on Linux systems.
>>>
>>> Since maintaining the script would be somewhat tedious (needs changes
>>> every time a new CPU comes out) I auto generated it from the online
>>> Intel event database. The script to do that is in contrib and can be
>>> rerun.
>>>
>>> Right now there is no test if perf works in configure. This
>>> would vary depending on the build and target system, and since
>>> it currently doesn't work in virtualization and needs an up-to-date
>>> kernel it may often fail in common distribution build setups.
>>>
>>> So Linux just hardcodes installing the script, but it may fail at runtime.
>>>
>>> This is needed to actually make use of autofdo in a generic way
>>> in the build system and in the test suite.
>>>
>>> So far the script is not installed.
>>>
>>> gcc/:
>>> 2016-03-27  Andi Kleen  
>>>
>>> * doc/invoke.texi: Document gcc-auto-profile
>>> * gcc-auto-profile: Create.
>>>
>>> contrib/:
>>>
>>> 2016-03-27  Andi Kleen  
>>>
>>> * gen_autofdo_event.py: New file to regenerate
>>> gcc-auto-profile.
>>> ---
>>>  contrib/gen_autofdo_event.py | 155 
>>> +++
>>>  gcc/doc/invoke.texi  |  31 +++--
>>>  gcc/gcc-auto-profile |  70 +++
>>>  3 files changed, 251 insertions(+), 5 deletions(-)
>>>  create mode 100755 contrib/gen_autofdo_event.py
>>>  create mode 100755 gcc/gcc-auto-profile
>>>
>>> diff --git a/contrib/gen_autofdo_event.py b/contrib/gen_autofdo_event.py
>>> new file mode 100755
>>> index 000..db4db33
>>> --- /dev/null
>>> +++ b/contrib/gen_autofdo_event.py
>>> @@ -0,0 +1,155 @@
>>> +#!/usr/bin/python
>>> +# generate Intel taken branches Linux perf event script for autofdo 
>>> profiling
>>> +
>>> +# Copyright (C) 2016 Free Software Foundation, Inc.
>>> +#
>>> +# GCC is free software; you can redistribute it and/or modify it under
>>> +# the terms of the GNU General Public License as published by the Free
>>> +# Software Foundation; either version 3, or (at your option) any later
>>> +# version.
>>> +#
>>> +# GCC is distributed in the hope that it will be useful, but WITHOUT ANY
>>> +# WARRANTY; without even the implied warranty of MERCHANTABILITY or
>>> +# FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
>>> +# for more details.
>>> +#
>>> +# You should have received a copy of the GNU General Public License
>>> +# along with GCC; see the file COPYING3.  If not see
>>> +# .  */
>>> +
>>> +# run it with perf record -b -e EVENT program ...
>>> +# The Linux Kernel needs to support the PMU of the current CPU, and
>>> +# it will likely not work in VMs.
>>> +# add --all to print for all cpus, otherwise for current cpu
>>> +# add --script to generate shell script to run correct event
>>> +#
>>> +# requires internet (https) access. this may require setting up a proxy
>>> +# with export https_proxy=...
>>> +#
>>> +import urllib2
>>> +import sys
>>> +import json
>>> +import argparse
>>> +import collections
>>> +
>>> +baseurl = "https://download.01.org/perfmon";
>>> +
>>> +target_events = (u'BR_INST_RETIRED.NEAR_TAKEN',
>>> + u'BR_INST_EXEC.TAKEN',
>>> + u'BR_INST_RETIRED.TAKEN_JCC',
>>> + u'BR_INST_TYPE_RETIRED.COND_TAKEN')
>>> +
>>> +ap = argparse.ArgumentParser()
>>> +ap.add_argument('--all', '-a', help='Print for all CPUs', 
>>> action='store_true')
>>> +ap.add_argument('--script', help='Generate shell script', 
>>> action='store_true')
>>> +args = ap.parse_args()
>>> +
>>> +eventmap = collections.defaultdict(list)
>>> +
>>> +def get_cpu_str():
>>> +with open('/proc/cpuinfo', 'r') as c:
>>> +vendor, fam, model = None, None, None
>>> +for j in c:
>>> +n = j.split()
>>> +if n[0] == 'vendor_id':
>>> +vendor = n[2]
>>> +elif n[0] == 'model' and n[1] == ':':
>>> +model = int(n[2])
>>> +elif n[0] == 'cpu' and n[1] == 'family':
>>> +fam = int(n[3])
>>> +if vendor and fam and model:
>>> +return "%s-%d-%X" % (vendor, fam, model), model
>>> +return None, None
>>> +
>>> +def find_event(eventurl, model):
>>> +print >>sys.stderr, "Downloading", eventurl
>>> +u = urllib2.urlopen(eventurl)
>>> +events = json.loads(u.read())
>>> +u.close()
>>> +
>>> +  

Re: C, C++: New warning for memset without multiply by elt size

2016-04-27 Thread Jeff Law

On 04/27/2016 03:55 AM, Bernd Schmidt wrote:
> On 04/26/2016 11:23 PM, Martin Sebor wrote:
>> The documentation for the new option implies that it should warn
>> for calls to memset where the third argument contains the number
>> of elements not multiplied by the element size.  But in my (quick)
>> testing it only warns when the argument is a constant equal to
>> the number of elements and less than the size of the array.  For
>> example, neither of the following is diagnosed:
>>
>>  int a [4];
>>  __builtin_memset (a, 0, 2 + 2);
>>  __builtin_memset (a, 0, 4 * 1);
>>  __builtin_memset (a, 0, 3);
>>  __builtin_memset (a, 0, 4 * sizeof a);
>>
>> If it's possible and not too difficult, it would be nice if
>> the detection logic could be made a bit smarter to also diagnose
>> these less trivial cases (and matched the documented behavior).
>
> I've thought about some of these cases. The problem is there are
> legitimate cases of calling memset for only part of an array. I wanted
> to start with something that is unlikely to give false positives.

So I wonder if what we really want is to track which bytes in the object
are set and which are not -- utilizing both memset and standard stores
and if the object as a whole is not initialized, then warn.


We've actually got a lot of the code that would be necessary to detect 
this in tree DSE, with more coming in this stage1 as I extend it to 
handle some missing cases.


Clearly a follow-up rather than a requirement for the current patch to 
move forward.
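
To make the class of mistake under discussion concrete, here is a small
self-contained C sketch.  It is only an illustration of the bug pattern,
not of the warning's implementation; the helper name and the array length
N are made up for the example.

```c
#include <stddef.h>
#include <string.h>

#define N 4

/* Fill an int array with a nonzero pattern, memset BYTES bytes to zero,
   and report how many leading elements ended up fully zeroed.  */
static size_t
zeroed_elements (size_t bytes)
{
  int a[N];
  size_t i, n = 0;

  for (i = 0; i < N; i++)
    a[i] = -1;

  memset (a, 0, bytes);

  while (n < N && a[n] == 0)
    n++;
  return n;
}
```

Passing the element count, zeroed_elements (N), clears only N bytes and
so leaves part of the array untouched on targets where int is wider than
one byte.  Passing N * sizeof (int) clears all N elements.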


Jeff



Re: [PATCH][AArch64] Improve aarch64_modes_tieable_p

2016-04-27 Thread James Greenhalgh
On Fri, Apr 22, 2016 at 01:22:51PM +, Wilco Dijkstra wrote:
> Improve modes_tieable by returning true in more cases: allow scalar access
> within vectors without requiring an extra move. Removing these moves helps
> the register allocator in deciding whether to use integer or FP registers on
> operations that can be done on both. This saves about 100 instructions in the
> gcc.target/aarch64 tests.
>

[snip]

> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
> index abc864c..6e921f0 100644
> --- a/gcc/config/aarch64/aarch64.c
> +++ b/gcc/config/aarch64/aarch64.c
> @@ -12294,7 +12294,14 @@ aarch64_reverse_mask (enum machine_mode mode)
>return force_reg (V16QImode, mask);
>  }
>  
> -/* Implement MODES_TIEABLE_P.  */
> +/* Implement MODES_TIEABLE_P.  In principle we should always return true.
> +   However due to issues with register allocation it is preferable to avoid
> +   tieing integer scalar and FP scalar modes.  Executing integer operations
> +   in general registers is better than treating them as scalar vector
> +   operations.  This reduces latency and avoids redundant int<->FP moves.
> +   So tie modes if they are either the same class, or vector modes with
> +   other vector modes, vector structs or any scalar mode.
> +*/

*/ shouldn't be on the newline, just "[...] scalar mode.  */"

It would be handy if you could raise something in bugzilla for the
register allocator deficiency.

>  bool
>  aarch64_modes_tieable_p (machine_mode mode1, machine_mode mode2)
> @@ -12305,9 +12312,12 @@ aarch64_modes_tieable_p (machine_mode mode1, 
> machine_mode mode2)
>/* We specifically want to allow elements of "structure" modes to
>   be tieable to the structure.  This more general condition allows
>   other rarer situations too.  */
> -  if (TARGET_SIMD
> -  && aarch64_vector_mode_p (mode1)
> -  && aarch64_vector_mode_p (mode2))
> +  if (aarch64_vector_mode_p (mode1) && aarch64_vector_mode_p (mode2))
> +return true;

This relaxes the TARGET_SIMD check that would have prevented
OImode/CImode/XImode ties when !TARGET_SIMD. What's the reasoning
behind that?

> +  /* Also allow any scalar modes with vectors.  */
> +  if (aarch64_vector_mode_supported_p (mode1)
> +  || aarch64_vector_mode_supported_p (mode2))
>  return true;

Does this always hold? It seems like you might need to be more restrictive
with what we allow to avoid ties with some of the more obscure modes
(V4DF etc.).

Thanks,
James



Re: [PATCH, ARM] Fix gcc.c-torture/execute/loop-2b.c execution failure on cortex-m0

2016-04-27 Thread Ramana Radhakrishnan
>
> Ping? Note that the patch has been on GCC 6 for more than 3 months now without
> any issue reported against it.

OK.

Ramana

>
> Best regards,
>
> Thomas


Re: [PATCH][RFC] Gimplify "into SSA"

2016-04-27 Thread Jeff Law

On 04/21/2016 06:55 AM, Richard Biener wrote:
>
> The following patch makes us not allocate decls but SSA names for
> temporaries required during gimplification.  This is basically the
> same thing as we do when calling the gimplifier on GENERIC expressions
> from optimization passes (when we are already in SSA).
>
> There are two benefits of doing this.
>
> 1) SSA names are smaller (72 bytes) than VAR_DECLs (144 bytes) and we
> rewrite them into anonymous SSA names later anyway, leaving up the
> VAR_DECLs for GC reclaim (but not their UID)
>
> 2) We keep expressions "connected" by having the use->def link via
> SSA_NAME_DEF_STMT for example allowing match-and-simplify of
> larger expressions on early GIMPLE

I like it -- I can't see any significant reason to keep the _DECL nodes
for these temporaries.  They're not useful for end-user debugging or
debugging GCC itself.  In fact, I would claim that these temporary _DECL
nodes just add noise when diffing debugging dumps.

While GC would reclaim the _DECL nodes, I'm all for avoiding placing
work on the GC system when it can be easily avoided.

> Complications arise from the fact that there is no CFG built and thus
> we have to make sure to not use SSA names where we'd need PHIs.  Or
> when CFG build may end up separating SSA def and use in a way current
> into-SSA doesn't fix up (adding of abnormal edges, save-expr placement,
> gimplification of type sizes, etc.).

:(

> As-is the patch has the downside of effectively disabling the
> lookup_tmp_var () CSE for register typed temporaries and not
> preserving the "fancy" names we derive from val in
> create_tmp_from_val (that can be recovered easily though if
> deemed worthwhile).

I don't think it's worthwhile.

ISTM this will affect something like the gimple front-end project which 
would need to see the anonymous name and map it back to a suitable type, 
but I don't think that should stop this from moving forward.


Jeff


Re: [PATCH][AArch64] Fix shift attributes

2016-04-27 Thread James Greenhalgh
On Fri, Apr 22, 2016 at 02:11:52PM +, Wilco Dijkstra wrote:
> This patch fixes the attributes of integer immediate shifts which were
> incorrectly modelled as register controlled shifts. Also change EXTR
> attribute to being a rotate.
> 
> OK for trunk?

OK. Thanks for the fix.

Thanks,
James

> 
> ChangeLog:
> 2016-04-22  Wilco Dijkstra  
> 
>   * gcc/config/aarch64/aarch64.md (aarch64_ashl_sisd_or_int_3):
>   Split integer shifts into shift_reg and bfm.
>   (aarch64_lshr_sisd_or_int_3): Likewise.
>   (aarch64_ashr_sisd_or_int_3): Likewise.
>   (ror3_insn): Likewise.
>   (si3_insn_uxtw): Likewise.
>   (3_insn): Change to rotate_imm.
>   (extr5_insn_alt): Likewise.
>   (extrsi5_insn_uxtw): Likewise.
>   (extrsi5_insn_uxtw_alt): Likewise.



Re: [PATCH GCC]Refactor IVOPT.

2016-04-27 Thread Bin.Cheng
On Fri, Apr 22, 2016 at 8:20 AM, Richard Biener
 wrote:
> On Thu, Apr 21, 2016 at 7:26 PM, Bin Cheng  wrote:
>> Hi,
>> This patch refactors IVOPT in three major aspects:
>> Firstly it rewrites iv_use groups.  Use group is originally introduced only 
>> for address type uses, this patch makes it general to all (generic, compare, 
>> address) types.  Currently generic/compare groups contain only one iv_use, 
>> and compare groups can be extended to contain multiple uses.  As far as 
>> generic use is concerned, it won't contain multiple uses because IVOPT 
>> reuses one iv_use structure for generic uses at different places already.  
>> This change also cleans up algorithms as well as data structures.
>> Secondly it implements group data structure in vector rather than in list as 
>> originally.  List was used because it's easy to split.  Of course list is 
>> hard to sort (For example, we use quadratic insertion sort now).  This 
>> problem will become more critical since I plan to introduce fine-control 
>> over splitting small address groups by checking if target supports 
>> load/store pair instructions or not.  In this case address group needs to be 
>> sorted more than once and against complex conditions, for example, memory 
>> loads in one basic block should be sorted together in offset ascending 
>> order.  With vector group, sorting can be done very efficiently with quick 
>> sort.
>> Thirdly this patch cleans up/reformats IVOPT's dump information.  I think the 
>> information is easier to read/test now.  Since change of dump information is 
>> entangled with group data-structure change, it's hard to make it a 
>> standalone patch.  Given this part patch is quite straightforward, I hope it 
>> won't be confusing.
>>
>> Bootstrap and test on x86_64 and AArch64, no regressions.  I also checked 
>> generated assembly for spec2k and spec2k6 on both platforms, turns out 
>> output assembly is almost not changed except for several files.  After 
>> further investigation, I can confirm the difference is caused by a small change 
>> when sorting groups. Given the original sorting condition as below:
>> -  /* Sub use list is maintained in offset ascending order.  */
>> -  if (addr_offset <= group->addr_offset)
>> -{
>> -  use->related_cands = group->related_cands;
>> -  group->related_cands = NULL;
>> -  use->next = group;
>> -  data->iv_uses[id_group] = use;
>> -}
>> iv_uses with same addr_offset are sorted in reverse control flow order.  
>> This might be a typo since I don't remember any specific reason for it.  If 
>> this patch sorts groups in the same way, there will be no difference in 
>> generated assembly at all.  So this patch is a pure refactoring work which 
>> doesn't have any functional change.
>>
>> Any comments?
>
> Looks good to me.
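
The ordering difference described above (uses with the same addr_offset
ending up in reverse order under the old "<=" list insertion) can be
reproduced with a toy model.  This is only a sketch: struct toy_use and both
functions are hypothetical stand-ins, not the IVOPT data structures, and the
tie-break by discovery order in the comparator is an assumption made for the
example.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for an address-type use: ID is the order in
   which the use was found, OFFSET is its address offset.  */
struct toy_use
{
  int id;
  long offset;
  struct toy_use *next;
};

/* List insertion in offset ascending order.  The "<=" places a new use
   in front of existing uses with the same offset, so ties end up in
   reverse order of discovery.  Returns the new list head.  */
struct toy_use *
list_insert (struct toy_use *head, struct toy_use *use)
{
  struct toy_use **link = &head;

  while (*link && !(use->offset <= (*link)->offset))
    link = &(*link)->next;
  use->next = *link;
  *link = use;
  return head;
}

/* qsort comparator: offset ascending, ties broken by discovery order.  */
static int
compare_offset_then_id (const void *a, const void *b)
{
  const struct toy_use *ua = (const struct toy_use *) a;
  const struct toy_use *ub = (const struct toy_use *) b;

  if (ua->offset != ub->offset)
    return ua->offset < ub->offset ? -1 : 1;
  return (ua->id > ub->id) - (ua->id < ub->id);
}

/* Sort an array of N uses in place.  */
void
vector_sort (struct toy_use *uses, size_t n)
{
  qsort (uses, n, sizeof *uses, compare_offset_then_id);
}
```

Inserting uses with offsets 8, 0, 8 (ids 0, 1, 2) into the list yields ids
1, 2, 0, while sorting the same uses as an array yields 1, 0, 2.  That kind
of reordering among equal offsets is the only difference being described.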

Hi,
Attached is what I applied as r235513, picking up two new tests
that need to be revised.  I also applied Martin's patch on top of it as
r2355134.

Thanks,
bin

2016-04-27  Martin Liska  

* tree-ssa-loop-ivopts.c (iv_ca_dump): Fix level of indentation.
(free_loop_data): Release vuses of groups.

>
> Richard.
>
>> Thanks,
>> bin
>>
>> 2016-04-19  Bin Cheng  
>>
>> * tree-ssa-loop-ivopts.c (struct iv): Use pointer to struct iv_use
>> instead of redundant use_id and boolean have_use_for.
>> (struct iv_use): Change sub_id into group_id.  Remove field next.
>> Move fields: related_cands, n_map_members, cost_map and selected
>> to ...
>> (struct iv_group): ... here.  New structure.
>> (struct iv_common_cand): Use structure declaration directly.
>> (struct ivopts_data, iv_ca, iv_ca_delta): Rename fields.
>> (MAX_CONSIDERED_USES): Rename macro to ...
>> (MAX_CONSIDERED_GROUPS): ... here.
>> (n_iv_uses, iv_use, n_iv_cands, iv_cand): Delete.
>> (dump_iv, dump_use, dump_cand): Refactor format of dump information.
>> (dump_uses): Rename to ...
>> (dump_groups): ... here.  Update all uses.
>> (tree_ssa_iv_optimize_init, alloc_iv): Update all uses.
>> (find_induction_variables): Refactor format of dump information.
>> (record_sub_use): Delete.
>> (record_use): Update all uses.
>> (record_group): New function.
>> (record_group_use, find_interesting_uses_op): Call above functions.
>> Update all uses.
>> (find_interesting_uses_cond): Ditto.
>> (group_compare_offset): New function.
>> (split_all_small_groups): Rename to ...
>> (split_small_address_groups_p): ... here.  Update all uses.
>> (split_address_groups):  Update all uses.
>> (find_interesting_uses): Refactor format of dump information.
>> (add_candidate_1): Update all uses.  Remove redundant check on iv,
>> base and step.
>> (add_candidate, record_common_cand): Remove redundant assert.
>> (add_iv_candidate_for_biv): Update use.
>> (add_iv

Re: moxie-rtems patch for libgcc/config.host

2016-04-27 Thread Jeff Law

On 04/18/2016 03:43 PM, Joel Sherrill wrote:

Hi

For some unknown reason, moxie-rtems has its own stanza
in libgcc/config.host which does not include extra_parts.
This results in C++ RTEMS applications not linking.

Also the tmake_file variable is overridden by the
shared moxie stanza rather than being added to.

This patch addresses both issues. This patch (or some
minor variant) needs to be applied to every branch from
4.9 to master.

Comments?


2015-04-18  Joel Sherrill 

* config.host (moxie-*-rtems*): Merge this stanza with
other moxie targets so the same extra_parts are built.
Also have tmake_file add on to its value rather than override.


OK for the trunk and branches.
jeff


Re: [PATCH] DWARF: turn dw_loc_descr_node field into hash map for frame offset check

2016-04-27 Thread Pierre-Marie de Rodat

On 04/27/2016 01:31 PM, Richard Biener wrote:

Ok.

Thanks,
Richard.


Thank you for the very quick feedback! I just committed the change.

--
Pierre-Marie de Rodat


Re: [PATCH][AArch64] print_operand should not fallthrough from register operand into generic operand

2016-04-27 Thread James Greenhalgh
On Fri, Apr 22, 2016 at 02:24:49PM +, Wilco Dijkstra wrote:
> Some patterns are using '%w2' for immediate operands, which means that a zero
> immediate is actually emitted as 'wzr' or 'xzr'. This not only changes an
> immediate operand into a register operand but may emit illegal instructions
> from legal RTL (eg. ORR x0, SP, xzr rather than ORR x0, SP, 0).
> 
> Remove the fallthrough in aarch64_print_operand from the 'w' and 'x' case
> into the '0' case that created this issue. Modify a few patterns to use '%2'
> rather than '%w2' for an immediate or memory operand so they now print
> correctly without the fallthrough.
> 
> OK for trunk?
> 
> (note this requires https://gcc.gnu.org/ml/gcc-patches/2016-04/msg01265.html 
> to
> be committed first)

If you've got dependencies like this, formatting the mails as a patch set
makes review easier.  e.g.:

  [PATCH 1/2 AArch64] Fix shift attributes
  [PATCH 2/2 AArch64] print_operand should not fallthrough from register
operand into generic operand

My biggest concern about this patch is that it might break code that is in
the wild. Looks to me like this breaks (at least)
arch/arm64/asm/percpu.h in the kernel.

So the part of this patch removing the fallthrough to general operand
is not OK for trunk. 

The other parts look reasonable to me, please resubmit just those.

> --
> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
> index 
> 881dc52e2de03231abb217a9ce22cbb1cc44bc6c..bcef50825c8315c39e29dbe57c387ea2a4fe445d
>  100644
> --- a/gcc/config/aarch64/aarch64.c
> +++ b/gcc/config/aarch64/aarch64.c
> @@ -4608,7 +4608,8 @@ aarch64_print_operand (FILE *f, rtx x, int code)
> break;
>   }
>  
> -  /* Fall through */
> +  output_operand_lossage ("invalid operand for '%%%c'", code);
> +  return;
>  
>  case 0:
>/* Print a normal operand, if it's a general register, then we

To be clear, this is the hunk that is not OK.

Thanks,
James
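
For readers following along, the behaviour under discussion can be modelled
with a small stand-alone C function.  This is a toy, not
aarch64_print_operand: the types, the function name and the FALLTHROUGH
switch are invented for the sketch, and the generic case is reduced to
printing either a register name or a plain number.

```c
#include <stddef.h>
#include <stdio.h>

enum toy_kind { TOY_REG, TOY_CONST };

struct toy_operand
{
  enum toy_kind kind;
  long value;	/* Register number or constant value.  */
};

/* Format OP for modifier CODE ('w', 'x' or 0) into BUF.  With
   FALLTHROUGH nonzero, an operand that the 'w'/'x' case does not
   handle is printed by the generic case; with FALLTHROUGH zero it is
   rejected.  Returns 0 on success and -1 for an invalid operand.  */
int
toy_print_operand (char *buf, size_t size, struct toy_operand op,
		   int code, int fallthrough)
{
  if (code == 'w' || code == 'x')
    {
      if (op.kind == TOY_CONST && op.value == 0)
	{
	  /* A zero immediate turns into the zero register.  */
	  snprintf (buf, size, "%czr", code);
	  return 0;
	}
      if (op.kind == TOY_REG)
	{
	  snprintf (buf, size, "%c%ld", code, op.value);
	  return 0;
	}
      if (!fallthrough)
	return -1;
    }

  /* Generic operand.  */
  if (op.kind == TOY_REG)
    snprintf (buf, size, "x%ld", op.value);
  else
    snprintf (buf, size, "%ld", op.value);
  return 0;
}
```

In this model a zero constant printed with 'w' comes out as "wzr", which is
the register-for-immediate substitution Wilco describes.  A non-zero
constant printed with 'w' comes out as the number only while the fallthrough
is kept, which is why existing inline asm that passes immediates through
'%w' would stop working if the fallthrough were removed.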



Re: Fix some i386 testcases for -frename-registers

2016-04-27 Thread H.J. Lu
On Wed, Apr 27, 2016 at 2:09 AM, Bernd Schmidt  wrote:
> On 04/27/2016 02:10 AM, H.J. Lu wrote:
>>
>> On Tue, Apr 26, 2016 at 3:11 PM, Bernd Schmidt 
>> wrote:
>>>
>>> On 04/26/2016 09:39 PM, H.J. Lu wrote:


 make check-gcc RUNTESTFLAGS="--target_board='unix{-mx32}'
 i386.exp=avx512vl-vmovdqa64-1.c"
>>>
>>>
>>>
>>> Unfortunately, that doesn't work:
>>>
>>> /usr/include/gnu/stubs.h:13:28: fatal error: gnu/stubs-x32.h: No such
>>> file
>>> or directory
>>> compilation terminated.
>>>
>>> Trying to follow the recipe to get an x32 glibc built fails with the same
>>> error when trying to build an x32 libgcc. I think I'll need you to send
>>> me
>>> before/after assembly files (I'm assuming it's -frename-registers which
>>> makes the test fail on x32).
>>>
>>
>> Here are avx512vl-vmovdqa64-1.i, old.s and new.s.
>
>
> Still somewhat at a loss. Is it trying to match a register name and close
> paren with the '.{5}'? What if you replace that with something like
> '%[re][0-9a-z]*//)'? Or maybe '.{5,6}'?
>

This works for -m32, -mx32 and -m64.  OK for trunk?

Thanks.

-- 
H.J.
	* gcc.target/i386/avx512vl-vmovdqa64-1.c: Replace ".{5}" with
	".{5,6}".

diff --git a/gcc/testsuite/gcc.target/i386/avx512vl-vmovdqa64-1.c b/gcc/testsuite/gcc.target/i386/avx512vl-vmovdqa64-1.c
index 6930f79..14fe4b8 100644
--- a/gcc/testsuite/gcc.target/i386/avx512vl-vmovdqa64-1.c
+++ b/gcc/testsuite/gcc.target/i386/avx512vl-vmovdqa64-1.c
@@ -10,7 +10,7 @@
 /* { dg-final { scan-assembler-times "vmovdqa64\[ \\t\]+\[^\{\n\]*\\)\[^\n\]*%xmm\[0-9\]+\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */
 /* { dg-final { scan-assembler-times "vmovdqa64\[ \\t\]+\[^\{\n\]*\\)\[^\n\]*%ymm\[0-9\]+\{%k\[1-7\]\}\{z\}(?:\n|\[ \\t\]+#)" 1 } } */
 /* { dg-final { scan-assembler-times "vmovdqa64\[ \\t\]+\[^\{\n\]*\\)\[^\n\]*%xmm\[0-9\]+\{%k\[1-7\]\}\{z\}(?:\n|\[ \\t\]+#)" 1 } } */
-/* { dg-final { scan-assembler-times "vmovdqa64\[ \\t\]+\[^\{\n\]*%ymm\[0-9\]+\[^\nxy\]*\\(.{5}(?:\n|\[ \\t\]+#)" 1 { target nonpic } } } */
+/* { dg-final { scan-assembler-times "vmovdqa64\[ \\t\]+\[^\{\n\]*%ymm\[0-9\]+\[^\nxy\]*\\(.{5,6}(?:\n|\[ \\t\]+#)" 1 { target nonpic } } } */
 /* { dg-final { scan-assembler-times "vmovdqa64\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+\[^\nxy\]*\\((?:\n|\[ \\t\]+#)" 1 { xfail *-*-* } } } */
 /* { dg-final { scan-assembler-times "vmovdqa64\[ \\t\]+\[^\{\n\]*%ymm\[0-9\]+\[^\n\]*\\)\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */
 /* { dg-final { scan-assembler-times "vmovdqa64\[ \\t\]+\[^\{\n\]*%xmm\[0-9\]+\[^\n\]*\\)\{%k\[1-7\]\}(?:\n|\[ \\t\]+#)" 1 } } */
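
The effect of widening ".{5}" to ".{5,6}" can be checked outside the
testsuite with POSIX extended regular expressions.  The sketch below is an
assumption-laden illustration: the helper name is invented, it only looks at
the tail of the pattern (an open paren, the counted run, end of line), and
the sample operands mentioned afterwards are made up rather than taken from
old.s or new.s.

```c
#include <regex.h>
#include <stddef.h>
#include <stdio.h>

/* Return 1 if LINE ends with "(" followed by a run of characters whose
   length is described by INTERVAL (for example "{5}" or "{5,6}"),
   0 if it does not, and -1 if the pattern fails to compile.  */
int
tail_matches (const char *line, const char *interval)
{
  char pattern[64];
  regex_t re;
  int rc;

  /* Build "\(.<interval>$": literal paren, counted run, end of line.  */
  snprintf (pattern, sizeof pattern, "\\(.%s$", interval);
  if (regcomp (&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
    return -1;
  rc = regexec (&re, line, 0, NULL, 0);
  regfree (&re);
  return rc == 0;
}
```

With "{5}" an operand such as "(%rax)" matches, because "%rax)" is five
characters, but a longer register name such as "(%r10d)" does not.  With
"{5,6}" both match, which is consistent with Bernd's guess that the counted
run covers the register name plus the closing paren.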


Re: [AArch64] Emit division using the Newton series

2016-04-27 Thread Wilco Dijkstra
James Greenhalgh wrote:
> So this is off for all cores currently supported by GCC?
> 
> I'm not sure I understand why we should take this if it will immediately
> be dead code?

I presume it was meant to have the vector variants enabled with -mcpu=exynos-m1
as that is where you can get a good gain if you only have a single divide+sqrt 
unit.
The same applies to the sqrt case too, and I guess -mcpu=xgene-1.

Wilco



[PATCH] add -fprolog-pad=N option to c-family

2016-04-27 Thread Torsten Duwe
Hi Maxim,

thanks for starting the work on this; I have added the missing
command line option. It builds now and the resulting compiler generates
a linux kernel with the desired properties, so work can continue there.

Torsten

diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c
index 9bc02fc..57265c5 100644
--- a/gcc/c-family/c-common.c
+++ b/gcc/c-family/c-common.c
@@ -393,6 +393,7 @@ static tree handle_designated_init_attribute (tree *, tree, 
tree, int, bool *);
 static tree handle_bnd_variable_size_attribute (tree *, tree, tree, int, bool 
*);
 static tree handle_bnd_legacy (tree *, tree, tree, int, bool *);
 static tree handle_bnd_instrument (tree *, tree, tree, int, bool *);
+static tree handle_prolog_pad_attribute (tree *, tree, tree, int, bool *);
 
 static void check_function_nonnull (tree, int, tree *);
 static void check_nonnull_arg (void *, tree, unsigned HOST_WIDE_INT);
@@ -833,6 +834,8 @@ const struct attribute_spec c_common_attribute_table[] =
  handle_bnd_legacy, false },
   { "bnd_instrument", 0, 0, true, false, false,
  handle_bnd_instrument, false },
+  { "prolog_pad",1, 1, false, true, true,
+ handle_prolog_pad_attribute, false },
   { NULL, 0, 0, false, false, false, NULL, false }
 };
 
@@ -9663,6 +9666,16 @@ handle_designated_init_attribute (tree *node, tree name, 
tree, int,
   return NULL_TREE;
 }
 
+static tree
+handle_prolog_pad_attribute (tree *, tree name, tree, int,
+bool *)
+{
+  warning (OPT_Wattributes,
+  "%qE attribute is used", name);
+
+  return NULL_TREE;
+}
+
 
 /* Check for valid arguments being passed to a function with FNTYPE.
There are NARGS arguments in the array ARGARRAY.  */
diff --git a/gcc/c-family/c-opts.c b/gcc/c-family/c-opts.c
index 9ae181f..31a8026 100644
--- a/gcc/c-family/c-opts.c
+++ b/gcc/c-family/c-opts.c
@@ -532,6 +532,10 @@ c_common_handle_option (size_t scode, const char *arg, int 
value,
   cpp_opts->ext_numeric_literals = value;
   break;
 
+case OPT_fprolog_pad_:
+  prolog_nop_pad_size = value;
+  break;
+
 case OPT_idirafter:
   add_path (xstrdup (arg), AFTER, 0, true);
   break;
diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt
index aafd802..929ebb6 100644
--- a/gcc/c-family/c.opt
+++ b/gcc/c-family/c.opt
@@ -1407,6 +1407,10 @@ fpreprocessed
 C ObjC C++ ObjC++
 Treat the input file as already preprocessed.
 
+fprolog-pad=
+C ObjC C++ ObjC++ RejectNegative Joined UInteger Var(prolog_nop_pad_size) 
Init(0)
+Pad NOPs before each function prolog
+
 ftrack-macro-expansion
 C ObjC C++ ObjC++ JoinedOrMissing RejectNegative UInteger
 ; converted into ftrack-macro-expansion=
diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index 1ce7181..9d10b10 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -4553,6 +4553,10 @@ will select the smallest suitable mode.
 This section describes the macros that output function entry
 (@dfn{prologue}) and exit (@dfn{epilogue}) code.
 
+@deftypefn {Target Hook} void TARGET_ASM_PRINT_PROLOG_PAD (FILE *@var{file}, 
unsigned HOST_WIDE_INT @var{pad_size}, bool @var{record_p})
+Generate prologue pad
+@end deftypefn
+
 @deftypefn {Target Hook} void TARGET_ASM_FUNCTION_PROLOGUE (FILE *@var{file}, 
HOST_WIDE_INT @var{size})
 If defined, a function that outputs the assembler code for entry to a
 function.  The prologue is responsible for setting up the stack frame,
diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
index a0a0a81..bda6d5c 100644
--- a/gcc/doc/tm.texi.in
+++ b/gcc/doc/tm.texi.in
@@ -3662,6 +3662,8 @@ will select the smallest suitable mode.
 This section describes the macros that output function entry
 (@dfn{prologue}) and exit (@dfn{epilogue}) code.
 
+@hook TARGET_ASM_PRINT_PROLOG_PAD
+
 @hook TARGET_ASM_FUNCTION_PROLOGUE
 
 @hook TARGET_ASM_FUNCTION_END_PROLOGUE
diff --git a/gcc/final.c b/gcc/final.c
index 1edc446..e0cff80 100644
--- a/gcc/final.c
+++ b/gcc/final.c
@@ -1753,6 +1753,7 @@ void
 final_start_function (rtx_insn *first, FILE *file,
  int optimize_p ATTRIBUTE_UNUSED)
 {
+  unsigned HOST_WIDE_INT pad_size = prolog_nop_pad_size;
   block_depth = 0;
 
   this_is_asm_operands = 0;
@@ -1765,6 +1766,21 @@ final_start_function (rtx_insn *first, FILE *file,
 
   high_block_linenum = high_function_linenum = last_linenum;
 
+  tree prolog_pad_attr
+= lookup_attribute ("prolog_pad", TYPE_ATTRIBUTES (TREE_TYPE 
(current_function_decl)));
+  if (prolog_pad_attr)
+{
+  tree prolog_pad_value = TREE_VALUE (TREE_VALUE (prolog_pad_attr));
+
+  if (tree_fits_uhwi_p (prolog_pad_value))
+   pad_size = tree_to_uhwi (prolog_pad_value);
+  else
+   gcc_unreachable ();
+
+}
+  if (pad_size > 0)
+targetm.asm_out.print_prolog_pad (file, pad_size, true);
+
   if (flag_sanitize & SANITIZE_ADDRESS)
 asan_function_start ();
 
diff --git a/gcc/target.def b
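
The pad-size selection in the final_start_function hunk above can be
summarised by a small stand-alone model.  This is only a sketch: both
function names are hypothetical, and the attribute is reduced to a flag plus
a value instead of a tree lookup.

```c
/* Toy model of the pad-size choice: the command-line value is the
   default, and a prolog_pad attribute on the function, when present,
   replaces it.  */
unsigned long
effective_pad_size (unsigned long option_value, int has_attribute,
		    unsigned long attribute_value)
{
  return has_attribute ? attribute_value : option_value;
}

/* Nonzero when the target hook would be called at all.  */
int
emits_pad (unsigned long pad_size)
{
  return pad_size > 0;
}
```

One consequence visible in the model is that an attribute value of 0
overrides a non-zero -fprolog-pad= setting, so such a function gets no pad
even when the option asks for one.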

Contents of PO file 'cpplib-6.1.0.pt_BR.po'

2016-04-27 Thread Translation Project Robot


cpplib-6.1.0.pt_BR.po.gz
Description: Binary data
The Translation Project robot, in the
name of your translation coordinator.



New Brazilian Portuguese PO file for 'cpplib' (version 6.1.0)

2016-04-27 Thread Translation Project Robot
Hello, gentle maintainer.

This is a message from the Translation Project robot.

A revised PO file for textual domain 'cpplib' has been submitted
by the Brazilian Portuguese team of translators.  The file is available at:

http://translationproject.org/latest/cpplib/pt_BR.po

(This file, 'cpplib-6.1.0.pt_BR.po', has just now been sent to you in
a separate email.)

All other PO files for your package are available in:

http://translationproject.org/latest/cpplib/

Please consider including all of these in your next release, whether
official or a pretest.

Whenever you have a new distribution with a new version number ready,
containing a newer POT file, please send the URL of that distribution
tarball to the address below.  The tarball may be just a pretest or a
snapshot, it does not even have to compile.  It is just used by the
translators when they need some extra translation context.

The following HTML page has been updated:

http://translationproject.org/domain/cpplib.html

If any question arises, please contact the translation coordinator.

Thank you for all your work,

The Translation Project robot, in the
name of your translation coordinator.




Re: [PATCH 3/4] Run profile feedback tests with autofdo

2016-04-27 Thread Bernd Schmidt

On 03/28/2016 06:44 AM, Andi Kleen wrote:

From: Andi Kleen 

Extend the existing bprob and tree-prof tests to also run with autofdo.
The test runtimes are really a bit too short for autofdo, but it's
a reasonable sanity check.

This only works natively for now.

dejagnu doesn't seem to support a wrapper for unix tests, so I had
to open code running these tests.  That should be ok due to the
native run restrictions.


Ideally this would be reviewed by someone who knows tcl (and autofdo) a 
little better. Some observations.



+set profile_wrapper [profopt-perf-wrapper]
+set profile_options "-g"
+set feedback_options "-fauto-profile"
+set run_autofdo 1
+
+foreach profile_option $profile_options feedback_option $feedback_options {
+foreach src [lsort [glob -nocomplain $srcdir/$subdir/bprob-*.c]] {
+if ![runtest_file_p $runtests $src] then {
+continue
+}
+   set base [file tail $src]
+profopt-execute $src
+}
+}
+
+set run_autofdo ""
+set profile_wrapper ""


This block appears duplicated across several files. Is there a way to 
unify that?


> +  if { $run_autofdo == 1 } {
> +  # unix_load does not support wrappers in $PATH, so implement
> +  # it manually here

Please write full sentences with proper capitalization and punctuation. 
This occurs across several of these patches, I'll only mention it here.



@@ -313,6 +320,7 @@ proc profopt-execute { src } {
# valid, by running it after dg-additional-files-options.
foreach ext $prof_ext {
profopt-target-cleanup $tmpdir $base $ext
+   profopt-target-cleanup $tmpdir perf data
}


We have this, and then...


@@ -400,6 +451,7 @@ proc profopt-execute { src } {
foreach ext $prof_ext {
profopt-target-cleanup $tmpdir $base $ext
}
+   # XXX remove perf.data


... this - does that need to look the same as the above?


+   # Should check if create_gcov exists


So maybe do that?


Bernd


Re: Fix some i386 testcases for -frename-registers

2016-04-27 Thread Bernd Schmidt

On 04/27/2016 05:16 PM, H.J. Lu wrote:


This works for -m32, -mx32 and -m64.  OK for trunk?


Yes, thanks.


Bernd



Re: [RFC] Update gmp/mpfr/mpc minimum versions

2016-04-27 Thread Bernd Edlinger
On 26.04.2016 22:14, Joseph Myers wrote:
> On Tue, 26 Apr 2016, Bernd Edlinger wrote:
>
>> Hi,
>>
>> as we all know, it's high time now to adjust the minimum supported
>> gmp/mpfr/mpc versions for gcc-7.
>
> I think updating the minimum versions (when using previously built
> libraries, not in-tree) is only appropriate when it allows some cleanup in
> GCC, such as removing conditionals on whether a more recently added
> function is available, adding functionality that depends on a newer
> interface, or using newer interfaces instead of older ones that are now
> deprecated.
>
> For example, you could justify a move to requiring MPFR 3.0.0 or later
> with cleanups to use MPFR_RND* instead of the older GMP_RND*, and
> similarly mpfr_rnd_t instead of the older mp_rnd_t and likewise mpfr_exp_t
> and mpfr_prec_t in fortran/.  You could justify a move to requiring MPC
> 1.0.0 (or 1.0.2) by optimizing clog10 using mpc_log10.  I don't know what
> if any newer GMP interfaces would be beneficial in GCC.  And as always in
> such cases, it's a good idea to look at e.g. how widespread the newer
> versions are in GNU/Linux distributions, which indicates how many people
> might be affected by an increase in the version requirement.
>

Yes I see.

I would justify it this way: gmp-6.0.0 is the first version that does
not invoke undefined behavior in gmp.h, once we update to gmp-6.0.0
we could emit at least a warning in cstddef for this invalid code.

Once we have gmp-6.0.0, the earliest mpfr version that compiles at all
is mpfr-3.1.1 and the earliest mpc version that compiles at all is
mpc-0.9.  These would be the supported installed versions.

In-tree gmp-6.0.0 does _not_ work for ARM.  But gmp-6.1.0 does (with a
little quirk).  All supported mpfr and mpc versions are working in-tree
too, even for the ARM target.

When we have at least mpfr-3.1.1, it is straightforward to remove the
pre-3.1.0 compatibility code from gcc/fortran/simplify.c for instance.
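
The proposed minimums (gmp-6.0.0, mpfr-3.1.1, mpc-0.9) amount to a simple
ordered comparison on version triples.  The sketch below shows that check in
isolation; the macro, the enum constants and the function are names invented
for this example and are not what configure.ac does.

```c
/* Pack a three-part version number so that versions compare as plain
   integers.  Each part is assumed to fit in eight bits.  */
#define TOY_VERSION_NUM(major, minor, patch) \
  (((major) << 16) | ((minor) << 8) | (patch))

enum
{
  MIN_GMP = TOY_VERSION_NUM (6, 0, 0),
  MIN_MPFR = TOY_VERSION_NUM (3, 1, 1),
  MIN_MPC = TOY_VERSION_NUM (0, 9, 0)
};

/* Nonzero when all three packed versions meet the proposed minimums.  */
int
versions_acceptable (int gmp, int mpfr, int mpc)
{
  return gmp >= MIN_GMP && mpfr >= MIN_MPFR && mpc >= MIN_MPC;
}
```

Under these minimums a combination such as gmp 6.1.0, mpfr 3.1.4 and mpc
1.0.3 passes, while mpfr 3.1.0 or gmp 5.1.3 is rejected.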

So I would propose this updated patch for gcc-7.


Thanks
Bernd.
2016-04-26  Bernd Edlinger  

* configure.ac (mpfr): Remove pre-3.1.0 mpfr compatibility code.
Adjust check to new minimum gmp, mpfr and mpc versions.
* configure: Regenerated.
* Makefile.def (gmp): Explicitly disable assembler.
(mpfr): Adjust lib_path.
(mpc): Likewise.
* Makefile.in: Regenerated.

gcc/
2016-04-26  Bernd Edlinger  

* doc/install.texi: Adjust gmp/mpfr/mpc minimum versions.

gcc/fortran/
2016-04-26  Bernd Edlinger  

* simplify.c (gfc_simplify_fraction): Remove pre-3.1.0 mpfr
compatibility code.

contrib/
2016-04-26  Bernd Edlinger  

* download_prerequisites: Adjust gmp/mpfr/mpc versions.
Index: Makefile.def
===
--- Makefile.def	(Revision 235487)
+++ Makefile.def	(Arbeitskopie)
@@ -50,6 +50,7 @@ host_modules= { module= gcc; bootstrap=true;
 host_modules= { module= gmp; lib_path=.libs; bootstrap=true;
 		// Work around in-tree gmp configure bug with missing flex.
 		extra_configure_flags='--disable-shared LEX="touch lex.yy.c"';
+		extra_make_flags='AM_CFLAGS="-DNO_ASM"';
 		no_install= true;
 		// none-*-* disables asm optimizations, bootstrap-testing
 		// the compiler more thoroughly.
@@ -57,11 +58,11 @@ host_modules= { module= gmp; lib_path=.libs; boots
 		// gmp's configure will complain if given anything
 		// different from host for target.
 	target="none-${host_vendor}-${host_os}"; };
-host_modules= { module= mpfr; lib_path=.libs; bootstrap=true;
+host_modules= { module= mpfr; lib_path=src/.libs; bootstrap=true;
 		extra_configure_flags='--disable-shared @extra_mpfr_configure_flags@';
 		extra_make_flags='AM_CFLAGS="-DNO_ASM"';
 		no_install= true; };
-host_modules= { module= mpc; lib_path=.libs; bootstrap=true;
+host_modules= { module= mpc; lib_path=src/.libs; bootstrap=true;
 		extra_configure_flags='--disable-shared @extra_mpc_gmp_configure_flags@ @extra_mpc_mpfr_configure_flags@';
 		no_install= true; };
 host_modules= { module= isl; lib_path=.libs; bootstrap=true;
Index: Makefile.in
===
--- Makefile.in	(Revision 235487)
+++ Makefile.in	(Arbeitskopie)
@@ -639,12 +639,12 @@ HOST_LIB_PATH_gmp = \
 
 @if mpfr
 HOST_LIB_PATH_mpfr = \
-  $$r/$(HOST_SUBDIR)/mpfr/.libs:$$r/$(HOST_SUBDIR)/prev-mpfr/.libs:
+  $$r/$(HOST_SUBDIR)/mpfr/src/.libs:$$r/$(HOST_SUBDIR)/prev-mpfr/src/.libs:
 @endif mpfr
 
 @if mpc
 HOST_LIB_PATH_mpc = \
-  $$r/$(HOST_SUBDIR)/mpc/.libs:$$r/$(HOST_SUBDIR)/prev-mpc/.libs:
+  $$r/$(HOST_SUBDIR)/mpc/src/.libs:$$r/$(HOST_SUBDIR)/prev-mpc/src/.libs:
 @endif mpc
 
 @if isl
@@ -11300,7 +11300,7 @@ all-gmp: configure-gmp
 	s=`cd $(srcdir); ${PWD_COMMAND}`; export s; \
 	$(HOST_EXPORTS)  \
 	(cd $(HOST_SUBDIR)/gmp && \
-	  $(MAKE) $(BASE_FLAGS_TO_PASS) $(EXTRA_HOST_FLAGS) $(STAGE1_FLAGS_TO_PASS)  \
+	  $(MAKE) $(BASE_FLAGS_TO_PASS) $(EXTRA_HOST_FLAGS) $(STAGE
