Re: Fix PR52482, libitm compilation in OSX ppc with old cctools

2015-07-06 Thread Carlos Sánchez de La Lama
>> Patch is against gcc-4.8.4, but affected lines have not changed in SVN HEAD.
>
> I dropped this into all active release branches as well.

Ok, thanks :)

> If you do a test suite run, feel free to email it to the test results
> list.

I am running "make check" on my (locally patched) 4.8.4. I will compile
and run the testsuite on an active branch afterwards (everything takes a
while on my old G4).

BR

Carlos


Re: [PATCH] PR target/53383: Allow -mincoming-stack-boundary=3 with -mno-sse

2015-07-06 Thread Uros Bizjak
On Sun, Jul 5, 2015 at 11:40 PM, H.J. Lu  wrote:
> Similar to -mpreferred-stack-boundary=3, -mincoming-stack-boundary=3 is
> allowed with -mno-sse in 64-bit mode.
>
> OK for trunk?
>
>
> H.J.
> gcc/
>
> PR target/53383
> * config/i386/i386.c (ix86_option_override_internal): Allow
> -mincoming-stack-boundary=X if -mpreferred-stack-boundary=N is
> allowed and X < N.
>
> gcc/testsuite/
>
> PR target/53383
> * gcc.target/i386/pr53383-1.c: New file.
> * gcc.target/i386/pr53383-2.c: Likewise.
> * gcc.target/i386/pr53383-3.c: Likewise.

OK with a small change below.

Thanks,
Uros.

> ---
>  gcc/config/i386/i386.c| 13 -
>  gcc/testsuite/gcc.target/i386/pr53383-1.c |  8 
>  gcc/testsuite/gcc.target/i386/pr53383-2.c |  8 
>  gcc/testsuite/gcc.target/i386/pr53383-3.c |  8 
>  4 files changed, 32 insertions(+), 5 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr53383-1.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr53383-2.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr53383-3.c
>
> diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
> index 7d26e8c..cea1295 100644
> --- a/gcc/config/i386/i386.c
> +++ b/gcc/config/i386/i386.c
> @@ -4085,12 +4085,15 @@ ix86_option_override_internal (bool main_args_p,
>ix86_incoming_stack_boundary = ix86_default_incoming_stack_boundary;
>if (opts_set->x_ix86_incoming_stack_boundary_arg)
>  {
> -  if (opts->x_ix86_incoming_stack_boundary_arg
> - < (TARGET_64BIT_P (opts->x_ix86_isa_flags) ? 4 : 2)
> - || opts->x_ix86_incoming_stack_boundary_arg > 12)
> -   error ("-mincoming-stack-boundary=%d is not between %d and 12",
> +  int min = (TARGET_64BIT_P (opts->x_ix86_isa_flags)
> +? (TARGET_SSE_P (opts->x_ix86_isa_flags) ? 4 : 3) : 2);
> +  int max = 12;

Just get rid of the above variable and directly use 12 in the code below.

> +
> +  if (opts->x_ix86_incoming_stack_boundary_arg < min
> + || opts->x_ix86_incoming_stack_boundary_arg > max)
> +   error ("-mincoming-stack-boundary=%d is not between %d and %d",
>opts->x_ix86_incoming_stack_boundary_arg,
> -  TARGET_64BIT_P (opts->x_ix86_isa_flags) ? 4 : 2);
> +  min, max);
>else
> {
>   ix86_user_incoming_stack_boundary
> diff --git a/gcc/testsuite/gcc.target/i386/pr53383-1.c 
> b/gcc/testsuite/gcc.target/i386/pr53383-1.c
> new file mode 100644
> index 000..d140bda
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/pr53383-1.c
> @@ -0,0 +1,8 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -mno-sse -mpreferred-stack-boundary=3" } */
> +
> +int
> +bar (int x)
> +{
> +  return x + 9;
> +}
> diff --git a/gcc/testsuite/gcc.target/i386/pr53383-2.c 
> b/gcc/testsuite/gcc.target/i386/pr53383-2.c
> new file mode 100644
> index 000..a1b8e41
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/pr53383-2.c
> @@ -0,0 +1,8 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -mno-sse -mpreferred-stack-boundary=3 
> -mincoming-stack-boundary=3" } */
> +
> +int
> +bar (int x)
> +{
> +  return x + 9;
> +}
> diff --git a/gcc/testsuite/gcc.target/i386/pr53383-3.c 
> b/gcc/testsuite/gcc.target/i386/pr53383-3.c
> new file mode 100644
> index 000..e5d3a5b
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/pr53383-3.c
> @@ -0,0 +1,8 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -mno-sse -mincoming-stack-boundary=3 
> -mpreferred-stack-boundary=3" } */
> +
> +int
> +bar (int x)
> +{
> +  return x + 9;
> +}
> --
> 2.4.3
>
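
For reference, the range check under review can be sketched as standalone C, with the upper bound 12 written inline as requested in the review. This is only an illustration under stated assumptions, not the i386.c code: the function names are hypothetical, and the two boolean parameters stand in for TARGET_64BIT_P and TARGET_SSE_P.

```c
#include <stdbool.h>

/* Hypothetical helper: smallest accepted -mincoming-stack-boundary
   value.  The option value is a power-of-two exponent, so 3 means
   8-byte and 4 means 16-byte alignment.  */
int
min_incoming_stack_boundary (bool is_64bit, bool has_sse)
{
  if (!is_64bit)
    return 2;
  /* In 64-bit mode, 3 is only allowed together with -mno-sse.  */
  return has_sse ? 4 : 3;
}

/* Hypothetical helper: true if ARG lies in the accepted range.
   The upper bound 12 is used directly, with no separate variable.  */
bool
incoming_stack_boundary_ok (int arg, bool is_64bit, bool has_sse)
{
  return arg >= min_incoming_stack_boundary (is_64bit, has_sse)
	 && arg <= 12;
}
```

With this sketch, the value 3 is accepted for a 64-bit target without SSE and rejected for a 64-bit target with SSE, which matches the intent of the new tests.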


[PATCH][AArch64] PR target/66731 Fix fnmul insn with -frounding-math

2015-07-06 Thread Szabolcs Nagy
fnmul was modeled as (-a)*b instead of -(a*b), which is wrong with
-frounding-math, so the correct pattern is added too and the other
one is only used if !flag_rounding_math.
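
To see why the two forms differ once rounding is direction-dependent, here is a toy model in integer arithmetic rather than real floating point. It assumes a "round toward +infinity to a multiple of STEP" operation as a stand-in for an upward rounding mode; all names are made up for the illustration.

```c
/* Round X up (toward +infinity) to a multiple of STEP, STEP > 0.
   Toy stand-in for a floating-point multiply that rounds upward.  */
long
round_up_to_step (long x, long step)
{
  long q = x / step;	/* C division truncates toward zero.  */
  if (x % step > 0)
    q += 1;		/* Positive remainder: bump to the next multiple.  */
  return q * step;
}

/* Models (-a)*b: negate first, then the rounded multiply.  */
long
neg_then_mul (long a, long b, long step)
{
  return round_up_to_step (-a * b, step);
}

/* Models -(a*b): the rounded multiply first, then negate.  */
long
mul_then_neg (long a, long b, long step)
{
  return -round_up_to_step (a * b, step);
}
```

With a = 7, b = 3 and a step of 10, the first form rounds -21 up to -20, while the second rounds 21 up to 30 and negates it to -30. When the product is exact (for example 5 * 4), both forms agree, which is why the difference only shows up for inexact results.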

This affects a glibc math test; a similar fix will be needed for ARM.

Tested with an aarch64-none-linux-gnu cross compiler.
Is this OK?

gcc/Changelog:

2015-07-06  Szabolcs Nagy  

PR target/66731
* config/aarch64/aarch64.md (fnmul3): Handle -frounding-math.

gcc/testsuite/Changelog:

2015-07-06  Szabolcs Nagy  

* gcc.target/aarch64/fnmul-1.c: New.
* gcc.target/aarch64/fnmul-2.c: New.
* gcc.target/aarch64/fnmul-3.c: New.
* gcc.target/aarch64/fnmul-4.c: New.
diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 2d56a75..1e343fa 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -4175,6 +4175,16 @@
 (mult:GPF
 		 (neg:GPF (match_operand:GPF 1 "register_operand" "w"))
 		 (match_operand:GPF 2 "register_operand" "w")))]
+  "TARGET_FLOAT && !flag_rounding_math"
+  "fnmul\\t%0, %1, %2"
+  [(set_attr "type" "fmul")]
+)
+
+(define_insn "*fnmul3"
+  [(set (match_operand:GPF 0 "register_operand" "=w")
+(neg:GPF (mult:GPF
+		 (match_operand:GPF 1 "register_operand" "w")
+		 (match_operand:GPF 2 "register_operand" "w"))))]
   "TARGET_FLOAT"
   "fnmul\\t%0, %1, %2"
   [(set_attr "type" "fmul")]
diff --git a/gcc/testsuite/gcc.target/aarch64/fnmul-1.c b/gcc/testsuite/gcc.target/aarch64/fnmul-1.c
new file mode 100644
index 000..7ec38e0
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/fnmul-1.c
@@ -0,0 +1,16 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+double
+foo_d (double a, double b)
+{
+	/* { dg-final { scan-assembler "fnmul\\td\[0-9\]+, d\[0-9\]+, d\[0-9\]+" } } */
+	return -a * b;
+}
+
+float
+foo_s (float a, float b)
+{
+	/* { dg-final { scan-assembler "fnmul\\ts\[0-9\]+, s\[0-9\]+, s\[0-9\]+" } } */
+	return -a * b;
+}
diff --git a/gcc/testsuite/gcc.target/aarch64/fnmul-2.c b/gcc/testsuite/gcc.target/aarch64/fnmul-2.c
new file mode 100644
index 000..f05ee79
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/fnmul-2.c
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -frounding-math" } */
+
+double
+foo_d (double a, double b)
+{
+	/* { dg-final { scan-assembler "fneg\\td\[0-9\]+, d\[0-9\]+" } } */
+	/* { dg-final { scan-assembler "fmul\\td\[0-9\]+, d\[0-9\]+, d\[0-9\]+" } } */
+	return -a * b;
+}
+
+float
+foo_s (float a, float b)
+{
+	/* { dg-final { scan-assembler "fneg\\ts\[0-9\]+, s\[0-9\]+" } } */
+	/* { dg-final { scan-assembler "fmul\\ts\[0-9\]+, s\[0-9\]+, s\[0-9\]+" } } */
+	return -a * b;
+}
diff --git a/gcc/testsuite/gcc.target/aarch64/fnmul-3.c b/gcc/testsuite/gcc.target/aarch64/fnmul-3.c
new file mode 100644
index 000..301e9cd
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/fnmul-3.c
@@ -0,0 +1,16 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+double
+foo_d (double a, double b)
+{
+	/* { dg-final { scan-assembler "fnmul\\td\[0-9\]+, d\[0-9\]+, d\[0-9\]+" } } */
+	return -(a * b);
+}
+
+float
+foo_s (float a, float b)
+{
+	/* { dg-final { scan-assembler "fnmul\\ts\[0-9\]+, s\[0-9\]+, s\[0-9\]+" } } */
+	return -(a * b);
+}
diff --git a/gcc/testsuite/gcc.target/aarch64/fnmul-4.c b/gcc/testsuite/gcc.target/aarch64/fnmul-4.c
new file mode 100644
index 000..9b9bf1b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/fnmul-4.c
@@ -0,0 +1,16 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -frounding-math" } */
+
+double
+foo_d (double a, double b)
+{
+	/* { dg-final { scan-assembler "fnmul\\td\[0-9\]+, d\[0-9\]+, d\[0-9\]+" } } */
+	return -(a * b);
+}
+
+float
+foo_s (float a, float b)
+{
+	/* { dg-final { scan-assembler "fnmul\\ts\[0-9\]+, s\[0-9\]+, s\[0-9\]+" } } */
+	return -(a * b);
+}


Re: [PATCH, fortran] Fix numerous ICEs in IO statements with IOMSG

2015-07-06 Thread FX
> 2015-07-05  Steven G. Kargl  
> 
>   * io.c (check_char_variable): New function.
>   (match_open_element, match_close_element, match_file_element,
>   match_dt_element, match_inquire_element, match_wait_element): Use it.
> 
> 
> 2015-07-05  Steven G. Kargl  
> 
>   * gfortran.dg/iomsg_2.f90: New test.

OK to commit, thanks!

FX


Re: [PATCH][8/n] Remove GENERIC stmt combining from SCCVN

2015-07-06 Thread Eric Botcazou
> Hum, somehow I convinced myself that it was ok if the precision
> wasn't the same (but I can't remember my line of thought).  Your
> testcase clearly shows I was wrong ;)

It's not mine, but Zhendong Su's.  For the sake of completeness, I also had an 
Ada testcase, but I know how convincing C testcases can be. :-)

-- 
Eric Botcazou


Rename read-md.c:decimal_string

2015-07-06 Thread Rainer Orth
One of the recent header file changes (haven't checked which) broke
mainline Solaris bootstrap:

/vol/gcc/src/hg/trunk/local/gcc/read-md.c: In function 'char* 
decimal_string(int)':
/vol/gcc/src/hg/trunk/local/gcc/read-md.c:782:27: error: 'char* 
decimal_string(int)' redeclared as different kind of symbol
 decimal_string (int number)
   ^
In file included from /usr/include/math.h:321:0,
 from 
/var/gcc/regression/trunk/12-gcc/build/prev-i386-pc-solaris2.12/libstdc++-v3/include/cmath:44,
 from 
/var/gcc/regression/trunk/12-gcc/build/prev-i386-pc-solaris2.12/libstdc++-v3/include/random:38,
 from 
/var/gcc/regression/trunk/12-gcc/build/prev-i386-pc-solaris2.12/libstdc++-v3/include/bits/stl_algo.h:66,
 from 
/var/gcc/regression/trunk/12-gcc/build/prev-i386-pc-solaris2.12/libstdc++-v3/include/algorithm:62,
 from /vol/gcc/src/hg/trunk/local/gcc/system.h:218,
 from /vol/gcc/src/hg/trunk/local/gcc/read-md.c:21:
/usr/include/floatingpoint.h:88:14: note: previous declaration 'typedef char 
decimal_string [512]'
 typedef char decimal_string[DECIMAL_STRING_LENGTH];
  ^
/vol/gcc/src/hg/trunk/local/gcc/read-md.c: In function 'void handle_enum(int, 
bool)':
/vol/gcc/src/hg/trunk/local/gcc/read-md.c:854:41: error: functional cast to 
array type 'decimal_string {aka char [512]}'
  decimal_string (def->num_values), def);
 ^
/vol/gcc/src/hg/trunk/local/gcc/read-md.c: At global scope:
/vol/gcc/src/hg/trunk/local/gcc/read-md.c:782:1: error: 'char* 
decimal_string(int)' defined but not used [-Werror=unused-function]
 decimal_string (int number)
 ^

I took the easy way out and renamed the read-md.c function to avoid the
clash.  Bootstrapped on i386-pc-solaris2.11.  Ok for mainline?
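
As a rough standalone approximation of the renamed helper (not the read-md.c code, which relies on GCC's own allocation wrappers), the function below returns a malloc()ed decimal string. The use of snprintf, malloc and strcpy, and the `decimal_string_matches` checker, are assumptions made for this sketch.

```c
#include <limits.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Solaris <floatingpoint.h> has, per the error output in this message:
     typedef char decimal_string[DECIMAL_STRING_LENGTH];
   so a function called decimal_string clashes with it.  The md_ prefix
   sidesteps the clash.  */

/* Return a malloc()ed decimal string that represents NUMBER,
   or NULL if allocation fails.  */
char *
md_decimal_string (int number)
{
  /* A safe overestimate.  +1 for sign, +1 for null terminator.  */
  char buffer[sizeof (int) * CHAR_BIT + 1 + 1];
  char *copy;

  snprintf (buffer, sizeof buffer, "%d", number);
  copy = malloc (strlen (buffer) + 1);
  if (copy != NULL)
    strcpy (copy, buffer);
  return copy;
}

/* Hypothetical checker: true if NUMBER formats as EXPECTED.  */
bool
decimal_string_matches (int number, const char *expected)
{
  char *s = md_decimal_string (number);
  bool ok = s != NULL && strcmp (s, expected) == 0;
  free (s);
  return ok;
}
```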

Rainer


2015-07-06  Rainer Orth  

* read-md.c (decimal_string): Rename to ...
(md_decimal_string): ... this.
(handle_enum): Reflect this.

# HG changeset patch
# Parent 8d51390422669c8929e6b44770ca6716188d001a
Rename read-md.c:decimal_string

diff --git a/gcc/read-md.c b/gcc/read-md.c
--- a/gcc/read-md.c
+++ b/gcc/read-md.c
@@ -779,7 +779,7 @@ traverse_md_constants (htab_trav callbac
 /* Return a malloc()ed decimal string that represents number NUMBER.  */
 
 static char *
-decimal_string (int number)
+md_decimal_string (int number)
 {
   /* A safe overestimate.  +1 for sign, +1 for null terminator.  */
   char buffer[sizeof (int) * CHAR_BIT + 1 + 1];
@@ -851,7 +851,7 @@ handle_enum (int lineno, bool md_p)
 	  ev->name = value_name;
 	}
   ev->def = add_constant (md_constants, value_name,
-			  decimal_string (def->num_values), def);
+			  md_decimal_string (def->num_values), def);
 
   *def->tail_ptr = ev;
   def->tail_ptr = &ev->next;

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


[RFC] two-phase marking in gt_cleare_cache

2015-07-06 Thread Tom de Vries

Hi,

Using attached untested patch, I managed to minimize a test-case failure 
for PR 66714.


The patch introduces two-phase marking in gt_cleare_cache:
- first phase, it loops over all the hash table entries and removes
  those which are dead
- second phase, it runs over all the live hash table entries and marks
  live items that are reachable from those live entries

By doing so, we make the behaviour of gt_cleare_cache independent of the 
order in which the entries are visited, turning:

- hard-to-trigger bugs which trigger for one visiting order but not for
  another, into
- more easily triggered bugs which trigger for any visiting order.
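
The order dependence can be illustrated with a toy cache in plain C. This is a sketch under simplifying assumptions, not the hash-table.h code: an entry is kept while the object named by `key` is marked, keeping an entry marks the object named by `value`, and all names are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

struct entry
{
  int key;	/* Entry is kept only while object KEY is marked.  */
  int value;	/* Marking the entry marks object VALUE.  */
  bool present;
};

/* Single pass: decide and mark per entry, in visiting order.  */
void
sweep_single_pass (struct entry **order, size_t n, bool *marked)
{
  for (size_t i = 0; i < n; i++)
    {
      struct entry *e = order[i];
      if (!e->present)
	continue;
      if (marked[e->key])
	marked[e->value] = true;
      else
	e->present = false;
    }
}

/* Two phases: first drop every dead entry, then mark from the
   survivors.  The outcome no longer depends on visiting order.  */
void
sweep_two_phase (struct entry **order, size_t n, bool *marked)
{
  for (size_t i = 0; i < n; i++)
    if (order[i]->present && !marked[order[i]->key])
      order[i]->present = false;
  for (size_t i = 0; i < n; i++)
    if (order[i]->present)
      marked[order[i]->value] = true;
}

/* Entry A is live and marks the key of entry B.  Report whether B
   survives for the chosen strategy and visiting order.  */
bool
b_survives (bool two_phase, bool visit_a_first)
{
  bool marked[3] = { true, false, false };
  struct entry a = { 0, 1, true };
  struct entry b = { 1, 2, true };
  struct entry *order[2];

  order[0] = visit_a_first ? &a : &b;
  order[1] = visit_a_first ? &b : &a;
  if (two_phase)
    sweep_two_phase (order, 2, marked);
  else
    sweep_single_pass (order, 2, marked);
  return b.present;
}
```

In the single-pass variant B survives only when A happens to be visited first; in the two-phase variant B is dropped for either order, so a missing mark shows up consistently.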

Any comments?

Thanks,
- Tom
Add checking in gt_cleare_cache

---
 gcc/hash-table.h | 32 +++-
 1 file changed, 31 insertions(+), 1 deletion(-)

diff --git a/gcc/hash-table.h b/gcc/hash-table.h
index 12e0c96..c2ea112 100644
--- a/gcc/hash-table.h
+++ b/gcc/hash-table.h
@@ -1046,7 +1046,37 @@ gt_cleare_cache (hash_table *h)
   if (!h)
 return;
 
-  for (typename table::iterator iter = h->begin (); iter != h->end (); ++iter)
+  typename table::iterator iter;
+
+#ifdef CHECKING
+  /* Say we have:
+ 1. cache entry A, with keep_cache_entry (A) == 1, and
+ 2. cache entry B, with keep_cache_entry (B) == 0, and
+ 3. gt_ggc_mx (A) marks things live in such a way that keep_cache_entry (B)
+becomes 1.
+
+ In the loop at the end of the function, if A is visited first, then B is
+ kept.  If B is visited first, it is deleted.
+
+ We don't want the situation that the result of this function is dependent
+ on the order in which the entries are visited, so we consider this
+ situation a bug.
+
+ In order to stabilize the result of the function in presence of the bug, we
+ first clear all entries E with keep_cache_entry (E) == 0.  By doing so, we
+ also maximize the impact of the liveness analysis done up until now, which
+ we hope makes it more likely that we run into bugs regarding that analysis.
+ We only do this when checking since it's more expensive.  */
+  for (iter = h->begin (); iter != h->end (); ++iter)
+if (!table::is_empty (*iter) && !table::is_deleted (*iter))
+  {
+	int res = H::keep_cache_entry (*iter);
+	if (res == 0)
+	  h->clear_slot (&*iter);
+  }
+#endif
+
+  for (iter = h->begin (); iter != h->end (); ++iter)
 if (!table::is_empty (*iter) && !table::is_deleted (*iter))
   {
 	int res = H::keep_cache_entry (*iter);
-- 
1.9.1



Re: Rename read-md.c:decimal_string

2015-07-06 Thread Richard Sandiford
Rainer Orth  writes:
> 2015-07-06  Rainer Orth  
>
>   * read-md.c (decimal_string): Rename to ...
>   (md_decimal_string): ... this.
>   (handle_enum): Reflect this.

OK, thanks.

Richard



Re: [PATCH 1/2][ARM] PR/65956 AAPCS update for alignment attribute

2015-07-06 Thread Alan Lawrence

Richard Biener wrote:
> I also believe this loop is equivalent to checking TYPE_ALIGN of the
> aggregate type?

Jakub is correct: the intention is to discard any top-level alignment attribute 
on a struct declaration.

> I'll double check your wording in the abi document, but it seems to be unclear
> whether packed and not packed structs should be passed the same (considering
> layout differences).  OTOH the above function is only relevant for register
> passing? (Likewise the abi document changes?)

It also affects the alignment of things passed on the stack. 'Packed' structs 
are affected too: the outer 'packed' will have no effect on the position on the 
stack / in registers, as you say; layout will still be packed.

> Is this behavior correct for unions or aggregates with record or
> union members?

To clarify Richard Earnshaw's statement: The intention is that 'member 
alignment' is pretty much gcc's TYPE_ALIGN (actually the source code type 
declaration - which is the same for struct members, but ignoring cases where 
other opts like SRA figure out a larger TYPE_ALIGN). 'Natural alignment' is not 
directly available in GCC under all circumstances, hence having to compute it here.


--Alan



[PING][PATCH, 1/2] Merge rewrite_virtuals_into_loop_closed_ssa from gomp4 branch

2015-07-06 Thread Tom de Vries

On 25/06/15 09:42, Tom de Vries wrote:

Hi,

this patch merges rewrite_virtuals_into_loop_closed_ssa (originally
submitted here: https://gcc.gnu.org/ml/gcc-patches/2015-06/msg01236.html
) to trunk.

Bootstrapped and reg-tested on x86_64.

OK for trunk?



Ping.

Thanks,
- Tom



0001-Merge-rewrite_virtuals_into_loop_closed_ssa-from-gom.patch


Merge rewrite_virtuals_into_loop_closed_ssa from gomp4 branch

2015-06-24  Tom de Vries

merge from gomp4 branch:
2015-06-24  Tom de Vries

* tree-ssa-loop-manip.c (get_virtual_phi): Factor out of ...
(rewrite_virtuals_into_loop_closed_ssa): ... here.

* tree-ssa-loop-manip.c (replace_uses_in_dominated_bbs): Factor out
of ...
(rewrite_virtuals_into_loop_closed_ssa): ... here.

* dominance.c (bitmap_get_dominated_by): New function.
* dominance.h (bitmap_get_dominated_by): Declare.
* tree-ssa-loop-manip.c (rewrite_virtuals_into_loop_closed_ssa): Use
bitmap_get_dominated_by.

* tree-parloops.c (replace_uses_in_bbs_by)
(rewrite_virtuals_into_loop_closed_ssa): Move to ...
* tree-ssa-loop-manip.c: here.
* tree-ssa-loop-manip.h (rewrite_virtuals_into_loop_closed_ssa):
Declare.

2015-06-18  Tom de Vries

* tree-parloops.c (rewrite_virtuals_into_loop_closed_ssa): New function.
(transform_to_exit_first_loop_alt): Use
rewrite_virtuals_into_loop_closed_ssa.
---
  gcc/dominance.c   | 21 
  gcc/dominance.h   |  1 +
  gcc/tree-parloops.c   | 43 +
  gcc/tree-ssa-loop-manip.c | 81 +++
  gcc/tree-ssa-loop-manip.h |  1 +
  5 files changed, 112 insertions(+), 35 deletions(-)

diff --git a/gcc/dominance.c b/gcc/dominance.c
index 9c66ca2..9b52d79 100644
--- a/gcc/dominance.c
+++ b/gcc/dominance.c
@@ -753,6 +753,27 @@ set_immediate_dominator (enum cdi_direction dir, 
basic_block bb,
  dom_computed[dir_index] = DOM_NO_FAST_QUERY;
  }

+/* Returns in BBS the list of basic blocks immediately dominated by BB, in the
+   direction DIR.  As get_dominated_by, but returns result as a bitmap.  */
+
+void
+bitmap_get_dominated_by (enum cdi_direction dir, basic_block bb, bitmap bbs)
+{
+  unsigned int dir_index = dom_convert_dir_to_idx (dir);
+  struct et_node *node = bb->dom[dir_index], *son = node->son, *ason;
+
+  bitmap_clear (bbs);
+
+  gcc_checking_assert (dom_computed[dir_index]);
+
+  if (!son)
+return;
+
+  bitmap_set_bit (bbs, ((basic_block) son->data)->index);
+  for (ason = son->right; ason != son; ason = ason->right)
+bitmap_set_bit (bbs, ((basic_block) ason->data)->index);
+}
+
  /* Returns the list of basic blocks immediately dominated by BB, in the
 direction DIR.  */
  vec
diff --git a/gcc/dominance.h b/gcc/dominance.h
index 37e138b..0a1a13e 100644
--- a/gcc/dominance.h
+++ b/gcc/dominance.h
@@ -41,6 +41,7 @@ extern void free_dominance_info (enum cdi_direction);
  extern basic_block get_immediate_dominator (enum cdi_direction, basic_block);
  extern void set_immediate_dominator (enum cdi_direction, basic_block,
 basic_block);
+extern void bitmap_get_dominated_by (enum cdi_direction, basic_block, bitmap);
  extern vec get_dominated_by (enum cdi_direction, basic_block);
  extern vec get_dominated_by_region (enum cdi_direction,
 basic_block *,
diff --git a/gcc/tree-parloops.c b/gcc/tree-parloops.c
index e582fe7..df7c351 100644
--- a/gcc/tree-parloops.c
+++ b/gcc/tree-parloops.c
@@ -1498,25 +1498,6 @@ replace_uses_in_bb_by (tree name, tree val, basic_block 
bb)
  }
  }

-/* Replace uses of NAME by VAL in blocks BBS.  */
-
-static void
-replace_uses_in_bbs_by (tree name, tree val, bitmap bbs)
-{
-  gimple use_stmt;
-  imm_use_iterator imm_iter;
-
-  FOR_EACH_IMM_USE_STMT (use_stmt, imm_iter, name)
-{
-  if (!bitmap_bit_p (bbs, gimple_bb (use_stmt)->index))
-   continue;
-
-  use_operand_p use_p;
-  FOR_EACH_IMM_USE_ON_STMT (use_p, imm_iter)
-   SET_USE (use_p, val);
-}
-}
-
  /* Do transformation from:

   :
@@ -1637,18 +1618,11 @@ transform_to_exit_first_loop_alt (struct loop *loop,
tree control = gimple_cond_lhs (cond_stmt);
edge e;

-  /* Gather the bbs dominated by the exit block.  */
-  bitmap exit_dominated = BITMAP_ALLOC (NULL);
-  bitmap_set_bit (exit_dominated, exit_block->index);
-  vec exit_dominated_vec
-= get_dominated_by (CDI_DOMINATORS, exit_block);
-
-  int i;
-  basic_block dom_bb;
-  FOR_EACH_VEC_ELT (exit_dominated_vec, i, dom_bb)
-bitmap_set_bit (exit_dominated, dom_bb->index);
-
-  exit_dominated_vec.release ();
+  /* Rewriting virtuals into loop-closed ssa normal form makes this
+ transformation simpler.  It also ensures that the virtuals are in
+ loop-closed ssa normal form after the transformation, which is required by
+ create_parallel_loop
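
As a side note on bitmap_get_dominated_by from this patch, the walk over a circular list of sibling nodes can be sketched in isolation. This is a simplified stand-in, not the dominance.c code: `struct son_node`, the plain unsigned bit mask used instead of a GCC bitmap, and both function names are assumptions for the illustration. The point of the sketch is that the loop body has to read the loop variable `ason` for each sibling to be recorded.

```c
#include <stddef.h>

/* Simplified stand-in for an et_node in a circular sibling list.  */
struct son_node
{
  int index;			/* Basic block index, assumed < 32 here.  */
  struct son_node *right;	/* Next sibling; the list is circular.  */
};

/* Return a mask with one bit set per node reachable from SON.  */
unsigned
sons_to_mask (const struct son_node *son)
{
  unsigned mask = 0;
  const struct son_node *ason;

  if (!son)
    return 0;

  mask |= 1u << son->index;
  for (ason = son->right; ason != son; ason = ason->right)
    mask |= 1u << ason->index;	/* Use ASON, the node being visited.  */
  return mask;
}

/* Build a three-node ring with indices 2, 5 and 7 and collect it.  */
unsigned
demo_sons_mask (void)
{
  struct son_node n2 = { 2, NULL }, n5 = { 5, NULL }, n7 = { 7, NULL };

  n2.right = &n5;
  n5.right = &n7;
  n7.right = &n2;
  return sons_to_mask (&n2);
}
```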

[PING][PATCH, 2/2][PR66642] Add empty loop exit block in transform_to_exit_first_loop_alt

2015-07-06 Thread Tom de Vries

On 25/06/15 09:43, Tom de Vries wrote:

Hi,

I ran into a failure with parloops for reduction loop testcase
libgomp/testsuite/libgomp.c/parloops-exit-first-loop-alt-3.c.  When we
exercise the low iteration count loop, the test-case fails.

To understand the problem, let's first look at what happens when we use
transform_to_exit_first_loop (the original one) instead of
transform_to_exit_first_loop_alt (the alternative one, which is
currently used, and causing the failure).

Before transform_to_exit_first_loop, the low iteration count loop and
the main loop share the loop exit block. After
transform_to_exit_first_loop, that's not the case anymore, the main loop
now has an exit block with a single predecessor. Subsequently,
separate_decls_in_region inserts code in the main loop exit block, which
is only triggered upon exit of the main loop.

However, transform_to_exit_first_loop_alt does not insert such an exit
block, and the code inserted by separate_decls_in_region is also active
for the low iteration count loop, which results in an incorrect
reduction result when the low iteration count loop is used.


This patch fixes the problem by making sure
transform_to_exit_first_loop_alt adds a new exit block in between the
main loop header and the old exit block.


Bootstrapped and reg-tested on x86_64.

OK for trunk?



Ping.

Thanks,
- Tom


0002-Add-empty-loop-exit-block-in-transform_to_exit_first.patch


Add empty loop exit block in transform_to_exit_first_loop_alt

2015-06-24  Tom de Vries

PR tree-optimization/66642
* tree-parloops.c (transform_to_exit_first_loop_alt): Update function
header comment.  Rename split_edge variable to edge_at_split.  Split
exit edge to create new loop exit bb.  Insert loop exit phis in new loop
exit bb.

* testsuite/libgomp.c/parloops-exit-first-loop-alt-3.c (main): Test low
iteration count case.
* testsuite/libgomp.c/parloops-exit-first-loop-alt.c (init): New
function, factor out of ...
(main): ... here.  Test low iteration count case.
---
  gcc/tree-parloops.c| 45 --
  .../libgomp.c/parloops-exit-first-loop-alt-3.c |  5 +++
  .../libgomp.c/parloops-exit-first-loop-alt.c   | 28 +-
  3 files changed, 64 insertions(+), 14 deletions(-)

diff --git a/gcc/tree-parloops.c b/gcc/tree-parloops.c
index df7c351..6c8aaab 100644
--- a/gcc/tree-parloops.c
+++ b/gcc/tree-parloops.c
@@ -1522,7 +1522,7 @@ replace_uses_in_bb_by (tree name, tree val, basic_block 
bb)
   goto 

   :
- sum_z = PHI 
+ sum_z = PHI 

   [1] Where  is single_pred (bb latch); In the simplest case,
 that's .
@@ -1549,14 +1549,17 @@ replace_uses_in_bb_by (tree name, tree val, basic_block 
bb)
   if (ivtmp_c < n + 1)
 goto ;
   else
-   goto ;
+   goto ;

   :
   ivtmp_b = ivtmp_a + 1;
   goto 

+ :
+ sum_y = PHI 
+
   :
- sum_z = PHI 
+ sum_z = PHI 


 In unified diff format:
@@ -1593,9 +1596,12 @@ replace_uses_in_bb_by (tree name, tree val, basic_block 
bb)
  - goto 
  + goto 

++:
++sum_y = PHI 
+
:
-- sum_z = PHI 
-+ sum_z = PHI 
+- sum_z = PHI 
++ sum_z = PHI 

 Note: the example does not show any virtual phis, but these are handled 
more
 or less as reductions.
@@ -1626,7 +1632,7 @@ transform_to_exit_first_loop_alt (struct loop *loop,

/* Create the new_header block.  */
basic_block new_header = split_block_before_cond_jump (exit->src);
-  edge split_edge = single_pred_edge (new_header);
+  edge edge_at_split = single_pred_edge (new_header);

/* Redirect entry edge to new_header.  */
edge entry = loop_preheader_edge (loop);
@@ -1643,9 +1649,9 @@ transform_to_exit_first_loop_alt (struct loop *loop,
e = redirect_edge_and_branch (post_cond_edge, header);
gcc_assert (e == post_cond_edge);

-  /* Redirect split_edge to latch.  */
-  e = redirect_edge_and_branch (split_edge, latch);
-  gcc_assert (e == split_edge);
+  /* Redirect edge_at_split to latch.  */
+  e = redirect_edge_and_branch (edge_at_split, latch);
+  gcc_assert (e == edge_at_split);

/* Set the new loop bound.  */
gimple_cond_set_rhs (cond_stmt, bound);
@@ -1697,21 +1703,36 @@ transform_to_exit_first_loop_alt (struct loop *loop,
/* Set the latch arguments of the new phis to ivtmp/sum_b.  */
flush_pending_stmts (post_inc_edge);

-  /* Register the reduction exit phis.  */
+  /* Create a new empty exit block, in between the new loop header and the old
+ exit block.  The function separate_decls_in_region needs this block to
+ insert code that is active on loop exit, but not any other path.  */
+  basic_block new_exit_block = split_edge (exit);
+
+  /* Insert and register the reduction exit phis.  */
for (gphi_iterator gsi = gsi_start_phis (exit_block);
 !gsi_end_p (gsi);
 gsi_next (&gsi))
  {
gphi *ph

Re: [PATCH 1/3] [ARM] PR63870 NEON error messages

2015-07-06 Thread Alan Lawrence
I note some parts of this duplicate my 
https://gcc.gnu.org/ml/gcc-patches/2015-01/msg01422.html , which has been pinged 
a couple of times. Both Charles' patch, and my two, contain parts the other does 
not...


Cheers, Alan

Charles Baylis wrote:

gcc/ChangeLog:

  Charles Baylis  

* config/arm/arm-builtins.c (enum arm_type_qualifiers): New enumerators
qualifier_lane_index, qualifier_struct_load_store_lane_index.
(arm_expand_neon_args): New parameter. Remove ellipsis. Handle NEON
argument qualifiers.
(arm_expand_neon_builtin): Handle NEON argument qualifiers.
* config/arm/arm-protos.h: (arm_neon_lane_bounds) New prototype.
* config/arm/arm.c (arm_neon_lane_bounds): New function.
* config/arm/arm.h (ENDIAN_LANE_N): New macro.

Change-Id: Iaa14d8736879fa53776319977eda2089f0a26647
---
 gcc/config/arm/arm-builtins.c | 65 ---
 gcc/config/arm/arm-protos.h   |  4 +++
 gcc/config/arm/arm.c  | 20 +
 gcc/config/arm/arm.h  |  3 ++
 4 files changed, 75 insertions(+), 17 deletions(-)

diff --git a/gcc/config/arm/arm-builtins.c b/gcc/config/arm/arm-builtins.c
index f960e0a..8f1253e 100644
--- a/gcc/config/arm/arm-builtins.c
+++ b/gcc/config/arm/arm-builtins.c
@@ -77,7 +77,11 @@ enum arm_type_qualifiers
   /* qualifier_const_pointer | qualifier_map_mode  */
   qualifier_const_pointer_map_mode = 0x86,
   /* Polynomial types.  */
-  qualifier_poly = 0x100
+  qualifier_poly = 0x100,
+  /* Lane indices - must be in range, and flipped for bigendian.  */
+  qualifier_lane_index = 0x200,
+  /* Lane indices for single lane structure loads and stores.  */
+  qualifier_struct_load_store_lane_index = 0x400
 };

 /*  The qualifier_internal allows generation of a unary builtin from
@@ -1927,6 +1931,8 @@ arm_expand_unop_builtin (enum insn_code icode,
 typedef enum {
   NEON_ARG_COPY_TO_REG,
   NEON_ARG_CONSTANT,
+  NEON_ARG_LANE_INDEX,
+  NEON_ARG_STRUCT_LOAD_STORE_LANE_INDEX,
   NEON_ARG_MEMORY,
   NEON_ARG_STOP
 } builtin_arg;
@@ -1984,9 +1990,9 @@ neon_dereference_pointer (tree exp, tree type, 
machine_mode mem_mode,
 /* Expand a Neon builtin.  */
 static rtx
 arm_expand_neon_args (rtx target, machine_mode map_mode, int fcode,
- int icode, int have_retval, tree exp, ...)
+ int icode, int have_retval, tree exp,
+ builtin_arg *args)
 {
-  va_list ap;
   rtx pat;
   tree arg[SIMD_MAX_BUILTIN_ARGS];
   rtx op[SIMD_MAX_BUILTIN_ARGS];
@@ -2001,13 +2007,11 @@ arm_expand_neon_args (rtx target, machine_mode 
map_mode, int fcode,
  || !(*insn_data[icode].operand[0].predicate) (target, tmode)))
 target = gen_reg_rtx (tmode);

-  va_start (ap, exp);
-
   formals = TYPE_ARG_TYPES (TREE_TYPE (arm_builtin_decls[fcode]));

   for (;;)
 {
-  builtin_arg thisarg = (builtin_arg) va_arg (ap, int);
+  builtin_arg thisarg = args[argc];

   if (thisarg == NEON_ARG_STOP)
break;
@@ -2043,17 +2047,46 @@ arm_expand_neon_args (rtx target, machine_mode 
map_mode, int fcode,
op[argc] = copy_to_mode_reg (mode[argc], op[argc]);
  break;

+case NEON_ARG_STRUCT_LOAD_STORE_LANE_INDEX:
+ gcc_assert (argc > 1);
+ if (CONST_INT_P (op[argc]))
+   {
+ arm_neon_lane_bounds (op[argc], 0,
+   GET_MODE_NUNITS (map_mode), exp);
+ /* Keep to GCC-vector-extension lane indices in the RTL.  */
+ op[argc] =
+   GEN_INT (ENDIAN_LANE_N (map_mode, INTVAL (op[argc])));
+   }
+ goto constant_arg;
+
+case NEON_ARG_LANE_INDEX:
+ /* Must be a previous operand into which this is an index.  */
+ gcc_assert (argc > 0);
+ if (CONST_INT_P (op[argc]))
+   {
+ machine_mode vmode = insn_data[icode].operand[argc - 1].mode;
+ arm_neon_lane_bounds (op[argc],
+   0, GET_MODE_NUNITS (vmode), exp);
+ /* Keep to GCC-vector-extension lane indices in the RTL.  */
+ op[argc] = GEN_INT (ENDIAN_LANE_N (vmode, INTVAL (op[argc])));
+   }
+ /* Fall through - if the lane index isn't a constant then
+the next case will error.  */
case NEON_ARG_CONSTANT:
+constant_arg:
  if (!(*insn_data[icode].operand[opno].predicate)
  (op[argc], mode[argc]))
-   error_at (EXPR_LOCATION (exp), "incompatible type for argument %d, 
"
-  "expected %", argc + 1);
+   {
+ error ("%Kargument %d must be a constant immediate",
+exp, argc + 1);
+ return const0_rtx;
+   }
  break;
+
 case NEON_ARG_MEMORY:
  /* Check if expand failed.  
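
The lane handling in the hunk above can be approximated in standalone C. The bodies of arm_neon_lane_bounds and ENDIAN_LANE_N are not shown in this message, so the sketch below rests on two assumptions: that the bounds are a half-open range [low, high), and that the big-endian flip maps lane n to nunits - 1 - n, as the "flipped for bigendian" comment suggests. The function names are hypothetical.

```c
#include <stdbool.h>

/* Assumed semantics: LANE is valid if LOW <= LANE < HIGH.  */
bool
lane_in_bounds (long lane, long low, long high)
{
  return lane >= low && lane < high;
}

/* Assumed semantics of the lane flip: on big-endian, lane N of a
   vector with NUNITS elements becomes NUNITS - 1 - N; on
   little-endian it is unchanged.  */
int
endian_lane_n (int nunits, int n, bool big_endian)
{
  return big_endian ? nunits - 1 - n : n;
}
```

Under these assumptions, lane 0 of a four-element vector maps to lane 3 on big-endian and stays lane 0 on little-endian, and an index of 4 is rejected as out of range.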

Re: [PATCH][AArch64] PR target/66731 Fix fnmul insn with -frounding-math

2015-07-06 Thread James Greenhalgh
On Mon, Jul 06, 2015 at 09:20:39AM +0100, Szabolcs Nagy wrote:
> fnmul was modeled as (-a)*b instead of -(a*b), which is wrong with
> -frounding-math, so the correct pattern is added too and the other
> one is only used if !flag_rounding_math.
> 
> This affects a glibc math test, similar fix will be needed for ARM.
> 
> Tested with aarch64-none-linux-gnu cross compiler.
> is this OK?
> 
> gcc/Changelog:
> 
> 2015-07-06  Szabolcs Nagy  
> 
>   PR target/66731
>   * config/aarch64/aarch64.md (fnmul3): Handle -frounding-math.
> 
> gcc/testsuite/Changelog:
> 
> 2015-07-06  Szabolcs Nagy  
> 
>   * gcc.target/aarch64/fnmul-1.c: New.
>   * gcc.target/aarch64/fnmul-2.c: New.
>   * gcc.target/aarch64/fnmul-3.c: New.
>   * gcc.target/aarch64/fnmul-4.c: New.

OK.

Please make sure in a follow-up patch that the costing logic in
aarch64_rtx_costs also gets updated.

Thanks,
James


> diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
> index 2d56a75..1e343fa 100644
> --- a/gcc/config/aarch64/aarch64.md
> +++ b/gcc/config/aarch64/aarch64.md
> @@ -4175,6 +4175,16 @@
>  (mult:GPF
>(neg:GPF (match_operand:GPF 1 "register_operand" "w"))
>(match_operand:GPF 2 "register_operand" "w")))]
> +  "TARGET_FLOAT && !flag_rounding_math"
> +  "fnmul\\t%0, %1, %2"
> +  [(set_attr "type" "fmul")]
> +)
> +
> +(define_insn "*fnmul3"
> +  [(set (match_operand:GPF 0 "register_operand" "=w")
> +(neg:GPF (mult:GPF
> +  (match_operand:GPF 1 "register_operand" "w")
> +  (match_operand:GPF 2 "register_operand" "w"))))]
>"TARGET_FLOAT"
>"fnmul\\t%0, %1, %2"
>[(set_attr "type" "fmul")]
> diff --git a/gcc/testsuite/gcc.target/aarch64/fnmul-1.c 
> b/gcc/testsuite/gcc.target/aarch64/fnmul-1.c
> new file mode 100644
> index 000..7ec38e0
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/aarch64/fnmul-1.c
> @@ -0,0 +1,16 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2" } */
> +
> +double
> +foo_d (double a, double b)
> +{
> + /* { dg-final { scan-assembler "fnmul\\td\[0-9\]+, d\[0-9\]+, 
> d\[0-9\]+" } } */
> + return -a * b;
> +}
> +
> +float
> +foo_s (float a, float b)
> +{
> + /* { dg-final { scan-assembler "fnmul\\ts\[0-9\]+, s\[0-9\]+, 
> s\[0-9\]+" } } */
> + return -a * b;
> +}
> diff --git a/gcc/testsuite/gcc.target/aarch64/fnmul-2.c 
> b/gcc/testsuite/gcc.target/aarch64/fnmul-2.c
> new file mode 100644
> index 000..f05ee79
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/aarch64/fnmul-2.c
> @@ -0,0 +1,18 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -frounding-math" } */
> +
> +double
> +foo_d (double a, double b)
> +{
> + /* { dg-final { scan-assembler "fneg\\td\[0-9\]+, d\[0-9\]+" } } */
> + /* { dg-final { scan-assembler "fmul\\td\[0-9\]+, d\[0-9\]+, d\[0-9\]+" } } */
> + return -a * b;
> +}
> +
> +float
> +foo_s (float a, float b)
> +{
> + /* { dg-final { scan-assembler "fneg\\ts\[0-9\]+, s\[0-9\]+" } } */
> + /* { dg-final { scan-assembler "fmul\\ts\[0-9\]+, s\[0-9\]+, s\[0-9\]+" } } */
> + return -a * b;
> +}
> diff --git a/gcc/testsuite/gcc.target/aarch64/fnmul-3.c b/gcc/testsuite/gcc.target/aarch64/fnmul-3.c
> new file mode 100644
> index 0000000..301e9cd
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/aarch64/fnmul-3.c
> @@ -0,0 +1,16 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2" } */
> +
> +double
> +foo_d (double a, double b)
> +{
> + /* { dg-final { scan-assembler "fnmul\\td\[0-9\]+, d\[0-9\]+, d\[0-9\]+" } } */
> + return -(a * b);
> +}
> +
> +float
> +foo_s (float a, float b)
> +{
> + /* { dg-final { scan-assembler "fnmul\\ts\[0-9\]+, s\[0-9\]+, s\[0-9\]+" } } */
> + return -(a * b);
> +}
> diff --git a/gcc/testsuite/gcc.target/aarch64/fnmul-4.c b/gcc/testsuite/gcc.target/aarch64/fnmul-4.c
> new file mode 100644
> index 0000000..9b9bf1b
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/aarch64/fnmul-4.c
> @@ -0,0 +1,16 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -frounding-math" } */
> +
> +double
> +foo_d (double a, double b)
> +{
> + /* { dg-final { scan-assembler "fnmul\\td\[0-9\]+, d\[0-9\]+, d\[0-9\]+" } } */
> + return -(a * b);
> +}
> +
> +float
> +foo_s (float a, float b)
> +{
> + /* { dg-final { scan-assembler "fnmul\\ts\[0-9\]+, s\[0-9\]+, s\[0-9\]+" } } */
> + return -(a * b);
> +}
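
Why the quoted patch guards the `(mult (neg a) b)` pattern with `!flag_rounding_math` but leaves `(neg (mult a b))` unguarded can be seen in a toy model. FNMUL rounds the product and then flips the sign; under a directed rounding mode that is not the same as rounding the product of a negated operand. The sketch below is not GCC or AArch64 code. It replaces floating-point rounding toward +infinity with rounding an integer up to a multiple of `grid`, and every name in it is made up for illustration.

```c
#include <assert.h>

/* Toy stand-in for "round toward +infinity": round V up to the next
   multiple of GRID.  GRID must be positive.  */
static long
round_up_to_grid (long v, long grid)
{
  assert (grid > 0);
  long q = v / grid;		/* C division truncates toward zero.  */
  if (v % grid > 0)		/* Positive and inexact: bump upward.  */
    q++;
  return q * grid;
}

/* Models (mult (neg a) b): the negated product is rounded once.  */
static long
neg_then_mul (long a, long b, long grid)
{
  return round_up_to_grid ((-a) * b, grid);
}

/* Models (neg (mult a b)), i.e. what FNMUL computes: the product is
   rounded first and the sign is flipped afterwards.  */
static long
mul_then_neg (long a, long b, long grid)
{
  return -round_up_to_grid (a * b, grid);
}
```

With `a = 7`, `b = 1` and `grid = 4` the two functions return -4 and -8 respectively, while for an exactly representable product such as `a = 2`, `b = 4` they agree. That asymmetry is the reason `-a * b` may only use FNMUL when sign-dependent rounding need not be honoured.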



Re: [commited, Patch, Fortran, PR58586, v5] ICE with derived type with allocatable component passed by value

2015-07-06 Thread Andre Vehreschild
Hi Steve, hi Paul, hi all,

Steve and Paul, thank you very much for the reviews. Committed with the
requested changes as r225447 and r225448. The last commit adds the testsuite
ChangeLog entry that I forgot. Sorry for that.

I have opened a PR for the open issue in the testcase:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66775

Regards,
Andre

On Sun, 5 Jul 2015 19:48:13 +0200
Paul Richard Thomas  wrote:

> Dear Andre,
> 
> I agree with Steve's recommendation that you comment out the line and
> open a PR for the problem.
> 
> The patch looks fine to me and applied cleanly, apart from trailing
> CRs in the testcases.
> 
> OK by me too.
> 
> Cheers
> 
> Paul
> 
> PS I felt safe in setting a deadline for the submodule patch because:
> (i) It was obvious that nobody would review it because of its size;
> and (ii) It is safely ring-fenced by the need for very specific
> procedure attributes and declarations. I would not dream of doing the
> same for other patches more integrated in parts of the compiler that
> are frequented by commonly used code. For example, the patch to
> encompass the use of private entities with submodules will be just
> such a patch when I figure out how to do it! I can sympathize with
> you though; you have often had to wait an excessively long time for
> reviews.
> 
> 
> On 5 July 2015 at 18:14, Steve Kargl wrote:
> > On Sat, Jul 04, 2015 at 09:20:39PM +0200, Andre Vehreschild wrote:
> >>
> >> Thanks for looking at the code. The error you experience is known
> >> to me. The bug is present in gfortran and only exposed by this patch.
> >> Unfortunately, pr58586 does not address this specific error. It
> >> may be in the bugtracker under a different number already. Furthermore,
> >> I did not want to extend the patch for 58586 any further, because I
> >> have learned that the more complicated a patch gets the longer review
> >> takes. To make the testcase run fine we can also simply comment out
> >> the line.
> >>
> >
> > I can appreciate the problem of fixing one bug may expose another,
> > and I agree that holding up a patch for 58586 due to a latent bug
> > seems unreasonable.  I reviewed the email history and it appears
> > that you've addressed Mikael's concerns.  My only comment would
> > be to comment out the problematic statement in alloc_comp_class_4.f03,
> > and open a new bug report to record the issue.  Ok to commit with
> > my suggested change.
> >
> > --
> > Steve
> 
> 
> 


-- 
Andre Vehreschild * Email: vehre ad gmx dot de 
Index: gcc/fortran/ChangeLog
===================================================================
--- gcc/fortran/ChangeLog	(revision 225446)
+++ gcc/fortran/ChangeLog	(working copy)
@@ -1,3 +1,22 @@
+2015-07-06  Andre Vehreschild  
+
+	PR fortran/58586
+	* resolve.c (resolve_symbol): Non-private functions in modules
+	with allocatable or pointer components are marked referenced
+	now.  Furthermore, the default init especially for those
+	components is now done in gfc_conv_procedure_call, preventing
+	duplicate code.
+	* trans-decl.c (gfc_generate_function_code): Generate a fake
+	result decl for functions returning an object with allocatable
+	components and initialize them.
+	* trans-expr.c (gfc_conv_procedure_call): For value typed trees
+	use the tree without indirect ref.  And for non-decl trees
+	add a temporary variable to prevent evaluating the tree
+	multiple times (prevent multiple function evaluations).
+	* trans.h: Made gfc_trans_structure_assign () prototype
+	available, which is now needed by
+	trans-decl.c:gfc_generate_function_code (), too.
+
 2015-07-04  Steven G. Kargl  
 
 	PR fortran/66725
Index: gcc/fortran/trans-decl.c
===================================================================
--- gcc/fortran/trans-decl.c	(revision 225446)
+++ gcc/fortran/trans-decl.c	(working copy)
@@ -5885,10 +5885,34 @@
   tmp = gfc_trans_code (ns->code);
   gfc_add_expr_to_block (&body, tmp);
 
-  if (TREE_TYPE (DECL_RESULT (fndecl)) != void_type_node)
+  if (TREE_TYPE (DECL_RESULT (fndecl)) != void_type_node
+  || (sym->result && sym->result != sym
+	  && sym->result->ts.type == BT_DERIVED
+	  && sym->result->ts.u.derived->attr.alloc_comp))
 {
+  bool artificial_result_decl = false;
   tree result = get_proc_result (sym);
+  gfc_symbol *rsym = sym == sym->result ? sym : sym->result;
 
+  /* Make sure that a function returning an object with
+	 alloc/pointer_components always has a result, where at least
+	 the allocatable/pointer components are set to zero.  */
+  if (result == NULL_TREE && sym->attr.function
+	  && ((sym->result->ts.type == BT_DERIVED
+	   && (sym->attr.allocatable
+		   || sym->attr.pointer
+		   || sym->result->ts.u.derived->attr.alloc_comp
+		   || sym->result->ts.u.derived->attr.pointer_comp))
+	  || (sym->result->ts.type == BT_CLASS
+		  && (CLASS_DATA (sym)->attr.allocatable
+		  || CLASS_DATA (sym)->attr.class_pointer
+		  || CLASS_DATA (sym->result)->attr.alloc_com

[PATCH] Fix PR66759

2015-07-06 Thread Richard Biener

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied.

Richard.

2015-07-06  Richard Biener  

PR middle-end/66759
* match.pd: Add missing constraint of y to REAL_CST in
REAL_CST - x CMP y to y - CST CMP x simplification.

* gcc.dg/torture/pr66759.c: New testcase.

Index: gcc/match.pd
===================================================================
--- gcc/match.pd	(revision 225443)
+++ gcc/match.pd	(working copy)
@@ -1496,7 +1513,7 @@ (define_operator_list CBRT BUILT_IN_CBRT
 floating-point types only if -fassociative-math is set.  */
  (if (flag_associative_math)
   (simplify
-   (cmp (minus REAL_CST@0 @1) @2)
+   (cmp (minus REAL_CST@0 @1) REAL_CST@2)
(with { tree tem = const_binop (MINUS_EXPR, TREE_TYPE (@1), @0, @2); }
 (if (!TREE_OVERFLOW (tem))
  (cmp { tem; } @1)
Index: gcc/testsuite/gcc.dg/torture/pr66759.c
===================================================================
--- gcc/testsuite/gcc.dg/torture/pr66759.c  (revision 0)
+++ gcc/testsuite/gcc.dg/torture/pr66759.c  (working copy)
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-ffast-math" } */
+
+int a, b;
+float c;
+int fn2();
+void fn1()
+{
+  if (fn2() <= 1. - c)
+b = a;
+}
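
As far as I read the match.pd hunk, the fold rewrites `C0 - x CMP C2` into `(C0 - C2) CMP x`, with `C0 - C2` computed at compile time by `const_binop`. That only makes sense when the right-hand side is itself a `REAL_CST`; in the new testcase it is the converted result of `fn2()`, which is why the missing constraint mattered. The sketch below only illustrates the algebra for `<=` with values chosen so the arithmetic is exact. It is plain C, not match.pd syntax, and all function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* The comparison as written in the source: C0 - x <= C2.  */
static bool
cmp_original (double c0, double x, double c2)
{
  return c0 - x <= c2;
}

/* The folded form: TEM = C0 - C2 plays the role of the compile-time
   constant, and the comparison becomes TEM <= x.  */
static bool
cmp_folded (double c0, double x, double c2)
{
  double tem = c0 - c2;
  return tem <= x;
}

/* Abort if the two forms disagree for the given operands.  Only
   meaningful for inputs where both subtractions are exact.  */
static void
check_fold (double c0, double x, double c2)
{
  assert (cmp_original (c0, x, c2) == cmp_folded (c0, x, c2));
}
```

Calling `check_fold (1.0, 0.25, 0.5)` and `check_fold (1.0, 0.75, 0.5)` exercises both outcomes of the comparison. In general the rewrite is not exact in floating point, which matches the `flag_associative_math` guard visible in the hunk.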


Re: [PATCH 1/2][ARM] PR/65956 AAPCS update for alignment attribute

2015-07-06 Thread Alan Lawrence

Eric Botcazou wrote:

Technically this is incorrect since AGGREGATE_TYPE_P includes ARRAY_TYPE
and ARRAY_TYPE doesn't have TYPE_FIELDS.  I doubt we could reach that
case though (unless there's a language that allows passing arrays by value).


Ada passes small array types by the method specified by the pass_by_reference 
hook (and large array types by reference).


Ok, thanks. Here's a revised patch that handles array types. Again I've tested 
on both trunk (bootstrap + check-gcc) and gcc-5-branch (profiledbootstrap now 
succeeding + check-gcc). Jakub's pr65956.c testcase also now passes.


The new code lacks a testcase; from what Eric says, it's possible we can write 
one using Ada, but I don't know any Ada myself, so I think any testcase should 
follow in a separate patch.


Nor have I managed to run check-ada yet, as I don't presently have a 
working Ada compiler with which to bootstrap gcc's Ada frontend. Working on this 
now.


--Alan

gcc/ChangeLog:

* config/arm/arm.c (arm_needs_doubleword_align): Drop any outer
alignment attribute, exploring one level down for records and arrays.
commit f8bd310d65f2b8fd8d7e1151a4a1f84489738029
Author: Alan Lawrence 
Date:   Wed Jun 3 18:22:36 2015 +0100

arm_needs_doubleword_align: explore one level for aggregates, also arrays.

diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index e79a369..e12198a 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -6151,8 +6151,23 @@ arm_init_cumulative_args (CUMULATIVE_ARGS *pcum, tree fntype,
 static bool
 arm_needs_doubleword_align (machine_mode mode, const_tree type)
 {
-  return (GET_MODE_ALIGNMENT (mode) > PARM_BOUNDARY
-	  || (type && TYPE_ALIGN (type) > PARM_BOUNDARY));
+  if (!type)
+return PARM_BOUNDARY < GET_MODE_ALIGNMENT (mode);
+
+  /* Scalar and vector types: Use natural alignment, i.e. of base type.  */
+  if (!AGGREGATE_TYPE_P (type))
+return TYPE_ALIGN (TYPE_MAIN_VARIANT (type)) > PARM_BOUNDARY;
+
+  /* Array types: Use member alignment of element type.  */
+  if (TREE_CODE (type) == ARRAY_TYPE)
+return TYPE_ALIGN (TREE_TYPE (type)) > PARM_BOUNDARY;
+
+  /* Record/aggregate types: Use greatest member alignment of any member.  */ 
+  for (tree field = TYPE_FIELDS (type); field; field = DECL_CHAIN (field))
+if (DECL_ALIGN (field) > PARM_BOUNDARY)
+  return true;
+
+  return false;
 }
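
The decision procedure in the hunk above can be modelled outside GCC. The sketch below is an assumption-laden stand-in, not the real implementation: `struct model_type` and every `model_*` name are invented here, alignments are given in bits, `MODEL_PARM_BOUNDARY` stands in for `PARM_BOUNDARY` (32 on ARM), and the `!type` case that falls back to the mode alignment is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_PARM_BOUNDARY 32u	/* Bits; stands in for PARM_BOUNDARY.  */

enum model_kind { MODEL_SCALAR, MODEL_ARRAY, MODEL_RECORD };

struct model_type
{
  enum model_kind kind;
  unsigned type_align;		    /* Like TYPE_ALIGN, attributes included.  */
  unsigned natural_align;	    /* Like TYPE_ALIGN of the main variant.  */
  const struct model_type *element; /* Arrays: the element type.  */
  const unsigned *field_align;	    /* Records: DECL_ALIGN of each field.  */
  size_t n_fields;
};

static bool
model_needs_doubleword_align (const struct model_type *type)
{
  assert (type != NULL);
  switch (type->kind)
    {
    case MODEL_SCALAR:		/* Natural alignment of the base type.  */
      return type->natural_align > MODEL_PARM_BOUNDARY;
    case MODEL_ARRAY:		/* Alignment of the element type.  */
      assert (type->element != NULL);
      return type->element->type_align > MODEL_PARM_BOUNDARY;
    case MODEL_RECORD:		/* Greatest alignment of any field.  */
      for (size_t i = 0; i < type->n_fields; i++)
	if (type->field_align[i] > MODEL_PARM_BOUNDARY)
	  return true;
      return false;
    }
  return false;
}

/* Sample types for experimenting with the model.  */
static const struct model_type model_int
  = { .kind = MODEL_SCALAR, .type_align = 32, .natural_align = 32 };
/* An int typedef carrying an aligned(8) attribute: only type_align grows.  */
static const struct model_type model_aligned_int
  = { .kind = MODEL_SCALAR, .type_align = 64, .natural_align = 32 };
static const struct model_type model_double
  = { .kind = MODEL_SCALAR, .type_align = 64, .natural_align = 64 };
static const struct model_type model_array_of_double
  = { .kind = MODEL_ARRAY, .type_align = 64, .element = &model_double };

static const unsigned model_fields_int_double[] = { 32, 64 };
static const unsigned model_fields_int_int[] = { 32, 32 };
static const struct model_type model_record_int_double
  = { .kind = MODEL_RECORD, .type_align = 64,
      .field_align = model_fields_int_double, .n_fields = 2 };
/* A record of two ints with an outer aligned(8) attribute.  */
static const struct model_type model_aligned_record_of_ints
  = { .kind = MODEL_RECORD, .type_align = 64,
      .field_align = model_fields_int_int, .n_fields = 2 };
```

In this model the over-aligned `int` and the over-aligned record of `int`s do not request doubleword alignment, because the outer attribute is ignored, whereas `double`, an array of `double` and a record containing a `double` do.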
 
 


Re: [Patch, Fortran, 66035, v2] [5/6 Regression] gfortran ICE segfault

2015-07-06 Thread Andre Vehreschild
Hi all,

please find attached the next version of the patch for pr66035, which fixes an
ICE. Scope (copied from the first submission):

An ICE occurred when, in a structure constructor, an allocatable component of
type class was initialized with an existing class object. It was caused by
three problems:

- the size of the memory to allocate for the component was miscalculated,
- the vptr was not set correctly, and
- a class object used for the initialization that was already allocatable
  was copied, wasting some memory, instead of inserting a view_convert.

Bootstraps and regtests fine on x86_64-linux-gnu/f21.

Ok for trunk?

Regards,
Andre
-- 
Andre Vehreschild * Email: vehre ad gmx dot de 
gcc/fortran/ChangeLog:

2015-07-06  Andre Vehreschild  

PR fortran/66035
* trans-expr.c (alloc_scalar_allocatable_for_subcomponent_assignment):
Compute the size to allocate for class and derived type objects
correctly.
(gfc_trans_subcomponent_assign): Only allocate memory for a
component when the object to assign is not an allocatable class
object (the memory is already present for allocatable class objects).
Furthermore use copy_class_to_class for assigning the rhs to the
component (may happen for dummy class objects on the rhs).


gcc/testsuite/ChangeLog:

2015-07-06  Andre Vehreschild  

PR fortran/66035
* gfortran.dg/structure_constructor_13.f03: New test.


diff --git a/gcc/fortran/trans-expr.c b/gcc/fortran/trans-expr.c
index 195f7a4..74af725 100644
--- a/gcc/fortran/trans-expr.c
+++ b/gcc/fortran/trans-expr.c
@@ -6903,6 +6903,29 @@ alloc_scalar_allocatable_for_subcomponent_assignment (stmtblock_t *block,
    TREE_TYPE (tmp), tmp,
    fold_convert (TREE_TYPE (tmp), size));
 }
+  else if (cm->ts.type == BT_CLASS)
+{
+  gcc_assert (expr2->ts.type == BT_CLASS || expr2->ts.type == BT_DERIVED);
+  if (expr2->ts.type == BT_DERIVED)
+	{
+	  tmp = gfc_get_symbol_decl (expr2->ts.u.derived);
+	  size = TYPE_SIZE_UNIT (tmp);
+	}
+  else
+	{
+	  gfc_expr *e2vtab;
+	  gfc_se se;
+	  e2vtab = gfc_find_and_cut_at_last_class_ref (expr2);
+	  gfc_add_vptr_component (e2vtab);
+	  gfc_add_size_component (e2vtab);
+	  gfc_init_se (&se, NULL);
+	  gfc_conv_expr (&se, e2vtab);
+	  gfc_add_block_to_block (block, &se.pre);
+	  size = fold_convert (size_type_node, se.expr);
+	  gfc_free_expr (e2vtab);
+	}
+  size_in_bytes = size;
+}
   else
 {
   /* Otherwise use the length in bytes of the rhs.  */
@@ -7030,7 +7053,8 @@ gfc_trans_subcomponent_assign (tree dest, gfc_component * cm, gfc_expr * expr,
   gfc_add_expr_to_block (&block, tmp);
 }
   else if (init && (cm->attr.allocatable
-	   || (cm->ts.type == BT_CLASS && CLASS_DATA (cm)->attr.allocatable)))
+	   || (cm->ts.type == BT_CLASS && CLASS_DATA (cm)->attr.allocatable
+	   && expr->ts.type != BT_CLASS)))
 {
   /* Take care about non-array allocatable components here.  The alloc_*
 	 routine below is motivated by the alloc_scalar_allocatable_for_
@@ -7074,6 +7098,14 @@ gfc_trans_subcomponent_assign (tree dest, gfc_component * cm, gfc_expr * expr,
 	  tmp = gfc_build_memcpy_call (tmp, se.expr, size);
 	  gfc_add_expr_to_block (&block, tmp);
 	}
+  else if (cm->ts.type == BT_CLASS && expr->ts.type == BT_CLASS)
+	{
+	  tmp = gfc_copy_class_to_class (se.expr, dest, integer_one_node,
+   CLASS_DATA (cm)->attr.unlimited_polymorphic);
+	  gfc_add_expr_to_block (&block, tmp);
+	  gfc_add_modify (&block, gfc_class_vptr_get (dest),
+			  gfc_class_vptr_get (se.expr));
+	}
   else
 	gfc_add_modify (&block, tmp,
 			fold_convert (TREE_TYPE (tmp), se.expr));
diff --git a/gcc/testsuite/gfortran.dg/structure_constructor_13.f03 b/gcc/testsuite/gfortran.dg/structure_constructor_13.f03
new file mode 100644
index 0000000..c74e325
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/structure_constructor_13.f03
@@ -0,0 +1,28 @@
+! { dg-do run }
+!
+! Contributed by Melven Roehrig-Zoellner  
+! PR fortran/66035
+
+program test_pr66035
+  type t
+  end type t
+  type w
+class(t), allocatable :: c
+  end type w
+
+  type(t) :: o
+
+  call test(o)
+contains
+  subroutine test(o)
+class(t), intent(inout) :: o
+type(w), dimension(:), allocatable :: list
+
+select type (o)
+  class is (t)
+list = [w(o)] ! This caused an ICE
+  class default
+call abort()
+end select
+  end subroutine
+end program
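
The first two points of the description (allocate by the size of the dynamic type, then set the vptr) can be pictured with a small C model of a class container. This is only an illustration under my own assumptions: `struct model_vtab`, `struct model_class` and the functions below are invented names, not gfortran's actual runtime layout or the helpers used in trans-expr.c.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for a vtab: only the size of the dynamic type is modelled.  */
struct model_vtab
{
  size_t size;
};

/* Stand-in for a class container: payload pointer plus vptr.  */
struct model_class
{
  void *data;
  const struct model_vtab *vptr;
};

/* Initialize DST from SRC: allocate by the size recorded in SRC's vtab
   (not by the declared type), copy the payload and carry the vptr over.
   Returns 0 on success and -1 if the allocation fails.  */
static int
model_class_init_from (struct model_class *dst, const struct model_class *src)
{
  assert (dst != NULL && src != NULL && src->vptr != NULL);
  size_t size = src->vptr->size;
  dst->data = malloc (size ? size : 1);	/* Empty types still get storage.  */
  if (dst->data == NULL)
    return -1;
  if (size)
    memcpy (dst->data, src->data, size);
  dst->vptr = src->vptr;
  return 0;
}

/* Returns 1 when the copy carries the full payload and the vptr.  */
static int
model_demo (void)
{
  static const struct model_vtab ext_vtab = { 2 * sizeof (int) };
  int payload[2] = { 1, 2 };
  struct model_class src = { payload, &ext_vtab };
  struct model_class dst = { NULL, NULL };
  if (model_class_init_from (&dst, &src) != 0)
    return 0;
  int ok = (dst.vptr == &ext_vtab
	    && memcmp (dst.data, payload, sizeof payload) == 0);
  free (dst.data);
  return ok;
}
```

The point of the model is that both the allocation size and the vptr come from the right-hand side object; taking the size from the declared type of the component instead would under-allocate for an extended type.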


[PATCH 0/7] more ifdef removal

2015-07-06 Thread tbsaunde+gcc
From: Trevor Saunders 

Hi,

$subject.

The patches were individually bootstrapped + regtested on x86_64-linux-gnu,
and the series was run through config-list.mk with some other stuff a couple
of weeks ago. I plan to commit this as preapproved tonight if nobody complains.

Trev

Trevor Saunders (7):
  reduce conditional compilation for LOAD_EXTEND_OP
  remove #if for HAVE_cc0 in combine.c
  always define SHORT_IMMEDIATES_SIGN_EXTEND
  use #if for HARD_FRAME_POINTER_IS_FRAME_POINTER less
  always define AUTO_INC_DEC
  reduce conditional compilation based on AUTO_INC_DEC
  always define WORD_REGISTER_OPERATIONS

 gcc/auto-inc-dec.c |  16 ++--
 gcc/combine.c  | 189 -
 gcc/config/alpha/alpha.h   |   4 +-
 gcc/config/arc/arc.h   |   2 +-
 gcc/config/arm/arm.h   |   2 +-
 gcc/config/bfin/bfin.h |   2 +-
 gcc/config/epiphany/epiphany.h |   2 +-
 gcc/config/frv/frv.h   |   4 +-
 gcc/config/ia64/ia64.h |   2 +-
 gcc/config/iq2000/iq2000.h |   2 +-
 gcc/config/lm32/lm32.h |   4 +-
 gcc/config/m32r/m32r.h |   2 +-
 gcc/config/mcore/mcore.h   |   4 +-
 gcc/config/mep/mep.h   |   4 +-
 gcc/config/microblaze/microblaze.h |   2 +-
 gcc/config/mips/mips.h |   4 +-
 gcc/config/mmix/mmix.h |   2 +-
 gcc/config/mn10300/mn10300.h   |   2 +-
 gcc/config/nds32/nds32.h   |   2 +-
 gcc/config/nios2/nios2.h   |   2 +-
 gcc/config/pa/pa.h |   2 +-
 gcc/config/rl78/rl78.h |   2 +-
 gcc/config/rs6000/rs6000.h |   2 +-
 gcc/config/sh/sh.h |   4 +-
 gcc/config/sparc/sparc.h   |   2 +-
 gcc/config/stormy16/stormy16.h |   2 +-
 gcc/config/tilegx/tilegx.h |   4 +-
 gcc/config/tilepro/tilepro.h   |   4 +-
 gcc/config/v850/v850.h |   2 +-
 gcc/config/xtensa/xtensa.h |   2 +-
 gcc/cse.c  |   6 +-
 gcc/defaults.h |   8 ++
 gcc/doc/tm.texi|   4 +-
 gcc/doc/tm.texi.in |   4 +-
 gcc/emit-rtl.c |   7 +-
 gcc/expr.c |   6 +-
 gcc/fold-const.c   |  10 +-
 gcc/internal-fn.c  |  11 +--
 gcc/loop-invariant.c   |  10 +-
 gcc/lower-subreg.c |  30 +++---
 gcc/lra.c  |   8 +-
 gcc/postreload.c   |  18 ++--
 gcc/recog.c|  13 +--
 gcc/regrename.c|  14 ++-
 gcc/reload.c   |  42 -
 gcc/reload1.c  |  30 +++---
 gcc/rtl.h  |   6 +-
 gcc/rtlanal.c  |  22 ++---
 gcc/sched-deps.c   |   9 +-
 gcc/sel-sched.c|   9 +-
 gcc/simplify-rtx.c |   4 +-
 gcc/valtrack.c |   8 +-
 52 files changed, 244 insertions(+), 314 deletions(-)

-- 
2.4.0
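
The common shape of the series, as I understand it from the individual patches, is to give each target macro a value on every target and to test that value with an ordinary `if`, so both arms are always parsed and type-checked and the dead one is removed by the optimizer. A minimal sketch of that shape follows; the `MODEL_`/`model_` names and the simplified sign-extension helper are assumptions for illustration and do not reproduce the real combine.c code.

```c
#include <assert.h>

/* defaults.h-style fallback: the macro always has a value.  */
#ifndef MODEL_SHORT_IMMEDIATES_SIGN_EXTEND
#define MODEL_SHORT_IMMEDIATES_SIGN_EXTEND 0
#endif

/* Simplified helper: if SRC is a non-negative value that fits in PREC
   bits and has its top bit set, return it sign-extended from PREC bits.
   No longer wrapped in #ifdef, so it is compiled on every target.  */
static long
model_sign_extend_short_imm (long src, unsigned prec)
{
  assert (prec > 0 && prec < 8 * sizeof (long) - 1);
  if (src >= 0 && src < (1L << prec) && ((src >> (prec - 1)) & 1))
    return src - (1L << prec);
  return src;
}

/* The former "#ifdef ... #endif" region becomes a plain conditional.  */
static long
model_prepare_src (long src, unsigned prec)
{
  if (MODEL_SHORT_IMMEDIATES_SIGN_EXTEND)
    src = model_sign_extend_short_imm (src, prec);
  return src;
}
```

With the default of 0, `model_prepare_src (0xFF, 8)` returns its argument unchanged; a target header that defines the macro to 1 would get -1 instead, without any change to the shared code.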



[PATCH 2/7] remove #if for HAVE_cc0 in combine.c

2015-07-06 Thread tbsaunde+gcc
From: Trevor Saunders 

gcc/ChangeLog:

2015-07-06  Trevor Saunders  

* combine.c (do_SUBST_MODE): Don't check the value of HAVE_cc0
with the preprocessor.
(combine_instructions): Likewise.
(try_combine): Likewise.
(subst): Likewise.
(distribute_notes): Likewise.
---
 gcc/combine.c | 59 ---
 1 file changed, 24 insertions(+), 35 deletions(-)

diff --git a/gcc/combine.c b/gcc/combine.c
index 8b1e9f4..a02e755 100644
--- a/gcc/combine.c
+++ b/gcc/combine.c
@@ -825,7 +825,6 @@ do_SUBST_MODE (rtx *into, machine_mode newval)
 
 #define SUBST_MODE(INTO, NEWVAL)  do_SUBST_MODE (&(INTO), (NEWVAL))
 
-#if !HAVE_cc0
 /* Similar to SUBST, but NEWVAL is a LOG_LINKS expression.  */
 
 static void
@@ -851,7 +850,6 @@ do_SUBST_LINK (struct insn_link **into, struct insn_link *newval)
 }
 
 #define SUBST_LINK(oldval, newval) do_SUBST_LINK (&oldval, newval)
-#endif
 
 /* Subroutine of try_combine.  Determine whether the replacement patterns
NEWPAT, NEWI2PAT and NEWOTHERPAT are cheaper according to insn_rtx_cost
@@ -1142,9 +1140,7 @@ static int
 combine_instructions (rtx_insn *f, unsigned int nregs)
 {
   rtx_insn *insn, *next;
-#if HAVE_cc0
   rtx_insn *prev;
-#endif
   struct insn_link *links, *nextlinks;
   rtx_insn *first;
   basic_block last_bb;
@@ -1319,7 +1315,6 @@ combine_instructions (rtx_insn *f, unsigned int nregs)
}
  }
 
-#if HAVE_cc0
  /* Try to combine a jump insn that uses CC0
 with a preceding insn that sets CC0, and maybe with its
 logical predecessor as well.
@@ -1327,7 +1322,7 @@ combine_instructions (rtx_insn *f, unsigned int nregs)
 We need this special code because data flow connections
 via CC0 do not get entered in LOG_LINKS.  */
 
- if (JUMP_P (insn)
+ if (HAVE_cc0 && JUMP_P (insn)
  && (prev = prev_nonnote_insn (insn)) != 0
  && NONJUMP_INSN_P (prev)
  && sets_cc0_p (PATTERN (prev)))
@@ -1345,7 +1340,7 @@ combine_instructions (rtx_insn *f, unsigned int nregs)
}
 
  /* Do the same for an insn that explicitly references CC0.  */
- if (NONJUMP_INSN_P (insn)
+ if (HAVE_cc0 && NONJUMP_INSN_P (insn)
  && (prev = prev_nonnote_insn (insn)) != 0
  && NONJUMP_INSN_P (prev)
  && sets_cc0_p (PATTERN (prev))
@@ -1367,18 +1362,20 @@ combine_instructions (rtx_insn *f, unsigned int nregs)
  /* Finally, see if any of the insns that this insn links to
 explicitly references CC0.  If so, try this insn, that insn,
 and its predecessor if it sets CC0.  */
- FOR_EACH_LOG_LINK (links, insn)
- if (NONJUMP_INSN_P (links->insn)
- && GET_CODE (PATTERN (links->insn)) == SET
- && reg_mentioned_p (cc0_rtx, SET_SRC (PATTERN (links->insn)))
- && (prev = prev_nonnote_insn (links->insn)) != 0
- && NONJUMP_INSN_P (prev)
- && sets_cc0_p (PATTERN (prev))
- && (next = try_combine (insn, links->insn,
- prev, NULL, &new_direct_jump_p,
- last_combined_insn)) != 0)
-   goto retry;
-#endif
+ if (HAVE_cc0)
+   {
+ FOR_EACH_LOG_LINK (links, insn)
+   if (NONJUMP_INSN_P (links->insn)
+   && GET_CODE (PATTERN (links->insn)) == SET
+   && reg_mentioned_p (cc0_rtx, SET_SRC (PATTERN (links->insn)))
+   && (prev = prev_nonnote_insn (links->insn)) != 0
+   && NONJUMP_INSN_P (prev)
+   && sets_cc0_p (PATTERN (prev))
+   && (next = try_combine (insn, links->insn,
+   prev, NULL, &new_direct_jump_p,
+   last_combined_insn)) != 0)
+ goto retry;
+   }
 
  /* Try combining an insn with two different insns whose results it
 uses.  */
@@ -2546,7 +2543,6 @@ is_parallel_of_n_reg_sets (rtx pat, int n)
   return true;
 }
 
-#if !HAVE_cc0
 /* Return whether INSN, a PARALLEL of N register SETs (and maybe some
CLOBBERs), can be split into individual SETs in that order, without
changing semantics.  */
@@ -2573,7 +2569,6 @@ can_split_parallel_of_n_reg_sets (rtx_insn *insn, int n)
 
   return true;
 }
-#endif
 
 /* Try to combine the insns I0, I1 and I2 into I3.
Here I0, I1 and I2 appear earlier than I3.
@@ -2920,7 +2915,6 @@ try_combine (rtx_insn *i3, rtx_insn *i2, rtx_insn *i1, rtx_insn *i0,
}
 }
 
-#if !HAVE_cc0
   /* If we have no I1 and I2 looks like:
(parallel [(set (reg:CC X) (compare:CC OP (const_int 0)))
   (set Y OP)])
@@ -2934,7 +2928,7 @@ try_combine (rtx_insn *i3, rtx_insn *i2, rtx_insn *i1, 
rtx_ins

[PATCH 1/7] reduce conditional compilation for LOAD_EXTEND_OP

2015-07-06 Thread tbsaunde+gcc
From: Trevor Saunders 

Provide a default in files where that is possible, so that everything
else there can be unconditionally compiled.  However rtlanal.c and
reload.c do tricky things that break providing a global default, so we
can't do that yet.
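
A per-file default for a function-like macro works the same way as a global one. The sketch below mirrors the `fold_single_bit_test` change in this patch under simplifying assumptions: the `MODEL_`/`model_` names are hypothetical, modes are plain integers, and `flag_syntax_only` is passed as a parameter instead of being a global.

```c
#include <assert.h>

enum model_rtx_code { MODEL_UNKNOWN, MODEL_SIGN_EXTEND, MODEL_ZERO_EXTEND };

/* Per-file fallback, as the patch adds to combine.c, cse.c and
   fold-const.c: targets that say nothing get UNKNOWN for every mode.  */
#ifndef MODEL_LOAD_EXTEND_OP
#define MODEL_LOAD_EXTEND_OP(M) MODEL_UNKNOWN
#endif

/* Returns 1 if the operations should be done as unsigned.  With the
   fallback in place the old "#else ops_unsigned = 1;" branch is no
   longer needed: UNKNOWN never equals SIGN_EXTEND, so the result is 1.  */
static int
model_ops_unsigned (int operand_mode, int syntax_only)
{
  assert (operand_mode >= 0);
  return (MODEL_LOAD_EXTEND_OP (operand_mode) == MODEL_SIGN_EXTEND
	  && !syntax_only) ? 0 : 1;
}
```

On a target without the macro, `model_ops_unsigned` always returns 1, which is the behaviour the removed `#else` arm used to provide.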

gcc/ChangeLog:

2015-07-06  Trevor Saunders  

* combine.c (try_combine): Don't check if LOAD_EXTEND_OP is
defined.
(simplify_set): Likewise.
* cse.c (cse_insn): Likewise.
* fold-const.c (fold_single_bit_test): Likewise.
(fold_unary_loc): Likewise.
* postreload.c (reload_cse_simplify_set): Likewise.
(reload_cse_simplify_operands): Likewise.
---
 gcc/combine.c|  8 
 gcc/cse.c|  6 --
 gcc/fold-const.c | 10 --
 gcc/postreload.c | 18 ++
 4 files changed, 18 insertions(+), 24 deletions(-)

diff --git a/gcc/combine.c b/gcc/combine.c
index 8eaae7c..8b1e9f4 100644
--- a/gcc/combine.c
+++ b/gcc/combine.c
@@ -120,6 +120,10 @@ along with GCC; see the file COPYING3.  If not see
 #include "obstack.h"
 #include "rtl-iter.h"
 
+#ifndef LOAD_EXTEND_OP
+#define LOAD_EXTEND_OP(M) UNKNOWN
+#endif
+
 /* Number of attempts to combine instructions in this function.  */
 
 static int combine_attempts;
@@ -3751,7 +3755,6 @@ try_combine (rtx_insn *i3, rtx_insn *i2, rtx_insn *i1, rtx_insn *i0,
 be written as a ZERO_EXTEND.  */
  if (split_code == SUBREG && MEM_P (SUBREG_REG (*split)))
{
-#ifdef LOAD_EXTEND_OP
  /* Or as a SIGN_EXTEND if LOAD_EXTEND_OP says that that's
 what it really is.  */
  if (LOAD_EXTEND_OP (GET_MODE (SUBREG_REG (*split)))
@@ -3759,7 +3762,6 @@ try_combine (rtx_insn *i3, rtx_insn *i2, rtx_insn *i1, rtx_insn *i0,
SUBST (*split, gen_rtx_SIGN_EXTEND (split_mode,
SUBREG_REG (*split)));
  else
-#endif
SUBST (*split, gen_rtx_ZERO_EXTEND (split_mode,
SUBREG_REG (*split)));
}
@@ -6779,7 +6781,6 @@ simplify_set (rtx x)
}
 }
 
-#ifdef LOAD_EXTEND_OP
   /* If we have (set FOO (subreg:M (mem:N BAR) 0)) with M wider than N, this
  would require a paradoxical subreg.  Replace the subreg with a
  zero_extend to avoid the reload that would otherwise be required.  */
@@ -6797,7 +6798,6 @@ simplify_set (rtx x)
 
   src = SET_SRC (x);
 }
-#endif
 
   /* If we don't have a conditional move, SET_SRC is an IF_THEN_ELSE, and we
  are comparing an item known to be 0 or -1 against 0, use a logical
diff --git a/gcc/cse.c b/gcc/cse.c
index e01240c..b286417 100644
--- a/gcc/cse.c
+++ b/gcc/cse.c
@@ -58,6 +58,10 @@ along with GCC; see the file COPYING3.  If not see
 #include "dbgcnt.h"
 #include "rtl-iter.h"
 
+#ifndef LOAD_EXTEND_OP
+#define LOAD_EXTEND_OP(M) UNKNOWN
+#endif
+
 /* The basic idea of common subexpression elimination is to go
through the code, keeping a record of expressions that would
have the same value at the current scan point, and replacing
@@ -4873,7 +4877,6 @@ cse_insn (rtx_insn *insn)
}
}
 
-#ifdef LOAD_EXTEND_OP
   /* See if a MEM has already been loaded with a widening operation;
 if it has, we can use a subreg of that.  Many CISC machines
 also have such operations, but this is only likely to be
@@ -4919,7 +4922,6 @@ cse_insn (rtx_insn *insn)
break;
}
}
-#endif /* LOAD_EXTEND_OP */
 
   /* Try to express the constant using a register+offset expression
 derived from a constant anchor.  */
diff --git a/gcc/fold-const.c b/gcc/fold-const.c
index 5da6ed3..c8a0520 100644
--- a/gcc/fold-const.c
+++ b/gcc/fold-const.c
@@ -84,6 +84,10 @@ along with GCC; see the file COPYING3.  If not see
 #include "generic-match.h"
 #include "optabs.h"
 
+#ifndef LOAD_EXTEND_OP
+#define LOAD_EXTEND_OP(M) UNKNOWN
+#endif
+
 /* Nonzero if we are folding constants inside an initializer; zero
otherwise.  */
 int folding_initializer = 0;
@@ -6850,12 +6854,8 @@ fold_single_bit_test (location_t loc, enum tree_code code,
   /* If we are going to be able to omit the AND below, we must do our
 operations as unsigned.  If we must use the AND, we have a choice.
 Normally unsigned is faster, but for some machines signed is.  */
-#ifdef LOAD_EXTEND_OP
   ops_unsigned = (LOAD_EXTEND_OP (operand_mode) == SIGN_EXTEND
  && !flag_syntax_only) ? 0 : 1;
-#else
-  ops_unsigned = 1;
-#endif
 
   signed_type = lang_hooks.types.type_for_mode (operand_mode, 0);
   unsigned_type = lang_hooks.types.type_for_mode (operand_mode, 1);
@@ -8019,7 +8019,6 @@ fold_unary_loc (location_t loc, enum tree_code code, tree type, tree op0)
  cst &= HOST_WIDE_INT_M1U
 << (TYPE_PRECISION (TREE_TYPE (and1)) - 1);
  change = (cst == 0);
-#ifdef LOAD_EXTEND_OP
 

[PATCH 7/7] always define WORD_REGISTER_OPERATIONS

2015-07-06 Thread tbsaunde+gcc
From: Trevor Saunders 

gcc/ChangeLog:

2015-07-06  Trevor Saunders  

* defaults.h: Provide default for WORD_REGISTER_OPERATIONS.
* config/alpha/alpha.h: Define WORD_REGISTER_OPERATIONS to 1.
* config/arc/arc.h: Likewise.
* config/arm/arm.h: Likewise.
* config/bfin/bfin.h: Likewise.
* config/epiphany/epiphany.h: Likewise.
* config/frv/frv.h: Likewise.
* config/ia64/ia64.h: Likewise.
* config/iq2000/iq2000.h: Likewise.
* config/lm32/lm32.h: Likewise.
* config/m32r/m32r.h: Likewise.
* config/mcore/mcore.h: Likewise.
* config/mep/mep.h: Likewise.
* config/microblaze/microblaze.h: Likewise.
* config/mips/mips.h: Likewise.
* config/mmix/mmix.h: Likewise.
* config/mn10300/mn10300.h: Likewise.
* config/nds32/nds32.h: Likewise.
* config/nios2/nios2.h: Likewise.
* config/pa/pa.h: Likewise.
* config/rl78/rl78.h: Likewise.
* config/sh/sh.h: Likewise.
* config/sparc/sparc.h: Likewise.
* config/stormy16/stormy16.h (enum reg_class): Likewise.
* config/tilegx/tilegx.h: Likewise.
* config/tilepro/tilepro.h: Likewise.
* config/v850/v850.h: Likewise.
* config/xtensa/xtensa.h: Likewise.
* doc/tm.texi: Regenerate.
* doc/tm.texi.in: Adjust.
* combine.c (simplify_set): Likewise.
(simplify_comparison): Likewise.
* expr.c (store_constructor): Likewise.
* internal-fn.c (expand_arith_overflow): Likewise.
* reload.c (push_reload): Likewise.
(find_reloads): Likewise.
(find_reloads_subreg_address): Likewise.
* reload1.c (eliminate_regs_1): Likewise.
* rtlanal.c (nonzero_bits1): Likewise.
(num_sign_bit_copies1): Likewise.
* simplify-rtx.c (simplify_truncation): Likewise.
---
 gcc/combine.c  | 14 ++
 gcc/config/alpha/alpha.h   |  2 +-
 gcc/config/arc/arc.h   |  2 +-
 gcc/config/arm/arm.h   |  2 +-
 gcc/config/bfin/bfin.h |  2 +-
 gcc/config/epiphany/epiphany.h |  2 +-
 gcc/config/frv/frv.h   |  2 +-
 gcc/config/ia64/ia64.h |  2 +-
 gcc/config/iq2000/iq2000.h |  2 +-
 gcc/config/lm32/lm32.h |  2 +-
 gcc/config/m32r/m32r.h |  2 +-
 gcc/config/mcore/mcore.h   |  4 ++--
 gcc/config/mep/mep.h   |  2 +-
 gcc/config/microblaze/microblaze.h |  2 +-
 gcc/config/mips/mips.h |  2 +-
 gcc/config/mmix/mmix.h |  2 +-
 gcc/config/mn10300/mn10300.h   |  2 +-
 gcc/config/nds32/nds32.h   |  2 +-
 gcc/config/nios2/nios2.h   |  2 +-
 gcc/config/pa/pa.h |  2 +-
 gcc/config/rl78/rl78.h |  2 +-
 gcc/config/sh/sh.h |  2 +-
 gcc/config/sparc/sparc.h   |  2 +-
 gcc/config/stormy16/stormy16.h |  2 +-
 gcc/config/tilegx/tilegx.h |  2 +-
 gcc/config/tilepro/tilepro.h   |  2 +-
 gcc/config/v850/v850.h |  2 +-
 gcc/config/xtensa/xtensa.h |  2 +-
 gcc/defaults.h |  4 
 gcc/doc/tm.texi|  2 +-
 gcc/doc/tm.texi.in |  2 +-
 gcc/expr.c |  6 +-
 gcc/internal-fn.c  | 11 +--
 gcc/reload.c   | 19 ---
 gcc/reload1.c  |  2 +-
 gcc/rtlanal.c  | 18 +++---
 gcc/simplify-rtx.c |  4 +---
 37 files changed, 63 insertions(+), 75 deletions(-)

diff --git a/gcc/combine.c b/gcc/combine.c
index 96cc3cd..0b36245 100644
--- a/gcc/combine.c
+++ b/gcc/combine.c
@@ -6719,10 +6719,9 @@ simplify_set (rtx x)
   / UNITS_PER_WORD)
  == ((GET_MODE_SIZE (GET_MODE (SUBREG_REG (src)))
   + (UNITS_PER_WORD - 1)) / UNITS_PER_WORD))
-#ifndef WORD_REGISTER_OPERATIONS
-  && (GET_MODE_SIZE (GET_MODE (src))
-   < GET_MODE_SIZE (GET_MODE (SUBREG_REG (src))))
-#endif
+  && (WORD_REGISTER_OPERATIONS
+ || (GET_MODE_SIZE (GET_MODE (src))
+ < GET_MODE_SIZE (GET_MODE (SUBREG_REG (src)))))
 #ifdef CANNOT_CHANGE_MODE_CLASS
   && ! (REG_P (dest) && REGNO (dest) < FIRST_PSEUDO_REGISTER
&& REG_CANNOT_CHANGE_MODE_P (REGNO (dest),
@@ -11417,7 +11416,7 @@ simplify_comparison (enum rtx_code code, rtx *pop0, rtx *pop1)
   /* Try a few ways of applying the same transformation to both operands.  */
   while (1)
 {
-#ifndef WORD_REGISTER_OPERATIONS
+#if !WORD_REGISTER_OPERATIONS
   /* The test below this one won't handle SIGN_EXTENDs on these machines,
 so check specially.  */
   if (code != GTU && code != GEU && code != LTU && code != LEU
@@ -12072,10 +12071,9 @@ simplify_comparison (enum rtx_code code, rtx *pop0, 
rtx *pop1

[PATCH 3/7] always define SHORT_IMMEDIATES_SIGN_EXTEND

2015-07-06 Thread tbsaunde+gcc
From: Trevor Saunders 

gcc/ChangeLog:

2015-07-06  Trevor Saunders  

* combine.c (update_rsp_from_reg_equal): Don't check if
SHORT_IMMEDIATES_SIGN_EXTEND is defined.
(reg_nonzero_bits_for_combine): Likewise.
* config/alpha/alpha.h: Define SHORT_IMMEDIATES_SIGN_EXTEND to
1.
* config/frv/frv.h: Likewise.
* config/lm32/lm32.h: Likewise.
* config/mep/mep.h: Likewise.
* config/mips/mips.h: Likewise.
* config/rs6000/rs6000.h: Likewise.
* config/sh/sh.h: Likewise.
* config/tilegx/tilegx.h (enum reg_class): Likewise.
* config/tilepro/tilepro.h: Likewise.
* defaults.h: Add default for SHORT_IMMEDIATES_SIGN_EXTEND.
* doc/tm.texi: Regenerate.
* doc/tm.texi.in: Adjust.
* rtlanal.c (nonzero_bits1): Likewise.
---
 gcc/combine.c| 21 ++---
 gcc/config/alpha/alpha.h |  2 +-
 gcc/config/frv/frv.h |  2 +-
 gcc/config/lm32/lm32.h   |  2 +-
 gcc/config/mep/mep.h |  2 +-
 gcc/config/mips/mips.h   |  2 +-
 gcc/config/rs6000/rs6000.h   |  2 +-
 gcc/config/sh/sh.h   |  2 +-
 gcc/config/tilegx/tilegx.h   |  2 +-
 gcc/config/tilepro/tilepro.h |  2 +-
 gcc/defaults.h   |  4 
 gcc/doc/tm.texi  |  2 +-
 gcc/doc/tm.texi.in   |  2 +-
 gcc/rtlanal.c|  4 +---
 14 files changed, 26 insertions(+), 25 deletions(-)

diff --git a/gcc/combine.c b/gcc/combine.c
index a02e755..6935934 100644
--- a/gcc/combine.c
+++ b/gcc/combine.c
@@ -1630,7 +1630,6 @@ setup_incoming_promotions (rtx_insn *first)
 }
 }
 
-#ifdef SHORT_IMMEDIATES_SIGN_EXTEND
 /* If MODE has a precision lower than PREC and SRC is a non-negative constant
that would appear negative in MODE, sign-extend SRC for use in nonzero_bits
because some machines (maybe most) will actually do the sign-extension and
@@ -1650,7 +1649,6 @@ sign_extend_short_imm (rtx src, machine_mode mode, unsigned int prec)
 
   return src;
 }
-#endif
 
 /* Update RSP for pseudo-register X from INSN's REG_EQUAL note (if one exists)
and SET.  */
@@ -1667,11 +1665,12 @@ update_rsp_from_reg_equal (reg_stat_type *rsp, rtx_insn *insn, const_rtx set,
   if (reg_equal_note)
 reg_equal = XEXP (reg_equal_note, 0);
 
-#ifdef SHORT_IMMEDIATES_SIGN_EXTEND
-  src = sign_extend_short_imm (src, GET_MODE (x), BITS_PER_WORD);
-  if (reg_equal)
-reg_equal = sign_extend_short_imm (reg_equal, GET_MODE (x), BITS_PER_WORD);
-#endif
+  if (SHORT_IMMEDIATES_SIGN_EXTEND)
+{
+  src = sign_extend_short_imm (src, GET_MODE (x), BITS_PER_WORD);
+  if (reg_equal)
+   reg_equal = sign_extend_short_imm (reg_equal, GET_MODE (x), 
BITS_PER_WORD);
+}
 
   /* Don't call nonzero_bits if it cannot change anything.  */
   if (rsp->nonzero_bits != ~(unsigned HOST_WIDE_INT) 0)
@@ -9818,10 +9817,10 @@ reg_nonzero_bits_for_combine (const_rtx x, machine_mode 
mode,
 
   if (tem)
 {
-#ifdef SHORT_IMMEDIATES_SIGN_EXTEND
-  tem = sign_extend_short_imm (tem, GET_MODE (x),
-  GET_MODE_PRECISION (mode));
-#endif
+  if (SHORT_IMMEDIATES_SIGN_EXTEND)
+   tem = sign_extend_short_imm (tem, GET_MODE (x),
+GET_MODE_PRECISION (mode));
+
   return tem;
 }
   else if (nonzero_sign_valid && rsp->nonzero_bits)
diff --git a/gcc/config/alpha/alpha.h b/gcc/config/alpha/alpha.h
index 8d2ab23..c39f103 100644
--- a/gcc/config/alpha/alpha.h
+++ b/gcc/config/alpha/alpha.h
@@ -897,7 +897,7 @@ do {
 \
 #define LOAD_EXTEND_OP(MODE) ((MODE) == SImode ? SIGN_EXTEND : ZERO_EXTEND)
 
 /* Define if loading short immediate values into registers sign extends.  */
-#define SHORT_IMMEDIATES_SIGN_EXTEND
+#define SHORT_IMMEDIATES_SIGN_EXTEND 1
 
 /* Value is 1 if truncating an integer of INPREC bits to OUTPREC bits
is done just by pretending it is already truncated.  */
diff --git a/gcc/config/frv/frv.h b/gcc/config/frv/frv.h
index 2d4cbdd..a96f201b 100644
--- a/gcc/config/frv/frv.h
+++ b/gcc/config/frv/frv.h
@@ -1899,7 +1899,7 @@ fprintf (STREAM, "\t.word .L%d\n", VALUE)
 #define LOAD_EXTEND_OP(MODE) SIGN_EXTEND
 
 /* Define if loading short immediate values into registers sign extends.  */
-#define SHORT_IMMEDIATES_SIGN_EXTEND
+#define SHORT_IMMEDIATES_SIGN_EXTEND 1
 
 /* The maximum number of bytes that a single instruction can move quickly from
memory to memory.  */
diff --git a/gcc/config/lm32/lm32.h b/gcc/config/lm32/lm32.h
index d284703..9872860 100644
--- a/gcc/config/lm32/lm32.h
+++ b/gcc/config/lm32/lm32.h
@@ -525,7 +525,7 @@ do {
\
 
 #define LOAD_EXTEND_OP(MODE) ZERO_EXTEND
 
-#define SHORT_IMMEDIATES_SIGN_EXTEND
+#define SHORT_IMMEDIATES_SIGN_EXTEND 1
 
 #define MOVE_MAXUNITS_PER_WORD
 #define MAX_MOVE_MAX4
diff --git a/gcc/config/mep/mep.h b/g

[PATCH 5/7] always define AUTO_INC_DEC

2015-07-06 Thread tbsaunde+gcc
From: Trevor Saunders 

gcc/ChangeLog:

2015-07-06  Trevor Saunders  

* rtl.h: Always define AUTO_INC_DEC.
* auto-inc-dec.c (pass_inc_dec::execute): Adjust.
* combine.c (combine_instructions): Likewise.
(can_combine_p): Likewise.
(try_combine): Likewise.
* emit-rtl.c (try_split): Likewise.
* loop-invariant.c (calculate_loop_reg_pressure): Likewise.
* lower-subreg.c (resolve_simple_move): Likewise.
* lra.c (update_inc_notes): Likewise.
* recog.c (asm_operand_ok): Likewise.
(constrain_operands): Likewise.
* regrename.c (scan_rtx_address): Likewise.
* reload.c (update_auto_inc_notes): Likewise.
(find_equiv_reg): Likewise.
* reload1.c (reload): Likewise.
(reload_as_needed): Likewise.
(choose_reload_regs): Likewise.
(emit_input_reload_insns): Likewise.
(delete_output_reload): Likewise.
* sched-deps.c (init_insn_reg_pressure_info): Likewise.
* valtrack.c (cleanup_auto_inc_dec): Likewise.
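
To illustrate why every use site has to move from #ifdef to #if once the macro is always defined, here is a small sketch. All macro and function names ending in _SKETCH or _sketch are hypothetical and are not the definitions in rtl.h.

```cpp
// Hypothetical feature macro that a target header might or might not define.
#define HAVE_POST_INCREMENT_SKETCH 1
// HAVE_PRE_DECREMENT_SKETCH is deliberately left undefined.

// Always define the summary macro, to 1 or 0, rather than leaving it
// undefined on targets without the feature.
#if defined (HAVE_POST_INCREMENT_SKETCH) || defined (HAVE_PRE_DECREMENT_SKETCH)
#define AUTO_INC_DEC_SKETCH 1
#else
#define AUTO_INC_DEC_SKETCH 0
#endif

// A macro that is defined, but to 0.
#define ALWAYS_ZERO_SKETCH 0

// #ifdef only asks "is it defined?", so a value of 0 still counts.
int ifdef_treats_zero_as_set_sketch ()
{
#ifdef ALWAYS_ZERO_SKETCH
  return 1;
#else
  return 0;
#endif
}

// #if evaluates the value, which is what is wanted after this change.
int if_treats_zero_as_unset_sketch ()
{
#if ALWAYS_ZERO_SKETCH
  return 1;
#else
  return 0;
#endif
}

int auto_inc_dec_enabled_sketch ()
{
#if AUTO_INC_DEC_SKETCH
  return 1;
#else
  return 0;
#endif
}
```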
---
 gcc/auto-inc-dec.c   |  6 +++---
 gcc/combine.c| 10 +-
 gcc/emit-rtl.c   |  4 ++--
 gcc/loop-invariant.c |  2 +-
 gcc/lower-subreg.c   |  4 ++--
 gcc/lra.c|  4 ++--
 gcc/recog.c  |  8 
 gcc/regrename.c  |  2 +-
 gcc/reload.c |  6 +++---
 gcc/reload1.c| 18 +-
 gcc/rtl.h|  6 --
 gcc/sched-deps.c |  2 +-
 gcc/valtrack.c   |  2 +-
 13 files changed, 38 insertions(+), 36 deletions(-)

diff --git a/gcc/auto-inc-dec.c b/gcc/auto-inc-dec.c
index df52229..dd183ee 100644
--- a/gcc/auto-inc-dec.c
+++ b/gcc/auto-inc-dec.c
@@ -123,7 +123,7 @@ along with GCC; see the file COPYING3.  If not see
   before the ref or +c if the increment was after the ref, then if we
   can do the combination but switch the pre/post bit.  */
 
-#ifdef AUTO_INC_DEC
+#if AUTO_INC_DEC
 
 enum form
 {
@@ -1477,7 +1477,7 @@ public:
   /* opt_pass methods: */
   virtual bool gate (function *)
 {
-#ifdef AUTO_INC_DEC
+#if AUTO_INC_DEC
   return (optimize > 0 && flag_auto_inc_dec);
 #else
   return false;
@@ -1492,7 +1492,7 @@ public:
 unsigned int
 pass_inc_dec::execute (function *fun ATTRIBUTE_UNUSED)
 {
-#ifdef AUTO_INC_DEC
+#if AUTO_INC_DEC
   basic_block bb;
   int max_reg = max_reg_num ();
 
diff --git a/gcc/combine.c b/gcc/combine.c
index da5c335..346bdff 100644
--- a/gcc/combine.c
+++ b/gcc/combine.c
@@ -1204,7 +1204,7 @@ combine_instructions (rtx_insn *f, unsigned int nregs)
   FOR_BB_INSNS (this_basic_block, insn)
 if (INSN_P (insn) && BLOCK_FOR_INSN (insn))
  {
-#ifdef AUTO_INC_DEC
+#if AUTO_INC_DEC
 rtx links;
 #endif
 
@@ -1215,7 +1215,7 @@ combine_instructions (rtx_insn *f, unsigned int nregs)
 insn);
record_dead_and_set_regs (insn);
 
-#ifdef AUTO_INC_DEC
+#if AUTO_INC_DEC
for (links = REG_NOTES (insn); links; links = XEXP (links, 1))
  if (REG_NOTE_KIND (links) == REG_INC)
set_nonzero_bits_and_sign_copies (XEXP (links, 0), NULL_RTX,
@@ -1798,7 +1798,7 @@ can_combine_p (rtx_insn *insn, rtx_insn *i3, rtx_insn 
*pred ATTRIBUTE_UNUSED,
   const_rtx set = 0;
   rtx src, dest;
   rtx_insn *p;
-#ifdef AUTO_INC_DEC
+#if AUTO_INC_DEC
   rtx link;
 #endif
   bool all_adjacent = true;
@@ -2079,7 +2079,7 @@ can_combine_p (rtx_insn *insn, rtx_insn *i3, rtx_insn 
*pred ATTRIBUTE_UNUSED,
  Also insist that I3 not be a jump; if it were one
  and the incremented register were spilled, we would lose.  */
 
-#ifdef AUTO_INC_DEC
+#if AUTO_INC_DEC
   for (link = REG_NOTES (insn); link; link = XEXP (link, 1))
 if (REG_NOTE_KIND (link) == REG_INC
&& (JUMP_P (i3)
@@ -3045,7 +3045,7 @@ try_combine (rtx_insn *i3, rtx_insn *i2, rtx_insn *i1, 
rtx_insn *i0,
|| GET_CODE (XEXP (SET_DEST (PATTERN (i3)), 0)) == POST_DEC)))
 /* It's not the exception.  */
 #endif
-#ifdef AUTO_INC_DEC
+#if AUTO_INC_DEC
 {
   rtx link;
   for (link = REG_NOTES (i3); link; link = XEXP (link, 1))
diff --git a/gcc/emit-rtl.c b/gcc/emit-rtl.c
index 80c0adb..eb44066 100644
--- a/gcc/emit-rtl.c
+++ b/gcc/emit-rtl.c
@@ -3594,7 +3594,7 @@ prev_cc0_setter (rtx_insn *insn)
   return insn;
 }
 
-#ifdef AUTO_INC_DEC
+#if AUTO_INC_DEC
 /* Find a RTX_AUTOINC class rtx which matches DATA.  */
 
 static int
@@ -3782,7 +3782,7 @@ try_split (rtx pat, rtx_insn *trial, int last)
}
  break;
 
-#ifdef AUTO_INC_DEC
+#if AUTO_INC_DEC
case REG_INC:
  for (insn = insn_last; insn != NULL_RTX; insn = PREV_INSN (insn))
{
diff --git a/gcc/loop-invariant.c b/gcc/loop-invariant.c
index d3a7439..1285c66 100644
--- a/gcc/loop-invariant.c
+++ b/gcc/loop-invariant.c
@@ -1998,7 +1998,7 @@ calculate_loop_reg_pressure (void)
 
  note_stores (PATTERN (insn), mark_reg_store, NULL);
 
-#ifdef AUTO_INC_DEC
+#if AUTO_INC_DEC
  for (link = REG_NO

[PATCH 6/7] reduce conditional compilation based on AUTO_INC_DEC

2015-07-06 Thread tbsaunde+gcc
From: Trevor Saunders 

gcc/ChangeLog:

2015-07-06  Trevor Saunders  

* auto-inc-dec.c (pass_inc_dec::execute): Don't check the value
of AUTO_INC_DEC with the preprocessor.
* combine.c (combine_instructions): Likewise.
(can_combine_p): Likewise.
(try_combine): Likewise.
* emit-rtl.c (try_split): Likewise.
* loop-invariant.c (calculate_loop_reg_pressure): Likewise.
* lower-subreg.c (resolve_simple_move): Likewise.
* lra.c (update_inc_notes): Likewise.
* recog.c (asm_operand_ok): Likewise.
(constrain_operands): Likewise.
* regrename.c (scan_rtx_address): Likewise.
* reload.c (update_auto_inc_notes): Likewise.
(reg_inc_found_and_valid_p): Likewise.
* reload1.c (reload): Likewise.
(emit_input_reload_insns): Likewise.
(delete_output_reload): Likewise.
* sched-deps.c (init_insn_reg_pressure_info): Likewise.
* valtrack.c (cleanup_auto_inc_dec): Likewise.
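
The shape of the change can be sketched as follows. This is an illustration under assumptions, not the pass code itself: AUTO_INC_DEC_SKETCH, gate_sketch and gate_for_this_target are hypothetical names, and a template parameter stands in for the macro so that both configurations can be exercised in one program.

```cpp
// Stand-in for a target macro that is always defined to 0 or 1.
#ifndef AUTO_INC_DEC_SKETCH
#define AUTO_INC_DEC_SKETCH 0
#endif

// The template parameter plays the role of the macro.
template <bool AutoIncDec>
bool gate_sketch (int optimize, bool flag_auto_inc_dec)
{
  if (!AutoIncDec)
    return false;

  // Still parsed and type-checked when AutoIncDec is false; the
  // optimizer can discard it as unreachable.
  return optimize > 0 && flag_auto_inc_dec;
}

// What a use site keyed on the macro would look like.
bool gate_for_this_target (int optimize, bool flag_auto_inc_dec)
{
  return gate_sketch<(AUTO_INC_DEC_SKETCH != 0)> (optimize, flag_auto_inc_dec);
}
```

The benefit over #if is that code for the disabled configuration keeps being compiled on every host, so it cannot silently bit-rot.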
---
 gcc/auto-inc-dec.c   | 16 +---
 gcc/combine.c| 73 
 gcc/emit-rtl.c   |  7 +++--
 gcc/loop-invariant.c | 10 +++
 gcc/lower-subreg.c   | 30 ++---
 gcc/lra.c|  8 +++---
 gcc/recog.c  | 13 +++---
 gcc/regrename.c  |  4 +--
 gcc/reload.c | 16 +---
 gcc/reload1.c| 20 +-
 gcc/sched-deps.c |  9 +++
 gcc/valtrack.c   |  8 +++---
 12 files changed, 89 insertions(+), 125 deletions(-)

diff --git a/gcc/auto-inc-dec.c b/gcc/auto-inc-dec.c
index dd183ee..831622b 100644
--- a/gcc/auto-inc-dec.c
+++ b/gcc/auto-inc-dec.c
@@ -123,7 +123,6 @@ along with GCC; see the file COPYING3.  If not see
   before the ref or +c if the increment was after the ref, then if we
   can do the combination but switch the pre/post bit.  */
 
-#if AUTO_INC_DEC
 
 enum form
 {
@@ -1448,8 +1447,6 @@ merge_in_block (int max_reg, basic_block bb)
 }
 }
 
-#endif
-
 /* Discover auto-inc auto-dec instructions.  */
 
 namespace {
@@ -1477,11 +1474,10 @@ public:
   /* opt_pass methods: */
   virtual bool gate (function *)
 {
-#if AUTO_INC_DEC
+  if (!AUTO_INC_DEC)
+   return false;
+
   return (optimize > 0 && flag_auto_inc_dec);
-#else
-  return false;
-#endif
 }
 
 
@@ -1492,7 +1488,9 @@ public:
 unsigned int
 pass_inc_dec::execute (function *fun ATTRIBUTE_UNUSED)
 {
-#if AUTO_INC_DEC
+  if (!AUTO_INC_DEC)
+return 0;
+
   basic_block bb;
   int max_reg = max_reg_num ();
 
@@ -1515,7 +1513,7 @@ pass_inc_dec::execute (function *fun ATTRIBUTE_UNUSED)
   free (reg_next_def);
 
   mem_tmp = NULL;
-#endif
+
   return 0;
 }
 
diff --git a/gcc/combine.c b/gcc/combine.c
index 346bdff..96cc3cd 100644
--- a/gcc/combine.c
+++ b/gcc/combine.c
@@ -1204,9 +1204,7 @@ combine_instructions (rtx_insn *f, unsigned int nregs)
   FOR_BB_INSNS (this_basic_block, insn)
 if (INSN_P (insn) && BLOCK_FOR_INSN (insn))
  {
-#if AUTO_INC_DEC
 rtx links;
-#endif
 
 subst_low_luid = DF_INSN_LUID (insn);
 subst_insn = insn;
@@ -1215,12 +1213,11 @@ combine_instructions (rtx_insn *f, unsigned int nregs)
 insn);
record_dead_and_set_regs (insn);
 
-#if AUTO_INC_DEC
-   for (links = REG_NOTES (insn); links; links = XEXP (links, 1))
- if (REG_NOTE_KIND (links) == REG_INC)
-   set_nonzero_bits_and_sign_copies (XEXP (links, 0), NULL_RTX,
- insn);
-#endif
+   if (AUTO_INC_DEC)
+ for (links = REG_NOTES (insn); links; links = XEXP (links, 1))
+   if (REG_NOTE_KIND (links) == REG_INC)
+ set_nonzero_bits_and_sign_copies (XEXP (links, 0), NULL_RTX,
+   insn);
 
/* Record the current insn_rtx_cost of this instruction.  */
if (NONJUMP_INSN_P (insn))
@@ -1798,9 +1795,7 @@ can_combine_p (rtx_insn *insn, rtx_insn *i3, rtx_insn 
*pred ATTRIBUTE_UNUSED,
   const_rtx set = 0;
   rtx src, dest;
   rtx_insn *p;
-#if AUTO_INC_DEC
   rtx link;
-#endif
   bool all_adjacent = true;
   int (*is_volatile_p) (const_rtx);
 
@@ -2079,22 +2074,21 @@ can_combine_p (rtx_insn *insn, rtx_insn *i3, rtx_insn 
*pred ATTRIBUTE_UNUSED,
  Also insist that I3 not be a jump; if it were one
  and the incremented register were spilled, we would lose.  */
 
-#if AUTO_INC_DEC
-  for (link = REG_NOTES (insn); link; link = XEXP (link, 1))
-if (REG_NOTE_KIND (link) == REG_INC
-   && (JUMP_P (i3)
-   || reg_used_between_p (XEXP (link, 0), insn, i3)
-   || (pred != NULL_RTX
-   && reg_overlap_mentioned_p (XEXP (link, 0), PATTERN (pred)))
-   || (pred2 != NULL_RTX
-   && reg_overlap_mentioned_p (XEXP (link, 0), PATTERN (pred2)))
-   || (succ != NULL_RTX
-   && reg_overlap_menti

[PATCH 4/7] use #if for HARD_FRAME_POINTER_IS_FRAME_POINTER less

2015-07-06 Thread tbsaunde+gcc
From: Trevor Saunders 

gcc/ChangeLog:

2015-07-06  Trevor Saunders  

* combine.c (can_combine_def_p): Don't check the value of
HARD_FRAME_POINTER_IS_FRAME_POINTER with the preprocessor.
(combinable_i3pat): Likewise.
(mark_used_regs_combine): Likewise.
* regrename.c (rename_chains): Likewise.
* reload.c (find_reloads_address): Likewise.
* sel-sched.c (mark_unavailable_hard_regs): Likewise.
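
When an #if/#else pair is folded into a single boolean expression, each arm needs its own guard: the #if arm is guarded by the negated macro and the #else arm by the macro itself. A minimal sketch that checks the two forms agree, with the macro modelled as a plain bool parameter (old_form, new_form and forms_agree are hypothetical names):

```cpp
// Model of the preprocessor form:
//   #if !HARD_FRAME_POINTER_IS_FRAME_POINTER ... #else ... #endif
bool old_form (bool hfp_is_fp, bool frame_pointer_needed,
               int reg, int hard_fp_regnum, int fp_regnum)
{
  if (!hfp_is_fp)
    return frame_pointer_needed && reg == hard_fp_regnum;
  else
    return frame_pointer_needed && reg == fp_regnum;
}

// Model of the single-expression form with one guard per arm.
bool new_form (bool hfp_is_fp, bool frame_pointer_needed,
               int reg, int hard_fp_regnum, int fp_regnum)
{
  return (!hfp_is_fp && frame_pointer_needed && reg == hard_fp_regnum)
         || (hfp_is_fp && frame_pointer_needed && reg == fp_regnum);
}

// Exhaustively compare both forms over a small range of register numbers.
bool forms_agree ()
{
  const bool bools[] = { false, true };
  for (bool hfp_is_fp : bools)
    for (bool needed : bools)
      for (int reg = 0; reg < 4; ++reg)
        for (int hard_fp = 0; hard_fp < 4; ++hard_fp)
          for (int fp = 0; fp < 4; ++fp)
            if (old_form (hfp_is_fp, needed, reg, hard_fp, fp)
                != new_form (hfp_is_fp, needed, reg, hard_fp, fp))
              return false;
  return true;
}
```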
---
 gcc/combine.c   | 14 +-
 gcc/regrename.c | 10 --
 gcc/reload.c|  5 ++---
 gcc/sel-sched.c |  9 -
 4 files changed, 15 insertions(+), 23 deletions(-)

diff --git a/gcc/combine.c b/gcc/combine.c
index 6935934..da5c335 100644
--- a/gcc/combine.c
+++ b/gcc/combine.c
@@ -1011,10 +1011,9 @@ can_combine_def_p (df_ref def)
   /* Do not combine frame pointer adjustments.  */
   if ((regno == FRAME_POINTER_REGNUM
&& (!reload_completed || frame_pointer_needed))
-#if !HARD_FRAME_POINTER_IS_FRAME_POINTER
-  || (regno == HARD_FRAME_POINTER_REGNUM
+  || (!HARD_FRAME_POINTER_IS_FRAME_POINTER
+ && regno == HARD_FRAME_POINTER_REGNUM
  && (!reload_completed || frame_pointer_needed))
-#endif
   || (FRAME_POINTER_REGNUM != ARG_POINTER_REGNUM
  && regno == ARG_POINTER_REGNUM && fixed_regs[regno]))
 return false;
@@ -2227,9 +2226,7 @@ combinable_i3pat (rtx_insn *i3, rtx *loc, rtx i2dest, rtx 
i1dest, rtx i0dest,
  && REG_P (subdest)
  && reg_referenced_p (subdest, PATTERN (i3))
  && REGNO (subdest) != FRAME_POINTER_REGNUM
-#if !HARD_FRAME_POINTER_IS_FRAME_POINTER
- && REGNO (subdest) != HARD_FRAME_POINTER_REGNUM
-#endif
+ && (HARD_FRAME_POINTER_IS_FRAME_POINTER || REGNO (subdest) != 
HARD_FRAME_POINTER_REGNUM)
  && (FRAME_POINTER_REGNUM == ARG_POINTER_REGNUM
  || (REGNO (subdest) != ARG_POINTER_REGNUM
  || ! fixed_regs [REGNO (subdest)]))
@@ -13316,9 +13313,8 @@ mark_used_regs_combine (rtx x)
{
  /* None of this applies to the stack, frame or arg pointers.  */
  if (regno == STACK_POINTER_REGNUM
-#if !HARD_FRAME_POINTER_IS_FRAME_POINTER
- || regno == HARD_FRAME_POINTER_REGNUM
-#endif
+ || (!HARD_FRAME_POINTER_IS_FRAME_POINTER
+ && regno == HARD_FRAME_POINTER_REGNUM)
  || (FRAME_POINTER_REGNUM != ARG_POINTER_REGNUM
  && regno == ARG_POINTER_REGNUM && fixed_regs[regno])
  || regno == FRAME_POINTER_REGNUM)
diff --git a/gcc/regrename.c b/gcc/regrename.c
index 6c7d650..2e08669 100644
--- a/gcc/regrename.c
+++ b/gcc/regrename.c
@@ -447,12 +447,10 @@ rename_chains (void)
continue;
 
   if (fixed_regs[reg] || global_regs[reg]
-#if !HARD_FRAME_POINTER_IS_FRAME_POINTER
- || (frame_pointer_needed && reg == HARD_FRAME_POINTER_REGNUM)
-#else
- || (frame_pointer_needed && reg == FRAME_POINTER_REGNUM)
-#endif
- )
+ || (!HARD_FRAME_POINTER_IS_FRAME_POINTER && frame_pointer_needed
+ && reg == HARD_FRAME_POINTER_REGNUM)
+ || (HARD_FRAME_POINTER_IS_FRAME_POINTER && frame_pointer_needed
+ && reg == FRAME_POINTER_REGNUM))
continue;
 
   COPY_HARD_REG_SET (this_unavailable, unavailable);
diff --git a/gcc/reload.c b/gcc/reload.c
index 1dc04bf..4bc996f 100644
--- a/gcc/reload.c
+++ b/gcc/reload.c
@@ -5176,9 +5176,8 @@ find_reloads_address (machine_mode mode, rtx *memrefloc, 
rtx ad,
   if ((regno_ok_for_base_p (REGNO (operand), mode, as, inner_code,
GET_CODE (addend))
   || operand == frame_pointer_rtx
-#if !HARD_FRAME_POINTER_IS_FRAME_POINTER
-  || operand == hard_frame_pointer_rtx
-#endif
+  || (!HARD_FRAME_POINTER_IS_FRAME_POINTER
+  && operand == hard_frame_pointer_rtx)
   || (FRAME_POINTER_REGNUM != ARG_POINTER_REGNUM
   && operand == arg_pointer_rtx)
   || operand == stack_pointer_rtx)
diff --git a/gcc/sel-sched.c b/gcc/sel-sched.c
index be5d1d1..3f7d78b 100644
--- a/gcc/sel-sched.c
+++ b/gcc/sel-sched.c
@@ -1194,11 +1194,10 @@ mark_unavailable_hard_regs (def_t def, struct 
reg_rename *reg_rename_p,
  frame pointer, or we could not discover its class.  */
   if (fixed_regs[regno]
   || global_regs[regno]
-#if !HARD_FRAME_POINTER_IS_FRAME_POINTER
-  || (frame_pointer_needed && regno == HARD_FRAME_POINTER_REGNUM)
-#else
-  || (frame_pointer_needed && regno == FRAME_POINTER_REGNUM)
-#endif
+  || (!HARD_FRAME_POINTER_IS_FRAME_POINTER && frame_pointer_needed
+ && regno == HARD_FRAME_POINTER_REGNUM)
+  || (HARD_FRAME_POINTER_IS_FRAME_POINTER && frame_pointer_needed
+ && regno == FRAME_POINTER_REGNUM)
   || (reload_completed && cl == NO_REGS))
 {
   SET_HARD_REG_SET (reg_rename_p->unavailable_hard_regs);
-- 
2.4.0



[PATCH] Fix PR66772

2015-07-06 Thread Richard Biener

In this PR we hit the issue that when CCP faces a conditional which
is (still) undefined it will consider all outgoing edges executable.
Eventually we'll reverse that decision (but without reflecting that
in the lattice), but it's too late then if a PHI node merging
the edges has been evaluated to a copy defined in the not executable
path.

Thus the following patch makes sure we never even start with such
a case (SSA propagator iteration order hopefully makes sure
this doesn't pessimize things in non-loop regions - I have a patch
to fix some iteration order oddities of it).
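
A toy model of the idea, as a sketch only: it is not the tree-ssa-ccp.c code, it ignores constants, masks and default definitions, and every name in it (LatticeKind, Value, PhiArg, meet, visit_phi, the dominance callback) is made up for illustration. The point is that a PHI result that is a copy is only kept when no incoming edge was skipped as non-executable, or when the copied name's definition dominates the PHI.

```cpp
#include <functional>
#include <vector>

enum class LatticeKind { Undefined, Copy, Varying };

// For Copy, ssa_name identifies the SSA name being copied.
struct Value { LatticeKind kind; int ssa_name; };

struct PhiArg { bool edge_executable; Value value; };

// Meet of two lattice values in this toy lattice.
Value meet (Value a, Value b)
{
  if (a.kind == LatticeKind::Undefined)
    return b;
  if (b.kind == LatticeKind::Undefined)
    return a;
  if (a.kind == LatticeKind::Copy && b.kind == LatticeKind::Copy
      && a.ssa_name == b.ssa_name)
    return a;
  return { LatticeKind::Varying, -1 };
}

// def_dominates_phi (name) says whether the definition of NAME dominates
// the block containing the PHI.
Value visit_phi (const std::vector<PhiArg> &args,
                 const std::function<bool (int)> &def_dominates_phi)
{
  Value result { LatticeKind::Undefined, -1 };
  bool non_exec_edge = false;
  for (const PhiArg &arg : args)
    {
      if (!arg.edge_executable)
        {
          non_exec_edge = true;
          continue;
        }
      result = meet (result, arg.value);
      if (result.kind == LatticeKind::Varying)
        break;
    }

  // A copy is only safe to use at the PHI if its definition is available
  // there; otherwise give up and go to VARYING.
  if (non_exec_edge
      && result.kind == LatticeKind::Copy
      && !def_dominates_phi (result.ssa_name))
    result = { LatticeKind::Varying, -1 };
  return result;
}
```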

Bootstrap and regtest running on x86_64-unknown-linux-gnu.

Richard.

2015-07-06  Richard Biener  

PR tree-optimization/66772
* tree-ssa-ccp.c (ccp_visit_phi_node): Make sure that copy
values are available in the PHI node BB when there are
still unexecutable edges.

* gcc.dg/torture/pr66772-1.c: New testcase.
* gcc.dg/torture/pr66772-2.c: Likewise.

Index: gcc/tree-ssa-ccp.c
===
*** gcc/tree-ssa-ccp.c  (revision 225449)
--- gcc/tree-ssa-ccp.c  (working copy)
*** ccp_visit_phi_node (gphi *phi)
*** 1081,1086 
--- 1094,1100 
new_val.mask = 0;
  
bool first = true;
+   bool non_exec_edge = false;
for (i = 0; i < gimple_phi_num_args (phi); i++)
  {
/* Compute the meet operator over all the PHI arguments flowing
*** ccp_visit_phi_node (gphi *phi)
*** 1121,1126 
--- 1135,1156 
  if (new_val.lattice_val == VARYING)
break;
}
+   else
+   non_exec_edge = true;
+ }
+ 
+   /* In case there were non-executable edges and the value is a copy
+  make sure its definition dominates the PHI node.  */
+   if (non_exec_edge
+   && new_val.lattice_val == CONSTANT
+   && TREE_CODE (new_val.value) == SSA_NAME
+   && ! SSA_NAME_IS_DEFAULT_DEF (new_val.value)
+   && ! dominated_by_p (CDI_DOMINATORS, gimple_bb (phi),
+  gimple_bb (SSA_NAME_DEF_STMT (new_val.value))))
+ {
+   new_val.lattice_val = VARYING;
+   new_val.value = NULL_TREE;
+   new_val.mask = -1;
  }
  
if (dump_file && (dump_flags & TDF_DETAILS))
Index: gcc/testsuite/gcc.dg/torture/pr66733-1.c
===
*** gcc/testsuite/gcc.dg/torture/pr66733-1.c(revision 0)
--- gcc/testsuite/gcc.dg/torture/pr66733-1.c(working copy)
***
*** 0 
--- 1,28 
+ /* { dg-do compile } */
+ 
+ int a;
+ 
+ int
+ fn1 ()
+ {
+   return 1;
+ }
+ 
+ void
+ fn2 ()
+ {
+   int b, j;
+   for (;;)
+ {
+   int c = 1;
+   if (j)
+   {
+ if (c)
+   break;
+   }
+   else
+   b = a;
+   fn1 () && b;
+   j = fn1 ();
+ }
+ }
Index: gcc/testsuite/gcc.dg/torture/pr66733-2.c
===
*** gcc/testsuite/gcc.dg/torture/pr66733-2.c(revision 0)
--- gcc/testsuite/gcc.dg/torture/pr66733-2.c(working copy)
***
*** 0 
--- 1,46 
+ /* { dg-do compile } */
+ 
+ int a, b, c, e, f;
+ 
+ void fn1 (int p) { }
+ 
+ int
+ fn2 (int p)
+ {
+   return a ? p % a : 0; 
+ }
+ 
+ short
+ fn3 (int p)
+ {
+   return (1 >> p) < 1 ? 1 : p;
+ }
+ 
+ int
+ fn4 ()
+ {
+   int g = 0, h = 1;
+   if (b)
+ goto lbl;
+   fn2 (0);
+   if (fn3 (1))
+ fn1 (e && c);
+   if (h)
+ {
+   int i = 1;
+ lbl:
+   if (i)
+   return 0;
+   for (; g < 1; g++)
+   ;
+ }
+   for (;;)
+ f || g > 0;
+ }
+ 
+ int
+ main ()
+ {
+   fn4 (); 
+   return 0;
+ }


RE: [PATCH] MIPS: fix failing branch range checks for micromips

2015-07-06 Thread Andrew Bennett
> There is a follow-up patch that I will be working on that will correctly
> update the other
> branch tests to correctly test out of range branch behaviour for
> micromips.  Currently these
> are passing because the mips branch range offset is large enough.  These
> offsets will
> need to be reduced for micromips to verify the compiler is calculating branch
> ranges correctly.

The following patch and ChangeLog adds out-of-range branch checks for micromips.
It also adds micromips versions of the branch range run tests (branch-14.c and
branch-15.c).

I have tested this on the mips-mti-elf target using
mips32r2/{-mno-micromips/-mmicromips} test options and there are no
new regressions.

Ok to commit?

Many thanks,


Andrew


testsuite/
* gcc.target/mips/branch-3.c: Add -mno-micromips to dg-options.
* gcc.target/mips/branch-5.c: Ditto.
* gcc.target/mips/branch-7.c: Ditto.
* gcc.target/mips/branch-9.c: Ditto.
* gcc.target/mips/branch-11.c: Ditto.
* gcc.target/mips/branch-13.c: Ditto.
* gcc.target/mips/branch-14.c: Ditto.
* gcc.target/mips/branch-15.c: Ditto.
* gcc.target/mips/branch-umips-3.c: New file.
* gcc.target/mips/branch-umips-5.c: New file.
* gcc.target/mips/branch-umips-7.c: New file.
* gcc.target/mips/branch-umips-9.c: New file.
* gcc.target/mips/branch-umips-11.c: New file.
* gcc.target/mips/branch-umips-13.c: New file.
* gcc.target/mips/branch-umips-14.c: New file.
* gcc.target/mips/branch-umips-15.c: New file.
* gcc.target/mips/branch-helper.h (OCCUPY_0x1): New define.



diff --git a/gcc/testsuite/gcc.target/mips/branch-11.c 
b/gcc/testsuite/gcc.target/mips/branch-11.c
index 962eb1b..c33686a 100644
--- a/gcc/testsuite/gcc.target/mips/branch-11.c
+++ b/gcc/testsuite/gcc.target/mips/branch-11.c
@@ -1,4 +1,4 @@
-/* { dg-options "-mshared -mabi=n32" } */
+/* { dg-options "-mshared -mabi=n32 -mno-micromips" } */
 /* { dg-final { scan-assembler "\tsd\t\\\$28," } } */
 /* { dg-final { scan-assembler "\tld\t\\\$28," } } */
 /* { dg-final { scan-assembler 
"\taddiu\t\\\$28,\\\$28,%lo\\(%neg\\(%gp_rel\\(foo\\)\\)\\)\n" } } */
diff --git a/gcc/testsuite/gcc.target/mips/branch-13.c 
b/gcc/testsuite/gcc.target/mips/branch-13.c
index 8a6fb04..4da4a37 100644
--- a/gcc/testsuite/gcc.target/mips/branch-13.c
+++ b/gcc/testsuite/gcc.target/mips/branch-13.c
@@ -1,4 +1,4 @@
-/* { dg-options "-mshared -mabi=64" } */
+/* { dg-options "-mshared -mabi=64 -mno-micromips" } */
 /* { dg-final { scan-assembler "\tsd\t\\\$28," } } */
 /* { dg-final { scan-assembler "\tld\t\\\$28," } } */
 /* { dg-final { scan-assembler 
"\tdaddiu\t\\\$28,\\\$28,%lo\\(%neg\\(%gp_rel\\(foo\\)\\)\\)\n" } } */
diff --git a/gcc/testsuite/gcc.target/mips/branch-14.c 
b/gcc/testsuite/gcc.target/mips/branch-14.c
index 026417e..5193808 100644
--- a/gcc/testsuite/gcc.target/mips/branch-14.c
+++ b/gcc/testsuite/gcc.target/mips/branch-14.c
@@ -1,4 +1,5 @@
 /* An executable version of branch-2.c.  */
+/* { dg-options "-mno-micromips" } */
 /* { dg-do run } */
 
 #include "branch-helper.h"
diff --git a/gcc/testsuite/gcc.target/mips/branch-15.c 
b/gcc/testsuite/gcc.target/mips/branch-15.c
index dee7a05..a28de9a 100644
--- a/gcc/testsuite/gcc.target/mips/branch-15.c
+++ b/gcc/testsuite/gcc.target/mips/branch-15.c
@@ -1,4 +1,5 @@
 /* An executable version of branch-3.c.  */
+/* { dg-options "-mno-micromips" } */
 /* { dg-do run } */
 
 #include "branch-helper.h"
diff --git a/gcc/testsuite/gcc.target/mips/branch-3.c 
b/gcc/testsuite/gcc.target/mips/branch-3.c
index 5fcfece..1790cbc 100644
--- a/gcc/testsuite/gcc.target/mips/branch-3.c
+++ b/gcc/testsuite/gcc.target/mips/branch-3.c
@@ -1,4 +1,4 @@
-/* { dg-options "-mshared -mabi=32" } */
+/* { dg-options "-mshared -mabi=32 -mno-micromips" } */
 /* { dg-final { scan-assembler "\t\\.cpload\t\\\$25\n" } } */
 /* { dg-final { scan-assembler "\tjr\t\\\$1\n" } } */
 /* { dg-final { scan-assembler-not "\\.cprestore" } } */
diff --git a/gcc/testsuite/gcc.target/mips/branch-5.c 
b/gcc/testsuite/gcc.target/mips/branch-5.c
index 1e9c120..38dbea2 100644
--- a/gcc/testsuite/gcc.target/mips/branch-5.c
+++ b/gcc/testsuite/gcc.target/mips/branch-5.c
@@ -1,4 +1,4 @@
-/* { dg-options "-mshared -mabi=n32" } */
+/* { dg-options "-mshared -mabi=n32 -mno-micromips" } */
 /* { dg-final { scan-assembler 
"\taddiu\t\\\$3,\\\$3,%lo\\(%neg\\(%gp_rel\\(foo\\)\\)\\)\n" } } */
 /* { dg-final { scan-assembler 
"\tlw\t\\\$1,%got_page\\(\[^)\]*\\)\\(\\\$3\\)\\n" } } */
 /* { dg-final { scan-assembler "\tjr\t\\\$1\n" } } */
diff --git a/gcc/testsuite/gcc.target/mips/branch-7.c 
b/gcc/testsuite/gcc.target/mips/branch-7.c
index 8ad6808..b69a302 100644
--- a/gcc/testsuite/gcc.target/mips/branch-7.c
+++ b/gcc/testsuite/gcc.target/mips/branch-7.c
@@ -1,4 +1,4 @@

[PATCH] Make SSA propagator iteration order consistent

2015-07-06 Thread Richard Biener

The intent (as I read it) of the iteration order in ssa_propagate is
to process stmts in the following order:

 1) complete simulation of BBs from making one of their entries executable
 2) simulation of stmts fed by stmts that changed to VARYING
 3) simulation of the rest of stmts fed by stmts that changed their 
lattice value

but the current implementation fails to enforce this order because it
drains the full worklists before considering entries added to the
others by simulating a statement.  This leads to quite some extra
simulation with too optimistic values from not yet executable edges
(I just ran into this while debugging PR66733).

The current state is that of the original propagator implementation
in this area.

The patch cuts the number of visited stmts for the testcase in PR66773
from 23 to 20 (it visits PHI nodes three fewer times).
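
The intended ordering can be sketched with three toy worklists of strings. This is a model under assumptions, not the tree-ssa-propagate.c code: Worklists, SimulateHook, process_one and propagate are hypothetical names, and "simulating" an entry just records it and lets a callback queue more work. The key property is that at most one entry is taken from a lower-priority list before the higher-priority lists are rechecked.

```cpp
#include <deque>
#include <functional>
#include <string>
#include <vector>

struct Worklists
{
  std::deque<std::string> cfg_blocks;        // 1) whole basic blocks
  std::deque<std::string> varying_edges;     // 2) uses of values gone VARYING
  std::deque<std::string> interesting_edges; // 3) other changed values
};

using SimulateHook = std::function<void (const std::string &, Worklists &)>;

// Pop and simulate at most one entry; return whether one was simulated.
bool process_one (std::deque<std::string> &list, Worklists &w,
                  const SimulateHook &simulate,
                  std::vector<std::string> &trace)
{
  if (list.empty ())
    return false;
  std::string item = list.front ();
  list.pop_front ();
  trace.push_back (item);
  simulate (item, w);   // may add entries to any of the lists
  return true;
}

// Returns the order in which entries were simulated.
std::vector<std::string> propagate (Worklists w, const SimulateHook &simulate)
{
  std::vector<std::string> trace;
  for (;;)
    {
      if (process_one (w.cfg_blocks, w, simulate, trace))
        continue;
      if (process_one (w.varying_edges, w, simulate, trace))
        continue;
      if (!process_one (w.interesting_edges, w, simulate, trace))
        break;
    }
  return trace;
}
```

If simulating the first interesting entry queues a block and a varying use, both are handled before the second interesting entry, whereas draining the interesting list in one go would process them last.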

Bootstrap & regtest running on x86_64-unknown-linux-gnu.

Richard.

2015-07-06  Richard Biener  

* tree-ssa-propagate.c (add_ssa_edge): Dump what edge list we
add which use to.
(add_control_edge): Remove excessive vertical space in dumping.
(process_ssa_edge_worklist): Simulate at most one statement and
return whether we did.  Do not simulate PHIs if they are in a
BB not yet simulated.
(ssa_propagate): Adjust to always drain the BB worklist whenever
a BB is available there, likewise the VARYING edges list before
the interesting edge list.

Index: gcc/tree-ssa-propagate.c
===
*** gcc/tree-ssa-propagate.c(revision 225449)
--- gcc/tree-ssa-propagate.c(working copy)
*** add_ssa_edge (tree var, bool is_varying)
*** 281,289 
{
  gimple_set_plf (use_stmt, STMT_IN_SSA_EDGE_WORKLIST, true);
  if (is_varying)
!   varying_ssa_edges.safe_push (use_stmt);
  else
!   interesting_ssa_edges.safe_push (use_stmt);
}
  }
  }
--- 281,303 
{
  gimple_set_plf (use_stmt, STMT_IN_SSA_EDGE_WORKLIST, true);
  if (is_varying)
!   {
! if (dump_file && (dump_flags & TDF_DETAILS))
!   {
! fprintf (dump_file, "varying_ssa_edges: adding SSA use in ");
! print_gimple_stmt (dump_file, use_stmt, 0, TDF_SLIM);
!   }
! varying_ssa_edges.safe_push (use_stmt);
!   }
  else
!   {
! if (dump_file && (dump_flags & TDF_DETAILS))
!   {
! fprintf (dump_file, "interesting_ssa_edges: adding SSA use in 
");
! print_gimple_stmt (dump_file, use_stmt, 0, TDF_SLIM);
!   }
! interesting_ssa_edges.safe_push (use_stmt);
!   }
}
  }
  }
*** add_control_edge (edge e)
*** 311,317 
cfg_blocks_add (bb);
  
if (dump_file && (dump_flags & TDF_DETAILS))
! fprintf (dump_file, "\nAdding Destination of edge (%d -> %d) to 
worklist\n",
e->src->index, e->dest->index);
  }
  
--- 325,331 
cfg_blocks_add (bb);
  
if (dump_file && (dump_flags & TDF_DETAILS))
! fprintf (dump_file, "Adding destination of edge (%d -> %d) to worklist\n",
e->src->index, e->dest->index);
  }
  
*** simulate_stmt (gimple stmt)
*** 414,427 
  
  /* Process an SSA edge worklist.  WORKLIST is the SSA edge worklist to
 drain.  This pops statements off the given WORKLIST and processes
!them until there are no more statements on WORKLIST.
!We take a pointer to WORKLIST because it may be reallocated when an
!SSA edge is added to it in simulate_stmt.  */
  
! static void
! process_ssa_edge_worklist (vec<gimple> *worklist)
  {
!   /* Drain the entire worklist.  */
while (worklist->length () > 0)
  {
basic_block bb;
--- 428,442 
  
  /* Process an SSA edge worklist.  WORKLIST is the SSA edge worklist to
 drain.  This pops statements off the given WORKLIST and processes
!them until one statement was simulated or there are no more statements
!on WORKLIST.  We take a pointer to WORKLIST because it may be reallocated
!when an SSA edge is added to it in simulate_stmt.  Return true if a stmt
!was simulated.  */
  
! static bool 
! process_ssa_edge_worklist (vec<gimple> *worklist, const char *edge_list_name)
  {
!   /* Process the next entry from the worklist.  */
while (worklist->length () > 0)
  {
basic_block bb;
*** process_ssa_edge_worklist (vec *
*** 437,457 
/* STMT is no longer in a worklist.  */
gimple_set_plf (stmt, STMT_IN_SSA_EDGE_WORKLIST, false);
  
if (dump_file && (dump_flags & TDF_DETAILS))
{
! fprintf (dump_file, "\nSimulating statement (from ssa_edges): ");
  print_gimple_stmt (dump_file, stmt, 0, dump_flags);
}
  
!   bb = gimple_bb (stmt);
  
!   /* PHI nodes are always visited, regardless of whethe

Re: [PATCH] config/bfin/bfin.c (hwloop_optimize): Use return false instead of gcc_assert for checking jump_insn.

2015-07-06 Thread Bernd Schmidt

On 07/03/2015 04:13 AM, Chen Gang wrote:

On 07/01/2015 11:27 PM, Chen Gang wrote:

On 7/1/15 21:52, Bernd Schmidt wrote:

Below is a patch. Can you test this with anything you have beyond the testsuite?



It fixes this issue (Bug 66620), keeps the insns standard, and builds
the bfin kernel with allmodconfig successfully (although the bfin kernel
members hold that allmodconfig is not a good idea for the bfin kernel).

It performs the lsetup optimization for one loop but still leaves the
other (I get the same .s as with my original fix): the second time
through hwloop_optimize, it returns false. Any additional ideas are welcome.



I shall continue to analyse why the second lsetup optimization has not
happened. I hope to finish within the next week (by 2015-07-12).


I've committed my patch after testing bfin-elf. There's no great mystery 
why the second optimization doesn't happen: the point where it thinks it 
has to insert the LSETUP is after the loop, and the instruction doesn't 
allow that. Possibly we could change that - when the loop is entered at 
the top but not through a fallthrough edge, we could make a new block 
ahead of it and put the LSETUP in there.



Bernd




[RFC 2/2] Add steady_clock support to condition_variable

2015-07-06 Thread Mike Crowe
If __gthread_cond_timedwaitonclock is available it can be used to fix
part of https://gcc.gnu.org/bugzilla/show_bug.cgi?id=41861 by supporting
std::chrono::steady_clock properly with std::condition_variable.

This means that code using std::condition_variable::wait_for or
std::condition_variable::wait_until with std::chrono::steady_clock is no
longer subject to timing out early or potentially waiting for much
longer if the system clock is changed at an inopportune moment.

If __gthread_cond_timedwaitonclock is available then
std::chrono::steady_clock is deemed to be the "best" clock available
which means that it is used for the relative wait_for calls and absolute
wait_until calls that aren't choosing to use std::chrono::system_clock.
Calls explicitly using std::chrono::system_clock continue to use
CLOCK_REALTIME.

If __gthread_cond_timedwaitonclock is not available then
std::chrono::system_clock is deemed to be the "best" clock available
which means that the previous suboptimal behaviour remains.
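
For waits on a clock other than the two handled directly, the absolute time has to be translated onto the "best" clock. A minimal sketch of that translation, assuming the approach of sampling both clocks on entry and carrying the remaining delta across; rebase_to_steady is a hypothetical helper and not part of libstdc++, and the entry times are passed in explicitly so the arithmetic can be checked deterministically.

```cpp
#include <chrono>

// Translate ATIME, expressed on Clock, onto steady_clock.
// C_ENTRY and S_ENTRY are "now" on the two clocks, sampled together.
template <typename Clock, typename Duration>
auto rebase_to_steady (const std::chrono::time_point<Clock, Duration> &atime,
                       const typename Clock::time_point &c_entry,
                       const std::chrono::steady_clock::time_point &s_entry)
{
  // Time remaining until the deadline, as measured on the caller's clock.
  const auto delta = atime - c_entry;
  // The same amount of time from "now" on the steady clock.
  return s_entry + delta;
}
```

A caller would normally pass Clock::now() and std::chrono::steady_clock::now() for the two entry times. Once the deadline is on the steady clock, later adjustments to the system clock no longer move it.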

Signed-off-by: Mike Crowe 
---
 libstdc++-v3/include/std/condition_variable| 56 ++
 .../30_threads/condition_variable/members/2.cc |  8 ++--
 2 files changed, 53 insertions(+), 11 deletions(-)

diff --git a/libstdc++-v3/include/std/condition_variable 
b/libstdc++-v3/include/std/condition_variable
index f7da017..625ecfe 100644
--- a/libstdc++-v3/include/std/condition_variable
+++ b/libstdc++-v3/include/std/condition_variable
@@ -63,7 +63,13 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   /// condition_variable
   class condition_variable
   {
-typedef chrono::system_clock   __clock_t;
+#ifdef _GTHREAD_USE_COND_TIMEDWAITONCLOCK
+typedef chrono::steady_clock   __steady_clock_t;
+typedef chrono::steady_clock   __best_clock_t;
+#else
+typedef chrono::system_clock   __best_clock_t;
+#endif
+typedef chrono::system_clock   __system_clock_t;
 typedef __gthread_cond_t   __native_type;
 
 #ifdef __GTHREAD_COND_INIT
@@ -98,10 +104,18 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
  wait(__lock);
   }
 
+#ifdef _GTHREAD_USE_COND_TIMEDWAITONCLOCK
 template<typename _Duration>
   cv_status
   wait_until(unique_lock<mutex>& __lock,
-const chrono::time_point<__clock_t, _Duration>& __atime)
+const chrono::time_point<__steady_clock_t, _Duration>& __atime)
+  { return __wait_until_impl(__lock, __atime); }
+#endif
+
+template<typename _Duration>
+  cv_status
+  wait_until(unique_lock<mutex>& __lock,
+const chrono::time_point<__system_clock_t, _Duration>& __atime)
   { return __wait_until_impl(__lock, __atime); }
 
 template<typename _Clock, typename _Duration>
@@ -109,9 +123,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   wait_until(unique_lock<mutex>& __lock,
 const chrono::time_point<_Clock, _Duration>& __atime)
   {
-   // DR 887 - Sync unknown clock to known clock.
const typename _Clock::time_point __c_entry = _Clock::now();
-   const __clock_t::time_point __s_entry = __clock_t::now();
+   const __best_clock_t::time_point __s_entry = __best_clock_t::now();
const auto __delta = __atime - __c_entry;
const auto __s_atime = __s_entry + __delta;
 
@@ -134,24 +147,47 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   cv_status
   wait_for(unique_lock& __lock,
   const chrono::duration<_Rep, _Period>& __rtime)
-  { return wait_until(__lock, __clock_t::now() + __rtime); }
+  { return wait_until(__lock, __best_clock_t::now() + __rtime); }
 
 template
   bool
   wait_for(unique_lock& __lock,
   const chrono::duration<_Rep, _Period>& __rtime,
   _Predicate __p)
-  { return wait_until(__lock, __clock_t::now() + __rtime, std::move(__p)); }
+  { return wait_until(__lock, __best_clock_t::now() + __rtime, std::move(__p)); }
 
 native_handle_type
 native_handle()
 { return &_M_cond; }
 
   private:
+#ifdef _GTHREAD_USE_COND_TIMEDWAITONCLOCK
+template
+  cv_status
+  __wait_until_impl(unique_lock& __lock,
+   const chrono::time_point<__steady_clock_t, _Dur>& __atime)
+  {
+   auto __s = chrono::time_point_cast(__atime);
+   auto __ns = chrono::duration_cast(__atime - __s);
+
+   __gthread_time_t __ts =
+ {
+   static_cast(__s.time_since_epoch().count()),
+   static_cast(__ns.count())
+ };
+
+   __gthread_cond_timedwaitonclock(&_M_cond, __lock.mutex()->native_handle(),
+   __GTHREAD_CLOCK_MONOTONIC,
+   &__ts);
+
+   return (__steady_clock_t::now() < __atime
+   ? cv_status::no_timeout : cv_status::timeout);
+  }
+#endif
 template
   cv_status
   __wait_until_impl(unique_lock& __lock,
-   const chrono::time_point<__clock_t, _Dur>& __atime)
+   const chrono::time_point<__system_clock_t, _Dur>& __atime)
   {
auto __s = chr

[Patch, fortran, pr66578, v1] [F2008] Invalid free on allocate(...,source=a(:)) in block

2015-07-06 Thread Andre Vehreschild
Hi all,

this is a proposal to patch PR 66578
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66578 . It extends work of Mikael
Morin. The patch fixes two issues:

1. a source'd allocate in a block: allocate(c, source=a(:)). The issue occurs
because, due to the new handling of source-expressions in trans_allocate(), an
array descriptor is created where previously just a plain array was used. I.e.,
GFC_DESCRIPTOR_TYPE_P (source) is true now and GFC_ARRAY_TYPE_P (source) false,
which made gfortran use the wrong bounds for the descriptor (zero-based instead
of one-based). This was fixed by Mikael's proposal.

2. two-level array addressing led to a segfault. I.e., when an array was used
in a source-expression to index another object, the offset was computed
incorrectly.
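
As a rough illustration of the bounds handling in item 1, the shift to a
one-based lower bound can be sketched as follows. The Bounds struct and
make_one_based are hypothetical names standing in for the loop.from[dim] /
loop.to[dim] pair; the arithmetic follows the MINUS_EXPR / PLUS_EXPR sequence
in the patch, but this is a sketch and not the gfortran code itself.

```cpp
// Hypothetical stand-in for the loop.from[dim] / loop.to[dim] pair.
struct Bounds { long from; long to; };

// Shift a section's bounds so the lower bound becomes 1 while the extent
// (to - from + 1) stays the same.
Bounds make_one_based(Bounds b)
{
  if (b.from != 1)
    {
      const long tmp = 1 - b.from;  // MINUS_EXPR: one - from
      b.to = b.to + tmp;            // PLUS_EXPR: to + tmp
      b.from = 1;
    }
  return b;
}
```

For example, a section running from 3 to 7 keeps its five elements and is
described as running from 1 to 5.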

Bootstraps and regtests fine on x86_64-linux-gnu/f21.

Comments welcome!

Regards,
Andre

PS: Experience shows that asking whether this is ok for trunk is useless ;-)
There is always something that could be improved. Open for suggestions.
-- 
Andre Vehreschild * Email: vehre ad gmx dot de 


pr66578_1.clog
Description: Binary data
diff --git a/gcc/fortran/trans-array.c b/gcc/fortran/trans-array.c
index fece3ab..afea5ec 100644
--- a/gcc/fortran/trans-array.c
+++ b/gcc/fortran/trans-array.c
@@ -6912,9 +6912,10 @@ gfc_conv_expr_descriptor (gfc_se *se, gfc_expr *expr)
   tree from;
   tree to;
   tree base;
-  bool onebased = false;
+  bool onebased = false, rank_remap;
 
   ndim = info->ref ? info->ref->u.ar.dimen : ss->dimen;
+  rank_remap = ss->dimen < ndim;
 
   if (se->want_coarray)
 	{
@@ -6947,6 +6948,22 @@ gfc_conv_expr_descriptor (gfc_se *se, gfc_expr *expr)
   if (expr->ts.type == BT_CHARACTER)
 	se->string_length =  gfc_get_expr_charlen (expr);
 
+  /* If we have an array section or are assigning make sure that
+	 the lower bound is 1.  References to the full
+	 array should otherwise keep the original bounds.  */
+  if ((!info->ref || info->ref->u.ar.type != AR_FULL) && !se->want_pointer)
+	for (dim = 0; dim < loop.dimen; dim++)
+	  if (!integer_onep (loop.from[dim]))
+	{
+	  tmp = fold_build2_loc (input_location, MINUS_EXPR,
+ gfc_array_index_type, gfc_index_one_node,
+ loop.from[dim]);
+	  loop.to[dim] = fold_build2_loc (input_location, PLUS_EXPR,
+	  gfc_array_index_type,
+	  loop.to[dim], tmp);
+	  loop.from[dim] = gfc_index_one_node;
+	}
+
   desc = info->descriptor;
   if (se->direct_byref && !se->byref_noassign)
 	{
@@ -7040,20 +7057,6 @@ gfc_conv_expr_descriptor (gfc_se *se, gfc_expr *expr)
 	  from = loop.from[dim];
 	  to = loop.to[dim];
 
-	  /* If we have an array section or are assigning make sure that
-	 the lower bound is 1.  References to the full
-	 array should otherwise keep the original bounds.  */
-	  if ((!info->ref
-	  || info->ref->u.ar.type != AR_FULL)
-	  && !integer_onep (from))
-	{
-	  tmp = fold_build2_loc (input_location, MINUS_EXPR,
- gfc_array_index_type, gfc_index_one_node,
- from);
-	  to = fold_build2_loc (input_location, PLUS_EXPR,
-gfc_array_index_type, to, tmp);
-	  from = gfc_index_one_node;
-	}
 	  onebased = integer_onep (from);
 	  gfc_conv_descriptor_lbound_set (&loop.pre, parm,
 	  gfc_rank_cst[dim], from);
@@ -7079,7 +7082,7 @@ gfc_conv_expr_descriptor (gfc_se *se, gfc_expr *expr)
 	{
 	  tmp = gfc_conv_array_lbound (desc, n);
 	  tmp = fold_build2_loc (input_location, MINUS_EXPR,
- TREE_TYPE (base), tmp, loop.from[dim]);
+ TREE_TYPE (base), tmp, from);
 	  tmp = fold_build2_loc (input_location, MULT_EXPR,
  TREE_TYPE (base), tmp,
  gfc_conv_array_stride (desc, n));
@@ -7114,7 +7117,19 @@ gfc_conv_expr_descriptor (gfc_se *se, gfc_expr *expr)
   /* Force the offset to be -1, when the lower bound of the highest
 	 dimension is one and the symbol is present and is not a
 	 pointer/allocatable or associated.  */
-  if (onebased && se->use_offset
+  if (((se->direct_byref || GFC_ARRAY_TYPE_P (TREE_TYPE (desc)))
+	   && !se->data_not_needed)
+	  || (se->use_offset && base != NULL_TREE))
+	{
+	  /* Set the offset depending on base.  */
+	  tmp = rank_remap && !se->direct_byref ?
+		fold_build2_loc (input_location, PLUS_EXPR,
+ gfc_array_index_type, base,
+ offset)
+	  : base;
+	  gfc_conv_descriptor_offset_set (&loop.pre, parm, tmp);
+	}
+  else if (onebased && (!rank_remap || se->use_offset)
 	  && expr->symtree
 	  && !(expr->symtree->n.sym && expr->symtree->n.sym->ts.type == BT_CLASS
 	   && !CLASS_DATA (expr->symtree->n.sym)->attr.class_pointer)
@@ -7129,11 +7144,6 @@ gfc_conv_expr_descriptor (gfc_se *se, gfc_expr *expr)
 	  tmp = gfc_conv_mpz_to_tree (minus_one, gfc_index_integer_kind);
 	  gfc_conv_descriptor_offset_set (&loop.pre, parm, tmp);
 	}
-  else if (((se->direct_byref || GFC_ARRAY_TYPE_P (TREE_TYPE (desc)))
-		&& !se->data

[Patch,microblaze]: Optimized usage of reserved stack space for function arguments.

2015-07-06 Thread Ajit Kumar Agarwal
All:

The patch below optimizes the usage of the reserved stack space for function
arguments. The stack space is reserved if the function is a libcall, takes a
variable number of arguments, takes aggregate data types, or has some
parameters passed in registers and some passed on the stack. In addition, the
stack space is not reserved if no arguments are passed. No regressions are
seen in the DejaGnu tests for microblaze.
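
The decision described above can be summarised in a few lines. This is a
hedged sketch only: CallSummary and parms_need_stack are hypothetical names,
the order of the checks is an assumption, and the real patch walks the
function type with CUMULATIVE_ARGS instead of reading precomputed flags.

```cpp
// Hypothetical summary of a call's signature; GCC itself walks the
// function type and CUMULATIVE_ARGS rather than using a struct like this.
struct CallSummary {
  bool is_libcall;
  bool is_variadic;
  int  num_args;
  bool has_aggregate_arg;
  bool has_arg_split_between_regs_and_stack;
};

// Returns true when stack space should be reserved for the arguments.
bool parms_need_stack(const CallSummary& c)
{
  if (c.is_libcall)            // libcalls always get the reserved area
    return true;
  if (c.is_variadic)           // variable number of arguments
    return true;
  if (c.num_args == 0)         // nothing is passed, nothing to reserve
    return false;
  return c.has_aggregate_arg
         || c.has_arg_split_between_regs_and_stack;
}
```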

[Patch,microblaze]: Optimized usage of reserved stack space for function 
arguments.

The changes in this patch optimize the usage of the reserved
stack space for arguments. The stack space is reserved if the
function is a libcall, takes a variable number of arguments,
takes aggregate data types, or has some parameters passed in
registers and some passed on the stack. In addition, the stack
space is not reserved if no arguments are passed.

ChangeLog:
2015-07-06  Ajit Agarwal  

* config/microblaze/microblaze.c
(microblaze_parm_needs_stack): New.
(microblaze_function_parms_need_stack): New.
(microblaze_reg_parm_stack_space): New.
* config/microblaze/microblaze.h
(REG_PARM_STACK_SPACE): Modify the macro.
* config/microblaze/microblaze-protos.h
(microblaze_reg_parm_stack_space): Declare.

Signed-off-by:Ajit Agarwal ajit...@xilinx.com

---
 gcc/config/microblaze/microblaze-protos.h |1 +
 gcc/config/microblaze/microblaze.c|  140 +
 gcc/config/microblaze/microblaze.h|2 +-
 3 files changed, 142 insertions(+), 1 deletions(-)

diff --git a/gcc/config/microblaze/microblaze-protos.h b/gcc/config/microblaze/microblaze-protos.h
index 57879b1..d27d3e1 100644
--- a/gcc/config/microblaze/microblaze-protos.h
+++ b/gcc/config/microblaze/microblaze-protos.h
@@ -56,6 +56,7 @@ extern bool microblaze_tls_referenced_p (rtx);
 extern int symbol_mentioned_p (rtx);
 extern int label_mentioned_p (rtx);
 extern bool microblaze_cannot_force_const_mem (machine_mode, rtx);
+extern int  microblaze_reg_parm_stack_space(tree fun);
 #endif  /* RTX_CODE */
 
 /* Declare functions in microblaze-c.c.  */
diff --git a/gcc/config/microblaze/microblaze.c b/gcc/config/microblaze/microblaze.c
index 566b78c..0eae4cd 100644
--- a/gcc/config/microblaze/microblaze.c
+++ b/gcc/config/microblaze/microblaze.c
@@ -3592,7 +3592,147 @@ microblaze_legitimate_constant_p (machine_mode mode ATTRIBUTE_UNUSED, rtx x)
 
   return true;
 }
+/* Heuristics and criteria for having param needs stack.  */
 
+static bool
+microblaze_parm_needs_stack (cumulative_args_t args_so_far, tree type)
+{
+  enum machine_mode mode;
+  int unsignedp;
+  rtx entry_parm;
+
+  /* Catch errors.  */
+  if (type == NULL || type == error_mark_node)
+return true;
+
+  /* Handle types with no storage requirement.  */
+  if (TYPE_MODE (type) == VOIDmode)
+return false;
+
+   /* Handle complex types.  */
+  if (TREE_CODE (type) == COMPLEX_TYPE)
+return (microblaze_parm_needs_stack (args_so_far, TREE_TYPE (type))
+ || microblaze_parm_needs_stack (args_so_far, TREE_TYPE (type)));
+
+  /* Handle transparent aggregates.  */
+  if ((TREE_CODE (type) == UNION_TYPE || TREE_CODE (type) == RECORD_TYPE)
+  && TYPE_TRANSPARENT_AGGR (type))
+type = TREE_TYPE (first_field (type));
+
+  /* See if this arg was passed by invisible reference.  */
+  if (pass_by_reference (get_cumulative_args (args_so_far),
+ TYPE_MODE (type), type, true))
+type = build_pointer_type (type);
+
+  /* Find mode as it is passed by the ABI.  */
+  unsignedp = TYPE_UNSIGNED (type);
+  mode = promote_mode (type, TYPE_MODE (type), &unsignedp);
+
+  /* If there is no incoming register, we need a stack.  */
+  entry_parm = microblaze_function_arg (args_so_far, mode, type, true);
+
+  if (entry_parm == NULL)
+return true;
+
+  /* Likewise if we need to pass both in registers and on the stack.  */
+  if (GET_CODE (entry_parm) == PARALLEL
+  && XEXP (XVECEXP (entry_parm, 0, 0), 0) == NULL_RTX)
+return true;
+
+  /* Also true if we're partially in registers and partially not.  */
+  if (function_arg_partial_bytes (args_so_far, mode, type, true) != 0)
+return true;
+
+  /* Update info on where next arg arrives in registers.  */
+  microblaze_function_arg_advance (args_so_far, mode, type, true);
+
+  return false;
+}
+
+/* Function need stack for param if
+   1. The function is a libcall.
+   2. Variable number of arguments.
+   3. If the param is aggregate data types.
+   4. If partially some param in registers and some in the stack.  */
+
+static bool
+microblaze_function_parms_need_stack (tree fun, bool incoming)
+{
+  tree fntype, result;
+  CUMULATIVE_ARGS args_so_far_v;
+  cumulative_args_t args_so_far;
+  int num_of_args = 0;
+
+  /* Must be a libcall, all of which only use reg parms.  */
+  if (!fun)
+return tru

[PATCH] Fix PR66767

2015-07-06 Thread Richard Biener

Similar to the vect_gen_niters_for_prolog_loop change I already noticed
on x86_64.

Built on ppc64-unknown-linux-gnu, applied.

Richard.

2015-07-06  Richard Biener  

PR tree-optimization/66767
* tree-vect-loop-manip.c (vect_create_cond_for_align_checks):
Make sure to build the alignment test on a SSA name without
final alignment info valid only if the alignment test
evaluates to true.

Index: gcc/tree-vect-loop-manip.c
===
--- gcc/tree-vect-loop-manip.c  (revision 225449)
+++ gcc/tree-vect-loop-manip.c  (working copy)
@@ -2143,7 +2143,7 @@ vect_create_cond_for_align_checks (loop_
   bool negative = tree_int_cst_compare
(DR_STEP (STMT_VINFO_DATA_REF (stmt_vinfo)), size_zero_node) < 0;
   tree offset = negative
-   ? size_int (-TYPE_VECTOR_SUBPARTS (vectype) + 1) : NULL_TREE;
+   ? size_int (-TYPE_VECTOR_SUBPARTS (vectype) + 1) : size_zero_node;
 
   /* create: addr_tmp = (int)(address_of_first_vector) */
   addr_base =


[Ada] Add DragonFly support to System.OS_Constants template

2015-07-06 Thread John Marino
The System.OS_Constants template for GNAT has three preprocessor checks
for FreeBSD.  In all three cases, DragonFly BSD needs to be treated the
same as FreeBSD.  The attached patch accomplishes this.

Please consider incorporating the patch into trunk.

Regards,
John
Index: gcc/ada/s-oscons-tmplt.c
===
--- gcc/ada/s-oscons-tmplt.c(revision 225453)
+++ gcc/ada/s-oscons-tmplt.c(working copy)
@@ -402,7 +402,7 @@
 
 /* ioctl(2) requests are "int" in UNIX, but "unsigned long" on FreeBSD */
 
-#ifdef __FreeBSD__
+#if defined (__FreeBSD__) || defined (__DragonFly__)
 # define CNI CNU
 # define IOCTL_Req_T "unsigned"
 #else
@@ -1014,7 +1014,7 @@
 
 */
 
-#if defined (__FreeBSD__) || defined (__linux__)
+#if defined (__FreeBSD__) || defined (__linux__) || defined (__DragonFly__)
 # define PTY_Library "-lutil"
 #else
 # define PTY_Library ""
@@ -1435,7 +1435,8 @@
 #endif
 CND(CLOCK_THREAD_CPUTIME_ID, "Thread CPU clock")
 
-#if defined(__FreeBSD__) || (defined(_AIX) && defined(_AIXVERSION_530))
+#if defined(__FreeBSD__) || (defined(_AIX) && defined(_AIXVERSION_530)) \
+ || defined(__DragonFly__)
 /** On these platforms use system provided monotonic clock instead of
  ** the default CLOCK_REALTIME. We then need to set up cond var attributes
  ** appropriately (see thread.c).


[PATCH] MIPS: For micromips allow near-far-3.c test to use the jals instruction to call near_func

2015-07-06 Thread Andrew Bennett
Hi,

The near-far-3.c test is failing for micromips because it is expecting the call
to near_func to be performed by a jal instruction, but for micromips this is
done by a jals instruction.

I have updated the expected test output to deal with this case.  I have tested
this on the mips-mti-elf target using mips32r2/{-mno-micromips/-mmicromips}
test options and there are no new regressions.

The patch and ChangeLog are below.

Ok to commit?



Many thanks,


Andrew



testsuite/
* gcc.target/mips/near-far-3.c: Allow the call to near_func to use
the jals instruction.


 
diff --git a/gcc/testsuite/gcc.target/mips/near-far-3.c b/gcc/testsuite/gcc.target/mips/near-far-3.c
index d4d48b1..ea151bc 100644
--- a/gcc/testsuite/gcc.target/mips/near-far-3.c
+++ b/gcc/testsuite/gcc.target/mips/near-far-3.c
@@ -13,5 +13,5 @@ NOMIPS16 int test4 () { return normal_func (); }
 
 /* { dg-final { scan-assembler-not "\tj\tlong_call_func\n" } } */
 /* { dg-final { scan-assembler-not "\tj\tfar_func\n" } } */
-/* { dg-final { scan-assembler "\tj(|al)\tnear_func\n" } } */
+/* { dg-final { scan-assembler "\tj(|al|als)\tnear_func\n" } } */
 /* { dg-final { scan-assembler-not "\tj\tnormal_func\n" } } */



RE: [PATCH] MIPS: For micromips allow near-far-3.c test to use the jals instruction to call near_func

2015-07-06 Thread Moore, Catherine


> -Original Message-
> From: Andrew Bennett [mailto:andrew.benn...@imgtec.com]
> Sent: Monday, July 06, 2015 9:20 AM
> To: gcc-patches@gcc.gnu.org
> Cc: Matthew Fortune; Moore, Catherine
> Subject: [PATCH] MIPS: For micromips allow near-far-3.c test to use the jals
> instruction to call near_func
> 
> Hi,
> 
> The near-far-3.c test is failing for micromips because it is expecting the
> call to near_func to be performed by a jal instruction, but for micromips
> this is done by a jals instruction.
> 
> I have updated the expected test output to deal with this case.  I have tested
> this on the mips-mti-elf target using mips32r2/{-mno-micromips/-mmicromips}
> test options and there are no new regressions.
> 
> The patch and ChangeLog are below.
> 
> Ok to commit?
> 
> testsuite/
>   * gcc.target/mips/near-far-3.c: Allow the call to near_func to use
> the jals instruction.
> 
> 

OK.


Re: [RFC] two-phase marking in gt_cleare_cache

2015-07-06 Thread Richard Biener
On Mon, Jul 6, 2015 at 10:57 AM, Tom de Vries  wrote:
> Hi,
>
> Using attached untested patch, I managed to minimize a test-case failure for
> PR 66714.
>
> The patch introduces two-phase marking in gt_cleare_cache:
> - first phase, it loops over all the hash table entries and removes
>   those which are dead
> - second phase, it runs over all the live hash table entries and marks
>   live items that are reachable from those live entries
>
> By doing so, we make the behaviour of gt_cleare_cache independent of the
> order in which the entries are visited, turning:
> - hard-to-trigger bugs which trigger for one visiting order but not for
>   another, into
> - more easily triggered bugs which trigger for any visiting order.
>
> Any comments?

I think it is only half-way correct in your proposed change.  You only
fix the issue for hashes of the same kind.  To truly fix the issue you'd
have to change generated code for gt_clear_caches () and provide
a clearing-only implementation (or pass an operation mode bool to
the core worker in hash-table.h).

Thanks,
Richard.

> Thanks,
> - Tom


[PATCH] PR target/66749: Add -march=iamcu to optimize for IA MCU

2015-07-06 Thread H.J. Lu
IA MCU is based on Intel Pentium ISA without x87 and passing parameters
in registers.  We want to optimize for IA MCU without changing existing
Pentium codegen.  This patch adds PROCESSOR_IAMCU for -march=iamcu,
which is based on -march=pentium with updated cost tables.
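
For readers not familiar with the cost tables: GCC's COSTS_N_INSNS macro
scales an instruction count by 4, which leaves room for fractional costs such
as the "COSTS_N_INSNS (1) + 1" used for lea in iamcu_cost. The sketch below
reproduces that arithmetic with a hypothetical constexpr function; the three
values are the add, lea and variable shift entries from the table in this
patch.

```cpp
// Mirrors the scaling done by GCC's COSTS_N_INSNS macro (N * 4).
constexpr int costs_n_insns(int n) { return n * 4; }

constexpr int add_cost            = costs_n_insns(1);      // one instruction
constexpr int lea_cost            = costs_n_insns(1) + 1;  // slightly more than an add
constexpr int variable_shift_cost = costs_n_insns(4);      // four instructions

// lea is costed above an add but below two full instructions.
static_assert(add_cost < lea_cost, "lea costs more than add");
static_assert(lea_cost < costs_n_insns(2), "lea costs less than two insns");
```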

OK for trunk?

Thanks.


H.J.
--
gcc/

PR target/66749
* config/i386/i386.c (iamcu_cost): New.
(m_IAMCU): Likewise.
(initial_ix86_arch_features): Disable X86_ARCH_CMOV for m_IAMCU.
(processor_target_table): Add an entry for "iamcu".
(processor_alias_table): Likewise.
(ix86_issue_rate): Handle PROCESSOR_IAMCU.
(ix86_adjust_cost): Likewise.
(ia32_multipass_dfa_lookahead): Likewise.
* config/i386/i386.h (processor_type): Add PROCESSOR_IAMCU.
* config/i386/x86-tune.def: Updated for m_IAMCU.

gcc/testsuite/

PR target/66749
* gcc.target/i386/pr66749.c: New test.
---
 gcc/config/i386/i386.c  | 76 -
 gcc/config/i386/i386.h  |  1 +
 gcc/config/i386/x86-tune.def| 36 +---
 gcc/testsuite/gcc.target/i386/pr66749.c | 14 ++
 4 files changed, 111 insertions(+), 16 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr66749.c

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 7d26e8c..98250c4 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -426,6 +426,74 @@ struct processor_costs pentium_cost = {
   1,   /* cond_not_taken_branch_cost.  */
 };
 
+static const
+struct processor_costs iamcu_cost = {
+  COSTS_N_INSNS (1),   /* cost of an add instruction */
+  COSTS_N_INSNS (1) + 1,   /* cost of a lea instruction */
+  COSTS_N_INSNS (4),   /* variable shift costs */
+  COSTS_N_INSNS (1),   /* constant shift costs */
+  {COSTS_N_INSNS (11), /* cost of starting multiply for QI */
+   COSTS_N_INSNS (11), /*   HI */
+   COSTS_N_INSNS (11), /*   SI */
+   COSTS_N_INSNS (11), /*   DI */
+   COSTS_N_INSNS (11)},/*   other */
+  0,   /* cost of multiply per each bit set */
+  {COSTS_N_INSNS (25), /* cost of a divide/mod for QI */
+   COSTS_N_INSNS (25), /*  HI */
+   COSTS_N_INSNS (25), /*  SI */
+   COSTS_N_INSNS (25), /*  DI */
+   COSTS_N_INSNS (25)},/*  other */
+  COSTS_N_INSNS (3),   /* cost of movsx */
+  COSTS_N_INSNS (2),   /* cost of movzx */
+  8,   /* "large" insn */
+  6,   /* MOVE_RATIO */
+  6,/* cost for loading QImode using movzbl */
+  {2, 4, 2},   /* cost of loading integer registers
+  in QImode, HImode and SImode.
+  Relative to reg-reg move (2).  */
+  {2, 4, 2},   /* cost of storing integer registers */
+  2,   /* cost of reg,reg fld/fst */
+  {2, 2, 6},   /* cost of loading fp registers
+  in SFmode, DFmode and XFmode */
+  {4, 4, 6},   /* cost of storing fp registers
+  in SFmode, DFmode and XFmode */
+  8,   /* cost of moving MMX register */
+  {8, 8},  /* cost of loading MMX registers
+  in SImode and DImode */
+  {8, 8},  /* cost of storing MMX registers
+  in SImode and DImode */
+  2,   /* cost of moving SSE register */
+  {4, 8, 16},  /* cost of loading SSE registers
+  in SImode, DImode and TImode */
+  {4, 8, 16},  /* cost of storing SSE registers
+  in SImode, DImode and TImode */
+  3,   /* MMX or SSE register to integer */
+  8,   /* size of l1 cache.  */
+  8,   /* size of l2 cache  */
+  0,   /* size of prefetch block */
+  0,   /* number of parallel prefetches */
+  2,   /* Branch cost */
+  COSTS_N_INSNS (3),   /* cost of FADD and FSUB insns.  */
+  COSTS_N_I

Re: [RFC] two-phase marking in gt_cleare_cache

2015-07-06 Thread Richard Biener
On Mon, Jul 6, 2015 at 3:25 PM, Richard Biener
 wrote:
> On Mon, Jul 6, 2015 at 10:57 AM, Tom de Vries  wrote:
>> Hi,
>>
>> Using attached untested patch, I managed to minimize a test-case failure for
>> PR 66714.
>>
>> The patch introduces two-phase marking in gt_cleare_cache:
>> - first phase, it loops over all the hash table entries and removes
>>   those which are dead
>> - second phase, it runs over all the live hash table entries and marks
>>   live items that are reachable from those live entries
>>
>> By doing so, we make the behaviour of gt_cleare_cache independent of the
>> order in which the entries are visited, turning:
>> - hard-to-trigger bugs which trigger for one visiting order but not for
>>   another, into
>> - more easily triggered bugs which trigger for any visiting order.
>>
>> Any comments?
>
> I think it is only half-way correct in your proposed change.  You only
> fix the issue for hashes of the same kind.  To truly fix the issue you'd
> have to change generated code for gt_clear_caches () and provide
> a clearing-only implementation (or pass an operation mode bool to
> the core worker in hash-table.h).

Hmm, and don't we rather want to first mark and _then_ clear?  Because
if entry B in the hash is live and would keep A live then A _is_ kept in the
end but you'll remove it from the hash, possibly no longer using a still
live copy.
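
To illustrate the ordering concern, a mark-then-clear scheme could look
roughly like this. It is a toy model under assumptions: Entry, the integer
object ids, the rule "an entry survives if its key is marked", and
clear_cache are all hypothetical and much simpler than the generated GC code.

```cpp
#include <algorithm>
#include <set>
#include <vector>

// Toy cache entry: it survives if 'key' is marked, and a surviving
// entry keeps 'value' alive.
struct Entry { int key; int value; };

std::vector<Entry> clear_cache(std::vector<Entry> cache, std::set<int>& marked)
{
  // Phase 1: mark.  Repeat until nothing new gets marked, so the result
  // does not depend on the order in which entries are visited.
  bool changed = true;
  while (changed)
    {
      changed = false;
      for (const Entry& e : cache)
        if (marked.count(e.key) && marked.insert(e.value).second)
          changed = true;
    }

  // Phase 2: clear.  Only now drop the entries whose key stayed unmarked.
  cache.erase(std::remove_if(cache.begin(), cache.end(),
                             [&](const Entry& e)
                             { return marked.count(e.key) == 0; }),
              cache.end());
  return cache;
}
```

With entries {1 -> 10} and {2 -> 1} and only object 2 marked on entry, both
entries survive: entry 2 keeps object 1 alive, which in turn keeps entry 1.
Clearing before marking would have dropped entry 1 while object 1 is still
live.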

Richard.

> Thanks,
> Richard.
>
>> Thanks,
>> - Tom


[PATCH] MIPS: Do not generate micromips code for the no-smartmips-lwxs.c testcase

2015-07-06 Thread Andrew Bennett
Hi,

The LWXS instruction is part of the micromips ISA which means it is
valid to generate it for the no-smartmips-lwxs.c testcase.  I have
updated the dg-options for the test to ensure that it does not 
generate micromips code.

I have tested this on the mips-mti-elf target using
mips32r2/{-mno-micromips/-mmicromips} test options and there are no new
regressions.

The patch and ChangeLog are below.

Ok to commit?



Many thanks,


Andrew



testsuite/
* gcc.target/mips/no-smartmips-lwxs.c: Add -mno-micromips to dg-options.


diff --git a/gcc/testsuite/gcc.target/mips/no-smartmips-lwxs.c b/gcc/testsuite/gcc.target/mips/no-smartmips-lwxs.c
index ecf856e..6701a1c 100644
--- a/gcc/testsuite/gcc.target/mips/no-smartmips-lwxs.c
+++ b/gcc/testsuite/gcc.target/mips/no-smartmips-lwxs.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-mno-smartmips" } */
+/* { dg-options "-mno-smartmips -mno-micromips" } */
 
 NOMIPS16 int scaled_indexed_word_load (int a[], int b)
 {




Re: [Patch,microblaze]: Optimized usage of reserved stack space for function arguments.

2015-07-06 Thread Oleg Endo
Hi,

Just some general comments...

On 06 Jul 2015, at 22:05, Ajit Kumar Agarwal wrote:

> +static bool
> +microblaze_parm_needs_stack (cumulative_args_t args_so_far, tree type)
> +{
> +  enum machine_mode mode;
 
 'enum' is not required in C++, please omit it.
 We've been trying to remove unnecessary 'struct' and 'enum' after the
 switch to C++.  Although there are still some of them around, please
 don't add new ones.

> +  int unsignedp;
> +  rtx entry_parm;
 
 Please declare variables at their first use.
 (there are other such cases in your patch)
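
 A minimal sketch of both points, assuming stand-in definitions (the
 machine_mode enumerators and the promote helper below are hypothetical and
 only there to make the fragment compile on its own):

```cpp
// Hypothetical stand-in for GCC's machine_mode enum.
enum machine_mode { VOIDmode, QImode, SImode };

// Hypothetical promotion rule, only here to have something to call.
machine_mode promote(machine_mode m) { return m == QImode ? SImode : m; }

bool needs_no_storage(machine_mode type_mode)
{
  // No 'enum' keyword is needed in C++, and the variable is declared
  // at the point where it first receives its value.
  machine_mode mode = promote(type_mode);
  return mode == VOIDmode;
}
```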

Cheers,
Oleg

Re: flatten cfgloop.h

2015-07-06 Thread Michael Matz
Hi,

On Sun, 5 Jul 2015, Prathamesh Kulkarni wrote:

> Hi,
> The attached patches flatten cfgloop.h.
> patch-1.diff moves around prototypes and structures to respective 
> header-files.
> patch-2.diff (mostly auto-generated) replicates cfgloop.h includes in c files.
> Bootstrapped and tested on x86_64-unknown-linux-gnu with all front-ends.
> Built on all targets using config-list.mk.
> I left includes in cfgloop.h commented with #if 0 ... #endif.
> OK for trunk ?

Does nobody else think that header files for one or two prototypes are 
fairly silly?

Anyway, your autogenerated part contains changes that seem exaggerated, 
e.g.:

+++ b/gcc/bt-load.c
@@ -54,6 +54,14 @@ along with GCC; see the file COPYING3.  If not see
 #include "predict.h"
 #include "basic-block.h"
 #include "df.h"
+#include "bitmap.h"
+#include "sbitmap.h"
+#include "cfgloopmanip.h"
+#include "loop-init.h"
+#include "cfgloopanal.h"
+#include "loop-doloop.h"
+#include "loop-invariant.h"
+#include "loop-iv.h"

Surely bt-load doesn't need anything from doloop.h or invariant.h.  Before 
this goes into trunk this whole autogenerated thing should be cleaned up 
to add includes only for things that are actually needed.


Ciao,
Michael.


[PATCH] Simplify vector compare-not-select sequence

2015-07-06 Thread Bill Schmidt
Hi,

Due to specifics of the POWER architecture, some forms of a vector
compare followed by a vector select are represented in RTL as a compare,
followed by a logical NOT, followed by the select.  This tends to end up
generating an extra instruction.  This patch adds a case to
simplify-rtx.c to remove the logical NOT by reversing the outcomes of
the select.  I've added a POWER-specific test case that demonstrates
that the issue is fixed.
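
As a quick sanity check of the identity, here is a lane-wise model in plain
C++. It is a sketch under assumptions: Vec, lane_not and select_ne_zero are
hypothetical helpers, and the equivalence is only shown for mask lanes that
are all-zeros or all-ones, which is what a vector compare produces.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

using Vec = std::array<std::int64_t, 2>;

// Lane-wise bitwise NOT.
Vec lane_not(const Vec& c)
{
  Vec r{};
  for (std::size_t i = 0; i < r.size(); ++i)
    r[i] = ~c[i];
  return r;
}

// Lane-wise (m != 0) ? a : b.
Vec select_ne_zero(const Vec& m, const Vec& a, const Vec& b)
{
  Vec r{};
  for (std::size_t i = 0; i < r.size(); ++i)
    r[i] = m[i] != 0 ? a[i] : b[i];
  return r;
}
```

For a compare result c, select_ne_zero (lane_not (c), a, b) and
select_ne_zero (c, b, a) pick the same lanes, so the NOT can be dropped by
swapping the two select arms.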

Bootstrapped and tested on powerpc64le-unknown-linux-gnu with no
regressions.  Is this ok for trunk?

Thanks,
Bill


[gcc]

2015-07-06  Bill Schmidt  

* simplify-rtx.c (simplify_ternary_operation): Add simplification
for (!c) != {0,...,0} ? a : b for vector modes.

[gcc/testsuite]

2015-07-06  Bill Schmidt  

* gcc.target/powerpc/vec-cmp-sel.c: New test.


Index: gcc/simplify-rtx.c
===
--- gcc/simplify-rtx.c  (revision 225440)
+++ gcc/simplify-rtx.c  (working copy)
@@ -5251,6 +5251,32 @@ simplify_ternary_operation (enum rtx_code code, ma
  && rtx_equal_p (XEXP (op0, 1), op1
return op2;
 
+  /* Convert (!c) != {0,...,0} ? a : b into
+ c != {0,...,0} ? b : a for vector modes.  */
+  if (VECTOR_MODE_P (GET_MODE (op1))
+ && GET_CODE (op0) == NE
+ && GET_CODE (XEXP (op0, 0)) == NOT
+ && GET_CODE (XEXP (op0, 1)) == CONST_VECTOR)
+   {
+ rtx cv = XEXP (op0, 1);
+ int nunits = CONST_VECTOR_NUNITS (cv);
+ bool ok = true;
+ for (int i = 0; i < nunits; ++i)
+   if (CONST_VECTOR_ELT (cv, i) != const0_rtx)
+ {
+   ok = false;
+   break;
+ }
+ if (ok)
+   {
+ rtx new_op0 = gen_rtx_NE (GET_MODE (op0),
+   XEXP (XEXP (op0, 0), 0),
+   XEXP (op0, 1));
+ rtx retval = gen_rtx_IF_THEN_ELSE (mode, new_op0, op2, op1);
+ return retval;
+   }
+   }
+
   if (COMPARISON_P (op0) && ! side_effects_p (op0))
{
  machine_mode cmp_mode = (GET_MODE (XEXP (op0, 0)) == VOIDmode
Index: gcc/testsuite/gcc.target/powerpc/vec-cmp-sel.c
===
--- gcc/testsuite/gcc.target/powerpc/vec-cmp-sel.c  (revision 0)
+++ gcc/testsuite/gcc.target/powerpc/vec-cmp-sel.c  (working copy)
@@ -0,0 +1,20 @@
+/* { dg-do compile { target powerpc64*-*-* } } */
+/* { dg-require-effective-target powerpc_p8vector_ok } */
+/* { dg-options "-maltivec -O2" } */
+/* { dg-final { scan-assembler "vcmpgtsd" } } */
+/* { dg-final { scan-assembler-not "xxlnor" } } */
+
+/* Test code in simplify-rtx.c that converts
+ (!c) != {0,...,0} ? a : b
+   into
+ c != {0,...,0} ? b : a  */
+
+#include <altivec.h>
+
+vector signed long long foo () {
+  vector signed long long x = { 25399, -12900 };
+  vector signed long long y = { 12178, -9987 };
+  vector bool long long b = vec_cmpge (x, y);
+  vector signed long long z = vec_sel (y, x, b);
+  return z;
+}




Re: [PING][PATCH, 1/2] Merge rewrite_virtuals_into_loop_closed_ssa from gomp4 branch

2015-07-06 Thread Richard Biener
On Mon, 6 Jul 2015, Tom de Vries wrote:

> On 25/06/15 09:42, Tom de Vries wrote:
> > Hi,
> > 
> > this patch merges rewrite_virtuals_into_loop_closed_ssa (originally
> > submitted here: https://gcc.gnu.org/ml/gcc-patches/2015-06/msg01236.html
> > ) to trunk.
> > 
> > Bootstrapped and reg-tested on x86_64.
> > 
> > OK for trunk?
> > 
> 
> Ping.
> 
> Thanks,
> - Tom
> 
> 
> > 0001-Merge-rewrite_virtuals_into_loop_closed_ssa-from-gom.patch
> > 
> > 
> > Merge rewrite_virtuals_into_loop_closed_ssa from gomp4 branch
> > 
> > 2015-06-24  Tom de Vries
> > 
> > merge from gomp4 branch:
> > 2015-06-24  Tom de Vries
> > 
> > * tree-ssa-loop-manip.c (get_virtual_phi): Factor out of ...
> > (rewrite_virtuals_into_loop_closed_ssa): ... here.
> > 
> > * tree-ssa-loop-manip.c (replace_uses_in_dominated_bbs): Factor out
> > of ...
> > (rewrite_virtuals_into_loop_closed_ssa): ... here.
> > 
> > * dominance.c (bitmap_get_dominated_by): New function.
> > * dominance.h (bitmap_get_dominated_by): Declare.
> > * tree-ssa-loop-manip.c (rewrite_virtuals_into_loop_closed_ssa): Use
> > bitmap_get_dominated_by.
> > 
> > * tree-parloops.c (replace_uses_in_bbs_by)
> > (rewrite_virtuals_into_loop_closed_ssa): Move to ...
> > * tree-ssa-loop-manip.c: here.
> > * tree-ssa-loop-manip.h (rewrite_virtuals_into_loop_closed_ssa):
> > Declare.
> > 
> > 2015-06-18  Tom de Vries
> > 
> > * tree-parloops.c (rewrite_virtuals_into_loop_closed_ssa): New
> > function.
> > (transform_to_exit_first_loop_alt): Use
> > rewrite_virtuals_into_loop_closed_ssa.
> > ---
> >   gcc/dominance.c   | 21 
> >   gcc/dominance.h   |  1 +
> >   gcc/tree-parloops.c   | 43 +
> >   gcc/tree-ssa-loop-manip.c | 81
> > +++
> >   gcc/tree-ssa-loop-manip.h |  1 +
> >   5 files changed, 112 insertions(+), 35 deletions(-)
> > 
> > diff --git a/gcc/dominance.c b/gcc/dominance.c
> > index 9c66ca2..9b52d79 100644
> > --- a/gcc/dominance.c
> > +++ b/gcc/dominance.c
> > @@ -753,6 +753,27 @@ set_immediate_dominator (enum cdi_direction dir,
> > basic_block bb,
> >   dom_computed[dir_index] = DOM_NO_FAST_QUERY;
> >   }
> > 
> > +/* Returns in BBS the list of basic blocks immediately dominated by BB, in
> > the
> > +   direction DIR.  As get_dominated_by, but returns result as a bitmap.  */
> > +
> > +void
> > +bitmap_get_dominated_by (enum cdi_direction dir, basic_block bb, bitmap
> > bbs)
> > +{
> > +  unsigned int dir_index = dom_convert_dir_to_idx (dir);
> > +  struct et_node *node = bb->dom[dir_index], *son = node->son, *ason;
> > +
> > +  bitmap_clear (bbs);
> > +
> > +  gcc_checking_assert (dom_computed[dir_index]);
> > +
> > +  if (!son)
> > +return;
> > +
> > +  bitmap_set_bit (bbs, ((basic_block) son->data)->index);
> > +  for (ason = son->right; ason != son; ason = ason->right)
> > +bitmap_set_bit (bbs, ((basic_block) son->data)->index);
> > +}
> > +

Isn't an immediate_dominated_by_p () predicate better?  It's very
cheap to compute compared to allocating / populating and querying
a bitmap.
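
Roughly what I have in mind, modelled on a plain array of immediate
dominators. This is a sketch under assumptions: the idom vector is a
hypothetical stand-in for the dominance information (GCC keeps it in et_node
trees), so only the shape of the predicate carries over.

```cpp
#include <vector>

// idom[b] holds the index of block b's immediate dominator, or -1 for
// the entry block.  Returns true if DOM immediately dominates BB.
bool immediate_dominated_by_p(const std::vector<int>& idom, int bb, int dom)
{
  return bb >= 0
         && bb < static_cast<int>(idom.size())
         && idom[bb] == dom;
}
```

A use in block BB is then checked with a single lookup instead of a bitmap
that has to be allocated and filled first.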

> >   /* Returns the list of basic blocks immediately dominated by BB, in the
> >  direction DIR.  */
> >   vec
> > diff --git a/gcc/dominance.h b/gcc/dominance.h
> > index 37e138b..0a1a13e 100644
> > --- a/gcc/dominance.h
> > +++ b/gcc/dominance.h
> > @@ -41,6 +41,7 @@ extern void free_dominance_info (enum cdi_direction);
> >   extern basic_block get_immediate_dominator (enum cdi_direction,
> > basic_block);
> >   extern void set_immediate_dominator (enum cdi_direction, basic_block,
> >  basic_block);
> > +extern void bitmap_get_dominated_by (enum cdi_direction, basic_block,
> > bitmap);
> >   extern vec get_dominated_by (enum cdi_direction,
> > basic_block);
> >   extern vec get_dominated_by_region (enum cdi_direction,
> >  basic_block *,
> > diff --git a/gcc/tree-parloops.c b/gcc/tree-parloops.c
> > index e582fe7..df7c351 100644
> > --- a/gcc/tree-parloops.c
> > +++ b/gcc/tree-parloops.c
> > @@ -1498,25 +1498,6 @@ replace_uses_in_bb_by (tree name, tree val,
> > basic_block bb)
> >   }
> >   }
> > 
> > -/* Replace uses of NAME by VAL in blocks BBS.  */
> > -
> > -static void
> > -replace_uses_in_bbs_by (tree name, tree val, bitmap bbs)
> > -{
> > -  gimple use_stmt;
> > -  imm_use_iterator imm_iter;
> > -
> > -  FOR_EACH_IMM_USE_STMT (use_stmt, imm_iter, name)
> > -{
> > -  if (!bitmap_bit_p (bbs, gimple_bb (use_stmt)->index))
> > -   continue;
> > -
> > -  use_operand_p use_p;
> > -  FOR_EACH_IMM_USE_ON_STMT (use_p, imm_iter)
> > -   SET_USE (use_p, val);
> > -}
> > -}
> > -
> >   /* Do transformation from:
> > 
> >:
> > @@ -1637,18 +1618,11 @@ transform_to_exit_first_loop_alt (struct loop *loop,
> > tree control = gimple_cond_lhs (cond_stmt);
> > edge e;
> > 
> > -  /* G

Re: [PING][PATCH, 2/2][PR66642] Add empty loop exit block in transform_to_exit_first_loop_alt

2015-07-06 Thread Richard Biener
On Mon, 6 Jul 2015, Tom de Vries wrote:

> On 25/06/15 09:43, Tom de Vries wrote:
> > Hi,
> > 
> > I ran into a failure with parloops for reduction loop testcase
> > libgomp/testsuite/libgomp.c/parloops-exit-first-loop-alt-3.c.  When we
> > exercise the low iteration count loop, the test-case fails.
> > 
> > To understand the problem, let's first look at what happens when we use
> > transform_to_exit_first_loop (the original one) instead of
> > transform_to_exit_first_loop_alt (the alternative one, which is
> > currently used, and causing the failure).
> > 
> > Before transform_to_exit_first_loop, the low iteration count loop and
> > the main loop share the loop exit block. After
> > transform_to_exit_first_loop, that's not the case anymore, the main loop
> > now has an exit block with a single predecessor. Subsequently,
> > separate_decls_in_region inserts code in the main loop exit block, which
> > is only triggered upon exit of the main loop.
> > 
> > However, transform_to_exit_first_loop_alt does not insert such an exit
> > block, and the code inserted by separate_decls_in_region is also active
> > for the low iteration count loop, which results in an incorrect
> > reduction result when the low iteration count loop is used.
> > 
> > 
> > This patch fixes the problem by making sure
> > transform_to_exit_first_loop_alt adds a new exit block in between the
> > main loop header and the old exit block.
> > 
> > 
> > Bootstrapped and reg-tested on x86_64.
> > 
> > OK for trunk?
> > 
> 
> Ping.

Ok.

Thanks,
Richard.

> Thanks,
> - Tom
> 
> > 0002-Add-empty-loop-exit-block-in-transform_to_exit_first.patch
> > 
> > 
> > Add empty loop exit block in transform_to_exit_first_loop_alt
> > 
> > 2015-06-24  Tom de Vries
> > 
> > PR tree-optimization/66642
> > * tree-parloops.c (transform_to_exit_first_loop_alt): Update function
> > header comment.  Rename split_edge variable to edge_at_split.  Split
> > exit edge to create new loop exit bb.  Insert loop exit phis in new
> > loop
> > exit bb.
> > 
> > * testsuite/libgomp.c/parloops-exit-first-loop-alt-3.c (main): Test
> > low
> > iteration count case.
> > * testsuite/libgomp.c/parloops-exit-first-loop-alt.c (init): New
> > function, factor out of ...
> > (main): ... here.  Test low iteration count case.
> > ---
> >   gcc/tree-parloops.c| 45
> > --
> >   .../libgomp.c/parloops-exit-first-loop-alt-3.c |  5 +++
> >   .../libgomp.c/parloops-exit-first-loop-alt.c   | 28 +-
> >   3 files changed, 64 insertions(+), 14 deletions(-)
> > 
> > diff --git a/gcc/tree-parloops.c b/gcc/tree-parloops.c
> > index df7c351..6c8aaab 100644
> > --- a/gcc/tree-parloops.c
> > +++ b/gcc/tree-parloops.c
> > @@ -1522,7 +1522,7 @@ replace_uses_in_bb_by (tree name, tree val,
> > basic_block bb)
> >goto 
> > 
> >:
> > - sum_z = PHI 
> > + sum_z = PHI 
> > 
> >[1] Where  is single_pred (bb latch); In the simplest case,
> >  that's .
> > @@ -1549,14 +1549,17 @@ replace_uses_in_bb_by (tree name, tree val,
> > basic_block bb)
> >if (ivtmp_c < n + 1)
> >  goto ;
> >else
> > -   goto ;
> > +   goto ;
> > 
> >:
> >ivtmp_b = ivtmp_a + 1;
> >goto 
> > 
> > + :
> > + sum_y = PHI 
> > +
> >:
> > - sum_z = PHI 
> > + sum_z = PHI 
> > 
> > 
> >  In unified diff format:
> > @@ -1593,9 +1596,12 @@ replace_uses_in_bb_by (tree name, tree val,
> > basic_block bb)
> >   - goto 
> >   + goto 
> > 
> > ++:
> > ++sum_y = PHI 
> > +
> > :
> > -- sum_z = PHI 
> > -+ sum_z = PHI 
> > +- sum_z = PHI 
> > ++ sum_z = PHI 
> > 
> >  Note: the example does not show any virtual phis, but these are handled
> > more
> >  or less as reductions.
> > @@ -1626,7 +1632,7 @@ transform_to_exit_first_loop_alt (struct loop *loop,
> > 
> > /* Create the new_header block.  */
> > basic_block new_header = split_block_before_cond_jump (exit->src);
> > -  edge split_edge = single_pred_edge (new_header);
> > +  edge edge_at_split = single_pred_edge (new_header);
> > 
> > /* Redirect entry edge to new_header.  */
> > edge entry = loop_preheader_edge (loop);
> > @@ -1643,9 +1649,9 @@ transform_to_exit_first_loop_alt (struct loop *loop,
> > e = redirect_edge_and_branch (post_cond_edge, header);
> > gcc_assert (e == post_cond_edge);
> > 
> > -  /* Redirect split_edge to latch.  */
> > -  e = redirect_edge_and_branch (split_edge, latch);
> > -  gcc_assert (e == split_edge);
> > +  /* Redirect edge_at_split to latch.  */
> > +  e = redirect_edge_and_branch (edge_at_split, latch);
> > +  gcc_assert (e == edge_at_split);
> > 
> > /* Set the new loop bound.  */
> > gimple_cond_set_rhs (cond_stmt, bound);
> > @@ -1697,21 +1703,36 @@ transform_to_exit_first_loop_alt (struct loop *loop,
> > /* Set the latch arguments of the new phis to ivtmp/sum

Re: [RFC 1/2] gthread: Add __gthread_cond_timedwaitonclock

2015-07-06 Thread Jonathan Wakely

On 06/07/15 13:55 +0100, Mike Crowe wrote:

diff --git a/libgcc/gthr-posix.h b/libgcc/gthr-posix.h
index fb59816..0e01866 100644
--- a/libgcc/gthr-posix.h
+++ b/libgcc/gthr-posix.h
@@ -33,6 +33,7 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  If 
not, see
#define __GTHREADS_CXX0X 1

#include 
+#include 

#if ((defined(_LIBOBJC) || defined(_LIBOBJC_WEAK)) \
 || !defined(_GTHREAD_USE_MUTEX_TIMEDLOCK))
@@ -44,6 +45,11 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  If 
not, see
# endif
#endif

+
+#if defined(_GLIBCXX_USE_PTHREAD_COND_TIMEDWAITONCLOCK_NP)
+# define _GTHREAD_USE_COND_TIMEDWAITONCLOCK 1
+#endif


This isn't correct, because it's possible to include  before
including any C++ Standard Library header, so the _GLIBCXX_ macro
defined in libstdc++'s c++config.h will not have been defined.

It might make sense to just do this internally in libstdc++ and not
involve gthr-posix.h at all, this is what we do for pthread_rwlock_t
usage in  so you might want to follow that model.

How portable is pthread_cond_timedwaitonclock_np? Is it unique to
glibc or do any other posix systems provide it?


Re: [RFC 2/2] Add steady_clock support to condition_variable

2015-07-06 Thread Jonathan Wakely

On 06/07/15 13:55 +0100, Mike Crowe wrote:

If __gthread_cond_timedwaitonclock is available it can be used to fix
part of https://gcc.gnu.org/bugzilla/show_bug.cgi?id=41861 by supporting
std::chrono::steady_clock properly with std::condition_variable.

This means that code using std::condition_variable::wait_for or
std::condition_variable::wait_until with std::chrono::steady_clock is no
longer subject to timing out early or potentially waiting for much
longer if the system clock is changed at an inopportune moment.

If __gthread_cond_timedwaitonclock is available then
std::chrono::steady_clock is deemed to be the "best" clock available
which means that it is used for the relative wait_for calls and absolute
wait_until calls that aren't choosing to use std::chrono::system_clock.
Calls explicitly using std::chrono::system_clock continue to use
CLOCK_REALTIME.

If __gthread_cond_timedwaitonclock is not available then
std::chrono::system_clock is deemed to be the "best" clock available
which means that the previous suboptimal behaviour remains.
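
A minimal sketch of the fallback behaviour described above, assuming the
only available wait is against the system clock.  The helper name
to_system_deadline is hypothetical; this is not the libstdc++
implementation, only an illustration of why the converted deadline is
fragile.

```cpp
#include <cassert>
#include <chrono>

// Fallback when only CLOCK_REALTIME waits are available: re-express a
// steady_clock deadline as a system_clock deadline.  The result is only
// meaningful as long as the system clock is not adjusted before the
// wait ends, which is exactly the weakness discussed in the thread.
std::chrono::system_clock::time_point
to_system_deadline (std::chrono::steady_clock::time_point steady_deadline)
{
  const auto steady_now = std::chrono::steady_clock::now ();
  const auto system_now = std::chrono::system_clock::now ();
  const auto remaining = steady_deadline - steady_now;
  return system_now
         + std::chrono::duration_cast<std::chrono::system_clock::duration>
             (remaining);
}
```

If the system clock is stepped after the conversion, the wait ends early or
late by the size of the step; waiting directly on a monotonic clock avoids
the conversion altogether.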



From a quick glance this looks like a good change. As I said in my
other mail about the gthr-posix.h changes, it might be better to keep
this change to libstdc++ rather than splitting it across libstdc++ and
libgcc's gthr-posix.h.

Do you have a copyright assignment in place for GCC contributions?




[PATCH] libstdc++ os_defines now required for DragonFly

2015-07-06 Thread John Marino
On the development branch of DragonFly BSD, it was discovered that
__LONG_LONG_SUPPORTED was accidentally defined unconditionally.  This had
the positive side effect of allowing GCC conftests to pass for C99 support
via wchar.h.  Now that the bug is fixed, the wchar C99 conftest fails,
resulting in a C++ regression where software that previously compiled
now fails due to unknown functions such as wcstoll (since C99 support
changed from "true" to "false").

FreeBSD behaves exactly the same way, and dealt with it using
system-specific defines.
The DragonFly regression is fixed by copying the relevant defines from
the FreeBSD config. (see attached patch).

This patch should be applied to trunk and also backported to GCC-5 branch.

Thanks,
John
Index: libstdc++-v3/config/os/bsd/dragonfly/os_defines.h
===
--- libstdc++-v3/config/os/bsd/dragonfly/os_defines.h   (revision 225453)
+++ libstdc++-v3/config/os/bsd/dragonfly/os_defines.h   (working copy)
@@ -29,4 +29,9 @@
 // System-specific #define, typedefs, corrections, etc, go here.  This
 // file will come before all others.
 
+#define _GLIBCXX_USE_C99_CHECK 1
+#define _GLIBCXX_USE_C99_DYNAMIC (!(__ISO_C_VISIBLE >= 1999))
+#define _GLIBCXX_USE_C99_LONG_LONG_CHECK 1
+#define _GLIBCXX_USE_C99_LONG_LONG_DYNAMIC (_GLIBCXX_USE_C99_DYNAMIC || !defined __LONG_LONG_SUPPORTED)
+
 #endif


Re: [PATCH 7/7] always define WORD_REGISTER_OPERATIONS

2015-07-06 Thread Segher Boessenkool
Hi Trevor,

On Mon, Jul 06, 2015 at 08:11:30AM -0400, tbsaunde+...@tbsaunde.org wrote:
>   * defaults.h: Provide default for WORD_REGISTER_OPERATIONS.
>   * config/alpha/alpha.h: Define WORD_REGISTER_OPERATIONS to 1.
>   * config/arc/arc.h: Likewise.
>   * config/arm/arm.h: Likewise.
>   * config/bfin/bfin.h: Likewise.
>   * config/epiphany/epiphany.h: Likewise.
>   * config/frv/frv.h: Likewise.
>   * config/ia64/ia64.h: Likewise.
>   * config/iq2000/iq2000.h: Likewise.
>   * config/lm32/lm32.h: Likewise.
>   * config/m32r/m32r.h: Likewise.
>   * config/mcore/mcore.h: Likewise.
>   * config/mep/mep.h: Likewise.
>   * config/microblaze/microblaze.h: Likewise.
>   * config/mips/mips.h: Likewise.
>   * config/mmix/mmix.h:
>   * config/mn10300/mn10300.h:
>   * config/nds32/nds32.h:
>   * config/nios2/nios2.h:
>   * config/pa/pa.h:
>   * config/rl78/rl78.h:
>   * config/sh/sh.h:
>   * config/sparc/sparc.h:
>   * config/stormy16/stormy16.h (enum reg_class):
>   * config/tilegx/tilegx.h:
>   * config/tilepro/tilepro.h:
>   * config/v850/v850.h:
>   * config/xtensa/xtensa.h:
>   * doc/tm.texi: Regenerate.

Something went wrong here ;-)

> @@ -12072,10 +12071,9 @@ simplify_comparison (enum rtx_code code, rtx *pop0, 
> rtx *pop1)
>they no longer have defined values and the meaning of
>the code has been changed.  */
> && (0
> -#ifdef WORD_REGISTER_OPERATIONS
> -   || (mode_width > GET_MODE_PRECISION (tmode)
> +   || (!WORD_REGISTER_OPERATIONS
> +   && mode_width > GET_MODE_PRECISION (tmode)
> && mode_width <= BITS_PER_WORD)
> -#endif
> || (mode_width <= GET_MODE_PRECISION (tmode)
> && subreg_lowpart_p (XEXP (op0, 0
> && CONST_INT_P (XEXP (op0, 1))

Please get rid of that "0 ||" now.

I think the ! is wrong here?

The rest of the combine changes look good.
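
A minimal sketch of why the negation looks suspicious, assuming the new
default in defaults.h is 0 and targets that previously defined the macro now
define it to 1.  The two functions are hypothetical models of the clause
before and after the patch, not code from combine.c.

```cpp
#include <cassert>

// Models the original code: the clause is compiled in only when the
// target defines WORD_REGISTER_OPERATIONS.
bool
clause_with_ifdef (bool macro_defined, int mode_width, int tmode_precision,
                   int bits_per_word)
{
  return macro_defined
         && mode_width > tmode_precision
         && mode_width <= bits_per_word;
}

// Models the patch as posted: the macro is always 0 or 1 and the clause
// is guarded by its negation.
bool
clause_as_posted (bool word_register_operations, int mode_width,
                  int tmode_precision, int bits_per_word)
{
  return !word_register_operations
         && mode_width > tmode_precision
         && mode_width <= bits_per_word;
}
```

Under that assumption the two forms disagree for every target: the clause
used to be active only where the macro was defined, and with the "!" it
would be active only where the macro is 0.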

> @@ -6114,13 +6112,12 @@ store_constructor (tree exp, rtx target, int cleared, 
> HOST_WIDE_INT size)
>highest_pow2_factor (offset));
> }
>  
> -#ifdef WORD_REGISTER_OPERATIONS
>   /* If this initializes a field that is smaller than a
>  word, at the start of a word, try to widen it to a full
>  word.  This special case allows us to output C++ member
>  function initializations in a form that the optimizers
>  can understand.  */
> - if (REG_P (target)
> + if (WORD_REGISTER_OPERATIONS && REG_P (target)
>   && bitsize < BITS_PER_WORD
>   && bitpos % BITS_PER_WORD == 0
>   && GET_MODE_CLASS (mode) == MODE_INT

Put that first && on a new line as well?  The same applies in many other places.


Segher


Re: Clean-ups in match.pd

2015-07-06 Thread Richard Biener
On Sat, Jul 4, 2015 at 4:34 PM, Marc Glisse  wrote:
> Hello,
>
> these are just some minor changes. I believe I had already promised a build_
> function to match integer_each_onep.
>
> Bootstrap+testsuite on powerpc64le-unknown-linux-gnu (it looks like
> *-match.c takes about 10 minutes to compile in stage2 these days).

Ouch.  I have some changes to the code generation in the queue which
also supports a more natural "if" structure (else and elif).  Eventually
that helps a bit but I suppose the main issue is simply from the large
functions.  They can be split quite easily I think, but passing down
all relevant state might turn out to be tricky unless we start using
nested functions here ... (and IIRC those are not supported in C++)



Richard.

> 2015-07-06  Marc Glisse  
>
> * match.pd: Remove element_mode inside HONOR_*.
> (~ (-A) -> A - 1, ~ (A - 1) -> -A): Handle complex types.
> (~X | X -> -1, ~X ^ X -> -1): Merge.
> * tree.c (build_each_one_cst): New function.
> * tree.h (build_each_one_cst): Likewise.
>
> --
> Marc Glisse
> Index: match.pd
> ===
> --- match.pd(revision 225411)
> +++ match.pd(working copy)
> @@ -101,7 +101,7 @@
> negative value by 0 gives -0, not +0.  */
>  (simplify
>   (mult @0 real_zerop@1)
> - (if (!HONOR_NANS (type) && !HONOR_SIGNED_ZEROS (element_mode (type)))
> + (if (!HONOR_NANS (type) && !HONOR_SIGNED_ZEROS (type))
>@1))
>
>  /* In IEEE floating point, x*1 is not equivalent to x for snans.
> @@ -108,8 +108,8 @@
> Likewise for complex arithmetic with signed zeros.  */
>  (simplify
>   (mult @0 real_onep)
> - (if (!HONOR_SNANS (element_mode (type))
> -  && (!HONOR_SIGNED_ZEROS (element_mode (type))
> + (if (!HONOR_SNANS (type)
> +  && (!HONOR_SIGNED_ZEROS (type)
>|| !COMPLEX_FLOAT_TYPE_P (type)))
>(non_lvalue @0)))
>
> @@ -116,8 +116,8 @@
>  /* Transform x * -1.0 into -x.  */
>  (simplify
>   (mult @0 real_minus_onep)
> -  (if (!HONOR_SNANS (element_mode (type))
> -   && (!HONOR_SIGNED_ZEROS (element_mode (type))
> +  (if (!HONOR_SNANS (type)
> +   && (!HONOR_SIGNED_ZEROS (type)
> || !COMPLEX_FLOAT_TYPE_P (type)))
> (negate @0)))
>
> @@ -165,7 +165,7 @@
>   (rdiv @0 @0)
>   (if (FLOAT_TYPE_P (type)
>&& ! HONOR_NANS (type)
> -  && ! HONOR_INFINITIES (element_mode (type)))
> +  && ! HONOR_INFINITIES (type))
>{ build_one_cst (type); }))
>
>  /* Optimize -A / A to -1.0 if we don't care about
> @@ -174,19 +174,19 @@
>   (rdiv:c @0 (negate @0))
>   (if (FLOAT_TYPE_P (type)
>&& ! HONOR_NANS (type)
> -  && ! HONOR_INFINITIES (element_mode (type)))
> +  && ! HONOR_INFINITIES (type))
>{ build_minus_one_cst (type); }))
>
>  /* In IEEE floating point, x/1 is not equivalent to x for snans.  */
>  (simplify
>   (rdiv @0 real_onep)
> - (if (!HONOR_SNANS (element_mode (type)))
> + (if (!HONOR_SNANS (type))
>(non_lvalue @0)))
>
>  /* In IEEE floating point, x/-1 is not equivalent to -x for snans.  */
>  (simplify
>   (rdiv @0 real_minus_onep)
> - (if (!HONOR_SNANS (element_mode (type)))
> + (if (!HONOR_SNANS (type))
>(negate @0)))
>
>  /* If ARG1 is a constant, we can convert this to a multiply by the
> @@ -297,9 +297,10 @@
>@1)
>
>  /* ~x | x -> -1 */

Please also adjust this comment.  Ok with that change.

Thanks,
Richard.
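
A quick sanity check of the merged pattern, as a sketch on 32-bit unsigned
values only (match.pd itself works on trees, not on C++ integers).  The
function name complement_identities_hold is hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// In each bit position exactly one of X and ~X has the bit set, so OR,
// XOR and addition (which never carries here) all give all-ones.
bool
complement_identities_hold (std::uint32_t x)
{
  const std::uint32_t all_ones = ~std::uint32_t (0);
  const std::uint32_t nx = ~x;
  return (nx | x) == all_ones
         && (nx ^ x) == all_ones
         && std::uint32_t (nx + x) == all_ones;
}
```

This is why bit_ior, bit_xor and plus can share one simplification that
produces build_all_ones_cst.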

> -(simplify
> - (bit_ior:c (convert? @0) (convert? (bit_not @0)))
> - (convert { build_all_ones_cst (TREE_TYPE (@0)); }))
> +(for op (bit_ior bit_xor plus)
> + (simplify
> +  (op:c (convert? @0) (convert? (bit_not @0)))
> +  (convert { build_all_ones_cst (TREE_TYPE (@0)); })))
>
>  /* x ^ x -> 0 */
>  (simplify
> @@ -311,11 +312,6 @@
>(bit_xor @0 integer_all_onesp@1)
>(bit_not @0))
>
> -/* ~X ^ X is -1.  */
> -(simplify
> - (bit_xor:c (bit_not @0) @0)
> - { build_all_ones_cst (type); })
> -
>  /* x & ~0 -> x  */
>  (simplify
>   (bit_and @0 integer_all_onesp)
> @@ -603,11 +599,11 @@
>  (simplify
>   (bit_not (convert? (negate @0)))
>   (if (tree_nop_conversion_p (type, TREE_TYPE (@0)))
> -  (convert (minus @0 { build_one_cst (TREE_TYPE (@0)); }
> +  (convert (minus @0 { build_each_one_cst (TREE_TYPE (@0)); }
>
>  /* Convert ~ (A - 1) or ~ (A + -1) to -A.  */
>  (simplify
> - (bit_not (convert? (minus @0 integer_onep)))
> + (bit_not (convert? (minus @0 integer_each_onep)))
>   (if (tree_nop_conversion_p (type, TREE_TYPE (@0)))
>(convert (negate @0
>  (simplify
> Index: tree.c
> ===
> --- tree.c  (revision 225411)
> +++ tree.c  (working copy)
> @@ -1968,6 +1968,21 @@
>return t;
>  }
>
> +/* Return the constant 1 in type TYPE.  If TYPE has several elements, each
> +   element is set to 1.  In particular, this is 1 + i for complex types.
> */
> +
> +tree
> +build_each_one_cst (tree type)
> +{
> +  if (TREE_CODE (type) == COMPLEX_TYPE)
> +{
> +  tree

Re: [Ada] Add DragonFly support to System.OS_Constants template

2015-07-06 Thread Thomas Quinot
* John Marino, 2015-07-06 :

> The System.OS_Constants templates for GNAT has three preprocessor checks
> for FreeBSD.  In all three cases, DragonFly BSD needs to be treated the
> same as FreeBSD.  The attached patch accomplishes this.

Thanks John, looks good to me!

Thomas.



Re: [RFC 1/2] gthread: Add __gthread_cond_timedwaitonclock

2015-07-06 Thread Mike Crowe
On Monday 06 July 2015 at 14:51:42 +0100, Jonathan Wakely wrote:
> On 06/07/15 13:55 +0100, Mike Crowe wrote:
> >diff --git a/libgcc/gthr-posix.h b/libgcc/gthr-posix.h
> >index fb59816..0e01866 100644
> >--- a/libgcc/gthr-posix.h
> >+++ b/libgcc/gthr-posix.h
> >@@ -33,6 +33,7 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  
> >If not, see
> >#define __GTHREADS_CXX0X 1
> >
> >#include 
> >+#include 
> >
> >#if ((defined(_LIBOBJC) || defined(_LIBOBJC_WEAK)) \
> > || !defined(_GTHREAD_USE_MUTEX_TIMEDLOCK))
> >@@ -44,6 +45,11 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  
> >If not, see
> ># endif
> >#endif
> >
> >+
> >+#if defined(_GLIBCXX_USE_PTHREAD_COND_TIMEDWAITONCLOCK_NP)
> >+# define _GTHREAD_USE_COND_TIMEDWAITONCLOCK 1
> >+#endif
> 
> This isn't correct, because it's possible to include  before
> including any C++ Standard Library header, so the _GLIBCXX_ macro
> defined in libstdc++'s c++config.h will not have been defined.

Presumably gthr-posix.h could just include c++config.h to avoid that, but
perhaps it's not that simple which is why gthr-posix.h swings through hoops
to define _GTHREAD_USE_MUTEX_TIMEDLOCK itself?

> It might make sense to just do this internally in libstdc++ and not
> involve gthr-posix.h at all, this is what we do for pthread_rwlock_t
> usage in  so you might want to follow that model.

I implemented the new function in gthreads because that seemed to be what
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=41861 was suggesting and I'd
incorrectly assumed that there were non-pthreads backends for gthreads too.

I can try to re-implement condition_variable using pthreads directly but
the change will be somewhat larger and it would break non-pthreads
platforms.

> How portable is pthread_cond_timedwaitonclock_np? Is it unique to
> glibc or do any other posix systems provide it?

It's not at all portable; I invented it because it seemed to be the most
straightforward way to correctly support std::condition_variable's wait
operations using std::chrono::steady_clock on Linux. I haven't even
suggested adding the function on the glibc mailing list yet.

I looked briefly to see if I could find anyone else solving the problem on
Posix platforms but they all seem to be converting to system_clock too. :(

Thanks.

Mike.


RE: [PATCH] MIPS: For micromips allow near-far-3.c test to use the jals instruction to call near_func

2015-07-06 Thread Andrew Bennett
> OK.

Committed as SVN 225457.

Regards,


Andrew


Re: [RFC 2/2] Add steady_clock support to condition_variable

2015-07-06 Thread Mike Crowe
On Monday 06 July 2015 at 14:54:06 +0100, Jonathan Wakely wrote:
> Do you have a copyright assignment in place for GCC contributions?

Not yet. From my reading of https://gcc.gnu.org/contribute.html I should
probably ask you for the forms. :-)

Mike.


Re: [PATCH 1/2][ARM] PR/65956 AAPCS update for alignment attribute

2015-07-06 Thread Ramana Radhakrishnan


On 06/07/15 12:00, Alan Lawrence wrote:
> Eric Botcazou wrote:
>>> Technically this is incorrect since AGGREGATE_TYPE_P includes ARRAY_TYPE
>>> and ARRAY_TYPE doesn't have TYPE_FIELDS.  I doubt we could reach that
>>> case though (unless there's a language that allows passing arrays by value).
>>
>> Ada passes small array types by the method specified by the 
>> pass_by_reference hook (and large array types by reference).
> 
> Ok, thanks. Here's a revised patch that handles array types. Again I've 
> tested on both trunk (bootstrap + check-gcc) and gcc-5-branch 
> (profiledbootstrap now succeeding + check-gcc). Jakub's pr65956.c testcase 
> also now passes.
> 

> The new code lacks a testcase; from what Eric says, it's possible we can 
> write one using Ada, but I don't know any Ada myself, so I think any testcase 
> should follow in a separate patch.
> 
> Neither have I managed to run a check-ada yet, as I don't presently have a 
> working Ada compiler with which to bootstrap gcc's Ada frontend. Working on 
> this now.

This is OK; the Ada testing can go on in parallel, and we should take this in so
as not to delay rc1 any further.



regards
Ramana

> 
> --Alan
> 
> gcc/ChangeLog:
> 
> * config/arm/arm.c (arm_needs_doubleword_align) : Drop any outer
> alignment attribute, exploring one level down for records and arrays.


Re: [PATCH][10/n] Remove GENERIC stmt combining from SCCVN

2015-07-06 Thread Kyrill Tkachov

Hi Richard,

On 01/07/15 14:03, Richard Biener wrote:

This merges the complete comparison patterns from the match-and-simplify
branch, leaving incomplete implementations of fold-const.c code alone.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied.

Richard.

2015-07-01  Richard Biener  

* fold-const.c (fold_comparison): Move X - Y CMP 0 -> X CMP Y,
X * C1 CMP 0 -> X CMP 0, X CMP X, ~X CMP ~Y -> Y CMP X and
~X CMP C -> X CMP' ~C to ...
* match.pd: ... patterns here.


  
+/* Transform comparisons of the form X - Y CMP 0 to X CMP Y.

+   ??? The transformation is valid for the other operators if overflow
+   is undefined for the type, but performing it here badly interacts
+   with the transformation in fold_cond_expr_with_comparison which
+   attempts to synthetize ABS_EXPR.  */
+(for cmp (eq ne)
+ (simplify
+  (cmp (minus @0 @1) integer_zerop)
+  (cmp @0 @1)))


This broke some tests on aarch64:
FAIL: gcc.target/aarch64/subs.c scan-assembler subs\tw[0-9]
FAIL: gcc.target/aarch64/subs1.c scan-assembler subs\tw[0-9]+, w[0-9]+, w[0-9]+
FAIL: gcc.target/aarch64/subs1.c scan-assembler subs\tw[0-9]+, w[0-9]+, w[0-9]+, lsl 3
FAIL: gcc.target/aarch64/subs1.c scan-assembler subs\tx[0-9]+, x[0-9]+, x[0-9]+
FAIL: gcc.target/aarch64/subs1.c scan-assembler subs\tx[0-9]+, x[0-9]+, x[0-9]+, lsl 3

To take subs.c as an example, there's something odd going on.
The X - Y CMP 0 -> X CMP Y transformation gets triggered only for the int
case but not the long long case, yet the int case (foo) is where the RTL
ends up being:

(insn 9 4 10 2 (set (reg/v:SI 74 [ l ])
(minus:SI (reg/v:SI 76 [ x ])
(reg/v:SI 77 [ y ]))) subs.c:9 254 {subsi3}
 (nil))
(insn 10 9 11 2 (set (reg:CC 66 cc)
(compare:CC (reg/v:SI 76 [ x ])
(reg/v:SI 77 [ y ])))

instead of the previous:

(insn 9 4 10 2 (set (reg/v:SI 74 [ l ])
(minus:SI (reg/v:SI 76 [ x ])
(reg/v:SI 77 [ y ]))) subs.c:9 254 {subsi3}

(insn 10 9 11 2 (set (reg:CC 66 cc)
(compare:CC (reg/v:SI 74 [ l ])
(const_int 0 [0])))


so the transformed X CMP Y does not get matched by combine into a subs.
Was the transformation before the patch in fold-const.c not getting triggered?
In aarch64 we have patterns to match:
  [(set (reg:CC_NZ CC_REGNUM)
(compare:CC_NZ (minus:GPI (match_operand:GPI 1 "register_operand" "r")
  (match_operand:GPI 2 "register_operand" "r"))
   (const_int 0)))
   (set (match_operand:GPI 0 "register_operand" "=r")
(minus:GPI (match_dup 1) (match_dup 2)))]


Should we add a pattern to match:
  [(set (reg:CC CC_REGNUM)
(compare:CC (match_operand:GPI 1 "register_operand" "r")
   (match_operand:GPI 2 "register_operand" "r")))
   (set (match_operand:GPI 0 "register_operand" "=r")
(minus:GPI (match_dup 1) (match_dup 2)))]

as well?

Kyrill
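
For reference, a small sketch of why the new match.pd pattern is restricted
to eq/ne.  It models 32-bit wrapping subtraction in unsigned arithmetic; the
helper names are hypothetical and this is not how GCC evaluates the
pattern.

```cpp
#include <cassert>
#include <cstdint>

// Wrapping 32-bit subtraction, done in unsigned arithmetic so that the
// behaviour is well defined in C++.
std::uint32_t
wrapping_sub (std::int32_t x, std::int32_t y)
{
  return static_cast<std::uint32_t> (x) - static_cast<std::uint32_t> (y);
}

// "X - Y == 0" evaluated on the wrapped difference.
bool
diff_is_zero (std::int32_t x, std::int32_t y)
{
  return wrapping_sub (x, y) == 0;
}

// "X - Y < 0" evaluated on the wrapped difference (sign bit set).
bool
diff_is_negative (std::int32_t x, std::int32_t y)
{
  return (wrapping_sub (x, y) >> 31) != 0;
}
```

The equality form always agrees with X == Y, even when the subtraction
wraps.  The ordering form does not: for X = INT32_MIN and Y = 1 the wrapped
difference is positive although X < Y, so that rewrite is only valid when
overflow is undefined.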


+
+/* Transform comparisons of the form X * C1 CMP 0 to X CMP 0 in the
+   signed arithmetic case.  That form is created by the compiler
+   often enough for folding it to be of value.  One example is in
+   computing loop trip counts after Operator Strength Reduction.  */
+(for cmp (tcc_comparison)
+ scmp (swapped_tcc_comparison)
+ (simplify
+  (cmp (mult @0 INTEGER_CST@1) integer_zerop@2)
+  /* Handle unfolded multiplication by zero.  */
+  (if (integer_zerop (@1))
+   (cmp @1 @2))
+  (if (ANY_INTEGRAL_TYPE_P (TREE_TYPE (@0))
+   && TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (@0)))
+   /* If @1 is negative we swap the sense of the comparison.  */
+   (if (tree_int_cst_sgn (@1) < 0)
+(scmp @0 @2))
+   (cmp @0 @2
+
+/* Simplify comparison of something with itself.  For IEEE
+   floating-point, we can only do some of these simplifications.  */
+(simplify
+ (eq @0 @0)
+ (if (! FLOAT_TYPE_P (TREE_TYPE (@0))
+  || ! HONOR_NANS (TYPE_MODE (TREE_TYPE (@0
+  { constant_boolean_node (true, type); }))
+(for cmp (ge le)
+ (simplify
+  (cmp @0 @0)
+  (eq @0 @0)))
+(for cmp (ne gt lt)
+ (simplify
+  (cmp @0 @0)
+  (if (cmp != NE_EXPR
+   || ! FLOAT_TYPE_P (TREE_TYPE (@0))
+   || ! HONOR_NANS (TYPE_MODE (TREE_TYPE (@0
+   { constant_boolean_node (false, type); })))
+
+/* Fold ~X op ~Y as Y op X.  */
+(for cmp (tcc_comparison)
+ (simplify
+  (cmp (bit_not @0) (bit_not @1))
+  (cmp @1 @0)))
+
+/* Fold ~X op C as X op' ~C, where op' is the swapped comparison.  */
+(for cmp (tcc_comparison)
+ scmp (swapped_tcc_comparison)
+ (simplify
+  (cmp (bit_not @0) CONSTANT_CLASS_P@1)
+  (if (TREE_CODE (@1) == INTEGER_CST || TREE_CODE (@1) == VECTOR_CST)
+   (scmp @0 (bit_not @1)
+
+
  /* Unordered tests if either argument is a NaN.  */
  (simplify
   (bit_ior (unordered @0 @0) (unordered @1 @1))





[PATCH] MIPS: Update stack-1.c testcase to match micromips jraddiusp instruction.

2015-07-06 Thread Andrew Bennett
Hi,

The stack-1.c testcase fails when being compiled for micromips with the -O0
optimization level.  The reason is the testcase is expecting the following
sequence at the end of the function:

   addiu   $sp,$sp,16
   jrc $31

But for micromips it generates the following:

   jraddiusp   16


As the failure only happens at one optimization level I have decided to just 
change the expected output rather than creating a separate micromips testcase.  
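
A minimal sketch of what the relaxed pattern accepts, written as an
ECMAScript regex in C++ rather than with the Tcl escaping used by the
testsuite.  The function name matches_stack_adjust is hypothetical, and the
sample lines assume tab-separated assembler output.

```cpp
#include <cassert>
#include <regex>
#include <string>

// ECMAScript rendering of the relaxed scan-assembler pattern: either an
// (d)addiu on $sp or a jraddiusp, followed by a non-zero digit.
bool
matches_stack_adjust (const std::string &asm_line)
{
  static const std::regex pattern (
    "\t((d?addiu\t(\\$sp,)?\\$sp,)|jraddiusp\t)[1-9]");
  return std::regex_search (asm_line, pattern);
}
```

Both "\taddiu\t$sp,$sp,16" and "\tjraddiusp\t16" satisfy the pattern, so the
same test line can cover the micromips and non-micromips epilogues.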

I have tested this on the mips-mti-elf target using the
mips32r2/{-mno-micromips/-mmicromips} test options and there are no new
regressions.

The patch and ChangeLog are below.

Ok to commit?



Many thanks,


Andrew


testsuite/
* gcc.target/mips/stack-1.c: Allow testcase to match jraddiusp 
instruction.


diff --git a/gcc/testsuite/gcc.target/mips/stack-1.c 
b/gcc/testsuite/gcc.target/mips/stack-1.c
index a28e4bf..2249a3b 100644
--- a/gcc/testsuite/gcc.target/mips/stack-1.c
+++ b/gcc/testsuite/gcc.target/mips/stack-1.c
@@ -1,4 +1,4 @@
-/* { dg-final { scan-assembler "\td?addiu\t(\\\$sp,)?\\\$sp,\[1-9\]" } } */
+/* { dg-final { scan-assembler "\t((d?addiu\t(\\\$sp,)?\\\$sp,)|jraddiusp\t)\[1-9\]" } } */
 /* { dg-final { scan-assembler "\tlw\t" } } */
 /* { dg-final { scan-assembler-not "\td?addiu\t(\\\$sp,)?\\\$sp,\[1-9\].*\tlw\t" } } */



Re: [RFC 1/2] gthread: Add __gthread_cond_timedwaitonclock

2015-07-06 Thread Jonathan Wakely

On 06/07/15 15:10 +0100, Mike Crowe wrote:

On Monday 06 July 2015 at 14:51:42 +0100, Jonathan Wakely wrote:

This isn't correct, because it's possible to include  before
including any C++ Standard Library header, so the _GLIBCXX_ macro
defined in libstdc++'s c++config.h will not have been defined.


Presumably gthr-posix.h could just include c++config.h to avoid that, but
perhaps it's not that simple which is why gthr-posix.h swings through hoops
to define _GTHREAD_USE_MUTEX_TIMEDLOCK itself?


In theory you could have a GCC installation without libstdc++ so no
c++config.h. These days we could use __has_include(bits/c++config.h)
to detect that and so could actually solve this problem, but I still
don't think we want this in gthr-posix.h anyway.
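
A minimal sketch of the __has_include idea, for illustration only.  The
SKETCH_* macro names are hypothetical;
_GLIBCXX_USE_PTHREAD_COND_TIMEDWAITONCLOCK_NP is the macro from the patch
and is only consulted once the header that would define it is known to
exist.

```cpp
// Probe for libstdc++'s configuration header without requiring it.
#if defined(__has_include)
# if __has_include(<bits/c++config.h>)
#  include <bits/c++config.h>
#  define SKETCH_HAVE_CXX_CONFIG 1
# endif
#endif
#ifndef SKETCH_HAVE_CXX_CONFIG
# define SKETCH_HAVE_CXX_CONFIG 0
#endif

// The feature macro can only be trusted once the header that would
// define it has actually been seen.
#if SKETCH_HAVE_CXX_CONFIG \
    && defined(_GLIBCXX_USE_PTHREAD_COND_TIMEDWAITONCLOCK_NP)
# define SKETCH_USE_COND_TIMEDWAITONCLOCK 1
#else
# define SKETCH_USE_COND_TIMEDWAITONCLOCK 0
#endif

constexpr bool sketch_have_cxx_config = SKETCH_HAVE_CXX_CONFIG != 0;
constexpr bool sketch_use_cond_timedwaitonclock
  = SKETCH_USE_COND_TIMEDWAITONCLOCK != 0;
```

On an installation without libstdc++ both constants are false and nothing
fails to compile, which is the case the plain #if in the patch could not
handle.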


It might make sense to just do this internally in libstdc++ and not
involve gthr-posix.h at all, this is what we do for pthread_rwlock_t
usage in  so you might want to follow that model.


I implemented the new function in gthreads because that seemed to be what
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=41861 was suggesting and I'd
incorrectly assumed that there were non-pthreads backends for gthreads too.


There are non-pthreads backends for gthreads, that's the point of
gthreads. But:

1) If you add a new function to gthreads then it becomes part of the
gthreads API and so in theory it should be implementable by the other
backends. This probably never would be.

2) This isn't even a function that's provided by most pthreads
implementations, so can't even be provided by most users of
gthr-posix.h

3) It is only needed for one place in libstdc++ so doesn't need to be
in the generic gthreads API.


I can try to re-implement condition_variable using pthreads directly but
the change will be somewhat larger and it would break non-pthreads
platforms.


That's not what I mean. I just mean that we don't need to add a
wrapper in gthreads and then use that wrapper in .
You can simply use pthread_cond_timedwaitonclock_np function directly
in  when it is available (as determined by the
_GLIBCXX_ macro, which we do still want).

The comparison to  was meant to point out that we just
use pthreads directly there, instead of adding rwlock support to
gthreads and then using that. We could do the same for uses of this
function, while still using gthreads for the rest of
.



Re: [PATCH][10/n] Remove GENERIC stmt combining from SCCVN

2015-07-06 Thread Andreas Schwab
Kyrill Tkachov  writes:

> This broke some tests on aarch64:
> FAIL: gcc.target/aarch64/subs.c scan-assembler subs\tw[0-9]
> FAIL: gcc.target/aarch64/subs1.c scan-assembler subs\tw[0-9]+, w[0-9]+, 
> w[0-9]+
> FAIL: gcc.target/aarch64/subs1.c scan-assembler subs\tw[0-9]+, w[0-9]+, 
> w[0-9]+, lsl 3
> FAIL: gcc.target/aarch64/subs1.c scan-assembler subs\tx[0-9]+, x[0-9]+, 
> x[0-9]+
> FAIL: gcc.target/aarch64/subs1.c scan-assembler subs\tx[0-9]+, x[0-9]+, 
> x[0-9]+, lsl 3

This is PR66739.

Andreas.

-- 
Andreas Schwab, SUSE Labs, sch...@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."


Re: [PATCH 2/7] remove #if for HAVE_cc0 in combine.c

2015-07-06 Thread Segher Boessenkool
On Mon, Jul 06, 2015 at 08:11:25AM -0400, tbsaunde+...@tbsaunde.org wrote:
> @@ -1327,7 +1322,7 @@ combine_instructions (rtx_insn *f, unsigned int nregs)
>We need this special code because data flow connections
>via CC0 do not get entered in LOG_LINKS.  */
>  
> -   if (JUMP_P (insn)
> +   if (HAVE_cc0 && JUMP_P (insn)
> && (prev = prev_nonnote_insn (insn)) != 0
> && NONJUMP_INSN_P (prev)
> && sets_cc0_p (PATTERN (prev)))

As before (in 6/6), please respect formatting rules.

> @@ -5382,10 +5375,8 @@ subst (rtx x, rtx from, rtx to, int in_dest, int 
> in_cond, int unique_copy)
> && ! (code == SUBREG
>   && MODES_TIEABLE_P (GET_MODE (x),
>   GET_MODE (SUBREG_REG (to
> -#if HAVE_cc0
> -   && ! (code == SET && i == 1 && XEXP (x, 0) == cc0_rtx)
> -#endif
> -   )
> +   && (!HAVE_cc0 || (! (code == SET && i == 1
> +&& XEXP (x, 0) == cc0_rtx

Esp. for things like this  :-)

> -#if HAVE_cc0
> -   && (! reg_mentioned_p (cc0_rtx, SET_SRC (set))
> -   || ((cc0_setter = prev_cc0_setter (tem_insn)) != 
> NULL
> -   && sets_cc0_p (PATTERN (cc0_setter)) > 0))
> -#endif
> -   )
> +   && (!HAVE_cc0
> +   || (! reg_mentioned_p (cc0_rtx, SET_SRC (set))
> +   || ((cc0_setter = prev_cc0_setter (tem_insn)) 
> != NULL
> +   && sets_cc0_p (PATTERN (cc0_setter)) > 
> 0

Line too long now.  This really wants a rewrite anyway, assignment in
conditionals, ewww.  And it will only look worse if you just wrap the
lines.

But please fix the other formatting problems.


Segher


Re: [RFC 2/2] Add steady_clock support to condition_variable

2015-07-06 Thread Jonathan Wakely

On 06/07/15 15:18 +0100, Mike Crowe wrote:

On Monday 06 July 2015 at 14:54:06 +0100, Jonathan Wakely wrote:

Do you have a copyright assignment in place for GCC contributions?


Not yet. From my reading of https://gcc.gnu.org/contribute.html I should
probably ask you for the forms. :-)


Yep :-) I'll send them offlist.



Re: [PATCH 4/7] use #if for HARD_FRAME_POINTER_IS_FRAME_POINTER less

2015-07-06 Thread Segher Boessenkool
On Mon, Jul 06, 2015 at 08:11:27AM -0400, tbsaunde+...@tbsaunde.org wrote:
> From: Trevor Saunders 
> 
> gcc/ChangeLog:
> 
> 2015-07-06  Trevor Saunders  
> 
>   * combine.c (can_combine_def_p): Don't check the value of
>   * HARD_FRAME_POINTER_IS_FRAME_POINTER with the preprocessor.
^ stray asterisk

> @@ -2227,9 +2226,7 @@ combinable_i3pat (rtx_insn *i3, rtx *loc, rtx i2dest, 
> rtx i1dest, rtx i0dest,
> && REG_P (subdest)
> && reg_referenced_p (subdest, PATTERN (i3))
> && REGNO (subdest) != FRAME_POINTER_REGNUM
> -#if !HARD_FRAME_POINTER_IS_FRAME_POINTER
> -   && REGNO (subdest) != HARD_FRAME_POINTER_REGNUM
> -#endif
> +   && (HARD_FRAME_POINTER_IS_FRAME_POINTER || REGNO (subdest) != 
> HARD_FRAME_POINTER_REGNUM)

That line is a bit long ;-)
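
For illustration, a small standalone sketch of the same idea: the preprocessor check becomes an ordinary constant expression inside the condition, wrapped so that no line exceeds 80 columns. The register numbers and the function name regno_is_combinable are hypothetical stand-ins; in GCC the macros come from the target headers.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the target macros.  */
#define FRAME_POINTER_REGNUM 29
#define HARD_FRAME_POINTER_REGNUM 30
#define HARD_FRAME_POINTER_IS_FRAME_POINTER \
  (FRAME_POINTER_REGNUM == HARD_FRAME_POINTER_REGNUM)

/* Return true if REGNO is neither the frame pointer nor, when it is a
   distinct register, the hard frame pointer.  The constant
   sub-expression is folded by the compiler, so no #if is needed.  */
static bool
regno_is_combinable (unsigned int regno)
{
  return (regno != FRAME_POINTER_REGNUM
          && (HARD_FRAME_POINTER_IS_FRAME_POINTER
              || regno != HARD_FRAME_POINTER_REGNUM));
}
```

Both arms are always parsed and type-checked this way, which is the usual argument for preferring it over #if.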


Segher


Re: [PATCH][10/n] Remove GENERIC stmt combining from SCCVN

2015-07-06 Thread Richard Biener
On Mon, 6 Jul 2015, Kyrill Tkachov wrote:

> Hi Richard,
> 
> On 01/07/15 14:03, Richard Biener wrote:
> > This merges the complete comparison patterns from the match-and-simplify
> > branch, leaving incomplete implementations of fold-const.c code alone.
> > 
> > Bootstrapped and tested on x86_64-unknown-linux-gnu, applied.
> > 
> > Richard.
> > 
> > 2015-07-01  Richard Biener  
> > 
> > * fold-const.c (fold_comparison): Move X - Y CMP 0 -> X CMP Y,
> > X * C1 CMP 0 -> X CMP 0, X CMP X, ~X CMP ~Y -> Y CMP X and
> > ~X CMP C -> X CMP' ~C to ...
> > * match.pd: ... patterns here.
> > 
> > 
> >   +/* Transform comparisons of the form X - Y CMP 0 to X CMP Y.
> > +   ??? The transformation is valid for the other operators if overflow
> > +   is undefined for the type, but performing it here badly interacts
> > +   with the transformation in fold_cond_expr_with_comparison which
> > +   attempts to synthetize ABS_EXPR.  */
> > +(for cmp (eq ne)
> > + (simplify
> > +  (cmp (minus @0 @1) integer_zerop)
> > +  (cmp @0 @1)))
> 
> This broke some tests on aarch64:
> FAIL: gcc.target/aarch64/subs.c scan-assembler subs\tw[0-9]
> FAIL: gcc.target/aarch64/subs1.c scan-assembler subs\tw[0-9]+, w[0-9]+,
> w[0-9]+
> FAIL: gcc.target/aarch64/subs1.c scan-assembler subs\tw[0-9]+, w[0-9]+,
> w[0-9]+, lsl 3
> FAIL: gcc.target/aarch64/subs1.c scan-assembler subs\tx[0-9]+, x[0-9]+,
> x[0-9]+
> FAIL: gcc.target/aarch64/subs1.c scan-assembler subs\tx[0-9]+, x[0-9]+,
> x[0-9]+, lsl 3
> 
> To take subs.c as an example:
> There's something odd going on:
> The X - Y CMP 0 -> X CMP Y transformation gets triggered only for the int case
> but
> not the long long case, but the int case (foo) is the place where the rtl ends
> up being:
> 
> (insn 9 4 10 2 (set (reg/v:SI 74 [ l ])
> (minus:SI (reg/v:SI 76 [ x ])
> (reg/v:SI 77 [ y ]))) subs.c:9 254 {subsi3}
>  (nil))
> (insn 10 9 11 2 (set (reg:CC 66 cc)
> (compare:CC (reg/v:SI 76 [ x ])
> (reg/v:SI 77 [ y ])))
> 
> instead of the previous:
> 
> (insn 9 4 10 2 (set (reg/v:SI 74 [ l ])
> (minus:SI (reg/v:SI 76 [ x ])
> (reg/v:SI 77 [ y ]))) subs.c:9 254 {subsi3}
> 
> (insn 10 9 11 2 (set (reg:CC 66 cc)
> (compare:CC (reg/v:SI 74 [ l ])
> (const_int 0 [0])))
> 
> 
> so the transformed X CMP Y does not get matched by combine into a subs.
> Was the transformation before the patch in fold-const.c not getting triggered?

It was prevented from getting triggered by restricting the transform
to single uses (a fix I am testing right now).

Note that in case you'd write

  int l = x - y;
  if (l == 0)
return 5;

  /* { dg-final { scan-assembler "subs\tw\[0-9\]" } } */
  z = x - y ;

the simplification will happen anyway because the redundancy
computing z has not yet been eliminated (a reason why such
single-use checks are not 100% the very much "correct" thing to do).
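
As a plain C illustration of the equivalence behind the X - Y CMP 0 -> X CMP Y pattern for eq/ne (a sketch only; the function names are made up, and unsigned operands are used so the subtraction is well defined for every input):

```c
/* Compare through the difference, as in "l = x - y; if (l == 0)".  */
static int
eq_via_sub (unsigned int x, unsigned int y)
{
  unsigned int l = x - y;	/* wraps modulo 2^N, never undefined */
  return l == 0;
}

/* Compare the operands directly, the form the simplification yields.  */
static int
eq_direct (unsigned int x, unsigned int y)
{
  return x == y;
}
```

The two functions agree on all inputs. What changes is only which form reaches the RTL passes: once the comparison no longer uses the result of the subtraction, combine has no "compare difference with zero" to merge with the subtraction itself.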

> In aarch64 we have patterns to match:
>   [(set (reg:CC_NZ CC_REGNUM)
> (compare:CC_NZ (minus:GPI (match_operand:GPI 1 "register_operand" "r")
>   (match_operand:GPI 2 "register_operand" "r"))
>(const_int 0)))
>(set (match_operand:GPI 0 "register_operand" "=r")
> (minus:GPI (match_dup 1) (match_dup 2)))]
> 
> 
> Should we add a pattern to match:
>   [(set (reg:CC CC_REGNUM)
> (compare:CC (match_operand:GPI 1 "register_operand" "r")
>(match_operand:GPI 2 "register_operand" "r")))
>(set (match_operand:GPI 0 "register_operand" "=r")
> (minus:GPI (match_dup 1) (match_dup 2)))]
> 
> as well?

No, I don't think so.

Richard.

> Kyrill
> 
> > +
> > +/* Transform comparisons of the form X * C1 CMP 0 to X CMP 0 in the
> > +   signed arithmetic case.  That form is created by the compiler
> > +   often enough for folding it to be of value.  One example is in
> > +   computing loop trip counts after Operator Strength Reduction.  */
> > +(for cmp (tcc_comparison)
> > + scmp (swapped_tcc_comparison)
> > + (simplify
> > +  (cmp (mult @0 INTEGER_CST@1) integer_zerop@2)
> > +  /* Handle unfolded multiplication by zero.  */
> > +  (if (integer_zerop (@1))
> > +   (cmp @1 @2))
> > +  (if (ANY_INTEGRAL_TYPE_P (TREE_TYPE (@0))
> > +   && TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (@0)))
> > +   /* If @1 is negative we swap the sense of the comparison.  */
> > +   (if (tree_int_cst_sgn (@1) < 0)
> > +(scmp @0 @2))
> > +   (cmp @0 @2
> > +
> > +/* Simplify comparison of something with itself.  For IEEE
> > +   floating-point, we can only do some of these simplifications.  */
> > +(simplify
> > + (eq @0 @0)
> > + (if (! FLOAT_TYPE_P (TREE_TYPE (@0))
> > +  || ! HONOR_NANS (TYPE_MODE (TREE_TYPE (@0
> > +  { constant_boolean_node (true, type); }))
> > +(for cmp (ge le)
> > + (simplify
> > +  (cmp @0 @0)
> > +  (eq @0 @0)))
> > +(for cmp (ne gt lt)
> > + (simplify
> > +  (cmp @0 @0)
> > +  (if (cmp != NE_EXPR
> > +   || ! FLOAT_TYPE_P (TREE_TYPE (@0))
> > +   || !

Re: [PATCH] libstdc++ os_defines now required for DragonFly

2015-07-06 Thread Jonathan Wakely

On 06/07/15 15:57 +0200, John Marino wrote:

On the development branch of DragonFly BSD, it was discovered that
__LONG_LONG_SUPPORTED was accidentally unconditionally defined.  This had
a positive side effect of allowing GCC conftests to pass for C99 support
via wchar.h.  When the bug was fixed, the wchar C99 conftest started failing,
resulting in a c++ regression where software that previously compiled
now fails due to unknown functions such as wcstoll (since C99 support
changed from "true" to "false").

FreeBSD behaves exactly the same way, and this OS dealt with it with
system-specific defines.
The DragonFly regression is fixed by copying the relevant defines from
the FreeBSD config. (see attached patch).

This patch should be applied to trunk and also backported to GCC-5 branch.


OK, I'll commit it thanks. I think we will be able to undo at least
part of the change on trunk soon-ish, as we're going to improve all
the _GLIBCXX_USE_C99 macros so that they are smarter and work better.



Index: libstdc++-v3/config/os/bsd/dragonfly/os_defines.h
===
--- libstdc++-v3/config/os/bsd/dragonfly/os_defines.h   (revision 225453)
+++ libstdc++-v3/config/os/bsd/dragonfly/os_defines.h   (working copy)
@@ -29,4 +29,9 @@
// System-specific #define, typedefs, corrections, etc, go here.  This
// file will come before all others.

+#define _GLIBCXX_USE_C99_CHECK 1
+#define _GLIBCXX_USE_C99_DYNAMIC (!(__ISO_C_VISIBLE >= 1999))
+#define _GLIBCXX_USE_C99_LONG_LONG_CHECK 1
+#define _GLIBCXX_USE_C99_LONG_LONG_DYNAMIC (_GLIBCXX_USE_C99_DYNAMIC || 
!defined __LONG_LONG_SUPPORTED)
+
#endif




Re: [PATCH][10/n] Remove GENERIC stmt combining from SCCVN

2015-07-06 Thread Kyrill Tkachov


On 06/07/15 15:46, Richard Biener wrote:

On Mon, 6 Jul 2015, Kyrill Tkachov wrote:


Hi Richard,

On 01/07/15 14:03, Richard Biener wrote:

This merges the complete comparison patterns from the match-and-simplify
branch, leaving incomplete implementations of fold-const.c code alone.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied.

Richard.

2015-07-01  Richard Biener  

* fold-const.c (fold_comparison): Move X - Y CMP 0 -> X CMP Y,
X * C1 CMP 0 -> X CMP 0, X CMP X, ~X CMP ~Y -> Y CMP X and
~X CMP C -> X CMP' ~C to ...
* match.pd: ... patterns here.


   +/* Transform comparisons of the form X - Y CMP 0 to X CMP Y.
+   ??? The transformation is valid for the other operators if overflow
+   is undefined for the type, but performing it here badly interacts
+   with the transformation in fold_cond_expr_with_comparison which
+   attempts to synthetize ABS_EXPR.  */
+(for cmp (eq ne)
+ (simplify
+  (cmp (minus @0 @1) integer_zerop)
+  (cmp @0 @1)))

This broke some tests on aarch64:
FAIL: gcc.target/aarch64/subs.c scan-assembler subs\tw[0-9]
FAIL: gcc.target/aarch64/subs1.c scan-assembler subs\tw[0-9]+, w[0-9]+,
w[0-9]+
FAIL: gcc.target/aarch64/subs1.c scan-assembler subs\tw[0-9]+, w[0-9]+,
w[0-9]+, lsl 3
FAIL: gcc.target/aarch64/subs1.c scan-assembler subs\tx[0-9]+, x[0-9]+,
x[0-9]+
FAIL: gcc.target/aarch64/subs1.c scan-assembler subs\tx[0-9]+, x[0-9]+,
x[0-9]+, lsl 3

To take subs.c as an example:
There's something odd going on:
The X - Y CMP 0 -> X CMP Y transformation gets triggered only for the int case
but
not the long long case, but the int case (foo) is the place where the rtl ends
up being:

(insn 9 4 10 2 (set (reg/v:SI 74 [ l ])
 (minus:SI (reg/v:SI 76 [ x ])
 (reg/v:SI 77 [ y ]))) subs.c:9 254 {subsi3}
  (nil))
(insn 10 9 11 2 (set (reg:CC 66 cc)
 (compare:CC (reg/v:SI 76 [ x ])
 (reg/v:SI 77 [ y ])))

instead of the previous:

(insn 9 4 10 2 (set (reg/v:SI 74 [ l ])
 (minus:SI (reg/v:SI 76 [ x ])
 (reg/v:SI 77 [ y ]))) subs.c:9 254 {subsi3}

(insn 10 9 11 2 (set (reg:CC 66 cc)
 (compare:CC (reg/v:SI 74 [ l ])
 (const_int 0 [0])))


so the transformed X CMP Y does not get matched by combine into a subs.
Was the transformation before the patch in fold-const.c not getting triggered?

It was prevented from getting triggered by restricting the transform
to single uses (a fix I am testing right now).

Note that in case you'd write

   int l = x - y;
   if (l == 0)
 return 5;

   /* { dg-final { scan-assembler "subs\tw\[0-9\]" } } */
   z = x - y ;

the simplification will happen anyway because the redundancy
computing z has not yet been eliminated (a reason why such
single-use checks are not 100% the very much "correct" thing to do).


Ok, thanks. Andreas pointed out PR 66739 to me. I had not noticed it.
Sorry for the noise.

Kyrill




In aarch64 we have patterns to match:
   [(set (reg:CC_NZ CC_REGNUM)
 (compare:CC_NZ (minus:GPI (match_operand:GPI 1 "register_operand" "r")
   (match_operand:GPI 2 "register_operand" "r"))
(const_int 0)))
(set (match_operand:GPI 0 "register_operand" "=r")
 (minus:GPI (match_dup 1) (match_dup 2)))]


Should we add a pattern to match:
   [(set (reg:CC CC_REGNUM)
 (compare:CC (match_operand:GPI 1 "register_operand" "r")
(match_operand:GPI 2 "register_operand" "r")))
(set (match_operand:GPI 0 "register_operand" "=r")
 (minus:GPI (match_dup 1) (match_dup 2)))]

as well?

No, I don't think so.

Richard.


Kyrill


+
+/* Transform comparisons of the form X * C1 CMP 0 to X CMP 0 in the
+   signed arithmetic case.  That form is created by the compiler
+   often enough for folding it to be of value.  One example is in
+   computing loop trip counts after Operator Strength Reduction.  */
+(for cmp (tcc_comparison)
+ scmp (swapped_tcc_comparison)
+ (simplify
+  (cmp (mult @0 INTEGER_CST@1) integer_zerop@2)
+  /* Handle unfolded multiplication by zero.  */
+  (if (integer_zerop (@1))
+   (cmp @1 @2))
+  (if (ANY_INTEGRAL_TYPE_P (TREE_TYPE (@0))
+   && TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (@0)))
+   /* If @1 is negative we swap the sense of the comparison.  */
+   (if (tree_int_cst_sgn (@1) < 0)
+(scmp @0 @2))
+   (cmp @0 @2
+
+/* Simplify comparison of something with itself.  For IEEE
+   floating-point, we can only do some of these simplifications.  */
+(simplify
+ (eq @0 @0)
+ (if (! FLOAT_TYPE_P (TREE_TYPE (@0))
+  || ! HONOR_NANS (TYPE_MODE (TREE_TYPE (@0
+  { constant_boolean_node (true, type); }))
+(for cmp (ge le)
+ (simplify
+  (cmp @0 @0)
+  (eq @0 @0)))
+(for cmp (ne gt lt)
+ (simplify
+  (cmp @0 @0)
+  (if (cmp != NE_EXPR
+   || ! FLOAT_TYPE_P (TREE_TYPE (@0))
+   || ! HONOR_NANS (TYPE_MODE (TREE_TYPE (@0
+   { constant_boolean_node (false, type); })))
+
+/* Fold ~X op ~Y as Y op X.  */
+(for cmp (tcc_comparison)
+ (simplify
+  (cm

Re: Get rid of move_iterator in debug checks

2015-07-06 Thread Jonathan Wakely

On 05/07/15 23:23 +0200, François Dumont wrote:

Ok ?


Yes, OK, thanks.



RE: [PATCH] MIPS: Update stack-1.c testcase to match micromips jraddiusp instruction.

2015-07-06 Thread Matthew Fortune
Andrew Bennett  writes:
> The stack-1.c testcase fails when being compiled for micromips with the
> -O0 optimization level.  The reason is the testcase is expecting the
> following sequence at the end of the function:
> 
>addiu   $sp,$sp,16
>jrc $31
> 
> But for micromips it generates the following:
> 
>jraddiusp   16
> 
> 
> As the failure only happens at one optimization level I have decided to
> just change the expected output rather than creating a separate
> micromips testcase.

I'm not sure this is the right approach here. If we get a jraddiusp then
the problem that the test is trying to cover can't possibly happen anyway.
(The test is checking if a load and final stack adjustment are ever
re-ordered from what I can see.)

I'd just mark the test as NOCOMPRESSION instead of just NOMIPS16 and
update the comment to say that it is avoiding SAVE, RESTORE and JRADDIUSP.

Thanks,
Matthew


Re: [PATCH] fix PR46029: reimplement if conversion of loads and stores [2nd submitted version of patch]

2015-07-06 Thread Alan Lawrence

Abe wrote:

On 7/2/15 4:49 AM, Alan Lawrence wrote:

As before, I'm still confused here. This still returns false, i.e. bails out of
if-conversion, if the statement could trap. Doesn't the scratchpad let us handle
that? Or do we just not care because it won't be vectorizable anyway???


This seems like an opportunity for more optimization in the future


==> we get enough benefit from the patch, even without my suggested extra 
change. Ok, fair enough! Thanks for the clarification.



Where can I find info on what the different flag values mean?

 > (I had thought they were booleans [...]

Sorry; I don't know if that is documented anywhere yet.

In this case, (-1) simply means "defaulted": on if the vectorizer is on, and 
off if it is off.
(0) means "user specified no if conversion" and (1) means "user specified [yes] if 
conversion".


Ah, right, that makes sense now. Obviously I would like to see this written in 
doc/ .
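
The three flag values described above can be summarised with a short sketch. The function and parameter names are hypothetical and are not the identifiers used in GCC; only the meaning of -1, 0 and 1 is taken from the explanation in this thread.

```c
#include <stdbool.h>

/* FLAG is -1 when the user gave no option, 0 when the user disabled
   if-conversion and 1 when the user enabled it.  */
static bool
if_conversion_enabled (int flag, bool vectorizer_on)
{
  if (flag == -1)
    return vectorizer_on;	/* defaulted: follow the vectorizer */
  return flag != 0;		/* explicit user choice wins */
}
```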


Cheers, Alan



Re: [PATCH] PR target/66749: Add -march=iamcu to optimize for IA MCU

2015-07-06 Thread Uros Bizjak
On Mon, Jul 6, 2015 at 3:28 PM, H.J. Lu  wrote:
> IA MCU is based on Intel Pentium ISA without x87 and passing parameters
> in registers.  We want to optimize for IA MCU without changing existing
> Pentium codegen.  This patch adds PROCESSOR_IAMCU for -march=iamcu,
> which is based on -march=pentium with updated cost tables.
>
> OK for trunk?
>
> Thanks.
>
>
> H.J.
> --
> gcc/
>
> PR target/66749
> * config/i386/i386.c (iamcu_cost): New.
> (m_IAMCU): Likewise.
> (initial_ix86_arch_features): Disable X86_ARCH_CMOV for m_IAMCU.
> (processor_target_table): Add an entry for "iamcu".
> (processor_alias_table): Likewise.
> (ix86_issue_rate): Handle PROCESSOR_IAMCU.
> (ix86_adjust_cost): Likewise.
> (ia32_multipass_dfa_lookahead): Likewise.
> * config/i386/i386.h (processor_type): Add PROCESSOR_IAMCU.
> * config/i386/x86-tune.def: Updated for m_IAMCU.
>
> gcc/testsuite/
>
> PR target/66749
> * gcc.target/i386/pr66749.c: New test.

I assume there will be separate patch for configure bits that will set
-march=iamcu for i[34567]86-*-elfiamcu target.

This part is OK.

Thanks,
Uros.

> ---
>  gcc/config/i386/i386.c  | 76 
> -
>  gcc/config/i386/i386.h  |  1 +
>  gcc/config/i386/x86-tune.def| 36 +---
>  gcc/testsuite/gcc.target/i386/pr66749.c | 14 ++
>  4 files changed, 111 insertions(+), 16 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr66749.c
>
> diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
> index 7d26e8c..98250c4 100644
> --- a/gcc/config/i386/i386.c
> +++ b/gcc/config/i386/i386.c
> @@ -426,6 +426,74 @@ struct processor_costs pentium_cost = {
>1,   /* cond_not_taken_branch_cost.  */
>  };
>
> +static const
> +struct processor_costs iamcu_cost = {
> +  COSTS_N_INSNS (1),   /* cost of an add instruction */
> +  COSTS_N_INSNS (1) + 1,   /* cost of a lea instruction */
> +  COSTS_N_INSNS (4),   /* variable shift costs */
> +  COSTS_N_INSNS (1),   /* constant shift costs */
> +  {COSTS_N_INSNS (11), /* cost of starting multiply for QI */
> +   COSTS_N_INSNS (11), /*   HI */
> +   COSTS_N_INSNS (11), /*   SI */
> +   COSTS_N_INSNS (11), /*   DI */
> +   COSTS_N_INSNS (11)},/*
> other */
> +  0,   /* cost of multiply per each bit set 
> */
> +  {COSTS_N_INSNS (25), /* cost of a divide/mod for QI */
> +   COSTS_N_INSNS (25), /*  HI */
> +   COSTS_N_INSNS (25), /*  SI */
> +   COSTS_N_INSNS (25), /*  DI */
> +   COSTS_N_INSNS (25)},/*  
> other */
> +  COSTS_N_INSNS (3),   /* cost of movsx */
> +  COSTS_N_INSNS (2),   /* cost of movzx */
> +  8,   /* "large" insn */
> +  6,   /* MOVE_RATIO */
> +  6,/* cost for loading QImode using movzbl 
> */
> +  {2, 4, 2},   /* cost of loading integer registers
> +  in QImode, HImode and SImode.
> +  Relative to reg-reg move (2).  */
> +  {2, 4, 2},   /* cost of storing integer registers 
> */
> +  2,   /* cost of reg,reg fld/fst */
> +  {2, 2, 6},   /* cost of loading fp registers
> +  in SFmode, DFmode and XFmode */
> +  {4, 4, 6},   /* cost of storing fp registers
> +  in SFmode, DFmode and XFmode */
> +  8,   /* cost of moving MMX register */
> +  {8, 8},  /* cost of loading MMX registers
> +  in SImode and DImode */
> +  {8, 8},  /* cost of storing MMX registers
> +  in SImode and DImode */
> +  2,   /* cost of moving SSE register */
> +  {4, 8, 16},  /* cost of loading SSE registers
> +  in SImode, DImode and TImode */
> +  {4, 8, 16},  /* cost of storing SSE registers
> +  in SImode, DImode and TImode */
> +  3,   /* MMX or SSE register to integer */
> +  8,  

Re: [Patch, ARM]: remove TARGET_ASM_FILE_START_APP_OFF

2015-07-06 Thread Ramana Radhakrishnan


On 30/06/15 13:07, Christian Bruel wrote:
> Hi,
> 
> A little bit of polishing around arm/thumb attribute_target emission and 
> testing: Since the arch mode is emitted for each function, the file setting 
> becomes useless or redundant.
> 
> for example with attr_thumb.c:
> 
> =>.arm
> =>.syntax divided
> .file"attr_thumb.c"
> .text
> .align2
> .globalfoo
> =>.syntax unified
> =>.code16
> .thumb_func
> .typefoo, %function
> 
> This patch cleans this up and relaxes the attribute target tests to run on 
> any thumb1/thumb2.
> 
> no new failure on arm-cortex-linux-gnueabi
> 
> and arm-none-eabi for
>   arm-sim/
>   arm-sim//-march=armv7-a
>   arm-sim//-mthumb
>   arm-sim//-mthumb/-march=armv7-a
>   arm-sim/-mflip-thumb/
>   arm-sim/-mflip-thumb//-march=armv7-a
>   arm-sim/-mflip-thumb//-mthumb
>   arm-sim/-mflip-thumb//-mthumb/-march=armv7-a
> 
> is still running but OK so far. OK for trunk once done ?

OK.


regards
Ramana

> 
> many thanks,
> 
> Christian
> 
> 


Re: [PATCH] PR target/66749: Add -march=iamcu to optimize for IA MCU

2015-07-06 Thread H.J. Lu
On Mon, Jul 6, 2015 at 8:13 AM, Uros Bizjak  wrote:
> On Mon, Jul 6, 2015 at 3:28 PM, H.J. Lu  wrote:
>> IA MCU is based on Intel Pentium ISA without x87 and passing parameters
>> in registers.  We want to optimize for IA MCU without changing existing
>> Pentium codegen.  This patch adds PROCESSOR_IAMCU for -march=iamcu,
>> which is based on -march=pentium with updated cost tables.
>>
>> OK for trunk?
>>
>> Thanks.
>>
>>
>> H.J.
>> --
>> gcc/
>>
>> PR target/66749
>> * config/i386/i386.c (iamcu_cost): New.
>> (m_IAMCU): Likewise.
>> (initial_ix86_arch_features): Disable X86_ARCH_CMOV for m_IAMCU.
>> (processor_target_table): Add an entry for "iamcu".
>> (processor_alias_table): Likewise.
>> (ix86_issue_rate): Handle PROCESSOR_IAMCU.
>> (ix86_adjust_cost): Likewise.
>> (ia32_multipass_dfa_lookahead): Likewise.
>> * config/i386/i386.h (processor_type): Add PROCESSOR_IAMCU.
>> * config/i386/x86-tune.def: Updated for m_IAMCU.
>>
>> gcc/testsuite/
>>
>> PR target/66749
>> * gcc.target/i386/pr66749.c: New test.
>
> I assume there will be separate patch for configure bits that will set
> -march=iamcu for i[34567]86-*-elfiamcu target.

That is correct.

> This part is OK.
>

Thanks.

-- 
H.J.


Re: flatten cfgloop.h

2015-07-06 Thread Andrew MacLeod

On 07/06/2015 09:38 AM, Michael Matz wrote:

Hi,

On Sun, 5 Jul 2015, Prathamesh Kulkarni wrote:


Hi,
The attached patches flatten cfgloop.h.
patch-1.diff moves around prototypes and structures to respective header-files.
patch-2.diff (mostly auto-generated) replicates cfgloop.h includes in c files.
Bootstrapped and tested on x86_64-unknown-linux-gnu with all front-ends.
Built on all targets using config-list.mk.
I left includes in cfgloop.h commented with #if 0 ... #endif.
OK for trunk ?

Does nobody else think that header files for one or two prototypes are
fairly silly?


I think it has its uses, but we also have the vast majority of the 
"problem" includes resolved. I didn't remember that these older patches 
were still doing anything other than moving includes.
I'm thinking we need to sit back and do the rebuilding and see where we 
land after I do the include reduction.


 Over the next couple of days I'm planning to submit some 
flattening/re-modularizing patches, so maybe we ought to hold off on this 
for now and maybe then revisit parts of this.  Some of the includes in 
cfgloop.h are impacted and become parts of a backend module header.


Andrew


Re: [gomp4.1] Support #pragma omp target {enter,exit} data

2015-07-06 Thread Ilya Verbin
On Thu, Jul 02, 2015 at 00:06:58 +0300, Ilya Verbin wrote:
> On Tue, Jun 30, 2015 at 18:10:44 +0200, Jakub Jelinek wrote:
> > The thing is whether it is actually a good idea to allocate the enter data
> > allocated objects together.
> > In OpenMP 4.0, generally objects would be allocated and deallocated at the
> > same times, except for multiple host threads trying to map the same 
> > variables
> > into the target.  In OpenMP 4.1, due to enter data/exit data, they can be
> > allocated and freed quite independently, and it is true that is the case
> > even for target data, one can either target data, then target enter data
> > to prevent something from being deallocated, then target data end freeing
> > only parts, etc.  So the question is if we think in real-world the
> > allocation or deallocation will be usually together or not.
> 
> IMHO, it's OK to allocate "target data" objects together and "target enter 
> data"
> objects one by one.  I've implemented this approach in the patch below.
> 
> However, if someone writes a program like this:
> 
>   #pragma omp target data map(tofrom: small, arr[:big])
> {
>   #pragma omp target enter data map(to: small)
> }
>   do_a_lot_of_something ();
>   #pragma omp target exit data map(from: small)
> 
> Big array will be deallocated on target only with 'small' at the end.
> Is this acceptable?

Ping?
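
The concern in the quoted example can be pictured with a toy reference-count model. This is only a sketch of the behaviour described above and does not use libgomp's actual data structures; every name in it is hypothetical. The assumption is that objects mapped by one "target data" construct share a single device block, and that the block is freed only when its last reference goes away.

```c
#include <assert.h>
#include <stdbool.h>

/* One device-side block holding everything mapped together.  */
struct toy_block
{
  int refcount;
  bool allocated;
};

static void
toy_map (struct toy_block *b)
{
  if (b->refcount++ == 0)
    b->allocated = true;
}

static void
toy_unmap (struct toy_block *b)
{
  assert (b->refcount > 0);
  if (--b->refcount == 0)
    b->allocated = false;
}

/* Replay the example: return true if the block holding arr[:big] is
   still allocated after the target data region ends and is released
   only by the final exit data.  */
static bool
toy_big_outlives_region (void)
{
  struct toy_block shared = { 0, false };	/* small and arr[:big] */
  bool alive_after_region;

  toy_map (&shared);	/* target data map(tofrom: small, arr[:big]) */
  toy_map (&shared);	/* target enter data map(to: small) */
  toy_unmap (&shared);	/* end of the target data region */
  alive_after_region = shared.allocated;
  toy_unmap (&shared);	/* target exit data map(from: small) */

  return alive_after_region && !shared.allocated;
}
```

In this model the extra reference taken on 'small' keeps the whole shared block alive, which is why the big array would stay on the device until the final exit data.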

> The patch is not ready though, I don't know how to unmap GOMP_MAP_POINTER 
> vars.
> In gomp_unmap_vars they're unmapped through tgt->list[], but in gomp_exit_data
> it's impossible to find such var in the splay tree, because hostaddr differs
> from the address, used at mapping.

I can keep a splay_tree_key of the GOMP_MAP_POINTER in the new field in
target_mem_desc of the previous var (i.e. corresponding memory block).
Or could you suggest a better approach?

Thanks,
  -- Ilya


RE: [PATCH] MIPS: Update stack-1.c testcase to match micromips jraddiusp instruction.

2015-07-06 Thread Moore, Catherine


> -Original Message-
> From: Matthew Fortune [mailto:matthew.fort...@imgtec.com]
> Sent: Monday, July 06, 2015 11:00 AM
> To: Andrew Bennett; gcc-patches@gcc.gnu.org
> Cc: Moore, Catherine
> Subject: RE: [PATCH] MIPS: Update stack-1.c testcase to match micromips
> jraddiusp instruction.
> 
> Andrew Bennett  writes:
> > The stack-1.c testcase fails when being compiled for micromips with
> > the
> > -O0 optimization level.  The reason is the testcase is expecting the
> > following sequence at the end of the function:
> >
> >addiu   $sp,$sp,16
> >jrc $31
> >
> > But for micromips it generates the following:
> >
> >jraddiusp   16
> >
> >
> > As the failure only happens at one optimization level I have decided
> > to just change the expected output rather than creating a separate
> > micromips testcase.
> 
> I'm not sure this is the right approach here. If we get a jraddiusp then the
> problem that the test is trying to cover can't possibly happen anyway.
> (The test is checking if a load and final stack adjustment are ever re-ordered
> from what I can see.)
> 
> I'd just mark the test as NOCOMPRESSION instead of just NOMIPS16 and
> update the comment to say that it is avoiding SAVE, RESTORE and
> JRADDIUSP.
> 

Another approach would be to add the micromips testcase variant and skip the 
test if code-quality (ie. -O0).
Catherine


Re: [PATCH][AArch64] PR target/66731 Fix fnmul insn with -frounding-math

2015-07-06 Thread Marcus Shawcroft
On 6 July 2015 at 09:20, Szabolcs Nagy  wrote:

> 2015-07-06  Szabolcs Nagy  
>
> * gcc.target/aarch64/fnmul-1.c: New.
> * gcc.target/aarch64/fnmul-2.c: New.
> * gcc.target/aarch64/fnmul-3.c: New.
> * gcc.target/aarch64/fnmul-4.c: New.

+float
+foo_s (float a, float b)
+{
+   /* { dg-final { scan-assembler "fnmul\\ts\[0-9\]+, s\[0-9\]+,
s\[0-9\]+" } } */
+   return -(a * b);
+}

Indentation should be set at two spaces.
/Marcus


Re: [ARM] Fix PR middle-end/65958

2015-07-06 Thread Ramana Radhakrishnan


On 18/06/15 20:02, Eric Botcazou wrote:
>> Please mark this pattern with (set_attr "type" "multiple").
> 
> Done.
> 
>> While I suspect that stack probing is done before any insns with invalid
>> constants in the function, it would be better to model the length of
>> this insn so that the minipool logic is not confused later in terms of
>> placement of constant pools.
> 
> OK, I can put an upper bound.
> 
>> Shouldn't the pattern contain clobbers for the CC register or is that
>> unnecessary for the same reason as above ?
> 
> That's unnecessary, UNSPEC_VOLATILEs are optimization barriers so no CC-
> related instructions can be moved up to before the instruction.
> 
>> Additionally please add
>>
>> (set_attr "conds" "clob")
>>
>> to this pattern so that the CCFSM state machine doesn't go awry in any
>> of these cases.
> 
> Also done.
> 

Thanks - I have no further comments on this patch. We probably need to 
implement the same on AArch64 too in order to avoid similar problems.


OK with the aforementioned changes.

regards
Ramana


Re: [PATCH 1/2][ARM] PR/65956 AAPCS update for alignment attribute

2015-07-06 Thread Alan Lawrence

Trying to push these now (svn!), patch 2 is going first.

I realize my second iteration of patch 1/2, dropped the testcases from the first 
version. Okay to include those as per 
https://gcc.gnu.org/ml/gcc-patches/2015-07/msg00215.html ?


Cheers, Alan

Ramana Radhakrishnan wrote:


On 06/07/15 12:00, Alan Lawrence wrote:

Eric Botcazou wrote:

Technically this is incorrect since AGGREGATE_TYPE_P includes ARRAY_TYPE
and ARRAY_TYPE doesn't have TYPE_FIELDS.  I doubt we could reach that
case though (unless there's a language that allows passing arrays by value).

Ada passes small array types by the method specified by the pass_by_reference 
hook (and large array types by reference).

Ok, thanks. Here's a revised patch that handles array types. Again I've tested 
on both trunk (bootstrap + check-gcc) and gcc-5-branch (profiledbootstrap now 
succeeding + check-gcc). Jakub's pr65956.c testcase also now passes.




The new code lacks a testcase; from what Eric says, it's possible we can write 
one using Ada, but I don't know any Ada myself, so I think any testcase should 
follow in a separate patch.

Neither have I managed to run a check-ada yet, as I don't presently have a 
working Ada compiler with which to bootstrap gcc's Ada frontend. Working on 
this now.


This is OK, the ada testing can go in parallel and we should take this in to 
not delay rc1 any further.



regards
Ramana


--Alan

gcc/ChangeLog:

* config/arm/arm.c (arm_needs_doubleword_align) : Drop any outer
alignment attribute, exploring one level down for records and arrays.






Re: [PATCH 1/2][ARM] PR/65956 AAPCS update for alignment attribute

2015-07-06 Thread Ramana Radhakrishnan


On 06/07/15 17:38, Alan Lawrence wrote:
> Trying to push these now (svn!), patch 2 is going first.
> 
> I realize my second iteration of patch 1/2, dropped the testcases from the 
> first version. Okay to include those as per 
> https://gcc.gnu.org/ml/gcc-patches/2015-07/msg00215.html ?

Yeah the tests are fine to go in as long as the testcases showed no regressions 
;) 

What about Jakub's tests ? Is he adding them in or are you considering them 
here ?

Ramana


[PATCH] Optimize i?86-*-elfiamcu for iamcu by default

2015-07-06 Thread H.J. Lu
Default -mtune=/-march= to iamcu for i[34567]86-*-elfiamcu targets.

OK for trunk?

Thanks.

H.J.
---
* config.gcc (x86_archs): Add iamcu.
(with_cpu): Default to iamcu for i[34567]86-*-elfiamcu.
(with_arch): Likewise.
* doc/invoke.texi: Add iamcu.
---
 gcc/config.gcc  | 8 +++-
 gcc/doc/invoke.texi | 3 +++
 2 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/gcc/config.gcc b/gcc/config.gcc
index 2b3af82..f0405fe 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -585,7 +585,7 @@ tm_defines="$tm_defines LIBC_GLIBC=1 LIBC_UCLIBC=2 
LIBC_BIONIC=3 LIBC_MUSL=4"
 x86_archs="athlon athlon-4 athlon-fx athlon-mp athlon-tbird \
 athlon-xp k6 k6-2 k6-3 geode c3 c3-2 winchip-c6 winchip2 i386 i486 \
 i586 i686 pentium pentium-m pentium-mmx pentium2 pentium3 pentium3m \
-pentium4 pentium4m pentiumpro prescott"
+pentium4 pentium4m pentiumpro prescott iamcu"
 
 # 64-bit x86 processors supported by --with-arch=.  Each processor
 # MUST be separated by exactly one space.
@@ -3278,6 +3278,9 @@ esac
 # This block sets nothing except for with_cpu.
 if test x$with_cpu = x ; then
   case ${target} in
+i[34567]86-*-elfiamcu)
+  with_cpu=iamcu
+  ;;
 i[34567]86-*-*|x86_64-*-*)
   with_cpu=$cpu
   ;;
@@ -3370,6 +3373,9 @@ if test x$with_arch = x ; then
   # Default arch is set via TARGET_SUBTARGET32_ISA_DEFAULT
   # and TARGET_SUBTARGET64_ISA_DEFAULT in config/i386/darwin.h.
   ;;
+i[34567]86-*-elfiamcu)
+  with_arch=iamcu
+  ;;
 i[34567]86-*-*)
   # --with-fpmath sets the default ISA to SSE2, which is the same
   # ISA supported by Pentium 4.
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 69ae0c3..b28e5d6 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -22250,6 +22250,9 @@ Intel i486 CPU@.  (No scheduling is implemented for this chip.)
 @itemx pentium
 Intel Pentium CPU with no MMX support.
 
+@item iamcu
+Intel MCU, based on Intel Pentium CPU.
+
 @item pentium-mmx
 Intel Pentium MMX CPU, based on Pentium core with MMX instruction set support.
 
-- 
2.4.3



Re: [PATCH 1/2][ARM] PR/65956 AAPCS update for alignment attribute

2015-07-06 Thread Alan Lawrence

Ramana Radhakrishnan wrote:
> On 06/07/15 17:38, Alan Lawrence wrote:
>> Trying to push these now (svn!), patch 2 is going first.
>>
>> I realize my second iteration of patch 1/2, dropped the testcases from the
>> first version. Okay to include those as per
>> https://gcc.gnu.org/ml/gcc-patches/2015-07/msg00215.html ?
>
> Yeah the tests are fine to go in as long as the testcases showed no regressions ;)
>
> What about Jakub's tests ? Is he adding them in or are you considering them
> here ?
>
> Ramana



I'll add Jakub's test, but as a separate commit, I wouldn't want to claim 
authorship of that one ;-)


--Alan



[PATCH, i386]: Use extv and extzv

2015-07-06 Thread Uros Bizjak
Hello!

Attached patch renames (obsolete) extv and extzv expanders to
extv<mode> and extzv<mode>. Since the patch uses SImode for operands 2
and 3 on both 32-bit and 64-bit targets, we can remove a couple of BT
patterns that were used for DImode operands.

The patch also removes some modes from const_int operands and
introduces constraints where needed.

2015-07-06  Uros Bizjak  

* config/i386/i386.md (extv<mode>): Rename from extv.  Use SWI24
modes for operands 0 and 1.  Use SImode for operands 2 and 3.
Copy operand 1 to a temporary if !ext_register_operand.  Remove
ancient extract_bit_field workaround.
(*extv<mode>): Rename from *mov<mode>_extv_1.
(*extvqi): Rename from *movqi_extv_1.
(extzv<mode>): Rename from extzv.  Use SWI248 modes for
operands 0 and 1.  Use SImode for operands 2 and 3.  Copy operand 1
to a temporary if !ext_register_operand.  Remove ancient
extract_bit_field workaround.
(*extzv<mode>): Rename from *mov<mode>_extzv_1.
(*extzvqi): Rename from *movqi_extzv_2.
(*testqi_ext_3): Remove modes from const_int_operand predicated
operands.  Add "n" constraint.
(*btsq, *btrq, *btcq): Remove mode from const_0_to_63 predicated
operand.  Add "J" constraint.
(*btsq, *btrq, *btcq peephole2s): Remove mode from
const_0_to_63 predicated operand.
(regmode): New insn attribute.
(*bt<mode>): Use SImode for operand 1.  Change operand 1 predicate
to nonmemory_operand.  Use regmode insn attribute.
(*jcc_bt<mode>_1): Convert operand 2 to SImode.
(*jcc_bt<mode>_mask): Remove mode from operand 3.
(*jcc_btsi_1, *jcc_btsi_mask_1): Remove patterns.
(tbm_bextri_<mode>): Remove modes from const_0_to_255 predicated
operands.  Use "N" constraint instead of "n".

Tested on x86_64-linux-gnu {,-m32} and committed to mainline SVN.

Uros.
Index: config/i386/i386.md
===
--- config/i386/i386.md (revision 225432)
+++ config/i386/i386.md (working copy)
@@ -2675,7 +2675,22 @@
(set_attr "mode" "")
(set_attr "length_immediate" "0")])
 
-(define_insn "*mov_extv_1"
+(define_expand "extv"
+  [(set (match_operand:SWI24 0 "register_operand")
+   (sign_extract:SWI24 (match_operand:SWI24 1 "register_operand")
+   (match_operand:SI 2 "const_int_operand")
+   (match_operand:SI 3 "const_int_operand")))]
+  ""
+{
+  /* Handle extractions from %ah et al.  */
+  if (INTVAL (operands[2]) != 8 || INTVAL (operands[3]) != 8)
+FAIL;
+
+  if (! ext_register_operand (operands[1], VOIDmode))
+operands[1] = copy_to_reg (operands[1]);
+})
+
+(define_insn "*extv"
   [(set (match_operand:SWI24 0 "register_operand" "=R")
(sign_extract:SWI24 (match_operand 1 "ext_register_operand" "Q")
(const_int 8)
@@ -2685,7 +2700,7 @@
   [(set_attr "type" "imovx")
(set_attr "mode" "SI")])
 
-(define_insn "*movqi_extv_1"
+(define_insn "*extvqi"
   [(set (match_operand:QI 0 "nonimmediate_x64nomem_operand" "=Q,?R,m")
 (sign_extract:QI (match_operand 1 "ext_register_operand" "Q,Q,Q")
  (const_int 8)
@@ -2712,17 +2727,32 @@
(const_string "SI")
(const_string "QI")))])
 
-(define_insn "*mov_extzv_1"
-  [(set (match_operand:SWI48 0 "register_operand" "=R")
-   (zero_extract:SWI48 (match_operand 1 "ext_register_operand" "Q")
-   (const_int 8)
-   (const_int 8)))]
+(define_expand "extzv"
+  [(set (match_operand:SWI248 0 "register_operand")
+   (zero_extract:SWI248 (match_operand:SWI248 1 "register_operand")
+(match_operand:SI 2 "const_int_operand")
+(match_operand:SI 3 "const_int_operand")))]
   ""
+{
+  /* Handle extractions from %ah et al.  */
+  if (INTVAL (operands[2]) != 8 || INTVAL (operands[3]) != 8)
+FAIL;
+
+  if (! ext_register_operand (operands[1], VOIDmode))
+operands[1] = copy_to_reg (operands[1]);
+})
+
+(define_insn "*extzv"
+  [(set (match_operand:SWI248 0 "register_operand" "=R")
+   (zero_extract:SWI248 (match_operand 1 "ext_register_operand" "Q")
+(const_int 8)
+(const_int 8)))]
+  ""
   "movz{bl|x}\t{%h1, %k0|%k0, %h1}"
   [(set_attr "type" "imovx")
(set_attr "mode" "SI")])
 
-(define_insn "*movqi_extzv_2"
+(define_insn "*extzvqi"
   [(set (match_operand:QI 0 "nonimmediate_x64nomem_operand" "=Q,?R,m")
 (subreg:QI
  (zero_extract:SI (match_operand 1 "ext_register_operand" "Q,Q,Q")
@@ -2752,8 +2782,8 @@
 
 (define_insn "mov_insv_1"
   [(set (zero_extract:SWI48 (match_operand 0 "ext_register_operand" "+Q,Q")
-(const_int 8)
-(const_int 8))
+   (const_int 8)
+   (const_int 8))
(match_operand:SWI48 1 "general_x64nomem_operand" "Qn,m"))]
   ""
 {
@@ -7583,8 +7613,8 @@
   [(set (reg FLAGS_REG)
(compare (zero

Re: [PATCH] Optimize i?86-*-elfiamcu for iamcu by default

2015-07-06 Thread Uros Bizjak
On Mon, Jul 6, 2015 at 6:43 PM, H.J. Lu  wrote:
> Default -mtune=/-march= to iamcu for i[34567]86-*-elfiamcu targets.
>
> OK for trunk?
>
> Thanks.
>
> H.J.
> ---
> * config.gcc (x86_archs): Add iamcu.
> (with_cpu): Default to iamcu for i[34567]86-*-elfiamcu.
> (with_arch): Likewise.
> * doc/invoke.texi: Add iamcu.

OK.

Thanks,
Uros.

> ---
>  gcc/config.gcc  | 8 +++-
>  gcc/doc/invoke.texi | 3 +++
>  2 files changed, 10 insertions(+), 1 deletion(-)
>
> diff --git a/gcc/config.gcc b/gcc/config.gcc
> index 2b3af82..f0405fe 100644
> --- a/gcc/config.gcc
> +++ b/gcc/config.gcc
> @@ -585,7 +585,7 @@ tm_defines="$tm_defines LIBC_GLIBC=1 LIBC_UCLIBC=2 
> LIBC_BIONIC=3 LIBC_MUSL=4"
>  x86_archs="athlon athlon-4 athlon-fx athlon-mp athlon-tbird \
>  athlon-xp k6 k6-2 k6-3 geode c3 c3-2 winchip-c6 winchip2 i386 i486 \
>  i586 i686 pentium pentium-m pentium-mmx pentium2 pentium3 pentium3m \
> -pentium4 pentium4m pentiumpro prescott"
> +pentium4 pentium4m pentiumpro prescott iamcu"
>
>  # 64-bit x86 processors supported by --with-arch=.  Each processor
>  # MUST be separated by exactly one space.
> @@ -3278,6 +3278,9 @@ esac
>  # This block sets nothing except for with_cpu.
>  if test x$with_cpu = x ; then
>case ${target} in
> +i[34567]86-*-elfiamcu)
> +  with_cpu=iamcu
> +  ;;
>  i[34567]86-*-*|x86_64-*-*)
>with_cpu=$cpu
>;;
> @@ -3370,6 +3373,9 @@ if test x$with_arch = x ; then
># Default arch is set via TARGET_SUBTARGET32_ISA_DEFAULT
># and TARGET_SUBTARGET64_ISA_DEFAULT in config/i386/darwin.h.
>;;
> +i[34567]86-*-elfiamcu)
> +  with_arch=iamcu
> +  ;;
>  i[34567]86-*-*)
># --with-fpmath sets the default ISA to SSE2, which is the same
># ISA supported by Pentium 4.
> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index 69ae0c3..b28e5d6 100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -22250,6 +22250,9 @@ Intel i486 CPU@.  (No scheduling is implemented for 
> this chip.)
>  @itemx pentium
>  Intel Pentium CPU with no MMX support.
>
> +@item iamcu
> +Intel MCU, based on Intel Pentium CPU.
> +
>  @item pentium-mmx
>  Intel Pentium MMX CPU, based on Pentium core with MMX instruction set 
> support.
>
> --
> 2.4.3
>


Re: [PATCH] PR target/66749: Add -march=iamcu to optimize for IA MCU

2015-07-06 Thread Dominique d'Humières
This breaks bootstrap on x86_64-apple-darwin14:

../../work/gcc/config/i386/i386-c.c: In function 'void ix86_target_macros_internal(long long int, processor_type, processor_type, fpmath_unit, void (*)(cpp_reader*, const char*))':
../../work/gcc/config/i386/i386-c.c:59:10: error: enumeration value 'PROCESSOR_IAMCU' not handled in switch [-Werror=switch]
   switch (arch)
          ^
../../work/gcc/config/i386/i386-c.c:188:10: error: enumeration value 'PROCESSOR_IAMCU' not handled in switch [-Werror=switch]
   switch (tune)

TIA

Dominique

Re: [patch] [fixincludes] Ignore .DS_Store junk files when running make check

2015-07-06 Thread Andreas Schwab
Eric Gallager  writes:

> I was just matching the code that was already used there... should the
> lines to ignore the CVS and .svn folders be re-written into the style
> you propose, too?

Yes, that would be an improvement.

Andreas.

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."


Re: [PATCH] PR target/66749: Add -march=iamcu to optimize for IA MCU

2015-07-06 Thread H.J. Lu
On Mon, Jul 6, 2015 at 9:58 AM, Dominique d'Humières  wrote:
> This breaks bootstrap on x86_64-apple-darwin14:
>
> ../../work/gcc/config/i386/i386-c.c: In function 'void ix86_target_macros_internal(long long int, processor_type, processor_type, fpmath_unit, void (*)(cpp_reader*, const char*))':
> ../../work/gcc/config/i386/i386-c.c:59:10: error: enumeration value 'PROCESSOR_IAMCU' not handled in switch [-Werror=switch]
>    switch (arch)
>           ^
> ../../work/gcc/config/i386/i386-c.c:188:10: error: enumeration value 'PROCESSOR_IAMCU' not handled in switch [-Werror=switch]
>    switch (tune)
>
> TIA
>
> Dominique

Here is a patch.  OK for trunk?


-- 
H.J.
From 61f75507ed4517aecf069c1964760a54078f9124 Mon Sep 17 00:00:00 2001
From: "H.J. Lu" 
Date: Mon, 6 Jul 2015 10:01:03 -0700
Subject: [PATCH] Handle PROCESSOR_IAMCU in ix86_target_macros_internal

Define __i586__/__pentium__ for -march=iamcu and __tune_iamcu__ for
-mtune=iamcu.

	* config/i386/i386-c.c (ix86_target_macros_internal): Handle
	PROCESSOR_IAMCU.
---
 gcc/config/i386/i386-c.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/gcc/config/i386/i386-c.c b/gcc/config/i386/i386-c.c
index 304ce55..d95772b 100644
--- a/gcc/config/i386/i386-c.c
+++ b/gcc/config/i386/i386-c.c
@@ -64,6 +64,8 @@ ix86_target_macros_internal (HOST_WIDE_INT isa_flag,
   def_or_undef (parse_in, "__i486");
   def_or_undef (parse_in, "__i486__");
   break;
+case PROCESSOR_IAMCU:
+  /* Intel MCU is based on Intel Pentium CPU.  */
 case PROCESSOR_PENTIUM:
   def_or_undef (parse_in, "__i586");
   def_or_undef (parse_in, "__i586__");
@@ -285,6 +287,9 @@ ix86_target_macros_internal (HOST_WIDE_INT isa_flag,
 case PROCESSOR_KNL:
   def_or_undef (parse_in, "__tune_knl__");
   break;
+case PROCESSOR_IAMCU:
+  def_or_undef (parse_in, "__tune_iamcu__");
+  break;
 case PROCESSOR_INTEL:
 case PROCESSOR_GENERIC:
   break;
-- 
2.4.3
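
For anyone wanting to confirm the effect of this patch on a built compiler, a small probe along the following lines could be compiled with -march=iamcu and/or -mtune=iamcu. This is only a sketch: the macro names come from the patch description above, while the helper names arch_macro and tuned_for_iamcu are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Report which architecture macro family the compiler predefined.
   With the patch applied, -march=iamcu is expected to take the same
   path as -march=pentium.  (Hypothetical helper, not part of GCC.)  */
static const char *
arch_macro (void)
{
#if defined (__i586__) || defined (__pentium__)
  return "i586/pentium";
#else
  return "other";
#endif
}

/* Return 1 when the compiler was asked to tune for IA MCU.
   (Hypothetical helper, not part of GCC.)  */
static int
tuned_for_iamcu (void)
{
#ifdef __tune_iamcu__
  return 1;
#else
  return 0;
#endif
}
```

On any other target or tuning the probe simply reports "other" and 0, so it is safe to build anywhere.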



Re: [PATCH] Enable two UNSIGNED_FLOAT simplifications in simplify_unary_operation_1

2015-07-06 Thread Kyrill Tkachov


On 23/06/15 11:43, Renlin Li wrote:

Hi Christophe,

Yes, we have also noticed this failure.

Here I have a simple patch to remove the mfloat-abi option for
hard-float toolchain. The default abi is used.
For non-hardfloat toolchain, softfp abi is specified.

I have checked with arm-none-eabi and arm-none-linux-gnueabihf
toolchain, this problem should be resolved by this patch.

Okay to commit?


Ok.
Thanks,
Kyrill




gcc/testsuite/ChangeLog:

2015-06-23  Renlin Li  

  * gcc.target/arm/unsigned-float.c: Different options for hf toolchain.


On 16/06/15 14:33, Christophe Lyon wrote:

On 20 March 2015 at 18:03, Renlin Li  wrote:

Hi all,

This is a simple patch to enable two simplifications for UNSIGNED_FLOAT
expression.

For the following rtx patterns, they can be simplified when the integer x
can be
represented in float mode without precision loss:

float_truncate (float x) --> float x
float_extend (float x) --> float x

Those two simplifications are also applicable to UNSIGNED_FLOAT expression.

For example, compile the following code using aarch64-none-elf toolchain
with -O1 flag.
double
f1 (uint16_t x)
{
return (double)(float)x;
}
Before the change, the compiler generates the following code:
f1:
  uxth    w0, w0
  ucvtf   s0, w0
  fcvt    d0, s0
  ret
After the change, the following simplified asm code snippet is generated:
f1:
  uxth    w0, w0
  ucvtf   d0, w0
  ret
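
The precision argument for this example can be checked on the host: every uint16_t value fits exactly in the 24-bit significand of an IEEE single-precision float, so going through float on the way to double loses nothing. A minimal sketch, assuming IEEE float/double; the helper names are illustrative and not part of the patch:

```c
#include <assert.h>
#include <stdint.h>

/* The conversion as written in f1: integer -> float -> double.  */
static double
via_float (uint16_t x)
{
  return (double) (float) x;
}

/* The simplified form: integer -> double directly.  */
static double
direct (uint16_t x)
{
  return (double) x;
}

/* Count the uint16_t inputs for which the two forms disagree.  */
static unsigned
count_mismatches (void)
{
  unsigned bad = 0;
  for (uint32_t i = 0; i <= UINT16_MAX; i++)
    if (via_float ((uint16_t) i) != direct ((uint16_t) i))
      bad++;
  return bad;
}
```

The same check would not hold for a 32-bit unsigned input, which is why the simplification is conditional on the integer being representable in the float mode.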


aarch64-none-elf regression test runs Okay. x86_64 bootstraps Okay.
Okay to commit?

gcc/ChangeLog:

2015-03-20  Renlin Li  

  * simplify-rtx.c (simplify_unary_operation_1): Fix a typo. Enable two
  simplifications for UNSIGNED_FLOAT.

gcc/testsuite/ChangeLog:

2015-03-20  Renlin Li  

  * gcc.target/aarch64/unsigned-float.c: New.
  * gcc.target/arm/unsigned-float.c: New.

This new test fails on ARM targets defaulting to hard-float which have
no softfp multilib.
I'm not sure about the best way to fix this.

Note that dg-require-effective-target arm_vfp_ok passes, but the
testcase fails because it includes stdint.h, leading to:
sysroot-arm-none-linux-gnueabihf/usr/include/gnu/stubs.h:7:29: fatal
error: gnu/stubs-soft.h: No such file or directory

Christophe.





Re: [gomp4.1] Support #pragma omp target {enter,exit} data

2015-07-06 Thread Jakub Jelinek
On Mon, Jul 06, 2015 at 06:34:25PM +0300, Ilya Verbin wrote:
> On Thu, Jul 02, 2015 at 00:06:58 +0300, Ilya Verbin wrote:
> > On Tue, Jun 30, 2015 at 18:10:44 +0200, Jakub Jelinek wrote:
> > > The thing is whether it is actually a good idea to allocate the enter data
> > > allocated objects together.
> > > In OpenMP 4.0, generally objects would be allocated and deallocated at the
> > > same times, except for multiple host threads trying to map the same 
> > > variables
> > > into the target.  In OpenMP 4.1, due to enter data/exit data, they can be
> > > allocated and freed quite independently, and it is true that is the case
> > > even for target data, one can either target data, then target enter data
> > > to prevent something from being deallocated, then target data end freeing
> > > only parts, etc.  So the question is if we think in real-world the
> > > allocation or deallocation will be usually together or not.
> > 
> > IMHO, it's OK to allocate "target data" objects together and "target enter
> > data" objects one by one.  I've implemented this approach in the patch below.
> > 
> > However, if someone writes a program like this:
> > 
> >   #pragma omp target data map(tofrom: small, arr[:big])
> > {
> >   #pragma omp target enter data map(to: small)
> > }
> >   do_a_lot_of_something ();
> >   #pragma omp target exit data map(from: small)
> > 
> > Big array will be deallocated on target only with 'small' at the end.
> > Is this acceptable?
> 
> Ping?

I think it is.

> > The patch is not ready though, I don't know how to unmap GOMP_MAP_POINTER 
> > vars.
> > In gomp_unmap_vars they're unmapped through tgt->list[], but in 
> > gomp_exit_data
> > it's impossible to find such var in the splay tree, because hostaddr differs
> > from the address, used at mapping.
> 
> I can keep a splay_tree_key of the GOMP_MAP_POINTER in the new field in
> target_mem_desc of the previous var (i.e. corresponding memory block).
> Or could you suggest a better approach?

What exactly do you have in mind here?

void foo (int *p)
{
#pragma omp enter data (to:p[10])
...
#pragma omp exit data (from:p[10])
}

where the latter will only deallocate &p[0] ... &p[9], but not &p?
I've asked for clarification in that case, but if it should deallocate (or
decrease the counter) for &p too, then I think this is something for the
frontends to handle during handling of array sections in map clause, or
during gimplification or omp lowering.

Jakub


Re: [RFC] two-phase marking in gt_cleare_cache

2015-07-06 Thread Tom de Vries

On 06/07/15 15:29, Richard Biener wrote:
> On Mon, Jul 6, 2015 at 3:25 PM, Richard Biener wrote:
>> On Mon, Jul 6, 2015 at 10:57 AM, Tom de Vries wrote:
>>> Hi,
>>>
>>> Using attached untested patch, I managed to minimize a test-case failure for
>>> PR 66714.
>>>
>>> The patch introduces two-phase marking in gt_cleare_cache:
>>> - first phase, it loops over all the hash table entries and removes
>>>    those which are dead
>>> - second phase, it runs over all the live hash table entries and marks
>>>    live items that are reachable from those live entries
>>>
>>> By doing so, we make the behaviour of gt_cleare_cache independent of the
>>> order in which the entries are visited, turning:
>>> - hard-to-trigger bugs which trigger for one visiting order but not for
>>>    another, into
>>> - more easily triggered bugs which trigger for any visiting order.
>>>
>>> Any comments?
>>
>> I think it is only half-way correct in your proposed change.  You only
>> fix the issue for hashes of the same kind.  To truly fix the issue you'd
>> have to change generated code for gt_clear_caches () and provide
>> a clearing-only implementation (or pass an operation mode bool to
>> the core worker in hash-table.h).

[ Btw, we have been discussing a similar issue before:
https://gcc.gnu.org/ml/gcc/2010-07/msg00077.html ]

True, the problem exists at the scope of all variables marked with
'cache', and this patch addresses the problem only within a single variable.

> Hmm, and don't we rather want to first mark and _then_ clear?

I. In favor of first clear and then mark:

It allows for:
- a lazy one phase implementation for !ENABLE_CHECKING where
  you do a single clear-or-mark phase (so the clear is lazy).
- an eager two phase implementation for ENABLE_CHECKING (where the
  clear is eager)
The approach of first a marking phase and then a clearing phase means
you always have to do these two phases (you can't do the marking lazily).

First mark and then clear means the marking should be done iteratively.
Each time you mark something live, another entry in another hash table
could become live. Marking iteratively could become quite costly.

II. In favor of first mark and then clear:

The users of garbage collection will need to be less precise.

> Because
> if entry B in the hash is live and would keep A live then A _is_ kept in the
> end but you'll remove it from the hash, possibly no longer using a still
> live copy.


I'm not sure I understand the scenario you're concerned about, but ... 
say we have

- entry B: item B -> item A
- entry A: item A -> item Z

If you do clear first and mark second, and you start out with item B 
live and item A dead:

- during the clearing phase you clear entry A and keep entry B, and
- during the marking phase you mark item A live.

So we no longer have entry A, but item A is kept and entry B is kept.
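
The clear-first, mark-second scenario above can be played through on a toy model. The sketch below is not the gt_cleare_cache code: the types and function names are invented for illustration, and "dead" is assumed to mean simply "the entry's key item is not marked".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for GC-managed items and cache entries (hypothetical).  */
struct item { bool marked; };
struct entry { struct item *key; struct item *value; bool present; };

/* Phase 1: drop every entry whose key item is not marked live.  */
static void
clear_dead (struct entry *tab, size_t n)
{
  for (size_t i = 0; i < n; i++)
    if (tab[i].present && !tab[i].key->marked)
      tab[i].present = false;
}

/* Phase 2: mark whatever the surviving entries point to.  */
static void
mark_live (struct entry *tab, size_t n)
{
  for (size_t i = 0; i < n; i++)
    if (tab[i].present)
      tab[i].value->marked = true;
}

/* Clear first, then mark; the result does not depend on entry order.  */
static void
two_phase (struct entry *tab, size_t n)
{
  clear_dead (tab, n);
  mark_live (tab, n);
}
```

With item B live and item A dead on entry, entry A is removed in phase 1 and item A is marked in phase 2 through entry B, whichever of the two entries is visited first.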

Thanks,
- Tom



RE: [Patch, MIPS] Enable fp-contract on MIPS and update -mfused-madd

2015-07-06 Thread Steve Ellcey
On Mon, 2015-06-29 at 12:07 -0700, Matthew Fortune wrote:

> If nobody can identify any further functional issues with this patch then
> I'd like to get this committed and pursue enhancements as a second round.
> Catherine, would you be happy for this to be committed on that basis?
> 
> Thanks,
> Matthew

Sorry for the delay (I was on vacation), I fixed up the ChangeLog issues
Maciej pointed out and checked this patch in.

Steve Ellcey
sell...@imgtec.com



Re: [PATCH 2/2][ARM] fix movdi expander to avoid illegal ldrd/strd

2015-07-06 Thread Alan Lawrence

Richard Earnshaw wrote:
> On 03/07/15 16:27, Alan Lawrence wrote:
>> The previous patch caused a regression in
>> gcc.c-torture/execute/20040709-1.c at -O0 (only), and the new
>> align_rec2.c test fails, both outputting an illegal assembler
>> instruction (ldrd on an odd-numbered reg) from output_move_double in
>> arm.c. Most routes have checks against such an illegal instruction, but
>> expanding a function call can directly name such impossible register
>> (pairs), bypassing the normal checks.
>>
>> gcc/ChangeLog:
>>
>> * config/arm/arm.md (movdi): Avoid odd-number ldrd/strd in ARM state.
>
> OK.


Both patches, plus Jakub's test, pushed onto trunk (r221461/5/6), and 
gcc-5-branch (r225467/9/70), with an obvious comment fix to the movdi patch 
(LDRD's into, STRD's from), as below.


Cheers, Alan


Index: gcc/config/arm/arm.md
===
--- gcc/config/arm/arm.md   (revision 225457)
+++ gcc/config/arm/arm.md   (working copy)
@@ -5481,6 +5481,42 @@
   if (!REG_P (operands[0]))
operands[1] = force_reg (DImode, operands[1]);
 }
+  if (REG_P (operands[0]) && REGNO (operands[0]) < FIRST_VIRTUAL_REGISTER
+  && !HARD_REGNO_MODE_OK (REGNO (operands[0]), DImode))
+{
+  /* Avoid LDRD's into an odd-numbered register pair in ARM state
+when expanding function calls.  */
+  gcc_assert (can_create_pseudo_p ());
+  if (MEM_P (operands[1]) && MEM_VOLATILE_P (operands[1]))
+   {
+ /* Perform load into legal reg pair first, then move.  */
+ rtx reg = gen_reg_rtx (DImode);
+ emit_insn (gen_movdi (reg, operands[1]));
+ operands[1] = reg;
+   }
+  emit_move_insn (gen_lowpart (SImode, operands[0]),
+ gen_lowpart (SImode, operands[1]));
+  emit_move_insn (gen_highpart (SImode, operands[0]),
+ gen_highpart (SImode, operands[1]));
+  DONE;
+}
+  else if (REG_P (operands[1]) && REGNO (operands[1]) < FIRST_VIRTUAL_REGISTER
+  && !HARD_REGNO_MODE_OK (REGNO (operands[1]), DImode))
+{
+  /* Avoid STRD's from an odd-numbered register pair in ARM state
+when expanding function prologue.  */
+  gcc_assert (can_create_pseudo_p ());
+  rtx split_dest = (MEM_P (operands[0]) && MEM_VOLATILE_P (operands[0]))
+  ? gen_reg_rtx (DImode)
+  : operands[0];
+  emit_move_insn (gen_lowpart (SImode, split_dest),
+ gen_lowpart (SImode, operands[1]));
+  emit_move_insn (gen_highpart (SImode, split_dest),
+ gen_highpart (SImode, operands[1]));
+  if (split_dest != operands[0])
+   emit_insn (gen_movdi (operands[0], split_dest));
+  DONE;
+}
   "
 )




Re: [PATCH] fix PR46029: reimplement if conversion of loads and stores [2nd submitted version of patch]

2015-07-06 Thread Abe

[Abe wrote:]

This seems like an opportunity for more optimization in the future


[On 7/6/15 10:09 AM, Alan Lawrence wrote:]

we get enough benefit from the patch, even without my suggested extra change. 
Ok, fair enough! Thanks for the clarification.


You are welcome, sir.


[Alan wrote:]

Where can I find info on what the different flag values mean?
(I had thought they were booleans [...]


[Abe wrote:]

Sorry; I don`t know if that is documented anywhere yet.
In this case, (-1) simply means "defaulted": on if the vectorizer is on, and 
off if it is off.
(0) means "user specified no if conversion" and (1) means "user specified [yes] if 
conversion".
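
Spelled out as code, the three flag values described above would resolve roughly as follows. This is a sketch of the described behaviour only; the function and parameter names are hypothetical and do not correspond to identifiers in the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Resolve a tri-state option:
     -1  defaulted: follow the vectorizer setting
      0  user explicitly disabled if conversion
      1  user explicitly enabled if conversion
   (Hypothetical helper, for illustration only.)  */
static bool
if_conversion_enabled (int flag, bool vectorizer_on)
{
  assert (flag >= -1 && flag <= 1);
  if (flag == -1)
    return vectorizer_on;
  return flag == 1;
}
```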


[Alan wrote:]

Ah, right, that makes sense now. Obviously I would like to see this written in 
doc/ .


Please consider it added to the work queue.  ;-)

Regards,

Abe



Re: [PATCH] PR target/66749: Add -march=iamcu to optimize for IA MCU

2015-07-06 Thread Uros Bizjak
On Mon, Jul 6, 2015 at 7:03 PM, H.J. Lu  wrote:
> On Mon, Jul 6, 2015 at 9:58 AM, Dominique d'Humières  
> wrote:
>> This breaks bootstrap on x86_64-apple-darwin14:
>>
>> ../../work/gcc/config/i386/i386-c.c: In function 'void ix86_target_macros_internal(long long int, processor_type, processor_type, fpmath_unit, void (*)(cpp_reader*, const char*))':
>> ../../work/gcc/config/i386/i386-c.c:59:10: error: enumeration value 'PROCESSOR_IAMCU' not handled in switch [-Werror=switch]
>>    switch (arch)
>>           ^
>> ../../work/gcc/config/i386/i386-c.c:188:10: error: enumeration value 'PROCESSOR_IAMCU' not handled in switch [-Werror=switch]
>>    switch (tune)
>>
>> TIA
>>
>> Dominique
>
> Here is a patch.  OK for trunk?

This one is obvious, so OK.

Thanks,
Uros.


Re: flatten cfgloop.h

2015-07-06 Thread Jeff Law

On 07/06/2015 07:38 AM, Michael Matz wrote:
> Hi,
>
> On Sun, 5 Jul 2015, Prathamesh Kulkarni wrote:
>
>> Hi,
>> The attached patches flatten cfgloop.h.
>> patch-1.diff moves around prototypes and structures to respective header-files.
>> patch-2.diff (mostly auto-generated) replicates cfgloop.h includes in c files.
>> Bootstrapped and tested on x86_64-unknown-linux-gnu with all front-ends.
>> Built on all targets using config-list.mk.
>> I left includes in cfgloop.h commented with #if 0 ... #endif.
>> OK for trunk ?
>
> Does nobody else think that header files for one or two prototypes are
> fairly silly?

Perhaps, but having a .h file for each .c file's exported objects means
that we can implement a reasonable policy around where functions are
prototyped or structures declared.

Contrast to "I put foo in expr.h because that was the most convenient
place" which over 25+ years has made our header file dependencies a
horrid mess.

> Anyway, your autogenerated part contains changes that seem exaggerated,
> e.g.:
>
> +++ b/gcc/bt-load.c
> @@ -54,6 +54,14 @@ along with GCC; see the file COPYING3.  If not see
>   #include "predict.h"
>   #include "basic-block.h"
>   #include "df.h"
> +#include "bitmap.h"
> +#include "sbitmap.h"
> +#include "cfgloopmanip.h"
> +#include "loop-init.h"
> +#include "cfgloopanal.h"
> +#include "loop-doloop.h"
> +#include "loop-invariant.h"
> +#include "loop-iv.h"
>
> Surely bt-load doesn't need anything from doloop.h or invariant.h.  Before
> this goes into trunk this whole autogenerated thing should be cleaned up
> to add includes only for things that are actually needed.

Agreed.
jeff




[gomp4.1] Parsing of defaultmap(tofrom:scalar) and private/firstprivate clauses on target construct

2015-07-06 Thread Jakub Jelinek
Hi!

This patch adds parsing of defaultmap(tofrom:scalar) clause as well as
allowing of private/firstprivate on target construct.

Next step will be to handle private/firstprivate in clause splitting
and gimplification rules, then expansion of firstprivate and private with
pointer-assignment on target construct and some way to pass those to the
new GOMP_target call.

2015-07-06  Jakub Jelinek  

* tree-core.h (enum omp_clause_code): Add OMP_CLAUSE_DEFAULTMAP.
* tree.c (omp_clause_num_ops, omp_clause_code_name): Add entries
for defaultmap clause.
(walk_tree_1): Handle OMP_CLAUSE_DEFAULTMAP.
* tree-nested.c (convert_nonlocal_omp_clauses,
convert_local_omp_clauses): Likewise.
* tree-pretty-print.c (dump_omp_clause): Likewise.
* gimplify.c (gimplify_scan_omp_clauses,
gimplify_adjust_omp_clauses): Likewise.
* omp-low.c (scan_sharing_clauses): Likewise.
c-family/
* c-pragma.h (enum pragma_omp_clause): Add
PRAGMA_OMP_CLAUSE_DEFAULTMAP.
c/
* c-parser.c (c_parser_omp_clause_name): Handle defaultmap clause.
(c_parser_omp_clause_defaultmap): New function.
(c_parser_omp_all_clauses): Handle PRAGMA_OMP_CLAUSE_DEFAULTMAP.
(OMP_TARGET_CLAUSE_MASK): Add private, firstprivate and defaultmap
clauses.
* c-typeck.c (c_finish_omp_clauses): Handle OMP_CLAUSE_DEFAULTMAP.
Track map, to and from clause decl uids separately from data sharing
clauses.
cp/
* parser.c (cp_parser_omp_clause_name): Handle defaultmap clause.
(cp_parser_omp_clause_defaultmap): New function.
(cp_parser_omp_all_clauses): Handle PRAGMA_OMP_CLAUSE_DEFAULTMAP.
(OMP_TARGET_CLAUSE_MASK): Add private, firstprivate and defaultmap
clauses.
* pt.c (tsubst_omp_clauses): Handle OMP_CLAUSE_HINT and
OMP_CLAUSE_DEFAULTMAP.
* semantics.c (finish_omp_clauses): Handle OMP_CLAUSE_DEFAULTMAP.
Track map, to and from clause decl uids separately from data sharing
clauses.

--- gcc/tree-core.h.jj  2015-06-30 14:26:04.0 +0200
+++ gcc/tree-core.h 2015-07-06 11:40:33.662330075 +0200
@@ -391,6 +391,9 @@ enum omp_clause_code {
   /* OpenMP clause: hint (integer-expression).  */
   OMP_CLAUSE_HINT,
 
+  /* OpenMP clause: defaultmap (tofrom: scalar).  */
+  OMP_CLAUSE_DEFAULTMAP,
+
   /* Internally used only clause, holding SIMD uid.  */
   OMP_CLAUSE__SIMDUID_,
 
--- gcc/tree.c.jj   2015-06-30 14:24:42.0 +0200
+++ gcc/tree.c  2015-07-06 11:49:14.359071514 +0200
@@ -343,6 +343,7 @@ unsigned const char omp_clause_num_ops[]
   0, /* OMP_CLAUSE_THREADS  */
   0, /* OMP_CLAUSE_SIMD  */
   1, /* OMP_CLAUSE_HINT  */
+  0, /* OMP_CLAUSE_DEFAULTMAP  */
   1, /* OMP_CLAUSE__SIMDUID_  */
   1, /* OMP_CLAUSE__CILK_FOR_COUNT_  */
   0, /* OMP_CLAUSE_INDEPENDENT  */
@@ -409,6 +410,7 @@ const char * const omp_clause_code_name[
   "threads",
   "simd",
   "hint",
+  "defaultmap",
   "_simduid_",
   "_Cilk_for_count_",
   "independent",
@@ -11405,6 +11407,7 @@ walk_tree_1 (tree *tp, walk_tree_fn func
case OMP_CLAUSE_NOGROUP:
case OMP_CLAUSE_THREADS:
case OMP_CLAUSE_SIMD:
+   case OMP_CLAUSE_DEFAULTMAP:
case OMP_CLAUSE_AUTO:
case OMP_CLAUSE_SEQ:
  WALK_SUBTREE_TAIL (OMP_CLAUSE_CHAIN (*tp));
--- gcc/tree-nested.c.jj2015-06-30 14:25:59.0 +0200
+++ gcc/tree-nested.c   2015-07-06 12:00:48.230411331 +0200
@@ -1202,6 +1202,7 @@ convert_nonlocal_omp_clauses (tree *pcla
case OMP_CLAUSE_NOGROUP:
case OMP_CLAUSE_THREADS:
case OMP_CLAUSE_SIMD:
+   case OMP_CLAUSE_DEFAULTMAP:
  break;
 
default:
@@ -1854,6 +1855,7 @@ convert_local_omp_clauses (tree *pclause
case OMP_CLAUSE_NOGROUP:
case OMP_CLAUSE_THREADS:
case OMP_CLAUSE_SIMD:
+   case OMP_CLAUSE_DEFAULTMAP:
  break;
 
default:
--- gcc/tree-pretty-print.c.jj  2015-06-30 14:24:29.0 +0200
+++ gcc/tree-pretty-print.c 2015-07-06 12:01:47.048592456 +0200
@@ -733,6 +733,10 @@ dump_omp_clause (pretty_printer *pp, tre
   pp_right_paren (pp);
   break;
 
+case OMP_CLAUSE_DEFAULTMAP:
+  pp_string (pp, "defaultmap(tofrom:scalar)");
+  break;
+
 case OMP_CLAUSE__SIMDUID_:
   pp_string (pp, "_simduid_(");
   dump_generic_node (pp, OMP_CLAUSE__SIMDUID__DECL (clause),
--- gcc/gimplify.c.jj   2015-06-30 14:25:45.0 +0200
+++ gcc/gimplify.c  2015-07-06 11:42:21.846820653 +0200
@@ -6606,6 +6606,7 @@ gimplify_scan_omp_clauses (tree *list_p,
case OMP_CLAUSE_NOGROUP:
case OMP_CLAUSE_THREADS:
case OMP_CLAUSE_SIMD:
+   case OMP_CLAUSE_DEFAULTMAP:
  break;
 
case OMP_CLAUSE_ALIGNED:
@@ -6965,6 +6966,7 @@ gimplify_adjust_omp_clauses (gimple_seq
case OMP_CLAUSE_THREADS:
case OMP_CLAUSE_SIMD:
case OMP_CLAUSE_HINT:
+   case OMP_CLAUSE_DEFAULTMAP:
c

Re: [gomp4.1] Support #pragma omp target {enter,exit} data

2015-07-06 Thread Ilya Verbin
On Mon, Jul 06, 2015 at 19:25:09 +0200, Jakub Jelinek wrote:
> On Mon, Jul 06, 2015 at 06:34:25PM +0300, Ilya Verbin wrote:
> > On Thu, Jul 02, 2015 at 00:06:58 +0300, Ilya Verbin wrote:
> > > The patch is not ready though, I don't know how to unmap GOMP_MAP_POINTER 
> > > vars.
> > > In gomp_unmap_vars they're unmapped through tgt->list[], but in 
> > > gomp_exit_data
> > > it's impossible to find such var in the splay tree, because hostaddr 
> > > differs
> > > from the address, used at mapping.
> > 
> > I can keep a splay_tree_key of the GOMP_MAP_POINTER in the new field in
> > target_mem_desc of the previous var (i.e. corresponding memory block).
> > Or could you suggest a better approach?
> 
> What exactly do you have in mind here?
> 
> void foo (int *p)
> {
> #pragma omp enter data (to:p[10])
> ...
> #pragma omp exit data (from:p[10])
> }
> 
> where the latter will only deallocate &p[0] ... &p[9], but not &p?
> I've asked for clarification in that case, but if it should deallocate (or
> decrease the counter) for &p too, then I think this is something for the
> frontends to handle during handling of array sections in map clause, or
> during gimplification or omp lowering.

I mean, in enter data map(to:p[10]):
1. Map GOMP_MAP_TO var as usual, and save returned target_mem_desc *tgt_var into
   last_tgt_var.
2. Map GOMP_MAP_POINTER var, and save returned tgt_var->list[0].key into
   last_tgt_var->new_special_field_for_pointer.

And in exit data map(from:p[10]):
1. Unmap GOMP_MAP_FROM var as usual, *and* deallocate (or decrease refcount) of
   k->tgt->new_special_field_for_pointer.
2. Do nothing for GOMP_MAP_POINTER var.

But I don't like this plan, there may be corner cases.
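
To make the plan concrete, here is a toy model of it. This is not libgomp
code: toy_block, pointer_link, toy_enter_data and toy_exit_data are all
hypothetical names, and pointer_link merely stands in for
new_special_field_for_pointer. It only shows the bookkeeping: the block for
the array section remembers the block for the pointer, so exit data needs
just one lookup.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a mapped memory block; NOT a libgomp type.  */
struct toy_block
{
  int refcount;                    /* number of live mappings */
  struct toy_block *pointer_link;  /* hypothetical: block of the matching
                                      GOMP_MAP_POINTER variable, if any */
};

/* "enter data": map the section, map the pointer, remember the link.  */
static void
toy_enter_data (struct toy_block *section, struct toy_block *pointer)
{
  section->refcount++;
  pointer->refcount++;
  section->pointer_link = pointer;
}

/* "exit data": only the section is looked up; the pointer block is
   reached through the saved link instead of a second lookup.  */
static void
toy_exit_data (struct toy_block *section)
{
  assert (section->refcount > 0);
  section->refcount--;
  if (section->pointer_link != NULL)
    {
      section->pointer_link->refcount--;
      if (section->refcount == 0)
        section->pointer_link = NULL;
    }
}
```

One corner case is already visible in this model: if the same section is
entered twice with two different pointer variables, the single link field
can only remember the last one.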

  -- Ilya


[PATCH 0/7] Fix bugs found during demangler fuzz-testing

2015-07-06 Thread Mikhail Maltsev
Hi, all!
I performed some fuzz-testing of the C++ demangler
(libiberty/cp-demangle.c). The testing revealed several bugs, which are
addressed by this series of patches.

Here is a brief description of them:
The first patch adds a macro, CHECK_DEMANGLER. When this macro is defined,
the d_peek_next_char and d_advance macros are replaced by inline functions
which perform additional sanity checks and call __builtin_abort when an
out-of-bounds access is detected.
The second patch fixes a syntax error in the debug dump code (it is normally
disabled, unless CP_DEMANGLE_DEBUG is defined).
All other patches fix errors on invalid input. The attached files
contain a cumulative patch and changelog record.
Bootstrapped and regtested on x86_64-linux.

Some notes:
* Patch 4 adds "#include <limits.h>" to the demangler (because of INT_MAX).
I noticed that this file is checked by configury scripts. Do we have
hosts which do not provide this header? If so, what is the appropriate
replacement for it?
* Testcase "_ZDTtl" (fixed in patch 5) did not actually cause a segfault,
but it still invoked undefined behavior (a read 1 byte past the buffer end).
* Testcase "DpDv1_c" (from patch 7) is now demangled as "(char
__vector(1))..." (it used to segfault). I'm not sure whether this is
correct or whether it should be rejected.
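
As a rough illustration of why the INT_MAX check is useful, here is a
standalone sketch. It assumes the length is parsed into a long and then
handed to a function that takes an int; take_identifier and
checked_source_name are hypothetical helpers, not the libiberty functions.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical consumer that takes an int length.
   Returns 1 if LEN bytes are available in BUF, 0 otherwise.  */
static int
take_identifier (const char *buf, int len)
{
  assert (buf != NULL);
  return len > 0 && strlen (buf) >= (size_t) len;
}

/* Hypothetical wrapper: reject lengths that do not fit in an int
   before narrowing, so a huge long can never turn into a small or
   negative int by accident.  */
static int
checked_source_name (const char *buf, long len)
{
  if (len <= 0 || len > INT_MAX)
    return 0;
  return take_identifier (buf, (int) len);
}
```

Without the upper bound, the conversion from long to int would be
implementation-defined for values above INT_MAX, and the callee could see a
length unrelated to what was actually parsed.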

I have some more test cases; many of them cause infinite recursion
because a conversion operator is used as a template parameter. Some
are fixed in PR61321, some are not. For example, there are cases where a
conversion operator is used as a nested (qualified) name:

_Z1fIN1CcvT_EEvv -> segfault
Presumably this means:
template
void f()

I wonder whether this is possible in valid C++ code.

Notice that the following template instantiation is demangled correctly:
void f<C::operator int>()
_Z1fIN1CcviEEvv -> OK

-- 
Regards,
Mikhail Maltsev
diff --git a/libiberty/cp-demangle.c b/libiberty/cp-demangle.c
index 2988b6b..4ca285e 100644
--- a/libiberty/cp-demangle.c
+++ b/libiberty/cp-demangle.c
@@ -103,6 +103,7 @@
 #include "config.h"
 #endif
 
+#include <limits.h>
 #include <stdio.h>
 
 #ifdef HAVE_STDLIB_H
@@ -715,7 +716,7 @@ d_dump (struct demangle_component *dc, int indent)
 case DEMANGLE_COMPONENT_FIXED_TYPE:
   printf ("fixed-point type, accum? %d, sat? %d\n",
   dc->u.s_fixed.accum, dc->u.s_fixed.sat);
-  d_dump (dc->u.s_fixed.length, indent + 2)
+  d_dump (dc->u.s_fixed.length, indent + 2);
   break;
 case DEMANGLE_COMPONENT_ARGLIST:
   printf ("argument list\n");
@@ -1599,7 +1600,7 @@ d_source_name (struct d_info *di)
   struct demangle_component *ret;
 
   len = d_number (di);
-  if (len <= 0)
+  if (len <= 0 || len > INT_MAX)
 return NULL;
   ret = d_identifier (di, len);
   di->last_name = ret;
@@ -3166,6 +3167,8 @@ d_expression_1 (struct d_info *di)
   struct demangle_component *type = NULL;
   if (peek == 't')
 	type = cplus_demangle_type (di);
+  if (!d_peek_next_char (di))
+	return NULL;
   d_advance (di, 2);
   return d_make_comp (di, DEMANGLE_COMPONENT_INITIALIZER_LIST,
 			  type, d_exprlist (di, 'E'));
@@ -3240,6 +3243,8 @@ d_expression_1 (struct d_info *di)
 	struct demangle_component *left;
 	struct demangle_component *right;
 
+	if (code == NULL)
+	  return NULL;
 	if (op_is_new_cast (op))
 	  left = cplus_demangle_type (di);
 	else
@@ -3267,7 +3272,9 @@ d_expression_1 (struct d_info *di)
 	struct demangle_component *second;
 	struct demangle_component *third;
 
-	if (!strcmp (code, "qu"))
+	if (code == NULL)
+	  return NULL;
+	else if (!strcmp (code, "qu"))
 	  {
 		/* ?: expression.  */
 		first = d_expression_1 (di);
@@ -4196,6 +4203,9 @@ d_find_pack (struct d_print_info *dpi,
 case DEMANGLE_COMPONENT_CHARACTER:
 case DEMANGLE_COMPONENT_FUNCTION_PARAM:
 case DEMANGLE_COMPONENT_UNNAMED_TYPE:
+case DEMANGLE_COMPONENT_FIXED_TYPE:
+case DEMANGLE_COMPONENT_DEFAULT_ARG:
+case DEMANGLE_COMPONENT_NUMBER:
   return NULL;
 
 case DEMANGLE_COMPONENT_EXTENDED_OPERATOR:
@@ -4431,6 +4441,11 @@ d_print_comp_inner (struct d_print_info *dpi, int options,
 	local_name = d_right (typed_name);
 	if (local_name->type == DEMANGLE_COMPONENT_DEFAULT_ARG)
 	  local_name = local_name->u.s_unary_num.sub;
+	if (local_name == NULL)
+	  {
+		d_print_error (dpi);
+		return;
+	  }
 	while (local_name->type == DEMANGLE_COMPONENT_RESTRICT_THIS
 		   || local_name->type == DEMANGLE_COMPONENT_VOLATILE_THIS
 		   || local_name->type == DEMANGLE_COMPONENT_CONST_THIS
diff --git a/libiberty/cp-demangle.h b/libiberty/cp-demangle.h
index 6fce025..c37a91f 100644
--- a/libiberty/cp-demangle.h
+++ b/libiberty/cp-demangle.h
@@ -135,12 +135,41 @@ struct d_info
- call d_check_char(di, '\0')
Everything else is safe.  */
 #define d_peek_char(di) (*((di)->n))
-#define d_peek_next_char(di) ((di)->n[1])
-#define d_advance(di, i) ((di)->n += (i))
+#ifndef CHECK_DEMANGLER
+#  define d_peek_next_char(di) ((di)->n[1])
+#  define d_advance(di, i) ((di)->n +

[PATCH 1/7] Add CHECK_DEMANGLER

2015-07-06 Thread Mikhail Maltsev
---
 libiberty/cp-demangle.h | 33 +++--
 1 file changed, 31 insertions(+), 2 deletions(-)

diff --git a/libiberty/cp-demangle.h b/libiberty/cp-demangle.h
index 6fce025..c37a91f 100644
--- a/libiberty/cp-demangle.h
+++ b/libiberty/cp-demangle.h
@@ -135,12 +135,41 @@ struct d_info
- call d_check_char(di, '\0')
Everything else is safe.  */
 #define d_peek_char(di) (*((di)->n))
-#define d_peek_next_char(di) ((di)->n[1])
-#define d_advance(di, i) ((di)->n += (i))
+#ifndef CHECK_DEMANGLER
+#  define d_peek_next_char(di) ((di)->n[1])
+#  define d_advance(di, i) ((di)->n += (i))
+#endif
 #define d_check_char(di, c) (d_peek_char(di) == c ? ((di)->n++, 1) : 0)
 #define d_next_char(di) (d_peek_char(di) == '\0' ? '\0' : *((di)->n++))
 #define d_str(di) ((di)->n)

+/* Define CHECK_DEMANGLER to perform additional sanity checks (i.e., when
+   debugging the demangler).  It will cause some slowdown, but will allow to
+   catch out-of-bound access errors earlier.
+   Note: CHECK_DEMANGLER is not compatible with compilers other than GCC.  */
+#ifdef CHECK_DEMANGLER
+static inline char
+d_peek_next_char (const struct d_info *di)
+{
+  if (!di->n[0])
+    __builtin_abort ();
+  return di->n[1];
+}
+
+static inline void
+d_advance (struct d_info *di, int i)
+{
+  if (i < 0)
+    __builtin_abort ();
+  while (i--)
+    {
+      if (!di->n[0])
+        __builtin_abort ();
+      di->n++;
+    }
+}
+#endif
+
 /* Functions and arrays in cp-demangle.c which are referenced by
functions in cp-demint.c.  */
 #ifdef IN_GLIBCPP_V3
-- 
1.8.3.1
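
For anyone who wants to see what the checked d_advance guards against
without building libiberty, here is a standalone sketch. toy_cursor and
toy_advance are hypothetical stand-ins for struct d_info and d_advance, and
this variant reports failure instead of calling __builtin_abort so that it
can be exercised in a test.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the demangler state: N points into a
   NUL-terminated mangled name.  */
struct toy_cursor
{
  const char *n;
};

/* Advance by I characters, refusing to step over the terminating NUL.
   Returns 1 on success and 0 (cursor unchanged) on failure.  */
static int
toy_advance (struct toy_cursor *c, int i)
{
  const char *p;

  assert (c != NULL && c->n != NULL);
  p = c->n;
  if (i < 0)
    return 0;
  while (i--)
    {
      if (*p == '\0')
        return 0;
      p++;
    }
  c->n = p;
  return 1;
}
```

With the plain macro, the same request would simply add I to the pointer,
and a later read could land past the end of the buffer.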


[PATCH 2/7] Fix build with CP_DEMANGLE_DEBUG

2015-07-06 Thread Mikhail Maltsev
---
 libiberty/cp-demangle.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/libiberty/cp-demangle.c b/libiberty/cp-demangle.c
index 2988b6b..12093cc 100644
--- a/libiberty/cp-demangle.c
+++ b/libiberty/cp-demangle.c
@@ -715,7 +715,7 @@ d_dump (struct demangle_component *dc, int indent)
 case DEMANGLE_COMPONENT_FIXED_TYPE:
   printf ("fixed-point type, accum? %d, sat? %d\n",
   dc->u.s_fixed.accum, dc->u.s_fixed.sat);
-  d_dump (dc->u.s_fixed.length, indent + 2)
+  d_dump (dc->u.s_fixed.length, indent + 2);
   break;
 case DEMANGLE_COMPONENT_ARGLIST:
   printf ("argument list\n");
-- 
1.8.3.1


[PATCH 3/7] Fix trinary op

2015-07-06 Thread Mikhail Maltsev
---
 libiberty/cp-demangle.c   | 4 +++-
 libiberty/testsuite/demangle-expected | 6 ++
 2 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/libiberty/cp-demangle.c b/libiberty/cp-demangle.c
index 12093cc..44a0a9b 100644
--- a/libiberty/cp-demangle.c
+++ b/libiberty/cp-demangle.c
@@ -3267,7 +3267,9 @@ d_expression_1 (struct d_info *di)
         struct demangle_component *second;
         struct demangle_component *third;
 
-        if (!strcmp (code, "qu"))
+        if (code == NULL)
+          return NULL;
+        else if (!strcmp (code, "qu"))
           {
             /* ?: expression.  */
             first = d_expression_1 (di);
diff --git a/libiberty/testsuite/demangle-expected b/libiberty/testsuite/demangle-expected
index 6ea64ae..47ca8f5 100644
--- a/libiberty/testsuite/demangle-expected
+++ b/libiberty/testsuite/demangle-expected
@@ -4091,6 +4091,12 @@ void g<1>(A<1>&, B(1)>&)
 _ZNKSt7complexIiE4realB5cxx11Ev
 std::complex<int>::real[abi:cxx11]() const
 #
+# Some more crashes revealed by fuzz-testing:
+# Check for NULL pointer when demangling trinary operators
+--format=gnu-v3
+Av32_f
+Av32_f
+#
 # Ada (GNAT) tests.
 #
 # Simple test.
-- 
1.8.3.1
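
The pattern behind this fix, reduced to a standalone sketch: passing a null
pointer to strcmp is undefined behavior, so the pointer has to be tested
first. toy_op and toy_is_conditional are hypothetical names; the assumption
is only that an operator record may carry no code string at all.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical operator record; CODE may be NULL when no operator
   code is available.  */
struct toy_op
{
  const char *code;
};

/* Returns 1 if OP is the "?:" operator (code "qu"), 0 otherwise.
   The NULL test must come before strcmp.  */
static int
toy_is_conditional (const struct toy_op *op)
{
  assert (op != NULL);
  if (op->code == NULL)
    return 0;
  return strcmp (op->code, "qu") == 0;
}
```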

