[committed] Fix lto build if WCONTINUED is not defined (PR lto/60571)

2014-03-19 Thread Jakub Jelinek
Hi!

WCONTINUED is specific to (recent) Linux, so it need not be defined
on other hosts and may be missing even on older Linux distros (e.g. glibc
2.3.2 doesn't have it).

Fixed thusly, committed as obvious.

2014-03-19  Jakub Jelinek  

PR lto/60571
* lto.c (wait_for_child): Define WCONTINUED to 0 if not defined.
Fix formatting.

--- gcc/lto/lto.c.jj    2014-03-03 08:24:32.0 +0100
+++ gcc/lto/lto.c   2014-03-19 08:12:39.235144361 +0100
@@ -2476,7 +2476,10 @@ wait_for_child ()
   int status;
   do
     {
-      int w = waitpid(0, &status, WUNTRACED | WCONTINUED);
+#ifndef WCONTINUED
+#define WCONTINUED 0
+#endif
+      int w = waitpid (0, &status, WUNTRACED | WCONTINUED);
       if (w == -1)
         fatal_error ("waitpid failed");
 
@@ -2485,7 +2488,7 @@ wait_for_child ()
       else if (WIFSIGNALED (status))
         fatal_error ("streaming subprocess was killed by signal");
     }
-  while (!WIFEXITED(status) && !WIFSIGNALED(status));
+  while (!WIFEXITED (status) && !WIFSIGNALED (status));
 }
 #endif
 

Jakub
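
For readers outside the GCC tree, the fallback used in the patch above can be sketched as a standalone helper. This is only a minimal sketch, not the lto.c code: reap_child is a hypothetical name, and it returns a status instead of calling fatal_error.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hosts that lack WCONTINUED get a no-op flag, so the bitwise OR below
   still compiles and waitpid simply never reports continued children.  */
#ifndef WCONTINUED
#define WCONTINUED 0
#endif

/* Hypothetical helper (not part of GCC): wait until child PID has exited
   or been killed.  Returns the exit code, or -1 if waitpid failed or the
   child was terminated by a signal.  */
int
reap_child (pid_t pid)
{
  int status;

  assert (pid > 0);
  do
    {
      pid_t w = waitpid (pid, &status, WUNTRACED | WCONTINUED);
      if (w == -1)
        return -1;
    }
  while (!WIFEXITED (status) && !WIFSIGNALED (status));

  return WIFEXITED (status) ? WEXITSTATUS (status) : -1;
}
```

Stop and continue notifications just make the loop go round again, which is why defining the missing flag to 0 does not change the result.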


[PATCH, ARM] Fix ICE due to out of bound.

2014-03-19 Thread Zhenqiang Chen
Hi,

GCC ICEs in arm_dwarf_register_span when compiling
gcc.target/arm/neon-modes-3.c with "-g", because parts[8] is too small for
XImode: GET_MODE_SIZE (XImode) / 4 is 16, so "rtx parts[8]" cannot hold all
the registers.

According to arm-modes.def, 16 should be the biggest number. So the
patch updates parts to

rtx parts[16];
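
The arithmetic behind the new bound can be checked in isolation. A minimal sketch, assuming (as the message implies) that each part covers 4 bytes, so a 64-byte mode such as XImode needs 16 entries; parts_needed and parts_fit are hypothetical names, not GCC functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumption taken from the message: each part describes 4 bytes of the
   register value, so a mode of SIZE bytes needs SIZE / 4 parts.  */
enum { BYTES_PER_PART = 4 };

/* Number of array entries needed for a mode of the given size.  */
size_t
parts_needed (size_t mode_size_in_bytes)
{
  return mode_size_in_bytes / BYTES_PER_PART;
}

/* True if an array of ARRAY_LEN entries can hold all parts of the mode.  */
bool
parts_fit (size_t mode_size_in_bytes, size_t array_len)
{
  return parts_needed (mode_size_in_bytes) <= array_len;
}
```

With a 64-byte mode, parts_fit (64, 8) is false and parts_fit (64, 16) is true, which matches the out-of-bounds access described above.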

Bootstrapped with no make check regressions on an ARM Chromebook.

OK for trunk?

Thanks!
-Zhenqiang

ChangeLog:
2014-03-19  Zhenqiang Chen  

* config/arm/arm.c (arm_dwarf_register_span): Update the element number
of parts.

testsuite/ChangeLog:
2014-03-19  Zhenqiang Chen  

* gcc.target/arm/neon-modes-3.c: Add "-g" option.

diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index a68ed8d..c4466c1 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -28692,7 +28692,7 @@ arm_dwarf_register_span (rtx rtl)
 {
   enum machine_mode mode;
   unsigned regno;
-  rtx parts[8];
+  rtx parts[16];
   int nregs;
   int i;

diff --git a/gcc/testsuite/gcc.target/arm/neon-modes-3.c b/gcc/testsuite/gcc.target/arm/neon-modes-3.c
index fe81875..f3e4f33 100644
--- a/gcc/testsuite/gcc.target/arm/neon-modes-3.c
+++ b/gcc/testsuite/gcc.target/arm/neon-modes-3.c
@@ -1,6 +1,6 @@
 /* { dg-do compile } */
 /* { dg-require-effective-target arm_neon_ok } */
-/* { dg-options "-O" } */
+/* { dg-options "-O -g" } */
 /* { dg-add-options arm_neon } */

 #include 


Re: [PATCH] Fix ICE with MASK_LOAD and -fno-tree-dce (PR tree-optimization/60559)

2014-03-19 Thread Richard Biener
On Tue, 18 Mar 2014, Jakub Jelinek wrote:

> Hi!
> 
> With -fno-tree-dce the scalar MASK_LOAD isn't removed from the IL and we ICE
> on it during expansion (as we support only the vector loads; if those aren't
> supported, MASK_LOAD is either not created by if-conversion at all, or
> vectorization refuses to vectorize the loop and thus it is cfg cleaned up
> away).
> 
> The following patch replaces the scalar MASK_LOAD manually with a load of
> zero, similarly to how we do it for vectorizable calls.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Ok.

Thanks,
Richard.

> 2014-03-18  Jakub Jelinek  
> 
>   PR tree-optimization/60559
>   * tree-vect-stmts.c (vectorizable_mask_load_store): Replace scalar MASK_LOAD
>   with build_zero_cst assignment.
> 
>   * g++.dg/vect/pr60559.cc: New test.
> 
> --- gcc/tree-vect-stmts.c.jj  2014-03-03 08:24:33.0 +0100
> +++ gcc/tree-vect-stmts.c 2014-03-18 14:01:40.969657763 +0100
> @@ -2038,6 +2038,15 @@ vectorizable_mask_load_store (gimple stm
>   STMT_VINFO_RELATED_STMT (prev_stmt_info) = new_stmt;
> prev_stmt_info = vinfo_for_stmt (new_stmt);
>   }
> +
> +  /* Ensure that even with -fno-tree-dce the scalar MASK_LOAD is removed
> +  from the IL.  */
> +  tree lhs = gimple_call_lhs (stmt);
> +  new_stmt = gimple_build_assign (lhs, build_zero_cst (TREE_TYPE (lhs)));
> +  set_vinfo_for_stmt (new_stmt, stmt_info);
> +  set_vinfo_for_stmt (stmt, NULL);
> +  STMT_VINFO_STMT (stmt_info) = new_stmt;
> +  gsi_replace (gsi, new_stmt, true);
>return true;
>  }
>else if (is_store)
> @@ -2149,6 +2158,18 @@ vectorizable_mask_load_store (gimple stm
>   }
>  }
>  
> +  if (!is_store)
> +{
> +  /* Ensure that even with -fno-tree-dce the scalar MASK_LOAD is removed
> +  from the IL.  */
> +  tree lhs = gimple_call_lhs (stmt);
> +  new_stmt = gimple_build_assign (lhs, build_zero_cst (TREE_TYPE (lhs)));
> +  set_vinfo_for_stmt (new_stmt, stmt_info);
> +  set_vinfo_for_stmt (stmt, NULL);
> +  STMT_VINFO_STMT (stmt_info) = new_stmt;
> +  gsi_replace (gsi, new_stmt, true);
> +}
> +
>return true;
>  }
>  
> --- gcc/testsuite/g++.dg/vect/pr60559.cc.jj   2014-03-18 14:04:55.173449250 +0100
> +++ gcc/testsuite/g++.dg/vect/pr60559.cc  2014-03-18 14:05:26.610273088 +0100
> @@ -0,0 +1,8 @@
> +// PR tree-optimization/60559
> +// { dg-do compile }
> +// { dg-additional-options "-O3 -std=c++11 -fnon-call-exceptions -fno-tree-dce" }
> +// { dg-additional-options "-mavx2" { target { i?86-*-* x86_64-*-* } } }
> +
> +#include "pr60023.cc"
> +
> +// { dg-final { cleanup-tree-dump "vect" } }
> 
>   Jakub
> 
> 

-- 
Richard Biener 
SUSE / SUSE Labs
SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
GF: Jeff Hawn, Jennifer Guild, Felix Imend"orffer


[PATCH] Fix PR59543

2014-03-19 Thread Richard Biener

This fixes PR59543 (confirmed by Jakub for the testcase at least)
by not dropping debug stmts during WPA phase.

LTO profiled-bootstrapped on x86_64-unknown-linux-gnu, applied.

Honza - you can always come up with a better fix for 4.10.

Richard.

2014-03-19  Richard Biener  

PR lto/59543
* lto-streamer-in.c (input_function): In WPA stage do not drop
debug stmts.

Index: lto-streamer-in.c
===
--- lto-streamer-in.c   (revision 208642)
+++ lto-streamer-in.c   (working copy)
@@ -988,7 +988,7 @@ input_function (tree fn_decl, struct dat
 We can't remove them earlier because this would cause uid
 mismatches in fixups, but we can do it at this point, as
 long as debug stmts don't require fixups.  */
- if (!MAY_HAVE_DEBUG_STMTS && is_gimple_debug (stmt))
+ if (!MAY_HAVE_DEBUG_STMTS && !flag_wpa && is_gimple_debug (stmt))
{
  gimple_stmt_iterator gsi = bsi;
  gsi_next (&bsi);


Re: [PATCH, ARM] Fix ICE due to out of bound.

2014-03-19 Thread Ramana Radhakrishnan

On 03/19/14 08:42, Zhenqiang Chen wrote:

> Hi,
>
> ICE when compiling gcc.target/arm/neon-modes-3.c with "-g" in
> arm_dwarf_register_span since parts[8] is out of bound for XImode.
> GET_MODE_SIZE (XImode) / 4 is 16. "rtx parts[8]" can not hold all the
> registers.
>
> According to arm-modes.def, 16 should be the biggest number. So the
> patch updates parts to
>
> rtx parts[16];
>
> Bootstrap and no make check regression on ARM Chrome book.
>
> OK for trunk?



It may be time in 4.10 or 5.0 (whatever we call it :)) to address the
FIXME in arm_dwarf_register_span and deal with DW_OP_piece. I'm surprised
that it's taken so long to hit this.


This is OK for stage4 - it looks sane to me but this needs an RM ack 
before applying.


regards
Ramana


> Thanks!
> -Zhenqiang
>
> ChangeLog:
> 2014-03-19  Zhenqiang Chen  
>
>  * config/arm/arm.c (arm_dwarf_register_span): Update the element number
>  of parts.
>
> testsuite/ChangeLog:
> 2014-03-19  Zhenqiang Chen  
>
>  * gcc.target/arm/neon-modes-3.c: Add "-g" option.
>
> diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
> index a68ed8d..c4466c1 100644
> --- a/gcc/config/arm/arm.c
> +++ b/gcc/config/arm/arm.c
> @@ -28692,7 +28692,7 @@ arm_dwarf_register_span (rtx rtl)
>   {
>     enum machine_mode mode;
>     unsigned regno;
> -  rtx parts[8];
> +  rtx parts[16];
>     int nregs;
>     int i;
>
> diff --git a/gcc/testsuite/gcc.target/arm/neon-modes-3.c b/gcc/testsuite/gcc.target/arm/neon-modes-3.c
> index fe81875..f3e4f33 100644
> --- a/gcc/testsuite/gcc.target/arm/neon-modes-3.c
> +++ b/gcc/testsuite/gcc.target/arm/neon-modes-3.c
> @@ -1,6 +1,6 @@
>   /* { dg-do compile } */
>   /* { dg-require-effective-target arm_neon_ok } */
> -/* { dg-options "-O" } */
> +/* { dg-options "-O -g" } */
>   /* { dg-add-options arm_neon } */
>
>   #include 




--
Ramana Radhakrishnan
Principal Engineer
ARM Ltd.



Re: [PATCH, ARM] Fix ICE due to out of bound.

2014-03-19 Thread Richard Biener
On Wed, 19 Mar 2014, Ramana Radhakrishnan wrote:

> On 03/19/14 08:42, Zhenqiang Chen wrote:
> > Hi,
> > 
> > ICE when compiling gcc.target/arm/neon-modes-3.c with "-g" in
> > arm_dwarf_register_span since parts[8] is out of bound for XImode.
> > GET_MODE_SIZE (XImode) / 4 is 16. "rtx parts[8]" can not hold all the
> > registers.
> > 
> > According to arm-modes.def, 16 should be the biggest number. So the
> > patch updates parts to
> > 
> > rtx parts[16];
> > 
> > Bootstrap and no make check regression on ARM Chrome book.
> > 
> > OK for trunk?
> > 
> 
> It may be time in 4.10 or 5.0 (whatever we call it :)), to deal with the FIXME
> in arm_dwarf_register_span to deal with DW_OP_piece. I'm surprised that it's
> taken so long to hit this.
> 
> This is OK for stage4 - it looks sane to me but this needs an RM ack before
> applying.

Ok (it can't possibly break anything).

Richard.

> regards
> Ramana
> 
> > Thanks!
> > -Zhenqiang
> > 
> > ChangeLog:
> > 2014-03-19  Zhenqiang Chen  
> > 
> >  * config/arm/arm.c (arm_dwarf_register_span): Update the element number
> >  of parts.
> > 
> > testsuite/ChangeLog:
> > 2014-03-19  Zhenqiang Chen  
> > 
> >  * gcc.target/arm/neon-modes-3.c: Add "-g" option.
> > 
> > diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
> > index a68ed8d..c4466c1 100644
> > --- a/gcc/config/arm/arm.c
> > +++ b/gcc/config/arm/arm.c
> > @@ -28692,7 +28692,7 @@ arm_dwarf_register_span (rtx rtl)
> >   {
> > enum machine_mode mode;
> > unsigned regno;
> > -  rtx parts[8];
> > +  rtx parts[16];
> > int nregs;
> > int i;
> > 
> > diff --git a/gcc/testsuite/gcc.target/arm/neon-modes-3.c b/gcc/testsuite/gcc.target/arm/neon-modes-3.c
> > index fe81875..f3e4f33 100644
> > --- a/gcc/testsuite/gcc.target/arm/neon-modes-3.c
> > +++ b/gcc/testsuite/gcc.target/arm/neon-modes-3.c
> > @@ -1,6 +1,6 @@
> >   /* { dg-do compile } */
> >   /* { dg-require-effective-target arm_neon_ok } */
> > -/* { dg-options "-O" } */
> > +/* { dg-options "-O -g" } */
> >   /* { dg-add-options arm_neon } */
> > 
> >   #include 
> > 
> 
> 
> 

-- 
Richard Biener 
SUSE / SUSE Labs
SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
GF: Jeff Hawn, Jennifer Guild, Felix Imend"orffer


[ARM/AArch64][0/3] Handle bitwise/bytewise reverse operations more effectively

2014-03-19 Thread Kyrill Tkachov

Hi all,

This patch series attempts to improve code generation on arm and aarch64 for 
various bitwise operations that can be expressed with rev16 instructions in 
those architectures. In particular expressions of the form:

((x & 0x00ff00ff) << 8) | ((x & 0xff00ff00) >> 8)

This can appear in places like the Linux kernel and can be directly mapped to a 
single rev16 instruction.
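
To make the expression concrete, here it is as a plain C function. This is just a sketch for illustration; rev16_32 is a name picked here and is not something the patches add.

```c
#include <assert.h>
#include <stdint.h>

/* Swap the two bytes inside each 16-bit halfword of X, written as the
   mask/shift/or expression quoted above.  */
uint32_t
rev16_32 (uint32_t x)
{
  return ((x & 0x00ff00ffu) << 8) | ((x & 0xff00ff00u) >> 8);
}
```

For example, rev16_32 (0x12345678) yields 0x34127856, and applying the function twice gives back the original value.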

This series has 3 parts:

[1/3] Add a new field to the rtx costs tables to represent the latency of the 
rev* group of instructions that will be used to accurately model the cost of 
these operations. Use it to properly cost existing patterns that generate rev16 
(for bswap operations).


[2/3] Add aarch64 combine patterns to recognise the above bitwise operations and 
map them to rev16. Model the cost appropriately and add helper functions that 
can be reused by the arm backend.


[3/3] Define similar combine patterns for arm and reuse the helper functions 
introduced in patch 2/3 to properly cost them.


I'm proposing these for next stage-1 of course.

Thanks,
Kyrill



[PATCH][ARM][1/3] Add rev field to rtx cost tables

2014-03-19 Thread Kyrill Tkachov

Hi all,

In order to properly cost the rev16 instruction we need a new field in the cost 
tables.

This patch adds that and specifies its value for the existing cost tables.
Since rev16 is used to implement the BSWAP operation we add handling of that in 
the rtx cost function using the new field.
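
As a rough illustration of how the new field feeds into the cost, here is a standalone sketch. The struct is a cut-down stand-in for the real cost table, bswap_cost is a hypothetical helper, and the COSTS_N_INSNS definition is copied here on the assumption that it matches the one in GCC's rtl.h.

```c
#include <assert.h>
#include <stdbool.h>

/* Same scaling as GCC's COSTS_N_INSNS macro in rtl.h (assumed),
   reproduced here so the sketch compiles on its own.  */
#define COSTS_N_INSNS(N) ((N) * 4)

/* Cut-down stand-in for the alu part of the cost tables; only the new
   field is modelled.  */
struct alu_costs
{
  int rev; /* Extra cost of the rev* instructions.  */
};

/* Hypothetical helper: one instruction, plus the per-core extra cost
   when optimising for speed.  */
int
bswap_cost (const struct alu_costs *alu, bool speed_p)
{
  int cost = COSTS_N_INSNS (1);

  if (speed_p)
    cost += alu->rev;
  return cost;
}
```

A core whose table sets rev to COSTS_N_INSNS (1) therefore costs a byte swap at two instructions when optimising for speed and at one instruction when optimising for size.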


Tested on arm-none-eabi and bootstrapped on an arm linux target.

Does it look ok for stage1?

Thanks,
Kyrill

2014-03-19  Kyrylo Tkachov  

* config/arm/aarch-common-protos.h (alu_cost_table): Add rev field.
* config/arm/aarch-cost-tables.h (generic_extra_costs): Specify
rev cost.
(cortex_a53_extra_costs): Likewise.
(cortex_a57_extra_costs): Likewise.
* config/arm/arm.c (cortexa9_extra_costs): Likewise.
(cortexa7_extra_costs): Likewise.
(cortexa12_extra_costs): Likewise.
(cortexa15_extra_costs): Likewise.
(v7m_extra_costs): Likewise.
(arm_new_rtx_costs): Handle BSWAP.

commit 13b2976a9448565beabc41055fdcbd209cde949f
Author: Kyrylo Tkachov 
Date:   Wed Feb 26 15:55:13 2014 +

Add rev field to rtx costs.

diff --git a/gcc/config/arm/aarch-common-protos.h b/gcc/config/arm/aarch-common-protos.h
index 2b33626..4ff18cd 100644
--- a/gcc/config/arm/aarch-common-protos.h
+++ b/gcc/config/arm/aarch-common-protos.h
@@ -54,6 +54,7 @@ struct alu_cost_table
   const int bfi;		/* Bit-field insert.  */
   const int bfx;		/* Bit-field extraction.  */
   const int clz;		/* Count Leading Zeros.  */
+  const int rev;		/* Reverse bits/bytes.  */
   const int non_exec;		/* Extra cost when not executing insn.  */
   const bool non_exec_costs_exec; /* True if non-execution must add the exec
  cost.  */
diff --git a/gcc/config/arm/aarch-cost-tables.h b/gcc/config/arm/aarch-cost-tables.h
index c30ea2f..adf8708 100644
--- a/gcc/config/arm/aarch-cost-tables.h
+++ b/gcc/config/arm/aarch-cost-tables.h
@@ -39,6 +39,7 @@ const struct cpu_cost_table generic_extra_costs =
 0,			/* bfi.  */
 0,			/* bfx.  */
 0,			/* clz.  */
+0,			/* rev.  */
 COSTS_N_INSNS (1),	/* non_exec.  */
 false		/* non_exec_costs_exec.  */
   },
@@ -139,6 +140,7 @@ const struct cpu_cost_table cortexa53_extra_costs =
 COSTS_N_INSNS (1),	/* bfi.  */
 COSTS_N_INSNS (1),	/* bfx.  */
 0,			/* clz.  */
+0,			/* rev.  */
 0,			/* non_exec.  */
 true		/* non_exec_costs_exec.  */
   },
@@ -239,6 +241,7 @@ const struct cpu_cost_table cortexa57_extra_costs =
 COSTS_N_INSNS (1), /* bfi.  */
 0, /* bfx.  */
 0, /* clz.  */
+0,			/* rev.  */
 0, /* non_exec.  */
 true   /* non_exec_costs_exec.  */
   },
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index e69911c..a72ee1e 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -982,6 +982,7 @@ const struct cpu_cost_table cortexa9_extra_costs =
 COSTS_N_INSNS (1),	/* bfi.  */
 COSTS_N_INSNS (1),	/* bfx.  */
 0,			/* clz.  */
+0,			/* rev.  */
 0,			/* non_exec.  */
 true		/* non_exec_costs_exec.  */
   },
@@ -1083,6 +1084,7 @@ const struct cpu_cost_table cortexa7_extra_costs =
 COSTS_N_INSNS (1),	/* bfi.  */
 COSTS_N_INSNS (1),	/* bfx.  */
 COSTS_N_INSNS (1),	/* clz.  */
+COSTS_N_INSNS (1),	/* rev.  */
 0,			/* non_exec.  */
 true		/* non_exec_costs_exec.  */
   },
@@ -1184,6 +1186,7 @@ const struct cpu_cost_table cortexa12_extra_costs =
 0,			/* bfi.  */
 COSTS_N_INSNS (1),	/* bfx.  */
 COSTS_N_INSNS (1),	/* clz.  */
+COSTS_N_INSNS (1),	/* rev.  */
 0,			/* non_exec.  */
 true		/* non_exec_costs_exec.  */
   },
@@ -1284,6 +1287,7 @@ const struct cpu_cost_table cortexa15_extra_costs =
 COSTS_N_INSNS (1),	/* bfi.  */
 0,			/* bfx.  */
 0,			/* clz.  */
+0,			/* rev.  */
 0,			/* non_exec.  */
 true		/* non_exec_costs_exec.  */
   },
@@ -1384,6 +1388,7 @@ const struct cpu_cost_table v7m_extra_costs =
 0,			/* bfi.  */
 0,			/* bfx.  */
 0,			/* clz.  */
+0,			/* rev.  */
 COSTS_N_INSNS (1),	/* non_exec.  */
 false		/* non_exec_costs_exec.  */
   },
@@ -9334,6 +9339,47 @@ arm_new_rtx_costs (rtx x, enum rtx_code code, enum rtx_code outer_code,
   *cost = LIBCALL_COST (2);
   return false;
 
+case BSWAP:
+  if (arm_arch6)
+{
+  if (mode == SImode)
+{
+  *cost = COSTS_N_INSNS (1);
+  if (speed_p)
+*cost += extra_cost->alu.rev;
+
+  return false;
+}
+}
+  else
+{
+/* No rev instruction available.  Look at arm_legacy_rev
+   and thumb_legacy_rev for the form of RTL used then.  */
+  if (TARGET_THUMB)
+{
+  *cost = COSTS_N_INSNS (10);
+
+  if (speed_p)
+{
+  *cost += 6 * extra_cost->alu.shift;
+  *cost += 3 * extra_cost->alu.logical;
+}
+}
+  else
+ 

[PATCH][AArch64][2/3] Recognise rev16 operations on SImode and DImode data

2014-03-19 Thread Kyrill Tkachov

Hi all,

This patch adds a recogniser for the bitmask, shift, orr sequence of instructions 
that can be used to reverse the bytes in 16-bit halfwords (for the sequence 
itself, see the testcase included in the patch). This can be implemented with 
a rev16 instruction.
Since the shifts can occur in any order and there are no canonicalisation rules 
for where they appear in the expression we have to have two patterns to match 
both cases.
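
A small C sketch of the two operand orders, here for 64-bit data. The function names are made up for illustration, and the second variant masks after shifting, which is my reading of the shape the RTL patterns take; the mask constants are derived by hand and not taken from the helper functions in the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Shift-left term first, masking before the shift.  */
uint64_t
rev16_64 (uint64_t x)
{
  return ((x & 0x00ff00ff00ff00ffull) << 8)
         | ((x & 0xff00ff00ff00ff00ull) >> 8);
}

/* Shift-right term first, masking after the shift; the masks swap roles
   because they now apply to the already shifted value.  */
uint64_t
rev16_64_alt (uint64_t x)
{
  return ((x >> 8) & 0x00ff00ff00ff00ffull)
         | ((x << 8) & 0xff00ff00ff00ff00ull);
}
```

Both functions compute the same value for every input, so a recogniser has to accept either ordering of the two halves of the IOR.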


The rtx costs function is updated to recognise the pattern and cost it 
appropriately by using the rev field of the cost tables introduced in patch 
[1/3]. The rtx costs helper functions that are used to recognise those bitwise 
operations are placed in config/arm/aarch-common.c so that they can be reused by 
both arm and aarch64.


I've added an execute testcase but no scan-assembler tests since conceptually in 
the future the combiner might decide to not use a rev instruction due to rtx 
costs. We can at least test that the code generated is functionally correct though.


Tested aarch64-none-elf.

Ok for stage1?

[gcc/]
2014-03-19  Kyrylo Tkachov  

* config/aarch64/aarch64.md (rev16<mode>2): New pattern.
(rev16<mode>2_alt): Likewise.
* config/aarch64/aarch64.c (aarch64_rtx_costs): Handle rev16 case.
* config/arm/aarch-common.c (aarch_rev16_shright_mask_imm_p): New.
(aarch_rev16_shleft_mask_imm_p): Likewise.
(aarch_rev16_p_1): Likewise.
(aarch_rev16_p): Likewise.
* config/arm/aarch-common-protos.h (aarch_rev16_p): Declare extern.
(aarch_rev16_shright_mask_imm_p): Likewise.
(aarch_rev16_shleft_mask_imm_p): Likewise.

[gcc/testsuite/]
2014-03-19  Kyrylo Tkachov  

* gcc.target/aarch64/rev16_1.c: New test.

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index ebd58c0..41761ae 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -4682,6 +4682,16 @@ aarch64_rtx_costs (rtx x, int code, int outer ATTRIBUTE_UNUSED,
   return false;
 
 case IOR:
+  if (aarch_rev16_p (x))
+{
+  *cost = COSTS_N_INSNS (1);
+
+  if (speed)
+*cost += extra_cost->alu.rev;
+
+  return true;
+}
+/* Fall through.  */
 case XOR:
 case AND:
 cost_logic:
diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 99a6ac8..a23452b 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -3173,6 +3173,38 @@
   [(set_attr "type" "rev")]
 )
 
+;; There are no canonicalisation rules for the position of the lshiftrt, ashift
+;; operations within an IOR/AND RTX, therefore we have two patterns matching
+;; each valid permutation.
+
+(define_insn "rev16<mode>2"
+  [(set (match_operand:GPI 0 "register_operand" "=r")
+        (ior:GPI (and:GPI (ashift:GPI (match_operand:GPI 1 "register_operand" "r")
+                                      (const_int 8))
+                          (match_operand:GPI 3 "const_int_operand" "n"))
+                 (and:GPI (lshiftrt:GPI (match_dup 1)
+                                        (const_int 8))
+                          (match_operand:GPI 2 "const_int_operand" "n"))))]
+  "aarch_rev16_shleft_mask_imm_p (operands[3], <MODE>mode)
+   && aarch_rev16_shright_mask_imm_p (operands[2], <MODE>mode)"
+  "rev16\\t%0, %1"
+  [(set_attr "type" "rev")]
+)
+
+(define_insn "rev16<mode>2_alt"
+  [(set (match_operand:GPI 0 "register_operand" "=r")
+        (ior:GPI (and:GPI (lshiftrt:GPI (match_operand:GPI 1 "register_operand" "r")
+                                        (const_int 8))
+                          (match_operand:GPI 2 "const_int_operand" "n"))
+                 (and:GPI (ashift:GPI (match_dup 1)
+                                      (const_int 8))
+                          (match_operand:GPI 3 "const_int_operand" "n"))))]
+  "aarch_rev16_shleft_mask_imm_p (operands[3], <MODE>mode)
+   && aarch_rev16_shright_mask_imm_p (operands[2], <MODE>mode)"
+  "rev16\\t%0, %1"
+  [(set_attr "type" "rev")]
+)
+
 ;; zero_extend version of above
 (define_insn "*bswapsi2_uxtw"
   [(set (match_operand:DI 0 "register_operand" "=r")
diff --git a/gcc/config/arm/aarch-common-protos.h b/gcc/config/arm/aarch-common-protos.h
index d97ee61..08c4c7a 100644
--- a/gcc/config/arm/aarch-common-protos.h
+++ b/gcc/config/arm/aarch-common-protos.h
@@ -23,6 +23,9 @@
 #ifndef GCC_AARCH_COMMON_PROTOS_H
 #define GCC_AARCH_COMMON_PROTOS_H
 
+extern bool aarch_rev16_p (rtx);
+extern bool aarch_rev16_shleft_mask_imm_p (rtx, enum machine_mode);
+extern bool aarch_rev16_shright_mask_imm_p (rtx, enum machine_mode);
 extern int arm_early_load_addr_dep (rtx, rtx);
 extern int arm_early_store_addr_dep (rtx, rtx);
 extern int arm_mac_accumulator_is_mul_result (rtx, rtx);
diff --git a/gcc/config/arm/aarch-common.c b/gcc/config/arm/aarch-common.c
index c11f7e9..75ed3fd 100644
--- a/gcc/config/arm/aarch-common.c
+++ b/gcc/config/arm/aarch-common.c
@@ -155,6 +155,79 @@ arm_get_set_operands (rtx producer, rtx consumer,
   return 0;
 }
 
+bool
+aarch_rev16_s

[PATCH][ARM][3/3] Recognise bitwise operations leading to SImode rev16

2014-03-19 Thread Kyrill Tkachov

Hi all,

This is the arm equivalent of patch [2/3] in the series that adds combine 
patterns for the bitwise operations leading to a rev16 instruction.
It reuses the functions that were put in aarch-common.c to properly cost these 
operations.


I tried matching a DImode rev16 (with the intent of splitting it into two rev16 
ops) like aarch64 but combine wouldn't try to match that bitwise pattern in 
DImode like aarch64 does. Instead it tries various exotic combinations with subregs.


Tested arm-none-eabi, bootstrap on arm-none-linux-gnueabihf.

Ok for stage1?

[gcc/]
2014-03-19  Kyrylo Tkachov  

* config/arm/arm.md (arm_rev16si2): New pattern.
(arm_rev16si2_alt): Likewise.
* config/arm/arm.c (arm_new_rtx_costs): Handle rev16 case.


[gcc/testsuite/]
2014-03-19  Kyrylo Tkachov  

* gcc.target/arm/rev16.c: New test.

commit 04e60723bd1fa2f8e2adcfeed676390643ffec0c
Author: Kyrylo Tkachov 
Date:   Tue Feb 25 15:26:52 2014 +

[ARM] Implement SImode rev16

diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 8d1d721..ed603f0 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -9716,8 +9716,17 @@ arm_new_rtx_costs (rtx x, enum rtx_code code, enum rtx_code outer_code,
   /* Vector mode?  */
   *cost = LIBCALL_COST (2);
   return false;
+case IOR:
+  if (mode == SImode && arm_arch6 && aarch_rev16_p (x))
+{
+  *cost = COSTS_N_INSNS (1);
+  if (speed_p)
+*cost += extra_cost->alu.rev;
 
-case AND: case XOR: case IOR:
+  return true;
+}
+/* Fall through.  */
+case AND: case XOR:
   if (mode == SImode)
 	{
 	  enum rtx_code subcode = GET_CODE (XEXP (x, 0));
diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index 4df24a2..47bc747 100644
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -12668,6 +12668,44 @@
(set_attr "type" "rev")]
 )
 
+;; There are no canonicalisation rules for the position of the lshiftrt, ashift
+;; operations within an IOR/AND RTX, therefore we have two patterns matching
+;; each valid permutation.
+
+(define_insn "arm_rev16si2"
+  [(set (match_operand:SI 0 "register_operand" "=l,l,r")
+        (ior:SI (and:SI (ashift:SI (match_operand:SI 1 "register_operand" "l,l,r")
+                                   (const_int 8))
+                        (match_operand:SI 3 "const_int_operand" "n,n,n"))
+                (and:SI (lshiftrt:SI (match_dup 1)
+                                     (const_int 8))
+                        (match_operand:SI 2 "const_int_operand" "n,n,n"))))]
+  "arm_arch6
+   && aarch_rev16_shleft_mask_imm_p (operands[3], SImode)
+   && aarch_rev16_shright_mask_imm_p (operands[2], SImode)"
+  "rev16\\t%0, %1"
+  [(set_attr "arch" "t1,t2,32")
+   (set_attr "length" "2,2,4")
+   (set_attr "type" "rev")]
+)
+
+(define_insn "arm_rev16si2_alt"
+  [(set (match_operand:SI 0 "register_operand" "=l,l,r")
+        (ior:SI (and:SI (lshiftrt:SI (match_operand:SI 1 "register_operand" "l,l,r")
+                                     (const_int 8))
+                        (match_operand:SI 2 "const_int_operand" "n,n,n"))
+                (and:SI (ashift:SI (match_dup 1)
+                                   (const_int 8))
+                        (match_operand:SI 3 "const_int_operand" "n,n,n"))))]
+  "arm_arch6
+   && aarch_rev16_shleft_mask_imm_p (operands[3], SImode)
+   && aarch_rev16_shright_mask_imm_p (operands[2], SImode)"
+  "rev16\\t%0, %1"
+  [(set_attr "arch" "t1,t2,32")
+   (set_attr "length" "2,2,4")
+   (set_attr "type" "rev")]
+)
+
 (define_expand "bswaphi2"
   [(set (match_operand:HI 0 "s_register_operand" "=r")
 	(bswap:HI (match_operand:HI 1 "s_register_operand" "r")))]
diff --git a/gcc/testsuite/gcc.target/arm/rev16.c b/gcc/testsuite/gcc.target/arm/rev16.c
new file mode 100644
index 000..1c869b3
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/rev16.c
@@ -0,0 +1,35 @@
+/* { dg-options "-O2" } */
+/* { dg-do run } */
+
+extern void abort (void);
+
+typedef unsigned int __u32;
+
+__u32
+__rev16_32_alt (__u32 x)
+{
+  return (((__u32)(x) & (__u32)0xff00ff00UL) >> 8)
+ | (((__u32)(x) & (__u32)0x00ff00ffUL) << 8);
+}
+
+__u32
+__rev16_32 (__u32 x)
+{
+  return (((__u32)(x) & (__u32)0x00ff00ffUL) << 8)
+ | (((__u32)(x) & (__u32)0xff00ff00UL) >> 8);
+}
+
+int
+main (void)
+{
+  volatile __u32 in32 = 0x12345678;
+  volatile __u32 expected32 = 0x34127856;
+
+  if (__rev16_32 (in32) != expected32)
+    abort ();
+
+  if (__rev16_32_alt (in32) != expected32)
+    abort ();
+
+  return 0;
+}

[PATCH][AArch64] Add handling of bswap operations in rtx costs

2014-03-19 Thread Kyrill Tkachov

Hi all,

This patch depends on the series started at 
http://gcc.gnu.org/ml/gcc-patches/2014-03/msg00933.html but is not really a part 
of it. It just adds costing of the bswap operation using the new rev field in 
the rtx cost tables since we have patterns in aarch64.md that handle bswap by 
generating rev16 instructions.


Tested aarch64-none-elf.

Ok for stage1 after that series goes in?

2014-03-19  Kyrylo Tkachov  

* config/aarch64/aarch64.c (aarch64_rtx_costs): Handle BSWAP.

commit b9771a71dbf62522d423e16ce03353624c1ccd5a
Author: Kyrylo Tkachov 
Date:   Thu Feb 27 11:55:27 2014 +

[AArch64] Cost bswap operations properly

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 901ad3d..28c8841 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -4678,6 +4678,14 @@ aarch64_rtx_costs (rtx x, int code, int outer ATTRIBUTE_UNUSED,
 
   return false;
 
+case BSWAP:
+  *cost = COSTS_N_INSNS (1);
+
+  if (speed)
+*cost += extra_cost->alu.rev;
+
+  return false;
+
 case IOR:
 case XOR:
 case AND:

stray warning from gcc's cpp

2014-03-19 Thread Andriy Gapon

I observe the following minor annoyance on FreeBSD systems where cpp is GCC's
cpp.  If a DTrace script has the following shebang line:
#!/usr/sbin/dtrace -Cs
then the following warning is produced when the script is run:
cc1: warning: <stdin> is shorter than expected

Some details.  dtrace(1) first forks.  The child then seeks on the file
descriptor associated with the script file so that the shebang line is skipped
(because otherwise it would confuse cpp).  The child then makes that file
descriptor its standard input and execs cpp.  cpp performs fstat(2) on its
standard input descriptor and determines that it points to a regular file.  It
then verifies that the number of bytes it reads from the file matches the size
of the file.  That check makes sense when cpp opens the file itself, but not
necessarily for stdin, which may already be positioned past the start of the
file as described above.
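
The mismatch is easy to reproduce outside cpp. Below is a minimal sketch, not libcpp code: shorter_than_stat_size is a hypothetical helper that compares the size reported by fstat(2) with the number of bytes actually readable from the descriptor's current offset.

```c
#include <assert.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Read everything from FD's current offset.  Returns the number of bytes
   read, or -1 on a read error.  */
static long
read_all (int fd)
{
  char buf[256];
  long total = 0;
  ssize_t n;

  while ((n = read (fd, buf, sizeof buf)) > 0)
    total += n;
  return n < 0 ? -1 : total;
}

/* Hypothetical helper: returns 1 if reading FD from its current offset
   yields a byte count different from st_size, 0 if the two agree, and
   -1 on error.  */
int
shorter_than_stat_size (int fd)
{
  struct stat st;
  long total;

  if (fstat (fd, &st) != 0)
    return -1;
  total = read_all (fd);
  if (total < 0)
    return -1;
  return total != (long) st.st_size;
}
```

If the descriptor has already been advanced past the first line, the helper returns 1 even though nothing is wrong with the file, which is the situation that triggers the warning.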

The following patch seems to fix the issue, but perhaps there is a better /
smarter alternative.

--- a/libcpp/files.c
+++ b/libcpp/files.c
@@ -601,7 +601,8 @@ read_file_guts (cpp_reader *pfile, _cpp_file *file)
   return false;
 }

-  if (regular && total != size && STAT_SIZE_RELIABLE (file->st))
+  if (regular && total != size && file->fd != 0
+  && STAT_SIZE_RELIABLE (file->st))
 cpp_error (pfile, CPP_DL_WARNING,
   "%s is shorter than expected", file->path);


-- 
Andriy Gapon


Re: [AArch64] 64-bit float vreinterpret implementation

2014-03-19 Thread Marcus Shawcroft
On 28 February 2014 10:30, Alex Velenko  wrote:

> Hi Richard,
> Thank you for your suggestion. Attached is a patch that includes
> implementation of your proposition. A testsuite was run on LE and BE
> compilers with no regressions.
>
> Here is the description of the patch:
>
> This patch introduces vreinterpret implementation for vectors with 64-bit
> float lanes and adds testcase for those intrinsics.

The aarch64_init_simd_builtins() infrastructure requires the presence
of named RTL patterns in order to construct the types of the SIMD
intrinsics even when an intrinsic is emitted as tree. This seems
rather ugly to me.  At some point we should figure out how to clean up
this aspect of aarch64_init_simd_builtins() and remove the otherwise
unused .md patterns.  This aside, I think your patch is fine as it
stands and can be committed in stage-1.

Cheers
/Marcus


Re: [PATCH, ARM] Fix ICE due to out of bound.

2014-03-19 Thread Jakub Jelinek
On Wed, Mar 19, 2014 at 09:46:56AM +, Ramana Radhakrishnan wrote:
> On 03/19/14 08:42, Zhenqiang Chen wrote:
> >ICE when compiling gcc.target/arm/neon-modes-3.c with "-g" in
> >arm_dwarf_register_span since parts[8] is out of bound for XImode.
> >GET_MODE_SIZE (XImode) / 4 is 16. "rtx parts[8]" can not hold all the
> >registers.
> >
> >According to arm-modes.def, 16 should be the biggest number. So the
> >patch updates parts to
> >
> >rtx parts[16];
> >
> >Bootstrap and no make check regression on ARM Chrome book.
> >
> >OK for trunk?
> >
> 
> It may be time in 4.10 or 5.0 (whatever we call it :)), to deal with
> the FIXME in arm_dwarf_register_span to deal with DW_OP_piece. I'm
> surprised that it's taken so long to hit this.
> 
> This is OK for stage4 - it looks sane to me but this needs an RM ack
> before applying.

Ok.

Jakub


[PATCH] Avoid ggc_collect () after WPA forking

2014-03-19 Thread Richard Biener

This patch avoids calling ggc_collect after we possibly forked
during WPA phase as that necessarily causes a lot of page
unsharing.  I have verified that during a LTO bootstrap we
do not allocate GC memory during (or after) lto_wpa_write_files,
thus the effect on memory use should be positive (the patch
below contains checking code making sure that we don't alloc).
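
The checking idea can be sketched independently of the GC allocator. This is only an illustration on top of malloc: guarded_alloc and freeze_allocations are hypothetical names, and the real checking code calls fatal_error rather than returning NULL.

```c
#include <assert.h>
#include <stdlib.h>

/* Nonzero while allocation is still allowed; cleared once the process
   may have forked.  Same role as may_alloc in the checking code.  */
static int may_alloc = 1;

/* Hypothetical wrapper: refuse to allocate after the freeze point.
   The real checking code calls fatal_error instead of returning NULL.  */
void *
guarded_alloc (size_t size)
{
  if (!may_alloc)
    return NULL;
  return malloc (size);
}

/* Mark the point after which no further allocation is expected.  */
void
freeze_allocations (void)
{
  may_alloc = 0;
}
```

Any allocation attempted after freeze_allocations () then shows up immediately instead of silently touching shared pages.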

LTO bootstrapped on x86_64-unknown-linux-gnu, will apply shortly
(without the checking code of course).

That should fix the WPA memory explosion Martin sees with building
Chromium.

Richard.

2014-03-19  Richard Biener  

* lto.c (lto_wpa_write_files): Move call to
lto_promote_cross_file_statics ...
(do_whole_program_analysis): ... here, into the partitioning
block.  Do not ggc_collect after lto_wpa_write_files but
for a last time before it.

Index: gcc/ggc-page.c
===
--- gcc/ggc-page.c  (revision 208642)
+++ gcc/ggc-page.c  (working copy)
@@ -1199,6 +1199,8 @@ ggc_round_alloc_size (size_t requested_s
   return size;
 }
 
+int may_alloc = 1;
+
 /* Allocate a chunk of memory of SIZE bytes.  Its contents are undefined.  */
 
 void *
@@ -1208,6 +1210,9 @@ ggc_internal_alloc_stat (size_t size MEM
   struct page_entry *entry;
   void *result;
 
+  if (!may_alloc)
+fatal_error ("allocating GC memory");
+
   ggc_round_alloc_size_1 (size, &order, &object_size);
 
   /* If there are non-full pages for this size allocation, they are at
Index: gcc/lto/lto.c
===
--- gcc/lto/lto.c   (revision 208642)
+++ gcc/lto/lto.c   (working copy)
@@ -2565,11 +2566,6 @@ lto_wpa_write_files (void)
   FOR_EACH_VEC_ELT (ltrans_partitions, i, part)
     lto_stats.num_output_symtab_nodes += lto_symtab_encoder_size (part->encoder);
 
-  /* Find out statics that need to be promoted
- to globals with hidden visibility because they are accessed from multiple
- partitions.  */
-  lto_promote_cross_file_statics ();
-
   timevar_pop (TV_WHOPR_WPA);
 
   timevar_push (TV_WHOPR_WPA_IO);
@@ -3281,11 +3277,25 @@ do_whole_program_analysis (void)
 node->aux = NULL;
 
   lto_stats.num_cgraph_partitions += ltrans_partitions.length ();
+
+  /* Find out statics that need to be promoted
+ to globals with hidden visibility because they are accessed from multiple
+ partitions.  */
+  lto_promote_cross_file_statics ();
   timevar_pop (TV_WHOPR_PARTITIONING);
 
   timevar_stop (TV_PHASE_OPT_GEN);
-  timevar_start (TV_PHASE_STREAM_OUT);
 
+  /* Collect a last time - in lto_wpa_write_files we may end up forking
+ with the idea that this doesn't increase memory usage.  So we
+ absolutely do not want to collect after that.  */
+  ggc_collect ();
+{
+  extern int may_alloc;
+  may_alloc = 0;
+}
+
+  timevar_start (TV_PHASE_STREAM_OUT);
   if (!quiet_flag)
 {
   fprintf (stderr, "\nStreaming out");
@@ -3294,10 +3304,8 @@ do_whole_program_analysis (void)
   lto_wpa_write_files ();
   if (!quiet_flag)
 fprintf (stderr, "\n");
-
   timevar_stop (TV_PHASE_STREAM_OUT);
 
-  ggc_collect ();
   if (post_ipa_mem_report)
 {
   fprintf (stderr, "Memory consumption after IPA\n");


[PATCH] Fix ubsan ICE (PR sanitizer/60569)

2014-03-19 Thread Marek Polacek
Apparently with LTO we can get a TYPE_NAME without a DECL_NAME,
so check that it exists before accessing it.
Note that the test has to be run; only compiling wasn't enough
to provoke the ICE.
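
For readers who do not have the tree accessors in their head, the shape
of the guard can be sketched in plain C.  The structs and the function
name below are hypothetical stand-ins, not GCC's tree nodes or the ubsan
code; the sketch only assumes that a type may point to a declaration
whose name is missing.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for GCC tree nodes; not the real structures.  */
struct toy_decl { const char *name; };   /* name may be NULL, as under LTO */
struct toy_type { const struct toy_decl *type_name; };

/* Return the printable name of TYPE, or FALLBACK when the type has no
   declaration or the declaration is unnamed.  */
const char *
toy_type_name (const struct toy_type *type, const char *fallback)
{
  assert (type != NULL);
  /* Check both links before dereferencing, as the patch does for
     DECL_NAME (TYPE_NAME (type2)).  */
  if (type->type_name != NULL && type->type_name->name != NULL)
    return type->type_name->name;
  return fallback;
}
```

Calling toy_type_name with an unnamed declaration returns the fallback
instead of dereferencing a null pointer.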

Ran ubsan testsuite on x86_64-linux, ok for trunk?

2014-03-19  Marek Polacek  

PR sanitizer/60569
* ubsan.c (ubsan_type_descriptor): Check that DECL_NAME is nonnull
before accessing it.
testsuite/
* g++.dg/ubsan/pr60569.C: New test.

diff --git gcc/testsuite/g++.dg/ubsan/pr60569.C gcc/testsuite/g++.dg/ubsan/pr60569.C
index e69de29..df6b7a4 100644
--- gcc/testsuite/g++.dg/ubsan/pr60569.C
+++ gcc/testsuite/g++.dg/ubsan/pr60569.C
@@ -0,0 +1,21 @@
+// PR sanitizer/60569
+// { dg-do run }
+// { dg-require-effective-target lto }
+// { dg-options "-fsanitize=undefined -flto" }
+
+struct A
+{
+  void foo ();
+  struct
+  {
+int i;
+void bar () { i = 0; }
+  } s;
+};
+
+void A::foo () { s.bar (); }
+
+int
+main ()
+{
+}
diff --git gcc/ubsan.c gcc/ubsan.c
index 7c7a893..22470da 100644
--- gcc/ubsan.c
+++ gcc/ubsan.c
@@ -318,7 +318,7 @@ ubsan_type_descriptor (tree type, bool want_pointer_type_p)
 {
   if (TREE_CODE (TYPE_NAME (type2)) == IDENTIFIER_NODE)
tname = IDENTIFIER_POINTER (TYPE_NAME (type2));
-  else
+  else if (DECL_NAME (TYPE_NAME (type2)) != NULL)
tname = IDENTIFIER_POINTER (DECL_NAME (TYPE_NAME (type2)));
 }
 

Marek


Re: [PATCH] Fix ubsan ICE (PR sanitizer/60569)

2014-03-19 Thread Jakub Jelinek
On Wed, Mar 19, 2014 at 12:13:57PM +0100, Marek Polacek wrote:
> Apparently with LTO we can get a TYPE_NAME without a DECL_NAME,
> so check that it exists before accessing it.
> Note that the test has to be run; only compiling wasn't enough
> to provoke the ICE.

??  Shouldn't // { dg-do link } be sufficient?

> --- gcc/ubsan.c
> +++ gcc/ubsan.c
> @@ -318,7 +318,7 @@ ubsan_type_descriptor (tree type, bool 
> want_pointer_type_p)
>  {
>if (TREE_CODE (TYPE_NAME (type2)) == IDENTIFIER_NODE)
>   tname = IDENTIFIER_POINTER (TYPE_NAME (type2));
> -  else
> +  else if (DECL_NAME (TYPE_NAME (type2)) != NULL)
>   tname = IDENTIFIER_POINTER (DECL_NAME (TYPE_NAME (type2)));
>  }

This looks good to me.

Jakub


Re: [PATCH] Avoid ggc_collect () after WPA forking

2014-03-19 Thread Steven Bosscher
On Wed, Mar 19, 2014 at 12:10 PM, Richard Biener wrote:
> Index: gcc/ggc-page.c
> ===
> --- gcc/ggc-page.c  (revision 208642)
> +++ gcc/ggc-page.c  (working copy)
> @@ -1199,6 +1199,8 @@ ggc_round_alloc_size (size_t requested_s
>return size;
>  }
>
> +int may_alloc = 1;

"bool may_alloc"?

Ciao!
Steven


Re: [testsuite] Fix gcc.dg/tls/pr58595.c on Solaris 9

2014-03-19 Thread Rainer Orth
Jakub Jelinek  writes:

> On Tue, Mar 18, 2014 at 11:19:52AM +0100, Rainer Orth wrote:
>> The new gcc.dg/tls/pr58595.c testcase FAILs on Solaris 9:
>> 
>> FAIL: gcc.dg/tls/pr58595.c (test for excess errors)
>> Excess errors:
>> Undefined   first referenced
>>  symbol in file
>> ___tls_get_addr /var/tmp//ccuBbAna.o
>> ld: fatal: Symbol referencing errors. No output written to ./pr58595.exe
>> WARNING: gcc.dg/tls/pr58595.c compilation failed to produce executable
>> 
>> Fixed as follows, tested with the appropriate runtest invocation on
>> i386-pc-solaris2.9, i386-pc-solaris2.11, and x86_64-unknown-linux-gnu,
>> installed on mainline.
>
> Can you please also change
> /* { dg-require-effective-target tls } */
> to
> /* { dg-require-effective-target tls_runtime } */
> ?

Sure, done as follows after retesting as before:

2014-03-19  Rainer Orth  

* gcc.dg/tls/pr58595.c: Require tls_runtime instead of tls.

changeset:   13384:d1c2de35507e
tag: tip
user:Rainer Orth 
date:Wed Mar 19 13:04:36 2014 +0100
summary: Require tls_runtime in gcc.dg/tls/pr58595.c

diff --git a/gcc/testsuite/gcc.dg/tls/pr58595.c b/gcc/testsuite/gcc.dg/tls/pr58595.c
--- a/gcc/testsuite/gcc.dg/tls/pr58595.c
+++ b/gcc/testsuite/gcc.dg/tls/pr58595.c
@@ -3,7 +3,7 @@
 /* { dg-options "-O2" } */
 /* { dg-additional-options "-fpic" { target fpic } } */
 /* { dg-add-options tls } */
-/* { dg-require-effective-target tls } */
+/* { dg-require-effective-target tls_runtime } */
 /* { dg-require-effective-target sync_int_long } */
 
 struct S { unsigned long a, b; };


> BTW, don't know if dg-add-options tls can come before that or not.

It can: the tls_runtime check takes care of adding the options itself.

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


Re: [PATCH] Fix ubsan ICE (PR sanitizer/60569)

2014-03-19 Thread Marek Polacek
On Wed, Mar 19, 2014 at 12:17:19PM +0100, Jakub Jelinek wrote:
> On Wed, Mar 19, 2014 at 12:13:57PM +0100, Marek Polacek wrote:
> > Apparently with LTO we can get a TYPE_NAME without a DECL_NAME,
> > so check that it exists before accessing it.
> > Note that the test has to be run; only compiling wasn't enough
> > to provoke the ICE.
> 
> ??  Shouldn't // { dg-do link } be sufficient?

Ah, forgot about that, it is sufficient.  Ok with dg-do link instead
of dg-do run?
 
> > --- gcc/ubsan.c
> > +++ gcc/ubsan.c
> > @@ -318,7 +318,7 @@ ubsan_type_descriptor (tree type, bool 
> > want_pointer_type_p)
> >  {
> >if (TREE_CODE (TYPE_NAME (type2)) == IDENTIFIER_NODE)
> > tname = IDENTIFIER_POINTER (TYPE_NAME (type2));
> > -  else
> > +  else if (DECL_NAME (TYPE_NAME (type2)) != NULL)
> > tname = IDENTIFIER_POINTER (DECL_NAME (TYPE_NAME (type2)));
> >  }
> 
> This looks good to me.

Thanks.

Marek


Re: [PATCH] Avoid ggc_collect () after WPA forking

2014-03-19 Thread Richard Biener
On Wed, 19 Mar 2014, Steven Bosscher wrote:

> On Wed, Mar 19, 2014 at 12:10 PM, Richard Biener wrote:
> > Index: gcc/ggc-page.c
> > ===
> > --- gcc/ggc-page.c  (revision 208642)
> > +++ gcc/ggc-page.c  (working copy)
> > @@ -1199,6 +1199,8 @@ ggc_round_alloc_size (size_t requested_s
> >return size;
> >  }
> >
> > +int may_alloc = 1;
> 
> "bool may_alloc"?

It's only checking code I didn't commit.  We may of course alloc
but I wanted to prove we don't.
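
The idea behind such checking code can be sketched outside of GCC.  This
is a minimal illustration under stated assumptions: toy_alloc and
toy_freeze_allocations are made-up names, malloc stands in for the GC
allocator, and the sketch refuses the request where the posted hunk
calls fatal_error.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical allocator wrapper; GCC's ggc_internal_alloc_stat is not
   modelled here.  */
static bool toy_may_alloc = true;
static unsigned long toy_rejected_allocs = 0;

/* Forbid further allocations, e.g. right before forking workers.  */
void
toy_freeze_allocations (void)
{
  toy_may_alloc = false;
}

/* Allocate SIZE bytes, or return NULL once allocations are frozen.  */
void *
toy_alloc (size_t size)
{
  if (!toy_may_alloc)
    {
      /* The posted checking code aborts here; the sketch only counts
         the violation and refuses the request.  */
      toy_rejected_allocs++;
      return NULL;
    }
  return malloc (size);
}
```

Any allocation attempted after toy_freeze_allocations shows up in
toy_rejected_allocs, which is the property the checking code was meant
to prove absent.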

Richard.


[PATCH] Reduce GC walk recursion depth for types

2014-03-19 Thread Richard Biener

This reduces GC walk recursion depth in two ways.

First by re-ordering tree_type_common members to move 'name' last
and 'canonical' before 'next_variant'.  That makes us
first recurse downward (type, pointer_to/reference_to), then
on the same level (canonical, next_variant, main_variant)
and finally upward (context, name->decl_context).
For TS_TYPE_NON_COMMON we still walk down afterwards via values,
on the same level via minval/maxval and upwards via binfo, so
the claim that the patch helps is maybe too much handwaving?  (But it
does help a reduced testcase without doing the 2nd part.)

Second by choosing something different for chain_next for types
than TREE_CHAIN (which is TYPE_STUB_DECL, no chain at all).
That makes the unreduced testcase work and apart from the issue
below should be obvious enough (though there usually shouldn't
be so many type variants - still if for every type we save
two or three recursions that still helps).
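
Why the choice of chain_next matters can be shown with a toy marker.
This is only a sketch: toy_node and toy_mark are hypothetical and far
simpler than the generated GC walkers, but they follow the same split
between a link that is iterated over and links that are recursed into.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical node type; not GCC's tree_node.  */
struct toy_node
{
  bool marked;
  struct toy_node *child;   /* followed by recursion */
  struct toy_node *next;    /* followed by iteration, like chain_next */
};

/* Mark everything reachable from NODE and return the deepest recursion
   level reached.  DEPTH is the level of the current call.  */
unsigned
toy_mark (struct toy_node *node, unsigned depth)
{
  unsigned max_depth = depth;

  for (; node != NULL && !node->marked; node = node->next)
    {
      node->marked = true;
      if (node->child != NULL)
        {
          unsigned d = toy_mark (node->child, depth + 1);
          if (d > max_depth)
            max_depth = d;
        }
    }
  return max_depth;
}
```

A list of N nodes linked through next is marked at recursion depth 1,
while the same N nodes linked through child need depth N.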

Martin verified this fixes PR60553.

I've changed chain_next only for the LTO frontend as

  while (ggc_test_and_set_mark (xlimit))
   xlimit = (CODE_CONTAINS_STRUCT (TREE_CODE (&(*xlimit).generic), 
TS_TYPE_COMMON) ? ((union lang_tree_node *) 
(*xlimit).generic.type_common.next_variant) : CODE_CONTAINS_STRUCT 
(TREE_CODE (&(*xlimit).generic), TS_COMMON) ? ((union lang_tree_node *) 
(*xlimit).generic.common.chain) : NULL);

likely doesn't create great code ... (note duplicate tree checks
with checking here for other frontends, fixed LTO with the patch
below).

LTO bootstrap running on x86_64-unknown-linux-gnu.

Ok for trunk?

Thanks,
Richard.

2014-03-19  Richard Biener  

PR middle-end/60553
* tree-core.h (tree_type_common): Re-order pointer members
to reduce recursion depth during GC walks.

lto/
* lto-tree.h (lang_tree_node): For types use TYPE_NEXT_VARIANT 
instead of TREE_CHAIN as chain_next.

Index: gcc/tree-core.h
===
--- gcc/tree-core.h (revision 208642)
+++ gcc/tree-core.h (working copy)
@@ -1265,11 +1265,11 @@ struct GTY(()) tree_type_common {
 const char * GTY ((tag ("TYPE_SYMTAB_IS_POINTER"))) pointer;
 struct die_struct * GTY ((tag ("TYPE_SYMTAB_IS_DIE"))) die;
   } GTY ((desc ("debug_hooks->tree_type_symtab_field"))) symtab;
-  tree name;
+  tree canonical;
   tree next_variant;
   tree main_variant;
   tree context;
-  tree canonical;
+  tree name;
 };
 
 struct GTY(()) tree_type_with_lang_specific {
Index: gcc/lto/lto-tree.h
===
--- gcc/lto/lto-tree.h  (revision 208642)
+++ gcc/lto/lto-tree.h  (working copy)
@@ -48,7 +48,7 @@ enum lto_tree_node_structure_enum {
 };
 
 union GTY((desc ("lto_tree_node_structure (&%h)"),
- chain_next ("CODE_CONTAINS_STRUCT (TREE_CODE (&%h.generic), 
TS_COMMON) ? ((union lang_tree_node *) TREE_CHAIN (&%h.generic)) : NULL")))
+ chain_next ("CODE_CONTAINS_STRUCT (TREE_CODE (&%h.generic), 
TS_TYPE_COMMON) ? ((union lang_tree_node *) 
%h.generic.type_common.next_variant) : CODE_CONTAINS_STRUCT (TREE_CODE 
(&%h.generic), TS_COMMON) ? ((union lang_tree_node *) %h.generic.common.chain) 
: NULL")))
 lang_tree_node
 {
   union tree_node GTY ((tag ("TS_LTO_GENERIC"),


Re: [PATCH] Reduce GC walk recursion depth for types

2014-03-19 Thread Jakub Jelinek
On Wed, Mar 19, 2014 at 02:02:10PM +0100, Richard Biener wrote:
> LTO bootstrap running on x86_64-unknown-linux-gnu.
> 
> Ok for trunk?
> 
> Thanks,
> Richard.
> 
> 2014-03-19  Richard Biener  
> 
>   PR middle-end/60553
>   * tree-core.h (tree_type_common): Re-order pointer members
>   to reduce recursion depth during GC walks.
> 
>   lto/
>   * lto-tree.h (lang_tree_node): For types use TYPE_NEXT_VARIANT 
>   instead of TREE_CHAIN as chain_next.

LGTM.

Jakub


[Fortran][PATCH][gomp4]: Transform OpenACC loop directive

2014-03-19 Thread Ilmir Usmanov

Hi Tobias!

This patch implements transformation of OpenACC loop directive from 
Fortran AST to GENERIC.


Successfully bootstrapped and tested with no new regressions on 
x86_64-unknown-linux-gnu.


OK for gomp4 branch?

--
Ilmir.
From de2dd5ba0c48500e8e9084bd46cbfac2f21352fe Mon Sep 17 00:00:00 2001
From: Ilmir Usmanov 
Date: Wed, 19 Mar 2014 15:12:36 +0400
Subject: [PATCH] Transform OpenACC loop directive from fortran AST to GENERIC

---
	* gcc/fortran/trans-openmp.c (gfc_trans_oacc_loop): New function.
	(gfc_trans_oacc_combined_directive): Call it.
	(gfc_trans_oacc_directive): Likewise.
	* gcc/tree-pretty-print (dump_omp_clause): Fix WORKER and VECTOR.
	* gcc/testsuite/gfortran.dg/goacc/loop-tree.f95: New test.

diff --git a/gcc/fortran/trans-openmp.c b/gcc/fortran/trans-openmp.c
index 29364f4..cb7c970 100644
--- a/gcc/fortran/trans-openmp.c
+++ b/gcc/fortran/trans-openmp.c
@@ -1571,11 +1571,181 @@ typedef struct dovar_init_d {
   tree init;
 } dovar_init;
 
+
+static tree
+gfc_trans_oacc_loop (gfc_code *code, stmtblock_t *pblock,
+		 gfc_omp_clauses *loop_clauses)
+{
+  gfc_se se;
+  tree dovar, stmt, from, to, step, type, init, cond, incr;
+  tree count = NULL_TREE, cycle_label, tmp, omp_clauses;
+  stmtblock_t block;
+  stmtblock_t body;
+  gfc_omp_clauses *clauses = code->ext.omp_clauses;
+  int i, collapse = clauses->collapse;
+  vec<dovar_init> inits = vNULL;
+  dovar_init *di;
+  unsigned ix;
+
+  if (collapse <= 0)
+collapse = 1;
+
+  code = code->block->next;
+  gcc_assert (code->op == EXEC_DO || code->op == EXEC_DO_CONCURRENT);
+
+  init = make_tree_vec (collapse);
+  cond = make_tree_vec (collapse);
+  incr = make_tree_vec (collapse);
+
+  if (pblock == NULL)
+{
+  gfc_start_block (&block);
+  pblock = &block;
+}
+
+  omp_clauses = gfc_trans_omp_clauses (pblock, loop_clauses, code->loc);
+
+  for (i = 0; i < collapse; i++)
+{
+  int simple = 0;
+
+  /* Evaluate all the expressions in the iterator.  */
+  gfc_init_se (&se, NULL);
+  gfc_conv_expr_lhs (&se, code->ext.iterator->var);
+  gfc_add_block_to_block (pblock, &se.pre);
+  dovar = se.expr;
+  type = TREE_TYPE (dovar);
+  gcc_assert (TREE_CODE (type) == INTEGER_TYPE);
+
+  gfc_init_se (&se, NULL);
+  gfc_conv_expr_val (&se, code->ext.iterator->start);
+  gfc_add_block_to_block (pblock, &se.pre);
+  from = gfc_evaluate_now (se.expr, pblock);
+
+  gfc_init_se (&se, NULL);
+  gfc_conv_expr_val (&se, code->ext.iterator->end);
+  gfc_add_block_to_block (pblock, &se.pre);
+  to = gfc_evaluate_now (se.expr, pblock);
+
+  gfc_init_se (&se, NULL);
+  gfc_conv_expr_val (&se, code->ext.iterator->step);
+  gfc_add_block_to_block (pblock, &se.pre);
+  step = gfc_evaluate_now (se.expr, pblock);
+
+  /* Special case simple loops.  */
+  if (TREE_CODE (dovar) == VAR_DECL)
+	{
+	  if (integer_onep (step))
+	simple = 1;
+	  else if (tree_int_cst_equal (step, integer_minus_one_node))
+	simple = -1;
+	}
+
+  /* Loop body.  */
+  if (simple)
+	{
+	  TREE_VEC_ELT (init, i) = build2_v (MODIFY_EXPR, dovar, from);
+	  /* The condition should not be folded.  */
+	  TREE_VEC_ELT (cond, i) = build2_loc (input_location, simple > 0
+	   ? LE_EXPR : GE_EXPR,
+	   boolean_type_node, dovar, to);
+	  TREE_VEC_ELT (incr, i) = fold_build2_loc (input_location, PLUS_EXPR,
+		type, dovar, step);
+	  TREE_VEC_ELT (incr, i) = fold_build2_loc (input_location,
+		MODIFY_EXPR,
+		type, dovar,
+		TREE_VEC_ELT (incr, i));
+	}
+  else
+	{
+	  /* STEP is not 1 or -1.  Use:
+	 for (count = 0; count < (to + step - from) / step; count++)
+	   {
+		 dovar = from + count * step;
+		 body;
+	   cycle_label:;
+	   }  */
+	  tmp = fold_build2_loc (input_location, MINUS_EXPR, type, step, from);
+	  tmp = fold_build2_loc (input_location, PLUS_EXPR, type, to, tmp);
+	  tmp = fold_build2_loc (input_location, TRUNC_DIV_EXPR, type, tmp,
+ step);
+	  tmp = gfc_evaluate_now (tmp, pblock);
+	  count = gfc_create_var (type, "count");
+	  TREE_VEC_ELT (init, i) = build2_v (MODIFY_EXPR, count,
+	 build_int_cst (type, 0));
+	  /* The condition should not be folded.  */
+	  TREE_VEC_ELT (cond, i) = build2_loc (input_location, LT_EXPR,
+	   boolean_type_node,
+	   count, tmp);
+	  TREE_VEC_ELT (incr, i) = fold_build2_loc (input_location, PLUS_EXPR,
+		type, count,
+		build_int_cst (type, 1));
+	  TREE_VEC_ELT (incr, i) = fold_build2_loc (input_location,
+		MODIFY_EXPR, type, count,
+		TREE_VEC_ELT (incr, i));
+
+	  /* Initialize DOVAR.  */
+	  tmp = fold_build2_loc (input_location, MULT_EXPR, type, count, step);
+	  tmp = fold_build2_loc (input_location, PLUS_EXPR, type, from, tmp);
+	  dovar_init e = {dovar, tmp};
+	  inits.safe_push (e);
+	}
+
+  if (i + 1 < collapse)
+	code = code->block->next;
+}
+
+  if (pblock != &block)
+{
+  pushlevel ();
+

Re: [C++ PATCH] [gomp4] Initial OpenACC support to C++ front-end

2014-03-19 Thread Ilmir Usmanov

Ping.

On 13.03.2014 21:05, Ilmir Usmanov wrote:
> On 07.03.2014 15:37, Ilmir Usmanov wrote:
>> Hi Thomas!
>>
>> I prepared simple patch to add support of OpenACC data, kernels and
>> parallel constructs to C++ FE.
>>
>> It adds support of data clauses too.
>>
>> OK to gomp4 branch?
>
> Fixed subject: changed file extensions of tests and fixed comments.
>
> OK to gomp4 branch?


--
Ilmir.


Re: [patch] gcc fstack-protector-explicit

2014-03-19 Thread Marcos Díaz
Well, finally I have the assignment, could you please review this patch?

On Wed, Nov 20, 2013 at 4:13 PM, Jeff Law  wrote:
> On 11/19/13 07:04, Marcos Díaz wrote:
>>
>> My employer is working on the signature of the papers. Could someone
>> please do the review meanwhile?
>
> I'd prefer to wait until the assignment process is complete.  If something
> were to happen and we can't use your code the review time would have been
> wasted (and such things have certainly happened in the past).
>
> Once the assignment is recorded, please ping this patch.
>
> Jeff
>



-- 
__


Marcos Díaz

Software Engineer


San Lorenzo 47, 3rd Floor, Office 5

Córdoba, Argentina


Phone: +54 351 4217888 / +54 351 4218211/ +54 351 7617452

Skype: markdiaz22


Re: [patch] gcc fstack-protector-explicit

2014-03-19 Thread Jeff Law

On 03/19/14 08:06, Marcos Díaz wrote:
> Well, finally I have the assignment, could you please review this patch?

Thanks.  I'll take a look once we open up stage1 development again
(should be soon as 4.9 is getting close to being ready).


jeff



Re: [PATCH] Avoid ggc_collect () after WPA forking

2014-03-19 Thread Richard Biener
On Wed, 19 Mar 2014, Martin Liška wrote:

> There are stats for Firefox with LTO and -O2. According to graphs it
> looks that memory consumption for parallel WPA phase is similar.
> When I disable parallel WPA, wpa footprint is ~4GB, but ltrans memory
> footprint is similar to parallel WPA that reduces libxul.so linking by ~10%.

Ok, so I suppose this tracks RSS, not virtual memory use (what is
"used" and what is "active")?

And it is WPA plus LTRANS stages, WPA ends where memory use first goes
down to zero?

I wonder if you can identify the point where parallel streaming
starts and where it ends ... ;)

Btw, I have another patch in my local tree, limiting the
exponential growth of blocks we allocate when outputting sections.
But it shouldn't be _that_ bad ... maybe you can try if it has
any effect?

Thanks,
Richard.

Index: gcc/lto-section-out.c
===
--- gcc/lto-section-out.c   (revision 208642)
+++ gcc/lto-section-out.c   (working copy)
@@ -99,13 +99,19 @@ lto_end_section (void)
 }
 
 
+/* We exponentially grow the size of the blocks as we need to make
+   room for more data to be written.  Start with a single page and go up
+   to 2MB pages for this.  */
+#define FIRST_BLOCK_SIZE 4096
+#define MAX_BLOCK_SIZE (2 * 1024 * 1024)
+
 /* Write all of the chars in OBS to the assembler.  Recycle the blocks
in obs as this is being done.  */
 
 void
 lto_write_stream (struct lto_output_stream *obs)
 {
-  unsigned int block_size = 1024;
+  unsigned int block_size = FIRST_BLOCK_SIZE;
   struct lto_char_ptr_base *block;
   struct lto_char_ptr_base *next_block;
   if (!obs->first_block)
@@ -135,6 +141,7 @@ lto_write_stream (struct lto_output_stre
   else
lang_hooks.lto.append_data (base, num_chars, block);
   block_size *= 2;
+  block_size = MIN (MAX_BLOCK_SIZE, block_size);
 }
 }
 
@@ -152,7 +159,7 @@ lto_append_block (struct lto_output_stre
 {
   /* This is the first time the stream has been written
 into.  */
-  obs->block_size = 1024;
+  obs->block_size = FIRST_BLOCK_SIZE;
   new_block = (struct lto_char_ptr_base*) xmalloc (obs->block_size);
   obs->first_block = new_block;
 }
@@ -162,6 +169,7 @@ lto_append_block (struct lto_output_stre
   /* Get a new block that is twice as big as the last block
 and link it into the list.  */
   obs->block_size *= 2;
+  obs->block_size = MIN (MAX_BLOCK_SIZE, obs->block_size);
   new_block = (struct lto_char_ptr_base*) xmalloc (obs->block_size);
   /* The first bytes of the block are reserved as a pointer to
 the next block.  Set the chain of the full block to the
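
The effect of the cap can be sketched independently of the streamer.
The two constants take their values from the patch above; the function
names are hypothetical, and the per-block header that the real stream
reserves for the chain pointer is ignored.

```c
#include <assert.h>
#include <stddef.h>

/* Values taken from the patch above.  */
#define FIRST_BLOCK_SIZE ((size_t) 4096)
#define MAX_BLOCK_SIZE ((size_t) 2 * 1024 * 1024)

/* Size of the block that follows one of CURRENT bytes: double it, but
   never exceed MAX_BLOCK_SIZE.  */
size_t
toy_next_block_size (size_t current)
{
  size_t doubled = current * 2;
  return doubled < MAX_BLOCK_SIZE ? doubled : MAX_BLOCK_SIZE;
}

/* Number of blocks needed to hold TOTAL bytes when sizes grow as
   above, starting from FIRST_BLOCK_SIZE.  */
size_t
toy_blocks_needed (size_t total)
{
  size_t blocks = 0;
  size_t size = FIRST_BLOCK_SIZE;
  size_t capacity = 0;

  while (capacity < total)
    {
      capacity += size;
      blocks++;
      size = toy_next_block_size (size);
    }
  return blocks;
}
```

With these values the tenth block is the first one of 2MB, and every
later block stays at 2MB, so the unused tail of the last block is
bounded by 2MB rather than by roughly the amount of data already
written.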

Re: [PATCH] Avoid ggc_collect () after WPA forking

2014-03-19 Thread Martin Liška


On 03/19/2014 03:55 PM, Richard Biener wrote:
> On Wed, 19 Mar 2014, Martin Liška wrote:
>
>> There are stats for Firefox with LTO and -O2. According to graphs it
>> looks that memory consumption for parallel WPA phase is similar.
>> When I disable parallel WPA, wpa footprint is ~4GB, but ltrans memory
>> footprint is similar to parallel WPA that reduces libxul.so linking by ~10%.
>
> Ok, so I suppose this tracks RSS, not virtual memory use (what is
> "used" and what is "active")?

Data are given by vmstat, according to:
http://stackoverflow.com/questions/18529723/what-is-active-memory-and-inactive-memory

*Active memory* is memory that is being used by a particular process.
*Inactive memory* is memory that was allocated to a process that is no
longer running.

So please follow just 'blue' line that displays really used memory.
According to man, vmstat tracks virtual memory statistics.

> And it is WPA plus LTRANS stages, WPA ends where memory use first goes
> down to zero?
> I wonder if you can identify the point where parallel streaming
> starts and where it ends ... ;)

Exactly, WPA ends when it goes to zero.

> Btw, I have another patch in my local tree, limiting the
> exponential growth of blocks we allocate when outputting sections.
> But it shouldn't be _that_ bad ... maybe you can try if it has
> any effect?

I can apply it.

Martin



Re: [ARM] [Trivial] Fix shortening of field name extend.

2014-03-19 Thread James Greenhalgh
On Mon, Feb 24, 2014 at 09:13:45AM +, James Greenhalgh wrote:
> *ping*, CCing Jakub.

*ping x2* This was OKed by ramana, but we wanted release manager approval.
I would have committed the patch as obvious if we were not in stage 4.

Thanks,
James

> On Wed, Feb 12, 2014 at 12:43:10PM +, Ramana Radhakrishnan wrote:
> > On 02/12/14 12:19, James Greenhalgh wrote:
> > >
> > > Hi,
> > >
> > > In aarch-common-protos.h we define a field in alu_cost_table:
> > >
> > >"extnd"
> > >
> > > On its own this is an upsetting optimization of the
> > > English language, but this trouble is compounded by the
> > > comment attached to this field throughout the cost tables
> > > themselves:
> > >
> > >/* Extend.  */
> > >
> > > This patch fixes the spelling of extend to match that in the
> > > comments.
> > >
> > > I've checked that AArch64 and AArch32 build with this patch
> > > applied.
> > >
> > > OK for trunk/stage-1 (I don't mind which)?
> > 
> > I am happy for this to go in now -
> > 
> > Jakub ?
> > 
> > 
> > regards
> > Ramana
> 

2014-03-19  James Greenhalgh  

* config/arm/aarch-common-protos.h
(alu_cost_table): Fix spelling of "extend".
* config/arm/arm.c (arm_new_rtx_costs): Fix spelling of "extend".

diff --git a/gcc/config/arm/aarch-common-protos.h 
b/gcc/config/arm/aarch-common-protos.h
index 056fe56..a5ff6b4 100644
--- a/gcc/config/arm/aarch-common-protos.h
+++ b/gcc/config/arm/aarch-common-protos.h
@@ -48,8 +48,8 @@ struct alu_cost_table
   const int arith_shift_reg;   /* ... and when the shift is by a reg.  */
   const int log_shift; /* Additional when logic also shifts...  */
   const int log_shift_reg; /* ... and when the shift is by a reg.  */
-  const int extnd; /* Zero/sign extension.  */
-  const int extnd_arith;   /* Extend and arith.  */
+  const int extend;/* Zero/sign extension.  */
+  const int extend_arith;  /* Extend and arith.  */
   const int bfi;   /* Bit-field insert.  */
   const int bfx;   /* Bit-field extraction.  */
   const int clz;   /* Count Leading Zeros.  */
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index a68ed8d..31df089 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -9594,7 +9594,7 @@ arm_new_rtx_costs (rtx x, enum rtx_code code, enum 
rtx_code outer_code,
{
  /* UXTA[BH] or SXTA[BH].  */
  if (speed_p)
-   *cost += extra_cost->alu.extnd_arith;
+   *cost += extra_cost->alu.extend_arith;
  *cost += (rtx_cost (XEXP (XEXP (x, 0), 0), ZERO_EXTEND, 0,
  speed_p)
+ rtx_cost (XEXP (x, 1), PLUS, 0, speed_p));
@@ -10311,7 +10311,7 @@ arm_new_rtx_costs (rtx x, enum rtx_code code, enum 
rtx_code outer_code,
  *cost = COSTS_N_INSNS (1);
  *cost += rtx_cost (XEXP (x, 0), code, 0, speed_p);
  if (speed_p)
-   *cost += extra_cost->alu.extnd;
+   *cost += extra_cost->alu.extend;
}
   else if (GET_MODE (XEXP (x, 0)) != SImode)
{
@@ -10364,7 +10364,7 @@ arm_new_rtx_costs (rtx x, enum rtx_code code, enum 
rtx_code outer_code,
  *cost = COSTS_N_INSNS (1);
  *cost += rtx_cost (XEXP (x, 0), code, 0, speed_p);
  if (speed_p)
-   *cost += extra_cost->alu.extnd;
+   *cost += extra_cost->alu.extend;
}
   else if (GET_MODE (XEXP (x, 0)) != SImode)
{




Re: [ARM] [Trivial] Fix shortening of field name extend.

2014-03-19 Thread Jakub Jelinek
On Wed, Mar 19, 2014 at 03:13:40PM +, James Greenhalgh wrote:
> On Mon, Feb 24, 2014 at 09:13:45AM +, James Greenhalgh wrote:
> > *ping*, CCing Jakub.
> 
> *ping x2* This was OKed by ramana, but we wanted release manager approval.
> I would have committed the patch as obvious if we were not in stage 4.

This is ok even in stage4.

Jakub


Re: [Patch AArch64] Define TARGET_FLAGS_REGNUM

2014-03-19 Thread Marcus Shawcroft
On 28 February 2014 09:32, Ramana Radhakrishnan  wrote:
> Hi,
>
> This defines TARGET_FLAGS_REGNUM for AArch64 to be CC_REGNUM.
> Noticed this turns on the cmpelim pass after reload and in a few examples
> and a couple of benchmarks I noticed a number of comparisons getting
> deleted. A similar patch for AArch32 is being tested.
>
> Tested cross with aarch64-none-elf on a model with no regressions.
>
> Ok for stage1 ?

OK /Marcus


Re: [PATCH] Avoid ggc_collect () after WPA forking

2014-03-19 Thread Richard Biener
On Wed, 19 Mar 2014, Martin Liška wrote:

> 
> On 03/19/2014 03:55 PM, Richard Biener wrote:
> > On Wed, 19 Mar 2014, Martin Liška wrote:
> > 
> > > There are stats for Firefox with LTO and -O2. According to graphs it
> > > looks that memory consumption for parallel WPA phase is similar.
> > > When I disable parallel WPA, wpa footprint is ~4GB, but ltrans memory
> > > footprint is similar to parallel WPA that reduces libxul.so linking by
> > > ~10%.
> > Ok, so I suppose this tracks RSS, not virtual memory use (what is
> > "used" and what is "active")?
> 
> Data are given by vmstat, according to:
> http://stackoverflow.com/questions/18529723/what-is-active-memory-and-inactive-memory
> 
> *Active memory* is memory that is being used by a particular process.
> *Inactive memory* is memory that was allocated to a process that is no longer
> running.
>
> So please follow just 'blue' line that displays really used memory. According
> to man, vmstat tracks virtual memory statistics.

But 'blue' is neither active nor inactive ... what is 'used'?  Does
it correspond to 'swpd'?

If it is virtual memory in use then this is expected to grow when 
fork()ing as the virtual memory space is obviously copied (just the pages 
are still shared).

For me allocating a GB memory and clearing it increases "active" by
1GB and then forking doesn't increase any of the metrics vmstat -a
outputs in any significant way.
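
An experiment of that kind can be sketched as follows.  This is not the
program used for the observation above, only a minimal POSIX example
under stated assumptions: toy_fork_after_touch is a made-up name, and
the metrics have to be watched externally (for instance with vmstat -a)
while the child sleeps.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Allocate BYTES, touch every page, fork, and let the child only read
   the memory.  Returns the child's exit status (0 on success) or -1 on
   error.  */
int
toy_fork_after_touch (size_t bytes, unsigned child_sleep_seconds)
{
  assert (bytes > 0);
  unsigned char *mem = malloc (bytes);
  if (mem == NULL)
    return -1;
  memset (mem, 1, bytes);   /* make the pages resident in the parent */

  pid_t pid = fork ();
  if (pid == -1)
    {
      free (mem);
      return -1;
    }
  if (pid == 0)
    {
      /* Child: reads do not modify the pages, so they can stay shared
         with the parent.  */
      unsigned char ok = mem[0] & mem[bytes - 1];
      sleep (child_sleep_seconds);
      _exit (ok == 1 ? 0 : 1);
    }

  int status;
  pid_t w = waitpid (pid, &status, 0);
  free (mem);
  if (w == -1 || !WIFEXITED (status))
    return -1;
  return WEXITSTATUS (status);
}
```

Writing to the buffer in the child instead of reading it would force
the touched pages to be copied, which is the behaviour the patch tries
to avoid by collecting before the fork rather than after it.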

> > And it is WPA plus LTRANS stages, WPA ends where memory use first goes
> > down to zero?
> > I wonder if you can identify the point where parallel streaming
> > starts and where it ends ... ;)
> 
> Exactly, WPA ends when it goes to zero.

So the difference isn't that big (8GB vs. 7.2GB), and is likely attributed
to heap memory we allocate during the stream-out.  For example
we need some for the tree-ref-encoders (I remember that can be a
significant amount of memory, but I improved that already as far as
possible...).  So yes, we _do_ allocate memory during stream-out
and that is now required N times.

> > Btw, I have another patch in my local tree, limiting the
> > exponential growth of blocks we allocate when outputting sections.
> > But it shouldn't be _that_ bad ... maybe you can try if it has
> > any effect?
> 
> I can apply it.

Thanks,
Richard.

PATCH: PR testsuite/60590: Can't recreate the same executable in testsuite

2014-03-19 Thread H.J. Lu
On Wed, Mar 19, 2014 at 8:41 AM, H.J. Lu  wrote:
> GNU linker sets DT_RPATH from the environment variable LD_RUN_PATH.
> set_ld_library_path_env_vars sets a few environment variables including
> LD_RUN_PATH.  This patch logs all environment variables set by
> set_ld_library_path_env_vars so that one can recreate the same
> executable as "make check" run.  OK to install?
>
> Thanks.
>
> H.J.
> ---
> 2014-03-19  H.J. Lu  
>
> PR testsuite/60590
> * lib/target-libpath.exp (set_ld_library_path_env_vars): Log
> LD_LIBRARY_PATH, LD_RUN_PATH, SHLIB_PATH, LD_LIBRARY_PATH_32,
> LD_LIBRARY_PATH_64 and DYLD_LIBRARY_PATH.
>
> diff --git a/gcc/testsuite/lib/target-libpath.exp 
> b/gcc/testsuite/lib/target-libpath.exp
> index 603ed8a..1891088 100644
> --- a/gcc/testsuite/lib/target-libpath.exp
> +++ b/gcc/testsuite/lib/target-libpath.exp
> @@ -155,7 +155,12 @@ proc set_ld_library_path_env_vars { } {
>  setenv DYLD_LIBRARY_PATH "$ld_library_path"
>}
>
> -  verbose -log "set_ld_library_path_env_vars: 
> ld_library_path=$ld_library_path"
> +  verbose -log "LD_LIBRARY_PATH=[getenv LD_LIBRARY_PATH]"
> +  verbose -log "LD_RUN_PATH=[getenv LD_RUN_PATH]"
> +  verbose -log "SHLIB_PATH=[getenv SHLIB_PATH]"
> +  verbose -log "LD_LIBRARY_PATH_32=[getenv LD_LIBRARY_PATH_32]"
> +  verbose -log "LD_LIBRARY_PATH_64=[getenv LD_LIBRARY_PATH_64]"
> +  verbose -log "DYLD_LIBRARY_PATH=[getenv DYLD_LIBRARY_PATH]"
>  }
>
>  ###

Correction.  It is a testsuite issue.

-- 
H.J.


Re: [patch testsuite]: g++.dg/abi

2014-03-19 Thread Mike Stump
On Mar 18, 2014, at 6:16 AM, Kai Tietz  wrote:
> this patch skips anon2.C and anon3.C test for mingw target.  Issue
> here is that weak under pe-coff is different to ELF-targets and
> therefore test doesn't apply for

So, what does the output look like?  There should be a trace of weak of some 
sort in the output.

Re: [C++ Patch / RFC] PR 51474

2014-03-19 Thread Jason Merrill

OK.

Jason


PATCH: PR target/60590: Can't recreate the same executable in testsuite

2014-03-19 Thread H.J. Lu
GNU linker sets DT_RPATH from the environment variable LD_RUN_PATH.
set_ld_library_path_env_vars sets a few environment variables including
LD_RUN_PATH.  This patch logs all environment variables set by
set_ld_library_path_env_vars so that one can recreate the same
executable as "make check" run.  OK to install?
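
The kind of NAME=value line the patch writes to the log can be sketched
in C.  The helper name is hypothetical and the sketch is not part of the
testsuite, which does this in Tcl; it assumes an unset variable should
be shown with an empty value.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Format "NAME=value" into BUF, using an empty value when NAME is
   unset.  Returns 1 if the variable was set, 0 otherwise.  */
int
toy_format_env (const char *name, char *buf, size_t size)
{
  assert (name != NULL && buf != NULL && size > 0);
  const char *value = getenv (name);
  snprintf (buf, size, "%s=%s", name, value != NULL ? value : "");
  return value != NULL;
}
```

Pasting such lines back into a shell before rerunning the link command
reproduces the environment the linker saw, including LD_RUN_PATH.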

Thanks.

H.J.
---
2014-03-19  H.J. Lu  

PR target/60590
* lib/target-libpath.exp (set_ld_library_path_env_vars): Log
LD_LIBRARY_PATH, LD_RUN_PATH, SHLIB_PATH, LD_LIBRARY_PATH_32,
LD_LIBRARY_PATH_64 and DYLD_LIBRARY_PATH.

diff --git a/gcc/testsuite/lib/target-libpath.exp 
b/gcc/testsuite/lib/target-libpath.exp
index 603ed8a..1891088 100644
--- a/gcc/testsuite/lib/target-libpath.exp
+++ b/gcc/testsuite/lib/target-libpath.exp
@@ -155,7 +155,12 @@ proc set_ld_library_path_env_vars { } {
 setenv DYLD_LIBRARY_PATH "$ld_library_path"
   }
 
-  verbose -log "set_ld_library_path_env_vars: ld_library_path=$ld_library_path"
+  verbose -log "LD_LIBRARY_PATH=[getenv LD_LIBRARY_PATH]"
+  verbose -log "LD_RUN_PATH=[getenv LD_RUN_PATH]"
+  verbose -log "SHLIB_PATH=[getenv SHLIB_PATH]"
+  verbose -log "LD_LIBRARY_PATH_32=[getenv LD_LIBRARY_PATH_32]"
+  verbose -log "LD_LIBRARY_PATH_64=[getenv LD_LIBRARY_PATH_64]"
+  verbose -log "DYLD_LIBRARY_PATH=[getenv DYLD_LIBRARY_PATH]"
 }
 
 ###
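
As an illustration of the logging this patch adds, here is a minimal C++ sketch (not the Tcl code from target-libpath.exp) that produces one NAME=value line for each variable named in the ChangeLog entry.  The helper name format_lib_path_env and the choice to print an empty value for an unset variable are assumptions made for the example.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>
#include <vector>

// Variable names taken from the ChangeLog entry of the patch.
static const char *const kLibPathVars[] = {
  "LD_LIBRARY_PATH", "LD_RUN_PATH", "SHLIB_PATH",
  "LD_LIBRARY_PATH_32", "LD_LIBRARY_PATH_64", "DYLD_LIBRARY_PATH",
};

// Hypothetical helper: build one "NAME=value" line per variable, in the
// same order as the verbose -log calls.  Unset variables are shown with
// an empty value (an assumption of this sketch).
std::vector<std::string> format_lib_path_env() {
  std::vector<std::string> lines;
  for (const char *name : kLibPathVars) {
    const char *value = std::getenv(name);
    lines.push_back(std::string(name) + "=" + (value ? value : ""));
  }
  return lines;
}
```

Replaying the logged LD_RUN_PATH value when relinking by hand is what lets the GNU linker record the same DT_RPATH as the "make check" run.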


Re: [patch testsuite]: g++.dg/abi

2014-03-19 Thread Kai Tietz
2014-03-19 17:23 GMT+01:00 Mike Stump :
> On Mar 18, 2014, at 6:16 AM, Kai Tietz  wrote:
>> this patch skips anon2.C and anon3.C test for mingw target.  Issue
>> here is that weak under pe-coff is different to ELF-targets and
>> therefore test doesn't apply for
>
> So, what does the output look like?  There should be a trace of weak of some 
> sort in the output.

No, there is none.  Output looks like:

.seh_proc   _ZN2N43._91CIiE3fn2ES2_
_ZN2N43._91CIiE3fn2ES2_:
.LFB11:
.seh_endprologue
ret
.seh_endproc
.globl  _ZN2N41qE
.data
.align 8
_ZN2N41qE:
.quad   _ZN2N43._91CIiE3fn2ES2_
.globl  _ZN2N41pE
.align 8
_ZN2N41pE:
.quad   _ZN2N43._91CIiE3fn1ENS0_1BE
.globl  _ZN2N31qE
.align 8
_ZN2N31qE:
.quad   _ZN2N31D1CIiE3fn2ES2_...

The concept of weak, as present in ELF, isn't known in COFF in
general.  There is some weak support, but it works only for static
libraries and in a limited way.  Therefore we can't (and don't) use it
for COFF targets.

Kai

PS: I have another patch with similar reasoning for g++.dg/abi/thunk5.C
on my pile too.
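
For reference, the ELF notion of weak that the message contrasts with COFF can be sketched as follows.  This assumes an ELF target such as GNU/Linux and a GCC-compatible compiler; the names optional_hook and default_answer are invented for the example and are not taken from anon2.C or anon3.C.

```cpp
#include <cassert>

// Weak declaration with no definition anywhere in the program.  On ELF
// targets the link still succeeds and the symbol's address is null.
extern "C" void optional_hook() __attribute__((weak));

// Weak definition: a strong definition of the same symbol in another
// object file would silently take precedence over this one.
extern "C" __attribute__((weak)) int default_answer() { return 42; }

// Reports whether some object file actually provided optional_hook.
bool hook_is_present() { return optional_hook != nullptr; }
```

On a target without this behaviour, the undefined weak reference would be expected to fail at link time, which is the kind of difference the effective-target checks are meant to capture.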


Re: PATCH: PR target/60590: Can't recreate the same executable in testsuite

2014-03-19 Thread Mike Stump
On Mar 19, 2014, at 8:41 AM, H.J. Lu  wrote:
> GNU linker sets DT_RPATH from the environment variable LD_RUN_PATH.
> set_ld_library_path_env_vars sets a few environment variables including
> LD_RUN_PATH.  This patch logs all environment variables set by
> set_ld_library_path_env_vars so that one can recreate the same
> executable as "make check" run.  OK to install?

Ok.  If someone complains about the log size clutter, we can consider bumping 
it up to higher verbosity.

[jit] Tighten up the distinction between pointers and arrays

2014-03-19 Thread David Malcolm
Committed to branch dmalcolm/jit:

https://github.com/davidmalcolm/pygccjit/pull/3#issuecomment-37883129
showed a problem where a parameter expecting a (char *) was passed
a char[1024] cast to a (char *) as its argument, leading to an ICE:

libgccjit.so: internal compiler error: in convert_move, at expr.c:320
0x7fffebea98ad convert_move(rtx_def*, rtx_def*, int)
../../src/gcc/expr.c:320
0x7fffebec31cb expand_expr_real_2(separate_ops*, rtx_def*, machine_mode, expand_modifier)
../../src/gcc/expr.c:8105
0x7fffec88d768 expand_gimple_stmt_1
../../src/gcc/cfgexpand.c:2321
0x7fffec88d9cc expand_gimple_stmt
../../src/gcc/cfgexpand.c:2381

The issue was that the recording::type::dereference method is used for
both pointers and for arrays, leading to sloppiness about where lvalues
and rvalues can be pointers vs arrays.

This commit introduces is_pointer and is_array methods, using them to
tighten up type-checking, converting the above ICE into a type-check
error when the cast is attempted:
  libgccjit.so: error: gcc_jit_context_new_cast: cannot cast buffer from type: char[1024] to type: char *

The correct way to use an array as a pointer in the JIT API is to use
   gcc_jit_lvalue_get_address
on the array, which gives you an rvalue representing the address of the
initial element, and then to cast that rvalue as necessary.
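
The distinction can be illustrated with a small stand-alone model.  This is only a sketch: the classes and the functions is_valid_pointer_cast and address_of below are hypothetical and are not the gcc::jit::recording implementation.

```cpp
#include <cassert>

// Hypothetical miniature type hierarchy; every type answers both
// questions, and only the matching subclass answers "true".
struct type {
  virtual ~type() {}
  virtual bool is_pointer() const { return false; }
  virtual bool is_array() const { return false; }
};

struct pointer_type : type {
  bool is_pointer() const override { return true; }
};

struct array_type : type {
  bool is_array() const override { return true; }
};

// Pointer casts are accepted only between two pointer types, so an
// array can no longer slip through as if it were a pointer.
bool is_valid_pointer_cast(const type &src, const type &dst) {
  return src.is_pointer() && dst.is_pointer();
}

// Models taking the address of an array lvalue: the result is a
// pointer, which may then be cast as needed.
pointer_type address_of(const array_type &) { return pointer_type(); }
```

With two separate predicates, the bogus array-to-pointer cast is rejected up front, while the route through the array's address remains valid.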

gcc/jit
* internal-api.c (gcc::jit::recording::memento_of_get_pointer::
accepts_writes_from): Accept writes from pointers, but not arrays.

* internal-api.h (gcc::jit::recording::type::is_pointer): New.
(gcc::jit::recording::type::is_array): New.
(gcc::jit::recording::memento_of_get_type::accepts_writes_from):
Allow (void *) to accept writes of pointers, but not arrays.
(gcc::jit::recording::memento_of_get_type::is_pointer): New.
(gcc::jit::recording::memento_of_get_type::is_array): New.
(gcc::jit::recording::memento_of_get_pointer::is_pointer): New.
(gcc::jit::recording::memento_of_get_pointer::is_array): New.
(gcc::jit::recording::memento_of_get_const::is_pointer): New.
(gcc::jit::recording::memento_of_get_const::is_array): New.
(gcc::jit::recording::memento_of_get_volatile::is_pointer): New.
(gcc::jit::recording::memento_of_get_volatile::is_array): New.
(gcc::jit::recording::array_type::is_pointer): New.
(gcc::jit::recording::array_type::is_array): New.
(gcc::jit::recording::function_type::is_pointer): New.
(gcc::jit::recording::function_type::is_array): New.
(gcc::jit::recording::struct_::is_pointer): New.
(gcc::jit::recording::struct_::is_array): New.

* libgccjit.c (gcc_jit_context_new_rvalue_from_ptr): Require the
pointer_type to be a pointer, not an array.
(gcc_jit_context_null): Likewise.
(is_valid_cast): Require pointer casts to be between pointer types,
not arrays.
(gcc_jit_context_new_array_access): Update error message from "not
a pointer" to "not a pointer or array".
(gcc_jit_rvalue_dereference_field): Require the pointer arg to be
of pointer type, not an array.
(gcc_jit_rvalue_dereference): Likewise.

gcc/testsuite/
* jit.dg/test-array-as-pointer.c: New test case, verifying that
there's a way to treat arrays as pointers.
* jit.dg/test-combination.c: Add test-array-as-pointer.c...
(create_code): ...here and...
(verify_code): ...here.

* jit.dg/test-error-array-as-pointer.c: New test case, verifying
that bogus casts from array to pointer are caught by the type
system, rather than leading to ICEs seen in:
https://github.com/davidmalcolm/pygccjit/pull/3#issuecomment-37883129
---
 gcc/jit/ChangeLog.jit  |  35 +++
 gcc/jit/internal-api.c |   2 +-
 gcc/jit/internal-api.h |  18 +++-
 gcc/jit/libgccjit.c|  14 +--
 gcc/testsuite/ChangeLog.jit|  13 +++
 gcc/testsuite/jit.dg/test-array-as-pointer.c   | 101 +
 gcc/testsuite/jit.dg/test-combination.c|   9 ++
 gcc/testsuite/jit.dg/test-error-array-as-pointer.c |  99 
 8 files changed, 282 insertions(+), 9 deletions(-)
 create mode 100644 gcc/testsuite/jit.dg/test-array-as-pointer.c
 create mode 100644 gcc/testsuite/jit.dg/test-error-array-as-pointer.c

diff --git a/gcc/jit/ChangeLog.jit b/gcc/jit/ChangeLog.jit
index 8244eba..efb1931 100644
--- a/gcc/jit/ChangeLog.jit
+++ b/gcc/jit/ChangeLog.jit
@@ -1,3 +1,38 @@
+2014-03-19  David Malcolm  
+
+   * internal-api.c (gcc::jit::recording::memento_of_get_pointer::
+   accepts_writes_from): Accept writes from pointers, but not arrays.
+
+   * internal-api.h (gcc::jit::recording::type::is_pointer): New.
+   (gcc::jit::recording::type::is_array): New.
+  

Re: [patch testsuite]: g++.dg/abi

2014-03-19 Thread Rainer Orth
Kai Tietz  writes:

> 2014-03-19 17:23 GMT+01:00 Mike Stump :
>> On Mar 18, 2014, at 6:16 AM, Kai Tietz  wrote:
>>> this patch skips anon2.C and anon3.C test for mingw target.  Issue
>>> here is that weak under pe-coff is different to ELF-targets and
>>> therefore test doesn't apply for
>>
>> So, what does the output look like?  There should be a trace of weak of
>> some sort in the output.
>
> No, there is none.  Output looks like:
>
> .seh_proc   _ZN2N43._91CIiE3fn2ES2_
> _ZN2N43._91CIiE3fn2ES2_:
> .LFB11:
> .seh_endprologue
> ret
> .seh_endproc
> .globl  _ZN2N41qE
> .data
> .align 8
> _ZN2N41qE:
> .quad   _ZN2N43._91CIiE3fn2ES2_
> .globl  _ZN2N41pE
> .align 8
> _ZN2N41pE:
> .quad   _ZN2N43._91CIiE3fn1ENS0_1BE
> .globl  _ZN2N31qE
> .align 8
> _ZN2N31qE:
> .quad   _ZN2N31D1CIiE3fn2ES2_...
>
> The concept of weak - as present in ELF - isn't known in COFF in
> general.  There is some weak, but it works only for static library and
> in a limited way.  Therefore we can't (and don't) use it for COFF
> targets.

In that case, it seems far better to have
gcc/testsuite/lib/target-support.exp (check_weak_available) reflect that
instead of lying about weak support.

This way, everything else simply falls into place; no need to
special-case many individual testcases.

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


Re: [patch testsuite]: g++.dg/abi

2014-03-19 Thread Mike Stump
On Mar 19, 2014, at 9:49 AM, Rainer Orth  wrote:
>> The concept of weak - as present in ELF - isn't known in COFF in
>> general.  There is some weak, but it works only for static library and
>> in a limited way.  Therefore we can't (and don't) use it for COFF
>> targets.
> 
> In that case, it seems far better to have
> gcc/testsuite/lib/target-support.exp (check_weak_available) reflect that
> instead of lying about weak support.

Yeah, this is the direction I was headed…  :-)


[PATCH 2/2, AARCH64] Test case changes: Re: [RFC] [PATCH, AARCH64] : Using standard patterns for stack protection.

2014-03-19 Thread Venkataramanan Kumar
Hi Marcus,

On 14 March 2014 19:42, Marcus Shawcroft  wrote:
>>>
>>> Do we need a new effective target test, why is the existing
>>> "fstack_protector" not appropriate?
>>
>> "stack_protector" does a run time test. It failed in cross compilation
>> environment and these are compile only tests.
>
> This works fine in my cross environment, how does yours fail?
>
>
>> Also I thought Richard suggested that I add a new option for this.
>> ref: http://gcc.gnu.org/ml/gcc-patches/2013-11/msg03358.html
>
> I read that comment to mean use an effective target test instead of
> matching triples. I don't see that re-using an existing effective
> target test contradicts that suggestion.
>
> Looking through the test suite I see that there are:
>
> 6 tests that use dg-do compile with dg-require-effective-target fstack_protector
>
> 4 tests that use dg-do run with dg-require-effective-target fstack_protector
>
> 2 tests that use dg-do run {target native} dg-require-effective-target
> fstack_protector
>
> and finally the 2 tests we are discussing that use dg-compile with a
> triple test.
>
> so there are already tests in the testsuite that use dg-do compile
> with the existing effective target test.
>
> I see no immediately obvious reason why the two tests that require
> target native require the native constraint... but I guess that is a
> different issue.
>

I used the existing dg-require-effective-target check,
"fstack_protector", and added it on a separate line.

ChangeLog.

2014-03-19  Venkataramanan Kumar  
* g++.dg/fstack-protector-strong.C: Add effective target check for
  stack protection.
* gcc.dg/fstack-protector-strong.c: Likewise.

These two tests are passing now for aarch64-none-linux-gnu target under QEMU.

Let me know if I can upstream these two patches.

regards,
Venkat.
Index: gcc/testsuite/g++.dg/fstack-protector-strong.C
===
--- gcc/testsuite/g++.dg/fstack-protector-strong.C  (revision 208609)
+++ gcc/testsuite/g++.dg/fstack-protector-strong.C  (working copy)
@@ -1,7 +1,8 @@
 /* Test that stack protection is done on chosen functions. */
 
-/* { dg-do compile { target i?86-*-* x86_64-*-* } } */
+/* { dg-do compile } */
 /* { dg-options "-O2 -fstack-protector-strong" } */
+/* { dg-require-effective-target fstack_protector } */
 
 class A
 {
Index: gcc/testsuite/gcc.dg/fstack-protector-strong.c
===
--- gcc/testsuite/gcc.dg/fstack-protector-strong.c  (revision 208609)
+++ gcc/testsuite/gcc.dg/fstack-protector-strong.c  (working copy)
@@ -1,7 +1,8 @@
 /* Test that stack protection is done on chosen functions. */
 
-/* { dg-do compile { target i?86-*-* x86_64-*-* rs6000-*-* s390x-*-* } } */
+/* { dg-do compile } */
 /* { dg-options "-O2 -fstack-protector-strong" } */
+/* { dg-require-effective-target fstack_protector } */
 
 #include
 


[C++ Patch] PR 60384

2014-03-19 Thread Paolo Carlini

Hi,

in this minor regression we ICE during error recovery, when
push_class_level_binding_1 (called by finish_member_declaration via
pushdecl_class_level) gets a TEMPLATE_ID_EXPR as the name argument.  It's
a regression because, since r199779, invalid declarations get through
more often (with an error_mark_node as TREE_TYPE, like TREE_TYPE (x) in
the case at issue).  Hence the additional check I'm suggesting.  Tested
x86_64-linux.


Thanks,
Paolo.

//
/cp
2014-03-19  Paolo Carlini  

PR c++/60384
* name-lookup.c (push_class_level_binding_1): Check identifier_p
on the name argument.

/testsuite
2014-03-19  Paolo Carlini  

PR c++/60384
* g++.dg/cpp1y/pr60384.C: New.
Index: cp/name-lookup.c
===
--- cp/name-lookup.c(revision 208682)
+++ cp/name-lookup.c(working copy)
@@ -3112,7 +3112,9 @@ push_class_level_binding_1 (tree name, tree x)
   if (!class_binding_level)
 return true;
 
-  if (name == error_mark_node)
+  if (name == error_mark_node
+  /* Can happen for an erroneous declaration (c++/60384).  */
+  || !identifier_p (name))
 return false;
 
   /* Check for invalid member names.  But don't worry about a default
Index: testsuite/g++.dg/cpp1y/pr60384.C
===
--- testsuite/g++.dg/cpp1y/pr60384.C(revision 0)
+++ testsuite/g++.dg/cpp1y/pr60384.C(working copy)
@@ -0,0 +1,9 @@
+// PR c++/60384
+// { dg-do compile { target c++1y } }
+
+template int foo();
+
+struct A
+{
+  typedef auto foo<>();  // { dg-error "typedef declared 'auto'" }
+};


Re: [patch testsuite]: g++.dg/abi

2014-03-19 Thread Joseph S. Myers
On Wed, 19 Mar 2014, Kai Tietz wrote:

> The concept of weak - as present in ELF - isn't known in COFF in
> general.  There is some weak, but it works only for static library and
> in a limited way.  Therefore we can't (and don't) use it for COFF
> targets.

There are already two different checks (check_weak_available and 
check_weak_override_available), reflecting what different testcases need.  
Is the requirement for these tests logically different from both of those?  
If so, maybe there should be a third such check (even if in fact it does 
the same thing as check_weak_override_available).

-- 
Joseph S. Myers
jos...@codesourcery.com


[PATCH 1/2, AARCH64]: Machine descriptions: Re: [RFC] [PATCH, AARCH64] : Using standard patterns for stack protection.

2014-03-19 Thread Venkataramanan Kumar
Hi Marcus,

On 14 March 2014 19:42, Marcus Shawcroft  wrote:
> Hi Venkat
>
> On 5 February 2014 10:29, Venkataramanan Kumar
>  wrote:
>> Hi Marcus,
>>
>>> +  "ldr\\t%x2, %1\;str\\t%x2, %0\;mov\t%x2,0"
>>> +  [(set_attr "length" "12")])
>>>
>>> This pattern emits an opaque sequence of instructions that cannot be
>>> scheduled, is that necessary? Can we not expand individual
>>> instructions or at least split ?
>>
>> Almost all the ports emit a template of assembly instructions.
>> I'm not sure why they have to be generated this way.
>> But the purpose of these patterns is to clear the register that holds
>> the canary value immediately after its use.
>
> I've just read the thread Andrew pointed out, thanks, I'm happy that
> there is a good reason to do it this way.  Andrew, thanks for
> providing the background.
>
> +  [(set_attr "length" "12")])
> +
>
> These patterns should also set the "type" attribute,  a reasonable
> value would be "multiple".
>

I have incorporated your review comments and split the patch into two.

The first patch attached here contains AArch64 machine descriptions
for the stack protect patterns.

ChangeLog.

2014-03-19 Venkataramanan Kumar  
* config/aarch64/aarch64.md (stack_protect_set, stack_protect_test)
(stack_protect_set_<mode>, stack_protect_test_<mode>): Add
machine descriptions for Stack Smashing Protector.

Tested for aarch64-none-linux-gnu target under QEMU.
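
What the two patterns compute can be modelled in a few lines of C++.  This is a sketch of the semantics only, not of the machine description; the function names protect_set and protect_test are made up for the illustration.

```cpp
#include <cassert>
#include <cstdint>

// Models stack_protect_set: copy the guard value into the frame's
// canary slot.  The real pattern also zeroes the scratch register
// afterwards so the guard does not linger in a register; a C++ local
// variable cannot express that part.
void protect_set(std::uint64_t &canary_slot, const std::uint64_t &guard) {
  std::uint64_t scratch = guard;  // ldr
  canary_slot = scratch;          // str
}

// Models stack_protect_test: load both values and exclusive-or them.
// A zero result means the canary is intact; the expander then branches
// on that result being equal to zero.
std::uint64_t protect_test(const std::uint64_t &canary_slot,
                           const std::uint64_t &guard) {
  return canary_slot ^ guard;     // ldr, ldr, eor
}
```

Using an exclusive-or rather than a compare leaves a plain register result, which the expander can feed to the existing compare-and-branch patterns.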

regards,
Venkat.
Index: gcc/config/aarch64/aarch64.md
===
--- gcc/config/aarch64/aarch64.md   (revision 208609)
+++ gcc/config/aarch64/aarch64.md   (working copy)
@@ -102,6 +102,8 @@
 UNSPEC_TLSDESC
 UNSPEC_USHL_2S
 UNSPEC_VSTRUCTDUMMY
+UNSPEC_SP_SET
+UNSPEC_SP_TEST
 ])
 
 (define_c_enum "unspecv" [
@@ -3634,6 +3636,67 @@
   DONE;
 })
 
+;; Named patterns for stack smashing protection.
+(define_expand "stack_protect_set"
+  [(match_operand 0 "memory_operand")
+   (match_operand 1 "memory_operand")]
+  ""
+{
+  enum machine_mode mode = GET_MODE (operands[0]);
+
+  emit_insn ((mode == DImode
+ ? gen_stack_protect_set_di
+ : gen_stack_protect_set_si) (operands[0], operands[1]));
+  DONE;
+})
+
+(define_insn "stack_protect_set_<mode>"
+  [(set (match_operand:PTR 0 "memory_operand" "=m")
+   (unspec:PTR [(match_operand:PTR 1 "memory_operand" "m")]
+UNSPEC_SP_SET))
+   (set (match_scratch:PTR 2 "=&r") (const_int 0))]
+  ""
+  "ldr\\t%x2, %1\;str\\t%x2, %0\;mov\t%x2,0"
+  [(set_attr "length" "12")
+   (set_attr "type" "multiple")])
+
+(define_expand "stack_protect_test"
+  [(match_operand 0 "memory_operand")
+   (match_operand 1 "memory_operand")
+   (match_operand 2)]
+  ""
+{
+
+  rtx result = gen_reg_rtx (Pmode);
+
+  enum machine_mode mode = GET_MODE (operands[0]);
+
+  emit_insn ((mode == DImode
+ ? gen_stack_protect_test_di
+ : gen_stack_protect_test_si) (result,
+   operands[0],
+   operands[1]));
+
+  if (mode == DImode)
+emit_jump_insn (gen_cbranchdi4 (gen_rtx_EQ (VOIDmode, result, const0_rtx),
+   result, const0_rtx, operands[2]));
+  else
+emit_jump_insn (gen_cbranchsi4 (gen_rtx_EQ (VOIDmode, result, const0_rtx),
+   result, const0_rtx, operands[2]));
+  DONE;
+})
+
+(define_insn "stack_protect_test_<mode>"
+  [(set (match_operand:PTR 0 "register_operand")
+   (unspec:PTR [(match_operand:PTR 1 "memory_operand" "m")
+(match_operand:PTR 2 "memory_operand" "m")]
+UNSPEC_SP_TEST))
+   (clobber (match_scratch:PTR 3 "=&r"))]
+  ""
+  "ldr\t%x3, %x1\;ldr\t%x0, %x2\;eor\t%x0, %x3, %x0"
+  [(set_attr "length" "12")
+   (set_attr "type" "multiple")])
+
 ;; AdvSIMD Stuff
 (include "aarch64-simd.md")
 


Re: [PATCH 1/2, AARCH64]: Machine descriptions: Re: [RFC] [PATCH, AARCH64] : Using standard patterns for stack protection.

2014-03-19 Thread Marcus Shawcroft
On 19 March 2014 17:11, Venkataramanan Kumar
 wrote:

> I have incorporated your review comments and split the patch into two.
>
> The first patch attached here contains Aarch64 machine descriptions
> for the stack protect patterns.
>
> ChangeLog.
>
> 2014-03-19 Venkataramanan Kumar  
> * config/aarch64/aarch64.md (stack_protect_set, stack_protect_test)
> (stack_protect_set_<mode>, stack_protect_test_<mode>): Add
> machine descriptions for Stack Smashing Protector.
>
> Tested  for aarch64-none-linux-gnu target under QEMU .
>
> regards,
> Venkat.


Hi, This is OK for stage-1.
Thanks
/Marcus


Re: [RFA jit 2/2] introduce scoped_timevar

2014-03-19 Thread Tom Tromey
> "Trevor" == Trevor Saunders  writes:

Trevor> thanks for doing this.  I wonder about naming, we already have
Trevor> auto_vec and while I don't really care whether we use auto_ or
Trevor> scoped_ it seems like being consistent would be nice.

Sounds reasonable to me, I've made this change for v2.

Tom


Re: [patch testsuite]: g++.dg/abi

2014-03-19 Thread Kai Tietz
2014-03-19 18:37 GMT+01:00 Joseph S. Myers :
> On Wed, 19 Mar 2014, Kai Tietz wrote:
>
>> The concept of weak - as present in ELF - isn't known in COFF in
>> general.  There is some weak, but it works only for static library and
>> in a limited way.  Therefore we can't (and don't) use it for COFF
>> targets.
>
> There are already two different checks (check_weak_available and
> check_weak_override_available), reflecting what different testcases need.
> Is the requirement for these tests logically different from both of those?
> If so, maybe there should be a third such check (even if in fact it does
> the same thing as check_weak_override_available).
>
> --
> Joseph S. Myers
> jos...@codesourcery.com

On second thought, disabling weak-available for mingw targets seems
wrong.  Weak is actually present; it just has a different meaning.
Those testcases, as far as I understand them, check that weaks are
available.  Nevertheless, the check here is meant to probe whether
weak-override is possible, as otherwise weaks make no sense here AFAICS.
I don't think we need to add a third check.  It might be enough to
check for weak-override-available instead for those tests.

Kai


Re: [4.8, PATCH 5/26] Backport Power8 and LE support: Test adjustments

2014-03-19 Thread Bill Schmidt
Oops.  Please ignore this for now.  I'm preparing a patch series and
sent this one prematurely.

Thanks,
Bill

On Wed, 2014-03-19 at 10:25 -0500, Bill Schmidt wrote:
> Hi,
> 
> This patch (diff-le-tests) backports adjustments to a few tests for
> powerpc64le and the ELFv2 ABI.
> 
> Thanks,
> Bill




[4.8, PATCH 5/26] Backport Power8 and LE support: Test adjustments

2014-03-19 Thread Bill Schmidt
Hi,

This patch (diff-le-tests) backports adjustments to a few tests for
powerpc64le and the ELFv2 ABI.

Thanks,
Bill


2014-03-29  Bill Schmidt  

Backport from mainline
2013-11-27  Bill Schmidt  

* gfortran.dg/nan_7.f90: Disable for little endian PowerPC.

Backport from mainline r205106:

2013-11-20  Ulrich Weigand  

* gcc.target/powerpc/darwin-longlong.c (msw): Make endian-safe.

Backport from mainline r205046:

2013-11-19  Ulrich Weigand  

* gcc.target/powerpc/ppc64-abi-2.c (MAKE_SLOT): New macro to
construct parameter slot value in endian-independent way.
(fcevv, fciievv, fcvevv): Use it.
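
The idea behind an endian-independent slot value can be checked with a short stand-alone C++ sketch.  It assumes that two consecutive 32-bit values share one 64-bit slot; the function names are invented here, and the byte order is detected at run time rather than through __LITTLE_ENDIAN__.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Detect the host byte order by inspecting the first byte of 1.
static bool host_is_little_endian() {
  const std::uint32_t probe = 1;
  unsigned char first;
  std::memcpy(&first, &probe, 1);
  return first == 1;
}

// Expected 64-bit slot value when x is stored first and y second,
// following the same two formulas as the MAKE_SLOT macro.
static std::uint64_t make_slot(std::uint32_t x, std::uint32_t y) {
  return host_is_little_endian()
             ? (std::uint64_t)x | ((std::uint64_t)y << 32)
             : (std::uint64_t)y | ((std::uint64_t)x << 32);
}

// What is actually found when the two words are laid out in memory in
// that order and the slot is read back as one 64-bit value.
static std::uint64_t slot_from_memory(std::uint32_t x, std::uint32_t y) {
  const std::uint32_t words[2] = {x, y};
  std::uint64_t slot;
  std::memcpy(&slot, words, sizeof slot);
  return slot;
}
```

On either byte order the two functions agree, which is why the tests can compare against MAKE_SLOT (1, 2) instead of a hard-coded big-endian constant.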


Index: gcc-4_8-branch/gcc/testsuite/gcc.target/powerpc/ppc64-abi-2.c
===
--- gcc-4_8-branch.orig/gcc/testsuite/gcc.target/powerpc/ppc64-abi-2.c  2013-12-28 17:41:32.430628909 +0100
+++ gcc-4_8-branch/gcc/testsuite/gcc.target/powerpc/ppc64-abi-2.c   2013-12-28 17:50:39.655337721 +0100
@@ -119,6 +119,12 @@ typedef union
   vector int v;
 } vector_int_t;
 
+#ifdef __LITTLE_ENDIAN__
+#define MAKE_SLOT(x, y) ((long)x | ((long)y << 32))
+#else
+#define MAKE_SLOT(x, y) ((long)y | ((long)x << 32))
+#endif
+
 /* Paramter passing.
s : gpr 3
v : vpr 2
@@ -226,8 +232,8 @@ fcevv (char *s, ...)
   sp = __builtin_frame_address(0);
   sp = sp->backchain;
   
-  if (sp->slot[2].l != 0x10002ULL
-  || sp->slot[4].l != 0x50006ULL)
+  if (sp->slot[2].l != MAKE_SLOT (1, 2)
+  || sp->slot[4].l !=  MAKE_SLOT (5, 6))
 abort();
 }
 
@@ -268,8 +274,8 @@ fciievv (char *s, int i, int j, ...)
   sp = __builtin_frame_address(0);
   sp = sp->backchain;
   
-  if (sp->slot[4].l != 0x10002ULL
-  || sp->slot[6].l != 0x50006ULL)
+  if (sp->slot[4].l != MAKE_SLOT (1, 2)
+  || sp->slot[6].l !=  MAKE_SLOT (5, 6))
 abort();
 }
 
@@ -296,8 +302,8 @@ fcvevv (char *s, vector int x, ...)
   sp = __builtin_frame_address(0);
   sp = sp->backchain;
   
-  if (sp->slot[4].l != 0x10002ULL
-  || sp->slot[6].l != 0x50006ULL)
+  if (sp->slot[4].l != MAKE_SLOT (1, 2)
+  || sp->slot[6].l !=  MAKE_SLOT (5, 6))
 abort();
 }
 
Index: gcc-4_8-branch/gcc/testsuite/gcc.target/powerpc/darwin-longlong.c
===
--- gcc-4_8-branch.orig/gcc/testsuite/gcc.target/powerpc/darwin-longlong.c  2013-12-28 17:41:32.430628909 +0100
+++ gcc-4_8-branch/gcc/testsuite/gcc.target/powerpc/darwin-longlong.c   2013-12-28 17:50:39.659337741 +0100
@@ -11,7 +11,11 @@ int  msw(long long in)
 int  i[2];
   } ud;
   ud.ll = in;
+#ifdef __LITTLE_ENDIAN__
+  return ud.i[1];
+#else
   return ud.i[0];
+#endif
 }
 
 int main()
Index: gcc-4_8-branch/gcc/testsuite/gfortran.dg/nan_7.f90
===
--- gcc-4_8-branch.orig/gcc/testsuite/gfortran.dg/nan_7.f90 2013-12-28 17:41:32.430628909 +0100
+++ gcc-4_8-branch/gcc/testsuite/gfortran.dg/nan_7.f90  2013-12-28 17:50:39.662337756 +0100
@@ -2,6 +2,7 @@
 ! { dg-options "-fno-range-check" }
 ! { dg-require-effective-target fortran_real_16 }
 ! { dg-require-effective-target fortran_integer_16 }
+! { dg-skip-if "" { "powerpc*le-*-*" } { "*" } { "" } }
 ! PR47293 NAN not correctly read
 character(len=200) :: str
 real(16) :: r




[RFA jit v2 1/2] introduce class toplev

2014-03-19 Thread Tom Tromey
This patch introduces a new "class toplev" and changes toplev_main and
toplev_finalize to be methods of this class.  Additionally, now the
timevars are automatically stopped when the object is destroyed.  This
cleans up "compile" a bit and makes it simpler to reuse the toplev
logic in other code.
---
 gcc/ChangeLog.jit  | 14 +
 gcc/diagnostic.c   |  2 +-
 gcc/jit/ChangeLog.jit  |  5 +
 gcc/jit/internal-api.c | 25 +-
 gcc/main.c |  9 
 gcc/toplev.c   | 56 +-
 gcc/toplev.h   | 20 --
 7 files changed, 76 insertions(+), 55 deletions(-)

diff --git a/gcc/ChangeLog.jit b/gcc/ChangeLog.jit
index 77ac44c..c590ab1 100644
--- a/gcc/ChangeLog.jit
+++ b/gcc/ChangeLog.jit
@@ -1,3 +1,17 @@
+2014-03-19  Tom Tromey  
+
+   * diagnostic.c (bt_stop): Use toplev::main.
+   * main.c (main): Update.
+   * toplev.c (do_compile): Remove argument.  Don't check
+   use_TV_TOTAL.
+   (toplev::toplev, toplev::~toplev, toplev::start_timevars): New
+   functions.
+   (toplev::main): Rename from toplev_main.  Update.
+   (toplev::finalize): Rename from toplev_finalize.  Update.
+   * toplev.h (class toplev): New.
+   (struct toplev_options): Remove.
+   (toplev_main, toplev_finalize): Don't declare.
+
 2014-03-11  David Malcolm  
 
* gcse.c (gcse_c_finalize): New, to clear test_insn between
diff --git a/gcc/diagnostic.c b/gcc/diagnostic.c
index 36094a1..56dc3ac 100644
--- a/gcc/diagnostic.c
+++ b/gcc/diagnostic.c
@@ -333,7 +333,7 @@ diagnostic_show_locus (diagnostic_context * context,
 static const char * const bt_stop[] =
 {
   "main",
-  "toplev_main",
+  "toplev::main",
   "execute_one_pass",
   "compile_file",
 };
diff --git a/gcc/jit/ChangeLog.jit b/gcc/jit/ChangeLog.jit
index efb1931..e45d38c 100644
--- a/gcc/jit/ChangeLog.jit
+++ b/gcc/jit/ChangeLog.jit
@@ -1,3 +1,8 @@
+2014-03-19  Tom Tromey  
+
+   * internal-api.c (compile): Use toplev, not toplev_options.
+   Simplify.
+
 2014-03-19  David Malcolm  
 
* internal-api.c (gcc::jit::recording::memento_of_get_pointer::
diff --git a/gcc/jit/internal-api.c b/gcc/jit/internal-api.c
index e3ddc4d..95978bf 100644
--- a/gcc/jit/internal-api.c
+++ b/gcc/jit/internal-api.c
@@ -3650,7 +3650,7 @@ compile ()
 
   /* Call into the rest of gcc.
  For now, we have to assemble command-line options to pass into
- toplev_main, so that they can be parsed. */
+ toplev::main, so that they can be parsed. */
 
   /* Pass in user-provided "progname", if any, so that it makes it
  into GCC's "progname" global, used in various diagnostics. */
@@ -3724,25 +3724,15 @@ compile ()
   ADD_ARG ("-fdump-ipa-all");
 }
 
-  toplev_options toplev_opts;
-  toplev_opts.use_TV_TOTAL = false;
+  toplev toplev (false);
 
-  if (time_report || !quiet_flag  || flag_detailed_statistics)
-timevar_init ();
-
-  timevar_start (TV_TOTAL);
-
-  toplev_main (num_args, const_cast  (fake_args), &toplev_opts);
-  toplev_finalize ();
+  toplev.main (num_args, const_cast  (fake_args));
+  toplev.finalize ();
 
   active_playback_ctxt = NULL;
 
   if (errors_occurred ())
-{
-  timevar_stop (TV_TOTAL);
-  timevar_print (stderr);
-  return NULL;
-}
+return NULL;
 
   if (get_bool_option (GCC_JIT_BOOL_OPTION_DUMP_GENERATED_CODE))
dump_generated_code ();
@@ -3765,8 +3755,6 @@ compile ()
 if (ret)
   {
timevar_pop (TV_ASSEMBLE);
-   timevar_stop (TV_TOTAL);
-   timevar_print (stderr);
return NULL;
   }
   }
@@ -3795,9 +3783,6 @@ compile ()
 timevar_pop (TV_LOAD);
   }
 
-  timevar_stop (TV_TOTAL);
-  timevar_print (stderr);
-
   return result_obj;
 }
 
diff --git a/gcc/main.c b/gcc/main.c
index b893308..4bba041 100644
--- a/gcc/main.c
+++ b/gcc/main.c
@@ -1,5 +1,5 @@
 /* main.c: defines main() for cc1, cc1plus, etc.
-   Copyright (C) 2007-2013 Free Software Foundation, Inc.
+   Copyright (C) 2007-2014 Free Software Foundation, Inc.
 
 This file is part of GCC.
 
@@ -26,15 +26,14 @@ along with GCC; see the file COPYING3.  If not see
 
 int main (int argc, char **argv);
 
-/* We define main() to call toplev_main(), which is defined in toplev.c.
+/* We define main() to call toplev::main(), which is defined in toplev.c.
We do this in a separate file in order to allow the language front-end
to define a different main(), if it so desires.  */
 
 int
 main (int argc, char **argv)
 {
-  toplev_options toplev_opts;
-  toplev_opts.use_TV_TOTAL = true;
+  toplev toplev (true);
 
-  return toplev_main (argc, argv, &toplev_opts);
+  return toplev.main (argc, argv);
 }
diff --git a/gcc/toplev.c b/gcc/toplev.c
index f1ac560..5284621 100644
--- a/gcc/toplev.c
+++ b/gcc/toplev.c
@@ -1,5 +1,5 @@
 /* Top level of GCC compilers (cc1, cc1plus, etc.)
-   Copyright (C) 1987-2013 Free Software Foundation, Inc.
+   Copyright (C) 1987-2014 Free Software 

[RFA jit v2 0/2] minor refactorings for reuse

2014-03-19 Thread Tom Tromey
Here's a second revision of my patches to the jit branch to clean up
toplev and timevar uses a bit.  The first revision was here:

http://gcc.gnu.org/ml/gcc-patches/2014-03/msg00895.html

Compared with that revision, this one hopefully includes the
ChangeLog.jit entries; and I took Trevor's suggestion and renamed the
timevar class to "auto_timevar".

Tom



[RFA jit v2 2/2] introduce auto_timevar

2014-03-19 Thread Tom Tromey
This introduces a new auto_timevar class.  It pushes a given timevar
in its constructor, and pops it in the destructor, giving a much
simpler way to use timevars in the typical case where they can be
scoped.
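
The benefit shows up most clearly with early returns.  Below is a self-contained sketch of the same push-in-constructor, pop-in-destructor idea; timer_push, timer_pop and scoped_timer are hypothetical stand-ins and not GCC's timevar API.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-ins for a timer stack.
enum timer_id { TIMER_ASSEMBLE, TIMER_LOAD };
static std::vector<timer_id> g_timer_stack;

static void timer_push(timer_id id) { g_timer_stack.push_back(id); }

static void timer_pop(timer_id id) {
  // Pops must match the most recent push.
  assert(!g_timer_stack.empty() && g_timer_stack.back() == id);
  g_timer_stack.pop_back();
}

// Pushes the timer on construction and pops it on destruction.
class scoped_timer {
 public:
  explicit scoped_timer(timer_id id) : m_id(id) { timer_push(m_id); }
  ~scoped_timer() { timer_pop(m_id); }

 private:
  // Declared but not defined, to disallow copies.
  scoped_timer(const scoped_timer &);
  scoped_timer &operator=(const scoped_timer &);

  timer_id m_id;
};

// The early return needs no explicit pop: the destructor runs on every
// path out of the scope.
bool run_step(bool fail) {
  scoped_timer assemble_timer(TIMER_ASSEMBLE);
  if (fail)
    return false;
  return g_timer_stack.size() == 1;
}
```

Every exit path from run_step leaves the stack balanced, which is the property that previously required a hand-written pop before each return.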
---
 gcc/ChangeLog.jit  |  4 
 gcc/jit/ChangeLog.jit  |  4 
 gcc/jit/internal-api.c | 16 +---
 gcc/timevar.h  | 26 +-
 4 files changed, 38 insertions(+), 12 deletions(-)

diff --git a/gcc/ChangeLog.jit b/gcc/ChangeLog.jit
index c590ab1..ee1df88 100644
--- a/gcc/ChangeLog.jit
+++ b/gcc/ChangeLog.jit
@@ -1,5 +1,9 @@
 2014-03-19  Tom Tromey  
 
+   * timevar.h (auto_timevar): New class.
+
+2014-03-19  Tom Tromey  
+
* diagnostic.c (bt_stop): Use toplev::main.
* main.c (main): Update.
* toplev.c (do_compile): Remove argument.  Don't check
diff --git a/gcc/jit/ChangeLog.jit b/gcc/jit/ChangeLog.jit
index e45d38c..69f2412 100644
--- a/gcc/jit/ChangeLog.jit
+++ b/gcc/jit/ChangeLog.jit
@@ -1,5 +1,9 @@
 2014-03-19  Tom Tromey  
 
+   * internal-api.c (compile): Use auto_timevar.
+
+2014-03-19  Tom Tromey  
+
* internal-api.c (compile): Use toplev, not toplev_options.
Simplify.
 
diff --git a/gcc/jit/internal-api.c b/gcc/jit/internal-api.c
index 95978bf..090d351 100644
--- a/gcc/jit/internal-api.c
+++ b/gcc/jit/internal-api.c
@@ -3737,8 +3737,6 @@ compile ()
   if (get_bool_option (GCC_JIT_BOOL_OPTION_DUMP_GENERATED_CODE))
dump_generated_code ();
 
-  timevar_push (TV_ASSEMBLE);
-
   /* Gross hacks follow:
  We have a .s file; we want a .so file.
  We could reuse parts of gcc/gcc.c to do this.
@@ -3746,6 +3744,8 @@ compile ()
*/
   /* FIXME: totally faking it for now, not even using pex */
   {
+auto_timevar assemble_timevar (TV_ASSEMBLE);
+
 char cmd[1024];
 snprintf (cmd, 1024, "gcc -shared %s -o %s",
   m_path_s_file, m_path_so_file);
@@ -3753,20 +3753,16 @@ compile ()
   printf ("cmd: %s\n", cmd);
 int ret = system (cmd);
 if (ret)
-  {
-   timevar_pop (TV_ASSEMBLE);
-   return NULL;
-  }
+  return NULL;
   }
-  timevar_pop (TV_ASSEMBLE);
 
   // TODO: split out assembles vs linker
 
   /* dlopen the .so file. */
   {
-const char *error;
+auto_timevar load_timevar (TV_LOAD);
 
-timevar_push (TV_LOAD);
+const char *error;
 
 /* Clear any existing error.  */
 dlerror ();
@@ -3779,8 +3775,6 @@ compile ()
   result_obj = new result (handle);
 else
   result_obj = NULL;
-
-timevar_pop (TV_LOAD);
   }
 
   return result_obj;
diff --git a/gcc/timevar.h b/gcc/timevar.h
index dc2a8bc..f018e39 100644
--- a/gcc/timevar.h
+++ b/gcc/timevar.h
@@ -1,5 +1,5 @@
 /* Timing variables for measuring compiler performance.
-   Copyright (C) 2000-2013 Free Software Foundation, Inc.
+   Copyright (C) 2000-2014 Free Software Foundation, Inc.
Contributed by Alex Samuel 
 
This file is part of GCC.
@@ -110,6 +110,30 @@ timevar_pop (timevar_id_t tv)
 timevar_pop_1 (tv);
 }
 
+// This is a simple timevar wrapper class that pushes a timevar in its
+// constructor and pops the timevar in its destructor.
+class auto_timevar
+{
+ public:
+  auto_timevar (timevar_id_t tv)
+: m_tv (tv)
+  {
+timevar_push (m_tv);
+  }
+
+  ~auto_timevar ()
+  {
+timevar_pop (m_tv);
+  }
+
+ private:
+
+  // Private to disallow copies.
+  auto_timevar (const auto_timevar &);
+
+  timevar_id_t m_tv;
+};
+
 extern void print_time (const char *, long);
 
 #endif /* ! GCC_TIMEVAR_H */
-- 
1.8.5.3



Re: [patch testsuite]: g++.dg/abi

2014-03-19 Thread Mike Stump
On Mar 19, 2014, at 9:38 AM, Kai Tietz  wrote:
> 2014-03-19 17:23 GMT+01:00 Mike Stump :
>> On Mar 18, 2014, at 6:16 AM, Kai Tietz  wrote:
>>> this patch skips anon2.C and anon3.C test for mingw target.  Issue
>>> here is that weak under pe-coff is different to ELF-targets and
>>> therefore test doesn't apply for
>> 
>> So, what does the output look like?  There should be a trace of weak of some 
>> sort in the output.
> 
> No, there is none.

So, does the target support weak?


Re: [PATCH 2/2, AARCH64] Test case changes: Re: [RFC] [PATCH, AARCH64] : Using standard patterns for stack protection.

2014-03-19 Thread Marcus Shawcroft
On 19 March 2014 17:18, Venkataramanan Kumar
 wrote:

> I used the existing dg-require-effective-target check,
> "stack_protector" and added it in a separate line.
>
> ChangeLog.
>
> 2014-03-19  Venkataramanan Kumar  
> * g++.dg/fstack-protector-strong.C: Add effective target check for
>   stack protection.
> * gcc.dg/fstack-protector-strong.c: Likewise.
>
> These two tests are passing now for aarch64-none-linux-gnu target under QEMU.


Venkat,

I think this change is reasonable (for stage-1) but I'd like one of
the testsuite maintainers to ACK the change.

Cheers
/Marcus


Re: [patch testsuite]: g++.dg/abi

2014-03-19 Thread Kai Tietz
2014-03-19 17:54 GMT+01:00 Mike Stump :
> On Mar 19, 2014, at 9:49 AM, Rainer Orth  
> wrote:
>>> The concept of weak - as present in ELF - isn't known in COFF in
>>> general.  There is some weak, but it works only for static library and
>>> in a limited way.  Therefore we can't (and don't) use it for COFF
>>> targets.
>>
>> In that case, it seems far better to have
>> gcc/testsuite/lib/target-support.exp (check_weak_available) reflect that
>> instead of lying about weak support.
>
> Yeah, this is the direction I was headed...  :-)

OK, I will send a patch to change target-support.exp.

And yes, the target supports a kind of weak, but not the expected GNU weak.

Thanks,
Kai


[jit] Avoid shadowing progname global

2014-03-19 Thread David Malcolm
Committed to branch dmalcolm/jit:

gcc/jit/
* internal-api.c (gcc::jit::recording::context::add_error_va):
Rename local "progname" to "ctxt_progname" to avoid shadowing
the related global, for clarity.
(gcc::jit::playback::context::compile): Likewise.
---
 gcc/jit/ChangeLog.jit  |  7 +++
 gcc/jit/internal-api.c | 22 --
 2 files changed, 19 insertions(+), 10 deletions(-)

diff --git a/gcc/jit/ChangeLog.jit b/gcc/jit/ChangeLog.jit
index efb1931..265242e 100644
--- a/gcc/jit/ChangeLog.jit
+++ b/gcc/jit/ChangeLog.jit
@@ -1,5 +1,12 @@
 2014-03-19  David Malcolm  
 
+   * internal-api.c (gcc::jit::recording::context::add_error_va):
+   Rename local "progname" to "ctxt_progname" to avoid shadowing
+   the related global, for clarity.
+   (gcc::jit::playback::context::compile): Likewise.
+
+2014-03-19  David Malcolm  
+
* internal-api.c (gcc::jit::recording::memento_of_get_pointer::
accepts_writes_from): Accept writes from pointers, but not arrays.
 
diff --git a/gcc/jit/internal-api.c b/gcc/jit/internal-api.c
index e3ddc4d..819800a 100644
--- a/gcc/jit/internal-api.c
+++ b/gcc/jit/internal-api.c
@@ -610,18 +610,19 @@ recording::context::add_error_va (location *loc, const char *fmt, va_list ap)
   char buf[1024];
   vsnprintf (buf, sizeof (buf) - 1, fmt, ap);
 
-  const char *progname = get_str_option (GCC_JIT_STR_OPTION_PROGNAME);
-  if (!progname)
-progname = "libgccjit.so";
+  const char *ctxt_progname =
+get_str_option (GCC_JIT_STR_OPTION_PROGNAME);
+  if (!ctxt_progname)
+ctxt_progname = "libgccjit.so";
 
   if (loc)
 fprintf (stderr, "%s: %s: error: %s\n",
-progname,
+ctxt_progname,
 loc->get_debug_string (),
 buf);
   else
 fprintf (stderr, "%s: error: %s\n",
-progname,
+ctxt_progname,
 buf);
 
   if (!m_error_count)
@@ -3629,8 +3630,8 @@ playback::context::
 compile ()
 {
   void *handle = NULL;
+  const char *ctxt_progname;
   result *result_obj = NULL;
-  const char *progname;
   const char *fake_args[20];
   unsigned int num_args;
 
@@ -3652,10 +3653,11 @@ compile ()
  For now, we have to assemble command-line options to pass into
  toplev_main, so that they can be parsed. */
 
-  /* Pass in user-provided "progname", if any, so that it makes it
- into GCC's "progname" global, used in various diagnostics. */
-  progname = get_str_option (GCC_JIT_STR_OPTION_PROGNAME);
-  fake_args[0] = progname ? progname : "libgccjit.so";
+  /* Pass in user-provided program name as argv0, if any, so that it
+ makes it into GCC's "progname" global, used in various diagnostics. */
+  ctxt_progname = get_str_option (GCC_JIT_STR_OPTION_PROGNAME);
+  fake_args[0] =
+(ctxt_progname ? ctxt_progname : "libgccjit.so");
 
   fake_args[1] = m_path_c_file;
   num_args = 2;
-- 
1.8.5.3
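
For readers unfamiliar with the problem being avoided, here is a minimal sketch of name shadowing. The global below is only a stand-in for GCC's "progname" (its value is made up), and pick_name_shadowing and pick_name_renamed are hypothetical helpers, not functions from internal-api.c.

```cpp
#include <cassert>
#include <cstring>

// Stand-in for GCC's global; the value is invented for this example.
const char *progname = "cc1";

// A local called "progname" hides the global inside this function, so a
// reader cannot tell at a glance which one a later use refers to.
const char *pick_name_shadowing (const char *option)
{
  const char *progname = option;  // shadows ::progname
  if (!progname)
    progname = "libgccjit.so";
  return progname;
}

// With a distinct name, the local and the global are both unambiguous.
const char *pick_name_renamed (const char *option)
{
  const char *ctxt_progname = option;
  if (!ctxt_progname)
    ctxt_progname = "libgccjit.so";
  return ctxt_progname;
}
```

Both functions behave identically; the rename changes only how easy the code is to read, which matches the "for clarity" wording in the ChangeLog entry.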



Re: [RFA jit v2 1/2] introduce class toplev

2014-03-19 Thread Tom Tromey
> "Tom" == Tom Tromey  writes:

Tom> This patch introduces a new "class toplev" and changes toplev_main and
Tom> toplev_finalize to be methods of this class.  Additionally, now the
Tom> timevars are automatically stopped when the object is destroyed.  This
Tom> cleans up "compile" a bit and makes it simpler to reuse the toplev
Tom> logic in other code.

David asked me off-list to rename the field in class toplev, so here's a
new patch that does this.

Tom
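
As a rough illustration of "timevars are automatically stopped when the object is destroyed", the sketch below models a driver object whose destructor does the cleanup. It is an assumption-laden toy, not the class from the patch: driver, run, events and compile_once are hypothetical names, and the real toplev does considerably more.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Records what happened, so the behaviour can be inspected.
std::vector<std::string> events;

// Hypothetical driver: the constructor flag decides whether a total timer
// is used, and the destructor stops it on every exit path.
class driver
{
 public:
  explicit driver (bool use_total_timer)
    : m_use_total_timer (use_total_timer), m_started (false)
  {
  }

  ~driver ()
  {
    if (m_started)
      events.push_back ("stop total");
  }

  int run (bool fail)
  {
    if (m_use_total_timer)
      {
        m_started = true;
        events.push_back ("start total");
      }
    return fail ? 1 : 0;
  }

 private:
  bool m_use_total_timer;
  bool m_started;
};

// The caller no longer repeats "stop the timer" before each return.
int compile_once (bool fail)
{
  driver d (true);
  if (d.run (fail) != 0)
    return -1;
  return 0;
}
```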

commit 66f92863ef55c26f673d02dd39027f340940a3bf
Author: Tom Tromey 
Date:   Tue Mar 18 08:07:40 2014 -0600

introduce class toplev

This patch introduces a new "class toplev" and changes toplev_main and
toplev_finalize to be methods of this class.  Additionally, now the
timevars are automatically stopped when the object is destroyed.  This
cleans up "compile" a bit and makes it simpler to reuse the toplev
logic in other code.

diff --git a/gcc/ChangeLog.jit b/gcc/ChangeLog.jit
index 77ac44c..c590ab1 100644
--- a/gcc/ChangeLog.jit
+++ b/gcc/ChangeLog.jit
@@ -1,3 +1,17 @@
+2014-03-19  Tom Tromey  
+
+   * diagnostic.c (bt_stop): Use toplev::main.
+   * main.c (main): Update.
+   * toplev.c (do_compile): Remove argument.  Don't check
+   use_TV_TOTAL.
+   (toplev::toplev, toplev::~toplev, toplev::start_timevars): New
+   functions.
+   (toplev::main): Rename from toplev_main.  Update.
+   (toplev::finalize): Rename from toplev_finalize.  Update.
+   * toplev.h (class toplev): New.
+   (struct toplev_options): Remove.
+   (toplev_main, toplev_finalize): Don't declare.
+
 2014-03-11  David Malcolm  
 
* gcse.c (gcse_c_finalize): New, to clear test_insn between
diff --git a/gcc/diagnostic.c b/gcc/diagnostic.c
index 36094a1..56dc3ac 100644
--- a/gcc/diagnostic.c
+++ b/gcc/diagnostic.c
@@ -333,7 +333,7 @@ diagnostic_show_locus (diagnostic_context * context,
 static const char * const bt_stop[] =
 {
   "main",
-  "toplev_main",
+  "toplev::main",
   "execute_one_pass",
   "compile_file",
 };
diff --git a/gcc/jit/ChangeLog.jit b/gcc/jit/ChangeLog.jit
index efb1931..e45d38c 100644
--- a/gcc/jit/ChangeLog.jit
+++ b/gcc/jit/ChangeLog.jit
@@ -1,3 +1,8 @@
+2014-03-19  Tom Tromey  
+
+   * internal-api.c (compile): Use toplev, not toplev_options.
+   Simplify.
+
 2014-03-19  David Malcolm  
 
* internal-api.c (gcc::jit::recording::memento_of_get_pointer::
diff --git a/gcc/jit/internal-api.c b/gcc/jit/internal-api.c
index e3ddc4d..95978bf 100644
--- a/gcc/jit/internal-api.c
+++ b/gcc/jit/internal-api.c
@@ -3650,7 +3650,7 @@ compile ()
 
   /* Call into the rest of gcc.
  For now, we have to assemble command-line options to pass into
- toplev_main, so that they can be parsed. */
+ toplev::main, so that they can be parsed. */
 
   /* Pass in user-provided "progname", if any, so that it makes it
  into GCC's "progname" global, used in various diagnostics. */
@@ -3724,25 +3724,15 @@ compile ()
   ADD_ARG ("-fdump-ipa-all");
 }
 
-  toplev_options toplev_opts;
-  toplev_opts.use_TV_TOTAL = false;
+  toplev toplev (false);
 
-  if (time_report || !quiet_flag  || flag_detailed_statistics)
-timevar_init ();
-
-  timevar_start (TV_TOTAL);
-
-  toplev_main (num_args, const_cast <char **> (fake_args), &toplev_opts);
-  toplev_finalize ();
+  toplev.main (num_args, const_cast <char **> (fake_args));
+  toplev.finalize ();
 
   active_playback_ctxt = NULL;
 
   if (errors_occurred ())
-{
-  timevar_stop (TV_TOTAL);
-  timevar_print (stderr);
-  return NULL;
-}
+return NULL;
 
   if (get_bool_option (GCC_JIT_BOOL_OPTION_DUMP_GENERATED_CODE))
dump_generated_code ();
@@ -3765,8 +3755,6 @@ compile ()
 if (ret)
   {
timevar_pop (TV_ASSEMBLE);
-   timevar_stop (TV_TOTAL);
-   timevar_print (stderr);
return NULL;
   }
   }
@@ -3795,9 +3783,6 @@ compile ()
 timevar_pop (TV_LOAD);
   }
 
-  timevar_stop (TV_TOTAL);
-  timevar_print (stderr);
-
   return result_obj;
 }
 
diff --git a/gcc/main.c b/gcc/main.c
index b893308..4bba041 100644
--- a/gcc/main.c
+++ b/gcc/main.c
@@ -1,5 +1,5 @@
 /* main.c: defines main() for cc1, cc1plus, etc.
-   Copyright (C) 2007-2013 Free Software Foundation, Inc.
+   Copyright (C) 2007-2014 Free Software Foundation, Inc.
 
 This file is part of GCC.
 
@@ -26,15 +26,14 @@ along with GCC; see the file COPYING3.  If not see
 
 int main (int argc, char **argv);
 
-/* We define main() to call toplev_main(), which is defined in toplev.c.
+/* We define main() to call toplev::main(), which is defined in toplev.c.
We do this in a separate file in order to allow the language front-end
to define a different main(), if it so desires.  */
 
 int
 main (int argc, char **argv)
 {
-  toplev_options toplev_opts;
-  toplev_opts.use_TV_TOTAL = true;
+  toplev toplev (true);
 
-  return toplev_main (argc, argv, &toplev_opts);
+  return toplev.main (argc, argv);
 }
diff --git a/gcc/toplev.c b/gcc/tople

[PATCH, ARM] Optimise NotDI AND/OR ZeroExtendSI for ARMv7A

2014-03-19 Thread Ian Bolton
This is a follow-on patch to one already committed:
http://gcc.gnu.org/ml/gcc-patches/2014-02/msg01128.html

It implements patterns to simplify our RTL as follows:

OR (Not:DI (A:DI), ZeroExtend:DI (B:SI))
  -->  the top half can be done with a MVN

AND (Not:DI (A:DI), ZeroExtend:DI (B:SI))
  -->  the top half becomes zero.
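
The two simplifications follow from the high word of a zero-extended value being zero. A small host-side check, written as a sketch in C++ (the helper names low_half, high_half, join_halves, ior_not_zext and and_not_zext are made up here and do not appear in the patch):

```cpp
#include <cassert>
#include <cstdint>

uint32_t low_half (uint64_t x) { return (uint32_t) x; }
uint32_t high_half (uint64_t x) { return (uint32_t) (x >> 32); }

uint64_t join_halves (uint32_t high, uint32_t low)
{
  return ((uint64_t) high << 32) | low;
}

// ~a | zext (b): the high word of zext (b) is 0, so the high word of the
// result is just ~high_half (a), i.e. a single MVN.
uint64_t ior_not_zext (uint64_t a, uint32_t b)
{
  return join_halves (~high_half (a), ~low_half (a) | b);
}

// ~a & zext (b): ANDing with a zero high word always gives 0.
uint64_t and_not_zext (uint64_t a, uint32_t b)
{
  return join_halves (0, ~low_half (a) & b);
}
```

Each function computes the 64-bit result one 32-bit word at a time, which is the form the split patterns produce after reload.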

I've added test cases for both of these and also the existing
anddi_notdi patterns.  The tests all pass.

Full regression runs passed.

OK for stage 1?

Cheers,
Ian


2014-03-19  Ian Bolton  

gcc/
* config/arm/arm.md (*anddi_notdi_zesidi): New pattern.
* config/arm/thumb2.md (*iordi_notdi_zesidi): New pattern.

testsuite/
* gcc.target/arm/anddi_notdi-1.c: New test.
* gcc.target/arm/iordi_notdi-1.c: New test case.
diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index 2ddda02..d2d85ee 100644
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -2962,6 +2962,28 @@
(set_attr "type" "multiple")]
 )
 
+(define_insn_and_split "*anddi_notdi_zesidi"
+  [(set (match_operand:DI 0 "s_register_operand" "=&r,&r")
+(and:DI (not:DI (match_operand:DI 2 "s_register_operand" "0,?r"))
+(zero_extend:DI
+ (match_operand:SI 1 "s_register_operand" "r,r"))))]
+  "TARGET_32BIT"
+  "#"
+  "TARGET_32BIT && reload_completed"
+  [(set (match_dup 0) (and:SI (not:SI (match_dup 2)) (match_dup 1)))
+   (set (match_dup 3) (const_int 0))]
+  "
+  {
+operands[3] = gen_highpart (SImode, operands[0]);
+operands[0] = gen_lowpart (SImode, operands[0]);
+operands[2] = gen_lowpart (SImode, operands[2]);
+  }"
+  [(set_attr "length" "8")
+   (set_attr "predicable" "yes")
+   (set_attr "predicable_short_it" "no")
+   (set_attr "type" "multiple")]
+)
+
 (define_insn_and_split "*anddi_notsesidi_di"
   [(set (match_operand:DI 0 "s_register_operand" "=&r,&r")
(and:DI (not:DI (sign_extend:DI
diff --git a/gcc/config/arm/thumb2.md b/gcc/config/arm/thumb2.md
index 467c619..10bc8b1 100644
--- a/gcc/config/arm/thumb2.md
+++ b/gcc/config/arm/thumb2.md
@@ -1418,6 +1418,30 @@
(set_attr "type" "multiple")]
 )
 
+(define_insn_and_split "*iordi_notdi_zesidi"
+  [(set (match_operand:DI 0 "s_register_operand" "=&r,&r")
+   (ior:DI (not:DI (match_operand:DI 2 "s_register_operand" "0,?r"))
+   (zero_extend:DI
+(match_operand:SI 1 "s_register_operand" "r,r"))))]
+  "TARGET_THUMB2"
+  "#"
+  "TARGET_THUMB2 && reload_completed"
+  [(set (match_dup 0) (ior:SI (not:SI (match_dup 2)) (match_dup 1)))
+   (set (match_dup 3) (not:SI (match_dup 4)))]
+  "
+  {
+operands[3] = gen_highpart (SImode, operands[0]);
+operands[0] = gen_lowpart (SImode, operands[0]);
+operands[1] = gen_lowpart (SImode, operands[1]);
+operands[4] = gen_highpart (SImode, operands[2]);
+operands[2] = gen_lowpart (SImode, operands[2]);
+  }"
+  [(set_attr "length" "8")
+   (set_attr "predicable" "yes")
+   (set_attr "predicable_short_it" "no")
+   (set_attr "type" "multiple")]
+)
+
 (define_insn_and_split "*iordi_notsesidi_di"
   [(set (match_operand:DI 0 "s_register_operand" "=&r,&r")
(ior:DI (not:DI (sign_extend:DI
diff --git a/gcc/testsuite/gcc.target/arm/anddi_notdi-1.c b/gcc/testsuite/gcc.target/arm/anddi_notdi-1.c
new file mode 100644
index 0000000..cfb33fc
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/anddi_notdi-1.c
@@ -0,0 +1,65 @@
+/* { dg-do run } */
+/* { dg-options "-O2 -fno-inline --save-temps" } */
+
+extern void abort (void);
+
+typedef long long s64int;
+typedef int s32int;
+typedef unsigned long long u64int;
+typedef unsigned int u32int;
+
+s64int
+anddi_di_notdi (s64int a, s64int b)
+{
+  return (a & ~b);
+}
+
+s64int
+anddi_di_notzesidi (s64int a, u32int b)
+{
+  return (a & ~(u64int) b);
+}
+
+s64int
+anddi_notdi_zesidi (s64int a, u32int b)
+{
+  return (~a & (u64int) b);
+}
+
+s64int
+anddi_di_notsesidi (s64int a, s32int b)
+{
+  return (a & ~(s64int) b);
+}
+
+int main ()
+{
+  s64int a64 = 0xdeadbeefll;
+  s64int b64 = 0x5f470112ll;
+  s64int c64 = 0xdeadbeef300fll;
+
+  u32int c32 = 0x01124f4f;
+  s32int d32 = 0xabbaface;
+
+  s64int z = anddi_di_notdi (c64, b64);
+  if (z != 0xdeadbeef2008ll)
+abort ();
+
+  z = anddi_di_notzesidi (a64, c32);
+  if (z != 0xdeadbeefb0b0ll)
+abort ();
+
+  z = anddi_notdi_zesidi (c64, c32);
+  if (z != 0x01104f4fll)
+abort ();
+
+  z = anddi_di_notsesidi (a64, d32);
+  if (z != 0x0531ll)
+abort ();
+
+  return 0;
+}
+
+/* { dg-final { scan-assembler-times "bic\t" 6 } } */
+
+/* { dg-final { cleanup-saved-temps } } */
diff --git a/gcc/testsuite/gcc.target/arm/iordi_notdi-1.c b/gcc/testsuite/gcc.target/arm/iordi_notdi-1.c
index cda9c0e..249f080 100644
--- a/gcc/testsuite/gcc.target/arm/iordi_notdi-1.c
+++ b/gcc/testsuite/gcc.target/arm/iordi_notdi-1.c
@@ -9,19 +9,25 @@ typedef unsigned long long u64int;
 typedef unsigned int u32int;
 
 s64int
-iordi_notdi (s64int a, s64int b)
+iordi_di_notdi

Re: [PATCH] Fix PR60505

2014-03-19 Thread Cong Hou
On Tue, Mar 18, 2014 at 4:43 AM, Richard Biener  wrote:
>
> On Mon, 17 Mar 2014, Cong Hou wrote:
>
> > On Mon, Mar 17, 2014 at 6:44 AM, Richard Biener  wrote:
> > > On Fri, 14 Mar 2014, Cong Hou wrote:
> > >
> > >> On Fri, Mar 14, 2014 at 12:58 AM, Richard Biener  
> > >> wrote:
> > >> > On Fri, 14 Mar 2014, Jakub Jelinek wrote:
> > >> >
> > >> >> On Fri, Mar 14, 2014 at 08:52:07AM +0100, Richard Biener wrote:
> > >> >> > > Consider this fact and if there are alias checks, we can safely 
> > >> >> > > remove
> > >> >> > > the epilogue if the maximum trip count of the loop is less than or
> > >> >> > > equal to the calculated threshold.
> > >> >> >
> > >> >> > You have to consider n % vf != 0, so an argument on only maximum
> > >> >> > trip count or threshold cannot work.
> > >> >>
> > >> >> Well, if you only check if maximum trip count is <= vf and you know
> > > >> >> that for n < vf the vectorized loop + its epilogue path will not be 
> > >> >> taken,
> > >> >> then perhaps you could, but it is a very special case.
> > >> >> Now, the question is when we are guaranteed we enter the scalar 
> > >> >> versioned
> > >> >> loop instead for n < vf, is that in case of versioning for alias or
> > >> >> versioning for alignment?
> > >> >
> > >> > I think neither - I have plans to do the cost model check together
> > >> > with the versioning condition but didn't get around to implement that.
> > >> > That would allow stronger max bounds for the epilogue loop.
> > >>
> > >> In vect_transform_loop(), check_profitability will be set to true if
> > >> th >= VF-1 and the number of iteration is unknown (we only consider
> > >> unknown trip count here), where th is calculated based on the
> > >> parameter PARAM_MIN_VECT_LOOP_BOUND and cost model, with the minimum
> > >> value VF-1. If the loop needs to be versioned, then
> > >> check_profitability with true value will be passed to
> > >> vect_loop_versioning(), in which an enhanced loop bound check
> > >> (considering cost) will be built. So I think if the loop is versioned
> > >> and n < VF, then we must enter the scalar version, and in this case
> > >> removing epilogue should be safe when the maximum trip count <= th+1.
> > >
> > > You mean exactly in the case where the profitability check ensures
> > > that n % vf == 0?  Thus effectively if n == maximum trip count?
> > > That's quite a special case, no?
> >
> >
> > Yes, it is a special case. But it is in this special case that those
> > warnings are thrown out. Also, I think declaring an array with VF*N as
> > length is not unusual.
>
> Ok, but then for the patch compute the cost model threshold once
> in vect_analyze_loop_2 and store it in a new
> LOOP_VINFO_COST_MODEL_THRESHOLD.


Done.


> Also you have to check
> the return value from max_stmt_executions_int as that may return
> -1 if the number cannot be computed (or isn't representable in
> a HOST_WIDE_INT).


It will be converted to unsigned type so that -1 means infinity.
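
To spell out the conversion argument: a sketch, assuming a 64-bit signed count where -1 means "unknown" and an unsigned threshold. epilogue_removable is a hypothetical helper, and the "<= th + 1" comparison is taken from the discussion above rather than from the patch text.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper.  "max_iters" is -1 when no bound is known,
// mirroring the convention described for max_stmt_executions_int.
bool epilogue_removable (int64_t max_iters, uint64_t threshold)
{
  // -1 converts to UINT64_MAX, so for any realistic threshold the
  // comparison fails and an unknown bound is treated as "infinite".
  return (uint64_t) max_iters <= threshold + 1;
}
```

With a threshold of 15, a known bound of 16 passes the check while -1 does not, so no separate test for the "cannot be computed" case is needed.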


> You also should check for
> LOOP_REQUIRES_VERSIONING_FOR_ALIGNMENT which should have the
> same effect on the cost model check.


Done.


>
>
> The existing condition is already complicated enough - adding new
> stuff warrants comments before the (sub-)checks.


OK. Comments added.

Below is the revised patch. Bootstrapped and tested on an x86-64 machine.


Cong



diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index e1d8666..eceefb3 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,18 @@
+2014-03-11  Cong Hou  
+
+ PR tree-optimization/60505
+ * tree-vectorizer.h (struct _stmt_vec_info): Add th field as the
+ threshold of number of iterations below which no vectorization will be
+ done.
+ * tree-vect-loop.c (new_loop_vec_info):
+ Initialize LOOP_VINFO_COST_MODEL_THRESHOLD.
+ * tree-vect-loop.c (vect_analyze_loop_operations):
+ Set LOOP_VINFO_COST_MODEL_THRESHOLD.
+ * tree-vect-loop.c (vect_transform_loop):
+ Use LOOP_VINFO_COST_MODEL_THRESHOLD.
+ * tree-vect-loop.c (vect_analyze_loop_2): Check the maximum number
+ of iterations of the loop and see if we should build the epilogue.
+
 2014-03-10  Jakub Jelinek  

  PR ipa/60457
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index 41b6875..09ec1c0 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,3 +1,8 @@
+2014-03-11  Cong Hou  
+
+ PR tree-optimization/60505
+ * gcc.dg/vect/pr60505.c: New test.
+
 2014-03-10  Jakub Jelinek  

  PR ipa/60457
diff --git a/gcc/testsuite/gcc.dg/vect/pr60505.c b/gcc/testsuite/gcc.dg/vect/pr60505.c
new file mode 100644
index 0000000..6940513
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/pr60505.c
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-Wall -Werror" } */
+
+void foo(char *in, char *out, int num)
+{
+  int i;
+  char ovec[16] = {0};
+
+  for(i = 0; i < num ; ++i)
+out[i] = (ovec[i] = in[i]);
+  out[num] = ovec[num/2];
+}
diff --git a/gcc/tree-vect-loop.c b/gcc/tree-vect-loop.c
index df6ab6f..1c78e11 100644
--- a/gcc/tree-vect-loop.c
++

[4.8, PATCH 0/26] Backport Power8 and LE support

2014-03-19 Thread Bill Schmidt
Hi,

Support for Power8 features and the new powerpc64le-linux-gnu target,
including the ELFv2 ABI, has been developed up till now on the
ibm/gcc-4_8-branch.  It was appropriate to use this separate branch
while the support was unstable, but this branch will not represent a
particularly good support mechanism for distributions going forward.
Most distros are set up to pull from the major release branches, and
having a separate branch for one target is quite inconvenient.  Also,
the ibm/gcc-4_8-branch's original purpose is to serve as the code base
for IBM's Advance Toolchain 7.0.  Over time the two purposes that the
branch currently serves will diverge and make things even more
complicated.

The code is now tested and stable enough that we are ready to backport
this support to the FSF 4.8 branch.  This patch series constitutes that
backport.

Almost all of the changes are specific to PowerPC portions of the code,
and for those patches I am only CCing David.  However, some of the
patches require changes to common code, and for these I will CC Richard
and Jakub.  Three of these are slightly unrelated but necessary patches,
one to enable decimal float ABS builtins, and two others to fix PR54537
and PR56843.  In addition there are patches that update configuration
files throughout for the new target, and some small changes in common
call support (call.c, expr.h, function.c) to support how the new ABI
handles calls.

I realize it is unusual to backport such a large amount of code, but we
have been asked by distribution partners to do this, and we feel it
makes good sense for long-term support.

I have tested the patch series by applying it to a clean FSF 4.8 branch
and comparing the test results against those from the IBM 4.8 branch on
three systems:
 * Power8, little endian (-mcpu=power8)
 * Power8, big endian (-mcpu=power8)
 * Power7, big endian (-mcpu=power7)

I also checked a recursive diff against the two source directories to
ensure that no patches were missed.

Thanks,
Bill

[ 1/26] diff-p8
[ 2/26] diff-p8-htm
[ 3/26] diff-le-config
[ 4/26] diff-le-libtool
[ 5/26] diff-le-tests
[ 6/26] diff-le-dfp
[ 7/26] diff-le-vector
[ 8/26] diff-abi-compat
[ 9/26] diff-abi-calls
[10/26] diff-abi-elfv2
[11/26] diff-abi-gotest
[12/26] diff-le-align
[13/26] diff-abi-libffi
[14/26] diff-dfp-abs
[15/26] diff-pr54537
[16/26] diff-pr56843
[17/26] diff-direct-move
[18/26] diff-le-config-2
[19/26] diff-quad-memory
[20/26] diff-lra
[21/26] diff-le-vector-api
[22/26] diff-mcall
[23/26] diff-pr60137-pr60203
[24/26] diff-reload
[25/26] diff-v1ti
[26/26] diff-trunk-missing








[4.8, PATCH 3/26] Backport Power8 and LE support: Configury bits 1

2014-03-19 Thread Bill Schmidt
Hi,

This patch (diff-le-config) backports updates to more recent
config.guess and config.sub versions to support the new powerpc64le
target.

Thanks,
Bill


2014-03-29  Bill Schmidt  

Backport from mainline r203071:

2013-10-01  Joern Rennecke  

Import from savannah.gnu.org:
* config.guess: Update to 2013-06-10 version.
* config.sub: Update to 2013-10-01 version.


Index: gcc-4_8-branch/config.guess
===
--- gcc-4_8-branch.orig/config.guess2013-12-28 17:41:32.765630566 +0100
+++ gcc-4_8-branch/config.guess 2013-12-28 17:50:37.995329461 +0100
@@ -1,10 +1,8 @@
 #! /bin/sh
 # Attempt to guess a canonical system name.
-#   Copyright (C) 1992, 1993, 1994, 1995, 1996, 1997, 1998, 1999,
-#   2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010,
-#   2011, 2012, 2013 Free Software Foundation, Inc.
+#   Copyright 1992-2013 Free Software Foundation, Inc.
 
-timestamp='2012-12-30'
+timestamp='2013-06-10'
 
 # This file is free software; you can redistribute it and/or modify it
 # under the terms of the GNU General Public License as published by
@@ -52,9 +50,7 @@ version="\
 GNU config.guess ($timestamp)
 
 Originally written by Per Bothner.
-Copyright (C) 1992, 1993, 1994, 1995, 1996, 1997, 1998, 1999, 2000,
-2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011,
-2012, 2013 Free Software Foundation, Inc.
+Copyright 1992-2013 Free Software Foundation, Inc.
 
 This is free software; see the source for copying conditions.  There is NO
 warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE."
@@ -136,6 +132,27 @@ UNAME_RELEASE=`(uname -r) 2>/dev/null` |
 UNAME_SYSTEM=`(uname -s) 2>/dev/null`  || UNAME_SYSTEM=unknown
 UNAME_VERSION=`(uname -v) 2>/dev/null` || UNAME_VERSION=unknown
 
+case "${UNAME_SYSTEM}" in
+Linux|GNU|GNU/*)
+   # If the system lacks a compiler, then just pick glibc.
+   # We could probably try harder.
+   LIBC=gnu
+
+   eval $set_cc_for_build
+   cat <<-EOF > $dummy.c
+   #include <features.h>
+   #if defined(__UCLIBC__)
+   LIBC=uclibc
+   #elif defined(__dietlibc__)
+   LIBC=dietlibc
+   #else
+   LIBC=gnu
+   #endif
+   EOF
+   eval `$CC_FOR_BUILD -E $dummy.c 2>/dev/null | grep '^LIBC'`
+   ;;
+esac
+
 # Note: order is significant - the case branches are not exclusive.
 
 case "${UNAME_MACHINE}:${UNAME_SYSTEM}:${UNAME_RELEASE}:${UNAME_VERSION}" in
@@ -857,21 +874,21 @@ EOF
exit ;;
 *:GNU:*:*)
# the GNU system
-   echo `echo ${UNAME_MACHINE}|sed -e 's,[-/].*$,,'`-unknown-gnu`echo ${UNAME_RELEASE}|sed -e 's,/.*$,,'`
+   echo `echo ${UNAME_MACHINE}|sed -e 's,[-/].*$,,'`-unknown-${LIBC}`echo ${UNAME_RELEASE}|sed -e 's,/.*$,,'`
exit ;;
 *:GNU/*:*:*)
# other systems with GNU libc and userland
-   echo ${UNAME_MACHINE}-unknown-`echo ${UNAME_SYSTEM} | sed 's,^[^/]*/,,' | tr '[A-Z]' '[a-z]'``echo ${UNAME_RELEASE}|sed -e 's/[-(].*//'`-gnu
+   echo ${UNAME_MACHINE}-unknown-`echo ${UNAME_SYSTEM} | sed 's,^[^/]*/,,' | tr '[A-Z]' '[a-z]'``echo ${UNAME_RELEASE}|sed -e 's/[-(].*//'`-${LIBC}
exit ;;
 i*86:Minix:*:*)
echo ${UNAME_MACHINE}-pc-minix
exit ;;
 aarch64:Linux:*:*)
-   echo ${UNAME_MACHINE}-unknown-linux-gnu
+   echo ${UNAME_MACHINE}-unknown-linux-${LIBC}
exit ;;
 aarch64_be:Linux:*:*)
UNAME_MACHINE=aarch64_be
-   echo ${UNAME_MACHINE}-unknown-linux-gnu
+   echo ${UNAME_MACHINE}-unknown-linux-${LIBC}
exit ;;
 alpha:Linux:*:*)
case `sed -n '/^cpu model/s/^.*: \(.*\)/\1/p' < /proc/cpuinfo` in
@@ -884,59 +901,54 @@ EOF
  EV68*) UNAME_MACHINE=alphaev68 ;;
esac
objdump --private-headers /bin/sh | grep -q ld.so.1
-   if test "$?" = 0 ; then LIBC="libc1" ; else LIBC="" ; fi
-   echo ${UNAME_MACHINE}-unknown-linux-gnu${LIBC}
+   if test "$?" = 0 ; then LIBC="gnulibc1" ; fi
+   echo ${UNAME_MACHINE}-unknown-linux-${LIBC}
+   exit ;;
+arc:Linux:*:* | arceb:Linux:*:*)
+   echo ${UNAME_MACHINE}-unknown-linux-${LIBC}
exit ;;
 arm*:Linux:*:*)
eval $set_cc_for_build
if echo __ARM_EABI__ | $CC_FOR_BUILD -E - 2>/dev/null \
| grep -q __ARM_EABI__
then
-   echo ${UNAME_MACHINE}-unknown-linux-gnu
+   echo ${UNAME_MACHINE}-unknown-linux-${LIBC}
else
if echo __ARM_PCS_VFP | $CC_FOR_BUILD -E - 2>/dev/null \
| grep -q __ARM_PCS_VFP
then
-   echo ${UNAME_MACHINE}-unknown-linux-gnueabi
+   echo ${UNAME_MACHINE}-unknown-linux-${LIBC}eabi
else
-   echo ${UNAME_MACHINE}-unknown-linux-gnueabihf
+   echo ${UNAME_MACHINE}-unknown-linux-${LIBC}eabihf
fi
fi
exit ;;
 avr32*:Linux:*:*)
-   echo ${UNAME_MACHINE}-unknown-linux-gnu
+   ech

[4.8, PATCH 5/26] Backport Power8 and LE support: Test adjustments

2014-03-19 Thread Bill Schmidt
Hi,

This patch (diff-le-tests) backports adjustments to a few tests for
powerpc64le and the ELFv2 ABI.

Thanks,
Bill


2014-03-29  Bill Schmidt  

Backport from mainline
2013-11-27  Bill Schmidt  

* gfortran.dg/nan_7.f90: Disable for little endian PowerPC.

Backport from mainline r205106:

2013-11-20  Ulrich Weigand  

* gcc.target/powerpc/darwin-longlong.c (msw): Make endian-safe.

Backport from mainline r205046:

2013-11-19  Ulrich Weigand  

* gcc.target/powerpc/ppc64-abi-2.c (MAKE_SLOT): New macro to
construct parameter slot value in endian-independent way.
(fcevv, fciievv, fcvevv): Use it.
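
Why the expected slot value depends on byte order can be checked on any host. The following is a sketch in C++ rather than the test's C; host_is_little_endian, make_slot and load_slot are names invented here, with make_slot playing the role of the MAKE_SLOT macro but choosing the layout at run time instead of via __LITTLE_ENDIAN__.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Run-time probe of the host byte order.
bool host_is_little_endian ()
{
  uint16_t probe = 1;
  unsigned char first_byte;
  std::memcpy (&first_byte, &probe, 1);
  return first_byte == 1;
}

// Expected 64-bit value of a slot holding two consecutive 32-bit words.
uint64_t make_slot (uint32_t first, uint32_t second)
{
  if (host_is_little_endian ())
    return (uint64_t) first | ((uint64_t) second << 32);
  return (uint64_t) second | ((uint64_t) first << 32);
}

// What is actually read when the two words are viewed as one doubleword.
uint64_t load_slot (uint32_t first, uint32_t second)
{
  uint32_t words[2] = { first, second };
  uint64_t slot;
  std::memcpy (&slot, words, sizeof slot);
  return slot;
}
```

On a big-endian host the first word lands in the high half, which is what the old hard-coded constants assumed; on a little-endian host it lands in the low half.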


Index: gcc-4_8-branch/gcc/testsuite/gcc.target/powerpc/ppc64-abi-2.c
===
--- gcc-4_8-branch.orig/gcc/testsuite/gcc.target/powerpc/ppc64-abi-2.c  2013-12-28 17:41:32.430628909 +0100
+++ gcc-4_8-branch/gcc/testsuite/gcc.target/powerpc/ppc64-abi-2.c   2013-12-28 17:50:39.655337721 +0100
@@ -119,6 +119,12 @@ typedef union
   vector int v;
 } vector_int_t;

+#ifdef __LITTLE_ENDIAN__
+#define MAKE_SLOT(x, y) ((long)x | ((long)y << 32))
+#else
+#define MAKE_SLOT(x, y) ((long)y | ((long)x << 32))
+#endif
+
 /* Paramter passing.
s : gpr 3
v : vpr 2
@@ -226,8 +232,8 @@ fcevv (char *s, ...)
   sp = __builtin_frame_address(0);
   sp = sp->backchain;
   
-  if (sp->slot[2].l != 0x100000002ULL
-  || sp->slot[4].l != 0x500000006ULL)
+  if (sp->slot[2].l != MAKE_SLOT (1, 2)
+  || sp->slot[4].l !=  MAKE_SLOT (5, 6))
 abort();
 }

@@ -268,8 +274,8 @@ fciievv (char *s, int i, int j, ...)
   sp = __builtin_frame_address(0);
   sp = sp->backchain;
   
-  if (sp->slot[4].l != 0x100000002ULL
-  || sp->slot[6].l != 0x500000006ULL)
+  if (sp->slot[4].l != MAKE_SLOT (1, 2)
+  || sp->slot[6].l !=  MAKE_SLOT (5, 6))
 abort();
 }

@@ -296,8 +302,8 @@ fcvevv (char *s, vector int x, ...)
   sp = __builtin_frame_address(0);
   sp = sp->backchain;
   
-  if (sp->slot[4].l != 0x100000002ULL
-  || sp->slot[6].l != 0x500000006ULL)
+  if (sp->slot[4].l != MAKE_SLOT (1, 2)
+  || sp->slot[6].l !=  MAKE_SLOT (5, 6))
 abort();
 }

Index: gcc-4_8-branch/gcc/testsuite/gcc.target/powerpc/darwin-longlong.c
===
--- gcc-4_8-branch.orig/gcc/testsuite/gcc.target/powerpc/darwin-longlong.c  2013-12-28 17:41:32.430628909 +0100
+++ gcc-4_8-branch/gcc/testsuite/gcc.target/powerpc/darwin-longlong.c   2013-12-28 17:50:39.659337741 +0100
@@ -11,7 +11,11 @@ int  msw(long long in)
 int  i[2];
   } ud;
   ud.ll = in;
+#ifdef __LITTLE_ENDIAN__
+  return ud.i[1];
+#else
   return ud.i[0];
+#endif
 }

 int main()
Index: gcc-4_8-branch/gcc/testsuite/gfortran.dg/nan_7.f90
===
--- gcc-4_8-branch.orig/gcc/testsuite/gfortran.dg/nan_7.f90 2013-12-28 17:41:32.430628909 +0100
+++ gcc-4_8-branch/gcc/testsuite/gfortran.dg/nan_7.f90  2013-12-28 17:50:39.662337756 +0100
@@ -2,6 +2,7 @@
 ! { dg-options "-fno-range-check" }
 ! { dg-require-effective-target fortran_real_16 }
 ! { dg-require-effective-target fortran_integer_16 }
+! { dg-skip-if "" { "powerpc*le-*-*" } { "*" } { "" } }
 ! PR47293 NAN not correctly read
 character(len=200) :: str
 real(16) :: r






[4.8, PATCH 8/26] Backport Power8 and LE support: PR57949

2014-03-19 Thread Bill Schmidt
Hi,

This patch (diff-abi-compat) backports the ABI compatibility fix for
PR57949.

Thanks,
Bill


[gcc]

2014-03-29  Bill Schmidt  

Backport from mainline r201750.
2013-11-15  Ulrich Weigand  
Note: Default setting of -mcompat-align-parm inverted!

2013-08-14  Bill Schmidt  

PR target/57949
* doc/invoke.texi: Add documentation of mcompat-align-parm
option.
* config/rs6000/rs6000.opt: Add mcompat-align-parm option.
* config/rs6000/rs6000.c (rs6000_function_arg_boundary): For AIX
and Linux, correct BLKmode alignment when 128-bit alignment is
required and compatibility flag is not set.
(rs6000_gimplify_va_arg): For AIX and Linux, honor specified
alignment for zero-size arguments when compatibility flag is not
set.

[gcc/testsuite]

2014-03-29  Bill Schmidt  

Backport from mainline r201750.
2013-11-15  Ulrich Weigand  
Note: Default setting of -mcompat-align-parm inverted!

2013-08-14  Bill Schmidt  

PR target/57949
* gcc.target/powerpc/pr57949-1.c: New.
* gcc.target/powerpc/pr57949-2.c: New.
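
To make the affected case concrete, here is a sketch of a structure whose member needs 128-bit alignment, together with a toy model of the boundary decision. This is not the rs6000 code: vec_holder and arg_boundary_bits are hypothetical, "compat" stands in for -mcompat-align-parm, and the 64-bit fallback is an assumed default parameter boundary.

```cpp
#include <cassert>
#include <cstddef>

// A structure containing a member that requires 16-byte (128-bit)
// alignment, so the structure itself is 128-bit aligned.
struct vec_holder
{
  alignas (16) int v[4];
};

// Toy model of the decision, in bits.  With the compatibility flag set,
// aggregates are passed with at most 64-bit alignment, as older GCC did.
unsigned arg_boundary_bits (std::size_t type_align_bytes, bool compat)
{
  if (!compat && type_align_bytes * 8 > 64)
    return 128;
  return 64;  // assumed default boundary for this sketch
}
```

Under these assumptions, the same structure is passed on a 128-bit boundary with the flag cleared and on a 64-bit boundary with it set, which is the incompatibility the option exists to manage.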


Index: gcc-4_8-test/gcc/config/rs6000/rs6000.c
===
--- gcc-4_8-test.orig/gcc/config/rs6000/rs6000.c
+++ gcc-4_8-test/gcc/config/rs6000/rs6000.c
@@ -8680,8 +8680,8 @@ rs6000_function_arg_boundary (enum machi
   || (type && TREE_CODE (type) == VECTOR_TYPE
   && int_size_in_bytes (type) >= 16))
 return 128;
-  else if (TARGET_MACHO
-  && rs6000_darwin64_abi
+  else if (((TARGET_MACHO && rs6000_darwin64_abi)
+|| (DEFAULT_ABI == ABI_AIX && !rs6000_compat_align_parm))
   && mode == BLKmode
   && type && TYPE_ALIGN (type) > 64)
 return 128;
@@ -10233,8 +10233,9 @@ rs6000_gimplify_va_arg (tree valist, tre
  We don't need to check for pass-by-reference because of the test above.
  We can return a simplifed answer, since we know there's no offset to add. 
 */
 
-  if (TARGET_MACHO
-  && rs6000_darwin64_abi 
+  if (((TARGET_MACHO
+&& rs6000_darwin64_abi)
+   || (DEFAULT_ABI == ABI_AIX && !rs6000_compat_align_parm))
   && integer_zerop (TYPE_SIZE (type)))
 {
   unsigned HOST_WIDE_INT align, boundary;
Index: gcc-4_8-test/gcc/config/rs6000/rs6000.opt
===
--- gcc-4_8-test.orig/gcc/config/rs6000/rs6000.opt
+++ gcc-4_8-test/gcc/config/rs6000/rs6000.opt
@@ -550,6 +550,10 @@ mquad-memory
 Target Report Mask(QUAD_MEMORY) Var(rs6000_isa_flags)
 Generate the quad word memory instructions (lq/stq/lqarx/stqcx).
 
+mcompat-align-parm
+Target Report Var(rs6000_compat_align_parm) Init(1) Save
+Generate aggregate parameter passing code with at most 64-bit alignment.
+
 mupper-regs-df
 Target Undocumented Mask(UPPER_REGS_DF) Var(rs6000_isa_flags)
 Allow double variables in upper registers with -mcpu=power7 or -mvsx
Index: gcc-4_8-test/gcc/doc/invoke.texi
===
--- gcc-4_8-test.orig/gcc/doc/invoke.texi
+++ gcc-4_8-test/gcc/doc/invoke.texi
@@ -17243,7 +17243,8 @@ following options:
 -mpopcntb -mpopcntd  -mpowerpc64 @gol
 -mpowerpc-gpopt  -mpowerpc-gfxopt  -msingle-float -mdouble-float @gol
 -msimple-fpu -mstring  -mmulhw  -mdlmzb  -mmfpgpr -mvsx @gol
--mcrypto -mdirect-move -mpower8-fusion -mpower8-vector -mquad-memory}
+-mcrypto -mdirect-move -mpower8-fusion -mpower8-vector -mquad-memory @gol
+-mcompat-align-parm -mno-compat-align-parm}
 
 The particular options set for any particular CPU varies between
 compiler versions, depending on what setting seems to produce optimal
@@ -18128,6 +18129,23 @@ stack location in the function prologue
 a pointer on AIX and 64-bit Linux systems.  If the TOC value is not
 saved in the prologue, it is saved just before the call through the
 pointer.  The @option{-mno-save-toc-indirect} option is the default.
+
+@item -mcompat-align-parm
+@itemx -mno-compat-align-parm
+@opindex mcompat-align-parm
+Generate (do not generate) code to pass structure parameters with a
+maximum alignment of 64 bits, for compatibility with older versions
+of GCC.
+
+Older versions of GCC (prior to 4.9.0) incorrectly did not align a
+structure parameter on a 128-bit boundary when that structure contained
+a member requiring 128-bit alignment.  This is corrected in more
+recent versions of GCC.  This option may be used to generate code
+that is compatible with functions compiled with older versions of
+GCC.
+
+In this version of the compiler, the @option{-mcompat-align-parm}
+is the default, except when using the Linux ELFv2 ABI.
 @end table
 
 @node RX Options
Index: gcc-4_8-test/gcc/testsuite/gcc.target/powerpc/pr57949-1.c
===
--- /dev/null
+++ gcc-4_8-test/gcc/testsuite/gcc.target/powerpc/pr57949-1.c
@@ -0,0 +1,19 @@
+/* {

[4.8, PATCH 9/26] Backport Power8 and LE support: ABI call support

2014-03-19 Thread Bill Schmidt
Hi,

This patch (diff-abi-calls) backports fixes to common code to support
the new ELFv2 ABI.  Copying Richard and Jakub for these bits.

Thanks,
Bill


2014-03-29  Bill Schmidt  

Backport from mainline r204798:

2013-11-14  Ulrich Weigand  
Alan Modra  

* function.c (assign_parms): Use all.reg_parm_stack_space instead
of re-evaluating REG_PARM_STACK_SPACE target macro.
(locate_and_pad_parm): New parameter REG_PARM_STACK_SPACE.  Use it
instead of evaluating target macro REG_PARM_STACK_SPACE every time.
(assign_parm_find_entry_rtl): Update call.
* calls.c (initialize_argument_information): Update call.
(emit_library_call_value_1): Likewise.
* expr.h (locate_and_pad_parm): Update prototype.

Backport from mainline r204797:

2013-11-14  Ulrich Weigand  

* calls.c (store_unaligned_arguments_into_pseudos): Skip PARALLEL
arguments.

Backport from mainline r197003:

2013-03-23  Eric Botcazou  

* calls.c (expand_call): Add missing guard to code handling return
of non-BLKmode structures in MSB.
* function.c (expand_function_end): Likewise.
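
As a rough illustration of the assign_parms change listed above, the incoming argument size is first raised to the reserved register-parameter area and then rounded up to the parameter boundary. The sketch below assumes local MAX and CEIL_ROUND macros written in the usual way, and the function name incoming_args_size is hypothetical; the point is that the reserved size arrives as a value computed once by the caller instead of being re-evaluated from a target macro.

```c
#include <stddef.h>

/* Local helper macros for this sketch; ALIGN must be a power of two.  */
#define MAX(a, b) ((a) > (b) ? (a) : (b))
#define CEIL_ROUND(value, align) (((value) + (align) - 1) & ~((align) - 1))

/* Hypothetical helper: adjust the incoming argument size for minimum
   length and alignment.  REG_PARM_STACK_SPACE is passed in as an
   already-computed value, mirroring the new parameter in the patch.  */
static size_t
incoming_args_size (size_t args_size, size_t reg_parm_stack_space,
                    size_t parm_boundary_bytes)
{
  size_t size = MAX (args_size, reg_parm_stack_space);
  return CEIL_ROUND (size, parm_boundary_bytes);
}
```

For example, with 64 bytes reserved and an 8-byte boundary, a 20-byte argument area grows to 64 bytes and a 70-byte area is rounded up to 72.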


Index: gcc-4_8-branch/gcc/calls.c
===
--- gcc-4_8-branch.orig/gcc/calls.c 2013-12-28 17:41:32.056627059 +0100
+++ gcc-4_8-branch/gcc/calls.c  2013-12-28 17:50:43.356356135 +0100
@@ -983,6 +983,7 @@ store_unaligned_arguments_into_pseudos (
 
   for (i = 0; i < num_actuals; i++)
 if (args[i].reg != 0 && ! args[i].pass_on_stack
+   && GET_CODE (args[i].reg) != PARALLEL
&& args[i].mode == BLKmode
&& MEM_P (args[i].value)
&& (MEM_ALIGN (args[i].value)
@@ -1327,6 +1328,7 @@ initialize_argument_information (int num
 #else
 args[i].reg != 0,
 #endif
+reg_parm_stack_space,
 args[i].pass_on_stack ? 0 : args[i].partial,
 fndecl, args_size, &args[i].locate);
 #ifdef BLOCK_REG_PADDING
@@ -3171,7 +3173,9 @@ expand_call (tree exp, rtx target, int i
 group load/store machinery below.  */
   if (!structure_value_addr
  && !pcc_struct_value
+ && TYPE_MODE (rettype) != VOIDmode
  && TYPE_MODE (rettype) != BLKmode
+ && REG_P (valreg)
  && targetm.calls.return_in_msb (rettype))
{
  if (shift_return_value (TYPE_MODE (rettype), false, valreg))
@@ -3734,7 +3738,8 @@ emit_library_call_value_1 (int retval, r
 #else
   argvec[count].reg != 0,
 #endif
-  0, NULL_TREE, &args_size, &argvec[count].locate);
+  reg_parm_stack_space, 0,
+  NULL_TREE, &args_size, &argvec[count].locate);
 
   if (argvec[count].reg == 0 || argvec[count].partial != 0
  || reg_parm_stack_space > 0)
@@ -3821,7 +3826,7 @@ emit_library_call_value_1 (int retval, r
 #else
   argvec[count].reg != 0,
 #endif
-  argvec[count].partial,
+  reg_parm_stack_space, argvec[count].partial,
   NULL_TREE, &args_size, &argvec[count].locate);
  args_size.constant += argvec[count].locate.size.constant;
  gcc_assert (!argvec[count].locate.size.var);
Index: gcc-4_8-branch/gcc/function.c
===
--- gcc-4_8-branch.orig/gcc/function.c  2013-12-28 17:41:32.056627059 +0100
+++ gcc-4_8-branch/gcc/function.c   2013-12-28 17:50:43.362356165 +0100
@@ -2507,6 +2507,7 @@ assign_parm_find_entry_rtl (struct assig
 }
 
   locate_and_pad_parm (data->promoted_mode, data->passed_type, in_regs,
+  all->reg_parm_stack_space,
   entry_parm ? data->partial : 0, current_function_decl,
   &all->stack_args_size, &data->locate);
 
@@ -3485,11 +3486,7 @@ assign_parms (tree fndecl)
   /* Adjust function incoming argument size for alignment and
  minimum length.  */
 
-#ifdef REG_PARM_STACK_SPACE
-  crtl->args.size = MAX (crtl->args.size,
-   REG_PARM_STACK_SPACE (fndecl));
-#endif
-
+  crtl->args.size = MAX (crtl->args.size, all.reg_parm_stack_space);
   crtl->args.size = CEIL_ROUND (crtl->args.size,
   PARM_BOUNDARY / BITS_PER_UNIT);
 
@@ -3693,6 +3690,9 @@ gimplify_parameters (void)
IN_REGS is nonzero if the argument will be passed in registers.  It will
never be set if REG_PARM_STACK_SPACE is not defined.
 
+   REG_PARM_STACK_SPACE is the number of bytes of stack space reserved
+   for arguments which are passed in registers.
+
FNDECL is the function in which the argument was defined.
 
There are two types of rounding that are done.  The first, controlled by
@@ -3713,19 +3713,16 @@ g

Re: [RFA jit v2 1/2] introduce class toplev

2014-03-19 Thread David Malcolm
On Wed, 2014-03-19 at 12:10 -0600, Tom Tromey wrote:
> > "Tom" == Tom Tromey  writes:
> 
> Tom> This patch introduces a new "class toplev" and changes toplev_main and
> Tom> toplev_finalize to be methods of this class.  Additionally, now the
> Tom> timevars are automatically stopped when the object is destroyed.  This
> Tom> cleans up "compile" a bit and makes it simpler to reuse the toplev
> Tom> logic in other code.
> 
> David asked me off-list to rename the field in class toplev, so here's a
> new patch that does this.

Thanks!  (yes, I greatly prefer having member data of a class to have a
"m_" prefix, and for the ctor params to have equivalent names, without
the prefix, which this patch does, for "toplev").

> Tom
> 
> commit 66f92863ef55c26f673d02dd39027f340940a3bf
> Author: Tom Tromey 
> Date:   Tue Mar 18 08:07:40 2014 -0600
> 
> introduce class toplev
> 
> This patch introduces a new "class toplev" and changes toplev_main and
> toplev_finalize to be methods of this class.  Additionally, now the
> timevars are automatically stopped when the object is destroyed.  This
> cleans up "compile" a bit and makes it simpler to reuse the toplev
> logic in other code.

OK.  Are you able to push this to my branch, or do you need me to do
this?



[4.8, PATCH 11/26] Backport Power8 and LE support: gotest

2014-03-19 Thread Bill Schmidt
Hi,

This patch (diff-abi-gotest) backports enablement of the Go testsuite
for powerpc64le.

Thanks,
Bill


2014-03-29  Bill Schmidt  

Backport from mainline r205000.
2013-11-19  Ulrich Weigand  

gotest: Recognize PPC ELF v2 function pointers in text section.


Index: gcc-4_8-branch/libgo/testsuite/gotest
===
--- gcc-4_8-branch.orig/libgo/testsuite/gotest  2013-12-28 17:41:31.783625708 +0100
+++ gcc-4_8-branch/libgo/testsuite/gotest   2013-12-28 17:50:45.671367653 +0100
@@ -369,7 +369,7 @@ localname() {
 {
text="T"
case "$GOARCH" in
-   ppc64) text="D" ;;
+   ppc64) text="[TD]" ;;
esac
 
symtogo='sed -e s/_test/XXXtest/ -e s/.*_\([^_]*\.\)/\1/ -e s/XXXtest/_test/'





[4.8, PATCH 12/26] Backport Power8 and LE support: Defaults

2014-03-19 Thread Bill Schmidt
Hi,

This patch (diff-le-align) sets some miscellaneous defaults for little
endian support.

Thanks,
Bill


2014-03-29  Bill Schmidt  

Apply mainline r205060.
2013-11-20  Alan Modra  
* config/rs6000/sysv4.h (CC1_ENDIAN_LITTLE_SPEC): Define as empty.
* config/rs6000/rs6000.c (rs6000_option_override_internal): Default
to strict alignment on older processors when little-endian.
* config/rs6000/linux64.h (PROCESSOR_DEFAULT64): Default to power8
for ELFv2.
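
The strict-alignment default in the rs6000.c hunk below uses a common bitmask idiom: a flag is turned on by default only if the user did not set it explicitly. A minimal sketch, assuming made-up bit positions for the two masks and a hypothetical function name default_strict_align:

```c
#include <stdint.h>

/* Hypothetical bit assignments, for illustration only.  */
#define MASK_STRICT_ALIGN (UINT64_C (1) << 0)
#define MASK_HTM          (UINT64_C (1) << 1)

/* Default to strict alignment on little-endian targets whose tuning
   processor lacks HTM (used in the patch as a proxy for "older than
   power8"), unless the user set or cleared the flag explicitly.  */
static uint64_t
default_strict_align (uint64_t isa_flags, uint64_t isa_flags_explicit,
                      uint64_t tune_flags, int little_endian)
{
  if (little_endian && !(tune_flags & MASK_HTM))
    isa_flags |= ~isa_flags_explicit & MASK_STRICT_ALIGN;
  return isa_flags;
}
```

If the strict-align bit is present in isa_flags_explicit, the expression ~isa_flags_explicit & MASK_STRICT_ALIGN is zero, so an explicit -mno-strict-align is left untouched.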


Index: gcc-4_8-branch/gcc/config/rs6000/linux64.h
===
--- gcc-4_8-branch.orig/gcc/config/rs6000/linux64.h 2013-12-28 17:50:44.252360594 +0100
+++ gcc-4_8-branch/gcc/config/rs6000/linux64.h  2013-12-28 17:50:46.356371060 +0100
@@ -71,7 +71,11 @@ extern int dot_symbols;
 #undef  PROCESSOR_DEFAULT
 #define PROCESSOR_DEFAULT PROCESSOR_POWER7
 #undef  PROCESSOR_DEFAULT64
+#ifdef LINUX64_DEFAULT_ABI_ELFv2
+#define PROCESSOR_DEFAULT64 PROCESSOR_POWER8
+#else
 #define PROCESSOR_DEFAULT64 PROCESSOR_POWER7
+#endif
 
 /* We don't need to generate entries in .fixup, except when
-mrelocatable or -mrelocatable-lib is given.  */
Index: gcc-4_8-branch/gcc/config/rs6000/rs6000.c
===
--- gcc-4_8-branch.orig/gcc/config/rs6000/rs6000.c  2013-12-28 17:50:44.219360429 +0100
+++ gcc-4_8-branch/gcc/config/rs6000/rs6000.c   2013-12-28 17:50:46.369371125 +0100
@@ -3206,6 +3206,12 @@ rs6000_option_override_internal (bool gl
}
 }
 
+  /* If little-endian, default to -mstrict-align on older processors.
+ Testing for htm matches power8 and later.  */
+  if (!BYTES_BIG_ENDIAN
+  && !(processor_target_table[tune_index].target_enable & OPTION_MASK_HTM))
+rs6000_isa_flags |= ~rs6000_isa_flags_explicit & OPTION_MASK_STRICT_ALIGN;
+
   /* Add some warnings for VSX.  */
   if (TARGET_VSX)
 {
Index: gcc-4_8-branch/gcc/config/rs6000/sysv4.h
===
--- gcc-4_8-branch.orig/gcc/config/rs6000/sysv4.h   2013-12-28 17:50:44.243360549 +0100
+++ gcc-4_8-branch/gcc/config/rs6000/sysv4.h    2013-12-28 17:50:46.374371150 +0100
@@ -538,12 +538,7 @@ ENDIAN_SELECT(" -mbig", " -mlittle", DEF
 
 #define CC1_ENDIAN_BIG_SPEC ""
 
-#define CC1_ENDIAN_LITTLE_SPEC "\
-%{!mstrict-align: %{!mno-strict-align: \
-%{!mcall-i960-old: \
-   -mstrict-align \
-} \
-}}"
+#define CC1_ENDIAN_LITTLE_SPEC ""
 
 #define CC1_ENDIAN_DEFAULT_SPEC "%(cc1_endian_big)"
 





[4.8, PATCH 14/26] Backport Power8 and LE support: DFP absolute value

2014-03-19 Thread Bill Schmidt
Hi,

This patch (diff-dfp-abs) backports some unrelated but necessary work to
enable the DFP absolute value builtins.  Copying Jakub who was involved
with the original patch.

Thanks,
Bill


2014-03-29  Bill Schmidt  

Backport from mainline
2013-08-19  Peter Bergner  
Jakub Jelinek  

* builtins.def (BUILT_IN_FABSD32): New DFP ABS builtin.
(BUILT_IN_FABSD64): Likewise.
(BUILT_IN_FABSD128): Likewise.
* builtins.c (expand_builtin): Add support for
new DFP ABS builtins.
(fold_builtin_1): Likewise.
* config/rs6000/dfp.md
(*abstd2_fpr): Handle non-overlapping destination
and source operands.
(*nabstd2_fpr): Likewise.

2014-03-29  Bill Schmidt  

Backport from mainline
2013-08-19  Peter Bergner  

* gcc.target/powerpc/dfp-dd-2.c: New test.
* gcc.target/powerpc/dfp-td-2.c: Likewise.
* gcc.target/powerpc/dfp-td-3.c: Likewise.


Index: gcc-4_8-test/gcc/builtins.c
===
--- gcc-4_8-test.orig/gcc/builtins.c
+++ gcc-4_8-test/gcc/builtins.c
@@ -5861,6 +5861,9 @@ expand_builtin (tree exp, rtx target, rt
   switch (fcode)
 {
 CASE_FLT_FN (BUILT_IN_FABS):
+case BUILT_IN_FABSD32:
+case BUILT_IN_FABSD64:
+case BUILT_IN_FABSD128:
   target = expand_builtin_fabs (exp, target, subtarget);
   if (target)
return target;
@@ -10313,6 +10316,9 @@ fold_builtin_1 (location_t loc, tree fnd
   return fold_builtin_strlen (loc, type, arg0);
 
 CASE_FLT_FN (BUILT_IN_FABS):
+case BUILT_IN_FABSD32:
+case BUILT_IN_FABSD64:
+case BUILT_IN_FABSD128:
   return fold_builtin_fabs (loc, arg0, type);
 
 case BUILT_IN_ABS:
Index: gcc-4_8-test/gcc/builtins.def
===
--- gcc-4_8-test.orig/gcc/builtins.def
+++ gcc-4_8-test/gcc/builtins.def
@@ -252,6 +252,9 @@ DEF_C99_BUILTIN(BUILT_IN_EXPM1L,
 DEF_LIB_BUILTIN(BUILT_IN_FABS, "fabs", BT_FN_DOUBLE_DOUBLE, ATTR_CONST_NOTHROW_LEAF_LIST)
 DEF_C99_C90RES_BUILTIN (BUILT_IN_FABSF, "fabsf", BT_FN_FLOAT_FLOAT, ATTR_CONST_NOTHROW_LEAF_LIST)
 DEF_C99_C90RES_BUILTIN (BUILT_IN_FABSL, "fabsl", BT_FN_LONGDOUBLE_LONGDOUBLE, ATTR_CONST_NOTHROW_LEAF_LIST)
+DEF_GCC_BUILTIN(BUILT_IN_FABSD32, "fabsd32", BT_FN_DFLOAT32_DFLOAT32, ATTR_CONST_NOTHROW_LEAF_LIST)
+DEF_GCC_BUILTIN(BUILT_IN_FABSD64, "fabsd64", BT_FN_DFLOAT64_DFLOAT64, ATTR_CONST_NOTHROW_LEAF_LIST)
+DEF_GCC_BUILTIN(BUILT_IN_FABSD128, "fabsd128", BT_FN_DFLOAT128_DFLOAT128, ATTR_CONST_NOTHROW_LEAF_LIST)
 DEF_C99_BUILTIN(BUILT_IN_FDIM, "fdim", BT_FN_DOUBLE_DOUBLE_DOUBLE, ATTR_MATHFN_FPROUNDING_ERRNO)
 DEF_C99_BUILTIN(BUILT_IN_FDIMF, "fdimf", BT_FN_FLOAT_FLOAT_FLOAT, ATTR_MATHFN_FPROUNDING_ERRNO)
 DEF_C99_BUILTIN(BUILT_IN_FDIML, "fdiml", BT_FN_LONGDOUBLE_LONGDOUBLE_LONGDOUBLE, ATTR_MATHFN_FPROUNDING_ERRNO)
Index: gcc-4_8-test/gcc/config/rs6000/dfp.md
===
--- gcc-4_8-test.orig/gcc/config/rs6000/dfp.md
+++ gcc-4_8-test/gcc/config/rs6000/dfp.md
@@ -148,18 +148,24 @@
   "")
 
 (define_insn "*abstd2_fpr"
-  [(set (match_operand:TD 0 "gpc_reg_operand" "=d")
-   (abs:TD (match_operand:TD 1 "gpc_reg_operand" "d")))]
+  [(set (match_operand:TD 0 "gpc_reg_operand" "=d,d")
+   (abs:TD (match_operand:TD 1 "gpc_reg_operand" "0,d")))]
   "TARGET_HARD_FLOAT && TARGET_FPRS"
-  "fabs %0,%1"
-  [(set_attr "type" "fp")])
+  "@
+   fabs %0,%1
+   fabs %0,%1\;fmr %L0,%L1"
+  [(set_attr "type" "fp")
+   (set_attr "length" "4,8")])
 
 (define_insn "*nabstd2_fpr"
-  [(set (match_operand:TD 0 "gpc_reg_operand" "=d")
-   (neg:TD (abs:TD (match_operand:TD 1 "gpc_reg_operand" "d"))))]
+  [(set (match_operand:TD 0 "gpc_reg_operand" "=d,d")
+   (neg:TD (abs:TD (match_operand:TD 1 "gpc_reg_operand" "0,d"))))]
   "TARGET_HARD_FLOAT && TARGET_FPRS"
-  "fnabs %0,%1"
-  [(set_attr "type" "fp")])
+  "@
+   fnabs %0,%1
+   fnabs %0,%1\;fmr %L0,%L1"
+  [(set_attr "type" "fp")
+   (set_attr "length" "4,8")])
 
 ;; Hardware support for decimal floating point operations.
 
Index: gcc-4_8-test/gcc/testsuite/gcc.target/powerpc/dfp-dd-2.c
===
--- /dev/null
+++ gcc-4_8-test/gcc/testsuite/gcc.target/powerpc/dfp-dd-2.c
@@ -0,0 +1,26 @@
+/* Test generation of DFP instructions for POWER6.  */
+/* { dg-do compile { target { powerpc*-*-linux* && powerpc_fprs } } } */
+/* { dg-options "-std=gnu99 -O1 -mcpu=power6" } */
+
+/* { dg-final { scan-assembler-times "fneg" 1 } } */
+/* { dg-final { scan-assembler-times "fabs" 1 } } */
+/* { dg-final { scan-assembler-times "fnabs" 1 } } */
+/* { dg-final { scan-assembler-times "fmr" 0 } } */
+
+_Decimal64
+func1 (_Decimal64 a, _Decimal64 b)
+{
+  return -b;
+}
+
+_Decimal64
+func2 (_Decimal64 a, _Decimal64 b)
+{
+  return __builti

[4.8, PATCH 17/26] Backport Power8 and LE support: Direct moves

2014-03-19 Thread Bill Schmidt
Hi,

This patch (diff-direct-move) backports support for the Power8 direct
move instructions for little endian.

Thanks,
Bill


2014-03-19  Bill Schmidt  

Backport from mainline
2013-10-23  Pat Haugen  

* gcc.target/powerpc/direct-move.h: Fix header for executable tests.

Backport from mainline
2014-01-16  Michael Meissner  

PR target/59844
* config/rs6000/rs6000.md (reload_vsx_from_gprsf): Add little
endian support, remove tests for WORDS_BIG_ENDIAN.
(p8_mfvsrd_3_): Likewise.
(reload_gpr_from_vsx): Likewise.
(reload_gpr_from_vsxsf): Likewise.
(p8_mfvsrd_4_disf): Likewise.


Index: gcc-4_8-test/gcc/config/rs6000/rs6000.md
===
--- gcc-4_8-test.orig/gcc/config/rs6000/rs6000.md
+++ gcc-4_8-test/gcc/config/rs6000/rs6000.md
@@ -9438,7 +9438,7 @@
(unspec:SF [(match_operand:SF 1 "register_operand" "r")]
   UNSPEC_P8V_RELOAD_FROM_GPR))
(clobber (match_operand:DI 2 "register_operand" "=r"))]
-  "TARGET_POWERPC64 && TARGET_DIRECT_MOVE && WORDS_BIG_ENDIAN"
+  "TARGET_POWERPC64 && TARGET_DIRECT_MOVE"
   "#"
   "&& reload_completed"
   [(const_int 0)]
@@ -9465,7 +9465,7 @@
   [(set (match_operand:DF 0 "register_operand" "=r")
(unspec:DF [(match_operand:FMOVE128_GPR 1 "register_operand" "wa")]
   UNSPEC_P8V_RELOAD_FROM_VSX))]
-  "TARGET_POWERPC64 && TARGET_DIRECT_MOVE && WORDS_BIG_ENDIAN"
+  "TARGET_POWERPC64 && TARGET_DIRECT_MOVE"
   "mfvsrd %0,%x1"
   [(set_attr "type" "mftgpr")])
 
@@ -9475,7 +9475,7 @@
 [(match_operand:FMOVE128_GPR 1 "register_operand" "wa")]
 UNSPEC_P8V_RELOAD_FROM_VSX))
(clobber (match_operand:FMOVE128_GPR 2 "register_operand" "=wa"))]
-  "TARGET_POWERPC64 && TARGET_DIRECT_MOVE && WORDS_BIG_ENDIAN"
+  "TARGET_POWERPC64 && TARGET_DIRECT_MOVE"
   "#"
   "&& reload_completed"
   [(const_int 0)]
@@ -9502,7 +9502,7 @@
(unspec:SF [(match_operand:SF 1 "register_operand" "wa")]
   UNSPEC_P8V_RELOAD_FROM_VSX))
(clobber (match_operand:V4SF 2 "register_operand" "=wa"))]
-  "TARGET_POWERPC64 && TARGET_DIRECT_MOVE && WORDS_BIG_ENDIAN"
+  "TARGET_POWERPC64 && TARGET_DIRECT_MOVE"
   "#"
   "&& reload_completed"
   [(const_int 0)]
@@ -9524,7 +9524,7 @@
   [(set (match_operand:DI 0 "register_operand" "=r")
(unspec:DI [(match_operand:V4SF 1 "register_operand" "wa")]
   UNSPEC_P8V_RELOAD_FROM_VSX))]
-  "TARGET_POWERPC64 && TARGET_DIRECT_MOVE && WORDS_BIG_ENDIAN"
+  "TARGET_POWERPC64 && TARGET_DIRECT_MOVE"
   "mfvsrd %0,%x1"
   [(set_attr "type" "mftgpr")])
 
Index: gcc-4_8-test/gcc/testsuite/gcc.target/powerpc/direct-move.h
===
--- gcc-4_8-test.orig/gcc/testsuite/gcc.target/powerpc/direct-move.h
+++ gcc-4_8-test/gcc/testsuite/gcc.target/powerpc/direct-move.h
@@ -1,5 +1,7 @@
 /* Test functions for direct move support.  */
 
+#include 
+extern void abort (void);
 
 #ifndef VSX_REG_ATTR
 #define VSX_REG_ATTR "wa"
@@ -111,7 +113,7 @@ const struct test_struct test_functions[
 void __attribute__((__noinline__))
 test_value (TYPE a)
 {
-  size_t i;
+  long i;
 
   for (i = 0; i < sizeof (test_functions) / sizeof (test_functions[0]); i++)
 {
@@ -127,8 +129,7 @@ test_value (TYPE a)
 int
 main (void)
 {
-  size_t i;
-  long j;
+  long i,j;
   union {
 TYPE value;
 unsigned char bytes[sizeof (TYPE)];





[4.8, PATCH 6/26] Backport Power8 and LE support: TDmode for LE

2014-03-19 Thread Bill Schmidt
Hi,

This patch (diff-le-dfp) backports fixes for TDmode on a little endian
target.

Thanks,
Bill


2014-03-19  Bill Schmidt  

Backport from mainline r205123:

2013-11-20  Ulrich Weigand  

* config/rs6000/rs6000.c (rs6000_cannot_change_mode_class): Do not
allow subregs of TDmode in FPRs of smaller size in little-endian.
(rs6000_split_multireg_move): When splitting an access to TDmode
in FPRs, do not use simplify_gen_subreg.

Backport from mainline r204927:

2013-11-17  Ulrich Weigand  

* config/rs6000/rs6000.c (rs6000_emit_move): Use low word of
sdmode_stack_slot also in little-endian mode.
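
The register-pair rule behind the rs6000_split_multireg_move change can be sketched as follows. This is an illustration under stated assumptions rather than the backported code: the helper name fpr_for_word is hypothetical, and word index I counts words in subreg (memory) order, so word 0 is the least significant word on little-endian.

```c
/* Hypothetical helper: which hard FPR of an NREGS-register group holds
   word I of a TDmode value.  The ISA keeps the most significant word in
   the lowest-numbered register regardless of endianness, so on
   little-endian the index is mirrored, as in
   REGNO (src) + nregs - 1 - i in the patch.  */
static int
fpr_for_word (int first_regno, int nregs, int i, int big_endian)
{
  return big_endian ? first_regno + i : first_regno + nregs - 1 - i;
}
```

For a pair starting at register 32, little-endian word 0 (least significant) lands in register 33 and word 1 in register 32, whereas big-endian word 0 (most significant) stays in register 32.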


Index: gcc-4_8-test/gcc/config/rs6000/rs6000.c
===
--- gcc-4_8-test.orig/gcc/config/rs6000/rs6000.c
+++ gcc-4_8-test/gcc/config/rs6000/rs6000.c
@@ -7963,7 +7963,9 @@ rs6000_emit_move (rtx dest, rtx source,
}
   else if (INT_REGNO_P (REGNO (operands[1])))
{
- rtx mem = adjust_address_nv (operands[0], mode, 4);
+ rtx mem = operands[0];
+ if (BYTES_BIG_ENDIAN)
+   mem = adjust_address_nv (mem, mode, 4);
  mem = eliminate_regs (mem, VOIDmode, NULL_RTX);
  emit_insn (gen_movsd_hardfloat (mem, operands[1]));
}
@@ -7986,7 +7988,9 @@ rs6000_emit_move (rtx dest, rtx source,
}
   else if (INT_REGNO_P (REGNO (operands[0])))
{
- rtx mem = adjust_address_nv (operands[1], mode, 4);
+ rtx mem = operands[1];
+ if (BYTES_BIG_ENDIAN)
+   mem = adjust_address_nv (mem, mode, 4);
  mem = eliminate_regs (mem, VOIDmode, NULL_RTX);
  emit_insn (gen_movsd_hardfloat (operands[0], mem));
}
@@ -16082,6 +16086,13 @@ rs6000_cannot_change_mode_class (enum ma
  if (TARGET_IEEEQUAD && (to == TFmode || from == TFmode))
return true;
 
+ /* TDmode in floating-mode registers must always go into a register
+pair with the most significant word in the even-numbered register
+to match ISA requirements.  In little-endian mode, this does not
+match subreg numbering, so we cannot allow subregs.  */
+ if (!BYTES_BIG_ENDIAN && (to == TDmode || from == TDmode))
+   return true;
+
  if (from_size < 8 || to_size < 8)
return true;
 
@@ -19028,6 +19039,39 @@ rs6000_split_multireg_move (rtx dst, rtx
 
   gcc_assert (reg_mode_size * nregs == GET_MODE_SIZE (mode));
 
+  /* TDmode residing in FP registers is special, since the ISA requires that
+ the lower-numbered word of a register pair is always the most significant
+ word, even in little-endian mode.  This does not match the usual subreg
+ semantics, so we cannnot use simplify_gen_subreg in those cases.  Access
+ the appropriate constituent registers "by hand" in little-endian mode.
+
+ Note we do not need to check for destructive overlap here since TDmode
+ can only reside in even/odd register pairs.  */
+  if (FP_REGNO_P (reg) && DECIMAL_FLOAT_MODE_P (mode) && !BYTES_BIG_ENDIAN)
+{
+  rtx p_src, p_dst;
+  int i;
+
+  for (i = 0; i < nregs; i++)
+   {
+ if (REG_P (src) && FP_REGNO_P (REGNO (src)))
+   p_src = gen_rtx_REG (reg_mode, REGNO (src) + nregs - 1 - i);
+ else
+   p_src = simplify_gen_subreg (reg_mode, src, mode,
+i * reg_mode_size);
+
+ if (REG_P (dst) && FP_REGNO_P (REGNO (dst)))
+   p_dst = gen_rtx_REG (reg_mode, REGNO (dst) + nregs - 1 - i);
+ else
+   p_dst = simplify_gen_subreg (reg_mode, dst, mode,
+i * reg_mode_size);
+
+ emit_insn (gen_rtx_SET (VOIDmode, p_dst, p_src));
+   }
+
+  return;
+}
+
   if (REG_P (src) && REG_P (dst) && (REGNO (src) < REGNO (dst)))
 {
   /* Move register range backwards, if we might have destructive





[4.8, PATCH 4/26] Backport Power8 and LE support: Libtool and configure bits 2

2014-03-19 Thread Bill Schmidt
Hi,

This patch (diff-le-libtool) backports changes to use a libtool.m4 that
supports powerpc64le-*linux*.

Thanks,
Bill


2014-03-19  Bill Schmidt  

Backport from mainline
2013-11-22  Ulrich Weigand  

* libgo/config/libtool.m4: Update to mainline version.
* libgo/configure: Regenerate.

2013-11-17  Ulrich Weigand  

* libgo/config/libtool.m4: Update to mainline version.
* libgo/configure: Regenerate.

2013-11-15  Ulrich Weigand  

* libtool.m4: Update to mainline version.
* libjava/libltdl/acinclude.m4: Likewise.

* gcc/configure: Regenerate.
* boehm-gc/configure: Regenerate.
* libatomic/configure: Regenerate.
* libbacktrace/configure: Regenerate.
* libffi/configure: Regenerate.
* libgfortran/configure: Regenerate.
* libgomp/configure: Regenerate.
* libitm/configure: Regenerate.
* libjava/configure: Regenerate.
* libjava/libltdl/configure: Regenerate.
* libjava/classpath/configure: Regenerate.
* libmudflap/configure: Regenerate.
* libobjc/configure: Regenerate.
* libquadmath/configure: Regenerate.
* libsanitizer/configure: Regenerate.
* libssp/configure: Regenerate.
* libstdc++-v3/configure: Regenerate.
* lto-plugin/configure: Regenerate.
* zlib/configure: Regenerate.

Backport from mainline
2013-09-20  Alan Modra  

* libtool.m4 (_LT_ENABLE_LOCK ): Remove non-canonical
ppc host match.  Support little-endian powerpc linux hosts.
* configure: Regenerate.


Index: gcc-4_8-branch/gcc/configure
===
--- gcc-4_8-branch.orig/gcc/configure   2013-12-28 17:41:32.733630408 +0100
+++ gcc-4_8-branch/gcc/configure    2013-12-28 17:50:38.646332701 +0100
@@ -13589,7 +13589,7 @@ ia64-*-hpux*)
   rm -rf conftest*
   ;;
 
-x86_64-*kfreebsd*-gnu|x86_64-*linux*|ppc*-*linux*|powerpc*-*linux*| \
+x86_64-*kfreebsd*-gnu|x86_64-*linux*|powerpc*-*linux*| \
 s390*-*linux*|s390*-*tpf*|sparc*-*linux*)
   # Find out which ABI we are using.
   echo 'int i;' > conftest.$ac_ext
@@ -13614,7 +13614,10 @@ s390*-*linux*|s390*-*tpf*|sparc*-*linux*
;;
esac
;;
- ppc64-*linux*|powerpc64-*linux*)
+ powerpc64le-*linux*)
+   LD="${LD-ld} -m elf32lppclinux"
+   ;;
+ powerpc64-*linux*)
LD="${LD-ld} -m elf32ppclinux"
;;
  s390x-*linux*)
@@ -13633,7 +13636,10 @@ s390*-*linux*|s390*-*tpf*|sparc*-*linux*
  x86_64-*linux*)
LD="${LD-ld} -m elf_x86_64"
;;
- ppc*-*linux*|powerpc*-*linux*)
+ powerpcle-*linux*)
+   LD="${LD-ld} -m elf64lppc"
+   ;;
+ powerpc-*linux*)
LD="${LD-ld} -m elf64ppc"
;;
  s390*-*linux*|s390*-*tpf*)
@@ -17827,7 +17833,7 @@ else
   lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
   lt_status=$lt_dlunknown
   cat > conftest.$ac_ext <<_LT_EOF
-#line 17830 "configure"
+#line 17836 "configure"
 #include "confdefs.h"
 
 #if HAVE_DLFCN_H
@@ -17933,7 +17939,7 @@ else
   lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
   lt_status=$lt_dlunknown
   cat > conftest.$ac_ext <<_LT_EOF
-#line 17936 "configure"
+#line 17942 "configure"
 #include "confdefs.h"
 
 #if HAVE_DLFCN_H
Index: gcc-4_8-branch/libtool.m4
===
--- gcc-4_8-branch.orig/libtool.m4  2013-12-28 17:41:32.728630383 +0100
+++ gcc-4_8-branch/libtool.m4   2013-12-28 17:50:38.652332731 +0100
@@ -1220,7 +1220,7 @@ ia64-*-hpux*)
   rm -rf conftest*
   ;;
 
-x86_64-*kfreebsd*-gnu|x86_64-*linux*|ppc*-*linux*|powerpc*-*linux*| \
+x86_64-*kfreebsd*-gnu|x86_64-*linux*|powerpc*-*linux*| \
 s390*-*linux*|s390*-*tpf*|sparc*-*linux*)
   # Find out which ABI we are using.
   echo 'int i;' > conftest.$ac_ext
@@ -1241,7 +1241,10 @@ s390*-*linux*|s390*-*tpf*|sparc*-*linux*
;;
esac
;;
- ppc64-*linux*|powerpc64-*linux*)
+ powerpc64le-*linux*)
+   LD="${LD-ld} -m elf32lppclinux"
+   ;;
+ powerpc64-*linux*)
LD="${LD-ld} -m elf32ppclinux"
;;
  s390x-*linux*)
@@ -1260,7 +1263,10 @@ s390*-*linux*|s390*-*tpf*|sparc*-*linux*
  x86_64-*linux*)
LD="${LD-ld} -m elf_x86_64"
;;
- ppc*-*linux*|powerpc*-*linux*)
+ powerpcle-*linux*)
+   LD="${LD-ld} -m elf64lppc"
+   ;;
+ powerpc-*linux*)
LD="${LD-ld} -m elf64ppc"
;;
  s390*-*linux*|s390*-*tpf*)
Index: gcc-4_8-branch/boehm-gc/configure
===
--- gcc-4_8-branch.orig/boehm-gc/configure  2013-12-28 17:41:32.733630408 +0100
+++ gcc-4_8-branch/boehm-gc/configure   2013-12-28 

[4.8, PATCH 16/26] Backport Power8 and LE support: PR56843

2014-03-19 Thread Bill Schmidt
Hi,

This patch (diff-pr56843) backports the fix for PR56843.

Thanks,
Bill


[gcc]

2014-03-19  Bill Schmidt  

Backport from mainline
2013-04-05  Bill Schmidt  

PR target/56843
* config/rs6000/rs6000.c (rs6000_emit_swdiv_high_precision): Remove.
(rs6000_emit_swdiv_low_precision): Remove.
(rs6000_emit_swdiv): Rewrite to handle between one and four
iterations of Newton-Raphson generally; modify required number of
iterations for some cases.
* config/rs6000/rs6000.h (RS6000_RECIP_HIGH_PRECISION_P): Remove.

[gcc/testsuite]

2014-03-19  Bill Schmidt  

Backport from mainline
2013-04-05  Bill Schmidt  

PR target/56843
* gcc.target/powerpc/recip-1.c: Modify expected output.
* gcc.target/powerpc/recip-3.c: Likewise.
* gcc.target/powerpc/recip-4.c: Likewise.
* gcc.target/powerpc/recip-5.c: Add expected output for iterations.
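
The pass counts in the rewritten rs6000_emit_swdiv follow from each Newton-Raphson pass at least doubling the number of accurate bits: starting from 5 bits, three passes reach 40 bits (enough for the 23 bits of SFmode) and four reach 80 (enough for the 52 bits of DFmode); starting from 14 bits, one and two passes suffice. Below is a plain C sketch of that scheme in ordinary double arithmetic. The names swdiv_passes and swdiv are hypothetical, the initial estimate X0 is supplied by the caller instead of a hardware estimate instruction, and folding the multiplication by N into the last pass is an assumption of this sketch.

```c
/* Number of passes, following the comment in the patch: 3 for a low
   precision estimate, 1 for a high precision one, plus one more for
   double precision.  */
static int
swdiv_passes (int high_precision_estimate, int is_double)
{
  int passes = high_precision_estimate ? 1 : 3;
  if (is_double)
    passes++;
  return passes;
}

/* Approximate N / D given an initial estimate X0 of 1 / D.  All passes
   but the last refine the reciprocal; the last pass refines the
   quotient itself.  Assumes finite arguments and D != 0.  */
static double
swdiv (double n, double d, double x0, int passes)
{
  double x = x0;
  for (int i = 0; i < passes - 1; i++)
    {
      double e = 1.0 - d * x;   /* relative error of the estimate */
      x = x + e * x;            /* x_(i+1) = x_i * (2 - d * x_i) */
    }

  double u = n * x;             /* first quotient estimate */
  double v = n - d * u;         /* residual */
  return u + v * x;             /* corrected quotient */
}
```

As a usage example, swdiv (7.0, 3.0, 0.34, swdiv_passes (0, 1)) starts from an estimate of 1/3 that is off by about 2% and, after four passes, agrees with 7.0 / 3.0 to within rounding error.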


Index: gcc-4_8-test/gcc/config/rs6000/rs6000.c
===
--- gcc-4_8-test.orig/gcc/config/rs6000/rs6000.c
+++ gcc-4_8-test/gcc/config/rs6000/rs6000.c
@@ -29417,54 +29417,26 @@ rs6000_emit_nmsub (rtx dst, rtx m1, rtx
   emit_insn (gen_rtx_SET (VOIDmode, dst, r));
 }
 
-/* Newton-Raphson approximation of floating point divide with just 2 passes
-   (either single precision floating point, or newer machines with higher
-   accuracy estimates).  Support both scalar and vector divide.  Assumes no
-   trapping math and finite arguments.  */
+/* Newton-Raphson approximation of floating point divide DST = N/D.  If NOTE_P,
+   add a reg_note saying that this was a division.  Support both scalar and
+   vector divide.  Assumes no trapping math and finite arguments.  */
 
-static void
-rs6000_emit_swdiv_high_precision (rtx dst, rtx n, rtx d)
+void
+rs6000_emit_swdiv (rtx dst, rtx n, rtx d, bool note_p)
 {
   enum machine_mode mode = GET_MODE (dst);
-  rtx x0, e0, e1, y1, u0, v0;
-  enum insn_code code = optab_handler (smul_optab, mode);
-  insn_gen_fn gen_mul = GEN_FCN (code);
-  rtx one = rs6000_load_constant_and_splat (mode, dconst1);
-
-  gcc_assert (code != CODE_FOR_nothing);
-
-  /* x0 = 1./d estimate */
-  x0 = gen_reg_rtx (mode);
-  emit_insn (gen_rtx_SET (VOIDmode, x0,
- gen_rtx_UNSPEC (mode, gen_rtvec (1, d),
- UNSPEC_FRES)));
-
-  e0 = gen_reg_rtx (mode);
-  rs6000_emit_nmsub (e0, d, x0, one);  /* e0 = 1. - (d * x0) */
-
-  e1 = gen_reg_rtx (mode);
-  rs6000_emit_madd (e1, e0, e0, e0);   /* e1 = (e0 * e0) + e0 */
-
-  y1 = gen_reg_rtx (mode);
-  rs6000_emit_madd (y1, e1, x0, x0);   /* y1 = (e1 * x0) + x0 */
-
-  u0 = gen_reg_rtx (mode);
-  emit_insn (gen_mul (u0, n, y1)); /* u0 = n * y1 */
-
-  v0 = gen_reg_rtx (mode);
-  rs6000_emit_nmsub (v0, d, u0, n);/* v0 = n - (d * u0) */
-
-  rs6000_emit_madd (dst, v0, y1, u0);  /* dst = (v0 * y1) + u0 */
-}
+  rtx one, x0, e0, x1, xprev, eprev, xnext, enext, u, v;
+  int i;
 
-/* Newton-Raphson approximation of floating point divide that has a low
-   precision estimate.  Assumes no trapping math and finite arguments.  */
+  /* Low precision estimates guarantee 5 bits of accuracy.  High
+ precision estimates guarantee 14 bits of accuracy.  SFmode
+ requires 23 bits of accuracy.  DFmode requires 52 bits of
+ accuracy.  Each pass at least doubles the accuracy, leading
+ to the following.  */
+  int passes = (TARGET_RECIP_PRECISION) ? 1 : 3;
+  if (mode == DFmode || mode == V2DFmode)
+passes++;
 
-static void
-rs6000_emit_swdiv_low_precision (rtx dst, rtx n, rtx d)
-{
-  enum machine_mode mode = GET_MODE (dst);
-  rtx x0, e0, e1, e2, y1, y2, y3, u0, v0, one;
   enum insn_code code = optab_handler (smul_optab, mode);
   insn_gen_fn gen_mul = GEN_FCN (code);
 
@@ -29478,46 +29450,44 @@ rs6000_emit_swdiv_low_precision (rtx dst
  gen_rtx_UNSPEC (mode, gen_rtvec (1, d),
  UNSPEC_FRES)));
 
-  e0 = gen_reg_rtx (mode);
-  rs6000_emit_nmsub (e0, d, x0, one);  /* e0 = 1. - d * x0 */
-
-  y1 = gen_reg_rtx (mode);
-  rs6000_emit_madd (y1, e0, x0, x0);   /* y1 = x0 + e0 * x0 */
-
-  e1 = gen_reg_rtx (mode);
-  emit_insn (gen_mul (e1, e0, e0));/* e1 = e0 * e0 */
-
-  y2 = gen_reg_rtx (mode);
-  rs6000_emit_madd (y2, e1, y1, y1);   /* y2 = y1 + e1 * y1 */
-
-  e2 = gen_reg_rtx (mode);
-  emit_insn (gen_mul (e2, e1, e1));/* e2 = e1 * e1 */
-
-  y3 = gen_reg_rtx (mode);
-  rs6000_emit_madd (y3, e2, y2, y2);   /* y3 = y2 + e2 * y2 */
-
-  u0 = gen_reg_rtx (mode);
-  emit_insn (gen_mul (u0, n, y3)); /* u0 = n * y3 */
-
-  v0 = gen_reg_rtx (mode);
-  rs6000_emit_nmsub (v0, d, u0, n);/* v0 = n - d * u0 */
-
-  rs6000_emit_madd (dst, v0, y3, u0);  /* dst = u0 + v0 * y3 */
-}
+  /* Each iteration but the

[4.8, PATCH 18/26] Backport Power8 and LE support: Configure bits 2

2014-03-19 Thread Bill Schmidt
Hi,

This patch (diff-le-config-2) backports more configure changes,
particularly for multilib/multiarch targeting powerpc64le.

Thanks,
Bill


2014-03-19  Bill Schmidt  

Apply mainline r202190, powerpc64le multilibs and multiarch dir
2013-09-03  Alan Modra  

* config.gcc (powerpc*-*-linux*): Add support for little-endian
multilibs to big-endian target and vice versa.
* config/rs6000/t-linux64: Use := assignment on all vars.
(MULTILIB_EXTRA_OPTS): Remove fPIC.
(MULTILIB_OSDIRNAMES): Specify using mapping from multilib_options.
* config/rs6000/t-linux64le: New file.
* config/rs6000/t-linux64bele: New file.
* config/rs6000/t-linux64lebe: New file.


Index: gcc-4_8-test/gcc/config.gcc
===
--- gcc-4_8-test.orig/gcc/config.gcc
+++ gcc-4_8-test/gcc/config.gcc
@@ -2081,7 +2081,7 @@ powerpc*-*-linux*)
tmake_file="rs6000/t-fprules rs6000/t-ppcos ${tmake_file} rs6000/t-ppccomm"
case ${target} in
powerpc*le-*-*)
-   tm_file="${tm_file} rs6000/sysv4le.h" ;;
+   tm_file="${tm_file} rs6000/sysv4le.h" ;;
esac
maybe_biarch=yes
case ${target} in
@@ -2104,6 +2104,19 @@ powerpc*-*-linux*)
fi
tm_file="rs6000/biarch64.h ${tm_file} rs6000/linux64.h glibc-stdint.h"
tmake_file="$tmake_file rs6000/t-linux64"
+   case ${target} in
+   powerpc*le-*-*)
+   tmake_file="$tmake_file rs6000/t-linux64le"
+   case ${enable_targets} in
+   all | *powerpc64-* | *powerpc-*)
+   tmake_file="$tmake_file rs6000/t-linux64lebe" ;;
+   esac ;;
+   *)
+   case ${enable_targets} in
+   all | *powerpc64le-* | *powerpcle-*)
+   tmake_file="$tmake_file rs6000/t-linux64bele" ;;
+   esac ;;
+   esac
extra_options="${extra_options} rs6000/linux64.opt"
;;
*)
Index: gcc-4_8-test/gcc/config/rs6000/t-linux64
===
--- gcc-4_8-test.orig/gcc/config/rs6000/t-linux64
+++ gcc-4_8-test/gcc/config/rs6000/t-linux64
@@ -25,8 +25,8 @@
 # it doesn't tell anything about the 32bit libraries on those systems.  Set
 # MULTILIB_OSDIRNAMES according to what is found on the target.
 
-MULTILIB_OPTIONS= m64/m32
-MULTILIB_DIRNAMES   = 64 32
-MULTILIB_EXTRA_OPTS = fPIC
-MULTILIB_OSDIRNAMES= ../lib64$(call if_multiarch,:powerpc64-linux-gnu)
-MULTILIB_OSDIRNAMES+= $(if $(wildcard $(shell echo $(SYSTEM_HEADER_DIR))/../../usr/lib32),../lib32,../lib)$(call if_multiarch,:powerpc-linux-gnu)
+MULTILIB_OPTIONS:= m64/m32
+MULTILIB_DIRNAMES   := 64 32
+MULTILIB_EXTRA_OPTS := 
+MULTILIB_OSDIRNAMES := m64=../lib64$(call if_multiarch,:powerpc64-linux-gnu)
+MULTILIB_OSDIRNAMES += m32=$(if $(wildcard $(shell echo $(SYSTEM_HEADER_DIR))/../../usr/lib32),../lib32,../lib)$(call if_multiarch,:powerpc-linux-gnu)
Index: gcc-4_8-test/gcc/config/rs6000/t-linux64bele
===
--- /dev/null
+++ gcc-4_8-test/gcc/config/rs6000/t-linux64bele
@@ -0,0 +1,7 @@
+#rs6000/t-linux64end
+
+MULTILIB_OPTIONS+= mlittle
+MULTILIB_DIRNAMES   += le
+MULTILIB_OSDIRNAMES += $(subst =,.mlittle=,$(subst lible32,lib32le,$(subst lible64,lib64le,$(subst lib,lible,$(subst -linux,le-linux,$(MULTILIB_OSDIRNAMES))))))
+MULTILIB_OSDIRNAMES += $(subst $(if $(findstring 64,$(target)),m64,m32).,,$(filter $(if $(findstring 64,$(target)),m64,m32).mlittle%,$(MULTILIB_OSDIRNAMES)))
+MULTILIB_MATCHES:= ${MULTILIB_MATCHES_ENDIAN}
Index: gcc-4_8-test/gcc/config/rs6000/t-linux64le
===
--- /dev/null
+++ gcc-4_8-test/gcc/config/rs6000/t-linux64le
@@ -0,0 +1,3 @@
+#rs6000/t-linux64le
+
+MULTILIB_OSDIRNAMES := $(subst -linux,le-linux,$(MULTILIB_OSDIRNAMES))
Index: gcc-4_8-test/gcc/config/rs6000/t-linux64lebe
===
--- /dev/null
+++ gcc-4_8-test/gcc/config/rs6000/t-linux64lebe
@@ -0,0 +1,7 @@
+#rs6000/t-linux64leend
+
+MULTILIB_OPTIONS+= mbig
+MULTILIB_DIRNAMES   += be
+MULTILIB_OSDIRNAMES += $(subst =,.mbig=,$(subst libbe32,lib32be,$(subst libbe64,lib64be,$(subst lib,libbe,$(subst le-linux,-linux,$(MULTILIB_OSDIRNAMES))))))
+MULTILIB_OSDIRNAMES += $(subst $(if $(findstring 64,$(target)),m64,m32).,,$(filter $(if $(findstring 64,$(target)),m64,m32).mbig%,$(MULTILIB_OSDIRNAMES)))
+MULTILIB_MATCHES:= ${MULTILIB_MATCHES_ENDIAN}
Index: gcc-4_8-test/libsanitizer/configure.tgt
===
--- gcc-4_8-test.orig/libsanitizer/configure.tgt
+++ gcc-

[4.8, PATCH 15/26] Backport Power8 and LE support: PR54537

2014-03-19 Thread Bill Schmidt
Hi,

This patch (diff-pr54537) backports a fix for PR54537 which is unrelated
but necessary.  Copying Richard and Jakub for the common code.

Thanks,
Bill


[libstdc++-v3]

2014-03-29  Bill Schmidt  

Backport from mainline
2013-08-01  Fabien Chêne  

PR c++/54537
* include/tr1/cmath: Remove pow(double,double) overload, remove a
duplicated comment about DR 550. Add a comment to explain the issue.
* testsuite/tr1/8_c_compatibility/cmath/pow_cmath.cc: New.

[gcc/cp]

2014-03-29  Bill Schmidt  

Back port from mainline
2013-08-01  Fabien Chêne  

PR c++/54537
* cp-tree.h: Check OVL_USED with OVERLOAD_CHECK.
* name-lookup.c (do_nonmember_using_decl): Make sure we have an
OVERLOAD before calling OVL_USED. Call diagnose_name_conflict
instead of issuing an error without mentioning the conflicting
declaration.

[gcc/testsuite]

2014-03-29  Bill Schmidt  

Back port from mainline
2013-08-01  Fabien Chêne  
Peter Bergner  

PR c++/54537
* g++.dg/overload/using3.C: New.
* g++.dg/overload/using2.C: Adjust.
* g++.dg/lookup/using9.C: Likewise.
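
The cp-tree.h and name-lookup.c changes go together: once OVL_USED checks its argument with OVERLOAD_CHECK, callers must test the tree code first. The toy model below illustrates that pairing. All `toy_` names are hypothetical, and the checked accessor throws an exception here, whereas a tree-checking build of GCC would report an internal compiler error.

```cpp
#include <cassert>
#include <stdexcept>

enum toy_code { TOY_FUNCTION_DECL, TOY_OVERLOAD };

struct toy_node {
    toy_code code;
    bool used_flag;  // plays the role of TREE_USED
};

// Checked accessor: refuses nodes that are not overloads, in the spirit
// of OVERLOAD_CHECK.
inline toy_node& toy_overload_check(toy_node& node) {
    if (node.code != TOY_OVERLOAD)
        throw std::logic_error("tree check: expected OVERLOAD");
    return node;
}

// Analogue of the new OVL_USED: only valid on overload nodes.
inline bool toy_ovl_used(toy_node& node) {
    return toy_overload_check(node).used_flag;
}

// Analogue of the guarded test added to do_nonmember_using_decl: check
// the node kind first so the accessor is never reached for other nodes.
inline bool toy_is_using_decl(toy_node& node) {
    return node.code == TOY_OVERLOAD && toy_ovl_used(node);
}
```

With the guard in place, a plain function declaration whose used flag happens to be set is no longer mistaken for a using declaration, which is the confusion the unchecked macro allowed.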


Index: gcc-4_8-test/gcc/cp/cp-tree.h
===
--- gcc-4_8-test.orig/gcc/cp/cp-tree.h
+++ gcc-4_8-test/gcc/cp/cp-tree.h
@@ -331,7 +331,7 @@ typedef struct ptrmem_cst * ptrmem_cst_t
 /* If set, this was imported in a using declaration.
This is not to confuse with being used somewhere, which
is not important for this node.  */
-#define OVL_USED(NODE) TREE_USED (NODE)
+#define OVL_USED(NODE) TREE_USED (OVERLOAD_CHECK (NODE))
 /* If set, this OVERLOAD was created for argument-dependent lookup
and can be freed afterward.  */
 #define OVL_ARG_DEPENDENT(NODE) TREE_LANG_FLAG_0 (OVERLOAD_CHECK (NODE))
Index: gcc-4_8-test/gcc/cp/name-lookup.c
===
--- gcc-4_8-test.orig/gcc/cp/name-lookup.c
+++ gcc-4_8-test/gcc/cp/name-lookup.c
@@ -2286,8 +2286,7 @@ push_overloaded_decl_1 (tree decl, int f
  && compparms (TYPE_ARG_TYPES (TREE_TYPE (fn)),
TYPE_ARG_TYPES (TREE_TYPE (decl)))
  && ! decls_match (fn, decl))
-   error ("%q#D conflicts with previous using declaration %q#D",
-  decl, fn);
+   diagnose_name_conflict (decl, fn);
 
  dup = duplicate_decls (decl, fn, is_friend);
  /* If DECL was a redeclaration of FN -- even an invalid
@@ -2519,7 +2518,7 @@ do_nonmember_using_decl (tree scope, tre
  if (new_fn == old_fn)
/* The function already exists in the current namespace.  */
break;
- else if (OVL_USED (tmp1))
+ else if (TREE_CODE (tmp1) == OVERLOAD && OVL_USED (tmp1))
continue; /* this is a using decl */
  else if (compparms (TYPE_ARG_TYPES (TREE_TYPE (new_fn)),
                          TYPE_ARG_TYPES (TREE_TYPE (old_fn))))
@@ -2534,7 +2533,7 @@ do_nonmember_using_decl (tree scope, tre
break;
  else
{
- error ("%qD is already declared in this scope", name);
+ diagnose_name_conflict (new_fn, old_fn);
  break;
}
}
Index: gcc-4_8-test/gcc/testsuite/g++.dg/lookup/using9.C
===
--- gcc-4_8-test.orig/gcc/testsuite/g++.dg/lookup/using9.C
+++ gcc-4_8-test/gcc/testsuite/g++.dg/lookup/using9.C
@@ -21,11 +21,11 @@ void h()
   f('h');
   f(1); // { dg-error "ambiguous" }
   // { dg-message "candidate" "candidate note" { target *-*-* } 22 }
-  void f(int);  // { dg-error "previous using declaration" }
+  void f(int);  // { dg-error "previous declaration" }
 }
 
 void m()
 {
   void f(int);
-  using B::f;   // { dg-error "already declared" }
+  using B::f;   // { dg-error "previous declaration" }
 }
Index: gcc-4_8-test/gcc/testsuite/g++.dg/overload/using2.C
===
--- gcc-4_8-test.orig/gcc/testsuite/g++.dg/overload/using2.C
+++ gcc-4_8-test/gcc/testsuite/g++.dg/overload/using2.C
@@ -45,7 +45,7 @@ using std::C1;
   extern "C" void exit (int) throw ();
   extern "C" void *malloc (__SIZE_TYPE__) throw () __attribute__((malloc));
 
-  void abort (void) throw ();
+  void abort (void) throw (); // { dg-message "previous" }
   void _exit (int) throw (); // { dg-error "conflicts" "conflicts" }
  // { dg-message "void _exit" "_exit" { target *-*-* } 49 }
 
@@ -54,14 +54,14 @@ using std::C1;
 // { dg-message "void C1" "C1" { target *-*-* } 53 }
 
   extern "C" void c2 

[4.8, PATCH 20/26] Backport Power8 and LE support: LRA

2014-03-19 Thread Bill Schmidt
Hi,

This patch (diff-lra) backports the changes to enable -mlra for the
PowerPC back end.

Thanks,
Bill


2014-03-19  Bill Schmidt  

Backport from mainline
2014-02-04  Michael Meissner  

* config/rs6000/rs6000.opt (-mlra): Add switch to enable the LRA
register allocator.

* config/rs6000/rs6000.c (TARGET_LRA_P): Add support for -mlra to
enable the LRA register allocator.  Back port the changes from the
trunk to enable LRA.
(rs6000_legitimate_offset_address_p): Likewise.
(legitimate_lo_sum_address_p): Likewise.
(use_toc_relative_ref): Likewise.
(rs6000_legitimate_address_p): Likewise.
(rs6000_emit_move): Likewise.
(rs6000_secondary_memory_needed_mode): Likewise.
(rs6000_alloc_sdmode_stack_slot): Likewise.
(rs6000_lra_p): Likewise.

* config/rs6000/sync.md (load_lockedti): Copy TI/PTI variables by
64-bit parts to force the register allocator to allocate even/odd
register pairs for the quad word atomic instructions.
(store_conditionalti): Likewise.


Index: gcc-4_8-test/gcc/config/rs6000/rs6000.c
===
--- gcc-4_8-test.orig/gcc/config/rs6000/rs6000.c
+++ gcc-4_8-test/gcc/config/rs6000/rs6000.c
@@ -1,5 +1,5 @@
 /* Subroutines used for code generation on IBM RS/6000.
-   Copyright (C) 1991-2013 Free Software Foundation, Inc.
+   Copyright (C) 1991-2014 Free Software Foundation, Inc.
Contributed by Richard Kenner (ken...@vlsi1.ultra.nyu.edu)
 
This file is part of GCC.
@@ -56,6 +56,7 @@
 #include "intl.h"
 #include "params.h"
 #include "tm-constrs.h"
+#include "ira.h"
 #include "opts.h"
 #include "tree-vectorizer.h"
 #include "dumpfile.h"
@@ -1563,6 +1564,9 @@ static const struct attribute_spec rs600
 #undef TARGET_MODE_DEPENDENT_ADDRESS_P
 #define TARGET_MODE_DEPENDENT_ADDRESS_P rs6000_mode_dependent_address_p
 
+#undef TARGET_LRA_P
+#define TARGET_LRA_P rs6000_lra_p
+
 #undef TARGET_CAN_ELIMINATE
 #define TARGET_CAN_ELIMINATE rs6000_can_eliminate
 
@@ -6242,7 +6246,7 @@ rs6000_legitimate_offset_address_p (enum
 return false;
   if (!reg_offset_addressing_ok_p (mode))
 return virtual_stack_registers_memory_p (x);
-  if (legitimate_constant_pool_address_p (x, mode, strict))
+  if (legitimate_constant_pool_address_p (x, mode, strict || lra_in_progress))
 return true;
   if (GET_CODE (XEXP (x, 1)) != CONST_INT)
 return false;
@@ -6383,9 +6387,21 @@ legitimate_lo_sum_address_p (enum machin
 
   if (TARGET_ELF || TARGET_MACHO)
 {
+  bool large_toc_ok;
+
   if (DEFAULT_ABI == ABI_V4 && flag_pic)
return false;
-  if (TARGET_TOC)
+  /* LRA don't use LEGITIMIZE_RELOAD_ADDRESS as it usually calls
+push_reload from reload pass code.  LEGITIMIZE_RELOAD_ADDRESS
+recognizes some LO_SUM addresses as valid although this
+function says opposite.  In most cases, LRA through different
+transformations can generate correct code for address reloads.
+It can not manage only some LO_SUM cases.  So we need to add
+code analogous to one in rs6000_legitimize_reload_address for
+LOW_SUM here saying that some addresses are still valid.  */
+  large_toc_ok = (lra_in_progress && TARGET_CMODEL != CMODEL_SMALL
+ && small_toc_ref (x, VOIDmode));
+  if (TARGET_TOC && ! large_toc_ok)
return false;
   if (GET_MODE_NUNITS (mode) != 1)
return false;
@@ -6395,7 +6411,7 @@ legitimate_lo_sum_address_p (enum machin
   && (mode == DFmode || mode == DDmode)))
return false;
 
-  return CONSTANT_P (x);
+  return CONSTANT_P (x) || large_toc_ok;
 }
 
   return false;
@@ -7106,7 +7122,6 @@ use_toc_relative_ref (rtx sym)
   && ASM_OUTPUT_SPECIAL_POOL_ENTRY_P (get_pool_constant (sym),
   get_pool_mode (sym)))
  || (TARGET_CMODEL == CMODEL_MEDIUM
- && !CONSTANT_POOL_ADDRESS_P (sym)
  && SYMBOL_REF_LOCAL_P (sym)));
 }
 
@@ -7394,7 +7409,8 @@ rs6000_legitimate_address_p (enum machin
   if (reg_offset_p && legitimate_small_data_p (mode, x))
 return 1;
   if (reg_offset_p
-  && legitimate_constant_pool_address_p (x, mode, reg_ok_strict))
+  && legitimate_constant_pool_address_p (x, mode,
+reg_ok_strict || lra_in_progress))
 return 1;
   /* For TImode, if we have load/store quad and TImode in VSX registers, only
  allow register indirect addresses.  This will allow the values to go in
@@ -7680,6 +7696,7 @@ rs6000_conditional_register_usage (void)
  fixed_regs[i] = call_used_regs[i] = call_really_used_regs[i] = 1;
 }
 }
+
 
 /* Try to output insns to set TARGET equal to the constant C if it can
be done in less than N insns.  Do all computations in MODE.
@@ -8112,6 +8129,68 @@ rs6000_emit_move (rtx dest, rtx s

[4.8, PATCH 26/26] Backport Power8 and LE support: Missing support

2014-03-19 Thread Bill Schmidt
Hi,

This patch (diff-trunk-missing) backports some LE pieces that were found
not to have been backported from trunk to the IBM 4.8 branch until
relatively recently.

Thanks,
Bill


2014-03-19  Bill Schmidt  

Back port from trunk
2013-04-25  Alan Modra  

PR target/57052
* config/rs6000/rs6000.md (rotlsi3_internal7): Rename to
rotlsi3_internal7le and condition on !BYTES_BIG_ENDIAN.
(rotlsi3_internal8be): New BYTES_BIG_ENDIAN insn.
Repeat for many other rotate/shift and mask patterns using subregs.
Name lshiftrt insns.
(ashrdisi3_noppc64): Rename to ashrdisi3_noppc64be and condition
on WORDS_BIG_ENDIAN.

2013-06-07  Alan Modra  

* config/rs6000/rs6000.c (rs6000_option_override_internal): Don't
override user -mfp-in-toc.
(offsettable_ok_by_alignment): Consider just the current access
rather than the whole object, unless BLKmode.  Handle
CONSTANT_POOL_ADDRESS_P constants that lack a decl too.
(use_toc_relative_ref): Allow CONSTANT_POOL_ADDRESS_P constants
for -mcmodel=medium.
* config/rs6000/linux64.h (SUBSUBTARGET_OVERRIDE_OPTIONS): Don't
override user -mfp-in-toc or -msum-in-toc.  Default to
-mno-fp-in-toc for -mcmodel=medium.

2013-06-18  Alan Modra  

* config/rs6000/rs6000.h (enum data_align): New.
(LOCAL_ALIGNMENT, DATA_ALIGNMENT): Use rs6000_data_alignment.
(DATA_ABI_ALIGNMENT): Define.
(CONSTANT_ALIGNMENT): Correct comment.
* config/rs6000/rs6000-protos.h (rs6000_data_alignment): Declare.
* config/rs6000/rs6000.c (rs6000_data_alignment): New function.

2013-07-11  Ulrich Weigand  

* config/rs6000/rs6000.md ("*tls_gd_low"):
Require GOT register as additional operand in UNSPEC.
("*tls_ld_low"): Likewise.
("*tls_got_dtprel_low"): Likewise.
("*tls_got_tprel_low"): Likewise.
("*tls_gd"): Update splitter.
("*tls_ld"): Likewise.
("tls_got_dtprel_"): Likewise.
("tls_got_tprel_"): Likewise.

2014-01-23  Pat Haugen  

* config/rs6000/rs6000.c (rs6000_option_override_internal): Don't
force flag_ira_loop_pressure if set via command line.

2014-02-06  Alan Modra  

PR target/60032
* config/rs6000/rs6000.c (rs6000_secondary_memory_needed_mode): Only
change SDmode to DDmode when lra_in_progress.
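
Several of the entries above share one idea: a default is applied only when the user has not set the option explicitly. A minimal sketch of the linux64.h logic follows. The `toy_` types are invented for illustration; in the patch the "was it set" information comes from global_options_set.

```cpp
#include <cassert>

enum toy_cmodel { TOY_CMODEL_SMALL, TOY_CMODEL_MEDIUM, TOY_CMODEL_LARGE };

struct toy_options {
    int no_fp_in_toc;
    int no_sum_in_toc;
};

// `opts` holds the current values; a non-zero field in `set` means the
// user gave that option on the command line (hypothetical model of
// global_options_set).
inline void toy_override_options(toy_options& opts, const toy_options& set,
                                 toy_cmodel cmodel) {
    if (cmodel != TOY_CMODEL_SMALL) {
        // Default to -mno-fp-in-toc for the medium model only.
        if (!set.no_fp_in_toc)
            opts.no_fp_in_toc = (cmodel == TOY_CMODEL_MEDIUM);
        if (!set.no_sum_in_toc)
            opts.no_sum_in_toc = 0;
    }
}
```

Before the change both values were overwritten unconditionally for the non-small code models, so an explicit -mfp-in-toc or -msum-in-toc choice was silently lost.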


Index: gcc-4_8-test/gcc/config/rs6000/linux64.h
===
--- gcc-4_8-test.orig/gcc/config/rs6000/linux64.h
+++ gcc-4_8-test/gcc/config/rs6000/linux64.h
@@ -149,8 +149,11 @@ extern int dot_symbols;
SET_CMODEL (CMODEL_MEDIUM); \
  if (rs6000_current_cmodel != CMODEL_SMALL)\
{   \
- TARGET_NO_FP_IN_TOC = 0;  \
- TARGET_NO_SUM_IN_TOC = 0; \
+ if (!global_options_set.x_TARGET_NO_FP_IN_TOC) \
+   TARGET_NO_FP_IN_TOC \
+ = rs6000_current_cmodel == CMODEL_MEDIUM; \
+ if (!global_options_set.x_TARGET_NO_SUM_IN_TOC) \
+   TARGET_NO_SUM_IN_TOC = 0;   \
}   \
}   \
}   \
Index: gcc-4_8-test/gcc/config/rs6000/rs6000-protos.h
===
--- gcc-4_8-test.orig/gcc/config/rs6000/rs6000-protos.h
+++ gcc-4_8-test/gcc/config/rs6000/rs6000-protos.h
@@ -152,6 +152,7 @@ extern void rs6000_split_logical (rtx []
 #endif /* RTX_CODE */
 
 #ifdef TREE_CODE
+extern unsigned int rs6000_data_alignment (tree, unsigned int, enum data_align);
 extern unsigned int rs6000_special_round_type_align (tree, unsigned int,
 unsigned int);
 extern unsigned int darwin_rs6000_special_round_type_align (tree, unsigned int,
Index: gcc-4_8-test/gcc/config/rs6000/rs6000.c
===
--- gcc-4_8-test.orig/gcc/config/rs6000/rs6000.c
+++ gcc-4_8-test/gcc/config/rs6000/rs6000.c
@@ -3031,7 +3031,8 @@ rs6000_option_override_internal (bool gl
  calculation works better for RTL loop invariant motion on targets
  with enough (>= 32) registers.  It is an expensive optimization.
  So it is on only for peak performance.  */
-  if (optimize >= 3 && global_init_p)
+  if (optimize >= 3 && global_init_p
+  && !global_options_set.x_flag_ira_loop_pressure)
 flag_ira_loop_pressure = 1;
 
   /* Set the pointer size.  */
@@ -3520,7 +3521,8 @@ rs6000_option_override_internal (bool gl
 
   /* Place FP constants i

[4.8, PATCH 19/26] Backport Power8 and LE support: Quad memory atomic

2014-03-19 Thread Bill Schmidt
Hi,

This patch (diff-quad-memory) backports support for quad-memory atomic
operations.

Thanks,
Bill


[gcc/testsuite]

2014-03-19  Bill Schmidt  

Back port from mainline
2014-01-23  Michael Meissner  

PR target/59909
* gcc.target/powerpc/quad-atomic.c: New file to test power8 quad
word atomic functions at runtime.

[gcc]

2014-03-19  Bill Schmidt  

Back port from mainline
2014-01-23  Michael Meissner  

PR target/59909
* doc/invoke.texi (RS/6000 and PowerPC Options): Document
-mquad-memory-atomic.  Update -mquad-memory documentation to say
it is only used for non-atomic loads/stores.

* config/rs6000/predicates.md (quad_int_reg_operand): Allow either
-mquad-memory or -mquad-memory-atomic switches.

* config/rs6000/rs6000-cpus.def (ISA_2_7_MASKS_SERVER): Add
-mquad-memory-atomic to ISA 2.07 support.

* config/rs6000/rs6000.opt (-mquad-memory-atomic): Add new switch
to separate support of normal quad word memory operations (ldq,
stq) from the atomic quad word memory operations.

* config/rs6000/rs6000.c (rs6000_option_override_internal): Add
support to separate non-atomic quad word operations from atomic
quad word operations.  Disable non-atomic quad word operations in
little endian mode so that we don't have to swap words after the
load and before the store.
(quad_load_store_p): Add comment about atomic quad word support.
(rs6000_opt_masks): Add -mquad-memory-atomic to the list of
options printed with -mdebug=reg.

* config/rs6000/rs6000.h (TARGET_SYNC_TI): Use
-mquad-memory-atomic as the test for whether we have quad word
atomic instructions.
(TARGET_SYNC_HI_QI): If either -mquad-memory-atomic,
-mquad-memory, or -mp8-vector are used, allow byte/half-word
atomic operations.

* config/rs6000/sync.md (load_lockedti): Insure that the address
is a proper indexed or indirect address for the lqarx instruction.
On little endian systems, swap the hi/lo registers after the lqarx
instruction.
(load_lockedpti): Use indexed_or_indirect_operand predicate to
insure the address is valid for the lqarx instruction.
(store_conditionalti): Insure that the address is a proper indexed
or indirect address for the stqcrx. instruction.  On little endian
systems, swap the hi/lo registers before doing the stqcrx.
instruction.
(store_conditionalpti): Use indexed_or_indirect_operand predicate to
insure the address is valid for the stqcrx. instruction.

* gcc/config/rs6000/rs6000-c.c (rs6000_target_modify_macros):
Define __QUAD_MEMORY__ and __QUAD_MEMORY_ATOMIC__ based on what
type of quad memory support is available.
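
As an illustration of how user code might consume the two new predefined macros, here is a small sketch. Only the macro names come from the patch; the function name and the returned strings are made up for the example.

```cpp
#include <cassert>
#include <string>

// Reports which kind of quad word memory support the compiler advertises
// through the predefined macros added by this patch.
inline std::string quad_memory_support() {
#if defined(__QUAD_MEMORY_ATOMIC__) && defined(__QUAD_MEMORY__)
    return "atomic and non-atomic";
#elif defined(__QUAD_MEMORY_ATOMIC__)
    return "atomic only";
#elif defined(__QUAD_MEMORY__)
    return "non-atomic only";
#else
    return "none";
#endif
}
```

Going by the ChangeLog above, a little-endian power8 build would be expected to define only __QUAD_MEMORY_ATOMIC__, since the non-atomic quad word operations are disabled in little endian mode.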


Index: gcc-4_8-test/gcc/config/rs6000/predicates.md
===
--- gcc-4_8-test.orig/gcc/config/rs6000/predicates.md
+++ gcc-4_8-test/gcc/config/rs6000/predicates.md
@@ -270,7 +270,7 @@
 {
   HOST_WIDE_INT r;
 
-  if (!TARGET_QUAD_MEMORY)
+  if (!TARGET_QUAD_MEMORY && !TARGET_QUAD_MEMORY_ATOMIC)
 return 0;
 
   if (GET_CODE (op) == SUBREG)
@@ -633,6 +633,7 @@
(match_test "offsettable_nonstrict_memref_p (op)")))
 
 ;; Return 1 if the operand is suitable for load/store quad memory.
+;; This predicate only checks for non-atomic loads/stores.
 (define_predicate "quad_memory_operand"
   (match_code "mem")
 {
Index: gcc-4_8-test/gcc/config/rs6000/rs6000-c.c
===
--- gcc-4_8-test.orig/gcc/config/rs6000/rs6000-c.c
+++ gcc-4_8-test/gcc/config/rs6000/rs6000-c.c
@@ -337,6 +337,10 @@ rs6000_target_modify_macros (bool define
 rs6000_define_or_undefine_macro (define_p, "__HTM__");
   if ((flags & OPTION_MASK_P8_VECTOR) != 0)
 rs6000_define_or_undefine_macro (define_p, "__POWER8_VECTOR__");
+  if ((flags & OPTION_MASK_QUAD_MEMORY) != 0)
+rs6000_define_or_undefine_macro (define_p, "__QUAD_MEMORY__");
+  if ((flags & OPTION_MASK_QUAD_MEMORY_ATOMIC) != 0)
+rs6000_define_or_undefine_macro (define_p, "__QUAD_MEMORY_ATOMIC__");
   if ((flags & OPTION_MASK_CRYPTO) != 0)
 rs6000_define_or_undefine_macro (define_p, "__CRYPTO__");
 
Index: gcc-4_8-test/gcc/config/rs6000/rs6000-cpus.def
===
--- gcc-4_8-test.orig/gcc/config/rs6000/rs6000-cpus.def
+++ gcc-4_8-test/gcc/config/rs6000/rs6000-cpus.def
@@ -53,7 +53,8 @@
 | OPTION_MASK_CRYPTO   \
 | OPTION_MASK_DIRECT_MOVE  \
 | OPTION_MASK_HTM  \
-| OPTION_MASK_QUAD_MEMORY)
+| OPTION_MASK_QUAD_MEMORY 

Re: [RFA jit v2 2/2] introduce auto_timevar

2014-03-19 Thread David Malcolm
On Wed, 2014-03-19 at 11:52 -0600, Tom Tromey wrote:
> This introduces a new auto_timevar class.  It pushes a given timevar
> in its constructor, and pops it in the destructor, giving a much
> simpler way to use timevars in the typical case where they can be
> scoped.
> ---
>  gcc/ChangeLog.jit  |  4 
>  gcc/jit/ChangeLog.jit  |  4 
>  gcc/jit/internal-api.c | 16 +---
>  gcc/timevar.h  | 26 +-
>  4 files changed, 38 insertions(+), 12 deletions(-)

OK (and it fixes a bug in the earlier version of the patch in the dtor,
which pushed rather than popped).

Are you able to push this to my branch yourself, or do you need me to do
this?
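
For readers who have not seen the patch itself, the pattern under discussion can be sketched as follows. This is not the code from timevar.h: every `toy_` name is hypothetical, and the timevar stack is modelled with a plain vector so the example is self-contained.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

typedef int toy_timevar_id;

// Stand-in for the timevar stack.
static std::vector<toy_timevar_id> toy_timevar_stack;

inline void toy_timevar_push(toy_timevar_id tv) {
    toy_timevar_stack.push_back(tv);
}

inline void toy_timevar_pop(toy_timevar_id tv) {
    assert(!toy_timevar_stack.empty() && toy_timevar_stack.back() == tv);
    toy_timevar_stack.pop_back();
}

// Pushes in the constructor and pops in the destructor, so the timevar
// is active exactly for the lifetime of the object.
class toy_auto_timevar {
public:
    explicit toy_auto_timevar(toy_timevar_id tv) : m_tv(tv) {
        toy_timevar_push(m_tv);
    }
    ~toy_auto_timevar() { toy_timevar_pop(m_tv); }  // pop, not push

private:
    toy_auto_timevar(const toy_auto_timevar&);             // not copyable
    toy_auto_timevar& operator=(const toy_auto_timevar&);  // not assignable
    toy_timevar_id m_tv;
};

// Returns the stack depth seen while the scoped timevar is alive.
inline std::size_t toy_depth_inside_scope() {
    toy_auto_timevar tv(42);
    return toy_timevar_stack.size();
}
```

The benefit is that early returns cannot leave a timevar pushed, since the destructor runs on every exit path from the scope.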



[4.8, PATCH 1/26 too big]

2014-03-19 Thread Bill Schmidt
Hi,

The main patch for this series was too large for the mailer to accept.
Sorry about that.  This piece is all powerpc-related and seems to have
been delivered to David ok.  If anyone else wants a copy of the patch,
please contact me privately and I'll send it your way.

Thanks,
Bill



Re: [PATCH] [gomp4] Initial support of OpenACC loop directive in C front-end.

2014-03-19 Thread Thomas Schwinge
Hi Ilmir!

On Tue, 18 Mar 2014 16:37:24 +0400, Ilmir Usmanov  wrote:
> This patch introduces support of OpenACC loop directive (and combined 
> directives) in C front-end up to GENERIC. Currently no clause is allowed.

> --- /dev/null
> +++ b/gcc/testsuite/c-c++-common/goacc/loop-1.c
> @@ -0,0 +1,89 @@
> +/* { dg-do compile } */
> +
> +int test1()
> +{
> +  int i, j, k, b[10];
> +  int a[30];
> +  double d;
> +  float r;
> +  i = 0;

> +  #pragma acc loop
> +  for (i = 1; i < 10; i++)
> +{
> +}

Do you intend to support loop constructs that are not nested in a
parallel or kernels construct?  As I'm reading it, the specification is
not clear on this.  (I guess I'll raise this question with the OpenACC
guys.)
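
For comparison with the quoted test case, the nested form I have in mind looks like the sketch below. The function name is made up, and a compiler without OpenACC support simply ignores the pragmas and runs the loop sequentially.

```cpp
#include <cassert>

// Fills a small array inside a loop construct that is nested in a
// parallel construct, then sums it on the host.
inline int nested_loop_example() {
    int b[10];
#pragma acc parallel
    {
#pragma acc loop
        for (int i = 0; i < 10; i++)
            b[i] = 2 * i;  // iterations are independent
    }
    int total = 0;
    for (int i = 0; i < 10; i++)
        total += b[i];
    return total;
}
```

The quoted test places "#pragma acc loop" directly in the function body instead, which is the case the question is about.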


Grüße,
 Thomas




[4.8, PATCH 24/26] Backport Power8 and LE support: Reload issues

2014-03-19 Thread Bill Schmidt
Hi,

This patch (diff-reload) backports fixes for a couple of problems in
PowerPC reload handling.

Thanks,
Bill


2014-03-19  Bill Schmidt  

Apply mainline r207798
2014-02-26  Alan Modra  
PR target/58675
PR target/57935
* config/rs6000/rs6000.c (rs6000_secondary_reload_inner): Use
find_replacement on parts of insn rtl that might be reloaded.

Backport from mainline r208287
2014-03-03  Bill Schmidt  

* config/rs6000/rs6000.c (rs6000_preferred_reload_class): Disallow
reload of PLUS rtx's outside of GENERAL_REGS or BASE_REGS; relax
constraint on constants to permit them being loaded into
GENERAL_REGS or BASE_REGS.
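
The new decision in rs6000_preferred_reload_class for constants and PLUS expressions can be modelled with register classes as bit sets. This is a sketch under simplifying assumptions: the `toy_` names are hypothetical, each class is a 64-bit mask, and BASE_REGS is taken to be the general registers without r0.

```cpp
#include <cassert>
#include <cstdint>

typedef std::uint64_t toy_reg_class;  // one bit per hard register

const toy_reg_class TOY_NO_REGS      = 0;
const toy_reg_class TOY_GENERAL_REGS = 0x00000000FFFFFFFFull;     // r0..r31
const toy_reg_class TOY_BASE_REGS    = TOY_GENERAL_REGS & ~1ull;  // r1..r31
const toy_reg_class TOY_FLOAT_REGS   = 0xFFFFFFFF00000000ull;     // f0..f31

// True if every register of `a` is also in `b`, in the spirit of
// reg_class_subset_p (a, b).
inline bool toy_subset_p(toy_reg_class a, toy_reg_class b) {
    return (a & ~b) == 0;
}

// Follows the added hunk: prefer GENERAL_REGS, then BASE_REGS, and
// otherwise refuse with NO_REGS.
inline toy_reg_class toy_preferred_class_for_constant_or_plus(toy_reg_class rclass) {
    if (toy_subset_p(TOY_GENERAL_REGS, rclass))
        return TOY_GENERAL_REGS;
    if (toy_subset_p(TOY_BASE_REGS, rclass))
        return TOY_BASE_REGS;
    return TOY_NO_REGS;
}
```

Compared with the old test, which returned NO_REGS whenever the class intersected FLOAT_REGS, a class containing both general and floating point registers now narrows to GENERAL_REGS instead of being rejected.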


Index: gcc-4_8-test/gcc/config/rs6000/rs6000.c
===
--- gcc-4_8-test.orig/gcc/config/rs6000/rs6000.c
+++ gcc-4_8-test/gcc/config/rs6000/rs6000.c
@@ -16380,7 +16380,7 @@ rs6000_secondary_reload_inner (rtx reg,
 rs6000_secondary_reload_fail (__LINE__, reg, mem, scratch, store_p);
 
   rclass = REGNO_REG_CLASS (regno);
-  addr = XEXP (mem, 0);
+  addr = find_replacement (&XEXP (mem, 0));
 
   switch (rclass)
 {
@@ -16391,19 +16391,18 @@ rs6000_secondary_reload_inner (rtx reg,
   if (GET_CODE (addr) == AND)
{
  and_op2 = XEXP (addr, 1);
- addr = XEXP (addr, 0);
+ addr = find_replacement (&XEXP (addr, 0));
}
 
   if (GET_CODE (addr) == PRE_MODIFY)
{
- scratch_or_premodify = XEXP (addr, 0);
+ scratch_or_premodify = find_replacement (&XEXP (addr, 0));
  if (!REG_P (scratch_or_premodify))
rs6000_secondary_reload_fail (__LINE__, reg, mem, scratch, store_p);
 
- if (GET_CODE (XEXP (addr, 1)) != PLUS)
+ addr = find_replacement (&XEXP (addr, 1));
+ if (GET_CODE (addr) != PLUS)
rs6000_secondary_reload_fail (__LINE__, reg, mem, scratch, store_p);
-
- addr = XEXP (addr, 1);
}
 
   if (GET_CODE (addr) == PLUS
@@ -16411,6 +16410,8 @@ rs6000_secondary_reload_inner (rtx reg,
  || !rs6000_legitimate_offset_address_p (PTImode, addr,
  false, true)))
{
+ /* find_replacement already recurses into both operands of
+PLUS so we don't need to call it here.  */
  addr_op1 = XEXP (addr, 0);
  addr_op2 = XEXP (addr, 1);
  if (!legitimate_indirect_address_p (addr_op1, false))
@@ -16486,7 +16487,7 @@ rs6000_secondary_reload_inner (rtx reg,
  || !VECTOR_MEM_ALTIVEC_P (mode)))
{
  and_op2 = XEXP (addr, 1);
- addr = XEXP (addr, 0);
+ addr = find_replacement (&XEXP (addr, 0));
}
 
   /* If we aren't using a VSX load, save the PRE_MODIFY register and use it
@@ -16498,14 +16499,13 @@ rs6000_secondary_reload_inner (rtx reg,
  || and_op2 != NULL_RTX
  || !legitimate_indexed_address_p (XEXP (addr, 1), false)))
{
- scratch_or_premodify = XEXP (addr, 0);
+ scratch_or_premodify = find_replacement (&XEXP (addr, 0));
  if (!legitimate_indirect_address_p (scratch_or_premodify, false))
rs6000_secondary_reload_fail (__LINE__, reg, mem, scratch, store_p);
 
- if (GET_CODE (XEXP (addr, 1)) != PLUS)
+ addr = find_replacement (&XEXP (addr, 1));
+ if (GET_CODE (addr) != PLUS)
rs6000_secondary_reload_fail (__LINE__, reg, mem, scratch, store_p);
-
- addr = XEXP (addr, 1);
}
 
   if (legitimate_indirect_address_p (addr, false)  /* reg */
@@ -16765,8 +16765,14 @@ rs6000_preferred_reload_class (rtx x, en
   && easy_vector_constant (x, mode))
 return ALTIVEC_REGS;
 
-  if (CONSTANT_P (x) && reg_classes_intersect_p (rclass, FLOAT_REGS))
-return NO_REGS;
+  if ((CONSTANT_P (x) || GET_CODE (x) == PLUS))
+{
+  if (reg_class_subset_p (GENERAL_REGS, rclass))
+   return GENERAL_REGS;
+  if (reg_class_subset_p (BASE_REGS, rclass))
+   return BASE_REGS;
+  return NO_REGS;
+}
 
   if (GET_MODE_CLASS (mode) == MODE_INT && rclass == NON_SPECIAL_REGS)
 return GENERAL_REGS;





[4.8, PATCH 22/26] Backport Power8 and LE support: -mcall-* endianness

2014-03-19 Thread Bill Schmidt
Hi,

This patch (diff-mcall) fixes big-endian assumptions for -mcall-aixdesc
and various others.

Thanks,
Bill


2014-03-19  Bill Schmidt  

Backport from mainline r207658
2014-02-06  Ulrich Weigand  

* config/rs6000/sysv4.h (ENDIAN_SELECT): Do not attempt to enforce
big-endian mode for -mcall-aixdesc, -mcall-freebsd, -mcall-netbsd,
-mcall-openbsd, or -mcall-linux.
(CC1_ENDIAN_BIG_SPEC): Remove.
(CC1_ENDIAN_LITTLE_SPEC): Remove.
(CC1_ENDIAN_DEFAULT_SPEC): Remove.
(CC1_SPEC): Remove (always empty) %cc1_endian_... spec.
(SUBTARGET_EXTRA_SPECS): Remove %cc1_endian_big, %cc1_endian_little,
and %cc1_endian_default.
* config/rs6000/sysv4le.h (CC1_ENDIAN_DEFAULT_SPEC): Remove.


Index: gcc-4_8-test/gcc/config/rs6000/sysv4.h
===
--- gcc-4_8-test.orig/gcc/config/rs6000/sysv4.h
+++ gcc-4_8-test/gcc/config/rs6000/sysv4.h
@@ -522,8 +522,6 @@ extern int fixuplabelno;
 #define ENDIAN_SELECT(BIG_OPT, LITTLE_OPT, DEFAULT_OPT)\
 "%{mlittle|mlittle-endian:"LITTLE_OPT ";"  \
   "mbig|mbig-endian:"  BIG_OPT";"  \
-  "mcall-aixdesc|mcall-freebsd|mcall-netbsd|"  \
-  "mcall-openbsd|mcall-linux:" BIG_OPT";"  \
   "mcall-i960-old:"LITTLE_OPT ";"  \
   ":"  DEFAULT_OPT "}"
 
@@ -536,20 +534,12 @@ extern int fixuplabelno;
 %{memb|msdata=eabi: -memb}" \
 ENDIAN_SELECT(" -mbig", " -mlittle", DEFAULT_ASM_ENDIAN)
 
-#define CC1_ENDIAN_BIG_SPEC ""
-
-#define CC1_ENDIAN_LITTLE_SPEC ""
-
-#define CC1_ENDIAN_DEFAULT_SPEC "%(cc1_endian_big)"
-
 #ifndef CC1_SECURE_PLT_DEFAULT_SPEC
 #define CC1_SECURE_PLT_DEFAULT_SPEC ""
 #endif
 
-/* Pass -G xxx to the compiler and set correct endian mode.  */
+/* Pass -G xxx to the compiler.  */
 #define CC1_SPEC "%{G*} %(cc1_cpu)" \
-  ENDIAN_SELECT(" %(cc1_endian_big)", " %(cc1_endian_little)", \
-   " %(cc1_endian_default)")   \
 "%{meabi: %{!mcall-*: -mcall-sysv }} \
 %{!meabi: %{!mno-eabi: \
 %{mrelocatable: -meabi } \
@@ -903,9 +893,6 @@ ncrtn.o%s"
   { "link_os_netbsd",  LINK_OS_NETBSD_SPEC },  \
   { "link_os_openbsd", LINK_OS_OPENBSD_SPEC }, \
   { "link_os_default", LINK_OS_DEFAULT_SPEC }, \
-  { "cc1_endian_big",  CC1_ENDIAN_BIG_SPEC },  \
-  { "cc1_endian_little",   CC1_ENDIAN_LITTLE_SPEC },   \
-  { "cc1_endian_default",  CC1_ENDIAN_DEFAULT_SPEC },  \
   { "cc1_secure_plt_default",  CC1_SECURE_PLT_DEFAULT_SPEC },  \
   { "cpp_os_ads",  CPP_OS_ADS_SPEC },  \
   { "cpp_os_yellowknife",  CPP_OS_YELLOWKNIFE_SPEC },  \
Index: gcc-4_8-test/gcc/config/rs6000/sysv4le.h
===
--- gcc-4_8-test.orig/gcc/config/rs6000/sysv4le.h
+++ gcc-4_8-test/gcc/config/rs6000/sysv4le.h
@@ -22,9 +22,6 @@
 #undef  TARGET_DEFAULT
 #define TARGET_DEFAULT MASK_LITTLE_ENDIAN
 
-#undef CC1_ENDIAN_DEFAULT_SPEC
-#define CC1_ENDIAN_DEFAULT_SPEC "%(cc1_endian_little)"
-
 #undef DEFAULT_ASM_ENDIAN
 #define DEFAULT_ASM_ENDIAN " -mlittle"
 





[4.8, PATCH 23/26] Backport Power8 and LE support: PR60137, PR60203

2014-03-19 Thread Bill Schmidt
Hi,

This patch (diff-pr60137-pr60203) backports fixes for two little-endian
vector mode problems.

Thanks,
Bill


[gcc]

2014-03-19  Bill Schmidt  

Backport from mainline r207699.
2014-02-11  Michael Meissner  

PR target/60137
* config/rs6000/rs6000.md (128-bit GPR splitter): Add a splitter
for VSX/Altivec vectors that land in GPR registers.

Backport from mainline r207808.
2014-02-15  Michael Meissner  

PR target/60203
* config/rs6000/rs6000.md (rreg): Add TFmode, TDmode constraints.
(mov_internal, TFmode/TDmode): Split TFmode/TDmode moves
into 64-bit and 32-bit moves.  On 64-bit moves, add support for
using direct move instructions on ISA 2.07.  Also adjust
instruction length for 64-bit.
(mov_64bit, TFmode/TDmode): Likewise.
(mov_32bit, TFmode/TDmode): Likewise.

Backport from mainline r207868.
2014-02-18  Michael Meissner  

PR target/60203
* config/rs6000/rs6000.md (mov_64bit, TF/TDmode moves):
Split 64-bit moves into 2 patterns.  Do not allow the use of
direct move for TDmode in little endian, since the decimal value
has little endian bytes within a word, but the 64-bit pieces are
ordered in a big endian fashion, and normal subreg's of TDmode are
not allowed.
(mov_64bit_dm): Likewise.
(movtd_64bit_nodm): Likewise.

[gcc/testsuite]

2014-03-19  Bill Schmidt  

Backport from mainline r207699.
2014-02-11  Michael Meissner  

PR target/60137
* gcc.target/powerpc/pr60137.c: New file.

Backport from mainline r207808.
2014-02-15  Michael Meissner  

PR target/60203
* gcc.target/powerpc/pr60203.c: New testsuite.


Index: gcc-4_8-test/gcc/config/rs6000/rs6000.md
===
--- gcc-4_8-test.orig/gcc/config/rs6000/rs6000.md
+++ gcc-4_8-test/gcc/config/rs6000/rs6000.md
@@ -378,6 +378,8 @@
 
 (define_mode_attr rreg [(SF   "f")
(DF   "ws")
+   (TF   "f")
+   (TD   "f")
(V4SF "wf")
(V2DF "wd")])
 
@@ -8990,10 +8992,40 @@
 ;; It's important to list Y->r and r->Y before r->r because otherwise
 ;; reload, given m->r, will try to pick r->r and reload it, which
 ;; doesn't make progress.
-(define_insn_and_split "*mov_internal"
+
+;; We can't split little endian direct moves of TDmode, because the words are
+;; not swapped like they are for TImode or TFmode.  Subregs therefore are
+;; problematical.  Don't allow direct move for this case.
+
+(define_insn_and_split "*mov_64bit_dm"
+  [(set (match_operand:FMOVE128 0 "nonimmediate_operand" "=m,d,d,Y,r,r,r,wm")
+   (match_operand:FMOVE128 1 "input_operand" "d,m,d,r,YGHF,r,wm,r"))]
+  "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_POWERPC64
+   && (mode != TDmode || WORDS_BIG_ENDIAN)
+   && (gpc_reg_operand (operands[0], mode)
+   || gpc_reg_operand (operands[1], mode))"
+  "#"
+  "&& reload_completed"
+  [(pc)]
+{ rs6000_split_multireg_move (operands[0], operands[1]); DONE; }
+  [(set_attr "length" "8,8,8,12,12,8,8,8")])
+
+(define_insn_and_split "*movtd_64bit_nodm"
+  [(set (match_operand:TD 0 "nonimmediate_operand" "=m,d,d,Y,r,r")
+   (match_operand:TD 1 "input_operand" "d,m,d,r,YGHF,r"))]
+  "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_POWERPC64 && !WORDS_BIG_ENDIAN
+   && (gpc_reg_operand (operands[0], TDmode)
+   || gpc_reg_operand (operands[1], TDmode))"
+  "#"
+  "&& reload_completed"
+  [(pc)]
+{ rs6000_split_multireg_move (operands[0], operands[1]); DONE; }
+  [(set_attr "length" "8,8,8,12,12,8")])
+
+(define_insn_and_split "*mov_32bit"
   [(set (match_operand:FMOVE128 0 "nonimmediate_operand" "=m,d,d,Y,r,r")
(match_operand:FMOVE128 1 "input_operand" "d,m,d,r,YGHF,r"))]
-  "TARGET_HARD_FLOAT && TARGET_FPRS
+  "TARGET_HARD_FLOAT && TARGET_FPRS && !TARGET_POWERPC64
&& (gpc_reg_operand (operands[0], mode)
|| gpc_reg_operand (operands[1], mode))"
   "#"
@@ -9429,6 +9461,15 @@
   [(set_attr "length" "12")
(set_attr "type" "three")])
 
+(define_split
+  [(set (match_operand:FMOVE128_GPR 0 "nonimmediate_operand" "")
+   (match_operand:FMOVE128_GPR 1 "input_operand" ""))]
+  "reload_completed
+   && (int_reg_operand (operands[0], mode)
+   || int_reg_operand (operands[1], mode))"
+  [(pc)]
+{ rs6000_split_multireg_move (operands[0], operands[1]); DONE; })
+
 ;; Move SFmode to a VSX from a GPR register.  Because scalar floating point
 ;; type is stored internally as double precision in the VSX registers, we have
 ;; to convert it from the vector format.
Index: gcc-4_8-test/gcc/testsuite/gcc.target/powerpc/pr60137.c
===
--- /dev/null
+++ gcc-4_8-test/gcc/testsuite/gcc.target/powerpc/pr60137.c
@@ -0,0 +1,17 @@
+/* { dg-do compile { tar

Re: [Fortran][PATCH][gomp4]: Transform OpenACC loop directive

2014-03-19 Thread Tobias Burnus

Hi Ilmir,

Ilmir Usmanov:
> This patch implements transformation of OpenACC loop directive from
> Fortran AST to GENERIC.


If I followed correctly, with this patch the Fortran FE implementation 
of OpenACC is complete, except for:


* !$acc cache() - parsing supported, but then aborting with a 
not-implemented error

* OpenACC 2.0a additions.

Am I right?

> Successfully bootstrapped and tested with no new regressions on
> x86_64-unknown-linux-gnu.
>
> OK for gomp4 branch?


I leave the review of the gcc/tree-pretty-print.c part (looks good to me) to
Thomas, who might also have a comment on the Fortran part.


For a DO loop, the code looks okay.


For DO CONCURRENT, it is not. I think we should seriously consider
rejecting DO CONCURRENT with a "not permitted" error: OpenACC does not
currently support it explicitly, and we can revisit it once it is
explicitly added to OpenACC. Otherwise, see gfc_trans_do_concurrent for
how to handle DO CONCURRENT loops.


Issues with DO CONCURRENT:

* You use "code->ext.iterator->var" - that's fine with DO but not with 
DO CONCURRENT, which uses a "code->ext.forall_iterator"


* Do concurrent also handles multiple variables in a single statement, 
such as:


integer :: i, j, b(3,5)
DO CONCURRENT(i=1:3, j=1:5:2)
  b(i, j) = -42
END DO
end

* And do concurrent also supports masks:

logical :: my_mask(3)
integer :: i, b(3)
b = [5, 5, 2]
my_mask = [.true., .false., .true.]
do concurrent (i=1:3, b(i) == 5 .and. my_mask(i))
  b(i) = -42
end do
end
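
The two DO CONCURRENT examples above can be modelled outside Fortran. The
following is a minimal C++ sketch, not gfortran code: Control,
trip_count, iteration_count and masked_assign are hypothetical names
introduced only for illustration. It assumes positive steps, and it assumes
the mask is evaluated for every index before any body executes, as for
FORALL.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for one "name = lo:hi:step" control of a
// DO CONCURRENT header (step assumed positive).
struct Control { int lo, hi, step; };

// Number of values a single control takes.
inline int trip_count(const Control& c)
{
  return c.hi < c.lo ? 0 : (c.hi - c.lo) / c.step + 1;
}

// A header with several controls iterates over the product of their values.
inline int iteration_count(const std::vector<Control>& controls)
{
  int n = 1;
  for (const Control& c : controls)
    n *= trip_count(c);
  return n;
}

// Models "do concurrent (i=1:3, b(i) == 5 .and. my_mask(i))  b(i) = -42".
// The active index set is computed first; the assignments run afterwards.
inline std::vector<int> masked_assign(std::vector<int> b,
                                      const std::vector<bool>& my_mask)
{
  std::vector<std::size_t> active;
  for (std::size_t i = 0; i < b.size(); ++i)
    if (b[i] == 5 && my_mask[i])
      active.push_back(i);
  for (std::size_t i : active)
    b[i] = -42;
  return b;
}
```

With the controls (1:3) and (1:5:2), iteration_count gives 3 * 3 = 9 index
pairs; with b = [5, 5, 2] and my_mask = [.true., .false., .true.], only the
first element is active. A single "iterator->var" cannot describe either
case, which is why the forall_iterator list matters here.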

Tobias


Re: [Patch, Fortran] PRs 60283/60543: Fix two wrong-code bugs related for implicit pure

2014-03-19 Thread Tobias Burnus
Early *ping*  - I think this wrong-code GCC 4.7/4.8/4.9 issue is pretty 
severe.


Tobias Burnus wrote:
This patch fixes two issues, where gfortran claims that a function is 
implicit pure, but it is not. That will cause a wrong-code 
optimization in the middle end.


First problem, cf. PR60543, is that implicit pure was not set to 0 for 
calls to impure intrinsic subroutines. (BTW: There are no impure 
intrinsic functions.) Example:


  module m
  contains
REAL(8) FUNCTION random()
  CALL RANDOM_NUMBER(random)
END FUNCTION random
  end module m


The second problem pops up if one adds a BLOCK ... END BLOCK around 
the random_number call after applying the patch of the PR, which just 
does: gfc_current_ns->proc_name->attr.implicit_pure = 0.


The problem is that one sets only the implicit_pure of the block to 0 
and not of the function. That's the reason that the patch became much 
longer and that I added gfc_unset_implicit_pure as new function.
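
The BLOCK problem described above can be illustrated with a reduced model.
This is only a sketch under stated assumptions: Symbol, Namespace and
unset_implicit_pure are hypothetical stand-ins, not the gfortran data
structures or the actual body of gfc_unset_implicit_pure. It assumes that a
BLOCK has its own namespace whose parent is the namespace of the enclosing
procedure.

```cpp
#include <cassert>

// Hypothetical, heavily reduced stand-ins for a symbol and a namespace.
struct Symbol { bool implicit_pure; };
struct Namespace
{
  Namespace* parent;    // enclosing namespace, or nullptr
  Symbol* proc_name;    // symbol naming this namespace
  bool is_procedure;    // false for a BLOCK construct
};

// Clear the flag on every namespace up to and including the enclosing
// procedure, instead of only on the innermost one.
inline void unset_implicit_pure(Namespace* ns)
{
  for (; ns != nullptr; ns = ns->parent)
    {
      if (ns->proc_name)
        ns->proc_name->implicit_pure = false;
      if (ns->is_procedure)
        break;  // reached the enclosing function or subroutine
    }
}
```

Clearing only the flag of the current namespace would, in this model, leave
the function's own flag set whenever the impure call sits inside a BLOCK.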


Thus, the suspicion I had when reviewing the OpenACC patches turned 
out to be founded. Cf. PR60283.


Build and regtested on x86-64-gnu-linux.
OK for the trunk and for the 4.7 and 4.8 branches?

Note: I failed to create a test case.

Tobias




Re: [4.8, PATCH 0/26] Backport Power8 and LE support

2014-03-19 Thread Jakub Jelinek
On Wed, Mar 19, 2014 at 02:23:58PM -0500, Bill Schmidt wrote:
> Support for Power8 features and the new powerpc64le-linux-gnu target,
> including the ELFv2 ABI, has been developed up till now on the
> ibm/gcc-4_8-branch.  It was appropriate to use this separate branch
> while the support was unstable, but this branch will not represent a
> particularly good support mechanism for distributions going forward.
> Most distros are set up to pull from the major release branches, and
> having a separate branch for one target is quite inconvenient.  Also,
> the ibm/gcc-4_8-branch's original purpose is to serve as the code base
> for IBM's Advance Toolchain 7.0.  Over time the two purposes that the
> branch currently serves will diverge and make things even more
> complicated.
> 
> The code is now tested and stable enough that we are ready to backport
> this support to the FSF 4.8 branch.  This patch series constitutes that
> backport.

I guess the most important question is what guarantees there are that it
won't affect non-powerpc* ports too much (my main concern is the 9/26 patch,
plus the C++ FE / libstdc++ changes), and how much does this affect
code generation and overall stability of the PowerPC big endian existing
targets.

Jakub


Re: [4.8, PATCH 0/26] Backport Power8 and LE support

2014-03-19 Thread David Edelsohn
On Wed, Mar 19, 2014 at 4:05 PM, Jakub Jelinek  wrote:
> On Wed, Mar 19, 2014 at 02:23:58PM -0500, Bill Schmidt wrote:
>> Support for Power8 features and the new powerpc64le-linux-gnu target,
>> including the ELFv2 ABI, has been developed up till now on the
>> ibm/gcc-4_8-branch.  It was appropriate to use this separate branch
>> while the support was unstable, but this branch will not represent a
>> particularly good support mechanism for distributions going forward.
>> Most distros are set up to pull from the major release branches, and
>> having a separate branch for one target is quite inconvenient.  Also,
>> the ibm/gcc-4_8-branch's original purpose is to serve as the code base
>> for IBM's Advance Toolchain 7.0.  Over time the two purposes that the
>> branch currently serves will diverge and make things even more
>> complicated.
>>
>> The code is now tested and stable enough that we are ready to backport
>> this support to the FSF 4.8 branch.  This patch series constitutes that
>> backport.
>
> I guess the most important question is what guarantees there are that it
> won't affect non-powerpc* ports too much (my main concern is the 9/26 patch,
> plus the C++ FE / libstdc++ changes), and how much does this affect
> code generation and overall stability of the PowerPC big endian existing
> targets.

Before this patch is approved, we are going to thoroughly confirm that
it does not harm any other PowerPC targets (big endian PowerLinux,
eABI, nor AIX). Any help with testing from the PPC eABI community is
appreciated.

- David


Re: [Patch, Fortran] PRs 60283/60543: Fix two wrong-code bugs related for implicit pure

2014-03-19 Thread Paul Richard Thomas
Dear Tobias,

The patch looks OK to me.  If nothing else, it offers a
rationalisation of all the lines of code that unset the attribute!

I am somewhat puzzled by "Note: I failed to create a test case",
whereas I find one at the end of the patch.  Can you explain what you
mean?

Cheers

Paul

On 19 March 2014 21:21, Tobias Burnus  wrote:
> Early *ping*  - I think this wrong-code GCC 4.7/4.8/4.9 issue is pretty
> severe.
>
>
> Tobias Burnus wrote:
>>
>> This patch fixes two issues, where gfortran claims that a function is
>> implicit pure, but it is not. That will cause a wrong-code optimization in
>> the middle end.
>>
>> First problem, cf. PR60543, is that implicit pure was not set to 0 for
>> calls to impure intrinsic subroutines. (BTW: There are no impure intrinsic
>> functions.) Example:
>>
>>   module m
>>   contains
>> REAL(8) FUNCTION random()
>>   CALL RANDOM_NUMBER(random)
>> END FUNCTION random
>>   end module m
>>
>>
>> The second problem pops up if one adds a BLOCK ... END BLOCK around the
>> random_number call after applying the patch of the PR, which just does:
>> gfc_current_ns->proc_name->attr.implicit_pure = 0.
>>
>> The problem is that one sets only the implicit_pure of the block to 0 and
>> not of the function. That's the reason that the patch became much longer and
>> that I added gfc_unset_implicit_pure as new function.
>>
>> Thus, the suspicion I had when reviewing the OpenACC patches turned out to
>> be founded. Cf. PR60283.
>>
>> Build and regtested on x86-64-gnu-linux.
>> OK for the trunk and for the 4.7 and 4.8 branches?
>>
>> Note: I failed to create a test case.
>>
>> Tobias
>
>



-- 
The knack of flying is learning how to throw yourself at the ground and miss.
   --Hitchhikers Guide to the Galaxy


[C++ PATCH] Fix ICE in build_zero_init_1 (PR c++/60572)

2014-03-19 Thread Jakub Jelinek
Hi!

On the following testcase starting with r199779 we have a FIELD_DECL with
error_mark_node type, on which we ICE.  Fixed by ignoring such FIELD_DECLs.
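
The shape of the fix can be sketched in isolation. The code below is a
simplified model, not the C++ front end: Kind, Type, Member and
zero_initializable_fields are hypothetical names, and the is_error flag
stands in for a field whose type is error_mark_node.

```cpp
#include <cassert>
#include <vector>

// Hypothetical miniature of a class member list.
enum class Kind { Field, Method };
struct Type { bool is_error; };
struct Member { Kind kind; const Type* type; };

// Count the members that would receive a zero initializer.
inline int zero_initializable_fields(const std::vector<Member>& members)
{
  int n = 0;
  for (const Member& m : members)
    {
      if (m.kind != Kind::Field)
        continue;               // only data members are considered
      if (m.type->is_error)
        continue;               // skip fields already diagnosed as erroneous
      ++n;
    }
  return n;
}
```

Skipping such a member is safe in this model because the error has already
been reported; the loop only has to avoid inspecting a type that does not
exist.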

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2014-03-19  Jakub Jelinek  

PR c++/60572
* init.c (build_zero_init_1): Ignore fields with error_mark_node
type.

* g++.dg/init/pr60572.C: New test.

--- gcc/cp/init.c.jj2014-03-10 10:50:14.0 +0100
+++ gcc/cp/init.c   2014-03-19 07:43:54.077795662 +0100
@@ -192,6 +192,9 @@ build_zero_init_1 (tree type, tree nelts
  if (TREE_CODE (field) != FIELD_DECL)
continue;
 
+ if (TREE_TYPE (field) == error_mark_node)
+   continue;
+
  /* Don't add virtual bases for base classes if they are beyond
 the size of the current field, that means it is present
 somewhere else in the object.  */
--- gcc/testsuite/g++.dg/init/pr60572.C.jj  2014-03-19 07:46:33.607894844 +0100
+++ gcc/testsuite/g++.dg/init/pr60572.C 2014-03-19 07:46:49.752804722 +0100
@@ -0,0 +1,13 @@
+// PR c++/60572
+// { dg-do compile }
+
+struct A
+{
+  A x; // { dg-error "incomplete type" }
+  virtual ~A () {}
+};
+
+struct B : A
+{
+  B () : A () {}
+};

Jakub


Re: [RFA jit v2 1/2] introduce class toplev

2014-03-19 Thread Tom Tromey
David> OK.  Are you able to push this to my branch, or do you need me to do
David> this?

Thanks, I was able to push them.

Tom


Re: [4.8, PATCH 0/26] Backport Power8 and LE support

2014-03-19 Thread Bill Schmidt
On Wed, 2014-03-19 at 21:05 +0100, Jakub Jelinek wrote:

> I guess the most important question is what guarantees there are that it
> won't affect non-powerpc* ports too much (my main concern is the 9/26 patch,
> plus the C++ FE / libstdc++ changes), and how much does this affect
> code generation and overall stability of the PowerPC big endian existing
> targets.
> 
>   Jakub
> 

The three pieces that are somewhat controversial for non-powerpc targets
are 9/26, 10/26, 15/26.

 * Uli and Alan, can you speak to any concerns for 9/26?

 * 10/26 hits libstdc++, but only in a minor way for the extract_symvers
script; it adds a sed to ignore a string added for powerpc64le, so
shouldn't be a problem.

 * 15/26 might be one we can do without.  I need to check with Peter
Bergner, who originally backported Fabien's patch, but unfortunately he
is on vacation.  That patch fixed a problem that originated on an x86
platform.  I can try respinning the patch series without this one and
see what breaks, or if Peter happens to see this while he's on vacation,
perhaps he can comment.

For PowerPC targets, I have already checked out powerpc64-linux (big
endian).  As David mentioned, I need to apply the patch series on an AIX
machine and test it before this can be accepted.  We don't have any way
of testing the eabi stuff, so community help would be very much
appreciated there.

Thanks,
Bill



Re: [4.8, PATCH 0/26] Backport Power8 and LE support

2014-03-19 Thread Bill Schmidt
On Wed, 2014-03-19 at 16:03 -0500, Bill Schmidt wrote:
> On Wed, 2014-03-19 at 21:05 +0100, Jakub Jelinek wrote:
> 
> > I guess the most important question is what guarantees there are that it
> > won't affect non-powerpc* ports too much (my main concern is the 9/26 patch,
> > plus the C++ FE / libstdc++ changes), and how much does this affect
> > code generation and overall stability of the PowerPC big endian existing
> > targets.
> > 
> > Jakub
> > 
> 
> The three pieces that are somewhat controversial for non-powerpc targets
> are 9/26, 10/26, 15/26.

I forgot to mention that these bits have all been upstream in trunk
since last autumn, so there's been quite a bit of burn-in at that level.
Obviously that is not the same as being burned in on 4.8, but it does
help provide a bit of confidence.

Bill

> 
>  * Uli and Alan, can you speak to any concerns for 9/26?
> 
>  * 10/26 hits libstdc++, but only in a minor way for the extract_symvers
> script; it adds a sed to ignore a string added for powerpc64le, so
> shouldn't be a problem.
> 
>  * 15/26 might be one we can do without.  I need to check with Peter
> Bergner, who originally backported Fabien's patch, but unfortunately he
> is on vacation.  That patch fixed a problem that originated on an x86
> platform.  I can try respinning the patch series without this one and
> see what breaks, or if Peter happens to see this while he's on vacation,
> perhaps he can comment.
> 
> For PowerPC targets, I have already checked out powerpc64-linux (big
> endian).  As David mentioned, I need to apply the patch series on an AIX
> machine and test it before this can be accepted.  We don't have any way
> of testing the eabi stuff, so community help would be very much
> appreciated there.
> 
> Thanks,
> Bill



Re: [4.8, PATCH 0/26] Backport Power8 and LE support

2014-03-19 Thread Jeff Law

On 03/19/14 15:03, Bill Schmidt wrote:
> On Wed, 2014-03-19 at 21:05 +0100, Jakub Jelinek wrote:
>
>> I guess the most important question is what guarantees there are that it
>> won't affect non-powerpc* ports too much (my main concern is the 9/26 patch,
>> plus the C++ FE / libstdc++ changes), and how much does this affect
>> code generation and overall stability of the PowerPC big endian existing
>> targets.
>>
>> Jakub
>
> The three pieces that are somewhat controversial for non-powerpc targets
> are 9/26, 10/26, 15/26.
>
>   * Uli and Alan, can you speak to any concerns for 9/26?

I've got no concerns about 9/26.  Uli, Alan and myself worked through 
this pretty thoroughly.  I've had those in the back of my mind as 
something we're going to want to make sure to pull in.


Jeff



PR libstdc++/60587

2014-03-19 Thread Jonathan Wakely

I'm debugging http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60587 and
have found a number of problems.

Firstly, the bug report is correct, this overload dereferences the
__other argument without checking if that is OK:

  template
inline bool
__foreign_iterator_aux3(const _Safe_iterator<_Iterator, _Sequence>& __it,
  _InputIterator __other,
  std::true_type)

Secondly, in this testcase we should never even have reached that
overload, because we should have gone to this overload of _aux2:

  template
inline bool
__foreign_iterator_aux2(const _Safe_iterator<_Iterator, _Sequence>& __it,
const _Safe_iterator<_OtherIterator, _Sequence>& __other,
std::input_iterator_tag)
{ return __it._M_get_sequence() != __other._M_get_sequence(); }

However that is not chosen by overload resolution because this is a better
match when __other is non-const:

  template
inline bool
__foreign_iterator_aux2(const _Safe_iterator<_Iterator, _Sequence>& __it,
  _InputIterator __other,
  std::random_access_iterator_tag)

Fixing the overload resolution bug makes the testcase in the PR pass,
but the underlying problem of dereferencing an invalid iterator still
exists and can be shown by changing the testcase slightly:

#define _GLIBCXX_DEBUG
#include <vector>
int main() {
std::vector a;
std::vector b;
a.push_back(1);
a.insert(a.end(), b.begin(), b.end());
}

That still dereferences b.begin(), but that too can be fixed (either
as suggested in the PR or by passing the begin and end iterators into
the __foreign_iter function) but I think there's still another
problem.
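
Passing both ends of the range makes it possible to avoid the dereference
when the range is empty. The following is a minimal sketch of that idea and
not the libstdc++ implementation: range_is_foreign is a hypothetical helper,
it is written for std::vector only, and like the check under discussion it
inspects just the first element of the range.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <vector>

// Hypothetical helper: reports whether [first, last) starts outside the
// storage owned by seq.  An empty range is never dereferenced; it cannot
// alias anything, so it is reported as foreign.
template<typename T, typename Iter>
bool range_is_foreign(const std::vector<T>& seq, Iter first, Iter last)
{
  if (first == last || seq.empty())
    return true;                              // nothing to dereference
  const void* lo = seq.data();
  const void* hi = seq.data() + seq.size();
  const void* p = std::addressof(*first);     // safe: range is non-empty
  std::less<const void*> before;              // total order on pointers
  return before(p, lo) || !before(p, hi);
}
```

With a holding one element and b empty, range_is_foreign(a, b.begin(),
b.end()) returns before b.begin() is ever dereferenced.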

I'm looking again at the code that attempts to check if we have
contiguous storage:

  if (std::addressof(*(__it._M_get_sequence()->_M_base().end() - 1))
  - std::addressof(*(__it._M_get_sequence()->_M_base().begin()))
  == __it._M_get_sequence()->size() - 1)

Are we really sure that ensures contiguous iterators? What if we have
a deque with three blocks laid out in memory like this:
 
 1XXX3XXx2XXX

 ^  ^
 begin()end()

1 is the start of the first block, 2 is the start of the second block
and 3 is the start of the third block.
X is an element, x is reserved but uninitialized capacity
. is unallocated memory (or memory not used by the deque)

Here we have end() - begin() == size() but non-contiguous memory.
If the __other iterator happens to point to the unallocated memory
between 1 and 3 then it will appear to be part of the deque, but
isn't.
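
The scenario can be checked with plain pointer arithmetic. This is a sketch
under assumptions, not deque internals: one array stands in for the address
space, blocks hold four slots each, and the exact gap sizes are chosen here
for illustration. passes_span_check and inside_span are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>

// The address-difference test quoted above, reduced to raw pointers:
// distance between first and last element equals size - 1.
inline bool passes_span_check(const int* first_elem, const int* last_elem,
                              std::size_t size)
{
  return static_cast<std::size_t>(last_elem - first_elem) == size - 1;
}

// The follow-up range test: does p lie in [first_elem, last_elem]?
inline bool inside_span(const int* first_elem, const int* last_elem,
                        const int* p)
{
  return !(p < first_elem) && !(last_elem < p);
}

// Assumed layout inside one int memory[20]:
//   block 1 at [0, 4)    four elements (logically first)
//   gap     at [4, 8)    not owned by the sequence
//   block 3 at [8, 12)   three elements plus one spare slot (logically last)
//   gap     at [12, 16)
//   block 2 at [16, 20)  four elements (logically second)
// The sequence holds 4 + 4 + 3 = 11 elements; its first element is
// memory[0] and its last element is memory[10].
```

For this layout passes_span_check(&memory[0], &memory[10], 11) holds even
though the storage is not contiguous, and inside_span accepts &memory[5],
which lies in the gap and is not an element of the sequence.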

I think the safe thing to do is (as I suggested at the time) to have a
trait saying which iterator types refer to contiguous memory. Our
debug mode only supports our own containers, so the ones which are
contiguous are known.  For 4.9.0 I think the right option is simply
to remove __foreign_iterator_aux3 and __foreign_iterator_aux4
completely. The fixed version of __foreign_iterator_aux2() can detect
when we have iterators referring to the same sequence, which is what
we really want to detect. That's what the attached patch does and what
I'm going to test.


--- debug/functions.h.orig  2014-03-19 21:34:43.038647394 +
+++ debug/functions.h   2014-03-19 21:35:53.502617461 +
@@ -175,62 +175,6 @@
   return __first;
 }
 
-#if __cplusplus >= 201103L
-  // Default implementation.
-  template
-inline bool
-__foreign_iterator_aux4(const _Safe_iterator<_Iterator, _Sequence>& __it,
-   typename _Sequence::const_pointer __begin,
-   typename _Sequence::const_pointer __other)
-{
-  typedef typename _Sequence::const_pointer _PointerType;
-  constexpr std::less<_PointerType> __l{};
-
-  return (__l(__other, __begin)
- || __l(std::addressof(*(__it._M_get_sequence()->_M_base().end()
- - 1)), __other));
-}
-
-  // Fallback when address type cannot be implicitely casted to sequence
-  // const_pointer.
-  template
-inline bool
-__foreign_iterator_aux4(const _Safe_iterator<_Iterator, _Sequence>&,
-   _InputIterator, ...)
-{ return true; }
-
-  template
-inline bool
-__foreign_iterator_aux3(const _Safe_iterator<_Iterator, _Sequence>& __it,
-   _InputIterator __other,
-   std::true_type)
-{
-  // Only containers with all elements in contiguous memory can have their
-  // elements passed through pointers.
-  // Arithmetics is here just to make sure we are not dereferencing
-  // past-the-end iterator.
-  if (__it._M_get_sequence()->_M_base().begin()
- != __it._M_get_sequence()->_M_base().end())
-   if (std::addressof(*(__it._M_get_sequence()->_M_base().end() - 1))
-   - std::addressof(*(__it._M_get_sequence()->_M_base().begin()))
-   == __it._M_get_sequence()->size() - 1)
- return (__foreign_iterator_aux4
-

Re: [Patch, Fortran] PRs 60283/60543: Fix two wrong-code bugs related for implicit pure

2014-03-19 Thread Tobias Burnus

Paul Richard Thomas wrote:
> The patch looks OK to me.  If nothing else, it offers a
> rationalisation of all the lines of code that unset the attribute!
>
> I am somewhat puzzled by "Note: I failed to create a test case",
> whereas I find one at the end of the patch.  Can you explain what you
> mean?


What I meant was that I failed to create a run-time test case that
fails without the patch. However, after I wrote that, I saw that there
is a dg-* directive that allows checking the .mod file for a string.
That's why I could include a test case.


Committed to the trunk as Rev. 208687.

While looking at the patch again for backporting, I saw that I had
missed the following parts. I will commit them tomorrow as obvious,
unless someone protests.


Tobias
2014-03-19  Tobias Burnus  

	PR fortran/60543
	* io.c (check_io_constraints): Use gfc_unset_implicit_pure.
	* resolve.c (resolve_ordinary_assign): Ditto.

Index: gcc/fortran/io.c
===
--- gcc/fortran/io.c	(Revision 208687)
+++ gcc/fortran/io.c	(Arbeitskopie)
@@ -3259,9 +3259,8 @@ if (condition) \
 		 "an internal file in a PURE procedure",
 		 io_kind_name (k));
 
-  if (gfc_implicit_pure (NULL) && (k == M_READ || k == M_WRITE))
-	gfc_current_ns->proc_name->attr.implicit_pure = 0;
-
+  if (k == M_READ || k == M_WRITE)
+	gfc_unset_implicit_pure (NULL);
 }
 
   if (k != M_READ)
Index: gcc/fortran/resolve.c
===
--- gcc/fortran/resolve.c	(Revision 208687)
+++ gcc/fortran/resolve.c	(Arbeitskopie)
@@ -9165,7 +9165,7 @@ resolve_ordinary_assign (gfc_code *code, gfc_names
   if (lhs->expr_type == EXPR_VARIABLE
 	&& lhs->symtree->n.sym != gfc_current_ns->proc_name
 	&& lhs->symtree->n.sym->ns != gfc_current_ns)
-	gfc_current_ns->proc_name->attr.implicit_pure = 0;
+	gfc_unset_implicit_pure (NULL);
 
   if (lhs->ts.type == BT_DERIVED
 	&& lhs->expr_type == EXPR_VARIABLE
@@ -9173,11 +9173,11 @@ resolve_ordinary_assign (gfc_code *code, gfc_names
 	&& rhs->expr_type == EXPR_VARIABLE
 	&& (gfc_impure_variable (rhs->symtree->n.sym)
 		|| gfc_is_coindexed (rhs)))
-	gfc_current_ns->proc_name->attr.implicit_pure = 0;
+	gfc_unset_implicit_pure (NULL);
 
   /* Fortran 2008, C1283.  */
   if (gfc_is_coindexed (lhs))
-	gfc_current_ns->proc_name->attr.implicit_pure = 0;
+	gfc_unset_implicit_pure (NULL);
 }
 
   /* F2008, 7.2.1.2.  */


Re: PR libstdc++/60587

2014-03-19 Thread Jonathan Wakely

On 19/03/14 21:39 +, Jonathan Wakely wrote:
> I think the safe thing to do is (as I suggested at the time) to have a
> trait saying which iterator types refer to contiguous memory. Our
> debug mode only supports our own containers, so the ones which are
> contiguous are known.  For 4.9.0 I think the right option is simply
> to remove __foreign_iterator_aux3 and __foreign_iterator_aux4
> completely. The fixed version of __foreign_iterator_aux2() can detect
> when we have iterators referring to the same sequence, which is what
> we really want to detect. That's what the attached patch does and what
> I'm going to test.


With my suggested change we get an XPASS for
testsuite/23_containers/vector/debug/57779_neg.cc

An __is_contiguous trait would solve that.



Re: PR libstdc++/60587

2014-03-19 Thread Paolo Carlini
Hi

> On 19/mar/2014, at 23:28, Jonathan Wakely  wrote:
> 
>> On 19/03/14 21:39 +, Jonathan Wakely wrote:
>> I think the safe thing to do is (as I suggested at the time) to have a
>> trait saying which iterator types refer to contiguous memory. Our
>> debug mode only supports our own containers, so the ones which are
>> contiguous are known.  For 4.9.0 I think the right option is simply
>> to remove __foreign_iterator_aux3 and __foreign_iterator_aux4
>> completely. The fixed version of __foreign_iterator_aux2() can detect
>> when we have iterators referring to the same sequence, which is what
>> we really want to detect. That's what the attached patch does and what
>> I'm going to test.
> 
> With my suggested change we get an XPASS for
> testsuite/23_containers/vector/debug/57779_neg.cc
> 
> An __is_contiguous trait would solve that.

Funny, I thought we already had it...

Paolo


[patch committed SH] Fix target/60039

2014-03-19 Thread Kaz Kojima
I've committed the attached patch to fix PR target/60039
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60039
which is a regression from 4.5 for some sh3 users.
Tested on sh4-unknown-linux-gnu with -mdiv=call-div1.
I'd like to backport it to 4.8 in a week or two as usual.

Regards,
kaz
--
2014-03-19  Kaz Kojima  

PR target/60039
* config/sh/sh.md (udivsi3_i1): Clobber R1 register.

--- ORIG/trunk/gcc/config/sh/sh.md  2014-03-02 09:49:58.0 +0900
+++ trunk/gcc/config/sh/sh.md   2014-03-18 14:43:26.515319735 +0900
@@ -2314,6 +2314,7 @@
(udiv:SI (reg:SI R4_REG) (reg:SI R5_REG)))
(clobber (reg:SI T_REG))
(clobber (reg:SI PR_REG))
+   (clobber (reg:SI R1_REG))
(clobber (reg:SI R4_REG))
(use (match_operand:SI 1 "arith_reg_operand" "r"))]
   "TARGET_SH1 && TARGET_DIVIDE_CALL_DIV1"

