Re: [PATCH] [ARM, Callgraph] Fix PR67280: function incorrectly marked as nothrow

2015-09-07 Thread Charles Baylis
On 2 September 2015 at 13:09, Jan Hubicka  wrote:
> It kind of sucks that one needs to mind this flag each time one creates edge,
> but setting the value in create_edge is not quite correct as that one does not
> have any information on where the call appears and if the exception is not 
> handled
> locally.

OK.

>> >gcc/ChangeLog:
>> >
>> >2015-08-28  Charles Baylis  
>> >
>> > * cgraphunit.c (cgraph_node::create_wrapper): Set 
>> > can_throw_external
>> > in new callgraph edge.
>> Ultimately Jan's call.
>
> This is OK.
> Thanks for looking into this!

Thanks for the review!

FWIW, the patch successfully bootstrapped on arm-linux-gnueabihf

Committed to trunk as r227407.

Are you happy for me to backport to gcc-5-branch?

Charles


RE: [PATCH] [MIPS] Fix wrong instruction in the delay slot

2015-09-07 Thread Matthew Fortune
Robert Suchanek 
> IMO, the fix is to recognize the empty basic block that has a code_label
> followed by a barrier (ignoring notes and debug_insns), forbid going
> beyond the barrier if the empty block is found in
> skip_consecutive_labels () and first_active_target_insn ().

I can't approve this but I agree that the delay slot filler should not be
allowed to look past barriers to fill delay slots (especially if moving
a memory operation).
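
To make the proposed check concrete, here is a minimal sketch on a toy
instruction list. It is not GCC's RTL API: `ToyInsn`, `Kind` and this
standalone `label_with_barrier_p` are hypothetical stand-ins that only model
"a label followed by a barrier, ignoring notes and debug insns".

```cpp
#include <cstddef>

// Toy model of an insn chain; these types are illustrative only.
enum class Kind { Label, Note, DebugInsn, Insn, Barrier };

struct ToyInsn {
  Kind kind;
  const ToyInsn* next;
};

// Return true iff LABEL is a label whose block is empty, i.e. only notes
// and debug insns sit between it and the following barrier.
bool label_with_barrier_p(const ToyInsn* label)
{
  if (label == nullptr || label->kind != Kind::Label)
    return false;

  for (const ToyInsn* p = label->next; p != nullptr; p = p->next) {
    if (p->kind == Kind::Barrier)
      return true;                       // reached the barrier: empty block
    if (p->kind != Kind::Note && p->kind != Kind::DebugInsn)
      return false;                      // something real lives in the block
  }
  return false;                          // end of chain without a barrier
}
```

A delay-slot filler built on this model would stop at such a label instead of
looking for candidate instructions beyond the barrier.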

The redundant/no-op branch here is also annoying but the delay slot issue
does seem to be an independent problem. I've seen lots of pointless branches
being generated by GCC that get all the way through to emitting code. Perhaps
this is just one of several reasons for generating either always or never
taken conditional branches. They end up as things like BEQ $4, $4 or
BNE $4, $4, which MIPS R6 converts to 'B' or 'nothing' respectively. I have
not had time to track their origin.

Thanks,
Matthew

> 
> The patch was cross tested on mips-img-linux-gnu and sparc-linux-gnu.
> 
> 
> Regards,
> Robert
> 
> gcc/
>   * reorg.c (label_with_barrier_p): New function.
>   (skip_consecutive_labels): Use it.  Don't skip the label if an
> empty
>   block is found.
>   (first_active_target_insn): Likewise.  Don't ignore the empty
>   block when searching for the next active instruction.
> 
> gcc/testsuite
>   * gcc.target/mips/builtin-unreachable-bug-1.c: New test.
> ---
>  gcc/reorg.c| 28 +++
>  .../gcc.target/mips/builtin-unreachable-bug-1.c| 90
> ++
>  2 files changed, 118 insertions(+)
>  create mode 100644 gcc/testsuite/gcc.target/mips/builtin-unreachable-
> bug-1.c
> 
> diff --git a/gcc/reorg.c b/gcc/reorg.c
> index cdaa60c..269f666 100644
> --- a/gcc/reorg.c
> +++ b/gcc/reorg.c
> @@ -145,6 +145,30 @@ along with GCC; see the file COPYING3.  If not see
> These functions are now only used here in reorg.c, and have
> therefore
> been moved here to avoid inadvertent misuse elsewhere in the
> compiler.  */
> 
> +/* Return true iff a LABEL is followed by a BARRIER.  Ignore notes and
> debug
> +   instructions.  */
> +
> +static bool
> +label_with_barrier_p (rtx_insn *label)
> +{
> +  bool empty_bb = true;
> +
> +  if (GET_CODE (label) != CODE_LABEL)
> +empty_bb = false;
> +  else
> +label = NEXT_INSN (label);
> +
> +  while (!BARRIER_P (label) && empty_bb)
> +  {
> +if (!(DEBUG_INSN_P (label)
> +   || NOTE_P (label)))
> +  empty_bb = false;
> +label = NEXT_INSN (label);
> +  }
> +
> +  return empty_bb;
> +}
> +
>  /* Return the last label to mark the same position as LABEL.  Return
> LABEL
> itself if it is null or any return rtx.  */
> 
> @@ -159,6 +183,8 @@ skip_consecutive_labels (rtx label_or_return)
>rtx_insn *label = as_a <rtx_insn *> (label_or_return);
> 
>for (insn = label; insn != 0 && !INSN_P (insn); insn = NEXT_INSN
> (insn))
> +if (LABEL_P (insn) && label_with_barrier_p (insn))
> +  break;
>  if (LABEL_P (insn))
>label = insn;
> 
> @@ -267,6 +293,8 @@ first_active_target_insn (rtx insn)  {
>if (ANY_RETURN_P (insn))
>  return insn;
> +  if (LABEL_P (insn) && label_with_barrier_p (as_a <rtx_insn *>
> (insn)))
> +return NULL_RTX;
>return next_active_insn (as_a <rtx_insn *> (insn));  }
> 
> 
> diff --git a/gcc/testsuite/gcc.target/mips/builtin-unreachable-bug-1.c
> b/gcc/testsuite/gcc.target/mips/builtin-unreachable-bug-1.c
> new file mode 100644
> index 000..65eb9a3
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/mips/builtin-unreachable-bug-1.c
> @@ -0,0 +1,90 @@
> +/* { dg-options "-mcompact-branches=never -G0 -mno-abicalls" } */
> +/* { dg-final { scan-assembler-not
> +"beq\t\\\$4,\\\$0,\\\$L\[0-9\]+\n\tlw\t\\\$4,%lo\\(mips_cm_is64\\)\\(\\
> +\$4\\)" } } */
> +
> +enum
> +{
> +  CPU_R4700,
> +  CPU_R5000,
> +  CPU_R5500,
> +  CPU_NEVADA,
> +  CPU_RM7000,
> +  CPU_SR71000,
> +  CPU_4KC,
> +  CPU_4KEC,
> +  CPU_4KSC,
> +  CPU_24K,
> +  CPU_34K,
> +  CPU_1004K,
> +  CPU_74K,
> +  CPU_ALCHEMY,
> +  CPU_PR4450,
> +  CPU_BMIPS32,
> +  CPU_BMIPS3300,
> +  CPU_BMIPS5000,
> +  CPU_JZRISC,
> +  CPU_M14KC,
> +  CPU_M14KEC,
> +  CPU_INTERAPTIV,
> +  CPU_P5600,
> +  CPU_PROAPTIV,
> +  CPU_1074K,
> +  CPU_M5150,
> +  CPU_I6400,
> +  CPU_R3000,
> +  CPU_5KC,
> +  CPU_5KE,
> +  CPU_20KC,
> +  CPU_25KF,
> +  CPU_SB1,
> +  CPU_SB1A,
> +  CPU_XLP,
> +  CPU_QEMU_GENERIC
> +};
> +
> +struct cpuinfo_mips
> +{
> +  long options;
> +  int isa_level;
> +} cpu_data[1];
> +
> +struct thread_info
> +{
> +  int cpu;
> +};
> +
> +int a, b, c, d, e, mips_cm_is64;
> +
> +static int __attribute__ ((__cold__))
> +mips_sc_probe_cm3 ()
> +{
> +  struct thread_info *info;
> +  struct cpuinfo_mips *c = &cpu_data[info->cpu];
> +  if (mips_cm_is64)
> +c->options = 0;
> +  return 1;
> +}
> +
> +int
> +mips_sc_probe ()
> +{
> +  struct cpuinfo_mips ci = cpu_data[c];
> +  if (mips_cm_is64)
> +__asm__("" ::: "memory");
> +  if (d)
> +return mips_sc_probe_cm3 ();
> +

Re: [PATCH] [ARM, Callgraph] Fix PR67280: function incorrectly marked as nothrow

2015-09-07 Thread Ramana Radhakrishnan
On Mon, Sep 7, 2015 at 9:35 AM, Charles Baylis
 wrote:
> On 2 September 2015 at 13:09, Jan Hubicka  wrote:
>> It kind of sucks that one needs to mind this flag each time one creates edge,
>> but setting the value in create_edge is not quite correct as that one does 
>> not
>> have any information on where the call appears and if the exception is not 
>> handled
>> locally.
>
> OK.
>
>>> >gcc/ChangeLog:
>>> >
>>> >2015-08-28  Charles Baylis  
>>> >
>>> > * cgraphunit.c (cgraph_node::create_wrapper): Set 
>>> > can_throw_external
>>> > in new callgraph edge.
>>> Ultimately Jan's call.
>>
>> This is OK.
>> Thanks for looking into this!
>
> Thanks for the review!
>
> FWIW, the patch successfully bootstrapped on arm-linux-gnueabihf

>
> Committed to trunk as r227407.

Missing PR number in ChangeLog - please fix the ChangeLog entry and
for your sins add a link to the revision (http://gcc.gnu.org/r227407)
in the BZ entry :)

Ramana


Re: [PATCH][ARM][3/3] Expand mod by power of 2

2015-09-07 Thread Ramana Radhakrishnan


On 24/07/15 11:55, Kyrill Tkachov wrote:
> 
> commit d562629e36ba013b8f77956a74139330d191bc30
> Author: Kyrylo Tkachov 
> Date:   Fri Jul 17 16:30:01 2015 +0100
> 
> [ARM][3/3] Expand mod by power of 2
> 
> diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
> index e1bc727..6ade07c 100644
> --- a/gcc/config/arm/arm.c
> +++ b/gcc/config/arm/arm.c
> @@ -9556,6 +9556,22 @@ arm_new_rtx_costs (rtx x, enum rtx_code code, enum 
> rtx_code outer_code,
>  
>  case MOD:
>  case UMOD:
> +  /* MOD by a power of 2 can be expanded as:
> +  rsbs    r1, r0, #0
> +  and r0, r0, #(n - 1)
> +  and r1, r1, #(n - 1)
> +  rsbpl   r0, r1, #0.  */
> +  if (code == MOD
> +   && CONST_INT_P (XEXP (x, 1))
> +   && exact_log2 (INTVAL (XEXP (x, 1))) > 0
> +   && mode == SImode)
> + {
> +   *cost += COSTS_N_INSNS (3)
> ++ 2 * extra_cost->alu.logical
> ++ extra_cost->alu.arith;
> +   return true;
> + }
> +
>*cost = LIBCALL_COST (2);
>return false;  /* All arguments must be in registers.  */
>  
> diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
> index f341109..8301648 100644
> --- a/gcc/config/arm/arm.md
> +++ b/gcc/config/arm/arm.md
> @@ -1229,7 +1229,7 @@ (define_peephole2
>""
>  )
>  
> -(define_insn "*subsi3_compare0"
> +(define_insn "subsi3_compare0"
>[(set (reg:CC_NOOV CC_REGNUM)
>   (compare:CC_NOOV
>(minus:SI (match_operand:SI 1 "arm_rhs_operand" "r,r,I")
> @@ -2158,7 +2158,7 @@ (define_expand "andsi3"
>  )
>  
>  ; ??? Check split length for Thumb-2
> -(define_insn_and_split "*arm_andsi3_insn"
> +(define_insn_and_split "arm_andsi3_insn"
>[(set (match_operand:SI 0 "s_register_operand" "=r,l,r,r,r")
>   (and:SI (match_operand:SI 1 "s_register_operand" "%r,0,r,r,r")
>   (match_operand:SI 2 "reg_or_int_operand" "I,l,K,r,?n")))]
> @@ -11105,6 +11105,78 @@ (define_expand "thumb_legacy_rev"
>""
>  )

This shouldn't be necessary - you are just adding another interface to produce 
an and insn.

>  
> +;; ARM-specific expansion of signed mod by power of 2
> +;; using conditional negate.
> +;; For r0 % n where n is a power of 2 produce:
> +;; rsbs    r1, r0, #0
> +;; and r0, r0, #(n - 1)
> +;; and r1, r1, #(n - 1)
> +;; rsbpl   r0, r1, #0
> +
> +(define_expand "modsi3"
> +  [(match_operand:SI 0 "register_operand" "")
> +   (match_operand:SI 1 "register_operand" "")
> +   (match_operand:SI 2 "const_int_operand" "")]
> +  "TARGET_32BIT"
> +  {
> +HOST_WIDE_INT val = INTVAL (operands[2]);
> +
> +if (val <= 0
> +   || exact_log2 (INTVAL (operands[2])) <= 0
> +   || !const_ok_for_arm (INTVAL (operands[2]) - 1))
> +  FAIL;
> +
> +rtx mask = GEN_INT (val - 1);
> +
> +/* In the special case of x0 % 2 we can do the even shorter:
> + cmp r0, #0
> + and r0, r0, #1
> + rsblt   r0, r0, #0.  */
> +
> +if (val == 2)
> +  {
> + rtx cc_reg = gen_rtx_REG (CCmode, CC_REGNUM);
> + rtx cond = gen_rtx_LT (SImode, cc_reg, const0_rtx);
> +
> + emit_insn (gen_rtx_SET (cc_reg,
> + gen_rtx_COMPARE (CCmode, operands[1], const0_rtx)));
> +
> + rtx masked = gen_reg_rtx (SImode);
> + emit_insn (gen_arm_andsi3_insn (masked, operands[1], mask));

Use emit_insn (gen_andsi3 (masked, operands[1], mask) instead and likewise 
below.
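
For reference, the arithmetic behind the four-instruction sequence in the
patch comment can be sketched in portable code. `mod_pow2` is a hypothetical
helper, not part of the patch, and it assumes n is a power of two between 2
and 2^30; unsigned arithmetic is used so that negating INT32_MIN is well
defined.

```cpp
#include <cstdint>

// Signed x % n for a power-of-two n, mirroring rsbs/and/and/rsbpl.
std::int32_t mod_pow2(std::int32_t x, std::uint32_t n)
{
  const std::uint32_t mask = n - 1;
  const std::uint32_t ux = static_cast<std::uint32_t>(x);
  const std::uint32_t neg = 0u - ux;            // rsbs  r1, r0, #0
  const std::uint32_t pos_masked = ux & mask;   // and   r0, r0, #(n - 1)
  const std::uint32_t neg_masked = neg & mask;  // and   r1, r1, #(n - 1)
  const bool pl = (neg >> 31) == 0;             // N flag clear after rsbs
  if (pl)
    return -static_cast<std::int32_t>(neg_masked);  // rsbpl r0, r1, #0
  return static_cast<std::int32_t>(pos_masked);
}
```

The result matches C's truncating `%`: for example -5 % 4 is -1, because the
mask is applied to the negated value and the sign is restored afterwards.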


> + emit_move_insn (operands[0],
> + gen_rtx_IF_THEN_ELSE (SImode, cond,
> +   gen_rtx_NEG (SImode,
> +masked),
> +   masked));
> + DONE;
> +  }
> +
> +rtx neg_op = gen_reg_rtx (SImode);
> +rtx_insn *insn = emit_insn (gen_subsi3_compare0 (neg_op, const0_rtx,
> +   operands[1]));
> +
> +/* Extract the condition register and mode.  */
> +rtx cmp = XVECEXP (PATTERN (insn), 0, 0);
> +rtx cc_reg = SET_DEST (cmp);
> +rtx cond = gen_rtx_GE (SImode, cc_reg, const0_rtx);
> +
> +emit_insn (gen_arm_andsi3_insn (operands[0], operands[1], mask));
> +
> +rtx masked_neg = gen_reg_rtx (SImode);
> +emit_insn (gen_arm_andsi3_insn (masked_neg, neg_op, mask));
> +
> +/* We want a conditional negate here, but emitting COND_EXEC rtxes
> +   during expand does not always work.  Do an IF_THEN_ELSE instead.  */
> +emit_move_insn (operands[0],
> + gen_rtx_IF_THEN_ELSE (SImode, cond,
> +   gen_rtx_NEG (SImode, masked_neg),
> +   operands[0]));
> +
> +
> +DONE;
> +  }
> +)
> +
>  (define_expand "bswapsi2"
>[(set (match_operand:SI 0 "s_register_operand" "=r")
>   (bswap:SI (match_operand:SI 1 "s_register_operand" "r")))]
> diff --git a/gcc/testsuite/gcc.target/aarch64/mod_2.c 
> b/gcc/testsuite/gcc.target/aarch64/mod_2.c
> new 

Re: [PATCH, PR 57195] Allow mode iterators inside angle brackets

2015-09-07 Thread Richard Sandiford
Michael Collison  writes:
> This patch allows mode iterators inside angle brackets in machine
> description files. I discovered the issue when attempting to use
> iterators on match_operand's as follows:
>
> match_operand: 0 "s_register_operand" "=w")
>
> The function 'read_name' is not properly handling ':' inside angle brackets.
>
> Bootstrapped on arm-linux.

Sorry for the slow review.

> diff --git a/gcc/read-md.c b/gcc/read-md.c
> index 9f158ec..0171fb0 100644
> --- a/gcc/read-md.c
> +++ b/gcc/read-md.c
> @@ -399,17 +399,25 @@ read_name (struct md_name *name)
>   {
> int c;
> size_t i;
> +  bool in_angle_bracket;
>
> c = read_skip_spaces ();
>
> i = 0;
> +  in_angle_bracket = false;
> while (1)
>   {
> +  if (c == '<')
> +in_angle_bracket = true;
> +
> +  if (c == '>')
> +in_angle_bracket = false;
> +
> if (c == ' ' || c == '\n' || c == '\t' || c == '\f' || c == '\r'
> || c == EOF)
>   break;
> -  if (c == ':' || c == ')' || c == ']' || c == '"' || c == '/'
> -  || c == '(' || c == '[')
> +  if (((c == ':') and (!in_angle_bracket)) || c == ')' || c == ']'
> +  || c == '"' || c == '/' || c == '(' || c == '[')
>   {
> unread_char (c);
> break;

I think we should have a nesting depth rather than a boolean.
It also seems more natural to skip the final "if" statement above when
inside an angle bracket, rather than treating ':' as a special case.
(We'd still break at the end of the line in the case of a missing '>',
so the error reporting shouldn't be too bad.)

I suppose logically '>' with a nesting depth of 0 should also break
the loop.
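
Something along these lines is what I have in mind. This is only a sketch
over a plain string, not a replacement for read_name: `scan_name` and its
buffer interface are hypothetical, and the real function reads characters
with read_char/unread_char instead.

```cpp
#include <cstddef>
#include <cstring>

// Copy one name token from SRC into BUF (truncating if needed) and return
// the number of characters consumed.  Delimiters are ignored while inside
// angle brackets; whitespace always ends the name.
std::size_t scan_name(const char* src, char* buf, std::size_t bufsize)
{
  std::size_t i = 0, out = 0;
  int depth = 0;                          // nesting depth of '<' ... '>'

  for (;; ++i) {
    const char c = src[i];
    if (c == '\0' || c == ' ' || c == '\n' || c == '\t'
        || c == '\f' || c == '\r')
      break;                              // also catches a missing '>'
    if (c == '<')
      ++depth;
    else if (c == '>') {
      if (depth == 0)
        break;                            // unbalanced '>' ends the name
      --depth;
    } else if (depth == 0 && std::strchr(":)]\"/([", c) != nullptr)
      break;                              // ordinary delimiter outside <>
    if (out + 1 < bufsize)
      buf[out++] = c;
  }
  if (bufsize > 0)
    buf[out] = '\0';
  return i;
}
```

With a depth counter, a ':' inside angle brackets is kept as part of the
name, while the same character outside them still terminates it.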

Thanks for fixing this.

Richard



Re: [PATCH] Fix PR64078

2015-09-07 Thread Marek Polacek
On Sun, Sep 06, 2015 at 07:21:13PM +0200, Bernd Edlinger wrote:
> Hi,
> 
> we observed sporadic failures of the following two test cases (see PR64078):
> c-c++-common/ubsan/object-size-9.c and c-c++-common/ubsan/object-size-10.c
> 
> For object-size-9.c this happens in a reproducible way when -fpic option is 
> used:
> If that option is used, it is slightly less desirable to inline the 
> functions, but if an explicit
> "inline" is added, the function is still in-lined, even if -fpic is used.

So if we rely on the function being inlined I think it would be better to add
the always_inline attribute.
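
For illustration only, the attribute would be spelled roughly as below.
`load_elem` and `sum3` are made-up names, not the functions in the ubsan
tests, and `__attribute__((always_inline))` is a GNU extension.

```cpp
// Inlined regardless of the inliner's cost heuristics (e.g. under -fpic).
static inline __attribute__((always_inline)) int
load_elem(const int* p, int i)
{
  return p[i];
}

// Caller: each load_elem call is expanded in place here.
int sum3(const int* p)
{
  return load_elem(p, 0) + load_elem(p, 1) + load_elem(p, 2);
}
```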

Marek


Re: [patch] libstdc++/65704 portable timed_mutex and recursive_timed_mutex

2015-09-07 Thread Jonathan Wakely

One more patch for timed mutexes, to enable two more tests on darwin
that are supported by the new implementations.

Tested powerpc64le-linux, committed to trunk.

commit 917a1e218c46a1bfcd9b2368a9e0f51c13f6a387
Author: Jonathan Wakely 
Date:   Mon Sep 7 11:34:59 2015 +0100

Enable timed mutex unlock tests on darwin.

	* testsuite/30_threads/recursive_timed_mutex/unlock/2.cc: Run on
	darwin.
	* testsuite/30_threads/timed_mutex/unlock/2.cc: Run on darwin.

diff --git a/libstdc++-v3/testsuite/30_threads/recursive_timed_mutex/unlock/2.cc b/libstdc++-v3/testsuite/30_threads/recursive_timed_mutex/unlock/2.cc
index fc7e2ab..26ca5c5 100644
--- a/libstdc++-v3/testsuite/30_threads/recursive_timed_mutex/unlock/2.cc
+++ b/libstdc++-v3/testsuite/30_threads/recursive_timed_mutex/unlock/2.cc
@@ -1,7 +1,7 @@
 // { dg-do run { target *-*-freebsd* *-*-dragonfly* *-*-netbsd* *-*-linux* *-*-gnu* *-*-solaris* *-*-cygwin *-*-rtems* powerpc-ibm-aix* } }
 // { dg-options " -std=gnu++11 -pthread" { target *-*-freebsd* *-*-dragonfly* *-*-netbsd* *-*-linux* *-*-gnu* powerpc-ibm-aix* } }
 // { dg-options " -std=gnu++11 -pthreads" { target *-*-solaris* } }
-// { dg-options " -std=gnu++11 " { target *-*-cygwin *-*-rtems* } }
+// { dg-options " -std=gnu++11 " { target *-*-cygwin *-*-rtems* *-*-darwin* } }
 // { dg-require-cstdint "" }
 // { dg-require-gthreads "" }
 
diff --git a/libstdc++-v3/testsuite/30_threads/timed_mutex/unlock/2.cc b/libstdc++-v3/testsuite/30_threads/timed_mutex/unlock/2.cc
index a492f88..94a542f 100644
--- a/libstdc++-v3/testsuite/30_threads/timed_mutex/unlock/2.cc
+++ b/libstdc++-v3/testsuite/30_threads/timed_mutex/unlock/2.cc
@@ -1,7 +1,7 @@
 // { dg-do run { target *-*-freebsd* *-*-dragonfly* *-*-netbsd* *-*-linux* *-*-gnu* *-*-solaris* *-*-cygwin *-*-rtems* powerpc-ibm-aix* } }
 // { dg-options " -std=gnu++11 -pthread" { target *-*-freebsd* *-*-dragonfly* *-*-netbsd* *-*-linux* *-*-gnu* powerpc-ibm-aix* } }
 // { dg-options " -std=gnu++11 -pthreads" { target *-*-solaris* } }
-// { dg-options " -std=gnu++11 " { target *-*-cygwin *-*-rtems* } }
+// { dg-options " -std=gnu++11 " { target *-*-cygwin *-*-rtems* *-*-darwin* } }
 // { dg-require-cstdint "" }
 // { dg-require-gthreads "" }
 


[PATCH] 2015-09-03 Benedikt Huber Philipp Tomsich

2015-09-07 Thread Benedikt Huber
* config/aarch64/aarch64-builtins.c: Builtins for rsqrt and
rsqrtf.
* config/aarch64/aarch64-opts.h: -mrecip has a default value
depending on the core.
* config/aarch64/aarch64-protos.h: Declare.
* config/aarch64/aarch64-simd.md: Matching expressions for
frsqrte and frsqrts.
* config/aarch64/aarch64-tuning-flags.def: Added
MRECIP_DEFAULT_ENABLED.
* config/aarch64/aarch64.c: New functions. Emit rsqrt
estimation code in fast math mode.
* config/aarch64/aarch64.md: Added enum entries.
* config/aarch64/aarch64.opt: Added options -mrecip and
-mlow-precision-recip-sqrt.
* testsuite/gcc.target/aarch64/rsqrt-asm-check.c: Assembly scans
for frsqrte and frsqrts
* testsuite/gcc.target/aarch64/rsqrt.c: Functional tests for rsqrt.

Signed-off-by: Philipp Tomsich 
---
 gcc/ChangeLog  |  21 
 gcc/config/aarch64/aarch64-builtins.c  | 107 +++
 gcc/config/aarch64/aarch64-opts.h  |   7 ++
 gcc/config/aarch64/aarch64-protos.h|   3 +
 gcc/config/aarch64/aarch64-simd.md |  27 +
 gcc/config/aarch64/aarch64-tuning-flags.def|   1 +
 gcc/config/aarch64/aarch64.c   | 113 -
 gcc/config/aarch64/aarch64.md  |   3 +
 gcc/config/aarch64/aarch64.opt |   8 ++
 gcc/doc/invoke.texi|  19 
 gcc/testsuite/gcc.target/aarch64/rsqrt-asm-check.c |  63 
 gcc/testsuite/gcc.target/aarch64/rsqrt.c   | 107 +++
 12 files changed, 474 insertions(+), 5 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/rsqrt-asm-check.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/rsqrt.c

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 77fb2c1..382f6b3 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,24 @@
+2015-09-03  Benedikt Huber  
+   Philipp Tomsich  
+
+   * config/aarch64/aarch64-builtins.c: Builtins for rsqrt and
+   rsqrtf.
+   * config/aarch64/aarch64-opts.h: -mrecip has a default value
+   depending on the core.
+   * config/aarch64/aarch64-protos.h: Declare.
+   * config/aarch64/aarch64-simd.md: Matching expressions for
+   frsqrte and frsqrts.
+   * config/aarch64/aarch64-tuning-flags.def: Added
+   MRECIP_DEFAULT_ENABLED.
+   * config/aarch64/aarch64.c: New functions. Emit rsqrt
+   estimation code in fast math mode.
+   * config/aarch64/aarch64.md: Added enum entries.
+   * config/aarch64/aarch64.opt: Added options -mrecip and
+   -mlow-precision-recip-sqrt.
+   * testsuite/gcc.target/aarch64/rsqrt-asm-check.c: Assembly scans
+   for frsqrte and frsqrts
+   * testsuite/gcc.target/aarch64/rsqrt.c: Functional tests for rsqrt.
+
 2015-09-02  Charles Baylis  
 
* cgraphunit.c (cgraph_node::create_wrapper): Set can_throw_external
diff --git a/gcc/config/aarch64/aarch64-builtins.c 
b/gcc/config/aarch64/aarch64-builtins.c
index e3a90b5..729f384 100644
--- a/gcc/config/aarch64/aarch64-builtins.c
+++ b/gcc/config/aarch64/aarch64-builtins.c
@@ -337,6 +337,11 @@ enum aarch64_builtins
   AARCH64_BUILTIN_GET_FPSR,
   AARCH64_BUILTIN_SET_FPSR,
 
+  AARCH64_BUILTIN_RSQRT_DF,
+  AARCH64_BUILTIN_RSQRT_SF,
+  AARCH64_BUILTIN_RSQRT_V2DF,
+  AARCH64_BUILTIN_RSQRT_V2SF,
+  AARCH64_BUILTIN_RSQRT_V4SF,
   AARCH64_SIMD_BUILTIN_BASE,
   AARCH64_SIMD_BUILTIN_LANE_CHECK,
 #include "aarch64-simd-builtins.def"
@@ -835,6 +840,43 @@ aarch64_init_crc32_builtins ()
 }
 }
 
+/* Add builtins for reciprocal square root. */
+void
+aarch64_add_builtin_rsqrt (void)
+{
+  tree fndecl = NULL;
+  tree ftype = NULL;
+
+  tree V2SF_type_node = build_vector_type (float_type_node, 2);
+  tree V2DF_type_node = build_vector_type (double_type_node, 2);
+  tree V4SF_type_node = build_vector_type (float_type_node, 4);
+
+  ftype = build_function_type_list (double_type_node, double_type_node, 
NULL_TREE);
+  fndecl = add_builtin_function ("__builtin_aarch64_rsqrt_df",
+ftype, AARCH64_BUILTIN_RSQRT_DF, BUILT_IN_MD, NULL, NULL_TREE);
+  aarch64_builtin_decls[AARCH64_BUILTIN_RSQRT_DF] = fndecl;
+
+  ftype = build_function_type_list (float_type_node, float_type_node, 
NULL_TREE);
+  fndecl = add_builtin_function ("__builtin_aarch64_rsqrt_sf",
+ftype, AARCH64_BUILTIN_RSQRT_SF, BUILT_IN_MD, NULL, NULL_TREE);
+  aarch64_builtin_decls[AARCH64_BUILTIN_RSQRT_SF] = fndecl;
+
+  ftype = build_function_type_list (V2DF_type_node, V2DF_type_node, NULL_TREE);
+  fndecl = add_builtin_function ("__builtin_aarch64_rsqrt_v2df",
+ftype, AARCH64_BUILTIN_RSQRT_V2DF, BUILT_IN_MD, NULL, NULL_TREE);
+  aarch64_builtin_decls[AARCH64_BUILTIN_RSQRT_V2DF] = fndecl;
+
+  ftype = build_function_type_list (V2SF_type_node, V2SF_type_node, NULL_TREE);
+  fndecl = add_builtin_function ("__builtin_aarch64

[PATCH v5][aarch64] Implemented reciprocal square root (rsqrt) estimation in -ffast-math

2015-09-07 Thread Benedikt Huber
This fifth revision of the patch:
 * Moves a function declaration to a header.
 * Adds comments to functions.

Ok for check in.

Benedikt Huber (1):
  2015-09-03  Benedikt Huber  
Philipp Tomsich  

 gcc/ChangeLog  |  21 
 gcc/config/aarch64/aarch64-builtins.c  | 107 +++
 gcc/config/aarch64/aarch64-opts.h  |   7 ++
 gcc/config/aarch64/aarch64-protos.h|   3 +
 gcc/config/aarch64/aarch64-simd.md |  27 +
 gcc/config/aarch64/aarch64-tuning-flags.def|   1 +
 gcc/config/aarch64/aarch64.c   | 113 -
 gcc/config/aarch64/aarch64.md  |   3 +
 gcc/config/aarch64/aarch64.opt |   8 ++
 gcc/doc/invoke.texi|  19 
 gcc/testsuite/gcc.target/aarch64/rsqrt-asm-check.c |  63 
 gcc/testsuite/gcc.target/aarch64/rsqrt.c   | 107 +++
 12 files changed, 474 insertions(+), 5 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/rsqrt-asm-check.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/rsqrt.c

-- 
1.9.1



RE: [0/7] Type promotion pass and elimination of zext/sext

2015-09-07 Thread Wilco Dijkstra
> Kugan wrote:
> 2. vector-compare-1.c from c-c++-common/torture fails to assemble with
> -O3 -g Error: unaligned opcodes detected in executable segment. It works
> fine if I remove the -g. I am looking into it and needs to be fixed as well.

This is a known assembler bug I found a while back, Renlin is looking into it.
Basically when debug tables are inserted at the end of a code section the 
assembler doesn't align to the alignment required by the debug tables.
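
In other words, the end of the code section would need to be rounded up to
the alignment of what follows it. A minimal sketch of that rounding, with a
hypothetical `align_up` helper and assuming the alignment is a power of two:

```cpp
#include <cstdint>

// Round OFFSET up to the next multiple of ALIGNMENT (a power of two).
constexpr std::uint64_t align_up(std::uint64_t offset, std::uint64_t alignment)
{
  return (offset + alignment - 1) & ~(alignment - 1);
}
```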

Wilco




Re: [wwwdocs] Re: C++ Concepts available in trunk?

2015-09-07 Thread Gerald Pfeifer
Jonathan,

On Thu, 13 Aug 2015, Jonathan Wakely wrote:
> Thanks, I've committed the attached change to the wwwdocs repo.

looking at this I noticed a reference to "Subversion", when in
general we have tried to minimize references to specific version
control systems.

And I noticed we can be a little less verbose later in that
section.

And then I noticed, those two versions actually have been
propagating from cxx0x.html to cxx1y.html to cxx1z.html over
the years, so I made essentially the same set of simplifications
to all three of them.

What do you think?  I have not committed this yet.

Gerald

Index: cxx0x.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/projects/cxx0x.html,v
retrieving revision 1.67
diff -u -r1.67 cxx0x.html
--- cxx0x.html  26 Jan 2015 11:12:43 -  1.67
+++ cxx0x.html  7 Sep 2015 10:47:28 -
@@ -18,14 +18,13 @@
   compiler to bring feature-complete C++11 to C++ programmers.
 
   C++11 features are available as part of the "mainline" GCC
-compiler in the trunk of
-GCC's Subversion
-  repository and in GCC 4.3 and later. To enable C++0x
+compiler in the trunk of GCC's repository
+and in GCC 4.3 and later. To enable C++0x
   support, add the command-line parameter -std=c++0x
   to your g++ command line. Or, to enable GNU
-  extensions in addition to C++0x extensions,
-  add -std=gnu++0x to your g++ command
-  line.  GCC 4.7 and later support -std=c++11 and
+extensions in addition to C++0x extensions,
+add -std=gnu++0x.
+  GCC 4.7 and later support -std=c++11 and
   -std=gnu++11 as well.
 
   Important: GCC's support for C++11 is still
Index: cxx1y.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/projects/cxx1y.html,v
retrieving revision 1.24
diff -u -r1.24 cxx1y.html
--- cxx1y.html  13 Aug 2015 08:36:07 -  1.24
+++ cxx1y.html  7 Sep 2015 10:47:28 -
@@ -17,14 +17,12 @@
   standard, which was published in 2014.
 
   C++14 features are available as part of the "mainline" GCC
-compiler in the trunk of
-GCC's Subversion
-  repository and in GCC 4.8 and later. To enable C++14
+compiler in the trunk of GCC's repository
+and in GCC 4.8 and later. To enable C++14
   support, add the command-line parameter -std=c++14
   to your g++ command line. Or, to enable GNU
-  extensions in addition to C++14 extensions,
-  add -std=gnu++14 to your g++ command
-  line.
+extensions in addition to C++14 extensions,
+add -std=gnu++14.
 
   Important: Because the final ISO C++14 standard was only
   recently published, GCC's support is experimental.  No 
attempt
Index: cxx1z.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/projects/cxx1z.html,v
retrieving revision 1.3
diff -u -r1.3 cxx1z.html
--- cxx1z.html  13 Aug 2015 14:32:10 -  1.3
+++ cxx1z.html  7 Sep 2015 10:47:28 -
@@ -17,14 +17,12 @@
   standard, which is expected to be published in 2017.
 
   C++1z features are available as part of the "mainline" GCC
-compiler in the trunk of
-GCC's Subversion
-  repository and in GCC 5 and later. To enable C++1z
+compiler in the trunk of GCC's repository
+and in GCC 5 and later. To enable C++1z
   support, add the command-line parameter -std=c++1z
   to your g++ command line. Or, to enable GNU
   extensions in addition to C++1z extensions,
-  add -std=gnu++1z to your g++ command
-  line.
+add -std=gnu++1z.
 
   Important: Because the final ISO C++1z standard is
   still evolving, GCC's support is experimental. No attempt


Re: debug mode symbols cleanup

2015-09-07 Thread Jonathan Wakely

On 05/09/15 22:53 +0200, François Dumont wrote:

   I remember Paolo saying once that we were not guaranteeing any ABI
compatibility for debug mode. I haven't found any section for
unversioned symbols in gnu.ver so I simply uncomment the global export.


There is no section, because all exported symbols are versioned.

It's OK if objects compiled with Debug Mode using one version of GCC
don't link to objects compiled with Debug Mode using a different
version of GCC, but you can't change the exported symbols in the DSO.


Your changelog doesn't include the changes to config/abi/pre/gnu.ver,
but those changes are not OK anyway, they fail the abi-check:

FAIL: libstdc++-abi/abi_check

   === libstdc++ Summary ===

# of unexpected failures        1



Re: [Patch, libstdc++] Add specific error message into exceptions

2015-09-07 Thread Jonathan Wakely

On 28/08/15 20:44 -0700, Tim Shen wrote:

On Fri, Aug 28, 2015 at 11:23 AM, Tim Shen  wrote:

So is it good to have an owned raw pointer stored in runtime_error,
pointing to a heap allocated char chunk, which will be deallocated in
regex_error's dtor?


I just put a string member into regex_error, completely ignoring the
storage in std::runtime_error.


That's still an ABI change, so not OK.
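
To illustrate why, here is a toy example; `error_before` and `error_after`
are hypothetical types, not regex_error. Adding a data member changes the
size and layout of the class, so code compiled against the old definition
disagrees with code compiled against the new one.

```cpp
#include <stdexcept>
#include <string>

// The class as existing binaries know it.
struct error_before : std::runtime_error {
  explicit error_before(const char* w) : std::runtime_error(w), code(0) {}
  int code;
};

// The same class with an extra string member added later.
struct error_after : std::runtime_error {
  explicit error_after(const char* w) : std::runtime_error(w), code(0) {}
  int code;
  std::string detail;   // new member: objects grow, offsets shift
};

// True on the usual ABIs: old and new callers would disagree on the size.
constexpr bool layout_changed = sizeof(error_after) != sizeof(error_before);
```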



Re: [Patch, libstdc++] Add specific error message into exceptions

2015-09-07 Thread Jonathan Wakely

On 28/08/15 11:23 -0700, Tim Shen wrote:

On Fri, Aug 28, 2015 at 8:59 AM, Jonathan Wakely  wrote:

There seems to be no need to construct a std::string here, just pass a
const char* (see below).


To be honest, I wasn't considering performance for a bit, since
exceptions are already considered slow by me :P. But yes, we can do
less allocations.


I wonder if we want to make this more efficient by adding a private
member to regex_error that would allow information to be appended to
the string, rather than creating a new regex_error with a new string.


In case it wasn't clear, I was suggesting to add a private member
*function* not data member.


I can add a helper function to _Scanner to construct the exception
object only once. For functions that can't access this helper, use
a return value for error handling.


I suggest adding another overload that takes a const char* rather than
std::string. The reason is that when using the new ABI this function
will take a std::__cxx11::string, so calling it will allocate memory
for the string data, then that string is passed to the regex_error
constructor which has to convert it internally to an old std::string,
which has to allocate a second time.
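
A sketch of the overload pair, using std::runtime_error in place of
regex_error; `throw_toy_error` and `caught_message` are hypothetical names.
It only shows that a string literal selects the const char* overload, so no
std::string is built at the call site; it does not model the old/new string
ABI conversion itself.

```cpp
#include <stdexcept>
#include <string>

// Preferred for literals: no std::string temporary at the call site.
[[noreturn]] void throw_toy_error(const char* what)
{
  throw std::runtime_error(what);
}

// Still available for callers that already hold a std::string.
[[noreturn]] void throw_toy_error(const std::string& what)
{
  throw std::runtime_error(what);
}

// Helper for demonstration: throw with a literal and report the message.
std::string caught_message(const char* literal)
{
  std::string msg;
  try {
    throw_toy_error(literal);           // resolves to the const char* overload
  } catch (const std::runtime_error& e) {
    msg = e.what();
  }
  return msg;
}
```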


First, to make it clear: due to _M_get_location_string(), we need
dynamic allocation.

So is it good to have an owned raw pointer stored in runtime_error,
pointing to a heap allocated char chunk, which will be deallocated in
regex_error's dtor?


No, adding that pointer is an ABI change.

If you can't do it without an ABI change then you will have to lose
the _M_get_location_string() functionality. It seems non-essential
anyway.


Re: [0/7] Type promotion pass and elimination of zext/sext

2015-09-07 Thread Kugan


On 07/09/15 20:46, Wilco Dijkstra wrote:
>> Kugan wrote:
>> 2. vector-compare-1.c from c-c++-common/torture fails to assemble with
>> -O3 -g Error: unaligned opcodes detected in executable segment. It works
>> fine if I remove the -g. I am looking into it and needs to be fixed as well.
> 
> This is a known assembler bug I found a while back, Renlin is looking into it.
> Basically when debug tables are inserted at the end of a code section the 
> assembler doesn't align to the alignment required by the debug tables.

This is precisely what seems to be happening. Renlin, could you please
let me know when you have a patch (even if it is a prototype or a hack).

Thanks,
Kugan


Re: [PATCH][wwwdocs][AArch64] Add entry for target attributes and pragmas

2015-09-07 Thread Gerald Pfeifer
On Wed, 2 Sep 2015, Kyrill Tkachov wrote:
> My thinking was that when we introduce some new command-line option we 
> list it here and give a short description of it (new -mcpu values, for 
> example). However, here we introduce about 10 new target attributes and 
> pragmas and listing them all would make this entry too long for my 
> liking so as a shorthand for listing them all I chose to point to the 
> documentation.
> 
> Unless you feel strongly against this reasoning I'd like to commit the
> patch as is within 48 hours.

I can follow your reasoning, and anyway the 48 hours are way over ;-),
just have you considered adding a reference to the documentation (as a 
hyperlink to the respective section, if there is a good one, such as
https://gcc.gnu.org/onlinedocs/gcc/ARM-Pragmas.html#ARM-Pragmas )?

Gerald


Re: [wwwdocs] Re: C++ Concepts available in trunk?

2015-09-07 Thread Jonathan Wakely
On 7 September 2015 at 11:51, Gerald Pfeifer wrote:
> Jonathan,
>
> On Thu, 13 Aug 2015, Jonathan Wakely wrote:
>> Thanks, I've committed the attached change to the wwwdocs repo.
>
> looking at this I noticed a reference to "Subversion", when in
> general we have tried to minimize references to specific version
> control systems.
>
> And I noticed we can be a little less verbose later in that
> section.
>
> And then I noticed, those two versions actually have been
> propagating from cxx0x.html to cxx1y.html to cxx1z.html over
> the years, so I made essentially the same set of simplifications
> to all three of them.
>
> What do you think?  I have not committed this yet.

Nice, I think they are good improvements.


[patch] Rename shadowed variable in libstdc++ test.

2015-09-07 Thread Jonathan Wakely

Thanks to Sebastian for pointing this out.

Tested powerpc64le-linux, committed to trunk.


Re: [patch] Rename shadowed variable in libstdc++ test.

2015-09-07 Thread Jonathan Wakely

On 07/09/15 12:33 +0100, Jonathan Wakely wrote:

Thanks to Sebastian for pointing this out.

Tested powerpc64le-linux, committed to trunk.


With the patch this time ...


commit 407976d0a374c2b291f4b75957936885f6314ae8
Author: Jonathan Wakely 
Date:   Mon Sep 7 12:20:02 2015 +0100

Rename shadowed variable in test case.

	* testsuite/30_threads/timed_mutex/try_lock_until/57641.cc: Rename
	shadowed variable.

diff --git a/libstdc++-v3/testsuite/30_threads/timed_mutex/try_lock_until/57641.cc b/libstdc++-v3/testsuite/30_threads/timed_mutex/try_lock_until/57641.cc
index 15f9cdf..25093f8 100644
--- a/libstdc++-v3/testsuite/30_threads/timed_mutex/try_lock_until/57641.cc
+++ b/libstdc++-v3/testsuite/30_threads/timed_mutex/try_lock_until/57641.cc
@@ -48,21 +48,21 @@ struct clock
 };
 
 std::timed_mutex mx;
-bool test = false;
+bool locked = false;
 
 void f()
 {
-  test = mx.try_lock_until(clock::now() + C::milliseconds(1));
+  locked = mx.try_lock_until(clock::now() + C::milliseconds(1));
 }
 
 int main()
 {
-  bool test = false;
+  bool test __attribute__((unused)) = true;
   std::lock_guard l(mx);
   auto start = C::system_clock::now();
   std::thread t(f);
   t.join();
   auto stop = C::system_clock::now();
   VERIFY( (stop - start) < C::seconds(9) );
-  VERIFY( !test );
+  VERIFY( !locked );
 }
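
For readers unfamiliar with the problem this rename avoids, here is a minimal C sketch of a local variable hiding a global of the same name. It is not the libstdc++ test itself; `flag`, `set_flag`, `read_shadowed` and `read_global` are hypothetical names chosen for illustration.

```c
#include <stdbool.h>

/* Global written by set_flag(), playing the role of the old global 'test'. */
bool flag = false;

void set_flag(void)
{
  flag = true;          /* modifies the global */
}

/* Returns the value seen through a shadowing local. */
bool read_shadowed(void)
{
  bool flag = false;    /* shadows the global 'flag' */
  set_flag();           /* updates the global only */
  return flag;          /* the local was never written, so still false */
}

/* Returns the value of the global; nothing shadows it here. */
bool read_global(void)
{
  set_flag();
  return flag;
}
```

A check on `flag` inside `read_shadowed` can never observe what `set_flag` stored, which is why giving the global a distinct name makes the test's intent unambiguous.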


Re: [0/7] Type promotion pass and elimination of zext/sext

2015-09-07 Thread pinskia




> On Sep 7, 2015, at 7:22 PM, Kugan  wrote:
> 
> 
> 
> On 07/09/15 20:46, Wilco Dijkstra wrote:
>>> Kugan wrote:
>>> 2. vector-compare-1.c from c-c++-common/torture fails to assemble with
>>> -O3 -g Error: unaligned opcodes detected in executable segment. It works
>>> fine if I remove the -g. I am looking into it and needs to be fixed as well.
>> 
>> This is a known assembler bug I found a while back, Renlin is looking into 
>> it.
>> Basically when debug tables are inserted at the end of a code section the 
>> assembler doesn't align to the alignment required by the debug tables.
> 
> This is precisely what seems to be happening. Renlin, could you please
> let me know when you have a patch (even if it is a prototype or a hack).


I had noticed that but I read through the assembler code and it sounded very
much like it was designed this way and that the compiler was not supposed to
emit assembly like this and fix up the alignment.

Thanks,
Andrew

> 
> Thanks,
> Kugan


Re: [PATCH][wwwdocs][AArch64] Add entry for target attributes and pragmas

2015-09-07 Thread Kyrill Tkachov

Hi Gerald,

On 07/09/15 12:31, Gerald Pfeifer wrote:

On Wed, 2 Sep 2015, Kyrill Tkachov wrote:

My thinking was that when we introduce some new command-line option we
list it here and give a short description of it (new -mcpu values, for
example). However, here we introduce about 10 new target attributes and
pragmas and listing them all would make this entry too long for my
liking so as a shorthand for listing them all I chose to point to the
documentation.

Unless you feel strongly against this reasoning I'd like to commit the
patch as is within 48 hours.

I can follow your reasoning, and anyway the 48 hours are way over ;-),
just have you considered adding a reference to the documentation (as a
hyperlink to the respective section, if there is a good one, such as
https://gcc.gnu.org/onlinedocs/gcc/ARM-Pragmas.html#ARM-Pragmas )?


Good idea, I'll send a patch to mention the link.
The relevant one is:
https://gcc.gnu.org/onlinedocs/gcc/AArch64-Function-Attributes.html#AArch64-Function-Attributes

Thanks,
Kyrill



Gerald




Re: [PATCH, PR67405, committed] Avoid NULL pointer dereference

2015-09-07 Thread Ilya Enkovich
2015-09-02 15:35 GMT+03:00 Richard Biener :
>
> DECL_FIELD_BIT_OFFSET should be never NULL.  Whoever created that
> FIELD_DECL created an invalid one.
>
> Richard.
>

layout_class_type doesn't place fields with no type and thus we have
nothing for DECL_FIELD_BIT_OFFSET. We still continue compilation and
function parameters gimplification causes a call to
chkp_find_bound_slots_1 which tries to access. So probably I should
handle gracefully fields with error_mark_node as a type? Or we better
put something into DECL_FIELD_BIT_OFFSET (zero? error_mark_node?) for
such fields.

Ilya


Re: [gomp4.1] Various accelerator updates from OpenMP 4.1

2015-09-07 Thread Jakub Jelinek
On Fri, Sep 04, 2015 at 09:07:02PM +0300, Ilya Verbin wrote:
> It seems that there is a bug somewhere here:
> 
> Here is the reproducer:
> 
> #pragma omp declare target
> int a[1];
> #pragma omp end declare target
> 
> void foo ()
> {
>   #pragma omp target map(to: a[0:1])
> a;
> }
> 
> 
> lookup_decl (var, ctx) tries to lookup for 'a', but ctx->cb.decl_map->get ()
> returns null-pointer.

Fixed thusly, tested on x86_64-linux, committed to gomp4.1 branch.

2015-09-07  Jakub Jelinek  

* omp-low.c (scan_sharing_clauses): Don't ignore map with
declare target vars for GOMP_MAP_FIRSTPRIVATE_POINTER,
unless the decl is an array.
(lower_omp_target): Ignore GOMP_MAP_FIRSTPRIVATE_POINTER map with
declare target var if it is an array.

* testsuite/libgomp.c/target-26.c: New test.

--- gcc/omp-low.c.jj    2015-09-04 11:34:45.0 +0200
+++ gcc/omp-low.c   2015-09-07 14:10:36.198517500 +0200
@@ -2060,6 +2060,8 @@ scan_sharing_clauses (tree clauses, omp_
 directly.  */
  if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_MAP
  && DECL_P (decl)
+ && (OMP_CLAUSE_MAP_KIND (c) != GOMP_MAP_FIRSTPRIVATE_POINTER
+ || TREE_CODE (TREE_TYPE (decl)) == ARRAY_TYPE)
  && is_global_var (maybe_lookup_decl_in_outer_ctx (decl, ctx))
  && varpool_node::get_create (decl)->offloadable)
break;
@@ -2284,6 +2286,8 @@ scan_sharing_clauses (tree clauses, omp_
break;
  decl = OMP_CLAUSE_DECL (c);
  if (DECL_P (decl)
+ && (OMP_CLAUSE_MAP_KIND (c) != GOMP_MAP_FIRSTPRIVATE_POINTER
+ || TREE_CODE (TREE_TYPE (decl)) == ARRAY_TYPE)
  && is_global_var (maybe_lookup_decl_in_outer_ctx (decl, ctx))
  && varpool_node::get_create (decl)->offloadable)
break;
@@ -13358,6 +13362,10 @@ lower_omp_target (gimple_stmt_iterator *
  {
if (TREE_CODE (TREE_TYPE (var)) == ARRAY_TYPE)
  {
+   if (is_global_var (maybe_lookup_decl_in_outer_ctx (var, ctx))
+   && varpool_node::get_create (var)->offloadable)
+ continue;
+
tree type = build_pointer_type (TREE_TYPE (var));
tree new_var = lookup_decl (var, ctx);
x = create_tmp_var_raw (type, get_name (new_var));
@@ -14081,6 +14089,12 @@ lower_omp_target (gimple_stmt_iterator *
HOST_WIDE_INT offset = 0;
gcc_assert (prev);
var = OMP_CLAUSE_DECL (c);
+   if (DECL_P (var)
+   && TREE_CODE (TREE_TYPE (var)) == ARRAY_TYPE
+   && is_global_var (maybe_lookup_decl_in_outer_ctx (var,
+ ctx))
+   && varpool_node::get_create (var)->offloadable)
+ break;
if (TREE_CODE (var) == INDIRECT_REF
&& TREE_CODE (TREE_OPERAND (var, 0)) == COMPONENT_REF)
  var = TREE_OPERAND (var, 0);
--- libgomp/testsuite/libgomp.c/target-26.c.jj  2015-09-07 11:59:08.665425993 +0200
+++ libgomp/testsuite/libgomp.c/target-26.c 2015-09-07 12:38:56.0 +0200
@@ -0,0 +1,36 @@
+extern void abort (void);
+#pragma omp declare target
+int a[4] = { 2, 3, 4, 5 }, *b;
+#pragma omp end declare target
+
+int
+main ()
+{
+  int err;
+  int c[3] = { 6, 7, 8 };
+  b = c;
+  #pragma omp target map(to: a[0:2], b[0:2]) map(from: err)
+  err = a[0] != 2 || a[1] != 3 || a[2] != 4 || a[3] != 5 || b[0] != 6 || b[1] != 7;
+  if (err)
+abort ();
+  a[1] = 9;
+  a[2] = 10;
+  #pragma omp target map(always,to:a[1:2]) map(from: err)
+  err = a[0] != 2 || a[1] != 9 || a[2] != 10 || a[3] != 5;
+  if (err)
+abort ();
+  #pragma omp parallel firstprivate(a, b, c, err) num_threads (2)
+  #pragma omp single
+  {
+b = c + 1;
+a[0] = 11;
+a[2] = 13;
+c[1] = 14;
+int d = 0;
+#pragma omp target map(to: a[0:3], b[d:2]) map (from: err)
+err = a[0] != 11 || a[1] != 9 || a[2] != 13 || b[0] != 14 || b[1] != 8;
+if (err)
+  abort ();
+  }
+  return 0;
+}

Jakub


RE: [0/7] Type promotion pass and elimination of zext/sext

2015-09-07 Thread Wilco Dijkstra
> pins...@gmail.com wrote:
> > On Sep 7, 2015, at 7:22 PM, Kugan  wrote:
> >
> >
> >
> > On 07/09/15 20:46, Wilco Dijkstra wrote:
> >>> Kugan wrote:
> >>> 2. vector-compare-1.c from c-c++-common/torture fails to assemble with
> >>> -O3 -g Error: unaligned opcodes detected in executable segment. It works
> >>> fine if I remove the -g. I am looking into it and needs to be fixed as 
> >>> well.
> >>
> >> This is a known assembler bug I found a while back, Renlin is looking into 
> >> it.
> >> Basically when debug tables are inserted at the end of a code section the
> >> assembler doesn't align to the alignment required by the debug tables.
> >
> > This is precisely what seems to be happening. Renlin, could you please
> > let me know when you have a patch (even if it is a prototype or a hack).
> 
> 
> I had noticed that but I read through the assembler code and it sounded very
> much like it was designed this way and that the compiler was not supposed to
> emit assembly like this and fix up the alignment.

No, the bug is introduced solely by the assembler - there is no way to avoid it,
as you can't expect users to align the end of the code section to an unspecified
debug alignment (which could potentially vary depending on the generated debug
info). The assembler aligns unaligned instructions without a warning, and
doesn't require the section size to be a multiple of the section alignment,
i.e. the design is that the assembler can deal with any alignment.

Wilco
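
The padding under discussion amounts to rounding a section size up to the next multiple of the required alignment. The following is a minimal C sketch of that arithmetic only, assuming power-of-two alignments; `align_up` and `padding_needed` are hypothetical helpers, not functions from the assembler.

```c
#include <stddef.h>

/* Round size up to the next multiple of align.
   Assumption: align is a non-zero power of two.  */
size_t align_up(size_t size, size_t align)
{
  return (size + align - 1) & ~(align - 1);
}

/* Number of padding bytes needed after a section of the given size
   so that whatever follows starts on an align-byte boundary.  */
size_t padding_needed(size_t size, size_t align)
{
  return align_up(size, align) - size;
}
```

For example, a 10-byte code section followed by data that needs 8-byte alignment would need `padding_needed(10, 8)` bytes of fill.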




Re: Openacc launch API

2015-09-07 Thread Nathan Sidwell

On 08/25/15 09:29, Nathan Sidwell wrote:

Jakub,

This patch changes the launch API for openacc parallels.  The current scheme
passes the launch dimensions as 3 separate parameters to the GOACC_parallel
function.  This is problematic for a couple of reasons:

1) these must be validated in the host compiler

2) they provide no extension to support a variety of different offload devices
with different geometry requirements.

This patch changes things so that the function tables emitted by (ptx)
mkoffloads includes the geometry triplet for each function.  This allows them to
be validated and/or manipulated in the offload compiler.  However, this only
works for compile-time known dimensions -- which is a common case.  To deal with
runtime-computed dimensions we have to retain the host-side compiler's
calculation and pass that into the GOACC_parallel function.  We change
GOACC_parallel to take a variadic list of keyed operands ending with a sentinel
marker.  These keyed operands have a slot for expansion to support multiple
different offload devices.

We also extend the functionality of the 'oacc function' internal attribute.
Rather than being a simple marker, it now has a value, which is a TREE_LIST of
the geometry required.  The geometry is held as INTEGER_CSTs on the TREE_VALUE
slots.  Runtime-calculated values are represented by an INTEGER_CST of zero.
We'll also use this representation for  'routines', where the TREE_PURPOSE slot
will be used to indicate the levels at which a routine might spawn a partitioned
loop.  Again, to allow future expansion supporting a number of different offload
devices, this can become a list-of-lists, keyed by an offload device
identifier.  The offload  compiler can manipulate this data, and a later patch
will do this within a new oacc-xform pass.

I  did rename the GOACC_parallel entry point to GOACC_parallel_keyed and provide
a forwarding function. However, as the mkoffload data is incompatible, this is
probably overkill.  I've had to increment the (just committed) version number to
detect the change in data representation.  So any attempt to run an old binary
with a new libgomp will fail at the loading point.  We could simply keep the
same 'GOACC_parallel' name and not need any new symbols.  WDYT?


Ping?

https://gcc.gnu.org/ml/gcc-patches/2015-08/msg01498.html

nathan
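
To make the "variadic list of keyed operands ending with a sentinel marker" idea concrete, here is a small C sketch. It is only an illustration under assumed names: the keys `KEY_END`, `KEY_DIM`, `KEY_ASYNC`, the struct `launch_args` and the function `parse_launch` are hypothetical and do not reflect the encoding GOACC_parallel_keyed actually uses.

```c
#include <stdarg.h>

/* Hypothetical operand keys; KEY_END is the sentinel.  */
enum launch_key { KEY_END = 0, KEY_DIM = 1, KEY_ASYNC = 2 };

struct launch_args {
  int dims[3];   /* launch geometry triplet */
  int async;     /* -1 when no async operand was given */
};

/* Parse groups of (key, operands...) until the KEY_END sentinel.  */
struct launch_args parse_launch(int first_key, ...)
{
  struct launch_args out = { {0, 0, 0}, -1 };
  va_list ap;

  va_start(ap, first_key);
  for (int key = first_key; key != KEY_END; key = va_arg(ap, int)) {
    switch (key) {
    case KEY_DIM:                 /* followed by three ints */
      for (int i = 0; i < 3; i++)
        out.dims[i] = va_arg(ap, int);
      break;
    case KEY_ASYNC:               /* followed by one int */
      out.async = va_arg(ap, int);
      break;
    default:                      /* unknown key: stop, don't misread operands */
      va_end(ap);
      return out;
    }
  }
  va_end(ap);
  return out;
}
```

A call such as `parse_launch(KEY_DIM, 1, 32, 128, KEY_ASYNC, 5, KEY_END)` shows the appeal of the scheme: new keys can be added later without changing the function's signature.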




[PATCH][AArch64] Improve code generation for float16 vector code

2015-09-07 Thread Alan Lawrence
On 04/09/15 13:32, James Greenhalgh wrote:
> In that case, these should be implemented as inline assembly blocks. As it
> stands, the code generation for these intrinsics will be very poor with this
> patch applied.
>
> I'm going to hold off OKing this until I see a follow-up to fix the code
> generation, either replacing those particular intrinsics with inline asm,
> or doing the more comprehensive fix in the back-end.
>
> Thanks,
> James

In that case, here is the follow-up now ;). This fixes each of the following
functions to generate a single instruction followed by ret:
  * vld1_dup_f16, vld1q_dup_f16
  * vset_lane_f16, vsetq_lane_f16
  * vget_lane_f16, vgetq_lane_f16
  * For IN of type either float16x4_t or float16x8_t, and constant C:
return (float16x4_t) {in[C], in[C], in[C], in[C]};
  * Similarly,
return (float16x8_t) {in[C], in[C], in[C], in[C], in[C], in[C], in[C], in[C]};
(These correspond intuitively to what one might expect for "vdup_lane_f16",
"vdup_laneq_f16", "vdupq_lane_f16" and "vdupq_laneq_f16" intrinsics,
although such intrinsics do not actually exist.)

This patch does not deal with equivalents to vdup_n_s16 and other intrinsics
that load immediates, rather than using elements of pre-existing vectors.

I'd welcome thoughts/opinions on what testcase would be appropriate. Correctness
of all the intrinsics is already tested by the advsimd-intrinsics testsuite, and
the only way I can see to verify code generation, is to scan-assembler looking
for particular instructions; do we wish to see more scan-assembler tests?

Bootstrapped + check-gcc on aarch64-none-linux-gnu.

Thanks,
Alan

gcc/ChangeLog:

* config/aarch64/aarch64-simd.md (aarch64_simd_dup,
aarch64_dup_lane, aarch64_dup_lane_,
aarch64_simd_vec_set, vec_set, vec_perm_const,
vec_init, *aarch64_simd_ld1r, vec_extract): Add
V4HF and V8HF variants to iterator.

* config/aarch64/aarch64.c (aarch64_evpc_dup): Add V4HF and V8HF cases.

* config/aarch64/iterators.md (VDQF_F16): New.
(VSWAP_WIDTH, vswap_width_name): Add V4HF and V8HF cases.
---
 gcc/config/aarch64/aarch64-simd.md | 39 +++---
 gcc/config/aarch64/aarch64.c   |  2 ++
 gcc/config/aarch64/iterators.md|  7 ++-
 3 files changed, 28 insertions(+), 20 deletions(-)

diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index 160acf9..b303d58 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -53,18 +53,19 @@
 )
 
 (define_insn "aarch64_simd_dup"
-  [(set (match_operand:VDQF 0 "register_operand" "=w")
-(vec_duplicate:VDQF (match_operand: 1 "register_operand" "w")))]
+  [(set (match_operand:VDQF_F16 0 "register_operand" "=w")
+   (vec_duplicate:VDQF_F16
+ (match_operand: 1 "register_operand" "w")))]
   "TARGET_SIMD"
   "dup\\t%0., %1.[0]"
   [(set_attr "type" "neon_dup")]
 )
 
 (define_insn "aarch64_dup_lane"
-  [(set (match_operand:VALL 0 "register_operand" "=w")
-   (vec_duplicate:VALL
+  [(set (match_operand:VALL_F16 0 "register_operand" "=w")
+   (vec_duplicate:VALL_F16
  (vec_select:
-   (match_operand:VALL 1 "register_operand" "w")
+   (match_operand:VALL_F16 1 "register_operand" "w")
(parallel [(match_operand:SI 2 "immediate_operand" "i")])
   )))]
   "TARGET_SIMD"
@@ -76,8 +77,8 @@
 )
 
 (define_insn "aarch64_dup_lane_"
-  [(set (match_operand:VALL 0 "register_operand" "=w")
-   (vec_duplicate:VALL
+  [(set (match_operand:VALL_F16 0 "register_operand" "=w")
+   (vec_duplicate:VALL_F16
  (vec_select:
(match_operand: 1 "register_operand" "w")
(parallel [(match_operand:SI 2 "immediate_operand" "i")])
@@ -834,11 +835,11 @@
 )
 
 (define_insn "aarch64_simd_vec_set"
-  [(set (match_operand:VDQF 0 "register_operand" "=w")
-(vec_merge:VDQF
-   (vec_duplicate:VDQF
+  [(set (match_operand:VDQF_F16 0 "register_operand" "=w")
+   (vec_merge:VDQF_F16
+   (vec_duplicate:VDQF_F16
(match_operand: 1 "register_operand" "w"))
-   (match_operand:VDQF 3 "register_operand" "0")
+   (match_operand:VDQF_F16 3 "register_operand" "0")
(match_operand:SI 2 "immediate_operand" "i")))]
   "TARGET_SIMD"
   {
@@ -851,7 +852,7 @@
 )
 
 (define_expand "vec_set"
-  [(match_operand:VDQF 0 "register_operand" "+w")
+  [(match_operand:VDQF_F16 0 "register_operand" "+w")
(match_operand: 1 "register_operand" "w")
(match_operand:SI 2 "immediate_operand" "")]
   "TARGET_SIMD"
@@ -4691,9 +4692,9 @@
 ;; vec_perm support
 
 (define_expand "vec_perm_const"
-  [(match_operand:VALL 0 "register_operand")
-   (match_operand:VALL 1 "register_operand")
-   (match_operand:VALL 2 "register_operand")
+  [(match_operand:VALL_F16 0 "register_operand")
+   (match_operand:VALL_F16 1 "register_operand")
+   (match_operand:VALL_F16 2 "register_operand")
(match_operand:

Re: [4/7] Use correct promoted mode sign for result of GIMPLE_CALL

2015-09-07 Thread Michael Matz
Hi,

On Mon, 7 Sep 2015, Kugan wrote:

> For the following testcase (compiling with -O1; -O2 works fine), we have
> a stmt with stm_code SSA_NAME (_7 = _6) and for which _6 is defined by
> a GIMPLE_CALL. In this case, we are using the wrong SUBREG promoted mode,
> resulting in wrong code.

And why is that?

> Simple SSA_NAME copies are generally optimized
> but when they are not, we can end up using the wrong promoted mode.
> Attached patch fixes when we have one copy.

I think it's the wrong place to fix this up.  Where does the wrong use come
from?  At that place it should be fixed, not after the fact.

>   _6 = bar5 (-10);
>   ...
>   _7 = _6;
>   _3 = (long unsigned int) _6;
>   ...
>   if (_3 != l5.0_4)

There is no use of '_7' in this snippet so I don't see the relevance of 
SUBREG_PROMOTED_MODE on it.

But whatever you do, please make sure you include the testcase for the 
problem as a regression test:

> extern void abort (void);
> 
> __attribute__ ((noinline))
> static unsigned short int foo5 (int x)
> {
>   return x;
> }
> 
> __attribute__ ((noinline))
> short int bar5 (int x)
> {
>   return foo5 (x + 6);
> }
> 
> unsigned long l5 = (short int) -4;
> 
> int
> main (void)
> {
>   if (bar5 (-10) != l5)
> abort ();
>   return 0;
> }


Ciao,
Michael.
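
For context on why the promoted-mode sign matters in the testcase above, here is a small C sketch contrasting sign-extension and zero-extension of the same 16-bit pattern. The helper names `widen_signed` and `widen_unsigned` are hypothetical; the sketch only shows the arithmetic, not how expand chooses between the two.

```c
#include <stdint.h>

/* Widen a 16-bit pattern as a signed value (sign-extension).
   Converting an out-of-range value to int16_t is implementation-defined
   before C23; GCC and Clang wrap modulo 2^16, which is assumed here.  */
uint64_t widen_signed(uint16_t bits)
{
  return (uint64_t)(int64_t)(int16_t)bits;
}

/* Widen the same pattern as an unsigned value (zero-extension).  */
uint64_t widen_unsigned(uint16_t bits)
{
  return (uint64_t)bits;
}
```

The bit pattern of (short) -4 is 0xFFFC. Sign-extended it equals the value stored in `l5`, whereas zero-extended it becomes 65532, so picking the wrong extension makes the comparison in `main` fail.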


Re: [PATCH 2/5] completely_scalarize arrays as well as records.

2015-09-07 Thread Alan Lawrence
In-Reply-To: <55e0697d.2010...@arm.com>

On 28/08/15 16:08, Alan Lawrence wrote:
> Alan Lawrence wrote:
>>
>> Right. I think VLA's are the problem with pr64312.C also. I'm testing a fix
>> (that declares arrays with any of these properties as unscalarizable).
> ... 
> In the meantime I've reverted the patch pending further testing on x86, aarch64
> and arm.

I've now tested g++ and fortran (+ bootstrap + check-gcc) on x86, AArch64 and
ARM, and Ada on x86 and ARM.

So far the list of failures from the original patch seems to be:

* g++.dg/torture/pr64312.C on ARM and m68k-linux
* Building Ada on x86
* Ada ACATS c87b31a on ARM (where the Ada frontend builds fine)

Here's a new version, that fixes all the above, by adding a dose of paranoia in
scalarizable_type_p... (I wonder about adding a comment in completely_scalarize
that such cases have already been ruled out?)
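
As a rough illustration of the recursive check described here, the following C sketch works on a toy type model rather than GCC trees. Everything in it (`struct type`, `struct field`, `scalarizable`, the `length <= 0` convention for variable-length arrays) is a hypothetical stand-in, not the tree-sra.c code.

```c
#include <stdbool.h>
#include <stddef.h>

enum kind { K_SCALAR, K_RECORD, K_ARRAY };

struct type;

struct field {
  const struct type *type;
  bool is_bit_field;
};

struct type {
  enum kind kind;
  const struct field *fields;   /* K_RECORD only */
  size_t nfields;               /* K_RECORD only */
  const struct type *elem;      /* K_ARRAY only */
  long length;                  /* K_ARRAY only; <= 0 means not a fixed length */
};

/* True for records and fixed-length arrays built only from scalars
   (no bit-fields) or, recursively, from other scalarizable types.  */
bool scalarizable(const struct type *t)
{
  switch (t->kind) {
  case K_RECORD:
    for (size_t i = 0; i < t->nfields; i++) {
      const struct field *f = &t->fields[i];
      if (f->is_bit_field)
        return false;
      if (f->type->kind != K_SCALAR && !scalarizable(f->type))
        return false;
    }
    return true;
  case K_ARRAY:
    if (t->length <= 0)         /* the "dose of paranoia": reject VLAs etc. */
      return false;
    return t->elem->kind == K_SCALAR || scalarizable(t->elem);
  default:
    return false;               /* scalars are handled by the caller */
  }
}
```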

OK to install?

Cheers, Alan
---
 gcc/testsuite/gcc.dg/tree-ssa/sra-15.c |  37 
 gcc/tree-sra.c | 155 +++--
 2 files changed, 144 insertions(+), 48 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/sra-15.c

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/sra-15.c b/gcc/testsuite/gcc.dg/tree-ssa/sra-15.c
new file mode 100644
index 000..a22062e
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/sra-15.c
@@ -0,0 +1,37 @@
+/* Verify that SRA total scalarization works on records containing arrays.  */
+/* { dg-do run } */
+/* { dg-options "-O1 -fdump-tree-release_ssa --param sra-max-scalarization-size-Ospeed=32" } */
+
+extern void abort (void);
+
+struct S
+{
+  char c;
+  unsigned short f[2][2];
+  int i;
+  unsigned short f3, f4;
+};
+
+
+int __attribute__ ((noinline))
+foo (struct S *p)
+{
+  struct S l;
+
+  l = *p;
+  l.i++;
+  l.f[1][0] += 3;
+  *p = l;
+}
+
+int
+main (int argc, char **argv)
+{
+  struct S a = {0, { {5, 7}, {9, 11} }, 4, 0, 0};
+  foo (&a);
+  if (a.i != 5 || a.f[1][0] != 12)
+abort ();
+  return 0;
+}
+
+/* { dg-final { scan-tree-dump-times "l;" 0 "release_ssa" } } */
diff --git a/gcc/tree-sra.c b/gcc/tree-sra.c
index 8b3a0ad..d9fe058 100644
--- a/gcc/tree-sra.c
+++ b/gcc/tree-sra.c
@@ -915,73 +915,132 @@ create_access (tree expr, gimple stmt, bool write)
 }
 
 
-/* Return true iff TYPE is a RECORD_TYPE with fields that are either of gimple
-   register types or (recursively) records with only these two kinds of fields.
-   It also returns false if any of these records contains a bit-field.  */
+/* Return true iff TYPE is scalarizable - i.e. a RECORD_TYPE or fixed-length
+   ARRAY_TYPE with fields that are either of gimple register types (excluding
+   bit-fields) or (recursively) scalarizable types.  */
 
 static bool
-type_consists_of_records_p (tree type)
+scalarizable_type_p (tree type)
 {
-  tree fld;
+  gcc_assert (!is_gimple_reg_type (type));
 
-  if (TREE_CODE (type) != RECORD_TYPE)
-return false;
+  switch (TREE_CODE (type))
+  {
+  case RECORD_TYPE:
+for (tree fld = TYPE_FIELDS (type); fld; fld = DECL_CHAIN (fld))
+  if (TREE_CODE (fld) == FIELD_DECL)
+   {
+ tree ft = TREE_TYPE (fld);
 
-  for (fld = TYPE_FIELDS (type); fld; fld = DECL_CHAIN (fld))
-if (TREE_CODE (fld) == FIELD_DECL)
-  {
-   tree ft = TREE_TYPE (fld);
+ if (DECL_BIT_FIELD (fld))
+   return false;
 
-   if (DECL_BIT_FIELD (fld))
- return false;
+ if (!is_gimple_reg_type (ft)
+ && !scalarizable_type_p (ft))
+   return false;
+   }
 
-   if (!is_gimple_reg_type (ft)
-   && !type_consists_of_records_p (ft))
- return false;
-  }
+return true;
 
-  return true;
+  case ARRAY_TYPE:
+{
+  if (TYPE_DOMAIN (type) == NULL_TREE
+ || !TREE_CONSTANT (TYPE_MIN_VALUE (TYPE_DOMAIN (type)))
+ || !TREE_CONSTANT (TYPE_MAX_VALUE (TYPE_DOMAIN (type)))
+ || !TREE_CONSTANT (TYPE_SIZE (type))
+ || (tree_to_shwi (TYPE_SIZE (type)) <= 0))
+   return false;
+  tree elem = TREE_TYPE (type);
+  if (DECL_P (elem) && DECL_BIT_FIELD (elem))
+   return false;
+  if (!is_gimple_reg_type (elem)
+&& !scalarizable_type_p (elem))
+   return false;
+  return true;
+}
+  default:
+return false;
+  }
 }
 
-/* Create total_scalarization accesses for all scalar type fields in DECL that
-   must be of a RECORD_TYPE conforming to type_consists_of_records_p.  BASE
-   must be the top-most VAR_DECL representing the variable, OFFSET must be the
-   offset of DECL within BASE.  REF must be the memory reference expression for
-   the given decl.  */
+static void scalarize_elem (tree, HOST_WIDE_INT, HOST_WIDE_INT, tree, tree);
+
+/* Create total_scalarization accesses for all scalar fields of a member
+   of type DECL_TYPE conforming to scalarizable_type_p.  BASE
+   must be the top-most VAR_DECL representing the variable; within that,
+   OFFSET locates the member and REF must be the memory reference expression 

Re: [5/7] Allow gimple debug stmt in widen mode

2015-09-07 Thread Michael Matz
Hi,

On Mon, 7 Sep 2015, Kugan wrote:

> Allow GIMPLE_DEBUG with values in promoted register.

Patch does much more.

> gcc/ChangeLog:
> 
> 2015-09-07  Kugan Vivekanandarajah  
> 
>   * expr.c (expand_expr_real_1): Set proper SUBREG_PROMOTED_MODE for
>   SSA_NAME that was set by GIMPLE_CALL and assigned to another
>   SSA_NAME of same type.

ChangeLog doesn't match patch, and patch contains dubious changes:

> --- a/gcc/cfgexpand.c
> +++ b/gcc/cfgexpand.c
> @@ -5240,7 +5240,6 @@ expand_debug_locations (void)
> tree value = (tree)INSN_VAR_LOCATION_LOC (insn);
> rtx val;
> rtx_insn *prev_insn, *insn2;
> -   machine_mode mode;
>  
> if (value == NULL_TREE)
>   val = NULL_RTX;
> @@ -5275,16 +5274,6 @@ expand_debug_locations (void)
>  
> if (!val)
>   val = gen_rtx_UNKNOWN_VAR_LOC ();
> -   else
> - {
> -   mode = GET_MODE (INSN_VAR_LOCATION (insn));
> -
> -   gcc_assert (mode == GET_MODE (val)
> -   || (GET_MODE (val) == VOIDmode
> -   && (CONST_SCALAR_INT_P (val)
> -   || GET_CODE (val) == CONST_FIXED
> -   || GET_CODE (val) == LABEL_REF)));
> - }
>  
> INSN_VAR_LOCATION_LOC (insn) = val;
> prev_insn = PREV_INSN (insn);

So it seems that the modes of the value's location and the value itself 
don't have to match anymore, which seems dubious when considering how a 
debugger should load the value in question from the given location.  So, 
how is it supposed to work?

And this change:

> --- a/gcc/rtl.h
> +++ b/gcc/rtl.h
> @@ -2100,6 +2100,8 @@ wi::int_traits ::decompose (HOST_WIDE_INT*,
>targets is 1 rather than -1.  */
> gcc_checking_assert (INTVAL (x.first)
>  == sext_hwi (INTVAL (x.first), precision)
> +|| INTVAL (x.first)
> +== (INTVAL (x.first) & ((1 << precision) - 1))
>  || (x.second == BImode && INTVAL (x.first) == 1));
>  
>return wi::storage_ref (&INTVAL (x.first), 1, precision);

implies that wide_ints are not always sign-extended anymore after your 
changes.  That's a fundamental assumption, so removing that assert implies 
that you somehow created non-canonical wide_ints, and those will cause 
bugs elsewhere in the code.  Don't just remove asserts, they are usually 
there for a reason, and without accompanying changes those reasons don't 
go away.


Ciao,
Michael.
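
To illustrate what "canonical" means for the assert discussed above, here is a minimal C sketch of the sign-extension invariant on a 64-bit host integer. `sext64` is a hypothetical stand-in for sext_hwi, and `is_canonical` is an illustrative helper; neither is the GCC implementation.

```c
#include <stdbool.h>
#include <stdint.h>

/* Sign-extend the low prec bits of v, for 1 <= prec <= 64.
   The final unsigned-to-signed conversion is assumed to wrap,
   as it does with GCC and Clang.  */
int64_t sext64(int64_t v, unsigned prec)
{
  if (prec >= 64)
    return v;
  uint64_t sign = (uint64_t)1 << (prec - 1);
  uint64_t low = (uint64_t)v & (((uint64_t)1 << prec) - 1);
  return (int64_t)((low ^ sign) - sign);
}

/* A value is canonical for a precision if sign-extending it is a no-op.  */
bool is_canonical(int64_t v, unsigned prec)
{
  return v == sext64(v, prec);
}
```

With a precision of 16, the value -1 is canonical while its zero-extended form 65535 is not, which is the kind of value the extra condition in the patch would let through.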


RE: [PATCH] Fix PR64078

2015-09-07 Thread Bernd Edlinger
Hi,

On Mon, 7 Sep 2015 12:07:00, Marek Polacek wrote:
>
> On Sun, Sep 06, 2015 at 07:21:13PM +0200, Bernd Edlinger wrote:
>> Hi,
>>
>> we observed sporadic failures of the following two test cases (see PR64078):
>> c-c++-common/ubsan/object-size-9.c and c-c++-common/ubsan/object-size-10.c
>>
>> For object-size-9.c this happens in a reproducible way when -fpic option is 
>> used:
>> If that option is used, it is slightly less desirable to inline the 
>> functions, but if an explicit
>> "inline" is added, the function is still in-lined, even if -fpic is used.
>
> So if we rely on the function being inlined I think it would be better to add
> the always_inline attribute.
>


I tried to replace inline by __attribute__((always_inline)), but unfortunately 
it does not work:

FAIL: c-c++-common/ubsan/object-size-9.c   -O2  (test for excess errors)
Excess errors:
/home/ed/gnu/gcc-trunk/gcc/testsuite/c-c++-common/ubsan/object-size-9.c:47:1: warning: always_inline function might not be inlinable [-Wattributes]
/home/ed/gnu/gcc-trunk/gcc/testsuite/c-c++-common/ubsan/object-size-9.c:32:1: warning: always_inline function might not be inlinable [-Wattributes]
/home/ed/gnu/gcc-trunk/gcc/testsuite/c-c++-common/ubsan/object-size-9.c:47:1: error: inlining failed in call to always_inline 'C f3(int)': function body can be overwritten at link time
/home/ed/gnu/gcc-trunk/gcc/testsuite/c-c++-common/ubsan/object-size-9.c:94:10: error: called from here

the diagnostics are just a little different when the function is inlined or not.


Bernd.
  

Re: [gomp4] expunge shared_size from launch API

2015-09-07 Thread Tom de Vries

On 31/08/15 19:39, Nathan Sidwell wrote:

Index: gcc/omp-builtins.def
===
--- gcc/omp-builtins.def(revision 227269)
+++ gcc/omp-builtins.def(working copy)
@@ -45,11 +45,11 @@ DEF_GOACC_BUILTIN_FNSPEC (BUILT_IN_GOACC
  ATTR_NOTHROW_LIST, "...rrr")
  DEF_GOACC_BUILTIN_FNSPEC (BUILT_IN_GOACC_KERNELS_INTERNAL,
  "GOACC_kernels_internal",
- BT_FN_VOID_INT_OMPFN_SIZE_PTR_PTR_PTR_SIZE_VAR,
- ATTR_FNSPEC_DOT_DOT_DOT_DOT_r_r_r_NOTHROW_LIST,
- ATTR_NOTHROW_LIST, "rrr")
+ BT_FN_VOID_INT_OMPFN_SIZE_PTR_PTR_PTR_VAR,
+ ATTR_FNSPEC_DOT_DOT_DOT_r_r_r_NOTHROW_LIST,
+ ATTR_NOTHROW_LIST, "...rrr")


The VOID_INT_OMPFN_SIZE_PTR_PTR_PTR bit stayed the same, so the fnspec 
list should have been kept the same.


Thanks,
- Tom



Re: [gomp4] expunge shared_size from launch API

2015-09-07 Thread Nathan Sidwell

On 09/07/15 09:46, Tom de Vries wrote:

On 31/08/15 19:39, Nathan Sidwell wrote:

Index: gcc/omp-builtins.def
===
--- gcc/omp-builtins.def(revision 227269)
+++ gcc/omp-builtins.def(working copy)
@@ -45,11 +45,11 @@ DEF_GOACC_BUILTIN_FNSPEC (BUILT_IN_GOACC
ATTR_NOTHROW_LIST, "...rrr")
  DEF_GOACC_BUILTIN_FNSPEC (BUILT_IN_GOACC_KERNELS_INTERNAL,
"GOACC_kernels_internal",
-  BT_FN_VOID_INT_OMPFN_SIZE_PTR_PTR_PTR_SIZE_VAR,
-  ATTR_FNSPEC_DOT_DOT_DOT_DOT_r_r_r_NOTHROW_LIST,
-  ATTR_NOTHROW_LIST, "rrr")
+  BT_FN_VOID_INT_OMPFN_SIZE_PTR_PTR_PTR_VAR,
+  ATTR_FNSPEC_DOT_DOT_DOT_r_r_r_NOTHROW_LIST,
+  ATTR_NOTHROW_LIST, "...rrr")


The VOID_INT_OMPFN_SIZE_PTR_PTR_PTR bit stayed the same, so the fnspec list
should have been kept the same.


Thanks for noticing!  will fix.

nathan


Re: [PATCH] [graphite] Remove limit_scops

2015-09-07 Thread Tobias Grosser

On 09/05/2015 12:57 AM, Aditya Kumar wrote:

This patch removes the graphite-scop-detection.c:limit_scops function and fixes
related issues arising because of that. The limit_scops functionality was added
as an intermediate step to discard the loops which graphite could not
handle. Removing limit_scops required handling of different cases of loops and
surrounding code.  The scop is now larger so most test cases required 'number of
scops detected' to be fixed. By increasing the size of scop we can now optimize
loops which are 'siblings' of each other. This could enable loop fusion on a
number of loops. Since in the graphite framework we mostly want to optimize
loop-nests/adjacent-loops, we now discard scops with less than 2 loops. We
also discard scops without any data references.


Essentially:
  - Remove limit_scops.
  - Only select scops when there are at least two loops (a loop nest or side by side).
  - Discard loops without data-refs.
  - Fix test cases.


Passes bootstrap and reg-test.


I did not check every detail, but conceptually this looks good to me.

Tobias



[patch] Avoid #ifdef _GLIBCXX_DEBUG in regex_compiler.h

2015-09-07 Thread Jonathan Wakely

This uses an NSDMI and the _GLIBCXX_DEBUG_ONLY macro to remove several
ugly #ifdef _GLIBCXX_DEBUG conditionals in regex_compiler.h.
Tested powerpc64le-linux, committed to trunk.
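
The general pattern of a "debug only" statement macro can be sketched in plain C as below. This is an assumption-laden illustration: `SKETCH_DEBUG`, `SKETCH_DEBUG_ONLY`, `struct matcher` and `matcher_add_char` are hypothetical names, the real macro's definition is not shown in this mail, and the NSDMI part of the change has no C equivalent.

```c
/* Hypothetical stand-ins for _GLIBCXX_DEBUG and _GLIBCXX_DEBUG_ONLY.  */
#ifdef SKETCH_DEBUG
# define SKETCH_DEBUG_ONLY(stmt) stmt
#else
# define SKETCH_DEBUG_ONLY(stmt)
#endif

struct matcher {
  int nchars;
#ifdef SKETCH_DEBUG
  int is_ready;   /* exists only in debug builds */
#endif
};

void matcher_add_char(struct matcher *m)
{
  m->nchars++;
  /* Expands to the assignment in debug builds and to nothing otherwise,
     leaving just an empty statement.  */
  SKETCH_DEBUG_ONLY(m->is_ready = 0);
}
```

The member declaration still needs its own conditional, but each use site shrinks from three preprocessor lines to one statement.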

commit e53218fa5a7eedf76f409ab41f2e24776bb5195e
Author: Jonathan Wakely 
Date:   Mon Sep 7 15:12:03 2015 +0100

Avoid #ifdef _GLIBCXX_DEBUG in regex_compiler.h

	* include/bits/regex_compiler.h (_BracketMatcher::_M_is_ready):
	Initialize using NSDMI and set using _GLIBCXX_DEBUG_ONLY.

diff --git a/libstdc++-v3/include/bits/regex_compiler.h b/libstdc++-v3/include/bits/regex_compiler.h
index 0cb0c04..07a9ed3 100644
--- a/libstdc++-v3/include/bits/regex_compiler.h
+++ b/libstdc++-v3/include/bits/regex_compiler.h
@@ -370,9 +370,6 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 		  const _TraitsT& __traits)
   : _M_class_set(0), _M_translator(__traits), _M_traits(__traits),
   _M_is_non_matching(__is_non_matching)
-#ifdef _GLIBCXX_DEBUG
-  , _M_is_ready(false)
-#endif
   { }
 
   bool
@@ -386,9 +383,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   _M_add_char(_CharT __c)
   {
 	_M_char_set.push_back(_M_translator._M_translate(__c));
-#ifdef _GLIBCXX_DEBUG
-	_M_is_ready = false;
-#endif
+	_GLIBCXX_DEBUG_ONLY(_M_is_ready = false);
   }
 
   _StringT
@@ -399,9 +394,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 	if (__st.empty())
 	  __throw_regex_error(regex_constants::error_collate);
 	_M_char_set.push_back(_M_translator._M_translate(__st[0]));
-#ifdef _GLIBCXX_DEBUG
-	_M_is_ready = false;
-#endif
+	_GLIBCXX_DEBUG_ONLY(_M_is_ready = false);
 	return __st;
   }
 
@@ -415,9 +408,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 	__st = _M_traits.transform_primary(__st.data(),
 	   __st.data() + __st.size());
 	_M_equiv_set.push_back(__st);
-#ifdef _GLIBCXX_DEBUG
-	_M_is_ready = false;
-#endif
+	_GLIBCXX_DEBUG_ONLY(_M_is_ready = false);
   }
 
   // __neg should be true for \D, \S and \W only.
@@ -433,9 +424,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 	  _M_class_set |= __mask;
 	else
 	  _M_neg_class_set.push_back(__mask);
-#ifdef _GLIBCXX_DEBUG
-	_M_is_ready = false;
-#endif
+	_GLIBCXX_DEBUG_ONLY(_M_is_ready = false);
   }
 
   void
@@ -445,9 +434,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 	  __throw_regex_error(regex_constants::error_range);
 	_M_range_set.push_back(make_pair(_M_translator._M_transform(__l),
 	 _M_translator._M_transform(__r)));
-#ifdef _GLIBCXX_DEBUG
-	_M_is_ready = false;
-#endif
+	_GLIBCXX_DEBUG_ONLY(_M_is_ready = false);
   }
 
   void
@@ -457,9 +444,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 	auto __end = std::unique(_M_char_set.begin(), _M_char_set.end());
 	_M_char_set.erase(__end, _M_char_set.end());
 	_M_make_cache(_UseCache());
-#ifdef _GLIBCXX_DEBUG
-	_M_is_ready = true;
-#endif
+	_GLIBCXX_DEBUG_ONLY(_M_is_ready = true);
   }
 
 private:
@@ -507,7 +492,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   bool  _M_is_non_matching;
   _CacheT	_M_cache;
 #ifdef _GLIBCXX_DEBUG
-  bool  _M_is_ready;
+  bool  _M_is_ready = false;
 #endif
 };
 


Re: Fix intelmic-mkoffload.c if the temp path contains a '-'

2015-09-07 Thread Ilya Verbin
On Sat, Sep 05, 2015 at 00:45:36 +0300, Ilya Verbin wrote:
> 2015-09-04 22:27 GMT+03:00 Mike Stump :
> > On Sep 4, 2015, at 4:10 AM, Hahnfeld, Jonas wrote:
> >* intelmic-mkoffload.c (prepare_target_image): Fix if the temp path
> >contains a '-'.
> >
> > So, out of curiosity, did you test all characters other than null?  If - 
> > doesn’t work, there is a good chance that no such test has been done, and 
> > there is a small hoard of bugs, all the same in there.
> 
> Good point.  Objcopy in bfd/binary.c creates symbol names this way:
> 
>   for (p = buf; *p; p++)
> if (! ISALNUM (*p))
>   *p = '_';
> 
> We should do the same in intelmic-mkoffload.c.  I will prepare a patch.

gcc/
* config/i386/intelmic-mkoffload.c (prepare_target_image): Handle all
non-alphanumeric characters in the symbol name.

Regtested on x86_64-linux.  OK for trunk?  OK for gcc-5-branch?


diff --git a/gcc/config/i386/intelmic-mkoffload.c b/gcc/config/i386/intelmic-mkoffload.c
index 49e99e8..4a7812c 100644
--- a/gcc/config/i386/intelmic-mkoffload.c
+++ b/gcc/config/i386/intelmic-mkoffload.c
@@ -453,17 +453,18 @@ prepare_target_image (const char *target_compiler, int argc, char **argv)
   fork_execute (objcopy_argv[0], CONST_CAST (char **, objcopy_argv), false);
 
   /* Objcopy has created symbols, containing the input file name with
- special characters replaced with '_'.  We are going to rename these
- new symbols.  */
+ non-alphanumeric characters replaced by underscores.
+ We are going to rename these new symbols.  */
   size_t symbol_name_len = strlen (target_so_filename);
   char *symbol_name = XALLOCAVEC (char, symbol_name_len + 1);
-  for (size_t i = 0; i <= symbol_name_len; i++)
+  for (size_t i = 0; i < symbol_name_len; i++)
 {
   char c = target_so_filename[i];
-  if (c == '/' || c == '.' || c == '-')
+  if (!ISALNUM (c))
c = '_';
   symbol_name[i] = c;
 }
+  symbol_name[symbol_name_len] = '\0';
 
   char *opt_for_objcopy[3];
   opt_for_objcopy[0] = XALLOCAVEC (char, sizeof ("_binary__start=")


  -- Ilya
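
For illustration only, the character mapping performed by the quoted
bfd/binary.c loop (and mirrored by the patch above) can be sketched in
standalone C++.  This is not the mkoffload code: is_ascii_alnum,
to_symbol_char and symbol_name_for are hypothetical names, and the
hand-written ASCII test is an assumed stand-in for ISALNUM.

```cpp
#include <string>

// Hypothetical stand-in for ISALNUM, restricted to ASCII letters and digits.
constexpr bool is_ascii_alnum(char c)
{
  return (c >= '0' && c <= '9')
         || (c >= 'a' && c <= 'z')
         || (c >= 'A' && c <= 'Z');
}

// Every character that is not alphanumeric becomes an underscore.
constexpr char to_symbol_char(char c)
{
  return is_ascii_alnum(c) ? c : '_';
}

// Build the symbol-name fragment expected for a given file name.
inline std::string symbol_name_for(const std::string& path)
{
  std::string symbol;
  symbol.reserve(path.size());
  for (char c : path)
    symbol.push_back(to_symbol_char(c));
  return symbol;
}
```

Under this mapping a path such as "/tmp/cc-1.so" would turn into
"_tmp_cc_1_so", so '-' needs no special case of its own.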


Re: Fix intelmic-mkoffload.c if the temp path contains a '-'

2015-09-07 Thread Jakub Jelinek
On Mon, Sep 07, 2015 at 05:46:12PM +0300, Ilya Verbin wrote:
> On Sat, Sep 05, 2015 at 00:45:36 +0300, Ilya Verbin wrote:
> > 2015-09-04 22:27 GMT+03:00 Mike Stump :
> > > On Sep 4, 2015, at 4:10 AM, Hahnfeld, Jonas wrote:
> > >* intelmic-mkoffload.c (prepare_target_image): Fix if the temp path
> > >contains a '-'.
> > >
> > > So, out of curiosity, did you test all characters other than null?  If - 
> > > doesn’t work, there is a good chance that no such test has been done, and 
> > > there is a small hoard of bugs, all the same in there.
> > 
> > Good point.  Objcopy in bfd/binary.c creates symbol names this way:
> > 
> >   for (p = buf; *p; p++)
> > if (! ISALNUM (*p))
> >   *p = '_';
> > 
> > We should do the same in intelmic-mkoffload.c.  I will prepare a patch.
> 
> gcc/
>   * config/i386/intelmic-mkoffload.c (prepare_target_image): Handle all
>   non-alphanumeric characters in the symbol name.
> 
> Regtested on x86_64-linux.  OK for trunk?  OK for gcc-5-branch?

Ok for both.

Jakub


Re: [patch] Avoid #ifdef _GLIBCXX_DEBUG in regex_compiler.h

2015-09-07 Thread Jonathan Wakely

We could go further and remove the ABI difference in regex_compiler
when using _GLIBCXX_DEBUG, as in the attached patch.

The trick is that for the char specialization we have padding bytes
between _M_is_non_matching (a bool) and _M_cache (a std::bitset), so
we can reuse one of those padding bytes for the _M_is_ready flag.

For the non-char specializations the _Dummy struct takes up one byte,
so we could reuse that for the _M_is_ready flag.

To be safe we should probably check that alignof(std::bitset<8>) > 2
so we know there really is a padding byte that can be re-used, and
place the _M_is_ready flag afterwards if there is no padding.
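
The padding argument can be checked with a small standalone sketch.  The
struct and member names below are hypothetical and only echo the patch; the
256-bit cache size is an assumption taken from what _S_cache_size() yields
for char.  If the cache's alignment is at least 2, a second bool lands in
what would otherwise be padding and the object does not grow.

```cpp
#include <bitset>

// Hypothetical layouts; the member names only echo those in the patch.
struct OneFlag
{
  bool is_non_matching;
  std::bitset<256> cache;
};

struct TwoFlags
{
  bool is_non_matching;
  bool is_ready;            // candidate for the padding byte
  std::bitset<256> cache;
};

// True when at least one padding byte follows the first bool.
constexpr bool has_spare_padding_byte = alignof(std::bitset<256>) >= 2;

// True when adding the second flag did not change the object size.
constexpr bool same_size = sizeof(OneFlag) == sizeof(TwoFlags);

static_assert(!has_spare_padding_byte || same_size,
              "the second flag should fit in the padding before the cache");
```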

diff --git a/libstdc++-v3/include/bits/regex_compiler.h b/libstdc++-v3/include/bits/regex_compiler.h
index 07a9ed3..9616d4e 100644
--- a/libstdc++-v3/include/bits/regex_compiler.h
+++ b/libstdc++-v3/include/bits/regex_compiler.h
@@ -369,13 +369,13 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   _BracketMatcher(bool __is_non_matching,
  const _TraitsT& __traits)
   : _M_class_set(0), _M_translator(__traits), _M_traits(__traits),
-  _M_is_non_matching(__is_non_matching)
+  _M_extra_members(__is_non_matching)
   { }
 
   bool
   operator()(_CharT __ch) const
   {
-   _GLIBCXX_DEBUG_ASSERT(_M_is_ready);
+   _GLIBCXX_DEBUG_ASSERT(_M_extra._M_is_ready);
return _M_apply(__ch, _UseCache());
   }
 
@@ -383,7 +383,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   _M_add_char(_CharT __c)
   {
_M_char_set.push_back(_M_translator._M_translate(__c));
-   _GLIBCXX_DEBUG_ONLY(_M_is_ready = false);
+   _GLIBCXX_DEBUG_ONLY(_M_extra._M_is_ready = false);
   }
 
   _StringT
@@ -394,7 +394,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
if (__st.empty())
  __throw_regex_error(regex_constants::error_collate);
_M_char_set.push_back(_M_translator._M_translate(__st[0]));
-   _GLIBCXX_DEBUG_ONLY(_M_is_ready = false);
+   _GLIBCXX_DEBUG_ONLY(_M_extra._M_is_ready = false);
return __st;
   }
 
@@ -408,7 +408,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
__st = _M_traits.transform_primary(__st.data(),
   __st.data() + __st.size());
_M_equiv_set.push_back(__st);
-   _GLIBCXX_DEBUG_ONLY(_M_is_ready = false);
+   _GLIBCXX_DEBUG_ONLY(_M_extra._M_is_ready = false);
   }
 
   // __neg should be true for \D, \S and \W only.
@@ -424,7 +424,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
  _M_class_set |= __mask;
else
  _M_neg_class_set.push_back(__mask);
-   _GLIBCXX_DEBUG_ONLY(_M_is_ready = false);
+   _GLIBCXX_DEBUG_ONLY(_M_extra._M_is_ready = false);
   }
 
   void
@@ -434,7 +434,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
  __throw_regex_error(regex_constants::error_range);
_M_range_set.push_back(make_pair(_M_translator._M_transform(__l),
 _M_translator._M_transform(__r)));
-   _GLIBCXX_DEBUG_ONLY(_M_is_ready = false);
+   _GLIBCXX_DEBUG_ONLY(_M_extra._M_is_ready = false);
   }
 
   void
@@ -444,7 +444,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
auto __end = std::unique(_M_char_set.begin(), _M_char_set.end());
_M_char_set.erase(__end, _M_char_set.end());
_M_make_cache(_UseCache());
-   _GLIBCXX_DEBUG_ONLY(_M_is_ready = true);
+   _GLIBCXX_DEBUG_ONLY(_M_extra._M_is_ready = true);
   }
 
 private:
@@ -457,10 +457,24 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
return 1ul << (sizeof(_CharT) * __CHAR_BIT__ * int(_UseCache::value));
   }
 
-  struct _Dummy { };
-  typedef typename std::conditional<_UseCache::value,
-   std::bitset<_S_cache_size()>,
-   _Dummy>::type _CacheT;
+  struct _Flags
+  {
+   bool _M_is_non_matching;
+   bool _M_is_ready;
+  };
+
+  struct _Empty { };
+
+  struct _ExtraMembers
+  : _Flags,
+   conditional<_UseCache::value, bitset<_S_cache_size()>, _Empty>::type
+  {
+   explicit
+   _ExtraMembers(bool __is_non_matching)
+   : _Flags{__is_non_matching, false}
+   { }
+  };
+
   typedef typename std::make_unsigned<_CharT>::type _UnsignedCharT;
 
   bool
@@ -468,13 +482,14 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
   bool
   _M_apply(_CharT __ch, true_type) const
-  { return _M_cache[static_cast<_UnsignedCharT>(__ch)]; }
+  { return _M_extra._M_cache[static_cast<_UnsignedCharT>(__ch)]; }
 
   void
   _M_make_cache(true_type)
   {
-   for (unsigned __i = 0; __i < _M_cache.size(); __i++)
- _M_cache[__i] = _M_apply(static_cast<_CharT>(__i), false_type());
+   auto& __cache = _M_extra._M_cache;
+   for (unsigned __i = 0; __i < __cache.size(); __i++)
+ __cache[__i] = _M_apply(static_cast<_CharT>(__i), false_type());
   }
 
   void
@@ -489,11 +504,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VE

Re: [patch] Avoid #ifdef _GLIBCXX_DEBUG in regex_compiler.h

2015-09-07 Thread Jonathan Wakely

On 07/09/15 15:54 +0100, Jonathan Wakely wrote:

@@ -457,10 +457,24 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
return 1ul << (sizeof(_CharT) * __CHAR_BIT__ * int(_UseCache::value));
  }

-  struct _Dummy { };
-  typedef typename std::conditional<_UseCache::value,
-   std::bitset<_S_cache_size()>,
-   _Dummy>::type _CacheT;
+  struct _Flags
+  {
+   bool _M_is_non_matching;
+   bool _M_is_ready;
+  };
+
+  struct _Empty { };
+
+  struct _ExtraMembers
+  : _Flags,
+   conditional<_UseCache::value, bitset<_S_cache_size()>, _Empty>::type
+  {
+   explicit
+   _ExtraMembers(bool __is_non_matching)
+   : _Flags{__is_non_matching, false}
+   { }
+  };
+


And we could get rid of the _Empty type, because std::bitset<0> is an
empty type anyway, so if we made _S_cache_size()==0 when _UseCache is
false then in the current code we could just unconditionally use:

 using _CacheT = std::bitset<_S_cache_size()>;

and for the suggested change to use a padding byte for the _M_is_ready
flag we could use:

 struct _ExtraMembers
 : _Flags, bitset<_S_cache_size()>
 {
   explicit
   _ExtraMembers(bool __is_non_matching)
   : _Flags{__is_non_matching, false}
   { }
 };
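
A rough standalone sketch of that idea, with hypothetical names (Flags,
ExtraMembers) standing in for the library's types.  Whether std::bitset<0>
really is an empty class depends on the implementation, so the check below
is conditional on std::is_empty rather than assuming it.

```cpp
#include <bitset>
#include <cstddef>
#include <type_traits>

struct Flags
{
  bool is_non_matching;
  bool is_ready;
};

// Mirrors the shape of the suggested _ExtraMembers, with hypothetical names.
template<std::size_t CacheBits>
struct ExtraMembers : Flags, std::bitset<CacheBits>
{
  explicit
  ExtraMembers(bool is_non_matching)
  : Flags{is_non_matching, false}
  { }
};

// Implementation-dependent: is the zero-sized bitset an empty class?
constexpr bool bitset0_is_empty = std::is_empty<std::bitset<0>>::value;

// If it is, the empty base should add nothing to the two flag bytes.
constexpr bool no_cache_overhead = sizeof(ExtraMembers<0>) == sizeof(Flags);

static_assert(!bitset0_is_empty || no_cache_overhead,
              "an empty bitset base should not enlarge the flags");
```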


[PATCH, libiberty] Fix PR63758 by using the _NSGetEnviron() API on Darwin.

2015-09-07 Thread Iain Sandoe
Hi,

This is mostly Roland's patch, with one extra case added by me.  I also moved
the new header to include/, as suggested in comment #7 of the PR, since there
are other users for it in the compiler.

==

On Darwin platforms, when referenced from the main executable, it's permitted 
to access *_environ directly.  When the environment is required from a shared 
library then the _NSGetEnviron() API should be used.  Since libiberty is used 
from shared libraries (such as libcc1) this mechanism should be applied here.

OK for trunk and active branches?
Iain
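
As a rough illustration of the access pattern described above (not the
contents of the new header), a C++ sketch might look like this.
current_environ is a hypothetical helper name; _NSGetEnviron() and
<crt_externs.h> are the Darwin interfaces, and the plain environ
declaration is the usual POSIX one.

```cpp
#include <type_traits>

#ifdef __APPLE__
#  include <crt_externs.h>      // declares _NSGetEnviron ()
#else
extern "C" { extern char **environ; }   // provided by the C library
#endif

// Hypothetical helper: return the environment block in a way that is
// also usable from code that ends up in a shared library on Darwin.
inline char **current_environ ()
{
#ifdef __APPLE__
  return *_NSGetEnviron ();
#else
  return environ;
#endif
}

// Both branches hand back the same type, so callers need no #ifdef.
static_assert (std::is_same<decltype (current_environ ()), char **>::value,
               "current_environ yields a char ** on every platform");
```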

include/

Roland McGrath  

PR other/63758
* environ.h: New file.

libiberty/

Roland McGrath  
Iain Sandoe  

PR other/63758
* pex-unix.c: Obtain the environment interface from settings in
environ.h rather than in-line code.  Update copyright date.
* setenv.c: Likewise.
* xmalloc.c: Likewise.


 From 9dec1e4a7a62a1f556dce6e35133e4c503898a74 Mon Sep 17 00:00:00 2001
From: Iain Sandoe 
Date: Mon, 7 Sep 2015 10:22:07 +0100
Subject: [PATCH] [libiberty] Provide support for indirect use of _environ by
 Darwin.

When referenced from the main executable, it's permitted to access
*_environ directly.  When environ is required from a shared library then
the _NSGetEnviron() API should be used.  Since libiberty is used from
shared libraries (such as libcc1) this mechanism should be applied here.
---
 include/environ.h| 33 +
 libiberty/pex-unix.c |  5 ++---
 libiberty/setenv.c   | 10 +++---
 libiberty/xmalloc.c  |  5 +++--
 4 files changed, 41 insertions(+), 12 deletions(-)
 create mode 100644 include/environ.h

diff --git a/include/environ.h b/include/environ.h
new file mode 100644
index 000..c18902b
--- /dev/null
+++ b/include/environ.h
@@ -0,0 +1,33 @@
+/* Declare the environ system variable.
+   Copyright (C) 2015 Free Software Foundation, Inc.
+
+This file is part of the libiberty library.
+Libiberty is free software; you can redistribute it and/or
+modify it under the terms of the GNU Library General Public
+License as published by the Free Software Foundation; either
+version 2 of the License, or (at your option) any later version.
+
+Libiberty is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+Library General Public License for more details.
+
+You should have received a copy of the GNU Library General Public
+License along with libiberty; see the file COPYING.LIB.  If not,
+write to the Free Software Foundation, Inc., 51 Franklin Street - Fifth Floor,
+Boston, MA 02110-1301, USA.  */
+
+/* On OSX, the environ variable can be used directly in the code of an
+   executable, but cannot be used in the code of a shared library (such as
+   GCC's liblto_plugin, which links in libiberty code).  Instead, the
+   function _NSGetEnviron can be called to get the address of environ.  */
+
+#ifndef HAVE_ENVIRON_DECL
+#  ifdef __APPLE__
+# include 
+# define environ (*_NSGetEnviron ())
+#  else
+extern char **environ;
+#  endif
+#  define HAVE_ENVIRON_DECL
+#endif
diff --git a/libiberty/pex-unix.c b/libiberty/pex-unix.c
index 0715115..b48f315 100644
--- a/libiberty/pex-unix.c
+++ b/libiberty/pex-unix.c
@@ -2,7 +2,7 @@
with other subprocesses), and wait for it.  Generic Unix version
(also used for UWIN and VMS).
Copyright (C) 1996, 1997, 1998, 1999, 2000, 2001, 2003, 2004, 2005, 2009,
-   2010 Free Software Foundation, Inc.
+   2010, 2015 Free Software Foundation, Inc.
 
 This file is part of the libiberty library.
 Libiberty is free software; you can redistribute it and/or
@@ -23,6 +23,7 @@ Boston, MA 02110-1301, USA.  */
 #include "config.h"
 #include "libiberty.h"
 #include "pex-common.h"
+#include "environ.h"
 
 #include 
 #include 
@@ -390,8 +391,6 @@ pex_child_error (struct pex_obj *obj, const char *executable,
 
 /* Execute a child.  */
 
-extern char **environ;
-
 #if defined(HAVE_SPAWNVE) && defined(HAVE_SPAWNVPE)
 /* Implementation of pex->exec_child using the Cygwin spawn operation.  */
 
diff --git a/libiberty/setenv.c b/libiberty/setenv.c
index 714ca0a..5b51193 100644
--- a/libiberty/setenv.c
+++ b/libiberty/setenv.c
@@ -1,5 +1,5 @@
-/* Copyright (C) 1992, 1995, 1996, 1997, 2002, 2011 Free Software Foundation,
-   Inc.
+/* Copyright (C) 1992, 1995, 1996, 1997, 2002, 2011, 2015
+   Free Software Foundation, Inc.
This file based on setenv.c in the GNU C Library.
 
The GNU C Library is free software; you can redistribute it and/or
@@ -62,11 +62,7 @@ extern int errno;
 #endif
 
 #define __environ  environ
-#ifndef HAVE_ENVIRON_DECL
-#ifndef environ
-extern char **environ;
-#endif
-#endif
+#include "environ.h"
 
 #undef setenv
 #undef unsetenv
diff --git a/libiberty/xmalloc.c b/libiberty/xmalloc.c
index 3e97aab..f849aee 100644
--- a/libiberty/xmalloc.c
+++ b/libiberty/xmalloc.c
@@ -1,5 +1,6 @

[PATCH] PR target/67480: AVX512 bitwise logic insns pattern is incorrect

2015-09-07 Thread Alexander Fomin
This patch addresses PR target/67480. As there are no bitwise logic
instructions for BYTE/WORD in AVX512, we should split corresponding
pattern into two different patterns, namely:
(a) any bitwise logic, for SI/DI modes, masking is supported;
(b) any bitwise logic, for QI/HI modes, masking is not supported
to avoid generating wrong instructions for AVX512BW targets.
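
One way to picture why masking is the sticking point, as a scalar sketch
rather than anything taken from the patch: an unmasked bitwise operation
yields the same bits whatever element width is used, but a write mask
selects whole elements, so a dword-granular mask cannot express a per-byte
selection.  All function names below are hypothetical, and a 32-bit integer
stands in for one dword lane of a vector.

```cpp
#include <cstdint>

// Byte i (0 = least significant) of a 32-bit lane.
constexpr std::uint32_t byte_of(std::uint32_t x, unsigned i)
{
  return (x >> (8 * i)) & 0xFFu;
}

// AND computed one byte at a time, then reassembled.
constexpr std::uint32_t and_bytewise(std::uint32_t a, std::uint32_t b)
{
  return ((byte_of(a, 0) & byte_of(b, 0)) << 0)
       | ((byte_of(a, 1) & byte_of(b, 1)) << 8)
       | ((byte_of(a, 2) & byte_of(b, 2)) << 16)
       | ((byte_of(a, 3) & byte_of(b, 3)) << 24);
}

// Pick byte i from the result when 'take' is set, otherwise from 'old'.
constexpr std::uint32_t select_byte(bool take, std::uint32_t result,
                                    std::uint32_t old, unsigned i)
{
  return (take ? byte_of(result, i) : byte_of(old, i)) << (8 * i);
}

// Merge-masked AND with one mask bit per byte.
constexpr std::uint32_t masked_and_bytewise(std::uint32_t a, std::uint32_t b,
                                            std::uint32_t old, unsigned mask)
{
  return select_byte((mask & 1u) != 0, a & b, old, 0)
       | select_byte((mask & 2u) != 0, a & b, old, 1)
       | select_byte((mask & 4u) != 0, a & b, old, 2)
       | select_byte((mask & 8u) != 0, a & b, old, 3);
}

// Merge-masked AND with a single mask bit for the whole dword.
constexpr std::uint32_t masked_and_dwordwise(std::uint32_t a, std::uint32_t b,
                                             std::uint32_t old, bool mask_bit)
{
  return mask_bit ? (a & b) : old;
}
```

With a byte mask that keeps only some bytes of a lane, neither setting of
the dword mask bit reproduces the byte-masked result, which is consistent
with leaving masking out of the QI/HI pattern.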

Is it OK for master if there are no regressions on Linux/x86_64?

Alexander
---
gcc/

PR target/67480
* config/i386/sse.md (define_mode_iterator VI48_AVX_AVX512F): New.
(define_mode_iterator VI12_AVX_AVX512F): New.
(define_insn "3"): Change
all iterators to VI48_AVX_AVX512F. Extract remaining modes ...
(define_insn "*3"): ... Into new pattern using
VI12_AVX_AVX512F iterators without masking.

gcc/testsuite
PR target/67480
* gcc.target/i386/pr67480.c: New test.
---
 gcc/config/i386/sse.md  | 115 +---
 gcc/testsuite/gcc.target/i386/pr67480.c |  10 +++
 2 files changed, 117 insertions(+), 8 deletions(-)

diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index 4535570..3571128 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -416,6 +416,14 @@
   [(V16SI "TARGET_AVX512F") V8SI V4SI
(V8DI "TARGET_AVX512F") V4DI V2DI])
 
+(define_mode_iterator VI48_AVX_AVX512F
+  [(V16SI "TARGET_AVX512F") (V8SI "TARGET_AVX") V4SI
+   (V8DI "TARGET_AVX512F") (V4DI "TARGET_AVX") V2DI])
+
+(define_mode_iterator VI12_AVX_AVX512F
+  [ (V64QI "TARGET_AVX512F") (V32QI "TARGET_AVX") V16QI
+(V32HI "TARGET_AVX512F") (V16HI "TARGET_AVX") V8HI])
+
 (define_mode_iterator V48_AVX2
   [V4SF V2DF
V8SF V4DF
@@ -11077,10 +11085,10 @@
 })
 
 (define_insn "3"
-  [(set (match_operand:VI 0 "register_operand" "=x,v")
-   (any_logic:VI
- (match_operand:VI 1 "nonimmediate_operand" "%0,v")
- (match_operand:VI 2 "nonimmediate_operand" "xm,vm")))]
+  [(set (match_operand:VI48_AVX_AVX512F 0 "register_operand" "=x,v")
+   (any_logic:VI48_AVX_AVX512F
+ (match_operand:VI48_AVX_AVX512F 1 "nonimmediate_operand" "%0,v")
+ (match_operand:VI48_AVX_AVX512F 2 "nonimmediate_operand" "xm,vm")))]
   "TARGET_SSE && 
&& ix86_binary_operator_ok (, mode, operands)"
 {
@@ -11109,13 +7,104 @@
 case V4DImode:
 case V4SImode:
 case V2DImode:
-  if (TARGET_AVX512VL)
+  tmp = TARGET_AVX512VL ? "p" : "p";
+  break;
+default:
+  gcc_unreachable ();
+  }
+  break;
+
+   case MODE_V16SF:
+  gcc_assert (TARGET_AVX512F);
+   case MODE_V8SF:
+  gcc_assert (TARGET_AVX);
+   case MODE_V4SF:
+  gcc_assert (TARGET_SSE);
+
+  tmp = "ps";
+  break;
+
+   default:
+  gcc_unreachable ();
+   }
+
+  switch (which_alternative)
+{
+case 0:
+  ops = "%s\t{%%2, %%0|%%0, %%2}";
+  break;
+case 1:
+  ops = "v%s\t{%%2, %%1, %%0|%%0, %%1, 
%%2}";
+  break;
+default:
+  gcc_unreachable ();
+}
+
+  snprintf (buf, sizeof (buf), ops, tmp);
+  return buf;
+}
+  [(set_attr "isa" "noavx,avx")
+   (set_attr "type" "sselog")
+   (set (attr "prefix_data16")
+ (if_then_else
+   (and (eq_attr "alternative" "0")
+   (eq_attr "mode" "TI"))
+   (const_string "1")
+   (const_string "*")))
+   (set_attr "prefix" "")
+   (set (attr "mode")
+   (cond [(and (match_test " == 16")
+   (match_test "TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL"))
+(const_string "")
+  (match_test "TARGET_AVX2")
+(const_string "")
+  (match_test "TARGET_AVX")
+(if_then_else
+  (match_test " > 16")
+  (const_string "V8SF")
+  (const_string ""))
+  (ior (not (match_test "TARGET_SSE2"))
+   (match_test "optimize_function_for_size_p (cfun)"))
+(const_string "V4SF")
+ ]
+ (const_string "")))])
+
+(define_insn "*3"
+  [(set (match_operand:VI12_AVX_AVX512F 0 "register_operand" "=x,v")
+   (any_logic: VI12_AVX_AVX512F
+ (match_operand:VI12_AVX_AVX512F 1 "nonimmediate_operand" "%0,v")
+ (match_operand:VI12_AVX_AVX512F 2 "nonimmediate_operand" "xm,vm")))]
+  "TARGET_SSE && ix86_binary_operator_ok (, mode, operands)"
+{
+  static char buf[64];
+  const char *ops;
+  const char *tmp;
+
+  switch (get_attr_mode (insn))
+{
+case MODE_XI:
+  gcc_assert (TARGET_AVX512F);
+case MODE_OI:
+  gcc_assert (TARGET_AVX2 || TARGET_AVX512VL);
+case MODE_TI:
+  gcc_assert (TARGET_SSE2 || TARGET_AVX512VL);
+  switch (mode)
+  {
+case V64QImode:
+case V32HImode:
+  if (TARGET_AVX512F)
   {
-tmp = "p";
+tmp = "pq";
 break;
   }
-default:
+case V32QImode:
+case V16HImode:
+case V16QImo

[PATCH][RTL-ifcvt] PR rtl-optimization/67465: Do not ifcvt complex blocks if the else block is empty

2015-09-07 Thread Kyrill Tkachov

Hi all,

This patch fixes the PRs in the ChangeLog that have been reported against my
if-conversion patch.
The problem occurs when the 'then' block is complex but the 'else' block is
empty.  In this case the calling code in noce_process_if_block takes the
'else' move (x := b) from the test block.  However, we have not checked
whether the test block is valid for complex-block if-conversion with
bb_valid_for_noce_process_p.  Also, that's a case I wasn't particularly
targeting when writing the initial patch.

This patch bails out of noce_try_cmove_arith when one of the blocks is complex
and the other is empty.
I've checked that if-conversion still happens in the cases of interest from
the original patch.
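
For readers less familiar with the pass, a source-level analogue of the
shape being discussed (a non-trivial 'then' block, an empty 'else' block)
could look like the following.  This is only an illustration: the function
names are hypothetical, and the real transformation operates on RTL, not on
source code.

```cpp
// Hypothetical stand-in for a "complex" then-block: more than a single move.
constexpr int then_value(int p)
{
  return static_cast<int>(static_cast<unsigned>(1 ^ p) % 2);
}

// Branchy form: the else block is empty, so x keeps the value b that was
// assigned before the test.
inline int branchy(bool cond, int p, int b)
{
  int x = b;
  if (cond)
    x = then_value(p);
  return x;
}

// Source-level analogue of the converted form: a single select between
// the then-value and b.
constexpr int converted(bool cond, int p, int b)
{
  return cond ? then_value(p) : b;
}
```

The select is only valid if 'b' really is the value x had before the test,
which is the property the unchecked test block could not guarantee.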

I've added the testcase from PR 67465 since that one uses __builtin_abort and
triggers the problem nicely.
The others show the miscompilation using printf, and it seems to go away if I
replace the printf with an abort, so I have confirmed manually that, with this
patch, the miscompilation goes away on those testcases.

PR rtl-optimization/67481 is a testsuite regression on sparc-solaris that
Rainer reported.  I haven't tested that this patch fixes that, but I suspect
that the root cause is the same.  Rainer, could you please check that this
fixes the regression for you?

Bootstrapped and tested on aarch64 and x86_64.

OK for trunk if the sparc testing comes back OK?

Thanks,
Kyrill

2015-09-07  Kyrylo Tkachov  

PR rtl-optimization/67456
PR rtl-optimization/67464
PR rtl-optimization/67465
PR rtl-optimization/67481
* ifcvt.c (noce_try_cmove_arith): Bail out if one of the blocks
is complex and the other is empty.

2015-09-07  Kyrylo Tkachov  

* gcc.dg/pr67465.c: New test.
commit 2305f7deed793315f04221f718880676cd62474d
Author: Kyrylo Tkachov 
Date:   Mon Sep 7 14:58:01 2015 +0100

[RTL-ifcvt] PR rtl-optimization/67465: Do not ifcvt complex blocks if the else block is empty

diff --git a/gcc/ifcvt.c b/gcc/ifcvt.c
index d2f5b66..5716dcc 100644
--- a/gcc/ifcvt.c
+++ b/gcc/ifcvt.c
@@ -2079,7 +2079,14 @@ noce_try_cmove_arith (struct noce_if_info *if_info)
 	}
 }
 
-  if (!a_simple && then_bb && !b_simple && else_bb
+  /* When one of the blocks is really empty and the other is a complex block
+ don't do anything.  The value 'b' may have come from the test block that
+ we did not check for if-conversion validity in noce_process_if_block.  */
+  if ((!a_simple && !else_bb)
+   || (!b_simple && !then_bb))
+return FALSE;
+
+  if (then_bb && else_bb
   && (!bbs_ok_for_cmove_arith (then_bb, else_bb)
 	  || !bbs_ok_for_cmove_arith (else_bb, then_bb)))
 return FALSE;
diff --git a/gcc/testsuite/gcc.dg/pr67465.c b/gcc/testsuite/gcc.dg/pr67465.c
new file mode 100644
index 000..321fd38
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr67465.c
@@ -0,0 +1,53 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -std=gnu99" } */
+
+int a, b, c, d, e, h;
+
+int
+fn1 (int p1)
+{
+  {
+int g[2];
+for (int i = 0; i < 1; i++)
+  g[i] = 0;
+if (g[0] < c)
+  {
+	a = (unsigned) (1 ^ p1) % 2;
+	return 0;
+  }
+  }
+  return 0;
+}
+
+void
+fn2 ()
+{
+  for (h = 0; h < 1; h++)
+{
+  for (int j = 0; j < 2; j++)
+	{
+	  for (b = 1; b; b = 0)
+	a = 1;
+	  for (; b < 1; b++)
+	;
+	  if (e)
+	continue;
+	  a = 2;
+	}
+  fn1 (h);
+  short k = -16;
+  d = k > a;
+}
+}
+
+int
+main ()
+{
+  fn2 ();
+
+  if (a != 2)
+__builtin_abort ();
+
+  return 0;
+}
+


Re: [PATCH, libiberty] Fix PR63758 by using the _NSGetEnviron() API on Darwin.

2015-09-07 Thread Mike Stump
On Sep 7, 2015, at 8:23 AM, Iain Sandoe  wrote:
> On Darwin platforms, when referenced from the main executable, it's permitted 
> to access *_environ directly.  

> OK for trunk and active branches?

Darwin bits Ok.


[patch] Relax Debug Mode assertions on operator-> for smart pointers.

2015-09-07 Thread Jonathan Wakely

The standard says that unique_ptr::operator->() and
shared_ptr::operator->() have preconditions that the pointer is not
null, but that's not strictly necessary and prevents using that
function to get a raw pointer from a smart pointer in generic code.

This changes the _GLIBCXX_DEBUG_ASSERT to _GLIBCXX_DEBUG_PEDASSERT so
they only fire in pedantic debug mode.
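
The kind of generic code this enables might look like the sketch below.
raw_pointer is a hypothetical helper, not part of the library, and calling
it on an empty smart pointer still relies on the relaxed behaviour described
here rather than on anything the standard guarantees.

```cpp
#include <memory>
#include <type_traits>
#include <utility>

// Hypothetical helper: obtain the raw pointer from any pointer-like type
// that provides operator->().
template<typename SmartPtr>
  auto
  raw_pointer(const SmartPtr& p) -> decltype(p.operator->())
  {
    return p.operator->();
  }

// The deduced type is the pointer type of whichever smart pointer is used.
static_assert(std::is_same<
                decltype(raw_pointer(std::declval<const std::unique_ptr<int>&>())),
                int*>::value,
              "unique_ptr<int> yields int*");
```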

Tested powerpc64le-linux, committed to trunk.
commit b81acb46e3634cb827f45988c90148612925664f
Author: Jonathan Wakely 
Date:   Mon Sep 7 17:47:25 2015 +0100

Relax Debug Mode assertions on operator-> for smart pointers.

	* include/bits/shared_ptr_base.h (__shared_ptr::operator->): Change
	_GLIBCXX_DEBUG_ASSERT to _GLIBCXX_DEBUG_PEDASSERT.
	* include/bits/unique_ptr.h (unique_ptr::operator->): Likewise.
	* testsuite/20_util/shared_ptr/observers/get.cc: Test operator-> on
	empty shared_ptr.

diff --git a/libstdc++-v3/include/bits/shared_ptr_base.h b/libstdc++-v3/include/bits/shared_ptr_base.h
index f2f577b..75f1a0d 100644
--- a/libstdc++-v3/include/bits/shared_ptr_base.h
+++ b/libstdc++-v3/include/bits/shared_ptr_base.h
@@ -1054,7 +1054,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   _Tp*
   operator->() const noexcept
   {
-	_GLIBCXX_DEBUG_ASSERT(_M_ptr != 0);
+	_GLIBCXX_DEBUG_PEDASSERT(_M_ptr != 0);
 	return _M_ptr;
   }
 
diff --git a/libstdc++-v3/include/bits/unique_ptr.h b/libstdc++-v3/include/bits/unique_ptr.h
index 8ab55da..bb96951 100644
--- a/libstdc++-v3/include/bits/unique_ptr.h
+++ b/libstdc++-v3/include/bits/unique_ptr.h
@@ -295,7 +295,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   pointer
   operator->() const noexcept
   {
-	_GLIBCXX_DEBUG_ASSERT(get() != pointer());
+	_GLIBCXX_DEBUG_PEDASSERT(get() != pointer());
 	return get();
   }
 
diff --git a/libstdc++-v3/testsuite/20_util/shared_ptr/observers/get.cc b/libstdc++-v3/testsuite/20_util/shared_ptr/observers/get.cc
index 85fc71d..867f07a 100644
--- a/libstdc++-v3/testsuite/20_util/shared_ptr/observers/get.cc
+++ b/libstdc++-v3/testsuite/20_util/shared_ptr/observers/get.cc
@@ -63,11 +63,24 @@ test03()
   VERIFY( &p->i == &a->i );
 }
 
+void
+test04()
+{
+  bool test __attribute__((unused)) = true;
+
+#if !(defined _GLIBCXX_DEBUG && defined _GLIBCXX_DEBUG_PEDANTIC)
+  std::shared_ptr p;
+  auto np = p.operator->();
+  VERIFY( np == nullptr );
+#endif
+}
+
 int 
 main()
 {
   test01();
   test02();
   test03();
+  test04();
   return 0;
 }


Re: [PATCH g++ driver] Push -static-libstdc++ back onto the command line to allow spec substitutions to use it.

2015-09-07 Thread Iain Sandoe

On 14 Jul 2015, at 18:45, Iain Sandoe wrote:

> 
> On 14 Jul 2015, at 18:24, Jason Merrill wrote:
> 
>> On 06/18/2015 04:12 AM, Iain Sandoe wrote:
>>> The patch below pushes -static-libstdc++ onto the output command line (for 
>>> targets without -Bstatic/dynamic)  so that such specs have an opportunity 
>>> to fire.
>> 
>> Won't that produce an unrecognized flag error from the linker?
> 
> IFF the target doesn't support -Bstatic/dynamic  *and* doesn't provide a spec 
> %{static-libstdc++:...
> then, that could happen.
> 
> There's a fairly small group of non-binutils targets - other than Darwin that 
> would be maybe AIX and some Solaris versions (and I believe that the native 
> linkers there do support -Bstatic/dynamic), however, I'll double-check with 
> the maintainers.

I checked with David (AIX) and Rainer (Solaris), as those are the only other
two platforms I could think of off-hand that use non-binutils linkers.
Neither should be affected by the change.
OK for trunk?
Iain



[PATCH, Darwin] Some driver TLC (improve support for the '-arch' flag).

2015-09-07 Thread Iain Sandoe
Hi,

For some Darwin compilers, "-arch " can be used (a) in place of, but to 
indicate the same as, a multilib flag like "-m32" and (b) multiple times to 
indicate that the User wants a FAT object with multiple arch slices.

It's helpful to support this, as far as possible, to minimise build system 
changes between compilers.

---

This patch improves the uniformity of support for (a)
 - provides support for PPC
 - produces warnings where such flags conflict with each other and/or with any 
multilib options given.

We don't support (b), at present, so the patch produces warnings if the User 
attempts to add multiple (different) instances of -arch.

OK for trunk?
Iain
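
The conflict handling can be summarised as a small decision function.  The
sketch below is only a restatement of the DARWIN_X86 branch visible in the
patch; ArchAction and x86_arch_action are hypothetical names, and the real
code sets flags and emits warnings rather than returning a value.

```cpp
// Possible outcomes for the x86 case (hypothetical enumeration).
enum class ArchAction { None, AppendM32, AppendM64, WarnConflict };

// Decide what to do given which -arch values and -m32/-m64 flags were seen.
constexpr ArchAction x86_arch_action(bool seen_i386, bool seen_x86_64,
                                     bool seen_m32, bool seen_m64)
{
  return seen_i386
           ? ((seen_x86_64 || seen_m64)
                ? ArchAction::WarnConflict          // i386 vs x86_64 or -m64
                : (!seen_m32 ? ArchAction::AppendM32 : ArchAction::None))
           : seen_x86_64
               ? (seen_m32
                    ? ArchAction::WarnConflict      // x86_64 vs -m32
                    : (!seen_m64 ? ArchAction::AppendM64 : ArchAction::None))
               : ArchAction::None;                  // no -arch given
}
```

So "-arch i386" alone behaves like "-m32", "-arch i386 -m32" adds nothing,
and "-arch i386 -arch x86_64" falls into the warning case.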

gcc/

Iain Sandoe  

* config/darwin-driver.c (darwin_driver_init): Handle '-arch' for PPC,
detect conflicts between -arch and multilib settings.  Detect and warn
about conflicts between multiple -arch definitions.


From 9a9a4ef8b032e333b6b56be19ea093e0e8b84b2a Mon Sep 17 00:00:00 2001
From: Iain Sandoe 
Date: Mon, 7 Sep 2015 09:52:15 +0100
Subject: [PATCH] [Darwin, driver] Improve support for the '-arch' flag.

Support the flag for X86 and PPC, also check and warn for conflicts of
settings of the flag with multi-lib flags or other instances of the
'-arch' flag.
---
 gcc/config/darwin-driver.c | 98 --
 1 file changed, 94 insertions(+), 4 deletions(-)

diff --git a/gcc/config/darwin-driver.c b/gcc/config/darwin-driver.c
index 868cb8d..727ea53 100644
--- a/gcc/config/darwin-driver.c
+++ b/gcc/config/darwin-driver.c
@@ -179,21 +179,55 @@ darwin_driver_init (unsigned int *decoded_options_count,
struct cl_decoded_option **decoded_options)
 {
   unsigned int i;
+  bool seenX86 = false;
+  bool seenX86_64 = false;
+  bool seenPPC = false;
+  bool seenPPC64 = false;
+  bool seenM32 = false;
+  bool seenM64 = false;
+  bool appendM32 = false;
+  bool appendM64 = false;
 
   for (i = 1; i < *decoded_options_count; i++)
 {
   if ((*decoded_options)[i].errors & CL_ERR_MISSING_ARG)
continue;
+
   switch ((*decoded_options)[i].opt_index)
{
-#if DARWIN_X86
case OPT_arch:
+ /* Support provision of a single -arch  flag as a means of
+specifying the sub-target/multi-lib.  Translate this into -m32/64
+as appropriate.  */  
  if (!strcmp ((*decoded_options)[i].arg, "i386"))
-   generate_option (OPT_m32, NULL, 1, CL_DRIVER, &(*decoded_options)[i]);
+   seenX86 = true;
  else if (!strcmp ((*decoded_options)[i].arg, "x86_64"))
-   generate_option (OPT_m64, NULL, 1, CL_DRIVER, &(*decoded_options)[i]);
+   seenX86_64 = true;
+ else if (!strcmp ((*decoded_options)[i].arg, "ppc"))
+   seenPPC = true;
+ else if (!strcmp ((*decoded_options)[i].arg, "ppc64"))
+   seenPPC64 = true;
+ else
+   error ("this compiler does not support %s",
+  (*decoded_options)[i].arg);
+ /* Now we've examined it, drop the -arch arg.  */
+ if (*decoded_options_count > i) {
+   memmove (*decoded_options + i,
+*decoded_options + i + 1,
+((*decoded_options_count - i)
+ * sizeof (struct cl_decoded_option)));
+ }
+ --i;
+ --*decoded_options_count; 
+ break;
+
+   case OPT_m32:
+ seenM32 = true;
+ break;
+
+   case OPT_m64:
+ seenM64 = true;
  break;
-#endif
 
case OPT_filelist:
case OPT_framework:
@@ -218,4 +252,60 @@ darwin_driver_init (unsigned int *decoded_options_count,
 }
 
   darwin_default_min_version (decoded_options_count, decoded_options);
+  /* Turn -arch  into the appropriate -m32/-m64 flag.
+ If the User tried to specify multiple arch flags (which is possible with
+ some Darwin compilers) warn that this mode is not supported by this
+ compiler (and ignore the arch flags, which means that the default multi-
+ lib will be generated).  */
+  /* TODO: determine if these warnings would better be errors.  */
+#if DARWIN_X86
+  if (seenPPC || seenPPC64)
+warning (0, "this compiler does not support PowerPC (arch flags ignored)");
+  if (seenX86)
+{
+  if (seenX86_64 || seenM64)
+   warning (0, "%s conflicts with i386 (arch flags ignored)",
+   (seenX86_64? "x86_64": "m64"));
+  else if (! seenM32) /* Add -m32 if the User didn't. */
+   appendM32 = true;
+}
+  else if (seenX86_64)
+{
+  if (seenX86 || seenM32)
+   warning (0, "%s conflicts with x86_64 (arch flags ignored)",
+(seenX86? "i386": "m32"));
+  else if (! seenM64) /* Add -m64 if the User didn't. */
+   appendM64 = true;
+}  
+#elif DARWIN_PPC
+  if (seenX86 || seenX86_64)
+warning (0, "this compiler does not support X86 (arch flags ignored)");
+  if (seenPPC)
+{
+  if 

[patch] Enable lightweight checks with _GLIBCXX_ASSERTIONS.

2015-09-07 Thread Jonathan Wakely

This patch adds the "debug mode lite" we've been talking about, by
changing __glibcxx_assert to be activated by _GLIBCXX_ASSERTIONS
instead of _GLIBCXX_DEBUG (and making the latter imply the former).

_GLIBCXX_ASSERTIONS is already used in Parallel Mode for enabling
optional assertions (although some of them are O(n) and so we might
want to change them to use another macro like _GLIBCXX_DEBUG or
_GLIBCXX_PARALLEL_ASSERTIONS instead).

With the change to define __glibcxx_assert() without Debug Mode we can
change most uses of _GLIBCXX_DEBUG_ASSERT to simply __glibcxx_assert,
so that the assertion is done when _GLIBCXX_ASSERTIONS is defined (not
only in Debug Mode).
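
The macro arrangement can be pictured with a small standalone sketch.  The
names MYLIB_DEBUG, MYLIB_ASSERTIONS and mylib_assert are hypothetical
stand-ins for _GLIBCXX_DEBUG, _GLIBCXX_ASSERTIONS and __glibcxx_assert, and
std::abort is an assumed failure action, not what the library does.

```cpp
#include <cstdlib>
#include <vector>

// The heavyweight debug macro implies the lightweight one.
#if defined(MYLIB_DEBUG) && !defined(MYLIB_ASSERTIONS)
#  define MYLIB_ASSERTIONS 1
#endif

// The assertion macro depends only on the lightweight macro.
#ifdef MYLIB_ASSERTIONS
#  define mylib_assert(cond) do { if (!(cond)) std::abort(); } while (false)
constexpr bool assertions_enabled = true;
#else
#  define mylib_assert(cond) do { } while (false)
constexpr bool assertions_enabled = false;
#endif

#ifdef MYLIB_DEBUG
constexpr bool debug_mode = true;
#else
constexpr bool debug_mode = false;
#endif

// An O(1) precondition check of the kind that suits the "lite" mode.
inline int checked_front(const std::vector<int>& v)
{
  mylib_assert(!v.empty());
  return v.front();
}
```

Building with -DMYLIB_ASSERTIONS enables the check on its own, while
-DMYLIB_DEBUG enables it as a side effect of the fuller debug mode.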

I haven't added any new assertions yet; this just converts the
lightweight Debug Mode checks, but the next step will be to add
additional assertions to the (normal mode) containers. The google
branches contain several good examples of checks to add.

François, what do you think of this approach?


commit 10af11a95e3a7341f6371801e37ff140e197e634
Author: Jonathan Wakely 
Date:   Mon Sep 7 14:05:23 2015 +0100

Enable lightweight checks with _GLIBCXX_ASSERTIONS.

	* doc/xml/manual/using.xml (_GLIBCXX_ASSERTIONS): Document.
	* doc/html/manual/using_macros.html: Regenerate.
	* include/bits/c++config: Define _GLIBCXX_ASSERTIONS when
	_GLIBCXX_DEBUG is defined. Disable std::string extern templates when
	(_GLIBCXX_EXTERN_TEMPLATE, __glibcxx_assert): Depend on
	_GLIBCXX_ASSERTIONS instead of _GLIBCXX_DEBUG.
	* include/debug/debug.h [!_GLIBCXX_DEBUG]: Define
	__glibcxx_requires_non_empty_range and __glibcxx_requires_nonempty.
	* include/backward/auto_ptr.h (auto_ptr::operator*,
	auto_ptr::operator->): Replace _GLIBCXX_DEBUG_ASSERT with
	__glibcxx_assert.
	* include/bits/basic_string.h (basic_string::operator[],
	basic_string::front, basic_string::back, basic_string::pop_back):
	Likewise.
	* include/bits/random.h
	(uniform_int_distribution::param_type::param_type,
	uniform_real_distribution::param_type::param_type,
	normal_distribution::param_type::param_type,
	gamma_distribution::param_type::param_type,
	bernoulli_distribution::param_type::param_type,
	binomial_distribution::param_type::param_type,
	geometric_distribution::param_type::param_type,
	negative_binomial_distribution::param_type::param_type,
	poisson_distribution::param_type::param_type,
	exponential_distribution::param_type::param_type): Likewise.
	* include/bits/regex.h (match_results::operator[],
	match_results::prefix, match_results::suffix): Likewise.
	* include/bits/regex.tcc (format, regex_iterator::operator++):
	Likewise.
	* include/bits/regex_automaton.tcc (_StateSeq::_M_clone): Likewise.
	* include/bits/regex_compiler.tcc (_Compiler::_Compiler,
	_Compiler::_M_insert_character_class_matcher): Likewise.
	* include/bits/regex_executor.tcc (_Executor::_M_dfs): Likewise.
	* include/bits/regex_scanner.tcc (_Scanner::_M_advance,
	_Scanner::_M_scan_normal): Likewise.
	* include/bits/shared_ptr_base.h (__shared_ptr::_M_reset,
	__shared_ptr::operator*): Likewise.
	* include/bits/stl_iterator_base_funcs.h (__advance): Likewise.
	* include/bits/unique_ptr.h (unique_ptr::operator*,
	unique_ptr::operator[]): Likewise.
	* include/experimental/fs_path.h (path::path(string_type, _Type),
	path::iterator::operator++, path::iterator::operator--,
	path::iterator::operator*): Likewise.
	* include/experimental/string_view (basic_string_view::operator[],
	basic_string_view::front, basic_string_view::back,
	basic_string_view::remove_prefix): Likewise.
	* include/ext/random (beta_distribution::param_type::param_type,
	normal_mv_distribution::param_type::param_type,
	rice_distribution::param_type::param_type,
	pareto_distribution::param_type::param_type,
	k_distribution::param_type::param_type,
	arcsine_distribution::param_type::param_type,
	hoyt_distribution::param_type::param_type,
	triangular_distribution::param_type::param_type,
	von_mises_distribution::param_type::param_type,
	hypergeometric_distribution::param_type::param_type,
	logistic_distribution::param_type::param_type): Likewise.
	* include/ext/vstring.h (__versa_string::operator[]): Likewise.
	* include/std/complex (polar): Likewise.
	* include/std/mutex [!_GTHREAD_USE_MUTEX_TIMEDLOCK]
	(timed_mutex::~timed_mutex, timed_mutex::unlock,
	(recursive_timed_mutex::~timed_mutex, recursive_timed_mutex::unlock):
	Likewise.
	* include/std/shared_mutex [!PTHREAD_RWLOCK_INITIALIZER]
	(__shared_mutex_pthread::__shared_mutex_pthread,
	__shared_mutex_pthread::~__shared_mutex_pthread): Likewise.
	(__shared_mutex_pthread::lock, __shared_mutex_pthread::try_lock,
	__shared_mutex_pthread::unlock, __shared_mutex_pthread::lock_shared,
	__shared_mutex_pthread::try_lock_shared): Likewise.

Re: [patch] Enable lightweight checks with _GLIBCXX_ASSERTIONS.

2015-09-07 Thread Daniel Krügler
2015-09-07 20:27 GMT+02:00 Jonathan Wakely :
> This patch adds the "debug mode lite" we've been talking about, by
> changing __glibcxx_assert to be activated by _GLIBCXX_ASSERTIONS
> instead of _GLIBCXX_DEBUG (and making the latter imply the former).
>
> _GLIBCXX_ASSERTIONS is already used in Parallel Mode for enabling
> optional assertions (although some of them are O(n) and so we might
> want to change them to use another macro like _GLIBCXX_DEBUG or
> _GLIBCXX_PARALLEL_ASSERTIONS instead).
>
> With the change to define __glibcxx_assert() without Debug Mode we can
> change most uses of _GLIBCXX_DEBUG_ASSERT to simply __glibcxx_assert,
> so that the assertion is done when _GLIBCXX_ASSERTIONS is defined (not
> only in Debug Mode).
>
> I haven't added any new assertions yet, this just converts the
> lightweight Debug Mode checks, but the next step will be to add
> additional assertions to the (normal mode) containers. The google
> branches contain several good examples of checks to add.

In the suggested doc changes:

+When defined, _GLIBCXX_ASSERTIONS is defined
+automatically, so all the assertions that enables are also enabled
+in debug mode.

there seems to be a typo, presumably it should be

+When defined, _GLIBCXX_ASSERTIONS is defined
+automatically, so all the assertions that it enables are also enabled
+in debug mode.

instead?

- Daniel


Re: [PATCH][RTL-ifcvt] PR rtl-optimization/67465: Do not ifcvt complex blocks if the else block is empty

2015-09-07 Thread H.J. Lu
On Mon, Sep 7, 2015 at 9:29 AM, Kyrill Tkachov  wrote:
> Hi all,
>
> This patch fixes the PRs in the ChangeLog that have been reported against my
> if-conversion patch.
> The problem occurs when the 'then' block is complex but the else block is
> empty.
> In this case the calling code in noce_process_if_block takes the 'else' move
> (x := b) from
> the test block. However, we have not checked whether the test block is valid
> for complex-block
> if-conversion with bb_valid_for_noce_process_p. Also, that's a case I wasn't
> particularly targeting
> when writing the initial patch.
>
> This patch bails out of noce_try_cmove_arith when one of the blocks is
> complex and the other is empty.
> I've checked that if-conversion still happens in the cases of interest from
> the original patch.
>
> I've added the testcase from PR 67465 since that one uses __builtin_abort
> and triggers the problem nicely.
> The others show the miscompilation using printf, which seems to go away if I
> replace it with an abort.
> I have confirmed manually that the miscompilation goes away on those
> testcases.
>
> PR rtl-optimization/67481 is a testsuite regression on sparc-solaris that
> Rainer reported. I haven't tested
> that this patch fixes that, but I suspect that the root cause is the same.
> Rainer, could you please
> check that this fixes the regression for you?
>
> Bootstrapped and tested on aarch64 and x86_64.
>
> Ok for trunk if sparc testing comes ok?
>
> Thanks,
> Kyrill
>
> 2015-09-07  Kyrylo Tkachov  
>
> PR rtl-optimization/67456
> PR rtl-optimization/67464
> PR rtl-optimization/67465
> PR rtl-optimization/67481
> * ifcvt.c (noce_try_cmove_arith): Bail out if one of the blocks
> is complex and the other is empty.
>
> 2015-09-07  Kyrylo Tkachov  
>
> * gcc.dg/pr67465.c: New test.

Does it fix

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67462


-- 
H.J.


Re: [patch] Enable lightweight checks with _GLIBCXX_ASSERTIONS.

2015-09-07 Thread Florian Weimer
* Jonathan Wakely:

> This patch adds the "debug mode lite" we've been talking about, by
> changing __glibcxx_assert to be activated by _GLIBCXX_ASSERTIONS
> instead of _GLIBCXX_DEBUG (and making the latter imply the former).

Interesting.  Is this mode ABI-compatible with the default mode?
Should _FORTIFY_SOURCE imply _GLIBCXX_ASSERTIONS?


Re: [patch] Enable lightweight checks with _GLIBCXX_ASSERTIONS.

2015-09-07 Thread Jonathan Wakely

On 07/09/15 21:04 +0200, Daniel Krügler wrote:
> 2015-09-07 20:27 GMT+02:00 Jonathan Wakely :
>> This patch adds the "debug mode lite" we've been talking about, by
>> changing __glibcxx_assert to be activated by _GLIBCXX_ASSERTIONS
>> instead of _GLIBCXX_DEBUG (and making the latter imply the former).
>>
>> _GLIBCXX_ASSERTIONS is already used in Parallel Mode for enabling
>> optional assertions (although some of them are O(n) and so we might
>> want to change them to use another macro like _GLIBCXX_DEBUG or
>> _GLIBCXX_PARALLEL_ASSERTIONS instead).
>>
>> With the change to define __glibcxx_assert() without Debug Mode we can
>> change most uses of _GLIBCXX_DEBUG_ASSERT to simply __glibcxx_assert,
>> so that the assertion is done when _GLIBCXX_ASSERTIONS is defined (not
>> only in Debug Mode).
>>
>> I haven't added any new assertions yet, this just converts the
>> lightweight Debug Mode checks, but the next step will be to add
>> additional assertions to the (normal mode) containers. The google
>> branches contain several good examples of checks to add.
>
> In the suggested doc changes:
>
> +When defined, _GLIBCXX_ASSERTIONS is defined
> +automatically, so all the assertions that enables are also enabled
> +in debug mode.
>
> there seems to be a typo, presumably it should be
>
> +When defined, _GLIBCXX_ASSERTIONS is defined
> +automatically, so all the assertions that it enables are also enabled
> +in debug mode.
>
> instead?


It's correct as I wrote it, but your version is clearer so I'll change
it.

My original can be read as "so all the assertions that that enables"
where the first "that" can be removed without changing the meaning.
Stoopid English ;-)




Re: [patch] Enable lightweight checks with _GLIBCXX_ASSERTIONS.

2015-09-07 Thread Jonathan Wakely

On 07/09/15 21:31 +0200, Florian Weimer wrote:
> * Jonathan Wakely:
>
>> This patch adds the "debug mode lite" we've been talking about, by
>> changing __glibcxx_assert to be activated by _GLIBCXX_ASSERTIONS
>> instead of _GLIBCXX_DEBUG (and making the latter imply the former).
>
> Interesting.  Is this mode ABI-compatible with the default mode?

Yes, that's the main reason I want to make this change.

> Should _FORTIFY_SOURCE imply _GLIBCXX_ASSERTIONS?


Yes, I think it should.

You can read my notes on these "debug mode lite" checks at
https://gcc.gnu.org/wiki/LibstdcxxDebugMode (including "This should be
discussed with Glibc and security teams" and I specifically had you in
mind when I wrote that :-)

Your thoughts would be much appreciated, especially regarding
_FORTIFY_SOURCE and which kind of new checks you think would be
appropriate to add.



Re: [patch] Enable lightweight checks with _GLIBCXX_ASSERTIONS.

2015-09-07 Thread Jonathan Wakely

On 07/09/15 20:53 +0100, Jonathan Wakely wrote:
> On 07/09/15 21:04 +0200, Daniel Krügler wrote:
>> 2015-09-07 20:27 GMT+02:00 Jonathan Wakely :
>>> This patch adds the "debug mode lite" we've been talking about, by
>>> changing __glibcxx_assert to be activated by _GLIBCXX_ASSERTIONS
>>> instead of _GLIBCXX_DEBUG (and making the latter imply the former).
>>>
>>> _GLIBCXX_ASSERTIONS is already used in Parallel Mode for enabling
>>> optional assertions (although some of them are O(n) and so we might
>>> want to change them to use another macro like _GLIBCXX_DEBUG or
>>> _GLIBCXX_PARALLEL_ASSERTIONS instead).
>>>
>>> With the change to define __glibcxx_assert() without Debug Mode we can
>>> change most uses of _GLIBCXX_DEBUG_ASSERT to simply __glibcxx_assert,
>>> so that the assertion is done when _GLIBCXX_ASSERTIONS is defined (not
>>> only in Debug Mode).
>>>
>>> I haven't added any new assertions yet, this just converts the
>>> lightweight Debug Mode checks, but the next step will be to add
>>> additional assertions to the (normal mode) containers. The google
>>> branches contain several good examples of checks to add.
>>
>> In the suggested doc changes:
>>
>> +When defined, _GLIBCXX_ASSERTIONS is defined
>> +automatically, so all the assertions that enables are also enabled
>> +in debug mode.
>>
>> there seems to be a typo, presumably it should be
>>
>> +When defined, _GLIBCXX_ASSERTIONS is defined
>> +automatically, so all the assertions that it enables are also enabled
>> +in debug mode.
>>
>> instead?
>
> It's correct as I wrote it, but your version is clearer so I'll change
> it.
>
> My original can be read as "so all the assertions that that enables"
> where the first "that" can be removed without changing the meaning.
> Stoopid English ;-)


I think this is even better:

 When defined, _GLIBCXX_ASSERTIONS is defined
 automatically, so all the assertions enabled by that macro are
 also enabled in debug mode.

Is that clear?



Re: [patch] Enable lightweight checks with _GLIBCXX_ASSERTIONS.

2015-09-07 Thread Daniel Krügler
2015-09-07 22:10 GMT+02:00 Jonathan Wakely :
> On 07/09/15 20:53 +0100, Jonathan Wakely wrote:
>> On 07/09/15 21:04 +0200, Daniel Krügler wrote:
>>> In the suggested doc changes:
>>>
>>> +When defined, _GLIBCXX_ASSERTIONS is defined
>>> +automatically, so all the assertions that enables are also enabled
>>> +in debug mode.
>>>
>>> there seems to be a typo, presumably it should be
>>>
>>> +When defined, _GLIBCXX_ASSERTIONS is defined
>>> +automatically, so all the assertions that it enables are also enabled
>>> +in debug mode.
>>>
>>> instead?
>>
>>
>> It's correct as I wrote it, but your version is clearer so I'll change
>> it.
>>
>> My original can be read as "so all the assertions that that enables"
>> where the first "that" can be removed without changing the meaning.
>> Stoopid English ;-)
>
>
> I think this is even better:
>
>  When defined, _GLIBCXX_ASSERTIONS is defined
>  automatically, so all the assertions enabled by that macro are
>  also enabled in debug mode.
>
> Is that clear?

It's perfect!

Thanks,

- Daniel


Re: [wwwdocs] Re: C++ Concepts available in trunk?

2015-09-07 Thread Gerald Pfeifer
On Mon, 7 Sep 2015, Jonathan Wakely wrote:
> Nice, I think they are good improvements.

Cool.  I committed this, only to notice another change.

GCC stands for GNU Compiler Collection, so GCC compiler would
expand to GNU Compiler Collection compiler, which feels a bit
redundant. ;-)

I'll wait a bit before committing this in case you or anyone
else has comments.

Gerald

Index: cxx0x.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/projects/cxx0x.html,v
retrieving revision 1.68
diff -u -r1.68 cxx0x.html
--- cxx0x.html  7 Sep 2015 13:09:09 -   1.68
+++ cxx0x.html  7 Sep 2015 20:17:10 -
@@ -17,8 +17,8 @@
   implement new C++11 features in GCC, and made it the first
   compiler to bring feature-complete C++11 to C++ programmers.
 
-  C++11 features are available as part of the "mainline" GCC
-compiler in the trunk of GCC's repository
+  C++11 features are available as part of "mainline" GCC
+in the trunk of GCC's repository
 and in GCC 4.3 and later. To enable C++0x
   support, add the command-line parameter -std=c++0x
   to your g++ command line. Or, to enable GNU
Index: cxx1y.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/projects/cxx1y.html,v
retrieving revision 1.25
diff -u -r1.25 cxx1y.html
--- cxx1y.html  7 Sep 2015 13:09:10 -   1.25
+++ cxx1y.html  7 Sep 2015 20:17:10 -
@@ -16,8 +16,8 @@
   GCC has support for the latest revision of the C++
   standard, which was published in 2014.
 
-  C++14 features are available as part of the "mainline" GCC
-compiler in the trunk of GCC's repository
+  C++14 features are available as part of "mainline" GCC
+in the trunk of GCC's repository
 and in GCC 4.8 and later. To enable C++14
   support, add the command-line parameter -std=c++14
   to your g++ command line. Or, to enable GNU
Index: cxx1z.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/projects/cxx1z.html,v
retrieving revision 1.4
diff -u -r1.4 cxx1z.html
--- cxx1z.html  7 Sep 2015 13:09:10 -   1.4
+++ cxx1z.html  7 Sep 2015 20:17:10 -
@@ -16,8 +16,8 @@
   GCC has experimental support for the next revision of the C++
   standard, which is expected to be published in 2017.
 
-  C++1z features are available as part of the "mainline" GCC
-compiler in the trunk of GCC's repository
+  C++1z features are available as part of "mainline" GCC
+in the trunk of GCC's repository
 and in GCC 5 and later. To enable C++1z
   support, add the command-line parameter -std=c++1z
   to your g++ command line. Or, to enable GNU


Re: [4/7] Use correct promoted mode sign for result of GIMPLE_CALL

2015-09-07 Thread Kugan


On 07/09/15 23:10, Michael Matz wrote:
> Hi,
> 
> On Mon, 7 Sep 2015, Kugan wrote:
> 
>> For the following testcase (compiling with -O1; -O2 works fine), we have
>> a stmt with stmt_code SSA_NAME (_7 = _6) and for which _6 is defined by
>> a GIMPLE_CALL. In this case, we are using the wrong SUBREG promoted mode,
>> resulting in wrong code.
> 
> And why is that?
> 
>> Simple SSA_NAME copies are generally optimized
>> but when they are not, we can end up using the wrong promoted mode.
>> Attached patch fixes when we have one copy.
> 
> I think it's the wrong place to fix this up.  Where does the wrong use come 
> from?  At that place it should be fixed, not after the fact.
> 
>>   _6 = bar5 (-10);
>>   ...
>>   _7 = _6;
>>   _3 = (long unsigned int) _6;
>>   ...
>>   if (_3 != l5.0_4)
> 
> There is no use of '_7' in this snippet so I don't see the relevance of 
> SUBREG_PROMOTED_MODE on it.
> 
> But whatever you do, please make sure you include the testcase for the 
> problem as a regression test:
> 

Thanks for the review.

This happens on ARM, where the definition of PROMOTED_MODE also changes the
sign. I am attaching the cfgdump for the test-case. This is part of an
existing test-case, that's why I didn't include it as part of this patch.

for ;; _7 = _6;

(subreg:HI (reg:SI 113) 0)
 
unit size 
align 16 symtab 0 alias set -1 canonical type 0x7fd672c36540
precision 16 min  max >
   def_stmt _7 = _6;

version 7>
decl_rtl -> (reg:SI 113)
temp -> (subreg:HI (reg:SI 113) 0)
Unsignedp = 1

and we expand it to:

;; _7 = _6;

(insn 10 9 0 (set (reg:SI 113)
(zero_extend:SI (subreg/u:HI (reg:SI 113) 0))) -1
 (nil))

but:

short int _6;
short int _7;

insn 10 above is wrong. _6 is defined by a call and therefore the sign
change in promoted mode is not true.

We should probably rearrange/or add a copy propagation to remove this
unnecessary copy but still this looks wrong to me.
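
To illustrate why the extension kind matters, here is a small sketch. The helper names are hypothetical, and plain 32-bit integers stand in for the SImode register; the -10 comes from the bar5 (-10) call in the testcase snippet quoted above.

```cpp
#include <cassert>   // for the checks in the usage note
#include <cstdint>

// Zero-extend the low 16 bits of a 32-bit register image, which is what
// (zero_extend:SI (subreg:HI ...)) does.
inline std::uint32_t zero_extend_hi(std::uint32_t reg)
{
  return reg & 0xffffu;
}

// Sign-extend the low 16 bits of a 32-bit register image.
inline std::uint32_t sign_extend_hi(std::uint32_t reg)
{
  const std::uint32_t low = reg & 0xffffu;
  // Unsigned wrap-around fills the upper bits with copies of bit 15.
  return (low ^ 0x8000u) - 0x8000u;
}

// Hypothetical helper: the 32-bit register image of a signed value,
// as a callee returning a sign-extended short would leave it.
inline std::uint32_t reg_image(int value)
{
  return static_cast<std::uint32_t>(value);
}
```

For a short holding -10 the register image is 0xfffffff6. sign_extend_hi leaves it unchanged, while zero_extend_hi turns it into 0x0000fff6 (65526), so a later full-width comparison against -10 fails.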

Thanks,
Kugan

;; Function foo5 (foo5, funcdef_no=0, decl_uid=4147, cgraph_uid=0, symbol_order=0)

foo5 (int x)
{
  unsigned int _2;
  unsigned int _4;
  short unsigned int _5;

;;   basic block 2, loop depth 0
;;pred:   ENTRY
  _4 = (unsigned int) x_1(D);
  _2 = _4 & 65535;
  _5 = (short unsigned int) x_1(D);
  return _5;
;;succ:   EXIT

}



Partition map 

Partition 1 (x_1(D) - 1 )
Partition 2 (_2 - 2 )
Partition 4 (_4 - 4 )
Partition 5 (_5 - 5 )


Coalescible Partition map 

Partition 0, base 0 (x_1(D) - 1 )


Partition map 

Partition 0 (x_1(D) - 1 )


Conflict graph:

After sorting:
Coalesce List:

Partition map 

Partition 0 (x_1(D) - 1 )

After Coalescing:

Partition map 

Partition 0 (x_1(D) - 1 )
Partition 1 (_2 - 2 )
Partition 2 (_4 - 4 )
Partition 3 (_5 - 5 )


Replacing Expressions
_4 replace with --> _4 = (unsigned int) x_1(D);

_5 replace with --> _5 = (short unsigned int) x_1(D);


foo5 (int x)
{
  unsigned int _2;
  unsigned int _4;
  short unsigned int _5;

;;   basic block 2, loop depth 0
;;pred:   ENTRY
  _4 = (unsigned int) x_1(D);
  _2 = _4 & 65535;
  _5 = (short unsigned int) x_1(D);
  return _5;
;;succ:   EXIT

}



;; Generating RTL for gimple basic block 2
(const_int 65535 [0xffff])

Hot cost: 4 (final)
(const_int 65535 [0xffff])

Hot cost: 4 (final)

;; _2 = _4 & 65535;

(insn 6 5 0 (set (reg:SI 111)
(zero_extend:SI (subreg:HI (reg/v:SI 110 [ x ]) 0))) -1
 (nil))

;; return _5;

(insn 7 6 8 (set (reg:HI 115)
(subreg:HI (reg/v:SI 110 [ x ]) 0)) pr39240.c:6 -1
 (nil))

(insn 8 7 9 (set (reg:SI 116)
(zero_extend:SI (reg:HI 115))) pr39240.c:6 -1
 (nil))

(insn 9 8 10 (set (reg:SI 114 [  ])
(reg:SI 116)) pr39240.c:6 -1
 (nil))

(jump_insn 10 9 11 (set (pc)
(label_ref 0)) pr39240.c:6 -1
 (nil))

(barrier 11 10 0)


try_optimize_cfg iteration 1

Merging block 3 into block 2...
Merged blocks 2 and 3.
Merged 2 and 3 without moving.
Removing jump 10.
Merging block 4 into block 2...
Merged blocks 2 and 4.
Merged 2 and 4 without moving.


try_optimize_cfg iteration 2

fix_loop_structure: fixing up loops for function


;;
;; Full RTL generated for this function:
;;
(note 1 0 4 NOTE_INSN_DELETED)
;; basic block 2, loop depth 0, count 0, freq 1, maybe hot
;;  prev block 0, next block 1, flags: (NEW, REACHABLE, RTL)
;;  pred:   ENTRY [100.0%]  (FALLTHRU)
(note 4 1 2 2 [bb 2] NOTE_INSN_BASIC_BLOCK)
(insn 2 4 3 2 (set (reg/v:SI 110 [ x ])
(reg:SI 0 r0 [ x ])) pr39240.c:5 -1
 (nil))
(note 3 2 6 2 NOTE_INSN_FUNCTION_BEG)
(insn 6 3 7 2 (set (reg:SI 111)
(zero_extend:SI (subreg:HI (reg/v:SI 110 [ x ]) 0))) -1
 (nil))
(insn 7 6 8 2 (set (reg:HI 115)
(subreg:HI (reg/v:SI 110 [ x ]) 0)) pr39240.c:6 -1
 (nil))
(insn 8 7 9 2 (set (reg:SI 116)
(zero_extend:SI (reg:HI 115))) pr39240.c:6 -1
 (nil))
(insn 9 8 13 2 (set (reg:SI 114 [  ])
(reg:SI 116)) pr39240.c:6 -1
 (nil))
(insn 13 9 14 2 (set (reg/i:SI 0 r0)
(reg:SI 114 [  ])) pr39240.c:7 -1
 (nil))
(insn

Re: [5/7] Allow gimple debug stmt in widen mode

2015-09-07 Thread Kugan

Thanks for the review.

On 07/09/15 23:20, Michael Matz wrote:
> Hi,
> 
> On Mon, 7 Sep 2015, Kugan wrote:
> 
>> Allow GIMPLE_DEBUG with values in promoted register.
> 
> Patch does much more.
> 

Oops sorry. Copy and paste mistake.

gcc/ChangeLog:

2015-09-07 Kugan Vivekanandarajah 

* cfgexpand.c (expand_debug_locations): Remove assert as now we are
also allowing values in promoted register.
* gimple-ssa-type-promote.c (fixup_uses): Allow GIMPLE_DEBUG to bind
values in promoted register.
* rtl.h (wi::int_traits ::decompose): Accept zero extended value
also.


>> gcc/ChangeLog:
>>
>> 2015-09-07  Kugan Vivekanandarajah  
>>
>>  * expr.c (expand_expr_real_1): Set proper SUBREG_PROMOTED_MODE for
>>  SSA_NAME that was set by GIMPLE_CALL and assigned to another
>>  SSA_NAME of same type.
> 
> ChangeLog doesn't match patch, and patch contains dubious changes:
> 
>> --- a/gcc/cfgexpand.c
>> +++ b/gcc/cfgexpand.c
>> @@ -5240,7 +5240,6 @@ expand_debug_locations (void)
>> tree value = (tree)INSN_VAR_LOCATION_LOC (insn);
>> rtx val;
>> rtx_insn *prev_insn, *insn2;
>> -   machine_mode mode;
>>  
>> if (value == NULL_TREE)
>>   val = NULL_RTX;
>> @@ -5275,16 +5274,6 @@ expand_debug_locations (void)
>>  
>> if (!val)
>>   val = gen_rtx_UNKNOWN_VAR_LOC ();
>> -   else
>> - {
>> -   mode = GET_MODE (INSN_VAR_LOCATION (insn));
>> -
>> -   gcc_assert (mode == GET_MODE (val)
>> -   || (GET_MODE (val) == VOIDmode
>> -   && (CONST_SCALAR_INT_P (val)
>> -   || GET_CODE (val) == CONST_FIXED
>> -   || GET_CODE (val) == LABEL_REF)));
>> - }
>>  
>> INSN_VAR_LOCATION_LOC (insn) = val;
>> prev_insn = PREV_INSN (insn);
> 
> So it seems that the modes of the value's location and the value itself 
> don't have to match anymore, which seems dubious when considering how a 
> debugger should load the value in question from the given location.  So, 
> how is it supposed to work?

For example (simplified test-case from creduce):

fn1() {
  char a = fn1;
  return a;
}

--- test.c.142t.veclower21  2015-09-07 23:47:26.362201640 +
+++ test.c.143t.promotion   2015-09-07 23:47:26.362201640 +
@@ -5,13 +5,18 @@
 {
   char a;
   long int fn1.0_1;
+  unsigned int _2;
   int _3;
+  unsigned int _5;
+  char _6;

   :
   fn1.0_1 = (long int) fn1;
-  a_2 = (char) fn1.0_1;
-  # DEBUG a => a_2
-  _3 = (int) a_2;
+  _5 = (unsigned int) fn1.0_1;
+  _2 = _5 & 255;
+  # DEBUG a => _2
+  _6 = (char) _2;
+  _3 = (int) _6;
   return _3;

 }

Please note that DEBUG now points to _2, which is in promoted mode. I am
assuming that the debugger would load the required precision from the
promoted register. Maybe I am missing the details, but how else can we
handle this? Any suggestions?

In this particular simplified case we do have _6, but we might not in
all cases.


> 
> And this change:
> 
>> --- a/gcc/rtl.h
>> +++ b/gcc/rtl.h
>> @@ -2100,6 +2100,8 @@ wi::int_traits ::decompose (HOST_WIDE_INT*,
>>targets is 1 rather than -1.  */
>> gcc_checking_assert (INTVAL (x.first)
>>  == sext_hwi (INTVAL (x.first), precision)
>> +|| INTVAL (x.first)
>> +== (INTVAL (x.first) & ((1 << precision) - 1))
>>  || (x.second == BImode && INTVAL (x.first) == 1));
>>  
>>return wi::storage_ref (&INTVAL (x.first), 1, precision);
> 
> implies that wide_ints are not always sign-extended anymore after your 
> changes.  That's a fundamental assumption, so removing that assert implies 
> that you somehow created non-canonical wide_ints, and those will cause 
> bugs elsewhere in the code.  Don't just remove asserts, they are usually 
> there for a reason, and without accompanying changes those reasons don't 
> go away.
> 


This comes from GIMPLE_DEBUG. If this assumption should always hold, I
will fix it there.
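
For reference, a rough sketch of what the assert distinguishes. The function names are hypothetical; sext below only imitates the role of sext_hwi and is not the GCC implementation, and it is restricted to precisions 1 to 63 to keep the arithmetic simple.

```cpp
#include <cassert>
#include <cstdint>

// Mask with the low `precision` bits set; precision must be in [1, 63].
inline std::uint64_t low_mask(unsigned precision)
{
  assert(precision >= 1 && precision <= 63);
  return (std::uint64_t{1} << precision) - 1;
}

// Sign-extend the low `precision` bits of v (a stand-in for sext_hwi).
inline std::int64_t sext(std::int64_t v, unsigned precision)
{
  const std::uint64_t mask = low_mask(precision);
  const std::uint64_t low = static_cast<std::uint64_t>(v) & mask;
  const std::uint64_t sign = std::uint64_t{1} << (precision - 1);
  if (low & sign)
    // Negative: subtract 2^precision without leaving the int64_t range.
    return static_cast<std::int64_t>(low) - static_cast<std::int64_t>(mask) - 1;
  return static_cast<std::int64_t>(low);
}

// The canonical form the existing assert expects.
inline bool is_sign_extended(std::int64_t v, unsigned precision)
{
  return v == sext(v, precision);
}

// The additional form the proposed change would accept.
inline bool is_zero_extended(std::int64_t v, unsigned precision)
{
  return v == static_cast<std::int64_t>(static_cast<std::uint64_t>(v)
                                        & low_mask(precision));
}
```

With an 8-bit precision, 255 is zero-extended but not sign-extended (the canonical value is -1), which is the kind of constant that trips the original assert. Note also that the sketch builds its mask in 64 bits; as far as I can tell a plain int `1 << precision` would not be usable for precisions of 32 or more.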

Thanks,
Kugan


Re: Remove redundant test for global_regs

2015-09-07 Thread Anatoly Sokolov



- Original Message - 
From: "Jeff Law" 

Sent: Friday, September 04, 2015 11:22 PM

Hello.




>> Index: gcc/cse.c
>> ===
>> --- gcc/cse.c (revision 226953)
>> +++ gcc/cse.c (working copy)
>> @@ -463,7 +463,7 @@
>> A reg wins if it is either the frame pointer or designated as
>> fixed. */
>> #define FIXED_REGNO_P(N) \
>> ((N) == FRAME_POINTER_REGNUM || (N) == HARD_FRAME_POINTER_REGNUM \
>> - || fixed_regs[N] || global_regs[N])
>> + || TEST_HARD_REG_BIT (fixed_reg_set, N))
>
> So why not continue to test fixed_regs here (ie, just drop the global_regs
> test)? It's a single memory reference and a test against zero.
>
> Using TEST_HARD_REG_BIT likely still hits memory, but then on many
> architectures you're then going to have to do masking/shifting to get the
> bit you want to look at. That seems to me like a step backwards.


The fixed_regs array duplicates information from fixed_reg_set. It is more
practical to use a HARD_REG_SET, as there are many useful functions such as
hard_reg_set_subset_p, range_in_hard_reg_set_p, etc. I propose
to remove uses of the fixed_regs array from the GCC middle end and allow it
to be used in the TARGET_CONDITIONAL_REGISTER_USAGE target hook only. This
will isolate the back end interface from the rest of GCC and simplify
implementing the FIXED_REGISTERS target macro as a target hook.
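
To show the trade-off being discussed, here is a simplified sketch. RegSet, test_bit, set_bit and subset_p are hypothetical stand-ins, not GCC's HARD_REG_SET, TEST_HARD_REG_BIT or hard_reg_set_subset_p, and the register count of 128 is an arbitrary assumption.

```cpp
#include <cassert>
#include <cstdint>

// Simplified stand-ins; the real GCC types and macros differ.
constexpr unsigned kNumRegs = 128;
constexpr unsigned kWordBits = 64;
constexpr unsigned kNumWords = kNumRegs / kWordBits;

struct RegSet { std::uint64_t words[kNumWords]; };

// Byte-array form: one load and a compare against zero.
inline bool is_fixed_array(const char (&fixed)[kNumRegs], unsigned n)
{
  assert(n < kNumRegs);
  return fixed[n] != 0;
}

// Bit-set form: a load plus a shift and a mask.
inline bool test_bit(const RegSet& s, unsigned n)
{
  assert(n < kNumRegs);
  return ((s.words[n / kWordBits] >> (n % kWordBits)) & 1u) != 0;
}

inline void set_bit(RegSet& s, unsigned n)
{
  assert(n < kNumRegs);
  s.words[n / kWordBits] |= std::uint64_t{1} << (n % kWordBits);
}

// Whole-set query that is cheap on a bit-set: is every register in a
// also in b?  The byte array would need a loop over all registers.
inline bool subset_p(const RegSet& a, const RegSet& b)
{
  for (unsigned i = 0; i < kNumWords; ++i)
    if (a.words[i] & ~b.words[i])
      return false;
  return true;
}
```

The single-register test costs a shift and mask more than the array lookup, which is the objection above; the set-wide operations take one step per word, which is the convenience being argued for.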

Anatoly. 



[RS6000] Fix PowerPC ICE due to secondary_reload ignoring reload replacements

2015-09-07 Thread Alan Modra
In https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67378 analysis I show
the reason for this PR is that insns emitted by secondary reload
patterns are being generated without taking into account other reloads
that may have occurred.  We run into this problem when an insn has a
pseudo that doesn't get a hard reg, and the pseudo is used in a way
that requires a secondary reload.  In this case the secondary reload
is needed due to gcc generating a 64-bit gpr load from memory insn
with an address offset not a multiple of 4.
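
For readers unfamiliar with the constraint, a rough sketch follows. It assumes the usual DS-form encoding for 64-bit gpr loads and stores (ld/std): a signed 16-bit displacement whose low two bits must be zero. The function name is made up for illustration and is not part of rs6000.c.

```cpp
#include <cassert>   // for the checks in the usage note

// Hypothetical predicate: can `offset` be encoded directly in a DS-form
// 64-bit load/store?  Assumes a signed 16-bit displacement with the low
// two bits zero; anything else needs the address to be formed in a
// register first, which is what the secondary reload arranges.
inline bool ds_form_offset_ok(long offset)
{
  return offset >= -32768 && offset <= 32767 && (offset % 4) == 0;
}
```

An offset such as 6 is in range but not a multiple of 4, so it falls into the case that needs the secondary reload.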

Bootstrapped and regression tested powerpc64-linux.  OK to apply?
gcc-5 and gcc-4.9 branches too?

I haven't included a testcase in this patch, because the testcase in
the PR is quite horrible, and testcases triggering reload misbehaviour
tend to be unreliable.  By unreliable, I mean a small change anywhere
in the compiler can result in the testcase passing even if this bug
was reintroduced at some future date.  The testcase doesn't fail on
gcc-5, even though I'm fairly sure the same bug lurks there.

PR target/67378
* config/rs6000/rs6000.c (rs6000_secondary_reload_gpr): Find
reload replacement for PRE_MODIFY address reg.

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index cfd5675..51046d4 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -18199,8 +18199,21 @@ rs6000_secondary_reload_gpr (rtx reg, rtx mem, rtx scratch, bool store_p)
 
   if (GET_CODE (addr) == PRE_MODIFY)
 {
+  gcc_assert (REG_P (XEXP (addr, 0))
+ && GET_CODE (XEXP (addr, 1)) == PLUS
+ && XEXP (XEXP (addr, 1), 0) == XEXP (addr, 0));
   scratch_or_premodify = XEXP (addr, 0);
-  gcc_assert (REG_P (scratch_or_premodify));
+  if (!HARD_REGISTER_P (scratch_or_premodify))
+   /* If we have a pseudo here then reload will have arranged
+  to have it replaced, but only in the original insn.
+  Use the replacement here too.  */
+   scratch_or_premodify = find_replacement (&XEXP (addr, 0));
+
+  /* RTL emitted by rs6000_secondary_reload_gpr uses RTL
+expressions from the original insn, without unsharing them.
+Any RTL that points into the original insn will of course
+have register replacements applied.  That is why we don't
+need to look for replacements under the PLUS.  */
   addr = XEXP (addr, 1);
 }
   gcc_assert (GET_CODE (addr) == PLUS || GET_CODE (addr) == LO_SUM);

-- 
Alan Modra
Australia Development Lab, IBM


[Patch, fortran] PR66681 - Wrong result in assigning this_image() to a complex coarray

2015-09-07 Thread Paul Richard Thomas
Dear All,

This is something of a corner case, where gfc_conv_expr comes back
with a SAVE_EXPR, in the case of complex, scalar, coarray lvalues. The
first field of the SAVE_EXPR is a perfectly viable expression to
assign to, so I have taken that. If anybody out there has a better
solution, please speak up! The testcase is good, anyway.

Discussed on https://groups.google.com/forum/#!topic/opencoarrays/Cl2iK3OfUTs

Bootstrapped and regtested on FC21/x86_64 - OK for trunk?

Paul

2015-09-08  Paul Thomas  

PR fortran/66681
* trans-expr.c (gfc_trans_assignment_1): If the lvalue is a
complex type and a save_expr, take the field to be assigned to.

2015-09-08  Paul Thomas  

PR fortran/66681
* gfortran.dg/coarray_40.f90: New test.

Index: gcc/fortran/trans-expr.c
===
*** gcc/fortran/trans-expr.c(revision 227511)
--- gcc/fortran/trans-expr.c(working copy)
*** gfc_trans_assignment_1 (gfc_expr * expr1
*** 9269,9274 
--- 9269,9281 
gfc_add_block_to_block (&loop.post, &rse.post);
  }
  
+   /* Complex scalar coarrays sometimes produce a SAVE_EXPR on type conversion.
+  Take the expression to assign to.  */
+   if (TREE_CODE (lse.expr) == SAVE_EXPR
+   && TREE_CODE (TREE_TYPE (lse.expr)) == COMPLEX_TYPE
+   && expr1->symtree->n.sym->attr.codimension)
+ lse.expr = TREE_OPERAND (lse.expr, 0);
+ 
tmp = gfc_trans_scalar_assign (&lse, &rse, expr1->ts,
 expr_is_variable (expr2) || scalar_to_array
 || expr2->expr_type == EXPR_ARRAY,
Index: gcc/testsuite/gfortran.dg/coarray_40.f90
===
*** gcc/testsuite/gfortran.dg/coarray_40.f90(revision 0)
--- gcc/testsuite/gfortran.dg/coarray_40.f90(working copy)
***
*** 0 
--- 1,11 
+ ! { dg-do compile }
+ ! { dg-options "-fcoarray=single -fdump-tree-original" }
+ ! Test the fix for PR66681.
+ !
+ ! Contributed by Damian Rouson  
+ !
+   complex a[*]
+   a = this_image ()
+   print *,this_image (),a
+ end
+ ! { dg-final { scan-tree-dump-times "SAVE_EXPR" 0 "original" } }