date:20150227

Re: [PATCH] Remove inefficient branchless conditional negate optimization

2015-02-27 Thread Richard Biener

On Thu, Feb 26, 2015 at 11:20 PM, Jeff Law  wrote:
> On 02/26/15 10:30, Wilco Dijkstra wrote:
>>
>> Several GCC versions ago a conditional negate optimization was introduced
>> as a workaround for
>> PR45685. However the branchless expansion for conditional negate is
>> extremely inefficient on most
>> targets (5 sequentially dependent instructions rather than 2 on AArch64).
>> Since the underlying issue
>> has been resolved (the example in PR45685 no longer generates a branch on
>> x64), remove the
>> workaround so that conditional negates are treated in exactly the same way
>> as conditional invert,
>> add, subtract, and, orr, xor etc. Simple example:
>>
>> int f(int x) { if (x > 3) x = -x; return x; }
>
> You need to bootstrap and regression test the change before it can be
> approved.

As Jeff added a testcase for the PHI opt transform to happen I'm sure
testing would have shown this as fallout.

> You should turn this little example into a testcase.  It's fine with me if
> this new test is ARM specific.
>
>
> You should also find a way to change the test gcc.dg/tree-ssa/pr45685.c in
> such a way that it ensures there aren't any undesirable branches.

I'd also be interested in the results of vectorizing a loop with a
conditional negate.
I can very well imagine reverting this patch causing code quality regressions
there.

> I've got enough history to know this is fixing a regression of sorts for the
> ARM platform.  So once the issues above are addressed it can go forward even
> without a BZ noting the regression.

But I'd say this is stage1 material at this point.

Richard.

> jeff
>
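
For illustration, the two forms under discussion can be sketched in plain C.
This is only a sketch: the function names are made up here, and the
mask-based version is one common branchless formulation, not necessarily the
exact sequence GCC expanded to.

```c
/* Sketch only: these helper names are illustrative and do not appear in
   the patch or in GCC.  */

/* Conditional negate written with a branch, as in the example from the
   thread.  A target with conditional select/negate instructions can
   handle this with a compare plus one conditional instruction.  */
int
cond_neg_branch (int x)
{
  if (x > 3)
    x = -x;
  return x;
}

/* One common branchless formulation: build an all-ones mask when the
   condition holds, then compute (x ^ mask) - mask.  With mask == -1 this
   is ~x + 1 == -x; with mask == 0 it leaves x unchanged.  Like -x itself,
   it overflows for INT_MIN.  */
int
cond_neg_branchless (int x)
{
  int mask = -(x > 3);
  return (x ^ mask) - mask;
}
```

Each step of the mask version depends on the previous one, which is the kind
of serial dependency chain the original mail objects to.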

Re: [PATCH, testsuite] Fix gcc.dg/vect/pr59354.c

2015-02-27 Thread Richard Biener

On Fri, Feb 27, 2015 at 12:34 AM, Pat Haugen
 wrote:
> The subject testcase is failing on older powerpc64 hardware that doesn't
> support vector instructions because the prolog code is saving callee save
> vector registers used in the loop before the check_vect() call has even been
> performed. Following was tested on powerpc64-linux. Ok for trunk/4.9 branch?

Hum.  So the whole check_vect business is unreliable on ppc64?  I'd rather
make sure to not run the tests at all on older ppc hardware then?

Well.  Ok.

Thanks,
Richard.

> -Pat
>
>
> 2015-02-26  Pat Haugen 
>
> gcc/testsuite:
> * gcc.dg/vect/pr59354.c: Move vector producing code to separate
> function.
>
>
> Index: gcc.dg/vect/pr59354.c
> ===
> --- gcc.dg/vect/pr59354.c(revision 221016)
> +++ gcc.dg/vect/pr59354.c(working copy)
> @@ -8,12 +8,11 @@ void abort (void);
>  unsigned int a[256];
>  unsigned char b[256];
>
> -int main()
> +__attribute__ ((noinline)) void
> +main1()
>  {
>int i, z, x, y;
>
> -  check_vect ();
> -
>for(i = 0; i < 256; i++)
>  {
>a[i] = i % 5;
> @@ -27,6 +26,13 @@ int main()
>
>if (b[4] != 1)
>  abort ();
> +}
> +
> +int main (void)
> +{
> +  check_vect ();
> +
> +  main1 ();
>
>return 0;
>  }
>

Re: [PATCH, testsuite] Fix gcc.dg/vect/pr59354.c

2015-02-27 Thread Jakub Jelinek

On Fri, Feb 27, 2015 at 09:08:30AM +0100, Richard Biener wrote:
> On Fri, Feb 27, 2015 at 12:34 AM, Pat Haugen
>  wrote:
> > The subject testcase is failing on older powerpc64 hardware that doesn't
> > support vector instructions because the prolog code is saving callee save
> > vector registers used in the loop before the check_vect() call has even been
> > performed. Following was tested on powerpc64-linux. Ok for trunk/4.9 branch?
> 
> Hum.  So the whole check_vect business is unreliable on ppc64?  I'd rather
> make sure to not run the tests at all on older ppc hardware then?
> 
> Well.  Ok.

I think the separate main containing just check_vect and call to a noinline
function is very much desirable, otherwise it works purely by accident if it
works at all, IMHO on all arches.

Jakub
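
A minimal sketch of the layout Jakub describes, assuming a stand-in for the
testsuite's check_vect (): have_vector_support () and run_testcase () are
hypothetical names used only here, and run_testcase () plays the role that
main () has in a real testcase.

```c
#include <stdlib.h>

/* Hypothetical stand-in for the testsuite's check_vect ().  In a real
   testcase the check exits early when the hardware lacks vector support;
   here it always reports support so the sketch runs anywhere.  */
static int
have_vector_support (void)
{
  return 1;
}

unsigned int a[256];

/* All code that may be vectorized lives here.  noinline keeps it, and any
   vector register saves in its prologue, out of the caller.  */
__attribute__ ((noinline)) int
main1 (void)
{
  int i;
  unsigned int sum = 0;

  for (i = 0; i < 256; i++)
    a[i] = i % 5;
  for (i = 0; i < 256; i++)
    sum += a[i];

  return sum == 510 ? 0 : 1;
}

/* Would be main () in a real testcase: only the check and the call, so
   nothing vector-related can execute before the check has passed.  */
int
run_testcase (void)
{
  if (!have_vector_support ())
    exit (0);
  return main1 ();
}
```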

Re: [PATCH][AArch64]: Fix rtl type in aarch64.md.

2015-02-27 Thread Marcus Shawcroft

On 26 February 2015 at 06:22, Xingxing Pan  wrote:
> Hi,
>
> This patch fixes the type of mov_aarch64 in aarch64.md.
> Is it OK for trunk?

OK, thank you /Marcus

Re: [PATCH][wwwdocs] Mention xgene-1 in arm and aarch64, FreeBSD support for arm

2015-02-27 Thread Marcus Shawcroft

On 25 February 2015 at 09:53, Kyrill Tkachov  wrote:
>
> On 13/02/15 10:14, Richard Earnshaw wrote:
>>
>> On 13/02/15 09:52, Kyrill Tkachov wrote:
>>>
>>> Hi all,
>>>
>>> This patch to changes.html mentions the xgene1 support in GCC 5 for arm
>>> and aarch64 and also the FreeBSD support for ARM.
>>>
>>> Is this ok?
>>
>> The repetitive nature of all these new cpus being added looks rather
>> wooden.  I think it would be better to merge them into one change block,
>> that lists all the cpus and their internal names, then mentions once at
>> the end that these names can be used as arguments to -mcpu and -mtune.
>
>
> Yeah, that makes sense. I've incorporated the feedback.
> Here's a proposed patch.
>
> How's this?

Looks fine to me, I suggest you go ahead and commit it. /Marcus

Re: [patch] PR rtl-optimization/65220: avoid integer division in stack alignment

2015-02-27 Thread Uros Bizjak

Hello!

> This is actually Jakub's patch from the PR, with a few minor tweaks that were
> needed to bootstrap and pass the regression suite. The splitter was using
> operand 0 without setting it first. It should've been operand 2. Also, there
> was a division by zero that was causing an invalid insn; fixed by changing
> the INTVAL + 2 into INTVAL - 2. Finally, a reload_completed was added at
> Jakub's request.

PR rtl-optimization/65220
* config/i386/i386.md (*udivmod4_pow2): New.

+  "UINTVAL (operands[3]) - 2 <  * BITS_PER_UNIT
+   && (UINTVAL (operands[3]) & (UINTVAL (operands[3]) - 1)) == 0"
+  "#"
+  "reload_completed"

This should be "&& reload_completed", so it will also include the
above condition in the split condition.

OK for mainline with the above change.

Thanks,
Uros.
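
The second line of the insn condition quoted above is the usual
power-of-two test, x & (x - 1) == 0. As a rough C sketch of why that matters
for a udivmod pattern (the helper names below are illustrative and are not
part of i386.md or the patch): once the divisor is known to be a power of
two, the quotient is a shift and the remainder is a mask.

```c
/* Illustrative helpers; not part of i386.md or the patch.  */

/* Nonzero if D is a power of two that is at least 2.  */
int
is_pow2_divisor (unsigned long d)
{
  return d >= 2 && (d & (d - 1)) == 0;
}

/* Quotient of N / D for a power-of-two D, computed with a shift instead
   of a divide.  D must satisfy is_pow2_divisor.  */
unsigned long
udiv_pow2 (unsigned long n, unsigned long d)
{
  unsigned int shift = 0;

  while ((1UL << shift) != d)   /* shift = log2 (d) */
    shift++;
  return n >> shift;
}

/* Remainder of N % D for a power-of-two D, computed with a mask.  */
unsigned long
umod_pow2 (unsigned long n, unsigned long d)
{
  return n & (d - 1);
}
```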

[C PATCH] Fix ICE with __auto_type (PR c/65228)

2015-02-27 Thread Marek Polacek

This PR points out that we can ICE in case we have __auto_type with an
undeclared variable: grokdeclarator returns error_mark_node, so check for that
before accessing decl's TREE_CODE (can't use error_operand_p here as that would
cause bogus diagnostics to be emitted).

Bootstrapped/regtested on x86_64-linux, ok for trunk and 4.9?

2015-02-26  Marek Polacek  

PR c/65228
* c-decl.c (start_decl): Return NULL_TREE if decl is an error node.

* gcc.dg/pr65228.c: New test.

diff --git gcc/c/c-decl.c gcc/c/c-decl.c
index 89c..3bf1fc6 100644
--- gcc/c/c-decl.c
+++ gcc/c/c-decl.c
@@ -4460,8 +4460,8 @@ start_decl (struct c_declarator *declarator, struct 
c_declspecs *declspecs,
   decl = grokdeclarator (declarator, declspecs,
 NORMAL, initialized, NULL, &attributes, &expr, NULL,
 deprecated_state);
-  if (!decl)
-return 0;
+  if (!decl || decl == error_mark_node)
+return NULL_TREE;
 
   if (expr)
 add_stmt (fold_convert (void_type_node, expr));
diff --git gcc/testsuite/gcc.dg/pr65228.c gcc/testsuite/gcc.dg/pr65228.c
index e69de29..fd83238 100644
--- gcc/testsuite/gcc.dg/pr65228.c
+++ gcc/testsuite/gcc.dg/pr65228.c
@@ -0,0 +1,11 @@
+/* PR c/65228 */
+/* { dg-do compile } */
+/* { dg-options "" } */
+
+__auto_type a = b; /* { dg-error "undeclared" } */
+
+void
+f (void)
+{
+  __auto_type c = d; /* { dg-error "undeclared" } */
+}

Marek

Re: [PATCH][ARM] Remove an unused reload hook.

2015-02-27 Thread Richard Earnshaw

On 19/02/15 12:19, Matthew Wahab wrote:
> The LEGITIMIZE_RELOAD_ADDRESS macro is only needed for reload. Since the
> ARM backend no longer supports reload, this macro is not needed and this
> patch removes it.
> 
> Tested arm-none-linux-gnueabihf with gcc-check.
> 
> Ok for trunk? now or in stage 1?
> Matthew
> 
> gcc/
> 2015-02-19  Matthew Wahab  
> 
> * config/arm/arm.h (LEGITIMIZE_RELOAD_ADDRESS): Remove.
> (ARM_LEGITIMIZE_RELOAD_ADDRESS): Remove.
> (THUMB_LEGITIMIZE_RELOAD_ADDRESS): Remove.
> * config/arm/arm.c (arm_legitimize_reload_address): Remove.
> (thumb_legitimize_reload_address): Remove.
> * config/arm/arm-protos.h (arm_legitimize_reload_address):
> Remove.
> (thumb_legitimize_reload_address): Remove.
> 

This is OK for stage 1.

I have one open question: can LRA generate the optimizations that these
hooks used to provide through reload?  If not, please could you file
some bugzilla reports so that we don't lose them.

Thanks,
R.

> remove_dead_code.patch
> 
> 
> diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h
> index 307babb..0595cc2 100644
> --- a/gcc/config/arm/arm-protos.h
> +++ b/gcc/config/arm/arm-protos.h
> @@ -66,10 +66,6 @@ extern rtx legitimize_tls_address (rtx, rtx);
>  extern bool arm_legitimate_address_p (machine_mode, rtx, bool);
>  extern int arm_legitimate_address_outer_p (machine_mode, rtx, RTX_CODE, int);
>  extern int thumb_legitimate_offset_p (machine_mode, HOST_WIDE_INT);
> -extern bool arm_legitimize_reload_address (rtx *, machine_mode, int, int,
> -int);
> -extern rtx thumb_legitimize_reload_address (rtx *, machine_mode, int, int,
> - int);
>  extern int thumb1_legitimate_address_p (machine_mode, rtx, int);
>  extern bool ldm_stm_operation_p (rtx, bool, machine_mode mode,
>   bool, bool);
> diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
> index 7bf5b4d..6efe664 100644
> --- a/gcc/config/arm/arm.c
> +++ b/gcc/config/arm/arm.c
> @@ -7932,236 +7932,6 @@ thumb_legitimize_address (rtx x, rtx orig_x, 
> machine_mode mode)
>return x;
>  }
>  
> -bool
> -arm_legitimize_reload_address (rtx *p,
> -machine_mode mode,
> -int opnum, int type,
> -int ind_levels ATTRIBUTE_UNUSED)
> -{
> -  /* We must recognize output that we have already generated ourselves.  */
> -  if (GET_CODE (*p) == PLUS
> -  && GET_CODE (XEXP (*p, 0)) == PLUS
> -  && REG_P (XEXP (XEXP (*p, 0), 0))
> -  && CONST_INT_P (XEXP (XEXP (*p, 0), 1))
> -  && CONST_INT_P (XEXP (*p, 1)))
> -{
> -  push_reload (XEXP (*p, 0), NULL_RTX, &XEXP (*p, 0), NULL,
> -MODE_BASE_REG_CLASS (mode), GET_MODE (*p),
> -VOIDmode, 0, 0, opnum, (enum reload_type) type);
> -  return true;
> -}
> -
> -  if (GET_CODE (*p) == PLUS
> -  && REG_P (XEXP (*p, 0))
> -  && ARM_REGNO_OK_FOR_BASE_P (REGNO (XEXP (*p, 0)))
> -  /* If the base register is equivalent to a constant, let the generic
> -  code handle it.  Otherwise we will run into problems if a future
> -  reload pass decides to rematerialize the constant.  */
> -  && !reg_equiv_constant (ORIGINAL_REGNO (XEXP (*p, 0)))
> -  && CONST_INT_P (XEXP (*p, 1)))
> -{
> -  HOST_WIDE_INT val = INTVAL (XEXP (*p, 1));
> -  HOST_WIDE_INT low, high;
> -
> -  /* Detect coprocessor load/stores.  */
> -  bool coproc_p = ((TARGET_HARD_FLOAT
> - && TARGET_VFP
> - && (mode == SFmode || mode == DFmode))
> -|| (TARGET_REALLY_IWMMXT
> -&& VALID_IWMMXT_REG_MODE (mode))
> -|| (TARGET_NEON
> -&& (VALID_NEON_DREG_MODE (mode)
> -|| VALID_NEON_QREG_MODE (mode;
> -
> -  /* For some conditions, bail out when lower two bits are unaligned.  */
> -  if ((val & 0x3) != 0
> -   /* Coprocessor load/store indexes are 8-bits + '00' appended.  */
> -   && (coproc_p
> -   /* For DI, and DF under soft-float: */
> -   || ((mode == DImode || mode == DFmode)
> -   /* Without ldrd, we use stm/ldm, which does not
> -  fair well with unaligned bits.  */
> -   && (! TARGET_LDRD
> -   /* Thumb-2 ldrd/strd is [-1020,+1020] in steps of 4.  */
> -   || TARGET_THUMB2
> - return false;
> -
> -  /* When breaking down a [reg+index] reload address into 
> [(reg+high)+low],
> -  of which the (reg+high) gets turned into a reload add insn,
> -  we try to decompose the index into high/low values that can often
> -  also lead to better reload CSE.
> -  For example:
> -  ldr r0, [r2, #4100]  // Offset too large
> -  ldr r1, [r2, #4104]  // Offset too large
> -
> -  is best re

Re: [PATCH, AArch64] [4.9] Handle SYMBOL_SMALL_TPREL appropriately

2015-02-27 Thread Richard Earnshaw

On 20/02/15 04:14, Hurugalawadi, Naveen wrote:
> Hi Marcus,
> 
>>> The handling of SYMBOL_SMALL_TPREL is present in 4.9 and very clearly
>>> has exactly the same issue.
> 
> Please find attached the patch ported for gcc-4.9.
> 
> Please review the patch and let us know if it's okay?
> Regression tested on aarch64-elf.
> 
> Thanks,
> Naveen
> 
> 
> 2015-02-20  Andrew Pinski  
> Naveen H.S  
> 
> * config/aarch64/aarch64.c (*aarch64_load_symref_appropriately):
> Check whether the destination of SYMBOL_SMALL_TPREL is Pmode.
> 

OK.

R.

> 
> symbolref_ilp32.patch
> 
> 
> Index: gcc/config/aarch64/aarch64.c
> ===
> --- gcc/config/aarch64/aarch64.c  (revision 220806)
> +++ gcc/config/aarch64/aarch64.c  (working copy)
> @@ -659,6 +659,10 @@
>  case SYMBOL_SMALL_TPREL:
>{
>   rtx tp = aarch64_load_tp (NULL);
> +
> + if (GET_MODE (dest) != Pmode)
> +   tp = gen_lowpart (GET_MODE (dest), tp);
> +
>   emit_insn (gen_tlsle_small (dest, tp, imm));
>   set_unique_reg_note (get_last_insn (), REG_EQUIV, imm);
>   return;
>

Re: [PR58315] reset inlined debug vars at return-to point

2015-02-27 Thread Petr Machata

Alexandre Oliva  writes:

> Ok, I looked into it further, after patching dwlocstat to dump
> per-variable per-range coverage/length info, so as to be able to compare
> object files more easily.

If you send me those patches, I can finish them, bind the functionality
to a command line option, and merge upstream.

Thanks,
Petr

[PATCH] Testcase for PR65193

2015-02-27 Thread Richard Biener


Applied.

Richard.

2015-02-27  Richard Biener  

PR lto/65193
* g++.dg/lto/pr65193_0.C: New testcase.

Index: gcc/testsuite/g++.dg/lto/pr65193_0.C
===
*** gcc/testsuite/g++.dg/lto/pr65193_0.C(revision 0)
--- gcc/testsuite/g++.dg/lto/pr65193_0.C(working copy)
***
*** 0 
--- 1,71 
+ /* { dg-lto-do link } */
+ /* { dg-require-effective-target fpic } */
+ /* { dg-lto-options {{-fPIC -r -nostdlib -flto -O2 -g}} } */
+ 
+ void frexp (int, int *);
+ namespace std
+ {
+   int ldexp (int, int);
+   struct A
+ {
+ };
+   template  T get_min_shift_value ();
+   template  struct min_shift_initializer
+ {
+   struct B
+   {
+ B () { get_min_shift_value (); }
+   } static const b;
+   static void
+ m_fn1 ()
+   {
+ b;
+   }
+ };
+   template 
+   const typename min_shift_initializer::B min_shift_initializer::b;
+   template 
+   inline T
+   get_min_shift_value ()
+   {
+ using std::ldexp;
+ static T c = ldexp (0, 0);
+ min_shift_initializer::m_fn1;
+   }
+   template 
+   void
+   float_next_imp (T p1, Policy p2)
+   {
+ using std::ldexp;
+ int d;
+ float_next (0, p2);
+ frexp (p1, &d);
+   }
+   template 
+   int
+   float_next (const T &p1, Policy &p2)
+   {
+ float_next_imp (p1, p2);
+   }
+   template  void float_prior_imp (T, Policy)
+ {
+   get_min_shift_value ();
+ }
+   template  int float_prior (T, Policy)
+ {
+   float_prior_imp (static_cast (0), 0);
+ }
+   template 
+   void
+   nextafter (T p1, U p2, Policy p3)
+   {
+ p2 ? float_next (0, p3) : float_prior (p1, 0);
+   }
+   long double e;
+   int f;
+   void
+   nextafter ()
+   {
+ nextafter (e, f, A ());
+   }
+ }

[Committed] S/390: Remove -m64/-m31 from target specific testcases

2015-02-27 Thread Andreas Krebbel

Hi,

the attached patch gets rid of -m64/-m31 uses in our target specific
testcases in order to make
"make check RUNTESTFLAGS='--target_board=unix\{-m31,-m64\}'"
runs work fine again.

Committed to mainline.

Bye,

-Andreas-


2015-02-27  Andreas Krebbel  

* gcc.target/s390/20140327-1.c: Remove -m31 and guard with ! lp64.
* gcc.target/s390/hotpatch-8.c: Likewise.
* gcc.target/s390/hotpatch-9.c: Likewise.
* gcc.target/s390/pr61078.c: Likewise.
* gcc.target/s390/pr57960.c: Remove -m64.
* gcc.target/s390/pr61078.c: Likewise.

diff --git a/gcc/testsuite/gcc.target/s390/20140327-1.c 
b/gcc/testsuite/gcc.target/s390/20140327-1.c
index f71c38f..25c7391 100644
--- a/gcc/testsuite/gcc.target/s390/20140327-1.c
+++ b/gcc/testsuite/gcc.target/s390/20140327-1.c
@@ -1,5 +1,5 @@
-/* { dg-do compile } */
-/* { dg-options "-O3 -m31 -mzarch" } */
+/* { dg-do compile { target { ! lp64 } } } */
+/* { dg-options "-O3 -mzarch" } */
 
 void
 foo ()
diff --git a/gcc/testsuite/gcc.target/s390/hotpatch-8.c 
b/gcc/testsuite/gcc.target/s390/hotpatch-8.c
index 0874bbc..25edd9b 100644
--- a/gcc/testsuite/gcc.target/s390/hotpatch-8.c
+++ b/gcc/testsuite/gcc.target/s390/hotpatch-8.c
@@ -1,7 +1,7 @@
 /* Functional tests for the function hotpatching feature.  */
 
-/* { dg-do compile } */
-/* { dg-options "-O3 -mesa -m31 -march=g5 -mhotpatch=0,3" } */
+/* { dg-do compile { target { ! lp64 } } } */
+/* { dg-options "-O3 -mesa -march=g5 -mhotpatch=0,3" } */
 
 #include 
 
diff --git a/gcc/testsuite/gcc.target/s390/hotpatch-9.c 
b/gcc/testsuite/gcc.target/s390/hotpatch-9.c
index d6fb29a..2143f9d 100644
--- a/gcc/testsuite/gcc.target/s390/hotpatch-9.c
+++ b/gcc/testsuite/gcc.target/s390/hotpatch-9.c
@@ -1,7 +1,7 @@
 /* Functional tests for the function hotpatching feature.  */
 
-/* { dg-do compile } */
-/* { dg-options "-O3 -mesa -m31 -march=g5 -mhotpatch=0,4" } */
+/* { dg-do compile { target { ! lp64 } } } */
+/* { dg-options "-O3 -mesa -march=g5 -mhotpatch=0,4" } */
 
 #include 
 
diff --git a/gcc/testsuite/gcc.target/s390/pr57559.c 
b/gcc/testsuite/gcc.target/s390/pr57559.c
index 15c3878..1c62f56 100644
--- a/gcc/testsuite/gcc.target/s390/pr57559.c
+++ b/gcc/testsuite/gcc.target/s390/pr57559.c
@@ -1,7 +1,7 @@
 /* PR rtl-optimization/57559  */
 
 /* { dg-do compile } */
-/* { dg-options "-march=z10 -m64 -mzarch  -O1" } */
+/* { dg-options "-march=z10 -mzarch  -O1" } */
 
 typedef int int32_t;
 typedef unsigned char uint8_t;
diff --git a/gcc/testsuite/gcc.target/s390/pr57960.c 
b/gcc/testsuite/gcc.target/s390/pr57960.c
index ee751ed..03578ff 100644
--- a/gcc/testsuite/gcc.target/s390/pr57960.c
+++ b/gcc/testsuite/gcc.target/s390/pr57960.c
@@ -1,7 +1,7 @@
 /* PR rtl-optimization/57960  */
 
 /* { dg-do compile } */
-/* { dg-options "-march=z10 -m64 -mzarch  -O1" } */
+/* { dg-options "-march=z10 -mzarch  -O1" } */
 
 typedef union
 {
diff --git a/gcc/testsuite/gcc.target/s390/pr61078.c 
b/gcc/testsuite/gcc.target/s390/pr61078.c
index 2f95eba..40f6ad7 100644
--- a/gcc/testsuite/gcc.target/s390/pr61078.c
+++ b/gcc/testsuite/gcc.target/s390/pr61078.c
@@ -1,8 +1,8 @@
 /* This testcase is extracted from s390_emit_prologue.  The negation
of a 64bit value got split incorrectly on 31 bit.  */
 
-/* { dg-do run } */
-/* { dg-options "-O2 -mesa -m31" } */
+/* { dg-do run { target { ! lp64 } } } */
+/* { dg-options "-O2 -mesa" } */
 
 extern void abort (void);

Re: [patch c-family]: Fix Bug 35330 - [4.8/4.9/5 regression] ICE with invalid pragma weak

2015-02-27 Thread Marek Polacek

On Thu, Feb 26, 2015 at 09:25:57PM +0100, Kai Tietz wrote:
> Well, testcase for the pragma ...
> 
> ChangeLog testsuite/
> 
> 2015-02-26  Kai Tietz  
> 
> * gcc.dg/weak/weak-17.c: New file
 
Missing full stop.

> Updated patch (regression-tested):
> Index: c-pragma.c
> ===
> --- c-pragma.c  (Revision 221019)
> +++ c-pragma.c  (Arbeitskopie)
> @@ -392,6 +392,8 @@ handle_pragma_weak (cpp_reader * ARG_UNUSED (dummy
>decl = identifier_global_value (name);
>if (decl && DECL_P (decl))
>  {
> +  if (!VAR_OR_FUNCTION_DECL_P (decl))
> +   GCC_BAD2 ("weak declaration of %q+D not allowed, ignored", decl);

I think this message should explicitly mention "#pragma weak".

Ok with those changes.

Marek

[PATCH] Fix removing of df problem in df_finish_pass

2015-02-27 Thread Thomas Preud'homme

Hi,

In df_finish_pass, optional problems are removed manually, leaving the non-null
entries in df->problems_in_order non-contiguous. This may lead to a null pointer
dereference when accessing all problems from df->problems_in_order[0] to
df->problems_in_order[df->num_problems_defined - 1], and to some other
problems being missed. Such a scenario was actually encountered when working
on a patch. This patch uses the existing function df_remove_problem to do the
deletion, which requires iterating over problems via the
df->problems_by_index[] array, since each call changes
df->num_problems_defined and the order of problems in
df->problems_in_order[].

ChangeLog entry is as follows:

2015-02-12  Thomas Preud'homme  

* df-core.c (df_finish_pass): Iterate over df->problems_by_index[] and
use df_remove_problem rather than manually removing problems, leaving
holes in df->problems_in_order[].

diff --git a/gcc/df-core.c b/gcc/df-core.c
index 82f1364..67040a1 100644
--- a/gcc/df-core.c
+++ b/gcc/df-core.c
@@ -642,7 +642,6 @@ void
 df_finish_pass (bool verify ATTRIBUTE_UNUSED)
 {
   int i;
-  int removed = 0;
 
 #ifdef ENABLE_DF_CHECKING
   int saved_flags;
@@ -658,21 +657,15 @@ df_finish_pass (bool verify ATTRIBUTE_UNUSED)
   saved_flags = df->changeable_flags;
 #endif
 
-  for (i = 0; i < df->num_problems_defined; i++)
+  /* We iterate over problems by index as each problem removed will
+ lead to problems_in_order to be reordered.  */
+  for (i = 0; i < DF_LAST_PROBLEM_PLUS1; i++)
 {
-  struct dataflow *dflow = df->problems_in_order[i];
-  struct df_problem *problem = dflow->problem;
+  struct dataflow *dflow = df->problems_by_index[i];
 
-  if (dflow->optional_p)
-   {
- gcc_assert (problem->remove_problem_fun);
- (problem->remove_problem_fun) ();
- df->problems_in_order[i] = NULL;
- df->problems_by_index[problem->id] = NULL;
- removed++;
-   }
+  if (dflow && dflow->optional_p)
+   df_remove_problem (dflow);
 }
-  df->num_problems_defined -= removed;
 
   /* Clear all of the flags.  */
   df->changeable_flags = 0;


The testsuite was run with a bootstrapped x86_64 native compiler and an
arm-none-eabi GCC cross-compiler targeting Cortex-M3 without any
regression.

Although the problem is real, it doesn't seem that GCC hits it now
(I stumbled upon it while working on a patch). Therefore I'm not sure
if this should go in stage4 or not. Please advise me on this.

Ok for trunk/stage1?

Best regards,

Thomas
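
The failure mode and the fix can be modelled with a small self-contained toy.
The field names mirror the description in the mail, but everything below
(struct toy_df, toy_remove_problem, toy_finish_pass, toy_demo) is a
hypothetical model written for illustration, not GCC's df-core.c.

```c
#include <stddef.h>

/* Toy model only: the names mirror the description in the mail, but this
   is not GCC's df-core.c.  */
#define LAST_PROBLEM_PLUS1 4

struct problem
{
  int id;
  int optional_p;
};

struct toy_df
{
  struct problem *problems_in_order[LAST_PROBLEM_PLUS1];
  struct problem *problems_by_index[LAST_PROBLEM_PLUS1];
  int num_problems_defined;
};

/* Remove P and close the gap in problems_in_order, so that entries
   0 .. num_problems_defined - 1 stay non-null and contiguous.  */
static void
toy_remove_problem (struct toy_df *df, struct problem *p)
{
  int i, j = 0;

  for (i = 0; i < df->num_problems_defined; i++)
    if (df->problems_in_order[i] != p)
      df->problems_in_order[j++] = df->problems_in_order[i];
  for (i = j; i < df->num_problems_defined; i++)
    df->problems_in_order[i] = NULL;
  df->num_problems_defined = j;
  df->problems_by_index[p->id] = NULL;
}

/* Walk the id-indexed array, which removal never reorders.  */
static void
toy_finish_pass (struct toy_df *df)
{
  int i;

  for (i = 0; i < LAST_PROBLEM_PLUS1; i++)
    {
      struct problem *p = df->problems_by_index[i];

      if (p && p->optional_p)
        toy_remove_problem (df, p);
    }
}

/* Register four problems (ids 1 and 2 optional), finish the pass, and
   return the number of remaining problems, or -1 if a hole was left in
   the first num_problems_defined entries of problems_in_order.  */
int
toy_demo (void)
{
  static struct problem probs[LAST_PROBLEM_PLUS1]
    = { { 0, 0 }, { 1, 1 }, { 2, 1 }, { 3, 0 } };
  struct toy_df df;
  int i;

  for (i = 0; i < LAST_PROBLEM_PLUS1; i++)
    {
      df.problems_in_order[i] = &probs[i];
      df.problems_by_index[i] = &probs[i];
    }
  df.num_problems_defined = LAST_PROBLEM_PLUS1;

  toy_finish_pass (&df);

  for (i = 0; i < df.num_problems_defined; i++)
    if (df.problems_in_order[i] == NULL)
      return -1;
  return df.num_problems_defined;
}
```

Iterating over problems_in_order while the removal routine compacts it would
skip the entry that slides into the current slot, which is why the loop walks
the id-indexed array instead.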

Re: [patch]: Fix Bug 65038 - [regression 5] Unable to find ftw.h for libgcov-util.c

2015-02-27 Thread Kai Tietz

Applied at revision 221055.

--
Kai

Re: [C PATCH] Fix ICE with __auto_type (PR c/65228)

2015-02-27 Thread Joseph Myers

On Fri, 27 Feb 2015, Marek Polacek wrote:

> This PR points out that we can ICE in case we have __auto_type with an
> undeclared variable: grokdeclarator returns error_mark_node, so check for that
> before accessing decl's TREE_CODE (can't use error_operand_p here as that
> would cause bogus diagnostics to be emitted).
> 
> Bootstrapped/regtested on x86_64-linux, ok for trunk and 4.9?
> 
> 2015-02-26  Marek Polacek  
> 
>   PR c/65228
>   * c-decl.c (start_decl): Return NULL_TREE if decl is an error node.
> 
>   * gcc.dg/pr65228.c: New test.

OK.

-- 
Joseph S. Myers
jos...@codesourcery.com

Re: [PATCH, alpha]: Fix PR/47230 [4.6/4.7 Regression] gcc fails to bootstrap on alpha in stage2 with "relocation truncated to fit: GPREL16 against ..."

2015-02-27 Thread Uros Bizjak

On Wed, Feb 25, 2015 at 8:18 PM, Richard Henderson  wrote:
> On 02/25/2015 09:02 AM, Uros Bizjak wrote:
>> The patch was tested on alpha-linux-gnu and alphaev68-linux-gnu for
>> all default languages plus obj-c++ and go.
>>
>> OK for mainline?
>
> Ok.  Thanks.

Unfortunately, while "normal" bootstrap works OK with alpha-linux-gnu
and alphaev68-linux-gnu, the bootstrap still fails when configured
--with-build-config=bootstrap-lto:

libbackend.a(tree-data-ref.o): In function `non_affine_dependence_relation':
/space/uros/gcc-build-profiled/gcc/../../gcc-svn/trunk/gcc/tree-data-ref.c:1661:(.text+0x1b8):
relocation truncated to fit: GPREL16 against symbol `dump_file'
defined in .sbss section in libbackend.a(dumpfile.o)
/space/uros/gcc-build-profiled/gcc/../../gcc-svn/trunk/gcc/tree-data-ref.c:1661:(.text+0x1d0):
relocation truncated to fit: GPREL16 against symbol `dump_flags'
defined in .sbss section in libbackend.a(dumpfile.o)
libbackend.a(tree-data-ref.o): In function
`compute_overlap_steps_for_affine_1_2':
/space/uros/gcc-build-profiled/gcc/../../gcc-svn/trunk/gcc/tree-data-ref.c:2265:(.text+0x72a8):
relocation truncated to fit: GPREL16 against symbol `dump_file'
defined in .sbss section in libbackend.a(dumpfile.o)
/space/uros/gcc-build-profiled/gcc/../../gcc-svn/trunk/gcc/tree-data-ref.c:2265:(.text+0x72b4):
relocation truncated to fit: GPREL16 against symbol `dump_flags'
defined in .sbss section in libbackend.a(dumpfile.o)
libbackend.a(tree-data-ref.o): In function `analyze_siv_subscript_cst_affine':
/space/uros/gcc-build-profiled/gcc/../../gcc-svn/trunk/gcc/tree-data-ref.c:1953:(.text+0x8a5c):
relocation truncated to fit: GPREL16 against symbol `dump_file'
defined in .sbss section in libbackend.a(dumpfile.o)
/space/uros/gcc-build-profiled/gcc/../../gcc-svn/trunk/gcc/tree-data-ref.c:1953:(.text+0x8a74):
relocation truncated to fit: GPREL16 against symbol `dump_flags'
defined in .sbss section in libbackend.a(dumpfile.o)
/space/uros/gcc-build-profiled/gcc/../../gcc-svn/trunk/gcc/tree-data-ref.c:1968:(.text+0x8bc0):
relocation truncated to fit: GPREL16 against symbol `dump_file'
defined in .sbss section in libbackend.a(dumpfile.o)
/space/uros/gcc-build-profiled/gcc/../../gcc-svn/trunk/gcc/tree-data-ref.c:1968:(.text+0x8bd8):
relocation truncated to fit: GPREL16 against symbol `dump_flags'
defined in .sbss section in libbackend.a(dumpfile.o)
/space/uros/gcc-build-profiled/gcc/../../gcc-svn/trunk/gcc/tree-data-ref.c:2049:(.text+0x8db8):
relocation truncated to fit: GPREL16 against symbol `dump_file'
defined in .sbss section in libbackend.a(dumpfile.o)
/space/uros/gcc-build-profiled/gcc/../../gcc-svn/trunk/gcc/tree-data-ref.c:2049:(.text+0x8dd0):
relocation truncated to fit: GPREL16 against symbol `dump_flags'
defined in .sbss section in libbackend.a(dumpfile.o)
libbackend.a(tree-data-ref.o): In function
`dr_analyze_innermost(data_reference*, loop*)':
/space/uros/gcc-build-profiled/gcc/../../gcc-svn/trunk/gcc/tree-data-ref.c:802:(.text+0xb54c):
additional relocation overflows omitted from the output
collect2: error: ld returned 1 exit status
../../gcc-svn/trunk/gcc/lto/Make-lang.in:71: recipe for target 'lto1' failed
gmake[3]: *** [lto1] Error 1

Also reported is build failure on debian [1], where bootstrap dies
with ELF_LITERAL truncation:

libbackend.a(tree-vect-generic.o): In function `gimple_statement_structure':
/«PKGBUILDDIR»/build/gcc/../../src/gcc/gimple.h:1572:(.text+0x2f4):
relocation truncated to fit: ELF_LITERAL against `.text'
libbackend.a(tree-vect-generic.o): In function `gimple_has_ops':
/«PKGBUILDDIR»/build/gcc/../../src/gcc/gimple.h:1846:(.text+0x38c):
relocation truncated to fit: ELF_LITERAL against `.text'
/«PKGBUILDDIR»/build/gcc/../../src/gcc/gimple.h:1846:(.text+0x3ac):
relocation truncated to fit: ELF_LITERAL against `.text'

[1] https://sourceware.org/bugzilla/show_bug.cgi?id=5276#c13

Uros.

Re: [patch] fix PR65048: check that jump-thread paths are still valid

2015-02-27 Thread Andreas Schwab

Sebastian Pop  writes:

> +void
> +baz (void)
> +{
> +  while (1)
> +{
> +  a = foo ();
> +  b = foo ();

FAIL: gcc.dg/tree-ssa/ssa-dom-thread-9.c (test for excess errors)
Excess errors:
/daten/aranym/gcc/gcc-20150227/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-9.c:45:11:
 error: too few arguments to function 'foo'
/daten/aranym/gcc/gcc-20150227/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-9.c:46:11:
 error: too few arguments to function 'foo'

Andreas.

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."

Re: ipa-icf::merge TLC

2015-02-27 Thread H.J. Lu

On Thu, Feb 26, 2015 at 6:10 PM, Jan Hubicka  wrote:
> Hi,
> this is the final version of the patch I committed.  It has a new fix to
> make_decl_local to set TREE_ADDRESSABLE because we leave the flag undefined
> for non-local decls.
> I also dropped Optimization from fmerge-all-constants and fmerge-constants,
> as those cannot be done in a function-specific way; I made
> ipa_ref::address_matters_p use fmerge-constants, added code to drop the
> UNINLINABLE flag when a function is turned into a wrapper, added a check to
> require DECL_NO_INLINE_WARNING_P to match, and added code to set
> TREE_ADDRESSABLE when non-addressable and addressable vars are merged.
> I also disabled merging for DECL_CONSTANT_POOL because it does not work
> (symtab does not expect aliases here).
>
> Bootstrapped/regtested x86_64-linux, committed.
>
> Honza
> * ipa-icf.c (symbol_compare_collection::symbol_compare_colleciton):
> Use address_matters_p.
> (redirect_all_callers, set_addressable): New functions.
> (sem_function::merge): Reorganize and fix merging issues.
> (sem_variable::merge): Likewise.
> (sem_variable::compare_sections): Remove.
> * common.opt (fmerge-all-constants, fmerge-constants): Remove
> Optimization flag.
> * symtab.c (symtab_node::resolve_alias): When alias has aliases,
> redirect them.
> (symtab_node::make_decl_local): Set ADDRESSABLE bit when
> decl is used.
> (address_matters_1): New function.
> (symtab_node::address_matters_p): New function.
> * cgraph.c (cgraph_edge::verify_corresponds_to_fndecl): Fix
> check for merged flag.
> * cgraph.h (address_matters_p): Declare.
> (symtab_node::address_taken_from_non_vtable_p): Remove.
> (symtab_node::address_can_be_compared_p): New method.
> (ipa_ref::address_matters_p): Move here from ipa-ref.c; simplify.
> * ipa-visibility.c (symtab_node::address_taken_from_non_vtable_p):
> Remove.
> (comdat_can_be_unshared_p_1) Use address_matters_p.
> (update_vtable_references): Fix formating.
> * ipa-ref.c (ipa_ref::address_matters_p): Move inline.
> * cgraphunit.c (cgraph_node::create_wrapper): Drop UNINLINABLE flag.
> * cgraphclones.c: Preserve merged and icf_merged flags.
>

This caused:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65237

-- 
H.J.

Re: [PATCH, libsanitizer] Enable for PowerPC little endian

2015-02-27 Thread Peter Bergner

On Thu, 2015-02-26 at 20:04 -0600, Peter Bergner wrote:
> On Thu, 2015-02-26 at 22:56 +0100, Jakub Jelinek wrote:
> > How do make check results (asan.exp/ubsan.exp) look like on ppc64le?
> > If it works as good as or better as ppc64be, then I'm fine with adding it
> > even in stage4.
> 
> They have the same exact failures in ubsan and ppc64le has fewer asan
> failures than ppc64be and all of the ppc64le failures are also failures
> in ppc64be, so ppc64le is actually in better shape than ppc64be.

Ok, since the results met your criteria for inclusion, I committed the
change as revision 221060.  Thanks.

Peter

Re: [patch]: Fix Bug 65038 - [regression 5] Unable to find ftw.h for libgcov-util.c

2015-02-27 Thread H.J. Lu

On Thu, Feb 26, 2015 at 6:49 AM, Kai Tietz  wrote:
> Hi,
>
> This is the remaining fix for re-enabling native bootstrap for the
> Windows variant of gcc without disabling -Werror for libgcc.
>
> ChangeLog
>
> 2015-02-26  Kai Tietz  
>
> PR target/65038
> * config.in: Regenerated.
> * configure: Likewise.
> * configure.ac (AC_HEADER_STDC): Add explicit.
> (AC_CHECK_HEADERS): Check for default headers
> plus for ftw.h one.
> * libgcov-util.c (gcov_read_profile_dir): Disable use
> of ftw-function, if header not found.
> (ftw_read_file): Don't translate if ftw header isn't
> present.
>
> Regression-tested for x86_64-unknown-linux-gnu.  Bootstrapped for
> i686-pc-mingw32.
> I will apply soon, if there are no objections.

I believe it breaks bootstrap on Linux/x86:

https://gcc.gnu.org/ml/gcc-regression/2015-02/msg00580.html

-- 
H.J.

[committed] Move -Wformat-signedness out of -Wformat=2 (PR c/65040)

2015-02-27 Thread Marek Polacek

Many folks complained that -Wformat-signedness is overly pedantic, and
especially with -Werror it can cause a lot of needless pain.  This patch moves
it out of -Wformat=2.

Bootstrapped/regtested on x86_64-linux, applying to trunk.

2015-02-27  Marek Polacek  

PR c/65040
* doc/invoke.texi: Update to reflect that -Wformat=2 doesn't enable
-Wformat-signedness anymore.

* c.opt (Wformat-signedness): Don't enable by -Wformat=2.

* gcc.dg/pr65066.c: Use -Wformat -Wformat-signedness and not
-Wformat=2.

diff --git gcc/c-family/c.opt gcc/c-family/c.opt
index fd00407..b3c8cee 100644
--- gcc/c-family/c.opt
+++ gcc/c-family/c.opt
@@ -456,7 +456,7 @@ C ObjC C++ ObjC++ Var(warn_format_security) Warning LangEnabledBy(C ObjC C++ Obj
 Warn about possible security problems with format functions
 
 Wformat-signedness
-C ObjC C++ ObjC++ Var(warn_format_signedness) Warning LangEnabledBy(C ObjC C++ ObjC++,Wformat=, warn_format >= 2, 0)
+C ObjC C++ ObjC++ Var(warn_format_signedness) Warning
 Warn about sign differences with format functions
 
 Wformat-y2k
diff --git gcc/doc/invoke.texi gcc/doc/invoke.texi
index ef4cc75..b07eed0 100644
--- gcc/doc/invoke.texi
+++ gcc/doc/invoke.texi
@@ -3631,7 +3631,7 @@ The C standard specifies that zero-length formats are allowed.
 @opindex Wformat=2
 Enable @option{-Wformat} plus additional format checks.  Currently
 equivalent to @option{-Wformat -Wformat-nonliteral -Wformat-security
--Wformat-signedness -Wformat-y2k}.
+-Wformat-y2k}.
 
 @item -Wformat-nonliteral
 @opindex Wformat-nonliteral
diff --git gcc/testsuite/gcc.dg/pr65066.c gcc/testsuite/gcc.dg/pr65066.c
index 883a87d..291e97a 100644
--- gcc/testsuite/gcc.dg/pr65066.c
+++ gcc/testsuite/gcc.dg/pr65066.c
@@ -1,6 +1,6 @@
 /* PR c/65066 */
 /* { dg-do compile } */
-/* { dg-options "-Wformat=2" } */
+/* { dg-options "-Wformat -Wformat-signedness" } */
 
 extern int sscanf (const char *restrict, const char *restrict, ...);
 int *a;

Marek

patch to fix PR65302

2015-02-27 Thread Vladimir Makarov


The following patch fixes PR65032

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65032

The LRA rematerialization sub-pass did not update the info about scratch
pseudos in rematerialized insns, which resulted in skipping the reverse
transformation of scratch pseudos back into scratches in such insns at the
end of LRA.


The patch was bootstrapped and tested on x86-64.

Committed as rev. 221062.

2015-02-27  Vladimir Makarov  

PR target/65032
* lra-remat.c (update_scratch_ops): New.
(do_remat): Call it.
* lra.c (lra_register_new_scratch_op): New. Take code from ...
(remove_scratches): ... here.
* lra-int.h (lra_register_new_scratch_op): New prototype.

2015-02-27  Vladimir Makarov  

PR target/65032
* g++.dg/pr65032.C: New.



Index: lra-int.h
===
--- lra-int.h   (revision 220916)
+++ lra-int.h   (working copy)
@@ -321,6 +321,7 @@ extern void lra_create_copy (int, int, i
 extern lra_copy_t lra_get_copy (int);
 extern bool lra_former_scratch_p (int);
 extern bool lra_former_scratch_operand_p (rtx_insn *, int);
+extern void lra_register_new_scratch_op (rtx_insn *, int);
 
 extern int lra_new_regno_start;
 extern int lra_constraint_new_regno_start;
Index: lra.c
===
--- lra.c   (revision 220916)
+++ lra.c   (working copy)
@@ -1907,6 +1907,24 @@ lra_former_scratch_operand_p (rtx_insn *
   INSN_UID (insn) * MAX_RECOG_OPERANDS + nop) != 0;
 }
 
+/* Register operand NOP in INSN as a former scratch.  It will be
+   changed to scratch back, if it is necessary, at the LRA end.  */
+void
+lra_register_new_scratch_op (rtx_insn *insn, int nop)
+{
+  lra_insn_recog_data_t id = lra_get_insn_recog_data (insn);
+  rtx op = *id->operand_loc[nop];
+  sloc_t loc = XNEW (struct sloc);
+  lra_assert (REG_P (op));
+  loc->insn = insn;
+  loc->nop = nop;
+  scratches.safe_push (loc);
+  bitmap_set_bit (&scratch_bitmap, REGNO (op));
+  bitmap_set_bit (&scratch_operand_bitmap,
+ INSN_UID (insn) * MAX_RECOG_OPERANDS + nop);
+  add_reg_note (insn, REG_UNUSED, op);
+}
+
 /* Change scratches onto pseudos and save their location.  */
 static void
 remove_scratches (void)
@@ -1916,7 +1934,6 @@ remove_scratches (void)
   basic_block bb;
   rtx_insn *insn;
   rtx reg;
-  sloc_t loc;
   lra_insn_recog_data_t id;
   struct lra_static_insn_data *static_id;
 
@@ -1938,15 +1955,7 @@ remove_scratches (void)
  *id->operand_loc[i] = reg
= lra_create_new_reg (static_id->operand[i].mode,
  *id->operand_loc[i], ALL_REGS, NULL);
- add_reg_note (insn, REG_UNUSED, reg);
- lra_update_dup (id, i);
- loc = XNEW (struct sloc);
- loc->insn = insn;
- loc->nop = i;
- scratches.safe_push (loc);
- bitmap_set_bit (&scratch_bitmap, REGNO (*id->operand_loc[i]));
- bitmap_set_bit (&scratch_operand_bitmap,
- INSN_UID (insn) * MAX_RECOG_OPERANDS + i);
+ lra_register_new_scratch_op (insn, i);
  if (lra_dump_file != NULL)
fprintf (lra_dump_file,
 "Removing SCRATCH in insn #%u (nop %d)\n",
Index: lra-remat.c
===
--- lra-remat.c (revision 220946)
+++ lra-remat.c (working copy)
@@ -1044,6 +1044,29 @@ get_hard_regs (struct lra_insn_reg *reg,
   return hard_regno;
 }
 
+/* Make copy of and register scratch pseudos in rematerialized insn
+   REMAT_INSN.  */
+static void
+update_scratch_ops (rtx_insn *remat_insn)
+{
+  lra_insn_recog_data_t id = lra_get_insn_recog_data (remat_insn);
+  struct lra_static_insn_data *static_id = id->insn_static_data;
+  for (int i = 0; i < static_id->n_operands; i++)
+{
+  rtx *loc = id->operand_loc[i];
+  if (! REG_P (*loc))
+   continue;
+  int regno = REGNO (*loc);
+  if (! lra_former_scratch_p (regno))
+   continue;
+  *loc = lra_create_new_reg (GET_MODE (*loc), *loc,
+lra_get_allocno_class (regno),
+"scratch pseudo copy");
+  lra_register_new_scratch_op (remat_insn, i);
+}
+  
+}
+
 /* Insert rematerialization insns using the data-flow data calculated
earlier.  */
 static bool
@@ -1193,6 +1216,7 @@ do_remat (void)
  HOST_WIDE_INT sp_offset_change = cand_sp_offset - id->sp_offset;
  if (sp_offset_change != 0)
change_sp_offset (remat_insn, sp_offset_change);
+ update_scratch_ops (remat_insn);
  lra_process_new_insns (insn, remat_insn, NULL,
 "Inserting rematerialization insn");
  lra_set_insn_deleted (insn);
Index: testsuite/g++.dg/pr65032.C
=

RE: [PATCH][ARM] PR target/64600 Fix another ICE with -mtune=xscale: properly sign-extend mask during constant splitting

2015-02-27 Thread Kyrill Tkachov

On 03/02/15 15:18, Kyrill Tkachov wrote:
> Hi all,
>
> The ICE in this PR occurs when -mtune=xscale triggers a particular path
> through arm_gen_constant during expand that creates a 0xfffff00f mask but,
> for a 64-bit HOST_WIDE_INT, doesn't sign-extend it into 0xfffffffffffff00f,
> which signifies the required -4081. It leaves it as 0xfffff00f (4294963215),
> which breaks when combine later tries to perform an SImode bitwise AND using
> the wide-int machinery.
>
> I think the correct approach here is to use trunc_int_for_mode that
> correctly sign-extends the constant so that it is properly represented by a
> HOST_WIDE_INT for the required mode.
>
> Bootstrapped and tested arm-none-linux-gnueabihf with -mtune=xscale in
> BOOT_CFLAGS.
>
> The testcase triggers for -mcpu=xscale and all slowmul targets because they
> are the only ones that have the constant_limit tune parameter set to
> anything >1 which is required to follow this particular path through
> arm_split_constant. Also, the rtx costs can hide this ICE sometimes.
>
> Ok for trunk?
>
> Thanks,
> Kyrill
>
> 2015-02-03  Kyrylo Tkachov  
>
>  PR target/64600
>  * config/arm/arm.c (arm_gen_constant, AND case): Call
>  trunc_int_for_mode when constructing AND mask.
>
> 2015-02-03  Kyrylo Tkachov  
>
>  PR target/64600
>  * gcc.target/arm/pr64600_1.c: New test.
> arm-xscale-wide.patch
> commit 52388a359dd65276bccfac499a2fd9e406fbe1a8
> Author: Kyrylo Tkachov 
> Date:   Tue Jan 20 11:21:34 2015 +
>
> [ARM] Fix ICE due to arm_gen_constant not sign_extending
>
> diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
> index db4834b..d0f3a52 100644
> --- a/gcc/config/arm/arm.c
> +++ b/gcc/config/arm/arm.c
> @@ -4709,19 +4709,20 @@ arm_gen_constant (enum rtx_code code, machine_mode mode, rtx cond,
>  
> if ((remainder | shift_mask) != 0xffffffff)
>   {
> +   HOST_WIDE_INT new_val
> + = trunc_int_for_mode (remainder | shift_mask, mode);

Offlist, Richard mentioned that trunc_int_for_mode may pessimize codegen for
HImode values due to excessive setting of bits and using ARM_SIGN_EXTEND
might be preferable.
I've tried that and it does fix the ICE and goes through testing ok.
Bootstrap still ongoing.
I didn't perform any code quality investigation. Richard, are there any
particular code sequences that you'd like us to investigate here?
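
For reference, the sign-extension problem described in the quoted text can
be sketched in plain C as below.  This is only an illustration under stated
assumptions: sign_extend_32 is a hypothetical helper that mimics what
extending an SImode constant into a 64-bit host integer has to do.  It is
not the GCC implementation of trunc_int_for_mode or ARM_SIGN_EXTEND.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: treat the low 32 bits of VALUE as a signed
   32-bit quantity and widen it to 64 bits.  */
int64_t
sign_extend_32 (int64_t value)
{
  /* Keep only the low 32 bits.  */
  uint64_t low = (uint64_t) value & UINT64_C (0xffffffff);

  /* Flip the sign bit, then subtract it again.  If bit 31 was set the
     subtraction wraps and fills the upper 32 bits with ones.  */
  uint64_t extended = (low ^ UINT64_C (0x80000000)) - UINT64_C (0x80000000);

  /* Converting back to a signed type assumes the usual two's complement
     behaviour of the host compiler.  */
  return (int64_t) extended;
}
```

With this helper, the value 4294963215 becomes -4081, which is the
representation the quoted text says is required for the mask.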

Thanks,
Kyrill

>
> +
> if (generate)
>   {
> rtx new_src = subtargets ? gen_reg_rtx (mode) : target;
> -   insns = arm_gen_constant (AND, mode, cond,
> - remainder | shift_mask,
> +   insns = arm_gen_constant (AND, SImode, cond, new_val,
>   new_src, source, subtargets, 1);
> source = new_src;
>   }
> else
>   {
> rtx targ = subtargets ? NULL_RTX : target;
> -   insns = arm_gen_constant (AND, mode, cond,
> - remainder | shift_mask,
> +   insns = arm_gen_constant (AND, mode, cond, new_val,
>   targ, source, subtargets, 0);
>   }
>   }
> @@ -4744,12 +4745,13 @@ arm_gen_constant (enum rtx_code code, machine_mode mode, rtx cond,
>  
> if ((remainder | shift_mask) != 0xffffffff)
>   {
> +   HOST_WIDE_INT new_val
> + = trunc_int_for_mode (remainder | shift_mask, mode);
> if (generate)
>   {
> rtx new_src = subtargets ? gen_reg_rtx (mode) : target;
>  
> -   insns = arm_gen_constant (AND, mode, cond,
> - remainder | shift_mask,
> +   insns = arm_gen_constant (AND, mode, cond, new_val,
>   new_src, source, subtargets, 1);
> source = new_src;
>   }
> @@ -4757,8 +4759,7 @@ arm_gen_constant (enum rtx_code code, machine_mode mode, rtx cond,
>   {
> rtx targ = subtargets ? NULL_RTX : target;
>  
> -   insns = arm_gen_constant (AND, mode, cond,
> - remainder | shift_mask,
> +   insns = arm_gen_constant (AND, mode, cond, new_val,
>   targ, source, subtargets, 0);
>   }
>   }
> diff --git a/gcc/testsuite/gcc.target/arm/pr64600_1.c b/gcc/testsuite/gcc.target/arm/pr64600_1.c
> new file mode 100644
> index 000..6ba3fa2
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/arm/pr64600_1.c
> @@ -0,0 +1,15 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -mtune=xscale" } */
> +
> +typedef unsigned int speed_t;
> +typedef unsigned int tcflag_t;
> +
> +struct termios {
> + tcflag_t c_cflag;
> +};
> +
> +speed_t
> +cfgetospeed (const struct termios *tp)
> +{
> +  return tp->c_cfl

Re: [PATCH] PR63175 - [4.9/5 regression] FAIL: gcc.dg/vect/costmodel/ppc/costmodel-bb-slp-9a.c scan-tree-dump-times slp2" basic block vectorized using SLP" 1

2015-02-27 Thread David Edelsohn

On Tue, Feb 24, 2015 at 2:30 PM, Jeff Law  wrote:
> On 02/23/15 20:38, Martin Sebor wrote:
>>
>> On 02/22/2015 11:45 AM, David Edelsohn wrote:
>>>
>>> Does this patch really fix the problem?  The PR notes that the
>>> testcase fails and code quality has regressed.  Has the code
>>> generation been corrected but the testcase looks for the wrong string?
>>>   Presumably the message that basic block was vectorized means that the
>>> code generation is correct, but the commentary about the patch does
>>> not mention it.
>>
>>
>> There appear to be at least three problems at play here:
>>
>> 1) The test expects the wrong string to determine success.
>>
>> 2) GCC 4.9.0 and later emit suboptimal code compared to 4.8.4.
>>
>> 3) With (1) fixed, the test fails to detect (2).
>>
>> During my initial investigation, besides trunk, I had only looked
>> at the assembly emitted at revision 198852 since there the test
>> is reported as passing in comment #2. The code appears comparable
>> between the two.
>>
>> Now that I've also compared the assembly emitted by 4.8.4 I see
>> what I suspect the original reporter was referring to: 4.9.0 and
>> later both use vectorization to copy the arrays and also assign
>> the four elements using ordinary loads and stores.  And since
>> the code has been successfully vectorized (and GCC reports it
>> in the dump) the test passes.
>>
>> I'll need to spend some more time to find the revision that
>> caused this.
>
> Something to bear in mind, this may turn out to be something that isn't
> fixable at this stage in development.  So please stay in contact with your
> findings.
>
> Regardless, we should find a way to change the testcase so that it can
> correctly identify the missed optimization.

The fix to the testcase is fine with me.

Given that Martin's fix to the testcase allowed it to succeed without
Richi's fix for the underlying problem, is there a modification to the
testcase or a new testcase that would really test the optimization?

Thanks, David

Re: [patch]: Fix Bug 65038 - [regression 5] Unable to find ftw.h for libgcov-util.c

2015-02-27 Thread Kai Tietz

2015-02-27 14:48 GMT+01:00 H.J. Lu :
> On Thu, Feb 26, 2015 at 6:49 AM, Kai Tietz  wrote:
>> Hi,
>>
>> This is the remaining fix for re-enabling native bootstrap for the
>> Windows variant of gcc without disabling -Werror for libgcc.
>>
>> ChangeLog
>>
>> 2015-02-26  Kai Tietz  
>>
>> PR target/65038
>> * config.in: Regenerated.
>> * configure: Likewise.
>> * configure.ac (AC_HEADER_STDC): Add explicit.
>> (AC_CHECK_HEADERS): Check for default headers
>> plus for ftw.h one.
>> * libgcov-util.c (gcov_read_profile_dir): Disable use
>> of ftw-function, if header not found.
>> (ftw_read_file): Don't translate if ftw header isn't
>> present.
>>
>> Regression-tested for x86_64-unknown-linux-gnu.  Bootstrapped for
>> i686-pc-mingw32.
>> I will apply soon, if there are no objections.
>
> I believe it breaks bootstrap on Linux/x86:
>
> https://gcc.gnu.org/ml/gcc-regression/2015-02/msg00580.html
>
> --
> H.J.

Already fixed.  See bug

--
Kai

Re: [PATCH, libsanitizer] Enable for PowerPC little endian

2015-02-27 Thread Markus Trippelsdorf

On 2015.02.27 at 07:47 -0600, Peter Bergner wrote:
> 
> Ok, since the results met your criteria for inclusion, I committed the
> change as revision 221060.  Thanks.

Are there any plans to fix:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63927 ?

-- 
Markus

Re: [patch] fix PR65048: check that jump-thread paths are still valid

2015-02-27 Thread Jakub Jelinek

On Fri, Feb 27, 2015 at 02:31:38PM +0100, Andreas Schwab wrote:
> Sebastian Pop  writes:
> 
> > +void
> > +baz (void)
> > +{
> > +  while (1)
> > +{
> > +  a = foo ();
> > +  b = foo ();
> 
> FAIL: gcc.dg/tree-ssa/ssa-dom-thread-9.c (test for excess errors)
> Excess errors:
> /daten/aranym/gcc/gcc-20150227/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-9.c:45:11:
>  error: too few arguments to function 'foo'
> /daten/aranym/gcc/gcc-20150227/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-9.c:46:11:
>  error: too few arguments to function 'foo'

Indeed.  Unfortunately, the reduced testcase that got reported
only triggers the ICE if it contains the two issues
1) K&R style definition, so the caller can call it with no arguments
2) missing return in foo in some cases

Thus the r221020 not only replaced a warning with two errors, but also
turned a testcase that tested the FSM bug into one that doesn't.

Fixed thusly, verified it succeeds with current trunk and fails (ICEs) with
a few days old trunk, and committed as obvious to trunk.

2015-02-27  Jakub Jelinek  

PR tree-optimization/65048
* gcc.dg/tree-ssa/ssa-dom-thread-9.c: Add -std=gnu89 to dg-options.
(foo): Use K&R style definition.

--- gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-9.c.jj 2015-02-26 22:00:09.0 +0100
+++ gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-9.c    2015-02-27 15:30:09.102675253 +0100
@@ -1,12 +1,13 @@
 /* PR 65048 */
 /* { dg-do compile } */
-/* { dg-options "-O3" } */
+/* { dg-options "-O3 -std=gnu89" } */

 int a, b, c, d;
 void fn (void);

 int
-foo (int x)
+foo (x)
+ int x;
 {
   switch (x)
 {

Jakub

Re: [PATCH] PR63175 - [4.9/5 regression] FAIL: gcc.dg/vect/costmodel/ppc/costmodel-bb-slp-9a.c scan-tree-dump-times slp2" basic block vectorized using SLP" 1

2015-02-27 Thread Martin Sebor


On 02/27/2015 07:27 AM, David Edelsohn wrote:
> On Tue, Feb 24, 2015 at 2:30 PM, Jeff Law  wrote:
>> On 02/23/15 20:38, Martin Sebor wrote:
>>>
>>> On 02/22/2015 11:45 AM, David Edelsohn wrote:
>>>>
>>>> Does this patch really fix the problem?  The PR notes that the
>>>> testcase fails and code quality has regressed.  Has the code
>>>> generation been corrected but the testcase looks for the wrong string?
>>>>   Presumably the message that basic block was vectorized means that the
>>>> code generation is correct, but the commentary about the patch does
>>>> not mention it.
>>>
>>>
>>> There appear to be at least three problems at play here:
>>>
>>> 1) The test expects the wrong string to determine success.
>>>
>>> 2) GCC 4.9.0 and later emit suboptimal code compared to 4.8.4.
>>>
>>> 3) With (1) fixed, the test fails to detect (2).
>>>
>>> During my initial investigation, besides trunk, I had only looked
>>> at the assembly emitted at revision 198852 since there the test
>>> is reported as passing in comment #2. The code appears comparable
>>> between the two.
>>>
>>> Now that I've also compared the assembly emitted by 4.8.4 I see
>>> what I suspect the original reporter was referring to: 4.9.0 and
>>> later both use vectorization to copy the arrays and also assign
>>> the four elements using ordinary loads and stores.  And since
>>> the code has been successfully vectorized (and GCC reports it
>>> in the dump) the test passes.
>>>
>>> I'll need to spend some more time to find the revision that
>>> caused this.
>>
>> Something to bear in mind, this may turn out to be something that isn't
>> fixable at this stage in development.  So please stay in contact with your
>> findings.
>>
>> Regardless, we should find a way to change the testcase so that it can
>> correctly identify the missed optimization.
>
> The fix to the testcase is fine with me.
>
> Given that Martin's fix to the testcase allowed it to succeed without
> Richi's fix for the underlying problem, is there a modification to the
> testcase or a new testcase that would really test the optimization?

Let me work on it.

Martin

Re: [PATCH] Fix for PR ipa/64693

2015-02-27 Thread Martin Liška


On 02/26/2015 07:21 PM, Jan Hubicka wrote:
>> 2015-02-25  Martin Liska  
>> Jan Hubicka  
>>
>> PR ipa/64693
>> * ipa-icf.c (symbol_compare_collection::symbol_compare_collection): New.
>> (sem_item_optimizer::subdivide_classes_by_sensitive_refs): New function.
>> (sem_item_optimizer::process_cong_reduction): Include division by
>> sensitive references.
>> * ipa-icf.h (struct symbol_compare_hashmap_traits): New class.
>> * ipa-ref.c (ipa_ref::address_matters_p): New function.
>> * ipa-ref.h (ipa_ref::address_matters_p): Likewise.
>>
>> gcc/testsuite/ChangeLog:
>>
>> 2015-02-25  Martin Liska  
>> Jan Hubicka  
>>
>> * g++.dg/ipa/pr64146.C: Update expected results.
>> * gcc.dg/ipa/ipa-icf-26.c: Update test.
>> * gcc.dg/ipa/ipa-icf-33.c: Remove redundant line.
>> * gcc.dg/ipa/ipa-icf-34.c: New test.
>
> OK
> Honza



Hi.

There's one missing vector comparison. Fix is obvious, ready for trunk?
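
To illustrate the kind of check involved, here is a minimal sketch in plain
C.  The struct and its field names are simplified stand-ins invented for
this example; they are not the real symbol_compare_collection from
ipa-icf.h.  The point is only that both vector lengths have to match before
the element-wise loops run.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a key that holds two vectors.  */
struct collection
{
  const int *references;
  size_t n_references;
  const int *interposables;
  size_t n_interposables;
};

/* Two keys are equal only if both vectors have the same length and the
   same elements.  Checking just one length would let the second loop
   read past the end of the shorter vector.  */
bool
equal_keys (const struct collection *a, const struct collection *b)
{
  assert (a != NULL && b != NULL);

  if (a->n_references != b->n_references
      || a->n_interposables != b->n_interposables)
    return false;

  for (size_t i = 0; i < a->n_references; i++)
    if (a->references[i] != b->references[i])
      return false;

  for (size_t i = 0; i < a->n_interposables; i++)
    if (a->interposables[i] != b->interposables[i])
      return false;

  return true;
}
```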

Thanks,
Martin
From 3d03fb28ec21b6ed30d5179bd70aba79d246cd26 Mon Sep 17 00:00:00 2001
From: mliska 
Date: Fri, 27 Feb 2015 16:35:31 +0100
Subject: [PATCH] Fix missing condition in symbol_compare_hashmap_traits.

gcc/ChangeLog:

2015-02-27  Martin Liska  

	* ipa-icf.h (struct symbol_compare_hashmap_traits): Add missing
	vector length condition.
---
 gcc/ipa-icf.h | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/gcc/ipa-icf.h b/gcc/ipa-icf.h
index 9e76239..077267c 100644
--- a/gcc/ipa-icf.h
+++ b/gcc/ipa-icf.h
@@ -110,7 +110,8 @@ struct symbol_compare_hashmap_traits: default_hashmap_traits
   equal_keys (const symbol_compare_collection *a,
 	  const symbol_compare_collection *b)
   {
-if (a->m_references.length () != b->m_references.length ())
+if (a->m_references.length () != b->m_references.length ()
+	|| a->m_interposables.length () != b->m_interposables.length ())
   return false;
 
 for (unsigned i = 0; i < a->m_references.length (); i++)
-- 
2.1.2

Re: [PATCH, libsanitizer] Enable for PowerPC little endian

2015-02-27 Thread Peter Bergner

On Fri, 2015-02-27 at 15:30 +0100, Markus Trippelsdorf wrote:
> On 2015.02.27 at 07:47 -0600, Peter Bergner wrote:
> > 
> > Ok, since the results met your criteria for inclusion, I committed the
> > change as revision 221060.  Thanks.
> 
> Are there any plans to fix:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63927 ?

This is the first time I've seen this.  I haven't heard of this being
an issue before either.  That said, we'll have a look into this, but
probably not in time for GCC 5.0, since we have other real bugs we're
working on that have to get fixed first.

Peter

Re: [gofrontend-dev] Re: Go patch committed: Don't strip Go programs

2015-02-27 Thread Ian Lance Taylor

On Fri, Feb 27, 2015 at 7:21 AM,   wrote:
>
> As discussed in this bugzilla, the debug info from libgo should not be
> stripped or some things won't work as documented, like runtime.Callers. Is
> that information documented anywhere so that anyone who builds gccgo and
> libgo and provides it to others is aware of this?

I'm not aware of any documentation specifically saying that libgo
should not be stripped.  Do you have any suggestions as to where that
should go?

Ian

Re: [PATCH 01/36] Create libiberty/libiberty.m4, have GDB and GDBserver use it

2015-02-27 Thread Pedro Alves

On 02/09/2015 11:20 PM, Pedro Alves wrote:
> Converting GDB to be a C++ program, I stumbled on 'basename' issues,
> like:
> 

...
 
> So I thought of adding a m4 file that projects that use libiberty can
> source to pull in the autoconf checks that libiberty needs done in
> order to use its public headers.
> 
> Turns out that this has already happened.  Since I first wrote this a
> few months back, libiberty gained more HAVE_DECL_FOO checks even, for
> the strtol & friends replacements.
> 
> Are the libiberty changes OK?

I moved the libiberty.m4 patch to the gdb/ directory instead,
and pushed it, as below, in order to unblock the GDB C++ conversion
series, and make it easier for others to help with the
conversion as well.

I put a libiberty.m4 patch series here:

 https://github.com/palves/gdb/tree/palves/libiberty_m4

that converts gas, gold, ld, gdb and libiberty/ itself to
use libiberty/libiberty.m4.  If people think that's a good
idea, I can post it at some point.


From 07697489f4587e41f4f63aa526c1bd7d2fcd5494 Mon Sep 17 00:00:00 2001
From: Pedro Alves 
Date: Fri, 27 Feb 2015 15:52:02 +
Subject: [PATCH] Create libiberty.m4, have GDB and GDBserver use it
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Converting GDB to be a C++ program, I stumbled on 'basename' issues,
like:

 src/gdb/../include/ansidecl.h:169:64: error: new declaration ‘char* basename(const char*)’
 /usr/include/string.h:597:26: error: ambiguates old declaration ‘const char* basename(const char*)’

which I believe led to this bit in gold's configure.ac:

 dnl We have to check these in C, not C++, because autoconf generates
 dnl tests which have no type information, and current glibc provides
 dnl multiple declarations of functions like basename when compiling
 dnl with C++.
 AC_CHECK_DECLS([basename, ffs, asprintf, vasprintf, snprintf, vsnprintf, strverscmp])

These checks IIUC intend to generate all the HAVE_DECL_FOO symbols
that libiberty.h and ansidecl.h check.

GDB is missing these checks currently, which results in the conflict
shown above.

This adds an m4 file that both GDB and GDBserver's configury use to
pull in the autoconf checks that libiberty clients need done in order
to use libiberty.h/ansidecl.h.
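
As a rough sketch of the pattern these checks support, a header can guard
its fallback declaration on the HAVE_DECL_FOO symbol.  The names below
(HAVE_DECL_EXAMPLE_BASENAME, example_basename) are hypothetical and chosen
so the snippet does not clash with the real basename; in a real tree the
macro would come from config.h as generated by AC_CHECK_DECLS, not be
defined by hand.

```c
#include <assert.h>
#include <string.h>

/* Assumption for this sketch: pretend configure found no declaration.  */
#ifndef HAVE_DECL_EXAMPLE_BASENAME
#define HAVE_DECL_EXAMPLE_BASENAME 0
#endif

#if !HAVE_DECL_EXAMPLE_BASENAME
/* Only declare the function ourselves when no system header did,
   which avoids two conflicting declarations.  */
extern const char *example_basename (const char *path);
#endif

/* Return the part of PATH after the last '/', or PATH itself.  */
const char *
example_basename (const char *path)
{
  assert (path != NULL);
  const char *slash = strrchr (path, '/');
  return slash ? slash + 1 : path;
}
```

When the check is missing, the guard macro is undefined and the fallback
declaration is always emitted, which is how the conflict shown above can
arise.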

gdb/ChangeLog:
2015-02-27  Pedro Alves  

* libiberty.m4: New file.
* acinclude.m4: Include libiberty.m4.
* configure.ac: Call libiberty_INIT.
* config.in, configure: Regenerate.

gdb/gdbserver/
2015-02-27  Pedro Alves  

* acinclude.m4: Include libiberty.m4.
* configure.ac: Call libiberty_INIT.
* config.in, configure: Regenerate.
---
 gdb/ChangeLog  |   7 +
 gdb/gdbserver/ChangeLog|   6 +
 gdb/acinclude.m4   |   3 +
 gdb/config.in  |  45 ++
 gdb/configure  | 269 +++
 gdb/configure.ac   |   2 +
 gdb/gdbserver/acinclude.m4 |   3 +
 gdb/gdbserver/config.in|  41 ++
 gdb/gdbserver/configure| 339 +
 gdb/gdbserver/configure.ac |   2 +
 gdb/libiberty.m4   |  31 +
 11 files changed, 694 insertions(+), 54 deletions(-)
 create mode 100644 gdb/libiberty.m4

diff --git a/gdb/ChangeLog b/gdb/ChangeLog
index 7d3b1ce..dfaad27 100644
--- a/gdb/ChangeLog
+++ b/gdb/ChangeLog
@@ -1,3 +1,10 @@
+2015-02-27  Pedro Alves  
+
+   * libiberty.m4: New file.
+   * acinclude.m4: Include libiberty.m4.
+   * configure.ac: Call libiberty_INIT.
+   * config.in, configure: Regenerate.
+
 2015-02-27  Andreas Arnez  
 
* s390-linux-tdep.c (s390_gcc_target_options): Not just handle
diff --git a/gdb/gdbserver/ChangeLog b/gdb/gdbserver/ChangeLog
index 6bb8950..28b582f 100644
--- a/gdb/gdbserver/ChangeLog
+++ b/gdb/gdbserver/ChangeLog
@@ -1,3 +1,9 @@
+2015-02-27  Pedro Alves  
+
+   * acinclude.m4: Include libiberty.m4.
+   * configure.ac: Call libiberty_INIT.
+   * config.in, configure: Regenerate.
+
 2015-02-26  Pedro Alves  
 
* linux-low.c (linux_wait_1): When incrementing the PC past a
diff --git a/gdb/acinclude.m4 b/gdb/acinclude.m4
index 1f0b574..0ad90e7 100644
--- a/gdb/acinclude.m4
+++ b/gdb/acinclude.m4
@@ -57,6 +57,9 @@ sinclude([../config/zlib.m4])
 
 m4_include([common/common.m4])
 
+dnl For libiberty_INIT.
+m4_include(libiberty.m4)
+
 ## - ##
 ## ANSIfy the C compiler whenever possible.  ##
 ## From Franc,ois Pinard ##
diff --git a/gdb/config.in b/gdb/config.in
index 806cbac..4aaadb5 100644
--- a/gdb/config.in
+++ b/gdb/config.in
@@ -85,6 +85,17 @@
you don't. */
 #undef HAVE_DECL_ADDR_NO_RANDOMIZE
 
+/* Define to 1 if you have the declaration of `asprintf', and to 0 if you
+   don't. */
+#undef HAVE_DECL_ASPRINTF
+
+/* Define to 1 if you have the declaration of `basename(char *)', and to 0 if
+   you don't. */
+#undef HAVE_DECL_BASENAME
+
+/* Define to 1 if you have the

Re: [gofrontend-dev] Re: Go patch committed: Don't strip Go programs

2015-02-27 Thread Lynn A. Boger


At a minimum I think it should be mentioned in libgo/README.

I'll have to ask around to find out where would be best so that anyone
who builds libgo for distribution knows that the debug info should not
be stripped.

I'm not sure about other places.

On 02/27/2015 09:59 AM, Ian Lance Taylor wrote:
> On Fri, Feb 27, 2015 at 7:21 AM,   wrote:
>>
>> As discussed in this bugzilla, the debug info from libgo should not be
>> stripped or some things won't work as documented, like runtime.Callers. Is
>> that information documented anywhere so that anyone who builds gccgo and
>> libgo and provides it to others is aware of this?
>
> I'm not aware of any documentation specifically saying that libgo
> should not be stripped.  Do you have any suggestions as to where that
> should go?
>
> Ian

Re: [PATCH] Fix for PR ipa/64693

2015-02-27 Thread Jan Hubicka

> Hi.
> 
> There's one missing vector comparison. Fix is obvious, ready for trunk?
> 
> Thanks,
> Martin

> From 3d03fb28ec21b6ed30d5179bd70aba79d246cd26 Mon Sep 17 00:00:00 2001
> From: mliska 
> Date: Fri, 27 Feb 2015 16:35:31 +0100
> Subject: [PATCH] Fix missing condition in symbol_compare_hashmap_traits.
> 
> gcc/ChangeLog:
> 
> 2015-02-27  Martin Liska  
> 
>   * ipa-icf.h (struct symbol_compare_hashmap_traits): Add missing
>   vector length condition.
OK
Honza
> ---
>  gcc/ipa-icf.h | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/gcc/ipa-icf.h b/gcc/ipa-icf.h
> index 9e76239..077267c 100644
> --- a/gcc/ipa-icf.h
> +++ b/gcc/ipa-icf.h
> @@ -110,7 +110,8 @@ struct symbol_compare_hashmap_traits: default_hashmap_traits
>equal_keys (const symbol_compare_collection *a,
> const symbol_compare_collection *b)
>{
> -if (a->m_references.length () != b->m_references.length ())
> +if (a->m_references.length () != b->m_references.length ()
> + || a->m_interposables.length () != b->m_interposables.length ())
>return false;
>  
>  for (unsigned i = 0; i < a->m_references.length (); i++)
> -- 
> 2.1.2
>

Re: ipa-icf::merge TLC

2015-02-27 Thread Jan Hubicka

> 
> This caused:
> 
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65237

Hi,
this is the patch I committed.  gcc.dg/attr-noinline.c has a template that
counts the number of calls in the optimized assembler.  Those do not match
if one function is turned into another's wrapper.
gcc.dg/noreturn-7.c misses one warning because we unify the functions before
outputting it.  I think that is OK given that the warning will come out if
the user fixes the first instance.

gcc.dg/ipa/ipa-cp-1.c and gcc.dg/ipa/ipa-cp-2.c were accidental commits from
my work with Martin Jambor, sorry for that.
There is still gcc.target/i386/stackalign/longlong-2.c, which is a real bug
of alignments not being compared.  I noticed that independently yesterday
and asked Martin to add a patch (along with several other details).

Honza

PR ipa/65237
* gcc.dg/attr-noinline.c: Add -fno-ipa-icf
* gcc.dg/noreturn-7.c: Add -fno-ipa-icf.
* gcc.dg/ipa/ipa-cp-1.c: Revert accidental commit.
* gcc.dg/ipa/ipa-cp-2.c: Revert accidental commit.
Index: gcc.dg/attr-noinline.c
===
--- gcc.dg/attr-noinline.c  (revision 221034)
+++ gcc.dg/attr-noinline.c  (working copy)
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -finline-functions" } */
+/* { dg-options "-O2 -finline-functions -fno-ipa-icf" } */
 
 extern int t();
 
Index: gcc.dg/noreturn-7.c
===
--- gcc.dg/noreturn-7.c (revision 221034)
+++ gcc.dg/noreturn-7.c (working copy)
@@ -5,7 +5,7 @@
in presence of tail recursion within a noreturn function.  */
 
 /* { dg-do compile } */
-/* { dg-options "-O2 -Wreturn-type -Wmissing-noreturn" } */
+/* { dg-options "-O2 -Wreturn-type -Wmissing-noreturn -fno-ipa-icf" } */
 
 
 void f(void) __attribute__ ((__noreturn__));
Index: gcc.dg/ipa/ipa-cp-1.c
===
--- gcc.dg/ipa/ipa-cp-1.c   (revision 221034)
+++ gcc.dg/ipa/ipa-cp-1.c   (working copy)
@@ -1,22 +0,0 @@
-/* { dg-do compile } */
-/* { dg-options "-O2 -fdump-ipa-cp"  } */
-int n;
-
-static void
-__attribute__ ((noinline))
-test(void *a)
-{
-  __builtin_memset (a,0,n);
-}
-
-int
-main()
-{
-  int aa;
-  short bb;
-  test (&aa);
-  test (&bb);
-  return 0;
-}
-/* { dg-final { scan-ipa-dump "Alignment 2"  "cp"  } } */
-/* { dg-final { cleanup-ipa-dump "cp" } } */
Index: gcc.dg/ipa/ipa-cp-2.c
===
--- gcc.dg/ipa/ipa-cp-2.c   (revision 221034)
+++ gcc.dg/ipa/ipa-cp-2.c   (working copy)
@@ -1,22 +0,0 @@
-/* { dg-do compile } */
-/* { dg-options "-O2 -fdump-ipa-cp"  } */
-int n;
-
-static void
-__attribute__ ((noinline))
-test(void *a)
-{
-  __builtin_memset (a,0,n);
-}
-
-static __attribute__ ((aligned(16))) int aa[10];
-
-int
-main()
-{
-  test (&aa[1]);
-  test (&aa[3]);
-  return 0;
-}
-/* { dg-final { scan-ipa-dump "Alignment 8, misalignment 4"  "cp"  } } */
-/* { dg-final { cleanup-ipa-dump "cp" } } */

Re: ipa-icf::merge TLC

2015-02-27 Thread Steve Ellcey

On Fri, 2015-02-27 at 03:10 +0100, Jan Hubicka wrote:

> Bootstrapped/regtested x86_64-linux, committed.
> 
> Honza
>   * ipa-icf.c (symbol_compare_collection::symbol_compare_colleciton):
>   Use address_matters_p.

I think this patch is causing an ICE while building glibc on MIPS.  I am
building a toolchain for mips-mti-linux-gnu and when compiling
sysdeps/gnu/siglist.c from glibc for mips64r2 (N32 ABI) I get the
following ICE.

I will try to create a preprocessed source file for this but I wanted
to report it first to see if anyone else is seeing it on other
platforms.

Steve Ellcey
sell...@imgtec.com

../sysdeps/gnu/siglist.c:72:1: internal compiler error: in address_matters_p, at symtab.c:1908
 versioned_symbol (libc, __new_sys_sigabbrev, sys_sigabbrev, GLIBC_2_3_3);
 ^
0x66a080 symtab_node::address_matters_p()
/scratch/sellcey/repos/bootstrap/src/gcc/gcc/symtab.c:1908
0xe7cbe5 ipa_icf::sem_variable::merge(ipa_icf::sem_item*)
/scratch/sellcey/repos/bootstrap/src/gcc/gcc/ipa-icf.c:1443
0xe81ff9 ipa_icf::sem_item_optimizer::merge_classes(unsigned int)
/scratch/sellcey/repos/bootstrap/src/gcc/gcc/ipa-icf.c:2659
0xe86491 ipa_icf::sem_item_optimizer::execute()
/scratch/sellcey/repos/bootstrap/src/gcc/gcc/ipa-icf.c:1923
0xe885a1 ipa_icf_driver
/scratch/sellcey/repos/bootstrap/src/gcc/gcc/ipa-icf.c:2738
0xe885a1 ipa_icf::pass_ipa_icf::execute(function*)
/scratch/sellcey/repos/bootstrap/src/gcc/gcc/ipa-icf.c:2785
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See  for instructions.
make[2]: *** [/scratch/sellcey/repos/bootstrap/obj-mips-mti-linux-gnu/glibc/obj_mips64r2/stdio-common/siglist.os] Error 1

Re: ipa-icf::merge TLC

2015-02-27 Thread Steve Ellcey

On Fri, 2015-02-27 at 09:33 -0800, Steve Ellcey wrote:
> On Fri, 2015-02-27 at 03:10 +0100, Jan Hubicka wrote:
> 
> > Bootstrapped/regtested x86_64-linux, comitted.
> > 
> > Honza
> > * ipa-icf.c (symbol_compare_collection::symbol_compare_colleciton):
> > Use address_matters_p.
> 
> I think this patch is causing an ICE while building glibc on MIPS.  I am
> building a toolchain for mips-mti-linux-gnu and when compiling
> sysdeps/gnu/siglist.c from glibc for mips64r2 (N32 ABI) I get the
> following ICE.
> 
> I will try to create a preprocessed source file for this but I wanted
> to report it first to see if anyone else is seeing it on other
> platforms.
> 
> Steve Ellcey
> sell...@imgtec.com

Following up to my own email.  I can reproduce this with the following
cut down test case if I compile with '-O2 -fmerge-all-constants' on
MIPS.

extern const char *const _sys_siglist[128];
const char *const __new_sys_siglist[128] = { };
extern __typeof (_sys_siglist) __EI__sys_siglist __attribute__((alias ("" 
"__new_sys_siglist")));
extern __typeof (__new_sys_siglist) _new_sys_siglist __attribute__ ((alias 
("__new_sys_siglist")));

Steve Ellcey
sell...@imgtec.com

RE: [PATCH] Remove inefficient branchless conditional negate optimization

2015-02-27 Thread Wilco Dijkstra

> Richard Biener wrote: 
> On Thu, Feb 26, 2015 at 11:20 PM, Jeff Law  wrote:
> > On 02/26/15 10:30, Wilco Dijkstra wrote:
> >>
> >> Several GCC versions ago a conditional negate optimization was introduced
> >> as a workaround for
> >> PR45685. However the branchless expansion for conditional negate is
> >> extremely inefficient on most
> >> targets (5 sequentially dependent instructions rather than 2 on AArch64).
> >> Since the underlying issue
> >> has been resolved (the example in PR45685 no longer generates a branch on
> >> x64), remove the
> >> workaround so that conditional negates are treated in exactly the same way
> >> as conditional invert,
> >> add, subtract, and, orr, xor etc. Simple example:
> >>
> >> int f(int x) { if (x > 3) x = -x; return x; }
> >
> > You need to bootstrap and regression test the change before it can be
> > approved.
> 
> As Jeff added a testcase for the PHI opt transform to happen I'm sure
> testing would have shown this as fallout.

Yes that's the only test that starts to fail. I've changed it to scan
for cmov/csel instead. Bootstrap is fine for AArch64 and x64 of course.

Should the test be moved somewhere else now it is no longer a tree-ssa test?

> > You should turn this little example into a testcase.  It's fine with me if
> > this new test is ARM specific.
> >
> >
> > You should also find a way to change the test gcc.dg/tree-ssa/pr45685.c in
> > such a way that it ensures there aren't any undesirable branches.
>
> I'd be also interested in results of vectorizing a loop with a
> conditional negate.
> I can very well imagine reverting this patch causing code quality regressions
> there.

Well vectorized code improves in the same way as you'd expect:

void f(int *p, int *q) 
{ 
  int i; 
  for (i = 0; i < 1000; i++) p[i] = (q[i] > 3) ? -q[i] : q[i]; 
}

Before:
.L6:
vmovdqa (%r9,%rax), %ymm2
addl    $1, %edx
vpcmpgtd %ymm5, %ymm2, %ymm0
vpand   %ymm4, %ymm0, %ymm1
vpsubd  %ymm1, %ymm3, %ymm0
vpxor   %ymm0, %ymm2, %ymm0
vpaddd  %ymm0, %ymm1, %ymm0
vmovups %xmm0, (%rcx,%rax)
vextracti128 $0x1, %ymm0, 16(%rcx,%rax)
addq    $32, %rax
cmpl    %r8d, %edx
jb      .L6

After:
.L6:
vmovdqa (%r9,%rax), %ymm0
addl    $1, %edx
vpcmpgtd %ymm4, %ymm0, %ymm2
vpsubd  %ymm0, %ymm3, %ymm1
vpblendvb %ymm2, %ymm1, %ymm0, %ymm0
vmovups %xmm0, (%rcx,%rax)
vextracti128 $0x1, %ymm0, 16(%rcx,%rax)
addq    $32, %rax
cmpl    %r8d, %edx
jb      .L6
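
To make the two loops easier to compare, here is a minimal C sketch of what
each one computes per element.  It assumes that in the "Before" loop %ymm5
holds the constant 3, %ymm4 holds 1 and %ymm3 holds 0 in every lane; the
listing does not show how those registers are set up, so that reading is an
assumption.  The function names are made up for this illustration.

```c
/* Select form: compute -x and pick between x and -x.
   This corresponds to compare + negate + blend (or csel/cmov).  */
int cond_negate_select(int x)
{
    return (x > 3) ? -x : x;
}

/* Branchless arithmetic form, mirroring the "Before" sequence:
   c = (x > 3); result = (x ^ -c) + c.
   When c is 1 this is ~x + 1, i.e. -x; when c is 0 it is x.  */
int cond_negate_branchless(int x)
{
    int c = x > 3;          /* compare, then mask down to 0 or 1 */
    int mask = -c;          /* 0 or -1 (all bits set) */
    return (x ^ mask) + c;  /* xor, then add */
}
```

Every step of the second function depends on the previous one, which is
consistent with the long dependency chain mentioned at the start of the
thread; the select form needs only the compare, the negate and the select.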

> > I've got enough history to know this is fixing a regression of sorts for the
> > ARM platform.  So once the issues above are addressed it can go forward even
> > without a BZ noting the regression.
> 
> But I'd say this is stage1 material at this point.

I suppose it doesn't matter as it'll have to be backported anyway.

Wilco

Re: ipa-icf::merge TLC

2015-02-27 Thread Martin Liška

On 02/27/2015 07:04 PM, Steve Ellcey wrote:
> Following up to my own email.  I can reproduce this with the following
> cut down test case if I compile with '-O2 -fmerge-all-constants' on
> MIPS.
> 
> extern const char *const _sys_siglist[128];
> const char *const __new_sys_siglist[128] = { };
> extern __typeof (_sys_siglist) __EI__sys_siglist __attribute__((alias ("" 
> "__new_sys_siglist")));
> extern __typeof (__new_sys_siglist) _new_sys_siglist __attribute__ ((alias 
> ("__new_sys_siglist")));
> 
> Steve Ellcey
> sell...@imgtec.com

Hello.

I've just created PR65245, where I've attached a suggested patch that I'm testing.

Thanks,
Martin

[PATCH] ubsan: improve bounds checking, add -fsanitize=bounds-strict

2015-02-27 Thread Martin Uecker


I tested Marek's proposed change and it works correctly,
i.e. arrays which are not part of a struct are now
instrumented when accessed through a pointer. This also
means that the following case is diagnosed (correctly)
as undefined behaviour as pointed out by Richard:

int
main (void)
{
  int *t = (int *) __builtin_malloc (sizeof (int) * 9);
  int (*a)[3][3] = (int (*)[3][3])t;
  (*a)[0][9] = 1;
}


I also wanted arrays which are the last elements of a
struct, but which are not flexible-array members, to be
instrumented correctly. So I added -fsanitize=bounds-strict,
which does this. It seems to do instrumentation similar to clang
with -fsanitize=bounds.
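
As a rough illustration of the resulting behaviour, the decision can be
modelled as a small predicate.  This is only a simplified sketch, not the
code in c-ubsan.c: the struct, its field names and would_instrument are
hypothetical, and the real check looks at tree nodes and also considers how
the enclosing aggregate is nested.

```c
#include <stdbool.h>

/* Hypothetical summary of one array access, for this sketch only.  */
struct array_access {
    bool has_known_bound;   /* the array type has a known maximum index */
    bool is_struct_member;  /* the array is a field of a struct */
    bool base_via_pointer;  /* the object is reached through a pointer */
    bool is_last_member;    /* no field follows the array in its struct */
};

/* Returns true if the access would get a bounds check in this model.  */
bool would_instrument(struct array_access a, bool strict)
{
    /* Flexible array members and other unknown bounds are never checked.  */
    if (!a.has_known_bound)
        return false;

    /* Default mode keeps the exemption for a trailing array in a struct
       that is accessed through a pointer; strict mode drops it.  */
    if (!strict && a.is_struct_member && a.base_via_pointer
        && a.is_last_member)
        return false;

    return true;
}
```

In this model the pointer-to-array case from the example above (not a struct
member, reached through a pointer) is checked in both modes, while a trailing
array such as "int data[1]" at the end of a struct is checked only with
-fsanitize=bounds-strict.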

Comments?

(regression testing in progress, but ubsan-related
tests all pass)

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index ec2cb69..cb6df20 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,11 @@
+2015-02-27  Martin Uecker 
+
+   * opts.c(common_handle_option): Add option for
+   -fsanitize=bounds-strict
+   * flag-types.h: Add SANITIZE_BOUNDS_STRICT
+   * doc/invoke.texi: Improve description for
+   -fsanitize=bounds and document -fsanitize=bounds-strict
+
diff --git a/gcc/c-family/ChangeLog b/gcc/c-family/ChangeLog
index ffa01c6..44a1761 100644
--- a/gcc/c-family/ChangeLog
+++ b/gcc/c-family/ChangeLog
@@ -1,3 +1,9 @@
+2015-02-27  Martin Uecker 
+
+   * c-ubsan.c (ubsan_instrument_bounds): Instrument
+   arrays which are accessed directly through a pointer.
+   For strict checking, instrument last elements of a struct.
+
diff --git a/gcc/c-family/c-ubsan.c b/gcc/c-family/c-ubsan.c
index 90d59c0..1a0e2da 100644
--- a/gcc/c-family/c-ubsan.c
+++ b/gcc/c-family/c-ubsan.c
@@ -293,6 +293,7 @@ ubsan_instrument_bounds (location_t loc, tree array, tree *index,
   tree type = TREE_TYPE (array);
   tree domain = TYPE_DOMAIN (type);
 
+  /* This also takes care of flexible array members. */
   if (domain == NULL_TREE || TYPE_MAX_VALUE (domain) == NULL_TREE)
 return NULL_TREE;
 
@@ -301,10 +302,13 @@ ubsan_instrument_bounds (location_t loc, tree array, tree *index,
 bound = fold_build2 (PLUS_EXPR, TREE_TYPE (bound), bound,
 build_int_cst (TREE_TYPE (bound), 1));
 
-  /* Detect flexible array members and suchlike.  */
+  /* Don't instrument arrays which are the last element of
+ a struct. */
   tree base = get_base_address (array);
-  if (base && (TREE_CODE (base) == INDIRECT_REF
-  || TREE_CODE (base) == MEM_REF))
+  if (!(flag_sanitize & SANITIZE_BOUNDS_STRICT)
+  && (TREE_CODE (array) == COMPONENT_REF)
+  && base && (TREE_CODE (base) == INDIRECT_REF
+ || TREE_CODE (base) == MEM_REF))
 {
   tree next = NULL_TREE;
   tree cref = array;
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index ef4cc75..5a93757 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -5704,8 +5704,15 @@ a++;
 @item -fsanitize=bounds
 @opindex fsanitize=bounds
 This option enables instrumentation of array bounds.  Various out of bounds
-accesses are detected.  Flexible array members and initializers of variables
-with static storage are not instrumented.
+accesses are detected.  Accesses to arrays which are the last member of a
+struct and initializers of variables with static storage are not instrumented.
+
+@item -fsanitize=bounds-strict
+@opindex fsanitize=bounds-strict
+This option enables strict instrumentation of array bounds.  Most out of bounds
+accesses are detected including accesses to arrays which are the last member of a
+struct. Initializers of variables with static storage are not instrumented.
+
 
 @item -fsanitize=alignment
 @opindex fsanitize=alignment
diff --git a/gcc/flag-types.h b/gcc/flag-types.h
index bfdce44..c9ad4df 100644
--- a/gcc/flag-types.h
+++ b/gcc/flag-types.h
@@ -238,6 +238,7 @@ enum sanitize_code {
   SANITIZE_RETURNS_NONNULL_ATTRIBUTE = 1UL << 19,
   SANITIZE_OBJECT_SIZE = 1UL << 20,
   SANITIZE_VPTR = 1UL << 21,
+  SANITIZE_BOUNDS_STRICT = 1UL << 22,
   SANITIZE_UNDEFINED = SANITIZE_SHIFT | SANITIZE_DIVIDE | SANITIZE_UNREACHABLE
   | SANITIZE_VLA | SANITIZE_NULL | SANITIZE_RETURN
   | SANITIZE_SI_OVERFLOW | SANITIZE_BOOL | SANITIZE_ENUM
@@ -245,7 +246,7 @@ enum sanitize_code {
   | SANITIZE_NONNULL_ATTRIBUTE
   | SANITIZE_RETURNS_NONNULL_ATTRIBUTE
   | SANITIZE_OBJECT_SIZE | SANITIZE_VPTR,
-  SANITIZE_NONDEFAULT = SANITIZE_FLOAT_DIVIDE | SANITIZE_FLOAT_CAST
+  SANITIZE_NONDEFAULT = SANITIZE_FLOAT_DIVIDE | SANITIZE_FLOAT_CAST | SANITIZE_BOUNDS_STRICT
 };
 
 /* flag_vtable_verify initialization levels. */
diff --git a/gcc/opts.c b/gcc/opts.c
index 39c190d..7fe77fa 100644
--- a/gcc/opts.c
+++ b/gcc/opts.c
@@ -1584,6 +1584,7 @@ common_handle_option (struct gcc_options *opts,
  { "float-cast-overflow", SANITIZE_FLOAT_CAST,
sizeof "float-cast-overflow" - 1 },
  { "bounds", SANITIZE_BOUNDS,

Re: [PATCH] Fix removing of df problem in df_finish_pass

2015-02-27 Thread Bernhard Reutner-Fischer

On February 27, 2015 12:42:43 PM GMT+01:00, Thomas Preud'homme 
 wrote:
>Hi,
>
>In df_finish_pass, optional problems are removed manually making non
>null
>entries in df->problems_in_order non contiguous. This may lead to null
>pointer
>dereference when accessing all problems from df->problems_in_order[0]
>to
>df->problems_in_order[df->num_problems_defined - 1] and miss some other
>problems. Such a scenario was actually encountered when working on a
>patch.
>This patch use the existing function df_remove_problem to do the
>deletion,
>which require iterating on problems via the df->problems_by_index[]
>array
>since each call mess up with df->num_problems_defined and order of
>problems in df->problems_in_order[].
>
>ChangeLog entry is as follows:
>
>2015-02-12  Thomas Preud'homme  
>
> * df-core.c (df_finish_pass): Iterate over df->problems_by_index[] and
>   use df_remove_problem rather than manually removing problems, living

leaving

Thanks,

>holes in df->problems_in_order[].
>
>diff --git a/gcc/df-core.c b/gcc/df-core.c
>index 82f1364..67040a1 100644
>--- a/gcc/df-core.c
>+++ b/gcc/df-core.c
>@@ -642,7 +642,6 @@ void
> df_finish_pass (bool verify ATTRIBUTE_UNUSED)
> {
>   int i;
>-  int removed = 0;
> 
> #ifdef ENABLE_DF_CHECKING
>   int saved_flags;
>@@ -658,21 +657,15 @@ df_finish_pass (bool verify ATTRIBUTE_UNUSED)
>   saved_flags = df->changeable_flags;
> #endif
> 
>-  for (i = 0; i < df->num_problems_defined; i++)
>+  /* We iterate over problems by index as each problem removed will
>+ lead to problems_in_order to be reordered.  */
>+  for (i = 0; i < DF_LAST_PROBLEM_PLUS1; i++)
> {
>-  struct dataflow *dflow = df->problems_in_order[i];
>-  struct df_problem *problem = dflow->problem;
>+  struct dataflow *dflow = df->problems_by_index[i];
> 
>-  if (dflow->optional_p)
>-  {
>-gcc_assert (problem->remove_problem_fun);
>-(problem->remove_problem_fun) ();
>-df->problems_in_order[i] = NULL;
>-df->problems_by_index[problem->id] = NULL;
>-removed++;
>-  }
>+  if (dflow && dflow->optional_p)
>+  df_remove_problem (dflow);
> }
>-  df->num_problems_defined -= removed;
> 
>   /* Clear all of the flags.  */
>   df->changeable_flags = 0;
>
>
>Testsuite was run with a bootstrapped x86_64 native compiler and an
>arm-none-eabi GCC cross-compiler targetting Cortex-M3 without any
>regression.
>
>Although the problem is real, it doesn't seem that GCC hits it now
>(I stumbled upon it while working on a patch). Therefore I'm not sure
>if this should go in stage4 or not. Please advise me on this.
>
>Ok for trunk/stage1?
>
>Best regards,
>
>Thomas

Re: [gofrontend-dev] Re: Go patch committed: Don't strip Go programs

2015-02-27 Thread Matthias Klose

On 02/27/2015 04:59 PM, Ian Lance Taylor wrote:
> On Fri, Feb 27, 2015 at 7:21 AM,   wrote:
>>
>> As discussed in this bugzilla, the debug info from libgo should not be
>> stripped or some things won't work as documented, like runtime.Callers. Is
>> that information documented anywhere so that anyone who builds gccgo and
>> libgo and provides it to others is aware of this?
> 
> I'm not aware of any documentation specifically saying that libgo
> should not be stripped.  Do you have any suggestions as to where that
> should go?

Is there anything which could be stripped without sacrificing functionality?
Linux distributions usually strip things by default, so a hint what exactly is
needed to keep the functionality would be appreciated.

thanks, Matthias

Re: [gofrontend-dev] Re: Go patch committed: Don't strip Go programs

2015-02-27 Thread Ian Lance Taylor

On Fri, Feb 27, 2015 at 12:07 PM, Matthias Klose  wrote:
>
> is there anything which could be stripped without sacrificing functionality?
> Linux distributions usually strip things by default, so a hint what exactly is
> needed to keep the functionality would be appreciated.

What is needed is file/line information.  However, I don't know of an
option to strip that discards most debug info but keeps file/line
info.  The gold linker can do it (--strip-debug-non-line) but that
obviously would have to be used when building the library; it doesn't
help at install time.

Ian

Re: ipa-icf::merge TLC

2015-02-27 Thread Jan Hubicka

> 
> ../sysdeps/gnu/siglist.c:72:1: internal compiler error: in address_matters_p, 
> at symtab.c:1908
>  versioned_symbol (libc, __new_sys_sigabbrev, sys_sigabbrev, GLIBC_2_3_3);
>  ^
> 0x66a080 symtab_node::address_matters_p()
> /scratch/sellcey/repos/bootstrap/src/gcc/gcc/symtab.c:1908
> 0xe7cbe5 ipa_icf::sem_variable::merge(ipa_icf::sem_item*)
> /scratch/sellcey/repos/bootstrap/src/gcc/gcc/ipa-icf.c:1443

Indeed, ipa-icf should not try to analyze aliases; it should just prove
equivalence of the definitions they are attached to.  It already does that
for functions (a bit by accident; it gives up when there is no gimple body),
but it does not do that for variables because it gets into ctor_for_folding.
For that reason it sometimes decides to try to make two variable aliases
aliases of each other, which is not a good idea because it can create loops.

I am just discussing the fix with Martin.

Honza

[RFC/patch for stage1] Embed compiler dumps into generated .o files (was Re: Obscure crashes due to gcc 4.9 -O2 => -fisolate-erroneous-paths-dereference)

2015-02-27 Thread David Malcolm

On Thu, 2015-02-26 at 11:17 -0500, David Malcolm wrote:
> On Fri, 2015-02-20 at 10:29 -0700, Jeff Law wrote:
> > On 02/19/15 14:56, Chris Johns wrote:
> > > On 20/02/2015 8:23 am, Joel Sherrill wrote:
> > >>
> > >> On 2/19/2015 2:56 PM, Sandra Loosemore wrote:
> > >>> Jakub Jelinek wrote:
> >  On Wed, Feb 18, 2015 at 11:21:56AM -0800, Jeff Prothero wrote:
> > > Starting with gcc 4.9, -O2 implicitly invokes
> > >
> > >  -fisolate-erroneous-paths-dereference:
> > >
> > > which
> > >
> > >  https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html
> > >
> > > documents as
> > >
> > >  Detect paths that trigger erroneous or undefined behavior due to
> > >  dereferencing a null pointer. Isolate those paths from the
> > > main control
> > >  flow and turn the statement with erroneous or undefined
> > > behavior into a
> > >  trap. This flag is enabled by default at -O2 and higher.
> > >
> > > This results in a sizable number of previously working embedded
> > > programs mysteriously
> > > crashing when recompiled under gcc 4.9.  The problem is that embedded
> > > programs will often have ram starting at address zero (think
> > > hardware-defined
> > > interrupt vectors, say) which gets initialized by code which the
> > > > -fisolate-erroneous-paths-dereference logic can recognize as reading
> > > and/or
> > > writing address zero.
> >  If you have some pages mapped at address 0, you really should
> >  compile your
> >  code with -fno-delete-null-pointer-checks, otherwise you can run
> >  into tons
> >  of other issues.
> > >>> H,  Passing the additional option in user code would be one thing,
> > >>> but what about library code?  E.g., using memcpy (either explicitly or
> > >>> implicitly for a structure copy)?
> > >>>
> > >>> It looks to me like cr16 and avr are currently the only architectures
> > >>> that disable flag_delete_null_pointer_checks entirely, but I am sure
> > >>> that this issue affects other embedded targets besides nios2, too.  E.g.
> > >>> scanning Mentor's ARM board support library, I see a whole pile of
> > >>> devices that have memory mapped at address zero (TI Stellaris/Tiva,
> > >>> Energy Micro EFM32Gxxx,  Atmel AT91SAMxxx, ).  Plus our simulator
> > >>> BSPs assume a flat address space starting at address 0.
> > >> I forwarded this to the RTEMS list and was promptly pointed to a patch
> > >> on a Coldfire BSP where someone worked around this behavior.
> > >>
> > >> We are discussing how to deal with this. It is likely OK in user code but
> > >> horrible in BSP and driver code. We don't have a solution ourselves. We
> > >> just recognize it impacts a number of targets.
> > >>
> > >
> > > My main concern is not knowing the trap has been added to the code. If I
> > > could build an application and audit it somehow then I can manage it. We
> > > have a similar issue with the possible use of FP registers being used in
> > > general code (ISR save/restore trade off).
> > >
> > > Can the ELF be annotated in some GCC specific way that makes it to the
> > > final executable to flag this is happening ? We can then create tools to
> > > audit the executables.
> > Not really, for a variety of reasons.
> 
> Is information on this reaching the pass-specific dumpfile?  I don't see
> any explicit dumping in gimple-ssa-isolate-paths.c, but I guess that
> insert_trap_and_remove_trailing_statements could log itself to the
> dumpfile, or use the statistics framework (which itself also reaches a
> dumpfile).
> 
> Assuming the info is reaching a dumpfile, could gcc have an option to
> write its dumpfiles into a special ELF section in the .s, rather than to
> disk?
> 
> Then (given a suitable new option to e.g. eu-readelf) you'd be able to
> read the dumpfiles from a .o file, and (handwaving about linkage) from
> an executable or library.
> 
> Not that I'm volunteering...

Perhaps foolishly I had a go at prototyping this; attached is a
proof-of-concept patch (albeit with FIXMEs and no ChangeLog or testsuite
coverage).

When writing out the final asm, each dumpfile is read, and embedded into
its own section.  Manual review of the built .o file shows that the
dumpfiles make it into them, e.g.:

$ eu-readelf -x .note.GNU-dump.tree-switchconv smoketest.o|head

Hex dump of section [28] '.note.GNU-dump.tree-switchconv', 2698 bytes at offset 0xf021:
  0x 0a3b3b20 46756e63 74696f6e 20746573 .;; Function tes
  0x0010 745f7068 69202874 6573745f 7068692c t_phi (test_phi,
  0x0020 2066756e 63646566 5f6e6f3d 302c2064  funcdef_no=0, d
  0x0030 65636c5f 7569643d 31383332 2c206367 ecl_uid=1832, cg
  0x0040 72617068 5f756964 3d302c20 73796d62 raph_uid=0, symb
  0x0050 6f6c5f6f 72646572 3d30290a 0a746573 ol_order=0)..tes
  0x0060 745f7068 69202869 6e742069 2c20696e t_phi (int i, in
  0x0070 74206a29 0a7b0a20 20696e74 206b3b0a t j).{.  int
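
For readers who want to try the general idea without the patch, text can be
placed in a dedicated ELF section from plain C and read back.  This is only
a sketch under assumptions, not what the proof-of-concept patch does: it
relies on the GCC/Clang section and used attributes and on GNU ld defining
__start_<name> and __stop_<name> symbols for sections whose names are valid
C identifiers.  The section name gcc_dump_demo and the dump text are
hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical dump text, placed in its own section of the object file.  */
__attribute__((section("gcc_dump_demo"), used))
static const char demo_dump[] =
    ";; Function test_phi (hypothetical dump text)\n";

/* Provided by the linker for sections named like C identifiers.  */
extern const char __start_gcc_dump_demo[];
extern const char __stop_gcc_dump_demo[];

/* Size in bytes of everything the linker collected into the section.  */
size_t embedded_dump_size(void)
{
    return (size_t)(__stop_gcc_dump_demo - __start_gcc_dump_demo);
}

/* Pointer to the start of the embedded text.  */
const char *embedded_dump_text(void)
{
    return __start_gcc_dump_demo;
}
```

On an ELF target the section also shows up in the object file, so a tool
such as eu-readelf -x gcc_dump_demo can be used to inspect it in the same
way as above.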

Re: [patch, libstdc++] Use explicit relative imports for the pretty printers

2015-02-27 Thread Jonathan Wakely

On 25 February 2015 at 20:22, Matthias Klose wrote:
> When gdb is linked/used with Python 3, import of the pretty printers fails:
>
> Traceback (most recent call last):
>  File
> "/usr/share/gdb/auto-load/usr/lib/i386-linux-gnu/libstdc++.so.6.0.21-gdb.py",
> line 58, in 
>import libstdcxx.v6
>  File
> "/usr/lib/i386-linux-gnu/../../share/gcc-5/python/libstdcxx/v6/__init__.py",
> line 19, in 
>from printers import register_libstdcxx_printers
> ImportError: No module named 'printers'
> [Inferior 1 (process 6130) exited normally]
>
> Python3 doesn't support implicit relative imports anymore.  Use explicit
> relative imports instead.  This syntax is compatible with Python 2.5 and newer
> 2.x versions.  Ok for the trunk?

OK, thanks.

Re: [PR58315] reset inlined debug vars at return-to point

2015-02-27 Thread Alexandre Oliva

On Feb 27, 2015, Petr Machata  wrote:

> Alexandre Oliva  writes:
>> Ok, I looked into it further, after patching dwlocstat to dump
>> per-variable per-range coverage/length info, so as to be able to compare
>> object files more easily.

> If you send me those patches, I can finish them, bind the functionality
> to a command line option, and merge upstream.

Here's what I've got so far.  It's ugly and needs polishing, that I
planned to do once this issue was resolved, but if you feel like beating
me to it, you're surely welcome to ;-)

I'm sure there must be a better way to compute the CIE offset, or even
the base address of the debug info section (at least the "- 11" needs to
be replaced by something more sensible and more portable), but I didn't
try very hard to find it out :-)  Anyway, ideally, dumping it would be
optional; I've added and removed it numerous times depending on the
exact object file I was comparing and what info I sought.

diff --git a/locstats.cc b/locstats.cc
index 8a458cb..0b71ceb 100644
--- a/locstats.cc
+++ b/locstats.cc
@@ -240,7 +240,8 @@ static die_action process_location (Dwarf_Attribute *locattr,
 bool full_implicit,
 std::bitset &die_type,
 mutability_t &mut,
-int &coverage);
+int &coverage,
+bool dump_partials = false);
 
 static die_action process_implicit_pointer (Dwarf_Attribute *locattr,
 	Dwarf_Op *op,
@@ -400,7 +401,8 @@ process_location (Dwarf_Attribute *locattr,
 		  bool full_implicit,
 		  std::bitset &die_type,
 		  mutability_t &mut,
-		  int &coverage)
+		  int &coverage,
+		  bool dump_partials)
 {
   Dwarf_Op *expr;
   size_t len;
@@ -459,6 +461,7 @@ process_location (Dwarf_Attribute *locattr,
 	{
 	  Dwarf_Addr low = rit->first;
 	  Dwarf_Addr high = rit->second;
+	  size_t const covered_before = covered;
 	  length += high - low;
 	  //std::cerr << " " << low << ".." << high << std::endl;
 
@@ -525,6 +528,10 @@ process_location (Dwarf_Attribute *locattr,
 	  if (cover)
 		covered++;
 	}
+
+	  if (dump_partials)
+	std::cout << ' ' << (covered - covered_before)
+		  << '/' << (high - low);
 	}
 
   if (length == 0 || covered == 0)
@@ -562,6 +569,8 @@ is_inlined (Dwarf_Die *die)
 void
 process (Dwarf *dw)
 {
+  const bool dump_partials = true;
+
   // map percentage->occurrences.  Percentage is cov_00..100, where
   // 0..100 is rounded-down integer division.
   std::map tally;
@@ -586,11 +595,16 @@ process (Dwarf *dw)
 for (cu_iterator it = cu_iterator (dw); it != cu_iterator::end (); ++it)
   last_cit = it;
 
+  char *base = NULL;
+
   for (all_dies_iterator it (dw); it != all_dies_iterator::end (); ++it)
 {
   std::bitset die_type;
   Dwarf_Die *die = *it;
 
+  if (!base)
+	base = (char*)die->addr - 11;
+
   if (show_progress)
 	{
 	  cu_iterator cit = it.cu ();
@@ -677,17 +691,29 @@ process (Dwarf *dw)
   if (locattr == NULL)
 	locattr = dwarf_attr (die, DW_AT_const_value, &locattr_mem);
 
-  /*
-  Dwarf_Attribute name_attr_mem,
-	*name_attr = dwarf_attr_integrate (die, DW_AT_name, &name_attr_mem);
-  std::string name = name_attr != NULL
-	? dwarf_formstring (name_attr)
-	: (dwarf_hasattr_integrate (die, DW_AT_artificial)
-	   ? "" : "???");
+  if (dump_partials)
+	{
+	  std::string name;
 
-  std::cerr << "die=" << std::hex << die.offset ()
-		<< " '" << name << '\'';
-  */
+	  for (auto pit = it; pit != all_dies_iterator::end ();
+	   pit = pit.parent ())
+	{
+	  Dwarf_Attribute name_attr_mem,
+		*name_attr = dwarf_attr_integrate (*pit, DW_AT_name, &name_attr_mem);
+	  std::string thisname = name_attr != NULL
+		? dwarf_formstring (name_attr)
+		: (dwarf_hasattr_integrate (*pit, DW_AT_artificial)
+		   ? "" : "???");
+
+	  if (name != "")
+		name = thisname + "::" + name;
+	  else
+		name = thisname;
+	}
+
+	  std::cout // << "die=" << std::hex << (char*)die->addr - base << ' '
+	<< name;
+	}
 
   int coverage;
   mutability_t mut;
@@ -697,7 +723,7 @@ process (Dwarf *dw)
 interested_mutability,
 interested_implicit,
 full_implicit,
-die_type, mut, coverage) != da_ok)
+die_type, mut, coverage, dump_partials) != da_ok)
 	continue;
 	}
   catch (::error &e)
@@ -751,7 +777,7 @@ process (Dwarf *dw)
 
   tally[coverage]++;
   total++;
-  //std::cerr << std::endl;
+  std::cout << std::endl;
 }
 
   if (show_progress)

-- 
Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer

Fix typo in ira-costs.c

2015-02-27 Thread Kugan

Looks like a typo to me. Bootstrapped and regression tested on
x86_64-unknown-linux-gnu with no new regressions.

Is this OK for stage1?

Thanks,
Kugan

gcc/ChangeLog:

2015-02-28  Kugan Vivekanandarajah  

* ira-costs.c (record_operand_costs): Fix typo (remove redundant code).
diff --git a/gcc/ira-costs.c b/gcc/ira-costs.c
index c19f258..1ca156e 100644
--- a/gcc/ira-costs.c
+++ b/gcc/ira-costs.c
@@ -1387,8 +1387,6 @@ record_operand_costs (rtx_insn *insn, enum reg_class *pref)
   rtx dest = SET_DEST (set);
   rtx src = SET_SRC (set);
 
-  dest = SET_DEST (set);
-  src = SET_SRC (set);
   if (GET_CODE (dest) == SUBREG
  && (GET_MODE_SIZE (GET_MODE (dest))
  == GET_MODE_SIZE (GET_MODE (SUBREG_REG (dest)

one more patch for PR64317

2015-02-27 Thread Vladimir Makarov


  The following patch improves inheritance for PR64317 testcase

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64317

  I ran a lot of SPEC2000 benchmarks to get a better default parameter value
for EBB_PROBABILITY_CUTOFF in LRA inheritance. The new default parameter
value improves SPECInt2000 by 0.4% on x86-64 without changing the SPECFP2000
rate.  The code size changes are insignificant (0.002% increase for
SPECInt and 0.01% decrease for SPECFP).


The patch was bootstrapped and tested on x86-64.

Committed as rev.221070.

2015-02-27  Vladimir Makarov  

PR target/64317
* params.def (PARAM_LRA_INHERITANCE_EBB_PROBABILITY_CUTOFF): New.
* params.h (LRA_INHERITANCE_EBB_PROBABILITY_CUTOFF): New.
* lra-constraints.c (EBB_PROBABILITY_CUTOFF): Use
LRA_INHERITANCE_EBB_PROBABILITY_CUTOFF.
(lra_inheritance): Use '<' instead of '<=' for
EBB_PROBABILITY_CUTOFF.
* doc/invoke.texi (lra-inheritance-ebb-probability-cutoff):
Document change.
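
The combined effect of the new default (40 instead of 50) and the switch
from '<=' to '<' can be checked with a small stand-alone sketch.  It assumes
REG_BR_PROB_BASE is 10000; the helper names are made up for the example and
do not exist in lra-constraints.c.

```c
/* Assumed probability scale: 10000 corresponds to 100%.  */
enum { REG_BR_PROB_BASE = 10000 };

/* Cutoff on the REG_BR_PROB_BASE scale for a given percentage.  */
int ebb_probability_cutoff(int percent)
{
    return (REG_BR_PROB_BASE * percent) / 100;
}

/* Old rule: fixed 50% cutoff, stop when probability <= cutoff.  */
int extends_ebb_old(int edge_probability)
{
    return !(edge_probability <= ebb_probability_cutoff(50));
}

/* New rule: parameterized cutoff, stop when probability < cutoff.  */
int extends_ebb_new(int edge_probability, int percent)
{
    return !(edge_probability < ebb_probability_cutoff(percent));
}
```

With these definitions a fall-through edge taken exactly half of the time
(5000) did not extend the EBB under the old rule but does under the new one,
and with the default of 40 an edge at exactly 4000 is now included as well.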

Index: params.def
===
--- params.def  (revision 220916)
+++ params.def  (working copy)
@@ -836,6 +836,11 @@ DEFPARAM (PARAM_LRA_MAX_CONSIDERED_RELOA
  "The max number of reload pseudos which are considered during 
spilling a non-reload pseudo",
  500, 0, 0)
 
+DEFPARAM (PARAM_LRA_INHERITANCE_EBB_PROBABILITY_CUTOFF,
+ "lra-inheritance-ebb-probability-cutoff",
+ "Minimal fall-through edge probability in percentage used to add BB to inheritance EEB in LRA",
+ 40, 0, 100)
+
 /* Switch initialization conversion will refuse to create arrays that are
bigger than this parameter times the number of switch branches.  */
 
Index: params.h
===
--- params.h(revision 220916)
+++ params.h(working copy)
@@ -202,6 +202,8 @@ extern void init_param_values (int *para
   PARAM_VALUE (PARAM_IRA_LOOP_RESERVED_REGS)
 #define LRA_MAX_CONSIDERED_RELOAD_PSEUDOS \
   PARAM_VALUE (PARAM_LRA_MAX_CONSIDERED_RELOAD_PSEUDOS)
+#define LRA_INHERITANCE_EBB_PROBABILITY_CUTOFF \
+  PARAM_VALUE (PARAM_LRA_INHERITANCE_EBB_PROBABILITY_CUTOFF)
 #define SWITCH_CONVERSION_BRANCH_RATIO \
   PARAM_VALUE (PARAM_SWITCH_CONVERSION_BRANCH_RATIO)
 #define LOOP_INVARIANT_MAX_BBS_IN_LOOP \
Index: lra-constraints.c
===
--- lra-constraints.c   (revision 220916)
+++ lra-constraints.c   (working copy)
@@ -154,6 +154,7 @@
 #include "df.h"
 #include "ira.h"
 #include "rtl-error.h"
+#include "params.h"
 #include "lra-int.h"
 
 /* Value of LRA_CURR_RELOAD_NUM at the beginning of BB of the current
@@ -5694,7 +5695,8 @@ inherit_in_ebb (rtx_insn *head, rtx_insn
 /* This value affects EBB forming.  If probability of edge from EBB to
a BB is not greater than the following value, we don't add the BB
to EBB.  */
-#define EBB_PROBABILITY_CUTOFF ((REG_BR_PROB_BASE * 50) / 100)
+#define EBB_PROBABILITY_CUTOFF \
+  ((REG_BR_PROB_BASE * LRA_INHERITANCE_EBB_PROBABILITY_CUTOFF) / 100)
 
 /* Current number of inheritance/split iteration.  */
 int lra_inheritance_iter;
@@ -5740,7 +5742,7 @@ lra_inheritance (void)
  e = find_fallthru_edge (bb->succs);
  if (! e)
break;
- if (e->probability <= EBB_PROBABILITY_CUTOFF)
+ if (e->probability < EBB_PROBABILITY_CUTOFF)
break;
  bb = bb->next_bb;
}
Index: doc/invoke.texi
===
--- doc/invoke.texi (revision 220916)
+++ doc/invoke.texi (working copy)
@@ -10570,6 +10570,14 @@ by this parameter.  The default value of
 the minimal number of registers needed by typical instructions.
 This value is the best found from numerous experiments.
 
+@item lra-inheritance-ebb-probability-cutoff
+LRA tries to reuse values reloaded in registers in subsequent insns.
+This optimization is called inheritance.  EBB is used as a region to
+do this optimization.  The parameter defines a minimal fall-through
+edge probability in percentage used to add BB to inheritance EBB in
+LRA.  The default value of the parameter is 40.  The value was chosen
+from numerous runs of SPEC2000 on x86-64.
+
 @item loop-invariant-max-bbs-in-loop
 Loop invariant motion can be very expensive, both in compilation time and
 in amount of needed compile-time memory, with very large loops.  Loops

Re: [C/C++ PATCH] -Wlogical-not-parentheses tweaks (PR c/65120)

2015-02-27 Thread Jason Merrill


On 02/19/2015 07:03 PM, Jakub Jelinek wrote:

+ /* Avoid warning for !!b == y where b is boolean.  */
+ && (!DECL_P (current.lhs)
+ || TREE_TYPE (current.lhs) == NULL_TREE
+ || TREE_CODE (TREE_TYPE (current.lhs)) != BOOLEAN_TYPE))


There's something wrong here.  If the type is null, trying to check its 
TREE_CODE will SEGV.


Jason

Re: one more patch for PR64317

2015-02-27 Thread Bernhard Reutner-Fischer

On February 27, 2015 11:03:14 PM GMT+01:00, Vladimir Makarov 
 wrote:
>   The following patch improves inheritance for PR64317 testcase
>
>https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64317
>
> I ran a lot SPEC2000 benchmarks to get better default parameter value 
>for EBB_PROBABILITY_CUTOFF in LRA inheritance. The new default
>parameter 
>value improves SPECInt2000 by 0.4% on x86-64 without changing
>SPECFP2000 
>rate.  The code size changes are insignificant (0.002% increase for 
>SPECInt and 0.01% decrease for SPECFP).
>
>The patch was bootstrapped and tested on x86-64.
>
>Committed as rev.221070.


+DEFPARAM (PARAM_LRA_INHERITANCE_EBB_PROBABILITY_CUTOFF,
+ "lra-inheritance-ebb-probability-cutoff",
+ "Minimal fall-through edge probability in percentage used to add BB to inheritance EEB in LRA",
+ 40, 0, 100)

s/EEB/EBB/
?
Thanks,

Re: [PATCH x86_64] Optimize access to globals in "-fpie -pie" builds with copy relocations

2015-02-27 Thread H.J. Lu

On Thu, Dec 4, 2014 at 8:46 AM, H.J. Lu  wrote:
> On Thu, Dec 4, 2014 at 4:44 AM, Uros Bizjak  wrote:
>> On Wed, Dec 3, 2014 at 10:35 PM, H.J. Lu  wrote:
>>
 It would probably help reviewers if you pointed to actual path
 submission [1], which unfortunately contains the explanation in the
 patch itself [2], which further explains that this functionality is
 currently only supported with gold, patched with [3].

 [1] https://gcc.gnu.org/ml/gcc-patches/2014-09/msg00645.html
 [2] https://gcc.gnu.org/ml/gcc-patches/2014-09/txt2CHtu81P1O.txt
 [3] https://sourceware.org/ml/binutils/2014-05/msg00092.html

 After a bit of the above detective work, I think that new gcc option
 is not necessary. The configure should detect if new functionality is
 supported in the linker, and auto-configure gcc to use it when
 appropriate.
>>>
>>> I think GCC option is needed since one can use -fuse-ld= to
>>> change linker.
>>
>> IMO, nobody will use this highly special x86_64-only option. It would
>> be best for gnu-ld to reach feature parity with gold as far as this
>> functionality is concerned. In this case, the optimization would be
>> auto-configured, and would fire automatically, without any user
>> intervention.
>>
>
> Let's do it.  I implemented the same feature in bfd linker on both
> master and 2.25 branch.
>

 +bool
 +i386_binds_local_p (const_tree exp)
 +{
 +  /* Globals marked extern are treated as local when linker copy 
 relocations
 + support is available with -f{pie|PIE}.  */
 +  if (TARGET_64BIT && ix86_copyrelocs && flag_pie
 +  && TREE_CODE (exp) == VAR_DECL
 +  && DECL_EXTERNAL (exp) && !DECL_WEAK (exp))
 +return true;
 +  return default_binds_local_p (exp);
 +}
 +

 It returns true with -fPIE and false without -fPIE.  It is lying to 
 compiler.
 Maybe legitimate_pic_address_disp_p is a better place.
>>
>> Agreed.
>>
>>> Something like this?
>>
>> Yes.
>>
>> OK, if Jakub doesn't have any objections here. Please also add
>> Sriraman as author to ChangeLog entry.
>>
>> Thanks,
>> Uros.
>
> Here is the patch.   OK to install?
>
> Thanks.
>
> --
> H.J.
> ---
> Normally, with -fPIE/-fpie, GCC accesses globals that are extern to the
> module using the GOT.  This is two instructions, one to get the address
> of the global from the GOT and the other to get the value.  If it turns
> out that the global gets defined in the executable at link-time, it still
> needs to go through the GOT as it is too late then to generate a direct
> access.
>
> Examples:
>
> foo.cc
> --
> int a_glob;
> int main () {
>   return a_glob; // defined in this file
> }
>
> With -O2 -fpie -pie, the generated code directly accesses the global via
> PC-relative insn:
>
> 5e0   :
>    mov    0x165a(%rip),%eax        # 1c40
>
> foo.cc
> --
>
> extern int a_glob;
> int main () {
>   return a_glob; // declared extern; not defined in this file
> }
>
> With -O2 -fpie -pie, the generated code accesses global via GOT using
> two memory loads:
>
> 6f0  :
>    mov    0x1609(%rip),%rax        # 1d00 <_DYNAMIC+0x230>
>    mov    (%rax),%eax
>
> This is true even if in the latter case the global was defined in the
> executable through a different file.
>
> Some experiments on Google benchmarks show that the extra memory loads
> affect performance by 1% to 5%.
>
> Solution - Copy Relocations:
>
> When the linker supports copy relocations, GCC can always assume that
> the global will be defined in the executable.  For globals that are truly
> extern (come from shared objects), the linker will create copy relocations
> and have them defined in the executable. Result is that no global access
> needs to go through the GOT and hence improves performance.
>
> This optimization only applies to undefined, non-weak global data.
> Undefined, weak global data access still must go through the GOT.
>
> This patch checks if linker supports PIE with copy reloc, which is
> enabled in gold and bfd linker in binutils 2.25, at configure time
> and enables this optimization if the linker support is available.
>
> gcc/
>
> * configure.ac (HAVE_LD_PIE_COPYRELOC): Defined to 1 if
> Linux/x86-64 linker supports PIE with copy reloc.
> * config.in: Regenerated.
> * configure: Likewise.
>
> * config/i386/i386.c (legitimate_pic_address_disp_p): Allow
> pc-relative address for undefined, non-weak, non-function
> symbol reference in 64-bit PIE if linker supports PIE with
> copy reloc.
>
> * doc/sourcebuild.texi: Document pie_copyreloc target.
>
> gcc/testsuite/
>
> * gcc.target/i386/pie-copyrelocs-1.c: New test.
> * gcc.target/i386/pie-copyrelocs-2.c: Likewise.
> * gcc.target/i386/pie-copyrelocs-3.c: Likewise.
> * gcc.target/i386/pie-copyrelocs-4.c: Likewise.
>
> * lib/target-supports.exp (check_effective_target_pie_copyreloc):
> New procedure.

This caused:

h
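
For readers who want to reproduce the access pattern described in the quoted patch description, here is a minimal sketch. The function name read_glob and the value 42 are illustrative assumptions and do not come from the patch. a_glob is defined in the same block only so that the sketch compiles on its own, which means the compiler can already see that the symbol binds locally. To observe the GOT-based access, the definition would have to be moved into a separate file or a shared library, and the result built with -O2 -fpie -pie as in the description.

```c
/* Declaration as it would appear in a file that only uses the global.  */
extern int a_glob;

/* Hypothetical accessor: under -fpie this is the kind of read whose code
   sequence the patch changes when the definition lives elsewhere.  */
int
read_glob (void)
{
  return a_glob;
}

/* Definition kept here only to make the sketch self-contained.  In the
   scenario under discussion it would sit in another file or in a shared
   object, and the linker would resolve it (possibly via a copy relocation).  */
int a_glob = 42;
```

Comparing the disassembly of read_glob with and without a linker that supports PIE copy relocations should show the one-load versus two-load difference quoted above, assuming the definition has been moved out of this file.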

Re: [PATCH x86_64] Optimize access to globals in "-fpie -pie" builds with copy relocations

2015-02-27 Thread H.J. Lu

On Fri, Feb 27, 2015 at 3:23 PM, H.J. Lu  wrote:
> On Thu, Dec 4, 2014 at 8:46 AM, H.J. Lu  wrote:
>> On Thu, Dec 4, 2014 at 4:44 AM, Uros Bizjak  wrote:
>>> On Wed, Dec 3, 2014 at 10:35 PM, H.J. Lu  wrote:
>>>
> It would probably help reviewers if you pointed to actual patch
> submission [1], which unfortunately contains the explanation in the
> patch itself [2], which further explains that this functionality is
> currently only supported with gold, patched with [3].
>
> [1] https://gcc.gnu.org/ml/gcc-patches/2014-09/msg00645.html
> [2] https://gcc.gnu.org/ml/gcc-patches/2014-09/txt2CHtu81P1O.txt
> [3] https://sourceware.org/ml/binutils/2014-05/msg00092.html
>
> After a bit of the above detective work, I think that new gcc option
> is not necessary. The configure should detect if new functionality is
> supported in the linker, and auto-configure gcc to use it when
> appropriate.

 I think GCC option is needed since one can use -fuse-ld= to
 change linker.
>>>
>>> IMO, nobody will use this highly special x86_64-only option. It would
>>> be best for gnu-ld to reach feature parity with gold as far as this
>>> functionality is concerned. In this case, the optimization would be
>>> auto-configured, and would fire automatically, without any user
>>> intervention.
>>>
>>
>> Let's do it.  I implemented the same feature in bfd linker on both
>> master and 2.25 branch.
>>
>
> +bool
> +i386_binds_local_p (const_tree exp)
> +{
> +  /* Globals marked extern are treated as local when linker copy 
> relocations
> + support is available with -f{pie|PIE}.  */
> +  if (TARGET_64BIT && ix86_copyrelocs && flag_pie
> +  && TREE_CODE (exp) == VAR_DECL
> +  && DECL_EXTERNAL (exp) && !DECL_WEAK (exp))
> +return true;
> +  return default_binds_local_p (exp);
> +}
> +
>
> It returns true with -fPIE and false without -fPIE.  It is lying to 
> compiler.
> Maybe legitimate_pic_address_disp_p is a better place.
>>>
>>> Agreed.
>>>
 Something like this?
>>>
>>> Yes.
>>>
>>> OK, if Jakub doesn't have any objections here. Please also add
>>> Sriraman as author to ChangeLog entry.
>>>
>>> Thanks,
>>> Uros.
>>
>> Here is the patch.   OK to install?
>>
>> Thanks.
>>
>> --
>> H.J.
>> ---
>> Normally, with -fPIE/-fpie, GCC accesses globals that are extern to the
>> module using the GOT.  This is two instructions, one to get the address
>> of the global from the GOT and the other to get the value.  If it turns
>> out that the global gets defined in the executable at link-time, it still
>> needs to go through the GOT as it is too late then to generate a direct
>> access.
>>
>> Examples:
>>
>> foo.cc
>> --
>> int a_glob;
>> int main () {
>>   return a_glob; // defined in this file
>> }
>>
>> With -O2 -fpie -pie, the generated code directly accesses the global via
>> PC-relative insn:
>>
>> 5e0   :
>>    mov    0x165a(%rip),%eax        # 1c40
>>
>> foo.cc
>> --
>>
>> extern int a_glob;
>> int main () {
>>   return a_glob; // declared extern; not defined in this file
>> }
>>
>> With -O2 -fpie -pie, the generated code accesses global via GOT using
>> two memory loads:
>>
>> 6f0  :
>>    mov    0x1609(%rip),%rax        # 1d00 <_DYNAMIC+0x230>
>>    mov    (%rax),%eax
>>
>> This is true even if in the latter case the global was defined in the
>> executable through a different file.
>>
>> Some experiments on Google benchmarks show that the extra memory loads
>> affect performance by 1% to 5%.
>>
>> Solution - Copy Relocations:
>>
>> When the linker supports copy relocations, GCC can always assume that
>> the global will be defined in the executable.  For globals that are truly
>> extern (come from shared objects), the linker will create copy relocations
>> and have them defined in the executable. Result is that no global access
>> needs to go through the GOT and hence improves performance.
>>
>> This optimization only applies to undefined, non-weak global data.
>> Undefined, weak global data access still must go through the GOT.
>>
>> This patch checks if linker supports PIE with copy reloc, which is
>> enabled in gold and bfd linker in binutils 2.25, at configure time
>> and enables this optimization if the linker support is available.
>>
>> gcc/
>>
>> * configure.ac (HAVE_LD_PIE_COPYRELOC): Defined to 1 if
>> Linux/x86-64 linker supports PIE with copy reloc.
>> * config.in: Regenerated.
>> * configure: Likewise.
>>
>> * config/i386/i386.c (legitimate_pic_address_disp_p): Allow
>> pc-relative address for undefined, non-weak, non-function
>> symbol reference in 64-bit PIE if linker supports PIE with
>> copy reloc.
>>
>> * doc/sourcebuild.texi: Document pie_copyreloc target.
>>
>> gcc/testsuite/
>>
>> * gcc.target/i386/pie-copyrelocs-1.c: New test.
>> * gcc.target/i386/pie-copyrelocs-2.c: Likewise.
>> * gcc

Re: one more patch for PR64317

2015-02-27 Thread Vladimir Makarov




On 2015-02-27 5:30 PM, Bernhard Reutner-Fischer wrote:

On February 27, 2015 11:03:14 PM GMT+01:00, Vladimir Makarov 
 wrote:

Committed as rev.221070.


+DEFPARAM (PARAM_LRA_INHERITANCE_EBB_PROBABILITY_CUTOFF,
+ "lra-inheritance-ebb-probability-cutoff",
+ "Minimal fall-through edge probability in percentage used to add BB to 
inheritance EEB in LRA",
+ 40, 0, 100)

s/EEB/EBB/



Fixed.  Thanks for pointing this out.

Re: Fix typo in ira-costs.c

2015-02-27 Thread Vladimir Makarov




On 2015-02-27 4:52 PM, Kugan wrote:

Looks like a typo to me. Bootstrapped and regression tested on
x86_64-unknown-linux-gnu with no new regressions.

Is this OK for stage1?



Yes, Kugan.  Thanks.

Re: [PATCH] PR63175 - [4.9/5 regression] FAIL: gcc.dg/vect/costmodel/ppc/costmodel-bb-slp-9a.c scan-tree-dump-times slp2 "basic block vectorized using SLP" 1

2015-02-27 Thread Martin Sebor


Given that Martin's fix to the testcase allowed it to succeed without
Richi's fix for the underlying problem, is there a modification to the
testcase or a new testcase that would really test the optimization?


Let me work on it.


Below is a patch with a couple of minor tweaks to the existing
test: first, to update the search string, and second, to better
exercise the vectorization not only when the source address
isn't aligned on the expected boundary but also when the
destination address isn't.  This enhancement revealed
an outstanding aspect of the regression (not fixed by Richard's
already committed patch).

Besides this change, the patch also adds a number of other
tests to better exercise the vectorization by verifying it
takes place for arrays of elements of other sizes besides
word: byte, half word, and double word.  Those tests reveal
both another regression WRT 4.8 and further vectorization
opportunities not exploited even in 4.8.  I marked the latter
XFAIL in the tests so that when the regression is fully
resolved, the tests should pass with no unexpected failures.

Martin

diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index cc86e37..4edd559 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,3 +1,15 @@
+2015-02-27  Martin Sebor  
+
+   PR testsuite/63175
+   * gcc.dg/vect/costmodel/ppc/costmodel-bb-slp-9a.c (main1): Rename...
+   (copy_to_unaligned): ...to this and move checking of results into
+   main.
+   (copy_from_unaligned): New function.
+   * costmodel-bb-slp-pr63175-base.c: New test.
+   * costmodel-bb-slp-pr63175-dword.c: New test.
+   * costmodel-bb-slp-pr63175-hword.c: New test.
+   * costmodel-bb-slp-pr63175-word.c: New test.
+
 2015-02-27  Jakub Jelinek  

PR tree-optimization/65048
diff --git a/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-bb-slp-9a.c b/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-bb-slp-9a.c
index e1bc1a8..a2dc367 100644
--- a/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-bb-slp-9a.c
+++ b/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-bb-slp-9a.c
@@ -1,46 +1,63 @@
 /* { dg-require-effective-target vect_int } */

-#include 
 #include "../../tree-vect.h"

 #define N 16

 unsigned int out[N];
-unsigned int in[N] = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15};
+const unsigned int in[N] = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15};

-__attribute__ ((noinline)) int
-main1 (unsigned int x, unsigned int y)
+__attribute__ ((noinline)) void
+copy_to_unaligned (void)
 {
-  int i;
-  unsigned int *pin = &in[1];
+  const unsigned int *pin = &in[1];
   unsigned int *pout = &out[0];
-  unsigned int a0, a1, a2, a3;

   /* Misaligned load.  */
   *pout++ = *pin++;
   *pout++ = *pin++;
   *pout++ = *pin++;
   *pout++ = *pin++;
+}

-  /* Check results.  */
-  if (out[0] != in[1]
-  || out[1] != in[2]
-  || out[2] != in[3]
-  || out[3] != in[4])
-abort();
+__attribute__ ((noinline)) void
+copy_from_unaligned (void)
+{
+  const unsigned int *pin = &in[0];
+  unsigned int *pout = &out[1];

-  return 0;
+  /* Misaligned store.  */
+  *pout++ = *pin++;
+  *pout++ = *pin++;
+  *pout++ = *pin++;
+  *pout++ = *pin++;
 }

 int main (void)
 {
   check_vect ();

-  main1 (2, 3);
+  copy_to_unaligned ();
+
+  /* Check results outside of main1 where it would likely
+ be optimized away.  */
+  if (out[0] != in[1]
+  || out[1] != in[2]
+  || out[2] != in[3]
+  || out[3] != in[4])
+abort();
+
+  copy_from_unaligned ();
+
+  if (out[1] != in[0]
+  || out[2] != in[1]
+  || out[3] != in[2]
+  || out[4] != in[3])
+abort();

   return 0;
 }

-/* { dg-final { scan-tree-dump-times "basic block vectorized using SLP" 1 "slp2"  { xfail  vect_no_align } } } */
+/* { dg-final { scan-tree-dump-times "basic block vectorized" 2 "slp2"  { xfail  vect_no_align } } } */

 /* { dg-final { cleanup-tree-dump "slp2" } } */

diff --git a/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-bb-slp-pr63175-base.c b/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-bb-slp-pr63175-base.c
new file mode 100644
index 000..b94cf0e
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-bb-slp-pr63175-base.c
@@ -0,0 +1,45 @@
+/* { dg-require-effective-target vect_int } */
+/* { dg-do compile } */
+
+#define DEFINE_DATA(T) \
+const T T ## _ ## src[] = { \
+ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15,\
+16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32 }; \
+T T ## _ ## dst[sizeof T ## _ ## src / sizeof (T)]
+
+#define DEFINE_COPY_M_N(T, srcoff, dstoff) \
+void copy_ ## T ## _ ## srcoff ## _ ## dstoff (void) { \
+const T *s = T ## _ ## src + srcoff; \
+T *d = T ## _ ## dst + dstoff; \
+unsigned i; \
+for (i = 0; i != 16 / sizeof *s; ++i) \
+*d++ = *s++; \
+}
+
+#define DEFINE_COPY_M(T, M) \
+DEFINE_COPY_M_N (T, M, 0); \
+DEFINE_COPY_M_N (T, M, 1); \
+DEFINE_COPY_M_N (T, M, 2); \
+

Re: one more patch for PR64317

2015-02-27 Thread Sandra Loosemore


On 02/27/2015 03:03 PM, Vladimir Makarov wrote:

   The following patch improves inheritance for PR64317 testcase

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64317

[snip]
Index: params.def
===
--- params.def  (revision 220916)
+++ params.def  (working copy)
@@ -836,6 +836,11 @@ DEFPARAM (PARAM_LRA_MAX_CONSIDERED_RELOA
  "The max number of reload pseudos which are considered during spilling a 
non-reload pseudo",
  500, 0, 0)

+DEFPARAM (PARAM_LRA_INHERITANCE_EBB_PROBABILITY_CUTOFF,
+ "lra-inheritance-ebb-probability-cutoff",
+ "Minimal fall-through edge probability in percentage used to add BB to 
inheritance EEB in LRA",


s/EEB/EBB/?

I can't say I understand what either of those abbreviations means

-Sandra

Re: [PATCH][AArch64]: Fix rtl type in aarch64.md.

2015-02-27 Thread Xingxing Pan


On 02/27/2015 04:30 PM, Marcus Shawcroft wrote:

On 26 February 2015 at 06:22, Xingxing Pan  wrote:

Hi,

This patch fixes the type of mov_aarch64 in aarch64.md.
Is it OK for trunk?


OK, thank you /Marcus



Hi,

Could someone please apply the patch for me? I don't have SVN write 
access yet.


--
Regards,
Xingxing

Re: one more patch for PR64317

2015-02-27 Thread Vladimir Makarov




On 2015-02-27 8:06 PM, Sandra Loosemore wrote:

On 02/27/2015 03:03 PM, Vladimir Makarov wrote:

   The following patch improves inheritance for PR64317 testcase

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64317

[snip]
Index: params.def
===
--- params.def(revision 220916)
+++ params.def(working copy)
@@ -836,6 +836,11 @@ DEFPARAM (PARAM_LRA_MAX_CONSIDERED_RELOA
   "The max number of reload pseudos which are considered during 
spilling a non-reload pseudo",

   500, 0, 0)

+DEFPARAM (PARAM_LRA_INHERITANCE_EBB_PROBABILITY_CUTOFF,
+  "lra-inheritance-ebb-probability-cutoff",
+  "Minimal fall-through edge probability in percentage used to 
add BB to inheritance EEB in LRA",


s/EEB/EBB/?

I can't say I understand what either of those abbreviations means


EBB is an extended basic block.
EEB is a typo.
Sorry for using the abbreviation, but it is a way to keep the attribute name 
shorter (although it is already too long).
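
To make the meaning of the cutoff concrete, here is a minimal sketch of the decision the help string describes: a block is added to the extended basic block only if the fall-through edge into it is likely enough. The function name and the plain percentage arguments are assumptions made for illustration; this is not the code in LRA, which works on its own probability representation.

```c
/* Hypothetical helper, not LRA's implementation.  Returns nonzero when a
   fall-through edge with the given probability (in percent) is likely
   enough for the next basic block to be added to the current EBB.  */
int
ebb_accepts_fallthru (int fallthru_prob_percent, int cutoff_percent)
{
  return fallthru_prob_percent >= cutoff_percent;
}
```

With the default of 40 from the quoted DEFPARAM, a fall-through edge taken 40% of the time or more would extend the EBB under this reading. The value can be changed with --param lra-inheritance-ebb-probability-cutoff=N, where N is between 0 and 100.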
