date:20150312

RE: [PATCH, FT32] initial support

2015-03-12 Thread James Bowman

> On Mon, 16 Feb 2015, James Bowman wrote:
> 
> > I have updated the target options. Space-saving is now enabled by
> > -Os. There is also a new option -msim to enable building for the
> > simulator (the simulator is pending submission to gdb-binutils).
> 
> The documentation in this patch doesn't seem to have been updated for
> those changes.
> 

Ping.
Also, I have attached an updated patchset for the current gcc. Thanks.

--
James Bowman
FTDI Open Source Liaison


gcc-ft32.txt.gz
Description: gcc-ft32.txt.gz

[PATCH] Backport ubsan fix to 4.9

2015-03-12 Thread Marek Polacek

I'd like to backport the following patch that suppresses bogus ubsan errors.
I had to tweak the testcase a bit since 4.9 doesn't know -fno-sanitize-recover.

Bootstrapped/regtested on x86_64-linux, ok for 4.9?

2015-03-10  Marek Polacek  

Backported from mainline
2014-12-04  Marek Polacek  

PR middle-end/56917
* fold-const.c (fold_unary_loc): Perform the negation in A's type
when transforming ~ (A - 1) or ~ (A + -1) to -A.

* c-c++-common/ubsan/pr56917.c: New test.

--- gcc/fold-const.c
+++ gcc/fold-const.c
@@ -8324,9 +8324,14 @@ fold_unary_loc (location_t loc, enum tree_code code, tree type, tree op0)
&& integer_onep (TREE_OPERAND (arg0, 1)))
   || (TREE_CODE (arg0) == PLUS_EXPR
   && integer_all_onesp (TREE_OPERAND (arg0, 1)
-   return fold_build1_loc (loc, NEGATE_EXPR, type,
-   fold_convert_loc (loc, type,
- TREE_OPERAND (arg0, 0)));
+   {
+ /* Perform the negation in ARG0's type and only then convert
+to TYPE as to avoid introducing undefined behavior.  */
+ tree t = fold_build1_loc (loc, NEGATE_EXPR,
+   TREE_TYPE (TREE_OPERAND (arg0, 0)),
+   TREE_OPERAND (arg0, 0));
+ return fold_convert_loc (loc, type, t);
+   }
   /* Convert ~(X ^ Y) to ~X ^ Y or X ^ ~Y if ~X or ~Y simplify.  */
   else if (TREE_CODE (arg0) == BIT_XOR_EXPR
   && (tem = fold_unary_loc (loc, BIT_NOT_EXPR, type,
--- gcc/testsuite/c-c++-common/ubsan/pr56917.c
+++ gcc/testsuite/c-c++-common/ubsan/pr56917.c
@@ -0,0 +1,43 @@
+/* PR middle-end/56917 */
+/* { dg-do run } */
+/* { dg-options "-fsanitize=undefined" } */
+
+#include <stdio.h>
+
+#define INT_MIN (-__INT_MAX__ - 1)
+#define LONG_MIN (-__LONG_MAX__ - 1L)
+#define LLONG_MIN (-__LONG_LONG_MAX__ - 1LL)
+
+int __attribute__ ((noinline,noclone))
+fn1 (unsigned int u)
+{
+  return (-(int) (u - 1U)) - 1;
+}
+
+long __attribute__ ((noinline,noclone))
+fn2 (unsigned long int ul)
+{
+  return (-(long) (ul - 1UL)) - 1L;
+}
+
+long long __attribute__ ((noinline,noclone))
+fn3 (unsigned long long int ull)
+{
+  return (-(long long) (ull - 1ULL)) - 1LL;
+}
+
+int
+main (void)
+{
+  fputs ("UBSAN TEST START\n", stderr);
+
+  if (fn1 (__INT_MAX__ + 1U) != INT_MIN
+  || fn2 (__LONG_MAX__ + 1UL) != LONG_MIN
+  || fn3 (__LONG_LONG_MAX__ + 1ULL) != LLONG_MIN)
+__builtin_abort ();
+
+  fputs ("UBSAN TEST END\n", stderr);
+  return 0;
+}
+
+/* { dg-output "UBSAN TEST START(\n|\r\n|\r)UBSAN TEST END" } */

Marek
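
The identity behind the fold_unary_loc change can be checked with a small standalone sketch. This is not GCC code, and the function names are made up for illustration. For an unsigned A, ~(A - 1) equals -A when the negation is done in A's own type, where wraparound is well defined. Converting to a signed type first and negating afterwards would overflow for A == INT_MAX + 1, which is the kind of value the testcase above passes in.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper: the original expression, ~(A - 1), in unsigned math.  */
static unsigned int not_of_decrement(unsigned int a)
{
  return ~(a - 1U);
}

/* Hypothetical helper: the folded form, -A, negated in A's unsigned type.
   Unsigned negation wraps modulo 2^N, so no undefined behavior occurs.  */
static unsigned int negate_in_own_type(unsigned int a)
{
  return -a;
}

/* Return 1 if both forms agree for A, 0 otherwise.  */
static int forms_agree(unsigned int a)
{
  return not_of_decrement(a) == negate_in_own_type(a);
}

/* Check the identity on a few boundary values; aborts on a mismatch.  */
static void check_boundary_values(void)
{
  assert(forms_agree(0U));
  assert(forms_agree(1U));
  assert(forms_agree((unsigned int) INT_MAX + 1U));
  assert(forms_agree(UINT_MAX));
}
```

Calling check_boundary_values () from a test driver exercises the value (INT_MAX + 1U) without ever negating a signed INT_MIN.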

Re: [PATCH/AARCH64] Add missing definition of crypto instruction on cortex-a57.md

2015-03-12 Thread Ramana Radhakrishnan

On 11/03/2015 02:11, 박준모 wrote:
> Hi all,
> 
> This patch only affects the ordering of sha2 crypto instructions when gcc
> performs instruction scheduling (rtl-sched1,2).
> 
> There are no definitions for crypto_sha256_fast and crypto_sha256_slow in
> "cortex-a57.md".
> 
> This leads to poor instruction scheduling results when we use sha2 crypto
> instructions.
> 
> The same approach is already applied in "cortex-a53.md", so I think it can
> be applied to GCC 5 (even though we only accept regression fixes).
> 
> Is this ok?

The approach makes sense - however please resubmit the patch with a
Changelog entry and a proper plain text email so that the patch is
archived on the GCC mailing lists.

HTML email is bounced from the lists, please only use plain text when
submitting patches or writing emails to the GCC mailing lists.

regards
Ramana

> Thanks,
> 
> Junmo Park.
>

Re: [PATCH, PR target/65103, 1/3] Fix cost of PIC register in ix86_address_cost

2015-03-12 Thread Uros Bizjak

Hello!

> > > Test       O2 ref   O2 patched   Ofast + LTO ref   Ofast + LTO patched
> > > 164.gzip   12       0 (-100%)    39                0 (-100%)
> > > 175.vpr    0        0 (-0%)      4                 0 (-100%)
> > > 176.gcc    141      6 (-96%)     294               10 (-97%)
> > > 181.mcf    4        0 (-100%)    4                 2 (-50%)

Do you also have executable sizes at hand?

> 2015-03-10  Ilya Enkovich  
>
> PR target/65103
> * config/i386/i386.c (ix86_address_cost): Fix cost of a PIC
> register.
>
> gcc/testsuite/
>
> 2015-03-10  Ilya Enkovich  
>
> PR target/65103
> * gcc.target/i386/pr65103-1.c: New.

LGTM, just a nit below.

Otherwise, OK for mainline as a bugfix (but please wait for a day if
there are any objections from release managers).

+  /* Attempt to minimize number of registers in the address.

This is now a displaced comment. Please integrate it in the main comment.

Thanks,
Uros.

Re: [C++ Patch] PR 65370

2015-03-12 Thread Jason Merrill


On 03/10/2015 01:03 PM, Paolo Carlini wrote:

Good question, but we don't have this issue, because for that we emit
anyway:

65370.C:11:36: error: default argument specified in explicit
specialization [-fpermissive]
  C::C(const C&, bool = false);

nothing changes about that kind of testcase, usual behavior.


Ah.  So here we can ignore any template instantiation or specialization, 
with a comment that check_explicit_specialization will handle them.  But 
I suspect that checking the decl itself will be better; I would expect 
checking the context to lead you to accept


template<> class C {
  template 
  C(const C&, bool);
};

template  C::C(const C&, bool = false);

Since here C is a specialization of C, but the constructor is not 
itself a partial instantiation.


Jason

Re: Fwd: [PATCH]Remve xfail for wrapped target from libstdc++-v3/testsuite/27_io/ios_base/sync_with_stdio/1.cc

2015-03-12 Thread Jonathan Wakely


On 05/02/15 11:28 +, Renlin Li wrote:

Hi all,

This patch simply removes the target selector. The test should pass on all
applicable targets.

The comment in the code is not correct. stderr is redirected, not stdout.
Therefore, the return status which is streamed into stdout should be properly
captured even by a wrapped target.


The history of this test is curious. Paolo changed the redirect to fix
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=14866 but then a year
later Mark added the support for the "unwrapped" effective target to
XFAIL this test, adding the incorrect comment ... even though
presumably it wasn't actually failing after Paolo's fix! Maybe Mark
was merging something from a CodeSourcery branch where the test still
failed.

The "unwrapped" target is used elsewhere in gcc/testsuite so it's
still useful even if we remove it from this libstdc++ test.


Okay for trunk?


OK, thanks.



libstdc++-v3/ChangeLog:

2015-02-03  Renlin Li

* testsuite/27_io/ios_base/sync_with_stdio/1.cc: Remove xfail for
wrapped target.






diff --git a/libstdc++-v3/testsuite/27_io/ios_base/sync_with_stdio/1.cc b/libstdc++-v3/testsuite/27_io/ios_base/sync_with_stdio/1.cc
index 6edaef3..1c9fa60 100644
--- a/libstdc++-v3/testsuite/27_io/ios_base/sync_with_stdio/1.cc
+++ b/libstdc++-v3/testsuite/27_io/ios_base/sync_with_stdio/1.cc
@@ -23,12 +23,6 @@
// @require@ %-*.tst
// @diff@ %-*.tst %-*.txt

-// This test fails on platforms using a wrapper, because this test
-// redirects stdout to a file and so the exit status printed by the
-// wrapper is not visibile to DejaGNU.  DejaGNU then assumes that the
-// test exited with a non-zero exit status.
-// { dg-do run { xfail { ! unwrapped } } }
-
#include 
#include 
#include

[patch] disable libmpx x32 multilib builds

2015-03-12 Thread Matthias Klose

current trunk fails to build on x86*-linux, when configured for x32 multilibs
because libmpx doesn't support these. Disable them.

ok for the trunk?

* Disable libmpx x32 multilib builds.

--- a/config-ml.in
+++ b/config-ml.in
@@ -102,6 +102,7 @@
 Makefile=${ac_file-Makefile}
 ml_config_shell=${CONFIG_SHELL-/bin/sh}
 ml_realsrcdir=${srcdir}
+ml_srcbase=`basename $ml_realsrcdir`

 # Scan all the arguments and set all the ones we need.

@@ -220,6 +221,10 @@
   if [ "${dir}" = "." ]; then
 true
   else
+# libmpx is not supported on x32
+if [ "${ml_srcbase}-${dir}" = libmpx-x32 ]; then
+  continue
+fi
 if [ -z "${multidirs}" ]; then
   multidirs="${dir}"
 else

[PATCH, TSAN] Fix a crash in ScopedReport::AddThread

2015-03-12 Thread Bernd Edlinger

Hi Jakub,


with my OPC UA Server, I observe a reproducible crash in
ScopedReport::AddThread: tctx==NULL
in "if ((u32)rep_->threads[i]->id == tctx->tid)".

Apparently, Dmitry has already fixed that in the obvious way.

So we should cherry pick these two changes from LLVM: 224508 and 224755
See attachment.


Builds cleanly and fixes the problem for me.

OK for trunk?


Thanks
Bernd.
  

patch-tsan-crash.diff
Description: Binary data

Re: [PATCH] PR target/65242, Fix powerpc abort in gen_add2_insn

2015-03-12 Thread Jeff Law


On 03/11/15 08:44, David Edelsohn wrote:

On Mon, Mar 9, 2015 at 7:30 PM, Michael Meissner
 wrote:

This bug was one I unfortunately introduced with the -mupper-regs support.  If
the reload pass needed to reload a PLUS operation (for example, due to using
odd address with the LD/STD instructions), it would go through all of the
registers you could load DImode into, and see if it is a preferred register
class.  This led the compiler to believe it could do integer arithmetic in the
floating point registers.

This patch fixes the problem, by not allowing PLUS to be reloaded into FPR
registers.  I have done bootstraps and make checks on both a big endian Power7
and a little endian Power8 system, and there were no regressions.  Is the patch
ok to apply?  I do not believe it needs to be back ported to GCC 4.9 since the
-mupper-regs changes are not installed currently on that branch.

[gcc]
2015-03-09  Michael Meissner  

 PR target/65242
 * config/rs6000/rs6000.c (rs6000_preferred_reload_class): Do not
 allow reloads of PLUS in floating point/VSX registers.

[gcc/testsuite]
2015-03-09  Michael Meissner  

 PR target/65242
 * g++.dg/pr65242.C: New test.


This is okay.

What about Jeff Law's Bugzilla comment #6 to change ?m to !m in the
movdi_internal64 pattern?  That also seems reasonable.

It doesn't matter much to me either way as long as it gets fixed :-)

Avoiding floating point registers via preferred reload class is a valid 
approach.  My only concern then would be cases where we have similar 
looking arithmetic and even though we no longer prefer the FP classes, 
we still end up selecting that problematical alternative -- say perhaps 
because the pseudos in question have many other uses where FP regs make 
sense.


I know we could get into those kind of situations on the PA because of 
the weird way in which integer multiplies were implemented (FP unit, 
using FP regs) -- which could occur even when using '?' to disparage 
those alternatives.  I'm not familiar enough with PPC implementations to 
know if we can get into that same situation with that port.


Jeff

Re: [PATCH/AARCH64] Add missing definition of crypto instruction on cortex-a57.md

2015-03-12 Thread Ramana Radhakrishnan





Attached patch as text.

2015-03-11  Junmo Park  

 * config/arm/cortex-a57.md (cortex_a57_crypto_simple): Add 
crypto_sha256_fast.
 (cortex_a57_crypto_complex): Add crypto_sha256_slow.

Ok to commit to trunk?




OK, Thanks Sebastian.

regards
Ramana



Thanks,
Sebastian

Re: [PATCH/AARCH64] Add missing definition of crypto instruction on cortex-a57.md

2015-03-12 Thread Sebastian Pop

James Greenhalgh wrote:
> On Wed, Mar 11, 2015 at 04:24:07PM +, Ramana Radhakrishnan wrote:
> > 
> > >
> > > Attached patch as text.
> > >
> > > 2015-03-11  Junmo Park  
> > >
> > >  * config/arm/cortex-a57.md (cortex_a57_crypto_simple): Add 
> > > crypto_sha256_fast.
> > >  (cortex_a57_crypto_complex): Add crypto_sha256_slow.
> > >
> > > Ok to commit to trunk?
> > 
> > 
> > 
> > OK, Thanks Sebastian.
> 
> As far as I can see, this patch still hasn't made it to gcc-patches.
> Could you please send a copy (or a commit revision number), for those
> of us interested?

Committed r221349.

Sebastian

[CHKP, PATCH] Fix instrumented indirect calls with propagated pointers

2015-03-12 Thread Ilya Enkovich

Hi,

An instrumented function pointer may be propagated into a non-instrumented
indirect call and vice versa.  This requires additional call modifications
(either removing bounds or changing the callee).  Bootstrapped and tested on
x86_64-unknown-linux-gnu.  OK for trunk?
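
The "remove bounds" half of that fix amounts to copying the call while dropping the bounds arguments. A minimal sketch of that filtering step, outside of GCC and with hypothetical types (struct call_arg and enum arg_kind are stand-ins, not GIMPLE structures), might look like this:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a call argument: either an ordinary value or a
   bounds argument added by instrumentation.  */
enum arg_kind { ARG_VALUE, ARG_BOUNDS };

struct call_arg
{
  enum arg_kind kind;
  int payload;
};

/* Copy the N arguments in IN to OUT, dropping every bounds argument.
   OUT must have room for N entries.  Returns the number copied.  */
static size_t copy_args_skip_bounds(const struct call_arg *in, size_t n,
                                    struct call_arg *out)
{
  size_t kept = 0;

  assert(in != NULL && out != NULL);
  for (size_t i = 0; i < n; i++)
    if (in[i].kind != ARG_BOUNDS)
      out[kept++] = in[i];
  return kept;
}
```

The patch itself records the positions to skip in a bitmap and reuses gimple_call_copy_skip_args; the sketch folds both steps into one loop for brevity.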


Thanks,
Ilya
--
gcc/

2015-03-12  Ilya Enkovich  

* cgraph.c (cgraph_edge::redirect_call_stmt_to_callee): Add
redirection for instrumented calls.
* tree-chkp.h (chkp_copy_call_skip_bounds): New.
(chkp_redirect_edge): New.
* tree-chkp.c (chkp_copy_call_skip_bounds): New.
(chkp_redirect_edge): New.

gcc/testsuite/

2015-03-12  Ilya Enkovich  

* gcc.target/i386/mpx/chkp-fix-calls-1.c: New.
* gcc.target/i386/mpx/chkp-fix-calls-2.c: New.


diff --git a/gcc/cgraph.c b/gcc/cgraph.c
index 5ca1901..a0b0465 100644
--- a/gcc/cgraph.c
+++ b/gcc/cgraph.c
@@ -1278,14 +1278,25 @@ cgraph_edge::redirect_call_stmt_to_callee (void)
 {
   cgraph_edge *e = this;
 
-  tree decl = gimple_call_fndecl (e->call_stmt);
-  tree lhs = gimple_call_lhs (e->call_stmt);
+  tree decl;
+  tree lhs;
   gcall *new_stmt;
   gimple_stmt_iterator gsi;
+  bool skip_bounds = false;
 #ifdef ENABLE_CHECKING
   cgraph_node *node;
 #endif
 
+  /* We might propagate instrumented function pointer into
+ not instrumented function and vice versa.  In such a
+ case we need to either fix function declaration or
+ remove bounds from call statement.  */
+  if (callee)
+skip_bounds = chkp_redirect_edge (e);
+
+  decl = gimple_call_fndecl (e->call_stmt);
+  lhs = gimple_call_lhs (e->call_stmt);
+
   if (e->speculative)
 {
   cgraph_edge *e2;
@@ -1391,7 +1402,8 @@ cgraph_edge::redirect_call_stmt_to_callee (void)
 }
 
   if (e->indirect_unknown_callee
-  || decl == e->callee->decl)
+  || (decl == e->callee->decl
+ && !skip_bounds))
 return e->call_stmt;
 
 #ifdef ENABLE_CHECKING
@@ -1416,13 +1428,19 @@ cgraph_edge::redirect_call_stmt_to_callee (void)
}
 }
 
-  if (e->callee->clone.combined_args_to_skip)
+  if (e->callee->clone.combined_args_to_skip
+  || skip_bounds)
 {
   int lp_nr;
 
-  new_stmt
-   = gimple_call_copy_skip_args (e->call_stmt,
- e->callee->clone.combined_args_to_skip);
+  new_stmt = e->call_stmt;
+  if (e->callee->clone.combined_args_to_skip)
+   new_stmt
+ = gimple_call_copy_skip_args (new_stmt,
+   e->callee->clone.combined_args_to_skip);
+  if (skip_bounds)
+   new_stmt = chkp_copy_call_skip_bounds (new_stmt);
+
   gimple_call_set_fndecl (new_stmt, e->callee->decl);
   gimple_call_set_fntype (new_stmt, gimple_call_fntype (e->call_stmt));
 
diff --git a/gcc/testsuite/gcc.target/i386/mpx/chkp-fix-calls-1.c b/gcc/testsuite/gcc.target/i386/mpx/chkp-fix-calls-1.c
new file mode 100644
index 0000000..cb4d229
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/mpx/chkp-fix-calls-1.c
@@ -0,0 +1,16 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fcheck-pointer-bounds -mmpx" } */
+
+#include "math.h"
+
+double
+test1 (double x, double y, double (*fn)(double, double))
+{
+  return fn (x, y);
+}
+
+double
+test2 (double x, double y)
+{
+  return test1 (x, y, copysign);
+}
diff --git a/gcc/testsuite/gcc.target/i386/mpx/chkp-fix-calls-2.c b/gcc/testsuite/gcc.target/i386/mpx/chkp-fix-calls-2.c
new file mode 100644
index 0000000..951e7de
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/mpx/chkp-fix-calls-2.c
@@ -0,0 +1,16 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -fcheck-pointer-bounds -mmpx -fno-inline" } */
+
+#include "math.h"
+
+double
+test1 (double x, double y, double (*fn)(double, double))
+{
+  return fn (x, y);
+}
+
+double
+test2 (double x, double y)
+{
+  return test1 (x, y, copysign);
+}
diff --git a/gcc/tree-chkp.c b/gcc/tree-chkp.c
index d2df4ba..2d2090f 100644
--- a/gcc/tree-chkp.c
+++ b/gcc/tree-chkp.c
@@ -500,6 +500,62 @@ chkp_expand_bounds_reset_for_mem (tree mem, tree ptr)
   expand_normal (bndstx);
 }
 
+/* Build a GIMPLE_CALL identical to CALL but skipping bounds
+   arguments.  */
+
+gcall *
+chkp_copy_call_skip_bounds (gcall *call)
+{
+  bitmap bounds;
+  unsigned i;
+
+  bitmap_obstack_initialize (NULL);
+  bounds = BITMAP_ALLOC (NULL);
+
+  for (i = 0; i < gimple_call_num_args (call); i++)
+if (POINTER_BOUNDS_P (gimple_call_arg (call, i)))
+  bitmap_set_bit (bounds, i);
+
+  call = gimple_call_copy_skip_args (call, bounds);
+  gimple_call_set_with_bounds (call, false);
+
+  BITMAP_FREE (bounds);
+  bitmap_obstack_release (NULL);
+
+  return call;
+}
+
+/* Redirect edge E to the correct node according to call_stmt.
+   Return 1 if bounds removal from call_stmt should be done
+   instead of redirection.  */
+
+bool
+chkp_redirect_edge (cgraph_edge *e)
+{
+  bool instrumented = false;
+
+  if (e->callee->instrumentation_clone
+  || chkp_function_instrumented_p (e->callee->decl))
+instrumented = true;
+

Re: [RS6000] bswapdi2 pattern, reload and lra

2015-03-12 Thread Alan Modra

On Wed, Dec 18, 2013 at 09:53:38AM -0500, David Edelsohn wrote:
https://gcc.gnu.org/ml/gcc-patches/2013-12/msg01599.html
> Why change the code from swapping the words at the initial
> change_address() to swapping the words in the call to gen_bswapsi2()?

Sorry for dropping this on the floor for so long.  I've been prodded
back into action by Redhat people and pr63150.

I don't recall a compelling technical reason for the change.  It was
probably to make my life easier in tracking the lifetimes of addr1
and addr2, necessary due to losing one of the scratch registers along
with early clobbers.  (In the splitter you question, addr1 might be
the same register as dest/dest_32.)

I suppose it also makes those splitters look a little more like the
one for bswapdi2_32bit, so a plus for maintenance.
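
As a data-flow illustration only (this is plain C, not the rs6000 splitter, and the function names are hypothetical), a 64-bit byte swap built from two 32-bit swaps needs the word swap to happen somewhere: either where the halves are addressed or where the 32-bit swaps are issued. The result is the same in both cases.

```c
#include <assert.h>
#include <stdint.h>

/* Byte-swap a 32-bit value with shifts and masks.  */
static uint32_t bswap32_sketch(uint32_t x)
{
  return (x >> 24) | ((x >> 8) & 0x0000ff00U)
         | ((x << 8) & 0x00ff0000U) | (x << 24);
}

/* Byte-swap a 64-bit value using two 32-bit swaps.  The swapped low word
   becomes the high word of the result and vice versa.  */
static uint64_t bswap64_via_words(uint64_t x)
{
  uint32_t lo = (uint32_t) x;
  uint32_t hi = (uint32_t) (x >> 32);

  return ((uint64_t) bswap32_sketch(lo) << 32) | bswap32_sketch(hi);
}

/* Check one known value; aborts on a mismatch.  */
static void check_bswap_sketch(void)
{
  assert(bswap64_via_words(0x0102030405060708ULL) == 0x0807060504030201ULL);
}
```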

The patch applies with some minor changes (see pr63150) and I've
checked for regressions on a current powerpc64le build.  OK to apply,
and on the branches?

-- 
Alan Modra
Australia Development Lab, IBM

[PING^2] [PATCH] [AArch64, NEON] Improve vmulX intrinsics

2015-03-12 Thread Jiangjiji

Hi, 
  This is a ping for: https://gcc.gnu.org/ml/gcc-patches/2014-12/msg00775.html
  Regtested with aarch64-linux-gnu on QEMU.
  This patch has no regressions for aarch64_be-linux-gnu big-endian target too. 
  OK for the trunk? Thanks.


Index: gcc/ChangeLog
===
--- gcc/ChangeLog   (revision 219845)
+++ gcc/ChangeLog   (working copy)
@@ -1,3 +1,38 @@
+2014-12-11  Felix Yang  
+   Jiji Jiang  
+
+   * config/aarch64/aarch64-simd.md (aarch64_mul_n,
+   aarch64_mull_n, aarch64_mull,
+   aarch64_simd_mull2_n, aarch64_mull2_n,
+   aarch64_mull_lane, aarch64_mull2_lane_internal,
+   aarch64_mull_laneq, aarch64_mull2_laneq_internal,
+   aarch64_smull2_lane, aarch64_umull2_lane,
+   aarch64_smull2_laneq, aarch64_umull2_laneq,
+   aarch64_fmulx, aarch64_fmulx, aarch64_fmulx_lane,
+   aarch64_pmull2v16qi, aarch64_pmullv8qi): New patterns.
+   * config/aarch64/aarch64-simd-builtins.def (vec_widen_smult_hi_,
+   vec_widen_umult_hi_, umull, smull, smull_n, umull_n, mul_n, smull2_n,
+   umull2_n, smull_lane, umull_lane, smull_laneq, umull_laneq, pmull,
+   umull2_lane, smull2_laneq, umull2_laneq, fmulx, fmulx_lane, pmull2,
+   smull2_lane): New builtins.
+   * config/aarch64/arm_neon.h (vmul_n_f32, vmul_n_s16, vmul_n_s32,
+   vmul_n_u16, vmul_n_u32, vmulq_n_f32, vmulq_n_f64, vmulq_n_s16,
+   vmulq_n_s32, vmulq_n_u16, vmulq_n_u32, vmull_high_lane_s16,
+   vmull_high_lane_s32, vmull_high_lane_u16, vmull_high_lane_u32,
+   vmull_high_laneq_s16, vmull_high_laneq_s32, vmull_high_laneq_u16,
+   vmull_high_laneq_u32, vmull_high_n_s16, vmull_high_n_s32,
+   vmull_high_n_u16, vmull_high_n_u32, vmull_high_p8, vmull_high_s8,
+   vmull_high_s16, vmull_high_s32, vmull_high_u8, vmull_high_u16,
+   vmull_high_u32, vmull_lane_s16, vmull_lane_s32, vmull_lane_u16,
+   vmull_lane_u32, vmull_laneq_s16, vmull_laneq_s32, vmull_laneq_u16,
+   vmull_laneq_u32, vmull_n_s16, vmull_n_s32, vmull_n_u16, vmull_n_u32,
+   vmull_p8, vmull_s8, vmull_s16, vmull_s32, vmull_u8, vmull_u16,
+   vmull_u32, vmulx_f32, vmulx_lane_f32, vmulxd_f64, vmulxq_f32,
+   vmulxq_f64, vmulxq_lane_f32, vmulxq_lane_f64, vmulxs_f32): Rewrite
+   using builtin functions.
+   * config/aarch64/iterators.md (UNSPEC_FMULX, UNSPEC_FMULX_LANE,
+   VDQF_Q): New unspec and int iterator.
+
 2015-01-19  Jiong Wang  
Andrew Pinski  
 
Index: gcc/config/aarch64/arm_neon.h
===
--- gcc/config/aarch64/arm_neon.h   (revision 219845)
+++ gcc/config/aarch64/arm_neon.h   (working copy)
@@ -7580,671 +7580,6 @@ vmovn_u64 (uint64x2_t a)
   return result;
 }
 
-__extension__ static __inline float32x2_t __attribute__ ((__always_inline__))
-vmul_n_f32 (float32x2_t a, float32_t b)
-{
-  float32x2_t result;
-  __asm__ ("fmul %0.2s,%1.2s,%2.s[0]"
-   : "=w"(result)
-   : "w"(a), "w"(b)
-   : /* No clobbers */);
-  return result;
-}
-
-__extension__ static __inline int16x4_t __attribute__ ((__always_inline__))
-vmul_n_s16 (int16x4_t a, int16_t b)
-{
-  int16x4_t result;
-  __asm__ ("mul %0.4h,%1.4h,%2.h[0]"
-   : "=w"(result)
-   : "w"(a), "x"(b)
-   : /* No clobbers */);
-  return result;
-}
-
-__extension__ static __inline int32x2_t __attribute__ ((__always_inline__))
-vmul_n_s32 (int32x2_t a, int32_t b)
-{
-  int32x2_t result;
-  __asm__ ("mul %0.2s,%1.2s,%2.s[0]"
-   : "=w"(result)
-   : "w"(a), "w"(b)
-   : /* No clobbers */);
-  return result;
-}
-
-__extension__ static __inline uint16x4_t __attribute__ ((__always_inline__))
-vmul_n_u16 (uint16x4_t a, uint16_t b)
-{
-  uint16x4_t result;
-  __asm__ ("mul %0.4h,%1.4h,%2.h[0]"
-   : "=w"(result)
-   : "w"(a), "x"(b)
-   : /* No clobbers */);
-  return result;
-}
-
-__extension__ static __inline uint32x2_t __attribute__ ((__always_inline__))
-vmul_n_u32 (uint32x2_t a, uint32_t b)
-{
-  uint32x2_t result;
-  __asm__ ("mul %0.2s,%1.2s,%2.s[0]"
-   : "=w"(result)
-   : "w"(a), "w"(b)
-   : /* No clobbers */);
-  return result;
-}
-
-#define vmull_high_lane_s16(a, b, c)\
-  __extension__ \
-({  \
-   int16x4_t b_ = (b);  \
-   int16x8_t a_ = (a);  \
-   int32x4_t result;\
-   __asm__ ("smull2 %0.4s, %1.8h, %2.h[%3]" \
-: "=w"(result)  \
-: "w"(a_), "x"(b_), "i"(c)  \
-: /* No clobbers */);

Re: [PATCH, CHKP, PR target/65044] Restrict pointer bounds checker with Sanitizer

2015-03-12 Thread Ilya Enkovich

2015-03-12 12:02 GMT+03:00 Jakub Jelinek :
> On Thu, Mar 12, 2015 at 11:51:51AM +0300, Ilya Enkovich wrote:
>> On 09 Mar 15:51, Jakub Jelinek wrote:
>> > On Mon, Mar 02, 2015 at 01:25:43PM +0300, Ilya Enkovich wrote:
>> > > > --- a/gcc/toplev.c
>> > > > +++ b/gcc/toplev.c
>> > > > @@ -1376,6 +1376,11 @@ process_options (void)
>> > > >  {
>> > > >if (targetm.chkp_bound_mode () == VOIDmode)
>> > > > error ("-fcheck-pointer-bounds is not supported for this 
>> > > > target");
>> > > > +
>> > > > +  if (flag_sanitize & SANITIZE_ADDRESS)
>> > > > +   error ("-fcheck-pointer-bounds is not supported with Address 
>> > > > Sanitizer");
>> > > > +
>> > > > +  flag_check_pointer_bounds = 0;
>> > > >  }
>> >
>> > Doesn't this disable -fcheck-pointer-bounds always?
>> > I'd expect you want to clear flag_check_pointer_bounds only if you issued
>> > one of the two errors...
>> >
>> > Jakub
>>
>> Whoops!  Here is a less destructive version.
>
> Ok for trunk.  Did the old version pass make check?  If so, perhaps you want 
> to add
> (incrementally) some test that would actually verify that
> -fcheck-pointer-bounds does what it should do (e.g. by scanning tree dumps
> etc.).

Thanks!  I sent the previous version before running make check.  Several
chkp tests would fail with it.

Ilya

>
> Jakub

Re: [PATCH, CHKP, PR target/65044] Restrict pointer bounds checker with Sanitizer

2015-03-12 Thread Ilya Enkovich

On 09 Mar 15:51, Jakub Jelinek wrote:
> On Mon, Mar 02, 2015 at 01:25:43PM +0300, Ilya Enkovich wrote:
> > > --- a/gcc/toplev.c
> > > +++ b/gcc/toplev.c
> > > @@ -1376,6 +1376,11 @@ process_options (void)
> > >  {
> > >if (targetm.chkp_bound_mode () == VOIDmode)
> > > error ("-fcheck-pointer-bounds is not supported for this target");
> > > +
> > > +  if (flag_sanitize & SANITIZE_ADDRESS)
> > > +   error ("-fcheck-pointer-bounds is not supported with Address 
> > > Sanitizer");
> > > +
> > > +  flag_check_pointer_bounds = 0;
> > >  }
> 
> Doesn't this disable -fcheck-pointer-bounds always?
> I'd expect you want to clear flag_check_pointer_bounds only if you issued
> one of the two errors...
> 
>   Jakub

Whoops!  Here is a less destructive version.

Thanks,
Ilya
--
gcc/

2015-03-11  Ilya Enkovich  

PR target/65044
* toplev.c (process_options): Restrict Pointer Bounds Checker
usage with Address Sanitizer.

gcc/testsuite/

2015-03-11  Ilya Enkovich  

PR target/65044
* gcc.target/i386/pr65044.c: New.


diff --git a/gcc/testsuite/gcc.target/i386/pr65044.c b/gcc/testsuite/gcc.target/i386/pr65044.c
new file mode 100644
index 0000000..4f318d6
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr65044.c
@@ -0,0 +1,12 @@
+/* { dg-error "-fcheck-pointer-bounds is not supported with Address Sanitizer" } */
+/* { dg-do compile } */
+/* { dg-require-effective-target mpx } */
+/* { dg-options "-fcheck-pointer-bounds -mmpx -fsanitize=address" } */
+
+extern int x[];
+
+void
+foo ()
+{
+  x[0] = 0;
+}
diff --git a/gcc/toplev.c b/gcc/toplev.c
index 99cf180..b06eed3 100644
--- a/gcc/toplev.c
+++ b/gcc/toplev.c
@@ -1375,7 +1375,17 @@ process_options (void)
   if (flag_check_pointer_bounds)
 {
   if (targetm.chkp_bound_mode () == VOIDmode)
-   error ("-fcheck-pointer-bounds is not supported for this target");
+   {
+ error ("-fcheck-pointer-bounds is not supported for this target");
+ flag_check_pointer_bounds = 0;
+   }
+
+  if (flag_sanitize & SANITIZE_ADDRESS)
+   {
+ error ("-fcheck-pointer-bounds is not supported with "
+"Address Sanitizer");
+ flag_check_pointer_bounds = 0;
+   }
 }
 
   /* One region RA really helps to decrease the code size.  */
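
The difference Jakub pointed out can be shown with a small model of the option check. This is only a sketch: struct chkp_options and validate_chkp are hypothetical stand-ins for the globals and hooks used in process_options. The flag is cleared only on the paths that report an error, so a supported configuration keeps the checker enabled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the state consulted by process_options.  */
struct chkp_options
{
  bool check_pointer_bounds;   /* models flag_check_pointer_bounds */
  bool target_has_bound_mode;  /* models chkp_bound_mode () != VOIDmode */
  bool sanitize_address;       /* models flag_sanitize & SANITIZE_ADDRESS */
};

/* Validate the options.  Returns the number of errors that would be
   reported; the checker flag is cleared only when an error occurs.  */
static int validate_chkp(struct chkp_options *o)
{
  int errors = 0;

  assert(o != NULL);
  if (!o->check_pointer_bounds)
    return 0;

  if (!o->target_has_bound_mode)
    {
      errors++;
      o->check_pointer_bounds = false;
    }

  if (o->sanitize_address)
    {
      errors++;
      o->check_pointer_bounds = false;
    }

  return errors;
}
```

With the earlier version of the patch, the assignment sat outside both conditions, so the equivalent of the first case in a test (supported target, no Address Sanitizer) would also have ended with the flag cleared.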

Re: [PATCH] Fix PR44563 more

2015-03-12 Thread Richard Biener

On Tue, 10 Mar 2015, Richard Biener wrote:

> 
> CFG cleanup currently searches for calls that became noreturn and
> fixes them up (splitting block and removing the fallthru).  Previously
> that was technically necessary as propagation may have turned an
> indirect call into a direct noreturn call and the CFG verifier would
> have barfed.  Today we guard that with GF_CALL_CTRL_ALTERING and
> thus we "remember" the previous call analysis.
> 
> The following patch removes the CFG cleanup code (which is expensive
> because gimple_call_flags () is quite expensive, not to talk about
> walking all stmts).  This leaves the fixup_cfg passes to perform the
> very same optimization (relevant propagators can also be taught
> to call fixup_noreturn_call, but I don't think that's very important).
> 
> Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.
> 
> I'm somewhat undecided whether this is ok at this stage and if we
> _do_ want to make propagators fix those (previously indirect) calls up
> earlier at the same time.
> 
> Honza - I think we performed this in CFG cleanup for the sake of CFG 
> checking, not for the sake of prompt optimization, no?
> 
> This would make PR44563 a pure IPA pass issue.

Soo - testing revealed a single case where we mess up things (and
the verifier noticing only because of a LHS on a noreturn call...).

The following patch makes all propagators handle the noreturn transition
(the paths in all but PRE are not exercised by bootstrap or testsuite :/).

This patch makes CFG cleanup independent of BB size (during analysis,
merge_blocks and delete_basic_block are still O(n)) - which is
a very much desired property.

It also changes fixup_cfg to produce a dump only when run as
separate pass (otherwise the .optimized dump changes and I get
tons of scan related fails) - that also reduces noise in the
very many places we dump functions (they are dumped anyway for
all cases).

Bootstrap and regtest running on x86_64-unknown-linux-gnu.

I wonder if you can throw this on firefox/chromium - the critical
paths are devirtualization introducing __builtin_unreachable.

This patch should get a good speedup on all compiles (we run
CFG-cleanup a _lot_), by removing pointless IL walks and expensive
gimple_call_flags calls on calls.

Thanks,
Richard.

2015-03-10  Richard Biener  

PR middle-end/44563
* tree-cfgcleanup.c (split_bb_on_noreturn_calls): Remove.
(cleanup_tree_cfg_1): Do not call it.
(execute_cleanup_cfg_post_optimizing): Fixup the CFG here.
(fixup_noreturn_call): Mark the stmt as control altering.
* tree-cfg.c (execute_fixup_cfg): Do not dump the function
here.
(pass_data_fixup_cfg): Produce a dump file.
* tree-ssa-dom.c: Include tree-cfgcleanup.h.
(need_noreturn_fixup): New global.
(pass_dominator::execute): Fixup queued noreturn calls.
(optimize_stmt): Queue calls that became noreturn for fixup.
* tree-ssa-forwprop.c (pass_forwprop::execute): Likewise.
* tree-ssa-pre.c: Include tree-cfgcleanup.h.
(el_to_fixup): New global.
(eliminate_dom_walker::before_dom_children): Queue calls that
became noreturn for fixup.
(eliminate): Fixup queued noreturn calls.
* tree-ssa-propagate.c: Include tree-cfgcleanup.h.
(substitute_and_fold_dom_walker): New member stmts_to_fixup.
(substitute_and_fold_dom_walker::before_dom_children): Queue
calls that became noreturn for fixup.
(substitute_and_fold): Fixup queued noreturn calls.
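
The queue-then-fix pattern named in the entries above can be sketched in isolation. Everything below is a hypothetical model (struct stmt_model and struct fixup_queue are not GCC types): the walk only records which calls became noreturn, and the fixups run after the walk so the statement list is not restructured while it is being traversed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_STMTS 16

/* Hypothetical model of a statement.  */
struct stmt_model
{
  bool is_call;
  bool noreturn;       /* the callee is now known not to return */
  bool ctrl_altering;  /* already recorded as ending its block */
};

/* Indices of statements waiting for fixup.  */
struct fixup_queue
{
  size_t idx[MAX_STMTS];
  size_t count;
};

/* During the walk: queue calls that became noreturn but are not yet
   marked as control altering.  N must not exceed MAX_STMTS.  */
static void queue_noreturn_calls(const struct stmt_model *stmts, size_t n,
                                 struct fixup_queue *q)
{
  assert(n <= MAX_STMTS);
  q->count = 0;
  for (size_t i = 0; i < n; i++)
    if (stmts[i].is_call && stmts[i].noreturn && !stmts[i].ctrl_altering)
      q->idx[q->count++] = i;
}

/* After the walk: apply the queued fixups.  Returns how many were done.  */
static size_t run_fixups(struct stmt_model *stmts, const struct fixup_queue *q)
{
  for (size_t k = 0; k < q->count; k++)
    stmts[q->idx[k]].ctrl_altering = true;
  return q->count;
}
```

In the patch the fixup step is fixup_noreturn_call, which also splits the block and removes the fallthru edge; the sketch reduces that to setting a flag.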

Index: gcc/tree-cfg.c
===
--- gcc/tree-cfg.c  (revision 221379)
+++ gcc/tree-cfg.c  (working copy)
@@ -8721,10 +8721,6 @@ execute_fixup_cfg (void)
   if (count_scale != REG_BR_PROB_BASE)
 compute_function_frequency ();

-  /* Dump a textual representation of the flowgraph.  */
-  if (dump_file)
-gimple_dump_cfg (dump_file, dump_flags);
-
   if (current_loops
   && (todo & TODO_cleanup_cfg))
 loops_state_set (LOOPS_NEED_FIXUP);
@@ -8737,7 +8733,7 @@ namespace {
 const pass_data pass_data_fixup_cfg =
 {
   GIMPLE_PASS, /* type */
-  "*free_cfg_annotations", /* name */
+  "fixup_cfg", /* name */
   OPTGROUP_NONE, /* optinfo_flags */
   TV_NONE, /* tv_id */
   PROP_cfg, /* properties_required */
Index: gcc/tree-cfgcleanup.c
===
--- gcc/tree-cfgcleanup.c   (revision 221379)
+++ gcc/tree-cfgcleanup.c   (working copy)
@@ -625,35 +625,13 @@ fixup_noreturn_call (gimple stmt)
   update_stmt (stmt);
 }

+  /* Mark the call as altering control flow.  */
+  gimple_call_set_ctrl_altering (stmt, true);
+
   return remove_fallthru_edge (bb->succs);
 }

-/* Split basic blocks on calls in the middle of a basic block that are now
-   known not to return, and remove the unreachable code.  */
-
-static bool
-split_bb_on_noreturn_calls (basic_block bb)

[wwwdocs] Update 4.9.2 status link from RC announcement to release announcement

2015-03-12 Thread Jonathan Wakely


This just updates the status link on the homepage from the 4.9.2-rc1
announcement to the final release announcement a week later.

Committed to CVS.
Index: index.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/index.html,v
retrieving revision 1.955
diff -u -r1.955 index.html
--- index.html	6 Feb 2015 22:36:02 -	1.955
+++ index.html	12 Mar 2015 09:37:46 -
@@ -188,7 +188,7 @@
 
   Status:
   
-  <a href="https://gcc.gnu.org/ml/gcc/2014-10/msg00195.html">2014-10-23</a>
+  <a href="https://gcc.gnu.org/ml/gcc/2014-10/msg00260.html">2014-10-30</a>
   
   (regression fixes and docs only).

Re: [PING^2] [PATCH] [AArch64, NEON] Improve vmulX intrinsics

2015-03-12 Thread Kyrill Tkachov


Hi Jiangjiji,

This is definitely stage 1 material by now...
At a glance it all looks like the right approach; I have a question below:

On 12/03/15 09:20, Jiangjiji wrote:

+
+(define_insn "aarch64_fmulx_lane"
+  [(set (match_operand:VDQF 0 "register_operand" "=w")
+(unspec:VDQF  [(match_operand:VDQF 1 "register_operand" "w")
+   (match_operand: 2 "register_operand" "w")
+   (match_operand:SI 3 "immediate_operand" "i")]
+  UNSPEC_FMULX_LANE))]
+ "TARGET_SIMD"
+ "fmulx\\t%0., %1., %2."
+  [(set_attr "type" "neon_mul_s")]
+)

Where did operand 3 go? Shouldn't this be the lane-element variant of fmulx?

Thanks,
Kyrill
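
To make the question about the lane operand concrete, here is a rough scalar
model in C of what a by-element fmulx would be expected to compute. It is only
a sketch: the function names and the fixed 4-lane layout are assumptions for
illustration and are not taken from the patch. The special case (zero times
infinity gives 2.0 with the combined sign) is the architectural behaviour of
FMULX.

```c
#include <math.h>

/* Scalar model of FMULX: an ordinary multiply, except that
   (+/-0) * (+/-Inf) yields +/-2.0 instead of NaN.  */
float
fmulx_model (float a, float b)
{
  if ((a == 0.0f && isinf (b)) || (isinf (a) && b == 0.0f))
    return (!signbit (a) != !signbit (b)) ? -2.0f : 2.0f;
  return a * b;
}

/* Hypothetical model of the by-element form on a 4-lane vector:
   every lane of A is multiplied by the single element V[LANE].
   LANE plays the role of operand 3 in the pattern above.  */
void
fmulx_lane_model (float out[4], const float a[4], const float v[4], int lane)
{
  for (int i = 0; i < 4; i++)
    out[i] = fmulx_model (a[i], v[lane]);
}
```

If the lane index never reaches the output template, the instruction would
multiply lane by lane instead of broadcasting one element, which is the
difference this model is meant to show.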

Re: [PATCH] Speedup gimple_split_block

2015-03-12 Thread Richard Biener

On Tue, 10 Mar 2015, Richard Biener wrote:

> On Tue, 10 Mar 2015, Richard Biener wrote:
> 
> > 
> > This removes the old vestige loop to find a gsi for a stmt (from times
> > where gsi_for_stmt was O(n)).
> > 
> > PR44563 shows gimple_split_block quite high in the profile (this
> > patch doesn't fix that) as the tail loop setting BB on all stmts
> > moved to the new block shows quadratic behavior when inlining
> > N calls in a basic-block.
> > 
> > Bootstrap and regtest scheduled on x86_64-unknown-linux-gnu.
> 
> Ok, reveals two errors in my fix and two oddities in omp-low.c - removing
> a stmt and then splitting its basic-block after it.
> 
> Hopefully the following will finish bootstrap & regtest ok.

Not.  But the following did.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk.

Richard.

2015-03-12  Richard Biener  

* tree-cfg.c (gimple_split_block): Remove loop finding stmt
to split on.
* omp-low.c (expand_omp_taskreg): Split block before removing
the stmt.
(expand_omp_target): Likewise.
* ubsan.c (ubsan_expand_null_ifn): Adjust stmt if we replaced it.
* tree-parloops.c (create_call_for_reduction_1): Pass a proper
stmt to split_block.
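
For readers unfamiliar with the data structure, the following toy model in C
sketches why the lookup loop is no longer needed and why the order of removal
and splitting matters. All names (toy_stmt, toy_block, toy_split_after, ...)
are hypothetical stand-ins, not GCC's API; the sketch only assumes that a
statement carries its own list links and a pointer to its block.

```c
#include <stddef.h>

/* Toy stand-ins for statements and basic blocks.  */
struct toy_block;

struct toy_stmt
{
  struct toy_stmt *prev, *next;
  struct toy_block *bb;         /* owning block, NULL once removed */
};

struct toy_block
{
  struct toy_stmt *first, *last;
};

/* Append STMT to the end of BB.  */
void
toy_append (struct toy_block *bb, struct toy_stmt *stmt)
{
  stmt->prev = bb->last;
  stmt->next = NULL;
  stmt->bb = bb;
  if (bb->last)
    bb->last->next = stmt;
  else
    bb->first = stmt;
  bb->last = stmt;
}

/* Unlink STMT from its block.  */
void
toy_remove (struct toy_stmt *stmt)
{
  struct toy_block *bb = stmt->bb;
  if (stmt->prev)
    stmt->prev->next = stmt->next;
  else
    bb->first = stmt->next;
  if (stmt->next)
    stmt->next->prev = stmt->prev;
  else
    bb->last = stmt->prev;
  stmt->prev = stmt->next = NULL;
  stmt->bb = NULL;
}

/* Move everything after STMT from BB into the empty block NEW_BB.
   The split point is found in O(1) from STMT's own links; no scan
   from the start of BB is needed.  Returns 0 on success, or -1 if
   STMT is not in BB (for example because it was removed first).  */
int
toy_split_after (struct toy_block *bb, struct toy_stmt *stmt,
                 struct toy_block *new_bb)
{
  if (stmt->bb != bb)
    return -1;
  new_bb->first = stmt->next;
  new_bb->last = stmt->next ? bb->last : NULL;
  /* The remaining linear part: re-parent the moved statements.  */
  for (struct toy_stmt *s = stmt->next; s; s = s->next)
    s->bb = new_bb;
  if (stmt->next)
    stmt->next->prev = NULL;
  stmt->next = NULL;
  bb->last = stmt;
  return 0;
}
```

In this model, splitting after a statement that has already been unlinked is
rejected, which mirrors why the omp-low.c callers now split first and remove
the statement afterwards. The loop that re-parents the moved statements is the
part that stays linear.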

Index: gcc/tree-cfg.c
===
*** gcc/tree-cfg.c  (revision 221324)
--- gcc/tree-cfg.c  (working copy)
*** gimple_split_block (basic_block bb, void
*** 5683,5689 
  {
gimple_stmt_iterator gsi;
gimple_stmt_iterator gsi_tgt;
-   gimple act;
gimple_seq list;
basic_block new_bb;
edge e;
--- 5683,5688 
*** gimple_split_block (basic_block bb, void
*** 5697,5722 
FOR_EACH_EDGE (e, ei, new_bb->succs)
  e->src = new_bb;
  
!   if (stmt && gimple_code ((gimple) stmt) == GIMPLE_LABEL)
! stmt = NULL;
! 
!   /* Move everything from GSI to the new basic block.  */
!   for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
  {
!   act = gsi_stmt (gsi);
!   if (gimple_code (act) == GIMPLE_LABEL)
!   continue;
! 
!   if (!stmt)
!   break;
! 
!   if (stmt == act)
!   {
! gsi_next (&gsi);
! break;
!   }
  }
! 
if (gsi_end_p (gsi))
  return new_bb;
  
--- 5696,5711 
FOR_EACH_EDGE (e, ei, new_bb->succs)
  e->src = new_bb;
  
!   /* Get a stmt iterator pointing to the first stmt to move.  */
!   if (!stmt || gimple_code ((gimple) stmt) == GIMPLE_LABEL)
! gsi = gsi_after_labels (bb);
!   else
  {
!   gsi = gsi_for_stmt ((gimple) stmt);
!   gsi_next (&gsi);
  }
!  
!   /* Move everything from GSI to the new basic block.  */
if (gsi_end_p (gsi))
  return new_bb;
  
Index: gcc/omp-low.c
===
*** gcc/omp-low.c   (revision 221324)
--- gcc/omp-low.c   (working copy)
*** expand_omp_taskreg (struct omp_region *r
*** 5514,5521 
stmt = gsi_stmt (gsi);
gcc_assert (stmt && (gimple_code (stmt) == GIMPLE_OMP_PARALLEL
   || gimple_code (stmt) == GIMPLE_OMP_TASK));
-   gsi_remove (&gsi, true);
e = split_block (entry_bb, stmt);
entry_bb = e->dest;
single_succ_edge (entry_bb)->flags = EDGE_FALLTHRU;
  
--- 5514,5521 
stmt = gsi_stmt (gsi);
gcc_assert (stmt && (gimple_code (stmt) == GIMPLE_OMP_PARALLEL
   || gimple_code (stmt) == GIMPLE_OMP_TASK));
e = split_block (entry_bb, stmt);
+   gsi_remove (&gsi, true);
entry_bb = e->dest;
single_succ_edge (entry_bb)->flags = EDGE_FALLTHRU;
  
*** expand_omp_target (struct omp_region *re
*** 8889,8896 
stmt = gsi_stmt (gsi);
gcc_assert (stmt
  && gimple_code (stmt) == gimple_code (entry_stmt));
-   gsi_remove (&gsi, true);
e = split_block (entry_bb, stmt);
entry_bb = e->dest;
single_succ_edge (entry_bb)->flags = EDGE_FALLTHRU;
  
--- 8889,8896 
stmt = gsi_stmt (gsi);
gcc_assert (stmt
  && gimple_code (stmt) == gimple_code (entry_stmt));
e = split_block (entry_bb, stmt);
+   gsi_remove (&gsi, true);
entry_bb = e->dest;
single_succ_edge (entry_bb)->flags = EDGE_FALLTHRU;
  
Index: gcc/ubsan.c
===
*** gcc/ubsan.c (revision 221324)
--- gcc/ubsan.c (working copy)
*** ubsan_expand_null_ifn (gimple_stmt_itera
*** 864,869 
--- 864,870 
  
/* Replace the UBSAN_NULL with a GIMPLE_COND stmt.  */
gsi_replace (&gsi, g, false);
+   stmt = g;
  }
  
if (check_align)
Index: gcc/tree-parloops.c
===
*** gcc/tree-parloops.c (revision 221324)
--- gcc/tree-parloops.c (working copy)
*** create_call_for_reduction_1 (r

Re: [PATCH, CHKP, PR target/65044] Restrict pointer bounds checker with Sanitizer

2015-03-12 Thread Jakub Jelinek

On Thu, Mar 12, 2015 at 11:51:51AM +0300, Ilya Enkovich wrote:
> On 09 Mar 15:51, Jakub Jelinek wrote:
> > On Mon, Mar 02, 2015 at 01:25:43PM +0300, Ilya Enkovich wrote:
> > > > --- a/gcc/toplev.c
> > > > +++ b/gcc/toplev.c
> > > > @@ -1376,6 +1376,11 @@ process_options (void)
> > > >  {
> > > >if (targetm.chkp_bound_mode () == VOIDmode)
> > > > error ("-fcheck-pointer-bounds is not supported for this target");
> > > > +
> > > > +  if (flag_sanitize & SANITIZE_ADDRESS)
> > > > +   error ("-fcheck-pointer-bounds is not supported with Address Sanitizer");
> > > > +
> > > > +  flag_check_pointer_bounds = 0;
> > > >  }
> > 
> > Doesn't this disable -fcheck-pointer-bounds always?
> > I'd expect you want to clear flag_check_pointer_bounds only if you issued
> > one of the two errors...
> > 
> > Jakub
> 
> Whoops!  Here is a less destructive version.

Ok for trunk.  Did the old version pass make check?  If so, perhaps you
want to add (incrementally) some test that would actually verify that
-fcheck-pointer-bounds does what it should do (e.g. by scanning tree dumps
etc.).

Jakub
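
The point of the review comment can be sketched in C as follows. This is not
the committed code: the struct and function names are hypothetical stand-ins
for the option state in process_options, and the sketch only shows the flag
being cleared when one of the two errors was actually diagnosed.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the option state.  */
struct toy_opts
{
  bool check_pointer_bounds;   /* models flag_check_pointer_bounds */
  bool target_supports_chkp;   /* models chkp_bound_mode () != VOIDmode */
  bool address_sanitizer;      /* models flag_sanitize & SANITIZE_ADDRESS */
};

/* Returns the number of errors diagnosed.  The flag is cleared only
   if at least one error was diagnosed; otherwise it is left alone.  */
int
toy_process_chkp_options (struct toy_opts *o)
{
  int errors = 0;

  if (!o->check_pointer_bounds)
    return 0;
  if (!o->target_supports_chkp)
    errors++;                  /* "not supported for this target" */
  if (o->address_sanitizer)
    errors++;                  /* "not supported with Address Sanitizer" */
  if (errors)
    o->check_pointer_bounds = false;
  return errors;
}
```

With the earlier version of the hunk, the equivalent of the final assignment
ran unconditionally, so the option could never stay enabled.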

[PING] [PATCH, AArch64] [4.8] [4.9] Backport PR64304 fix (miscompilation with -mgeneral-regs-only )

2015-03-12 Thread Chen Shanyao



This backports the fix for PR target/64304 (miscompilation with
-mgeneral-regs-only) from trunk r219844 to the 4.8 and 4.9 branches.
Tested on an x86_64 host using qemu for aarch64.
OK for 4.8 and 4.9?

---gcc-4.8---

diff -rupN gcc-4.8-20150226/gcc/ChangeLog
gcc-4.8-20150226.pr64304//gcc/ChangeLog
--- gcc-4.8-20150226/gcc/ChangeLog2015-03-04 21:13:46.0 -0500
+++ gcc-4.8-20150226.pr64304//gcc/ChangeLog2015-03-04
21:19:49.0 -0500
@@ -1,3 +1,13 @@
+2015-03-05  Shanyao Chen
+
+Backported from mainline
+2015-01-19  Jiong Wang
+Andrew Pinski
+
+PR target/64304
+* config/aarch64/aarch64.md (define_insn "*ashl3_insn"):
Deleted.
+(ashl3): Don't expand if operands[2] is not constant.
+
   2015-02-26  Peter Bergner

   Backport from mainline
diff -rupN gcc-4.8-20150226/gcc/config/aarch64/aarch64.md
gcc-4.8-20150226.pr64304//gcc/config/aarch64/aarch64.md
--- gcc-4.8-20150226/gcc/config/aarch64/aarch64.md2015-03-04
21:14:29.0 -0500
+++ gcc-4.8-20150226.pr64304//gcc/config/aarch64/aarch64.md 2015-03-04
21:21:54.0 -0500
@@ -2612,6 +2612,8 @@
   DONE;
 }
 }
+else
+  FAIL;
 }
   )

@@ -2681,16 +2683,6 @@
  (set_attr "mode" "SI")]
   )

-(define_insn "*ashl3_insn"
-  [(set (match_operand:SHORT 0 "register_operand" "=r")
-(ashift:SHORT (match_operand:SHORT 1 "register_operand" "r")
-  (match_operand:QI 2 "aarch64_reg_or_shift_imm_si" "rUss")))]
-  ""
-  "lsl\\t%0, %1, %2"
-  [(set_attr "v8type" "shift")
-   (set_attr "mode" "")]
-)
-
   (define_insn "*3_insn"
 [(set (match_operand:SHORT 0 "register_operand" "=r")
   (ASHIFT:SHORT (match_operand:SHORT 1 "register_operand" "r")
diff -rupN gcc-4.8-20150226/gcc/testsuite/ChangeLog
gcc-4.8-20150226.pr64304//gcc/testsuite/ChangeLog
--- gcc-4.8-20150226/gcc/testsuite/ChangeLog2015-03-04
21:16:54.0 -0500
+++ gcc-4.8-20150226.pr64304//gcc/testsuite/ChangeLog2015-03-04
21:22:58.0 -0500
@@ -1,3 +1,10 @@
+2015-03-05  Shanyao chen
+
+Backported from mainline
+2015-01-19  Jiong Wang
+
+* gcc.target/aarch64/pr64304.c: New testcase.
+
   2015-02-26  Peter Bergner

   Backport from mainline
diff -rupN gcc-4.8-20150226/gcc/testsuite/gcc.target/aarch64/pr64304.c
gcc-4.8-20150226.pr64304//gcc/testsuite/gcc.target/aarch64/pr64304.c
--- gcc-4.8-20150226/gcc/testsuite/gcc.target/aarch64/pr64304.c
1969-12-31 19:00:00.0 -0500
+++ gcc-4.8-20150226.pr64304//gcc/testsuite/gcc.target/aarch64/pr64304.c
2015-03-04 21:12:15.0 -0500
@@ -0,0 +1,19 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 --save-temps" } */
+
+unsigned char byte = 0;
+
+void
+set_bit (unsigned int bit, unsigned char value)
+{
+  unsigned char mask = (unsigned char) (1 << (bit & 7));
+
+  if (! value)
+byte &= (unsigned char)~mask;
+  else
+byte |= mask;
+/* { dg-final { scan-assembler "and\tw\[0-9\]+, w\[0-9\]+, 7" } } */
+}
+
+/* { dg-final { cleanup-saved-temps } } */
+

---gcc-4.9---

diff -rupN gcc-4.9-20150225/gcc/ChangeLog
gcc-4.9-20150225.pr64304//gcc/ChangeLog
--- gcc-4.9-20150225/gcc/ChangeLog2015-03-04 20:48:30.0 -0500
+++ gcc-4.9-20150225.pr64304//gcc/ChangeLog2015-03-04
20:55:59.0 -0500
@@ -1,3 +1,13 @@
+2015-03-05  Shanyao Chen
+
+Backported from mainline
+2015-01-19  Jiong Wang
+Andrew Pinski
+
+PR target/64304
+* config/aarch64/aarch64.md (define_insn "*ashl3_insn"): Deleted.
+(ashl3): Don't expand if operands[2] is not constant.
+
   2015-02-25  Kai Tietz

   PR tree-optimization/61917
diff -rupN gcc-4.9-20150225/gcc/config/aarch64/aarch64.md
gcc-4.9-20150225.pr64304//gcc/config/aarch64/aarch64.md
--- gcc-4.9-20150225/gcc/config/aarch64/aarch64.md2015-03-04
20:41:03.0 -0500
+++ gcc-4.9-20150225.pr64304//gcc/config/aarch64/aarch64.md 2015-03-04
20:46:44.0 -0500
@@ -2719,6 +2719,8 @@
   DONE;
 }
 }
+else
+  FAIL;
 }
   )

@@ -2947,15 +2949,6 @@
 [(set_attr "type" "shift_reg")]
   )

-(define_insn "*ashl3_insn"
-  [(set (match_operand:SHORT 0 "register_operand" "=r")
-(ashift:SHORT (match_operand:SHORT 1 "register_operand" "r")
-  (match_operand:QI 2 "aarch64_reg_or_shift_imm_si" "rUss")))]
-  ""
-  "lsl\\t%0, %1, %2"
-  [(set_attr "type" "shift_reg")]
-)
-
   (define_insn "*3_insn"
 [(set (match_operand:SHORT 0 "register_operand" "=r")
   (ASHIFT:SHORT (match_operand:SHORT 1 "register_operand" "r")
diff -rupN gcc-4.9-20150225/gcc/testsuite/ChangeLog
gcc-4.9-20150225.pr64304//gcc/testsuite/ChangeLog
--- gcc-4.9-20150225/gcc/testsuite/ChangeLog2015-03-04
21:00:24.0 -0500
+++ gcc-4.9-20150225.pr64304//gcc/testsuite/ChangeLog2015-03-04
21:03:21.0 -0500
@@ -1,3 +1,10 @@
+2015-03-05  Shanyao chen
+
+Back

[PING][PATCH] ASan on unaligned accesses

2015-03-12 Thread Marat Zakirov




On 03/04/2015 11:00 AM, Marat Zakirov wrote:

Hi all!

Here is the patch which forces ASan to work on memory accesses without
proper alignment. It's useful because some programs, like the Linux kernel,
often cheat with alignment, which may cause false negatives. This patch
needs additional support in libsanitizer to work properly on unaligned
accesses to global data and heap; that will be implemented in a separate
patch.



--Marat
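
As background for why misaligned accesses can be missed, here is a small C
model of a shadow-memory check. It is a sketch under stated assumptions, not
what the patch emits: it assumes the usual ASan encoding (one shadow byte per
8 application bytes; 0 means all 8 bytes addressable, 1..7 means only the
first k bytes, negative means poisoned), and all function names are
hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define TOY_GRANULE 8

/* True if the single byte at ADDR is poisoned according to SHADOW.  */
bool
toy_byte_poisoned (const int8_t *shadow, uintptr_t addr)
{
  int8_t s = shadow[addr / TOY_GRANULE];
  if (s == 0)
    return false;
  if (s < 0)
    return true;
  return (int) (addr % TOY_GRANULE) >= s;
}

/* Check that consults only the shadow byte of the granule holding
   the first accessed byte.  This is enough for naturally aligned
   accesses of up to 8 bytes, which never cross a granule boundary.  */
bool
toy_check_first_granule (const int8_t *shadow, uintptr_t addr, unsigned size)
{
  int8_t s = shadow[addr / TOY_GRANULE];
  if (s == 0)
    return false;
  if (s < 0)
    return true;
  return (int) (addr % TOY_GRANULE) + (int) size - 1 >= s;
}

/* Slow reference check that looks at every accessed byte, so it also
   catches accesses that spill into the next granule.  */
bool
toy_check_every_byte (const int8_t *shadow, uintptr_t addr, unsigned size)
{
  for (unsigned i = 0; i < size; i++)
    if (toy_byte_poisoned (shadow, addr + i))
      return true;
  return false;
}
```

A 4-byte access at offset 6 with a fully addressable first granule and a
poisoned second granule passes the first check but fails the second; that is
the kind of false negative the message refers to.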


gcc/ChangeLog:

2015-02-25  Marat Zakirov  

	* asan.c (asan_emit_stack_protection): Support misaligned accesses.
	(asan_expand_check_ifn): Likewise.
	* params.def: New option asan-catch-misaligned.
	* params.h: New param ASAN_CATCH_MISALIGNED.

gcc/testsuite/ChangeLog:

2015-02-25  Marat Zakirov  

	* c-c++-common/asan/misalign-catch.c: New test.


diff --git a/gcc/asan.c b/gcc/asan.c
index b7c2b11..49d0da4 100644
--- a/gcc/asan.c
+++ b/gcc/asan.c
@@ -1050,7 +1050,6 @@ asan_emit_stack_protection (rtx base, rtx pbase, unsigned int alignb,
   rtx_code_label *lab;
   rtx_insn *insns;
   char buf[30];
-  unsigned char shadow_bytes[4];
   HOST_WIDE_INT base_offset = offsets[length - 1];
   HOST_WIDE_INT base_align_bias = 0, offset, prev_offset;
   HOST_WIDE_INT asan_frame_size = offsets[0] - base_offset;
@@ -1059,6 +1058,7 @@ asan_emit_stack_protection (rtx base, rtx pbase, unsigned int alignb,
   unsigned char cur_shadow_byte = ASAN_STACK_MAGIC_LEFT;
   tree str_cst, decl, id;
   int use_after_return_class = -1;
+  bool misalign = (flag_sanitize & SANITIZE_KERNEL_ADDRESS) || ASAN_CATCH_MISALIGNED;
 
   if (shadow_ptr_types[0] == NULL_TREE)
 asan_init_shadow_ptr_types ();
@@ -1193,11 +1193,37 @@ asan_emit_stack_protection (rtx base, rtx pbase, unsigned int alignb,
   if (STRICT_ALIGNMENT)
 set_mem_align (shadow_mem, (GET_MODE_ALIGNMENT (SImode)));
   prev_offset = base_offset;
+
+  vec shadow_mems;
+  vec shadow_bytes;
+
+  shadow_mems.create(0);
+  shadow_bytes.create(0);
+
   for (l = length; l; l -= 2)
 {
   if (l == 2)
 	cur_shadow_byte = ASAN_STACK_MAGIC_RIGHT;
   offset = offsets[l - 1];
+  if (l != length && misalign)
+	{
+	  HOST_WIDE_INT aoff
+	= base_offset + ((offset - base_offset)
+			 & ~(ASAN_RED_ZONE_SIZE - HOST_WIDE_INT_1))
+	  - ASAN_RED_ZONE_SIZE;
+	  if (aoff > prev_offset)
+	{
+	  shadow_mem = adjust_address (shadow_mem, VOIDmode,
+	   (aoff - prev_offset)
+	   >> ASAN_SHADOW_SHIFT);
+	  prev_offset = aoff;
+	  shadow_bytes.safe_push (0);
+	  shadow_bytes.safe_push (0);
+	  shadow_bytes.safe_push (0);
+	  shadow_bytes.safe_push (0);
+	  shadow_mems.safe_push (shadow_mem);
+	}
+	}
   if ((offset - base_offset) & (ASAN_RED_ZONE_SIZE - 1))
 	{
 	  int i;
@@ -1212,13 +1238,13 @@ asan_emit_stack_protection (rtx base, rtx pbase, unsigned int alignb,
 	if (aoff < offset)
 	  {
 		if (aoff < offset - (1 << ASAN_SHADOW_SHIFT) + 1)
-		  shadow_bytes[i] = 0;
+		  shadow_bytes.safe_push (0);
 		else
-		  shadow_bytes[i] = offset - aoff;
+		  shadow_bytes.safe_push (offset - aoff);
 	  }
 	else
-	  shadow_bytes[i] = ASAN_STACK_MAGIC_PARTIAL;
-	  emit_move_insn (shadow_mem, asan_shadow_cst (shadow_bytes));
+	  shadow_bytes.safe_push (ASAN_STACK_MAGIC_PARTIAL);
+	  shadow_mems.safe_push(shadow_mem);
 	  offset = aoff;
 	}
   while (offset <= offsets[l - 2] - ASAN_RED_ZONE_SIZE)
@@ -1227,12 +1253,21 @@ asan_emit_stack_protection (rtx base, rtx pbase, unsigned int alignb,
    (offset - prev_offset)
    >> ASAN_SHADOW_SHIFT);
 	  prev_offset = offset;
-	  memset (shadow_bytes, cur_shadow_byte, 4);
-	  emit_move_insn (shadow_mem, asan_shadow_cst (shadow_bytes));
+	  shadow_bytes.safe_push (cur_shadow_byte);
+	  shadow_bytes.safe_push (cur_shadow_byte);
+	  shadow_bytes.safe_push (cur_shadow_byte);
+	  shadow_bytes.safe_push (cur_shadow_byte);
+	  shadow_mems.safe_push(shadow_mem);
 	  offset += ASAN_RED_ZONE_SIZE;
 	}
   cur_shadow_byte = ASAN_STACK_MAGIC_MIDDLE;
 }
+  for (unsigned i = 0; misalign && i < shadow_bytes.length () - 1; i++)
+if (shadow_bytes[i] == 0 && shadow_bytes[i + 1] > 0)
+  shadow_bytes[i] = 8 + (shadow_bytes[i + 1] > 7 ? 0 : shadow_bytes[i + 1]);
+  for (unsigned i = 0; i < shadow_mems.length (); i++)
+emit_move_insn (shadow_mems[i], asan_shadow_cst (&shadow_bytes[i * 4]));
+  
   do_pending_stack_adjust ();
 
   /* Construct epilogue sequence.  */
@@ -1285,34 +1320,8 @@ asan_emit_stack_protection (rtx base, rtx pbase, unsigned int alignb,
   if (STRICT_ALIGNMENT)
 set_mem_align (shadow_mem, (GET_MODE_ALIGNMENT (SImode)));
 
-  prev_offset = base_offset;
-  last_offset = base_offset;
-  last_size = 0;
-  for (l = length; l; l -= 2)
-{
-  offset = base_offset + ((offsets[l - 1] - base_offset)
-			 & ~(ASAN_RED_ZONE_SIZE - HOST_WIDE_INT_1));
-  if (last_offset + last_size != offset)
-	{
-	  shadow_mem = adjust_address (shadow_mem, VOIDmode,
-   (last_offset - prev_offset)
-   >> ASA

Re: [PATCH, i386 testsuite]: Require nonpic target for some tests

2015-03-12 Thread Tom de Vries


On 12-03-15 10:57, Uros Bizjak wrote:

> On Thu, Mar 12, 2015 at 9:11 AM, Tom de Vries  wrote:
>
>>> Attached patch adds nonpic target requirement for some (obvious)
>>> cases, where data access or PIC register setup confuses scan-asms.
>>>
>>> 2015-01-30  Uros Bizjak
>>>
>>>   * gcc.target/i386/fuse-caller-save-rec.c: Require nonpic target.
>>>   * gcc.target/i386/fuse-caller-save-xmm.c: Ditto.
>>>   * gcc.target/i386/fuse-caller-save.c: Ditto.
>>
>> Hi,
>>
>> I've reverted this part of the patch. The scans were failing because the
>> -fipa-ra optimization was broken for -m32 -fpic (PR64895).
>
> Not really. Allocator is free to allocate %ebx (or other call-saved
> register) as PIC register. In this case, unwanted push/pop sequence
> will be emitted.

Sure, but I don't see what that has to do with the test-cases. I don't see a
pic register used in fuse-caller-save.c and fuse-caller-save-rec.c. I do see
a pic register used in gcc.target/i386/fuse-caller-save-xmm.c, but there's
no scan for push/pop sequence in there.


Thanks,
- Tom

[CHKP, PATCH] Fix LTO cgraph merge for instrumented functions

2015-03-12 Thread Ilya Enkovich

Hi,

Currently cgraph merge has several issues with instrumented code:
 - the original function node may be removed => no assembler name conflict
   is detected between function and variable
 - only the orig_decl name is privatized for an instrumented function => the
   node still shares its assembler name, which causes an infinite
   privatization loop
 - information about the changed name is stored in file_data of the
   instrumented node => the original section name may not be found for the
   original function
 - the chkp reference is not fixed when nodes are merged

This patch should fix these problems by keeping instrumentation thunks
reachable, privatizing both nodes and fixing chkp references.  Bootstrapped
and tested on x86_64-unknown-linux-gnu.  OK for trunk?


Thanks,
Ilya
--
gcc/

2015-03-12  Ilya Enkovich  

* ipa-chkp.h (chkp_maybe_fix_chkp_ref): New.
* ipa-chkp.c (chkp_maybe_fix_chkp_ref): New.
* ipa.c (symbol_table::remove_unreachable_nodes): Don't
remove instrumentation thunks calling reachable functions.
* lto-cgraph.c: Include ipa-chkp.h.
(input_symtab): Fix chkp references for boundary nodes.
* lto/lto-partition.c (privatize_symbol_name_1): New.
(privatize_symbol_name): Privatize both decl and orig_decl
names for instrumented functions.
* lto/lto-symtab.c: Include ipa-chkp.h.
(lto_cgraph_replace_node): Fix chkp references for merged
function nodes.

gcc/testsuite/

2015-03-12  Ilya Enkovich  

* gcc.dg/lto/chkp-privatize-1_0.c: New.
* gcc.dg/lto/chkp-privatize-1_1.c: New.
* gcc.dg/lto/chkp-privatize-2_0.c: New.
* gcc.dg/lto/chkp-privatize-2_1.c: New.


diff --git a/gcc/ipa-chkp.c b/gcc/ipa-chkp.c
index 0b857ff..223f4ed 100644
--- a/gcc/ipa-chkp.c
+++ b/gcc/ipa-chkp.c
@@ -414,6 +414,36 @@ chkp_instrumentable_p (tree fndecl)
  && (!fn || !copy_forbidden (fn, fndecl)));
 }
 
+/* Check NODE has a correct IPA_REF_CHKP reference.
+   Create a new reference if required.  */
+
+void
+chkp_maybe_fix_chkp_ref (cgraph_node *node)
+{
+  /* Firstly check node needs IPA_REF_CHKP.  */
+  if (node->instrumentation_clone
+  || !node->instrumented_version)
+return;
+
+  /* Check we already have a proper IPA_REF_CHKP.
+ Remove incorrect refs.  */
+  int i;
+  ipa_ref *ref = NULL;
+  for (i = 0; node->iterate_reference (i, ref); i++)
+if (ref->use == IPA_REF_CHKP)
+  {
+   /* Found proper reference.  */
+   if (ref->referred == node->instrumented_version)
+ return;
+
+   /* Need to recreate reference.  */
+   ref->remove_reference ();
+   break;
+  }
+
+  node->create_reference (node->instrumented_version, IPA_REF_CHKP, NULL);
+}
+
 /* Return clone created for instrumentation of NODE or NULL.  */
 
 cgraph_node *
diff --git a/gcc/ipa-chkp.h b/gcc/ipa-chkp.h
index 6708fe9..5fa7d88 100644
--- a/gcc/ipa-chkp.h
+++ b/gcc/ipa-chkp.h
@@ -24,5 +24,6 @@ extern tree chkp_copy_function_type_adding_bounds (tree 
orig_type);
 extern tree chkp_maybe_clone_builtin_fndecl (tree fndecl);
 extern cgraph_node *chkp_maybe_create_clone (tree fndecl);
 extern bool chkp_instrumentable_p (tree fndecl);
+extern void chkp_maybe_fix_chkp_ref (cgraph_node *node);
 
 #endif /* GCC_IPA_CHKP_H */
diff --git a/gcc/ipa.c b/gcc/ipa.c
index b3752de..ae6269f 100644
--- a/gcc/ipa.c
+++ b/gcc/ipa.c
@@ -492,7 +492,18 @@ symbol_table::remove_unreachable_nodes (FILE *file)
}
  else if (cnode->thunk.thunk_p)
enqueue_node (cnode->callees->callee, &first, &reachable);
- 
+
+ /* For instrumentation clones we always need original
+function node for proper LTO privatization.  */
+ if (cnode->instrumentation_clone
+ && reachable.contains (cnode)
+ && cnode->definition)
+   {
+ gcc_assert (cnode->instrumented_version);
+ enqueue_node (cnode->instrumented_version, &first, &reachable);
+ reachable.add (cnode->instrumented_version);
+   }
+
  /* If any reachable function has simd clones, mark them as
 reachable as well.  */
  if (cnode->simd_clones)
diff --git a/gcc/lto-cgraph.c b/gcc/lto-cgraph.c
index c875fed..b9196eb 100644
--- a/gcc/lto-cgraph.c
+++ b/gcc/lto-cgraph.c
@@ -80,6 +80,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "pass_manager.h"
 #include "ipa-utils.h"
 #include "omp-low.h"
+#include "ipa-chkp.h"
 
 /* True when asm nodes has been output.  */
 bool asm_nodes_output = false;
@@ -1888,6 +1889,10 @@ input_symtab (void)
 context of the nested function.  */
   if (node->lto_file_data)
node->aux = NULL;
+
+  /* May need to fix chkp reference because we don't stream
+them for boundary symbols.  */
+  chkp_maybe_fix_chkp_ref (node);
 }
 }
 
diff --git a/gcc/lto/lto-partition.c b/gcc/lto/lto-partition.c
index 235b735..7d117e9 100644
--- a/gcc/lto/lto-partition.c
+++ b/gcc/lto/lto-par

RFA: Update gcc test 20101011-1.c with more targets that do not trap

2015-03-12 Thread Nick Clifton

Hi Guys,

  The patch below updates the 20101011-1.c test in the gcc testsuite to
  add a few more targets whose (simulated) runtime does not support
  trapping on division by zero.

  OK to apply ?

Cheers
  Nick

gcc/testsuite/ChangeLog
2015-03-12  Nick Clifton  

* gcc.c-torture/execute/20101011-1.c: Skip this test for the V850,
MSP430, RL78 and RX targets.

Index: gcc/testsuite/gcc.c-torture/execute/20101011-1.c
===
--- gcc/testsuite/gcc.c-torture/execute/20101011-1.c(revision 221346)
+++ gcc/testsuite/gcc.c-torture/execute/20101011-1.c(working copy)
@@ -12,6 +12,18 @@
 #elif defined (__sh__)
   /* On SH division by zero does not trap.  */
 # define DO_TEST 0
+#elif defined (__v850__)
+  /* On V850 division by zero does not trap.  */
+# define DO_TEST 0
+#elif defined (__MSP430__)
+  /* On MSP430 division by zero does not trap.  */
+# define DO_TEST 0
+#elif defined (__RL78__)
+  /* On RL78 division by zero does not trap.  */
+# define DO_TEST 0
+#elif defined (__RX__)
+  /* On RX division by zero does not trap.  */
+# define DO_TEST 0
 #elif defined (__aarch64__)
   /* On AArch64 integer division by zero does not trap.  */
 # define DO_TEST 0

Re: [PATCH, i386 testsuite]: Require nonpic target for some tests

2015-03-12 Thread Uros Bizjak

On Thu, Mar 12, 2015 at 9:11 AM, Tom de Vries  wrote:

>> Attached patch adds nonpic target requirement for some (obvious)
>> cases, where data access or PIC register setup confuses scan-asms.
>>
>> 2015-01-30  Uros Bizjak  
>>
>>  * gcc.target/i386/fuse-caller-save-rec.c: Require nonpic target.
>>  * gcc.target/i386/fuse-caller-save-xmm.c: Ditto.
>>  * gcc.target/i386/fuse-caller-save.c: Ditto.
>
>
> Hi,
>
> I've reverted this part of the patch. The scans were failing because the
> -fipa-ra optimization was broken for -m32 -fpic (PR64895).

Not really. Allocator is free to allocate %ebx (or other call-saved
register) as PIC register. In this case, unwanted push/pop sequence
will be emitted.

Uros.

Re: [PATCH, PR target/65103, 1/3] Fix cost of PIC register in ix86_address_cost

2015-03-12 Thread Ilya Enkovich

On 10 Mar 19:08, Uros Bizjak wrote:
> Hello!
> 
> > > > Test      O2 ref  patched    Ofast + LTO ref  patched
> > > > 164.gzip  12      0 (-100%)  39               0 (-100%)
> > > > 175.vpr   0       0 (-0%)    4                0 (-100%)
> > > > 176.gcc   141     6 (-96%)   294              10 (-97%)
> > > > 181.mcf   4       0 (-100%)  4                2 (-50%)
> 
> Do you also have executable sizes at hand?

Summary size change for SPEC2000 on -O2 is -0,11%.

> 
> > 2015-03-10  Ilya Enkovich  
> >
> > PR target/65103
> > * config/i386/i386.c (ix86_address_cost): Fix cost of a PIC
> > register.
> >
> > gcc/testsuite/
> >
> > 2015-03-10  Ilya Enkovich  
> >
> > PR target/65103
> > * gcc.target/i386/pr65103-1.c: New.
> 
> LGTM, just a nit below.
> 
> Otherwise, OK for mainline as a bugfix (but please wait for a day if
> there are any objections from release managers).
> 
> +  /* Attempt to minimize number of registers in the address.
> 
> This is now a displaced comment. Please integrate it in the main comment.
> 
> Thanks,
> Uros.

Here is a final version.

Thanks,
Ilya
--
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index ab8f03a..47deda7 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -12931,30 +12931,26 @@ ix86_address_cost (rtx x, machine_mode, addr_space_t, 
bool)
   if (parts.index && GET_CODE (parts.index) == SUBREG)
 parts.index = SUBREG_REG (parts.index);
 
-  /* Attempt to minimize number of registers in the address.  */
-  if ((parts.base
-   && (!REG_P (parts.base) || REGNO (parts.base) >= FIRST_PSEUDO_REGISTER))
-  || (parts.index
- && (!REG_P (parts.index)
- || REGNO (parts.index) >= FIRST_PSEUDO_REGISTER)))
-cost++;
-
-  /* When address base or index is "pic_offset_table_rtx" we don't increase
- address cost.  When a memopt with "pic_offset_table_rtx" is not invariant
- itself it most likely means that base or index is not invariant.
- Therefore only "pic_offset_table_rtx" could be hoisted out, which is not
- profitable for x86.  */
+  /* Attempt to minimize number of registers in the address by increasing
+ address cost for each used register.  We don't increase address cost
+ for "pic_offset_table_rtx".  When a memopt with "pic_offset_table_rtx"
+ is not invariant itself it most likely means that base or index is not
+ invariant.  Therefore only "pic_offset_table_rtx" could be hoisted out,
+ which is not profitable for x86.  */
   if (parts.base
-  && (current_pass->type == GIMPLE_PASS
- || (!pic_offset_table_rtx
- || REGNO (pic_offset_table_rtx) != REGNO(parts.base)))
   && (!REG_P (parts.base) || REGNO (parts.base) >= FIRST_PSEUDO_REGISTER)
-  && parts.index
   && (current_pass->type == GIMPLE_PASS
- || (!pic_offset_table_rtx
- || REGNO (pic_offset_table_rtx) != REGNO(parts.index)))
+ || !pic_offset_table_rtx
+ || !REG_P (parts.base)
+ || REGNO (pic_offset_table_rtx) != REGNO (parts.base)))
+cost++;
+
+  if (parts.index
   && (!REG_P (parts.index) || REGNO (parts.index) >= FIRST_PSEUDO_REGISTER)
-  && parts.base != parts.index)
+  && (current_pass->type == GIMPLE_PASS
+ || !pic_offset_table_rtx
+ || !REG_P (parts.index)
+ || REGNO (pic_offset_table_rtx) != REGNO (parts.index)))
 cost++;
 
   /* AMD-K6 don't like addresses with ModR/M set to 00_xxx_100b,
diff --git a/gcc/testsuite/gcc.target/i386/pr65103-1.c 
b/gcc/testsuite/gcc.target/i386/pr65103-1.c
new file mode 100644
index 000..4e3a7a3
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr65103-1.c
@@ -0,0 +1,19 @@
+/* { dg-do compile { target ia32 } } */
+/* { dg-require-effective-target pie } */
+/* { dg-options "-O2 -fPIE" } */
+/* { dg-final { scan-assembler-not "GOTOFF," } } */
+
+typedef struct S
+{
+  int a;
+  int sum;
+  int delta;
+} S;
+
+S gs;
+int global_opt (int max)
+{
+  while (gs.sum < max)
+gs.sum += gs.delta;
+  return gs.a;
+}
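
As a reading aid for the final hunk, the per-register cost rule can be modelled
in C as below. This is a simplified sketch, not the i386.c code: the struct,
the function names and the TOY_FIRST_PSEUDO value are assumptions for
illustration, and the model returns only the increments contributed by base and
index, ignoring the rest of ix86_address_cost.

```c
#include <stdbool.h>

#define TOY_FIRST_PSEUDO 76    /* illustrative value only */
#define TOY_NO_REG (-1)        /* models a null pic_offset_table_rtx */

/* Hypothetical model of one address part (base or index).  */
struct toy_addr_part
{
  bool present;                /* the part exists in the address */
  bool is_reg;                 /* models REG_P */
  int regno;                   /* models REGNO, valid if is_reg */
};

/* Returns 1 if PART should increase the address cost, else 0.  */
int
toy_part_cost (struct toy_addr_part part, bool gimple_pass, int pic_regno)
{
  if (!part.present)
    return 0;
  /* Hard registers do not add to the cost.  */
  if (part.is_reg && part.regno < TOY_FIRST_PSEUDO)
    return 0;
  /* Outside GIMPLE passes, the PIC register does not add to the cost.  */
  if (!gimple_pass && pic_regno != TOY_NO_REG
      && part.is_reg && part.regno == pic_regno)
    return 0;
  return 1;
}

/* Base and index are now counted independently.  */
int
toy_address_cost (struct toy_addr_part base, struct toy_addr_part index,
                  bool gimple_pass, int pic_regno)
{
  return toy_part_cost (base, gimple_pass, pic_regno)
         + toy_part_cost (index, gimple_pass, pic_regno);
}
```

In this model an address of the form PIC register plus a pseudo index costs one
increment in RTL passes and two in GIMPLE passes, which matches the structure
of the two separate conditions in the patch.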

Re: [PATCH, i386 testsuite]: Require nonpic target for some tests

2015-03-12 Thread Uros Bizjak

On Thu, Mar 12, 2015 at 11:41 AM, Tom de Vries  wrote:

>>>> Attached patch adds nonpic target requirement for some (obvious)
>>>> cases, where data access or PIC register setup confuses scan-asms.
>>>>
>>>> 2015-01-30  Uros Bizjak
>>>>
>>>>   * gcc.target/i386/fuse-caller-save-rec.c: Require nonpic target.
>>>>   * gcc.target/i386/fuse-caller-save-xmm.c: Ditto.
>>>>   * gcc.target/i386/fuse-caller-save.c: Ditto.
>>>
>>>
>>>
>>> Hi,
>>>
>>> I've reverted this part of the patch. The scans were failing because the
>>> -fipa-ra optimization was broken for -m32 -fpic (PR64895).
>>
>>
>> Not really.
>>
>> Allocator is free to allocate %ebx (or other call-saved
>> register) as PIC register.
>>
>> In this case, unwanted push/pop sequence
>> will be emitted.
>>
>
> Sure, but I don't see what that has to do with the test-cases. I don't see a
> pic register used in fuse-caller-save.c and fuse-caller-save-rec.c. I do see
> a pic register used in gcc.target/i386/fuse-caller-save-xmm.c, but there's
> no scan for push/pop sequence in there.

You are right, the call is (obviously) to a local function. There is
no need for PIC reg, so this clears my concerns.

Thanks,
Uros.

Re: [PATCH, PR target/65103, 1/3] Fix cost of PIC register in ix86_address_cost

2015-03-12 Thread Uros Bizjak

On Thu, Mar 12, 2015 at 10:50 AM, Ilya Enkovich  wrote:

>> > > > Test      O2 ref  patched    Ofast + LTO ref  patched
>> > > > 164.gzip  12      0 (-100%)  39               0 (-100%)
>> > > > 175.vpr   0       0 (-0%)    4                0 (-100%)
>> > > > 176.gcc   141     6 (-96%)   294              10 (-97%)
>> > > > 181.mcf   4       0 (-100%)  4                2 (-50%)
>>
>> Do you also have executable sizes at hand?
>
> Summary size change for SPEC2000 on -O2 is -0,11%.

Nice!

>> > 2015-03-10  Ilya Enkovich  
>> >
>> > PR target/65103
>> > * config/i386/i386.c (ix86_address_cost): Fix cost of a PIC
>> > register.
>> >
>> > gcc/testsuite/
>> >
>> > 2015-03-10  Ilya Enkovich  
>> >
>> > PR target/65103
>> > * gcc.target/i386/pr65103-1.c: New.
>>
>> LGTM, just a nit below.
>>
>> Otherwise, OK for mainline as a bugfix (but please wait for a day if
>> there are any objections from release managers).
>>
>> +  /* Attempt to minimize number of registers in the address.
>>
>> This is now a displaced comment. Please integrate it in the main comment.
>>
>> Thanks,
>> Uros.
>
> Here is a final version.

OK for mainline.

Thanks,
Uros.

Re: [PATCH, i386 testsuite]: Require nonpic target for some tests

2015-03-12 Thread Tom de Vries


On 30-01-15 20:49, Uros Bizjak wrote:

> Hello!
>
> Attached patch adds nonpic target requirement for some (obvious)
> cases, where data access or PIC register setup confuses scan-asms.
>
> 2015-01-30  Uros Bizjak
>
>  * gcc.target/i386/fuse-caller-save-rec.c: Require nonpic target.
>  * gcc.target/i386/fuse-caller-save-xmm.c: Ditto.
>  * gcc.target/i386/fuse-caller-save.c: Ditto.


Hi,

I've reverted this part of the patch. The scans were failing because the 
-fipa-ra optimization was broken for -m32 -fpic (PR64895).


Thanks,
- Tom

2015-03-12  Tom de Vries  

	PR rtl-optimization/64895
	* gcc.target/i386/fuse-caller-save-rec.c: Revert require nonpic target.
	* gcc.target/i386/fuse-caller-save-xmm.c: Ditto.
	* gcc.target/i386/fuse-caller-save.c: Ditto.

diff --git a/gcc/testsuite/gcc.target/i386/fuse-caller-save-rec.c b/gcc/testsuite/gcc.target/i386/fuse-caller-save-rec.c
index ed0984c..c660e01 100644
--- a/gcc/testsuite/gcc.target/i386/fuse-caller-save-rec.c
+++ b/gcc/testsuite/gcc.target/i386/fuse-caller-save-rec.c
@@ -1,5 +1,4 @@
 /* { dg-do compile } */
-/* { dg-require-effective-target nonpic } */
 /* { dg-options "-O2 -fipa-ra -fomit-frame-pointer -fno-optimize-sibling-calls" } */
 /* { dg-additional-options "-mregparm=1" { target ia32 } } */
 
diff --git a/gcc/testsuite/gcc.target/i386/fuse-caller-save-xmm.c b/gcc/testsuite/gcc.target/i386/fuse-caller-save-xmm.c
index 261ba07..1d02844 100644
--- a/gcc/testsuite/gcc.target/i386/fuse-caller-save-xmm.c
+++ b/gcc/testsuite/gcc.target/i386/fuse-caller-save-xmm.c
@@ -1,5 +1,4 @@
 /* { dg-do compile } */
-/* { dg-require-effective-target nonpic } */
 /* { dg-options "-O2 -msse2 -mno-avx -fipa-ra -fomit-frame-pointer" } */
 
 typedef double v2df __attribute__((vector_size (16)));
diff --git a/gcc/testsuite/gcc.target/i386/fuse-caller-save.c b/gcc/testsuite/gcc.target/i386/fuse-caller-save.c
index b9494ac..7cfd22a 100644
--- a/gcc/testsuite/gcc.target/i386/fuse-caller-save.c
+++ b/gcc/testsuite/gcc.target/i386/fuse-caller-save.c
@@ -1,5 +1,4 @@
 /* { dg-do compile } */
-/* { dg-require-effective-target nonpic } */
 /* { dg-options "-O2 -fipa-ra -fomit-frame-pointer" } */
 /* { dg-additional-options "-mregparm=1" { target ia32 } } */
 
-- 
1.9.1

[Committed][PR64895] Use actual_call_used_reg_set to find conflicting regs

2015-03-12 Thread Tom de Vries


Hi,

This patch fixes PR64895, related to the gcc.target/i386/fuse-caller-save*.c 
failures for -m32 -fpic.


Bootstrapped and reg-tested on x86_64 for unix/ and unix/-m32.
Built and reg-tested on x86_64 for unix/fpic and unix/fpic/-m32.

Approved here ( https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64895#c7 ):
...
The patch looks ok to me.  Tom, could you prepare the patch (check it mostly for 
x86-64 bootstrap and testsuite) and commit it to the trunk.  I approve it.

...

Thanks,
- Tom
2015-03-12  Tom de Vries  

	PR rtl-optimization/64895
	* lra-lives.c (check_pseudos_live_through_calls): Use
	actual_call_used_reg_set instead of call_used_reg_set, if available.

diff --git a/gcc/lra-lives.c b/gcc/lra-lives.c
index 9dfffb6..5d759ca 100644
--- a/gcc/lra-lives.c
+++ b/gcc/lra-lives.c
@@ -636,8 +636,12 @@ check_pseudos_live_through_calls (int regno)
   if (! sparseset_bit_p (pseudos_live_through_calls, regno))
 return;
   sparseset_clear_bit (pseudos_live_through_calls, regno);
+  bool actual_call_used_reg_set_available_p
+= !hard_reg_set_empty_p (lra_reg_info[regno].actual_call_used_reg_set);
   IOR_HARD_REG_SET (lra_reg_info[regno].conflict_hard_regs,
-		call_used_reg_set);
+		(actual_call_used_reg_set_available_p
+		 ? lra_reg_info[regno].actual_call_used_reg_set
+		 : call_used_reg_set));
 
   for (hr = 0; hr < FIRST_PSEUDO_REGISTER; hr++)
 if (HARD_REGNO_CALL_PART_CLOBBERED (hr, PSEUDO_REGNO_MODE (regno)))
-- 
1.9.1
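
Editorial sketch, not part of the patch: the selection logic in the hunk above can be modelled outside of GCC as below. HardRegSet and add_call_conflicts are hypothetical stand-ins (GCC uses HARD_REG_SET and the IOR_HARD_REG_SET macro); the only behaviour taken from the patch is "use the per-pseudo set when it is non-empty, otherwise fall back to call_used_reg_set".

```cpp
#include <bitset>
#include <cassert>

// Hypothetical stand-in for GCC's HARD_REG_SET: one bit per hard register.
using HardRegSet = std::bitset<16>;

// Add to CONFLICTS the registers a call may clobber.  Prefer the per-pseudo
// ACTUAL set when it was recorded (non-empty); otherwise fall back to the
// conservative, ABI-wide CALL_USED set.
inline void add_call_conflicts(HardRegSet &conflicts,
                               const HardRegSet &actual,
                               const HardRegSet &call_used)
{
  const bool actual_available = actual.any();
  conflicts |= actual_available ? actual : call_used;
}
```

With an empty ACTUAL set the result matches the old behaviour; with a recorded set, only the registers in that set become conflicts.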

Re: [patch] disable libmpx x32 multilib builds

2015-03-12 Thread Ilya Enkovich

On 11 Mar 19:11, Ilya Enkovich wrote:
> 2015-03-11 18:59 GMT+03:00 H.J. Lu :
> > On Wed, Mar 11, 2015 at 7:37 AM, Matthias Klose  wrote:
> >> current trunk fails to build on x86*-linux, when configured for x32 
> >> multilibs
> >> because libmpx doesn't support these. Disable them.
> >>
> >> ok for the trunk?
> >>
> >> * Disable libmpx x32 multilib builds.
> >>
> >> --- a/config-ml.in
> >> +++ b/config-ml.in
> >> @@ -102,6 +102,7 @@
> >>  Makefile=${ac_file-Makefile}
> >>  ml_config_shell=${CONFIG_SHELL-/bin/sh}
> >>  ml_realsrcdir=${srcdir}
> >> +ml_srcbase=`basename $ml_realsrcdir`
> >>
> >>  # Scan all the arguments and set all the ones we need.
> >>
> >> @@ -220,6 +221,10 @@
> >>if [ "${dir}" = "." ]; then
> >>  true
> >>else
> >> +# libmpx is not supported on x32
> >> +if [ "${ml_srcbase}-${dir}" = libmpx-x32 ]; then
> >> +  continue
> >> +fi
> >>  if [ -z "${multidirs}" ]; then
> >>multidirs="${dir}"
> >>  else
> >
> > This is incorrect.  Ilya and I are working on a proper fix.
> >
> > --
> > H.J.
> 
> Current libmpx configure has a check for x32 but it doesn't work due
> to square brackets removed from the test by autoconf. Will test this
> patch:
> 
> diff --git a/libmpx/configure.ac b/libmpx/configure.ac
> index 4669525..fe0d3f2 100644
> --- a/libmpx/configure.ac
> +++ b/libmpx/configure.ac
> @@ -28,7 +28,7 @@ GCC_LIBSTDCXX_RAW_CXX_FLAGS
>  # See if supported.
>  unset LIBMPX_SUPPORTED
>  AC_MSG_CHECKING([for target support for Intel MPX runtime library])
> -echo "int i[sizeof (void *) == 4 ? 1 : -1] = { __x86_64__ };" > conftest.c
> +echo "int i[[sizeof (void *) == 4 ? 1 : -1]] = { __x86_64__ };" > conftest.c
>  if AC_TRY_COMMAND([${CC} ${CFLAGS} -c -o conftest.o conftest.c
> 1>&AS_MESSAGE_LOG_FD])
>  then
>  LIBMPX_SUPPORTED=no
> 
> 
> Thanks,
> Ilya

Successfully bootstrapped on x86_64-unknown-linux-gnu with '--enable-libmpx 
--with-multilib-list=m32,m64,mx32'.  Applied to trunk.

Thanks,
Ilya
--
2015-03-12  Ilya Enkovich  

PR other/65384
* configure.ac: Fix x32 test.
* configure: Regenerate.


diff --git a/libmpx/configure.ac b/libmpx/configure.ac
index 4669525..fe0d3f2 100644
--- a/libmpx/configure.ac
+++ b/libmpx/configure.ac
@@ -28,7 +28,7 @@ GCC_LIBSTDCXX_RAW_CXX_FLAGS
 # See if supported.
 unset LIBMPX_SUPPORTED
 AC_MSG_CHECKING([for target support for Intel MPX runtime library])
-echo "int i[sizeof (void *) == 4 ? 1 : -1] = { __x86_64__ };" > conftest.c
+echo "int i[[sizeof (void *) == 4 ? 1 : -1]] = { __x86_64__ };" > conftest.c
 if AC_TRY_COMMAND([${CC} ${CFLAGS} -c -o conftest.o conftest.c 
1>&AS_MESSAGE_LOG_FD])
 then
 LIBMPX_SUPPORTED=no
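
Editorial sketch, not part of the patch: the probe relies on the rule that an array with a negative bound does not compile, so conftest.c builds only when pointers are 4 bytes wide and __x86_64__ is defined. In configure.ac, square brackets are M4 quote characters and one level is stripped, which is why the fix doubles them. The macro and function names below are hypothetical and only illustrate the array-bound trick itself.

```cpp
#include <cassert>

// An array type with a negative bound is ill-formed, so this typedef only
// compiles when COND is true.  NAME is a hypothetical label for the check.
#define COMPILE_TIME_CHECK(name, cond) typedef int name[(cond) ? 1 : -1]

// Compiles everywhere: sizeof (char) is 1 by definition.
COMPILE_TIME_CHECK(char_is_one_byte, sizeof(char) == 1);

// Runtime view of the pointer-width condition the configure probe tests.
inline bool pointers_are_32_bit() { return sizeof(void *) == 4; }
```

Changing the condition in COMPILE_TIME_CHECK to something false turns the typedef into a hard compile error, which is the signal the configure test keys on.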

[Patch, Fortran] Reject unsupported coarray communication

2015-03-12 Thread Tobias Burnus

There are two groups of features which are not properly implemented with 
remote access:


* "caf(:)[i]%a" might have a byte stride which is not compatible with 
the size of "a". (Fix: new array descriptor.)
* All accesses that involve dereferencing pointers in a remote coarray 
(e.g. "caf[i]%ptr_comp = 5") are not supported.


This patch now rejects them - instead of accepting them silently and 
doing the wrong things at runtime.


Built and regtested on x86-64-gnu-linux.
OK for the trunk?

Tobias
2015-03-11  Tobias Burnus  

	* trans-expr.c (gfc_get_tree_for_caf_expr): Reject unimplemented
	coindexed coarray accesses.

	* gfortran.dg/coarray_38.f90: New.
	* gfortran.dg/coarray_39.f90: New.
	* gfortran.dg/coarray/coindexed_3.f90: Add dg-error, turn into
	compile test.

 gcc/fortran/trans-expr.c  |  57 +-
 gcc/testsuite/gfortran.dg/coarray/coindexed_3.f90 |  10 +-
 gcc/testsuite/gfortran.dg/coarray_38.f90  | 124 ++
 gcc/testsuite/gfortran.dg/coarray_39.f90  | 124 ++
 4 files changed, 309 insertions(+), 6 deletions(-)

diff --git a/gcc/fortran/trans-expr.c b/gcc/fortran/trans-expr.c
index 353d012..87d3a2d 100644
--- a/gcc/fortran/trans-expr.c
+++ b/gcc/fortran/trans-expr.c
@@ -1498,10 +1498,65 @@ gfc_get_tree_for_caf_expr (gfc_expr *expr)
 {
   tree caf_decl;
   bool found = false;
-  gfc_ref *ref;
+  gfc_ref *ref, *comp_ref = NULL;
 
   gcc_assert (expr && expr->expr_type == EXPR_VARIABLE);
 
+  /* Not-implemented diagnostic.  */
+  for (ref = expr->ref; ref; ref = ref->next)
+if (ref->type == REF_COMPONENT)
+  {
+comp_ref = ref;
+	if ((ref->u.c.component->ts.type == BT_CLASS
+	 && !CLASS_DATA (ref->u.c.component)->attr.codimension
+	 && (CLASS_DATA (ref->u.c.component)->attr.pointer
+		 || CLASS_DATA (ref->u.c.component)->attr.allocatable))
+	|| (ref->u.c.component->ts.type != BT_CLASS
+		&& !ref->u.c.component->attr.codimension
+		&& (ref->u.c.component->attr.pointer
+		|| ref->u.c.component->attr.allocatable)))
+	  gfc_error ("Sorry, coindexed access to a pointer or allocatable "
+		 "component of the coindexed coarray at %L is not yet "
+		 "supported", &expr->where);
+  }
+  if ((!comp_ref
+   && ((expr->symtree->n.sym->ts.type == BT_CLASS
+	&& CLASS_DATA (expr->symtree->n.sym)->attr.alloc_comp)
+	   || (expr->symtree->n.sym->ts.type == BT_DERIVED
+	   && expr->symtree->n.sym->ts.u.derived->attr.alloc_comp)))
+  || (comp_ref
+	  && ((comp_ref->u.c.component->ts.type == BT_CLASS
+	   && CLASS_DATA (comp_ref->u.c.component)->attr.alloc_comp)
+	  || (comp_ref->u.c.component->ts.type == BT_DERIVED
+		  && comp_ref->u.c.component->ts.u.derived->attr.alloc_comp
+gfc_error ("Sorry, coindexed coarray at %L with allocatable component is "
+	   "not yet supported", &expr->where);
+
+  if (expr->rank)
+{
+  /* Without the new array descriptor, access like "caf[i]%a(:)%b" is in
+	 general not possible as the required stride multiplier might be not
+	 a multiple of c_sizeof(b). In case of noncoindexed access, the
+	 scalarizer often takes care of it - for coarrays, it always fails.  */
+  for (ref = expr->ref; ref; ref = ref->next)
+if (ref->type == REF_COMPONENT
+	&& ((ref->u.c.component->ts.type == BT_CLASS
+		 && CLASS_DATA (ref->u.c.component)->attr.codimension)
+	|| (ref->u.c.component->ts.type != BT_CLASS
+		&& ref->u.c.component->attr.codimension)))
+	  break;
+  if (ref == NULL)
+	ref = expr->ref;
+  for ( ; ref; ref = ref->next)
+	if (ref->type == REF_ARRAY && ref->u.ar.dimen)
+	  break;
+  for ( ; ref; ref = ref->next)
+	if (ref->type == REF_COMPONENT)
+	  gfc_error ("Sorry, coindexed access at %L to a scalar component "
+		 "with an array partref is not yet supported",
+		 &expr->where);
+}
+
   caf_decl = expr->symtree->n.sym->backend_decl;
   gcc_assert (caf_decl);
   if (expr->symtree->n.sym->ts.type == BT_CLASS)
diff --git a/gcc/testsuite/gfortran.dg/coarray/coindexed_3.f90 b/gcc/testsuite/gfortran.dg/coarray/coindexed_3.f90
index 46488f3..4642f2c 100644
--- a/gcc/testsuite/gfortran.dg/coarray/coindexed_3.f90
+++ b/gcc/testsuite/gfortran.dg/coarray/coindexed_3.f90
@@ -1,4 +1,4 @@
-! { dg-do run }
+! { dg-do compile }
 !
 ! Contributed by Reinhold Bader
 !
@@ -45,8 +45,8 @@ program pmup
   allocate(t :: a(3)[*])
   IF (this_image() == num_images()) THEN
 SELECT TYPE (a)
-  TYPE IS (t)
-  a(:)[1]%a = 4.0
+  TYPE IS (t) ! FIXME: When implemented, turn into "do-do run"
+  a(:)[1]%a = 4.0 ! { dg-error "Sorry, coindexed access at \\(1\\) to a scalar component with an array partref is not yet supported" }
 END SELECT
   END IF
   SYNC ALL
@@ -56,8 +56,8 @@ program pmup
TYPE IS (real)
   ii = a(1)[1]
   call abort()
-TYPE IS (t)
-  IF (ALL(A(:)[1]%a == 4.0)) THEN
+TYPE IS (t)   ! FIXME: When implemented, turn into

Re: [C++ Patch] PR 65323

2015-03-12 Thread Paolo Carlini


Hi,

On 03/11/2015 09:26 PM, Jason Merrill wrote:
> On 03/06/2015 03:36 AM, Paolo Carlini wrote:
>> this is a regression about duplicate warnings with
>> -Wzero-as-null-pointer-constant. The regression is rather old, affects
>> 4_8-branch too, and started when check_default_argument got a
>> perform_implicit_conversion_flags call which warns a first time, then
>> maybe_warn_zero_as_null_pointer_constant as called by
>> check_default_argument itself warns a second time. The latter call is
>> even older, dates back to c++/52718, I think we can now safely remove it
>> and keep on returning nullptr_node to avoid warning later still at the
>> call sites (that was the point of c++/52718). Tested x86_64-linux.
>
> Do we need this special handling at all?  When I remove that whole
> 'if' block I still only get one warning from the 52718 testcase.

I just tried again for this reduced version of 52718 (I added a 0 && to 
the 'if'):


void* fun(void* a = 0);
void* f2 = fun();

and I got (removed the irrelevant carets):

52718_red.C:1:22: warning: zero as null pointer constant 
[-Wzero-as-null-pointer-constant]

 void* fun(void* a = 0);

52718_red.C:2:16: warning: zero as null pointer constant 
[-Wzero-as-null-pointer-constant]

 void* f2 = fun();

That is, as far as I can see, the rationale that led to an early 
*return* for 52718 still stands: no matter what we do at the beginning of 
check_default_argument, whether we warn via 
perform_implicit_conversion_flags or immediately, we still want to return 
early to avoid warning again at the call site.


Paolo.
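
Editorial sketch, not from the thread: for comparison, a hypothetical C++11 variant of the reduced testcase that spells the default argument as nullptr. No literal 0 is converted to a pointer here, so -Wzero-as-null-pointer-constant should have nothing to report at either the declaration or the call. The function bodies are assumptions added only so the example links and runs.

```cpp
#include <cassert>

// Hypothetical variant of the reduced testcase: the default argument is
// nullptr rather than the literal 0.
inline void *fun(void *a = nullptr) { return a; }

// The default argument is substituted at the call site, which is where the
// second warning in the transcript above was reported.
inline void *call_with_default() { return fun(); }
```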

[PATCH] Fix recent OEP_ADDRESS_OF change

2015-03-12 Thread Richard Biener


This fixes something noticed by Honza - I was resetting OEP_ADDRESS_OF
before actually testing for it in MEM_REF/TARGET_MEM_REF handling.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied.

Richard.

2015-03-12  Richard Biener  

PR middle-end/65270
* fold-const.c (operand_equal_p): Fix ordering of resetting
OEP_ADDRESS_OF and checking for it in the [TARGET_]MEM_REF case.

Index: gcc/fold-const.c
===
*** gcc/fold-const.c(revision 221324)
--- gcc/fold-const.c(working copy)
*** operand_equal_p (const_tree arg0, const_
*** 2934,2954 
  return OP_SAME (0);
  
case TARGET_MEM_REF:
- flags &= ~(OEP_CONSTANT_ADDRESS_OF|OEP_ADDRESS_OF);
- /* Require equal extra operands and then fall through to MEM_REF
-handling of the two common operands.  */
- if (!OP_SAME_WITH_NULL (2)
- || !OP_SAME_WITH_NULL (3)
- || !OP_SAME_WITH_NULL (4))
-   return 0;
- /* Fallthru.  */
case MEM_REF:
- flags &= ~(OEP_CONSTANT_ADDRESS_OF|OEP_ADDRESS_OF);
  /* Require equal access sizes, and similar pointer types.
 We can have incomplete types for array references of
 variable-sized arrays from the Fortran frontend
 though.  Also verify the types are compatible.  */
! return ((TYPE_SIZE (TREE_TYPE (arg0)) == TYPE_SIZE (TREE_TYPE (arg1))
   || (TYPE_SIZE (TREE_TYPE (arg0))
   && TYPE_SIZE (TREE_TYPE (arg1))
   && operand_equal_p (TYPE_SIZE (TREE_TYPE (arg0)),
--- 2934,2945 
  return OP_SAME (0);
  
case TARGET_MEM_REF:
case MEM_REF:
  /* Require equal access sizes, and similar pointer types.
 We can have incomplete types for array references of
 variable-sized arrays from the Fortran frontend
 though.  Also verify the types are compatible.  */
! if (!((TYPE_SIZE (TREE_TYPE (arg0)) == TYPE_SIZE (TREE_TYPE (arg1))
 || (TYPE_SIZE (TREE_TYPE (arg0))
 && TYPE_SIZE (TREE_TYPE (arg1))
 && operand_equal_p (TYPE_SIZE (TREE_TYPE (arg0)),
*** operand_equal_p (const_tree arg0, const_
*** 2963,2970 
  && (MR_DEPENDENCE_BASE (arg0)
  == MR_DEPENDENCE_BASE (arg1))
  && (TYPE_ALIGN (TREE_TYPE (arg0))
! == TYPE_ALIGN (TREE_TYPE (arg1)
! && OP_SAME (0) && OP_SAME (1));
  
case ARRAY_REF:
case ARRAY_RANGE_REF:
--- 2954,2968 
&& (MR_DEPENDENCE_BASE (arg0)
== MR_DEPENDENCE_BASE (arg1))
&& (TYPE_ALIGN (TREE_TYPE (arg0))
!   == TYPE_ALIGN (TREE_TYPE (arg1)))
!   return 0;
! flags &= ~(OEP_CONSTANT_ADDRESS_OF|OEP_ADDRESS_OF);
! return (OP_SAME (0) && OP_SAME (1)
! /* TARGET_MEM_REF require equal extra operands.  */
! && (TREE_CODE (arg0) != TARGET_MEM_REF
! || (OP_SAME_WITH_NULL (2)
! && OP_SAME_WITH_NULL (3)
! && OP_SAME_WITH_NULL (4;
  
case ARRAY_REF:
case ARRAY_RANGE_REF:
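
Editorial sketch, not part of the patch: the ordering problem described in the message, reduced to its shape. ADDRESS_OF and both functions are hypothetical; what the real code does with OEP_ADDRESS_OF is not fully visible in the hunks above, so this assumes only that some check consults the flag before the operands are compared with the flag cleared.

```cpp
#include <cassert>

// Hypothetical flag bit standing in for OEP_ADDRESS_OF.
enum : unsigned { ADDRESS_OF = 1u << 0 };

// Buggy order: the flag is cleared before it is consulted, so the check
// can never observe it.
inline bool sees_address_of_buggy(unsigned flags)
{
  flags &= ~ADDRESS_OF;
  return (flags & ADDRESS_OF) != 0;
}

// Fixed order: consult the flag first, then clear it for the operand
// comparison that follows.
inline bool sees_address_of_fixed(unsigned flags)
{
  const bool address_of = (flags & ADDRESS_OF) != 0;
  flags &= ~ADDRESS_OF;
  (void) flags;  // The cleared flags would be passed on to the operands.
  return address_of;
}
```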

[PATCH] Make split_block and create_basic_block type-safe

2015-03-12 Thread Richard Biener


After noticing tree-parloop.c passing crap to split_block (a tree
rather than a gimple or an rtx) I noticed those CFG functions simply
take void * pointers.  The following patch fixes that and adds
two overloads, one for GIMPLE use and one for RTL use.

Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

Ok at this stage?

Thanks,
Richard.

2015-03-12  Richard Biener  

* cfghooks.h (create_basic_block): Replace with two overloads
for RTL and GIMPLE.
(split_block): Likewise.
* cfghooks.c (split_block): Rename to ...
(split_block_1): ... this.
(split_block): Add two type-safe overloads for RTL and GIMPLE.
(split_block_after_labels): Call split_block_1.
(create_basic_block): Rename to ...
(create_basic_block_1): ... this.
(create_basic_block): Add two type-safe overloads for RTL and GIMPLE.
(create_empty_bb): Call create_basic_block_1.
* cfgrtl.c (fixup_fallthru_exit_predecessor): Use
split_block_after_labels.
* omp-low.c (expand_parallel_call): Likewise.
(expand_omp_target): Likewise.
(simd_clone_adjust): Likewise.
* tree-chkp.c (chkp_get_entry_block): Likewise.
* cgraphunit.c (init_lowered_empty_function): Use the GIMPLE
create_basic_block overload.
(cgraph_node::expand_thunk): Likewise.
* tree-cfg.c (make_blocks): Likewise.
(handle_abnormal_edges): Likewise.
* tree-inline.c (copy_bb): Likewise.

Index: gcc/cfghooks.c
===
--- gcc/cfghooks.c  (revision 221379)
+++ gcc/cfghooks.c  (working copy)
@@ -505,8 +505,8 @@ redirect_edge_and_branch_force (edge e,
the labels).  If I is NULL, splits just after labels.  The newly created 
edge
is returned.  The new basic block is created just after the old one.  */
 
-edge
-split_block (basic_block bb, void *i)
+static edge
+split_block_1 (basic_block bb, void *i)
 {
   basic_block new_bb;
   edge res;
@@ -550,12 +550,24 @@ split_block (basic_block bb, void *i)
   return res;
 }
 
+edge
+split_block (basic_block bb, gimple i)
+{
+  return split_block_1 (bb, i);
+}
+
+edge
+split_block (basic_block bb, rtx i)
+{
+  return split_block_1 (bb, i);
+}
+
 /* Splits block BB just after labels.  The newly created edge is returned.  */
 
 edge
 split_block_after_labels (basic_block bb)
 {
-  return split_block (bb, NULL);
+  return split_block_1 (bb, NULL);
 }
 
 /* Moves block BB immediately after block AFTER.  Returns false if the
@@ -696,8 +708,8 @@ split_edge (edge e)
HEAD and END are the first and the last statement belonging
to the block.  If both are NULL, an empty block is created.  */
 
-basic_block
-create_basic_block (void *head, void *end, basic_block after)
+static basic_block
+create_basic_block_1 (void *head, void *end, basic_block after)
 {
   basic_block ret;
 
@@ -714,12 +726,25 @@ create_basic_block (void *head, void *en
   return ret;
 }
 
+basic_block
+create_basic_block (gimple_seq seq, basic_block after)
+{
+  return create_basic_block_1 (seq, NULL, after);
+}
+
+basic_block
+create_basic_block (rtx head, rtx end, basic_block after)
+{
+  return create_basic_block_1 (head, end, after);
+}
+
+
 /* Creates an empty basic block just after basic block AFTER.  */
 
 basic_block
 create_empty_bb (basic_block after)
 {
-  return create_basic_block (NULL, NULL, after);
+  return create_basic_block_1 (NULL, NULL, after);
 }
 
 /* Checks whether we may merge blocks BB1 and BB2.  */
Index: gcc/cfghooks.h
===
--- gcc/cfghooks.h  (revision 221379)
+++ gcc/cfghooks.h  (working copy)
@@ -196,12 +196,14 @@ extern edge redirect_edge_succ_nodup (ed
 extern bool can_remove_branch_p (const_edge);
 extern void remove_branch (edge);
 extern void remove_edge (edge);
-extern edge split_block (basic_block, void *);
+extern edge split_block (basic_block, rtx);
+extern edge split_block (basic_block, gimple);
 extern edge split_block_after_labels (basic_block);
 extern bool move_block_after (basic_block, basic_block);
 extern void delete_basic_block (basic_block);
 extern basic_block split_edge (edge);
-extern basic_block create_basic_block (void *, void *, basic_block);
+extern basic_block create_basic_block (rtx, rtx, basic_block);
+extern basic_block create_basic_block (gimple_seq, basic_block);
 extern basic_block create_empty_bb (basic_block);
 extern bool can_merge_blocks_p (basic_block, basic_block);
 extern void merge_blocks (basic_block, basic_block);
Index: gcc/cfgrtl.c
===
--- gcc/cfgrtl.c(revision 221379)
+++ gcc/cfgrtl.c(working copy)
@@ -4047,7 +4047,7 @@ fixup_fallthru_exit_predecessor (void)
 edge, we have to split that block.  */
   if (c == bb)
{
- bb = split_block (bb, NULL)->dest;
+ bb = split_block_
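
Editorial sketch, not part of the patch: the pattern used here is an untyped, file-local worker wrapped by one thin overload per statement type. All types below are hypothetical stand-ins for gimple, rtx and tree, and the return value is simplified to a tag so the example can run on its own.

```cpp
#include <cassert>

// Hypothetical stand-ins for GCC's statement types.
struct GimpleStmt { int id; };
struct RtlInsn { int id; };
struct Tree { int id; };

enum class Ir { Gimple, Rtl, None };

// Untyped worker, kept file-local so callers must go through the wrappers.
static Ir split_block_1(const void *stmt, Ir kind)
{
  return stmt ? kind : Ir::None;
}

// Type-safe entry points: one overload per IR.
inline Ir split_block(const GimpleStmt *s) { return split_block_1(s, Ir::Gimple); }
inline Ir split_block(const RtlInsn *s) { return split_block_1(s, Ir::Rtl); }

// Passing a Tree * no longer compiles, because no overload accepts it.
```

One consequence of overloading on two pointer types is that a bare NULL argument becomes ambiguous, which is presumably why the ChangeLog moves such callers to split_block_after_labels.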

Re: [PATCH][simplify-rtx] PR 65235: Calculate element size correctly when simplifying (vec_select (vec_concat (const_int) (...)) [...])

2015-03-12 Thread Kyrill Tkachov


>> The patch fixes that by calculating the size of the first element by
>> taking the size of the outer mode and subtracting the size of the second
>> element.
>>
>> I've added an assert to make sure that the second element is not also a
>> const_int, as a vec_concat of const_ints doesn't make sense as far as I
>> can see.
>
> I'm not sure about the assert, can't we just punt in this case?

Ok, here's a patch returning 0 in that case.
The assert had never triggered in my testing anyway, but I agree we
want to just cancel the simplification rather than ICE.

>
>> Bootstrapped and tested on aarch64-none-linux-gnu,
>> arm-none-linux-gnueabihf, x86_64-linux-gnu.
>> This bug appears on trunk, 4.9 and 4.8, so it's not a regression on the
>> release branches but it is a wrong-code bug.
>
> I think that the fix would be acceptable for GCC 5 without the assert.
>

Thanks for reviewing.
Richard, do you think this can go in for GCC 5 now?
What about 4.9 and 4.8? The bug appears there as well.

Thanks,
Kyrill


2015-03-12  Kyrylo Tkachov  

PR rtl-optimization 65235
* simplify-rtx.c (simplify_binary_operation_1, VEC_SELECT case):
When first element of vec_concat is const_int, calculate its size
using second element.

2015-03-12  Kyrylo Tkachov  

PR rtl-optimization 65235
* gcc.target/aarch64/pr65235_1.c: New test.

commit 9946603f73e89f50d6610a943f770627ed533dbc
Author: Kyrylo Tkachov 
Date:   Thu Feb 26 16:40:52 2015 +

[simplify-rtx] Calculate vector size correctly when simplifying (vec_select (vec_concat (const_int) (...)) [...])

diff --git a/gcc/simplify-rtx.c b/gcc/simplify-rtx.c
index a003b41..5d17498 100644
--- a/gcc/simplify-rtx.c
+++ b/gcc/simplify-rtx.c
@@ -3555,7 +3555,21 @@ simplify_binary_operation_1 (enum rtx_code code, machine_mode mode,
 	  while (GET_MODE (vec) != mode
 		 && GET_CODE (vec) == VEC_CONCAT)
 	{
-	  HOST_WIDE_INT vec_size = GET_MODE_SIZE (GET_MODE (XEXP (vec, 0)));
+	  HOST_WIDE_INT vec_size;
+
+	  if (CONST_INT_P (XEXP (vec, 0)))
+	{
+	  /* vec_concat of two const_ints doesn't make sense with
+	 respect to modes.  */
+	  if (CONST_INT_P (XEXP (vec, 1)))
+	return 0;
+
+	  vec_size = GET_MODE_SIZE (GET_MODE (trueop0))
+	 - GET_MODE_SIZE (GET_MODE (XEXP (vec, 1)));
+	}
+	  else
+	vec_size = GET_MODE_SIZE (GET_MODE (XEXP (vec, 0)));
+
 	  if (offset < vec_size)
 		vec = XEXP (vec, 0);
 	  else
diff --git a/gcc/testsuite/gcc.target/aarch64/pr65235_1.c b/gcc/testsuite/gcc.target/aarch64/pr65235_1.c
new file mode 100644
index 000..ca12cd5
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/pr65235_1.c
@@ -0,0 +1,30 @@
+/* { dg-do run } */
+/* { dg-options "-O2" } */
+
+#include "arm_neon.h"
+
+int
+main (int argc, char** argv)
+{
+  int64x1_t val1;
+  int64x1_t val2;
+  int64x1_t val3;
+  uint64x1_t val13;
+  uint64x2_t val14;
+  uint64_t got;
+  uint64_t exp;
+  val1 = vcreate_s64(UINT64_C(0x80008000));
+  val2 = vcreate_s64(UINT64_C(0xf38d));
+  val3 = vcreate_s64(UINT64_C(0x7fff809b));
+  /* Expect: "val13" = 80001553.  */
+  val13 = vcreate_u64 (UINT64_C(0x80001553));
+  /* Expect: "val14" = 0010   0002    .  */
+  val14 = vcombine_u64(vcgt_s64(vqrshl_s64(val1, val2),
+vshr_n_s64(val3, 18)),
+		   vshr_n_u64(val13, 11));
+  /* Should be .  */
+  got = vgetq_lane_u64(val14, 0);
+  exp = 0;
+  if(exp != got)
+__builtin_abort ();
+}
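
Editorial sketch, not part of the patch: the size computation from the simplify-rtx.c hunk as plain arithmetic. A const_int carries no mode (VOIDmode), so its mode size is 0; the patch instead derives the first half's size from the outer vector size minus the second half's size, and gives up when both halves are const_ints. The function name and the -1 "give up" value are hypothetical.

```cpp
#include <cassert>

// Sizes are in bytes.  A size of 0 models a modeless operand such as a
// const_int.  Returns -1 when the size cannot be determined.
inline int first_half_size(int outer_size, int first_size, int second_size)
{
  if (first_size != 0)
    return first_size;              // Normal case: the first half has a mode.
  if (second_size == 0)
    return -1;                      // Both halves modeless: punt.
  return outer_size - second_size;  // Derive the size from the other half.
}
```

For a 16-byte vector whose second half is 8 bytes, a const_int first half is treated as 8 bytes rather than 0, so the offset comparison that follows picks the right half.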

Re: [PATCH][simplify-rtx] PR 65235: Calculate element size correctly when simplifying (vec_select (vec_concat (const_int) (...)) [...])

2015-03-12 Thread Richard Biener

On Thu, Mar 12, 2015 at 2:28 PM, Kyrill Tkachov  wrote:
>>> The patch fixes that by calculating the size of the first element by
>>> taking the size of the outer mode and subtracting the size of the second
>>> element.
>>>
>>> I've added an assert to make sure that the second element is not also a
>>> const_int, as a vec_concat of const_ints doesn't make sense as far as I
>>> can
>>> see.
>>
>> I'm not sure about the assert, can't we just punt in this case?
>
> Ok, here's a patch returning 0 in that case.
> The assert had never triggered in my testing anyway, but I agree we
> want to just cancel the simplification rather than ICE.
>
>>
>>> Bootstrapped and tested on aarch64-none-linux-gnu,
>>> arm-none-linux-gnueabihf, x86_64-linux-gnu.
>>> This bug appears on trunk, 4.9 and 4.8, so it's not a regression on the
>>> release branches but it is a wrong-code bug.
>>
>> I think that the fix would be acceptable for GCC 5 without the assert.
>>
>
> Thanks for reviewing.
> Richard, do you think this can go in for GCC 5 now?
> What about 4.9 and 4.8? The bug appears there as well.

Sure - it's a wrong-code fix.  Ok for trunk and branches (after a while).

Thanks,
Richard.

> Thanks,
> Kyrill
>
>
> 2015-03-12  Kyrylo Tkachov  
>
> PR rtl-optimization 65235
> * simplify-rtx.c (simplify_binary_operation_1, VEC_SELECT case):
> When first element of vec_concat is const_int, calculate its size
> using second element.
>
> 2015-03-12  Kyrylo Tkachov  
>
>
> PR rtl-optimization 65235
> * gcc.target/aarch64/pr65235_1.c: New test.

[linaro/gcc-4_9-branch] Merge from gcc-4_9-branch and backports

2015-03-12 Thread Yvan Roux

Hi all

we have merged the gcc-4_9-branch into linaro/gcc-4_9-branch up to
revision 221341 as r221360.  We have also backported this set of revisions:

* r212011 as r221216 : PR tree-optimization/61607
* r214942 as r221216 : Abstract away marking loops for removal
* r214957 as r221216 : Sanity check removed loops
* r215012 as r221216 : PR bootstrap/63204
* r215016 as r221216 : PR ipa/63196
* r215612 as r221194 : Tighten predicates on SIMD shift intrinsics
* r215722 as r221196 : Wire up vqdmullh_laneq_s16 and vqdmullh_laneq_s32
* r216663 as r221239 : [testsuite] revert changes on
check_effective_target_arm_*_ok
* r217706 as r221240 : [testsuite] new set of Neon intrinsics tests
* r217707 as r221241 : [testsuite] fix vbic/vorn Neon tests
* r217725 as r221339 : Improve modeled latency between FP operations
and FP->GP register moves
* r217780 as r221302 : Adjust generic move costs
* r217852 as r221300 : Add range-check for Symbol + offset addressing
* r217938 as r221301 : Add vector pattern for __builtin_ctz
* r218115 as r221216 : PR tree-optimization/64083
* r218463 as r221242 : [testsuite] Fix vaddl and vaddw tests
* r218486 as r221344 : Bics instruction generation for aarch64
* r218503 as r221344 : additional bics patterns
* r218733 as r221216 : PR tree-optimization/64284
* r218746 as r221216 : PR middle-end/64246
* r219764 as r221242 : [testsuite] Add explicit dependency on Neon
Cumulative Saturation flag
* r219765 as r221242 : [testsuite] Be more verbose, and actually
confirm that a test was checked.
* r219767 as r221242 : [testsuite] Add vld1_lane tests
* r219914 as r221242 : [testsuite] Add vldX_dup test.
* r219917 as r221242 : [testsuite] Add vmla and vmls tests.
* r219918 as r221242 : [testsuite] Add vmla_lane and vmls_lane tests.
* r219919 as r221242 : [testsuite] Add vtrn tests. Refactor vzup and vzip tests.
* r219920 as r221242 : [testsuite] Add vmlal and vmlsl tests.
* r219921 as r221242 : [testsuite] Add vmlal_lane and vmlsl_lane tests.
* r219922 as r221242 : [testsuite] Add vmlal_n and vmlsl_n tests.
* r219930 as r221242 : [testsuite] Add vqdmlal and vqdmlsl tests.
* r219931 as r221242 : [testsuite] Add vqdmlal_lane and vqdmlsl_lane tests
* r219932 as r221242 : [testsuite] Add vqdmlal_n and vqdmlsl_n tests.
* r219934 as r221242 : [testsuite] Add vsli_n and vsri_n tests.
* r219937 as r221242 : [testsuite] Add vsubl tests, put most of the
code in common with vaddl in vXXXl.inc.
* r219938 as r221242 : [testsuite] Add vsubw tests, putting most of
the code in common with vaddw
* r219939 as r221242 : [testsuite] Add vmovn tests.
* r219940 as r221242 : [testsuite] Add vmul_lane tests.
* r219941 as r221242 : [testsuite] Add vmul_n tests.
* r219942 as r221242 : [testsuite] Add vmull tests.
* r219943 as r221242 : [testsuite] Add vmull_lane tests.
* r219944 as r221242 : [testsuite] Add vmull_n tests.
* r219945 as r221242 : [testsuite] Add vqdmulh tests.
* r219946 as r221242 : [testsuite] Add vqdmulh_lane tests.
* r219947 as r221242 : [testsuite] Add vqdmulh_n tests.
* r219948 as r221242 : [testsuite] Add vqdmull tests.
* r219949 as r221242 : [testsuite] Add vqdmull_lane tests.
* r219950 as r221242 : [testsuite] Add vqdmull_n tests.
* r220117 as r221242 : [testsuite] Add vsubhn, vraddhn and vrsubhn tests.
* r220118 as r221242 : [testsuite] Add vmla_n and vmls_n tests.
* r220119 as r221242 : [testsuite] Add vpadd, vpmax and vpmin tests.
* r220121 as r221242 : [testsuite] Add vmovl tests.
* r220122 as r221242 : [testsuite] Add vmnv tests.
* r220123 as r221242 : [testsuite] Add vpadal tests.
* r220124 as r221242 : [testsuite] Add vpaddl tests.
* r220126 as r221242 : Fix incorrect ChangeLog formatting.
* r220353 as r221242 : [testsuite] Add vmax, vmin, vhadd, vhsub and
vrhadd tests.
* r220491 as r221216 : PR tree-optimization/64878
* r220751 as r221343 : [Haifa Scheduler] Fix latent bug in
macro-fusion/instruction grouping
* r220860 as r221215 : [AArch64] Fix wrong-code bug in right-shift SISD patterns

This will be part of our 2015.03 4.9 release.

Thanks
Yvan

[Ada] handle 'Code_Address on targets with function descriptors

2015-03-12 Thread Olivier Hainque

For P a subprogram, P'Code_Address is expected to return
the address at which the machine code for P starts.

It differs from 'Address on targets where function
symbol names denote the address of a function descriptor,
a record from which the code address can be fetched
(e.g. on ppc-aix).

On such targets, P'Address is expected to return
the descriptor address, and it does.

P'Code_Address should fetch the code address from the
descriptor, but we have nothing in place to achieve that
today. It just returns the same as 'Address.

The attached patch is the gigi part of a change to
fix this, relying on a tm definition that we'll be
submitting later on.

With everything in place, the testcase below is expected
to display "OK".

Bootstrapped and regtested on x86_64-pc-linux-gnu

Olivier

2015-03-12  Olivier Hainque  

* gcc-interface/trans.c (Attribute_to_gnu) :
On targets where a function symbol designates a function descriptor,
fetch the function code address from the descriptor.

--

with System, Ada.Unchecked_Conversion;
with Ada.Text_IO; use Ada.Text_IO;

procedure Code_Addr_P is
   Addr, Code_Addr : System.Address;
   
   type Fn_Descriptor is record
  Fn_Address : System.Address;
   end record;

   type Descriptor_Access is access all Fn_Descriptor;
   function To_Descriptor_Access is
  new Ada.Unchecked_Conversion (System.Address, Descriptor_Access);
   
   Da : Descriptor_Access;
   
   use type System.Address;
begin
   Addr := Code_Addr_P'Address;
   Code_Addr := Code_Addr_P'Code_Address;
   
   Da := To_Descriptor_Access (Addr);
   if Da.Fn_Address /= Code_Addr then
  raise Program_Error;
   end if;
   
   Put_Line ("OK");
end;



fndesc.diff
Description: Binary data
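
Editorial sketch, not part of the patch: the relationship the Ada testcase checks, modelled with plain integers. FnDescriptor mirrors the one-field Fn_Descriptor record from the testcase; the layout and both helper functions are assumptions for illustration and do not describe any particular ABI.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical descriptor: its first word holds the machine-code address.
struct FnDescriptor { std::uintptr_t fn_address; };

// What 'Address yields on a descriptor target: the descriptor's own address.
inline std::uintptr_t address_of(const FnDescriptor &d)
{
  return reinterpret_cast<std::uintptr_t>(&d);
}

// What 'Code_Address is expected to yield: the address stored inside it.
inline std::uintptr_t code_address_of(const FnDescriptor &d)
{
  return d.fn_address;
}
```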

[PATCH][OpenMP] Fix declare target variables in fortran modules

2015-03-12 Thread Ilya Verbin

Hi,

We have a problem with declare target variables in Fortran modules.  Here is a
small reproducer:

+ share.f90:

module share
  integer :: var_x
!$omp declare target(var_x)
end module

+ test.f90:

  use share
  var_x = 10
!$omp target update to(var_x)
end

+

$ gfortran -fopenmp -c share.f90
$ gfortran -fopenmp -c test.f90
$ gfortran -fopenmp share.o test.o
$ ./a.out
libgomp: Duplicate node

+

This happens because var_x is added to the offload tables of both share.o and
test.o.  The patch below fixes this issue.  Regtested on x86_64-linux and
i686-linux.  However, I'm not sure how to create a regression test that
compiles two separate objects and checks the run-time result.


diff --git a/gcc/varpool.c b/gcc/varpool.c
index 707f62f..5929d92 100644
--- a/gcc/varpool.c
+++ b/gcc/varpool.c
@@ -173,7 +173,7 @@ varpool_node::get_create (tree decl)
   node = varpool_node::create_empty ();
   node->decl = decl;
 
-  if ((flag_openacc || flag_openmp)
+  if ((flag_openacc || flag_openmp) && !DECL_EXTERNAL (decl)
   && lookup_attribute ("omp declare target", DECL_ATTRIBUTES (decl)))
 {
   node->offloadable = 1;


Thanks,
  -- Ilya
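
Editorial sketch, not part of the patch: a toy model of why skipping external declarations avoids the duplicate. Decl, offload_table and has_duplicate are hypothetical; the only rule taken from the patch is that a variable goes into an object's offload table when it is marked declare target and is not DECL_EXTERNAL in that object.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical model of a declaration as seen by one translation unit.
struct Decl { std::string name; bool is_external; bool declare_target; };

// Offload table of one object file: only variables the unit defines itself.
inline std::vector<std::string> offload_table(const std::vector<Decl> &decls)
{
  std::vector<std::string> table;
  for (const Decl &d : decls)
    if (d.declare_target && !d.is_external)
      table.push_back(d.name);
  return table;
}

// True if merging the per-object tables would register a name twice.
inline bool has_duplicate(const std::vector<std::vector<std::string>> &tables)
{
  std::set<std::string> seen;
  for (const auto &table : tables)
    for (const auto &name : table)
      if (!seen.insert(name).second)
        return true;
  return false;
}
```

In the reproducer, share.o defines var_x and test.o only references it through the module, so after the fix only share.o contributes an entry.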

Re: [PATCH][OpenMP] Fix declare target variables in fortran modules

2015-03-12 Thread Jakub Jelinek

On Thu, Mar 12, 2015 at 04:56:35PM +0300, Ilya Verbin wrote:
> This happens because the var_x is added into offload tables for both share.o 
> and
> test.o.  The patch below fixes this issue.  Regtested on x86_64-linux and
> i686-linux.  However I'm not sure how to create a regression test, which would
> compile 2 separate objects, and check run-time result.

Ok with proper ChangeLog entry.

As for testcase, won't dg-additional-sources help?
I mean, does it fail without your patch even if you just do
gfortran -fopenmp -o a.out share.f90 test.f90; ./a.out ?

> --- a/gcc/varpool.c
> +++ b/gcc/varpool.c
> @@ -173,7 +173,7 @@ varpool_node::get_create (tree decl)
>node = varpool_node::create_empty ();
>node->decl = decl;
>  
> -  if ((flag_openacc || flag_openmp)
> +  if ((flag_openacc || flag_openmp) && !DECL_EXTERNAL (decl)
>&& lookup_attribute ("omp declare target", DECL_ATTRIBUTES (decl)))
>  {
>node->offloadable = 1;

Jakub

Re: [PATCH] PR target/65240, Fix Power{7,8} insn constraint issue with -O3 -ffast-math

2015-03-12 Thread Michael Meissner

On Wed, Mar 11, 2015 at 08:52:54PM -0400, David Edelsohn wrote:
> On Wed, Mar 11, 2015 at 6:21 PM, Michael Meissner
>  wrote:
> > On Wed, Mar 11, 2015 at 01:02:06PM -0400, David Edelsohn wrote:
> >> I am concerned with the create_TOC_reference use for TARGET_TOC.  Has
> >> this been tested with big endian -mcmodel=small?
> >
> > Yes, that was a problem.  Patch coming up soon.  Thanks.
> 
> Can you call rs6000_emit_move_directly?

Well, I can, but I would have to have some sort of flag that says after the
split1 pass not to allow FP constants in move (other than 0.0).  It is doable,
but it does touch more areas in the rs6000 back end.

I am starting to think that it is just simpler to rip out all of the special
fast math handling of constants, considering the multiply by reciprocal support
has moved to SSA/tree and away from RTL.  Did you want me to investigate the
performance implications of removing it now (rather than waiting for GCC 6.0),
or just do the more limited patch that I've been pursuing?

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797

Re: [PATCH] PR target/65240, Fix Power{7,8} insn constraint issue with -O3 -ffast-math

2015-03-12 Thread David Edelsohn

On Thu, Mar 12, 2015 at 11:29 AM, Michael Meissner
 wrote:
> On Wed, Mar 11, 2015 at 08:52:54PM -0400, David Edelsohn wrote:
>> On Wed, Mar 11, 2015 at 6:21 PM, Michael Meissner
>>  wrote:
>> > On Wed, Mar 11, 2015 at 01:02:06PM -0400, David Edelsohn wrote:
>> >> I am concerned with the create_TOC_reference use for TARGET_TOC.  Has
>> >> this been tested with big endian -mcmodel=small?
>> >
>> > Yes, that was a problem.  Patch coming up soon.  Thanks.
>>
>> Can you call rs6000_emit_move_directly?
>
> Well, I can, but I would have to have some sort of flag that says after the
> split1 pass not to allow FP constants in move (other than 0.0).  It is doable,
> but it does touch more areas in the rs6000 back end.
>
> I am starting to think that it is just simpler to rip out all of the special
> fast math handling of constants, considering the multiply by reciprocal support
> has moved to SSA/tree and away from RTL.  Did you want me to investigate the
> performance implications of removing it now (rather than waiting for GCC 6.0),
> or just do the more limited patch that I've been pursuing?

Please check on the performance implications of removing the special
constant support.  I know that it is late, but I think that ripping it
out is less risky than trying to fix this, if the performance impact
is not bad.

Thanks, David

libgo patch committed: It's OK to use cgo on PPC

2015-03-12 Thread Ian Lance Taylor

The cgo tool installed by gccgo works fine on 32-bit PPC.  This patch
notes that fact in the gccgo version of the go tool.  This is GCC PR
65404.  Bootstrapped and ran Go testsuite on x86_64-unknown-linux-gnu.
Committed to mainline.

Ian
diff -r 81cc50c9140d libgo/go/go/build/build.go
--- a/libgo/go/go/build/build.go	Mon Mar 09 17:13:50 2015 -0700
+++ b/libgo/go/go/build/build.go	Thu Mar 12 09:32:03 2015 -0700
@@ -268,6 +268,7 @@
"linux/alpha": true,
"linux/amd64": true,
"linux/arm":   true,
+   "linux/ppc":   true,
"linux/ppc64": true,
"linux/ppc64le":   true,
"linux/s390":  true,

gotools patch committed: Build gotools with compiler options

2015-03-12 Thread Ian Lance Taylor

This patch changes the gotools to add GOCFLAGS to the build command,
since the command is both compiling and linking.  The main effect of
this is to, by default, build with -g -O2, which previously was not
happening.  Bootstrapped and ran Go testsuite on
x86_64-unknown-linux-gnu.  Committed to mainline.

Ian

2015-03-12  Ian Lance Taylor  

* Makefile.am (GOLINK): Add GOCFLAGS.
* Makefile.in: Rebuild.
Index: gotools/Makefile.am
===
--- gotools/Makefile.am (revision 220066)
+++ gotools/Makefile.am (working copy)
@@ -39,7 +39,7 @@ GOCFLAGS = $(CFLAGS_FOR_TARGET)
 GOCOMPILE = $(GOCOMPILER) $(GOCFLAGS)
 
 AM_LDFLAGS = -L $(libgodir) -L $(libgodir)/.libs
-GOLINK = $(GOCOMPILER) $(AM_GOCFLAGS) $(LDFLAGS) $(AM_LDFLAGS) -o $@
+GOLINK = $(GOCOMPILER) $(GOCFLAGS) $(AM_GOCFLAGS) $(LDFLAGS) $(AM_LDFLAGS) -o $@
 
 cmdsrcdir = $(srcdir)/../libgo/go/cmd

Re: [C++ Patch] PR 65323

2015-03-12 Thread Jason Merrill


On 03/12/2015 06:13 AM, Paolo Carlini wrote:

52718_red.C:1:22: warning: zero as null pointer constant [-Wzero-as-null-pointer-constant]
  void* fun(void* a = 0);

52718_red.C:2:16: warning: zero as null pointer constant [-Wzero-as-null-pointer-constant]
  void* f2 = fun();


OK, then your second patch is OK.  But please add a comment that the 
code is there to avoid a redundant warning at the call site.


Jason

Re: libgo patch committed: It's OK to use cgo on PPC

2015-03-12 Thread Matthias Klose

On 03/12/2015 05:41 PM, Ian Lance Taylor wrote:
> The cgo tool installed by gccgo works fine on 32-bit PPC.  This patch
> notes that fact in the gccgo version of the go tool.  This is GCC PR
> 65404.  Bootstrapped and ran Go testsuite on x86_64-unknown-linux-gnu.
> Committed to mainline.

same thing needs to be done for arm64:

From 391fba3b788628ef6431765c382a51f52a93cddf Mon Sep 17 00:00:00 2001
From: Michael Hudson-Doyle 
Date: Wed, 27 Aug 2014 14:57:07 +1200
Subject: [PATCH 3/3] Enable cgo by default on linux/arm64.

---
 src/libgo/go/go/build/build.go | 1 +
 1 file changed, 1 insertion(+)

Index: b/src/libgo/go/go/build/build.go
===
--- a/src/libgo/go/go/build/build.go
+++ b/src/libgo/go/go/build/build.go
@@ -268,6 +268,7 @@ var cgoEnabled = map[string]bool{
"linux/alpha": true,
"linux/amd64": true,
"linux/arm":   true,
+   "linux/arm64": true,
"linux/ppc64": true,
"linux/ppc64le":   true,
"linux/s390":  true,


> 
> Ian
>

Re: libgo patch committed: It's OK to use cgo on PPC

2015-03-12 Thread Ian Lance Taylor

On Thu, Mar 12, 2015 at 10:00 AM, Matthias Klose  wrote:
> On 03/12/2015 05:41 PM, Ian Lance Taylor wrote:
>> The cgo tool installed by gccgo works fine on 32-bit PPC.  This patch
>> notes that fact in the gccgo version of the go tool.  This is GCC PR
>> 65404.  Bootstrapped and ran Go testsuite on x86_64-unknown-linux-gnu.
>> Committed to mainline.
>
> same thing needs to be done for arm64:

Thanks.  Committed.

Ian

[C PATCH] Fix up file-scope _Atomic expansion (PR c/65345)

2015-03-12 Thread Marek Polacek

The PR shows that the compiler ICEs whenever it tries to expand an atomic
operation at the file scope.  That happens because it creates temporaries 
via create_tmp_var, which also pushes the variable into the current binding,
but that can't work if current_function_decl is NULL.  The fix is I think to
only generate the temporaries during gimplification.  Turned out that the
TARGET_EXPRs are tailor-made for this, so I've used them along with changing
create_tmp_var calls to create_tmp_var_raw that does not push the variable
into the current binding.

But this wasn't enough to handle the following case:
_Atomic int i = 5;
void f (int a[i += 1]) {}
To make it work I had to tweak the artificial labels that build_atomic_assign
creates to not ICE in gimplification.  The comment in store_parm_decls sums
it up.  It uses walk_tree, but I think this will be only rarely exercised in
practice, if ever; I think programs using such a construction are thin on the
ground.
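
To see why an op= assignment to an _Atomic object brings labels and gotos
with it, here is a minimal sketch in plain C11 of the retry loop that such an
assignment amounts to.  It is only an illustration under that assumption, not
the trees build_atomic_assign actually builds; the helper name
atomic_add_assign is made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdatomic.h>

/* Hypothetical helper (not a GCC function): models `*obj += rhs` on an
   _Atomic int as a compare-and-exchange retry loop.  */
int
atomic_add_assign (_Atomic int *obj, int rhs)
{
  assert (obj != NULL);

  int old = atomic_load (obj);
  int newval;

  /* Recompute from the most recently observed value until the exchange
     succeeds.  On failure atomic_compare_exchange_strong stores the current
     value of *obj into OLD, so the next iteration uses up-to-date data.  */
  do
    newval = old + rhs;
  while (!atomic_compare_exchange_strong (obj, &old, newval));

  return newval;
}
```

In the front end the loop is spelled with artificial labels and gotos rather
than do/while, which is why those labels need a function context by the time
they are gimplified.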

I tried comparing .gimple dumps with/without the patch on

_Atomic int q = 4;
void
f (void)
{
  q += 2;
}

and I see no code changes.

This is not a regression, so not sure if I shouldn't defer this patch to the
next stage1 at this juncture...

Comments?

Bootstrapped/regtested on x86_64-linux.

2015-03-12  Marek Polacek  

PR c/65345
* c-decl.c (set_labels_context_r): New function.
(store_parm_decls): Call it via walk_tree_without_duplicates.
* c-typeck.c (convert_lvalue_to_rvalue): Use create_tmp_var_raw
instead of create_tmp_var.  Build TARGET_EXPR instead of
COMPOUND_EXPR.
(build_atomic_assign): Use create_tmp_var_raw instead of
create_tmp_var.  Build TARGET_EXPRs instead of MODIFY_EXPR.

* gcc.dg/pr65345-1.c: New test.
* gcc.dg/pr65345-2.c: New test.

--- gcc/c/c-decl.c
+++ gcc/c/c-decl.c
@@ -8799,6 +8799,21 @@ store_parm_decls_from (struct c_arg_info *arg_info)
   store_parm_decls ();
 }
 
+/* Called by walk_tree to look for and update context-less labels.  */
+
+static tree
+set_labels_context_r (tree *tp, int *walk_subtrees, void *data)
+{
+  if (TREE_CODE (*tp) == LABEL_EXPR
+  && DECL_CONTEXT (LABEL_EXPR_LABEL (*tp)) == NULL_TREE)
+{
+  DECL_CONTEXT (LABEL_EXPR_LABEL (*tp)) = static_cast<tree>(data);
+  *walk_subtrees = 0;
+}
+
+  return NULL_TREE;
+}
+
 /* Store the parameter declarations into the current function declaration.
This is called after parsing the parameter declarations, before
digesting the body of the function.
@@ -8853,7 +8868,21 @@ store_parm_decls (void)
  thus won't naturally see the SAVE_EXPR containing the increment.  All
  other pending sizes would be handled by gimplify_parameters.  */
   if (arg_info->pending_sizes)
-add_stmt (arg_info->pending_sizes);
+{
+  /* In very special circumstances, e.g. for code like
+  _Atomic int i = 5;
+  void f (int a[i += 2]) {}
+we need to execute the atomic assignment on function entry.
+But in this case, it is not just a straight store, it has the
+op= form, which means that build_atomic_assign has generated
+gotos, labels, etc.  Because at that time the function decl
+for F has not been created yet, those labels do not have any
+function context.  But we have the fndecl now, so update the
+labels accordingly.  gimplify_expr would crash otherwise.  */
+  walk_tree_without_duplicates (&arg_info->pending_sizes,
+   set_labels_context_r, fndecl);
+  add_stmt (arg_info->pending_sizes);
+}
 }
 
 /* Store PARM_DECLs in PARMS into scope temporarily.  Used for
--- gcc/c/c-typeck.c
+++ gcc/c/c-typeck.c
@@ -2039,7 +2039,7 @@ convert_lvalue_to_rvalue (location_t loc, struct c_expr exp,
   /* Remove the qualifiers for the rest of the expressions and
 create the VAL temp variable to hold the RHS.  */
   nonatomic_type = build_qualified_type (expr_type, TYPE_UNQUALIFIED);
-  tmp = create_tmp_var (nonatomic_type);
+  tmp = create_tmp_var_raw (nonatomic_type);
   tmp_addr = build_unary_op (loc, ADDR_EXPR, tmp, 0);
   TREE_ADDRESSABLE (tmp) = 1;
   TREE_NO_WARNING (tmp) = 1;
@@ -2055,7 +2055,8 @@ convert_lvalue_to_rvalue (location_t loc, struct c_expr exp,
   mark_exp_read (exp.value);
 
   /* Return tmp which contains the value loaded.  */
-  exp.value = build2 (COMPOUND_EXPR, nonatomic_type, func_call, tmp);
+  exp.value = build4 (TARGET_EXPR, nonatomic_type, tmp, func_call,
+ NULL_TREE, NULL_TREE);
 }
   return exp;
 }
@@ -3686,10 +3687,11 @@ build_atomic_assign (location_t loc, tree lhs, enum tree_code modifycode,
  the VAL temp variable to hold the RHS.  */
   nonatomic_lhs_type = build_qualified_type (lhs_type, TYPE_UNQUALIFIED);
   nonatomic_rhs_type = build_qualified_type (rhs_type, TYPE_UNQUALIFIED);
-  val = create_tmp_var (nonatomic_rhs_type);
+  val = create_tmp_var_raw (

Ping: [Patch, fortran] PR61138 Wrong code with pointer-bounds remapping

2015-03-12 Thread Mikael Morin

Ping:
https://gcc.gnu.org/ml/fortran/2015-02/msg00045.html

Re: [Patch, fortran] PR64952 - Missing temporary in assignment from elemental function

2015-03-12 Thread Mikael Morin

Hello Paul,

have you had time to look at this again?

Mikael

[patch] libstdc++/64847 add autoconf checks for pthread_rwlock_t

2015-03-12 Thread Jonathan Wakely


I assumed that Pthreads was enough to ensure pthread_rwlock_t but
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64847 shows that isn't
true for HPUX (seems it was optional prior to POSIX 1003.1-2001).

This adds an autoconf check to decide whether to use pthread_rwlock_t
or the fallback implementation in terms of std::condition_variable and
std::mutex.

This also includes some fixes from Torvald so that we loop and retry
if libc returns EAGAIN, to handle a difference in semantics between
POSIX and C++14.
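
The retry idea can be sketched in isolation as follows.  This is a minimal C
illustration, not the libstdc++ code: retry_on_eagain, lock_op, flaky_lock and
flaky_rdlock are hypothetical names, and the stand-in lock only simulates an
implementation that reports EAGAIN a few times before succeeding.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Signature of a lock operation that reports failure through its return
   value, as the pthread_rwlock_* functions do.  */
typedef int (*lock_op) (void *);

/* Hypothetical helper: keep calling OP while it reports EAGAIN and hand any
   other result back to the caller.  This can busy-wait.  */
int
retry_on_eagain (lock_op op, void *arg)
{
  assert (op != NULL);

  int ret;
  do
    ret = op (arg);
  while (ret == EAGAIN);
  return ret;
}

/* Stand-in for a lock whose reader count is temporarily exhausted; it exists
   only to exercise the loop without a real rwlock.  */
struct flaky_lock
{
  int failures_left;  /* EAGAIN results still to be reported.  */
  int calls;          /* Number of times the operation was attempted.  */
};

int
flaky_rdlock (void *p)
{
  struct flaky_lock *l = p;

  l->calls++;
  if (l->failures_left > 0)
    {
      l->failures_left--;
      return EAGAIN;
    }
  return 0;
}
```

With a real lock, the operation passed in would be a thin wrapper around
pthread_rwlock_rdlock; results other than EAGAIN (EDEADLK, for instance) are
still left for the caller to handle.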

And as an optimization I've made the _M_rwlock member use the
PTHREAD_RWLOCK_INITIALIZER macro if available.

Tested x86_64-linux, ppc64le-linux and x86_64-dragonfly.

I plan to commit this to trunk tomorrow.

commit 7212446ada7d741f6fe0fc9d9fca9d5b55322384
Author: Jonathan Wakely 
Date:   Thu Mar 12 17:29:42 2015 +

2015-03-12  Jonathan Wakely  
	Torvald Riegel  

	PR libstdc++/64847
	* acinclude.m4 (GLIBCXX_CHECK_GTHREADS): Check for pthread_rwlock_t.
	* config.h.in: Regenerate.
	* configure: Regenerate.
	* include/std/shared_mutex: Check _GLIBCXX_USE_PTHREADS_RWLOCKS.
	(shared_timed_mutex::_M_rwlock): Use PTHREAD_RWLOCK_INITIALIZER.
	(shared_timed_mutex::lock_shared()): Retry on EAGAIN.
	(shared_timed_mutex::try_lock_shared_until()): Retry on EAGAIN and
	EDEADLK.

diff --git a/libstdc++-v3/acinclude.m4 b/libstdc++-v3/acinclude.m4
index 1727140..86628c0 100644
--- a/libstdc++-v3/acinclude.m4
+++ b/libstdc++-v3/acinclude.m4
@@ -3563,6 +3563,13 @@ AC_DEFUN([GLIBCXX_CHECK_GTHREADS], [
   if test x"$ac_has_gthreads" = x"yes"; then
 AC_DEFINE(_GLIBCXX_HAS_GTHREADS, 1,
 	  [Define if gthreads library is available.])
+
+# Also check for pthread_rwlock_t for std::shared_timed_mutex in C++14
+AC_CHECK_TYPE([pthread_rwlock_t],
+[AC_DEFINE([_GLIBCXX_USE_PTHREADS_RWLOCKS], 1,
+[Define if POSIX read/write locks are available in <gthr.h>.])],
+[],
+[#include "gthr.h"])
   fi
 
   CXXFLAGS="$ac_save_CXXFLAGS"
diff --git a/libstdc++-v3/include/std/shared_mutex b/libstdc++-v3/include/std/shared_mutex
index 5dcc295..61251b0 100644
--- a/libstdc++-v3/include/std/shared_mutex
+++ b/libstdc++-v3/include/std/shared_mutex
@@ -57,10 +57,17 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   /// shared_timed_mutex
   class shared_timed_mutex
   {
-#if defined(__GTHREADS_CXX0X)
+#ifdef _GLIBCXX_USE_PTHREADS_RWLOCKS
 typedef chrono::system_clock	__clock_t;
 
-pthread_rwlock_t			_M_rwlock;
+#ifdef PTHREAD_RWLOCK_INITIALIZER
+pthread_rwlock_t	_M_rwlock = PTHREAD_RWLOCK_INITIALIZER;
+
+  public:
+shared_timed_mutex() = default;
+~shared_timed_mutex() = default;
+#else
+pthread_rwlock_t	_M_rwlock;
 
   public:
 shared_timed_mutex()
@@ -82,6 +89,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   // Errors not handled: EBUSY, EINVAL
   _GLIBCXX_DEBUG_ASSERT(__ret == 0);
 }
+#endif
 
 shared_timed_mutex(const shared_timed_mutex&) = delete;
 shared_timed_mutex& operator=(const shared_timed_mutex&) = delete;
@@ -165,12 +173,16 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 void
 lock_shared()
 {
-  int __ret = pthread_rwlock_rdlock(&_M_rwlock);
+  int __ret;
+  do
+	__ret = pthread_rwlock_rdlock(&_M_rwlock);
+  // We retry if we exceeded the maximum number of read locks supported by
+  // the POSIX implementation; this can result in busy-waiting, but this
+  // is okay based on the current specification of forward progress
+  // guarantees by the standard.
+  while (__ret == EAGAIN);
   if (__ret == EDEADLK)
 	__throw_system_error(int(errc::resource_deadlock_would_occur));
-  if (__ret == EAGAIN)
-	// Maximum number of read locks has been exceeded.
-	__throw_system_error(int(errc::device_or_resource_busy));
   // Errors not handled: EINVAL
   _GLIBCXX_DEBUG_ASSERT(__ret == 0);
 }
@@ -210,11 +222,19 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 	static_cast<long>(__ns.count())
 	  };
 
-	int __ret = pthread_rwlock_timedrdlock(&_M_rwlock, &__ts);
+	int __ret;
+	do
+	  __ret = pthread_rwlock_timedrdlock(&_M_rwlock, &__ts);
 	// If the maximum number of read locks has been exceeded, or we would
-	// deadlock, we just fail to acquire the lock.  Unlike for lock(),
-	// we are not allowed to throw an exception.
-	if (__ret == ETIMEDOUT || __ret == EAGAIN || __ret == EDEADLK)
+	// deadlock, we just try to acquire the lock again (and will time out
+	// eventually).  Unlike for lock(), we are not allowed to throw an
+	// exception.  In cases where we would exceed the maximum number of
+	// read locks throughout the whole time until the timeout, we will
+	// fail to acquire the lock even if it would be logically free;
+	// however, this is allowed by the standard, and we made a "strong
+	// effort" (see C++14 30.4.1.4p26).
+	while (__ret == EAGAIN || __ret == EDEADLK);
+	if (__ret == ETIMEDOUT)
 	  return false;
 	// Errors not handled: EINVAL
 	_GLIBCXX_DEBUG_ASSERT(__ret

Re: [PATCH][OpenMP] Fix declare target variables in fortran modules

2015-03-12 Thread Ilya Verbin

On Thu, Mar 12, 2015 at 15:21:35 +0100, Jakub Jelinek wrote:
> On Thu, Mar 12, 2015 at 04:56:35PM +0300, Ilya Verbin wrote:
> > This happens because the var_x is added into offload tables for both share.o
> > and test.o.  The patch below fixes this issue.  Regtested on x86_64-linux and
> > i686-linux.  However I'm not sure how to create a regression test, which would
> > compile 2 separate objects, and check run-time result.
> 
> Ok with proper ChangeLog entry.
> 
> As for testcase, won't dg-additional-sources help?
> I mean, does it fail without your patch even if you just do
> gfortran -fopenmp -o a.out share.f90 test.f90; ./a.out ?

Yes, this works.  Here is what I will commit tomorrow, if no objections.


gcc/
* varpool.c (varpool_node::get_create): Don't set 'offloadable' flag for
the external decls.
libgomp/
* testsuite/libgomp.fortran/declare-target-1.f90: New test.
* testsuite/libgomp.fortran/declare-target-2.f90: New file.


diff --git a/gcc/varpool.c b/gcc/varpool.c
index b583693..ce64279 100644
--- a/gcc/varpool.c
+++ b/gcc/varpool.c
@@ -173,7 +173,7 @@ varpool_node::get_create (tree decl)
   node = varpool_node::create_empty ();
   node->decl = decl;
 
-  if ((flag_openacc || flag_openmp)
+  if ((flag_openacc || flag_openmp) && !DECL_EXTERNAL (decl)
   && lookup_attribute ("omp declare target", DECL_ATTRIBUTES (decl)))
 {
   node->offloadable = 1;
diff --git a/libgomp/testsuite/libgomp.fortran/declare-target-1.f90 b/libgomp/testsuite/libgomp.fortran/declare-target-1.f90
new file mode 100644
index 000..fd9c26f
--- /dev/null
+++ b/libgomp/testsuite/libgomp.fortran/declare-target-1.f90
@@ -0,0 +1,15 @@
+! { dg-do run }
+! { dg-additional-sources declare-target-2.f90 }
+
+module declare_target_1_mod
+  integer :: var_x
+  !$omp declare target(var_x)
+end module declare_target_1_mod
+
+  interface
+subroutine foo ()
+end subroutine foo
+  end interface
+
+  call foo ()
+end
diff --git a/libgomp/testsuite/libgomp.fortran/declare-target-2.f90 b/libgomp/testsuite/libgomp.fortran/declare-target-2.f90
new file mode 100644
index 000..f8d3ab2
--- /dev/null
+++ b/libgomp/testsuite/libgomp.fortran/declare-target-2.f90
@@ -0,0 +1,18 @@
+! Don't compile this anywhere, it is just auxiliary
+! file compiled together with declare-target-1.f90
+! to verify inter-CU module handling of omp declare target.
+! { dg-do compile { target { lp64 && { ! lp64 } } } }
+
+subroutine foo
+  use declare_target_1_mod
+
+  var_x = 10
+  !$omp target update to(var_x)
+
+  !$omp target
+var_x = var_x * 2;
+  !$omp end target
+
+  !$omp target update from(var_x)
+  if (var_x /= 20) call abort
+end subroutine foo


  -- Ilya

Re: [PATCH][OpenMP] Fix declare target variables in fortran modules

2015-03-12 Thread Jakub Jelinek

On Thu, Mar 12, 2015 at 10:22:37PM +0300, Ilya Verbin wrote:
> On Thu, Mar 12, 2015 at 15:21:35 +0100, Jakub Jelinek wrote:
> > On Thu, Mar 12, 2015 at 04:56:35PM +0300, Ilya Verbin wrote:
> > > This happens because the var_x is added into offload tables for both share.o
> > > and test.o.  The patch below fixes this issue.  Regtested on x86_64-linux and
> > > i686-linux.  However I'm not sure how to create a regression test, which would
> > > compile 2 separate objects, and check run-time result.
> > 
> > Ok with proper ChangeLog entry.
> > 
> > As for testcase, won't dg-additional-sources help?
> > I mean, does it fail without your patch even if you just do
> > gfortran -fopenmp -o a.out share.f90 test.f90; ./a.out ?
> 
> Yes, this works.  Here is what I will commit tomorrow, if no objections.

Ok, thanks.

> gcc/
>   * varpool.c (varpool_node::get_create): Don't set 'offloadable' flag for
>   the external decls.
> libgomp/
>   * testsuite/libgomp.fortran/declare-target-1.f90: New test.
>   * testsuite/libgomp.fortran/declare-target-2.f90: New file.
> 
> 
> diff --git a/gcc/varpool.c b/gcc/varpool.c
> index b583693..ce64279 100644
> --- a/gcc/varpool.c
> +++ b/gcc/varpool.c
> @@ -173,7 +173,7 @@ varpool_node::get_create (tree decl)
>node = varpool_node::create_empty ();
>node->decl = decl;
>  
> -  if ((flag_openacc || flag_openmp)
> +  if ((flag_openacc || flag_openmp) && !DECL_EXTERNAL (decl)
>&& lookup_attribute ("omp declare target", DECL_ATTRIBUTES (decl)))
>  {
>node->offloadable = 1;
> diff --git a/libgomp/testsuite/libgomp.fortran/declare-target-1.f90 b/libgomp/testsuite/libgomp.fortran/declare-target-1.f90
> new file mode 100644
> index 000..fd9c26f
> --- /dev/null
> +++ b/libgomp/testsuite/libgomp.fortran/declare-target-1.f90
> @@ -0,0 +1,15 @@
> +! { dg-do run }
> +! { dg-additional-sources declare-target-2.f90 }
> +
> +module declare_target_1_mod
> +  integer :: var_x
> +  !$omp declare target(var_x)
> +end module declare_target_1_mod
> +
> +  interface
> +subroutine foo ()
> +end subroutine foo
> +  end interface
> +
> +  call foo ()
> +end
> diff --git a/libgomp/testsuite/libgomp.fortran/declare-target-2.f90 b/libgomp/testsuite/libgomp.fortran/declare-target-2.f90
> new file mode 100644
> index 000..f8d3ab2
> --- /dev/null
> +++ b/libgomp/testsuite/libgomp.fortran/declare-target-2.f90
> @@ -0,0 +1,18 @@
> +! Don't compile this anywhere, it is just auxiliary
> +! file compiled together with declare-target-1.f90
> +! to verify inter-CU module handling of omp declare target.
> +! { dg-do compile { target { lp64 && { ! lp64 } } } }
> +
> +subroutine foo
> +  use declare_target_1_mod
> +
> +  var_x = 10
> +  !$omp target update to(var_x)
> +
> +  !$omp target
> +var_x = var_x * 2;
> +  !$omp end target
> +
> +  !$omp target update from(var_x)
> +  if (var_x /= 20) call abort
> +end subroutine foo
> 
> 
>   -- Ilya

Jakub

[PATCH][ARM] New testcase to check parameter passing bug

2015-03-12 Thread Honggyu Kim

Hi,

I have written a testcase that reproduces the argument-overwriting bug in ARM
code generation.

I wrote this testcase with the help of Mikael Pettersson.
If the format is not suitable for the GCC testsuite framework, please
correct me.

Please refer to the following bugzilla link for details:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65358

Honggyu
---
 gcc/testsuite/ChangeLog                |    5 +++++
 gcc/testsuite/gcc.target/arm/pr65358.c |   34 ++++++++++++++++++++++++++++++++
 2 files changed, 39 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/arm/pr65358.c

diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index 5302dbd..9acd12a 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,3 +1,8 @@
+2015-03-13  Honggyu Kim  
+
+   PR target/65358
+   * gcc.target/arm/pr65358.c: New test for sibcall argument passing bug.
+
 2015-03-12  Kyrylo Tkachov  
 
PR rtl-optimization/65235
diff --git a/gcc/testsuite/gcc.target/arm/pr65358.c b/gcc/testsuite/gcc.target/arm/pr65358.c
new file mode 100644
index 000..d663dcf
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/pr65358.c
@@ -0,0 +1,34 @@
+/* PR target/65358 */
+/* { dg-do compile { target arm*-*-* } } */
+/* { dg-options "-O2" } */
+
+struct pack
+{
+  int fine;
+  int victim;
+  int killer;
+};
+
+int __attribute__ ((__noinline__, __noclone__))
+bar (int a, int b, struct pack p)
+{
+  if (a != 20 || b != 30)
+__builtin_abort ();
+  if (p.fine != 40 || p.victim != 50 || p.killer != 60)
+__builtin_abort ();
+  return 0;
+}
+
+int __attribute__ ((__noinline__, __noclone__))
+foo (int arg1, int arg2, int arg3, struct pack p)
+{
+  return bar (arg2, arg3, p);
+}
+
+int main (void)
+{
+  struct pack p = { 40, 50, 60 };
+
+  (void) foo (10, 20, 30, p);
+  return 0;
+}
-- 
1.7.9.5

Re: [PATCH] Fix PR44563 more

2015-03-12 Thread Jan Hubicka

> > CFG cleanup currently searches for calls that became noreturn and
> > fixes them up (splitting block and removing the fallthru).  Previously
> > that was technically necessary as propagation may have turned an
> > indirect call into a direct noreturn call and the CFG verifier would
> > have barfed.  Today we guard that with GF_CALL_CTRL_ALTERING and
> > thus we "remember" the previous call analysis.

Yep, I remember introducing this back in the tree-SSA branch days as a kind
of afterthought.
> > 
> > The following patch removes the CFG cleanup code (which is expensive
> > because gimple_call_flags () is quite expensive, not to talk about
> > walking all stmts).  This leaves the fixup_cfg passes to perform the
> > very same optimization (relevant propagators can also be taught
> > to call fixup_noreturn_call, but I don't think that's very important).
> > 
> > Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.
> > 
> > I'm somewhat undecided whether this is ok at this stage and if we
> > _do_ want to make propagators fix those (previously indirect) calls up
> > earlier at the same time.
> > 
> > Honza - I think we performed this in CFG cleanup for the sake of CFG 
> > checking, not for the sake of prompt optimization, no?

This is the first time I have heard this.  We have verify_flow_info.
I think most CFG cleanups were scheduled because we need update-ssa, and
that would bomb on unreachable basic blocks.
> > 
> > This would make PR44563 a pure IPA pass issue.
> 
> Soo - testing revealed a single case where we mess up things (and
> the verifier noticing only because of a LHS on a noreturn call...).
> 
> The following patch makes all propagators handle the noreturn transition
> (the paths in all but PRE are not exercised by bootstrap or testsuite :/).
> 
> This patch makes CFG cleanup independent on BB size (during analysis,
> merge_blocks and delete_basic_block are still O(n)) - which is
> a very much desired property.
> 
> It also changes fixup_cfg to produce a dump only when run as
> separate pass (otherwise the .optimized dump changes and I get
> tons of scan related fails) - that also reduces noise in the
> very many places we dump functions (they are dumped anyway for
> all cases).
> 
> Bootstrap and regtest running on x86_64-unknown-linux-gnu.
> 
> I wonder if you can throw this on firefox/chromium - the critical
> paths are devirtualization introducing __builtin_unreachable.

I built chromium and firefox without problems with your patch.
> 
> This patch should get a good speedup on all compiles (we run
> CFG-cleanup a _lot_), by removing pointless IL walks and expensive
> gimple_call_flags calls on calls.

Yes, I definitely like it.  The expense of cfg-cleanup has always
bothered me.
On an unrelated note, I think it is possible to do cfg-cleanup with
only a fixed number of passes over the CFG (my old RTL code did that, but
it has since been changed to add crossjumping).  Both the RTL and tree
cfg cleanups have a complicated history, so there are probably quite a
few ways to make them cheaper.

Thanks for working on it.  As I noted in the PR, I plan to finally fix the
inliner non-linearity with the sreal metrics next stage1 too.

Honza

Re: [PATCH] Speed-up def_builtin_const (ix86_valid_target_attribute)

2015-03-12 Thread Jan Hubicka

> 2015-03-09  Martin Liska  
> 
>   * config/i386/i386.c (def_builtin): Collect union of all
>   possible masks.
>   (ix86_add_new_builtins): Do not iterate over all builtins
>   in cases that isa value has no intersection with possible masks
>   and(or) last passed value is equal to the provided.
> ---
>  gcc/config/i386/i386.c | 11 +++
>  1 file changed, 11 insertions(+)
> 
> diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
> index ab8f03a..5f180b6 100644
> --- a/gcc/config/i386/i386.c
> +++ b/gcc/config/i386/i386.c
> @@ -30592,6 +30592,8 @@ struct builtin_isa {
>  
>  static struct builtin_isa ix86_builtins_isa[(int) IX86_BUILTIN_MAX];
>  
> +/* Union of all masks that are part of builtin_isa structures.  */
> +static HOST_WIDE_INT defined_isa_values = 0;
>  
> /* Add an ix86 target builtin function with CODE, NAME and TYPE.  Save the MASK
> of which isa_flags to use in the ix86_builtins_isa array.  Stores the
> @@ -30619,6 +30621,7 @@ def_builtin (HOST_WIDE_INT mask, const char *name,
>if (!(mask & OPTION_MASK_ISA_64BIT) || TARGET_64BIT)
>  {
>ix86_builtins_isa[(int) code].isa = mask;
> +  defined_isa_values |= mask;

I think you can move this down to where set_and_not_built_p is set.  Please
also add a comment explaining the caching mechanism.
>  
>mask &= ~OPTION_MASK_ISA_64BIT;
>if (mask == 0
> @@ -30670,6 +30673,14 @@ def_builtin_const (HOST_WIDE_INT mask, const char *name,
>  static void
>  ix86_add_new_builtins (HOST_WIDE_INT isa)
>  {
> +  /* Last cached isa value.  */
> +  static HOST_WIDE_INT last_tested_isa_value = 0;
> +
> +  if ((isa & defined_isa_values) == 0 || isa == last_tested_isa_value)

Here you need to compare (isa & defined_isa_values) == (isa &
last_tested_isa_value), right?  We have isa flags that enable no
builtins.
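
A standalone sketch of that comparison, to make the caching behaviour
concrete.  Only defined_isa_values and last_tested_isa_value are names from the
patch; isa_mask, register_builtin_mask, add_new_builtins and full_scans are
placeholders invented here, and updating the cache with the full isa value
after a scan is an assumption about the part of the patch not quoted above.

```c
#include <assert.h>

/* Stand-in for HOST_WIDE_INT in this sketch.  */
typedef unsigned long long isa_mask;

/* Union of all masks that some builtin was registered with.  */
isa_mask defined_isa_values;

/* isa value seen by the last call that performed a scan.  */
isa_mask last_tested_isa_value;

/* Counts how often the (expensive) walk over all builtins would run.  */
int full_scans;

/* Placeholder for the bookkeeping done when a builtin is defined.  */
void
register_builtin_mask (isa_mask mask)
{
  assert (mask != 0);
  defined_isa_values |= mask;
}

/* Placeholder for ix86_add_new_builtins: returns 1 if a scan was needed and
   0 if the call could be skipped.  */
int
add_new_builtins (isa_mask isa)
{
  /* Skip when every bit of ISA that can enable a builtin was already covered
     by the previous scan; bits that enable nothing never force a scan.  */
  if ((isa & defined_isa_values) == (isa & last_tested_isa_value))
    return 0;

  last_tested_isa_value = isa;
  full_scans++;  /* The real code would walk the builtin table here.  */
  return 1;
}
```

The check errs on the side of rescanning: a bit present in both isa and the
cached value but enabling no builtin makes the two sides differ, which costs
one extra scan but never skips a needed one.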

Honza

Fix polymorphic type matching in ipa-icf

2015-03-12 Thread Jan Hubicka

Hi,
this patch fixes IPA-ICF's polymorphic type matching.  Basically
ipa-polymorphic-call looks for the following cases:

  - data living in automatic or static variables of polymorphic type
  - dynamic type changes done by explicit calls to constructor or writes
of virtual table pointer
  - parameters of THIS pointer of methods
  - data living in parameters/return values of polymorphic types
passed by invisible reference.

In these cases it may derive the type of the instance from them.  The current
implementation of ipa-icf mixes type compatibility checks with polymorphic
type checks and checks the types of everything, which is useless and somewhat
expensive.  It disables the checks for leaf functions (that are commonly
merged).

This is however not 100% safe, because constructor calls and THIS pointer
writes may get inlined and the information may be propagated outside of
the function code itself.

This patch restructures the checks to be safe in this case and to do
polymorphic type matching independently of the type checking.

The common reason why merging fails is a mismatch of the THIS pointer type
(not very surprisingly).  Next stage1 I think we can lift this restriction and
simply merge polymorphic type info in ipa-prop's jump functions.

Bootstrapped/regtested x86_64-linux and also tested with Firefox and
Chromium.

Honza

* ipa-icf.c (sem_function::equals_wpa): Match CXX_CONSTRUCTOR_P
and CXX_DESTRUCTOR_P.  For constructors match ODR type of class they
are building; for methods check ODR type of class they belong to if
they may lead to a polymorphic call.
(sem_function::compare_polymorphic_p): Be a bit smarter about testing
when function may lead to a polymorphic call.
(sem_function::compare_type_list): Remove.
(sem_variable::equals): Update use of compatible_types_p.
(sem_variable::parse_tree_refs): Remove.
(sem_item_optimizer::filter_removed_items): Do not filter out CXX
cdtor.
* ipa-icf-gimple.c (func_checker::compare_decl): Do polymorphic
matching here.
(func_checker::compatible_polymorphic_types_p): Break out from ...
(func_checker::compatible_types_p): ... here.
* ipa-icf-gimple.h (func_checker::compatible_polymorphic_types_p):
Declare.
(func_checker::compatible_types_p): Update.
* ipa-icf.h (compare_type_list, parse_tree_refs, compare_sections):
Remove.

Index: ipa-icf.c
===
--- ipa-icf.c   (revision 221405)
+++ ipa-icf.c   (working copy)
@@ -429,9 +429,29 @@ sem_function::equals_wpa (sem_item *item
   if (DECL_NO_LIMIT_STACK (decl) != DECL_NO_LIMIT_STACK (item->decl))
 return return_false_with_msg ("no stack limit attributes are different");
 
+  if (DECL_CXX_CONSTRUCTOR_P (decl) != DECL_CXX_CONSTRUCTOR_P (item->decl))
+return return_false_with_msg ("DELC_CXX_CONSTRUCTOR mismatch");
+
+  if (DECL_CXX_DESTRUCTOR_P (decl) != DECL_CXX_DESTRUCTOR_P (item->decl))
+return return_false_with_msg ("DELC_CXX_DESTRUCTOR mismatch");
+
   if (flags_from_decl_or_type (decl) != flags_from_decl_or_type (item->decl))
 return return_false_with_msg ("decl_or_type flags are different");
 
+  /* Do not match polymorphic constructors of different types.  They calls
+ type memory location for ipa-polymorphic-call and we do not want
+ it to get confused by wrong type.  */
+  if (DECL_CXX_CONSTRUCTOR_P (decl)
+  && TREE_CODE (TREE_TYPE (decl)) == METHOD_TYPE)
+{
+  if (TREE_CODE (TREE_TYPE (item->decl)) != METHOD_TYPE)
+return return_false_with_msg ("DECL_CXX_CONSTURCTOR type mismatch");
+  else if (!func_checker::compatible_polymorphic_types_p
+(method_class_type (TREE_TYPE (decl)),
+ method_class_type (TREE_TYPE (item->decl)), false))
+return return_false_with_msg ("ctor polymorphic type mismatch");
+}
+
   /* Checking function TARGET and OPTIMIZATION flags.  */
   cl_target_option *tar1 = target_opts_for_fn (decl);
   cl_target_option *tar2 = target_opts_for_fn (item->decl);
@@ -473,13 +493,8 @@ sem_function::equals_wpa (sem_item *item
   if (!arg_types[i] || !m_compared_func->arg_types[i])
return return_false_with_msg ("NULL argument type");
 
-  /* Polymorphic comparison is executed just for non-leaf functions.  */
-  bool is_not_leaf = get_node ()->callees != NULL
-|| get_node ()->indirect_calls != NULL;
-
   if (!func_checker::compatible_types_p (arg_types[i],
-m_compared_func->arg_types[i],
-is_not_leaf, i == 0))
+m_compared_func->arg_types[i]))
return return_false_with_msg ("argument type is different");
   if (POINTER_TYPE_P (arg_types[i])
  && (TYPE_RESTRICT (arg_types[i])
@@ -494,6 +509,24 @@ sem_function::e
