Re: [PATCH] Use libbacktrace as libsanitizer's symbolizer

2013-11-19 Thread Jakub Jelinek
On Mon, Nov 18, 2013 at 09:09:03AM -0800, Ian Lance Taylor wrote:
> > 2) for tsan querying of data symbols, apparently the classes want to see
> >not just the symbol name and start value, but also size.  libbacktrace
> >has all this info available, just doesn't pass it down to the callback.
> >I wonder if we'd need to create yet another libbacktrace entrypoint, or
> >if it would be acceptable to do source code incompatible, ABI (at least
> >on all sane targets) compatible version of just adding another
> >uintptr_t symsize argument to backtrace_syminfo_callback.
> 
> I think it would be fine to change the callback.  I doubt that
> libbacktrace is so widely used that we need to worry about backward
> compatibility at this stage.  In particular I imagine that any users
> of libbacktrace are simply copying the source code, since there is no
> installable package.

So how about this?  Due to the CLA etc. I have not done the obvious change
to libgo/runtime/go-caller.c (syminfo_callback) that is needed together with
that.
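
For readers following along, here is a minimal, self-contained sketch of what a consumer of the extended callback could look like.  It does not link against libbacktrace: toy_symbol and toy_syminfo are hypothetical stand-ins for the symbol table and for elf_syminfo, and only the five-argument callback shape is taken from the patch below.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Callback shape with the extra SYMSIZE argument, mirroring the patched
   backtrace_syminfo_callback typedef.  */
typedef void (*syminfo_callback) (void *data, uintptr_t pc,
                                  const char *symname,
                                  uintptr_t symval,
                                  uintptr_t symsize);

/* Hypothetical stand-in for one symbol table entry.  */
struct toy_symbol
{
  const char *name;
  uintptr_t address;
  uintptr_t size;
};

/* Hypothetical stand-in for elf_syminfo: report the symbol that covers
   ADDR, or NULL/0/0 when no symbol covers it.  */
static void
toy_syminfo (const struct toy_symbol *syms, size_t count, uintptr_t addr,
             syminfo_callback callback, void *data)
{
  size_t i;

  for (i = 0; i < count; i++)
    if (addr >= syms[i].address && addr - syms[i].address < syms[i].size)
      {
        callback (data, addr, syms[i].name, syms[i].address, syms[i].size);
        return;
      }
  callback (data, addr, NULL, 0, 0);
}

/* What a data-symbol consumer (such as tsan) wants to record.  */
struct symdata
{
  const char *name;
  uintptr_t val, size;
};

/* Callback that copies name, start value and size into the symdata.  */
static void
record_symbol (void *vdata, uintptr_t pc, const char *symname,
               uintptr_t symval, uintptr_t symsize)
{
  struct symdata *data = (struct symdata *) vdata;

  (void) pc;
  data->name = symname;
  data->val = symval;
  data->size = symsize;
}
```

Looking up an address one byte into a 4-byte symbol should then report the symbol's start and size rather than the queried address, which is what the test5 change in btest.c checks.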

2013-11-19  Jakub Jelinek  

* backtrace.h (backtrace_syminfo_callback): Add symsize argument.
* elf.c (elf_syminfo): Pass 0 or sym->size to the callback as
last argument.
* btest.c (struct symdata): Add size field.
(callback_three): Add symsize argument.  Copy it to the data->size
field.
(f23): Set symdata.size to 0.
(test5): Likewise.  If sizeof (int) > 1, lookup address of
((uintptr_t) &global) + 1.  Verify symdata.val and symdata.size
values.

--- libbacktrace/backtrace.h.jj 2013-11-18 09:59:08.0 +0100
+++ libbacktrace/backtrace.h 2013-11-19 08:46:32.537927858 +0100
@@ -169,12 +169,13 @@ extern int backtrace_pcinfo (struct back
 /* The type of the callback argument to backtrace_syminfo.  DATA and
PC are the arguments passed to backtrace_syminfo.  SYMNAME is the
name of the symbol for the corresponding code.  SYMVAL is the
-   value.  SYMNAME will be NULL if no error occurred but the symbol
-   could not be found.  */
+   value and SYMSIZE is the size of the symbol.  SYMNAME will be NULL
+   if no error occurred but the symbol could not be found.  */
 
 typedef void (*backtrace_syminfo_callback) (void *data, uintptr_t pc,
const char *symname,
-   uintptr_t symval);
+   uintptr_t symval,
+   uintptr_t symsize);
 
 /* Given ADDR, an address or program counter in the current program,
call the callback information with the symbol name and value
--- libbacktrace/elf.c.jj   2013-11-19 08:35:09.0 +0100
+++ libbacktrace/elf.c  2013-11-19 08:47:37.646598147 +0100
@@ -502,9 +502,9 @@ elf_syminfo (struct backtrace_state *sta
 }
 
   if (sym == NULL)
-callback (data, addr, NULL, 0);
+callback (data, addr, NULL, 0, 0);
   else
-callback (data, addr, sym->name, sym->address);
+callback (data, addr, sym->name, sym->address, sym->size);
 }
 
 /* Add the backtrace data for one ELF file.  */
--- libbacktrace/btest.c.jj 2013-11-18 09:59:08.0 +0100
+++ libbacktrace/btest.c 2013-11-19 08:56:29.320901588 +0100
@@ -92,7 +92,7 @@ struct sdata
 struct symdata
 {
   const char *name;
-  uintptr_t val;
+  uintptr_t val, size;
   int failed;
 };
 
@@ -238,7 +238,8 @@ error_callback_two (void *vdata, const c
 
 static void
 callback_three (void *vdata, uintptr_t pc ATTRIBUTE_UNUSED,
-   const char *symname, uintptr_t symval)
+   const char *symname, uintptr_t symval,
+   uintptr_t symsize)
 {
   struct symdata *data = (struct symdata *) vdata;
 
@@ -250,6 +251,7 @@ callback_three (void *vdata, uintptr_t p
   assert (data->name != NULL);
 }
   data->val = symval;
+  data->size = symsize;
 }
 
 /* The backtrace_syminfo error callback function.  */
@@ -458,6 +460,7 @@ f23 (int f1line, int f2line)
 
  symdata.name = NULL;
  symdata.val = 0;
+ symdata.size = 0;
  symdata.failed = 0;
 
  i = backtrace_syminfo (state, addrs[j], callback_three,
@@ -605,12 +608,17 @@ test5 (void)
 {
   struct symdata symdata;
   int i;
+  uintptr_t addr = (uintptr_t) &global;
+
+  if (sizeof (global) > 1)
+addr += 1;
 
   symdata.name = NULL;
   symdata.val = 0;
+  symdata.size = 0;
   symdata.failed = 0;
 
-  i = backtrace_syminfo (state, (uintptr_t) &global, callback_three,
+  i = backtrace_syminfo (state, addr, callback_three,
 error_callback_three, &symdata);
   if (i == 0)
 {
@@ -634,6 +642,22 @@ test5 (void)
   symdata.name, "global");
  symdata.failed = 1;
}
+  else if (symdata.val != (uintptr_t) &global)
+   {
+ fprintf (stderr,
+  "test5: unexpected syminfo value got %lx expected %lx\n",
+  (unsigned long) s

Re: [GOMP4] Generation tables with omp-functions addresses for offloading.

2013-11-19 Thread Michael V. Zolotukhin
Hi Jakub,

Thanks for the remarks.  Updated patch is attached, and my answers are below.

> This will add into the table all "omp declare target" functions, but you
> actually want there only the outlined #pragma omp target bodies.
> The question is how to find them here reliably.  At least ignoring
> !DECL_ARTIFICIAL (node->decl) functions would help, but still would add
> e.g. cloned functions, or say #pragma omp parallel/task outlined bodies
> in omp declare target functions, etc.
> So, perhaps either add some extra attribute, or some flag in cgraph node,
> or change create_omp_child_function_name + create_omp_child_function
> + callers that it doesn't create "_omp_fn" clone suffix for GOMP_target,
> but say "_omp_tgtfn".  I think the last thing would be nice in any case.
> Then you can just check if it is DECL_ARTIFICIAL with strstr (DECL_NAME
> (node->decl), "_omp_tgtfn") != NULL.
That's right.  I added a check for DECL_ARTIFICIAL for now.  Renaming to
'_omp_tgtfn' could go in a separate patch, I guess.  Variables are now handled
as well.

> 
> > +  CONSTRUCTOR_APPEND_ELT (v, NULL_TREE, build_fold_addr_expr (node->decl));
> 
> As explained, the table shouldn't contain just pointers, but pairs of
> pointer, size.  For FUNCTION_DECLs just use size 1, we want to look up
> only the first byte in them, but for var decls you want to know also
> the size to register in the mapping tree.
> So you need to create the structure with the two pairs, or create the
> table as pointers but push two elements for each function (and VAR_DECL),
> first one address, second one 1 (or size in bytes) fold_converted into
> pointer type.
Yep, I fixed that by using the table-of-pointers option: the table stores
(address, size) pairs.
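
To make the layout concrete, here is a small sketch of how a consumer might walk such a table.  It assumes only what is described above: a flat array of pointers holding (address, size) pairs, with the size converted to pointer type.  The names table_entry, decode_entry and find_entry are hypothetical and not part of the patch or of libgomp.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One decoded (address, size) pair.  */
struct table_entry
{
  const void *addr;
  size_t size;
};

/* Decode pair number INDEX from TABLE.  Element 2*INDEX is the address,
   element 2*INDEX+1 is the size stored as a pointer-typed value.  */
static struct table_entry
decode_entry (const void *const *table, size_t index)
{
  struct table_entry e;

  e.addr = table[2 * index];
  e.size = (size_t) (uintptr_t) table[2 * index + 1];
  return e;
}

/* Return the index of the pair whose [addr, addr + size) range covers
   PTR, or -1 if none does.  */
static long
find_entry (const void *const *table, size_t n_entries, const void *ptr)
{
  uintptr_t p = (uintptr_t) ptr;
  size_t i;

  for (i = 0; i < n_entries; i++)
    {
      struct table_entry e = decode_entry (table, i);
      uintptr_t start = (uintptr_t) e.addr;

      if (p >= start && p - start < e.size)
        return (long) i;
    }
  return -1;
}
```

With size 1 for functions, only the first byte of a function matches, whereas a variable matches anywhere inside its storage.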

> You need to check if target actually supports named sections, if not, don't
> create anything.
Fixed.

> > +  omp_finish_file ();
> 
> Only call it if (flag_openmp).
Currently that would break '-flto', since '-fopenmp' is not in the options
list when lto1 is invoked.  I think that will be fixed soon, once we finish
the command-line options work.  Also, when no symbols have the 'omp declare
target' attribute, this call does nothing except traverse all symbols.  Do you
think it is still worth guarding with if (flag_openmp)?

Changelog:

2013-11-19  Michael Zolotukhin  

* omp-low.c: Include common/common-target.h.
(omp_finish_file): New.
* omp-low.h (omp_finish_file): Declare new function.
* toplev.c: Include omp-low.h.
(compile_file): Call omp_finish_file.


Thanks,
Michael
>   Jakub

---
 gcc/omp-low.c | 70 +++
 gcc/omp-low.h |  1 +
 gcc/toplev.c  |  3 +++
 3 files changed, 74 insertions(+)

diff --git a/gcc/omp-low.c b/gcc/omp-low.c
index 797a492..be458eb 100644
--- a/gcc/omp-low.c
+++ b/gcc/omp-low.c
@@ -51,6 +51,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "optabs.h"
 #include "cfgloop.h"
 #include "target.h"
+#include "common/common-target.h"
 #include "omp-low.h"
 #include "gimple-low.h"
 #include "tree-cfgcleanup.h"
@@ -12101,4 +12102,73 @@ make_pass_omp_simd_clone (gcc::context *ctxt)
   return new pass_omp_simd_clone (ctxt);
 }
 
+/* Create new symbol containing (address, size) pairs for omp-marked
+   functions and global variables.  */
+void
+omp_finish_file (void)
+{
+  struct cgraph_node *node;
+  struct varpool_node *vnode;
+  const char *section_name = ".omp_table_section";
+  tree new_decl, new_decl_type;
+  vec *v;
+  tree ctor;
+  int num = 0;
+
+  if (!targetm_common.have_named_sections)
+return;
+
+  vec_alloc (v, 0);
+
+  /* Collect all omp-target functions.  */
+  FOR_EACH_DEFINED_FUNCTION (node)
+{
+  /* TODO: This check could fail on functions, created by omp
+parallel/task pragmas.  It's better to name outlined for offloading
+functions in some different way and to check here the function name.
+It could be something like "*_omp_tgtfn" in contrast with "*_omp_fn"
+for functions from omp parallel/task pragmas.  */
+  if (!lookup_attribute ("omp declare target",
+DECL_ATTRIBUTES (node->decl))
+ || !DECL_ARTIFICIAL (node->decl))
+   continue;
+  CONSTRUCTOR_APPEND_ELT (v, NULL_TREE, build_fold_addr_expr (node->decl));
+  CONSTRUCTOR_APPEND_ELT (v, NULL_TREE,
+ fold_convert (const_ptr_type_node,
+   integer_one_node));
+  num += 2;
+}
+
+  /* Collect all omp-target global variables.  */
+  FOR_EACH_DEFINED_VARIABLE (vnode)
+{
+  if (!lookup_attribute ("omp declare target",
+DECL_ATTRIBUTES (vnode->decl))
+ || TREE_CODE (vnode->decl) != VAR_DECL
+ || DECL_SIZE (vnode->decl) == 0)
+   continue;
+
+  CONSTRUCTOR_APPEND_ELT (v, NULL_TREE, build_fold_addr_expr (vnode->decl));
+  

Re: [PATCH] S/390: More htm testcases plus some fixes

2013-11-19 Thread Andreas Krebbel
On 18/11/13 17:09, Peter Bergner wrote:
> On Mon, 2013-11-18 at 10:05 +0100, Andreas Krebbel wrote:
>> With the patch the htm-nofloat-2 testcase fails.  Due to the
>> "returns_twice" flag on tbegin the optimizers fail to fold the
>> compares of the condition code and the s390_optimize_nonescaping_tx
>> routine in turn fails to optimize the simple transactions.  This will
>> hopefully be fixed with a follow-on patch.
> 
> Hi Andreas,
> 
> I assume you're using the returns_twice attribute on your tbegin builtin
> so that the compiler will help you with the handling of the floating
> point registers since they are not restored on the s390's transaction
> failure?  We don't have that attribute set on POWER's tbegin builtin
> and I don't think we should since all of our registers are restored
> on a transaction failure, but I'd like to know if you added that
> attribute for any other reason such that POWER should have it too?

Hi Peter,

Right.  I did this because of the FPR trouble we have with our tbegin.  The
backend forces the compiler to generate save/restore instructions around a
tbegin by adding clobbers for the FPRs.  But this only helps with the
RTL-level passes.  At the tree level, the incomplete control flow modeling
leads to examples like this being misoptimized (f = 77.0f is propagated into
the f != 88.0f comparison):

int foo ()
{
  float f = 77.0f;
  if (__builtin_tbegin (0) == 0)
{
  f = 88.0f;
  __builtin_tabort (256);
  return 2;
}
  if (f != 88.0f)
return 3;
  return 4;
}

I agree with you that it should not be needed if the tbegin saves and restores
all the registers.  And I really cannot recommend it.  The flag prevents quite
a few optimizations, which currently makes the generated code quite ugly.

Bye,

-Andreas-



Re: Enable -fno-fat-lto-objects by default

2013-11-19 Thread Paolo Bonzini
Il 18/11/2013 20:09, Jan Hubicka ha scritto:
>>> > > this patch switches the default for fat-lto-objects as was documented for a while.
>>> > > -ffat-lto-objects doubles compilation time and often makes users to not notice that
>>> > > LTO was not used at all (because they forgot to use gcc-ar/gcc-nm plugins).
>>> > > 
>>> > > Sadly I had to add -ffat-lto-objects to bootstrap. This is because I do not know
>>> > > how to convince our build machinery to use gcc-ar/gcc-nm during the stage2+
>> > 
>> > I've posted a minimal patch set for slim-lto-bootstrap last year, see:
>> > http://thread.gmane.org/gmane.comp.gcc.patches/270842
>> > 
>> > If there's interest I could repost it.
> It would be really nice to have it in indeed.  I think we do not really need
> lto-bootstrap.mk and slim-lto-bootstrap.mk, but otherwise the patch seems easy
> enough and would save quite some of lto bootstrap testing time...

Patches 1 and 2 should go upstream first.

Patch 3 in the series is wrong because Makefile.in is a generated file.
The message does not explain why it is necessary, and it is probably
working around a bug elsewhere.

For patch 4, I agree with Jan that we do not need a separate configuration.

Paolo


[PATCH] Parameters added for coverage_compute_cfg_checksum

2013-11-19 Thread Martin Liška
Hello,
   For a follow-up patch I need coverage_compute_cfg_checksum to be usable
more generally, so I added a new argument: the function to compute the
checksum for.
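
As a rough illustration of the refactoring pattern (not the GCC code itself), the sketch below uses hypothetical toy types: toy_function stands in for struct function, toy_cfun for the cfun global, and the mixing step is arbitrary rather than the real checksum.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for struct function: a block count plus the number of
   successors of each block.  The real CFG is far richer.  */
struct toy_function
{
  unsigned n_blocks;
  const unsigned *succ_counts;
};

/* Implicit "current function" global, analogous to cfun.  */
static struct toy_function *toy_cfun;

/* New shape: the function is passed explicitly, so the checksum can be
   computed for any function without switching the global.  The mixing
   step here is arbitrary and only for illustration.  */
static unsigned
checksum_for_fn (const struct toy_function *fn)
{
  unsigned chksum = fn->n_blocks;
  unsigned i;

  for (i = 0; i < fn->n_blocks; i++)
    chksum = chksum * 31 + fn->succ_counts[i];
  return chksum;
}

/* Old call sites keep their behavior by passing the global, the same
   way branch_prob passes cfun in the patch.  */
static unsigned
checksum_current (void)
{
  return checksum_for_fn (toy_cfun);
}
```

The point of the explicit parameter is that a caller can checksum some other function without first changing the global.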

Is the patch OK?

Thank you,
Martin
diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index d759d4c..32ffc2e 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,8 @@
+2013-11-19  Martin Liska 
+
+	* gcc/coverage.c (coverage_compute_cfg_checksum): Parameters introduced.
+	* gcc/profile.c (branch_prob): Argument added.
+
 2013-11-19  Jeff Law  
 
 	* tree-ssa-threadupdate.c: Include ssa-iterators.h
diff --git a/gcc/coverage.c b/gcc/coverage.c
index 3f4e334..f0bdc1c 100644
--- a/gcc/coverage.c
+++ b/gcc/coverage.c
@@ -581,12 +581,12 @@ coverage_compute_profile_id (struct cgraph_node *n)
but the compiler won't detect the change and use the wrong profile data.  */
 
 unsigned
-coverage_compute_cfg_checksum (void)
+coverage_compute_cfg_checksum (struct function *fn)
 {
   basic_block bb;
-  unsigned chksum = n_basic_blocks_for_fn (cfun);
+  unsigned chksum = n_basic_blocks_for_fn (fn);
 
-  FOR_EACH_BB (bb)
+  FOR_EACH_BB_FN (bb, fn)
 {
   edge e;
   edge_iterator ei;
diff --git a/gcc/coverage.h b/gcc/coverage.h
index 342d73e..a467c6e 100644
--- a/gcc/coverage.h
+++ b/gcc/coverage.h
@@ -32,8 +32,8 @@ extern int coverage_begin_function (unsigned, unsigned);
 /* Complete the coverage information for the current function.  */
 extern void coverage_end_function (unsigned, unsigned);
 
-/* Compute the control flow checksum for the current function.  */
-extern unsigned coverage_compute_cfg_checksum (void);
+/* Compute the control flow checksum for the function given as argument.  */
+extern unsigned coverage_compute_cfg_checksum (struct function *);
 
 /* Compute the profile id of function N.  */
 extern unsigned coverage_compute_profile_id (struct cgraph_node *n);
diff --git a/gcc/profile.c b/gcc/profile.c
index 1f1c265..cc0f5a5 100644
--- a/gcc/profile.c
+++ b/gcc/profile.c
@@ -1197,7 +1197,7 @@ branch_prob (void)
  the checksum in only once place, since it depends on the shape
  of the control flow which can change during 
  various transformations.  */
-  cfg_checksum = coverage_compute_cfg_checksum ();
+  cfg_checksum = coverage_compute_cfg_checksum (cfun);
   lineno_checksum = coverage_compute_lineno_checksum ();
 
   /* Write the data from which gcov can reconstruct the basic block


Re: [PATCH] Eliminate n_basic_blocks macro (was Re: [PATCH] Avoid some unnecessary set_cfun calls)

2013-11-19 Thread Richard Biener
On Mon, 18 Nov 2013, David Malcolm wrote:

> On Fri, 2013-11-15 at 20:38 -0500, David Malcolm wrote:
> > On Wed, 2013-11-13 at 14:44 +0100, Richard Biener wrote:
> > > On Wed, 13 Nov 2013, David Malcolm wrote:
> > > 
> > > > On Wed, 2013-11-13 at 13:53 +0100, Richard Biener wrote:
> > > > > On Wed, 13 Nov 2013, Martin Jambor wrote:
> > > > > 
> > > > > > Hi,
> > > > > > 
> > > > > > On Wed, Nov 13, 2013 at 10:49:09AM +0100, Jakub Jelinek wrote:
> > > > > > > Hi!
> > > > > > > 
> > > > > > > void f1 (void) {}
> > > > > > > __attribute__((target ("avx"))) void f2 (void) {}
> > > > > > > __attribute__((target ("avx2"))) void f3 (void) {}
> > > > > > > __attribute__((target ("sse3"))) void f4 (void) {}
> > > > > > > __attribute__((target ("ssse3"))) void f5 (void) {}
> > > > > > > __attribute__((target ("sse4"))) void f6 (void) {}
> > > > > > > > takes about 3 seconds to compile at -O2, because set_cfun is terribly
> > > > > > > > expensive and there are hundreds of such calls.
> > > > > > > The following patch is just a quick change to avoid some of them:
> > > > > > > execute_function_todo starts with:
> > > > > > >   unsigned int flags = (size_t)data;
> > > > > > >   flags &= ~cfun->last_verified;
> > > > > > >   if (!flags)
> > > > > > > return;
> > > > > > > and if flags is initially zero, it does nothing.
> > > > > > > Similarly, execute_function_dump has the whole body surrounded by
> > > > > > >   if (dump_file && current_function_decl)
> > > > > > > and thus if dump_file is NULL, there is nothing to do.
> > > > > > > > So IMHO in neither case (which happens pretty frequently) we need to
> > > > > > > > set_cfun to every function during IPA.
> > > > > > > > 
> > > > > > > > Also, I wonder if we couldn't defer the expensive ira_init, if the info
> > > > > > > > computed by it is used only during RTL optimization passes (haven't verified
> > > > > > > > it yet), then supposedly we could just remember using some target hook
> > > > > > > > what the last state was when we did ira_init last time, and call ira_init
> > > > > > > > again at the start of expansion or so if it is different from the
> > > > > > > > last time.
> > > > > > 
> > > > > > > I was wondering whether the expensive parts of set_cfun could only be
> > > > > > > run in pass_all_optimizations (and the -Og equivalent) but not when
> > > > > > > changing functions in early and IPA passes.
> > > > > 
> > > > > Sounds like a hack ;)
> > > > > 
> > > > > > Better get things working without the cfun/current_function_decl globals.
> > > > > Wasn't there someone replacing all implicit uses with explicit ones
> > > > > for stuff like n_basic_blocks?
> > > > 
> > > > I was working on this:
> > > > http://gcc.gnu.org/ml/gcc-patches/2013-06/msg00780.html
> > > > though I switched to other tasks I felt were higher priority; sorry.
> > > > 
> > > > Do you still want me to go ahead and commit the series of changes you
> > > > pre-approved there?
> > > > 
> > > > i.e. the "n_basic_blocks" macro goes away in favor of:
> > > >n_basic_blocks_for_fn (cfun)
> > > > as a renaming of the existing n_basic_blocks_for_function macro,
> > > > followed up by analogous changes to the other macros.
> > > > 
> > > > Or should I repost before committing?
> > > 
> > > I'd say create the n_basic_blocks patch and post it, that gives
> > > people a chance to object.  If nobody chimes in I approve it
> > > and pre-approve the rest ;)
> > > 
> > > Using n_basic_blocks_for_fn (cfun) might feel backwards if
> > > eventually we'd want to C++-ify struct function and make
> > > n_basic_blocks a member function which would make it
> > > cfun->n_basic_blocks () instead.  Ok, I think that will get
> > > us into C++ bikeshedding again ;)
> > 
> > [I can't face another C vs C++ discussion right now :)]
> > 
> > Thanks.  Attached is such a patch, eliminating the:
> >   n_basic_blocks
> > macro in favor of
> >   n_basic_blocks_for_fn (cfun)
> > 
> > Successfully bootstrapped on x86_64-unknown-linux-gnu, and successfully
> > compiled stage1 on spu-unknown-elf and s390-linux-gnu (given that those
> > config files are affected).
> > 
> > Given the conditional pre-approval above, I'm posting here to give
> > people a chance to object - otherwise I'll commit, and followup with the
> > other macros that implicitly use cfun as per the thread linked to above.
> 
> Committed to trunk as r204995; I plan to commit followup patches to
> remove the other such macros.

Thanks!

Richard.


Re: Add value range support into memcpy/memset expansion

2013-11-19 Thread Richard Biener
On Tue, 19 Nov 2013, Jan Hubicka wrote:

> Hi,
> this patch fixes two issues with memcpy testcase - silences warning and updates
> the template as suggested by Uros in the PR.  The testcase still fails on i386.
> This is because we end up with:
> ;; Function t (t, funcdef_no=0, decl_uid=1763, symbol_order=2)
> 
> t (unsigned int c)
> {
>   void * b.0_4;
>   void * a.1_5;
> 
>   :
>   if (c_2(D) <= 9)
> goto ;
>   else
> goto ;
> 
>   :
>   b.0_4 = b;
>   a.1_5 = a;
>   memcpy (a.1_5, b.0_4, c_2(D));
> 
>   :
>   return;
> 
> }
> and we have no useful value range on c_2 because assert_expr was removed,
> while in 64bit version there is a cast in bb 3 that preserves the info.
> Solving this is an independent (and I guess not terribly easy) problem.

Hmm, I thought Jakub fixed this already (with the checking whether
there are any uses of c_2(D) before the conditional)?  Or is this
a different case?

Richard.

> Regtested x86_64-linux, will commit it shortly.
> 
> Index: ChangeLog
> ===
> --- ChangeLog (revision 204984)
> +++ ChangeLog (working copy)
> @@ -1,3 +1,10 @@
> +2013-11-18  Jan Hubicka  
> + Uros Bizjak  
> +
> + PR middle-end/59175
> + * gcc.target/i386/memcpy-2.c: Fix template;
> + add +1 so the testcase passes at 32bit.
> +
>  2013-11-18  Dominique d'Humieres  
>  
>   * c-c++-common/cilk-plus/PS/reduction-3.c: Use stdlib.h.
> Index: gcc.target/i386/memcpy-2.c
> ===
> --- gcc.target/i386/memcpy-2.c(revision 204984)
> +++ gcc.target/i386/memcpy-2.c(working copy)
> @@ -1,11 +1,11 @@
>  /* { dg-do compile } */
>  /* { dg-options "-O2" } */
> -/* Memcpy should be inlined because block size is known.  */
> -/* { dg-final { scan-assembler-not "memcpy" } } */
>  void *a;
>  void *b;
>  t(unsigned int c)
>  {
>if (c<10)
> -memcpy (a,b,c);
> +__builtin_memcpy (a,b,c+1);
>  }
> +/* Memcpy should be inlined because block size is known.  */
> +/* { dg-final { scan-assembler-not "(jmp|call)\[\\t \]*memcpy" } } */
> 
> 

-- 
Richard Biener 
SUSE / SUSE Labs
SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
GF: Jeff Hawn, Jennifer Guild, Felix Imend"orffer


Re: [patch][RFC] make lra.c:check_rtl set maybe_hot_insn_p

2013-11-19 Thread Yvan Roux
> yep, all good performance-wise :)

Great, Thanks Kyrill.

Ok for trunk ?

Yvan


Re: [AArch64] Remove v8type attribute.

2013-11-19 Thread Marcus Shawcroft
On 14 November 2013 17:25, James Greenhalgh  wrote:
>
> Now, every insn has a "type", we don't need v8type anymore.
> This patch removes v8type.
>
> Tested on aarch64-none-elf with no regression.
>
> OK?

OK
/Marcus


Re: [PATCH] Generate a label for the split cold function while using -freorder-blocks-and-partition

2013-11-19 Thread Eric Botcazou
> Thanks! Do I need any other maintainer's approval or can I go ahead
> and commit, after checking for test regressions and boot-strap parity?

Steven's approval is sufficient here.

-- 
Eric Botcazou


Re: [PATCH i386 4/8] [AVX512] [2/n] Add substed patterns: mask scalar subst.

2013-11-19 Thread Kirill Yukhin
Hello,
On 15 Nov 20:03, Kirill Yukhin wrote:
> Ping?
Ping?

--
Thanks, K


Re: [PATCH i386 4/8] [AVX512] [4/n] Add substed patterns: `sd' subst.

2013-11-19 Thread Kirill Yukhin
Hello,
On 15 Nov 20:05, Kirill Yukhin wrote:
> > Is it ok for trunk?
> Ping.
Ping.

--
Thanks, K


Re: [PATCH i386 4/8] [AVX512] [5/8] Add substed patterns: rounding subst.

2013-11-19 Thread Kirill Yukhin
Hello,
On 15 Nov 20:06, Kirill Yukhin wrote:
> Ping.
Ping.

--
Thanks, K


Re: [GOMP4] [PATCH] SIMD-Enabled Functions (formerly Elemental functions) for C

2013-11-19 Thread Jakub Jelinek
On Mon, Nov 18, 2013 at 10:35:50PM +, Iyer, Balaji V wrote:
> Attached, please find a patch that will implement SIMD enabled functions
> for C targeting the gomp-4_0-branch.  Here are the ChangeLog entries.  Is
> this OK to install?

Have you tested say:
int func9 (int x, int y) __attribute__ ((vector (uniform (y), mask)));
?  I mean, mostly you pass NULL to c_parser_attributes as the new argument,
but then dereference it unconditionally.

You should error out if a function has
#pragma omp declare simd
and
vector attribute, defining what it means doesn't make sense.

Also, I'm not sure I like doing the transformation from Cilk+ to OpenMP
syntax through rewriting tokens, rather than at the parsing level.
After all, the Cilk+ syntax is quite different, even when the patch pretends
it is the same, consider e.g. the linear clause, which in Cilk+ allows
to refer to an argument, in OpenMP doesn't.

Also, I wonder if you couldn't save the tokens wrapped into some tree
temporarily into the attribute, rather than having to adjust
c_parser_attribute callers.  Joseph, what do you prefer here?

> @@ -1502,9 +1502,17 @@
>c_parser_peek_token (parser)->value = error_mark_node;
>fndef_ok = !nested;
>  }
> +  /* In Cilk Plus SIMD-enabled functions (formerly known as Elemental
> + Functions), attributes are used right above the functoin declaration or
> + the function itself.  */

Spelling.

> +  tree attrs = NULL_TREE;
> +  if (flag_enable_cilkplus
> +  && c_parser_next_token_is_keyword (parser, RID_ATTRIBUTE))
> +attrs = c_parser_attributes (parser, &omp_declare_simd_clauses);

This looks wrong, parsing of the attributes is supposed to be done
in c_parser_declspecs and can be intermixed with various other tokens
(e.g. static, extern, etc.).

>c_parser_declspecs (parser, specs, true, true, start_attr_ok,
> true, true, cla_nonabstract_decl);
> +  specs->attrs = chainon (attrs, specs->attrs);
>if (parser->error)
>  {
>c_parser_skip_to_end_of_block_or_statement (parser);
> +/* Since we are converting an attribute to a pragma, we need to end the
> + attribute with PRAGMA_EOL.  OpenMP guys would like to have 2 CPP_EOF
> + at the end, and so we insert that also.  */

Comment wrongly indented.  It isn't about me who likes to have there
2 CPP_EOF, but just a safety net, because normally C FE has two tokens
look-ahead.

Jakub


Re: [PATCH i386 4/8] [AVX512] [4/n] Add substed patterns: `sd' subst.

2013-11-19 Thread Kirill Yukhin
Adding back Richard.

On 19 Nov 12:07, Kirill Yukhin wrote:
> Hello,
> On 15 Nov 20:05, Kirill Yukhin wrote:
> > > Is it ok for trunk?
> > Ping.
> Ping.
> 
> --
> Thanks, K


Re: [PATCH i386 4/8] [AVX512] [7/8] Add substed patterns: `round for expand' subst.

2013-11-19 Thread Kirill Yukhin
Hello,
On 15 Nov 20:08, Kirill Yukhin wrote:
> > Is it ok for trunk?
> Ping.
Ping.

--
Thanks, K


Re: [PATCH i386 4/8] [AVX512] [6/8] Add substed patterns: `sae' subst.

2013-11-19 Thread Kirill Yukhin
Hello,
On 15 Nov 20:07, Kirill Yukhin wrote:
> > Is it ok for trunk?
> Ping.
Ping.

--
Thanks, K


Re: [PATCH i386 4/8] [AVX512] [8/8] Add substed patterns: `sae-only for expand' subst.

2013-11-19 Thread Kirill Yukhin
Hello,
On 15 Nov 20:09, Kirill Yukhin wrote:
> > Is it ok for trunk?
> Ping.
Ping.

--
Thanks, K


Re: Pass floating point values on powerpc64 as per ABI

2013-11-19 Thread Andreas Schwab
Alan Modra  writes:

> On Tue, Nov 19, 2013 at 11:16:26AM +1030, Alan Modra wrote:
>> On Tue, Nov 19, 2013 at 01:27:39AM +0100, Andreas Schwab wrote:
>> > Where does it call a varargs function?
>> 
>> printf
>
> Sorry that wasn't such a helpful response.
>
> Here, really:
>   res = ((int(*)(char*, ...))(code))(format, doubleArg);

But cls_double_va_fn doesn't expect a varargs call.

Andreas.

-- 
Andreas Schwab, SUSE Labs, sch...@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."


Re: [PATCH i386 5/8] [AVX-512] Extend vectorizer hooks.

2013-11-19 Thread Kirill Yukhin
Hello,
On 15 Nov 20:10, Kirill Yukhin wrote:
> > Is it ok to commit to main trunk?
> Ping.
Ping.

--
Thanks, K


[PATCH] Fix PR57517

2013-11-19 Thread Richard Biener

This fixes a predcom ICE where it commons a combination of two
loads but with the combination being conditionally executed.
It's not prepared to handle this situation, disabled with
the following.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to
trunk and 4.8 branch for now.

Richard.

2013-11-19  Richard Biener  

PR tree-optimization/57517
* tree-predcom.c (combinable_refs_p): Verify the combination
is always executed when the refs are.

* gfortran.fortran-torture/compile/pr57517.f90: New testcase.
* gcc.dg/torture/pr57517.c: Likewise.

Index: gcc/tree-predcom.c
===
*** gcc/tree-predcom.c  (revision 204948)
--- gcc/tree-predcom.c  (working copy)
*** combinable_refs_p (dref r1, dref r2,
*** 2068,2074 
  
stmt = find_common_use_stmt (&name1, &name2);
  
!   if (!stmt)
  return false;
  
acode = gimple_assign_rhs_code (stmt);
--- 2068,2078 
  
stmt = find_common_use_stmt (&name1, &name2);
  
!   if (!stmt
!   /* A simple post-dominance check - make sure the combination
!  is executed under the same condition as the references.  */
!   || (gimple_bb (stmt) != gimple_bb (r1->stmt)
! && gimple_bb (stmt) != gimple_bb (r2->stmt)))
  return false;
  
acode = gimple_assign_rhs_code (stmt);
Index: gcc/testsuite/gfortran.fortran-torture/compile/pr57517.f90
===
*** gcc/testsuite/gfortran.fortran-torture/compile/pr57517.f90  (revision 0)
--- gcc/testsuite/gfortran.fortran-torture/compile/pr57517.f90  (working copy)
***
*** 0 
--- 1,13 
+ SUBROUTINE cal_helicity (uh, ph, phb, wavg, ims, ime, its, ite)
+   INTEGER, INTENT( IN ) :: ims, ime, its, ite
+   REAL, DIMENSION( ims:ime), INTENT( IN ) :: ph, phb, wavg
+   REAL, DIMENSION( ims:ime), INTENT( INOUT ) :: uh
+   INTEGER :: i
+   REAL :: zu
+   DO i = its, ite
+ zu =  (ph(i ) + phb(i)) + (ph(i-1) + phb(i-1))
+ IF (wavg(i) .GT. 0) THEN
+   uh(i) = uh(i) + zu 
+ ENDIF
+   END DO
+ END SUBROUTINE cal_helicity
Index: gcc/testsuite/gcc.dg/torture/pr57517.c
===
*** gcc/testsuite/gcc.dg/torture/pr57517.c  (revision 0)
--- gcc/testsuite/gcc.dg/torture/pr57517.c  (working copy)
***
*** 0 
--- 1,16 
+ /* { dg-do compile } */
+ 
+ int x[1024], y[1024], z[1024], w[1024];
+ void foo (void)
+ {
+   int i;
+   for (i = 1; i < 1024; ++i)
+ {
+   int a = x[i];
+   int b = y[i];
+   int c = x[i-1];
+   int d = y[i-1];
+   if (w[i])
+   z[i] = (a + b) + (c + d);
+ }
+ }


Re: [PATCH/AARCH64] Emit brk #0 for __builtin_trap

2013-11-19 Thread Marcus Shawcroft

On 13 November 2013 00:04, Andrew Pinski wrote:
> Hi all,
>   This patch implements the trap pattern for the AARCH64 back-end.  I
> used the "brk #0" instruction as that is the breakpoint instruction
> that GDB uses.  I looked at what other targets did when the
> instruction set did not have a trap instruction and found that using
> the breakpoint instruction was a common theme between them if there
> was not explicit defined undefined instruction to use.



Hi Andrew, We can exploit the immediate field in the brk instruction to
distinguish the origin of various traps.  There was some discussion on
this topic within ARM a (long) while back, at that time we discussed a 
scheme along the following lines:


                      POSIX siginfo
BRK #imm16    si_signo  si_code       Purpose
----------    --------  -------       -------

-0fff                                 S/w breakpoint, reserved for debuggers
 -3ff         SIGTRAP   TRAP_BRKPT-   EL0 breakpoint (e.g. gdb)
 0400-7ff     SIGILL    ILL_ILLTRP*   EL1 breakpoint (e.g. kgdb)
 0800-bff     SIGILL    ILL_ILLTRP*   EL2 breakpoint
 0c00-fff     SIGILL    ILL_ILLTRP*   EL3 breakpoint

1000-1fff                             C/C++ runtime errors
 1000         SIGABRT   n/a           libc abort()
 1001         SIGFPE    FPE_INTDIV    integer divide by zero
 1002         SIGFPE    FPE_INTOVF    integer overflow
 1003         SIGFPE    FPE_FLTDIV    floating-point divide by zero
 1004         SIGFPE    FPE_FLTOVF    floating-point overflow
 1005         SIGFPE    FPE_FLTUND    floating-point underflow
 1006         SIGFPE    FPE_FLTRES    floating-point inexact result
 1007         SIGFPE    FPE_FLTINV    floating-point invalid op
 1008         SIGFPE    FPE_FLTSUB    subscript out of range
 1009-1fff    SIGILL    ILL_ILLTRP    unused but reserved

2000-         SIGILL    ILL_ILLOPC+   Guaranteed unused, resvd for apps

- This is the signal generated now for all values of BRK
  immediate. GDB currently uses "BRK #0"

* The EL1/EL2/EL3 breakpoints would deliver a SIGILL if they are
  executed by EL0 code, and caught by the EL1 kernel. A s/w debugger
  operating at a higher EL which placed such breakpoints would
  presumably catch them and handle them, without the EL1 kernel ever
  seeing them.

+ Immediate values 0x2000- generate the same signal as any other
  UNDEFINED instruction encoding, but with guaranteed behaviour for
  JITs etc. The original imm16 value could be made available in the
  si_trapno field of the signal context.
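
As a rough illustration of how a signal handler or kernel might decode
the immediate under this proposal, here is a minimal C sketch.  The
enum, the function names and the bit arithmetic are assumptions made
for illustration; they are not taken from any existing kernel or GCC
code.

```c
#include <stdint.h>

/* Hypothetical classes for a BRK #imm16 value under the proposed
   scheme.  All names here are illustrative only.  */
enum brk_class
{
  BRK_DEBUGGER,       /* 0x0000-0x0fff: software breakpoints.  */
  BRK_RUNTIME_ERROR,  /* 0x1000-0x1fff: C/C++ runtime errors.  */
  BRK_APPLICATION     /* 0x2000-0xffff: reserved for applications.  */
};

static enum brk_class
classify_brk (uint16_t imm16)
{
  if (imm16 <= 0x0fff)
    return BRK_DEBUGGER;
  if (imm16 <= 0x1fff)
    return BRK_RUNTIME_ERROR;
  return BRK_APPLICATION;
}

/* Exception level (0..3) a debugger breakpoint is aimed at.  Only
   meaningful when classify_brk returns BRK_DEBUGGER: each EL owns a
   0x400-wide slice of the 0x0000-0x0fff range.  */
static unsigned
brk_debugger_el (uint16_t imm16)
{
  return (imm16 >> 10) & 0x3;
}
```

The 0x400-wide slices mean the exception level falls out of bits 10 and
11 of the immediate, which is presumably why the ranges were chosen
that way.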

Following this scheme I suggest __builtin_trap() generate brk #1000

Thoughts?

Cheers
/Marcus



Re: Add value range support into memcpy/memset expansion

2013-11-19 Thread Jakub Jelinek
On Tue, Nov 19, 2013 at 09:50:56AM +0100, Richard Biener wrote:
> > this patch fixes two issues with memcpy testcase - silences warning and 
> > updates
> > the template as suggested by Uros in the PR.  The testcase still fails on 
> > i386.
> > This is because we end up with:
> > ;; Function t (t, funcdef_no=0, decl_uid=1763, symbol_order=2)
> > 
> > t (unsigned int c)
> > {
> >   void * b.0_4;
> >   void * a.1_5;
> > 
> >   <bb 2>:
> >   if (c_2(D) <= 9)
> >     goto <bb 3>;
> >   else
> >     goto <bb 4>;
> > 
> >   <bb 3>:
> >   b.0_4 = b;
> >   a.1_5 = a;
> >   memcpy (a.1_5, b.0_4, c_2(D));
> > 
> >   <bb 4>:
> >   return;
> > 
> > }
> > and we have no useful value range on c_2 because assert_expr was removed,
> > while in 64bit version there is a cast in bb 3 that preserves the info.
> > Solving this is an independent (and I guess not terribly easy) problem.
> 
> Hmm, I thought Jakub fixed this already (with the checking whether
> there are any uses of c_2(D) before the conditional)?  Or is this
> a different case?

It was a different case, I've only handled the case where you have
if (c >= 10) __builtin_unreachable ();
and c doesn't have any immediate uses before this form of assertion.
In Honza's testcase there is no __builtin_unreachable, but instead
return at that point, changing the value range of c_2(D) in his case
would be far more controversial - for __builtin_unreachable () it means
if you pass c_2(D) 10 to the function and the if (c >= 10) test is
reached, it will be invalid (from this POV that transformation isn't
100% valid either, because there is no proof that if you call the function,
then that condition stmt will be executed).  While in the above case
c_2(D) of 10 is completely valid.
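
To make the two cases concrete, here is a small C sketch.  The function
and buffer names are made up for illustration; only the shape of the
two guards matters.

```c
#include <string.h>

static char a[16], b[16];

/* Assertion form: the caller promises c < 10.  Reaching the test with
   c >= 10 is undefined behaviour, so a range of [0, 9] for c is
   arguably justified wherever the test is known to execute.  */
void
copy_asserted (unsigned int c)
{
  if (c >= 10)
    __builtin_unreachable ();
  memcpy (a, b, c);
}

/* Early-return form, as in Honza's testcase: c >= 10 is a perfectly
   valid argument, the function simply does nothing.  The range [0, 9]
   holds only on the path that reaches memcpy, not for c in general.  */
void
copy_guarded (unsigned int c)
{
  if (c >= 10)
    return;
  memcpy (a, b, c);
}
```

In the second function the range information has to live on the edge
into the memcpy block, which is exactly what is lost once the
assert_expr is removed.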

Perhaps better might be for __builtin_unreachable () to just add
there an internal pass through call (perhaps doing nothing at all)
that would just preserve the range and non-zero bits info, we could perhaps
allow some extra optimizations on it (like, if we propagate to the first
argument anything other than some SSA_NAME, we just remove the call and
propagate it further), the question is what else should we do so that it
doesn't inhibit some optimizations unnecessarily.  But if the user
cared enough to insert an __builtin_unreachable () assertion, perhaps it
might be worth preserving that info.

That said, for Honza's case keeping around some internal pass through call
might be even far more expensive.

Jakub


Re: [Patch, AArch64] Make reduc_* operations bigendian-safe.

2013-11-19 Thread Marcus Shawcroft
On 15 November 2013 16:52, Tejas Belagod  wrote:
> Hi,
>
> The attached patch fixes all the reduc_* expansions to be BE-safe by moving
> the scalar result to the LSB where RTL expects it. While moving it also adds
> patterns that will give gcc the freedom to choose between 2-lane-situations
> like
>
>   ADDP Dd, Vd.2D
>   DUP Vd.2D, Vd.d[0]

OK
Thanks /Marcus


Re: [PATCH] aarch64 gcc.c-torture/execute/20101011-1.c failures

2013-11-19 Thread Marcus Shawcroft
On 18 November 2013 18:02, Cesar Philippidis  wrote:

>>> gcc.c-torture/execute/20101011-1.c test on aarch64. The reason why this
>>> test fails is because aarch64 does not trap on integer division by zero.
>>>
>>> Is this OK for trunk? If so, please commit it because I do not have an
>>> svn account.

This is OK.

The comment Jeff highlighted is incorrect. AArch64 does not trap on
integer division.

To get integer trap on divide by zero behavior we would have to go the
mips route and add -mdivide-traps to explicitly check and generate a
brk #XXX instruction.
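
At the source level the effect of such an option would be roughly the
following.  This is only a sketch of the idea: checked_div is a
made-up name, and __builtin_trap stands in for whatever brk immediate
the back-end would actually emit.

```c
/* Source-level analogue of an explicit divide-by-zero check: test the
   divisor first and trap instead of performing the division.  */
int
checked_div (int num, int den)
{
  if (den == 0)
    __builtin_trap ();
  return num / den;
}
```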

/Marcus


Re: [RS6000] strict alignment for little-endian

2013-11-19 Thread Eric Botcazou
> It's not desirable since gcc easily loses track of alignment, for instance
> with -mstrict-align
> 
> void foo (char *p, char *q)
> {
>   __builtin_memcpy (p, q, 4);
> }
> 
> generates
> 
>   lbz 7,0(4)
>   lbz 8,1(4)
>   lbz 10,2(4)
>   lbz 9,3(4)
>   stb 7,0(3)
>   stb 8,1(3)
>   stb 10,2(3)
>   stb 9,3(3)
>   blr
> 
> whereas -mno-strict-align gives
> 
>   lwz 9,0(4)
>   stw 9,0(3)
>   blr

I presume you meant:

void foo (int *p, int *q)
{
  __builtin_memcpy (p, q, 4);
}

which will yield the same generated code.  Yes, it's an unfortunate regression 
on strict-alignment platforms, up to 4.5 the generated code was the same.

-- 
Eric Botcazou


Re: [PATCH] Support addsub/subadd as non-isomorphic operations for SLP vectorizer.

2013-11-19 Thread Richard Biener
On Mon, 18 Nov 2013, Cong Hou wrote:

> I tried your method and it works well for doubles. But for float,
> there is an issue. For the following gimple code:
> 
>c1 = a - b;
>c2 = a + b;
>c = VEC_PERM 
>
> It needs two instructions to implement the VEC_PERM operation in
> SSE2-4, one of which should be using shufps which is represented by
> the following pattern in rtl:
> 
> 
> (define_insn "sse_shufps_"
>   [(set (match_operand:VI4F_128 0 "register_operand" "=x,x")
> (vec_select:VI4F_128
>  (vec_concat:
>(match_operand:VI4F_128 1 "register_operand" "0,x")
>(match_operand:VI4F_128 2 "nonimmediate_operand" "xm,xm"))
>  (parallel [(match_operand 3 "const_0_to_3_operand")
> (match_operand 4 "const_0_to_3_operand")
> (match_operand 5 "const_4_to_7_operand")
> (match_operand 6 "const_4_to_7_operand")])))]
> ...)
> 
> Note that it contains two rtl instructions.

It's a single instruction as far as combine is concerned (RTL
instructions have arbitrary complexity).

> Together with minus, plus,
> and one more shuffling instruction, we have at least five instructions
> for addsub pattern. I think during the combine pass, only four
> instructions are considered to be combined, right? So unless we
> compress those five instructions into four or less, we could not use
> this method for float values.

At the moment addsubv4sf looks like

(define_insn "sse3_addsubv4sf3"
  [(set (match_operand:V4SF 0 "register_operand" "=x,x")
(vec_merge:V4SF
  (plus:V4SF
(match_operand:V4SF 1 "register_operand" "0,x")
(match_operand:V4SF 2 "nonimmediate_operand" "xm,xm"))
  (minus:V4SF (match_dup 1) (match_dup 2))
  (const_int 10)))]

to match this it's best to have the VEC_SHUFFLE retained as
vec_merge and thus support arbitrary(?) vec_merge for the aid
of combining until reload(?) after which we can split it.
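
For reference, a scalar C model of what that pattern computes.  The
helper names are invented for illustration; the only thing taken from
the RTL semantics is that bit i of the vec_merge mask selects element i
from the first operand when set and from the second operand otherwise.

```c
/* Scalar model of (vec_merge:V4SF x y (const_int mask)): element i
   comes from x when bit i of mask is set, otherwise from y.  */
static void
vec_merge4 (float *out, const float *x, const float *y, unsigned mask)
{
  for (int i = 0; i < 4; i++)
    out[i] = ((mask >> i) & 1) ? x[i] : y[i];
}

/* Model of the sse3_addsubv4sf3 pattern: plus is the first vec_merge
   operand, minus the second, and the mask is 10 (binary 1010).  */
void
addsub_v4sf_model (float *out, const float *a, const float *b)
{
  float sum[4], diff[4];
  for (int i = 0; i < 4; i++)
    {
      sum[i] = a[i] + b[i];
      diff[i] = a[i] - b[i];
    }
  vec_merge4 (out, sum, diff, 10);
}
```

With mask 10 the odd elements are sums and the even elements are
differences, so a VEC_PERM of the two full results that picks lanes the
same way describes the same operation.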

> What do you think?

Besides addsub, are there other instructions that can be expressed
similarly?  Thus, how far should the combiner pattern go?

Richard.

> 
> 
> 
> thanks,
> Cong
> 
> 
> On Fri, Nov 15, 2013 at 12:53 AM, Richard Biener  wrote:
> > On Thu, 14 Nov 2013, Cong Hou wrote:
> >
> >> Hi
> >>
> >> This patch adds the support to two non-isomorphic operations addsub
> >> and subadd for SLP vectorizer. More non-isomorphic operations can be
> >> added later, but the limitation is that operations on even/odd
> >> elements should still be isomorphic. Once such an operation is
> >> detected, the code of the operation used in vectorized code is stored
> >> and later will be used during statement transformation. Two new GIMPLE
> >> operations VEC_ADDSUB_EXPR and VEC_SUBADD_EXPR are defined. And also
> >> new optabs for them. They are also documented.
> >>
> >> The target supports for SSE/SSE2/SSE3/AVX are added for those two new
> >> operations on floating points. SSE3/AVX provides ADDSUBPD and ADDSUBPS
> >> instructions. For SSE/SSE2, those two operations are emulated using
> >> two instructions (selectively negate then add).
> >>
> >> With this patch the following function will be SLP vectorized:
> >>
> >>
> >> float a[4], b[4], c[4];  // double also OK.
> >>
> >> void subadd ()
> >> {
> >>   c[0] = a[0] - b[0];
> >>   c[1] = a[1] + b[1];
> >>   c[2] = a[2] - b[2];
> >>   c[3] = a[3] + b[3];
> >> }
> >>
> >> void addsub ()
> >> {
> >>   c[0] = a[0] + b[0];
> >>   c[1] = a[1] - b[1];
> >>   c[2] = a[2] + b[2];
> >>   c[3] = a[3] - b[3];
> >> }
> >>
> >>
> >> Bootstrapped and tested on an x86-64 machine.
> >
> > I managed to do this without adding new tree codes or optabs by
> > vectorizing the above as
> >
> >c1 = a + b;
> >c2 = a - b;
> >c = VEC_PERM 
> >
> > which then matches sse3_addsubv4sf3 if you fix that pattern to
> > not use vec_merge (or fix PR56766).  Doing it this way also
> > means that the code is vectorizable if you don't have a HW
> > instruction for that but can do the VEC_PERM efficiently.
> >
> > So, I'd like to avoid new tree codes and optabs whenever possible
> > and here I've already proved (with a patch) that it is possible.
> > Didn't have time to clean it up, and it likely doesn't apply anymore
> > (and PR56766 blocks it but it even has a patch).
> >
> > Btw, this was PR56902 where I attached my patch.
> >
> > Richard.
> >
> >>
> >> thanks,
> >> Cong
> >>
> >>
> >>
> >>
> >>
> >> diff --git a/gcc/ChangeLog b/gcc/ChangeLog
> >> index 2c0554b..656d5fb 100644
> >> --- a/gcc/ChangeLog
> >> +++ b/gcc/ChangeLog
> >> @@ -1,3 +1,31 @@
> >> +2013-11-14  Cong Hou  
> >> +
> >> + * tree-vect-slp.c (vect_create_new_slp_node): Initialize
> >> + SLP_TREE_OP_CODE.
> >> + (slp_supported_non_isomorphic_op): New function.  Check if the
> >> + non-isomorphic operation is supported or not.
> >> + (vect_build_slp_tree_1): Consider non-isomorphic operations.
> >> + (vect_build_slp_tree): Change argument.
> >> + * tree-vect-stmts.c (vectorizable_operation): Consider the opcode
> >> + for non-isomorphic operations.
> >> + *

Re: patch PLUGIN_HEADER_FILE event for tracing of header inclusions.

2013-11-19 Thread Basile Starynkevitch
On Mon, Nov 18, 2013 at 10:50:10PM +, Joseph S. Myers wrote:
> On Mon, 18 Nov 2013, Basile Starynkevitch wrote:
> 
> > @@ -43,6 +44,7 @@
> >TARGET_OPTF.  */
> >  #include "tm_p.h"  /* For C_COMMON_OVERRIDE_OPTIONS.  */
> >  
> > +
> >  #ifndef DOLLARS_IN_IDENTIFIERS
> >  # define DOLLARS_IN_IDENTIFIERS true
> >  #endif
> 
> This is a spurious diff hunk that should not be in this patch.
> 
> OK minus the spurious change in the absence of objections from the plugin 
> maintainers within 48 hours (or in the presence of approval from either of 
> them).

Thanks for your attention. I am attaching a slightly improved patch
against trunk svn rev. 205009 (the improvements are removing the spurious
diff hunk, and better comments).

# gcc/c-family/ChangeLog entry :
2013-11-19  Basile Starynkevitch  

* c-opts.c: Include plugin.h.
(cb_file_change): Invoke plugin event PLUGIN_INCLUDE_FILE.


# gcc/ChangeLog entry :
2013-11-19  Basile Starynkevitch  

* plugin.def (PLUGIN_INCLUDE_FILE): New event, invoked in 
cb_file_change.

### 

Ok for trunk?

Regards.
-- 
Basile STARYNKEVITCH http://starynkevitch.net/Basile/
email: basilestarynkevitchnet mobile: +33 6 8501 2359
8, rue de la Faiencerie, 92340 Bourg La Reine, France
*** opinions {are only mines, sont seulement les miennes} ***
Index: gcc/c-family/c-opts.c
===
--- gcc/c-family/c-opts.c	(revision 205009)
+++ gcc/c-family/c-opts.c	(working copy)
@@ -34,6 +34,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "debug.h"		/* For debug_hooks.  */
 #include "opts.h"
 #include "options.h"
+#include "plugin.h"		/* For PLUGIN_INCLUDE_FILE event.  */
 #include "mkdeps.h"
 #include "c-target.h"
 #include "tm.h"			/* For BYTES_BIG_ENDIAN,
@@ -1397,6 +1398,17 @@ cb_file_change (cpp_reader * ARG_UNUSED (pfile),
   else
 fe_file_change (new_map);
 
+  if (new_map 
+  && (new_map->reason == LC_ENTER || new_map->reason == LC_RENAME))
+{
+  /* Signal to plugins that a file is included.  This could happen
+	 several times with the same file path, e.g. because of
+	 several '#include' or '#line' directives...  */
+  invoke_plugin_callbacks 
+	(PLUGIN_INCLUDE_FILE,
+	 const_cast<char*> (ORDINARY_MAP_FILE_NAME (new_map)));
+}
+
   if (new_map == 0 || (new_map->reason == LC_LEAVE && MAIN_FILE_P (new_map)))
 {
   pch_cpp_save_state ();
Index: gcc/plugin.def
===
--- gcc/plugin.def	(revision 205009)
+++ gcc/plugin.def	(working copy)
@@ -92,6 +92,12 @@ DEFEVENT (PLUGIN_EARLY_GIMPLE_PASSES_END)
 /* Called when a pass is first instantiated.  */
 DEFEVENT (PLUGIN_NEW_PASS)
 
+/* Called when a file is #include-d or given thru #line directive.
+   Could happen many times.  The event data is the included file path,
+   as a const char* pointer.  */
+DEFEVENT (PLUGIN_INCLUDE_FILE)
+
+
 /* After the hard-coded events above, plugins can dynamically allocate events
at run time.
PLUGIN_EVENT_FIRST_DYNAMIC only appears as last enum element.  */
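
For readers unfamiliar with plugin events, here is a stand-alone C
model of how a callback for this event would see its data.  It does
not use the real GCC plugin headers: the one-slot registry and the
names register_include_callback, fire_include_event and count_includes
are hypothetical stand-ins.  Only the callback shape (event data
pointer, user data pointer) and the fact that the event data is the
file path follow the patch.

```c
#include <stddef.h>

/* Same shape as a plugin callback: (event data, user data).  */
typedef void (*callback_fn) (void *gcc_data, void *user_data);

/* Hypothetical one-slot registry standing in for the plugin
   machinery.  */
static callback_fn include_cb;
static void *include_cb_data;

void
register_include_callback (callback_fn fn, void *user_data)
{
  include_cb = fn;
  include_cb_data = user_data;
}

/* Stand-in for the invoke_plugin_callbacks call added to
   cb_file_change: the path is passed as the event data.  */
void
fire_include_event (const char *path)
{
  if (include_cb)
    include_cb ((void *) path, include_cb_data);
}

/* Example callback: the event data is the included file path, and
   the same path may be reported several times.  */
void
count_includes (void *gcc_data, void *user_data)
{
  const char *path = (const char *) gcc_data;
  if (path != NULL)
    ++*(unsigned *) user_data;
}
```

A plugin that wants each header only once has to deduplicate the paths
itself, since the event fires on every LC_ENTER or LC_RENAME.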


Re: [PATCH GCC]Compute, cache and use cost of auto-increment rtx patterns in IVOPT

2013-11-19 Thread Bin.Cheng
On Tue, Nov 19, 2013 at 10:09 AM, bin.cheng  wrote:
>
>
>> -Original Message-
>> From: Bernd Schmidt [mailto:ber...@codesourcery.com]
>> Sent: Monday, November 18, 2013 8:05 PM
>> To: Bin Cheng
>> Cc: gcc-patches@gcc.gnu.org
>> Subject: Re: [PATCH GCC]Compute, cache and use cost of auto-increment rtx
>> patterns in IVOPT
>>
>> On 11/04/2013 04:31 AM, bin.cheng wrote:
>> > 2013-11-01  Bin Cheng  
>> >
>> > * tree-ssa-loop-ivopts.c (enum ainc_type): New.
>> > (address_cost_data): New field.
>> > (get_address_cost): Compute auto-increment rtx cost in ainc_costs.
>> > Use ainc_costs for auto-increment rtx patterns.
>> > Cleanup TWS.
>>
>> I think this is fine. I'd just like to see AINC_NUM gone and its use
>> replaced by AINC_NONE, we don't really need two separate enum codes
>> for that.
>>
> Thanks for reviewing, I updated patch as suggested.
>
> Hi Richard, is this patch ok for you too?
>

Got approval and committed as r205015.
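
The review comment above boils down to letting the "none" enumerator
double as the element count.  A minimal C sketch of that idiom
follows; the enumerator names other than AINC_NONE, the struct and the
helper are assumptions for illustration and are not copied from the
committed patch.

```c
/* Kinds of auto-increment addressing.  AINC_NONE comes last, so it
   also equals the number of real kinds and no separate AINC_NUM
   enumerator is needed.  */
enum ainc_type
{
  AINC_PRE_INC,
  AINC_PRE_DEC,
  AINC_POST_INC,
  AINC_POST_DEC,
  AINC_NONE
};

/* Hypothetical cache of per-kind costs, sized by the sentinel.  */
struct ainc_cost_cache
{
  unsigned ainc_costs[AINC_NONE];
};

/* Return the cached cost for kind T, or FALLBACK when the address
   does not use auto-increment at all.  */
unsigned
ainc_cost (const struct ainc_cost_cache *cache, enum ainc_type t,
	   unsigned fallback)
{
  return t == AINC_NONE ? fallback : cache->ainc_costs[t];
}
```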

Thanks,
bin

-- 
Best Regards.


Re: [PATCH] Support addsub/subadd as non-isomorphic operations for SLP vectorizer.

2013-11-19 Thread Richard Earnshaw
On 18/11/13 20:19, Cong Hou wrote:
> On Fri, Nov 15, 2013 at 10:18 AM, Richard Earnshaw  wrote:
>> On 15/11/13 02:06, Cong Hou wrote:
>>> Hi
>>>
>>> This patch adds the support to two non-isomorphic operations addsub
>>> and subadd for SLP vectorizer. More non-isomorphic operations can be
>>> added later, but the limitation is that operations on even/odd
>>> elements should still be isomorphic. Once such an operation is
>>> detected, the code of the operation used in vectorized code is stored
>>> and later will be used during statement transformation. Two new GIMPLE
>>> operations VEC_ADDSUB_EXPR and VEC_SUBADD_EXPR are defined. And also
>>> new optabs for them. They are also documented.
>>>
>>
>> Notwithstanding what Richi has already said on this subject, you
>> certainly don't need both VEC_ADDSUB_EXPR and VEC_SUBADD_EXPR.  The
>> latter can always be formed by vec-negating the second operand and
>> passing it to VEC_ADDSUB_EXPR.
>>
> 
> Right. But I also considered targets without support for addsub
> instructions. Then we could still selectively negate odd/even elements
> using masks then use PLUS_EXPR (at most 2 instructions). If I
> implement VEC_SUBADD_EXPR by negating the second operand then using
> VEC_ADDSUB_EXPR, I end up with one more instruction.
> 
> 

No, you don't, since as Richi has mentioned elsewhere, two RTL
operations in a single pattern do not imply two instructions.
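
The identity behind the earlier suggestion can be checked with a small
scalar C sketch.  It uses the element-wise definitions from the patch
description (addsub adds on even elements and subtracts on odd ones);
the function names are illustrative only.

```c
/* addsub as defined in the patch description: add on even elements,
   subtract on odd elements.  */
void
addsub4 (float *c, const float *a, const float *b)
{
  c[0] = a[0] + b[0];
  c[1] = a[1] - b[1];
  c[2] = a[2] + b[2];
  c[3] = a[3] - b[3];
}

/* subadd expressed as addsub with the second operand negated:
   a + (-b) == a - b on even elements, a - (-b) == a + b on odd.  */
void
subadd4_via_addsub (float *c, const float *a, const float *b)
{
  float neg_b[4];
  for (int i = 0; i < 4; i++)
    neg_b[i] = -b[i];
  addsub4 (c, a, neg_b);
}
```

Whether the negation costs a separate instruction is then a matter for
the target patterns, not for the tree-level representation.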

R.




Re: Ping Re: [gomp4] Dumping gimple for offload.

2013-11-19 Thread Ilya Tocar
On 14 Nov 11:27, Richard Biener wrote:
> > +  /* Set when symbol needs to be dumped for lto/offloading.  */
> > +  unsigned need_dump : 1;
> > +
> 
> That's very non-descriptive.  What's "offloading"?  But yes, something
> like this is what I was asking for.

I've changed it into:
Set when symbol needs to be dumped into LTO bytecode for LTO,
or in pragma omp target case, for separate compilation targeting
a different architecture.
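
In other words, the flag is computed as in the following stand-alone C
model.  The struct and its omp_declare_target field are simplified
stand-ins for symtab_node and the attribute lookup in the patch; they
are assumptions for illustration, not the real data structures.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for a symbol table entry.  */
struct sym_model
{
  const char *name;
  bool omp_declare_target;  /* Has the "omp declare target" attribute.  */
  bool need_dump;
};

/* For LTO, dump everything.  For the omp target case, dump only the
   symbols marked for the target.  */
void
select_what_to_dump_model (struct sym_model *syms, size_t n, bool is_omp)
{
  for (size_t i = 0; i < n; i++)
    syms[i].need_dump = !is_omp || syms[i].omp_declare_target;
}
```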

Ok for gomp4 branch now?

2013-11-19 Ilya Tocar   

* cgraph.h (symtab_node): Add need_dump.
* cgraphunit.c (ipa_passes): Run ipa_write_summaries for omp.
(compile): Initialize streamer for omp.
* ipa-inline-analysis.c (inline_generate_summary): Add flag_openmp.
* lto-cgraph.c (lto_set_symtab_encoder_in_partition): Respect
need_dump flag.
(select_what_to_dump): New.
* lto-streamer.c (section_name_prefix): New.
(lto_get_section_name): Use section_name_prefix.
(lto_streamer_init): Add flag_openmp.
* lto-streamer.h (OMP_SECTION_NAME_PREFIX): New.
(section_name_prefix): Ditto.
(select_what_to_dump): Ditto.
* lto/lto-partition.c (add_symbol_to_partition_1): Set need_dump.
(lto_promote_cross_file_statics): Dump everything.
* passes.c (ipa_write_summaries): Add parameter,
call select_what_to_dump.
* tree-pass.h (ipa_write_summaries): Add parameter.


---
 gcc/cgraph.h  |  5 +
 gcc/cgraphunit.c  | 15 +--
 gcc/ipa-inline-analysis.c |  2 +-
 gcc/lto-cgraph.c  | 14 ++
 gcc/lto-streamer.c|  5 +++--
 gcc/lto-streamer.h|  6 ++
 gcc/lto/lto-partition.c   |  3 +++
 gcc/passes.c  |  6 --
 gcc/tree-pass.h   |  2 +-
 9 files changed, 50 insertions(+), 8 deletions(-)

diff --git a/gcc/cgraph.h b/gcc/cgraph.h
index fb0fe93..9f799f4 100644
--- a/gcc/cgraph.h
+++ b/gcc/cgraph.h
@@ -105,6 +105,11 @@ public:
   /* Set when symbol has address taken. */
   unsigned address_taken : 1;
 
+  /* Set when symbol needs to be dumped into LTO bytecode for LTO,
+ or in pragma omp target case, for separate compilation targeting
+ a different architecture.  */
+  unsigned need_dump : 1;
+
 
   /* Ordering of all symtab entries.  */
   int order;
diff --git a/gcc/cgraphunit.c b/gcc/cgraphunit.c
index c3a8967..53cd250 100644
--- a/gcc/cgraphunit.c
+++ b/gcc/cgraphunit.c
@@ -2019,7 +2019,18 @@ ipa_passes (void)
  passes->all_lto_gen_passes);
 
   if (!in_lto_p)
-ipa_write_summaries ();
+{
+  if (flag_openmp)
+   {
+ section_name_prefix = OMP_SECTION_NAME_PREFIX;
+ ipa_write_summaries (true);
+   }
+  if (flag_lto)
+   {
+ section_name_prefix = LTO_SECTION_NAME_PREFIX;
+ ipa_write_summaries (false);
+   }
+}
 
   if (flag_generate_lto)
 targetm.asm_out.lto_end ();
@@ -2110,7 +2121,7 @@ compile (void)
   cgraph_state = CGRAPH_STATE_IPA;
 
   /* If LTO is enabled, initialize the streamer hooks needed by GIMPLE.  */
-  if (flag_lto)
+  if (flag_lto || flag_openmp)
 lto_streamer_hooks_init ();
 
   /* Don't run the IPA passes if there was any error or sorry messages.  */
diff --git a/gcc/ipa-inline-analysis.c b/gcc/ipa-inline-analysis.c
index 4458723..62faa52 100644
--- a/gcc/ipa-inline-analysis.c
+++ b/gcc/ipa-inline-analysis.c
@@ -3813,7 +3813,7 @@ inline_generate_summary (void)
 
   /* When not optimizing, do not bother to analyze.  Inlining is still done
  because edge redirection needs to happen there.  */
-  if (!optimize && !flag_lto && !flag_wpa)
+  if (!optimize && !flag_lto && !flag_wpa && !flag_openmp)
 return;
 
   function_insertion_hook_holder =
diff --git a/gcc/lto-cgraph.c b/gcc/lto-cgraph.c
index 6a52da8..697c069 100644
--- a/gcc/lto-cgraph.c
+++ b/gcc/lto-cgraph.c
@@ -238,6 +238,9 @@ void
 lto_set_symtab_encoder_in_partition (lto_symtab_encoder_t encoder,
 symtab_node *node)
 {
+  /* Ignore not needed nodes.  */
+  if (!node->need_dump)
+return;
   int index = lto_symtab_encoder_encode (encoder, node);
   encoder->nodes[index].in_partition = true;
 }
@@ -751,6 +754,17 @@ add_references (lto_symtab_encoder_t encoder,
   lto_symtab_encoder_encode (encoder, ref->referred);
 }
 
+/* Select what needs to be dumped. In lto case dump everything.
+   In omp target case only dump stuff marked with attribute.  */
+void
+select_what_to_dump (bool is_omp)
+{
+  struct symtab_node *snode;
+  FOR_EACH_SYMBOL(snode)
+snode->need_dump = !is_omp || lookup_attribute ("omp declare target",
+   DECL_ATTRIBUTES (snode->decl));
+}
+
 /* Find all symbols we want to stream into given partition and insert them
to encoders.
 
diff --git a/gcc/lto-streamer.c b/gcc/lto-streamer.c
index 1540e4c..ffafb0e 100644
--- a/gcc/lto-streamer.c
+++ b/gcc/lto-streamer.c
@@ -43,6 +43,

Re: Enale -fno-fat-lto-objects by default

2013-11-19 Thread Markus Trippelsdorf
On 2013.11.19 at 09:44 +0100, Paolo Bonzini wrote:
> Il 18/11/2013 20:09, Jan Hubicka ha scritto:
> >>> > > this patch switches the default for fat-lto-objects as was documented 
> >>> > > for a while.
> >>> > > -ffat-lto-objects doubles compilation time and often makes users to 
> >>> > > not notice that
> >>> > > LTO was not used at all (because they forgot to use gcc-ar/gcc-nm 
> >>> > > plugins).
> >>> > > 
> >>> > > Sadly I had to add -ffat-lto-objects to bootstrap. This is because I 
> >>> > > do not know
> >>> > > how to convince our build machinery to use gcc-ar/gcc-nm during the 
> >>> > > stage2+
> >> > 
> >> > I've posted a minimal patch set for slim-lto-bootstrap last year, see:
> >> > http://thread.gmane.org/gmane.comp.gcc.patches/270842
> >> > 
> >> > If there's interest I could repost it.
> > It would be really nice to have it in indeed.  I think we do not really need
> > lto-bootstrap.mk and slim-lto-bootstrap.mk, but otherwise the patch seems 
> > easy
> > enough and would save quite some of lto bootstrap testing time...
> 
> Patches 1 and 2 should go upstream first.

OK, but where is upstream?
Please note that a general libtool update would fix this issue, too.
So, maybe it is just time to upgrade libtool everywhere in gnu-land?

> Patch 3 in the series is wrong because Makefile.in is a generated file.
>  The message does not explain why it is necessary, and it is probably
> working around a bug elsewhere.
> For patch 4, I agree with Jan that we do not need a separate configuration.

The problem is that fixincl links with libiberty.a:

/var/tmp/gcc_build_dir/./gcc/xgcc -B/var/tmp/gcc_build_dir/./gcc/
-B/usr/x86_64-pc-linux-gnu/bin/ -B/usr/x86_64-pc-linux-gnu/lib/ -isystem
/usr/x86_64-pc-linux-gnu/include -isystem
/usr/x86_64-pc-linux-gnu/sys-include -O2 -pipe -static-libstdc++
-static-libgcc  -o fixincl fixincl.o fixtests.o fixfixes.o server.o
procopen.o fixlib.o fixopts.o ../libiberty/libiberty.a

And this archive consists of object files with LTO sections only. So we
need to find a way to pass -fuse-linker-plugin to the invocation above.

-- 
Markus


Re: Enale -fno-fat-lto-objects by default

2013-11-19 Thread Jan Hubicka
> On 2013.11.19 at 09:44 +0100, Paolo Bonzini wrote:
> > Il 18/11/2013 20:09, Jan Hubicka ha scritto:
> > >>> > > this patch switches the default for fat-lto-objects as was 
> > >>> > > documented for a while.
> > >>> > > -ffat-lto-objects doubles compilation time and often makes users to 
> > >>> > > not notice that
> > >>> > > LTO was not used at all (because they forgot to use gcc-ar/gcc-nm 
> > >>> > > plugins).
> > >>> > > 
> > >>> > > Sadly I had to add -ffat-lto-objects to bootstrap. This is because 
> > >>> > > I do not know
> > >>> > > how to convince our build machinery to use gcc-ar/gcc-nm during the 
> > >>> > > stage2+
> > >> > 
> > >> > I've posted a minimal patch set for slim-lto-bootstrap last year, see:
> > >> > http://thread.gmane.org/gmane.comp.gcc.patches/270842
> > >> > 
> > >> > If there's interest I could repost it.
> > > It would be really nice to have it in indeed.  I think we do not really 
> > > need
> > > lto-bootstrap.mk and slim-lto-bootstrap.mk, but otherwise the patch seems 
> > > easy
> > > enough and would save quite some of lto bootstrap testing time...
> > 
> > Patches 1 and 2 should go upstream first.
> 
> OK, but where is upstream?
> Please note that a general libtool update would fix this issue, too.
> So, maybe it is just time to upgrade libtool everywhere in gnu-land?
> 
> > Patch 3 in the series is wrong because Makefile.in is a generated file.
> >  The message does not explain why it is necessary, and it is probably
> > working around a bug elsewhere.
> > For patch 4, I agree with Jan that we do not need a separate configuration.
> 
> The problem is that fixincl links with libiberty.a:
> 
> /var/tmp/gcc_build_dir/./gcc/xgcc -B/var/tmp/gcc_build_dir/./gcc/
> -B/usr/x86_64-pc-linux-gnu/bin/ -B/usr/x86_64-pc-linux-gnu/lib/ -isystem
> /usr/x86_64-pc-linux-gnu/include -isystem
> /usr/x86_64-pc-linux-gnu/sys-include -O2 -pipe -static-libstdc++
> -static-libgcc  -o fixincl fixincl.o fixtests.o fixfixes.o server.o
> procopen.o fixlib.o fixopts.o ../libiberty/libiberty.a
> 
> And this archive consists of object files with LTO sections only. So we
> need to find a way to pass -fuse-linker-plugin to the invocation above.

-fuse-linker-plugin is now default at the same time as -fno-fat-lto-objects is,
so there should be no need for using this switch explicitly.

Honza
> 
> -- 
> Markus


Re: [RFA/RFC patch]: Follow-up on type-demotion pass ...

2013-11-19 Thread Kai Tietz


- Original Message -
> This is not a review, but:
> 
> * What do you need from rtl.h?  It's generally best for GIMPLE passes to
> avoid rtl.h where possible (and if you can avoid it, the next question is
> whether you can also avoid tm.h).

Yes, for now the rtl.h, tm.h and tm_p.h headers can be avoided.  The first two are
required for the tm_p.h header.  As this patch doesn't require target hooks, I
removed those three includes from my local patch.
 
> * Going just on the general description of the pass and not looking at the
> details: does this do any of the things that are done by shorten_binary_op
> or shorten_compare in c-common.c?  If so, do you plan followup changes to
> remove as premature optimizations whatever those functions do that can be
> done by this pass?

I didn't have shorten_compare and/or shorten_binary_op explicitly on my radar, but on
a closer look into them, yes.  This patch is a step into this area.

> 
> --
> Joseph S. Myers
> jos...@codesourcery.com
> 

Kai


Re: Enale -fno-fat-lto-objects by default

2013-11-19 Thread Paolo Bonzini
Il 19/11/2013 11:05, Markus Trippelsdorf ha scritto:
> On 2013.11.19 at 09:44 +0100, Paolo Bonzini wrote:
>> Il 18/11/2013 20:09, Jan Hubicka ha scritto:
>>> this patch switches the default for fat-lto-objects as was documented 
>>> for a while.
>>> -ffat-lto-objects doubles compilation time and often makes users to not 
>>> notice that
>>> LTO was not used at all (because they forgot to use gcc-ar/gcc-nm 
>>> plugins).
>>>
>>> Sadly I had to add -ffat-lto-objects to bootstrap. This is because I do 
>>> not know
>>> how to convince our build machinery to use gcc-ar/gcc-nm during the 
>>> stage2+
>
> I've posted a minimal patch set for slim-lto-bootstrap last year, see:
> http://thread.gmane.org/gmane.comp.gcc.patches/270842
>
> If there's interest I could repost it.
>>> It would be really nice to have it in indeed.  I think we do not really need
>>> lto-bootstrap.mk and slim-lto-bootstrap.mk, but otherwise the patch seems 
>>> easy
>>> enough and would save quite some of lto bootstrap testing time...
>>
>> Patches 1 and 2 should go upstream first.
> 
> OK, but where is upstream?
> Please note that a general libtool update would fix this issue, too.

Ah, so they're already upstream.

> So, maybe it is just time to upgrade libtool everywhere in gnu-land?

Yes, that would be better but no need to do that now.

>> Patch 3 in the series is wrong because Makefile.in is a generated file.
>>  The message does not explain why it is necessary, and it is probably
>> working around a bug elsewhere.
>> For patch 4, I agree with Jan that we do not need a separate configuration.
> 
> The problem is that fixincl links with libiberty.a:
> 
> /var/tmp/gcc_build_dir/./gcc/xgcc -B/var/tmp/gcc_build_dir/./gcc/
> -B/usr/x86_64-pc-linux-gnu/bin/ -B/usr/x86_64-pc-linux-gnu/lib/ -isystem
> /usr/x86_64-pc-linux-gnu/include -isystem
> /usr/x86_64-pc-linux-gnu/sys-include -O2 -pipe -static-libstdc++
> -static-libgcc  -o fixincl fixincl.o fixtests.o fixfixes.o server.o
> procopen.o fixlib.o fixopts.o ../libiberty/libiberty.a
> 
> And this archive consists of object files with LTO sections only. So we
> need to find a way to pass -fuse-linker-plugin to the invocation above.

Then -fuse-linker-plugin should be added to the LDFLAGS (not CFLAGS) for
all host modules, as in "LDFLAGS += -fuse-linker-plugin".  Other host
modules than fixincludes could also use libiberty or another
bootstrapped host library (libbfd is one, I think), and would require
the same fix.

Paolo


[RFC] [PATCH, AARCH64] Machine descriptions to support stack smashing protection

2013-11-19 Thread Venkataramanan Kumar
Hi Maintainers,

This is an RFC patch that adds machine descriptions to support stack
smashing protection in AArch64.

I have written a very simple patch that prints "stack set" and "stack
test" as template of instructions.

I had 2 assumptions.

1) For "stack_protect_set" and "stack_protect_test", I
used "memory_operand" as predicate.

GCC places the memory operand in a register much earlier, during the
expand phase, before these patterns are invoked.

So I am assuming that I will get the memory operand "__stack_chk_guard"
in a register when we are not using a TLS-based stack guard.

2) For the TLS case, assuming the stack guard value is stored at offset
"-8" from "tp", GCC generates the code below for stack set.


mrs x0, tpidr_el0
ldr x1, [x0,-8]
str x1, [x29,24]
mov x1,0

I submitted Glibc patches some time ago:
https://sourceware.org/ml/libc-ports/2013-08/msg00044.html.

There are a few regressions: the pthread_cancel tests in glibc fail, which
I am currently debugging :(.

GCC with the patch generates the code below for stack test:

ldr x1, [x29,24]
ldr x0, [x0,-8]
eor x0, x1, x0
cbnz x0, .L4
.
..
.L4:
bl  __stack_chk_fail


I generate "eor" since it serves two purposes: one, checking equality, and
two, clearing the register that held the canary.
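
In C terms the test sequence behaves roughly like the sketch below.
The function name is made up for illustration, and the real check of
course operates on registers inside the pattern rather than on
function arguments.

```c
#include <stdint.h>

/* XOR of the canary saved in the frame with the reference guard.  The
   result is zero exactly when the two match, so a single value serves
   both as the comparison result and as the cleared register: on the
   success path the register that held the guard now holds 0.  */
uintptr_t
canary_check (uintptr_t saved, uintptr_t guard)
{
  return saved ^ guard;
}
```

A non-zero result corresponds to the cbnz branch being taken to the
failure call.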

Request your feedback to shape this into a better patch.

regards,
Venkat.
Index: gcc/testsuite/gcc.dg/pr46440.c
===
--- gcc/testsuite/gcc.dg/pr46440.c  (revision 204932)
+++ gcc/testsuite/gcc.dg/pr46440.c  (working copy)
@@ -1,7 +1,6 @@
 /* PR rtl-optimization/46440 */
 /* { dg-do compile } */
 /* { dg-options "-O -fstack-protector -fno-tree-dominator-opts -fno-tree-fre" 
} */
-/* { dg-require-effective-target fstack_protector } */
 
 int i;
 
Index: gcc/testsuite/gcc.dg/ssp-1.c
===
--- gcc/testsuite/gcc.dg/ssp-1.c(revision 204932)
+++ gcc/testsuite/gcc.dg/ssp-1.c(working copy)
@@ -1,6 +1,4 @@
-/* { dg-do run { target native } } */
 /* { dg-options "-fstack-protector" } */
-/* { dg-require-effective-target fstack_protector } */
 
 #include 
 
Index: gcc/testsuite/gcc.dg/pr47766.c
===
--- gcc/testsuite/gcc.dg/pr47766.c  (revision 204932)
+++ gcc/testsuite/gcc.dg/pr47766.c  (working copy)
@@ -1,6 +1,5 @@
 /* { dg-do compile } */
 /* { dg-options "-O2 -fstack-protector" } */
-/* { dg-require-effective-target fstack_protector } */
 
 int
 parse_opt (int key)
Index: gcc/testsuite/gcc.dg/ssp-2.c
===
--- gcc/testsuite/gcc.dg/ssp-2.c(revision 204932)
+++ gcc/testsuite/gcc.dg/ssp-2.c(working copy)
@@ -1,7 +1,5 @@
-/* { dg-do run { target native } } */
 /* { dg-options "-fstack-protector" } */
 /* { dg-options "-fstack-protector -Wl,-multiply_defined,suppress" { target 
*-*-darwin* } } */
-/* { dg-require-effective-target fstack_protector } */
 
 #include 
 
Index: gcc/testsuite/gcc.dg/fstack-protector-strong.c
===
--- gcc/testsuite/gcc.dg/fstack-protector-strong.c  (revision 204932)
+++ gcc/testsuite/gcc.dg/fstack-protector-strong.c  (working copy)
@@ -1,6 +1,6 @@
 /* Test that stack protection is done on chosen functions. */
 
-/* { dg-do compile { target i?86-*-* x86_64-*-* rs6000-*-* s390x-*-* } } */
+/* { dg-do compile { target i?86-*-* x86_64-*-* rs6000-*-* s390x-*-* 
aarch64-*-*} } */
 /* { dg-options "-O2 -fstack-protector-strong" } */
 
 #include
Index: gcc/testsuite/g++.dg/fstack-protector-strong.C
===
--- gcc/testsuite/g++.dg/fstack-protector-strong.C  (revision 204932)
+++ gcc/testsuite/g++.dg/fstack-protector-strong.C  (working copy)
@@ -1,6 +1,6 @@
 /* Test that stack protection is done on chosen functions. */
 
-/* { dg-do compile { target i?86-*-* x86_64-*-* } } */
+/* { dg-do compile { target i?86-*-* x86_64-*-* aarch64-*-* } } */
 /* { dg-options "-O2 -fstack-protector-strong" } */
 
 class A
Index: gcc/config/aarch64/aarch64-linux.h
===
--- gcc/config/aarch64/aarch64-linux.h  (revision 204932)
+++ gcc/config/aarch64/aarch64-linux.h  (working copy)
@@ -43,4 +43,9 @@
 }  \
   while (0)
 
+#ifdef TARGET_LIBC_PROVIDES_SSP
+/* Aarch64 glibc provides __stack_chk_guard in [tp - 0x8].  */
+#define TARGET_THREAD_SSP_OFFSET (-1 * GET_MODE_SIZE (ptr_mode))
+#endif
+
 #endif  /* GCC_AARCH64_LINUX_H */
Index: gcc/config/aarch64/aarch64.md
===
--- gcc/config/aarch64/aarch64.md   (revision 204932)
+++ gcc/config/aarch64

[PING] Fix PR58115

2013-11-19 Thread Bernd Edlinger
PING...

this patch still needs review: 
http://gcc.gnu.org/ml/gcc-patches/2013-11/msg00133.html

Thanks.


> on i686-pc-linux-gnu the test case gcc.target/i386/intrinsics_4.c fails
> because of an internal compiler error, see PR58115.
>
> The reason for this is that the optab CODE_FOR_movv8sf is disabled when it
> should be enabled.
>
> This happens because invoke_set_current_function_hook changes the pointer
> "this_fn_optabs" after targetm.set_current_function has already modified the
> optab to enable/disable CODE_FOR_movv8sf, leaving that optab entry
> in an undefined state.
>
> Boot-strapped and regression-tested on i686-pc-linux-gnu.
>
> Ok for trunk?
>
> Regards
> Bernd.  
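The ordering problem described in the quoted message can be pictured with a toy model. This is only a sketch under stated assumptions: `toy_optabs`, `toy_set_current_function` and the two ordering helpers are hypothetical names, not GCC code, and the real patch may well fix the problem differently. It shows how an update applied through a "current table" pointer is lost if the pointer is switched afterwards.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a table of enabled instruction patterns. */
struct toy_optabs {
    bool mov_v8sf_enabled;
};

static struct toy_optabs default_optabs;
static struct toy_optabs function_optabs;

/* Hypothetical stand-in for the "current function's table" pointer. */
static struct toy_optabs *toy_this_fn_optabs = &default_optabs;

/* Models a target hook that enables or disables an entry in whichever
   table is current at the time of the call. */
static void toy_set_current_function(bool enable)
{
    assert(toy_this_fn_optabs);
    toy_this_fn_optabs->mov_v8sf_enabled = enable;
}

static void toy_reset(void)
{
    default_optabs.mov_v8sf_enabled = false;
    function_optabs.mov_v8sf_enabled = false;
    toy_this_fn_optabs = &default_optabs;
}

/* Hook first, pointer switch second: the update lands in the old table. */
static bool toy_buggy_order(void)
{
    toy_reset();
    toy_set_current_function(true);
    toy_this_fn_optabs = &function_optabs;
    return toy_this_fn_optabs->mov_v8sf_enabled;
}

/* Pointer switch first, hook second: the update lands in the new table. */
static bool toy_fixed_order(void)
{
    toy_reset();
    toy_this_fn_optabs = &function_optabs;
    toy_set_current_function(true);
    return toy_this_fn_optabs->mov_v8sf_enabled;
}
```

In this model `toy_buggy_order()` reports the entry as disabled even though the hook asked for it to be enabled, which is the kind of mismatch the message describes.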

Re: [RFC] [PATCH, AARCH64] Machine descriptions to support stack smashing protection

2013-11-19 Thread Jakub Jelinek
On Tue, Nov 19, 2013 at 04:30:21PM +0530, Venkataramanan Kumar wrote:
> This is RFC patch that adds machine descriptions to support stack
> smashing protection in AArch64.

Most of the testsuite changes look wrong.  The fact that aarch64
gets stack protector support doesn't mean all other targets do as well.
So please leave all the changes that remove native or stack_protector
requirements out.

Jakub


Re: [PATCH, PR 10474] Take two on splitting live-ranges of function arguments to help shrink-wrapping

2013-11-19 Thread James Greenhalgh
*Ping*

This is one of many bugs blocking bootstrap on ARM. It would be
helpful if the fix could be reviewed and, if correct, for it to go
in.

http://gcc.gnu.org/ml/gcc-patches/2013-11/msg01820.html

Thanks,
James

On Fri, Nov 15, 2013 at 03:10:49PM +, Martin Jambor wrote:
> Perfect, thanks a lot.  The patch has also passed bootstrap and
> testing on x86_64-linux (all languages and Ada) and passed bootstrap
> on ppc64 (all languages) and ia64 (C, C++ and Fortran), I do not yet
> have results from testsuite runs from the latter two platforms.  Just
> for the record, I've also just started an i686 bootstrap.
> 
> Vlad, the patch below is exactly the same one + a testcase.  It
> basically does what you suggested in your first email, moves the
> transformation to before all the analyses and undoes some code
> movement from find_moveable_pseudos() to ira().  Is it OK for trunk?
> 
> Thanks,
> 
> Martin
> 
> 
> 2013-11-15  Martin Jambor  
> 
>   * ira.c (find_moveable_pseudos): Put back various analyses from ira()
>   here.
>   (ira): Move init_reg_equiv and call to
>   split_live_ranges_for_shrink_wrap up, remove analyses around call
>   to find_moveable_pseudos.
> 
> testsuite/
>   * gcc.target/i386/pr59099.c: New test.
> 
> diff --git a/gcc/ira.c b/gcc/ira.c
> index 2ef69cb..a171761 100644
> --- a/gcc/ira.c
> +++ b/gcc/ira.c
> @@ -4515,6 +4515,9 @@ find_moveable_pseudos (void)
>pseudo_replaced_reg.release ();
>pseudo_replaced_reg.safe_grow_cleared (max_regs);
>  
> +  df_analyze ();
> +  calculate_dominance_info (CDI_DOMINATORS);
> +
>i = 0;
>bitmap_initialize (&live, 0);
>bitmap_initialize (&used, 0);
> @@ -4827,6 +4830,14 @@ find_moveable_pseudos (void)
>free (bb_moveable_reg_sets);
>  
>last_moveable_pseudo = max_reg_num ();
> +
> +  fix_reg_equiv_init ();
> +  expand_reg_info ();
> +  regstat_free_n_sets_and_refs ();
> +  regstat_free_ri ();
> +  regstat_init_n_sets_and_refs ();
> +  regstat_compute_ri ();
> +  free_dominance_info (CDI_DOMINATORS);
>  }
>  
>  
> @@ -5187,7 +5198,19 @@ ira (FILE *f)
>  #endif
>df_analyze ();
>  
> +  init_reg_equiv ();
> +  if (ira_conflicts_p)
> +{
> +  calculate_dominance_info (CDI_DOMINATORS);
> +
> +  if (split_live_ranges_for_shrink_wrap ())
> + df_analyze ();
> +
> +  free_dominance_info (CDI_DOMINATORS);
> +}
> +
>df_clear_flags (DF_NO_INSN_RESCAN);
> +
>regstat_init_n_sets_and_refs ();
>regstat_compute_ri ();
>  
> @@ -5205,7 +5228,6 @@ ira (FILE *f)
>if (resize_reg_info () && flag_ira_loop_pressure)
>  ira_set_pseudo_classes (true, ira_dump_file);
>  
> -  init_reg_equiv ();
>rebuild_p = update_equiv_regs ();
>setup_reg_equiv ();
>setup_reg_equiv_init ();
> @@ -5228,22 +5250,7 @@ ira (FILE *f)
>   allocation because of -O0 usage or because the function is too
>   big.  */
>if (ira_conflicts_p)
> -{
> -  df_analyze ();
> -  calculate_dominance_info (CDI_DOMINATORS);
> -
> -  find_moveable_pseudos ();
> -  if (split_live_ranges_for_shrink_wrap ())
> - df_analyze ();
> -
> -  fix_reg_equiv_init ();
> -  expand_reg_info ();
> -  regstat_free_n_sets_and_refs ();
> -  regstat_free_ri ();
> -  regstat_init_n_sets_and_refs ();
> -  regstat_compute_ri ();
> -  free_dominance_info (CDI_DOMINATORS);
> -}
> +find_moveable_pseudos ();
>  
>max_regno_before_ira = max_reg_num ();
>ira_setup_eliminable_regset (true);
> diff --git a/gcc/testsuite/gcc.target/i386/pr59099.c b/gcc/testsuite/gcc.target/i386/pr59099.c
> new file mode 100644
> index 000..7dc12ff
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/pr59099.c
> @@ -0,0 +1,76 @@
> +/* { dg-do run } */
> +/* { dg-options "-O2 -fPIC -m32" } */
> +
> +void (*pfn)(void);
> +
> +struct s
> +{
> +  void** q;
> +  int h;
> +  int t;
> +  int s;
> +};
> +
> +
> +void* f (struct s *, struct s *) __attribute__ ((noinline, regparm(1)));
> +
> +void*
> +__attribute__ ((regparm(1)))
> +f (struct s *p, struct s *p2)
> +{
> +  void *gp, *gp1;
> +  int t, h, s, t2, h2, c, i;
> +
> +  if (p2->h == p2->t)
> +return 0;
> +
> +  (*pfn) ();
> +
> +  h = p->h;
> +  t = p->t;
> +  s = p->s;
> +
> +  h2 = p2->h;
> +  t2 = p2->t;
> +
> +  gp = p2->q[h2++];
> +
> +  c = (t2 - h2) / 2;
> +  for (i = 0; i != c; i++)
> +{
> +  if (t == h || (h == 0 && t == s - 1))
> + break;
> +  gp1 = p2->q[h2++];
> +  p->q[t++] = gp1;
> +  if (t == s)
> + t = 0;
> +}
> +
> +  p2->h = h2;
> +  return gp;
> +}
> +
> +static void gn () { }
> +
> +int
> +main()
> +{
> +  struct s s1, s2;
> +  void *q[10];
> +
> +  pfn = gn;
> +
> +  s1.q = q;
> +  s1.h = 0;
> +  s1.t = 2;
> +  s1.s = 4;
> +
> +  s2.q = q;
> +  s2.h = 0;
> +  s2.t = 4;
> +  s2.s = 2;
> +
> +  f (&s1, &s2);
> +
> +  return 0;
> +}
> 


Re: Enable -fno-fat-lto-objects by default

2013-11-19 Thread Markus Trippelsdorf
On 2013.11.19 at 11:54 +0100, Paolo Bonzini wrote:
> On 19/11/2013 11:05, Markus Trippelsdorf wrote:
> > On 2013.11.19 at 09:44 +0100, Paolo Bonzini wrote:
> >> On 18/11/2013 20:09, Jan Hubicka wrote:
> >>> this patch switches the default for fat-lto-objects as was documented 
> >>> for a while.
> >>> -ffat-lto-objects doubles compilation time and often causes users not to
> >>> notice that LTO was not used at all (because they forgot to use the
> >>> gcc-ar/gcc-nm plugins).
> >>>
> >>> Sadly I had to add -ffat-lto-objects to bootstrap. This is because I 
> >>> do not know
> >>> how to convince our build machinery to use gcc-ar/gcc-nm during the 
> >>> stage2+
> >
> > I've posted a minimal patch set for slim-lto-bootstrap last year, see:
> > http://thread.gmane.org/gmane.comp.gcc.patches/270842
> >
> > If there's interest I could repost it.
> >>> It would be really nice to have it in indeed.  I think we do not really 
> >>> need
> >>> lto-bootstrap.mk and slim-lto-bootstrap.mk, but otherwise the patch seems 
> >>> easy
> >>> enough and would save quite some of lto bootstrap testing time...
> >>
> >> Patches 1 and 2 should go upstream first.
> > 
> > OK, but where is upstream?
> > Please note that a general libtool update would fix this issue, too.
> 
> Ah, so they're already upstream.
> 
> > So, maybe it is just time to upgrade libtool everywhere in gnu-land?
> 
> Yes, that would be better but no need to do that now.

So would Patches 1 and 2 be OK in the interim?

-- 
Markus


Re: Enable -fno-fat-lto-objects by default

2013-11-19 Thread Paolo Bonzini
On 19/11/2013 12:19, Markus Trippelsdorf wrote:
>>> So, maybe it is just time to upgrade libtool everywhere in gnu-land?
>>
>> Yes, that would be better but no need to do that now.
>
> So would Patches 1 and 2 be OK in the interim?

Yes.  And Jan's answer suggests that patch 3 is not necessary at all now.

Paolo


[PATCH][ARM] Implement CRC32 intrinsics for AArch32 in ARMv8-A

2013-11-19 Thread Kyrill Tkachov

Hi all,

This patch implements the CRC32 intrinsics that map down to the optional CRC32 
instructions in ARMv8-A as defined by ACLE. They are exposed by a new header 
file: arm_acle.h which can be included in user programs similarly to the 
existing arm_neon.h header.


To enable the use of these intrinsics (and instructions) we define a new 
-march=armv8-a+crc option. We will pass the "crc" option as a .arch_extension 
directive in the generated assembly to gas.


Documentation and testsuite changes are included (a new effective target check 
and option-adding procedure in testsuite/lib). A new directory: 
gcc.target/arm/acle/ is added that contains the new tests and can be used to 
contain tests for other non-NEON ACLE intrinsics that might be implemented in 
the future.



Regtested arm-none-eabi on a model and bootstrapped arm-none-linux-gnueabihf on 
a Chromebook.


Ok for trunk?

Thanks,
Kyrill
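For readers without ARMv8-A hardware, the arithmetic behind a byte-wide CRC step can be sketched in portable C. This is a software model only, not the intrinsics from the patch: `crc_step_byte` and `crc_buffer` are hypothetical names, and it is an assumption here that a byte-wide intrinsic such as ACLE's `__crc32b` performs the same bit-reflected update without the initial and final inversion that `crc_buffer` adds.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bit-reflected polynomials: CRC-32 (IEEE 802.3) and CRC-32C (Castagnoli). */
#define POLY_CRC32  0xEDB88320u
#define POLY_CRC32C 0x82F63B78u

/* Hypothetical software model of one byte-wide CRC update. */
static uint32_t crc_step_byte(uint32_t crc, uint8_t data, uint32_t poly)
{
    crc ^= data;
    for (int bit = 0; bit < 8; bit++) {
        /* Shift right by one; XOR in the polynomial if a 1 was shifted out. */
        crc = (crc >> 1) ^ (poly & (0u - (crc & 1u)));
    }
    return crc;
}

/* Conventional CRC over a buffer: start from all ones, invert at the end. */
static uint32_t crc_buffer(const void *buf, size_t len, uint32_t poly)
{
    const uint8_t *p = buf;
    uint32_t crc = 0xFFFFFFFFu;

    assert(buf != NULL || len == 0);
    while (len--)
        crc = crc_step_byte(crc, *p++, poly);
    return crc ^ 0xFFFFFFFFu;
}
```

On a target built with the new option, the inner loop of `crc_step_byte` is what a single CRC32 instruction would presumably replace; the standard check string "123456789" is a convenient way to compare the two.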

gcc/
2013-11-19  Kyrylo Tkachov  

* Makefile.in (TEXI_GCC_FILES): Add arm-acle-intrinsics.texi.
* config.gcc (extra_headers): Add arm_acle.h.
* config/arm/arm.c (FL_CRC32): Define.
(arm_have_crc): Likewise.
(arm_option_override): Set arm_have_crc.
(arm_builtins): Add CRC32 builtins.
(bdesc_2arg): Likewise.
(arm_init_crc32_builtins): New function.
(arm_init_builtins): Initialise CRC32 builtins.
(arm_file_start): Handle architecture extensions.
* config/arm/arm.h (TARGET_CPU_CPP_BUILTINS): Define __ARM_FEATURE_CRC32.
Define __ARM_32BIT_STATE.
(TARGET_CRC32): Define.
* config/arm/arm-arches.def: Add armv8-a+crc.
* config/arm/arm-tables.opt: Regenerate.
* config/arm/arm.md (type): Add crc.
(): New insn.
* config/arm/arm_acle.h: New file.
* config/arm/iterators.md (CRC): New int iterator.
(crc_variant, crc_mode): New int attributes.
* config/arm/unspecs.md (UNSPEC_CRC32B, UNSPEC_CRC32H, UNSPEC_CRC32W,
UNSPEC_CRC32CB, UNSPEC_CRC32CH, UNSPEC_CRC32CW): New unspecs.
* doc/invoke.texi: Document -march=armv8-a+crc option.
* doc/extend.texi: Document ACLE intrinsics.
* doc/arm-acle-intrinsics.texi: New.


gcc/testsuite
2013-11-19  Kyrylo Tkachov  

* lib/target-supports.exp (add_options_for_arm_crc): New procedure.
(check_effective_target_arm_crc_ok_nocache): Likewise.
(check_effective_target_arm_crc_ok): Likewise.
* gcc.target/arm/acle/: New directory.
* gcc.target/arm/acle/acle.exp: New.
* gcc.target/arm/acle/crc32b.c: New test.
* gcc.target/arm/acle/crc32h.c: Likewise.
* gcc.target/arm/acle/crc32w.c: Likewise.
* gcc.target/arm/acle/crc32d.c: Likewise.
* gcc.target/arm/acle/crc32cb.c: Likewise.
* gcc.target/arm/acle/crc32ch.c: Likewise.
* gcc.target/arm/acle/crc32cw.c: Likewise.
* gcc.target/arm/acle/crc32cd.c: Likewise.

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index 77fba80..08f1ea1 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -2793,7 +2793,8 @@ TEXI_GCC_FILES = gcc.texi gcc-common.texi gcc-vers.texi frontends.texi	\
 	 gcov.texi trouble.texi bugreport.texi service.texi		\
 	 contribute.texi compat.texi funding.texi gnu.texi gpl_v3.texi	\
 	 fdl.texi contrib.texi cppenv.texi cppopts.texi avr-mmcu.texi	\
-	 implement-c.texi implement-cxx.texi arm-neon-intrinsics.texi
+	 implement-c.texi implement-cxx.texi arm-neon-intrinsics.texi	\
+	 arm-acle-intrinsics.texi
 
 # we explicitly use $(srcdir)/doc/tm.texi here to avoid confusion with
 # the generated tm.texi; the latter might have a more recent timestamp,
diff --git a/gcc/config.gcc b/gcc/config.gcc
index 2907018..ebbdc59 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -329,8 +329,8 @@ arc*-*-*)
 	;;
 arm*-*-*)
 	cpu_type=arm
-	extra_headers="mmintrin.h arm_neon.h"
 	extra_objs="aarch-common.o"
+	extra_headers="mmintrin.h arm_neon.h arm_acle.h"
 	target_type_format_char='%'
 	c_target_objs="arm-c.o"
 	cxx_target_objs="arm-c.o"
diff --git a/gcc/config/arm/arm-arches.def b/gcc/config/arm/arm-arches.def
index fcf3401..9b7d20c 100644
--- a/gcc/config/arm/arm-arches.def
+++ b/gcc/config/arm/arm-arches.def
@@ -54,5 +54,6 @@ ARM_ARCH("armv7-r", cortexr4,	7R,  FL_CO_PROC |	  FL_FOR_ARCH7R)
 ARM_ARCH("armv7-m", cortexm3,	7M,  FL_CO_PROC |	  FL_FOR_ARCH7M)
 ARM_ARCH("armv7e-m", cortexm4,  7EM, FL_CO_PROC |	  FL_FOR_ARCH7EM)
 ARM_ARCH("armv8-a", cortexa53,  8A,  FL_CO_PROC | FL_FOR_ARCH8A)
+ARM_ARCH("armv8-a+crc",cortexa53, 8A,FL_CO_PROC | FL_CRC32  | FL_FOR_ARCH8A)
 ARM_ARCH("iwmmxt",  iwmmxt, 5TE, FL_LDSCHED | FL_STRONG | FL_FOR_ARCH5TE | FL_XSCALE | FL_IWMMXT)
 ARM_ARCH("iwmmxt2", iwmmxt2,5TE, FL_LDSCHED | FL_STRONG | FL_FOR_ARCH5TE | FL_XSCALE | FL_IWMMXT | FL_IWMMXT2)
diff --git a/gcc/config/arm/arm-tables.opt b/gcc/config/arm/arm-tables.opt
index b3e7a7c..8851876 100644
--- a/gcc/config/arm/arm-tables.opt
+++ b/gcc/config/arm/arm-tables.opt
@@ -362,10 +362,13 @@ EnumValue
 Enum(arm_arch) String(armv8-a) Value(23)
 
 EnumValue
-Enum(arm_arch) String(iwmmxt) Value(24)
+Enum(arm_

[PATCH, testsuite]: Add dg-add-options ieee to gcc.dg/c11-complex-1.c

2013-11-19 Thread Uros Bizjak
Hello!

2013-11-19  Uros Bizjak  

* gcc.dg/c11-complex-1.c: Use dg-add-options ieee.

Tested on alphaev68-pc-linux-gnu, committed to mainline SVN.

Uros.

Index: gcc.dg/c11-complex-1.c
===
--- gcc.dg/c11-complex-1.c  (revision 205004)
+++ gcc.dg/c11-complex-1.c  (working copy)
@@ -1,6 +1,7 @@
 /* Test complex divide does not have the bug identified in N1496.  */
 /* { dg-do run } */
 /* { dg-options "-std=c11 -pedantic-errors" } */
+/* { dg-add-options ieee } */

 extern void abort (void);
 extern void exit (int);


Re: [PING] [PATCH] Optional alternative base_expr in finding basis for CAND_REFs

2013-11-19 Thread Yufeng Zhang

Hi Richard,

Can I get an approval or some feedback from you about the patch?

Regards,
Yufeng

On 11/13/13 23:25, Yufeng Zhang wrote:
> On 11/13/13 20:54, Bill Schmidt wrote:
>> Hi Yufeng,
>>
>> The second version of your original patch is ok with me with the
>> following changes.
>
> Thanks a lot for the review.  I've attached an updated patch with the
> suggested changes incorporated.
>
>> Everything else looks OK to me.  Please ask Richard for final approval,
>> as I'm not a maintainer.
>
> Hi Richard, would you be happy to OK the patch?
>
> Regards,
> Yufeng

gcc/

* gimple-ssa-strength-reduction.c: Include tree-affine.h.
(name_expansions): New static variable.
(alt_base_map): Ditto.
(get_alternative_base): New function.
(find_basis_for_candidate): For CAND_REF, optionally call
find_basis_for_base_expr with the returned value from
get_alternative_base.
(record_potential_basis): Add new parameter 'base' of type 'tree';
add an assertion of non-NULL base; use base to set node->base_expr.
(alloc_cand_and_find_basis): Update; call record_potential_basis
for CAND_REF with the returned value from get_alternative_base.
(execute_strength_reduction): Call pointer_map_create for
alt_base_map; call free_affine_expand_cache with &name_expansions.

gcc/testsuite/

* gcc.dg/tree-ssa/slsr-41.c: New test.





[0/4] Make more use of tree_to_[su]hwi

2013-11-19 Thread Richard Sandiford
Dull series, sorry, but this is another change taken from wide-int.

After checking host_integerp (now tree_fits_[su]hwi_p), the preferred way
of getting the HWI seemed to be tree_low_cst (now tree_to_[su]hwi).
This series mops up some cases where TREE_INT_CST_LOW was used directly.

Tested on x86_64-linux-gnu.

Thanks,
Richard
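The check-then-extract pattern this series standardises on can be illustrated with a toy model. The sketch below is not GCC's tree representation: `toy_cst` and the `toy_*` helpers are hypothetical, and a 64-bit host wide integer is assumed. It models a double-word constant, a predicate that says whether the value fits in one signed or unsigned word, and an accessor that asserts the same condition before returning the low word.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical two-word constant: a 128-bit two's-complement value. */
struct toy_cst {
    uint64_t low;   /* least significant word */
    int64_t  high;  /* most significant word  */
};

/* Fits in an unsigned word only if the high word is zero. */
static bool toy_fits_uhwi_p(const struct toy_cst *c)
{
    return c->high == 0;
}

/* Fits in a signed word only if the high word is the sign extension
   of the low word. */
static bool toy_fits_shwi_p(const struct toy_cst *c)
{
    bool low_negative = (c->low >> 63) != 0;
    return low_negative ? c->high == -1 : c->high == 0;
}

/* The accessors assert the matching predicate, so a caller that skipped
   the check fails loudly instead of silently truncating. */
static uint64_t toy_to_uhwi(const struct toy_cst *c)
{
    assert(toy_fits_uhwi_p(c));
    return c->low;
}

static int64_t toy_to_shwi(const struct toy_cst *c)
{
    assert(toy_fits_shwi_p(c));
    if (c->low <= (uint64_t)INT64_MAX)
        return (int64_t)c->low;
    /* Convert a negative value without relying on an
       implementation-defined unsigned-to-signed conversion. */
    return -(int64_t)(~c->low) - 1;
}
```

Reading the low word directly, as the old code did, skips the assertion; the accessor form keeps the predicate and the extraction tied together.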


[1/4] Use tree_to_uhwi with an inlined tree_fits_uhwi_p test

2013-11-19 Thread Richard Sandiford
check_function_arguments_recurse has an assert that is equivalent
to tree_fits_uhwi_p.  The extraction can then use tree_to_uhwi.

Asserting here makes the intent obvious, but tree_to_uhwi also asserts
for the same thing, so an alternative would be to use tree_to_uhwi on
its own.

Thanks,
Richard


gcc/c-family/
2013-11-19  Kenneth Zadeck  

* c-common.c (check_function_arguments_recurse): Use tree_fits_uhwi_p
and tree_to_uhwi.

Index: gcc/c-family/c-common.c
===
--- gcc/c-family/c-common.c 2013-11-19 10:53:54.965643984 +
+++ gcc/c-family/c-common.c 2013-11-19 11:08:41.797920627 +
@@ -9209,10 +9209,9 @@ check_function_arguments_recurse (void (
   to be valid.  */
format_num_expr = TREE_VALUE (TREE_VALUE (attrs));
 
-   gcc_assert (TREE_CODE (format_num_expr) == INTEGER_CST
-   && !TREE_INT_CST_HIGH (format_num_expr));
+   gcc_assert (tree_fits_uhwi_p (format_num_expr));
 
-   format_num = TREE_INT_CST_LOW (format_num_expr);
+   format_num = tree_to_uhwi (format_num_expr);
 
for (inner_arg = first_call_expr_arg (param, &iter), i = 1;
 inner_arg != 0;


Re: [PATCH] RTEMS: Add LEON3/SPARC multilibs

2013-11-19 Thread Sebastian Huber

Hello Eric,

On 2013-09-19 09:23, Eric Botcazou wrote:

>> I don't expect that this will be back ported to GCC 4.8.  You also need
>> Binutils 2.24 for this.
>
> From a SPARC maintainership viewpoint, I'd think that this is backportable for
> the upcoming 4.8.2 release, and the patches are essentially SPARC-specific,
> but perhaps the RMs are of a different opinion here.


I back ported your list of changes from mainline to GCC 4.8.  See the attached 
patches.  In addition to your proposed changes I had to add


2013-04-10  Steven Bosscher  

   * config/sparc/sparc.c: Include tree-pass.h.
   (TARGET_MACHINE_DEPENDENT_REORG): Do not redefine.
   (sparc_reorg): Rename to sparc_do_work_around_errata.  Move to
   head of file.  Change return type.  Split off gate function.
   (sparc_gate_work_around_errata): New function.
   (pass_work_around_errata): New pass definition.
   (insert_pass_work_around_errata): New pass insert definition to
   insert pass_work_around_errata just after delayed-branch scheduling.
   (sparc_option_override): Insert the pass.
   * config/sparc/t-sparc (sparc.o): Add TREE_PASS_H dependence.

This was necessary for

2013-07-22  Eric Botcazou  

* config.gcc (sparc*-*-*): Accept leon3 processor.
(sparc-leon*-*): Merge with sparc*-*-* and add leon3 support.
* doc/invoke.texi (SPARC Options): Adjust -mfix-ut699 entry.
* config/sparc/sparc-opts.h (enum processor_type): Add PROCESSOR_LEON3.
* config/sparc/sparc.opt (enum processor_type): Add leon3.
(mfix-ut699): Adjust comment.
* config/sparc/sparc.h (TARGET_CPU_leon3): New define.
(CPP_CPU32_DEFAULT_SPEC): Add leon3 support.
(CPP_CPU_SPEC): Likewise.
(ASM_CPU_SPEC): Likewise.
* config/sparc/sparc.c (leon3_cost): New constant.
(sparc_option_override): Add leon3 support.
(mem_ref): New function.
(sparc_gate_work_around_errata): Return true if -mfix-ut699 is enabled.
(sparc_do_work_around_errata): Look into the instruction in the delay
slot and adjust accordingly.  Add fix for the data cache nullify issues
of the UT699.  Change insertion position for the NOP.
* config/sparc/leon.md (leon_fpalu, leon_fpmds, write_buf): Delete.
(leon3_load): New reservation.
(leon_store): Bump latency to 2.
(grfpu): New automaton.
(grfpu_alu): New unit.
(grfpu_ds): Likewise.
(leon_fp_alu): Adjust.
(leon_fp_mult): Delete.
(leon_fp_div): Split into leon_fp_divs and leon_fp_divd.
(leon_fp_sqrt): Split into leon_fp_sqrts and leon_fp_sqrtd.
* config/sparc/sparc.md (cpu): Add leon3.
* config/sparc/sync.md (atomic_exchangesi): Disable if -mfix-ut699.
(swapsi): Likewise.
(atomic_test_and_set): Likewise.
(ldstub): Likewise.

I cannot judge if this was good or bad.  I can only perform mechanical changes 
since I don't know how the compiler works.


I ran the GCC test suite on the GDB SIS with RTEMS, but it's hard for me to 
interpret the results.  I think there are no new test failures due to the back 
ports.


--
Sebastian Huber, embedded brains GmbH

Address : Dornierstr. 4, D-82178 Puchheim, Germany
Phone   : +49 89 189 47 41-16
Fax : +49 89 189 47 41-09
E-Mail  : sebastian.hu...@embedded-brains.de
PGP : Public key available on request.

This message is not a business communication within the meaning of the EHUG.

From 1d85b001eb4bdf2076638f17353d6074c3e3f81c Mon Sep 17 00:00:00 2001
From: Sebastian Huber 
Date: Tue, 28 May 2013 07:26:35 +
Subject: [PATCH 1/6] SPARC, GCC 4.8: -mfix-ut699 changes

gcc/ChangeLog
2013-11-18  Sebastian Huber  

	Backport from mainline
	2013-05-28  Eric Botcazou  

	* doc/invoke.texi (SPARC Options): Document -mfix-ut699.
	* builtins.c (expand_builtin_mathfn) : Try to widen the
	mode if the instruction isn't available in the original mode.
	* config/sparc/sparc.opt (mfix-ut699): New option.
	* config/sparc/sparc.md (muldf3_extend): Disable if -mfix-ut699.
	(divdf3): Turn into expander.
	(divdf3_nofix): New insn.
	(divdf3_fix): Likewise.
	(divsf3): Disable if -mfix-ut699.
	(sqrtdf2): Turn into expander.
	(sqrtdf2_nofix): New insn.
	(sqrtdf2_fix): Likewise.
	(sqrtsf2): Disable if -mfix-ut699.
---
 gcc/builtins.c |8 ++--
 gcc/config/sparc/sparc.md  |   42 +-
 gcc/config/sparc/sparc.opt |4 
 gcc/doc/invoke.texi|7 ++-
 4 files changed, 53 insertions(+), 8 deletions(-)

diff --git a/gcc/builtins.c b/gcc/builtins.c
index e3c32a9..c977df0 100644
--- a/gcc/builtins.c
+++ b/gcc/builtins.c
@@ -1958,6 +1958,7 @@ expand_builtin_mathfn (tree exp, rtx target, rtx subtarget)
   tree fndecl = get_callee_fndecl (exp);
   enum machine_mode mode;
   bool errno_set = false;
+  bool try_widening = false;
   tree arg;
 
   if (!validate_arglist (exp, REAL_TYPE, VOID_TYPE))
@@ -1969,6 +1970,7 @@ e

[2/4] Use tree_to_uhwi when folding (x >> c) << c

2013-11-19 Thread Richard Sandiford
The (x >> c) << c folding has:

  && tree_fits_shwi_p (arg1)
  && TREE_INT_CST_LOW (arg1) < prec
  && tree_fits_shwi_p (TREE_OPERAND (arg0, 1))
  && TREE_INT_CST_LOW (TREE_OPERAND (arg0, 1)) < prec)

At first glance the use of tree_fits_shwi_p rather than tree_fits_uhwi_p
made me think this allows negative shift counts, but of course TREE_INT_CST_LOW
is unsigned.  I think it'd be clearer to use tree_fits_uhwi_p instead.

Thanks,
Richard


gcc/
* fold-const.c (fold_binary_loc): Use unsigned rather than signed
HOST_WIDE_INTs when folding (x >> c) << c.

Index: gcc/fold-const.c
===
--- gcc/fold-const.c2013-11-19 10:53:54.965643984 +
+++ gcc/fold-const.c2013-11-19 11:59:33.611252297 +
@@ -12676,13 +12676,13 @@ fold_binary_loc (location_t loc,
   if (((code == LSHIFT_EXPR && TREE_CODE (arg0) == RSHIFT_EXPR)
|| (TYPE_UNSIGNED (type)
   && code == RSHIFT_EXPR && TREE_CODE (arg0) == LSHIFT_EXPR))
- && tree_fits_shwi_p (arg1)
- && TREE_INT_CST_LOW (arg1) < prec
- && tree_fits_shwi_p (TREE_OPERAND (arg0, 1))
- && TREE_INT_CST_LOW (TREE_OPERAND (arg0, 1)) < prec)
+ && tree_fits_uhwi_p (arg1)
+ && tree_to_uhwi (arg1) < prec
+ && tree_fits_uhwi_p (TREE_OPERAND (arg0, 1))
+ && tree_to_uhwi (TREE_OPERAND (arg0, 1)) < prec)
{
- HOST_WIDE_INT low0 = TREE_INT_CST_LOW (TREE_OPERAND (arg0, 1));
- HOST_WIDE_INT low1 = TREE_INT_CST_LOW (arg1);
+ HOST_WIDE_INT low0 = tree_to_uhwi (TREE_OPERAND (arg0, 1));
+ HOST_WIDE_INT low1 = tree_to_uhwi (arg1);
  tree lshift;
  tree arg00;
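The point about negative shift counts can be shown in isolation. This is a minimal sketch with a hypothetical helper, `shift_count_ok`, assuming a 64-bit host wide integer: once the count is read as an unsigned word, a negative value becomes a very large one, so the `< prec` comparison rejects it without any separate sign test.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: is COUNT a usable shift amount for a PREC-bit type?
   The count is deliberately compared as an unsigned value. */
static bool shift_count_ok(int64_t count, unsigned prec)
{
    assert(prec <= 64);

    /* Conversion to uint64_t is well defined: -1 becomes UINT64_MAX. */
    uint64_t low = (uint64_t)count;
    return low < prec;
}
```

Under this model a count of -1 is rejected for a 32-bit type just as a count of 32 is, which is why the unsigned predicate states the intent more directly than the signed one.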
 


Re: [PATCH, MPX, 2/X] Pointers Checker [8/25] Languages support

2013-11-19 Thread Richard Biener
On Mon, Nov 18, 2013 at 5:45 PM, Jeff Law  wrote:
> On 11/08/13 02:02, Ilya Enkovich wrote:
>>
>> Hi,
>>
>> Here is an updated patch version with no langhook.
>>
>> Regarding TLS objects issue - I do not think compiler should compensate
>> the absence of instrumentation in libraries.  Compiler should be responsible
>> for initialization of Bounds Tables for .tdata section.  Correct data copy
>> is a responsibility of library.  User should use either instrumented library
>> or wrapper calls if he needs this functionality.
>>
>> Thanks,
>> Ilya
>> --
>> gcc/
>>
>> 2013-11-06  Ilya Enkovich  
>>
>> * c/c-parser.c: Include tree-chkp.h.
>> (c_parser_declaration_or_fndef): Register statically
>> initialized decls in Pointer Bounds Checker.
>> * cp/decl.c: Include tree-chkp.h.
>> (cp_finish_decl): Register statically
>> initialized decls in Pointer Bounds Checker.
>> * gimplify.c: Include tree-chkp.h.
>> (gimplify_init_constructor): Register statically
>> initialized decls in Pointer Bounds Checker.
>
> Is parsing really the right time to register these things with the checking
> framework?  Doesn't all this stuff flow through the gimplifier?  If so
> wouldn't that be a better place?
>
> If it can be done in the gimplifier, that seems good from the standpoint of
> simplifying the long term maintenance of the checking code.
>
> If there's a good reason to have this front-end, please explain it.

I'd say not in the gimplifier either but in varpool (symbol table) code
where the symbols are ultimately registered?

Richard.

> Thanks,
> Jeff
>


Re: [PATCH, MPX, 2/X] Pointers Checker [14/25] Function splitting

2013-11-19 Thread Richard Biener
On Mon, Nov 18, 2013 at 8:12 PM, Ilya Enkovich  wrote:
> 2013/11/18 Jeff Law :
>> On 11/18/13 11:27, Ilya Enkovich wrote:
>>>
>>>
>>> How does pointer passed to regular function differ from pointer passed
>>> to split function? How do I know then which pointer is to be passed
>>> with bounds and which one is not? Moreover current ABI does not allow
>>> to pass bounds with no pointer or pass bounds for some pointers in the
>>> call only.
>>
>> But I don't see any case in function splitting where we're going to want to
>> pass the pointer without the bounds.  If you want the former, you're going
>> to want the latter.
>
> There are at least cases when checks are eliminated or when lots of
> pointer usages are accompanied with few checks performed earlier (e.g.
> we are working with array). In such cases the split part may easily get
> no bounds.
>
>>
>> I really don't see why you need to do anything special here.  At the most an
>> assert in the splitting code to ensure that you don't have a situation where
>> there's mixed pointers with bounds and pointers without bounds should be all
>> you need or that you passed a bounds with no associated pointer :-)
>
> It would also require generation of proper bind_bounds calls in the
> original function and arg_bounds calls in a separated part. So,
> special support is required.

Well, only to keep proper instrumentation.  I hope code still works
(doesn't trap) when optimizations "wreck" the bounds?  Thus all
these patches are improving bounds propagation but are not required
for correctness?  If so please postpone all of them until after the
initial support is merged.  If not, please make sure BND instrumentation
works conservatively when optimizations wreck it.

Richard.

>>
>>
>> Jeff


[3/4] Avoid undefined operation in overflow check

2013-11-19 Thread Richard Sandiford
This is a case where tree_to_shwi can be used instead of TREE_INT_CST_LOW.
I separated it out because it was using a signed "x * y / y == x" to check
whether "x * y" overflows a HWI, which relies on undefined behaviour.

Thanks,
Richard


gcc/
* tree-ssa-alias.c (ao_ref_init_from_ptr_and_size): Avoid signed
overflow.  Use tree_to_shwi.

Index: gcc/tree-ssa-alias.c
===
--- gcc/tree-ssa-alias.c2013-11-19 10:53:54.965643984 +
+++ gcc/tree-ssa-alias.c2013-11-19 11:08:51.882992035 +
@@ -615,9 +615,8 @@ ao_ref_init_from_ptr_and_size (ao_ref *r
   ref->offset += extra_offset;
   if (size
   && tree_fits_shwi_p (size)
-  && TREE_INT_CST_LOW (size) * BITS_PER_UNIT / BITS_PER_UNIT
-== TREE_INT_CST_LOW (size))
-ref->max_size = ref->size = TREE_INT_CST_LOW (size) * BITS_PER_UNIT;
+  && tree_to_shwi (size) <= HOST_WIDE_INT_MAX / BITS_PER_UNIT)
+ref->max_size = ref->size = tree_to_shwi (size) * BITS_PER_UNIT;
   else
 ref->max_size = ref->size = -1;
   ref->ref_alias_set = 0;
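The difference between the two overflow checks can be sketched outside GCC. The helper below is hypothetical (`toy_size_in_bits`, with `TOY_BITS_PER_UNIT` standing in for `BITS_PER_UNIT`) and assumes a 64-bit signed word. Instead of multiplying first and dividing back, which already overflows in the case being tested for, it compares against the largest value that can safely be multiplied. The extra rejection of negative sizes is an addition made for this sketch, not something the patch does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_BITS_PER_UNIT 8

/* Convert a size in bytes to a size in bits.  Returns true and stores the
   result in *BITS if it is representable, false otherwise.  No signed
   multiplication is performed unless it is known not to overflow. */
static bool toy_size_in_bits(int64_t size_in_bytes, int64_t *bits)
{
    assert(bits);

    /* Negative sizes are rejected here as an assumption of this sketch. */
    if (size_in_bytes < 0)
        return false;

    /* Divide the limit rather than multiply the operand. */
    if (size_in_bytes > INT64_MAX / TOY_BITS_PER_UNIT)
        return false;

    *bits = size_in_bytes * TOY_BITS_PER_UNIT;
    return true;
}
```

The division happens on a constant limit, so the check itself can never overflow, whatever value the caller passes in.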


[4/4] The rest of the tree_to_[su]hwi changes

2013-11-19 Thread Richard Sandiford
This patch just changes TREE_INT_CST_LOW to tree_to_[su]hwi in cases
where there is already a protecting tree_fits_[su]hwi_p.  I've upped
the number of context lines in case that helps, but there are still
some hunks where the tree_fits_* call is too high up.

Thanks,
Richard


gcc/ada/
2013-11-19  Kenneth Zadeck  
Mike Stump  
Richard Sandiford  

* gcc-interface/cuintp.c (UI_From_gnu): Use tree_to_shwi.
* gcc-interface/decl.c (gnat_to_gnu_entity): Use tree_to_uhwi.
* gcc-interface/utils.c (make_packable_type): Likewise.

gcc/c-family/
2013-11-19  Kenneth Zadeck  
Mike Stump  
Richard Sandiford  

* c-ada-spec.c (is_simple_enum): Use tree_to_shwi and tree_to_uhwi
instead of TREE_INT_CST_LOW, in cases where there is a protecting
tree_fits_shwi_p or tree_fits_uhwi_p.
(dump_generic_ada_node): Likewise.
* c-format.c (check_format_arg): Likewise.
* c-pretty-print.c (pp_c_integer_constant): Likewise.

gcc/
2013-11-19  Kenneth Zadeck  
Mike Stump  
Richard Sandiford  

* alias.c (ao_ref_from_mem): Use tree_to_shwi and tree_to_uhwi
instead of TREE_INT_CST_LOW, in cases where there is a protecting
tree_fits_shwi_p or tree_fits_uhwi_p.
* builtins.c (fold_builtin_powi): Likewise.
* config/epiphany/epiphany.c (epiphany_special_round_type_align):
Likewise.
* dbxout.c (dbxout_symbol): Likewise.
* expr.c (expand_expr_real_1): Likewise.
* fold-const.c (fold_single_bit_test, fold_plusminus_mult_expr)
(fold_binary_loc): Likewise.
* gimple-fold.c (fold_const_aggregate_ref_1): Likewise.
* gimple-ssa-strength-reduction.c (stmt_cost): Likewise.
* omp-low.c (lower_omp_for_lastprivate): Likewise.
* simplify-rtx.c (delegitimize_mem_from_attrs): Likewise.
* stor-layout.c (compute_record_mode): Likewise.
* tree-cfg.c (verify_expr): Likewise.
* tree-dfa.c (get_ref_base_and_extent): Likewise.
* tree-pretty-print.c (dump_array_domain): Likewise.
* tree-sra.c (build_user_friendly_ref_for_offset): Likewise.
* tree-ssa-ccp.c (fold_builtin_alloca_with_align): Likewise.
* tree-ssa-loop-ivopts.c (get_loop_invariant_expr_id): Likewise.
* tree-ssa-math-opts.c (execute_cse_sincos): Likewise.
* tree-ssa-phiopt.c (hoist_adjacent_loads): Likewise.
* tree-ssa-reassoc.c (acceptable_pow_call): Likewise.
* tree-ssa-sccvn.c (copy_reference_ops_from_ref): Likewise.
(ao_ref_init_from_vn_reference, vn_reference_fold_indirect): Likewise.
(vn_reference_lookup_3, simplify_binary_expression): Likewise.
* tree-ssa-structalias.c (bitpos_of_field): Likewise.
(get_constraint_for_1, push_fields_onto_fieldstack): Likewise.
(create_variable_info_for_1): Likewise.
* tree-vect-data-refs.c (vect_compute_data_ref_alignment): Likewise.
(vect_verify_datarefs_alignment): Likewise.
(vect_analyze_data_ref_accesses): Likewise.
(vect_prune_runtime_alias_test_list): Likewise.
* tree-vectorizer.h (NITERS_KNOWN_P): Likewise.

Index: gcc/ada/gcc-interface/cuintp.c
===
--- gcc/ada/gcc-interface/cuintp.c  2013-11-19 11:59:43.285326264 +
+++ gcc/ada/gcc-interface/cuintp.c  2013-11-19 12:09:07.933676448 +
@@ -150,28 +150,28 @@ UI_From_gnu (tree Input)
   Int_Vector vec;
 
 #if HOST_BITS_PER_WIDE_INT == 64
   /* On 64-bit hosts, tree_fits_shwi_p tells whether the input fits in a
  signed 64-bit integer.  Then a truncation tells whether it fits
  in a signed 32-bit integer.  */
   if (tree_fits_shwi_p (Input))
 {
-  HOST_WIDE_INT hw_input = TREE_INT_CST_LOW (Input);
+  HOST_WIDE_INT hw_input = tree_to_shwi (Input);
   if (hw_input == (int) hw_input)
return UI_From_Int (hw_input);
 }
   else
 return No_Uint;
 #else
   /* On 32-bit hosts, tree_fits_shwi_p tells whether the input fits in a
  signed 32-bit integer.  Then a sign test tells whether it fits
  in a signed 64-bit integer.  */
   if (tree_fits_shwi_p (Input))
-return UI_From_Int (TREE_INT_CST_LOW (Input));
+return UI_From_Int (tree_to_shwi (Input));
   else if (TREE_INT_CST_HIGH (Input) < 0 && TYPE_UNSIGNED (gnu_type))
 return No_Uint;
 #endif
 
   gnu_base = build_int_cst (gnu_type, UI_Base);
   gnu_temp = Input;
 
   for (i = Max_For_Dint - 1; i >= 0; i--)
Index: gcc/ada/gcc-interface/decl.c
===
--- gcc/ada/gcc-interface/decl.c2013-11-19 11:59:43.285326264 +
+++ gcc/ada/gcc-interface/decl.c2013-11-19 12:09:07.934676456 +
@@ -4918,17 +4918,17 @@ gnat_to_gnu_entity (Entity_Id gnat_entit
   && !TYPE_FAT_POINTER_P (gnu_type))
size = rm_size (gnu_type);
   

Re: [PATCH, MPX, 2/X] Pointers Checker [8/25] Languages support

2013-11-19 Thread Ilya Enkovich
On 19 Nov 13:00, Richard Biener wrote:
> On Mon, Nov 18, 2013 at 5:45 PM, Jeff Law  wrote:
> > On 11/08/13 02:02, Ilya Enkovich wrote:
> >>
> >> Hi,
> >>
> >> Here is an updated patch version with no langhook.
> >>
> >> Regarding the TLS objects issue - I do not think the compiler should
> >> compensate for the absence of instrumentation in libraries.  The compiler
> >> should be responsible for initializing Bounds Tables for the .tdata
> >> section.  Copying the data correctly is the library's responsibility.
> >> Users should use either an instrumented library or wrapper calls if they
> >> need this functionality.
> >>
> >> Thanks,
> >> Ilya
> >> --
> >> gcc/
> >>
> >> 2013-11-06  Ilya Enkovich  
> >>
> >> * c/c-parser.c: Include tree-chkp.h.
> >> (c_parser_declaration_or_fndef): Register statically
> >> initialized decls in Pointer Bounds Checker.
> >> * cp/decl.c: Include tree-chkp.h.
> >> (cp_finish_decl): Register statically
> >> initialized decls in Pointer Bounds Checker.
> >> * gimplify.c: Include tree-chkp.h.
> >> (gimplify_init_constructor): Register statically
> >> initialized decls in Pointer Bounds Checker.
> >
> > Is parsing really the right time to register these things with the checking
> > framework?  Doesn't all this stuff flow through the gimplifier?  If so
> > wouldn't that be a better place?
> >
> > If it can be done in the gimplifier, that seems good from the standpoint of
> > simplifying the long term maintenance of the checking code.
> >
> > If there's a good reason to have this in the front end, please explain it.
> 
> I'd say not in the gimplifier either but in varpool (symbol table) code
> where the symbols are ultimately registered?

Something like that?

--- a/gcc/varpool.c
+++ b/gcc/varpool.c
@@ -151,6 +151,10 @@ varpool_node_for_decl (tree decl)
   node = varpool_create_empty_node ();
   node->decl = decl;
   symtab_register_node (node);
+
+  if (DECL_INITIAL (decl))
+chkp_register_var_initializer (decl);
+
   return node;
 }

Thanks,
Ilya

> 
> Richard.
> 
> > Thanks,
> > Jeff
> >


Re: [PATCH] add auto_vec

2013-11-19 Thread Richard Biener
On Mon, Nov 18, 2013 at 10:08 PM, Trevor Saunders  wrote:
> On Mon, Nov 18, 2013 at 10:03:53PM +0100, Marc Glisse wrote:
>> On Mon, 18 Nov 2013, Trevor Saunders wrote:
>>
>> >This patch adds a class auto_vec which releases its internal
>> >storage in its destructor, but unlike stack_vec it has no built in
>> >storage so it's reasonable to use it in objects on the heap.  It
>> >then replaces a bunch of vectors on the stack with stack_vec if
>> >the initial creation size was a compile time constant or auto_vec
>> >otherwise.
>>
>> Why not use stack_vec? You could partially specialize it if there
>> is waste, and you could make the 0 implicit.
>
> I'd like to see it get used for stuff on the heap, but that depends on
> or at least only makes sense once other stuff uses destructors.

Using stack_vec would be odd, but yes, making the 0 implicit
and adding a constructor with element count would make sense.

Of course then we'd have auto_bitmap but stack_vec, that's a bit
inconsistent.

So in the end I think the patch is ok as-is.  Please wait for further
comments though.
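
To make the distinction concrete, here is a minimal sketch of a vector that
has no built-in storage and frees its heap buffer in its destructor.  It is
not the vec.h implementation: the class name sketch_auto_vec and its members
are hypothetical, and the sketch assumes trivially copyable element types so
that realloc is safe.

```cpp
#include <cstddef>
#include <cstdlib>

// Hypothetical illustration only; GCC's real auto_vec lives in vec.h and
// differs from this.  Assumes T is trivially copyable.
template <typename T>
class sketch_auto_vec
{
public:
  // No inline storage: an empty vector is just three words.
  sketch_auto_vec () : data_ (nullptr), len_ (0), cap_ (0) {}

  // The destructor releases the heap buffer, so callers need no explicit
  // release () call on any exit path.
  ~sketch_auto_vec () { std::free (data_); }

  // Copying is disabled to keep ownership of the buffer unambiguous.
  sketch_auto_vec (const sketch_auto_vec &) = delete;
  sketch_auto_vec &operator= (const sketch_auto_vec &) = delete;

  // Append V, growing the buffer geometrically when it is full.
  void safe_push (const T &v)
  {
    if (len_ == cap_)
      {
        std::size_t new_cap = cap_ ? cap_ * 2 : 4;
        T *p = static_cast<T *> (std::realloc (data_, new_cap * sizeof (T)));
        if (!p)
          std::abort ();
        data_ = p;
        cap_ = new_cap;
      }
    data_[len_++] = v;
  }

  std::size_t length () const { return len_; }
  const T &operator[] (std::size_t i) const { return data_[i]; }

private:
  T *data_;
  std::size_t len_;
  std::size_t cap_;
};
```

Because the object itself holds only a pointer and two counters, embedding it
in a heap-allocated structure costs little, which is the property the patch
description contrasts with stack_vec.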

Thanks,
Richard.

> Trev
>
>>
>> --
>> Marc Glisse


Re: [PATCH, MPX, 2/X] Pointers Checker [14/25] Function splitting

2013-11-19 Thread Ilya Enkovich
2013/11/19 Richard Biener :
> On Mon, Nov 18, 2013 at 8:12 PM, Ilya Enkovich  wrote:
>> 2013/11/18 Jeff Law :
>>> On 11/18/13 11:27, Ilya Enkovich wrote:


 How does a pointer passed to a regular function differ from a pointer passed
 to a split function? How do I know which pointer is to be passed
 with bounds and which one is not? Moreover, the current ABI does not allow
 passing bounds with no pointer, or passing bounds for only some pointers in
 the call.
>>>
>>> But I don't see any case in function splitting where we're going to want to
>>> pass the pointer without the bounds.  If you want the former, you're going
>>> to want the latter.
>>
>> There are at least cases where checks are eliminated, or where many
>> pointer uses are covered by a few checks performed earlier (e.g.
>> when we are working with an array). In such cases the split part may easily
>> get no bounds.
>>
>>>
>>> I really don't see why you need to do anything special here.  At the most an
>>> assert in the splitting code to ensure that you don't have a situation where
>>> there's mixed pointers with bounds and pointers without bounds should be all
>>> you need or that you passed a bounds with no associated pointer :-)
>>
>> It would also require generation of proper bind_bounds calls in the
>> original function and arg_bounds calls in a separated part. So,
>> special support is required.
>
> Well, only to keep proper instrumentation.  I hope code still works
> (doesn't trap) when optimizations "wreck" the bounds?  Thus all
> these patches are improving bounds propagation but are not required
> for correctness?  If so please postpone all of them until after the
> initial support is merged.  If not, please make sure BND instrumentation
> works conservatively when optimizations wreck it.

All patches I sent for optimization passes are required to avoid ICEs
when compiling instrumented code.

Ilya

>
> Richard.
>
>>>
>>>
>>> Jeff


Re: [PATCH, MPX, 2/X] Pointers Checker [15/25] IPA Propagation

2013-11-19 Thread Martin Jambor
Hi,

On Mon, Nov 18, 2013 at 10:38:49PM +0400, Ilya Enkovich wrote:
> 2013/11/18 Martin Jambor :
> > On Mon, Nov 18, 2013 at 02:28:58PM +0400, Ilya Enkovich wrote:
> >> Hi,
> >>
> >> Here is a patch to disable propagation of bounded values.
> >>
> >
> > Why do you need to do this?  If the problem is that IPA-CP can remove
> > parameter it knows is a constant, which somehow confuses how you pass
> > bounds, then it is much better to clear
> > node->local.can_change_signature flag for such nodes and it will not
> > happen while still propagating stuff.
> >
> > Or is there some other reason?
> 
> Thanks for pointing to this flag. I'll look into it.
> 
> There is another problem in propagation - a value should never be
> propagated into BUILT_IN_CHKP_ARG_BND calls.

If a particular function should be excluded from propagation then
initialize_node_lattices should set all its lattices to bottom and
that is enough, no need to complicate creation of individual jump
functions.

> Ideally bounds of
> propagated value should also be analyzed and propagated (in the most
> cases bounds for constant value are also constant).  I suppose
> can_change_signature flag can be used for lightweight support, when
> param is propagated but bounds are not (it would still require special
> handling of BUILT_IN_CHKP_ARG_BND calls).

If you want to remove bounds when also removing a parameter, adding
those to parms_to_skip in create_specialized_node should do the trick.

> 
> This patch is to avoid ICEs before some scheme is implemented.

Well, while I understand that one has things like this on a
development branch, I would be unhappy to see them merged into trunk
without a better explanation of what the problem is.

Thanks,

Martin


> 
> Thanks,
> Ilya
> >
> > Thanks,
> >
> > Martin
> >
> >
> >> Thanks,
> >> Ilya
> >> --
> >> 2013-11-13  Ilya Enkovich  
> >>
> >>   * ipa-prop.c: Include tree-chkp.h.
> >>   (ipa_compute_jump_functions_for_edge): Do not propagate bounded args.
> >>
> >>
> >> diff --git a/gcc/ipa-prop.c b/gcc/ipa-prop.c
> >> index eb464e4..81e1237 100644
> >> --- a/gcc/ipa-prop.c
> >> +++ b/gcc/ipa-prop.c
> >> @@ -48,6 +48,7 @@ along with GCC; see the file COPYING3.  If not see
> >>  #include "tree-streamer.h"
> >>  #include "params.h"
> >>  #include "ipa-utils.h"
> >> +#include "tree-chkp.h"
> >>
> >>  /* Intermediate information about a parameter that is only useful during the
> >> run of ipa_analyze_node and is not kept afterwards.  */
> >> @@ -1558,6 +1559,7 @@ ipa_compute_jump_functions_for_edge (struct param_analysis_info *parms_ainfo,
> >>struct ipa_node_params *info = IPA_NODE_REF (cs->caller);
> >>struct ipa_edge_args *args = IPA_EDGE_REF (cs);
> >>gimple call = cs->call_stmt;
> >> +  tree fndecl = gimple_call_fndecl (call);
> >>int n, arg_num = gimple_call_num_args (call);
> >>
> >>if (arg_num == 0 || args->jump_functions)
> >> @@ -1575,7 +1577,13 @@ ipa_compute_jump_functions_for_edge (struct param_analysis_info *parms_ainfo,
> >>tree arg = gimple_call_arg (call, n);
> >>tree param_type = ipa_get_callee_param_type (cs, n);
> >>
> >> -  if (is_gimple_ip_invariant (arg))
> >> +  /* No optimization for bounded types yet implemented.  */
> >> +  if ((gimple_call_with_bounds_p (call)
> >> +|| (fndecl && chkp_function_instrumented_p (fndecl)))
> >> +   && ((param_type && chkp_type_has_pointer (param_type))
> >> +   || (!param_type && chkp_type_has_pointer (TREE_TYPE (arg)))))
> >> + continue;
> >> +  else if (is_gimple_ip_invariant (arg))
> >>   ipa_set_jf_constant (jfunc, arg, cs);
> >>else if (!is_gimple_reg_type (TREE_TYPE (arg))
> >>  && TREE_CODE (arg) == PARM_DECL)


Re: [PATCH] add auto_vec

2013-11-19 Thread Jakub Jelinek
On Mon, Nov 18, 2013 at 02:46:19PM -0500, Trevor Saunders wrote:
> 2013-11-18  Trevor Saunders  
> 
> gcc/
> * vec.h (auto_vec): New class.
>   * cfganal.c cfgloop.c cgraphunit.c config/i386/i386.c dwarf2out.c
>   function.c genautomata.c gimple.c haifa-sched.c ipa-inline.c
>   ira-build.c loop-unroll.c omp-low.c ree.c trans-mem.c tree-call-cdce.c
>   tree-eh.c tree-if-conv.c tree-into-ssa.c tree-loop-distribution.c
>   tree-predcom.c tree-sra.c
>   tree-ssa-forwprop.c tree-ssa-loop-manip.c tree-ssa-pre.c
>   tree-ssa-reassoc.c tree-ssa-sccvn.c tree-ssa-structalias.c
>   tree-vect-loop.c tree-vect-stmts.c Use auto_vec and stack-vec as
>   appropriate instead of vec for local variables.

The ChangeLog is incorrectly formatted.  The filenames shouldn't
be space separated, but comma space separated, and there should be
as always : after the last filename, before the descriptions.  Also,
stack-vec should have been stack_vec, right?
> 
>   cp/
>   * parser.c semantics.c Change some local variables from vec to
>   auto_vec or stack-vec.

Ditto.

Jakub


Re: Change warnings for unsupported alignment to errors

2013-11-19 Thread Richard Biener
On Tue, Nov 19, 2013 at 2:45 AM, Joseph S. Myers
 wrote:
> When implementing C11 _Alignas in
> , I noted
> that I had omitted checks that alignment was supported (which, as
> constraints in C11, should be errors or pedwarns rather than just
> plain warnings).
>
> The issues with the C11 definition of alignment remain unresolved,
> although there's now N1731 for them (assigned DR numbers 444 and 445).
> However, it seems clearly right to me, and in accord with the intent
> of C11, that declaring an object with an unsupported alignment should
> be an error; in that case, any code generated must be presumed to be
> wrong code as it can't respect the alignment required by the user's
> source code.
>
> Contrary to what I said when implementing _Alignas, arbitrary stack
> alignments are in fact supported on all architectures (PR 33721, fixed
> a year before that patch).  Alignments too big for host unsigned int,
> when measured in bits, are rejected in
> c-common.c:check_user_alignment.  (That's less than ideal - ELF can
> represent alignments of up to 2^31 or 2^63 bytes, not bits, depending
> on whether it's 32-bit or 64-bit ELF.  But no doubt converting
> alignments to be represented in bytes would be a lot of work.)
>
> The alignment checks in varasm.c, however, give warnings rather than
> errors.  This patch changes them to errors.
>
> There's no testcase, as at least for the align_variable case it can't
> actually generate an error on ELF systems, where MAX_OFILE_ALIGNMENT
> is the largest alignment GCC can represent.  (I'm not sure under what
> circumstances assemble_noswitch_variable might generate its
> diagnostic.)
>
> (I do not make an assertion about whether the existing checks for
> supported alignment cover all cases where a requested alignment may
> not be supported; any failure to generate the requested alignment,
> with no diagnostic given, is an ordinary wrong-code bug.)
>
> Bootstrapped with no regressions on x86_64-unknown-linux-gnu.  OK to
> commit?
>
> 2013-11-19  Joseph Myers  
>
> * varasm.c (align_variable): Give error instead of warning for
> unsupported alignment.
> (assemble_noswitch_variable): Likewise.
>
> Index: gcc/varasm.c
> ===
> --- gcc/varasm.c(revision 204948)
> +++ gcc/varasm.c(working copy)
> @@ -960,9 +960,9 @@ align_variable (tree decl, bool dont_output_data)
>   In particular, a.out format supports a maximum alignment of 4.  */
>if (align > MAX_OFILE_ALIGNMENT)
>  {
> -  warning (0, "alignment of %q+D is greater than maximum object "
> -   "file alignment.  Using %d", decl,
> -  MAX_OFILE_ALIGNMENT/BITS_PER_UNIT);
> +  error ("alignment of %q+D is greater than maximum object "
> +"file alignment.  Using %d", decl,
> +MAX_OFILE_ALIGNMENT/BITS_PER_UNIT);

The "Using %d" part of the diagnostic is now pointless, no?

Ok with removing it or changing it to tell the user the maximum supported
alignment.
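
As a rough illustration of the second option (reporting the maximum supported
alignment instead of "Using %d"), here is a standalone sketch.  It does not
use GCC's diagnostic machinery: the function name, SKETCH_BITS_PER_UNIT and
the exact message wording are assumptions made for this example only.

```cpp
#include <cstdio>
#include <string>

// Assumed for this sketch; GCC takes BITS_PER_UNIT from the target.
static const unsigned SKETCH_BITS_PER_UNIT = 8;

// Returns an empty string when ALIGN_BITS is supported.  Otherwise returns
// the text of a diagnostic that states the maximum supported alignment in
// bytes rather than promising a fallback value.
std::string
sketch_alignment_diagnostic (const char *name, unsigned align_bits,
                             unsigned max_ofile_align_bits)
{
  if (align_bits <= max_ofile_align_bits)
    return std::string ();

  char buf[256];
  std::snprintf (buf, sizeof buf,
                 "alignment of '%s' is greater than maximum object "
                 "file alignment %u",
                 name, max_ofile_align_bits / SKETCH_BITS_PER_UNIT);
  return std::string (buf);
}
```

Since an error stops compilation, the message only needs to tell the user the
limit; nothing is "used" in place of the requested alignment.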

Thanks,
Richard.

>align = MAX_OFILE_ALIGNMENT;
>  }
>
> @@ -1908,8 +1908,8 @@ assemble_noswitch_variable (tree decl, const char
>
>if (!sect->noswitch.callback (decl, name, size, rounded)
>&& (unsigned HOST_WIDE_INT) (align / BITS_PER_UNIT) > rounded)
> -warning (0, "requested alignment for %q+D is greater than "
> -"implemented alignment of %wu", decl, rounded);
> +error ("requested alignment for %q+D is greater than "
> +  "implemented alignment of %wu", decl, rounded);
>  }
>
>  /* A subroutine of assemble_variable.  Output the label and contents of
>
> --
> Joseph S. Myers
> jos...@codesourcery.com


Re: Factor unrelated declarations out of tree.h (1/2)

2013-11-19 Thread Diego Novillo
On Tue, Nov 19, 2013 at 12:17 AM, Jeff Law  wrote:

> It looks OK to me.

Thanks. Committed as rev 205023.

Ian,  the Go front end will need that patch committed now.


Diego.


Re: Enale -fno-fat-lto-objects by default

2013-11-19 Thread Markus Trippelsdorf
On 2013.11.19 at 11:21 +0100, Jan Hubicka wrote:
> > On 2013.11.19 at 09:44 +0100, Paolo Bonzini wrote:
> > > Il 18/11/2013 20:09, Jan Hubicka ha scritto:
> > > >>> > > this patch switches the default for fat-lto-objects as was 
> > > >>> > > documented for a while.
> > > >>> > > -ffat-lto-objects doubles compilation time and often makes users 
> > > >>> > > to not notice that
> > > >>> > > LTO was not used at all (because they forgot to use gcc-ar/gcc-nm 
> > > >>> > > plugins).
> > > >>> > > 
> > > >>> > > Sadly I had to add -ffat-lto-objects to bootstrap. This is 
> > > >>> > > because I do not know
> > > >>> > > how to convince our build machinery to use gcc-ar/gcc-nm during 
> > > >>> > > the stage2+
> > > >> > 
> > > >> > I've posted a minimal patch set for slim-lto-bootstrap last year, 
> > > >> > see:
> > > >> > http://thread.gmane.org/gmane.comp.gcc.patches/270842
> > > >> > 
> > > >> > If there's interest I could repost it.
> > > > It would be really nice to have it in indeed.  I think we do not really 
> > > > need
> > > > lto-bootstrap.mk and slim-lto-bootstrap.mk, but otherwise the patch 
> > > > seems easy
> > > > enough and would save quite some of lto bootstrap testing time...
> > > 
> > > Patches 1 and 2 should go upstream first.
> > 
> > OK, but where is upstream?
> > Please note that a general libtool update would fix this issue, too.
> > So, maybe it is just time to upgrade libtool everywhere in gnu-land?
> > 
> > > Patch 3 in the series is wrong because Makefile.in is a generated file.
> > >  The message does not explain why it is necessary, and it is probably
> > > working around a bug elsewhere.
> > > For patch 4, I agree with Jan that we do not need a separate 
> > > configuration.
> > 
> > The problem is that fixincl links with libiberty.a:
> > 
> > /var/tmp/gcc_build_dir/./gcc/xgcc -B/var/tmp/gcc_build_dir/./gcc/
> > -B/usr/x86_64-pc-linux-gnu/bin/ -B/usr/x86_64-pc-linux-gnu/lib/ -isystem
> > /usr/x86_64-pc-linux-gnu/include -isystem
> > /usr/x86_64-pc-linux-gnu/sys-include -O2 -pipe -static-libstdc++
> > -static-libgcc  -o fixincl fixincl.o fixtests.o fixfixes.o server.o
> > procopen.o fixlib.o fixopts.o ../libiberty/libiberty.a
> > 
> > And this archive consists of object files with LTO sections only. So we
> > need to find a way to pass -fuse-linker-plugin to the invocation above.
> 
> -fuse-linker-plugin is now default at the same time as -fno-fat-lto-objects is,
> so there should be no need for using this switch explicitly.

Hmm, gcc/gcc.c still reads:

/* Conditional to test whether the LTO plugin is used or not.
   FIXME: For slim LTO we will need to enable plugin unconditionally.  This
   still cause problems with PLUGIN_LD != LD and when plugin is built but
   not useable.  For GCC 4.6 we don't support slim LTO and thus we can enable
   plugin only when LTO is enabled.  We still honor explicit
   -fuse-linker-plugin if the linker used understands -plugin.  */

/* The linker has some plugin support.  */
#if HAVE_LTO_PLUGIN > 0
/* The linker used has full plugin support, use LTO plugin by default.  */
#if HAVE_LTO_PLUGIN == 2
#define PLUGIN_COND "!fno-use-linker-plugin:%{flto|flto=*|fuse-linker-plugin"
#define PLUGIN_COND_CLOSE "}"
#else
/* The linker used has limited plugin support, use LTO plugin with explicit
   -fuse-linker-plugin.  */
#define PLUGIN_COND "fuse-linker-plugin"
#define PLUGIN_COND_CLOSE ""
#endif
#define LINK_PLUGIN_SPEC \
    "%{"PLUGIN_COND": \
    -plugin %(linker_plugin_file) \
    -plugin-opt=%(lto_wrapper) \
    -plugin-opt=-fresolution=%u.res \
    %{!nostdlib:%{!nodefaultlibs:%:pass-through-libs(%(link_gcc_c_sequence))}} \
    }"PLUGIN_COND_CLOSE
#else
/* The linker used doesn't support -plugin, reject -fuse-linker-plugin.  */
#define LINK_PLUGIN_SPEC "%{fuse-linker-plugin:\
    %e-fuse-linker-plugin is not supported in this configuration}"
#endif
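
To see why this matters for the fixincl link line, it may help to expand the
macros by hand.  The sketch below assumes HAVE_LTO_PLUGIN == 2 and shortens
the spec body to a single -plugin option; the names ending in _SKETCH and the
helper function are made up for this illustration.

```cpp
#include <string>

// Assumption for this sketch: the linker has full plugin support.
#define HAVE_LTO_PLUGIN 2

#if HAVE_LTO_PLUGIN == 2
#define PLUGIN_COND "!fno-use-linker-plugin:%{flto|flto=*|fuse-linker-plugin"
#define PLUGIN_COND_CLOSE "}"
#else
#define PLUGIN_COND "fuse-linker-plugin"
#define PLUGIN_COND_CLOSE ""
#endif

// Adjacent string literals are concatenated by the compiler.  The spaces
// around the macro names keep C++11 from reading them as literal suffixes.
// The body is abbreviated to one option for readability.
#define LINK_PLUGIN_SPEC_SKETCH \
  "%{" PLUGIN_COND ": -plugin %(linker_plugin_file) }" PLUGIN_COND_CLOSE

// Returns the fully concatenated spec string.
std::string
link_plugin_spec_sketch ()
{
  return LINK_PLUGIN_SPEC_SKETCH;
}
```

The expanded spec still nests the -plugin option inside
%{flto|flto=*|fuse-linker-plugin: ...}, so a link command that carries none of
those switches never passes -plugin to the linker.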

-- 
Markus


[PATCH i386 8/8] [AVX-512] Add SHA support.

2013-11-19 Thread Kirill Yukhin
Hello,
This patch introduces new SHA instructions described in [1]
along with tests.

Testing:
  1. Bootstrap pass.
  2. make check shows no regressions.
  3. Spec 2000 & 2006 build show no regressions both with and without -mavx512f
     option.
  4. Spec 2000 & 2006 run shows no stability regressions without -mavx512f
     option.

ChangeLog entry:
2013-11-18  Alexander Ivchenko  
Maxim Kuznetsov  
Sergey Lega  
Anna Tikhonova  
Ilya Tocar  
Andrey Turetskiy  
Ilya Verbin  
Kirill Yukhin  
Michael Zolotukhin  

* common/config/i386/i386-common.c (OPTION_MASK_ISA_SHA_SET): New.
(OPTION_MASK_ISA_SHA_UNSET): Ditto.
(ix86_handle_option): Handle OPT_msha.
* config.gcc (extra_headers): Add shaintrin.h.
* config/i386/cpuid.h (bit_SHA): New.
* config/i386/driver-i386.c (host_detect_local_cpu): Detect SHA
instructions.
* config/i386/i386-c.c (ix86_target_macros_internal): Handle
OPTION_MASK_ISA_SHA.
* config/i386/i386.c (ix86_target_string): Add -msha.
(ix86_option_override_internal): Add PTA_SHA.
(ix86_valid_target_attribute_inner_p): Handle OPT_msha.
(enum ix86_builtins): Add IX86_BUILTIN_SHA1MSG1,
IX86_BUILTIN_SHA1MSG2, IX86_BUILTIN_SHA1NEXTE, IX86_BUILTIN_SHA1RNDS4,
IX86_BUILTIN_SHA256MSG1, IX86_BUILTIN_SHA256MSG2,
IX86_BUILTIN_SHA256RNDS2.
(bdesc_args): Add BUILTINS defined above.
(ix86_init_mmx_sse_builtins): Add __builtin_ia32_sha1msg1,
__builtin_ia32_sha1msg2, __builtin_ia32_sha1nexte,
__builtin_ia32_sha1rnds4, __builtin_ia32_sha256msg1,
__builtin_ia32_sha256msg2, __builtin_ia32_sha256rnds2.
(ix86_expand_args_builtin): Handle V4SI_FTYPE_V4SI_V4SI_V4SI, add
warning for CODE_FOR_sha1rnds4.
* config/i386/i386.h (TARGET_SHA): New.
(TARGET_SHA_P): Ditto.  
* config/i386/i386.opt (-msha): Document it.
* config/i386/immintrin.h: Add shaintrin.h.
* config/i386/shaintrin.h: New.
* config/i386/sse.md (unspec): Add UNSPEC_SHA1MSG1, UNSPEC_SHA1MSG2,
UNSPEC_SHA1NEXTE, UNSPEC_SHA1RNDS4, UNSPEC_SHA256MSG1,
UNSPEC_SHA256MSG2, UNSPEC_SHA256RNDS2.
(sha1msg1): New.
(sha1msg2): Ditto.
(sha1nexte): Ditto.
(sha1rnds4): Ditto.
(sha256msg1): Ditto.
(sha256msg2): Ditto.
(sha256rnds2): Ditto.
* doc/invoke.texi: Add -msha, -mno-sha.

testsuite/ChangeLog entry:
2013-11-18  Alexander Ivchenko  
Maxim Kuznetsov  
Sergey Lega  
Anna Tikhonova  
Ilya Tocar  
Andrey Turetskiy  
Ilya Verbin  
Kirill Yukhin  
Michael Zolotukhin  

* gcc.target/i386/avx-1.c: Add define for __builtin_ia32_sha1rnds4.
* gcc.target/i386/i386.exp (check_effective_target_sha): New.
* gcc.target/i386/sha-check.h: New file.
* gcc.target/i386/sha1msg1-1.c: Ditto.
* gcc.target/i386/sha1msg1-2.c: Ditto.
* gcc.target/i386/sha1msg2-1.c: Ditto.
* gcc.target/i386/sha1msg2-2.c: Ditto.
* gcc.target/i386/sha1nexte-1.c: Ditto.
* gcc.target/i386/sha1nexte-2.c: Ditto.
* gcc.target/i386/sha1rnds4-1.c: Ditto.
* gcc.target/i386/sha1rnds4-2.c: Ditto.
* gcc.target/i386/sha256msg1-1.c: Ditto.
* gcc.target/i386/sha256msg1-2.c: Ditto.
* gcc.target/i386/sha256msg2-1.c: Ditto.
* gcc.target/i386/sha256msg2-2.c: Ditto.
* gcc.target/i386/sha256rnds2-1.c: Ditto.
* gcc.target/i386/sha256rnds2-2.c: Ditto.
* gcc.target/i386/sse-13.c: Add __builtin_ia32_sha1rnds4.
* gcc.target/i386/sse-14.c: Add _mm_sha1rnds4_epu32.
* gcc.target/i386/sse-22.c: Ditto.
* gcc.target/i386/sse-23.c: Add __builtin_ia32_sha1rnds4.

Patch at the bottom.

Is it ok for trunk?

[1] - http://download-software.intel.com/sites/default/files/319433-016.pdf

--
Thanks, K

---
 gcc/common/config/i386/i386-common.c  | 18 -
 gcc/config.gcc|  6 +-
 gcc/config/i386/cpuid.h   |  1 +
 gcc/config/i386/driver-i386.c |  6 +-
 gcc/config/i386/i386-c.c  |  2 +
 gcc/config/i386/i386.c| 46 -
 gcc/config/i386/i386.h|  2 +
 gcc/config/i386/i386.opt  |  4 ++
 gcc/config/i386/immintrin.h   |  2 +
 gcc/config/i386/shaintrin.h   | 99 +++
 gcc/config/i386/sse.md| 90 
 gcc/doc/invoke.texi   |  8 ++-
 gcc/testsuite/gcc.target/i386/avx-1.c |  3 +
 gcc/testsuite/gcc.target/i386/i386.exp| 14 
 gcc/testsuite/gcc.target/i386/sha-check.h | 37 ++
 gcc/testsuite/gcc.target/i386

Make simple loop peeling to happen at gimple level

2013-11-19 Thread Jan Hubicka
Hi,
this is an update of my 2012 patch to move RTL loop peeling (the one based on
profile feedback) to the tree level.  Peeling exposes new optimization
opportunities, and it is a good idea to let the gimple passes see them.
Moreover, we probably want to hook in a new heuristic that uses value
histograms of loop counts.

The patch also removes simple peeling at the RTL level, which is mostly
obsoleted by the gimple-level implementation (in a few cases in our testsuite
the RTL code is able to determine loop bounds better than the gimple code, but
these seem to be just weird cases that should be handled independently at the
gimple level).

I re-profiled-bootstrapped/regtested the patch on x86_64-linux and
benchmarked SPEC2000 with quite neutral results (smaller code overall).
(-Werror needs to be disabled or I get bogus overflow warnings both with
and without the patch)
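
For readers less familiar with the transformation, here is a hand-written
sketch of what peeling one iteration looks like at the source level.  It is
not output of the pass, and the function names are invented for this example.

```cpp
#include <cstddef>

// Original loop: every iteration goes through the loop header.
int
sum_rolled (const int *a, std::size_t n)
{
  int s = 0;
  for (std::size_t i = 0; i < n; i++)
    s += a[i];
  return s;
}

// The same loop with its first iteration peeled off by hand.  The peeled
// copy is straight-line code, so later passes can simplify it on its own
// (for instance when the profile says the loop usually runs only once).
int
sum_peeled_once (const int *a, std::size_t n)
{
  int s = 0;
  std::size_t i = 0;
  if (i < n)            // peeled copy of iteration 0
    {
      s += a[i];
      i++;
    }
  for (; i < n; i++)    // remaining iterations, if any
    s += a[i];
  return s;
}
```

Both functions compute the same result for every n, including n == 0; the
peeled form only changes how the first iteration is laid out.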

* loop-unroll.c: (decide_unrolling_and_peeling): Rename to
(decide_unrolling): ... this one.
(peel_loops_completely): Remove.
(decide_peel_simple): Remove.
(decide_peel_once_rolling): Remove.
(decide_peel_completely): Remove.
(peel_loop_simple): Remove.
(peel_loop_completely): Remove.
(unroll_and_peel_loops): Rename to ...
(unroll_loops): ... this one; handle only unrolling.
* cfgloop.h (lpt_dec): Remove LPT_PEEL_COMPLETELY and
LPT_PEEL_SIMPLE.
(UAP_PEEL): Remove.
(unroll_and_peel_loops): Remove.
(unroll_loops): New.
* passes.def: Replace
pass_rtl_unroll_and_peel_loops by pass_rtl_unroll_loops.
* loop-init.c (gate_rtl_unroll_and_peel_loops,
rtl_unroll_and_peel_loops): Rename to ...
(gate_rtl_unroll_loops, rtl_unroll_loops): ... these; update.
(pass_rtl_unroll_and_peel_loops): Rename to ...
(pass_rtl_unroll_loops): ... this one.
* tree-pass.h (make_pass_rtl_unroll_and_peel_loops): Remove.
(make_pass_rtl_unroll_loops): New.
* tree-ssa-loop-ivcanon.c: (estimated_peeled_sequence_size, 
try_peel_loop): New.
(canonicalize_loop_induction_variables): Update.

* gcc.dg/tree-prof/peel-1.c: Update.
* gcc.dg/tree-prof/unroll-1.c: Update.
Index: tree-pass.h
===
*** tree-pass.h (revision 205015)
--- tree-pass.h (working copy)
*** extern rtl_opt_pass *make_pass_loop2 (gc
*** 506,512 
  extern rtl_opt_pass *make_pass_rtl_loop_init (gcc::context *ctxt);
  extern rtl_opt_pass *make_pass_rtl_move_loop_invariants (gcc::context *ctxt);
  extern rtl_opt_pass *make_pass_rtl_unswitch (gcc::context *ctxt);
! extern rtl_opt_pass *make_pass_rtl_unroll_and_peel_loops (gcc::context *ctxt);
  extern rtl_opt_pass *make_pass_rtl_doloop (gcc::context *ctxt);
  extern rtl_opt_pass *make_pass_rtl_loop_done (gcc::context *ctxt);
  
--- 506,512 
  extern rtl_opt_pass *make_pass_rtl_loop_init (gcc::context *ctxt);
  extern rtl_opt_pass *make_pass_rtl_move_loop_invariants (gcc::context *ctxt);
  extern rtl_opt_pass *make_pass_rtl_unswitch (gcc::context *ctxt);
! extern rtl_opt_pass *make_pass_rtl_unroll_loops (gcc::context *ctxt);
  extern rtl_opt_pass *make_pass_rtl_doloop (gcc::context *ctxt);
  extern rtl_opt_pass *make_pass_rtl_loop_done (gcc::context *ctxt);
  
Index: testsuite/gcc.dg/tree-prof/peel-1.c
===
*** testsuite/gcc.dg/tree-prof/peel-1.c (revision 205015)
--- testsuite/gcc.dg/tree-prof/peel-1.c (working copy)
***
*** 1,4 
! /* { dg-options "-O3 -fdump-rtl-loop2_unroll -fno-unroll-loops -fpeel-loops" } */
  void abort();
  
  int a[1000];
--- 1,4 
! /* { dg-options "-O3 -fdump-tree-cunroll-details -fno-unroll-loops -fpeel-loops" } */
  void abort();
  
  int a[1000];
Index: passes.def
===
*** passes.def  (revision 205015)
--- passes.def  (working copy)
*** along with GCC; see the file COPYING3.
*** 337,343 
  NEXT_PASS (pass_rtl_loop_init);
  NEXT_PASS (pass_rtl_move_loop_invariants);
  NEXT_PASS (pass_rtl_unswitch);
! NEXT_PASS (pass_rtl_unroll_and_peel_loops);
  NEXT_PASS (pass_rtl_doloop);
  NEXT_PASS (pass_rtl_loop_done);
  TERMINATE_PASS_LIST ()
--- 337,343 
  NEXT_PASS (pass_rtl_loop_init);
  NEXT_PASS (pass_rtl_move_loop_invariants);
  NEXT_PASS (pass_rtl_unswitch);
! NEXT_PASS (pass_rtl_unroll_loops);
  NEXT_PASS (pass_rtl_doloop);
  NEXT_PASS (pass_rtl_loop_done);
  TERMINATE_PASS_LIST ()
Index: loop-init.c
===
*** loop-init.c (revision 205015)
--- loop-init.c (working copy)
*** gate_handle_loop2 (void)
*** 300,306 
if (optimize > 0
&& (flag_move_loop_invariants
  || flag_unswitch_loops
- || flag_peel_loops
  ||

Re: Enale -fno-fat-lto-objects by default

2013-11-19 Thread Jan Hubicka
> Hmm, gcc/gcc.c still reads:
> 
> /* Conditional to test whether the LTO plugin is used or not.
>    FIXME: For slim LTO we will need to enable plugin unconditionally.  This
>    still cause problems with PLUGIN_LD != LD and when plugin is built but
>    not useable.  For GCC 4.6 we don't support slim LTO and thus we can enable
>    plugin only when LTO is enabled.  We still honor explicit
>    -fuse-linker-plugin if the linker used understands -plugin.  */
>
> /* The linker has some plugin support.  */
> #if HAVE_LTO_PLUGIN > 0
> /* The linker used has full plugin support, use LTO plugin by default.  */
> #if HAVE_LTO_PLUGIN == 2

Hmm, I see, your problem is that there is no -flto?
I guess you need to add that one then (rather than -fuse-linker-plugin).
Yep, we ought to enable the plugin by default to make slim LTO happy and
probably can drop LTO_PLUGIN versioning. Richi?

Honza
> #define PLUGIN_COND "!fno-use-linker-plugin:%{flto|flto=*|fuse-linker-plugin"
> #define PLUGIN_COND_CLOSE "}"
> #else
> /* The linker used has limited plugin support, use LTO plugin with explicit
>    -fuse-linker-plugin.  */
> #define PLUGIN_COND "fuse-linker-plugin"
> #define PLUGIN_COND_CLOSE ""
> #endif
> #define LINK_PLUGIN_SPEC \
>     "%{"PLUGIN_COND": \
>     -plugin %(linker_plugin_file) \
>     -plugin-opt=%(lto_wrapper) \
>     -plugin-opt=-fresolution=%u.res \
>     %{!nostdlib:%{!nodefaultlibs:%:pass-through-libs(%(link_gcc_c_sequence))}} \
>     }"PLUGIN_COND_CLOSE
> #else
> /* The linker used doesn't support -plugin, reject -fuse-linker-plugin.  */
> #define LINK_PLUGIN_SPEC "%{fuse-linker-plugin:\
>     %e-fuse-linker-plugin is not supported in this configuration}"
> #endif
> 
> -- 
> Markus


Re: Enale -fno-fat-lto-objects by default

2013-11-19 Thread Richard Biener
On Tue, 19 Nov 2013, Jan Hubicka wrote:

> > Hmm, gcc/gcc.c still reads:
> > 
> > /* Conditional to test whether the LTO plugin is used or not.
> >    FIXME: For slim LTO we will need to enable plugin unconditionally.  This
> >    still cause problems with PLUGIN_LD != LD and when plugin is built but
> >    not useable.  For GCC 4.6 we don't support slim LTO and thus we can enable
> >    plugin only when LTO is enabled.  We still honor explicit
> >    -fuse-linker-plugin if the linker used understands -plugin.  */
> >
> > /* The linker has some plugin support.  */
> > #if HAVE_LTO_PLUGIN > 0
> > /* The linker used has full plugin support, use LTO plugin by default.  */
> > #if HAVE_LTO_PLUGIN == 2
> 
> Hmm, I see, your problem is that there is no -flto?

The 4.6 consideration is irrelevant, I don't understand your question ...
for disabled LTO you can't run bootstrap-lto ...

> I guess you need to add that one then (rather than -fuse-linker-plugin).
> Yep, we ought to enable the plugin by default to make slim LTO happy and
> probably can drop LTO_PLUGIN versioning. Richi?

I'd like to remove -fuse-linker-plugin and decide on its use at build
time (like we decide on its default now).  We should print a big fat
warning when we disable its use (for whatever reason) - it's probably
too late (or early ...) to remove non-linker-plugin LTO.

Richard.
 
> Honza
> > #define PLUGIN_COND "!fno-use-linker-plugin:%{flto|flto=*|fuse-linker-plugin"
> > #define PLUGIN_COND_CLOSE "}"
> > #else
> > /* The linker used has limited plugin support, use LTO plugin with explicit
> >    -fuse-linker-plugin.  */
> > #define PLUGIN_COND "fuse-linker-plugin"
> > #define PLUGIN_COND_CLOSE ""
> > #endif
> > #define LINK_PLUGIN_SPEC \
> >     "%{"PLUGIN_COND": \
> >     -plugin %(linker_plugin_file) \
> >     -plugin-opt=%(lto_wrapper) \
> >     -plugin-opt=-fresolution=%u.res \
> >     %{!nostdlib:%{!nodefaultlibs:%:pass-through-libs(%(link_gcc_c_sequence))}} \
> >     }"PLUGIN_COND_CLOSE
> > #else
> > /* The linker used doesn't support -plugin, reject -fuse-linker-plugin.  */
> > #define LINK_PLUGIN_SPEC "%{fuse-linker-plugin:\
> >     %e-fuse-linker-plugin is not supported in this configuration}"
> > #endif
> > 
> > -- 
> > Markus
> 
> 

-- 
Richard Biener 
SUSE / SUSE Labs
SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
GF: Jeff Hawn, Jennifer Guild, Felix Imend"orffer


[PATCH] Fix PR58956

2013-11-19 Thread Richard Biener

This fixes PR58956 where we TER a load into a call statement lhs
that is modified by the call.  TER already has measures to avoid
doing this for regular assignments so the following simply
extends it to arbitrary stmts (including ASMs).

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk.

Richard.

2013-11-19  Richard Biener  

PR middle-end/58956
* tree-ssa-ter.c (find_replaceable_in_bb): Avoid forwarding
loads into stmts that may clobber it.

* gcc.dg/torture/pr58956.c: New testcase.

Index: gcc/tree-ssa-ter.c
===
*** gcc/tree-ssa-ter.c  (revision 205009)
--- gcc/tree-ssa-ter.c  (working copy)
*** find_replaceable_in_bb (temp_expr_table_
*** 601,608 
  /* If the stmt does a memory store and the replacement
 is a load aliasing it avoid creating overlapping
 assignments which we cannot expand correctly.  */
! if (gimple_vdef (stmt)
! && gimple_assign_single_p (stmt))
{
  gimple def_stmt = SSA_NAME_DEF_STMT (use);
  while (is_gimple_assign (def_stmt)
--- 607,613 
  /* If the stmt does a memory store and the replacement
 is a load aliasing it avoid creating overlapping
 assignments which we cannot expand correctly.  */
! if (gimple_vdef (stmt))
{
  gimple def_stmt = SSA_NAME_DEF_STMT (use);
  while (is_gimple_assign (def_stmt)
*** find_replaceable_in_bb (temp_expr_table_
*** 611,618 
  = SSA_NAME_DEF_STMT (gimple_assign_rhs1 (def_stmt));
  if (gimple_vuse (def_stmt)
  && gimple_assign_single_p (def_stmt)
! && refs_may_alias_p (gimple_assign_lhs (stmt),
!  gimple_assign_rhs1 (def_stmt)))
same_root_var = true;
}
  
--- 616,623 
  = SSA_NAME_DEF_STMT (gimple_assign_rhs1 (def_stmt));
  if (gimple_vuse (def_stmt)
  && gimple_assign_single_p (def_stmt)
! && stmt_may_clobber_ref_p (stmt,
!gimple_assign_rhs1 (def_stmt)))
same_root_var = true;
}
  
Index: gcc/testsuite/gcc.dg/torture/pr58956.c
===
*** gcc/testsuite/gcc.dg/torture/pr58956.c  (revision 0)
--- gcc/testsuite/gcc.dg/torture/pr58956.c  (working copy)
***
*** 0 
--- 1,30 
+ /* { dg-do run } */
+ 
+ extern void abort (void);
+ 
+ struct S
+ {
+   int f0;
+ } a = {1}, b, g, *c = &b, **f = &c;
+ 
+ int *d, **e = &d, h;
+ 
+ struct S
+ foo ()
+ {
+   *e = &h;
+   if (!d) 
+ __builtin_unreachable ();
+   *f = &g;
+   return a;
+ }
+ 
+ int
+ main ()
+ {
+   struct S *i = c;
+   *i = foo ();
+   if (b.f0 != 1)
+ abort ();
+   return 0;
+ }


[x86,PATCH] Additional fix for 57756.

2013-11-19 Thread Yuri Rumyantsev
Hi All,

We found that a compiler configured with the '-fpmath=sse' option does
not generate the scalar floating-point instructions present in the SSE
instruction set for generic32, which leads to performance degradation
for Fortran benchmarks that use library functions in 32-bit mode.

This simple fix addresses the issue: the definition of the
TARGET_FPMATH_DEFAULT_P macro was missing from i386/ssemath.h.  The
patch also adds the missed argument prefix for 'ix86_fpmath' in i386.c.

Bootstrapping and regression testing were successful.

Is it OK for trunk?

2013-11-20  Yuri Rumyantsev  

* config/i386/i386.c (ix86_option_override_internal): Add missed
argument prefix for 'ix86_fpmath'.
* config/i386/ssemath.h: Add missed definition of
TARGET_FPMATH_DEFAULT_P macros.


57756.patch
Description: Binary data


[patch] fix graphite build

2013-11-19 Thread Andrew MacLeod
graphite-sese-to-poly.c needs expr.h to compile.  Fixed thusly and 
checked in as revision 205027.


Andrew
	* graphite-sese-to-poly.c: Include expr.h.

Index: graphite-sese-to-poly.c
===
*** graphite-sese-to-poly.c	(revision 205024)
--- graphite-sese-to-poly.c	(working copy)
*** along with GCC; see the file COPYING3.
*** 58,63 
--- 58,64 
  #include "tree-ssa-propagate.h"
  
  #ifdef HAVE_cloog
+ #include "expr.h"
  #include "graphite-poly.h"
  #include "graphite-sese-to-poly.h"
  


Re: [x86,PATCH] Additional fix for 57756.

2013-11-19 Thread H.J. Lu
On Tue, Nov 19, 2013 at 5:31 AM, Yuri Rumyantsev  wrote:
> Hi All,
>
> We found out that compiler configured with '-fpmath=sse' option does
> not generate scalar floating-point instructions present in the SSE
> instruction set for generic32 that leads to performance degradation
> for Fortran benchmarks using library functions in 32-bit mode.
>
> This simple fix was designed to cure this issue - definition of
> TARGET_FPMATH_DEFAULT_P macros was missed in i386/ssemath.h. Also one
> missed fix was done.
>
> Bootstrapping and regression testing were successful.
>
> Is it OK for trunk
>
> 2013-11-20  Yuri Rumyantsev  
>
> * config/i386/i386.c (ix86_option_override_internal): Add missed
> argument prefix for 'ix86_fpmath'.
> * config/i386/ssemath.h: Add missed definition of
> TARGET_FPMATH_DEFAULT_P macros.

Please add "PR target/57756" in ChangeLog entry.

Thanks.

-- 
H.J.


Re: [patch] fix graphite build

2013-11-19 Thread Diego Novillo
On Tue, Nov 19, 2013 at 8:36 AM, Andrew MacLeod  wrote:
> graphite-sese-to-poly.c needs expr.h to compile.  Fixed thusly and checked
> in as revision 205027.

Thanks!


Diego.


Re: [x86,PATCH] Additional fix for 57756.

2013-11-19 Thread Yuri Rumyantsev
Resend modified ChangeLog:

2013-11-20  Yuri Rumyantsev  
PR target/57756
* config/i386/i386.c (ix86_option_override_internal): Add missed
argument prefix for 'ix86_fpmath'.
* config/i386/ssemath.h: Add missed definition of
TARGET_FPMATH_DEFAULT_P macros.

2013/11/19 H.J. Lu :
> On Tue, Nov 19, 2013 at 5:31 AM, Yuri Rumyantsev  wrote:
>> Hi All,
>>
>> We found out that compiler configured with '-fpmath=sse' option does
>> not generate scalar floating-point instructions present in the SSE
>> instruction set for generic32 that leads to performance degradation
>> for Fortran benchmarks using library functions in 32-bit mode.
>>
>> This simple fix was designed to cure this issue - definition of
>> TARGET_FPMATH_DEFAULT_P macros was missed in i386/ssemath.h. Also one
>> missed fix was done.
>>
>> Bootstrapping and regression testing were successful.
>>
>> Is it OK for trunk
>>
>> 2013-11-20  Yuri Rumyantsev  
>>
>> * config/i386/i386.c (ix86_option_override_internal): Add missed
>> argument prefix for 'ix86_fpmath'.
>> * config/i386/ssemath.h: Add missed definition of
>> TARGET_FPMATH_DEFAULT_P macros.
>
> Please add "PR target/57756" in ChangeLog entry.
>
> Thanks.
>
> --
> H.J.


Re: [x86,PATCH] Additional fix for 57756.

2013-11-19 Thread Uros Bizjak
On Tue, Nov 19, 2013 at 2:45 PM, Yuri Rumyantsev  wrote:
> Resend modified ChangeLog:
>
> 2013-11-20  Yuri Rumyantsev  
> PR target/57756
> * config/i386/i386.c (ix86_option_override_internal): Add missed
> argument prefix for 'ix86_fpmath'.
> * config/i386/ssemath.h: Add missed definition of
> TARGET_FPMATH_DEFAULT_P macros.
>
> 2013/11/19 H.J. Lu :
>> On Tue, Nov 19, 2013 at 5:31 AM, Yuri Rumyantsev  wrote:
>>> Hi All,
>>>
>>> We found out that compiler configured with '-fpmath=sse' option does
>>> not generate scalar floating-point instructions present in the SSE
>>> instruction set for generic32 that leads to performance degradation
>>> for Fortran benchmarks using library functions in 32-bit mode.
>>>
>>> This simple fix was designed to cure this issue - definition of
>>> TARGET_FPMATH_DEFAULT_P macros was missed in i386/ssemath.h. Also one
>>> missed fix was done.
>>>
>>> Bootstrapping and regression testing were successful.
>>>
>>> Is it OK for trunk
>>>
>>> 2013-11-20  Yuri Rumyantsev  
>>>
>>> * config/i386/i386.c (ix86_option_override_internal): Add missed
>>> argument prefix for 'ix86_fpmath'.
>>> * config/i386/ssemath.h: Add missed definition of
>>> TARGET_FPMATH_DEFAULT_P macros.

The patch looks OK to me, but let's also ask Sriraman for his opinion.

Thanks,
Uros.


Re: Enable -fno-fat-lto-objects by default

2013-11-19 Thread Jan Hubicka
> On Tue, 19 Nov 2013, Jan Hubicka wrote:
> 
> > > Hmm, gcc/gcc.c still reads:
> > > 
> > >  690 /* Conditional to test whether the LTO plugin is used or not.
> > >  691FIXME: For slim LTO we will need to enable plugin 
> > > unconditionally.  This
> > >  692still cause problems with PLUGIN_LD != LD and when plugin is 
> > > built but
> > >  693not useable.  For GCC 4.6 we don't support slim LTO and thus we 
> > > can enable
> > >  694plugin only when LTO is enabled.  We still honor explicit
> > >  695-fuse-linker-plugin if the linker used understands -plugin.  */
> > >  696
> > >  697 /* The linker has some plugin support.  */
> > >  698 #if HAVE_LTO_PLUGIN > 0
> > >  699 /* The linker used has full plugin support, use LTO plugin by 
> > > default.  */
> > >  700 #if HAVE_LTO_PLUGIN == 2
> > 
> > Hmm, I see, your problem is that there is no -flto?
> 
> The 4.6 consideration is irrelevant, I don't understand your question ...
> for disabled LTO you can't run bootstrap-lto ...

The problem is that you have a .a library consisting of slim LTO objects
and you link with it during a configure check without -flto.
In this case we do not run the plugin and never notice that LTO objects
are involved.
I think in the longer run we should do it.
> 
> > I guess you need to add that one then (rather than -fuse-linker-plugin).
> > Yep, we ought to enable plugin for default to make split LTO happy and 
> > probably can
> > drop LTO_PLUGIN versioning. Richi?
> 
> I'd like to remove -fuse-linker-plugin and decide on its use at build
> time (like we decide on its default now).  We should print a big fat
> warning when we disable its use (for whatever reason) - it's probably
> too late (or early ...) to remove non-linker-plugin LTO.

I would also be happy to see -fuse-linker-plugin and the non-plugin path go.
I think the non-plugin path needs to stay until Darwin gets plugin support,
though, and the decision depends on the linker in use.  At compile time we
do not know whether the user will choose a plugin-enabled LD.

Honza


Re: Enable -fno-fat-lto-objects by default

2013-11-19 Thread Jan Hubicka
> > On Tue, 19 Nov 2013, Jan Hubicka wrote:
> > 
> > > > Hmm, gcc/gcc.c still reads:
> > > > 
> > > >  690 /* Conditional to test whether the LTO plugin is used or not.
> > > >  691FIXME: For slim LTO we will need to enable plugin 
> > > > unconditionally.  This
> > > >  692still cause problems with PLUGIN_LD != LD and when plugin is 
> > > > built but
> > > >  693not useable.  For GCC 4.6 we don't support slim LTO and thus we 
> > > > can enable
> > > >  694plugin only when LTO is enabled.  We still honor explicit
> > > >  695-fuse-linker-plugin if the linker used understands -plugin.  */
> > > >  696
> > > >  697 /* The linker has some plugin support.  */
> > > >  698 #if HAVE_LTO_PLUGIN > 0
> > > >  699 /* The linker used has full plugin support, use LTO plugin by 
> > > > default.  */
> > > >  700 #if HAVE_LTO_PLUGIN == 2
> > > 
> > > Hmm, I see, your problem is that there is no -flto?
> > 
> > The 4.6 consideration is irrelevant, I don't understand your question ...
> > for disabled LTO you can't run bootstrap-lto ...
> 
> The problem is that you have .a library consisting of slim LTO objects and 
> you link
> with it during configure check without -flto.
> In this case we do not run plugin and never notice that LTO objects are 
> involved.
> I think in linger run we should do it.
I meant that in the longer run we should switch to using the plugin all the
time, so slim LTO archives can be used this way.  It does come with some
link-time cost (I never measured whether it is significant).

Honza


Re: [gomp4 simd, RFC] Simple fix to override vectorization cost estimation.

2013-11-19 Thread Sergey Ostanevich
:) I agree with you, but if you are a user who tries to introduce
vector code and hits a bug in the cost model, you would like to have a
workaround until the bug is fixed and the new compiler reaches you
with a new OS distribution, wouldn't you?

I propose the following, though SLP has to pass NULL as the loop info,
which looks somewhat hacky.

Sergos


* common.opt: Added new option -fsimd-vect-cost-model
* tree-vectorizer.h (unlimited_cost_model): Interface update
to rely on particular loop info
* tree-vect-data-refs.c (vect_peeling_hash_insert): Update to
unlimited_cost_model call according to new interface
(vect_peeling_hash_choose_best_peeling): Ditto
(vect_enhance_data_refs_alignment): Ditto
* tree-vect-slp.c: Ditto
* tree-vect-loop.c (vect_estimate_min_profitable_iters): Ditto
plus issue a warning in case cost model overrides users' directive



diff --git a/gcc/common.opt b/gcc/common.opt
index d5971df..87b3b37 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -2296,6 +2296,10 @@ fvect-cost-model=
 Common Joined RejectNegative Enum(vect_cost_model)
Var(flag_vect_cost_model) Init(VECT_COST_MODEL_DEFAULT)
 Specifies the cost model for vectorization

+fsimd-vect-cost-model=
+Common Joined RejectNegative Enum(vect_cost_model)
Var(flag_simd_vect_cost_model) Init(VECT_COST_MODEL_UNLIMITED)
+Specifies the cost model for vectorization in loops marked with
#pragma omp simd
+
 Enum
 Name(vect_cost_model) Type(enum vect_cost_model) UnknownError(unknown
vectorizer cost model %qs)

diff --git a/gcc/tree-vect-data-refs.c b/gcc/tree-vect-data-refs.c
index 83d1f45..e26f704 100644
--- a/gcc/tree-vect-data-refs.c
+++ b/gcc/tree-vect-data-refs.c
@@ -1090,7 +1090,8 @@ vect_peeling_hash_insert (loop_vec_info
loop_vinfo, struct data_reference *dr,
   *new_slot = slot;
 }

-  if (!supportable_dr_alignment && unlimited_cost_model ())
+  if (!supportable_dr_alignment
+  && unlimited_cost_model (LOOP_VINFO_LOOP (loop_vinfo)))
 slot->count += VECT_MAX_COST;
 }

@@ -1200,7 +1201,7 @@ vect_peeling_hash_choose_best_peeling
(loop_vec_info loop_vinfo,
res.peel_info.dr = NULL;
res.body_cost_vec = stmt_vector_for_cost ();

-   if (!unlimited_cost_model ())
+   if (!unlimited_cost_model (LOOP_VINFO_LOOP (loop_vinfo)))
  {
res.inside_cost = INT_MAX;
res.outside_cost = INT_MAX;
@@ -1429,7 +1430,7 @@ vect_enhance_data_refs_alignment (loop_vec_info
loop_vinfo)
  vectorization factor.
  We do this automtically for cost model, since we
calculate cost
  for every peeling option.  */
-  if (unlimited_cost_model ())
+  if (unlimited_cost_model (LOOP_VINFO_LOOP (loop_vinfo)))
 possible_npeel_number = vf /nelements;

   /* Handle the aligned case. We may decide to align some other
@@ -1437,7 +1438,7 @@ vect_enhance_data_refs_alignment (loop_vec_info
loop_vinfo)
   if (DR_MISALIGNMENT (dr) == 0)
 {
   npeel_tmp = 0;
-  if (unlimited_cost_model ())
+  if (unlimited_cost_model (LOOP_VINFO_LOOP (loop_vinfo)))
 possible_npeel_number++;
 }

diff --git a/gcc/tree-vect-loop.c b/gcc/tree-vect-loop.c
index 86ebbd2..be66172 100644
--- a/gcc/tree-vect-loop.c
+++ b/gcc/tree-vect-loop.c
@@ -2696,7 +2696,7 @@ vect_estimate_min_profitable_iters
(loop_vec_info loop_vinfo,
   void *target_cost_data = LOOP_VINFO_TARGET_COST_DATA (loop_vinfo);

   /* Cost model disabled.  */
-  if (unlimited_cost_model ())
+  if (unlimited_cost_model (LOOP_VINFO_LOOP (loop_vinfo)))
 {
   dump_printf_loc (MSG_NOTE, vect_location, "cost model disabled.\n");
   *ret_min_profitable_niters = 0;
@@ -2929,6 +2929,11 @@ vect_estimate_min_profitable_iters
(loop_vec_info loop_vinfo,
   /* vector version will never be profitable.  */
   else
 {
+  if (LOOP_VINFO_LOOP (loop_vinfo)->force_vect)
+{
+  pedwarn (vect_location, 0, "Vectorization did not happen
for the loop");
+}
+
   if (dump_enabled_p ())
 dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
  "cost model: the vector iteration cost = %d "
diff --git a/gcc/tree-vect-slp.c b/gcc/tree-vect-slp.c
index 247bdfd..4b25964 100644
--- a/gcc/tree-vect-slp.c
+++ b/gcc/tree-vect-slp.c
@@ -2171,7 +2171,7 @@ vect_slp_analyze_bb_1 (basic_block bb)
 }

   /* Cost model: check if the vectorization is worthwhile.  */
-  if (!unlimited_cost_model ()
+  if (!unlimited_cost_model (NULL)
   && !vect_bb_vectorization_profitable_p (bb_vinfo))
 {
   if (dump_enabled_p ())
diff --git a/gcc/tree-vectorizer.h b/gcc/tree-vectorizer.h
index a6c5b59..2916906 100644
--- a/gcc/tree-vectorizer.h
+++ b/gcc/tree-vectorizer.h
@@ -919,9 +919,12 @@ known_alignment_for_access_p (struct
data_reference *data_ref_info)

 /* Return true if the vect cost model is unlimited.  */
 

Re: [patch] gcc fstack-protector-explicit

2013-11-19 Thread Marcos Díaz
My employer is working on the signature of the papers. Could someone
please do the review meanwhile?

On Tue, Nov 19, 2013 at 3:00 AM, Jeff Law  wrote:
> On 11/18/13 13:05, Marcos Díaz wrote:
>>
>> Hi,
>> the attached patch adds a new attribute and option flag to control
>> when to do stack protection.
>> The new attribute (stack_protect)  affects the behavior of gcc by
>> forcing the stack protection of the function marked with the attribute
>> if any of the options -fstack-protector, -fstack-protector-strong or
>> -fstack-protector-explicit(new) are set.
>>
>> If the new option (-fstack-protector-explicit) is set only those
>> functions with the attribute stack_protect will have stack protection.
>>
>> The stack re-ordering of the variables depends on the option set,
>> currently if flag -fstack-protector is set only char arrays are
>> reordered in the stack, whereas if flag -fstack-protector-strong or
>> -fstack-protector-explicit is set then char arrays and other arrays
>> are ordered first in the stack.
>> About this reordering of the non char arrays, shouldn't all to-be
>> protected functions have the full re-ordering? If not, for
>> completeness, I should make that new flag -fstack-protector-explicit
>> not to order the non-char arrays, and create a new -strong
>> counterpart, namely -fstack-protector-explicit-strong which does.
>>
>> Additionally, I think that the behavior of the flag
>> -fstack-protector-strong lacked the re-ordering of non char arrays
>> (named phase 2) so I added the reordering also for such flag.
>> Current tests pass after applying this patch, plus the tests specially
>> added.
>> Please commit it for me if OK since I don't have write access.
>>
>> Changelog:
>> 2013-11-18 Marcos Diaz 
>
> [ ... ]
> Before doing any significant review on this work I have to ask, do you have
> a copyright assignment on file with the FSF and any necessary paperwork from
> your employer?
>
> Jeff
>
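A minimal sketch of how the proposed attribute might be used, assuming the
attribute name stack_protect from the patch description above.  The
STACK_PROTECT macro and the function copy_name_length are hypothetical names
made up for illustration; on a compiler that does not recognize the attribute
the macro expands to nothing, so the code still builds.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Use the attribute only where the compiler reports support for it;
   otherwise STACK_PROTECT expands to nothing.  */
#if defined(__has_attribute)
# if __has_attribute(stack_protect)
#  define STACK_PROTECT __attribute__((stack_protect))
# endif
#endif
#ifndef STACK_PROTECT
# define STACK_PROTECT
#endif

/* Hypothetical example function: copies NAME into a small local buffer
   (truncating if needed) and returns the length actually stored.  The
   attribute requests stack protection for this function specifically.  */
static STACK_PROTECT size_t
copy_name_length (const char *name)
{
  char buf[16];

  assert (name != NULL);
  strncpy (buf, name, sizeof buf - 1);
  buf[sizeof buf - 1] = '\0';   /* strncpy does not always terminate */
  return strlen (buf);
}
```

With the proposed -fstack-protector-explicit, only functions marked this way
would be protected; with -fstack-protector or -fstack-protector-strong the
attribute would force protection in addition to the functions those options
already select.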





[PATCH, MPX, 2/X] Pointers Checker. Add flag to varpool_node

2013-11-19 Thread Ilya Enkovich
Hi,

Here is a patch that adds a flag marking symbols as requiring static
initialization of bounds.  It is used by the Pointer Bounds Checker to
handle statically initialized pointers and static bounds variables.

Thanks,
Ilya
--
2013-11-19  Ilya Enkovich  

* cgraph.h (varpool_node): Add need_bounds_init field.
* lto-cgraph.c (lto_output_varpool_node): Output
need_bounds_init value.
(input_varpool_node): Read need_bounds_init value.
* varpool.c (dump_varpool_node): Dump need_bounds_init field.
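
The streaming part of this change relies on the writer and the reader handling
the new bit at the same position in the bit stream.  The following is a
simplified sketch of that idea only; struct bitpack, bp_pack, bp_unpack,
struct var_flags, write_flags and read_flags are hypothetical stand-ins, not
GCC's streamer API, and the set of flags is reduced for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical miniature bit packer: one 64-bit word and a cursor.  */
struct bitpack
{
  uint64_t word;
  unsigned pos;
};

/* Append the low NBITS bits of VALUE at the current position.  */
static void
bp_pack (struct bitpack *bp, unsigned value, unsigned nbits)
{
  assert (nbits > 0 && nbits < 32 && bp->pos + nbits <= 64);
  uint64_t mask = (UINT64_C (1) << nbits) - 1;
  bp->word |= ((uint64_t) value & mask) << bp->pos;
  bp->pos += nbits;
}

/* Read NBITS bits from the current position and advance.  */
static unsigned
bp_unpack (struct bitpack *bp, unsigned nbits)
{
  assert (nbits > 0 && nbits < 32 && bp->pos + nbits <= 64);
  uint64_t mask = (UINT64_C (1) << nbits) - 1;
  unsigned value = (unsigned) ((bp->word >> bp->pos) & mask);
  bp->pos += nbits;
  return value;
}

/* Reduced, illustrative set of per-variable flags.  */
struct var_flags
{
  unsigned output : 1;
  unsigned in_other_partition : 1;
  unsigned need_bounds_init : 1;
};

/* Writer: the new flag is packed after the existing ones.  */
static uint64_t
write_flags (const struct var_flags *f)
{
  struct bitpack bp = { 0, 0 };
  bp_pack (&bp, f->output, 1);
  bp_pack (&bp, f->in_other_partition, 1);
  bp_pack (&bp, f->need_bounds_init, 1);
  return bp.word;
}

/* Reader: unpacks in exactly the order the writer used.  */
static struct var_flags
read_flags (uint64_t word)
{
  struct bitpack bp = { word, 0 };
  struct var_flags f;
  f.output = bp_unpack (&bp, 1);
  f.in_other_partition = bp_unpack (&bp, 1);
  f.need_bounds_init = bp_unpack (&bp, 1);
  return f;
}
```

If the reader unpacked the new bit at a different point than the writer packed
it, every later field would be shifted by one bit, which is why the two hunks
in lto-cgraph.c have to stay in step.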


diff --git a/gcc/cgraph.h b/gcc/cgraph.h
index 1ac6dfb..31c3635 100644
--- a/gcc/cgraph.h
+++ b/gcc/cgraph.h
@@ -520,6 +520,10 @@ class GTY((tag ("SYMTAB_VARIABLE"))) varpool_node : public 
symtab_node {
 public:
   /* Set when variable is scheduled to be assembled.  */
   unsigned output : 1;
+
+  /* Set when variable has statically initialized pointer
+     or is a static bounds variable and needs initialization.  */
+  unsigned need_bounds_init : 1;
 };
 
 /* Every top level asm statement is put into a asm_node.  */
diff --git a/gcc/lto-cgraph.c b/gcc/lto-cgraph.c
index 99dbf96..0d3479d 100644
--- a/gcc/lto-cgraph.c
+++ b/gcc/lto-cgraph.c
@@ -579,6 +579,7 @@ lto_output_varpool_node (struct lto_simple_output_block 
*ob, struct varpool_node
 && boundary_p && !DECL_EXTERNAL (node->decl), 1);
  /* in_other_partition.  */
 }
+  bp_pack_value (&bp, node->need_bounds_init, 1);
   streamer_write_bitpack (&bp);
   if (node->same_comdat_group && !boundary_p)
 {
@@ -1149,6 +1150,7 @@ input_varpool_node (struct lto_file_decl_data *file_data,
   node->analyzed = bp_unpack_value (&bp, 1);
   node->used_from_other_partition = bp_unpack_value (&bp, 1);
   node->in_other_partition = bp_unpack_value (&bp, 1);
+  node->need_bounds_init = bp_unpack_value (&bp, 1);
   if (node->in_other_partition)
 {
   DECL_EXTERNAL (node->decl) = 1;
diff --git a/gcc/varpool.c b/gcc/varpool.c
index 1e4c823..471db82 100644
--- a/gcc/varpool.c
+++ b/gcc/varpool.c
@@ -204,6 +204,8 @@ dump_varpool_node (FILE *f, struct varpool_node *node)
 fprintf (f, " initialized");
   if (node->output)
 fprintf (f, " output");
+  if (node->need_bounds_init)
+fprintf (f, " need-bounds-init");
   if (TREE_READONLY (node->decl))
 fprintf (f, " read-only");
   if (ctor_for_folding (node->decl) != error_mark_node)


Re: [gomp4 simd, RFC] Simple fix to override vectorization cost estimation.

2013-11-19 Thread Richard Biener
On Tue, 19 Nov 2013, Sergey Ostanevich wrote:

> :) agree to you, but as soon as you're a user who tries to introduce
> vector code and face a bug in cost model you'd like to have a
> workaround until the bug will be fixed and compiler will come to you
> with new OS distribution, don't you?
> 
> I propose the following, yet SLP have to use a NULL as a loop info
> which looks somewhat hacky.

I think this is overengineering.  -fvect-cost-model will do as a
workaround.  And the name -fsimd-vect-cost-model contains what I
consider a duplication: "simd" and "vect".

Richard.


Re: [gomp4 simd, RFC] Simple fix to override vectorization cost estimation.

2013-11-19 Thread Jakub Jelinek
On Tue, Nov 19, 2013 at 03:07:52PM +0100, Richard Biener wrote:
> On Tue, 19 Nov 2013, Sergey Ostanevich wrote:
> 
> > :) agree to you, but as soon as you're a user who tries to introduce
> > vector code and face a bug in cost model you'd like to have a
> > workaround until the bug will be fixed and compiler will come to you
> > with new OS distribution, don't you?
> > 
> > I propose the following, yet SLP have to use a NULL as a loop info
> > which looks somewhat hacky.
> 
> I think this is overengineering.  -fvect-cost-model will do as
> workaround.  And -fsimd-vect-cost-model has what I consider
> duplicate - "simd" and "vect".

I think it is a good idea, though I agree about s/simd-vect/simd/ and
I'd use VECT_COST_MODEL_DEFAULT as the default, which would mean
just use -fvect-cost-model.
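
A rough sketch of the selection logic this would give, under stated
assumptions: the enum, struct loop and the two flag variables below are
simplified stand-ins for the GCC internals, the name flag_simd_cost_model
follows the s/simd-vect/simd/ renaming, and check_cost_model_selection is a
hypothetical helper added only to exercise the cases.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins; the real definitions live in GCC.  */
enum vect_cost_model
{
  VECT_COST_MODEL_UNLIMITED,
  VECT_COST_MODEL_CHEAP,
  VECT_COST_MODEL_DYNAMIC,
  VECT_COST_MODEL_DEFAULT
};

struct loop
{
  bool force_vect;   /* set for loops marked with #pragma omp simd */
};

static enum vect_cost_model flag_vect_cost_model = VECT_COST_MODEL_DYNAMIC;
static enum vect_cost_model flag_simd_cost_model = VECT_COST_MODEL_DEFAULT;

/* Return true if the cost model is unlimited for LOOP.  A simd-marked loop
   uses the simd-specific setting unless that is left at DEFAULT, in which
   case it falls back to the general setting.  LOOP may be NULL (the SLP
   case), which always uses the general setting.  */
static bool
unlimited_cost_model (const struct loop *loop)
{
  if (loop != NULL && loop->force_vect
      && flag_simd_cost_model != VECT_COST_MODEL_DEFAULT)
    return flag_simd_cost_model == VECT_COST_MODEL_UNLIMITED;
  return flag_vect_cost_model == VECT_COST_MODEL_UNLIMITED;
}

/* Exercise the interesting combinations.  */
static void
check_cost_model_selection (void)
{
  struct loop simd_loop = { true };
  struct loop plain_loop = { false };

  /* No simd-specific setting: every loop follows the general flag.  */
  flag_vect_cost_model = VECT_COST_MODEL_DYNAMIC;
  flag_simd_cost_model = VECT_COST_MODEL_DEFAULT;
  assert (!unlimited_cost_model (&simd_loop));
  assert (!unlimited_cost_model (&plain_loop));

  /* simd-specific setting: only the marked loop is affected.  */
  flag_simd_cost_model = VECT_COST_MODEL_UNLIMITED;
  assert (unlimited_cost_model (&simd_loop));
  assert (!unlimited_cost_model (&plain_loop));
  assert (!unlimited_cost_model (NULL));
}
```

With DEFAULT as the initial value, users who never pass the simd-specific
option see no change in behaviour, and the option only acts as an override
for the loops they explicitly marked.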

> > @@ -2929,6 +2929,11 @@ vect_estimate_min_profitable_iters
> > (loop_vec_info loop_vinfo,
> >/* vector version will never be profitable.  */
> >else
> >  {
> > +  if (LOOP_VINFO_LOOP (loop_vinfo)->force_vect)
> > +{
> > +  pedwarn (vect_location, 0, "Vectorization did not happen
> > for the loop");
> > +}

pedwarn isn't really desirable for this; you want just a warning,
and one that can actually be turned off as well, e.g.
-Wopenmp-simd (which we'd also use when we ignore #pragma omp declare simd
because it wasn't useful/desirable).

Jakub


Re: [PATCH] Fix PR58115

2013-11-19 Thread H.J. Lu
On Sun, Nov 3, 2013 at 2:25 AM, Bernd Edlinger
 wrote:
> Hello,
>
> on i686-pc-linux-gnu the test case gcc.target/i386/intrinsics_4.c fails 
> because of
> an internal compiler error, see PR58155.
>
> The reason for this is that the optab CODE_FOR_movv8sf is disabled when it
> should be enabled.
>
> This happens because invoke_set_current_function_hook changes the pointer
> "this_fn_optabs" after targetm.set_current_function has already modified the
> optab to enable/disable CODE_FOR_movv8sf, leaving that optab entry
> in an undefined state.
>
> Boot-strapped and regression-tested on i686-pc-linux-gnu.
>
> Ok for trunk?
>
> Regards
> Bernd.

Are you sure your patch is for

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58155


-- 
H.J.


Re: [gomp4 simd, RFC] Simple fix to override vectorization cost estimation.

2013-11-19 Thread Sergey Ostanevich
On Tue, Nov 19, 2013 at 6:07 PM, Richard Biener  wrote:
> On Tue, 19 Nov 2013, Sergey Ostanevich wrote:
>
>> :) agree to you, but as soon as you're a user who tries to introduce
>> vector code and face a bug in cost model you'd like to have a
>> workaround until the bug will be fixed and compiler will come to you
>> with new OS distribution, don't you?
>>
>> I propose the following, yet SLP have to use a NULL as a loop info
>> which looks somewhat hacky.
>
> I think this is overengineering.  -fvect-cost-model will do as
> workaround.  And -fsimd-vect-cost-model has what I consider
> duplicate - "simd" and "vect".

I just wanted to separate the autovectorized loops from the ones the user
wants to vectorize.  -fvect-cost-model will force everything at once.
That's the reason to introduce simd-vect, since the pragma name
is simd.


RE: [PATCH] Fix PR58115

2013-11-19 Thread Bernd Edlinger
On Tue, 19 Nov 2013 06:21:22, H.J. Lu wrote:
>
> On Sun, Nov 3, 2013 at 2:25 AM, Bernd Edlinger
>  wrote:
>> Hello,
>>
>> on i686-pc-linux-gnu the test case gcc.target/i386/intrinsics_4.c fails 
>> because of
>> an internal compiler error, see PR58155.
>>
>> The reason for this is that the optab CODE_FOR_movv8sf is disabled when it
>> should be enabled.
>>
>> This happens because invoke_set_current_function_hook changes the pointer
>> "this_fn_optabs" after targetm.set_current_function has already modified the
>> optab to enable/disable CODE_FOR_movv8sf, leaving that optab entry
>> in an undefined state.
>>
>> Boot-strapped and regression-tested on i686-pc-linux-gnu.
>>
>> Ok for trunk?
>>
>> Regards
>> Bernd.
>
> Are you sure your patch is for
>
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58155

Oh. Sorry.

I meant http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58115


Thanks
Bernd.

>
>
> --
> H.J.

Re: [PATCH i386 8/8] [AVX-512] Add SHA support.

2013-11-19 Thread Uros Bizjak
On Tue, Nov 19, 2013 at 1:41 PM, Kirill Yukhin  wrote:
> Hello,
> This patch introduces new SHA instructions described in [1]
> along with tests.
>
> Testing:
>   1. Bootstrap pass.
>   2. make check shows no regressions.
>   3. Spec 2000 & 2006 build show no regressions both with and without 
> -mavx512f option.
>   4. Spec 2000 & 2006 run shows no stability regressions without -mavx512f 
> option.
>
> ChangeLog entry:
> 2013-11-18  Alexander Ivchenko  
> Maxim Kuznetsov  
> Sergey Lega  
> Anna Tikhonova  
> Ilya Tocar  
> Andrey Turetskiy  
> Ilya Verbin  
> Kirill Yukhin  
> Michael Zolotukhin  
>
> * common/config/i386/i386-common.c (OPTION_MASK_ISA_SHA_SET): New.
> (OPTION_MASK_ISA_SHA_UNSET): Ditto.
> (ix86_handle_option): Handle OPT_msha.
> * config.gcc (extra_headers): Add shaintrin.h.
> * config/i386/cpuid.h (bit_SHA): New.
> * config/i386/driver-i386.c (host_detect_local_cpu): Detect SHA
> instructions.
> * config/i386/i386-c.c (ix86_target_macros_internal): Handle
> OPTION_MASK_ISA_SHA.
> * config/i386/i386.c (ix86_target_string): Add -msha.
> (ix86_option_override_internal): Add PTA_SHA.
> (ix86_valid_target_attribute_inner_p): Handle OPT_msha.
> (enum ix86_builtins): Add IX86_BUILTIN_SHA1MSG1,
> IX86_BUILTIN_SHA1MSG2, IX86_BUILTIN_SHA1NEXTE, IX86_BUILTIN_SHA1RNDS4,
> IX86_BUILTIN_SHA256MSG1, IX86_BUILTIN_SHA256MSG2,
> IX86_BUILTIN_SHA256RNDS2.
> (bdesc_args): Add BUILTINS defined above.
> (ix86_init_mmx_sse_builtins): Add __builtin_ia32_sha1msg1,
> __builtin_ia32_sha1msg2, __builtin_ia32_sha1nexte,
> __builtin_ia32_sha1rnds4, __builtin_ia32_sha256msg1,
> __builtin_ia32_sha256msg2, __builtin_ia32_sha256rnds2.
> (ix86_expand_args_builtin): Handle V4SI_FTYPE_V4SI_V4SI_V4SI, add
> warning for CODE_FOR_sha1rnds4.
> * config/i386/i386.h (TARGET_SHA): New.
> (TARGET_SHA_P): Ditto.
> * config/i386/i386.opt (-msha): Document it.
> * config/i386/immintrin.h: Add shaintrin.h.
> * config/i386/shaintrin.h: New.
> * config/i386/sse.md (unspec): Add UNSPEC_SHA1MSG1, UNSPEC_SHA1MSG2,
> UNSPEC_SHA1NEXTE, UNSPEC_SHA1RNDS4, UNSPEC_SHA256MSG1,
> UNSPEC_SHA256MSG2, UNSPEC_SHA256RNDS2.
> (sha1msg1): New.
> (sha1msg2): Ditto.
> (sha1nexte): Ditto.
> (sha1rnds4): Ditto.
> (sha256msg1): Ditto.
> (sha256msg2): Ditto.
> (sha256rnds2): Ditto.
> * doc/invoke.texi: Add -msha, -mno-sha.
>
> testsuite/ChangeLog entry:
> 2013-11-18  Alexander Ivchenko  
> Maxim Kuznetsov  
> Sergey Lega  
> Anna Tikhonova  
> Ilya Tocar  
> Andrey Turetskiy  
> Ilya Verbin  
> Kirill Yukhin  
> Michael Zolotukhin  
>
> * gcc.target/i386/avx-1.c: Add define for __builtin_ia32_sha1rnds4.
> * gcc.target/i386/i386.exp (check_effective_target_sha): New.
> * gcc.target/i386/sha-check.h: New file.
> * gcc.target/i386/sha1msg1-1.c: Ditto.
> * gcc.target/i386/sha1msg1-2.c: Ditto.
> * gcc.target/i386/sha1msg2-1.c: Ditto.
> * gcc.target/i386/sha1msg2-2.c: Ditto.
> * gcc.target/i386/sha1nexte-1: Ditto.
> * gcc.target/i386/sha1nexte-2: Ditto.
> * gcc.target/i386/sha1rnds4-1.c: Ditto.
> * gcc.target/i386/sha1rnds4-2.c: Ditto.
> * gcc.target/i386/sha256msg1-1.c: Ditto.
> * gcc.target/i386/sha256msg1-2.c: Ditto.
> * gcc.target/i386/sha256msg2-1.c: Ditto.
> * gcc.target/i386/sha256msg2-2.c: Ditto.
> * gcc.target/i386/sha256rnds2-1.c: Ditto.
> * gcc.target/i386/sha256rnds2-2.c: Ditto.
> * gcc.target/i386/sse-13.c: Add __builtin_ia32_sha1rnds4.
> * gcc.target/i386/sse-14.c: Add _mm_sha1rnds4_epu32.
> * gcc.target/i386/sse-22.c: Ditto.
> * gcc.target/i386/sse-23.c: Add __builtin_ia32_sha1rnds4.

Please also add new command options to g++.dg/other/sse-2.C and
g++.dg/other/sse-3.C

OK with the small nit below and with the above testsuite addition.

> Patch in the bottom.
>
> Is it ok for trunk?
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -663,7 +663,7 @@ Objective-C and Objective-C++ Dialects}.
>  -mrecip -mrecip=@var{opt} @gol
>  -mvzeroupper -mprefer-avx128 @gol
>  -mmmx  -msse  -msse2 -msse3 -mssse3 -msse4.1 -msse4.2 -msse4 -mavx @gol
> --mavx2 -mavx512f -mavx512pf -mavx512er -mavx512cd @gol
> +-mavx2 -mavx512f -mavx512pf -mavx512er -mavx512cd -msha -mno-sha @gol

No need to document negative option here.

Thanks,
Uros.


[PATCH, PR 57748] Check for out of bounds access, Part 2

2013-11-19 Thread Bernd Edlinger
Hello,


this is a minor update to my previous version of this patch (using a boolean
expand_reference instead of adding a new expand_modifier enum value):

I forgot to pass down the expand_reference value at the second expand_expr call
inside the case VIEW_CONVERT_EXPR. Sorry for the inconvenience.



@@ -10219,7 +10229,8 @@ expand_expr_real_1 (tree exp, rtx target, enum mac
   }

   if (!op0)
-   op0 = expand_expr (treeop0, NULL_RTX, VOIDmode, modifier);
+   op0 = expand_expr_real (treeop0, NULL_RTX, VOIDmode, modifier,
+   NULL, expand_reference);

   /* If the input and output modes are both the same, we are done.  */
   if (mode == GET_MODE (op0))


Boot-strapped and regression-tested on X86_64-pc-linux-gnu.

Ok for trunk?


Thanks
Bernd.


> Date: Thu, 7 Nov 2013 13:58:55 +0100
>
> oops - this time with attachments...
>
>
>> Hi,
>>
>> On Fri, 25 Oct 2013 12:51:13, Richard Biener wrote:
>>>
>>> On Fri, Oct 25, 2013 at 12:02 PM, Bernd Edlinger
>>>  wrote:
>>>> Hi,
>>>>
>>>>> Eh ... even
>>>>>
>>>>> register struct { int i; int a[0]; } asm ("ebx");
>>>>>
>>>>> works. Also with int a[1] but not with a[2]. So just handling trailing
>>>>> arrays makes this case regress as well.
>>>>>
>>>>> Now I'd call this use questionable ... but we've likely supported that
>>>>> for decades and cannot change that now.
>>>>>
>>>>> Back to fixing everything in expand.
>>>>>
>>>>> Richard.
>>>>>
>>>>
>>>> Ok, finally you asked for it.
>>>>
>>>> Here is my previous version of that patch again.
>>>>
>>>> I have now added a new value "EXPAND_REFERENCE" to the expand_modifier
>>>> enumeration. It is almost like EXPAND_MEMORY but it does not interfere with
>>>> constant values.
>>>>
>>>> I have done the same modification to VIEW_CONVERT_EXPR too, because
>>>> this is a possible inner reference, itself. It is however inherently hard
>>>> to test around this code.
>>>>
>>>> To understand this patch it is good to know what type of object the
>>>> return value "tem" of get_inner_reference can be.
>>>>
>>>> From the program logic at get_inner_reference it is clear that the
>>>> return value may *not* be BIT_FIELD_REF, COMPONENT_REF, ARRAY_REF,
>>>> ARRAY_RANGE_REF, REALPART_EXPR, IMAGPART_EXPR. The result may
>>>> be VIEW_CONVERT_EXPR only on a STRICT_ALIGNMENT target. This is probably
>>>> further restricted because exp is gimplified.
>>>>
>>>> Usually the result will be a MEM_REF or a SSA_NAME of the memory where
>>>> the structure is to be found.
>>>>
>>>> When you look at where EXPAND_MEMORY is handled you see it is special-cased
>>>> in TARGET_MEM_REF, MEM_REF, ARRAY_REF, COMPONENT_REF, BIT_FIELD_REF,
>>>> ARRAY_RANGE_REF.
>>>>
>>>> At TARGET_MEM_REF, MEM_REF, VIEW_CONVERT_EXPR, it should be the
>>>> same if EXPAND_MEMORY, EXPAND_WRITE or EXPAND_REFERENCE is given:
>>>> If it is an unaligned memory, we just return the unaligned reference.
>>>>
>>>> This was missing for VIEW_CONVERT_EXPR, and unfortunately I have no test
>>>> case, because it is only a problem for STRICT_ALIGNMENT targets, and even
>>>> there it will certainly be really hard to find test cases that exercise
>>>> this code.
>>>>
>>>> In ARRAY_REF, COMPONENT_REF, BIT_FIELD_REF, ARRAY_RANGE_REF
>>>> we do not have to touch the handling of the outer modifier. However we pass
>>>> EXPAND_REFERENCE to the inner object, which should not be a recursive
>>>> use of any ARRAY_REF, COMPONENT_REF, BIT_FIELD_REF, ARRAY_RANGE_REF.
>>>>
>>>> TARGET_MEM_REF, MEM_REF and VIEW_CONVERT_EXPR know how to handle
>>>> EXPAND_REFERENCE, anything else handles it like EXPAND_NORMAL.
>>>>
>>>>
>>>> Boot-strapped and regression-tested on x86_64-linux-gnu
>>>> OK for trunk?
>>>
>>> You point to a weak spot in expansion - that it handles creating
>>> the base MEM to offset with handled components by recursing
>>> into the case that handles bare MEM_REFs. This makes the
>>> bare MEM_REF handling code somewhat awkward (it's the
>>> one to assign mem-attrs which are later adjusted for example).
>>>
>>> Maybe a better approach than adding yet another expand
>>> modifier would be to split out the "base MEM" expansion part
>>> out of the bare MEM_REF handling code so we can call that
>>> instead of recursing.
>>>
>>> In this light - instead of a new expand modifier don't you want
>>> an actual flag that specifies we are coming from a call that
>>> wants to expand a base? That is, allow EXPAND_SUM
>>> but with the recursion flag set?
>>>
>>
>> I think you are right. After some thought, I start to like that idea.
>>
>> This way we have at least much more flexibility, how to handle the inner
>> references correctly, and if I change only the interfaces of
>> expand_expr_real/real_1 that will not be used at too many places, either.
>>
>>> Finally I think the recursion into the VIEW_CONVERT_EXPR case
>>> is only there because of the keep_aligning flag of get_inner_reference
>>> which should be obsolete now that we properly handle 

Re: [gomp4 simd, RFC] Simple fix to override vectorization cost estimation.

2013-11-19 Thread Sergey Ostanevich
>> > I propose the following, yet SLP have to use a NULL as a loop info
>> > which looks somewhat hacky.
>>
>> I think this is overengineering.  -fvect-cost-model will do as
>> workaround.  And -fsimd-vect-cost-model has what I consider
>> duplicate - "simd" and "vect".
>
> I think it is a good idea, though I agree about s/simd-vect/simd/ and
> I'd use VECT_COST_MODEL_DEFAULT as the default, which would mean
> just use -fvect-cost-model.

that's ok, since we'd have a way to force those 'simd' loops.

>
>> > @@ -2929,6 +2929,11 @@ vect_estimate_min_profitable_iters
>> > (loop_vec_info loop_vinfo,
>> >/* vector version will never be profitable.  */
>> >else
>> >  {
>> > +  if (LOOP_VINFO_LOOP (loop_vinfo)->force_vect)
>> > +{
>> > +  pedwarn (vect_location, 0, "Vectorization did not happen
>> > for the loop");
>> > +}
>
> pedwarn isn't really desirable for this, you want just warning,
> but some warning you can actually also turn off.
> -Wopenmp-simd (and we'd use it also when we ignore #pragma omp declare simd
> because it wasn't useful/desirable).

consider a user is interested in enabling warning-as-error for this case?
can we disable the pedwarn the same way?

Sergos

>
> Jakub


Re: [gomp4 simd, RFC] Simple fix to override vectorization cost estimation.

2013-11-19 Thread Jakub Jelinek
On Tue, Nov 19, 2013 at 06:39:48PM +0400, Sergey Ostanevich wrote:
> > pedwarn isn't really desirable for this, you want just warning,
> > but some warning you can actually also turn off.
> > -Wopenmp-simd (and we'd use it also when we ignore #pragma omp declare simd
> > because it wasn't useful/desirable).
> 
> consider a user is interested in enabling warning-as-error for this case?

-Werror=openmp-simd will work then, this works for any named warnings.

> can we disable the pedwarn the same way?

pedwarn is for pedantic warnings, no standard says that #pragma omp simd
must be vectorized, or that #pragma omp simd or #pragma omp declare simd
is anything but an optimization hint, so pedwarn isn't what you are looking
for.

Jakub
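
To make the point about named warnings concrete, here is a minimal sketch of a loop a user might mark with #pragma omp simd. The function name saxpy is purely illustrative. The warning name -Wopenmp-simd is the one proposed in this thread, so the command line in the comment assumes it ends up implemented under that name.

```cpp
#include <cstddef>

// Illustrative only: a loop the user asks to be vectorized.
// Assumed invocation, with the warning name proposed in this thread:
//   g++ -O2 -fopenmp-simd -Wopenmp-simd -Werror=openmp-simd saxpy.cc
// Without OpenMP support enabled the pragma is simply ignored and the
// loop still computes the same result.
void saxpy (std::size_t n, float a, const float *x, float *y)
{
#pragma omp simd
  for (std::size_t i = 0; i < n; ++i)
    y[i] = a * x[i] + y[i];
}
```

Since the pragma is only an optimization hint, the result is the same whether or not the loop is vectorized. A named warning just lets the user choose whether a missed vectorization is silent, a warning, or an error.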


Re: [PATCH] Use libbacktrace as libsanitizer's symbolizer

2013-11-19 Thread Ian Lance Taylor
On Mon, Nov 18, 2013 at 11:44 PM, Jakub Jelinek  wrote:
> On Tue, Nov 19, 2013 at 05:32:12PM +1030, Alan Modra wrote:
>> On Tue, Nov 19, 2013 at 06:17:41AM +0100, Hans-Peter Nilsson wrote:
>> > In file included from /tmp/x/gcc/libbacktrace/atomic.c:37:
>> > /tmp/x/gcc/libbacktrace/internal.h:182: error: expected declaration 
>> > specifiers or '...' before 'off_t'
>> > make[3]: *** [atomic.lo] Error 1
>> >
>> > brgds, H-P
>> > PS. Host is Fedora 12, x86_64.
>>
>> Likewise on powerpc-linux.  Fixed here by #include <sys/types.h> in
>> atomic.c.
>
> Given:
> /* We assume that <sys/types.h> and "backtrace.h" have already been
>included.  */
> comment at the start of internal.h, I've committed following fix as obvious.
> All other libbacktrace source files that include internal.h include both
> sys/types.h and backtrace.h before internal.h.
>
> 2013-11-19  Jakub Jelinek  
>
> * atomic.c: Include sys/types.h.

Thanks.  My apologies for the breakage.  I thought I explicitly tested
that case, but evidently I somehow messed up.

Ian


Re: [PATCH] Use libbacktrace as libsanitizer's symbolizer

2013-11-19 Thread Ian Lance Taylor
On Tue, Nov 19, 2013 at 12:04 AM, Jakub Jelinek  wrote:
> On Mon, Nov 18, 2013 at 09:09:03AM -0800, Ian Lance Taylor wrote:
>> > 2) for tsan querying of data symbols, apparently the classes want to see
>> >not just the symbol name and start value, but also size.  libbacktrace
>> >has all this info available, just doesn't pass it down to the callback.
>> >I wonder if we'd need to create yet another libbacktrace entrypoint, or
>> >if it would be acceptable to do source code incompatible, ABI (at least
>> >on all sane targets) compatible version of just adding another
>> >uintptr_t symsize argument to backtrace_syminfo_callback.
>>
>> I think it would be fine to change the callback.  I doubt that
>> libbacktrace is so widely used that we need to worry about backward
>> compatibility at this stage.  In particular I imagine that any users
>> of libbacktrace are simply copying the source code, since there is no
>> installable package.
>
> So how about this?  Due to the CLA etc. I have not done the obvious change
> to libgo/runtime/go-caller.c (syminfo_callback) that is needed together with
> that.
>
> 2013-11-19  Jakub Jelinek  
>
> * backtrace.h (backtrace_syminfo_callback): Add symsize argument.
> * elf.c (elf_syminfo): Pass 0 or sym->size to the callback as
> last argument.
> * btest.c (struct symdata): Add size field.
> (callback_three): Add symsize argument.  Copy it to the data->size
> field.
> (f23): Set symdata.size to 0.
> (test5): Likewise.  If sizeof (int) > 1, lookup address of
> ((uintptr_t) &global) + 1.  Verify symdata.val and symdata.size
> values.

This is OK.

Thanks.

I will take care of libgo when this is committed.

Ian


Re: [PATCH] Use libbacktrace as libsanitizer's symbolizer

2013-11-19 Thread Jakub Jelinek
On Tue, Nov 19, 2013 at 06:43:07AM -0800, Ian Lance Taylor wrote:
> > 2013-11-19  Jakub Jelinek  
> >
> > * backtrace.h (backtrace_syminfo_callback): Add symsize argument.
> > * elf.c (elf_syminfo): Pass 0 or sym->size to the callback as
> > last argument.
> > * btest.c (struct symdata): Add size field.
> > (callback_three): Add symsize argument.  Copy it to the data->size
> > field.
> > (f23): Set symdata.size to 0.
> > (test5): Likewise.  If sizeof (int) > 1, lookup address of
> > ((uintptr_t) &global) + 1.  Verify symdata.val and symdata.size
> > values.
> 
> This is OK.
> 
> I will take care of libgo when this is committed.

Ok, thanks, in now (== r205028).

Jakub
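
For readers following along, a rough sketch of what a callback looks like with the extra symsize argument. The typedef mirrors the shape of backtrace_syminfo_callback after this change. Everything else is hypothetical and not part of libbacktrace: fake_syminfo is only a stand-in so the sketch runs without the library, and symdata, record_symbol and address_in_symbol are illustrative names.

```cpp
#include <cstdint>
#include <cstring>

using std::uintptr_t;

// Same shape as backtrace_syminfo_callback after the change: the last
// argument is the new symbol size.
typedef void (*syminfo_callback_t) (void *data, uintptr_t pc,
                                    const char *symname,
                                    uintptr_t symval, uintptr_t symsize);

// Hypothetical record filled in by the callback (loosely modelled on the
// symdata struct mentioned in the ChangeLog for btest.c).
struct symdata
{
  const char *name;
  uintptr_t val;
  uintptr_t size;
};

// Callback: copy name, start value and size into the caller's record.
void record_symbol (void *vdata, uintptr_t pc, const char *symname,
                    uintptr_t symval, uintptr_t symsize)
{
  (void) pc;
  symdata *d = static_cast<symdata *> (vdata);
  d->name = symname;
  d->val = symval;
  d->size = symsize;
}

int global;

// Hypothetical stand-in for backtrace_syminfo: it only knows about
// 'global'.  The real function consults the symbol table instead.
void fake_syminfo (uintptr_t addr, syminfo_callback_t cb, void *data)
{
  uintptr_t start = reinterpret_cast<uintptr_t> (&global);
  if (addr >= start && addr - start < sizeof global)
    cb (data, addr, "global", start, sizeof global);
  else
    cb (data, addr, nullptr, 0, 0);
}

// With the size available, a caller can check that ADDR really lies
// inside the reported symbol rather than merely after its start.
bool address_in_symbol (const symdata &d, uintptr_t addr)
{
  return d.name != nullptr && addr >= d.val && addr - d.val < d.size;
}
```

Looking up ((uintptr_t) &global) + 1, as the updated test5 does, should report the start of global as the value and sizeof (int) as the size.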


Please don't commit changes to gcc/go/gofrontend

2013-11-19 Thread Ian Lance Taylor
Hi, as noted in gcc/go/README.gcc, the files in gcc/go/gofrontend are
actually mirrored from a different repository.  Please do not directly
commit changes to those files.  Instead, send the changes to me.  I
will commit them upstream.  Thanks.

Ian


[PATCH i386] Enable -freorder-blocks-and-partition

2013-11-19 Thread Teresa Johnson
This patch enables -freorder-blocks-and-partition by default for x86
at -O2 and up. It is showing some modest gains in cpu2006 performance
with profile feedback and -O2 on an Intel Westmere system. Specifically,
I am seeing consistent improvements in 401.bzip2 (1.5-3%), 483.xalancbmk
(1.5-3%), and 453.povray (2.5-3%), and no apparent regressions.

Bootstrapped and tested on x86-64-unknown-linux-gnu with a normal
bootstrap, a profiledbootstrap and an LTO profiledbootstrap. All were
configured with --enable-languages=all,obj-c++ and tested for both
32 and 64-bit with RUNTESTFLAGS="--target_board=unix\{-m32,-m64\}".

It would be good to enable this for additional targets as a follow on,
but it needs more testing for both correctness and performance on those
other targets (i.e., for correctness because I see a number of places
in other config/*/*.c files that do some special handling under this
option for different targets or simply disable it, so I am not sure
how well-tested it is under different architectural constraints).

Ok for trunk?

Thanks,
Teresa

2013-11-19  Teresa Johnson  

* common/config/i386/i386-common.c: Enable
-freorder-blocks-and-partition at -O2 and up for x86.
* opts.c (finish_options): Only warn if -freorder-blocks-and-
partition was set on command line.

Index: common/config/i386/i386-common.c
===
--- common/config/i386/i386-common.c(revision 205001)
+++ common/config/i386/i386-common.c(working copy)
@@ -789,6 +789,8 @@ static const struct default_options ix86_option_op
   {
 /* Enable redundant extension instructions removal at -O2 and higher.  */
 { OPT_LEVELS_2_PLUS, OPT_free, NULL, 1 },
+/* Enable function splitting at -O2 and higher.  */
+{ OPT_LEVELS_2_PLUS, OPT_freorder_blocks_and_partition, NULL, 1 },
 /* Turn off -fschedule-insns by default.  It tends to make the
problem with not enough registers even worse.  */
 { OPT_LEVELS_ALL, OPT_fschedule_insns, NULL, 0 },
Index: opts.c
===
--- opts.c  (revision 205001)
+++ opts.c  (working copy)
@@ -737,9 +737,10 @@ finish_options (struct gcc_options *opts, struct g
   && opts->x_flag_reorder_blocks_and_partition
   && (ui_except == UI_SJLJ || ui_except >= UI_TARGET))
 {
-  inform (loc,
- "-freorder-blocks-and-partition does not work "
- "with exceptions on this architecture");
+  if (opts_set->x_flag_reorder_blocks_and_partition)
+inform (loc,
+"-freorder-blocks-and-partition does not work "
+"with exceptions on this architecture");
   opts->x_flag_reorder_blocks_and_partition = 0;
   opts->x_flag_reorder_blocks = 1;
 }
@@ -752,9 +753,10 @@ finish_options (struct gcc_options *opts, struct g
   && opts->x_flag_reorder_blocks_and_partition
   && (ui_except == UI_SJLJ || ui_except >= UI_TARGET))
 {
-  inform (loc,
- "-freorder-blocks-and-partition does not support "
- "unwind info on this architecture");
+  if (opts_set->x_flag_reorder_blocks_and_partition)
+inform (loc,
+"-freorder-blocks-and-partition does not support "
+"unwind info on this architecture");
   opts->x_flag_reorder_blocks_and_partition = 0;
   opts->x_flag_reorder_blocks = 1;
 }
@@ -769,9 +771,10 @@ finish_options (struct gcc_options *opts, struct g
  && targetm_common.unwind_tables_default
  && (ui_except == UI_SJLJ || ui_except >= UI_TARGET
 {
-  inform (loc,
- "-freorder-blocks-and-partition does not work "
- "on this architecture");
+  if (opts_set->x_flag_reorder_blocks_and_partition)
+inform (loc,
+"-freorder-blocks-and-partition does not work "
+"on this architecture");
   opts->x_flag_reorder_blocks_and_partition = 0;
   opts->x_flag_reorder_blocks = 1;
 }
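
The opts.c part of the patch boils down to this: still turn the option off where it is unsupported, but only tell the user about it if they asked for it explicitly. A simplified model of that logic, with made-up names (options, disable_partitioning), not the real gcc_options structure:

```cpp
// Hypothetical, heavily simplified stand-in for struct gcc_options.
struct options
{
  bool reorder_blocks_and_partition;
  bool reorder_blocks;
};

// OPTS holds the effective option values, OPTS_SET records which options
// were given explicitly on the command line.  Returns true if a note
// should be shown to the user.
bool disable_partitioning (options &opts, const options &opts_set,
                           bool unsupported_on_target)
{
  if (!unsupported_on_target || !opts.reorder_blocks_and_partition)
    return false;

  // Only complain when the user requested the option; stay silent when
  // it was merely enabled by default at -O2.
  bool user_requested = opts_set.reorder_blocks_and_partition;

  // Either way, fall back to plain block reordering.
  opts.reorder_blocks_and_partition = false;
  opts.reorder_blocks = true;
  return user_requested;
}
```

With the option now on by default at -O2, the unconditional inform would otherwise fire on every compilation for configurations that hit these checks.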


-- 
Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413


[PATCH] Fix PR59164

2013-11-19 Thread Richard Biener

This fixes PR59164 - a mismatch during vectorizer analysis and transform.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk
so far.

Richard.

2013-11-19  Richard Biener  

PR tree-optimization/59164
* tree-vect-loop-manip.c (vect_update_ivs_after_vectorizer):
Uncomment assert.
* tree-vect-loop.c (vect_analyze_loop_operations): Adjust
check whether we can create an epilogue loop to reflect the
cases where we create one.

* gcc.dg/torture/pr59164.c: New testcase.

Index: gcc/tree-vect-loop-manip.c
===
*** gcc/tree-vect-loop-manip.c  (revision 205009)
--- gcc/tree-vect-loop-manip.c  (working copy)
*** vect_update_ivs_after_vectorizer (loop_v
*** 1672,1678 
gimple_stmt_iterator gsi, gsi1;
basic_block update_bb = update_e->dest;
  
!   /* gcc_assert (vect_can_advance_ivs_p (loop_vinfo)); */
  
/* Make sure there exists a single-predecessor exit bb:  */
gcc_assert (single_pred_p (exit_bb));
--- 1667,1673 
gimple_stmt_iterator gsi, gsi1;
basic_block update_bb = update_e->dest;
  
!   gcc_checking_assert (vect_can_advance_ivs_p (loop_vinfo));
  
/* Make sure there exists a single-predecessor exit bb:  */
gcc_assert (single_pred_p (exit_bb));
Index: gcc/tree-vect-loop.c
===
*** gcc/tree-vect-loop.c(revision 205009)
--- gcc/tree-vect-loop.c(working copy)
*** vect_analyze_loop_operations (loop_vec_i
*** 1586,1609 
return false;
  }
  
!   if (LOOP_PEELING_FOR_ALIGNMENT (loop_vinfo)
|| ((int) tree_ctz (LOOP_VINFO_NITERS (loop_vinfo))
  < exact_log2 (vectorization_factor)))
  {
if (dump_enabled_p ())
! dump_printf_loc (MSG_NOTE, vect_location, "epilog loop required.\n");
!   if (!vect_can_advance_ivs_p (loop_vinfo))
  {
if (dump_enabled_p ())
dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
!"not vectorized: can't create epilog loop 1.\n");
!   return false;
! }
!   if (!slpeel_can_duplicate_loop_p (loop, single_exit (loop)))
! {
!   if (dump_enabled_p ())
!   dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
!"not vectorized: can't create epilog loop 2.\n");
return false;
  }
  }
--- 1586,1604 
return false;
  }
  
!   if (LOOP_VINFO_PEELING_FOR_GAPS (loop_vinfo)
|| ((int) tree_ctz (LOOP_VINFO_NITERS (loop_vinfo))
  < exact_log2 (vectorization_factor)))
  {
if (dump_enabled_p ())
! dump_printf_loc (MSG_NOTE, vect_location, "epilog loop required\n");
!   if (!vect_can_advance_ivs_p (loop_vinfo)
! || !slpeel_can_duplicate_loop_p (loop, single_exit (loop)))
  {
if (dump_enabled_p ())
dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
!"not vectorized: can't create required "
!"epilog loop\n");
return false;
  }
  }
Index: gcc/testsuite/gcc.dg/torture/pr59164.c
===
*** gcc/testsuite/gcc.dg/torture/pr59164.c  (revision 0)
--- gcc/testsuite/gcc.dg/torture/pr59164.c  (working copy)
***
*** 0 
--- 1,21 
+ /* { dg-do compile } */
+ 
+ int a, d, e;
+ long b[10];
+ int c[10][8];
+ 
+ int fn1(p1)
+ { 
+   return 1 >> p1; 
+ }
+ 
+ void fn2(void)
+ {
+   int f;
+   for (a=1; a <= 4; a++)
+ {
+   f = fn1(0 < c[a][0]);
+   if (f || d)
+   e = b[a] = 1;
+ }
+ }
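
The adjusted check in vect_analyze_loop_operations can be read as "an epilogue loop is needed when peeling for gaps is required or the iteration count is not known to be a multiple of the vectorization factor". A scalar model of that condition for constant iteration counts, using plain integers instead of trees. The helper names are illustrative, not GCC functions.

```cpp
#include <cassert>

// Number of trailing zero bits of N (N must be nonzero).
unsigned count_trailing_zeros (unsigned long n)
{
  assert (n != 0);
  unsigned count = 0;
  while ((n & 1UL) == 0)
    {
      n >>= 1;
      ++count;
    }
  return count;
}

// log2 of VF, which must be a power of two.
unsigned log2_of_power_of_two (unsigned vf)
{
  assert (vf != 0 && (vf & (vf - 1)) == 0);
  return count_trailing_zeros (vf);
}

// Model of the condition in the patch: fewer trailing zero bits in the
// iteration count than log2 (vf) means the count is not a multiple of
// the vectorization factor, so leftover iterations need an epilogue.
bool epilogue_required (unsigned long niters, unsigned vf,
                        bool peeling_for_gaps)
{
  return peeling_for_gaps
         || count_trailing_zeros (niters) < log2_of_power_of_two (vf);
}
```

For example, 8 iterations with a vectorization factor of 4 need no epilogue, while 6 iterations do, and so does any loop that peels for gaps.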


Re: [PATCH i386 7/8] [AVX-512] Add tests.

2013-11-19 Thread Uros Bizjak
On Tue, Nov 19, 2013 at 11:35 AM, Kirill Yukhin  wrote:

> Here is a patch that introduces tests for AVX-512 instructions.
>
> While implementing testsuite we were strongly connected to the fact that we
> don't want more than 2 test files per instruction - a scan assembler test
> and a runtime test.
>
> Consider that in general case for most new instructions we have a simple
> intrinsic, an intrinsic with merge masking and an intrinsic with zero masking -
> and we need to have scan tests and runtime tests for them all. Also, there may
> be rounding support, i.e.  an intrinsic with rounding. For this case we only
> have scan tests and do not have runtime tests because it's unclear how to
> implement a runtime test in this case.
>
> Firstly, scan tests (avx512f--1.c). Each test should aggregate all
> intrinsics that generate appropriate instruction . I.e. simple intrinsic,
> merge masking, zero masking, rounding intrinsics and maybe some aliases that
> are worth testing. Tests are written in exactly the same manner as AVX2 scan
> tests. See avx2-*-1.c for reference.
>
> Secondly, runtime tests (avx512f--2.c). Basically, the approach was the
> same as for AVX2 runtime tests - call an intrinsic with some pre-initialized
> source and destination and check if results meet expectation - except that we
> have 3-4 intrinsics with the same semantics. To avoid lots of duplicate code,
> we use macros in runtime tests. Macros are defined in avx512f-helper.h, and
> every runtime test includes this file. Also, avx512f-helper.h contains the
> definition of the core testing function - avx512f_test. Note that some macros
> are defined in dg-options. This machinery may seem redundant for now, but it
> will be extremely useful for future extensions.  There're also some
> stand-alone AVX512F runtime tests that are implemented without our macros
> machinery just like AVX2 tests.
>
> Finally, we have updated avx-1.c, sse-*.c, testimm-*.c tests with new
> intrinsics and builtins. To check messaging for intrinsics with rounding, we
> have added testround-*.c tests.

Please also add new options to g++.dg/other/i386-{2,3}.C. They check
if x86intrin.h additions can be compiled under c++.

Uros.
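
As a side note on the three intrinsic flavours mentioned above, the difference between merge masking and zero masking can be modelled in scalar code, which is roughly the kind of reference computation a runtime test compares against. This is a hypothetical sketch, not code from avx512f-helper.h. The names and the choice of an integer add are illustrative.

```cpp
#include <cstddef>
#include <cstdint>

enum mask_mode { MERGE_MASKING, ZERO_MASKING };

// Scalar reference for a masked element-wise add over N <= 16 elements.
// Elements whose mask bit is set receive a[i] + b[i].  For the others,
// merge masking keeps the old destination value and zero masking
// writes 0.
void masked_add_reference (int *dst, const int *a, const int *b,
                           std::uint16_t mask, std::size_t n,
                           mask_mode mode)
{
  for (std::size_t i = 0; i < n; ++i)
    {
      if ((mask >> i) & 1)
        dst[i] = a[i] + b[i];
      else if (mode == ZERO_MASKING)
        dst[i] = 0;
      // MERGE_MASKING: leave dst[i] untouched.
    }
}
```

The unmasked intrinsic corresponds to calling this with all mask bits set, which is why one helper can serve as the expected-value generator for all three variants.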


Go patch committed: Update for mainline changes

2013-11-19 Thread Ian Lance Taylor
This patch to the Go frontend incorporates patches by Richard S and
Diego for changes to the middle-end.  These patches were already
committed to the GCC repository.  This change commits them to the master
Go repository.

Ian

diff -r 75537ee240ab go/expressions.cc
--- a/go/expressions.cc	Mon Nov 18 18:29:34 2013 -0800
+++ b/go/expressions.cc	Tue Nov 19 06:55:30 2013 -0800
@@ -11,6 +11,8 @@
 #include "toplev.h"
 #include "intl.h"
 #include "tree.h"
+#include "stringpool.h"
+#include "stor-layout.h"
 #include "gimple.h"
 #include "gimplify.h"
 #include "tree-iterator.h"
@@ -3343,9 +3345,9 @@
   tree int_type_tree = type_to_tree(int_type->get_backend(gogo));
 
   expr_tree = fold_convert(int_type_tree, expr_tree);
-  if (host_integerp(expr_tree, 0))
-	{
-	  HOST_WIDE_INT intval = tree_low_cst(expr_tree, 0);
+  if (tree_fits_shwi_p (expr_tree))
+	{
+	  HOST_WIDE_INT intval = tree_to_shwi (expr_tree);
 	  std::string s;
 	  Lex::append_char(intval, true, &s, this->location());
 	  Expression* se = Expression::make_string(s, this->location());
diff -r 75537ee240ab go/gogo-tree.cc
--- a/go/gogo-tree.cc	Mon Nov 18 18:29:34 2013 -0800
+++ b/go/gogo-tree.cc	Tue Nov 19 06:55:30 2013 -0800
@@ -8,6 +8,9 @@
 
 #include "toplev.h"
 #include "tree.h"
+#include "stringpool.h"
+#include "stor-layout.h"
+#include "varasm.h"
 #include "gimple.h"
 #include "gimplify.h"
 #include "tree-iterator.h"


[v3 patch] move library TS status to new table

2013-11-19 Thread Jonathan Wakely
2013-11-19  Jonathan Wakely  

   * doc/xml/manual/status_cxx2014.xml: Create new table for TS statuses.

Committed to trunk.
commit f6ca0fe406b379b4fd7cd4575756fda04156cde0
Author: Jonathan Wakely 
Date:   Tue Nov 19 14:56:22 2013 +

* doc/xml/manual/status_cxx2014.xml: Create new table for TS statuses.

diff --git a/libstdc++-v3/doc/xml/manual/status_cxx2014.xml 
b/libstdc++-v3/doc/xml/manual/status_cxx2014.xml
index 0e0ac37..bb389e8 100644
--- a/libstdc++-v3/doc/xml/manual/status_cxx2014.xml
+++ b/libstdc++-v3/doc/xml/manual/status_cxx2014.xml
@@ -20,8 +20,8 @@ presence of the required flag.
 
 
 
-This page describes the C++14 support in mainline GCC SVN, not in any
-particular release.
+This page describes the C++14 and library TS support in mainline GCC SVN,
+not in any particular release.
 
 
 
@@ -223,29 +223,53 @@ particular release.
   
 
 
-
 
   
-   http://www.w3.org/1999/xlink"; 
xlink:href="http://www.open-std.org/JTC1/sc22/WG21/docs/papers/2013/n3793.html";>
- N3672
+   http://www.w3.org/1999/xlink"; 
xlink:href="http://www.open-std.org/JTC1/sc22/WG21/docs/papers/2013/n3655.pdf";>
+ N3655

   
-  A proposal to add a utility class to represent optional 
objects
+  TransformationTraits Redux
   Y
-  Moved from C++14 to Library Fundamentals TS
+  
 
 
 
+  
   
-   http://www.w3.org/1999/xlink"; 
xlink:href="http://www.open-std.org/JTC1/sc22/WG21/docs/papers/2013/n3655.pdf";>
- N3655
+   http://www.w3.org/1999/xlink"; 
xlink:href="http://www.open-std.org/JTC1/sc22/WG21/docs/papers/2013/n3644.pdf";>
+ N3644

   
-  TransformationTraits Redux
-  Y
+  Null Forward Iterators
+  N
   
 
 
+  
+
+
+
+
+
+C++ Technical Specifications Implementation Status
+
+
+
+
+
+
+  
+
+  Paper
+  Title
+  Status
+  Comments
+
+  
+
+  
+
 
   
   
@@ -255,21 +279,44 @@ particular release.
   
   C++ Dynamic Arrays
   N
-  Moved from C++14 to Library Fundamentals TS
+  Array Extensions TS
 
 
 
-  
   
-   http://www.w3.org/1999/xlink"; 
xlink:href="http://www.open-std.org/JTC1/sc22/WG21/docs/papers/2013/n3644.pdf";>
- N3644
+   http://www.w3.org/1999/xlink"; 
xlink:href="http://www.open-std.org/JTC1/sc22/WG21/docs/papers/2013/n3793.html";>
+ N3672

   
-  Null Forward Iterators
-  N
-  
+  A proposal to add a utility class to represent optional 
objects
+  Y
+  Library Fundamentals TS
 
 
+
+  
+   http://www.w3.org/1999/xlink"; 
xlink:href="http://open-std.org/jtc1/sc22/wg21/docs/papers/2013/n3762.html";>
+ N3762
+   
+  
+  string_view: a non-owning reference to a 
string
+  Y
+  Library Fundamentals TS
+
+
+
+  
+  
+   http://www.w3.org/1999/xlink"; 
xlink:href="http://open-std.org/jtc1/sc22/wg21/docs/papers/2013/n3790.html";>
+ N3790
+   
+  
+  File System
+  WIP
+  
+
+
+
   
 
 


[PATCH] C++-ify and simplify loop iterators

2013-11-19 Thread Richard Biener

$subject - the following turns

 loop_iterator li;
 FOR_EACH_LOOP (li, loop, LI_ONLY_INNERMOST)
   {
 ...
 if ()
   FOR_EACH_LOOP_BREAK;
   }

into

 FOR_EACH_LOOP (loop, LI_ONLY_INNERMOST)
   {
 ...
 if ()
   break;
   }

Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

Richard.
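
To illustrate why the special break macro becomes unnecessary: once the iterator is an object declared in the for-init-statement, its destructor runs on any exit from the loop, so a plain break cleans up automatically. A stripped-down model of the idea follows. The names item, item_iterator and FOR_EACH_ITEM are hypothetical stand-ins, and std::vector replaces GCC's vec. This is not the actual cfgloop.h code.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a loop structure.
struct item
{
  int num;
};

// Iterator object: the constructor collects what to visit and hands
// back the first element; the implicit destructor releases the list.
struct item_iterator
{
  item_iterator (item **out, const std::vector<item *> &all)
    : to_visit (all), idx (0)
  {
    *out = next ();
  }

  // Return the next non-null element, or a null pointer when done.
  item *next ()
  {
    while (idx < to_visit.size ())
      {
        item *it = to_visit[idx++];
        if (it)
          return it;
      }
    return nullptr;
  }

  std::vector<item *> to_visit;
  std::size_t idx;
};

// The iterator lives in the for-init-statement, so leaving the loop by
// any means (including break) destroys it and frees to_visit.
#define FOR_EACH_ITEM(IT, ALL) \
  for (item_iterator iter_ (&(IT), (ALL)); (IT); (IT) = iter_.next ())

// Example use: find the first element above LIMIT, leaving early.
int first_num_above (const std::vector<item *> &all, int limit)
{
  item *it;
  int found = -1;
  FOR_EACH_ITEM (it, all)
    {
      if (it->num > limit)
        {
          found = it->num;
          break;
        }
    }
  return found;
}
```

With the old C-style iterator the list was released only when iteration ran to the end, so an early exit needed FOR_EACH_LOOP_BREAK to do the release by hand.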

2013-11-19  Richard Biener  

* cfgloop.h (struct loop_iterator): C++-ify, add constructor
and destructor and make fel_next a member function.
(fel_next): Transform into ...
(loop_iterator::next): ... this.
(fel_init): Transform into ...
(loop_iterator::loop_iterator): ... this.
(loop_iterator::~loop_iterator): New.
(FOR_EACH_LOOP): Remove loop-iterator argument.
(FOR_EACH_LOOP_BREAK): Remove no longer necessary macro.
* cfgloop.c, cfgloopmanip.c, config/mn10300/mn10300.c,
graphite-clast-to-gimple.c, graphite-scop-detection.c,
graphite-sese-to-poly.c, ipa-inline-analysis.c, ipa-pure-const.c,
loop-init.c, loop-invariant.c, loop-unroll.c, loop-unswitch.c,
modulo-sched.c, predict.c, sel-sched-ir.c, tree-cfg.c, tree-data-ref.c,
tree-if-conv.c, tree-loop-distribution.c, tree-parloops.c,
tree-predcom.c, tree-scalar-evolution.c, tree-ssa-dce.c,
tree-ssa-loop-ch.c, tree-ssa-loop-im.c, tree-ssa-loop-ivcanon.c,
tree-ssa-loop-ivopts.c, tree-ssa-loop-manip.c, tree-ssa-loop-niter.c,
tree-ssa-loop-prefetch.c, tree-ssa-loop-unswitch.c,
tree-ssa-threadupdate.c, tree-vectorizer.c, tree-vrp.c: Adjust
uses of FOR_EACH_LOOP and remove loop_iterator variables.  Replace
FOR_EACH_LOOP_BREAK with break.

Index: gcc/cfgloop.h
===
*** gcc/cfgloop.h.orig  2013-10-21 10:10:15.0 +0200
--- gcc/cfgloop.h   2013-11-19 13:37:18.432818617 +0100
*** enum li_flags
*** 542,589 
  
  /* The iterator for loops.  */
  
! typedef struct
  {
/* The list of loops to visit.  */
vec to_visit;
  
/* The index of the actual loop.  */
unsigned idx;
! } loop_iterator;
  
! static inline void
! fel_next (loop_iterator *li, loop_p *loop)
  {
int anum;
  
!   while (li->to_visit.iterate (li->idx, &anum))
  {
!   li->idx++;
!   *loop = get_loop (cfun, anum);
!   if (*loop)
!   return;
  }
  
!   li->to_visit.release ();
!   *loop = NULL;
  }
  
! static inline void
! fel_init (loop_iterator *li, loop_p *loop, unsigned flags)
  {
struct loop *aloop;
unsigned i;
int mn;
  
!   li->idx = 0;
if (!current_loops)
  {
!   li->to_visit.create (0);
*loop = NULL;
return;
  }
  
!   li->to_visit.create (number_of_loops (cfun));
mn = (flags & LI_INCLUDE_ROOT) ? 0 : 1;
  
if (flags & LI_ONLY_INNERMOST)
--- 542,593 
  
  /* The iterator for loops.  */
  
! struct loop_iterator
  {
+   loop_iterator (loop_p *loop, unsigned flags);
+   ~loop_iterator ();
+ 
+   inline loop_p next ();
+ 
/* The list of loops to visit.  */
vec to_visit;
  
/* The index of the actual loop.  */
unsigned idx;
! };
  
! inline loop_p
! loop_iterator::next ()
  {
int anum;
  
!   while (this->to_visit.iterate (this->idx, &anum))
  {
!   this->idx++;
!   loop_p loop = get_loop (cfun, anum);
!   if (loop)
!   return loop;
  }
  
!   return NULL;
  }
  
! inline
! loop_iterator::loop_iterator (loop_p *loop, unsigned flags)
  {
struct loop *aloop;
unsigned i;
int mn;
  
!   this->idx = 0;
if (!current_loops)
  {
!   this->to_visit.create (0);
*loop = NULL;
return;
  }
  
!   this->to_visit.create (number_of_loops (cfun));
mn = (flags & LI_INCLUDE_ROOT) ? 0 : 1;
  
if (flags & LI_ONLY_INNERMOST)
*** fel_init (loop_iterator *li, loop_p *loo
*** 592,598 
if (aloop != NULL
&& aloop->inner == NULL
&& aloop->num >= mn)
! li->to_visit.quick_push (aloop->num);
  }
else if (flags & LI_FROM_INNERMOST)
  {
--- 596,602 
if (aloop != NULL
&& aloop->inner == NULL
&& aloop->num >= mn)
! this->to_visit.quick_push (aloop->num);
  }
else if (flags & LI_FROM_INNERMOST)
  {
*** fel_init (loop_iterator *li, loop_p *loo
*** 605,611 
while (1)
{
  if (aloop->num >= mn)
!   li->to_visit.quick_push (aloop->num);
  
  if (aloop->next)
{
--- 609,615 
while (1)
{
  if (aloop->num >= mn)
!   this->to_visit.quick_push (aloop->num);
  
  if (aloop->next)
{
*** fel_init (loop_iterator *li, loop_p *loo
*** 627,633 
while (1)
{
  if (aloop->num >= mn)
!   li->to_visit.quick_push (aloop->num);
  
  if (aloop->inner != NULL)
aloop = aloop->inner;
--- 631,6

libgo patch committed: Update for libbacktrace change

2013-11-19 Thread Ian Lance Taylor
This patch to libgo updates the use of the libbacktrace library for the
recent addition of a size argument to the syminfo callback.
Bootstrapped and ran Go testsuite on x86_64-unknown-linux-gnu.
Committed to mainline.

Ian

diff -r 923fd178d72b libgo/runtime/go-caller.c
--- a/libgo/runtime/go-caller.c	Tue Nov 19 06:58:17 2013 -0800
+++ b/libgo/runtime/go-caller.c	Tue Nov 19 07:00:09 2013 -0800
@@ -135,7 +135,7 @@
 static void
 syminfo_callback (void *data, uintptr_t pc __attribute__ ((unused)),
 		  const char *symname __attribute__ ((unused)),
-		  uintptr_t address)
+		  uintptr_t address, uintptr_t size __attribute__ ((unused)))
 {
   uintptr_t *pval = (uintptr_t *) data;
 

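For readers unfamiliar with the callback change, the sketch below shows a callback that actually uses the new size argument instead of ignoring it. It does not include backtrace.h: the typedef `syminfo_callback_fn`, the `struct symdata` layout, and the driver `fake_lookup` are hypothetical stand-ins that only mirror the five-argument shape described in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the updated callback type; the trailing
   symsize parameter is the addition discussed in this thread.  */
typedef void (*syminfo_callback_fn) (void *data, uintptr_t pc,
                                     const char *symname,
                                     uintptr_t symval, uintptr_t symsize);

/* Illustrative record filled in by the callback.  */
struct symdata
{
  const char *name;
  uintptr_t val;
  uintptr_t size;
};

/* Callback that keeps the name, start address, and size.  */
static void
record_symbol (void *data, uintptr_t pc, const char *symname,
               uintptr_t symval, uintptr_t symsize)
{
  struct symdata *out = (struct symdata *) data;

  assert (data != NULL);
  (void) pc;                    /* Not needed for this sketch.  */
  out->name = symname;
  out->val = symval;
  out->size = symsize;
}

/* Hypothetical driver standing in for the library's symbol lookup:
   it reports one data symbol together with its size.  */
static void
fake_lookup (uintptr_t pc, syminfo_callback_fn callback, void *data)
{
  static int global;

  callback (data, pc, "global", (uintptr_t) &global, sizeof global);
}
```

A caller would pass `record_symbol` and a `struct symdata *` to the lookup routine and read the `size` field afterwards. A callback that has no use for the size can simply mark the parameter unused, as the libgo change above does.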

[PATCH, rs6000] Fix ICE when loading vectors into GPRs in little-endian

2013-11-19 Thread Ulrich Weigand
Hello,

running the testsuite in powerpc64le-linux with --with-cpu=power7 causes
FAIL: tmpdir-g++.dg-struct-layout-1/t024 cp_compat_x_tst.o compile,  (internal compiler error)
due to an unrecognizable insn

(insn 137 136 138 5 (set (reg:V2DI 5 5)
(vec_select:V2DI (reg:V2DI 211)
(parallel [
(const_int 1 [0x1])
(const_int 0 [0])
]))) 
/home/gcc-build/gcc/testsuite/g++/g++.dg-struct-layout-1//t024_test.h:6 -1
 (nil))

i.e. an attempted vector permute into a GPR hard reg.  It turns out this happens
when rs6000_emit_le_vsx_move is called with a GPR hard reg destination, which
in turn can happen when passing vectors to a vararg routine.

However, rs6000_emit_le_vsx_move is not set up to handle GPRs.  Fortunately,
for GPRs this routine is not actually necessary; vectors can be loaded into
GPRs using the regular move patterns.

This patch fixes the problem by not invoking the rs6000_emit_le_vsx_move special
case if a hard reg GPR is involved as source/destination.

Tested on powerpc64le-linux.

OK for mainline?

Bye,
Ulrich


ChangeLog:

* config/rs6000/vector.md ("mov"): Do not call
rs6000_emit_le_vsx_move to move into or out of GPRs.
* config/rs6000/rs6000.c (rs6000_emit_le_vsx_move): Assert
source and destination are not GPR hard regs.

Index: gcc/config/rs6000/rs6000.c
===
--- gcc/config/rs6000/rs6000.c  (revision 205009)
+++ gcc/config/rs6000/rs6000.c  (working copy)
@@ -7947,6 +7947,7 @@
   gcc_assert (!BYTES_BIG_ENDIAN
  && VECTOR_MEM_VSX_P (mode)
  && mode != TImode
+ && !gpr_or_gpr_p (dest, source)
  && (MEM_P (source) ^ MEM_P (dest)));
 
   if (MEM_P (source))
Index: gcc/config/rs6000/vector.md
===
--- gcc/config/rs6000/vector.md (revision 205009)
+++ gcc/config/rs6000/vector.md (working copy)
@@ -108,6 +108,7 @@
   if (!BYTES_BIG_ENDIAN
   && VECTOR_MEM_VSX_P (mode)
   && mode != TImode
+  && !gpr_or_gpr_p (operands[0], operands[1])
   && (memory_operand (operands[0], mode)
   ^ memory_operand (operands[1], mode)))
 {
-- 
  Dr. Ulrich Weigand
  GNU/Linux compilers and toolchain
  ulrich.weig...@de.ibm.com



[PATCH, rs6000] Make ppc64-abi-2.c test case endian safe

2013-11-19 Thread Ulrich Weigand
Hello,

some routines in the ppc64-abi-2.c test case attempt to verify that
the slots of the parameter save area in the caller hold correct values.

However, those slots hold (parts of) "vector int" data, which the
test case compares against immediate long values.  This of course
hard-codes byte order.

The patch below fixes the test to construct appropriate values
for both byte orders, which fixes the test failure on powerpc64le.
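As a rough illustration of why the expected value depends on byte order, the sketch below packs two adjacent 32-bit elements into one 64-bit slot and compares the result with a MAKE_SLOT-style macro. It is not the test case itself: the `uint64_t` casts, the use of `__BYTE_ORDER__` (a GCC/Clang predefined macro) in place of `__LITTLE_ENDIAN__`, and the helper `slot_from_pair` are assumptions made so the example builds on common hosts.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Illustrative variant of the patch's MAKE_SLOT.  Assumes a compiler
   that predefines __BYTE_ORDER__ (GCC and Clang do).  */
#if defined (__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
#define MAKE_SLOT(x, y) ((uint64_t) (x) | ((uint64_t) (y) << 32))
#else
#define MAKE_SLOT(x, y) ((uint64_t) (y) | ((uint64_t) (x) << 32))
#endif

/* Hypothetical helper: read two adjacent 32-bit elements as a single
   doubleword, the way the test inspects a parameter save area slot.  */
static uint64_t
slot_from_pair (uint32_t first, uint32_t second)
{
  uint32_t elems[2] = { first, second };
  uint64_t slot;

  assert (sizeof slot == sizeof elems);
  memcpy (&slot, elems, sizeof slot);
  return slot;
}
```

With this, `slot_from_pair (1, 2) == MAKE_SLOT (1, 2)` should hold on either byte order. On a big-endian target the value is 0x100000002; on a little-endian target the element at the lower address lands in the low half, giving 0x200000001.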

Tested on powerpc64le-linux.

OK for mainline?

Bye,
Ulrich


Index: gcc/testsuite/gcc.target/powerpc/ppc64-abi-2.c
===
--- gcc/testsuite/gcc.target/powerpc/ppc64-abi-2.c  (revision 205009)
+++ gcc/testsuite/gcc.target/powerpc/ppc64-abi-2.c  (working copy)
@@ -121,6 +121,12 @@
   vector int v;
 } vector_int_t;
 
+#ifdef __LITTLE_ENDIAN__
+#define MAKE_SLOT(x, y) ((long)x | ((long)y << 32))
+#else
+#define MAKE_SLOT(x, y) ((long)y | ((long)x << 32))
+#endif
+
 /* Paramter passing.
s : gpr 3
v : vpr 2
@@ -228,8 +234,8 @@
   sp = __builtin_frame_address(0);
   sp = sp->backchain;
   
-  if (sp->slot[2].l != 0x100000002ULL
-  || sp->slot[4].l != 0x500000006ULL)
+  if (sp->slot[2].l != MAKE_SLOT (1, 2)
+  || sp->slot[4].l !=  MAKE_SLOT (5, 6))
 abort();
 }
 
@@ -270,8 +276,8 @@
   sp = __builtin_frame_address(0);
   sp = sp->backchain;
   
-  if (sp->slot[4].l != 0x100000002ULL
-  || sp->slot[6].l != 0x500000006ULL)
+  if (sp->slot[4].l != MAKE_SLOT (1, 2)
+  || sp->slot[6].l !=  MAKE_SLOT (5, 6))
 abort();
 }
 
@@ -298,8 +304,8 @@
   sp = __builtin_frame_address(0);
   sp = sp->backchain;
   
-  if (sp->slot[4].l != 0x100000002ULL
-  || sp->slot[6].l != 0x500000006ULL)
+  if (sp->slot[4].l != MAKE_SLOT (1, 2)
+  || sp->slot[6].l !=  MAKE_SLOT (5, 6))
 abort();
 }
 
-- 
  Dr. Ulrich Weigand
  GNU/Linux compilers and toolchain
  ulrich.weig...@de.ibm.com



Re: [PATCH] Add reference binding instrumentation

2013-11-19 Thread Jason Merrill

On 11/18/2013 11:39 AM, Marek Polacek wrote:

+   init = fold_build2 (COMPOUND_EXPR, TREE_TYPE (init),
+   ubsan_instrument_reference (input_location, init),
+   init);


This looks like it will evaluate init twice.

Jason
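The concern can be illustrated outside the compiler with a plain C sketch. This is not GCC tree code: `instrument`, `next_value`, and the `CHECK_THEN_USE` macro are hypothetical names chosen only to mirror the shape "(check (init), init)", where the same operand appears on both sides of the comma.

```c
#include <assert.h>

static int evaluations;

/* Operand with a visible side effect: counts how often it is evaluated.  */
static int
next_value (void)
{
  return ++evaluations;
}

/* Hypothetical stand-in for an instrumentation check.  */
static void
instrument (int value)
{
  assert (value > 0);
}

/* Mirrors the shape "(check (init), init)": the operand appears twice.  */
#define CHECK_THEN_USE(e) (instrument (e), (e))

/* Evaluates the operand once by binding it to a parameter first.  */
static int
check_then_use_once (int value)
{
  instrument (value);
  return value;
}

/* Number of operand evaluations when the operand is duplicated.  */
static int
evaluations_with_macro (void)
{
  evaluations = 0;
  (void) CHECK_THEN_USE (next_value ());
  return evaluations;
}

/* Number of operand evaluations when the operand is saved first.  */
static int
evaluations_with_temporary (void)
{
  evaluations = 0;
  (void) check_then_use_once (next_value ());
  return evaluations;
}
```

Here `evaluations_with_macro ()` reports two evaluations of the operand, while `evaluations_with_temporary ()` reports one. Any side effect in the initializer would be duplicated in the same way unless the value is saved first.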




Re: Please don't commit changes to gcc/go/gofrontend

2013-11-19 Thread Diego Novillo
On Tue, Nov 19, 2013 at 9:48 AM, Ian Lance Taylor  wrote:
> Hi, as noted in gcc/go/README.gcc, the files in gcc/go/gofrontend are
> actually mirrored from a different repository.  Please do not directly
> commit changes to those files.  Instead, send the changes to me.  I
> will commit them upstream.  Thanks.
>
> Ian

Ugh, sorry.  This is really counter-intuitive.  To properly test
changes, one needs the patch in the tree.  So when I do the final
commit, it is not easy to remember that I need to take it out (and
taking it out means undoing a local commit, which is yet another
operation).

Sorry, but I would expect these problems to continue.  Is it possible
for you to cope in some other way?  Like your merging script noticing
changes and incorporating them into your tree?

Alternately, would it be possible to install some kind of svn hook
that only allows certain users to commit?


Diego.

