date:20140417

Re: Reduce -flto -fprofile-generate memory use

2014-04-17 Thread Richard Biener

On Thu, 17 Apr 2014, Jan Hubicka wrote:

> Hi,
> while compiling firefox I noticed that -fprofile-generate -flto goes to 8GB.
> It turns out that this is caused by ipa_reference no longer being disabled
> because in_lto_p became a flag that is set later (it is not clear to me why it
> needs to be this way).
> 
> However, I do not see a reason not to disable ipa-reference for the non-lto
> path, too.
> 
> Bootstrapped/regtested x86_linux, committed to mainline.
> OK for 4.9.1?

Yes.

Thanks,
Richard.

> Honza
> 
> Index: ChangeLog
> ===
> --- ChangeLog (revision 209461)
> +++ ChangeLog (working copy)
> @@ -1,5 +1,10 @@
>  2014-04-16  Jan Hubicka  
>  
> + * opts.c (common_handle_option): Disable -fipa-reference correctly
> + with -fuse-profile.
> +
> +2014-04-16  Jan Hubicka  
> +
>   * ipa-devirt.c (odr_type_d): Add field all_derivations_known.
>   (type_all_derivations_known_p): New predicate.
>   (type_all_ctors_visible_p): New predicate.
> Index: opts.c
> ===
> --- opts.c (revision 209461)
> +++ opts.c (working copy)
> @@ -1732,7 +1732,7 @@ common_handle_option (struct gcc_options
>/* FIXME: Instrumentation we insert makes ipa-reference bitmaps
>quadratic.  Disable the pass until better memory representation
>is done.  */
> -  if (!opts_set->x_flag_ipa_reference && opts->x_in_lto_p)
> +  if (!opts_set->x_flag_ipa_reference)
>  opts->x_flag_ipa_reference = false;
>break;
>  
> 
> 

-- 
Richard Biener 
SUSE / SUSE Labs
SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer

Re: Fix lto/PR60854

2014-04-17 Thread Richard Biener

On Thu, Apr 17, 2014 at 4:30 AM, Jan Hubicka  wrote:
> Hi,
> the testcase shows a problem where the cpp implicit alias is always inline and
> symtab_remove_unreachable_nodes removes the body of the aliased function before
> inlining happens.  The real problem is that cgraph_state is set too early
> and not, as the comment says, after inlining, but for the release branch I think
> it is easier to solve the problem by simply making the alias target
> reachable by hand.
>
> Bootstrapped/regtested x86_64-linux, committed to trunk. Let me know
> when it is OK for the release branch.

It's ok for 4.9.1.

Richard.

> Honza
>
> Index: ChangeLog
> ===
> --- ChangeLog   (revision 209458)
> +++ ChangeLog   (working copy)
> @@ -1,3 +1,9 @@
> +2014-04-16  Jan Hubicka  
> +
> +   PR ipa/60854
> +   * ipa.c (symtab_remove_unreachable_nodes): Mark targets of
> +   external aliases alive, too.
> +
>  2014-04-16  Andrew  Pinski  
>
> * config/host-linux.c (TRY_EMPTY_VM_SPACE): Change aarch64 ilp32
> Index: testsuite/ChangeLog
> ===
> --- testsuite/ChangeLog (revision 209450)
> +++ testsuite/ChangeLog (working copy)
> @@ -1,3 +1,8 @@
> +2014-04-16  Jan Hubicka  
> +
> +   PR ipa/60854
> +   * g++.dg/torture/pr60854.C: New testcase.
> +
>  2014-04-16  Catherine Moore  
>
> * gcc.target/mips/umips-store16-2.c: New test.
> Index: ipa.c
> ===
> --- ipa.c   (revision 209450)
> +++ ipa.c   (working copy)
> @@ -415,7 +415,18 @@ symtab_remove_unreachable_nodes (bool be
>   || !DECL_EXTERNAL (e->callee->decl)
>   || e->callee->alias
>   || before_inlining_p))
> -   pointer_set_insert (reachable, e->callee);
> +   {
> + /* Be sure that we will not optimize out alias target
> +body.  */
> + if (DECL_EXTERNAL (e->callee->decl)
> + && e->callee->alias
> + && before_inlining_p)
> +   {
> + pointer_set_insert (reachable,
> + cgraph_function_node (e->callee));
> +   }
> + pointer_set_insert (reachable, e->callee);
> +   }
>   enqueue_node (e->callee, &first, reachable);
> }
>
> Index: testsuite/g++.dg/torture/pr60854.C
> ===
> --- testsuite/g++.dg/torture/pr60854.C  (revision 0)
> +++ testsuite/g++.dg/torture/pr60854.C  (revision 0)
> @@ -0,0 +1,13 @@
> +template 
> +class MyClass
> +{
> +public:
> +  __attribute__ ((__always_inline__)) inline MyClass () { ; }
> +};
> +
> +extern template class MyClass;
> +
> +void Func()
> +{
> +  MyClass x;
> +}

Re: [PATCH GCC]Fix pr60363 by adding backtraced value of phi arg along jump threading path

2014-04-17 Thread Richard Biener

On Thu, Apr 17, 2014 at 7:30 AM, Jeff Law  wrote:
> On 03/18/14 04:13, bin.cheng wrote:
>>
>> Hi,
>> After control flow graph change made by
>> http://gcc.gnu.org/ml/gcc-patches/2014-02/msg01492.html, case
>> gcc.dg/tree-ssa/ssa-dom-thread-4.c is broken on logical_op_short_circuit
>> targets including cortex-m3/cortex-m0.
>> The regression reveals a missed opportunity in jump threading, which
>> causes a forwarder basic block not to be removed in cfgcleanup after
>> jump threading in VRP1.  Root cause is stated at the corresponding PR:
>> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60363, please refer to it for
>> detailed report.
>>
>> This patch fixes the issue by adding a constant value instead of an
>> ssa_name as the new phi argument.  Bootstrapped and tested on x86_64; also
>> tested on cortex-m3, where the regression is gone.
>> I think this should wait for stage1, but would like to hear some comments
>> now.  So does it look reasonable?
>>
>>
>> 2014-03-18  Bin Cheng
>>
>> PR regression/60363
>> * gcc/tree-ssa-threadupdate.c (get_value_locus_in_path): New.
>> (copy_phi_args): New parameters.  Call get_value_locus_in_path.
>> (update_destination_phis): New parameter.
>> (create_edge_and_update_destination_phis): Ditto.
>> (ssa_fix_duplicate_block_edges): Pass new arguments.
>> (thread_single_edge): Ditto.
>
> This is a good and interesting catch. DOM knows how to propagate these
> context sensitive equivalences which should expose the optimizable forwarder
> blocks.
>
> But I'm a big believer in catching as many CFG simplifications as early as
> we can as they tend to have nice cascading effects.  So if we can pick it up
> by being smarter in how we duplicate arguments, then I'm all for it.
>
>> +  for (int j = idx - 1; j >= 0; j--)
>> +{
>> +  edge e = (*path)[j]->e;
>> +  if (e->dest == def_bb)
>> +   {
>> + arg = gimple_phi_arg_def (def_phi, e->dest_idx);
>> + *locus = gimple_phi_arg_location (def_phi, e->dest_idx);
>> + return (TREE_CODE (arg) == INTEGER_CST ? arg : def);
>
> Presumably any constant that can legitimately appear in a PHI node is good
> here.  So for example ADDR_EXPR  ought to be
> handled as well.
>
> One could also argue that we should go ahead and do a context sensitive copy
> propagation here too if ARG turns out to be an SSA_NAME.  You have to be a
> bit more careful with those and use may_propagate_copy_p and you'd probably
> want to test the loop depth of the SSA_NAMEs to ensure you're not doing a
> propagation that is going to muck up LICM.  See loop_depth_of_name uses in
> tree-ssa-dom.c.
>
> Overall I think it's good.  We just need to resolve whether or not we want
> to catch constant ADDR_EXPRs and/or do the context sensitive copy
> propagations.

Simply use is_gimple_min_invariant (arg) ? arg : def

> jeff

[PATCH] Fix PR60841

2014-04-17 Thread Richard Biener


This fixes running into the exponential value-graph -> SLP tree
expansion by artificially limiting the overall SLP tree size.
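
To illustrate the idea (a toy sketch only, not the code in the patch): the recursion below builds a full binary tree but shares one node counter across all recursive calls and bails out once a budget is exceeded. The names Node, build_tree and fits_budget are made up for this example, and the single shared counter is a simplification; the exact bookkeeping in vect_build_slp_tree is only partly visible in the diff below.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Toy stand-in for an SLP node; this is not GCC's slp_tree.
struct Node
{
  std::vector<std::unique_ptr<Node>> children;
};

// Build a full binary tree of the requested depth, but give up as soon as
// more than max_tree_size child nodes have been created overall.
// tree_size is shared by the whole recursion, so the limit applies to the
// tree as a whole and not to each subtree separately.
bool
build_tree (Node &node, unsigned depth, unsigned *tree_size,
            unsigned max_tree_size)
{
  if (depth == 0)
    return true;
  for (int i = 0; i < 2; ++i)
    {
      if (++*tree_size > max_tree_size)
        return false;
      node.children.push_back (std::make_unique<Node> ());
      if (!build_tree (*node.children.back (), depth - 1, tree_size,
                       max_tree_size))
        return false;
    }
  return true;
}

// Convenience wrapper: true if a tree of the given depth fits the budget.
bool
fits_budget (unsigned depth, unsigned max_tree_size)
{
  Node root;
  unsigned tree_size = 0;
  return build_tree (root, depth, &tree_size, max_tree_size);
}
```

A full binary tree of depth 3 needs 2 + 4 + 8 = 14 child nodes, so fits_budget (3, 14) succeeds while fits_budget (3, 13) fails; a depth of 20 with a budget of 100 is rejected after about a hundred nodes instead of after roughly two million.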

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk.

Richard.

2014-04-16   Richard Biener  

PR tree-optimization/60841
* tree-vect-data-refs.c (vect_analyze_data_refs): Count stmts.
* tree-vect-loop.c (vect_analyze_loop_2): Pass down number
of stmts to SLP build.
* tree-vect-slp.c (vect_slp_analyze_bb_1): Likewise.
(vect_analyze_slp): Likewise.
(vect_analyze_slp_instance): Likewise.
(vect_build_slp_tree): Limit overall SLP tree growth.
* tree-vectorizer.h (vect_analyze_data_refs,
vect_analyze_slp): Adjust prototypes.

* gcc.dg/vect/pr60841.c: New testcase.

Index: gcc/tree-vect-data-refs.c
===
--- gcc/tree-vect-data-refs.c   (revision 209423)
+++ gcc/tree-vect-data-refs.c   (working copy)
@@ -3172,7 +3213,7 @@ vect_check_gather (gimple stmt, loop_vec
 bool
 vect_analyze_data_refs (loop_vec_info loop_vinfo,
bb_vec_info bb_vinfo,
-   int *min_vf)
+   int *min_vf, unsigned *n_stmts)
 {
   struct loop *loop = NULL;
   basic_block bb = NULL;
@@ -3207,6 +3248,9 @@ vect_analyze_data_refs (loop_vec_info lo
  for (gsi = gsi_start_bb (bbs[i]); !gsi_end_p (gsi); gsi_next (&gsi))
{
  gimple stmt = gsi_stmt (gsi);
+ if (is_gimple_debug (stmt))
+   continue;
+ ++*n_stmts;
  if (!find_data_references_in_stmt (loop, stmt, &datarefs))
{
  if (is_gimple_call (stmt) && loop->safelen)
@@ -3260,6 +3304,9 @@ vect_analyze_data_refs (loop_vec_info lo
   for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
{
  gimple stmt = gsi_stmt (gsi);
+ if (is_gimple_debug (stmt))
+   continue;
+ ++*n_stmts;
  if (!find_data_references_in_stmt (NULL, stmt,
 &BB_VINFO_DATAREFS (bb_vinfo)))
{
Index: gcc/tree-vect-loop.c
===
--- gcc/tree-vect-loop.c (revision 209423)
+++ gcc/tree-vect-loop.c (working copy)
@@ -1629,6 +1629,7 @@ vect_analyze_loop_2 (loop_vec_info loop_
   int max_vf = MAX_VECTORIZATION_FACTOR;
   int min_vf = 2;
   unsigned int th;
+  unsigned int n_stmts = 0;
 
   /* Find all data references in the loop (which correspond to vdefs/vuses)
  and analyze their evolution in the loop.  Also adjust the minimal
@@ -1637,7 +1638,7 @@ vect_analyze_loop_2 (loop_vec_info loop_
  FORNOW: Handle only simple, array references, which
  alignment can be forced, and aligned pointer-references.  */
 
-  ok = vect_analyze_data_refs (loop_vinfo, NULL, &min_vf);
+  ok = vect_analyze_data_refs (loop_vinfo, NULL, &min_vf, &n_stmts);
   if (!ok)
 {
   if (dump_enabled_p ())
@@ -1747,7 +1748,7 @@ vect_analyze_loop_2 (loop_vec_info loop_
 }
 
   /* Check the SLP opportunities in the loop, analyze and build SLP trees.  */
-  ok = vect_analyze_slp (loop_vinfo, NULL);
+  ok = vect_analyze_slp (loop_vinfo, NULL, n_stmts);
   if (ok)
 {
   /* Decide which possible SLP instances to SLP.  */
Index: gcc/tree-vect-slp.c
===
--- gcc/tree-vect-slp.c (revision 209423)
+++ gcc/tree-vect-slp.c (working copy)
@@ -849,9 +849,10 @@ vect_build_slp_tree (loop_vec_info loop_
  unsigned int *max_nunits,
  vec *loads,
  unsigned int vectorization_factor,
-bool *matches, unsigned *npermutes)
+bool *matches, unsigned *npermutes, unsigned *tree_size,
+unsigned max_tree_size)
 {
-  unsigned nops, i, this_npermutes = 0;
+  unsigned nops, i, this_npermutes = 0, this_tree_size = 0;
   gimple stmt;
 
   if (!matches)
@@ -911,6 +912,12 @@ vect_build_slp_tree (loop_vec_info loop_
   if (oprnd_info->first_dt != vect_internal_def)
 continue;
 
+  if (++this_tree_size > max_tree_size)
+   {
+ vect_free_oprnd_info (oprnds_info);
+ return false;
+   }
+
   child = vect_create_new_slp_node (oprnd_info->def_stmts);
   if (!child)
{
@@ -921,7 +928,8 @@ vect_build_slp_tree (loop_vec_info loop_
   bool *matches = XALLOCAVEC (bool, group_size);
   if (vect_build_slp_tree (loop_vinfo, bb_vinfo, &child,
   group_size, max_nunits, loads,
-  vectorization_factor, matches, npermutes))
+  vectorization_factor, matches,
+  npermutes, &this_tree_size, max_tree_size))
{
  oprnd_info->def_stmts = vNULL;
  SLP_TREE_CHILDREN (*node).quick_

[PATCH] Fix PR60849

2014-04-17 Thread Richard Biener


This fixes PR60849 by properly rejecting non-boolean typed
comparisons from valid_gimple_rhs_p so they go through the
gimplification paths.
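
As a rough model of the condition added below (a sketch with made-up types, not GCC's tree representation): a comparison is only accepted as a valid RHS when its result type is integral and is either a boolean type or exactly one bit wide. TypeKind, Type and comparison_result_type_ok are hypothetical names used only for this illustration.

```cpp
#include <cassert>

// Hypothetical, simplified type descriptor; GCC's real tree types carry
// far more information (enumeral types, signedness, and so on).
enum class TypeKind { Boolean, Integer, Real, Pointer };

struct Type
{
  TypeKind kind;
  unsigned precision;  // in bits
};

// Mirrors the shape of the new check: the result type must be integral,
// and it must be either a boolean type or have a precision of one bit.
bool
comparison_result_type_ok (const Type &t)
{
  bool integral = t.kind == TypeKind::Boolean || t.kind == TypeKind::Integer;
  if (!integral)
    return false;
  return t.kind == TypeKind::Boolean || t.precision == 1;
}
```

In this model a comparison with a 32-bit integer result is rejected, which corresponds to sending it through the gimplification path rather than accepting it as is.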

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk.

Richard.

2014-04-16  Richard Biener  

PR middle-end/60849
* tree-ssa-propagate.c (valid_gimple_rhs_p): Only allow effective
boolean results for comparisons.

* g++.dg/opt/pr60849.C: New testcase.

Index: gcc/tree-ssa-propagate.c
===
--- gcc/tree-ssa-propagate.c (revision 209423)
+++ gcc/tree-ssa-propagate.c (working copy)
@@ -571,8 +571,14 @@ valid_gimple_rhs_p (tree expr)
   /* All constants are ok.  */
   break;
 
-case tcc_binary:
 case tcc_comparison:
+  if (!INTEGRAL_TYPE_P (TREE_TYPE (expr))
+ || (TREE_CODE (TREE_TYPE (expr)) != BOOLEAN_TYPE
+ && TYPE_PRECISION (TREE_TYPE (expr)) != 1))
+   return false;
+
+  /* Fallthru.  */
+case tcc_binary:
   if (!is_gimple_val (TREE_OPERAND (expr, 0))
  || !is_gimple_val (TREE_OPERAND (expr, 1)))
return false;
Index: gcc/testsuite/g++.dg/opt/pr60849.C
===
--- gcc/testsuite/g++.dg/opt/pr60849.C  (revision 0)
+++ gcc/testsuite/g++.dg/opt/pr60849.C  (working copy)
@@ -0,0 +1,13 @@
+// { dg-do compile }
+// { dg-options "-O2" }
+
+int g;
+
+extern "C" int isnan ();
+
+void foo(float a) {
+  int (*xx)(...);
+  xx = isnan;
+  if (xx(a))
+g++;
+}

[PATCH] Fix PR60836

2014-04-17 Thread Richard Biener


This fixes PR60836 by emitting the computation of a PHI argument that is
not a valid gimple value on the incoming edge.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk.

Richard.

2014-04-16  Richard Biener  

PR tree-optimization/60836
* tree-vect-loop.c (vect_create_epilog_for_reduction): Force
initial PHI args to be gimple values.

* g++.dg/vect/pr60836.cc: New testcase.

Index: gcc/tree-vect-loop.c
===
*** gcc/tree-vect-loop.c (revision 209423)
--- gcc/tree-vect-loop.c (working copy)
*** vect_create_epilog_for_reduction (vec double
+ norm_ (const int &)
+ {
+   char c, d;
+   A e;
+   for (; a; a++)
+ {
+   b = e (b, d);
+   b = e (b, c);
+ }
+ }
+ 
+ void
+ norm ()
+ {
+   static NormFunc f = norm_ < int, A >;
+   f = 0;
+ }
+ 
+ // { dg-final { cleanup-tree-dump "vect" } }

[PATCH 2/6] merge register_dump_files_1 into register_dump_files

2014-04-17 Thread tsaunders

From: Trevor Saunders 

Hi,

simplification allowed by previous patch.
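
For readers without the tree at hand, the merged function boils down to a plain recursive walk over the pass list. The following is a stand-alone sketch with a made-up Pass struct (not GCC's opt_pass), and it records names in a vector where the real code calls register_one_dump_file:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical miniature of a pass list: each pass has an optional name,
// an optional list of sub-passes and a next sibling.
struct Pass
{
  const char *name;
  Pass *sub;
  Pass *next;
};

// Walk the list starting at pass, record every pass whose name does not
// start with '*', and recurse into sub-pass lists.
void
register_dump_files (Pass *pass, std::vector<std::string> &registered)
{
  do
    {
      if (pass->name && pass->name[0] != '*')
        registered.push_back (pass->name);

      if (pass->sub)
        register_dump_files (pass->sub, registered);

      pass = pass->next;
    }
  while (pass);
}

// Small fixed pass list: "first" -> "*hidden", where "*hidden" has the
// sub-pass "inner".
std::vector<std::string>
demo_registration ()
{
  Pass inner = { "inner", nullptr, nullptr };
  Pass hidden = { "*hidden", &inner, nullptr };
  Pass first = { "first", nullptr, &hidden };
  std::vector<std::string> out;
  register_dump_files (&first, out);
  return out;
}
```

With no properties to thread through, one function can serve as both the entry point and the recursive worker.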

bootstrap + regtest passed on x86_64-unknown-linux-gnu, ok?

Trev

2014-03-19  Trevor Saunders  

* pass_manager.h (pass_manager::register_dump_files_1): Remove 
declaration.
* passes.c (pass_manager::register_dump_files_1): Merge into
(pass_manager::register_dump_files): this, and remove its handling of
properties since the pass always has the properties anyway.
(pass_manager::pass_manager): Adjust.


diff --git a/gcc/pass_manager.h b/gcc/pass_manager.h
index 8309567..9f4d67b 100644
--- a/gcc/pass_manager.h
+++ b/gcc/pass_manager.h
@@ -91,8 +91,7 @@ public:
 
 private:
   void set_pass_for_id (int id, opt_pass *pass);
-  void register_dump_files_1 (opt_pass *pass);
-  void register_dump_files (opt_pass *pass, int properties);
+  void register_dump_files (opt_pass *pass);
 
 private:
   context *m_ctxt;
diff --git a/gcc/passes.c b/gcc/passes.c
index 3f9590a..7508771 100644
--- a/gcc/passes.c
+++ b/gcc/passes.c
@@ -706,11 +706,10 @@ pass_manager::register_one_dump_file (opt_pass *pass)
   free (CONST_CAST (char *, full_name));
 }
 
-/* Recursive worker function for register_dump_files.  */
+/* Register the dump files for the pass_manager starting at PASS. */
 
 void
-pass_manager::
-register_dump_files_1 (opt_pass *pass)
+pass_manager::register_dump_files (opt_pass *pass)
 {
   do
 {
@@ -718,25 +717,13 @@ register_dump_files_1 (opt_pass *pass)
 register_one_dump_file (pass);
 
   if (pass->sub)
-register_dump_files_1 (pass->sub);
+register_dump_files (pass->sub);
 
   pass = pass->next;
 }
   while (pass);
 }
 
-/* Register the dump files for the pass_manager starting at PASS.
-   PROPERTIES reflects the properties that are guaranteed to be available at
-   the beginning of the pipeline.  */
-
-void
-pass_manager::
-register_dump_files (opt_pass *pass,int properties)
-{
-  pass->properties_required |= properties;
-  register_dump_files_1 (pass);
-}
-
 struct pass_registry
 {
   const char* unique_name;
@@ -1536,19 +1523,11 @@ pass_manager::pass_manager (context *ctxt)
 #undef TERMINATE_PASS_LIST
 
   /* Register the passes with the tree dump code.  */
-  register_dump_files (all_lowering_passes, PROP_gimple_any);
-  register_dump_files (all_small_ipa_passes,
-  PROP_gimple_any | PROP_gimple_lcf | PROP_gimple_leh
-  | PROP_cfg);
-  register_dump_files (all_regular_ipa_passes,
-  PROP_gimple_any | PROP_gimple_lcf | PROP_gimple_leh
-  | PROP_cfg);
-  register_dump_files (all_late_ipa_passes,
-  PROP_gimple_any | PROP_gimple_lcf | PROP_gimple_leh
-  | PROP_cfg);
-  register_dump_files (all_passes,
-  PROP_gimple_any | PROP_gimple_lcf | PROP_gimple_leh
-  | PROP_cfg);
+  register_dump_files (all_lowering_passes);
+  register_dump_files (all_small_ipa_passes);
+  register_dump_files (all_regular_ipa_passes);
+  register_dump_files (all_late_ipa_passes);
+  register_dump_files (all_passes);
 }
 
 /* If we are in IPA mode (i.e., current_function_decl is NULL), call
-- 
1.9.2

[PATCH 1/6] remove properties stuff from register_dump_files_1

2014-04-17 Thread tsaunders

From: Trevor Saunders 

Hi,

just removing some dead code.

bootstrapped + regtested against r209414 on x86_64-unknown-linux-gnu, ok?

Trev

2014-03-19  Trevor Saunders  

* pass_manager.h (pass_manager::register_dump_files_1): Adjust.
* passes.c (pass_manager::register_dump_files_1): Remove dead code
dealing with properties.
(pass_manager::register_dump_files): Adjust.

diff --git a/gcc/pass_manager.h b/gcc/pass_manager.h
index e1d8143..8309567 100644
--- a/gcc/pass_manager.h
+++ b/gcc/pass_manager.h
@@ -91,7 +91,7 @@ public:
 
 private:
   void set_pass_for_id (int id, opt_pass *pass);
-  int register_dump_files_1 (opt_pass *pass, int properties);
+  void register_dump_files_1 (opt_pass *pass);
   void register_dump_files (opt_pass *pass, int properties);
 
 private:
diff --git a/gcc/passes.c b/gcc/passes.c
index 60fb135..3f9590a 100644
--- a/gcc/passes.c
+++ b/gcc/passes.c
@@ -708,33 +708,21 @@ pass_manager::register_one_dump_file (opt_pass *pass)
 
 /* Recursive worker function for register_dump_files.  */
 
-int
+void
 pass_manager::
-register_dump_files_1 (opt_pass *pass, int properties)
+register_dump_files_1 (opt_pass *pass)
 {
   do
 {
-  int new_properties = (properties | pass->properties_provided)
-  & ~pass->properties_destroyed;
-
   if (pass->name && pass->name[0] != '*')
 register_one_dump_file (pass);
 
   if (pass->sub)
-new_properties = register_dump_files_1 (pass->sub, new_properties);
-
-  /* If we have a gate, combine the properties that we could have with
- and without the pass being examined.  */
-  if (pass->has_gate)
-properties &= new_properties;
-  else
-properties = new_properties;
+register_dump_files_1 (pass->sub);
 
   pass = pass->next;
 }
   while (pass);
-
-  return properties;
 }
 
 /* Register the dump files for the pass_manager starting at PASS.
@@ -746,7 +734,7 @@ pass_manager::
 register_dump_files (opt_pass *pass,int properties)
 {
   pass->properties_required |= properties;
-  register_dump_files_1 (pass, properties);
+  register_dump_files_1 (pass);
 }
 
 struct pass_registry
-- 
1.9.2

[PATCH 4/6] enable -Woverloaded-virtual when available

2014-04-17 Thread tsaunders

From: Trevor Saunders 

hi,

it's a useful warning, and helps catch bugs in the next two patches.
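
A small illustration of the kind of mistake the warning is aimed at (made-up classes, not code from the following patches): the derived function below takes a different parameter type, so it hides the base virtual instead of overriding it.

```cpp
#include <cassert>
#include <string>

struct Base
{
  virtual ~Base () {}
  virtual std::string describe (int) { return "Base::describe(int)"; }
};

struct Derived : Base
{
  // Different parameter type: this hides Base::describe (int) rather than
  // overriding it, which is what -Woverloaded-virtual is meant to flag.
  virtual std::string describe (long) { return "Derived::describe(long)"; }
};

// A call through the base class still reaches the base implementation,
// which is usually not what the author of Derived intended.
std::string
call_through_base (Base &b)
{
  return b.describe (1);
}
```

Calling call_through_base on a Derived object still runs Base::describe (int), while calling describe (1) directly on the Derived object picks the long overload.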

bootstrap + regtest passed on x86_64-unknown-linux-gnu, ok?

Trev

2014-03-19  Trevor Saunders  

* configure.ac: Check for -Woverloaded-virtual and enable it if found.
* configure: Regenerate.


diff --git a/gcc/configure b/gcc/configure
index 415377a..1a48ca3 100755
--- a/gcc/configure
+++ b/gcc/configure
@@ -6427,6 +6427,50 @@ fi
   done
 CFLAGS="$save_CFLAGS"
 
+save_CFLAGS="$CFLAGS"
+for real_option in -Woverloaded-virtual; do
+  # Do the check with the no- prefix removed since gcc silently
+  # accepts any -Wno-* option on purpose
+  case $real_option in
+-Wno-*) option=-W`expr x$real_option : 'x-Wno-\(.*\)'` ;;
+*) option=$real_option ;;
+  esac
+  as_acx_Woption=`$as_echo "acx_cv_prog_cc_warning_$option" | $as_tr_sh`
+
+  { $as_echo "$as_me:${as_lineno-$LINENO}: checking whether $CC supports $option" >&5
+$as_echo_n "checking whether $CC supports $option... " >&6; }
+if { as_var=$as_acx_Woption; eval "test \"\${$as_var+set}\" = set"; }; then :
+  $as_echo_n "(cached) " >&6
+else
+  CFLAGS="$option"
+cat confdefs.h - <<_ACEOF >conftest.$ac_ext
+/* end confdefs.h.  */
+
+int
+main ()
+{
+
+  ;
+  return 0;
+}
+_ACEOF
+if ac_fn_c_try_compile "$LINENO"; then :
+  eval "$as_acx_Woption=yes"
+else
+  eval "$as_acx_Woption=no"
+fi
+rm -f core conftest.err conftest.$ac_objext conftest.$ac_ext
+
+fi
+eval ac_res=\$$as_acx_Woption
+  { $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_res" >&5
+$as_echo "$ac_res" >&6; }
+  if test `eval 'as_val=${'$as_acx_Woption'};$as_echo "$as_val"'` = yes; then :
+  strict_warn="$strict_warn${strict_warn:+ }$real_option"
+fi
+  done
+CFLAGS="$save_CFLAGS"
+
 c_strict_warn=
 save_CFLAGS="$CFLAGS"
 for real_option in -Wold-style-definition -Wc++-compat; do
@@ -17927,7 +17971,7 @@ else
   lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
   lt_status=$lt_dlunknown
   cat > conftest.$ac_ext <<_LT_EOF
-#line 17930 "configure"
+#line 17974 "configure"
 #include "confdefs.h"
 
 #if HAVE_DLFCN_H
@@ -18033,7 +18077,7 @@ else
   lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
   lt_status=$lt_dlunknown
   cat > conftest.$ac_ext <<_LT_EOF
-#line 18036 "configure"
+#line 18080 "configure"
 #include "confdefs.h"
 
 #if HAVE_DLFCN_H
diff --git a/gcc/configure.ac b/gcc/configure.ac
index 0336066..b2726e5 100644
--- a/gcc/configure.ac
+++ b/gcc/configure.ac
@@ -340,6 +340,8 @@ ACX_PROG_CC_WARNING_OPTS(
 ACX_PROG_CC_WARNING_OPTS(
m4_quote(m4_do([-Wmissing-format-attribute])), [strict_warn])
 ACX_PROG_CC_WARNING_OPTS(
+   m4_quote(m4_do([-Woverloaded-virtual])), [strict_warn])
+ACX_PROG_CC_WARNING_OPTS(
m4_quote(m4_do([-Wold-style-definition -Wc++-compat])), [c_strict_warn])
 ACX_PROG_CC_WARNING_ALMOST_PEDANTIC(
m4_quote(m4_do([-Wno-long-long -Wno-variadic-macros ], 
-- 
1.9.2

Re: [PATCH 2/6] merge register_dump_files_1 into register_dump_files

2014-04-17 Thread Richard Biener

On Thu, Apr 17, 2014 at 10:37 AM,   wrote:
> From: Trevor Saunders 
>
> Hi,
>
> simplification allowed by previous patch.
>
> bootstrap + regtest passed on x86_64-unknown-linux-gnu, ok?

Ok.

Thanks,
Richard.

> Trev
>
> 2014-03-19  Trevor Saunders  
>
> * pass_manager.h (pass_manager::register_dump_files_1): Remove 
> declaration.
> * passes.c (pass_manager::register_dump_files_1): Merge into
> (pass_manager::register_dump_files): this, and remove its handling of
> properties since the pass always has the properties anyway.
> (pass_manager::pass_manager): Adjust.
>
>
> diff --git a/gcc/pass_manager.h b/gcc/pass_manager.h
> index 8309567..9f4d67b 100644
> --- a/gcc/pass_manager.h
> +++ b/gcc/pass_manager.h
> @@ -91,8 +91,7 @@ public:
>
>  private:
>void set_pass_for_id (int id, opt_pass *pass);
> -  void register_dump_files_1 (opt_pass *pass);
> -  void register_dump_files (opt_pass *pass, int properties);
> +  void register_dump_files (opt_pass *pass);
>
>  private:
>context *m_ctxt;
> diff --git a/gcc/passes.c b/gcc/passes.c
> index 3f9590a..7508771 100644
> --- a/gcc/passes.c
> +++ b/gcc/passes.c
> @@ -706,11 +706,10 @@ pass_manager::register_one_dump_file (opt_pass *pass)
>free (CONST_CAST (char *, full_name));
>  }
>
> -/* Recursive worker function for register_dump_files.  */
> +/* Register the dump files for the pass_manager starting at PASS. */
>
>  void
> -pass_manager::
> -register_dump_files_1 (opt_pass *pass)
> +pass_manager::register_dump_files (opt_pass *pass)
>  {
>do
>  {
> @@ -718,25 +717,13 @@ register_dump_files_1 (opt_pass *pass)
>  register_one_dump_file (pass);
>
>if (pass->sub)
> -register_dump_files_1 (pass->sub);
> +register_dump_files (pass->sub);
>
>pass = pass->next;
>  }
>while (pass);
>  }
>
> -/* Register the dump files for the pass_manager starting at PASS.
> -   PROPERTIES reflects the properties that are guaranteed to be available at
> -   the beginning of the pipeline.  */
> -
> -void
> -pass_manager::
> -register_dump_files (opt_pass *pass,int properties)
> -{
> -  pass->properties_required |= properties;
> -  register_dump_files_1 (pass);
> -}
> -
>  struct pass_registry
>  {
>const char* unique_name;
> @@ -1536,19 +1523,11 @@ pass_manager::pass_manager (context *ctxt)
>  #undef TERMINATE_PASS_LIST
>
>/* Register the passes with the tree dump code.  */
> -  register_dump_files (all_lowering_passes, PROP_gimple_any);
> -  register_dump_files (all_small_ipa_passes,
> -  PROP_gimple_any | PROP_gimple_lcf | PROP_gimple_leh
> -  | PROP_cfg);
> -  register_dump_files (all_regular_ipa_passes,
> -  PROP_gimple_any | PROP_gimple_lcf | PROP_gimple_leh
> -  | PROP_cfg);
> -  register_dump_files (all_late_ipa_passes,
> -  PROP_gimple_any | PROP_gimple_lcf | PROP_gimple_leh
> -  | PROP_cfg);
> -  register_dump_files (all_passes,
> -  PROP_gimple_any | PROP_gimple_lcf | PROP_gimple_leh
> -  | PROP_cfg);
> +  register_dump_files (all_lowering_passes);
> +  register_dump_files (all_small_ipa_passes);
> +  register_dump_files (all_regular_ipa_passes);
> +  register_dump_files (all_late_ipa_passes);
> +  register_dump_files (all_passes);
>  }
>
>  /* If we are in IPA mode (i.e., current_function_decl is NULL), call
> --
> 1.9.2
>

Re: [PATCH 1/6] remove properties stuff from register_dump_files_1

2014-04-17 Thread Richard Biener

On Thu, Apr 17, 2014 at 10:37 AM,   wrote:
> From: Trevor Saunders 
>
> Hi,
>
> just removing some dead code.
>
> bootstrapped + regtested against r209414 on x86_64-unknown-linux-gnu, ok?

Ok.

Thanks,
Richard.

> Trev
>
> 2014-03-19  Trevor Saunders  
>
> * pass_manager.h (pass_manager::register_dump_files_1): Adjust.
> * passes.c (pass_manager::register_dump_files_1): Remove dead code
> dealing with properties.
> (pass_manager::register_dump_files): Adjust.
>
> diff --git a/gcc/pass_manager.h b/gcc/pass_manager.h
> index e1d8143..8309567 100644
> --- a/gcc/pass_manager.h
> +++ b/gcc/pass_manager.h
> @@ -91,7 +91,7 @@ public:
>
>  private:
>void set_pass_for_id (int id, opt_pass *pass);
> -  int register_dump_files_1 (opt_pass *pass, int properties);
> +  void register_dump_files_1 (opt_pass *pass);
>void register_dump_files (opt_pass *pass, int properties);
>
>  private:
> diff --git a/gcc/passes.c b/gcc/passes.c
> index 60fb135..3f9590a 100644
> --- a/gcc/passes.c
> +++ b/gcc/passes.c
> @@ -708,33 +708,21 @@ pass_manager::register_one_dump_file (opt_pass *pass)
>
>  /* Recursive worker function for register_dump_files.  */
>
> -int
> +void
>  pass_manager::
> -register_dump_files_1 (opt_pass *pass, int properties)
> +register_dump_files_1 (opt_pass *pass)
>  {
>do
>  {
> -  int new_properties = (properties | pass->properties_provided)
> -  & ~pass->properties_destroyed;
> -
>if (pass->name && pass->name[0] != '*')
>  register_one_dump_file (pass);
>
>if (pass->sub)
> -new_properties = register_dump_files_1 (pass->sub, new_properties);
> -
> -  /* If we have a gate, combine the properties that we could have with
> - and without the pass being examined.  */
> -  if (pass->has_gate)
> -properties &= new_properties;
> -  else
> -properties = new_properties;
> +register_dump_files_1 (pass->sub);
>
>pass = pass->next;
>  }
>while (pass);
> -
> -  return properties;
>  }
>
>  /* Register the dump files for the pass_manager starting at PASS.
> @@ -746,7 +734,7 @@ pass_manager::
>  register_dump_files (opt_pass *pass,int properties)
>  {
>pass->properties_required |= properties;
> -  register_dump_files_1 (pass, properties);
> +  register_dump_files_1 (pass);
>  }
>
>  struct pass_registry
> --
> 1.9.2
>

Re: [PATCH 4/6] enable -Woverloaded-virtual when available

2014-04-17 Thread Richard Biener

On Thu, Apr 17, 2014 at 10:37 AM,   wrote:
> From: Trevor Saunders 
>
> hi,
>
> it's a useful warning, and helps catch bugs in the next two patches.
>
> bootstrap + regtest passed on x86_64-unknown-linux-gnu, ok?

Ok.

Thanks,
Richard.

> Trev
>
> 2014-03-19  Trevor Saunders  
>
> * configure.ac: Check for -Woverloaded-virtual and enable it if found.
> * configure: Regenerate.
>
>
> diff --git a/gcc/configure b/gcc/configure
> index 415377a..1a48ca3 100755
> --- a/gcc/configure
> +++ b/gcc/configure
> @@ -6427,6 +6427,50 @@ fi
>done
>  CFLAGS="$save_CFLAGS"
>
> +save_CFLAGS="$CFLAGS"
> +for real_option in -Woverloaded-virtual; do
> +  # Do the check with the no- prefix removed since gcc silently
> +  # accepts any -Wno-* option on purpose
> +  case $real_option in
> +-Wno-*) option=-W`expr x$real_option : 'x-Wno-\(.*\)'` ;;
> +*) option=$real_option ;;
> +  esac
> +  as_acx_Woption=`$as_echo "acx_cv_prog_cc_warning_$option" | $as_tr_sh`
> +
> +  { $as_echo "$as_me:${as_lineno-$LINENO}: checking whether $CC supports $option" >&5
> +$as_echo_n "checking whether $CC supports $option... " >&6; }
> +if { as_var=$as_acx_Woption; eval "test \"\${$as_var+set}\" = set"; }; then :
> +  $as_echo_n "(cached) " >&6
> +else
> +  CFLAGS="$option"
> +cat confdefs.h - <<_ACEOF >conftest.$ac_ext
> +/* end confdefs.h.  */
> +
> +int
> +main ()
> +{
> +
> +  ;
> +  return 0;
> +}
> +_ACEOF
> +if ac_fn_c_try_compile "$LINENO"; then :
> +  eval "$as_acx_Woption=yes"
> +else
> +  eval "$as_acx_Woption=no"
> +fi
> +rm -f core conftest.err conftest.$ac_objext conftest.$ac_ext
> +
> +fi
> +eval ac_res=\$$as_acx_Woption
> +  { $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_res" >&5
> +$as_echo "$ac_res" >&6; }
> +  if test `eval 'as_val=${'$as_acx_Woption'};$as_echo "$as_val"'` = yes; then :
> +  strict_warn="$strict_warn${strict_warn:+ }$real_option"
> +fi
> +  done
> +CFLAGS="$save_CFLAGS"
> +
>  c_strict_warn=
>  save_CFLAGS="$CFLAGS"
>  for real_option in -Wold-style-definition -Wc++-compat; do
> @@ -17927,7 +17971,7 @@ else
>lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
>lt_status=$lt_dlunknown
>cat > conftest.$ac_ext <<_LT_EOF
> -#line 17930 "configure"
> +#line 17974 "configure"
>  #include "confdefs.h"
>
>  #if HAVE_DLFCN_H
> @@ -18033,7 +18077,7 @@ else
>lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
>lt_status=$lt_dlunknown
>cat > conftest.$ac_ext <<_LT_EOF
> -#line 18036 "configure"
> +#line 18080 "configure"
>  #include "confdefs.h"
>
>  #if HAVE_DLFCN_H
> diff --git a/gcc/configure.ac b/gcc/configure.ac
> index 0336066..b2726e5 100644
> --- a/gcc/configure.ac
> +++ b/gcc/configure.ac
> @@ -340,6 +340,8 @@ ACX_PROG_CC_WARNING_OPTS(
>  ACX_PROG_CC_WARNING_OPTS(
> m4_quote(m4_do([-Wmissing-format-attribute])), [strict_warn])
>  ACX_PROG_CC_WARNING_OPTS(
> +   m4_quote(m4_do([-Woverloaded-virtual])), [strict_warn])
> +ACX_PROG_CC_WARNING_OPTS(
> m4_quote(m4_do([-Wold-style-definition -Wc++-compat])), [c_strict_warn])
>  ACX_PROG_CC_WARNING_ALMOST_PEDANTIC(
> m4_quote(m4_do([-Wno-long-long -Wno-variadic-macros ],
> --
> 1.9.2
>

Re: [RFC] Add aarch64 support for ada

2014-04-17 Thread Tristan Gingold


On 16 Apr 2014, at 17:36, Richard Henderson  wrote:

> On 04/16/2014 12:39 AM, Eric Botcazou wrote:
>>> The primary bit of rfc here is the hunk that applies to ada/types.h
>>> with respect to Fat_Pointer.  Given that the Ada type, as defined in
>>> s-stratt.ads, does not include alignment, I can't imagine why the C
>>> type should have it.
>> 
>> See gcc-interface/utils.c:finish_fat_pointer_type.
> 
> Ah hah.
> 
>  /* Make sure we can put it into a register.  */
>  if (STRICT_ALIGNMENT)
>TYPE_ALIGN (record_type) = MIN (BIGGEST_ALIGNMENT, 2 * POINTER_SIZE);
> 
> AArch64 is not a STRICT_ALIGNMENT target, thus the mismatch.

As the align attribute in types.h is for the host, couldn't a configure test
solve this issue?

> If we were to make this alignment unconditional, would it be better to drop
> the code from here in finish_fat_pointer_type and instead record that in the
> Ada source, as we do with the C source?
> 
> I presume
> 
>  for Fat_Pointer'Alignment use System.Address'Size * 2;
> 
> or some such incantation would do that...

One of the most common Fat_Pointer types is the one for strings, which isn't
declared in any source and is very commonly used.

OTOH, I think this optimization mostly targets sparc.

Tristan.

Re: [PATCH v2] libstdc++: Add hexfloat/defaultfloat io manipulators.

2014-04-17 Thread Jonathan Wakely

On 17 April 2014 01:56, Luke Allardyce wrote:
>> Thanks, I was wrong about that.
>>
>> Then I think we should just bite the bullet and provide the new
>> behaviour. If we do have an abi_tag on those types in the next release
>> then we can preserve the old behaviour in the old ABI and use the
>> C++11 semantics for the abi_tagged type, which will be used for both
>> C++03 and C++11 code. I am not too concerned that people who use a
>> meaningless modifier in C++03 code get the C++11 behaviour. If they
>> really want %g or %G then they shouldn't use fixed|scientific.
>
> Does that mean abi_tag will be enabled with separate compiler flag /
> define rather than checking against the __cplusplus value?

I'm going to send a mail later on today, but the plan is that it's not
going to depend on __cplusplus at all. That makes it possible to pass
the abi_tagged types between C++03 and C++11 code.

Re: [PATCH] Enhancing the widen-mult pattern in vectorization.

2014-04-17 Thread Richard Biener

On Sat, Dec 7, 2013 at 12:45 AM, Cong Hou  wrote:
> After further reviewing this patch, I found I don't have to change the
> code in tree-vect-stmts.c to allow further type conversion after
> widen-mult operation. Instead, I detect the following pattern in
> vect_recog_widen_mult_pattern():
>
> T1 a, b;
> ai = (T2) a;
> bi = (T2) b;
> c = ai * bi;
>
> where T2 is more than double the size of T1. (e.g. T1 is char and T2 is int).
>
> In this case I just create a new type T3 whose size is double of the
> size of T1, then get an intermediate result of type T3 from
> widen-mult. Then I add a new statement to STMT_VINFO_PATTERN_DEF_SEQ
> converting the result into type T2.
>
> This strategy makes the patch cleaner.
>
> Bootstrapped and tested on an x86-64 machine.

Ok for trunk (please re-bootstrap/test of course).

Thanks,
Richard.
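
The two-step strategy described in the quoted text can be sketched in scalar C
as follows.  This is only an illustration under stated assumptions: T1 is
taken to be an unsigned 8-bit type, T3 a 16-bit type and T2 a 32-bit type, and
the function names are hypothetical; the actual patch works on GIMPLE
statements inside vect_recog_widen_mult_pattern, not on C source.

```c
#include <assert.h>
#include <stdint.h>

/* Direct form: both operands are converted to the wide type T2 (32 bits)
   before the multiplication.  */
uint32_t
mult_direct (uint8_t a, uint8_t b)
{
  uint32_t ai = (uint32_t) a;
  uint32_t bi = (uint32_t) b;
  return ai * bi;
}

/* Two-step form: a widening multiply into the intermediate type T3
   (16 bits, double the size of T1), followed by a plain conversion
   from T3 to T2.  */
uint32_t
mult_two_step (uint8_t a, uint8_t b)
{
  uint16_t t3 = (uint16_t) ((uint16_t) a * (uint16_t) b);
  return (uint32_t) t3;
}

/* Returns 1 when both forms agree for every pair of 8-bit inputs.  */
int
forms_agree (void)
{
  for (unsigned a = 0; a < 256; a++)
    for (unsigned b = 0; b < 256; b++)
      if (mult_direct ((uint8_t) a, (uint8_t) b)
          != mult_two_step ((uint8_t) a, (uint8_t) b))
        return 0;
  return 1;
}
```

The intermediate 16-bit result is exact here because the largest product,
255 * 255 = 65025, still fits in 16 bits.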

>
> thanks,
> Cong
>
>
> diff --git a/gcc/ChangeLog b/gcc/ChangeLog
> index f298c0b..12990b2 100644
> --- a/gcc/ChangeLog
> +++ b/gcc/ChangeLog
> @@ -1,3 +1,10 @@
> +2013-12-02  Cong Hou  
> +
> + * tree-vect-patterns.c (vect_recog_widen_mult_pattern): Enhance
> + the widen-mult pattern by handling two operands with different
> + sizes, and operands whose size is smaller than half of the result
> + type.
> +
>  2013-11-22  Jakub Jelinek  
>
>   PR sanitizer/59061
> diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
> index 12d2c90..611ae1c 100644
> --- a/gcc/testsuite/ChangeLog
> +++ b/gcc/testsuite/ChangeLog
> @@ -1,3 +1,8 @@
> +2013-12-02  Cong Hou  
> +
> + * gcc.dg/vect/vect-widen-mult-u8-s16-s32.c: New test.
> + * gcc.dg/vect/vect-widen-mult-u8-u32.c: New test.
> +
>  2013-11-22  Jakub Jelinek  
>
>   * c-c++-common/asan/no-redundant-instrumentation-7.c: Fix
> diff --git a/gcc/testsuite/gcc.dg/vect/vect-widen-mult-u8-s16-s32.c
> b/gcc/testsuite/gcc.dg/vect/vect-widen-mult-u8-s16-s32.c
> new file mode 100644
> index 000..9f9081b
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/vect/vect-widen-mult-u8-s16-s32.c
> @@ -0,0 +1,48 @@
> +/* { dg-require-effective-target vect_int } */
> +
> +#include 
> +#include "tree-vect.h"
> +
> +#define N 64
> +
> +unsigned char X[N] __attribute__ ((__aligned__(__BIGGEST_ALIGNMENT__)));
> +short Y[N] __attribute__ ((__aligned__(__BIGGEST_ALIGNMENT__)));
> +int result[N];
> +
> +/* unsigned char * short -> int widening-mult.  */
> +__attribute__ ((noinline)) int
> +foo1(int len) {
> +  int i;
> +
> +  for (i=0; i<len; i++) {
> +    result[i] = X[i] * Y[i];
> +  }
> +}
> +
> +int main (void)
> +{
> +  int i;
> +
> +  check_vect ();
> +
> +  for (i=0; i<N; i++) {
> +    X[i] = i;
> +Y[i] = 64-i;
> +__asm__ volatile ("");
> +  }
> +
> +  foo1 (N);
> +
> +  for (i=0; i<N; i++) {
> +    if (result[i] != X[i] * Y[i])
> +  abort ();
> +  }
> +
> +  return 0;
> +}
> +
> +/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" {
> target { vect_widen_mult_hi_to_si || vect_unpack } } } } */
> +/* { dg-final { scan-tree-dump-times "vect_recog_widen_mult_pattern:
> detected" 1 "vect" { target vect_widen_mult_hi_to_si_pattern } } } */
> +/* { dg-final { scan-tree-dump-times "pattern recognized" 1 "vect" {
> target vect_widen_mult_hi_to_si_pattern } } } */
> +/* { dg-final { cleanup-tree-dump "vect" } } */
> +
> diff --git a/gcc/testsuite/gcc.dg/vect/vect-widen-mult-u8-u32.c
> b/gcc/testsuite/gcc.dg/vect/vect-widen-mult-u8-u32.c
> new file mode 100644
> index 000..12c4692
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/vect/vect-widen-mult-u8-u32.c
> @@ -0,0 +1,48 @@
> +/* { dg-require-effective-target vect_int } */
> +
> +#include 
> +#include "tree-vect.h"
> +
> +#define N 64
> +
> +unsigned char X[N] __attribute__ ((__aligned__(__BIGGEST_ALIGNMENT__)));
> +unsigned char Y[N] __attribute__ ((__aligned__(__BIGGEST_ALIGNMENT__)));
> +unsigned int result[N];
> +
> +/* unsigned char-> unsigned int widening-mult.  */
> +__attribute__ ((noinline)) int
> +foo1(int len) {
> +  int i;
> +
> +  for (i=0; i<len; i++) {
> +    result[i] = X[i] * Y[i];
> +  }
> +}
> +
> +int main (void)
> +{
> +  int i;
> +
> +  check_vect ();
> +
> +  for (i=0; i<N; i++) {
> +    X[i] = i;
> +Y[i] = 64-i;
> +__asm__ volatile ("");
> +  }
> +
> +  foo1 (N);
> +
> +  for (i=0; i<N; i++) {
> +    if (result[i] != X[i] * Y[i])
> +  abort ();
> +  }
> +
> +  return 0;
> +}
> +
> +/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" {
> target { vect_widen_mult_qi_to_hi || vect_unpack } } } } */
> +/* { dg-final { scan-tree-dump-times "vect_recog_widen_mult_pattern:
> detected" 1 "vect" { target vect_widen_mult_qi_to_hi_pattern } } } */
> +/* { dg-final { scan-tree-dump-times "pattern recognized" 1 "vect" {
> target vect_widen_mult_qi_to_hi_pattern } } } */
> +/* { dg-final { cleanup-tree-dump "vect" } } */
> +
> diff --git a/gcc/tree-vect-patterns.c b/gcc/tree-vect-patterns.c
> index 7823cc3..f412e2d 100644
> --- a/gcc/tree-vect-patterns.c
> +++ b/gcc/tree-vect-patterns.c
> @@ -529,7 +529,8 @@ vect_handle_widen_op_by_const (gimple stmt, enum
> tree_code code,
>
> Try to find the following pattern:
>

Re: Remove obsolete Solaris 9 support

2014-04-17 Thread Rainer Orth

Uros Bizjak  writes:

> On Wed, Apr 16, 2014 at 1:16 PM, Rainer Orth
>  wrote:
>> Now that 4.9 has branched, it's time to actually remove the obsolete
>> Solaris 9 configuration.  Most of this is just legwork and falls under
>> my Solaris maintainership.
>>
>> A couple of questions, though:
>>
>> * Uros: I'm removing all sse_os_support() checks from the testsuite.
>>   Solaris 9 was the only consumer, so it seems best to do away with it.
>
> This is OK, but please leave sse-os-check.h (and corresponding
> sse_os_support calls) in the testsuite. Just remove the Solaris 9
> specific code from sse-os-check.h and always return 1, perhaps with
> the comment that all currently supported OSes support SSE
> instructions.

Done.  I'll repost the final patch once another round of testing has
completed.

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University

Re: [PATCH GCC]Fix pr60363 by adding backtraced value of phi arg along jump threading path

2014-04-17 Thread Bin.Cheng

On Thu, Apr 17, 2014 at 1:30 PM, Jeff Law  wrote:
> On 03/18/14 04:13, bin.cheng wrote:
>>
>> Hi,
>> After control flow graph change made by
>> http://gcc.gnu.org/ml/gcc-patches/2014-02/msg01492.html, case
>> gcc.dg/tree-ssa/ssa-dom-thread-4.c is broken on logical_op_short_circuit
>> targets including cortex-m3/cortex-m0.
>> The regression reveals a missed opportunity in jump threading, which
>> prevents a forwarder basic block from being removed in cfgcleanup after
>> jump threading in VRP1.  The root cause is stated in the corresponding PR:
>> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60363, please refer to it for
>> detailed report.
>>
>> This patch fixes the issue by adding constant value instead of ssa_name as
>> the new phi argument.  Bootstrap and test on x86_64, also test on cortex-m3
>> and the regression is gone.
>> I think this should wait for stage1, but would like to hear some comments
>> now.  So does it look reasonable?
>>
>>
>> 2014-03-18  Bin Cheng
>>
>> PR regression/60363
>> * gcc/tree-ssa-threadupdate.c (get_value_locus_in_path): New.
>> (copy_phi_args): New parameters.  Call get_value_locus_in_path.
>> (update_destination_phis): New parameter.
>> (create_edge_and_update_destination_phis): Ditto.
>> (ssa_fix_duplicate_block_edges): Pass new arguments.
>> (thread_single_edge): Ditto.
>
> This is a good and interesting catch. DOM knows how to propagate these
> context sensitive equivalences which should expose the optimizable forwarder
> blocks.
At the time I was looking into the problem, DOM couldn't understand
the equivalence.  Maybe it can be improved too.
>
> But I'm a big believer in catching as many CFG simplifications as early as
> we can as they tend to have nice cascading effects.  So if we can pick it up
> by being smarter in how we duplicate arguments, then I'm all for it.
>
>> +  for (int j = idx - 1; j >= 0; j--)
>> +{
>> +  edge e = (*path)[j]->e;
>> +  if (e->dest == def_bb)
>> +   {
>> + arg = gimple_phi_arg_def (def_phi, e->dest_idx);
>> + *locus = gimple_phi_arg_location (def_phi, e->dest_idx);
>> + return (TREE_CODE (arg) == INTEGER_CST ? arg : def);
>
> Presumably any constant that can legitimately appear in a PHI node is good
> here.  So for example ADDR_EXPR  ought to be
> handled as well.
>
> One could also argue that we should go ahead and do a context sensitive copy
> propagation here too if ARG turns out to be an SSA_NAME.  You have to be a
> bit more careful with those and use may_propagate_copy_p and you'd probably
> want to test the loop depth of the SSA_NAMEs to ensure you're not doing a
> propagation that is going to muck up LICM.  See loop_depth_of_name uses in
> tree-ssa-dom.c.
>
> Overall I think it's good.  We just need to resolve whether or not we want
> to catch constant ADDR_EXPRs and/or do the context sensitive copy
> propagations.
Do you mean const/copy propagation in jump threading optimization, or
just an independent opt somewhere else?  It's naturally flow sensitive
along jump threading path, which looks interesting to me.

Thanks,
bin
>
> jeff



-- 
Best Regards.

Re: [PATCH] Conditional count update for fast coverage test in multi-threaded programs

2014-04-17 Thread Richard Biener

On Fri, Dec 20, 2013 at 11:45 PM, Rong Xu  wrote:
> Here are the results using our internal benchmarks, which are a mix of
> multi-threaded and single-threaded programs.
> This was collected about a month ago but I did not get time to send it
> due to an unexpected trip.
>
> cmpxchg gives the worst performance due to the memory barriers it incurs.
> I'll send a patch that supports conditional_1 and unconditional_1.

Bah, too bad.  Ok, so another idea is to use non-temporal unconditional
store of 1.  According to docs this is weakly ordered and side-steps the
cache (movnti/movntq on x86).
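
For reference, the three counter-update variants compared in the quoted
results further down can be sketched in C roughly as follows.  The helper
names and the counter type are hypothetical and are not taken from the patch;
the cmpxchg variant assumes GCC's __sync builtins, and the non-temporal store
suggested above is not shown.

```c
#include <assert.h>

/* Hypothetical arc counter type, used only for this illustration.  */
typedef long counter_t;

/* (1) conditional: test first and store only if the counter is still
   zero, so a counter that is already set is never written again.  */
void
update_conditional (counter_t *counter)
{
  if (*counter == 0)
    *counter = 1;
}

/* (2) unconditional: always store 1, with no load and no branch.  */
void
update_unconditional (counter_t *counter)
{
  *counter = 1;
}

/* (3) cmpxchg: atomically replace 0 by 1.  The atomic operation acts as
   a full barrier, which the quoted message names as the reason for the
   poor results of this variant.  */
void
update_cmpxchg (counter_t *counter)
{
  (void) __sync_val_compare_and_swap (counter, 0, 1);
}
```

All three leave the counter at 1 however often they run; they differ only in
the memory traffic and ordering they impose.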

Btw,

+@item coverage-exec_once
+Set to 1 to update each arc counter only once. This avoids false sharing
+and speeds up profile-generate run for multi-threaded programs.
+

for -fprofile-generate this certainly is a bad choice but I can see that
it is useful for -coverage.  Also avoid mixing - and _ here.  Using a
--param here is not the proper thing for a non-developer feature,
thus I'd suggest to add -fcoverage-exec-once instead.  It is supposed
to help for -coverage, right?  Not for -fprofile-generate?

Instead of using a pointer-set to record stmts to "instrument" just
set one of the pass-local flags on the stmts (gimple_set_plf/gimple_plf,
you have to clear flags before using them).

Thanks,
Richard.

> - result -
>
> base: original_coverage
> (1): using_conditional_1 -- using branch (my original implementation)
> (2): using_unconditional_1 -- write straight 1
> (3): using_cmpxchg -- using cmpxchg write 1
>
> Values are performance ratios where 100.0 equals the performance of
> O2. Larger numbers are faster.
> "--" means the test failed due to running too slowly.
>
> arch: westmere
> Benchmark        Base       (1)       (2)       (3)
> ---------------------------------------------------
> benchmark_1      26.4  +176.62%   +17.20%        --
> benchmark_2        --    [78.4]    [12.3]        --
> benchmark_3      86.3    +6.15%   +10.52%   -61.28%
> benchmark_4      88.4    +6.59%   +14.26%   -68.76%
> benchmark_5      89.6    +6.26%   +13.00%   -68.74%
> benchmark_6      76.7   +22.28%   +29.15%   -75.31%
> benchmark_7      89.0    -0.62%    +3.36%   -71.37%
> benchmark_8      84.5    -1.45%    +5.27%   -74.04%
> benchmark_9      81.3   +10.64%   +13.32%   -72.82%
> benchmark_10     59.1   +44.71%   +14.77%   -73.24%
> benchmark_11     90.3    -1.74%    +4.22%   -61.95%
> benchmark_12     98.9    +0.07%    +0.48%    -6.37%
> benchmark_13     74.0    -4.69%    +4.35%   -77.02%
> benchmark_14     21.4  +309.92%   +63.41%   -35.82%
> benchmark_15     21.4  +282.33%   +58.15%   -57.98%
> benchmark_16     85.1    -7.71%    +1.65%   -60.72%
> benchmark_17     81.7    +2.47%    +8.20%   -72.08%
> benchmark_18     83.7    +1.59%    +3.83%   -69.33%
> geometric mean          +30.30%   +14.41%   -65.66% (incomplete)
> arch: sandybridge
> Benchmark        Base       (1)       (2)       (3)
> ---------------------------------------------------
> benchmark_1        --    [70.1]    [26.1]        --
> benchmark_2        --    [79.1]        --        --
> benchmark_3      84.3   +10.82%   +15.84%   -68.98%
> benchmark_4      88.5   +10.28%   +11.35%   -75.10%
> benchmark_5      89.4   +10.46%   +11.40%   -74.41%
> benchmark_6      65.5   +38.52%   +44.46%   -77.97%
> benchmark_7      87.7    -0.16%    +1.74%   -76.19%
> benchmark_8      89.6    -4.52%    +6.29%   -78.10%
> benchmark_9      79.9   +13.43%   +19.44%   -75.99%
> benchmark_10     52.6   +61.53%    +8.23%   -78.41%
> benchmark_11     89.9    -1.40%    +3.37%   -68.16%
> benchmark_12     99.0    +1.51%    +0.63%   -10.37%
> benchmark_13     74.3    -6.75%    +3.89%   -81.84%
> benchmark_14     21.8  +295.76%   +19.48%   -51.58%
> benchmark_15     23.5  +257.20%   +29.33%   -83.53%
> benchmark_16     84.4   -10.04%    +2.39%   -68.25%
> benchmark_17     81.6    +0.60%    +8.82%   -78.02%
> benchmark_18     87.4    -1.14%    +9.69%   -75.88%
> geometric mean          +25.64%   +11.76%   -72.96% (incomplete)
>
> arch: clovertown
> Benchmark        Base       (1)       (2)       (3)
> ---------------------------------------------------
> benchmark_1        --    [83.4]        --        --
> benchmark_2        --    [82.3]        --        --
> benchmark_3      86.2    +7.58%   +13.10%   -81.74%
> benchmark_4      89.4    +5.69%   +11.70%   -82.97%
> benchmark_5      92.8    +4.67%    +7.48%   -80.02%
> benchmark_6      78.1   +13.28%   +22.21%   -86.92%
> benchmark_7      96.8    +0.25%    +5.44%   -84.94%
> benchmark_8      89.1    +0.66%    +3.60%   -85.89%
> benchmark_9      86.4    +8.42%    +9.95%   -82.30%
> benchmark_10     59.7   +44.95%   +

Re: [PATCH, i386, PR57623] Introduce synonyms for BMI intrinsics

2014-04-17 Thread Jakub Jelinek

On Wed, Jul 03, 2013 at 08:14:25AM +0200, Uros Bizjak wrote:
> On Tue, Jul 2, 2013 at 10:32 AM, Kirill Yukhin  
> wrote:
> > Bootstrap passing. Updated tests passing on BMI-featured HW.
> >
> > ChangeLog:
> > 2013-07-02  Kirill Yukhin  
> >
> > * config/i386/bmiintrin.h (_blsi_u32): New.
> > (_blsi_u64): Ditto.
> > (_blsr_u32): Ditto.
> > (_blsr_u64): Ditto.
> > (_blsmsk_u32): Ditto.
> > (_blsmsk_u64): Ditto.
> > (_tzcnt_u32): Ditto.
> > (_tzcnt_u64): Ditto.
> >
> > testsuite/ChangeLog:
> > 2013-07-02  Kirill Yukhin  
> >
> > * gcc.target/i386/bmi-1.c: Extend with new instrinsics.
> > Fix scan patterns.
> > * gcc.target/i386/bmi-2.c: Ditto.
> >
> > [1] http://gcc.gnu.org/ml/gcc-patches/2013-06/msg01286.html
> 
> This is OK for mainline.
> 
> BTW: Do we want to backport this patch (and your previous) to 4.8 branch?

Kirill, you've committed this only to the 4.8 branch and not to the trunk,
which means we actually regress on this in 4.9 compared to 4.8.2.

As the patch has been approved, I went ahead and after testing it
on x86_64 (-m32/-m64) committed it to the trunk and 4.9.
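
For readers unfamiliar with these BMI operations, the bit manipulations behind
the intrinsics can be written in portable C as below.  This is a sketch with
hypothetical function names, not the contents of bmiintrin.h.  One assumption
to note: the trailing-zero helper returns 32 for a zero input, which is how
the TZCNT instruction behaves, whereas __builtin_ctz is undefined for zero.

```c
#include <assert.h>
#include <stdint.h>

/* blsi: isolate the lowest set bit.  */
uint32_t
isolate_lowest_set (uint32_t x)
{
  return x & (0u - x);
}

/* blsmsk: mask covering the lowest set bit and every bit below it.  */
uint32_t
mask_up_to_lowest_set (uint32_t x)
{
  return x ^ (x - 1u);
}

/* blsr: clear the lowest set bit.  */
uint32_t
reset_lowest_set (uint32_t x)
{
  return x & (x - 1u);
}

/* tzcnt: number of trailing zero bits; 32 when x is zero.  */
unsigned
count_trailing_zeros (uint32_t x)
{
  unsigned n = 0;

  if (x == 0)
    return 32;
  while ((x & 1u) == 0)
    {
      x >>= 1;
      n++;
    }
  return n;
}
```

For x = 44 (binary 101100) these give 4, 7, 40 and 2 respectively.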

2014-04-17  Jakub Jelinek  

PR target/60847
Forward port from 4.8 branch
2013-07-19  Kirill Yukhin  

* config/i386/bmiintrin.h (_blsi_u32): New.
(_blsi_u64): Ditto.
(_blsr_u32): Ditto.
(_blsr_u64): Ditto.
(_blsmsk_u32): Ditto.
(_blsmsk_u64): Ditto.
(_tzcnt_u32): Ditto.
(_tzcnt_u64): Ditto.

* gcc.target/i386/bmi-1.c: Extend with new instrinsics.
Fix scan patterns.
* gcc.target/i386/bmi-2.c: Ditto.

--- gcc/config/i386/bmiintrin.h (revision 201046)
+++ gcc/config/i386/bmiintrin.h (revision 201047)
@@ -40,7 +40,6 @@ __tzcnt_u16 (unsigned short __X)
   return __builtin_ctzs (__X);
 }
 
-
 extern __inline unsigned int __attribute__((__gnu_inline__, __always_inline__, 
__artificial__))
 __andn_u32 (unsigned int __X, unsigned int __Y)
 {
@@ -66,17 +65,34 @@ __blsi_u32 (unsigned int __X)
 }
 
 extern __inline unsigned int __attribute__((__gnu_inline__, __always_inline__, 
__artificial__))
+_blsi_u32 (unsigned int __X)
+{
+  return __blsi_u32 (__X);
+}
+
+extern __inline unsigned int __attribute__((__gnu_inline__, __always_inline__, 
__artificial__))
 __blsmsk_u32 (unsigned int __X)
 {
   return __X ^ (__X - 1);
 }
 
 extern __inline unsigned int __attribute__((__gnu_inline__, __always_inline__, 
__artificial__))
+_blsmsk_u32 (unsigned int __X)
+{
+  return __blsmsk_u32 (__X);
+}
+
+extern __inline unsigned int __attribute__((__gnu_inline__, __always_inline__, 
__artificial__))
 __blsr_u32 (unsigned int __X)
 {
   return __X & (__X - 1);
 }
 
+extern __inline unsigned int __attribute__((__gnu_inline__, __always_inline__, 
__artificial__))
+_blsr_u32 (unsigned int __X)
+{
+  return __blsr_u32 (__X);
+}
 
 extern __inline unsigned int __attribute__((__gnu_inline__, __always_inline__, 
__artificial__))
 __tzcnt_u32 (unsigned int __X)
@@ -84,6 +100,12 @@ __tzcnt_u32 (unsigned int __X)
   return __builtin_ctz (__X);
 }
 
+extern __inline unsigned int __attribute__((__gnu_inline__, __always_inline__, 
__artificial__))
+_tzcnt_u32 (unsigned int __X)
+{
+  return __builtin_ctz (__X);
+}
+
 
 #ifdef  __x86_64__
 extern __inline unsigned long long __attribute__((__gnu_inline__, 
__always_inline__, __artificial__))
@@ -111,22 +133,46 @@ __blsi_u64 (unsigned long long __X)
 }
 
 extern __inline unsigned long long __attribute__((__gnu_inline__, 
__always_inline__, __artificial__))
+_blsi_u64 (unsigned long long __X)
+{
+  return __blsi_u64 (__X);
+}
+
+extern __inline unsigned long long __attribute__((__gnu_inline__, 
__always_inline__, __artificial__))
 __blsmsk_u64 (unsigned long long __X)
 {
   return __X ^ (__X - 1);
 }
 
 extern __inline unsigned long long __attribute__((__gnu_inline__, 
__always_inline__, __artificial__))
+_blsmsk_u64 (unsigned long long __X)
+{
+  return __blsmsk_u64 (__X);
+}
+
+extern __inline unsigned long long __attribute__((__gnu_inline__, 
__always_inline__, __artificial__))
 __blsr_u64 (unsigned long long __X)
 {
   return __X & (__X - 1);
 }
 
 extern __inline unsigned long long __attribute__((__gnu_inline__, 
__always_inline__, __artificial__))
+_blsr_u64 (unsigned long long __X)
+{
+  return __blsr_u64 (__X);
+}
+
+extern __inline unsigned long long __attribute__((__gnu_inline__, 
__always_inline__, __artificial__))
 __tzcnt_u64 (unsigned long long __X)
 {
   return __builtin_ctzll (__X);
 }
+
+extern __inline unsigned long long __attribute__((__gnu_inline__, 
__always_inline__, __artificial__))
+_tzcnt_u64 (unsigned long long __X)
+{
+  return __builtin_ctzll (__X);
+}
 
 #endif /* __x86_64__  */
 
--- gcc/testsuite/gcc.target/i386/bmi-1.c   (revision 201046)
+++ gcc/testsuite/gcc.target/i386/bmi-1.c   (revision 201047)
@@ -2,10 +2,10 @@
 /* { dg-options "-O2 -mbmi " } */
 /* { dg-final { scan-assembler "andn\[^

Re: [PATCH][2/3] Fix PR54733 Optimize endian independent load/store

2014-04-17 Thread Richard Biener

On Thu, Apr 17, 2014 at 7:19 AM, Thomas Preud'homme
 wrote:
>> From: Richard Biener [mailto:richard.guent...@gmail.com]
>>
>> With handling only the outermost handled-component and then only a
>> selected subset you'll catch many but not all cases.  Why not simply
>> use get_inner_reference () here (plus stripping the constant offset
>> from an innermost MEM_REF) and get the best of both worlds (not
>> duplicate parts of its logic and handle more cases)?  Eventually
>> using tree-affine.c and get_inner_reference_aff is even more appropriate
>> so you can compute the address differences without decomposing
>> them yourselves.
>
> Why does the constant offset from an innermost MEM_REF need to be
> stripped? Shouldn't that be part of the offset in the symbolic number?

Yes, but get_inner_reference returns MEM[ptr, constant-offset] as base,
thus it doesn't move the constant offset therein to bitpos and doesn't
return MEM[ptr, 0].  You have to do that yourself.

(as you are really interested in the _address_ of the memory reference
instead of the reference itself it would be appropriate to introduce a
variant of get_inner_reference that returns 'ptr' in this case and
&x for x.field1 for example)

>>
>> + /*  Compute address to load from and cast according to the size
>> + of the load.  */
>> + load_ptr_type = build_pointer_type (load_type);
>> + addr_expr = build1 (ADDR_EXPR, load_ptr_type, bswap_src);
>> + addr_tmp = make_temp_ssa_name (load_ptr_type, NULL,
>> "load_src");
>> + addr_stmt = gimple_build_assign_with_ops
>> +(NOP_EXPR, addr_tmp, addr_expr, NULL);
>> + gsi_insert_before (&gsi, addr_stmt, GSI_SAME_STMT);
>> +
>> + /* Perform the load.  */
>> + load_offset_ptr = build_int_cst (load_ptr_type, 0);
>> + val_tmp = make_temp_ssa_name (load_type, NULL, "load_dst");
>> + val_expr = build2 (MEM_REF, load_type, addr_tmp, 
>> load_offset_ptr);
>> + load_stmt = gimple_build_assign_with_ops
>> +(MEM_REF, val_tmp, val_expr, NULL);
>>
>> this is unnecessarily complex and has TBAA issues.  You don't need to
>> create a "correct" pointer type, so doing
>>
>> addr_expr = fold_build_addr_expr (bswap_src);
>>
>> is enough.  Now, to fix the TBAA issues you either need to remember
>> and "combine" the reference_alias_ptr_type of each original load and
>> use that for the load_offset_ptr value or decide that isn't worth it and
>> use alias-set zero (use ptr_type_node).
>
> Sorry, this is only my second patch [1] to gcc so not everything is clear to
> me.  The TBAA issue you mention comes from val_expr referring to a memory
> area that overlaps with the smaller memory areas used in the bitwise OR
> operation, am I right? Now, I have no idea how to combine the values
> returned by reference_alias_ptr_type () for each individual small memory
> area. Can you advise me on this? And what are the effects of not doing it
> and using ptr_type_node for the alias-set?

You can "combine" two reference_alias_ptr_type()s with

  if (alias_ptr_types_compatible_p (type1, type2))
    return type1;
  else
    return ptr_type_node;

using ptr_type_node for the alias-set will make it alias with all memory
references (that is, type-based disambiguation will be disabled).  That's
required for example if you combine four loads with type 'short' using
a single load with type 'long'.
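
As background, the kind of source pattern this series aims to turn into a
single wide load (optionally followed by a byte swap) is the familiar
endian-independent load.  The sketch below uses hypothetical function names
and byte-sized accesses; since character-typed accesses may alias anything in
C, it does not itself run into the TBAA question discussed above, which arises
when the narrow loads have a type such as 'short'.

```c
#include <assert.h>
#include <stdint.h>

/* Assemble a 32-bit value from four bytes stored in little-endian order.
   The four narrow loads cover four adjacent bytes, so one 32-bit load
   could replace them.  */
uint32_t
load_le32 (const unsigned char *p)
{
  return (uint32_t) p[0]
         | ((uint32_t) p[1] << 8)
         | ((uint32_t) p[2] << 16)
         | ((uint32_t) p[3] << 24);
}

/* The same four bytes interpreted in big-endian order.  */
uint32_t
load_be32 (const unsigned char *p)
{
  return ((uint32_t) p[0] << 24)
         | ((uint32_t) p[1] << 16)
         | ((uint32_t) p[2] << 8)
         | (uint32_t) p[3];
}
```

On a little-endian host the first function corresponds to a plain 32-bit load
and the second to a load plus a byte swap; on a big-endian host the roles are
reversed.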

> [1] First one was a fix on the existing implementation of the bswap pass.
>
>>
>> Can you also expand the comment about "size vs. range"?  Is it
>> that range can be bigger than size if you have (short)a[0] |
>> ((short)a[3] << 1) sofar
>> where size == 2 but range == 3?  Thus range can also be smaller than size
>> for example for (short)a[0] | ((short)a[0] << 1) where range would be 1 and
>> size == 2?  I suppose adding two examples like this to the comment, together
>> with the expected value of 'n' would help here.
>
> You understood correctly. I will add the suggested example.
>
>> Otherwise the patch looks good.  Now we're only missing the addition
>> of trying to match to a VEC_PERM_EXPR with a constant mask
>> using can_vec_perm_p ;)
>
> Is that the vector shuffle engine you were mentioning in PR54733? If I
> understand correctly it is a generalization of the check against CMPNOP and
> CMPXCHG in find_bswap in this new patchset. I will look if ARM could
> benefit from this and if yes I might take a look (yep, two conditions).

Yep.  For example it might match on things like

int foo (char *x)
{
   return ((((x[0] << 1 | x[0]) << 1) | x[1]) << 1) | x[0];
}

not sure if target support for shuffles on small vectors (or vector parts)
is working well.  Thus on v1si as in the example.

Richard.

> Thanks a lot for such quick and detailed comments after my ping.
>
> Best regards,
>
> Thomas
>
>
>

Re: [PATCH] dwarf2out: Use normal constant values in bound_info if possible.

2014-04-17 Thread Mark Wielaard

On Tue, 2014-04-15 at 14:24 -0700, Cary Coutant wrote:
> > +   /* If HOST_WIDE_INT is big enough then represent the bound as
> > +  a constant value.  Note that we need to make sure the type
> > +  is signed or unsigned.  We cannot just add an unsigned
> > +  constant if the value itself is positive.  Some DWARF
> > +  consumers will lookup the bounds type and then sign extend
> > +  any unsigned values found for signed types.  This is only
> > +  for DW_AT_lower_bound, normally unsigned values
> > +  (DW_FORM_data[1248]) are assumed to not need
> > +  sign-extension.  */
> 
> This comment confuses me.

Sorry, obviously not my intention. But I see what I was trying to say
and how I said it didn't make things very clear. Apologies.

>  By "we need to make sure the type is signed
> or unsigned" (what else can it be?), I think you mean "we need to
> choose a form based on whether the type is signed or unsigned."

Yes, right. I was confusing matters in my comment because I was thinking
of non-constants (reference or exprlocs) that are handled elsewhere
later on in the code.

>  And by "This is only for DW_AT_lower_bound, ...", I think you mean "This is
> needed only for DW_AT_{lower,upper}_bound, since for most other
> attributes, consumers will treat DW_FORM_data[1248] as unsigned
> values, regardless of the underlying type."

Yes, right again.

> Otherwise, the patch looks OK to me.

Thanks I pushed it with the comment changed to how you expressed things.
It now reads:

/* If HOST_WIDE_INT is big enough then represent the bound as   
   a constant value.  We need to choose a form based on 
   whether the type is signed or unsigned.  We cannot just  
   call add_AT_unsigned if the value itself is positive 
   (add_AT_unsigned might add the unsigned value encoded as 
   DW_FORM_data[1248]).  Some DWARF consumers will lookup the   
   bounds type and then sign extend any unsigned values found   
   for signed types.  This is needed only for   
   DW_AT_{lower,upper}_bound, since for most other attributes,  
   consumers will treat DW_FORM_data[1248] as unsigned values,  
   regardless of the underlying type.  */
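
A small C sketch of the consumer behaviour the comment guards against.  The
function names are hypothetical, and the example assumes a bound stored in a
single byte (as with DW_FORM_data1); it is not code from any actual DWARF
consumer.

```c
#include <assert.h>
#include <stdint.h>

/* Consumer that treats the 1-byte constant as unsigned.  */
int64_t
decode_data1_unsigned (uint8_t raw)
{
  return (int64_t) raw;
}

/* Consumer that looks up the type of the bound, sees that it is signed,
   and therefore sign-extends the 1-byte constant.  */
int64_t
decode_data1_sign_extended (uint8_t raw)
{
  return raw >= 0x80 ? (int64_t) raw - 256 : (int64_t) raw;
}
```

An upper bound of 255 written as the single byte 0xff reads back as 255 with
the first decoder but as -1 with the second, which is why the form has to be
chosen according to the signedness of the bound's type.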

Thanks,

Mark

Re: [PATCH] Fix PR60849

2014-04-17 Thread Marc Glisse


On Thu, 17 Apr 2014, Richard Biener wrote:


This fixes PR60849 by properly rejecting non-boolean typed
comparisons from valid_gimple_rhs_p so they go through the
gimplification paths.


Could you also accept vector comparisons please?

--
Marc Glisse

Re: [PATCH, i386, PR57623] Introduce synonyms for BMI intrinsics

2014-04-17 Thread Kirill Yukhin

Thanks! Sorry, missed that!

K

On Thu, Apr 17, 2014 at 2:13 PM, Jakub Jelinek  wrote:
> On Wed, Jul 03, 2013 at 08:14:25AM +0200, Uros Bizjak wrote:
>> On Tue, Jul 2, 2013 at 10:32 AM, Kirill Yukhin  
>> wrote:
>> > Bootstrap passing. Updated tests passing on BMI-featured HW.
>> >
>> > ChangeLog:
>> > 2013-07-02  Kirill Yukhin  
>> >
>> > * config/i386/bmiintrin.h (_blsi_u32): New.
>> > (_blsi_u64): Ditto.
>> > (_blsr_u32): Ditto.
>> > (_blsr_u64): Ditto.
>> > (_blsmsk_u32): Ditto.
>> > (_blsmsk_u64): Ditto.
>> > (_tzcnt_u32): Ditto.
>> > (_tzcnt_u64): Ditto.
>> >
>> > testsuite/ChangeLog:
>> > 2013-07-02  Kirill Yukhin  
>> >
>> > * gcc.target/i386/bmi-1.c: Extend with new instrinsics.
>> > Fix scan patterns.
>> > * gcc.target/i386/bmi-2.c: Ditto.
>> >
>> > [1] http://gcc.gnu.org/ml/gcc-patches/2013-06/msg01286.html
>>
>> This is OK for mainline.
>>
>> BTW: Do we want to backport this patch (and your previous) to 4.8 branch?
>
> Kirill, you've committed this only to the 4.8 branch and not to the trunk,
> which means we actually regress on this in 4.9 compared to 4.8.2.
>
> As the patch has been approved, I went ahead and after testing it
> on x86_64 (-m32/-m64) committed it to the trunk and 4.9.
>
> 2014-04-17  Jakub Jelinek  
>
> PR target/60847
> Forward port from 4.8 branch
> 2013-07-19  Kirill Yukhin  
>
> * config/i386/bmiintrin.h (_blsi_u32): New.
> (_blsi_u64): Ditto.
> (_blsr_u32): Ditto.
> (_blsr_u64): Ditto.
> (_blsmsk_u32): Ditto.
> (_blsmsk_u64): Ditto.
> (_tzcnt_u32): Ditto.
> (_tzcnt_u64): Ditto.
>
> * gcc.target/i386/bmi-1.c: Extend with new instrinsics.
> Fix scan patterns.
> * gcc.target/i386/bmi-2.c: Ditto.
>
> --- gcc/config/i386/bmiintrin.h (revision 201046)
> +++ gcc/config/i386/bmiintrin.h (revision 201047)
> @@ -40,7 +40,6 @@ __tzcnt_u16 (unsigned short __X)
>return __builtin_ctzs (__X);
>  }
>
> -
>  extern __inline unsigned int __attribute__((__gnu_inline__, 
> __always_inline__, __artificial__))
>  __andn_u32 (unsigned int __X, unsigned int __Y)
>  {
> @@ -66,17 +65,34 @@ __blsi_u32 (unsigned int __X)
>  }
>
>  extern __inline unsigned int __attribute__((__gnu_inline__, 
> __always_inline__, __artificial__))
> +_blsi_u32 (unsigned int __X)
> +{
> +  return __blsi_u32 (__X);
> +}
> +
> +extern __inline unsigned int __attribute__((__gnu_inline__, 
> __always_inline__, __artificial__))
>  __blsmsk_u32 (unsigned int __X)
>  {
>return __X ^ (__X - 1);
>  }
>
>  extern __inline unsigned int __attribute__((__gnu_inline__, 
> __always_inline__, __artificial__))
> +_blsmsk_u32 (unsigned int __X)
> +{
> +  return __blsmsk_u32 (__X);
> +}
> +
> +extern __inline unsigned int __attribute__((__gnu_inline__, 
> __always_inline__, __artificial__))
>  __blsr_u32 (unsigned int __X)
>  {
>return __X & (__X - 1);
>  }
>
> +extern __inline unsigned int __attribute__((__gnu_inline__, 
> __always_inline__, __artificial__))
> +_blsr_u32 (unsigned int __X)
> +{
> +  return __blsr_u32 (__X);
> +}
>
>  extern __inline unsigned int __attribute__((__gnu_inline__, 
> __always_inline__, __artificial__))
>  __tzcnt_u32 (unsigned int __X)
> @@ -84,6 +100,12 @@ __tzcnt_u32 (unsigned int __X)
>return __builtin_ctz (__X);
>  }
>
> +extern __inline unsigned int __attribute__((__gnu_inline__, 
> __always_inline__, __artificial__))
> +_tzcnt_u32 (unsigned int __X)
> +{
> +  return __builtin_ctz (__X);
> +}
> +
>
>  #ifdef  __x86_64__
>  extern __inline unsigned long long __attribute__((__gnu_inline__, 
> __always_inline__, __artificial__))
> @@ -111,22 +133,46 @@ __blsi_u64 (unsigned long long __X)
>  }
>
>  extern __inline unsigned long long __attribute__((__gnu_inline__, 
> __always_inline__, __artificial__))
> +_blsi_u64 (unsigned long long __X)
> +{
> +  return __blsi_u64 (__X);
> +}
> +
> +extern __inline unsigned long long __attribute__((__gnu_inline__, 
> __always_inline__, __artificial__))
>  __blsmsk_u64 (unsigned long long __X)
>  {
>return __X ^ (__X - 1);
>  }
>
>  extern __inline unsigned long long __attribute__((__gnu_inline__, 
> __always_inline__, __artificial__))
> +_blsmsk_u64 (unsigned long long __X)
> +{
> +  return __blsmsk_u64 (__X);
> +}
> +
> +extern __inline unsigned long long __attribute__((__gnu_inline__, 
> __always_inline__, __artificial__))
>  __blsr_u64 (unsigned long long __X)
>  {
>return __X & (__X - 1);
>  }
>
>  extern __inline unsigned long long __attribute__((__gnu_inline__, 
> __always_inline__, __artificial__))
> +_blsr_u64 (unsigned long long __X)
> +{
> +  return __blsr_u64 (__X);
> +}
> +
> +extern __inline unsigned long long __attribute__((__gnu_inline__, 
> __always_inline__, __artificial__))
>  __tzcnt_u64 (unsigned long long __X)
>  {
>return __builtin_ctzll (__X);
>  }
> +
> +extern __inline unsigned long long __attribute__((__gnu_inline__, 
>

Re: [PATCH] Fix PR60849

2014-04-17 Thread Richard Biener

On Thu, 17 Apr 2014, Marc Glisse wrote:

> On Thu, 17 Apr 2014, Richard Biener wrote:
> 
> > This fixes PR60849 by properly rejecting non-boolean typed
> > comparisons from valid_gimple_rhs_p so they go through the
> > gimplification paths.
> 
> Could you also accept vector comparisons please?

Sure.  Testing in progress.

Richard.

2014-04-17  Richard Biener  

PR middle-end/60849
* tree-ssa-propagate.c (valid_gimple_rhs_p): Allow vector
comparison results and add clarifying comment.

Index: gcc/tree-ssa-propagate.c
===
--- gcc/tree-ssa-propagate.c(revision 209469)
+++ gcc/tree-ssa-propagate.c(working copy)
@@ -572,9 +572,13 @@ valid_gimple_rhs_p (tree expr)
   break;
 
 case tcc_comparison:
-  if (!INTEGRAL_TYPE_P (TREE_TYPE (expr))
- || (TREE_CODE (TREE_TYPE (expr)) != BOOLEAN_TYPE
- && TYPE_PRECISION (TREE_TYPE (expr)) != 1))
+  /* GENERIC allows comparisons with non-boolean types, reject
+ those for GIMPLE.  Let vector-typed comparisons pass - rules
+for GENERIC and GIMPLE are the same here.  */
+  if (!(INTEGRAL_TYPE_P (TREE_TYPE (expr))
+   && (TREE_CODE (TREE_TYPE (expr)) == BOOLEAN_TYPE
+   || TYPE_PRECISION (TREE_TYPE (expr)) == 1))
+ && TREE_CODE (TREE_TYPE (expr)) != VECTOR_TYPE)
return false;
 
   /* Fallthru.  */
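
For readers who want to check the rewritten condition outside GCC, here is a minimal sketch. It is a toy model only: struct toy_type and its fields are hypothetical stand-ins for the tree type queries used in the patch, not GCC API. It shows that for non-vector types the new test is the De Morgan form of the old one, and that vector-typed comparisons are now let through.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a GCC type node; not the real tree API.  */
struct toy_type
{
  bool integral;   /* models INTEGRAL_TYPE_P (type) */
  bool boolean;    /* models TREE_CODE (type) == BOOLEAN_TYPE */
  int precision;   /* models TYPE_PRECISION (type) */
  bool vector;     /* models TREE_CODE (type) == VECTOR_TYPE */
};

/* Old condition: true means the comparison is rejected.  */
static bool
old_rejects (const struct toy_type *t)
{
  return !t->integral || (!t->boolean && t->precision != 1);
}

/* New condition: the same test, except that vector types pass.  */
static bool
new_rejects (const struct toy_type *t)
{
  return !(t->integral && (t->boolean || t->precision == 1))
         && !t->vector;
}
```

In this model the two predicates differ only for a type with the vector flag set.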

[PATCH] Try to coalesce for unary and binary ops

2014-04-17 Thread Richard Biener


The patch below increases the number of coalesces we attempt
to also cover unary and binary operations.  This improves
initial code generation for code like

int foo (int i, int j, int k, int l)
{
  int res = i;
  res += j;
  res += k;
  res += l;
  return res;
}

from

;; res_3 = i_1(D) + j_2(D);

(insn 9 8 0 (parallel [
(set (reg/v:SI 83 [ res ])
(plus:SI (reg/v:SI 87 [ i ])
(reg/v:SI 88 [ j ])))
(clobber (reg:CC 17 flags))
]) t.c:4 -1
 (nil))

;; res_5 = res_3 + k_4(D);

(insn 10 9 0 (parallel [
(set (reg/v:SI 84 [ res ])
(plus:SI (reg/v:SI 83 [ res ])
(reg/v:SI 89 [ k ])))
(clobber (reg:CC 17 flags))
]) t.c:5 -1
 (nil))
...

to

;; res_3 = i_1(D) + j_2(D);

(insn 9 8 0 (parallel [
(set (reg/v:SI 83 [ res ])
(plus:SI (reg/v:SI 85 [ i ])
(reg/v:SI 86 [ j ])))
(clobber (reg:CC 17 flags))
]) t.c:4 -1
 (nil))

;; res_5 = res_3 + k_4(D);

(insn 10 9 0 (parallel [
(set (reg/v:SI 83 [ res ])
(plus:SI (reg/v:SI 83 [ res ])
(reg/v:SI 87 [ k ])))
(clobber (reg:CC 17 flags))
]) t.c:5 -1
 (nil))

re-using the same pseudo for the LHS.

Expansion has special code to improve coalescing of op1 with
target thus this is what we try to match here.

There are both positive and negative size effects during
a bootstrap on x86_64, but overall it seems to be a loss:
stage3 cc1 text size is 18261647 bytes without the patch
compared to 18265751 bytes with the patch (4104 bytes larger).

Now the question is what does this tell us?  Not re-using
the same pseudo as op and target is always better?

Btw, I tried this to find a convincing metric for an intra-BB
scheduling pass (during out-of-SSA) on GIMPLE (to be able
to kill that odd scheduling code we now have in reassoc).
And to have something that TER does not immediately undo we have
to disable TER, which conveniently happens for coalesced
SSA names.  Thus -> schedule for "register pressure", and thus
reduce SSA name lifetime - with the goal that out-of-SSA can
do more coalescing.  But it won't even try to coalesce
anything else than PHI copies (not affected by scheduling)
or plain SSA name copies (shouldn't happen anyway due to
copy propagation).

So - any ideas?  Or is the overall negative for cc1 just
an artifact to ignore and we _should_ coalesce as much
as possible (even if it doesn't avoid copies - thus the
"cost" of 0 used in the patch)?

Otherwise the patch bootstraps and tests fine on x86_64-unknown-linux-gnu.

Thanks,
Richard.

2014-04-17  Richard Biener  

* tree-ssa-coalesce.c (create_outofssa_var_map): Try to
coalesce SSA name uses with SSA name results in all unary
and binary operations.

Index: gcc/tree-ssa-coalesce.c
===
*** gcc/tree-ssa-coalesce.c (revision 209469)
--- gcc/tree-ssa-coalesce.c (working copy)
*** create_outofssa_var_map (coalesce_list_p
*** 991,1007 
case GIMPLE_ASSIGN:
  {
tree lhs = gimple_assign_lhs (stmt);
tree rhs1 = gimple_assign_rhs1 (stmt);
!   if (gimple_assign_ssa_name_copy_p (stmt)
&& gimple_can_coalesce_p (lhs, rhs1))
  {
v1 = SSA_NAME_VERSION (lhs);
v2 = SSA_NAME_VERSION (rhs1);
!   cost = coalesce_cost_bb (bb);
!   add_coalesce (cl, v1, v2, cost);
bitmap_set_bit (used_in_copy, v1);
bitmap_set_bit (used_in_copy, v2);
  }
  }
  break;
  
--- 993,1031 
case GIMPLE_ASSIGN:
  {
tree lhs = gimple_assign_lhs (stmt);
+   if (TREE_CODE (lhs) != SSA_NAME)
+ break;
+ 
+   /* Expansion handles target == op1 properly and also
+  target == op2 for commutative binary ops.  */
tree rhs1 = gimple_assign_rhs1 (stmt);
!   enum tree_code code = gimple_assign_rhs_code (stmt);
!   enum gimple_rhs_class klass = get_gimple_rhs_class (code);
!   if (TREE_CODE (rhs1) == SSA_NAME
&& gimple_can_coalesce_p (lhs, rhs1))
  {
v1 = SSA_NAME_VERSION (lhs);
v2 = SSA_NAME_VERSION (rhs1);
!   add_coalesce (cl, v1, v2,
! klass == GIMPLE_SINGLE_RHS
! ? coalesce_cost_bb (bb) : 0);
bitmap_set_bit (used_in_copy, v1);
bitmap_set_bit (used_in_copy, v2);
  }
+   if (klass == GIMPLE_BINARY_RHS
+   && commutative_tree_code (code))
+ {
+   tree rhs

Re: [PATCH] Try to coalesce for unary and binary ops

2014-04-17 Thread Richard Biener

On Thu, 17 Apr 2014, Richard Biener wrote:

> 
> The patch below increases the number of coalesces we attempt
> to also cover unary and binary operations.  This improves
> initial code generation for code like
> 
> int foo (int i, int j, int k, int l)
> {
>   int res = i;
>   res += j;
>   res += k;
>   res += l;
>   return res;
> }
> 
> from
> 
> ;; res_3 = i_1(D) + j_2(D);
> 
> (insn 9 8 0 (parallel [
> (set (reg/v:SI 83 [ res ])
> (plus:SI (reg/v:SI 87 [ i ])
> (reg/v:SI 88 [ j ])))
> (clobber (reg:CC 17 flags))
> ]) t.c:4 -1
>  (nil))
> 
> ;; res_5 = res_3 + k_4(D);
> 
> (insn 10 9 0 (parallel [
> (set (reg/v:SI 84 [ res ])
> (plus:SI (reg/v:SI 83 [ res ])
> (reg/v:SI 89 [ k ])))
> (clobber (reg:CC 17 flags))
> ]) t.c:5 -1
>  (nil))
> ...
> 
> to
> 
> ;; res_3 = i_1(D) + j_2(D);
> 
> (insn 9 8 0 (parallel [
> (set (reg/v:SI 83 [ res ])
> (plus:SI (reg/v:SI 85 [ i ])
> (reg/v:SI 86 [ j ])))
> (clobber (reg:CC 17 flags))
> ]) t.c:4 -1
>  (nil))
> 
> ;; res_5 = res_3 + k_4(D);
> 
> (insn 10 9 0 (parallel [
> (set (reg/v:SI 83 [ res ])
> (plus:SI (reg/v:SI 83 [ res ])
> (reg/v:SI 87 [ k ])))
> (clobber (reg:CC 17 flags))
> ]) t.c:5 -1
>  (nil))
> 
> re-using the same pseudo for the LHS.
> 
> Expansion has special code to improve coalescing of op1 with
> target thus this is what we try to match here.
> 
> Overall there are positive and negative size effects during
> a bootstrap on x86_64, but overall it seems to be a loss
> - stage3 cc1 text size is 18261647 bytes without the patch
> compared to 18265751 bytes with the patch.
> 
> Now the question is what does this tell us?  Not re-using
> the same pseudo as op and target is always better?
> 
> Btw, I tried this to find a convincing metric for an intra-BB
> scheduling pass (during out-of-SSA) on GIMPLE (to be able
> to kill that odd scheduling code we now have in reassoc).
> And to have sth that TER not immediately un-does we have
> to disable TER which conveniently happens for coalesced
> SSA names.  Thus -> schedule for "register pressure", and thus
> reduce SSA name lifetime - with the goal that out-of-SSA can
> do more coalescing.  But it won't even try to coalesce
> anything else than PHI copies (not affected by scheduling)
> or plain SSA name copies (shouldn't happen anyway due to
> copy propagation).
> 
> So - any ideas?  Or is the overall negative for cc1 just
> an artifact to ignore and we _should_ coalesce as much
> as possible (even if it doesn't avoid copies - thus the
> "cost" of 0 used in the patch)?

One example where it delivers bad initial expansion on x86_64 is

int foo (int *p)
{
  int res = p[0];
  res += p[1];
  res += p[2];
  res += p[3];
  return res;
}

where i386.c:ix86_fixup_binary_operands tries to be clever
and "improve address combine", generating two instructions
for (plus:SI (reg/v:SI 83 [ res ]) (mem:SI (...))) and thus
triggering expand_binop_directly

  pat = maybe_gen_insn (icode, 3, ops);
  if (pat)
{
  /* If PAT is composed of more than one insn, try to add an 
appropriate
 REG_EQUAL note to it.  If we can't because TEMP conflicts with an
 operand, call expand_binop again, this time without a target.  */
  if (INSN_P (pat) && NEXT_INSN (pat) != NULL_RTX
  && ! add_equal_note (pat, ops[0].value, optab_to_code 
(binoptab),
   ops[1].value, ops[2].value))
{
  delete_insns_since (last);
  return expand_binop (mode, binoptab, op0, op1, NULL_RTX,
   unsignedp, methods);
}

and thus we end up with

(insn 9 6 10 (set (reg:SI 91)
(mem:SI (plus:DI (reg/v/f:DI 88 [ p ])
(const_int 4 [0x4])) [0 MEM[(int *)p_2(D) + 4B]+0 S4 
A32])) t.c:4 -1
 (nil))

(insn 10 9 11 (parallel [
(set (reg:SI 90)
(plus:SI (reg/v:SI 83 [ res ])
(reg:SI 91)))
(clobber (reg:CC 17 flags))
]) t.c:4 -1
 (expr_list:REG_EQUAL (plus:SI (reg/v:SI 83 [ res ])
(mem:SI (plus:DI (reg/v/f:DI 88 [ p ])
(const_int 4 [0x4])) [0 MEM[(int *)p_2(D) + 4B]+0 S4 
A32]))
(nil)))

(insn 11 10 0 (set (reg/v:SI 83 [ res ])
(reg:SI 90)) t.c:4 -1
 (nil))

Unpatched, we avoid the last move (the tiny testcase of course
ends up optimizing the same anyway).  Not sure if that strong
desire to add a REG_EQUAL note makes up for the losses.  At
least it looks backwards compared to the code preceding it:

  /* If operation is commutative,
 try to make the first operand a register.
 Even better, try to make it the same as the target.
 Also try to make the last operand a constant.  */
  if (commutative_p
  && swap_commutative_operands_wi

RE: [PATCH] Add a new option "-fmerge-bitfields" (patch / doc inside)

2014-04-17 Thread Zoran Jovanovic

Hello,
My apologies for the inconvenience.
I removed every occurrence of -ftree-bitfield-merge from the patch and fixed an 
issue with unions.
The rest of the patch is the same as before.

Regards,
Zoran Jovanovic

--

Lowering is applied only to bit-field copy sequences that are merged.
The data structure representing bit-field copy sequences is renamed and reduced in 
size.
The optimization is turned on by default at -O2 and higher.
Some comments are fixed.

Benchmarking performed on WebKit for Android.
Code size reduction noticed on several files, best examples are:

core/rendering/style/StyleMultiColData (632->520 bytes)
core/platform/graphics/FontDescription (1715->1475 bytes)
core/rendering/style/FillLayer (5069->4513 bytes)
core/rendering/style/StyleRareInheritedData (5618->5346 bytes)
core/css/CSSSelectorList (4047->3887 bytes)
core/platform/animation/CSSAnimationData (3844->3440 bytes)
core/css/resolver/FontBuilder (13818->13350 bytes)
core/platform/graphics/Font (16447->15975 bytes)


Example:

One of the motivating examples for this work was the copy constructor of a class 
that contains bit-fields.

C++ code:
class A
{
public:
A(const A &x);
unsigned a : 1;
unsigned b : 2;
unsigned c : 4;
};

A::A(const A&x)
{
a = x.a;
b = x.b;
c = x.c;
}

GIMPLE code without optimization:

  :
  _3 = x_2(D)->a;
  this_4(D)->a = _3;
  _6 = x_2(D)->b;
  this_4(D)->b = _6;
  _8 = x_2(D)->c;
  this_4(D)->c = _8;
  return;

Optimized GIMPLE code:
  :
  _10 = x_2(D)->D.1867;
  _11 = BIT_FIELD_REF <_10, 7, 0>;
  _12 = this_4(D)->D.1867;
  _13 = _12 & 128;
  _14 = (unsigned char) _11;
  _15 = _13 | _14;
  this_4(D)->D.1867 = _15;
  return;

Generated MIPS32r2 assembly code without optimization:
lw      $3,0($5)
lbu     $2,0($4)
andi    $3,$3,0x1
andi    $2,$2,0xfe
or      $2,$2,$3
sb      $2,0($4)
lw      $3,0($5)
andi    $2,$2,0xf9
andi    $3,$3,0x6
or      $2,$2,$3
sb      $2,0($4)
lw      $3,0($5)
andi    $2,$2,0x87
andi    $3,$3,0x78
or      $2,$2,$3
j       $31
sb      $2,0($4)

Optimized MIPS32r2 assembly code:
lw      $3,0($5)
lbu     $2,0($4)
andi    $3,$3,0x7f
andi    $2,$2,0x80
or      $2,$3,$2
j       $31
sb      $2,0($4)
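
The effect of the merge can be checked with plain byte arithmetic. The sketch below is illustrative only and is not taken from the patch: the function names are made up, and the bit positions (a in bit 0, b in bits 1-2, c in bits 3-6) are read off the masks in the assembly listings above, so they are an assumption about this particular target layout. It compares three separate read-modify-write steps with one merged step using the 0x7f/0x80 masks.

```c
#include <stdint.h>

/* Masks taken from the assembly listings: a is bit 0, b is bits 1-2,
   c is bits 3-6; bit 7 does not belong to any copied field.  */
#define MASK_A   0x01u
#define MASK_B   0x06u
#define MASK_C   0x78u
#define MASK_ALL (MASK_A | MASK_B | MASK_C)   /* 0x7f */

/* Field-by-field copy: three read-modify-write steps.  */
static uint8_t
copy_fields_separately (uint8_t dst, uint8_t src)
{
  dst = (uint8_t) ((dst & ~MASK_A) | (src & MASK_A));
  dst = (uint8_t) ((dst & ~MASK_B) | (src & MASK_B));
  dst = (uint8_t) ((dst & ~MASK_C) | (src & MASK_C));
  return dst;
}

/* Merged copy: one mask covers all three adjacent fields.  */
static uint8_t
copy_fields_merged (uint8_t dst, uint8_t src)
{
  return (uint8_t) ((dst & ~MASK_ALL) | (src & MASK_ALL));
}
```

Both functions leave bit 7 of the destination untouched, which matches the `andi $2,$2,0x80` in the optimized listing.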


The algorithm works at the basic block level and consists of the following 3 major steps:
1. Go through the basic block's statement list. If there are statement pairs that 
implement a copy of bit-field content from one memory location to another, record 
the statement pointers and other necessary data in the corresponding data structure.
2. Identify records that represent adjacent bit-field accesses and mark them as 
merged.
3. Lower bit-field accesses by using the new field size for those that can be 
merged.
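
Step 2 above can be sketched in isolation. This is a minimal illustration and not the code from the patch: struct toy_bfcopy, cmp_by_offset and merge_adjacent are hypothetical names, and sorting the records by bit offset before merging is this sketch's own choice.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical record for one bit-field copy; not the patch's structure.  */
struct toy_bfcopy
{
  unsigned bit_offset;  /* first bit of the field in its storage unit */
  unsigned bit_size;    /* width of the field in bits */
};

static int
cmp_by_offset (const void *a, const void *b)
{
  const struct toy_bfcopy *x = a, *y = b;
  return (x->bit_offset > y->bit_offset) - (x->bit_offset < y->bit_offset);
}

/* Sort the N records by offset and fold each run of adjacent fields
   into one record, in place.  Returns the number of records left.  */
static size_t
merge_adjacent (struct toy_bfcopy *recs, size_t n)
{
  if (n == 0)
    return 0;
  qsort (recs, n, sizeof *recs, cmp_by_offset);
  size_t out = 0;
  for (size_t i = 1; i < n; i++)
    {
      if (recs[out].bit_offset + recs[out].bit_size == recs[i].bit_offset)
        recs[out].bit_size += recs[i].bit_size;   /* adjacent: extend */
      else
        recs[++out] = recs[i];                    /* gap: start a new run */
    }
  return out + 1;
}
```

For the class A example, the three fields of width 1, 2 and 4 at offsets 0, 1 and 3 fold into a single 7-bit access, which corresponds to the BIT_FIELD_REF <_10, 7, 0> in the optimized GIMPLE.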


A new command line option "-fmerge-bitfields" is introduced.


Tested - passed gcc regression tests for MIPS32r2.


Changelog -

gcc/ChangeLog:
2014-04-16 Zoran Jovanovic (zoran.jovano...@imgtec.com)
  * common.opt (fmerge-bitfields): New option.
  * doc/invoke.texi: Add reference to "-fmerge-bitfields".
  * tree-sra.c (lower_bitfields): New function.
  Entry for (-fmerge-bitfields).
  (part_of_union_p): New function.
  (bf_access_candidate_p): New function.
  (lower_bitfield_read): New function.
  (lower_bitfield_write): New function.
  (bitfield_stmt_bfcopy_pair::hash): New function.
  (bitfield_stmt_bfcopy_pair::equal): New function.
  (bitfield_stmt_bfcopy_pair::remove): New function.
  (create_and_insert_bfcopy): New function.
  (get_bit_offset): New function.
  (add_stmt_bfcopy_pair): New function.
  (cmp_bfcopies): New function.
  (get_merged_bit_field_size): New function.
  * dwarf2out.c (simple_type_size_in_bits): Move to tree.c.
  (field_byte_offset): Move declaration to tree.h and make it extern.
  * testsuite/gcc.dg/tree-ssa/bitfldmrg1.c: New test.
  * testsuite/gcc.dg/tree-ssa/bitfldmrg2.c: New test.
  * tree-ssa-sccvn.c (expressions_equal_p): Move to tree.c.
  * tree-ssa-sccvn.h (expressions_equal_p): Move declaration to tree.h.
  * tree.c (expressions_equal_p): Move from tree-ssa-sccvn.c.
  (simple_type_size_in_bits): Move from dwarf2out.c.
  * tree.h (expressions_equal_p): Add declaration.
  (field_byte_offset): Add declaration.

Patch -

diff --git a/gcc/common.opt b/gcc/common.opt
index da275e5..52c7f58 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -2203,6 +2203,10 @@ ftree-sra
 Common Report Var(flag_tree_sra) Optimization
 Perform scalar replacement of aggregates
 
+fmerge-bitfields
+Common Report Var(flag_tree_bitfield_merge) Optimization
+Merge loads and stores of consecutive bitfields
+
 ftree-ter
 Common Report Var(flag_tree_ter) Optimization
 Replace temporary expressions in the SSA->normal pass
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi

[RFC][PING^2] Do not consider volatile asms as optimization barriers #1

2014-04-17 Thread Yury Gribov

--
From: Yury Gribov 
Sent:  Tuesday, March 25, 2014 11:57AM
To: Jakub Jelinek, Eric Botcazou, gcc-patches@gcc.gnu.org,
Hans-Peter Nilsson, rdsandif...@googlemail.com

Subject: Re: [RFC] Do not consider volatile asms as optimization barriers #1

On 03/25/2014 11:57 AM, Yury Gribov wrote:
Jakub Jelinek wrote:

Richard Sandiford wrote:

OK, how about this?  It looks like the builtins.c and stmt.c stuff wasn't
merged until 4.9, and at this stage it seemed safer to just add the same
use/clobber sequence to both places.

Please wait a little bit, the patch has been committed to the trunk only
very recently, we want to see if it has any fallout.

It has been two weeks since Richard committed this to trunk. Perhaps it's
OK to backport to the 4.8 branch now?

-Y

Link to original email: 
http://gcc.gnu.org/ml/gcc-patches/2014-03/msg01306.html

Re: Patch ping

2014-04-17 Thread Jakub Jelinek

On Wed, Apr 16, 2014 at 02:45:37PM -0400, DJ Delorie wrote:
> I'll approve both patches, if you agree to think about a way to solve
> this problem without module-specific configury changes for each such
> command line option.  I understand the usefulness of having
> instrumentation, but the configure hack is a hack.

Only the second patch I'd consider a hack, the first patch merely makes sure
the POSTSTAGE1_LDFLAGS stuff actually isn't eaten by libtool.

I'll think about other options for the second patch.

> Note that in a combined tree this isn't a problem, because we'd just
> instrument the linker at the same time.

Only if you never use the plugin from the combined tree build with any other
linker.  Add -B ../ to some other linker and suddenly it will crash.

Jakub

Re: [PATCH] Make SRA tolerate most throwing statements

2014-04-17 Thread Martin Jambor

On Wed, Apr 16, 2014 at 11:22:28AM +0200, Richard Biener wrote:
> On Tue, 15 Apr 2014, Martin Jambor wrote:
> 
> > Hi,
> > 
> > back in January in
> > http://gcc.gnu.org/ml/gcc-patches/2014-01/msg00848.html Eric pointed
> > out a testcase where the problem was SRA not scalarizing an aggregate
> > because it was involved in a throwing statement.  The reason is that
> > SRA is likely to need to append new statements after each one where a
> > replaced aggregate is present, but throwing statements must end their
> > BBs.  This patch comes up with a fix for most such situations by
> > adding these new statements onto a single successor non-EH edge, if
> > there is one and only one such edge.
> > 
> > I have bootstrapped and tested a very similar version on x86_64-linux,
> > bootstrap and testing of this exact one is currently underway.  OK for
> > trunk?  Eric, if and once this gets in, can you please add the
> > testcase from your original post to the suite?
> > 
> > Thanks,
> > 
> > Martin
> > 
> > 
> > 2014-04-15  Martin Jambor  
> > 
> > * tree-sra.c (single_non_eh_succ): New function.
> > (disqualify_ops_if_throwing_stmt): Renamed to
> > disqualify_if_bad_bb_terminating_stmt.  Allow throwing statements
> > having one non-EH successor BB.
> > (gsi_for_eh_followups): New function.
> > (sra_modify_expr): If stmt ends bb, use single non-EH successor to
> > generate loads into replacements.
> > (sra_modify_assign): Likewise and also use the simple path for
> > such statements.
> > (sra_modify_function_body): Iterate safely over BBs.
> > 

...

> > @@ -2734,6 +2758,19 @@ get_access_for_expr (tree expr)
> >return get_var_base_offset_size_access (base, offset, max_size);
> >  }
> >  
> > +/* Split the single non-EH successor edge from BB (there must be exactly 
> > one)
> > +   and return a gimple iterator to the new block.  */
> > +
> > +static gimple_stmt_iterator
> > +gsi_for_eh_followups (basic_block bb)
> > +{
> > +  edge e = single_non_eh_succ (bb);
> > +  gcc_assert (e);
> > +
> > +  basic_block new_bb = split_edge (e);
> > +  return gsi_start_bb (new_bb);
> > +}
> > +
> >  /* Replace the expression EXPR with a scalar replacement if there is one 
> > and
> > generate other statements to do type conversion or subtree copying if
> > necessary.  GSI is used to place newly created statements, WRITE is 
> > true if
> > @@ -2763,6 +2800,13 @@ sra_modify_expr (tree *expr, gimple_stmt_iterator 
> > *gsi, bool write)
> >type = TREE_TYPE (*expr);
> >  
> >loc = gimple_location (gsi_stmt (*gsi));
> > +  gimple_stmt_iterator alt_gsi = gsi_none ();
> > +  if (write && stmt_ends_bb_p (gsi_stmt (*gsi)))
> > +{
> > +  alt_gsi = gsi_for_eh_followups (gsi_bb (*gsi));
> > +  gsi = &alt_gsi;
> 
> I think you should try to either use gsi_insert_on_edge_immediate
> (yeah, bad we can't build a gsi_for_edge_insert ()) or add
> a gsi_for_edge_insert () building on gimple_find_edge_insert_loc
> (note the before/after flag that returns - gsi_insert_* variants
> that take a flag specifying after/before would come handy here).
> You could also add a flag to gimple_find_edge_insert_loc whether
> it always should be possible to use gsi_insert_after and split
> the block in some more cases (or split it if both after and
> before inserts should be valid, but that would not split in
> the very rare case of an empty successor only).
> 
> Basically usually you can avoid splitting the edge.

The following patch adds gsi_start_edge for that purpose and uses it
together with gsi_commit_edge_inserts from within SRA.
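
To make the "one and only one non-EH successor" rule from the quoted description concrete, here is a toy sketch. It is not the function from the patch: struct toy_edge and toy_single_non_eh_succ are hypothetical and only model an edge list with an EH flag.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical successor edge; not GCC's edge type.  */
struct toy_edge
{
  bool eh;    /* true for an exception-handling edge */
  int dest;   /* index of the destination block */
};

/* Return the only non-EH edge among the N successors, or NULL if
   there is no such edge or more than one.  */
static const struct toy_edge *
toy_single_non_eh_succ (const struct toy_edge *succs, size_t n)
{
  const struct toy_edge *res = NULL;
  for (size_t i = 0; i < n; i++)
    if (!succs[i].eh)
      {
        if (res)
          return NULL;   /* second non-EH edge: ambiguous */
        res = &succs[i];
      }
  return res;
}
```

A statement that ends its block qualifies only when this lookup finds an edge; new statements are then queued on that edge instead of being appended after the throwing statement.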

I did not make it an inline static function in the header like the
other gsi initializing functions because that would make
gimple-iterator.h depend on tree-cfg.h and with our current flat
includes that triggered changes of includes in half a gazillion
unrelated c files (I have that patch too because I was apparently too
lazy to think before the third coffee yesterday but I do not think it
is worth it).

Bootstrapped and tested on x86_64-linux, this time it also includes
Eric's testcase.  OK for trunk?

Thanks,

Martin


2014-04-16  Martin Jambor  

* gimple-iterator.c (gsi_start_edge): New function.
* gimple-iterator.h (gsi_start_edge): Declare.
* tree-sra.c (single_non_eh_succ): New function.
(disqualify_ops_if_throwing_stmt): Renamed to
disqualify_if_bad_bb_terminating_stmt.  Allow throwing statements
having one non-EH successor BB.
(sra_modify_expr): If stmt ends bb, use single non-EH successor to
generate loads into replacements.
(sra_modify_assign): Likewise and also use the simple path for
such statements.
(sra_modify_function_body): Commit statements on edges.

testsuite/
* gnat.dg/opt34.adb: New.
* gnat.dg/opt34_pkg.ads: Likewise.

diff --git a/gcc/gimple-iterator.c b/gcc/gimple-iterator.c
index 1cfeb73..8a1ec53 100644
--- a/gcc/gimple-iterato

Re: [PATCH] Make SRA tolerate most throwing statements

2014-04-17 Thread Richard Biener

On Thu, Apr 17, 2014 at 2:21 PM, Martin Jambor  wrote:
> On Wed, Apr 16, 2014 at 11:22:28AM +0200, Richard Biener wrote:
>> On Tue, 15 Apr 2014, Martin Jambor wrote:
>>
>> > Hi,
>> >
>> > back in January in
>> > http://gcc.gnu.org/ml/gcc-patches/2014-01/msg00848.html Eric pointed
>> > out a testcase where the problem was SRA not scalarizing an aggregate
>> > because it was involved in a throwing statement.  The reason is that
>> > SRA is likely to need to append new statements after each one where a
>> > replaced aggregate is present, but throwing statements must end their
>> > BBs.  This patch comes up with a fix for most such situations by
>> > adding these new statements onto a single successor non-EH edge, if
>> > there is one and only one such edge.
>> >
>> > I have bootstrapped and tested a very similar version on x86_64-linux,
>> > bootstrap and testing of this exact one is currently underway.  OK for
>> > trunk?  Eric, if and once this gets in, can you please add the
>> > testcase from your original post to the suite?
>> >
>> > Thanks,
>> >
>> > Martin
>> >
>> >
>> > 2014-04-15  Martin Jambor  
>> >
>> > * tree-sra.c (single_non_eh_succ): New function.
>> > (disqualify_ops_if_throwing_stmt): Renamed to
>> > disqualify_if_bad_bb_terminating_stmt.  Allow throwing statements
>> > having one non-EH successor BB.
>> > (gsi_for_eh_followups): New function.
>> > (sra_modify_expr): If stmt ends bb, use single non-EH successor to
>> > generate loads into replacements.
>> > (sra_modify_assign): Likewise and also use the simple path for
>> > such statements.
>> > (sra_modify_function_body): Iterate safely over BBs.
>> >
>
> ...
>
>> > @@ -2734,6 +2758,19 @@ get_access_for_expr (tree expr)
>> >return get_var_base_offset_size_access (base, offset, max_size);
>> >  }
>> >
>> > +/* Split the single non-EH successor edge from BB (there must be exactly 
>> > one)
>> > +   and return a gimple iterator to the new block.  */
>> > +
>> > +static gimple_stmt_iterator
>> > +gsi_for_eh_followups (basic_block bb)
>> > +{
>> > +  edge e = single_non_eh_succ (bb);
>> > +  gcc_assert (e);
>> > +
>> > +  basic_block new_bb = split_edge (e);
>> > +  return gsi_start_bb (new_bb);
>> > +}
>> > +
>> >  /* Replace the expression EXPR with a scalar replacement if there is one 
>> > and
>> > generate other statements to do type conversion or subtree copying if
>> > necessary.  GSI is used to place newly created statements, WRITE is 
>> > true if
>> > @@ -2763,6 +2800,13 @@ sra_modify_expr (tree *expr, gimple_stmt_iterator 
>> > *gsi, bool write)
>> >type = TREE_TYPE (*expr);
>> >
>> >loc = gimple_location (gsi_stmt (*gsi));
>> > +  gimple_stmt_iterator alt_gsi = gsi_none ();
>> > +  if (write && stmt_ends_bb_p (gsi_stmt (*gsi)))
>> > +{
>> > +  alt_gsi = gsi_for_eh_followups (gsi_bb (*gsi));
>> > +  gsi = &alt_gsi;
>>
>> I think you should try to either use gsi_insert_on_edge_immediate
>> (yeah, bad we can't build a gsi_for_edge_insert ()) or add
>> a gsi_for_edge_insert () building on gimple_find_edge_insert_loc
>> (note the before/after flag that returns - gsi_insert_* variants
>> that take a flag specifying after/before would come handy here).
>> You could also add a flag to gimple_find_edge_insert_loc whether
>> it always should be possible to use gsi_insert_after and split
>> the block in some more cases (or split it if both after and
>> before inserts should be valid, but that would not split in
>> the very rare case of an empty successor only).
>>
>> Basically usually you can avoid splitting the edge.
>
> The following patch adds gsi_start_edge for that purpose and uses it
> together with gsi_commit_edge_inserts from within SRA.
>
> I did not make it an inline static function in the header like the
> other gsi initializing functions because that would make
> gimple-iterator.h depend on tree-cfg.h and with our current flat
> includes that triggered changes of includes in half a gazillion
> unrelated c files (I have that patch too because I was apparently too
> lazy to think before the third coffee yesterday but I do not think it
> is worth it).
>
> Bootstrapped and tested on x86_64-linux, this time it also includes
> Eric's testcase.  OK for trunk?

Ok.

Thanks,
Richard.

> Thanks,
>
> Martin
>
>
> 2014-04-16  Martin Jambor  
>
> * gimple-iterator.c (gsi_start_edge): New function.
> * gimple-iterator.h (gsi_start_edge): Declare.
> * tree-sra.c (single_non_eh_succ): New function.
> (disqualify_ops_if_throwing_stmt): Renamed to
> disqualify_if_bad_bb_terminating_stmt.  Allow throwing statements
> having one non-EH successor BB.
> (sra_modify_expr): If stmt ends bb, use single non-EH successor to
> generate loads into replacements.
> (sra_modify_assign): Likewise and also use the simple path for
> such statements.
> (sra_modify_function_body): Commit statements on e

Re: [PATCH 1/6] remove properties stuff from register_dump_files_1

2014-04-17 Thread Trevor Saunders

On Thu, Apr 17, 2014 at 10:53:07AM +0200, Richard Biener wrote:
> On Thu, Apr 17, 2014 at 10:37 AM,   wrote:
> > From: Trevor Saunders 
> >
> > Hi,
> >
> > just removing some dead code.
> >
> > bootstrapped + regtested against r209414 on x86_64-unknown-linux-gnu, ok?
> 
> Ok.

Thanks for the quick reviews! committed as r209477 - 209482

Trev

> 
> Thanks,
> Richard.
> 
> > Trev
> >
> > 2014-03-19  Trevor Saunders  
> >
> > * pass_manager.h (pass_manager::register_dump_files_1): Adjust.
> > * passes.c (pass_manager::register_dump_files_1): Remove dead code
> > dealing with properties.
> > (pass_manager::register_dump_files): Adjust.
> >
> > diff --git a/gcc/pass_manager.h b/gcc/pass_manager.h
> > index e1d8143..8309567 100644
> > --- a/gcc/pass_manager.h
> > +++ b/gcc/pass_manager.h
> > @@ -91,7 +91,7 @@ public:
> >
> >  private:
> >void set_pass_for_id (int id, opt_pass *pass);
> > -  int register_dump_files_1 (opt_pass *pass, int properties);
> > +  void register_dump_files_1 (opt_pass *pass);
> >void register_dump_files (opt_pass *pass, int properties);
> >
> >  private:
> > diff --git a/gcc/passes.c b/gcc/passes.c
> > index 60fb135..3f9590a 100644
> > --- a/gcc/passes.c
> > +++ b/gcc/passes.c
> > @@ -708,33 +708,21 @@ pass_manager::register_one_dump_file (opt_pass *pass)
> >
> >  /* Recursive worker function for register_dump_files.  */
> >
> > -int
> > +void
> >  pass_manager::
> > -register_dump_files_1 (opt_pass *pass, int properties)
> > +register_dump_files_1 (opt_pass *pass)
> >  {
> >do
> >  {
> > -  int new_properties = (properties | pass->properties_provided)
> > -  & ~pass->properties_destroyed;
> > -
> >if (pass->name && pass->name[0] != '*')
> >  register_one_dump_file (pass);
> >
> >if (pass->sub)
> > -new_properties = register_dump_files_1 (pass->sub, new_properties);
> > -
> > -  /* If we have a gate, combine the properties that we could have with
> > - and without the pass being examined.  */
> > -  if (pass->has_gate)
> > -properties &= new_properties;
> > -  else
> > -properties = new_properties;
> > +register_dump_files_1 (pass->sub);
> >
> >pass = pass->next;
> >  }
> >while (pass);
> > -
> > -  return properties;
> >  }
> >
> >  /* Register the dump files for the pass_manager starting at PASS.
> > @@ -746,7 +734,7 @@ pass_manager::
> >  register_dump_files (opt_pass *pass,int properties)
> >  {
> >pass->properties_required |= properties;
> > -  register_dump_files_1 (pass, properties);
> > +  register_dump_files_1 (pass);
> >  }
> >
> >  struct pass_registry
> > --
> > 1.9.2
> >
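
The shape of the simplified worker can be shown with a toy model. This is a sketch under stated assumptions, not the GCC code: struct toy_pass is a hypothetical stand-in for opt_pass, and incrementing a counter stands in for the call to register_one_dump_file.

```c
#include <stddef.h>

/* Hypothetical stand-in for opt_pass with only the fields used here.  */
struct toy_pass
{
  const char *name;
  struct toy_pass *sub;    /* first sub-pass, or NULL */
  struct toy_pass *next;   /* next sibling pass, or NULL */
};

/* Walk PASS, its siblings and all sub-passes.  Every pass whose name
   does not start with '*' is counted, which stands in for registering
   its dump file.  No property tracking is needed, so nothing is
   returned.  */
static void
toy_register_dump_files_1 (struct toy_pass *pass, int *count)
{
  do
    {
      if (pass->name && pass->name[0] != '*')
        ++*count;

      if (pass->sub)
        toy_register_dump_files_1 (pass->sub, count);

      pass = pass->next;
    }
  while (pass);
}
```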



Changes for if-convert to recognize simple conditional reduction.

2014-04-17 Thread Yuri Rumyantsev

Hi All,

We implemented an enhancement to the if-convert phase to recognize the
simplest conditional reduction and to transform it into vectorizable form,
e.g. the statement
if (A[i] != 0) num += 1; will be recognized.
A new test-case is also provided.
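
As an illustration of the idea only (the patch itself is attached as binary data, so the exact form it generates is not shown here), the two loops below compute the same count. The function names are made up for this sketch. The second loop has no control flow in its body, which is the general shape a vectorizer can handle.

```c
#include <stddef.h>

/* Conditional reduction as written in the source.  */
static int
count_nonzero_branchy (const int *a, size_t n)
{
  int num = 0;
  for (size_t i = 0; i < n; i++)
    if (a[i] != 0)
      num += 1;
  return num;
}

/* Same reduction with the condition folded into the added value,
   so the update of num is unconditional.  */
static int
count_nonzero_if_converted (const int *a, size_t n)
{
  int num = 0;
  for (size_t i = 0; i < n; i++)
    num += (a[i] != 0) ? 1 : 0;
  return num;
}
```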

Bootstrapping and regression testing did not show any new failures.

Is it OK for trunk?

gcc/ChangeLog:
2014-04-17  Yuri Rumyantsev  

* tree-if-conv.c (is_cond_scalar_reduction): New function.
(convert_scalar_cond_reduction): Likewise.
(predicate_scalar_phi): Add recognition and transformation
of simple conditional reduction to be vectorizable.

gcc/testsuite/ChangeLog:
2014-04-17  Yuri Rumyantsev  

* gcc.dg/cond-reduc.c: New test.


if-conv-cond-reduc.patch
Description: Binary data

Re: [PATCH] Redesign jump threading profile updates

2014-04-17 Thread Teresa Johnson

On Wed, Apr 16, 2014 at 10:39 PM, Jeff Law  wrote:
> On 03/26/14 17:44, Teresa Johnson wrote:
>>
>> Recently I discovered that the profile updates being performed by jump
>> threading were incorrect in many cases, particularly in the case where
>> the threading path contains a joiner. Some of the duplicated
>> blocks/edges were not getting any counts, leading to incorrect
>> function splitting and other downstream optimizations, and there were
>> other insanities as well. After making a few attempts to fix the
>> handling I ended up completely redesigning the profile update code,
>> removing a few places throughout the code where it was attempting to
>> do some updates.
>
> The profile updates in that code are a mess.  It's never really been looked
> at in any systematic way, what's there is ad-hoc and usually in response to
> someone mentioning the profile data was incorrectly updated.   As we rely
> more and more on that data the ad-hoc updating is going to cause us more and
> more pain.
>
> So any work in this space is going to be greatly appreciated.
>
> I'll have to look at this in some detail.  But I wanted you to know I was
> aware of the work and it's in my queue.

Great, thanks for the update! I realize that it is not a trivial
change so it would take some time to get through. Hopefully it should
address the ongoing profile fixup issues.
Teresa

>
> Thanks!
>
> jeff



-- 
Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413

[PING] [PATCH] Fix for PR libstdc++/60758

2014-04-17 Thread Alexey Merzlyakov


Hi,

This fixes an infinite backtrace in __cxa_end_cleanup().
Regtest was finished with no regressions on arm-linux-gnueabi(sf).

The patch posted at:
  http://gcc.gnu.org/ml/gcc-patches/2014-04/msg00496.html

Thanks in advance.

Best regards,
Merzlyakov Alexey

[ PATCH] Extend mode-switching to support toggle (1/2)

2014-04-17 Thread Christian Bruel

Hello,

Here is a new version of the patch. It hookizes the mode-setting and
mode-toggling macros. It is split into 2 parts.

Successfully bootstrapped/regtested on ix86 and SH4/SH4a.

I was only able to do a limited build on Epiphany; if someone could give it a
try there, that would be great.

comments ? suggestions ?

many thanks,

Christian



2014-04-02  Christian Bruel  

	* target.def (mode_switching): New hook vector.
	(mode_emit, mode_needed, mode_after, mode_entry): New hooks.
	(mode_exit, modepriority_to_mode): Likewise.
	* mode-switching.c (MODE_NEEDED, MODE_AFTER, MODE_ENTRY): Hookify.
	(MODE_EXIT, MODE_PRIORITY_TO_MODE, EMIT_MODE_SET): Likewise.
	(default_priority_to_mode): Define.
	* targhooks.h (default_priority_to_mode): Declare.
	* target.h: Include tm.h and hard-reg-set.h.
	* doc/tm.texi.in (EMIT_MODE_SET, MODE_NEEDED, MODE_AFTER, MODE_ENTRY)
	(MODE_EXIT, MODE_PRIORITY_TO_MODE): Delete and hookify.
	* doc/tm.texi: Regenerate.
	* config/sh/sh.h (MODE_NEEDED, MODE_AFTER, MODE_ENTRY): Delete
	(MODE_EXIT, MODE_PRIORITY_TO_MODE, EMIT_MODE_SET): Likewise.
	* config/sh/sh.c (emit_fpu_toggle): New function.
	(sh4_emit_mode_set, sh4_mode_needed): Hookify.
	(sh4_mode_after, sh4_mode_entry, sh4_mode_exit): Likewise.
	* config/i386/i386.h (MODE_NEEDED, MODE_AFTER, MODE_ENTRY): Delete.
	(MODE_EXIT, MODE_PRIORITY_TO_MODE, EMIT_MODE_SET): Likewise.
	* config/i386/i386-protos.h (ix86_mode_needed, ix86_mode_after)
	(ix86_mode_entry, ix86_emit_mode_set): Remove external declaration.
	* config/i386/i386.c (ix86_mode_needed, ix86_mode_after, ix86_mode_exit)
	(ix86_mode_entry, ix86_mode_priority, ix86_emit_mode_set): Hookify.
	* config/epiphany/epiphany.h (MODE_NEEDED, MODE_AFTER, MODE_ENTRY):
	Delete
	(MODE_EXIT, MODE_PRIORITY_TO_MODE, EMIT_MODE_SET): Likewise.
	* config/epiphany/epiphany-protos.h (epiphany_mode_needed)
	(emit_set_fp_mode, epiphany_mode_entry_exit, epiphany_mode_after)
	(epiphany_mode_priority_to_mode): Remove declaration.
	* config/epiphany/epiphany.c (emit_set_fp_mode): Hookify.
	(epiphany_mode_needed, epiphany_mode_priority_to_mode): Likewise.
	(epiphany_mode_entry, epiphany_mode_exit, epiphany_mode_after):
	Likewise.
	(epiphany_mode_priority_to_mode): Change priority type. Hookify.
	(epiphany_mode_needed, epiphany_mode_entry_exit): Hookify.
	(epiphany_mode_after, epiphany_mode_entry, emit_set_fp_mode): Hookify.
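
For readers unfamiliar with "hookizing", the pattern in the ChangeLog above can
be sketched roughly as follows. This is only an illustration: the struct, the
toy_* functions and the field names are hypothetical and much simpler than the
hook vector GCC generates from target.def. It shows a table of function
pointers replacing per-target macros, a default for the priority hook, and two
thin wrappers around a shared entry/exit helper, similar in spirit to the
epiphany_mode_entry/epiphany_mode_exit wrappers in the diff below.

```c
#include <stdbool.h>

/* Hypothetical, simplified hook vector.  GCC's real one is generated
   from target.def and has different members and signatures.  */
struct mode_switching_hooks
{
  int (*mode_entry) (int entity);
  int (*mode_exit) (int entity);
  int (*mode_priority) (int entity, int n);
};

/* Default for targets that do not override the priority hook:
   priority N simply maps to mode N.  */
static int
default_priority_to_mode (int entity, int n)
{
  (void) entity;
  return n;
}

/* Toy target: one shared helper that handles both entry and exit.
   The returned values are arbitrary and only make the two cases
   distinguishable.  */
static int
toy_mode_entry_exit (int entity, bool is_exit)
{
  return is_exit ? entity + 100 : entity;
}

/* Thin wrappers so that each hook has its own single-purpose function.  */
static int
toy_mode_entry (int entity)
{
  return toy_mode_entry_exit (entity, false);
}

static int
toy_mode_exit (int entity)
{
  return toy_mode_entry_exit (entity, true);
}

/* The target fills in the vector; generic code calls through it
   instead of expanding target macros.  */
static const struct mode_switching_hooks toy_hooks = {
  toy_mode_entry,
  toy_mode_exit,
  default_priority_to_mode
};
```

Generic code would then call toy_hooks.mode_entry (e) where it previously
expanded a MODE_ENTRY (e) macro.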

--- gcc/config/epiphany/epiphany-protos.h	(revision 209415)
+++ gcc/config/epiphany/epiphany-protos.h	(working copy)
@@ -45,9 +45,7 @@ extern void emit_set_fp_mode (int entity, int mode
 extern void epiphany_insert_mode_switch_use (rtx insn, int, int);
 extern void epiphany_expand_set_fp_mode (rtx *operands);
 extern int epiphany_mode_needed (int entity, rtx insn);
-extern int epiphany_mode_entry_exit (int entity, bool);
 extern int epiphany_mode_after (int entity, int last_mode, rtx insn);
-extern int epiphany_mode_priority_to_mode (int entity, unsigned priority);
 extern bool epiphany_epilogue_uses (int regno);
 extern bool epiphany_optimize_mode_switching (int entity);
 extern bool epiphany_is_interrupt_p (tree);
--- gcc/config/epiphany/epiphany.c	(revision 209415)
+++ gcc/config/epiphany/epiphany.c	(working copy)
@@ -152,6 +152,20 @@ static rtx frame_insn (rtx);
 /* We further restrict the minimum to be a multiple of eight.  */
 #define TARGET_MIN_ANCHOR_OFFSET (optimize_size ? 0 : -2040)
 
+/* Mode switching hooks.  */
+
+#define TARGET_MODE_EMIT emit_set_fp_mode
+
+#define TARGET_MODE_NEEDED epiphany_mode_needed
+
+#define TARGET_MODE_PRIORITY epiphany_mode_priority
+
+#define TARGET_MODE_ENTRY epiphany_mode_entry
+
+#define TARGET_MODE_EXIT epiphany_mode_exit
+
+#define TARGET_MODE_AFTER epiphany_mode_after
+
 #include "target-def.h"
 
 #undef TARGET_ASM_ALIGNED_HI_OP
@@ -2306,8 +2320,8 @@ epiphany_optimize_mode_switching (int entity)
   gcc_unreachable ();
 }
 
-int
-epiphany_mode_priority_to_mode (int entity, unsigned priority)
+static int
+epiphany_mode_priority (int entity, int priority)
 {
   if (entity == EPIPHANY_MSW_ENTITY_AND || entity == EPIPHANY_MSW_ENTITY_OR
   || entity== EPIPHANY_MSW_ENTITY_CONFIG)
@@ -2415,7 +2429,7 @@ epiphany_mode_needed (int entity, rtx insn)
   }
 }
 
-int
+static int
 epiphany_mode_entry_exit (int entity, bool exit)
 {
   int normal_mode = epiphany_normal_fp_mode ;
@@ -2502,6 +2516,18 @@ epiphany_mode_after (int entity, int last_mode, rt
   return last_mode;
 }
 
+static int
+epiphany_mode_entry (int entity)
+{
+  return epiphany_mode_entry_exit (entity, false);
+}
+
+static int
+epiphany_mode_exit (int entity)
+{
+  return epiphany_mode_entry_exit (entity, true);
+}
+
 void
 emit_set_fp_mode (int entity, int mode, HARD_REG_SET regs_live ATTRIBUTE_UNUSED)
 {
--- gcc/config/epiphany/epiphany.h	(rev

Re: [PATCH] Try to coalesce for unary and binary ops

2014-04-17 Thread Michael Matz

Hi,

On Thu, 17 Apr 2014, Richard Biener wrote:

> The patch below increases the number of coalesces we attempt
> to also cover unary and binary operations.

This is not usually a good idea if not mitigated by things like register 
pressure measurement and using target properties to determine if it's a 
two- or three-address instruction.  It increases register pressure and 
naturally generates multiple-def pseudos which aren't liked by some of the 
RTL passes.  It will lead to fewer pseudos, so there's a positive side.
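
A toy cost model may make the two- versus three-address point concrete. It is
not GCC code and the function name is made up for illustration. It counts the
register copies needed for "x = a op b", assuming a two-address target can only
encode "dst = dst op src", and it deliberately ignores register pressure, which
is the other concern raised above.

```c
#include <stdbool.h>

/* Copies needed to compute "x = a op b" under a deliberately simple
   model (hypothetical helper, not part of GCC):
   - a three-address target encodes the statement directly;
   - a two-address target needs "x = a" first, unless x and a
     already share a register because they were coalesced.  */
static int
copies_needed (bool two_address_target, bool x_coalesced_with_a)
{
  if (!two_address_target)
    return 0;
  return x_coalesced_with_a ? 0 : 1;
}
```

In this model coalescing saves a copy only on the two-address target, which is
why a target-independent decision at the tree level can be the wrong one.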

> Now the question is what does this tell us?  Not re-using the same 
> pseudo as op and target is always better?

No, it tells us that tree-ssa-coalesce is too early for such coalescing.  
The register allocator is the right spot (or instruction selection if we 
had that), and it's done there.

> And to have something that TER does not immediately undo, we have
> to disable TER, which conveniently happens for coalesced
> SSA names.

So, instead TER should be improved to not disturb the incoming instruction 
order (except where secondary effects of expanding larger trees can be 
had).  Changing the coalescing set to disable some bad parts in a later 
pass doesn't sound very convincing :)

Ciao,
Michael.

[PATCH] Extend mode-switching to support toggle (2/2)

2014-04-17 Thread Christian Bruel

And here is the toggle support, hookized.

many thanks,

Christian



2014-04-02  Christian Bruel  

	* target.def (mode_switching): New hook vector.
	(toggle_init, toggle_destroy, toggle_set, toggle_test):
	New mode toggle hooks.
	* targhooks.h (default_toggle_test): Declare.
	* basic-block.h (pre_edge_lcm_avs): Declare.
	* lcm.c (pre_edge_lcm_avs): Renamed from pre_edge_lcm.
	Call clear_aux_for_edges. Fix comments.
	(pre_edge_lcm): New wrapper function to call pre_edge_lcm_avs.
	(pre_edge_rev_lcm): Idem.
	* mode-switching.c (init_modes_infos): New function.
	(free_modes_infos): Likewise.
	(add_mode_set): Likewise.
	(get_mode): Likewise.
	(commit_mode_sets): Likewise.
	(merge_modes): Likewise.
	(optimize_mode_switching): Support mode toggle.
	(default_priority_to_mode, default_toggle_test): Define.
	* doc/tm.texi.in (TARGET_MODE_TOGGLE_INIT, TARGET_MODE_TOGGLE_TEST)
	(TARGET_MODE_TOGGLE_DESTROY, TARGET_MODE_TOGGLE_SET):
	New target hooks.
	* doc/tm.texi: Regenerate.
	* config/sh/sh.c (sh4_toggle_init, sh4_toggle_destroy): Add hook and define.
	(sh4_toggle_set, sh4_toggle_test): Likewise.
	(mode_in_flip, mode_out_flip): Add bitmap to compute mode flipping.
	(TARGET_MODE_EMIT): New toggle parameter.
	* config/sh/sh.md (toggle_pr): Defined for TARGET_SH4_300 and TARGET_SH4A_FP.
	(in_delay_slot): fpscr_toggle doesn't go in a delay slot.
	* config/i386/i386.c (ix86_emit_mode_set): Add bool unused parameter.
	* config/epiphany/epiphany.c (emit_set_fp_mode): Add bool unused parameter.
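
As a rough illustration of what the extra bool parameter allows, here is a
minimal sketch. All names are hypothetical and the condition under which the
real pass requests a toggle is not visible in this excerpt, so the
toy_can_toggle rule is an assumption. The idea is that when an entity has
exactly two modes and the previous mode is known to be the other one, the
target can emit a cheap "flip" instead of loading the full mode value; targets
such as i386 and Epiphany simply ignore the flag.

```c
#include <stdbool.h>

/* Which kind of instruction the toy emit hook produced.  */
enum toy_insn { TOY_LOAD_MODE, TOY_TOGGLE_MODE };

/* Assumed rule: toggling is only meaningful between the two modes
   0 and 1, and only when the previous mode is known to be the
   opposite of the requested one.  */
static bool
toy_can_toggle (int prev_mode, int new_mode)
{
  return (prev_mode == 0 && new_mode == 1)
	 || (prev_mode == 1 && new_mode == 0);
}

/* Hypothetical emit hook.  MODE_REG stands in for the hardware mode
   register; a real hook would emit RTL instead of writing memory.  */
static enum toy_insn
toy_emit_mode_set (int mode, bool toggle, unsigned *mode_reg)
{
  if (toggle)
    {
      *mode_reg ^= 1u;		/* Flip the single mode bit.  */
      return TOY_TOGGLE_MODE;
    }
  *mode_reg = (unsigned) mode;	/* Load the full mode value.  */
  return TOY_LOAD_MODE;
}
```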

--- gcc/basic-block.h	2014-01-07 10:30:59.0 +0100
+++ /work1/bruel/superh_elf/gnu_trunk.devs/gcc/gcc/basic-block.h	2014-04-15 16:17:53.0 +0200
@@ -711,6 +711,9 @@
 extern struct edge_list *pre_edge_lcm (int, sbitmap *, sbitmap *,
    sbitmap *, sbitmap *, sbitmap **,
    sbitmap **);
+extern struct edge_list *pre_edge_lcm_avs (int, sbitmap *, sbitmap *,
+	   sbitmap *, sbitmap *, sbitmap *,
+	   sbitmap *, sbitmap **, sbitmap **);
 extern struct edge_list *pre_edge_rev_lcm (int, sbitmap *,
 	   sbitmap *, sbitmap *,
 	   sbitmap *, sbitmap **,
--- gcc/config/epiphany/epiphany.c	2014-04-17 13:23:48.0 +0200
+++ /work1/bruel/superh_elf/gnu_trunk.devs/gcc/gcc/config/epiphany/epiphany.c	2014-04-17 13:25:54.0 +0200
@@ -2529,7 +2529,8 @@
 }
 
 void
-emit_set_fp_mode (int entity, int mode, HARD_REG_SET regs_live ATTRIBUTE_UNUSED)
+emit_set_fp_mode (int entity, int mode, bool toggle ATTRIBUTE_UNUSED,
+		  HARD_REG_SET regs_live ATTRIBUTE_UNUSED)
 {
   rtx save_cc, cc_reg, mask, src, src2;
   enum attr_fp_mode fp_mode;
--- gcc/config/epiphany/epiphany-protos.h	2014-04-17 11:10:36.0 +0200
+++ /work1/bruel/superh_elf/gnu_trunk.devs/gcc/gcc/config/epiphany/epiphany-protos.h	2014-04-17 11:22:02.0 +0200
@@ -40,7 +40,8 @@
 extern void epiphany_init_expanders (void);
 extern int hard_regno_mode_ok (int regno, enum machine_mode mode);
 #ifdef HARD_CONST
-extern void emit_set_fp_mode (int entity, int mode, HARD_REG_SET regs_live);
+extern void emit_set_fp_mode (int entity, int mode,
+			  bool toggle ATTRIBUTE_UNUSED, HARD_REG_SET regs_live);
 #endif
 extern void epiphany_insert_mode_switch_use (rtx insn, int, int);
 extern void epiphany_expand_set_fp_mode (rtx *operands);
--- gcc/config/epiphany/resolve-sw-modes.c	2014-04-17 11:10:36.0 +0200
+++ /work1/bruel/superh_elf/gnu_trunk.devs/gcc/gcc/config/epiphany/resolve-sw-modes.c	2014-04-17 11:21:07.0 +0200
@@ -147,7 +147,7 @@
 	}
 	  start_sequence ();
 	  emit_set_fp_mode (EPIPHANY_MSW_ENTITY_ROUND_UNKNOWN,
-			jilted_mode, NULL);
+			jilted_mode, false, NULL);
 	  seq = get_insns ();
 	  end_sequence ();
 	  need_commit = true;
--- gcc/config/i386/i386.c	2014-04-17 13:02:49.0 +0200
+++ /work1/bruel/superh_elf/gnu_trunk.devs/gcc/gcc/config/i386/i386.c	2014-04-17 13:04:18.0 +0200
@@ -16409,7 +16409,8 @@
are to be inserted.  */
 
 static void
-ix86_emit_mode_set (int entity, int mode, HARD_REG_SET regs_live)
+ix86_emit_mode_set (int entity, int mode, bool toggle ATTRIBUTE_UNUSED,
+		HARD_REG_SET regs_live)
 {
   switch (entity)
 {
--- gcc/config/sh/sh.c	2014-04-17 13:23:07.0 +0200
+++ /work1/bruel/superh_elf/gnu_trunk.devs/gcc/gcc/config/sh/sh.c	2014-04-17 13:25:27.0 +0200
@@ -202,7 +202,7 @@
 static int calc_live_regs (HARD_REG_SET *);
 static HOST_WIDE_INT rounded_frame_size (int);
 static bool sh_frame_pointer_required (void);
-static void sh4_emit_mode_set (int, int, HARD_REG_SET);
+static void sh4_emit_mode_set (int, int, bool, HARD_REG_SET);
 static int sh4_mode_needed (int, rtx);
 static int sh4_mode_after (int, int, rtx);
 static int sh4_mode_entry (int);
@@ -590,9 +590,21 @@
 #undef TARGET_MODE_EXIT
 #define TARGET_MODE_EXIT sh4_mode_exit
 
+#undef TARGET_MODE_TOGGLE_INIT
+#define TARGET_MODE_TOGGLE_INIT sh4_toggle_init
+
 #undef TARGET_MODE_PRIORITY
 #define TARGET_MODE_PRIORITY sh4_mode_priority
 
+#undef TARGET_MODE_TOGGLE_DESTROY
+#define TARGET

[PATCH 0/3] libsanitizer libc conditionals

2014-04-17 Thread Bernhard Reutner-Fischer

Respun. First two patches are for gcc, the last one is for upstream
LLVM.

The gcc part was bootstrapped and regtested on x86_64-unknown-linux-gnu
without regressions and bootstrapped on x86_64-unknown-linux-uclibc to
verify that the configury works as expected and that the library links
without errors. These two patches are essentially "backports" of the
LLVM bits in patch #3.

The LLVM part was compiled on x86_64 against glibc, and I verified that
the configury picks up the previously hard-coded values both with
"configure && make" and with "cmake && make".

LLVM'er, please install the LLVM bits.

Ok for trunk?


Bernhard Reutner-Fischer (3):
  libsanitizer: Fix !statfs64 builds
  libsanitizer: add conditionals for libc
  [LLVM] [sanitizer] add conditionals for libc

 libsanitizer/asan/Makefile.am  |   6 +
 libsanitizer/asan/Makefile.in  |  17 +-
 libsanitizer/config.h.in   |  60 +
 libsanitizer/configure | 281 -
 libsanitizer/configure.ac  |  38 +++
 libsanitizer/interception/interception_linux.cc|   2 +
 libsanitizer/interception/interception_linux.h |   8 +
 libsanitizer/lsan/Makefile.am  |   6 +
 libsanitizer/lsan/Makefile.in  |  11 +-
 libsanitizer/sanitizer_common/Makefile.am  |   5 +
 libsanitizer/sanitizer_common/Makefile.in  |  18 +-
 .../sanitizer_common_interceptors.inc  | 100 +++-
 .../sanitizer_platform_interceptors.h  |   4 +-
 .../sanitizer_platform_limits_linux.cc |   2 +
 .../sanitizer_platform_limits_posix.cc |  44 +++-
 .../sanitizer_platform_limits_posix.h  |  27 +-
 .../sanitizer_common/sanitizer_posix_libcdep.cc|   7 +
 libsanitizer/tsan/Makefile.am  |   6 +
 libsanitizer/tsan/Makefile.in  |  11 +-
 19 files changed, 619 insertions(+), 34 deletions(-)

-- 
1.9.1

[PATCH 1/3] libsanitizer: Fix !statfs64 builds

2014-04-17 Thread Bernhard Reutner-Fischer

libsanitizer/ChangeLog
2014-04-02  Bernhard Reutner-Fischer  

* configure.ac: Check for sizeof(struct statfs64).
* configure, config.h.in: Regenerate.
* sanitizer_common/sanitizer_platform_interceptors.h
(SANITIZER_INTERCEPT_STATFS64): Make conditional on
SIZEOF_STRUCT_STATFS64 being not 0.
* sanitizer_common/sanitizer_platform_limits_linux.cc
(namespace __sanitizer): Make unsigned
struct_statfs64_sz conditional on SANITIZER_INTERCEPT_STATFS64.
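
The guard described above could look roughly like the sketch below. It is a
simplified assumption rather than the actual header contents (the hunk for
sanitizer_platform_interceptors.h is cut off in this message, and the real
macro has additional platform conditions). The fallback #define exists only so
the fragment compiles on its own; in the real build configure provides
SIZEOF_STRUCT_STATFS64 through config.h, and the helper function is
hypothetical.

```c
/* Assumption: configure defines SIZEOF_STRUCT_STATFS64, with 0 meaning
   "struct statfs64 is not available".  The fallback below only makes
   this fragment self-contained.  */
#ifndef SIZEOF_STRUCT_STATFS64
# define SIZEOF_STRUCT_STATFS64 0
#endif

/* Simplified: intercept statfs64 only when the type exists.  */
#define SANITIZER_INTERCEPT_STATFS64 (SIZEOF_STRUCT_STATFS64 != 0)

#if SANITIZER_INTERCEPT_STATFS64
static const unsigned struct_statfs64_sz = SIZEOF_STRUCT_STATFS64;
#endif

/* Hypothetical helper: number of bytes an interceptor would treat as
   written by statfs64, or 0 when the interceptor is compiled out.  */
static unsigned
statfs64_shadow_size (void)
{
#if SANITIZER_INTERCEPT_STATFS64
  return struct_statfs64_sz;
#else
  return 0;
#endif
}
```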

Signed-off-by: Bernhard Reutner-Fischer 
---
 libsanitizer/config.h.in   |  9 +++
 libsanitizer/configure | 69 ++
 libsanitizer/configure.ac  | 15 +
 .../sanitizer_platform_interceptors.h  |  4 +-
 .../sanitizer_platform_limits_linux.cc |  2 +
 5 files changed, 98 insertions(+), 1 deletion(-)

diff --git a/libsanitizer/config.h.in b/libsanitizer/config.h.in
index e4b2786..4bd6a7f 100644
--- a/libsanitizer/config.h.in
+++ b/libsanitizer/config.h.in
@@ -61,12 +61,18 @@
 /* Define to 1 if you have the <sys/mman.h> header file. */
 #undef HAVE_SYS_MMAN_H
 
+/* Define to 1 if you have the <sys/statfs.h> header file. */
+#undef HAVE_SYS_STATFS_H
+
 /* Define to 1 if you have the <sys/stat.h> header file. */
 #undef HAVE_SYS_STAT_H
 
 /* Define to 1 if you have the <sys/types.h> header file. */
 #undef HAVE_SYS_TYPES_H
 
+/* Define to 1 if you have the <sys/vfs.h> header file. */
+#undef HAVE_SYS_VFS_H
+
 /* Define to 1 if you have the <unistd.h> header file. */
 #undef HAVE_UNISTD_H
 
@@ -107,6 +113,9 @@
 /* The size of `short', as computed by sizeof. */
 #undef SIZEOF_SHORT
 
+/* The size of `struct statfs64', as computed by sizeof. */
+#undef SIZEOF_STRUCT_STATFS64
+
 /* The size of `void *', as computed by sizeof. */
 #undef SIZEOF_VOID_P
 
diff --git a/libsanitizer/configure b/libsanitizer/configure
index 5e4840f..c636212 100755
--- a/libsanitizer/configure
+++ b/libsanitizer/configure
@@ -15463,6 +15463,75 @@ _ACEOF
 
 
 
+for ac_header in sys/statfs.h
+do :
+  ac_fn_c_check_header_mongrel "$LINENO" "sys/statfs.h" "ac_cv_header_sys_statfs_h" "$ac_includes_default"
+if test "x$ac_cv_header_sys_statfs_h" = x""yes; then :
+  cat >>confdefs.h <<_ACEOF
+#define HAVE_SYS_STATFS_H 1
+_ACEOF
+
+fi
+
+done
+
+if test "$ac_cv_header_sys_statfs_h" = "no"; then
+  for ac_header in sys/vfs.h
+do :
+  ac_fn_c_check_header_mongrel "$LINENO" "sys/vfs.h" "ac_cv_header_sys_vfs_h" "$ac_includes_default"
+if test "x$ac_cv_header_sys_vfs_h" = x""yes; then :
+  cat >>confdefs.h <<_ACEOF
+#define HAVE_SYS_VFS_H 1
+_ACEOF
+
+fi
+
+done
+
+fi
+# The cast to long int works around a bug in the HP C Compiler
+# version HP92453-01 B.11.11.23709.GP, which incorrectly rejects
+# declarations like `int a3[[(sizeof (unsigned char)) >= 0]];'.
+# This bug is HP SR number 8606223364.
+{ $as_echo "$as_me:${as_lineno-$LINENO}: checking size of struct statfs64" >&5
+$as_echo_n "checking size of struct statfs64... " >&6; }
+if test "${ac_cv_sizeof_struct_statfs64+set}" = set; then :
+  $as_echo_n "(cached) " >&6
+else
+  if ac_fn_c_compute_int "$LINENO" "(long int) (sizeof (struct statfs64))" "ac_cv_sizeof_struct_statfs64" "
+#ifdef HAVE_SYS_STATFS_H
+# include <sys/statfs.h>
+#endif
+#ifdef HAVE_SYS_VFS_H
+# include <sys/vfs.h>
+#endif
+
+"; then :
+
+else
+  if test "$ac_cv_type_struct_statfs64" = yes; then
+ { { $as_echo "$as_me:${as_lineno-$LINENO}: error: in \`$ac_pwd':" >&5
+$as_echo "$as_me: error: in \`$ac_pwd':" >&2;}
+{ as_fn_set_status 77
+as_fn_error "cannot compute sizeof (struct statfs64)
+See \`config.log' for more details." "$LINENO" 5; }; }
+   else
+ ac_cv_sizeof_struct_statfs64=0
+   fi
+fi
+
+fi
+{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_cv_sizeof_struct_statfs64" >&5
+$as_echo "$ac_cv_sizeof_struct_statfs64" >&6; }
+
+
+
+cat >>confdefs.h <<_ACEOF
+#define SIZEOF_STRUCT_STATFS64 $ac_cv_sizeof_struct_statfs64
+_ACEOF
+
+
+
 if test "${multilib}" = "yes"; then
   multilib_arg="--enable-multilib"
 else
diff --git a/libsanitizer/configure.ac b/libsanitizer/configure.ac
index e672131..746c216 100644
--- a/libsanitizer/configure.ac
+++ b/libsanitizer/configure.ac
@@ -78,6 +78,21 @@ AC_SUBST(enable_static)
 
 AC_CHECK_SIZEOF([void *])
 
+dnl Careful, this breaks on glibc for e.g. dirent.d_ino being 64bit
+dnl AC_SYS_LARGEFILE
+AC_CHECK_HEADERS(sys/statfs.h)
+if test "$ac_cv_header_sys_statfs_h" = "no"; then
+  AC_CHECK_HEADERS(sys/vfs.h)
+fi
+AC_CHECK_SIZEOF([struct statfs64],[],[
+#ifdef HAVE_SYS_STATFS_H
+# include <sys/statfs.h>
+#endif
+#ifdef HAVE_SYS_VFS_H
+# include <sys/vfs.h>
+#endif
+])
+
 if test "${multilib}" = "yes"; then
   multilib_arg="--enable-multilib"
 else
diff --git a/libsanitizer/sanitizer_common/sanitizer_platform_interceptors.h b/libsanitizer/sanitizer_common/sanitizer_platform_interceptors.h
index f37d84b..b9ebd5c 100644
--- a/libsanitizer/sanitizer_common/sanitizer_platform_interceptors.h
+++ b/libsanitizer/sanitizer_common/sanitizer_platform_interceptors.h
@@ -137,7 +1

[PATCH 2/3] libsanitizer: add conditionals for libc

2014-04-17 Thread Bernhard Reutner-Fischer

Conditionalize usage of dlvsym(), nanosleep(), usleep().
Conditionalize layout of struct sigaction and type of its member
sa_flags.
Conditionalize glob_t members gl_closedir, gl_readdir, gl_opendir,
gl_flags, gl_lstat, gl_stat.
Check for availability of glob.h for use with above members.
Check for availability of netrom/netrom.h, sys/ustat.h (for obsolete
ustat() function), utime.h (for obsolete utime() function), wordexp.h.
Determine size of sigset_t instead of hardcoding it.
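
As an illustration of what conditionalizing on libc features means in
practice, here is a minimal sketch of a sleep helper that picks an
implementation based on configure results. HAVE_NANOSLEEP and HAVE_USLEEP
match the names checked by this series, but the helper itself (sleep_millis)
is hypothetical and not taken from the patch.

```c
#include <time.h>
#include <unistd.h>

/* Sleep for roughly MS milliseconds using whatever the C library
   provides.  Returns 0 on success and nonzero if the sleep failed or
   was interrupted.  Hypothetical helper for illustration only.  */
static int
sleep_millis (unsigned ms)
{
#if defined (HAVE_NANOSLEEP)
  struct timespec ts;
  ts.tv_sec = (time_t) (ms / 1000);
  ts.tv_nsec = (long) (ms % 1000) * 1000000L;
  return nanosleep (&ts, NULL);
#elif defined (HAVE_USLEEP)
  /* Some libcs reject usleep arguments of one second or more, so
     sleep whole seconds separately.  */
  if (sleep (ms / 1000) != 0)
    return -1;
  return usleep ((ms % 1000) * 1000u);
#else
  /* Last resort: round up to whole seconds.  */
  return sleep ((ms + 999) / 1000) == 0 ? 0 : -1;
#endif
}
```

With neither macro defined the helper still builds and falls back to sleep(),
which is the behaviour one would want on a libc that provides only stubs for
the other two functions.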

libsanitizer/ChangeLog

2014-04-16  Bernhard Reutner-Fischer  

* configure.ac (AC_CHECK_HEADERS): Add time.h, wordexp.h,
glob.h, netrom/netrom.h, sys/ustat.h.
(AC_CHECK_MEMBERS): Check GNU extension glob_t members.
(AC_CHECK_SIZEOF): Determine size of sigset_t.
(HAVE_STRUCT_SIGACTION_SA_MASK_LAST,
STRUCT_SIGACTION_SA_FLAGS_TYPE): New.
(AC_CHECK_FUNCS): Add usleep, nanosleep, dlvsym.
* configure, config.h.in: Regenerate.
* asan/Makefile.am, lsan/Makefile.am, tsan/Makefile.am,
sanitizer_common/Makefile.am (AM_CXXFLAGS): Include config.h,
add include search directory.
* asan/Makefile.in, lsan/Makefile.in, tsan/Makefile.in,
sanitizer_common/Makefile.in: Regenerate.
* interception/interception_linux.h,
interception/interception_linux.cc,
sanitizer_common/sanitizer_common_interceptors.inc,
sanitizer_common/sanitizer_platform_limits_posix.cc,
sanitizer_common/sanitizer_platform_limits_posix.h,
sanitizer_common/sanitizer_posix_libcdep.cc: Use config.h's new
defines.

Signed-off-by: Bernhard Reutner-Fischer 
---
 libsanitizer/asan/Makefile.am  |   6 +
 libsanitizer/asan/Makefile.in  |  17 +-
 libsanitizer/config.h.in   |  51 +
 libsanitizer/configure | 212 -
 libsanitizer/configure.ac  |  23 +++
 libsanitizer/interception/interception_linux.cc|   2 +
 libsanitizer/interception/interception_linux.h |   8 +
 libsanitizer/lsan/Makefile.am  |   6 +
 libsanitizer/lsan/Makefile.in  |  11 +-
 libsanitizer/sanitizer_common/Makefile.am  |   5 +
 libsanitizer/sanitizer_common/Makefile.in  |  18 +-
 .../sanitizer_common_interceptors.inc  | 100 +-
 .../sanitizer_platform_limits_posix.cc |  44 -
 .../sanitizer_platform_limits_posix.h  |  27 ++-
 .../sanitizer_common/sanitizer_posix_libcdep.cc|   7 +
 libsanitizer/tsan/Makefile.am  |   6 +
 libsanitizer/tsan/Makefile.in  |  11 +-
 17 files changed, 521 insertions(+), 33 deletions(-)

diff --git a/libsanitizer/asan/Makefile.am b/libsanitizer/asan/Makefile.am
index 3f07a83..851774c 100644
--- a/libsanitizer/asan/Makefile.am
+++ b/libsanitizer/asan/Makefile.am
@@ -9,6 +9,12 @@ DEFS += -DMAC_INTERPOSE_FUNCTIONS -DMISSING_BLOCKS_SUPPORT
 endif
 AM_CXXFLAGS = -Wall -W -Wno-unused-parameter -Wwrite-strings -pedantic -Wno-long-long  -fPIC -fno-builtin -fno-exceptions -fno-rtti -fomit-frame-pointer -funwind-tables -fvisibility=hidden -Wno-variadic-macros
 AM_CXXFLAGS += $(LIBSTDCXX_RAW_CXX_CXXFLAGS)
+AM_CXXFLAGS += -include $(top_builddir)/config.h
+if LIBBACKTRACE_SUPPORTED
+# backtrace-rename.h is included from config.h, provide -I dir for it
+AM_CXXFLAGS += -I $(top_srcdir)
+endif
+
 ACLOCAL_AMFLAGS = -I $(top_srcdir) -I $(top_srcdir)/config
 
 toolexeclib_LTLIBRARIES = libasan.la
diff --git a/libsanitizer/asan/Makefile.in b/libsanitizer/asan/Makefile.in
index 273eb4b..a9b889d 100644
--- a/libsanitizer/asan/Makefile.in
+++ b/libsanitizer/asan/Makefile.in
@@ -37,8 +37,10 @@ build_triplet = @build@
 host_triplet = @host@
 target_triplet = @target@
 @USING_MAC_INTERPOSE_TRUE@am__append_1 = -DMAC_INTERPOSE_FUNCTIONS -DMISSING_BLOCKS_SUPPORT
-@USING_MAC_INTERPOSE_FALSE@am__append_2 = $(top_builddir)/interception/libinterception.la
-@LIBBACKTRACE_SUPPORTED_TRUE@am__append_3 = $(top_builddir)/libbacktrace/libsanitizer_libbacktrace.la
+# backtrace-rename.h is included from config.h, provide -I dir for it
+@LIBBACKTRACE_SUPPORTED_TRUE@am__append_2 = -I $(top_srcdir)
+@USING_MAC_INTERPOSE_FALSE@am__append_3 = $(top_builddir)/interception/libinterception.la
+@LIBBACKTRACE_SUPPORTED_TRUE@am__append_4 = $(top_builddir)/libbacktrace/libsanitizer_libbacktrace.la
 subdir = asan
 DIST_COMMON = $(srcdir)/Makefile.in $(srcdir)/Makefile.am
 ACLOCAL_M4 = $(top_srcdir)/aclocal.m4
@@ -86,8 +88,8 @@ LTLIBRARIES = $(toolexeclib_LTLIBRARIES)
 am__DEPENDENCIES_1 =
 libasan_la_DEPENDENCIES =  \
$(top_builddir)/sanitizer_common/libsanitizer_common.la \
-   $(top_builddir)/lsan/libsanitizer_lsan.la $(am__append_2) \
-   $(am__append_3) $(am__DEPENDENCIES_1)
+   $(top_builddir)/lsan/libsanitizer_lsan.la $(am__append_3) \
+   $(am__append_4) $(am__DEPENDENCIES_1)

[PATCH 3/3] [LLVM] [sanitizer] add conditionals for libc

2014-04-17 Thread Bernhard Reutner-Fischer

Conditionalize usage of dlvsym(), nanosleep(), usleep();
Conditionalize layout of struct sigaction and type of its member
sa_flags.
Conditionalize glob_t members gl_closedir, gl_readdir, gl_opendir,
gl_flags, gl_lstat, gl_stat.
Check for availability of glob.h for use with above members.
Check for availability of netrom/netrom.h, sys/ustat.h (for obsolete
ustat() function), utime.h (for obsolete utime() function), wordexp.h.
Determine size of sigset_t instead of hardcoding it.
Determine size of struct statfs64, if available.

Leave defaults to match what glibc expects but probe them for uClibc.

Signed-off-by: Bernhard Reutner-Fischer 
---
 CMakeLists.txt |  58 +++
 cmake/Modules/CompilerRTUtils.cmake|  15 ++
 cmake/Modules/FunctionExistsNotStub.cmake  |  56 +++
 lib/interception/interception_linux.cc |   2 +
 lib/interception/interception_linux.h  |   9 ++
 .../sanitizer_common_interceptors.inc  | 101 +++-
 .../sanitizer_platform_limits_posix.cc |  44 -
 .../sanitizer_platform_limits_posix.h  |  27 +++-
 lib/sanitizer_common/sanitizer_posix_libcdep.cc|   9 ++
 make/platform/clang_linux.mk   | 180 +
 make/platform/clang_linux_test_libc.c  |  68 
 11 files changed, 561 insertions(+), 8 deletions(-)
 create mode 100644 cmake/Modules/FunctionExistsNotStub.cmake
 create mode 100644 make/platform/clang_linux_test_libc.c

diff --git a/CMakeLists.txt b/CMakeLists.txt
index e1a7a1f..af8073e 100644
--- a/CMakeLists.txt
+++ b/CMakeLists.txt
@@ -330,6 +330,64 @@ if(APPLE)
 -isysroot ${IOSSIM_SDK_DIR})
 endif()
 
+set(ct_c ${COMPILER_RT_SOURCE_DIR}/make/platform/clang_linux_test_libc.c)
+check_include_file(sys/ustat.h HAVE_SYS_USTAT_H)
+check_include_file(utime.h HAVE_UTIME_H)
+check_include_file(wordexp.h HAVE_WORDEXP_H)
+check_include_file(glob.h HAVE_GLOB_H)
+include(FunctionExistsNotStub)
+check_function_exists_not_stub(${ct_c} nanosleep HAVE_NANOSLEEP)
+check_function_exists_not_stub(${ct_c} usleep HAVE_USLEEP)
+include(CheckTypeSize)
+# check for sizeof sigset_t
+set(oCMAKE_EXTRA_INCLUDE_FILES "${CMAKE_EXTRA_INCLUDE_FILES}")
+set(CMAKE_EXTRA_INCLUDE_FILES ${CMAKE_EXTRA_INCLUDE_FILES} signal.h)
+check_type_size("sigset_t" SIZEOF_SIGSET_T BUILTIN_TYPES_ONLY)
+if(EXISTS HAVE_SIZEOF_SIGSET_T)
+  set(SIZEOF_SIGSET_T ${HAVE_SIZEOF_SIGSET_T})
+endif()
+set(CMAKE_EXTRA_INCLUDE_FILES "${oCMAKE_EXTRA_INCLUDE_FILES}")
+# check for sizeof struct statfs64
+set(oCMAKE_EXTRA_INCLUDE_FILES "${CMAKE_EXTRA_INCLUDE_FILES}")
+check_include_file(sys/statfs.h HAVE_SYS_STATFS_H)
+check_include_file(sys/vfs.h HAVE_SYS_VFS_H)
+if(HAVE_SYS_STATFS_H)
+  set(CMAKE_EXTRA_INCLUDE_FILES ${CMAKE_EXTRA_INCLUDE_FILES} sys/statfs.h)
+endif()
+if(HAVE_SYS_VFS_H)
+  set(CMAKE_EXTRA_INCLUDE_FILES ${CMAKE_EXTRA_INCLUDE_FILES} sys/vfs.h)
+endif()
+# Have to pass _LARGEFILE64_SOURCE otherwise there is no struct statfs64.
+# We forcefully enable LFS to retain glibc legacy behaviour herein.
+set(oCMAKE_REQUIRED_DEFINITIONS ${CMAKE_REQUIRED_DEFINITIONS})
+set(CMAKE_REQUIRED_DEFINITIONS ${CMAKE_REQUIRED_DEFINITIONS} -D_LARGEFILE64_SOURCE)
+check_type_size("struct statfs64" SIZEOF_STRUCT_STATFS64)
+if(EXISTS HAVE_SIZEOF_STRUCT_STATFS64)
+  set(SIZEOF_STRUCT_STATFS64 ${HAVE_SIZEOF_STRUCT_STATFS64})
+else()
+  set(CMAKE_REQUIRED_DEFINITIONS ${oCMAKE_REQUIRED_DEFINITIONS})
+endif()
+set(CMAKE_EXTRA_INCLUDE_FILES "${oCMAKE_EXTRA_INCLUDE_FILES}")
+# do not set(CMAKE_REQUIRED_DEFINITIONS ${oCMAKE_REQUIRED_DEFINITIONS})
+# it back here either way.
+include(CheckStructHasMember)
+check_struct_has_member(glob_t gl_flags glob.h HAVE_GLOB_T_GL_FLAGS)
+check_struct_has_member(glob_t gl_closedir glob.h HAVE_GLOB_T_GL_CLOSEDIR)
+check_struct_has_member(glob_t gl_readdir glob.h HAVE_GLOB_T_GL_READDIR)
+check_struct_has_member(glob_t gl_opendir glob.h HAVE_GLOB_T_GL_OPENDIR)
+check_struct_has_member(glob_t gl_lstat glob.h HAVE_GLOB_T_GL_LSTAT)
+check_struct_has_member(glob_t gl_stat glob.h HAVE_GLOB_T_GL_STAT)
+
+# folks seem to have an aversion to configure_file? So be it..
+foreach(x HAVE_SYS_USTAT_H HAVE_UTIME_H HAVE_WORDEXP_H HAVE_GLOB_H
+HAVE_NANOSLEEP HAVE_USLEEP SIZEOF_SIGSET_T SIZEOF_STRUCT_STATFS64
+HAVE_GLOB_T_GL_FLAGS HAVE_GLOB_T_GL_CLOSEDIR
+HAVE_GLOB_T_GL_READDIR HAVE_GLOB_T_GL_OPENDIR
+HAVE_GLOB_T_GL_LSTAT HAVE_GLOB_T_GL_STAT)
+def_undef_string(${x} SANITIZER_COMMON_CFLAGS)
+endforeach()
+
+
 # Architectures supported by Sanitizer runtimes. Specific sanitizers may
 # support only subset of these (e.g. TSan works on x86_64 only).
 filter_available_targets(SANITIZER_COMMON_SUPPORTED_ARCH
diff --git a/cmake/Modules/CompilerRTUtils.cmake b/cmake/Modules/CompilerRTUtils.cmake
index e22e775..3a0beec 100644
--- a/cmake/Modules/CompilerRTUtils.cmake
+++ b/cmake/Modules/CompilerRTUtils.cmake
@@ -59,3 +59,18 @@ macro(append_no_rtti_flag list)
   append_if(COMPIL

Re: [PATCH 3/3] [LLVM] [sanitizer] add conditionals for libc

2014-04-17 Thread Konstantin Serebryany

Hi,

If you are trying to modify the libsanitizer files, please read here:
https://code.google.com/p/address-sanitizer/wiki/HowToContribute

--kcc

On Thu, Apr 17, 2014 at 5:49 PM, Bernhard Reutner-Fischer
 wrote:
> Conditionalize usage of dlvsym(), nanosleep(), usleep();
> Conditionalize layout of struct sigaction and type of its member
> sa_flags.
> Conditionalize glob_t members gl_closedir, gl_readdir, gl_opendir,
> gl_flags, gl_lstat, gl_stat.
> Check for availability of glob.h for use with above members.
> Check for availability of netrom/netrom.h, sys/ustat.h (for obsolete
> ustat() function), utime.h (for obsolete utime() function), wordexp.h.
> Determine size of sigset_t instead of hardcoding it.
> Determine size of struct statfs64, if available.
>
> Leave defaults to match what glibc expects but probe them for uClibc.
>
> Signed-off-by: Bernhard Reutner-Fischer 
> ---
>  CMakeLists.txt |  58 +++
>  cmake/Modules/CompilerRTUtils.cmake|  15 ++
>  cmake/Modules/FunctionExistsNotStub.cmake  |  56 +++
>  lib/interception/interception_linux.cc |   2 +
>  lib/interception/interception_linux.h  |   9 ++
>  .../sanitizer_common_interceptors.inc  | 101 +++-
>  .../sanitizer_platform_limits_posix.cc |  44 -
>  .../sanitizer_platform_limits_posix.h  |  27 +++-
>  lib/sanitizer_common/sanitizer_posix_libcdep.cc|   9 ++
>  make/platform/clang_linux.mk   | 180 +
>  make/platform/clang_linux_test_libc.c  |  68 
>  11 files changed, 561 insertions(+), 8 deletions(-)
>  create mode 100644 cmake/Modules/FunctionExistsNotStub.cmake
>  create mode 100644 make/platform/clang_linux_test_libc.c
>
> diff --git a/CMakeLists.txt b/CMakeLists.txt
> index e1a7a1f..af8073e 100644
> --- a/CMakeLists.txt
> +++ b/CMakeLists.txt
> @@ -330,6 +330,64 @@ if(APPLE)
>  -isysroot ${IOSSIM_SDK_DIR})
>  endif()
>
> +set(ct_c ${COMPILER_RT_SOURCE_DIR}/make/platform/clang_linux_test_libc.c)
> +check_include_file(sys/ustat.h HAVE_SYS_USTAT_H)
> +check_include_file(utime.h HAVE_UTIME_H)
> +check_include_file(wordexp.h HAVE_WORDEXP_H)
> +check_include_file(glob.h HAVE_GLOB_H)
> +include(FunctionExistsNotStub)
> +check_function_exists_not_stub(${ct_c} nanosleep HAVE_NANOSLEEP)
> +check_function_exists_not_stub(${ct_c} usleep HAVE_USLEEP)
> +include(CheckTypeSize)
> +# check for sizeof sigset_t
> +set(oCMAKE_EXTRA_INCLUDE_FILES "${CMAKE_EXTRA_INCLUDE_FILES}")
> +set(CMAKE_EXTRA_INCLUDE_FILES ${CMAKE_EXTRA_INCLUDE_FILES} signal.h)
> +check_type_size("sigset_t" SIZEOF_SIGSET_T BUILTIN_TYPES_ONLY)
> +if(EXISTS HAVE_SIZEOF_SIGSET_T)
> +  set(SIZEOF_SIGSET_T ${HAVE_SIZEOF_SIGSET_T})
> +endif()
> +set(CMAKE_EXTRA_INCLUDE_FILES "${oCMAKE_EXTRA_INCLUDE_FILES}")
> +# check for sizeof struct statfs64
> +set(oCMAKE_EXTRA_INCLUDE_FILES "${CMAKE_EXTRA_INCLUDE_FILES}")
> +check_include_file(sys/statfs.h HAVE_SYS_STATFS_H)
> +check_include_file(sys/vfs.h HAVE_SYS_VFS_H)
> +if(HAVE_SYS_STATFS_H)
> +  set(CMAKE_EXTRA_INCLUDE_FILES ${CMAKE_EXTRA_INCLUDE_FILES} sys/statfs.h)
> +endif()
> +if(HAVE_SYS_VFS_H)
> +  set(CMAKE_EXTRA_INCLUDE_FILES ${CMAKE_EXTRA_INCLUDE_FILES} sys/vfs.h)
> +endif()
> +# Have to pass _LARGEFILE64_SOURCE otherwise there is no struct statfs64.
> +# We forcefully enable LFS to retain glibc legacy behaviour herein.
> +set(oCMAKE_REQUIRED_DEFINITIONS ${CMAKE_REQUIRED_DEFINITIONS})
> +set(CMAKE_REQUIRED_DEFINITIONS ${CMAKE_REQUIRED_DEFINITIONS} -D_LARGEFILE64_SOURCE)
> +check_type_size("struct statfs64" SIZEOF_STRUCT_STATFS64)
> +if(EXISTS HAVE_SIZEOF_STRUCT_STATFS64)
> +  set(SIZEOF_STRUCT_STATFS64 ${HAVE_SIZEOF_STRUCT_STATFS64})
> +else()
> +  set(CMAKE_REQUIRED_DEFINITIONS ${oCMAKE_REQUIRED_DEFINITIONS})
> +endif()
> +set(CMAKE_EXTRA_INCLUDE_FILES "${oCMAKE_EXTRA_INCLUDE_FILES}")
> +# do not set(CMAKE_REQUIRED_DEFINITIONS ${oCMAKE_REQUIRED_DEFINITIONS})
> +# it back here either way.
> +include(CheckStructHasMember)
> +check_struct_has_member(glob_t gl_flags glob.h HAVE_GLOB_T_GL_FLAGS)
> +check_struct_has_member(glob_t gl_closedir glob.h HAVE_GLOB_T_GL_CLOSEDIR)
> +check_struct_has_member(glob_t gl_readdir glob.h HAVE_GLOB_T_GL_READDIR)
> +check_struct_has_member(glob_t gl_opendir glob.h HAVE_GLOB_T_GL_OPENDIR)
> +check_struct_has_member(glob_t gl_lstat glob.h HAVE_GLOB_T_GL_LSTAT)
> +check_struct_has_member(glob_t gl_stat glob.h HAVE_GLOB_T_GL_STAT)
> +
> +# folks seem to have an aversion to configure_file? So be it..
> +foreach(x HAVE_SYS_USTAT_H HAVE_UTIME_H HAVE_WORDEXP_H HAVE_GLOB_H
> +HAVE_NANOSLEEP HAVE_USLEEP SIZEOF_SIGSET_T SIZEOF_STRUCT_STATFS64
> +HAVE_GLOB_T_GL_FLAGS HAVE_GLOB_T_GL_CLOSEDIR
> +HAVE_GLOB_T_GL_READDIR HAVE_GLOB_T_GL_OPENDIR
> +HAVE_GLOB_T_GL_LSTAT HAVE_GLOB_T_GL_STAT)
> +def_undef_string(${x} SANITIZER_COMMON_CFLAGS)
> +endforeach()
> +
> +
>  # Architectures supported by Sanitizer runtimes. Specific san

Re: [PATCH 3/3] [LLVM] [sanitizer] add conditionals for libc

2014-04-17 Thread Bernhard Reutner-Fischer

On 17 April 2014 16:07, Konstantin Serebryany
 wrote:
> Hi,
>
> If you are trying to modify the libsanitizer files, please read here:
> https://code.google.com/p/address-sanitizer/wiki/HowToContribute

I read that, thanks. Patch 3/3 is for the current compiler-rt git repo;
please install it there, as I do not have write access to either the LLVM
or the compiler-rt tree.
TIA,
>
> --kcc
>
> On Thu, Apr 17, 2014 at 5:49 PM, Bernhard Reutner-Fischer
>  wrote:
>> Conditionalize usage of dlvsym(), nanosleep(), usleep();
>> Conditionalize layout of struct sigaction and type of its member
>> sa_flags.
>> Conditionalize glob_t members gl_closedir, gl_readdir, gl_opendir,
>> gl_flags, gl_lstat, gl_stat.
>> Check for availability of glob.h for use with above members.
>> Check for availability of netrom/netrom.h, sys/ustat.h (for obsolete
>> ustat() function), utime.h (for obsolete utime() function), wordexp.h.
>> Determine size of sigset_t instead of hardcoding it.
>> Determine size of struct statfs64, if available.
>>
>> Leave defaults to match what glibc expects but probe them for uClibc.
>>
>> Signed-off-by: Bernhard Reutner-Fischer 
>> ---
>>  CMakeLists.txt |  58 +++
>>  cmake/Modules/CompilerRTUtils.cmake|  15 ++
>>  cmake/Modules/FunctionExistsNotStub.cmake  |  56 +++
>>  lib/interception/interception_linux.cc |   2 +
>>  lib/interception/interception_linux.h  |   9 ++
>>  .../sanitizer_common_interceptors.inc  | 101 +++-
>>  .../sanitizer_platform_limits_posix.cc |  44 -
>>  .../sanitizer_platform_limits_posix.h  |  27 +++-
>>  lib/sanitizer_common/sanitizer_posix_libcdep.cc|   9 ++
>>  make/platform/clang_linux.mk   | 180 
>> +
>>  make/platform/clang_linux_test_libc.c  |  68 
>>  11 files changed, 561 insertions(+), 8 deletions(-)
>>  create mode 100644 cmake/Modules/FunctionExistsNotStub.cmake
>>  create mode 100644 make/platform/clang_linux_test_libc.c
>>
>> diff --git a/CMakeLists.txt b/CMakeLists.txt
>> index e1a7a1f..af8073e 100644
>> --- a/CMakeLists.txt
>> +++ b/CMakeLists.txt
>> @@ -330,6 +330,64 @@ if(APPLE)
>>  -isysroot ${IOSSIM_SDK_DIR})
>>  endif()
>>
>> +set(ct_c ${COMPILER_RT_SOURCE_DIR}/make/platform/clang_linux_test_libc.c)
>> +check_include_file(sys/ustat.h HAVE_SYS_USTAT_H)
>> +check_include_file(utime.h HAVE_UTIME_H)
>> +check_include_file(wordexp.h HAVE_WORDEXP_H)
>> +check_include_file(glob.h HAVE_GLOB_H)
>> +include(FunctionExistsNotStub)
>> +check_function_exists_not_stub(${ct_c} nanosleep HAVE_NANOSLEEP)
>> +check_function_exists_not_stub(${ct_c} usleep HAVE_USLEEP)
>> +include(CheckTypeSize)
>> +# check for sizeof sigset_t
>> +set(oCMAKE_EXTRA_INCLUDE_FILES "${CMAKE_EXTRA_INCLUDE_FILES}")
>> +set(CMAKE_EXTRA_INCLUDE_FILES ${CMAKE_EXTRA_INCLUDE_FILES} signal.h)
>> +check_type_size("sigset_t" SIZEOF_SIGSET_T BUILTIN_TYPES_ONLY)
>> +if(EXISTS HAVE_SIZEOF_SIGSET_T)
>> +  set(SIZEOF_SIGSET_T ${HAVE_SIZEOF_SIGSET_T})
>> +endif()
>> +set(CMAKE_EXTRA_INCLUDE_FILES "${oCMAKE_EXTRA_INCLUDE_FILES}")
>> +# check for sizeof struct statfs64
>> +set(oCMAKE_EXTRA_INCLUDE_FILES "${CMAKE_EXTRA_INCLUDE_FILES}")
>> +check_include_file(sys/statfs.h HAVE_SYS_STATFS_H)
>> +check_include_file(sys/vfs.h HAVE_SYS_VFS_H)
>> +if(HAVE_SYS_STATFS_H)
>> +  set(CMAKE_EXTRA_INCLUDE_FILES ${CMAKE_EXTRA_INCLUDE_FILES} sys/statfs.h)
>> +endif()
>> +if(HAVE_SYS_VFS_H)
>> +  set(CMAKE_EXTRA_INCLUDE_FILES ${CMAKE_EXTRA_INCLUDE_FILES} sys/vfs.h)
>> +endif()
>> +# Have to pass _LARGEFILE64_SOURCE otherwise there is no struct statfs64.
>> +# We forcefully enable LFS to retain glibc legacy behaviour herein.
>> +set(oCMAKE_REQUIRED_DEFINITIONS ${CMAKE_REQUIRED_DEFINITIONS})
>> +set(CMAKE_REQUIRED_DEFINITIONS ${CMAKE_REQUIRED_DEFINITIONS} 
>> -D_LARGEFILE64_SOURCE)
>> +check_type_size("struct statfs64" SIZEOF_STRUCT_STATFS64)
>> +if(EXISTS HAVE_SIZEOF_STRUCT_STATFS64)
>> +  set(SIZEOF_STRUCT_STATFS64 ${HAVE_SIZEOF_STRUCT_STATFS64})
>> +else()
>> +  set(CMAKE_REQUIRED_DEFINITIONS ${oCMAKE_REQUIRED_DEFINITIONS})
>> +endif()
>> +set(CMAKE_EXTRA_INCLUDE_FILES "${oCMAKE_EXTRA_INCLUDE_FILES}")
>> +# do not set(CMAKE_REQUIRED_DEFINITIONS ${oCMAKE_REQUIRED_DEFINITIONS})
>> +# it back here either way.
>> +include(CheckStructHasMember)
>> +check_struct_has_member(glob_t gl_flags glob.h HAVE_GLOB_T_GL_FLAGS)
>> +check_struct_has_member(glob_t gl_closedir glob.h HAVE_GLOB_T_GL_CLOSEDIR)
>> +check_struct_has_member(glob_t gl_readdir glob.h HAVE_GLOB_T_GL_READDIR)
>> +check_struct_has_member(glob_t gl_opendir glob.h HAVE_GLOB_T_GL_OPENDIR)
>> +check_struct_has_member(glob_t gl_lstat glob.h HAVE_GLOB_T_GL_LSTAT)
>> +check_struct_has_member(glob_t gl_stat glob.h HAVE_GLOB_T_GL_STAT)
>> +
>> +# folks seem to have an aversion to configure_file? So be it..
>> +foreach(x HAVE_SYS_USTAT_H HAVE_UTIME_H HAVE_WORDEXP_H HAVE_GLOB_H
>> +HAVE_NANOSLEEP HAVE_USLEE

Re: [PATCH v7?] PR middle-end/60281

2014-04-17 Thread lin zuojian

Hi Bernd,
I have signed the copyright assignment and the process has completed. Now I
need to answer two more questions before my patch can be
committed, right?

[Did you copy any files or text written by someone else in these changes?]

no

[Which files have you changed so far, and which new files have you written
so far?]
gcc/asan.c
gcc/ChangeLog
gcc/cfgexpand.c

Okay, please review my patch again. If there is no problem, please
commit it for me.
--
Regards
lin zuojian

Re: [RFC] Add aarch64 support for ada

2014-04-17 Thread Richard Henderson

On 04/17/2014 02:00 AM, Tristan Gingold wrote:
> 
> On 16 Apr 2014, at 17:36, Richard Henderson  wrote:
> 
>> On 04/16/2014 12:39 AM, Eric Botcazou wrote:
>>>> The primary bit of rfc here is the hunk that applies to ada/types.h
>>>> with respect to Fat_Pointer.  Given that the Ada type, as defined in
>>>> s-stratt.ads, does not include alignment, I can't imagine why the C
>>>> type should have it.
>>>
>>> See gcc-interface/utils.c:finish_fat_pointer_type.
>>
>> Ah hah.
>>
>>  /* Make sure we can put it into a register.  */
>>  if (STRICT_ALIGNMENT)
>>TYPE_ALIGN (record_type) = MIN (BIGGEST_ALIGNMENT, 2 * POINTER_SIZE);
>>
>> AArch64 is not a STRICT_ALIGNMENT target, thus the mismatch.
> 
> As the align attribute in types.h is for the host, couldn't a configure test 
> solve
> this issue ?

I doubt it.  I'm not sure what kind of configure test you could write that
would determine the setting of STRICT_ALIGNMENT, since even non-strict-align
targets prefer to align data for performance reasons.  Be careful that the test
couldn't be an execution test, lest you break host != build.

> One of the most common Fat_Pointer is for strings, which aren't declared in 
> any
> source and is very commonly used.
> 
> OTOH, I think this optimization mostly targets sparc.

Indeed, 32-bit sparc wants 64-bit alignment for its ldd/std instructions.

Perhaps the better optimization (supposing it's really worth keeping) is to
DECL_ALIGN the static strings, rather than align the type?

Presumably Ada strings are as with C string literals -- symbols private to the
compilation unit which are normally passed by value.  Thus functions within the
compilation unit would see the extra alignment of the data and be able to use
ldd to load the pair.  On the receiving end being able to use std would remain
a matter of luck.

r~

Re: [PATCH 3/3] [LLVM] [sanitizer] add conditionals for libc

2014-04-17 Thread Konstantin Serebryany

On Thu, Apr 17, 2014 at 6:27 PM, Bernhard Reutner-Fischer
 wrote:
> On 17 April 2014 16:07, Konstantin Serebryany
>  wrote:
>> Hi,
>>
>> If you are trying to modify the libsanitizer files, please read here:
>> https://code.google.com/p/address-sanitizer/wiki/HowToContribute
>
> I read that, thanks. Patch 3/3 is for current compiler-rt git repo,
> please install it there, i do not have write access to the LLVM nor
> compiler-rt trees.

I can commit your patch to the LLVM tree only after you follow the process
described on that page.
Sorry, this is a hard rule.

--kcc

> TIA,
>>
>> --kcc
>>
>> On Thu, Apr 17, 2014 at 5:49 PM, Bernhard Reutner-Fischer
>>  wrote:
>>> Conditionalize usage of dlvsym(), nanosleep(), usleep();
>>> Conditionalize layout of struct sigaction and type of it's member
>>> sa_flags.
>>> Conditionalize glob_t members gl_closedir, gl_readdir, gl_opendir,
>>> gl_flags, gl_lstat, gl_stat.
>>> Check for availability of glob.h for use with above members.
>>> Check for availability of netrom/netrom.h, sys/ustat.h (for obsolete
>>> ustat() function), utime.h (for obsolete utime() function), wordexp.h.
>>> Determine size of sigset_t instead of hardcoding it.
>>> Determine size of struct statfs64, if available.
>>>
>>> Leave defaults to match what glibc expects but probe them for uClibc.
>>>
>>> Signed-off-by: Bernhard Reutner-Fischer 
>>> ---
>>>  CMakeLists.txt |  58 +++
>>>  cmake/Modules/CompilerRTUtils.cmake|  15 ++
>>>  cmake/Modules/FunctionExistsNotStub.cmake  |  56 +++
>>>  lib/interception/interception_linux.cc |   2 +
>>>  lib/interception/interception_linux.h  |   9 ++
>>>  .../sanitizer_common_interceptors.inc  | 101 +++-
>>>  .../sanitizer_platform_limits_posix.cc |  44 -
>>>  .../sanitizer_platform_limits_posix.h  |  27 +++-
>>>  lib/sanitizer_common/sanitizer_posix_libcdep.cc|   9 ++
>>>  make/platform/clang_linux.mk   | 180 
>>> +
>>>  make/platform/clang_linux_test_libc.c  |  68 
>>>  11 files changed, 561 insertions(+), 8 deletions(-)
>>>  create mode 100644 cmake/Modules/FunctionExistsNotStub.cmake
>>>  create mode 100644 make/platform/clang_linux_test_libc.c
>>>
>>> diff --git a/CMakeLists.txt b/CMakeLists.txt
>>> index e1a7a1f..af8073e 100644
>>> --- a/CMakeLists.txt
>>> +++ b/CMakeLists.txt
>>> @@ -330,6 +330,64 @@ if(APPLE)
>>>  -isysroot ${IOSSIM_SDK_DIR})
>>>  endif()
>>>
>>> +set(ct_c ${COMPILER_RT_SOURCE_DIR}/make/platform/clang_linux_test_libc.c)
>>> +check_include_file(sys/ustat.h HAVE_SYS_USTAT_H)
>>> +check_include_file(utime.h HAVE_UTIME_H)
>>> +check_include_file(wordexp.h HAVE_WORDEXP_H)
>>> +check_include_file(glob.h HAVE_GLOB_H)
>>> +include(FunctionExistsNotStub)
>>> +check_function_exists_not_stub(${ct_c} nanosleep HAVE_NANOSLEEP)
>>> +check_function_exists_not_stub(${ct_c} usleep HAVE_USLEEP)
>>> +include(CheckTypeSize)
>>> +# check for sizeof sigset_t
>>> +set(oCMAKE_EXTRA_INCLUDE_FILES "${CMAKE_EXTRA_INCLUDE_FILES}")
>>> +set(CMAKE_EXTRA_INCLUDE_FILES ${CMAKE_EXTRA_INCLUDE_FILES} signal.h)
>>> +check_type_size("sigset_t" SIZEOF_SIGSET_T BUILTIN_TYPES_ONLY)
>>> +if(EXISTS HAVE_SIZEOF_SIGSET_T)
>>> +  set(SIZEOF_SIGSET_T ${HAVE_SIZEOF_SIGSET_T})
>>> +endif()
>>> +set(CMAKE_EXTRA_INCLUDE_FILES "${oCMAKE_EXTRA_INCLUDE_FILES}")
>>> +# check for sizeof struct statfs64
>>> +set(oCMAKE_EXTRA_INCLUDE_FILES "${CMAKE_EXTRA_INCLUDE_FILES}")
>>> +check_include_file(sys/statfs.h HAVE_SYS_STATFS_H)
>>> +check_include_file(sys/vfs.h HAVE_SYS_VFS_H)
>>> +if(HAVE_SYS_STATFS_H)
>>> +  set(CMAKE_EXTRA_INCLUDE_FILES ${CMAKE_EXTRA_INCLUDE_FILES} sys/statfs.h)
>>> +endif()
>>> +if(HAVE_SYS_VFS_H)
>>> +  set(CMAKE_EXTRA_INCLUDE_FILES ${CMAKE_EXTRA_INCLUDE_FILES} sys/vfs.h)
>>> +endif()
>>> +# Have to pass _LARGEFILE64_SOURCE otherwise there is no struct statfs64.
>>> +# We forcefully enable LFS to retain glibc legacy behaviour herein.
>>> +set(oCMAKE_REQUIRED_DEFINITIONS ${CMAKE_REQUIRED_DEFINITIONS})
>>> +set(CMAKE_REQUIRED_DEFINITIONS ${CMAKE_REQUIRED_DEFINITIONS} 
>>> -D_LARGEFILE64_SOURCE)
>>> +check_type_size("struct statfs64" SIZEOF_STRUCT_STATFS64)
>>> +if(EXISTS HAVE_SIZEOF_STRUCT_STATFS64)
>>> +  set(SIZEOF_STRUCT_STATFS64 ${HAVE_SIZEOF_STRUCT_STATFS64})
>>> +else()
>>> +  set(CMAKE_REQUIRED_DEFINITIONS ${oCMAKE_REQUIRED_DEFINITIONS})
>>> +endif()
>>> +set(CMAKE_EXTRA_INCLUDE_FILES "${oCMAKE_EXTRA_INCLUDE_FILES}")
>>> +# do not set(CMAKE_REQUIRED_DEFINITIONS ${oCMAKE_REQUIRED_DEFINITIONS})
>>> +# it back here either way.
>>> +include(CheckStructHasMember)
>>> +check_struct_has_member(glob_t gl_flags glob.h HAVE_GLOB_T_GL_FLAGS)
>>> +check_struct_has_member(glob_t gl_closedir glob.h HAVE_GLOB_T_GL_CLOSEDIR)
>>> +check_struct_has_member(glob_t gl_readdir glob.h HAVE_GLOB_T_GL_READDIR)
>>> +check_struct_has_member(glob_t gl_opendir glob.h HAVE_GLOB_T_GL_OPENDIR)
>>> +check_

Re: fuse-caller-save - hook format

2014-04-17 Thread Vladimir Makarov

On 2014-04-16, 3:19 PM, Tom de Vries wrote:
> Vladimir,
> 
> All patches for the fuse-caller-save optimization have been ok-ed. The only 
> part
> not approved is the MIPS-specific part.
> 
> The objection of Richard S. is not so much the patch itself, but more the idea
> of the hook fn_other_hard_reg_usage.
> 
> For clarity, I'm restating the current hook definition here:
> ...
> +@deftypefn {Target Hook} bool TARGET_FN_OTHER_HARD_REG_USAGE (struct
> hard_reg_set_container *@var{regs})
> Add any hard registers to @var{regs} that are set or clobbered by a call to 
> the
> function.  This hook only needs to add registers that cannot be found by
> examination of the final RTL representation of a function.  This hook returns
> true if it managed to determine which registers need to be added.  The default
> version of this hook returns false.
> ...
> 
> Richard prefers to, rather than having a hook specifying what registers are
> implicitly clobbered, adding those clobbers to CALL_INSN_FUNCTION_USAGE.
> 
> I can see these possibilities (and perhaps there are more):
> 
> 1. We go with Richards proposal: we make each target responsible for adding
> these clobbers in CALL_INSN_FUNCTION_USAGE, and use a hook called f.i.
> targetm.fuse_caller_save_p or targetm.implicit_call_clobbers_in_fusage_p, to
> indicate whether a target has taken care of that, meaning it's safe to do the
> fuse-caller-save optimization.
> 
> 2. A mixed solution: we make each target responsible for specifying which
> clobbers need to be added in CALL_INSN_FUNCTION_USAGE, using a hook called 
> f.i.
> targetm.call_clobbered_regs, and add generic code to add those clobbers to
> CALL_INSN_FUNCTION_USAGE.
> 
> 3. We stick with the current, approved hook format, and try to convince 
> Richard
> to live with it.
> 
> 
> Since you are a register allocator maintainer, familiar with the
> fuse-caller-save optimization, and have approved the original hook, I would 
> like
> to ask you to make a decision on how to proceed from here.
> 

I have no preference; it is a matter of taste.  Each solution has
its own advantages and disadvantages.  Putting this info into
CALL_INSN_FUNCTION_USAGE helps GCC development a lot, but it has a big
drawback in RTL memory footprint (especially for targets that have
a lot of regs, like AM29k or IA64).  On the other hand, an analogous approach
is already used in the DF infrastructure (which would be nice to fix, imho).

Still, between GCC users and GCC developers, I'd prefer the solution that
favors users (even though the effect on the amount of resources used by GCC
is quite insignificant), as they outnumber the developers by a few orders
of magnitude.

But I can live with any solution.  So it is up to you.  I am flexible.

PATCH: PR target/60863: Incorrect codegen in ix86_expand_clear for -Os

2014-04-17 Thread H.J. Lu

Hi,

I checked in this preapproved patch to generate "xor reg, reg" when
optimizing for size.


H.J.
Index: ChangeLog
===
--- ChangeLog   (revision 209487)
+++ ChangeLog   (working copy)
@@ -1,3 +1,10 @@
+2014-04-17  H.J. Lu  
+
+   PR target/60863
+   * config/i386/i386.c (ix86_expand_clear): Remove outdated
+   comment.  Check optimize_insn_for_size_p instead of
+   optimize_insn_for_speed_p.
+
 2014-04-17  Martin Jambor  
 
* gimple-iterator.c (gsi_start_edge): New function.
Index: config/i386/i386.c
===
--- config/i386/i386.c  (revision 209487)
+++ config/i386/i386.c  (working copy)
@@ -16668,8 +16668,7 @@ ix86_expand_clear (rtx dest)
 dest = gen_rtx_REG (SImode, REGNO (dest));
   tmp = gen_rtx_SET (VOIDmode, dest, const0_rtx);
 
-  /* This predicate should match that for movsi_xor and movdi_xor_rex64.  */
-  if (!TARGET_USE_MOV0 || optimize_insn_for_speed_p ())
+  if (!TARGET_USE_MOV0 || optimize_insn_for_size_p ())
 {
   rtx clob = gen_rtx_CLOBBER (VOIDmode, gen_rtx_REG (CCmode, FLAGS_REG));
   tmp = gen_rtx_PARALLEL (VOIDmode, gen_rtvec (2, tmp, clob));

Re: fuse-caller-save - hook format

2014-04-17 Thread Richard Sandiford

Vladimir Makarov  writes:
> On 2014-04-16, 3:19 PM, Tom de Vries wrote:
>> Vladimir,
>> 
>> All patches for the fuse-caller-save optimization have been ok-ed. The
>> only part
>> not approved is the MIPS-specific part.
>> 
>> The objection of Richard S. is not so much the patch itself, but more the 
>> idea
>> of the hook fn_other_hard_reg_usage.
>> 
>> For clarity, I'm restating the current hook definition here:
>> ...
>> +@deftypefn {Target Hook} bool TARGET_FN_OTHER_HARD_REG_USAGE (struct
>> hard_reg_set_container *@var{regs})
>> Add any hard registers to @var{regs} that are set or clobbered by a
>> call to the
>> function.  This hook only needs to add registers that cannot be found by
>> examination of the final RTL representation of a function.  This hook returns
>> true if it managed to determine which registers need to be added.  The 
>> default
>> version of this hook returns false.
>> ...
>> 
>> Richard prefers to, rather than having a hook specifying what registers are
>> implicitly clobbered, adding those clobbers to CALL_INSN_FUNCTION_USAGE.
>> 
>> I can see these possibilities (and perhaps there are more):
>> 
>> 1. We go with Richards proposal: we make each target responsible for adding
>> these clobbers in CALL_INSN_FUNCTION_USAGE, and use a hook called f.i.
>> targetm.fuse_caller_save_p or targetm.implicit_call_clobbers_in_fusage_p, to
>> indicate whether a target has taken care of that, meaning it's safe to do the
>> fuse-caller-save optimization.
>> 
>> 2. A mixed solution: we make each target responsible for specifying which
>> clobbers need to be added in CALL_INSN_FUNCTION_USAGE, using a hook
>> called f.i.
>> targetm.call_clobbered_regs, and add generic code to add those clobbers to
>> CALL_INSN_FUNCTION_USAGE.
>> 
>> 3. We stick with the current, approved hook format, and try to
>> convince Richard
>> to live with it.
>> 
>> 
>> Since you are a register allocator maintainer, familiar with the
>> fuse-caller-save optimization, and have approved the original hook, I
>> would like
>> to ask you to make a decision on how to proceed from here.
>> 
>
> I have no preferences and it is a matter of taste.  Each solution has
> own advantages and disadvantages.  Putting this info into
> CALL_INSN_FUNCTION_USAGE helps GCC developing a lot but it has a big
> drawback in RTL memory footprint (especially for some targets which have
> a lot of regs like AM29k or IA64).  On the order hand analogous approach
> is already used in DF-infrastructure (which would be nice to fix it imho).
>
> Still between GCC users and GCC developers, I'd prefer solution (even
> the effect on amount of resources used by GCC is quite insignificant)
> for users as their number is in a few magnitudes more then the developers.

Hmm, but you're talking like there are going to be a lot of these registers.
This isn't about which registers are call-clobbered or call-saved according
to the ABI (that's already available in other places).  All we want here
is the set of registers that are clobbered _in the caller_ before reaching
the callee or after the callee has returned.

So although IA-64 has lots of registers, the caller doesn't AFAIK use
lots of registers in the process of making the call.

On all targets we should be talking about one or two registers here.

Thanks,
Richard

Re: [RFC] Add aarch64 support for ada

2014-04-17 Thread Tristan Gingold


On 17 Apr 2014, at 16:50, Richard Henderson  wrote:

> On 04/17/2014 02:00 AM, Tristan Gingold wrote:
>> 
>> On 16 Apr 2014, at 17:36, Richard Henderson  wrote:
>> 
>>> On 04/16/2014 12:39 AM, Eric Botcazou wrote:
>>>>> The primary bit of rfc here is the hunk that applies to ada/types.h
>>>>> with respect to Fat_Pointer.  Given that the Ada type, as defined in
>>>>> s-stratt.ads, does not include alignment, I can't imagine why the C
>>>>> type should have it.
>>>> 
>>>> See gcc-interface/utils.c:finish_fat_pointer_type.
>>> 
>>> Ah hah.
>>> 
>>> /* Make sure we can put it into a register.  */
>>> if (STRICT_ALIGNMENT)
>>>   TYPE_ALIGN (record_type) = MIN (BIGGEST_ALIGNMENT, 2 * POINTER_SIZE);
>>> 
>>> AArch64 is not a STRICT_ALIGNMENT target, thus the mismatch.
>> 
>> As the align attribute in types.h is for the host, couldn't a configure test 
>> solve
>> this issue ?
> 
> I doubt it.  I'm not sure what kind of configure test you could write that
> would determine the setting of STRICT_ALIGNMENT, since even non-strict-align
> targets prefer to align data for performance reasons.  Be careful that the 
> test
> couldn't be an execution test, lest you break host != build.

What about this compile-time check:

package Fatptralign is
   type String_Acc is access String;
   type Integer_Acc is access Integer;

   pragma Compile_Time_Error
     (String_Acc'Alignment = 1 * Integer_Acc'Alignment,
      "Fat pointers are simply aligned");

   pragma Compile_Time_Error
     (String_Acc'Alignment = 2 * Integer_Acc'Alignment,
      "Fat pointers are doubly aligned");
end Fatptralign;


>> One of the most common Fat_Pointer is for strings, which aren't declared in 
>> any
>> source and is very commonly used.
>> 
>> OTOH, I think this optimization mostly targets sparc.
> 
> Indeed, 32-bit sparc wants 64-bit alignment for its ldd/std instructions.
> 
> Perhaps the better optimization (supposing it's really worth keeping)

That's a real question (whether it is worth keeping).  I think this also
affects powerpc (an important target).

Eric ?

> is to
> DECL_ALIGN the static strings, rather than align the type?

[ Ada strings (and more generally Ada unconstrained arrays and Ada accesses to
  unconstrained arrays) are represented in GNAT by a fat pointer, i.e. a structure
  containing a pointer to the bounds and a pointer to the data.
  We are talking about the alignment of that structure. ]

> Presumably Ada strings are as with C string literals -- symbols private to the
> compilation unit which are normally passed by value.  Thus functions within 
> the
> compilation unit would see the extra alignment of the data and be able to use
> ldd to load the pair.  On the receiving end being able to use std would remain
> a matter of luck.

I think this would forfeit most of the gain.  Fat pointers can be heavily used in
some applications, and be present in structures.  The gain with only private symbols
might be tiny.

Tristan.

Re: [RFC] Add aarch64 support for ada

2014-04-17 Thread Eric Botcazou

> Ah hah.
> 
>   /* Make sure we can put it into a register.  */
>   if (STRICT_ALIGNMENT)
> TYPE_ALIGN (record_type) = MIN (BIGGEST_ALIGNMENT, 2 * POINTER_SIZE);
> 
> AArch64 is not a STRICT_ALIGNMENT target, thus the mismatch.

I see.  Initially this alignment promotion had been universal, but someone 
recently complained about holes in structures on x86-64 because of it so we 
restricted it to the platforms where it is really necessary for the goal 
stated in the comment; we left types.h untouched because the alignment could 
not possibly change the calling convention for non-strict-alignment targets...

> If we were to make this alignment unconditional, would it be better to drop
> the code from here in finish_fat_pointer_type and instead record that in
> the Ada source, as we do with the C source?

We cannot really do that: the s-stratt.ads thing is a red herring, and the
alignment of fat pointer types is decided entirely inside the compiler
(layout.adb:3213 and gcc-interface/utils.c:finish_fat_pointer_type).

I presume that the attached kludge is sufficient to make it work?


* fe.h (Compiler_Abort): Replace Fat_Pointer by String.
(Error_Msg_N): Likewise.
(Error_Msg_NE): Likewise.
(Get_External_Name_With_Suffix): Likewise.
* types.h (Fat_Pointer): Delete.
(String): New type.
(DECLARE_STRING): New macro.
* gcc-interface/decl.c (create_concat_name): Adjust.
* gcc-interface/trans.c (post_error): Likewise.
(post_error_ne): Likewise.
* gcc-interface/misc.c (internal_error_function): Likewise.


-- 
Eric Botcazou
Index: fe.h
===
--- fe.h	(revision 209461)
+++ fe.h	(working copy)
@@ -39,7 +39,7 @@ extern "C" {
 /* comperr:  */
 
 #define Compiler_Abort comperr__compiler_abort
-extern int Compiler_Abort (Fat_Pointer, int, Fat_Pointer) ATTRIBUTE_NORETURN;
+extern int Compiler_Abort (String, int, String) ATTRIBUTE_NORETURN;
 
 /* csets: */
 
@@ -90,8 +90,8 @@ extern Node_Id Get_Attribute_Definition_
 #define Error_Msg_NE  errout__error_msg_ne
 #define Set_Identifier_Casing errout__set_identifier_casing
 
-extern void Error_Msg_N	  (Fat_Pointer, Node_Id);
-extern void Error_Msg_NE  (Fat_Pointer, Node_Id, Entity_Id);
+extern void Error_Msg_N	  (String, Node_Id);
+extern void Error_Msg_NE  (String, Node_Id, Entity_Id);
 extern void Set_Identifier_Casing (Char *, const Char *);
 
 /* err_vars: */
@@ -151,7 +151,7 @@ extern void Setup_Asm_Outputs		(Node_Id)
 
 extern void Get_Encoded_Name			(Entity_Id);
 extern void Get_External_Name			(Entity_Id, Boolean);
-extern void Get_External_Name_With_Suffix	(Entity_Id, Fat_Pointer);
+extern void Get_External_Name_With_Suffix	(Entity_Id, String);
 
 /* exp_util: */
 
Index: types.h
===
--- types.h	(revision 209461)
+++ types.h	(working copy)
@@ -76,11 +76,14 @@ typedef Char *Str;
 /* Pointer to string of Chars */
 typedef Char *Str_Ptr;
 
-/* Types for the fat pointer used for strings and the template it
-   points to.  */
+/* Types for the fat pointer used for strings and the template it points to.
+   On most platforms the fat pointer is naturally aligned but, on the rest,
+   it is given twice the natural alignment.  For maximum portability, we do
+   not overalign the type but only the objects.  */
 typedef struct {int Low_Bound, High_Bound; } String_Template;
-typedef struct {const char *Array; String_Template *Bounds; }
-	__attribute ((aligned (sizeof (char *) * 2))) Fat_Pointer;
+typedef struct {const char *Array; String_Template *Bounds; } String;
+#define DECLARE_STRING(s, a, t) \
+  __attribute__ ((aligned (sizeof (char *) * 2))) String s = { a, &t }
 
 /* Types for Node/Entity Kinds:  */
 
Index: gcc-interface/decl.c
===
--- gcc-interface/decl.c	(revision 209461)
+++ gcc-interface/decl.c	(working copy)
@@ -8861,8 +8861,8 @@ create_concat_name (Entity_Id gnat_entit
   if (suffix)
 {
   String_Template temp = {1, (int) strlen (suffix)};
-  Fat_Pointer fp = {suffix, &temp};
-  Get_External_Name_With_Suffix (gnat_entity, fp);
+  DECLARE_STRING (s, suffix, temp);
+  Get_External_Name_With_Suffix (gnat_entity, s);
 }
   else
 Get_External_Name (gnat_entity, 0);
Index: gcc-interface/trans.c
===
--- gcc-interface/trans.c	(revision 209461)
+++ gcc-interface/trans.c	(working copy)
@@ -7833,7 +7833,6 @@ gnat_gimplify_stmt (tree *stmt_p)
 	  gnu_cond = build2 (ANNOTATE_EXPR, TREE_TYPE (gnu_cond), gnu_cond,
  build_int_cst (integer_type_node,
 		annot_expr_ivdep_kind));
-
 	if (LOOP_STMT_NO_VECTOR (stmt))
 	  gnu_cond = build2 (ANNOTATE_EXPR, TREE_TYPE (gnu_cond), gnu_cond,
  build_int_cst (integer_type_node,
@@ -9357,16 +9356,14 @@ void
 p

Re: [C PATCH] Make attributes accept enum values (PR c/50459)

2014-04-17 Thread Marek Polacek

On Wed, Apr 16, 2014 at 01:40:22PM -0400, Jason Merrill wrote:
> On 04/15/2014 03:56 PM, Marek Polacek wrote:
> >The testsuite doesn't hit this code with C++, but does hit this code
> >with C.  The thing is, if we have e.g.
> >enum { A = 128 };
> >void *fn1 (void) __attribute__((assume_aligned (A)));
> >then handle_assume_aligned_attribute walks the attribute arguments
> >and gets the argument via TREE_VALUE.  If this argument is an enum
> >value, then for C the argument is identifier_node that contains const_decl,
> 
> Ah.  Then I think the C parser should be fixed to check
> attribute_takes_identifier_p and look up the argument if false.

Ok, thanks, I didn't know about attribute_takes_identifier_p.  Should be done
in the following.  Regtested/bootstrapped on x86_64-linux.  Ok now?

2014-04-17  Marek Polacek  

PR c/50459
c-family/
* c-common.c (handle_aligned_attribute): Don't call default_conversion
on FUNCTION_DECLs.
(handle_vector_size_attribute): Likewise.
(handle_sentinel_attribute): Call default_conversion and allow even
integral types as an argument.  
c/
* c-parser.c (c_parser_attributes): If the attribute doesn't take an
identifier, call lookup_name for arguments. 
testsuite/
* c-c++-common/pr50459.c: New test.

diff --git gcc/c-family/c-common.c gcc/c-family/c-common.c
index c0e247b..1443914 100644
--- gcc/c-family/c-common.c
+++ gcc/c-family/c-common.c
@@ -7539,7 +7539,8 @@ handle_aligned_attribute (tree *node, tree ARG_UNUSED 
(name), tree args,
   if (args)
 {
   align_expr = TREE_VALUE (args);
-  if (align_expr && TREE_CODE (align_expr) != IDENTIFIER_NODE)
+  if (align_expr && TREE_CODE (align_expr) != IDENTIFIER_NODE
+ && TREE_CODE (align_expr) != FUNCTION_DECL)
align_expr = default_conversion (align_expr);
 }
   else
@@ -8533,7 +8534,8 @@ handle_vector_size_attribute (tree *node, tree name, tree 
args,
   *no_add_attrs = true;
 
   size = TREE_VALUE (args);
-  if (size && TREE_CODE (size) != IDENTIFIER_NODE)
+  if (size && TREE_CODE (size) != IDENTIFIER_NODE
+  && TREE_CODE (size) != FUNCTION_DECL)
 size = default_conversion (size);
 
   if (!tree_fits_uhwi_p (size))
@@ -8944,8 +8946,12 @@ handle_sentinel_attribute (tree *node, tree name, tree 
args,
   if (args)
 {
   tree position = TREE_VALUE (args);
+  if (position && TREE_CODE (position) != IDENTIFIER_NODE
+ && TREE_CODE (position) != FUNCTION_DECL)
+   position = default_conversion (position);
 
-  if (TREE_CODE (position) != INTEGER_CST)
+  if (TREE_CODE (position) != INTEGER_CST
+  || !INTEGRAL_TYPE_P (TREE_TYPE (position)))
{
  warning (OPT_Wattributes,
   "requested position is not an integer constant");
diff --git gcc/c/c-parser.c gcc/c/c-parser.c
index 5653e49..f8fe424 100644
--- gcc/c/c-parser.c
+++ gcc/c/c-parser.c
@@ -3912,6 +3912,7 @@ c_parser_attributes (c_parser *parser)
 || c_parser_next_token_is (parser, CPP_KEYWORD))
{
  tree attr, attr_name, attr_args;
+ bool attr_takes_id_p;
  vec *expr_list;
  if (c_parser_next_token_is (parser, CPP_COMMA))
{
@@ -3922,6 +3923,7 @@ c_parser_attributes (c_parser *parser)
  attr_name = c_parser_attribute_any_word (parser);
  if (attr_name == NULL)
break;
+ attr_takes_id_p = attribute_takes_identifier_p (attr_name);
  if (is_cilkplus_vector_p (attr_name))   
{
  c_token *v_token = c_parser_peek_token (parser);
@@ -3950,6 +3952,15 @@ c_parser_attributes (c_parser *parser)
  == CPP_CLOSE_PAREN)))
{
  tree arg1 = c_parser_peek_token (parser)->value;
+ if (!attr_takes_id_p)
+   {
+ /* This is for enum values, so that they can be used as
+an attribute parameter; lookup_name will find their
+CONST_DECLs.  */
+ tree ln = lookup_name (arg1);
+ if (ln)
+   arg1 = ln;
+   }
  c_parser_consume_token (parser);
  if (c_parser_next_token_is (parser, CPP_CLOSE_PAREN))
attr_args = build_tree_list (NULL_TREE, arg1);
diff --git gcc/testsuite/c-c++-common/pr50459.c 
gcc/testsuite/c-c++-common/pr50459.c
index e69de29..f954b32 100644
--- gcc/testsuite/c-c++-common/pr50459.c
+++ gcc/testsuite/c-c++-common/pr50459.c
@@ -0,0 +1,14 @@
+/* PR c/50459 */
+/* { dg-do compile } */
+/* { dg-options "-Wall -Wextra" } */
+
+enum { A = 128, B = 1 };
+void *fn1 (void) __attribute__((assume_aligned (A)));
+void *fn2 (void) __attribute__((assume_aligned (A, 4)));
+void fn3 (void) __attribute__((constructor (A)));
+void fn4 (void) __attribute__((destructor (A)));
+void *fn5 (int) __attribute__((alloc_size (B)));
+void *fn6 (int) __attribute__((alloc_align (B)));
+void fn7 (const

Re: Avoid unnecesary GGC runs during LTO

2014-04-17 Thread Jan Hubicka

> > > +
> > > +  /* At this stage we know that majority of GGC memory is reachable.  
> > > + Growing the limits prevents unnecesary invocation of GGC.  */
> > > +  ggc_grow ();
> > >ggc_collect ();
> > 
> > Isn't the collect here pointless?  I see not in ENABLE_CHECKING, but
> > shouldn't this be abstracted away, thus call ggc_collect from ggc_grow?
> > Or maybe rather even for ENABLE_CHECKING adjust G.allocated_last_gc
> > and simply drop the ggc_collect above ().
> 
> I am fine with both.  I basically decided to keep the explicit ggc_collect() 
> to
> make it clear (from lto.c source code) that we are GGC safe at this point and
> to have way to double check that we do not produce too much of garbage with
> checking disabled. (so with -Q I will see how much it is collected at that 
> place).
> 
> We can embed it into ggc_grow and document that w/o checking it is equivalent
> to ggc_collect.
> > 
> > Anyway, this is sth for stage1 at this point.
> 
> OK,
> Honza

Ping...
The patch saves 33 GGC runs during the libxul.so link, which is not that bad ;)

Honza
> > 
> > Thanks,
> > Richard.
> > 
> > >/* Set the hooks so that all of the ipa passes can read in their data. 
> > >  */
> > > Index: ggc-none.c
> > > ===
> > > --- ggc-none.c(revision 209170)
> > > +++ ggc-none.c(working copy)
> > > @@ -63,3 +63,8 @@ ggc_free (void *p)
> > >  {
> > >free (p);
> > >  }
> > > +
> > > +void
> > > +ggc_grow (void)
> > > +{
> > > +}
> > > Index: ggc-page.c
> > > ===
> > > --- ggc-page.c(revision 209170)
> > > +++ ggc-page.c(working copy)
> > > @@ -2095,6 +2095,19 @@ ggc_collect (void)
> > >  fprintf (G.debug_file, "END COLLECTING\n");
> > >  }
> > >  
> > > +/* Assume that all GGC memory is reachable and grow the limits for next 
> > > collection. */
> > > +
> > > +void
> > > +ggc_grow (void)
> > > +{
> > > +#ifndef ENABLE_CHECKING
> > > +  G.allocated_last_gc = MAX (G.allocated_last_gc,
> > > +  G.allocated);
> > > +#endif
> > > +  if (!quiet_flag)
> > > +fprintf (stderr, " {GC start %luk} ", (unsigned long) G.allocated / 
> > > 1024);
> > > +}
> > > +
> > >  /* Print allocation statistics.  */
> > >  #define SCALE(x) ((unsigned long) ((x) < 1024*10 \
> > > ? (x) \
> > > 
> > > 
> > 
> > -- 
> > Richard Biener 
> > SUSE / SUSE Labs
> > SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
> > GF: Jeff Hawn, Jennifer Guild, Felix Imend"orffer

RE: [PATCH] Add a new option "-fmerge-bitfields" (patch / doc inside)

2014-04-17 Thread Zoran Jovanovic

Hello,
Unfortunately, the optimization is limited to bit-fields that share the same 
bit-field representative (DECL_BIT_FIELD_REPRESENTATIVE), and fields from 
different classes have different representatives.
In the given example the optimization would merge the accesses to the x and y 
bit-fields from the Base class, but not the access to z from the Der class.

Regards,
Zoran
  

From: Daniel Gutson [daniel.gut...@tallertechnologies.com]
Sent: Wednesday, April 16, 2014 4:16 PM
To: Zoran Jovanovic
Cc: Bernhard Reutner-Fischer; Richard Biener; gcc-patches@gcc.gnu.org
Subject: Re: [PATCH] Add a new option "-fmerge-bitfields" (patch / doc inside)

On Wed, Apr 16, 2014 at 8:38 AM, Zoran Jovanovic
 wrote:
> Hello,
> This is new patch version.
> Lowering is applied only for bit-fields copy sequences that are merged.
> Data structure representing bit-field copy sequences is renamed and reduced 
> in size.
> Optimization turned on by default for -O2 and higher.
> Some comments fixed.
>
> Benchmarking performed on WebKit for Android.
> Code size reduction noticed on several files, best examples are:
>
> core/rendering/style/StyleMultiColData (632->520 bytes)
> core/platform/graphics/FontDescription (1715->1475 bytes)
> core/rendering/style/FillLayer (5069->4513 bytes)
> core/rendering/style/StyleRareInheritedData (5618->5346)
> core/css/CSSSelectorList(4047->3887)
> core/platform/animation/CSSAnimationData (3844->3440 bytes)
> core/css/resolver/FontBuilder (13818->13350 bytes)
> core/platform/graphics/Font (16447->15975 bytes)
>
>
> Example:
>
> One of the motivating examples for this work was copy constructor of the 
> class which contains bit-fields.
>
> C++ code:
> class A
> {
> public:
> A(const A &x);
> unsigned a : 1;
> unsigned b : 2;
> unsigned c : 4;
> };
>
> A::A(const A&x)
> {
> a = x.a;
> b = x.b;
> c = x.c;
> }

Very interesting.

Does this work with inheritance too? E.g.

struct Base
{
uint32_t x:1;
uint32_t y:3;

Base(const Base& other) { x = other.x; y = other.y; }
};

struct Der : Base
{
Der() = default;
Der(const Der& other) : Base(other)
{ z = other.z; }

uint32_t z:9;
};


>
> GIMPLE code without optimization:
>
>   :
>   _3 = x_2(D)->a;
>   this_4(D)->a = _3;
>   _6 = x_2(D)->b;
>   this_4(D)->b = _6;
>   _8 = x_2(D)->c;
>   this_4(D)->c = _8;
>   return;
>
> Optimized GIMPLE code:
>   :
>   _10 = x_2(D)->D.1867;
>   _11 = BIT_FIELD_REF <_10, 7, 0>;
>   _12 = this_4(D)->D.1867;
>   _13 = _12 & 128;
>   _14 = (unsigned char) _11;
>   _15 = _13 | _14;
>   this_4(D)->D.1867 = _15;
>   return;
>
> Generated MIPS32r2 assembly code without optimization:
>  lw  $3,0($5)
> lbu $2,0($4)
> andi$3,$3,0x1
> andi$2,$2,0xfe
> or  $2,$2,$3
> sb  $2,0($4)
> lw  $3,0($5)
> andi$2,$2,0xf9
> andi$3,$3,0x6
> or  $2,$2,$3
> sb  $2,0($4)
> lw  $3,0($5)
> andi$2,$2,0x87
> andi$3,$3,0x78
> or  $2,$2,$3
> j   $31
> sb  $2,0($4)
>
> Optimized MIPS32r2 assembly code:
> lw  $3,0($5)
> lbu $2,0($4)
> andi$3,$3,0x7f
> andi$2,$2,0x80
> or  $2,$3,$2
> j   $31
> sb  $2,0($4)
>
>
> Algorithm works on basic block level and consists of following 3 major steps:
> 1. Go through basic block statements list. If there are statement pairs that 
> implement copy of bit field content from one memory location to another 
> record statements pointers and other necessary data in corresponding data 
> structure.
> 2. Identify records that represent adjacent bit field accesses and mark them 
> as merged.
> 3. Lower bit-field accesses by using new field size for those that can be 
> merged.
>
>
> New command line option "-fmerge-bitfields" is introduced.
>
>
> Tested - passed gcc regression tests for MIPS32r2.
>
>
> Changelog -
>
> gcc/ChangeLog:
> 2014-04-16 Zoran Jovanovic (zoran.jovano...@imgtec.com)
>   * common.opt (fmerge-bitfields): New option.
>   * doc/invoke.texi: Add reference to "-fmerge-bitfields".
>   * tree-sra.c (lower_bitfields): New function.
>   Entry for (-fmerge-bitfields).
>   (bf_access_candidate_p): New function.
>   (lower_bitfield_read): New function.
>   (lower_bitfield_write): New function.
>   (bitfield_stmt_bfcopy_pair::hash): New function.
>   (bitfield_stmt_bfcopy_pair::equal): New function.
>   (bitfield_stmt_bfcopy_pair::remove): New function.
>   (create_and_insert_bfcopy): New function.
>   (get_bit_offset): New function.
>   (add_stmt_bfcopy_pair): New function.
>   (cmp_bfcopies): New function.
>   (get_merged_bit_field_size): New function.
>   * dwarf2out.c (simple_type_size_in_bits): Move to tree.c.
>   (field_byte_offset): Move declaration to tree.h and make it extern.
>   * testsuite/gcc.dg/tree-ssa/bitfldmrg1.c: New test
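
As an illustration of the merged copy described in the quoted message, here is a
small C sketch that works on a plain byte rather than on real bit-fields. It
assumes the three fields of class A are packed from bit 0 upward (a = bit 0,
b = bits 1-2, c = bits 3-6), which matches the masks in the MIPS32r2 listing;
the function names are hypothetical and this is not the code the pass generates.

```c
#include <assert.h>

/* Masks for the three fields, assuming they are packed from bit 0 upward.
   Bit 7 does not belong to the class and must be preserved.  */
enum { MASK_A = 0x01, MASK_B = 0x06, MASK_C = 0x78, MASK_ALL = 0x7f };

/* Field-by-field copy: one read-modify-write per bit-field.  */
static unsigned char
copy_fieldwise (unsigned char dst, unsigned char src)
{
  dst = (unsigned char) ((dst & ~MASK_A) | (src & MASK_A));
  dst = (unsigned char) ((dst & ~MASK_B) | (src & MASK_B));
  dst = (unsigned char) ((dst & ~MASK_C) | (src & MASK_C));
  return dst;
}

/* Merged copy: a single read-modify-write covering all three fields.  */
static unsigned char
copy_merged (unsigned char dst, unsigned char src)
{
  return (unsigned char) ((dst & ~MASK_ALL) | (src & MASK_ALL));
}

/* Exhaustively check that both strategies agree for every byte pair.  */
static void
check_copies_agree (void)
{
  for (int d = 0; d < 256; d++)
    for (int s = 0; s < 256; s++)
      assert (copy_fieldwise ((unsigned char) d, (unsigned char) s)
              == copy_merged ((unsigned char) d, (unsigned char) s));
}
```

Calling check_copies_agree () from a test driver confirms that the single
masked store is equivalent to the three separate ones, which is the property
the merging step relies on when the fields are adjacent.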

[C++ Patch] PR 59120

2014-04-17 Thread Paolo Carlini


Hi,

we can fix this crash during error recovery very easily, by grouping 
together the two conditions leading to skip & early return, in complete 
analogy with the single-declaration case (note that we explicitly commit 
to tentative parse at the beginning of the function, thus we are good). 
Tested x86_64-linux.


Thanks,
Paolo.

///
/cp
2014-04-17  Paolo Carlini  

PR c++/59120
* parser.c (cp_parser_alias_declaration): Check return value of
cp_parser_require.

/testsuite
2014-04-17  Paolo Carlini  

PR c++/59120
* g++.dg/cpp0x/alias-decl-42.C: New.
Index: cp/parser.c
===
--- cp/parser.c (revision 209472)
+++ cp/parser.c (working copy)
@@ -16142,20 +16142,13 @@ cp_parser_alias_declaration (cp_parser* parser)
   if (parser->num_template_parameter_lists)
 parser->type_definition_forbidden_message = saved_message;
 
-  if (type == error_mark_node)
+  if (type == error_mark_node
+  || !cp_parser_require (parser, CPP_SEMICOLON, RT_SEMICOLON))
 {
   cp_parser_skip_to_end_of_block_or_statement (parser);
   return error_mark_node;
 }
 
-  cp_parser_require (parser, CPP_SEMICOLON, RT_SEMICOLON);
-
-  if (cp_parser_error_occurred (parser))
-{
-  cp_parser_skip_to_end_of_block_or_statement (parser);
-  return error_mark_node;
-}
-
   /* A typedef-name can also be introduced by an alias-declaration. The
  identifier following the using keyword becomes a typedef-name. It has
  the same semantics as if it were introduced by the typedef
Index: testsuite/g++.dg/cpp0x/alias-decl-42.C
===
--- testsuite/g++.dg/cpp0x/alias-decl-42.C  (revision 0)
+++ testsuite/g++.dg/cpp0x/alias-decl-42.C  (working copy)
@@ -0,0 +1,4 @@
+// PR c++/59120
+// { dg-do compile { target c++11 } }
+
+template using X = int T::T*;  // { dg-error "expected" }

[PATCH] Fix uninitialised variable warning in tree-ssa-loop-ivcanon.c

2014-04-17 Thread Kyrill Tkachov


Hi all,

While looking at the build logs I noticed a warning while building 
tree-ssa-loop-ivcanon.c about a potential use of an uninitialised variable.
This patchlet fixes that warning by initialising it to 0.

Tested arm-none-eabi.

Ok for trunk?

2014-04-17  Kyrylo Tkachov  

* tree-ssa-loop-ivcanon.c (canonicalize_loop_induction_variables):
Initialise n_unroll to 0.
diff --git a/gcc/tree-ssa-loop-ivcanon.c b/gcc/tree-ssa-loop-ivcanon.c
index cdf1559..7a83b12 100644
--- a/gcc/tree-ssa-loop-ivcanon.c
+++ b/gcc/tree-ssa-loop-ivcanon.c
@@ -656,7 +656,7 @@ try_unroll_loop_completely (struct loop *loop,
 			HOST_WIDE_INT maxiter,
 			location_t locus)
 {
-  unsigned HOST_WIDE_INT n_unroll, ninsns, max_unroll, unr_insns;
+  unsigned HOST_WIDE_INT n_unroll = 0, ninsns, max_unroll, unr_insns;
   gimple cond;
   struct loop_size size;
   bool n_unroll_found = false;

Re: [PATCH 3/3] [LLVM] [sanitizer] add conditionals for libc

2014-04-17 Thread Bernhard Reutner-Fischer

On 17 April 2014 16:51:23 Konstantin Serebryany 
 wrote:

On Thu, Apr 17, 2014 at 6:27 PM, Bernhard Reutner-Fischer
 wrote:
> On 17 April 2014 16:07, Konstantin Serebryany
>  wrote:
>> Hi,
>>
>> If you are trying to modify the libsanitizer files, please read here:
>> https://code.google.com/p/address-sanitizer/wiki/HowToContribute
>
> I read that, thanks. Patch 3/3 is for current compiler-rt git repo,
> please install it there, i do not have write access to the LLVM nor
> compiler-rt trees.

I can commit your patch to llvm tree only after you follow the process
described on that page.
Sorry, this is a hard rule.

What part of the process do you think I did not follow?

I made a patch for compiler-rt, sent it to llvm-comm...@cs.uiuc.edu, then 
provided the corresponding GCC parts, along with a backport of the new bits that 
I expect to be overwritten once you do a new merge, leaving just the GCC 
configury bits. This is how I read the wiki page you cite.

Please tell me what you expect me to do differently?

Thanks,

--kcc


Re: fuse-caller-save - hook format

2014-04-17 Thread Vladimir Makarov


On 2014-04-17, 11:29 AM, Richard Sandiford wrote:

Vladimir Makarov  writes:

On 2014-04-16, 3:19 PM, Tom de Vries wrote:

Vladimir,

All patches for the fuse-caller-save optimization have been ok-ed. The only part
not approved is the MIPS-specific part.

The objection of Richard S. is not so much the patch itself, but more the idea
of the hook fn_other_hard_reg_usage.

For clarity, I'm restating the current hook definition here:
...
+@deftypefn {Target Hook} bool TARGET_FN_OTHER_HARD_REG_USAGE (struct
hard_reg_set_container *@var{regs})
Add any hard registers to @var{regs} that are set or clobbered by a
call to the
function.  This hook only needs to add registers that cannot be found by
examination of the final RTL representation of a function.  This hook returns
true if it managed to determine which registers need to be added.  The default
version of this hook returns false.
...

Richard prefers to, rather than having a hook specifying what registers are
implicitly clobbered, adding those clobbers to CALL_INSN_FUNCTION_USAGE.

I can see these possibilities (and perhaps there are more):

1. We go with Richard's proposal: we make each target responsible for adding
these clobbers in CALL_INSN_FUNCTION_USAGE, and use a hook called f.i.
targetm.fuse_caller_save_p or targetm.implicit_call_clobbers_in_fusage_p, to
indicate whether a target has taken care of that, meaning it's safe to do the
fuse-caller-save optimization.

2. A mixed solution: we make each target responsible for specifying which
clobbers need to be added in CALL_INSN_FUNCTION_USAGE, using a hook
called f.i.
targetm.call_clobbered_regs, and add generic code to add those clobbers to
CALL_INSN_FUNCTION_USAGE.

3. We stick with the current, approved hook format, and try to convince Richard
to live with it.


Since you are a register allocator maintainer, familiar with the
fuse-caller-save optimization, and have approved the original hook, I would like
to ask you to make a decision on how to proceed from here.



I have no preference; it is a matter of taste.  Each solution has its
own advantages and disadvantages.  Putting this info into
CALL_INSN_FUNCTION_USAGE helps GCC development a lot, but it has a big
drawback in RTL memory footprint (especially for targets that have
a lot of regs, like AM29k or IA64).  On the other hand, an analogous approach
is already used in the DF infrastructure (which it would be nice to fix, imho).

Still, between GCC users and GCC developers, I'd prefer the solution that
favours users (even though the effect on the amount of resources used by GCC
is quite insignificant), as they outnumber the developers by a few orders of
magnitude.


Hmm, but you're talking like there are going to be a lot of these registers.


Yes, you are right.  That is what I thought.  I should have read Tom's 
email with more attention.



This isn't about which registers are call-clobbered or call-saved according
to the ABI (that's already available in other places).  All we want here
are the set of registers that are clobbered _in the caller_ before reaching
the callee or after the callee has returned.

So although IA-64 has lots of registers, the caller doesn't AFAIK use
lots of registers in the process of making the call.

On all targets we should be talking about one or two registers here.



I see.  I guess your proposed solution is ok then.

[PATCH] Fix warning in libgfortran configure script

2014-04-17 Thread Kyrill Tkachov


Hi all,

While configuring libgfortran I'm getting this message:
"libgfortran/configure: line 25938: test: =: unary operator expected"
The script doesn't fail and continues afterwards, but I don't think it's 
supposed to give that warning.
This patch makes it go away and makes it more consistent with other similar uses 
(a few lines below, $ac_cv_lib_rt_clock_gettime is quoted when used in a test). 
configure.ac is updated and configure is regenerated with autoconf 2.64.


Ok for trunk?

Made sure libgfortran builds for arm-none-eabi.

libgfortran/
2014-04-17  Kyrylo Tkachov  

* configure.ac: Quote usage of ac_cv_func_clock_gettime in if test.
* configure: Regenerate.
diff --git a/libgfortran/configure b/libgfortran/configure
index 23f57c7..d3ced74 100755
--- a/libgfortran/configure
+++ b/libgfortran/configure
@@ -25935,7 +25935,7 @@ fi
 # test is copied from libgomp, and modified to not link in -lrt as
 # libgfortran calls clock_gettime via a weak reference if it's found
 # in librt.
-if test $ac_cv_func_clock_gettime = no; then
+if test "$ac_cv_func_clock_gettime" = no; then
   { $as_echo "$as_me:${as_lineno-$LINENO}: checking for clock_gettime in -lrt" >&5
 $as_echo_n "checking for clock_gettime in -lrt... " >&6; }
 if test "${ac_cv_lib_rt_clock_gettime+set}" = set; then :
diff --git a/libgfortran/configure.ac b/libgfortran/configure.ac
index de2d65e..24dbf2b 100644
--- a/libgfortran/configure.ac
+++ b/libgfortran/configure.ac
@@ -510,7 +510,7 @@ AC_CHECK_LIB([m],[feenableexcept],[have_feenableexcept=yes AC_DEFINE([HAVE_FEENA
 # test is copied from libgomp, and modified to not link in -lrt as
 # libgfortran calls clock_gettime via a weak reference if it's found
 # in librt.
-if test $ac_cv_func_clock_gettime = no; then
+if test "$ac_cv_func_clock_gettime" = no; then
   AC_CHECK_LIB(rt, clock_gettime,
 [AC_DEFINE(HAVE_CLOCK_GETTIME_LIBRT, 1,
[Define to 1 if you have the `clock_gettime' function in librt.])])

Re: [PATCH 3/3] [LLVM] [sanitizer] add conditionals for libc

2014-04-17 Thread Konstantin Serebryany

On Thu, Apr 17, 2014 at 8:45 PM, Bernhard Reutner-Fischer
 wrote:
> On 17 April 2014 16:51:23 Konstantin Serebryany
>  wrote:
>
>> On Thu, Apr 17, 2014 at 6:27 PM, Bernhard Reutner-Fischer
>>  wrote:
>> > On 17 April 2014 16:07, Konstantin Serebryany
>> >  wrote:
>> >> Hi,
>> >>
>> >> If you are trying to modify the libsanitizer files, please read here:
>> >> https://code.google.com/p/address-sanitizer/wiki/HowToContribute
>> >
>> > I read that, thanks. Patch 3/3 is for current compiler-rt git repo,
>> > please install it there, i do not have write access to the LLVM nor
>> > compiler-rt trees.
>>
>> I can commit your patch to llvm tree only after you follow the process
>> described on that page.
>> Sorry, this is a hard rule.
>
>
> What part of the process do you think I did not follow?
>
> I made a patch for compiler-rt, sent it to llvm-comm...@cs.uiuc.edu then
> provided the corresponding GCC parts, along a backport of the new bits that
> I expect to be overwritten once you do a new merge, leaving just the GCC
> configuy bits. This is how I read the wiki page you cite.
>
> Please tell me what you expect me to do differently?

First, I did not notice that you've sent it to llvm-commits because it
was also sent to the gcc list (unusual thing to happen)
and got filtered into the gcc part of my mail. Sorry.
But second, the patch is far from trivial and you should not expect us
to commit it w/o a careful review,
so here comes another part of the wiki: "For non-trivial patches
please use Phabricator -- this will help us reply faster."

--kcc


>
> Thanks,
>>
>>
>> --kcc
>
>
>
> Sent with AquaMail for Android
> http://www.aqua-mail.com
>
>

RE: [PATCH] Fix uninitialised variable warning in tree-ssa-loop-ivcanon.c

2014-04-17 Thread Daniel Marjamäki

Hello!

I am not against it..

However I think there is no danger. I see no potential use of
uninitialized variable.

The use of n_unroll is guarded by n_unroll_found.

Best regards,
Daniel Marjamäki
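
The pattern under discussion, a variable that is only read when a separate
"found" flag is set, can be sketched in C as follows. This is a hypothetical
stand-alone function, not the code from tree-ssa-loop-ivcanon.c, and whether a
given compiler warns about the uninitialised form depends on its version and
optimisation level.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Return the index of the first zero in v[0..n), or n if there is none.  */
static size_t
first_zero (const int *v, size_t n)
{
  size_t idx = 0;       /* initialised only to keep the compiler quiet */
  bool found = false;

  for (size_t i = 0; i < n && !found; i++)
    if (v[i] == 0)
      {
        idx = i;
        found = true;
      }

  /* idx is read only when found is true, so the initial value of 0
     never reaches the caller.  */
  return found ? idx : n;
}

/* Small self-check covering the found and not-found paths.  */
static void
check_first_zero (void)
{
  int with_zero[] = { 3, 0, 5, 0 };
  int without_zero[] = { 1, 2 };

  assert (first_zero (with_zero, 4) == 1);
  assert (first_zero (without_zero, 2) == 2);
}
```

The initialiser does not change behaviour here, which is the point made above:
the guard already prevents any use of an unset value, and the "= 0" only helps
a compiler that cannot prove that.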

PATCH: PR target/60868: [4.9/4.10 Regression] ICE: in int_mode_for_mode, at stor-layout.c:400 with -minline-all-stringops -minline-stringops-dynamically -march=core2

2014-04-17 Thread H.J. Lu

Hi,

GET_MODE returns VOIDmode on CONST_INT.  It happens with -O0.  This
patch uses counter_mode on count_exp to get mode.  Tested on
Linux/x86-64 without regressions.  OK for trunk and 4.9 branch?

Thanks.


H.J.
---
gcc/

2014-04-17  H.J. Lu  

PR target/60868
* config/i386/i386.c (ix86_expand_set_or_movmem): Call counter_mode 
on count_exp to get mode.

gcc/testsuite/

2014-04-17  H.J. Lu  

PR target/60868
* gcc.target/i386/pr60868.c: New testcase.

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 536f50f..7a68623 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -24392,7 +24392,8 @@ ix86_expand_set_or_movmem (rtx dst, rtx src, rtx 
count_exp, rtx val_exp,
  if (jump_around_label == NULL_RTX)
jump_around_label = gen_label_rtx ();
  emit_cmp_and_jump_insns (count_exp, GEN_INT (dynamic_check - 1),
-  LEU, 0, GET_MODE (count_exp), 1, hot_label);
+  LEU, 0, counter_mode (count_exp),
+  1, hot_label);
  predict_jump (REG_BR_PROB_BASE * 90 / 100);
  if (issetmem)
set_storage_via_libcall (dst, count_exp, val_exp, false);
diff --git a/gcc/testsuite/gcc.target/i386/pr60868.c 
b/gcc/testsuite/gcc.target/i386/pr60868.c
new file mode 100644
index 000..c30bbfc
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr60868.c
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+/* { dg-options "-O0 -minline-all-stringops -minline-stringops-dynamically 
-march=core2" } */
+
+void bar (float *);
+
+void foo (void)
+{
+  float b[256] = {0};
+  bar(b);
+}

Re: Avoid unnecesary GGC runs during LTO

2014-04-17 Thread Richard Biener

On April 17, 2014 6:03:13 PM CEST, Jan Hubicka  wrote:
>> > > +
>> > > +  /* At this stage we know that majority of GGC memory is
>reachable.  
>> > > + Growing the limits prevents unnecesary invocation of GGC. 
>*/
>> > > +  ggc_grow ();
>> > >ggc_collect ();
>> > 
>> > Isn't the collect here pointless?  I see not in ENABLE_CHECKING,
>but
>> > shouldn't this be abstracted away, thus call ggc_collect from
>ggc_grow?
>> > Or maybe rather even for ENABLE_CHECKING adjust G.allocated_last_gc
>> > and simply drop the ggc_collect above ().
>> 
>> I am fine with both.  I basically decided to keep the explicit
>ggc_collect() to
>> make it clear (from lto.c source code) that we are GGC safe at this
>point and
>> to have way to double check that we do not produce too much of
>garbage with
>> checking disabled. (so with -Q I will see how much it is collected at
>that place).
>> 
>> We can embed it into ggc_grow and document that w/o checking it is
>equivalent
>> to ggc_cooect.
>> > 
>> > Anyway, this is sth for stage1 at this point.
>> 
>> OK,
>> Honza
>
>Ping...
>the patches saves 33 GGC runs during libxul.so link, that is not that
>bad ;)

What is the updated patch you propose?

Richard

>Honza
>> > 
>> > Thanks,
>> > Richard.
>> > 
>> > >/* Set the hooks so that all of the ipa passes can read in
>their data.  */
>> > > Index: ggc-none.c
>> > >
>===
>> > > --- ggc-none.c   (revision 209170)
>> > > +++ ggc-none.c   (working copy)
>> > > @@ -63,3 +63,8 @@ ggc_free (void *p)
>> > >  {
>> > >free (p);
>> > >  }
>> > > +
>> > > +void
>> > > +ggc_grow (void)
>> > > +{
>> > > +}
>> > > Index: ggc-page.c
>> > >
>===
>> > > --- ggc-page.c   (revision 209170)
>> > > +++ ggc-page.c   (working copy)
>> > > @@ -2095,6 +2095,19 @@ ggc_collect (void)
>> > >  fprintf (G.debug_file, "END COLLECTING\n");
>> > >  }
>> > >  
>> > > +/* Assume that all GGC memory is reachable and grow the limits
>for next collection. */
>> > > +
>> > > +void
>> > > +ggc_grow (void)
>> > > +{
>> > > +#ifndef ENABLE_CHECKING
>> > > +  G.allocated_last_gc = MAX (G.allocated_last_gc,
>> > > + G.allocated);
>> > > +#endif
>> > > +  if (!quiet_flag)
>> > > +fprintf (stderr, " {GC start %luk} ", (unsigned long)
>G.allocated / 1024);
>> > > +}
>> > > +
>> > >  /* Print allocation statistics.  */
>> > >  #define SCALE(x) ((unsigned long) ((x) < 1024*10 \
>> > >? (x) \
>> > > 
>> > > 
>> > 
>> > -- 
>> > Richard Biener 
>> > SUSE / SUSE Labs
>> > SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
>> > GF: Jeff Hawn, Jennifer Guild, Felix Imend"orffer

Re: Avoid unnecesary GGC runs during LTO

2014-04-17 Thread Jan Hubicka

> On April 17, 2014 6:03:13 PM CEST, Jan Hubicka  wrote:
> >> > > +
> >> > > +  /* At this stage we know that majority of GGC memory is
> >reachable.  
> >> > > + Growing the limits prevents unnecesary invocation of GGC. 
> >*/
> >> > > +  ggc_grow ();
> >> > >ggc_collect ();
> >> > 
> >> > Isn't the collect here pointless?  I see not in ENABLE_CHECKING,
> >but
> >> > shouldn't this be abstracted away, thus call ggc_collect from
> >ggc_grow?
> >> > Or maybe rather even for ENABLE_CHECKING adjust G.allocated_last_gc
> >> > and simply drop the ggc_collect above ().
> >> 
> >> I am fine with both.  I basically decided to keep the explicit
> >ggc_collect() to
> >> make it clear (from lto.c source code) that we are GGC safe at this
> >point and
> >> to have way to double check that we do not produce too much of
> >garbage with
> >> checking disabled. (so with -Q I will see how much it is collected at
> >that place).
> >> 
> >> We can embed it into ggc_grow and document that w/o checking it is
> >equivalent
> >> to ggc_cooect.
> >> > 
> >> > Anyway, this is sth for stage1 at this point.
> >> 
> >> OK,
> >> Honza
> >
> >Ping...
> >the patches saves 33 GGC runs during libxul.so link, that is not that
> >bad ;)
> 
> What is the updated patch you propose?

I was trying to explain why I kept the explicit ggc_collect just after ggc_grow:

I want to make it clear that we are GGC safe at that point. I also want to see
the GGC run happening without checking, so that -Q reports how much garbage we
see at this stage and I can keep an eye on it.

I can hide the ENABLE_CHECKING ggc_collect call in ggc_grow and update the
documentation if you prefer.

Honza
> 
> Richard
> 
> >Honza
> >> > 
> >> > Thanks,
> >> > Richard.
> >> > 
> >> > >/* Set the hooks so that all of the ipa passes can read in
> >their data.  */
> >> > > Index: ggc-none.c
> >> > >
> >===
> >> > > --- ggc-none.c (revision 209170)
> >> > > +++ ggc-none.c (working copy)
> >> > > @@ -63,3 +63,8 @@ ggc_free (void *p)
> >> > >  {
> >> > >free (p);
> >> > >  }
> >> > > +
> >> > > +void
> >> > > +ggc_grow (void)
> >> > > +{
> >> > > +}
> >> > > Index: ggc-page.c
> >> > >
> >===
> >> > > --- ggc-page.c (revision 209170)
> >> > > +++ ggc-page.c (working copy)
> >> > > @@ -2095,6 +2095,19 @@ ggc_collect (void)
> >> > >  fprintf (G.debug_file, "END COLLECTING\n");
> >> > >  }
> >> > >  
> >> > > +/* Assume that all GGC memory is reachable and grow the limits
> >for next collection. */
> >> > > +
> >> > > +void
> >> > > +ggc_grow (void)
> >> > > +{
> >> > > +#ifndef ENABLE_CHECKING
> >> > > +  G.allocated_last_gc = MAX (G.allocated_last_gc,
> >> > > +   G.allocated);
> >> > > +#endif
> >> > > +  if (!quiet_flag)
> >> > > +fprintf (stderr, " {GC start %luk} ", (unsigned long)
> >G.allocated / 1024);
> >> > > +}
> >> > > +
> >> > >  /* Print allocation statistics.  */
> >> > >  #define SCALE(x) ((unsigned long) ((x) < 1024*10 \
> >> > >  ? (x) \
> >> > > 
> >> > > 
> >> > 
> >> > -- 
> >> > Richard Biener 
> >> > SUSE / SUSE Labs
> >> > SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
> >> > GF: Jeff Hawn, Jennifer Guild, Felix Imend"orffer
>
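
The interaction between ggc_grow and ggc_collect discussed in this thread can be
pictured with a toy model in C. It is a simplified sketch, not the ggc-page.c
code: the struct and the toy_* functions are hypothetical, the threshold rule
(collect only once the heap has grown by a given percentage since the last
collection) is an assumption about the general shape of the heuristic, and
other limits the real collector may apply are left out.

```c
#include <assert.h>
#include <stddef.h>

/* Toy collector state; field names echo the patch but this is not GCC code.  */
struct toy_gc
{
  size_t allocated;          /* bytes currently allocated */
  size_t allocated_last_gc;  /* bytes live after the last collection */
  unsigned min_expand_pct;   /* growth required before collecting again */
  unsigned collections;      /* number of collections actually run */
};

/* Run a collection only if the heap grew by min_expand_pct since the last
   one.  GARBAGE is the number of bytes the collection would free.
   Returns 1 if a collection ran, 0 if it was skipped.  */
static int
toy_collect (struct toy_gc *gc, size_t garbage)
{
  size_t threshold = gc->allocated_last_gc
                     + gc->allocated_last_gc * gc->min_expand_pct / 100;

  if (gc->allocated < threshold)
    return 0;
  if (garbage > gc->allocated)
    garbage = gc->allocated;
  gc->allocated -= garbage;
  gc->allocated_last_gc = gc->allocated;
  gc->collections++;
  return 1;
}

/* Assume everything currently allocated is reachable and raise the
   baseline, so the next toy_collect call is skipped.  */
static void
toy_grow (struct toy_gc *gc)
{
  if (gc->allocated > gc->allocated_last_gc)
    gc->allocated_last_gc = gc->allocated;
}

/* Self-check: without toy_grow the collection runs, with it it is skipped.  */
static void
check_grow_skips_collection (void)
{
  struct toy_gc plain = { 1000, 100, 30, 0 };
  struct toy_gc grown = { 1000, 100, 30, 0 };

  assert (toy_collect (&plain, 0) == 1);
  toy_grow (&grown);
  assert (toy_collect (&grown, 0) == 0);
}
```

In this model a collection that frees nothing is pure overhead, which is the
situation described for the point in lto.c where most GGC memory is known to
be reachable.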

Re: [RFC] Add aarch64 support for ada

2014-04-17 Thread Richard Henderson

On 04/17/2014 08:35 AM, Tristan Gingold wrote:
> What about this compile-time check:
> 
> package Fatptralign is
>type String_Acc is access String;
>type Integer_acc is access Integer;
> 
>pragma Compile_Time_Error
> (String_Acc'Alignment = 1 * Integer_Acc'Alignment,
>  "Fat pointer are simply aligned");
> 
>pragma Compile_Time_Error
> (String_Acc'Alignment = 2 * Integer_Acc'Alignment,
>  "Fat pointer are doubly aligned");
> end Fatptralign;

Yes, that seems to work, even with a cross-compiler.


r~

Re: [PATCH] C++ thunk section names

2014-04-17 Thread Sriraman Tallam

Ping.

On Wed, Feb 5, 2014 at 4:31 PM, Sriraman Tallam  wrote:
> Hi,
>
>   I would like this patch reviewed and considered for commit when
> Stage 1 is active again.
>
> Patch Description:
>
> A C++ thunk's section name is set to be the same as the original function's
> section name for which the thunk was created in order to place the two
> together.  This is done in cp/method.c in function use_thunk.
> However, with function reordering turned on, the original function's section
> name can change to something like ".text.hot." or
> ".text.unlikely." in function default_function_section in varasm.c
> based on the node count of that function.  The thunk function's section name
> is not updated to be the same as the original here and also is not always
> correct to do it as the original function can be hotter than the thunk.
>
> I have created a patch to not name the thunk function's section to be the same
> as the original function when function reordering is enabled.
>
> Thanks
> Sri

Inliner heuristics TLC 1/n - let small function inlinng to ignore cold portion of program

2014-04-17 Thread Jan Hubicka

Hi,
I think for 4.10 we should revisit inliner behaviour to be more LTO and LTO+FDO
ready. This is the first of the small patches I made to sanitize the behaviour
of the current bounds.

The main problem LTO brings is that we get way too many inline candidates. In
the per-file model only a small percentage of calls is inlinable, since most of
them go to other units, so our current heuristics behave quite well, usually
inlining all calls that they consider beneficial.

With LTO almost all calls are inlinable, and if we inline everything we consider
profitable we get insane code size growth, so practically always we hit our
30% unit growth threshold.  This is not always a good idea.  Reducing
inline-insns-auto/inline-insns-single to keep the inliner from hitting the
growth limit would cause a regression on benchmarks that need inlining of large
functions.

LLVM seems to get around the problem by doing code-expanding inlining at compile
time (in the equivalent of our early inliner). This makes functions big, so LTO
doesn't inline much, but it also misses useful cross-module inlines and replaces
them with less useful intra-module ones.

Another approach would be to have inline-insns-crossmodule that is significantly
smaller than inline-insns-auto. We already have a crossmodule hint that probably
ought to be made smarter so that it does not fire on COMDAT functions.
I do not want to do that, since the numbers I collected in
http://hubicka.blogspot.ca/2014/04/devirtualization-in-c-part-5-feedback.html
suggest that inline-insns-auto is already quite a bad limit.

I would be happy to hear about alternative solutions to this.  We may want
to switch the whole-program inliner to a temperature-style bound, like open64
does.

Well, this patch actually goes in a slightly different direction: making the
unit growth threshold more sane.

While looking into inliner behaviour on Firefox to write my blog entry
I noticed that with profile feedback only a very small portion of the program
is trained (15%) and only around 7% of the code contains something that we
consider hot.

Inliner however still hits the inline-unit-growth limit with:
Unit growth for small function inlining: 7232256->9220597 (27%)
Inlined 183353 calls, eliminated 54652 function

We do not grow the code in the cold portions of the program, but because of
the "dead" padding we grow everything we consider hot 4 times, instead
of the 1.3 times we would usually get if it were unpadded.

This patch fixes the problem by accounting only non-cold functions when
computing the unit size.  We now get:

Unit growth for small function inlining: 2083217->2537163 (21%)
Inlined 134611 calls, eliminated 53586 functions

So while the relative growth is still close to 30%, the absolute
growth is only 22% of the previous one.  We inline fewer calls, but
in the dynamic stats there is only a very minor (sub 0.01%) difference.

Bootstrapped/regtested x86_64-linux, will commit it shortly.

Honza

* ipa-inline.c (inline_small_functions): Account only non-cold
functions.

* doc/invoke.texi (inline-unit-growth): Update documentation.
Index: ipa-inline.c
===
--- ipa-inline.c(revision 209461)
+++ ipa-inline.c(working copy)
@@ -1585,7 +1590,10 @@ inline_small_functions (void)
struct inline_summary *info = inline_summary (node);
struct ipa_dfs_info *dfs = (struct ipa_dfs_info *) node->aux;
 
-   if (!DECL_EXTERNAL (node->decl))
+   /* Do not account external functions, they will be optimized out
+  if not inlined.  Also only count the non-cold portion of 
program.  */
+   if (!DECL_EXTERNAL (node->decl)
+   && node->frequency != NODE_FREQUENCY_UNLIKELY_EXECUTED)
  initial_size += info->size;
info->growth = estimate_growth (node);
if (dfs && dfs->next_cycle)
Index: doc/invoke.texi
===
--- doc/invoke.texi (revision 209461)
+++ doc/invoke.texi (working copy)
@@ -9409,7 +9409,8 @@ before applying @option{--param inline-u
 @item inline-unit-growth
 Specifies maximal overall growth of the compilation unit caused by inlining.
 The default value is 30 which limits unit growth to 1.3 times the original
-size.
+size. Cold functions (either marked cold via an attribibute or by profile
+feedback) are not accounted into the unit size.
 
 @item ipcp-unit-growth
 Specifies maximal overall growth of the compilation unit caused by
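
To illustrate the accounting change in the ipa-inline.c hunk above, here is a
toy model in C. The struct and function names are hypothetical and this is not
the GCC implementation; it only shows the idea of summing the sizes of
non-external, non-cold functions and deriving a growth limit from that sum,
and it re-derives the percentages quoted in the message from the reported
before/after sizes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a call graph node.  */
struct toy_node
{
  unsigned long long size;  /* estimated size in instructions */
  bool external;            /* body comes from another unit */
  bool cold;                /* unlikely executed */
};

/* Sum the sizes that count towards the unit size: external and cold
   functions are skipped, mirroring the condition in the patch.  */
static unsigned long long
initial_unit_size (const struct toy_node *nodes, size_t n)
{
  unsigned long long total = 0;

  for (size_t i = 0; i < n; i++)
    if (!nodes[i].external && !nodes[i].cold)
      total += nodes[i].size;
  return total;
}

/* Maximal unit size allowed for a growth limit given in percent.  */
static unsigned long long
unit_growth_limit (unsigned long long initial, unsigned growth_pct)
{
  return initial * (100 + growth_pct) / 100;
}

/* Growth from BEFORE to AFTER in whole percent, rounded down.  */
static unsigned long long
growth_percent (unsigned long long before, unsigned long long after)
{
  return (after - before) * 100 / before;
}

/* Self-check using a three-function unit and the figures from the message.  */
static void
check_unit_growth (void)
{
  struct toy_node nodes[] = {
    { 100, false, false },  /* hot, counted */
    { 400, false, true },   /* cold, skipped */
    { 50, true, false },    /* external, skipped */
  };

  assert (initial_unit_size (nodes, 3) == 100);
  assert (unit_growth_limit (100, 30) == 130);
  assert (growth_percent (7232256ULL, 9220597ULL) == 27);
  assert (growth_percent (2083217ULL, 2537163ULL) == 21);
}
```

With the cold function excluded, the 30% budget in the toy unit is 30
instructions rather than 150, which is the effect the patch aims for on a
program that is mostly cold.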

Re: Avoid unnecesary GGC runs during LTO

2014-04-17 Thread Richard Biener

On April 17, 2014 7:18:05 PM CEST, Jan Hubicka  wrote:
>> On April 17, 2014 6:03:13 PM CEST, Jan Hubicka 
>wrote:
>> >> > > +
>> >> > > +  /* At this stage we know that majority of GGC memory is
>> >reachable.  
>> >> > > + Growing the limits prevents unnecesary invocation of
>GGC. 
>> >*/
>> >> > > +  ggc_grow ();
>> >> > >ggc_collect ();
>> >> > 
>> >> > Isn't the collect here pointless?  I see not in ENABLE_CHECKING,
>> >but
>> >> > shouldn't this be abstracted away, thus call ggc_collect from
>> >ggc_grow?
>> >> > Or maybe rather even for ENABLE_CHECKING adjust
>G.allocated_last_gc
>> >> > and simply drop the ggc_collect above ().
>> >> 
>> >> I am fine with both.  I basically decided to keep the explicit
>> >> ggc_collect() to make it clear (from lto.c source code) that we are
>> >> GGC safe at this point and to have a way to double check that we do
>> >> not produce too much garbage with checking disabled (so with -Q I
>> >> will see how much is collected at that place).
>> >> 
>> >> We can embed it into ggc_grow and document that w/o checking it is
>> >> equivalent to ggc_collect.
>> >> > 
>> >> > Anyway, this is sth for stage1 at this point.
>> >> 
>> >> OK,
>> >> Honza
>> >
>> >Ping...
>> >the patch saves 33 GGC runs during libxul.so link, that is not that
>> >bad ;)
>> 
>> What is the updated patch you propose?
>
>I was trying to explain why I kept the explicit ggc_collect just after
>ggc_grow:
>
>I want to make it clear that we are GGC safe at that point.  I also want
>to see the GGC run happening w/o checking, so that -Q reports how much
>garbage we see at this stage and I can keep an eye on it.
>
>I can hide the ENABLE_CHECKING ggc_collect call in ggc_grow and update
>the documentation if you prefer.

I'd prefer that.  OK with that change.

Thanks,
Richard.
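
For readers following the thread, the variant agreed on here (the checking-only collection moved inside ggc_grow) could look roughly like the toy model below. This is a sketch under stated assumptions, not the committed code: `struct gc_state`, `toy_collect` and `toy_grow` are hypothetical names, the compile-time ENABLE_CHECKING macro is modelled as a runtime flag, and the toy collector frees nothing because all memory is assumed reachable.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for the collector state; only the two counters named
   in the patch are modelled, plus a collection counter for testing.  */
struct gc_state
{
  size_t allocated;          /* bytes currently allocated */
  size_t allocated_last_gc;  /* bytes allocated after the last collection */
  int collections;           /* number of collections run so far */
};

/* Stand-in for ggc_collect: everything is assumed reachable, so a
   collection only resets the baseline.  */
static void
toy_collect (struct gc_state *g)
{
  g->collections++;
  g->allocated_last_gc = g->allocated;
}

/* Stand-in for ggc_grow.  With checking, really collect so the point
   stays verified as GC safe; without checking, just raise the baseline
   so the next collection is triggered later.  */
void
toy_grow (struct gc_state *g, int checking)
{
  assert (g != NULL);
  if (checking)
    toy_collect (g);
  else if (g->allocated_last_gc < g->allocated)
    g->allocated_last_gc = g->allocated;
}
```

In both modes the baseline ends up at the current allocation; the difference is only whether a collection actually runs, which is what makes the separate ggc_collect call in lto.c redundant.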

>Honza
>> 
>> Richard
>> 
>> >Honza
>> >> > 
>> >> > Thanks,
>> >> > Richard.
>> >> > 
>> >> > >/* Set the hooks so that all of the ipa passes can read in their data.  */
>> >> > > Index: ggc-none.c
>> >> > > ===
>> >> > > --- ggc-none.c(revision 209170)
>> >> > > +++ ggc-none.c(working copy)
>> >> > > @@ -63,3 +63,8 @@ ggc_free (void *p)
>> >> > >  {
>> >> > >free (p);
>> >> > >  }
>> >> > > +
>> >> > > +void
>> >> > > +ggc_grow (void)
>> >> > > +{
>> >> > > +}
>> >> > > Index: ggc-page.c
>> >> > > ===
>> >> > > --- ggc-page.c(revision 209170)
>> >> > > +++ ggc-page.c(working copy)
>> >> > > @@ -2095,6 +2095,19 @@ ggc_collect (void)
>> >> > >  fprintf (G.debug_file, "END COLLECTING\n");
>> >> > >  }
>> >> > >  
>> >> > > +/* Assume that all GGC memory is reachable and grow the limits for next collection. */
>> >> > > +
>> >> > > +void
>> >> > > +ggc_grow (void)
>> >> > > +{
>> >> > > +#ifndef ENABLE_CHECKING
>> >> > > +  G.allocated_last_gc = MAX (G.allocated_last_gc,
>> >> > > +  G.allocated);
>> >> > > +#endif
>> >> > > +  if (!quiet_flag)
>> >> > > +fprintf (stderr, " {GC start %luk} ", (unsigned long) G.allocated / 1024);
>> >> > > +}
>> >> > > +
>> >> > >  /* Print allocation statistics.  */
>> >> > >  #define SCALE(x) ((unsigned long) ((x) < 1024*10 \
>> >> > > ? (x) \
>> >> > > 
>> >> > > 
>> >> > 
>> >> > -- 
>> >> > Richard Biener 
>> >> > SUSE / SUSE Labs
>> >> > SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
>> >> > GF: Jeff Hawn, Jennifer Guild, Felix Imend"orffer
>>

Re: [gomp4] Add tables generation

2014-04-17 Thread Ilya Verbin

On 27 Mar 17:16, Jakub Jelinek wrote:
> On Thu, Mar 27, 2014 at 08:13:00PM +0400, Ilya Verbin wrote:
> > On 27 Mar 15:02, Jakub Jelinek wrote:
> > > The tables need to be created before IPA, that way it really shouldn't
> > > matter in what order you emit them.  E.g. the outlined target functions
> > > could be added to the table during ompexp pass which actually creates the
> > > outlined functions, the vars need to be added before target lto or host
> > > lto is streamed.
> > 
> > For host tables it's ok, but when will the target compiler create tables
> > with functions?  It reads bytecode from target_lto sections, so it never
> > executes the ompexp pass.
> 
> Which is why the table created for host by the ompexp pass should be
> streamed into the target_lto sections (marked specially somehow, special
> attribute or whatever), and then corresponding target table created from
> that, rather than created from some possibly different ordering there.
> 
>   Jakub

Hi Jakub,

Could you please take a look at this patch?  It fixes the ordering issue in the
tables stated above, and passes all the tests that I have.  But I'm not sure
about its correctness from the architectural point of view.
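
To make the idea easier to review, here is a rough, self-contained model of the table format the patch streams: a sequence of (tag, decl index) pairs in a fixed order, ended by a zero tag, which the reader replays in the same order. Everything in it is an assumption for illustration: the tag values, the flat `unsigned` buffer, and the names `write_table` and `read_table` are hypothetical and do not use the LTO streamer API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical tags; zero is reserved as the terminator.  */
enum entry_tag { TAG_END = 0, TAG_FUNC = 1, TAG_VAR = 2 };

/* Write all function entries, then all variable entries, then the
   terminator.  Returns the number of words written, or 0 if BUF is
   too small.  */
size_t
write_table (unsigned *buf, size_t cap,
             const unsigned *funcs, size_t nfuncs,
             const unsigned *vars, size_t nvars)
{
  size_t need = 2 * (nfuncs + nvars) + 1;
  size_t pos = 0;

  assert (buf != NULL);
  if (need > cap)
    return 0;
  for (size_t i = 0; i < nfuncs; i++)
    {
      buf[pos++] = TAG_FUNC;
      buf[pos++] = funcs[i];
    }
  for (size_t i = 0; i < nvars; i++)
    {
      buf[pos++] = TAG_VAR;
      buf[pos++] = vars[i];
    }
  buf[pos++] = TAG_END;
  return pos;
}

/* Read entries until the terminator, appending each index to the
   matching output array so the original order is preserved.  The
   output arrays must be large enough for the entries in BUF.  */
void
read_table (const unsigned *buf,
            unsigned *funcs, size_t *nfuncs,
            unsigned *vars, size_t *nvars)
{
  size_t pos = 0;

  *nfuncs = 0;
  *nvars = 0;
  while (buf[pos] != TAG_END)
    {
      unsigned tag = buf[pos++];
      unsigned index = buf[pos++];
      if (tag == TAG_FUNC)
        funcs[(*nfuncs)++] = index;
      else if (tag == TAG_VAR)
        vars[(*nvars)++] = index;
    }
}
```

Because the reader appends entries in stream order, host and target end up with tables in the same order as long as both are built from the same streamed section.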


---
 gcc/lto-cgraph.c   | 93 ++
 gcc/lto-section-in.c   |  3 +-
 gcc/lto-streamer-out.c |  2 ++
 gcc/lto-streamer.h |  3 ++
 gcc/lto/lto.c  |  2 ++
 gcc/omp-low.c  | 68 +++-
 6 files changed, 115 insertions(+), 56 deletions(-)

diff --git a/gcc/lto-cgraph.c b/gcc/lto-cgraph.c
index 544f04b..3d6637e 100644
--- a/gcc/lto-cgraph.c
+++ b/gcc/lto-cgraph.c
@@ -82,6 +82,8 @@ enum LTO_symtab_tags
   LTO_symtab_last_tag
 };
 
+extern vec *offload_funcs, *offload_vars;
+
 /* Create a new symtab encoder.
if FOR_INPUT, the encoder allocate only datastructures needed
to read the symtab.  */
@@ -958,6 +960,51 @@ output_symtab (void)
   output_refs (encoder);
 }
 
+void
+output_offload_tables (void)
+{
+  /* Collect all omp-target global variables to offload_vars, if they have not
+ been gathered earlier by input_offload_tables.  */
+  if (vec_safe_is_empty (offload_vars))
+{
+  struct varpool_node *vnode;
+  FOR_EACH_DEFINED_VARIABLE (vnode)
+   {
+ if (!lookup_attribute ("omp declare target",
+DECL_ATTRIBUTES (vnode->decl))
+ || TREE_CODE (vnode->decl) != VAR_DECL
+ || DECL_SIZE (vnode->decl) == 0)
+   continue;
+ vec_safe_push (offload_vars, vnode->decl);
+   }
+}
+
+  if (vec_safe_is_empty (offload_funcs) && vec_safe_is_empty (offload_vars))
+return;
+
+  struct lto_simple_output_block *ob
+= lto_create_simple_output_block (LTO_section_offload_table);
+
+  for (unsigned i = 0; i < vec_safe_length (offload_funcs); i++)
+{
+  streamer_write_enum (ob->main_stream, LTO_symtab_tags,
+  LTO_symtab_last_tag, LTO_symtab_unavail_node);
+  lto_output_fn_decl_index (ob->decl_state, ob->main_stream,
+   (*offload_funcs)[i]);
+}
+
+  for (unsigned i = 0; i < vec_safe_length (offload_vars); i++)
+{
+  streamer_write_enum (ob->main_stream, LTO_symtab_tags,
+  LTO_symtab_last_tag, LTO_symtab_variable);
+  lto_output_var_decl_index (ob->decl_state, ob->main_stream,
+(*offload_vars)[i]);
+}
+
+  streamer_write_uhwi_stream (ob->main_stream, 0);
+  lto_destroy_simple_output_block (ob);
+}
+
 /* Overwrite the information in NODE based on FILE_DATA, TAG, FLAGS,
STACK_SIZE, SELF_TIME and SELF_SIZE.  This is called either to initialize
NODE or to replace the values in it, for instance because the first
@@ -1611,6 +1658,52 @@ input_symtab (void)
 }
 }
 
+void
+input_offload_tables (void)
+{
+  struct lto_file_decl_data **file_data_vec = lto_get_file_decl_data ();
+  struct lto_file_decl_data *file_data;
+  unsigned int j = 0;
+
+  while ((file_data = file_data_vec[j++]))
+{
+  const char *data;
+  size_t len;
+  struct lto_input_block *ib
+   = lto_create_simple_input_block (file_data, LTO_section_offload_table,
+&data, &len);
+  if (!ib)
+   continue;
+
+  enum LTO_symtab_tags tag
+   = streamer_read_enum (ib, LTO_symtab_tags, LTO_symtab_last_tag);
+  while (tag)
+   {
+ if (tag == LTO_symtab_unavail_node)
+   {
+ int decl_index = streamer_read_uhwi (ib);
+ tree fn_decl
+   = lto_file_decl_data_get_fn_decl (file_data, decl_index);
+ vec_safe_push (offload_funcs, fn_decl);
+   }
+ else if (tag == LTO_symtab_variable)
+   {
+ int decl_index = streamer_read_uhwi (ib);
+ tree var_decl
+   = lto_file_decl_data_get_var_decl (file_data, decl_index);
+ vec_safe_push

Re: Inliner heuristics TLC 1/n - let small function inlining to ignore cold portion of program

2014-04-17 Thread David Malcolm

On Thu, 2014-04-17 at 19:52 +0200, Jan Hubicka wrote:

[...]

> Index: doc/invoke.texi
> ===
> --- doc/invoke.texi   (revision 209461)
> +++ doc/invoke.texi   (working copy)
> @@ -9409,7 +9409,8 @@ before applying @option{--param inline-u
>  @item inline-unit-growth
>  Specifies maximal overall growth of the compilation unit caused by inlining.
>  The default value is 30 which limits unit growth to 1.3 times the original
> -size.
> +size. Cold functions (either marked cold via an attribibute or by profile
FWIW, there's a trivial typo here-^^

Go patch committed: Mark various expressions as immutable

2014-04-17 Thread Ian Lance Taylor

This patch from Chris Manghane marks various expression types as
immutable: numerics, constants, type info, address of, type conversion
when appropriate.  Bootstrapped and ran Go testsuite on
x86_64-unknown-linux-gnu.  Committed to mainline.

Ian

diff -r 194e0f47c9e5 go/expressions.cc
--- a/go/expressions.cc	Wed Apr 16 13:33:13 2014 -0700
+++ b/go/expressions.cc	Thu Apr 17 11:57:28 2014 -0700
@@ -555,6 +555,10 @@
   { return true; }
 
   bool
+  do_is_immutable() const
+  { return true; }
+
+  bool
   do_numeric_constant_value(Numeric_constant* nc) const
   {
 nc->set_unsigned_long(NULL, 0);
@@ -1422,6 +1426,10 @@
   do_is_constant() const
   { return true; }
 
+  bool
+  do_is_immutable() const
+  { return true; }
+
   Type*
   do_type();
 
@@ -1790,6 +1798,10 @@
   { return true; }
 
   bool
+  do_is_immutable() const
+  { return true; }
+
+  bool
   do_numeric_constant_value(Numeric_constant* nc) const;
 
   Type*
@@ -2109,6 +2121,10 @@
   { return true; }
 
   bool
+  do_is_immutable() const
+  { return true; }
+
+  bool
   do_numeric_constant_value(Numeric_constant* nc) const
   {
 nc->set_float(this->type_, this->val_);
@@ -2292,6 +2308,10 @@
   { return true; }
 
   bool
+  do_is_immutable() const
+  { return true; }
+
+  bool
   do_numeric_constant_value(Numeric_constant* nc) const
   {
 nc->set_complex(this->type_, this->real_, this->imag_);
@@ -2506,6 +2526,10 @@
   { return true; }
 
   bool
+  do_is_immutable() const
+  { return true; }
+
+  bool
   do_numeric_constant_value(Numeric_constant* nc) const;
 
   bool
@@ -2994,6 +3018,9 @@
   do_is_constant() const;
 
   bool
+  do_is_immutable() const;
+
+  bool
   do_numeric_constant_value(Numeric_constant*) const;
 
   bool
@@ -3175,6 +3202,27 @@
   return true;
 }
 
+// Return whether a type conversion is immutable.
+
+bool
+Type_conversion_expression::do_is_immutable() const
+{
+  Type* type = this->type_;
+  Type* expr_type = this->expr_->type();
+
+  if (type->interface_type() != NULL
+  || expr_type->interface_type() != NULL)
+return false;
+
+  if (!this->expr_->is_immutable())
+return false;
+
+  if (Type::are_identical(type, expr_type, false, NULL))
+return true;
+
+  return type->is_basic_type() && expr_type->is_basic_type();
+}
+
 // Return the constant numeric value if there is one.
 
 bool
@@ -3599,7 +3647,8 @@
 
   bool
   do_is_immutable() const
-  { return this->expr_->is_immutable(); }
+  { return this->expr_->is_immutable()
+  || (this->op_ == OPERATOR_AND && this->expr_->is_variable()); }
 
   bool
   do_numeric_constant_value(Numeric_constant*) const;
@@ -14076,6 +14125,10 @@
   { }
 
  protected:
+  bool
+  do_is_immutable() const
+  { return true; }
+
   Type*
   do_type();

Re: Inliner heuristics TLC 1/n - let small function inlining to ignore cold portion of program

2014-04-17 Thread Xinliang David Li

This looks fine.  LIPO has similar change too.  Other directions worth
looking into:

1) To model the icache effect better, a weighted callee size needs to be
used with profile. The weight for a BB may look like: min(1,
FREQ(BB)/FREQ(ENTRY)).
2) When function splitting is turned on, are any inline heuristic
changes needed? E.g. only consider the hot code part of a node for
the unit growth computation?

We are also looking into a more aggressive approach that tracks a per-loop
(inter-procedural) region growth limit, instead of using one single
global limit.

David
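
A small sketch of what the weighting in 1) could compute, assuming per-block sizes and profile frequencies are available. The struct `bb_info` and the function names are hypothetical, chosen only for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-basic-block summary.  */
struct bb_info
{
  double size;  /* estimated size of the block */
  double freq;  /* profile frequency of the block */
};

/* Weight of a block: min (1, FREQ(BB) / FREQ(ENTRY)).  Blocks executed
   at least as often as the entry count fully; rarer blocks count
   proportionally less.  */
double
bb_weight (double bb_freq, double entry_freq)
{
  assert (entry_freq > 0.0 && bb_freq >= 0.0);
  double w = bb_freq / entry_freq;
  return w < 1.0 ? w : 1.0;
}

/* Callee size with each block scaled by its weight.  */
double
weighted_callee_size (const struct bb_info *bbs, size_t n, double entry_freq)
{
  double total = 0.0;
  for (size_t i = 0; i < n; i++)
    total += bbs[i].size * bb_weight (bbs[i].freq, entry_freq);
  return total;
}
```

For blocks of size 10, 20 and 8 with frequencies 100, 50 and 400 and an entry frequency of 100, the weights are 1, 0.5 and 1, so the weighted size is 28 instead of the plain sum of 38.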

On Thu, Apr 17, 2014 at 10:52 AM, Jan Hubicka  wrote:
> Hi,
> I think for 4.10 we should revisit inliner behaviour to be more LTO and
> LTO+FDO ready.  This is the first of the small patches I made to sanitize
> the behaviour of the current bounds.
>
> The main problem LTO brings is that we get way too many inline candidates.
> In the per-file model only a small percentage of calls is inlinable, since
> most of them go to other units, so our current heuristics behave quite
> well, usually inlining all calls they consider beneficial.
>
> With LTO almost all calls are inlinable, and if we inline everything we
> consider profitable we get insane code size growth, so practically always
> we hit our 30% unit growth threshold.  This is not always a good idea.
> Reducing inline-insns-auto/inline-insns-single to keep the inliner from
> hitting the growth limit would cause a regression on benchmarks that need
> inlining of large functions.
>
> LLVM seems to get around the problem by doing code-expanding inlining at
> compile time (in the equivalent of our early inliner).  This makes functions
> big, so LTO doesn't inline much, but it also misses useful cross-module
> inlines and replaces them by less useful inter-module ones.
>
> Another approach would be to have inline-insns-crossmodule that is
> significantly smaller than inline-insns-auto.  We already have a crossmodule
> hint that probably ought to be made smarter to not fire on COMDAT functions.
> I do not want to do it, since the numbers I collected in
> http://hubicka.blogspot.ca/2014/04/devirtualization-in-c-part-5-feedback.html
> suggest that inline-insns-auto is already quite a bad limit.
>
> I would be happy to hear about alternative solutions to this.  We may want
> to switch the whole-program inliner to a temperature-style bound, like
> open64 does.
>
> Well, this patch actually goes in a bit different direction: making the unit
> growth threshold more sane.
>
> While looking into inliner behaviour on Firefox to write my blog entry
> I noticed that with profile feedback only a very small portion of the program
> is trained (15%) and only around 7% of the code contains something that we
> consider hot.
>
> Inliner however still hits the inline-unit-growth limit with:
> Unit growth for small function inlining: 7232256->9220597 (27%)
> Inlined 183353 calls, eliminated 54652 functions
>
> We do not grow the code in the cold portions of program, but because of
> the "dead" padding we grow everything we consider hot 4 times, instead
> of 1.3 times as we would usually do if it was unpadded.
>
> This patch fixes the problem by considering only non-cold functions for
> frequency calculation.  We now get:
>
> Unit growth for small function inlining: 2083217->2537163 (21%)
> Inlined 134611 calls, eliminated 53586 functions
>
> So while the relative growth is still close to 30%, the absolute
> growth is only 22% of the previous one.  We inline fewer calls, but
> in the dynamic stats there is only a very minor (sub 0.01%) difference.
>
> Bootstrapped/regtested x86_64-linux, will commit it shortly.
>
> Honza
>
> * ipa-inline.c (inline_small_functions): Account only non-cold
> functions.
>
> * doc/invoke.texi (inline-unit-growth): Update documentation.
> Index: ipa-inline.c
> ===
> --- ipa-inline.c(revision 209461)
> +++ ipa-inline.c(working copy)
> @@ -1585,7 +1590,10 @@ inline_small_functions (void)
> struct inline_summary *info = inline_summary (node);
> struct ipa_dfs_info *dfs = (struct ipa_dfs_info *) node->aux;
>
> -   if (!DECL_EXTERNAL (node->decl))
> +   /* Do not account external functions, they will be optimized out
> +  if not inlined.  Also only count the non-cold portion of program.  */
> +   if (!DECL_EXTERNAL (node->decl)
> +   && node->frequency != NODE_FREQUENCY_UNLIKELY_EXECUTED)
>   initial_size += info->size;
> info->growth = estimate_growth (node);
> if (dfs && dfs->next_cycle)
> Index: doc/invoke.texi
> ===
> --- doc/invoke.texi (revision 209461)
> +++ doc/invoke.texi (working copy)
> @@ -9409,7 +9409,8 @@ before applying @option{--param inline-u
>  @item inline-unit-growth
>  Specifies maximal overall growth of the compilation unit caused by inlini

RE: [PATCH v7?] PR middle-end/60281

2014-04-17 Thread Bernd Edlinger

Hi Lin,

On Thu, 17 Apr 2014 22:29:14, Lin Zuojian wrote:
>
> Hi Bernd,
> I have my copyright mark signed and the process has completed. Now I
> am going to answer two more questions before my patch can be
> committed, right?
>
> Did you copy any files or text written by someone else in these changes?
>
> no
>
> [Which files have you changed so far, and which new files have you written
> so far?]
> gcc/asan.c
> gcc/ChangeLog
> gcc/cfgexpand.c
>
> Okay, you may review my patch again, if there is no problem, please
> commit it for me.
> --
> Regards
> lin zuojian

I am not sure if your patch was already "approved" by a global GCC reviewer.
That is however absolutely necessary before it can be committed.

I think it would be best to re-submit the latest version of your patch now,
and ask a global reviewer for approval.

The message should be sent to gcc-patches@gcc.gnu.org and contain the
following information in addition to the proposed patch itself and the
change-log entry:

a) On which target(s) did you bootstrap your patch?

b) Did you run the testsuite?

c) When you compare the test results with and without the patch, were there
any regressions?

Regards
Bernd.

Go patch committed: Only convert function type when necessary

2014-04-17 Thread Ian Lance Taylor

This patch to the Go frontend fixes it to not convert the function type
in a call when calling an interface method.  The function type of an
interface method is not correct, since it does not include the receiver,
but the type of the method field is correct, and as such should not be
converted.  This is PR 60870.  Bootstrapped and ran Go testsuite on
x86_64-unknown-linux-gnu.  Tested by Ulrich Weigand on PPC.  Committed
to mainline.

Ian

diff -r 43e2635914c2 go/expressions.cc
--- a/go/expressions.cc	Thu Apr 17 12:09:37 2014 -0700
+++ b/go/expressions.cc	Thu Apr 17 12:24:08 2014 -0700
@@ -9619,9 +9619,20 @@
   fn = Expression::make_compound(set_closure, fn, location);
 }
 
-  Btype* bft = fntype->get_backend_fntype(gogo);
   Bexpression* bfn = tree_to_expr(fn->get_tree(context));
-  bfn = gogo->backend()->convert_expression(bft, bfn, location);
+
+  // When not calling a named function directly, use a type conversion
+  // in case the type of the function is a recursive type which refers
+  // to itself.  We don't do this for an interface method because 1)
+  // an interface method never refers to itself, so we always have a
+  // function type here; 2) we pass an extra first argument to an
+  // interface method, so fntype is not correct.
+  if (func == NULL && !is_interface_method)
+{
+  Btype* bft = fntype->get_backend_fntype(gogo);
+  bfn = gogo->backend()->convert_expression(bft, bfn, location);
+}
+
   Bexpression* call = gogo->backend()->call_expression(bfn, fn_args, location);
 
   if (this->results_ != NULL)

Re: [RFC][PATCH] RL78 - clean-up of missing operand mode warnings.

2014-04-17 Thread Richard Hulme


On 15/04/14 22:58, DJ Delorie wrote:

> I typically leave the mode off when the operand accepts a CONST_INT as
> I've had problems with patterns matching CONST_INTs otherwise, as
> CONST_INT rtx's do not have a mode (or have VOIDmode).
>
> (yes, I know gcc is supposed to accommodate that, but like I said, I've
> had problems...)


Ok, that's fine.  I was just trying to mop up one little bit of the sea 
of warnings.


It seems a little inconsistent, however, that *movqi_real and 
*xorqi3_real don't specify modes but *movhi_real and 
*andqi_real/*iorqi_real do (and they also accept CONST_INTs).  Not that 
I'm advocating generating more warnings, but my inner OCD likes 
consistency :)


Richard.

Re: [patch] change specific int128 -> generic intN

2014-04-17 Thread Marc Glisse


On Tue, 15 Apr 2014, DJ Delorie wrote:


> I wasn't sure what to do with that array, since it was static and
> couldn't have "empty" slots in them like the arrays in tree.h.  Also,
> do we need to have *every* type in that list?  What's the rule for
> whether a type gets installed there or not?  The comment says
> "guaranteed to be in the runtime support" but does that mean "for this
> particular build" (wrt multilibs) as not all intN types are guaranteed
> (even the int128 types were not guaranteed to be supported before my
> patch).  In other parts of the patch, just taking out the special case
> for __int128 was sufficient to "do the right thing" for all __intN
> types.


You need someone who understands this better than me (ask Jason). To be
able to throw/catch a type, you need some typeinfo symbols. The front-end
generates that for classes when they are defined. For fundamental types,
it assumes libsupc++ will provide it, and the function you are modifying
is the one generating libsupc++ (I am surprised your patch didn't cause
any failure on x86_64, at least in abi_check). We need to generate the
typeinfo for __intN, either in libsupc++, or in each TU, and since both
cases will require code, I assume libsupc++ is preferable.



> I can certainly put the intN types in there, but note that it would
> mean regenerating the fundamentals[] array at runtime to include those
> types which are supported at the time.


After the patch I linked, it should just mean calling the helper function 
on your new types, no need to touch the array.



> Do the entries in the array need to be in a particular order?


No, any random order would do.

--
Marc Glisse

Re: [RFC] Add aarch64 support for ada

2014-04-17 Thread Richard Henderson

On 04/17/2014 08:56 AM, Eric Botcazou wrote:
> I presume that the attached kludge is sufficient to make it work?
> 
> 
>   * fe.h (Compiler_Abort): Replace Fat_Pointer by String.
>   (Error_Msg_N): Likewise.
>   (Error_Msg_NE): Likewise.
>   (Get_External_Name_With_Suffix): Likewise.
>   * types.h (Fat_Pointer): Delete.
>   (String): New type.
>   (DECLARE_STRING): New macro.
>   * gcc-interface/decl.c (create_concat_name): Adjust.
>   * gcc-interface/trans.c (post_error): Likewise.
>   (post_error_ne): Likewise.
>   * gcc-interface/misc.c (internal_error_function): Likewise.

Yes, this bootstrapped.


r~

Go patch committed: Use backend interface for constant expressions

2014-04-17 Thread Ian Lance Taylor

This patch from Chris Manghane changes the Go frontend to use the
backend interface for global constants.  Bootstrapped and ran Go
testsuite on x86_64-unknown-linux-gnu.  Committed to mainline.

Ian


2014-04-17  Chris Manghane  

* go-gcc.cc (Gcc_backend::named_constant_expression): New
function.


Index: gcc/go/go-gcc.cc
===
--- gcc/go/go-gcc.cc	(revision 209494)
+++ gcc/go/go-gcc.cc	(revision 209495)
@@ -227,6 +227,10 @@ class Gcc_backend : public Backend
   indirect_expression(Bexpression* expr, bool known_valid, Location);
 
   Bexpression*
+  named_constant_expression(Btype* btype, const std::string& name,
+			Bexpression* val, Location);
+
+  Bexpression*
   integer_constant_expression(Btype* btype, mpz_t val);
 
   Bexpression*
@@ -962,6 +966,29 @@ Gcc_backend::indirect_expression(Bexpres
   return tree_to_expr(ret);
 }
 
+// Return an expression that declares a constant named NAME with the
+// constant value VAL in BTYPE.
+
+Bexpression*
+Gcc_backend::named_constant_expression(Btype* btype, const std::string& name,
+   Bexpression* val, Location location)
+{
+  tree type_tree = btype->get_tree();
+  tree const_val = val->get_tree();
+  if (type_tree == error_mark_node || const_val == error_mark_node)
+return this->error_expression();
+
+  tree name_tree = get_identifier_from_string(name);
+  tree decl = build_decl(location.gcc_location(), CONST_DECL, name_tree,
+			 type_tree);
+  DECL_INITIAL(decl) = const_val;
+  TREE_CONSTANT(decl) = 1;
+  TREE_READONLY(decl) = 1;
+
+  go_preserve_from_gc(decl);
+  return this->make_expression(decl);
+}
+
 // Return a typed value as a constant integer.
 
 Bexpression*
Index: gcc/go/gofrontend/gogo-tree.cc
===
--- gcc/go/gofrontend/gogo-tree.cc	(revision 209494)
+++ gcc/go/gofrontend/gogo-tree.cc	(revision 209495)
@@ -1015,44 +1015,22 @@ Named_object::get_tree(Gogo* gogo, Named
 {
 case NAMED_OBJECT_CONST:
   {
-	Named_constant* named_constant = this->u_.const_value;
 	Translate_context subcontext(gogo, function, NULL, NULL);
-	tree expr_tree = named_constant->expr()->get_tree(&subcontext);
-	if (expr_tree == error_mark_node)
-	  decl = error_mark_node;
-	else
+	Type* type = this->u_.const_value->type();
+	Location loc = this->location();
+
+	Expression* const_ref = Expression::make_const_reference(this, loc);
+Bexpression* const_decl =
+	  tree_to_expr(const_ref->get_tree(&subcontext));
+	if (type != NULL && type->is_numeric_type())
 	  {
-	Type* type = named_constant->type();
-	if (type != NULL && !type->is_abstract())
-	  {
-		if (type->is_error())
-		  expr_tree = error_mark_node;
-		else
-		  {
-		Btype* btype = type->get_backend(gogo);
-		expr_tree = fold_convert(type_to_tree(btype), expr_tree);
-		  }
-	  }
-	if (expr_tree == error_mark_node)
-	  decl = error_mark_node;
-	else if (INTEGRAL_TYPE_P(TREE_TYPE(expr_tree)))
-	  {
-tree name = get_identifier_from_string(this->get_id(gogo));
-		decl = build_decl(named_constant->location().gcc_location(),
-  CONST_DECL, name, TREE_TYPE(expr_tree));
-		DECL_INITIAL(decl) = expr_tree;
-		TREE_CONSTANT(decl) = 1;
-		TREE_READONLY(decl) = 1;
-	  }
-	else
-	  {
-		// A CONST_DECL is only for an enum constant, so we
-		// shouldn't use for non-integral types.  Instead we
-		// just return the constant itself, rather than a
-		// decl.
-		decl = expr_tree;
-	  }
+	Btype* btype = type->get_backend(gogo);
+	std::string name = this->get_id(gogo);
+const_decl =
+	  gogo->backend()->named_constant_expression(btype, name,
+			 const_decl, loc);
 	  }
+	decl = expr_to_tree(const_decl);
   }
   break;
 
Index: gcc/go/gofrontend/backend.h
===
--- gcc/go/gofrontend/backend.h	(revision 209494)
+++ gcc/go/gofrontend/backend.h	(revision 209495)
@@ -257,6 +257,12 @@ class Backend
   virtual Bexpression*
   indirect_expression(Bexpression* expr, bool known_valid, Location) = 0;
 
+  // Return an expression that declares a constant named NAME with the
+  // constant value VAL in BTYPE.
+  virtual Bexpression*
+  named_constant_expression(Btype* btype, const std::string& name,
+ Bexpression* val, Location) = 0;
+
   // Return an expression for the multi-precision integer VAL in BTYPE.
   virtual Bexpression*
   integer_constant_expression(Btype* btype, mpz_t val) = 0;
Index: gcc/go/gofrontend/expressions.cc
===
--- gcc/go/gofrontend/expressions.cc	(revision 209494)
+++ gcc/go/gofrontend/expressions.cc	(revision 209495)
@@ -2792,12 +2792,12 @@ Const_expression::do_get_tree(Translate_
   // If the type has been set for this expression, but the underlying
   // object is a

Re: Patch ping

2014-04-17 Thread Uros Bizjak

On Wed, Apr 16, 2014 at 11:35 PM, Jeff Law  wrote:
>
>> I'd like to ping 2 patches:
>>
>> http://gcc.gnu.org/ml/gcc-patches/2014-01/msg00140.html
>> - Ensure GET_MODE_{SIZE,INNER,NUNITS} (const) is constant rather than
>>   memory load after optimization (I'd like to keep the current
>>   patch for the reasons mentioned there, but also add this patch)
>
> This is fine.  Per the follow-up discussion, I think you can mark it as
> resolving 36109 as well.
>
>
>>
>> http://gcc.gnu.org/ml/gcc-patches/2014-01/msg00131.html
>> - PR target/59617
>>   handle gather loads for AVX512 (at least non-masked ones, masked ones
>>   will need to wait for 5.0 and we need to find how to represent it in
>>   GIMPLE)
>
> I'll leave this to Uros :-)

IIRC, this patch was already committed to 4.9 some time ago.

Uros.

[PATCH], PR target/60876 -- fix build issue with powerpc

2014-04-17 Thread Michael Meissner

I committed the following patch as obvious to fix the PowerPC build issue that
came up with changes to machmode.h.  These changes allow the compiler to build
and bootstrap. Submitted as subversion id 209498.

2014-04-17  Michael Meissner  

PR target/60876
* config/rs6000/rs6000.c (rs6000_setup_reg_addr_masks): Make sure
GET_MODE_SIZE gets passed an enum machine_mode type and not
integer.
(rs6000_init_hard_regno_mode_ok): Likewise.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797
Index: gcc/config/rs6000/rs6000.c
===
--- gcc/config/rs6000/rs6000.c  (revision 209494)
+++ gcc/config/rs6000/rs6000.c  (working copy)
@@ -2329,6 +2329,8 @@ rs6000_setup_reg_addr_masks (void)
 
   for (m = 0; m < NUM_MACHINE_MODES; ++m)
 {
+  enum machine_mode m2 = (enum machine_mode)m;
+
   /* SDmode is special in that we want to access it only via REG+REG
 addressing on power7 and above, since we want to use the LFIWZX and
 STFIWZX instructions to load it.  */
@@ -2363,13 +2365,13 @@ rs6000_setup_reg_addr_masks (void)
 
  if (TARGET_UPDATE
  && (rc == RELOAD_REG_GPR || rc == RELOAD_REG_FPR)
- && GET_MODE_SIZE (m) <= 8
- && !VECTOR_MODE_P (m)
- && !COMPLEX_MODE_P (m)
+ && GET_MODE_SIZE (m2) <= 8
+ && !VECTOR_MODE_P (m2)
+ && !COMPLEX_MODE_P (m2)
  && !indexed_only_p
- && !(TARGET_E500_DOUBLE && GET_MODE_SIZE (m) == 8)
- && !(m == DFmode && TARGET_UPPER_REGS_DF)
- && !(m == SFmode && TARGET_UPPER_REGS_SF))
+ && !(TARGET_E500_DOUBLE && GET_MODE_SIZE (m2) == 8)
+ && !(m2 == DFmode && TARGET_UPPER_REGS_DF)
+ && !(m2 == SFmode && TARGET_UPPER_REGS_SF))
{
  addr_mask |= RELOAD_REG_PRE_INCDEC;
 
@@ -2815,6 +2817,7 @@ rs6000_init_hard_regno_mode_ok (bool glo
 
   for (m = 0; m < NUM_MACHINE_MODES; ++m)
{
+ enum machine_mode m2 = (enum machine_mode)m;
  int reg_size2 = reg_size;
 
  /* TFmode/TDmode always takes 2 registers, even in VSX.  */
@@ -2823,7 +2826,7 @@ rs6000_init_hard_regno_mode_ok (bool glo
reg_size2 = UNITS_PER_FP_WORD;
 
  rs6000_class_max_nregs[m][c]
-   = (GET_MODE_SIZE (m) + reg_size2 - 1) / reg_size2;
+   = (GET_MODE_SIZE (m2) + reg_size2 - 1) / reg_size2;
}
 }
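
For context, a stripped-down illustration of the pattern used in the patch: the loop variable stays an integer, and an explicitly cast copy of the enum type is what gets passed to the size query. The cast matters when the sources are compiled as C++, where an int does not convert to an enum implicitly. The enum `toy_mode`, its size table and the function names are hypothetical stand-ins, not the real machine-mode definitions.

```c
#include <assert.h>

/* Hypothetical mode enumeration standing in for enum machine_mode.  */
enum toy_mode { TOY_QI, TOY_SI, TOY_DI, TOY_TF, NUM_TOY_MODES };

/* Size in bytes of each toy mode.  */
static const unsigned toy_mode_size[NUM_TOY_MODES] = { 1, 4, 8, 16 };

/* Stand-in for GET_MODE_SIZE: takes the enum type, not an integer.  */
unsigned
toy_mode_bytes (enum toy_mode m)
{
  assert (m < NUM_TOY_MODES);
  return toy_mode_size[m];
}

/* Registers of REG_SIZE bytes needed to hold mode M, rounding up.  */
unsigned
regs_needed (enum toy_mode m, unsigned reg_size)
{
  assert (reg_size > 0);
  return (toy_mode_bytes (m) + reg_size - 1) / reg_size;
}

/* Fill TABLE for all modes.  The loop index is an int, so it is cast
   to the enum type once and the cast copy is passed on.  */
void
fill_max_nregs (unsigned table[NUM_TOY_MODES], unsigned reg_size)
{
  for (int m = 0; m < NUM_TOY_MODES; ++m)
    {
      enum toy_mode m2 = (enum toy_mode) m;
      table[m] = regs_needed (m2, reg_size);
    }
}
```

The rounding expression is the same `(size + reg_size - 1) / reg_size` form used for rs6000_class_max_nregs, so a 16-byte mode needs two 8-byte registers and a 1-byte mode still needs one.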

Re: [RFC][PATCH] RL78 - clean-up of missing operand mode warnings.

2014-04-17 Thread DJ Delorie


> It seems a little inconsistent, however, that *movqi_real and 
> *xorqi3_real don't specify modes but *movhi_real and 
> *andqi_real/*iorqi_real do (and they also accept CONST_INTs).  Not that 
> I'm advocating generating more warnings, but my inner OCD likes 
> consistency :)

Adding the mode might be the right way, but I've seen cases where it
wasn't.  My paranoia supersedes my OCD ;-)

libgo patch committed: Avoid unnecessary gccgo extension

2014-04-17 Thread Ian Lance Taylor

This patch from Peter Collingbourne avoids an unnecessary gccgo
extension in libgo.  Bootstrapped and ran Go testsuite on
x86_64-unknown-linux-gnu.  Committed to mainline.

Ian

diff -r 801009f33610 libgo/go/syscall/libcall_posix.go
--- a/libgo/go/syscall/libcall_posix.go	Thu Apr 17 16:01:58 2014 -0700
+++ b/libgo/go/syscall/libcall_posix.go	Thu Apr 17 16:03:58 2014 -0700
@@ -138,7 +138,7 @@
 //sys	Select(nfd int, r *FdSet, w *FdSet, e *FdSet, timeout *Timeval) (n int, err error)
 //select(nfd _C_int, r *FdSet, w *FdSet, e *FdSet, timeout *Timeval) _C_int
 
-const nfdbits = int(unsafe.Sizeof(fds_bits_type) * 8)
+const nfdbits = int(unsafe.Sizeof(fds_bits_type(0)) * 8)
 
 type FdSet struct {
 	Bits [(FD_SETSIZE + nfdbits - 1) / nfdbits]fds_bits_type

libgo patch committed: Use delete rather than old map deletion syntax

2014-04-17 Thread Ian Lance Taylor

This patch from Peter Collingbourne changes libgo to use the builtin
delete function rather than the old map deletion syntax.  Bootstrapped
and ran Go testsuite on x86_64-unknown-linux-gnu.  Committed to
mainline.

Ian

diff -r 1e27a38c43ea libgo/go/syscall/syscall_unix.go
--- a/libgo/go/syscall/syscall_unix.go	Thu Apr 17 16:13:05 2014 -0700
+++ b/libgo/go/syscall/syscall_unix.go	Thu Apr 17 16:17:50 2014 -0700
@@ -153,7 +153,7 @@
 	if errno := m.munmap(uintptr(unsafe.Pointer(&b[0])), uintptr(len(b))); errno != nil {
 		return errno
 	}
-	m.active[p] = nil, false
+	delete(m.active, p)
 	return nil
 }

libgo patch committed: Avoid duplicate function declarations in syscall

2014-04-17 Thread Ian Lance Taylor

This patch from Peter Collingbourne avoids duplicate function
declarations in the generated libcalls.go file when building the syscall
package.  Bootstrapped and ran Go testsuite on x86_64-unknown-linux-gnu.
Committed to mainline.

Ian

diff -r 5009262c3e56 libgo/go/syscall/mksyscall.awk
--- a/libgo/go/syscall/mksyscall.awk	Thu Apr 17 16:26:57 2014 -0700
+++ b/libgo/go/syscall/mksyscall.awk	Thu Apr 17 16:30:29 2014 -0700
@@ -96,8 +96,11 @@
 cfnresult = line
 
 printf("// Automatically generated wrapper for %s/%s\n", gofnname, cfnname)
-printf("//extern %s\n", cfnname)
-printf("func c_%s(%s) %s\n", cfnname, cfnparams, cfnresult)
+if (!(cfnname in cfns)) {
+cfns[cfnname] = 1
+printf("//extern %s\n", cfnname)
+printf("func c_%s(%s) %s\n", cfnname, cfnparams, cfnresult)
+}
 printf("func %s(%s) %s%s%s%s{\n",
 	   gofnname, gofnparams, gofnresults == "" ? "" : "(", gofnresults,
 	   gofnresults == "" ? "" : ")", gofnresults == "" ? "" : " ")
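
The awk change above is an "emit once per name" guard. As a sketch of the same idea in C, assuming a small fixed-capacity set is enough: `struct name_set`, `name_set_add` and `count_declarations` are hypothetical names for illustration and are not part of libgo.

```c
#include <assert.h>
#include <string.h>

#define MAX_NAMES 64

/* Minimal set of already-seen C function names.  */
struct name_set
{
  const char *names[MAX_NAMES];
  int count;
};

/* Return 1 if NAME was not in the set yet (and add it), 0 if it was
   already present.  The set must not be full.  */
int
name_set_add (struct name_set *s, const char *name)
{
  assert (s != NULL && name != NULL);
  for (int i = 0; i < s->count; i++)
    if (strcmp (s->names[i], name) == 0)
      return 0;
  assert (s->count < MAX_NAMES);
  s->names[s->count++] = name;
  return 1;
}

/* Count how many extern declarations would be emitted for a list of
   wrapped C function names: one per distinct name.  */
int
count_declarations (const char *const *cfnnames, int n)
{
  struct name_set seen = { { 0 }, 0 };
  int emitted = 0;
  for (int i = 0; i < n; i++)
    if (name_set_add (&seen, cfnnames[i]))
      emitted++;
  return emitted;
}
```

Two Go wrappers that both call the C function `select` would therefore produce a single declaration for it, while every wrapper body is still generated.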

[PATCH, rs6000, 4.8, 4.9, trunk] Fix little endian behavior of vec_merge[hl] for V4SI/V4SF with VSX

2014-04-17 Thread Bill Schmidt

Hi,

I missed a case in the vector API work for little endian.  When VSX is
enabled, the vec_mergeh and vec_mergel interfaces for 4x32 vectors are
translated into xxmrghw and xxmrglw.  The patterns for these were not
adjusted for little endian.  This patch fixes this and adds tests for
V4SI and V4SF modes when VSX is available.
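
To illustrate why the little-endian case needs both the other instruction and swapped operands, here is a small plain-C model. It is only a sketch: the vsr type, word_of and the model_* functions are hypothetical, and it assumes the simplified view that element i of a vector sits in register word i on big endian and in word 3 - i on little endian.

```c
#include <assert.h>

/* A VSX register modelled as four 32-bit words in big-endian word
   numbering: w[0] is the leftmost word.  */
typedef struct { int w[4]; } vsr;

/* xxmrghw XT,XA,XB: interleave words 0 and 1 of XA and XB.  */
vsr
xxmrghw (vsr a, vsr b)
{
  vsr t = { { a.w[0], b.w[0], a.w[1], b.w[1] } };
  return t;
}

/* xxmrglw XT,XA,XB: interleave words 2 and 3 of XA and XB.  */
vsr
xxmrglw (vsr a, vsr b)
{
  vsr t = { { a.w[2], b.w[2], a.w[3], b.w[3] } };
  return t;
}

/* Assumed layout: element I lives in word I on big endian and in
   word 3 - I on little endian.  */
static int
word_of (int i, int little_endian)
{
  return little_endian ? 3 - i : i;
}

/* Model of vec_mergeh (HIGH != 0) or vec_mergel (HIGH == 0), choosing
   the instruction and operand order the way the patched patterns do.  */
void
model_merge (int out[4], const int a[4], const int b[4], int le, int high)
{
  vsr ra, rb, rt;
  for (int i = 0; i < 4; i++)
    {
      ra.w[word_of (i, le)] = a[i];
      rb.w[word_of (i, le)] = b[i];
    }
  if (high)
    rt = le ? xxmrglw (rb, ra) : xxmrghw (ra, rb);
  else
    rt = le ? xxmrghw (rb, ra) : xxmrglw (ra, rb);
  for (int i = 0; i < 4; i++)
    out[i] = rt.w[word_of (i, le)];
}

/* Both byte orders should give the same result in element order.  */
void
merge_self_check (void)
{
  const int a[4] = { 0, 1, 2, 3 }, b[4] = { 4, 5, 6, 7 };
  int be[4], le[4];
  for (int high = 0; high < 2; high++)
    {
      model_merge (be, a, b, 0, high);
      model_merge (le, a, b, 1, high);
      for (int i = 0; i < 4; i++)
	assert (be[i] == le[i]);
    }
}
```

In this model the little-endian path only agrees with the big-endian one when the high and low instructions are exchanged and the two source operands are swapped, which is what the new output templates do.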

Bootstrapped and tested on 4.8, 4.9, and trunk for
powerpc64le-unknown-linux-gnu with no regressions.  Tests are still
ongoing for powerpc64-unknown-linux-gnu.  Provided those complete
without regressions, is this fix ok for trunk, 4.9, and 4.8?

Thanks,
Bill


[gcc]

2014-04-17  Bill Schmidt  

* config/rs6000/vsx.md (vsx_xxmrghw_<mode>): Adjust for
little-endian.
(vsx_xxmrglw_<mode>): Likewise.

[gcc/testsuite]

2014-04-17  Bill Schmidt  

* gcc.dg/vmx/merge-vsx.c: Add V4SI and V4SF tests.
* gcc.dg/vmx/merge-vsx-be-order.c: Likewise.


Index: gcc/config/rs6000/vsx.md
===
--- gcc/config/rs6000/vsx.md(revision 209513)
+++ gcc/config/rs6000/vsx.md(working copy)
@@ -1891,7 +1891,12 @@
  (parallel [(const_int 0) (const_int 4)
 (const_int 1) (const_int 5)])))]
   "VECTOR_MEM_VSX_P (<MODE>mode)"
-  "xxmrghw %x0,%x1,%x2"
+{
+  if (BYTES_BIG_ENDIAN)
+return "xxmrghw %x0,%x1,%x2";
+  else
+return "xxmrglw %x0,%x2,%x1";
+}
   [(set_attr "type" "vecperm")])
 
 (define_insn "vsx_xxmrglw_<mode>"
@@ -1903,7 +1908,12 @@
  (parallel [(const_int 2) (const_int 6)
 (const_int 3) (const_int 7)])))]
   "VECTOR_MEM_VSX_P (<MODE>mode)"
-  "xxmrglw %x0,%x1,%x2"
+{
+  if (BYTES_BIG_ENDIAN)
+return "xxmrglw %x0,%x1,%x2";
+  else
+return "xxmrghw %x0,%x2,%x1";
+}
   [(set_attr "type" "vecperm")])
 
 ;; Shift left double by word immediate
Index: gcc/testsuite/gcc.dg/vmx/merge-vsx-be-order.c
===
--- gcc/testsuite/gcc.dg/vmx/merge-vsx-be-order.c   (revision 209513)
+++ gcc/testsuite/gcc.dg/vmx/merge-vsx-be-order.c   (working copy)
@@ -21,10 +21,19 @@ static void test()
   vector long long vlb = {0,1};
   vector double vda = {-2.0,-1.0};
   vector double vdb = {0.0,1.0};
+  vector unsigned int vuia = {0,1,2,3};
+  vector unsigned int vuib = {4,5,6,7};
+  vector signed int vsia = {-4,-3,-2,-1};
+  vector signed int vsib = {0,1,2,3};
+  vector float vfa = {-4.0,-3.0,-2.0,-1.0};
+  vector float vfb = {0.0,1.0,2.0,3.0};
 
   /* Result vectors.  */
   vector long long vlh, vll;
   vector double vdh, vdl;
+  vector unsigned int vuih, vuil;
+  vector signed int vsih, vsil;
+  vector float vfh, vfl;
 
   /* Expected result vectors.  */
 #if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
@@ -32,11 +41,23 @@ static void test()
   vector long long vlrl = {0,-2};
   vector double vdrh = {1.0,-1.0};
   vector double vdrl = {0.0,-2.0};
+  vector unsigned int vuirh = {6,2,7,3};
+  vector unsigned int vuirl = {4,0,5,1};
+  vector signed int vsirh = {2,-2,3,-1};
+  vector signed int vsirl = {0,-4,1,-3};
+  vector float vfrh = {2.0,-2.0,3.0,-1.0};
+  vector float vfrl = {0.0,-4.0,1.0,-3.0};
 #else
   vector long long vlrh = {-2,0};
   vector long long vlrl = {-1,1};
   vector double vdrh = {-2.0,0.0};
   vector double vdrl = {-1.0,1.0};
+  vector unsigned int vuirh = {0,4,1,5};
+  vector unsigned int vuirl = {2,6,3,7};
+  vector signed int vsirh = {-4,0,-3,1};
+  vector signed int vsirl = {-2,2,-1,3};
+  vector float vfrh = {-4.0,0.0,-3.0,1.0};
+  vector float vfrl = {-2.0,2.0,-1.0,3.0};
 #endif
 
   vlh = vec_mergeh (vla, vlb);
@@ -43,9 +64,21 @@ static void test()
   vll = vec_mergel (vla, vlb);
   vdh = vec_mergeh (vda, vdb);
   vdl = vec_mergel (vda, vdb);
+  vuih = vec_mergeh (vuia, vuib);
+  vuil = vec_mergel (vuia, vuib);
+  vsih = vec_mergeh (vsia, vsib);
+  vsil = vec_mergel (vsia, vsib);
+  vfh  = vec_mergeh (vfa,  vfb );
+  vfl  = vec_mergel (vfa,  vfb );
 
   check (vec_long_long_eq (vlh, vlrh), "vlh");
   check (vec_long_long_eq (vll, vlrl), "vll");
   check (vec_double_eq (vdh, vdrh), "vdh" );
   check (vec_double_eq (vdl, vdrl), "vdl" );
+  check (vec_all_eq (vuih, vuirh), "vuih");
+  check (vec_all_eq (vuil, vuirl), "vuil");
+  check (vec_all_eq (vsih, vsirh), "vsih");
+  check (vec_all_eq (vsil, vsirl), "vsil");
+  check (vec_all_eq (vfh,  vfrh),  "vfh");
+  check (vec_all_eq (vfl,  vfrl),  "vfl");
 }
Index: gcc/testsuite/gcc.dg/vmx/merge-vsx.c
===
--- gcc/testsuite/gcc.dg/vmx/merge-vsx.c(revision 209513)
+++ gcc/testsuite/gcc.dg/vmx/merge-vsx.c(working copy)
@@ -21,10 +21,19 @@ static void test()
   vector long long vlb = {0,1};
   vector double vda = {-2.0,-1.0};
   vector double vdb = {0.0,1.0};
+  vector unsigned int vuia = {0,1,2,3};
+  vector unsigned int vuib = {4,5,6,7};
+  vector signed int vsia = {-4,-3,-2,-1};
+  vector signed int vsib = {0,1,2,3};
+  vector float vfa = {-4.0,-3.0,-2.0,-1.0};
+  vector flo

[PATCH v8] PR middle-end/60281

2014-04-17 Thread lin zuojian

Hi,
Here is the patch after Jakub's review; Jakub also helped with the
coding style.

--

 * asan.c (asan_emit_stack_protection):
 Force the base to align to appropriate bits if STRICT_ALIGNMENT.  Set
 shadow_mem align to appropriate bits if STRICT_ALIGNMENT. 
 * cfgexpand.c
 (expand_stack_vars): Set base_align appropriately when asan is on.
 (expand_used_vars): Leave a space in the stack frame for alignment if
 STRICT_ALIGNMENT.
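
As a rough illustration of the alignment arithmetic (a sketch, not the GCC code itself): with an assumed SImode alignment of 32 bits, a shadow shift of 3 and 8-bit units, the mask in the patch rounds the frame base down to a 32-byte boundary, so that the corresponding shadow address can be accessed as an aligned 4-byte word. The constants are hard-coded assumptions for a typical 32-bit strict-alignment target, and shadow_addr with its shadow_offset parameter is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values; in GCC these come from target macros.  */
#define BITS_PER_UNIT      8
#define SIMODE_ALIGN_BITS  32	/* stands in for GET_MODE_ALIGNMENT (SImode) */
#define ASAN_SHADOW_SHIFT  3

/* Bytes of application memory covered by one aligned SImode shadow word.  */
#define BASE_ALIGN_BYTES \
  ((SIMODE_ALIGN_BITS << ASAN_SHADOW_SHIFT) / BITS_PER_UNIT)

/* Round BASE down to a multiple of BASE_ALIGN_BYTES, mirroring the AND
   with the negated alignment in the patch.  */
uintptr_t
align_base (uintptr_t base)
{
  /* The mask trick only works for powers of two.  */
  assert ((BASE_ALIGN_BYTES & (BASE_ALIGN_BYTES - 1)) == 0);
  return base & -(uintptr_t) BASE_ALIGN_BYTES;
}

/* Shadow address for ADDR, given a hypothetical SHADOW_OFFSET that is
   itself assumed to be a multiple of 4.  */
uintptr_t
shadow_addr (uintptr_t addr, uintptr_t shadow_offset)
{
  return (addr >> ASAN_SHADOW_SHIFT) + shadow_offset;
}
```

Under these assumptions, any base that has gone through align_base maps to a shadow address that is a multiple of 4, which is what an SImode store on a STRICT_ALIGNMENT target requires, provided the offsets added afterwards preserve that alignment.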

---
 gcc/ChangeLog   |  9 +
 gcc/asan.c  | 15 +++
 gcc/cfgexpand.c | 18 --
 3 files changed, 40 insertions(+), 2 deletions(-)

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index da35be8..30a2b33 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,12 @@
+2014-04-18  Lin Zuojian  
+   PR middle-end/60281
+   * asan.c (asan_emit_stack_protection): Force the base to align to
+   appropriate bits if STRICT_ALIGNMENT.  Set shadow_mem align to
+   appropriate bits if STRICT_ALIGNMENT.
+   * cfgexpand.c (expand_stack_vars): Set base_align appropriately
+   when asan is on.
+   (expand_used_vars): Leave a space in the stack frame for alignment
+   if STRICT_ALIGNMENT.
 2014-04-17  Jakub Jelinek  
 
PR target/60847
diff --git a/gcc/asan.c b/gcc/asan.c
index 53992a8..28a476f 100644
--- a/gcc/asan.c
+++ b/gcc/asan.c
@@ -1017,8 +1017,17 @@ asan_emit_stack_protection (rtx base, rtx pbase, unsigned int alignb,
base_align_bias = ((asan_frame_size + alignb - 1)
   & ~(alignb - HOST_WIDE_INT_1)) - asan_frame_size;
 }
+  /* Align base if target is STRICT_ALIGNMENT.  */
+  if (STRICT_ALIGNMENT)
+base = expand_binop (Pmode, and_optab, base,
+gen_int_mode (-((GET_MODE_ALIGNMENT (SImode)
+ << ASAN_SHADOW_SHIFT)
+/ BITS_PER_UNIT), Pmode), NULL_RTX,
+1, OPTAB_DIRECT);
+
   if (use_after_return_class == -1 && pbase)
 emit_move_insn (pbase, base);
+
   base = expand_binop (Pmode, add_optab, base,
   gen_int_mode (base_offset - base_align_bias, Pmode),
   NULL_RTX, 1, OPTAB_DIRECT);
@@ -1097,6 +1106,8 @@ asan_emit_stack_protection (rtx base, rtx pbase, unsigned int alignb,
  && (ASAN_RED_ZONE_SIZE >> ASAN_SHADOW_SHIFT) == 4);
   shadow_mem = gen_rtx_MEM (SImode, shadow_base);
   set_mem_alias_set (shadow_mem, asan_shadow_set);
+  if (STRICT_ALIGNMENT)
+set_mem_align (shadow_mem, (GET_MODE_ALIGNMENT (SImode)));
   prev_offset = base_offset;
   for (l = length; l; l -= 2)
 {
@@ -1186,6 +1197,10 @@ asan_emit_stack_protection (rtx base, rtx pbase, unsigned int alignb,
 
   shadow_mem = gen_rtx_MEM (BLKmode, shadow_base);
   set_mem_alias_set (shadow_mem, asan_shadow_set);
+
+  if (STRICT_ALIGNMENT)
+set_mem_align (shadow_mem, (GET_MODE_ALIGNMENT (SImode)));
+
   prev_offset = base_offset;
   last_offset = base_offset;
   last_size = 0;
diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
index b7f6360..14511e1 100644
--- a/gcc/cfgexpand.c
+++ b/gcc/cfgexpand.c
@@ -1013,10 +1013,19 @@ expand_stack_vars (bool (*pred) (size_t), struct stack_vars_data *data)
  if (data->asan_base == NULL)
data->asan_base = gen_reg_rtx (Pmode);
  base = data->asan_base;
+
+ if (!STRICT_ALIGNMENT)
+   base_align = crtl->max_used_stack_slot_alignment;
+ else
+   base_align = MAX (crtl->max_used_stack_slot_alignment,
+ GET_MODE_ALIGNMENT (SImode)
+ << ASAN_SHADOW_SHIFT);
}
  else
-   offset = alloc_stack_frame_space (stack_vars[i].size, alignb);
- base_align = crtl->max_used_stack_slot_alignment;
+   {
+ offset = alloc_stack_frame_space (stack_vars[i].size, alignb);
+ base_align = crtl->max_used_stack_slot_alignment;
+   }
}
   else
{
@@ -1845,6 +1854,11 @@ expand_used_vars (void)
= alloc_stack_frame_space (redzonesz, ASAN_RED_ZONE_SIZE);
  data.asan_vec.safe_push (prev_offset);
  data.asan_vec.safe_push (offset);
+ /* Leave space for alignment if STRICT_ALIGNMENT.  */
+ if (STRICT_ALIGNMENT)
+   alloc_stack_frame_space ((GET_MODE_ALIGNMENT (SImode)
+ << ASAN_SHADOW_SHIFT)
+/ BITS_PER_UNIT, 1);
 
  var_end_seq
= asan_emit_stack_protection (virtual_stack_vars_rtx,
-- 
1.8.3.2

--
Regards
lin zuojian

Re: [PATCH v8] PR middle-end/60281

2014-04-17 Thread lin zuojian

Hi Bernd,
a) On which target(s) did you boot-strap your patch?
I only ran it on x86.  I can't run it on ARM, because Android is neither
a POSIX system nor a System V compatible system.  And my code does not
affect x86.

b) Did you run the testsuite?
Yes, but again, my code does not affect x86.

c) When you compare the test results with and without the patch, were there any regressions?
No; the only difference is that the bug is gone.  My app now runs on my
Android ARM system.

On Fri, Apr 18, 2014 at 12:21:50PM +0800, lin zuojian wrote:
>  Hi,
> Here is the patch after the Jakub's review, and Jakub helps with the
> coding style.

[patch, testsuite] Fix fragile case nsdmi-union5

2014-04-17 Thread Joey Ye

Resulting from discussion here:
http://gcc.gnu.org/ml/gcc/2014-04/msg00125.html

ChangeLog:
* g++.dg/cpp0x/nsdmi-union5.C: Change to runtime test.

Index: gcc/testsuite/g++.dg/cpp0x/nsdmi-union5.C
===
--- gcc/testsuite/g++.dg/cpp0x/nsdmi-union5.C   (revision 209462)
+++ gcc/testsuite/g++.dg/cpp0x/nsdmi-union5.C   (working copy)
@@ -1,6 +1,5 @@
 // PR c++/58701
-// { dg-require-effective-target c++11 }
-// { dg-final { scan-assembler "7" } }
+// { dg-do run { target c++11 } }
 
 static union
 {
@@ -9,3 +8,10 @@
 int i = 7;
   };
 };
+
+extern "C" void abort(void);
+int main()
+{
+  if (i != 7) abort();
+  return 0;
+}

[C PATCH] Warn if switch has boolean value (PR c/60439)

2014-04-17 Thread Marek Polacek

This patch implements a new warning that triggers when the controlling
expression of a switch has a boolean value.  (Intentionally, I don't
warn if the controlling expression is an (un)signed:1 bit-field.)
The question is whether this should be enabled by default or
deserves a new warning option.  Since clang does the former,
I did too, and currently this warning is enabled by default.
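
A short, hypothetical example of the kind of code this is about (the function names and struct flags are made up for illustration and do not come from the patch or its test): the first function switches on a value that can only be 0 or 1, the second is the equivalent if, and the third uses a 1-bit bit-field, which per the description above is intentionally left alone.

```c
#include <assert.h>

/* Flagged form: `a && b' can only evaluate to 0 or 1, so a compiler
   with this patch would report "switch condition has boolean value".  */
int
both_set_switch (int a, int b)
{
  switch (a && b)
    {
    case 1:
      return 1;
    default:
      return 0;
    }
}

/* Equivalent form without a switch.  */
int
both_set_if (int a, int b)
{
  if (a && b)
    return 1;
  return 0;
}

/* A 1-bit bit-field as the controlling expression; according to the
   description this case is not meant to trigger the warning.  */
struct flags { unsigned int ready : 1; };

int
ready_switch (struct flags f)
{
  switch (f.ready)
    {
    case 1:
      return 1;
    default:
      return 0;
    }
}

/* The two spellings agree for every combination of zero and nonzero.  */
void
check_equivalence (void)
{
  for (int a = -1; a <= 1; a++)
    for (int b = -1; b <= 1; b++)
      assert (both_set_switch (a, b) == both_set_if (a, b));
}
```

The code is valid C either way; the warning only points out that a switch over two possible values is usually clearer as an if.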

Regtested/bootstrapped on x86_64-linux, ok for trunk?

2014-04-17  Marek Polacek  

PR c/60439
c/
* c-typeck.c (c_start_case): Warn if switch condition has boolean
value.
testsuite/
* gcc.dg/pr60439.c: New test.

diff --git gcc/c/c-typeck.c gcc/c/c-typeck.c
index 65aad45..91b1109 100644
--- gcc/c/c-typeck.c
+++ gcc/c/c-typeck.c
@@ -9344,6 +9344,28 @@ c_start_case (location_t switch_loc,
   else
{
  tree type = TYPE_MAIN_VARIANT (orig_type);
+ tree e = exp;
+ enum tree_code exp_code;
+
+ while (TREE_CODE (e) == COMPOUND_EXPR)
+   e = TREE_OPERAND (e, 1);
+ exp_code = TREE_CODE (e);
+
+ if (TREE_CODE (type) == BOOLEAN_TYPE
+ || exp_code == TRUTH_ANDIF_EXPR
+ || exp_code == TRUTH_AND_EXPR
+ || exp_code == TRUTH_ORIF_EXPR
+ || exp_code == TRUTH_OR_EXPR
+ || exp_code == TRUTH_XOR_EXPR
+ || exp_code == TRUTH_NOT_EXPR
+ || exp_code == EQ_EXPR
+ || exp_code == NE_EXPR
+ || exp_code == LE_EXPR
+ || exp_code == GE_EXPR
+ || exp_code == LT_EXPR
+ || exp_code == GT_EXPR)
+   warning_at (switch_cond_loc, 0,
+   "switch condition has boolean value");
 
  if (!in_system_header_at (input_location)
  && (type == long_integer_type_node
diff --git gcc/testsuite/gcc.dg/pr60439.c gcc/testsuite/gcc.dg/pr60439.c
index e69de29..26e7c25 100644
--- gcc/testsuite/gcc.dg/pr60439.c
+++ gcc/testsuite/gcc.dg/pr60439.c
@@ -0,0 +1,112 @@
+/* PR c/60439 */
+/* { dg-do compile } */
+
+typedef _Bool bool;
+extern _Bool foo (void);
+
+void
+f1 (const _Bool b)
+{
+  switch (b) /* { dg-warning "switch condition has boolean value" } */
+case 1:
+  break;
+}
+
+void
+f2 (int a, int b)
+{
+  switch (a && b) /* { dg-warning "switch condition has boolean value" } */
+case 1:
+  break;
+  switch ((bool) (a && b)) /* { dg-warning "switch condition has boolean value" } */
+case 1:
+  break;
+  switch ((a && b) || a) /* { dg-warning "switch condition has boolean value" } */
+case 1:
+  break;
+}
+
+void
+f3 (int a)
+{
+  switch (!!a) /* { dg-warning "switch condition has boolean value" } */
+case 1:
+  break;
+  switch (!a) /* { dg-warning "switch condition has boolean value" } */
+case 1:
+  break;
+}
+
+void
+f4 (void)
+{
+  switch (foo ()) /* { dg-warning "switch condition has boolean value" } */
+case 1:
+  break;
+}
+
+void
+f5 (int a)
+{
+  switch (a == 3) /* { dg-warning "switch condition has boolean value" } */
+case 1:
+  break;
+  switch (a != 3) /* { dg-warning "switch condition has boolean value" } */
+case 1:
+  break;
+  switch (a > 3) /* { dg-warning "switch condition has boolean value" } */
+case 1:
+  break;
+  switch (a < 3) /* { dg-warning "switch condition has boolean value" } */
+case 1:
+  break;
+  switch (a <= 3) /* { dg-warning "switch condition has boolean value" } */
+case 1:
+  break;
+  switch (a >= 3) /* { dg-warning "switch condition has boolean value" } */
+case 1:
+  break;
+  switch (foo (), foo (), a >= 42) /* { dg-warning "switch condition has boolean value" } */
+case 1:
+  break;
+  switch (a == 3, a & 4, a ^ 5, a)
+case 1:
+  break;
+}
+
+void
+f6 (bool b)
+{
+  switch (b) /* { dg-warning "switch condition has boolean value" } */
+case 1:
+  break;
+  switch (!b) /* { dg-warning "switch condition has boolean value" } */
+case 1:
+  break;
+  switch (b++) /* { dg-warning "switch condition has boolean value" } */
+case 1:
+  break;
+}
+
+void
+f7 (void)
+{
+  bool b;
+  switch (b = 1) /* { dg-warning "switch condition has boolean value" } */
+case 1:
+  break;
+}
+
+void
+f8 (int i)
+{
+  switch (i)
+case 0:
+  break;
+  switch ((unsigned int) i)
+case 0:
+  break;
+  switch ((bool) i) /* { dg-warning "switch condition has boolean value" } */
+case 0:
+  break;
+}

Marek

Re: [C PATCH] Warn if switch has boolean value (PR c/60439)

2014-04-17 Thread Marc Glisse


On Fri, 18 Apr 2014, Marek Polacek wrote:

> This patch implements a new warning that warns when controlling
> expression of a switch has boolean value.  (Intentionally I don't
> warn if the controlling expression is (un)signed:1 bit-field.)
> I guess the question is if this should be enabled by default or
> deserves some new warning option.  Since clang does the former,
> I did it too and currently this warning is enabled by default.

It can be enabled by -Wsome-name which is itself enabled by default but
at least gives the possibility to use -Wno-some-name, -Werror=some-name,
etc. No? I believe Manuel insists regularly that no new warning should
use 0 (and old ones should progressively lose it).

--
Marc Glisse

Re: [C PATCH] Warn if switch has boolean value (PR c/60439)

2014-04-17 Thread Marek Polacek

On Fri, Apr 18, 2014 at 07:49:22AM +0200, Marc Glisse wrote:
> On Fri, 18 Apr 2014, Marek Polacek wrote:
> 
> >This patch implements a new warning that warns when controlling
> >expression of a switch has boolean value.  (Intentionally I don't
> >warn if the controlling expression is (un)signed:1 bit-field.)
> >I guess the question is if this should be enabled by default or
> >deserves some new warning option.  Since clang does the former,
> >I did it too and currently this warning is enabled by default.
> 
> It can be enabled by -Wsome-name which is itself enabled by default but
> at least gives the possibility to use -Wno-some-name, -Werror=some-name,
> etc. No? I believe Manuel insists regularly that no new warning should
> use 0 (and old ones should progressively lose it).

Yes, that's the other possibility and exactly what I wanted to
discuss.  I think I'll prepare another version with -Wswitch-bool (and
documentation).
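
One practical effect of having a named option, sketched here under the assumption that it ends up spelled -Wswitch-bool as proposed: the diagnostic can be silenced for a single statement with the existing #pragma GCC diagnostic mechanism, or promoted with -Werror=switch-bool on the command line. The function sign_class below is hypothetical.

```c
#include <assert.h>

/* Hypothetical helper: returns 1 for positive X and 0 otherwise.  The
   switch on a comparison result would normally be diagnosed; the
   pragmas turn the warning off for just this statement.  A compiler
   that does not know the option name may complain about the pragma
   itself instead.  */
int
sign_class (int x)
{
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wswitch-bool"
  switch (x > 0)
    {
    case 1:
      return 1;
    default:
      return 0;
    }
#pragma GCC diagnostic pop
}

/* Quick self-check of the helper.  */
void
sign_class_check (void)
{
  assert (sign_class (5) == 1);
  assert (sign_class (-5) == 0);
}
```

With a warning keyed to 0 instead of an option, neither the pragma nor -Wno-switch-bool would have anything to refer to.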

Marek

RE: [PATCH v8] PR middle-end/60281

2014-04-17 Thread Bernd Edlinger

Hi Jakub,

I can take that task over and will boot-strap with all languages and run the
test suite on my armv7-linux-gnueabihf system.
But that will take until next week as it is currently occupied with other tests.


Can you please review Lin's latest patch and give your OK for check-in on trunk
and 4.9.1 branch?


Thanks
Bernd. 

On Fri, 18 Apr 2014 12:26:36, Lin Zuojian wrote:
>
> Hi Bernd,
> a) On which target(s) did you boot-strap your patch?
> I just run it on x86, can't run it on ARM, because Android is not a
> posix system, nor a System V compatible system. And my code does not
> effect x86.
>
> b) Did you run the testsuite?
> Yes, but again my code does not effect x86.
>
> c) When you compare the test results with and without the patch, were there 
> any regressions?
> Only the bug has gone. My app can run on my Android ARM system.
>
> On Fri, Apr 18, 2014 at 12:21:50PM +0800, lin zuojian wrote:
>> Hi,
>> Here is the patch after the Jakub's review, and Jakub helps with the
>> coding style.

Re: [C PATCH] Warn if switch has boolean value (PR c/60439)

2014-04-17 Thread Marek Polacek

On Fri, Apr 18, 2014 at 08:00:45AM +0200, Marek Polacek wrote:
> On Fri, Apr 18, 2014 at 07:49:22AM +0200, Marc Glisse wrote:
> > On Fri, 18 Apr 2014, Marek Polacek wrote:
> > 
> > >This patch implements a new warning that warns when controlling
> > >expression of a switch has boolean value.  (Intentionally I don't
> > >warn if the controlling expression is (un)signed:1 bit-field.)
> > >I guess the question is if this should be enabled by default or
> > >deserves some new warning option.  Since clang does the former,
> > >I did it too and currently this warning is enabled by default.
> > 
> > It can be enabled by -Wsome-name which is itself enabled by default but
> > at least gives the possibility to use -Wno-some-name, -Werror=some-name,
> > etc. No? I believe Manuel insists regularly that no new warning should
> > use 0 (and old ones should progressively lose it).
> 
> Yes, that's the other possibility and exactly what I wanted to
> discuss.  I think I'll prepare another version with -Wswitch-bool (and
> documentation).

Here.

2014-04-18  Marek Polacek  

PR c/60439
* doc/invoke.texi: Document -Wswitch-bool.
c/
* c-typeck.c (c_start_case): Warn if switch condition has boolean
value.
c-family/
* c.opt (Wswitch-bool): New option.
testsuite/
* gcc.dg/pr60439.c: New test.

diff --git gcc/c-family/c.opt gcc/c-family/c.opt
index 390c056..9089496 100644
--- gcc/c-family/c.opt
+++ gcc/c-family/c.opt
@@ -529,6 +529,10 @@ Wswitch-enum
 C ObjC C++ ObjC++ Var(warn_switch_enum) Warning
 Warn about all enumerated switches missing a specific case
 
+Wswitch-bool
+C ObjC Warning Init(1)
+Warn about switches with boolean controlling expression
+
 Wmissing-format-attribute
 C ObjC C++ ObjC++ Alias(Wsuggest-attribute=format)
 ;
diff --git gcc/c/c-typeck.c gcc/c/c-typeck.c
index 65aad45..44982d3 100644
--- gcc/c/c-typeck.c
+++ gcc/c/c-typeck.c
@@ -9344,6 +9344,28 @@ c_start_case (location_t switch_loc,
   else
{
  tree type = TYPE_MAIN_VARIANT (orig_type);
+ tree e = exp;
+ enum tree_code exp_code;
+
+ while (TREE_CODE (e) == COMPOUND_EXPR)
+   e = TREE_OPERAND (e, 1);
+ exp_code = TREE_CODE (e);
+
+ if (TREE_CODE (type) == BOOLEAN_TYPE
+ || exp_code == TRUTH_ANDIF_EXPR
+ || exp_code == TRUTH_AND_EXPR
+ || exp_code == TRUTH_ORIF_EXPR
+ || exp_code == TRUTH_OR_EXPR
+ || exp_code == TRUTH_XOR_EXPR
+ || exp_code == TRUTH_NOT_EXPR
+ || exp_code == EQ_EXPR
+ || exp_code == NE_EXPR
+ || exp_code == LE_EXPR
+ || exp_code == GE_EXPR
+ || exp_code == LT_EXPR
+ || exp_code == GT_EXPR)
+   warning_at (switch_cond_loc, OPT_Wswitch_bool,
+   "switch condition has boolean value");
 
  if (!in_system_header_at (input_location)
  && (type == long_integer_type_node
diff --git gcc/doc/invoke.texi gcc/doc/invoke.texi
index 8004da8..04e1c41 100644
--- gcc/doc/invoke.texi
+++ gcc/doc/invoke.texi
@@ -268,7 +268,7 @@ Objective-C and Objective-C++ Dialects}.
 -Wstrict-aliasing=n @gol -Wstrict-overflow -Wstrict-overflow=@var{n} @gol
 -Wsuggest-attribute=@r{[}pure@r{|}const@r{|}noreturn@r{|}format@r{]} @gol
 -Wmissing-format-attribute @gol
--Wswitch  -Wswitch-default  -Wswitch-enum -Wsync-nand @gol
+-Wswitch  -Wswitch-default  -Wswitch-enum -Wswitch-bool -Wsync-nand @gol
 -Wsystem-headers  -Wtrampolines  -Wtrigraphs  -Wtype-limits  -Wundef @gol
 -Wuninitialized  -Wunknown-pragmas  -Wno-pragmas @gol
 -Wunsuffixed-float-constants  -Wunused  -Wunused-function @gol
@@ -3822,6 +3822,12 @@ between @option{-Wswitch} and this option is that this option gives a
 warning about an omitted enumeration code even if there is a
 @code{default} label.
 
+@item -Wswitch-bool
+@opindex Wswitch-bool
+@opindex Wno-switch-bool
+Warn whenever a @code{switch} statement has an index of boolean type.
+This warning is enabled by default for C programs.
+
 @item -Wsync-nand @r{(C and C++ only)}
 @opindex Wsync-nand
 @opindex Wno-sync-nand
diff --git gcc/testsuite/gcc.dg/pr60439.c gcc/testsuite/gcc.dg/pr60439.c
index e69de29..26e7c25 100644
--- gcc/testsuite/gcc.dg/pr60439.c
+++ gcc/testsuite/gcc.dg/pr60439.c
@@ -0,0 +1,112 @@
+/* PR c/60439 */
+/* { dg-do compile } */
+
+typedef _Bool bool;
+extern _Bool foo (void);
+
+void
+f1 (const _Bool b)
+{
+  switch (b) /* { dg-warning "switch condition has boolean value" } */
+case 1:
+  break;
+}
+
+void
+f2 (int a, int b)
+{
+  switch (a && b) /* { dg-warning "switch condition has boolean value" } */
+case 1:
+  break;
+  switch ((bool) (a && b)) /* { dg-warning "switch condition has boolean value" } */
+case 1:
+  break;
+  switch ((a && b) || a) /* { dg-warning "switch condition has boolean value" } */
+case 1:
+  break;
+}
+
+void
+f3 (int a)
+{
+  switch (!!a) /* { dg-warning "switch conditi
