Re: [asan] Change shadow type from unsigned to signed

2012-10-19 Thread Jakub Jelinek
On Thu, Oct 18, 2012 at 08:40:32PM -0700, Xinliang David Li wrote:
> The following is a small bug fix caught during testing. The shadow value
> for redzone can be negative, so the shadow type must be signed for the
> check to succeed.
> 
> Ok for the branch?

Yes (with a proper ChangeLog.asan entry ;)

> --- asan.c  (revision 192567)
> +++ asan.c  (working copy)
> @@ -825,10 +825,10 @@ static void
>  asan_init_shadow_ptr_types (void)
>  {
>asan_shadow_set = new_alias_set ();
> -  shadow_ptr_types[0] = build_distinct_type_copy (unsigned_char_type_node);
> +  shadow_ptr_types[0] = build_distinct_type_copy (signed_char_type_node);
>TYPE_ALIAS_SET (shadow_ptr_types[0]) = asan_shadow_set;
>shadow_ptr_types[0] = build_pointer_type (shadow_ptr_types[0]);
> -  shadow_ptr_types[1] = build_distinct_type_copy (short_unsigned_type_node);
> +  shadow_ptr_types[1] = build_distinct_type_copy (short_integer_type_node);
>TYPE_ALIAS_SET (shadow_ptr_types[1]) = asan_shadow_set;
>shadow_ptr_types[1] = build_pointer_type (shadow_ptr_types[1]);
>  }

Jakub
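
For context, a minimal sketch of the partial-access check that the ASan
instrumentation emits (my own simplification; the shadow offset and the report
call are placeholders, not the real names).  Redzone shadow bytes are values
such as 0xfa..0xff, i.e. negative when read as a signed char, so the comparison
only rejects them when the shadow load is signed:

  signed char shadow = *(signed char *) ((addr >> 3) + shadow_offset);
  if (shadow != 0
      && (signed char) ((addr & 7) + access_size - 1) >= shadow)
    report_error (addr);  /* with unsigned char, a redzone value like 0xfb
                             reads as 251, the >= test is false, and the bad
                             access is missed */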


Re: wide int patch #6: Replacement of hwi extraction from int-csts.

2012-10-19 Thread Richard Biener
On Thu, Oct 18, 2012 at 4:00 PM, Kenneth Zadeck
 wrote:
> you know richi, i did not know who i was actually talking to.   i said who
> is this richard beiner person and then i saw the email address.

;)

> On 10/18/2012 08:58 AM, Richard Biener wrote:
>>
>> On Thu, Oct 18, 2012 at 2:52 PM, Kenneth Zadeck
>>  wrote:
>>>
>>> On 10/18/2012 06:22 AM, Richard Biener wrote:

 On Wed, Oct 17, 2012 at 11:47 PM, Kenneth Zadeck
  wrote:
>
> Richi,
>
> I apologize for the size of this patch, but it only does one very small
> thing; it is just that it does it all over the middle end.
>
> This patch introduces a new api for extracting a signed or unsigned hwi
> from an integer cst.  There had been an api for doing this, but it has some
> problems that were fixed here, and it was only used sometimes.
>
> The new api consists of 6 functions: three for testing if the constant in
> the int cst will fit, and three for actually pulling out the value.
>
> The ones for testing are tree_fits_uhwi_p, tree_fits_shwi_p, and
> tree_fits_hwi_p.   The first two of these are unsigned and signed versions,
> and the last one takes a boolean parameter which is true if the value is
> positive.   This replaces host_integerp, which basically has the same
> functionality as tree_fits_hwi_p.   The reason for changing this is that
> there were about 400 calls to host_integerp and all but 4 of them took a
> constant parameter.  These names are chosen to be similar to the
> corresponding functions in wide-int and are more mnemonic than the existing
> name, since they use the more readable u and s prefixes that a lot of other
> places do.
>
> On the accessing side, there are tree_to_uhwi, tree_to_shwi, and
> tree_to_hwi.  The first two do checking when checking is enabled.  The last
> does no checking.

 Just a quick note here - the changelog mentions tree_low_cst (as new
 function!?) but not host_integerp.  You should probably get rid of both,
 otherwise uses will creep back as people are familiar with them
 (I'm not sure these changes for consistency are always good ...)
>>>
>>> i will fix this.
>
> these are bugs in the changelog, not the code.   new changelog included.
>
>>>
 I don't like that tree_to_hwi does not do any checking.  In fact I don't
 like that it _exists_; after all, it has a return type whose signedness
 does not magically change.  Unchecked conversions should use
 TREE_LOW_CST.
>>>
>>> the idea is that when wide-int goes in, there is actually no
>>> TREE_INT_CST_LOW.   The concepts of low and high go out the door.  The
>>> int-cst
>>> will have an array in it that is big enough to hold the value.
>>> so tree_to_hwi becomes a shorthand for just accessing the lower element
>>> of
>>> the array.
>>>
>>> you could argue that you should say tree_fits_uhwi_p followed by
>>> tree_to_uhwi (and the same for signed).   This is an easy fix.   it just
>>> seemed a little redundant.
>>
>> Well, if you want raw access to the lower element (when do you actually
>> want that, when not in wide-int.c/h?) ... you still need to care about the
>> signedness of the result.  And tree_fits_uhwi_p does not return the
>> same as tree_fits_shwi_p all the time.
>>
>> I don't see any goodness in tree_to_hwi nor tree_fits_hwi really.  Because
>> if you just access the lower word then that still has a sign (either
>> HOST_WIDE_INT or unsigned HOST_WIDE_INT).  We should get rid
>> of those places - can you enumerate them?  I think you said it was just
>> a few callers with variable signedness argument.
>
> Note that tree_fits_hwi_p does check.   it just takes a parameter to say if
> it wants signed or unsigned checking (it is the old host_integerp,
> repackaged).   You really do need this function as it is for the 4 or 5
> places it is called.  The parameter to it is typically, but not always, the
> sign of the type of the int cst being passed to it.
>
> it is the tree_to_hwi that is unchecked.  Most of the places are identified
> with comments.  This survives the changelog.   (i happen to be in the group
> of people that think changelogs are useless, and that we should do a better
> job of commenting the code.)
>
> I do not know if this is sloppiness or not, but the signedness that is
> checked rarely matches the sign of the variable that the value is assigned to.
> I found this quite frustrating when i was making the patch but this kind of
> thing is common in code where the original writer "knew what (s)he was
> doing."  Unless you are doing comparisons or shifting, the signedness of
> the target does not really make much difference.
>
> if you want me to change the sequences of explicit checking and unchecked
> access to explicit checking followed by a checked access, then i am happy to
> do this.

Disclaimer: not looking at the patch (again).

The existing tree_low_
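
A hedged sketch of how the proposed accessors are meant to be used (my own
illustration; the names come from the thread, the exact signatures are
assumptions):

  /* Checked extraction of an INTEGER_CST into a signed HOST_WIDE_INT;
     returns false when the constant does not fit.  */
  static bool
  get_index_as_shwi (tree idx, HOST_WIDE_INT *out)
  {
    if (TREE_CODE (idx) != INTEGER_CST || !tree_fits_shwi_p (idx))
      return false;
    *out = tree_to_shwi (idx);  /* checks the fit when checking is enabled */
    return true;
  }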

Re: [PATCH] Don't fold PTR - (UN)SIGNED comparison (PR middle-end/54945)

2012-10-19 Thread Richard Biener
On Thu, Oct 18, 2012 at 6:21 PM, Marek Polacek  wrote:
> This patch changes folding in fold_sign_changed_comparison in a way
> that when we have a pointer/non-pointer comparison, we give up folding
> here.  The issue is e.g. when we have (intptr_t) &MEM[(void *)&x + 4B]
> == (intptr_t) &y and forwprop wants to fold this one via
> fold_binary_loc--we then end up folding non-sign-changing comparison.
>
> I don't know about the testcase though.  I had something, but I think
> it is too unreliable.  Regtested/bootstrapped on
> x86_64-unknown-linux-gnu.  Ok for trunk?

Ok for all active branches.

Thanks,
Richard.

> 2012-10-18  Marek Polacek  
>
> * fold-const.c (fold_sign_changed_comparison):  Punt if folding
> pointer/non-pointer comparison.
>
> --- gcc/fold-const.c.mp 2012-10-17 15:01:34.882926243 +0200
> +++ gcc/fold-const.c2012-10-18 18:02:55.482642941 +0200
> @@ -6731,12 +6731,14 @@ fold_sign_changed_comparison (location_t
>&& TREE_TYPE (TREE_OPERAND (arg1, 0)) == inner_type))
>  return NULL_TREE;
>
> -  if ((TYPE_UNSIGNED (inner_type) != TYPE_UNSIGNED (outer_type)
> -   || POINTER_TYPE_P (inner_type) != POINTER_TYPE_P (outer_type))
> +  if (TYPE_UNSIGNED (inner_type) != TYPE_UNSIGNED (outer_type)
>&& code != NE_EXPR
>&& code != EQ_EXPR)
>  return NULL_TREE;
>
> +  if (POINTER_TYPE_P (inner_type) != POINTER_TYPE_P (outer_type))
> +return NULL_TREE;
> +
>if (TREE_CODE (arg1) == INTEGER_CST)
>  arg1 = force_fit_type_double (inner_type, tree_to_double_int (arg1),
>   0, TREE_OVERFLOW (arg1));
>
> Marek
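
An illustration of the kind of source that produces such a comparison (my own
example, not the PR 54945 testcase):

  #include <stdint.h>

  struct s { int a, b; } x;
  int y;

  int
  same_address (void)
  {
    intptr_t p = (intptr_t) &x.b;  /* folds to (intptr_t) &MEM[(void *)&x + 4B] */
    intptr_t q = (intptr_t) &y;
    return p == q;                 /* forwprop hands this to fold_binary_loc;
                                      stripping the conversions would compare a
                                      pointer with a non-pointer, which the
                                      patch now refuses to do */
  }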


Re: wide int patch #7: Replacement of INT_CST_LT and INT_CST_LT_UNSIGNED

2012-10-19 Thread Richard Biener
On Thu, Oct 18, 2012 at 7:32 PM, Kenneth Zadeck
 wrote:
> This patch replaces all instances of  INT_CST_LT and INT_CST_LT_UNSIGNED
> with
> the new functions tree_int_cst_lts_p and tree_int_cst_ltu_p.   With the new
> implementation of int_cst these functions will be too big to do inline.

These new function names are extremely confusing given that we already
have tree_int_cst_lt which does the right thing based on the signedness
of the INTEGER_CST trees.

The whole point of the macros was to be inlined and you break that.  That is,
for example

   if (unsignedp && unsignedp0)
{
- min_gt = INT_CST_LT_UNSIGNED (primop1, minval);
- max_gt = INT_CST_LT_UNSIGNED (primop1, maxval);
- min_lt = INT_CST_LT_UNSIGNED (minval, primop1);
- max_lt = INT_CST_LT_UNSIGNED (maxval, primop1);
+ min_gt = tree_int_cst_ltu_p (primop1, minval);
+ max_gt = tree_int_cst_ltu_p (primop1, maxval);
+ min_lt = tree_int_cst_ltu_p (minval, primop1);
+ max_lt = tree_int_cst_ltu_p (maxval, primop1);
}
   else
{
- min_gt = INT_CST_LT (primop1, minval);
- max_gt = INT_CST_LT (primop1, maxval);
- min_lt = INT_CST_LT (minval, primop1);
- max_lt = INT_CST_LT (maxval, primop1);
+ min_gt = tree_int_cst_lts_p (primop1, minval);
...

could have just been

min_gt = tree_int_cst_lt (primop1, minval);


without any sign check.

So if you think you need to kill the inlined variants please use the existing
tree_int_cst_lt instead.

Thanks,
Richard.

> This is a small patch that has no prerequisites.
>
> Tested on x86-64.
>
> kenny
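
For readers unfamiliar with the old macros, a hedged illustration of the
distinction at stake (my own example): tree_int_cst_lt honours TYPE_UNSIGNED
of the operands' types, while the macros, and the proposed replacements, force
one interpretation regardless of type.

  tree m1  = build_int_cst (integer_type_node, -1);
  tree one = build_int_cst (integer_type_node, 1);

  /* Type-driven comparison: the type is signed, so -1 < 1 is true.  */
  bool signed_lt = tree_int_cst_lt (m1, one);

  /* Forced unsigned comparison: 0xfff...f < 1 is false.  This is what
     INT_CST_LT_UNSIGNED did and what tree_int_cst_ltu_p preserves.  */
  bool unsigned_lt = INT_CST_LT_UNSIGNED (m1, one);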


[Patch, Build+Fortran] PR54725 - correctly set TARGET_SYSTEM_ROOT for CPP

2012-10-19 Thread Tobias Burnus
gfortran was ignoring TARGET_SYSTEM_ROOT and thus searched in /usr/include
for files specified via "#include"/"include" and for .mod files.


The solution is to do in gcc/fortran/cpp.c the same as is done in
gcc/c-family/c-opts.c.



However, the TARGET_SYSTEM_ROOT also has to be available. For C/C++ 
that's done via gcc/Makefile.in:

  CFLAGS-c-family/c-opts.o += @TARGET_SYSTEM_ROOT_DEFINE@

For Fortran, we have to do it likewise, but slightly differently:

The gcc/Makefile.in gets updated by configure and written to 
$BUILD/gcc/Makefile. At that point the @TARGET...@ has been replaced by 
the actual value.


For Fortran, the gcc/fortran/Make-lang.in is included in 
$BUILD/gcc/Makefile:

  include $(LANG_MAKEFRAGS)

Thus, we cannot use @TARGET_...@ as it won't get replaced. Hence, I 
create a Makefile variable in gcc/Makefile.in and set it there, which 
then in turn gets used in gcc/fortran/Make-lang.in.


Build and regtested on x86-64-gnu-linux.
OK for the trunk?

Tobias

PS: I haven't yet cross-compiled and thus haven't checked whether it indeed works.
gcc/
2012-10-19  Tobias Burnus  

	PR fortran/54725
	* Makefile.in (TARGET_SYSTEM_ROOT_DEFINE): New.

gcc/fortran
2012-10-19  Tobias Burnus  

	PR fortran/54725
	* Make-lang.in (CFLAGS-cpp.o): Use TARGET_SYSTEM_ROOT_DEFINE.
	* cpp.c (gfc_cpp_init_options): Use it for
	setting gfc_cpp_option.sysroot.

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index 7ae3bb9..e18dc8f 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -468,6 +468,7 @@ LIMITS_H_TEST = [ -f $(SYSTEM_HEADER_DIR)/limits.h ]
 # Directory for prefix to system directories, for
 # each of $(system_prefix)/usr/include, $(system_prefix)/usr/lib, etc.
 TARGET_SYSTEM_ROOT = @TARGET_SYSTEM_ROOT@
+TARGET_SYSTEM_ROOT_DEFINE = @TARGET_SYSTEM_ROOT_DEFINE@
 
 xmake_file=@xmake_file@
 tmake_file=@tmake_file@
diff --git a/gcc/fortran/Make-lang.in b/gcc/fortran/Make-lang.in
index a74eb7f..4041d2d 100644
--- a/gcc/fortran/Make-lang.in
+++ b/gcc/fortran/Make-lang.in
@@ -341,6 +341,7 @@ GFORTRAN_TRANS_DEPS = fortran/gfortran.h fortran/libgfortran.h \
 $(CONFIG_H) $(SYSTEM_H) $(TREE_H) $(TM_H) coretypes.h $(GGC_H) \
 fortran/iso-c-binding.def fortran/iso-fortran-env.def
 
+CFLAGS-cpp.o += $(TARGET_SYSTEM_ROOT_DEFINE)
 fortran/f95-lang.o: $(GFORTRAN_TRANS_DEPS) fortran/mathbuiltins.def \
   gt-fortran-f95-lang.h gtype-fortran.h $(CGRAPH_H) $(TARGET_H) fortran/cpp.h \
   $(BUILTINS_DEF) fortran/types.def \
diff --git a/gcc/fortran/cpp.c b/gcc/fortran/cpp.c
index c45af39..f54ba96 100644
--- a/gcc/fortran/cpp.c
+++ b/gcc/fortran/cpp.c
@@ -38,6 +38,10 @@ along with GCC; see the file COPYING3.  If not see
 #include "cppbuiltin.h"
 #include "mkdeps.h"
 
+#ifndef TARGET_SYSTEM_ROOT
+# define TARGET_SYSTEM_ROOT NULL
+#endif
+
 #ifndef TARGET_CPU_CPP_BUILTINS
 # define TARGET_CPU_CPP_BUILTINS()
 #endif
@@ -267,7 +271,7 @@ gfc_cpp_init_options (unsigned int decoded_options_count,
 
   gfc_cpp_option.multilib = NULL;
   gfc_cpp_option.prefix = NULL;
-  gfc_cpp_option.sysroot = NULL;
+  gfc_cpp_option.sysroot = TARGET_SYSTEM_ROOT;
 
   gfc_cpp_option.deferred_opt = XNEWVEC (gfc_cpp_deferred_opt_t,
 	 decoded_options_count);
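
To make the mechanism concrete, a hedged illustration (the sysroot path is made
up): for a compiler configured with --with-sysroot=/opt/sysroot, the
TARGET_SYSTEM_ROOT_DEFINE variable expands to roughly
-DTARGET_SYSTEM_ROOT=\"/opt/sysroot\", so cpp.c effectively sees:

  #define TARGET_SYSTEM_ROOT "/opt/sysroot"   /* supplied on the command line */

  gfc_cpp_option.sysroot = TARGET_SYSTEM_ROOT;  /* "/opt/sysroot" instead of
                                                   NULL, so "#include"/"include"
                                                   and .mod lookups honour the
                                                   sysroot */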


Re: [PATCH] Intrinsics for fxsave[,64], xsave[,64], xsaveopt[,64]

2012-10-19 Thread H.J. Lu
On Thu, Oct 18, 2012 at 2:16 AM, Alexander Ivchenko  wrote:
> Hi,
>
> this patch adds new intrinsics for fxsave, fxsave64, xsave, xsave64,
> xsaveopt and xsaveopt64 instructions.
>
> Bootstrapped on x86-64
>
> Is it ok for trunk?
>
> Changelog entry:
> 2012-10-16  Alexander Ivchenko  
>
> * gcc/common/config/i386/i386-common.c
> (OPTION_MASK_ISA_FXSAVE_SET): New.
> (OPTION_MASK_ISA_XSAVE_SET): Likewise.
> (ix86_handle_option): Handle mfxsave and mxsave options.
> * gcc/config.gcc (i[34567]86-*-*): Add fxsaveintrin.h and 
> xsaveintrin.h.
> (x86_64-*-*): Likewise.
> * config/i386/xsaveintrin.h: New header.
> * config/i386/fxsaveintrin.h: Likewise.
> * gcc/config/i386/driver-i386.c (host_detect_local_cpu): Detect
> FXSAVE/XSAVE support.
> * gcc/config/i386/i386-builtin-types.def
> (VOID_FTYPE_PVOID_INT64): New function type.
> * gcc/config/i386/i386-c.c: Define __FXSAVE__ and __XSAVE__ if needed.
> * gcc/config/i386/i386.c (ix86_target_string): Define -mfxsave
> and -mxsave options.
> (PTA_FXSAVE): New.
> (PTA_XSAVE): Likewise.
> (ix86_option_override_internal): Handle new option.
> (ix86_valid_target_attribute_inner_p): Add OPT_mfxsave, OPT_mxsave.
> (ix86_builtins): Add IX86_BUILTIN_FXSAVE32, IX86_BUILTIN_FXSAVE64,
> IX86_BUILTIN_XSAVE32, IX86_BUILTIN_XSAVE64,
> IX86_BUILTIN_XSAVEOPT32, IX86_BUILTIN_XSAVEOPT64.
> (ix86_expand_builtin): Handle these built-ins.
> * gcc/config/i386/i386.h (TARGET_FXSAVE): New.
> (TARGET_XSAVE): Likewise.
> * gcc/config/i386/i386.md (fxsave32): New.
> (fxsave64): Likewise.
> (xsave32): Likewise.
> (xsave64): Likewise.
> (xsaveopt32): Likewise.
> (xsaveopt64): Likewise.
> * gcc/config/i386/i386.opt (mfxsave): New.
> (mxsave): Likewise.
> * gcc/config/i386/x86intrin.h: Include
> xsaveintrin.h, fxsaveintrin.h.
>
> testsuite/Changelog entry:
> 2012-10-16  Alexander Ivchenko  
>
> * gcc.target/i386/fxsave-1.c: New.
> * gcc.target/i386/fxsave64-1.c: Ditto.
> * gcc.target/i386/xsave-1.c: Ditto.
> * gcc.target/i386/xsave64-1.c: Ditto.
> * gcc.target/i386/xsaveopt-1.c: Ditto.
> * gcc.target/i386/xsaveopt64-1.c: Ditto.

A few comments:

1. Shouldn't we also support XSAVEOPT?
2. Shouldn't we enable FXSAVE for -msse?
3. Shouldn't we enable XSAVE for -mavx?
4. XSAVE detection is wrong in host_detect_local_cpu.
Please follow Intel64 SDM.  You should also check
OSXSAVE bit, similar to AVX.
5. Should we use the same feature name as in the Intel SDM, FXSR?
6. Please add PTA_XXX to proper -march=.
7. Shouldn't we add intrinsics for fxrstor and xrstor?


-- 
H.J.
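
Regarding point 4, a hedged sketch of the detection being asked for (my own
code, not part of the patch): XSAVE must only be reported as usable when both
the XSAVE and OSXSAVE bits of CPUID.1:ECX are set, mirroring the AVX handling.

  #include <cpuid.h>

  #ifndef bit_XSAVE
  # define bit_XSAVE   (1 << 26)
  #endif
  #ifndef bit_OSXSAVE
  # define bit_OSXSAVE (1 << 27)
  #endif

  static int
  xsave_usable_p (void)
  {
    unsigned int eax, ebx, ecx, edx;
    if (!__get_cpuid (1, &eax, &ebx, &ecx, &edx))
      return 0;
    /* Bit 26: the CPU implements XSAVE; bit 27: the OS has enabled it
       via CR4.OSXSAVE.  */
    return (ecx & bit_XSAVE) && (ecx & bit_OSXSAVE);
  }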


Re: [PATCH] Rs6000 infrastructure cleanup (switches), revised patch #2f

2012-10-19 Thread Eric Botcazou
> The patch builds in 32-bit on my powerpc64-linux system (which showed up
> Eric's problem).  I can't test the patch for Dominique's problem, since it
> needs a non-linux system to show up.  I verified the debug problem:
> 
> Eric and Dominique, could you try it on your respective systems?  Thanks in
> advance.

It works for me, thanks.

-- 
Eric Botcazou


Fix array bound niter estimate (PR middle-end/54937)

2012-10-19 Thread Jan Hubicka
Hi,
this patch fixes an off-by-one error in the testcase attached.  The problem is
that the dominance based test used by record_estimate to check whether the
given statement must be executed in the last iteration of the loop is wrong:
it ignores the side effects of other statements that may terminate the program.
It also does not work for multiple exits, as exercised by the cunroll-2.c
testcase.

This patch takes the simple approach of computing the set of all statements
that must be executed in the last iteration the first time record_estimate is
called.  The set is computed conservatively by walking the header BB and its
single successors (possibly diving into nested loops), stopping at the first
BB with multiple exits.

A better result could be computed by
1) estimating which loops are known to be finite,
2) inserting fake edges for all infinite loops and all statements with side
   effects that may terminate the execution, and
3) using the post-dominance info.
But that seems too expensive for something that is executed several times per
function compilation.  In fact I am thinking about making record_estimate
happen at ivcanon time only and then making loop_max_iterations and
loop_estimated_iterations simply return the bound instead of trying to
recompute it all the time.

Still, I do not see us caring about a very precise bound for loops with complex
control flow, since we do not really want to completely unroll them, and
elsewhere the off-by-one error in the conservative direction for the iteration
estimate is not a big deal.

Bootstrapped/regtested x86_64-linux, seems sane?

Honza

PR middle-end/54937
* tree-ssa-loop-niter.c (compute_stmts_executed_last_iteration): New
function.
(record_estimate): Use it; remove confused dominance test.
(estimate_numbers_of_iterations_loop): Free 
stmts_executed_last_iteration.
* gcc.c-torture/execute/pr54937.c: New testcase.
* testsuite/gcc.dg/tree-ssa/cunroll-2.c: Tighten.
Index: tree-ssa-loop-niter.c
===
*** tree-ssa-loop-niter.c   (revision 192537)
--- tree-ssa-loop-niter.c   (working copy)
*** record_niter_bound (struct loop *loop, d
*** 2523,2528 
--- 2523,2580 
  loop->nb_iterations_estimate = loop->nb_iterations_upper_bound;
  }
  
+ /* Compute pointer set of statements in currently analyzed loop that are known
+to be executed at last iteration and store it into AUX.  
+We do very simple job of recording statements in the superblock
+starting in loop header until reaching first statement with side effect.
+ 
+We can compute post-dominance after inserting fake edges to CFG
+for all statements possibly terminating CFG and for all possibly
+infinite subloops, but we really care about precise estimates
+of simple loops with little control flow in it.  */
+ static void
+ compute_stmts_executed_last_iteration (struct loop *loop)
+ {
+   basic_block bb = loop->header;
+   pointer_set_t *visited = NULL;
+   gimple_stmt_iterator gsi;
+   pointer_set_t *stmts_executed_last_iteration;
+ 
+   loop->aux = stmts_executed_last_iteration = pointer_set_create ();
+   while (true)
+ {
+   for (gsi = gsi_start_bb (loop->header); !gsi_end_p (gsi); gsi_next 
(&gsi))
+   {
+ if (gimple_has_side_effects (gsi_stmt (gsi)))
+   {
+ if (visited)
+   pointer_set_destroy (visited);
+ return;
+   }
+ pointer_set_insert (stmts_executed_last_iteration, gsi_stmt (gsi));
+   }
+   if (single_succ_p (bb))
+   {
+ /* Deal with the rare case that there is an infinite loop inside the
+loop examined.  */
+ if (!visited)
+   {
+   visited = pointer_set_create ();
+   pointer_set_insert (visited, bb);
+   }
+ bb = single_succ_edge (bb)->dest;
+   if (pointer_set_insert (visited, bb))
+   break;
+   }
+   else
+   break;
+ }
+   if (visited)
+ pointer_set_destroy (visited);
+   return;
+ }
+ 
+ 
  /* Records that AT_STMT is executed at most BOUND + 1 times in LOOP.  IS_EXIT
 is true if the loop is exited immediately after STMT, and this exit
 is taken at last when the STMT is executed BOUND + 1 times.
*** record_estimate (struct loop *loop, tree
*** 2535,2541 
 gimple at_stmt, bool is_exit, bool realistic, bool upper)
  {
double_int delta;
-   edge exit;
  
if (dump_file && (dump_flags & TDF_DETAILS))
  {
--- 2587,2592 
*** record_estimate (struct loop *loop, tree
*** 2573,2586 
   If at_stmt is an exit or dominates the single exit from the loop,
   then the loop latch is executed at most BOUND times, otherwise
   it can be executed BOUND + 1 times.  */
!   exit = single_exit (loop);
!   if (is_exit
!   || (exit != NULL
! && dominated_by_p (CDI_DOMINATORS,
!  
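
An illustration of the problem being fixed (my own example, not the committed
pr54937.c testcase): the store to a[i] alone must not be used to prove the
latch runs at most three times, because the call before it may terminate the
program before the would-be out-of-bounds store is reached.

  extern void may_exit (void);  /* has side effects; may call exit () */

  int a[3];

  void
  f (int n)
  {
    int i;
    for (i = 0; i < n; i++)
      {
        may_exit ();  /* can stop the program in the "extra" iteration */
        a[i] = 0;     /* the array bound only limits i if this stmt executes */
      }
  }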

Re: [PATCH, ARM] Fix PR target/54892 - [4.7/4.8 Regression], ICE in extract_insn, at recog.c:2123

2012-10-19 Thread Zhenqiang Chen
On 19 October 2012 14:53, Ramana Radhakrishnan
 wrote:
> On Fri, Oct 19, 2012 at 7:46 AM, Zhenqiang Chen
>  wrote:
>> Hi,
>>
>> In function arm_expand_compare_and_swap, oldval is converted to SImode
>> when its "mode" is QImode/HImode. After "FALLTHRU" to "case SImode",
>> we should use "SImode", other than "mode" (which is QImode/HImode).
>> And INSN atomic_compare_and_swap_1 expects the operand in
>> SImode.
>>
>> No make check regression.
>>
>> Is it OK for 4.7 and trunk?
>
>  Makes sense , OK for both.
>

Thanks. Committed to 4.7 and trunk.

-Zhenqiang


Re: Fix array bound niter estimate (PR middle-end/54937)

2012-10-19 Thread Richard Biener
On Fri, 19 Oct 2012, Jan Hubicka wrote:

> Hi,
> this patch fixes off-by-one error in the testcase attached.  The problem is 
> that
> dominance based test used by record_estimate to check whether the given 
> statement
> must be executed at last iteration of the loop is wrong ignoring the side 
> effect
> of other statements that may terminate the program.
> It also does not work for mulitple exits as excercised by cunroll-2.c 
> testcase.
> 
> This patch makes simple approach of computing set of all statements that must
> by executed last iteration first time record_estimate is executed this way.
> The set is computed conservatively walking header BB and its signle successors
> (possibly diving into nested loop) stopping on first BB with multiple exits.
> 
> Better result can be computed by
> 1) estimating what loops are known to be finite
> 2) inserting fake edges for all infinite loop and all statements with side 
> effect
>that may terminate the execution
> 3) using the post dominance info.

would using post-dom info even work?  That only says that _if_ the
dominated stmt executed then it came through the dominator.  It
doesn't deal with functions that may not return.

> But that seems too expensive for something that is executed several times per
> function compilation. In fact I am thinking about making recrod-estimate to 
> happen
> at ivcanon time only and then making the loop_max_iterations and 
> loop_estimated_iterations
> to simply return the bound instead of trying to recompute it all the time.
> 
> Still I do not see us to care about very precise bound for loops having 
> complex
> control flow since we do not really want to completely unroll them and 
> elsewhere
> the off-by-one error in conservative direction for iteration estimate is not 
> big
> deal.
> 
> Bootstrapped/regtested x86_64-linux, seems sane?
> 
> Honza
> 
>   PR middle-end/54937
>   * tree-ssa-loop-niter.c (compute_stmts_executed_last_iteration): New
>   function.
>   (record_estimate): Use it; remove confused dominance test.
>   (estimate_numbers_of_iterations_loop): Free 
> stmts_executed_last_iteration.
>   * gcc.c-torture/execute/pr54937.c: Ne wtestcase.
>   * testsuite/gcc.dg/tree-ssa/cunroll-2.c: Tighten.
> Index: tree-ssa-loop-niter.c
> ===
> *** tree-ssa-loop-niter.c (revision 192537)
> --- tree-ssa-loop-niter.c (working copy)
> *** record_niter_bound (struct loop *loop, d
> *** 2523,2528 
> --- 2523,2580 
>   loop->nb_iterations_estimate = loop->nb_iterations_upper_bound;
>   }
>   
> + /* Compute pointer set of statements in currently analyzed loop that are 
> known
> +to be executed at last iteration and store it into AUX.  
> +We do very simple job of recording statements in the superblock
> +starsting in loop header until reaching first statement with side effect.
> + 
> +We can compute post-dominance after inserting fake edges to CFG
> +for all statements possibly terminating CFG and for all possibly
> +infinite subloops, but we really care about precise estimates
> +of simple loops with little control flow in it.  */
> + static void
> + compute_stmts_executed_last_iteration (struct loop *loop)
> + {
> +   basic_block bb = loop->header;
> +   pointer_set_t *visited = NULL;
> +   gimple_stmt_iterator gsi;
> +   pointer_set_t *stmts_executed_last_iteration;
> + 
> +   loop->aux = stmts_executed_last_iteration = pointer_set_create ();
> +   while (true)
> + {
> +   for (gsi = gsi_start_bb (loop->header); !gsi_end_p (gsi); gsi_next 
> (&gsi))
> + {
> +   if (gimple_has_side_effects (gsi_stmt (gsi)))
> + {
> +   if (visited)
> + pointer_set_destroy (visited);
> +   return;
> + }
> +   pointer_set_insert (stmts_executed_last_iteration, gsi_stmt (gsi));
> + }
> +   if (single_succ_p (bb))
> + {
> +   /* Deal with the rare case that there is an infintie loop inside the
> +  loop examined.  */
> +   if (!visited)
> + {
> +   visited = pointer_set_create ();
> +   pointer_set_insert (visited, bb);
> + }
> +   bb = single_succ_edge (bb)->dest;
> +   if (pointer_set_insert (visited, bb))
> + break;
> + }
> +   else
> + break;
> + }
> +   if (visited)
> + pointer_set_destroy (visited);
> +   return;
> + }
> + 
> + 
>   /* Records that AT_STMT is executed at most BOUND + 1 times in LOOP.  
> IS_EXIT
>  is true if the loop is exited immediately after STMT, and this exit
>  is taken at last when the STMT is executed BOUND + 1 times.
> *** record_estimate (struct loop *loop, tree
> *** 2535,2541 
>gimple at_stmt, bool is_exit, bool realistic, bool upper)
>   {
> double_int delta;
> -   edge exit;
>   
> if (dump_file && (dump_flags & TDF_DETAILS))
>   {

[PATCH] Fix PR54976

2012-10-19 Thread Richard Biener

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk
and branch.

Richard.

2012-10-19  Richard Biener  

PR tree-optimization/54976
* tree-vect-stmts.c (get_vectype_for_scalar_type_and_size):
Robustify against odd inner_mode inputs.

Index: gcc/tree-vect-stmts.c
===
*** gcc/tree-vect-stmts.c   (revision 192605)
--- gcc/tree-vect-stmts.c   (working copy)
*** get_vectype_for_scalar_type_and_size (tr
*** 6082,6097 
   When the component mode passes the above test simply use a type
   corresponding to that mode.  The theory is that any use that
   would cause problems with this will disable vectorization anyway.  */
!   if (!SCALAR_FLOAT_TYPE_P (scalar_type)
!   && !INTEGRAL_TYPE_P (scalar_type)
!   && !POINTER_TYPE_P (scalar_type))
  scalar_type = lang_hooks.types.type_for_mode (inner_mode, 1);
  
/* We can't build a vector type of elements with alignment bigger than
   their size.  */
!   if (nbytes < TYPE_ALIGN_UNIT (scalar_type))
  scalar_type = lang_hooks.types.type_for_mode (inner_mode, 1);
  
/* If no size was supplied use the mode the target prefers.   Otherwise
   lookup a vector mode of the specified size.  */
if (size == 0)
--- 6082,6102 
   When the component mode passes the above test simply use a type
   corresponding to that mode.  The theory is that any use that
   would cause problems with this will disable vectorization anyway.  */
!   else if (!SCALAR_FLOAT_TYPE_P (scalar_type)
!  && !INTEGRAL_TYPE_P (scalar_type)
!  && !POINTER_TYPE_P (scalar_type))
  scalar_type = lang_hooks.types.type_for_mode (inner_mode, 1);
  
/* We can't build a vector type of elements with alignment bigger than
   their size.  */
!   else if (nbytes < TYPE_ALIGN_UNIT (scalar_type))
  scalar_type = lang_hooks.types.type_for_mode (inner_mode, 1);
  
+   /* If we felt back to using the mode fail if there was
+  no scalar type for it.  */
+   if (scalar_type == NULL_TREE)
+ return NULL_TREE;
+ 
/* If no size was supplied use the mode the target prefers.   Otherwise
   lookup a vector mode of the specified size.  */
if (size == 0)


Re: wide int patch #7: Replacement of INT_CST_LT and INT_CST_LT_UNSIGNED

2012-10-19 Thread Kenneth Zadeck


On 10/19/2012 04:22 AM, Richard Biener wrote:

On Thu, Oct 18, 2012 at 7:32 PM, Kenneth Zadeck
 wrote:

This patch replaces all instances of  INT_CST_LT and INT_CST_LT_UNSIGNED
with
the new functions tree_int_cst_lts_p and tree_int_cst_ltu_p.   With the new
implementation of int_cst these functions will be too big to do inline.

These new function names are extremely confusing given that we already
have tree_int_cst_lt which does the right thing based on the signedness
of the INTEGER_CST trees.

The whole point of the macros was to be inlined and you break that.  That is,
for example

if (unsignedp && unsignedp0)
 {
- min_gt = INT_CST_LT_UNSIGNED (primop1, minval);
- max_gt = INT_CST_LT_UNSIGNED (primop1, maxval);
- min_lt = INT_CST_LT_UNSIGNED (minval, primop1);
- max_lt = INT_CST_LT_UNSIGNED (maxval, primop1);
+ min_gt = tree_int_cst_ltu_p (primop1, minval);
+ max_gt = tree_int_cst_ltu_p (primop1, maxval);
+ min_lt = tree_int_cst_ltu_p (minval, primop1);
+ max_lt = tree_int_cst_ltu_p (maxval, primop1);
 }
else
 {
- min_gt = INT_CST_LT (primop1, minval);
- max_gt = INT_CST_LT (primop1, maxval);
- min_lt = INT_CST_LT (minval, primop1);
- max_lt = INT_CST_LT (maxval, primop1);
+ min_gt = tree_int_cst_lts_p (primop1, minval);
...

could have just been

 min_gt = tree_int_cst_lt (primop1, minval);


without any sign check.

So if you think you need to kill the inlined variants please use the existing
tree_int_cst_lt instead.
no, they could not have.   tree_int_cst_lt uses the signedness of the 
type to determine how to do the comparison.  These two functions, as 
the macros they replace, force the comparison to be done independent of 
the signedness of the type.


I do not know why we need to do this.  I am just applying a plug 
compatible replacement here. I did not write this code, but I do not 
think that i can just do as you say here.


Kenny


Thanks,
Richard.


This is a small patch that has no prerequisites.

Tested on x86-64.

kenny




Re: wide int patch #7: Replacement of INT_CST_LT and INT_CST_LT_UNSIGNED

2012-10-19 Thread Richard Biener
On Fri, Oct 19, 2012 at 12:59 PM, Kenneth Zadeck
 wrote:
>
> On 10/19/2012 04:22 AM, Richard Biener wrote:
>>
>> On Thu, Oct 18, 2012 at 7:32 PM, Kenneth Zadeck
>>  wrote:
>>>
>>> This patch replaces all instances of  INT_CST_LT and INT_CST_LT_UNSIGNED
>>> with
>>> the new functions tree_int_cst_lts_p and tree_int_cst_ltu_p.   With the
>>> new
>>> implementation of int_cst these functions will be too big to do inline.
>>
>> These new function names are extremely confusing given that we already
>> have tree_int_cst_lt which does the right thing based on the signedness
>> of the INTEGER_CST trees.
>>
>> The whole point of the macros was to be inlined and you break that.  That
>> is,
>> for example
>>
>> if (unsignedp && unsignedp0)
>>  {
>> - min_gt = INT_CST_LT_UNSIGNED (primop1, minval);
>> - max_gt = INT_CST_LT_UNSIGNED (primop1, maxval);
>> - min_lt = INT_CST_LT_UNSIGNED (minval, primop1);
>> - max_lt = INT_CST_LT_UNSIGNED (maxval, primop1);
>> + min_gt = tree_int_cst_ltu_p (primop1, minval);
>> + max_gt = tree_int_cst_ltu_p (primop1, maxval);
>> + min_lt = tree_int_cst_ltu_p (minval, primop1);
>> + max_lt = tree_int_cst_ltu_p (maxval, primop1);
>>  }
>> else
>>  {
>> - min_gt = INT_CST_LT (primop1, minval);
>> - max_gt = INT_CST_LT (primop1, maxval);
>> - min_lt = INT_CST_LT (minval, primop1);
>> - max_lt = INT_CST_LT (maxval, primop1);
>> + min_gt = tree_int_cst_lts_p (primop1, minval);
>> ...
>>
>> could have just been
>>
>>  min_gt = tree_int_cst_lt (primop1, minval);
>> 
>>
>> without any sign check.
>>
>> So if you think you need to kill the inlined variants please use the
>> existing
>> tree_int_cst_lt instead.
>
> no, they could not have.   tree_int_cst_lt uses the signedness of the type
> to determine how to do the comparison.These two functions, as the macros
> they replace, force the comparison to be done independent of the signedness
> of the type.

Well, looking at the surrounding code it's indeed non-obvious that this would
be a 1:1 transform.  But then they should not compare _trees_ but instead
compare double-ints (or wide-ints).

That said, I still think having a tree_int_cst_lt[us]_p function is wrong.
tree INTEGER_CSTs have a sign.  (apart from that opinion we have
tree_int_cst_lt and you introduce tree_int_cst_ltu_p - consistent
would be tree_int_cst_ltu).

> I do not know why we need to do this.  I am just applying a plug compatible
> replacement here. I did not write this code, but I do not think that i can
> just do as you say here.

So use the double-int interface in the places you substituted your new
tree predicates.  Yes, you'll have to touch that again when converting to
wide-int - but if those places really want to ignore the sign of the tree
they should not use a tree interface.

Richard.

> Kenny
>
>
>> Thanks,
>> Richard.
>>
>>> This is a small patch that has no prerequisites.
>>>
>>> Tested on x86-64.
>>>
>>> kenny
>
>
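
A hedged sketch of what using the double-int interface at such a call site
could look like (my assumption of the intended style; the method names are
from the current double_int class):

  double_int a = tree_to_double_int (primop1);
  double_int b = tree_to_double_int (minval);

  /* The signedness of the comparison is explicit and independent of the
     trees' types.  */
  bool lt = unsignedp ? a.ult (b) : a.slt (b);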


Re: wide int patch #7: Replacement of INT_CST_LT and INT_CST_LT_UNSIGNED

2012-10-19 Thread Kenneth Zadeck


On 10/19/2012 07:13 AM, Richard Biener wrote:

On Fri, Oct 19, 2012 at 12:59 PM, Kenneth Zadeck
 wrote:

On 10/19/2012 04:22 AM, Richard Biener wrote:

On Thu, Oct 18, 2012 at 7:32 PM, Kenneth Zadeck
 wrote:

This patch replaces all instances of  INT_CST_LT and INT_CST_LT_UNSIGNED
with
the new functions tree_int_cst_lts_p and tree_int_cst_ltu_p.   With the
new
implementation of int_cst these functions will be too big to do inline.

These new function names are extremely confusing given that we already
have tree_int_cst_lt which does the right thing based on the signedness
of the INTEGER_CST trees.

The whole point of the macros was to be inlined and you break that.  That
is,
for example

 if (unsignedp && unsignedp0)
  {
- min_gt = INT_CST_LT_UNSIGNED (primop1, minval);
- max_gt = INT_CST_LT_UNSIGNED (primop1, maxval);
- min_lt = INT_CST_LT_UNSIGNED (minval, primop1);
- max_lt = INT_CST_LT_UNSIGNED (maxval, primop1);
+ min_gt = tree_int_cst_ltu_p (primop1, minval);
+ max_gt = tree_int_cst_ltu_p (primop1, maxval);
+ min_lt = tree_int_cst_ltu_p (minval, primop1);
+ max_lt = tree_int_cst_ltu_p (maxval, primop1);
  }
 else
  {
- min_gt = INT_CST_LT (primop1, minval);
- max_gt = INT_CST_LT (primop1, maxval);
- min_lt = INT_CST_LT (minval, primop1);
- max_lt = INT_CST_LT (maxval, primop1);
+ min_gt = tree_int_cst_lts_p (primop1, minval);
...

could have just been

  min_gt = tree_int_cst_lt (primop1, minval);


without any sign check.

So if you think you need to kill the inlined variants please use the
existing
tree_int_cst_lt instead.

no, they could not have.   tree_int_cst_lt uses the signedness of the type
to determine how to do the comparison.These two functions, as the macros
they replace, force the comparison to be done independent of the signedness
of the type.

Well, looking at the surrounding code it's indeed non-obvious that this would
be a 1:1 transform.  But then they should not compare _trees_ but instead
compare double-ints (or wide-ints).

That said, I still think having a tree_int_cst_lt[us]_p function is wrong.
tree INTEGER_CSTs have a sign.  (apart from that opinion we have
tree_int_cst_lt and you introduce tree_int_cst_ltu_p - consistent
would be tree_int_cst_ltu).

This reply applies just as much to this patch as patch 6.
I morally agree with you 100%.  But the code does not agree with you.

On patch 6, there are about 450 places where we want to take the lower
hwi worth of bits out of an int cst.  Of those, only 5 places use the
function that allows the signedness to be passed in.   The rest make the
signedness decision based on the local code and completely ignore any
information in the type.  Of those 5 that do allow the signedness to
be passed in, only three of them actually pass it in based on the
signedness of the variable they are accessing.


I am sure that a lot of these are wrong.  But i could not tell you which 
are and which are not.
luckily, a lot of this will go away with the full wide-int code because 
i just do most of this math in the full precision so the issue never 
comes up.  But after i am finished, there will still be a fair number 
of places that do this.   (luckily, a large number of them are pulling 
the number out and comparing it to the precision of something, so this 
is likely to be harmless no matter how the code is written).


But to a large extent, you are shooting the messenger here, and not the
person who committed the crime.   I will be happy to add some comments
to point the clients of these to the one that looks at the type.   In
looking over the patch, the only obvious ones that could be changed are
the ones in tree-ssa-uninit.c and tree-vrp.c.  The one in tree-vrp.c
just looks like the person writing the code did not know about
tree_int_cst_lt and wrote the check out himself.  (i will fix this
in the tree-vrp patch that i am working on now.)  The one in
tree-ssa-uninit.c looks correct.


But beyond that, the rest are in the front ends, and so i think that this
is as good as you get out of me.


Kenny


I do not know why we need to do this.  I am just applying a plug compatible
replacement here. I did not write this code, but I do not think that i can
just do as you say here.

So use the double-int interface in the places you substituted your new
tree predicates.  Yes, you'll have to touch that again when converting to
wide-int - but if those places really want to ignore the sign of the tree
they should not use a tree interface.

Richard.


Kenny



Thanks,
Richard.


This is a small patch that has no prerequisites.

Tested on x86-64.

kenny






Re: wide int patch #7: Replacement of INT_CST_LT and INT_CST_LT_UNSIGNED

2012-10-19 Thread Richard Biener
On Fri, Oct 19, 2012 at 1:49 PM, Kenneth Zadeck
 wrote:
>
> On 10/19/2012 07:13 AM, Richard Biener wrote:
>>
>> On Fri, Oct 19, 2012 at 12:59 PM, Kenneth Zadeck
>>  wrote:
>>>
>>> On 10/19/2012 04:22 AM, Richard Biener wrote:

 On Thu, Oct 18, 2012 at 7:32 PM, Kenneth Zadeck
  wrote:
>
> This patch replaces all instances of  INT_CST_LT and
> INT_CST_LT_UNSIGNED
> with
> the new functions tree_int_cst_lts_p and tree_int_cst_ltu_p.   With the
> new
> implementation of int_cst these functions will be too big to do inline.

 These new function names are extremely confusing given that we already
 have tree_int_cst_lt which does the right thing based on the signedness
 of the INTEGER_CST trees.

 The whole point of the macros was to be inlined and you break that.
 That
 is,
 for example

  if (unsignedp && unsignedp0)
   {
 - min_gt = INT_CST_LT_UNSIGNED (primop1, minval);
 - max_gt = INT_CST_LT_UNSIGNED (primop1, maxval);
 - min_lt = INT_CST_LT_UNSIGNED (minval, primop1);
 - max_lt = INT_CST_LT_UNSIGNED (maxval, primop1);
 + min_gt = tree_int_cst_ltu_p (primop1, minval);
 + max_gt = tree_int_cst_ltu_p (primop1, maxval);
 + min_lt = tree_int_cst_ltu_p (minval, primop1);
 + max_lt = tree_int_cst_ltu_p (maxval, primop1);
   }
  else
   {
 - min_gt = INT_CST_LT (primop1, minval);
 - max_gt = INT_CST_LT (primop1, maxval);
 - min_lt = INT_CST_LT (minval, primop1);
 - max_lt = INT_CST_LT (maxval, primop1);
 + min_gt = tree_int_cst_lts_p (primop1, minval);
 ...

 could have just been

   min_gt = tree_int_cst_lt (primop1, minval);
 

 without any sign check.

 So if you think you need to kill the inlined variants please use the
 existing
 tree_int_cst_lt instead.
>>>
>>> no, they could not have.   tree_int_cst_lt uses the signedness of the
>>> type
>>> to determine how to do the comparison.These two functions, as the
>>> macros
>>> they replace, force the comparison to be done independent of the
>>> signedness
>>> of the type.
>>
>> Well, looking at the surrounding code it's indeed non-obvious that this
>> would
>> be a 1:1 transform.  But then they should not compare _trees_ but instead
>> compare double-ints (or wide-ints).
>>
>> That said, I still think having a tree_int_cst_lt[us]_p function is wrong.
>> tree INTEGER_CSTs have a sign.  (apart from that opinion we have
>> tree_int_cst_lt and you introduce tree_int_cst_ltu_p - consistent
>> would be tree_int_cst_ltu).
>
> This reply applies just as much to this patch as patch 6.
> I morally agree with you 100%.But the code does not agree with you.
>
> On patch 6, there are about 450 places where we want to take the lower hwi
> worth of bits out of a int cst.Of those, only 5 places use the function
> that allows the signedness to be passed in.   The rest make the signedness
> decision based on the local code and completely ignore any information in
> the type.Of those 5 that do allow the signedness to be passed in, only
> three of them actually pass it in based on the signedness of the variable
> they are accessing.
>
> I am sure that a lot of these are wrong.  But i could not tell you which are
> and which are not.
> luckily, a lot of this will go away with the full wide-int code because i
> just do most of this math in the full precision so the issue never comes up.
> But after i am finished, there will still be a fair number of places that do
> this.   (luckily, a large number of them are pulling the number out and
> comparing it to the precision of something, so this is likely to be harmless
> no matter how the code is written).
>
> But to a large extent, you are shooting the messenger here, and not person
> who committed the crime.   I will be happy to add some comments to point the
> clients of these to the one that looks at the type.   In looking over the
> patch, the only obvious ones that could be changed are the ones in
> tree-ssa-uninit.c and the tree-vrp.c. The one in tree-vrp.c just looks like
> that the person writing the code did not know about tree_int_cst_lt and
> wrote the check out our himself.  (i will fix this in the tree-vrp patch
> that i am working on now. The one in tree-ssa-uniunit looks correct.
>
> But beyond that, the rest are in the front ends and so i think that this as
> good as you get out of me.

Well, if you transform bogus (by moral standards) code into other
bogus code the whole point of your patch is to exchange the names
of a set of macros / functions to another set of macros / functions.

I see no point in that then.

Leave broken code as-is.  The more often you touch broken code
and just mangle it in some way the harder it gets to get to the
point that would maybe re

Re: wide int patch #7: Replacement of INT_CST_LT and INT_CST_LT_UNSIGNED

2012-10-19 Thread Kenneth Zadeck


On 10/19/2012 07:58 AM, Richard Biener wrote:

On Fri, Oct 19, 2012 at 1:49 PM, Kenneth Zadeck
 wrote:

On 10/19/2012 07:13 AM, Richard Biener wrote:

On Fri, Oct 19, 2012 at 12:59 PM, Kenneth Zadeck
 wrote:

On 10/19/2012 04:22 AM, Richard Biener wrote:

On Thu, Oct 18, 2012 at 7:32 PM, Kenneth Zadeck
 wrote:

This patch replaces all instances of  INT_CST_LT and
INT_CST_LT_UNSIGNED
with
the new functions tree_int_cst_lts_p and tree_int_cst_ltu_p.   With the
new
implementation of int_cst these functions will be too big to do inline.

These new function names are extremely confusing given that we already
have tree_int_cst_lt which does the right thing based on the signedness
of the INTEGER_CST trees.

The whole point of the macros was to be inlined and you break that.
That
is,
for example

  if (unsignedp && unsignedp0)
   {
- min_gt = INT_CST_LT_UNSIGNED (primop1, minval);
- max_gt = INT_CST_LT_UNSIGNED (primop1, maxval);
- min_lt = INT_CST_LT_UNSIGNED (minval, primop1);
- max_lt = INT_CST_LT_UNSIGNED (maxval, primop1);
+ min_gt = tree_int_cst_ltu_p (primop1, minval);
+ max_gt = tree_int_cst_ltu_p (primop1, maxval);
+ min_lt = tree_int_cst_ltu_p (minval, primop1);
+ max_lt = tree_int_cst_ltu_p (maxval, primop1);
   }
  else
   {
- min_gt = INT_CST_LT (primop1, minval);
- max_gt = INT_CST_LT (primop1, maxval);
- min_lt = INT_CST_LT (minval, primop1);
- max_lt = INT_CST_LT (maxval, primop1);
+ min_gt = tree_int_cst_lts_p (primop1, minval);
...

could have just been

   min_gt = tree_int_cst_lt (primop1, minval);


without any sign check.

So if you think you need to kill the inlined variants please use the
existing
tree_int_cst_lt instead.

no, they could not have.   tree_int_cst_lt uses the signedness of the
type
to determine how to do the comparison.These two functions, as the
macros
they replace, force the comparison to be done independent of the
signedness
of the type.

Well, looking at the surrounding code it's indeed non-obvious that this
would
be a 1:1 transform.  But then they should not compare _trees_ but instead
compare double-ints (or wide-ints).

That said, I still think having a tree_int_cst_lt[us]_p function is wrong.
tree INTEGER_CSTs have a sign.  (apart from that opinion we have
tree_int_cst_lt and you introduce tree_int_cst_ltu_p - consistent
would be tree_int_cst_ltu).

This reply applies just as much to this patch as patch 6.
I morally agree with you 100%.But the code does not agree with you.

On patch 6, there are about 450 places where we want to take the lower hwi
worth of bits out of a int cst.Of those, only 5 places use the function
that allows the signedness to be passed in.   The rest make the signedness
decision based on the local code and completely ignore any information in
the type.Of those 5 that do allow the signedness to be passed in, only
three of them actually pass it in based on the signedness of the variable
they are accessing.

I am sure that a lot of these are wrong.  But i could not tell you which are
and which are not.
luckily, a lot of this will go away with the full wide-int code because i
just do most of this math in the full precision so the issue never comes up.
But after i am finished, there will still be a fair number of places that do
this.   (luckily, a large number of them are pulling the number out and
comparing it to the precision of something, so this is likely to be harmless
no matter how the code is written).

But to a large extent, you are shooting the messenger here, and not person
who committed the crime.   I will be happy to add some comments to point the
clients of these to the one that looks at the type.   In looking over the
patch, the only obvious ones that could be changed are the ones in
tree-ssa-uninit.c and the tree-vrp.c. The one in tree-vrp.c just looks like
that the person writing the code did not know about tree_int_cst_lt and
wrote the check out our himself.  (i will fix this in the tree-vrp patch
that i am working on now. The one in tree-ssa-uniunit looks correct.

But beyond that, the rest are in the front ends and so i think that this as
good as you get out of me.

Well, if you transform bogus (by moral standards) code into other
bogus code the whole point of your patch is to exchange the names
of a set of macros / functions to another set of macros / functions.

I see no point in that then.

Leave broken code as-is.  The more often you touch broken code
and just mangle it in some way the harder it gets to get to the
point that would maybe reveal the real intent of the original code.

Sorry for the harsh words, but to take the example of
INT_CST_LT and INT_CST_LT_UNSIGNED -- those are remnants of
a world before double_ints existed.  All uses should have been
replaced by double_int usage by now; replacing them with something
tree-ish is the wrong way.  It might be 

Add IDENTIFIER_NODE to description of TARGET_ASM_NAMED_SECTION

2012-10-19 Thread Paulo Matos
As a followup to:
http://gcc.gnu.org/ml/gcc/2012-10/msg00276.html

2012-10-19  Paulo Matos
* tm.texi, tm.texi.in: Add IDENTIFIER_NODE as an alternative possibility
to possible values of decl.

Paulo Matos




doc.patch
Description: doc.patch
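
A hedged sketch of what the documented possibility means for a backend
(hypothetical target code, not part of the patch): the DECL argument of the
TARGET_ASM_NAMED_SECTION hook may be an IDENTIFIER_NODE rather than a *_DECL
node, so the hook should not assume DECL_P.

  static void
  example_asm_named_section (const char *name, unsigned int flags, tree decl)
  {
    (void) flags;
    fprintf (asm_out_file, "\t.section\t%s\n", name);
    /* Guard any DECL_* access; DECL may be just an identifier.  */
    if (decl != NULL_TREE && TREE_CODE (decl) == IDENTIFIER_NODE)
      fprintf (asm_out_file, "\t/* section requested for %s */\n",
               IDENTIFIER_POINTER (decl));
  }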


[PATCH] Fix PR54981

2012-10-19 Thread Richard Biener

This fixes PR54981: loop distribution was depending on DEBUG_STMT
uses.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk.

Richard.

2012-10-19  Richard Guenther  

PR tree-optimization/54981
* tree-loop-distribution.c (ssa_name_has_uses_outside_loop_p):
Do not consider debug stmts as uses.

* gcc.dg/pr54981.c: New testcase.

Index: gcc/tree-loop-distribution.c
===
*** gcc/tree-loop-distribution.c(revision 192560)
--- gcc/tree-loop-distribution.c(working copy)
*** ssa_name_has_uses_outside_loop_p (tree d
*** 125,132 
use_operand_p use_p;
  
FOR_EACH_IMM_USE_FAST (use_p, imm_iter, def)
! if (loop != loop_containing_stmt (USE_STMT (use_p)))
!   return true;
  
return false;
  }
--- 125,136 
use_operand_p use_p;
  
FOR_EACH_IMM_USE_FAST (use_p, imm_iter, def)
! {
!   gimple use_stmt = USE_STMT (use_p);
!   if (!is_gimple_debug (use_stmt)
! && loop != loop_containing_stmt (use_stmt))
!   return true;
! }
  
return false;
  }
Index: gcc/testsuite/gcc.dg/pr54981.c
===
*** gcc/testsuite/gcc.dg/pr54981.c  (revision 0)
--- gcc/testsuite/gcc.dg/pr54981.c  (working copy)
***
*** 0 
--- 1,15 
+ /* { dg-do compile } */
+ /* { dg-options "-O -ftree-loop-distribute-patterns -fcompare-debug" } */
+ 
+ extern void bar(unsigned *, char *);
+ 
+ void foo(char *s)
+ {
+   unsigned i;
+   char t[2];
+ 
+   bar(&i, t);
+ 
+   for (i = 0; i < 2; i++)
+ s[i] = t[i];
+ }


Re: wide int patch #7: Replacement of INT_CST_LT and INT_CST_LT_UNSIGNED

2012-10-19 Thread Richard Biener
On Fri, Oct 19, 2012 at 2:12 PM, Kenneth Zadeck
 wrote:
>
> On 10/19/2012 07:58 AM, Richard Biener wrote:
>>
>> On Fri, Oct 19, 2012 at 1:49 PM, Kenneth Zadeck
>>  wrote:
>>>
>>> On 10/19/2012 07:13 AM, Richard Biener wrote:

 On Fri, Oct 19, 2012 at 12:59 PM, Kenneth Zadeck
  wrote:
>
> On 10/19/2012 04:22 AM, Richard Biener wrote:
>>
>> On Thu, Oct 18, 2012 at 7:32 PM, Kenneth Zadeck
>>  wrote:
>>>
>>> This patch replaces all instances of  INT_CST_LT and
>>> INT_CST_LT_UNSIGNED
>>> with
>>> the new functions tree_int_cst_lts_p and tree_int_cst_ltu_p.   With
>>> the
>>> new
>>> implementation of int_cst these functions will be too big to do
>>> inline.
>>
>> These new function names are extremely confusing given that we already
>> have tree_int_cst_lt which does the right thing based on the
>> signedness
>> of the INTEGER_CST trees.
>>
>> The whole point of the macros was to be inlined and you break that.
>> That
>> is,
>> for example
>>
>>   if (unsignedp && unsignedp0)
>>{
>> - min_gt = INT_CST_LT_UNSIGNED (primop1, minval);
>> - max_gt = INT_CST_LT_UNSIGNED (primop1, maxval);
>> - min_lt = INT_CST_LT_UNSIGNED (minval, primop1);
>> - max_lt = INT_CST_LT_UNSIGNED (maxval, primop1);
>> + min_gt = tree_int_cst_ltu_p (primop1, minval);
>> + max_gt = tree_int_cst_ltu_p (primop1, maxval);
>> + min_lt = tree_int_cst_ltu_p (minval, primop1);
>> + max_lt = tree_int_cst_ltu_p (maxval, primop1);
>>}
>>   else
>>{
>> - min_gt = INT_CST_LT (primop1, minval);
>> - max_gt = INT_CST_LT (primop1, maxval);
>> - min_lt = INT_CST_LT (minval, primop1);
>> - max_lt = INT_CST_LT (maxval, primop1);
>> + min_gt = tree_int_cst_lts_p (primop1, minval);
>> ...
>>
>> could have just been
>>
>>min_gt = tree_int_cst_lt (primop1, minval);
>> 
>>
>> without any sign check.
>>
>> So if you think you need to kill the inlined variants please use the
>> existing
>> tree_int_cst_lt instead.
>
> no, they could not have.   tree_int_cst_lt uses the signedness of the
> type
> to determine how to do the comparison.These two functions, as the
> macros
> they replace, force the comparison to be done independent of the
> signedness
> of the type.

 Well, looking at the surrounding code it's indeed non-obvious that this
 would
 be a 1:1 transform.  But then they should not compare _trees_ but
 instead
 compare double-ints (or wide-ints).

 That said, I still think having a tree_int_cst_lt[us]_p function is
 wrong.
 tree INTEGER_CSTs have a sign.  (apart from that opinion we have
 tree_int_cst_lt and you introduce tree_int_cst_ltu_p - consistent
 would be tree_int_cst_ltu).
>>>
>>> This reply applies just as much to this patch as patch 6.
>>> I morally agree with you 100%.But the code does not agree with you.
>>>
>>> On patch 6, there are about 450 places where we want to take the lower
>>> hwi
>>> worth of bits out of a int cst.Of those, only 5 places use the
>>> function
>>> that allows the signedness to be passed in.   The rest make the
>>> signedness
>>> decision based on the local code and completely ignore any information in
>>> the type.Of those 5 that do allow the signedness to be passed in,
>>> only
>>> three of them actually pass it in based on the signedness of the variable
>>> they are accessing.
>>>
>>> I am sure that a lot of these are wrong.  But i could not tell you which
>>> are
>>> and which are not.
>>> luckily, a lot of this will go away with the full wide-int code because i
>>> just do most of this math in the full precision so the issue never comes
>>> up.
>>> But after i am finished, there will still be a fair number of places that
>>> do
>>> this.   (luckily, a large number of them are pulling the number out and
>>> comparing it to the precision of something, so this is likely to be
>>> harmless
>>> no matter how the code is written).
>>>
>>> But to a large extent, you are shooting the messenger here, and not
>>> person
>>> who committed the crime.   I will be happy to add some comments to point
>>> the
>>> clients of these to the one that looks at the type.   In looking over the
>>> patch, the only obvious ones that could be changed are the ones in
>>> tree-ssa-uninit.c and the tree-vrp.c. The one in tree-vrp.c just looks
>>> like
>>> that the person writing the code did not know about tree_int_cst_lt and
>>> wrote the check out our himself.  (i will fix this in the tree-vrp patch
>>> that i am working on now. The one in tree-ssa-uniunit looks correct.
>>>
>>> But beyond that, the rest are in the front ends and so i think that this
>>> as

Re: Add IDENTIFIER_NODE to description of TARGET_ASM_NAMED_SECTION

2012-10-19 Thread Ian Lance Taylor
On Fri, Oct 19, 2012 at 5:48 AM, Paulo Matos  wrote:
>
> 2012-10-19  Paulo Matos
> * tm.texi, tm.texi.in: Add IDENTIFIER_NODE as an alternative 
> possibility
> to possible values of decl.

This is OK.

Thanks.

Ian


Re: [PATCH] Rs6000 infrastructure cleanup (switches), revised patch #2f

2012-10-19 Thread Dominique Dhumieres
> > Eric and Dominique could you try it on your respective systems?  Thanks in
> > advance.
>
> It works for me, thanks.

It does not work for me:

In file included from ../../work/gcc/config/rs6000/rs6000.c:1475:0:
../../work/gcc/config/rs6000/rs6000-cpus.def:54:0: error: "MASK_STRICT_ALIGN" 
redefined [-Werror]
 #define MASK_STRICT_ALIGN 0
 ^
In file included from ./tm.h:15:0,
 from ../../work/gcc/config/rs6000/rs6000.c:24:
../../work/gcc/config/rs6000/rs6000.h:490:0: note: this is the location of the 
previous definition
 #define MASK_STRICT_ALIGN  OPTION_MASK_STRICT_ALIGN
 ^
cc1plus: all warnings being treated as errors

I had to use the following modified patch

--- ../_gcc_clean/gcc/config/rs6000/rs6000-cpus.def 2012-10-18 
00:34:51.0 +0200
+++ ../work/gcc/config/rs6000/rs6000-cpus.def   2012-10-19 14:14:59.0 
+0200
@@ -46,6 +46,17 @@
 
 #define POWERPC_7400_MASK  (OPTION_MASK_PPC_GFXOPT | OPTION_MASK_ALTIVEC)
 
+/* Deal with ports that do not have -mstrict-align.  */
+#ifdef OPTION_MASK_STRICT_ALIGN
+#define OPTION_MASK_STRICT_ALIGN_OPTIONAL OPTION_MASK_STRICT_ALIGN
+#else
+#define OPTION_MASK_STRICT_ALIGN 0
+#define OPTION_MASK_STRICT_ALIGN_OPTIONAL 0
+#ifndef MASK_STRICT_ALIGN
+#define MASK_STRICT_ALIGN 0
+#endif
+#endif
+
 /* Mask of all options to set the default isa flags based on -mcpu=.  */
 #define POWERPC_MASKS  (OPTION_MASK_ALTIVEC\
 | OPTION_MASK_CMPB \
@@ -64,7 +75,7 @@
 | OPTION_MASK_PPC_GPOPT\
 | OPTION_MASK_RECIP_PRECISION  \
 | OPTION_MASK_SOFT_FLOAT   \
-| OPTION_MASK_STRICT_ALIGN \
+| OPTION_MASK_STRICT_ALIGN_OPTIONAL\
 | OPTION_MASK_VSX)
 
 #endif

in order to recompile gcc (not a clean bootstrap), i.e.,

+#define OPTION_MASK_STRICT_ALIGN_OPTIONAL 0

and

+#ifndef MASK_STRICT_ALIGN
+#define MASK_STRICT_ALIGN 0
+#endif

I think this patch could be simplified as

+#ifndef OPTION_MASK_STRICT_ALIGN
+#define OPTION_MASK_STRICT_ALIGN 0
+#ifndef MASK_STRICT_ALIGN
+#define MASK_STRICT_ALIGN 0
+#endif
+#endif
#define OPTION_MASK_STRICT_ALIGN_OPTIONAL OPTION_MASK_STRICT_ALIGN

Apparently

 #define MASK_STRICT_ALIGN  OPTION_MASK_STRICT_ALIGN

defines MASK_STRICT_ALIGN (to which value?) even if
OPTION_MASK_STRICT_ALIGN is not defined.

Dominique




Re: [PATCH] Fix PR54976

2012-10-19 Thread Marek Polacek
On Fri, Oct 19, 2012 at 12:26:13PM +0200, Richard Biener wrote:
> +   /* If we felt back to using the mode fail if there was
> +  no scalar type for it.  */
> +   if (scalar_type == NULL_TREE)
> + return NULL_TREE;

s/felt/fell/ ;)

Marek


Re: Fix array bound niter estimate (PR middle-end/54937)

2012-10-19 Thread Jan Hubicka
> On Fri, 19 Oct 2012, Jan Hubicka wrote:
> 
> > Hi,
> > this patch fixes off-by-one error in the testcase attached.  The problem is 
> > that
> > dominance based test used by record_estimate to check whether the given 
> > statement
> > must be executed at last iteration of the loop is wrong ignoring the side 
> > effect
> > of other statements that may terminate the program.
> > It also does not work for mulitple exits as excercised by cunroll-2.c 
> > testcase.
> > 
> > This patch makes simple approach of computing set of all statements that 
> > must
> > by executed last iteration first time record_estimate is executed this way.
> > The set is computed conservatively walking header BB and its signle 
> > successors
> > (possibly diving into nested loop) stopping on first BB with multiple exits.
> > 
> > Better result can be computed by
> > 1) estimating what loops are known to be finite
> > 2) inserting fake edges for all infinite loop and all statements with side 
> > effect
> >that may terminate the execution
> > 3) using the post dominance info.
> 
> would using post-dom info even work?  That only says that _if_ the
> dominated stmt executed then it came through the dominator.  It
> doesn't deal with functions that may not return.

With fake edges inserted it will.  We do have code for that, used in profiling,
which also needs this stronger definition of the CFG.
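
For illustration, a rough sketch of that scheme (a sketch only, not part of
the patch; it uses the existing cfganal/dominance helpers and ignores the
infinite-loop case, which would additionally need something like
connect_infinite_loops_to_exit):

  /* Return true if STMT's block lies on every path from LOOP's header to
     the exit, treating noreturn calls as exits by giving them fake edges
     before computing post-dominators.  */
  static bool
  stmt_on_every_path_p (struct loop *loop, gimple stmt)
  {
    bool ret;
    add_noreturn_fake_exit_edges ();
    calculate_dominance_info (CDI_POST_DOMINATORS);
    ret = dominated_by_p (CDI_POST_DOMINATORS, loop->header, gimple_bb (stmt));
    free_dominance_info (CDI_POST_DOMINATORS);
    remove_fake_exit_edges ();
    return ret;
  }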
> 
> What about the conservative variant of simply
> 
>   else
> delta = double_int_one;

I think it would be a bad idea: it makes us completely unroll one iteration
too many, which bloats code for no benefit.  No optimization cancels the path in
the CFG because of the undefined effect, and thus the code will be output (unless
someone smarter, like VRP, cleans up later, but that is more an exception than the
rule).
> 
> ?  I don't like all the code you add, nor the use of ->aux.

Neither do I, really, but what are the alternatives?

My first implementation simply checked that the stmt is in the loop header and
walked up to the beginning of the basic block looking for side effects.  Then I
became worried about the possibility of gigantic basic blocks with many array
stores within the loop, so I decided to record the reachable statements instead
of repeating the walk.
Loop count estimation is recursive (i.e. it dives into inner loops), thus I
ended up using AUX.  I can for sure put this in a separate structure or add an
extra reference argument passed over the whole call stack, but there are quite
a few functions that can lead to record_estimate.  (I have nothing against that
alternative, however, if AUX looks ugly.)
> 
> > i_bound += delta;
> 
> Another alternative would be to not use i_bound for the
> strong upper bound but only the estimate (thus conservatively
> use i_bound + 1 for the upper bound if !is_exit).

We can not derive a realistic estimate based on this: the loop may exit much
earlier.  We can only lower the estimate if it is already there and greater
than this bound.  This can probably happen with profile feedback and I can
implement it later; I do not think it is terribly important though.

Honza


Plan to merge LRA.

2012-10-19 Thread Vladimir Makarov
  I'd like to commit the LRA patches on Sunday.  I got a lot of
feedback about LRA over the last few weeks.  All patches besides the RA part
were approved by Jeff Law.  Steven Bosscher and I solved the LRA speed
problem on the famous PR54146.  Now Steven does not object to the LRA merge.


  I got invaluable reviews of LRA parts from Richard Sandiford.  I
guess Richard has thought deeply about reload for a long time.  His
proposals were very useful and made the LRA code much clearer.


  I'd like to thank all these people for their help and spending a lot 
of time on reviews and LRA work.


  So my plan is to merge trunk into the LRA branch first and, if it looks
ok, commit LRA to the trunk on Sunday, to be ready to solve problems (if
there are any) starting Monday morning EST.  The LRA merge will affect only
the x86/x86-64 target.  If you suspect any problem with LRA and get no
quick response from me, you can easily switch it off by putting false
instead of true in function i386.c::lra_p and continue your work until
the problem is solved by me.
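
For illustration, roughly what that toggle looks like (the name ix86_lra_p
below is my guess at the spelling; this is only a sketch of the actual
definition in i386.c):

  /* Target hook deciding whether to use LRA instead of reload.
     Returning false here falls back to the classic reload pass.  */
  static bool
  ix86_lra_p (void)
  {
    return true;  /* change to false to disable LRA while debugging */
  }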


  The work on LRA will be continued on the branch.  There are still few 
proposals non-critical for x86/x86-64 which will be implemented on the 
branch.  After some stabilization of LRA branch for other targets, it 
would be nice if target maintainers look at the branch. It will make 
transitions of more targets to LRA more smoothly when development of 
gcc4.9 starts.


Vlad.



[Patch, Fortran] PR 54224: [4.8 Regression] Bogus -Wunused-function warning with static function

2012-10-19 Thread Janus Weil
Hi all,

this regression is haunting me a lot in real-world code. It's 'just' a
bogus warning, but it can really swamp the output for modules with
many private procedures.

I have found a fix that is rather trivial in the sense that it's a
one-liner. However, it may not be as trivial conceptually (in
particular I'm not 100% sure what caused this regression in the first
place; it might have been the commit for PR40973).

Anyway, it regtests cleanly and fixes the problem as advertised. Ok for trunk?

Cheers,
Janus


2012-10-19  Janus Weil  

PR fortran/54224
* trans-expr.c (conv_function_val): Set TREE_USED.

2012-10-19  Janus Weil  

PR fortran/54224
* gfortran.dg/warn_unused_function.f90: New.


pr54224.diff
Description: Binary data


warn_unused_function.f90
Description: Binary data


Re: [PATCH, ARM][3/4] Epilogue using LDRD in Thumb mode

2012-10-19 Thread Richard Earnshaw

On 10/10/12 16:03, Greta Yorsh wrote:

Generate epilogue using LDRD in Thumb mode when prefer_ldrd_strd is set in
tune_params.

ChangeLog

gcc/

2012-09-13  Sameera Deshpande  
 Greta Yorsh  

 * config/arm/arm.c (thumb2_emit_ldrd_pop): New function.
   (arm_expand_epilogue): Use the new function.



This is OK, apart from:




+
+  /* Var j iterates over all the registers to gather all the registers in
+ saved_regs_mask.  Var i gives index of saved registers in stack frame.
+ A PARALLEL RTX of register-pair is created here, so that pattern for
+ LDRD can be matched.  As PC is always last register to be popped, and
+ we have already decremented num_regs if PC, we don't have to worry
+ about PC in this loop.  */
+  for (i = 0, j = 0; i < (num_regs - (num_regs % 2)); j++)
+if (saved_regs_mask & (1 << j))
+  {
+gcc_assert (j != SP_REGNUM);


Please move the assert outside of the loop.

R.




Re: [PATCH, ARM][4/4] Adjust tests gcc.target/arm/pr40457-*.c

2012-10-19 Thread Richard Earnshaw

On 10/10/12 16:03, Greta Yorsh wrote:

As a result of adding LDRD/STRD patterns in Thumb mode, the compiler
generates LDRD/STRD instead of LDM/STM in some cases. This patch adjusts
existing tests to accept LDRD/STRD in addition to LDM/STM.

ChangeLog

gcc/testsuite

2012-09-13  Sameera Deshpande  
 Greta Yorsh  

 * gcc.target/arm/pr40457-1.c: Adjust expected output.
 * gcc.target/arm/pr40457-2.c: Likewise.
 * gcc.target/arm/pr40457-3.c: Likewise.



OK.

R.




Re: Fix array bound niter estimate (PR middle-end/54937)

2012-10-19 Thread Jan Hubicka
> > What about the conservative variant of simply
> > 
> >   else
> > delta = double_int_one;
> 
> I think it would be bad idea: it makes us to completely unroll one interation
> too many that bloats code for no benefit. No optimization cancels the path in
> CFG because of undefined effect and thus the code will be output (unless 
> someone
> smarter, like VRP, cleans up later, but it is more an exception than rule.)

OK, on deeper thought I guess I can always add double_int_one at that spot and,
once we are done with everything, walk nb_iter_bound for all statements known
not to be executed on the last iteration and record them into a pointer set.

Finally I can walk from the header in DFS order, stopping on loop exits, side
effects and those statements.  If I visit no loop exit or side effect I know I
can lower the iteration count by 1 (in estimate_numbers_of_iterations_loop).

This will give an accurate answer and requires just a little extra bookkeeping.

I will give this a try.
Honza


Re: User directed Function Multiversioning via Function Overloading (issue5752064)

2012-10-19 Thread Diego Novillo

On 2012-10-12 18:19 , Sriraman Tallam wrote:


When the front-end sees more than one decl for "foo", it calls a target hook to
determine if they are versions. To prevent duplicate definition errors with 
other
 versions of "foo", "decls_match" function in cp/decl.c is made to return false
 when 2 decls have are deemed versions by the target. This will make all 
function
versions of "foo" to be added to the overload list of "foo".


So, this means that this can only work for C++, right?  Or could the 
same trickery be done some other way in other FEs?


I see no handling of different FEs.  If the user tries to use these 
attributes from languages other than C++, we should emit a diagnostic.



+@deftypefn {Target Hook} tree TARGET_GET_FUNCTION_VERSIONS_DISPATCHER (void 
*@var{arglist})
+This hook is used to get the dispatcher function for a set of function
+versions.  The dispatcher function is called to invoke the rignt function


s/rignt/right/


+version at run-time. @var{arglist} is the vector of function versions
+that should be considered for dispatch.
+@end deftypefn
+
+@deftypefn {Target Hook} tree TARGET_GENERATE_VERSION_DISPATCHER_BODY (void 
*@var{arg})
+This hook is used to generate the dispatcher logic to invoke the right
+function version at runtime for a given set of function versions.


s/runtime/run-time/


+@hook TARGET_GET_FUNCTION_VERSIONS_DISPATCHER
+This hook is used to get the dispatcher function for a set of function
+versions.  The dispatcher function is called to invoke the rignt function


s/rignt/right/


+version at run-time. @var{arglist} is the vector of function versions
+that should be considered for dispatch.
+@end deftypefn
+
+@hook TARGET_GENERATE_VERSION_DISPATCHER_BODY
+This hook is used to generate the dispatcher logic to invoke the right
+function version at runtime for a given set of function versions.


s/runtime/run-time/


@@ -288,7 +289,6 @@ mark_store (gimple stmt, tree t, void *data)
  }
   return false;
 }
-
 /* Create cgraph edges for function calls.
Also look for functions and variables having addresses taken.  */


Don't remove vertical white space, please.


+   {
+ struct cgraph_node *callee = cgraph_get_create_node (decl);
+ /* If a call to a multiversioned function dispatcher is
+found, generate the body to dispatch the right function
+at run-time.  */
+ if (callee->dispatcher_function)
+   {
+ tree resolver_decl;
+ gcc_assert (callee->function_version.next);


What if callee is the last version in the list?  Not sure what you are 
trying to check here.




@@ -8601,9 +8601,22 @@ handle_target_attribute (tree *node, tree name, tr
   warning (OPT_Wattributes, "%qE attribute ignored", name);
   *no_add_attrs = true;
 }
-  else if (! targetm.target_option.valid_attribute_p (*node, name, args,
- flags))
-*no_add_attrs = true;
+  else
+{
+  /* When a target attribute is invalid, it may also be because the
+target for the compilation unit and the attribute match.  For
+ instance, target attribute "xxx" is invalid when -mxxx is used.
+ When used with multiversioning, removing the attribute will lead
+ to duplicate definitions if a default version is provided.
+So, generate a warning here and remove the attribute.  */
+  if (!targetm.target_option.valid_attribute_p (*node, name, args, flags))
+   {
+ warning (OPT_Wattributes,
+  "Invalid target attribute in function %qE, ignored.",
+  *node);
+ *no_add_attrs = true;


If you do this, isn't the compiler going to generate two warning 
messages?  One for the invalid target attribute, the second for the 
duplicate definition.



@@ -228,6 +228,26 @@ struct GTY(()) cgraph_node {
   struct cgraph_node *prev_sibling_clone;
   struct cgraph_node *clones;
   struct cgraph_node *clone_of;
+
+  /* Function Multiversioning info.  */
+  struct {
+/* Chains all the semantically identical function versions.  The
+   first function in this chain is the default function.  */
+struct cgraph_node *prev;
+/* If this node is a dispatcher for function versions, this points
+   to the default function version, the first function in the chain.  */
+struct cgraph_node *next;


Why not a VEC of function decls?  Seems easier to manage and less size 
overhead.
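
For illustration, a rough sketch of that alternative (names below are
illustrative only; in practice the vector would presumably live on the
default version's cgraph_node rather than in a global):

  /* All semantically identical versions of one function, default first.  */
  static GTY(()) VEC(tree,gc) *function_versions;

  static void
  add_function_version (tree decl)
  {
    VEC_safe_push (tree, gc, function_versions, decl);
  }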




@@ -3516,8 +3522,8 @@ struct GTY(()) tree_function_decl {
   unsigned looping_const_or_pure_flag : 1;
   unsigned has_debug_args_flag : 1;
   unsigned tm_clone_flag : 1;
-
-  /* 1 bit left */
+  unsigned versioned_function : 1;
+  /* No bits left.  */


You ate the last bit!  How rude ;)


@@ -8132,6 +8176,38 @@ joust (struct z_candidate *cand1, struct z_candida
   && (IS_TYPE_OR_DECL_P (cand1->fn)))
 return 1;

+  /

RE: [PATCH, ARM][1/4] New RTL patterns for LDRD/STRD in Thumb mode

2012-10-19 Thread Greta Yorsh
On 18 October 2012 14:41, Richard Earnshaw wrote:
> > +/* Checks whether the operands are valid for use in an LDRD/STRD instruction.
> > +   Assumes that RT, RT2, and RTN are REG.  This is guaranteed by the patterns.
> > +   Assumes that the address in the base register RTN is word aligned.  Pattern
> > +   guarantees that both memory accesses use the same base register,
> > +   the offsets are constants within the range, and the gap between the offsets is 4.
> > +   If preload complete then check that registers are legal.  WBACK indicates whether
> > +   address is updated.  LOAD indicates whether memory access is load or store.  */
> 
> ARM ARM terminology uses Rn for the base reg, so:
> 
> s/RTN/RN/

Fixed.

> 
> > +bool
> > +operands_ok_ldrd_strd (rtx rt, rtx rt2, rtx rtn, HOST_WIDE_INT offset,
> 
> s/rtn/rn/

Fixed.

> > +;; Patterns for LDRD/STRD in Thumb2 mode
> > +
> > +(define_insn "*thumb2_ldrd"
> > +  [(set (match_operand:SI 0 "s_register_operand" "=r")
> > +        (mem:SI (plus:SI (match_operand:SI 1 "s_register_operand" "rk")
> > +                         (match_operand:SI 2 "ldrd_strd_offset_operand" "Do"))))
> > +   (set (match_operand:SI 3 "s_register_operand" "=r")
> > +        (mem:SI (plus:SI (match_dup 1)
> > +                         (match_operand:SI 4 "const_int_operand" ""))))]
> > +  "TARGET_LDRD && TARGET_THUMB2
> > +     && (current_tune->prefer_ldrd_strd && !optimize_function_for_size_p (cfun))
> 
> All these should be gated on "reload_completed" and not on the tune or 
> size optimization.

Removed the condition "!optimize_function_for_size_p (cfun))".

The condition "current_tune->prefer_ldrd_strd" is needed because the patterns
for LDRD/STRD appear before the patterns for LDM/STM that can match the same
RTL (two registers in the list).  The condition "reload_completed" does not
help with this, because the peephole optimizations in ldmstm.md may (after
reload) create new RTL insns that match this pattern.

> > diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
> > index f330da3..21d1aa8 100644
> > --- a/gcc/config/arm/arm.c
> > +++ b/gcc/config/arm/arm.c
> > @@ -12130,6 +12130,9 @@ offset_ok_for_ldrd_strd (HOST_WIDE_INT offset)
> >   {
> > HOST_WIDE_INT max_offset;
> >
> > +  if (!TARGET_LDRD)
> > +return false;
> > +
> 
> This seems to be in the wrong place.  If we don't have ldrd then the 
> question as to what is a valid offset is irrelevant.

Moved this condition to predicates.md and constraints.md.

Other uses of offset_ok_for_ldrd_strd are already guarded by the conditions.

I am attaching a new version of this patch. 

No regression on qemu for arm-none-eabi with cpu cortex-m4 and cortex-a15.

Ok for trunk?

Thank you,
Greta

ChangeLog


gcc/

2012-10-19  Sameera Deshpande  
Greta Yorsh  

* config/arm/arm-protos.h (offset_ok_for_ldrd_strd): New
declaration.
(operands_ok_ldrd_strd): Likewise.
* config/arm/arm.c (offset_ok_for_ldrd_strd): New function.
(operands_ok_ldrd_strd): Likewise.
* config/arm/arm.md (thumb2_ldrd, thumb2_ldrd_base): New patterns.
(thumb2_ldrd_base_neg): Likewise.
(thumb2_strd, thumb2_strd_base, thumb_strd_base_neg): Likewise.
* predicates.md (ldrd_strd_offset_operand): New predicate.
* config/arm/constraints.md (Do): New constraint.

diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h
index 010e7fc..bfe96ea 100644
--- a/gcc/config/arm/arm-protos.h
+++ b/gcc/config/arm/arm-protos.h
@@ -116,6 +116,8 @@ extern bool gen_stm_seq (rtx *, int);
 extern bool gen_const_stm_seq (rtx *, int);
 extern rtx arm_gen_load_multiple (int *, int, rtx, int, rtx, HOST_WIDE_INT *);
 extern rtx arm_gen_store_multiple (int *, int, rtx, int, rtx, HOST_WIDE_INT *);
+extern bool offset_ok_for_ldrd_strd (HOST_WIDE_INT);
+extern bool operands_ok_ldrd_strd (rtx, rtx, rtx, HOST_WIDE_INT, bool, bool);
 extern int arm_gen_movmemqi (rtx *);
 extern enum machine_mode arm_select_cc_mode (RTX_CODE, rtx, rtx);
 extern enum machine_mode arm_select_dominance_cc_mode (rtx, rtx,
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index fc3a508..c60e62f 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -12185,6 +12185,75 @@ arm_pad_reg_upward (enum machine_mode mode,
   return !BYTES_BIG_ENDIAN;
 }
 
+/* Returns true iff OFFSET is valid for use in an LDRD/STRD instruction,
+   assuming that the address in the base register is word aligned.  */
+bool
+offset_ok_for_ldrd_strd (HOST_WIDE_INT offset)
+{
+  HOST_WIDE_INT max_offset;
+
+  /* Offset must be a multiple of 4 in Thumb mode.  */
+  if (TARGET_THUMB2 && ((offset & 3) != 0))
+return false;
+
+  if (TARGET_THUMB2)
+max_offset = 1020;
+  else if (TARGET_ARM)
+max_offset = 255;
+  else
+gcc_unreachable ();
+
+  return ((offset <= max_offset) && (offset >= -max_offset));
+}
+
+/* Checks whether the operands are valid for use in an LDRD/STRD instruction.
+   Assumes that RT, RT2, and RN are REG. 

RFA: fix dbr_schedule to leave unique ids unique

2012-10-19 Thread Joern Rennecke

Quoting Richard Sandiford :


Joern Rennecke  writes:

Quoting Richard Sandiford :

The fact that we even have shared unique ids is pretty bad -- and surely
a contradiction in terms -- but I think both ways of handling them rely
on the length being the same for all copies.  If we don't record a length
(your version), we won't set something_changed if the length of this copy
did in fact change, which could lead to branches remaining too short.


That is not the case, as the current length of the inner insn is added to
new_length (of the outer insn).


If we do record a length (current version),


In the current version, the length of the inner insn is calculated anew
each iteration, so only the out-of-sequence copy suffers from the bogosity.


we could end up changing
the length of previous copies that have already been processed in
this iteration.


Both that, and also using the length of the previous insn for the inner
length calculation, and ratcheting them up together.
In fact, this can make delay slot instructions ineligible for the delay slot
they are in.  Also, the unwanted length cross-interference can violate
alignment invariants, and this leads to invalid code on ARCompact.


But if you're saying that ARCompat wants different copies of the same insn
(with the same uid) to have different lengths, then please fix dbr_schedule
so that it uses different uids.  The "i == 0" thing is just too hackish IMO.


Implemented with the attached patch.
I've been wondering if I should use copy_insn, but that doesn't seem to make
any real difference after reload.
Also, copying only PATTERN, INSN_LOCATION, and REG_NOTES into the new
insn obtained from make_insn_raw had seemed a possibility, but there is no
firm assurance that we will only see insns with the code INSN.
In the end, continuing to use copy_rtx to make the copy seems least likely to
break something.  And now that the copying is in one place, it's easier to
change the way it is done if you want to try that.

We waste a bit of memory (temporarily, as it is garbage collected) by using
make_insn_raw just to increment cur_insn_uid.  Alternatives would be to
move the function inside emit-rtl.c, or have emit-rtl.h provide a means to
increment cur_insn_uid.  Well, or we could directly access  
crtl->emit.x_cur_insn_uid, but that would certainly be hackish.


regression tested on i686-pc-linux-gnu X cris-elf

(cris-elf has delayed branches, CASE_VECTOR_SHORTEN_MODE, is not  
currently affected by PR54938, and builds libgfortran.)


C / C++ / objc regression tests for mipsel-elf also passed.
2012-10-19  Joern Rennecke  

* reorg.c (copy_delay_slot_insn): New function.
(steal_delay_list_from_target, fill_slots_from_thread): Use it.
(fill_simple_delay_slots): Likewise.

Index: reorg.c
===
--- reorg.c (revision 192573)
+++ reorg.c (working copy)
@@ -1179,6 +1179,19 @@ check_annul_list_true_false (int annul_t
   return 1;
 }
 
+/* Copy INSN, which is to remain in place (in fill_slots_from_thread parlance,
+   it is from a thread we don't own), for use in a delay slot.  */
+static rtx
+copy_delay_slot_insn (rtx insn)
+{
+  /* Copy INSN with its rtx_code, all its notes, location etc.  */
+  insn = copy_rtx (insn);
+  /* Get new UID.  */
+  rtx new_insn = make_insn_raw (PATTERN (insn));
+  INSN_UID (insn) = INSN_UID (new_insn);
+  return insn;
+}
+
 /* INSN branches to an insn whose pattern SEQ is a SEQUENCE.  Given that
the condition tested by INSN is CONDITION and the resources shown in
OTHER_NEEDED are needed after INSN, see whether INSN can take all the insns
@@ -1297,7 +1310,7 @@ steal_delay_list_from_target (rtx insn,
{
  if (must_annul)
used_annul = 1;
- temp = copy_rtx (trial);
+ temp = copy_delay_slot_insn (trial);
  INSN_FROM_TARGET_P (temp) = 1;
  new_delay_list = add_to_delay_list (temp, new_delay_list);
  total_slots_filled++;
@@ -2369,7 +2382,8 @@ fill_simple_delay_slots (int non_jumps_p
  if (new_label)
{
  delay_list
-   = add_to_delay_list (copy_rtx (next_trial), delay_list);
+   = add_to_delay_list (copy_delay_slot_insn (next_trial),
+delay_list);
  slots_filled++;
  reorg_redirect_jump (trial, new_label);
 
@@ -2793,7 +2807,7 @@ fill_slots_from_thread (rtx insn, rtx co
  else
new_thread = next_active_insn (trial);
 
- temp = own_thread ? trial : copy_rtx (trial);
+ temp = own_thread ? trial : copy_delay_slot_insn (trial);
  if (thread_if_true)
INSN_FROM_TARGET_P (temp) = 1;
 
@@ -2974,7 +2988,7 @@ fill_slots_from_thread (rtx insn, rtx co
  else
new_thread = next_active_insn (trial);
 
- ninsn = own_thread ?

Re: [PATCH, ARM][1/4] New RTL patterns for LDRD/STRD in Thumb mode

2012-10-19 Thread Richard Earnshaw

On 19/10/12 16:20, Greta Yorsh wrote:


Removed the condition "!optimize_function_for_size_p (cfun))".

The condition "current_tune->prefer_ldrd_strd" is needed because the
patterns
for LDRD/STRD appear before the patterns for LDM/STM that can match the same
RTL
(two register in the list). Condition "reload_completed" does not help with
it
because peephole optimizations in ldmstm.md may (after reload) create new
RTL insn
that match this pattern.



The point of the reload_completed is that these patterns have the 
potential to cause some problems if they somehow matched during earlier 
passes and the address base was an eliminable register.






Re: Plan to merge LRA.

2012-10-19 Thread Diego Novillo

On 2012-10-19 10:31 , Vladimir Makarov wrote:


   So my plan is to merge trunk into LRA branch first and, if it looks
ok, commit LRA to the trunk on Sunday to be ready solve problems (if
there are any) starting Monday morning EST.  LRA merge will affect only
x86/x86-64 target.  If you suspect any problem with LRA and have no
quick response from me, you can easily switch it off by putting false
instead of true in function i386.c::lra_p and continue your work until
the problem is solved by me.


Thanks for the heads up, Vlad.  Does the branch have a lot of VEC() 
code?  I will need to adjust my VEC changes after you committed LRA.


I am still battling GTY failures, so I will wait until LRA is in trunk 
and stabilized before I push my VEC rewrite.



Thanks.  Diego.


Re: Plan to merge LRA.

2012-10-19 Thread Vladimir Makarov

On 12-10-19 11:46 AM, Diego Novillo wrote:

On 2012-10-19 10:31 , Vladimir Makarov wrote:


   So my plan is to merge trunk into LRA branch first and, if it looks
ok, commit LRA to the trunk on Sunday to be ready solve problems (if
there are any) starting Monday morning EST.  LRA merge will affect only
x86/x86-64 target.  If you suspect any problem with LRA and have no
quick response from me, you can easily switch it off by putting false
instead of true in function i386.c::lra_p and continue your work until
the problem is solved by me.


Thanks for the heads up, Vlad.  Does the branch have a lot of VEC() 
code?  I will need to adjust my VEC changes after you committed LRA.


I counted 4 vectors (with about 25 lines referencing them).  One vector 
(point_freq_vec) is analogous to IRA.
I am still battling GTY failures, so I will wait until LRA is in trunk 
and stabilized before I push my VEC rewrite.






Re: PR c/53063 Use (Lang)EnabledBy for a bunch of options

2012-10-19 Thread Joseph S. Myers
On Wed, 17 Oct 2012, Manuel López-Ibáñez wrote:

> This patch changes most trivial cases to use the new  (Lang)EnabledBy.
> 
> Bootstrapped and regression tested on x86_64-linux-gnu. OK?
> 
> 2012-10-17  Manuel López-Ibáñez  
> 
>   PR c/53063
>   PR c/40989
> c-family/
>   * c.opt (Waddress,Wchar-subscripts,Wsign-conversion,Wimplicit,
>   Wimplicit-function-declaration,Wimplicit-int,Wsizeof-pointer-memaccess,
>   Wnarrowing,Wparentheses,Wpointer-sign,Wreturn-type,Wsequence-point,
>   Wsign-compare,Wuninitialized,Wmaybe-uninitialized,Wunused,
>   Wvolatile-register-var): Add LangEnabledBy or EnabledBy.
>   * c-opts.c (c_common_handle_option): Remove explicit handling from
>   here.
>   (c_common_post_options): Likewise.
> gcc/
>   * opts.c (finish_options): Remove explicit handling from here.

OK.

> Not sure how to encode these dependencies. Ideas?

No specific suggestions here.  If a particular case isn't common enough to 
have a .opt mechanism for it, then in principle the use of the _set 
structures tracking what was explicitly set could replace the use of -1 
initializers.  But since the _set structures generally correspond to what 
was set directly, whereas implicit setting by another option also replaces 
the -1 value, this might only work given the extension of _set structures 
to record the distance between the option passed by the user and the one 
set, not just whether each option was set directly.

-- 
Joseph S. Myers
jos...@codesourcery.com

Re: Plan to merge LRA.

2012-10-19 Thread Diego Novillo

On 2012-10-19 11:51 , Vladimir Makarov wrote:


I counted 4 vectors (with about 25 lines referencing them).  One vector
(point_freq_vec) is analogous to IRA.


Perfect.  That shouldn't be hard at all then.  Thanks.


Diego.


Re: [Patch, Fortran] PR 54224: [4.8 Regression] Bogus -Wunused-function warning with static function

2012-10-19 Thread Tobias Burnus

Hi Janus,

Janus Weil wrote:

I have found a fix that is rather trivial in the sense that it's a
one-liner.


I think it is the proper fix.


However, it may not be as trivial conceptually (in particular I'm not 100% sure 
what caused this regression in the first place


I think it is the combination of having "static" noninternal procedures
(which is a new 4.8 feature) and the middle-end assumption that the FE
marks used procedures as TREE_USED.  (The ME does mark them as TREE_USED,
but too late.)
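
For illustration, a sketch of the kind of one-liner being discussed (the
helper name below is made up; per the ChangeLog the committed change simply
sets the flag inside conv_function_val):

  /* Mark the referenced procedure's FUNCTION_DECL as used as soon as the
     FE translates a reference to it, so -Wunused-function stays quiet.  */
  static void
  mark_procedure_decl_used (tree fndecl)
  {
    gcc_assert (TREE_CODE (fndecl) == FUNCTION_DECL);
    TREE_USED (fndecl) = 1;
  }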


(The other question is why the diagnostic doesn't work for internal 
procedures - or for GNU C's nested function. I think also for module 
variables such a diagnostic would be useful. However, both are just 
missed diagnostics and not diagnostic-output regressions.)



Anyway, it regtests cleanly and fixes the problem as advertised. Ok for trunk?


Yes. Thanks for the patch.

Tobias


2012-10-19  Janus Weil  

PR fortran/54224
* trans-expr.c (conv_function_val): Set TREE_USED.

2012-10-19  Janus Weil  

PR fortran/54224
* gfortran.dg/warn_unused_function.f90: New.


[v3] (almost) finish

2012-10-19 Thread Jonathan Wakely
This adds support for piecewise construction of std::pair by
scoped_allocator_adaptor.  The only thing missing from
scoped_allocator_adaptor now is that my definition of OUTERMOST isn't
recursive, so it doesn't work for nested scoped_allocator_adaptors.
That's a suitably obscure use case that I'm not going to rush to fix
today.

* include/std/scoped_allocator (__outermost_alloc_traits): Define.
(scoped_allocator_adaptor::destroy): Use it.
(scoped_allocator_adaptor::construct): Likewise. Overload for
piecewise construction of std::pair objects.
* testsuite/20_util/scoped_allocator/2.cc: New.

Tested x86_64-linux, committed to trunk.
commit 2a969ae20431833dbb4a2dfdf0be03b5d4b716c9
Author: Jonathan Wakely 
Date:   Fri Oct 19 16:58:53 2012 +0100

* include/std/scoped_allocator (__outermost_alloc_traits): Define.
(scoped_allocator_adaptor::destroy): Use it.
(scoped_allocator_adaptor::construct): Likewise. Overload for
piecewise construction of std::pair objects.
* testsuite/20_util/scoped_allocator/2.cc: New.
* doc/xml/manual/status_cxx2011.xml: Update.

diff --git a/libstdc++-v3/doc/xml/manual/status_cxx2011.xml 
b/libstdc++-v3/doc/xml/manual/status_cxx2011.xml
index ba37e0e..226eef9 100644
--- a/libstdc++-v3/doc/xml/manual/status_cxx2011.xml
+++ b/libstdc++-v3/doc/xml/manual/status_cxx2011.xml
@@ -1037,7 +1037,7 @@ particular release.
   20.12.4
   Scoped allocator adaptor members
   Partial
-  Missing std::pair piecewise construction.
+  OUTERMOST is not recursive.
 
 
   20.12.5
diff --git a/libstdc++-v3/include/std/scoped_allocator 
b/libstdc++-v3/include/std/scoped_allocator
index fc2db7c..81365b6 100644
--- a/libstdc++-v3/include/std/scoped_allocator
+++ b/libstdc++-v3/include/std/scoped_allocator
@@ -74,7 +74,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 : allocator_traits<_Alloc>::propagate_on_container_swap
 { };
 
-  
+
   template
 inline auto
 __do_outermost(_Alloc& __a, _Alloc*) -> decltype(__a.outer_allocator())
@@ -85,6 +85,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 __do_outermost(_Alloc& __a, ...)
 { return __a; }
 
+  // TODO: make recursive (see note in 20.12.4/1)
   template
 inline auto
 __outermost(_Alloc& __a) -> decltype(__do_outermost(__a, &__a))
@@ -190,15 +191,21 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   _M_tie() const noexcept
   { return std::tuple_cat(std::tie(outer_allocator()), _M_inner._M_tie()); 
}
 
+  template
+   using __outermost_type = typename
+ std::decay()))>::type;
+
+  template
+   using __outermost_alloc_traits
+ = allocator_traits<__outermost_type<_Alloc>>;
   
   template
 void 
 _M_construct(__uses_alloc0, _Tp* __p, _Args&&... __args)
 {
-  auto& __outer = __outermost(*this);
- typedef typename std::decay::type __outer_type;
-  typedef allocator_traits<__outer_type> __o_traits;
-  __o_traits::construct(__outer, __p, std::forward<_Args>(__args)...);
+ typedef __outermost_alloc_traits _O_traits;
+ _O_traits::construct(__outermost(*this), __p,
+  std::forward<_Args>(__args)...);
 }
 
   typedef __uses_alloc1 __uses_alloc1_;
@@ -208,22 +215,20 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 void 
 _M_construct(__uses_alloc1_, _Tp* __p, _Args&&... __args)
 {
-  auto& __outer = __outermost(*this);
- typedef typename std::decay::type __outer_type;
-  typedef allocator_traits<__outer_type> __o_traits;
-  __o_traits::construct(__outer, __p, allocator_arg, inner_allocator(),
-std::forward<_Args>(__args)...);
+ typedef __outermost_alloc_traits _O_traits;
+ _O_traits::construct(__outermost(*this), __p,
+  allocator_arg, inner_allocator(),
+  std::forward<_Args>(__args)...);
 }
 
   template
 void 
 _M_construct(__uses_alloc2_, _Tp* __p, _Args&&... __args)
 {
- auto& __outer = __outermost(*this);
- typedef typename std::decay::type __outer_type;
-  typedef allocator_traits<__outer_type> __o_traits;
-  __o_traits::construct(__outer, __p, std::forward<_Args>(__args)...,
-inner_allocator());
+ typedef __outermost_alloc_traits _O_traits;
+ _O_traits::construct(__outermost(*this), __p,
+  std::forward<_Args>(__args)...,
+  inner_allocator());
 }
 
   template
@@ -338,15 +343,61 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   _M_construct(__use_tag, __p, std::forward<_Args>(__args)...);
 }
 
-  // TODO: construct pairs
+  template
+   void
+   construct(pair<_T1, _T2>* __p, piecewise_construct_t,
+ tuple<_Args1...> __x, tuple<

Re: PR c/53063 Handle Wformat with LangEnabledBy

2012-10-19 Thread Joseph S. Myers
On Wed, 17 Oct 2012, Manuel López-Ibáñez wrote:

> documentation but I can also implement -Wformat=0 being an alias for
> -Wno-format and -Wformat=1 an alias for -Wformat and simply reject
> -Wno-format=.

I think that's what's wanted; -Wno-format= should be rejected, -Wformat= 
should take an arbitrary integer level (of which at present all those 
above 2 are equivalent to 2, just as -O<n> for n > 3 is equivalent to 
-O3).
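
For illustration (my example, not from the patch), the kind of difference
the level makes:

  #include <stdio.h>

  /* Diagnosed already at -Wformat (level 1): wrong argument type.  */
  void f1 (int i) { printf ("%s", i); }

  /* Only diagnosed at -Wformat=2, which additionally enables checks
     such as -Wformat-nonliteral / -Wformat-security.  */
  void f2 (const char *fmt) { printf (fmt); }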

-- 
Joseph S. Myers
jos...@codesourcery.com

Re: [patch] PR 54930: add -Wreturn-local-addr

2012-10-19 Thread Joseph S. Myers
On Thu, 18 Oct 2012, Jonathan Wakely wrote:

> This adds a warning switch for the existing "returning address of
> local variable" warnings in the C and C++ FEs which are enabled by
> default but have no switch controlling them. Adding a switch allows it
> to be turned into an error with -Werror=return-local-addr.  I've added
> two tests checking the -Werror form works.
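
For reference, a minimal example (mine, not from the patch) of the
diagnostic that gains a switch here:

  int *
  f (void)
  {
    int x = 42;
    return &x;  /* warning: function returns address of local variable;
                   becomes an error under -Werror=return-local-addr */
  }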

The C front-end changes are OK.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: patch to fix constant math - patch 5 - the rest of the rtl stuff.

2012-10-19 Thread Richard Sandiford
I can't approve this, but some comments:

Kenneth Zadeck  writes:
> diff --git a/gcc/alias.c b/gcc/alias.c
> index 0c6a744..9e67823 100644
> --- a/gcc/alias.c
> +++ b/gcc/alias.c
> @@ -1490,9 +1490,9 @@ rtx_equal_for_memref_p (const_rtx x, const_rtx y)
>  
>  case VALUE:
>  CASE_CONST_UNIQUE:
> -  /* There's no need to compare the contents of CONST_DOUBLEs or
> -  CONST_INTs because pointer equality is a good enough
> -  comparison for these nodes.  */
> +  /* There's no need to compare the contents of CONST_DOUBLEs,
> +  CONST_INTs or CONST_WIDE_INTs because pointer equality is a
> +  good enough comparison for these nodes.  */
>return 0;
>  
>  default:

Maybe just:

  /* Pointer equality guarantees value equality for these codes.  */

> @@ -695,13 +700,14 @@ c_readstr (const char *str, enum machine_mode mode)
> && GET_MODE_SIZE (mode) >= UNITS_PER_WORD)
>   j = j + UNITS_PER_WORD - 2 * (j % UNITS_PER_WORD) - 1;
>j *= BITS_PER_UNIT;
> -  gcc_assert (j < HOST_BITS_PER_DOUBLE_INT);
>  
>if (ch)
>   ch = (unsigned char) str[i];
> -  c[j / HOST_BITS_PER_WIDE_INT] |= ch << (j % HOST_BITS_PER_WIDE_INT);
> +  c.elt_ref (j / HOST_BITS_PER_WIDE_INT) |= ch << (j % 
> HOST_BITS_PER_WIDE_INT);

Long line.

> @@ -4990,12 +4996,13 @@ expand_builtin_signbit (tree exp, rtx target)
>  
>if (bitpos < GET_MODE_BITSIZE (rmode))
>  {
> -  double_int mask = double_int_zero.set_bit (bitpos);
> +  wide_int mask;
> +  mask = wide_int::set_bit_in_zero (bitpos, rmode);

No real point splitting this.

> @@ -1511,17 +1511,20 @@ Similarly, there is only one object for the integer 
> whose value is
>  
>  @findex const_double
>  @item (const_double:@var{m} @var{i0} @var{i1} @dots{})
> -Represents either a floating-point constant of mode @var{m} or an
> -integer constant too large to fit into @code{HOST_BITS_PER_WIDE_INT}
> -bits but small enough to fit within twice that number of bits (GCC
> -does not provide a mechanism to represent even larger constants).  In
> -the latter case, @var{m} will be @code{VOIDmode}.  For integral values
> -constants for modes with more bits than twice the number in
> -@code{HOST_WIDE_INT} the implied high order bits of that constant are
> -copies of the top bit of @code{CONST_DOUBLE_HIGH}.  Note however that
> -integral values are neither inherently signed nor inherently unsigned;
> -where necessary, signedness is determined by the rtl operation
> -instead.
> +On older ports, this represents either a floating-point constant of
> +mode @var{m} or an integer constant too large to fit into
> +@code{HOST_BITS_PER_WIDE_INT} bits but small enough to fit within
> +twice that number of bits (GCC does not provide a mechanism to
> +represent even larger constants).

No longer true :-)

>[...] In the latter case, @var{m} will be
> +@code{VOIDmode}.  For integral values constants for modes with more
> +bits than twice the number in @code{HOST_WIDE_INT} the implied high
> +order bits of that constant are copies of the top bit of
> +@code{CONST_DOUBLE_HIGH}.  Note however that integral values are
> +neither inherently signed nor inherently unsigned; where necessary,
> +signedness is determined by the rtl operation instead.
> +
> +On more modern ports, @code{CONST_DOUBLE} only represents floating
> +point values.   New ports define to TARGET_SUPPORTS_WIDE_INT to  

Truncated sentence.

> @@ -1536,6 +1539,37 @@ machine's or host machine's floating point format.  To 
> convert them to
>  the precise bit pattern used by the target machine, use the macro
>  @code{REAL_VALUE_TO_TARGET_DOUBLE} and friends (@pxref{Data Output}).
>  
> +@findex const_wide_int
> +@item (const_wide_int:@var{m} @var{nunits} @var{elt0} @dots{})
> +This contains a garbage collected array of @code{HOST_WIDE_INTS} that
> +is large enough to hold any constant that can be represented on the
> +target.

Suggest dropping "garbage-collected".  I think it's a hold-over from
when the vector was separate.

> This form of rtl is only used on targets that define
> +@code{TARGET_SUPPORTS_WIDE_INT} to be non zero and then
> +@code{CONST_DOUBLES} are only used to hold floating point values.  If

@code{const_double}s.  In general, rtl names should be lower case
in @code{...}, and the 's' should come outside.  There are quite
a few instances (because of the detailed documentation :-)) so I won't
list them all.  But this applies to const_int and const_wide_int too.

> +The values are stored in a compressed format.   The higher order
> +0s or -1s are not represented if they are just the logical sign
> +extension the number that is represented.   

extension of ...

> +On older ports, large integers are stored in @code{CONST_DOUBLE} rtl
> +objects.  Newer ports define @code{TARGET_SUPPORTS_WIDE_INT} to be non
> +zero to indicate tha large integers are stored in

...that large integers...

> +Converting a port mostly requires looking for the places where
> +@c

Re: Fix array bound niter estimate (PR middle-end/54937)

2012-10-19 Thread Jan Hubicka
> > > What about the conservative variant of simply
> > > 
> > >   else
> > > delta = double_int_one;
> > 
> > I think it would be bad idea: it makes us to completely unroll one 
> > interation
> > too many that bloats code for no benefit. No optimization cancels the path 
> > in
> > CFG because of undefined effect and thus the code will be output (unless 
> > someone
> > smarter, like VRP, cleans up later, but it is more an exception than rule.)
> 
> OK, on deper tought I guess I can add double_int_one always at that spot and
> once we are done with everything I can walk nb_iter_bound for all statements
> known to not be executed on last iteration and record them to pointer set.
> 
> Finally I can walk from header in DFS stopping on loop exits, side effects and
> those stateemnts.  If I visit no loop exit or side effect I know I can lower
> iteration count by 1 (in estimate_numbers_of_iterations_loop).
> 
> This will give accurate answer and requires just little extra bookkeeping.
> 
> I will give this a try.

Here is the updated patch.  It solves the testcase and gives better
estimates than before.

There are obvious improvements: record_estimate can put all statements into
the list, not only those that dominate the loop latch, and
maybe_lower_iteration_bound can track the lowest estimate it finds on its
walk.  This will need a bit more work and I am thus sending the bugfix
separately, because I think it should go to 4.7, too.

Honza

* tree-ssa-loop-niter.c (record_estimate): Remove confused
dominators check.
(maybe_lower_iteration_bound): New function.
(estimate_numbers_of_iterations_loop): Use it.
Index: tree-ssa-loop-niter.c
===
--- tree-ssa-loop-niter.c   (revision 192537)
+++ tree-ssa-loop-niter.c   (working copy)
@@ -2535,7 +2541,6 @@ record_estimate (struct loop *loop, tree
 gimple at_stmt, bool is_exit, bool realistic, bool upper)
 {
   double_int delta;
-  edge exit;
 
   if (dump_file && (dump_flags & TDF_DETAILS))
 {
@@ -2570,14 +2577,10 @@ record_estimate (struct loop *loop, tree
 }
 
   /* Update the number of iteration estimates according to the bound.
- If at_stmt is an exit or dominates the single exit from the loop,
- then the loop latch is executed at most BOUND times, otherwise
- it can be executed BOUND + 1 times.  */
-  exit = single_exit (loop);
-  if (is_exit
-  || (exit != NULL
- && dominated_by_p (CDI_DOMINATORS,
-exit->src, gimple_bb (at_stmt
+ If at_stmt is an exit then the loop latch is executed at most BOUND times,
+ otherwise it can be executed BOUND + 1 times.  We will lower the estimate
+ later if such a statement must be executed on the last iteration.  */
+  if (is_exit)
 delta = double_int_zero;
   else
 delta = double_int_one;
@@ -2953,6 +2956,87 @@ gcov_type_to_double_int (gcov_type val)
   return ret;
 }
 
+/* See if every path crossing the loop goes through a statement that is known
+   to not execute at the last iteration.  In that case we can decrease the
+   iteration count by 1.  */
+
+static void
+maybe_lower_iteration_bound (struct loop *loop)
+{
+  pointer_set_t *not_executed_last_iteration = pointer_set_create ();
+  pointer_set_t *visited;
+  struct nb_iter_bound *elt;
+  bool found = false;
+  VEC (basic_block, heap) *queue = NULL;
+
+  for (elt = loop->bounds; elt; elt = elt->next)
+{
+  if (!elt->is_exit
+ && elt->bound.ult (loop->nb_iterations_upper_bound))
+   {
+ found = true;
+ pointer_set_insert (not_executed_last_iteration, elt->stmt);
+   }
+}
+  if (!found)
+{
+  pointer_set_destroy (not_executed_last_iteration);
+  return;
+}
+  visited = pointer_set_create ();
+  VEC_safe_push (basic_block, heap, queue, loop->header);
+  pointer_set_insert (visited, loop->header);
+  found = false;
+
+  while (VEC_length (basic_block, queue) && !found)
+{
+  basic_block bb = VEC_pop (basic_block, queue);
+  gimple_stmt_iterator gsi;
+  bool stmt_found = false;
+
+  for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
+   {
+ gimple stmt = gsi_stmt (gsi);
+ if (pointer_set_contains (not_executed_last_iteration, stmt))
+   {
+ stmt_found = true;
+ break;
+   }
+ if (gimple_has_side_effects (stmt))
+   {
+ found = true;
+ break;
+   }
+   }
+  if (!stmt_found && !found)
+   {
+  edge e;
+  edge_iterator ei;
+
+  FOR_EACH_EDGE (e, ei, bb->succs)
+   {
+ if (loop_exit_edge_p (loop, e))
+   {
+ found = true;
+ break;
+   }
+ if (!pointer_set_insert (visited, e->dest))
+   VEC_safe_push (basic_block, heap, queue, e->dest);
+   }
+   }
+}
+ 

RE: [PATCH, ARM][1/4] New RTL patterns for LDRD/STRD in Thumb mode

2012-10-19 Thread Greta Yorsh
> -Original Message-
> From: Richard Earnshaw
> Sent: 19 October 2012 16:44
> To: Greta Yorsh
> Cc: GCC Patches; Ramana Radhakrishnan; ni...@redhat.com;
> p...@codesourcery.com
> Subject: Re: [PATCH, ARM][1/4] New RTL patterns for LDRD/STRD in Thumb
> mode
> 
> On 19/10/12 16:20, Greta Yorsh wrote:
> 
> > Removed the condition "!optimize_function_for_size_p (cfun))".
> >
> > The condition "current_tune->prefer_ldrd_strd" is needed because the
> > patterns
> > for LDRD/STRD appear before the patterns for LDM/STM that can match
> the same
> > RTL
> > (two register in the list). Condition "reload_completed" does not
> help with
> > it
> > because peephole optimizations in ldmstm.md may (after reload) create
> new
> > RTL insn
> > that match this pattern.
> >
> 
> The point of the reload_completed is that these patterns have the
> potential to cause some problems if they somehow matched during earlier
> passes and the address base was an eliminable register.
> 

Thank you for the explanation. Here is an updated patch.

Regression tests and bootstrap in progress for the entire sequence, after
addressing all other comments as well. 

OK for trunk, if bootstrap successful?

Thanks,
Greta


ChangeLog


gcc/

2012-10-19  Sameera Deshpande  
Greta Yorsh  

* config/arm/arm-protos.h (offset_ok_for_ldrd_strd): New
declaration.
(operands_ok_ldrd_strd): Likewise.
* config/arm/arm.c (offset_ok_for_ldrd_strd): New function.
(operands_ok_ldrd_strd): Likewise.
* config/arm/arm.md (thumb2_ldrd, thumb2_ldrd_base): New patterns.
(thumb2_ldrd_base_neg): Likewise.
(thumb2_strd, thumb2_strd_base, thumb_strd_base_neg): Likewise.
* predicates.md (ldrd_strd_offset_operand): New predicate.
* config/arm/constraints.md (Do): New constraint.
diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h
index 010e7fc..bfe96ea 100644
--- a/gcc/config/arm/arm-protos.h
+++ b/gcc/config/arm/arm-protos.h
@@ -116,6 +116,8 @@ extern bool gen_stm_seq (rtx *, int);
 extern bool gen_const_stm_seq (rtx *, int);
 extern rtx arm_gen_load_multiple (int *, int, rtx, int, rtx, HOST_WIDE_INT *);
 extern rtx arm_gen_store_multiple (int *, int, rtx, int, rtx, HOST_WIDE_INT *);
+extern bool offset_ok_for_ldrd_strd (HOST_WIDE_INT);
+extern bool operands_ok_ldrd_strd (rtx, rtx, rtx, HOST_WIDE_INT, bool, bool);
 extern int arm_gen_movmemqi (rtx *);
 extern enum machine_mode arm_select_cc_mode (RTX_CODE, rtx, rtx);
 extern enum machine_mode arm_select_dominance_cc_mode (rtx, rtx,
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index fc3a508..c60e62f 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -12185,6 +12185,75 @@ arm_pad_reg_upward (enum machine_mode mode,
   return !BYTES_BIG_ENDIAN;
 }
 
+/* Returns true iff OFFSET is valid for use in an LDRD/STRD instruction,
+   assuming that the address in the base register is word aligned.  */
+bool
+offset_ok_for_ldrd_strd (HOST_WIDE_INT offset)
+{
+  HOST_WIDE_INT max_offset;
+
+  /* Offset must be a multiple of 4 in Thumb mode.  */
+  if (TARGET_THUMB2 && ((offset & 3) != 0))
+return false;
+
+  if (TARGET_THUMB2)
+max_offset = 1020;
+  else if (TARGET_ARM)
+max_offset = 255;
+  else
+gcc_unreachable ();
+
+  return ((offset <= max_offset) && (offset >= -max_offset));
+}
+
+/* Checks whether the operands are valid for use in an LDRD/STRD instruction.
+   Assumes that RT, RT2, and RN are REG.  This is guaranteed by the patterns.
+   Assumes that the address in the base register RN is word aligned.  Pattern
+   guarantees that both memory accesses use the same base register,
+   the offsets are constants within the range, and the gap between the offsets is 4.
+   If reload complete then check that registers are legal.  WBACK indicates whether
+   address is updated.  LOAD indicates whether memory access is load or store.  */
+bool
+operands_ok_ldrd_strd (rtx rt, rtx rt2, rtx rn, HOST_WIDE_INT offset,
+   bool wback, bool load)
+{
+  unsigned int t, t2, n;
+
+  if (!reload_completed)
+return true;
+
+  if (!offset_ok_for_ldrd_strd (offset))
+return false;
+
+  t = REGNO (rt);
+  t2 = REGNO (rt2);
+  n = REGNO (rn);
+
+  if ((TARGET_THUMB2)
+  && ((wback && (n == t || n == t2))
+  || (t == SP_REGNUM)
+  || (t == PC_REGNUM)
+  || (t2 == SP_REGNUM)
+  || (t2 == PC_REGNUM)
+  || (!load && (n == PC_REGNUM))
+  || (load && (t == t2))
+  /* Triggers Cortex-M3 LDRD errata.  */
+  || (!wback && load && fix_cm3_ldrd && (n == t
+return false;
+
+  if ((TARGET_ARM)
+  && ((wback && (n == t || n == t2))
+  || (t2 == PC_REGNUM)
+  || (t % 2 != 0)   /* First destination register is not even.  */
+  || (t2 != t + 1)
+  /* PC can be used as base register (for offset addressing only),
+ but it is deprecated.  */
+  || (n == PC

[v3] update C++11 status table

2012-10-19 Thread Jonathan Wakely
The table's a bit out of date

* doc/xml/manual/status_cxx2011.xml: Update.

Committed to trunk.
diff --git a/libstdc++-v3/doc/xml/manual/status_cxx2011.xml 
b/libstdc++-v3/doc/xml/manual/status_cxx2011.xml
index 226eef9..6b1ad51 100644
--- a/libstdc++-v3/doc/xml/manual/status_cxx2011.xml
+++ b/libstdc++-v3/doc/xml/manual/status_cxx2011.xml
@@ -821,15 +821,10 @@ particular release.
   Type properties
   Partial
   Missing is_trivially_copyable,
-  is_assignable, is_copy_assignable, is_move_assignable,
   is_trivially_constructible, is_trivially_default_constructible,
   is_trivially_copy_constructible, is_trivially_move_constructible,
   is_trivially_assignable, is_trivially_default_assignable,
-  is_trivially_copy_assignable, is_trivially_move_assignable,
-  is_trivially_destructible,
-  is_nothrow_assignable,
-  is_nothrow_copy_assignable, is_nothrow_move_assignable,
-  is_nothrow_destructible
+  is_trivially_copy_assignable, is_trivially_move_assignable
   
 
 



[Patch, Fortran] PR54958 - Allow ac-implied-do and data-implied-do with INTENT(IN)

2012-10-19 Thread Tobias Burnus
gfortran's INTENT(IN) check was too strict for "do" variables.  A variable
used in a normal do-stmt or in an io-implied-do belongs to the enclosing
scope and, hence, may not be modified if it is a nonpointer INTENT(IN)
variable.


However, ac-implied-do and data-implied-do live in their own scope and 
hence INTENT(IN) doesn't apply to them. (Neither does it apply to the 
FORALL/DO CONCURRENT index-name, but that was already handled correctly.)


Built and regtested on x86-64-linux.
OK for the trunk?

Tobias
2012-10-19  Tobias Burnus  

	PR fortran/54958
	* gfortran.h (gfc_resolve_iterator_expr,
	gfc_check_vardef_context): Update prototype.
	* expr.c (gfc_check_vardef_context): Add own_scope
	argument and honour it.
	* resolve.c (gfc_resolve_iterator_expr): Add own_scope
	argument and honour it.
	(resolve_deallocate_expr, resolve_allocate_expr,
	resolve_data_variables, resolve_transfer
	resolve_lock_unlock, resolve_code): Update calls.
	* array.c (resolve_array_list): Ditto.
	* check.c (gfc_check_atomic_def, gfc_check_atomic_ref): Ditto.
	* interface.c (compare_actual_formal): Ditto.
	* intrinsic.c (check_arglist): Ditto.
	* io.c (resolve_tag, gfc_resolve_dt, gfc_resolve_inquire): Ditto.

2012-10-19  Tobias Burnus  

	PR fortran/54958
	* gfortran.dg/do_check_6.f90: New.

diff --git a/gcc/fortran/array.c b/gcc/fortran/array.c
index 066ac1e..3491517 100644
--- a/gcc/fortran/array.c
+++ b/gcc/fortran/array.c
@@ -1816,7 +1816,7 @@ resolve_array_list (gfc_constructor_base base)
 	  gfc_symbol *iter_var;
 	  locus iter_var_loc;
 	 
-	  if (gfc_resolve_iterator (iter, false) == FAILURE)
+	  if (gfc_resolve_iterator (iter, false, true) == FAILURE)
 	t = FAILURE;
 
 	  /* Check for bounds referencing the iterator variable.  */
diff --git a/gcc/fortran/check.c b/gcc/fortran/check.c
index 58c5856..a490238 100644
--- a/gcc/fortran/check.c
+++ b/gcc/fortran/check.c
@@ -1046,7 +1046,7 @@ gfc_check_atomic_def (gfc_expr *atom, gfc_expr *value)
   if (scalar_check (atom, 0) == FAILURE || scalar_check (value, 1) == FAILURE)
 return FAILURE;
 
-  if (gfc_check_vardef_context (atom, false, false, NULL) == FAILURE)
+  if (gfc_check_vardef_context (atom, false, false, false, NULL) == FAILURE)
 {
   gfc_error ("ATOM argument of the %s intrinsic function at %L shall be "
 		 "definable", gfc_current_intrinsic, &atom->where);
@@ -1063,7 +1063,7 @@ gfc_check_atomic_ref (gfc_expr *value, gfc_expr *atom)
   if (scalar_check (value, 0) == FAILURE || scalar_check (atom, 1) == FAILURE)
 return FAILURE;
 
-  if (gfc_check_vardef_context (value, false, false, NULL) == FAILURE)
+  if (gfc_check_vardef_context (value, false, false, false, NULL) == FAILURE)
 {
   gfc_error ("VALUE argument of the %s intrinsic function at %L shall be "
 		 "definable", gfc_current_intrinsic, &value->where);
diff --git a/gcc/fortran/expr.c b/gcc/fortran/expr.c
index 9ac0fc6..211f304 100644
--- a/gcc/fortran/expr.c
+++ b/gcc/fortran/expr.c
@@ -4634,13 +4634,15 @@ gfc_build_intrinsic_call (const char* name, locus where, unsigned numarg, ...)
(F2008, 16.6.7) or pointer association context (F2008, 16.6.8).
This is called from the various places when resolving
the pieces that make up such a context.
+   If own_scope is true (applies to, e.g., ac-implied-do/data-implied-do
+   variables), some checks are not performed.
 
Optionally, a possible error message can be suppressed if context is NULL
and just the return status (SUCCESS / FAILURE) be requested.  */
 
 gfc_try
 gfc_check_vardef_context (gfc_expr* e, bool pointer, bool alloc_obj,
-			  const char* context)
+			  bool own_scope, const char* context)
 {
   gfc_symbol* sym = NULL;
   bool is_pointer;
@@ -4725,7 +4727,7 @@ gfc_check_vardef_context (gfc_expr* e, bool pointer, bool alloc_obj,
  assignment to a pointer component from pointer-assignment to a pointer
  component.  Note that (normal) assignment to procedure pointers is not
  possible.  */
-  check_intentin = true;
+  check_intentin = !own_scope;
   ptr_component = (sym->ts.type == BT_CLASS && CLASS_DATA (sym))
 		  ? CLASS_DATA (sym)->attr.class_pointer : sym->attr.pointer;
   for (ref = e->ref; ref && check_intentin; ref = ref->next)
@@ -4760,7 +4762,7 @@ gfc_check_vardef_context (gfc_expr* e, bool pointer, bool alloc_obj,
 }
 
   /* PROTECTED and use-associated.  */
-  if (sym->attr.is_protected && sym->attr.use_assoc  && check_intentin)
+  if (sym->attr.is_protected && sym->attr.use_assoc && check_intentin)
 {
   if (pointer && is_pointer)
 	{
@@ -4782,7 +4784,7 @@ gfc_check_vardef_context (gfc_expr* e, bool pointer, bool alloc_obj,
 
   /* Variable not assignable from a PURE procedure but appears in
  variable definition context.  */
-  if (!pointer && gfc_pure (NULL) && gfc_impure_variable (sym))
+  if (!pointer && !own_scope && gfc_pure (NULL) && gfc_impure_variable (sym))
 {
   if (context)
 	gfc_error ("Variable '%s' can not appear in a variable definition"
@@ -48

Re: [PATCH, ARM][1/4] New RTL patterns for LDRD/STRD in Thumb mode

2012-10-19 Thread Richard Earnshaw

On 19/10/12 17:51, Greta Yorsh wrote:

-Original Message-
From: Richard Earnshaw
Sent: 19 October 2012 16:44
To: Greta Yorsh
Cc: GCC Patches; Ramana Radhakrishnan; ni...@redhat.com;
p...@codesourcery.com
Subject: Re: [PATCH, ARM][1/4] New RTL patterns for LDRD/STRD in Thumb
mode

On 19/10/12 16:20, Greta Yorsh wrote:


Removed the condition "!optimize_function_for_size_p (cfun))".

The condition "current_tune->prefer_ldrd_strd" is needed because the
patterns
for LDRD/STRD appear before the patterns for LDM/STM that can match

the same

RTL
(two register in the list). Condition "reload_completed" does not

help with

it
because peephole optimizations in ldmstm.md may (after reload) create

new

RTL insn
that match this pattern.



The point of the reload_completed is that these patterns have the
potential to cause some problems if they somehow matched during earlier
passes and the address base was an eliminable register.



Thank you for the explanation. Here is an updated patch.

Regression tests and bootstrap in progress for the entire sequence, after
addressing all other comments as well.

OK for trunk, if bootstrap successful?

Thanks,
Greta


ChangeLog


gcc/

2012-10-19  Sameera Deshpande  
 Greta Yorsh  

 * config/arm/arm-protos.h (offset_ok_for_ldrd_strd): New
 declaration.
 (operands_ok_ldrd_strd): Likewise.
 * config/arm/arm.c (offset_ok_for_ldrd_strd): New function.
 (operands_ok_ldrd_strd): Likewise.
 * config/arm/arm.md (thumb2_ldrd, thumb2_ldrd_base): New patterns.
 (thumb2_ldrd_base_neg): Likewise.
 (thumb2_strd, thumb2_strd_base, thumb_strd_base_neg): Likewise.
 * predicates.md (ldrd_strd_offset_operand): New predicate.
 * config/arm/constraints.md (Do): New constraint.



OK.

R.




[ping] couple of fixes

2012-10-19 Thread Eric Botcazou
Unwind/debug info on x86-64/Windows:
  http://gcc.gnu.org/ml/gcc-patches/2012-10/msg00013.html

PR bootstrap/54820 (stage #1 bootstrap failure)
  http://gcc.gnu.org/ml/gcc-patches/2012-10/msg01093.html

-- 
Eric Botcazou


Re: [asan] migrate runtime from llvm

2012-10-19 Thread Xinliang David Li
I tried it, and this version works for me.

You probably do not need to add the m4 subdir under libasan.  The
required m4 files are either in the .. or the ../config dir.  See how
libmudflap does it.

Other than that, if there are no other comments, the change is good to
check into the branch. Remaining bugs can always be found and fixed
later.

thanks,

David



On Thu, Oct 18, 2012 at 8:04 PM, Wei Mi  wrote:
> Hi,
>
> David cought a problem in the last patch when he tries to built
> libasan. The problem was that an incomplete libtool under libasan
> build directory was generated. The cause is that the old patch used an
> old ltmain.sh to generate libtool. I fix it and attach a new patch.
> And the new patch move -lpthread and -ldl to libasan LDFLAGS.
>
> Thanks,
> Wei.


Re: [Patch, Fortran] PR 54224: [4.8 Regression] Bogus -Wunused-function warning with static function

2012-10-19 Thread Janus Weil
Hi Tobias,

>> I have found a fix that is rather trivial in the sense that it's a
>> one-liner.
>
>
> I think it is the proper fix.

good :)


>> However, it may not be as trivial conceptually (in particular I'm not 100%
>> sure what caused this regression in the first place
>
>
> I think it is the combination of having "static" noninternal procedures
> (which is a new 4.8 feature) combined with the middle-end assumption that
> the FE marks procedures as TREE_USED. (The ME does mark them as TREE_USED,
> but too late.)
>
> (The other question is why the diagnostic doesn't work for internal
> procedures - or for GNU C's nested function. I think also for module
> variables such a diagnostic would be useful. However, both are just missed
> diagnostics and not diagnostic-output regressions.)

these things we should follow up on in the PR, or are there other PRs for these?


>> Anyway, it regtests cleanly and fixes the problem as advertised. Ok for
>> trunk?
>
>
> Yes. Thanks for the patch.

Thanks. Committed as r192620.

Cheers,
Janus


Re: [asan] migrate runtime from llvm

2012-10-19 Thread Diego Novillo

On 2012-10-19 13:16 , Xinliang David Li wrote:


Other than that, if there are no other comments, the change is good to
check into the branch. Remaining bugs can always be found and fixed
later.


Agreed.  David, could you commit the patch after Wei has addressed your 
feedback?  I have just committed a merge from trunk.


Once the runtime is in, please test patches against it.  We still don't 
have dejagnu glue, so testing will need to be manual at first.



Diego.


Re: [asan] migrate runtime from llvm

2012-10-19 Thread Wei Mi
David, I put the m4 subdir under libasan because once I used the .m4
files (libtool.m4, lt~obsolete.m4, ltoptions.m4, ltsugar.m4,
ltversion.m4) and ltmain.sh under $topsrcdir, the problem you met
yesterday (a bad libtool generated under
$topbuilddir/x86_64-unknown-linux-gnu/libasan) appeared again.  That is
why I had to generate the new libtool m4 files and ltmain.sh using
libtoolize.

Thanks,
Wei.

On Fri, Oct 19, 2012 at 10:16 AM, Xinliang David Li  wrote:
> I tried it, and this version works for me.
>
> Your probably do not need to add the m4 subdir under libasan.  The
> required m4 files are either in .. or ../config dir. See how
> libmudflap does it.
>
> Other than that, if there are no other comments, the change is good to
> check into the branch. Remaining bugs can always be found and fixed
> later.
>
> thanks,
>
> David
>
>
>
> On Thu, Oct 18, 2012 at 8:04 PM, Wei Mi  wrote:
>> Hi,
>>
>> David cought a problem in the last patch when he tries to built
>> libasan. The problem was that an incomplete libtool under libasan
>> build directory was generated. The cause is that the old patch used an
>> old ltmain.sh to generate libtool. I fix it and attach a new patch.
>> And the new patch move -lpthread and -ldl to libasan LDFLAGS.
>>
>> Thanks,
>> Wei.


Re: [PATCH] PowerPC VLE port

2012-10-19 Thread David Edelsohn
Jim,

How do you want to move forward with the VLE patch?  Can you localize
more of the changes?

Thanks, David


Re: [PATCH] Improve expansion into DEBUG_IMPLICIT_PTR (PR debug/54970)

2012-10-19 Thread Cary Coutant
> 2012-10-18  Jakub Jelinek  
>
> PR debug/54970
> * cfgexpand.c (expand_debug_expr): Expand &MEM_REF[&var, n]
> as DEBUG_IMPLICIT_PTR + n if &var expands to DEBUG_IMPLICIT_PTR.
> * tree-sra.c (create_access_replacement): Allow also MEM_REFs
> with ADDR_EXPR first operand in DECL_DEBUG_EXPR expressions.
> * var-tracking.c (track_expr_p): Handle MEM_REFs in DECL_DEBUG_EXPR
> expressions.
> * dwarf2out.c (add_var_loc_to_decl): Likewise.

OK on the dwarf2out.c part.

-cary


Re: [PATCH] Rs6000 infrastructure cleanup (switches), revised patch #2f

2012-10-19 Thread Michael Meissner
I checked in your patch along with my patches into the trunk.

[gcc]
2012-10-19  Michael Meissner  
Dominique Dhumieres  

* config/rs6000/rs6000-cpus.def (POWERPC_MASKS): Don't assume
OPTION_MASK_STRICT_ALIGN is defined.
(MASK_STRICT_ALIGN): If OPTION_MASK_STRICT_ALIGN is not defined,
define this to 0 for the 3 ports that use it.

* config/rs6000/rs6000.c (OPTION_MASK_STRICT_ALIGN): Don't do
check for ports not having -mstrict-align here, move test to
rs6000-cpus.def.
(MASK_STRICT_ALIGN): Likewise.
(rs6000_debug_reg_global): Print out correct target flag words if
-mdebug=reg.

[libobjc]
2012-10-19  Michael Meissner  

* encoding.c (TARGET_VSX): Provide definitions based on the
current compilation options, and not based off the target options
structure.
(TARGET_ALTIVEC): Likewise.
(TARGET_64BIT): Likewise.

Index: gcc/config/rs6000/rs6000.c
===
--- gcc/config/rs6000/rs6000.c  (revision 192622)
+++ gcc/config/rs6000/rs6000.c  (working copy)
@@ -1121,10 +1121,6 @@ static const struct attribute_spec rs600
   { NULL,0, 0, false, false, false, NULL, false }
 };
 
-#ifndef OPTION_MASK_STRICT_ALIGN
-#define OPTION_MASK_STRICT_ALIGN 0
-#define MASK_STRICT_ALIGN 0
-#endif
 #ifndef TARGET_PROFILE_KERNEL
 #define TARGET_PROFILE_KERNEL 0
 #endif
@@ -1805,10 +1801,11 @@ rs6000_debug_reg_global (void)
 fprintf (stderr, DEBUG_FMT_S, "tune", "");
 
   cl_target_option_save (&cl_opts, &global_options);
-  rs6000_print_isa_options (stderr, 0, "target_flags", target_flags);
+  rs6000_print_isa_options (stderr, 0, "rs6000_isa_flags",
+   rs6000_isa_flags);
 
-  rs6000_print_isa_options (stderr, 0, "target_flags_explicit",
-   target_flags_explicit);
+  rs6000_print_isa_options (stderr, 0, "rs6000_isa_flags_explicit",
+   rs6000_isa_flags_explicit);
 
   rs6000_print_builtin_options (stderr, 0, "rs6000_builtin_mask",
rs6000_builtin_mask);
Index: gcc/config/rs6000/rs6000-cpus.def
===
--- gcc/config/rs6000/rs6000-cpus.def   (revision 192622)
+++ gcc/config/rs6000/rs6000-cpus.def   (working copy)
@@ -46,6 +46,17 @@
 
 #define POWERPC_7400_MASK  (OPTION_MASK_PPC_GFXOPT | OPTION_MASK_ALTIVEC)
 
+/* Deal with ports that do not have -mstrict-align.  */
+#ifdef OPTION_MASK_STRICT_ALIGN
+#define OPTION_MASK_STRICT_ALIGN_OPTIONAL OPTION_MASK_STRICT_ALIGN
+#else
+#define OPTION_MASK_STRICT_ALIGN 0
+#define OPTION_MASK_STRICT_ALIGN_OPTIONAL 0
+#ifndef MASK_STRICT_ALIGN
+#define MASK_STRICT_ALIGN 0
+#endif
+#endif
+
 /* Mask of all options to set the default isa flags based on -mcpu=.  */
 #define POWERPC_MASKS  (OPTION_MASK_ALTIVEC\
 | OPTION_MASK_CMPB \
@@ -64,7 +75,7 @@
 | OPTION_MASK_PPC_GPOPT\
 | OPTION_MASK_RECIP_PRECISION  \
 | OPTION_MASK_SOFT_FLOAT   \
-| OPTION_MASK_STRICT_ALIGN \
+| OPTION_MASK_STRICT_ALIGN_OPTIONAL\
 | OPTION_MASK_VSX)
 
 #endif
Index: libobjc/encoding.c
===
--- libobjc/encoding.c  (revision 192622)
+++ libobjc/encoding.c  (working copy)
@@ -112,6 +112,32 @@ static int __attribute__ ((__unused__)) 
 #undef ALTIVEC_VECTOR_MODE
 #define ALTIVEC_VECTOR_MODE(MODE) (0)
 
+/* Replace TARGET_VSX, TARGET_ALTIVEC, and TARGET_64BIT with constants based on
+   the current switches, rather than looking in the options structure.  */
+#ifdef _ARCH_PPC
+#undef TARGET_VSX
+#undef TARGET_ALTIVEC
+#undef TARGET_64BIT
+
+#ifdef __VSX__
+#define TARGET_VSX 1
+#else
+#define TARGET_VSX 0
+#endif
+
+#ifdef __ALTIVEC__
+#define TARGET_ALTIVEC 1
+#else
+#define TARGET_ALTIVEC 0
+#endif
+
+#ifdef _ARCH_PPC64
+#define TARGET_64BIT 1
+#else
+#define TARGET_64BIT 0
+#endif
+#endif
+
 /* Furthermore, some (powerpc) targets also use TARGET_ALIGN_NATURAL
  in their alignment macros. Currently[4.5/6], rs6000.h points this
  to a static variable, initialized by target overrides. This is reset
Index: libobjc/ChangeLog
===
--- libobjc/ChangeLog   (revision 192622)
+++ libobjc/ChangeLog   (working copy)
@@ -1,3 +1,11 @@
+2012-10-19  Michael Meissner  
+
+   * encoding.c (TARGET_VSX): Provide definitions based on the
+   current compilation options, and not based off the target options
+   structure.
+   (TARGET_ALTIVEC): Likewise.
+   (TARGET_64BIT): Likewise.
+
 2012-09-14  David Edelsohn  
 
* config

Re: Ping: RFA: add lock_length attribute to break branch-shortening cycles

2012-10-19 Thread Joern Rennecke

Quoting Richard Sandiford :


Joern Rennecke  writes:



When the condition is not fulfilled, we want to keep the length from the
previous iteration.


Right, that's what I mean.  So we need to make sure that the difference
between the address of the current instruction and the address of the
next instruction also doesn't change.  The code as posted was:

  else
{
  new_length = insn_current_length (insn);
  insn_current_address += new_length;
}

#ifdef ADJUST_INSN_LENGTH
  /* If needed, do any adjustment.  */
  tmp_length = new_length;
  ADJUST_INSN_LENGTH (insn, new_length);
  insn_current_address += (new_length - tmp_length);
#endif

  if (new_length != insn_lengths[uid]
  && (new_length > insn_lengths[uid] || first_pass))
{
  insn_lengths[uid] = new_length;
  something_changed = 1;
}

which always leaves insn_current_address based off new_length,
even if new_length is out of kilter with insn_lengths[uid].


Indeed.
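
For illustration, one way to keep insn_current_address consistent when a
shrunken length is rejected would be along these lines (just a sketch of
the idea, not necessarily the exact hunk in the attached patch):

  if (new_length != insn_lengths[uid]
      && (new_length > insn_lengths[uid] || first_pass))
    {
      insn_lengths[uid] = new_length;
      something_changed = 1;
    }
  else
    /* Keep the sticky length and back out the difference so that
       insn_current_address matches what is actually recorded.  */
    insn_current_address += insn_lengths[uid] - new_length;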

I have attached a patch which I believe addresses the issues you raised.

Because making the length sticky is problematic when UIDs aren't unique,
I have tested this patch on top of that patch:

http://gcc.gnu.org/ml/gcc-patches/2012-10/msg01806.html

regression tested on i686-pc-linux-gnu X cris-elf and
i686-pc-linux-gnu X mipsel-elf.
2012-10-19  J"orn Rennecke  

* final.c (shorten_branches): When optimizing, start with small
length and increase from there, and don't decrease lengths.

Index: final.c
===
--- final.c (revision 192573)
+++ final.c (working copy)
@@ -1066,7 +1066,15 @@ shorten_branches (rtx first ATTRIBUTE_UN
 }
 #endif /* CASE_VECTOR_SHORTEN_MODE */
 
+  /* When optimizing, we start assuming minimum length, and keep increasing
+ lengths as we find the need for this, till nothing changes.
+ When not optimizing, we start assuming maximum lengths, and
+ do a single pass to update the lengths.  */
+  bool increasing = optimize != 0;
+
   /* Compute initial lengths, addresses, and varying flags for each insn.  */
+  int (*length_fun) (rtx) = increasing ? insn_min_length : insn_default_length;
+
   for (insn_current_address = 0, insn = first;
insn != 0;
insn_current_address += insn_lengths[uid], insn = NEXT_INSN (insn))
@@ -1117,6 +1125,8 @@ shorten_branches (rtx first ATTRIBUTE_UN
 #else
  const_delay_slots = 0;
 #endif
+ int (*inner_length_fun) (rtx)
+   = const_delay_slots ? length_fun : insn_default_length;
  /* Inside a delay slot sequence, we do not do any branch shortening
 if the shortening could change the number of delay slots
 of the branch.  */
@@ -1131,7 +1141,7 @@ shorten_branches (rtx first ATTRIBUTE_UN
inner_length = (asm_insn_count (PATTERN (inner_insn))
* insn_default_length (inner_insn));
  else
-   inner_length = insn_default_length (inner_insn);
+   inner_length = inner_length_fun (inner_insn);
 
  insn_lengths[inner_uid] = inner_length;
  if (const_delay_slots)
@@ -1149,7 +1159,7 @@ shorten_branches (rtx first ATTRIBUTE_UN
}
   else if (GET_CODE (body) != USE && GET_CODE (body) != CLOBBER)
{
- insn_lengths[uid] = insn_default_length (insn);
+ insn_lengths[uid] = length_fun (insn);
  varying_length[uid] = insn_variable_length_p (insn);
}
 
@@ -1165,6 +1175,7 @@ shorten_branches (rtx first ATTRIBUTE_UN
  get the current insn length.  If it has changed, reflect the change.
  When nothing changes for a full pass, we are done.  */
 
+  bool first_pass ATTRIBUTE_UNUSED = true;
   while (something_changed)
 {
   something_changed = 0;
@@ -1220,6 +1231,7 @@ shorten_branches (rtx first ATTRIBUTE_UN
  rtx prev;
  int rel_align = 0;
  addr_diff_vec_flags flags;
+ enum machine_mode vec_mode;
 
  /* Avoid automatic aggregate initialization.  */
  flags = ADDR_DIFF_VEC_FLAGS (body);
@@ -1298,9 +1310,12 @@ shorten_branches (rtx first ATTRIBUTE_UN
  else
max_addr += align_fuzz (max_lab, rel_lab, 0, 0);
}
- PUT_MODE (body, CASE_VECTOR_SHORTEN_MODE (min_addr - rel_addr,
-   max_addr - rel_addr,
-   body));
+ vec_mode = CASE_VECTOR_SHORTEN_MODE (min_addr - rel_addr,
+  max_addr - rel_addr, body);
+ if (first_pass
+ || (GET_MODE_SIZE (vec_mode)
+ >= GET_MODE_SIZE (GET_MODE (body
+   PUT_MODE (body, vec_mode);

Re: [asan] migrate runtime from llvm

2012-10-19 Thread Xinliang David Li
The library builds fine with the following diff. The file acinclude.m4
is cloned from libmudflap.

David

index 485d169..3e847f1 100644
--- a/libasan/aclocal.m4
+++ b/libasan/aclocal.m4
@@ -1037,8 +1037,8 @@ AC_SUBST([am__tar])
 AC_SUBST([am__untar])
 ]) # _AM_PROG_TAR

-m4_include([m4/libtool.m4])
-m4_include([m4/ltoptions.m4])
-m4_include([m4/ltsugar.m4])
-m4_include([m4/ltversion.m4])
-m4_include([m4/lt~obsolete.m4])
+m4_include([../config/ltoptions.m4])
+m4_include([../config/ltsugar.m4])
+m4_include([../config/ltversion.m4])
+#m4_include([../config/lt~obsolete.m4])
+m4_include([acinclude.m4])


On Fri, Oct 19, 2012 at 11:02 AM, Wei Mi  wrote:
> David, I put the m4 subdir under libasan because once I use the .m4
> files (libtool.m4  lt~obsolete.m4  ltoptions.m4  ltsugar.m4
> ltversion.m4) and ltmain.sh under $topsrcdir, the problem that a bad
> libtool was generated under
> $topbuilddir/x86_64-unknown-linux-gnu/libasan you met yesterday
> appeared.  That is why I had to generate the new libtool m4 files and
> ltmain.sh using libtoolize.
>
> Thanks,
> Wei.
>
> On Fri, Oct 19, 2012 at 10:16 AM, Xinliang David Li  
> wrote:
>> I tried it, and this version works for me.
>>
>> Your probably do not need to add the m4 subdir under libasan.  The
>> required m4 files are either in .. or ../config dir. See how
>> libmudflap does it.
>>
>> Other than that, if there are no other comments, the change is good to
>> check into the branch. Remaining bugs can always be found and fixed
>> later.
>>
>> thanks,
>>
>> David
>>
>>
>>
>> On Thu, Oct 18, 2012 at 8:04 PM, Wei Mi  wrote:
>>> Hi,
>>>
>>> David cought a problem in the last patch when he tries to built
>>> libasan. The problem was that an incomplete libtool under libasan
>>> build directory was generated. The cause is that the old patch used an
>>> old ltmain.sh to generate libtool. I fix it and attach a new patch.
>>> And the new patch move -lpthread and -ldl to libasan LDFLAGS.
>>>
>>> Thanks,
>>> Wei.


Re: [asan] migrate runtime from llvm

2012-10-19 Thread Diego Novillo
On Fri, Oct 19, 2012 at 3:52 PM, Xinliang David Li  wrote:
> The library builds fine with the following diff. The file acinclude.m4
> is cloned from libmudflap.

This is OK with the corresponding ChangeLog entry (in ChangeLog.asan).


Diego.


breakage with "[v3] (almost) finish "

2012-10-19 Thread Hans-Peter Nilsson
> From: Jonathan Wakely 
> Date: Fri, 19 Oct 2012 18:16:51 +0200

> This adds support for piecewise construction of std::pair by
> scoped_allocator_adaptor.  The only thing missing from
> scoped_allocator_adaptor now is that my definition of OUTERMOST isn't
> recursive so doesn't work for nested scoped_allocator_adaptors.
> That's a suitably obscure use case that I'm not going to rush to fix
> it today.
> 
> * include/std/scoped_allocator (__outermost_alloc_traits): Define.
> (scoped_allocator_adaptor::destroy): Use it.
> (scoped_allocator_adaptor::construct): Likewise. Overload for
> piecewise construction of std::pair objects.
> * testsuite/20_util/scoped_allocator/2.cc: New.
> 
> Tested x86_64-linux, committed to trunk.

Looks like _U is one of those identifiers that should be
avoided:  Grep yields:
src/newlib/libc/include/ctype.h:#define _U  01

And also:
#define _L  02
#define _N  04
#define _S  010
#define _P  020
#define _C  040
#define _X  0100
#define _B  0200

Trunk is broken for cris-elf (a newlib target) as follows,
coinciding with the above commit (one in 192612:192621).

/tmp/hpautotest-gcc0/cris-elf/gccobj/./gcc/xgcc -shared-libgcc 
-B/tmp/hpautotest-gcc0/cris-elf/gccobj/./gcc -nostdinc++ 
-L/tmp/hpautotest-gcc0/cris-elf/gccobj/cris-elf/libstdc++-v3/src 
-L/tmp/hpautotest-gcc0/cris-elf/gccobj/cris-elf/libstdc++-v3/src/.libs 
-nostdinc -B/tmp/hpautotest-gcc0/cris-elf/gccobj/cris-elf/newlib/ -isystem 
/tmp/hpautotest-gcc0/cris-elf/gccobj/cris-elf/newlib/targ-include -isystem 
/tmp/hpautotest-gcc0/gcc/newlib/libc/include 
-B/tmp/hpautotest-gcc0/cris-elf/gccobj/cris-elf/libgloss/cris 
-L/tmp/hpautotest-gcc0/cris-elf/gccobj/cris-elf/libgloss/libnosys 
-L/tmp/hpautotest-gcc0/gcc/libgloss/cris 
-B/tmp/hpautotest-gcc0/cris-elf/pre/cris-elf/bin/ 
-B/tmp/hpautotest-gcc0/cris-elf/pre/cris-elf/lib/ -isystem 
/tmp/hpautotest-gcc0/cris-elf/pre/cris-elf/include -isystem 
/tmp/hpautotest-gcc0/cris-elf/pre/cris-elf/sys-include-x c++-header 
-nostdinc++ -g -O2 
-I/tmp/hpautotest-gcc0/cris-elf/gccobj/cris-elf/libstdc++-v3/include/cris-elf 
-I/tmp/hpautotest-gcc0/cris-elf/
 gccobj/cris-elf/libstdc++-v3/include 
-I/tmp/hpautotest-gcc0/gcc/libstdc++-v3/libsupc++ -O2 -g -std=gnu++0x 
/tmp/hpautotest-gcc0/gcc/libstdc++-v3/include/precompiled/stdc++.h \
-o cris-elf/bits/stdc++.h.gch/O2ggnu++0x.gch
In file included from 
/tmp/hpautotest-gcc0/cris-elf/gccobj/cris-elf/libstdc++-v3/include/cctype:44:0,
 from 
/tmp/hpautotest-gcc0/gcc/libstdc++-v3/include/precompiled/stdc++.h:36:
/tmp/hpautotest-gcc0/cris-elf/gccobj/cris-elf/libstdc++-v3/include/scoped_allocator:368:53:
 error: expected nested-name-specifier before numeric constant
   template
 ^
/tmp/hpautotest-gcc0/cris-elf/gccobj/cris-elf/libstdc++-v3/include/scoped_allocator:368:53:
 error: expected '>' before numeric constant
/tmp/hpautotest-gcc0/cris-elf/gccobj/cris-elf/libstdc++-v3/include/scoped_allocator:370:33:
 error: expected identifier before numeric constant
  construct(pair<_T1, _T2>* __p, _U&& __u, _V&& __v)

 ^
(many similar error lines)

What's the preferred replacement?

brgds, H-P


Re: [C++] Handle ?: for vectors

2012-10-19 Thread Marc Glisse
(Adding Joseph in Cc: because we'll want the same behavior in C 
afterwards. Conversation starts here: 
http://gcc.gnu.org/ml/gcc-patches/2012-10/msg01665.html )


On Thu, 18 Oct 2012, Marc Glisse wrote:

Clang's webpage says they support ?: for vector types, but my version (3.1) 
rejects it, I'll compile a new one to check what they do.


They implement ?: for vectors only in OpenCL mode, and in that mode 
implement the OpenCL semantics (<0).



OpenCL can be confusing. x?y:z means (x<0)?y:z, but !x means x==0 and x&&y 
seems to mean (x!=0)&(y!=0) (the text is not extremely clear).



On Wed, 17 Oct 2012, Jason Merrill wrote:


On 10/17/2012 11:15 PM, Marc Glisse wrote:

In the middle-end, VEC_COND_EXPR is documented to take a vector of 0 and
-1, and at expansion time, it gives vcond(x<0,y,z) to the target if its
first argument is not a comparison.


I shouldn't have mentioned the <0 in vcond which is rather random: x is 
necessarily a vector of -1 and 0 there, so anything like <0, !=0, ==-1 is 
equivalent, we just need one because of a currently missing target 
interface.


Maybe we should leave the first argument alone and let the middle end mess 
with it, then?


VEC_COND_EXPR only accepts a vector of -1 and 0, so we can't just pass 
arbitrary input from the front-end. Richard B didn't want to make 
VEC_COND_EXPR more general either.


Besides, I believe it is the front-end's responsibility to decide this. If 
we don't want to decide on a specific semantic, I think I'd rather sorry() 
if a user passes an argument that is not directly a comparison (with a 
note explaining that they can write <0 for OpenCL-like behavior) than 
have an unspecified behavior.


So between the following:
a) x?y:z means (x<0)?y:z
b) x?y:z means (x!=0)?y:z
c) x?y:z is rejected, only things like (x==t)?y:z are accepted
d) other

is the choice still b) ? That's an easy one, only 2 characters to change 
in the patch (assuming the rest is correct) :-)

(well, no, actually, I'd also remove the conversion to a signed type)
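
For concreteness, a sketch in GNU C (vector_size extension) of what the
user-visible semantics would be; this is illustrative only, not code from
the patch:

  typedef int v4si __attribute__ ((vector_size (16)));

  v4si
  f (v4si x, v4si y, v4si z)
  {
    /* Option b) reads this as an element-wise (x != 0) ? y : z
       selection; option a) would read it as (x < 0) ? y : z,
       matching OpenCL.  */
    return x ? y : z;
  }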

--
Marc Glisse

PS: sorry for the slow reply, I was at a meeting.


Re: breakage with "[v3] (almost) finish <scoped_allocator>"

2012-10-19 Thread Marc Glisse

On Fri, 19 Oct 2012, Hans-Peter Nilsson wrote:


Looks like _U is one of those identifiers that should be
avoided:

[...]

What's the preferred replacement?


std::pair seems to use _U1 and _U2 rather than _U and _V.

--
Marc Glisse


Re: breakage with "[v3] (almost) finish <scoped_allocator>"

2012-10-19 Thread Daniel Krügler
2012/10/19 Marc Glisse :
> On Fri, 19 Oct 2012, Hans-Peter Nilsson wrote:
>
>> Looks like _U is one of those identifiers that should be
>> avoided:
>
> [...]
>>
>> What's the preferred replacement?
>
>
> std::pair seems to use _U1 and _U2 rather than _U and _V.

When not numbered, I have seen (and used) _Up (similar to _Tp).

- Daniel


[RFC] Updating PowerPC's overload vector table

2012-10-19 Thread Michael Meissner
As I begin work on extensions for a potential future machine, I am running
into the overload table.  In the power7 timeframe, I just gritted my teeth and
added all of the appropriate entries, but I would like to look at making the
table somewhat friendlier to edit and to add new overloaded functions to.
Unfortunately, in its current state the table makes it easy to forget some
entries, and people's eyes glaze over when making lots of changes.

To recap the current situation, rs6000-c.c defines a table that, given an
overloaded builtin, returns the insn that implements the builtin for the
particular argument types.

struct altivec_builtin_types
{
  enum rs6000_builtins code;
  enum rs6000_builtins overloaded_code;
  signed char ret_type;
  signed char op1;
  signed char op2;
  signed char op3;
};


Here are all of the entries just for __builtin_vec_add:

const struct altivec_builtin_types altivec_overloaded_builtins[] = {
/* ... */
  /* Binary AltiVec/VSX builtins.  */
  { ALTIVEC_BUILTIN_VEC_ADD, ALTIVEC_BUILTIN_VADDUBM,
RS6000_BTI_V16QI, RS6000_BTI_bool_V16QI, RS6000_BTI_V16QI, 0 },
  { ALTIVEC_BUILTIN_VEC_ADD, ALTIVEC_BUILTIN_VADDUBM,
RS6000_BTI_V16QI, RS6000_BTI_V16QI, RS6000_BTI_bool_V16QI, 0 },
  { ALTIVEC_BUILTIN_VEC_ADD, ALTIVEC_BUILTIN_VADDUBM,
RS6000_BTI_V16QI, RS6000_BTI_V16QI, RS6000_BTI_V16QI, 0 },
  { ALTIVEC_BUILTIN_VEC_ADD, ALTIVEC_BUILTIN_VADDUBM,
RS6000_BTI_unsigned_V16QI, RS6000_BTI_bool_V16QI, 
RS6000_BTI_unsigned_V16QI, 0 },
  { ALTIVEC_BUILTIN_VEC_ADD, ALTIVEC_BUILTIN_VADDUBM,
RS6000_BTI_unsigned_V16QI, RS6000_BTI_unsigned_V16QI, 
RS6000_BTI_bool_V16QI, 0 },
  { ALTIVEC_BUILTIN_VEC_ADD, ALTIVEC_BUILTIN_VADDUBM,
RS6000_BTI_unsigned_V16QI, RS6000_BTI_unsigned_V16QI, 
RS6000_BTI_unsigned_V16QI, 0 },
  { ALTIVEC_BUILTIN_VEC_ADD, ALTIVEC_BUILTIN_VADDUHM,
RS6000_BTI_V8HI, RS6000_BTI_bool_V8HI, RS6000_BTI_V8HI, 0 },
  { ALTIVEC_BUILTIN_VEC_ADD, ALTIVEC_BUILTIN_VADDUHM,
RS6000_BTI_V8HI, RS6000_BTI_V8HI, RS6000_BTI_bool_V8HI, 0 },
  { ALTIVEC_BUILTIN_VEC_ADD, ALTIVEC_BUILTIN_VADDUHM,
RS6000_BTI_V8HI, RS6000_BTI_V8HI, RS6000_BTI_V8HI, 0 },
  { ALTIVEC_BUILTIN_VEC_ADD, ALTIVEC_BUILTIN_VADDUHM,
RS6000_BTI_unsigned_V8HI, RS6000_BTI_bool_V8HI, RS6000_BTI_unsigned_V8HI, 0 
},
  { ALTIVEC_BUILTIN_VEC_ADD, ALTIVEC_BUILTIN_VADDUHM,
RS6000_BTI_unsigned_V8HI, RS6000_BTI_unsigned_V8HI, RS6000_BTI_bool_V8HI, 0 
},
  { ALTIVEC_BUILTIN_VEC_ADD, ALTIVEC_BUILTIN_VADDUHM,
RS6000_BTI_unsigned_V8HI, RS6000_BTI_unsigned_V8HI, 
RS6000_BTI_unsigned_V8HI, 0 },
  { ALTIVEC_BUILTIN_VEC_ADD, ALTIVEC_BUILTIN_VADDUWM,
RS6000_BTI_V4SI, RS6000_BTI_bool_V4SI, RS6000_BTI_V4SI, 0 },
  { ALTIVEC_BUILTIN_VEC_ADD, ALTIVEC_BUILTIN_VADDUWM,
RS6000_BTI_V4SI, RS6000_BTI_V4SI, RS6000_BTI_bool_V4SI, 0 },
  { ALTIVEC_BUILTIN_VEC_ADD, ALTIVEC_BUILTIN_VADDUWM,
RS6000_BTI_V4SI, RS6000_BTI_V4SI, RS6000_BTI_V4SI, 0 },
  { ALTIVEC_BUILTIN_VEC_ADD, ALTIVEC_BUILTIN_VADDUWM,
RS6000_BTI_unsigned_V4SI, RS6000_BTI_bool_V4SI, RS6000_BTI_unsigned_V4SI, 0 
},
  { ALTIVEC_BUILTIN_VEC_ADD, ALTIVEC_BUILTIN_VADDUWM,
RS6000_BTI_unsigned_V4SI, RS6000_BTI_unsigned_V4SI, RS6000_BTI_bool_V4SI, 0 
},
  { ALTIVEC_BUILTIN_VEC_ADD, ALTIVEC_BUILTIN_VADDUWM,
RS6000_BTI_unsigned_V4SI, RS6000_BTI_unsigned_V4SI, 
RS6000_BTI_unsigned_V4SI, 0 },
  { ALTIVEC_BUILTIN_VEC_ADD, ALTIVEC_BUILTIN_VADDUDM,
RS6000_BTI_V2DI, RS6000_BTI_bool_V2DI, RS6000_BTI_V2DI, 0 },
  { ALTIVEC_BUILTIN_VEC_ADD, ALTIVEC_BUILTIN_VADDUDM,
RS6000_BTI_V2DI, RS6000_BTI_V2DI, RS6000_BTI_bool_V2DI, 0 },
  { ALTIVEC_BUILTIN_VEC_ADD, ALTIVEC_BUILTIN_VADDUDM,
RS6000_BTI_V2DI, RS6000_BTI_V2DI, RS6000_BTI_V2DI, 0 },
  { ALTIVEC_BUILTIN_VEC_ADD, ALTIVEC_BUILTIN_VADDUDM,
RS6000_BTI_unsigned_V2DI, RS6000_BTI_bool_V2DI, RS6000_BTI_unsigned_V2DI, 0 
},
  { ALTIVEC_BUILTIN_VEC_ADD, ALTIVEC_BUILTIN_VADDUDM,
RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V2DI, RS6000_BTI_bool_V2DI, 0 
},
  { ALTIVEC_BUILTIN_VEC_ADD, ALTIVEC_BUILTIN_VADDUDM,
RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V2DI, 
RS6000_BTI_unsigned_V2DI, 0 },
  { ALTIVEC_BUILTIN_VEC_ADD, ALTIVEC_BUILTIN_VADDFP,
RS6000_BTI_V4SF, RS6000_BTI_V4SF, RS6000_BTI_V4SF, 0 },
  { ALTIVEC_BUILTIN_VEC_ADD, VSX_BUILTIN_XVADDDP,
RS6000_BTI_V2DF, RS6000_BTI_V2DF, RS6000_BTI_V2DF, 0 },
/* ... */
};

and this is repeated over and over again for each function.  I would like to
propose something like

/* Signed builtins with vectors that take 2 arguments.  Allow bool vector of
   the same type as one of the arguments.  */
#define BU_SIGN_2(OVERLOAD, INSTANCE, TYPE) \
  { OVERLOAD, INSTANCE, RS6000_BTI_ ## TYPE, RS6000_BTI_ ## TYPE,   \
RS6000_BTI_ ## TYPE, 0 },   \
  { OVERLOAD, INSTANCE, RS6000_BTI_ ## TYPE, RS6000_BTI_ ## TYPE,   \
RS6000_BTI_bool_ ## TYPE, 0 },  \
  { OVERLOAD, INSTANCE, RS6000_BTI_ ## TYPE, RS6000_BTI_bool_ ## T

RE: [Ping]FW: [PATCH] Cilk Plus merging to trunk (2 of n)

2012-10-19 Thread Joseph S. Myers
On Thu, 4 Oct 2012, Iyer, Balaji V wrote:

> Here is a link to the latest spec. This should clear several of the
> questions you are seeking.
> (http://software.intel.com/sites/default/files/m/4/e/7/3/1/40297-
> Intel_Cilk_plus_lang_spec_2.htm#array)

This specification is much improved, especially as regards specifying the 
types of section expressions.  I'm not convinced that "the type of the 
array being subscripted shall have a declared size" is properly defined in 
standard terms, but I can guess reasonable semantics - that if the 
array-to-pointer decay were considered not to occur in such a context, 
then the expressions for the array being subscripted shall have array 
type, not pointer type, and the array type shall not be one with 
unspecified size (array[]), although it may be a VLA.  For example, given 
"int a[10];", it would be valid to say a[:] or (a)[:] but not (+a)[:].  I 
don't, however, see any testcases at all in this patch for that particular 
requirement - not even for the completely clear-cut cases, such as giving 
an error for "extern int a[]; a[:];" or "int *a; a[:];".

Say expr1 through expr9 are expressions with side effects, and you have:

expr1[expr2:expr3:expr4] = expr5[expr6:expr7:expr8] + expr9;

The spec says "However, in such a statement, a sub-expression with rank 
zero is evaluated only once." - that is, each of the nine expressions is 
evaluated once.  I don't see any calls to save_expr to ensure these 
semantics, or any testcases that verify that they are adhered to.
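
As a sketch (with made-up helper names), an execution test for the
single-evaluation rule could look like:

  int a[16], b[16];
  int n_lo, n_len;

  static int lo (void)  { n_lo++;  return 0; }
  static int len (void) { n_len++; return 8; }

  void
  f (void)
  {
    a[lo ():len ():1] = b[0:8:1] + 1;
    /* The spec requires each rank-zero sub-expression to be evaluated
       exactly once, so n_lo == 1 and n_len == 1 afterwards.  */
  }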

(Are multidimensional section expressions valid when what you have is 
pointers to pointers, e.g. "int ***p; p[0:10][0:10][0:10];"?  I don't see 
anything to rule them out, so I assume they are valid, but don't see 
testcases for them either.)

Looking at the patch itself:

In find_rank you have error ("Rank Mismatch!"); - this is not a properly 
formatted error message according to the GNU Coding standards (which 
typically would not have any uppercase).  I'd also suggest that when you 
find a rank, you store (through a location_t * pointer) the location of 
the first expression found with that rank, so if you then find a 
mismatching rank you can use error_at to point to that rank and then 
inform to point to the previous rank it didn't match.

I'm not convinced that your logic, falling back to examining each operand 
for a generic expression, is correct to find the ranks of all kinds of 
expressions.  For example, there are rules:

* "The rank of a simple subscript expression (postfix-expression [ 
expression ]) is the sum of the ranks of its operand expressions. The rank 
of the subscript operand shall not be greater than one." - how do you 
ensure this rule?  Where do you test for errors if the subscript has too 
high a rank (both in the front-end code, and in the testsuite), and test 
(in the testsuite) for cases where the subscript has rank 1?

* "The rank of a comma expression is the rank of its second operand." - I 
don't see anything special to handle that.  Are there testcases for rank 
of comma expressions?  Apart from testing rank, you may need to test how 
they are evaluated (that each part, with independent rank, gets fully 
evaluted in turn) - I don't see anything obvious in the code to handle 
them appropriately.

In general, I'd say you should have tests in the testsuite for each 
syntactic type of expression supported by GCC, both standard and GNU 
extensions, testing how it interacts with section expressions - both valid 
cases, and cases that are invalid because of rank mismatches.  As another 
example, you don't have tests of conditional expressions.

Where do you test (both in code, and testcases to verify errors) that "The 
rank of each expression in a section triplet shall be zero."?  What about 
"The rank of the postfix expression identifying the function to call shall 
be zero."?  "A full expression shall have rank zero, unless it appears in 
an expression statement or as the controlling expression of an if 
statement."?  (This means, I suppose, that uses such as initializers or 
sizes in array declarators must be rejected.)  I'd advise going through 
each sentence in the relevant part of the spec that says something is 
invalid and making sure you diagnose it and have a test of this.

Where in the patch you use int for the size of something (e.g. a list) on 
the host, please use size_t.

In extract_array_notation_exprs you appear to be reallocating every time 
something is added to a list (with XRESIZEVEC).  It would probably be more 
efficient to use the vec.h infrastructure for an automatically resizing 
vector on which you push things.

In c_parser_array_notation you appear to be converting indices to 
integer_type_node in some cases, not converting at all in other cases.  
But the spec says "The expressions in a triplet are converted to 
ptrdiff_t.", so you need to convert to target ptrdiff_t, not target int.  
And there's a requirement that "Each of the exp

Re: [PATCH] gcc-{ar,nm,ranlib}: Find binutils binaries relative to self

2012-10-19 Thread Meador Inge
On 10/18/2012 01:33 PM, Bernhard Reutner-Fischer wrote:

> On 18 October 2012 17:30:20 Meador Inge  wrote:
>> Ping ^ 2
> 
> Been a while but wasn't --with-build-sysroot for exactly this?

AFAICT, no.  --with-build-sysroot seems to be used for setting a different
sysroot to use for compiling target libraries while GCC is being built [1].
This patch fixes the gcc-{ar,nm,ranlib} programs that are built and deployed
with GCC to not rely on the current PATH when running the binutils programs
they are wrapping.

[1] http://gcc.gnu.org/install/configure.html

-- 
Meador Inge
CodeSourcery / Mentor Embedded


Re: [PATCH] gcc-{ar,nm,ranlib}: Find binutils binaries relative to self

2012-10-19 Thread Meador Inge
CC'ing the LTO maintainers.

On 10/18/2012 10:30 AM, Meador Inge wrote:
> Ping ^ 2.
> 
> On 10/09/2012 09:44 PM, Meador Inge wrote:
>> Ping.
>>
>> On 10/04/2012 03:45 PM, Meador Inge wrote:
>>> Hi All,
>>>
>>> Currently the gcc-{ar,nm,ranlib} utilities assume that binutils is in
>>> path when invoking the wrapped binutils program.  This goes against the
>>> accepted practice in GCC to find sub-programs relative to where the
>>> GCC binaries are stored and to not make assumptions about the PATH.
>>>
>>> This patch changes the gcc-{ar,nm,ranlib} utilities to do the same
>>> by factoring out some utility code for finding files from collect2.c.
>>> These functions are then leveraged to find the binutils programs.
>>> Note that similar code exist in gcc.c.  Perhaps one day everything
>>> can be merged to the file-find files.
>>>
>>> Tested for Windows and GNU/Linux hosts and i686-pc-linux-gnu and
>>> arm-none-eabi targets.  OK?
>>>
>>> P.S. I am not quite sure what is best for the copyrights and contributed
>>> by comments in the file-find* files I added since that code was just moved.
>>> This patch drops the contributed by and keeps all the copyright dates from
>>> collect2.c.
>>>
>>> 2012-10-04  Meador Inge  
>>>
>>> * collect2.c (main): Call find_file_set_debug.
>>> (find_a_find, add_prefix, prefix_from_env, prefix_from_string):
>>> Factor out into ...
>>> * file-find.c (New file): ... here and ...
>>> * file-find.h (New file): ... here.
>>> * gcc-ar.c (standard_exec_prefix): New variable.
>>> (standard_libexec_prefix): Ditto.
>>> (tooldir_base_prefix) Ditto.
>>> (self_exec_prefix): Ditto.
>>> (self_libexec_prefix): Ditto.
>>> (self_tooldir_prefix): Ditto.
>>> (target_version): Ditto.
>>> (path): Ditto.
>>> (target_path): Ditto.
>>> (setup_prefixes): New function.
>>> (main): Rework how wrapped programs are found.
>>> * Makefile.in (OBJS-libcommon-target): Add file-find.o.
>>> (AR_OBJS): New variable.
>>> (gcc-ar$(exeext)): Add dependency on $(AR_OBJS).
>>> (gcc-nm$(exeext)): Ditto.
>>> (gcc-ranlib(exeext)): Ditto.
>>> (COLLECT2_OBJS): Add file-find.o.
>>> (collect2.o): Add file-find.h prerequisite.
>>> (file-find.o): New rule.
>>>
>>> Index: gcc/gcc-ar.c
>>> ===
>>> --- gcc/gcc-ar.c(revision 192099)
>>> +++ gcc/gcc-ar.c(working copy)
>>> @@ -21,21 +21,110 @@
>>>  #include "config.h"
>>>  #include "system.h"
>>>  #include "libiberty.h"
>>> +#include "file-find.h"
>>>  
>>>  #ifndef PERSONALITY
>>>  #error "Please set personality"
>>>  #endif
>>>  
>>> +/* The exec prefix as derived at compile-time from --prefix.  */
>>> +
>>> +static const char standard_exec_prefix[] = STANDARD_EXEC_PREFIX;
>>> +
>>> +/* The libexec prefix as derived at compile-time from --prefix.  */
>>> +
>>>  static const char standard_libexec_prefix[] = STANDARD_LIBEXEC_PREFIX;
>>> +
>>> +/* The bindir prefix as derived at compile-time from --prefix.  */
>>> +
>>>  static const char standard_bin_prefix[] = STANDARD_BINDIR_PREFIX;
>>> -static const char *const target_machine = TARGET_MACHINE;
>>>  
>>> +/* A relative path to be used in finding the location of tools
>>> +   relative to this program.  */
>>> +
>>> +static const char *const tooldir_base_prefix = TOOLDIR_BASE_PREFIX;
>>> +
>>> +/* The exec prefix as relocated from the location of this program.  */
>>> +
>>> +static const char *self_exec_prefix;
>>> +
>>> +/* The libexec prefix as relocated from the location of this program.  */
>>> +
>>> +static const char *self_libexec_prefix;
>>> +
>>> +/* The tools prefix as relocated from the location of this program.  */
>>> +
>>> +static const char *self_tooldir_prefix;
>>> +
>>> +/* The name of the machine that is being targeted.  */
>>> +
>>> +static const char *const target_machine = DEFAULT_TARGET_MACHINE;
>>> +
>>> +/* The target version.  */
>>> +
>>> +static const char *const target_version = DEFAULT_TARGET_VERSION;
>>> +
>>> +/* The collection of target specific path prefixes.  */
>>> +
>>> +static struct path_prefix target_path;
>>> +
>>> +/* The collection path prefixes.  */
>>> +
>>> +static struct path_prefix path;
>>> +
>>> +/* The directory separator.  */
>>> +
>>>  static const char dir_separator[] = { DIR_SEPARATOR, 0 };
>>>  
>>> +static void
>>> +setup_prefixes (const char *exec_path)
>>> +{
>>> +  const char *self;
>>> +
>>> +  self = getenv ("GCC_EXEC_PREFIX");
>>> +  if (!self)
>>> +self = exec_path;
>>> +  else
>>> +self = concat (self, "gcc-" PERSONALITY, NULL);
>>> +
>>> +  /* Relocate the exec prefix.  */
>>> +  self_exec_prefix = make_relative_prefix (self,
>>> +  standard_bin_prefix,
>>> +  standard_exec_prefix);
>>> +  if (self_exec_prefix == NULL)
>>> +self_exec_prefix = standard_exec_prefix;
>>> +
>>> +  /* Relocate libexec prefix.  */
>>> +  self_libexec_pre

RE: [Ping]FW: [PATCH] Cilk Plus merging to trunk (2 of n)

2012-10-19 Thread Iyer, Balaji V
Hi Joseph,
Thank you very much for your response. I will look into this and get 
back to you soon!

-Balaji V. Iyer.

>-Original Message-
>From: Joseph Myers [mailto:jos...@codesourcery.com]
>Sent: Friday, October 19, 2012 5:38 PM
>To: Iyer, Balaji V
>Cc: Richard Guenther; gcc-patches@gcc.gnu.org
>Subject: RE: [Ping]FW: [PATCH] Cilk Plus merging to trunk (2 of n)
>
>On Thu, 4 Oct 2012, Iyer, Balaji V wrote:
>
>> Here is a link to the latest spec. This should clear several of
>> the questions you are seeking.
>> (http://software.intel.com/sites/default/files/m/4/e/7/3/1/40297-
>> Intel_Cilk_plus_lang_spec_2.htm#array)
>
>This specification is much improved, especially as regards specifying the 
>types of
>section expressions.  I'm not convinced that "the type of the array being
>subscripted shall have a declared size" is properly defined in standard terms, 
>but I
>can guess reasonable semantics - that if the array-to-pointer decay were
>considered not to occur in such a context, then the expressions for the array
>being subscripted shall have array type, not pointer type, and the array type 
>shall
>not be one with unspecified size (array[]), although it may be a VLA.  For 
>example,
>given "int a[10];", it would be valid to say a[:] or (a)[:] but not (+a)[:].  
>I don't,
>however, see any testcases at all in this patch for that particular 
>requirements -
>not even for the completely clear-cut cases, such as giving an error for 
>"extern int
>a[]; a[:];" or "int *a; a[:];".
>
>Say expr1 through expr9 are expressions with side effects, and you have:
>
>expr1[expr2:expr3:expr4] = expr5[expr6:expr7:expr8] + expr9;
>
>The spec says "However, in such a statement, a sub-expression with rank zero is
>evaluated only once." - that is, each of the nine expressions is evaluated 
>once.  I
>don't see any calls to save_expr to ensure these semantics, or any testcases 
>that
>verify that they are adhered to.
>
>(Are multidimensional section expressions valid when what you have is pointers
>to pointers, e.g. "int ***p; p[0:10][0:10][0:10];"?  I don't see anything to 
>rule
>them out, so I assume they are valid, but don't see testcases for them either.)
>
>Looking at the patch itself:
>
>In find_rank you have error ("Rank Mismatch!"); - this is not a properly 
>formatted
>error message according to the GNU Coding standards (which typically would not
>have any uppercase).  I'd also suggest that when you find a rank, you store
>(through a location_t * pointer) the location of the first expression found 
>with
>that rank, so if you then find a mismatching rank you can use error_at to 
>point to
>that rank and then inform to point to the previous rank it didn't match.
>
>I'm not convinced that your logic, falling back to examining each operand for a
>generic expression, is correct to find the ranks of all kinds of expressions.  
>For
>example, there are rules:
>
>* "The rank of a simple subscript expression (postfix-expression [ expression 
>]) is
>the sum of the ranks of its operand expressions. The rank of the subscript
>operand shall not be greater than one." - how do you ensure this rule?  Where 
>do
>you test for errors if the subscript has too high a rank (both in the 
>front-end code,
>and in the testsuite), and test (in the testsuite) for cases where the 
>subscript has
>rank 1?
>
>* "The rank of a comma expression is the rank of its second operand." - I don't
>see anything special to handle that.  Are there testcases for rank of comma
>expressions?  Apart from testing rank, you may need to test how they are
>evaluated (that each part, with independent rank, gets fully evaluted in turn) 
>- I
>don't see anything obvious in the code to handle them appropriately.
>
>In general, I'd say you should have tests in the testsuite for each syntactic 
>type of
>expression supported by GCC, both standard and GNU extensions, testing how it
>interacts with section expressions - both valid cases, and cases that are 
>invalid
>because of rank mismatches.  As another example, you don't have tests of
>conditional expressions.
>
>Where do you test (both in code, and testcases to verify errors) that "The 
>rank of
>each expression in a section triplet shall be zero."?  What about "The rank of 
>the
>postfix expression identifying the function to call shall be zero."?  "A full
>expression shall have rank zero, unless it appears in an expression statement 
>or as
>the controlling expression of an if statement."?  (This means, I suppose, that 
>uses
>such as initializers or sizes in array declarators must be rejected.)  I'd 
>advise going
>through each sentence in the relevant part of the spec that says something is
>invalid and making sure you diagnose it and have a test of this.
>
>Where in the patch you use int for the size of something (e.g. a list) on the 
>host,
>please use size_t.
>
>In extract_array_notation_exprs you appear to be reallocating every time
>something is added to a list (with XRESIZEVE

Re: [RFC] Updating PowerPC's overload vector table

2012-10-19 Thread Segher Boessenkool

const struct altivec_builtin_types altivec_overloaded_builtins[] = {


[big fat table constructed using many macros]


Now, it bothers me that every time we do the overloaded builtins, we
essentially have to scan the table in a linear fashion.  I'm thinking
that we may want to move these tables into separate arrays, one for each
overloaded function, with a null marker at the end.  This is more of a
radical reworking.  If I did this, I likely would keep the big table as
a last resort, and add the arrays, one at a time.

Any thoughts?


You could construct the table at runtime, using helper functions for
the regular groups.  It's easy to make a more efficient data structure
as well then.
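
For illustration, such runtime construction might look roughly like this
(the helper and array names are made up; only the entry layout and the
RS6000_BTI_*/builtin enumerators come from the existing table):

  /* Sketch only: build the rows for one overloaded builtin at startup
     instead of spelling them out in the big static table.  */
  static struct altivec_builtin_types overload_table[4096];
  static size_t n_overloads;

  static void
  push_overload (enum rs6000_builtins code, enum rs6000_builtins insn,
                 signed char ret, signed char op1, signed char op2)
  {
    struct altivec_builtin_types *e = &overload_table[n_overloads++];
    e->code = code;
    e->overloaded_code = insn;
    e->ret_type = ret;
    e->op1 = op1;
    e->op2 = op2;
    e->op3 = 0;
  }

  /* One "regular group": signed binary op where a bool vector is
     allowed on either side, equivalent to the three V16QI rows above.  */
  static void
  push_signed_binary (enum rs6000_builtins code, enum rs6000_builtins insn,
                      signed char type, signed char bool_type)
  {
    push_overload (code, insn, type, type, type);
    push_overload (code, insn, type, type, bool_type);
    push_overload (code, insn, type, bool_type, type);
  }

  /* Usage: push_signed_binary (ALTIVEC_BUILTIN_VEC_ADD,
     ALTIVEC_BUILTIN_VADDUBM, RS6000_BTI_V16QI, RS6000_BTI_bool_V16QI);  */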


Segher



Re: Minimize downward code motion during reassociation

2012-10-19 Thread Xinliang David Li
On Thu, Oct 18, 2012 at 3:36 PM, Easwaran Raman  wrote:
> Hi,
>
> During expression reassociation, statements are conservatively moved
> downwards to ensure that dependences are correctly satisfied after
> reassocation. This could lead to lengthening of live ranges. This
> patch moves statements only to the extent necessary. Bootstraps and no
> test regression on x86_64/linux. OK for trunk?
>
> Thanks,
> Easwaran
>
> 2012-10-18   Easwaran Raman  
> * tree-ssa-reassoc.c(assign_uids): New function.
> (assign_uids_in_relevant_bbs): Likewise.
> (ensure_ops_are_available): Likewise.
> (rewrite_expr_tree): Do not move statements beyond what is
> necessary. Remove call to swap_ops_for_binary_stmt...
> (reassociate_bb): ... and move it here.
>
>
>
> Index: gcc/tree-ssa-reassoc.c
> ===
> --- gcc/tree-ssa-reassoc.c (revision 192487)
> +++ gcc/tree-ssa-reassoc.c (working copy)
> @@ -2250,6 +2250,128 @@ swap_ops_for_binary_stmt (VEC(operand_entry_t, hea
>  }
>  }
>
> +/* Assign UIDs to statements in basic block BB.  */
> +
> +static void
> +assign_uids (basic_block bb)
> +{
> +  unsigned uid = 0;
> +  gimple_stmt_iterator gsi;
> +  /* First assign uids to phis.  */
> +  for (gsi = gsi_start_phis (bb); !gsi_end_p (gsi); gsi_next (&gsi))
> +{
> +  gimple stmt = gsi_stmt (gsi);
> +  gimple_set_uid (stmt, uid++);
> +}
> +
> +  /* Then assign uids to stmts.  */
> +  for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
> +{
> +  gimple stmt = gsi_stmt (gsi);
> +  gimple_set_uid (stmt, uid++);
> +}
> +}
> +
> +/* For each operand in OPS, find the basic block that contains the statement
> +   which defines the operand. For all such basic blocks, assign UIDs.  */
> +
> +static void
> +assign_uids_in_relevant_bbs (VEC(operand_entry_t, heap) * ops)
> +{
> +  operand_entry_t oe;
> +  int i;
> +  struct pointer_set_t *seen_bbs = pointer_set_create ();
> +
> +  for (i = 0; VEC_iterate (operand_entry_t, ops, i, oe); i++)
> +{
> +  gimple def_stmt;
> +  basic_block bb;
> +  if (TREE_CODE (oe->op) != SSA_NAME)
> +continue;
> +  def_stmt = SSA_NAME_DEF_STMT (oe->op);
> +  bb = gimple_bb (def_stmt);
> +  if (!pointer_set_contains (seen_bbs, bb))
> +{
> +  assign_uids (bb);
> +  pointer_set_insert (seen_bbs, bb);
> +}
> +}
> +  pointer_set_destroy (seen_bbs);
> +}
> +
> +/* Ensure that operands in the OPS vector starting from OPINDEXth
> entry are live
> +   at STMT.  This is accomplished by moving STMT if needed.  */
> +
> +static void
> +ensure_ops_are_available (gimple stmt, VEC(operand_entry_t, heap) *
> ops, int opindex)

format issue?

> +{
> +  int i;
> +  int len = VEC_length (operand_entry_t, ops);
> +  gimple insert_stmt = stmt;
> +  basic_block insert_bb = gimple_bb (stmt);
> +  gimple_stmt_iterator gsi_insert, gsistmt;
> +  for (i = opindex; i < len; i++)
> +{
> +  operand_entry_t oe = VEC_index (operand_entry_t, ops, i);
> +  gimple def_stmt;
> +  basic_block def_bb;
> +  /* Ignore constants and operands with default definitons.  */
> +  if (TREE_CODE (oe->op) != SSA_NAME
> +  || SSA_NAME_IS_DEFAULT_DEF (oe->op))
> +continue;
> +  def_stmt = SSA_NAME_DEF_STMT (oe->op);
> +  def_bb = gimple_bb (def_stmt);
> +  if (def_bb != insert_bb
> +  && !dominated_by_p (CDI_DOMINATORS, insert_bb, def_bb))
> +{
> +  insert_bb = def_bb;
> +  insert_stmt = def_stmt;
> +}
> +  else if (def_bb == insert_bb
> +   && gimple_uid (insert_stmt) < gimple_uid (def_stmt))
> +insert_stmt = def_stmt;
> +}
> +  if (insert_stmt == stmt)
> +return;
> +  gsistmt = gsi_for_stmt (stmt);
> +  /* If GSI_STMT is a phi node, then do not insert just after that statement.

GSI_STMT --> INSERT_STMT?

> + Instead, find the first non-label gimple statement in BB and insert 
> before
> + that.  */
> +  if (gimple_code (insert_stmt) == GIMPLE_PHI)
> +{
> +  gsi_insert = gsi_after_labels (insert_bb);
> +  gsi_move_before (&gsistmt, &gsi_insert);
> +}
> +  /* Statements marked for throw can not be in the middle of a basic block. 
> So
> + we can not insert a statement (not marked for throw) immediately
> after.  */
> +  else if (lookup_stmt_eh_lp (insert_stmt) > 0
> +   && stmt_can_throw_internal (insert_stmt))




> +{
> +  edge e, succ_edge = NULL;
> +  edge_iterator ei;
> +
> +  /* There should be exactly one normal edge out of the basic block.  */
> +  FOR_EACH_EDGE (e, ei, insert_bb->succs)
> +{
> +  if (!(e->flags & EDGE_COMPLEX))
> +{
> +  gcc_assert (succ_edge == NULL);
> +  succ_edge = e;
> +}
> +}
> +  /* Insert STMT at the beginning of the successor basic block.  */
> +  insert_bb = succ_edge->dest;
> +  gsi_insert = gsi

Re: Minimize downward code motion during reassociation

2012-10-19 Thread Easwaran Raman
On Fri, Oct 19, 2012 at 5:13 PM, Xinliang David Li  wrote:
> On Thu, Oct 18, 2012 at 3:36 PM, Easwaran Raman  wrote:
>> Hi,
>>
>> During expression reassociation, statements are conservatively moved
>> downwards to ensure that dependences are correctly satisfied after
>> reassocation. This could lead to lengthening of live ranges. This
>> patch moves statements only to the extent necessary. Bootstraps and no
>> test regression on x86_64/linux. OK for trunk?
>>
>> Thanks,
>> Easwaran
>>
>> 2012-10-18   Easwaran Raman  
>> * tree-ssa-reassoc.c(assign_uids): New function.
>> (assign_uids_in_relevant_bbs): Likewise.
>> (ensure_ops_are_available): Likewise.
>> (rewrite_expr_tree): Do not move statements beyond what is
>> necessary. Remove call to swap_ops_for_binary_stmt...
>> (reassociate_bb): ... and move it here.
>>
>>
>>
>> Index: gcc/tree-ssa-reassoc.c
>> ===
>> --- gcc/tree-ssa-reassoc.c (revision 192487)
>> +++ gcc/tree-ssa-reassoc.c (working copy)
>> @@ -2250,6 +2250,128 @@ swap_ops_for_binary_stmt (VEC(operand_entry_t, hea
>>  }
>>  }
>>
>> +/* Assign UIDs to statements in basic block BB.  */
>> +
>> +static void
>> +assign_uids (basic_block bb)
>> +{
>> +  unsigned uid = 0;
>> +  gimple_stmt_iterator gsi;
>> +  /* First assign uids to phis.  */
>> +  for (gsi = gsi_start_phis (bb); !gsi_end_p (gsi); gsi_next (&gsi))
>> +{
>> +  gimple stmt = gsi_stmt (gsi);
>> +  gimple_set_uid (stmt, uid++);
>> +}
>> +
>> +  /* Then assign uids to stmts.  */
>> +  for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
>> +{
>> +  gimple stmt = gsi_stmt (gsi);
>> +  gimple_set_uid (stmt, uid++);
>> +}
>> +}
>> +
>> +/* For each operand in OPS, find the basic block that contains the statement
>> +   which defines the operand. For all such basic blocks, assign UIDs.  */
>> +
>> +static void
>> +assign_uids_in_relevant_bbs (VEC(operand_entry_t, heap) * ops)
>> +{
>> +  operand_entry_t oe;
>> +  int i;
>> +  struct pointer_set_t *seen_bbs = pointer_set_create ();
>> +
>> +  for (i = 0; VEC_iterate (operand_entry_t, ops, i, oe); i++)
>> +{
>> +  gimple def_stmt;
>> +  basic_block bb;
>> +  if (TREE_CODE (oe->op) != SSA_NAME)
>> +continue;
>> +  def_stmt = SSA_NAME_DEF_STMT (oe->op);
>> +  bb = gimple_bb (def_stmt);
>> +  if (!pointer_set_contains (seen_bbs, bb))
>> +{
>> +  assign_uids (bb);
>> +  pointer_set_insert (seen_bbs, bb);
>> +}
>> +}
>> +  pointer_set_destroy (seen_bbs);
>> +}
>> +
>> +/* Ensure that operands in the OPS vector starting from OPINDEXth
>> entry are live
>> +   at STMT.  This is accomplished by moving STMT if needed.  */
>> +
>> +static void
>> +ensure_ops_are_available (gimple stmt, VEC(operand_entry_t, heap) *
>> ops, int opindex)
>
> format issue?
>
>> +{
>> +  int i;
>> +  int len = VEC_length (operand_entry_t, ops);
>> +  gimple insert_stmt = stmt;
>> +  basic_block insert_bb = gimple_bb (stmt);
>> +  gimple_stmt_iterator gsi_insert, gsistmt;
>> +  for (i = opindex; i < len; i++)
>> +{
>> +  operand_entry_t oe = VEC_index (operand_entry_t, ops, i);
>> +  gimple def_stmt;
>> +  basic_block def_bb;
>> +  /* Ignore constants and operands with default definitons.  */
>> +  if (TREE_CODE (oe->op) != SSA_NAME
>> +  || SSA_NAME_IS_DEFAULT_DEF (oe->op))
>> +continue;
>> +  def_stmt = SSA_NAME_DEF_STMT (oe->op);
>> +  def_bb = gimple_bb (def_stmt);
>> +  if (def_bb != insert_bb
>> +  && !dominated_by_p (CDI_DOMINATORS, insert_bb, def_bb))
>> +{
>> +  insert_bb = def_bb;
>> +  insert_stmt = def_stmt;
>> +}
>> +  else if (def_bb == insert_bb
>> +   && gimple_uid (insert_stmt) < gimple_uid (def_stmt))
>> +insert_stmt = def_stmt;
>> +}
>> +  if (insert_stmt == stmt)
>> +return;
>> +  gsistmt = gsi_for_stmt (stmt);
>> +  /* If GSI_STMT is a phi node, then do not insert just after that 
>> statement.
>
> GSI_STMT --> INSERT_STMT?

Ok.
>> + Instead, find the first non-label gimple statement in BB and insert 
>> before
>> + that.  */
>> +  if (gimple_code (insert_stmt) == GIMPLE_PHI)
>> +{
>> +  gsi_insert = gsi_after_labels (insert_bb);
>> +  gsi_move_before (&gsistmt, &gsi_insert);
>> +}
>> +  /* Statements marked for throw can not be in the middle of a basic block. 
>> So
>> + we can not insert a statement (not marked for throw) immediately
>> after.  */
>> +  else if (lookup_stmt_eh_lp (insert_stmt) > 0
>> +   && stmt_can_throw_internal (insert_stmt))
>
>
>
>
>> +{
>> +  edge e, succ_edge = NULL;
>> +  edge_iterator ei;
>> +
>> +  /* There should be exactly one normal edge out of the basic block.  */
>> +  FOR_EACH_EDGE (e, ei, insert_bb->succs)
>> +{
>> +  if (!(e->flags & EDGE_COMPLEX))
>> +{
>> +  gcc_assert (

Re: [PATCH] Fix dumps for IPA passes

2012-10-19 Thread Sharad Singhai
As suggested in http://gcc.gnu.org/ml/gcc/2012-10/msg00285.html, I
have updated the attached patch to rename 'dump_enabled_p' to
'dump_phase_enabled_p'. The new 'dump_enabled_p ()' doesn't take any
argument and can be used as a predicate for the dump calls.

Once this patch gets in, the plan is to update the existing calls (in
vectorizer passes) of the form
  if (dump_kind_p (flags))
  dump_printf(flags, ...)

to

  if (dump_enabled_p ())
  dump_printf(flags, ...)
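
For reference, the resulting call pattern in a pass would then look
roughly like this (illustrative only; MSG_NOTE and vect_location are the
identifiers already used by the vectorizer dumps):

  /* The cheap predicate guards construction of the dump text;
     dump_printf_loc then writes to dump_file and/or alt_dump_file
     according to the -fdump- and -fopt-info flags.  */
  if (dump_enabled_p ())
    dump_printf_loc (MSG_NOTE, vect_location,
                     "loop versioned for vectorization\n");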

Bootstrapped and tested on x86_64 and didn't observe any new test
failures. Okay for trunk?

Thanks,
Sharad

2012-10-19  Sharad Singhai  

* dumpfile.c (dump_phase_enabled_p): Renamed dump_enabled_p. Update
all callers.
(dump_enabled_p): A new function to check if any of the dump files
is available.
(dump_kind_p): Remove check for current_function_decl. Add check for
dumpfile and alt_dump_file.
* dumpfile.h: Add declaration of dump_enabled_p.

Index: dumpfile.c
===
--- dumpfile.c (revision 192623)
+++ dumpfile.c (working copy)
@@ -35,7 +35,7 @@ static int alt_flags;/* current op
 static FILE *alt_dump_file = NULL;

 static void dump_loc (int, FILE *, source_location);
-static int dump_enabled_p (int);
+static int dump_phase_enabled_p (int);
 static FILE *dump_open_alternate_stream (struct dump_file_info *);

 /* Table of tree dump switches. This must be consistent with the
@@ -380,7 +380,7 @@ dump_start (int phase, int *flag_ptr)
   char *name;
   struct dump_file_info *dfi;
   FILE *stream;
-  if (phase == TDI_none || !dump_enabled_p (phase))
+  if (phase == TDI_none || !dump_phase_enabled_p (phase))
 return 0;

   dfi = get_dump_file_info (phase);
@@ -461,7 +461,7 @@ dump_begin (int phase, int *flag_ptr)
   struct dump_file_info *dfi;
   FILE *stream;

-  if (phase == TDI_none || !dump_enabled_p (phase))
+  if (phase == TDI_none || !dump_phase_enabled_p (phase))
 return NULL;

   name = get_dump_file_name (phase);
@@ -493,8 +493,8 @@ dump_begin (int phase, int *flag_ptr)
If PHASE is TDI_tree_all, return nonzero if any dump is enabled for
any phase.  */

-int
-dump_enabled_p (int phase)
+static int
+dump_phase_enabled_p (int phase)
 {
   if (phase == TDI_tree_all)
 {
@@ -514,6 +514,14 @@ dump_begin (int phase, int *flag_ptr)
 }
 }

+/* Return true if any of the dumps are enabled, false otherwise. */
+
+inline bool
+dump_enabled_p (void)
+{
+  return (dump_file || alt_dump_file);
+}
+
 /* Returns nonzero if tree dump PHASE has been initialized.  */

 int
@@ -834,9 +842,8 @@ opt_info_switch_p (const char *arg)
 bool
 dump_kind_p (int msg_type)
 {
-  if (!current_function_decl)
-return 0;
-  return ((msg_type & pflags) || (msg_type & alt_flags));
+  return (dump_file && (msg_type & pflags))
+|| (alt_dump_file && (msg_type & alt_flags));
 }

 /* Print basic block on the dump streams.  */
Index: dumpfile.h
===
--- dumpfile.h (revision 192623)
+++ dumpfile.h (working copy)
@@ -121,6 +121,7 @@ extern int dump_switch_p (const char *);
 extern int opt_info_switch_p (const char *);
 extern const char *dump_flag_name (int);
 extern bool dump_kind_p (int);
+extern inline bool dump_enabled_p (void);
 extern void dump_printf (int, const char *, ...) ATTRIBUTE_PRINTF_2;
 extern void dump_printf_loc (int, source_location,
  const char *, ...) ATTRIBUTE_PRINTF_3;


Re: breakage with "[v3] (almost) finish <scoped_allocator>"

2012-10-19 Thread Jonathan Wakely
On 19 October 2012 21:28, Hans-Peter Nilsson wrote:
> Looks like _U is one of those identifiers that should be
> avoided:  Grep yields:
> src/newlib/libc/include/ctype.h:#define _U  01

Argh! my bad, sorry - fix on the way ...


Re: breakage with "[v3] (almost) finish <scoped_allocator>"

2012-10-19 Thread Jonathan Wakely
On 20 October 2012 02:35, Jonathan Wakely wrote:
> On 19 October 2012 21:28, Hans-Peter Nilsson wrote:
>> Looks like _U is one of those identifiers that should be
>> avoided:  Grep yields:
>> src/newlib/libc/include/ctype.h:#define _U  01
>
> Argh! my bad, sorry - fix on the way ...

Sorry about that, _U is in the badnames list and I really should know better.

Fixed by this patch:

* include/std/scoped_allocator: Avoid badname.

Tested x86_64-linux, committed to trunk.
commit dc33ce8c77b3c7d6a39cdf28359cbc2afae9d1da
Author: Jonathan Wakely 
Date:   Sat Oct 20 02:40:43 2012 +0100

* include/std/scoped_allocator: Avoid badname.

diff --git a/libstdc++-v3/include/std/scoped_allocator 
b/libstdc++-v3/include/std/scoped_allocator
index 81365b6..07a2e5e 100644
--- a/libstdc++-v3/include/std/scoped_allocator
+++ b/libstdc++-v3/include/std/scoped_allocator
@@ -365,31 +365,31 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
construct(pair<_T1, _T2>* __p)
{ construct(__p, piecewise_construct, tuple<>(), tuple<>()); }
 
-  template<typename _U, typename _V>
+  template<typename _Up, typename _Vp>
void
-   construct(pair<_T1, _T2>* __p, _U&& __u, _V&& __v)
+   construct(pair<_T1, _T2>* __p, _Up&& __u, _Vp&& __v)
{
  construct(__p, piecewise_construct,
-   std::forward_as_tuple(std::forward<_U>(__u)),
-   std::forward_as_tuple(std::forward<_V>(__v)));
+   std::forward_as_tuple(std::forward<_Up>(__u)),
+   std::forward_as_tuple(std::forward<_Vp>(__v)));
}
 
-  template<typename _U, typename _V>
+  template<typename _Up, typename _Vp>
void
-   construct(pair<_T1, _T2>* __p, const pair<_U, _V>& __x)
+   construct(pair<_T1, _T2>* __p, const pair<_Up, _Vp>& __x)
{
  construct(__p, piecewise_construct,
std::forward_as_tuple(__x.first),
std::forward_as_tuple(__x.second));
}
 
-  template<typename _U, typename _V>
+  template<typename _Up, typename _Vp>
void
-   construct(pair<_T1, _T2>* __p, pair<_U, _V>&& __x)
+   construct(pair<_T1, _T2>* __p, pair<_Up, _Vp>&& __x)
{
  construct(__p, piecewise_construct,
-   std::forward_as_tuple(std::forward<_U>(__x.first)),
-   std::forward_as_tuple(std::forward<_V>(__x.second)));
+   std::forward_as_tuple(std::forward<_Up>(__x.first)),
+   std::forward_as_tuple(std::forward<_Vp>(__x.second)));
}
 
   template


Re: [PATCH] Rs6000 infrastructure cleanup (switches), revised patch #2f

2012-10-19 Thread Mike Stump
Thanks.


Re: [RFC] Updating PowerPC's overload vector table

2012-10-19 Thread Mike Stump
On Oct 19, 2012, at 2:27 PM, Michael Meissner  
wrote:
> As I beging work on extensions for a potential future machine, I am running
> into the overload table.

I still like the built-in mechanism Kenny and I have.  Not perfect, but 
vastly better...  If you wanted to help submitify it, I bet we could find a 
way to get it to you.  Iterators, overloading, overload resolution, in 
parameters, in/out parameters; it ties nicely into RTL patterns.  There is an 
.md specification language for them and a generator, so one only has a 
specification; the implementation is generated by the generator.  It doesn't 
yet generate documentation, but that is 20 lines away if I had to guess.


Re: [PATCH, ARM] Fix PR44557 (Thumb-1 ICE)

2012-10-19 Thread Ramana Radhakrishnan
On Tue, Oct 16, 2012 at 10:25 AM, Chung-Lin Tang
 wrote:
> On 12/9/27 6:25 AM, Janis Johnson wrote:
>> On 09/26/2012 01:58 AM, Chung-Lin Tang wrote:
>>
>> +/* { dg-do compile } */
>> +/* { dg-options "-mthumb -O1 -march=armv5te -fno-omit-frame-pointer 
>> -fno-forward-propagate" }  */
>> +/* { dg-require-effective-target arm_thumb1_ok } */
>>
>> This test will fail to compile for test flags that conflict with
>> the -march option, and the specified -march option might be
>> overridden with similar options from other test flags.  The problem
>> might have also been seen for other -march options.  I recommend
>> leaving it off and omitting the dg-require so the test can be run
>> for more multilibs.
>
> I'm not sure, as the intent is to test a Thumb-1 case here. If the
> maintainers think we should adjust the testcase, I'm of course fine with it.

I think this is OK, but you need to prune out the conflict warnings to
reduce annoyance for folks doing multilib testing, and it does look
like more than one group.

Longer term I wonder if we should reorganise gcc.target/arm and indeed
gcc.target/aarch64.  Janis, do you have any other ideas?

* to contain a torture script that goes through all combinations of
architectures and FPUs / arm / thumb for all the tests.
* another sub-level directory for such directed tests, which are
essentially from regressions, where multilib options aren't applied.

However I don't know of an easy way by which we can ignore said
multilib flags?

Ramana


>
> And ping for the patch.
>
> Thanks,
> Chung-Lin
>