date:20120809

Re: [cxx-conversion] Support garbage-collected C++ templates

2012-08-09 Thread Laurynas Biveinis

Diego -

These are all good changes, and your plan for future improvements sounds
good, including the part where gengtype is killed with fire.

> - Functions should be emitted in files that have access to the
>   structure where they were defined.  I'm not convinced that the
>   current multiplicity of gt-*.[ch] files is even necessary. However,
>   I would like the guidance of a gengtype maintainer. I don't think I
>   fully understand all of it.

Yes, I remember looking into splitting the output a few years ago. It
should be possible to split gtype-desc.h into header files to be
included in source header files defining the relevant types. I.e.
tree.h includes a generated gt-tree.h that provides allocator
definitions for the tree.h types. gtype-desc.h then would be left with
the master enum of all GTY-handled types. It should also be possible
to split gtype-desc.c into the already-existing gt-foo.h files, although
I think the benefit of doing that is not as big.

> I've tested the patch on x86_64 with the page and zone collectors and
> with --enable-checking=gc,gcac (boy was that a slow mistake).

It might also be interesting to try valgrind. Good to hear the zone
collector hasn't bitrotten once again.

> * doc/gty.texi: Document support for C++ templates and
> user-provided markers.

The 1st node in this doc file needs s/C/C++/g and perhaps some more
explanation with an eye on C++.

-- 
Laurynas

Re: Value type of map need not be default copyable

2012-08-09 Thread Marc Glisse


On Wed, 8 Aug 2012, François Dumont wrote:


On 08/08/2012 03:39 PM, Paolo Carlini wrote:

On 08/08/2012 03:15 PM, François Dumont wrote:
I have also introduced a special std::pair constructor for container usage 
so that we do not have to include the whole tuple stuff just for 
associative container implementations.

To be clear: sorry, this is not an option.

Paolo.

   Then I can only imagine the attached patch, which requires including tuple 
when including unordered_map or unordered_set. The 
std::pair(piecewise_construct_t, tuple<>, tuple<>) is the only constructor 
that allows building a pair using the default constructor for the second 
member.


I agree that the extra constructor would be convenient (I probably would 
have gone with pair(T&&,__default_construct_t), the symmetric version, and 
enough extra constructors to resolve all ambiguities). Maybe LWG would 
consider doing something.


+ __p = __h->_M_allocate_node(std::piecewise_construct,
+ std::make_tuple(__k),
+ std::make_tuple());

Don't you want cref(__k)? It might save a move at some point.

--
Marc Glisse
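
To make the exchange above concrete, here is a minimal sketch of the piecewise construction under discussion and of what wrapping the key in std::cref changes. Key, Mapped and the two helper functions are hypothetical names used only for illustration; they are not part of the patch or of libstdc++. The move count reported for the make_tuple variant assumes the by-value tuple argument is constructed in place, which is the usual behaviour.

```cpp
#include <cassert>
#include <functional>
#include <tuple>
#include <utility>

// Hypothetical key type that counts how often it is copied or moved.
struct Key {
    static int copies;
    static int moves;
    int id;
    explicit Key(int i) : id(i) {}
    Key(const Key& other) : id(other.id) { ++copies; }
    Key(Key&& other) noexcept : id(other.id) { ++moves; }
};
int Key::copies = 0;
int Key::moves = 0;

// Hypothetical mapped type: default-constructible but not copyable.
struct Mapped {
    int value = 0;
    Mapped() = default;
    Mapped(const Mapped&) = delete;
    Mapped& operator=(const Mapped&) = delete;
};

typedef std::pair<const Key, Mapped> Entry;

// Variant from the quoted patch: the tuple stores its own copy of the key,
// and that copy is then moved into the pair.
inline int moves_with_make_tuple(const Key& k) {
    Key::copies = Key::moves = 0;
    Entry e(std::piecewise_construct, std::make_tuple(k), std::make_tuple());
    (void)e;
    return Key::moves;
}

// Variant with std::cref: the tuple stores only a reference, so the key
// is copied once, directly into the pair, and never moved.
inline int moves_with_cref(const Key& k) {
    Key::copies = Key::moves = 0;
    Entry e(std::piecewise_construct, std::make_tuple(std::cref(k)),
            std::make_tuple());
    (void)e;
    return Key::moves;
}

// Small self-check: with cref the key is copied exactly once.
inline void check_cref_saves_a_move() {
    Key k(42);
    assert(moves_with_cref(k) == 0);
    assert(Key::copies == 1);
}
```

In both variants the second member is value-initialized in place from the empty tuple, which is why Mapped needs neither a copy nor a move constructor.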

Re: [PATCH] Strength reduction part 3 of 4: candidates with unknown strides

2012-08-09 Thread Richard Guenther

On Wed, 8 Aug 2012, H.J. Lu wrote:

> On Wed, Aug 1, 2012 at 10:36 AM, William J. Schmidt
>  wrote:
> > Greetings,
> >
> > Thanks for the review of part 2!  Here's another chunk of the SLSR code
> > (I feel I owe you a few beers at this point).  This performs analysis
> > and replacement on groups of related candidates having an SSA name
> > (rather than a constant) for a stride.
> >
> > This leaves only the conditional increment (CAND_PHI) case, which will
> > be handled in the last patch of the series.
> >
> > Bootstrapped and tested on powerpc64-unknown-linux-gnu with no new
> > regressions.  Ok for trunk?
> >
> > Thanks,
> > Bill
> >
> >
> > gcc:
> >
> > 2012-08-01  Bill Schmidt  
> >
> > * gimple-ssa-strength-reduction.c (struct incr_info_d): New struct.
> > (incr_vec): New static var.
> > (incr_vec_len): Likewise.
> > (address_arithmetic_p): Likewise.
> > (stmt_cost): Remove dead assignment.
> > (dump_incr_vec): New function.
> > (cand_abs_increment): Likewise.
> > (lazy_create_slsr_reg): Likewise.
> > (incr_vec_index): Likewise.
> > (count_candidates): Likewise.
> > (record_increment): Likewise.
> > (record_increments): Likewise.
> > (unreplaced_cand_in_tree): Likewise.
> > (optimize_cands_for_speed_p): Likewise.
> > (lowest_cost_path): Likewise.
> > (total_savings): Likewise.
> > (analyze_increments): Likewise.
> > (ncd_for_two_cands): Likewise.
> > (nearest_common_dominator_for_cands): Likewise.
> > (profitable_increment_p): Likewise.
> > (insert_initializers): Likewise.
> > (introduce_cast_before_cand): Likewise.
> > (replace_rhs_if_not_dup): Likewise.
> > (replace_one_candidate): Likewise.
> > (replace_profitable_candidates): Likewise.
> > (analyze_candidates_and_replace): Handle candidates with SSA-name
> > strides.
> >
> > gcc/testsuite:
> >
> > 2012-08-01  Bill Schmidt  
> >
> > * gcc.dg/tree-ssa/slsr-5.c: New.
> > * gcc.dg/tree-ssa/slsr-6.c: New.
> > * gcc.dg/tree-ssa/slsr-7.c: New.
> > * gcc.dg/tree-ssa/slsr-8.c: New.
> > * gcc.dg/tree-ssa/slsr-9.c: New.
> > * gcc.dg/tree-ssa/slsr-10.c: New.
> > * gcc.dg/tree-ssa/slsr-11.c: New.
> > * gcc.dg/tree-ssa/slsr-12.c: New.
> > * gcc.dg/tree-ssa/slsr-13.c: New.
> > * gcc.dg/tree-ssa/slsr-14.c: New.
> > * gcc.dg/tree-ssa/slsr-15.c: New.
> > * gcc.dg/tree-ssa/slsr-16.c: New.
> > * gcc.dg/tree-ssa/slsr-17.c: New.
> > * gcc.dg/tree-ssa/slsr-18.c: New.
> > * gcc.dg/tree-ssa/slsr-19.c: New.
> > * gcc.dg/tree-ssa/slsr-20.c: New.
> > * gcc.dg/tree-ssa/slsr-21.c: New.
> > * gcc.dg/tree-ssa/slsr-22.c: New.
> > * gcc.dg/tree-ssa/slsr-23.c: New.
> > * gcc.dg/tree-ssa/slsr-24.c: New.
> > * gcc.dg/tree-ssa/slsr-25.c: New.
> > * gcc.dg/tree-ssa/slsr-26.c: New.
> > * gcc.dg/tree-ssa/slsr-30.c: New.
> > * gcc.dg/tree-ssa/slsr-31.c: New.
> >
> >
> ==
> > --- gcc/testsuite/gcc.dg/tree-ssa/slsr-30.c (revision 0)
> > +++ gcc/testsuite/gcc.dg/tree-ssa/slsr-30.c (revision 0)
> > @@ -0,0 +1,25 @@
> > +/* Verify straight-line strength reduction fails for simple integer addition
> > +   with casts thrown in when -fwrapv is used.  */
> > +
> > +/* { dg-do compile } */
> > +/* { dg-options "-O3 -fdump-tree-dom2 -fwrapv" } */
> > +/* { dg-skip-if "" { ilp32 } { "-m32" } { "" } } */
> > +
> 
> This doesn't work on x32 nor Linux/ia32 since -m32
> may not be needed for ILP32.  This patch works for
> me.  OK to install?

Ok.

Thanks,
Richard.

Re: Commit: RL78: Include tree-pass.h

2012-08-09 Thread Richard Guenther

On Wed, Aug 8, 2012 at 5:29 PM, Richard Henderson  wrote:
> On 08/08/2012 07:19 AM, Ian Lance Taylor wrote:
>>> > I was suggesting to for example register a 2nd mdreorg-like pass and
>>> > add a 2nd target hook.  regstack should get the same treatment.
>> If the mechanism is a proliferation of mdreorg passes in every place
>> we want a target-specific pass, that is fine with me.
>
> I think it makes much more sense to edit the pass ordering from
> the backend, rather than hooks upon hooks upon hooks.
>
> Since the plugin interface exists, we might as well use it.

The issue is that using the plugin interface makes breakage detectable only
when you are able to test a target, not by merely building it.  That's bad
(of course only for those weirdo targets).  We should _at least_ provide
an interface to internals that, for example, uses the address of the pass
structure for pass placement instead of just the dump file name.

Richard.

>
> r~
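
The difference between placing a pass by name and placing it by address can be sketched with a toy model. Everything below (Pass, Pipeline, the pass objects and the two insert helpers) is hypothetical and far simpler than GCC's real pass manager; it only illustrates why a stale name is a run-time problem while a stale symbol is a build-time one.

```cpp
#include <cassert>
#include <cstring>
#include <vector>

// Hypothetical, heavily simplified stand-in for a pass descriptor.
struct Pass {
    const char* name;  // in this sketch, also the dump file name
};

// Passes owned by the "core"; a backend can refer to them by address.
Pass pass_reorder = {"reorder"};
Pass pass_final = {"final"};
// Pass a "backend" wants to add.
Pass pass_target = {"target-specific"};

typedef std::vector<const Pass*> Pipeline;

inline Pipeline default_pipeline() {
    return Pipeline{&pass_reorder, &pass_final};
}

// Placement by name: a misspelled or renamed reference pass is only
// noticed when this code actually runs, i.e. when the target is tested.
inline Pipeline insert_after_name(Pipeline list, const char* ref_name,
                                  const Pass* new_pass) {
    for (Pipeline::iterator it = list.begin(); it != list.end(); ++it) {
        if (std::strcmp((*it)->name, ref_name) == 0) {
            list.insert(it + 1, new_pass);
            return list;
        }
    }
    return list;  // reference pass not found: nothing inserted
}

// Placement by address: if pass_reorder were renamed or removed, a call
// site naming &pass_reorder would stop compiling or linking.
inline Pipeline insert_after_pass(Pipeline list, const Pass* ref,
                                  const Pass* new_pass) {
    for (Pipeline::iterator it = list.begin(); it != list.end(); ++it) {
        if (*it == ref) {
            list.insert(it + 1, new_pass);
            return list;
        }
    }
    return list;
}

inline void check_placement() {
    // A typo in the name silently leaves the pipeline unchanged.
    assert(insert_after_name(default_pipeline(), "reoder", &pass_target).size() == 2);
    assert(insert_after_pass(default_pipeline(), &pass_reorder, &pass_target).size() == 3);
}
```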

Re: [cxx-conversion] Make double_int a class with methods and operators. (issue6443093)

2012-08-09 Thread Richard Guenther

On Thu, Aug 9, 2012 at 12:17 AM, Lawrence Crowl  wrote:
> On 8/7/12, Mike Stump  wrote:
>> On Aug 7, 2012, at 11:38 AM, Lawrence Crowl wrote:
>> > Hm.  There seems to be significant opinion that there should not be any
>> > implicit conversions.  I am okay with operations as above, but would like
>> > to hear the opinions of others.
>>
>> If there is an agreed upon and expected semantic, having them is useful.
>> In the wide-int world, which replaces double_int, I think there is an
>> agreeable semantic and I think it is useful, so, I think we should plan on
>> having them, though, I'd be fine with punting their implementation until
>> such time as someone needs it.  If no one ever needs the routine, I don't
>> see the harm in not implementing it.
>
> At present, there are no functions equivalent to (double_int + int), so
> there can be no expressions that need this overload.  I have no objection
> to adding such an overload, but if there are no objections, I would rather
> do it as a separate patch.

Sure.  It's just one of the possibilities to clean up existing code.

Richard.

> --
> Lawrence Crowl

Re: [cxx-conversion] Make double_int a class with methods and operators. (issue6443093)

2012-08-09 Thread Richard Guenther

On Thu, Aug 9, 2012 at 12:25 AM, Lawrence Crowl  wrote:
> On 8/8/12, Richard Guenther  wrote:
>> On Aug 7, 2012 Lawrence Crowl  wrote:
>> > We should probably think about naming conventions for mutating
>> > operations, as I expect we will want them eventually.
>>
>> Right.  In the end I would prefer explicit constructors.
>
> I don't think we're thinking about the same thing.
>
> I'm talking about member functions like mystring.append ("foo").
> The += operator is mutating as well.
>
> Constructors do not mutate, they create.

Ah.  For simple objects like double_int I prefer to have either all ops mutating
or all ops non-mutating.

Richard.

> --
> Lawrence Crowl

Re: [PATCH,i386] fma,fma4 and xop flags

2012-08-09 Thread Richard Guenther

On Thu, Aug 9, 2012 at 7:55 AM, Gopalasubramanian, Ganesh
 wrote:
>> Otherwise, what does -mno-fma4 -mxop do?
>> (it should enable both xop and fma4!)  what should -mfma4 -mno-xop do
>> (it should disable both xop and fma4!).
>
> Yes! that's what GCC does now.
> Some flags are coupled (at least for now).
> For ex, -mno-sse4.2 -mavx enables both sse4.2 and avx
> whereas -mavx -mno-sse4.2 disables both.
>
> Setting of the following are clubbed.
> 1) 3DNow sets MMX
> 2) SSE2 sets SSE
> 3) SSE3 sets SSE2
> 4) SSE4_1 sets SSE3
> 5) SSE4_2 sets SSE4_1
> 6) FMA sets AVX
> 7) AVX2 sets AVX
> 8) SSE4_A sets SSE3
> 9) FMA4 sets SSE4_A and AVX
> 10) XOP sets FMA4
> 11) AES sets SSE2
> 12) PCLMUL sets SSE2
> 13) ABM sets POPCNT
>
> Resetting is done in reverse (MMX resets 3DNOW).
>
> IMO, if we have different cpuid flags, enabling/disabling
> the compiler flags depends on these cpuid flags directly.
> Adding subsets to them or tangling them together may give
> wrong results.

Uh, ok ... it's messier than I anticipated ;)

> Please let me know your opinion.

Well, your patch looks reasonable then.  I'll defer to x86 maintainers for
approval though.

Thanks,
Richard.

> Regards
> Ganesh
>
> -----Original Message-----
> From: Richard Guenther [mailto:richard.guent...@gmail.com]
> Sent: Wednesday, August 08, 2012 5:12 PM
> To: Gopalasubramanian, Ganesh
> Cc: gcc-patches@gcc.gnu.org; ubiz...@gmail.com
> Subject: Re: [PATCH,i386] fma,fma4 and xop flags
>
> On Wed, Aug 8, 2012 at 1:31 PM,   wrote:
>> Hello,
>>
>> Bdver2 cpu supports both fma and fma4 instructions.
>> Prior to this patch, option "-mno-xop" removes "-mfma4".
>> Similarly, option "-mno-fma4" removes "-mxop".
>
> Eh?  Why's that?  I think we should disentangle -mxop and -mfma4
> instead.  Otherwise, what does -mno-fma4 -mxop do?
> (it should enable both xop and fma4!)  what should -mfma4 -mno-xop do
> (it should disable both xop and fma4!).  All this is just confusing to
> the user, even if in AMD documents XOP includes FMA4.
>
> Richard.
>
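
The order dependence described in this thread can be reproduced with a small model. This is only a sketch under stated assumptions: the edge table copies the numbered "X sets Y" list from the message, the avx -> sse4.2 edge is inferred from the -mavx example in the same message, and the functions edges, enable, disable and apply_options are hypothetical names, not GCC's option-handling code.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <utility>
#include <vector>

typedef std::pair<std::string, std::string> Edge;  // first "sets" second

inline const std::vector<Edge>& edges() {
    static const std::vector<Edge> table = {
        {"3dnow", "mmx"},    {"sse2", "sse"},     {"sse3", "sse2"},
        {"sse4.1", "sse3"},  {"sse4.2", "sse4.1"}, {"fma", "avx"},
        {"avx2", "avx"},     {"sse4a", "sse3"},   {"fma4", "sse4a"},
        {"fma4", "avx"},     {"xop", "fma4"},     {"aes", "sse2"},
        {"pclmul", "sse2"},  {"abm", "popcnt"},
        {"avx", "sse4.2"},   // assumption taken from the -mavx example
    };
    return table;
}

// Setting a flag also sets everything it (transitively) implies.
inline void enable(std::set<std::string>& on, const std::string& flag) {
    on.insert(flag);
    for (std::size_t i = 0; i < edges().size(); ++i)
        if (edges()[i].first == flag) enable(on, edges()[i].second);
}

// Resetting a flag also resets everything that (transitively) implies it.
inline void disable(std::set<std::string>& on, const std::string& flag) {
    on.erase(flag);
    for (std::size_t i = 0; i < edges().size(); ++i)
        if (edges()[i].second == flag) disable(on, edges()[i].first);
}

// Process options left to right, as on a command line.
inline std::set<std::string> apply_options(const std::vector<std::string>& options) {
    std::set<std::string> on;
    for (std::size_t i = 0; i < options.size(); ++i) {
        const std::string& opt = options[i];
        if (opt.compare(0, 5, "-mno-") == 0)
            disable(on, opt.substr(5));
        else if (opt.compare(0, 2, "-m") == 0)
            enable(on, opt.substr(2));
    }
    return on;
}

inline void check_order_dependence() {
    // -mno-sse4.2 -mavx: the later -mavx switches sse4.2 back on.
    assert(apply_options({"-mno-sse4.2", "-mavx"}).count("sse4.2") == 1);
    // -mavx -mno-sse4.2: resetting sse4.2 also resets avx.
    assert(apply_options({"-mavx", "-mno-sse4.2"}).count("avx") == 0);
}
```

In this model the last option on the command line wins for every flag it touches, directly or through the implication table, which is the behaviour the two -msse4.2/-mavx examples describe.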

Re: [cxx-conversion] Make double_int a class with methods and operators. (issue6443093)

2012-08-09 Thread Gabriel Dos Reis

On Thu, Aug 9, 2012 at 3:22 AM, Richard Guenther
 wrote:
> On Thu, Aug 9, 2012 at 12:25 AM, Lawrence Crowl  wrote:
>> On 8/8/12, Richard Guenther  wrote:
>>> On Aug 7, 2012 Lawrence Crowl  wrote:
>>> > We should probably think about naming conventions for mutating
>>> > operations, as I expect we will want them eventually.
>>>
>>> Right.  In the end I would prefer explicit constructors.
>>
>> I don't think we're thinking about the same thing.
>>
>> I'm talking about member functions like mystring.append ("foo").
>> The += operator is mutating as well.
>>
>> Constructors do not mutate, they create.
>
> Ah.  For simple objects like double_int I prefer to have either all ops
> mutating or all ops non-mutating.

Hmm, isn't that a bit extreme?  I mean that does not hold for simple
types like int or double, etc.

-- Gaby
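
The distinction being debated can be illustrated with a small value type that, like int, offers both kinds of operation. Wide is a hypothetical two-word integer invented for this sketch; it is not the double_int interface proposed in the patch.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical two-word unsigned integer; not the actual double_int API.
struct Wide {
    std::uint64_t low;
    std::uint64_t high;

    // Mutating operation: modifies *this, like `x += 1` on an int.
    Wide& operator+=(const Wide& rhs) {
        const std::uint64_t old_low = low;
        low += rhs.low;
        // A wrapped low word means a carry into the high word.
        high += rhs.high + (low < old_low ? 1u : 0u);
        return *this;
    }
};

// Non-mutating operation: returns a new value, like `x + 1` on an int.
// The left operand is taken by value, so the caller's object is untouched.
inline Wide operator+(Wide lhs, const Wide& rhs) {
    lhs += rhs;
    return lhs;
}

inline bool operator==(const Wide& a, const Wide& b) {
    return a.low == b.low && a.high == b.high;
}

inline void check_both_styles() {
    Wide a = {UINT64_MAX, 0};
    const Wide one = {1, 0};
    const Wide sum = a + one;    // a is unchanged here
    assert(a.low == UINT64_MAX && a.high == 0);
    assert(sum.low == 0 && sum.high == 1);
    a += one;                    // now a itself changes
    assert(a == sum);
}
```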

Re: Value type of map need not be default copyable

2012-08-09 Thread Paolo Carlini


Hi,

On 08/09/2012 09:14 AM, Marc Glisse wrote:

On Wed, 8 Aug 2012, François Dumont wrote:


On 08/08/2012 03:39 PM, Paolo Carlini wrote:

On 08/08/2012 03:15 PM, François Dumont wrote:
I have also introduced a special std::pair constructor for container 
usage so that we do not have to include the whole tuple stuff just 
for associative container implementations.

To be clear: sorry, this is not an option.

Paolo.

   Then I can only imagine the attached patch, which requires 
including tuple when including unordered_map or unordered_set. The 
std::pair(piecewise_construct_t, tuple<>, tuple<>) is the only 
constructor that allows building a pair using the default constructor 
for the second member.


I agree that the extra constructor would be convenient (I probably 
would have gone with pair(T&&,__default_construct_t), the symmetric 
version, and enough extra constructors to resolve all ambiguities). 
Maybe LWG would consider doing something.

When it does, and the corresponding PR is *ready*, we'll reconsider 
the issue. After all the *months and months and months* spent by the LWG 
adding and removing members from pair and tweaking everything wrt the 
containers and issues *still* popping up (like that with the defaulted 
copy constructor vs insert constraining), and with the support for 
scoped allocators still missing from our implementation, we are not 
adding members to std::pair so easily. Sorry, but personally I'm not 
available now to further discuss this specific point.


I was still hoping that for something as simple as mapped_type() we 
wouldn't need the full <tuple> machinery, and I encourage everybody to 
have another look (while making sure anything we figure out adapts 
smoothly and consistently to std::map), then in a few days we'll take a 
final decision. We'll still have chances to further improve the code in 
time for 4.8.0.



+ __p = __h->_M_allocate_node(std::piecewise_construct,
+ std::make_tuple(__k),
+ std::make_tuple());

Don't you want cref(__k)? It might save a move at some point.

Are we already doing that elsewhere? I think we should aim for something 
simple first, then carefully evaluate if the additional complexity is 
worth the cost and, if so, deploy the superior solution consistently 
everywhere it may apply.


Thanks!
Paolo.

Re: [cxx-conversion] Support garbage-collected C++ templates

2012-08-09 Thread Richard Guenther

On Wed, Aug 8, 2012 at 11:27 PM, Diego Novillo  wrote:
> On 12-08-08 17:25 , Gabriel Dos Reis wrote:
>
>> Aha, so it is an ordering issue, e.g. declarations being generated
>> after they have been seen used in an instantiation.
>>
>> We might want to consider  including the header file (that contains
>> only the declarations of the marking functions)  in the header
>> files that contain the GTY-marked type definition.  In this case, it would
>> be included near the end of tree.h
>
>
> Right. And that's the part of my plan that requires killing gengtype with
> fire first.  When I started down that path, it became a very messy re-write,
> so I decided it was better to do it in stages.

But now with doing it in stages you end up with (this) first stage that
complicates gengtype to support a very small subset of C++ types (namely the
one special case you need for vec.h).  Exactly what I did _not_ want!

I understood that you had the complete "killing of gengtype with fire" ready
(or almost ready).  Please finish it instead.

Thanks,
Richard.

>
> Diego.

[Patch, Fortran] PR54199 improve warning "is also the name of an intrinsic" for internal procedures

2012-08-09 Thread Tobias Burnus

This patch makes the warning for internal procedures whose name is the 
same as the one of an intrinsic clearer. Initially, I thought that one 
shouldn't warn for internal procedures, but others disagree. In any 
case, the warning text is better than the original one.


Built and regtested on x86-64-linux.
OK for the trunk?

Tobias
2012-08-09  Tobias Burnus  

	PR fortran/54199
	* intrinsic.c (gfc_warn_intrinsic_shadow): Better warning
	for internal procedures.

2012-08-09  Tobias Burnus  

	PR fortran/54199
	* gfortran.dg/intrinsic_shadow_4.f90: New.

diff --git a/gcc/fortran/intrinsic.c b/gcc/fortran/intrinsic.c
index 60c68fe..72b149f 100644
--- a/gcc/fortran/intrinsic.c
+++ b/gcc/fortran/intrinsic.c
@@ -4503,7 +4511,7 @@ gfc_warn_intrinsic_shadow (const gfc_symbol* sym, bool in_module, bool func)
 return;
 
   /* Emit the warning.  */
-  if (in_module)
+  if (in_module || sym->ns->proc_name)
 gfc_warning ("'%s' declared at %L may shadow the intrinsic of the same"
 		 " name.  In order to call the intrinsic, explicit INTRINSIC"
 		 " declarations may be required.",
--- /dev/null	2012-08-08 07:41:43.631684108 +0200
+++ gcc/gcc/testsuite/gfortran.dg/intrinsic_shadow_4.f90	2012-08-09 10:28:55.0 +0200
@@ -0,0 +1,12 @@
+! { dg-do compile }
+! { dg-options "-Wall" }
+!
+! PR fortran/54199
+!
+subroutine test()
+contains
+  real function fraction(x) ! { dg-warning "'fraction' declared at .1. may shadow the intrinsic of the same name.  In order to call the intrinsic, explicit INTRINSIC declarations may be required." }
+real :: x
+fraction = x
+  end function fraction
+end subroutine test

[AArch64] Merge from upstream trunk r189905

2012-08-09 Thread Sofiane Naci

Hi,

I've just merged upstream trunk on the aarch64-branch up to r189905.

Thanks
Sofiane

[PATCH][5/n] Allow anonymous SSA names

2012-08-09 Thread Richard Guenther


Another set of small changes.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied.

Richard.

2012-08-09  Richard Guenther  

* tree.h (SSA_VAR_P): Simplify.
* tree-ssanames.c (make_ssa_name_fn): Strengthen assert.
* fold-const.c (fold_comparison): Check for default def first
before checking for PARM_DECL.
* tree-complex.c (get_component_ssa_name): Likewise.
* tree-inline.c (remap_ssa_name): Likewise.
* tree-ssa-loop-ivopts.c (parm_decl_cost): Likewise.
* tree-ssa-structalias.c (get_fi_for_callee): Likewise.
(find_what_p_points_to): Likewise.
* tree-ssa-operands.c (add_stmt_operand): Simplify.

Index: trunk/gcc/fold-const.c
===
*** trunk.orig/gcc/fold-const.c 2012-08-08 16:49:38.0 +0200
--- trunk/gcc/fold-const.c  2012-08-09 11:08:52.273217092 +0200
*** fold_comparison (location_t loc, enum tr
*** 8940,8955 
 && auto_var_in_fn_p (base0, current_function_decl)
 && !indirect_base1
 && TREE_CODE (base1) == SSA_NAME
!&& TREE_CODE (SSA_NAME_VAR (base1)) == PARM_DECL
!&& SSA_NAME_IS_DEFAULT_DEF (base1))
|| (TREE_CODE (arg1) == ADDR_EXPR
&& indirect_base1
&& TREE_CODE (base1) == VAR_DECL
&& auto_var_in_fn_p (base1, current_function_decl)
&& !indirect_base0
&& TREE_CODE (base0) == SSA_NAME
!   && TREE_CODE (SSA_NAME_VAR (base0)) == PARM_DECL
!   && SSA_NAME_IS_DEFAULT_DEF (base0)))
  {
if (code == NE_EXPR)
  return constant_boolean_node (1, type);
--- 8940,8955 
 && auto_var_in_fn_p (base0, current_function_decl)
 && !indirect_base1
 && TREE_CODE (base1) == SSA_NAME
!&& SSA_NAME_IS_DEFAULT_DEF (base1)
!  && TREE_CODE (SSA_NAME_VAR (base1)) == PARM_DECL)
|| (TREE_CODE (arg1) == ADDR_EXPR
&& indirect_base1
&& TREE_CODE (base1) == VAR_DECL
&& auto_var_in_fn_p (base1, current_function_decl)
&& !indirect_base0
&& TREE_CODE (base0) == SSA_NAME
!   && SSA_NAME_IS_DEFAULT_DEF (base0)
! && TREE_CODE (SSA_NAME_VAR (base0)) == PARM_DECL))
  {
if (code == NE_EXPR)
  return constant_boolean_node (1, type);
Index: trunk/gcc/tree-complex.c
===
*** trunk.orig/gcc/tree-complex.c   2012-08-08 16:49:38.0 +0200
--- trunk/gcc/tree-complex.c2012-08-09 11:19:15.799195507 +0200
*** get_component_ssa_name (tree ssa_name, b
*** 495,502 
 is used in an abnormal phi, and whether it's uninitialized.  */
SSA_NAME_OCCURS_IN_ABNORMAL_PHI (ret)
= SSA_NAME_OCCURS_IN_ABNORMAL_PHI (ssa_name);
!   if (TREE_CODE (SSA_NAME_VAR (ssa_name)) == VAR_DECL
! && SSA_NAME_IS_DEFAULT_DEF (ssa_name))
{
  SSA_NAME_DEF_STMT (ret) = SSA_NAME_DEF_STMT (ssa_name);
  set_ssa_default_def (cfun, SSA_NAME_VAR (ret), ret);
--- 495,502 
 is used in an abnormal phi, and whether it's uninitialized.  */
SSA_NAME_OCCURS_IN_ABNORMAL_PHI (ret)
= SSA_NAME_OCCURS_IN_ABNORMAL_PHI (ssa_name);
!   if (SSA_NAME_IS_DEFAULT_DEF (ssa_name)
! && TREE_CODE (SSA_NAME_VAR (ssa_name)) == VAR_DECL)
{
  SSA_NAME_DEF_STMT (ret) = SSA_NAME_DEF_STMT (ssa_name);
  set_ssa_default_def (cfun, SSA_NAME_VAR (ret), ret);
Index: trunk/gcc/tree-inline.c
===
*** trunk.orig/gcc/tree-inline.c2012-08-08 16:49:38.0 +0200
--- trunk/gcc/tree-inline.c 2012-08-09 11:19:15.800195507 +0200
*** remap_ssa_name (tree name, copy_body_dat
*** 187,194 
  
if (processing_debug_stmt)
  {
!   if (TREE_CODE (SSA_NAME_VAR (name)) == PARM_DECL
! && SSA_NAME_IS_DEFAULT_DEF (name)
  && id->entry_bb == NULL
  && single_succ_p (ENTRY_BLOCK_PTR))
{
--- 187,194 
  
if (processing_debug_stmt)
  {
!   if (SSA_NAME_IS_DEFAULT_DEF (name)
! && TREE_CODE (SSA_NAME_VAR (name)) == PARM_DECL
  && id->entry_bb == NULL
  && single_succ_p (ENTRY_BLOCK_PTR))
{
Index: trunk/gcc/tree-ssa-loop-ivopts.c
===
*** trunk.orig/gcc/tree-ssa-loop-ivopts.c   2012-08-08 16:49:38.0 +0200
--- trunk/gcc/tree-ssa-loop-ivopts.c2012-08-09 11:19:15.801195507 +0200
*** parm_decl_cost (struct ivopts_data *data
*** 4657,4664 
STRIP_NOPS (sbound);
  
if (TREE_CODE (sbound) == SSA_NAME
&& TREE_CODE (SSA_NAME_VAR (sbound)) == PARM_DECL
-   && gimple_nop_p (SSA_NAME_

Re: [Patch, Fortran] PR54199 improve warning "is also the name of an intrinsic" for internal procedures

2012-08-09 Thread Mikael Morin

On 09/08/2012 11:12, Tobias Burnus wrote:
> This patch makes the warning for internal procedures whose name is the
> same as the one of an intrinsic clearer. Initially, I thought that one
> shouldn't warn for internal procedures, but others disagree. In any
> case, the warning text is better than the original one.
> 
> Built and regtested on x86-64-linux.
> OK for the trunk?

OK.

Fix PR 53701

2012-08-09 Thread Andrey Belevantsev


Hello,

The problem in question was uncovered by the recent speculation patch; it is 
in the handling of expressions blocked by bookkeeping.  Those are 
expressions that become unavailable due to the newly created bookkeeping 
copies.  In the original algorithm the supported insns and transformations 
cannot lead to this result, but when handling non-separable insns or 
creating speculative checks that unpredictably block certain insns the 
situation can arise.  We just filter out all such expressions from the 
final availability set for correctness.


The PR happens because the expression being filtered out can be transformed 
while being moved up, thus we need to look up not only its exact pattern 
but also all its previous forms saved in its history of changes.  The patch 
does exactly that, I also clarified the comments w.r.t. this situation.


Bootstrapped and tested on ia64 and x86-64; the PR testcase is minimized, 
too.  OK for trunk?  This also needs to be backported to 4.7 along with 
PR 53975, say next week.


Yours,
Andrey

gcc:
2012-08-09  Andrey Belevantsev  

PR rtl-optimization/53701
* sel-sched.c (vinsn_vec_has_expr_p): Clarify function comment.
Process not only expr's vinsns but all old vinsns from expr's
history of changes.
(update_and_record_unavailable_insns): Clarify comment.

testsuite:
2012-08-09  Andrey Belevantsev  

PR rtl-optimization/53701
* gcc.dg/pr53701.c: New test.
diff --git a/gcc/sel-sched.c b/gcc/sel-sched.c
index 3099b92..f0c6eaf 100644
--- a/gcc/sel-sched.c
+++ b/gcc/sel-sched.c
@@ -3564,29 +3564,41 @@ process_use_exprs (av_set_t *av_ptr)
   return NULL;
 }
 
-/* Lookup EXPR in VINSN_VEC and return TRUE if found.  */
+/* Lookup EXPR in VINSN_VEC and return TRUE if found.  Also check patterns from
+   EXPR's history of changes.  */
 static bool
 vinsn_vec_has_expr_p (vinsn_vec_t vinsn_vec, expr_t expr)
 {
-  vinsn_t vinsn;
+  vinsn_t vinsn, expr_vinsn;
   int n;
+  unsigned i;
 
-  FOR_EACH_VEC_ELT (vinsn_t, vinsn_vec, n, vinsn)
-if (VINSN_SEPARABLE_P (vinsn))
-  {
-if (vinsn_equal_p (vinsn, EXPR_VINSN (expr)))
-  return true;
-  }
-else
-  {
-/* For non-separable instructions, the blocking insn can have
-   another pattern due to substitution, and we can't choose
-   different register as in the above case.  Check all registers
-   being written instead.  */
-if (bitmap_intersect_p (VINSN_REG_SETS (vinsn),
-VINSN_REG_SETS (EXPR_VINSN (expr))))
-  return true;
-  }
+  /* Start with checking expr itself and then proceed with all the old forms
+ of expr taken from its history vector.  */
+  for (i = 0, expr_vinsn = EXPR_VINSN (expr);
+   expr_vinsn;
+   expr_vinsn = (i < VEC_length (expr_history_def,
+ EXPR_HISTORY_OF_CHANGES (expr))
+		 ? VEC_index (expr_history_def,
+  EXPR_HISTORY_OF_CHANGES (expr),
+  i++)->old_expr_vinsn
+		 : NULL))
+FOR_EACH_VEC_ELT (vinsn_t, vinsn_vec, n, vinsn)
+  if (VINSN_SEPARABLE_P (vinsn))
+	{
+	  if (vinsn_equal_p (vinsn, expr_vinsn))
+	return true;
+	}
+  else
+	{
+	  /* For non-separable instructions, the blocking insn can have
+	 another pattern due to substitution, and we can't choose
+	 different register as in the above case.  Check all registers
+	 being written instead.  */
+	  if (bitmap_intersect_p (VINSN_REG_SETS (vinsn),
+  VINSN_REG_SETS (expr_vinsn)))
+	return true;
+	}
 
   return false;
 }
@@ -5694,8 +5706,8 @@ update_and_record_unavailable_insns (basic_block book_block)
   || EXPR_TARGET_AVAILABLE (new_expr)
 		 != EXPR_TARGET_AVAILABLE (cur_expr))
 	/* Unfortunately, the below code could be also fired up on
-	   separable insns.
-	   FIXME: add an example of how this could happen.  */
+	   separable insns, e.g. when moving insns through the new
+	   speculation check as in PR 53701.  */
 vinsn_vec_add (&vec_bookkeeping_blocked_vinsns, cur_expr);
 }
 
diff --git a/gcc/testsuite/gcc.dg/pr53701.c b/gcc/testsuite/gcc.dg/pr53701.c
new file mode 100644
index 000..2c85223
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr53701.c
@@ -0,0 +1,59 @@
+/* { dg-do compile { target powerpc*-*-* ia64-*-* i?86-*-* x86_64-*-* } } */
+/* { dg-options "-O3 -fselective-scheduling2 -fsel-sched-pipelining" } */
+typedef unsigned short int uint16_t;
+typedef unsigned long int uintptr_t;
+typedef struct GFX_VTABLE
+{
+  int color_depth;
+  unsigned char *line[];
+}
+BITMAP;
+extern int _drawing_mode;
+extern BITMAP *_drawing_pattern;
+extern int _drawing_y_anchor;
+extern unsigned int _drawing_x_mask;
+extern unsigned int _drawing_y_mask;
+extern uintptr_t bmp_write_line (BITMAP *, int);
+  void
+_linear_hline15 (BITMAP * dst, int dx1, int dy, int dx2, int color)
+{
+  int w;
+  if (_drawing_mode == 0)
+  {
+int x, curw;
+unsigned short *sline =
+  (

Re: Value type of map need not be default copyable

2012-08-09 Thread Jonathan Wakely

On 9 August 2012 09:35, Paolo Carlini wrote:
>
> When it does, and the corresponding PR will be *ready* we'll reconsider the
> issue. After all the *months and months and months* spent by the LWG adding
> and removing members from pair and tweaking everything wrt the containers
> and issues *still* popping up (like that with the defaulted copy constructor
> vs insert constraining), and with the support for scoped allocators still
> missing from our implementation, we are not adding members to std::pair such
> easily. Sorry, but personally I'm not available now to further discuss this
> specific point.

I'm with Paolo on this. No additional (non-standard) constructors in
std::pair please.

If it was possible to do without changing the ABI I'd include <tuple>
in the unordered containers anyway, when adding scoped allocator support,
because std::tuple already knows how to avoid the EBO for 'final'
allocators (PR 51365).  I'd do the same in the other containers except
that they need to work in C++03 mode without std::tuple.

I think we should consider std::tuple almost as fundamental as
std::pair and shouldn't jump through hoops to avoid using it.  It's
already included by <memory> for example, to implement
std::unique_ptr, and I recently made changes to make it easier to use
std::unique_ptr internally, so we shouldn't be afraid of std::tuple
getting used more widely.

Re: [Patch, Fortran] PR40881 - Add two F95 obsolescence warnings

2012-08-09 Thread Mikael Morin

On 08/08/2012 19:12, Tobias Burnus wrote:
> With this patch, I think the only unimplemented obsolescence warning is for
> "(8) Fixed form source -- see B.2.7."
> 
> For the latter, I would like to see a possibility to silence that
> warning, given that there is substantial code around, which is in fixed
> form but otherwise a completely valid and obsolescent-free code.

We could silence it with explicit -ffixed-form.

> 
> The motivation for implementing this patch was that I did a small
> obsolescent cleanup of our fixed-form code (which uses some Fortran 2003
> features) and I realized that ifort had the "shared DO termination"
> warning and gfortran didn't.
> 
> Built and regtested on x86-64-gnu-linux.
> OK for the trunk?

More comments below. Regarding the general design, I'm not sure it makes
sense to distinguish between ST_LABEL_DO_TARGET and
ST_LABEL_ENDDO_TARGET. There are no ST_LABEL_GOTO_TARGET or
ST_LABEL_WRITE_TARGET after all.



> diff --git a/gcc/fortran/gfortran.h b/gcc/fortran/gfortran.h
> index b6e2975..9670022 100644
> --- a/gcc/fortran/gfortran.h
> +++ b/gcc/fortran/gfortran.h
> @@ -146,8 +146,8 @@ ar_type;
>  
>  /* Statement label types.  */
>  typedef enum
> -{ ST_LABEL_UNKNOWN = 1, ST_LABEL_TARGET,
> -  ST_LABEL_BAD_TARGET, ST_LABEL_FORMAT
> +{ ST_LABEL_UNKNOWN = 1, ST_LABEL_TARGET, ST_LABEL_DO_TARGET,
> +  ST_LABEL_ENDDO_TARGET, ST_LABEL_BAD_TARGET, ST_LABEL_FORMAT
>  }
>  gfc_sl_type;

Please add a comment explaining the different types; something like:
The labels referenced in DO statements and defined in END DO statements
 get types respectively ST_LABEL_DO_TARGET and ST_LABEL_ENDDO_TARGET
instead of the generic ST_LABEL_TARGET so that they can be distinguished
to issue DO-specific diagnostics.
The DO label is a label reference, so ST_LABEL_DO_TARGET is to be used
in gfc_st_label::referenced only.  The ST_LABEL_ENDDO_TARGET is the
corresponding label definition, and is to be used in
gfc_st_label::defined only.



> @@ -3825,8 +3828,11 @@ parse_executable (gfc_statement st)
>   case ST_NONE:
> unexpected_eof ();
>  
> - case ST_FORMAT:
>   case ST_DATA:
> +   gfc_notify_std (GFC_STD_F95_OBS, "DATA statement at %C after the "
> +"first executable statement");
> +   /* Fall through.  */
> + case ST_FORMAT:
>   case ST_ENTRY:
>   case_executable:
> accept_statement (st);

This diagnostic is more appropriate in verify_st_order (which needs to
be called then).


> diff --git a/gcc/fortran/symbol.c b/gcc/fortran/symbol.c
> index 455e6c9..135c1e5 100644
> --- a/gcc/fortran/symbol.c
> +++ b/gcc/fortran/symbol.c
> @@ -2213,12 +2214,19 @@ gfc_define_st_label (gfc_st_label *lp, gfc_sl_type type, locus *label_locus)
> break;
>  
>   case ST_LABEL_TARGET:
> + case ST_LABEL_ENDDO_TARGET:
> if (lp->referenced == ST_LABEL_FORMAT)
>   gfc_error ("Label %d at %C already referenced as a format label",
>  labelno);
> else
>   lp->defined = ST_LABEL_TARGET;

I think it should be `lp->defined = type;' here.


> @@ -2254,14 +2262,16 @@ gfc_reference_st_label (gfc_st_label *lp, gfc_sl_type type)
>lp->where = gfc_current_locus;
>  }
>  
> -  if (label_type == ST_LABEL_FORMAT && type == ST_LABEL_TARGET)
> +  if (label_type == ST_LABEL_FORMAT
> +  && (type == ST_LABEL_TARGET || type == ST_LABEL_DO_TARGET))
>  {
>gfc_error ("Label %d at %C previously used as a FORMAT label", labelno);
>rc = FAILURE;
>goto done;
>  }
>  
> -  if ((label_type == ST_LABEL_TARGET || label_type == ST_LABEL_BAD_TARGET)
> +  if ((label_type == ST_LABEL_TARGET || label_type == ST_LABEL_DO_TARGET
> +   || label_type == ST_LABEL_BAD_TARGET)
>&& type == ST_LABEL_FORMAT)
>  {
>gfc_error ("Label %d at %C previously used as branch target", labelno);

label_type is initialized using either lp->referenced or lp->defined.
Thus both ST_LABEL_DO_TARGET and ST_LABEL_ENDDO_TARGET should be checked
here. Unless they are merged as suggested above.


Mikael

Re: [PATCH] Intrinsics for ADCX

2012-08-09 Thread Michael Zolotukhin

Hi guys,
This patch generalizes the recently committed addcarryx intrinsic so that
it can be generated via either the ADCX or the common ADC instruction.
The ADX-* tests pass and bootstrap succeeds.
Is it ok for trunk?

Changelog entry:
2012-08-09  Michael Zolotukhin  

* config/i386/adxintrin.h: Remove guarding __ADX__ check.
* config/i386/x86intrin.h: Likewise.
* config/i386/i386.c (ix86_init_mmx_sse_builtins): Remove
OPTION_MASK_ISA_ADX from needed options for
__builtin_ia32_addcarryx_u32 and __builtin_ia32_addcarryx_u64.
(ix86_expand_builtin): Use add3_carry in expanding of
IX86_BUILTIN_ADDCARRYX32 and IX86_BUILTIN_ADDCARRYX64.

testsuite/Changelog entry:
2012-08-09  Michael Zolotukhin  

* gcc.target/i386/adx-addxcarry32-3.c: New.
* gcc.target/i386/adx-addxcarry64-3.c: New.


Thanks, Michael

On 1 August 2012 20:37, Kirill Yukhin  wrote:
> Hi Richard,
>
>> Frankly I don't understand the point of these instructions
>> being added to the ISA at all.  I would have understood an
>> add-with-carry that did *not* modify the flags at all, but
>> two separate ones that modify C and O separately is just
>> downright strange.
> If there is only one carry in flight, they all are equivalent although
> ADOX is a little less useful in loops.
> If there are two carries in flight, that’s where the new instructions
> show their benefit, since they allow accumulation without destroying
> each other (see next comment).
> For any number of carries beyond two, you have to start saving and
> restoring carry bits, and it degenerates to the first case for some of
> them.
>
>> But to the point: I don't understand the point of having
>> this as a builtin.  Is the code generated by this builtin
>> any better than plain C?
> I think this is just the usual practice of introducing new intrinsics for
> new insns.
> I doubt that we could generate such things automatically:
> c1 = 0;
> c2 = 0;
> c1 = _adcx64( & res[i], src[i], src2[i], c1);
> c1 = _adcx64( & res[i+1], src[i+1], src2[i+1], c1);
> c2 = _adcx64( & res[i], src[i], src2[i], c2);
> c2 = _adcx64( & res[i+1], src[i+1], src2[i+1], c2);
>
>> And if you're going to have the builtin, why is this restricted
>> to adx anyway?  You obviously can produce the same results with
>> the good old fashioned adc instruction as well.
> We have one intrinsic for both ADCX/ADOX. So, we just picked the first
> one to use when expanding the built-in.
>
>> Which begs the question of why you've got a separate pattern
>> for the adx anyway.  If the insn is so much better, it ought to
>> be used in the same pattern we use for adc now.
> I believe, we may introduce global variant of ADCX, which may be
> expanded into either of ADC/ADCX/ADOX on x86 and into analogs
> on the other ports.
>
> K
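
For reference, the "plain C" alternative raised in the quoted discussion looks
roughly like the sketch below: a single carry chain over 64-bit limbs, written
with portable unsigned arithmetic. The helper names addcarry_u64 and add_limbs
are hypothetical, and the argument order is chosen for this example; it is not
the signature of the intrinsic in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Portable model of one add-with-carry step:
   *out = a + b + carry_in (mod 2^64), returns the carry out (0 or 1).  */
static unsigned char
addcarry_u64 (unsigned char carry_in, uint64_t a, uint64_t b, uint64_t *out)
{
  uint64_t sum = a + b;                 /* may wrap */
  unsigned char c1 = sum < a;           /* wrapped in a + b */
  uint64_t res = sum + (carry_in != 0); /* may wrap again */
  unsigned char c2 = res < sum;         /* wrapped when adding the carry */
  *out = res;
  return (unsigned char) (c1 | c2);     /* at most one of them is set */
}

/* res = a + b over n limbs, least significant limb first.
   Returns the carry out of the most significant limb.  */
static unsigned char
add_limbs (uint64_t *res, const uint64_t *a, const uint64_t *b, size_t n)
{
  unsigned char carry = 0;
  for (size_t i = 0; i < n; i++)
    carry = addcarry_u64 (carry, a[i], b[i], &res[i]);
  return carry;
}
```

With one chain like this, ADC suffices. The case made in the thread for
ADCX/ADOX concerns two such chains interleaved in the same loop, each needing
its own flag.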


-- 
---
Best regards,
Michael V. Zolotukhin,
Software Engineer
Intel Corporation.


bdw-adx-5.gcc.patch
Description: Binary data

Re: [PATCH] Strength reduction part 3 of 4: candidates with unknown strides

2012-08-09 Thread William J. Schmidt

On Wed, 2012-08-08 at 19:22 -0700, Janis Johnson wrote:
> On 08/08/2012 06:41 PM, William J. Schmidt wrote:
> > On Wed, 2012-08-08 at 15:35 -0700, Janis Johnson wrote:
> >> On 08/08/2012 03:27 PM, Andrew Pinski wrote:
> >>> On Wed, Aug 8, 2012 at 3:25 PM, H.J. Lu  wrote:
>  On Wed, Aug 1, 2012 at 10:36 AM, William J. Schmidt
>   wrote:
> 
> > +/* { dg-do compile } */
> > +/* { dg-options "-O3 -fdump-tree-dom2 -fwrapv" } */
> > +/* { dg-skip-if "" { ilp32 } { "-m32" } { "" } } */
> > +
> 
>  This doesn't work on x32 nor Linux/ia32 since -m32
>  may not be needed for ILP32.  This patch works for
>  me.  OK to install?
> >>>
> >>> This also does not work for mips64 where the options are either
> >>> -mabi=32 or -mabi=n32 for ILP32.
> >>>
> >>> HJL's patch looks correct.
> >>>
> >>> Thanks,
> >>> Andrew
> >>
> >> There are GCC targets with 16-bit integers.  What's the actual
> >> set of targets on which this test is meant to run?  There's a list
> >> of effective-target names based on data type sizes in
> >> .
> > 
> > Yes, sorry.  The test really is only valid when int and long have
> > different sizes.  So according to that link we should skip ilp32 and
> > llp64 at a minimum.  It isn't clear what we should do for int16 since
> > the size of long isn't specified, so I suppose we should skip that as
> > well.  So, perhaps modify HJ's patch to have
> > 
> > /* { dg-do compile { target { ! { ilp32 llp64 int16 } } } } */
> > 
> > ?
> > 
> > Thanks,
> > Bill
> 
> That's confusing.  Perhaps what you really need is a new effective
> target for "sizeof(int) != sizeof(long)".

Good idea.  I'll work up a patch when I get a moment.
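
A rough sketch of why a dedicated "sizeof(int) != sizeof(long)" target is
cleaner than listing data models: the condition can be stated once and then
evaluated for each model. The function name test_applies and the byte sizes
passed to it are illustrative assumptions; a real effective target would be
defined in the testsuite support files, not in C like this.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical predicate: the test is only meaningful when int and
   long have different sizes.  Sizes are given in bytes.  */
static int
test_applies (size_t int_bytes, size_t long_bytes)
{
  return int_bytes != long_bytes;
}

/* The same predicate evaluated for the compiler building this file.  */
static int
test_applies_here (void)
{
  return test_applies (sizeof (int), sizeof (long));
}
```

For example, test_applies (4, 4) is false for both ilp32 and llp64,
test_applies (4, 8) is true for lp64, and a 16-bit-int target with a 32-bit
long would pass as test_applies (2, 4), which a blanket int16 skip would have
excluded.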

Thanks,
Bill

> 
> Janis
>

Re: [cxx-conversion] Support garbage-collected C++ templates

2012-08-09 Thread Diego Novillo

On Thu, Aug 9, 2012 at 5:03 AM, Richard Guenther
 wrote:

> But now with doing it in stages you end up with (this) first stage
> that complicates
> gengtype to support a very small subset of C++ types (namely the one special
> case you need for vec.h).  Exactly what I did _not_ want!

No. It supports all C++ types. All it needs is the user annotation.

> I understood that you had the complete "killing of gengtype with fire" ready
> (or almost ready).  Please finish it instead.

No. It was not even close. The full re-write will wait until the
branch is merged in trunk. It will touch too many files and the branch
is already hard to maintain as it is.

Adding support for explicit user annotations is easy enough. Patch coming up.

Diego.

Test...

2012-08-09 Thread Uros Bizjak

[PATCH, alpha]: Prevent another case of linker issues with exception handler

2012-08-09 Thread Uros Bizjak

Hello!

This problem is similar to [1], but in this case the issue
occurs when an exception handler immediately follows a sibcall function.
This happens in libstdc++,

./include/ext/pb_ds/detail/binary_heap_/split_join_fn_imps.hpp: 130

This is the reason for testcase failure in [2]:

Running target unix
FAIL: ext/pb_ds/regression/priority_queue_rand.cc execution test

We have to prevent this situation in the same way as in [1]: pad
the sibcall with a nop when a GP load immediately follows the
call. The following patch fixes the failure.
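
The padding rule can be illustrated with a toy model in C. This is only a
sketch under simplifying assumptions: instructions are reduced to an enum, and
the names insn_kind and pad_calls_before_gp_load are invented for this
example. The real pass works on GCC's RTL insn stream and is not reproduced
here.

```c
#include <assert.h>
#include <stddef.h>

/* Toy instruction kinds; not GCC's representation.  */
typedef enum
{
  INSN_OTHER,
  INSN_CALL,           /* ordinary call */
  INSN_SIBCALL,        /* sibling (tail) call */
  INSN_NORETURN_CALL,  /* call to a noreturn function */
  INSN_GP_LOAD,        /* start of a GP reload sequence */
  INSN_UNOP            /* padding nop */
} insn_kind;

/* Copy IN to OUT, inserting an INSN_UNOP after every sibcall or
   noreturn call that is immediately followed by a GP load.
   OUT must have room for 2 * N entries.  Returns the number of
   entries written.  */
static size_t
pad_calls_before_gp_load (const insn_kind *in, size_t n, insn_kind *out)
{
  size_t written = 0;
  for (size_t i = 0; i < n; i++)
    {
      out[written++] = in[i];
      int ends_function = in[i] == INSN_SIBCALL
                          || in[i] == INSN_NORETURN_CALL;
      if (ends_function && i + 1 < n && in[i + 1] == INSN_GP_LOAD)
        out[written++] = INSN_UNOP;
    }
  return written;
}
```

An ordinary call followed by a GP load is left alone in this model, since that
GP load belongs to the call itself.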

2012-08-09  Uros Bizjak  

* config/alpha/alpha.c (alpha_pad_noreturn): Rename to ...
(alpha_pad_function_end): ... this.  Also insert NOP between
sibling call and GP load.
(alpha_reorg): Update call to alpha_pad_function_end.  Expand comment.

Patch was bootstrapped and regression tested on alphaev68-pc-linux-gnu.

OK for mainline and release branches?

[1] http://gcc.gnu.org/ml/gcc-patches/2008-12/msg01097.html
[2] http://gcc.gnu.org/ml/gcc-testresults/2012-08/msg00583.html

Uros.

Re: [cxx-conversion] Support garbage-collected C++ templates

2012-08-09 Thread Richard Guenther

On Thu, Aug 9, 2012 at 2:44 PM, Diego Novillo  wrote:
> On Thu, Aug 9, 2012 at 5:03 AM, Richard Guenther
>  wrote:
>
>> But now with doing it in stages you end up with (this) first stage
>> that complicates
>> gengtype to support a very small subset of C++ types (namely the one special
>> case you need for vec.h).  Exactly what I did _not_ want!
>
> No. It supports all C++ types. All it needs is the user annotation.

You said it only works for types in the template parameter list and there
you only support types (and not integers).  Which means I fail to see how
it works for VEC(tree).  As far as I understand you are not creating the
gt_pch_nx overloads for all GTYed types (which includes 'tree').  If you do
then I fail to see why it should be restricted at all?

>> I understood that you had the complete "killing of gengtype with fire" ready
>> (or almost ready).  Please finish it instead.
>
> No. It was not even close. The full re-write will wait until the
> branch is merged in trunk. It will touch too many files and the branch
> is already hard to maintain as it is.

Well.  So what are exactly the limitations?  If I can provide user-defined
gc routines for all C++ types and gengtype will pick them up automagically
when auto-generating gc routines for other types then fine.

What I do not understand is why you need a GTY(()) annotation on
C++ types with user-defined gc routines.  gengtype should treat all
types not marked with GTY(()) as having user-defined gc routines, no?

Richard.

> Adding support for explicit user annotations is easy enough. Patch coming up.
>
>
> Diego.

Re: [PATCH, alpha]: Prevent another case of linker issues with exception handler

2012-08-09 Thread Uros Bizjak

On Thu, Aug 9, 2012 at 3:04 PM, Uros Bizjak  wrote:

> 2012-08-09  Uros Bizjak  
>
> * config/alpha/alpha.c (alpha_pad_noreturn): Rename to ...
> (alpha_pad_function_end): ... this.  Also insert NOP between
> sibling call and GP load.
> (alpha_reorg): Update call to alpha_pad_function_end.  Expand comment.
>
> Patch was bootstrapped and regression tested on alphaev68-pc-linux-gnu.
>
> OK for mainline and release branches?

Now with the patch.

Uros.
Index: alpha.c
===
--- alpha.c (revision 190247)
+++ alpha.c (working copy)
@@ -9258,17 +9258,18 @@ alpha_align_insns (unsigned int max_align,
 }
 }
 
-/* Insert an unop between a noreturn function call and GP load.  */
+/* Insert an unop between sibcall or noreturn function call and GP load.  */
 
 static void
-alpha_pad_noreturn (void)
+alpha_pad_function_end (void)
 {
   rtx insn, next;
 
   for (insn = get_insns (); insn; insn = NEXT_INSN (insn))
 {
   if (! (CALL_P (insn)
-&& find_reg_note (insn, REG_NORETURN, NULL_RTX)))
+&& (SIBLING_CALL_P (insn)
+|| find_reg_note (insn, REG_NORETURN, NULL_RTX))))
 continue;
 
   /* Make sure we do not split a call and its corresponding
@@ -9300,11 +9301,31 @@ static void
 static void
 alpha_reorg (void)
 {
-  /* Workaround for a linker error that triggers when an
- exception handler immediatelly follows a noreturn function.
+  /* Workaround for a linker error that triggers when an exception
+ handler immediately follows a sibcall or a noreturn function.
 
+In the sibcall case:
+
  The instruction stream from an object file:
 
+ 1d8:   00 00 fb 6b jmp (t12)
+ 1dc:   00 00 ba 27 ldah gp,0(ra)
+ 1e0:   00 00 bd 23 lda gp,0(gp)
+ 1e4:   00 00 7d a7 ldq t12,0(gp)
+ 1e8:   00 40 5b 6b jsr ra,(t12),1ec <__funcZ+0x1ec>
+
+ was converted in the final link pass to:
+
+   12003aa88:   67 fa ff c3 br  120039428 <...>
+   12003aa8c:   00 00 fe 2f unop
+   12003aa90:   00 00 fe 2f unop
+   12003aa94:   48 83 7d a7 ldq t12,-31928(gp)
+   12003aa98:   00 40 5b 6b jsr ra,(t12),12003aa9c <__func+0x1ec>
+
+And in the noreturn case:
+
+ The instruction stream from an object file:
+
   54:   00 40 5b 6b jsr ra,(t12),58 <__func+0x58>
   58:   00 00 ba 27 ldah gp,0(ra)
   5c:   00 00 bd 23 lda gp,0(gp)
@@ -9321,11 +9342,11 @@ alpha_reorg (void)
 
  GP load instructions were wrongly cleared by the linker relaxation
  pass.  This workaround prevents removal of GP loads by inserting
- an unop instruction between a noreturn function call and
+ an unop instruction between a sibcall or noreturn function call and
  exception handler prologue.  */
 
   if (current_function_has_exception_handlers ())
-alpha_pad_noreturn ();
+alpha_pad_function_end ();
 
   if (alpha_tp != ALPHA_TP_PROG || flag_exceptions)
 alpha_handle_trap_shadows ();

Re: Fix PR 53701

2012-08-09 Thread Alexander Monakov



On Thu, 9 Aug 2012, Andrey Belevantsev wrote:

> Hello,
> 
> The problem in question is uncovered by the recent speculation patch, it is in
> the handling of expressions blocked by bookkeeping.  Those are expressions
> that become unavailable due to the newly created bookkeeping copies.  In the
> original algorithm the supported insns and transformations cannot lead to this
> result, but when handling non-separable insns or creating speculative checks
> that unpredictably block certain insns the situation can arise.  We just
> filter out all such expressions from the final availability set for
> correctness.
> 
> The PR happens because the expression being filtered out can be transformed
> while being moved up, thus we need to look up not only its exact pattern but
> also all its previous forms saved in its history of changes.  The patch does
> exactly that, I also clarified the comments w.r.t. this situation.
> 
> Bootstrapped and tested on ia64 and x86-64, the PR testcase is minimized, too.
> OK for trunk?  Also need to backport this to 4.7 with PR 53975, say
> next week.

This is OK.

Thanks.

Alexander

Re: [SH] PR 50751

2012-08-09 Thread Kaz Kojima

Oleg Endo  wrote:
> This patch fixes a minor issue related to the displacement addressing
> patterns, which leads to useless movt exts.* sequences and one of the
> predicates wrongly accepting non-mem ops.
> 
> Tested on rev 190151 with
>  make -k check RUNTESTFLAGS="--target_board=sh-sim
> \{-m2/-ml,-m2/-mb,-m2a/-mb,-m4/-ml,-m4/-mb,-m4a/-ml,-m4a/-mb}"
> 
> and no new failures.
> OK?

OK.

Regards,
kaz

Re: [SH] PR 39423

2012-08-09 Thread Kaz Kojima

Oleg Endo  wrote:
> How about the attached patch?
> Is that way of dealing with the mems OK?
> What could be a possible test case for the alias info issue?
> 
> Tested on rev 190151 with
> make -k check RUNTESTFLAGS="--target_board=sh-sim
> \{-m2/-ml,-m2/-mb,-m2a/-mb,-m4/-ml,-m4/-mb,-m4a/-ml,-m4a/-mb}"
> 
> and no new failures.

This patch is OK.

Regards,
kaz

Re: [SH] PR 51244 - Improve store of floating-point comparison

2012-08-09 Thread Kaz Kojima

Oleg Endo  wrote:
> This patch mainly improves stores of negated/inverted floating point
> comparison results in regs and removes a useless zero-extension after
> storing the negated T bit in a reg.
[snip]
> Tested on rev 190151 with
>  make -k check RUNTESTFLAGS="--target_board=sh-sim
> \{-m2/-ml,-m2/-mb,-m2a/-mb,-m4/-ml,-m4/-mb,-m4a/-ml,-m4a/-mb}"
> 
> and no new failures.
> OK?

OK.

Regards,
kaz

Re: [PATCH, libjava] Use accessor functions to manipulate xmlOutputBuffer

2012-08-09 Thread Dodji Seketeli

Andrew Hughes  writes:

>> OK.
>> 
>
> As this is a GNU Classpath change, it should go in there first to avoid
> creating a divergence which will cause later problems in merging.
> Classpath is regularly merged into gcj as a whole.
>
> I found several patches during the last merge which had only been added
> to gcj (some without ChangeLog entries) and this slowed the process down
> considerably.
>
> Dodji, I can push this to Classpath on your behalf if you don't have commit
> access.

Oops.  I committed the patch before I saw your message.  Sorry.

If you agree, I can revert the commit so that you can commit it to
classpath then.  I don't think I have commit access to GNU classpath.

Sorry for the inconvenience.

-- 
Dodji

[PATCH] Fix PR54027

2012-08-09 Thread Richard Guenther


This fixes PR54027, where VRP treats overflow in signed left shifts as undefined.

Bootstrap and regtest pending on x86_64-unknown-linux-gnu.

Richard.

2012-08-09  Richard Guenther  

PR tree-optimization/54027
* tree-vrp.c (extract_range_from_binary_expr_1): Merge RSHIFT_EXPR
and LSHIFT_EXPR handling, force -fwrapv for the multiplication used
to handle LSHIFT_EXPR with a constant.

* gcc.dg/torture/pr54027.c: New testcase.
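
The identity behind mapping a constant left shift onto a multiplication is
x << n == x * 2**n, which only holds in general if the multiplication wraps.
A minimal C sketch of that idea follows; the function name
lshift_bits_via_mul and the fixed 32-bit width are assumptions for the
example, and this is not the VRP code.

```c
#include <assert.h>
#include <stdint.h>

/* Compute the bit pattern of X << N as a multiplication by 2**N.
   The arithmetic is done on uint32_t so that overflow wraps modulo
   2**32 instead of being undefined; the result is returned as the raw
   32-bit pattern.  */
static uint32_t
lshift_bits_via_mul (int32_t x, unsigned n)
{
  assert (n < 32);                     /* shift count must be in [0, prec-1] */
  uint32_t factor = (uint32_t) 1 << n; /* 2**n */
  return (uint32_t) x * factor;        /* wrapping multiply */
}
```

If the multiplication were treated as a signed operation with undefined
overflow, a case such as 0x40000000 shifted left by one would be assumed not
to happen, which is the kind of wrong conclusion the wrapping multiply avoids.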

Index: gcc/tree-vrp.c
===
*** gcc/tree-vrp.c  (revision 190252)
--- gcc/tree-vrp.c  (working copy)
*** extract_range_from_binary_expr_1 (value_
*** 2726,2782 
extract_range_from_multiplicative_op_1 (vr, code, &vr0, &vr1);
return;
  }
!   else if (code == RSHIFT_EXPR)
  {
/* If we have a RSHIFT_EXPR with any shift values outside [0..prec-1],
 then drop to VR_VARYING.  Outside of this range we get undefined
 behavior from the shift operation.  We cannot even trust
 SHIFT_COUNT_TRUNCATED at this stage, because that applies to rtl
 shifts, and the operation at the tree level may be widened.  */
!   if (vr1.type != VR_RANGE
! || !value_range_nonnegative_p (&vr1)
! || TREE_CODE (vr1.max) != INTEGER_CST
! || compare_tree_int (vr1.max, TYPE_PRECISION (expr_type) - 1) == 1)
{
! set_value_range_to_varying (vr);
! return;
}
- 
-   extract_range_from_multiplicative_op_1 (vr, code, &vr0, &vr1);
-   return;
- }
-   else if (code == LSHIFT_EXPR)
- {
-   /* If we have a LSHIFT_EXPR with any shift values outside [0..prec-1],
-then drop to VR_VARYING.  Outside of this range we get undefined
-behavior from the shift operation.  We cannot even trust
-SHIFT_COUNT_TRUNCATED at this stage, because that applies to rtl
-shifts, and the operation at the tree level may be widened.  */
-   if (vr1.type != VR_RANGE
- || !value_range_nonnegative_p (&vr1)
- || TREE_CODE (vr1.max) != INTEGER_CST
- || compare_tree_int (vr1.max, TYPE_PRECISION (expr_type) - 1) == 1)
-   {
- set_value_range_to_varying (vr);
- return;
-   }
- 
-   /* We can map shifts by constants to MULT_EXPR handling.  */
-   if (range_int_cst_singleton_p (&vr1))
-   {
- value_range_t vr1p = VR_INITIALIZER;
- vr1p.type = VR_RANGE;
- vr1p.min
-   = double_int_to_tree (expr_type,
- double_int_lshift (double_int_one,
-TREE_INT_CST_LOW (vr1.min),
-TYPE_PRECISION (expr_type),
-false));
- vr1p.max = vr1p.min;
- extract_range_from_multiplicative_op_1 (vr, MULT_EXPR, &vr0, &vr1p);
- return;
-   }
- 
set_value_range_to_varying (vr);
return;
  }
--- 2726,2773 
extract_range_from_multiplicative_op_1 (vr, code, &vr0, &vr1);
return;
  }
!   else if (code == RSHIFT_EXPR
!  || code == LSHIFT_EXPR)
  {
/* If we have a RSHIFT_EXPR with any shift values outside [0..prec-1],
 then drop to VR_VARYING.  Outside of this range we get undefined
 behavior from the shift operation.  We cannot even trust
 SHIFT_COUNT_TRUNCATED at this stage, because that applies to rtl
 shifts, and the operation at the tree level may be widened.  */
!   if (range_int_cst_p (&vr1)
! && compare_tree_int (vr1.min, 0) >= 0
! && compare_tree_int (vr1.max, TYPE_PRECISION (expr_type)) == -1)
{
! if (code == RSHIFT_EXPR)
!   {
! extract_range_from_multiplicative_op_1 (vr, code, &vr0, &vr1);
! return;
!   }
! /* We can map lshifts by constants to MULT_EXPR handling.  */
! else if (code == LSHIFT_EXPR
!  && range_int_cst_singleton_p (&vr1))
!   {
! bool saved_flag_wrapv;
! value_range_t vr1p = VR_INITIALIZER;
! vr1p.type = VR_RANGE;
! vr1p.min
!   = double_int_to_tree (expr_type,
! double_int_lshift
!   (double_int_one,
!TREE_INT_CST_LOW (vr1.min),
!TYPE_PRECISION (expr_type),
!false));
! vr1p.max = vr1p.min;
! /* We have to use a wrapping multiply though as signed overflow
!on lshifts is implementation defined in C89.  */
! saved_flag_wrapv = flag_wrapv;
! flag_wrapv = 1;
! extract_range_from_binary_expr_1 (vr, MULT_EXPR, expr_type,
!

Re: [PATCH] Intrinsics for ADCX

2012-08-09 Thread Richard Henderson

On 08/09/2012 05:21 AM, Michael Zolotukhin wrote:
> Changelog entry:
> 2012-08-09  Michael Zolotukhin  
> 
> * config/i386/adxintrin.h: Remove guarding __ADX__ check.
> * config/i386/x86intrin.h: Likewise.
> * config/i386/i386.c (ix86_init_mmx_sse_builtins): Remove
> OPTION_MASK_ISA_ADX from needed options for
> __builtin_ia32_addcarryx_u32 and __builtin_ia32_addcarryx_u64.
> (ix86_expand_builtin): Use add3_carry in expanding of
> IX86_BUILTIN_ADDCARRYX32 and IX86_BUILTIN_ADDCARRYX64.
> 
> testsuite/Changelog entry:
> 2012-08-09  Michael Zolotukhin  
> 
> * gcc.target/i386/adx-addxcarry32-3.c: New.
> * gcc.target/i386/adx-addxcarry64-3.c: New.

Ok.


r~

[PATCH] Set current_function_decl in {push,pop}_cfun and push_struct_function

2012-08-09 Thread Martin Jambor

Hi,

I've always found it silly that in order to change the current
function one has to call push_cfun and pop_cfun which conveniently set
and restore the value of cfun and in addition to that also set
current_function_decl and usually also cache its old value to restore
it back afterwards.  I also think that, at least throughout the
middle-end, we should strive to have current_function_decl consistent
with cfun->decl.  There are quite a few places where we are not
consistent and I think such situations are prone to nasty surprises as
various functions rely on cfun and others on current_function_decl and
it's easy to be unaware that one of the two is incorrect at the
moment.

This week I have therefore decided to try and make push_cfun, pop_cfun
and push_struct_function also set the current_function_decl.  Being
afraid of opening a giant can of worms, I only opened a mid-sized hole and
left various set_cfuns for later as well as places where we set
current_function_decl without bothering with cfun.  After a few
debugging sessions I came up with the patch below.  The changes are
mostly mechanical, let me try and explain some of the difficult or
not-quite-nice ones, most of which come from calls from front-ends
which generally do not care about cfun all that much.

- In order to ensure that pop_cfun will reliably restore the old
  current_function_decl, push_cfun asserts that cfun and
  current_function_decl match.  pop_cfun then simply restores
  current_function_decl to new_cfun->decl or NULL_TREE if new_cfun is
  NULL.  To check that the two remain consistent, pop_cfun has a
  similar (albeit checking) assert.

- I had to allow push_cfun(NULL) because in
  gfc_get_extern_function_decl in fortran/trans-decl.c we momentarily
  emulate top-level context by doing:

  current_function_decl = NULL_TREE;
  push_cfun (cfun);

  do_something ()

  pop_cfun ();
  current_function_decl = save_fn_decl;

  and to keep current_function_decl consistent with cfun, cfun had to
  be made NULL too.  So I converted the above to push_cfun (NULL)
  which also sets current_function_decl to NULL_TREE.

- I also had to allow push_cfun(NULL) because
  dwarf2out_abstract_function does just that, just it looks like:

  push_cfun (DECL_STRUCT_FUNCTION (decl));

  but DECL_STRUCT_FUNCTION is usually (always?) NULL for abstract
  origin functions.  But this also means that changed push_cfun sets
  current_function_decl to NULL, which means the abstract function is
  not dwarf2outed as it should be.  Thus, in perhaps the most awful
  hunk in this patch, I re-set current_function_decl after calling
  push_cfun.  If someone has a better idea how to deal with this, I'm
  certainly interested.

  For the same reason I do not assert that
  current_function_decl matches cfun->decl in pop_cfun if cfun is NULL.

- each cfun change also triggers a pair of init_dummy_function_start
  and expand_dummy_function_end which invoke push_struct_function and
  pop_cfun respectively.  Because we may be in the middle of another
  push/pop_cfun, the current_function_decl may not match and so the
  asserts are disabled in these cases, fortunately we can recognize
  them by looking at value of in_dummy_function.

- ada/gcc-interface/utils.c:rest_of_subprog_body_compilation calls
  dump_function which in turn calls dump_function_to_file which calls
  push_cfun.  But the Ada front end has its own idea of the
  current_function_decl and there is no cfun which is an inconsistency
  which makes push_cfun assert fail.  I "solved" it by temporarily
  setting current_function_decl to NULL_TREE.  It's just dumping and I
  thought that dump_function should be considered middle-end and thus
  middle-end invariants should apply.
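
The invariant described in the list above can be modelled in a few lines of
standalone C. This is a toy sketch, not the function.c change: the
struct function here holds only a decl, a string stands in for a tree, and the
fixed-size stack and its limit CFUN_STACK_MAX are assumptions for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in: 'decl' plays the role of the function's tree decl.  */
struct function
{
  const char *decl;
};

static struct function *cfun;
static const char *current_function_decl;

#define CFUN_STACK_MAX 16
static struct function *cfun_stack[CFUN_STACK_MAX];
static size_t cfun_depth;

/* Switch to NEW_CFUN (which may be NULL for top-level context) and
   keep current_function_decl in sync with it.  */
static void
push_cfun (struct function *new_cfun)
{
  /* On entry the two globals must already agree, otherwise pop_cfun
     could not restore current_function_decl from cfun alone.  */
  assert (current_function_decl == (cfun ? cfun->decl : NULL));
  assert (cfun_depth < CFUN_STACK_MAX);
  cfun_stack[cfun_depth++] = cfun;
  cfun = new_cfun;
  current_function_decl = new_cfun ? new_cfun->decl : NULL;
}

/* Restore the previous cfun and derive current_function_decl from it.  */
static void
pop_cfun (void)
{
  assert (cfun_depth > 0);
  cfun = cfun_stack[--cfun_depth];
  current_function_decl = cfun ? cfun->decl : NULL;
}
```

Because pop_cfun derives current_function_decl from the restored cfun, callers
in this model no longer need to save and restore it by hand around a
push/pop pair.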

The patch passes bootstrap and testing on x86_64-linux (all languages
+ ada + obj-c++) and ia64-linux (c,c++,fortran,objc,obj-c++).  There
is some confusing jitter in the go testing results which I have not
yet looked at (perhaps compare_tests just can't deal with it, there
are tests reported both as newly failing and newly working etc...) but
I thought that I'd send the patch now anyway to get some feedback in
case I was doing something else wrong (I also do not know whether
anyone but Ian can modify the go front-end).  I have also LTO-built
Mozilla Firefox with the patch.

Well, what do you think?

Martin


2012-08-08  Martin Jambor  

* function.c (push_cfun): Check old current_function_decl matches
old cfun, set new current_function_decl to the decl of the new
cfun.
(push_struct_function): Likewise.
(pop_cfun): Likewise.
(allocate_struct_function): Move call to
invoke_set_current_function_hook to the end of the function.
* cfgexpand.c (estimated_stack_frame_size): Do not set and restore
current_function_decl.
* cgraph.c (cgraph_release_function_body): Likewise.
* cgraphunit.c (cgraph_process_new_functions): Likewise.
(cgraph_ad

Re: [PATCH] Intrinsics for ADCX

2012-08-09 Thread Kirill Yukhin

>
> Ok.

Checked in:
http://gcc.gnu.org/ml/gcc-cvs/2012-08/msg00231.html

Thanks, K

Re: [PATCH, libjava] Use accessor functions to manipulate xmlOutputBuffer

2012-08-09 Thread Andrew Hughes

- Original Message -
> Andrew Hughes  writes:
> 
> >> OK.
> >> 
> >
> > As this is a GNU Classpath change, it should go in there first to
> > avoid creating
> > a divergence which will cause later problems in merging.  Classpath
> > is regularly
> > merged into gcj as a whole.
> >
> > I found several patches during the last merge which had only been
> > added to gcj
> > (some without ChangeLog entries) and this slowed the process down
> > considerably.
> >
> > Dodji, I can push this to Classpath on your behalf if you don't
> > have commit
> > access.
> 
> Oops.  I committed the patch before I saw your message.  Sorry.
> 
> If you agree, I can revert the commit so that you can commit it to
> classpath then.  I don't think I have commit access to GNU classpath.
> 
> Sorry for the inconvenience.
> 

Don't worry about reverting it.  I'll add it to Classpath now, then
they'll be in sync when we do the next merge.

In future, please post changes to files under the libjava/classpath directory to
classp...@gnu.org and feel free to ping me directly if you don't
get a response in a reasonable timeframe.  It just makes my life a bit
easier when it comes to doing the merges :-)

> --
>   Dodji
> 

-- 
Andrew :)

Free Java Software Engineer
Red Hat, Inc. (http://www.redhat.com)

PGP Key: 248BDC07 (https://keys.indymedia.org/)
Fingerprint = EC5A 1F5E C0AD 1D15 8F1F  8F91 3B96 A578 248B DC07

Re: [PATCH, alpha]: Prevent another case of wrong relaxation with exception handler

2012-08-09 Thread Richard Henderson

On 08/08/2012 11:23 PM, Uros Bizjak wrote:
> 2012-08-09  Uros Bizjak  
> 
> * config/alpha/alpha.c (alpha_pad_noreturn): Rename to ...
> (alpha_pad_function_end): ... this.  Also insert NOP between
> sibling call and GP load.
> (alpha_reorg): Update call to alpha_pad_function_end.  Expand comment.
> 
> Patch was bootstrapped and regression tested on alphaev68-pc-linux-gnu.
> 
> OK for mainline and release branches?

Ok everywhere.


r~

Re: PATCH: PR rtl-optimization/54157: [x32] -maddress-mode=long failures

2012-08-09 Thread H.J. Lu

On Wed, Aug 8, 2012 at 8:11 AM, Richard Sandiford
 wrote:
> "H.J. Lu"  writes:
>> On Wed, Aug 8, 2012 at 6:43 AM, Uros Bizjak  wrote:
>>> Probably we need to backport this patch to 4.7, where x32 is
>>> -maddress-mode=long by default.
>>>
>>
>> It doesn't fail on 4.7 branch since checking mode on PLUS CONST
>> is new on trunk.  However, I think it is a correctness issue.  Is this
>> OK to backport to 4.7?
>
> Yeah, I agree we should backport it.
>
> Richard

I am checking this into 4.7 branch.  Tested on Linux/x32, Linux/ia32
and Linux/x86-64.

Thanks.

-- 
H.J.
---
diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index bc7c36c..44b0d32 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,13 @@
+2012-08-09  H.J. Lu  
+
+   Backport from mainline
+   2012-08-08  Richard Sandiford  
+   H.J. Lu  
+
+   PR rtl-optimization/54157
+   * combine.c (gen_lowpart_for_combine): Don't return identity
+   for CONST or symbolic reference.
+
 2012-08-06  Uros Bizjak  

Backport from mainline
diff --git a/gcc/combine.c b/gcc/combine.c
index 3d81da8a..67bd776 100644
--- a/gcc/combine.c
+++ b/gcc/combine.c
@@ -10802,13 +10802,6 @@ gen_lowpart_for_combine (enum machine_mode omode, rtx x)
   if (omode == imode)
 return x;

-  /* Return identity if this is a CONST or symbolic reference.  */
-  if (omode == Pmode
-  && (GET_CODE (x) == CONST
- || GET_CODE (x) == SYMBOL_REF
- || GET_CODE (x) == LABEL_REF))
-return x;
-
   /* We can only support MODE being wider than a word if X is a
  constant integer or has a mode the same size.  */
   if (GET_MODE_SIZE (omode) > UNITS_PER_WORD
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index 9fd8113..ef35a62 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,3 +1,11 @@
+2012-08-09  H.J. Lu  
+
+   Backport from mainline
+   2012-08-08  H.J. Lu  
+
+   PR rtl-optimization/54157
+   * gcc.target/i386/pr54157.c: New file.
+
 2012-08-01  Uros Bizjak  

Backport from mainline
diff --git a/gcc/testsuite/gcc.target/i386/pr54157.c b/gcc/testsuite/gcc.target/i386/pr54157.c
new file mode 100644
index 000..59fcd79
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr54157.c
@@ -0,0 +1,21 @@
+/* { dg-do compile { target { ! { ia32 } } } } */
+/* { dg-options "-O2 -mx32 -ftree-vectorize" } */
+
+struct s2{
+  int n[24 -1][24 -1][24 -1];
+};
+
+struct test2{
+  struct s2 e;
+};
+
+struct test2 tmp2[4];
+
+void main1 ()
+{
+  int i,j;
+
+  for (i = 0; i < 24 -4; i++)
+  for (j = 0; j < 24 -4; j++)
+  tmp2[2].e.n[1][i][j] = 8;
+}

Re: [PATCH] Set current_function_decl in {push,pop}_cfun and push_struct_function

2012-08-09 Thread Richard Guenther

On Thu, Aug 9, 2012 at 4:26 PM, Martin Jambor  wrote:
> Hi,
>
> I've always found it silly that in order to change the current
> function one has to call push_cfun and pop_cfun which conveniently set
> and restore the value of cfun and in addition to that also set
> current_function_decl and usually also cache its old value to restore
> it back afterwards.  I also think that, at least throughout the
> middle-end, we should strive to have current_function_decl consistent
> with cfun->decl.  There are quite a few places where we are not
> consistent and I think such situations are prone to nasty surprises as
> various functions rely on cfun and others on current_function_decl and
> it's easy to be unaware that one of the two is incorrect at the
> moment.
>
> This week I have therefore decided to try and make push_cfun, pop_cfun
> and push_struct_function also set the current_function_decl.  Being
> afraid of opening a giant can of worms, I only opened a mid-sized hole and
> left various set_cfuns for later as well as places where we set
> current_function_decl without bothering with cfun.  After a few
> debugging sessions I came up with the patch below.  The changes are
> mostly mechanical, let me try and explain some of the difficult or
> not-quite-nice ones, most of which come from calls from front-ends
> which generally do not care about cfun all that much.
>
> - In order to ensure that pop_cfun will reliably restore the old
>   current_function_decl, push_cfun asserts that cfun and
>   current_function_decl match.  pop_cfun then simply restores
>   current_function_decl to new_cfun->decl or NULL_TREE if new_cfun is
>   NULL.  To check that the two remain consistent, pop_cfun has a
>   similar (albeit checking) assert.
>
> - I had to allow push_cfun(NULL) because in
>   gfc_get_extern_function_decl in fortran/trans-decl.c we momentarily
>   emulate top-level context by doing:
>
>   current_function_decl = NULL_TREE;
>   push_cfun (cfun);
>
>   do_something ()
>
>   pop_cfun ();
>   current_function_decl = save_fn_decl;
>
>   and to keep current_function_decl consistent with cfun, cfun had to
>   be made NULL too.  So I converted the above to push_cfun (NULL)
>   which also sets current_function_decl to NULL_TREE.
>
> - I also had to allow push_cfun(NULL) because
>   dwarf2out_abstract_function does just that, just it looks like:
>
>   push_cfun (DECL_STRUCT_FUNCTION (decl));
>
>   but DECL_STRUCT_FUNCTION is usually (always?) NULL for abstract
>   origin functions.  But this also means that changed push_cfun sets
>   current_function_decl to NULL, which means the abstract function is
>   not dwarf2outed as it should be.  Thus, in perhaps the most awful
>   thunk in this patch I re-set current_function_decl after calling
>   push_cfun.  If someone has a better idea how to deal with this, I'm
>   certainly interested.
>
>   For the same reason I do not assert that current_function_decl
>   matches cfun->decl in pop_cfun if cfun is NULL.
>
> - each cfun change also triggers a pair of init_dummy_function_start
>   and expand_dummy_function_end which invoke push_struct_function and
>   pop_cfun respectively.  Because we may be in the middle of another
>   push/pop_cfun, the current_function_decl may not match and so the
>   asserts are disabled in these cases; fortunately we can recognize
>   them by looking at the value of in_dummy_function.
>
> - ada/gcc-interface/utils.c:rest_of_subprog_body_compilation calls
>   dump_function which in turn calls dump_function_to_file which calls
>   push_cfun.  But the Ada front end has its own idea of the
>   current_function_decl and there is no cfun, an inconsistency
>   which makes the push_cfun assert fail.  I "solved" it by temporarily
>   setting current_function_decl to NULL_TREE.  It's just dumping and I
>   thought that dump_function should be considered middle-end and thus
>   middle-end invariants should apply.
>
> The patch passes bootstrap and testing on x86_64-linux (all languages
> + ada + obj-c++) and ia64-linux (c,c++,fortran,objc,obj-c++).  There
> is some confusing jitter in the go testing results which I have not
> yet looked at (perhaps compare_tests just can't deal with it, there
> are tests reported both as newly failing and newly working etc...) but
> I thought that I'd send the patch now anyway to get some feedback in
> case I was doing something else wrong (I also do not know whether
> anyone but Ian can modify the go front-end).  I have also LTO-built
> Mozilla Firefox with the patch.
>
> Well, what do you think?
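
The invariant described above (push_cfun and pop_cfun keeping cfun and
current_function_decl in step, with push_cfun (NULL) emulating top-level
context) can be sketched with a toy model.  This is only an illustration
under stated assumptions: struct function, the fixed-size stack and the
string used as a "decl" are simplified stand-ins, not GCC's real types,
and the dummy-function and dwarf2out special cases are left out.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for GCC's struct function; "decl" is just a name here.  */
struct function { const char *decl; };

static struct function *cfun;
static const char *current_function_decl;

#define CFUN_STACK_MAX 16
static struct function *cfun_stack[CFUN_STACK_MAX];
static int cfun_depth;

/* Set cfun and derive current_function_decl from it in one place.  */
static void
set_cfun_and_decl (struct function *f)
{
  cfun = f;
  current_function_decl = f ? f->decl : NULL;
}

/* Save the old cfun and switch to NEW_CFUN, which may be NULL.  The
   assert checks that the two globals agreed before the switch, so that
   pop_cfun can restore both from the saved cfun alone.  */
static void
push_cfun (struct function *new_cfun)
{
  assert (cfun_depth < CFUN_STACK_MAX);
  assert ((cfun ? cfun->decl : NULL) == current_function_decl);
  cfun_stack[cfun_depth++] = cfun;
  set_cfun_and_decl (new_cfun);
}

/* Restore the previous cfun and the matching current_function_decl.  */
static void
pop_cfun (void)
{
  assert (cfun_depth > 0);
  set_cfun_and_decl (cfun_stack[--cfun_depth]);
}
```

With this model, push_cfun (NULL) leaves both globals NULL, and the
matching pop_cfun brings both back without any manual save and restore of
current_function_decl.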

Well.  We should try to get rid of most push/pop_cfun calls, and the middle-end
should never need to look at current_function_decl ... (in practice we have
tree.c and fold-const.c which have to because they are shared between FE and
middle-end).

For example the use in estimate_stack_frame_size.  Or the uses in IPA
passes.  It would be nice to figure out which parts need access to
cfun/current_function_decl in them (thus,

Ping Re: Add --no-sysroot-suffix driver option

2012-08-09 Thread Joseph S. Myers

Ping.  This patch is pending review.

-- 
Joseph S. Myers
jos...@codesourcery.com

Re: [PATCH, libjava] Use accessor functions to manipulate xmlOutputBuffer

2012-08-09 Thread Dodji Seketeli

Andrew Hughes  writes:

> Don't worry about reverting it.  I'll add it to Classpath now, then
> they'll be in sync when we do the next merge.

Thank you.

> In future, please post changes to files under the libjava/classpath directory 
> to
> classp...@gnu.org and feel free to ping me directly if you don't
> get a response in a reasonable timeframe.  It just makes my life a bit
> easier when it comes to doing the merges :-)

OK, I will do.  Sorry for the inconvenience.

-- 
Dodji

Re: [cxx-conversion] Make double_int a class with methods and operators. (issue6443093)

2012-08-09 Thread Mike Stump

On Aug 9, 2012, at 1:22 AM, Richard Guenther wrote:
> Ah.  For simple objects like double_int I prefer to have either all ops 
> mutating
> or all ops non-mutating.

wide_int, which replaces double_int for int types, is always non-mutating, by 
value interface.  In C++, it will be const & input parameters, to avoid the 
copies and retain the performance. We maintain a cache under it, and reuse out 
of it for the long lived objects; for short lived ones, we just allocate them on the 
stack as needed.
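
A non-mutating, by-value interface of the kind described here can be
sketched in plain C.  The struct dw_int type and dw_add function below are
hypothetical names for illustration only, not the wide_int or double_int
API; pointer-to-const parameters stand in for the "const &" inputs
mentioned for C++, and the cache is not modelled.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical two-word integer, least significant word first.  */
struct dw_int
{
  uint64_t low;
  uint64_t high;
};

/* Return A + B without modifying either operand.  The inputs are passed
   by pointer-to-const to avoid copies; the result comes back by value.  */
static struct dw_int
dw_add (const struct dw_int *a, const struct dw_int *b)
{
  struct dw_int r;
  r.low = a->low + b->low;
  /* Unsigned wrap-around in the low word means a carry into the high word.  */
  r.high = a->high + b->high + (r.low < a->low);
  return r;
}
```

Because nothing is mutated, an expression such as dw_add (&x, &y) can be
used freely in larger computations without the caller having to worry
about x or y changing underneath it.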

Re: [cxx-conversion] Make double_int a class with methods and operators. (issue6443093)

2012-08-09 Thread Michael Matz

Hi,

On Thu, 9 Aug 2012, Mike Stump wrote:

> > Ah.  For simple objects like double_int I prefer to have either all 
> > ops mutating or all ops non-mutating.
> 
> wide_int, which replaces double_int for int types, is always 
> non-mutating, by value interface.  In C++, it will be const & input 
> parameters, to avoid the copies and retain the performance. We maintain 
> a cache under it, and reuse out of it for the long lived objects, for 
> short lived, we just allocate them on the stack as needed.

Hmm.  And maintaining a cache is faster than 
passing/returning/manipulating two registers?


Ciao,
Michael.

Re: MIPS Android patch

2012-08-09 Thread H.J. Lu

On Fri, Apr 20, 2012 at 6:15 PM, Maxim Kuvyrkov  wrote:
> On 20/04/2012, at 1:34 PM, Fu, Chao-Ying wrote:
>
>> Hi Maxim, Richard,
>>
>>  I built cross-toolchains for 3 different targets as follows.
>> 1. mips-linux-gnu
>> 2. mips-linux-gnu --enable-targets=all
>> 3. mips64-linux-gnu
>>
>>  These targets are affected by this MIPS Android patch.
>>
>>  Then, I checked the output from "gcc -dumpspecs" before and after applying 
>> the patch.
>> The specs have 6 places of differences for Android due to new defines in 
>> linux-common.h.
>> I am also building GCC natively, and will test GCC natively later.
>> Any feedback?  Thanks!
>>
>> Regards,
>> Chao-ying
>>
>> libgcc/ChangeLog
>> 2012-04-19  Chao-ying Fu  
>>
>>   * unwind-dw2-fde-dip.c: Define USE_PT_GNU_EH_FRAME for BIONIC.
>
> This piece is trivial, so, given that Richard approved the MIPS changes, you 
> are clear to check in after amending the patch per Richard's comments.  
> Please check in the patch to unwind-dw2-fde-dip.c separately, as it is a 
> change on its own.
>
> Thank you,
>

This breaks Android/x86 build:

#if defined(USE_PT_GNU_EH_FRAME)

#include <link.h>

but Bionic/x86 doesn't have link.h

-- 
H.J.

Re: MIPS Android patch

2012-08-09 Thread H.J. Lu

On Thu, Aug 9, 2012 at 8:45 AM, H.J. Lu  wrote:
> On Fri, Apr 20, 2012 at 6:15 PM, Maxim Kuvyrkov  
> wrote:
>> On 20/04/2012, at 1:34 PM, Fu, Chao-Ying wrote:
>>
>>> Hi Maxim, Richard,
>>>
>>>  I built cross-toolchains for 3 different targets as follows.
>>> 1. mips-linux-gnu
>>> 2. mips-linux-gnu --enable-targets=all
>>> 3. mips64-linux-gnu
>>>
>>>  These targets are affected by this MIPS Android patch.
>>>
>>>  Then, I checked the output from "gcc -dumpspecs" before and after applying 
>>> the patch.
>>> The specs have 6 places of differences for Android due to new defines in 
>>> linux-common.h.
>>> I am also building GCC natively, and will test GCC natively later.
>>> Any feedback?  Thanks!
>>>
>>> Regards,
>>> Chao-ying
>>>
>>> libgcc/ChangeLog
>>> 2012-04-19  Chao-ying Fu  
>>>
>>>   * unwind-dw2-fde-dip.c: Define USE_PT_GNU_EH_FRAME for BIONIC.
>>
>> This piece is trivial, so, given that Richard approved the MIPS changes, you 
>> are clear to check in after amending the patch per Richard's comments.  
>> Please check in the patch to unwind-dw2-fde-dip.c separately, as it is a 
>> change on its own.
>>
>> Thank you,
>>
>
> This breaks Android/x86 build:
>
> #if defined(USE_PT_GNU_EH_FRAME)
>
> #include <link.h>
>
> but Bionic/x86 doesn't have link.h
>
> --
> H.J.

I opened:

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54209


-- 
H.J.

Re: [PATCH, MIPS] fix MIPS16 hard-float function stub bugs

2012-08-09 Thread Sandra Loosemore


On 08/08/2012 03:07 AM, Richard Sandiford wrote:


It looks like this patch might have been written before:

   http://gcc.gnu.org/ml/gcc-patches/2012-01/msg00756.html

which added:

   /* If we're calling a locally-defined MIPS16 function, we know that
  it will return values in both the "soft-float" and "hard-float"
  registers.  There is no need to use a stub to move the latter
  to the former.  */
   if (fp_code == 0 && mips16_local_function_p (fn))
 return NULL_RTX;

to cope with this.


Yes, you are right; this patch does predate yours, and I'd missed that 
you'd already committed another fix for what looks like the same problem.



If so, and out of nervousness :-), did the testcase fail with
current trunk before the patch?


The testcase bundled with the patch is OK on current trunk.  But, I have 
to admit that the "real" testcase that motivated this patch was building 
Android.  It's going to take us a while to figure out whether your patch 
alone is adequate to make that work, so I'll withdraw this patch pending 
the outcome of those experiments.


-Sandra

PATCH: PR bootstrap/54209: [4.8 Regression] Failed to build gcc for Android/x86

2012-08-09 Thread H.J. Lu

Hi,

Bionic C library doesn't provide link.h.  This patch reverts revision
186788:

http://gcc.gnu.org/ml/gcc-cvs/2012-04/msg00740.html

OK to install?

Thanks.

H.J.
---
2012-08-09  H.J. Lu  

PR bootstrap/54209
* unwind-dw2-fde-dip.c (USE_PT_GNU_EH_FRAME): Don't define for
Bionic C library.

diff --git a/libgcc/unwind-dw2-fde-dip.c b/libgcc/unwind-dw2-fde-dip.c
index 92f8ab5..f57dc8c 100644
--- a/libgcc/unwind-dw2-fde-dip.c
+++ b/libgcc/unwind-dw2-fde-dip.c
@@ -54,11 +54,6 @@
 #endif
 
 #if !defined(inhibit_libc) && defined(HAVE_LD_EH_FRAME_HDR) \
-&& defined(__BIONIC__)
-# define USE_PT_GNU_EH_FRAME
-#endif
-
-#if !defined(inhibit_libc) && defined(HAVE_LD_EH_FRAME_HDR) \
 && defined(__FreeBSD__) && __FreeBSD__ >= 7
 # define ElfW __ElfN
 # define USE_PT_GNU_EH_FRAME

[SH] PR 54089 - Reinstate T_REG clobber for left shifts

2012-08-09 Thread Oleg Endo

Hello,

Removing the T_REG clobber from the left shift patterns entirely wasn't
such a good idea.  Especially if dynamic shifts are not available
(anything < SH3) incorrect code may be generated.
The attached patch adds a T_REG clobbering version of the left shift
insn "ashlsi3_n".  While at it, I consolidated the description of the
constant shift sequences, which hopefully makes them easier to read and
understand.
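
As a rough illustration of the consolidated table, the sketch below copies
the first 16 entries as they appear in the patch and adds small lookup
helpers.  The helper names and bodies are assumptions made for this
example (the body of sh_ashlsi_clobbers_t_reg_p is not shown in this
message), and signed char is used instead of plain char so that the
negative amounts behave the same on every host.

```c
#include <assert.h>
#include <stdbool.h>

enum
{
  ASHL_CLOBBERS_T = 1 << 0,
  LSHR_CLOBBERS_T = 1 << 1
};

struct ashl_lshr_sequence
{
  signed char insn_count;
  signed char amount[6];
  signed char clobbers_t;
};

/* Entries for shift counts 0..15, taken from the patch.  A negative
   amount is a shift in the opposite direction.  */
static const struct ashl_lshr_sequence seq_sketch[16] =
{
  { 0, { 0 },          0 },
  { 1, { 1 },          LSHR_CLOBBERS_T },
  { 1, { 2 },          0 },
  { 2, { 2, 1 },       LSHR_CLOBBERS_T },
  { 2, { 2, 2 },       0 },
  { 3, { 2, 1, 2 },    LSHR_CLOBBERS_T },
  { 3, { 2, 2, 2 },    0 },
  { 4, { 2, 2, 1, 2 }, LSHR_CLOBBERS_T },
  { 1, { 8 },          0 },
  { 2, { 8, 1 },       LSHR_CLOBBERS_T },
  { 2, { 8, 2 },       0 },
  { 3, { 8, 1, 2 },    LSHR_CLOBBERS_T },
  { 3, { 8, 2, 2 },    0 },
  { 4, { 8, 2, 1, 2 }, LSHR_CLOBBERS_T },
  { 3, { 8, -2, 8 },   0 },
  { 3, { 8, -1, 8 },   ASHL_CLOBBERS_T }
};

/* Hypothetical helper: does a left shift by N (0..15) clobber T?  */
static bool
ashl_clobbers_t_sketch (int n)
{
  assert (n >= 0 && n < 16);
  return (seq_sketch[n].clobbers_t & ASHL_CLOBBERS_T) != 0;
}

/* Hypothetical helper: does a logical right shift by N (0..15) clobber T?  */
static bool
lshr_clobbers_t_sketch (int n)
{
  assert (n >= 0 && n < 16);
  return (seq_sketch[n].clobbers_t & LSHR_CLOBBERS_T) != 0;
}

/* Sum of the individual shift amounts; should equal N for every entry.  */
static int
total_shift_sketch (int n)
{
  int sum = 0;
  assert (n >= 0 && n < 16);
  for (int i = 0; i < seq_sketch[n].insn_count; i++)
    sum += seq_sketch[n].amount[i];
  return sum;
}
```

In this excerpt a left shift clobbers T only when its sequence contains
a -1 step (a one bit right shift), which is why the entry for 15 carries
ASHL_CLOBBERS_T while the entry for 14 does not.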

Tested on rev 190151 with
make -k check RUNTESTFLAGS="--target_board=sh-sim
\{-m2/-ml,-m2/-mb,-m2a/-mb,-m4/-ml,-m4/-mb,-m4a/-ml,-m4a/-mb}"

and no new failures.
OK?

Cheers,
Oleg

ChangeLog:

PR target/54089
* config/sh/sh-protos (shift_insns_rtx): Delete.
(sh_ashlsi_clobbers_t_reg_p): Add.
* config/sh/sh.c (shift_insns, shift_amounts, ext_shift_insns, 
ext_shift_amounts): Merge arrays of ints to array of structs.  
Adapt usage of arrays throughout the file.
(shift_insns_rtx): Delete unused function.
(sh_ashlsi_clobbers_t_reg_p): New function.
* config/sh/sh.md (ashlsi3): Emit ashlsi3_n_clobbers_t insn if 
the final shift sequence will clobber T_REG.
(ashlsi3_n): Split only if the final shift sequence will not 
clobber T_REG.
(ashlsi3_n_clobbers_t): New insn_and_split.
Index: gcc/config/sh/sh-protos.h
===
--- gcc/config/sh/sh-protos.h	(revision 190151)
+++ gcc/config/sh/sh-protos.h	(working copy)
@@ -73,7 +73,7 @@
 extern rtx sh_emit_cheap_store_flag (enum machine_mode, enum rtx_code, rtx, rtx);
 extern void sh_emit_compare_and_branch (rtx *, enum machine_mode);
 extern void sh_emit_compare_and_set (rtx *, enum machine_mode);
-extern int shift_insns_rtx (rtx);
+extern bool sh_ashlsi_clobbers_t_reg_p (rtx);
 extern void gen_shifty_op (int, rtx *);
 extern void gen_shifty_hi_op (int, rtx *);
 extern bool expand_ashiftrt (rtx *);
Index: gcc/config/sh/sh.c
===
--- gcc/config/sh/sh.c	(revision 190151)
+++ gcc/config/sh/sh.c	(working copy)
@@ -2786,72 +2786,117 @@
   return false;
 }
 
-/* Actual number of instructions used to make a shift by N.  */
+/* Number of instructions used to make an arithmetic right shift by N.  */
 static const char ashiftrt_insns[] =
   { 0,1,2,3,4,5,8,8,8,8,8,8,8,8,8,8,2,3,4,5,8,8,8,8,8,8,8,8,8,8,8,2};
 
-/* Left shift and logical right shift are the same.  */
-static const char shift_insns[]=
-  { 0,1,1,2,2,3,3,4,1,2,2,3,3,4,3,3,1,2,2,3,3,4,3,3,2,3,3,4,4,4,3,3};
+/* Description of a logical left or right shift, when expanded to a sequence
+   of 1/2/8/16 shifts.
+   Notice that one bit right shifts clobber the T bit.  One bit left shifts
+   are done with an 'add Rn,Rm' insn and thus do not clobber the T bit.  */
+enum
+{
+  ASHL_CLOBBERS_T = 1 << 0,
+  LSHR_CLOBBERS_T = 1 << 1
+};
 
-/* Individual shift amounts needed to get the above length sequences.
-   One bit right shifts clobber the T bit, so when possible, put one bit
-   shifts in the middle of the sequence, so the ends are eligible for
-   branch delay slots.  */
-static const short shift_amounts[32][5] = {
-  {0}, {1}, {2}, {2, 1},
-  {2, 2}, {2, 1, 2}, {2, 2, 2}, {2, 2, 1, 2},
-  {8}, {8, 1}, {8, 2}, {8, 1, 2},
-  {8, 2, 2}, {8, 2, 1, 2}, {8, -2, 8}, {8, -1, 8},
-  {16}, {16, 1}, {16, 2}, {16, 1, 2},
-  {16, 2, 2}, {16, 2, 1, 2}, {16, -2, 8}, {16, -1, 8},
-  {16, 8}, {16, 1, 8}, {16, 8, 2}, {16, 8, 1, 2},
-  {16, 8, 2, 2}, {16, -1, -2, 16}, {16, -2, 16}, {16, -1, 16}};
+struct ashl_lshr_sequence
+{
+  char insn_count;
+  char amount[6];
+  char clobbers_t;
+};
 
-/* Likewise, but for shift amounts < 16, up to three highmost bits
-   might be clobbered.  This is typically used when combined with some
+static const struct ashl_lshr_sequence ashl_lshr_seq[32] =
+{
+  { 0, { 0 },		0 },
+  { 1, { 1 },		LSHR_CLOBBERS_T },
+  { 1, { 2 },		0 },
+  { 2, { 2, 1 },	LSHR_CLOBBERS_T },
+  { 2, { 2, 2 },	0 },
+  { 3, { 2, 1, 2 },	LSHR_CLOBBERS_T },
+  { 3, { 2, 2, 2 },	0 },
+  { 4, { 2, 2, 1, 2 },	LSHR_CLOBBERS_T },
+  { 1, { 8 },		0 },
+  { 2, { 8, 1 },	LSHR_CLOBBERS_T },
+  { 2, { 8, 2 },	0 },
+  { 3, { 8, 1, 2 },	LSHR_CLOBBERS_T },
+  { 3, { 8, 2, 2 },	0 },
+  { 4, { 8, 2, 1, 2 },	LSHR_CLOBBERS_T },
+  { 3, { 8, -2, 8 },	0 },
+  { 3, { 8, -1, 8 },	ASHL_CLOBBERS_T },
+  { 1, { 16 },		0 },
+  { 2, { 16, 1 },	LSHR_CLOBBERS_T },
+  { 2, { 16, 2 },	0 },
+  { 3, { 16, 1, 2 },	LSHR_CLOBBERS_T },
+  { 3, { 16, 2, 2 },	0 },
+  { 4, { 16, 2, 1, 2 },	LSHR_CLOBBERS_T },
+  { 3, { 16, -2, 8 },	0 },
+  { 3, { 16, -1, 8 },	ASHL_CLOBBERS_T },
+  { 2, { 16, 8 },	0 },
+  { 3, { 16, 1, 8 },	LSHR_CLOBBERS_T },
+  { 3, { 16, 8, 2 },	0 },
+  { 4, { 16, 8, 1, 2 }, LSHR_CLOBBERS_T },
+  { 4, { 16, 8, 2, 2 },	0 },
+  { 4, { 16, -1, -2, 16 },  ASHL_CLOBBERS_T },
+  { 3, { 16, -2, 16 },	0 },
+  {

Re: Commit: RL78: Include tree-pass.h

2012-08-09 Thread DJ Delorie


> The issue is that using the plugin interface makes breakage only
> detectable when you are able to test a target, not by merely
> building it.

You just described *most* of the bugs I have to deal with.

Re: s390: Avoid CAS boolean output inefficiency

2012-08-09 Thread Eric Botcazou

> This was caused (or perhaps abetted by) the representation of EQ
> as NE ^ 1.  With the subsequent truncation and zero-extend, I
> think combine reached its insn limit of 3 before seeing everything
> it needed to see.

This can be 4 now, if you tweak the initial heuristic.

-- 
Eric Botcazou

Re: [cxx-conversion] Make double_int a class with methods and operators. (issue6443093)

2012-08-09 Thread Mike Stump

On Aug 9, 2012, at 8:19 AM, Michael Matz wrote:
> Hmm.  And maintaining a cache is faster than 
> passing/returning/manipulating two registers?

For the most part, we merely mirror existing code, check out 
lookup_const_double and immed_double_const.  If the existing code is wrong, 
love to have someone fix it.  :-)  Also, bear in mind, on a port with 
OImode math for example, on a 32-bit host, it would be 8 registers...

Re: [PATCH][7/6] Allow anonymous SSA names

2012-08-09 Thread Richard Henderson

On 08/09/2012 06:20 AM, Richard Guenther wrote:
> This converts most users of create_tmp_{var,reg} to use anonymous
> SSA names.  To give you one more reason to look at 6/6 ;)

Wow, there's some really nice cleanups in there.


r~

RE: PATCH: PR bootstrap/54209: [4.8 Regression] Failed to build gcc for Android/x86

2012-08-09 Thread Fu, Chao-Ying

> > Hi,
> > 
> > Bionic C library doesn't provide link.h.  This patch 
> reverts revision
> > 186788:
> > 
> > http://gcc.gnu.org/ml/gcc-cvs/2012-04/msg00740.html
> > 
> > OK to install?
> > 
> > Thanks.
> > 
> > H.J.
> > ---
> > 2012-08-09  H.J. Lu  
> > 
> > PR bootstrap/54209
> > * unwind-dw2-fde-dip.c (USE_PT_GNU_EH_FRAME): Don't define for
> > Bionic C library.
> > 
> > diff --git a/libgcc/unwind-dw2-fde-dip.c 
> b/libgcc/unwind-dw2-fde-dip.c
> > index 92f8ab5..f57dc8c 100644
> > --- a/libgcc/unwind-dw2-fde-dip.c
> > +++ b/libgcc/unwind-dw2-fde-dip.c
> > @@ -54,11 +54,6 @@
> >  #endif
> >  
> >  #if !defined(inhibit_libc) && defined(HAVE_LD_EH_FRAME_HDR) \
> > -&& defined(__BIONIC__)
> > -# define USE_PT_GNU_EH_FRAME
> > -#endif
> > -
> > -#if !defined(inhibit_libc) && defined(HAVE_LD_EH_FRAME_HDR) \
> >  && defined(__FreeBSD__) && __FreeBSD__ >= 7
> >  # define ElfW __ElfN
> >  # define USE_PT_GNU_EH_FRAME
> > 
> 
>   How about this patch?  Just enable it for MIPS that 
> provides link.h in Android NDK.
> Thanks a lot!
> 
> Regards,
> Chao-ying
> 
> Index: unwind-dw2-fde-dip.c
> ===
> --- unwind-dw2-fde-dip.c(revision 190260)
> +++ unwind-dw2-fde-dip.c(working copy)
> @@ -55,6 +55,7 @@
> 
>  #if !defined(inhibit_libc) && defined(HAVE_LD_EH_FRAME_HDR) \
>  && defined(__BIONIC__)
> +&& defined(__mips__)
>  # define USE_PT_GNU_EH_FRAME
>  #endif
> 

  Sorry, I forgot \ in the previous patch.
Ex:
Index: unwind-dw2-fde-dip.c
===
--- unwind-dw2-fde-dip.c(revision 190260)
+++ unwind-dw2-fde-dip.c(working copy)
@@ -54,7 +54,8 @@
 #endif

 #if !defined(inhibit_libc) && defined(HAVE_LD_EH_FRAME_HDR) \
-&& defined(__BIONIC__)
+&& defined(__BIONIC__) \
+&& defined(__mips__)
 # define USE_PT_GNU_EH_FRAME
 #endif

RE: PATCH: PR bootstrap/54209: [4.8 Regression] Failed to build gcc for Android/x86

2012-08-09 Thread Fu, Chao-Ying

> Hi,
> 
> Bionic C library doesn't provide link.h.  This patch reverts revision
> 186788:
> 
> http://gcc.gnu.org/ml/gcc-cvs/2012-04/msg00740.html
> 
> OK to install?
> 
> Thanks.
> 
> H.J.
> ---
> 2012-08-09  H.J. Lu  
> 
>   PR bootstrap/54209
>   * unwind-dw2-fde-dip.c (USE_PT_GNU_EH_FRAME): Don't define for
>   Bionic C library.
> 
> diff --git a/libgcc/unwind-dw2-fde-dip.c b/libgcc/unwind-dw2-fde-dip.c
> index 92f8ab5..f57dc8c 100644
> --- a/libgcc/unwind-dw2-fde-dip.c
> +++ b/libgcc/unwind-dw2-fde-dip.c
> @@ -54,11 +54,6 @@
>  #endif
>  
>  #if !defined(inhibit_libc) && defined(HAVE_LD_EH_FRAME_HDR) \
> -&& defined(__BIONIC__)
> -# define USE_PT_GNU_EH_FRAME
> -#endif
> -
> -#if !defined(inhibit_libc) && defined(HAVE_LD_EH_FRAME_HDR) \
>  && defined(__FreeBSD__) && __FreeBSD__ >= 7
>  # define ElfW __ElfN
>  # define USE_PT_GNU_EH_FRAME
> 

  How about this patch?  Just enable it for MIPS that provides link.h in 
Android NDK.
Thanks a lot!

Regards,
Chao-ying

Index: unwind-dw2-fde-dip.c
===
--- unwind-dw2-fde-dip.c(revision 190260)
+++ unwind-dw2-fde-dip.c(working copy)
@@ -55,6 +55,7 @@

 #if !defined(inhibit_libc) && defined(HAVE_LD_EH_FRAME_HDR) \
 && defined(__BIONIC__)
+&& defined(__mips__)
 # define USE_PT_GNU_EH_FRAME
 #endif

[PATCH, i386]: Improve LIMIT_RELOAD_CLASSES

2012-08-09 Thread Uros Bizjak

On Sat, Aug 4, 2012 at 2:26 PM, Uros Bizjak  wrote:

>>> Without this, on the new testcase we hit the assert in
>>> inline_secondary_memory_needed. The comment before the function states:
>>>
>>>   The macro can't work reliably when one of the CLASSES is class
>>>   containing registers from multiple units (SSE, MMX, integer).  We
>>>   avoid this by never combining those units in single alternative in
>>>   the machine description. Ensure that this constraint holds to avoid
>>>   unexpected surprises.
>>>
>>> So, this indicates that we shouldn't be using INT_SSE_REGS for a reload
>>> class at all, and I expect that at the moment we don't. With the patch,
>>> the new find_valid_class_1 discovers INT_SSE_REGS as the best class for
>>> the register to hold the SYMBOL_REF, leading to the failed assert.
>
> Actually, the existing LIMIT_RELOAD_CLASS is way too simple to handle all
> issues with mixed register sets. Looking at ix86_hard_regno_mode_ok,
> we have problems with DI and SI mode, which can go into XMM and GENERAL
> regs, and SF and DF mode, which can go into XMM, FLOAT and GENERAL
> regs, depending on the availability of units.
>
> Attached (RFC) patch handles this limitation by limiting multiple
> register set modes to the "natural mode" register set, i.e. DI and SI
> modes will always return GENERAL_REGS, DF and SF will return either
> SSE_REGS, or FLOAT_REGS or GENERAL_REGS. Please note, that we don't
> want to widen i.e. CREG or ADREG narrow classes to full GENERAL_REGS.
>
> The patch also improves Q_REGS selection in the same way, and adds a
> couple of missing registers to various register sets, so the macro
> works as expected.

I have committed the patch to mainline SVN. The testcase options were
adjusted to really fail for all default cases of -mfpmath on x86. Also,
a testcase that fails on 64bit targets was added.
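
The idea of limiting a mixed-unit class to its "natural mode" class can be
sketched with register classes modelled as bit sets.  Everything below is
a toy model under stated assumptions: the bit values, the mode enum and
the function name are invented for the example, only the QImode, SImode
and DImode branches and the start of the SSE branch are visible in the
diff, and the x87 and fallback handling of SF/DF is omitted.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy register classes as bit sets; the values are invented.  */
typedef unsigned reg_set;
enum
{
  Q_REGS       = 0x003,
  GENERAL_REGS = 0x00f,
  FLOAT_REGS   = 0x0f0,
  SSE_REGS     = 0xf00,
  INT_SSE_REGS = GENERAL_REGS | SSE_REGS,
  ALL_REGS     = 0xfff
};

enum mode_sketch { MODE_QI, MODE_SI, MODE_DI, MODE_SF, MODE_DF };

/* True if every register in A is also in B.  Note the argument order:
   subset_p (GENERAL_REGS, cls) asks whether CLS contains ALL of
   GENERAL_REGS, so narrow classes are never widened.  */
static bool
subset_p (reg_set a, reg_set b)
{
  return (a & ~b) == 0;
}

static reg_set
limit_reload_class_sketch (enum mode_sketch m, reg_set cls,
                           bool target_64bit, bool sse_math)
{
  if (m == MODE_QI && !target_64bit && subset_p (Q_REGS, cls))
    return Q_REGS;
  if ((m == MODE_SI || m == MODE_DI) && subset_p (GENERAL_REGS, cls))
    return GENERAL_REGS;
  /* Assumed result for the SSE branch, based on the description in
     this message rather than on the truncated diff.  */
  if ((m == MODE_SF || m == MODE_DF) && sse_math && subset_p (SSE_REGS, cls))
    return SSE_REGS;
  return cls;
}
```

Under this model an SImode value asked to live in INT_SSE_REGS is limited
to GENERAL_REGS, while a request for the narrower Q_REGS is returned
unchanged because Q_REGS does not contain all of GENERAL_REGS.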

2012-08-09  Uros Bizjak  

* config/i386/i386.h (LIMIT_RELOAD_CLASS): Return preferred
single unit register class for classes that contain registers from
multiple units.
(REG_CLASS_CONTENTS): Add missing "frame" register to FLOAT_INT_REGS,
INT_SSE_REGS and FLOAT_INT_SSE_REGS register classes.

testsuite/ChangeLog:

2012-08-09  Uros Bizjak  

* gcc.c-torture/compile/20120727-1.c (dg-options): Add -mfpmath=387
for x86 targets.
* gcc.c-torture/compile/20120727-2.c: New test.

Re-tested on x86_64-pc-linux-gnu {,-m32} and committed.

Uros.
Index: config/i386/i386.h
===
--- config/i386/i386.h  (revision 190254)
+++ config/i386/i386.h  (working copy)
@@ -1298,9 +1298,9 @@
 { 0x1fe00100,0x1fe000 },   /* FP_TOP_SSE_REG */\
 { 0x1fe00200,0x1fe000 },   /* FP_SECOND_SSE_REG */ \
 { 0x1fe0ff00,0x1fe000 },   /* FLOAT_SSE_REGS */\
-   { 0x1,  0x1fe0 },   /* FLOAT_INT_REGS */\
-{ 0x1fe100ff,0x1fffe0 },   /* INT_SSE_REGS */  \
-{ 0x1fe1,0x1fffe0 },   /* FLOAT_INT_SSE_REGS */\
+  { 0x11,  0x1fe0 },   /* FLOAT_INT_REGS */\
+{ 0x1ff100ff,0x1fffe0 },   /* INT_SSE_REGS */  \
+{ 0x1ff1,0x1fffe0 },   /* FLOAT_INT_SSE_REGS */\
 { 0x,0x1f }
\
 }
 
@@ -1378,15 +1378,28 @@
 
 /* Place additional restrictions on the register class to use when it
is necessary to be able to hold a value of mode MODE in a reload
-   register for which class CLASS would ordinarily be used.  */
+   register for which class CLASS would ordinarily be used.
 
-#define LIMIT_RELOAD_CLASS(MODE, CLASS)\
-  ((MODE) == QImode && !TARGET_64BIT   \
-   && ((CLASS) == ALL_REGS || (CLASS) == GENERAL_REGS  \
-   || (CLASS) == LEGACY_REGS || (CLASS) == INDEX_REGS) \
-   ? Q_REGS\
-   : (CLASS) == INT_SSE_REGS ? GENERAL_REGS : (CLASS))
+   We avoid classes containing registers from multiple units due to
+   the limitation in ix86_secondary_memory_needed.  We limit these
+   classes to their "natural mode" single unit register class, depending
+   on the unit availability.
 
+   Please note that reg_class_subset_p is not commutative, so these
+   conditions mean "... if (CLASS) includes ALL registers from the
+   register set."  */
+
+#define LIMIT_RELOAD_CLASS(MODE, CLASS)
\
+  (((MODE) == QImode && !TARGET_64BIT  \
+&& reg_class_subset_p (Q_REGS, (CLASS))) ? Q_REGS  \
+   : (((MODE) == SImode || (MODE) == DImode)   \
+  && reg_class_subset_p (GENERAL_REGS, (CLASS))) ? GENERAL_REGS\
+   : (SSE_FLOAT_MODE_P (MODE) && TARGET_SSE_MATH   \
+  && reg_class_subset_p (SSE_REGS, (CLASS)

Re: PATCH: PR bootstrap/54209: [4.8 Regression] Failed to build gcc for Android/x86

2012-08-09 Thread H.J. Lu

On Thu, Aug 9, 2012 at 11:11 AM, Fu, Chao-Ying  wrote:
>> > Hi,
>> >
>> > Bionic C library doesn't provide link.h.  This patch
>> reverts revision
>> > 186788:
>> >
>> > http://gcc.gnu.org/ml/gcc-cvs/2012-04/msg00740.html
>> >
>> > OK to install?
>> >
>> > Thanks.
>> >
>> > H.J.
>> > ---
>> > 2012-08-09  H.J. Lu  
>> >
>> > PR bootstrap/54209
>> > * unwind-dw2-fde-dip.c (USE_PT_GNU_EH_FRAME): Don't define for
>> > Bionic C library.
>> >
>> > diff --git a/libgcc/unwind-dw2-fde-dip.c
>> b/libgcc/unwind-dw2-fde-dip.c
>> > index 92f8ab5..f57dc8c 100644
>> > --- a/libgcc/unwind-dw2-fde-dip.c
>> > +++ b/libgcc/unwind-dw2-fde-dip.c
>> > @@ -54,11 +54,6 @@
>> >  #endif
>> >
>> >  #if !defined(inhibit_libc) && defined(HAVE_LD_EH_FRAME_HDR) \
>> > -&& defined(__BIONIC__)
>> > -# define USE_PT_GNU_EH_FRAME
>> > -#endif
>> > -
>> > -#if !defined(inhibit_libc) && defined(HAVE_LD_EH_FRAME_HDR) \
>> >  && defined(__FreeBSD__) && __FreeBSD__ >= 7
>> >  # define ElfW __ElfN
>> >  # define USE_PT_GNU_EH_FRAME
>> >
>>
>>   How about this patch?  Just enable it for MIPS that
>> provides link.h in Android NDK.
>> Thanks a lot!
>>
>> Regards,
>> Chao-ying
>>
>> Index: unwind-dw2-fde-dip.c
>> ===
>> --- unwind-dw2-fde-dip.c(revision 190260)
>> +++ unwind-dw2-fde-dip.c(working copy)
>> @@ -55,6 +55,7 @@
>>
>>  #if !defined(inhibit_libc) && defined(HAVE_LD_EH_FRAME_HDR) \
>>  && defined(__BIONIC__)
>> +&& defined(__mips__)
>>  # define USE_PT_GNU_EH_FRAME
>>  #endif
>>
>
>   Sorry, I forgot \ in the previous patch.
> Ex:
> Index: unwind-dw2-fde-dip.c
> ===
> --- unwind-dw2-fde-dip.c(revision 190260)
> +++ unwind-dw2-fde-dip.c(working copy)
> @@ -54,7 +54,8 @@
>  #endif
>
>  #if !defined(inhibit_libc) && defined(HAVE_LD_EH_FRAME_HDR) \
> -&& defined(__BIONIC__)
> +&& defined(__BIONIC__) \
> +&& defined(__mips__)
>  # define USE_PT_GNU_EH_FRAME
>  #endif


Where does mips link.h come from?  I didn't see it in AOSP Bionic C library.

-- 
H.J.

RE: PATCH: PR bootstrap/54209: [4.8 Regression] Failed to build gcc for Android/x86

2012-08-09 Thread Fu, Chao-Ying

> 
> Where does mips link.h come from?  I didn't see it in AOSP 
> Bionic C library.
> 
> -- 
> H.J.
> 

 It's from development/ndk/platforms/android-9/arch-mips/include/link.h from 
AOSP checkout.

Regards,
Chao-ying

Re: [google/gcc-4_7] Fix problems with -fdebug-types-section and local types

2012-08-09 Thread Diego Novillo


On 12-08-08 19:17 , Cary Coutant wrote:


2012-08-07   Cary Coutant  

gcc/
* dwarf2out.c (clone_as_declaration): Copy DW_AT_abstract_origin
attribute.
(generate_skeleton_bottom_up): Remove DW_AT_object_pointer attribute
from original DIE.
(clone_tree_hash): Rename to ...
(clone_tree_partial): ... this; change callers.  Copy
DW_TAG_subprogram DIEs as declarations.

gcc/testsuite/
* testsuite/g++.dg/debug/dwarf2/dwarf4-nested.C: New test case.
* testsuite/g++.dg/debug/dwarf2/dwarf4-typedef.C: Add
-fdebug-types-section flag.


OK for google/gcc-4_7.  If the trunk review requires substantive 
changes, then you can just cherry-pick the subsequent patch later.



Diego.

RE: PATCH: PR bootstrap/54209: [4.8 Regression] Failed to build gcc for Android/x86

2012-08-09 Thread Joseph S. Myers

On Thu, 9 Aug 2012, Fu, Chao-Ying wrote:

>   How about this patch?  Just enable it for MIPS that provides link.h in 
> Android NDK.
> Thanks a lot!

Please don't put this sort of architecture conditional in an 
architecture-independent source file.  In this case it should be fine for 
libgcc's configure to try compiling a file that #includes <link.h> 
(obviously, make sure the configure test gets the right results both when 
it's present and when it's absent), and use the results of that configure 
test instead of defined(__mips__).  (In a bootstrap where libc headers 
aren't yet present, inhibit_libc should be defined anyway to disable those 
libgcc features depending on system headers from libc.)

-- 
Joseph S. Myers
jos...@codesourcery.com
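
A guard along the lines suggested above might look like the following
sketch.  HAVE_LINK_H_SKETCH is a hypothetical macro standing in for
whatever a libgcc configure test for link.h would define, and the
_SKETCH suffix on the result macro marks it as illustrative rather than
the real USE_PT_GNU_EH_FRAME definition.

```c
#include <assert.h>

/* HAVE_LINK_H_SKETCH is a made-up name for the result of a configure
   test that tries to compile a file including <link.h>.  It replaces the
   architecture conditional, so the source file stays target-independent.  */
#if !defined(inhibit_libc) && defined(HAVE_LD_EH_FRAME_HDR) \
    && defined(__BIONIC__) && defined(HAVE_LINK_H_SKETCH)
# define USE_PT_GNU_EH_FRAME_SKETCH 1
#else
# define USE_PT_GNU_EH_FRAME_SKETCH 0
#endif

/* Report whether the PT_GNU_EH_FRAME path would be enabled in this build.  */
static int
use_pt_gnu_eh_frame_sketch (void)
{
  return USE_PT_GNU_EH_FRAME_SKETCH;
}
```

Since the hypothetical macro is never defined here, the function reports 0
on any host; a real configure test would define its macro only when the
header compiles.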

[google/gcc-4_7] XFAIL libitm failures

2012-08-09 Thread Ollie Wild

As discussed, this patch XFAILs the libitm failures uncovered by
http://gcc.gnu.org/viewcvs?view=revision&revision=190233.

OK for google/gcc-4_7?

Ollie

2012-08-09  Ollie Wild  

* testsuite-management/x86_64-grtev3-linux-gnu.xfail: XFAIL libitm
failures.
commit 8d78568138de78f11935f92b3143149733ea0172
Author: Ollie Wild 
Date:   Thu Aug 9 14:38:51 2012 -0500

2012-08-09  Ollie Wild  

* testsuite-management/x86_64-grtev3-linux-gnu.xfail: XFAIL libitm
failures.

diff --git a/contrib/ChangeLog.google-4_7 b/contrib/ChangeLog.google-4_7
index c1664f9..fbfc0f5 100644
--- a/contrib/ChangeLog.google-4_7
+++ b/contrib/ChangeLog.google-4_7
@@ -1,3 +1,8 @@
+2012-08-09  Ollie Wild  
+
+   * testsuite-management/x86_64-grtev3-linux-gnu.xfail: XFAIL libitm
+   failuires.
+
 2012-08-08  Ollie Wild  
 
* testsuite-management/powerpc-grtev3-linux-gnu.xfail: xfail
diff --git a/contrib/testsuite-management/x86_64-grtev3-linux-gnu.xfail 
b/contrib/testsuite-management/x86_64-grtev3-linux-gnu.xfail
index 4fa47ec..d68b543 100644
--- a/contrib/testsuite-management/x86_64-grtev3-linux-gnu.xfail
+++ b/contrib/testsuite-management/x86_64-grtev3-linux-gnu.xfail
@@ -68,3 +68,34 @@ flaky | FAIL: libgomp.graphite/force-parallel-6.c execution 
test
 # that is resolved.
 UNRESOLVED: 23_containers/map/element_access/2.cc compilation failed to 
produce executable
 FAIL: 23_containers/map/element_access/2.cc (test for excess errors)
+
+# libitm failures caused by missing --sysroot.
+UNRESOLVED: libitm.c++/dropref.C compilation failed to produce executable
+FAIL: libitm.c++/dropref.C (test for excess errors)
+UNRESOLVED: libitm.c++/eh-1.C compilation failed to produce executable
+FAIL: libitm.c++/eh-1.C (test for excess errors)
+FAIL: libitm.c++/throwdown.C (test for excess errors)
+FAIL: libitm.c/cancel.c (test for excess errors)
+UNRESOLVED: libitm.c/cancel.c compilation failed to produce executable
+FAIL: libitm.c/clone-1.c (test for excess errors)
+UNRESOLVED: libitm.c/clone-1.c compilation failed to produce executable
+FAIL: libitm.c/dropref-2.c (test for excess errors)
+UNRESOLVED: libitm.c/dropref-2.c compilation failed to produce executable
+UNRESOLVED: libitm.c/dropref.c compilation failed to produce executable
+FAIL: libitm.c/dropref.c (test for excess errors)
+FAIL: libitm.c/memcpy-1.c (test for excess errors)
+UNRESOLVED: libitm.c/memcpy-1.c compilation failed to produce executable
+FAIL: libitm.c/memset-1.c (test for excess errors)
+UNRESOLVED: libitm.c/memset-1.c compilation failed to produce executable
+UNRESOLVED: libitm.c/notx.c compilation failed to produce executable
+FAIL: libitm.c/notx.c (test for excess errors)
+UNRESOLVED: libitm.c/reentrant.c compilation failed to produce executable
+FAIL: libitm.c/reentrant.c (test for excess errors)
+FAIL: libitm.c/simple-1.c (test for excess errors)
+UNRESOLVED: libitm.c/simple-1.c compilation failed to produce executable
+UNRESOLVED: libitm.c/simple-2.c compilation failed to produce executable
+FAIL: libitm.c/simple-2.c (test for excess errors)
+FAIL: libitm.c/stackundo.c (test for excess errors)
+UNRESOLVED: libitm.c/stackundo.c compilation failed to produce executable
+UNRESOLVED: libitm.c/txrelease.c compilation failed to produce executable
+FAIL: libitm.c/txrelease.c (test for excess errors)

Re: [google/gcc-4_7] XFAIL libitm failures

2012-08-09 Thread Diego Novillo


On 12-08-09 15:42 , Ollie Wild wrote:


* testsuite-management/x86_64-grtev3-linux-gnu.xfail: XFAIL libitm
failures.



OK.


Diego.

Re: [PATCH,i386] fma,fma4 and xop flags

2012-08-09 Thread Uros Bizjak

On Wed, Aug 8, 2012 at 1:31 PM,   wrote:

> Bdver2 cpu supports both fma and fma4 instructions.
> Previous to patch, option "-mno-xop" removes "-mfma4".
> Similarly, option "-mno-fma4" removes "-mxop".

It looks to me that there is some misunderstanding. AFAICS:

-mxop implies -mfma4, but the reverse is not true. Please see

#define OPTION_MASK_ISA_FMA4_SET \
  (OPTION_MASK_ISA_FMA4 | OPTION_MASK_ISA_SSE4A_SET \
   | OPTION_MASK_ISA_AVX_SET)
#define OPTION_MASK_ISA_XOP_SET \
  (OPTION_MASK_ISA_XOP | OPTION_MASK_ISA_FMA4_SET)

So, -mxop sets -mfma4, etc ..., but -mfma4 does NOT enable -mxop.

OTOH,

#define OPTION_MASK_ISA_FMA4_UNSET \
  (OPTION_MASK_ISA_FMA4 | OPTION_MASK_ISA_XOP_UNSET)
#define OPTION_MASK_ISA_XOP_UNSET OPTION_MASK_ISA_XOP

-mno-fma4 implies -mno-xop, but again the reverse is not true. Thus,
-mno-xop does NOT imply -mno-fma4.
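
The one-way implications can be checked with a small bit-mask model.  The
names and bit values below are invented for this sketch, and the SSE4A and
AVX bits that OPTION_MASK_ISA_FMA4_SET also pulls in are left out to keep
it short.

```c
#include <assert.h>

/* Invented bit values; only the relationship between them matters.  */
enum
{
  ISA_FMA4 = 1 << 0,
  ISA_XOP  = 1 << 1
};

/* Enabling a feature also enables what it depends on.  */
#define ISA_FMA4_SET   ISA_FMA4
#define ISA_XOP_SET    (ISA_XOP | ISA_FMA4_SET)

/* Disabling a feature also disables what depends on it.  */
#define ISA_XOP_UNSET  ISA_XOP
#define ISA_FMA4_UNSET (ISA_FMA4 | ISA_XOP_UNSET)

/* Model of -mFOO: turn on every bit in SET_MASK.  */
static unsigned
isa_enable (unsigned flags, unsigned set_mask)
{
  return flags | set_mask;
}

/* Model of -mno-FOO: turn off every bit in UNSET_MASK.  */
static unsigned
isa_disable (unsigned flags, unsigned unset_mask)
{
  return flags & ~unset_mask;
}
```

In this model, enabling XOP turns on FMA4 as well, disabling FMA4 turns
off XOP as well, and disabling XOP alone leaves FMA4 set.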

> So, the patch conditionally disables "-mfma" or "-mfma4".
> Enabling "-mxop" is done by also checking "-mfma".

Please note that conditional handling of ISA flags belongs in
ix86_option_override_internal. However, if someone sets -mfma4 together
with -mfma on the command line, we should NOT disable the selected ISA
behind the user's back, in the same way as we don't disable anything with
"-march=i386 -msse4". With -march=bdver2, we have already marked that only
fma is supported, and if the user selects "-march=bdver2 -mfma4" on the
command line, we shouldn't disable anything.

Uros.

Re: [PATCH, libjava] Use accessor functions to manipulate xmlOutputBuffer

2012-08-09 Thread Andrew Hughes



- Original Message -
> Andrew Hughes  writes:
> 
> > Don't worry about reverting it.  I'll add it to Classpath now, then
> > they'll be in sync when we do the next merge.
> 
> Thank you.
> 

Done: 
http://git.savannah.gnu.org/cgit/classpath.git/commit/?id=4d4db712cf4df4feb4d7b98bb1b5b448218500b3

> > In future, please post changes to files under the libjava/classpath
> > directory to
> > classp...@gnu.org and feel free to ping me directly if you don't
> > get a response in a reasonable timeframe.  It just makes my life a
> > bit
> > easier when it comes to doing the merges :-)
> 
> OK, I will do.  Sorry for the inconvenience.

No worries.  It's not immediately obvious for someone new to the libjava 
codebase.

> 
> --
>   Dodji
> 

-- 
Andrew :)

Free Java Software Engineer
Red Hat, Inc. (http://www.redhat.com)

PGP Key: 248BDC07 (https://keys.indymedia.org/)
Fingerprint = EC5A 1F5E C0AD 1D15 8F1F  8F91 3B96 A578 248B DC07

Re: [PATCH][7/6] Allow anonymous SSA names

2012-08-09 Thread Diego Novillo


On 12-08-09 09:20 , Richard Guenther wrote:


 if (interm_type)
   {
 /* Create a type conversion HALF_TYPE->INTERM_TYPE.  */
!   tmp = create_tmp_reg (interm_type, NULL);
!   new_oprnd = make_ssa_name (tmp, NULL);
 new_stmt = gimple_build_assign_with_ops (NOP_EXPR, new_oprnd,
  oprnd, NULL_TREE);
 oprnd = new_oprnd;
--- 1119,1125 
 if (interm_type)
   {
 /* Create a type conversion HALF_TYPE->INTERM_TYPE.  */
!   new_oprnd = make_ssa_name (interm_type, NULL);



Nice!  Any chance that you could go over tree-ssa.texi to refresh the 
internal docs?  (I don't recall what we have documented in there, tbh).



Diego.

Re: Value type of map need not be default copyable

2012-08-09 Thread François Dumont


On 08/09/2012 10:35 AM, Paolo Carlini wrote:

Hi,

On 08/09/2012 09:14 AM, Marc Glisse wrote:

On Wed, 8 Aug 2012, François Dumont wrote:


On 08/08/2012 03:39 PM, Paolo Carlini wrote:

On 08/08/2012 03:15 PM, François Dumont wrote:
I have also introduced a special std::pair constructor for 
container usage so that we do not have to include the whole tuple 
stuff just for associative container implementations.

To be clear: sorry, this is not an option.

Paolo.

   Then I can only imagine the attached patch, which requires including 
tuple when including unordered_map or unordered_set. The 
std::pair(piecewise_construct_t, tuple<>, tuple<>) constructor is the only 
one that allows building a pair using the default constructor 
for the second member.


I agree that the extra constructor would be convenient (I probably 
would have gone with pair(T&&,__default_construct_t), the symmetric 
version, and enough extra constructors to resolve all ambiguities). 
Maybe LWG would consider doing something.
When it does, and the corresponding PR is *ready*, we'll 
reconsider the issue. After all the *months and months and months* 
spent by the LWG adding and removing members from pair and tweaking 
everything wrt the containers and issues *still* popping up (like that 
with the defaulted copy constructor vs insert constraining), and with 
the support for scoped allocators still missing from our 
implementation, we are not adding members to std::pair so easily. 
Sorry, but personally I'm not available now to further discuss this 
specific point.


I was still hoping that for something as simple as mapped_type() we 
wouldn't need the full <tuple> machinery, and I encourage everybody to 
have another look (while making sure anything we figure out adapts 
smoothly and consistently to std::map), then in a few days we'll take a 
final decision. We'll still have chances to further improve the code 
in time for 4.8.0.



+ __p = __h->_M_allocate_node(std::piecewise_construct,
+ std::make_tuple(__k),
+ std::make_tuple());

Don't you want cref(__k)? It might save a move at some point.
Are we already doing that elsewhere? I think we should aim for 
something simple first, then carefully evaluate if the additional 
complexity is worth the cost and in case deploy the superior solution 
consistently everywhere it may apply.


Thanks!
Paolo.



Here is an updated version considering the good catch from Marc. 
However, I prefer to use an explicit instantiation of tuple rather than 
using cref, which would have implied including <functional> in addition 
to <tuple>. I have also updated the test case to use a type without copy 
and move constructors.
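
For readers following along, here is a minimal sketch of the technique under discussion: building the map's value_type in place with std::piecewise_construct, so the mapped type only needs a default constructor. The names Pinned and get_or_create are hypothetical and are not part of the patch; the sketch uses the public emplace interface rather than the internal _M_allocate_node call.

```cpp
#include <cassert>
#include <tuple>
#include <unordered_map>
#include <utility>

// Hypothetical mapped type: default constructible, but neither copyable
// nor movable (the deleted copy operations suppress the implicit moves).
struct Pinned {
  int value = 0;
  Pinned() = default;
  Pinned(const Pinned&) = delete;
  Pinned& operator=(const Pinned&) = delete;
};

// Return the element stored under `key`, creating it if it is missing.
// The pair is constructed in place: the key from a one-element tuple,
// the mapped value from an empty tuple, i.e. by its default constructor.
inline Pinned& get_or_create(std::unordered_map<int, Pinned>& m, int key) {
  auto result = m.emplace(std::piecewise_construct,
                          std::forward_as_tuple(key),
                          std::tuple<>());
  return result.first->second;
}
```

The older std::make_pair(__k, mapped_type()) form would not compile for such a type, because it has to copy or move the temporary mapped_type into the pair.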


2012-08-09  François Dumont  
Ollie Wild  

* include/bits/hashtable.h (_Hashtable<>::_M_insert_bucket):
Replace by ...
(_Hashtable<>::_M_insert_node): ... this, new.
(_Hashtable<>::_M_insert(_Args&&, true_type)): Use latter.
* include/bits/hashtable_policy.h (_Map_base<>::operator[]): Use
latter, emplace the value_type rather than insert.
* include/std/unordered_map: Include tuple.
* include/std/unordered_set: Likewise.
* testsuite/util/testsuite_counter_type.h: New.
* testsuite/23_containers/unordered_map/operators/2.cc: New.


François



Index: include/std/unordered_map
===
--- include/std/unordered_map	(revision 190209)
+++ include/std/unordered_map	(working copy)
@@ -38,6 +38,7 @@
 #include 
 #include 
 #include 
+#include <tuple>
 #include 
 #include 
 #include  // equal_to, _Identity, _Select1st
Index: include/std/unordered_set
===
--- include/std/unordered_set	(revision 190209)
+++ include/std/unordered_set	(working copy)
@@ -38,6 +38,7 @@
 #include 
 #include 
 #include 
+#include <tuple>
 #include 
 #include 
 #include  // equal_to, _Identity, _Select1st
Index: include/bits/hashtable_policy.h
===
--- include/bits/hashtable_policy.h	(revision 190209)
+++ include/bits/hashtable_policy.h	(working copy)
@@ -577,8 +577,14 @@
   __node_type* __p = __h->_M_find_node(__n, __k, __code);
 
   if (!__p)
-	return __h->_M_insert_bucket(std::make_pair(__k, mapped_type()),
- __n, __code)->second;
+	{
+	  __p = __h->_M_allocate_node(std::piecewise_construct,
+  std::tuple(__k),
+  std::make_tuple());
+	  __h->_M_store_code(__p, __code);
+	  return __h->_M_insert_node(__n, __code, __p)->second;
+	}
+
   return (__p->_M_v).second;
 }
 
@@ -598,9 +604,14 @@
   __node_type* __p = __h->_M_find_node(__n, __k, __code);
 
   if (!__p)
-	return __h->_M_insert_bucket(std::make_pair(std::move(__k),
-		mapped_type()),
- __n, __code)->second;
+	{
+	  __p = __h->_M_allocate_node(std::piecewise_construct,
+std::forward_as_tuple(forward(__k)),
+std::make_tuple())

[cxx-conversion] Avoid overloaded double_int 'constructor'. (issue6441127)

2012-08-09 Thread Lawrence Crowl

Convert overloaded double_int::make to non-overloaded from_signed and
from_unsigned.  This change is intended to preserve the exact semantics of
the existing expressions using shwi_to_double_int and uhwi_to_double_int.

Tested on x86_64.
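
As a rough illustration of the two named constructors, here is a toy two-word integer. It is a sketch only: toy_double_int is a hypothetical stand-in, with std::int64_t playing the role of HOST_WIDE_INT, and it follows the comments in double-int.h (high word filled with the sign bit for signed input, with zeros for unsigned input).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for double_int; not the GCC type.
struct toy_double_int {
  std::uint64_t low;
  std::int64_t high;

  // Signed input: the high word is filled with the sign bit.
  static toy_double_int from_signed(std::int64_t cst) {
    toy_double_int r;
    r.low = static_cast<std::uint64_t>(cst);
    r.high = cst < 0 ? -1 : 0;
    return r;
  }

  // Unsigned input: the high word is filled with zeros.
  static toy_double_int from_unsigned(std::uint64_t cst) {
    toy_double_int r;
    r.low = cst;
    r.high = 0;
    return r;
  }
};
```

Both from_signed (-1) and from_unsigned with all low bits set produce the same low word but different high words, which is why the choice is better spelled out at the call site than left to overload resolution on the argument type.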


Index: gcc/ChangeLog

2012-08-09  Lawrence Crowl 

* double-int.h (double_int::make): Remove.
(double_int::from_signed): New.
(double_int::from_unsigned): New.
(shwi_to_double_int): Use double_int::from_signed instead of
double_int::make.
(double_int_minus_one): Likewise.
(double_int_zero): Likewise.
(double_int_one): Likewise.
(double_int_two): Likewise.
(double_int_ten): Likewise.
(uhwi_to_double_int): Use double_int::from_unsigned instead of
double_int::make.


Index: gcc/double-int.h
===
--- gcc/double-int.h(revision 190239)
+++ gcc/double-int.h(working copy)
@@ -60,10 +60,8 @@ public:
  Second, the GCC conding conventions prefer explicit conversion,
  and explicit conversion operators are not available until C++11.  */
 
-  static double_int make (unsigned HOST_WIDE_INT cst);
-  static double_int make (HOST_WIDE_INT cst);
-  static double_int make (unsigned int cst);
-  static double_int make (int cst);
+  static double_int from_unsigned (unsigned HOST_WIDE_INT cst);
+  static double_int from_signed (HOST_WIDE_INT cst);
 
   /* No copy assignment operator or destructor to keep the type a POD.  */
 
@@ -188,7 +186,7 @@ public:
HOST_WIDE_INT are filled with the sign bit.  */
 
 inline
-double_int double_int::make (HOST_WIDE_INT cst)
+double_int double_int::from_signed (HOST_WIDE_INT cst)
 {
   double_int r;
   r.low = (unsigned HOST_WIDE_INT) cst;
@@ -196,17 +194,11 @@ double_int double_int::make (HOST_WIDE_I
   return r;
 }
 
-inline
-double_int double_int::make (int cst)
-{
-  return double_int::make (static_cast <HOST_WIDE_INT> (cst));
-}
-
 /* FIXME(crowl): Remove after converting callers.  */
 static inline double_int
 shwi_to_double_int (HOST_WIDE_INT cst)
 {
-  return double_int::make (cst);
+  return double_int::from_signed (cst);
 }
 
 /* Some useful constants.  */
@@ -214,17 +206,17 @@ shwi_to_double_int (HOST_WIDE_INT cst)
The problem is that a named constant would not be as optimizable,
while the functional syntax is more verbose.  */
 
-#define double_int_minus_one (double_int::make (-1))
-#define double_int_zero (double_int::make (0))
-#define double_int_one (double_int::make (1))
-#define double_int_two (double_int::make (2))
-#define double_int_ten (double_int::make (10))
+#define double_int_minus_one (double_int::from_signed (-1))
+#define double_int_zero (double_int::from_signed (0))
+#define double_int_one (double_int::from_signed (1))
+#define double_int_two (double_int::from_signed (2))
+#define double_int_ten (double_int::from_signed (10))
 
 /* Constructs double_int from unsigned integer CST.  The bits over the
precision of HOST_WIDE_INT are filled with zeros.  */
 
 inline
-double_int double_int::make (unsigned HOST_WIDE_INT cst)
+double_int double_int::from_unsigned (unsigned HOST_WIDE_INT cst)
 {
   double_int r;
   r.low = cst;
@@ -232,17 +224,11 @@ double_int double_int::make (unsigned HO
   return r;
 }
 
-inline
-double_int double_int::make (unsigned int cst)
-{
-  return double_int::make (static_cast <unsigned HOST_WIDE_INT> (cst));
-}
-
 /* FIXME(crowl): Remove after converting callers.  */
 static inline double_int
 uhwi_to_double_int (unsigned HOST_WIDE_INT cst)
 {
-  return double_int::make (cst);
+  return double_int::from_unsigned (cst);
 }
 
 inline double_int &

--
This patch is available for review at http://codereview.appspot.com/6441127

Re: Value type of map need not be default copyable

2012-08-09 Thread Marc Glisse


On Thu, 9 Aug 2012, François Dumont wrote:

   Here is an updated version considering the good catch from Marc. However, 
I prefer to use an explicit instantiation of tuple rather than using cref, 
which would have implied including <functional> in addition to <tuple>.


I wouldn't have used make_tuple at all (tuple<>() is shorter than 
make_tuple()), but I wanted to stick to your style as much as possible ;-)


I don't know if std:: is needed, but it looks strange to have it only on 
some functions:

std::forward_as_tuple(forward(__k)),

Looking at this line again, you seem to be using std::forward on something 
that is not a deduced parameter type. I guess it is equivalent to 
std::move in this case, it just confuses me a bit.
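
A small sketch of that equivalence, for what it is worth. Key is a hypothetical stand-in for key_type, and the via_* helpers are invented for the demonstration: with a fixed, non-reference template argument, std::forward yields an rvalue, just as std::move does.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Hypothetical stand-in for key_type.
struct Key { int id; };

// Overload pair that reports which value category it received.
inline bool is_rvalue(Key&&) { return true; }
inline bool is_rvalue(const Key&) { return false; }

// `k` has a fixed type (no deduction), as in operator[](key_type&&).
inline bool via_forward(Key&& k) { return is_rvalue(std::forward<Key>(k)); }
inline bool via_move(Key&& k) { return is_rvalue(std::move(k)); }
// A named rvalue reference used directly is an lvalue.
inline bool via_plain(Key&& k) { return is_rvalue(k); }

// forward<Key> applied to an lvalue returns Key&&, the same type as move.
static_assert(
    std::is_same<decltype(std::forward<Key>(std::declval<Key&>())),
                 Key&&>::value,
    "forward<Key> on an lvalue yields Key&&");
```

So the two spellings behave the same here; std::move simply states the intent more directly when nothing is deduced.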



   * include/std/unordered_map: Include tuple.
   * include/std/unordered_set: Likewise.


Is it a libstdc++ policy to put all includes in the topmost headers, as 
opposed to the header where they are used? I never paid much attention to 
it, I was just surprised because it doesn't match what I do in my code. 
But since hashtable*.h currently include nothing, it is consistent. Does 
that help with compile-time?


(ok, it is a bit obvious that I pretended to make a review just so I had 
an excuse to ask a question at the end ;-)


--
Marc Glisse

[patch] Use SBITMAP_SIZE in a few places

2012-08-09 Thread Steven Bosscher

Hello,

SBITMAP_SIZE should be used to get the current size of an sbitmap.

Bootstrapped&tested on powerpc64-unknown-linux-gnu. Will commit as obvious.

Ciao!
Steven


sbitmap_size.diff
Description: Binary data

[google/main, google/gcc-4_7] Fix segfault in linemap lookup

2012-08-09 Thread Cary Coutant

This patch is for the google/main and google/gcc-4_7 branches.

New code in GCC 4.7 is calling linemap_lookup with a location_t that
may still represent a location-with-discriminator.  Before using a
location_t value to look up the line number, it needs to be mapped to
a real location_t value.

Tested with make check-gcc and validate-failures.py.

OK for google/main and google/gcc-4_7?


2012-08-09   Cary Coutant  

gcc/
* tree-diagnostic.c (maybe_unwind_expanded_macro_loc): Check for
discriminator.
* diagnostic.c (diagnostic_report_current_module): Likewise.


Index: gcc/tree-diagnostic.c
===
--- gcc/tree-diagnostic.c   (revision 190262)
+++ gcc/tree-diagnostic.c   (working copy)
@@ -23,6 +23,7 @@ along with GCC; see the file COPYING3.  
 #include "config.h"
 #include "system.h"
 #include "coretypes.h"
+#include "input.h"
 #include "tree.h"
 #include "diagnostic.h"
 #include "tree-diagnostic.h"
@@ -115,6 +116,8 @@ maybe_unwind_expanded_macro_loc (diagnos
   unsigned ix;
   loc_map_pair loc, *iter;
 
+  if (has_discriminator (where))
+where = map_discriminator_location (where);
   map = linemap_lookup (line_table, where);
   if (!linemap_macro_expansion_map_p (map))
 return;
Index: gcc/diagnostic.c
===
--- gcc/diagnostic.c(revision 190262)
+++ gcc/diagnostic.c(working copy)
@@ -270,6 +270,9 @@ diagnostic_report_current_module (diagno
   if (where <= BUILTINS_LOCATION)
 return;
 
+  if (has_discriminator (where))
+where = map_discriminator_location (where);
+
   linemap_resolve_location (line_table, where,
LRK_MACRO_DEFINITION_LOCATION,
&map);

[patch] Fix a couple of VEC_reserve uses, speed up update_ssa a bit

2012-08-09 Thread Steven Bosscher

Hello,

VEC_reserve allocates an *extra* number of slots. There is
unfortunately no VEC_resize op (one of the first things to add after
the merge of the cxx branch, I suppose...), so to "grow" a VEC without
increasing the used slots count (the VEC_length) it's necessary to
compute the number of extra slots needed and reserve only that number
of slots.

So something like:

  VEC_reserve (ssa_name_info_p, heap, info_for_ssa_name, num_ssa_names);

on an existing VEC with non-null length is wrong. In the worst case,
the VEC_length is already num_ssa_names and the VEC ends up twice as
large as necessary.
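
To make the worst case concrete, here is a toy model of the "reserve extra slots" semantics. It is a sketch under stated assumptions: toy_vec, toy_reserve and toy_reserve_total are hypothetical names, only the length and the allocated slot count are tracked, and growth is exact, with none of the headroom a real VEC implementation may add.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical model of a VEC: just the used length and allocated slots.
struct toy_vec {
  std::size_t length = 0;
  std::size_t allocated = 0;
};

// Models the semantics described above: guarantee room for `extra`
// elements beyond the current length. The length itself is unchanged.
inline void toy_reserve(toy_vec& v, std::size_t extra) {
  if (v.allocated < v.length + extra)
    v.allocated = v.length + extra;
}

// The corrected pattern: to end up with `total` slots, reserve only
// the difference between the target and the current length.
inline void toy_reserve_total(toy_vec& v, std::size_t total) {
  assert(total >= v.length);
  toy_reserve(v, total - v.length);
}
```

Calling toy_reserve (v, n) on a vector whose length is already n leaves 2n slots allocated, while toy_reserve_total (v, n) leaves exactly n.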

Another thing I noticed is that in update_ssa() we're
sbitmap_zero'ing new_ssa_names and old_ssa_names even after we've
already done so in init_update_ssa. This might seem like a
micro-optimization, but it cuts the time spent in the timevar "tree
SSA incremental" in half for the test case of PR54146...

Bootstrapped&tested on powerpc64-unknown-linux-gnu. OK for trunk?

Ciao!
Steven


vec_reserve.diff
Description: Binary data

Re: [PATCH] Set correct source location for deallocator calls

2012-08-09 Thread Jason Merrill


On 08/08/2012 12:32 PM, Richard Henderson wrote:

On 08/08/2012 09:27 AM, Dehao Chen wrote:

Then we should probably assign UNKNOWN_LOCATION for these destructor
calls, what do you guys think?


I think it's certainly plausible.  I can't think what other problems
such a change would cause.  Jason?


cxx_maybe_build_cleanup is already trying to do that.  If it's missing 
some cases then yes, let's fix them too.


Jason

Re: [patch] Fix a couple of VEC_reserve uses, speed up update_ssa a bit

2012-08-09 Thread Richard Henderson

On 08/09/2012 03:06 PM, Steven Bosscher wrote:
> +  unsigned old_len = name_to_id ? VEC_length (unsigned, name_to_id) : 0;
> +  VEC_reserve (unsigned, heap, name_to_id, num_ssa_names - old_len);

VEC_length already handles NULL input.


r~

Re: PATCH: PR bootstrap/54209: [4.8 Regression] Failed to build gcc for Android/x86

2012-08-09 Thread Ian Lance Taylor

On Thu, Aug 9, 2012 at 9:39 AM, H.J. Lu  wrote:
>
> Bionic C library doesn't provide link.h.

Does Bionic provide dl_iterate_phdr?  If it does, I'll just note in
passing that it would be straightforward to simply incorporate the
required types and constants in unwind-dw2-fde-dip.c directly, and
avoid the #include.  If it doesn't, then of course nothing will make
this code work correctly.

Ian

Re: [patch] Fix a couple of VEC_reserve uses, speed up update_ssa a bit

2012-08-09 Thread Steven Bosscher

On Fri, Aug 10, 2012 at 12:15 AM, Richard Henderson  wrote:
> On 08/09/2012 03:06 PM, Steven Bosscher wrote:
>> +  unsigned old_len = name_to_id ? VEC_length (unsigned, name_to_id) : 0;
>> +  VEC_reserve (unsigned, heap, name_to_id, num_ssa_names - old_len);
>
> VEC_length already handles NULL input.

I didn't know that. Consider that hunk changed to this:

Index: tree-ssa-pre.c
===
--- tree-ssa-pre.c  (revision 190267)
+++ tree-ssa-pre.c  (working copy)
@@ -249,7 +249,8 @@ alloc_expression_id (pre_expr expr)
   /* VEC_safe_grow_cleared allocates no headroom.  Avoid frequent
 re-allocations by using VEC_reserve upfront.  There is no
 VEC_quick_grow_cleared unfortunately.  */
-  VEC_reserve (unsigned, heap, name_to_id, num_ssa_names);
+  unsigned old_len = VEC_length (unsigned, name_to_id);
+  VEC_reserve (unsigned, heap, name_to_id, num_ssa_names - old_len);
   VEC_safe_grow_cleared (unsigned, heap, name_to_id, num_ssa_names);
   gcc_assert (VEC_index (unsigned, name_to_id, version) == 0);
   VEC_replace (unsigned, name_to_id, version, expr->id);

Re: [SH] PR 54089 - Reinstate T_REG clobber for left shifts

2012-08-09 Thread Kaz Kojima

Oleg Endo  wrote:
> Removing the T_REG clobber from the left shift patterns entirely wasn't
> such a good idea.  Especially if dynamic shifts are not available
> (anything < SH3) incorrect code may be generated.
> The attached patch adds a T_REG clobbering version of the left shift
> insn "ashlsi3_n".  While at it, I consolidated the description of the
> constant shift sequences, which hopefully makes them easier to read and
> understand.
> 
> Tested on rev 190151 with
> make -k check RUNTESTFLAGS="--target_board=sh-sim
> \{-m2/-ml,-m2/-mb,-m2a/-mb,-m4/-ml,-m4/-mb,-m4a/-ml,-m4a/-mb}"
> 
> and no new failures.
> OK?

OK.

Regards,
kaz

[v3] fix references to C++11 standard

2012-08-09 Thread Jonathan Wakely

This just updates some comments to refer to the section numbers in the
final C++11 standard.

* acinclude.m4: Update references to final C++11 standard.
* include/bits/shared_ptr.h: Likewise.
* include/bits/shared_ptr_base.h: Likewise.
* include/bits/unique_ptr.h: Likewise.
* include/std/chrono: Likewise.
* include/std/thread: Likewise.

Tested x86_64-linux, committed to trunk.
commit 0d6bc17d16d85865ed4b6deffd455d3e1e12f430
Author: Jonathan Wakely 
Date:   Thu Aug 9 23:21:27 2012 +0100

* acinclude.m4: Update references to final C++11 standard.
* include/bits/shared_ptr.h: Likewise.
* include/bits/shared_ptr_base.h: Likewise.
* include/bits/unique_ptr.h: Likewise.
* include/std/chrono: Likewise.
* include/std/thread: Likewise.

diff --git a/libstdc++-v3/acinclude.m4 b/libstdc++-v3/acinclude.m4
index 6632725..1179407 100644
--- a/libstdc++-v3/acinclude.m4
+++ b/libstdc++-v3/acinclude.m4
@@ -1115,16 +1115,16 @@ AC_DEFUN([GLIBCXX_ENABLE_C99], [
 
 dnl
 dnl Check for clock_gettime, nanosleep and sched_yield, used in the
-dnl implementation of 20.8.5 [time.clock], and 30.2.2 [thread.thread.this]
-dnl in the current C++0x working draft.
+dnl implementation of 20.11.7 [time.clock], and 30.3.2 [thread.thread.this]
+dnl in the C++11 standard.
 dnl
 dnl --enable-libstdcxx-time
 dnl --enable-libstdcxx-time=yes
 dnlchecks for the availability of monotonic and realtime clocks,
-dnlnanosleep and sched_yield in libc and libposix4 and, in case, links
-dnl   the latter
+dnlnanosleep and sched_yield in libc and libposix4 and, if needed,
+dnllinks in the latter.
 dnl --enable-libstdcxx-time=rt
-dnlalso searches (and, in case, links) librt.  Note that this is
+dnlalso searches (and, if needed, links) librt.  Note that this is
 dnlnot always desirable because, in glibc, for example, in turn it
 dnltriggers the linking of libpthread too, which activates locking,
 dnla large overhead for single-thread programs.
@@ -1256,8 +1256,8 @@ AC_DEFUN([GLIBCXX_ENABLE_LIBSTDCXX_TIME], [
 ])
 
 dnl
-dnl Check for gettimeofday, used in the implementation of 20.8.5
-dnl [time.clock] in the current C++0x working draft.
+dnl Check for gettimeofday, used in the implementation of 20.11.7
+dnl [time.clock] in the C++11 standard.
 dnl
 AC_DEFUN([GLIBCXX_CHECK_GETTIMEOFDAY], [
 
diff --git a/libstdc++-v3/include/bits/shared_ptr.h b/libstdc++-v3/include/bits/shared_ptr.h
index e1c1eb9..7843365 100644
--- a/libstdc++-v3/include/bits/shared_ptr.h
+++ b/libstdc++-v3/include/bits/shared_ptr.h
@@ -321,7 +321,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
allocate_shared(const _Alloc& __a, _Args&&... __args);
 };
 
-  // 20.8.13.2.7 shared_ptr comparisons
+  // 20.7.2.2.7 shared_ptr comparisons
   template
 inline bool
 operator==(const shared_ptr<_Tp1>& __a,
@@ -425,13 +425,13 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 struct less> : public _Sp_less>
 { };
 
-  // 20.8.13.2.9 shared_ptr specialized algorithms.
+  // 20.7.2.2.8 shared_ptr specialized algorithms.
   template
 inline void
 swap(shared_ptr<_Tp>& __a, shared_ptr<_Tp>& __b) noexcept
 { __a.swap(__b); }
 
-  // 20.8.13.2.10 shared_ptr casts.
+  // 20.7.2.2.9 shared_ptr casts.
   template
 inline shared_ptr<_Tp>
 static_pointer_cast(const shared_ptr<_Tp1>& __r) noexcept
@@ -511,7 +511,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   }
 };
 
-  // 20.8.13.3.7 weak_ptr specialized algorithms.
+  // 20.7.2.3.6 weak_ptr specialized algorithms.
   template
 inline void
 swap(weak_ptr<_Tp>& __a, weak_ptr<_Tp>& __b) noexcept
diff --git a/libstdc++-v3/include/bits/shared_ptr_base.h b/libstdc++-v3/include/bits/shared_ptr_base.h
index 1ccd5ef..07ac000 100644
--- a/libstdc++-v3/include/bits/shared_ptr_base.h
+++ b/libstdc++-v3/include/bits/shared_ptr_base.h
@@ -1056,7 +1056,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 };
 
 
-  // 20.8.13.2.7 shared_ptr comparisons
+  // 20.7.2.2.7 shared_ptr comparisons
   template
 inline bool
 operator==(const __shared_ptr<_Tp1, _Lp>& __a,
@@ -1348,7 +1348,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   __weak_count<_Lp>  _M_refcount;// Reference counter.
 };
 
-  // 20.8.13.3.7 weak_ptr specialized algorithms.
+  // 20.7.2.3.6 weak_ptr specialized algorithms.
   template
 inline void
 swap(__weak_ptr<_Tp, _Lp>& __a, __weak_ptr<_Tp, _Lp>& __b) noexcept
diff --git a/libstdc++-v3/include/bits/unique_ptr.h b/libstdc++-v3/include/bits/unique_ptr.h
index 9b736d4..242d01e 100644
--- a/libstdc++-v3/include/bits/unique_ptr.h
+++ b/libstdc++-v3/include/bits/unique_ptr.h
@@ -87,7 +87,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   template void operator()(_Up*) const = delete;
 };
 
-  /// 20.7.12.2 unique_ptr for single objects.
+  /// 20.7.1.2 unique_ptr for single objects.
   template  >
 class unique_ptr
 {
@@ -260,7 +260,7 @@ _GLIBCXX_

Re: PATCH: PR bootstrap/54209: [4.8 Regression] Failed to build gcc for Android/x86

2012-08-09 Thread H.J. Lu

On Thu, Aug 9, 2012 at 3:17 PM, Ian Lance Taylor  wrote:
> On Thu, Aug 9, 2012 at 9:39 AM, H.J. Lu  wrote:
>>
>> Bionic C library doesn't provide link.h.
>
> Does Bionic provide dl_iterate_phdr?  If it does, I'll just note in
> passing that it would be straightforward to simply incorporate the
> required types and constants in unwind-dw2-fde-dip.c directly, and
> avoid the #include.  If it doesn't, then of course nothing will make
> this code work correctly.

That is a good idea.  Pavel, can you look into it?

Thanks.

-- 
H.J.

Re: PATCH: PR bootstrap/54209: [4.8 Regression] Failed to build gcc for Android/x86

2012-08-09 Thread Ian Lance Taylor

On Thu, Aug 9, 2012 at 4:01 PM, H.J. Lu  wrote:
> On Thu, Aug 9, 2012 at 3:17 PM, Ian Lance Taylor  wrote:
>> On Thu, Aug 9, 2012 at 9:39 AM, H.J. Lu  wrote:
>>>
>>> Bionic C library doesn't provide link.h.
>>
>> Does Bionic provide dl_iterate_phdr?  If it does, I'll just note in
>> passing that it would be straightforward to simply incorporate the
>> required types and constants in unwind-dw2-fde-dip.c directly, and
>> avoid the #include.  If it doesn't, then of course nothing will make
>> this code work correctly.
>
> That is a good idea.  Pavel, can you look into it?

You may find libiberty/simple-object-elf.c to be a useful guide.

Ian

Re: Value type of map need not be default copyable

2012-08-09 Thread Paolo Carlini


On 08/09/2012 11:22 PM, Marc Glisse wrote:
I don't know if std:: is needed, but it looks strange to have it only 
on some functions:

std::forward_as_tuple(forward(__k)),

Looking at this line again, you seem to be using std::forward on 
something that is not a deduced parameter type. I guess it is 
equivalent to std::move in this case, it just confuses me a bit.

Wanted to point out that yesterday. Please double check std::move.

I realize now that nobody is interested in std::cref, good ;)

Thanks!
Paolo.

Re: [cxx-conversion] Make double_int a class with methods and operators. (issue6443093)

2012-08-09 Thread Michael Matz

Hi,

On Thu, 9 Aug 2012, Mike Stump wrote:

> On Aug 9, 2012, at 8:19 AM, Michael Matz wrote:
> > Hmm.  And maintaining a cache is faster than 
> > passing/returning/manipulating two registers?
> 
> For the most part, we merely mirror existing code, check out 
> lookup_const_double and immed_double_const.

No, I won't without patches on this list.  You have kept bragging 
about wide_int for the last two weeks, without offering anything 
concrete about it whatsoever.  You'll understand that I (or anybody else) 
can't usefully discuss with you any merits or demerits of the 
implementation you chose.  (can I btw. complain about the retainment of 
underscores?  If it's a base data type, then why not wideint?  Make that a 
testament for the "quality" of feedback you'll get with the information 
given)

I mean, preparing the audience for an upcoming _suggested_ change in data 
structure of course is fine.  But arguing as if the change had already 
happened, and what's more concerning, as if the change was even already 
suggested and agreed upon even though that's not the case, is just bad 
style.

I would suggest to stay conservative about whatever you have (except if 
it's momentarily materializing), and _especially don't argue against or 
for or not against or for whatever improvement is suggested on the grounds 
that you have a better, as of yet secret but surely taking-over-the-world 
very-soon-now implementation of datastructure X_.  Nobody has seen it yet, 
so you can't expect to get any feedback on it.  Certainly that's the thing 
you need to get it into the code base.

> If the existing code is wrong, love to have someone fix it.  :-)  Also, 
> bear in mind, on a port with with OImode math for example, on a 32-bit 
> host, it would be 8 registers...

Nice try.  But what problem do _you_ want to solve?  For instance why 
should a port with OImode for example be interesting to the FSF?  I hope 
you recognize this as a half-rhetorical question, but still, how exactly 
will wide_int help for the goal (which remains to be shown as useful), how 
is it implemented?, why isn't it worse than crap on sensible (i.e. 64bit) 
hosts, and why should everybody not interested in such target pay the 
price, or why isn't there a price to pay for non-OI-targets?

I'm actually more interested in comments on the first part, but still, 
comments on OI appreciated.

Ciao,
Michael.

[rl78] add some checks

2012-08-09 Thread DJ Delorie


RTL checking pointed out a couple of cases where rl78.c was extracting
info from rtx without checking the rtx type first.  Applied.
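
The pattern behind both fixes can be sketched with a toy tagged node. Everything here is hypothetical (toy_rtx, toy_regno and skip_dest are not GCC names); the point is only that the code tag is tested first, so the checked accessor is reached solely for nodes of the right kind.

```cpp
#include <cassert>

// Hypothetical node kinds and node type; not GCC's rtx.
enum toy_code { TOY_REG, TOY_MEM, TOY_SET };

struct toy_rtx {
  toy_code code;
  unsigned regno;  // meaningful only when code == TOY_REG
};

// Checked accessor, in the spirit of REGNO under RTL checking:
// it aborts if applied to a node that is not a register.
inline unsigned toy_regno(const toy_rtx& x) {
  assert(x.code == TOY_REG);
  return x.regno;
}

// Same shape as the fixed condition: thanks to short-circuit
// evaluation, toy_regno runs only when the node is a register.
inline bool skip_dest(const toy_rtx& dest) {
  return dest.code != TOY_REG || toy_regno(dest) > 23;
}
```

Without the leading code test, a non-register destination would trip the accessor's check, which is what RTL checking reported.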

2012-08-09  DJ Delorie  

* config/rl78/rl78.c (rl78_alloc_physical_registers): Check for
SET before extracting SET_SRC.
(rl78_remove_unused_sets): Check for REG before extracting REGNO.

Index: config/rl78/rl78.c
===
--- config/rl78/rl78.c  (revision 190277)
+++ config/rl78/rl78.c  (working copy)
@@ -2217,7 +2217,8 @@
  && GET_CODE (PATTERN (insn)) != CALL)
  continue;
 
-  if (GET_CODE (SET_SRC (PATTERN (insn))) == ASM_OPERANDS)
+  if (GET_CODE (PATTERN (insn)) == SET
+ && GET_CODE (SET_SRC (PATTERN (insn))) == ASM_OPERANDS)
continue;
 
   valloc_method = get_attr_valloc (insn);
@@ -2644,7 +2645,7 @@
 
   dest = SET_DEST (insn);
 
-  if (REGNO (dest) > 23)
+  if (GET_CODE (dest) != REG || REGNO (dest) > 23)
continue;
 
   if (find_regno_note (insn, REG_UNUSED, REGNO (dest)))

[PATCH] Fix PR54211

2012-08-09 Thread William J. Schmidt

Fix a thinko in strength reduction.  I was checking the type of the
wrong operand to determine whether address arithmetic should be used in
replacing expressions.  This produced a spurious POINTER_PLUS_EXPR when
an address was converted to an unsigned long and back again.

Bootstrapped and tested on powerpc64-unknown-linux-gnu with no new
regressions.  Ok for trunk?

Thanks,
Bill


gcc:

2012-08-09  Bill Schmidt  

PR middle-end/54211
* gimple-ssa-strength-reduction.c (analyze_candidates_and_replace):
Use cand_type to determine whether pointer arithmetic will be generated.

gcc/testsuite:

2012-08-09  Bill Schmidt  

PR middle-end/54211
* gcc.dg/tree-ssa/pr54211.c: New test.


Index: gcc/testsuite/gcc.dg/tree-ssa/pr54211.c
===
--- gcc/testsuite/gcc.dg/tree-ssa/pr54211.c (revision 0)
+++ gcc/testsuite/gcc.dg/tree-ssa/pr54211.c (revision 0)
@@ -0,0 +1,28 @@
+/* { dg-do compile } */
+/* { dg-options "-Os" } */
+
+int a, b;
+unsigned char e;
+void fn1 ()
+{
+unsigned char *c=0;
+for (;; a++)
+{
+unsigned char d = *(c + b);
+for (; &e<&d; c++)
+goto Found_Top;
+}
+Found_Top:
+if (0)
+goto Empty_Bitmap;
+for (;; a++)
+{
+unsigned char *e = c + b;
+for (; c < e; c++)
+goto Found_Bottom;
+c -= b;
+}
+Found_Bottom:
+Empty_Bitmap:
+;
+}
Index: gcc/gimple-ssa-strength-reduction.c
===
--- gcc/gimple-ssa-strength-reduction.c (revision 190260)
+++ gcc/gimple-ssa-strength-reduction.c (working copy)
@@ -2534,7 +2534,7 @@ analyze_candidates_and_replace (void)
  /* Determine whether we'll be generating pointer arithmetic
 when replacing candidates.  */
  address_arithmetic_p = (c->kind == CAND_ADD
- && POINTER_TYPE_P (TREE_TYPE (c->base_expr)));
+ && POINTER_TYPE_P (c->cand_type));
 
  /* If all candidates have already been replaced under other
 interpretations, nothing remains to be done.  */

[PATCH, testsuite] New effective target long_neq_int

2012-08-09 Thread William J. Schmidt

As suggested by Janis regarding testsuite/gcc.dg/tree-ssa/slsr-30.c,
this patch adds a new effective target for machines having long and int
of differing sizes.

Tested on powerpc64-unknown-linux-gnu, where the test passes for -m64
and is skipped for -m32.  Ok for trunk?
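
For readers unfamiliar with this style of probe: the effective-target check compiles a declaration whose array bound is 1 when the condition holds and -1 otherwise, so the presence of compiler messages decides the result. A small sketch of the same condition, with hypothetical helper names:

```cpp
#include <cassert>

// True when int and long differ in size on the current target.
constexpr bool long_differs_from_int() {
  return sizeof(int) != sizeof(long);
}

// The array bound the probe would use: 1 when the condition holds,
// -1 (rejected as an array bound by the compiler) when it does not.
constexpr int probe_array_bound(bool cond) {
  return cond ? 1 : -1;
}
```

On a typical LP64 target long_differs_from_int () is true and the probe compiles cleanly; on ILP32 it is false and the probe is rejected, so the test is skipped.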

Thanks,
Bill


doc:

2012-08-09  Bill Schmidt  

* sourcebuild.texi: Document long_neq_int effective target.


testsuite:

2012-08-09  Bill Schmidt  

* lib/target-supports.exp (check_effective_target_long_neq_int): New.
* gcc.dg/tree-ssa/slsr-30.c: Check for long_neq_int effective target.


Index: gcc/doc/sourcebuild.texi
===
--- gcc/doc/sourcebuild.texi(revision 190260)
+++ gcc/doc/sourcebuild.texi(working copy)
@@ -1303,6 +1303,9 @@ Target has @code{int} that is at 32 bits or longer
 @item int16
 Target has @code{int} that is 16 bits or shorter.
 
+@item long_neq_int
+Target has @code{int} and @code{long} with different sizes.
+
 @item large_double
 Target supports @code{double} that is longer than @code{float}.
 
Index: gcc/testsuite/lib/target-supports.exp
===
--- gcc/testsuite/lib/target-supports.exp   (revision 190260)
+++ gcc/testsuite/lib/target-supports.exp   (working copy)
@@ -1689,6 +1689,15 @@ proc check_effective_target_llp64 { } {
 }]
 }
 
+# Return 1 if long and int have different sizes,
+# 0 otherwise.
+
+proc check_effective_target_long_neq_int { } {
+return [check_no_compiler_messages long_ne_int object {
+   int dummy[sizeof (int) != sizeof (long) ? 1 : -1];
+}]
+}
+
 # Return 1 if the target supports long double larger than double,
 # 0 otherwise.
 
Index: gcc/testsuite/gcc.dg/tree-ssa/slsr-30.c
===
--- gcc/testsuite/gcc.dg/tree-ssa/slsr-30.c (revision 190260)
+++ gcc/testsuite/gcc.dg/tree-ssa/slsr-30.c (working copy)
@@ -1,7 +1,7 @@
 /* Verify straight-line strength reduction fails for simple integer addition
with casts thrown in when -fwrapv is used.  */
 
-/* { dg-do compile { target { ! { ilp32 } } } } */
+/* { dg-do compile { target { long_neq_int } } } */
 /* { dg-options "-O3 -fdump-tree-dom2 -fwrapv" } */
 
 long

s390: Use VOIDmode with gen_rtx_SET

2012-08-09 Thread Richard Henderson

Committed as obvious.


r~
* config/s390/s390.c (s390_expand_insv): Use VOIDmode in gen_rtx_SET.
 
diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c
index 0ae77a2..d67c0eb 100644
--- a/gcc/config/s390/s390.c
+++ b/gcc/config/s390/s390.c
@@ -4684,9 +4684,8 @@ s390_expand_insv (rtx dest, rtx op1, rtx op2, rtx src)
  src = gen_lowpart (mode, src);
}
 
-  op = gen_rtx_SET (mode,
-   gen_rtx_ZERO_EXTRACT (mode, dest, op1, op2),
-   src);
+  op = gen_rtx_ZERO_EXTRACT (mode, dest, op1, op2);
+  op = gen_rtx_SET (VOIDmode, op, src);
   clobber = gen_rtx_CLOBBER (VOIDmode, gen_rtx_REG (CCmode, CC_REGNUM));
   emit_insn (gen_rtx_PARALLEL (VOIDmode, gen_rtvec (2, op, clobber)));

[PATCH 0/7] s390 improvements with r[ioxn]sbg

2012-08-09 Thread Richard Henderson

Only "tested" visually, by examining assembly diffs of the
runtime libraries between successive patches.  All told it
would appear to be some remarkable code size improvements.

Please test.


r~


Richard Henderson (7):
  s390: Constraints, predicates, and op letters for contiguous bitmasks
  s390: Only use lhs zero_extract in word_mode
  s390: Use risbgz for AND.
  s390: Add mode attribute for mode bitsize
  s390: Implement extzv for z10
  s390: Generate rxsbg, and shifted forms of rosbg
  s390: Generate rnsbg

 gcc/config/s390/constraints.md |   11 +-
 gcc/config/s390/predicates.md  |   10 +
 gcc/config/s390/s390-protos.h  |1 +
 gcc/config/s390/s390.c |  108 ---
 gcc/config/s390/s390.md|  385 ++--
 5 files changed, 353 insertions(+), 162 deletions(-)

-- 
1.7.7.6

[PATCH 2/7] s390: Only use lhs zero_extract in word_mode

2012-08-09 Thread Richard Henderson

This means that anything targeting extimm or z10 must therefore
imply zarch, which implies word_mode == DImode.

Then, now that *insv_z10 is no longer dependent on mode, let gas
do some arithmetic, rather than doing it in C and generating new rtl.
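
To make the assembler-side arithmetic concrete, here is a rough C model of what the three computed fields mean.  It assumes the usual reading of risbg (rotate the source left, then replace the selected bit range of the destination, with bit 0 being the most significant bit) and takes SIZE and POS in the same roles as operands 1 and 2 of the pattern.  The function names are invented for this sketch and condition-code effects are ignored.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate left; well defined for any count 0..63.  */
static uint64_t
rotl64 (uint64_t x, unsigned r)
{
  r &= 63;
  return r ? (x << r) | (x >> (64 - r)) : x;
}

/* Mask covering bits START..END inclusive, where bit 0 is the MSB.
   Requires START <= END <= 63.  */
static uint64_t
msb_mask (unsigned start, unsigned end)
{
  uint64_t from_start = ~UINT64_C (0) >> start;
  uint64_t through_end = ~UINT64_C (0) << (63 - end);
  return from_start & through_end;
}

/* Model of "risbg %0,%3,%2,%2+%1-1,64-%2-%1": insert the low SIZE
   bits of SRC into DEST at MSB-relative position POS.  */
static uint64_t
insv_model (uint64_t dest, uint64_t src, unsigned size, unsigned pos)
{
  unsigned start = pos;              /* %2        */
  unsigned end = pos + size - 1;     /* %2+%1-1   */
  unsigned rot = 64 - pos - size;    /* 64-%2-%1  */
  uint64_t mask = msb_mask (start, end);

  return (dest & ~mask) | (rotl64 (src, rot) & mask);
}

static void
insv_model_demo (void)
{
  /* Clear the top byte.  */
  assert (insv_model (~UINT64_C (0), 0, 8, 0)
          == UINT64_C (0x00ffffffffffffff));
  /* Insert a byte at the very bottom: no rotation needed.  */
  assert (insv_model (0, 0xab, 8, 56) == UINT64_C (0xab));
}
```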
---
 gcc/config/s390/s390.md |   45 -
 1 files changed, 16 insertions(+), 29 deletions(-)

diff --git a/gcc/config/s390/s390.md b/gcc/config/s390/s390.md
index 76ec9c4..2677fb2 100644
--- a/gcc/config/s390/s390.md
+++ b/gcc/config/s390/s390.md
@@ -3364,27 +3364,15 @@
   FAIL;
 })
 
-(define_insn "*insv_z10"
-  [(set (zero_extract:GPR (match_operand:GPR 0 "nonimmediate_operand" "+d")
- (match_operand 1 "const_int_operand""I")
- (match_operand 2 "const_int_operand""I"))
-   (match_operand:GPR 3 "nonimmediate_operand" "d"))
+(define_insn "*insv_z10"
+  [(set (zero_extract:DI
+ (match_operand:DI 0 "nonimmediate_operand" "+d")
+ (match_operand 1 "const_int_operand" "")
+ (match_operand 2 "const_int_operand" ""))
+   (match_operand:DI 3 "nonimmediate_operand" "d"))
(clobber (reg:CC CC_REGNUM))]
-  "TARGET_Z10
-   && (INTVAL (operands[1]) + INTVAL (operands[2])) <=
-  GET_MODE_BITSIZE (mode)"
-{
-  int start = INTVAL (operands[2]);
-  int size = INTVAL (operands[1]);
-  int offset = 64 - GET_MODE_BITSIZE (mode);
-
-  operands[2] = GEN_INT (offset + start);  /* start bit position */
-  operands[1] = GEN_INT (offset + start + size - 1);   /* end bit position */
-  operands[4] = GEN_INT (GET_MODE_BITSIZE (mode) -
-start - size);   /* left shift count */
-
-  return "risbg\t%0,%3,%b2,%b1,%b4";
-}
+  "TARGET_Z10"
+  "risbg\t%0,%3,%2,%2+%1-1,64-%2-%1"
   [(set_attr "op_type" "RIE")
(set_attr "z10prop" "z10_super_E1")])
 
@@ -3483,15 +3471,14 @@
   [(set_attr "op_type" "RIL")
(set_attr "z10prop" "z10_fwd_E1")])
 
-; Update the right-most 32 bit of a DI, or the whole of a SI.
-(define_insn "*insv_l_reg_extimm"
-  [(set (zero_extract:P (match_operand:P 0 "register_operand" "+d")
-   (const_int 32)
-   (match_operand 1 "const_int_operand" "n"))
-   (match_operand:P 2 "const_int_operand" "n"))]
-  "TARGET_EXTIMM
-   && BITS_PER_WORD - INTVAL (operands[1]) == 32"
-  "iilf\t%0,%o2"
+; Update the right-most 32 bit of a DI.
+(define_insn "*insv_l_di_reg_extimm"
+  [(set (zero_extract:DI (match_operand:DI 0 "register_operand" "+d")
+(const_int 32)
+(const_int 32))
+   (match_operand:DI 1 "const_int_operand" "n"))]
+  "TARGET_EXTIMM"
+  "iilf\t%0,%o1"
   [(set_attr "op_type" "RIL")
(set_attr "z10prop" "z10_fwd_A1")])
 
-- 
1.7.7.6

[PATCH 7/7] s390: Generate rnsbg

2012-08-09 Thread Richard Henderson

---
 gcc/config/s390/s390.md |   55 +++
 1 files changed, 55 insertions(+), 0 deletions(-)

diff --git a/gcc/config/s390/s390.md b/gcc/config/s390/s390.md
index d733062..182e7b1 100644
--- a/gcc/config/s390/s390.md
+++ b/gcc/config/s390/s390.md
@@ -3462,6 +3462,61 @@
   "rsbg\t%0,%1,%2,%2,%3"
   [(set_attr "op_type" "RIE")])
 
+;; These two are generated by combine for s.bf &= val.
+;; ??? For bitfields smaller than 32-bits, we wind up with SImode
+;; shifts and ands, which results in some truly awful patterns
+;; including subregs of operations.  Rather unnecessarily, IMO.
+;; Instead of
+;;
+;; (set (zero_extract:DI (reg/v:DI 50 [ s ])
+;;(const_int 24 [0x18])
+;;(const_int 0 [0]))
+;;(subreg:DI (and:SI (subreg:SI (lshiftrt:DI (reg/v:DI 50 [ s ])
+;;(const_int 40 [0x28])) 4)
+;;(reg:SI 4 %r4 [ y+4 ])) 0))
+;;
+;; we should instead generate
+;;
+;; (set (zero_extract:DI (reg/v:DI 50 [ s ])
+;;(const_int 24 [0x18])
+;;(const_int 0 [0]))
+;;(and:DI (lshiftrt:DI (reg/v:DI 50 [ s ])
+;;(const_int 40 [0x28]))
+;;(subreg:DI (reg:SI 4 %r4 [ y+4 ]) 0)))
+;;
+;; by noticing that we can push down the outer paradoxical subreg
+;; into the operation.
+
+(define_insn "*insv_rnsbg_noshift"
+  [(set (zero_extract:DI
+ (match_operand:DI 0 "nonimmediate_operand" "+d")
+ (match_operand 1 "const_int_operand" "")
+ (match_operand 2 "const_int_operand" ""))
+   (and:DI
+ (match_dup 0)
+ (match_operand:DI 3 "nonimmediate_operand" "d")))
+   (clobber (reg:CC CC_REGNUM))]
+  "TARGET_Z10
+   && INTVAL (operands[1]) + INTVAL (operands[2]) == 64"
+  "rnsbg\t%0,%3,%2,63,0"
+  [(set_attr "op_type" "RIE")])
+
+(define_insn "*insv_rnsbg_srl"
+  [(set (zero_extract:DI
+ (match_operand:DI 0 "nonimmediate_operand" "+d")
+ (match_operand 1 "const_int_operand" "")
+ (match_operand 2 "const_int_operand" ""))
+   (and:DI
+ (lshiftrt:DI
+   (match_dup 0)
+   (match_operand 3 "const_int_operand" ""))
+ (match_operand:DI 4 "nonimmediate_operand" "d")))
+   (clobber (reg:CC CC_REGNUM))]
+  "TARGET_Z10
+   && INTVAL (operands[3]) == 64 - INTVAL (operands[1]) - INTVAL (operands[2])"
+  "rnsbg\t%0,%4,%2,%2+%1-1,64-%2,%1"
+  [(set_attr "op_type" "RIE")])
+
 (define_insn "*insv_mem_reg"
   [(set (zero_extract:W (match_operand:QI 0 "memory_operand" "+Q,S")
(match_operand 1 "const_int_operand" "n,n")
-- 
1.7.7.6
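
For context, the source-level construct the comment in this patch refers to ("s.bf &= val") looks roughly like the C below.  The struct layout is an assumption chosen to resemble the 24-bit field in the RTL dump; bit-fields of type unsigned long long are an extension that GCC accepts, and the exact placement of the field within the word is implementation-defined.

```c
#include <assert.h>

/* Hypothetical 64-bit word with a 24-bit field, similar in shape to
   the zero_extract of width 24 shown in the RTL comment.  */
struct packed_word
{
  unsigned long long bf : 24;
  unsigned long long rest : 40;
};

/* The operation combine sees: AND a value into the bit-field only,
   leaving the remaining 40 bits untouched.  */
static struct packed_word
and_field (struct packed_word s, unsigned long long val)
{
  s.bf &= val;
  return s;
}

static void
and_field_demo (void)
{
  struct packed_word s = { 0xffffff, 0x123 };

  s = and_field (s, 0x0f0f0f);
  assert (s.bf == 0x0f0f0f);
  assert (s.rest == 0x123);
}
```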

[PATCH 4/7] s390: Add mode attribute for mode bitsize

2012-08-09 Thread Richard Henderson

Constant fold, and less typing than, GET_MODE_BITSIZE with
another mode substitution.
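
As an illustration only: several of the expanders touched below compute a shift count of "word size minus the narrow mode's bit size" and then zero-extend with a shift-left/shift-right pair.  In C the same computation, with the bit size supplied as a constant the way the new mode attribute supplies it, could be sketched as follows.  The enum and function names are invented for this sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-ins for the values the bitsize attribute expands to.  */
enum { QI_BITS = 8, HI_BITS = 16, SI_BITS = 32, DI_BITS = 64 };

/* Zero-extend the low BITSIZE bits of X to 64 bits using the same
   shift pair the DImode expander emits.  BITSIZE must be 1..64.  */
static uint64_t
zext_via_shifts (uint64_t x, unsigned bitsize)
{
  unsigned bitcount = DI_BITS - bitsize;

  return (x << bitcount) >> bitcount;
}

static void
zext_via_shifts_demo (void)
{
  assert (zext_via_shifts (UINT64_C (0xffffffffffffff80), QI_BITS) == 0x80);
  assert (zext_via_shifts (UINT64_C (0x12345678), HI_BITS) == 0x5678);
}
```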
---
 gcc/config/s390/s390.md |   24 
 1 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/gcc/config/s390/s390.md b/gcc/config/s390/s390.md
index 6474023..b6e1535 100644
--- a/gcc/config/s390/s390.md
+++ b/gcc/config/s390/s390.md
@@ -522,6 +522,9 @@
 (define_mode_attr bfstart [(DI "s") (SI "t")])
 (define_mode_attr bfend   [(DI "e") (SI "f")])
 
+;; In place of GET_MODE_BITSIZE (mode)
+(define_mode_attr bitsize [(DI "64") (SI "32") (HI "16") (QI "8")])
+
 ;;
 ;;- Compare instructions.
 ;;
@@ -3317,7 +3320,7 @@
 
   operands[1] = adjust_address (operands[1], BLKmode, 0);
   set_mem_size (operands[1], size);
-  operands[2] = GEN_INT (GET_MODE_BITSIZE (mode) - bitsize);
+  operands[2] = GEN_INT ( - bitsize);
   operands[3] = GEN_INT (mask);
 })
 
@@ -3344,7 +3347,7 @@
 
   operands[1] = adjust_address (operands[1], BLKmode, 0);
   set_mem_size (operands[1], size);
-  operands[2] = GEN_INT (GET_MODE_BITSIZE (mode) - bitsize);
+  operands[2] = GEN_INT ( - bitsize);
   operands[3] = GEN_INT (mask);
 })
 
@@ -3532,8 +3535,7 @@
 }
   else if (!TARGET_EXTIMM)
 {
-  rtx bitcount = GEN_INT (GET_MODE_BITSIZE (mode) -
- GET_MODE_BITSIZE (mode));
+  rtx bitcount = GEN_INT ( - );
 
   operands[1] = gen_lowpart (mode, operands[1]);
   emit_insn (gen_ashl3 (operands[0], operands[1], bitcount));
@@ -3635,8 +3637,7 @@
 {
   operands[1] = adjust_address (operands[1], BLKmode, 0);
   set_mem_size (operands[1], GET_MODE_SIZE (QImode));
-  operands[2] = GEN_INT (GET_MODE_BITSIZE (mode)
-- GET_MODE_BITSIZE (QImode));
+  operands[2] = GEN_INT ( - BITS_PER_UNIT);
 })
 
 ;
@@ -3747,8 +3748,7 @@
 }
   else if (!TARGET_EXTIMM)
 {
-  rtx bitcount = GEN_INT (GET_MODE_BITSIZE(DImode) -
- GET_MODE_BITSIZE(mode));
+  rtx bitcount = GEN_INT (64 - );
   operands[1] = gen_lowpart (DImode, operands[1]);
   emit_insn (gen_ashldi3 (operands[0], operands[1], bitcount));
   emit_insn (gen_lshrdi3 (operands[0], operands[0], bitcount));
@@ -3765,7 +3765,7 @@
 {
   operands[1] = gen_lowpart (SImode, operands[1]);
   emit_insn (gen_andsi3 (operands[0], operands[1],
-   GEN_INT ((1 << GET_MODE_BITSIZE(mode)) - 1)));
+GEN_INT ((1 << ) - 1)));
   DONE;
 }
 })
@@ -3958,8 +3958,8 @@
   REAL_VALUE_TYPE cmp, sub;
 
   operands[1] = force_reg (mode, operands[1]);
-  real_2expN (&cmp, GET_MODE_BITSIZE(mode) - 1, mode);
-  real_2expN (&sub, GET_MODE_BITSIZE(mode), mode);
+  real_2expN (&cmp,  - 1, mode);
+  real_2expN (&sub, , mode);
 
   emit_cmp_and_jump_insns (operands[1],
CONST_DOUBLE_FROM_REAL_VALUE (cmp, mode),
@@ -4676,7 +4676,7 @@
&& (CONST_OK_FOR_CONSTRAINT_P (INTVAL (operands[2]), 'K', \"K\")
|| CONST_OK_FOR_CONSTRAINT_P (INTVAL (operands[2]), 'O', \"Os\")
|| CONST_OK_FOR_CONSTRAINT_P (INTVAL (operands[2]), 'C', \"C\"))
-   && INTVAL (operands[2]) != -((HOST_WIDE_INT)1 << (GET_MODE_BITSIZE(mode) - 1))"
+   && INTVAL (operands[2]) != -((HOST_WIDE_INT)1 << ( - 1))"
   "@
ahi\t%0,%h2
ahik\t%0,%1,%h2
-- 
1.7.7.6

[PATCH 5/7] s390: Implement extzv for z10

2012-08-09 Thread Richard Henderson

---
 gcc/config/s390/predicates.md |4 +++
 gcc/config/s390/s390-protos.h |1 +
 gcc/config/s390/s390.c|   16 
 gcc/config/s390/s390.md   |   55 +++--
 4 files changed, 68 insertions(+), 8 deletions(-)

diff --git a/gcc/config/s390/predicates.md b/gcc/config/s390/predicates.md
index 333457d..e4632b9 100644
--- a/gcc/config/s390/predicates.md
+++ b/gcc/config/s390/predicates.md
@@ -101,6 +101,10 @@
   return true;
 })
 
+(define_predicate "nonzero_shift_count_operand"
+  (and (match_code "const_int")
+   (match_test "IN_RANGE (INTVAL (op), 1, GET_MODE_BITSIZE (mode) - 1)")))
+
 ;;  Return true if OP a valid operand for the LARL instruction.
 
 (define_predicate "larl_operand"
diff --git a/gcc/config/s390/s390-protos.h b/gcc/config/s390/s390-protos.h
index 79673d6..97c378f 100644
--- a/gcc/config/s390/s390-protos.h
+++ b/gcc/config/s390/s390-protos.h
@@ -110,5 +110,6 @@ extern bool s390_legitimate_address_without_index_p (rtx);
 extern bool s390_decompose_shift_count (rtx, rtx *, HOST_WIDE_INT *);
 extern int s390_branch_condition_mask (rtx);
 extern int s390_compare_and_branch_condition_mask (rtx);
+extern bool s390_extzv_shift_ok (int, int, unsigned HOST_WIDE_INT);
 
 #endif /* RTX_CODE */
diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c
index 4e22100..52138d7 100644
--- a/gcc/config/s390/s390.c
+++ b/gcc/config/s390/s390.c
@@ -1308,6 +1308,22 @@ s390_contiguous_bitmask_p (unsigned HOST_WIDE_INT in, int size,
   return true;
 }
 
+/* Check whether a rotate of ROTL followed by an AND of CONTIG is equivalent
+   to a shift followed by the AND.  In particular, CONTIG should not overlap
+   the (rotated) bit 0/bit 63 gap.  */
+
+bool
+s390_extzv_shift_ok (int bitsize, int rotl, unsigned HOST_WIDE_INT contig)
+{
+  int pos, len;
+  bool ok;
+
+  ok = s390_contiguous_bitmask_p (contig, bitsize, &pos, &len);
+  gcc_assert (ok);
+
+  return (rotl <= pos || rotl >= pos + len + (64 - bitsize));
+}
+
 /* Check whether we can (and want to) split a double-word
move in mode MODE from SRC to DST into two single-word
moves, moving the subword FIRST_SUBWORD first.  */
diff --git a/gcc/config/s390/s390.md b/gcc/config/s390/s390.md
index b6e1535..ae004ac 100644
--- a/gcc/config/s390/s390.md
+++ b/gcc/config/s390/s390.md
@@ -3298,15 +3298,25 @@
   [(set_attr "op_type" "RS,RSY")
(set_attr "z10prop" "z10_super_E1,z10_super_E1")])
 
+(define_insn "extzv"
+  [(set (match_operand:DI 0 "register_operand" "=d")
+   (zero_extract:DI
+ (match_operand:DI 1 "register_operand" "d")
+ (match_operand 2 "const_int_operand" "")
+ (match_operand 3 "const_int_operand" "")))
+   (clobber (reg:CC CC_REGNUM))]
+  "TARGET_Z10"
+  "risbg\t%0,%1,63-%3-%2,128+63,63-%3-%2"
+  [(set_attr "op_type" "RIE")
+   (set_attr "z10prop" "z10_super_E1")])
 
-(define_insn_and_split "*extzv"
+(define_insn_and_split "*pre_z10_extzv"
   [(set (match_operand:GPR 0 "register_operand" "=d")
(zero_extract:GPR (match_operand:QI 1 "s_operand" "QS")
- (match_operand 2 "const_int_operand" "n")
+ (match_operand 2 "nonzero_shift_count_operand" "")
  (const_int 0)))
(clobber (reg:CC CC_REGNUM))]
-  "INTVAL (operands[2]) > 0
-   && INTVAL (operands[2]) <= GET_MODE_BITSIZE (SImode)"
+  "!TARGET_Z10"
   "#"
   "&& reload_completed"
   [(parallel
@@ -3324,14 +3334,13 @@
   operands[3] = GEN_INT (mask);
 })
 
-(define_insn_and_split "*extv"
+(define_insn_and_split "*pre_z10_extv"
   [(set (match_operand:GPR 0 "register_operand" "=d")
(sign_extract:GPR (match_operand:QI 1 "s_operand" "QS")
- (match_operand 2 "const_int_operand" "n")
+ (match_operand 2 "nonzero_shift_count_operand" "")
  (const_int 0)))
(clobber (reg:CC CC_REGNUM))]
-  "INTVAL (operands[2]) > 0
-   && INTVAL (operands[2]) <= GET_MODE_BITSIZE (SImode)"
+  "!TARGET_Z10"
   "#"
   "&& reload_completed"
   [(parallel
@@ -6034,6 +6043,36 @@
  (clobber (reg:CC CC_REGNUM))])]
   "s390_narrow_logical_operator (AND, &operands[0], &operands[1]);")
 
+;; These two are what combine generates for (ashift (zero_extract)).
+(define_insn "*extzv__srl"
+  [(set (match_operand:DSI 0 "register_operand" "=d")
+   (and:DSI (lshiftrt:DSI
+  (match_operand:DSI 1 "register_operand" "d")
+  (match_operand:DSI 2 "nonzero_shift_count_operand" ""))
+   (match_operand:DSI 3 "contiguous_bitmask_operand" "")))
+   (clobber (reg:CC CC_REGNUM))]
+  "TARGET_Z10
+   /* Note that even for the SImode pattern, the rotate is always DImode.  */
+   && s390_extzv_shift_ok (, 64 - INTVAL (operands[2]),
+  INTVAL (operands[3]))"
+  "risbg\t%0,%1,%3,128+%3,64-%2"
+  [(set_attr "op_type" "RIE")
+   (set_attr "z10prop" "z10_super_E1")])
+
+(define_insn "*extzv__sll"
+  [(set (match_operand:DSI 0 "regis

[PATCH 1/7] s390: Constraints, predicates, and op letters for contiguous bitmasks

2012-08-09 Thread Richard Henderson

---
 gcc/config/s390/constraints.md |   11 -
 gcc/config/s390/predicates.md  |6 +++
 gcc/config/s390/s390.c |   92 +++-
 gcc/config/s390/s390.md|   48 +
 4 files changed, 90 insertions(+), 67 deletions(-)

diff --git a/gcc/config/s390/constraints.md b/gcc/config/s390/constraints.md
index 8564b66..9d416ad 100644
--- a/gcc/config/s390/constraints.md
+++ b/gcc/config/s390/constraints.md
@@ -45,6 +45,8 @@
 ;; H,Q: mode of the part
 ;; D,S,H:   mode of the containing operand
 ;; 0,F: value of the other parts (F - all bits set)
+;; --
+;; xx[DS]q  satisfies s390_contiguous_bitmask_p for DImode or SImode
 ;;
 ;; The constraint matches if the specified part of a constant
 ;; has a value different from its other parts.  If the letter x
@@ -330,8 +332,15 @@
   (and (match_code "const_int")
(match_test "s390_N_constraint_str (\"xQH0\", ival)")))
 
+(define_constraint "NxxDq"
+  "@internal"
+  (and (match_code "const_int")
+   (match_test "s390_contiguous_bitmask_p (ival, 64, NULL, NULL)")))
 
-
+(define_constraint "NxxSq"
+  "@internal"
+  (and (match_code "const_int")
+   (match_test "s390_contiguous_bitmask_p (ival, 32, NULL, NULL)")))
 
 ;;
 ;; Double-letter constraints starting with O follow.
diff --git a/gcc/config/s390/predicates.md b/gcc/config/s390/predicates.md
index 9d619fb..333457d 100644
--- a/gcc/config/s390/predicates.md
+++ b/gcc/config/s390/predicates.md
@@ -154,6 +154,12 @@
   return false;
 })
 
+(define_predicate "contiguous_bitmask_operand"
+  (match_code "const_int")
+{
+  return s390_contiguous_bitmask_p (INTVAL (op), GET_MODE_BITSIZE (mode), NULL, NULL);
+})
+
 ;; operators --
 
 ;; Return nonzero if OP is a valid comparison operator
diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c
index d67c0eb..4e22100 100644
--- a/gcc/config/s390/s390.c
+++ b/gcc/config/s390/s390.c
@@ -5286,28 +5286,35 @@ print_operand_address (FILE *file, rtx addr)
 'C': print opcode suffix for branch condition.
 'D': print opcode suffix for inverse branch condition.
 'E': print opcode suffix for branch on index instruction.
-'J': print tls_load/tls_gdcall/tls_ldcall suffix
 'G': print the size of the operand in bytes.
+'J': print tls_load/tls_gdcall/tls_ldcall suffix
+'M': print the second word of a TImode operand.
+'N': print the second word of a DImode operand.
 'O': print only the displacement of a memory reference.
 'R': print only the base register of a memory reference.
 'S': print S-type memory reference (base+displacement).
-'N': print the second word of a DImode operand.
-'M': print the second word of a TImode operand.
 'Y': print shift count operand.
 
 'b': print integer X as if it's an unsigned byte.
 'c': print integer X as if it's an signed byte.
-'x': print integer X as if it's an unsigned halfword.
+'e': "end" of DImode contiguous bitmask X.
+'f': "end" of SImode contiguous bitmask X.
 'h': print integer X as if it's a signed halfword.
 'i': print the first nonzero HImode part of X.
 'j': print the first HImode part unequal to -1 of X.
 'k': print the first nonzero SImode part of X.
 'm': print the first SImode part unequal to -1 of X.
-'o': print integer X as if it's an unsigned 32bit word.  */
+'o': print integer X as if it's an unsigned 32bit word.
+'s': "start" of DImode contiguous bitmask X.
+'t': "start" of SImode contiguous bitmask X.
+'x': print integer X as if it's an unsigned halfword.
+*/
 
 void
 print_operand (FILE *file, rtx x, int code)
 {
+  HOST_WIDE_INT ival;
+
   switch (code)
 {
 case 'C':
@@ -5486,30 +5493,57 @@ print_operand (FILE *file, rtx x, int code)
   break;
 
 case CONST_INT:
-  if (code == 'b')
-fprintf (file, HOST_WIDE_INT_PRINT_DEC, INTVAL (x) & 0xff);
-  else if (code == 'c')
-fprintf (file, HOST_WIDE_INT_PRINT_DEC, ((INTVAL (x) & 0xff) ^ 0x80) - 0x80);
-  else if (code == 'x')
-fprintf (file, HOST_WIDE_INT_PRINT_DEC, INTVAL (x) & 0xffff);
-  else if (code == 'h')
-fprintf (file, HOST_WIDE_INT_PRINT_DEC, ((INTVAL (x) & 0xffff) ^ 0x8000) - 0x8000);
-  else if (code == 'i')
-   fprintf (file, HOST_WIDE_INT_PRINT_DEC,
-s390_extract_part (x, HImode, 0));
-  else if (code == 'j')
-   fprintf (file, HOST_WIDE_INT_PRINT_DEC,
-s390_extract_part (x, HImode, -1));
-  else if (code == 'k')
-   fprintf (file, HOST_WIDE_INT_PRINT_DEC,
-s390_extract_part (x, SImode, 0));
-  else if (code == 'm')
-   fprintf (file, HOST_WIDE_INT_PRINT_DEC,
-s390_extract_part (x, SImode, -1));
-  else if (code == 'o')
-   fprintf (file, HOST_WIDE_INT_PRINT_DEC, INTVAL (x) & 0xffffffff);
-  else

[PATCH 6/7] s390: Generate rxsbg, and shifted forms of rosbg

2012-08-09 Thread Richard Henderson

---
 gcc/config/s390/s390.md |   63 +-
 1 files changed, 56 insertions(+), 7 deletions(-)

diff --git a/gcc/config/s390/s390.md b/gcc/config/s390/s390.md
index ae004ac..d733062 100644
--- a/gcc/config/s390/s390.md
+++ b/gcc/config/s390/s390.md
@@ -384,6 +384,9 @@
 ;; the same template.
 (define_code_iterator SHIFT [ashift lshiftrt])
 
+;; This iterator allows r[ox]sbg to be defined with the same template.
+(define_code_iterator IXOR [ior xor])
+
 ;; This iterator and attribute allow to combine most atomic operations.
 (define_code_iterator ATOMIC [and ior xor plus minus mult])
 (define_code_iterator ATOMIC_Z196 [and ior xor plus])
@@ -3402,15 +3405,61 @@
   [(set_attr "op_type" "RIE")
(set_attr "z10prop" "z10_super_E1")])
 
-; and op1 with a mask being 1 for the selected bits and 0 for the rest
-(define_insn "*insv_or_z10_noshift"
-  [(set (match_operand:GPR 0 "nonimmediate_operand" "=d")
-   (ior:GPR (and:GPR (match_operand:GPR 1 "nonimmediate_operand" "d")
- (match_operand:GPR 2 "contiguous_bitmask_operand" ""))
-   (match_operand:GPR 3 "nonimmediate_operand" "0")))
+(define_insn "*rsbg__noshift"
+  [(set (match_operand:DSI 0 "nonimmediate_operand" "=d")
+   (IXOR:DSI
+ (and:DSI (match_operand:DSI 1 "nonimmediate_operand" "d")
+   (match_operand:DSI 2 "contiguous_bitmask_operand" ""))
+ (match_operand:DSI 3 "nonimmediate_operand" "0")))
+   (clobber (reg:CC CC_REGNUM))]
+  "TARGET_Z10"
+  "rsbg\t%0,%1,%2,%2,0"
+  [(set_attr "op_type" "RIE")])
+
+(define_insn "*rsbg_di_rotl"
+  [(set (match_operand:DI 0 "nonimmediate_operand" "=d")
+   (IXOR:DI
+ (and:DI
+   (rotate:DI
+ (match_operand:DI 1 "nonimmediate_operand" "d")
+  (match_operand:DI 3 "const_int_operand" ""))
+(match_operand:DI 2 "contiguous_bitmask_operand" ""))
+ (match_operand:DI 4 "nonimmediate_operand" "0")))
(clobber (reg:CC CC_REGNUM))]
   "TARGET_Z10"
-  "rosbg\t%0,%1,%2,%2,0"
+  "rsbg\t%0,%1,%2,%2,%b3"
+  [(set_attr "op_type" "RIE")])
+
+(define_insn "*rsbg__srl"
+  [(set (match_operand:DSI 0 "nonimmediate_operand" "=d")
+   (IXOR:DSI
+ (and:DSI
+   (lshiftrt:DSI
+  (match_operand:DSI 1 "nonimmediate_operand" "d")
+  (match_operand:DSI 3 "nonzero_shift_count_operand" ""))
+(match_operand:DSI 2 "contiguous_bitmask_operand" ""))
+ (match_operand:DSI 4 "nonimmediate_operand" "0")))
+   (clobber (reg:CC CC_REGNUM))]
+  "TARGET_Z10
+   && s390_extzv_shift_ok (, 64 - INTVAL (operands[3]),
+   INTVAL (operands[2]))"
+  "rsbg\t%0,%1,%2,%2,64-%3"
+  [(set_attr "op_type" "RIE")])
+
+(define_insn "*rsbg__sll"
+  [(set (match_operand:DSI 0 "nonimmediate_operand" "=d")
+   (IXOR:DSI
+ (and:DSI
+   (ashift:DSI
+  (match_operand:DSI 1 "nonimmediate_operand" "d")
+  (match_operand:DSI 3 "nonzero_shift_count_operand" ""))
+(match_operand:DSI 2 "contiguous_bitmask_operand" ""))
+ (match_operand:DSI 4 "nonimmediate_operand" "0")))
+   (clobber (reg:CC CC_REGNUM))]
+  "TARGET_Z10
+   && s390_extzv_shift_ok (, INTVAL (operands[3]),
+   INTVAL (operands[2]))"
+  "rsbg\t%0,%1,%2,%2,%3"
   [(set_attr "op_type" "RIE")])
 
 (define_insn "*insv_mem_reg"
-- 
1.7.7.6
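
The s390_extzv_shift_ok condition used by the shifted patterns above rests on one observation: a rotate can stand in for a shift as long as the AND mask does not select any of the bits that wrapped around.  The C sketch below checks that property directly.  It numbers bits from the least significant end and uses invented helper names, so it is not the GCC implementation and does not follow the pos/len conventions of s390_contiguous_bitmask_p.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Rotate left; well defined for any count 0..63.  */
static uint64_t
rotl64 (uint64_t x, unsigned r)
{
  r &= 63;
  return r ? (x << r) | (x >> (64 - r)) : x;
}

/* LEN set bits starting at bit POS, where bit 0 is the LSB.
   Requires LEN >= 1 and POS + LEN <= 64.  */
static uint64_t
contig_mask (unsigned pos, unsigned len)
{
  uint64_t ones = len >= 64 ? ~UINT64_C (0) : (UINT64_C (1) << len) - 1;
  return ones << pos;
}

/* A left shift by SH wraps bits into positions 0..SH-1, so the mask
   must start at or above SH.  */
static bool
sll_as_rotate_ok (unsigned sh, unsigned pos)
{
  return pos >= sh;
}

/* A right shift by SH, done as a rotate left by 64-SH, wraps bits
   into the top SH positions, so the mask must end below them.  */
static bool
srl_as_rotate_ok (unsigned sh, unsigned pos, unsigned len)
{
  return pos + len <= 64 - sh;
}

static void
shift_ok_demo (void)
{
  uint64_t x = UINT64_C (0xff23456789abcdef);
  uint64_t m;

  m = contig_mask (8, 16);
  assert (sll_as_rotate_ok (8, 8));
  assert (((x << 8) & m) == (rotl64 (x, 8) & m));

  m = contig_mask (0, 24);
  assert (srl_as_rotate_ok (40, 0, 24));
  assert (((x >> 40) & m) == (rotl64 (x, 64 - 40) & m));

  /* Mask overlaps the wrapped-in bits: the rotate gives a different
     answer, so the pattern must not match.  */
  m = contig_mask (4, 8);
  assert (!sll_as_rotate_ok (8, 4));
  assert (((x << 8) & m) != (rotl64 (x, 8) & m));
}
```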

[PATCH 3/7] s390: Use risbgz for AND.

2012-08-09 Thread Richard Henderson

---
 gcc/config/s390/s390.md |  107 +++
 1 files changed, 62 insertions(+), 45 deletions(-)

diff --git a/gcc/config/s390/s390.md b/gcc/config/s390/s390.md
index 2677fb2..6474023 100644
--- a/gcc/config/s390/s390.md
+++ b/gcc/config/s390/s390.md
@@ -5946,44 +5946,50 @@
 
 (define_insn "*anddi3_cc"
   [(set (reg CC_REGNUM)
-(compare (and:DI (match_operand:DI 1 "nonimmediate_operand" "%0,d, 0")
- (match_operand:DI 2 "general_operand"  " d,d,RT"))
- (const_int 0)))
-   (set (match_operand:DI 0 "register_operand"  "=d,d, d")
+(compare
+ (and:DI (match_operand:DI 1 "nonimmediate_operand" "%0,d, 0,d")
+  (match_operand:DI 2 "general_operand"  " d,d,RT,NxxDq"))
+  (const_int 0)))
+   (set (match_operand:DI 0 "register_operand"   "=d,d, d,d")
 (and:DI (match_dup 1) (match_dup 2)))]
-  "s390_match_ccmode(insn, CCTmode) && TARGET_ZARCH"
+  "TARGET_ZARCH && s390_match_ccmode(insn, CCTmode)"
   "@
ngr\t%0,%2
ngrk\t%0,%1,%2
-   ng\t%0,%2"
-  [(set_attr "op_type"  "RRE,RRF,RXY")
-   (set_attr "cpu_facility" "*,z196,*")
-   (set_attr "z10prop" "z10_super_E1,*,z10_super_E1")])
+   ng\t%0,%2
+   risbg\t%0,%1,%s2,128+%e2,0"
+  [(set_attr "op_type"  "RRE,RRF,RXY,RIE")
+   (set_attr "cpu_facility" "*,z196,*,z10")
+   (set_attr "z10prop" "z10_super_E1,*,z10_super_E1,z10_super_E1")])
 
 (define_insn "*anddi3_cconly"
   [(set (reg CC_REGNUM)
-(compare (and:DI (match_operand:DI 1 "nonimmediate_operand" "%0,d, 0")
- (match_operand:DI 2 "general_operand"  " d,d,RT"))
+(compare
+ (and:DI (match_operand:DI 1 "nonimmediate_operand" "%0,d, 0,d")
+  (match_operand:DI 2 "general_operand"  " d,d,RT,NxxDq"))
  (const_int 0)))
-   (clobber (match_scratch:DI 0 "=d,d, d"))]
-  "s390_match_ccmode(insn, CCTmode) && TARGET_ZARCH
+   (clobber (match_scratch:DI 0  "=d,d, d,d"))]
+  "TARGET_ZARCH
+   && s390_match_ccmode(insn, CCTmode)
/* Do not steal TM patterns.  */
&& s390_single_part (operands[2], DImode, HImode, 0) < 0"
   "@
ngr\t%0,%2
ngrk\t%0,%1,%2
-   ng\t%0,%2"
-  [(set_attr "op_type"  "RRE,RRF,RXY")
-   (set_attr "cpu_facility" "*,z196,*")
-   (set_attr "z10prop" "z10_super_E1,*,z10_super_E1")])
+   ng\t%0,%2
+   risbg\t%0,%1,%s2,128+%e2,0"
+  [(set_attr "op_type"  "RRE,RRF,RXY,RIE")
+   (set_attr "cpu_facility" "*,z196,*,z10")
+   (set_attr "z10prop" "z10_super_E1,*,z10_super_E1,z10_super_E1")])
 
 (define_insn "*anddi3"
   [(set (match_operand:DI 0 "nonimmediate_operand"
-"=d,d,d,d,d,d,d,d,d,d, d,   AQ,Q")
-(and:DI (match_operand:DI 1 "nonimmediate_operand"
-"%d,o,0,0,0,0,0,0,0,d, 0,    0,0")
-(match_operand:DI 2 "general_operand"
-"M, M,N0HDF,N1HDF,N2HDF,N3HDF,N0SDF,N1SDF,d,d,RT,NxQDF,Q")))
+"=d,d,d,d,d,d,d,d,d,d, d,d,   AQ,Q")
+(and:DI
+ (match_operand:DI 1 "nonimmediate_operand"
+"%d,o,0,0,0,0,0,0,0,d, 0,d,0,0")
+  (match_operand:DI 2 "general_operand"
+"M, M,N0HDF,N1HDF,N2HDF,N3HDF,N0SDF,N1SDF,d,d,RT,NxxDq,NxQDF,Q")))
(clobber (reg:CC CC_REGNUM))]
   "TARGET_ZARCH && s390_logical_operator_ok_p (operands)"
   "@
@@ -5998,10 +6004,11 @@
ngr\t%0,%2
ngrk\t%0,%1,%2
ng\t%0,%2
+   risbg\t%0,%1,%s2,128+%e2,0
#
#"
-  [(set_attr "op_type" "RRE,RXE,RI,RI,RI,RI,RIL,RIL,RRE,RRF,RXY,SI,SS")
-   (set_attr "cpu_facility" "*,*,*,*,*,*,extimm,extimm,*,z196,*,*,*")
+  [(set_attr "op_type" "RRE,RXE,RI,RI,RI,RI,RIL,RIL,RRE,RRF,RXY,RIE,SI,SS")
+   (set_attr "cpu_facility" "*,*,*,*,*,*,extimm,extimm,*,z196,*,z10,*,*")
(set_attr "z10prop" "*,
 *,
 z10_super_E1,
@@ -6013,6 +6020,7 @@
 z10_super_E1,
 *,
 z10_super_E1,
+z10_super_E1,
 *,
 *")])
 
@@ -6033,10 +6041,12 @@
 
 (define_insn "*andsi3_cc"
   [(set (reg CC_REGNUM)
-(compare (and:SI (match_operand:SI 1 "nonimmediate_operand" "%0,0,d,0,0")
- (match_operand:SI 2 "general_operand"  "Os,d,d,R,T"))
- (const_int 0)))
-   (set (match_operand:SI 0 "register_operand"  "=d,d,d,d,d")
+(compare
+ (and:SI
+   (match_operand:SI 1 "nonimmediate_operand" "%0,0,d,0,0,d")
+(match_operand:SI 2 "general_operand"  "Os,d,d,R,T,NxxSq"))
+  (const_int 0)))
+   (set (match_operand:SI 0 "register_operand" "=d,d,d,d,d,d")
 (and

Re: [cxx-conversion] Make double_int a class with methods and operators. (issue6443093)

2012-08-09 Thread Mike Stump

On Aug 9, 2012, at 5:00 PM, Michael Matz wrote:
> On Thu, 9 Aug 2012, Mike Stump wrote:
> 
>> On Aug 9, 2012, at 8:19 AM, Michael Matz wrote:
>>> Hmm.  And maintaining a cache is faster than 
>>> passing/returning/manipulating two registers?
>> 
>> For the most part, we merely mirror existing code, check out 
>> lookup_const_double and immed_double_const.
> 
> No, I won't without patches on this list.

Ah, we are discussing the code currently in the gcc tree.  You _can_ comment on
it, if you like.  I was only pointing out that we neither made this choice nor
deviated from the code at the top of the tree.  If you think it is wrong to
cache it, then a discussion of the code at the top of the tree is the right
place to raise that.  Though, you don't have to if you don't want to.

> You keep repeating bragging

Such hostility.  Why?  I don't get it.  I _asked_ when the cxx branch was
going to land, I stated that I liked non-mutating interfaces, and I gave a
heads up that we have a wide-int class to replace double-int for ints.  I
_only_ gave a heads up because the change submitted to the cxx branch
conflicts with the wide-int change on a larger than expected scale.  I think
giving a heads up before the conflict happens is good citizenship.

> I mean, preparing the audience for an upcoming _suggested_ change in data 
> structure of course is fine.  But arguing as if the change happened 
> already, and what's more concerning, as if the change was even already 
> suggested and agreed upon even though that's not the case, is just bad 
> style.

So, let me get this straight, alerting people that I have a patch that 
conflicts with another posted patch is, bad style?  Odd.  I saw it listed on 
page 10 of the etiquette guide, maybe you could update the guide for us.

> I would suggest to stay conservative about whatever you have (except if 
> it's momentarily materializing), and _especially don't argue against or 
> for or not against or for whatever improvement is suggested

Ah, that's a misunderstanding on your part.  I was not arguing for, or against 
the double_int changes.  In fact, I'm very supportive of those changes and the 
entire cxx branch, not that you'd know that, as I think all of the changes are 
a slam dunk and don't need any support from me.  The :-( in the email that you 
read, was just a comment that someone is going to have to resolve conflicts.  
Now that we know the timing of the cxx branch landing, we expect, we'll handle 
the conflicts on the wide-int side.  If the timing was different, we'd land the 
wide-int change first, then the :-( in the heads up comment would be read more 
as, we're sorry, but we've just scrambled the tree on you, so sorry.

Let me be perfectly clear, I support the double_int changes and the entire 
cxx-conversion branch.  No work I may or may not have matters or should be 
considered in reviewing any patches.  I'm a firm believer in the first in, wins 
method of resolving conflicts.  Sorry if anyone thought I was objecting in 
any way to the double_int work.

> Nobody has seen it yet,

Actually, that's not true; but, it doesn't matter any.

> so you can't expect to get any feedback on it.

I don't recall asking for feedback on it.  The feedback I requested that you 
quote above, was feedback on the code in the top of the tree.

Re: [PATCH 2/3] Incorporate aggregate jump functions into inlining analysis

2012-08-09 Thread Jan Hubicka

> Hi,
> 
> this patch uses the aggregate jump functions created by the previous
> patch in the series to determine benefits of inlining a particular
> call graph edge.  It has not changed much since the last time I posted
> it, except for the presence of by_ref flags and removal of checks
> required by TBAA which we now do not use.
> 
> The patch works in a fairly straightforward way.  It adds two flags to
> struct condition to specify it actually refers to an aggregate passed
> by value or something passed by reference, in both cases at a
> particular offset, also newly stored in the structures.  Functions
> which build the predicates specifying under which conditions CFG edges
> will be taken or individual statements are actually executed then
> simply also look whether a value comes from an aggregate passed to us
> in a parameter (either by value or reference) and if so, create
> appropriate conditions.  Later on, predicates are evaluated as before,
> we only also look at aggregate contents of the jump functions of the
> edge we are considering to inline when evaluating the predicates, and
> also remap the offsets of the jump functions when remapping over an
> ancestor jump function.
> 
> This patch alone makes us inline the function bar in testcase of PR
> 48636 in comment #4.  It also passes bootstrap and testing on
> x86_64-linux.  I successfully LTO-built Firefox with it too.
> 
> Thanks for all comments and suggestions,
> 
> Martin
> 
> 
> 2012-07-31  Martin Jambor  
> 
>   PR fortran/48636
>   * ipa-inline.h (condition): New fields offset, agg_contents and by_ref.
>   * ipa-inline-analysis.c (agg_position_info): New type.
>   (add_condition): New parameter aggpos, also store agg_contents, by_ref
>   and offset.
>   (dump_condition): Also dump aggregate conditions.
>   (evaluate_conditions_for_known_args): Also handle aggregate
>   conditions.  New parameter known_aggs.
>   (evaluate_properties_for_edge): Gather known aggregate contents.
>   (inline_node_duplication_hook): Pass NULL known_aggs to
>   evaluate_conditions_for_known_args.
>   (unmodified_parm): Split into unmodified_parm and unmodified_parm_1.
>   (unmodified_parm_or_parm_agg_item): New function.
>   (set_cond_stmt_execution_predicate): Handle values passed in
>   aggregates.
>   (set_switch_stmt_execution_predicate): Likewise.
>   (will_be_nonconstant_predicate): Likewise.
>   (estimate_edge_devirt_benefit): Pass new parameter known_aggs to
>   ipa_get_indirect_edge_target.
>   (estimate_calls_size_and_time): New parameter known_aggs, pass it
>   recursively to itself and to estimate_edge_devirt_benefit.
>   (estimate_node_size_and_time): New vector known_aggs, pass it to
>   functions which need it.
>   (remap_predicate): New parameter offset_map, use it to remap aggregate
>   conditions.
>   (remap_edge_summaries): New parameter offset_map, pass it recursively
>   to itself and to remap_predicate.
>   (inline_merge_summary): Also create and populate vector offset_map.
>   (do_estimate_edge_time): New vector of known aggregate contents,
>   passed to functions which need it.
>   (inline_read_section): Stream new fields of condition.
>   (inline_write_summary): Likewise.
>   * ipa-cp.c (ipa_get_indirect_edge_target): Also examine the aggregate
>   contents.  Let all local callers pass NULL for known_aggs.
> 
>   * testsuite/gfortran.dg/pr48636.f90: New test.

OK with the following changes.

I plan to push out my inline hints code, so it would be nice if you committed
soon so I can resolve conflicts on my side.
> Index: src/gcc/ipa-inline.h
> ===
> *** src.orig/gcc/ipa-inline.h
> --- src/gcc/ipa-inline.h
> *** along with GCC; see the file COPYING3.
> *** 28,36 
> --- 28,45 
>   
>   typedef struct GTY(()) condition
> {
> + /* If agg_contents is set, this is the offset from which the used data was
> +loaded.  */
> + HOST_WIDE_INT offset;
>   tree val;
>   int operand_num;
>   enum tree_code code;
> + /* Set if the used data were loaded from an aggregate parameter or from
> +data received by reference.  */
> + unsigned agg_contents : 1;
> + /* If agg_contents is set, this differentiates between loads from data
> +passed by reference and by value.  */
> + unsigned by_ref : 1;

Do you have any data on memory usage?  I was originally concerned about the
memory use of the whole predicate thingy at the WPA level.  Eventually we could
add simple inheritance on conditions and sort them into multiple vectors if
needed.  But I assume it is OK, or we will work it out on Mozilla builds soonish.

One obvious thing is to pack CODE and the bitfields so we fit in three 64-bit words.
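
A minimal sketch of the idea of fitting the condition into three 64-bit words,
not taken from the patch.  The type and field names below are hypothetical
stand-ins (GCC's real types are HOST_WIDE_INT, tree and enum tree_code), and it
assumes 16 bits are enough for the code.  The resulting sizes depend on the ABI,
so treat the layout as an illustration only.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for enum tree_code.  */
enum demo_code { DEMO_EQ, DEMO_NE, DEMO_LT, DEMO_LAST };

/* Layout in the style of the patch: the enum and the flags each occupy
   their own storage unit.  */
struct cond_unpacked
{
  int64_t offset;
  void *val;
  int operand_num;
  enum demo_code code;
  unsigned agg_contents : 1;
  unsigned by_ref : 1;
};

/* Same information with the code squeezed into a 16-bit bitfield that
   shares a storage unit with the two flags.  */
struct cond_packed
{
  int64_t offset;
  void *val;
  int operand_num;
  unsigned code : 16;
  unsigned agg_contents : 1;
  unsigned by_ref : 1;
};

/* Store CODE in the packed form, checking that it fits in 16 bits.  */
static void
set_code (struct cond_packed *c, enum demo_code code)
{
  assert ((unsigned) code < (1u << 16));
  c->code = (unsigned) code;
}
```

On a typical LP64 target the packed variant should come out smaller than the
unpacked one, but the exact numbers are up to the compiler and ABI.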
> *** dump_condition (FILE *f, conditions cond
> *** 519,524 
> --- 554,561 
>

Re: [PATCH 2/3] Incorporate aggregate jump functions into inlining analysis

2012-08-09 Thread Jan Hubicka

> *** inline_merge_summary (struct cgraph_edge
> *** 2639,2655 
> int count = ipa_get_cs_argument_count (args);
> int i;
>   
> !   evaluate_properties_for_edge (edge, true, &clause, NULL, NULL);
> if (count)
> ! VEC_safe_grow_cleared (int, heap, operand_map, count);
> for (i = 0; i < count; i++)
>   {
> struct ipa_jump_func *jfunc = ipa_get_ith_jump_func (args, i);
> int map = -1;
> /* TODO: handle non-NOPs when merging.  */
> !   if (jfunc->type == IPA_JF_PASS_THROUGH
> !   && ipa_get_jf_pass_through_operation (jfunc) == NOP_EXPR)
> ! map = ipa_get_jf_pass_through_formal_id (jfunc);
> VEC_replace (int, operand_map, i, map);
> gcc_assert (map < ipa_get_param_count (IPA_NODE_REF (to)));
>   }
> --- 2788,2822 
> int count = ipa_get_cs_argument_count (args);
> int i;
>   
> !   evaluate_properties_for_edge (edge, true, &clause, NULL, NULL, NULL);
> if (count)
> ! {
> !   VEC_safe_grow_cleared (int, heap, operand_map, count);
> !   VEC_safe_grow_cleared (int, heap, offset_map, count);
> ! }
> for (i = 0; i < count; i++)
>   {
> struct ipa_jump_func *jfunc = ipa_get_ith_jump_func (args, i);
> int map = -1;
> + 
> /* TODO: handle non-NOPs when merging.  */
> !   if (jfunc->type == IPA_JF_PASS_THROUGH)
> ! {
> !   if (ipa_get_jf_pass_through_operation (jfunc) == NOP_EXPR)
> ! map = ipa_get_jf_pass_through_formal_id (jfunc);
> !   if (!ipa_get_jf_pass_through_agg_preserved (jfunc))
> ! VEC_replace (int, offset_map, i, -1);
> ! }
> !   else if (jfunc->type == IPA_JF_ANCESTOR)
> ! {
> !   HOST_WIDE_INT offset = ipa_get_jf_ancestor_offset (jfunc);
> !   if (offset >= 0 && offset < INT_MAX)
> ! {
> !   map = ipa_get_jf_ancestor_formal_id (jfunc);
> !   if (!ipa_get_jf_ancestor_agg_preserved (jfunc))
> ! offset = -1;
> ! }
Missing VEC_replace (int, offset_map, i, offset) here?
So we do not handle cases where operand is unpacked from aggregate and so on.
But it seems you are missing some matching of the aggregate flags here.
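
To make the question about the missing store concrete, here is a simplified
sketch of the mapping loop with plain arrays instead of VECs.  The struct and
function names are hypothetical and flatten what ipa_jump_func provides; the
only point is that the ancestor case writes the (possibly invalidated) offset
into offset_map.

```c
#include <assert.h>
#include <limits.h>

enum jf_kind { JF_UNKNOWN, JF_PASS_THROUGH, JF_ANCESTOR };

/* Hypothetical, flattened jump function; not GCC's struct ipa_jump_func.  */
struct demo_jump_func
{
  enum jf_kind kind;
  int formal_id;       /* caller parameter the value comes from */
  int is_nop;          /* pass-through without an arithmetic operation */
  int agg_preserved;   /* aggregate contents survive the call */
  long long offset;    /* ancestor offset, meaningful for JF_ANCESTOR */
};

/* Fill OPERAND_MAP and OFFSET_MAP for COUNT call arguments.  */
static void
build_maps (const struct demo_jump_func *jf, int count,
            int *operand_map, int *offset_map)
{
  assert (count >= 0);
  for (int i = 0; i < count; i++)
    {
      int map = -1;
      int off = 0;   /* mirrors the zero fill of VEC_safe_grow_cleared */

      if (jf[i].kind == JF_PASS_THROUGH)
        {
          if (jf[i].is_nop)
            map = jf[i].formal_id;
          if (!jf[i].agg_preserved)
            off = -1;
        }
      else if (jf[i].kind == JF_ANCESTOR)
        {
          long long offset = jf[i].offset;
          if (offset >= 0 && offset < INT_MAX)
            {
              map = jf[i].formal_id;
              if (!jf[i].agg_preserved)
                offset = -1;
              off = (int) offset;   /* the store the review asks about */
            }
        }
      operand_map[i] = map;
      offset_map[i] = off;
    }
}
```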

Honza

Re: [PATCH 3/3] Compute predicates for phi node results in ipa-inline-analysis.c

2012-08-09 Thread Jan Hubicka

> Hi,
> 
> this third patch is basically a proof-of-concept aiming at alleviating
> the following code found in Fortran functions when they look at the
> contents of array descriptors:
> 
>   :
> stride.156_7 = strain_tensor_6(D)->dim[0].stride;
> if (stride.156_7 != 0)
>   goto ;
> else
>   goto ;
> 
>   :
> 
>   :
> # stride.156_4 = PHI 
> 
> and stride.156_4 is then used for other computations.  Currently we
> compute a predicate for SSA name stride.156_7 but the PHI node stops
> us from having one for stride.156_4 and those computed from it.
> 
> This patch looks at phi nodes, and if its pairs of predecessors have
> the same nearest common dominator, and the condition there is known to
> be described by a predicate (computed either by
> set_cond_stmt_execution_predicate or
> set_switch_stmt_execution_predicate; we depend on knowing how exactly
> they behave), we use the parameter and offset from the predicate
> condition and create one for the PHI node result, provided the
> arguments of a phi node allow that, of course.

Consider:

           b==0?
         T/     \F
         /       \
     a==0?       a==0?
    T/  \F      T/  \F
   ...   \      /   ...
          \    /
           PHI

In this case the value of PHI is determined by a==0, but the condition in the
common dominator would be b==0.  We can work this out from the control
dependency relation or handle it in the propagation engine, but perhaps that is
overkill.  What about special-casing the (half) diamond CFG to start with?

Patch is OK with that change.
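
A rough sketch of what a (half) diamond check could look like, using a
hypothetical miniature basic block rather than GCC's basic_block and edge
structures.  It accepts a join block only when both arms of the condition reach
it directly or through one forwarding block, which rejects the nested a==0/b==0
shape above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical miniature basic block; GCC's structures are richer.  */
struct demo_bb
{
  struct demo_bb *succ[2];
  int n_succs;
  int n_preds;
};

/* Return 1 if ARM is JOIN itself, or a block with a single predecessor
   and a single successor that is JOIN.  */
static int
arm_reaches_join (const struct demo_bb *arm, const struct demo_bb *join)
{
  if (arm == join)
    return 1;
  return arm->n_preds == 1 && arm->n_succs == 1 && arm->succ[0] == join;
}

/* Return 1 if COND and JOIN form a diamond or half diamond.  */
static int
is_diamond_or_half_diamond (const struct demo_bb *cond,
                            const struct demo_bb *join)
{
  assert (cond != NULL && join != NULL);
  if (cond->n_succs != 2 || cond->succ[0] == cond->succ[1])
    return 0;
  if (join->n_preds != 2)
    return 0;
  return arm_reaches_join (cond->succ[0], join)
         && arm_reaches_join (cond->succ[1], join);
}
```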
Honza

Re: [PATCH][7/6] Allow anonymous SSA names

2012-08-09 Thread Jan Hubicka

> 
> This converts most users of create_tmp_{var,reg} to use anonymous
> SSA names.  To give you one more reason to look at 6/6 ;)
> 
> Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

Very cool. Thanks for the hard work.  Did you have time to test the memory use
effects?
(I have to read the whole series, perhaps it is there :)

Honza

Re: [PATCH, testsuite] New effective target long_neq_int

2012-08-09 Thread Janis Johnson

On 08/09/2012 06:46 PM, William J. Schmidt wrote:
> As suggested by Janis regarding testsuite/gcc.dg/tree-ssa/slsr-30.c,
> this patch adds a new effective target for machines having long and int
> of differing sizes.
> 
> Tested on powerpc64-unknown-linux-gnu, where the test passes for -m64
> and is skipped for -m32.  Ok for trunk?

OK!

Janis

> Thanks,
> Bill
> 
> 
> doc:
> 
> 2012-08-09  Bill Schmidt  
> 
>   * sourcebuild.texi: Document long_neq_int effective target.
> 
> 
> testsuite:
> 
> 2012-08-09  Bill Schmidt  
> 
>   * lib/target-supports.exp (check_effective_target_long_neq_int): New.
>   * gcc.dg/tree-ssa/slsr-30.c: Check for long_neq_int effective target.
> 
> 
> Index: gcc/doc/sourcebuild.texi
> ===
> --- gcc/doc/sourcebuild.texi  (revision 190260)
> +++ gcc/doc/sourcebuild.texi  (working copy)
> @@ -1303,6 +1303,9 @@ Target has @code{int} that is at 32 bits or longer
>  @item int16
>  Target has @code{int} that is 16 bits or shorter.
>  
> +@item long_neq_int
> +Target has @code{int} and @code{long} with different sizes.
> +
>  @item large_double
>  Target supports @code{double} that is longer than @code{float}.
>  
> Index: gcc/testsuite/lib/target-supports.exp
> ===
> --- gcc/testsuite/lib/target-supports.exp (revision 190260)
> +++ gcc/testsuite/lib/target-supports.exp (working copy)
> @@ -1689,6 +1689,15 @@ proc check_effective_target_llp64 { } {
>  }]
>  }
>  
> +# Return 1 if long and int have different sizes,
> +# 0 otherwise.
> +
> +proc check_effective_target_long_neq_int { } {
> +return [check_no_compiler_messages long_ne_int object {
> + int dummy[sizeof (int) != sizeof (long) ? 1 : -1];
> +}]
> +}
> +
>  # Return 1 if the target supports long double larger than double,
>  # 0 otherwise.
>  
> Index: gcc/testsuite/gcc.dg/tree-ssa/slsr-30.c
> ===
> --- gcc/testsuite/gcc.dg/tree-ssa/slsr-30.c   (revision 190260)
> +++ gcc/testsuite/gcc.dg/tree-ssa/slsr-30.c   (working copy)
> @@ -1,7 +1,7 @@
>  /* Verify straight-line strength reduction fails for simple integer addition
> with casts thrown in when -fwrapv is used.  */
>  
> -/* { dg-do compile { target { ! { ilp32 } } } } */
> +/* { dg-do compile { target { long_neq_int } } } */
>  /* { dg-options "-O3 -fdump-tree-dom2 -fwrapv" } */
>  
>  long
> 
>
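
For readers unfamiliar with the idiom in the new effective-target check: an
array whose size evaluates to -1 is a compile error, so the probe compiles only
when the condition holds.  A small standalone sketch follows; the macro and
function names are made up for illustration and are not part of the testsuite.

```c
#include <assert.h>

/* Compile-time check in the style of the effective-target probe: the
   array size is -1, and compilation fails, when COND is false.  */
#define DEMO_STATIC_CHECK(name, cond) typedef char name[(cond) ? 1 : -1]

/* Expected to hold on every common ABI, so this line should compile.  */
DEMO_STATIC_CHECK (long_at_least_int, sizeof (long) >= sizeof (int));

/* Run-time view of the property the new effective target tests.  */
static int
long_differs_from_int (void)
{
  return sizeof (long) != sizeof (int);
}
```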

RE: [PATCH,i386] fma,fma4 and xop flags

2012-08-09 Thread Gopalasubramanian, Ganesh

> -mxop implies -mfma4, but reverse is not true.

I think this handling went in for bdver1.
But, with bdver2, we have both fma and fma4.
So for bdver2, -mxop should not be enabling one of them.

> if someone set -mfma4 together
> with -mfma on the command line, we should NOT disable selected ISA
> behind user's back

If both -mfma4 and -mfma are enabled, GCC outputs fma4 instructions.
This, I think, is because the fma4 instruction patterns are read before the fma
instruction patterns from the ".md" files.
So, enabling both -mfma4 and -mfma is not good for bdver2.

Moreover, if the user tries "-mfma -mno-fma4 -mxop", the order in which these
options are given becomes crucial.  -mxop enables -mfma4 and, because of the
instruction pattern order, fma4 instructions get emitted in the assembly file.

For the below test,

double a,b,c,d;
int fn(){
a = b + c * d ;
return a;
}

#1) Using options "-O2 -mno-fma4 -mfma -mxop" outputs fma4.
(vfmaddsd b(%rip), %xmm2, %xmm1, %xmm0)
#2) Using options "-O2 -mfma -mno-fma4 -mxop" outputs fma4.
(vfmaddsd b(%rip), %xmm2, %xmm1, %xmm0)
#3) Using options "-mxop -mno-fma4 -mfma" outputs fma.
(vfmadd132sd d(%rip), %xmm1, %xmm0)

As we see the order in which the options are used becomes crucial.
This is confusing.

I haven't really tested other implied options. But, I suspect
a similar phenomenon in those cases too.

IMO, we can directly go by the CPUID flags and enable the flags.
This will be a one-to-one mapping and leave the user with a lot more liberty.
Please let me know your opinion.

Regards
Ganesh

-Original Message-
From: Uros Bizjak [mailto:ubiz...@gmail.com] 
Sent: Friday, August 10, 2012 1:21 AM
To: Gopalasubramanian, Ganesh
Cc: gcc-patches@gcc.gnu.org
Subject: Re: [PATCH,i386] fma,fma4 and xop flags

On Wed, Aug 8, 2012 at 1:31 PM,   wrote:

> Bdver2 cpu supports both fma and fma4 instructions.
> Previous to patch, option "-mno-xop" removes "-mfma4".
> Similarly, option "-mno-fma4" removes "-mxop".

It looks to me that there is some misunderstanding. AFAICS:

-mxop implies -mfma4, but reverse is not true. Please see

#define OPTION_MASK_ISA_FMA4_SET \
  (OPTION_MASK_ISA_FMA4 | OPTION_MASK_ISA_SSE4A_SET \
   | OPTION_MASK_ISA_AVX_SET)
#define OPTION_MASK_ISA_XOP_SET \
  (OPTION_MASK_ISA_XOP | OPTION_MASK_ISA_FMA4_SET)

So, -mxop sets -mfma4, etc ..., but -mfma4 does NOT enable -mxop.

OTOH,

#define OPTION_MASK_ISA_FMA4_UNSET \
  (OPTION_MASK_ISA_FMA4 | OPTION_MASK_ISA_XOP_UNSET)
#define OPTION_MASK_ISA_XOP_UNSET OPTION_MASK_ISA_XOP

-mno-fma4 implies -mno-xop, but again reverse is not true. Thus,
-mno-xop does NOT imply -mno-fma4.
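
The order dependence reported at the top of this message follows directly from
these SET/UNSET masks when options are applied left to right.  Below is a
simplified model: the bit values and function names are hypothetical, and the
SSE4A/AVX parts of the real masks are left out.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical bit assignments, only for this model.  */
enum { ISA_FMA = 1u << 0, ISA_FMA4 = 1u << 1, ISA_XOP = 1u << 2 };

/* Simplified versions of the masks quoted above.  */
#define FMA4_SET   ISA_FMA4
#define XOP_SET    (ISA_XOP | FMA4_SET)
#define XOP_UNSET  ISA_XOP
#define FMA4_UNSET (ISA_FMA4 | XOP_UNSET)

/* Apply a single command-line option to FLAGS.  */
static unsigned
apply_option (unsigned flags, const char *opt)
{
  assert (opt != NULL);
  if (strcmp (opt, "-mfma") == 0)
    return flags | ISA_FMA;
  if (strcmp (opt, "-mfma4") == 0)
    return flags | FMA4_SET;
  if (strcmp (opt, "-mxop") == 0)
    return flags | XOP_SET;
  if (strcmp (opt, "-mno-fma4") == 0)
    return flags & ~(unsigned) FMA4_UNSET;
  if (strcmp (opt, "-mno-xop") == 0)
    return flags & ~(unsigned) XOP_UNSET;
  return flags;   /* unknown options are ignored in this model */
}

/* Apply N options in command-line order, starting from no ISA bits.  */
static unsigned
apply_options (const char *const *opts, int n)
{
  unsigned flags = 0;
  for (int i = 0; i < n; i++)
    flags = apply_option (flags, opts[i]);
  return flags;
}
```

In this model "-mno-fma4 -mfma -mxop" ends with fma4 enabled because -mxop
comes last, while "-mxop -mno-fma4 -mfma" ends with only fma enabled.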

> So, the patch conditionally disables "-mfma" or "-mfma4".
> Enabling "-mxop" is done by also checking "-mfma".

Please note that conditional handling of ISA flags belongs to
ix86_option_override_internal. However, if someone set -mfma4 together
with -mfma on the command line, we should NOT disable selected ISA
behind user's back, in the same way as we don't disable anything with
"-march=i386 -msse4". With -march=bdver2, we already marked that only
fma is supported, and if user selected "-march=bdver2 -mfma4" on the
command line, we shouldn't disable anything.

Uros.

Re: libgo patch committed: Support NumCPU on more platforms

2012-08-09 Thread Ian Lance Taylor

On Tue, Aug 7, 2012 at 5:21 AM, Richard Earnshaw  wrote:
>
> Wouldn't it be more useful on Linux to check the task's affinity
> settings?  Then when a task is locked to a limited set of cores it won't
> overload those cores with threads.

This suggestion was implemented by Shenghou Ma, as attached.
Bootstrapped and ran Go testsuite on x86_64-unknown-linux-gnu.
Committed to mainline.

Ian
diff -r 8fc45a01251d libgo/runtime/getncpu-linux.c
--- a/libgo/runtime/getncpu-linux.c Mon Aug 06 21:42:17 2012 -0700
+++ b/libgo/runtime/getncpu-linux.c Thu Aug 09 23:04:31 2012 -0700
@@ -2,46 +2,35 @@
 // Use of this source code is governed by a BSD-style
 // license that can be found in the LICENSE file.
 
-#include 
-#include 
-#include 
-#include 
+#include 
+#include 
+
+// CPU_COUNT is only provided by glibc 2.6 or higher
+#if !defined(__GLIBC_PREREQ) || !__GLIBC_PREREQ(2, 6)
+#define CPU_COUNT(set) _CPU_COUNT((unsigned int *)(set), sizeof(*(set))/sizeof(unsigned int))
+static int _CPU_COUNT(unsigned int *set, size_t len) {
+   int cnt;
+
+   cnt = 0;
+   while (len--)
+   cnt += __builtin_popcount(*set++);
+   return cnt;
+}
+#endif
 
 #include "runtime.h"
 #include "defs.h"
 
-#ifndef O_CLOEXEC
-#define O_CLOEXEC 0
-#endif
-
 int32
 getproccount(void)
 {
-   int32 fd, rd, cnt, cpustrlen;
-   const char *cpustr;
-   const byte *pos;
-   byte *bufpos;
-   byte buf[256];
+   cpu_set_t set;
+   int32 r, cnt;
 
-   fd = open("/proc/stat", O_RDONLY|O_CLOEXEC, 0);
-   if(fd == -1)
-   return 1;
cnt = 0;
-   bufpos = buf;
-   cpustr = "\ncpu";
-   cpustrlen = strlen(cpustr);
-   for(;;) {
-   rd = read(fd, bufpos, sizeof(buf)-cpustrlen);
-   if(rd == -1)
-   break;
-   bufpos[rd] = 0;
-   for(pos=buf; (pos=(const byte*)strstr((const char*)pos, cpustr)) != nil; cnt++, pos++) {
-   }
-   if(rd < cpustrlen)
-   break;
-   memmove(buf, bufpos+rd-cpustrlen+1, cpustrlen-1);
-   bufpos = buf+cpustrlen-1;
-   }
-   close(fd);
+   r = sched_getaffinity(0, sizeof(set), &set);
+   if(r == 0)
+   cnt += CPU_COUNT(&set);
+
return cnt ? cnt : 1;
 }
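
The fallback counting used when CPU_COUNT is unavailable can be tried out on
its own.  This is a standalone sketch with made-up function names; it relies on
__builtin_popcount, which is a GCC/Clang builtin, and does not call
sched_getaffinity itself.

```c
#include <assert.h>
#include <stddef.h>

/* Count set bits across LEN words, the way the patch's fallback
   _CPU_COUNT does.  */
static int
count_set_bits (const unsigned int *words, size_t len)
{
  int cnt = 0;

  assert (words != NULL || len == 0);
  while (len--)
    cnt += __builtin_popcount (*words++);
  return cnt;
}

/* Never report zero CPUs, matching "return cnt ? cnt : 1" in the patch.  */
static int
at_least_one (int cnt)
{
  return cnt ? cnt : 1;
}
```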

Re: [PATCH] Intrinsics for ADCX

2012-08-09 Thread Michael Zolotukhin

Thanks!

On 9 August 2012 18:36, Kirill Yukhin  wrote:
>>
>> Ok.
>
> Checked in:
> http://gcc.gnu.org/ml/gcc-cvs/2012-08/msg00231.html
>
> Thanks, K


-- 
---
Best regards,
Michael V. Zolotukhin,
Software Engineer
Intel Corporation.

[PATCH] RDSEED-builtin Description Fix

2012-08-09 Thread Michael Zolotukhin

Hi,
Here is an obvious fix for a mistake in description of
__builtin_ia32_rdseed_di_step.
Bootstrap and rdseed-* tests are ok.

Ok for commit to trunk?

Changelog entry:
2012-08-10  Michael Zolotukhin  

* config/i386/i386.c (ix86_init_mmx_sse_builtins): Fix description of
__builtin_ia32_rdseed_di_step.


-- 
---
Best regards,
Michael V. Zolotukhin,
Software Engineer
Intel Corporation.


bdw-rdseed-fix.gcc.patch
Description: Binary data
