Re: [PATCH Atom][PR middle-end/44382] Tree reassociation improvement

2011-08-19 Thread Ilya Enkovich
Ping.

2011/8/10 Ilya Enkovich :
> Hello,
>
> Here is a new version of the patch. Changes from the previous version
> (http://gcc.gnu.org/ml/gcc-patches/2011-07/msg02240.html):
>  - updated to trunk
>  - TODO_remove_unused_locals flag was removed from todo_flags_finish
> of reassoc pass
>
> Bootstrapped and checked on x86_64-linux.
>
> Thanks,
> Ilya
> ---
> gcc/
>
> 2011-08-10  Enkovich Ilya  
>
>        PR middle-end/44382
>        * target.def (reassociation_width): New hook.
>
>        * doc/tm.texi.in (reassociation_width): Likewise.
>
>        * doc/tm.texi (reassociation_width): Likewise.
>
>        * doc/invoke.texi (tree-reassoc-width): New param documented.
>
>        * hooks.h (hook_int_uint_mode_1): New default hook.
>
>        * hooks.c (hook_int_uint_mode_1): Likewise.
>
>        * config/i386/i386.h (ix86_tune_indices): Add
>        X86_TUNE_REASSOC_INT_TO_PARALLEL and
>        X86_TUNE_REASSOC_FP_TO_PARALLEL.
>
>        (TARGET_REASSOC_INT_TO_PARALLEL): New.
>        (TARGET_REASSOC_FP_TO_PARALLEL): Likewise.
>
>        * config/i386/i386.c (initial_ix86_tune_features): Add
>        X86_TUNE_REASSOC_INT_TO_PARALLEL and
>        X86_TUNE_REASSOC_FP_TO_PARALLEL.
>
>        (ix86_reassociation_width) implementation of
>        new hook for i386 target.
>
>        * params.def (PARAM_TREE_REASSOC_WIDTH): New param added.
>
>        * tree-ssa-reassoc.c (get_required_cycles): New function.
>        (get_reassociation_width): Likewise.
>        (swap_ops_for_binary_stmt): Likewise.
>        (rewrite_expr_tree_parallel): Likewise.
>
>        (rewrite_expr_tree): Refactored. Part of code moved into
>        swap_ops_for_binary_stmt.
>
>        (reassociate_bb): Now checks reassociation width to be used
>        and call rewrite_expr_tree_parallel instead of rewrite_expr_tree
>        if needed.
>
> gcc/testsuite/
>
> 2011-08-10  Enkovich Ilya  
>
>        * gcc.dg/tree-ssa/pr38533.c (dg-options): Added option
>        --param tree-reassoc-width=1.
>
>        * gcc.dg/tree-ssa/reassoc-24.c: New test.
>        * gcc.dg/tree-ssa/reassoc-25.c: Likewise.
>
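
For readers new to the idea behind a reassociation width, here is a minimal sketch in plain C. It is not code from the patch, and the function names are made up for illustration. It writes the same four-operand sum once as a linear chain and once as a balanced tree; the tree form exposes two independent additions, which is the shape a width of 2 is meant to allow. Unsigned arithmetic is used so that both forms are guaranteed to give the same result.

```c
/* Hypothetical illustration only; not taken from the patch.  */

/* Linear chain: every addition depends on the previous one, so the
   three additions have to execute one after another.  */
unsigned int
sum_serial (unsigned int a, unsigned int b, unsigned int c, unsigned int d)
{
  return ((a + b) + c) + d;
}

/* Balanced tree: t1 and t2 do not depend on each other, so a target
   with two suitable execution units can compute them in parallel.  */
unsigned int
sum_parallel (unsigned int a, unsigned int b, unsigned int c, unsigned int d)
{
  unsigned int t1 = a + b;
  unsigned int t2 = c + d;
  return t1 + t2;
}
```

Whether the second form is actually faster depends on the target, which is presumably why the patch makes the width a target hook and a param.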


Re: C1X _Noreturn

2011-08-19 Thread Richard Sandiford
"Joseph S. Myers"  writes:
> As far as I know MIPS is no longer using the old SDE library and it is
> considered superseded by newlib,

Hadn't realised that.

> so perhaps that configuration (mips*-sde-elf* without newlib) should
> actually be deprecated/removed (and "mipssde" threads along with it).

Yeah, sounds like a good plan.

Richard


Re: PING: PATCH: PR target/46770: Use .init_array/.fini_array sections

2011-08-19 Thread Jakub Jelinek
Sorry for the delay.

> --- a/gcc/config.gcc
> +++ b/gcc/config.gcc
> @@ -186,6 +186,9 @@
>  #  configure_default_options
>  #Set to an initializer for configure_default_options
>  #in configargs.h, based on --with-cpu et cetera.
> +#
> +#  use_initfini_arrayIf set to yes, .init_array/.fini_array sections
> +#will be used if they work.
>  
>  # The following variables are used in each case-construct to build up the
>  # outgoing variables:
> @@ -238,6 +241,7 @@ default_gnu_indirect_function=no
>  target_gtfiles=
>  need_64bit_hwint=
>  need_64bit_isa=
> +use_initfini_array=yes

What is this for, when nothing ever sets it to anything but yes?
If the $enable_initfini_array = yes test works, then there shouldn't be
any need to override it on a per-target basis...

> --- /dev/null
> +++ b/gcc/config/initfini-array.h
> @@ -0,0 +1,44 @@
> +/* Definitions for ELF systems with .init_array/.fini_array section
> +   support.
> +   Copyright (C) 2011
> +   Free Software Foundation, Inc.
> +
> +   This file is part of GCC.
> +
> +   GCC is free software; you can redistribute it and/or modify it
> +   under the terms of the GNU General Public License as published
> +   by the Free Software Foundation; either version 3, or (at your
> +   option) any later version.
> +
> +   GCC is distributed in the hope that it will be useful, but WITHOUT
> +   ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
> +   or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public
> +   License for more details.
> +
> +   You should have received a copy of the GNU General Public License
> +   along with GCC; see the file COPYING3.  If not see
> +   <http://www.gnu.org/licenses/>.  */
> +
> +#define USE_INITFINI_ARRAY
> +
> +#undef INIT_SECTION_ASM_OP
> +#undef FINI_SECTION_ASM_OP
> +
> +/* FIXME: INIT_ARRAY_SECTION_ASM_OP and FINI_ARRAY_SECTION_ASM_OP
> +   aren't used in any assembly codes.  But we have to define
> +   them to something.  */
> +#define INIT_ARRAY_SECTION_ASM_OP Something
> +#define FINI_ARRAY_SECTION_ASM_OP Something

Can't you just define it to an empty string?  Also, a couple of targets
already define INIT_ARRAY_SECTION_ASM_OP/FINI_ARRAY_SECTION_ASM_OP, so you
either need to undef it first, or define it only if it wasn't defined.
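
A minimal sketch of the "define only if it wasn't defined" variant, using the empty string suggested above. This is an illustration, not part of the posted patch; whether an empty string is acceptable everywhere these macros are tested is an assumption.

```c
/* Sketch only: leave a target's own definition alone and fall back
   to an empty string otherwise.  */
#ifndef INIT_ARRAY_SECTION_ASM_OP
#define INIT_ARRAY_SECTION_ASM_OP ""
#endif

#ifndef FINI_ARRAY_SECTION_ASM_OP
#define FINI_ARRAY_SECTION_ASM_OP ""
#endif
```
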
> +
> +#ifndef TARGET_ASM_INIT_SECTIONS
> +#define TARGET_ASM_INIT_SECTIONS default_elf_initfini_array_init_sections
> +#endif
> +extern void default_elf_initfini_array_init_sections (void);

Why do you need this (and the default_initfini_array_init_sections () call
in all the backends)?  Isn't it easier to just initialize the two global
vars only when you are actually going to use them (if they are still NULL)?

> --- a/gcc/varasm.c
> +++ b/gcc/varasm.c
> @@ -7350,4 +7350,62 @@ make_debug_expr_from_rtl (const_rtx exp)
>return dval;
>  }
>  
> +static GTY(()) section *elf_init_array_section;
> +static GTY(()) section *elf_fini_array_section;
> +
> +void
> +default_elf_initfini_array_init_sections (void)
> +{
> +  elf_init_array_section = get_unnamed_section (0, output_section_asm_op,
> + "\t.section\t.init_array");
> +  elf_fini_array_section = get_unnamed_section (0, output_section_asm_op,
> + "\t.section\t.fini_array");
> +}

Remove above function.

> +
> +static section *
> +get_elf_initfini_array_priority_section (int priority,
> +  bool constructor_p)
> +{
> +  section *sec;
> +  if (priority != DEFAULT_INIT_PRIORITY)
> +{
> +  char buf[18];
> +  sprintf (buf, "%s.%.5u", 
> +constructor_p ? ".init_array" : ".fini_array",
> +priority);
> +  sec = get_section (buf, SECTION_WRITE, NULL_TREE);
> +}

I'd just put here
  else
    {
      if (elf_init_array_section == NULL)
        elf_init_array_section = get_unnamed_section...
      if (elf_fini_array_section == NULL)
        elf_fini_array_section = get_unnamed_section...
> +    sec = constructor_p ? elf_init_array_section : elf_fini_array_section;
    }

> +void
> +default_initfini_array_init_sections (void)
> +{
> +#ifdef USE_INITFINI_ARRAY
> +  default_elf_initfini_array_init_sections ();
> +#endif
> +}

And remove this (and all callers etc.).

On which targets has it been tested?  Would be nice to test it at least on
targets that define their own INIT_ARRAY_SECTION_ASM_OP (pa64-hpux, arm,
m32c, rx) and on {i?86,x86_64,ia64}-linux and some solaris target.

Jakub
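
The lazy initialization suggested in this review can be sketched outside GCC as follows. Everything here is a hypothetical stand-in: `struct section`, `make_section` and `get_initfini_array_section` are made-up names, and the real code would call `get_unnamed_section` instead. The sketch only shows the pattern of creating the two objects on first use instead of from a separate init hook.

```c
#include <stddef.h>

/* Hypothetical stand-in for GCC's section object.  */
struct section
{
  const char *name;
};

struct section init_array_storage = { ".init_array" };
struct section fini_array_storage = { ".fini_array" };

/* Both start out NULL and are filled in on first use.  */
struct section *elf_init_array_section;
struct section *elf_fini_array_section;

/* Counts how often a section object is created, to show that the
   creation happens only once per section.  */
int sections_created;

/* Hypothetical replacement for the real section constructor.  */
struct section *
make_section (struct section *storage)
{
  sections_created++;
  return storage;
}

/* Return the default-priority section, creating it if necessary.  */
struct section *
get_initfini_array_section (int constructor_p)
{
  if (elf_init_array_section == NULL)
    elf_init_array_section = make_section (&init_array_storage);
  if (elf_fini_array_section == NULL)
    elf_fini_array_section = make_section (&fini_array_storage);
  return constructor_p ? elf_init_array_section : elf_fini_array_section;
}
```

With this shape no backend has to call an init function up front, which is the simplification the review asks for.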


Re: [PATCH][RFC] Fix sizetype "sign" checks

2011-08-19 Thread Richard Guenther
On Fri, 19 Aug 2011, Eric Botcazou wrote:

> > Looking at the Ada case I believe this happens because
> > Ada has negative DECL_FIELD_OFFSET values (but that's
> > again in sizetype, not ssizetype)?  Other host_integerp
> > uses in Ada operate on sizes where I hope those are
> > never negative ;)
> 
> Yes, the Ada compiler uses negative offsets for some peculiar constructs.
> Nothing to do with the language per se, but with mechanisms implemented in 
> gigi to support some features of the language.
> 
> > Eric, any better way of fixing this or would you be fine with this patch?
> 
> Hard to say without seeing the complete patch and playing a little with it.

This is the "complete" patch I am currently playing with; unfortunately, Ada
bootstrap still fails for me.  Bootstrap for all other languages
succeeds, but there are some regressions, mostly warning-related.

Any help with pinpointing the Ada problem is welcome.

Thanks,
Richard.

2011-06-16  Richard Guenther  

* fold-const.c (div_if_zero_remainder): sizetypes no longer
sign-extend.
* stor-layout.c (initialize_sizetypes): Likewise.
* tree-ssa-ccp.c (bit_value_unop_1): Likewise.
(bit_value_binop_1): Likewise.
* tree.c (double_int_to_tree): Likewise.
(double_int_fits_to_tree_p): Likewise.
(force_fit_type_double): Likewise.
(host_integerp): Likewise.
(int_fits_type_p): Likewise.
* tree-cfg.c (verify_types_in_gimple_reference): Do not compare
sizes by pointer.

Index: trunk/gcc/fold-const.c
===
*** trunk.orig/gcc/fold-const.c 2011-08-18 13:41:00.0 +0200
--- trunk/gcc/fold-const.c  2011-08-18 14:38:31.0 +0200
*** div_if_zero_remainder (enum tree_code co
*** 194,202 
   does the correct thing for POINTER_PLUS_EXPR where we want
   a signed division.  */
uns = TYPE_UNSIGNED (TREE_TYPE (arg2));
-   if (TREE_CODE (TREE_TYPE (arg2)) == INTEGER_TYPE
-   && TYPE_IS_SIZETYPE (TREE_TYPE (arg2)))
- uns = false;
  
quo = double_int_divmod (tree_to_double_int (arg1),
   tree_to_double_int (arg2),
--- 194,199 
*** int_binop_types_match_p (enum tree_code
*** 938,945 
 to produce a new constant.  Return NULL_TREE if we don't know how
 to evaluate CODE at compile-time.  */
  
! tree
! int_const_binop (enum tree_code code, const_tree arg1, const_tree arg2)
  {
double_int op1, op2, res, tmp;
tree t;
--- 935,943 
 to produce a new constant.  Return NULL_TREE if we don't know how
 to evaluate CODE at compile-time.  */
  
! static tree
! int_const_binop_1 (enum tree_code code, const_tree arg1, const_tree arg2,
!  int overflowable)
  {
double_int op1, op2, res, tmp;
tree t;
*** int_const_binop (enum tree_code code, co
*** 1081,1093 
return NULL_TREE;
  }
  
!   t = force_fit_type_double (TREE_TYPE (arg1), res, 1,
 ((!uns || is_sizetype) && overflow)
 | TREE_OVERFLOW (arg1) | TREE_OVERFLOW (arg2));
  
return t;
  }
  
  /* Combine two constants ARG1 and ARG2 under operation CODE to produce a new
 constant.  We assume ARG1 and ARG2 have the same data type, or at least
 are the same kind of constant and the same machine mode.  Return zero if
--- 1079,1097 
return NULL_TREE;
  }
  
!   t = force_fit_type_double (TREE_TYPE (arg1), res, overflowable,
 ((!uns || is_sizetype) && overflow)
 | TREE_OVERFLOW (arg1) | TREE_OVERFLOW (arg2));
  
return t;
  }
  
+ tree
+ int_const_binop (enum tree_code code, const_tree arg1, const_tree arg2)
+ {
+   return int_const_binop_1 (code, arg1, arg2, 1);
+ }
+ 
  /* Combine two constants ARG1 and ARG2 under operation CODE to produce a new
 constant.  We assume ARG1 and ARG2 have the same data type, or at least
 are the same kind of constant and the same machine mode.  Return zero if
*** size_binop_loc (location_t loc, enum tre
*** 1448,1455 
return arg1;
}
  
!   /* Handle general case of two integer constants.  */
!   return int_const_binop (code, arg0, arg1);
  }
  
return fold_build2_loc (loc, code, type, arg0, arg1);
--- 1452,1461 
return arg1;
}
  
!   /* Handle general case of two integer constants.  For sizetype
!  constant calculations we always want to know about overflow,
!even in the unsigned case.  */
!   return int_const_binop_1 (code, arg0, arg1, -1);
  }
  
return fold_build2_loc (loc, code, type, arg0, arg1);
Index: trunk/gcc/stor-layout.c
===
*** trunk.orig/gcc/stor-layout.c2011-08-18 13:40:55.0 +0200
--- trunk/gcc/stor-layout.c 2011-08-18 13:43:39.0 

Re: [cxx-mem-model] Atomic C++ header file changes

2011-08-19 Thread Torvald Riegel
On Wed, 2011-08-17 at 11:39 -0400, Andrew MacLeod wrote:
> Turns out, C++ will allow you to specify the memory model as a variable 
> of type enum memory_order... WTF?  I would expect that to be pretty 
> uncommon, and in order to get that right, we'd need a switch statement 
> and call the appropriate __sync_mem* routine with the appropriate 
> constant parameter.
> 
> That would be quite ugly, and you get what you deserve if you do that.   
> I changed the builtins so that if you don't specify a compile-time
> constant in the memory model parameter, it will simply default to 
> __SYNC_MEM_SEQ_CST, which will always be safe.  That is standard 
> compliant (verified), and if anyone is really unhappy about it, then the 
> c++ headers can be really uglified by adding a bunch of switch 
> statements to handle this twisted case.

IMHO this behavior should be documented so that users will be aware of
it, and it would be best if this would raise a warning. Note that I also
cannot see any reason why a programmer might want to make barriers
runtime-configurable, but silently adding overhead (perhaps the
parameter was supposed to be a constant, but wasn't?) can lead to more
confusion than necessary.

Torvald
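
For illustration, the switch-based dispatch mentioned in the quoted text could look roughly like the sketch below. It uses C11 `<stdatomic.h>` rather than the `__sync_mem*` builtins under discussion, and `load_with_runtime_order` is a made-up name. Each case passes a compile-time constant to the underlying operation, and anything unexpected falls back to sequential consistency, mirroring the default described above.

```c
#include <stdatomic.h>

/* Sketch only: map a runtime memory order onto calls that each use a
   compile-time constant order.  */
int
load_with_runtime_order (atomic_int *obj, memory_order order)
{
  switch (order)
    {
    case memory_order_relaxed:
      return atomic_load_explicit (obj, memory_order_relaxed);
    case memory_order_consume:
      return atomic_load_explicit (obj, memory_order_consume);
    case memory_order_acquire:
      return atomic_load_explicit (obj, memory_order_acquire);
    default:
      /* Orders that are not valid for a load, and seq_cst itself,
         all use the strongest ordering.  */
      return atomic_load_explicit (obj, memory_order_seq_cst);
    }
}
```

Repeating this for every operation is the "uglification" of the headers that the thread wants to avoid.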



Re: [PATCH, testsuite, i386] AVX2 support for GCC

2011-08-19 Thread Kirill Yukhin
It was checked in by HJ
http://gcc.gnu.org/viewcvs?view=revision&revision=177876

I am testing the next patch.

Thanks, K


On Thu, Aug 11, 2011 at 1:16 PM, Kirill Yukhin  wrote:
> Hi Uros,
> Thanks for your patience in reviewing my English :) and for finding a bug in the sources.
>
> Updated patch is attached. It was bootstrapped successfully.
>
> updated ChangeLog entry:
>
> 2011-08-11  Kirill Yukhin  
>
>        * common/config/i386/i386-common.c (OPTION_MASK_ISA_AVX2_SET): New.
>        (OPTION_MASK_ISA_AVX_UNSET): Update.
>        (OPTION_MASK_ISA_AVX2_UNSET): New.
>        (ix86_handle_option): Handle OPT_mavx2 case.
>        * config/i386/cpuid.h (bit_AVX2): New.
>        * config/i386/driver-i386.c (host_detect_local_cpu): Detect
>        AVX2 feature.
>        * config/i386/i386-c.c (ix86_target_macros_internal):
>        Conditionally define __AVX2__.
>        * config/i386/i386.c (ix86_option_override_internal): Define
>        PTA_AVX2.  Define "core-avx2" processor alias.  Handle avx2
>        option.
>        (ix86_valid_target_attribute_inner_p): Handle avx2 option.
>        * config/i386/i386.h (TARGET_AVX2): New.
>        * config/i386/i386.opt (mavx2): New.
>        * doc/invoke.texi: Document -mavx2.
>
>
> It seems that from now on we have to wait for the int64 patch to be
> approved and committed to trunk.
>
> --
> Thanks, K
>
> On Wed, Aug 10, 2011 at 11:41 PM, Uros Bizjak  wrote:
>> On Wed, Aug 10, 2011 at 9:39 PM, Uros Bizjak  wrote:
>>
>>> diff --git a/gcc/common/config/i386/i386-common.c
>>> b/gcc/common/config/i386/i386-common.c
>>> index 1fd33bd..1e0ca5e 100644
>>> --- a/gcc/common/config/i386/i386-common.c
>>> +++ b/gcc/common/config/i386/i386-common.c
>>> @@ -52,6 +52,8 @@ along with GCC; see the file COPYING3.  If not see
>>>   (OPTION_MASK_ISA_AVX | OPTION_MASK_ISA_SSE4_2_SET)
>>>  #define OPTION_MASK_ISA_FMA_SET \
>>>   (OPTION_MASK_ISA_FMA | OPTION_MASK_ISA_AVX_SET)
>>> +#define OPTION_MASK_ISA_AVX2_SET \
>>> +  (OPTION_MASK_ISA_AVX2 | OPTION_MASK_ISA_AVX_SET)
>>>
>>>  /* SSE4 includes both SSE4.1 and SSE4.2. -msse4 should be the same
>>>    as -msse4.2.  */
>>> @@ -114,8 +116,10 @@ along with GCC; see the file COPYING3.  If not see
>>>   (OPTION_MASK_ISA_SSE4_2 | OPTION_MASK_ISA_AVX_UNSET )
>>>  #define OPTION_MASK_ISA_AVX_UNSET \
>>>   (OPTION_MASK_ISA_AVX | OPTION_MASK_ISA_FMA_UNSET \
>>> -   | OPTION_MASK_ISA_FMA4_UNSET | OPTION_MASK_ISA_F16C_UNSET)
>>> +   | OPTION_MASK_ISA_FMA4_UNSET | OPTION_MASK_ISA_F16C_UNSET \
>>> +   | OPTION_MASK_ISA_AVX)
>>>
>>> OPTION_MASK_ISA_AVX2
>>
>> Hrm, OPTION_MASK_ISA_AVX2_UNSET.
>>
>> Uros.
>>
>


Re: [4.7][google]Support for getting CPU type and feature information at run-time. (issue4893046)

2011-08-19 Thread Richard Guenther
On Fri, Aug 19, 2011 at 12:08 AM, Richard Henderson  wrote:
> On 08/18/2011 02:51 PM, Sriraman Tallam wrote:
>> Oh!, right, sorry. So, the only available option now is to mark it as
>> a constructor in libgcc.
>
> Or call it explicitly from the out-of-line tests.
>
> The thing is, if you intend to use this from ifunc tests, I believe
> that these can run *extremely* early.  E.g. LD_BIND_NOW=1 will run
> these while relocating the entire application, and therefore before
> any of DT_INIT (aka .ctors), DT_INIT_ARRAY, or DT_PREINIT_ARRAY.

So make sure that __cpu_indicator initially has a conservative correct
value?  I'd still prefer the constructor-in-libgcc option - if only because
then the compiler-side is much simplified.

Richard.

>
> r~
>


Re: [4.7][google]Support for getting CPU type and feature information at run-time. (issue4893046)

2011-08-19 Thread Jakub Jelinek
On Fri, Aug 19, 2011 at 11:04:11AM +0200, Richard Guenther wrote:
> On Fri, Aug 19, 2011 at 12:08 AM, Richard Henderson  wrote:
> > On 08/18/2011 02:51 PM, Sriraman Tallam wrote:
> >> Oh!, right, sorry. So, the only available option now is to mark it as
> >> a constructor in libgcc.
> >
> > Or call it explicitly from the out-of-line tests.
> >
> > The thing is, if you intend to use this from ifunc tests, I believe
> > that these can run *extremely* early.  E.g. LD_BIND_NOW=1 will run
> > these while relocating the entire application, and therefore before
> > any of DT_INIT (aka .ctors), DT_INIT_ARRAY, or DT_PREINIT_ARRAY.
> 
> So make sure that __cpu_indicator initially has a conservative correct
> value?  I'd still prefer the constructor-in-libgcc option - if only because
> then the compiler-side is much simplified.

Note that exporting data from shared libraries and using those in binaries
often leads to copy relocations (which are possibly still not applied when
calling IFUNC functions with LD_BIND_NOW=1).  Similarly calling a function
in a different shared library might be a problem from IFUNC handler.

Jakub


Re: [Patch, Fortran, OOP] PR 49638: [OOP] length parameter is ignored when overriding type bound character functions with constant length.

2011-08-19 Thread Janus Weil
Ping! (Maybe I should have posted the follow-up patch in a separate
thread to make it more visible.)




2011/8/13 Janus Weil :
> Hi Thomas, hi all,
>
> 2011/8/7 Thomas Koenig :
>> When extending the values of gfc_dep_compare_expr, we will need to go
>> through all its uses (making sure we change == -2 to <= -2).
>
> attached is a patch which makes a start with this.
>
> For now, it changes the return value to "-3" for two cases:
> 1) different expr_types
> 2) non-identical variables
>
> I tried to take care of all places which are checking for a return
> value of "-2" and I hope I missed none.
>
> Any objections or ok for trunk? (Regtested successfully.)
>
> Cheers,
> Janus
>
>
> 2011-08-13  Janus Weil  
>
>        PR fortran/49638
>        * dependency.c (gfc_dep_compare_expr): Add new result value "-3".
>        (gfc_check_element_vs_section,gfc_check_element_vs_element): Handle
>        result value "-3".
>        * frontend-passes.c (optimize_comparison): Ditto.
>        * interface.c (gfc_check_typebound_override): Ditto.
>
>
> 2011-08-13  Janus Weil  
>
>        PR fortran/49638
>        * gfortran.dg/typebound_override_1.f90: Modified.
>


Re: [PATCH, ARM] Generate conditional compares in Thumb2 state

2011-08-19 Thread Ramana Radhakrishnan
>
> Regression test against cortex-M0/M3/M4 profile with "-mthumb" option
> doesn't show any new failures.

Please test on ARM state as well and make sure there are no
regressions before committing.

Ok if no regressions.

Ramana

>
> Thanks,
> -Jiangning


Re: PATCH: Change ix86_isa_flags to HOST_WIDE_INT

2011-08-19 Thread Joseph S. Myers
One of these patches appears to have broken bootstrap on
x86_64-unknown-linux-gnu with Ada enabled:

In file included from ../../tm.h:19:0,
 from targext.c:48:
../../options.h:3533:3: error: unknown type name 'HOST_WIDE_INT'
../../options.h:3534:3: error: unknown type name 'HOST_WIDE_INT'

I've applied this patch to fix it.

2011-08-19  Joseph Myers  

* opth-gen.awk: Do not declare target save/restore structures and
functions if IN_RTS defined.

Index: opth-gen.awk
===
--- opth-gen.awk(revision 177893)
+++ opth-gen.awk(working copy)
@@ -127,7 +127,7 @@
 # Also, order the structure so that pointer fields occur first, then int
 # fields, and then char fields to provide the best packing.
 
-print "#if !defined(IN_LIBGCC2) && !defined(IN_TARGET_LIBS)"
-print "#if !defined(IN_LIBGCC2) && !defined(IN_TARGET_LIBS)"
+print "#if !defined(IN_LIBGCC2) && !defined(IN_TARGET_LIBS) && !defined(IN_RTS)"
 print ""
 print "/* Structure to save/restore optimization and target specific options.  */";
 print "struct GTY(()) cl_optimization";

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [c6x] Fix libgcc/soft-fp move fallout

2011-08-19 Thread Rainer Orth
Richard,

> Rainer, I know you're in the middle of libgcc2_extras, so I
> don't want to commit something that conflicts and may well
> be out-of-date any minute.  But these c6x bits got missed
> after the soft-fp move.

yep, c6x support got in while the soft-fp patch was almost ready.  I've
incorporated basic support, but obviously have missed this part.

> Please decide among you whether this patch should be applied
> in the meantime, or whether it should wait until your other
> bulk libgcc patches get applied.

Those files only get moved over unchanged in the libgcc2 patch, so it's
no problem for me if they change now.  Please go ahead with the patch.

Thanks.
Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


Re: [Patch, Fortran, OOP] PR 49638: [OOP] length parameter is ignored when overriding type bound character functions with constant length.

2011-08-19 Thread Mikael Morin
On Friday 19 August 2011 12:05:02 Janus Weil wrote:
> Ping! (Maybe I should have posted the follow-up patch in a separate
> thread to make it more visible.)
I saw it, had a quick glance, thought that Thomas would jump on it, and 
forgot. Sorry.

> 
> 2011/8/13 Janus Weil :
> > Hi Thomas, hi all,
> > 
> > 2011/8/7 Thomas Koenig :
> >> When extending the values of gfc_dep_compare_expr, we will need to go
> >> through all its uses (making sure we change == -2 to <= -2).
> > 
> > attached is a patch which makes a start with this.
> > 
> > For now, it changes the return value to "-3" for two cases:
> > 1) different expr_types
> > 2) non-identical variables
> > 
> > I tried to take care of all places which are checking for a return
> > value of "-2" and I hope I missed none.
> > 
> > Any objections or ok for trunk? (Regtested successfully.)
OK from my side for the code proper.

I have one comment though about this:
+/* Compare two expressions.  Return values:
+   * +1 if e1 > e2
+   * 0 if e1 == e2
+   * -1 if e1 < e2
+   * -2 if the relationship could not be determined
+   * -3 if e1 /= e2, but we cannot tell which one is larger.  */

I think this is misleading, as the function does not always return -3 when 
e1/=e2. There is for example (currently) no special handling for operators.
Here is an attempt at expressing it:
  * -3 in some cases where we could determine that e1 and e2 have different 
data dependencies (and thus are not guaranteed to have always the same value), 
but we cannot tell whether one is greater than the other.

Mikael
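
To make the calling convention under discussion concrete, here is a toy sketch. It is not gfortran code: `struct expr`, `toy_compare_expr` and `relationship_unknown` are made-up names, and the rules are simplified to the two "-3" cases listed in the patch description (different kinds, non-identical variables). It also shows why callers that used to test `== -2` now need `<= -2`.

```c
/* Toy model only; the real function works on gfc_expr trees.  */
enum expr_kind { KIND_CONSTANT, KIND_VARIABLE, KIND_OTHER };

struct expr
{
  enum expr_kind kind;
  int value;   /* Used when kind == KIND_CONSTANT.  */
  int var_id;  /* Used when kind == KIND_VARIABLE.  */
};

/* Return +1, 0 or -1 when the ordering is known, -2 when nothing can
   be determined, and -3 in the cases where the toy model can tell the
   two expressions apart but cannot order them.  */
int
toy_compare_expr (const struct expr *e1, const struct expr *e2)
{
  if (e1->kind != e2->kind)
    return -3;

  switch (e1->kind)
    {
    case KIND_CONSTANT:
      return (e1->value > e2->value) - (e1->value < e2->value);
    case KIND_VARIABLE:
      return e1->var_id == e2->var_id ? 0 : -3;
    default:
      return -2;
    }
}

/* Callers that only care whether an ordering is known must accept
   both negative "unknown" codes, hence <= -2 instead of == -2.  */
int
relationship_unknown (int result)
{
  return result <= -2;
}
```

As the comment above points out, a -3 from the real function is only returned in some of the cases where the expressions differ, so its absence proves nothing.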



Re: [Patch, fortran] PR fortran/50050 out of bounds whilst freeing an allocate-object.

2011-08-19 Thread Dominique Dhumieres
I have regstrapped several times with the patch, without regressions or failures on
my own tests.
Could someone review the patch?

Dominique


Re: [commit, spu] Improve address generation for large stack frames

2011-08-19 Thread Ulrich Weigand
Richard Henderson wrote:
> On 08/16/2011 11:35 AM, Ulrich Weigand wrote:
> > +   /* Reload the displacement.  */
> > +   push_reload (XEXP (ad, 1), NULL_RTX, &XEXP (ad, 1), NULL,
> > +  BASE_REG_CLASS, GET_MODE (ad), VOIDmode, 0, 0,
> > +  opnum, (enum reload_type) type);
> 
> Are you sure you want to reload it this way, and not
> 
>   (plus (plus base const-large) const-small)
> 
> ?  If you push that inner reload, it seems like it would be 
> sharable/cse-able with other variable references within the block.

Well, yes, but I don't have an instruction to implement the
inner plus (SPU add immediate is restricted to +/- 512), so
I'd still have to reload the large constant as well and would
get another instruction.

This results in actually worse code if no sharing ends up
possible, so I'm not sure if it's worth it overall ...

Bye,
Ulrich
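
The arithmetic behind the `(plus (plus base const-large) const-small)` form can be sketched as below. This is an illustration only: `split_displacement` is a made-up helper, and the small part is assumed to have to fit a signed range of [-512, 511], based on the "+/- 512" restriction mentioned above. The large part is then a multiple of 1024, so nearby stack slots would share it.

```c
/* Sketch only: split DISP into a large part that is a multiple of
   1024 and a small part in the assumed range [-512, 511].  */
void
split_displacement (long disp, long *large, long *small)
{
  /* Shift by 512, reduce modulo 1024 into [0, 1023], shift back.  */
  long lo = ((disp % 1024) + 1024 + 512) % 1024 - 512;

  *small = lo;
  *large = disp - lo;
}
```

As the reply notes, the large part still has to be loaded into a register on SPU, so the split only pays off when that load is actually shared.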

-- 
  Dr. Ulrich Weigand
  GNU Toolchain for Linux on System z and Cell BE
  ulrich.weig...@de.ibm.com


[PATCH] Some data-dep testcases

2011-08-19 Thread Richard Guenther

While working on PR50067, some local changes of mine made these testcases fail.

Tested on x86_64-unknown-linux-gnu, applied.

Richard.

2011-08-19  Richard Guenther  

* gcc.dg/torture/pr50067-1.c: New testcase.
* gcc.dg/torture/pr50067-2.c: Likewise.

Index: testsuite/gcc.dg/torture/pr50067-1.c
===
--- testsuite/gcc.dg/torture/pr50067-1.c(revision 0)
+++ testsuite/gcc.dg/torture/pr50067-1.c(revision 0)
@@ -0,0 +1,18 @@
+/* { dg-do run } */
+
+/* Make sure data-dependence analysis does not compute a bogus
+   distance vector for the different sized accesses.  */
+
+extern int memcmp(const void *, const void *, __SIZE_TYPE__);
+extern void abort (void);
+short a[32] = { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31 };
+short b[32] = { 4, 0, 5, 0, 6, 0, 7, 0, 8, 0, };
+int main()
+{
+  int i;
+  for (i = 0; i < 32; ++i)
+(*((unsigned short(*)[32])&a[0]))[i] = (*((char(*)[32])&a[0]))[i+8];
+  if (memcmp (&a, &b, sizeof (a)) != 0)
+abort ();
+  return 0;
+}
Index: testsuite/gcc.dg/torture/pr50067-2.c
===
--- testsuite/gcc.dg/torture/pr50067-2.c(revision 0)
+++ testsuite/gcc.dg/torture/pr50067-2.c(revision 0)
@@ -0,0 +1,20 @@
+/* { dg-do run } */
+
+/* Make sure data-dependence analysis does not compute a bogus
+  distance vector for the different sized accesses.  */
+
+extern int memcmp(const void *, const void *, __SIZE_TYPE__);
+extern void abort (void);
+short a[32] = { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31 };
+short b[32] = { 4, 0, 5, 0, 6, 0, 7, 0, 8, 0, };
+int main()
+{
+  int i;
+  for (i = 0; i < 32; ++i)
+{
+  a[i] = (*((char(*)[32])&a[0]))[i+8];
+}
+  if (memcmp (&a, &b, sizeof (a)) != 0)
+abort ();
+  return 0;
+}


[PATCH][1/2] Fixup dr_analyze_indices, fix PR50067

2011-08-19 Thread Richard Guenther

This is a first piece of dr_analyze_indices TLC, simplifying
it (no INDIRECT_REFs anymore) and fixing one apparent bug
(but not PR50067 yet): we strip the MEM_REF offset even
if we didn't account for it (if !nest).

Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

Richard.

2011-08-19  Richard Guenther  

PR tree-optimization/50067
* tree-data-ref.c (dr_analyze_indices): Simplify, strip MEM_REF
offset only if we accounted for it.

Index: gcc/tree-data-ref.c
===
--- gcc/tree-data-ref.c (revision 177894)
+++ gcc/tree-data-ref.c (working copy)
@@ -863,17 +863,20 @@ dr_analyze_indices (struct data_referenc
 }
 
   if (nest
-  && (INDIRECT_REF_P (aref)
- || TREE_CODE (aref) == MEM_REF))
+  && TREE_CODE (aref) == MEM_REF)
 {
   op = TREE_OPERAND (aref, 0);
   access_fn = analyze_scalar_evolution (loop, op);
   access_fn = instantiate_scev (before_loop, loop, access_fn);
   base = initial_condition (access_fn);
   split_constant_offset (base, &base, &off);
-  if (TREE_CODE (aref) == MEM_REF)
-   off = size_binop (PLUS_EXPR, off,
- fold_convert (ssizetype, TREE_OPERAND (aref, 1)));
+  if (!integer_zerop (TREE_OPERAND (aref, 1)))
+   {
+ off = size_binop (PLUS_EXPR, off,
+   fold_convert (ssizetype, TREE_OPERAND (aref, 1)));
+ TREE_OPERAND (aref, 1)
+   = build_int_cst (TREE_TYPE (TREE_OPERAND (aref, 1)), 0);
+   }
   access_fn = chrec_replace_initial_condition (access_fn,
fold_convert (TREE_TYPE (base), off));
 
@@ -881,10 +884,6 @@ dr_analyze_indices (struct data_referenc
   VEC_safe_push (tree, heap, access_fns, access_fn);
 }
 
-  if (TREE_CODE (aref) == MEM_REF)
-TREE_OPERAND (aref, 1)
-  = build_int_cst (TREE_TYPE (TREE_OPERAND (aref, 1)), 0);
-
   if (TREE_CODE (ref) == MEM_REF
   && TREE_CODE (TREE_OPERAND (ref, 0)) == ADDR_EXPR
   && integer_zerop (TREE_OPERAND (ref, 1)))


Re: [cxx-mem-model] Atomic C++ header file changes

2011-08-19 Thread Andrew MacLeod

On 08/19/2011 04:57 AM, Torvald Riegel wrote:
> On Wed, 2011-08-17 at 11:39 -0400, Andrew MacLeod wrote:
>> That would be quite ugly, and you get what you deserve if you do that.
>> I changed the builtins so that if you don't specify a compile-time
>> constant in the memory model parameter, it will simply default to
>> __SYNC_MEM_SEQ_CST, which will always be safe.  That is standard
>> compliant (verified), and if anyone is really unhappy about it, then the
>> c++ headers can be really uglified by adding a bunch of switch
>> statements to handle this twisted case.
>
> IMHO this behavior should be documented so that users will be aware of
> it, and it would be best if this would raise a warning. Note that I also
> cannot see any reason why a programmer might want to make barriers
> runtime-configurable, but silently adding overhead (perhaps the
> parameter was supposed to be a constant, but wasn't?) can lead to more
> confusion than necessary.



The problem with issuing a warning is that any time the compiler creates
a C++ atomic class and you use a method with a memory order, it usually
leaves an externally callable method which has to take a runtime
value, so you'd see the warning on basically every compilation,
which in turn defeats the purpose of the warning.


Andrew




Re: [PATCH, testsuite, i386] BMI2 support for GCC

2011-08-19 Thread H.J. Lu
On Fri, Aug 19, 2011 at 2:23 AM, Kirill Yukhin  wrote:
> Hi guys,
> I've prepared a patch which enables BMI2 extensions in GCC.
>
> It conforms (hopefully) to the spec, which can be found at [1].
>
> I am attaching the following files:
>  - bmi2.gcc.patch. Bunch of changes to GCC
>  - ChangeLog. Entry for ChangeLog in GCC's root directory
>  - ChangeLog.testsuite. Entry for ChangeLog in GCC's test suite
>
> Bootstrap passes.
> make check shows no new failures, and my new compile-time tests pass.
> make check under the simulator passes all my new tests.
>
> Is it OK for trunk?
>
> [1] - http://software.intel.com/file/36945
>
> Thanks, K
>

Incorrect format:

+ && CONST_INT_P (src2) ) {
+/* We generatin RORX instruction, freedom of register +
+  flags not affected  */
+   insn = op;
+  } else {
+   clob = gen_rtx_CLOBBER (VOIDmode, gen_rtx_REG (CCmode, FLAGS_REG));
+   insn = gen_rtx_PARALLEL (VOIDmode, gen_rtvec (2, op, clob));
+  }

+{
+  if (can_create_pseudo_p () && mode != SImode) {
+rtx tmp = gen_rtx_REG (mode, 0);
+emit_insn (gen_extendsidi2 (tmp, operands[2]));
+operands[2] = tmp;
+  }


-- 
H.J.


[PATCH][2/2] Fixup dr_analyze_indices, fix PR50067

2011-08-19 Thread Richard Guenther

This is the fix for the testcase in PR50067.  We strip outermost
(yes, outermost only, which makes it very inefficient) MEM_REFs,
which causes the DR base objects in the PR to agree for two
conflicting DRs; because of the issues we have with how we
compose access functions, they still get disambiguated.

Bootstrap and regtest pending on x86_64-unknown-linux-gnu.

Richard.

2011-08-19  Richard Guenther  

PR tree-optimization/50067
* tree-data-ref.c (dr_analyze_indices): Do not strip
outermost MEM_REF off its ADDR_EXPR operand.

* gcc.dg/torture/pr50067-3.c: New testcase.

Index: gcc/tree-data-ref.c
===
*** gcc/tree-data-ref.c (revision 177895)
--- gcc/tree-data-ref.c (working copy)
*** dr_analyze_indices (struct data_referenc
*** 885,895 
  TREE_OPERAND (aref, 1)
= build_int_cst (TREE_TYPE (TREE_OPERAND (aref, 1)), 0);
  
-   if (TREE_CODE (ref) == MEM_REF
-   && TREE_CODE (TREE_OPERAND (ref, 0)) == ADDR_EXPR
-   && integer_zerop (TREE_OPERAND (ref, 1)))
- ref = TREE_OPERAND (TREE_OPERAND (ref, 0), 0);
- 
/* For canonicalization purposes we'd like to strip all outermost
   zero-offset component-refs.
   ???  For now simply handle zero-index array-refs.  */
--- 885,890 
Index: gcc/testsuite/gcc.dg/torture/pr50067-3.c
===
*** gcc/testsuite/gcc.dg/torture/pr50067-3.c(revision 0)
--- gcc/testsuite/gcc.dg/torture/pr50067-3.c(revision 0)
***
*** 0 
--- 1,20 
+ /* { dg-do run } */
+ /* { dg-options "-fpredictive-commoning" } */
+ 
+ extern void abort (void);
+ int a[6] = { 0, 0, 0, 0, 7, 0 };
+ static int *p = &a[4];
+ 
+ int
+ main ()
+ {
+   int i;
+   for (i = 0; i < 4; ++i)
+ {
+   a[i + 1] = a[i + 2] > i;
+   *p &= ~1;
+ }
+   if (a[4] != 0)
+ abort ();
+   return 0;
+ }


Re: [pph] Add support for line table streaming with includes (issue 4908051)

2011-08-19 Thread dnovillo

Very nice.  One potential application of this in the future would be to
sequence not only the included files, but also the symbols and types,
to support the cases where a child include depends on symbols exported
by the parent before its inclusion (though I'm not sure we really want
to support that).

OK with the changes below.


http://codereview.appspot.com/4908051/diff/1/gcc/cp/pph-streamer-in.c
File gcc/cp/pph-streamer-in.c (right):

http://codereview.appspot.com/4908051/diff/1/gcc/cp/pph-streamer-in.c#newcode1402
gcc/cp/pph-streamer-in.c:1402: gcc_assert(lm->included_from == -1);
Space before '('

http://codereview.appspot.com/4908051/diff/1/gcc/cp/pph-streamer-out.c
File gcc/cp/pph-streamer-out.c (right):

http://codereview.appspot.com/4908051/diff/1/gcc/cp/pph-streamer-out.c#newcode1211
gcc/cp/pph-streamer-out.c:1211: header_name = header_path + 1;
Actually, you can use lbasename from libiberty.  You can then figure out
the extension with strrchr.  I forgot to tell you about it, sorry.

http://codereview.appspot.com/4908051/diff/1/gcc/cp/pph-streamer-out.c#newcode1328
gcc/cp/pph-streamer-out.c:1328: #ifdef and only enabled if asserts are
on.  */
Actually, leave it in permanently.  Changing the output of the compiler
depending on whether checking is enabled is bad ju-ju.  An extra int at
the end of the line table will not hurt.
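
The lbasename/strrchr suggestion for pph-streamer-out.c above could look
roughly like the sketch below.  To stay self-contained it does not link
against libiberty: toy_basename is a hypothetical stand-in for lbasename
that only understands '/' separators, and toy_extension is an
illustrative name, not something from the patch.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for libiberty's lbasename: return a pointer to
   the component after the last '/', or PATH itself if there is none.  */
static const char *
toy_basename (const char *path)
{
  const char *slash;

  assert (path != NULL);
  slash = strrchr (path, '/');
  return slash ? slash + 1 : path;
}

/* Return a pointer to the last '.' in NAME (the extension, dot
   included), or to the terminating NUL if NAME has no extension.  */
static const char *
toy_extension (const char *name)
{
  const char *dot;

  assert (name != NULL);
  dot = strrchr (name, '.');
  return dot ? dot : name + strlen (name);
}
```

Taking the basename first matters: a '.' in a directory name would
otherwise be mistaken for the start of the extension.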

http://codereview.appspot.com/4908051/


Re: [Patch, Fortran, OOP] PR 49638: [OOP] length parameter is ignored when overriding type bound character functions with constant length.

2011-08-19 Thread Tobias Burnus

On 08/19/2011 01:55 PM, Mikael Morin wrote:

OK from my side for the code proper.

I have one comment though about this:
+/* Compare two expressions.  Return values:
+   * +1 if e1 > e2
+   * 0 if e1 == e2
+   * -1 if e1 < e2
+   * -2 if the relationship could not be determined
+   * -3 if e1 /= e2, but we cannot tell which one is larger.  */

I think this is misleading, as the function does not always return -3 when
e1/=e2. There is for example (currently) no special handling for operators.
Here is an attempt at expressing it:
   * -3 in some cases where we could determine that e1 and e2 have different
data dependencies (and thus are not guaranteed to have always the same value),
but we cannot tell whether one is greater than the other.


Besides that issue, I am wondering whether we shouldn't start to use an 
ENUM for those. I think for "<" vs. "==" vs. ">" one can use a number 
(-1, 0, 1) and then compare the result against 0 (>0, == 0 etc.).


However, for 5 values, I think it makes sense to do something else;
otherwise, someone writes "... < 0", which not only matches -1 but also
-2 or -3.
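
A minimal sketch of what such an enum could look like, in C since that
is what the front end is written in.  All names here are hypothetical
and not taken from the patch; toy_compare only stands in for the real
expression comparison.

```c
#include <assert.h>

/* Hypothetical names; the patch itself returns plain integers.  */
enum toy_cmp_result
{
  TOY_CMP_LESS,       /* e1 < e2 */
  TOY_CMP_EQUAL,      /* e1 == e2 */
  TOY_CMP_GREATER,    /* e1 > e2 */
  TOY_CMP_UNKNOWN,    /* relationship could not be determined */
  TOY_CMP_DIFFERENT   /* e1 /= e2, but the order is unknown */
};

/* Toy comparison of two integers whose values may be unavailable;
   KNOWN1 and KNOWN2 say whether V1 and V2 can be used.  This toy
   never produces TOY_CMP_DIFFERENT.  */
static enum toy_cmp_result
toy_compare (int known1, int v1, int known2, int v2)
{
  if (!known1 || !known2)
    return TOY_CMP_UNKNOWN;
  if (v1 < v2)
    return TOY_CMP_LESS;
  if (v1 > v2)
    return TOY_CMP_GREATER;
  return TOY_CMP_EQUAL;
}

/* Callers have to name the case they mean, so "unknown" cannot be
   mistaken for "less than" the way a "< 0" test would allow.  */
static int
toy_definitely_less (enum toy_cmp_result r)
{
  assert (r <= TOY_CMP_DIFFERENT);
  return r == TOY_CMP_LESS;
}
```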


I think this does not block the commit, but one should think about
whether to do it as a follow-up.


Tobias,
who has not read the patch.


[testsuite] Require C99 runtime in gcc.dg/builtins-67.c, gcc.target/i386/conversion.c

2011-08-19 Thread Rainer Orth
Uros' two new testcases were failing on Solaris (both SPARC and x86)
like this:

FAIL: gcc.dg/builtins-67.c (test for excess errors)

Excess errors:
In file included from 
/vol/gcc/src/hg/trunk/local/gcc/testsuite/gcc.dg/builtins-67.c:6:0:
/vol/gcc/src/hg/trunk/local/gcc/testsuite/gcc.dg/builtins-config.h:48:2: error: 
#error forgot to set -std=c99.

Both tests need C99 flags.  Fixed as follows, tested with the
appropriate runtest invocations on i386-pc-solaris2.8 and
x86_64-unknown-linux-gnu, installed on mainline.

Rainer


2011-08-19  Rainer Orth  

* gcc.dg/builtins-67.c: Use dg-add-options c99_runtime.
* gcc.target/i386/conversion.c: Likewise.

diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,3 +1,8 @@
+2011-08-19  Rainer Orth  
+
+	* gcc.dg/builtins-67.c: Use dg-add-options c99_runtime.
+	* gcc.target/i386/conversion.c: Likewise.
+
 2011-08-19  Richard Guenther  
 
 	* gcc.dg/torture/pr50067-1.c: New testcase.
diff --git a/gcc/testsuite/gcc.dg/builtins-67.c b/gcc/testsuite/gcc.dg/builtins-67.c
--- a/gcc/testsuite/gcc.dg/builtins-67.c
+++ b/gcc/testsuite/gcc.dg/builtins-67.c
@@ -2,6 +2,7 @@
 
 /* { dg-do link } */
 /* { dg-options "-ffast-math -lm" }  */
+/* { dg-add-options c99_runtime } */
 
 #include "builtins-config.h"
 
diff --git a/gcc/testsuite/gcc.target/i386/conversion.c b/gcc/testsuite/gcc.target/i386/conversion.c
--- a/gcc/testsuite/gcc.target/i386/conversion.c
+++ b/gcc/testsuite/gcc.target/i386/conversion.c
@@ -2,6 +2,7 @@
 
 /* { dg-do link } */
 /* { dg-options "-ffast-math" }  */
+/* { dg-add-options c99_runtime } */
 
 #include "../../gcc.dg/builtins-config.h"
 

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


Re: [PATCH, testsuite, i386] BMI2 support for GCC

2011-08-19 Thread Jakub Jelinek
On Fri, Aug 19, 2011 at 05:18:19PM +0400, Kirill Yukhin wrote:
> Thanks, it is fixed.
> Update patch is attached.

+ /* We generatin RORX instruction, freedom of register +
+    flags not affected  */

comment doesn't look to be correct english (missing verb, missing g at
the end of generating, missing dot at the end of sentence).

Jakub


Re: C1X _Noreturn

2011-08-19 Thread Joseph S. Myers
On Thu, 18 Aug 2011, Gabriel Dos Reis wrote:

> On Thu, Aug 18, 2011 at 4:37 PM, Joseph S. Myers
>  wrote:
> 
> > The new keyword is C-only (C++0x has a different way of declaring
> > non-returning functions) and I did not try to make the header do
> > anything useful if included in C++ code.
> 
> I would suggest you don't define it at all as a macro when __cplusplus is defined.

This followup patch:

* stops stdnoreturn.h from defining the noreturn macro for C++, as
  suggested;

* adds a pedwarn-if-pedantic for using _Noreturn outside C1X mode (not
  formally required as it's in the reserved namespace, but still seems
  useful and is similar to what's done with _Complex, for example);

* mentions _Noreturn in the syntax comments in the C parser.

Bootstrapped with no regressions on x86_64-unknown-linux-gnu.  Applied
to mainline.
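
For illustration, this is roughly how the header is meant to be used
from C.  It is only a sketch: toy_fatal and toy_checked_div are made-up
names.  Using _Noreturn directly under -std=iso9899:1999
-pedantic-errors (or the C90 equivalent) is what the new tests check.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <stdnoreturn.h>

/* Hypothetical helper: report an error and terminate.  "noreturn"
   expands to _Noreturn when this is compiled as C.  */
static noreturn void
toy_fatal (const char *msg)
{
  assert (msg != NULL);
  fprintf (stderr, "fatal: %s\n", msg);
  exit (EXIT_FAILURE);
}

/* Because toy_fatal never returns, the division below is only
   reached with a nonzero divisor.  */
static int
toy_checked_div (int a, int b)
{
  if (b == 0)
    toy_fatal ("division by zero");
  return a / b;
}
```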

2011-08-19  Joseph Myers  

* c-decl.c (grokdeclarator): Diagnose _Noreturn for non-C1X if
pedantic.
* c-parser.c (c_parser_declspecs): Include _Noreturn in syntax
comment.
* ginclude/stdnoreturn.h (noreturn): Don't define for C++.

testsuite:
2011-08-19  Joseph Myers  

* gcc.dg/c90-noreturn-1.c, gcc.dg/c99-noreturn-1.c: New tests.

Index: gcc/ginclude/stdnoreturn.h
===
--- gcc/ginclude/stdnoreturn.h  (revision 177894)
+++ gcc/ginclude/stdnoreturn.h  (working copy)
@@ -26,6 +26,10 @@ see the files COPYING3 and COPYING.RUNTI
 #ifndef _STDNORETURN_H
 #define _STDNORETURN_H
 
+#ifndef __cplusplus
+
 #define noreturn _Noreturn
 
+#endif
+
 #endif /* stdnoreturn.h */
Index: gcc/testsuite/gcc.dg/c99-noreturn-1.c
===
--- gcc/testsuite/gcc.dg/c99-noreturn-1.c   (revision 0)
+++ gcc/testsuite/gcc.dg/c99-noreturn-1.c   (revision 0)
@@ -0,0 +1,5 @@
+/* Test _Noreturn not in C99.  */
+/* { dg-do compile } */
+/* { dg-options "-std=iso9899:1999 -pedantic-errors" } */
+
+_Noreturn void f (void); /* { dg-error "ISO C99 does not support '_Noreturn'" } */
Index: gcc/testsuite/gcc.dg/c90-noreturn-1.c
===
--- gcc/testsuite/gcc.dg/c90-noreturn-1.c   (revision 0)
+++ gcc/testsuite/gcc.dg/c90-noreturn-1.c   (revision 0)
@@ -0,0 +1,5 @@
+/* Test _Noreturn not in C90.  */
+/* { dg-do compile } */
+/* { dg-options "-std=iso9899:1990 -pedantic-errors" } */
+
+_Noreturn void f (void); /* { dg-error "ISO C90 does not support '_Noreturn'" } */
Index: gcc/c-decl.c
===
--- gcc/c-decl.c  (revision 177894)
+++ gcc/c-decl.c  (working copy)
@@ -5986,7 +5986,18 @@ grokdeclarator (const struct c_declarato
  /* Record that the function is declared `inline'.  */
  DECL_DECLARED_INLINE_P (decl) = 1;
if (declspecs->noreturn_p)
- TREE_THIS_VOLATILE (decl) = 1;
+ {
+   if (!flag_isoc1x)
+ {
+   if (flag_isoc99)
+ pedwarn (loc, OPT_pedantic,
+  "ISO C99 does not support %<_Noreturn%>");
+   else
+ pedwarn (loc, OPT_pedantic,
+  "ISO C90 does not support %<_Noreturn%>");
+ }
+   TREE_THIS_VOLATILE (decl) = 1;
+ }
  }
   }
 else
Index: gcc/c-parser.c
===
--- gcc/c-parser.c  (revision 177894)
+++ gcc/c-parser.c  (working copy)
@@ -1905,6 +1905,9 @@ c_parser_static_assert_declaration_no_se
C99 6.7.4:
function-specifier:
  inline
+ _Noreturn
+
+   (_Noreturn is new in C1X.)
 
C90 6.5.2, C99 6.7.2:
type-specifier:

-- 
Joseph S. Myers
jos...@codesourcery.com


[PATCH] Fix !$omp workshare (PR fortran/49792)

2011-08-19 Thread Jakub Jelinek
Hi!

As the first testcase shows, if we need a temporary for array assignment,
we can't easily parallelize it (we'd have to emit the temporary allocation
into an OMP_SINGLE, then copyprivate the result to all the other threads,
then do the OMP_FOR and afterwards an OMP_SINGLE again to free it).
Similarly, if for f03 we need lhs reallocation.
And lastly, if workshare has an empty body, but no NOWAIT clause, we want
to emit a barrier.
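
To illustrate why the temporary makes this awkward, here is roughly what
the parallelized form would have to do, sketched in C with OpenMP
pragmas rather than in the trees the Fortran front end emits.
toy_reverse is a hypothetical name; if the file is compiled without
-fopenmp the pragmas are ignored and the code simply runs serially.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Reverse A[0..N-1] in place through a temporary copy, the way
   a(:) = a(n:1:-1) needs one.  */
static void
toy_reverse (double *a, int n)
{
  if (n <= 0)
    return;

#pragma omp parallel
  {
    double *tmp = NULL;

    /* One thread allocates and fills the temporary; copyprivate
       then hands the pointer to every other thread.  */
#pragma omp single copyprivate(tmp)
    {
      tmp = malloc ((size_t) n * sizeof *tmp);
      assert (tmp != NULL);
      memcpy (tmp, a, (size_t) n * sizeof *tmp);
    }

    /* The actual work is shared; the implicit barrier at the end of
       the loop guarantees nobody still reads TMP afterwards.  */
#pragma omp for
    for (int i = 0; i < n; i++)
      a[i] = tmp[n - 1 - i];

    /* One thread frees the temporary again.  */
#pragma omp single
    free (tmp);
  }
}
```

That is three constructs and two extra synchronization points for a
single assignment, which is why the patch just does not workshare the
scalarized loop in that case.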

Regtested on x86_64-linux, committed to trunk and 4.6 so far.

2011-08-19  Jakub Jelinek  

PR fortran/49792
* trans-expr.c (gfc_trans_assignment_1): Set OMPWS_SCALARIZER_WS
bit in ompws_flags only if loop.temp_ss is NULL, and clear it if
lhs needs reallocation.
* trans-openmp.c (gfc_trans_omp_workshare): Don't return early if
code is NULL, emit a barrier if workshare emitted no code at all
and NOWAIT clause isn't present.

* testsuite/libgomp.fortran/pr49792-1.f90: New test.
* testsuite/libgomp.fortran/pr49792-2.f90: New test.

--- gcc/fortran/trans-expr.c.jj 2011-07-27 23:25:33.0 +0200
+++ gcc/fortran/trans-expr.c  2011-08-19 14:12:46.0 +0200
@@ -6137,10 +6137,6 @@ gfc_trans_assignment_1 (gfc_expr * expr1
   rss = NULL;
   if (lss != gfc_ss_terminator)
 {
-  /* Allow the scalarizer to workshare array assignments.  */
-  if (ompws_flags & OMPWS_WORKSHARE_FLAG)
-   ompws_flags |= OMPWS_SCALARIZER_WS;
-
   /* The assignment needs scalarization.  */
   lss_section = lss;
 
@@ -6196,6 +6192,10 @@ gfc_trans_assignment_1 (gfc_expr * expr1
  gfc_mark_ss_chain_used (loop.temp_ss, 3);
}
 
+  /* Allow the scalarizer to workshare array assignments.  */
+  if ((ompws_flags & OMPWS_WORKSHARE_FLAG) && loop.temp_ss == NULL)
+   ompws_flags |= OMPWS_SCALARIZER_WS;
+
   /* Start the scalarized loop body.  */
   gfc_start_scalarized_body (&loop, &body);
 }
@@ -6304,6 +6304,7 @@ gfc_trans_assignment_1 (gfc_expr * expr1
&& !gfc_expr_attr (expr1).codimension
&& !gfc_is_coindexed (expr1))
{
+ ompws_flags &= ~OMPWS_SCALARIZER_WS;
  tmp = gfc_alloc_allocatable_for_assignment (&loop, expr1, expr2);
  if (tmp != NULL_TREE)
gfc_add_expr_to_block (&loop.code[expr1->rank - 1], tmp);
--- gcc/fortran/trans-openmp.c.jj   2011-08-03 18:41:01.0 +0200
+++ gcc/fortran/trans-openmp.c  2011-08-19 13:58:02.0 +0200
@@ -1764,9 +1764,6 @@ gfc_trans_omp_workshare (gfc_code *code,
 
   pushlevel (0);
 
-  if (!code)
-return build_empty_stmt (input_location);
-
   gfc_start_block (&block);
   pblock = &block;
 
@@ -1903,6 +1900,9 @@ gfc_trans_omp_workshare (gfc_code *code,
   else
 poplevel (0, 0, 0);
 
+  if (IS_EMPTY_STMT (stmt) && !clauses->nowait)
+stmt = gfc_trans_omp_barrier ();
+
   ompws_flags = 0;
   return stmt;
 }
--- libgomp/testsuite/libgomp.fortran/pr49792-1.f90.jj  2011-08-19 14:14:53.0 +0200
+++ libgomp/testsuite/libgomp.fortran/pr49792-1.f90 2011-08-19 14:16:20.0 +0200
@@ -0,0 +1,18 @@
+! PR fortran/49792
+! { dg-do run }
+
+subroutine reverse(n, a)
+  integer :: n
+  real(kind=8) :: a(n)
+!$omp parallel workshare
+  a(:) = a(n:1:-1)
+!$omp end parallel workshare
+end subroutine reverse
+
+program pr49792
+  real(kind=8) :: a(16) = [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16]
+  real(kind=8) :: b(16)
+  b(:) = a(16:1:-1)
+  call reverse (16,a)
+  if (any (a.ne.b)) call abort
+end program pr49792
--- libgomp/testsuite/libgomp.fortran/pr49792-2.f90.jj  2011-08-19 14:16:00.0 +0200
+++ libgomp/testsuite/libgomp.fortran/pr49792-2.f90 2011-08-19 14:28:25.0 +0200
@@ -0,0 +1,22 @@
+! PR fortran/49792
+! { dg-do run }
+! { dg-options "-std=f2003 -fall-intrinsics" }
+
+subroutine reverse(n, a)
+  integer :: n
+  real(kind=8) :: a(n)
+!$omp parallel workshare
+  a(:) = a(n:1:-1)
+!$omp end parallel workshare
+end subroutine reverse
+
+program pr49792
+  integer :: b(16)
+  integer, allocatable :: a(:)
+  b = 1
+!$omp parallel workshare
+  a = b
+!$omp end parallel workshare
+  if (size(a).ne.size(b)) call abort()
+  if (any (a.ne.b)) call abort()
+end program pr49792

Jakub


Re: C1X _Noreturn

2011-08-19 Thread Rainer Orth
Richard Sandiford  writes:

>> so perhaps that configuration (mips*-sde-elf* without newlib) should
>> actually be deprecated/removed (and "mipssde" threads along with it).
>
> Yeah, sounds like a good plan.

Good to know :-)  Could have saved me a little bit of work with the
libgcc patches.

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


Re: [rtl, delay-slot] Fix overload of "unchanging" bit

2011-08-19 Thread Hans-Peter Nilsson
> Date: Thu, 18 Aug 2011 15:48:41 -0700
> From: Richard Henderson 

> The following has passed stage2-gcc on sparc64-linux host (full build still
> in progress), with --enable-checking=yes,rtl.  It surely needs more than that,
> and I'm asking for help from the relevant maintainers to give this a try.

No regressions for either cris-elf (branch delay-slots, no
annullable slots, no call delay-slots) or crisv32-elf
(branch and call delay-slots, no annullable slots).
(No new passes either; just the same results.)

brgds, H-P


Re: [PATCH, testsuite, i386] BMI2 support for GCC

2011-08-19 Thread H.J. Lu
No need for () in "(mode == SImode)":

+ && !optimize_function_for_size_p (cfun)
+ && ((mode == SImode) || (mode == DImode && TARGET_64BIT))

Wrong placement of '{':

+  if (can_create_pseudo_p () && mode != SImode) {
+rtx tmp = gen_rtx_REG (mode, 0);
+emit_insn (gen_extendsidi2 (tmp, operands[2]));
+operands[2] = tmp;
+  }
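
For reference, GNU style puts the opening brace on its own line,
indented two columns past the "if", and drops the parentheses around a
simple comparison.  A self-contained sketch with stand-in names
(toy_mode, toy_use_rorx and the target_64bit parameter are hypothetical,
not the patch's code):

```c
/* Stand-ins for the machine modes mentioned in the review.  */
enum toy_mode { TOY_HImode, TOY_SImode, TOY_DImode };

static int
toy_use_rorx (enum toy_mode mode, int target_64bit)
{
  int ok = 0;

  /* No parentheses around "mode == TOY_SImode"; they are kept only
     where && and || are mixed.  */
  if (mode == TOY_SImode || (mode == TOY_DImode && target_64bit))
    {
      /* Brace on its own line, two columns in from the "if"; the
         body is indented a further two columns.  */
      ok = 1;
    }

  return ok;
}
```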



On Fri, Aug 19, 2011 at 6:53 AM, Kirill Yukhin  wrote:
> Thanks!
> Fixed, updated patch is attached.
>
> Is it ok?
>
> Thanks, K
>
> On Fri, Aug 19, 2011 at 5:22 PM, Jakub Jelinek  wrote:
>> On Fri, Aug 19, 2011 at 05:18:19PM +0400, Kirill Yukhin wrote:
>>> Thanks, it is fixed.
>>> Update patch is attached.
>>
>> +         /* We generatin RORX instruction, freedom of register +
>> +            flags not affected  */
>>
>> comment doesn't look to be correct english (missing verb, missing g at
>> the end of generating, missing dot at the end of sentence).
>>
>>        Jakub
>>
>



-- 
H.J.


Re: [PATCH (1/7)] New optab framework for widening multiplies

2011-08-19 Thread Andrew Stubbs

On 22/07/11 16:34, Andrew Stubbs wrote:

On 22/07/11 14:28, Bernd Schmidt wrote:

Oh well, let's shelve it and do it later.


Here's an updated patch with the formatting problem you found fixed.


I've just committed an updated version of this patch (attached).

I found a number of subtle bugs while I was testing, and these have now
been corrected. In particular, I found that VOIDmode constants were not
handled correctly; I've added a function "widened_mode" along the lines
originally suggested by Bernd to deal with this. I also found one case
where different code was produced than previously; although it was
actually corrected later in the patch series, I've fixed it here now.


Andrew

2011-08-19  Andrew Stubbs  

	gcc/
	* expr.c (expand_expr_real_2): Use widening_optab_handler.
	* genopinit.c (optabs): Use set_widening_optab_handler for $N.
	(gen_insn): $N now means $a must be wider than $b, not consecutive.
	* optabs.c (widened_mode): New function.
	(expand_widen_pattern_expr): Use widening_optab_handler.
	(expand_binop_directly): Likewise.
	(expand_binop): Likewise.
	* optabs.h (widening_optab_handlers): New struct.
	(optab_d): New member, 'widening'.
	(widening_optab_handler): New function.
	(set_widening_optab_handler): New function.
	* tree-ssa-math-opts.c (convert_mult_to_widen): Use
	widening_optab_handler.
	(convert_plusminus_to_widen): Likewise.

--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -8005,7 +8005,8 @@ expand_expr_real_2 (sepops ops, rtx target, enum machine_mode tmode,
 	  this_optab = usmul_widen_optab;
 	  if (mode == GET_MODE_2XWIDER_MODE (innermode))
 	{
-	  if (optab_handler (this_optab, mode) != CODE_FOR_nothing)
+	  if (widening_optab_handler (this_optab, mode, innermode)
+		!= CODE_FOR_nothing)
 		{
 		  if (TYPE_UNSIGNED (TREE_TYPE (treeop0)))
 		expand_operands (treeop0, treeop1, NULL_RTX, &op0, &op1,
@@ -8032,7 +8033,8 @@ expand_expr_real_2 (sepops ops, rtx target, enum machine_mode tmode,
 	  if (mode == GET_MODE_2XWIDER_MODE (innermode)
 	  && TREE_CODE (treeop0) != INTEGER_CST)
 	{
-	  if (optab_handler (this_optab, mode) != CODE_FOR_nothing)
+	  if (widening_optab_handler (this_optab, mode, innermode)
+		!= CODE_FOR_nothing)
 		{
 		  expand_operands (treeop0, treeop1, NULL_RTX, &op0, &op1,
    EXPAND_NORMAL);
@@ -8040,7 +8042,8 @@ expand_expr_real_2 (sepops ops, rtx target, enum machine_mode tmode,
 	   unsignedp, this_optab);
 		  return REDUCE_BIT_FIELD (temp);
 		}
-	  if (optab_handler (other_optab, mode) != CODE_FOR_nothing
+	  if (widening_optab_handler (other_optab, mode, innermode)
+		!= CODE_FOR_nothing
 		  && innermode == word_mode)
 		{
 		  rtx htem, hipart;
--- a/gcc/genopinit.c
+++ b/gcc/genopinit.c
@@ -46,10 +46,12 @@ along with GCC; see the file COPYING3.  If not see
used.  $A and $B are replaced with the full name of the mode; $a and $b
are replaced with the short form of the name, as above.
 
-   If $N is present in the pattern, it means the two modes must be consecutive
-   widths in the same mode class (e.g, QImode and HImode).  $I means that
-   only full integer modes should be considered for the next mode, and $F
-   means that only float modes should be considered.
+   If $N is present in the pattern, it means the two modes must be in
+   the same mode class, and $b must be greater than $a (e.g, QImode
+   and HImode).
+
+   $I means that only full integer modes should be considered for the
+   next mode, and $F means that only float modes should be considered.
$P means that both full and partial integer modes should be considered.
$Q means that only fixed-point modes should be considered.
 
@@ -99,17 +101,17 @@ static const char * const optabs[] =
   "set_optab_handler (smulv_optab, $A, CODE_FOR_$(mulv$I$a3$))",
   "set_optab_handler (umul_highpart_optab, $A, CODE_FOR_$(umul$a3_highpart$))",
   "set_optab_handler (smul_highpart_optab, $A, CODE_FOR_$(smul$a3_highpart$))",
-  "set_optab_handler (smul_widen_optab, $B, CODE_FOR_$(mul$a$b3$)$N)",
-  "set_optab_handler (umul_widen_optab, $B, CODE_FOR_$(umul$a$b3$)$N)",
-  "set_optab_handler (usmul_widen_optab, $B, CODE_FOR_$(usmul$a$b3$)$N)",
-  "set_optab_handler (smadd_widen_optab, $B, CODE_FOR_$(madd$a$b4$)$N)",
-  "set_optab_handler (umadd_widen_optab, $B, CODE_FOR_$(umadd$a$b4$)$N)",
-  "set_optab_handler (ssmadd_widen_optab, $B, CODE_FOR_$(ssmadd$a$b4$)$N)",
-  "set_optab_handler (usmadd_widen_optab, $B, CODE_FOR_$(usmadd$a$b4$)$N)",
-  "set_optab_handler (smsub_widen_optab, $B, CODE_FOR_$(msub$a$b4$)$N)",
-  "set_optab_handler (umsub_widen_optab, $B, CODE_FOR_$(umsub$a$b4$)$N)",
-  "set_optab_handler (ssmsub_widen_optab, $B, CODE_FOR_$(ssmsub$a$b4$)$N)",
-  "set_optab_handler (usmsub_widen_optab, $B, CODE_FOR_$(usmsub$a$b4$)$N)",
+  "set_widening_optab_handler (smul_widen_optab, $B, $A, CODE_FOR_$(mul$a$b3$)$N)",
+  "set_widening_optab_handler (umul_widen_optab, $B, $A, CODE_FOR_$(umul$a$b3$)$N)",
+  "set_widening_optab_

Re: [PATCH 1/7] Linemap infrastructure for virtual locations

2011-08-19 Thread Tom Tromey
> "Jason" == Jason Merrill  writes:

>> +  LC_ENTER_MACRO
>> +  /* stringize */
>> +  /* paste */

Jason> What is the purpose of these comments?

That is left over from my initial hack.  The new scheme doesn't (yet?)
properly handle locations arising from stringizing or token pasting.

Tom


Re: [PATCH (2/7)] Widening multiplies by more than one mode

2011-08-19 Thread Andrew Stubbs

On 14/07/11 15:15, Richard Guenther wrote:

Is this version OK?

Ok.


I've just committed this slightly updated patch.

I found some bugs while testing, now fixed. Most of the changes in this 
patch are context changes, and using widened_mode to handle VOIDmode 
constants.
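
As a rough model of the search that find_widening_optab_handler_and_mode
performs: starting at the source mode, try each successively wider mode
until one has a handler that widens to the target mode.  The sketch
below uses a toy mode enum and handler table (all hypothetical) instead
of GCC's optab machinery, and leaves out the permit_non_widening and
VOIDmode handling.

```c
#include <assert.h>
#include <stddef.h>

/* Toy modes, ordered from narrowest to widest.  */
enum toy_mode { TOY_QI, TOY_HI, TOY_SI, TOY_DI, TOY_NUM_MODES };

#define TOY_NO_HANDLER (-1)

/* HANDLERS[to][from] holds a made-up insn code for a multiply that
   widens FROM to TO, or TOY_NO_HANDLER.  Walk from FROM_MODE towards
   TO_MODE and return the first handler found; if FOUND_MODE is not
   NULL, store the source mode that handler expects.  */
static int
toy_find_widening_handler (int handlers[TOY_NUM_MODES][TOY_NUM_MODES],
                           enum toy_mode to_mode, enum toy_mode from_mode,
                           enum toy_mode *found_mode)
{
  assert (to_mode < TOY_NUM_MODES && from_mode < TOY_NUM_MODES);

  for (int m = from_mode; m < (int) to_mode; m++)
    {
      int handler = handlers[to_mode][m];

      if (handler != TOY_NO_HANDLER)
        {
          if (found_mode)
            *found_mode = (enum toy_mode) m;
          return handler;
        }
    }

  return TOY_NO_HANDLER;
}
```

The caller then has to widen its inputs to the mode reported through
found_mode before using the handler, which is what the casts added in
tree-ssa-math-opts.c are for.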


Andrew
2011-08-19  Andrew Stubbs  

	gcc/
	* config/arm/arm.md (maddhidi4): Remove '*' from name.
	* expr.c (expand_expr_real_2): Use find_widening_optab_handler.
	* optabs.c (find_widening_optab_handler_and_mode): New function.
	(expand_widen_pattern_expr): Use find_widening_optab_handler.
	(expand_binop_directly): Likewise.
	(expand_binop): Likewise.
	* optabs.h (find_widening_optab_handler): New macro define.
	(find_widening_optab_handler_and_mode): New prototype.
	* tree-cfg.c (verify_gimple_assign_binary): Adjust WIDEN_MULT_EXPR
	type precision rules.
	(verify_gimple_assign_ternary): Likewise for WIDEN_MULT_PLUS_EXPR.
	* tree-ssa-math-opts.c (build_and_insert_cast): New function.
	(is_widening_mult_rhs_p): Allow widening by more than one mode.
	Explicitly disallow mis-matched input types.
	(convert_mult_to_widen): Use find_widening_optab_handler, and cast
	input types to fit the new handler.
	(convert_plusminus_to_widen): Likewise.

	gcc/testsuite/
	* gcc.target/arm/wmul-bitfield-1.c: New file.

--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -1857,7 +1857,7 @@
(set_attr "predicable" "yes")]
 )
 
-(define_insn "*maddhidi4"
+(define_insn "maddhidi4"
   [(set (match_operand:DI 0 "s_register_operand" "=r")
 	(plus:DI
 	  (mult:DI (sign_extend:DI
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -8003,19 +8003,16 @@ expand_expr_real_2 (sepops ops, rtx target, enum machine_mode tmode,
 	{
 	  enum machine_mode innermode = TYPE_MODE (TREE_TYPE (treeop0));
 	  this_optab = usmul_widen_optab;
-	  if (mode == GET_MODE_2XWIDER_MODE (innermode))
+	  if (find_widening_optab_handler (this_optab, mode, innermode, 0)
+		!= CODE_FOR_nothing)
 	{
-	  if (widening_optab_handler (this_optab, mode, innermode)
-		!= CODE_FOR_nothing)
-		{
-		  if (TYPE_UNSIGNED (TREE_TYPE (treeop0)))
-		expand_operands (treeop0, treeop1, NULL_RTX, &op0, &op1,
- EXPAND_NORMAL);
-		  else
-		expand_operands (treeop0, treeop1, NULL_RTX, &op1, &op0,
- EXPAND_NORMAL);
-		  goto binop3;
-		}
+	  if (TYPE_UNSIGNED (TREE_TYPE (treeop0)))
+		expand_operands (treeop0, treeop1, NULL_RTX, &op0, &op1,
+ EXPAND_NORMAL);
+	  else
+		expand_operands (treeop0, treeop1, NULL_RTX, &op1, &op0,
+ EXPAND_NORMAL);
+	  goto binop3;
 	}
 	}
   /* Check for a multiplication with matching signedness.  */
@@ -8030,10 +8027,9 @@ expand_expr_real_2 (sepops ops, rtx target, enum machine_mode tmode,
 	  optab other_optab = zextend_p ? smul_widen_optab : umul_widen_optab;
 	  this_optab = zextend_p ? umul_widen_optab : smul_widen_optab;
 
-	  if (mode == GET_MODE_2XWIDER_MODE (innermode)
-	  && TREE_CODE (treeop0) != INTEGER_CST)
+	  if (TREE_CODE (treeop0) != INTEGER_CST)
 	{
-	  if (widening_optab_handler (this_optab, mode, innermode)
+	  if (find_widening_optab_handler (this_optab, mode, innermode, 0)
 		!= CODE_FOR_nothing)
 		{
 		  expand_operands (treeop0, treeop1, NULL_RTX, &op0, &op1,
@@ -8042,7 +8038,7 @@ expand_expr_real_2 (sepops ops, rtx target, enum machine_mode tmode,
 	   unsignedp, this_optab);
 		  return REDUCE_BIT_FIELD (temp);
 		}
-	  if (widening_optab_handler (other_optab, mode, innermode)
+	  if (find_widening_optab_handler (other_optab, mode, innermode, 0)
 		!= CODE_FOR_nothing
 		  && innermode == word_mode)
 		{
--- a/gcc/optabs.c
+++ b/gcc/optabs.c
@@ -249,6 +249,37 @@ widened_mode (enum machine_mode to_mode, rtx op0, rtx op1)
   return result;
 }
 
+/* Find a widening optab even if it doesn't widen as much as we want.
+   E.g. if from_mode is HImode, and to_mode is DImode, and there is no
+   direct HI->SI insn, then return SI->DI, if that exists.
+   If PERMIT_NON_WIDENING is non-zero then this can be used with
+   non-widening optabs also.  */
+
+enum insn_code
+find_widening_optab_handler_and_mode (optab op, enum machine_mode to_mode,
+  enum machine_mode from_mode,
+  int permit_non_widening,
+  enum machine_mode *found_mode)
+{
+  for (; (permit_non_widening || from_mode != to_mode)
+	 && GET_MODE_SIZE (from_mode) <= GET_MODE_SIZE (to_mode)
+	 && from_mode != VOIDmode;
+   from_mode = GET_MODE_WIDER_MODE (from_mode))
+{
+  enum insn_code handler = widening_optab_handler (op, to_mode,
+		   from_mode);
+
+  if (handler != CODE_FOR_nothing)
+	{
+	  if (found_mode)
+	*found_mode = from_mode;
+	  return handler;
+	}
+}
+
+  return CODE_FOR_nothing;
+}
+
 /* Widen OP to MODE and return the rtx for the widened operand.  UNSIGNEDP
says whether OP is signed or unsigned.  NO_EXTEND is nonzero if we need
not actually do a sign-extend or zero-extend, but can leave the
@@ -539,8 +570,9 @@ expand_widen_patt

Re: [PATCH, testsuite, i386] BMI2 support for GCC

2011-08-19 Thread H.J. Lu
It is hard to tell.  Can you double check indentation on

+  if (can_create_pseudo_p () && mode != SImode)
+  {
+rtx tmp = gen_rtx_REG (mode, 0);
+emit_insn (gen_extendsidi2 (tmp, operands[2]));
+operands[2] = tmp;
+  }


On Fri, Aug 19, 2011 at 7:13 AM, Kirill Yukhin  wrote:
> Thanks, fixed.
>
> Updated patch is attached.
>
> K
>
> On Fri, Aug 19, 2011 at 6:04 PM, H.J. Lu  wrote:
>> No need for () in "(mode == SImode)":
>>
>> +         && !optimize_function_for_size_p (cfun)
>> +         && ((mode == SImode) || (mode == DImode && TARGET_64BIT))
>>
>> Wrong placement of '{':
>>
>> +  if (can_create_pseudo_p () && mode != SImode) {
>> +    rtx tmp = gen_rtx_REG (mode, 0);
>> +    emit_insn (gen_extendsidi2 (tmp, operands[2]));
>> +    operands[2] = tmp;
>> +  }
>>
>>
>>
>> On Fri, Aug 19, 2011 at 6:53 AM, Kirill Yukhin  
>> wrote:
>>> Thanks!
>>> Fixed, updated patch is attached.
>>>
>>> Is it ok?
>>>
>>> Thanks, K
>>>
>>> On Fri, Aug 19, 2011 at 5:22 PM, Jakub Jelinek  wrote:
>>>> On Fri, Aug 19, 2011 at 05:18:19PM +0400, Kirill Yukhin wrote:
>>>>> Thanks, it is fixed.
>>>>> Update patch is attached.
>>>>
>>>> +         /* We generatin RORX instruction, freedom of register +
>>>> +            flags not affected  */
>>>>
>>>> comment doesn't look to be correct english (missing verb, missing g at
>>>> the end of generating, missing dot at the end of sentence).
>>>>
>>>>        Jakub
>>>>
>>>
>>
>>
>>
>> --
>> H.J.
>>
>



-- 
H.J.


Re: [PATCH (3/7)] Widening multiply-and-accumulate pattern matching

2011-08-19 Thread Andrew Stubbs

On 12/07/11 11:52, Richard Guenther wrote:

Is this one ok?

Ok.


I've just committed this slightly modified patch.

The changes are mainly in the context and the testcase.
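
The check on the intervening conversion can be modelled outside GCC as
below.  This is only a sketch of the precision/signedness test in the
patch, with plain ints and bools in place of tree types; the function
and parameter names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Decide whether a conversion FROM -> TO sitting between a multiply
   and an add still allows a multiply-and-accumulate.  PREC1/PREC2 and
   UNSIGNED1/UNSIGNED2 describe the two multiply operands.  */
static bool
toy_conversion_ok (int from_prec, bool from_unsigned, int to_prec,
                   int prec1, bool unsigned1, int prec2, bool unsigned2)
{
  int data_size = prec1 + prec2;
  bool is_unsigned = unsigned1 && unsigned2;

  assert (from_prec > 0 && to_prec > 0);

  if (from_prec > to_prec)
    /* Truncation: the result must still hold the full product.  */
    return to_prec >= data_size;

  if (from_prec < to_prec)
    /* Extension: its signedness must match the multiply, unless an
       unsigned product is known to fit below the sign bit.  */
    return from_unsigned == is_unsigned
           || (is_unsigned && from_prec > data_size);

  /* Same precision: a no-op for this purpose.  */
  return true;
}
```

For example, truncating a 16x16-bit product from 32 to 16 bits is
rejected, while sign-extending a 32-bit value that holds the product of
two unsigned 8-bit operands to 64 bits is accepted.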

Andrew
2011-08-19  Andrew Stubbs  

	gcc/
	* tree-ssa-math-opts.c (convert_plusminus_to_widen): Permit a single
	conversion statement separating multiply-and-accumulate.

	gcc/testsuite/
	* gcc.target/arm/wmul-5.c: New file.
	* gcc.target/arm/no-wmla-1.c: New file.

--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/no-wmla-1.c
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+/* { dg-require-effective-target arm_dsp } */
+
+int
+foo (int a, short b, short c)
+{
+ int bc = b * c;
+return a + (short)bc;
+}
+
+/* { dg-final { scan-assembler "\tmul\t" } } */
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/wmul-5.c
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+/* { dg-require-effective-target arm_dsp } */
+
+long long
+foo (long long a, char *b, char *c)
+{
+  return a + *b * *c;
+}
+
+/* { dg-final { scan-assembler "umlal" } } */
--- a/gcc/tree-ssa-math-opts.c
+++ b/gcc/tree-ssa-math-opts.c
@@ -2136,6 +2136,7 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt,
 			enum tree_code code)
 {
   gimple rhs1_stmt = NULL, rhs2_stmt = NULL;
+  gimple conv1_stmt = NULL, conv2_stmt = NULL, conv_stmt;
   tree type, type1, type2, tmp;
   tree lhs, rhs1, rhs2, mult_rhs1, mult_rhs2, add_rhs;
   enum tree_code rhs1_code = ERROR_MARK, rhs2_code = ERROR_MARK;
@@ -2178,6 +2179,38 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt,
   else
 return false;
 
+  /* Allow for one conversion statement between the multiply
+ and addition/subtraction statement.  If there are more than
+ one conversions then we assume they would invalidate this
+ transformation.  If that's not the case then they should have
+ been folded before now.  */
+  if (CONVERT_EXPR_CODE_P (rhs1_code))
+{
+  conv1_stmt = rhs1_stmt;
+  rhs1 = gimple_assign_rhs1 (rhs1_stmt);
+  if (TREE_CODE (rhs1) == SSA_NAME)
+	{
+	  rhs1_stmt = SSA_NAME_DEF_STMT (rhs1);
+	  if (is_gimple_assign (rhs1_stmt))
+	rhs1_code = gimple_assign_rhs_code (rhs1_stmt);
+	}
+  else
+	return false;
+}
+  if (CONVERT_EXPR_CODE_P (rhs2_code))
+{
+  conv2_stmt = rhs2_stmt;
+  rhs2 = gimple_assign_rhs1 (rhs2_stmt);
+  if (TREE_CODE (rhs2) == SSA_NAME)
+	{
+	  rhs2_stmt = SSA_NAME_DEF_STMT (rhs2);
+	  if (is_gimple_assign (rhs2_stmt))
+	rhs2_code = gimple_assign_rhs_code (rhs2_stmt);
+	}
+  else
+	return false;
+}
+
   /* If code is WIDEN_MULT_EXPR then it would seem unnecessary to call
  is_widening_mult_p, but we still need the rhs returns.
 
@@ -2191,6 +2224,7 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt,
 			   &type2, &mult_rhs2))
 	return false;
   add_rhs = rhs2;
+  conv_stmt = conv1_stmt;
 }
   else if (rhs2_code == MULT_EXPR || rhs2_code == WIDEN_MULT_EXPR)
 {
@@ -2198,6 +2232,7 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt,
 			   &type2, &mult_rhs2))
 	return false;
   add_rhs = rhs1;
+  conv_stmt = conv2_stmt;
 }
   else
 return false;
@@ -2208,6 +2243,33 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt,
   if (TYPE_UNSIGNED (type1) != TYPE_UNSIGNED (type2))
 return false;
 
+  /* If there was a conversion between the multiply and addition
+ then we need to make sure it fits a multiply-and-accumulate.
+ The should be a single mode change which does not change the
+ value.  */
+  if (conv_stmt)
+{
+  tree from_type = TREE_TYPE (gimple_assign_rhs1 (conv_stmt));
+  tree to_type = TREE_TYPE (gimple_assign_lhs (conv_stmt));
+  int data_size = TYPE_PRECISION (type1) + TYPE_PRECISION (type2);
+  bool is_unsigned = TYPE_UNSIGNED (type1) && TYPE_UNSIGNED (type2);
+
+  if (TYPE_PRECISION (from_type) > TYPE_PRECISION (to_type))
+	{
+	  /* Conversion is a truncate.  */
+	  if (TYPE_PRECISION (to_type) < data_size)
+	return false;
+	}
+  else if (TYPE_PRECISION (from_type) < TYPE_PRECISION (to_type))
+	{
+	  /* Conversion is an extend.  Check it's the right sort.  */
+	  if (TYPE_UNSIGNED (from_type) != is_unsigned
+	  && !(is_unsigned && TYPE_PRECISION (from_type) > data_size))
+	return false;
+	}
+  /* else convert is a no-op for our purposes.  */
+}
+
   /* Verify that the machine can perform a widening multiply
  accumulate in this mode/signedness combination, otherwise
  this transformation is likely to pessimize code.  */


Re: [PATCH (4/7)] Unsigned multiplies using wider signed multiplies

2011-08-19 Thread Andrew Stubbs

On 14/07/11 15:25, Richard Guenther wrote:

Ok.


Committed, with no real changes. I just updated the testcase a little.
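
The underlying idea can be illustrated in plain C: if no unsigned
widening multiply exists for a mode, the operands can be zero-extended
to the next wider mode, where they are non-negative, and multiplied with
a signed widening multiply instead.  A toy sketch with 8-bit inputs (the
function name is hypothetical):

```c
#include <assert.h>
#include <stdint.h>

/* Multiply two unsigned 8-bit values using only a signed 16x16->32
   multiply.  Zero-extension to 16 bits leaves the sign bit clear, so
   the signed product equals the unsigned one.  */
static int32_t
toy_umul8_via_signed16 (uint8_t a, uint8_t b)
{
  int16_t wa = (int16_t) a;   /* 0..255 always fits in int16_t */
  int16_t wb = (int16_t) b;

  assert (wa >= 0 && wb >= 0);
  return (int32_t) wa * (int32_t) wb;
}
```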

Andrew
2011-08-19  Andrew Stubbs  

	gcc/
	* tree-ssa-math-opts.c (convert_mult_to_widen): Convert
	unsupported unsigned multiplies to signed.
	(convert_plusminus_to_widen): Likewise.

	gcc/testsuite/
	* gcc.target/arm/wmul-6.c: New file.

--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/wmul-6.c
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+/* { dg-require-effective-target arm_dsp } */
+
+long long
+foo (long long a, unsigned char *b, signed char *c)
+{
+  return a + (long long)*b * (long long)*c;
+}
+
+/* { dg-final { scan-assembler "smlalbb" } } */
--- a/gcc/tree-ssa-math-opts.c
+++ b/gcc/tree-ssa-math-opts.c
@@ -2068,12 +2068,13 @@ is_widening_mult_p (gimple stmt,
 static bool
 convert_mult_to_widen (gimple stmt, gimple_stmt_iterator *gsi)
 {
-  tree lhs, rhs1, rhs2, type, type1, type2, tmp;
+  tree lhs, rhs1, rhs2, type, type1, type2, tmp = NULL;
   enum insn_code handler;
   enum machine_mode to_mode, from_mode, actual_mode;
   optab op;
   int actual_precision;
   location_t loc = gimple_location (stmt);
+  bool from_unsigned1, from_unsigned2;
 
   lhs = gimple_assign_lhs (stmt);
   type = TREE_TYPE (lhs);
@@ -2085,10 +2086,12 @@ convert_mult_to_widen (gimple stmt, gimple_stmt_iterator *gsi)
 
   to_mode = TYPE_MODE (type);
   from_mode = TYPE_MODE (type1);
+  from_unsigned1 = TYPE_UNSIGNED (type1);
+  from_unsigned2 = TYPE_UNSIGNED (type2);
 
-  if (TYPE_UNSIGNED (type1) && TYPE_UNSIGNED (type2))
+  if (from_unsigned1 && from_unsigned2)
 op = umul_widen_optab;
-  else if (!TYPE_UNSIGNED (type1) && !TYPE_UNSIGNED (type2))
+  else if (!from_unsigned1 && !from_unsigned2)
 op = smul_widen_optab;
   else
 op = usmul_widen_optab;
@@ -2097,22 +2100,45 @@ convert_mult_to_widen (gimple stmt, gimple_stmt_iterator *gsi)
 		  0, &actual_mode);
 
   if (handler == CODE_FOR_nothing)
-return false;
+{
+  if (op != smul_widen_optab)
+	{
+	  from_mode = GET_MODE_WIDER_MODE (from_mode);
+	  if (GET_MODE_SIZE (to_mode) <= GET_MODE_SIZE (from_mode))
+	return false;
+
+	  op = smul_widen_optab;
+	  handler = find_widening_optab_handler_and_mode (op, to_mode,
+			  from_mode, 0,
+			  &actual_mode);
+
+	  if (handler == CODE_FOR_nothing)
+	return false;
+
+	  from_unsigned1 = from_unsigned2 = false;
+	}
+  else
+	return false;
+}
 
   /* Ensure that the inputs to the handler are in the correct precison
  for the opcode.  This will be the full mode size.  */
   actual_precision = GET_MODE_PRECISION (actual_mode);
-  if (actual_precision != TYPE_PRECISION (type1))
+  if (actual_precision != TYPE_PRECISION (type1)
+  || from_unsigned1 != TYPE_UNSIGNED (type1))
 {
   tmp = create_tmp_var (build_nonstandard_integer_type
-(actual_precision, TYPE_UNSIGNED (type1)),
+(actual_precision, from_unsigned1),
 			NULL);
   rhs1 = build_and_insert_cast (gsi, loc, tmp, rhs1);
-
+}
+  if (actual_precision != TYPE_PRECISION (type2)
+  || from_unsigned2 != TYPE_UNSIGNED (type2))
+{
   /* Reuse the same type info, if possible.  */
-  if (TYPE_UNSIGNED (type1) != TYPE_UNSIGNED (type2))
+  if (!tmp || from_unsigned1 != from_unsigned2)
 	tmp = create_tmp_var (build_nonstandard_integer_type
-(actual_precision, TYPE_UNSIGNED (type2)),
+(actual_precision, from_unsigned2),
 			  NULL);
   rhs2 = build_and_insert_cast (gsi, loc, tmp, rhs2);
 }
@@ -2137,7 +2163,7 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt,
 {
   gimple rhs1_stmt = NULL, rhs2_stmt = NULL;
   gimple conv1_stmt = NULL, conv2_stmt = NULL, conv_stmt;
-  tree type, type1, type2, tmp;
+  tree type, type1, type2, optype, tmp = NULL;
   tree lhs, rhs1, rhs2, mult_rhs1, mult_rhs2, add_rhs;
   enum tree_code rhs1_code = ERROR_MARK, rhs2_code = ERROR_MARK;
   optab this_optab;
@@ -2146,6 +2172,7 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt,
   enum machine_mode to_mode, from_mode, actual_mode;
   location_t loc = gimple_location (stmt);
   int actual_precision;
+  bool from_unsigned1, from_unsigned2;
 
   lhs = gimple_assign_lhs (stmt);
   type = TREE_TYPE (lhs);
@@ -2239,9 +2266,21 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt,
 
   to_mode = TYPE_MODE (type);
   from_mode = TYPE_MODE (type1);
+  from_unsigned1 = TYPE_UNSIGNED (type1);
+  from_unsigned2 = TYPE_UNSIGNED (type2);
 
-  if (TYPE_UNSIGNED (type1) != TYPE_UNSIGNED (type2))
-return false;
+  /* There's no such thing as a mixed sign madd yet, so use a wider mode.  */
+  if (from_unsigned1 != from_unsigned2)
+{
+  enum machine_mode mode = GET_MODE_WIDER_MODE (from_mode);
+  if (GET_MODE_PRECISION (mode) < GET_MODE_PRECISION (to_mode))
+	{
+	  from_mode = mode;
+	  from_unsigned1 = from_unsigned2 = false;
+	}
+  else
+	return false;
+}
 
   /* If there was a conversio

Re: [PATCH][2/2] Fixup dr_analyze_indices, fix PR50067

2011-08-19 Thread Richard Guenther
On Fri, 19 Aug 2011, Richard Guenther wrote:

> 
> This is the fix for the testcase in PR50067.  We strip outermost
> (yes, outermost only, which makes it very inefficient) MEM_REFs
> which causes the DR base objects in the PR to agree for two
> conflicting DRs, but with the issues we have with how we
> compose access functions they still get disambiguated.
> 
> Bootstrap and regtest pending on x86_64-unknown-linux-gnu.

Updated patch, which exposes some latent wide-multiply GIMPLE
type issues at least.  Fixes all known testcases I have.

We can't really use the MEM_REF (or maybe any?) offset from
the base object as independent access-function.  Nor can
we replace the base with scev_not_known - the alias oracle
will assume funny things about this (a latent issue for sure).

Bootstrap and regtest running on x86_64-unknown-linux-gnu,
I'll have to dig into the latent issue exposed first, but that's
for next week only.

Eventually all these changes need backporting to 4.6, even if
the individual testcases do not fail there.

Thanks,
Richard.

2011-08-19  Richard Guenther  

PR tree-optimization/50067
* tree-data-ref.c (dr_analyze_indices): Do not strip
outermost MEM_REF off its ADDR_EXPR operand.  Do not strip
offset operand off MEM_REFs.  Do not leak scev_not_known
into DR_BASE_OBJECT.

* gcc.dg/torture/pr50067-3.c: New testcase.
* gcc.dg/torture/pr50067-4.c: Likewise.

Index: gcc/tree-data-ref.c
===
*** gcc/tree-data-ref.c (revision 177903)
--- gcc/tree-data-ref.c (working copy)
*** dr_analyze_indices (struct data_referenc
*** 868,894 
op = TREE_OPERAND (aref, 0);
access_fn = analyze_scalar_evolution (loop, op);
access_fn = instantiate_scev (before_loop, loop, access_fn);
!   base = initial_condition (access_fn);
!   split_constant_offset (base, &base, &off);
!   if (!integer_zerop (TREE_OPERAND (aref, 1)))
{
! off = size_binop (PLUS_EXPR, off,
!   fold_convert (ssizetype, TREE_OPERAND (aref, 1)));
! TREE_OPERAND (aref, 1)
!   = build_int_cst (TREE_TYPE (TREE_OPERAND (aref, 1)), 0);
}
-   access_fn = chrec_replace_initial_condition (access_fn,
-   fold_convert (TREE_TYPE (base), off));
- 
-   TREE_OPERAND (aref, 0) = base;
VEC_safe_push (tree, heap, access_fns, access_fn);
  }
  
-   if (TREE_CODE (ref) == MEM_REF
-   && TREE_CODE (TREE_OPERAND (ref, 0)) == ADDR_EXPR
-   && integer_zerop (TREE_OPERAND (ref, 1)))
- ref = TREE_OPERAND (TREE_OPERAND (ref, 0), 0);
- 
/* For canonicalization purposes we'd like to strip all outermost
   zero-offset component-refs.
   ???  For now simply handle zero-index array-refs.  */
--- 868,884 
op = TREE_OPERAND (aref, 0);
access_fn = analyze_scalar_evolution (loop, op);
access_fn = instantiate_scev (before_loop, loop, access_fn);
!   if (!automatically_generated_chrec_p (access_fn))
{
! base = initial_condition (access_fn);
! split_constant_offset (base, &base, &off);
! access_fn = chrec_replace_initial_condition
! (access_fn, fold_convert (TREE_TYPE (base), off));
! TREE_OPERAND (aref, 0) = base;
}
VEC_safe_push (tree, heap, access_fns, access_fn);
  }
  
/* For canonicalization purposes we'd like to strip all outermost
   zero-offset component-refs.
   ???  For now simply handle zero-index array-refs.  */
Index: gcc/testsuite/gcc.dg/torture/pr50067-3.c
===
*** gcc/testsuite/gcc.dg/torture/pr50067-3.c(revision 0)
--- gcc/testsuite/gcc.dg/torture/pr50067-3.c(revision 0)
***
*** 0 
--- 1,20 
+ /* { dg-do run } */
+ /* { dg-options "-fpredictive-commoning" } */
+ 
+ extern void abort (void);
+ int a[6] = { 0, 0, 0, 0, 7, 0 };
+ static int *p = &a[4];
+ 
+ int
+ main ()
+ {
+   int i;
+   for (i = 0; i < 4; ++i)
+ {
+   a[i + 1] = a[i + 2] > i;
+   *p &= ~1;
+ }
+   if (a[4] != 0)
+ abort ();
+   return 0;
+ }
Index: gcc/testsuite/gcc.dg/torture/pr50067-4.c
===
*** gcc/testsuite/gcc.dg/torture/pr50067-4.c(revision 0)
--- gcc/testsuite/gcc.dg/torture/pr50067-4.c(revision 0)
***
*** 0 
--- 1,20 
+ /* { dg-do run } */
+ 
+ /* Verify we do not get a bogus access function with 0B vs. 1B which
+disambiguates both accesses and leads to vectorization.  */
+ 
+ extern int memcmp(const void *, const void *, __SIZE_TYPE__);
+ extern void abort (void);
+ short a[33] = { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31 };
+ short b[33] = { 0, };
+ int main()
+ {
+   int i;
+   for (i = 0; i < 64; ++i)
+ {
+

Re: [PATCH][2/2] Fixup dr_analyze_indices, fix PR50067

2011-08-19 Thread Richard Guenther
On Fri, 19 Aug 2011, Richard Guenther wrote:

> On Fri, 19 Aug 2011, Richard Guenther wrote:
> 
> > 
> > This is the fix for the testcase in PR50067.  We strip outermost
> > (yes, outermost only, which makes it very inefficient) MEM_REFs
> > which causes the DR base objects in the PR to agree for two
> > conflicting DRs, but with the issues we have with how we
> > compose access functions they still get disambiguated.
> > 
> > Bootstrap and regtest pending on x86_64-unknown-linux-gnu.
> 
> Updated patch, which exposes some latent wide-multiply GIMPLE
> type issues at least.  Fixes all known testcases I have.
> 
> We can't really use the MEM_REF (or maybe any?) offset from
> the base object as independent access-function.  Nor can
> we replace the base with scev_not_known - the alias oracle
> will assume funny things about this (a latent issue for sure).
> 
> Bootstrap and regtest running on x86_64-unknown-linux-gnu,
> I'll have to dig into the latent issue exposed first, but that's
> for next week only.

Bah, bootstrap fails very early when building stage1 libgcc:

/abuild/rguenther/obj3/./gcc/xgcc -B/abuild/rguenther/obj3/./gcc/ 
-B/usr/local/x86_64-unknown-linux-gnu/bin/ 
-B/usr/local/x86_64-unknown-linux-gnu/lib/ -isystem 
/usr/local/x86_64-unknown-linux-gnu/include -isystem 
/usr/local/x86_64-unknown-linux-gnu/sys-include -g -O2 -m32 -O2  -I. 
-I. -I/space/rguenther/src/svn/trunk/gcc 
-I/space/rguenther/src/svn/trunk/gcc/. 
-I/space/rguenther/src/svn/trunk/gcc/../include 
-I/space/rguenther/src/svn/trunk/gcc/../libdecnumber 
-I/space/rguenther/src/svn/trunk/gcc/../libdecnumber/bid -I../libdecnumber 
-I/space/rguenther/src/svn/trunk/gcc/../libgcc -g -O2 -DIN_GCC   -W -Wall 
-Wwrite-strings -Wcast-qual -Wstrict-prototypes -Wmissing-prototypes 
-Wold-style-definition  -isystem ./include  -fPIC -g -DHAVE_GTHR_DEFAULT 
-DIN_LIBGCC2 -fbuilding-libgcc -fno-stack-protector   -I. -I. 
-I../../.././gcc -I/space/rguenther/src/svn/trunk/libgcc 
-I/space/rguenther/src/svn/trunk/libgcc/. 
-I/space/rguenther/src/svn/trunk/libgcc/../gcc 
-I/space/rguenther/src/svn/trunk/libgcc/../include 
-I/space/rguenther/src/svn/trunk/libgcc/config/libbid 
-DENABLE_DECIMAL_BID_FORMAT -DHAVE_CC_TLS  -DUSE_TLS -o bid64_div.o -MT 
bid64_div.o -MD -MP -MF bid64_div.dep -c 
/space/rguenther/src/svn/trunk/libgcc/config/libbid/bid64_div.c
/space/rguenther/src/svn/trunk/libgcc/config/libbid/bid64_div.c: In 
function '__bid64dq_div':
/space/rguenther/src/svn/trunk/libgcc/config/libbid/bid64_div.c:523:51: 
warning: variable 'Ql' set but not used [-Wunused-but-set-variable]
/space/rguenther/src/svn/trunk/libgcc/config/libbid/bid64_div.c: In 
function '__bid64qd_div':
/space/rguenther/src/svn/trunk/libgcc/config/libbid/bid64_div.c:937:51: 
warning: variable 'Ql' set but not used [-Wunused-but-set-variable]
/space/rguenther/src/svn/trunk/libgcc/config/libbid/bid64_div.c: In 
function '__bid64qq_div':
/space/rguenther/src/svn/trunk/libgcc/config/libbid/bid64_div.c:1374:51: 
warning: variable 'Ql' set but not used [-Wunused-but-set-variable]
PD_359 = digit_22 w* 109951163;

/space/rguenther/src/svn/trunk/libgcc/config/libbid/bid64_div.c: In 
function '__bid64_div':
/space/rguenther/src/svn/trunk/libgcc/config/libbid/bid64_div.c:80:1: 
internal compiler error: verify_gimple failed
Please submit a full bug report,
with preprocessed source if appropriate.
See  for instructions.

I suppose it's exposed by

2011-08-19  Andrew Stubbs  

   * tree-ssa-math-opts.c (build_and_insert_cast): New function.
(is_widening_mult_rhs_p): Allow widening by more than one mode.
Explicitly disallow mis-matched input types.
(convert_mult_to_widen): Use find_widening_optab_handler, and cast
input types to fit the new handler.
(convert_plusminus_to_widen): Likewise.

Andrew - apparently you broke bootstrap on x86_64.

Richard.


Re: [PATCH (5/7)] Widening multiplies for mis-matched mode inputs

2011-08-19 Thread Andrew Stubbs

On 14/07/11 15:31, Richard Guenther wrote:

Ok.


I've just committed this patch with no real changes. I've just updated 
the testcase.


Andrew
2011-08-19  Andrew Stubbs  

	gcc/
	* tree-ssa-math-opts.c (is_widening_mult_p): Remove FIXME.
	Ensure the larger type is the first operand.

	gcc/testsuite/
	* gcc.target/arm/wmul-7.c: New file.
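
As a rough illustration of "ensure the larger type is the first operand", the swap can be modelled with a simplified operand record. The struct `mult_operand` and the function `order_wider_first` are hypothetical names for this sketch; the real code swaps tree nodes through the `*type1_out`/`*rhs1_out` pointers. What matters is that the type and the value it describes move together.

```c
/* Hypothetical, simplified operand description; GCC itself works on
   tree nodes rather than a struct like this.  */
struct mult_operand
{
  unsigned precision;   /* width of the unwidened type, in bits */
  const char *rhs;      /* name standing in for the rhs tree */
};

/* Swap the two operands if needed so that OP1 is at least as wide
   as OP2.  Type and value are swapped as one unit.  */
static void
order_wider_first (struct mult_operand *op1, struct mult_operand *op2)
{
  if (op1->precision < op2->precision)
    {
      struct mult_operand tmp = *op1;
      *op1 = *op2;
      *op2 = tmp;
    }
}
```

Later code can then rely on the first operand's mode being the wider of the two.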

--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/wmul-7.c
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+/* { dg-require-effective-target arm_dsp } */
+
+unsigned long long
+foo (unsigned long long a, unsigned char *b, unsigned short *c)
+{
+  return a + *b * *c;
+}
+
+/* { dg-final { scan-assembler "umlal" } } */
--- a/gcc/tree-ssa-math-opts.c
+++ b/gcc/tree-ssa-math-opts.c
@@ -2054,9 +2054,17 @@ is_widening_mult_p (gimple stmt,
   *type2_out = *type1_out;
 }
 
-  /* FIXME: remove this restriction.  */
-  if (TYPE_PRECISION (*type1_out) != TYPE_PRECISION (*type2_out))
-return false;
+  /* Ensure that the larger of the two operands comes first. */
+  if (TYPE_PRECISION (*type1_out) < TYPE_PRECISION (*type2_out))
+{
+  tree tmp;
+  tmp = *type1_out;
+  *type1_out = *type2_out;
+  *type2_out = tmp;
+  tmp = *rhs1_out;
+  *rhs1_out = *rhs2_out;
+  *rhs2_out = tmp;
+}
 
   return true;
 }


Re: [PATCH (1/7)] New optab framework for widening multiplies

2011-08-19 Thread Richard Guenther
On Fri, Aug 19, 2011 at 4:18 PM, Andrew Stubbs  wrote:
> On 22/07/11 16:34, Andrew Stubbs wrote:
>>
>> On 22/07/11 14:28, Bernd Schmidt wrote:
>>>
>>> Oh well, let's shelve it and do it later.
>>
>> Here's an updated patch with the formatting problem you found fixed.
>
> I've just committed an updated version of this patch (attached).
>
> I found a number of subtle bugs while I was testing, and these have now been
> corrected. In particular, I found that VOIDmode constants were not handled
> correctly; I've added a function "widened_mode" along the lines originally
> suggested by Bernd to deal with this. I also found one case where different
> code was produced than previously; although it was actually corrected later in
> the patch series, I've fixed it here now.

Seems one in the series has broken bootstrap on x86_64 when building
the 32bit libgcc multilib in stage1.

Richard.

> Andrew
>
>


Re: PING: PATCH: PR target/46770: Use .init_array/.fini_array sections

2011-08-19 Thread H.J. Lu
On Fri, Aug 19, 2011 at 1:17 AM, Jakub Jelinek  wrote:
> Sorry for the delay.
>
>> --- a/gcc/config.gcc
>> +++ b/gcc/config.gcc
>> @@ -186,6 +186,9 @@
>>  #  configure_default_options
>>  #                    Set to an initializer for configure_default_options
>>  #                    in configargs.h, based on --with-cpu et cetera.
>> +#
>> +#  use_initfini_array        If set to yes, .init_array/.fini_array sections
>> +#                    will be used if they work.
>>
>>  # The following variables are used in each case-construct to build up the
>>  # outgoing variables:
>> @@ -238,6 +241,7 @@ default_gnu_indirect_function=no
>>  target_gtfiles=
>>  need_64bit_hwint=
>>  need_64bit_isa=
>> +use_initfini_array=yes
>
> What is this for, when nothing ever sets it to anything but yes?
> If the $enable_initfini_array = yes test works, then there shouldn't be
> any need to override it on a per-target basis...

Done.

>> --- /dev/null
>> +++ b/gcc/config/initfini-array.h
>> @@ -0,0 +1,44 @@
>> +/* Definitions for ELF systems with .init_array/.fini_array section
>> +   support.
>> +   Copyright (C) 2011
>> +   Free Software Foundation, Inc.
>> +
>> +   This file is part of GCC.
>> +
>> +   GCC is free software; you can redistribute it and/or modify it
>> +   under the terms of the GNU General Public License as published
>> +   by the Free Software Foundation; either version 3, or (at your
>> +   option) any later version.
>> +
>> +   GCC is distributed in the hope that it will be useful, but WITHOUT
>> +   ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
>> +   or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public
>> +   License for more details.
>> +
>> +   You should have received a copy of the GNU General Public License
>> +   along with GCC; see the file COPYING3.  If not see
>> +   .  */
>> +
>> +#define USE_INITFINI_ARRAY
>> +
>> +#undef INIT_SECTION_ASM_OP
>> +#undef FINI_SECTION_ASM_OP
>> +
>> +/* FIXME: INIT_ARRAY_SECTION_ASM_OP and FINI_ARRAY_SECTION_ASM_OP
>> +       aren't used in any assembly codes.  But we have to define
>> +       them to something.  */
>> +#define INIT_ARRAY_SECTION_ASM_OP Something
>> +#define FINI_ARRAY_SECTION_ASM_OP Something
>
> Can't you just define it to an empty string?  And, a couple of targets
> define INIT_ARRAY_SECTION_ASM_OP/FINI_ARRAY_SECTION_ASM_OP, you either need
> to undef it first, or define only if it wasn't defined.

Done.
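
A minimal sketch of the "define only if it wasn't defined" option Jakub mentions, using an empty string as the fallback. This is an assumption about one way to do it, not necessarily what the updated patch does; the two accessor functions are hypothetical and exist only so the result can be inspected.

```c
/* A target header included earlier may already provide its own
   definition, e.g.
     #define INIT_ARRAY_SECTION_ASM_OP "\t.section\t.init_array"
   In that case the fallbacks below are skipped.  */

/* Define the fallback only when nothing earlier defined the macro.  */
#ifndef INIT_ARRAY_SECTION_ASM_OP
#define INIT_ARRAY_SECTION_ASM_OP ""
#endif

#ifndef FINI_ARRAY_SECTION_ASM_OP
#define FINI_ARRAY_SECTION_ASM_OP ""
#endif

/* Hypothetical accessors, for illustration only.  */
static const char *
init_array_asm_op (void)
{
  return INIT_ARRAY_SECTION_ASM_OP;
}

static const char *
fini_array_asm_op (void)
{
  return FINI_ARRAY_SECTION_ASM_OP;
}
```

With no earlier definition in scope, both accessors return the empty string; a target that defines its own value keeps it untouched.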

>> +
>> +#ifndef TARGET_ASM_INIT_SECTIONS
>> +#define TARGET_ASM_INIT_SECTIONS default_elf_initfini_array_init_sections
>> +#endif
>> +extern void default_elf_initfini_array_init_sections (void);
>
> Why do you need this (and the default_initfini_array_init_sections () call
> in all the backends)?  Isn't it easier to just initialize the two global
> vars only when you are actually going to use them (if they are still NULL)?

Done.

>> --- a/gcc/varasm.c
>> +++ b/gcc/varasm.c
>> @@ -7350,4 +7350,62 @@ make_debug_expr_from_rtl (const_rtx exp)
>>    return dval;
>>  }
>>
>> +static GTY(()) section *elf_init_array_section;
>> +static GTY(()) section *elf_fini_array_section;
>> +
>> +void
>> +default_elf_initfini_array_init_sections (void)
>> +{
>> +  elf_init_array_section = get_unnamed_section (0, output_section_asm_op,
>> +                                             "\t.section\t.init_array");
>> +  elf_fini_array_section = get_unnamed_section (0, output_section_asm_op,
>> +                                             "\t.section\t.fini_array");
>> +}
>
> Remove above function.

Done.

>> +
>> +static section *
>> +get_elf_initfini_array_priority_section (int priority,
>> +                                      bool constructor_p)
>> +{
>> +  section *sec;
>> +  if (priority != DEFAULT_INIT_PRIORITY)
>> +    {
>> +      char buf[18];
>> +      sprintf (buf, "%s.%.5u",
>> +            constructor_p ? ".init_array" : ".fini_array",
>> +            priority);
>> +      sec = get_section (buf, SECTION_WRITE, NULL_TREE);
>> +    }
>
> I'd just put here
>   else
>     {
>       if (elf_init_array_section == NULL)
>         elf_init_array_section = get_unnamed_section...
>       if (elf_fini_array_section == NULL)
>         elf_fini_array_section = get_unnamed_section...
>> +    sec = constructor_p ? elf_init_array_section : elf_fini_array_section;
>     }

Done.
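
The lazy initialization suggested above can be sketched outside of GCC as follows. Everything here is a stand-in: `struct section`, the two static storage objects, and the function names are hypothetical, and the real code would call `get_unnamed_section` instead of taking the address of a static object. The second function only reproduces the name format from the quoted patch to show why an 18-byte buffer is enough (11 characters of prefix, a dot, five digits, and the terminating NUL).

```c
#include <stddef.h>
#include <stdio.h>

/* Stand-in for GCC's section type; hypothetical for this sketch.  */
struct section
{
  const char *asm_op;
};

static struct section init_array_storage = { "\t.section\t.init_array" };
static struct section fini_array_storage = { "\t.section\t.fini_array" };

static struct section *elf_init_array_section;
static struct section *elf_fini_array_section;

/* Create the two default sections on first use instead of in a
   separate init hook that every backend would have to call.  */
static struct section *
get_default_initfini_section (int constructor_p)
{
  if (elf_init_array_section == NULL)
    elf_init_array_section = &init_array_storage;
  if (elf_fini_array_section == NULL)
    elf_fini_array_section = &fini_array_storage;
  return constructor_p ? elf_init_array_section : elf_fini_array_section;
}

/* Build the per-priority section name in the same format as the
   quoted patch, e.g. ".init_array.00101" for priority 101.  */
static void
format_priority_section_name (char *buf, size_t size,
                              int constructor_p, unsigned priority)
{
  snprintf (buf, size, "%s.%.5u",
            constructor_p ? ".init_array" : ".fini_array", priority);
}
```

Repeated calls return the same section object, so callers never observe a NULL section and no target-specific initialization call is needed.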

>> +void
>> +default_initfini_array_init_sections (void)
>> +{
>> +#ifdef USE_INITFINI_ARRAY
>> +  default_elf_initfini_array_init_sections ();
>> +#endif
>> +}
>
> And remove this (and all callers etc.).

Done.

> On which targets has it been tested?  Would be nice to test it at least on
> targets that define their own INIT_ARRAY_SECTION_ASM_OP (pa64-hpux, arm,
> m32c, rx) and on {i?86,x86_64,ia64}-linux and some solaris target.
>

Here is the updated patch. I tested it on Linux/ia32 and Linux/x86-64.
OK for trunk?

Bootstrap on Linux/ia64 has failed for a while and I don't have other
platforms. If it can't be easily tested on ot

Re: [PATCH (6/7)] More widening multiply-and-accumulate pattern matching

2011-08-19 Thread Andrew Stubbs

On 14/07/11 15:35, Richard Guenther wrote:

Ok.


I've just committed this updated patch.

I found bugs with VOIDmode constants that have caused me to recast my 
patches to is_widening_mult_rhs_p. They should be logically the same for 
non VOIDmode cases, but work correctly for constants. I think the new 
version is a bit easier to understand in any case.


Andrew
2011-08-19  Andrew Stubbs  

	gcc/
	* tree-ssa-math-opts.c (is_widening_mult_rhs_p): Add new argument
	'type'.
	Use 'type' from caller, not inferred from 'rhs'.
	Don't reject non-conversion statements. Do return lhs in this case.
	(is_widening_mult_p): Add new argument 'type'.
	Use 'type' from caller, not inferred from 'stmt'.
	Pass type to is_widening_mult_rhs_p.
	(convert_mult_to_widen): Pass type to is_widening_mult_p.
	(convert_plusminus_to_widen): Likewise.

	gcc/testsuite/
	* gcc.target/arm/wmul-8.c: New file.

--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/wmul-8.c
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+/* { dg-require-effective-target arm_dsp } */
+
+long long
+foo (long long a, int *b, int *c)
+{
+  return a + *b * *c;
+}
+
+/* { dg-final { scan-assembler "smlal" } } */
--- a/gcc/tree-ssa-math-opts.c
+++ b/gcc/tree-ssa-math-opts.c
@@ -1966,7 +1966,8 @@ struct gimple_opt_pass pass_optimize_bswap =
  }
 };
 
-/* Return true if RHS is a suitable operand for a widening multiplication.
+/* Return true if RHS is a suitable operand for a widening multiplication,
+   assuming a target type of TYPE.
There are two cases:
 
  - RHS makes some value at least twice as wide.  Store that value
@@ -1976,27 +1977,31 @@ struct gimple_opt_pass pass_optimize_bswap =
but leave *TYPE_OUT untouched.  */
 
 static bool
-is_widening_mult_rhs_p (tree rhs, tree *type_out, tree *new_rhs_out)
+is_widening_mult_rhs_p (tree type, tree rhs, tree *type_out,
+			tree *new_rhs_out)
 {
   gimple stmt;
-  tree type, type1, rhs1;
+  tree type1, rhs1;
   enum tree_code rhs_code;
 
   if (TREE_CODE (rhs) == SSA_NAME)
 {
-  type = TREE_TYPE (rhs);
   stmt = SSA_NAME_DEF_STMT (rhs);
-  if (!is_gimple_assign (stmt))
-	return false;
-
-  rhs_code = gimple_assign_rhs_code (stmt);
-  if (TREE_CODE (type) == INTEGER_TYPE
-	  ? !CONVERT_EXPR_CODE_P (rhs_code)
-	  : rhs_code != FIXED_CONVERT_EXPR)
-	return false;
+  if (is_gimple_assign (stmt))
+	{
+	  rhs_code = gimple_assign_rhs_code (stmt);
+	  if (TREE_CODE (type) == INTEGER_TYPE
+	  ? !CONVERT_EXPR_CODE_P (rhs_code)
+	  : rhs_code != FIXED_CONVERT_EXPR)
+	rhs1 = rhs;
+	  else
+	rhs1 = gimple_assign_rhs1 (stmt);
+	}
+  else
+	rhs1 = rhs;
 
-  rhs1 = gimple_assign_rhs1 (stmt);
   type1 = TREE_TYPE (rhs1);
+
   if (TREE_CODE (type1) != TREE_CODE (type)
 	  || TYPE_PRECISION (type1) * 2 > TYPE_PRECISION (type))
 	return false;
@@ -2016,28 +2021,27 @@ is_widening_mult_rhs_p (tree rhs, tree *type_out, tree *new_rhs_out)
   return false;
 }
 
-/* Return true if STMT performs a widening multiplication.  If so,
-   store the unwidened types of the operands in *TYPE1_OUT and *TYPE2_OUT
-   respectively.  Also fill *RHS1_OUT and *RHS2_OUT such that converting
-   those operands to types *TYPE1_OUT and *TYPE2_OUT would give the
-   operands of the multiplication.  */
+/* Return true if STMT performs a widening multiplication, assuming the
+   output type is TYPE.  If so, store the unwidened types of the operands
+   in *TYPE1_OUT and *TYPE2_OUT respectively.  Also fill *RHS1_OUT and
+   *RHS2_OUT such that converting those operands to types *TYPE1_OUT
+   and *TYPE2_OUT would give the operands of the multiplication.  */
 
 static bool
-is_widening_mult_p (gimple stmt,
+is_widening_mult_p (tree type, gimple stmt,
 		tree *type1_out, tree *rhs1_out,
 		tree *type2_out, tree *rhs2_out)
 {
-  tree type;
-
-  type = TREE_TYPE (gimple_assign_lhs (stmt));
   if (TREE_CODE (type) != INTEGER_TYPE
   && TREE_CODE (type) != FIXED_POINT_TYPE)
 return false;
 
-  if (!is_widening_mult_rhs_p (gimple_assign_rhs1 (stmt), type1_out, rhs1_out))
+  if (!is_widening_mult_rhs_p (type, gimple_assign_rhs1 (stmt), type1_out,
+			   rhs1_out))
 return false;
 
-  if (!is_widening_mult_rhs_p (gimple_assign_rhs2 (stmt), type2_out, rhs2_out))
+  if (!is_widening_mult_rhs_p (type, gimple_assign_rhs2 (stmt), type2_out,
+			   rhs2_out))
 return false;
 
   if (*type1_out == NULL)
@@ -2089,7 +2093,7 @@ convert_mult_to_widen (gimple stmt, gimple_stmt_iterator *gsi)
   if (TREE_CODE (type) != INTEGER_TYPE)
 return false;
 
-  if (!is_widening_mult_p (stmt, &type1, &rhs1, &type2, &rhs2))
+  if (!is_widening_mult_p (type, stmt, &type1, &rhs1, &type2, &rhs2))
 return false;
 
   to_mode = TYPE_MODE (type);
@@ -2255,7 +2259,7 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt,
   if (code == PLUS_EXPR
   && (rhs1_code == MULT_EXPR || rhs1_code == WIDEN_MULT_EXPR))
 {
-  if (!is_widening_mult_p (rhs1_

Re: [PATCH (1/7)] New optab framework for widening multiplies

2011-08-19 Thread Andrew Stubbs

On 19/08/11 15:45, Richard Guenther wrote:

Seems one in the series has broken bootstrap on x86_64 when building
the 32bit libgcc multilib in stage1.


Oh? Hopefully that'll be fixed when I complete the patchset. Patches 8 
and 9 (of 7) did fix issues with the earlier patches.


Andrew


Re: [PATCH, testsuite, i386] BMI2 support for GCC

2011-08-19 Thread Kirill Yukhin
Done. Patch attached in previous mail

K

On Fri, Aug 19, 2011 at 6:51 PM, Kirill Yukhin  wrote:
> On Fri, Aug 19, 2011 at 6:31 PM, H.J. Lu  wrote:
>> It is hard to tell.  Can you double check indentation on
>>
>> +  if (can_create_pseudo_p () && mode != SImode)
>> +  {
>> +    rtx tmp = gen_rtx_REG (mode, 0);
>> +    emit_insn (gen_extendsidi2 (tmp, operands[2]));
>> +    operands[2] = tmp;
>> +  }
>>
>>
>> On Fri, Aug 19, 2011 at 7:13 AM, Kirill Yukhin  
>> wrote:
>>> Thanks, fixed.
>>>
>>> Updated patch is attached.
>>>
>>> K
>>>
>>> On Fri, Aug 19, 2011 at 6:04 PM, H.J. Lu  wrote:
 No need for () in "(mode == SImode)":

 +         && !optimize_function_for_size_p (cfun)
 +         && ((mode == SImode) || (mode == DImode && TARGET_64BIT))

 Wrong placement of '{':

 +  if (can_create_pseudo_p () && mode != SImode) {
 +    rtx tmp = gen_rtx_REG (mode, 0);
 +    emit_insn (gen_extendsidi2 (tmp, operands[2]));
 +    operands[2] = tmp;
 +  }



 On Fri, Aug 19, 2011 at 6:53 AM, Kirill Yukhin  
 wrote:
> Thanks!
> Fixed, updated patch is attached.
>
> Is it ok?
>
> Thanks, K
>
> On Fri, Aug 19, 2011 at 5:22 PM, Jakub Jelinek  wrote:
>> On Fri, Aug 19, 2011 at 05:18:19PM +0400, Kirill Yukhin wrote:
>>> Thanks, it is fixed.
>>> Update patch is attached.
>>
>> +         /* We generatin RORX instruction, freedom of register +
>> +            flags not affected  */
>>
>> comment doesn't look to be correct english (missing verb, missing g at
>> the end of generating, missing dot at the end of sentence).
>>
>>        Jakub
>>
>



 --
 H.J.

>>>
>>
>>
>>
>> --
>> H.J.
>>
>


Re: Vector Comparison patch

2011-08-19 Thread Richard Guenther
On Fri, Aug 19, 2011 at 2:29 AM, Artem Shinkarov
 wrote:
> Hi, I had the problem with passing information about single variable
> from expand_vec_cond_expr optab into ix86_expand_*_vcond.
>
> I looked into this problem for quite a while and found a solution.
> Now the question is whether it could be done better.
>
> First of all the problem:
>
> If we represent any vector comparison with VEC_COND_EXPR < v0  v1
> ? {-1,...} : {0,...} >, then in the assembler we do not want to see
> this useless comparison with {-1...}.
>
> Now it is easy to fix the problem about excessive masking. The real
> challenge starts when the comparison inside vcond is expressed as a
> variable. In that case in order to construct correct vector expression
> we need to adjust cond in cond ? v0 : v1 to  cond == {-1...} or as we
> agreed recently cond != {0,..}. But that we need to do only to make
> vec_cond_expr happy. On the level of assembler we don't want this
> condition.
>
> Now, if I just construct the tree, then in x86, rtx_equal_p, does not
> know that this is a constant vector full of -1, because the comparison
> operands are not immediate. So I need somehow to mark the fact in
> optabs, and then check the information in the x86.

Well, this is why I was suggesting the bitwise semantic for a mask
operand.  What we should do on the tree level (and that should happen
already), is forward the comparison into the COND_EXPR.  Thus,

mask = v1 < v2;
v3 = mask ? v4 : v5;

should get changed to

v3 = v1 < v2 ? v4 : v5;

by tree-ssa-forwprop.c.  If that is not happening we have to fix that there.

Because we _don't_ know the mask is all -1 or 0 ;)  The user might
put in {3, 5, 1, 3} and expect it to be treated like {-1,...} but it isn't
so already.
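
To make the difference concrete, here is a scalar C sketch of the two possible meanings of a mask operand. It is an illustration under assumptions, not GCC code: the function names are hypothetical, and plain arrays stand in for vector types. With the bitwise semantic each result bit is taken from one input or the other, so only all-ones and all-zeros elements act as a clean per-element select.

```c
#include <stddef.h>
#include <stdint.h>

/* Bitwise semantic: each result bit comes from A where the mask bit
   is 1 and from B where it is 0.  */
static void
select_bitwise (const int32_t *mask, const int32_t *a, const int32_t *b,
                int32_t *out, size_t n)
{
  for (size_t i = 0; i < n; i++)
    out[i] = (a[i] & mask[i]) | (b[i] & ~mask[i]);
}

/* Boolean semantic: any non-zero mask element selects all of A.  */
static void
select_boolean (const int32_t *mask, const int32_t *a, const int32_t *b,
                int32_t *out, size_t n)
{
  for (size_t i = 0; i < n; i++)
    out[i] = mask[i] != 0 ? a[i] : b[i];
}
```

For a mask element of 3 with a = 8 and b = 0, the bitwise version yields 0 while the boolean version yields 8; for mask elements of -1 or 0 the two agree.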

> At the moment I do something like this:
>
> optabs:
>
> if (!COMPARISON_CLASS_P (op0))
>  ops[3] = gen_rtx_EQ (mode, NULL_RTX, NULL_RTX);
>
> This expression is preserved while checking and verifying.
>
> ix86:
> if (GET_CODE (comp) == EQ && XEXP (comp, 0) == NULL_RTX
>      && XEXP (comp, 1) == NULL_RTX)
>
> See the patch attached for more details. The patch is just to give you
> an idea of the way I am doing it and it seems to work. Please don't
> criticise the patch itself, better help me to understand if there is a
> better way to pass the information from optabs to ix86.

Hm, I'm not sure the expand_vec_cond_expr will work that way,
I'd have to play with it myself (but will now be running for weekend).

Is the special-casing of a < b ? {-1,-1,-1,-1} : {0,0,0,0} in the backend
working for you?  I think there are probably some rtl all-ones and all-zeros
predicates you can re-use.

Richard.

>
> Thanks,
> Artem.
>
> On Thu, Aug 18, 2011 at 3:31 PM, Richard Henderson  wrote:
>> On 08/18/2011 02:23 AM, Richard Guenther wrote:
> >> The first one (inefficient) is vec0 > vec1 ? {-1,...} : {0,...}
> >> The second is vec0 > vec1. expand_vec_cond_expr is stupid, which is
> >> fine, but it means that we need to construct it carefully.
 >
 > This is still important.
>>> Yes.  I think the backends need to handle optimizing this case,
>>> esp. considering targets that do not have instructions to produce
>>> a {-1,...}/{0,...} bitmask from a comparison but produce a vector
>>> of condition codes.  With using vec0 > vec1 ? {-1...} : {0,...} for
>>> mask = vec0 > vec1; we avoid exposing the result kind of
>>> vector comparisons.
>>>
>>> It should be easily possible for x86 for example to recognize
>>> the -1 : 0 case.
>>>
>>
>> I think you've been glossing over the hard part with "..." up there.
>> I challenge you to actually fill that in with something meaningful
>> in rtl.
>>
>> I suspect that you simply have to add another named pattern that
>> will Do What You Want on mips and suchlike that produce a CCmode.
>>
>>
>>
>> r~
>>
>


Re: PING: PATCH: PR target/46770: Use .init_array/.fini_array sections

2011-08-19 Thread Jakub Jelinek
On Fri, Aug 19, 2011 at 07:47:40AM -0700, H.J. Lu wrote:
> 2011-08-19  H.J. Lu  
> 
>   PR target/46770
>   * config.gcc (tm_file): Add initfini-array.h if
>   .init_arary/.fini_array supported.

s/arary/array/

Ok if nobody objects within 24 hours, but please watch for any fallouts.

Jakub


Re: [PATCH (7/7)] Mixed-sign multiplies using narrowest mode

2011-08-19 Thread Andrew Stubbs

On 14/07/11 15:41, Richard Guenther wrote:

Ok.


Committed, unchanged apart from the test case.

Andrew
2011-08-19  Andrew Stubbs  

	gcc/
	* tree-ssa-math-opts.c (convert_mult_to_widen): Better handle
	unsigned inputs of different modes.
	(convert_plusminus_to_widen): Likewise.

	gcc/testsuite/
	* gcc.target/arm/wmul-9.c: New file.
	* gcc.target/arm/wmul-bitfield-2.c: New file.
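
The condition this patch introduces can be modelled as a small predicate. This is a sketch under assumptions: the function name `needs_wider_mode` is hypothetical and precisions are passed as plain integers rather than read from types and modes. The idea is that a signed multiply in the current mode only goes wrong when an unsigned operand uses every bit of that mode, because then its top bit would be read as a sign bit.

```c
#include <stdbool.h>

/* Hypothetical model of the check: return true when a mixed-sign
   multiply cannot use the signed instruction in a mode of MODE_PREC
   bits and must move to the next wider mode.  An unsigned operand
   narrower than the mode still has a zero top bit after extension,
   so it is safe to treat as signed.  */
static bool
needs_wider_mode (bool unsigned1, unsigned prec1,
                  bool unsigned2, unsigned prec2,
                  unsigned mode_prec)
{
  return (unsigned1 && prec1 == mode_prec)
         || (unsigned2 && prec2 == mode_prec);
}
```

For the bitfield test, a 15-bit unsigned field times a 3-bit signed field in a 16-bit mode gives false, so the narrow signed multiply is usable. An 8-bit unsigned operand in an 8-bit mode gives true, which is the case the earlier patch in the series handled by widening.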

--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/wmul-9.c
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+/* { dg-require-effective-target arm_dsp } */
+
+long long
+foo (long long a, short *b, char *c)
+{
+  return a + *b * *c;
+}
+
+/* { dg-final { scan-assembler "smlalbb" } } */
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/wmul-bitfield-2.c
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+/* { dg-require-effective-target arm_dsp } */
+
+struct bf
+{
+  int a : 3;
+  unsigned int b : 15;
+  int c : 3;
+};
+
+long long
+foo (long long a, struct bf b, struct bf c)
+{
+  return a + b.b * c.c;
+}
+
+/* { dg-final { scan-assembler "smlalbb" } } */
--- a/gcc/tree-ssa-math-opts.c
+++ b/gcc/tree-ssa-math-opts.c
@@ -2115,9 +2115,18 @@ convert_mult_to_widen (gimple stmt, gimple_stmt_iterator *gsi)
 {
   if (op != smul_widen_optab)
 	{
-	  from_mode = GET_MODE_WIDER_MODE (from_mode);
-	  if (GET_MODE_SIZE (to_mode) <= GET_MODE_SIZE (from_mode))
-	return false;
+	  /* We can use a signed multiply with unsigned types as long as
+	 there is a wider mode to use, or it is the smaller of the two
+	 types that is unsigned.  Note that type1 >= type2, always.  */
+	  if ((TYPE_UNSIGNED (type1)
+	   && TYPE_PRECISION (type1) == GET_MODE_PRECISION (from_mode))
+	  || (TYPE_UNSIGNED (type2)
+		  && TYPE_PRECISION (type2) == GET_MODE_PRECISION (from_mode)))
+	{
+	  from_mode = GET_MODE_WIDER_MODE (from_mode);
+	  if (GET_MODE_SIZE (to_mode) <= GET_MODE_SIZE (from_mode))
+		return false;
+	}
 
 	  op = smul_widen_optab;
 	  handler = find_widening_optab_handler_and_mode (op, to_mode,
@@ -2284,14 +2293,20 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt,
   /* There's no such thing as a mixed sign madd yet, so use a wider mode.  */
   if (from_unsigned1 != from_unsigned2)
 {
-  enum machine_mode mode = GET_MODE_WIDER_MODE (from_mode);
-  if (GET_MODE_PRECISION (mode) < GET_MODE_PRECISION (to_mode))
+  /* We can use a signed multiply with unsigned types as long as
+	 there is a wider mode to use, or it is the smaller of the two
+	 types that is unsigned.  Note that type1 >= type2, always.  */
+  if ((from_unsigned1
+	   && TYPE_PRECISION (type1) == GET_MODE_PRECISION (from_mode))
+	  || (from_unsigned2
+	  && TYPE_PRECISION (type2) == GET_MODE_PRECISION (from_mode)))
 	{
-	  from_mode = mode;
-	  from_unsigned1 = from_unsigned2 = false;
+	  from_mode = GET_MODE_WIDER_MODE (from_mode);
+	  if (GET_MODE_SIZE (from_mode) >= GET_MODE_SIZE (to_mode))
+	return false;
 	}
-  else
-	return false;
+
+  from_unsigned1 = from_unsigned2 = false;
 }
 
   /* If there was a conversion between the multiply and addition
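
For illustration only (this is not part of the committed patch): the comment added in both hunks says a signed widening multiply can still be used with an unsigned operand as long as that operand is narrower than the mode, and that a wider mode is needed otherwise. The sketch below checks that claim exhaustively for the operand shapes in wmul-bitfield-2.c, modelling HImode as int16_t. All helper names are made up here.

```c
#include <stdint.h>

/* Model of a signed 16x16->32 widening multiply.  */
int32_t
signed_widening_mult (int16_t a, int16_t b)
{
  return (int32_t) a * (int32_t) b;
}

/* Reinterpret the 16 bits of V as a signed 16-bit value, the way a signed
   multiply would read a 16-bit register.  Written with plain arithmetic so
   it does not rely on implementation-defined conversions.  */
int16_t
as_signed_hi (uint16_t v)
{
  return (int16_t) (v >= 32768 ? (int32_t) v - 65536 : (int32_t) v);
}

/* Count mismatches between the true product and the signed widening
   multiply, for an unsigned operand of UBITS bits (UBITS <= 16) and a
   signed 3-bit operand (range -4..3).  */
long
count_mismatches (unsigned ubits)
{
  long bad = 0;
  uint32_t limit = (uint32_t) 1 << ubits;

  for (uint32_t u = 0; u < limit; u++)
    for (int s = -4; s <= 3; s++)
      {
        int32_t expected = (int32_t) u * s;
        int32_t got = signed_widening_mult (as_signed_hi ((uint16_t) u),
                                            (int16_t) s);
        if (expected != got)
          bad++;
      }
  return bad;
}
```

count_mismatches (15) finds no mismatch, because a 15-bit unsigned value never sets the sign bit of a 16-bit register. count_mismatches (16) does find mismatches, which corresponds to the case where the patch switches to GET_MODE_WIDER_MODE.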


Re: [PATCH (8/7)] Fix a bug in multiply-and-accumulate

2011-08-19 Thread Andrew Stubbs

On 21/07/11 14:14, Andrew Stubbs wrote:
> Here is the patch I plan to commit, when patch 1 is approved, and my
> testing is complete.

Committed, unchanged.

Andrew


Re: [PATCH (9/7)] Widening multiplies with constant inputs

2011-08-19 Thread Andrew Stubbs

On 22/07/11 16:38, Andrew Stubbs wrote:
> Fixed in the attached. I'll commit this version when the rest of my
> testing is complete.

Now committed. Here's the patch with updated context.

Andrew
2011-08-19  Andrew Stubbs  

	gcc/
	* tree-ssa-math-opts.c (is_widening_mult_rhs_p): Handle constants
	beyond conversions.
	(convert_mult_to_widen): Convert constant inputs to the right type.
	(convert_plusminus_to_widen): Don't automatically reject inputs that
	are not an SSA_NAME.
	Convert constant inputs to the right type.

	gcc/testsuite/
	* gcc.target/arm/wmul-11.c: New file.
	* gcc.target/arm/wmul-12.c: New file.
	* gcc.target/arm/wmul-13.c: New file.

--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/wmul-11.c
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+/* { dg-require-effective-target arm_dsp } */
+
+long long
+foo (int *b)
+{
+  return 10 * (long long)*b;
+}
+
+/* { dg-final { scan-assembler "smull" } } */
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/wmul-12.c
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+/* { dg-require-effective-target arm_dsp } */
+
+long long
+foo (int *b, int *c)
+{
+  int tmp = *b * *c;
+  return 10 + (long long)tmp;
+}
+
+/* { dg-final { scan-assembler "smlal" } } */
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/wmul-13.c
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+/* { dg-require-effective-target arm_dsp } */
+
+long long
+foo (int *a, int *b)
+{
+  return *a + (long long)*b * 10;
+}
+
+/* { dg-final { scan-assembler "smlal" } } */
--- a/gcc/tree-ssa-math-opts.c
+++ b/gcc/tree-ssa-math-opts.c
@@ -1995,7 +1995,16 @@ is_widening_mult_rhs_p (tree type, tree rhs, tree *type_out,
 	  : rhs_code != FIXED_CONVERT_EXPR)
 	rhs1 = rhs;
 	  else
-	rhs1 = gimple_assign_rhs1 (stmt);
+	{
+	  rhs1 = gimple_assign_rhs1 (stmt);
+
+	  if (TREE_CODE (rhs1) == INTEGER_CST)
+		{
+		  *new_rhs_out = rhs1;
+		  *type_out = NULL;
+		  return true;
+		}
+	}
 	}
   else
 	rhs1 = rhs;
@@ -2164,6 +2173,12 @@ convert_mult_to_widen (gimple stmt, gimple_stmt_iterator *gsi)
   rhs2 = build_and_insert_cast (gsi, loc, tmp, rhs2);
 }
 
+  /* Handle constants.  */
+  if (TREE_CODE (rhs1) == INTEGER_CST)
+rhs1 = fold_convert (type1, rhs1);
+  if (TREE_CODE (rhs2) == INTEGER_CST)
+rhs2 = fold_convert (type2, rhs2);
+
   gimple_assign_set_rhs1 (stmt, rhs1);
   gimple_assign_set_rhs2 (stmt, rhs2);
   gimple_assign_set_rhs_code (stmt, WIDEN_MULT_EXPR);
@@ -2215,8 +2230,6 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt,
   if (is_gimple_assign (rhs1_stmt))
 	rhs1_code = gimple_assign_rhs_code (rhs1_stmt);
 }
-  else
-return false;
 
   if (TREE_CODE (rhs2) == SSA_NAME)
 {
@@ -2224,8 +2237,6 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt,
   if (is_gimple_assign (rhs2_stmt))
 	rhs2_code = gimple_assign_rhs_code (rhs2_stmt);
 }
-  else
-return false;
 
   /* Allow for one conversion statement between the multiply
  and addition/subtraction statement.  If there are more than
@@ -2373,6 +2384,12 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt,
 add_rhs = build_and_insert_cast (gsi, loc, create_tmp_var (type, NULL),
  add_rhs);
 
+  /* Handle constants.  */
+  if (TREE_CODE (mult_rhs1) == INTEGER_CST)
+rhs1 = fold_convert (type1, mult_rhs1);
+  if (TREE_CODE (mult_rhs2) == INTEGER_CST)
+rhs2 = fold_convert (type2, mult_rhs2);
+
   gimple_assign_set_rhs_with_ops_1 (gsi, wmult_code, mult_rhs1, mult_rhs2,
 add_rhs);
   update_stmt (gsi_stmt (*gsi));


Re: Vector Comparison patch

2011-08-19 Thread Artem Shinkarov
On Fri, Aug 19, 2011 at 3:54 PM, Richard Guenther
 wrote:
> On Fri, Aug 19, 2011 at 2:29 AM, Artem Shinkarov
>  wrote:
>> Hi, I had the problem with passing information about single variable
>> from expand_vec_cond_expr optab into ix86_expand_*_vcond.
>>
>> I looked into this problem for quite a while and found a solution.
>> Now the question is whether it could be done better.
>>
>> First of all the problem:
>>
>> If we represent any vector comparison with VEC_COND_EXPR < v0 <OP> v1
>> ? {-1,...} : {0,...} >, then in the assembler we do not want to see
>> this useless comparison with {-1...}.
>>
>> Now it is easy to fix the problem about excessive masking. The real
>> challenge starts when the comparison inside vcond is expressed as a
>> variable. In that case in order to construct correct vector expression
>> we need to adjust cond in cond ? v0 : v1 to  cond == {-1...} or as we
>> agreed recently cond != {0,..}. But that we need to do only to make
>> vec_cond_expr happy. On the level of assembler we don't want this
>> condition.
>>
>> Now, if I just construct the tree, then in x86, rtx_equal_p, does not
>> know that this is a constant vector full of -1, because the comparison
>> operands are not immediate. So I need somehow to mark the fact in
>> optabs, and then check the information in the x86.
>
> Well, this is why I was suggesting the bitwise semantic for a mask
> operand.  What we should do on the tree level (and that should happen
> already), is forward the comparison into the COND_EXPR.  Thus,
>
> mask = v1 < v2;
> v3 = mask ? v4 : v5;
>
> should get changed to
>
> v3 = v1 < v2 ? v4 : v5;
>
> by tree-ssa-forwprop.c.  If that is not happening we have to fix that there.

Yeah, that is something I am working on.

> Because we _don't_ know the mask is all -1 or 0 ;)  The user might
> put in {3, 5, 1, 3} and expect it to be treated like {-1,...} but it isn't
> so already.
>
>> At the moment I do something like this:
>>
>> optabs:
>>
>> if (!COMPARISON_CLASS_P (op0))
>>  ops[3] = gen_rtx_EQ (mode, NULL_RTX, NULL_RTX);
>>
>> This expression is preserved while checking and verifying.
>>
>> ix86:
>> if (GET_CODE (comp) == EQ && XEXP (comp, 0) == NULL_RTX
>>      && XEXP (comp, 1) == NULL_RTX)
>>
>> See the patch attached for more details. The patch is just to give you
>> an idea of the way I am doing it and it seems to work. Please don't
>> criticise the patch itself, better help me to understand if there is a
>> better way to pass the information from optabs to ix86.
>
> Hm, I'm not sure the expand_vec_cond_expr will work that way,
> I'd have to play with it myself (but will now be running for weekend).
>
> Is the special-casing of a < b ? {-1,-1,-1} : {0,0,0,0} in the backend
> working for you?  I think there are probably some rtl all-ones and all-zeros
> predicates you can re-use.
>
> Richard.

It works fine. Masks all ones and all zeroes are predefined, all -1
are not, but I am switching to all zeroes. The real question is that
this special case of comparison with two empty operands is a little
bit hackish. On the other hand there should be no problem with that,
because operand 3 is used only to get the code of comparison, no one is
looking inside the arguments, so we could use this fact. The question
is whether there is a better way.

Thanks,
Artem.
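
To make the "bitwise semantic" for the mask operand concrete, here is a small sketch using GCC's generic vector extension. It assumes a compiler that accepts comparison operators and subscripting on vector types, which is the feature being discussed in this thread. The names v4si, select_bitwise and the demo helpers are illustrative only and do not come from the patch.

```c
typedef int v4si __attribute__ ((vector_size (16)));

/* Bitwise select: for each bit, take X where MASK is 1 and Y where it
   is 0.  */
v4si
select_bitwise (v4si mask, v4si x, v4si y)
{
  return (mask & x) | (~mask & y);
}

/* mask = a < b yields -1 (all ones) or 0 in each lane, so the bitwise
   select behaves like a per-lane  a < b ? x : y.  */
int
demo_compare_lane (int lane)
{
  v4si a = { 1, 5, 3, 7 };
  v4si b = { 2, 4, 3, 8 };
  v4si x = { 10, 20, 30, 40 };
  v4si y = { -1, -2, -3, -4 };
  v4si r = select_bitwise (a < b, x, y);
  return r[lane];
}

/* A user-supplied mask such as {3, 5, 1, 3} is not all-ones per lane, so
   each result lane mixes bits of X and Y instead of picking one of them.  */
int
demo_user_mask_lane (int lane)
{
  v4si mask = { 3, 5, 1, 3 };
  v4si x = { 10, 20, 30, 40 };
  v4si y = { -1, -2, -3, -4 };
  v4si r = select_bitwise (mask, x, y);
  return r[lane];
}
```

With a comparison result as the mask, every lane is either x or y. With the arbitrary mask, lane 0 is neither 10 nor -1, which is the point Richard makes about not knowing that a mask variable is all -1 or 0.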


Re: [PATCH (1/7)] New optab framework for widening multiplies

2011-08-19 Thread Andrew Stubbs

On 19/08/11 15:51, Andrew Stubbs wrote:
> On 19/08/11 15:45, Richard Guenther wrote:
>> Seems one in the series has broken bootstrap on x86_64 when building
>> the 32bit libgcc multilib in stage1.
>
> Oh? Hopefully that'll be fixed when I complete the patchset. Patches 8
> and 9 (of 7) did fix issues with the earlier patches.

Seems fine now. Sorry for the trouble.

Andrew


Re: RFA: Avoiding unprofitable speculation

2011-08-19 Thread Jeff Law

On 08/18/11 15:59, Richard Henderson wrote:
> On 08/17/2011 12:21 AM, Richard Guenther wrote:
>> The patch itself looks sensible, though I am surprised ifcvt
>> doesn't run in cfglayout mode (so you have to use reg notes to find
>> probabilities ...)
> 
> It does run in cfglayout mode.
> 
> Jeff, I believe you're supposed to get the probabilities from some
> combination of
> 
>   bb->frequency
>   edge->probability
>   EDGE_FREQUENCY(edge)
I was just utilizing code that was already sprinkled all over ifcvt.c to
get the probabilities.

Changing it to edge->probability should be trivial.

Jeff


Re: [PATCH (9/7)] Widening multiplies with constant inputs

2011-08-19 Thread H.J. Lu
On Fri, Aug 19, 2011 at 8:07 AM, Andrew Stubbs  wrote:
> On 22/07/11 16:38, Andrew Stubbs wrote:
>>
>> Fixed in the attached. I'll commit this version when the rest of my
>> testing is complete.
>
> Now committed. Here's the patch with updated context.
>

I think one of your patches caused:

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50128

-- 
H.J.


Add __builtin_complex to construct complex values (C1X CMPLX* macros)

2011-08-19 Thread Joseph S. Myers
This patch adds __builtin_complex to support generating values with
arbitrary real and imaginary parts, including in static initializers,
despite the absence of imaginary types.  (Recall that X + I * Y, in
the absence of imaginary types, is really X + Y * (0.0 + 1.0I),
resulting in a real part X + Y * 0.0 which yields incorrect results
when infinities or signed zero are used.)  This is intended to be used
by C library headers to define the C1X macros CMPLX*, with definitions
along the lines of:

#define CMPLX(X, Y) __builtin_complex ((double) (X), (double) (Y))

As requested in PR 48760 comment 12, this is purely a C front-end
built-in (actually, a keyword that looks like a built-in function)
providing syntax for COMPLEX_EXPR, rather than a middle-end built-in
function.  The C++ front end is using a different approach for this
issue, allowing list-initialization of _Complex values.  (Allowing
{ real, imag } initializers was one of the approaches considered for
C1X before the final macro approach was arrived at.  See N1464 for the
approach that was followed and references to previous proposals; the
initializer approach was proposal two in N1431.  Note that if you did
allow such initializers for C, it wouldn't provide *expressions*
usable in static initializers, since to make a braced initializer into
an expression you need a compound literal and compound literals can't
be used in static initializers.)

Bootstrapped with no regressions on x86_64-unknown-linux-gnu.  Applied
to mainline.
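
A minimal sketch of the problem described above, for readers who want to see it happen. It assumes a C11 <complex.h>. Where the library does not provide CMPLX, it falls back to a definition along the lines given earlier in this message, which in turn assumes a compiler with __builtin_complex. The function names are invented for the example.

```c
#include <complex.h>
#include <math.h>

#ifndef CMPLX
/* Fallback along the lines suggested in the message; assumes the compiler
   provides __builtin_complex.  */
#define CMPLX(X, Y) __builtin_complex ((double) (X), (double) (Y))
#endif

/* Without imaginary types, I is the complex value 0.0 + 1.0i, so I * y
   has real part 0.0 * y.  For y = infinity that is NaN, which then
   contaminates the real part of x + I * y.  */
double complex
make_naive (double x, double y)
{
  return x + I * y;
}

/* CMPLX builds the value from its two parts directly, with no
   arithmetic on either part.  */
double complex
make_exact (double x, double y)
{
  return CMPLX (x, y);
}
```

Calling make_naive (1.0, INFINITY) gives a NaN real part, while make_exact (1.0, INFINITY) keeps the real part at 1.0 and the imaginary part at infinity.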

2011-08-19  Joseph Myers  

* c-parser.c (c_parser_postfix_expression): Handle
RID_BUILTIN_COMPLEX.
* doc/extend.texi (__builtin_complex): Document.

c-family:
2011-08-19  Joseph Myers  

* c-common.c (c_common_reswords): Add __builtin_complex.
* c-common.h (RID_BUILTIN_COMPLEX): New.

testsuite:
2011-08-19  Joseph Myers  

* gcc.dg/builtin-complex-err-1.c, gcc.dg/builtin-complex-err-2.c,
gcc.dg/dfp/builtin-complex.c, gcc.dg/torture/builtin-complex-1.c:
New tests.

Index: gcc/doc/extend.texi
===
--- gcc/doc/extend.texi (revision 177894)
+++ gcc/doc/extend.texi (working copy)
@@ -7511,6 +7511,18 @@ future revisions.
 
 @end deftypefn
 
+@deftypefn {Built-in Function} @var{type} __builtin_complex (@var{real}, @var{imag})
+
+The built-in function @code{__builtin_complex} is provided for use in
+implementing the ISO C1X macros @code{CMPLXF}, @code{CMPLX} and
+@code{CMPLXL}.  @var{real} and @var{imag} must have the same type, a
+real binary floating-point type, and the result has the corresponding
+complex type with real and imaginary parts @var{real} and @var{imag}.
+Unlike @samp{@var{real} + I * @var{imag}}, this works even when
+infinities, NaNs and negative zeros are involved.
+
+@end deftypefn
+
 @deftypefn {Built-in Function} int __builtin_constant_p (@var{exp})
 You can use the built-in function @code{__builtin_constant_p} to
 determine if a value is known to be constant at compile-time and hence
Index: gcc/c-family/c-common.c
===
--- gcc/c-family/c-common.c (revision 177894)
+++ gcc/c-family/c-common.c (working copy)
@@ -424,6 +424,7 @@ const struct c_common_resword c_common_r
   { "__attribute", RID_ATTRIBUTE,  0 },
   { "__attribute__",   RID_ATTRIBUTE,  0 },
   { "__builtin_choose_expr", RID_CHOOSE_EXPR, D_CONLY },
+  { "__builtin_complex", RID_BUILTIN_COMPLEX, D_CONLY },
   { "__builtin_offsetof", RID_OFFSETOF, 0 },
   { "__builtin_types_compatible_p", RID_TYPES_COMPATIBLE_P, D_CONLY },
   { "__builtin_va_arg",RID_VA_ARG, 0 },
Index: gcc/c-family/c-common.h
===
--- gcc/c-family/c-common.h (revision 177894)
+++ gcc/c-family/c-common.h (working copy)
@@ -103,7 +103,7 @@ enum rid
   /* C extensions */
   RID_ASM,   RID_TYPEOF,   RID_ALIGNOF,  RID_ATTRIBUTE,  RID_VA_ARG,
   RID_EXTENSION, RID_IMAGPART, RID_REALPART, RID_LABEL,  RID_CHOOSE_EXPR,
-  RID_TYPES_COMPATIBLE_P,
+  RID_TYPES_COMPATIBLE_P,  RID_BUILTIN_COMPLEX,
   RID_DFLOAT32, RID_DFLOAT64, RID_DFLOAT128,
   RID_FRACT, RID_ACCUM,
 
Index: gcc/testsuite/gcc.dg/dfp/builtin-complex.c
===
--- gcc/testsuite/gcc.dg/dfp/builtin-complex.c  (revision 0)
+++ gcc/testsuite/gcc.dg/dfp/builtin-complex.c  (revision 0)
@@ -0,0 +1,10 @@
+/* Test __builtin_complex errors with DFP.  */
+/* { dg-do compile } */
+
+_Decimal32 a, b;
+
+void
+f (void)
+{
+  __builtin_complex (a, b); /* { dg-error "not of real binary floating-point type" } */
+}
Index: gcc/testsuite/gcc.dg/builtin-complex-err-1.c
===
--- gcc/testsuite/gcc.dg/builtin-complex-err-1.c(revision 0)
+++ gcc/testsuite/gcc.dg/builtin-complex-err-1.c(revision 0)
@@ -0,0 +1,26 @@
+/* Test __b

Re: Linemap force location and remove LINEMAP_POSITION_FOR_COLUMN (issue4801090)

2011-08-19 Thread Tom Tromey
> "Gabriel" == Gabriel Charette  writes:

Gabriel> It now exposes two libcpp functions to force the
Gabriel> source_location for tokens when desired.

I am not really a fan of this approach, but I see why you did it this
way -- anything else would be very invasive.

I can only approve the libcpp parts.

Gabriel> +void cpp_force_token_locations (cpp_reader *r, source_location *p)

Newline after "void".

Gabriel> +void cpp_stop_forcing_token_locations (cpp_reader *r)

Likewise.

The libcpp parts are ok with those changes.

Tom


Re: [PATCH, PR43864] Gimple level duplicate block cleanup - test cases.

2011-08-19 Thread Tom de Vries
On 07/17/2011 08:33 PM, Tom de Vries wrote:
> Updated version.
> 
> On 06/08/2011 11:45 AM, Tom de Vries wrote:
>> On 06/08/2011 11:42 AM, Tom de Vries wrote:
>>
>>> I'll send the patch with the testcases in a separate email.
>>
> 

2 extra testcases added.

OK for trunk?

Thanks,
- Tom

2011-08-19  Tom de Vries  

PR middle-end/43864
* gcc.dg/fold-compare-2.c (dg-options): Add -fno-tree-tail-merge.
* gcc/testsuite/gcc.dg/uninit-pred-2_c.c: Same.
* gcc.dg/pr43864.c: New test.
* gcc.dg/pr43864-2.c: Same.
* gcc.dg/pr43864-3.c: Same.
* gcc.dg/pr43864-4.c: Same.
Index: gcc/testsuite/gcc.dg/pr43864-4.c
===
--- gcc/testsuite/gcc.dg/pr43864-4.c	(revision 0)
+++ gcc/testsuite/gcc.dg/pr43864-4.c	(revision 0)
@@ -0,0 +1,28 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-pre" } */
+
+/* Different stmt order.  */
+
+int f(int c, int b, int d)
+{
+  int r, r2, e;
+
+  if (c)
+{
+  r = b + d;
+  r2 = d - b;
+}
+  else
+{
+  r2 = d - b;
+  e = d + b;
+  r = e;
+}
+
+  return r - r2;
+}
+
+/* { dg-final { scan-tree-dump-times "if " 0 "pre"} } */
+/* { dg-final { scan-tree-dump-times "_.*\\\+.*_" 1 "pre"} } */
+/* { dg-final { scan-tree-dump-times " - " 2 "pre"} } */
+/* { dg-final { cleanup-tree-dump "pre" } } */
Index: gcc/testsuite/gcc.dg/fold-compare-2.c
===
--- gcc/testsuite/gcc.dg/fold-compare-2.c	(revision 176554)
+++ gcc/testsuite/gcc.dg/fold-compare-2.c	(working copy)
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -fdump-tree-vrp" } */
+/* { dg-options "-O2 -fno-tree-tail-merge -fdump-tree-vrp" } */
 
 extern void abort (void);
 
Index: gcc/testsuite/gcc.dg/uninit-pred-2_c.c
===
--- gcc/testsuite/gcc.dg/uninit-pred-2_c.c	(revision 176554)
+++ gcc/testsuite/gcc.dg/uninit-pred-2_c.c	(working copy)
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-Wuninitialized -O2" } */
+/* { dg-options "-Wuninitialized -O2 -fno-tree-tail-merge" } */
 
 int g;
 void bar (void);
Index: gcc/testsuite/gcc.dg/pr43864.c
===
--- gcc/testsuite/gcc.dg/pr43864.c	(revision 0)
+++ gcc/testsuite/gcc.dg/pr43864.c	(revision 0)
@@ -0,0 +1,35 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-pre" } */
+
+extern void foo (char*, int);
+extern void mysprintf (char *, char *);
+extern void myfree (void *);
+extern int access (char *, int);
+extern int fopen (char *, int);
+
+char *
+hprofStartupp (char *outputFileName, char *ctx)
+{
+  char fileName[1000];
+  int fp;
+  mysprintf (fileName, outputFileName);
+  if (access (fileName, 1) == 0)
+{
+  myfree (ctx);
+  return 0;
+}
+
+  fp = fopen (fileName, 0);
+  if (fp == 0)
+{
+  myfree (ctx);
+  return 0;
+}
+
+  foo (outputFileName, fp);
+
+  return ctx;
+}
+
+/* { dg-final { scan-tree-dump-times "myfree \\(" 1 "pre"} } */
+/* { dg-final { cleanup-tree-dump "pre" } } */
Index: gcc/testsuite/gcc.dg/pr43864-2.c
===
--- gcc/testsuite/gcc.dg/pr43864-2.c	(revision 0)
+++ gcc/testsuite/gcc.dg/pr43864-2.c	(revision 0)
@@ -0,0 +1,22 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-pre" } */
+
+int
+f (int c, int b, int d)
+{
+  int r, e;
+
+  if (c)
+r = b + d;
+  else
+{
+  e = b + d;
+  r = e;
+}
+
+  return r;
+}
+
+/* { dg-final { scan-tree-dump-times "if " 0 "pre"} } */
+/* { dg-final { scan-tree-dump-times "_.*\\\+.*_" 1 "pre"} } */
+/* { dg-final { cleanup-tree-dump "pre" } } */
Index: gcc/testsuite/gcc.dg/pr43864-3.c
===
--- gcc/testsuite/gcc.dg/pr43864-3.c	(revision 0)
+++ gcc/testsuite/gcc.dg/pr43864-3.c	(revision 0)
@@ -0,0 +1,23 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-pre" } */
+
+/* Commutative case.  */
+
+int f(int c, int b, int d)
+{
+  int r, e;
+
+  if (c)
+r = b + d;
+  else
+{
+  e = d + b;
+  r = e;
+}
+
+  return r;
+}
+
+/* { dg-final { scan-tree-dump-times "if " 0 "pre"} } */
+/* { dg-final { scan-tree-dump-times "_.*\\\+.*_" 1 "pre"} } */
+/* { dg-final { cleanup-tree-dump "pre" } } */


Re: Dump stats about hottest hash tables when -fmem-report

2011-08-19 Thread Tom Tromey
> "Dimitrios" == Dimitrios Apostolou  writes:

Richard> Note that sparsely populated hashes come at the cost of increased
Richard> cache footprint.  Not sure what is more important here though, memory
Richard> access or hash computation.

Tom> I was only approving the change to the dumping.
Tom> I am undecided about making the hash tables more sparse.

Dimitrios> Since my Core Quad processor has large caches and the i386
Dimitrios> has small pointer size, the few extra empty buckets impose
Dimitrios> small overhead, which as it seems is minor in comparison to
Dimitrios> gains due to less rehashes.

Dimitrios> Maybe that's not true on older or alternate equipment. I'd be
Dimitrios> very interested to hear about runtime measurements on various
Dimitrios> equipment, please let me know if you do any.

I think you are the most likely person to do this sort of testing.
You can use machines on the GCC compile farm for this.

Your patch to change the symbol table's load factor is fine technically.
I think the argument for putting it in is lacking; what I would like to
see is either some rationale explaining that the increased memory use is
not important, or some numbers showing that it still performs well on
more than a single machine.  My reason for wanting this is just that,
historically, GCC has been very sensitive to increases in memory use.
Alternatively, comments from more active maintainers indicating that
they don't care about this would also help your case.

I can't approve or reject the libiberty change, just the libcpp one.

Tom


Re: [cxx-mem-model] Atomic C++ header file changes

2011-08-19 Thread Torvald Riegel
On Fri, 2011-08-19 at 08:44 -0400, Andrew MacLeod wrote:
> On 08/19/2011 04:57 AM, Torvald Riegel wrote:
> > On Wed, 2011-08-17 at 11:39 -0400, Andrew MacLeod wrote:
> >> That would be quite ugly, and you get what you deserve if you do that.
> >> I changed the builtins so that if you don't specify a compile time
> >> constant in the memory model parameter, it will simply default to
> >> __SYNC_MEM_SEQ_CST, which will always be safe.  That is standard
> >> compliant (verified), and if anyone is really unhappy about it, then the
> >> c++ headers can be really uglified by adding a bunch of switch
> >> statements to handle this twisted case.
> > IMHO this behavior should be documented so that users will be aware of
> > it, and it would be best if this would raise a warning. Note that I also
> > cannot see any reason why a programmer might want to make barriers
> > runtime-configurable, but silently adding overhead (perhaps the
> > parameter was supposed to be a constant, but wasn't?) can lead to more
> > confusion than necessary.
> >
> 
> The problem with issuing a warning is that anytime the compiler creates 
> a C++ atomic class and you use a method with a memory order, it usually 
> leaves an externally call-able method which has to take a runtime 
> value... so you'd see the warning on basically every compilation... 
> which in turn defeats the purpose of the warning.

Hmm. I would have assumed that the check that would raise warnings would
be for actual calls, not for the instantiations. But that would probably
require special handling of calls to the atomics class for all the
integers and pointers (can atomic be handled as one thing?). So, if
that's too much work, at least document the constraint somewhere?

Torvald



[PATCH, PR43864] Gimple level duplicate block cleanup - 2nd review

2011-08-19 Thread Tom de Vries
Hi Ian,

In the following 2 messages I have posted a gimple level duplicate block cleanup
pass.

Implementation: http://gcc.gnu.org/ml/gcc-patches/2011-08/msg01602.html
Test cases: http://gcc.gnu.org/ml/gcc-patches/2011-08/msg01603.html

The pass reduces x86_64-stage3-cc1 text size by 1.7%, and Richard G. is now
reasonably happy with the state of the patch.

Would you be willing to do a 2nd review?

Thanks,
- Tom


Re: Dump stats about hottest hash tables when -fmem-report

2011-08-19 Thread Dimitrios Apostolou

On Fri, 19 Aug 2011, Tom Tromey wrote:


> I think you are the most likely person to do this sort of testing.
> You can use machines on the GCC compile farm for this.
>
> Your patch to change the symbol table's load factor is fine technically.
> I think the argument for putting it in is lacking; what I would like to
> see is either some rationale explaining that the increased memory use is
> not important, or some numbers showing that it still performs well on
> more than a single machine.  My reason for wanting this is just that,
> historically, GCC has been very sensitive to increases in memory use.
> Alternatively, comments from more active maintainers indicating that
> they don't care about this would also help your case.
>
> I can't approve or reject the libiberty change, just the libcpp one.


Hi Tom, thanks for your comments. I'm well aware that I should test on 
more equipment besides i386 and x86_64 and I certainly plan to. It's just 
that writing patches is one thing, but advocating them on gcc-patches is 
an equally hard thing, and I plan on doing the latter correctly after GSOC 
ends.


As for the technical side of this patch, I've noticed that while in 
general there is a good gain to be earned from reduced collisions, there 
is an overhead in htab_traverse() and htab_delete() that iterate over the 
array. This is evident primarily in var-tracking, which is more 
htab-intensive than the rest of the compiler altogether! So I plan to 
resubmit my patch together with small changes in var-tracking.



Thanks,
Dimitris
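
The traversal overhead mentioned above can be sketched with a toy open-addressing table. This is not libiberty's hashtab implementation. The table layout, the hash constant and every name below are assumptions made for the illustration. The only point is that a full traversal has to look at every slot, so a lower load factor makes traversal and deletion more expensive for the same number of live entries.

```c
#include <stddef.h>
#include <stdlib.h>

typedef struct
{
  unsigned *slots;   /* 0 marks an empty slot, so keys must be nonzero.  */
  size_t capacity;   /* Always a power of two.  */
  size_t count;      /* Number of live entries.  */
} table;

/* Size the table so that EXPECTED entries stay at or below
   MAX_LOAD_PERCENT.  Returns nonzero on success.  */
int
table_init (table *t, size_t expected, unsigned max_load_percent)
{
  size_t cap = 8;
  while (expected * 100 > cap * max_load_percent)
    cap *= 2;
  t->slots = calloc (cap, sizeof *t->slots);
  t->capacity = cap;
  t->count = 0;
  return t->slots != NULL;
}

/* Insert KEY using linear probing.  The caller must not insert more
   distinct keys than the table was sized for.  */
void
table_insert (table *t, unsigned key)
{
  size_t i = (key * 2654435761u) & (t->capacity - 1);
  while (t->slots[i] != 0 && t->slots[i] != key)
    i = (i + 1) & (t->capacity - 1);
  if (t->slots[i] == 0)
    {
      t->slots[i] = key;
      t->count++;
    }
}

/* Visit every slot, as a traverse or delete must.  Returns the number of
   slots examined and stores the number of live entries in *LIVE.  */
size_t
table_traverse (const table *t, size_t *live)
{
  size_t visited = 0;
  *live = 0;
  for (size_t i = 0; i < t->capacity; i++)
    {
      visited++;
      if (t->slots[i] != 0)
        (*live)++;
    }
  return visited;
}

/* Insert 1000 keys at the given maximum load and report the traversal
   cost in slots.  */
size_t
demo_slots_visited (unsigned max_load_percent, size_t *live)
{
  table t;
  size_t visited;

  *live = 0;
  if (!table_init (&t, 1000, max_load_percent))
    return 0;
  for (unsigned key = 1; key <= 1000; key++)
    table_insert (&t, key);
  visited = table_traverse (&t, live);
  free (t.slots);
  return visited;
}
```

With a 75% maximum load the toy table holds 1000 entries in 2048 slots, and with 25% it needs 4096, so the same traversal touches twice as many slots for the same live data.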





Re: [PATCH] Wire-up missing ARM iwmmxt intrinsics (bugs 35294, 36798, 36966)

2011-08-19 Thread Matt Turner
On Fri, Aug 19, 2011 at 2:09 AM, Xinyu Qi  wrote:
> At 2011-08-19 12:18:10,"Matt Turner"  wrote:
>>
>> On Fri, Aug 19, 2011 at 12:13 AM, Matt Turner  wrote:
>> > Hi,
>> >
>> > Attached is a patch based on gcc-4.6.1 that wires-up missing ARM
>> > iwmmxt intrinsics. Without it, gcc is completely useless when it comes
>> > to using a large portion of the intrinsics documented on this page:
>> > http://gcc.gnu.org/onlinedocs/gcc/ARM-iWMMXt-Built_002din-Functions.html
>> >
>> > The patch is based on the work of  in bug 35294.
>> >
>> > I do not know why the check_opsmode hack is necessary.
>
> Hi,
>
> I think check_opsmode in this patch is used to solve something that could be 
> solved by
> -  gcc_assert (GET_MODE (op0) == mode0 && GET_MODE (op1) == mode1);
> +  gcc_assert ((GET_MODE (op0) == mode0 || GET_MODE (op0) == VOIDmode)
> +             && (GET_MODE (op1) == mode1 || GET_MODE (op1) == VOIDmode));
> in my patch.
> For example, in the shift intrinsics, the shift count could be either a 
> variable, or a CONST_INT which has VOIDmode.
>
>> >I also do not know if this wires up all the missing intrinsics.
>
> I'm afraid not. Trunk is missing all iWMMXt2 intrinsics, and bugs can be
> found everywhere since the port has lacked maintenance for a long time.
>
>> > I have seen much more extensive patches from Xinyu Qi, but I do not
>> > suppose that they will be available in gcc 4.6.
>
> The patches I submitted have some conflict with 4.6 code base.
>
> Thanks,
> Xinyu

Indeed, that seems like the way it should be done. Thanks very much.
See the attached patch.

Thanks,
Matt
--- arm.c.orig	2011-05-05 04:39:40.0 -0400
+++ arm.c	2011-08-19 13:48:21.548405102 -0400
@@ -19218,7 +19218,8 @@
   || ! (*insn_data[icode].operand[0].predicate) (target, tmode))
 target = gen_reg_rtx (tmode);
 
-  gcc_assert (GET_MODE (op0) == mode0 && GET_MODE (op1) == mode1);
+  gcc_assert ((GET_MODE (op0) == mode0 || GET_MODE (op0) == VOIDmode)
+ && (GET_MODE (op1) == mode1 || GET_MODE (op1) == VOIDmode));
 
   if (! (*insn_data[icode].operand[1].predicate) (op0, mode0))
 op0 = copy_to_mode_reg (mode0, op0);
@@ -19814,6 +19815,65 @@
   emit_insn (pat);
   return target;
 
+case ARM_BUILTIN_WSLLH:
+case ARM_BUILTIN_WSLLHI:
+case ARM_BUILTIN_WSLLW:
+case ARM_BUILTIN_WSLLWI:
+case ARM_BUILTIN_WSLLD:
+case ARM_BUILTIN_WSLLDI:
+case ARM_BUILTIN_WSRAH:
+case ARM_BUILTIN_WSRAHI:
+case ARM_BUILTIN_WSRAW:
+case ARM_BUILTIN_WSRAWI:
+case ARM_BUILTIN_WSRAD:
+case ARM_BUILTIN_WSRADI:
+case ARM_BUILTIN_WSRLH:
+case ARM_BUILTIN_WSRLHI:
+case ARM_BUILTIN_WSRLW:
+case ARM_BUILTIN_WSRLWI:
+case ARM_BUILTIN_WSRLD:
+case ARM_BUILTIN_WSRLDI:
+case ARM_BUILTIN_WRORH:
+case ARM_BUILTIN_WRORHI:
+case ARM_BUILTIN_WRORW:
+case ARM_BUILTIN_WRORWI:
+case ARM_BUILTIN_WRORD:
+case ARM_BUILTIN_WRORDI:
+case ARM_BUILTIN_WAND:
+case ARM_BUILTIN_WANDN:
+case ARM_BUILTIN_WOR:
+case ARM_BUILTIN_WXOR:
+  icode = (fcode == ARM_BUILTIN_WSLLH ? CODE_FOR_ashlv4hi3_di
+	   : fcode == ARM_BUILTIN_WSLLHI ? CODE_FOR_ashlv4hi3_iwmmxt
+	   : fcode == ARM_BUILTIN_WSLLW  ? CODE_FOR_ashlv2si3_di
+	   : fcode == ARM_BUILTIN_WSLLWI ? CODE_FOR_ashlv2si3_iwmmxt
+	   : fcode == ARM_BUILTIN_WSLLD  ? CODE_FOR_ashldi3_di
+	   : fcode == ARM_BUILTIN_WSLLDI ? CODE_FOR_ashldi3_iwmmxt
+	   : fcode == ARM_BUILTIN_WSRAH  ? CODE_FOR_ashrv4hi3_di
+	   : fcode == ARM_BUILTIN_WSRAHI ? CODE_FOR_ashrv4hi3_iwmmxt
+	   : fcode == ARM_BUILTIN_WSRAW  ? CODE_FOR_ashrv2si3_di
+	   : fcode == ARM_BUILTIN_WSRAWI ? CODE_FOR_ashrv2si3_iwmmxt
+	   : fcode == ARM_BUILTIN_WSRAD  ? CODE_FOR_ashrdi3_di
+	   : fcode == ARM_BUILTIN_WSRADI ? CODE_FOR_ashrdi3_iwmmxt
+	   : fcode == ARM_BUILTIN_WSRLH  ? CODE_FOR_lshrv4hi3_di
+	   : fcode == ARM_BUILTIN_WSRLHI ? CODE_FOR_lshrv4hi3_iwmmxt
+	   : fcode == ARM_BUILTIN_WSRLW  ? CODE_FOR_lshrv2si3_di
+	   : fcode == ARM_BUILTIN_WSRLWI ? CODE_FOR_lshrv2si3_iwmmxt
+	   : fcode == ARM_BUILTIN_WSRLD  ? CODE_FOR_lshrdi3_di
+	   : fcode == ARM_BUILTIN_WSRLDI ? CODE_FOR_lshrdi3_iwmmxt
+	   : fcode == ARM_BUILTIN_WRORH  ? CODE_FOR_rorv4hi3_di
+	   : fcode == ARM_BUILTIN_WRORHI ? CODE_FOR_rorv4hi3
+	   : fcode == ARM_BUILTIN_WRORW  ? CODE_FOR_rorv2si3_di
+	   : fcode == ARM_BUILTIN_WRORWI ? CODE_FOR_rorv2si3
+	   : fcode == ARM_BUILTIN_WRORD  ? CODE_FOR_rordi3_di
+	   : fcode == ARM_BUILTIN_WRORDI ? CODE_FOR_rordi3
+	   : fcode == ARM_BUILTIN_WAND   ? CODE_FOR_iwmmxt_anddi3
+	   : fcode == ARM_BUILTIN_WANDN  ? CODE_FOR_iwmmxt_nanddi3
+	   : fcode == ARM_BUILTIN_WOR? CODE_FOR_iwmmxt_iordi3
+	   : fcode == ARM_BUILTIN_WXOR   ? CODE_FOR_iwmmxt_xordi3
+	   : CODE_FOR_rordi3);
+  return arm_expand_binop_builtin (icode, exp, target);
+
 case ARM_BUILTIN_WZERO:
   target = gen_reg_rtx (DImode);
   emit_in

Re: Add __builtin_complex to construct complex values (C1X CMPLX* macros)

2011-08-19 Thread Jakub Jelinek
On Fri, Aug 19, 2011 at 03:55:12PM +, Joseph S. Myers wrote:
> Bootstrapped with no regressions on x86_64-unknown-linux-gnu.  Applied
> to mainline.

The new tests ICE on i686-linux:
FAIL: gcc.dg/builtin-complex-err-1.c (internal compiler error)
FAIL: gcc.dg/builtin-complex-err-2.c (internal compiler error)
FAIL: gcc.dg/torture/builtin-complex-1.c  -O*  (internal compiler error)

All the ICEs are on
case EXCESS_PRECISION_EXPR:
  /* Each case where an operand with excess precision may be
 encountered must remove the EXCESS_PRECISION_EXPR around
 inner operands and possibly put one around the whole
 expression or possibly convert to the semantic type (which
 c_fully_fold does); we cannot tell at this stage which is
 appropriate in any particular case.  */
  gcc_unreachable ();
in c_fully_fold_internal.

Jakub


Re: [cxx-mem-model] Atomic C++ header file changes

2011-08-19 Thread Andrew MacLeod

On 08/19/2011 12:48 PM, Torvald Riegel wrote:

>> The problem with issuing a warning is that anytime the compiler creates
>> a C++ atomic class and you use a method with a memory order, it usually
>> leaves an externally call-able method which has to take a runtime
>> value... so you'd see the warning on basically every compilation...
>> which in turn defeats the purpose of the warning.
>
> Hmm. I would have assumed that the check that would raise warnings would
> be for actual calls, not for the instantiations. But that would probably
> require special handling of calls to the atomics class for all the
> integers and pointers (can atomic  be handled as one thing?). So, if
> that's too much work, at least document the constraint somewhere?

I'd definitely document the constraint.

To be honest, I think it's a pretty useless thing, bordering on moronic.
The whole point of the memory model is to be able to generate more
efficient code when you don't need SEQ_CST and really know what you are
doing.


Even if you *DO* want to make that kind of a call, you have to expect
the overhead of a runtime library call.  And if you are using SEQ_CST
mode, it's going to be that much slower again due to the call.  I think
inlining it to be SEQ_CST will provide smaller code size always, and I'd
be surprised if it *ever* became a performance issue.  And I do mean *ever*.


Andrew
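
For illustration only, the case under discussion can be sketched in C11
terms (the thread itself concerns the C++ header; `load_with_order` and
`load_relaxed` are hypothetical names, not part of any patch).  When the
memory order only arrives at run time, the compiler cannot choose a cheaper
instruction sequence from it, which is the case the message above proposes
to treat as SEQ_CST:

```c
#include <stdatomic.h>

/* Hypothetical out-of-line wrapper: ORDER is only known at run time,
   so nothing cheaper than a sequentially consistent load can be
   selected from it at compile time.  */
int
load_with_order (atomic_int *p, memory_order order)
{
  return atomic_load_explicit (p, order);
}

/* With a compile-time constant order the compiler is free to pick a
   cheaper sequence directly.  */
int
load_relaxed (atomic_int *p)
{
  return atomic_load_explicit (p, memory_order_relaxed);
}
```

Both functions return the same value; they differ only in how much the
compiler knows about the ordering when it generates code.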



[PATCH] Fix execute_update_addresses_taken for loop closed SSA form (PR tree-optimization/48739)

2011-08-19 Thread Jakub Jelinek
Hi!

If some variable is optimized from TREE_ADDRESSABLE into a gimple var
during execute_update_addresses_taken while in loop closed SSA form,
it might not be rewritten into loop closed SSA form, thus either fail
verification, or following loop passes might miscompile something.

Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux,
ok for trunk/4.6?

2011-08-19  Jakub Jelinek  

PR tree-optimization/48739
* tree-ssa.c: Include cfgloop.h.
(execute_update_addresses_taken): When updating ssa, if in
loop closed SSA form, call rewrite_into_loop_closed_ssa instead of
update_ssa.
* Makefile.in (tree-ssa.o): Depend on $(CFGLOOP_H).

* gcc.dg/pr48739-1.c: New test.
* gcc.dg/pr48739-2.c: New test.

--- gcc/tree-ssa.c.jj   2011-08-18 08:36:00.0 +0200
+++ gcc/tree-ssa.c  2011-08-19 18:51:18.0 +0200
@@ -43,6 +43,7 @@ along with GCC; see the file COPYING3.  
 #include "tree-dump.h"
 #include "tree-pass.h"
 #include "diagnostic-core.h"
+#include "cfgloop.h"
 
 /* Pointer map of variable mappings, keyed by edge.  */
 static struct pointer_map_t *edge_var_maps;
@@ -2208,7 +2209,10 @@ execute_update_addresses_taken (void)
  }
 
   /* Update SSA form here, we are called as non-pass as well.  */
-  update_ssa (TODO_update_ssa);
+  if (number_of_loops () > 1 && loops_state_satisfies_p (LOOP_CLOSED_SSA))
+   rewrite_into_loop_closed_ssa (NULL, TODO_update_ssa);
+  else
+   update_ssa (TODO_update_ssa);
 }
 
   BITMAP_FREE (not_reg_needs);
--- gcc/Makefile.in.jj  2011-08-18 08:36:01.0 +0200
+++ gcc/Makefile.in 2011-08-19 18:55:17.0 +0200
@@ -2405,7 +2405,7 @@ tree-ssa.o : tree-ssa.c $(TREE_FLOW_H) $
$(TREE_DUMP_H) langhooks.h $(TREE_PASS_H) $(BASIC_BLOCK_H) $(BITMAP_H) \
$(FLAGS_H) $(GGC_H) $(HASHTAB_H) pointer-set.h \
$(GIMPLE_H) $(TREE_INLINE_H) $(TARGET_H) tree-pretty-print.h \
-   gimple-pretty-print.h
+   gimple-pretty-print.h $(CFGLOOP_H)
 tree-into-ssa.o : tree-into-ssa.c $(TREE_FLOW_H) $(CONFIG_H) $(SYSTEM_H) \
$(TREE_H) $(TM_P_H) $(EXPR_H) output.h $(DIAGNOSTIC_H) \
$(FUNCTION_H) $(TIMEVAR_H) $(TM_H) coretypes.h $(TREE_DUMP_H) \
--- gcc/testsuite/gcc.dg/pr48739-1.c.jj 2011-08-19 18:53:43.0 +0200
+++ gcc/testsuite/gcc.dg/pr48739-1.c 2011-08-19 18:53:26.0 +0200
@@ -0,0 +1,27 @@
+/* PR tree-optimization/48739 */
+/* { dg-do compile } */
+/* { dg-require-effective-target pthread } */
+/* { dg-options "-O1 -ftree-parallelize-loops=2 -fno-tree-dominator-opts" } */
+
+extern int g;
+extern void bar (void);
+
+int
+foo (int x)
+{
+  int a, b, *c = (int *) 0;
+  for (a = 0; a < 10; ++a)
+{
+  bar ();
+  for (b = 0; b < 5; ++b)
+   {
+ x = 0;
+ c = &x;
+ g = 1;
+   }
+}
+  *c = x;
+  for (x = 0; x != 10; x++)
+;
+  return g;
+}
--- gcc/testsuite/gcc.dg/pr48739-2.c.jj 2011-08-19 18:53:43.0 +0200
+++ gcc/testsuite/gcc.dg/pr48739-2.c 2011-08-19 18:54:00.0 +0200
@@ -0,0 +1,27 @@
+/* PR tree-optimization/48739 */
+/* { dg-do compile } */
+/* { dg-require-effective-target pthread } */
+/* { dg-options "-O1 -ftree-parallelize-loops=2 -fno-tree-dominator-opts" } */
+
+extern int g, v[10];
+extern void bar (void);
+
+int
+foo (int x)
+{
+  int a, b, *c = (int *) 0;
+  for (a = 0; a < 10; ++a)
+{
+  bar ();
+  for (b = 0; b < 5; ++b)
+   {
+ x = 0;
+ c = &x;
+ g = 1;
+   }
+}
+  *c = x;
+  for (x = 0; x != 10; x++)
+v[x] = x;
+  return g;
+}

Jakub


Re: RFC: [build, ada] Centralize PICFLAG configuration

2011-08-19 Thread Rainer Orth
Paolo,

> I've actually changed it to honor both -fpic and -fPIC in CFLAGS
> (resp. CFLAGS_FOR_TARGET).
>
> I'll fire off i386-pc-solaris2.10 and x86_64-unknown-linux-gnu
> bootstraps before leaving for home tonight, but the following patch has
> already been tested lightly with a minimal configure script that
> uses GCC_PICFLAG{, _FOR_TARGET}, exercising it without and with various
> --build/--host/--target settings and specifying -fpic/-fPIC in CFLAGS*.
> At least in this scenario, it works as expected.

the following patch has been fully tested.  Changes from the previous
version include correct quoting of [] in picflag.m4 and substituting
PICFLAG_FOR_TARGET in gcc/configure.ac.

The patch has been bootstrapped without regressions on
i386-pc-solaris2.10 and x86_64-unknown-linux-gnu.  I've also performed
an Ada-only --disable-libada bootstrap on Linux/x86_64 with Arnaud's
patch and confirmed that -fpic was correctly included when running make
gnatlib from gcc.  That make errored out later with

../../xgcc -B../../ -B/usr/local/x86_64-unknown-linux-gnu/bin/ -isystem 
/usr/local/x86_64-unknown-linux-gnu/include -isystem 
/usr/local/x86_64-unknown-linux-gnu/sys-include 
-L/var/gcc/gcc-4.7.0-20110819/2.6.18-gcc-gas-gld-no-libada/gcc/../ld -c 
a-assert.adb -o a-assert.o
a-assert.adb:32:01: user-defined descendents of package Ada are not allowed
make[2]: *** [a-assert.o] Error 1
make[2]: Leaving directory 
`/var/gcc/gcc-4.7.0-20110819/2.6.18-gcc-gas-gld-no-libada/gcc/ada/rts'
make[1]: *** [gnatlib] Error 2
make[1]: Leaving directory 
`/var/gcc/gcc-4.7.0-20110819/2.6.18-gcc-gas-gld-no-libada/gcc/ada'
make: *** [gnatlib] Error 2

which I haven't investigated further.

Ok for mainline now (or rather after the libgcc1 and before the libgcc2
patch once those are approved)?

Thanks.
Rainer


2011-07-31  Rainer Orth  

config:
* picflag.m4: New file.

gcc:
* configure.ac (GCC_PICFLAG_FOR_TARGET): Call it.
(PICFLAG_FOR_TARGET): Substitute.
* aclocal.m4: Regenerate.
* configure: Regenerate.

gcc/ada:
* gcc-interface/Makefile.in (PICFLAG_FOR_TARGET): New.
(GNATLIBCFLAGS_FOR_C): Replace
TARGET_LIBGCC2_CFLAGS by PICFLAG_FOR_TARGET.
(gnatlib-shared-default, gnatlib-shared-dual-win32)
(gnatlib-shared-win32, gnatlib-shared-darwin, gnatlib-shared)
(gnatlib-sjlj, gnatlib-zcx): Likewise.

libada:
* configure.ac: Include ../config/picflag.m4.
(GCC_PICFLAG): Call it.
Substitute.
* configure: Regenerate.
* Makefile.in (TARGET_LIBGCC2_CFLAGS): Replace by PICFLAG.
(GNATLIBCFLAGS_FOR_C): Replace TARGET_LIBGCC2_CFLAGS by PICFLAG.
(LIBADA_FLAGS_TO_PASS): Pass PICFLAG as PICFLAG_FOR_TARGET.
Don't include $(GCC_DIR)/libgcc.mvars.

libiberty:
* aclocal.m4: Include ../config/picflag.m4.
* configure.ac (GCC_PICFLAG): Call it.
(enable_shared): Clear PICFLAG unless shared.
* configure: Regenerate.

# HG changeset patch
# Parent 4f8f991ef1b75c25f4231de1fe5406200c19653d
Centralize PICFLAG configuration

diff --git a/config/picflag.m4 b/config/picflag.m4
new file mode 100644
--- /dev/null
+++ b/config/picflag.m4
@@ -0,0 +1,95 @@
+# _GCC_PICFLAG(FLAG, DISPATCH)
+# 
+# Store PIC flag corresponding to DISPATCH triplet in FLAG.
+# Explit use of -fpic in CFLAGS corresponding to FLAG overrides default.
+AC_DEFUN([_GCC_PICFLAG], [
+
+case "${$2}" in
+# PIC is the default on some targets or must not be used.
+*-*-darwin*)
+	# PIC is the default on this platform
+	# Common symbols not allowed in MH_DYLIB files
+	$1=-fno-common
+	;;
+alpha*-dec-osf5*)
+	# PIC is the default.
+	;;
+hppa*64*-*-hpux*)
+	# PIC is the default for 64-bit PA HP-UX.
+	;;
+i[[34567]]86-*-cygwin* | i[[34567]]86-*-mingw* | x86_64-*-mingw*)
+	;;
+i[[34567]]86-*-interix3*)
+	# Interix 3.x gcc -fpic/-fPIC options generate broken code.
+	# Instead, we relocate shared libraries at runtime.
+	;;
+i[[34567]]86-*-nto-qnx*)
+	# QNX uses GNU C++, but need to define -shared option too, otherwise
+	# it will coredump.
+	$1='-fPIC -shared'
+	;;
+i[[34567]]86-pc-msdosdjgpp*)
+	# DJGPP does not support shared libraries at all.
+	;;
+ia64*-*-hpux*)
+	# On IA64 HP-UX, PIC is the default but the pic flag
+	# sets the default TLS model and affects inlining.
+	$1=-fPIC
+	;;
+mips-sgi-irix6*)
+	# PIC is the default.
+	;;
+rs6000-ibm-aix* | powerpc-ibm-aix*)
+	# All AIX code is PIC.
+	;;
+
+# Some targets support both -fPIC and -fpic, but prefer the latter.
+# FIXME: Why?
+i[[34567]]86-*-* | x86_64-*-*)
+	$1=-fpic
+	;;
+m68k-*-*)
+	$1=-fpic
+	;;
+s390*-*-*)
+	$1=-fpic
+	;;
+# FIXME: Override -fPIC default in libgcc only? 
+sh-*-linux* | sh[[2346lbe]]*-*-linux*)
+	$1=-fpic
+	;;
+# FI

[Patch, fortran, obvious] PR fortran/50129 gfc_enforce_clean_symbol_state ICE after rejecting an ELSEWHERE statement

2011-08-19 Thread Mikael Morin
Hello, 

I'm going to commit the following (to trunk and 4.6) once the regression test 
finishes.

Mikael
2011-08-19  Mikael Morin  

PR fortran/50129
* parse.c (parse_where): Undo changes after emitting an error. 

2011-08-19  Mikael Morin  

PR fortran/50129
* where_3.f90: New test.

diff --git a/parse.c b/parse.c
index aab711c..9b11086 100644
--- a/parse.c
+++ b/parse.c
@@ -2778,6 +2778,7 @@ parse_where_block (void)
 	{
 	  gfc_error ("ELSEWHERE statement at %C follows previous "
 			 "unmasked ELSEWHERE");
+	  reject_statement ();
 	  break;
 	}
 
! { dg-do compile }
!
! PR fortran/50129
! ICE after reporting an error on a masked ELSEWHERE statement following an
! unmasked one.
!
! Contributed by Joost Van de Vondele 

INTEGER :: I(3)
WHERE (I>2)
ELSEWHERE
ELSEWHERE (I<1) ! { dg-error "follows previous unmasked ELSEWHERE" }
END WHERE
END



Re: [PATCH, testsuite, i386] AVX2 support for GCC

2011-08-19 Thread Uros Bizjak
On Fri, Aug 19, 2011 at 2:30 PM, Kirill Yukhin  wrote:

> Here is next patch, which adds support of code generation and intrinsics.
> Patch and ChangeLog are attached.
>
> Bootstrap and make check are passed
>
> Is it ok for trunk?

The patch looks good to me. If there are any other macroization
opportunities, we will find them later.

Before committing the patch, please fix whitespace, i.e. 8 spaces to
tab. You are using these inconsistently.

OK with these changes.

Thanks,
Uros.


Re: [PATCH, i386]: Expand round(a) = sgn(a) * floor(fabs(a) + 0.5) using SSE4 ROUND insn

2011-08-19 Thread Uros Bizjak
On Mon, Aug 15, 2011 at 5:25 PM, Michael Matz  wrote:

> On Mon, 15 Aug 2011, Michael Matz wrote:
>
>> > > .LFB0:
>> > >        .cfi_startproc
>> > >        movsd   .LC0(%rip), %xmm2
>> > >        movapd  %xmm0, %xmm1
>> > >        andpd   %xmm2, %xmm1
>> > >        andnpd  %xmm0, %xmm2
>> > >        addsd   .LC1(%rip), %xmm1
>> > >        roundsd $1, %xmm1, %xmm1
>> > >        orpd    %xmm2, %xmm1
>> > >        movapd  %xmm1, %xmm0
>> > >        ret
>> >
>> > Hm, why do we need the sign-copy?  If I read the docs correctly
>> > we can simply use roundsd directly, no?
>>
>> round-half-away-from-zero breaks your neck.  round[ps][sd] only supports
>> the usual four IEEE rounding modes.
>
> But, you should be able to apply the sign to the 0.5, which wouldn't
> require building the absolute value of input:
>
> round(x) = trunc(x + (copysign (0.5, x)))
>
> which should roughly be expanded to:
>
>        movsd   signbits(%rip), %xmm1
>       andpd   %xmm0, %xmm1
>       movsd   nextof0.5(%rip), %xmm2
>       orpd    %xmm1, %xmm2
>       addpd   %xmm2, %xmm0
>       roundsd $1, %xmm0, %xmm0
>        ret
>
> Which has one logical operation less (and one move because I chose a more
> optimal register assignment).

Thanks for the suggestion, I will implement and test it ASAP.

Uros.
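
As a rough illustration of the formula quoted above,
round(x) = trunc(x + copysign(0.5, x)), here is a minimal C sketch.  It is
not the i386 expander; `round_half_away` and `pred_half` are hypothetical
names, and the code assumes IEEE double arithmetic evaluated in double
precision (for example SSE2 rather than x87):

```c
#include <stdint.h>

/* Largest double below 0.5 (0.5 - 2^-54), playing the role of the
   "nextof0.5" constant in the assembly above.  Adding exactly 0.5
   would round 0.49999999999999994 up to 1.0.  */
static const double pred_half = 0x1.fffffffffffffp-2;

/* round (x) = trunc (x + copysign (pred_half, x)), written without
   <math.h> so that each step is visible.  The sign of a zero result
   is not preserved in this simplified version.  */
double
round_half_away (double x)
{
  double mag = x < 0 ? -x : x;
  double biased;

  /* Values of magnitude >= 2^52 are already integral; NaN and
     infinities also take this path.  */
  if (!(mag < 0x1p52))
    return x;

  /* Apply the sign of X to the constant instead of taking fabs (x).  */
  biased = x + (x < 0 ? -pred_half : pred_half);

  /* The integer conversion truncates toward zero.  */
  return (double) (int64_t) biased;
}
```

The point of the rewrite is visible in the middle step: the sign is moved
onto the constant, so the input never has to be split into magnitude and
sign and recombined afterwards.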


Re: Add __builtin_complex to construct complex values (C1X CMPLX* macros)

2011-08-19 Thread Gabriel Dos Reis
On Fri, Aug 19, 2011 at 10:55 AM, Joseph S. Myers
 wrote:
>  Note that if you did
> allow such initializers for C, it wouldn't provide *expressions*
> usable in static initializers, since to make a braced initializer into
> an expression you need a compound literal and compound literals can't
> be used in static initializers.)

Thanks for the rationale.  I was puzzled until I read that bit.
I would have thought that the natural thing to do was to fix
C's compound literals so that they can be used in static initializers.
Do you know why WG14 did not want to do that?

-- Gaby


Re: PATCH: PR middle-end/49721: convert_memory_address_addr_space may generate invalid new insns

2011-08-19 Thread H.J. Lu
On Sun, Aug 14, 2011 at 9:22 AM, H.J. Lu  wrote:
> Hi,
>
> This patch is needed for x32 and only affects x32.  Any comments/objections
> to apply this to finish x32 support?
>
> Thanks.
>
>
> H.J.
> 
> On Thu, Aug 11, 2011 at 6:25 AM, H.J. Lu  wrote:
>> Hi,
>>
>> This is the last patch needed for x32 support.
>> convert_memory_address_addr_space
>> is called to convert a memory address without overflow/underflow.  It
>> should be safe
>> to transform
>>
>> (zero_extend:DI (plus:SI (FOO:SI) (const_int Y)))
>>
>> to
>>
>> (plus:DI (zero_extend:DI (FOO:SI)) (const_int Y))
>>
>> GCC only works this way.  Any comments?
>>
>> Thanks.
>>
>> H.J.
>> 
>> On Sun, Aug 7, 2011 at 1:08 PM, H.J. Lu  wrote:
>>> Hi,
>>>
>>> We transform
>>>
>>> (ptr_extend:DI (plus:SI (FOO:SI) (const_int Y)))
>>>
>>> to
>>>
>>> (plus:DI (ptr_extend:DI (FOO:SI)) (const_int Y))
>>>
>>> since this is how Pmode != ptr_mode is supported even if the resulting
>>> address may overflow/underflow.   It is also true for x32 which has
>>> zero_extend instead of ptr_extend.  I have tried different approaches
>>> to avoid transforming
>>>
>>> (zero_extend:DI (plus:SI (FOO:SI) (const_int Y)))
>>>
>>> to
>>>
>>> (plus:DI (zero_extend:DI (FOO:SI)) (const_int Y))
>>>
>>> without success.  This patch relaxes the condition to check
>>> POINTERS_EXTEND_UNSIGNED != 0 instead of POINTERS_EXTEND_UNSIGNED < 0
>>> to cover both ptr_extend and zero_extend. We can investigate a better
>>> approach for ptr_extend and zero_extend later.  For now, I believe it
>>> is the safest way to support ptr_extend and zero_extend.
>>>
>>> Any comments?
>>>
>>> Thanks.
>>>
>>>
>>> H.J.

I am checking in this patch, which only affects x32
and nothing else.  This one character change, from

POINTERS_EXTEND_UNSIGNED < 0

to

POINTERS_EXTEND_UNSIGNED != 0

creates a working x32 GCC. This isn't perfect. I have
tried many different approaches without any success.
I will revisit it if we run into any problems with x32
applications.

Thanks.

-- 
H.J.
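
To make the overflow/underflow caveat concrete, the two RTL forms quoted
above can be modelled with fixed-width C integers.  This is only a sketch
of the arithmetic, not of explow.c; `extend_after_add` and
`add_after_extend` are hypothetical names:

```c
#include <stdint.h>

/* (zero_extend:DI (plus:SI (FOO:SI) (const_int Y))): add in 32 bits,
   wrapping modulo 2^32, then zero-extend the result.  */
uint64_t
extend_after_add (uint32_t foo, int32_t y)
{
  uint32_t sum = foo + (uint32_t) y;
  return (uint64_t) sum;
}

/* (plus:DI (zero_extend:DI (FOO:SI)) (const_int Y)): zero-extend
   first, then add the sign-extended constant in 64 bits.  */
uint64_t
add_after_extend (uint32_t foo, int32_t y)
{
  return (uint64_t) foo + (uint64_t) (int64_t) y;
}
```

The two agree whenever the 32-bit addition does not wrap (for example
foo = 0x1000, y = 16).  They differ when it does: foo = 0xfffffff0 with
y = 0x20 gives 0x10 for the first form and 0x100000010 for the second,
and foo = 4 with y = -8 gives 0xfffffffc versus 0xfffffffffffffffc.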

gcc/

2011-08-19  H.J. Lu  

PR middle-end/49721
* explow.c (convert_memory_address_addr_space): Also permute the
conversion and addition of constant for zero-extend.

gcc/testsuite/

2011-08-19  H.J. Lu  

PR middle-end/49721
* gfortran.dg/pr49721-1.f: New.
* gfortran.fortran-torture/compile/pr49721-1.f: Likewise.

diff --git a/gcc/explow.c b/gcc/explow.c
index beeab44..984150e 100644
--- a/gcc/explow.c
+++ b/gcc/explow.c
@@ -384,18 +384,23 @@ convert_memory_address_addr_space (enum machine_mode to_mode ATTRIBUTE_UNUSED,
 
 case PLUS:
 case MULT:
-  /* For addition we can safely permute the conversion and addition
-	 operation if one operand is a constant and converting the constant
-	 does not change it or if one operand is a constant and we are
-	 using a ptr_extend instruction  (POINTERS_EXTEND_UNSIGNED < 0).
+  /* FIXME: For addition, we used to permute the conversion and
+	 addition operation only if one operand is a constant and
+	 converting the constant does not change it or if one operand
+	 is a constant and we are using a ptr_extend instruction
+	 (POINTERS_EXTEND_UNSIGNED < 0) even if the resulting address
+	 may overflow/underflow.  We relax the condition to include
+	 zero-extend (POINTERS_EXTEND_UNSIGNED > 0) since the other
+	 parts of the compiler depend on it.  See PR 49721.
+
 	 We can always safely permute them if we are making the address
 	 narrower.  */
   if (GET_MODE_SIZE (to_mode) < GET_MODE_SIZE (from_mode)
 	  || (GET_CODE (x) == PLUS
 	  && CONST_INT_P (XEXP (x, 1))
-	  && (XEXP (x, 1) == convert_memory_address_addr_space
-   (to_mode, XEXP (x, 1), as)
- || POINTERS_EXTEND_UNSIGNED < 0)))
+	  && (POINTERS_EXTEND_UNSIGNED != 0
+		  || XEXP (x, 1) == convert_memory_address_addr_space
+		  			(to_mode, XEXP (x, 1), as))))
 	return gen_rtx_fmt_ee (GET_CODE (x), to_mode,
 			   convert_memory_address_addr_space
  (to_mode, XEXP (x, 0), as),
diff --git a/gcc/testsuite/gfortran.dg/pr49721-1.f b/gcc/testsuite/gfortran.dg/pr49721-1.f
new file mode 100644
index 000..39e2ed7
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/pr49721-1.f
@@ -0,0 +1,35 @@
+! PR middle-end/49721
+! { dg-do compile }
+! { dg-options "-O3 -funroll-loops" }
+
+  subroutine midbloc6(c,a2,a2i,q)
+  parameter (ndim2=6)
+  parameter (ndim=3)
+  dimension ri(ndim2),cr(ndim2,ndim2),xj(ndim2,ndim2),q(*)
+ @,sai(ndim2,ndim2),cm(ndim2,ndim2),w(ndim2,ndim2)
+  dimension vr(ndim2,ndim2),vi(ndim2,ndim2),s1(ndim2,ndim2),p(ndim)
+  dimension xq(6),qb(2),qc(2),ifl(6),iplane(3)
+  save
+  c

Re: [rtl, delay-slot] Fix overload of "unchanging" bit

2011-08-19 Thread Kaz Kojima
Richard Henderson  wrote:
> The following has passed stage2-gcc on sparc64-linux host (full build still
> in progress), with --enable-checking=yes,rtl.  It surely needs more than that,
> and I'm asking for help from the relevant maintainers to give this a try.

There are no regressions for sh4-unknown-linux-gnu.

Regards,
kaz


Re: [Patch, Fortran, OOP] PR 49638: [OOP] length parameter is ignored when overriding type bound character functions with constant length.

2011-08-19 Thread Janus Weil
>> > 2011/8/7 Thomas Koenig :
>> >> When extending the values of gfc_dep_compare_expr, we will need to go
>> >> through all its uses (making sure we change == -2 to <= -2).
>> >
>> > attached is a patch which makes a start with this.
>> >
>> > For now, it changes the return value to "-3" for two cases:
>> > 1) different expr_types
>> > 2) non-identical variables
>> >
>> > I tried to take care of all places which are checking for a return
>> > value of "-2" and I hope I missed none.
>> >
>> > Any objections or ok for trunk? (Regtested successfully.)
> OK from my side for the code proper.

Thanks for the review.


> I have one comment though about this:
> +/* Compare two expressions.  Return values:
> +   * +1 if e1 > e2
> +   * 0 if e1 == e2
> +   * -1 if e1 < e2
> +   * -2 if the relationship could not be determined
> +   * -3 if e1 /= e2, but we cannot tell which one is larger.  */
>
> I think this is misleading, as the function does not always return -3 when
> e1/=e2.

That's right. However, the same argument applies to the other values
as well: The function does not always return 0 if e1==e2. There could
be cases where the arguments are algebraically equal, but we fail to
detect this (example: A+B+C vs C+B+A). This sort of "uncertainty" was
not introduced by me, but was present before, and is not special to
the value "-3".

Describing the value -2 as "relationship could not be determined" sort
of implies that this can happen. So I would tend to leave the
description as it is.


> There is for example (currently) no special handling for operators.

Well, unfortunately one cannot just return "-3" for non-matching
operators. Just think of cases like A*(B+C) vs A*B+A*C. One could try
to handle such cases in a follow-up patch.

I'll commit the patch (as posted) tomorrow, if Mikael agrees that the
description is ok.

Also I like Tobias' idea of using an enum, but I'll leave it for a follow-up.

Cheers,
Janus


Re: Add __builtin_complex to construct complex values (C1X CMPLX* macros)

2011-08-19 Thread Joseph S. Myers
On Fri, 19 Aug 2011, Jakub Jelinek wrote:

> On Fri, Aug 19, 2011 at 03:55:12PM +, Joseph S. Myers wrote:
> > Bootstrapped with no regressions on x86_64-unknown-linux-gnu.  Applied
> > to mainline.
> 
> The new tests ICE on i686-linux:
> FAIL: gcc.dg/builtin-complex-err-1.c (internal compiler error)
> FAIL: gcc.dg/builtin-complex-err-2.c (internal compiler error)
> FAIL: gcc.dg/torture/builtin-complex-1.c  -O*  (internal compiler error)
> 
> All the ICEs are on
> case EXCESS_PRECISION_EXPR:
>   /* Each case where an operand with excess precision may be
>  encountered must remove the EXCESS_PRECISION_EXPR around
>  inner operands and possibly put one around the whole
>  expression or possibly convert to the semantic type (which
>  c_fully_fold does); we cannot tell at this stage which is
>  appropriate in any particular case.  */
>   gcc_unreachable ();
> in c_fully_fold_internal.

I've applied this patch that will hopefully fix the problem by
converting operands of __builtin_complex to their semantic types.
Bootstrapped with no regressions on x86_64-unknown-linux-gnu.

2011-08-19  Joseph Myers  

* c-parser.c (c_parser_postfix_expression): Convert operands of
__builtin_complex to their semantic types.

Index: c-parser.c
===
--- c-parser.c  (revision 177911)
+++ c-parser.c  (working copy)
@@ -6428,7 +6428,13 @@ c_parser_postfix_expression (c_parser *p
  c_parser_skip_until_found (parser, CPP_CLOSE_PAREN,
 "expected %<)%>");
  mark_exp_read (e1.value);
+ if (TREE_CODE (e1.value) == EXCESS_PRECISION_EXPR)
+   e1.value = convert (TREE_TYPE (e1.value),
+   TREE_OPERAND (e1.value, 0));
  mark_exp_read (e2.value);
+ if (TREE_CODE (e2.value) == EXCESS_PRECISION_EXPR)
+   e2.value = convert (TREE_TYPE (e2.value),
+   TREE_OPERAND (e2.value, 0));
  if (!SCALAR_FLOAT_TYPE_P (TREE_TYPE (e1.value))
  || DECIMAL_FLOAT_TYPE_P (TREE_TYPE (e1.value))
  || !SCALAR_FLOAT_TYPE_P (TREE_TYPE (e2.value))

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: Add __builtin_complex to construct complex values (C1X CMPLX* macros)

2011-08-19 Thread Joseph S. Myers
On Fri, 19 Aug 2011, Gabriel Dos Reis wrote:

> On Fri, Aug 19, 2011 at 10:55 AM, Joseph S. Myers
>  wrote:
> >  Note that if you did
> > allow such initializers for C, it wouldn't provide *expressions*
> > usable in static initializers, since to make a braced initializer into
> > an expression you need a compound literal and compound literals can't
> > be used in static initializers.)
> 
> Thanks for the rationale.  I was puzzled until I read that bit.
> I would have thought that the natural thing to do was to fix
> C's compound literals so that they can be used in static initializers.
> Do you know why WG14 did not want to do that?

A compound literal is essentially an anonymous variable with a given
initializer, so I suppose it comes down to C not allowing const variables
(to which const qualified compound literals are equivalent, except that
they may share storage, like string constants and unlike named variables)
in initializers.  I don't know a specific rationale for that difference
between C and C++.

-- 
Joseph S. Myers
jos...@codesourcery.com
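
For illustration, a minimal sketch of the static-initializer point made in
this thread, assuming a compiler that provides `__builtin_complex`.  The
helper names are hypothetical, and the `__real__`/`__imag__` operators are
GNU extensions used here only to avoid depending on <complex.h>:

```c
/* Usable where a constant expression is required, which a compound
   literal is not.  */
static const double _Complex z_static = __builtin_complex (1.0, 2.0);

/* Both operands must have the same real floating type; the result has
   the corresponding complex type.  */
double _Complex
make_complex (double re, double im)
{
  return __builtin_complex (re, im);
}

/* Extract the real part (GNU extension).  */
double
real_part (double _Complex z)
{
  return __real__ z;
}

/* Extract the imaginary part (GNU extension).  */
double
imag_part (double _Complex z)
{
  return __imag__ z;
}
```

The built-in sets the two parts directly instead of computing
`re + im * I`, which is what makes it suitable for implementing the
CMPLX* macros.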

patch to solve PR49936

2011-08-19 Thread Vladimir Makarov
The following patch makes gcc4.7 behave as gcc4.6 for the case
described on http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49936.


The patch was successfully bootstrapped on x86_64 and ppc64.

Committed as rev 177916.

2011-08-19  Vladimir Makarov 

PR rtl-optimization/49936
* ira.c (ira_init_register_move_cost): Ignore too small subclasses
for calculation of max register move costs.

Index: ira.c
===
--- ira.c   (revision 177573)
+++ ira.c   (working copy)
@@ -1501,6 +1501,10 @@ ira_init_register_move_cost (enum machin
  sizeof (move_table) * N_REG_CLASSES);
   for (cl1 = 0; cl1 < N_REG_CLASSES; cl1++)
 {
+  /* Some subclasses are to small to have enough registers to hold
+a value of MODE.  Just ignore them.  */
+  if (! contains_reg_of_mode[cl1][mode])
+   continue;
   COPY_HARD_REG_SET (temp_hard_regset, reg_class_contents[cl1]);
   AND_COMPL_HARD_REG_SET (temp_hard_regset, no_unit_alloc_regs);
   if (hard_reg_set_empty_p (temp_hard_regset))


Re: [Patch, Fortran, OOP] PR 49638: [OOP] length parameter is ignored when overriding type bound character functions with constant length.

2011-08-19 Thread Mikael Morin
On Friday 19 August 2011 23:54:45 Janus Weil wrote:
> > I have one comment though about this:
> > +/* Compare two expressions.  Return values:
> > +   * +1 if e1 > e2
> > +   * 0 if e1 == e2
> > +   * -1 if e1 < e2
> > +   * -2 if the relationship could not be determined
> > +   * -3 if e1 /= e2, but we cannot tell which one is larger.  */
> > 
> > I think this is misleading, as the function does not always return -3
> > when e1/=e2.
> 
> That's right. However, the same argument applies to the other values
> as well: The function does not always return 0 if e1==e2. There could
> be cases where the arguments are algebraically equal, but we fail to
> detect this (example: A+B+C vs C+B+A). This sort of "uncertainty" was
> not introduced by me, but was present before, and is not special to
> the value "-3".
> 
> Describing the value -2 as "relationship could not be determined" sort
> of implies that this can happen. So I would tend to leave the
> description as it is.
OK, this comment really bugged me, but it's not that bad on second thought.

> 
> > There is for example (currently) no special handling for operators.
> 
> Well, unfortunately one cannot just return "-3" for non-matching
> operators. Just think of cases like A*(B+C) vs A*B+A*C. 
Ah yes. I was thinking expressions themselves were compared; but only their 
values are. 

> One could try to handle such cases in a follow-up patch.
If you want. I wasn't asking you (or anyone else) to do it. 

> 
> I'll commit the patch (as posted) tomorrow, if Mikael agrees that the
> description is ok.
It's fine. Thanks.

Mikael.


[pph] Add support for line table streaming with includes (issue4908051)

2011-08-19 Thread Gabriel Charette
Applied requested changes.

Tested on x64 with bootstrap and pph regression testing.

Committing to pph branch.  If any other changes are needed to this patch, I'm
writing a clean-up patch for the line_table implementation now and will add
whatever else is needed to it.

One thing I wasn't sure about is how to handle headers with no .h extension.
Will we ever pph any of those?  I currently support them in this patch.

Cheers,
Gab

2011-08-19  Gabriel Charette  

gcc/cp/ChangeLog.pph
* pph-streamer-in.c (pph_in_includes): Remove.
(pph_in_linetable_marker): New.
(pph_in_line_table_and_includes): Renamed from
pph_in_and_merge_line_table.
Now handles line_table and includes input in parallel.
Now returns source_location corresponding to line 1 / col 0
of the header currently loaded as pph.
(pph_read_file_1): Read line_table and includes before replaying
tokens. Use location returned by pph_in_line_table_and_includes
as forced token location for replayed tokens.
* pph-streamer-out.c (pph_out_includes): Remove.
(pph_out_linetable_marker): New.
(pph_filename_eq_ignoring_path): New.
(pph_get_next_include): New.
(pph_out_line_table_and_includes): Renamed from pph_out_line_table.
Now handles output of both the line_table and includes references
in parallel.
(pph_write_file): Write out line_table and includes before
identifiers.
* pph-streamer.h (enum pph_linetable_marker): New.
* pph.c (pph_include_handler): Add hack to mimic
line_table->highest_location behaviour in _cpp_stack_include used
by the non-pph compiler.

gcc/testsuite/ChangeLog.pph
* g++.dg/pph/p4eabi1.cc: Remove asm xdiff.
* g++.dg/pph/x4keyed.cc: Changed line info for expected failure.
* g++.dg/pph/x7rtti.cc: Changed line info for expected failures.

diff --git a/gcc/cp/pph-streamer-in.c b/gcc/cp/pph-streamer-in.c
index 4666ace..9686edf 100644
--- a/gcc/cp/pph-streamer-in.c
+++ b/gcc/cp/pph-streamer-in.c
@@ -1281,27 +1281,6 @@ pph_in_symtab (pph_stream *stream)
 }
 
 
-/* Read all the images included by STREAM.  */
-
-static void
-pph_in_includes (pph_stream *stream)
-{
-  unsigned i, num;
-
-  pph_reading_includes++;
-
-  num = pph_in_uint (stream);
-  for (i = 0; i < num; i++)
-{
-  const char *include_name = pph_in_string (stream);
-  pph_stream *include = pph_read_file (include_name);
-  pph_add_include (include, false);
-}
-
-  pph_reading_includes--;
-}
-
-
 /* Read a linenum_type from STREAM.  */
 
 static inline linenum_type
@@ -1320,6 +1299,20 @@ pph_in_source_location (pph_stream *stream)
 }
 
 
+/* Read a line table marker from STREAM.  */
+
+static inline enum pph_linetable_marker
+pph_in_linetable_marker (pph_stream *stream)
+{
+  enum pph_linetable_marker m =
+(enum pph_linetable_marker) pph_in_uchar (stream);
+  gcc_assert (m == PPH_LINETABLE_ENTRY
+ || m == PPH_LINETABLE_REFERENCE
+ || m == PPH_LINETABLE_END);
+  return m;
+}
+
+
 /* Read a line_map from STREAM into LM.  */
 
 static void
@@ -1343,47 +1336,100 @@ pph_in_line_map (pph_stream *stream, struct line_map 
*lm)
 }
 
 
-/* Read the line_table from STREAM and merge it in LINETAB.  */
+/* Read the line_table from STREAM and merge it in LINETAB.  At the same time
+   load includes in the order they were originally included by loading them at
+   the point they were referenced in the line_table.
 
-static void
-pph_in_and_merge_line_table (pph_stream *stream, struct line_maps *linetab)
+   Returns the source_location of line 1 / col 0 for this include.
+
+   FIXME pph: The line_table is now identical to the non-pph line_table, the
+   only problem is that we load line_table entries twice for headers that are
+   re-included and are #ifdef guarded; thus shouldn't be replayed.  This is
+   a known current issue, so I didn't bother working around it here for now.  
*/
+
+static source_location
+pph_in_line_table_and_includes (pph_stream *stream, struct line_maps *linetab)
 {
-  unsigned int ix, pph_used, old_depth;
+  unsigned int old_depth;
+  bool first;
+  int includer_ix = -1;
+  unsigned int used_before = linetab->used;
   int entries_offset = linetab->used - PPH_NUM_IGNORED_LINE_TABLE_ENTRIES;
+  enum pph_linetable_marker next_lt_marker = pph_in_linetable_marker (stream);
 
-  pph_used = pph_in_uint (stream);
+  pph_reading_includes++;
 
-  for (ix = 0; ix < pph_used; ix++, linetab->used++)
+  for (first = true; next_lt_marker != PPH_LINETABLE_END;
+   next_lt_marker = pph_in_linetable_marker (stream))
 {
-  struct line_map *lm;
+  if (next_lt_marker == PPH_LINETABLE_REFERENCE)
+   {
+ int old_loc_offset;
+ const char *include_name = pph_in_string (stream);
+ source_location prev_start_loc = pph_in_source_location (stream);
+ pph_stream *include;
+
+ gcc_a

Re: [libiberty patch] Add demangler support for cloned function symbols (PR 40831)

2011-08-19 Thread Cary Coutant
> OK, thanks. Dmitry G. also commented that the patch does not work "for
> `_Z3fooi.1988' or `_Z3fooi.part.9.165493.constprop.775.31805'."
> Apparently, there can be multiple numeric suffixes, and a cloned
> function can be cloned again. Is it worth trying to identify the kinds
> of cloning in the demangled name, or should I just look for a generic
> pattern instead?

Here's an updated patch that generalizes the clone suffix pattern recognition.
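
A minimal sketch of what a generic clone-suffix pattern could look like, assuming each suffix is a '.' followed by either a run of lowercase letters or a run of digits (the same test the loop in cplus_demangle_mangled_name applies to the character after the '.'). clone_suffix_len is a hypothetical helper written for illustration only; it is not the d_clone_suffix from the patch and it does not build any demangle_component.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper, not part of libiberty.  Return the length of the
   run of clone suffixes starting at S.  Each suffix is assumed to be a
   '.' followed by one or more lowercase letters, or a '.' followed by
   one or more digits.  Return 0 if S does not start with such a suffix.  */
static size_t
clone_suffix_len (const char *s)
{
  const char *p = s;

  while (p[0] == '.'
         && (islower ((unsigned char) p[1])
             || isdigit ((unsigned char) p[1])))
    {
      /* Remember which kind of run this suffix is, then consume it.  */
      int digits = isdigit ((unsigned char) p[1]) != 0;

      p += 2;
      while (digits
             ? isdigit ((unsigned char) *p)
             : islower ((unsigned char) *p))
        ++p;
    }

  return (size_t) (p - s);
}
```

Called on the text following the encoding, this consumes all of ".1988" and all of ".part.9.165493.constprop.775.31805", and returns 0 for something like ".Foo", so a caller could leave that case to the existing error path.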

Does this look OK for trunk?

I've bootstrapped and regression tested on x86_64.

-cary


2011-08-19   Cary Coutant  

* include/demangle.h (enum demangle_component_type):
* libiberty/cp-demangle.c (struct d_info):
(CP_STATIC_IF_GLIBCPP_V3):
(struct d_print_info):
* libiberty/testsuite/demangle-expected (DFA):

include/ChangeLog:

PR 40831
* demangle.h (enum demangle_component_type): Add
DEMANGLE_COMPONENT_CLONE.

libiberty/ChangeLog:

PR 40831
* cp-demangle.c (d_make_comp): Add new component type.
(cplus_demangle_mangled_name): Check for clone suffixes.
(d_parmlist): Don't error out if we see '.'.
(d_clone_suffix): New function.
(d_print_comp): Print info for clone suffixes.
* testsuite/demangle-expected: Add new testcases.


commit 69ffd5c629215428cf817dfd11bc4fc5f7d07715
Author: Cary Coutant 
Date:   Thu Apr 7 17:36:13 2011 -0700

Demangle cloned function names.

diff --git a/include/demangle.h b/include/demangle.h
index 53f6c54..960e88e 100644
--- a/include/demangle.h
+++ b/include/demangle.h
@@ -402,7 +402,9 @@ enum demangle_component_type
   /* An unnamed type.  */
   DEMANGLE_COMPONENT_UNNAMED_TYPE,
   /* A pack expansion.  */
-  DEMANGLE_COMPONENT_PACK_EXPANSION
+  DEMANGLE_COMPONENT_PACK_EXPANSION,
+  /* A cloned function.  */
+  DEMANGLE_COMPONENT_CLONE
 };
 
 /* Types which are only used internally.  */
diff --git a/libiberty/cp-demangle.c b/libiberty/cp-demangle.c
index d67a9e7..11da407 100644
--- a/libiberty/cp-demangle.c
+++ b/libiberty/cp-demangle.c
@@ -417,6 +417,9 @@ static struct demangle_component *d_lambda (struct d_info *);
 
 static struct demangle_component *d_unnamed_type (struct d_info *);
 
+static struct demangle_component *
+d_clone_suffix (struct d_info *, struct demangle_component *);
+
 static int
 d_add_substitution (struct d_info *, struct demangle_component *);
 
@@ -802,6 +805,7 @@ d_make_comp (struct d_info *di, enum demangle_component_type type,
 case DEMANGLE_COMPONENT_LITERAL_NEG:
 case DEMANGLE_COMPONENT_COMPOUND_NAME:
 case DEMANGLE_COMPONENT_VECTOR_TYPE:
+case DEMANGLE_COMPONENT_CLONE:
   if (left == NULL || right == NULL)
return NULL;
   break;
@@ -1034,7 +1038,7 @@ d_make_sub (struct d_info *di, const char *name, int len)
   return p;
 }
 
-/* <mangled-name> ::= _Z <encoding>
+/* <mangled-name> ::= _Z <encoding> [<clone-suffix>]*
 
TOP_LEVEL is non-zero when called at the top level.  */
 
@@ -1042,6 +1046,8 @@ CP_STATIC_IF_GLIBCPP_V3
 struct demangle_component *
 cplus_demangle_mangled_name (struct d_info *di, int top_level)
 {
+  struct demangle_component *p;
+
   if (! d_check_char (di, '_')
   /* Allow missing _ if not at toplevel to work around a
 bug in G++ abi-version=2 mangling; see the comment in
@@ -1050,7 +1056,17 @@ cplus_demangle_mangled_name (struct d_info *di, int top_level)
 return NULL;
   if (! d_check_char (di, 'Z'))
 return NULL;
-  return d_encoding (di, top_level);
+  p = d_encoding (di, top_level);
+
+  /* If at top level and parsing parameters, check for a clone
+ suffix.  */
+  if (top_level && (di->options & DMGL_PARAMS) != 0)
+while (d_peek_char (di) == '.'
+  && (IS_LOWER (d_peek_next_char (di))
+  || IS_DIGIT (d_peek_next_char (di))))
+  p = d_clone_suffix (di, p);
+
+  return p;
 }
 
 /* Return whether a function should have a return type.  The argument
@@ -2354,7 +2370,7 @@ d_parmlist (struct d_info *di)
   struct demangle_component *type;
 
   char peek = d_peek_char (di);
-  if (peek == '\0' || peek == 'E')
+  if (peek == '\0' || peek == 'E' || peek == '.')
break;
   type = cplus_demangle_type (di);
   if (type == NULL)
@@ -3082,6 +3098,33 @@ d_unnamed_type 

Re: Announcing the Port of Intel(r) Cilk (TM) Plus into GCC

2011-08-19 Thread Mike Stump
On Aug 15, 2011, at 1:30 PM, Iyer, Balaji V wrote:
>   This letter describes the recently created GCC branch called "cilkplus" 
> that ports the Intel(R) Cilk(TM) Plus language extensions to the C and C++ 
> front-ends of gcc-4.7. We are looking for collaborators and advice as we 
> proceed

Enhance the gcc plugin infrastructure to permit the extension to be a pure 
plugin.  :-)  I'm thinking about doing this for the Objective-C and 
Objective-C++ languages, as a fun, get-your-feet-wet project.  We can rely upon 
-flto to improve performance, should performance be a concern.

The actual goal, however, is to provide a way for people to play around and add 
extensions, such as the Apple Blocks extension, but without 
rebuilding gcc, only using the standard plugin interface.  I think longer term, 
this can enhance the design and layout of gcc itself.