Re: [C++ Patch] More accurate location for conditional expressions

2013-05-22 Thread Andreas Schwab
Paolo Carlini  writes:

>   * call.c (build_conditional_expr_1): Add location_t parameter.
>   (build_conditional_expr): Likewise.

../../gcc/objc/objc-next-runtime-abi-02.c: In function 'tree_node* 
build_v2_build_objc_method_call(int, tree, tree, tree, tree, bool)':
../../gcc/objc/objc-next-runtime-abi-02.c:1642:83: error: invalid conversion 
from 'tree' to 'location_t {aka unsigned int}' [-fpermissive]
   ret_val = build_conditional_expr (ifexp, ret_val, ftree, 
tf_warning_or_error);

   ^
../../gcc/objc/objc-next-runtime-abi-02.c:1642:83: error: cannot convert 
'tsubst_flags' to 'tree' for argument '4' to 'tree_node* 
build_conditional_expr(location_t, tree, tree, tree, tsubst_flags_t)'
make[3]: *** [objcp/objc-next-runtime-abi-02.o] Error 1

Andreas.

-- 
Andreas Schwab, SUSE Labs, sch...@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."


Re: [PATCH] Allow nested use of attributes in MD-files

2013-05-22 Thread Kirill Yukhin
> Ok.
Thanks a lot!
Checked into trunk: http://gcc.gnu.org/ml/gcc-cvs/2013-05/msg00698.html

Thanks, K


Re: [PATCH] Allow nested use of attributes in MD-files

2013-05-22 Thread Kirill Yukhin
BTW, do we need to update GCC internals with this change?

Thanks, K

On Wed, May 22, 2013 at 11:49 AM, Kirill Yukhin  wrote:
>> Ok.
> Thanks a lot!
> Checked into trunk: http://gcc.gnu.org/ml/gcc-cvs/2013-05/msg00698.html
>
> Thanks, K


Re: [4.8 PATCH, i386]: Fix PR 57356, SSE2 instructions generated with '-mno-sse2'

2013-05-22 Thread Jakub Jelinek
On Tue, May 21, 2013 at 07:03:54PM +0200, Uros Bizjak wrote:
> This patch avoids movdqu/movdqa when SSE2 is disabled. Although
> -mno-sse2 is bordering on ABI violation for 64bit targets, the patch
> is simple enough to fix movti_internal_rex64 pattern.

If the TImode attr variant isn't valid for !SSE2, I'd find it clearer
to do
(match_test "!TARGET_SSE2") (const_string "V4SF")
much earlier in the cond, before TARGET_AVX case that returns TImode
(sure, when TARGET_AVX is true, then !TARGET_SSE2 is false), and
not added using ior to an unrelated condition.

That said, you are the maintainer, thus your decision; I think this is
fine for 4.8 today (if committed before RC2 is rolled).

> 2013-05-21  Uros Bizjak  
> 
> PR target/57356
> * config/i386/i386.md (*movti_internal_rex64): Emit movaps/movups
> for non-sse2 targets.
> 
> Tested on x86_64-pc-linux-gnu {,-m32}. OK for 4.8 branch?
> 
> (The patch is also needed for 4.7 branch, but not for mainline, where
> this problem is fixed by movti_internal/movti_internal_rex64 merge).

> Index: config/i386/i386.md
> ===
> --- config/i386/i386.md   (revision 199017)
> +++ config/i386/i386.md   (working copy)
> @@ -1805,7 +1805,8 @@
>(const_string "V4SF")
>  (match_test "TARGET_AVX")
>(const_string "TI")
> -(match_test "optimize_function_for_size_p (cfun)")
> +(ior (not (match_test "TARGET_SSE2"))
> + (match_test "optimize_function_for_size_p (cfun)"))
>(const_string "V4SF")
>  ]
>  (const_string "TI")))])


Jakub


Re: [4.8 PATCH, i386]: Fix PR 57356, SSE2 instructions generated with '-mno-sse2'

2013-05-22 Thread Uros Bizjak
On Wed, May 22, 2013 at 9:55 AM, Jakub Jelinek  wrote:
> On Tue, May 21, 2013 at 07:03:54PM +0200, Uros Bizjak wrote:
>> This patch avoids movdqu/movdqa when SSE2 is disabled. Although
>> -mno-sse2 is bordering on ABI violation for 64bit targets, the patch
>> is simple enough to fix movti_internal_rex64 pattern.
>
> If the TImode attr variant isn't valid for !SSE2, I'd find it clearer
> to do
> (match_test "!TARGET_SSE2") (const_string "V4SF")
> much earlier in the cond, before TARGET_AVX case that returns TImode
> (sure, when TARGET_AVX is true, then !TARGET_SSE2 is false), and
> not added using ior to an unrelated condition.

Yes, this is a good suggestion, I was trying to be consistent with the
movti_internal pattern.

I will change both in the way you suggested ASAP.

Thanks,
Uros.


Re: [patch, powerpc] increase array alignment for Altivec

2013-05-22 Thread Richard Biener
On Wed, May 22, 2013 at 3:57 AM, David Edelsohn  wrote:
> On Tue, May 21, 2013 at 7:13 PM, Sandra Loosemore
>  wrote:
>> On 05/21/2013 04:04 PM, David Edelsohn wrote:
>>>
>>>
>>> There are three issues here:
>>>
>>> 1) Someone in the LTC toolchain team needs to benchmark this patch on
>>> POWER7.
>>
>>
>> That would be great if somebody else could help with that.
>>
>>
>>> 2) We need to clarify how the patch affects the ABI because it cannot
>>> break the ABI.
>>
>>
>> I understand this.
>>
>>
>>> 3) Please stop saying that you cannot justify trying to get the patch
>>> in mainline.  Other developers have pointed out how the patch may be
>>> incorrect. Do you want to deliver a broken compiler to CodeSourcery's
>>> customers? The comment sets a bad tone for engaging with the GCC
>>> community.
>>
>>
>> I think you've misunderstood my position, here.  Delivering a broken
>> compiler is just what I want to avoid!  We've had the original
>> local-arrays-only patch in our local tree for a couple of years now, but we
>> no longer have a customer for it.  I thought the comments from the previous
>> review would be straightforward to address and it would be worth making one
>> more attempt to revise and resubmit the patch, but if the feedback we get
>> from the community is that this is still broken in other ways and is going
>> to need a lot more work before it's acceptable, we're going to give up on it
>> and revert the previous version of the patch locally too.  We have a lot of
>> higher-priority patches in our local tree that we'd like to get on mainline,
>> and limited resources for working on it, so we need to pick our battles.
>> That's all.  :-)
>
> I think the local arrays patch makes sense, if it does not hurt
> performance. We had another recent case where increasing GCC's
> knowledge about the alignment of memory returned by malloc allowed
> additional vectorization opportunities, but hurt performance because
> of bad spilling choices by GCC RA.  This alignment patch may expose
> similar RA problems.  We may need to apply the patch with the
> optimization disabled until the RA spilling problem is fixed.
>
> Increasing the alignment of arrays within structs and unions would be
> nice, but that probably will change the ABI. I think that they best we
> may be able to do is increase the alignment if the array is the first
> element of the struct or union, see ROUND_TYPE_ALIGN for AIX.
> Although this might be more trouble than it is worth.

Maybe the idea was to increase the alignment of the struct (without
affecting it's layout) when that increases the alignment of a contained
array member.  Like for

struct { int i; int j; float a[1024]; int x; };

where aligning to sizeof (int) * 2 would get 'a' a bigger alignment.

Richard.

> Pat or Bill, can you test the performance of the array alignment patch?
>
> Thanks, David


Re: [Patch] Extend script ./contrib/download_prerequisites usage for isl and cloog

2013-05-22 Thread Paolo Bonzini
Il 18/05/2013 04:37, Chung-Ju Wu ha scritto:
> Hi all,
> 
> Using current trunk repository, it is now able to build
> compiler with in-tree isl and cloog.
> 
> This patch is to extend ./contrib/download_prerequisites
> usage to download isl and cloog conditionally in case
> people would like to build gcc with graphite loop optimizations.
> 
> OK for the trunk?
> 
> 
> contrib/ChangeLog
> 
> 2013-05-18  Chung-Ju Wu  
> 
> * download_prerequisites: Download isl and cloog conditionally.
> 
> 
> 
> Index: contrib/download_prerequisites
> ===
> --- contrib/download_prerequisites  (revision 199006)
> +++ contrib/download_prerequisites  (working copy)
> @@ -19,6 +19,12 @@
>  # You should have received a copy of the GNU General Public License
>  # along with this program. If not, see http://www.gnu.org/licenses/.
> 
> +# If you want to build GCC with the Graphite loop optimizations,
> +# set GRAPHITE_LOOP_OPT=yes to download optional prerequisties
> +# ISL Library and CLooG.
> +GRAPHITE_LOOP_OPT=no
> +
> +# Necessary to build GCC.
>  MPFR=mpfr-2.4.2
>  GMP=gmp-4.3.2
>  MPC=mpc-0.8.1
> @@ -36,3 +42,19 @@
>  ln -sf $MPC mpc || exit 1
> 
>  rm $MPFR.tar.bz2 $GMP.tar.bz2 $MPC.tar.gz || exit 1
> +
> +# Necessary to build GCC with the Graphite loop optimizations.
> +if [ "$GRAPHITE_LOOP_OPT" == "yes" ] ; then
> +  ISL=isl-0.11.1
> +  CLOOG=cloog-0.18.0
> +
> +  wget ftp://gcc.gnu.org/pub/gcc/infrastructure/$ISL.tar.bz2 || exit 1
> +  tar xjf $ISL.tar.bz2  || exit 1
> +  ln -sf $ISL isl || exit 1
> +
> +  wget ftp://gcc.gnu.org/pub/gcc/infrastructure/$CLOOG.tar.gz || exit 1
> +  tar xzf $CLOOG.tar.gz || exit 1
> +  ln -sf $CLOOG cloog || exit 1
> +
> +  rm $ISL.tar.bz2 $CLOOG.tar.gz || exit 1

I would leave the tar files in place.  Otherwise ok.

Paolo

> +fi
> 
> 
> Best regards,
> jasonwucj
> 



[patch] Default to --enable-libstdcxx-time=auto

2013-05-22 Thread Jonathan Wakely
This alters the configure script to enable C++11 thread library
features based on targets that are known to support the features,
rather than based on link tests which are disabled by default.  With
Glibc 2.17 this enables a nanosecond resolution std::system_clock in
the default configuration, yay!

I've tested this on two versions of Fedora and Debian, but would be
grateful for test results on Solaris, Cygwin and BSD targets, and for
cross-compilers to any of those targets.

* acinclude.m4 (GLIBCXX_ENABLE_LIBSTDCXX_TIME): Add KIND=auto to
enable features if target OS is known to support them.
* configure.ac (GLIBCXX_ENABLE_LIBSTDCXX_TIME): Default to 'auto'.
* configure: Regenerate.


Tested x86_64-linux, committed to trunk.
commit 47e011c33d78a00015265be39f2845209eeb97fc
Author: Jonathan Wakely 
Date:   Sun Feb 3 20:45:51 2013 +

* acinclude.m4 (GLIBCXX_ENABLE_LIBSTDCXX_TIME): Add KIND=auto to
enable features if target OS is known to support them.
* configure.ac (GLIBCXX_ENABLE_LIBSTDCXX_TIME): Default to 'auto'.
* configure: Regenerate.

diff --git a/libstdc++-v3/acinclude.m4 b/libstdc++-v3/acinclude.m4
index 619fff0..efeb6d4 100644
--- a/libstdc++-v3/acinclude.m4
+++ b/libstdc++-v3/acinclude.m4
@@ -871,7 +871,8 @@ dnl(FEATURE, DEFAULT, HELP-ARG, HELP-STRING)
 dnl(FEATURE, DEFAULT, HELP-ARG, HELP-STRING, permit a|b|c)
 dnl(FEATURE, DEFAULT, HELP-ARG, HELP-STRING, SHELL-CODE-HANDLER)
 dnl
-dnl See docs/html/17_intro/configury.html#enable for documentation.
+dnl See manual/appendix_porting.html#appendix.porting.build_hacking for
+dnl documentation.
 dnl
 m4_define([GLIBCXX_ENABLE],[dnl
 m4_define([_g_switch],[--enable-$1])dnl
@@ -1161,8 +1162,9 @@ dnlnanosleep and sched_yield in libc and 
libposix4 and, if needed,
 dnllinks in the latter.
 dnl --enable-libstdcxx-time=rt
 dnlalso searches (and, if needed, links) librt.  Note that this is
-dnlnot always desirable because, in glibc, for example, in turn it
-dnltriggers the linking of libpthread too, which activates locking,
+dnlnot always desirable because, in glibc 2.16 and earlier, for
+dnlexample, in turn it triggers the linking of libpthread too,
+dnlwhich activates locking,
 dnla large overhead for single-thread programs.
 dnl --enable-libstdcxx-time=no
 dnl --disable-libstdcxx-time
@@ -1175,8 +1177,7 @@ dnl os_defines.h and also defines 
_GLIBCXX_USE_SCHED_YIELD.
 dnl
 AC_DEFUN([GLIBCXX_ENABLE_LIBSTDCXX_TIME], [
 
-  AC_MSG_CHECKING([for clock_gettime, nanosleep and sched_yield])
-  GLIBCXX_ENABLE(libstdcxx-time,$1,[[[=KIND]]],
+  GLIBCXX_ENABLE(libstdcxx-time,auto,[[[=KIND]]],
 [use KIND for check type],
 [permit yes|no|rt])
 
@@ -1188,9 +1189,59 @@ AC_DEFUN([GLIBCXX_ENABLE_LIBSTDCXX_TIME], [
 
   ac_has_clock_monotonic=no
   ac_has_clock_realtime=no
-  AC_MSG_RESULT($enable_libstdcxx_time)
+  ac_has_nanosleep=no
+  ac_has_sched_yield=no
+
+  if test x"$enable_libstdcxx_time" = x"auto"; then
+
+case "${target_os}" in
+  cygwin*)
+ac_has_nanosleep=yes
+;;
+  darwin*)
+ac_has_nanosleep=yes
+ac_has_sched_yield=yes
+;;
+  gnu* | linux* | kfreebsd*-gnu | knetbsd*-gnu)
+AC_MSG_CHECKING([for at least GNU libc 2.17])
+AC_TRY_COMPILE(
+  [#include ],
+  [
+  #if ! __GLIBC_PREREQ(2, 17)
+  #error 
+  #endif
+  ],
+  [glibcxx_glibc217=yes], [glibcxx_glibc217=no])
+AC_MSG_RESULT($glibcxx_glibc217)
+
+if test x"$glibcxx_glibc217" = x"yes"; then
+  ac_has_clock_monotonic=yes
+  ac_has_clock_realtime=yes
+fi
+ac_has_nanosleep=yes
+ac_has_sched_yield=yes
+;;
+  freebsd*|netbsd*)
+ac_has_clock_monotonic=yes
+ac_has_clock_realtime=yes
+ac_has_nanosleep=yes
+ac_has_sched_yield=yes
+;;
+  openbsd*)
+ac_has_clock_monotonic=yes
+ac_has_clock_realtime=yes
+ac_has_nanosleep=yes
+;;
+  solaris*)
+GLIBCXX_LIBS="$GLIBCXX_LIBS -lrt"
+ac_has_clock_monotonic=yes
+ac_has_clock_realtime=yes
+ac_has_nanosleep=yes
+ac_has_sched_yield=yes
+;;
+esac
 
-  if test x"$enable_libstdcxx_time" != x"no"; then
+  elif test x"$enable_libstdcxx_time" != x"no"; then
 
 if test x"$enable_libstdcxx_time" = x"rt"; then
   AC_SEARCH_LIBS(clock_gettime, [rt posix4])
@@ -1214,19 +1265,16 @@ AC_DEFUN([GLIBCXX_ENABLE_LIBSTDCXX_TIME], [
 case "$ac_cv_search_sched_yield" in
   -lposix4*)
   GLIBCXX_LIBS="$GLIBCXX_LIBS $ac_cv_search_sched_yield"
-  AC_DEFINE(_GLIBCXX_USE_SCHED_YIELD, 1,
-   [ Defined if sched_yield is available. ])
+  ac_has_sched_yield=yes
   ;;
   -lrt*)
   if test x"$enable_libstdcxx_time" = x"rt"; then
GLIBCXX_LIBS="$GLIBCXX_LIBS $ac_cv_search_sched_

Re: [patch] Preserve the CFG until after final

2013-05-22 Thread Eric Botcazou
> That is only partially true. Currently the transition is already de
> facto going on: Just look at how many back ends use
> compute_bb_for_insn to re-initialize the BLOCK_FOR_INSN pointers right
> after pass_free_cfg (it's usually the first thing they do in the
> machine-reorg pass).

Yes, and we should do something about it.  Btw, why is the line removed from 
ia64_reorg in the patch (and not mentioned in the ChangeLog)?

> Some ports never call free_bb_or_insn after that,
> and expect BLOCK_FOR_INSN to be valid in 'final'. One of those ports
> is i386, look at where BLOCK_FOR_INSN is used in i386.c (in functions
> deciding what asm output templates to use).

AFAICS it's only used for splitters.  And using BLOCK_FOR_INSN after the last 
split pass (pass_split_for_shorten_branches) is dubious.  Here's the list:

./frv/frv.c:   || BLOCK_FOR_INSN (insn) == ce_info->else_bb

Only used during if-conversion.

./rs6000/rs6000.c:  bb = BLOCK_FOR_INSN (label);

Only used during compute_alignments.

./spu/spu.c:  bitmap_set_bit (blocks, BLOCK_FOR_INSN (branch)->index);

Only used during md_reorg.

./c6x/c6x.c:  BLOCK_FOR_INSN (bundle) = BLOCK_FOR_INSN (slot[0]);

Only used during md_reorg.

./mips/mips.c:  basic_block bb = BLOCK_FOR_INSN (insn);
./mips/mips.c:  /* Restore the BLOCK_FOR_INSN pointers, which are needed by 
DF.  Also during

Only used for splitting decision, deal with null BLOCK_FOR_INSN.

./i386/i386.c:  basic_block bb = start ? BLOCK_FOR_INSN (start) : NULL;
./i386/i386.c:  basic_block bb = BLOCK_FOR_INSN (insn);
./i386/i386.c:  basic_block bb = start ? BLOCK_FOR_INSN (start) : NULL;
./i386/i386.c:  basic_block bb = BLOCK_FOR_INSN (insn);
./i386/i386.c:  basic_block bb = BLOCK_FOR_INSN (insn);
./i386/i386.c:  rtx start = BB_HEAD (BLOCK_FOR_INSN (insn));
./i386/i386.c:  basic_block bb =  BLOCK_FOR_INSN (insn);

Only used for splitting decision (and scheduling).

> Also, right now I'm stuck with a chicken-and-egg problem: dbr_schedule
> is not CFG-aware, but my still slowly progressing work on a
> replacement can't have a CFG because it has to run after machine-reorg
> passes, and therefore after pass_free_cfg.

That's why I'm suggesting to get rid of or modify pass_free_cfg so that the 
CFG is still usable up to the point where it is really destroyed.

> > I think that an incremental step would be to allow the machine reorg
> > pass to use the CFG, even if it doesn't maintain it.
> 
> That is the current state of things already.

No, as you noted earlier, because of pass_free_cfg.

> But I need this exactly for that reason: To remove that blocker! :-)

I don't think so.  The goal for now shouldn't be to "preserve the CFG until 
after final" since we know it isn't feasible in the short term, but rather to 
preserve the CFG as long as possible.

So what about doing the following instead:
 1. Remove pass_free_cfg from the pipeline,
 2. Remove compute_bb_for_insn from all the md_reorg passes that call it,
 3. (Optionally) Do pass_free_cfg from all the md_reorg passes that don't,
 4. Do pass_free_cfg at the beginning of pass_delay_slots,
 5. Do pass_free_cfg at the end of pass_split_for_shorten_branches,

-- 
Eric Botcazou


Re: [fortran, doc] Improve random_seed example

2013-05-22 Thread Janne Blomqvist
On Tue, May 21, 2013 at 2:50 AM, Steve Kargl
 wrote:
> On Fri, May 17, 2013 at 03:18:01PM +0300, Janne Blomqvist wrote:
>> Hi,
>>
>> the example we provide for the usage of the random_seed intrinsic
>> could be better. At least one user has already been tripped over by
>> the fact that on some targets the first call to system_clock returns
>> 0, resulting in a poor seed. Below is an improved(?!) example program.
>>
>> Ok for trunk/4.8/4.7? (Changelog + patch is of course trivial once
>> there is agreement on the code itself)
>>
>
> Looks OK to me.
>
> It should probably also be noted in the manual that one can get
> real poor results if one does not set all seed values.

Thanks. Attached is the patch I committed following 'make pdf'
testing. Committed to trunk/4.8 (after Jakub's approval on IRC)/4.7.


--
Janne Blomqvist


rseed.diff
Description: Binary data


Re: [C++ Patch] More accurate location for conditional expressions

2013-05-22 Thread Paolo Carlini


Andreas Schwab  ha scritto:

>Paolo Carlini  writes:
>
>>  * call.c (build_conditional_expr_1): Add location_t parameter.
>>  (build_conditional_expr): Likewise.

Argh, I'll fix it momentarily sorry. I admit I forgot to enable objc and 
obj-c++ when testing.

Paolo


Re: RFA: Use gen_int_mode in plus_constant

2013-05-22 Thread Richard Sandiford
Andreas Krebbel  writes:
> On 21/05/13 11:26, Richard Sandiford wrote:
>> gcc/
>>  * recog.c (offsettable_address_addr_space_p): Fix calculation of
>>  address mode.  Move pointer mode initialization to the same place.
>
> Thanks! This fixed the failure (and others). Bootstrapped on s390x - no
> regressions:

Thanks for testing.  Also now bootstrapped & regression-tested on
x86_64-linux-gnu.  OK to install?  OK for 4.8 once 4.8.1 is released?

Thanks,
Richard



Re: RFA: fix rtl-optimization/56833

2013-05-22 Thread Eric Botcazou
> The problem was that we had some optimzations added to the
> reload_cse_move2add pass that would attempt transformations with
> multi-hard-register registers, without keeping track of the validity of the
> values in all hard registers involved.

That's not clear to me: for example in move2add_note_store, we reset the info 
about any multi hard-register; moveover 5 arrays are supposed to be maintained 
for each hard-register:

  for (i = FIRST_PSEUDO_REGISTER - 1; i >= 0; i--)
{
  reg_set_luid[i] = 0;
  reg_offset[i] = 0;
  reg_base_reg[i] = 0;
  reg_symbol_ref[i] = NULL_RTX;
  reg_mode[i] = VOIDmode;
}

so I'm not sure whether we really properly handle multi hard-registers.  So 
what about explicitly punting for multi hard-registers in reload_cse_move2add?
Do you think that this would pessimize too much in practice?

-- 
Eric Botcazou


Re: RFA: Use gen_int_mode in plus_constant

2013-05-22 Thread Richard Biener
On Wed, May 22, 2013 at 11:02 AM, Richard Sandiford
 wrote:
> Andreas Krebbel  writes:
>> On 21/05/13 11:26, Richard Sandiford wrote:
>>> gcc/
>>>  * recog.c (offsettable_address_addr_space_p): Fix calculation of
>>>  address mode.  Move pointer mode initialization to the same place.
>>
>> Thanks! This fixed the failure (and others). Bootstrapped on s390x - no
>> regressions:
>
> Thanks for testing.  Also now bootstrapped & regression-tested on
> x86_64-linux-gnu.  OK to install?  OK for 4.8 once 4.8.1 is released?

Ok.

Thanks,
Richard.

> Thanks,
> Richard
>


Re: [patch] install host specific {bits,ext}/opt_random.h headers in host specific c++ incdir

2013-05-22 Thread Paolo Carlini

On 05/21/2013 10:25 AM, Matthias Klose wrote:

Am 19.05.2013 11:40, schrieb Paolo Carlini:

On 05/19/2013 11:35 AM, Andreas Schwab wrote:

Tests that now fail, but worked before:

Thanks Andreas. Matthias, please revert ASAP, thanks.

you already did that.

Looks like ext/random includes opt_random.h, which is not on any include dir, so
make it ext/opt_random.h.  tests all pass, and check the include with an
installed version too.
Ok, if nobody has further comments over the next day or so, let's try 
again ;)


Thanks,
Paolo.


Re: [PATCH] Add explicit default constructors where required by the standard

2013-05-22 Thread Evgeniy Stepanov
OK to merge to google/4_7 and google/4_8?

On Wed, May 15, 2013 at 12:28 PM, Evgeniy Stepanov  wrote:
> Thanks!
>
> On Wed, May 15, 2013 at 3:05 AM, Jonathan Wakely  
> wrote:
>> On 14 May 2013 10:56, Jonathan Wakely wrote:
>>> On 14 May 2013 10:45, Evgeniy Stepanov wrote:
 This must have fallen through the cracks.
>>>
>>> It's still in my Git branch at home. I've been too busy to push any
>>> commits recently, but I haven't forgotten it.
>>>
>>>
 I realized we also need it in the 4_7 branch. Could you backport the
 change there, too, if it is not too much trouble?
>>>
>>> Yes, for such a small, safe change I'm happy to apply it to the 4.7 branch 
>>> too.
>>
>> Done.


Re: [patch] Default to --enable-libstdcxx-time=auto

2013-05-22 Thread Paolo Carlini

Hi Jon,

On 05/22/2013 10:14 AM, Jonathan Wakely wrote:

This alters the configure script to enable C++11 thread library
features based on targets that are known to support the features,
rather than based on link tests which are disabled by default.  With
Glibc 2.17 this enables a nanosecond resolution std::system_clock in
the default configuration, yay!

I've tested this on two versions of Fedora and Debian, but would be
grateful for test results on Solaris, Cygwin and BSD targets, and for
cross-compilers to any of those targets.

 * acinclude.m4 (GLIBCXX_ENABLE_LIBSTDCXX_TIME): Add KIND=auto to
 enable features if target OS is known to support them.
 * configure.ac (GLIBCXX_ENABLE_LIBSTDCXX_TIME): Default to 'auto'.
 * configure: Regenerate.

Tested x86_64-linux, committed to trunk.

Thanks a lot!

Could you please double check abi_check? On this glibc 2.17 machine I'm 
seeing:


1 incompatible symbols
0
_ZNSt6chrono12steady_clock3nowEv
std::chrono::steady_clock::now()
version status: incompatible
GLIBCXX_3.4.17
type: function
status: added

I'm going to try again, just in case I goofed something.

Thanks!
Paolo.



[PATCH] Fix PR57349

2013-05-22 Thread Richard Biener

The following fixes FDO on setjmp using functions.  We may not
blindly split blocks before setjmp receivers as that disconnects
the PHIs and will end up creating a bogus CFG when IPA inlining
then creates new abnormal edges to the new setjmp block - we're
going to have a hard time coalescing SSA names that way.

Fixed by following what the SJLJ machinery does - not split blocks
before ECF_RETURNS_TWICE.

Bootstrap / regtest running on x86_64-unknown-linux-gnu
(253.perlbmk with FDO passed with this patch).

Richard.

2013-05-22  Richard Biener  

PR middle-end/57349
* profile.c (branch_prob): Do not split blocks that are
abnormally receiving from ECF_RETURNS_TWICE functions.

Index: gcc/profile.c
===
*** gcc/profile.c   (revision 199138)
--- gcc/profile.c   (working copy)
*** branch_prob (void)
*** 1085,1102 
 or __builtin_setjmp_dispatcher calls.  These are very
 special and don't expect anything to be inserted before
 them.  */
! if (!is_gimple_call (first)
! || (fndecl = gimple_call_fndecl (first)) == NULL
! || DECL_BUILT_IN_CLASS (fndecl) != BUILT_IN_NORMAL
! || (DECL_FUNCTION_CODE (fndecl) != BUILT_IN_SETJMP_RECEIVER
! && (DECL_FUNCTION_CODE (fndecl)
! != BUILT_IN_SETJMP_DISPATCHER)))
!   {
! if (dump_file)
!   fprintf (dump_file, "Splitting bb %i after labels\n",
!bb->index);
! split_block_after_labels (bb);
!   }
}
}
  }
--- 1085,1104 
 or __builtin_setjmp_dispatcher calls.  These are very
 special and don't expect anything to be inserted before
 them.  */
! if (is_gimple_call (first)
! && (((fndecl = gimple_call_fndecl (first)) != NULL
!  && DECL_BUILT_IN_CLASS (fndecl) == BUILT_IN_NORMAL
!  && (DECL_FUNCTION_CODE (fndecl)
!  == BUILT_IN_SETJMP_RECEIVER
!  || (DECL_FUNCTION_CODE (fndecl)
!  == BUILT_IN_SETJMP_DISPATCHER)))
! || gimple_call_flags (first) & ECF_RETURNS_TWICE))
!   continue;
! 
! if (dump_file)
!   fprintf (dump_file, "Splitting bb %i after labels\n",
!bb->index);
! split_block_after_labels (bb);
}
}
  }


RE: [PATCH][gensupport] Add optional attributes field to define_cond_exec

2013-05-22 Thread Kyrylo Tkachov
> > We have a new define_subst which may help here.  I *think* that
> > define_cond_exec is (or is nearly) a subset of define_subst.  On my
> > medium term
> > to-do list is to support define_cond_exec within gensupport via the
> > define_subst infrastructure, removing everything except the actual
> > parsing of
> > define_cond_exec.
> 
> An clearer example of what I'm trying to achieve can be found here:
> http://gcc.gnu.org/ml/gcc-patches/2013-05/msg01139.html
> 
> I've been trying to wrap my head around define_subst, and it was
> considered
> initially, but from the online documentation
> (http://gcc.gnu.org/onlinedocs/gccint/Define-Subst.html#Define-Subst)
> I couldn't think of a way to use it.
> 
> What I want essentially, is to disable the cond_exec variant of an insn
> based on some variable.
> This stems from the fact that the "predicable" attribute cannot depend
> on
> anything else
> (has to be a compile-compile time constant).
> 
> >
> > Your "predicated" attribute could be mapped to a define_subst
> > substitution.
> > You could expose this via a true attribute, but I'm not 100% certain
> > that you'd
> > need to do so.
> >
> > I havn't thought about all of the ramifications here, but I'd be
> > interested to
> > hear about your thoughts or experiments down this line.

So, thinking about it a bit more...
I can see how define_subst is a generalised version of define_cond_exec,
but what I need in my case is more fine-grained control over the
alternatives.
I'd like to say, for example, that the predicated version of the 3rd
alternative
is disabled for, say, TARGET_RESTRICT_CE. I don't think define_subst would
allow me
to do this in its current form.

From what I understand, using define_subst would mean creating a
define_subst
for every pattern that can be "predicable"? There are at least 600
predicable patterns
in the arm backend, so that would be infeasible.

Kyrill







Unreviewed libffi patch

2013-05-22 Thread Rainer Orth
The following libffi patch

[libffi] Fix several libffi testsuite failures on 64-bit SPARC and 
PowerPC (PR libffi/56033)
http://gcc.gnu.org/ml/gcc-patches/2013-05/msg00840.html

has remained unreviewed for a week.  It needs a libffi maintainer, which
I suppose Anthony still is, despite not being listed in MAINTAINERS.

Thanks.
Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


Re: [patch] Default to --enable-libstdcxx-time=auto

2013-05-22 Thread Jonathan Wakely
On 22 May 2013 10:47, Paolo Carlini wrote:
>
> Could you please double check abi_check? On this glibc 2.17 machine I'm
> seeing:
>
> 1 incompatible symbols
> 0
> _ZNSt6chrono12steady_clock3nowEv
> std::chrono::steady_clock::now()
> version status: incompatible
> GLIBCXX_3.4.17
> type: function
> status: added
>
> I'm going to try again, just in case I goofed something.

No, I don't think you goofed, I see it too on glibc 2.17, sorry about that.

How should we handle that export?  It is available on older systems
without glibc 2.17 and with older GCC versions if you configure with
--enable-libstdcxx-time=rt, so it wouldn't be correct to just add it
to the GLIBCXX_3.4.19 version.


Re: [patch] Default to --enable-libstdcxx-time=auto

2013-05-22 Thread Paolo Carlini

Hi,

On 05/22/2013 11:59 AM, Jonathan Wakely wrote:

On 22 May 2013 10:47, Paolo Carlini wrote:

Could you please double check abi_check? On this glibc 2.17 machine I'm
seeing:

1 incompatible symbols
0
_ZNSt6chrono12steady_clock3nowEv
std::chrono::steady_clock::now()
version status: incompatible
GLIBCXX_3.4.17
type: function
status: added

I'm going to try again, just in case I goofed something.

No, I don't think you goofed, I see it too on glibc 2.17, sorry about that.

How should we handle that export?  It is available on older systems
without glibc 2.17 and with older GCC versions if you configure with
--enable-libstdcxx-time=rt,
I don't think it could be available on older systems without something 
like --enable-libstdcxx-time=rt on the command line. That is, without a 
non-default configure switch. Anyway, indeed it could be available, 
depending on the configuration and we wanted it to be exported, in case. 
I think there isn't a risk of real, substantive abi-breakage here, the 
issue is largely about the way we are testing abi_check. I bet Jakub has 
ideas about the best way to approach this, like with the alias 
mechanism?!? Essentially we want to arrange things in such a way that 
this specific symbol is fine to be optionally exported at *any* version 
>= GLIBCXX_3.4.17 (thus, no abi_check error), depending on the 
configure switch.


Paolo.


Re: [patch] Default to --enable-libstdcxx-time=auto

2013-05-22 Thread Paolo Carlini
.. an idea I have got: we could arrange for defining the symbol 
unconditionally, even when isn't really meaningful in that clock 
configuration, and therefore exporting it uncondtionally, a dummy 
implementation in many configuration. We also have at the same time to 
regenerate the baselines to include it, not a big deal. Maybe Rainer can 
also help... he pays a lot of attention to this kind of issue.


Paolo.


Re: rtl expansion without zero/sign extension based on VRP

2013-05-22 Thread Richard Biener
On Fri, May 17, 2013 at 1:40 PM, Kugan
 wrote:
> On 13/05/13 17:47, Richard Biener wrote:
>>
>> On Mon, May 13, 2013 at 5:45 AM, Kugan
>>  wrote:
>>>
>>> Hi,
>>>
>>> This patch removes some of the redundant sign/zero
>>> extensions using value ranges informations generated by VRP.
>>>
>>> When GIMPLE_ASSIGN stmts with LHS type smaller than word is expanded to
>>> rtl, if we can prove that RHS expression value can always fit in LHS
>>> type and there is no sign conversion, truncation and extension to fit
>>> the type is redundant. Subreg and Zero/sign extensions are therefore
>>> redundant.
>>>
>>> For example, when an expression is evaluated and it's value is assigned
>>> to variable of type short, the generated rtl would look something like
>>> the following.
>>>
>>> (set (reg:SI 110)
>>>(zero_extend:SI (subreg:HI (reg:SI 117) 0)))
>>>
>>> However, if during value range propagation, if we can say for certain
>>> that the value of the expression which is present in register 117 is
>>> within the limits of short and there is no sign conversion, we don’t
>>> need to perform the subreg and zero_extend; instead we can generate the
>>> following rtl.
>>>
>>> (set (reg:SI 110)
>>>(reg:SI 117)))
>>>
>>> Attached patch adds value range information to gimple stmts during value
>>> range propagation pass and expands rtl without subreg and zero/sign
>>> extension if the value range is within the limits of it's type.
>>>
>>> This change improve the geomean of spec2k int benchmark with ref by about
>>> ~3.5% on an arm chromebook.
>>>
>>> Tested  on X86_64 and ARM.
>>>
>>> I would like review comments on this.
>>
>>
>> A few comments on the way you preserve this information.
>>
>
> Thanks Richard for reviewing it.
>
>
>> --- a/gcc/gimple.h
>> +++ b/gcc/gimple.h
>> @@ -191,6 +191,11 @@ struct GTY((chain_next ("%h.next")))
>> gimple_statement_base {
>>in there.  */
>> unsigned int subcode : 16;
>>
>> +  /* if an assignment gimple statement has RHS expression that can fit
>> + LHS type, zero/sign extension to truncate is redundant.
>> + Set this if we detect extension as redundant during VRP.  */
>> +  unsigned sign_zero_ext_redundant : 1;
>> +
>>
>> this enlarges all gimple statements by 8 bytes, so it's out of the
>> question.
>>
>> My original plan to preserve value-range information was to re-use
>> SSA_NAME_PTR_INFO which is where we already store value-range
>> information for pointers:
>>
>> struct GTY(()) ptr_info_def
>> {
>>/* The points-to solution.  */
>>struct pt_solution pt;
>>
>>/* Alignment and misalignment of the pointer in bytes.  Together
>>   align and misalign specify low known bits of the pointer.
>>   ptr & (align - 1) == misalign.  */
>>
>>/* When known, this is the power-of-two byte alignment of the object
>> this
>>   pointer points into.  This is usually DECL_ALIGN_UNIT for decls and
>>   MALLOC_ABI_ALIGNMENT for allocated storage.  When the alignment is
>> not
>>   known, it is zero.  Do not access directly but use functions
>>   get_ptr_info_alignment, set_ptr_info_alignment,
>>   mark_ptr_info_alignment_unknown and similar.  */
>>unsigned int align;
>>
>>/* When alignment is known, the byte offset this pointer differs from
>> the
>>   above alignment.  Access only through the same helper functions as
>> align
>>   above.  */
>>unsigned int misalign;
>> };
>>
>> what you'd do is to make the ptr_info_def reference from tree_ssa_name a
>> reference to a union of the above (for pointer SSA names) and an alternate
>> info like
>>
>> struct range_info_def {
>>double_int min;
>>double_int max;
>> };
>>
>
> I have now changed the way I am preserving this information as you have
> suggested.
>
>
>> or a more compressed format (a reference to a mode or type it fits for
>> example).
>>
>> The very specific case in question also points at the fact that we have
>> two conversion tree codes, NOP_EXPR and CONVERT_EXPR, and we've
>> tried to unify them (but didn't finish up that task)...
>>
>
> We can also remove sign/zero extension in cases other than  NOP_EXPR and
> CONVERT_EXPR,  if the value range of the RHS expressions fits LHS type
> without sign change.
>
> For example, please look at the following gimple stmt and the resulting rtl:
>
> ;; c_3 = c_2(D) & 15;
>
> ;; Without the patch
>   (insn 6 5 7 (set (reg:SI 116)
>   (and:SI (reg/v:SI 115 [ c ])
>   (const_int 15 [0xf]))) c5.c:4 -1
>(nil))
>
>   (insn 7 6 0 (set (reg/v:SI 111 [ c ])
>   (zero_extend:SI (subreg:QI (reg:SI 116) 0))) c5.c:4 -1
>(nil))
>
> ;; with the patch
>   (insn 6 5 7 (set (reg:SI 116)
>   (and:SI (reg/v:SI 115 [ c ])
>   (const_int 15 [0xf]))) c5.c:4 -1
>(nil))
>
>   (insn 7 6 0 (set (reg/v:SI 111 [ c ])
>   (reg:SI 116)) c5.c:4 -1
>(nil))
>
>
>
>> +static void
>> +process_stmt_for_ext_elimination ()
>> +{
>>
>> we alread

Re: [patch] Default to --enable-libstdcxx-time=auto

2013-05-22 Thread Paolo Carlini
.. put an abort or something in the dummy implementations, to make sure 
people don't try to run an executable built with headers which have the 
clock available linked at run time to a .so which doesn't, really. 
Suboptimal solution because ideally we would like to fail at build time, 
but this is the best I have got so far. Be very clear in the docs, anyway.


Paolo.


[ARM] fix PR debug/57351 ICE: internal compiler error: in dbx_reg_number,

2013-05-22 Thread Christian Bruel
Hello,

arm_dwarf_register_span converts regno to a dbx register number while
building the PARALLEL rtx. But since
http://gcc.gnu.org/ml/gcc-patches/2013-05/msg01131.html this information
is centralized in DBX_REGISTER_NUMBER that will be called when
processing the operands in reg_loc_descriptor, so the DBX conversion
information doesn't need to be duplicated.

Build and regtested with a arm-none-eabi newlib toolset configured with
--with-fpu=neon-vfpv4 --with-float=hard --with-arch=armv7-a

OK for trunk ?

Thanks

Christian

2013-05-22  Christian Bruel  

	PR debug/57351
	* config/arm/arm.c (arm_dwarf_register_span): Do not use dbx number.

2013-05-22  Christian Bruel  

	PR debug/57351
	* gcc.dg/debug/pr57351.c: New test

Index: gcc/config/arm/arm.c
===
--- gcc/config/arm/arm.c	(revision 199179)
+++ gcc/config/arm/arm.c	(working copy)
@@ -25861,9 +25861,8 @@ arm_dwarf_register_span (rtx rtl)
 
   nregs = GET_MODE_SIZE (GET_MODE (rtl)) / 8;
   p = gen_rtx_PARALLEL (VOIDmode, rtvec_alloc (nregs));
-  regno = (regno - FIRST_VFP_REGNUM) / 2;
   for (i = 0; i < nregs; i++)
-XVECEXP (p, 0, i) = gen_rtx_REG (DImode, 256 + regno + i);
+XVECEXP (p, 0, i) = gen_rtx_REG (DImode, regno + i);
 
   return p;
 }

Index: gcc/testsuite/gcc.dg/debug/pr57351.c
===
--- gcc/testsuite/gcc.dg/debug/pr57351.c	(revision 0)
+++ gcc/testsuite/gcc.dg/debug/pr57351.c	(revision 0)
@@ -0,0 +1,53 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target arm_neon }  */
+/* { dg-options "-std=c99 -Os -g -march=armv7-a -mfpu=neon-vfpv4  -mfloat-abi=hard" { target { arm-*-* } } } */
+
+typedef unsigned int size_t;
+typedef int ptrdiff_t;
+typedef signed char int8_t ;
+typedef signed long long int64_t;
+typedef int8_t GFC_INTEGER_1;
+typedef GFC_INTEGER_1 GFC_LOGICAL_1;
+typedef int64_t GFC_INTEGER_8;
+typedef GFC_INTEGER_8 GFC_LOGICAL_8;
+typedef ptrdiff_t index_type;
+typedef struct descriptor_dimension
+{
+  index_type lower_bound;
+  index_type _ubound;
+}
+descriptor_dimension;
+typedef struct { GFC_LOGICAL_1 *base_addr; size_t offset; index_type dtype; descriptor_dimension dim[7];} gfc_array_l1;
+typedef struct { GFC_LOGICAL_8 *base_addr; size_t offset; index_type dtype; descriptor_dimension dim[7];} gfc_array_l8;
+void
+all_l8 (gfc_array_l8 * const restrict retarray,
+ gfc_array_l1 * const restrict array,
+ const index_type * const restrict pdim)
+{
+  GFC_LOGICAL_8 * restrict dest;
+  index_type n;
+  index_type len;
+  index_type delta;
+  index_type dim;
+  dim = (*pdim) - 1;
+  len = ((array)->dim[dim]._ubound + 1 - (array)->dim[dim].lower_bound);
+  for (n = 0; n < dim; n++)
+{
+  const GFC_LOGICAL_1 * restrict src;
+  GFC_LOGICAL_8 result;
+  {
+  result = 1;
+   {
+ for (n = 0; n < len; n++, src += delta)
+   {
+  if (! *src)
+{
+  result = 0;
+  break;
+}
+  }
+ *dest = result;
+   }
+  }
+}
+}


[SPARC] Small tweaks to config/sparc/sol2-unwind.h

2013-05-22 Thread Eric Botcazou
This silences a maybe-uninitialized warning, removes an obsolete comment and 
changes the CFA offset of individual registers to comprise the global CFA 
offset, thus making it possible to remove the special-casing of signal frames 
from sparc64_frob_update_context.

Tested on SPARC/Solaris, applied on the mainline.


2013-05-22  Eric Botcazou  

* config/sparc/sol2-unwind.h (sparc64_frob_update_context): Do it for
signal frames as well.
(MD_FALLBACK_FRAME_STATE_FOR): Do minor cleanups throughout and add the
STACK_BIAS to the CFA offset.


-- 
Eric BotcazouIndex: libgcc/config/sparc/sol2-unwind.h
===
--- libgcc/config/sparc/sol2-unwind.h	(revision 199091)
+++ libgcc/config/sparc/sol2-unwind.h	(working copy)
@@ -155,12 +155,10 @@ sparc64_frob_update_context (struct _Unw
 {
   /* The column of %sp contains the old CFA, not the old value of %sp.
  The CFA offset already comprises the stack bias so, when %sp is the
- CFA register, we must avoid counting the stack bias twice.  Do not
- do that for signal frames as the offset is artificial for them.  */
+ CFA register, we must avoid counting the stack bias twice.  */
   if (fs->regs.cfa_reg == __builtin_dwarf_sp_column ()
   && fs->regs.cfa_how == CFA_REG_OFFSET
-  && fs->regs.cfa_offset != 0
-  && !fs->signal_frame)
+  && fs->regs.cfa_offset != 0)
 {
   long i;
 
@@ -296,9 +294,8 @@ MD_FALLBACK_FRAME_STATE_FOR (struct _Unw
 			 _Unwind_FrameState *fs)
 {
   void *pc = context->ra;
-  struct frame *fp = (struct frame *) context->cfa;
-  int nframes;
   void *this_cfa = context->cfa;
+  int nframes = 0;
   long new_cfa;
   void *ra_location, *shifted_ra_location;
   mcontext_t *mctx;
@@ -318,21 +315,22 @@ MD_FALLBACK_FRAME_STATE_FOR (struct _Unw
   return _URC_NO_REASON;
 }
 
+  /* Do some pattern matching at the return address.  */
   if (IS_SIGHANDLER (pc, this_cfa, &nframes))
 {
+  struct frame *fp = (struct frame *) this_cfa;
   struct handler_args {
 	struct frame frwin;
 	ucontext_t ucontext;
   } *handler_args;
   ucontext_t *ucp;
 
-  /* context->cfa points into the frame after the saved frame pointer and
+  /* this_cfa points into the frame after the saved frame pointer and
  saved pc (struct frame).
 
  The ucontext_t structure is in the kernel frame after a struct
  frame.  Since the frame sizes vary even within OS releases, we
  need to walk the stack to get there.  */
-
   for (i = 0; i < nframes; i++)
 	fp = (struct frame *) ((char *)fp->fr_savfp + STACK_BIAS);
 
@@ -340,19 +338,15 @@ MD_FALLBACK_FRAME_STATE_FOR (struct _Unw
   ucp = &handler_args->ucontext;
   mctx = &ucp->uc_mcontext;
 }
-
-  /* Exit if the pattern at the return address does not match the
- previous three patterns.  */
   else
 return _URC_END_OF_STACK;
 
-  new_cfa = mctx->gregs[REG_SP];
   /* The frame address is %sp + STACK_BIAS in 64-bit mode.  */
-  new_cfa += STACK_BIAS;
+  new_cfa = mctx->gregs[REG_SP] + STACK_BIAS;
 
   fs->regs.cfa_how = CFA_REG_OFFSET;
   fs->regs.cfa_reg = __builtin_dwarf_sp_column ();
-  fs->regs.cfa_offset = new_cfa - (long) this_cfa;
+  fs->regs.cfa_offset = new_cfa - (long) this_cfa + STACK_BIAS;
 
   /* Restore global and out registers (in this order) from the
  ucontext_t structure, uc_mcontext.gregs field.  */
@@ -372,7 +366,7 @@ MD_FALLBACK_FRAME_STATE_FOR (struct _Unw
   for (i = 0; i < 16; i++)
 {
   fs->regs.reg[i + 16].how = REG_SAVED_OFFSET;
-  fs->regs.reg[i + 16].loc.offset = i*sizeof(long);
+  fs->regs.reg[i + 16].loc.offset = i * sizeof(long);
 }
 
   /* Check whether we need to restore FPU registers.  */

Re: [patch] Default to --enable-libstdcxx-time=auto

2013-05-22 Thread Jonathan Wakely
On 22 May 2013 11:30, Paolo Carlini wrote:
> .. put an abort or something in the dummy implementations, to make sure
> people don't try to run an executable built with headers which have the
> clock available linked at run time to a .so which doesn't, really.
> Suboptimal solution because ideally we would like to fail at build time, but
> this is the best I have got so far. Be very clear in the docs, anyway.

Based on feedback for  users much prefer to have something
undefined than to have it defined but non-functional or causing an
abort.


Re: [patch] Default to --enable-libstdcxx-time=auto

2013-05-22 Thread Jonathan Wakely
On 22 May 2013 11:40, Jonathan Wakely wrote:
> On 22 May 2013 11:30, Paolo Carlini wrote:
>> .. put an abort or something in the dummy implementations, to make sure
>> people don't try to run an executable built with headers which have the
>> clock available linked at run time to a .so which doesn't, really.
>> Suboptimal solution because ideally we would like to fail at build time, but
>> this is the best I have got so far. Be very clear in the docs, anyway.
>
> Based on feedback for  users much prefer to have something
> undefined than to have it defined but non-functional or causing an
> abort.

Could leave the steady_clock::now() symbol "undesignated", like the
__once_call and __once_callable symbols that are only present when
_GLIBCXX_HAVE_TLS is defined?


Re: RFA: fix rtl-optimization/56833

2013-05-22 Thread Joern Rennecke

Quoting Eric Botcazou :


The problem was that we had some optimzations added to the
reload_cse_move2add pass that would attempt transformations with
multi-hard-register registers, without keeping track of the validity of the
values in all hard registers involved.


That's not clear to me: for example in move2add_note_store, we reset the info
about any multi hard-register; moveover 5 arrays are supposed to be   
maintained

for each hard-register:

  for (i = FIRST_PSEUDO_REGISTER - 1; i >= 0; i--)
{
  reg_set_luid[i] = 0;
  reg_offset[i] = 0;
  reg_base_reg[i] = 0;
  reg_symbol_ref[i] = NULL_RTX;
  reg_mode[i] = VOIDmode;
}
so I'm not sure whether we really properly handle multi hard-registers.


Initializing all the values is nice to have defined contents, but
the values in reg_offset[i] / reg_base_reg[i] / reg_symbol_ref[i]
just tell what's in these registers if they are valid, not if they
are valid.
But I can see that there could be a problem with an earlier value
that used to be valid in a multi-hard-register sub register to be
considered to be still valid.
Setting the mode of every constituent register but the first one
(which has the new value recorded) to VOIDmode at the same time
as updating reg_set_luid should be sufficent to address this issue.


So
what about explicitly punting for multi hard-registers in   
reload_cse_move2add?

Do you think that this would pessimize too much in practice?


The pass was originally written with word_mode == Pmode targets like
the SH in mind, where multi-hard-register values are uninteresting.

But for targets like the avr, most or all of the interesting values
will be in multi-hard-register registers.
2013-05-22  Joern Rennecke 

PR rtl-optimization/56833
* postreload.c (move2add_use_add2_insn): Update reg_set_luid
and reg_mode for all affected hard registers.
(move2add_use_add3_insn): Likewise.  Check that all source hard
regs have been set by the same set.
(reload_cse_move2add): Before concluding that we have a pre-existing
value, check that all destination hard registers have been set by the
same set.

Index: postreload.c
===
--- postreload.c(revision 199190)
+++ postreload.c(working copy)
@@ -1749,7 +1749,13 @@ move2add_use_add2_insn (rtx reg, rtx sym
}
}
 }
-  reg_set_luid[regno] = move2add_luid;
+  int i = hard_regno_nregs[regno][GET_MODE (reg)] -1;
+  do
+{
+  reg_set_luid[regno + i] = move2add_luid;
+  reg_mode[regno + i] = VOIDmode;
+}
+  while (--i >= 0);
   reg_base_reg[regno] = -1;
   reg_mode[regno] = GET_MODE (reg);
   reg_symbol_ref[regno] = sym;
@@ -1793,6 +1799,14 @@ move2add_use_add3_insn (rtx reg, rtx sym
&& reg_symbol_ref[i] != NULL_RTX
&& rtx_equal_p (sym, reg_symbol_ref[i]))
   {
+   int j;
+
+   for (j = hard_regno_nregs[i][reg_mode[i]] - 1;
+j > 0 && reg_set_luid[i+j] == reg_set_luid[i];)
+ j--;
+   if (j != 0)
+ continue;
+
rtx new_src = gen_int_mode (INTVAL (off) - reg_offset[i],
GET_MODE (reg));
/* (set (reg) (plus (reg) (const_int 0))) is not canonical;
@@ -1835,7 +1849,13 @@ move2add_use_add3_insn (rtx reg, rtx sym
   if (validate_change (insn, &SET_SRC (pat), tem, 0))
changed = true;
 }
-  reg_set_luid[regno] = move2add_luid;
+  i = hard_regno_nregs[regno][GET_MODE (reg)] -1;
+  do
+{
+  reg_set_luid[regno + i] = move2add_luid;
+  reg_mode[regno + i] = VOIDmode;
+}
+  while (--i >= 0);
   reg_base_reg[regno] = -1;
   reg_mode[regno] = GET_MODE (reg);
   reg_symbol_ref[regno] = sym;
@@ -1890,7 +1910,10 @@ reload_cse_move2add (rtx first)
 
  /* Check if we have valid information on the contents of this
 register in the mode of REG.  */
- if (reg_set_luid[regno] > move2add_last_label_luid
+ for (i = hard_regno_nregs[regno][GET_MODE (reg)] - 1;
+  i > 0 && reg_set_luid[regno + i] == reg_set_luid[regno];)
+   i--;
+ if (i == 0 && reg_set_luid[regno] > move2add_last_label_luid
  && MODES_OK_FOR_MOVE2ADD (GET_MODE (reg), reg_mode[regno])
   && dbg_cnt (cse2_move2add))
{
@@ -2021,7 +2044,10 @@ reload_cse_move2add (rtx first)
 
  /* If the reg already contains the value which is sum of
 sym and some constant value, we can use an add2 insn.  */
- if (reg_set_luid[regno] > move2add_last_label_luid
+ for (i = hard_regno_nregs[regno][GET_MODE (reg)] - 1;
+  i > 0 && reg_set_luid[regno + i] == reg_set_luid[regno];)
+   i--;
+ if (i == 0 && reg_set_luid[regno] > move2add_last_label_luid
  && MODES_OK_FOR_MOVE2ADD (GET_MODE (reg), reg_mode[regno])
  && reg_base_reg[regno] < 0
 

Re: [patch] Default to --enable-libstdcxx-time=auto

2013-05-22 Thread Paolo Carlini

On 05/22/2013 12:40 PM, Jonathan Wakely wrote:

On 22 May 2013 11:30, Paolo Carlini wrote:

.. put an abort or something in the dummy implementations, to make sure
people don't try to run an executable built with headers which have the
clock available linked at run time to a .so which doesn't, really.
Suboptimal solution because ideally we would like to fail at build time, but
this is the best I have got so far. Be very clear in the docs, anyway.

Based on feedback for  users much prefer to have something
undefined than to have it defined but non-functional or causing an
abort.
That for sure. But I don't think the situation is exactly the same, to 
be honest. Note, I don't know what exactly "undesignated" means, I 
understand you already used that approach, thus don't hesitate to try it 
here too!


Take your time (no pun intended ;) you may want to disable temporarily 
the auto with a FIXME comment.


Paolo.


Re: [Patch, Fortran] PR57338 - add more missing constraint checks for assumed-rank

2013-05-22 Thread Mikael Morin
Le 21/05/2013 20:05, Tobias Burnus a écrit :
> That's a follow-up the just committed patch - which came too late in
> some cases.
> 
> Build and regtested on x86-64-gnu-linux.
> OK for the trunk?
> 
OK, thanks

Mikael


Re: [patch] Default to --enable-libstdcxx-time=auto

2013-05-22 Thread Jakub Jelinek
On Wed, May 22, 2013 at 11:42:45AM +0100, Jonathan Wakely wrote:
> On 22 May 2013 11:40, Jonathan Wakely wrote:
> > On 22 May 2013 11:30, Paolo Carlini wrote:
> >> .. put an abort or something in the dummy implementations, to make sure
> >> people don't try to run an executable built with headers which have the
> >> clock available linked at run time to a .so which doesn't, really.
> >> Suboptimal solution because ideally we would like to fail at build time, 
> >> but
> >> this is the best I have got so far. Be very clear in the docs, anyway.
> >
> > Based on feedback for  users much prefer to have something
> > undefined than to have it defined but non-functional or causing an
> > abort.
> 
> Could leave the steady_clock::now() symbol "undesignated", like the
> __once_call and __once_callable symbols that are only present when
> _GLIBCXX_HAVE_TLS is defined?

For now() I think it was a mistake not to export it (generally, exports
conditional on supposedly non-ABI changing configure options are bad IMHO),
can't it just return some perhaps less precise clock or something instead?

The problem with having libstdc++.so's without now()@@GLIBCXX_3.4.17
exported in the wild and now new default flag builds having those
is quite a big ABI problem, say rpm dependency tracking will not spot those,
it only looks at symbol version names, not at individual symbols, so
programs that will try to call that symbol might fail only at runtime.

If now() can be perhaps with worse precision emulated in configurations not
built against glibc 2.17, perhaps best solution would be to
add now()@@GLIBCXX_3.4.18 into 4.8.1 (and change all 3.4.18 symbols to
3.4.19) and have now()@GLIBCXX_3.4.17 (note, just one @) as compatibility
alias to that.

Jakub


[AArch64] Support for CLZ

2013-05-22 Thread Vidya Praveen

Hello,

This patch adds support to AdvSIMD CLZ instruction and adds tests for the same.
Regression test done for aarch64-none-elf with no issues.

OK?

Regards
VP

---

gcc/ChangeLog

2013-05-22  Vidya Praveen 

* config/aarch64/aarch64-simd.md (clzv4si2): Support for CLZ
  instruction (AdvSIMD).
* config/aarch64/aarch64-builtins.c
  (aarch64_builtin_vectorized_function): Handler for BUILT_IN_CLZ.
* config/aarch64/aarch-simd-builtins.def: Entry for CLZ.
* testsuite/gcc.target/aarch64/vect-clz.c: New file.
diff --git a/gcc/config/aarch64/aarch64-builtins.c b/gcc/config/aarch64/aarch64-builtins.c
index 4fdfe24..2a0e5fd 100644
--- a/gcc/config/aarch64/aarch64-builtins.c
+++ b/gcc/config/aarch64/aarch64-builtins.c
@@ -1245,6 +1245,16 @@ aarch64_builtin_vectorized_function (tree fndecl, tree type_out, tree type_in)
 	  return AARCH64_FIND_FRINT_VARIANT (sqrt);
 #undef AARCH64_CHECK_BUILTIN_MODE
 #define AARCH64_CHECK_BUILTIN_MODE(C, N) \
+  (out_mode == SImode && out_n == C \
+   && in_mode == N##Imode && in_n == C)
+case BUILT_IN_CLZ:
+  {
+if (AARCH64_CHECK_BUILTIN_MODE (4, S))
+  return aarch64_builtin_decls[AARCH64_SIMD_BUILTIN_clzv4si];
+return NULL_TREE;
+  }
+#undef AARCH64_CHECK_BUILTIN_MODE
+#define AARCH64_CHECK_BUILTIN_MODE(C, N) \
   (out_mode == N##Imode && out_n == C \
&& in_mode == N##Fmode && in_n == C)
 	case BUILT_IN_LFLOOR:
diff --git a/gcc/config/aarch64/aarch64-simd-builtins.def b/gcc/config/aarch64/aarch64-simd-builtins.def
index e420173..5134f96 100644
--- a/gcc/config/aarch64/aarch64-simd-builtins.def
+++ b/gcc/config/aarch64/aarch64-simd-builtins.def
@@ -49,6 +49,7 @@
   BUILTIN_VDQF (UNOP, sqrt, 2)
   BUILTIN_VD_BHSI (BINOP, addp, 0)
   VAR1 (UNOP, addp, 0, di)
+  VAR1 (UNOP, clz, 2, v4si)
 
   BUILTIN_VD_RE (REINTERP, reinterpretdi, 0)
   BUILTIN_VDC (REINTERP, reinterpretv8qi, 0)
diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index 9069a73..82fe1ad 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -1611,6 +1611,15 @@
   DONE;
 })
 
+(define_insn "clz2"
+ [(set (match_operand:VDQ_BHSI 0 "register_operand" "=w")
+   (clz:VDQ_BHSI (match_operand:VDQ_BHSI 1 "register_operand" "w")))]
+ "TARGET_SIMD"
+ "clz\\t%0., %1."
+ [(set_attr "simd_type" "simd_cls")
+  (set_attr "simd_mode" "")]
+)
+
 ;; 'across lanes' max and min ops.
 
 (define_insn "reduc__"
diff --git a/gcc/testsuite/gcc.target/aarch64/vect-clz.c b/gcc/testsuite/gcc.target/aarch64/vect-clz.c
new file mode 100644
index 000..8f1fe70
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/vect-clz.c
@@ -0,0 +1,35 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -save-temps -fno-inline" } */
+
+extern void abort ();
+
+void
+count_lz_v4si (unsigned *__restrict a, int *__restrict b)
+{
+  int i;
+
+  for (i = 0; i < 4; i++)
+b[i] = __builtin_clz (a[i]);
+}
+
+/* { dg-final { scan-assembler "clz\tv\[0-9\]+\.4s" } } */
+
+int
+main ()
+{
+  unsigned int x[4] = { 0x0, 0x, 0x1, 0x };
+  int r[4] = { 32, 16, 15, 0 };
+  int d[4], i;
+
+  count_lz_v4si (x, d);
+
+  for (i = 0; i < 4; i++)
+{
+  if (d[i] != r[i])
+	abort ();
+}
+
+  return 0;
+}
+
+/* { dg-final { cleanup-saved-temps } } */

Re: [patch] Default to --enable-libstdcxx-time=auto

2013-05-22 Thread Jonathan Wakely
On 22 May 2013 12:15, Jakub Jelinek wrote:
>
> If now() can be perhaps with worse precision emulated in configurations not
> built against glibc 2.17, perhaps best solution would be to
> add now()@@GLIBCXX_3.4.18 into 4.8.1 (and change all 3.4.18 symbols to
> 3.4.19) and have now()@GLIBCXX_3.4.17 (note, just one @) as compatibility
> alias to that.

The problem for steady_clock::now() isn't one of precision, it's that
it isn't steady if we don't use the monotonic clock.  We could define
a non-steady steady_clock (with the same precision as system_clock)
but is that helpful to users?  If that's what we want then yes, we can
do the symbol versioning you suggest.


Re: Unreviewed libffi patch

2013-05-22 Thread Rainer Orth
Anthony Green  writes:

> Rainer - sorry, I've been travelling and falling behind on email.
>  This patch is fine.  Please commit it to GCC. I'll put it in the
> libffi git tree.

No worries, I just saw your vacation note.  I've commited to mainline
now.

Jakub, what should we do about the 4.8 branch?  Wait until 4.8.1 is
released or apply now?  This is a regression from 4.7.

Thanks.
Rainer


> On Wed, May 22, 2013 at 5:58 AM, Andrew Haley  wrote:
>
> On 05/22/2013 10:55 AM, Rainer Orth wrote:
> > The following libffi patch
> >
> >       [libffi] Fix several libffi testsuite failures on 64-bit
> SPARC and PowerPC (PR libffi/56033)
> >       http://gcc.gnu.org/ml/gcc-patches/2013-05/msg00840.html
> >
> > has remained unreviewed for a week.  It needs a libffi
> maintainer, which
> > I suppose Anthony still is, despite not being listed in
> MAINTAINERS.
>
> Forwarding to libffi.
>
>
>
>
>
>

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


Re: [ARM] fix PR debug/57351 ICE: internal compiler error: in dbx_reg_number,

2013-05-22 Thread Ramana Radhakrishnan

On 05/22/13 11:31, Christian Bruel wrote:

Hello,

arm_dwarf_register_span converts regno to a dbx register number while
building the PARALLEL rtx. But since
http://gcc.gnu.org/ml/gcc-patches/2013-05/msg01131.html this information
is centralized in DBX_REGISTER_NUMBER that will be called when
processing the operands in reg_loc_descriptor, so the DBX conversion
information doesn't need to be duplicated.

Build and regtested with a arm-none-eabi newlib toolset configured with
--with-fpu=neon-vfpv4 --with-float=hard --with-arch=armv7-a

OK for trunk ?

Thanks

Christian




In the test use

/* { dg-options "-std=c99 -Os -g -march=armv7-a" } */
/* { dg-add-options arm_neon } */

instead of

/* { dg-options "-std=c99 -Os -g -march=armv7-a -mfpu=neon-vfpv4 
-mfloat-abi=hard" { target { arm-*-* } } } */


Ok with that change.


regards
Ramana






Re: [4.8 PATCH, i386]: Fix PR 57356, SSE2 instructions generated with '-mno-sse2'

2013-05-22 Thread Uros Bizjak
On Wed, May 22, 2013 at 10:01 AM, Uros Bizjak  wrote:

>>> This patch avoids movdqu/movdqa when SSE2 is disabled. Although
>>> -mno-sse2 is bordering on ABI violation for 64bit targets, the patch
>>> is simple enough to fix movti_internal_rex64 pattern.
>>
>> If the TImode attr variant isn't valid for !SSE2, I'd find it clearer
>> to do
>> (match_test "!TARGET_SSE2") (const_string "V4SF")
>> much earlier in the cond, before TARGET_AVX case that returns TImode
>> (sure, when TARGET_AVX is true, then !TARGET_SSE2 is false), and
>> not added using ior to an unrelated condition.
>
> Yes, this is a good suggestion, I was trying to be consistent with the
> movti_internal pattern.
>
> I will change both in the way you suggested ASAP.

I have committed attached patch to 4.8 branch.

Uros.
Index: config/i386/i386.md
===
--- config/i386/i386.md (revision 199189)
+++ config/i386/i386.md (working copy)
@@ -1798,6 +1798,8 @@
(set (attr "mode")
(cond [(eq_attr "alternative" "0,1")
 (const_string "DI")
+  (not (match_test "TARGET_SSE2"))
+(const_string "V4SF")
   (match_test "TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL")
 (const_string "V4SF")
   (and (eq_attr "alternative" "4")
@@ -1854,15 +1856,16 @@
   [(set_attr "type" "sselog1,ssemov,ssemov")
(set_attr "prefix" "maybe_vex")
(set (attr "mode")
-   (cond [(match_test "TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL")
+   (cond [(not (match_test "TARGET_SSE2"))
+(const_string "V4SF")
+  (match_test "TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL")
 (const_string "V4SF")
   (and (eq_attr "alternative" "2")
(match_test "TARGET_SSE_TYPELESS_STORES"))
 (const_string "V4SF")
   (match_test "TARGET_AVX")
 (const_string "TI")
-  (ior (not (match_test "TARGET_SSE2"))
-   (match_test "optimize_function_for_size_p (cfun)"))
+  (match_test "optimize_function_for_size_p (cfun)")
 (const_string "V4SF")
  ]
  (const_string "TI")))])


Re: [patch] Default to --enable-libstdcxx-time=auto

2013-05-22 Thread Paolo Carlini

On 05/22/2013 01:49 PM, Jonathan Wakely wrote:

On 22 May 2013 12:15, Jakub Jelinek wrote:

If now() can be perhaps with worse precision emulated in configurations not
built against glibc 2.17, perhaps best solution would be to
add now()@@GLIBCXX_3.4.18 into 4.8.1 (and change all 3.4.18 symbols to
3.4.19) and have now()@GLIBCXX_3.4.17 (note, just one @) as compatibility
alias to that.

The problem for steady_clock::now() isn't one of precision, it's that
it isn't steady if we don't use the monotonic clock.  We could define
a non-steady steady_clock (with the same precision as system_clock)
but is that helpful to users?  If that's what we want then yes, we can
do the symbol versioning you suggest.
If I understand correctly Jakub's idea, nothing would change from the 
user point of view in the headers, using now would be still a no-no if 
the real facility isn't implemented. It would be only a run-time fall 
back for the situation I mentioned before: the binary is built with 
headers providing a good clock and then moved to another machine, and 
dynamically linked to a library which doesn't and then run. We could 
also spill a warning, at run-time.


That kind of solution would be definitely Ok with me.

Paolo.


Re: [patch] Default to --enable-libstdcxx-time=auto

2013-05-22 Thread Jonathan Wakely
On 22 May 2013 12:49, Jonathan Wakely wrote:
> On 22 May 2013 12:15, Jakub Jelinek wrote:
>>
>> If now() can be perhaps with worse precision emulated in configurations not
>> built against glibc 2.17, perhaps best solution would be to
>> add now()@@GLIBCXX_3.4.18 into 4.8.1 (and change all 3.4.18 symbols to
>> 3.4.19) and have now()@GLIBCXX_3.4.17 (note, just one @) as compatibility
>> alias to that.
>
> The problem for steady_clock::now() isn't one of precision, it's that
> it isn't steady if we don't use the monotonic clock.  We could define
> a non-steady steady_clock (with the same precision as system_clock)
> but is that helpful to users?  If that's what we want then yes, we can
> do the symbol versioning you suggest.

In fact we were already providing a non-steady steady_clock in the
default configuration, we just define it as a typedef for
system_clock, so we can export steady_clock::now() as an alias for
system_clock::now() when the two clocks are the same type.  When
they're not the same type there will be a real symbol present in the
.so


Re: [patch] Default to --enable-libstdcxx-time=auto

2013-05-22 Thread Jakub Jelinek
On Wed, May 22, 2013 at 12:49:44PM +0100, Jonathan Wakely wrote:
> On 22 May 2013 12:15, Jakub Jelinek wrote:
> >
> > If now() can be perhaps with worse precision emulated in configurations not
> > built against glibc 2.17, perhaps best solution would be to
> > add now()@@GLIBCXX_3.4.18 into 4.8.1 (and change all 3.4.18 symbols to
> > 3.4.19) and have now()@GLIBCXX_3.4.17 (note, just one @) as compatibility
> > alias to that.
> 
> The problem for steady_clock::now() isn't one of precision, it's that
> it isn't steady if we don't use the monotonic clock.  We could define
> a non-steady steady_clock (with the same precision as system_clock)
> but is that helpful to users?  If that's what we want then yes, we can
> do the symbol versioning you suggest.

There are two things.  One is offering to use steady_clock::now() at
compile time, even if we know the clock isn't steady (here sure, I think
preferrably don't offer that), but the second thing is, if you have
already linked program which uses steady clock, what is better, not be
able to dynamically link it at all (steady_clock use might be in some
library and you don't ever call now() in your program), or to
abort/std::terminate when now() is called in your program, or you offer
non-steady clock instead.  Or, have you also considered just using
for this routine
#if _GLIBCXX_HAS_SYS_SYSCALL_H
#include 
#endif

#if defined (SYS_clock_gettime) && defined (CLOCK_MONOTONIC)
syscall (SYS_clock_gettime, CLOCK_MONOTONIC, &tp);
#endif
if clock_gettime isn't available, at least on Linux?
The implementation seems to ignore ENOSYS from clock_gettime, so ignoring it
even here wouldn't make it worse.

Jakub


Re: Unreviewed libffi patch

2013-05-22 Thread Jakub Jelinek
On Wed, May 22, 2013 at 01:49:47PM +0200, Rainer Orth wrote:
> Jakub, what should we do about the 4.8 branch?  Wait until 4.8.1 is
> released or apply now?  This is a regression from 4.7.

You can apply it now.

Jakub


[Patch, Fortran, committed] PR57364 - add missing gfc_commit_symbol (4.8/4.9 regression)

2013-05-22 Thread Tobias Burnus

A rather obvious patch.

Committed to the trunk as Rev. 199196 after build+regtesting on 
x86-64-gnu-linux.

I will backport the patch to 4.9 in a while.

Tobias
2013-05-22  Tobias Burnus  

	PR fortran/57364
	* resolve.c (get_temp_from_expr): Commit created sym.

2013-05-22  Tobias Burnus  

	PR fortran/57364
	* gfortran.dg/defined_assignment_6.f90: New.

diff --git a/gcc/fortran/resolve.c b/gcc/fortran/resolve.c
index 74e0aa4..6f32df8 100644
--- a/gcc/fortran/resolve.c
+++ b/gcc/fortran/resolve.c
@@ -9300,6 +9300,7 @@ get_temp_from_expr (gfc_expr *e, gfc_namespace *ns)
 
   gfc_set_sym_referenced (tmp->n.sym);
   gfc_add_flavor (&tmp->n.sym->attr, FL_VARIABLE, name, NULL);
+  gfc_commit_symbol (tmp->n.sym);
   e = gfc_lval_expr_from_sym (tmp->n.sym);
 
   /* Should the lhs be a section, use its array ref for the
diff --git a/gcc/testsuite/gfortran.dg/defined_assignment_6.f90 b/gcc/testsuite/gfortran.dg/defined_assignment_6.f90
new file mode 100644
index 000..a5666fe
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/defined_assignment_6.f90
@@ -0,0 +1,36 @@
+! { dg-do compile }
+!
+! PR fortran/57364
+!
+! Contributed by Damian Rouson
+!
+module ref_counter_implementation
+  type ref_counter
+  contains
+procedure :: assign
+generic :: assignment(=) => assign
+  end type
+contains
+  subroutine assign (lhs, rhs)
+class (ref_counter), intent(inout) :: lhs
+class (ref_counter), intent(in) :: rhs
+  end subroutine
+end module
+module foo_parent_implementation
+  use ref_counter_implementation ,only: ref_counter
+  type :: foo_parent
+type(ref_counter) :: counter
+  end type
+contains
+  type(foo_parent) function new_foo_parent()
+  end function
+end module
+module foo_implementation
+  use foo_parent_implementation ,only: foo_parent,new_foo_parent
+  type, extends(foo_parent) :: foo
+  end type
+contains
+  type(foo) function new_foo()
+new_foo%foo_parent = new_foo_parent()
+ end function
+end module


Re: Unreviewed libffi patch

2013-05-22 Thread Rainer Orth
Jakub Jelinek  writes:

> On Wed, May 22, 2013 at 01:49:47PM +0200, Rainer Orth wrote:
>> Jakub, what should we do about the 4.8 branch?  Wait until 4.8.1 is
>> released or apply now?  This is a regression from 4.7.
>
> You can apply it now.

Done, thanks.

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


Re: [AArch64] Support for CLZ

2013-05-22 Thread Marcus Shawcroft
On 22 May 2013 12:47, Vidya Praveen  wrote:
> Hello,
>
> This patch adds support to AdvSIMD CLZ instruction and adds tests for the
> same.
> Regression test done for aarch64-none-elf with no issues.
>
> OK?
>
> Regards
> VP
>
> ---
>
> gcc/ChangeLog
>
> 2013-05-22  Vidya Praveen 
>
> * config/aarch64/aarch64-simd.md (clzv4si2): Support for CLZ
>   instruction (AdvSIMD).
> * config/aarch64/aarch64-builtins.c
>   (aarch64_builtin_vectorized_function): Handler for BUILT_IN_CLZ.
> * config/aarch64/aarch-simd-builtins.def: Entry for CLZ.
> * testsuite/gcc.target/aarch64/vect-clz.c: New file.

OK


Re: [rs6000] Add register save/restore routines for cross

2013-05-22 Thread David Edelsohn
I don't believe those functions should be provided by libgcc, at least
not by shared libgcc.so, as explained by Alan.

- David


Re: [patch, powerpc] increase array alignment for Altivec

2013-05-22 Thread Sandra Loosemore

On 05/22/2013 02:01 AM, Richard Biener wrote:

On Wed, May 22, 2013 at 3:57 AM, David Edelsohn  wrote:


Increasing the alignment of arrays within structs and unions would be
nice, but that probably will change the ABI. I think that they best we
may be able to do is increase the alignment if the array is the first
element of the struct or union, see ROUND_TYPE_ALIGN for AIX.
Although this might be more trouble than it is worth.


Maybe the idea was to increase the alignment of the struct (without
affecting it's layout) when that increases the alignment of a contained
array member.  Like for

struct { int i; int j; float a[1024]; int x; };

where aligning to sizeof (int) * 2 would get 'a' a bigger alignment.


In fact, this is what the patch I posted earlier this week does, except 
the alignment it looks for is 16 bytes.  In other words, it doesn't 
change the alignment of arrays inside structs (which would cause all 
sorts of compatibility problems), but it checks to see whether there is 
already some array inside the struct aligned on a 16-byte offset so that 
the whole containing struct object would benefit from being aligned on a 
16-byte boundary.


This was a generalization of the suggestion from the previous review to 
align structs that contain an array as the first or only element  I 
realized we'd want to handle unions too, and look at all fields in a 
union instead of just the first one in a struct, and unifying the two 
cases made the code come out tidier.  :-)


-Sandra



Re: [rs6000] Add register save/restore routines for cross

2013-05-22 Thread Alan Modra
On Wed, May 22, 2013 at 09:40:22AM -0400, David Edelsohn wrote:
> I don't believe those functions should be provided by libgcc, at least
> not by shared libgcc.so, as explained by Alan.

David, I think t-savresfgpr satifies that requirement.

# These can't end up in shared libgcc
LIB2ADD_ST += \
...

-- 
Alan Modra
Australia Development Lab, IBM


Re: [patch, powerpc] increase array alignment for Altivec

2013-05-22 Thread Richard Biener
On Wed, May 22, 2013 at 3:57 PM, Sandra Loosemore
 wrote:
> On 05/22/2013 02:01 AM, Richard Biener wrote:
>>
>> On Wed, May 22, 2013 at 3:57 AM, David Edelsohn  wrote:
>>>
>>>
>>> Increasing the alignment of arrays within structs and unions would be
>>> nice, but that probably will change the ABI. I think that they best we
>>> may be able to do is increase the alignment if the array is the first
>>> element of the struct or union, see ROUND_TYPE_ALIGN for AIX.
>>> Although this might be more trouble than it is worth.
>>
>>
>> Maybe the idea was to increase the alignment of the struct (without
>> affecting it's layout) when that increases the alignment of a contained
>> array member.  Like for
>>
>> struct { int i; int j; float a[1024]; int x; };
>>
>> where aligning to sizeof (int) * 2 would get 'a' a bigger alignment.
>
>
> In fact, this is what the patch I posted earlier this week does, except the
> alignment it looks for is 16 bytes.  In other words, it doesn't change the
> alignment of arrays inside structs (which would cause all sorts of
> compatibility problems), but it checks to see whether there is already some
> array inside the struct aligned on a 16-byte offset so that the whole
> containing struct object would benefit from being aligned on a 16-byte
> boundary.
>
> This was a generalization of the suggestion from the previous review to
> align structs that contain an array as the first or only element  I
> realized we'd want to handle unions too, and look at all fields in a union
> instead of just the first one in a struct, and unifying the two cases made
> the code come out tidier.  :-)

Right.  Btw, the code should all be target independent ... just the desired
alignment for an array should be queried from it.  pass_ipa_increase_alignment
and/or the vectorizer code itself should use that info.

Richard.

> -Sandra
>


Re: [rs6000] Add register save/restore routines for cross

2013-05-22 Thread David Edelsohn
Why does cross need the functions in libgcc and not provided by the linker?

- David

On Wed, May 22, 2013 at 10:00 AM, Alan Modra  wrote:
> On Wed, May 22, 2013 at 09:40:22AM -0400, David Edelsohn wrote:
>> I don't believe those functions should be provided by libgcc, at least
>> not by shared libgcc.so, as explained by Alan.
>
> David, I think t-savresfgpr satifies that requirement.
>
> # These can't end up in shared libgcc
> LIB2ADD_ST += \
> ...
>
> --
> Alan Modra
> Australia Development Lab, IBM


Re: [patch] Default to --enable-libstdcxx-time=auto

2013-05-22 Thread Rainer Orth
Jonathan Wakely  writes:

> This alters the configure script to enable C++11 thread library
> features based on targets that are known to support the features,
> rather than based on link tests which are disabled by default.  With
> Glibc 2.17 this enables a nanosecond resolution std::system_clock in
> the default configuration, yay!
>
> I've tested this on two versions of Fedora and Debian, but would be
> grateful for test results on Solaris, Cygwin and BSD targets, and for
> cross-compilers to any of those targets.

Apart from the abi_check failure already reported, I get the following
testsuite regressions on Solaris 10/x86:

FAIL: 30_threads/async/54297.cc (test for excess errors)
WARNING: 30_threads/async/54297.cc compilation failed to produce executable
FAIL: 30_threads/condition_variable_any/53830.cc (test for excess errors)
WARNING: 30_threads/condition_variable_any/53830.cc compilation failed to produ
e executable
FAIL: 30_threads/this_thread/3.cc (test for excess errors)
WARNING: 30_threads/this_thread/3.cc compilation failed to produce executable
FAIL: 30_threads/this_thread/4.cc (test for excess errors)
WARNING: 30_threads/this_thread/4.cc compilation failed to produce executable
FAIL: 30_threads/thread/native_handle/cancel.cc (test for excess errors)
WARNING: 30_threads/thread/native_handle/cancel.cc compilation failed to produc
e executable

All of them have the same root cause:

Excess errors:
Undefined   first referenced
 symbol in file
nanosleep   /var/tmp//ccQhmiwd.o  (symbol belongs to 
implicit dependency /lib/librt.so.1)
ld: fatal: symbol referencing errors. No output written to ./54297.exe
collect2: error: ld returned 1 exit status

It seems that now every single C++ program needs to be linked with -lrt,
not only libstdc++.so.  This will also happen on Solaris 9 (bootstrap
still running), while on Solaris 11 nanosleep and the others were
integrated into libc.so.1.

Speaking of Solaris 9, there's another caveat: unlike Solaris 10 and up,
CLOCK_MONOTONIC isn't defined, while the equivalent non-standard
CLOCK_HIGHRES is.  Instead of handling this in
libstdc++-v3/src/c++11/chrono.cc directly, I've chosen the following
route which allows libstdc++ to build on Solaris 9:

2013-05-22  Rainer Orth  

* config/os/solaris/solaris2.9/os_defines.h [!CLOCK_MONOTONIC]
(CLOCK_MONOTONIC): Define.

# HG changeset patch
# Parent 079c0691568d1b9ef88a62ebf89ea19d3a8272ab
Use CLOCK_HIGHRES on Solaris 9

	libstdc++-v3:
	* config/os/solaris/solaris2.9/os_defines.h [!CLOCK_MONOTONIC]
	(CLOCK_MONOTONIC): Define.

diff --git a/libstdc++-v3/config/os/solaris/solaris2.9/os_defines.h b/libstdc++-v3/config/os/solaris/solaris2.9/os_defines.h
--- a/libstdc++-v3/config/os/solaris/solaris2.9/os_defines.h
+++ b/libstdc++-v3/config/os/solaris/solaris2.9/os_defines.h
@@ -35,5 +35,10 @@
 #define __CORRECT_ISO_CPP_WCHAR_H_PROTO
 #endif
 
+/* Solaris 9 uses the non-standard CLOCK_HIGHRES instead.  */
+#ifndef CLOCK_MONOTONIC
+#define CLOCK_MONOTONIC CLOCK_HIGHRES
 #endif
 
+#endif
+
 
Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


[Patch, Fortran] Create valid temporary variable to avoid assembler errors

2013-05-22 Thread Tobias Burnus

With one Fortran file, I get the following assembler errors:

/tmp/cc28epKK.s:2075: Error: junk `@1.2304+16' after expression

That's due to the way a temporary variable is generated. While that 
variable is local to the procedure, the name somehow escapes into the 
assembler file. The dump looks as follows.


  static struct foo DA@0;
  static struct universal DA@1;
  static struct universal DA@2;
...
class.30._data = &DA@0.foo_parent.universal.counter;

I have now changed the mangling, see attached patch. (The test file uses 
finalization - hence, I do not include it into the patch. I will include 
it in the FINAL patch.)


Build and regtested on x86-64-gnu-linux.
OK for the trunk?

Tobias
2013-05-22  Tobias Burnus  

	* resolve.c (get_temp_from_expr): Fix temp var mangling.

diff --git a/gcc/fortran/resolve.c b/gcc/fortran/resolve.c
index 6f32df8..5fabc9a 100644
--- a/gcc/fortran/resolve.c
+++ b/gcc/fortran/resolve.c
@@ -9254,7 +9254,7 @@ get_temp_from_expr (gfc_expr *e, gfc_namespace *ns)
   gfc_array_ref *aref;
   gfc_ref *ref;
 
-  sprintf (name, "DA@%d", serial++);
+  sprintf (name, "DA" GFC_PREFIX("%d"), serial++);
   gfc_get_sym_tree (name, ns, &tmp, false);
   gfc_add_type (tmp->n.sym, &e->ts, NULL);
 


Re: [patch, powerpc] increase array alignment for Altivec

2013-05-22 Thread Alan Modra
On Wed, May 22, 2013 at 07:57:34AM -0600, Sandra Loosemore wrote:
> On 05/22/2013 02:01 AM, Richard Biener wrote:
> >Maybe the idea was to increase the alignment of the struct (without
> >affecting it's layout) when that increases the alignment of a contained
> >array member.  Like for
> >
> >struct { int i; int j; float a[1024]; int x; };
> >
> >where aligning to sizeof (int) * 2 would get 'a' a bigger alignment.
> 
> In fact, this is what the patch I posted earlier this week does,
> except the alignment it looks for is 16 bytes.  In other words, it
> doesn't change the alignment of arrays inside structs (which would
> cause all sorts of compatibility problems), but it checks to see
> whether there is already some array inside the struct aligned on a
> 16-byte offset so that the whole containing struct object would
> benefit from being aligned on a 16-byte boundary.

Right, and I like the patch.  However it does have a potential ABI
problem.  For example a shared library that exports a variable
affected by this change might do so on a smaller alignment than new
code expects.  Fiddling with that variable in a new executable might
give wrong results.  Or the PR56564 case of linking C++ relocatable
objects compiled with different compilers might bite us.

Note that rs6000.h DATA_ALIGNMENT already assumes alignment greater
than the ppc64 ABI guarantees for byte arrays.  So we're already
guilty of this sin.

-- 
Alan Modra
Australia Development Lab, IBM


Re: [rs6000] Add register save/restore routines for cross

2013-05-22 Thread Alan Modra
On Wed, May 22, 2013 at 10:05:47AM -0400, David Edelsohn wrote:
> Why does cross need the functions in libgcc and not provided by the linker?

Only the ppc64 linker provides save/restore functions magically.

-- 
Alan Modra
Australia Development Lab, IBM


Re: [rs6000] Add register save/restore routines for cross

2013-05-22 Thread David Edelsohn
On Wed, May 22, 2013 at 10:35 AM, Alan Modra  wrote:
> On Wed, May 22, 2013 at 10:05:47AM -0400, David Edelsohn wrote:
>> Why does cross need the functions in libgcc and not provided by the linker?
>
> Only the ppc64 linker provides save/restore functions magically.

Okay, then the patch is okay.

Thanks, David


Re: [patch] Default to --enable-libstdcxx-time=auto

2013-05-22 Thread Jonathan Wakely
On 22 May 2013 15:05, Rainer Orth wrote:
> Jonathan Wakely  writes:
>
>> This alters the configure script to enable C++11 thread library
>> features based on targets that are known to support the features,
>> rather than based on link tests which are disabled by default.  With
>> Glibc 2.17 this enables a nanosecond resolution std::system_clock in
>> the default configuration, yay!
>>
>> I've tested this on two versions of Fedora and Debian, but would be
>> grateful for test results on Solaris, Cygwin and BSD targets, and for
>> cross-compilers to any of those targets.
>
> Apart from the abi_check failure already reported, I get the following
> testsuite regressions on Solaris 10/x86:

Thanks for checking it.

> FAIL: 30_threads/async/54297.cc (test for excess errors)
> WARNING: 30_threads/async/54297.cc compilation failed to produce executable
> FAIL: 30_threads/condition_variable_any/53830.cc (test for excess errors)
> WARNING: 30_threads/condition_variable_any/53830.cc compilation failed to 
> produ
> e executable
> FAIL: 30_threads/this_thread/3.cc (test for excess errors)
> WARNING: 30_threads/this_thread/3.cc compilation failed to produce executable
> FAIL: 30_threads/this_thread/4.cc (test for excess errors)
> WARNING: 30_threads/this_thread/4.cc compilation failed to produce executable
> FAIL: 30_threads/thread/native_handle/cancel.cc (test for excess errors)
> WARNING: 30_threads/thread/native_handle/cancel.cc compilation failed to 
> produc
> e executable
>
> All of them have the same root cause:
>
> Excess errors:
> Undefined   first referenced
>  symbol in file
> nanosleep   /var/tmp//ccQhmiwd.o  (symbol belongs to 
> implicit dependency /lib/librt.so.1)
> ld: fatal: symbol referencing errors. No output written to ./54297.exe
> collect2: error: ld returned 1 exit status
>
> It seems that now every single C++ program needs to be linked with -lrt,
> not only libstdc++.so.  This will also happen on Solaris 9 (bootstrap
> still running), while on Solaris 11 nanosleep and the others were
> integrated into libc.so.1.

I see.  Would you prefer to disable use of nanosleep on Solaris 9 and
10, or to keep using it and link to librt?


Re: [patch] Default to --enable-libstdcxx-time=auto

2013-05-22 Thread Rainer Orth
Jonathan Wakely  writes:

>> All of them have the same root cause:
>>
>> Excess errors:
>> Undefined   first referenced
>>  symbol in file
>> nanosleep /var/tmp//ccQhmiwd.o (symbol belongs to implicit dependency
>> /lib/librt.so.1)
>> ld: fatal: symbol referencing errors. No output written to ./54297.exe
>> collect2: error: ld returned 1 exit status
>>
>> It seems that now every single C++ program needs to be linked with -lrt,
>> not only libstdc++.so.  This will also happen on Solaris 9 (bootstrap
>> still running), while on Solaris 11 nanosleep and the others were
>> integrated into libc.so.1.
>
> I see.  Would you prefer to disable use of nanosleep on Solaris 9 and
> 10, or to keep using it and link to librt?

I think it's fine for both C++11 programs and libstdc++.so to depend
on librt.  Maybe one could restrict linking with -lrt to C++11 mode?
But I doubt that: you can probably link both C++99 and C++11 objects
into a single executable, and with libstdc++.so already depending on
librt, there's not much point in the added complexity.

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


Re: [PATCH] Fix incorrect discriminator assignment.

2013-05-22 Thread Dehao Chen
Sounds reasonable. The patch is updated, bootstrapped and passed all
regression test.

Dehao

On Tue, May 21, 2013 at 5:34 AM, Richard Biener
 wrote:
> On Fri, May 17, 2013 at 5:48 PM, Dehao Chen  wrote:
>> On Fri, May 17, 2013 at 1:22 AM, Richard Biener
>>  wrote:
>>> On Wed, May 15, 2013 at 6:50 PM, Cary Coutant  wrote:
> gcc/ChangeLog:
>
> * tree-cfg.c (locus_descrim_hasher::hash): Only hash lineno.
> (locus_descrim_hasher::equal): Likewise.
> (next_discriminator_for_locus): Likewise.
> (assign_discriminator): Add return value.
> (make_edges): Assign more discriminator if necessary.
> (make_cond_expr_edges): Likewise.
> (make_goto_expr_edges): Likewise.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.dg/debug/dwarf2/discriminator.c: New Test.

 Looks OK to me, but I can't approve patches for tree-cfg.c.

 Two comments:

 (1) In the C++ conversion, it looks like someone misspelled "discrim"
 in "locus_descrim_hasher".

 (2) Is this worth fixing in trunk when we'll eventually switch to a
 per-instruction discriminator?
>>>
 This patch fixes a common case where one line is distributed to 3 BBs,
 but only 1 discriminator is assigned.
>>>
>>> As far as I can see the patch makes discriminators coarser (by hashing
>>> and comparing on LOCATION_LINE and not on the location).  It also has
>>> the changes like
>>>
>>> -  assign_discriminator (entry_locus, then_bb);
>>> +  if (assign_discriminator (entry_locus, then_bb))
>>> +assign_discriminator (entry_locus, bb);
>>>
>>> which is what the comment refers to?  I think those changes are 
>>> short-sighted
>>> because what happens if the 2nd assign_discriminator call has exactly the
>>> same issue?  Especially with the make_goto_expr_edges change there
>>> may be multiple predecessors where I cannot see how the change is
>>> correct.  Yes, the change changes something and thus may fix the testcase
>>> but it doesn't change things in a predictable way as far as I can see.
>>>
>>> So - the change to compare/hash LOCATION_LINE looks good to me,
>>> but the assign_discriminator change needs more explaining.
>>
>> The new discriminator assignment algorithm is:
>>
>> 1 FOR_EACH_BB:
>> 2   FOR_EACH_SUCC_EDGE:
>> 3 if (same_line(bb, succ_bb))
>> 4   if (succ_bb has no discriminator)
>> 5  succ_bb->discriminator = new_discriminator
>> 6   else if (bb has no discriminator)
>> 7  bb->discriminator = new_discriminator
>>
>> Because there are so many places to handle SUCC_EDGE, thus the logic
>> line #3, #4 is embedded within assign_discriminator function, and the
>> logic in line #6 is encoded as the return value of
>> assign_discriminator.
>
> Hmm, as of current the code is hardly readable while the above makes sense
> (well, apart from what happens if both succ and bb already have a
> discriminator).
>
> Can I make you refactor the current code to postpone discriminator assignment
> until after make_edges completed, thus, do it in a postprocessing step done
> exactly like outlined above?  That way also the whole thing, including
> the currently global  discriminator_per_locus can be localized into a single
> function.
>
> Thanks,
> Richard.
>
>
>> Thanks,
>> Dehao
>>
>>>
>>> Richard.
>>>
 -cary
Index: gcc/tree-cfg.c
===
--- gcc/tree-cfg.c  (revision 198891)
+++ gcc/tree-cfg.c  (working copy)
@@ -105,7 +105,7 @@ struct locus_descrim_hasher : typed_free_remove locus;
+  return LOCATION_LINE (item->locus);
 }
 
 /* Equality function for the locus-to-discriminator map.  A and B
@@ -114,7 +114,7 @@ locus_descrim_hasher::hash (const value_type *item
 inline bool
 locus_descrim_hasher::equal (const value_type *a, const compare_type *b)
 {
-  return a->locus == b->locus;
+  return LOCATION_LINE (a->locus) == LOCATION_LINE (b->locus);
 }
 
 static hash_table  discriminator_per_locus;
@@ -125,11 +125,11 @@ static void factor_computed_gotos (void);
 
 /* Edges.  */
 static void make_edges (void);
+static void assign_discriminators (void);
 static void make_cond_expr_edges (basic_block);
 static void make_gimple_switch_edges (basic_block);
 static void make_goto_expr_edges (basic_block);
 static void make_gimple_asm_edges (basic_block);
-static void assign_discriminator (location_t, basic_block);
 static edge gimple_redirect_edge_and_branch (edge, basic_block);
 static edge gimple_try_redirect_by_replacing_jump (edge, basic_block);
 static unsigned int split_critical_edges (void);
@@ -231,6 +231,7 @@ build_gimple_cfg (gimple_seq seq)
   /* Create the edges of the flowgraph.  */
   discriminator_per_locus.create (13);
   make_edges ();
+  assign_discriminators ();
   cleanup_dead_labels ();
   discriminator_per_locus.dispose ();
 }
@@ -690,11 +691,7 @@ make_edges (void)
fallthru = true;
 
   if (fallthru)
-{
- make_edge (bb, bb->next_bb, EDGE_FALLTHRU);
- if

Re: [PATCH] Add explicit default constructors where required by the standard

2013-05-22 Thread Diego Novillo

On 2013-05-22 05:32 , Evgeniy Stepanov wrote:

OK to merge to google/4_7 and google/4_8?


Yes.  Patches coming from trunk or other release branches need no 
further approval for backporting.  You just need to make sure you don't 
introduce any regressions, of course.



Diego.


[PATCH] Do not allow non-top-level BIT_FIELD_REFs, IMAGPART_EXPRs or REALPART_EXPRs

2013-05-22 Thread Martin Jambor
Hi,

earlier this week I asked on IRC whether we could have non-top-level
BIT_FIELD_REFs and Richi said that we could.  However, when I later
looked at SRA code, quite apparently it is not designed to handle
non-top-level BIT_FIELD_REFs, IMAGPART_EXPRs or REALPART_EXPRs.  So in
order to test whether that assumption is OK, I added the following
into the gimple verifier and ran bootstrap and testsuite of all
languages including Ada and ObjC++ on x86_64.  It survived, which
makes me wondering whether we do not want it in trunk.

What do you think?

Martin


2013-05-22  Martin Jambor  

* tree-cfg.c (verify_types_in_gimple_reference): Do not allow
non-top-level BIT_FIELD_REFs, IMAGPART_EXPRs or REALPART_EXPRs.

Index: src/gcc/tree-cfg.c
===
--- src.orig/gcc/tree-cfg.c
+++ src/gcc/tree-cfg.c
@@ -2858,6 +2858,8 @@ verify_types_in_gimple_min_lval (tree ex
 static bool
 verify_types_in_gimple_reference (tree expr, bool require_lvalue)
 {
+  bool top_level = true;
+
   while (handled_component_p (expr))
 {
   tree op = TREE_OPERAND (expr, 0);
@@ -2944,6 +2946,17 @@ verify_types_in_gimple_reference (tree e
return false;
}
 
+  if (!top_level
+ && (TREE_CODE (expr) == BIT_FIELD_REF
+ || TREE_CODE (expr) == IMAGPART_EXPR
+ || TREE_CODE (expr) == REALPART_EXPR))
+   {
+ error ("non-top-level BIT_FIELD_REF, IMAGPART_EXPR or REALPART_EXPR");
+ debug_generic_stmt (expr);
+ return true;
+   }
+
+  top_level = false;
   expr = op;
 }
 


RE: [PING]RE: [patch] cilkplus: Array notation for C patch

2013-05-22 Thread Iyer, Balaji V
Hello Richard,
Thank you for reviewing my code. Please see my responses below.

Thanks,

Balaji V. Iyer.

> -Original Message-
> From: Richard Henderson [mailto:r...@redhat.com]
> Sent: Tuesday, May 21, 2013 10:57 PM
> To: Iyer, Balaji V
> Cc: 'Joseph S. Myers'; 'Aldy Hernandez'; 'gcc-patches'
> Subject: Re: [PING]RE: [patch] cilkplus: Array notation for C patch
> 
> Let me start by saying that I think the patch is generally ok, especially
> considering the advice that's already been given.  That said...
> 
> > +++ b/gcc/c/c-array-notation.c
> > @@ -0,0 +1,3121 @@
> 
> So, like, are we going to need to replicate 3000 lines to add array notation 
> to
> c++ too?  How much of this are we going to be able to share?

The overall function names are same, but the components inside it function 
differs greatly from C and C++. For example, in C++ I can't use 
build_modify_expr, but build_x_modify_expr. Also, I need to handle overloaded 
function types, and that requires further checking. Similarly, the trees are 
different from C and C++, and so I have to do more checks and handle them 
differently in C++. In my mind, this seem to be the most straight-forward and 
legible way to do it.

> 
> While I don't think we should make the array notation global, we might be able
> to share the code via c-family/.  The Ideal way I think would be via 
> c-gimplify, in
> which we don't expand the array notation until later.  But even if delaying 
> the
> expansion causes other problems that I havn't thought about, just placing the
> common expansion logic in the shared directory would be better.
> 
> > +  for (ii = 0; ii < rank ; ii++)
> > +{
> > +  /* This will create the if statement label.  */
> > +  if_stmt_label[ii] = build_decl (location, LABEL_DECL, NULL_TREE,
> > + void_type_node);
> > +  DECL_CONTEXT (if_stmt_label[ii]) = current_function_decl;
> > +  DECL_ARTIFICIAL (if_stmt_label[ii]) = 0;
> > +  DECL_IGNORED_P (if_stmt_label[ii]) = 1;
> > +
> > +  /* This label statement will point to the loop body.  */
> > +  body_label[ii] = build_decl (location, LABEL_DECL, NULL_TREE,
> > +  void_type_node);
> > +  DECL_CONTEXT (body_label[ii]) = current_function_decl;
> > +  DECL_ARTIFICIAL (body_label[ii]) = 0;
> > +  DECL_IGNORED_P (body_label[ii]) = 1;
> > +  body_label_expr[ii] = build1 (LABEL_EXPR, void_type_node,
> > + body_label[ii])
> > ;
> > +
> > +  /* This will create the exit label, i.e. where the while loop will 
> > branch
> > +out of.  */
> > +  exit_label[ii] = build_decl (location, LABEL_DECL, NULL_TREE,
> > +  void_type_node);
> > +  DECL_CONTEXT (exit_label[ii]) = current_function_decl;
> > +  DECL_ARTIFICIAL (exit_label[ii]) = 0;
> > +  DECL_IGNORED_P (exit_label[ii]) = 1;
> > +  exit_label_expr[ii] = build1 (LABEL_EXPR, void_type_node, 
> > exit_label[ii]);
> > +}
> 
> Is there any particular reason why you're open-coding the loop expansion
> instead of using existing infrastructure like c_finish_loop?

Before I started my implementation, I did briefly consider using c_finish_loop, 
but adding multiple expressions got a bit complicated. For example, if I have 2 
increment/condition expressions (my initial implementation had it but later on 
I eliminated that need), I had to create a statement list and then add them 
using append_to_statement_list() and then pass that into c_finish_loop. Also, 
if I am not mistaken, I still had to create some of the individual labels when 
using c_finish_loop(). Overall, I didn't find much code-size decrease, and I 
found this a bit more straight forward for me. I apologize if I am mistaken.

> 
> Yes, the specific example of c_finish_loop goes against the c++ sharing, but 
> it's
> fairly easy to add new interfaces to c-common.h that are implemented in the
> two front ends.

For C++, I looked into creating FOR_STMT instead. I found this slightly more 
convenient than c_finish_loop, but decided to stick with this approach for 
convenience, and I didn't find much code-size decrease.

> 

Are these changes a blocking issue for acceptance into trunk? Or, would it be 
ok to accept it as-is and then I make these changes at a later date?

> 
> 
> r~


[PATCH, AArch64] Fix invalid assembler in scalar_intrinsics.c test

2013-05-22 Thread Ian Bolton
The test file scalar_intrinsics.c (in gcc.target/aarch64)
is currently compile-only.

If you attempt to make it run, as opposed to just generate
assembler, you can't because it won't assemble.

There are two issues causing trouble here:

1) Use of invalid instruction "mov d0, d1".
   It should be "mov d0, v1.d[0]".

2) The vdupd_lane_s64 and vdupd_lane_u64 calls are being given
   a lane that is out of range, which causes invalid assembler
   output.

This patch fixes both, so that we can build on this to make
executable test cases for scalar intrinsics.

OK for trunk?

Cheers,
Ian


2013-05-22  Ian Bolton  

testsuite/
* gcc.target/aarch64/scalar_intrinsics.c (force_simd):
Use a valid instruction.
(test_vdupd_lane_s64): Pass a valid lane argument.
(test_vdupd_lane_u64): Likewise.diff --git a/gcc/testsuite/gcc.target/aarch64/scalar_intrinsics.c 
b/gcc/testsuite/gcc.target/aarch64/scalar_intrinsics.c
index 7427c62..16537ce 100644
--- a/gcc/testsuite/gcc.target/aarch64/scalar_intrinsics.c
+++ b/gcc/testsuite/gcc.target/aarch64/scalar_intrinsics.c
@@ -4,7 +4,7 @@
 #include 
 
 /* Used to force a variable to a SIMD register.  */
-#define force_simd(V1)   asm volatile ("mov %d0, %d1"  \
+#define force_simd(V1)   asm volatile ("mov %d0, %1.d[0]"  \
   : "=w"(V1)   \
   : "w"(V1)\
   : /* No clobbers */);
@@ -228,13 +228,13 @@ test_vdups_lane_u32 (uint32x4_t a)
 int64x1_t
 test_vdupd_lane_s64 (int64x2_t a)
 {
-  return vdupd_lane_s64 (a, 2);
+  return vdupd_lane_s64 (a, 1);
 }
 
 uint64x1_t
 test_vdupd_lane_u64 (uint64x2_t a)
 {
-  return vdupd_lane_u64 (a, 2);
+  return vdupd_lane_u64 (a, 1);
 }
 
 /* { dg-final { scan-assembler-times "\\tcmtst\\td\[0-9\]+, d\[0-9\]+, 
d\[0-9\]+" 2 } } */


Re: [Patch ARM] Fix PR target/19599

2013-05-22 Thread Ramana Radhakrishnan



This caused:

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57372


A fix is forthcoming - this is a dup of PR57340.

Ramana



--
Roman





Re: *ping* Re: [PATCH, PR preprocessor/42014] Added LAST_SOURCE_COLUMN in while loop

2013-05-22 Thread Tom Tromey
> "Shakthi" == Shakthi Kannan  writes:

Shakthi> Is the following patch okay for trunk?

I still think it needs a test case.
I also don't recall -- did you check to see if the column number that is
emitted is actually correct?

You may want to change the Subject line of your note, as well.
The bug is filed against the preprocessor but it would have to be
reviewed by a diagnostics maintainer.

Tom


Re: [PATCH, AArch64] Fix invalid assembler in scalar_intrinsics.c test

2013-05-22 Thread Marcus Shawcroft
On 22 May 2013 16:18, Ian Bolton  wrote:
> The test file scalar_intrinsics.c (in gcc.target/aarch64)
> is currently compile-only.
>
> If you attempt to make it run, as opposed to just generate
> assembler, you can't because it won't assemble.
>
> There are two issues causing trouble here:
>
> 1) Use of invalid instruction "mov d0, d1".
>It should be "mov d0, v1.d[0]".
>
> 2) The vdupd_lane_s64 and vdupd_lane_u64 calls are being given
>a lane that is out of range, which causes invalid assembler
>output.
>
> This patch fixes both, so that we can build on this to make
> executable test cases for scalar intrinsics.
>
> OK for trunk?

OK
/Marcus


Re: [PING]RE: [patch] cilkplus: Array notation for C patch

2013-05-22 Thread Richard Henderson

On 2013-05-21 23:15, Jakub Jelinek wrote:

Furthermore, do you want to generate just normal loop out of it, or
shouldn't we generate a #pragma omp simd loop out of it instead?
Haven't read the spec if array notation guarantees lack of forward/backward
loop dependencies and what are the restrictions on the calls you can do
inside of array notation.  Perhaps the lowering to normal vs. simd loop
could depend on whether all called functions in the expression are
elemental.


The only overlap allowed is exact, i.e. a[0:5] = a[0:5] + b[1:5],
but not a[0:5] = a[1:5] + b[1:5].  So, yes, I believe that safelen
is essentially unlimited.


r~


[Patch ARM] Fix PR57340

2013-05-22 Thread Ramana Radhakrishnan

Hi,

This fixes PR target/57340 a fallout from my PR target/19599 patch. 
Unfortunately this didn't show up in the testing I did and I'm sorry 
about the breakage.


The problem here is that we choose r3 as a padding register with 
indirect tailcalls that could use r3 for this purpose. Ofcourse this 
decision is made when the frame offsets is calculated and r3 isn't in 
the fusage of the function. The simplest way to fix this was to check if 
the tailcall in this case is also an indirect tailcall. If so, we don't 
allow the use of r3 as a padding register. Additionally, in the prologue 
code we choose an alternate callee save register that is free if r3 is 
not available so we are free to use this instead of another callee save 
register.


Tested by bootstrapping armv5te and observing it pass well into stage2. 
Tested by bootstrapping and reg testing Thumb2 configurations with A9 
where this problem was also observed on a version prior to breakages 
from PR57351.


Applied.

regards
Ramana



2013-05-22  Ramana Radhakrishnan  

PR target/19599
PR target/57340
* config/arm/arm.c (any_sibcall_uses_r3): Rename to ..
(any_sibcall_could_use_r3): this and handle indirect calls.
(arm_get_frame_offsets): Rename use of any_sibcall_uses_r3.Index: gcc/config/arm/arm.c
===
--- gcc/config/arm/arm.c(revision 199202)
+++ gcc/config/arm/arm.c(working copy)
@@ -17568,11 +17568,27 @@ thumb_force_lr_save (void)
 || df_regs_ever_live_p (LR_REGNUM));
 }
 
+/* We do not know if r3 will be available because
+   we do have an indirect tailcall happening in this
+   particular case.  */
+static bool
+is_indirect_tailcall_p (rtx call)
+{
+  rtx pat = PATTERN (call);
+
+  /* Indirect tail call.  */
+  pat = XVECEXP (pat, 0, 0);
+  if (GET_CODE (pat) == SET)
+pat = SET_SRC (pat);
+
+  pat = XEXP (XEXP (pat, 0), 0);
+  return REG_P (pat);
+}
 
 /* Return true if r3 is used by any of the tail call insns in the
current function.  */
 static bool
-any_sibcall_uses_r3 (void)
+any_sibcall_could_use_r3 (void)
 {
   edge_iterator ei;
   edge e;
@@ -17586,7 +17602,8 @@ any_sibcall_uses_r3 (void)
if (!CALL_P (call))
  call = prev_nonnote_nondebug_insn (call);
gcc_assert (CALL_P (call) && SIBLING_CALL_P (call));
-   if (find_regno_fusage (call, USE, 3))
+   if (find_regno_fusage (call, USE, 3)
+   || is_indirect_tailcall_p (call))
  return true;
   }
   return false;
@@ -17753,7 +17770,7 @@ arm_get_frame_offsets (void)
  /* If it is safe to use r3, then do so.  This sometimes
 generates better code on Thumb-2 by avoiding the need to
 use 32-bit push/pop instructions.  */
-  if (! any_sibcall_uses_r3 ()
+  if (! any_sibcall_could_use_r3 ()
  && arm_size_return_regs () <= 12
  && (offsets->saved_regs_mask & (1 << 3)) == 0
   && (TARGET_THUMB2 || !current_tune->prefer_ldrd_strd))

RE: [PING]RE: [patch] cilkplus: Array notation for C patch

2013-05-22 Thread Iyer, Balaji V
Hi Jakub,
Please see my response below.
Thanks,

Balaji V. Iyer.

> -Original Message-
> From: Jakub Jelinek [mailto:ja...@redhat.com]
> Sent: Wednesday, May 22, 2013 2:15 AM
> To: Richard Henderson
> Cc: Iyer, Balaji V; 'Joseph S. Myers'; 'Aldy Hernandez'; 'gcc-patches'
> Subject: Re: [PING]RE: [patch] cilkplus: Array notation for C patch
> 
> On Tue, May 21, 2013 at 07:57:10PM -0700, Richard Henderson wrote:
> > >+  /* This will create the if statement label.  */
> > >+  if_stmt_label[ii] = build_decl (location, LABEL_DECL, NULL_TREE,
> > >+ void_type_node);
> > >+  DECL_CONTEXT (if_stmt_label[ii]) = current_function_decl;
> > >+  DECL_ARTIFICIAL (if_stmt_label[ii]) = 0;
> > >+  DECL_IGNORED_P (if_stmt_label[ii]) = 1;
> > >+
> > >+  /* This label statement will point to the loop body.  */
> > >+  body_label[ii] = build_decl (location, LABEL_DECL, NULL_TREE,
> > >+  void_type_node);
> > >+  DECL_CONTEXT (body_label[ii]) = current_function_decl;
> > >+  DECL_ARTIFICIAL (body_label[ii]) = 0;
> > >+  DECL_IGNORED_P (body_label[ii]) = 1;
> > >+  body_label_expr[ii] = build1 (LABEL_EXPR, void_type_node,
> > >+ body_label[ii])
> > >;
> > >+
> > >+  /* This will create the exit label, i.e. where the while loop will 
> > >branch
> > >+out of.  */
> > >+  exit_label[ii] = build_decl (location, LABEL_DECL, NULL_TREE,
> > >+  void_type_node);
> > >+  DECL_CONTEXT (exit_label[ii]) = current_function_decl;
> > >+  DECL_ARTIFICIAL (exit_label[ii]) = 0;
> > >+  DECL_IGNORED_P (exit_label[ii]) = 1;
> > >+  exit_label_expr[ii] = build1 (LABEL_EXPR, void_type_node, 
> > >exit_label[ii]);
> > >+}
> >
> > Is there any particular reason why you're open-coding the loop
> > expansion instead of using existing infrastructure like c_finish_loop?
> >
> > Yes, the specific example of c_finish_loop goes against the c++
> > sharing, but it's fairly easy to add new interfaces to c-common.h that
> > are implemented in the two front ends.
> 
> Furthermore, do you want to generate just normal loop out of it, or shouldn't
> we generate a #pragma omp simd loop out of it instead?
> Haven't read the spec if array notation guarantees lack of forward/backward
> loop dependencies and what are the restrictions on the calls you can do 
> inside of
> array notation.  Perhaps the lowering to normal vs. simd loop could depend on
> whether all called functions in the expression are elemental.

What you mentioned is a good performance optimization. It is something I have  
planned to do in future. But, the spec does not require the loop (or array 
notation expansion) to be vectorized.

> 
>   Jakub


RE: [PING]RE: [patch] cilkplus: Array notation for C patch

2013-05-22 Thread Iyer, Balaji V


> -Original Message-
> From: Richard Henderson [mailto:r...@redhat.com]
> Sent: Wednesday, May 22, 2013 11:30 AM
> To: Jakub Jelinek
> Cc: Iyer, Balaji V; 'Joseph S. Myers'; 'Aldy Hernandez'; 'gcc-patches'
> Subject: Re: [PING]RE: [patch] cilkplus: Array notation for C patch
> 
> On 2013-05-21 23:15, Jakub Jelinek wrote:
> > Furthermore, do you want to generate just normal loop out of it, or
> > shouldn't we generate a #pragma omp simd loop out of it instead?
> > Haven't read the spec if array notation guarantees lack of
> > forward/backward loop dependencies and what are the restrictions on
> > the calls you can do inside of array notation.  Perhaps the lowering
> > to normal vs. simd loop could depend on whether all called functions
> > in the expression are elemental.
> 
> The only overlap allowed is exact, i.e. a[0:5] = a[0:5] + b[1:5], but not 
> a[0:5] =
> a[1:5] + b[1:5].  So, yes, I believe that safelen is essentially unlimited.

Yes, the vectorlength/safelen can be the entire loop-size.

> 
> 
> r~


Re: [google gcc-4_7] coverage callback instrumentation (issue9630043)

2013-05-22 Thread Xinliang David Li
Looks ok to me in general.
1) the parameter name is not ideal -- it is not callonce.
2) it might be better to extend the callonce parameter into
-ftest-coverage option such as -ftest-coverage=exec_once?
3) need documentation in invoke.texi
4) watch out for long lines.

cc Teresa.

David

On Tue, May 21, 2013 at 3:28 PM, Rong Xu  wrote:
> This patch is to be used with customized coverage reduction.
>
> The functionalities are under two parameter options:
> --param=COVERAGE-CALLBACK={0|1}
> when enabled with 1, it injects a callback function for
> each arc counter. Default off.
> --param=COVERAGE-CALLONCE={0|1}
> when enabled with 1, it stops incrementing the arc counter
> when they flip to 1. Default off. Even with an extra condition
> check, this is drastically faster than the normal coverage test
> for highly threaded applications.
>
> It also fixes a bug in profile sampling where some counters
> (like those after the call stmt) are not sampled.
>
> It disables the may-uninit check if the warning will not be
> emitted.
>
> Tested with bootstrap, regressions test and google internal
> benchmarks.
>
> Thanks,
>
> -Rong
>
> 2013-05-21  Rong Xu  
>
> * gcc/tree-ssa-uninit.c (gate_warn_uninitialized): Disable if
> may-uninit warning will not emitted.
> * gcc/params.def : (PARAM_COVERAGE_CALLBACK) New.
> (PARAM_COVERAGE_CALLONCE): NEW.
> * gcc/profile.c (branch_prob): Not increment edge counter after
> first update.
> (tree_init_instrumentation_sampling): Ditto.
> (COVERAGE_CALLBACK_FUNC_NAME): New.
> (COVERAGE_INSERT_CALL): New.
> (insert_if_then): Update branch probability.
> (add_sampling_wrapper): Ditto.
> (add_callback_wrapper): New.
> (add_sampling_to_edge_counters): Make predicate insertion
> complete.
> (gimple_gen_edge_profiler): Add coverage callback function.
> (gimple_gen_ic_profiler): Fix trailing space.
> (gimple_gen_ic_func_topn_profiler): Ditto.
> (gimple_gen_dc_profiler): Ditto.
> * libgcc/libgcov.c (__coverage_callback): Add an empty callback
> function.
>
> Index: libgcc/libgcov.c
> ===
> --- libgcc/libgcov.c(revision 198952)
> +++ libgcc/libgcov.c(working copy)
> @@ -135,6 +135,14 @@ extern int gcov_dump_complete ATTRIBUTE_HIDDEN;
> these symbols will always need to be resolved.  */
>  void (*__gcov_dummy_ref1)() = &__gcov_reset;
>  void (*__gcov_dummy_ref2)() = &__gcov_dump;
> +
> +__attribute__((weak)) void
> +__coverage_callback (gcov_type funcdef_no __attribute__ ((unused)),
> + int edge_no __attribute__ ((unused)))
> +{
> +   /* nothing */
> +}
> +
>  #endif /* __GCOV_KERNEL__ */
>
>  /* Utility function for outputing errors.  */
> Index: gcc/profile.c
> ===
> --- gcc/profile.c   (revision 198952)
> +++ gcc/profile.c   (working copy)
> @@ -69,6 +69,7 @@ along with GCC; see the file COPYING3.  If not see
>  #include "timevar.h"
>  #include "cfgloop.h"
>  #include "tree-pass.h"
> +#include "params.h"
>
>  #include "profile.h"
>
> @@ -1480,7 +1481,8 @@ branch_prob (void)
>/* Commit changes done by instrumentation.  */
>gsi_commit_edge_inserts ();
>
> -  if (flag_profile_generate_sampling)
> +  if (flag_profile_generate_sampling
> +  || PARAM_VALUE (PARAM_COVERAGE_CALLONCE))
>  add_sampling_to_edge_counters ();
>  }
>
> Index: gcc/params.def
> ===
> --- gcc/params.def  (revision 198952)
> +++ gcc/params.def  (working copy)
> @@ -1060,6 +1060,16 @@ DEFPARAM (PARAM_PROFILE_GENERATE_SAMPLING_PERIOD,
>   "sampling rate with -fprofile-generate-sampling",
>   100, 0, 20)
>
> +DEFPARAM (PARAM_COVERAGE_CALLBACK,
> + "coverage-callback",
> + "callback a user-define function when for arc counter increments.",
> + 0, 0, 1)
> +
> +DEFPARAM (PARAM_COVERAGE_CALLONCE,
> + "coverage-callonce",
> + "Stop increment coverage counts once they become 1.",
> + 0, 0, 1)
> +
>  /* Used for debugging purpose. Tell the compiler to find
> the gcda file in the current directory.  */
>  DEFPARAM (PARAM_GCOV_DEBUG,
> Index: gcc/tree-profile.c
> ===
> --- gcc/tree-profile.c  (revision 198952)
> +++ gcc/tree-profile.c  (working copy)
> @@ -54,6 +54,15 @@ along with GCC; see the file COPYING3.  If not see
>  #include "target.h"
>  #include "output.h"
>
> +/* Default name for coverage callback function.  */
> +#define COVERAGE_CALLBACK_FUNC_NAME "__coverage_callback"
> +
> +/* If we insert a callback to edge instrumentation code. Avoid this
> +   for the callback function itself.  */
> +#define COVERAGE_INSERT_CALL   ((PARAM_VALUE (PARAM_COVERAGE_CALL

Re: [PING]RE: [patch] cilkplus: Array notation for C patch

2013-05-22 Thread Jeff Law

On 05/22/2013 09:37 AM, Iyer, Balaji V wrote:



Furthermore, do you want to generate just normal loop out of it, or
shouldn't we generate a #pragma omp simd loop out of it instead?
Haven't read the spec if array notation guarantees lack of
forward/backward loop dependencies and what are the restrictions on
the calls you can do inside of array notation.  Perhaps the
lowering to normal vs. simd loop could depend on whether all called
functions in the expression are elemental.


What you mentioned is a good performance optimization. It is
something I have  planned to do in future. But, the spec does not
require the loop (or array notation expansion) to be vectorized.
So if we can represent array notation as an OpenMP SIMD loop and re-use 
the OpenMP code generation, that's a significant win.  I realize the 
OpenMP SIMD stuff is still in-progress, but from a design standpoint 
we'd like to separate out the front-end issues from the code generation 
issues -- with the goal being that we can re-use the OpenMP SIMD path to 
also handle Cilk SIMD & Cilk array notation and possible bits of OpenACC.


Balaji, how feasible is it to rewire the code generation aspects of the 
array notation patch to instead feed into the OpenMP SIMD bits Jakub is 
working on?



Jeff


[PATCH, PR 57347] Do not create aggregate jump functions for bit-fields

2013-05-22 Thread Martin Jambor
Hi,

I have not intended aggregate jump functions to work with bit-fields
but apparently forgot to include the test to ignore them.  PR 57347
testcase gives a good example why they need to be avoided.  If we ever
decide to optimize for them too (and not just in IPA land), they
should be lowered earlier and jump functions can then be built for the
stores to the representatives.

The following patch disables their generation.  It passes bootstrap
and testing on x8664-linux on trunk, the same for the 4.8 branch is
currently underway.  OK for trunk and for the branch if it passes?

Thanks,

Martin


2013-05-21  Martin Jambor  

PR middle-end/57347
* tree.h (contains_bitfld_component_ref_p): Declare.
* tree-sra.c (contains_bitfld_comp_ref_p): Move...
* tree.c (contains_bitfld_component_ref_p): ...here.  Adjust its caller.
* ipa-prop.c (determine_known_aggregate_parts): Check that LHS does
not access a bit-field.  Assert all final offsets are byte-aligned.

testsuite/
* gcc.dg/ipa/pr57347.c: New test.

Index: src/gcc/ipa-prop.c
===
--- src.orig/gcc/ipa-prop.c
+++ src/gcc/ipa-prop.c
@@ -1327,7 +1327,9 @@ determine_known_aggregate_parts (gimple
 
   lhs = gimple_assign_lhs (stmt);
   rhs = gimple_assign_rhs1 (stmt);
-  if (!is_gimple_reg_type (rhs))
+  if (!is_gimple_reg_type (rhs)
+ || TREE_CODE (lhs) == BIT_FIELD_REF
+ || contains_bitfld_component_ref_p (lhs))
break;
 
   lhs_base = get_ref_base_and_extent (lhs, &lhs_offset, &lhs_size,
@@ -1418,6 +1420,7 @@ determine_known_aggregate_parts (gimple
{
  struct ipa_agg_jf_item item;
  item.offset = list->offset - arg_offset;
+ gcc_assert ((item.offset % BITS_PER_UNIT) == 0);
  item.value = unshare_expr_without_location (list->constant);
  jfunc->agg.items->quick_push (item);
}
Index: src/gcc/testsuite/gcc.dg/ipa/pr57347.c
===
--- /dev/null
+++ src/gcc/testsuite/gcc.dg/ipa/pr57347.c
@@ -0,0 +1,27 @@
+/* { dg-do run } */
+/* { dg-options "-O3" } */
+
+struct S1 { int f0; int f1 : 10; int f2 : 13; };
+int i;
+int *j = &i;
+
+static void
+foo (struct S1 s)
+{
+  int *p;
+  int l[88];
+  int **pp = &p;
+  *pp = &l[1];
+  l[0] = 1;
+  *j = 1 && s.f2;
+}
+
+int
+main ()
+{
+  struct S1 s = { 0, 0, 1 };
+  foo (s);
+  if (i != 1)
+__builtin_abort ();
+  return 0;
+}
Index: src/gcc/tree-sra.c
===
--- src.orig/gcc/tree-sra.c
+++ src/gcc/tree-sra.c
@@ -2998,23 +2998,6 @@ get_repl_default_def_ssa_name (struct ac
   return get_or_create_ssa_default_def (cfun, racc->replacement_decl);
 }
 
-/* Return true if REF has a COMPONENT_REF with a bit-field field declaration
-   somewhere in it.  */
-
-static inline bool
-contains_bitfld_comp_ref_p (const_tree ref)
-{
-  while (handled_component_p (ref))
-{
-  if (TREE_CODE (ref) == COMPONENT_REF
-  && DECL_BIT_FIELD (TREE_OPERAND (ref, 1)))
-return true;
-  ref = TREE_OPERAND (ref, 0);
-}
-
-  return false;
-}
-
 /* Return true if REF has an VIEW_CONVERT_EXPR or a COMPONENT_REF with a
bit-field field declaration somewhere in it.  */
 
@@ -3110,7 +3093,7 @@ sra_modify_assign (gimple *stmt, gimple_
 ???  This should move to fold_stmt which we simply should
 call after building a VIEW_CONVERT_EXPR here.  */
  if (AGGREGATE_TYPE_P (TREE_TYPE (lhs))
- && !contains_bitfld_comp_ref_p (lhs))
+ && !contains_bitfld_component_ref_p (lhs))
{
  lhs = build_ref_for_model (loc, lhs, 0, racc, gsi, false);
  gimple_assign_set_lhs (*stmt, lhs);
Index: src/gcc/tree.c
===
--- src.orig/gcc/tree.c
+++ src/gcc/tree.c
@@ -11785,4 +11785,21 @@ warn_deprecated_use (tree node, tree att
 }
 }
 
+/* Return true if REF has a COMPONENT_REF with a bit-field field declaration
+   somewhere in it.  */
+
+bool
+contains_bitfld_component_ref_p (const_tree ref)
+{
+  while (handled_component_p (ref))
+{
+  if (TREE_CODE (ref) == COMPONENT_REF
+  && DECL_BIT_FIELD (TREE_OPERAND (ref, 1)))
+return true;
+  ref = TREE_OPERAND (ref, 0);
+}
+
+  return false;
+}
+
 #include "gt-tree.h"
Index: src/gcc/tree.h
===
--- src.orig/gcc/tree.h
+++ src/gcc/tree.h
@@ -5974,6 +5974,7 @@ extern tree block_ultimate_origin (const
 extern tree get_binfo_at_offset (tree, HOST_WIDE_INT, tree);
 extern tree get_ref_base_and_extent (tree, HOST_WIDE_INT *,
 HOST_WIDE_INT *, HOST_WIDE_INT *);
+extern bool contains_bitfld_component_ref_p (const_tree);
 
 /* In tree-nested.c */
 extern tree build_addr (tree,

Re: [PATCH, AArch64] Allow insv_imm to handle bigger immediates via masking to 16-bits

2013-05-22 Thread Marcus Shawcroft
On 17 May 2013 19:20, Ian Bolton  wrote:
> The MOVK instruction is currently not used when operand 2 is
> more than 16 bits, which leads to sub-optimal code.
>
> This patch improves those situations by removing the check and
> instead masking down to 16 bits within the new "X" format specifier
> I added recently.
>
> OK for trunk?
>
> Cheers,
> Ian
>
>
> 2013-05-17  Ian Bolton  
>
> * config/aarch64/aarch64.c (aarch64_print_operand): Change the X
> format
> specifier to only display bottom 16 bits.
>
> * config/aarch64/aarch64.md (insv_imm): Allow any-sized
> immediate
> to match for operand 2, since it will be masked.


OK


Re: [PATCH] Fix incorrect discriminator assignment.

2013-05-22 Thread Cary Coutant
@@ -105,7 +105,7 @@ struct locus_descrim_hasher : typed_free_remove locus;
+  return LOCATION_LINE (item->locus);
 }

 /* Equality function for the locus-to-discriminator map.  A and B
@@ -114,7 +114,7 @@ locus_descrim_hasher::hash (const value_type *item
 inline bool
 locus_descrim_hasher::equal (const value_type *a, const compare_type *b)
 {
-  return a->locus == b->locus;
+  return LOCATION_LINE (a->locus) == LOCATION_LINE (b->locus);
 }


While you're in that part of the code, could you fix the spelling of
locus_descrim_hasher? Should be "locus_discrim_hasher".

-cary


Re: [PATCH] Fix incorrect discriminator assignment.

2013-05-22 Thread Dehao Chen
Sure, will update the patch for that.

Dehao

On Wed, May 22, 2013 at 9:40 AM, Cary Coutant  wrote:
> @@ -105,7 +105,7 @@ struct locus_descrim_hasher : typed_free_remove   inline hashval_t
>  locus_descrim_hasher::hash (const value_type *item)
>  {
> -  return item->locus;
> +  return LOCATION_LINE (item->locus);
>  }
>
>  /* Equality function for the locus-to-discriminator map.  A and B
> @@ -114,7 +114,7 @@ locus_descrim_hasher::hash (const value_type *item
>  inline bool
>  locus_descrim_hasher::equal (const value_type *a, const compare_type *b)
>  {
> -  return a->locus == b->locus;
> +  return LOCATION_LINE (a->locus) == LOCATION_LINE (b->locus);
>  }
>
>
> While you're in that part of the code, could you fix the spelling of
> locus_descrim_hasher? Should be "locus_discrim_hasher".
>
> -cary


Re: [PATCH] Disable profile-use if no .gcda file is found

2013-05-22 Thread Jan Hubicka
> If -fprofile-use is specified, but no .gcda file is found, reset all
> the flags back to the values they would have had if -fprofile-use was
> not specified.  Since the code path where -fprofile-use is on and
> .gcda files are not found is not a well tested pass, this will
> increase the robustness of FDO.  Further, in the event an ICE is found
> when doing FDO, this allows users to delete .gcda files as a work
> around without having to implement complex modifications to the build
> system.  To ensure the testsuite still passes (and is relevant), some
> tests which used -fprofile-use but had no .gcda file had to be moved
> so that they would have a .gcda file generated.  This also involved
> adding a main to these unit tests.
> 
> Bootstrapped and regtested on x86_64.  Ok for trunk?
> 
> 2009-09-29  Neil Vachharajani  
>   * coverage.c (read_counts_file): Disable profile use if no .gcda
>   file is found.
>   * opts.c (common_handle_option): Call set_profile_use instead of
>   directly setting parameters in -fprofile-use.
>   (set_profile_use): New function.
>   * opts.h (set_profile_use): New function.
>   * testsuite/g++.dg/tree-ssa/dom-invalid.C: Moved to
>   testsuite/g++.dg/tree-prof.
>   * testsuite/g++.dg/tree-prof/dom-invalid.C: See above.
>   * testsuite/gcc.dg/pr26570.c: Moved to
>   testsuite/gcc.dg/tree-prof.
>   * testsuite/gcc.dg/tree-prof/pr26570.c: See above.  main added.
>   * testsuite/gcc.dg/pr32773.c: Moved to
>   testsuite/gcc.dg/tree-prof.
>   * testsuite/gcc.dg/tree-prof/pr32773.c: See above.  main added.
>   

> +
> +/* Set FLAG_PROFILE_USE and dependent flags based on VALUE.  This
> +   function is used to handle the -f(no)profile-use option as well as
> +   to reset flags if a .gcda file is not found.  */
> +
> +void
> +set_profile_use (bool value, bool force)
> +{
> +  flag_profile_use = value;
> +  if (!flag_branch_probabilities_set || force)
> +flag_branch_probabilities = value;
> +  if (!flag_profile_values_set || force)
> +flag_profile_values = value;
> +  if (!flag_unroll_loops_set)
> +flag_unroll_loops = value;
> +  if (!flag_peel_loops_set)
> +flag_peel_loops = value;
> +  if (!flag_tracer_set)
> +flag_tracer = value;
> +  if (!flag_value_profile_transformations_set || force)
> +flag_value_profile_transformations = value;
> +  if (!flag_inline_functions_set)
> +flag_inline_functions = value;
> +  if (!flag_ipa_cp_set)
> +flag_ipa_cp = value;
> +  if (!flag_ipa_cp_clone_set)
> +{
> +  if (!flag_ipa_cp_set || flag_ipa_cp)
> + flag_ipa_cp_clone = value;
> +}
> +  if (!flag_predictive_commoning_set)
> +flag_predictive_commoning = value;
> +  if (!flag_unswitch_loops_set)
> +flag_unswitch_loops = value;
> +  if (!flag_gcse_after_reload_set)
> +flag_gcse_after_reload = value;
> +}

It seems to make sense to disable profile feedback when GCDA is missing. 
However I wonder if we can't
figure out before finishing of parsing of flags?
Machine descriptions may want to alter options depending on profile-use so this 
approach seems fragile.

Honza


Re: [PATCH, rs6000] power8 patches, patch #7, quad/byte/half-word atomic instructions

2013-05-22 Thread Michael Meissner
This patch adds support for the byte, half-word, and quad-word atomic memory
operations that were added in ISA 2.07 (i.e. power8).  Like the other patches,
this passes bootstrap and had no regressions in make check.  Is it ok to commit
this patch after the previous 6 patches have been applied?

[gcc]
2013-05-22  Michael Meissner  
Pat Haugen 
Peter Bergner 

* config/rs6000/rs6000.c (emit_load_locked): Add support for
power8 byte, half-word, and quad-word atomic instructions.
(emit_store_conditional): Likewise.
(rs6000_expand_atomic_compare_and_swap): Likewise.
(rs6000_expand_atomic_op): Likewise.

* config/rs6000/sync.md (larx): Add new modes for power8.
(stcx): Likewise.
(AINT): New mode iterator to include TImode as well as normal
integer modes on power8.
(fetchop_pred): Use int_reg_operand instead of gpc_reg_operand so
that VSX registers are not considered.  Use AINT mode iterator
instead of INT1 to allow inclusion of quad word atomic operations
on power8.
(load_locked): Likewise.
(store_conditional): Likewise.
(atomic_compare_and_swap): Likewise.
(atomic_exchange): Likewise.
(atomic_nand): Likewise.
(atomic_fetch_): Likewise.
(atomic_nand_fetch): Likewise.
(mem_thread_fence): Use gen_loadsync_ instead of enumerating
each type.
(ATOMIC): On power8, add QImode, HImode modes.
(load_locked_si): Varients of load_locked for QI/HI
modes that promote to SImode.
(load_lockedti): Convert TImode arguments to PTImode, so that we
get a guaranteed even/odd register pair.
(load_lockedpti): Likewise.
(store_conditionalti): Likewise.
(store_conditionalpti): Likewise.

* config/rs6000/rs6000.md (QHI): New mode iterator for power8
atomic load/store instructions.
(HSI): Likewise.

[gcc/testsuite]
2013-05-22  Michael Meissner  
Pat Haugen 
Peter Bergner 

* gcc.target/powerpc/atomic-p7.c: New file, add tests for atomic
load/store instructions on power7, power8.
* gcc.target/powerpc/atomic-p8.c: Likewise.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797
Index: gcc/testsuite/gcc.target/powerpc/atomic-p7.c
===
--- gcc/testsuite/gcc.target/powerpc/atomic-p7.c(revision 0)
+++ gcc/testsuite/gcc.target/powerpc/atomic-p7.c(revision 0)
@@ -0,0 +1,207 @@
+/* { dg-do compile { target { powerpc*-*-* && lp64 } } } */
+/* { dg-skip-if "" { powerpc*-*-darwin* } { "*" } { "" } } */
+/* { dg-require-effective-target powerpc_vsx_ok } */
+/* { dg-options "-mcpu=power7 -O2" } */
+/* { dg-final { scan-assembler-not "lbarx" } } */
+/* { dg-final { scan-assembler-not "lharx" } } */
+/* { dg-final { scan-assembler-times "lwarx" 18 } } */
+/* { dg-final { scan-assembler-times "ldarx" 6 } } */
+/* { dg-final { scan-assembler-not "lqarx" } } */
+/* { dg-final { scan-assembler-not "stbcx" } } */
+/* { dg-final { scan-assembler-not "sthcx" } } */
+/* { dg-final { scan-assembler-times "stwcx" 18 } } */
+/* { dg-final { scan-assembler-times "stdcx" 6 } } */
+/* { dg-final { scan-assembler-not "stqcx" } } */
+/* { dg-final { scan-assembler-times "bl __atomic" 6 } } */
+/* { dg-final { scan-assembler-times "isync" 12 } } */
+/* { dg-final { scan-assembler-times "lwsync" 8 } } */
+/* { dg-final { scan-assembler-not "mtvsrd" } } */
+/* { dg-final { scan-assembler-not "mtvsrwa" } } */
+/* { dg-final { scan-assembler-not "mtvsrwz" } } */
+/* { dg-final { scan-assembler-not "mfvsrd" } } */
+/* { dg-final { scan-assembler-not "mfvsrwz" } } */
+
+/* Test for the byte atomic operations on power8 using lbarx/stbcx.  */
+char
+char_fetch_add_relaxed (char *ptr, int value)
+{
+  return __atomic_fetch_add (ptr, value, __ATOMIC_RELAXED);
+}
+
+char
+char_fetch_sub_consume (char *ptr, int value)
+{
+  return __atomic_fetch_sub (ptr, value, __ATOMIC_CONSUME);
+}
+
+char
+char_fetch_and_acquire (char *ptr, int value)
+{
+  return __atomic_fetch_and (ptr, value, __ATOMIC_ACQUIRE);
+}
+
+char
+char_fetch_ior_release (char *ptr, int value)
+{
+  return __atomic_fetch_or (ptr, value, __ATOMIC_RELEASE);
+}
+
+char
+char_fetch_xor_acq_rel (char *ptr, int value)
+{
+  return __atomic_fetch_xor (ptr, value, __ATOMIC_ACQ_REL);
+}
+
+char
+char_fetch_nand_seq_cst (char *ptr, int value)
+{
+  return __atomic_fetch_nand (ptr, value, __ATOMIC_SEQ_CST);
+}
+
+/* Test for the half word atomic operations on power8 using lharx/sthcx.  */
+short
+short_fetch_add_relaxed (short *ptr, int value)
+{
+  return __atomic_fetch_add (ptr, value, __ATOMIC_RELAXED);
+}
+
+short
+short_fetch_sub_consume (short *ptr, int value)
+{
+  return __atomic_fetch_sub (ptr, value, __ATOMIC_CONSUME);
+}
+

website patch committed: GCC 4.8.1 does not implement Go 1.1

2013-05-22 Thread Ian Lance Taylor
I unfortunately have not had time to keep up with gccgo, so I'm adding a
note that GCC 4.8.1 does not yet implement Go 1.1.  Committed to the web
site.

Ian

Index: changes.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-4.8/changes.html,v
retrieving revision 1.117
diff -u -r1.117 changes.html
--- changes.html	20 May 2013 16:38:06 -	1.117
+++ changes.html	22 May 2013 17:01:39 -
@@ -441,9 +441,8 @@
 
 Go
   
-GCC 4.8.0 implements a preliminary version of the upcoming Go
-  1.1 release.  The library support is not quite complete, due to
-  release timing.
+GCC 4.8.0 and 4.8.1 implement a preliminary version of the Go
+  1.1 release.  The library support is not quite complete.
 Go has been tested on GNU/Linux and Solaris platforms for
   various processors including x86, x86_64, PowerPC, SPARC, and
   Alpha.  It may work on other platforms as well.


Re: [PATCH][gensupport] Add optional attributes field to define_cond_exec

2013-05-22 Thread Richard Henderson
On 05/22/2013 02:54 AM, Kyrylo Tkachov wrote:
> From what I understand, using define_subst would mean creating a 
> define_subst for every pattern that can be "predicable"? There are at least
> 600 predicable patterns in the arm backend, so that would be infeasible.

No, define_subst works across patterns, keyed by attributes.  Exactly like
cond_exec, really.

But what you ought to be able to do right now is

(define_subst "ds_predicable"
  [(match_operand 0)]
  ""
  [(cond_exec (blah) (match_dup 0))])

(define_subst_attr "ds_predicable_enabled" "ds_predicable" "no" "yes"0

(define_insn "blah"
  [(blah)]
  ""
  "@
   blah
   blah"
  [(set_attr "ds_predicable" "yes")
   (set_attr "ds_predicated" "")])

At which point you can define "enabled" in terms of ds_predicated plus whatever.

With a small bit of work we ought to be able to move that ds_predicated
attribute to the define_subst itself, so that you don't have to replicate that
set_attr line N times.  I think that's more or less what you were suggesting
with your cond_exec extension, yes?



r~


Re: [PING]RE: [patch] cilkplus: Array notation for C patch

2013-05-22 Thread Richard Henderson

On 2013-05-22 08:18, Iyer, Balaji V wrote:

The overall function names are same, but the components inside it function
differs greatly from C and C++. For example, in C++ I can't use
build_modify_expr, but build_x_modify_expr. Also, I need to handle
overloaded function types, and that requires further checking. Similarly,
the trees are different from C and C++, and so I have to do more checks and
handle them differently in C++. In my mind, this seem to be the most
straight-forward and legible way to do it.


Really, that's surprising:

/* For use from the C common bits.  */
tree
build_modify_expr (location_t /*location*/,
   tree lhs, tree /*lhs_origtype*/,
   enum tree_code modifycode,
   location_t /*rhs_location*/, tree rhs,
   tree /*rhs_origtype*/)
{
  return cp_build_modify_expr (lhs, modifycode, rhs, tf_warning_or_error);
}

I can definitely see how function overloading could potentially get in the way.

Although by delaying expansion of ARRAY_NOTATION_REF, given that the TREE_TYPE 
of the ARRAY_NOTATION_REF is not some complex artificial slice type but the 
element type of the array, I wonder how much of the overloaded function 
selection would just happen automatically?


Which reminds me: What is ARRAY_NOTATION_TYPE for?  Isn't it always the same as 
the TREE_TYPE of the ARRAY_NOTATION_REF?



Before I started my implementation, I did briefly consider using
c_finish_loop, but adding multiple expressions got a bit complicated. For
example, if I have 2 increment/condition expressions (my initial
implementation had it but later on I eliminated that need), I had to create
a statement list and then add them using append_to_statement_list() and then
pass that into c_finish_loop. Also, if I am not mistaken, I still had to
create some of the individual labels when using c_finish_loop(). Overall, I
didn't find much code-size decrease, and I found this a bit more straight
forward for me. I apologize if I am mistaken.


Well, append_to_statement_list etc was certainly not needed.  Mirroring exactly 
how you'd write it in C, with a comma operator also works.  That's a 
COMPOUND_EXPR, fyi.


There are continue/break label arguments to c_finish_loop, but they may be (and 
often are) null.  They will only be non-null if the relevant statements 
actually exist inside the loop.  C.f. c_finish_bc_stmt and c_break_label.


You're right about it not saving *that* much code, but the point is more about 
if we make improvements to normal C loop representation -- such as to support 
#pragma simd -- then we want to be able to take advantage of that here.



Are these changes a blocking issue for acceptance into trunk? Or, would it
be ok to accept it as-is and then I make these changes at a later date?


I'm leaning to honoring the advice you were given last year and accepting the 
patch more or less in its current form.



r~


Re: [PATCH] Disable profile-use if no .gcda file is found

2013-05-22 Thread Richard Biener
Jan Hubicka  wrote:

>> If -fprofile-use is specified, but no .gcda file is found, reset all
>> the flags back to the values they would have had if -fprofile-use was
>> not specified.  Since the code path where -fprofile-use is on and
>> .gcda files are not found is not a well tested pass, this will
>> increase the robustness of FDO.  Further, in the event an ICE is
>found
>> when doing FDO, this allows users to delete .gcda files as a work
>> around without having to implement complex modifications to the build
>> system.  To ensure the testsuite still passes (and is relevant), some
>> tests which used -fprofile-use but had no .gcda file had to be moved
>> so that they would have a .gcda file generated.  This also involved
>> adding a main to these unit tests.
>> 
>> Bootstrapped and regtested on x86_64.  Ok for trunk?
>> 
>> 2009-09-29  Neil Vachharajani  
>>  * coverage.c (read_counts_file): Disable profile use if no .gcda
>>  file is found.
>>  * opts.c (common_handle_option): Call set_profile_use instead of
>>  directly setting parameters in -fprofile-use.
>>  (set_profile_use): New function.
>>  * opts.h (set_profile_use): New function.
>>  * testsuite/g++.dg/tree-ssa/dom-invalid.C: Moved to
>>  testsuite/g++.dg/tree-prof.
>>  * testsuite/g++.dg/tree-prof/dom-invalid.C: See above.
>>  * testsuite/gcc.dg/pr26570.c: Moved to
>>  testsuite/gcc.dg/tree-prof.
>>  * testsuite/gcc.dg/tree-prof/pr26570.c: See above.  main added.
>>  * testsuite/gcc.dg/pr32773.c: Moved to
>>  testsuite/gcc.dg/tree-prof.
>>  * testsuite/gcc.dg/tree-prof/pr32773.c: See above.  main added.
>>  
>
>> +
>> +/* Set FLAG_PROFILE_USE and dependent flags based on VALUE.  This
>> +   function is used to handle the -f(no)profile-use option as well
>as
>> +   to reset flags if a .gcda file is not found.  */
>> +
>> +void
>> +set_profile_use (bool value, bool force)
>> +{
>> +  flag_profile_use = value;
>> +  if (!flag_branch_probabilities_set || force)
>> +flag_branch_probabilities = value;
>> +  if (!flag_profile_values_set || force)
>> +flag_profile_values = value;
>> +  if (!flag_unroll_loops_set)
>> +flag_unroll_loops = value;
>> +  if (!flag_peel_loops_set)
>> +flag_peel_loops = value;
>> +  if (!flag_tracer_set)
>> +flag_tracer = value;
>> +  if (!flag_value_profile_transformations_set || force)
>> +flag_value_profile_transformations = value;
>> +  if (!flag_inline_functions_set)
>> +flag_inline_functions = value;
>> +  if (!flag_ipa_cp_set)
>> +flag_ipa_cp = value;
>> +  if (!flag_ipa_cp_clone_set)
>> +{
>> +  if (!flag_ipa_cp_set || flag_ipa_cp)
>> +flag_ipa_cp_clone = value;
>> +}
>> +  if (!flag_predictive_commoning_set)
>> +flag_predictive_commoning = value;
>> +  if (!flag_unswitch_loops_set)
>> +flag_unswitch_loops = value;
>> +  if (!flag_gcse_after_reload_set)
>> +flag_gcse_after_reload = value;
>> +}
>
>It seems to make sense to disable profile feedback when GCDA is
>missing. However I wonder if we can't
>figure out before finishing of parsing of flags?
>Machine descriptions may want to alter options depending on profile-use
>so this approach seems fragile.

You'd have to move handling of fprofile-use to the driver then. And think about 
what to do with lto. Btw, why bother to reset any flags besides of those 
controlling profile computation at all?

Richard.

>Honza




[C++ Patch] PR 57352

2013-05-22 Thread Paolo Carlini

Hi,

avoiding this ICE on invalid seems just matter of setting up the 
parser->type_definition_forbidden_message string.


Tested x86_64-linux.

Thanks,
Paolo.

///
/cp
2013-05-22  Paolo Carlini  

PR c++/57352
* parser.c (cp_parser_conversion_type_id): Set up
parser->type_definition_forbidden_message before calling
cp_parser_type_specifier_seq.

/testsuite
2013-05-22  Paolo Carlini  

PR c++/57352
* g++.dg/parse/crash62.C: New.
Index: cp/parser.c
===
--- cp/parser.c (revision 199204)
+++ cp/parser.c (working copy)
@@ -11715,13 +11715,22 @@ cp_parser_conversion_type_id (cp_parser* parser)
   cp_decl_specifier_seq type_specifiers;
   cp_declarator *declarator;
   tree type_specified;
+  const char *saved_message;
 
   /* Parse the attributes.  */
   attributes = cp_parser_attributes_opt (parser);
+
+  saved_message = parser->type_definition_forbidden_message;
+  parser->type_definition_forbidden_message
+= G_("types may not be defined in a conversion-type-id");
+
   /* Parse the type-specifiers.  */
   cp_parser_type_specifier_seq (parser, /*is_declaration=*/false,
/*is_trailing_return=*/false,
&type_specifiers);
+
+  parser->type_definition_forbidden_message = saved_message;
+
   /* If that didn't work, stop.  */
   if (type_specifiers.type == error_mark_node)
 return error_mark_node;
Index: testsuite/g++.dg/parse/crash62.C
===
--- testsuite/g++.dg/parse/crash62.C(revision 0)
+++ testsuite/g++.dg/parse/crash62.C(working copy)
@@ -0,0 +1,6 @@
+// PR c++/57352
+
+struct x
+{
+  operator class {} ();  // { dg-error "types|expected" }
+};


[Patch, Fortran] Enable the generation of the FINALization wrapper function

2013-05-22 Thread Tobias Burnus
Pre-remark: This patch does *not* enable finalization or polymorphic 
deallocation.


* * *

Dear all,

The attached patch is a bit boring and invasive, but it paves the way to 
FINAL support.


Changes of technical kind:

* Changed ABI for CLASS's virtual table (due to _final) - and, hence, it 
bumps the .mod version
* The finalization wrapper is now generated (this should not but might 
lead to ICEs)

* It also causes that the virtual table is now more often generated

New feature:

_copy no longer deallocates the "dst" argument. Doing so lead to bogus 
finalization with ALLOCATE (exposed with the pending FINAL patch). As a 
sideeffect, memset could be removed and CALLOC could be replased by 
MALLOC (minute performance advantage). In order to keep the deallocation 
in gfc_trans_class_array_init_assign, there is now a call to the 
finalization wrapper.


Next steps:
* Add end-of-scope/intent(out) deallocation for polymorphic arrays
* Enable FINAL parsing
* Stepwise enabling for polymorphic deallocation/finalization
* Fix issues with ELEMENTAL(+optional) with intent(out)
* Fix some issues related to intrinsic assignment
* Fix fallout of any of those items

Build and regtested on x86-64-gnu-linux.
OK for the trunk?

Tobias
2013-05-22  Tobias Burnus  

	* class.c (finalize_component): Fix coarray array refs.
	(gfc_find_derived_vtab): _copy's dst is now intent(inout).
	Enable finalization-wrapper generation.
	* module.c (MOD_VERSION): Bump.
	(gfc_dump_module, gfc_use_module): Remove empty line in .mod.
	* trans-array.c (gfc_conv_descriptor_token): Accept nonrestricted
	void pointer.
	(gfc_array_allocate, structure_alloc_comps): Don't nullify for
	BT_CLASS allocations.
	* trans-stmt.c (gfc_trans_allocate): Ditto.
	* trans-expr.c (gfc_trans_class_array_init_assign): Call _final
	before _copy.
	
2013-05-22  Tobias Burnus  

	* gfortran.dg/auto_dealloc_2.f90: Update _free count in the dump.
	* gfortran.dg/class_19.f03: Ditto.

diff --git a/gcc/fortran/class.c b/gcc/fortran/class.c
index 349f494..c41b95a 100644
--- a/gcc/fortran/class.c
+++ b/gcc/fortran/class.c
@@ -832,17 +832,18 @@ finalize_component (gfc_expr *expr, gfc_symbol *derived, gfc_component *comp,
   ref->u.c.component = comp;
   e->ts = comp->ts;
 
-  if (comp->attr.dimension
+  if (comp->attr.dimension || comp->attr.codimension
   || (comp->ts.type == BT_CLASS && CLASS_DATA (comp)
-	  && CLASS_DATA (comp)->attr.dimension))
+	  && (CLASS_DATA (comp)->attr.dimension
+	  || CLASS_DATA (comp)->attr.codimension)))
 {
   ref->next = gfc_get_ref ();
   ref->next->type = REF_ARRAY;
-  ref->next->u.ar.type = AR_FULL;
   ref->next->u.ar.dimen = 0;
   ref->next->u.ar.as = comp->ts.type == BT_CLASS ? CLASS_DATA (comp)->as
 			: comp->as;
   e->rank = ref->next->u.ar.as->rank;
+  ref->next->u.ar.type = e->rank ? AR_FULL : AR_ELEMENT;
 }
 
   /* Call DEALLOCATE (comp, stat=ignore).  */
@@ -2363,7 +2364,7 @@ gfc_find_derived_vtab (gfc_symbol *derived)
 		  dst->attr.flavor = FL_VARIABLE;
 		  dst->attr.dummy = 1;
 		  dst->attr.artificial = 1;
-		  dst->attr.intent = INTENT_OUT;
+		  dst->attr.intent = INTENT_INOUT;
 		  gfc_set_sym_referenced (dst);
 		  copy->formal->next = gfc_get_formal_arglist ();
 		  copy->formal->next->sym = dst;
@@ -2382,9 +2383,6 @@ gfc_find_derived_vtab (gfc_symbol *derived)
 		 components and the calls to finalization subroutines.
 		 Note: The actual wrapper function can only be generated
 		 at resolution time.  */
-	/* FIXME: Enable ABI-breaking "_final" generation.  */
-	if (0)
-	{
 	  if (!gfc_add_component (vtype, "_final", &c))
 		goto cleanup;
 	  c->attr.proc_pointer = 1;
@@ -2392,7 +2390,6 @@ gfc_find_derived_vtab (gfc_symbol *derived)
 	  c->tb = XCNEW (gfc_typebound_proc);
 	  c->tb->ppc = 1;
 	  generate_finalization_wrapper (derived, ns, tname, c);
-	}
 
 	  /* Add procedure pointers for type-bound procedures.  */
 	  if (!derived->attr.unlimited_polymorphic)
@@ -2651,7 +2648,7 @@ gfc_find_intrinsic_vtab (gfc_typespec *ts)
 	  dst->ts.kind = ts->kind;
 	  dst->attr.flavor = FL_VARIABLE;
 	  dst->attr.dummy = 1;
-	  dst->attr.intent = INTENT_OUT;
+	  dst->attr.intent = INTENT_INOUT;
 	  gfc_set_sym_referenced (dst);
 	  copy->formal->next = gfc_get_formal_arglist ();
 	  copy->formal->next->sym = dst;
diff --git a/gcc/fortran/module.c b/gcc/fortran/module.c
index e6a4cd7..9486b28 100644
--- a/gcc/fortran/module.c
+++ b/gcc/fortran/module.c
@@ -80,10 +80,8 @@ along with GCC; see the file COPYING3.  If not see
 #define MODULE_EXTENSION ".mod"
 
 /* Don't put any single quote (') in MOD_VERSION, if you want it to be
-   recognized.  
-   TODO: When the version is bumped, remove the extra empty line at
-   the beginning of module files.  */
-#define MOD_VERSION "10"
+   recognized.  */
+#define MOD_VERSION "11"
 
 
 /* Structure that describes a position within a module file.  */
@@ -5571,7 +5569,7 @@ gfc

[wwwdocs] correct email address for assignments

2013-05-22 Thread Benjamin De Kosnik

Fix typo in assignment email address, to make gcc match GNU site:

https://www.gnu.org/prep/maintain/maintain.html#Legal-Matters

It would be awesome if now GCC assigments were magically less work.

-benjamin2013-05-22  Benjamin Kosnik  

* htdocs/contribute.html: Use ass...@gnu.org, match main GNU docs.


Index: htdocs/contribute.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/contribute.html,v
retrieving revision 1.79
diff -c -p -r1.79 contribute.html
*** htdocs/contribute.html	22 Oct 2011 13:46:22 -	1.79
--- htdocs/contribute.html	22 May 2013 19:05:44 -
*** the relevant forms.  The most common for
*** 44,50 
  specific change, an assignment for all future changes, and an employer
  disclaimer, if an employer or school owns work created by the developer.
  It's a good idea to send
! assignme...@gnu.org a copy of
  your request.
  
  If a contributor is reluctant to sign a copyright assignment for a
--- 44,50 
  specific change, an assignment for all future changes, and an employer
  disclaimer, if an employer or school owns work created by the developer.
  It's a good idea to send
! ass...@gnu.org a copy of
  your request.
  
  If a contributor is reluctant to sign a copyright assignment for a


Re: [Patch, Fortran] Create valid temporary variable to avoid assembler errors

2013-05-22 Thread Thomas Koenig

Hi Tobias,



I have now changed the mangling, see attached patch. (The test file uses
finalization - hence, I do not include it into the patch. I will include
it in the FINAL patch.)

Build and regtested on x86-64-gnu-linux.
OK for the trunk?


OK (obvious really).

Thanks for the patch!

Thomas


RE: [PING]RE: [patch] cilkplus: Array notation for C patch

2013-05-22 Thread Iyer, Balaji V


> -Original Message-
> From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-
> ow...@gcc.gnu.org] On Behalf Of Jeff Law
> Sent: Wednesday, May 22, 2013 12:20 PM
> To: Iyer, Balaji V
> Cc: Jakub Jelinek; Richard Henderson; 'Joseph S. Myers'; 'Aldy Hernandez'; 
> 'gcc-
> patches'
> Subject: Re: [PING]RE: [patch] cilkplus: Array notation for C patch
> 
> On 05/22/2013 09:37 AM, Iyer, Balaji V wrote:
> >>>
> >> Furthermore, do you want to generate just normal loop out of it, or
> >> shouldn't we generate a #pragma omp simd loop out of it instead?
> >> Haven't read the spec if array notation guarantees lack of
> >> forward/backward loop dependencies and what are the restrictions on
> >> the calls you can do inside of array notation.  Perhaps the lowering
> >> to normal vs. simd loop could depend on whether all called functions
> >> in the expression are elemental.
> >
> > What you mentioned is a good performance optimization. It is something
> > I have  planned to do in future. But, the spec does not require the
> > loop (or array notation expansion) to be vectorized.
> So if we can represent array notation as an OpenMP SIMD loop and re-use the
> OpenMP code generation, that's a significant win.  I realize the OpenMP SIMD
> stuff is still in-progress, but from a design standpoint we'd like to 
> separate out
> the front-end issues from the code generation issues -- with the goal being 
> that
> we can re-use the OpenMP SIMD path to also handle Cilk SIMD & Cilk array
> notation and possible bits of OpenACC.
> 
> Balaji, how feasible is it to rewire the code generation aspects of the array
> notation patch to instead feed into the OpenMP SIMD bits Jakub is working on?

Hi Jeff,
Yes, converting the array notation expansion to #pragma simd (or 
#pragma omp simd) trees will be beneficial performance wise. But, it will 
require semi-significant re-write and this can destabilize a currently stable 
implementation. IMHO, for now I would like to keep array notation as is and let 
us see how it interacts with other flavors of gcc backends. At the moment, we 
do not see any reason for instability in this, but we have not done any 
significant testing on non x86 backends. After OMP 4 hits the main-line and 
everything is stable, we could proceed as you indicated.

Thanks,

Balaji V. Iyer.


> 
> 
> Jeff


Re: [Patch, Fortran] Create valid temporary variable to avoid assembler errors

2013-05-22 Thread Steven Bosscher
On Wed, May 22, 2013 at 4:20 PM, Tobias Burnus wrote:
> With one Fortran file, I get the following assembler errors:
>
> /tmp/cc28epKK.s:2075: Error: junk `@1.2304+16' after expression
>
> That's due to the way a temporary variable is generated. While that variable
> is local to the procedure, the name somehow escapes into the assembler file.
> The dump looks as follows.
>
>   static struct foo DA@0;
>   static struct universal DA@1;
>   static struct universal DA@2;
> ...
> class.30._data = &DA@0.foo_parent.universal.counter;
>
> I have now changed the mangling, see attached patch. (The test file uses
> finalization - hence, I do not include it into the patch. I will include it
> in the FINAL patch.)
>
> Build and regtested on x86-64-gnu-linux.
> OK for the trunk?

Hm, DA_F_0 is a valid variable name. So can't your patch cause clashes
with user names?

Ciao!
Steven


Re: [PING]RE: [patch] cilkplus: Array notation for C patch

2013-05-22 Thread Jeff Law

On 05/22/2013 01:13 PM, Iyer, Balaji V wrote:


Hi Jeff, Yes, converting the array notation expansion to #pragma simd
(or #pragma omp simd) trees will be beneficial performance wise. But,
it will require semi-significant re-write and this can destabilize a
currently stable implementation. IMHO, for now I would like to keep
array notation as is and let us see how it interacts with other
flavors of gcc backends. At the moment, we do not see any reason for
instability in this, but we have not done any significant testing on
non x86 backends. After OMP 4 hits the main-line and everything is
stable, we could proceed as you indicated.
I was thinking more from maintenance standpoint.  It seems kindof silly 
to have multiple blobs of code generating loops.  The fact that it gets 
us vectorized code more often is just icing on the cake.


This is certainly something that I want to see happen, though I'll defer 
to Richard, Joseph, Aldy, etc on the timing of that change.


jeff


Re: [rs6000] Add register save/restore routines for cross

2013-05-22 Thread Bernhard Reutner-Fischer

On 22 May 2013 16:36:52 David Edelsohn  wrote:

On Wed, May 22, 2013 at 10:35 AM, Alan Modra  wrote:
> On Wed, May 22, 2013 at 10:05:47AM -0400, David Edelsohn wrote:
>> Why does cross need the functions in libgcc and not provided by the linker?
>
> Only the ppc64 linker provides save/restore functions magically.

Okay, then the patch is okay.


Hmm. I cannot look right now, does that sound like a cure for PR53803 too?


Sent with AquaMail for Android
http://www.aqua-mail.com




A trivial script to scrub ChangeLog changes from "git show"

2013-05-22 Thread David Malcolm
I've been writing some automation around testing gcc patches, and kept
running into ChangeLog conflicts, so I wrote the following to make it
easier.

The attached one-liner wraps "git show" into a form that omits changes
to ChangeLog files, for use when generating patches from a git repo,
so that you can run things like:

# Get last commit, omitting ChangeLog changes:
./contrib/git-show-non-changelog.sh

# Get a specific commit, omitting ChangeLog changes:
./contrib/git-show-non-changelog.sh 60623f1616b3144310f432174ebbb3e2cc6dff28

# Get all my non-pushed changes, omitting ChangeLog changes:
./contrib/git-show-non-changelog.sh origin/master..

etc

If there's a pre-existing way of doing this, I'm all ears, naturally [1]

OK for trunk? (and of course, the contrib/ChangeLog entry will no doubt
conflict, irony of ironies).

Dave

[1] gnulib appears to have a "git-merge-changelog", but that requires
building/installing it at the receiving end
commit 120b040899e61ca3b9214bd71a545038025e7e68
Author: David Malcolm 
Date:   Wed May 22 15:17:35 2013 -0400

contrib/
	* git-show-non-changelog.sh: New.

diff --git a/contrib/ChangeLog b/contrib/ChangeLog
index 1fd90b7..b0b63f3 100644
--- a/contrib/ChangeLog
+++ b/contrib/ChangeLog
@@ -1,3 +1,7 @@
+2013-05-22  David Malcolm  
+
+	* git-show-non-changelog.sh: New.
+
 2013-05-16  Rainer Orth  
 
 	* config-list.mk (LIST): Add -enable-obsolete for
diff --git a/contrib/git-show-non-changelog.sh b/contrib/git-show-non-changelog.sh
new file mode 100755
index 000..5b6ec99
--- /dev/null
+++ b/contrib/git-show-non-changelog.sh
@@ -0,0 +1,22 @@
+#!/bin/sh
+
+# "git show OBJECT" but omitting ChangeLog files
+
+# Copyright (C) 2013  Free Software Foundation, Inc.
+# Contributed by David Malcolm 
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, write to the Free Software
+# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301  USA
+
+git show $@ $(git show $@ | diffstat -l -p1 | grep -v ChangeLog)


Re: A trivial script to scrub ChangeLog changes from "git show"

2013-05-22 Thread Andrew Pinski
On Wed, May 22, 2013 at 1:10 PM, David Malcolm  wrote:
> I've been writing some automation around testing gcc patches, and kept
> running into ChangeLog conflicts, so I wrote the following to make it
> easier.
>
> The attached one-liner wraps "git show" into a form that omits changes
> to ChangeLog files, for use when generating patches from a git repo,
> so that you can run things like:
>
> # Get last commit, omitting ChangeLog changes:
> ./contrib/git-show-non-changelog.sh
>
> # Get a specific commit, omitting ChangeLog changes:
> ./contrib/git-show-non-changelog.sh 60623f1616b3144310f432174ebbb3e2cc6dff28
>
> # Get all my non-pushed changes, omitting ChangeLog changes:
> ./contrib/git-show-non-changelog.sh origin/master..
>
> etc
>
> If there's a pre-existing way of doing this, I'm all ears, naturally [1]
>
> OK for trunk? (and of course, the contrib/ChangeLog entry will no doubt
> conflict, irony of ironies).

Note changelog entries in emails should not be part of the patch anyways.

Thanks,
Andrew

>
> Dave
>
> [1] gnulib appears to have a "git-merge-changelog", but that requires
> building/installing it at the receiving end


Re: [Patch] Fix PR56780: --disable-install-libiberty still installs libiberty.a

2013-05-22 Thread Ian Lance Taylor
On Wed, Apr 3, 2013 at 7:03 AM, Matt Burgess
 wrote:
>
> 2013-04-03 Matt Burgess 
>
> other/PR56780
> * libiberty/configure.ac:
> Move test for --enable-install-libiberty outside of the
> 'with_target_subdir' test so that it actually gets run.
> Add output messages to show the test result.
>
> * libiberty/configure:
> Regenerate.
>
> * libiberty/Makefile.in (install_to_libdir):
> Place the installation of the libiberty library in the same guard as
> that used for the headers to prevent it being installed unless requested
> via --enable-install-libiberty.

Note that in the ChangeLog entry you should not have any blank lines,
and you should not break the lines after the colon.  See the existing
entries and follow their format.

> +[  --enable-install-libiberty   Install headers for end users],

Change this to "Install headers and library for end users" or
something like that.

This is OK with those changes.

Thanks, and sorry for the slow review.

Ian


Re: [PATCH, rs6000] power8 patches, patch #8, power8 load fusion + misc.

2013-05-22 Thread Michael Meissner
This is the final set of patches that I have available right now.  We will be
doing additional patches over the summer.

The primary thing in this patch is to add support for load fusion in the
power8.  Power8 has two types of fusion:

addi ,,
lxvd2x ,,

and:

addis ,,
ld ,()

These instructions must be adjacent to each other, and in the case of fusion in
loading GPRs, the register being loaded must be the base register to load from
it.  In this patch, I added peepholes to cover this case.  In the future, I
plan on reworking the problem by being more liberal in what addresses are
allowed before reload/lra, and in lra, generate these forms.  However, these
peepholes do help find fusion cases.

I also added two switches (-mlra and -mconstrain-regs) that were used in
converting the powerpc port to use the LRA register allocator.  Note, at the
present time, Vlad and I are going back on forth on additional things needed
for LRA.

This patch bootstraps and has no regressions in the test suite.  Is it ok to
check in after the previous 7 patches have been applied?

FWIW, patches 1-2 that were approved have now been checked in.

2013-05-22  Michael Meissner  

* config/rs6000/predicates.md (fusion_gpr_addis): New predicates
to support power8 load fusion.
(fusion_gpr_mem_load): Likewise.

* config/rs6000/rs6000-modes.def (PTImode): Update a comment.

* config/rs6000/rs6000-protos.h (fusion_gpr_load_p): New
declarations for power8 load fusion.
(emit_fusion_gpr_load): Likewise.

* config/rs6000/rs6000.opt (-mlra): New undocumented switch to
turn on using the LRA register allocator.
(-mconstrain-regs): New undocumented switch to constrain
non-integer values from being loaded into the LR or CTR registers.

* config/rs6000/rs6000.c (TARGET_LRA_P): If -mlra, turn on using
the LRA register allocator.
(rs6000_lra_p): Likewise.
(rs6000_hard_regno_mode_ok): Allow DI/DD/SF/SD modes in altivec
registers if power8.  If -mconstrain-regs, only allow int modes
into LR, CTR, and special purpose registers.
(rs6000_debug_reg_global): Print -mlra, -mconstrain-regs status if
debugging.
(rs6000_init_hard_regno_mode_ok): Mark that SFmode can use Altivec
registers in the future.
(rs6000_option_override_internal): If tuning for power8, turn on
fusion mode by default.  Turn on sign extending fusion mode if
normal fusion mode is on, and we are at -O2 or -O3.
(rs6000_opt_masks): Add -mlra, -mconstrain-regs.
(fusion_gpr_load_p): New function, return true if we can fuse an
addis instruction with a dependent load to a GPR.
(emit_fusion_gpr_load): Emit the instructions for power8 load
fusion to GPRs.

* config/rs6000/vsx.md (VSX load fusion peepholes): Add peepholes
to fuse together an addi instruction with a VSX load instruction.

* config/rs6000/rs6000.md (GPR load fusion peepholes): Add
peepholes to fuse an addis instruction with a load to a GPR base
register, if the addis instruction is dead after the load, by
using the register to be loaded for the addis.  If we are
supporting sign extending fusions, convert sign extending loads to
zero extending loads and an explicit sign extension.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797
Index: gcc/config/rs6000/predicates.md
===
--- gcc/config/rs6000/predicates.md (revision 199168)
+++ gcc/config/rs6000/predicates.md (working copy)
@@ -1654,3 +1654,99 @@ (define_predicate "small_toc_ref"
 
   return GET_CODE (op) == UNSPEC && XINT (op, 1) == UNSPEC_TOCREL;
 })
+
+;; Match the first insn (addis) in fusing the combination of addis and loads to
+;; GPR registers on power8.  Power8 currently will only do the fusion if the
+;; top 11 bits of the addis value are all 1's or 0's.
+(define_predicate "fusion_gpr_addis"
+  (match_code "const_int,high,plus")
+{
+  HOST_WIDE_INT value;
+  rtx int_const;
+
+  /* 32-bit is not done yet.  */
+  if (TARGET_ELF && !TARGET_POWERPC64)
+return 0;
+
+  if (GET_CODE (op) == HIGH)
+return 1;
+
+  if (CONST_INT_P (op))
+int_const = op;
+
+  else if (GET_CODE (op) == PLUS
+  && base_reg_operand (XEXP (op, 0), Pmode)
+  && CONST_INT_P (XEXP (op, 1)))
+int_const = XEXP (op, 1);
+
+  else
+return 0;
+
+  value = INTVAL (int_const);
+  if ((value & (HOST_WIDE_INT)0x) != 0)
+return 0;
+
+  if ((value & (HOST_WIDE_INT)0x) == 0)
+return 0;
+
+  return (IN_RANGE (value >> 16, -32, 31));
+})
+
+;; Match the second insn (lbz, lhz, lwz, ld) in fusing the combination of addis
+;; and loads to GPR registers on power8.
+(define_predicate "fusio

Re: [C++ Patch] for c++/54537

2013-05-22 Thread Fabien Chêne
Hi,

Again sorry for such a delay. Paolo subtly ping'ed me on DR39
(c++/13590), and I remember that this bug (c++/54537) was blocking the
patch I wrote last year for DR39.

2012/11/15 Jason Merrill :
[...]
> I was only thinking of the primary signature; putting
>
> extern "C" double pow (double, double);
>
> in any namespace will result in calling the C library function if that
> declaration is selected by overload resolution.

Ah, I see, it works. Then we end up with three solutions:
1) remove the pow(double,double) overload
2) add a specialization template <> pow(double,double)
3) add an extern "C" declaration to refer to the C library function

I don't have a strong preference, I would say that all solutions are
more or less equivalent. In the end, builtins are just reached in
different ways I guess, depending on the optimisations enabled.
Given that Paolo already OKed the solution 1), I would go for it.

Shall I commit if it still passes regtests ?

Thank you for your patience...

--
Fabien


Re: [C++ Patch] PR 57352

2013-05-22 Thread Jason Merrill

OK.

Jason


Re: Fix PR 53743 and other -freorder-blocks-and-partition failures

2013-05-22 Thread Teresa Johnson
Revised patch included below. The spacing of my pasted in patch text
looks funky again, let me know if you want the patch as an attachment
instead.

I addressed all of Steven's comments, except for the suggestion to use
gcc_assert
instead of error() in verify_hot_cold_block_grouping() to keep this consistent
with the rest of the verify_flow_info subroutines (let me know if this is ok).

The other main changes:
(1) Added several test cases (cloned from the torture subdirectories,
where I manually
built/ran with FDO and -freorder-blocks-and-partition with both the
current trunk and
my fixed trunk compiler, and was able to expose some failures I fixed.
(2) Changed existing tree-prof tests that used
-freorder-blocks-and-partition to be
built with -O2 instead of -O, so that partitioning actually kicks in.
(3) Fixed a couple of failures in the new
verify_hot_cold_block_grouping() checks
exposed by the torture tests I ran manually with splitting (2 of the
tests cloned
to tree-prof in this patch). One was in computed goto where we were
too aggressive
about cloning crossing edges, and the other was in rtl_split_edge
called from the "stack"
pass which was not correctly inserting the new bb in the correct partition since
bb layout is complete at that point.

Re-tested on x86_64-unknown-linux-gnu with bootstrap and profiledbootstrap
builds and regression testing. Re-built/ran cpu2006int with profile
feedback and -freorder-blocks-and-partition enabled.

Ok for trunk?

Thanks!
Teresa

2013-05-22  Teresa Johnson  

* ifcvt.c (find_if_case_1): Replace BB_COPY_PARTITION with assert
as this is now done by redirect_edge_and_branch_force.
* function.c (thread_prologue_and_epilogue_insns): Insert new bb after
barriers, and fix interaction with splitting.
* emit-rtl.c (try_split): Copy REG_CROSSING_JUMP notes.
* cfgcleanup.c (try_forward_edges): Fix early return value to properly
reflect changes made in the routine.
* bb-reorder.c (emit_barrier_after_bb): Move to cfgrtl.c.
(fix_up_fall_thru_edges): Remove incorrect check for bb layout order
since this is called in cfglayout mode, and replace partition fixup
with assert as that is now done by force_nonfallthru_and_redirect.
(add_reg_crossing_jump_notes): Handle the fact that some jumps may
already be marked with region crossing note.
(insert_section_boundary_note): Make non-static, gate on flag
has_bb_partition, rewrite to also check for multiple partitions.
(rest_of_handle_reorder_blocks): Remove call to
insert_section_boundary_note, now done later during free_cfg.
(duplicate_computed_gotos): Don't duplicate partition crossing edge.
* bb-reorder.h (insert_section_boundary_note): Declare.
* Makefile.in (cfgrtl.o): Depend on bb-reorder.h
* cfgrtl.c (rest_of_pass_free_cfg): If partitions exist
invoke insert_section_boundary_note.
(try_redirect_by_replacing_jump): Remove unnecessary
check for region crossing note.
(fixup_partition_crossing): New function.
(rtl_redirect_edge_and_branch): Fixup partition boundaries.
(emit_barrier_after_bb): Move here from bb-reorder.c, handle insertion
in non-cfglayout mode.
(force_nonfallthru_and_redirect): Fixup partition boundaries,
remove old code that tried to do this. Emit barrier correctly
when we are in cfglayout mode.
(last_bb_in_partition): New function.
(rtl_split_edge): Correctly fixup partition boundaries.
(commit_one_edge_insertion): Remove old code that tried to
fixup region crossing edge since this is now handled in
split_block, and set up insertion point correctly since
block may now end in a jump.
(verify_hot_cold_block_grouping): Guard against checking when not in
linearized RTL mode.
(rtl_verify_edges): Add checks for incorrect/missing REG_CROSSING_JUMP
notes.
(rtl_verify_flow_info_1): Move verify_hot_cold_block_grouping to
rtl_verify_flow_info, so not called in cfglayout mode.
(rtl_verify_flow_info): Move verify_hot_cold_block_grouping here.
(fixup_reorder_chain): Remove old code that attempted to fixup region
crossing note as this is now handled in force_nonfallthru_and_redirect.
(duplicate_insn_chain): Don't duplicate switch section notes.
(rtl_can_remove_branch_p): Remove unnecessary check for region crossing
note.
* basic-block.h (emit_barrier_after_bb): Declare.
* testsuite/gcc.dg/tree-prof/va-arg-pack-1.c: Cloned from c-torture, made
into -freorder-blocks-and-partition test.
* testsuite/gcc.dg/tree-prof/comp-goto-1.c: Ditto.
* testsuite/gcc.dg/tree-prof/20041218-1.c: Ditto.
* testsuite/gcc.dg/tree-prof/pr52027.c: Use -O2.
* testsuite/gcc.dg/tree-prof/pr50907.c: Ditto.
* testsuite/gcc.dg/tree-prof/pr45354.c: Ditto.
* testsuite/g++.dg/tree-prof/partition2.C: Ditto.
* testsuite/g++.dg/tree-prof/partition3.C: Ditto.

Index: ifcvt.c
===
--- ifcvt.c (revi

Re: [patch] Preserve the CFG until after final

2013-05-22 Thread Steven Bosscher
On Wed, May 22, 2013 at 10:21 AM, Eric Botcazou wrote:
>> That is only partially true. Currently the transition is already de
>> facto going on: Just look at how many back ends use
>> compute_bb_for_insn to re-initialize the BLOCK_FOR_INSN pointers right
>> after pass_free_cfg (it's usually the first thing they do in the
>> machine-reorg pass).
>
> Yes, and we should do something about it.  Btw, why is the line removed from
> ia64_reorg in the patch (and not mentioned in the ChangeLog)?

Mistake. I wrote these patches on an ia64 machine, an older version of
the patch was also bootstrapped&tested there.


>> Some ports never call free_bb_or_insn after that,
>> and expect BLOCK_FOR_INSN to be valid in 'final'. One of those ports
>> is i386, look at where BLOCK_FOR_INSN is used in i386.c (in functions
>> deciding what asm output templates to use).
>
> AFAICS it's only used for splitters.  And using BLOCK_FOR_INSN after the last
> split pass (pass_split_for_shorten_branches) is dubious.

Actually it's even wrong *during* pass_split_for_shorten_branches
right now. It may work in some ports but not in others, depending
whether the port's machine reorg has done its TLC on the CFG or not.
This can lead to very interesting behavior. I chased a bug recently
where a splitter during split5 used the DF_INSN_USES cache on an insn
introduced by another splitter. Because split5 uses
split_all_insns_noflow, the insn from the splitter somehow ended up a
NULL BLOCK_FOR_INSN (I'm still not sure how that happened, but it
did). It turns out df-scan ignores insns with a NULL BLOCK_FOR_INSN,
and hence NULL DF_INSN_USES, and that lead to a dependency violation
in my splitter because a USE got overlooked!

Ah, the crazy stuff one can do after machine reorg. It's the Wild West
of GCC :-)


>  Here's the list:
>
> ./frv/frv.c:   || BLOCK_FOR_INSN (insn) == ce_info->else_bb
>
> Only used during if-conversion.
>
> ./rs6000/rs6000.c:  bb = BLOCK_FOR_INSN (label);
>
> Only used during compute_alignments.
>
> ./spu/spu.c:  bitmap_set_bit (blocks, BLOCK_FOR_INSN (branch)->index);
>
> Only used during md_reorg.
>
> ./c6x/c6x.c:  BLOCK_FOR_INSN (bundle) = BLOCK_FOR_INSN (slot[0]);
>
> Only used during md_reorg.
>
> ./mips/mips.c:  basic_block bb = BLOCK_FOR_INSN (insn);
> ./mips/mips.c:  /* Restore the BLOCK_FOR_INSN pointers, which are needed by
> DF.  Also during
>
> Only used for splitting decision, deal with null BLOCK_FOR_INSN.

Yes, these are all fine.


> ./i386/i386.c:  basic_block bb = start ? BLOCK_FOR_INSN (start) : NULL;
> ./i386/i386.c:  basic_block bb = BLOCK_FOR_INSN (insn);
> ./i386/i386.c:  basic_block bb = start ? BLOCK_FOR_INSN (start) : NULL;
> ./i386/i386.c:  basic_block bb = BLOCK_FOR_INSN (insn);
> ./i386/i386.c:  basic_block bb = BLOCK_FOR_INSN (insn);
> ./i386/i386.c:  rtx start = BB_HEAD (BLOCK_FOR_INSN (insn));
> ./i386/i386.c:  basic_block bb =  BLOCK_FOR_INSN (insn);
>
> Only used for splitting decision (and scheduling).

Sadly no. Most of these (the *agu* ones) are also reached from final.
For example:

movdi_internal -> ix86_use_lea_for_mov -> ix86_lea_outperforms ->
distance_non_agu_define -> distance_non_agu_define_in_bb

Likewise for movsi_internal, and zero_extendsidi2. For the
mov?i_internal define_insns, it's been like that since at least
r181077 (November 2011).

I must admit I was surprised by that, too. It may have been
coincidence that it worked when this patch was (IMHO wrongfully)
accepted. Someone got away with it because i386 calls
compute_bb_for_insn in its machine-reorg, and does *not* call
free_bb_for_insn, leaving the BLOCK_FOR_INSN pointers in place all the
way through final. There are no passes between machine-reorg and final
that run for i386 and damage the CFG because split5 doesn't run on
i386 (because of STACK_REGS) and the other passes, like
shorten_branches, don't modify the insns chain.

Most ports that call compute_bb_for_insn do not call free_bb_for_insn:

$ grep compute_bb_for_insn config/*/*.c
config/arm/arm.c:  compute_bb_for_insn ();
config/bfin/bfin.c:  compute_bb_for_insn ();
config/c6x/c6x.c:  compute_bb_for_insn ();
config/frv/frv.c:  compute_bb_for_insn ();
config/i386/i386.c:  compute_bb_for_insn ();
config/ia64/ia64.c:  compute_bb_for_insn ();
config/mep/mep.c:  compute_bb_for_insn ();
config/mips/mips.c:compute_bb_for_insn ();
config/mn10300/mn10300.c:  compute_bb_for_insn ();
config/picochip/picochip.c:  compute_bb_for_insn ();
config/spu/spu.c:  compute_bb_for_insn ();
config/spu/spu.c:  compute_bb_for_insn ();
config/tilegx/tilegx.c:  compute_bb_for_insn ();
config/tilepro/tilepro.c:  compute_bb_for_insn ();
$ grep free_bb_for_insn config/*/*.c
config/mips/mips.c:  free_bb_for_insn ();
config/spu/spu.c:  free_bb_for_insn ();
config/spu/spu.c:  free_bb_for_insn ();

Most of these, eh, let's call them "compute_bb_for_insn ports", they
come out of the machine reorg pass with an insns chain that doesn't
pass verify_flow_info. The most commo

RE: [PING]RE: [patch] cilkplus: Array notation for C patch

2013-05-22 Thread Iyer, Balaji V

> -Original Message-
> From: Richard Henderson [mailto:r...@redhat.com]
> Sent: Wednesday, May 22, 2013 1:34 PM
> To: Iyer, Balaji V
> Cc: 'Joseph S. Myers'; 'Aldy Hernandez'; 'gcc-patches'
> Subject: Re: [PING]RE: [patch] cilkplus: Array notation for C patch
> 
> On 2013-05-22 08:18, Iyer, Balaji V wrote:
> > The overall function names are same, but the components inside it
> > function differs greatly from C and C++. For example, in C++ I can't
> > use build_modify_expr, but build_x_modify_expr. Also, I need to handle
> > overloaded function types, and that requires further checking.
> > Similarly, the trees are different from C and C++, and so I have to do
> > more checks and handle them differently in C++. In my mind, this seem
> > to be the most straight-forward and legible way to do it.
> 
> Really, that's surprising:
> 
> /* For use from the C common bits.  */
> tree
> build_modify_expr (location_t /*location*/,
> tree lhs, tree /*lhs_origtype*/,
> enum tree_code modifycode,
> location_t /*rhs_location*/, tree rhs,
> tree /*rhs_origtype*/) {
>return cp_build_modify_expr (lhs, modifycode, rhs, tf_warning_or_error); }
> 
> I can definitely see how function overloading could potentially get in the 
> way.

There are couple other ones such as build_x_unary_op, build_x_conditional_expr 
which does not have such a mapping that I know of.

> 
> Although by delaying expansion of ARRAY_NOTATION_REF, given that the
> TREE_TYPE of the ARRAY_NOTATION_REF is not some complex artificial slice
> type but the element type of the array, I wonder how much of the overloaded
> function selection would just happen automatically?
> 
> Which reminds me: What is ARRAY_NOTATION_TYPE for?  Isn't it always the
> same as the TREE_TYPE of the ARRAY_NOTATION_REF?

Yes, they are both the same. A while back, I found a couple corner cases where 
the TREE_TYPE of the array notations inside __sec_reduce functions that was 
getting changed. This is a storage location that will be untouched so that I 
can get the original type. 

> 
> > Before I started my implementation, I did briefly consider using
> > c_finish_loop, but adding multiple expressions got a bit complicated.
> > For example, if I have 2 increment/condition expressions (my initial
> > implementation had it but later on I eliminated that need), I had to
> > create a statement list and then add them using
> > append_to_statement_list() and then pass that into c_finish_loop.
> > Also, if I am not mistaken, I still had to create some of the
> > individual labels when using c_finish_loop(). Overall, I didn't find
> > much code-size decrease, and I found this a bit more straight forward for 
> > me. I
> apologize if I am mistaken.
> 
> Well, append_to_statement_list etc was certainly not needed.  Mirroring 
> exactly
> how you'd write it in C, with a comma operator also works.  That's a
> COMPOUND_EXPR, fyi.
> 
> There are continue/break label arguments to c_finish_loop, but they may be
> (and often are) null.  They will only be non-null if the relevant statements
> actually exist inside the loop.  C.f. c_finish_bc_stmt and c_break_label.
> 
> You're right about it not saving *that* much code, but the point is more 
> about if
> we make improvements to normal C loop representation -- such as to support
> #pragma simd -- then we want to be able to take advantage of that here.

OK. I see your point. I will look into changing them.

> 
> > Are these changes a blocking issue for acceptance into trunk? Or,
> > would it be ok to accept it as-is and then I make these changes at a later 
> > date?
> 
> I'm leaning to honoring the advice you were given last year and accepting the
> patch more or less in its current form.

Thank you!

> 
> 
> r~


Re: [patch] Preserve the CFG until after final

2013-05-22 Thread Jeff Law

On 05/22/2013 03:17 PM, Steven Bosscher wrote:



Ah, the crazy stuff one can do after machine reorg. It's the Wild West
of GCC :-)
I still look at that hook as the contribution I most wish I'd never 
made.  The abuses are, umm, amazing.


jeff



[Patch, Fortran] Deallocate CLASS(...),INTENT(OUT),allocatable arrays

2013-05-22 Thread Tobias Burnus

A rather simple patch found while testing the draft finalization patch.

For a "class(...), allocatable, intent(out)" dummy argument, the actual 
argument has to be deallocated. That worked for scalar polymorphic vars, 
but not for polymorphic arrays.


Build and regtested on x86-64-gnu-linux.
OK for the trunk?

Tobias

PS: I think there is also a related issue for polymorphic arrays and 
end-of-scope. I will handle it in a follow-up patch.
2013-05-22  Tobias Burnus  

	* trans-expr.c (gfc_conv_procedure_call): Deallocate
	polymorphic arrays for allocatable intent(out) dummies.

2013-05-22  Tobias Burnus  

	* gfortran.dg/class_array_16.f90: New.

diff --git a/gcc/fortran/trans-expr.c b/gcc/fortran/trans-expr.c
index f8d99fd..ba878e0 100644
--- a/gcc/fortran/trans-expr.c
+++ b/gcc/fortran/trans-expr.c
@@ -4334,6 +4334,64 @@ gfc_conv_procedure_call (gfc_se * se, gfc_symbol * sym,
 	{
 	  /* Pass a class array.  */
 	  gfc_conv_expr_descriptor (&parmse, e);
+
+	  /* If an ALLOCATABLE dummy argument has INTENT(OUT) and is
+		 allocated on entry, it must be deallocated.  */
+	  if (fsym->attr.intent == INTENT_OUT
+		  && CLASS_DATA (fsym)->attr.allocatable)
+		{
+		  stmtblock_t block;
+		  tree ptr;
+
+		  gfc_init_block  (&block);
+		  ptr = parmse.expr;
+		  ptr = gfc_class_data_get (ptr);
+
+		  tmp = gfc_deallocate_with_status (ptr, NULL_TREE,
+		NULL_TREE, NULL_TREE,
+		NULL_TREE, true, e,
+		false);
+		  gfc_add_expr_to_block (&block, tmp);
+		  tmp = fold_build2_loc (input_location, MODIFY_EXPR,
+	 void_type_node, ptr,
+	 null_pointer_node);
+		  gfc_add_expr_to_block (&block, tmp);
+
+		  if (UNLIMITED_POLY (fsym))
+		gfc_add_modify (&block, ptr,
+fold_convert (TREE_TYPE (ptr),
+		  null_pointer_node));
+		  else
+		{
+		  gfc_symbol *vtab;
+		  vtab = gfc_find_derived_vtab (fsym->ts.u.derived);
+		  tmp = gfc_get_symbol_decl (vtab);
+		  tmp = gfc_build_addr_expr (NULL_TREE, tmp);
+		  ptr = gfc_class_vptr_get (parmse.expr);
+		  gfc_add_modify (&block, ptr,
+  fold_convert (TREE_TYPE (ptr), tmp));
+		}
+		  gfc_add_expr_to_block (&block, tmp);
+
+		  if (fsym->attr.optional
+		  && e->expr_type == EXPR_VARIABLE
+		  && (!e->ref
+			  || (e->ref->type == REF_ARRAY
+			  && !e->ref->u.ar.type != AR_FULL))
+		  && e->symtree->n.sym->attr.optional)
+		{
+		  tmp = fold_build3_loc (input_location, COND_EXPR,
+void_type_node,
+gfc_conv_expr_present (e->symtree->n.sym),
+gfc_finish_block (&block),
+build_empty_stmt (input_location));
+		}
+		  else
+		tmp = gfc_finish_block (&block);
+
+		  gfc_add_expr_to_block (&se->pre, tmp);
+}
+
 	  /* The conversion does not repackage the reference to a class
 	 array - _data descriptor.  */
 	  gfc_conv_class_to_class (&parmse, e, fsym->ts, false,
--- /dev/null	2013-05-22 07:37:12.475061900 +0200
+++ gcc/gcc/testsuite/gfortran.dg/class_array_16.f90	2013-05-22 23:12:43.271073681 +0200
@@ -0,0 +1,20 @@
+! { dg-do compile }
+! { dg-options "-fdump-tree-original" }
+!
+module m
+  type t
+  end type t
+contains
+  subroutine sub(x)
+ class(t), allocatable, intent(out) :: x(:)
+  end subroutine sub
+end module m
+
+use m
+class(t), save, allocatable :: y(:)
+call sub(y)
+end
+
+! { dg-final { scan-tree-dump-times "__builtin_free" 1 "original" } }
+! { dg-final { scan-tree-dump-times "finally" 0 "original" } }
+! { dg-final { cleanup-tree-dump "original" } }


  1   2   >