date:20150325

Re: [CHKP, PATCH] Fix instrumented indirect calls with propagated pointers

2015-03-25 Thread Ilya Enkovich

2015-03-24 17:06 GMT+03:00 Jakub Jelinek :
> On Tue, Mar 24, 2015 at 12:22:27PM +0300, Ilya Enkovich wrote:
>> 2015-03-24 11:33 GMT+03:00 Jakub Jelinek :
>> > On Thu, Mar 19, 2015 at 11:29:44AM +0300, Ilya Enkovich wrote:
>> >> +  /* We might propagate instrumented function pointer into
>> >> + not instrumented function and vice versa.  In such a
>> >> + case we need to either fix function declaration or
>> >> + remove bounds from call statement.  */
>> >> +  if (callee)
>> >> +skip_bounds = chkp_redirect_edge (e);
>> >
>> > I just want to say that I'm not really excited about all this compile time
>> > cost that is added everywhere unconditionally for chkp.
>> > I think much better would be to guard most of it with proper option check
>> > first and only do the more expensive part if the option has been used.
>>
>> Agreed, overhead for non-instrumented code should be minimized.
>> Unfortunately, because of LTO there is no option check I can use to
>> guard chkp code. Currently it is allowed to pass -fcheck-pointer-bounds
>> for IL generation and not pass it for final code generation. I suppose
>> I could set this (or some new) flag if I see an instrumented node when
>> reading the cgraph, and then use it to guard chkp-related code. Would
>> that be acceptable?
>
> The question is what you want to do in the LTO case for the different cases,
> in particular a TU compiled with -fcheck-pointer-bounds and LTO link without
> that, or TU compiled without -fcheck-pointer-bounds and LTO link with it.
> It could be handled as LTO incompatible option, where lto1 would error out
> if you try to mix -fcheck-pointer-bounds with -fno-check-pointer-bounds
> code, or e.g. similar to var-tracking, you could consider adjusting the IL
> upon LTO reading if some TU has been built with -fcheck-pointer-bounds
> and the LTO link is -fno-check-pointer-bounds.  Dunno what will happen
> with -fno-check-pointer-bounds TUs LTO linked with -fcheck-pointer-bounds.
> Or another possibility is to or in -fcheck-pointer-bounds from all TUs.

Mixing instrumented and non-instrumented TUs is allowed. All
instrumentation passes run before the LTO link. The only code
generation problem when instrumented code is linked without
-fcheck-pointer-bounds is that the chkp_finish_file call, which
generates static constructors, is disabled. I think I should just set
flag_check_pointer_bounds if I see any instrumented symbol on LTO read.
That would trigger the chkp_finish_file call when required and would be
available as a guard for chkp-related code.

>
>> Maybe replace attribute usage with a new flag in tree_decl_with_vis 
>> structure?
>
> Depends, might be better to stick it into cgraph_node instead, depends on
> whether you are querying it already early in the FEs or just during GIMPLE
> when the cgraph node should be created too.

Flag in cgraph_node should work. I'll have a look.

Thanks,
Ilya

>
> Jakub

Re: [CHKP, PATCH] Fix instrumented indirect calls with propagated pointers

2015-03-25 Thread Jakub Jelinek

On Wed, Mar 25, 2015 at 11:05:17AM +0300, Ilya Enkovich wrote:
> > The question is what you want to do in the LTO case for the different cases,
> > in particular a TU compiled with -fcheck-pointer-bounds and LTO link without
> > that, or TU compiled without -fcheck-pointer-bounds and LTO link with it.
> > It could be handled as LTO incompatible option, where lto1 would error out
> > if you try to mix -fcheck-pointer-bounds with -fno-check-pointer-bounds
> > code, or e.g. similar to var-tracking, you could consider adjusting the IL
> > upon LTO reading if some TU has been built with -fcheck-pointer-bounds
> > and the LTO link is -fno-check-pointer-bounds.  Dunno what will happen
> > with -fno-check-pointer-bounds TUs LTO linked with -fcheck-pointer-bounds.
> > Or another possibility is to or in -fcheck-pointer-bounds from all TUs.
> 
> Mixing instrumented and non-instrumented TUs is allowed. All
> instrumentation passes run before the LTO link. The only code
> generation problem when instrumented code is linked without
> -fcheck-pointer-bounds is that the chkp_finish_file call, which
> generates static constructors, is disabled. I think I should just set
> flag_check_pointer_bounds if I see any instrumented symbol on LTO read.
> That would trigger the chkp_finish_file call when required and would be
> available as a guard for chkp-related code.

Thus perhaps oring the flag_check_pointer_bounds option from all the TUs is
the desirable behavior for LTO?
I think Richard or Honza would know where would be the best spot to do that.

Jakub
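
A minimal sketch of the ORing semantics proposed above. It is not GCC code: `TranslationUnit`, `LinkOptions` and `merge_check_pointer_bounds` are hypothetical names, and where GCC would actually merge the option is left open, as in the thread. The sketch only shows the intended result: the link-time flag ends up set if it was passed to the LTO link or if any TU was built with -fcheck-pointer-bounds.

```cpp
#include <vector>

// Hypothetical model of one translation unit as seen at LTO read time.
struct TranslationUnit {
  bool has_instrumented_symbol;  // TU was built with -fcheck-pointer-bounds
};

// Hypothetical model of the options in effect for the LTO link.
struct LinkOptions {
  bool check_pointer_bounds;  // stands in for flag_check_pointer_bounds
};

// OR the per-TU state into the link-time flag, so that one instrumented
// TU is enough to enable the chkp-related work for the whole link.
inline LinkOptions merge_check_pointer_bounds(
    LinkOptions link, const std::vector<TranslationUnit>& tus) {
  for (const TranslationUnit& tu : tus)
    link.check_pointer_bounds =
        link.check_pointer_bounds || tu.has_instrumented_symbol;
  return link;
}
```

With this model, a link without the option over one instrumented and one plain TU still yields `check_pointer_bounds == true`, which is what would let the flag serve as a cheap guard for the non-instrumented case.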

Re: [PATCH][3/3][PR65460] Mark offloaded functions as parallelized

2015-03-25 Thread Thomas Schwinge

Hi Tom!

On Sat, 21 Mar 2015 23:30:51 +0100, Tom de Vries  wrote:
> On 20-03-15 12:38, Tom de Vries wrote:
> > On 19-03-15 12:05, Tom de Vries wrote:
> >> On 18-03-15 18:22, Tom de Vries wrote:
> >>> this patch fixes PR65460.
> >>>
> >>> The patch marks offloaded functions as parallelized, which means the 
> >>> parloops
> >>> pass no longer attempts to modify that function.
> >>
> >> Updated patch to postpone mark_parallelized_function until the 
> >> corresponding
> >> cgraph_node is available, to ensure it works with the updated
> >> mark_parallelized_function from patch 2/3.
> >
> > Updated to eliminate mark_parallelized_function.
> >
> > Bootstrapped and reg-tested on x86_64.
> >
> > OK for stage4?
> 
> as requested, applied to gomp-4_0-branch.

Thanks!


Committed to gomp-4_0-branch in r221652:

commit 68c0851cb7ce420d5d938d7f0d9247adf79190a5
Author: tschwinge 
Date:   Wed Mar 25 08:28:09 2015 +

Use ChangeLog.gomp on gomp-4_0-branch.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gomp-4_0-branch@221652 
138bc75d-0d04-0410-961f-82ee72b054a4
---
 gcc/ChangeLog  |   34 --
 gcc/ChangeLog.gomp |   34 ++
 2 files changed, 34 insertions(+), 34 deletions(-)

diff --git gcc/ChangeLog gcc/ChangeLog
index 48dca87..e474fc8 100644
--- gcc/ChangeLog
+++ gcc/ChangeLog
@@ -1,37 +1,3 @@
-2015-03-21  Tom de Vries  
-
-   PR tree-optimization/65460
-   * omp-low.c (expand_omp_target): Set parallelized_function on
-   cgraph_node for child_fn.
-
-2015-03-21  Tom de Vries  
-
-   backport from trunk:
-   2015-03-21  Tom de Vries  
-
-   PR tree-optimization/65458
-   * cgraph.c (cgraph_node::dump): Handle parallelized_function field.
-   * cgraph.h (cgraph_node): Add parallelized_function field.
-   * lto-cgraph.c (lto_output_node): Write parallelized_function field.
-   (input_overwrite_node): Read parallelized_function field.
-   * omp-low.c (expand_omp_taskreg, finalize_task_copyfn): Set
-   parallelized_function on cgraph_node for child_fn.
-   * tree-parloops.c: Add include of plugin-api.h, ipa-ref.h and cgraph.h.
-   Remove include of gt-tree-parloops.h.
-   (parallelized_functions): Remove static variable.
-   (parallelized_function_p): Rewrite using parallelized_function field of
-   cgraph_node.
-   (create_loop_fn): Remove adding to parallelized_functions.
-   * Makefile.in (GTFILES): Remove tree-parloops.c
-
-2015-03-21  Tom de Vries  
-
-   backport from trunk:
-   2015-03-18  Tom de Vries  
-
-   * tree-parloops.c (parallelize_loops): Make static.
-   * tree-parloops.h (parallelize_loops): Remove extern declaration.
-
 2015-03-11  Thomas Schwinge  
 
* config/nvptx/nvptx.h (LIBSTDCXX): Define to "gcc".
diff --git gcc/ChangeLog.gomp gcc/ChangeLog.gomp
index 6ed6962..b499d04 100644
--- gcc/ChangeLog.gomp
+++ gcc/ChangeLog.gomp
@@ -1,3 +1,37 @@
+2015-03-21  Tom de Vries  
+
+   PR tree-optimization/65460
+   * omp-low.c (expand_omp_target): Set parallelized_function on
+   cgraph_node for child_fn.
+
+2015-03-21  Tom de Vries  
+
+   backport from trunk:
+   2015-03-21  Tom de Vries  
+
+   PR tree-optimization/65458
+   * cgraph.c (cgraph_node::dump): Handle parallelized_function field.
+   * cgraph.h (cgraph_node): Add parallelized_function field.
+   * lto-cgraph.c (lto_output_node): Write parallelized_function field.
+   (input_overwrite_node): Read parallelized_function field.
+   * omp-low.c (expand_omp_taskreg, finalize_task_copyfn): Set
+   parallelized_function on cgraph_node for child_fn.
+   * tree-parloops.c: Add include of plugin-api.h, ipa-ref.h and cgraph.h.
+   Remove include of gt-tree-parloops.h.
+   (parallelized_functions): Remove static variable.
+   (parallelized_function_p): Rewrite using parallelized_function field of
+   cgraph_node.
+   (create_loop_fn): Remove adding to parallelized_functions.
+   * Makefile.in (GTFILES): Remove tree-parloops.c
+
+2015-03-21  Tom de Vries  
+
+   backport from trunk:
+   2015-03-18  Tom de Vries  
+
+   * tree-parloops.c (parallelize_loops): Make static.
+   * tree-parloops.h (parallelize_loops): Remove extern declaration.
+
 2015-01-13  Thomas Schwinge  
 
* tree-core.h: Don't include "gomp-constants.h".


Regards,
 Thomas



[patch, nios2, committed] Fix nios2-linux crti/crtn settings

2015-03-25 Thread Chung-Lin Tang

We appear to have erroneously set 'extra_parts' in nios2-linux libgcc
to include the crti.o/crtn.o files intended for nios2 EABI. This still
largely worked, which is why we haven't noticed it till now, except that
some features like gprof profiling weren't properly set up.

This patch removes the extra_parts setting for nios2-linux libgcc; linking
now picks up the correct crti.o/crtn.o provided by glibc.

Chung-Lin

2015-03-25  Chung-Lin Tang  

libgcc/
* config.host (nios2-*-linux*): Remove 'extra_parts' setting.
Index: config.host
===
--- config.host	(revision 221651)
+++ config.host	(working copy)
@@ -943,7 +943,6 @@ nds32*-elf*)
 	;;
 nios2-*-linux*)
 	tmake_file="$tmake_file nios2/t-nios2 nios2/t-linux t-libgcc-pic t-slibgcc-libgcc"
-	extra_parts="$extra_parts crti.o crtn.o"
 	md_unwind_header=nios2/linux-unwind.h
 	;;
 nios2-*-*)

Re: [CHKP, PATCH] Fix instrumented indirect calls with propagated pointers

2015-03-25 Thread Ilya Enkovich

2015-03-24 17:40 GMT+03:00 Richard Biener :
> On Tue, Mar 24, 2015 at 3:06 PM, Jakub Jelinek  wrote:
>> On Tue, Mar 24, 2015 at 12:22:27PM +0300, Ilya Enkovich wrote:
>>
>> The question is what you want to do in the LTO case for the different cases,
>> in particular a TU compiled with -fcheck-pointer-bounds and LTO link without
>> that, or TU compiled without -fcheck-pointer-bounds and LTO link with it.
>> It could be handled as LTO incompatible option, where lto1 would error out
>> if you try to mix -fcheck-pointer-bounds with -fno-check-pointer-bounds
>> code, or e.g. similar to var-tracking, you could consider adjusting the IL
>> upon LTO reading if some TU has been built with -fcheck-pointer-bounds
>> and the LTO link is -fno-check-pointer-bounds.  Dunno what will happen
>> with -fno-check-pointer-bounds TUs LTO linked with -fcheck-pointer-bounds.
>> Or another possibility is to or in -fcheck-pointer-bounds from all TUs.
>>
>>> Maybe replace attribute usage with a new flag in tree_decl_with_vis 
>>> structure?
>>
>> Depends, might be better to stick it into cgraph_node instead, depends on
>> whether you are querying it already early in the FEs or just during GIMPLE
>> when the cgraph node should be created too.
>
> I also wonder why it is necessary to execute pass_chkp_instrumentation_passes
> when mpx is not active.
>
> That is, can we guard that properly in
>
> void
> pass_manager::execute_early_local_passes ()
> {
>   execute_pass_list (cfun, pass_build_ssa_passes_1->sub);
>   execute_pass_list (cfun, pass_chkp_instrumentation_passes_1->sub);
>   execute_pass_list (cfun, pass_local_optimization_passes_1->sub);
> }

I'm worried about new functions generated in LTO. But with re-created
flag_check_pointer_bounds it should be safe to guard it.

>
> (why's that so oddly wrapped?)
>
> class pass_chkp_instrumentation_passes
>
> also has no gate that guards with flag_mpx or so.
>
> That would save a IL walk over all functions (fixup_cfg) and a cgraph
> edge rebuild.

Right. Will fix it.

Thanks,
Ilya

>
> Richard.
>
>> Jakub

Re: [CHKP, PATCH] Fix instrumented indirect calls with propagated pointers

2015-03-25 Thread Ilya Enkovich

2015-03-25 11:16 GMT+03:00 Jakub Jelinek :
> On Wed, Mar 25, 2015 at 11:05:17AM +0300, Ilya Enkovich wrote:
>> > The question is what you want to do in the LTO case for the different 
>> > cases,
>> > in particular a TU compiled with -fcheck-pointer-bounds and LTO link 
>> > without
>> > that, or TU compiled without -fcheck-pointer-bounds and LTO link with it.
>> > It could be handled as LTO incompatible option, where lto1 would error out
>> > if you try to mix -fcheck-pointer-bounds with -fno-check-pointer-bounds
>> > code, or e.g. similar to var-tracking, you could consider adjusting the IL
>> > upon LTO reading if some TU has been built with -fcheck-pointer-bounds
>> > and the LTO link is -fno-check-pointer-bounds.  Dunno what will happen
>> > with -fno-check-pointer-bounds TUs LTO linked with -fcheck-pointer-bounds.
>> > Or another possibility is to or in -fcheck-pointer-bounds from all TUs.
>>
>> Mixing instrumented and non-instrumented TUs is allowed. All
>> instrumentation passes run before the LTO link. The only code
>> generation problem when instrumented code is linked without
>> -fcheck-pointer-bounds is that the chkp_finish_file call, which
>> generates static constructors, is disabled. I think I should just set
>> flag_check_pointer_bounds if I see any instrumented symbol on LTO read.
>> That would trigger the chkp_finish_file call when required and would be
>> available as a guard for chkp-related code.
>
> Thus perhaps oring the flag_check_pointer_bounds option from all the TUs is
> the desirable behavior for LTO?
> I think Richard or Honza would know where would be the best spot to do that.
>
> Jakub

Is such ORing already used for some other flags, so that I have an example to follow?

Thanks,
Ilya

Re: [PATCH] Rewrite lto streamer DFS from recursion to worklist (PR lto/65515)

2015-03-25 Thread Richard Biener

On Tue, 24 Mar 2015, Jakub Jelinek wrote:

> On Tue, Mar 24, 2015 at 04:19:46PM +0100, Jakub Jelinek wrote:
> > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
> 
> Also tested with
> ../configure --with-build-config=bootstrap-lto 
> --enable-languages=c,c++,fortran,objc,obj-c++,go
> make -j16; make -j16 -k check
> on x86_64-linux, no regressions.

Ok.

Thanks,
Richard.

Re: [CHKP, PATCH] Fix instrumented indirect calls with propagated pointers

2015-03-25 Thread Richard Biener

On Wed, Mar 25, 2015 at 9:50 AM, Ilya Enkovich  wrote:
> 2015-03-24 17:40 GMT+03:00 Richard Biener :
>> On Tue, Mar 24, 2015 at 3:06 PM, Jakub Jelinek  wrote:
>>> On Tue, Mar 24, 2015 at 12:22:27PM +0300, Ilya Enkovich wrote:
>>>
>>> The question is what you want to do in the LTO case for the different cases,
>>> in particular a TU compiled with -fcheck-pointer-bounds and LTO link without
>>> that, or TU compiled without -fcheck-pointer-bounds and LTO link with it.
>>> It could be handled as LTO incompatible option, where lto1 would error out
>>> if you try to mix -fcheck-pointer-bounds with -fno-check-pointer-bounds
>>> code, or e.g. similar to var-tracking, you could consider adjusting the IL
>>> upon LTO reading if some TU has been built with -fcheck-pointer-bounds
>>> and the LTO link is -fno-check-pointer-bounds.  Dunno what will happen
>>> with -fno-check-pointer-bounds TUs LTO linked with -fcheck-pointer-bounds.
>>> Or another possibility is to or in -fcheck-pointer-bounds from all TUs.
>>>
>>>> Maybe replace attribute usage with a new flag in tree_decl_with_vis
>>>> structure?
>>>
>>> Depends, might be better to stick it into cgraph_node instead, depends on
>>> whether you are querying it already early in the FEs or just during GIMPLE
>>> when the cgraph node should be created too.
>>
>> I also wonder why it is necessary to execute pass_chkp_instrumentation_passes
>> when mpx is not active.
>>
>> That is, can we guard that properly in
>>
>> void
>> pass_manager::execute_early_local_passes ()
>> {
>>   execute_pass_list (cfun, pass_build_ssa_passes_1->sub);
>>   execute_pass_list (cfun, pass_chkp_instrumentation_passes_1->sub);
>>   execute_pass_list (cfun, pass_local_optimization_passes_1->sub);
>> }
>
> I'm worried about new functions generated in LTO. But with re-created
> flag_check_pointer_bounds it should be safe to guard it.
>
>>
>> (why's that so oddly wrapped?)
>>
>> class pass_chkp_instrumentation_passes
>>
>> also has no gate that guards with flag_mpx or so.
>>
>> That would save a IL walk over all functions (fixup_cfg) and a cgraph
>> edge rebuild.
>
> Right. Will fix it.

I am already testing

Index: gcc/passes.c
===
--- gcc/passes.c(revision 221633)
+++ gcc/passes.c(working copy)
@@ -156,7 +156,8 @@ void
 pass_manager::execute_early_local_passes ()
 {
   execute_pass_list (cfun, pass_build_ssa_passes_1->sub);
-  execute_pass_list (cfun, pass_chkp_instrumentation_passes_1->sub);
+  if (flag_check_pointer_bounds)
+execute_pass_list (cfun, pass_chkp_instrumentation_passes_1->sub);
   execute_pass_list (cfun, pass_local_optimization_passes_1->sub);
 }

@@ -424,7 +425,8 @@ public:
   virtual bool gate (function *)
 {
   /* Don't bother doing anything if the program has errors.  */
-  return (!seen_error () && !in_lto_p);
+  return (flag_check_pointer_bounds
+ && !seen_error () && !in_lto_p);
 }

 }; // class pass_chkp_instrumentation_passes


Richard.

> Thanks,
> Ilya
>
>>
>> Richard.
>>
>>> Jakub
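
The guarding in the patch above can be sketched with a toy pass manager. This assumes a much simpler model than GCC's: `Pass`, `run_passes` and `early_local_passes` are hypothetical, and the pass names are only labels. The point is that a gate checking the flag first lets a non-instrumented compilation skip the whole sub-pass list.

```cpp
#include <functional>
#include <string>
#include <vector>

// Hypothetical pass descriptor: a pass runs only if its gate returns true.
struct Pass {
  std::string name;
  std::function<bool()> gate;
};

// Run the passes in order and report which ones actually executed.
inline std::vector<std::string> run_passes(const std::vector<Pass>& passes) {
  std::vector<std::string> executed;
  for (const Pass& p : passes)
    if (p.gate())
      executed.push_back(p.name);
  return executed;
}

// Toy version of the early local pass list.  The instrumentation stage is
// gated on the flag in addition to the error and LTO checks.
inline std::vector<Pass> early_local_passes(bool flag_check_pointer_bounds,
                                            bool seen_error, bool in_lto) {
  return {
      {"build_ssa", [] { return true; }},
      {"chkp_instrumentation",
       [=] { return flag_check_pointer_bounds && !seen_error && !in_lto; }},
      {"local_optimization", [] { return true; }},
  };
}
```

Calling `run_passes (early_local_passes (false, false, false))` in this model executes only the first and last stage; with the flag set and no errors outside LTO, all three run.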

Re: [Patch, Fortran, pr60322] was: [Patch 1/2, Fortran, pr60322] [OOP] Incorrect bounds on polymorphic dummy array

2015-03-25 Thread Dominique d'Humières

Hi Andre,

> On 24 March 2015 at 18:06, Andre Vehreschild wrote:
> 
> Hi all,
> 
> I have worked on the comments Mikael gave me. I am now checking for
> class_pointer in the way he pointed out.
> 
> Furthermore, I *joined the two parts* of the patch into this one, because
> keeping both in sync brought no benefit, was tedious, and did not get them
> reviewed any faster.

Are you sure that you attached the right patch? It does not apply on a clean 
tree unless I apply the patch at

https://gcc.gnu.org/ml/fortran/2015-02/msg00105.html

with minor surgery for gcc/fortran/expr.c.

> Paul, Dominique: I have addressed the LOC issue that came up lately. Or rather
> the patch addressed it already. I feel like this is not tested very well,
> neither the loc() call nor the sizeof() call as given in the 57305 second's download.

The ICE is fixed and the LOC issue seems fixed. 

> Unfortunately, that download is not runnable. I would love to see a test
> similar to that download, but couldn't come up with one that satisfied me.
> Given that the patch's review will last some days, I still have enough time
> to come up with something beautiful, which I will add then.

I have changed the test to

use iso_c_binding
implicit none
real, target :: e
class(*), allocatable, target :: a(:)
e = 1.0
call add_element_poly(a,e)
print *, size(a)
call add_element_poly(a,e)
print *, size(a)
select type (a)
  type is (real)
    print *, a
end select
contains
  subroutine add_element_poly(a,e)
    use iso_c_binding
    class(*),allocatable,intent(inout),target :: a(:)
    class(*),intent(in),target :: e
    class(*),allocatable,target :: tmp(:)
    type(c_ptr) :: dummy

    interface
      function memcpy(dest,src,n) bind(C,name="memcpy") result(res)
        import
        type(c_ptr) :: res
        integer(c_intptr_t),value :: dest
        integer(c_intptr_t),value :: src
        integer(c_size_t),value :: n
      end function
    end interface

    if (.not.allocated(a)) then
      allocate(a(1), source=e)
    else
      print *, size(a)
      allocate(tmp(size(a)),source=a)
      print *, size(a), size(tmp) + 1
      print *, loc(a(1)),loc(tmp),sizeof(tmp)
      deallocate(a)
      !allocate(a(size(tmp)+1),mold=e)
      allocate(a(size(tmp)+1),source=e)
      print *, size(a), size(tmp)
      dummy = memcpy(loc(a(1)),loc(tmp),sizeof(tmp))
      dummy = memcpy(loc(a(size(tmp)+1)),loc(e),sizeof(e))
    end if
  end subroutine
end

As pointed out by Paul, I get a segfault at run time if I use the commented line,
i.e. ‘mold’ instead of ‘source’.

> Bootstraps and regtests ok on x86_64-linux-gnu/F20.
> 
> Regards,
>   Andre

Thanks for your work.

Dominique

Re: [CHKP, PATCH] Fix instrumented indirect calls with propagated pointers

2015-03-25 Thread Jakub Jelinek

On Wed, Mar 25, 2015 at 10:38:56AM +0100, Richard Biener wrote:
> --- gcc/passes.c(revision 221633)
> +++ gcc/passes.c(working copy)
> @@ -156,7 +156,8 @@ void
>  pass_manager::execute_early_local_passes ()
>  {
>execute_pass_list (cfun, pass_build_ssa_passes_1->sub);
> -  execute_pass_list (cfun, pass_chkp_instrumentation_passes_1->sub);
> +  if (flag_check_pointer_bounds)
> +execute_pass_list (cfun, pass_chkp_instrumentation_passes_1->sub);
>execute_pass_list (cfun, pass_local_optimization_passes_1->sub);
>  }
> 
> @@ -424,7 +425,8 @@ public:
>virtual bool gate (function *)
>  {
>/* Don't bother doing anything if the program has errors.  */
> -  return (!seen_error () && !in_lto_p);
> +  return (flag_check_pointer_bounds
> + && !seen_error () && !in_lto_p);
>  }
> 
>  }; // class pass_chkp_instrumentation_passes

There is still the wasteful pass_fixup_cfg at the start of:
PUSH_INSERT_PASSES_WITHIN (pass_local_optimization_passes)
  NEXT_PASS (pass_fixup_cfg);
which wasn't there before chkp.  Perhaps this should be a different
pass with the same execute method, but gate containing
flag_check_pointer_bounds?

Jakub

Re: [CHKP, PATCH] Fix instrumented indirect calls with propagated pointers

2015-03-25 Thread Ilya Enkovich

2015-03-25 12:50 GMT+03:00 Jakub Jelinek :
> On Wed, Mar 25, 2015 at 10:38:56AM +0100, Richard Biener wrote:
>> --- gcc/passes.c(revision 221633)
>> +++ gcc/passes.c(working copy)
>> @@ -156,7 +156,8 @@ void
>>  pass_manager::execute_early_local_passes ()
>>  {
>>execute_pass_list (cfun, pass_build_ssa_passes_1->sub);
>> -  execute_pass_list (cfun, pass_chkp_instrumentation_passes_1->sub);
>> +  if (flag_check_pointer_bounds)
>> +execute_pass_list (cfun, pass_chkp_instrumentation_passes_1->sub);
>>execute_pass_list (cfun, pass_local_optimization_passes_1->sub);
>>  }
>>
>> @@ -424,7 +425,8 @@ public:
>>virtual bool gate (function *)
>>  {
>>/* Don't bother doing anything if the program has errors.  */
>> -  return (!seen_error () && !in_lto_p);
>> +  return (flag_check_pointer_bounds
>> + && !seen_error () && !in_lto_p);
>>  }
>>
>>  }; // class pass_chkp_instrumentation_passes
>
> There is still the wasteful pass_fixup_cfg at the start of:
> PUSH_INSERT_PASSES_WITHIN (pass_local_optimization_passes)
>   NEXT_PASS (pass_fixup_cfg);
> which wasn't there before chkp.  Perhaps this should be a different
> pass with the same execute method, but gate containing
> flag_check_pointer_bounds?

IIRC the reason for this pass was the different split of passes, not
the instrumentation itself. Previously, function processing always started
with pass_fixup_cfg. By splitting processing into three stages we got
three pass_fixup_cfg passes.

Ilya

>
> Jakub

Re: [CHKP, PATCH] Fix instrumented indirect calls with propagated pointers

2015-03-25 Thread Jakub Jelinek

On Wed, Mar 25, 2015 at 01:06:46PM +0300, Ilya Enkovich wrote:
> > There is still the wasteful pass_fixup_cfg at the start of:
> > PUSH_INSERT_PASSES_WITHIN (pass_local_optimization_passes)
> >   NEXT_PASS (pass_fixup_cfg);
> > which wasn't there before chkp.  Perhaps this should be a different
> > pass with the same execute method, but gate containing
> > flag_check_pointer_bounds?
> 
> IIRC the reason for this pass was the different split of passes, not
> the instrumentation itself. Previously, function processing always started
> with pass_fixup_cfg. By splitting processing into three stages we got
> three pass_fixup_cfg passes.

Sure, but it would be really nice if for !flag_check_pointer_bounds
we really could have just one stage again, rather than 3.
When it is a global option, and for LTO ideally ored in from all the TUs,
that shouldn't be that hard...

Jakub

Re: [CHKP, PATCH] Fix instrumented indirect calls with propagated pointers

2015-03-25 Thread Richard Biener

On Wed, Mar 25, 2015 at 10:50 AM, Jakub Jelinek  wrote:
> On Wed, Mar 25, 2015 at 10:38:56AM +0100, Richard Biener wrote:
>> --- gcc/passes.c(revision 221633)
>> +++ gcc/passes.c(working copy)
>> @@ -156,7 +156,8 @@ void
>>  pass_manager::execute_early_local_passes ()
>>  {
>>execute_pass_list (cfun, pass_build_ssa_passes_1->sub);
>> -  execute_pass_list (cfun, pass_chkp_instrumentation_passes_1->sub);
>> +  if (flag_check_pointer_bounds)
>> +execute_pass_list (cfun, pass_chkp_instrumentation_passes_1->sub);
>>execute_pass_list (cfun, pass_local_optimization_passes_1->sub);
>>  }
>>
>> @@ -424,7 +425,8 @@ public:
>>virtual bool gate (function *)
>>  {
>>/* Don't bother doing anything if the program has errors.  */
>> -  return (!seen_error () && !in_lto_p);
>> +  return (flag_check_pointer_bounds
>> + && !seen_error () && !in_lto_p);
>>  }
>>
>>  }; // class pass_chkp_instrumentation_passes
>
> There is still the wasteful pass_fixup_cfg at the start of:
> PUSH_INSERT_PASSES_WITHIN (pass_local_optimization_passes)
>   NEXT_PASS (pass_fixup_cfg);
> which wasn't there before chkp.  Perhaps this should be a different
> pass with the same execute method, but gate containing
> flag_check_pointer_bounds?

That's not wasteful but required due to local_pure_const.  The remaining
wasteful fixup_cfg is the one in pass_build_ssa_passes.  ISTR
that pass_ipa_chkp_versioning/early_produce_thunks makes that one
required?  Or EH / CFG cleanup stuff makes it necessary to not
fail IL checking done by into-SSA.

Richard.

> Jakub

Re: [Patch, fortran] PR65532 shape mismatch error with data partial initialization

2015-03-25 Thread Mikael Morin

On 24/03/2015 23:39, Mikael Morin wrote:
> The patch I propose here adds a flag to remember the function has been
> called, and skip it the second time.
> I considered reusing the existing 'resolved' field, but I had to
> slightly change its semantics to prevent regressing somewhere, and I was
> not completely sure how safe that change was.
> I have finally preferred this safer patch keeping the existing field
> completely untouched.
> 
> Regression tested on x86_64-unknown-linux-gnu. OK for trunk?
> 
I have committed the patch as obvious as revision 221657.
If someone is willing to debate about it, the discussion remains open.

Mikael

Re: [CHKP, PATCH] Fix instrumented indirect calls with propagated pointers

2015-03-25 Thread Richard Biener

On Wed, Mar 25, 2015 at 11:11 AM, Jakub Jelinek  wrote:
> On Wed, Mar 25, 2015 at 01:06:46PM +0300, Ilya Enkovich wrote:
>> > There is still the wasteful pass_fixup_cfg at the start of:
>> > PUSH_INSERT_PASSES_WITHIN (pass_local_optimization_passes)
>> >   NEXT_PASS (pass_fixup_cfg);
>> > which wasn't there before chkp.  Perhaps this should be a different
>> > pass with the same execute method, but gate containing
>> > flag_check_pointer_bounds?
>>
>> IIRC the reason for this pass was the different split of passes, not
>> the instrumentation itself. Previously, function processing always started
>> with pass_fixup_cfg. By splitting processing into three stages we got
>> three pass_fixup_cfg passes.
>
> Sure, but it would be really nice if for !flag_check_pointer_bounds
> we really could have just one stage again, rather than 3.
> When it is a global option, and for LTO ideally ored in from all the TUs,
> that shouldn't be that hard...

LTO doesn't even run all this stuff, as it only runs before LTO streaming.

I don't think we want to go back to not going into SSA for all functions
before early-opts (esp. early inlining).  Which unfortunately won't
get the EH cleanup related benefits.

Btw, execute_fixup_cfg can be optimized as well - edge purging only
needs to be done for the last stmt of a BB.

Richard.

> Jakub

Re: [CHKP, PATCH] Fix instrumented indirect calls with propagated pointers

2015-03-25 Thread Ilya Enkovich

2015-03-25 13:15 GMT+03:00 Richard Biener :
> On Wed, Mar 25, 2015 at 10:50 AM, Jakub Jelinek  wrote:
>> On Wed, Mar 25, 2015 at 10:38:56AM +0100, Richard Biener wrote:
>>> --- gcc/passes.c(revision 221633)
>>> +++ gcc/passes.c(working copy)
>>> @@ -156,7 +156,8 @@ void
>>>  pass_manager::execute_early_local_passes ()
>>>  {
>>>execute_pass_list (cfun, pass_build_ssa_passes_1->sub);
>>> -  execute_pass_list (cfun, pass_chkp_instrumentation_passes_1->sub);
>>> +  if (flag_check_pointer_bounds)
>>> +execute_pass_list (cfun, pass_chkp_instrumentation_passes_1->sub);
>>>execute_pass_list (cfun, pass_local_optimization_passes_1->sub);
>>>  }
>>>
>>> @@ -424,7 +425,8 @@ public:
>>>virtual bool gate (function *)
>>>  {
>>>/* Don't bother doing anything if the program has errors.  */
>>> -  return (!seen_error () && !in_lto_p);
>>> +  return (flag_check_pointer_bounds
>>> + && !seen_error () && !in_lto_p);
>>>  }
>>>
>>>  }; // class pass_chkp_instrumentation_passes
>>
>> There is still the wasteful pass_fixup_cfg at the start of:
>> PUSH_INSERT_PASSES_WITHIN (pass_local_optimization_passes)
>>   NEXT_PASS (pass_fixup_cfg);
>> which wasn't there before chkp.  Perhaps this should be a different
>> pass with the same execute method, but gate containing
>> flag_check_pointer_bounds?
>
> That's not wasteful but required due to local_pure_const.  The remaining
> wasteful fixup_cfg is the one in pass_build_ssa_passes.  ISTR
> that pass_ipa_chkp_versioning/early_produce_thunks makes that one
> required?  Or EH / CFG cleanup stuff makes it necessary to not
> fail IL checking done by into-SSA.

These two chkp passes don't modify function bodies (they may remove
them, though). I don't expect them to require a following fixup_cfg.

Ilya

>
> Richard.
>
>> Jakub

Re: [PATCH] Fix PR65538

2015-03-25 Thread Martin Liška


On 03/25/2015 12:37 AM, Jan Hubicka wrote:

On Tue, Mar 24, 2015 at 10:54:25PM +0100, Martin Liška wrote:

--- a/gcc/symbol-summary.h
+++ b/gcc/symbol-summary.h
@@ -81,6 +81,12 @@ public:
  m_symtab_insertion_hook = NULL;
  m_symtab_removal_hook = NULL;
  m_symtab_duplication_hook = NULL;
+
+/* Release all summaries in case we use non-GGC memory.  */
+typedef typename hash_map ::iterator 
map_iterator;
+if (!m_ggc)
+  for (map_iterator it = m_map.begin (); it != m_map.end (); ++it)
+   release ((*it).second);


You haven't removed the now unnecessary if (!m_ggc) guard.


@@ -106,6 +112,15 @@ public:
  return m_ggc ? new (ggc_alloc <T> ()) T() : new T () ;
}

+  /* Release an item that is stored within map.  */
+  void release (T *item)
+  {
+if (m_ggc)
+  ggc_free (item);


Perhaps run also the item's destructor first?  I know that
inline_summary doesn't have a user destructor, so it will expand to nothing,
so it would be just for completeness.


Yep, calling destructors is a good idea.  OK with that change
and fix Jakub pointed out.

Honza



+else
+  delete item;
+  }
+


Jakub


Ok, changes are applied in the final patch I'm going to install.

Thanks,
Martin
From 6eae938e34e36c461ebec1570ff0f3d2f5e1b8cf Mon Sep 17 00:00:00 2001
From: mliska 
Date: Tue, 24 Mar 2015 13:58:50 +0100
Subject: [PATCH] Fix PR65538.

gcc/ChangeLog:

2015-03-24  Martin Liska  

	PR tree-optimization/65538
	* symbol-summary.h (function_summary::~function_summary):
	Release memory for allocated summaries.
	(function_summary::release): New function.
---
 gcc/symbol-summary.h | 17 +
 1 file changed, 17 insertions(+)

diff --git a/gcc/symbol-summary.h b/gcc/symbol-summary.h
index 8d7e42c..0448310 100644
--- a/gcc/symbol-summary.h
+++ b/gcc/symbol-summary.h
@@ -81,6 +81,11 @@ public:
 m_symtab_insertion_hook = NULL;
 m_symtab_removal_hook = NULL;
 m_symtab_duplication_hook = NULL;
+
+/* Release all summaries.  */
+typedef typename hash_map ::iterator map_iterator;
+for (map_iterator it = m_map.begin (); it != m_map.end (); ++it)
+  release ((*it).second);
   }
 
   /* Traverses all summarys with a function F called with
@@ -106,6 +111,18 @@ public:
 return m_ggc ? new (ggc_alloc <T> ()) T() : new T () ;
   }
 
+  /* Release an item that is stored within map.  */
+  void release (T *item)
+  {
+if (m_ggc)
+  {
+	item->~T ();
+	ggc_free (item);
+  }
+else
+  delete item;
+  }
+
   /* Getter for summary callgraph node pointer.  */
   T* get (cgraph_node *node)
   {
-- 
2.1.4
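
The two release paths discussed above can be tried outside of GCC with a minimal sketch. The names `summary` and `holder` are hypothetical, and `malloc`/`free` only stand in for GGC allocation here; the point is that storage obtained through placement new needs an explicit destructor call before it is freed, while `delete` does both steps itself.

```cpp
#include <cassert>
#include <cstdlib>
#include <new>

// Hypothetical stand-in for a summary type; it counts live objects so the
// effect of running the destructor is observable.
struct summary
{
  static int live;
  summary () { ++live; }
  ~summary () { --live; }
};
int summary::live = 0;

// Hypothetical holder mirroring the m_ggc switch.  When m_raw is true the
// object lives in malloc'ed storage (an assumption standing in for GGC
// memory) and is built with placement new; otherwise new/delete is used.
template <typename T>
struct holder
{
  explicit holder (bool use_raw) : m_raw (use_raw) {}

  T *allocate ()
  {
    if (!m_raw)
      return new T ();
    void *mem = std::malloc (sizeof (T));
    assert (mem != nullptr);
    return new (mem) T ();
  }

  void release (T *item)
  {
    if (m_raw)
      {
        item->~T ();        // run the destructor explicitly ...
        std::free (item);   // ... then hand the raw storage back
      }
    else
      delete item;          // delete destroys and frees in one step
  }

  bool m_raw;
};
```

Without the explicit `item->~T ()` call the raw path would free the storage but never decrement the counter, which is the completeness issue raised in the review.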

[PATCH] Vimrc config: fix symlink creation

2015-03-25 Thread Martin Liška


Hello.

The following patch fixes creation of the vimrc symlink, which currently points to the wrong location.

Ready for trunk?
Thanks,
Martin
From 5681b55f531f579ba75aad21f5628f86fba4bc8a Mon Sep 17 00:00:00 2001
From: mliska 
Date: Wed, 25 Mar 2015 10:09:21 +0100
Subject: [PATCH] Fix vimrc file link creation.

ChangeLog:

2015-03-25  Martin Liska  
	Yury Gribov  

	* Makefile.in: Fix ln source location for vimrc file.
	* Makefile.tpl: Likewise.
---
 Makefile.in  | 4 ++--
 Makefile.tpl | 4 ++--
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/Makefile.in b/Makefile.in
index 6f9dfd4..36b4008 100644
--- a/Makefile.in
+++ b/Makefile.in
@@ -2442,10 +2442,10 @@ mail-report-with-warnings.log: warning.log
 # Local Vim config
 
 $(srcdir)/.local.vimrc:
-	$(LN_S) $(srcdir)/contrib/vimrc $@
+	$(LN_S) contrib/vimrc $@
 
 $(srcdir)/.lvimrc:
-	$(LN_S) $(srcdir)/contrib/vimrc $@
+	$(LN_S) contrib/vimrc $@
 
 vimrc: $(srcdir)/.local.vimrc $(srcdir)/.lvimrc
 
diff --git a/Makefile.tpl b/Makefile.tpl
index f737cfc..1ea1954 100644
--- a/Makefile.tpl
+++ b/Makefile.tpl
@@ -872,10 +872,10 @@ mail-report-with-warnings.log: warning.log
 # Local Vim config
 
 $(srcdir)/.local.vimrc:
-	$(LN_S) $(srcdir)/contrib/vimrc $@
+	$(LN_S) contrib/vimrc $@
 
 $(srcdir)/.lvimrc:
-	$(LN_S) $(srcdir)/contrib/vimrc $@
+	$(LN_S) contrib/vimrc $@
 
 vimrc: $(srcdir)/.local.vimrc $(srcdir)/.lvimrc
 
-- 
2.1.4

Re: [PATCH] Vimrc config: fix symlink creation

2015-03-25 Thread Jakub Jelinek

On Wed, Mar 25, 2015 at 11:57:08AM +0100, Martin Liška wrote:
> Following patch correctly creates symlink that now points to a wrong location.

Only if $(srcdir) is a relative path I'd say.

> Ready for trunk?

In any case, LGTM.

> 2015-03-25  Martin Liska  
>   Yury Gribov  
> 
>   * Makefile.in: Fix ln source location for vimrc file.
>   * Makefile.tpl: Likewise.

Jakub
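
The remark about relative paths can be checked with a small sketch, assuming a C++17 compiler with std::filesystem. A relative symlink target is resolved against the directory that contains the link, not against the directory in which `ln` was run. The helper name `link_resolves` is made up for this illustration.

```cpp
#include <cassert>
#include <filesystem>
#include <fstream>

namespace fs = std::filesystem;

// Create DIR/contrib/vimrc and a symlink DIR/.local.vimrc whose stored
// target is TARGET, then report whether the link resolves to a file.
// A relative TARGET is interpreted relative to DIR, the link's directory.
bool link_resolves (const fs::path &dir, const fs::path &target)
{
  assert (!dir.empty ());
  fs::create_directories (dir / "contrib");
  std::ofstream (dir / "contrib" / "vimrc") << "set nocompatible\n";

  const fs::path link = dir / ".local.vimrc";
  fs::remove (link);                   // drop a link left by an earlier call
  fs::create_symlink (target, link);
  return fs::exists (link);            // exists () follows the symlink
}
```

With the current directory being the parent of a relative `src` directory, `link_resolves ("src", "src/contrib/vimrc")` leaves a dangling link (it points at `src/src/contrib/vimrc`), whereas `link_resolves ("src", "contrib/vimrc")` resolves. With an absolute source directory both forms would work.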

[PATCH] Guard pass_chkp_instrumentation_passes with flag_check_pointer_bounds

2015-03-25 Thread Richard Biener


Avoids a fixup-cfg and cgraph edge rebuild.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied.

Richard.

2015-03-25  Richard Biener  

* passes.c (pass_manager::execute_early_local_passes): Guard
execution of pass_chkp_instrumentation_passes with
flag_check_pointer_bounds.
(pass_chkp_instrumentation_passes::gate): Likewise.

Index: gcc/passes.c
===
--- gcc/passes.c(revision 221633)
+++ gcc/passes.c(working copy)
@@ -156,7 +156,8 @@ void
 pass_manager::execute_early_local_passes ()
 {
   execute_pass_list (cfun, pass_build_ssa_passes_1->sub);
-  execute_pass_list (cfun, pass_chkp_instrumentation_passes_1->sub);
+  if (flag_check_pointer_bounds)
+execute_pass_list (cfun, pass_chkp_instrumentation_passes_1->sub);
   execute_pass_list (cfun, pass_local_optimization_passes_1->sub);
 }
 
@@ -424,7 +425,8 @@ public:
   virtual bool gate (function *)
 {
   /* Don't bother doing anything if the program has errors.  */
-  return (!seen_error () && !in_lto_p);
+  return (flag_check_pointer_bounds
+ && !seen_error () && !in_lto_p);
 }
 
 }; // class pass_chkp_instrumentation_passes

[patch libgomp]: Fix PR 64972

2015-03-25 Thread Kai Tietz

Hi,

ChangeLog

2015-03-25  Kai Tietz  

PR libgomp/64972
* oacc-parallel.c (GOACC_parallel): Use PRIu64 if available.
(GOACC_data_start): Likewise.
* target.c (gomp_map_vars): Likewise.

Tested for i686-w64-mingw32.  Fix got preapproved by Jakub, so I will
commit this soon, if there are no objections.

Regards,
Kai

Index: oacc-parallel.c
===
--- oacc-parallel.c(Revision 221640)
+++ oacc-parallel.c(Arbeitskopie)
@@ -31,6 +31,9 @@
 #include "libgomp_g.h"
 #include "gomp-constants.h"
 #include "oacc-int.h"
+#ifdef HAVE_INTTYPES_H
+# include <inttypes.h>  /* For PRIu64.  */
+#endif
 #include 
 #include 
 #include 
@@ -99,9 +102,15 @@ GOACC_parallel (int device, void (*fn) (void *),
 gomp_fatal ("num_workers (%d) different from one is not yet supported",
 num_workers);

-  gomp_debug (0, "%s: mapnum=%zd, hostaddrs=%p, sizes=%p, kinds=%p, async=%d\n",
-  __FUNCTION__, mapnum, hostaddrs, sizes, kinds, async);
-
+#ifdef HAVE_INTTYPES_H
+  gomp_debug (0, "%s: mapnum=%"PRIu64", hostaddrs=%p, size=%p, kinds=%p, "
+ "async = %d\n",
+  __FUNCTION__, (uint64_t) mapnum, hostaddrs, sizes, kinds, async);
+#else
+  gomp_debug (0, "%s: mapnum=%lu, hostaddrs=%p, sizes=%p, kinds=%p, async=%d\n",
+  __FUNCTION__, (unsigned long) mapnum, hostaddrs, sizes, kinds,
+  async);
+#endif
   select_acc_device (device);

   thr = goacc_thread ();
@@ -178,8 +187,13 @@ GOACC_data_start (int device, size_t mapnum,
   bool host_fallback = device == GOMP_DEVICE_HOST_FALLBACK;
   struct target_mem_desc *tgt;

-  gomp_debug (0, "%s: mapnum=%zd, hostaddrs=%p, sizes=%p, kinds=%p\n",
-  __FUNCTION__, mapnum, hostaddrs, sizes, kinds);
+#ifdef HAVE_INTTYPES_H
+  gomp_debug (0, "%s: mapnum=%"PRIu64", hostaddrs=%p, size=%p, kinds=%p\n",
+  __FUNCTION__, (uint64_t) mapnum, hostaddrs, sizes, kinds);
+#else
+  gomp_debug (0, "%s: mapnum=%lu, hostaddrs=%p, sizes=%p, kinds=%p\n",
+  __FUNCTION__, (unsigned long) mapnum, hostaddrs, sizes, kinds);
+#endif

   select_acc_device (device);

Index: target.c
===
--- target.c(Revision 221640)
+++ target.c(Arbeitskopie)
@@ -33,6 +33,9 @@
 #include 
 #include 
 #include 
+#ifdef HAVE_INTTYPES_H
+# include <inttypes.h>  /* For PRIu64.  */
+#endif
 #include 
 #include 

@@ -438,9 +441,16 @@ gomp_map_vars (struct gomp_device_descr *devicep,
   /* We already looked up the memory region above and it
  was missing.  */
   size_t size = k->host_end - k->host_start;
+#ifdef HAVE_INTTYPES_H
   gomp_fatal ("present clause: !acc_is_present (%p, "
-  "%zd (0x%zx))", (void *) k->host_start,
-  size, size);
+  "%"PRIu64" (0x%"PRIx64"))",
+  (void *) k->host_start,
+  (uint64_t) size, (uint64_t) size);
+#else
+  gomp_fatal ("present clause: !acc_is_present (%p, "
+  "%lu (0x%lx))", (void *) k->host_start,
+  (unsigned long) size, (unsigned long) size);
+#endif
 }
 break;
   case GOMP_MAP_FORCE_DEVICEPTR:
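
As a standalone illustration of the approach in this patch, the sketch below formats a size_t without the `z` length modifier by widening the value to uint64_t and using the PRIu64/PRIx64 macros. The function name `describe_size` is hypothetical and not part of libgomp.

```cpp
#include <cassert>
#include <cinttypes>
#include <cstddef>
#include <cstdio>
#include <string>

// Format SIZE as "<decimal> (0x<hex>)" without the C99 "z" length
// modifier, which some C runtimes do not implement.  The value is widened
// to uint64_t so that it matches the PRIu64/PRIx64 conversions exactly.
std::string describe_size (std::size_t size)
{
  char buf[64];
  int n = std::snprintf (buf, sizeof buf, "%" PRIu64 " (0x%" PRIx64 ")",
                         (std::uint64_t) size, (std::uint64_t) size);
  assert (n > 0 && (std::size_t) n < sizeof buf);
  return std::string (buf);
}
```

The casts matter: passing a plain size_t to a PRIu64 conversion would be wrong on targets where size_t is narrower than 64 bits.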

[PATCH, CHKP, PR target/65508] Set static chain for instrumented calls

2015-03-25 Thread Ilya Enkovich

Hi,

This patch fixes PR target/65508 by proper copy of static chain for 
instrumented calls.  Bootstrapped and tested on x86_64-unknown-linux-gnu.  OK 
for trunk or wait for stage 1?

Thanks,
Ilya
--
gcc/

2015-03-25  Ilya Enkovich  

PR target/65508
* tree-chkp.c (chkp_add_bounds_to_call_stmt): Set static
chain for generated call.

gcc/testsuite/

2015-03-25  Ilya Enkovich  

PR target/65508
* gcc.target/i386/mpx/pr65508.c: New.


diff --git a/gcc/testsuite/gcc.target/i386/mpx/pr65508.c b/gcc/testsuite/gcc.target/i386/mpx/pr65508.c
new file mode 100644
index 000..9060287
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/mpx/pr65508.c
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fcheck-pointer-bounds -mmpx" } */
+
+void
+bar (int N)
+{
+  int a[N];
+  void foo (int a[N])
+  {
+  }
+  foo (a);
+}
diff --git a/gcc/tree-chkp.c b/gcc/tree-chkp.c
index d2df4ba..de127ae 100644
--- a/gcc/tree-chkp.c
+++ b/gcc/tree-chkp.c
@@ -1838,6 +1838,7 @@ chkp_add_bounds_to_call_stmt (gimple_stmt_iterator *gsi)
   new_call = gimple_build_call_vec (gimple_op (call, 1), new_args);
   gimple_call_set_lhs (new_call, gimple_call_lhs (call));
   gimple_call_copy_flags (new_call, call);
+  gimple_call_set_chain (new_call, gimple_call_chain (call));
 }
   new_args.release ();

Re: [PATCH, CHKP, PR target/65508] Set static chain for instrumented calls

2015-03-25 Thread Richard Biener

On Wed, Mar 25, 2015 at 1:35 PM, Ilya Enkovich  wrote:
> Hi,
>
> This patch fixes PR target/65508 by proper copy of static chain for 
> instrumented calls.  Bootstrapped and tested on x86_64-unknown-linux-gnu.  OK 
> for trunk or wait for stage 1?

Ok for trunk.

Richard.

> Thanks,
> Ilya
> --
> gcc/
>
> 2015-03-25  Ilya Enkovich  
>
> PR target/65508
> * tree-chkp.c (chkp_add_bounds_to_call_stmt): Set static
> chain for generated call.
>
> gcc/testsuite/
>
> 2015-03-25  Ilya Enkovich  
>
> PR target/65508
> * gcc.target/i386/mpx/pr65508.c: New.
>
>
> diff --git a/gcc/testsuite/gcc.target/i386/mpx/pr65508.c b/gcc/testsuite/gcc.target/i386/mpx/pr65508.c
> new file mode 100644
> index 000..9060287
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/mpx/pr65508.c
> @@ -0,0 +1,12 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fcheck-pointer-bounds -mmpx" } */
> +
> +void
> +bar (int N)
> +{
> +  int a[N];
> +  void foo (int a[N])
> +  {
> +  }
> +  foo (a);
> +}
> diff --git a/gcc/tree-chkp.c b/gcc/tree-chkp.c
> index d2df4ba..de127ae 100644
> --- a/gcc/tree-chkp.c
> +++ b/gcc/tree-chkp.c
> @@ -1838,6 +1838,7 @@ chkp_add_bounds_to_call_stmt (gimple_stmt_iterator *gsi)
>new_call = gimple_build_call_vec (gimple_op (call, 1), new_args);
>gimple_call_set_lhs (new_call, gimple_call_lhs (call));
>gimple_call_copy_flags (new_call, call);
> +  gimple_call_set_chain (new_call, gimple_call_chain (call));
>  }
>new_args.release ();
>

[PATCH] XFAIL gcc.dg/graphite/vect-pr43423.c

2015-03-25 Thread Richard Biener


Committed.

Richard.

2015-03-25  Richard Biener  

PR tree-optimization/62630
* gcc.dg/graphite/vect-pr43423.c: XFAIL.

Index: gcc/testsuite/gcc.dg/graphite/vect-pr43423.c
===
--- gcc/testsuite/gcc.dg/graphite/vect-pr43423.c(revision 221633)
+++ gcc/testsuite/gcc.dg/graphite/vect-pr43423.c(working copy)
@@ -15,5 +15,5 @@ void foo(int n, int mid)
 }
 }
 
-/* { dg-final { scan-tree-dump-times "vectorized 2 loops" 1 "vect" } } */
+/* { dg-final { scan-tree-dump-times "vectorized 2 loops" 1 "vect" { xfail *-*-* } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */

[Patch, fortran, pr65548, v1] [5 Regression] gfc_conv_procedure_call

2015-03-25 Thread Andre Vehreschild

Hi all,

please find attached a fix for the recently introduced regression when
allocating arrays with an intrinsic function for source=. The patch addresses
this issue by using gfc_conv_expr_descriptor () for intrinsic functions.

Bootstraps and regtests ok on x86_64-linux-gnu/F20.

Ok for trunk?

Regards,
Andre
-- 
Andre Vehreschild * Email: vehre ad gmx dot de 


pr65548_1.clog
Description: Binary data
diff --git a/gcc/fortran/trans-stmt.c b/gcc/fortran/trans-stmt.c
index 6ffae6e79e..68b343b 100644
--- a/gcc/fortran/trans-stmt.c
+++ b/gcc/fortran/trans-stmt.c
@@ -5075,12 +5075,17 @@ gfc_trans_allocate (gfc_code * code)
 	  /* In all other cases evaluate the expr3 and create a
 		 temporary.  */
 	  gfc_init_se (&se, NULL);
-	  gfc_conv_expr_reference (&se, code->expr3);
+	  if (code->expr3->rank != 0
+		  && code->expr3->expr_type == EXPR_FUNCTION
+		  && code->expr3->value.function.isym)
+		gfc_conv_expr_descriptor (&se, code->expr3);
+	  else
+		gfc_conv_expr_reference (&se, code->expr3);
 	  if (code->expr3->ts.type == BT_CLASS)
 		gfc_conv_class_to_class (&se, code->expr3,
 	 code->expr3->ts,
 	 false, true,
-	  false,false);
+	 false, false);
 	  gfc_add_block_to_block (&block, &se.pre);
 	  gfc_add_block_to_block (&post, &se.post);
 	  /* Prevent aliasing, i.e., se.expr may be already a
diff --git a/gcc/testsuite/gfortran.dg/allocate_with_source_5.f90 b/gcc/testsuite/gfortran.dg/allocate_with_source_5.f90
new file mode 100644
index 000..e934e08
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/allocate_with_source_5.f90
@@ -0,0 +1,52 @@
+! { dg-do run }
+!
+! Check that pr65548 is fixed.
+! Contributed by Juergen Reuter  
+
+module allocate_with_source_5_module
+
+  type :: selector_t
+integer, dimension(:), allocatable :: map
+real, dimension(:), allocatable :: weight
+  contains
+procedure :: init => selector_init
+  end type selector_t
+
+contains
+
+  subroutine selector_init (selector, weight)
+class(selector_t), intent(out) :: selector
+real, dimension(:), intent(in) :: weight
+real :: s
+integer :: n, i
+logical, dimension(:), allocatable :: mask
+s = sum (weight)
+allocate (mask (size (weight)), source = weight /= 0)
+n = count (mask)
+if (n > 0) then
+   allocate (selector%map (n), &
+source = pack ([(i, i = 1, size (weight))], mask))
+   allocate (selector%weight (n), &
+source = pack (weight / s, mask))
+else
+   allocate (selector%map (1), source = 1)
+   allocate (selector%weight (1), source = 0.)
+end if
+  end subroutine selector_init
+
+end module allocate_with_source_5_module
+
+program allocate_with_source_5
+  use allocate_with_source_5_module
+
+  class(selector_t), allocatable :: sel;
+  real, dimension(5) :: w = [ 1, 0, 2, 0, 3];
+
+  allocate (sel)
+  call sel%init(w)
+
+  if (any(sel%map /= [ 1, 3, 5])) call abort()
+  if (any(abs(sel%weight - [1, 2, 3] / 6) < 1E-6)) call abort()
+end program allocate_with_source_5
+! { dg-final { cleanup-modules "allocate_with_source_5_module" } }
+

C++ PATCH for c++/61670 (ice-after-error with null DECL_SIZE)

2015-03-25 Thread Marek Polacek

The following fixes an ICE on invalid code by checking that DECL_SIZE is
not null before feeding it to integer_zerop.

Bootstrapped/regtested on x86_64-linux, ok for trunk?

2015-03-25  Marek Polacek  

PR c++/61670
* class.c (remove_zero_width_bit_fields): Check for null DECL_SIZE.

* g++.dg/template/pr61670.C: New test.

diff --git gcc/cp/class.c gcc/cp/class.c
index 0518320..c2d4201 100644
--- gcc/cp/class.c
+++ gcc/cp/class.c
@@ -5434,7 +5434,8 @@ remove_zero_width_bit_fields (tree t)
 DECL_INITIAL (*fieldsp).
 check_bitfield_decl eventually sets DECL_SIZE (*fieldsp)
 to that width.  */
- && integer_zerop (DECL_SIZE (*fieldsp)))
+ && (DECL_SIZE (*fieldsp) == NULL_TREE
+ || integer_zerop (DECL_SIZE (*fieldsp))))
*fieldsp = DECL_CHAIN (*fieldsp);
   else
fieldsp = &DECL_CHAIN (*fieldsp);
diff --git gcc/testsuite/g++.dg/template/pr61670.C gcc/testsuite/g++.dg/template/pr61670.C
index e69de29..d244efa 100644
--- gcc/testsuite/g++.dg/template/pr61670.C
+++ gcc/testsuite/g++.dg/template/pr61670.C
@@ -0,0 +1,9 @@
+// PR c++/61670
+// { dg-do compile }
+
+template 
+class A {
+  A: 0 // { dg-error "" }
+};
+
+A a;

Marek

Re: Fix PR 65177: diamonds are not valid execution threads for jump threading

2015-03-25 Thread Jeff Law

On 03/19/15 13:54, Sebastian Pop wrote:

Richard Biener wrote:

>please instead fixup after copy_bbs in duplicate_seme_region.
>

Thanks for the review.
Attached patch that does not modify copy_bbs.
Fixes make check in hmmer and make check RUNTESTFLAGS=tree-ssa.exp

Full bootstrap and regtest in progress on x86_64-linux.  Ok for trunk?

0001-diamonds-are-not-valid-execution-threads-for-jump-th.patch

 From 8f1516235bce3e1c4f359149dcc546d813ed7817 Mon Sep 17 00:00:00 2001
From: Sebastian Pop
Date: Tue, 17 Mar 2015 20:28:19 +0100
Subject: [PATCH] diamonds are not valid execution threads for jump threading

PR tree-optimization/65177
* tree-ssa-threadupdate.c (verify_seme): Renamed verify_jump_thread.
(bb_in_bbs): New.
(duplicate_seme_region): Renamed duplicate_thread_path.  Redirect all
edges not adjacent on the path to the original code.

OK for the trunk.  Though I think there's some stage1 refactoring that 
we're going to want to do.

Specifically, it seems to me that copy_bbs should be refactored into 
copy_bbs and copy_bbs_for_threading or somesuch.  Where those routines 
call into refactored common subroutines, but obviously handle wiring up 
the outgoing edges from the copied blocks differently.

The goal would be to eliminate the overly complex block copy/CFG update 
scheme in tree-ssa-threadupdate.c as part of a larger project to convert 
to a backward threader that can run independently of DOM.

Jeff

Re: [PATCH v2] New testcase to check parameter passing bug

2015-03-25 Thread Jeff Law


On 03/18/15 19:40, Honggyu Kim wrote:

Hi,

I have modified the test-case to check parameter passing bug based on the
comments from Kyrill Tkachov, Christophe Lyon, and Segher Boessenkool
as follows:
  1. move from "gcc.target/arm" to "gcc.dg"
  2. change "dg-do compile" to "dg-do run"

Please let me know if there's still something to fix more.
Thanks for your comment.

Honggyu
---
  gcc/testsuite/ChangeLog|4 
  gcc/testsuite/gcc.dg/pr65358.c |   33 +
  2 files changed, 37 insertions(+)
  create mode 100644 gcc/testsuite/gcc.dg/pr65358.c

diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index 77d24a1..218f908 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,3 +1,7 @@
+2015-03-19  Honggyu Kim  
+
+   * gcc.dg/pr65358.c: New test.

This should be included as part of Kyrill's patch.  If the test goes in 
without Kyrill's fix, then it'll just create testsuite noise.


Jeff

RE: [PATCH v2] New testcase to check parameter passing bug

2015-03-25 Thread Kyrill Tkachov



> -Original Message-
> From: Jeff Law [mailto:l...@redhat.com]
> Sent: 25 March 2015 12:27
> To: Honggyu Kim; gcc-patches@gcc.gnu.org
> Cc: Kyrylo Tkachov; seg...@kernel.crashing.org; christophe.l...@st.com
> Subject: Re: [PATCH v2] New testcase to check parameter passing bug
> 
> On 03/18/15 19:40, Honggyu Kim wrote:
> > Hi,
> >
> > I have modified the test-case to check parameter passing bug based on
> > the comments from Kyrill Tkachov, Christophe Lyon, and Segher
> > Boessenkool as follows:
> >   1. move from "gcc.target/arm" to "gcc.dg"
> >   2. change "dg-do compile" to "dg-do run"
> >
> > Please let me know if there's still something to fix more.
> > Thanks for your comment.
> >
> > Honggyu
> > ---
> >   gcc/testsuite/ChangeLog|4 
> >   gcc/testsuite/gcc.dg/pr65358.c |   33 +
> >   2 files changed, 37 insertions(+)
> >   create mode 100644 gcc/testsuite/gcc.dg/pr65358.c
> >
> > diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
> > index 77d24a1..218f908 100644
> > --- a/gcc/testsuite/ChangeLog
> > +++ b/gcc/testsuite/ChangeLog
> > @@ -1,3 +1,7 @@
> > +2015-03-19  Honggyu Kim  
> > +
> > +   * gcc.dg/pr65358.c: New test.
> This should be included as part of Kyrill's patch.  If the test goes in
> without Kyrill's fix, then it'll just create testsuite noise.

I'll make sure to commit this together with my fix (at
https://gcc.gnu.org/ml/gcc-patches/2015-03/msg01014.html)
if it gets approved. I agree that there's no point taking the test in by
itself.

Thanks,
Kyrill

> 
> Jeff
>

Re: [patch libgomp]: Fix PR 64972

2015-03-25 Thread Jakub Jelinek

On Wed, Mar 25, 2015 at 01:35:02PM +0100, Kai Tietz wrote:
> ChangeLog
> 
> 2015-03-25  Kai Tietz  
> 
> PR libgomp/64972
> * oacc-parallel.c (GOACC_parallel): Use PRIu64 if available.
> (GOACC_data_start): Likewise.
> * target.c (gomp_map_vars): Likewise.
> 
> Tested for i686-w64-mingw32.  Fix got preapproved by Jakub, so I will
> commit this soon, if there are no objections.

The patch is ok to commit immediately, no need to wait.

Jakub

Re: [PATCH, bootstrap]: Add bootstrap-lto-noplugin build configuration (PR65537)

2015-03-25 Thread Jakub Jelinek

On Tue, Mar 24, 2015 at 05:43:09PM +0100, Uros Bizjak wrote:
> Attached patch introduces bootstrap-lto-noplugin bootstrap
> configuration for hosts that do not support linker plugin (e.g. CentOS
> 5.11 with binutils 2.17). Also, the patch adds some additional
> documentation to bootstrap-lto option.
> 
> config/ChangeLog:
> 
> 2015-03-24  Uros Bizjak  
> 
> PR bootstrap/65537
> * bootstrap-lto-noplugin.mk: New build configuration.
> 
> gcc/ChangeLog:
> 
> 2015-03-24  Uros Bizjak  
> 
> PR bootstrap/65537
> * doc/install.texi (Building a native compiler): Document new
> bootstrap-lto-noplugin configuration.  Mention that bootstrap-lto
> configuration assumes that the host supports the linker plugin.
> 
> Patch was bootstrapped and tested on x86_64-linux-gnu (CentOS 5.11)
> host, configured with --with-build-config=bootstrap-lto build
> configuration.

and not --with-build-config=bootstrap-lto-noplugin ?

> OK for mainline?

Ok, thanks.

Jakub

[Obvious] Fix libstdc++/33394 testcase when cross-testing linux

2015-03-25 Thread Alan Lawrence

When cross-testing, the -DITERATIONS=1000 flag replaced the -pthread required 
for linux targets, so the test failed to build. I've pushed the following test 
fix as r221666:


Index: libstdc++-v3/testsuite/21_strings/basic_string/pthread33394.cc
===
--- libstdc++-v3/testsuite/21_strings/basic_string/pthread33394.cc (revision 221665)
+++ libstdc++-v3/testsuite/21_strings/basic_string/pthread33394.cc  (working copy)

@@ -18,7 +18,7 @@
 // { dg-do run { target *-*-freebsd* *-*-dragonfly* *-*-netbsd* *-*-linux* *-*-gnu* *-*-solaris* *-*-cygwin *-*-darwin* } }
 // { dg-options "-pthread" { target *-*-freebsd* *-*-dragonfly* *-*-netbsd* *-*-linux* *-*-gnu* *-*-solaris* } }


-// { dg-options "-DITERATIONS=1000" { target simulator } }
+// { dg-additional-options "-DITERATIONS=1000" { target simulator } }
 #ifndef ITERATIONS
 #define ITERATIONS 5
 #endif

Jonathan Wakely wrote:

Adding a testcase so the bug can be closed.

I believe the segfault was fixed for 3.4.0 by
https://gcc.gnu.org/r67912

Tested x86_64-linux, committed to trunk.

Re: [Obvious] Fix libstdc++/33394 testcase when cross-testing linux

2015-03-25 Thread Jonathan Wakely


On 25/03/15 15:49 +, Alan Lawrence wrote:
When cross-testing, the -DITERATIONS=1000 flag replaced the -pthread 
required for linux targets, so the test failed to build. I've pushed 
the following test fix as r221666:


Ah yes, of course it does! Thanks for the fix.


Index: libstdc++-v3/testsuite/21_strings/basic_string/pthread33394.cc
===
--- libstdc++-v3/testsuite/21_strings/basic_string/pthread33394.cc (revision 221665)
+++ libstdc++-v3/testsuite/21_strings/basic_string/pthread33394.cc  (working copy)

@@ -18,7 +18,7 @@
// { dg-do run { target *-*-freebsd* *-*-dragonfly* *-*-netbsd* *-*-linux* *-*-gnu* *-*-solaris* *-*-cygwin *-*-darwin* } }
// { dg-options "-pthread" { target *-*-freebsd* *-*-dragonfly* *-*-netbsd* *-*-linux* *-*-gnu* *-*-solaris* } }


-// { dg-options "-DITERATIONS=1000" { target simulator } }
+// { dg-additional-options "-DITERATIONS=1000" { target simulator } }
#ifndef ITERATIONS
#define ITERATIONS 5
#endif

Jonathan Wakely wrote:

Adding a testcase so the bug can be closed.

I believe the segfault was fixed for 3.4.0 by
https://gcc.gnu.org/r67912

Tested x86_64-linux, committed to trunk.

Re: [libstdc++/65033] Give alignment info to libatomic

2015-03-25 Thread Jonathan Wakely


On 18/02/15 12:15 +, Jonathan Wakely wrote:

On 12/02/15 13:23 -0800, Richard Henderson wrote:

When we fixed PR54005, making sure that atomic_is_lock_free returns the same
value for all objects of a given type, we probably should have changed the
interface so that we would pass size and alignment rather than size and object
pointer.

Instead, we decided that passing null for the object pointer would be
sufficient.  But as this PR shows, we really do need to take alignment into
account.

The following patch constructs a fake object pointer that is maximally
misaligned.  This allows the interface to both the builtin and to libatomic to
remain unchanged.  Which probably makes this back-portable to maintenance
releases as well.


Am I right in thinking that another option would be to ensure that
std::atomic<> objects are always suitably aligned? Would that make
std::atomic<> slightly more compatible with a C11 atomic_int, where
the _Atomic qualifier affects alignment?

https://gcc.gnu.org/PR62259 suggests we might need to enforce
alignment on std::atomic anyway, or am I barking up the wrong tree?



I've convinced myself that Richard's patch is correct in all cases,
but I think we also want this patch, to fix PR62259 and PR65147.

For the generic std::atomic (i.e. not the integral or pointer
specializations) we should increase the alignment of atomic types that
have the same size as one of the standard integral types. This should
be consistent with what the C front end does for _Atomic, based on
what Joseph told me on IRC:

 jwakely: _Atomic aligns 1/2/4/8/16-byte types the same as
   integer types of that size.
 (Which may not be alignment = size, depending on the
   architecture.)

Ideally we'd use an attribute like Andrew describes at
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62259#c4 but that's not
going to happen for GCC 5, so this just looks for an integral type of
the same size and uses its alignment.
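
A reduced version of that lookup, outside of libstdc++, might look like the sketch below. The names `integer_like_alignment`, `aligned_box` and `four_chars` are hypothetical, and the `__int128` case is left out to keep the example portable; it is only meant to show the effect on a type whose natural alignment is smaller than that of the same-sized integer.

```cpp
#include <cassert>
#include <cstddef>

// Pick the alignment of the first standard integer type whose size
// matches T, falling back to T's own alignment otherwise.
template <typename T>
constexpr std::size_t integer_like_alignment ()
{
  return sizeof (T) == sizeof (char)      ? alignof (char)
       : sizeof (T) == sizeof (short)     ? alignof (short)
       : sizeof (T) == sizeof (int)       ? alignof (int)
       : sizeof (T) == sizeof (long)      ? alignof (long)
       : sizeof (T) == sizeof (long long) ? alignof (long long)
       : alignof (T);
}

// Hypothetical wrapper standing in for the data member of std::atomic.
template <typename T>
struct aligned_box
{
  alignas (integer_like_alignment<T> ()) T value;
};

// Four bytes of payload with a natural alignment of one.
struct four_chars
{
  char c[4];
};

// Helper for checks: does the box for T end up aligned like U?
template <typename T, typename U>
constexpr bool boxed_like ()
{
  return alignof (aligned_box<T>) == alignof (U);
}
```

On a typical target where int is four bytes, `boxed_like<four_chars, int> ()` holds even though `alignof (four_chars)` is 1, which is the property the lock-free integer code paths rely on.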

Tested x86_64-linux, powerpc64le-linux.

I'll wait for RM approval for this and Richard's patch (which is OK
from a libstdc++ perspective).

commit bdcba837b42bbef3143ea59a0194bd3ef435dfcb
Author: Jonathan Wakely 
Date:   Wed Sep 3 15:39:53 2014 +0100

	PR libstdc++/62259
	PR libstdc++/65147
	* include/std/atomic (atomic): Increase alignment for types with
	the same size as one of the integral types.
	* testsuite/29_atomics/atomic/60695.cc: Adjust dg-error line number.
	* testsuite/29_atomics/atomic/62259.cc: New.

diff --git a/libstdc++-v3/include/std/atomic b/libstdc++-v3/include/std/atomic
index cc4b5f1..5f02fe8 100644
--- a/libstdc++-v3/include/std/atomic
+++ b/libstdc++-v3/include/std/atomic
@@ -165,7 +165,20 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 struct atomic
 {
 private:
-  _Tp _M_i;
+  // Align 1/2/4/8/16-byte types the same as integer types of that size.
+  // This matches the alignment effects of the C11 _Atomic qualifier.
+  static constexpr int _S_alignment
+	= sizeof(_Tp) == sizeof(char)	   ? alignof(char)
+	: sizeof(_Tp) == sizeof(short)	   ? alignof(short)
+	: sizeof(_Tp) == sizeof(int)	   ? alignof(int)
+	: sizeof(_Tp) == sizeof(long)	   ? alignof(long)
+	: sizeof(_Tp) == sizeof(long long) ? alignof(long long)
+#ifdef _GLIBCXX_USE_INT128
+	: sizeof(_Tp) == sizeof(__int128)  ? alignof(__int128)
+#endif
+	: alignof(_Tp);
+
+  alignas(_S_alignment) _Tp _M_i;
 
   static_assert(__is_trivially_copyable(_Tp),
 		"std::atomic requires a trivially copyable type");
diff --git a/libstdc++-v3/testsuite/29_atomics/atomic/60695.cc b/libstdc++-v3/testsuite/29_atomics/atomic/60695.cc
index b59c6ba..806ccb1 100644
--- a/libstdc++-v3/testsuite/29_atomics/atomic/60695.cc
+++ b/libstdc++-v3/testsuite/29_atomics/atomic/60695.cc
@@ -27,4 +27,4 @@ struct X {
   char stuff[0]; // GNU extension, type has zero size
 };
 
-std::atomic<X> a;  // { dg-error "not supported" "" { target *-*-* } 173 }
+std::atomic<X> a;  // { dg-error "not supported" "" { target *-*-* } 186 }
diff --git a/libstdc++-v3/testsuite/29_atomics/atomic/62259.cc b/libstdc++-v3/testsuite/29_atomics/atomic/62259.cc
new file mode 100644
index 000..cfe70b1
--- /dev/null
+++ b/libstdc++-v3/testsuite/29_atomics/atomic/62259.cc
@@ -0,0 +1,56 @@
+// Copyright (C) 2015 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// You should have received a copy of the GNU General Public License along
+// with this library; see the file COPYING3.  If not see
+//

[PATCH, rs6000, 4.9] Backport little endian swap optimization to 4.9

2015-03-25 Thread Bill Schmidt

Hi,

The POWER-specific little-endian swap optimization pass has been burning
in on mainline since last August.  Since then there have been a few
improvements and bug fixes, but the code is very stable.  I've had some
recent requests to get this code backported to 4.9, as it provides
important performance benefits for vector computation.

Most of the work is target-specific, but there are some
target-independent changes to convert web.c to use a class structure so
that this pass can inherit it.  This has caused no problems and was not
controversial when added to trunk.
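
For readers unfamiliar with the structure being converted, here is a rough sketch of a union-find base class that a pass-specific entry type can inherit from. It is only modelled on the method names listed in the ChangeLog below; `entry_base` and `swap_entry` are hypothetical names and the code is not copied from web.c or df.h.

```cpp
#include <cassert>

// Hypothetical base class: each entry stores a link to its predecessor,
// and an entry without a predecessor is the representative of its set.
class entry_base
{
public:
  entry_base () : m_pred (nullptr) {}

  entry_base *pred () const { return m_pred; }
  void set_pred (entry_base *p) { m_pred = p; }

  // Return the representative of this entry's set and compress the path
  // so that later lookups are shorter.
  entry_base *unionfind_root ()
  {
    entry_base *root = this;
    while (root->pred ())
      root = root->pred ();

    entry_base *walk = this;
    while (walk->pred ())
      {
        entry_base *next = walk->pred ();
        walk->set_pred (root);
        walk = next;
      }
    return root;
  }

private:
  entry_base *m_pred;
};

// Merge the sets containing FIRST and SECOND.  Return true if they were
// already in the same set, false if a merge happened.
bool unionfind_union (entry_base *first, entry_base *second)
{
  assert (first != nullptr && second != nullptr);
  first = first->unionfind_root ();
  second = second->unionfind_root ();
  if (first == second)
    return true;
  second->set_pred (first);
  return false;
}

// A derived entry adds pass-specific data on top of the shared machinery.
struct swap_entry : entry_base
{
  bool is_swap = false;
};
```

Keeping the union-find logic in the base class is what lets a second pass reuse it with its own per-entry fields instead of duplicating the code.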

The rest of the patch is straightforward backporting of the
target-specific pieces and test cases.  There have been a few
infrastructural changes to adjust to, but nothing major.

After this goes in, I'll work on taking it back to 4.8 as well.  OK for
4.9?

Thanks,
Bill


gcc:

2015-03-25  Bill Schmidt  

Backport of r214242, r214254, and bug fix patches from mainline
* config/rs6000/rs6000.c (context.h): New #include.
(tree-pass.h): Likewise.
(make_pass_analyze_swaps): New declaration.
(rs6000_option_override): Register swap-optimization pass.
(swap_web_entry): New class.
(special_handling_values): New enum.
(union_defs): New function.
(union_uses): Likewise.
(insn_is_load_p): Likewise.
(insn_is_store_p): Likewise.
(insn_is_swap_p): Likewise.
(rtx_is_swappable_p): Likewise.
(insn_is_swappable_p): Likewise.
(chain_purpose): New enum.
(chain_contains_only_swaps): New function.
(mark_swaps_for_removal): Likewise.
(swap_const_vector_halves): Likewise.
(adjust_subreg_index): Likewise.
(permute_load): Likewise.
(permute_store): Likewise.
(adjust_extract): Likewise.
(adjust_splat): Likewise.
(handle_special_swappables): Likewise.
(replace_swap_with_copy): Likewise.
(dump_swap_insn_table): Likewise.
(rs6000_analyze_swaps): Likewise.
(pass_data_analyze_swaps): New pass_data.
(pass_analyze_swaps): New class.
(pass_analyze_swaps::gate): New method.
(pass_analyze_swaps::execute): New method.
(make_pass_analyze_swaps): New function.
* config/rs6000/rs6000.opt (moptimize-swaps): New option.
* df.h (web_entry_base): New class, replacing struct web_entry.
(web_entry_base::pred): New method.
(web_entry_base::set_pred): Likewise.
(web_entry_base::unionfind_root): Likewise.
(web_entry_base::unionfind_union): Likewise.
(unionfind_root): Delete external reference.
(unionfind_union): Likewise.
(union_defs): Likewise.
* web.c (web_entry_base::unionfind_root): Convert to method.
(web_entry_base::unionfind_union): Likewise.
(web_entry): New class.
(union_match_dups): Convert to use class structure.
(union_defs): Likewise.
(entry_register): Likewise.
(web_main): Likewise.

[testsuite]

2015-03-25  Bill Schmidt  

Backport r214254 and related tests from mainline
* gcc.target/powerpc/swaps-p8-1.c: New test.
* gcc.target/powerpc/swaps-p8-2.c: New test.
* gcc.target/powerpc/swaps-p8-3.c: New test.
* gcc.target/powerpc/swaps-p8-4.c: New test.
* gcc.target/powerpc/swaps-p8-5.c: New test.
* gcc.target/powerpc/swaps-p8-6.c: New test.
* gcc.target/powerpc/swaps-p8-7.c: New test.
* gcc.target/powerpc/swaps-p8-8.c: New test.
* gcc.target/powerpc/swaps-p8-9.c: New test.
* gcc.target/powerpc/swaps-p8-10.c: New test.
* gcc.target/powerpc/swaps-p8-11.c: New test.
* gcc.target/powerpc/swaps-p8-12.c: New test.
* gcc.target/powerpc/swaps-p8-13.c: New test.
* gcc.target/powerpc/swaps-p8-14.c: New test.
* gcc.target/powerpc/swaps-p8-15.c: New test.
* gcc.target/powerpc/swaps-p8-16.c: New test.
* gcc.target/powerpc/swaps-p8-17.c: New test.


Index: gcc/config/rs6000/rs6000.c
===
--- gcc/config/rs6000/rs6000.c  (revision 221633)
+++ gcc/config/rs6000/rs6000.c  (working copy)
@@ -80,6 +80,8 @@
 #include "cgraph.h"
 #include "target-globals.h"
 #include "real.h"
+#include "context.h"
+#include "tree-pass.h"
 #if TARGET_XCOFF
 #include "xcoffout.h"  /* get declarations of xcoff_*_section_name */
 #endif
@@ -1172,6 +1174,7 @@ static bool rs6000_secondary_reload_move (enum rs6
  enum machine_mode,
  secondary_reload_info *,
  bool);
+rtl_opt_pass *make_pass_analyze_swaps (gcc::context*);
 
 /* Hash table stuff for keeping track of TOC entries.  */
 
@@ -4094,6 +4097,15 @@ static void
 rs6000_option_override (void)
 {
   (void) rs6000_option_override_internal (true);
+
+  /* Register machine-sp

Re: [Patch, Fortran, pr60322] was: [Patch 1/2, Fortran, pr60322] [OOP] Incorrect bounds on polymorphic dummy array

2015-03-25 Thread Andre Vehreschild

Hi Dominique, hi all,

you are absolutely right, Dominique: I missed the pr60322_base_* part.

This time it is included, and the patch now also solves the allocate (mold=e)
and loc (e) issues.

Paul: I have simplified your patch by only checking whether
arg_expr.ts.type == BT_CLASS. All tests showed that this is enough to produce
the correct code.

Bootstraps and regtests ok on x86_64-linux-gnu/F20. 

Comments, please!

Regards,
Andre

On Wed, 25 Mar 2015 10:43:34 +0100
Dominique d'Humières  wrote:

> Hi Andre,
> 
> > Le 24 mars 2015 à 18:06, Andre Vehreschild  a écrit :
> > 
> > Hi all,
> > 
> > I have worked on the comments Mikael gave me. I am now checking for
> > class_pointer in the way he pointed out.
> > 
> > Furthermore did I *join the two parts* of the patch into this one, because
> > keeping both in sync was no benefit but only tedious and did not prove to be
> > reviewed faster.
> 
> Are you sure that you attached the right patch? It does not apply on a clean
> tree unless I apply the patch at
> 
> https://gcc.gnu.org/ml/fortran/2015-02/msg00105.html
> 
> with minor surgery for gcc/fortran/expr.c.
> 
> > Paul, Dominique: I have addressed the LOC issue that came up lately. Or
> > rather the patch addressed it already. I feel like this is not tested very
> > well, not the loc() call nor the sizeof() call as given in the 57305
> > second's download.
> 
> The ICE is fixed and the LOC issue seems fixed. 
> 
> > Unfortunately, is that download not runable. I would love to see a test
> > similar to that download, but couldn't come up with one, that satisfied me.
> > Given that the patch's review will last some days, I still have enough time
> > to come up with something beautiful which I will add then.
> 
> I have changed the test to
> 
> use iso_c_binding
> implicit none
> real, target :: e
> class(*), allocatable, target :: a(:)
> e = 1.0
> call add_element_poly(a,e)
> print *, size(a)
> call add_element_poly(a,e)
> print *, size(a)
> select type (a)
>   type is (real)
> print *, a
> end select
> contains
> subroutine add_element_poly(a,e)
>   use iso_c_binding
>   class(*),allocatable,intent(inout),target :: a(:)
>   class(*),intent(in),target :: e
>   class(*),allocatable,target :: tmp(:)
>   type(c_ptr) :: dummy
>   
>   interface
> function memcpy(dest,src,n) bind(C,name="memcpy") result(res)
>   import
>   type(c_ptr) :: res
>   integer(c_intptr_t),value :: dest
>   integer(c_intptr_t),value :: src
>   integer(c_size_t),value :: n
> end function
>   end interface
> 
>   if (.not.allocated(a)) then
> allocate(a(1), source=e)
>   else
> print *, size(a)
> allocate(tmp(size(a)),source=a)
> print *, size(a), size(tmp) + 1
> print *, loc(a(1)),loc(tmp),sizeof(tmp)
> deallocate(a)
> !allocate(a(size(tmp)+1),mold=e)
> allocate(a(size(tmp)+1),source=e)
> print *, size(a), size(tmp)
> dummy = memcpy(loc(a(1)),loc(tmp),sizeof(tmp))
> dummy = memcpy(loc(a(size(tmp)+1)),loc(e),sizeof(e))
>   end if
> end subroutine
> end
> 
> As pointed by Paul, I get a segfault at run time if I use the commented line,
> i.e. ‘mold’ instead of ‘source’.
> 
> > Bootstraps and regtests ok on x86_64-linux-gnu/F20.
> > 
> > Regards,
> > Andre
> 
> Thanks for your work.
> 
> Dominique
> 


-- 
Andre Vehreschild * Email: vehre ad gmx dot de 


pr60322_full_5.clog
Description: Binary data
diff --git a/gcc/fortran/expr.c b/gcc/fortran/expr.c
index ab6f7a5..7f3a59d 100644
--- a/gcc/fortran/expr.c
+++ b/gcc/fortran/expr.c
@@ -4052,6 +4052,7 @@ gfc_expr *
 gfc_lval_expr_from_sym (gfc_symbol *sym)
 {
   gfc_expr *lval;
+  gfc_array_spec *as;
   lval = gfc_get_expr ();
   lval->expr_type = EXPR_VARIABLE;
   lval->where = sym->declared_at;
@@ -4059,10 +4060,10 @@ gfc_lval_expr_from_sym (gfc_symbol *sym)
   lval->symtree = gfc_find_symtree (sym->ns->sym_root, sym->name);
 
   /* It will always be a full array.  */
-  lval->rank = sym->as ? sym->as->rank : 0;
+  as = IS_CLASS_ARRAY (sym) ? CLASS_DATA (sym)->as : sym->as;
+  lval->rank = as ? as->rank : 0;
   if (lval->rank)
-gfc_add_full_array_ref (lval, sym->ts.type == BT_CLASS ?
-			CLASS_DATA (sym)->as : sym->as);
+gfc_add_full_array_ref (lval, as);
   return lval;
 }
 
diff --git a/gcc/fortran/gfortran.h b/gcc/fortran/gfortran.h
index 8e6595f..901a1c0 100644
--- a/gcc/fortran/gfortran.h
+++ b/gcc/fortran/gfortran.h
@@ -3206,6 +3206,11 @@ bool gfc_is_finalizable (gfc_symbol *, gfc_expr **);
 	 && CLASS_DATA (sym) \
 	 && CLASS_DATA (sym)->ts.u.derived \
 	 && CLASS_DATA (sym)->ts.u.derived->attr.unlimited_polymorphic)
+#define IS_CLASS_ARRAY(sym) \
+	(sym->ts.type == BT_CLASS \
+	 && CLASS_DATA (sym) \
+	 && CLASS_DATA (sym)->attr.dimension \
+	 && !CLASS_DATA (sym)->attr.class_pointer)
 
 /* frontend-passes.c */
 
diff --git a/gcc/fortran/trans-array.c b/gcc/f

Re: [PATCH][AArch64][Testsuite] Fix gcc.target/aarch64/c-output-template-3.c

2015-03-25 Thread James Greenhalgh

On Tue, Mar 24, 2015 at 05:46:57PM +, Alan Lawrence wrote:
> Hmmm. This is not the right fix: the tests Richard fixed, were failing because
> of lack of constant propagation and DCE at compile-time, which then didn't
> eliminate the call to link_error. The AArch64 test is failing because this from
> aarch64/constraints.md:
> 
> (define_constraint "S"
> "A constraint that matches an absolute symbolic address."
> (and (match_code "const,symbol_ref,label_ref")
>  (match_test "aarch64_symbolic_address_p (op)")))
> 
> previously was seeing (and being satisfied by):
> 
> (const:DI (plus:DI (symbol_ref:DI ("test") [flags 0x3]  0x7fb7c60300 test>)
>   (const_int 4 [0x4])))
> 
> but following Richard's patch the constraint is evaluated only on:
> 
> (reg/f:DI 73 [ D.2670 ])
 
I don't think we should get too concerned by this. There are a number
of other constraints which we define which we can only satisfy given
a level of optimisation. Take the I (immediate acceptable for an ADD
instruction) constraint, which will fail for:

int foo (int x)
{
  int z = 5;
  __asm__ ("xxx %0 %1":"=r"(x) : "I"(z));
  return x;
}

at O0 and happily produce:

xxx x0 5

with optimisations.

I think your original patch to add -O is just fine, but Marcus or
Richard will need to approve it.

Cheers,
James

Re: [libstdc++/65033] Give alignment info to libatomic

2015-03-25 Thread Richard Henderson

On 03/25/2015 09:22 AM, Jonathan Wakely wrote:
>  private:
> -  _Tp _M_i;
> +  // Align 1/2/4/8/16-byte types the same as integer types of that size.
> +  // This matches the alignment effects of the C11 _Atomic qualifier.
> +  static constexpr int _S_alignment
> + = sizeof(_Tp) == sizeof(char)  ? alignof(char)
> + : sizeof(_Tp) == sizeof(short) ? alignof(short)
> + : sizeof(_Tp) == sizeof(int)   ? alignof(int)
> + : sizeof(_Tp) == sizeof(long)  ? alignof(long)
> + : sizeof(_Tp) == sizeof(long long) ? alignof(long long)
> +#ifdef _GLIBCXX_USE_INT128
> + : sizeof(_Tp) == sizeof(__int128)  ? alignof(__int128)
> +#endif
> + : alignof(_Tp);
> +
> +  alignas(_S_alignment) _Tp _M_i;


Surely not by reducing a larger alignment applied to _Tp.
I.e.

  static constexpr int _S_min_alignment
= sizeof(_Tp) == sizeof(char)  ? alignof(char)
: sizeof(_Tp) == sizeof(short) ? alignof(short)
: sizeof(_Tp) == sizeof(int)   ? alignof(int)
: sizeof(_Tp) == sizeof(long)  ? alignof(long)
: sizeof(_Tp) == sizeof(long long) ? alignof(long long)
#ifdef _GLIBCXX_USE_INT128
: sizeof(_Tp) == sizeof(__int128)  ? alignof(__int128)
#endif
: 0;

  static constexpr int _S_alignment
= _S_min_alignment > alignof(_Tp) ? _S_min_alignment : alignof(_Tp);
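
For illustration, here is a minimal standalone sketch of the same idea outside
libstdc++. The names min_alignment, storage_alignment, overaligned and twochars
are made up for this example, and the __int128 case is left out to keep it
portable. The point is only that taking the larger of the two values never
lowers an alignment the type already has.

```cpp
#include <cstddef>

// Hypothetical helper (not libstdc++ code): the alignment of the integer
// type whose size equals sizeof(T), or 0 when no such integer type exists.
template <typename T>
constexpr std::size_t min_alignment()
{
  return sizeof(T) == sizeof(char)      ? alignof(char)
       : sizeof(T) == sizeof(short)     ? alignof(short)
       : sizeof(T) == sizeof(int)       ? alignof(int)
       : sizeof(T) == sizeof(long)      ? alignof(long)
       : sizeof(T) == sizeof(long long) ? alignof(long long)
       : 0;
}

// Take the larger of the two values, so an over-aligned T keeps its alignment.
template <typename T>
constexpr std::size_t storage_alignment()
{
  return min_alignment<T>() > alignof(T) ? min_alignment<T>() : alignof(T);
}

// A type whose own alignment was raised explicitly.
struct alignas(16) overaligned { char c[4]; };

// Two chars: naturally aligned to 1, but the same size as short on
// mainstream targets.
struct twochars { char a; char b; };
```

With this shape, storage_alignment<twochars>() is raised to the alignment of
the same-sized integer type, while storage_alignment<overaligned>() stays at 16.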



r~

[PATCH] Add workaround for PR64715

2015-03-25 Thread Jakub Jelinek

Hi!

As discussed in the PR, fixing this issue for real (make sure we at least
until the objsz pass don't lose information on which field's address if any
has been taken) is probably too dangerous at this point, so this patch
just adds a simple workaround:
another objsz pass instance run early before first ccp pass, in which we
only process __bos (x, 1) and __bos (x, 3), and rather than folding them
right away we instead just replace say
  _1 = __builtin_object_size (ptr_2, 1);
with
  _7 = __builtin_object_size (ptr_2, 1);
  _1 = MIN <_7, 17>;
if 17 is what the __builtin_object_size folds to.  The reason for the MIN or
MAX is that later DCE etc. could still make the value smaller
(as shown in the third snippet of __builtin_object_size).
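
To make the arithmetic concrete, here is a small model of what the inserted
MIN_EXPR/MAX_EXPR computes once the later objsz pass has folded the call. It is
only a sketch, not GCC code: combine_with_early_estimate and its parameter
names are invented for this example. It relies on the documented convention
that mode 1 is a maximum estimate (unknown is (size_t)-1) and mode 3 a minimum
estimate (unknown is 0).

```cpp
#include <cstdint>

// Hypothetical model, not GCC code.  MODE is the second argument of
// __builtin_object_size; only modes 1 and 3 are handled by the early pass.
// LATE is what the later pass folds the call to, EARLY is the constant
// computed by the early pass while the field information was still present.
constexpr std::uint64_t
combine_with_early_estimate(int mode, std::uint64_t late, std::uint64_t early)
{
  return mode == 1 ? (late < early ? late : early)    // MIN_EXPR for mode 1
                   : (late > early ? late : early);   // MAX_EXPR for mode 3
}
```

So if the later pass can only say "unknown", the early constant wins, and if
it finds a tighter bound, that tighter bound wins.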

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

For GCC 6 we will need to write a real fix and revert this (except for the
testcases).

2015-03-25  Jakub Jelinek  

PR tree-optimization/64715
* passes.def: Add another instance of pass_object_sizes before
ccp1.
* tree-object-size.c (pass_object_sizes::execute): In
first_pass_instance, only handle __bos (, 1) and __bos (, 3)
calls, and keep the call in the IL, as {MIN,MAX}_EXPR of the
__bos result and the computed constant.  Remove redundant
checks, obsoleted by gimple_call_builtin_p test.  When propagating
folded __bos into uses, if the use is {MIN,MAX}_EXPR we can fold
into constant, propagate even that constant into their uses.

* gcc.dg/builtin-object-size-15.c: New test.
* gcc.dg/pr64715-1.c: New test.
* gcc.dg/pr64715-2.c: New test.

--- gcc/passes.def.jj   2015-01-19 14:40:46.0 +0100
+++ gcc/passes.def  2015-03-25 12:18:21.079207954 +0100
@@ -77,6 +77,7 @@ along with GCC; see the file COPYING3.
   PUSH_INSERT_PASSES_WITHIN (pass_all_early_optimizations)
  NEXT_PASS (pass_remove_cgraph_callee_edges);
  NEXT_PASS (pass_rename_ssa_copies);
+ NEXT_PASS (pass_object_sizes);
  NEXT_PASS (pass_ccp);
  /* After CCP we rewrite no longer addressed locals into SSA
 form if possible.  */
--- gcc/tree-object-size.c.jj   2015-03-20 17:58:31.0 +0100
+++ gcc/tree-object-size.c  2015-03-25 14:40:03.664185560 +0100
@@ -1268,25 +1268,60 @@ pass_object_sizes::execute (function *fu
continue;
 
  init_object_sizes ();
+
+ /* In the first pass instance, only attempt to fold
+__builtin_object_size (x, 1) and __builtin_object_size (x, 3),
+and rather than folding the builtin to the constant if any,
+create a MIN_EXPR or MAX_EXPR of the __builtin_object_size
+call result and the computed constant.  */
+ if (first_pass_instance)
+   {
+ tree ost = gimple_call_arg (call, 1);
+ if (tree_fits_uhwi_p (ost))
+   {
+ unsigned HOST_WIDE_INT object_size_type = tree_to_uhwi (ost);
+ tree ptr = gimple_call_arg (call, 0);
+ tree lhs = gimple_call_lhs (call);
+ if ((object_size_type == 1 || object_size_type == 3)
+ && (TREE_CODE (ptr) == ADDR_EXPR
+ || TREE_CODE (ptr) == SSA_NAME)
+ && lhs)
+   {
+ tree type = TREE_TYPE (lhs);
+ unsigned HOST_WIDE_INT bytes
+   = compute_builtin_object_size (ptr, object_size_type);
+ if (bytes != (unsigned HOST_WIDE_INT) (object_size_type == 1
+? -1 : 0)
+ && wi::fits_to_tree_p (bytes, type))
+   {
+ tree tem = make_ssa_name (type);
+ gimple_call_set_lhs (call, tem);
+ enum tree_code code
+   = object_size_type == 1 ? MIN_EXPR : MAX_EXPR;
+ tree cst = build_int_cstu (type, bytes);
+ gimple g = gimple_build_assign (lhs, code, tem, cst);
+ gsi_insert_after (&i, g, GSI_NEW_STMT);
+ update_stmt (call);
+   }
+   }
+   }
+ continue;
+   }
+
  result = fold_call_stmt (as_a  (call), false);
  if (!result)
{
- if (gimple_call_num_args (call) == 2
- && POINTER_TYPE_P (TREE_TYPE (gimple_call_arg (call, 0
-   {
- tree ost = gimple_call_arg (call, 1);
+ tree ost = gimple_call_arg (call, 1);
 
- if (tree_fits_uhwi_p (ost))
-   {
- unsigned HOST_WIDE_INT object_size_type
-   = tree_to_uhwi (ost);
+ if (tree_fits_uhwi_p (ost))
+

Re: [libstdc++/65033] Give alignment info to libatomic

2015-03-25 Thread Richard Henderson

On 03/25/2015 09:22 AM, Jonathan Wakely wrote:
> +static_assert( alignof(std::atomic) > alignof(int),
> +   "std::atomic not suitably aligned" );

This is only true if int64_t has alignment larger than int32_t,
which is unfortunately not always the case.


r~

Re: [libstdc++/65033] Give alignment info to libatomic

2015-03-25 Thread Jonathan Wakely


On 25/03/15 11:36 -0700, Richard Henderson wrote:
> On 03/25/2015 09:22 AM, Jonathan Wakely wrote:
>>  private:
>> -  _Tp _M_i;
>> +  // Align 1/2/4/8/16-byte types the same as integer types of that size.
>> +  // This matches the alignment effects of the C11 _Atomic qualifier.
>> +  static constexpr int _S_alignment
>> +   = sizeof(_Tp) == sizeof(char)  ? alignof(char)
>> +   : sizeof(_Tp) == sizeof(short) ? alignof(short)
>> +   : sizeof(_Tp) == sizeof(int)   ? alignof(int)
>> +   : sizeof(_Tp) == sizeof(long)  ? alignof(long)
>> +   : sizeof(_Tp) == sizeof(long long) ? alignof(long long)
>> +#ifdef _GLIBCXX_USE_INT128
>> +   : sizeof(_Tp) == sizeof(__int128)  ? alignof(__int128)
>> +#endif
>> +   : alignof(_Tp);
>> +
>> +  alignas(_S_alignment) _Tp _M_i;
>
> Surely not by reducing a larger alignment applied to _Tp.
> I.e.
>
>  static constexpr int _S_min_alignment
> = sizeof(_Tp) == sizeof(char)  ? alignof(char)
> : sizeof(_Tp) == sizeof(short) ? alignof(short)
> : sizeof(_Tp) == sizeof(int)   ? alignof(int)
> : sizeof(_Tp) == sizeof(long)  ? alignof(long)
> : sizeof(_Tp) == sizeof(long long) ? alignof(long long)
> #ifdef _GLIBCXX_USE_INT128
> : sizeof(_Tp) == sizeof(__int128)  ? alignof(__int128)
> #endif
> : 0;
>
>  static constexpr int _S_alignment
> = _S_min_alignment > alignof(_Tp) ? _S_min_alignment : alignof(_Tp);


Doh, good catch. I'll make that change and add a test with a type that
has alignof(X) > sizeof(X).


On 25/03/15 11:39 -0700, Richard Henderson wrote:
> On 03/25/2015 09:22 AM, Jonathan Wakely wrote:
>> +static_assert( alignof(std::atomic) > alignof(int),
>> +   "std::atomic not suitably aligned" );
>
> This is only true if int64_t has alignment larger than int32_t,
> which is unfortunately not always the case.


Huh, didn't realise that. I could change the tests to check it's
alignof(std::int64_t) as the next assertion does, but is it safe to
assume that struct twoints { int a; int b; } is exactly 64 bits
everywhere?

I'd prefer not to have the test say "if sizeof(twoints) ==
sizeof(long), test this, otherwise if sizeof(twoints) == ..."

C++ PATCH for c++/65558 (ICE with abi_tag on anon namespace)

2015-03-25 Thread Marek Polacek

As discussed in the PR, the abi_tag on an anonymous namespace is useless,
but we shouldn't ICE if the user attempts to do that.

Bootstrapped/regtested on x86_64-linux, ok for trunk?

2015-03-25  Marek Polacek  

PR c++/65558
* name-lookup.c (handle_namespace_attrs): Ignore abi_tag attribute
on an anonymous namespace.

* g++.dg/cpp0x/pr65558.C: New test.

diff --git gcc/cp/name-lookup.c gcc/cp/name-lookup.c
index b85fbc9..4303ed5 100644
--- gcc/cp/name-lookup.c
+++ gcc/cp/name-lookup.c
@@ -3663,6 +3663,12 @@ handle_namespace_attrs (tree ns, tree attributes)
   "namespace", name);
  continue;
}
+ if (!DECL_NAME (ns))
+   {
+ warning (OPT_Wattributes, "ignoring %qD attribute on anonymous "
+  "namespace", name);
+ continue;
+   }
  if (!args)
{
  tree dn = DECL_NAME (ns);
diff --git gcc/testsuite/g++.dg/cpp0x/pr65558.C gcc/testsuite/g++.dg/cpp0x/pr65558.C
index e69de29..5437e50 100644
--- gcc/testsuite/g++.dg/cpp0x/pr65558.C
+++ gcc/testsuite/g++.dg/cpp0x/pr65558.C
@@ -0,0 +1,6 @@
+// PR c++/65558
+// { dg-do compile { target c++11 } }
+
+inline namespace __attribute__((__abi_tag__))
+{ // { dg-warning "ignoring .__abi_tag__. attribute on anonymous namespace" }
+}

Marek

Re: [libstdc++/65033] Give alignment info to libatomic

2015-03-25 Thread Richard Henderson

On 03/25/2015 11:49 AM, Jonathan Wakely wrote:
> On 25/03/15 11:36 -0700, Richard Henderson wrote:
>> On 03/25/2015 09:22 AM, Jonathan Wakely wrote:
> On 25/03/15 11:39 -0700, Richard Henderson wrote:
>> On 03/25/2015 09:22 AM, Jonathan Wakely wrote:
>>> +static_assert( alignof(std::atomic) > alignof(int),
>>> +   "std::atomic not suitably aligned" );
>>
>> This is only true if int64_t has alignment larger than int32_t,
>> which is unfortunately not always the case.
> 
> Huh, didn't realise that. I could change the tests to check it's
> alignof(std::int64_t) as the next assertion does, but is it safe to
> assume that struct twoints { int a; int b; } is exactly 64 bits
> everywhere?

Certainly not.  But if you're going to explicitly use int64_t elsewhere, you
might as well explicitly use int32_t as well.  Then I believe you can
reasonably assert

  alignof(twoint32) == alignof(int64_t)
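
As a rough sketch of why the fixed-width members help: plain int is not
guaranteed to be 32 bits, so a struct of two ints need not be 64 bits, whereas
two int32_t members are. The example below does not touch std::atomic itself;
expected_atomic_alignment is an invented name that only models what a
size-based alignment rule would pick for such a struct.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical test type built from fixed-width members, so its size does
// not depend on how wide plain int is on the target.
struct twoint32 { std::int32_t a; std::int32_t b; };

static_assert(sizeof(twoint32) == sizeof(std::int64_t),
              "two 32-bit members should occupy exactly 64 bits");

// What a wrapper following the size-based rule would be aligned to: the
// larger of the struct's own alignment and that of the same-sized integer.
constexpr std::size_t expected_atomic_alignment =
    alignof(std::int64_t) > alignof(twoint32) ? alignof(std::int64_t)
                                              : alignof(twoint32);
```

Whether a given std::atomic implementation actually yields this value is a
separate question that the real test has to check.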


r~

Re: [PATCH, bootstrap]: Add bootstrap-lto-noplugin build configuration (PR65537)

2015-03-25 Thread Uros Bizjak

On Wed, Mar 25, 2015 at 3:23 PM, Jakub Jelinek  wrote:

>> Attached patch introduces bootstrap-lto-noplugin bootstrap
>> configuration for hosts that do not support linker plugin (e.g. CentOS
>> 5.11 with binutils 2.17). Also, the patch adds some additional
>> documentation to bootstrap-lto option.
>>
>> config/ChangeLog:
>>
>> 2015-03-24  Uros Bizjak  
>>
>> PR bootstrap/65537
>> * bootstrap-lto-noplugin.mk: New build configuration.
>>
>> gcc/ChangeLog:
>>
>> 2015-03-24  Uros Bizjak  
>>
>> PR bootstrap/65537
>> * doc/install.texi (Building a native compiler): Document new
>> bootstrap-lto-noplugin configuration.  Mention that bootstrap-lto
>> configuration assumes that the host supports the linker plugin.
>>
>> Patch was bootstrapped and tested on x86_64-linux-gnu (CentOS 5.11)
>> host, configured with --with-build-config=bootstrap-lto build
>> configuration.
>
> and not --with-build-config=bootstrap-lto-noplugin ?

Oh ... with bootstrap-lto-noplugin option. The bootstrap with linker
plugin does not work at all on CentOS 5.11.

Uros.

Re: C++ PATCH for c++/65558 (ICE with abi_tag on anon namespace)

2015-03-25 Thread Jason Merrill


OK.

Jason

Re: C++ PATCH for c++/61670 (ice-after-error with null DECL_SIZE)

2015-03-25 Thread Jason Merrill


OK.

Jason

Re: [debug-early] emit early dwarf for locally scoped functions

2015-03-25 Thread Jason Merrill


On 03/24/2015 02:00 PM, Aldy Hernandez wrote:
> I found that for locally scoped functions we were not emitting early
> dwarf.

Why weren't they being emitted as part of their enclosing function?
They should be.


Jason

Re: [debug-early] emit early dwarf for locally scoped functions

2015-03-25 Thread Aldy Hernandez


On 03/25/2015 12:37 PM, Jason Merrill wrote:
> On 03/24/2015 02:00 PM, Aldy Hernandez wrote:
>> I found that for locally scoped functions we were not emitting early
>> dwarf.
>
> Why weren't they being emitted as part of their enclosing function? They
> should be.
>
> Jason

Hmm, you're right.  Sorry for being so sloppy.

What is actually happening is that when the declaration is seen, 
nameless DIEs for the types are generated, which are then used when the 
cached subprogram DIE is seen the second time.  The nameless DIEs end up 
looking like this because we don't have the "this" name:


char Object_method(Object * const);

whereas the function type should be:

char Object_method(void);

I now understand what this was doing in mainline:

  /* Clear out the declaration attribute and the formal parameters.
 Do not remove all children, because it is possible that this
 declaration die was forced using force_decl_die(). In such
 cases die that forced declaration die (e.g. TAG_imported_module)
 is one of the children that we do not want to remove.  */
  remove_AT (subr_die, DW_AT_declaration);
  remove_AT (subr_die, DW_AT_object_pointer);
  remove_child_TAG (subr_die, DW_TAG_formal_parameter);

I suppose we could re-use the DW_AT_object_pointer and
DW_TAG_formal_parameter, and tack on the DW_AT_name now that we know it?
Or we could cheat and just remove them as mainline does, but only when
reusing a declaration (as in the attached patch).
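
The second option boils down to having the attribute removal report whether it
removed anything, and wiping the parameters only in that case. Here is a
standalone sketch of that control flow with made-up stand-ins (die, attr_kind,
remove_attr and reuse_declaration are hypothetical and are not the dwarf2out.c
types or functions):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for the DIE and attribute types.
enum attr_kind { AT_declaration, AT_object_pointer, AT_name };

struct die
{
  std::vector<attr_kind> attrs;
  std::vector<int> formal_params;   // stand-in for formal parameter children
};

// Remove the first attribute of the given kind; report whether one was found.
bool remove_attr(die *d, attr_kind kind)
{
  if (!d)
    return false;
  for (std::size_t i = 0; i < d->attrs.size(); ++i)
    if (d->attrs[i] == kind)
      {
        d->attrs.erase(d->attrs.begin() + i);   // keeps the remaining order
        return true;
      }
  return false;
}

// Only when the DIE really was a declaration, drop the nameless parameters
// so that they can be regenerated with names later.
void reuse_declaration(die *d)
{
  assert(d != nullptr);
  if (remove_attr(d, AT_declaration))
    {
      remove_attr(d, AT_object_pointer);
      d->formal_params.clear();
    }
}
```

A DIE that was never marked as a declaration keeps its parameters untouched,
which is the difference from removing them unconditionally.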


What do you think?

Aldy
diff --git a/gcc/dwarf2out.c b/gcc/dwarf2out.c
index 48e2eed..4bc945f 100644
--- a/gcc/dwarf2out.c
+++ b/gcc/dwarf2out.c
@@ -3113,7 +3113,7 @@ static inline dw_die_ref get_AT_ref (dw_die_ref, enum dwarf_attribute);
 static bool is_cxx (void);
 static bool is_fortran (void);
 static bool is_ada (void);
-static void remove_AT (dw_die_ref, enum dwarf_attribute);
+static bool remove_AT (dw_die_ref, enum dwarf_attribute);
 static void remove_child_TAG (dw_die_ref, enum dwarf_tag);
 static void add_child_die (dw_die_ref, dw_die_ref);
 static dw_die_ref new_die (enum dwarf_tag, dw_die_ref, tree);
@@ -4752,16 +4752,17 @@ is_ada (void)
   return lang == DW_LANG_Ada95 || lang == DW_LANG_Ada83;
 }
 
-/* Remove the specified attribute if present.  */
+/* Remove the specified attribute if present.  Return TRUE if removal
+   was successful.  */
 
-static void
+static bool
 remove_AT (dw_die_ref die, enum dwarf_attribute attr_kind)
 {
   dw_attr_ref a;
   unsigned ix;
 
   if (! die)
-return;
+return false;
 
   FOR_EACH_VEC_SAFE_ELT (die->die_attr, ix, a)
 if (a->dw_attr == attr_kind)
@@ -4773,8 +4774,9 @@ remove_AT (dw_die_ref die, enum dwarf_attribute attr_kind)
/* vec::ordered_remove should help reduce the number of abbrevs
   that are needed.  */
die->die_attr->ordered_remove (ix);
-   return;
+   return true;
   }
+  return false;
 }
 
 /* Remove CHILD from its parent.  PREV must have the property that
@@ -18790,8 +18792,15 @@ gen_subprogram_die (tree decl, dw_die_ref context_die)
 
  /* Clear out the declaration attribute, but leave the
 parameters so they can be augmented with location
-information later.  */
- remove_AT (subr_die, DW_AT_declaration);
+information later.  Unless this was a declaration, in
+which case, wipe out the nameless parameters and recreate
+them further down.  */
+ if (remove_AT (subr_die, DW_AT_declaration))
+   {
+
+ remove_AT (subr_die, DW_AT_object_pointer);
+ remove_child_TAG (subr_die, DW_TAG_formal_parameter);
+   }
}
   /* Make a specification pointing to the previously built
 declaration.  */

libgo patch committed: Add runtime/cgo to list of standard packages

2015-03-25 Thread Ian Lance Taylor

PR 65570 points out that the recent patch to the go tool breaks the
use of cgo (and obviously also points out that we need better testing
for go and cgo).  The problem is that the go tool treats the
runtime/cgo package specially.  Although gccgo doesn't use that
package, the go tool needs to know that it has no source code.  This
patch fixes it.  Bootstrapped and ran Go testsuite on
x86_64-unknown-linux-gnu.  Committed to mainline.

Ian
diff -r 2048f406394a libgo/Makefile.am
--- a/libgo/Makefile.am Tue Mar 24 13:54:30 2015 -0700
+++ b/libgo/Makefile.am Wed Mar 25 14:14:41 2015 -0700
@@ -987,7 +987,7 @@
echo 'package main' > zstdpkglist.go.tmp
echo "" >> zstdpkglist.go.tmp
echo 'var stdpkg = map[string]bool{' >> zstdpkglist.go.tmp
-   echo $(libgo_go_objs) 'unsafe.lo' | sed 's/\.lo /\": true,\n/g' | sed 's/\.lo/\": true,/' | sed 's/-go//' | grep -v _c | sed 's/^/\t\"/' | sort | uniq >> zstdpkglist.go.tmp
+   echo $(libgo_go_objs) 'unsafe.lo' 'runtime/cgo.lo' | sed 's/\.lo /\": true,\n/g' | sed 's/\.lo/\": true,/' | sed 's/-go//' | grep -v _c | sed 's/^/\t\"/' | sort | uniq >> zstdpkglist.go.tmp
echo '}' >> zstdpkglist.go.tmp
$(SHELL) $(srcdir)/mvifdiff.sh zstdpkglist.go.tmp zstdpkglist.go
$(STAMP) $@

Re: [PATCH, rs6000, 4.9] Backport little endian swap optimization to 4.9

2015-03-25 Thread David Edelsohn

On Wed, Mar 25, 2015 at 12:42 PM, Bill Schmidt wrote:
> Hi,
>
> The POWER-specific little-endian swap optimization pass has been burning
> in on mainline since last August.  Since then there have been a few
> improvements and bug fixes, but the code is very stable.  I've had some
> recent requests to get this code backported to 4.9, as it provides
> important performance benefits for vector computation.
>
> Most of the work is target-specific, but there are some
> target-independent changes to convert web.c to use a class structure so
> that this pass can inherit it.  This has caused no problems and was not
> controversial when added to trunk.
>
> The rest of the patch is straightforward backporting of the
> target-specific pieces and test cases.  There have been a few
> infrastructural changes to adjust to, but nothing major.
>
> After this goes in, I'll work on taking it back to 4.8 as well.  OK for
> 4.9?
>
> Thanks,
> Bill
>
>
> gcc:
>
> 2015-03-25  Bill Schmidt  
>
> Backport of r214242, r214254, and bug fix patches from mainline
> * config/rs6000/rs6000.c (context.h): New #include.
> (tree-pass.h): Likewise.
> (make_pass_analyze_swaps): New declaration.
> (rs6000_option_override): Register swap-optimization pass.
> (swap_web_entry): New class.
> (special_handling_values): New enum.
> (union_defs): New function.
> (union_uses): Likewise.
> (insn_is_load_p): Likewise.
> (insn_is_store_p): Likewise.
> (insn_is_swap_p): Likewise.
> (rtx_is_swappable_p): Likewise.
> (insn_is_swappable_p): Likewise.
> (chain_purpose): New enum.
> (chain_contains_only_swaps): New function.
> (mark_swaps_for_removal): Likewise.
> (swap_const_vector_halves): Likewise.
> (adjust_subreg_index): Likewise.
> (permute_load): Likewise.
> (permute_store): Likewise.
> (adjust_extract): Likewise.
> (adjust_splat): Likewise.
> (handle_special_swappables): Likewise.
> (replace_swap_with_copy): Likewise.
> (dump_swap_insn_table): Likewise.
> (rs6000_analyze_swaps): Likewise.
> (pass_data_analyze_swaps): New pass_data.
> (pass_analyze_swaps): New class.
> (pass_analyze_swaps::gate): New method.
> (pass_analyze_swaps::execute): New method.
> (make_pass_analyze_swaps): New function.
> * config/rs6000/rs6000.opt (moptimize-swaps): New option.
> * df.h (web_entry_base): New class, replacing struct web_entry.
> (web_entry_base::pred): New method.
> (web_entry_base::set_pred): Likewise.
> (web_entry_base::unionfind_root): Likewise.
> (web_entry_base::unionfind_union): Likewise.
> (unionfind_root): Delete external reference.
> (unionfind_union): Likewise.
> (union_defs): Likewise.
> * web.c (web_entry_base::unionfind_root): Convert to method.
> (web_entry_base::unionfind_union): Likewise.
> (web_entry): New class.
> (union_match_dups): Convert to use class structure.
> (union_defs): Likewise.
> (entry_register): Likewise.
> (web_main): Likewise.
>
> [testsuite]
>
> 2015-03-25  Bill Schmidt  
>
> Backport r214254 and related tests from mainline
> * gcc.target/powerpc/swaps-p8-1.c: New test.
> * gcc.target/powerpc/swaps-p8-2.c: New test.
> * gcc.target/powerpc/swaps-p8-3.c: New test.
> * gcc.target/powerpc/swaps-p8-4.c: New test.
> * gcc.target/powerpc/swaps-p8-5.c: New test.
> * gcc.target/powerpc/swaps-p8-6.c: New test.
> * gcc.target/powerpc/swaps-p8-7.c: New test.
> * gcc.target/powerpc/swaps-p8-8.c: New test.
> * gcc.target/powerpc/swaps-p8-9.c: New test.
> * gcc.target/powerpc/swaps-p8-10.c: New test.
> * gcc.target/powerpc/swaps-p8-11.c: New test.
> * gcc.target/powerpc/swaps-p8-12.c: New test.
> * gcc.target/powerpc/swaps-p8-13.c: New test.
> * gcc.target/powerpc/swaps-p8-14.c: New test.
> * gcc.target/powerpc/swaps-p8-15.c: New test.
> * gcc.target/powerpc/swaps-p8-16.c: New test.
> * gcc.target/powerpc/swaps-p8-17.c: New test.

Okay.

I was hoping to backport both this piece and your newer swapping
patches together, but those patches cannot go in until GCC 5 is
released and trunk is re-opened for non-bug-fix patches.  Once some of
the optimizations are applied, users complain about other extraneous
swaps addressed by your next set of patches.

Thanks, David
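
For readers unfamiliar with the web.c change mentioned in the quoted ChangeLog,
the class structure is essentially a union-find node that a target pass can
derive from. The sketch below borrows the member names from the ChangeLog
(pred, set_pred, unionfind_root, unionfind_union, swap_web_entry), but the
bodies and the is_swap field are written for this example and are not copied
from GCC.

```cpp
#include <cassert>

// Illustrative sketch only; not the df.h/web.c implementation.
class web_entry_base
{
  web_entry_base *pred_pvt = nullptr;   // parent link; null for a root

public:
  web_entry_base *pred() { return pred_pvt; }
  void set_pred(web_entry_base *p) { pred_pvt = p; }

  // Find the representative of this entry's set, compressing the path.
  web_entry_base *unionfind_root()
  {
    web_entry_base *root = this;
    while (root->pred())
      root = root->pred();
    // Point every visited entry straight at the root.
    web_entry_base *e = this;
    while (e->pred())
      {
        web_entry_base *next = e->pred();
        e->set_pred(root);
        e = next;
      }
    return root;
  }
};

// Merge the sets containing FIRST and SECOND.  In this sketch the function
// returns true if they were already in the same set, false if it merged them.
bool unionfind_union(web_entry_base *first, web_entry_base *second)
{
  assert(first && second);
  first = first->unionfind_root();
  second = second->unionfind_root();
  if (first == second)
    return true;
  second->set_pred(first);
  return false;
}

// A target-specific pass can inherit the machinery and add its own data.
struct swap_web_entry : web_entry_base
{
  bool is_swap = false;   // hypothetical per-insn flag
};
```

Keeping the union-find logic in the base class is what lets a target pass
reuse it without duplicating web.c.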

[PATCH, testsuite]: Fix gcc.target/i386/sse-{13,23}.c

2015-03-25 Thread Uros Bizjak

Hello!

For some reason gcc.target/i386/sse-13.c lost its #include
<x86intrin.h>.  Attached patch fixes this issue and adjusts
corresponding #defines.  The patch also removes extra #includes from
sse-23.c.
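
One of the #define adjustments (the rtmintrin.h hunk) only removes a space
between the macro name and its parameter list. That space matters: with it,
the preprocessor defines an object-like macro whose body starts with "(N)".
The standalone sketch below shows the difference by stringifying both
expansions. SPACED_MACRO, TIGHT_MACRO and target are hypothetical names used
only for illustration.

```cpp
// Two-level stringification, so the argument is macro-expanded first.
#define STRINGIFY_(...) #__VA_ARGS__
#define STRINGIFY(...) STRINGIFY_(__VA_ARGS__)

// Space before '(': an object-like macro whose body begins with "(N)".
#define SPACED_MACRO (N) target(1)

// No space: a function-like macro taking one parameter N.
#define TIGHT_MACRO(N) target(1)

// The expansions, captured as strings for inspection.
constexpr const char *spaced_expansion = STRINGIFY(SPACED_MACRO);
constexpr const char *tight_expansion = STRINGIFY(TIGHT_MACRO(42));
```

Here spaced_expansion begins with the stray "(N)", while tight_expansion is
just the replacement call with the argument overridden.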

2015-03-25  Uros Bizjak  

* gcc.target/i386/sse-13.c: Include x86intrin.h and adjust #defines.
* gcc.target/i386/sse-23.c: Do not explicitly include wmmintrin.h,
smmintrin.h and mm3dnow.h.

Tested on x86_64-linux-gnu {,-m32}  and committed to mainline SVN.

Uros.
Index: gcc.target/i386/sse-13.c
===
--- gcc.target/i386/sse-13.c(revision 221669)
+++ gcc.target/i386/sse-13.c(working copy)
@@ -55,20 +55,6 @@
 #define __builtin_ia32_vcvtps2ph(A, I) __builtin_ia32_vcvtps2ph(A, 1)
 #define __builtin_ia32_vcvtps2ph256(A, I) __builtin_ia32_vcvtps2ph256(A, 1)
 
-/* avx512pfintrin.h */
-#define __builtin_ia32_gatherpfdps(A, B, C, D, E) __builtin_ia32_gatherpfdps (A, B, C, 1, 1)
-#define __builtin_ia32_gatherpfqps(A, B, C, D, E) __builtin_ia32_gatherpfqps (A, B, C, 1, 1)
-#define __builtin_ia32_scatterpfdps(A, B, C, D, E) __builtin_ia32_scatterpfdps (A, B, C, 1, 1)
-#define __builtin_ia32_scatterpfqps(A, B, C, D, E) __builtin_ia32_scatterpfqps (A, B, C, 1, 1)
-
-/* avx512erintrin.h */
-#define __builtin_ia32_exp2pd_mask(A, B, C, D) __builtin_ia32_exp2pd_mask (A, B, C, 1)
-#define __builtin_ia32_exp2ps_mask(A, B, C, D) __builtin_ia32_exp2ps_mask (A, B, C, 1)
-#define __builtin_ia32_rcp28pd_mask(A, B, C, D) __builtin_ia32_rcp28pd_mask (A, B, C, 1)
-#define __builtin_ia32_rcp28ps_mask(A, B, C, D) __builtin_ia32_rcp28ps_mask (A, B, C, 1)
-#define __builtin_ia32_rsqrt28pd_mask(A, B, C, D) __builtin_ia32_rsqrt28pd_mask (A, B, C, 1)
-#define __builtin_ia32_rsqrt28ps_mask(A, B, C, D) __builtin_ia32_rsqrt28ps_mask (A, B, C, 1)
-
 /* wmmintrin.h */
 #define __builtin_ia32_aeskeygenassist128(X, C) __builtin_ia32_aeskeygenassist128(X, 1)
 #define __builtin_ia32_pclmulqdq128(X, Y, I) __builtin_ia32_pclmulqdq128(X, Y, 1)
@@ -195,13 +181,13 @@
 #define __builtin_ia32_gatherdiv4si256(X, Y, Z, K, M) __builtin_ia32_gatherdiv4si256(X, Y, Z, K, 1)
 
 /* rtmintrin.h */
-#define __builtin_ia32_xabort (N) __builtin_ia32_xabort (1)
+#define __builtin_ia32_xabort(N) __builtin_ia32_xabort(1)
 
 /* avx512fintrin.h */
 #define __builtin_ia32_addpd512_mask(A, B, C, D, E) __builtin_ia32_addpd512_mask(A, B, C, D, 8)
 #define __builtin_ia32_addps512_mask(A, B, C, D, E) __builtin_ia32_addps512_mask(A, B, C, D, 8)
-#define __builtin_ia32_addsd_mask(A, B, C, D, E) __builtin_ia32_addsd_mask(A, B, C, D, 8)
-#define __builtin_ia32_addss_mask(A, B, C, D, E) __builtin_ia32_addss_mask(A, B, C, D, 8)
+#define __builtin_ia32_addsd_round(A, B, C) __builtin_ia32_addsd_round(A, B, 8)
+#define __builtin_ia32_addss_round(A, B, C) __builtin_ia32_addss_round(A, B, 8)
 #define __builtin_ia32_alignd512_mask(A, B, F, D, E) __builtin_ia32_alignd512_mask(A, B, 1, D, E)
 #define __builtin_ia32_alignq512_mask(A, B, F, D, E) __builtin_ia32_alignq512_mask(A, B, 1, D, E)
 #define __builtin_ia32_cmpd512_mask(A, B, E, D) __builtin_ia32_cmpd512_mask(A, B, 1, D)
@@ -217,11 +203,11 @@
 #define __builtin_ia32_cvtps2dq512_mask(A, B, C, D) __builtin_ia32_cvtps2dq512_mask(A, B, C, 8)
 #define __builtin_ia32_cvtps2pd512_mask(A, B, C, D) __builtin_ia32_cvtps2pd512_mask(A, B, C, 8)
 #define __builtin_ia32_cvtps2udq512_mask(A, B, C, D) __builtin_ia32_cvtps2udq512_mask(A, B, C, 8)
-#define __builtin_ia32_cvtsd2ss_mask(A, B, C, D, E) __builtin_ia32_cvtsd2ss_mask(A, B, C, D, 8)
+#define __builtin_ia32_cvtsd2ss_round(A, B, C) __builtin_ia32_cvtsd2ss_round(A, B, 8)
+#define __builtin_ia32_cvtss2sd_round(A, B, C) __builtin_ia32_cvtss2sd_round(A, B, 4)
 #define __builtin_ia32_cvtsi2sd64(A, B, C) __builtin_ia32_cvtsi2sd64(A, B, 8)
 #define __builtin_ia32_cvtsi2ss32(A, B, C) __builtin_ia32_cvtsi2ss32(A, B, 8)
 #define __builtin_ia32_cvtsi2ss64(A, B, C) __builtin_ia32_cvtsi2ss64(A, B, 8)
-#define __builtin_ia32_cvtss2sd_mask(A, B, C, D, E) __builtin_ia32_cvtss2sd_mask(A, B, C, D, 8)
 #define __builtin_ia32_cvttpd2dq512_mask(A, B, C, D) __builtin_ia32_cvttpd2dq512_mask(A, B, C, 8)
 #define __builtin_ia32_cvttpd2udq512_mask(A, B, C, D) __builtin_ia32_cvttpd2udq512_mask(A, B, C, 8)
 #define __builtin_ia32_cvttps2dq512_mask(A, B, C, D) __builtin_ia32_cvttps2dq512_mask(A, B, C, 8)
@@ -232,8 +218,8 @@
 #define __builtin_ia32_cvtusi2ss64(A, B, C) __builtin_ia32_cvtusi2ss64(A, B, 8)
 #define __builtin_ia32_divpd512_mask(A, B, C, D, E) __builtin_ia32_divpd512_mask(A, B, C, D, 8)
 #define __builtin_ia32_divps512_mask(A, B, C, D, E) __builtin_ia32_divps512_mask(A, B, C, D, 8)
-#define __builtin_ia32_divsd_mask(A, B, C, D, E) __builtin_ia32_divsd_mask(A, B, C, D, 8)
-#define __builtin_ia32_divss_mask(A, B, C, D, E) __builtin_ia32_divss_mask(A, B, C, D, 8)
+#define __builtin_ia32_divsd_round(A, B, C) __builtin_ia32_divsd_round(A, B, 8)
+#define __builtin_ia32_divss_round(A, B, C) __builtin_ia32_divss_roun

Re: [PATCH, rs6000, 4.9] Backport little endian swap optimization to 4.9

2015-03-25 Thread Bill Schmidt

On Wed, 2015-03-25 at 17:56 -0400, David Edelsohn wrote:
> On Wed, Mar 25, 2015 at 12:42 PM, Bill Schmidt
>  wrote:
> > Hi,
> >
> > The POWER-specific little-endian swap optimization pass has been burning
> > in on mainline since last August.  Since then there have been a few
> > improvements and bug fixes, but the code is very stable.  I've had some
> > recent requests to get this code backported to 4.9, as it provides
> > important performance benefits for vector computation.
> >
> > Most of the work is target-specific, but there are some
> > target-independent changes to convert web.c to use a class structure so
> > that this pass can inherit it.  This has caused no problems and was not
> > controversial when added to trunk.
> >
> > The rest of the patch is straightforward backporting of the
> > target-specific pieces and test cases.  There have been a few
> > infrastructural changes to adjust to, but nothing major.
> >
> > After this goes in, I'll work on taking it back to 4.8 as well.  OK for
> > 4.9?
> >
> > Thanks,
> > Bill
> >
> >
> > gcc:
> >
> > 2015-03-25  Bill Schmidt  
> >
> > Backport of r214242, r214254, and bug fix patches from mainline
> > * config/rs6000/rs6000.c (context.h): New #include.
> > (tree-pass.h): Likewise.
> > (make_pass_analyze_swaps): New declaration.
> > (rs6000_option_override): Register swap-optimization pass.
> > (swap_web_entry): New class.
> > (special_handling_values): New enum.
> > (union_defs): New function.
> > (union_uses): Likewise.
> > (insn_is_load_p): Likewise.
> > (insn_is_store_p): Likewise.
> > (insn_is_swap_p): Likewise.
> > (rtx_is_swappable_p): Likewise.
> > (insn_is_swappable_p): Likewise.
> > (chain_purpose): New enum.
> > (chain_contains_only_swaps): New function.
> > (mark_swaps_for_removal): Likewise.
> > (swap_const_vector_halves): Likewise.
> > (adjust_subreg_index): Likewise.
> > (permute_load): Likewise.
> > (permute_store): Likewise.
> > (adjust_extract): Likewise.
> > (adjust_splat): Likewise.
> > (handle_special_swappables): Likewise.
> > (replace_swap_with_copy): Likewise.
> > (dump_swap_insn_table): Likewise.
> > (rs6000_analyze_swaps): Likewise.
> > (pass_data_analyze_swaps): New pass_data.
> > (pass_analyze_swaps): New class.
> > (pass_analyze_swaps::gate): New method.
> > (pass_analyze_swaps::execute): New method.
> > (make_pass_analyze_swaps): New function.
> > * config/rs6000/rs6000.opt (moptimize-swaps): New option.
> > * df.h (web_entry_base): New class, replacing struct web_entry.
> > (web_entry_base::pred): New method.
> > (web_entry_base::set_pred): Likewise.
> > (web_entry_base::unionfind_root): Likewise.
> > (web_entry_base::unionfind_union): Likewise.
> > (unionfind_root): Delete external reference.
> > (unionfind_union): Likewise.
> > (union_defs): Likewise.
> > * web.c (web_entry_base::unionfind_root): Convert to method.
> > (web_entry_base::unionfind_union): Likewise.
> > (web_entry): New class.
> > (union_match_dups): Convert to use class structure.
> > (union_defs): Likewise.
> > (entry_register): Likewise.
> > (web_main): Likewise.
> >
> > [testsuite]
> >
> > 2015-03-25  Bill Schmidt  
> >
> > Backport r214254 and related tests from mainline
> > * gcc.target/powerpc/swaps-p8-1.c: New test.
> > * gcc.target/powerpc/swaps-p8-2.c: New test.
> > * gcc.target/powerpc/swaps-p8-3.c: New test.
> > * gcc.target/powerpc/swaps-p8-4.c: New test.
> > * gcc.target/powerpc/swaps-p8-5.c: New test.
> > * gcc.target/powerpc/swaps-p8-6.c: New test.
> > * gcc.target/powerpc/swaps-p8-7.c: New test.
> > * gcc.target/powerpc/swaps-p8-8.c: New test.
> > * gcc.target/powerpc/swaps-p8-9.c: New test.
> > * gcc.target/powerpc/swaps-p8-10.c: New test.
> > * gcc.target/powerpc/swaps-p8-11.c: New test.
> > * gcc.target/powerpc/swaps-p8-12.c: New test.
> > * gcc.target/powerpc/swaps-p8-13.c: New test.
> > * gcc.target/powerpc/swaps-p8-14.c: New test.
> > * gcc.target/powerpc/swaps-p8-15.c: New test.
> > * gcc.target/powerpc/swaps-p8-16.c: New test.
> > * gcc.target/powerpc/swaps-p8-17.c: New test.
> 
> Okay.
> 
> However, I was hoping to perform the backport of both this piece and
> your newer swapping patches together, but those patches cannot go in
> until GCC 5 is released and trunk is re-opened for non-bug fix
> patches.  Once some of the optimizations are applied, users complain
> about other extraneous swaps addressed by your next set of patches.

Thanks, David.  I agree, and that was my original plan until we had some
customer r

Optimize lto location streaming

2015-03-25 Thread Jan Hubicka

Hi,
linemap is optimized for the situation where the parser enters positions into
it in source order.  LTO does not work this way: it attaches locations to trees
and reads them more or less randomly.  This results in large memory use of
linemaps, slow lookups (which are critical for WPA streaming) and, as I noticed
recently, also wrong line and column info.

This patch changes that by streaming the locations into a cache that is sorted
and then applied in source order.  The cache also knows how to cheaply discard
entries belonging to trees that were removed by tree merging.

One catch is that the locations are not yet present in the trees, and thus
cannot be expanded, copied or relocated, before lto_apply_location_cache is
called.  I hope I caught the cases where this can happen.  These include:
 1) calling debug hooks during ltrans from lto_read_decls
 2) producing odr violation warnings from ipa-devirt
 3) modifying locations to record blocks (unpack_ts_block_value_fields)
 4) for safety I skipped the trick for gimple streaming for now because at
    least PHI args can probably be relocated.

Bootstrapped/regtested x86_64-linux, the patch saves about 1GB of locators for
chromium and 400MB for firefox LTO.
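
To make the caching idea above concrete, here is a minimal standalone sketch
in C.  It is not the code from the patch: toy_location_t, struct
toy_cached_location, toy_cmp_loc and toy_apply_location_cache are hypothetical
names, and plain increasing integers stand in for real linemap entries.  It
only shows the shape of the technique: remember where each location has to be
stored, sort the pending entries into source order, then assign them in one
pass.

```c
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for a location number; hypothetical, not GCC's location_t.  */
typedef unsigned int toy_location_t;

/* One deferred location: the source position that was streamed in and the
   place where the final location number must be stored.  */
struct toy_cached_location
{
  const char *file;
  int line;
  int col;
  toy_location_t *dest;
};

/* Order entries by file, then line, then column (i.e. source order).  */
static int
toy_cmp_loc (const void *pa, const void *pb)
{
  const struct toy_cached_location *a = pa;
  const struct toy_cached_location *b = pb;
  int c = strcmp (a->file, b->file);
  if (c != 0)
    return c;
  if (a->line != b->line)
    return a->line < b->line ? -1 : 1;
  if (a->col != b->col)
    return a->col < b->col ? -1 : 1;
  return 0;
}

/* Sort the N pending entries and hand out increasing location numbers
   starting at FIRST.  Entries naming the same position share one number.
   Returns the next unused number.  */
toy_location_t
toy_apply_location_cache (struct toy_cached_location *cache, size_t n,
                          toy_location_t first)
{
  toy_location_t next = first;
  qsort (cache, n, sizeof *cache, toy_cmp_loc);
  for (size_t i = 0; i < n; i++)
    {
      if (i > 0 && toy_cmp_loc (&cache[i - 1], &cache[i]) == 0)
        *cache[i].dest = *cache[i - 1].dest;
      else
        *cache[i].dest = next++;
    }
  return next;
}
```

In this toy model, discarding the entries of a tree that was merged away would
amount to truncating the array back to the last accepted length before the
apply step runs; that part is left out to keep the sketch short.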

OK?

Honza
PR lto/65536
* streamer-hooks.h (struct streamer_hooks): Make input_location to take
pointer to location.
(stream_input_location): Update.
(lto_apply_location_cache, lto_revert_location_cache,
lto_accept_location_cache): Declare.
(stream_input_location_now): New inline function.
* ipa-devirt.c: Include streamer-hooks.h.
(warn_odr): Apply location cache before warning.
(lto_input_location): Update prototype.
* gimple-streamer-in.c (input_phi, input_gimple_stmt):
Use stream_input_location_now.
* lto/lto.c (unify_scc): Revert location cache when unification
succeeded.
(lto_read_decls): Accept location cache after success;
apply location cache before calling debug hooks.
* lto-streamer-in.c (struct cached_location): New.
(loc_cache, accepted_length, current_file, current_line,
current_col, current_loc): New static vars.
(cmp_loc): New function.
(lto_apply_location_cache): New function.
(lto_accept_location_cache): New function.
(lto_revert_location_cache): New function.
(lto_input_location): Do location caching.
(input_eh_region, input_struct_function_base): Use
stream_input_location_now.
* tree-streamer-in.c (unpack_ts_block_value_fields,
unpack_ts_omp_clause_value_fields, streamer_read_tree_bitfields,
lto_input_ts_exp_tree_pointers): Update for cached location api.
Index: streamer-hooks.h
===
--- streamer-hooks.h(revision 221582)
+++ streamer-hooks.h(working copy)
@@ -52,7 +52,7 @@ struct streamer_hooks {
   tree (*read_tree) (struct lto_input_block *, struct data_in *);
 
   /* [REQ] Called by every streaming routine that needs to read a location.  */
-  location_t (*input_location) (struct bitpack_d *, struct data_in *);
+  void (*input_location) (location_t *, struct bitpack_d *, struct data_in *);
 
   /* [REQ] Called by every streaming routine that needs to write a location.  */
   void (*output_location) (struct output_block *, struct bitpack_d *, location_t);
@@ -67,8 +67,8 @@ struct streamer_hooks {
 #define stream_read_tree(IB, DATA_IN) \
 streamer_hooks.read_tree (IB, DATA_IN)
 
-#define stream_input_location(BP, DATA_IN) \
-streamer_hooks.input_location (BP, DATA_IN)
+#define stream_input_location(LOCPTR, BP, DATA_IN) \
+streamer_hooks.input_location (LOCPTR, BP, DATA_IN)
 
 #define stream_output_location(OB, BP, LOC) \
 streamer_hooks.output_location (OB, BP, LOC)
@@ -78,5 +78,21 @@ extern struct streamer_hooks streamer_ho
 
 /* In streamer-hooks.c.  */
 void streamer_hooks_init (void);
+bool lto_apply_location_cache ();
+void lto_revert_location_cache ();
+void lto_accept_location_cache ();
+
+/* Read location and return it instead of going through location caching.
+   This should be used only when the resulting location is not going to be
+   discarded.  */
+
+inline location_t
+stream_input_location_now (struct bitpack_d *bp, struct data_in *data)
+{
+  location_t loc;
+  streamer_hooks.input_location (&loc, bp, data);
+  lto_apply_location_cache ();
+  return loc;
+}
 
 #endif  /* GCC_STREAMER_HOOKS_H  */
Index: ipa-devirt.c
===
--- ipa-devirt.c(revision 221582)
+++ ipa-devirt.c(working copy)
@@ -166,7 +166,7 @@ along with GCC; see the file COPYING3.
 #include "gimple-pretty-print.h"
 #include "stor-layout.h"
 #include "intl.h"
-#include "demangle.h"
+#include "streamer-hooks.h"
 
 /* Hash based set of pairs of types.  */
 typedef struct
@@ -936,6 +936,8 @@ warn_odr (tree t1, tree t2, tree st1, tr
   if (!wa

[PATCH] testsuite checks for arm vectorization support on non-arm targets

2015-03-25 Thread Martin Sebor


The attached patch adds tests to lib/target-supports.exp
to avoid unnecessarily invoking the compiler on non-ARM
targets to check for the support for a number of ARM
vectorization features.

Okay to commit to trunk?

Martin
2015-03-23  Martin Sebor  

	* lib/target-supports.exp (check_effective_target_arm32): Fail early
	when target isn't arm*-*-*.
	(check_effective_target_arm_nothumb): Likewise.
	(check_effective_target_arm_little_endian): Likewise.
	(check_effective_target_arm_vect_no_misalign): Likewise.
	(check_effective_target_aarch64_little_endian): Fail early if target
	isn't aarch64*-*-*.

diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp
index 6b957de..25786df 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -2373,6 +2373,10 @@ proc check_effective_target_aarch64_big_endian { } {
 
 # Return 1 if this is a AArch64 target supporting little endian
 proc check_effective_target_aarch64_little_endian { } {
+if { ![istarget aarch64*-*-*] } {
+	return 0
+}
+
 return [check_no_compiler_messages aarch64_little_endian assembly {
 #if !defined(__aarch64__) || defined(__AARCH64EB__)
 #error FOO
@@ -2382,6 +2386,10 @@ proc check_effective_target_aarch64_little_endian { } {
 
 # Return 1 if this is an arm target using 32-bit instructions
 proc check_effective_target_arm32 { } {
+if { ![istarget arm*-*-*] } {
+	return 0
+}
+
 return [check_no_compiler_messages arm32 assembly {
 	#if !defined(__arm__) || (defined(__thumb__) && !defined(__thumb2__))
 	#error !__arm || __thumb__ && !__thumb2__
@@ -2391,6 +2399,10 @@ proc check_effective_target_arm32 { } {
 
 # Return 1 if this is an arm target not using Thumb
 proc check_effective_target_arm_nothumb { } {
+if { ![istarget arm*-*-*] } {
+	return 0
+}
+
 return [check_no_compiler_messages arm_nothumb assembly {
 	#if !defined(__arm__) || (defined(__thumb__) || defined(__thumb2__))
 	#error !__arm__ || __thumb || __thumb2__
@@ -2400,6 +2412,10 @@ proc check_effective_target_arm_nothumb { } {
 
 # Return 1 if this is a little-endian ARM target
 proc check_effective_target_arm_little_endian { } {
+if { ![istarget arm*-*-*] } {
+	return 0
+}
+
 return [check_no_compiler_messages arm_little_endian assembly {
 	#if !defined(__arm__) || !defined(__ARMEL__)
 	#error !__arm__ || !__ARMEL__
@@ -2409,6 +2425,10 @@ proc check_effective_target_arm_little_endian { } {
 
 # Return 1 if this is an ARM target that only supports aligned vector accesses
 proc check_effective_target_arm_vect_no_misalign { } {
+if { ![istarget arm*-*-*] } {
+	return 0
+}
+
 return [check_no_compiler_messages arm_vect_no_misalign assembly {
 	#if !defined(__arm__) \
 	|| (defined(__ARM_FEATURE_UNALIGNED) \

Re: [PATCH] testsuite checks for arm vectorization support on non-arm targets

2015-03-25 Thread Jakub Jelinek

On Wed, Mar 25, 2015 at 05:04:32PM -0600, Martin Sebor wrote:
> The attached patch adds tests to lib/target-supports.exp
> to avoid unnecessarily invoking the compiler on non-ARM
> targets to check for the support for a number of ARM
> vectorization features.
> 
> Okay to commit to trunk?
> 
> Martin

> 2015-03-23  Martin Sebor  

Use current date ;)

>   * lib/target-supports.exp (check_effective_target_arm32): Fail early
>   when target isn't arm*-*-*-*.
>   (check_effective_target_arm_nothumb): Likewise.
>   (check_effective_target_arm_little_endian): Likewise.
>   (check_effective_target_arm_vect_no_misalign): Likewise.
>   (check_effective_target_aarch64_little_endian): Fail early if target
>   isn't aarch64*-*-*.

Ok, thanks.

Jakub

Re: Fix PR 65177: diamonds are not valid execution threads for jump threading

2015-03-25 Thread Sebastian Pop

Jeff Law wrote:
> > PR tree-optimization/65177
> > * tree-ssa-threadupdate.c (verify_seme): Renamed verify_jump_thread.
> > (bb_in_bbs): New.
> > (duplicate_seme_region): Renamed duplicate_thread_path.  Redirect all
> > edges not adjacent on the path to the original code.
> OK for the trunk.  

Committed r221675.

> Though I think there's some stage1 refactoring that we're going to want to do.

Agreed.

> Specifically, it seems to me that copy_bbs should be refactored into
> copy_bbs and copy_bbs_for_threading or somesuch.  Where those
> routines call into refactored common subroutines, but obviously
> handle wiring up the outgoing edges from the copied blocks
> differently.
> 

That would be a good cleanup: I don't like to arbitrarily redirect edges in
copy_bbs just to redirect them back to their initial place in the caller.

> The goal would be to eliminate the overly complex block copy/CFG
> update scheme in tree-ssa-threadupdate.c as part of a larger project
> to convert to a backward threader that can run independently of DOM.

I have the start of a patch for that cleanup; it currently runs wild, as it
would replace the existing threadupdate code generator with a call to the new
duplicate_thread_path.  I think we should take smaller, more manageable steps
to ease the review and to not destabilize the jump threader.  In particular, I
think we should keep both code generators for a while and turn one on/off with
an option.

Sebastian

[PATCH], PR 65569, Fix powerpc long double regression PR 65240 caused

2015-03-25 Thread Michael Meissner

Pat Haugen runs a spec regression tester on various PowerPC boxes, and he
noticed that my fix for PR 65240 (the bug involving floating point constants
and -ffast-math under VSX) caused a regression in building the dealII benchmark
on power6x.  I looked into it, and discovered I had missed extenddftf2_fprs
relying on (const_double 0.0) being used in RTL code.  This works on VSX
systems, where you can use the XXLXOR instruction, but it does not work on
previous systems.

This patch fixes the problem.  I have bootstrapped and run make check on a
power7 big endian system and a power8 little endian system.  On power7, the
following test had been failing, and is now fixed (it doesn't fail on power8):

g++.dg/torture/pr58369.C

I have also built the power8-vsx, power7-vsx, power6x-altivec suite with no
failures.  I'm building power6x-scalar, and power5-scalar shortly.  Assuming
that the last two spec runs build without errors, can I apply the patch?

2015-03-25  Michael Meissner  

PR target/65569
* config/rs6000/rs6000.md (extenddftf2_fprs): On VSX systems use
XXLXOR to create 0.0.  On pre-VSX systems make sure the constant
0.0 is correctly setup.
(extenddftf2_internal): Likewise.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797
Index: gcc/config/rs6000/rs6000.md
===
--- gcc/config/rs6000/rs6000.md (revision 221668)
+++ gcc/config/rs6000/rs6000.md (working copy)
@@ -8357,16 +8357,21 @@ (define_expand "extenddftf2_fprs"
&& TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT 
&& TARGET_LONG_DOUBLE_128"
 {
-  operands[2] = CONST0_RTX (DFmode);
-  /* Generate GOT reference early for SVR4 PIC.  */
-  if (DEFAULT_ABI == ABI_V4 && flag_pic)
-operands[2] = validize_mem (force_const_mem (DFmode, operands[2]));
+  /* VSX can create 0.0 directly, otherwise let rs6000_emit_move create
+ the proper constant.  */
+  if (TARGET_VSX)
+operands[2] = CONST0_RTX (DFmode);
+  else
+{
+  operands[2] = gen_reg_rtx (DFmode);
+  rs6000_emit_move (operands[2], CONST0_RTX (DFmode), DFmode);
+}
 })
 
 (define_insn_and_split "*extenddftf2_internal"
-  [(set (match_operand:TF 0 "nonimmediate_operand" "=m,Y,d,&d,r")
-   (float_extend:TF (match_operand:DF 1 "input_operand" "d,r,md,md,rm")))
-   (use (match_operand:DF 2 "zero_reg_mem_operand" "d,r,m,d,n"))]
+  [(set (match_operand:TF 0 "nonimmediate_operand" "=m,Y,ws,d,&d,r")
+   (float_extend:TF (match_operand:DF 1 "input_operand" "d,r,md,md,md,rm")))
+   (use (match_operand:DF 2 "zero_reg_mem_operand" "d,r,j,m,d,n"))]
   "!TARGET_IEEEQUAD
&& TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT 
&& TARGET_LONG_DOUBLE_128"

libgo patch committed: Avoid some s390 failures

2015-03-25 Thread Ian Lance Taylor

This patch from Dominik Vogt fixes some s390 failures in libgo.  Ran
Go testsuite on x86_64-unknown-linux-gnu.  Committed to mainline.

Ian
diff -r bdce421e579e libgo/go/runtime/chan_test.go
--- a/libgo/go/runtime/chan_test.go Wed Mar 25 14:16:52 2015 -0700
+++ b/libgo/go/runtime/chan_test.go Wed Mar 25 17:39:06 2015 -0700
@@ -202,6 +202,11 @@
n := 1
if testing.Short() {
n = 100
+   } else {
+   if runtime.GOARCH == "s390" {
+   // Test uses too much address space on 31-bit S390.
+   t.Skip("skipping long test on s390")
+   }
}
for i := 0; i < n; i++ {
c := make(chan int, 1)
diff -r bdce421e579e libgo/go/runtime/map_test.go
--- a/libgo/go/runtime/map_test.go  Wed Mar 25 14:16:52 2015 -0700
+++ b/libgo/go/runtime/map_test.go  Wed Mar 25 17:39:06 2015 -0700
@@ -243,7 +243,12 @@
 
 func testConcurrentReadsAfterGrowth(t *testing.T, useReflect bool) {
if runtime.GOMAXPROCS(-1) == 1 {
-   defer runtime.GOMAXPROCS(runtime.GOMAXPROCS(16))
+   if runtime.GOARCH == "s390" {
+   // Test uses too much address space on 31-bit S390.
+   defer runtime.GOMAXPROCS(runtime.GOMAXPROCS(8))
+   } else {
+   defer runtime.GOMAXPROCS(runtime.GOMAXPROCS(16))
+   }
}
numLoop := 10
numGrowStep := 250

Fix line-maps wrt LTO

2015-03-25 Thread Jan Hubicka

Hi,
I read linemap_line_start and I think I noticed a few issues with respect
to overflows and lines being added randomly.

1) line_delta is computed as to_line - SOURCE_LINE (map, set->highest_line).
   I think the last inserted line is not very relevant.  What we care about is
   the base of the last location and, to keep things dense, how much we are
   stretching the value range from the highest inserted element (inserting into
   the middle is cheap).

   For this reason I added base_line_delta and changed line_delta to be
   to_line - SOURCE_LINE (map, set->highest_location).

   Because things go in randomly, highest_line, which really is the last
   inserted line, may be somewhere in between.
2) (line_delta > 10
    && line_delta * ORDINARY_MAP_NUMBER_OF_COLUMN_BITS (map) > 1000)
   ORDINARY_MAP_NUMBER_OF_COLUMN_BITS (map) is in range 7... 15, so it never
   gets high enough to make this conditional trigger.  I changed it to:

  || line_delta > 1000
  || (line_delta << ORDINARY_MAP_NUMBER_OF_COLUMN_BITS (map)) > 1000

   I.e. we do not want to skip more than 1000 unused entries since the highest
   inserted location.

3) (max_column_hint <= 80 && ORDINARY_MAP_NUMBER_OF_COLUMN_BITS (map) >= 10)
   seems intended to reduce the column range when it is no longer needed.
   Again, this is not really a good idea when the inserted line is not the
   last one.

4) The code deciding whether to reuse the map seems wrong:
  if (line_delta < 0
  || last_line != ORDINARY_MAP_STARTING_LINE_NUMBER (map)
  || SOURCE_COLUMN (map, highest) >= (1U << column_bits))

   line_delta really should be base_line_delta; we do not need to give up
   when the map's line is 1, SOURCE_LINE (map, set->highest_line) is 5
   and we are requested to switch to line 3.

   Second, last_line != ORDINARY_MAP_STARTING_LINE_NUMBER (map) tests whether
   the map has only one line.  That does not work (at least with my changes)
   because we may switch to the next line and back.

   This conditional also seems to be completely missing handling of overflows.

With the following patch, all line info and all but one caret are right
in the chromium warnings.
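
As a small illustration of point 2, the two density tests can be written as
standalone C helpers and compared on sample values.  This is only my reading
of the hunk below, not code from line-map.c: old_gap_too_large and
new_gap_too_large are hypothetical names, column_bits stands for
ORDINARY_MAP_NUMBER_OF_COLUMN_BITS (map) and is assumed to be at most 15.  For
a gap of 20 lines with 7 column bits the old product is 20 * 7 = 140, while the
gap measured in location values is 20 << 7 = 2560.

```c
#include <stdbool.h>

/* Old test (removed lines): multiply the line gap by the number of
   column bits.  */
bool
old_gap_too_large (int line_delta, unsigned column_bits)
{
  return line_delta > 10 && line_delta * (int) column_bits > 1000;
}

/* New test (added lines): each line occupies 1 << column_bits location
   values, so the gap in location values is line_delta << column_bits.  */
bool
new_gap_too_large (int line_delta, unsigned column_bits)
{
  /* Guard added for this sketch only: shifting a negative int is not
     defined in C, and a non-positive gap is never "too large".  */
  if (line_delta <= 0)
    return false;
  return line_delta > 1000 || (line_delta << column_bits) > 1000;
}
```

With these helpers, a 20-line gap at 7 column bits passes the old test but is
rejected by the new one, which matches the intent of not skipping more than
1000 unused location values.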

Bootstrapped/regtested x86_64-linux, OK?

* line-map.c (linemap_line_start): Correct overflow tests.
Index: line-map.c
===
--- line-map.c  (revision 221568)
+++ line-map.c  (working copy)
@@ -519,25 +519,38 @@ linemap_line_start (struct line_maps *se
   struct line_map *map = LINEMAPS_LAST_ORDINARY_MAP (set);
   source_location highest = set->highest_location;
   source_location r;
-  linenum_type last_line =
-SOURCE_LINE (map, set->highest_line);
-  int line_delta = to_line - last_line;
+  int base_line_delta = to_line - ORDINARY_MAP_STARTING_LINE_NUMBER (map);
+  int line_delta = to_line - SOURCE_LINE (map, set->highest_location);
   bool add_map = false;
 
-  if (line_delta < 0
-  || (line_delta > 10
- && line_delta * ORDINARY_MAP_NUMBER_OF_COLUMN_BITS (map) > 1000)
-  || (max_column_hint >= (1U << ORDINARY_MAP_NUMBER_OF_COLUMN_BITS (map)))
+  /* Single MAP entry can be used to encode multiple source lines.
+ Look for situations when this is impossible or undesirable.  */
+  if (base_line_delta < 0
+  /* We want to keep maps reasonably dense, so do not increase the range
+of this linemap entry by more than 1000.  */
+  || line_delta > 1000
+  || (line_delta << ORDINARY_MAP_NUMBER_OF_COLUMN_BITS (map)) > 1000
+  /* If the max column is out of range and we are still not dropping line
+info.  */
+  || (max_column_hint >= (1U << ORDINARY_MAP_NUMBER_OF_COLUMN_BITS (map))
+ && highest < 0x6000)
+  /* If the previous line was long.  Ignore this problem if we already
+re-used the map for lines with greater indexes.  */
   || (max_column_hint <= 80
- && ORDINARY_MAP_NUMBER_OF_COLUMN_BITS (map) >= 10)
+ && ORDINARY_MAP_NUMBER_OF_COLUMN_BITS (map) >= 10 && line_delta > 0)
+  /* If we have just started running out of locations (which makes us
+drop column info), but the current line map still has column info, create
+a fresh one.  */
   || (highest > 0x6000
- && (set->max_column_hint || highest > 0x7000)))
+ && (ORDINARY_MAP_NUMBER_OF_COLUMN_BITS (map)
+ || highest > 0x7000)))
 add_map = true;
   else
 max_column_hint = set->max_column_hint;
   if (add_map)
 {
   int column_bits;
+  bool reuse_map = true;
   if (max_column_hint > 10 || highest > 0x6000)
{
  /* If the column number is ridiculous or we've allocated a huge
@@ -554,11 +567,38 @@ linemap_line_start (struct line_maps *se
column_bits++;
  max_column_hint = 1U << column_bits;
}
+
   /* Allocate the new line_map.  However, if the current map only has a
 single line we can sometimes just increase its column_bits instead. */
-  if (line_delta < 0
-

Re: [PATCH, bootstrap]: Add bootstrap-lto-noplugin build configuration (PR65537)

2015-03-25 Thread Jan Hubicka

> Hello!
> 
> Attached patch introduces bootstrap-lto-noplugin bootstrap
> configuration for hosts that do not support linker plugin (e.g. CentOS
> 5.11 with binutils 2.17). Also, the patch adds some additional
> documentation to bootstrap-lto option.
> 
> config/ChangeLog:
> 
> 2015-03-24  Uros Bizjak  
> 
> PR bootstrap/65537
> * bootstrap-lto-noplugin.mk: New build configuration.
> 
> gcc/ChangeLog:
> 
> 2015-03-24  Uros Bizjak  
> 
> PR bootstrap/65537
> * doc/install.texi (Building a native compiler): Document new
> bootstrap-lto-noplugin configuration.  Mention that bootstrap-lto
> configuration assumes that the host supports the linker plugin.
> 
> Patch was bootstrapped and tested on x86_64-linux-gnu (CentOS 5.11)
> host, configured with --with-build-config=bootstrap-lto build
> configuration.
> 
> OK for mainline?
> Index: gcc/doc/install.texi
> ===
> --- gcc/doc/install.texi  (revision 221636)
> +++ gcc/doc/install.texi  (working copy)
> @@ -2519,8 +2519,14 @@
>  @item @samp{bootstrap-lto}
>  Enables Link-Time Optimization for host tools during bootstrapping.
>  @samp{BUILD_CONFIG=bootstrap-lto} is equivalent to adding
> -@option{-flto} to @samp{BOOT_CFLAGS}.
> +@option{-flto} to @samp{BOOT_CFLAGS}.  This option assumes that the host
> +supports the linker plugin (e.g. GNU ld version 2.21 or later or GNU gold
> +version 2.21 or later).
>  
> +@item @samp{bootstrap-lto-noplugin}
> +This option is similar to @code{bootstrap-lto}, but is intended for
> +hosts that do not support the linker plugin.

Can you please add a note that without the linker plugin the static libraries
are not compiled with link-time optimization?  Because the GCC middle end and
back end are in libbackend.a, this means that only (part of) the front end is
actually LTO optimized.

Currently it seems a bit too welcoming to skip the linker update.

Honza
> +
>  @item @samp{bootstrap-debug}
>  Verifies that the compiler generates the same executable code, whether
>  or not it is asked to emit debug information.  To this end, this

Re: [debug-early] emit early dwarf for locally scoped functions

2015-03-25 Thread Jason Merrill


On 03/25/2015 05:05 PM, Aldy Hernandez wrote:

> Or we could cheat and just remove them as mainline does, but only when
> reusing a declaration (as in the attached patch).


This seems right to me.

Jason

[patch, libgfortran] Bug 65541 - [5 Regression] namelist regression

2015-03-25 Thread Jerry DeLisle


Committed as obvious and simple.

revision 221682.

Regards,

Jerry

2015-03-25 Jerry DeLisle  

PR libgfortran/65541
* io/write.c (nml_write_obj): Convert '+' to '%' before emitting
object names in namelists.
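
For readers who want to see the conversion in isolation, here is a minimal
standalone sketch of what the ChangeLog entry describes: upper-casing an object
name and emitting '%' wherever the stored name has '+'.  It is not the
libgfortran code; toy_format_nml_name and its buffer interface are hypothetical
and exist only for this illustration.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: copy NAME into OUT (capacity OUT_SIZE, including
   the terminating NUL), upper-casing each character and replacing '+'
   with '%'.  Returns the number of characters written, not counting
   the NUL.  */
size_t
toy_format_nml_name (const char *name, char *out, size_t out_size)
{
  size_t i = 0;
  if (out_size == 0)
    return 0;
  for (; name[i] != '\0' && i + 1 < out_size; i++)
    {
      char c = (char) toupper ((unsigned char) name[i]);
      out[i] = (c == '+') ? '%' : c;
    }
  out[i] = '\0';
  return i;
}
```

The actual patch below does the same replacement in two places: once
character by character while writing the name, and once over the assembled
ext_name string.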

Index: io/write.c
===
--- io/write.c  (revision 221681)
+++ io/write.c  (working copy)
@@ -1704,10 +1704,11 @@
   size_t clen;
   index_type elem_ctr;
   size_t obj_name_len;
-  void * p ;
+  void * p;
   char cup;
   char * obj_name;
   char * ext_name;
+  char * q;
   size_t ext_name_len;
   char rep_buff[NML_DIGITS];
   namelist_info * cmp;
@@ -1745,6 +1746,8 @@
   for (dim_i = len; dim_i < clen; dim_i++)
{
  cup = toupper ((int) obj->var_name[dim_i]);
+ if (cup == '+')
+   cup = '%';
  write_character (dtp, &cup, 1, 1, NODELIM);
}
   write_character (dtp, "=", 1, 1, NODELIM);
@@ -1894,6 +1897,9 @@
}

  ext_name[tot_len] = '\0';
+ for (q = ext_name; *q; q++)
+   if (*q == '+')
+ *q = '%';

  /* Now obj_name.  */

Re: [PATCH], PR 65569, Fix powerpc long double regression PR 65240 caused

2015-03-25 Thread David Edelsohn

On Wed, Mar 25, 2015 at 8:09 PM, Michael Meissner
 wrote:
> Pat Haugen runs a spec regression tester on various PowerPC boxes, and he
> noticed that my fix for PR 65240 (the bug involving floating point constants
> and -ffast-math under VSX) caused a regression in building the dealII 
> benchmark
> on power6x.  I looked into it, and discovered I had missed extenddftf2_fprs
> relying on (const_double 0.0) being used in RTL code.  This works on VSX
> systems, where you can use the XXLXOR instruction, but it does not work on
> previous systems.
>
> This patch fixes the problem.  I have bootstrapped and ran make check on a
> power7 big endian system and a power8 little endian system.  On power7, the
> following test had been failing, and is now fixed (it doesn't fail on power8):
>
> g++.dg/torture/pr58369.C
>
> I have also built the power8-vsx, power7-vsx, power6x-altivec suite with no
> failures.  I'm building power6x-scalar, and power5-scalar shortly.  Assuming
> that the last two spec runs build without errors, can I apply the patch?
>
> 2015-03-25  Michael Meissner  
>
> PR target/65569
> * config/rs6000/rs6000.md (extenddftf2_fprs): On VSX systems use
> XXLXOR to create 0.0.  On pre-VSX systems make sure the constant
> 0.0 is correctly setup.
> (extenddftf2_internal): Likewise.

Okay.

Thanks, David

Re: Optimize lto location streaming

2015-03-25 Thread Andi Kleen

Jan Hubicka  writes:
>
> Bootstrapped/regtested x86_64-linux, the patch saves about 1GB of locators 
> for chromium
> and 400MB for firefox LTO.

Great. On my LTO builds linemap was always high up in the profiles too.

-Andi

-- 
a...@linux.intel.com -- Speaking for myself only

Re: Optimize lto location streaming

2015-03-25 Thread Jan Hubicka

> Jan Hubicka  writes:
> >
> > Bootstrapped/regtested x86_64-linux, the patch saves about 1GB of locators 
> > for chromium
> > and 400MB for firefox LTO.
> 
> Great. On my LTO builds linemap was always high up in the profiles too.

Yep, it was always high.  I am re-running some profiles now.  I feel somewhat
stupid that I did not get this idea a bit earlier - it seems to work well in
practice.

In GCC 6 we will hopefully look into preserving more of the linemap info
(inline stacks & macro expansion) and inventing a less stupid way to pickle the
locators, but I think this patch solves the majority of the memory use/lookup
time issues.

I think this was the last major offender for Chromium/LibreOffice and Firefox.
(Modulo the fact that chromium needs 9GB for WPA.  There does not seem to be
much low-hanging fruit - chromium needs a lot of trees to be streamed in that
will hopefully be tracked by early debug soon.)  What is the status of GCC 5
for kernel compilation?  Are the compile times/memory uses reasonable now?

Honza

Discover nothrow functions before into_ssa

2015-03-25 Thread Jan Hubicka

Hi,
this patch (as suggested by Richard) adds a very simple discovery of
nothrow functions to the build_ssa passes.

The reason is that in 4.9 we did build_ssa in parallel with early optimization,
which does nothrow discovery as part of local pure const.  The bounds checking
patches broke the pass queue into multiple passes, and we produce a lot more
statements when nothrow is not identified early.

I went with a specialized pass because I do not want to pay the cost of
local pure const building the loop structure to prove that a function is
pure/const.  We really care about this only later.  I also tested a variant
making this pass part of the early lowering passes.  This does not work that
well, because these are not run in topological order and the C++ FE already
does its own nothrow discovery, so it handled only about 900 calls on tramp3d.
This pass handles about 3500 calls and an additional 3000 calls are handled
by the followup ipa-pure-const pass (probably because of extra code removal).
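
To illustrate why the visiting order matters, here is a toy model of such a
discovery in C.  It is not the pass from the patch: struct toy_function and
toy_discover_nothrow are hypothetical, and a single flag stands in for
"some statement in the body may throw by itself".  A function is marked
nothrow when its own body cannot throw and every callee has already been
marked nothrow, which is why callees have to be visited first.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_CALLS 8

/* Hypothetical miniature of a call graph node.  */
struct toy_function
{
  bool has_throwing_stmt;   /* Body has a statement that may throw.  */
  size_t n_callees;
  struct toy_function *callees[TOY_MAX_CALLS];
  bool nothrow;             /* Result of the discovery.  */
};

/* Visit FNS[0..N-1] in the given order and mark a function nothrow when
   nothing in its body throws and all of its callees are already known to
   be nothrow.  Returns the number of functions marked.  */
size_t
toy_discover_nothrow (struct toy_function **fns, size_t n)
{
  size_t marked = 0;
  for (size_t i = 0; i < n; i++)
    {
      struct toy_function *f = fns[i];
      bool can_throw = f->has_throwing_stmt;
      for (size_t j = 0; j < f->n_callees && !can_throw; j++)
        if (!f->callees[j]->nothrow)
          can_throw = true;
      f->nothrow = !can_throw;
      if (f->nothrow)
        marked++;
    }
  return marked;
}
```

Passing the functions callees-first lets a wrapper around a nothrow leaf be
marked as well; passing them callers-first leaves the wrapper unmarked, because
its callee has not been analysed yet when the wrapper is visited.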

Adding the pass causes the cgraph verifier to fail.  The reason is that the
fixup_cfg pass at the beginning of the ssa passes now actually does some dead
code removal.  This makes the cgraph edges out of date, and they are not
rebuilt at the end of the passes.

Instead of triggering yet another rebuild, which would be somewhat redundant
given that the early passes rebuild the edges again, I just changed the cgraph
verifier to not compare callers' frequencies, only callees'.  This way
we reduce some work, too.

Doing this I removed one very old FIXME about verification, which pointed out
a latent bug in set_edge_predicate.  That is fixed too.

Bootstrapped/regtested x86_64-linux, OK?

* cgraph.c (cgraph_edge::verify_count_and_frequency): Remove testing
of frequency and bb match.
(cgraph_node::verify_node): Do it here on callees only.
* passes.def: Add pass_nothrow.
* ipa-pure-const.c (pass_data_nothrow): New.
(pass_nothrow): New.
(pass_nothrow::execute): New.
(make_pass_nothrow): New.
* tree-pass.h (make_pass_nothrow): Declare.
* ipa-inline.c (set_edge_predicate): Also redirect indirect
edges.
Index: cgraph.c
===
--- cgraph.c(revision 221682)
+++ cgraph.c(working copy)
@@ -2661,25 +2661,6 @@ cgraph_edge::verify_count_and_frequency
   error ("caller edge frequency is too large");
   error_found = true;
 }
-  if (gimple_has_body_p (caller->decl)
-  && !caller->global.inlined_to
-  && !speculative
-  /* FIXME: Inline-analysis sets frequency to 0 when edge is optimized out.
-Remove this once edges are actually removed from the function at that time.  */
-  && (frequency
- || (inline_edge_summary_vec.exists ()
- && ((inline_edge_summary_vec.length () <= (unsigned) uid)
- || !inline_edge_summary (this)->predicate)))
-  && (frequency
- != compute_call_stmt_bb_frequency (caller->decl,
-gimple_bb (call_stmt
-{
-  error ("caller edge frequency %i does not match BB frequency %i",
-frequency,
-compute_call_stmt_bb_frequency (caller->decl,
-gimple_bb (call_stmt)));
-  error_found = true;
-}
   return error_found;
 }
 
@@ -2848,9 +2829,46 @@ cgraph_node::verify_node (void)
error_found = true;
  }
 }
+  for (e = callees; e; e = e->next_callee)
+{
+  if (e->verify_count_and_frequency ())
+   error_found = true;
+  if (gimple_has_body_p (e->caller->decl)
+ && !e->caller->global.inlined_to
+ && !e->speculative
+ /* Optimized out calls are redirected to __builtin_unreachable.  */
+ && (e->frequency
+ || e->callee->decl
+!= builtin_decl_implicit (BUILT_IN_UNREACHABLE))
+ && (e->frequency
+ != compute_call_stmt_bb_frequency (e->caller->decl,
+gimple_bb (e->call_stmt
+   {
+ error ("caller edge frequency %i does not match BB frequency %i",
+e->frequency,
+compute_call_stmt_bb_frequency (e->caller->decl,
+gimple_bb (e->call_stmt)));
+ error_found = true;
+   }
+}
   for (e = indirect_calls; e; e = e->next_callee)
-if (e->verify_count_and_frequency ())
-  error_found = true;
+{
+  if (e->verify_count_and_frequency ())
+   error_found = true;
+  if (gimple_has_body_p (e->caller->decl)
+ && !e->caller->global.inlined_to
+ && !e->speculative
+ && (e->frequency
+ != compute_call_stmt_bb_frequency (e->caller->decl,
+gimple_bb (e->call_stmt
+   {
+ error ("caller edge frequency %i does not match BB frequency %i",
+e->frequency,
+c

[PINGv3][PATCH] ASan on unaligned accesses

2015-03-25 Thread Marat Zakirov




On 03/19/2015 09:01 AM, Marat Zakirov wrote:


On 03/04/2015 11:07 AM, Andrew Pinski wrote:
On Wed, Mar 4, 2015 at 12:00 AM, Marat Zakirov wrote:

Hi all!

Here is a patch which makes ASan handle memory accesses without proper
alignment. It is useful because some programs, like the Linux kernel, often
cheat with alignment, which may cause false negatives. This patch needs
additional support to work properly on unaligned accesses in global data and
on the heap. That support will be implemented in libsanitizer by a separate
patch.


--Marat
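
To make the false negative concrete, here is a small model of the shadow-memory check. It is a sketch under stated assumptions, not the code from asan.c: the function names are hypothetical, the shadow array is indexed directly without the usual shadow offset, and the encoding follows the commonly documented ASan scheme (0 means all 8 bytes of a granule are addressable, 1..7 means only the first k bytes are, a negative value means the granule is poisoned).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SHADOW_SHIFT 3  /* One shadow byte describes 8 application bytes.  */

/* Check that consults only the shadow byte of the first accessed granule,
   which is sufficient when the access is naturally aligned.  */
bool
reports_first_granule_only (const signed char *shadow, size_t addr,
                            size_t size)
{
  assert (size >= 1 && size <= 8);
  signed char k = shadow[addr >> SHADOW_SHIFT];
  return k != 0 && (signed char) ((addr & 7) + size - 1) >= k;
}

/* Ground truth: is any byte in [addr, addr + size) unaddressable?  */
bool
touches_poison (const signed char *shadow, size_t addr, size_t size)
{
  for (size_t a = addr; a < addr + size; a++)
    {
      signed char k = shadow[a >> SHADOW_SHIFT];
      if (k != 0 && (signed char) (a & 7) >= k)
        return true;
    }
  return false;
}
```

With granule 0 fully addressable and granule 1 poisoned, an 8-byte access at address 4 reaches bytes 8..11 of the poisoned granule, yet the first-granule check sees a shadow value of 0 and stays silent. The same access at the aligned address 8 is reported.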

gcc/ChangeLog:

2015-02-25  Marat Zakirov  

 * asan.c (asan_emit_stack_protection): Support for misaligned
accesses.
 (asan_expand_check_ifn): Likewise.
 * params.def: New option asan-catch-misaligned.
 * params.h: New param ASAN_CATCH_MISALIGNED.

Since this parameter can only be true or false, I think it should be a
normal option.  Also you did not add documentation of the param.

Thanks,
Andrew

Fixed.



gcc/ChangeLog:

2015-03-12  Marat Zakirov  

	* asan.c (asan_emit_stack_protection): Support for misaligned accesses.
	(asan_expand_check_ifn): Likewise.
	* common.opt: New flag -fasan-catch-misaligned.
	* doc/invoke.texi: New flag description.
	* opts.c (finish_options): Add check for new flag.
	(common_handle_option): Switch on flag if SANITIZE_KERNEL_ADDRESS.

gcc/testsuite/ChangeLog:

2015-03-12  Marat Zakirov  

	* c-c++-common/asan/misalign-catch.c: New test.


diff --git a/gcc/asan.c b/gcc/asan.c
index 9e4a629..80bf2e8 100644
--- a/gcc/asan.c
+++ b/gcc/asan.c
@@ -1050,7 +1050,6 @@ asan_emit_stack_protection (rtx base, rtx pbase, unsigned int alignb,
   rtx_code_label *lab;
   rtx_insn *insns;
   char buf[30];
-  unsigned char shadow_bytes[4];
   HOST_WIDE_INT base_offset = offsets[length - 1];
   HOST_WIDE_INT base_align_bias = 0, offset, prev_offset;
   HOST_WIDE_INT asan_frame_size = offsets[0] - base_offset;
@@ -1193,11 +1192,37 @@ asan_emit_stack_protection (rtx base, rtx pbase, unsigned int alignb,
   if (STRICT_ALIGNMENT)
 set_mem_align (shadow_mem, (GET_MODE_ALIGNMENT (SImode)));
   prev_offset = base_offset;
+
+  vec shadow_mems;
+  vec shadow_bytes;
+
+  shadow_mems.create(0);
+  shadow_bytes.create(0);
+
   for (l = length; l; l -= 2)
 {
   if (l == 2)
 	cur_shadow_byte = ASAN_STACK_MAGIC_RIGHT;
   offset = offsets[l - 1];
+  if (l != length && flag_asan_catch_misaligned)
+	{
+	  HOST_WIDE_INT aoff
+	= base_offset + ((offset - base_offset)
+			 & ~(ASAN_RED_ZONE_SIZE - HOST_WIDE_INT_1))
+	  - ASAN_RED_ZONE_SIZE;
+	  if (aoff > prev_offset)
+	{
+	  shadow_mem = adjust_address (shadow_mem, VOIDmode,
+	   (aoff - prev_offset)
+	   >> ASAN_SHADOW_SHIFT);
+	  prev_offset = aoff;
+	  shadow_bytes.safe_push (0);
+	  shadow_bytes.safe_push (0);
+	  shadow_bytes.safe_push (0);
+	  shadow_bytes.safe_push (0);
+	  shadow_mems.safe_push (shadow_mem);
+	}
+	}
   if ((offset - base_offset) & (ASAN_RED_ZONE_SIZE - 1))
 	{
 	  int i;
@@ -1212,13 +1237,13 @@ asan_emit_stack_protection (rtx base, rtx pbase, unsigned int alignb,
 	if (aoff < offset)
 	  {
 		if (aoff < offset - (1 << ASAN_SHADOW_SHIFT) + 1)
-		  shadow_bytes[i] = 0;
+		  shadow_bytes.safe_push (0);
 		else
-		  shadow_bytes[i] = offset - aoff;
+		  shadow_bytes.safe_push (offset - aoff);
 	  }
 	else
-	  shadow_bytes[i] = ASAN_STACK_MAGIC_PARTIAL;
-	  emit_move_insn (shadow_mem, asan_shadow_cst (shadow_bytes));
+	  shadow_bytes.safe_push (ASAN_STACK_MAGIC_PARTIAL);
+	  shadow_mems.safe_push(shadow_mem);
 	  offset = aoff;
 	}
   while (offset <= offsets[l - 2] - ASAN_RED_ZONE_SIZE)
@@ -1227,12 +1252,21 @@ asan_emit_stack_protection (rtx base, rtx pbase, unsigned int alignb,
    (offset - prev_offset)
    >> ASAN_SHADOW_SHIFT);
 	  prev_offset = offset;
-	  memset (shadow_bytes, cur_shadow_byte, 4);
-	  emit_move_insn (shadow_mem, asan_shadow_cst (shadow_bytes));
+	  shadow_bytes.safe_push (cur_shadow_byte);
+	  shadow_bytes.safe_push (cur_shadow_byte);
+	  shadow_bytes.safe_push (cur_shadow_byte);
+	  shadow_bytes.safe_push (cur_shadow_byte);
+	  shadow_mems.safe_push(shadow_mem);
 	  offset += ASAN_RED_ZONE_SIZE;
 	}
   cur_shadow_byte = ASAN_STACK_MAGIC_MIDDLE;
 }
+  for (unsigned i = 0; flag_asan_catch_misaligned && i < shadow_bytes.length () - 1; i++)
+if (shadow_bytes[i] == 0 && shadow_bytes[i + 1] > 0)
+  shadow_bytes[i] = 8 + (shadow_bytes[i + 1] > 7 ? 0 : shadow_bytes[i + 1]);
+  for (unsigned i = 0; i < shadow_mems.length (); i++)
+emit_move_insn (shadow_mems[i], asan_shadow_cst (&shadow_bytes[i * 4]));
+  
   do_pending_stack_adjust ();
 
   /* Construct epilogue sequence.  */
@@ -1285,34 +1319,8 @@ asan_emit_stack_protection (rtx base, rtx pbase, unsigned int alignb,
   if (STRICT_ALIGNMENT)
 set_mem_align (shadow_mem, (GET_MODE_ALIGNMENT (SImode)));
 
-  prev_offset = base_offset;
-  last_offset = base_offset;
-  last_size = 0;
-
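
The post-processing loop over shadow_bytes in the hunk above can be tried in isolation with the sketch below. It copies the expression from the patch; the function name, the use of `unsigned char`, and the `i + 1 < n` bound (which also covers an empty array) are assumptions made for this example. How the run-time check interprets the resulting values 8..15 is not shown in this excerpt.

```c
#include <assert.h>
#include <stddef.h>

/* Rewrite every zero shadow byte that is directly followed by a nonzero
   one.  The new value is 8 plus the following byte when that byte is in
   1..7, and plain 8 when the following byte is larger than 7 (for
   example a redzone magic value).  */
void
mark_bytes_before_nonzero (unsigned char *shadow_bytes, size_t n)
{
  assert (shadow_bytes != NULL || n == 0);
  for (size_t i = 0; i + 1 < n; i++)
    if (shadow_bytes[i] == 0 && shadow_bytes[i + 1] > 0)
      shadow_bytes[i] = 8 + (shadow_bytes[i + 1] > 7 ? 0 : shadow_bytes[i + 1]);
}
```

Starting from the bytes f1 00 00 00 05 f2 00 f3 (hex), the loop leaves zeros that are followed by another zero untouched, turns the zero before 05 into 0d (8 + 5), and turns the zero before f3 into 08.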
