Re: Turn DECL_SECTION_NAME into string

2014-06-12 Thread Richard Biener
On Thu, Jun 12, 2014 at 6:33 AM, Jan Hubicka  wrote:
> Hi,
> this lengthy patch does the legwork to move section names out of the tree
> representation.
> Originally they were STRING_CSTs. I ended up implementing an on-the-side
> reference-counted string vocabulary that is done in a somewhat baroque way
> to be GGC and PCH safe (uff).
> The memory savings on Firefox are about 60MB, because while reading the
> symbol table we now unify the many duplicated comdat group strings and also
> free them after we bring those local.
>
> The old representation probably made sense when most of the strings came via
> the __section__ attribute, where they were readily parsed as string constants.

I wonder why you didn't use IDENTIFIER_NODEs?  (ok, still trees ...)
At least those are already GGC and PCH safe.

Richard.

> Bootstrapped/regtested x86_64-linux, committed.
>
> Honza
>
> * symtab.c (section_hash): New hash.
> (symtab_unregister_node): Clear section before freeing.
> (hash_section_hash_entry): New hasher.
> (eq_sections): New function.
> (symtab_node::set_section_for_node): New method.
> (set_section_1): Update.
> (symtab_node::set_section): Take string instead of tree as parameter.
> (symtab_resolve_alias): Update.
> * cgraph.h (section_hash_entry_d): New structure.
> (section_hash_entry): New typedef.
> (cgraph_node): Change comdat_group_ to x_comdat_group,
> change section_ to x_section and turn into section_hash_entry;
> update accessors; put set_section_for_node offline.
> * tree.c (decl_section_name): Turn into string.
> (set_decl_section_name): Change parameter to be string.
> * tree.h (decl_section_name, set_decl_section_name): Update 
> prototypes.
> * sdbout.c (sdbout_one_type): Update.
> * tree-vect-data-refs.c (vect_can_force_dr_alignment_p): Update.
> * varasm.c (IN_NAMED_SECTION, get_named_section, 
> resolve_unique_section,
> hot_function_section, get_named_text_section, 
> USE_SELECT_SECTION_FOR_FUNCTIONS,
> default_function_rodata_section, make_decl_rtl, 
> default_unique_section):
> Update.
> * config/c6x/c6x.c (c6x_in_small_data_p): Update.
> (c6x_elf_unique_section): Update.
> * config/nios2/nios2.c (nios2_in_small_data_p): Update.
> * config/pa/pa.c (pa_function_section): Update.
> * config/pa/pa.h (IN_NAMED_SECTION_P): Update.
> * config/ia64/ia64.c (ia64_in_small_data_p): Update.
> * config/arc/arc.c (arc_in_small_data_p): Update.
> * config/arm/unknown-elf.h (IN_NAMED_SECTION_P): Update.
> * config/mcore/mcore.c (mcore_unique_section): Update.
> * config/mips/mips.c (mips16_build_function_stub): Update.
> (mips16_build_call_stub): Update.
> (mips_function_rodata_section): Update.
> (mips_in_small_data_p): Update.
> * config/score/score.c (score_in_small_data_p): Update.
> * config/rx/rx.c (rx_in_small_data): Update.
> * config/rs6000/rs6000.c (rs6000_elf_in_small_data_p): Update.
> (rs6000_xcoff_asm_named_section): Update.
> (rs6000_xcoff_unique_section): Update.
> * config/frv/frv.c (frv_string_begins_with): Update.
> (frv_in_small_data_p): Update.
> * config/v850/v850.c (v850_encode_data_area): Update.
> * config/bfin/bfin.c (DECL_SECTION_NAME): Update.
> (bfin_handle_l1_data_attribute): Update.
> (bfin_handle_l2_attribute): Update.
> * config/mep/mep.c (mep_unique_section): Update.
> * config/microblaze/microblaze.c (microblaze_elf_in_small_data_p): 
> Update.
> * config/h8300/h8300.c (h8300_handle_eightbit_data_attribute): Update.
> (h8300_handle_tiny_data_attribute): Update.
> * config/m32r/m32r.c (m32r_in_small_data_p): Update.
> (m32r_in_small_data_p): Update.
> * config/alpha/alpha.c (alpha_in_small_data_p): Update.
> * config/i386/i386.c (ix86_in_large_data_p): Update.
> * config/i386/winnt.c (i386_pe_unique_section): Update.
> * config/darwin.c (darwin_function_section): Update.
> * config/lm32/lm32.c (lm32_in_small_data_p): Update.
> * tree-emutls.c (get_emutls_init_templ_addr): Update.
> (new_emutls_decl): Update.
> * lto-cgraph.c (lto_output_node, input_node, input_varpool_node,
> input_varpool_node): Update.
> (read_string_cst): Turn to ...
> (read_string): ... this one.
> * dwarf2out.c (secname_for_decl): Update.
> * asan.c (asan_protect_global): Update.
>
> * c-family/c-common.c (handle_section_attribute): Update handling for
> section names that are no longer trees.
>
> * java/class.c (build_utf8_ref): Update handling for section names
> that are no longer trees.
> (emit_register_classes_in_jcr_section): Update.
>
> * vtable-class

Re: [PATCH, Pointer Bounds Checker 35/x] Fix object size emitted for structures with flexible arrays

2014-06-12 Thread Richard Biener
On Wed, Jun 11, 2014 at 6:08 PM, Ilya Enkovich  wrote:
> Hi,
>
> This patch fixes a problem with the size emitted for static structures with a
> flexible array.  I found a couple of reports in bugzilla for this problem, but
> all of them are marked as fixed and the problem still exists.
>
> For a simple testcase
>
> struct S { int a; int b[0]; } s = { 1, { 0, 0} };
>
> current trunk produces (no flags):
>
> .globl  s
> .data
> .align 4
> .type   s, @object
> .size   s, 4
> s:
> .long   1
> .long   0
> .long   0
>
> which has the wrong size for object s.
>
> This problem is important for the checker because a wrong size leads to wrong
> bounds and false bounds violations.  The following patch uses DECL_SIZE_UNIT
> instead of the type size and works well for me.  Does it look OK?

There is a bug about this in bugzilla somewhere.

It looks ok to me - did you test with all languages?  In particular did
you test Ada?

Thanks,
Richard.

> Bootstrapped and tested on linux-x86_64.
>
> Thanks,
> Ilya
> --
> gcc/
>
> 2014-06-11  Ilya Enkovich  
>
> * config/elfos.h (ASM_DECLARE_OBJECT_NAME): Use decl size
> instead of type size.
> (ASM_FINISH_DECLARE_OBJECT): Likewise.
>
>
> diff --git a/gcc/config/elfos.h b/gcc/config/elfos.h
> index c1d5553..7929708 100644
> --- a/gcc/config/elfos.h
> +++ b/gcc/config/elfos.h
> @@ -313,7 +313,7 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
>   && (DECL) && DECL_SIZE (DECL))\
> {   \
>   size_directive_output = 1;\
> - size = int_size_in_bytes (TREE_TYPE (DECL));  \
> + size = tree_to_uhwi (DECL_SIZE_UNIT (DECL));  \
>   ASM_OUTPUT_SIZE_DIRECTIVE (FILE, NAME, size); \
> }   \
> \
> @@ -341,7 +341,7 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
>   && !size_directive_output)\
> {   \
>   size_directive_output = 1;\
> - size = int_size_in_bytes (TREE_TYPE (DECL));  \
> + size = tree_to_uhwi (DECL_SIZE_UNIT (DECL));  \
>   ASM_OUTPUT_SIZE_DIRECTIVE (FILE, name, size); \
> }   \
>  }  \
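
The mismatch discussed in this thread can be illustrated outside the compiler.
The following is only a minimal sketch in C, not GCC code: it uses a C99
flexible array member (int b[]) instead of the zero-length array from the
testcase, and the helper name object_size is hypothetical.  It contrasts the
size of the type with the storage an object with N trailing elements actually
occupies, which is the difference between the type size and the decl size.

```c
#include <assert.h>
#include <stddef.h>

/* Same shape as the testcase, but with a C99 flexible array member.  */
struct S
{
  int a;
  int b[];
};

/* Hypothetical helper: bytes occupied by a struct S object whose
   flexible array holds N elements.  */
static size_t
object_size (size_t n)
{
  size_t size = offsetof (struct S, b) + n * sizeof (int);

  /* The flexible array starts inside or right at the end of the type.  */
  assert (offsetof (struct S, b) <= sizeof (struct S));

  /* An object is never smaller than its type.  */
  return size < sizeof (struct S) ? sizeof (struct S) : size;
}
```

On a target with 4-byte int, sizeof (struct S) is 4 while object_size (2) is
12, which corresponds to the three .long directives in the assembly quoted
above.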


Re: fix math wrt volatile-bitfields vs C++ model

2014-06-12 Thread Richard Biener
On Wed, Jun 11, 2014 at 11:35 PM, DJ Delorie  wrote:
>
> If the combined bitfields are exactly the size of the mode, the logic
> for detecting range overflow is flawed - it calculates an ending
> "position" that's the position of the first bit in the next field.
>
> In the case of "short" for example, you get "16 > 15" without this
> patch (comparing size to position), and "15 > 15" with (comparing
> position to position).
>
> Ok to apply?

Looks ok to me, but can you add a testcase please?

Also check if 4.9 is affected.

Thanks,
Richard.

> * expmed.c (strict_volatile_bitfield_p): Fix off-by-one error.
>
> Index: expmed.c
> ===
> --- expmed.c(revision 211479)
> +++ expmed.c(working copy)
> @@ -472,13 +472,13 @@ strict_volatile_bitfield_p (rtx op0, uns
>   && bitnum % GET_MODE_ALIGNMENT (fieldmode) + bitsize > modesize))
>  return false;
>
>/* Check for cases where the C++ memory model applies.  */
>if (bitregion_end != 0
>&& (bitnum - bitnum % modesize < bitregion_start
> - || bitnum - bitnum % modesize + modesize > bitregion_end))
> + || bitnum - bitnum % modesize + modesize - 1 > bitregion_end))
>  return false;
>
>return true;
>  }
>
>  /* Return true if OP is a memory and if a bitfield of size BITSIZE at
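
The off-by-one described in this message can be sketched in isolation.  The
code below is an illustration only, not the expmed.c source: the function
names are hypothetical, and it assumes (as the "position to position" wording
suggests) that bitregion_end is the position of the last bit of the region,
inclusive.

```c
#include <assert.h>
#include <stdbool.h>

/* Old check: compares the position one past the last bit of the
   mode-sized access with the inclusive end of the region.  */
static bool
fits_region_old (unsigned bitnum, unsigned modesize,
                 unsigned bitregion_start, unsigned bitregion_end)
{
  unsigned start;

  assert (modesize != 0);
  start = bitnum - bitnum % modesize;
  return !(start < bitregion_start
           || start + modesize > bitregion_end);
}

/* Fixed check: compares the last bit of the access with the last
   bit of the region.  */
static bool
fits_region_new (unsigned bitnum, unsigned modesize,
                 unsigned bitregion_start, unsigned bitregion_end)
{
  unsigned start;

  assert (modesize != 0);
  start = bitnum - bitnum % modesize;
  return !(start < bitregion_start
           || start + modesize - 1 > bitregion_end);
}
```

For a 16-bit access into a region covering bits 0..15, the old check evaluates
16 > 15 and rejects the access, while the fixed check evaluates 15 > 15 and
accepts it.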


Re: [PATCH] PR rtl-optimization/61047

2014-06-12 Thread Eric Botcazou
> This patch tries to get safe lower and upper bounds where accesses
> are always guaranteed to work.  The goal is not to penalize
> reasonably written code:  when bootstrapping the whole of GCC,
> the new check triggered in only a few places.
> 
> Boot-strapped and regression-tested on x86_64-linux-gnu.
> Additionally built a cross compiler for a stack-grows-upward-target
> (xstormy16-elf).
> 
> Ok for trunk?

No, that's far too complicated a change for such a dumb artificial testcase.

I have suspended the PR.  I'd suggest concentrating on bug reports for real-
life software and/or new features, this would IMO be a better use of the time 
you devote to GCC.

-- 
Eric Botcazou


Re: [PATCH] PR rtl-optimization/61047

2014-06-12 Thread Richard Biener
On Thu, Jun 12, 2014 at 10:03 AM, Eric Botcazou  wrote:
>> This patch tries to get safe lower and upper bounds where accesses
>> are always guaranteed to work.  The goal is not to penalize
>> reasonably written code:  when bootstrapping the whole of GCC,
>> the new check triggered in only a few places.
>>
>> Boot-strapped and regression-tested on x86_64-linux-gnu.
>> Additionally built a cross compiler for a stack-grows-upward-target
>> (xstormy16-elf).
>>
>> Ok for trunk?
>
> No, that's far too complicated a change for such a dumb artificial testcase.
>
> I have suspended the PR.  I'd suggest concentrating on bug reports for real-
> life software and/or new features, this would IMO be a better use of the time
> you devote to GCC.

Btw, I wonder if we can simply mark the MEMs generated from spill code
with MEM_NOTRAP_P so we can remove the special casing of
frame-pointer-based addresses from add while properly initializing
MEM_NOTRAP_P from rtx_addr_can_trap_p?  I suppose it was added
exactly to cover spill code?

Otherwise I agree with Eric.

Richard.

> --
> Eric Botcazou


Re: Turn DECL_SECTION_NAME into string

2014-06-12 Thread Jan Hubicka
> On Thu, Jun 12, 2014 at 6:33 AM, Jan Hubicka  wrote:
> > Hi,
> > this lengthy patch does the legwork to move section names out of the tree
> > representation.
> > Originally they were STRING_CSTs. I ended up implementing an on-the-side
> > reference-counted string vocabulary that is done in a somewhat baroque way
> > to be GGC and PCH safe (uff).
> > The memory savings on Firefox are about 60MB, because while reading the
> > symbol table we now unify the many duplicated comdat group strings and also
> > free them after we bring those local.
> >
> > The old representation probably made sense when most of the strings came via
> > the __section__ attribute, where they were readily parsed as string constants.
> 
> I wonder why you didn't use IDENTIFIER_NODEs?  (ok, still trees ...)
> At least those are already GGC and PCH safe.

To be able to discard it effectively during LTO by ref counting.
IDENTIFIER_NODEs make sense for assembler names (sorta), since they may match
identifiers, and thus also for COMDAT_GROUPS that are taken from assembler names.
Section names do not match those, so having a separate pool for them seemed to
work best.

What happens at LTO is that we read all the sections for comdat groups and
then ipa-visibility dismantles them.

Anyway, it is now hidden by the API, so we can change it easily.

Honza
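
The idea of a separate, reference-counted pool of section names can be
sketched as follows.  This is only an illustration under simplifying
assumptions, not the symtab.c implementation: all names (name_entry, name_ref,
name_unref) are hypothetical, the pool is a linked list rather than a hash
table, and GGC/PCH safety is ignored entirely.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* One interned section name.  Hypothetical layout; GCC's
   section_hash_entry differs.  */
struct name_entry
{
  struct name_entry *next;
  unsigned ref_count;
  char *name;
};

/* A plain linked list keeps the sketch short; a real implementation
   would use a hash table.  */
static struct name_entry *pool;

/* Return the shared entry for NAME, creating it on first use, and
   take one reference on it.  */
static struct name_entry *
name_ref (const char *name)
{
  struct name_entry *e;
  size_t len;

  for (e = pool; e; e = e->next)
    if (strcmp (e->name, name) == 0)
      {
        e->ref_count++;
        return e;
      }

  len = strlen (name) + 1;
  e = malloc (sizeof *e);
  assert (e);
  e->name = malloc (len);
  assert (e->name);
  memcpy (e->name, name, len);
  e->ref_count = 1;
  e->next = pool;
  pool = e;
  return e;
}

/* Drop one reference; unlink and free the entry when the last user
   is gone.  E must have been returned by name_ref.  */
static void
name_unref (struct name_entry *e)
{
  struct name_entry **p;

  assert (e && e->ref_count > 0);
  if (--e->ref_count > 0)
    return;
  for (p = &pool; *p != e; p = &(*p)->next)
    ;
  *p = e->next;
  free (e->name);
  free (e);
}
```

Two nodes that ask for the same name share one entry, and the string is
released as soon as the last node drops its reference, which is the property
that lets duplicated comdat group strings be unified and later discarded.
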
> 
> Richard.
> 
> > Bootstrapped/regtested x86_64-linux, committed.
> >
> > Honza
> >
> > * symtab.c (section_hash): New hash.
> > (symtab_unregister_node): Clear section before freeing.
> > (hash_section_hash_entry): New hasher.
> > (eq_sections): New function.
> > (symtab_node::set_section_for_node): New method.
> > (set_section_1): Update.
> > (symtab_node::set_section): Take string instead of tree as 
> > parameter.
> > (symtab_resolve_alias): Update.
> > * cgraph.h (section_hash_entry_d): New structure.
> > (section_hash_entry): New typedef.
> > (cgraph_node): Change comdat_group_ to x_comdat_group,
> > change section_ to x_section and turn into section_hash_entry;
> > update accessors; put set_section_for_node offline.
> > * tree.c (decl_section_name): Turn into string.
> > (set_decl_section_name): Change parameter to be string.
> > * tree.h (decl_section_name, set_decl_section_name): Update 
> > prototypes.
> > * sdbout.c (sdbout_one_type): Update.
> > * tree-vect-data-refs.c (vect_can_force_dr_alignment_p): Update.
> > * varasm.c (IN_NAMED_SECTION, get_named_section, 
> > resolve_unique_section,
> > hot_function_section, get_named_text_section, 
> > USE_SELECT_SECTION_FOR_FUNCTIONS,
> > default_function_rodata_section, make_decl_rtl, 
> > default_unique_section):
> > Update.
> > * config/c6x/c6x.c (c6x_in_small_data_p): Update.
> > (c6x_elf_unique_section): Update.
> > * config/nios2/nios2.c (nios2_in_small_data_p): Update.
> > * config/pa/pa.c (pa_function_section): Update.
> > * config/pa/pa.h (IN_NAMED_SECTION_P): Update.
> > * config/ia64/ia64.c (ia64_in_small_data_p): Update.
> > * config/arc/arc.c (arc_in_small_data_p): Update.
> > * config/arm/unknown-elf.h (IN_NAMED_SECTION_P): Update.
> > * config/mcore/mcore.c (mcore_unique_section): Update.
> > * config/mips/mips.c (mips16_build_function_stub): Update.
> > (mips16_build_call_stub): Update.
> > (mips_function_rodata_section): Update.
> > (mips_in_small_data_p): Update.
> > * config/score/score.c (score_in_small_data_p): Update.
> > * config/rx/rx.c (rx_in_small_data): Update.
> > * config/rs6000/rs6000.c (rs6000_elf_in_small_data_p): Update.
> > (rs6000_xcoff_asm_named_section): Update.
> > (rs6000_xcoff_unique_section): Update.
> > * config/frv/frv.c (frv_string_begins_with): Update.
> > (frv_in_small_data_p): Update.
> > * config/v850/v850.c (v850_encode_data_area): Update.
> > * config/bfin/bfin.c (DECL_SECTION_NAME): Update.
> > (bfin_handle_l1_data_attribute): Update.
> > (bfin_handle_l2_attribute): Update.
> > * config/mep/mep.c (mep_unique_section): Update.
> > * config/microblaze/microblaze.c (microblaze_elf_in_small_data_p): 
> > Update.
> > * config/h8300/h8300.c (h8300_handle_eightbit_data_attribute): 
> > Update.
> > (h8300_handle_tiny_data_attribute): Update.
> > * config/m32r/m32r.c (m32r_in_small_data_p): Update.
> > (m32r_in_small_data_p): Update.
> > * config/alpha/alpha.c (alpha_in_small_data_p): Update.
> > * config/i386/i386.c (ix86_in_large_data_p): Update.
> > * config/i386/winnt.c (i386_pe_unique_section): Update.
> > * config/darwin.c (darwin_function_section): Update.
> > * config/lm32/lm32.c (lm32_in_small_data_p): Update.
> > * tree-emutls.c (get_em

Re: [PATCH, PR52252] Alternative way of vectorization for load groups of size 2 and 3.

2014-06-12 Thread Richard Biener
On Thu, Jun 12, 2014 at 6:04 AM, Evgeny Stupachenko  wrote:
> Testing finished. No new regressions.
> Is the following patch ok?

+  if (targetm.sched.reassociation_width (VEC_PERM_EXPR, mode) > 1 ||
+  !vect_shift_permute_load_chain (dr_chain, size, stmt, gsi,
&result_chain))

||s and &&s go to the next line.

I miss testcases that make sure the vectorizer/backend code-paths are
both exercised.  Put them in gcc.target/i386 and provide an appropriate
-march.

The vectorizer changes are ok with the above fixed, I defer to backend
maintainers for the i386 changes.

Richard.

> 2014-06-11  Evgeny Stupachenko  
>
> * config/i386/i386.c (ix86_reassociation_width): Add alternative for
> vector case.
> * config/i386/i386.h (TARGET_VECTOR_PARALLEL_EXECUTION): New.
> * config/i386/x86-tune.def (X86_TUNE_VECTOR_PARALLEL_EXECUTION): New.
> * tree-vect-data-refs.c (vect_shift_permute_load_chain): New.
> Introduces alternative way of loads group permutations.
> (vect_transform_grouped_load): Try alternative way of permutations.
>
> Thanks,
> Evgeny
>
> On Tue, Jun 10, 2014 at 4:43 PM, Evgeny Stupachenko  
> wrote:
>> ix86_reassociation_width checks INTEGRAL_MODE_P and FLOAT_MODE_P which
>> include vector mode.
>> I'll try to separate this into scalar and vector part, but it will
>> require more testing (under the testing now).
>> What about the rest of the patch?
>>
>> Thanks,
>> Evgeny
>>
>> On Thu, Jun 5, 2014 at 3:54 PM, Ramana Radhakrishnan
>>  wrote:
>>> On 06/05/14 12:43, Evgeny Stupachenko wrote:

>>>> New hook is related to vector instructions only. Vector instructions
>>>> could be sequential in pipeline, but scalar - parallel. For x86
>>>> architectures TARGET_SCHED_REASSOC_WIDTH does not give required
>>>> differentiation.
>>>> General hooks could be potentially reused in other algorithms/by other
>>>> architectures.
>>>
>>>
>>> It already takes a "mode" argument. Couldn't you use a vector mode to work
>>> this out ?
>>>
>>> If it is not enough then please be more specific about the documentation of
>>> this hook about where it is useful so that it's easy for people reading the
>>> documentation to understand at a glance what purpose it serves.
>>>
>>>
>>> Ramana
>>>
>>>

>>>> Thanks,
>>>> Evgeny
>>>>
>>>> On Thu, Jun 5, 2014 at 2:04 PM, Ramana Radhakrishnan wrote:
>>>>>
>>>>> On Wed, May 28, 2014 at 2:09 PM, Evgeny Stupachenko wrote:
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> The patch introduces alternative way of permutations for load groups
>>>>>> of size 2 and 3 which should be faster on architectures with low
>>>>>> parallelism.
>>>>>> The patch gives 2 times gain on Silvermont to the test from PR52252
>>>>>> (in addition to already committed 3 times gain).
>>>>>>
>>>>>> Patch passes bootstrap on x86. Make check is in progress.
>>>>>
>>>>>
>>>>> Why do we need a new hook ? Can't you derive this information from
>>>>> something which is equally badly named TARGET_SCHED_REASSOC_WIDTH
>>>>> though used in the reassociation logic but also serves a similar
>>>>> purpose ?
>>>>>
>>>>> Also the documentation of this hook is incomplete at best and wrong at
>>>>> worst as this is not applied everywhere in the vectorizer but just for
>>>>> this special case for load store permuting. Implying this is useful
>>>>> everywhere in the vectorizer does not appear to be correct.
>>>>>
>>>>> regards
>>>>> Ramana
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>>
>>>>>> ChangeLog:
>>>>>>
>>>>>> 2014-05-28  Evgeny Stupachenko
>>>>>>
>>>>>>  * config/i386/i386.c (ix86_have_vector_parallel_execution):
>>>>>> New.
>>>>>>  (TARGET_VECTORIZE_HAVE_VECTOR_PARALLEL_EXECUTION): New.
>>>>>>  * config/i386/i386.h (TARGET_VECTOR_PARALLEL_EXECUTION): New.
>>>>>>  * config/i386/x86-tune.def
>>>>>> (X86_TUNE_VECTOR_PARALLEL_EXECUTION): New.
>>>>>>  * target.def (have_vector_parallel_execution): New.
>>>>>>  * doc/tm.texi.in (have_vector_parallel_execution): New.
>>>>>>  * doc/tm.texi: Regenerate.
>>>>>>  * targhooks.c (default_have_vector_parallel_execution): New.
>>>>>>  * tree-vect-data-refs.c (vect_shift_permute_load_chain): New.
>>>>>>  Introduces alternative way of loads group permutations.
>>>>>>  (vect_transform_grouped_load): Try alternative way of
>>>>>> permutations.
>>>>>>
>>>>>> Evgeny


>>>


Re: [PATCH] Trust TREE_ADDRESSABLE

2014-06-12 Thread Eric Botcazou
> If we want to give frontends a way to pass information that address of a
> given global object is not taken (apparently useful for Ada and its alias
> attribute), then I do not think we are looking for middle-end only
> solution.

I don't feel very comfortable with doing that in Ada, since everybody seems to
be thinking that TREE_PUBLIC/DECL_EXTERNAL objects are implicitly addressable
(see for example Steven's reasoning in an earlier message).

> If we really do not want to revisit TREE_ADDRESSABLE in frontends, we can do
> the following:
>  1) change semantics of addressable flag on global variables in a way
>     Richard did, document it is initialized only after symbol table is built
>  2) add code to cgraph construction to set TREE_ADDRESSABLE on every global
>     variable it sees.
>     IPA visibility is run before early optimizations. I suppose we can set
>     it there.  I.e. in function_and_variable_visibility whenever we set
>     externally_visible and we have !in_lto_p
>     It is a bit of a hack.
>  3) perhaps add some way to avoid 2) on objects we want - apparently we now
>     have DECL_NONALIASED that may be useful for this.

Then how about using DECL_NONALIASED instead of TREE_ADDRESSABLE to achieve 
the initial goal here?  That is to say, may_be_aliased tests DECL_NONALIASED 
for TREE_PUBLIC/DECL_EXTERNAL DECLs and the LTO front-end sets it properly.

-- 
Eric Botcazou


Re: [PATCH] Trust TREE_ADDRESSABLE

2014-06-12 Thread Richard Biener
On Thu, 12 Jun 2014, Eric Botcazou wrote:

> > If we want to give frontends a way to pass information that address of a
> > given global object is not taken (apparently useful for Ada and its alias
> > attribute), then I do not think we are looking for middle-end only
> > solution.
> 
> I don't feel very comfortable with doing that in Ada, since everybody seems
> to be thinking that TREE_PUBLIC/DECL_EXTERNAL objects are implicitly addressable
> (see for example Steven's reasoning in an earlier message).
> 
> > If we really do not want to revisit TREE_ADDRESSABLE in frontends, we can do
> > the following:
> >  1) change semantics of addressable flag on global variables in a way
> >     Richard did, document it is initialized only after symbol table is built
> >  2) add code to cgraph construction to set TREE_ADDRESSABLE on every global
> >     variable it sees.
> >     IPA visibility is run before early optimizations. I suppose we can set
> >     it there.  I.e. in function_and_variable_visibility whenever we set
> >     externally_visible and we have !in_lto_p
> >     It is a bit of a hack.
> >  3) perhaps add some way to avoid 2) on objects we want - apparently we now
> >     have DECL_NONALIASED that may be useful for this.
> 
> Then how about using DECL_NONALIASED instead of TREE_ADDRESSABLE to achieve 
> the initial goal here?  That is to say, may_be_aliased tests DECL_NONALIASED 
> for TREE_PUBLIC/DECL_EXTERNAL DECLs and the LTO front-end sets it properly.

Btw, may_be_aliased already does that.  So yes, when LTO promotes something
from non-public to public but hidden visibility and TREE_ADDRESSABLE
was not set it could set DECL_NONALIASED.  That would at least preserve
the aliasing behavior we get without LTO.  If the resolution info
from the linker allows us to make initially public variables hidden
_and_ some LTO IPA pass proves that the variable's address is not taken
then that pass can set DECL_NONALIASED as well.

Of course one issue is that it's impossible to write a verifier that
checks whether DECL_NONALIASED and TREE_ADDRESSABLE are "out-of-sync"
(because by design they can be).  So it's a bit more fragile
(we could make the operand scanner that "updates" TREE_ADDRESSABLE
also unset DECL_NONALIASED of course).

Richard.
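
The interaction of the two flags discussed here can be modelled roughly as
below.  This is a deliberately simplified sketch and not the GCC predicate:
the struct, its field names and may_be_aliased_model are hypothetical
stand-ins, and other conditions the real may_be_aliased may check are left
out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the decl flags discussed in this thread.  */
struct decl_flags
{
  bool is_public;    /* models TREE_PUBLIC */
  bool is_external;  /* models DECL_EXTERNAL */
  bool is_static;    /* models TREE_STATIC */
  bool addressable;  /* models TREE_ADDRESSABLE */
  bool nonaliased;   /* models DECL_NONALIASED */
};

/* Simplified model: a decl may be aliased if it is global or has its
   address taken, unless it is explicitly marked non-aliased.  */
static bool
may_be_aliased_model (const struct decl_flags *d)
{
  bool global;

  assert (d);
  global = d->is_public || d->is_external;

  /* Globals are conservatively treated as addressable from elsewhere.  */
  if (!(global || d->addressable))
    return false;

  /* The non-aliased flag overrides that for static or global storage.  */
  if ((d->is_static || global) && d->nonaliased)
    return false;

  return true;
}
```

In this model the default (flag not set) is the conservative answer, and only
an explicit non-aliased mark turns a public variable into one that cannot be
aliased.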


Re: [PATCH] PR rtl-optimization/61047

2014-06-12 Thread Eric Botcazou
> Btw, I wonder if we can simply mark the MEMs generated from spill code
> with MEM_NOTRAP_P so we can remove the special casing of
> frame-pointer-based addresses from add while properly initializing
> MEM_NOTRAP_P from rtx_addr_can_trap_p?

Spill code generated by the compiler itself?  That's quite restrictive.

> I suppose it was added exactly to cover spill code?

Nope, it was added for jump tables:

2003-04-22  Richard Henderson  

PR 8866
* rtl.h (MEM_NOTRAP_P): New.
(MEM_COPY_ATTRIBUTES): Copy it.
* rtlanal.c (may_trap_p): Check it.
* expr.c (do_tablejump): Set it.
* doc/rtl.texi (Flags): Document it.

* cfgrtl.c (try_redirect_by_replacing_jump): Revert last three changes.

that is to say, for memory accesses that can nominally trap but for which we 
know that they actually don't.

-- 
Eric Botcazou


RE: [PATCH] PR rtl-optimization/61047

2014-06-12 Thread Bernd Edlinger
On  Thu, 12 Jun 2014 10:36:25, Eric Botcazou wrote:
>
>> Btw, I wonder if we can simply mark the MEMs generated from spill code
>> with MEM_NOTRAP_P so we can remove the special casing of
>> frame-pointer-based addresses from add while properly initializing
>> MEM_NOTRAP_P from rtx_addr_can_trap_p?
>
> Spill code generated by the compiler itself? That's quite restrictive.
>
>> I suppose it was added exactly to cover spill code?
>
> Nope, it was added for jump tables:
>
> 2003-04-22 Richard Henderson 
>
> PR 8866
> * rtl.h (MEM_NOTRAP_P): New.
> (MEM_COPY_ATTRIBUTES): Copy it.
> * rtlanal.c (may_trap_p): Check it.
> * expr.c (do_tablejump): Set it.
> * doc/rtl.texi (Flags): Document it.
>
> * cfgrtl.c (try_redirect_by_replacing_jump): Revert last three changes.
>
> that is to say, for memory accesses that can nominally trap but for which we
> know that they actually don't.
>
> --
> Eric Botcazou

Btw I am not sure at all,  why argp-references can never be dangerous?
For instance in a struct with an array inside, passed as function argument?


Bernd.
  

[RFC] Teaching SCC merging about unit local trees

2014-06-12 Thread Jan Hubicka
Richard,
as briefly discussed before, I would like to teach LTO type merging not to merge
types that were declared in anonymous namespaces, and to use C++ ODR type names
(stored in DECL_ASSEMBLER_NAME of the TYPE_DECL) to break down canonical types
by their names.

The first thing I need to arrange IMO is to not merge two anonymous types from
two different units.  While looking into it I noticed that the current code
in unify_scc that refuses to merge local decls produces conflicts and seems
a useless exercise.

This patch introduces a special hash code 1 that specifies that a given SCC is
known to be local and should bypass the merging logic. This is propagated down
and seems to quite noticeably reduce the size of the SCC hash:

[WPA] read 10190717 SCCs of average size 1.980409
[WPA] 20181785 tree bodies read in total
[WPA] tree SCC table: size 4194301, 1882700 elements, collision ratio: 0.815497
[WPA] tree SCC max chain length 140 (size 1)
[WPA] Compared 3392363 SCCs, 2718822 collisions (0.801454)
[WPA] Merged 3314075 SCCs
[WPA] Merged 9693632 tree bodies
[WPA] Merged 2467704 types
[WPA] 1783262 types prevailed (4491218 associated trees)
[WPA] GIMPLE canonical type table: size 131071, 94867 elements, 1783347 searches, 737056 collisions (ratio: 0.413299)
[WPA] GIMPLE canonical type pointer-map: 94867 elements, 3973875 searches
[WPA] Compression: 282828785 input bytes, 831186147 uncompressed bytes (ratio: 2.938832)
[WPA] Size of mmap'd section decls: 282828785 bytes

to:

[WPA] read 10172291 SCCs of average size 1.982162
[WPA] 20163124 tree bodies read in total
[WPA] tree SCC table: size 2097143, 988764 elements, collision ratio: 0.684967
[WPA] tree SCC max chain length 140 (size 1)
[WPA] Compared 3060932 SCCs, 2405009 collisions (0.785711)
[WPA] Merged 3040565 SCCs
[WPA] Merged 9246482 tree bodies
[WPA] Merged 2382312 types
[WPA] 1868611 types prevailed (4728465 associated trees)
[WPA] GIMPLE canonical type table: size 131071, 94910 elements, 1868696 searches, 790939 collisions (ratio: 0.423257)
[WPA] GIMPLE canonical type pointer-map: 94910 elements, 4216423 searches
[WPA] Compression: 273322455 input bytes, 824178095 uncompressed bytes (ratio: 3.015406)

We merge less, but not by much, and I think we were not right to merge in those
cases.

Would something like this make sense? (I am not saying my definition of
unit_local_tree_p is the most polished one :)

I think the next step could be to make anonymous types bypass the canonical type
merging (i.e. simply save the chains as they come from the frontends for those)
and then look into computing the type names in free lang data, using the ODR name
hash instead of the canonical type hash for those named types + link them to
canonical type hash entries, and if we end up with an unnamed type in the
canonical type hash, then make its alias class conflict with all the named types.

Honza
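
The propagation of the special hash value can be sketched like this.  It is an
illustration of the idea only, not the lto-streamer-out.c code: LOCAL_HASH,
mix_hash, visit_child and hash_node are hypothetical names, and the mixing
function is an arbitrary placeholder.

```c
#include <assert.h>
#include <stdint.h>

/* Reserved value meaning "known to be unit local, do not merge".  */
#define LOCAL_HASH 1u

typedef uint32_t hashval;

/* Placeholder mixer.  It never returns LOCAL_HASH, so that value stays
   reserved for local trees.  */
static hashval
mix_hash (hashval child, hashval acc)
{
  hashval h = (acc ^ child) * 0x9e3779b1u;

  h ^= h >> 15;
  return h == LOCAL_HASH ? LOCAL_HASH + 1 : h;
}

/* Fold one child hash into ACC; locality is contagious.  */
static hashval
visit_child (hashval child, hashval acc)
{
  if (acc == LOCAL_HASH || child == LOCAL_HASH)
    return LOCAL_HASH;
  return mix_hash (child, acc);
}

/* Hash a node from its own hash OWN and the hashes of its N children.
   Returns LOCAL_HASH if the node or anything it refers to is local.  */
static hashval
hash_node (hashval own, int is_unit_local,
           const hashval *children, unsigned n)
{
  hashval acc;
  unsigned i;

  assert (children || n == 0);
  if (is_unit_local)
    return LOCAL_HASH;
  acc = own == LOCAL_HASH ? LOCAL_HASH + 1 : own;
  for (i = 0; i < n; i++)
    acc = visit_child (children[i], acc);
  return acc;
}
```

A caller can then test the result against LOCAL_HASH once and skip the SCC
table lookup for anything that is local or refers to something local.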

Index: lto-streamer-out.c
===
--- lto-streamer-out.c  (revision 211488)
+++ lto-streamer-out.c  (working copy)
@@ -54,6 +54,47 @@ along with GCC; see the file COPYING3.
 #include "cfgloop.h"
 #include "builtins.h"
 
+/* Return if T can never be shared across units.  */
+static bool
+unit_local_tree_p (tree t)
+{
+  switch (TREE_CODE (t))
+{
+  case VAR_DECL:
+   /* Automatic variables are always unit local.  */
+   if (!TREE_STATIC (t) && !DECL_EXTERNAL (t)
+   && !DECL_HARD_REGISTER (t))
+ return true;
+   /* ... fall through ... */
+
+  case FUNCTION_DECL:
+   /* Non-public declarations are always local. */
+   if (!TREE_PUBLIC (t))
+ return true;
+
+   /* Public definitions that would cause linker error if
+  appeared in other unit.  */
+   if (TREE_PUBLIC (t)
+   && !DECL_EXTERNAL (t)
+   && !DECL_WEAK (t))
+ return true;
+   return false;
+  case NAMESPACE_DECL:
+   return !TREE_PUBLIC (t);
+  case TRANSLATION_UNIT_DECL:
+   return true;
+  case PARM_DECL:
+  case RESULT_DECL:
+  case LABEL_DECL:
+  case SSA_NAME:
+   return true;
+  default:
+   if (TYPE_P (t)
+   && type_in_anonymous_namespace_p (t))
+ return true;
+   return false;
+}
+}
 
 static void lto_write_tree (struct output_block*, tree, bool);
 
@@ -686,7 +727,9 @@ DFS_write_tree_body (struct output_block
 #undef DFS_follow_tree_edge
 }
 
-/* Return a hash value for the tree T.  */
+/* Return a hash value for the tree T. 
+   If T is local to unit or refers anything local to unit, return 1.
+   Otherwise return non-1.  */
 
 static hashval_t
 hash_tree (struct streamer_tree_cache_d *cache, tree t)
@@ -694,10 +737,19 @@ hash_tree (struct streamer_tree_cache_d
 #define visit(SIBLING) \
   do { \
 unsigned ix; \
+hashval_t h; \
 if (SIBLING && streamer_tree_cache_lookup (cache, SIBLING, &ix)) \
-  v = iterative_hash_hashval_t (streamer_tree_cache_get_hash (cache, ix), v); \
+  { \
+h = st

Re: [PATCH] Trust TREE_ADDRESSABLE

2014-06-12 Thread Jan Hubicka
> On Thu, 12 Jun 2014, Eric Botcazou wrote:
> 
> > > If we want to give frontends a way to pass information that address of a
> > > given global object is not taken (apparently useful for Ada and its alias
> > > attribute), then I do not think we are looking for middle-end only
> > > solution.
> > 
> > > I don't feel very comfortable with doing that in Ada, since everybody seems
> > > to be thinking that TREE_PUBLIC/DECL_EXTERNAL objects are implicitly
> > > addressable
> > > (see for example Steven's reasoning in an earlier message).
> > > 
> > > > If we really do not want to revisit TREE_ADDRESSABLE in frontends, we can
> > > > do the following:
> > > >  1) change semantics of addressable flag on global variables in a way
> > > >     Richard did, document it is initialized only after symbol table is
> > > >     built
> > > >  2) add code to cgraph construction to set TREE_ADDRESSABLE on every global
> > > >     variable it sees.
> > > >     IPA visibility is run before early optimizations. I suppose we can set
> > > >     it there.  I.e. in function_and_variable_visibility whenever we set
> > > >     externally_visible and we have !in_lto_p
> > > >     It is a bit of a hack.
> > > >  3) perhaps add some way to avoid 2) on objects we want - apparently we
> > > >     now have DECL_NONALIASED that may be useful for this.
> > 
> > Then how about using DECL_NONALIASED instead of TREE_ADDRESSABLE to achieve 
> > the initial goal here?  That is to say, may_be_aliased tests 
> > DECL_NONALIASED 
> > for TREE_PUBLIC/DECL_EXTERNAL DECLs and the LTO front-end sets it properly.
> 
> Btw, may_be_aliased already does that.  So yes, when LTO promotes something
> from non-public to public but hidden visibility and TREE_ADDRESSABLE
> was not set it could set DECL_NONALIASED.  That would at least preserve
> the aliasing behavior we get without LTO.  If the resolution info
> from the linker allows us to make initially public variables hidden
> _and_ some LTO IPA pass proves that the variable's address is not taken
> then that pass can set DECL_NONALIASED as well.

Yep, I suppose each time I clear the TREE_ADDRESSABLE flag, I can also set
DECL_NONALIASED.
> 
> Of course one issue is that it's impossible to write a verifier that
> checks whether DECL_NONALIASED and TREE_ADDRESSABLE are "out-of-sync"
> (because by design they can be).  So it's a bit more fragile
> (we could make the operand scanner that "updates" TREE_ADDRESSABLE
> also unset DECL_NONALIASED of course).

Hmm, when would one unset it?

Honza
> 
> Richard.


Re: [PATCH] PR rtl-optimization/61047

2014-06-12 Thread Eric Botcazou
> Btw I am not sure at all,  why argp-references can never be dangerous?
> For instance in a struct with an array inside, passed as function argument?

IMO there cannot be any definitive solution to this issue until after we move 
all the affected optimizations from RTL to GIMPLE.  In the meantime, the 
failure mode is not catastrophic (100% reproducible segfault) and there is 
always an easy workaround (generally a -fno-* switch).

-- 
Eric Botcazou


Re: [PATCH] Trust TREE_ADDRESSABLE

2014-06-12 Thread Eric Botcazou
> Btw, may_be_aliased already does that.

Indeed, and we could make use of that in Ada, at least in some cases.

> Of course one issue is that it's impossible to write a verifier that
> checks whether DECL_NONALIASED and TREE_ADDRESSABLE are "out-of-sync"
> (because by design they can be).  So it's a bit more fragile
> (we could make the operand scanner that "updates" TREE_ADDRESSABLE
> also unset DECL_NONALIASED of course).

IMO it's also more robust because the default (no DECL_NONALIASED) is safe.
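
To make the "safe default" point concrete, here is a minimal sketch in plain C.
It is a toy model only: struct toy_decl and toy_may_be_aliased are hypothetical
names, not GCC's tree structures or the real may_be_aliased, and the predicate
is deliberately simplified to the four flags discussed in this thread.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for the decl flags discussed here.  These are NOT GCC's
   tree structures; the field names are purely illustrative.  */
struct toy_decl
{
  bool is_public;    /* models TREE_PUBLIC */
  bool is_external;  /* models DECL_EXTERNAL */
  bool addressable;  /* models TREE_ADDRESSABLE */
  bool nonaliased;   /* models DECL_NONALIASED */
};

/* Simplified model: a decl may be aliased if it is visible outside the
   unit or has its address taken, unless it is explicitly marked as
   non-aliased.  */
static bool
toy_may_be_aliased (const struct toy_decl *d)
{
  return (d->is_public || d->is_external || d->addressable)
         && !d->nonaliased;
}

/* A public decl with no extra flag is treated conservatively; only an
   explicit nonaliased mark relaxes that.  */
static void
toy_demo (void)
{
  struct toy_decl d = { true, false, false, false };
  assert (toy_may_be_aliased (&d));
  d.nonaliased = true;
  assert (!toy_may_be_aliased (&d));
}
```

In this model, forgetting to set the flag only costs optimization
opportunities; it never makes the answer unsafe.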

-- 
Eric Botcazou


Re: [PATCH] Trust TREE_ADDRESSABLE

2014-06-12 Thread Richard Biener
On Thu, 12 Jun 2014, Jan Hubicka wrote:

> > On Thu, 12 Jun 2014, Eric Botcazou wrote:
> > 
> > > > > If we want to give frontends a way to pass information that address of a
> > > > > given global object is not taken (apparently useful for Ada and its
> > > > > alias attribute), then I do not think we are looking for middle-end only
> > > > > solution.
> > > 
> > > > I don't feel very comfortable with doing that in Ada, since everybody
> > > > seems to be thinking that TREE_PUBLIC/DECL_EXTERNAL objects are
> > > > implicitly addressable (see for example Steven's reasoning in an
> > > > earlier message).
> > > 
> > > > > If we really do not want to revisit TREE_ADDRESSABLE in frontends, we
> > > > > can do the following:
> > > > >  1) change semantics of addressable flag on global variables in a way
> > > > >     Richard did, document it is initialized only after symbol table is
> > > > >     built
> > > > >  2) add code to cgraph construction to set TREE_ADDRESSABLE on every
> > > > >     global variable it sees.
> > > > >     IPA visibility is run before early optimizations. I suppose we can
> > > > >     set it there.  I.e. in function_and_variable_visibility whenever we
> > > > >     set externally_visible and we have !in_lto_p
> > > > >     It is a bit of a hack.
> > > > >  3) perhaps add some way to avoid 2) on objects we want - apparently we
> > > > >     now have DECL_NONALIASED that may be useful for this.
> > > 
> > > > Then how about using DECL_NONALIASED instead of TREE_ADDRESSABLE to
> > > > achieve the initial goal here?  That is to say, may_be_aliased tests
> > > > DECL_NONALIASED for TREE_PUBLIC/DECL_EXTERNAL DECLs and the LTO
> > > > front-end sets it properly.
> > 
> > Btw, may_be_aliased already does that.  So yes, when LTO promotes sth
> > from non-public to public but hidden visibility and TREE_ADDRESSABLE
> > was not set it could set DECL_NONALIASED.  That would at least preserve
> > the aliasing behavior from without using LTO.  If the resolution info
> > from the linker allows us to make initial public variables hidden
> > _and_ some LTO IPA pass proves that the variables address is not taken
> > then that pass can set DECL_NONALIASED as well.
> 
> Yep, I suppose each time I clear the TREE_ADDRESSABLE flag, I can also set
> DECL_NONALIASED.
> > 
> > Of course one issue is that it's impossible to write a verifier that
> > checks whether DECL_NONALIASED and TREE_ADDRESSABLE are "out-of-sync"
> > (because by design they can be).  So it's a bit more fragile
> > (we could make the operand scanner that "updates" TREE_ADDRESSABLE
> > also unset DECL_NONALIASED of course).
> 
> Hmm, when would one unset it?

When you extract the address and use it.  For example when you
do auto-parallelization and outline a part of your function it
passes arrays as addresses.

Or if you start to introduce address induction variables like
the vectorizer or IVOPTs does.

Richard.


Minor cleanup

2014-06-12 Thread Eric Botcazou
There was apparently a last-minute name change for DECL_NONALIASED.

Tested on x86_64-suse-linux, applied on mainline and 4.9 branch as obvious.


2014-06-12  Eric Botcazou  

* tree-core.h (DECL_NONALIASED): Use proper spelling in comment.


-- 
Eric Botcazou

Index: tree-core.h
===
--- tree-core.h	(revision 211435)
+++ tree-core.h	(working copy)
@@ -1012,7 +1012,7 @@ struct GTY(()) tree_base {
SSA_NAME_IN_FREELIST in
   SSA_NAME
 
-   VAR_DECL_NONALIASED in
+   DECL_NONALIASED in
 	  VAR_DECL
 
deprecated_flag:


RE: [PATCH] PR rtl-optimization/61047

2014-06-12 Thread Bernd Edlinger
On Thu, 12 Jun 2014 10:50:29, Eric Botcazou wrote:
>
>> Btw I am not sure at all, why argp-references can never be dangerous?
>> For instance in a struct with an array inside, passed as function argument?
>
> IMO there cannot be any definitive solution to this issue until after we move
> all the affected optimizations from RTL to GIMPLE. In the meantime, the
> failure mode is not catastrophic (100% reproducible segfault) and there is
> always an easy workaround (generally a -fno-* switch).
>
> --
> Eric Botcazou

Not really 100% reproducible.  As a little surprise, the test case from the
tracker did _not_ crash when I initially put it in the testsuite.

The reason is probably that the stack layout in the test suite is a little
different, because the LD_LIBRARY_PATH environment variable is sooo long,
and all environment variables and arguments are at the top of the stack.

The test only produced the crash as expected when I changed this

  if (b == 2837)
    a = e[b];

into that:

  if (b == 28378)
    a = e[b];



Bernd.
  

[PATCH][RFC] Fix PR61473, inline small memcpy/memmove during tree opts

2014-06-12 Thread Richard Biener

This implements the requested inlining of memmove for possibly
overlapping arguments by doing first all loads and then all stores.
The easiest place is to do this in memory op folding where we already
perform inlining of some memcpy cases (but fail to do the equivalent
memcpy optimization - though RTL expansion later does it).

The following patch restricts us to max. word-mode size.  Ideally
we'd have a way to check for the number of real instructions needed
to load an (aligned) value of size N.  But maybe we don't care
and are fine with doing multiple loads / stores?

Anyway, the following is conservative (but maybe not enough).
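
As an illustration of the "all loads first, then all stores" idea at the
source level, here is a minimal sketch.  It is not the patch itself:
move_small is a hypothetical helper, and the fixed-size memcpy calls merely
stand in for the single load and single store the folder would emit.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Move N bytes (N a power of two, at most 8) by loading everything into
   a temporary first and only then storing, so an overlapping destination
   cannot clobber bytes that still need to be read.  */
static void
move_small (void *dest, const void *src, size_t n)
{
  uint64_t tmp = 0;
  assert (n != 0 && n <= sizeof tmp && (n & (n - 1)) == 0);
  memcpy (&tmp, src, n);   /* all loads */
  memcpy (dest, &tmp, n);  /* all stores */
}
```

Because the whole source is held in the temporary before anything is written,
the result matches memmove even when the two regions overlap.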

Bootstrap / regtest running on x86_64-unknown-linux-gnu.

These transforms don't really belong to GENERIC folding (they
also run at -O0 ...), similar to most builtin foldings.  But this
patch is not to change that.

Any comments on the size/cost issue?

Thanks,
Richard.

2014-06-12  Richard Biener  

PR middle-end/61473
* builtins.c (fold_builtin_memory_op): Inline memory moves
that can be implemented with a single load followed by a
single store.

* gcc.dg/memmove-4.c: New testcase.

Index: gcc/builtins.c
===
--- gcc/builtins.c  (revision 211449)
+++ gcc/builtins.c  (working copy)
@@ -8637,11 +8637,53 @@ fold_builtin_memory_op (location_t loc,
   unsigned int src_align, dest_align;
   tree off0;
 
-  if (endp == 3)
+  /* Build accesses at offset zero with a ref-all character type.  */
+  off0 = build_int_cst (build_pointer_type_for_mode (char_type_node,
+ptr_mode, true), 0);
+
+  /* If we can perform the copy efficiently with first doing all loads
+ and then all stores inline it that way.  Currently efficiently
+means that we can load all the memory into a single integer
+register and thus limited to word_mode size.  Ideally we'd have
+a way to query the largest mode that we can load/store with
+a signle instruction.  */
+  src_align = get_pointer_alignment (src);
+  dest_align = get_pointer_alignment (dest);
+  if (tree_fits_uhwi_p (len)
+ && compare_tree_int (len, BITS_PER_WORD / 8) <= 0)
{
- src_align = get_pointer_alignment (src);
- dest_align = get_pointer_alignment (dest);
+ unsigned ilen = tree_to_uhwi (len);
+ if (exact_log2 (ilen) != -1)
+   {
+ tree type = lang_hooks.types.type_for_size (ilen * 8, 1);
+ if (type
+ && TYPE_MODE (type) != BLKmode
+ && (GET_MODE_SIZE (TYPE_MODE (type)) * BITS_PER_UNIT
+ == ilen * 8)
+ /* If the pointers are not aligned we must be able to
+emit an unaligned load.  */
+ && ((src_align >= GET_MODE_ALIGNMENT (TYPE_MODE (type))
+  && dest_align >= GET_MODE_ALIGNMENT (TYPE_MODE (type)))
+ || !SLOW_UNALIGNED_ACCESS (TYPE_MODE (type),
+MIN (src_align, dest_align))))
+   {
+ tree srctype = type;
+ tree desttype = type;
+ if (src_align < GET_MODE_ALIGNMENT (TYPE_MODE (type)))
+   srctype = build_aligned_type (type, src_align);
+ if (dest_align < GET_MODE_ALIGNMENT (TYPE_MODE (type)))
+   desttype = build_aligned_type (type, dest_align);
+ destvar = fold_build2 (MEM_REF, desttype, dest, off0);
+ expr = build2 (MODIFY_EXPR, type,
+fold_build2 (MEM_REF, desttype, dest, off0),
+fold_build2 (MEM_REF, srctype, src, off0));
+ goto done;
+   }
+   }
+   }
 
+  if (endp == 3)
+   {
  /* Both DEST and SRC must be pointer types.
 ??? This is what old code did.  Is the testing for pointer types
 really mandatory?
@@ -8818,10 +8860,6 @@ fold_builtin_memory_op (location_t loc,
   if (!ignore)
 dest = builtin_save_expr (dest);
 
-  /* Build accesses at offset zero with a ref-all character type.  */
-  off0 = build_int_cst (build_pointer_type_for_mode (char_type_node,
-ptr_mode, true), 0);
-
   destvar = dest;
   STRIP_NOPS (destvar);
   if (TREE_CODE (destvar) == ADDR_EXPR
@@ -8888,6 +8926,7 @@ fold_builtin_memory_op (location_t loc,
   expr = build2 (MODIFY_EXPR, TREE_TYPE (destvar), destvar, srcvar);
 }
 
+done:
   if (ignore)
 return expr;
 
Index: gcc/testsuite/gcc.dg/memmove-4.c
===
--- gcc/testsuite/gcc.dg/memmove-4.c(revision 0)
+++ gcc/testsuite/gcc.dg/memmove-4.c(working copy)
@@ -0,0 +1,12 @@
+/* {

RE: [PATCH][RX] Patch to correct the functionality of compiler option -falign-labels=n

2014-06-12 Thread Sandeep Kumar Singh
Hi DJ,

> Have you checked the other alignment macros to see if they need to be
> fixed too?
Thank you for reviewing this patch.
Yes, I have checked the other alignment macros and they seem fine.

> This should be :
I have corrected this review comment.

Is this patch now ok to commit?

Best Regards,
Sandeep Kumar Singh


2014-06-12  Sandeep Kumar Singh  
* config/rx/rx.h (LABEL_ALIGN): Corrected macro LABEL_ALIGN



rx_align_labels.patch
Description: rx_align_labels.patch


Re: [RFC] Teaching SCC merging about unit local trees

2014-06-12 Thread Richard Biener
On Thu, Jun 12, 2014 at 10:47 AM, Jan Hubicka  wrote:
> Richard,
> as briefly discussed before, I would like to teach LTO type merging to not
> merge types that were declared in anonymous namespaces and use C++ ODR type
> names (stored in DECL_ASSEMBLER_NAME of the TYPE_DECL) to break down
> canonical types by their names.
>
> First thing I need to arrange IMO is to not merge two anonymous types from
> two different units.  While looking into it I noticed that the current code
> in unify_scc that refuses to merge local decls produces conflicts and seems
> a useless exercise to do.
>
> This patch introduces special hash code 1 that specifies that a given SCC is
> known to be local and should bypass the merging logic. This is propagated
> down and seems to quite noticeably reduce the size of the SCC hash:
>
> [WPA] read 10190717 SCCs of average size 1.980409
> [WPA] 20181785 tree bodies read in total
> [WPA] tree SCC table: size 4194301, 1882700 elements, collision ratio: 0.815497
> [WPA] tree SCC max chain length 140 (size 1)
> [WPA] Compared 3392363 SCCs, 2718822 collisions (0.801454)
> [WPA] Merged 3314075 SCCs
> [WPA] Merged 9693632 tree bodies
> [WPA] Merged 2467704 types
> [WPA] 1783262 types prevailed (4491218 associated trees)
> [WPA] GIMPLE canonical type table: size 131071, 94867 elements, 1783347 searches, 737056 collisions (ratio: 0.413299)
> [WPA] GIMPLE canonical type pointer-map: 94867 elements, 3973875 searches
> [WPA] Compression: 282828785 input bytes, 831186147 uncompressed bytes (ratio: 2.938832)
> [WPA] Size of mmap'd section decls: 282828785 bytes
>
> to:
>
> [WPA] read 10172291 SCCs of average size 1.982162
> [WPA] 20163124 tree bodies read in total
> [WPA] tree SCC table: size 2097143, 988764 elements, collision ratio: 0.684967
> [WPA] tree SCC max chain length 140 (size 1)
> [WPA] Compared 3060932 SCCs, 2405009 collisions (0.785711)
> [WPA] Merged 3040565 SCCs
> [WPA] Merged 9246482 tree bodies
> [WPA] Merged 2382312 types
> [WPA] 1868611 types prevailed (4728465 associated trees)
> [WPA] GIMPLE canonical type table: size 131071, 94910 elements, 1868696 searches, 790939 collisions (ratio: 0.423257)
> [WPA] GIMPLE canonical type pointer-map: 94910 elements, 4216423 searches
> [WPA] Compression: 273322455 input bytes, 824178095 uncompressed bytes (ratio: 3.015406)
>
> We merge less, but not by much, and I think we were not right to merge in
> those cases.

If we merge things we must not merge, then the fix belongs in
compare_tree_sccs_1, rather than in special cases like the ones you propose.

That is, if we are not allowed to merge anonymous namespaces then
make sure we don't.  We already should not merge types with
TYPE_CONTEXT == such namespace by means of

  /* ???  Global types from different TUs have non-matching
     TRANSLATION_UNIT_DECLs.  Still merge them if they are otherwise
     equal.  */
  if (TYPE_FILE_SCOPE_P (t1) && TYPE_FILE_SCOPE_P (t2))
    ;
  else
    compare_tree_edges (TYPE_CONTEXT (t1), TYPE_CONTEXT (t2));

but we possibly merge a subset of decl kinds from "different" namespaces:

  /* ???  Global decls from different TUs have non-matching
     TRANSLATION_UNIT_DECLs.  Only consider a small set of
     decls equivalent, we should not end up merging others.  */
  if ((code == TYPE_DECL
       || code == NAMESPACE_DECL
       || code == IMPORTED_DECL
       || code == CONST_DECL
       || (VAR_OR_FUNCTION_DECL_P (t1)
           && (TREE_PUBLIC (t1) || DECL_EXTERNAL (t1))))
      && DECL_FILE_SCOPE_P (t1) && DECL_FILE_SCOPE_P (t2))
    ;
  else
    compare_tree_edges (DECL_CONTEXT (t1), DECL_CONTEXT (t2));

Not sure what we end up doing for NAMESPACE_DECL itself (and what
fields we stream for it).  It would be interesting to check that.

Thus, make sure we don't merge namespace {} and namespace {} from
two different units.

But effectively you say we have two classes of "global" trees, first
those that are mergeable across TUs and second those that are not.
This IMHO means we want to separate those into two different LTO
sections and simply skip all the merging code for the second (instead
of adding hacks to the merging code).

Richard.

>
> Would something like this make sense? (I am not saying my definition of
> unit_local_tree_p is the most polished one :)
>
> I think the next step could be to make anonymous types bypass the canonical
> type merging (i.e. simply save the chains as they come from frontends for
> those) and then look into computing the type names in free lang data, using
> the ODR name hash instead of the canonical type hash for those named types +
> link them to canonical type hash entries, and if we end up with an unnamed
> type in the canonical type hash, then make its alias class conflict with all
> the named types.
>
> Honza
>
> Index: lto-streamer-out.c
> ===
> --- lto-streamer-out.c  (revision 211488)
> +++ lto-

Re: [RFC] Teaching SCC merging about unit local trees

2014-06-12 Thread Richard Biener
On Thu, Jun 12, 2014 at 12:29 PM, Richard Biener
 wrote:
> On Thu, Jun 12, 2014 at 10:47 AM, Jan Hubicka  wrote:
>> Richard,
>> as briefly discussed before, I would like to teach LTO type merging to not 
>> merge
>> types that was declared in anonymous namespaces and use C++ ODR type names
>> (stored in DECL_ASSEMBLER_NAME of the TYPE_DECL) to break down canonical 
>> types
>> by their names.
>>
>> First thing I need to arrange IMO is to not merge two anonymous types from
>> two different units.  While looking into it I noticed that the current code
>> in unify_scc that refuses to merge local decls produces conflicts and seems
>> useless excercise to do.
>>
>> This patch introduces special hash code 1 that specify that given SCC is 
>> known
>> to be local and should bypass the merging logic. This is propagated down and
>> seems to quite noticeably reduce size of SCC hash:
>>
>> [WPA] read 10190717 SCCs of average size 1.980409
>> [WPA] 20181785 tree bodies read in total
>> [WPA] tree SCC table: size 4194301, 1882700 elements, collision ratio: 
>> 0.815497
>> [WPA] tree SCC max chain length 140 (size 1)
>> [WPA] Compared 3392363 SCCs, 2718822 collisions (0.801454)
>> [WPA] Merged 3314075 SCCs
>> [WPA] Merged 9693632 tree bodies
>> [WPA] Merged 2467704 types
>> [WPA] 1783262 types prevailed (4491218 associated trees)
>> [WPA] GIMPLE canonical type table: size 131071, 94867 elements, 1783347 
>> searches, 737056 collisions (ratio: 0.413299)
>> [WPA] GIMPLE canonical type pointer-map: 94867 elements, 3973875 searches
>> [WPA] Compression: 282828785 input bytes, 831186147 uncompressed bytes 
>> (ratio: 2.938832)
>> [WPA] Size of mmap'd section decls: 282828785 bytes
>>
>> to:
>>
>> [WPA] read 10172291 SCCs of average size 1.982162
>> [WPA] 20163124 tree bodies read in total
>> [WPA] tree SCC table: size 2097143, 988764 elements, collision ratio: 
>> 0.684967
>> [WPA] tree SCC max chain length 140 (size 1)
>> [WPA] Compared 3060932 SCCs, 2405009 collisions (0.785711)
>> [WPA] Merged 3040565 SCCs
>> [WPA] Merged 9246482 tree bodies
>> [WPA] Merged 2382312 types
>> [WPA] 1868611 types prevailed (4728465 associated trees)
>> [WPA] GIMPLE canonical type table: size 131071, 94910 elements, 1868696 
>> searches, 790939 collisions (ratio: 0.423257)
>> [WPA] GIMPLE canonical type pointer-map: 94910 elements, 4216423 searches
>> [WPA] Compression: 273322455 input bytes, 824178095 uncompressed bytes 
>> (ratio: 3.015406)
>>
>> We merge less, but not by much and I think we was not right not merge in 
>> that cases.
>
> If we merge things we may not merge then the fix is to compare_tree_sccs_1,
> not introducing special cases like you propose.
>
> That is, if we are not allowed to merge anonymous namespaces then
> make sure we don't.  We already should not merge types with
> TYPE_CONTEXT == such namespace by means of
>
>   /* ???  Global types from different TUs have non-matching
>  TRANSLATION_UNIT_DECLs.  Still merge them if they are otherwise
>  equal.  */
>   if (TYPE_FILE_SCOPE_P (t1) && TYPE_FILE_SCOPE_P (t2))
> ;
>   else
> compare_tree_edges (TYPE_CONTEXT (t1), TYPE_CONTEXT (t2));
>
> but we possibly merge a subset of decl kinds from "different" namespaces :
>
>   /* ???  Global decls from different TUs have non-matching
>  TRANSLATION_UNIT_DECLs.  Only consider a small set of
>  decls equivalent, we should not end up merging others.  */
>   if ((code == TYPE_DECL
>|| code == NAMESPACE_DECL
>|| code == IMPORTED_DECL
>|| code == CONST_DECL
>|| (VAR_OR_FUNCTION_DECL_P (t1)
>&& (TREE_PUBLIC (t1) || DECL_EXTERNAL (t1
>   && DECL_FILE_SCOPE_P (t1) && DECL_FILE_SCOPE_P (t2))
> ;
>   else
> compare_tree_edges (DECL_CONTEXT (t1), DECL_CONTEXT (t2));
>
> Not sure what we end up doing for NAMESPACE_DECL itself (and what
> fields we stream for it).  It would be interesting to check that.
>
> Thus, make sure we don't merge namespace {} and namespace {} from
> two different units.
>
> But effectively you say we have two classes of "global" trees, first
> those that are mergeable across TUs and second those that are not.
> This IMHO means we want to separate those to two different LTO
> sections and simply skip all the merging code for the second (instead
> of adding hacks to the merging code).

That also restricts the "pointers" we can have:  mergeable stuff
may not refer to non-mergeable stuff.  This breaks down for initializers:

static int x;
int *p = &x;

though you could say that as p is initialized (thus not DECL_COMMON)
this instance cannot be merged with anything else - other entities
are 'extern int *p' (tree merging is different from symtab merging).

Thus int *p = &x; is also non-mergeable (everything that has tree
pointers referring to something non-mergeable is itself non-mergeable).
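
The propagation rule in the last sentence can be sketched as a small
fixed-point computation.  This is only a toy model under stated assumptions:
struct toy_node and compute_non_mergeable are hypothetical names, nodes are
array indices rather than trees, and nothing here reflects how the LTO
streamer actually represents SCCs.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_REFS 4

/* Toy tree node, NOT a GCC structure.  `refs' holds the indices of the
   nodes this one points to; -1 marks an unused slot.  */
struct toy_node
{
  bool local;          /* unit-local by itself, e.g. `static int x' */
  int refs[MAX_REFS];
};

/* Mark every node that is local or (transitively) refers to a local
   node.  Iterating to a fixed point also handles cycles.  */
static void
compute_non_mergeable (const struct toy_node *nodes, int n,
                       bool *non_mergeable)
{
  for (int i = 0; i < n; i++)
    non_mergeable[i] = nodes[i].local;

  bool changed = true;
  while (changed)
    {
      changed = false;
      for (int i = 0; i < n; i++)
        {
          if (non_mergeable[i])
            continue;
          for (int k = 0; k < MAX_REFS; k++)
            {
              int r = nodes[i].refs[k];
              if (r >= 0 && r < n && non_mergeable[r])
                {
                  non_mergeable[i] = true;
                  changed = true;
                  break;
                }
            }
        }
    }
}
```

With node 0 standing for `static int x' and node 1 for `int *p = &x', node 1
ends up non-mergeable purely because of its reference to node 0.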

We have similar "issues" with tree_is_indexable and pointers violating
constraints (like 

Re: ipa-visibility TLC 2/n

2014-06-12 Thread Rainer Orth
Hi Honza,

>> Unfortunately, AIX isn't the only target massively affected by your
>> recent patches.  This all started with r210597
>> 
>> 2014-05-17  Jan Hubicka  
>> 
>>  * tree-pass.h (make_pass_ipa_comdats): New pass.
>> * timevar.def (TV_IPA_COMDATS): New timevar.
>> * passes.def (pass_ipa_comdats): Add.
>> * Makefile.in (OBJS): Add ipa-comdats.o
>> * ipa-comdats.c: New file.
>> 
>> At that time, only Solaris 11 with gas/Solaris ld was affected: many Go
>> tests started failing like this:
>> 
>> runtime.SetFinalizer: cannot pass * os  os.file to finalizer func(*  
>>os  os.file) error
>> fatal error: runtime.SetFinalizer
>
> Thanks for letting me know.  This is a different transformation than the one
> causing trouble on AIX (AIX has no comdats, so this pass does nothing).  Go
> seems to be quite a heavy user of comdat locals produced by that patch, so I
> suppose they somehow break with Solaris.
>
> Comdat locals are now used by ipa-comdats, for thunks and for decloned ctors.
> We probably need to figure out a bit more precisely the limitation of Solaris
> and either fix it or add a way for the target to say what kind of comdat
> locals are not supported.

Right.  I'll start reghunting for the patch that caused additional
breakage even without comdat, as on Solaris 10.

> Can I reproduce your setup on the compile farm?

According to https://gcc.gnu.org/wiki/CompileFarm, there are no Solaris
machines or VMs in the compile farm.  If a VM could be set up (no idea
if they allow non-free OSes beyond AIX there), I'd suggest starting with
Solaris 11.2 Beta
(http://www.oracle.com/technetwork/server-storage/solaris11/downloads/beta-2182939.html),
which has the latest in /bin/ld support.  I can certainly help with
setting something up.

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


Re: [RFC] Teaching SCC merging about unit local trees

2014-06-12 Thread Richard Biener
On Thu, Jun 12, 2014 at 12:34 PM, Richard Biener
 wrote:
> On Thu, Jun 12, 2014 at 12:29 PM, Richard Biener
>  wrote:
>> On Thu, Jun 12, 2014 at 10:47 AM, Jan Hubicka  wrote:
>>> Richard,
>>> as briefly discussed before, I would like to teach LTO type merging to not 
>>> merge
>>> types that was declared in anonymous namespaces and use C++ ODR type names
>>> (stored in DECL_ASSEMBLER_NAME of the TYPE_DECL) to break down canonical 
>>> types
>>> by their names.
>>>
>>> First thing I need to arrange IMO is to not merge two anonymous types from
>>> two different units.  While looking into it I noticed that the current code
>>> in unify_scc that refuses to merge local decls produces conflicts and seems
>>> useless excercise to do.
>>>
>>> This patch introduces special hash code 1 that specify that given SCC is 
>>> known
>>> to be local and should bypass the merging logic. This is propagated down and
>>> seems to quite noticeably reduce size of SCC hash:
>>>
>>> [WPA] read 10190717 SCCs of average size 1.980409
>>> [WPA] 20181785 tree bodies read in total
>>> [WPA] tree SCC table: size 4194301, 1882700 elements, collision ratio: 
>>> 0.815497
>>> [WPA] tree SCC max chain length 140 (size 1)
>>> [WPA] Compared 3392363 SCCs, 2718822 collisions (0.801454)
>>> [WPA] Merged 3314075 SCCs
>>> [WPA] Merged 9693632 tree bodies
>>> [WPA] Merged 2467704 types
>>> [WPA] 1783262 types prevailed (4491218 associated trees)
>>> [WPA] GIMPLE canonical type table: size 131071, 94867 elements, 1783347 
>>> searches, 737056 collisions (ratio: 0.413299)
>>> [WPA] GIMPLE canonical type pointer-map: 94867 elements, 3973875 searches
>>> [WPA] Compression: 282828785 input bytes, 831186147 uncompressed bytes 
>>> (ratio: 2.938832)
>>> [WPA] Size of mmap'd section decls: 282828785 bytes
>>>
>>> to:
>>>
>>> [WPA] read 10172291 SCCs of average size 1.982162
>>> [WPA] 20163124 tree bodies read in total
>>> [WPA] tree SCC table: size 2097143, 988764 elements, collision ratio: 
>>> 0.684967
>>> [WPA] tree SCC max chain length 140 (size 1)
>>> [WPA] Compared 3060932 SCCs, 2405009 collisions (0.785711)
>>> [WPA] Merged 3040565 SCCs
>>> [WPA] Merged 9246482 tree bodies
>>> [WPA] Merged 2382312 types
>>> [WPA] 1868611 types prevailed (4728465 associated trees)
>>> [WPA] GIMPLE canonical type table: size 131071, 94910 elements, 1868696 
>>> searches, 790939 collisions (ratio: 0.423257)
>>> [WPA] GIMPLE canonical type pointer-map: 94910 elements, 4216423 searches
>>> [WPA] Compression: 273322455 input bytes, 824178095 uncompressed bytes 
>>> (ratio: 3.015406)
>>>
>>> We merge less, but not by much and I think we was not right not merge in 
>>> that cases.
>>
>> If we merge things we may not merge then the fix is to compare_tree_sccs_1,
>> not introducing special cases like you propose.
>>
>> That is, if we are not allowed to merge anonymous namespaces then
>> make sure we don't.  We already should not merge types with
>> TYPE_CONTEXT == such namespace by means of
>>
>>   /* ???  Global types from different TUs have non-matching
>>  TRANSLATION_UNIT_DECLs.  Still merge them if they are otherwise
>>  equal.  */
>>   if (TYPE_FILE_SCOPE_P (t1) && TYPE_FILE_SCOPE_P (t2))
>> ;
>>   else
>> compare_tree_edges (TYPE_CONTEXT (t1), TYPE_CONTEXT (t2));
>>
>> but we possibly merge a subset of decl kinds from "different" namespaces :
>>
>>   /* ???  Global decls from different TUs have non-matching
>>  TRANSLATION_UNIT_DECLs.  Only consider a small set of
>>  decls equivalent, we should not end up merging others.  */
>>   if ((code == TYPE_DECL
>>|| code == NAMESPACE_DECL
>>|| code == IMPORTED_DECL
>>|| code == CONST_DECL
>>|| (VAR_OR_FUNCTION_DECL_P (t1)
>>&& (TREE_PUBLIC (t1) || DECL_EXTERNAL (t1
>>   && DECL_FILE_SCOPE_P (t1) && DECL_FILE_SCOPE_P (t2))
>> ;
>>   else
>> compare_tree_edges (DECL_CONTEXT (t1), DECL_CONTEXT (t2));
>>
>> Not sure what we end up doing for NAMESPACE_DECL itself (and what
>> fields we stream for it).  It would be interesting to check that.
>>
>> Thus, make sure we don't merge namespace {} and namespace {} from
>> two different units.
>>
>> But effectively you say we have two classes of "global" trees, first
>> those that are mergeable across TUs and second those that are not.
>> This IMHO means we want to separate those to two different LTO
>> sections and simply skip all the merging code for the second (instead
>> of adding hacks to the merging code).
>
> As that also restricts the "pointers" we can have.  Mergeable stuff
> may not refer to non-mergeable stuff.  Breaks down for initializers:
>
> static int x;
> int *p = &x;
>
> though you could say that as p is initialized (thus not DECL_COMMON)
> this instance cannot be merged with anything else - other entities
> are 'extern int *p' (tree merging is different from symtab merging).
>
> Thus int *p = &x; is also non-m

Re: [PATCH] PR rtl-optimization/61047

2014-06-12 Thread Richard Biener
On Thu, Jun 12, 2014 at 10:36 AM, Eric Botcazou  wrote:
>> Btw, I wonder if we can simply mark the MEMs generated from spill code
>> with MEM_NOTRAP_P so we can remove the special casing of
>> frame-pointer-based addresses from add while properly initializing
>> MEM_NOTRAP_p from rtx_addr_can_trap_p?
>
> Spill code generated by the compiler itself?  That's quite restrictive.
>
>> I suppose it was added exactly to cover spill code?
>
> Nope, it was added for jump tables:
>
> 2003-04-22  Richard Henderson  
>
> PR 8866
> * rtl.h (MEM_NOTRAP_P): New.
> (MEM_COPY_ATTRIBUTES): Copy it.
> * rtlanal.c (may_trap_p): Check it.
> * expr.c (do_tablejump): Set it.
> * doc/rtl.texi (Flags): Document it.
>
> * cfgrtl.c (try_redirect_by_replacing_jump): Revert last three 
> changes.
>
> that is to say, for memory accesses that can nominally trap but for which we
> know that they actually don't.

I was asking about the special-casing of frame-pointer-based accesses in
rtx_addr_can_trap_p, not about MEM_NOTRAP_P.  (MEM_NOTRAP_P
of course has the issue that it may not be trusted when you try to
move the MEM ...)

Richard.

> --
> Eric Botcazou


[GOMP4, COMMITTED] OpenACC if clause.

2014-06-12 Thread Thomas Schwinge
From: tschwinge 

gcc/c/
* c-parser.c (c_parser_oacc_all_clauses): Handle
PRAGMA_OMP_CLAUSE_IF.
(OACC_DATA_CLAUSE_MASK, OACC_KERNELS_CLAUSE_MASK)
(OACC_PARALLEL_CLAUSE_MASK, OACC_UPDATE_CLAUSE_MASK): Add it.
gcc/
* omp-low.c (scan_sharing_clauses): Allow OMP_CLAUSE_IF.
(expand_oacc_offload, expand_omp_target): Handle it.
gcc/testsuite/
* c-c++-common/goacc/if-clause-1.c: New file.
* c-c++-common/goacc/if-clause-2.c: Likewise.
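
For readers unfamiliar with the clause, a minimal usage sketch follows.  It
is not one of the new testcases (their contents are not shown here);
scale_in_place and the threshold 1024 are made up for illustration.  Without
OpenACC support enabled the pragma is simply ignored and the loop runs on the
host.

```c
#include <assert.h>

/* Multiply each element of A by S.  The `if' clause asks for offloading
   only when the condition is true; otherwise the region is executed on
   the host.  The threshold is an arbitrary illustrative value.  */
static void
scale_in_place (float *a, int n, float s)
{
  assert (n >= 0);
#pragma acc kernels if(n > 1024) copy(a[0:n])
  {
    for (int i = 0; i < n; i++)
      a[i] *= s;
  }
}
```

The result is the same either way; the clause only selects where the region
executes.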

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gomp-4_0-branch@211510 138bc75d-0d04-0410-961f-82ee72b054a4
---
 gcc/ChangeLog.gomp |  6 ++
 gcc/c/ChangeLog.gomp   |  8 +++
 gcc/c/c-parser.c   | 10 +++-
 gcc/omp-low.c  | 81 +-
 gcc/testsuite/ChangeLog.gomp   |  5 ++
 gcc/testsuite/c-c++-common/goacc/if-clause-1.c |  8 +++
 gcc/testsuite/c-c++-common/goacc/if-clause-2.c | 11 
 7 files changed, 112 insertions(+), 17 deletions(-)
 create mode 100644 gcc/testsuite/c-c++-common/goacc/if-clause-1.c
 create mode 100644 gcc/testsuite/c-c++-common/goacc/if-clause-2.c

diff --git gcc/ChangeLog.gomp gcc/ChangeLog.gomp
index be1aa16..2abe179 100644
--- gcc/ChangeLog.gomp
+++ gcc/ChangeLog.gomp
@@ -1,3 +1,9 @@
+2014-06-12  Thomas Schwinge  
+   James Norris  
+
+   * omp-low.c (scan_sharing_clauses): Allow OMP_CLAUSE_IF.
+   (expand_oacc_offload, expand_omp_target): Handle it.
+
 2014-06-06  Thomas Schwinge  
James Norris  
 
diff --git gcc/c/ChangeLog.gomp gcc/c/ChangeLog.gomp
index f1e45f3..108ce3e 100644
--- gcc/c/ChangeLog.gomp
+++ gcc/c/ChangeLog.gomp
@@ -1,3 +1,11 @@
+2014-06-12  Thomas Schwinge  
+   James Norris  
+
+   * c-parser.c (c_parser_oacc_all_clauses): Handle
+   PRAGMA_OMP_CLAUSE_IF.
+   (OACC_DATA_CLAUSE_MASK, OACC_KERNELS_CLAUSE_MASK)
+   (OACC_PARALLEL_CLAUSE_MASK, OACC_UPDATE_CLAUSE_MASK): Add it.
+
 2014-06-06  Thomas Schwinge  
James Norris  
 
diff --git gcc/c/c-parser.c gcc/c/c-parser.c
index bf4bad62..6269923 100644
--- gcc/c/c-parser.c
+++ gcc/c/c-parser.c
@@ -10203,7 +10203,7 @@ c_parser_omp_clause_final (c_parser *parser, tree list)
   return list;
 }
 
-/* OpenMP 2.5:
+/* OpenACC, OpenMP 2.5:
if ( expression ) */
 
 static tree
@@ -11295,6 +11295,10 @@ c_parser_oacc_all_clauses (c_parser *parser, omp_clause_mask mask,
  clauses = c_parser_oacc_data_clause (parser, c_kind, clauses);
  c_name = "host";
  break;
+   case PRAGMA_OMP_CLAUSE_IF:
+ clauses = c_parser_omp_clause_if (parser, clauses);
+ c_name = "if";
+ break;
case PRAGMA_OMP_CLAUSE_NUM_GANGS:
  clauses = c_parser_omp_clause_num_gangs (parser, clauses);
  c_name = "num_gangs";
@@ -11614,6 +11618,7 @@ c_parser_omp_structured_block (c_parser *parser)
| (OMP_CLAUSE_MASK_1 << PRAGMA_OMP_CLAUSE_COPYOUT)  \
| (OMP_CLAUSE_MASK_1 << PRAGMA_OMP_CLAUSE_CREATE)   \
| (OMP_CLAUSE_MASK_1 << PRAGMA_OMP_CLAUSE_DEVICEPTR)\
+   | (OMP_CLAUSE_MASK_1 << PRAGMA_OMP_CLAUSE_IF)   \
| (OMP_CLAUSE_MASK_1 << PRAGMA_OMP_CLAUSE_PRESENT)  \
| (OMP_CLAUSE_MASK_1 << PRAGMA_OMP_CLAUSE_PRESENT_OR_COPY)  \
| (OMP_CLAUSE_MASK_1 << PRAGMA_OMP_CLAUSE_PRESENT_OR_COPYIN)\
@@ -11649,6 +11654,7 @@ c_parser_oacc_data (location_t loc, c_parser *parser)
| (OMP_CLAUSE_MASK_1 << PRAGMA_OMP_CLAUSE_COPYOUT)  \
| (OMP_CLAUSE_MASK_1 << PRAGMA_OMP_CLAUSE_CREATE)   \
| (OMP_CLAUSE_MASK_1 << PRAGMA_OMP_CLAUSE_DEVICEPTR)\
+   | (OMP_CLAUSE_MASK_1 << PRAGMA_OMP_CLAUSE_IF)   \
| (OMP_CLAUSE_MASK_1 << PRAGMA_OMP_CLAUSE_PRESENT)  \
| (OMP_CLAUSE_MASK_1 << PRAGMA_OMP_CLAUSE_PRESENT_OR_COPY)  \
| (OMP_CLAUSE_MASK_1 << PRAGMA_OMP_CLAUSE_PRESENT_OR_COPYIN)\
@@ -11727,6 +11733,7 @@ c_parser_oacc_loop (location_t loc, c_parser *parser, char *p_name)
| (OMP_CLAUSE_MASK_1 << PRAGMA_OMP_CLAUSE_COPYOUT)  \
| (OMP_CLAUSE_MASK_1 << PRAGMA_OMP_CLAUSE_CREATE)   \
| (OMP_CLAUSE_MASK_1 << PRAGMA_OMP_CLAUSE_DEVICEPTR)\
+   | (OMP_CLAUSE_MASK_1 << PRAGMA_OMP_CLAUSE_IF)   \
| (OMP_CLAUSE_MASK_1 << PRAGMA_OMP_CLAUSE_NUM_GANGS)\
| (OMP_CLAUSE_MASK_1 << PRAGMA_OMP_CLAUSE_NUM_WORKERS)  \
| (OMP_CLAUSE_MASK_1 << PRAGMA_OMP_CLAUSE_PRESENT)  \
@@ -11775,6 +11782,7 @@ c_parser_oacc_parallel (location_t loc, c_parser 
*parser, char *p_name)
 #define OACC_UPDATE_CLAUSE_MASK
\
( (OMP_CLAUSE_MASK_1 << PRAGMA_OMP_CLAUSE_DEVICE)   \
| (OMP_CLAUSE_MASK_1 << PR

[GOMP4, COMMITTED] Different configure and make flags for target vs. accelerator GCC.

2014-06-12 Thread Thomas Schwinge
From: tschwinge 

--enable-target-gcc-configure-flags, EXTRA_TARGET_GCC_FLAGS vs.
--enable-accelerator-gcc-configure-flags, EXTRA_ACCELERATOR_GCC_FLAGS.

* configure.ac (--enable-target-gcc-configure-flags)
(--enable-accelerator-gcc-configure-flags): New configure options.
* Makefile.def (gcc, accel-gcc): Handle these as well as new
EXTRA_TARGET_GCC_FLAGS and EXTRA_ACCELERATOR_GCC_FLAGS make flags.
* configure: Regenerate.
* Makefile.in: Regenerate.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gomp-4_0-branch@211513 
138bc75d-0d04-0410-961f-82ee72b054a4
---
 ChangeLog.gomp |   9 +
 Makefile.def   |   6 ++-
 Makefile.in| 114 ++---
 configure  |  31 
 configure.ac   |  17 +
 5 files changed, 121 insertions(+), 56 deletions(-)

diff --git ChangeLog.gomp ChangeLog.gomp
index c264057..46892a8 100644
--- ChangeLog.gomp
+++ ChangeLog.gomp
@@ -1,3 +1,12 @@
+2014-06-12  Thomas Schwinge  
+
+   * configure.ac (--enable-target-gcc-configure-flags)
+   (--enable-accelerator-gcc-configure-flags): New configure options.
+   * Makefile.def (gcc, accel-gcc): Handle these as well as new
+   EXTRA_TARGET_GCC_FLAGS and EXTRA_ACCELERATOR_GCC_FLAGS make flags.
+   * configure: Regenerate.
+   * Makefile.in: Regenerate.
+
 2014-03-20  Bernd Schmidt  
 
* Makefile.def (host_modules, dependencies): Add accel-gcc entries.
diff --git Makefile.def Makefile.def
index 89bfc07..e5fbd5c 100644
--- Makefile.def
+++ Makefile.def
@@ -44,10 +44,12 @@ host_modules= { module= fixincludes; bootstrap=true;
 host_modules= { module= flex; no_check_cross= true; };
 host_modules= { module= gas; bootstrap=true; };
 host_modules= { module= gcc; bootstrap=true; 
-   extra_make_flags="$(EXTRA_GCC_FLAGS)"; };
+   extra_configure_flags='@extra_target_gcc_configure_flags@';
+   extra_make_flags="$(EXTRA_GCC_FLAGS) 
$(EXTRA_TARGET_GCC_FLAGS)"; };
 host_modules= { module= accel-gcc;
actual_module=gcc;
-   
extra_configure_flags='--enable-as-accelerator-for=$(target_alias)'; };
+   
extra_configure_flags='--enable-as-accelerator-for=$(target_alias) 
@extra_accelerator_gcc_configure_flags@';
+   extra_make_flags="$(EXTRA_ACCELERATOR_GCC_FLAGS)"; };
 host_modules= { module= gmp; lib_path=.libs; bootstrap=true;
extra_configure_flags='--disable-shared';
no_install= true;
diff --git Makefile.in Makefile.in
index 85ec2c2..9ad7a51 100644
--- Makefile.in
+++ Makefile.in
@@ -10075,7 +10075,7 @@ configure-gcc:
libsrcdir="$$s/gcc"; \
$(SHELL) $${libsrcdir}/configure \
  $(HOST_CONFIGARGS) --build=${build_alias} --host=${host_alias} \
- --target=$${this_target} $${srcdiroption}  \
+ --target=$${this_target} $${srcdiroption} 
@extra_target_gcc_configure_flags@ \
  || exit 1
 @endif gcc
 
@@ -10109,7 +10109,8 @@ configure-stage1-gcc:
$(SHELL) $${libsrcdir}/configure \
  $(HOST_CONFIGARGS) --build=${build_alias} --host=${host_alias} \
  --target=${target_alias} $${srcdiroption} \
- $(STAGE1_CONFIGURE_FLAGS)
+ $(STAGE1_CONFIGURE_FLAGS) \
+ @extra_target_gcc_configure_flags@
 @endif gcc-bootstrap
 
 .PHONY: configure-stage2-gcc maybe-configure-stage2-gcc
@@ -10142,7 +10143,8 @@ configure-stage2-gcc:
  $(HOST_CONFIGARGS) --build=${build_alias} --host=${host_alias} \
  --target=${target_alias} $${srcdiroption} \
  --with-build-libsubdir=$(HOST_SUBDIR) \
- $(STAGE2_CONFIGURE_FLAGS)
+ $(STAGE2_CONFIGURE_FLAGS) \
+ @extra_target_gcc_configure_flags@
 @endif gcc-bootstrap
 
 .PHONY: configure-stage3-gcc maybe-configure-stage3-gcc
@@ -10175,7 +10177,8 @@ configure-stage3-gcc:
  $(HOST_CONFIGARGS) --build=${build_alias} --host=${host_alias} \
  --target=${target_alias} $${srcdiroption} \
  --with-build-libsubdir=$(HOST_SUBDIR) \
- $(STAGE3_CONFIGURE_FLAGS)
+ $(STAGE3_CONFIGURE_FLAGS) \
+ @extra_target_gcc_configure_flags@
 @endif gcc-bootstrap
 
 .PHONY: configure-stage4-gcc maybe-configure-stage4-gcc
@@ -10208,7 +10211,8 @@ configure-stage4-gcc:
  $(HOST_CONFIGARGS) --build=${build_alias} --host=${host_alias} \
  --target=${target_alias} $${srcdiroption} \
  --with-build-libsubdir=$(HOST_SUBDIR) \
- $(STAGE4_CONFIGURE_FLAGS)
+ $(STAGE4_CONFIGURE_FLAGS) \
+ @extra_target_gcc_configure_flags@
 @endif gcc-bootstrap
 
 .PHONY: configure-stageprofile-gcc maybe-configure-stageprofile-gcc
@@ -10241,7 +10245,8 @@ configure-stageprofile-gcc:
  $(HOST_CONFIGARGS) --build=${build_alias} --host=${host_alias} \
  --target=${target_alias} $${srcdiroption} \
  --with-build-libsubdir=$(HOST_SUBDIR) \
- $(STAGEprofile_CONFIGURE_FLAGS)
+  

[PATCH] Fix gennews

2014-06-12 Thread Richard Biener

It seems the https transition broke referring to the permanently moved
URL gcc-3.0/gcc-3.0.html (I get a certificate error or some such),
breaking gennews and thus gcc_release.  Fixed as below, which
makes gennews succeed.

Committed to the 4.7 branch.

Richard.

2014-06-12  Richard Biener  

* gennews: Use gcc-3.0/index.html.

Index: contrib/gennews
===
--- contrib/gennews (revision 211221)
+++ contrib/gennews (working copy)
@@ -36,7 +36,7 @@ files="
 gcc-3.3/index.html gcc-3.3/changes.html
 gcc-3.2/index.html gcc-3.2/changes.html
 gcc-3.1/index.html gcc-3.1/changes.html
-gcc-3.0/gcc-3.0.html gcc-3.0/features.html gcc-3.0/caveats.html
+gcc-3.0/index.html gcc-3.0/features.html gcc-3.0/caveats.html
 gcc-2.95/index.html gcc-2.95/features.html gcc-2.95/caveats.html
 egcs-1.1/index.html egcs-1.1/features.html egcs-1.1/caveats.html
 egcs-1.0/index.html egcs-1.0/features.html egcs-1.0/caveats.html"


GCC 4.7 branch is now closed

2014-06-12 Thread Richard Biener

The GCC 4.7 branch is now closed; please refrain from committing anything
there.

Richard.


[linaro/gcc-4_9-branch] Merge from gcc-4_9-branch and backports

2014-06-12 Thread Yvan Roux
Hi all,

we have merged the gcc-4_9-branch into linaro/gcc-4_9-branch up to
revision 211054 as r211495.  We have also backported this set of revisions:

r209419 as r211497 : PR rtl-optimization/60663
r209457 as r211496 : TRY_EMPTY_VM_SPACE Change aarch64 ilp32
r209559 as r211498 : [AArch64] vrnd<*>_f64 patch
r209561 as r211505 : Suppress Redundant Flag Setting for Cortex-A15.
r209613 as r211506 : AArch32 Support ORN for DIMode
r209614 as r211507 : Optimise NotDI AND/OR ZeroExtendSI for ARMv7A
r209615 as r211508 : [ARM] Allow any register for DImode values in Thumb2
r209617 as r211509 : [AArch64] Fix possible wrong code generation when
comparing DImode values.
r209618 as r211511 : [AArch64] Add a space to memory asm code between
base register and offset.
r209627 as r211512 : [AArch64] Fix indentation.
r209636 as r211512 : [AArch64] Fix aarch64_initial_elimination_offset
calculation.
r209640 as r211514 : [AArch64] vqneg and vqabs intrinsics implementation.
r209641 as r211515 : [AArch64] Vreinterpret re-implementation.
r209642 as r211515 : [AArch64] 64-bit float vreinterpret implementation
r209643 as r211516 : [AArch64] Define TARGET_FLAGS_REGNUM
r209645 as r211517 : [AArch64] Fix TLS for ILP32.
r209649 as r211518 : Merge longlong.h from glibc tree.
r209659 as r211519 : AArch64 add, sub, mul in TImode
r209701 as r211520 : [ARM] Handle FMA code in rtx costs.
r209702 as r211520 : [ARM] Cortex-A8 rtx cost table
r209703 as r211520 : [ARM][1/3] Add rev field to rtx cost tables
r209704 as r211520 : [AArch64][2/3] Recognise rev16 operations on
SImode and DImode data
r209705 as r211520 : [ARM][3/3] Recognise bitwise operations leading
to SImode rev16
r209706 as r211521 : [AArch64] Add handling of bswap operations in rtx costs
r209710 as r211523 : [ARM] Initialize new tune_params values
r209711 as r211524 : [AArch64] Fully support rotate on logical operations.
r209712 as r211530 : [AARCH64] Use standard patterns for stack protection.
r209713 as r211560 : [AArch64] VDUP Testcases
r209736 as r211573 : [AArch64] Vectorise bswap[16,32,64]
r209742 as r211574 : [AArch64] Reverse TBL indices for big-endian.
r209747 as r211575 : Fix warning in libgfortran configure script
r209749 as r211574 : [AArch64] Enable TBL for big-endian.
r209806 as r211576 : [ARM] Initialise T16-related fields in Cortex-A8
tuning struct.
r209808 as r211577 : [ARM] Enable tail call optimization for long call
r209878 as r211578 : [AArch64] Relax modes_tieable_p and
cannot_change_mode_class
r209880 as r211579 : [AArch64] Improve vst4_lane intrinsics
r209893 as r211580 : Add execution + assembler tests of the AArch64
ZIP Intrinsics.
r209897 as r211581 : Remove PUSH_ARGS_REVERSED from the RTL expander.
r209906 as r211582 : [AArch64/ARM 2/3] Rewrite AArch64 ZIP Intrinsics
using __builtin_shuffle
r209908 as r211582 : Add execution tests of ARM ZIP Intrinsics.
r210615 as r211583 : libitm: Enable aarch64
r211211 as r211584 : [AARCH64]Support full addressing modes for
ldr/str in vectorization scenarios

This will be part of our 2014.06 release.

Thanks,
Yvan


Re: ipa-visibility TLC 2/n

2014-06-12 Thread Rainer Orth
Rainer Orth  writes:

> Hi Honza,
>
>>> Unfortunately, AIX isn't the only target massively affected by your
>>> recent patches.  This all started with r210597
>>> 
>>> 2014-05-17  Jan Hubicka  
>>> 
>>> * tree-pass.h (make_pass_ipa_comdats): New pass.
>>> * timevar.def (TV_IPA_COMDATS): New timevar.
>>> * passes.def (pass_ipa_comdats): Add.
>>> * Makefile.in (OBJS): Add ipa-comdats.o
>>> * ipa-comdats.c: New file.
>>> 
>>> At that time, only Solaris 11 with gas/Solaris ld was affected: many Go
>>> tests started failing like this:
>>> 
>>> runtime.SetFinalizer: cannot pass * os  os.file to finalizer func(* 
>>> os  os.file) error
>>> fatal error: runtime.SetFinalizer
>>
>> Thanks for letting me know.  This is a different transformation than the one
>> causing trouble on AIX (AIX has no comdats, so this pass does nothing).  Go
>> seems to be quite a heavy user of comdat locals produced by that patch, so I
>> suppose they somehow break with Solaris.
>>
>> Comdat locals are now used by ipa-comdats, for thunks and for decloned ctors.
>> We probably need to figure out a bit more precisely what the Solaris limitation
>> is and either fix it or add a way for the target to say what kinds of comdat
>> locals are not supported.
>
> Right.  I'll start reghunting for the patch that caused additional
> breakage even without comdat, as on Solaris 10.

It turned out that those failures have been caused by the last libgo
merge, rev 211328: many 64-bit tests FAIL like this:

FAIL: go.go-torture/execute/chan-1.go execution,  -O0

fatal error: all goroutines are asleep - deadlock!

goroutine 16 [chan send]:
created by main
/vol/gcc/src/hg/trunk/local/libgo/runtime/go-main.c:42

There have been massive changes to libgo/runtime/chan.c, perhaps one of
them is the culprit.

It's hard to keep track with so much breakage these days ;-(

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


terse notation diagnostics

2014-06-12 Thread Andrew Sutton
Adds additional checks and tests for ill-formed programs.

2014-06-12  Andrew Sutton  
* gcc/cp/parser.c (cp_check_type_concept): New.
(cp_check_concept_name): Remove redundant condition from check.
Diagnose misuse of non-type concepts in constrained type specifiers.
* gcc/testsuite/g++.dg/concepts/generic-fn.C: Add tests for
non-simple constrained-type-specifiers and nested-name-specifiers
in concept names.
* gcc/testsuite/g++.dg/concepts/generic-fn-err.C: New tests for
diagnosing ill-formed programs.

Committed in r211585.

Andrew Sutton
Index: parser.c
===
--- parser.c	(revision 211476)
+++ parser.c	(working copy)
@@ -15132,11 +15132,22 @@ cp_parser_type_name (cp_parser* parser)
   return type_decl;
 }
 
-
+/// Returns true if proto is a type parameter, but not a template template
+/// parameter.
+static bool
+cp_check_type_concept (tree proto, tree fn) 
+{
+  if (TREE_CODE (proto) != TYPE_DECL) 
+{
+  error ("invalid use of non-type concept %qD", fn);
+  return false;
+}
+  return true;
+}
 
 // If DECL refers to a concept, return a TYPE_DECL representing the result
 // of using the constrained type specifier in the current context. 
-
+//
 // DECL refers to a concept if
 //   - it is an overload set containing a function concept taking a single
 // type argument, or
@@ -15173,9 +15184,13 @@ cp_check_concept_name (cp_parser* parser
 
   // In template paramteer scope, this results in a constrained parameter.
   // Return a descriptor of that parm.
-  if (template_parm_scope_p () && processing_template_parmlist)
+  if (processing_template_parmlist)
 return build_constrained_parameter (proto, fn);
 
+  // In any other context, a concept must be a type concept.
+  if (!cp_check_type_concept (proto, fn))
+return error_mark_node;
+
   // In a parameter-declaration-clause, constrained-type specifiers
   // result in invented template parameters.
   if (parser->auto_is_implicit_function_template_parm_p)
Index: generic-fn.C
===
--- generic-fn.C	(revision 211476)
+++ generic-fn.C	(working copy)
@@ -1,11 +1,16 @@
+// { dg-do run }
 // { dg-options "-std=c++1y" }
 
 #include 
+#include 
 
 template
   concept bool C() { return __is_class(T); }
 
-struct S { } s;
+template
+  concept bool Type() { return true; }
+
+struct S { };
 
 int called;
 
@@ -50,7 +55,43 @@ template
   };
 
 
+void ptr(C*) { called = 1; }
+void ptr(const C*) { called = 2; }
+
+void ref(C&) { called = 1; }
+void ref(const C&) { called = 2; }
+
+void 
+fwd_lvalue_ref(Type&& x) {
+  using T = decltype(x);
+  static_assert(std::is_lvalue_reference::value, "not an lvlaue reference");
+}
+
+void 
+fwd_const_lvalue_ref(Type&& x) {
+  using T = decltype(x);
+  static_assert(std::is_lvalue_reference::value, "not an lvalue reference");
+  using U = typename std::remove_reference::type;
+  static_assert(std::is_const::value, "not const-qualified");
+}
+
+void fwd_rvalue_ref(Type&& x) {
+  using T = decltype(x);
+  static_assert(std::is_rvalue_reference::value, "not an rvalue reference");
+}
+
+// Make sure we can use nested names speicifers for concept names.
+namespace N {
+  template
+concept bool C() { return true; }
+} // namesspace N
+
+void foo(N::C x) { }
+
 int main() {
+  S s;
+  const S cs;
+
   f(0); assert(called == 1);
   g(s); assert(called == 2);
 
@@ -60,7 +101,6 @@ int main() {
   S1 s1;
   s1.f1(0); assert(called == 1);
   s1.f2(s); assert(called == 2);
-  // s1.f2(0); // Error
 
   s1.f3(0); assert(called == 1);
   s1.f3(s); assert(called == 2);
@@ -68,26 +108,35 @@ int main() {
   S2 s2;
   s2.f1(0); assert(called == 1);
   s2.f2(s); assert(called == 2);
-  // s2.f2(0); // Error
 
   s2.f3(0); assert(called == 1);
   s2.f3(s); assert(called == 2);
 
   s2.h1(0); assert(called == 1);
   s2.h2(s); assert(called == 2);
-  // s2.h2(0); // Error
 
   s2.h3(0); assert(called == 1);
   s2.h3(s); assert(called == 2);
 
   s2.g1(s, s); assert(called == 1); 
-  // s2.g(s, 0); // Error
-  // s2.g(0, s); // Error
-
   s2.g2(s, s); assert(called == 2);
-  // s2.g(s, 0); // Error
+
+  ptr(&s); assert(called == 1);
+  ptr(&cs); assert(called == 2);
+
+  ref(s); assert(called == 1);
+  ref(cs); assert(called == 2);
+
+  // Check forwarding problems
+  fwd_lvalue_ref(s);
+  fwd_const_lvalue_ref(cs);
+  fwd_rvalue_ref(S());
+
+  foo(0);
 }
 
+// Test that decl/def matching works.
+
 void p(auto x) { called = 1; }
 void p(C x) { called = 2; }
 
Index: generic-fn-err.C
===
--- generic-fn-err.C	(revision 0)
+++ generic-fn-err.C	(revision 0)
@@ -0,0 +1,51 @@
+// { dg-options "-std=c++1y" }
+
+#include 
+
+template
+  concept bool C() { return __is_class(T); }
+
+template
+  concept bool Int() { return true; }
+
+template class X>
+  concept bool Template() { return true; }

[PATCH][AArch64] Add predicate for storewb_pair/loadwb_pair

2014-06-12 Thread Jiong Wang

This patch adds a predicate for storewb_pair/loadwb_pair, because AArch64
register-pair push and pop instructions only accept a constant offset
within a certain range.
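
For readers who want the range spelled out, the check behind the new predicate can be sketched as below. This is a minimal standalone sketch, not the GCC source: `size` stands in for GET_MODE_SIZE (mode), the function name is local to the example, and the alignment test is an assumption implied by "scaled" (the quoted hunk is cut off before that point).

```c
#include <assert.h>
#include <stdbool.h>

/* Sketch of a "7-bit signed, scaled" offset check.  SIZE stands in for
   GET_MODE_SIZE (mode); the function name is local to this example.  */
static bool
sketch_offset_7bit_signed_scaled_p (long long size, long long offset)
{
  assert (size > 0);
  return (offset >= -64 * size
          && offset < 64 * size
          /* Assumption: a scaled offset must be a multiple of SIZE.  */
          && offset % size == 0);
}
```

With size 8 (a 64-bit register) this accepts multiples of 8 from -512 up to 504, and rejects values such as 512 (out of range) or 4 (not a multiple of the size).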

OK for trunk?
Thanks.

gcc/ChangeLog:

2014-06-12  Renlin Li  

  * config/aarch64/aarch64.c (offset_7bit_signed_scaled_p): Rename to
'aarch64_offset_7bit_signed_scaled_p', remove static and use it.
  * config/aarch64/aarch64-protos.h (aarch64_offset_7bit_signed_scaled_p):
New Declaration.
  * config/aarch64/predicates.md (aarch64_mem_pair_offset): New predicate.
  * config/aarch64/aarch64.md (loadwb_pair): Use aarch64_mem_pair_offset.
(storewb_pair): Likewise.
diff --git a/gcc/config/aarch64/aarch64-protos.h b/gcc/config/aarch64/aarch64-protos.h
index 68d488d..d39ecc5 100644
--- a/gcc/config/aarch64/aarch64-protos.h
+++ b/gcc/config/aarch64/aarch64-protos.h
@@ -193,6 +193,7 @@ bool aarch64_modes_tieable_p (enum machine_mode mode1,
 bool aarch64_move_imm (HOST_WIDE_INT, enum machine_mode);
 bool aarch64_mov_operand_p (rtx, enum aarch64_symbol_context,
 			enum machine_mode);
+bool aarch64_offset_7bit_signed_scaled_p (enum machine_mode, HOST_WIDE_INT);
 char *aarch64_output_scalar_simd_mov_immediate (rtx, enum machine_mode);
 char *aarch64_output_simd_mov_immediate (rtx, enum machine_mode, unsigned);
 bool aarch64_pad_arg_upward (enum machine_mode, const_tree);
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index f69457a..192caf4 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -3122,8 +3122,9 @@ aarch64_classify_index (struct aarch64_address_info *info, rtx x,
   return false;
 }
 
-static inline bool
-offset_7bit_signed_scaled_p (enum machine_mode mode, HOST_WIDE_INT offset)
+bool
+aarch64_offset_7bit_signed_scaled_p (enum machine_mode mode,
+ HOST_WIDE_INT offset)
 {
   return (offset >= -64 * GET_MODE_SIZE (mode)
 	  && offset < 64 * GET_MODE_SIZE (mode)
@@ -3195,12 +3196,12 @@ aarch64_classify_address (struct aarch64_address_info *info,
 	 We conservatively require an offset representable in either mode.
 	   */
 	  if (mode == TImode || mode == TFmode)
-	return (offset_7bit_signed_scaled_p (mode, offset)
+	return (aarch64_offset_7bit_signed_scaled_p (mode, offset)
 		&& offset_9bit_signed_unscaled_p (mode, offset));
 
 	  if (outer_code == PARALLEL)
 	return ((GET_MODE_SIZE (mode) == 4 || GET_MODE_SIZE (mode) == 8)
-		&& offset_7bit_signed_scaled_p (mode, offset));
+		&& aarch64_offset_7bit_signed_scaled_p (mode, offset));
 	  else
 	return (offset_9bit_signed_unscaled_p (mode, offset)
 		|| offset_12bit_unsigned_scaled_p (mode, offset));
@@ -3255,12 +3256,12 @@ aarch64_classify_address (struct aarch64_address_info *info,
 	 We conservatively require an offset representable in either mode.
 	   */
 	  if (mode == TImode || mode == TFmode)
-	return (offset_7bit_signed_scaled_p (mode, offset)
+	return (aarch64_offset_7bit_signed_scaled_p (mode, offset)
 		&& offset_9bit_signed_unscaled_p (mode, offset));
 
 	  if (outer_code == PARALLEL)
 	return ((GET_MODE_SIZE (mode) == 4 || GET_MODE_SIZE (mode) == 8)
-		&& offset_7bit_signed_scaled_p (mode, offset));
+		&& aarch64_offset_7bit_signed_scaled_p (mode, offset));
 	  else
 	return offset_9bit_signed_unscaled_p (mode, offset);
 	}
diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index fec2ea8..e15747f 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -949,7 +949,7 @@
   [(parallel
 [(set (match_operand:P 0 "register_operand" "=k")
   (plus:P (match_operand:P 1 "register_operand" "0")
-  (match_operand:P 4 "const_int_operand" "n")))
+  (match_operand:P 4 "aarch64_mem_pair_offset" "n")))
  (set (match_operand:GPI 2 "register_operand" "=r")
   (mem:GPI (plus:P (match_dup 1)
(match_dup 4
@@ -967,7 +967,7 @@
   [(parallel
 [(set (match_operand:P 0 "register_operand" "=&k")
   (plus:P (match_operand:P 1 "register_operand" "0")
-  (match_operand:P 4 "const_int_operand" "n")))
+  (match_operand:P 4 "aarch64_mem_pair_offset" "n")))
  (set (mem:GPI (plus:P (match_dup 0)
(match_dup 4)))
   (match_operand:GPI 2 "register_operand" "r"))
diff --git a/gcc/config/aarch64/predicates.md b/gcc/config/aarch64/predicates.md
index 2702a3c..478de11 100644
--- a/gcc/config/aarch64/predicates.md
+++ b/gcc/config/aarch64/predicates.md
@@ -123,6 +123,10 @@
(match_test "INTVAL (op) != 0
 		&& (unsigned) exact_log2 (INTVAL (op)) < 64")))
 
+(define_predicate "aarch64_mem_pair_offset"
+  (and (match_code "const_int")
+   (match_test "aarch64_offset_7bit_signed_scaled_p (mode, INTVAL (op))")))
+
 (define_predicate "aarch64_mem_pair_operand"
   (and (match_code "mem")
(match_test "aarch64_legitimate_address_p (mode, XEXP (op, 0), PARALLEL,

Re: [Patch ARM/testsuite 00/22] Neon intrinsics executable tests

2014-06-12 Thread Christophe Lyon
On 12 June 2014 04:31, Mike Stump  wrote:
> On Jun 10, 2014, at 3:03 PM, Ramana Radhakrishnan  
> wrote:
>> I am a bit ambivalent between getting folks to add scan-assembler
>> tests here and worrying between this and getting the behaviour
>> correct. Additionally if you add the complexity of scanning for
>> aarch64 as well this starts getting messy.
>>
>> At this point I'm going to wait to see if any of the testsuite
>> maintainers step in and comment and if not I'll start looking at this
>> properly early next week.
>
> [ ducks ] So, I wasn’t going to comment…  If you guys do something really
> stupid, I’ll scream, as hopefully will others.  Doing something a little
> misguided I don’t think hurts much.  The worst case is that you figure out in a
> year or two why it was a bad idea and then fix it; not the end of the world.

If the execution part is OK and the scan-assembler is questionable, I
can just remove that part (or leave it commented until we decide
otherwise).


Re: [PING*2][PATCH] Extend mode-switching to support toggle (1/2)

2014-06-12 Thread Christian Bruel
On 06/11/2014 02:00 PM, Christian Bruel wrote:
> On 06/11/2014 06:17 AM, Joern Rennecke wrote:
>>> Joern, is this new target macro interface OK with you?
>> Yes, this interface should allow me to do switches between rounding
>> and truncating
>> floating-point modes with an add/subtract immediate.
>>
>> However, the implementation, as posted, doesn't work - it causes memory
>> corruption.
>>
>> It appears to work with the attached amendment patch.
>>
> Indeed,  thanks for pointing out the bad reusing of the aux field
> between multiple entities.
>
> In fact rereading this part of the implementation, I find the allocation
> of aux*n_entities awkward. A simpler setting in the entity loop to carry
> the mode directly into eg->aux is possible without array allocation
> (which also fixes a memory leak by the way).
>

Here is the revised version fixing the aforementioned issue found by
Joern on Epiphany. It also simplifies the allocation of the aux edges
field to carry the modes.

Now that everyone agrees on the interface, is this OK for trunk?

Bootstrapped/regtested for x86 and SH4a.

thanks,

Christian






2014-06-12  Christian Bruel  

	* mode-switching.c (struct bb_info): Add mode_out, mode_in caches.
	(make_preds_opaque): Delete.
	(clear_mode_bit, mode_bit_p, set_mode_bit): New macros.
	(commit_mode_sets): New function.
	(optimize_mode_switching): Handle current_mode to mode_switching_emit.
	Process all modes at once.
	* basic-block.h (pre_edge_lcm_avs): Declare.
	* lcm.c (pre_edge_lcm_avs): Renamed from pre_edge_lcm.
	Call clear_aux_for_edges. Fix comments.
	(pre_edge_lcm): New wrapper function to call pre_edge_lcm_avs.
	(pre_edge_rev_lcm): Idem.
	* config/epiphany/epiphany.c (emit_set_fp_mode): Add prev_mode parameter.
	* config/epiphany/epiphany-protos.h (emit_set_fp_mode): Idem.
	* config/epiphany/resolve-sw-modes.c (pass_resolve_sw_modes::execute): Idem.
	* config/i386/i386.c (ix86_emit_mode_set): Idem.
	* config/sh/sh.c (sh_emit_mode_set): Likewise. Handle PR toggle.
	* config/sh/sh.md (toggle_pr): Defined if TARGET_FPU_SINGLE.
	(fpscr_toggle): Disallow from delay slot.
	* target.def (emit_mode_set): Add prev_mode parameter.
	* doc/tm.texi: Regenerate.

2014-06-12  Christian Bruel  

	* gcc.target/sh/fpchg.c: New test.

Index: gcc/basic-block.h
===
--- gcc/basic-block.h	(revision 211436)
+++ gcc/basic-block.h	(working copy)
@@ -711,6 +711,9 @@ extern void bitmap_union_of_preds (sbitmap, sbitma
 extern struct edge_list *pre_edge_lcm (int, sbitmap *, sbitmap *,
    sbitmap *, sbitmap *, sbitmap **,
    sbitmap **);
+extern struct edge_list *pre_edge_lcm_avs (int, sbitmap *, sbitmap *,
+	   sbitmap *, sbitmap *, sbitmap *,
+	   sbitmap *, sbitmap **, sbitmap **);
 extern struct edge_list *pre_edge_rev_lcm (int, sbitmap *,
 	   sbitmap *, sbitmap *,
 	   sbitmap *, sbitmap **,
Index: gcc/config/epiphany/epiphany-protos.h
===
--- gcc/config/epiphany/epiphany-protos.h	(revision 211436)
+++ gcc/config/epiphany/epiphany-protos.h	(working copy)
@@ -40,7 +40,8 @@ extern int epiphany_initial_elimination_offset (in
 extern void epiphany_init_expanders (void);
 extern int hard_regno_mode_ok (int regno, enum machine_mode mode);
 #ifdef HARD_CONST
-extern void emit_set_fp_mode (int entity, int mode, HARD_REG_SET regs_live);
+extern void emit_set_fp_mode (int entity, int mode, int prev_mode,
+			  HARD_REG_SET regs_live);
 #endif
 extern void epiphany_insert_mode_switch_use (rtx insn, int, int);
 extern void epiphany_expand_set_fp_mode (rtx *operands);
Index: gcc/config/epiphany/epiphany.c
===
--- gcc/config/epiphany/epiphany.c	(revision 211436)
+++ gcc/config/epiphany/epiphany.c	(working copy)
@@ -2543,7 +2543,8 @@ epiphany_mode_exit (int entity)
 }
 
 void
-emit_set_fp_mode (int entity, int mode, HARD_REG_SET regs_live ATTRIBUTE_UNUSED)
+emit_set_fp_mode (int entity, int mode, int prev_mode ATTRIBUTE_UNUSED,
+		  HARD_REG_SET regs_live ATTRIBUTE_UNUSED)
 {
   rtx save_cc, cc_reg, mask, src, src2;
   enum attr_fp_mode fp_mode;
Index: gcc/config/epiphany/resolve-sw-modes.c
===
--- gcc/config/epiphany/resolve-sw-modes.c	(revision 211436)
+++ gcc/config/epiphany/resolve-sw-modes.c	(working copy)
@@ -170,7 +170,7 @@ pass_resolve_sw_modes::execute (function *fun)
 	}
 	  start_sequence ();
 	  emit_set_fp_mode (EPIPHANY_MSW_ENTITY_ROUND_UNKNOWN,
-			jilted_mode, NULL);
+			jilted_mode, FP_MODE_NONE, NULL);
 	  seq = get_insns ();
 	  end_sequence ();
 	  need_commit = true;
Index: gcc/config/i386/i386.c
===
--- gcc/config/i386/i386.c	(revision 211436)
+++ gcc/config/i386/i386.c	(working copy)
@@ -16447,7 +16447,8 @@ ix86_avx_emit_vzeroupper (HARD_REG_SET re

Re: [PING][PATCH, trunk, 4.9, 4.8] Fix PR57653, filename information discarded when using -imacros

2014-06-12 Thread Manuel López-Ibáñez
On 12 June 2014 01:23, Peter Bergner  wrote:
> On Wed, 2014-06-11 at 23:07 +, Joseph S. Myers wrote:
>> On Wed, 11 Jun 2014, Peter Bergner wrote:
>>
>> > I'd like to ping the following patch that fixes PR57653.  This did
>> > bootstrap and regtest with no regressions on powerpc64-linux.
>> >
>> > https://gcc.gnu.org/ml/gcc-patches/2014-04/msg01571.html
>> >
>> > Is this ok for trunk, 4.9 and 4.8?
>>
>> I think the code change is correct, but the comment added needs expanding
>> to explain better what's going on (i.e. the circumstances in which the
>> condition include_cursor > deferred_count may hold, and why, in those
>> circumstances, returning early is the correct thing to do).
>
> Manuel, can you offer an updated comment?  Being just the patch
> tester and not knowing this code at all, I'm not going to be of
> much use at expanding Manuel's original comment.

It has been a long time, and it seems that even at the time I proposed
the patch, I had no idea why it worked:

"For some reason unknown to me, push_commandline_include should not be
called while processing -imacros. -imacros tries to achieve this by
playing tricks with include_cursor, but that doesn't stop the
pre-included files. Calling cpp_push_include (or
cpp_push_default_include) seems to mess up everything (again, no idea
why!)."

The long explanation is that -imacros triggers:

  /* Handle -imacros after -D and -U.  */
  for (i = 0; i < deferred_count; i++)
    {
      struct deferred_opt *opt = &deferred_opts[i];

      if (opt->code == OPT_imacros
          && cpp_push_include (parse_in, opt->arg))
        {
          /* Disable push_command_line_include callback for now.  */
          include_cursor = deferred_count + 1;
          cpp_scan_nooutput (parse_in);
        }
    }

Then push_command_line_include is roughly as follows:

/* Give CPP the next file given by -include, if any.  */
static void
push_command_line_include (void)
{
  if (!done_preinclude)
    {
      done_preinclude = true;
      if (flag_hosted && std_inc && !cpp_opts->preprocessed)
        {
          const char *preinc = targetcm.c_preinclude ();
          if (preinc && cpp_push_default_include (parse_in, preinc))
            return;
        }
    }

  pch_cpp_save_state ();

  while (include_cursor < deferred_count)
    {
      [...]
    }

  if (include_cursor == deferred_count)
    {
      [...]
    }
}

That is, when -imacros is given, push_command_line_include still calls
cpp_push_default_include, which messes up everything. Why or how,
I have no idea. Someone else will need to investigate more.

I don't think the patch is the best possible approach. I think it
would be better to push the default includes as soon as it is
reasonably possible (before/after processing -imacros?), instead of
relying on push_command_line_include. Also, if
push_command_line_include needs to be disabled for -imacros, it would
be clearer to have a file-local boolean for this purpose.
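
The "file-local boolean" idea can be sketched in isolation as follows. This is a toy model under stated assumptions, not the c-opts.c code: the struct, the `in_imacros` flag and the `toy_` function are hypothetical names made up for illustration, and the pre-include and PCH handling of the real function is left out entirely.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: these names mirror the discussion, not c-opts.c.  */
struct include_state
{
  unsigned deferred_count;   /* number of deferred options */
  unsigned include_cursor;   /* next deferred option to consider */
  bool in_imacros;           /* hypothetical flag suggested above */
  unsigned pushed;           /* how many files were pushed so far */
};

/* Push the next pending file unless -imacros processing is active.
   Returns true if a file was pushed.  */
static bool
toy_push_command_line_include (struct include_state *st)
{
  assert (st != 0);
  if (st->in_imacros)
    return false;            /* explicit guard instead of a cursor trick */
  if (st->include_cursor < st->deferred_count)
    {
      st->include_cursor++;
      st->pushed++;
      return true;
    }
  return false;
}
```

The point of the sketch is only that the "disabled" state is named, instead of being encoded as include_cursor = deferred_count + 1, so the cursor keeps a single meaning.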

The way push_command_line_include is called is a mystery to me:

  if (new_map == 0 || (new_map->reason == LC_LEAVE && MAIN_FILE_P (new_map)))
    {
      pch_cpp_save_state ();
      push_command_line_include ();
    }

Why is it called when leaving the main file at cb_file_change?

Also, c_finish_options has at the end:

  include_cursor = 0;
  push_command_line_include ();

but shouldn't this happen instead when (or before) the main file is entered?

Cheers,

Manuel.


Re: RFA: speeding up dg-extract-results.sh

2014-06-12 Thread Bernd Schmidt

On 05/25/2014 11:35 AM, Richard Sandiford wrote:

Bernd Schmidt  writes:

On 02/13/2014 10:18 AM, Richard Sandiford wrote:

contrib/
* dg-extract-results.py: New file.
* dg-extract-results.sh: Use it if the environment seems suitable.


I'm now seeing the following:

Traceback (most recent call last):
File "../../git/gcc/../contrib/dg-extract-results.py", line 581, in

  Prog().main()
File "../../git/gcc/../contrib/dg-extract-results.py", line 569, in main
  self.output_tool (self.runs[name])
File "../../git/gcc/../contrib/dg-extract-results.py", line 534, in
output_tool
  self.output_variation (tool, variation)
File "../../git/gcc/../contrib/dg-extract-results.py", line 483, in
output_variation
  for harness in sorted (variation.harnesses.values()):
TypeError: unorderable types: HarnessRun() < HarnessRun()

$ /usr/bin/python --version
Python 3.3.3


Sorry, thought I'd tested it with python3, but obviously not.
I've applied the fix below after testing that it didn't change the
output for python 2.6 and python 2.7.


I've recently been trying to add Ada to my set of tested languages, and
I now encounter the following:


Traceback (most recent call last):
  File "../../git/gcc/../contrib/dg-extract-results.py", line 580, in 


Prog().main()
  File "../../git/gcc/../contrib/dg-extract-results.py", line 544, in main
self.parse_file (filename, file)
  File "../../git/gcc/../contrib/dg-extract-results.py", line 427, in 
parse_file

self.parse_acats_run (filename, file)
  File "../../git/gcc/../contrib/dg-extract-results.py", line 342, in 
parse_acats_run

self.parse_run (filename, file, tool, variation, 1)
  File "../../git/gcc/../contrib/dg-extract-results.py", line 242, in 
parse_run

line = file.readline()
  File "/usr/lib64/python3.3/codecs.py", line 301, in decode
(result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe1 in position 
5227: invalid continuation byte



Bernd



Re: [RFC] Teaching SCC merging about unit local trees

2014-06-12 Thread Jan Hubicka
> 
> That is, have a tree_may_be_mergeable_p (), call it during the DFS
> walk storing it alongside the visited edges and thus obtain a result
> for each SCC, stream that as a flag (a special hash value is ugly,
> but well ... I guess it works).  The important part is to make an SCC
> !tree_may_be_mergeable_p () if any of the outgoing edges from an SCC
> are !tree_may_be_mergeable_p ().  You seem to miss this.

This is what I am trying to do with the hashing.  scc_hash is now 1 for any SCC
that refers to an SCC with hash 1, so non-mergeability propagates.
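
The propagation described above can be sketched in a few lines of C.  This is
only an illustration under stated assumptions, not the code from the patch:
the struct, the function name scc_hash_with_refs and the mixing step are all
hypothetical, and the sketch assumes that every referenced SCC has already
been hashed before the SCC that refers to it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Reserved hash value meaning "local, do not merge" (value 1, as in the
   thread; the macro name is made up for this sketch).  */
#define LOCAL_SCC_HASH 1u

/* Hypothetical stand-in for an SCC that has already been hashed.  */
struct scc
{
  unsigned hash;
};

/* Combine CONTENT_HASH with the hashes of the SCCs reached through
   outgoing edges.  If the SCC is local itself, or any referenced SCC is
   local, return the reserved value so that non-mergeability propagates.  */
static unsigned
scc_hash_with_refs (unsigned content_hash, bool is_local,
                    const struct scc *const *refs, size_t nrefs)
{
  if (is_local)
    return LOCAL_SCC_HASH;

  unsigned h = content_hash;
  for (size_t i = 0; i < nrefs; i++)
    {
      assert (refs[i] != NULL);
      if (refs[i]->hash == LOCAL_SCC_HASH)
        return LOCAL_SCC_HASH;
      h = h * 31u + refs[i]->hash;   /* illustrative mixing only */
    }

  /* Keep ordinary hashes away from the reserved value.  */
  return h == LOCAL_SCC_HASH ? h + 1u : h;
}
```

A reader of the stream can then skip the merging logic for any SCC whose hash
equals the reserved value, without comparing tree bodies.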

Honza


[C++ Patch] PR 33101

2014-06-12 Thread Paolo Carlini

Hi,

in this old bug Ian complained that the diagnostic we provide for:

typedef void v;
typedef v (*pf)(v);

is rather unfriendly, especially for people coming from C:

33101.C:2:17: error: ‘’ has incomplete type
33101.C:2:18: error: invalid use of ‘v’

thus Gaby (and Ian) suggested something along the lines of what I
propose below.  Today I also noticed that some front ends deal
specially with cv-qualified void, so I added that case too; for the
remaining cases I think a generic 'void' message is good enough.


Thanks,
Paolo.

//
/cp
2014-06-12  Paolo Carlini  

PR c++/33101
* decl.c (grokparms): Improve error message about void parameters.

/testsuite
2014-06-12  Paolo Carlini  

PR c++/33101
* g++.dg/other/void3.C: New.
* g++.dg/conversion/err-recover1.C: Update.
Index: cp/decl.c
===
--- cp/decl.c   (revision 211574)
+++ cp/decl.c   (working copy)
@@ -11161,10 +11161,25 @@ grokparms (tree parmlist, tree *parms)
{
  if (same_type_p (type, void_type_node)
  && DECL_SELF_REFERENCE_P (type)
- && !DECL_NAME (decl) && !result && TREE_CHAIN (parm) == void_list_node)
+ && !DECL_NAME (decl) && !result
+ && TREE_CHAIN (parm) == void_list_node)
/* this is a parmlist of `(void)', which is ok.  */
break;
- cxx_incomplete_type_error (decl, type);
+ else if (!cv_qualified_p (type)
+  && !DECL_SELF_REFERENCE_P (type)
+  && !DECL_NAME (decl) && !result
+  && TREE_CHAIN (parm) == void_list_node)
+   error_at (DECL_SOURCE_LOCATION (decl),
+ "invalid use of typedef-name for type "
+ "% in parameter declaration");
+ else if (cv_qualified_p (type))
+   error_at (DECL_SOURCE_LOCATION (decl),
+ "invalid use of cv-qualified type % "
+ "in parameter declaration");
+ else
+   error_at (DECL_SOURCE_LOCATION (decl),
+ "invalid use of type % in parameter "
+ "declaration");
  /* It's not a good idea to actually create parameters of
 type `void'; other parts of the compiler assume that a
 void type terminates the parameter list.  */
Index: testsuite/g++.dg/conversion/err-recover1.C
===
--- testsuite/g++.dg/conversion/err-recover1.C  (revision 211574)
+++ testsuite/g++.dg/conversion/err-recover1.C  (working copy)
@@ -1,6 +1,6 @@
 // PR c++/42219
 
-void foo(const void);  // { dg-error "incomplete|const" }
+void foo(const void);  // { dg-error "invalid use of cv-qualified" }
 
 void bar()
 {
Index: testsuite/g++.dg/other/void3.C
===
--- testsuite/g++.dg/other/void3.C  (revision 0)
+++ testsuite/g++.dg/other/void3.C  (working copy)
@@ -0,0 +1,4 @@
+// PR c++/33101
+
+typedef void v;
+typedef v (*pf)(v);  // { dg-error "invalid use of typedef-name" }


Re: [patch i386]: Combine memory and indirect jump

2014-06-12 Thread Kai Tietz
Hello,

I updated the i386.md part of the patch.  The initial patch included handling
of blockage, which is obviously superfluous.  Additionally, I merged the
32-bit and 64-bit peephole2 versions by using mode-specifier W.

ChangeLog

2014-06-12  Kai Tietz  

* config/i386/i386.md (peephole2): To combine
indirect jump with memory.

2014-06-12  Kai Tietz  

* gcc.target/i386/indjmp-1.c: New test.

Tested on i686-pc-cygwin and x86_64-unknown-linux-gnu.  OK to apply?

By adding a second peephole2 pass after the sched2 pass, I was able to
get some improvement for PR target/39284.  I think with this addition
we can close the bug as fixed.
The additional peephole2 pass also shows better results for the PR
target/51840 testcase with ASM_GOTO disabled.

2014-06-12  Kai Tietz  

PR target/39284
* passes.def (pass_peephole2): Add second peephole2
run after sched2 pass.

Tested on i686-pc-cygwin and x86_64-unknown-linux-gnu.  OK to apply?

Regards,
Kai

Index: testsuite/gcc.target/i386/indjmp-1.c
===
--- testsuite/gcc.target/i386/indjmp-1.c(Revision 0)
+++ testsuite/gcc.target/i386/indjmp-1.c(Arbeitskopie)
@@ -0,0 +1,23 @@
+/* { dg-do compile  { target ia32 } } */
+/* { dg-options "-O2" } */
+
+#define ADVANCE_AND_DISPATCH() goto *addresses[*pc++]
+
+void
+Interpret(const unsigned char *pc)
+{
+static const void *const addresses[] = {
+  &&l0, &&l1, &&l2
+};
+
+l0:
+ADVANCE_AND_DISPATCH();
+
+l1:
+ADVANCE_AND_DISPATCH();
+
+l2:
+return;
+}
+
+/* { dg-final { scan-assembler-not "jmp\[ \t\]*.%eax" } } */
Index: config/i386/i386.md
===
--- config/i386/i386.md(Revision 211489)
+++ config/i386/i386.md(Arbeitskopie)
@@ -11471,6 +11471,15 @@
 (match_dup 3))
   (set (reg:SI SP_REG) (match_dup 4))])])

+;; Combining simple memory jump instruction
+
+(define_peephole2
+  [(set (match_operand:W 0 "register_operand")
+(match_operand:W 1 "memory_nox32_operand"))
+   (set (pc) (match_dup 0))]
+  "peep2_reg_dead_p (2, operands[0])"
+  [(set (pc) (match_dup 1))])
+
 ;; Call subroutine, returning value in operand 0

 (define_expand "call_value"
Index: passes.def
===
--- passes.def(Revision 211489)
+++ passes.def(Arbeitskopie)
@@ -396,6 +396,7 @@ along with GCC; see the file COPYING3.  If not see
   NEXT_PASS (pass_leaf_regs);
   NEXT_PASS (pass_split_before_sched2);
   NEXT_PASS (pass_sched2);
+  NEXT_PASS (pass_peephole2);
   NEXT_PASS (pass_stack_regs);
   PUSH_INSERT_PASSES_WITHIN (pass_stack_regs)
   NEXT_PASS (pass_split_before_regstack);


Re: [RFC] Teaching SCC merging about unit local trees

2014-06-12 Thread Jan Hubicka
> On Thu, Jun 12, 2014 at 10:47 AM, Jan Hubicka  wrote:
> > Richard,
> > as briefly discussed before, I would like to teach LTO type merging to not
> > merge types that were declared in anonymous namespaces and to use C++ ODR
> > type names (stored in DECL_ASSEMBLER_NAME of the TYPE_DECL) to break down
> > canonical types by their names.
> >
> > The first thing I need to arrange IMO is to not merge two anonymous types from
> > two different units.  While looking into it I noticed that the current code
> > in unify_scc that refuses to merge local decls produces conflicts and seems
> > a useless exercise.
> >
> > This patch introduces a special hash code 1 that specifies that a given SCC is
> > known to be local and should bypass the merging logic.  This is propagated
> > down and seems to quite noticeably reduce the size of the SCC hash:
> >
> > [WPA] read 10190717 SCCs of average size 1.980409
> > [WPA] 20181785 tree bodies read in total
> > [WPA] tree SCC table: size 4194301, 1882700 elements, collision ratio: 
> > 0.815497
> > [WPA] tree SCC max chain length 140 (size 1)
> > [WPA] Compared 3392363 SCCs, 2718822 collisions (0.801454)
> > [WPA] Merged 3314075 SCCs
> > [WPA] Merged 9693632 tree bodies
> > [WPA] Merged 2467704 types
> > [WPA] 1783262 types prevailed (4491218 associated trees)
> > [WPA] GIMPLE canonical type table: size 131071, 94867 elements, 1783347 
> > searches, 737056 collisions (ratio: 0.413299)
> > [WPA] GIMPLE canonical type pointer-map: 94867 elements, 3973875 searches
> > [WPA] Compression: 282828785 input bytes, 831186147 uncompressed bytes 
> > (ratio: 2.938832)
> > [WPA] Size of mmap'd section decls: 282828785 bytes
> >
> > to:
> >
> > [WPA] read 10172291 SCCs of average size 1.982162
> > [WPA] 20163124 tree bodies read in total
> > [WPA] tree SCC table: size 2097143, 988764 elements, collision ratio: 
> > 0.684967
> > [WPA] tree SCC max chain length 140 (size 1)
> > [WPA] Compared 3060932 SCCs, 2405009 collisions (0.785711)
> > [WPA] Merged 3040565 SCCs
> > [WPA] Merged 9246482 tree bodies
> > [WPA] Merged 2382312 types
> > [WPA] 1868611 types prevailed (4728465 associated trees)
> > [WPA] GIMPLE canonical type table: size 131071, 94910 elements, 1868696 
> > searches, 790939 collisions (ratio: 0.423257)
> > [WPA] GIMPLE canonical type pointer-map: 94910 elements, 4216423 searches
> > [WPA] Compression: 273322455 input bytes, 824178095 uncompressed bytes 
> > (ratio: 3.015406)
> >
> > We merge less, but not by much, and I think we were not right to merge in
> > those cases.
> 
> If we merge things we may not merge then the fix is to compare_tree_sccs_1,
> not introducing special cases like you propose.

What I was looking for was to decide at streaming time what can be merged,
instead of doing it at merging time, which is more expensive and causes
SCC hash conflicts (because the hashes are the same).
> 
> That is, if we are not allowed to merge anonymous namespaces then
> make sure we don't.  We already should not merge types with
> TYPE_CONTEXT == such namespace by means of
> 
>   /* ???  Global types from different TUs have non-matching
>  TRANSLATION_UNIT_DECLs.  Still merge them if they are otherwise
>  equal.  */
>   if (TYPE_FILE_SCOPE_P (t1) && TYPE_FILE_SCOPE_P (t2))
> ;
>   else
> compare_tree_edges (TYPE_CONTEXT (t1), TYPE_CONTEXT (t2));
> 
> but we possibly merge a subset of decl kinds from "different" namespaces :
> 
>   /* ???  Global decls from different TUs have non-matching
>  TRANSLATION_UNIT_DECLs.  Only consider a small set of
>  decls equivalent, we should not end up merging others.  */
>   if ((code == TYPE_DECL
>|| code == NAMESPACE_DECL
>|| code == IMPORTED_DECL
>|| code == CONST_DECL
>|| (VAR_OR_FUNCTION_DECL_P (t1)
>&& (TREE_PUBLIC (t1) || DECL_EXTERNAL (t1
>   && DECL_FILE_SCOPE_P (t1) && DECL_FILE_SCOPE_P (t2))
> ;
>   else
> compare_tree_edges (DECL_CONTEXT (t1), DECL_CONTEXT (t2));
> 
> Not sure what we end up doing for NAMESPACE_DECL itself (and what
> fields we stream for it).  It would be interesting to check that.
> 
> Thus, make sure we don't merge namespace {} and namespace {} from
> two different units.
> 
> But effectively you say we have two classes of "global" trees, first
> those that are mergeable across TUs and second those that are not.

Yes, we have global trees that are not mergeable (because they are local
to the TU by nature).
We also have trees in function sections that IMO get hash information just to
have it ignored at ltrans streaming time (we also stream at WPA, but I do not see
how merging in these helps either).

> This IMHO means we want to separate those to two different LTO
> sections and simply skip all the merging code for the second (instead
> of adding hacks to the merging code).

We could move them to different section, though indicating in hash whether
the scc is mergea

Re: RFA: speeding up dg-extract-results.sh

2014-06-12 Thread Mike Stump
On Jun 12, 2014, at 8:53 AM, Bernd Schmidt  wrote:
> I've recently been trying to add ada to my set of tested languages, and I now 
> encounter the following:
> 
>  File "../../git/gcc/../contrib/dg-extract-results.py", line 242, in parse_run
>line = file.readline()
>  File "/usr/lib64/python3.3/codecs.py", line 301, in decode
>(result, consumed) = self._buffer_decode(data, self.errors, final)
> UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe1 in position 5227: 
> invalid continuation byte

In the old skool world, these are byte sequences that end in ‘\n’…  no decoding
errors are possible…  well, maybe one, if you tried to put ‘\0’ in the stream.
:-(  Maybe a LANG/LC type person can suggest an environment variable to set
that would make things happier, else we’re down to a python person to solve it
from that side.  My knee-jerk reaction would be LANG=C for the entire test suite run...
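
To make the failure mode concrete, here is a small C sketch of why a byte
such as 0xe1 trips a UTF-8 decoder while a plain byte-oriented reader never
notices it.  The function name first_bad_utf8 is hypothetical, and the check
is deliberately simplified (overlong forms and surrogates are not rejected).
That the offending byte in the log comes from a single-byte encoding such as
Latin-1 is only an assumption; the thread does not say what the log contains.

```c
#include <assert.h>
#include <stddef.h>

/* Return the offset of the first byte in BUF that does not start a
   structurally well-formed UTF-8 sequence, or -1 if there is none.  */
static long
first_bad_utf8 (const unsigned char *buf, size_t len)
{
  assert (buf != NULL || len == 0);

  size_t i = 0;
  while (i < len)
    {
      unsigned char c = buf[i];
      size_t need;   /* number of continuation bytes expected */

      if (c < 0x80)
        need = 0;
      else if ((c & 0xE0) == 0xC0)
        need = 1;
      else if ((c & 0xF0) == 0xE0)
        need = 2;   /* 0xE1 is a lead byte of this kind */
      else if ((c & 0xF8) == 0xF0)
        need = 3;
      else
        return (long) i;   /* stray continuation or invalid lead byte */

      /* Every continuation byte must look like 10xxxxxx.  */
      for (size_t k = 1; k <= need; k++)
        if (i + k >= len || (buf[i + k] & 0xC0) != 0x80)
          return (long) i;

      i += need + 1;
    }
  return -1;
}
```

A lone 0xe1 followed by an ASCII letter is reported as bad, which matches the
"invalid continuation byte" wording of the Python error; the same character
encoded as 0xc3 0xa1 passes.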

Re: [Patch ARM/testsuite 00/22] Neon intrinsics executable tests

2014-06-12 Thread Mike Stump
On Jun 12, 2014, at 7:26 AM, Christophe Lyon  wrote:
> On 12 June 2014 04:31, Mike Stump  wrote:
>> On Jun 10, 2014, at 3:03 PM, Ramana Radhakrishnan 
>>  wrote:
>>> At this point I'm going to wait to see if any of the testsuite
>>> maintainers step in and comment
>> 
>> [ ducks ] So, I wasn’t going to comment…  If you guys do something really 
>> stupid, I’ll scream, as hopefully will others.  Doing something a little 
>> misguided I don’t think hurts much.  The worst case if you figure out in a 
>> year or two why it was a bad idea and then fix it, not the end of the world.
> 
> If the execution part is OK and the scan-assembler is questionable, I
> can just remove that part (or leave it commented until we decide
> otherwise).

Don’t read my comment as calling scanning questionable.  In fact,
scanning is slightly better, as one can see the results on a cross compiler
more easily and faster…  for example, when someone wants to study a regression
they caused and they don’t have the target, they can build cc1 and then run the
test case by hand and see what the scan issues are.  If it were an executable
test case, they would have to puzzle out why the test case behaves differently
and understand what they are reading (they might not be familiar with the target).

[PATCH, AArch64, PR 61483] builtin va_start incorrectly initializes the field of va_list for incoming unnamed arguments on the stack

2014-06-12 Thread Yufeng Zhang

Hi,

The patch fixes a bug in the AArch64 backend in calculating the 
beginning address of the unnamed incoming arguments on the stack, i.e. 
the initial value of __va_list->__stack.  aarch64_layout_arg incorrectly 
calculates the size of named arguments on stack using the number of 
registers needed as if there were enough registers available.  This is 
wrong, as for instance when passed in registers an HFA/HVA* argument 
takes as many SIMD registers as the number of its fields; when passed on 
the stack, however, it should be passed according to its storage layout
(rounded up to the nearest multiple of 8 bytes).
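
As a worked example of the arithmetic, consider an HFA of three floats (the
hfa_fx3_t type added to the testsuite below).  The sketch is a standalone
illustration, not the backend code: round_up and stack_words are hypothetical
helpers written in the spirit of the AARCH64_ROUND_UP use in the patch, and
the 8-byte word size is stated as a local constant.

```c
#include <assert.h>

#define UNITS_PER_WORD 8   /* AArch64 word size in bytes */

/* Round VALUE up to a multiple of ALIGN, which must be a power of two.  */
static long
round_up (long value, long align)
{
  assert (align > 0 && (align & (align - 1)) == 0);
  return (value + align - 1) & -align;
}

/* Number of stack words an argument of SIZE_IN_BYTES occupies when it is
   laid out according to its storage size.  */
static long
stack_words (long size_in_bytes)
{
  return round_up (size_in_bytes, UNITS_PER_WORD) / UNITS_PER_WORD;
}

/* Same shape as the type added to type-def.h in the patch.  */
struct hfa_fx3_t
{
  float a;
  float b;
  float c;
};
```

With 4-byte floats the struct is 12 bytes, so it needs two stack words.
Passed in registers it would use three SIMD registers, one per field, and
reusing that register count as the stack word count is the overestimate the
patch removes.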


The bug only affects builtin va_start, as it is other routines like 
aarch64_pad_arg_upward rather than aarch64_layout_arg which take care of 
the positioning of outgoing arguments on stack and the fetching of the 
incoming named arguments from stack.


The patch has passed bootstrapping.

OK for the trunk and 4.9.1 branch once the regtest passes as well?

Thanks,
Yufeng

* HFA: Homogeneous Floating-point Aggregate
  HVA: Homogeneous Short-Vector Aggregate


gcc/

PR target/61483
* config/aarch64/aarch64.c (aarch64_layout_arg): Add new local
variable 'size'; calculate 'size' right in the front; use
'size' to compute 'nregs' (when 'allocate_ncrn != 0') and
pcum->aapcs_stack_words.

gcc/testsuite/

PR target/61483
* gcc.target/aarch64/aapcs64/type-def.h (struct hfa_fx3_t): New type.
* gcc.target/aarch64/aapcs64/va_arg-13.c: New test.
* gcc.target/aarch64/aapcs64/va_arg-14.c: Ditto.
* gcc.target/aarch64/aapcs64/va_arg-15.c: Ditto.

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index fabd6a9..56a5a5d 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -1459,6 +1459,7 @@ aarch64_layout_arg (cumulative_args_t pcum_v, enum machine_mode mode,
   CUMULATIVE_ARGS *pcum = get_cumulative_args (pcum_v);
   int ncrn, nvrn, nregs;
   bool allocate_ncrn, allocate_nvrn;
+  HOST_WIDE_INT size;
 
   /* We need to do this once per argument.  */
   if (pcum->aapcs_arg_processed)
@@ -1466,6 +1467,11 @@ aarch64_layout_arg (cumulative_args_t pcum_v, enum machine_mode mode,
 
   pcum->aapcs_arg_processed = true;
 
+  /* Size in bytes, rounded to the nearest multiple of 8 bytes.  */
+  size
+= AARCH64_ROUND_UP (type ? int_size_in_bytes (type) : GET_MODE_SIZE (mode),
+   UNITS_PER_WORD);
+
   allocate_ncrn = (type) ? !(FLOAT_TYPE_P (type)) : !FLOAT_MODE_P (mode);
   allocate_nvrn = aarch64_vfp_is_call_candidate (pcum_v,
 mode,
@@ -1516,9 +1522,7 @@ aarch64_layout_arg (cumulative_args_t pcum_v, enum machine_mode mode,
 }
 
   ncrn = pcum->aapcs_ncrn;
-  nregs = ((type ? int_size_in_bytes (type) : GET_MODE_SIZE (mode))
-  + UNITS_PER_WORD - 1) / UNITS_PER_WORD;
-
+  nregs = size / UNITS_PER_WORD;
 
   /* C6 - C9.  though the sign and zero extension semantics are
  handled elsewhere.  This is the case where the argument fits
@@ -1567,13 +1571,12 @@ aarch64_layout_arg (cumulative_args_t pcum_v, enum machine_mode mode,
   pcum->aapcs_nextncrn = NUM_ARG_REGS;
 
   /* The argument is passed on stack; record the needed number of words for
- this argument (we can re-use NREGS) and align the total size if
- necessary.  */
+ this argument and align the total size if necessary.  */
 on_stack:
-  pcum->aapcs_stack_words = nregs;
+  pcum->aapcs_stack_words = size / UNITS_PER_WORD;
   if (aarch64_function_arg_alignment (mode, type) == 16 * BITS_PER_UNIT)
 pcum->aapcs_stack_size = AARCH64_ROUND_UP (pcum->aapcs_stack_size,
-  16 / UNITS_PER_WORD) + 1;
+  16 / UNITS_PER_WORD);
   return;
 }
 
diff --git a/gcc/testsuite/gcc.target/aarch64/aapcs64/type-def.h b/gcc/testsuite/gcc.target/aarch64/aapcs64/type-def.h
index a95d06a..07e56ff 100644
--- a/gcc/testsuite/gcc.target/aarch64/aapcs64/type-def.h
+++ b/gcc/testsuite/gcc.target/aarch64/aapcs64/type-def.h
@@ -34,6 +34,13 @@ struct hfa_fx2_t
   float b;
 };
 
+struct hfa_fx3_t
+{
+  float a;
+  float b;
+  float c;
+};
+
 struct hfa_dx2_t
 {
   double a;
diff --git a/gcc/testsuite/gcc.target/aarch64/aapcs64/va_arg-13.c b/gcc/testsuite/gcc.target/aarch64/aapcs64/va_arg-13.c
new file mode 100644
index 000..27c4099
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/aapcs64/va_arg-13.c
@@ -0,0 +1,53 @@
+/* Test AAPCS64 layout and __builtin_va_start.
+
+   Pass named HFA/HVA argument on stack.  */
+
+/* { dg-do run { target aarch64*-*-* } } */
+
+#ifndef IN_FRAMEWORK
+#define AAPCS64_TEST_STDARG
+#define TESTFILE "va_arg-13.c"
+
+struct float_float_t
+{
+  float a;
+  float b;
+} float_float;
+
+union float_int_t
+{
+  float b8;
+  int b5;
+} float_int;
+
+#define HAS_DATA_INIT_FUNC
+void
+init_data ()
+{
+  float_float.a = 1.2f;
+  float_float.b = 2.2f;
+
+  float_int.

[PR tree-optimization/61009] Follow-up to fix incorrect return value

2014-06-12 Thread Jeff Law


It was reported that mysql was failing its testsuite due to a regex 
routine being mis-compiled on the ppc and s390 platforms.  Upon 
investigation it was found that the fix for PR61009 was incomplete.


The fix for 61009 changed thread_through_normal_block to return a 
tri-state with negative values indicating the block was not threadable, 
even for a joiner.  That situation occurs when we do not process all the 
statements in the block (for example, the block is too big for 
threading).  When we fail to process all the statements, then we will 
fail to properly invalidate entries in the equivalence tables which can 
result in incorrect transformations when threading across a loop backedge.


61009 detected the "block too big case", but missed the case when 
problematical PHIs are detected.  This patch fixes that oversight.
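
The tri-state contract can be modelled with a short C sketch.  This is a toy
model, not the code in tree-ssa-threadedge.c: walk_block, decide and the enum
are hypothetical names.  Only the meaning of a negative value is stated
above; treating a positive value as "threaded" and zero as "a joiner may
still be tried" is an assumption of this sketch.

```c
#include <assert.h>
#include <stdbool.h>

enum thread_action
{
  REGISTER_THREAD,   /* block was threaded */
  TRY_JOINER,        /* not threaded, but a joiner may still be tried */
  GIVE_UP            /* not threadable, not even as a joiner */
};

/* Hypothetical model of walking one block.  Returns a tri-state: positive
   when a jump target was found, zero when the walk completed without
   finding one, negative when the walk had to stop early.  */
static int
walk_block (bool phis_ok, bool all_stmts_processed, bool found_target)
{
  /* A target can only be found if the whole block was walked.  */
  assert (!found_target || (phis_ok && all_stmts_processed));

  if (!phis_ok)
    return -1;   /* the case this follow-up fixes; it used to return 0 */
  if (!all_stmts_processed)
    return -1;   /* the "block too big" case from the original fix */
  return found_target ? 1 : 0;
}

/* What a caller would do with the tri-state.  */
static enum thread_action
decide (int result)
{
  if (result > 0)
    return REGISTER_THREAD;
  if (result < 0)
    return GIVE_UP;
  return TRY_JOINER;
}
```

With the old return value of 0 for problematic PHIs, the caller in this model
would have fallen through to the joiner path even though the equivalence
tables had not been fully invalidated.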


Bootstrapped and regression tested on x86_64-unknown-linux-gnu.  It 
fixes the short-circuited loop in mysql for s390 (by inspection) and the 
mysql testsuite passes on ppc using 4.9 with this addition to the 
original 61009 patch backported.


Installed on the trunk.  Will install onto 4.9 branch shortly.



commit 5de2d4cf14b882066026745fb1b1019561daac12
Author: Jeff Law 
Date:   Thu Jun 12 10:13:16 2014 -0600

PR tree-optimization/61009
* tree-ssa-threadedge.c (thread_through_normal_block): Correct return
value when we stop processing a block due to problematic PHIs.

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index a42b94d..d68262f 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,9 @@
+2014-06-12  Jeff Law  
+
+PR tree-optimization/61009
+   * tree-ssa-threadedge.c (thread_through_normal_block): Correct return
+   value when we stop processing a block due to problematic PHIs.
+
 2014-06-12  Alan Lawrence  
 
* config/aarch64/arm_neon.h (vmlaq_n_f64, vmlsq_n_f64, vrsrtsq_f64,
diff --git a/gcc/tree-ssa-threadedge.c b/gcc/tree-ssa-threadedge.c
index ba9e1fe..a76a7ce 100644
--- a/gcc/tree-ssa-threadedge.c
+++ b/gcc/tree-ssa-threadedge.c
@@ -948,9 +948,12 @@ thread_through_normal_block (edge e,
   if (*backedge_seen_p)
 simplify = dummy_simplify;
 
-  /* PHIs create temporary equivalences.  */
+  /* PHIs create temporary equivalences.
+ Note that if we found a PHI that made the block non-threadable, then
+ we need to bubble that up to our caller in the same manner we do
+ when we prematurely stop processing statements below.  */
   if (!record_temporary_equivalences_from_phis (e, stack))
-return 0;
+return -1;
 
   /* Now walk each statement recording any context sensitive
  temporary equivalences we can detect.  */


config/vxworks-dummy.h on arm

2014-06-12 Thread Jakub Jelinek
Hi!

Seems http://gcc.gnu.org/r197156 effectively reverted
the PR45078 fix for arm*-linux* (where unfortunately tm_file
is always overridden).

Was the removal of vxworks-dummy.h from that line intentional
or just some mistake?

Seems one can't build gcc plugins on arm because of this,
because arm.h includes vxworks-dummy.h.

Jakub


[Google] Fix AFDO early inline ICEs due to DFE

2014-06-12 Thread Teresa Johnson
These two patches fix multiple ICEs that occurred due to DFE being
recently enabled after AutoFDO LIPO linking.

Passes regression and internal testing. Ok for Google/4_8?

Teresa

2014-06-12  Teresa Johnson  
Dehao Chen  

Google ref b/15521327.

* cgraphclones.c (cgraph_clone_edge): Use resolved node.
* l-ipo.c (resolve_cgraph_node): Resolve to non-removable node.

Index: cgraphclones.c
===
--- cgraphclones.c  (revision 211386)
+++ cgraphclones.c  (working copy)
@@ -94,6 +94,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "ipa-utils.h"
 #include "lto-streamer.h"
 #include "except.h"
+#include "l-ipo.h"

 /* Create clone of E in the node N represented by CALL_EXPR the callgraph.  */
 struct cgraph_edge *
@@ -118,7 +119,11 @@ cgraph_clone_edge (struct cgraph_edge *e, struct c

   if (call_stmt && (decl = gimple_call_fndecl (call_stmt)))
{
- struct cgraph_node *callee = cgraph_get_node (decl);
+  struct cgraph_node *callee;
+  if (L_IPO_COMP_MODE && cgraph_pre_profiling_inlining_done)
+callee = cgraph_lipo_get_resolved_node (decl);
+  else
+callee = cgraph_get_node (decl);
  gcc_checking_assert (callee);
  new_edge = cgraph_create_edge (n, callee, call_stmt, count, freq);
}
Index: l-ipo.c
===
--- l-ipo.c (revision 211386)
+++ l-ipo.c (working copy)
@@ -1542,6 +1542,18 @@ resolve_cgraph_node (struct cgraph_sym **slot, str
   gcc_assert (decl1_defined);
   add_define_module (*slot, decl2);

+  /* Pick the node that cannot be removed, to avoid a situation
+ where we remove the resolved node and later try to access
+ it for the remaining non-removable copy.  E.g. one may be
+ extern and the other weak, only the extern copy can be removed.  */
+  if (cgraph_can_remove_if_no_direct_calls_and_refs_p ((*slot)->rep_node)
+  && !cgraph_can_remove_if_no_direct_calls_and_refs_p (node))
+{
+  (*slot)->rep_node = node;
+  (*slot)->rep_decl = decl2;
+  return;
+}
+
   has_prof1 = has_profile_info (decl1);
   bool is_aux1 = cgraph_is_auxiliary (decl1);
   bool is_aux2 = cgraph_is_auxiliary (decl2);


-- 
Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413


Re: Fix a function decl in gfortran

2014-06-12 Thread Tobias Burnus

A bit belated, I have now committed the patch as Rev. 211587.

Thanks for confirming that it now works!

Tobias

Bernd Schmidt wrote:

On 06/04/2014 10:36 PM, Tobias Burnus wrote:

Bernd Schmidt wrote:

Even with this applied, I'm still seeing similar failures.


I didn't claim that the patch would fix everything – nor that it was
well tested.


Just wanted to report back since the problem doesn't really show up on 
normal targets.



Can you try the attached version? The change is that I now properly use
"se->ignore_optional" to test whether absent optional arguments should
be skipped, rather than using this morning's ad-hoc solution of doing so
unconditionally. Additionally, the patch has now survived stage2
building, which is more testing than I could do this morning.


This seems to work. Thanks!


Bernd






Re: [patch i386]: Combine memory and indirect jump

2014-06-12 Thread Segher Boessenkool
On Thu, Jun 12, 2014 at 06:21:32PM +0200, Kai Tietz wrote:
> with addition of adding a second peephole2 pass after sched2 pass, I
> was able to get some improvement for PR target/39284.  I think by this
> addition we can close bug as fixed.
> Additionally additional peephole2 pass shows better results for PR
> target/51840 testcase with disabled ASM_GOTO, too.

Will that work on other targets?  Also, it needs a doc fix (md.texi
says peephole2 runs before scheduling).


Segher


Re: [PATCH, PR61446] Fix mode for register copy in REE pass

2014-06-12 Thread Jeff Law

On 06/10/14 01:42, Ilya Enkovich wrote:

Hi,

This patch fixes PR61446.  The problem appears when we insert value copies
after transformations.  We use the widest extension mode met in a chain, but it
may be wider than the original destination register size.  This patch checks
for that and uses a smaller mode if required.

Bootstrapped and tested on linux-x86_64.

Does it look OK?

Thanks,
Ilya
--
2014-06-09  Ilya Enkovich  

PR 61446
* ree.c (find_and_remove_re): Narrow mode for register copy
if required.

That seems wrong.   Something should have rejected this earlier.  Let me
take a looksie.



jeff



Re: RFC: C++ PATCH to remove -fabi-version=1 support

2014-06-12 Thread Jason Merrill

On 06/09/2014 04:46 PM, Jason Merrill wrote:

I'm updating -Wabi to allow for warnings about changes between a
previous ABI version and the currently selected one, and rather than
adjust all the warnings for -fabi-version=1 I'd like to tear it out.


Here's a revised patch that I'm checking in.


commit 1df8225f7661bcd683bb7dd5924cc09668473bad
Author: Jason Merrill 
Date:   Fri Jun 6 14:01:58 2014 -0400

gcc/
	* toplev.c (process_options): Reject -fabi-version=1.
gcc/cp/
	* call.c (build_operator_new_call): Remove -fabi-version=1 support.
	* class.c (walk_subobject_offsets, include_empty_classes): Likewise.
	(layout_nonempty_base_or_field, end_of_class): Likewise.
	(layout_empty_base, build_base_field, layout_class_type): Likewise.
	(is_empty_class, add_vcall_offset_vtbl_entries_1): Likewise.
	(layout_virtual_bases): Likewise.
	* decl.c (compute_array_index_type): Likewise.
	* mangle.c (write_mangled_name, write_prefix): Likewise.
	(write_template_prefix, write_integer_cst, write_expression): Likewise.
	(write_template_arg, write_array_type): Likewise.
	* method.c (lazily_declare_fn): Likewise.
	* rtti.c (get_pseudo_ti_index): Likewise.
	* typeck.c (comp_array_types): Likewise.

diff --git a/gcc/common.opt b/gcc/common.opt
index 5c3f834..f61fab5 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -776,7 +776,7 @@ Driver Undocumented
 ;Therefore, 0 will not necessarily indicate the same ABI in different
 ;versions of G++.
 ;
-; 1: The version of the ABI first used in G++ 3.2.
+; 1: The version of the ABI first used in G++ 3.2.  No longer selectable.
 ;
 ; 2: The version of the ABI first used in G++ 3.4 (and current default).
 ;
diff --git a/gcc/cp/call.c b/gcc/cp/call.c
index 75a6a4a..ac14ce2 100644
--- a/gcc/cp/call.c
+++ b/gcc/cp/call.c
@@ -4130,29 +4130,17 @@ build_operator_new_call (tree fnname, vec **args,
if (*cookie_size)
  {
bool use_cookie = true;
-   if (!abi_version_at_least (2))
-	 {
-	   /* In G++ 3.2, the check was implemented incorrectly; it
-	  looked at the placement expression, rather than the
-	  type of the function.  */
-	   if ((*args)->length () == 2
-	   && same_type_p (TREE_TYPE ((**args)[1]), ptr_type_node))
-	 use_cookie = false;
-	 }
-   else
-	 {
-	   tree arg_types;
+   tree arg_types;
 
-	   arg_types = TYPE_ARG_TYPES (TREE_TYPE (cand->fn));
-	   /* Skip the size_t parameter.  */
-	   arg_types = TREE_CHAIN (arg_types);
-	   /* Check the remaining parameters (if any).  */
-	   if (arg_types
-	   && TREE_CHAIN (arg_types) == void_list_node
-	   && same_type_p (TREE_VALUE (arg_types),
-			   ptr_type_node))
-	 use_cookie = false;
-	 }
+   arg_types = TYPE_ARG_TYPES (TREE_TYPE (cand->fn));
+   /* Skip the size_t parameter.  */
+   arg_types = TREE_CHAIN (arg_types);
+   /* Check the remaining parameters (if any).  */
+   if (arg_types
+	   && TREE_CHAIN (arg_types) == void_list_node
+	   && same_type_p (TREE_VALUE (arg_types),
+			   ptr_type_node))
+	 use_cookie = false;
/* If we need a cookie, adjust the number of bytes allocated.  */
if (use_cookie)
 	 {
diff --git a/gcc/cp/class.c b/gcc/cp/class.c
index 25fc89b..a96b360 100644
--- a/gcc/cp/class.c
+++ b/gcc/cp/class.c
@@ -3820,8 +3820,7 @@ walk_subobject_offsets (tree type,
 
   if (!TYPE_P (type))
 {
-  if (abi_version_at_least (2))
-	type_binfo = type;
+  type_binfo = type;
   type = BINFO_TYPE (type);
 }
 
@@ -3847,43 +3846,29 @@ walk_subobject_offsets (tree type,
 	{
 	  tree binfo_offset;
 
-	  if (abi_version_at_least (2)
-	  && BINFO_VIRTUAL_P (binfo))
+	  if (BINFO_VIRTUAL_P (binfo))
 	continue;
 
-	  if (!vbases_p
-	  && BINFO_VIRTUAL_P (binfo)
-	  && !BINFO_PRIMARY_P (binfo))
-	continue;
-
-	  if (!abi_version_at_least (2))
-	binfo_offset = size_binop (PLUS_EXPR,
-   offset,
-   BINFO_OFFSET (binfo));
-	  else
-	{
-	  tree orig_binfo;
-	  /* We cannot rely on BINFO_OFFSET being set for the base
-		 class yet, but the offsets for direct non-virtual
-		 bases can be calculated by going back to the TYPE.  */
-	  orig_binfo = BINFO_BASE_BINFO (TYPE_BINFO (type), i);
-	  binfo_offset = size_binop (PLUS_EXPR,
-	 offset,
-	 BINFO_OFFSET (orig_binfo));
-	}
+	  tree orig_binfo;
+	  /* We cannot rely on BINFO_OFFSET being set for the base
+	 class yet, but the offsets for direct non-virtual
+	 bases can be calculated by going back to the TYPE.  */
+	  orig_binfo = BINFO_BASE_BINFO (TYPE_BINFO (type), i);
+	  binfo_offset = size_binop (PLUS_EXPR,
+ offset,
+ BINFO_OFFSET (orig_binfo));
 
 	  r = walk_subobject_offsets (binfo,
   f,
   binfo_offset,
   offsets,
   max_offset,
-  (abi_version_at_least (2)
-   ? /*vbases_p=*/0 : vbases_p));
+  /*vbases_p=*/0);
 	  if (r)
 	return r;
 	}
 
- 

Re: [C++ Patch] PR 33101

2014-06-12 Thread Paolo Carlini
... in terms of code proper, the below is much better, IMHO. Assuming, 
as I understand, we have no reason to call the rather heavy same_type_p 
when we already know that VOID_TYPE_P (type) is true...


Thanks,
Paolo.

//
Index: cp/decl.c
===
--- cp/decl.c   (revision 211574)
+++ cp/decl.c   (working copy)
@@ -11159,12 +11159,25 @@ grokparms (tree parmlist, tree *parms)
   type = TREE_TYPE (decl);
   if (VOID_TYPE_P (type))
{
- if (same_type_p (type, void_type_node)
- && DECL_SELF_REFERENCE_P (type)
- && !DECL_NAME (decl) && !result && TREE_CHAIN (parm) == void_list_node)
+ bool cond = (!cv_qualified_p (type)
+  && !DECL_NAME (decl) && !result
+  && TREE_CHAIN (parm) == void_list_node);
+ if (cond
+ && DECL_SELF_REFERENCE_P (type))
/* this is a parmlist of `(void)', which is ok.  */
break;
- cxx_incomplete_type_error (decl, type);
+ else if (cond)
+   error_at (DECL_SOURCE_LOCATION (decl),
+ "invalid use of typedef-name for type "
+ "% in parameter declaration");
+ else if (cv_qualified_p (type))
+   error_at (DECL_SOURCE_LOCATION (decl),
+ "invalid use of cv-qualified type % "
+ "in parameter declaration");
+ else
+   error_at (DECL_SOURCE_LOCATION (decl),
+ "invalid use of type % in parameter "
+ "declaration");
  /* It's not a good idea to actually create parameters of
 type `void'; other parts of the compiler assume that a
 void type terminates the parameter list.  */
Index: testsuite/g++.dg/conversion/err-recover1.C
===
--- testsuite/g++.dg/conversion/err-recover1.C  (revision 211574)
+++ testsuite/g++.dg/conversion/err-recover1.C  (working copy)
@@ -1,6 +1,6 @@
 // PR c++/42219
 
-void foo(const void);  // { dg-error "incomplete|const" }
+void foo(const void);  // { dg-error "invalid use of cv-qualified" }
 
 void bar()
 {
Index: testsuite/g++.dg/other/void3.C
===
--- testsuite/g++.dg/other/void3.C  (revision 0)
+++ testsuite/g++.dg/other/void3.C  (working copy)
@@ -0,0 +1,4 @@
+// PR c++/33101
+
+typedef void v;
+typedef v (*pf)(v);  // { dg-error "invalid use of typedef-name" }


PATCH to change -fabi-version default to 0

2014-06-12 Thread Jason Merrill
I talked about doing this in 4.9 
(https://gcc.gnu.org/ml/gcc/2013-03/msg8.html), but decided to put 
it off along with the libstdc++ ABI transition.  I think it's time now.


Tested x86_64-pc-linux-gnu, applying to trunk.
commit a2aa0efcd1f27e85a4c652f5177c66686f530a96
Author: Jason Merrill 
Date:   Mon Jun 9 16:37:43 2014 -0400

	* common.opt (fabi-version): Change default to 0.

diff --git a/gcc/common.opt b/gcc/common.opt
index f61fab5..7f05092 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -808,7 +808,8 @@ Driver Undocumented
 ; Additional positive integers will be assigned as new versions of
 ; the ABI become the default version of the ABI.
 fabi-version=
-Common Joined RejectNegative UInteger Var(flag_abi_version) Init(2)
+Common Joined RejectNegative UInteger Var(flag_abi_version) Init(0)
+The version of the C++ ABI in use
 
 faggressive-loop-optimizations
 Common Report Var(flag_aggressive_loop_optimizations) Optimization Init(1) 


Re: [PATCH 8/8] Add a common .md file and define standard constraints there

2014-06-12 Thread Segher Boessenkool
On Thu, Jun 05, 2014 at 10:43:25PM +0100, Richard Sandiford wrote:
> This final patch uses a common .md file to define all standard
> constraints except 'g'.

I had a look at what targets still use "g".  Note: there can be
errors in this, it's all based on  \

Re: PATCH to change -fabi-version default to 0

2014-06-12 Thread Mike Stump
On Jun 12, 2014, at 12:17 PM, Jason Merrill  wrote:
> I talked about doing this in 4.9 
> (https://gcc.gnu.org/ml/gcc/2013-03/msg8.html), but decided to put it off 
> along with the libstdc++ ABI transition.  I think it's time now.

Is a doc change needed?

> @opindex fabi-version
> Use version @var{n} of the C++ ABI@.  The default is version 2.


Re: [PATCH 8/8] Add a common .md file and define standard constraints there

2014-06-12 Thread Paul_Koning

On Jun 12, 2014, at 3:24 PM, Segher Boessenkool  
wrote:

> On Thu, Jun 05, 2014 at 10:43:25PM +0100, Richard Sandiford wrote:
>> This final patch uses a common .md file to define all standard
>> constraints except 'g'.
> 
> I had a look at what targets still use "g".  Note: there can be
> errors in this, it's all based on  \ 
> * frv and mcore use "g" in commented-out patterns;
> * cr16, mcore, picochip, rl78, and sh use "g" where they mean "rm"
>  or "m";
> * m68k uses it (in a dbne pattern) where the C template splits
>  the "r", "m", "i" cases again;
> * bfin, fr30, h8300, m68k, rs6000, and v850 use it as the second
>  operand (# bytes pushed) of the call patterns; that operand is
>  unused in all these cases, could just be "";
> * cris, m68k, pdp11, and vax actually use "g".
> 
> So it won't be all that much work to completely get rid of "g".
> Do we want that?

Is it simply a matter of replacing “g” by “mri”?  That’s what the doc suggests. 
 Or is there more to the story than that?

paul


Re: PATCH to change -fabi-version default to 0

2014-06-12 Thread Dominique Dhumieres
How does this affect pr60732?

Dominique


Re: [PATCH, PR61446] Fix mode for register copy in REE pass

2014-06-12 Thread Jeff Law

On 06/10/14 01:42, Ilya Enkovich wrote:

> Hi,
>
> This patch fixes PR61446.  The problem appears when we insert value copies
> after transformations.  We use the widest extension mode met in a chain, but it
> may be wider than the original destination register size.  This patch checks
> for that and uses a smaller mode if required.
>
> Bootstrapped and tested on linux-x86_64.
>
> Does it look OK?
>
> Thanks,
> Ilya
> --
> 2014-06-09  Ilya Enkovich
>
> PR 61446
> * ree.c (find_and_remove_re): Narrow mode for register copy
> if required.

The whole point behind the 61094 change was to avoid this kind of issue,
i.e., before eliminating an extension which requires a copy, make sure
the copy is going to be valid (single insn that is recognizable and
satisfies its constraints).  If the copy is not going to be valid, then
suppress the extension elimination.


It's not working as desired because of a relatively simple goof.

When I wrote the changes for 61094, I copied the code which created the 
new insns from find_and_remove_re into combine_reaching_defs -- the idea 
being we want to generate the same insn in combine_reaching_defs that 
will be generated in find_and_remove_re.  In combine_reaching_defs we 
generate, validate & throw it away.  In find_and_remove_re we generate 
and insert it into the insn stream.


The subtle issue I missed is that in find_and_remove_re, we have already
transformed the defining insn, i.e., the destination of the defining insn
is in the widened mode.  That is _not_ the case in combine_reaching_defs.


So combine_reaching_defs is not testing the same insn that will be 
created by find_and_remove_re.  The insns have the same structure, but 
the modes of the operands are different.


For 61094, that little difference was not important.  It *is* important 
for 61446.  Thankfully the fix is trivial and I've confirmed that 61094 
stays fixed and that it fixes 61446.  Going through the bootstrap & 
regression process now.


Jeff


Re: [PATCH][RFC] Fix PR61473, inline small memcpy/memmove during tree opts

2014-06-12 Thread Jeff Law

On 06/12/14 04:12, Richard Biener wrote:


> This implements the requested inlining of memmove for possibly
> overlapping arguments by doing first all loads and then all stores.
> The easiest place is to do this in memory op folding where we already
> perform inlining of some memcpy cases (but fail to do the equivalent
> memcpy optimization - though RTL expansion later does it).
>
> The following patch restricts us to max. word-mode size.  Ideally
> we'd have a way to check for the number of real instructions needed
> to load an (aligned) value of size N.  But maybe we don't care
> and are fine with doing multiple loads / stores?
>
> Anyway, the following is conservative (but maybe not enough).
>
> Bootstrap / regtest running on x86_64-unknown-linux-gnu.
>
> These transforms don't really belong to GENERIC folding (they
> also run at -O0 ...), similar to most builtin foldings.  But this
> patch is not to change that.
>
> Any comments on the size/cost issue?

I recall seeing something in one of the BZ databases that asked for
double-word to be expanded inline.  Presumably the reporter's code did
lots of double-word things of this nature.


Obviously someone else might want quad-word and so-on.  However, double 
words seem like a very reasonable request.
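
For readers following along, the "all loads first, then all stores" idea
for a double-word move can be sketched in plain C++.  This is only an
illustration, not the GENERIC folding code from the patch: the names
move16 and overlap_demo are hypothetical, a word is assumed to be 8
bytes, and the fixed-size memcpy calls merely stand in for the loads and
stores the compiler would emit.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: move 16 bytes between possibly overlapping
// buffers.  Both words are loaded into temporaries before either
// store happens, so overlap between src and dst cannot corrupt the
// data being moved.
inline void move16(void *dst, const void *src)
{
  std::uint64_t lo, hi;
  const unsigned char *s = static_cast<const unsigned char *>(src);
  unsigned char *d = static_cast<unsigned char *>(dst);

  // All loads first ...
  std::memcpy(&lo, s, sizeof lo);
  std::memcpy(&hi, s + sizeof lo, sizeof hi);

  // ... then all stores.
  std::memcpy(d, &lo, sizeof lo);
  std::memcpy(d + sizeof lo, &hi, sizeof hi);
}

// Small check with overlapping source and destination: shift the
// first 16 bytes of the buffer forward by 4 bytes.
inline bool overlap_demo()
{
  char buf[40] = "0123456789abcdefghijklmnopqrstu";
  move16(buf + 4, buf);
  return std::memcmp(buf + 4, "0123456789abcdef", 16) == 0
	 && buf[20] == 'k';
}
```

Whether two word-sized loads and stores are cheap enough on a given
target is exactly the size/cost question raised above; the sketch only
shows why the ordering makes overlap safe.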


jeff



Re: [patch i386]: Combine memory and indirect jump

2014-06-12 Thread Kai Tietz
2014-06-12 20:52 GMT+02:00 Segher Boessenkool :
> On Thu, Jun 12, 2014 at 06:21:32PM +0200, Kai Tietz wrote:
>> with addition of adding a second peephole2 pass after sched2 pass, I
>> was able to get some improvement for PR target/39284.  I think by this
>> addition we can close bug as fixed.
>> Additionally additional peephole2 pass shows better results for PR
>> target/51840 testcase with disabled ASM_GOTO, too.

Well, this is the only point I am a bit concerned about, too.  In general I
wouldn't expect any issues from running peephole after scheduling, as
peephole doesn't do anything that would require a new run of ira/lra.
Anyway it would be good if a global maintainer could comment on that.

> Will that work on other targets?  Also, it needs a doc fix (md.texi
> says peephole2 runs before scheduling).

Thanks for pointing that out.  When I send the patch for this additional
peephole pass with a testcase, I will adjust md.texi.

>
> Segher


partial-concept-ids

2014-06-12 Thread Andrew Sutton
Add support for partial concept ids. Mostly this just refactors the
basic support for concept names to also allow a template and extra
arguments.

Also added the missing .exp file for the test suite.

2014-06-12  Andrew Sutton  
* gcc/cp/constraint.cc (deduce_constrained_parameter): Refactor
common deduction framework into separate function.
(build_call_check): New.
(build_concept_check): Take additional arguments to support the
creation of constrained-type-specifiers from partial-concept-ids.
(build_constrained_parameter): Take arguments from a partial-concept-id.
* gcc/cp/cp-tree.h (build_concept_check, build_constrained_parameter):
Take a template argument list, defaulting to NULL_TREE.
* gcc/cp/parser.c (cp_parser_template_id): Check to see if a
template-id is a concept check.
(cp_check_type_concept): Reorder arguments.
(cp_parser_allows_constrained_type_specifier): New. Check contexts
where a constrained-type-specifier is allowed.
(cp_maybe_constrained_type_specifier): New. Refactored common rules
for concept name checks.
(cp_maybe_partial_concept_id): New. Check for
constrained-type-specifiers.
* gcc/testsuite/g++.dg/concepts/partial.C: New tests.
* gcc/testsuite/g++.dg/concepts/partial-err.C: New tests.
* gcc/testsuite/g++.dg/concepts/concepts.exp: Add missing test driver.

Andrew Sutton
Index: parser.c
===
--- parser.c	(revision 211585)
+++ parser.c	(working copy)
@@ -2523,7 +2523,10 @@ static tree cp_parser_make_typename_type
 static cp_declarator * cp_parser_make_indirect_declarator
  (enum tree_code, tree, cp_cv_quals, cp_declarator *, tree);
 
+/* Concept-related syntactic transformations */
 
+static tree cp_maybe_concept_name   (cp_parser *, tree);
+static tree cp_maybe_partial_concept_id (cp_parser *, tree, tree);
 
 // -- //
 // Unevaluated Operand Guard
@@ -13775,6 +13778,11 @@ cp_parser_template_id (cp_parser *parser
 		   || TREE_CODE (templ) == OVERLOAD
 		   || BASELINK_P (templ)));
 
+  // If the template + args designate a concept, then return
+  // something else.
+  if (tree id = cp_maybe_partial_concept_id (parser, templ, arguments))
+return id;
+
   template_id = lookup_template_function (templ, arguments);
 }
 
@@ -14995,7 +15003,8 @@ cp_parser_simple_type_specifier (cp_pars
 	}
   /* Otherwise, look for a type-name.  */
   else
-	type = cp_parser_type_name (parser);
+type = cp_parser_type_name (parser);
+  
   /* Keep track of all name-lookups performed in class scopes.  */
   if (type
 	  && !global_p
@@ -15071,6 +15080,7 @@ cp_parser_simple_type_specifier (cp_pars

type-name:
  concept-name
+ partial-concept-id
 
concept-name:
  identifier
@@ -15092,6 +15102,7 @@ cp_parser_type_name (cp_parser* parser)
 /*check_dependency_p=*/true,
 /*class_head_p=*/false,
 /*is_declaration=*/false);
+
   /* If it's not a class-name, keep looking.  */
   if (!cp_parser_parse_definitely (parser))
 {
@@ -15107,6 +15118,7 @@ cp_parser_type_name (cp_parser* parser)
 	 /*check_dependency_p=*/true,
 	 none_type,
 	 /*is_declaration=*/false);
+
   /* Note that this must be an instantiation of an alias template
 	 because [temp.names]/6 says:
 	 
@@ -15135,7 +15147,7 @@ cp_parser_type_name (cp_parser* parser)
 /// Returns true if proto is a type parameter, but not a template template
 /// parameter.
 static bool
-cp_check_type_concept (tree proto, tree fn) 
+cp_check_type_concept (tree fn, tree proto) 
 {
   if (TREE_CODE (proto) != TYPE_DECL) 
 {
@@ -15145,57 +15157,58 @@ cp_check_type_concept (tree proto, tree
   return true;
 }
 
-// If DECL refers to a concept, return a TYPE_DECL representing the result
-// of using the constrained type specifier in the current context. 
-//
-// DECL refers to a concept if
-//   - it is an overload set containing a function concept taking a single
-// type argument, or
-//   - it is a variable concept taking a single type argument
-//
-//
-// TODO: DECL could be a variable concept.
+/// Returns true if the parser is in a context that allows the
+/// use of a constrained type specifier.
+static inline bool
+cp_parser_allows_constrained_type_specifier (cp_parser *parser)
+{
+  return flag_concepts 
+&& (processing_template_parmlist
+|| parser->auto_is_implicit_function_template_parm_p
+|| parser->in_result_type_constraint_p);
+}
+
+// Check if DECL and ARGS can form a constrained-type-specifier. If ARGS
+// is non-null, we try to form a concept check of the form DECL
+// where ? is a placeholder for any kind of template argument. If ARGS
+// is NULL, then we try to form a concept check of the form DEC.
 static tre

C++ PATCH to add -Wabi=n

2014-06-12 Thread Jason Merrill
Now that -fabi-version defaults to 0, -Wabi isn't very useful.  But for 
people interested in compatibility with earlier versions, this patch 
allows you to say -Wabi=2 to get any relevant warnings.  This patch also 
adjusts the compatibility aliases to default to backward compatibility 
with -fabi-version=2.
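
As a rough illustration of when such a warning would apply, the
"crosses" test can be modelled outside the compiler.  This is a
standalone sketch, not the code in c-common.h: the function names are
hypothetical, the versions are passed as parameters instead of being
read from flag_abi_version and flag_abi_compat_version, and it assumes
that version 0 means "latest" for both settings.

```cpp
// Hypothetical model: version 0 is treated as "the most recent ABI",
// so it is at least any N.
constexpr bool version_at_least(int version, int n)
{
  return version == 0 || version >= n;
}

// A change introduced in ABI version N matters for -Wabi=<compat>
// only when exactly one of the two versions already includes it.
constexpr bool version_crosses(int abi_version, int compat_version, int n)
{
  return version_at_least(abi_version, n)
	 != version_at_least(compat_version, n);
}

// Default ABI (0) against compat version 2: a change made in
// version 6 is on opposite sides, one made in version 2 is not.
static_assert(version_crosses(0, 2, 6), "v6 change differs from v2");
static_assert(!version_crosses(0, 2, 2), "v2 change is in both");
```

So with -Wabi=2 and the new default, only changes introduced after
version 2 would be reported under this model.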


Tested x86_64-pc-linux-gnu, applying to trunk.
commit 969f9f501a5a8b7a9498464bf3bef59e685b3895
Author: Jason Merrill 
Date:   Mon Jun 9 16:41:07 2014 -0400

	Support -Wabi warning about backward compatibility.
gcc/c-family/
	* c.opt (Wabi=, fabi-compat-version): New.
	* c-opts.c (c_common_handle_option): Handle -Wabi=.
	(c_common_post_options): Handle flag_abi_compat_version default.
	Disallow -fabi-compat-version=1.
	* c-common.h (abi_version_crosses): New.
gcc/cp/
	* call.c (convert_arg_to_ellipsis): Use abi_version_crosses.
	* cvt.c (type_promotes_to): Likewise.
	* mangle.c (write_type, write_expression): Likewise.
	(write_name, write_template_arg): Likewise.
	(mangle_decl): Make alias based on flag_abi_compat_version.
	Emit -Wabi warning here.
	(finish_mangling_internal): Not here.  Drop warn parm.
	(finish_mangling_get_identifier, finish_mangling): Adjust.
	(mangle_type_string, mangle_special_for_type): Adjust.
	(mangle_ctor_vtbl_for_type, mangle_thunk): Adjust.
	(mangle_guard_variable, mangle_tls_init_fn): Adjust.
	(mangle_tls_wrapper_fn, mangle_ref_init_variable): Adjust.

diff --git a/gcc/c-family/c-common.h b/gcc/c-family/c-common.h
index 83d5dee..6bf4051 100644
--- a/gcc/c-family/c-common.h
+++ b/gcc/c-family/c-common.h
@@ -619,6 +619,13 @@ extern const char *constant_string_class_name;
 /* C++ language option variables.  */
 
 
+/* Return TRUE if one of {flag_abi_version,flag_abi_compat_version} is
+   less than N and the other is at least N, for use by -Wabi.  */
+#define abi_version_crosses(N)			\
+  (abi_version_at_least(N)			\
+   != (flag_abi_compat_version == 0		\
+   || flag_abi_compat_version >= (N)))
+
 /* Nonzero means generate separate instantiation control files and
juggle them at link time.  */
 
diff --git a/gcc/c-family/c-opts.c b/gcc/c-family/c-opts.c
index 29e9a35..fbbc80e 100644
--- a/gcc/c-family/c-opts.c
+++ b/gcc/c-family/c-opts.c
@@ -456,6 +456,16 @@ c_common_handle_option (size_t scode, const char *arg, int value,
   handle_OPT_d (arg);
   break;
 
+case OPT_Wabi_:
+  warn_abi = true;
+  if (value == 1)
+	{
+	  warning (0, "%<-Wabi=1%> is not supported, using =2");
+	  value = 2;
+	}
+  flag_abi_compat_version = value;
+  break;
+
 case OPT_fcanonical_system_headers:
   cpp_opts->canonical_system_headers = value;
   break;
@@ -910,6 +920,22 @@ c_common_post_options (const char **pfilename)
   if (flag_declone_ctor_dtor == -1)
 flag_declone_ctor_dtor = optimize_size;
 
+  if (flag_abi_compat_version == 1)
+{
+  warning (0, "%<-fabi-compat-version=1%> is not supported, using =2");
+  flag_abi_compat_version = 2;
+}
+  else if (flag_abi_compat_version == -1)
+{
+  /* Generate compatibility aliases for ABI v2 (3.4-4.9) by default. */
+  flag_abi_compat_version = (flag_abi_version == 0 ? 2 : 0);
+
+  /* But don't warn about backward compatibility unless explicitly
+	 requested with -Wabi=n.  */
+  if (flag_abi_version == 0)
+	warn_abi = false;
+}
+
   if (cxx_dialect >= cxx11)
 {
   /* If we're allowing C++0x constructs, don't warn about C++98
diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt
index 76e67d7..d2e047f 100644
--- a/gcc/c-family/c.opt
+++ b/gcc/c-family/c.opt
@@ -256,6 +256,10 @@ Wabi
 C ObjC C++ ObjC++ LTO Var(warn_abi) Warning
 Warn about things that will change when compiling with an ABI-compliant compiler
 
+Wabi=
+C ObjC C++ ObjC++ LTO Joined RejectNegative UInteger Warning
+Warn about things that change between the current -fabi-version and the specified version
+
 Wabi-tag
 C++ ObjC++ Var(warn_abi_tag) Warning
 Warn if a subobject has an abi_tag attribute that the complete object type does not have
@@ -845,6 +849,10 @@ d
 C ObjC C++ ObjC++ Joined
 ; Documented in common.opt.  FIXME - what about -dI, -dD, -dN and -dD?
 
+fabi-compat-version=
+C++ ObjC++ Joined RejectNegative UInteger Var(flag_abi_compat_version) Init(-1)
+The version of the C++ ABI used for -Wabi warnings and link compatibility aliases
+
 faccess-control
 C++ ObjC++ Var(flag_access_control) Init(1)
 Enforce class member access control semantics
diff --git a/gcc/cp/call.c b/gcc/cp/call.c
index ac14ce2..44e92fc 100644
--- a/gcc/cp/call.c
+++ b/gcc/cp/call.c
@@ -6508,14 +6508,22 @@ convert_arg_to_ellipsis (tree arg, tsubst_flags_t complain)
 arg = null_pointer_node;
   else if (INTEGRAL_OR_ENUMERATION_TYPE_P (arg_type))
 {
-  if (SCOPED_ENUM_P (arg_type) && !abi_version_at_least (6))
+  if (SCOPED_ENUM_P (arg_type))
 	{
-	  if (complain & tf_warning)
-	warning_at (loc, OPT_Wabi, "scoped enu

Re: PATCH to change -fabi-version default to 0

2014-06-12 Thread Jason Merrill

On 06/12/2014 03:36 PM, Mike Stump wrote:

> On Jun 12, 2014, at 12:17 PM, Jason Merrill  wrote:
>> I talked about doing this in 4.9
>> (https://gcc.gnu.org/ml/gcc/2013-03/msg8.html), but decided to put it off
>> along with the libstdc++ ABI transition.  I think it's time now.
>
> Is a doc change needed?


Yep, I updated the docs in the -Wabi=n patch.

Jason




Re: PATCH to change -fabi-version default to 0

2014-06-12 Thread Jason Merrill

On 06/12/2014 03:44 PM, Dominique Dhumieres wrote:

> How does this affect pr60732?


It should fix that failure.

Jason




Re: [patch i386]: Combine memory and indirect jump

2014-06-12 Thread Segher Boessenkool
> > Will that work on other targets?

> Well, this is the only point I am a bit concerned too.  In general I
> wouldn't expect here any issues to run peephole after scheduling, as
> peephole doesn't do anything a new run of ira/lra would require.

My concern is that peepholes are rather fragile, so imho it is not
inconceivable that some target will generate wrong code when you add
an extra (later) peephole pass.  Of course, we are in stage1.

My other concern is that running peepholes again after scheduling
could easily generate worse code.

So I think the effect of this change on other targets needs to be
evaluated.

> Anyway it would be good if a global maintainer could comment on that.

Yes :-)


Segher


[committed] Fix some combined OpenMP 4 clauses issues (PR middle-end/61486)

2014-06-12 Thread Jakub Jelinek
Hi!

This patch fixes 3 issues:
1) distribute doesn't support lastprivate clause, so gimplification
   shouldn't add it, it causes ICEs
2) for shared clauses on teams construct we need to at least
   record something in decl_map, otherwise lookup_decl ICEs
3) c_omp_split_clauses ICEd on a couple of combined constructs
   with firstprivate clause

Bootstrapped/regtested on x86_64-linux and i686-linux, committed to trunk
and 4.9 branch.

2014-06-12  Jakub Jelinek  

PR middle-end/61486
* gimplify.c (struct gimplify_omp_ctx): Add distribute field.
(gimplify_adjust_omp_clauses): Don't or in GOVD_LASTPRIVATE
if outer combined construct is distribute.
(gimplify_omp_for): For OMP_DISTRIBUTE set
gimplify_omp_ctxp->distribute.
* omp-low.c (scan_sharing_clauses) : For
GIMPLE_OMP_TEAMS, if decl isn't global in outer context, record
mapping into decl map.
c-family/
* c-omp.c (c_omp_split_clauses): Don't crash on firstprivate in
#pragma omp target teams or
#pragma omp {,target }teams distribute simd.
testsuite/
* c-c++-common/gomp/pr61486-1.c: New test.
* c-c++-common/gomp/pr61486-2.c: New test.

--- gcc/gimplify.c.jj   2014-06-06 09:19:23.0 +0200
+++ gcc/gimplify.c  2014-06-12 16:06:07.992997628 +0200
@@ -139,6 +139,7 @@ struct gimplify_omp_ctx
   enum omp_clause_default_kind default_kind;
   enum omp_region_type region_type;
   bool combined_loop;
+  bool distribute;
 };
 
 static struct gimplify_ctx *gimplify_ctxp;
@@ -6359,7 +6360,11 @@ gimplify_adjust_omp_clauses (tree *list_
  if (n == NULL
  || (n->value & GOVD_DATA_SHARE_CLASS) == 0)
{
- int flags = GOVD_FIRSTPRIVATE | GOVD_LASTPRIVATE;
+ int flags = GOVD_FIRSTPRIVATE;
+ /* #pragma omp distribute does not allow
+lastprivate clause.  */
+ if (!ctx->outer_context->distribute)
+   flags |= GOVD_LASTPRIVATE;
  if (n == NULL)
omp_add_variable (ctx->outer_context, decl,
  flags | GOVD_SEEN);
@@ -6640,6 +6645,8 @@ gimplify_omp_for (tree *expr_p, gimple_s
  || TREE_CODE (for_stmt) == CILK_SIMD);
   gimplify_scan_omp_clauses (&OMP_FOR_CLAUSES (for_stmt), pre_p,
 simd ? ORT_SIMD : ORT_WORKSHARE);
+  if (TREE_CODE (for_stmt) == OMP_DISTRIBUTE)
+gimplify_omp_ctxp->distribute = true;
 
   /* Handle OMP_FOR_INIT.  */
   for_pre_body = NULL;
--- gcc/omp-low.c.jj2014-06-10 08:02:49.0 +0200
+++ gcc/omp-low.c   2014-06-12 16:41:09.438849948 +0200
@@ -1509,11 +1509,19 @@ scan_sharing_clauses (tree clauses, omp_
  break;
 
case OMP_CLAUSE_SHARED:
+ decl = OMP_CLAUSE_DECL (c);
  /* Ignore shared directives in teams construct.  */
  if (gimple_code (ctx->stmt) == GIMPLE_OMP_TEAMS)
-   break;
+   {
+ /* Global variables don't need to be copied,
+the receiver side will use them directly.  */
+ tree odecl = maybe_lookup_decl_in_outer_ctx (decl, ctx);
+ if (is_global_var (odecl))
+   break;
+ insert_decl_map (&ctx->cb, decl, odecl);
+ break;
+   }
  gcc_assert (is_taskreg_ctx (ctx));
- decl = OMP_CLAUSE_DECL (c);
  gcc_assert (!COMPLETE_TYPE_P (TREE_TYPE (decl))
  || !is_variable_sized (decl));
  /* Global variables don't need to be copied,
--- gcc/c-family/c-omp.c.jj 2014-05-11 22:20:26.0 +0200
+++ gcc/c-family/c-omp.c2014-06-12 17:11:49.507948417 +0200
@@ -789,8 +789,13 @@ c_omp_split_clauses (location_t loc, enu
  else if ((mask & (OMP_CLAUSE_MASK_1 << PRAGMA_OMP_CLAUSE_NUM_TEAMS))
   != 0)
{
- /* This must be #pragma omp {,target }teams distribute.  */
- gcc_assert (code == OMP_DISTRIBUTE);
+ /* This must be one of
+#pragma omp {,target }teams distribute
+#pragma omp target teams
+#pragma omp {,target }teams distribute simd.  */
+ gcc_assert (code == OMP_DISTRIBUTE
+ || code == OMP_TEAMS
+ || code == OMP_SIMD);
  s = C_OMP_CLAUSE_SPLIT_TEAMS;
}
  else if ((mask & (OMP_CLAUSE_MASK_1
--- gcc/testsuite/c-c++-common/gomp/pr61486-1.c.jj  2014-06-12 
19:11:52.029213158 +0200
+++ gcc/testsuite/c-c++-common/gomp/pr61486-1.c 2014-06-12 19:12:22.427069749 
+0200
@@ -0,0 +1,13 @@
+/* PR middle-end/61486 */
+/* { dg-do compile } */
+/* { dg-options "-fopenmp" } */
+
+int
+foo (int *a)
+{
+  int i, j = 0;
+  #pragma omp target teams distribute simd linear(i, j) map(a[:10])
+  for (

Re: [PATCH 8/8] Add a common .md file and define standard constraints there

2014-06-12 Thread Segher Boessenkool
> > * cris, m68k, pdp11, and vax actually use "g".
> > 
> > So it won't be all that much work to completely get rid of "g".
> > Do we want that?
> 
> Is it simply a matter of replacing “g” by “mri”?  That’s what the doc 
> suggests.  Or is there more to the story than that?

As far as I know "g" and "rmi" are equivalent, yes.  "g" is easier to
type and read if you use it a lot (only ancient targets really); the
compiler will probably become somewhat slower for those targets, and
perhaps somewhat faster for all others.  Hard to say without doing the
work and measuring the result :-)


Segher


Re: [C++ Patch] PR 33101

2014-06-12 Thread Jason Merrill

On 06/12/2014 03:14 PM, Paolo Carlini wrote:

> ... in terms of code proper, the below is much better, IMHO. Assuming,
> as I understand, we have no reason to call the rather heavy same_type_p
> when we already know that VOID_TYPE_P (type) is true...


same_type_p is not so heavy since it just compares TYPE_CANONICAL, but I 
wonder why we don't use == for the normal case, and then 
typedef_variant_p to diagnose a typedef.


Jason



Re: [patch i386]: Combine memory and indirect jump

2014-06-12 Thread David Wohlferd


On 6/12/2014 9:21 AM, Kai Tietz wrote:

> with addition of adding a second peephole2 pass after sched2 pass, I
> was able to get some improvement for PR target/39284.  I think by this
> addition we can close bug as fixed.
> Additionally additional peephole2 pass shows better results for PR
> target/51840 testcase with disabled ASM_GOTO, too.


Any chance this also fixes PR 58670 (see comment #5)?

dw




Re: [Google] Fix AFDO early inline ICEs due to DFE

2014-06-12 Thread Dehao Chen
I think the patch looks good. David and Rong, any comments?

Dehao

On Thu, Jun 12, 2014 at 11:23 AM, Teresa Johnson  wrote:
> These two patches fix multiple ICE that occurred due to DFE being
> recently enabled after AutoFDO LIPO linking.
>
> Passes regression and internal testing. Ok for Google/4_8?
>
> Teresa
>
> 2014-06-12  Teresa Johnson  
> Dehao Chen  
>
> Google ref b/15521327.
>
> * cgraphclones.c (cgraph_clone_edge): Use resolved node.
> * l-ipo.c (resolve_cgraph_node): Resolve to non-removable node.
>
> Index: cgraphclones.c
> ===
> --- cgraphclones.c  (revision 211386)
> +++ cgraphclones.c  (working copy)
> @@ -94,6 +94,7 @@ along with GCC; see the file COPYING3.  If not see
>  #include "ipa-utils.h"
>  #include "lto-streamer.h"
>  #include "except.h"
> +#include "l-ipo.h"
>
>  /* Create clone of E in the node N represented by CALL_EXPR the callgraph.  
> */
>  struct cgraph_edge *
> @@ -118,7 +119,11 @@ cgraph_clone_edge (struct cgraph_edge *e, struct c
>
>if (call_stmt && (decl = gimple_call_fndecl (call_stmt)))
> {
> - struct cgraph_node *callee = cgraph_get_node (decl);
> +  struct cgraph_node *callee;
> +  if (L_IPO_COMP_MODE && cgraph_pre_profiling_inlining_done)
> +callee = cgraph_lipo_get_resolved_node (decl);
> +  else
> +callee = cgraph_get_node (decl);
>   gcc_checking_assert (callee);
>   new_edge = cgraph_create_edge (n, callee, call_stmt, count, freq);
> }
> Index: l-ipo.c
> ===
> --- l-ipo.c (revision 211386)
> +++ l-ipo.c (working copy)
> @@ -1542,6 +1542,18 @@ resolve_cgraph_node (struct cgraph_sym **slot, str
>gcc_assert (decl1_defined);
>add_define_module (*slot, decl2);
>
> +  /* Pick the node that cannot be removed, to avoid a situation
> + where we remove the resolved node and later try to access
> + it for the remaining non-removable copy.  E.g. one may be
> + extern and the other weak, only the extern copy can be removed.  */
> +  if (cgraph_can_remove_if_no_direct_calls_and_refs_p ((*slot)->rep_node)
> +  && !cgraph_can_remove_if_no_direct_calls_and_refs_p (node))
> +{
> +  (*slot)->rep_node = node;
> +  (*slot)->rep_decl = decl2;
> +  return;
> +}
> +
>has_prof1 = has_profile_info (decl1);
>bool is_aux1 = cgraph_is_auxiliary (decl1);
>bool is_aux2 = cgraph_is_auxiliary (decl2);
>
>
> --
> Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413


Fix vectorizer conditions on updating alignment

2014-06-12 Thread Jan Hubicka
Hi,
while updating vect_can_force_dr_alignment_p for the section API I noticed that
the predicate is a bit confused about when it can update the alignment.

We need to check decl_binds_to_current_def_p and, in case we are compiling
a partition, also that the symbol is not homed in another partition.
The previous code was wrong e.g. for COMDATs, weaks or -fpic.

Also, when we have an alias, the only way to promote the alignment is to bump
up the alignment of the target.

On the other hand, the comment about DECL_IN_CONSTANT_POOL seems confused - we
have no sharing across partitions.  I assume it was an old hack and removed it.

I also see no reason for disregarding DECL_PRESERVE - we only update the
alignment, which should not disturb whatever magic the user does.  But I kept
it.

We probably should separate the logic into a symtab predicate - it just checks
whether we can change the definition of a variable to meet our needs.  I can do
that incrementally.
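
The binding and partition conditions described above can be summarised
as a small decision function.  This is a simplified model for
illustration only, not the predicate in tree-vect-data-refs.c: the
struct and its field names are hypothetical stand-ins for the tree and
symtab queries, and the checks for the used attribute, named sections,
aliases and MAX_OFILE_ALIGNMENT are left out.

```cpp
// Hypothetical summary of the facts the real predicate looks up.
struct symbol_facts
{
  bool public_or_external;   // TREE_PUBLIC or DECL_EXTERNAL
  bool binds_to_current_def; // decl_binds_to_current_def_p
  bool compiling_partition;  // flag_ltrans
  bool in_other_partition;   // symbol is output by another partition
  bool duplicated;           // partitioning class is SYMBOL_DUPLICATE
};

// Return true if, as far as binding and partitioning are concerned,
// the definition we see is the one that will be used, so raising its
// alignment is safe.
constexpr bool may_raise_alignment(const symbol_facts &s)
{
  return !s.public_or_external
	 || (s.binds_to_current_def
	     && !(s.compiling_partition
		  && (s.in_other_partition || s.duplicated)));
}

// A local static is fine; a weak or -fpic interposable symbol is not.
static_assert(may_raise_alignment(symbol_facts{false, false, false, false, false}),
	      "local symbol");
static_assert(!may_raise_alignment(symbol_facts{true, false, false, false, false}),
	      "may bind to another definition");
```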

Bootstrapped/regtested x86_64-linux, committed.

Honza

* tree-vect-data-refs.c (vect_can_force_dr_alignment_p): Reorg
to use symtab and decl_binds_to_current_def_p
* tree-vectorizer.c (increase_alignment): Increase alignment
of alias target, too.
Index: tree-vect-data-refs.c
===
--- tree-vect-data-refs.c   (revision 211489)
+++ tree-vect-data-refs.c   (working copy)
@@ -58,6 +58,7 @@ along with GCC; see the file COPYING3.
 #include "expr.h"
 #include "optabs.h"
 #include "builtins.h"
+#include "varasm.h"
 
 /* Return true if load- or store-lanes optab OPTAB is implemented for
COUNT vectors of type VECTYPE.  NAME is the name of OPTAB.  */
@@ -5316,19 +5317,26 @@ vect_can_force_dr_alignment_p (const_tre
   if (TREE_CODE (decl) != VAR_DECL)
 return false;
 
-  /* We cannot change alignment of common or external symbols as another
- translation unit may contain a definition with lower alignment.  
- The rules of common symbol linking mean that the definition
- will override the common symbol.  The same is true for constant
- pool entries which may be shared and are not properly merged
- by LTO.  */
-  if (DECL_EXTERNAL (decl)
-  || DECL_COMMON (decl)
-  || DECL_IN_CONSTANT_POOL (decl))
-return false;
+  gcc_assert (!TREE_ASM_WRITTEN (decl));
 
-  if (TREE_ASM_WRITTEN (decl))
-return false;
+  if (TREE_PUBLIC (decl) || DECL_EXTERNAL (decl))
+{
+  symtab_node *snode;
+
+  /* We cannot change alignment of symbols that may bind to symbols
+in other translation unit that may contain a definition with lower
+alignment.  */
+  if (!decl_binds_to_current_def_p (decl))
+   return false;
+
+  /* When compiling partition, be sure the symbol is not output by other
+partition.  */
+  snode = symtab_get_node (decl);
+  if (flag_ltrans
+ && (snode->in_other_partition
+ || symtab_get_symbol_partitioning_class (snode) == 
SYMBOL_DUPLICATE))
+   return false;
+}
 
   /* Do not override the alignment as specified by the ABI when the used
  attribute is set.  */
@@ -5343,6 +5351,18 @@ vect_can_force_dr_alignment_p (const_tre
   && !symtab_get_node (decl)->implicit_section)
 return false;
 
+  /* If symbol is an alias, we need to check that target is OK.  */
+  if (TREE_STATIC (decl))
+{
+  tree target = symtab_alias_ultimate_target (symtab_get_node 
(decl))->decl;
+  if (target != decl)
+   {
+ if (DECL_PRESERVE_P (target))
+   return false;
+ decl = target;
+   }
+}
+
   if (TREE_STATIC (decl))
 return (alignment <= MAX_OFILE_ALIGNMENT);
   else
Index: tree-vectorizer.c
===
--- tree-vectorizer.c   (revision 211488)
+++ tree-vectorizer.c   (working copy)
@@ -686,6 +686,12 @@ increase_alignment (void)
 {
   DECL_ALIGN (decl) = TYPE_ALIGN (vectype);
   DECL_USER_ALIGN (decl) = 1;
+ if (TREE_STATIC (decl))
+   {
+ tree target = symtab_alias_ultimate_target (symtab_get_node 
(decl))->decl;
+  DECL_ALIGN (target) = TYPE_ALIGN (vectype);
+  DECL_USER_ALIGN (target) = 1;
+   }
   dump_printf (MSG_NOTE, "Increasing alignment of decl: ");
   dump_generic_expr (MSG_NOTE, TDF_SLIM, decl);
   dump_printf (MSG_NOTE, "\n");


[C++ PATCH, RFC] PR c++/61491

2014-06-12 Thread Ville Voutilainen
DR1206 allows explicit specializations of member enumerations
of class templates, so just remove the pedwarn about it.

Tested on Linux-x64. Not bootstrapped.
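
For reference, the kind of code DR1206 permits looks roughly like the
following.  This is a self-contained sketch written for illustration;
the template parameter name T and the static_assert are assumptions
added here and are not taken from the attached testcase.

```cpp
// A class template with an opaque declaration of a member enumeration.
template <typename T> struct Base
{
  enum class E : unsigned;
};

struct X;

// Explicit specialization of the member enumeration for Base<X>.
// Compiling this in C++11 mode with -pedantic is what used to
// trigger the pedwarn.
template <> enum class Base<X>::E : unsigned { a, b };

static_assert(static_cast<unsigned>(Base<X>::E::b) == 1u,
	      "enumerators come from the specialization");
```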
diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index d267a5c..97eadeb 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -967,11 +967,8 @@ maybe_process_partial_specialization (tree type)
   else if (processing_specialization)
 {
/* Someday C++0x may allow for enum template specialization.  */
-  if (cxx_dialect > cxx98 && TREE_CODE (type) == ENUMERAL_TYPE
- && CLASS_TYPE_P (context) && CLASSTYPE_USE_TEMPLATE (context))
-   pedwarn (input_location, OPT_Wpedantic, "template specialization "
-"of %qD not allowed by ISO C++", type);
-  else
+  if (!(cxx_dialect > cxx98 && TREE_CODE (type) == ENUMERAL_TYPE
+   && CLASS_TYPE_P (context) && CLASSTYPE_USE_TEMPLATE (context)))
{
  error ("explicit specialization of non-template %qT", type);
  return error_mark_node;
diff --git a/gcc/testsuite/g++.dg/cpp0x/pr61491.C 
b/gcc/testsuite/g++.dg/cpp0x/pr61491.C
new file mode 100644
index 000..c105782
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/pr61491.C
@@ -0,0 +1,12 @@
+// { dg-do compile { target c++11 } }
+// { dg-options "-pedantic" }
+// DR 1206 (explicit specialization of a member enumeration of a class template)
+
+template <typename T> struct Base 
+{ 
+enum class E : unsigned; 
+}; 
+
+struct X; 
+
+template<> enum class Base<X>::E : unsigned { a, b }; 


Re: [C++ PATCH, RFC] PR c++/61491

2014-06-12 Thread Ville Voutilainen
On 13 June 2014 01:37, Ville Voutilainen wrote:
> DR1206 allows explicit specializations of member enumerations
> of class templates, so just remove the pedwarn about it.
>
> Tested on Linux-x64. Not bootstrapped.

Argh, also remove the old comment, new patch attached.
diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index d267a5c..507585f 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -966,12 +966,8 @@ maybe_process_partial_specialization (tree type)
 }
   else if (processing_specialization)
 {
-   /* Someday C++0x may allow for enum template specialization.  */
-  if (cxx_dialect > cxx98 && TREE_CODE (type) == ENUMERAL_TYPE
- && CLASS_TYPE_P (context) && CLASSTYPE_USE_TEMPLATE (context))
-   pedwarn (input_location, OPT_Wpedantic, "template specialization "
-"of %qD not allowed by ISO C++", type);
-  else
+  if (!(cxx_dialect > cxx98 && TREE_CODE (type) == ENUMERAL_TYPE
+   && CLASS_TYPE_P (context) && CLASSTYPE_USE_TEMPLATE (context)))
{
  error ("explicit specialization of non-template %qT", type);
  return error_mark_node;
diff --git a/gcc/testsuite/g++.dg/cpp0x/pr61491.C b/gcc/testsuite/g++.dg/cpp0x/pr61491.C
new file mode 100644
index 000..c105782
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/pr61491.C
@@ -0,0 +1,12 @@
+// { dg-do compile { target c++11 } }
+// { dg-options "-pedantic" }
+// DR 1206 (explicit specialization of a member enumeration of a class template)
+
+template <typename T> struct Base 
+{ 
+enum class E : unsigned; 
+}; 
+
+struct X; 
+
+template<> enum class Base<X>::E : unsigned { a, b }; 


Re: [Google] Fix AFDO early inline ICEs due to DFE

2014-06-12 Thread Rong Xu
This looks fine to me.

-Rong

On Thu, Jun 12, 2014 at 11:23 AM, Teresa Johnson wrote:
> These two patches fix multiple ICEs that occurred due to DFE being
> recently enabled after AutoFDO LIPO linking.
>
> Passes regression and internal testing. Ok for Google/4_8?
>
> Teresa
>
> 2014-06-12  Teresa Johnson  
> Dehao Chen  
>
> Google ref b/15521327.
>
> * cgraphclones.c (cgraph_clone_edge): Use resolved node.
> * l-ipo.c (resolve_cgraph_node): Resolve to non-removable node.
>
> Index: cgraphclones.c
> ===
> --- cgraphclones.c  (revision 211386)
> +++ cgraphclones.c  (working copy)
> @@ -94,6 +94,7 @@ along with GCC; see the file COPYING3.  If not see
>  #include "ipa-utils.h"
>  #include "lto-streamer.h"
>  #include "except.h"
> +#include "l-ipo.h"
>
>  /* Create clone of E in the node N represented by CALL_EXPR the callgraph.  */
>  struct cgraph_edge *
> @@ -118,7 +119,11 @@ cgraph_clone_edge (struct cgraph_edge *e, struct c
>
>if (call_stmt && (decl = gimple_call_fndecl (call_stmt)))
> {
> - struct cgraph_node *callee = cgraph_get_node (decl);
> +  struct cgraph_node *callee;
> +  if (L_IPO_COMP_MODE && cgraph_pre_profiling_inlining_done)
> +callee = cgraph_lipo_get_resolved_node (decl);
> +  else
> +callee = cgraph_get_node (decl);
>   gcc_checking_assert (callee);
>   new_edge = cgraph_create_edge (n, callee, call_stmt, count, freq);
> }
> Index: l-ipo.c
> ===
> --- l-ipo.c (revision 211386)
> +++ l-ipo.c (working copy)
> @@ -1542,6 +1542,18 @@ resolve_cgraph_node (struct cgraph_sym **slot, str
>gcc_assert (decl1_defined);
>add_define_module (*slot, decl2);
>
> +  /* Pick the node that cannot be removed, to avoid a situation
> + where we remove the resolved node and later try to access
> + it for the remaining non-removable copy.  E.g. one may be
> + extern and the other weak, only the extern copy can be removed.  */
> +  if (cgraph_can_remove_if_no_direct_calls_and_refs_p ((*slot)->rep_node)
> +  && !cgraph_can_remove_if_no_direct_calls_and_refs_p (node))
> +{
> +  (*slot)->rep_node = node;
> +  (*slot)->rep_decl = decl2;
> +  return;
> +}
> +
>has_prof1 = has_profile_info (decl1);
>bool is_aux1 = cgraph_is_auxiliary (decl1);
>bool is_aux2 = cgraph_is_auxiliary (decl2);
>
>
> --
> Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413
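
The selection rule added to resolve_cgraph_node can be sketched in isolation as follows. This is a simplified model, not the l-ipo.c code: `Node`, `Slot`, `can_remove` and `prefer_non_removable` are hypothetical stand-ins for the cgraph node, the cgraph_sym slot, the result of cgraph_can_remove_if_no_direct_calls_and_refs_p, and the new early-return branch.

```cpp
#include <cassert>

// Hypothetical stand-in for a call-graph node; can_remove models the result
// of the removability predicate used in the patch.
struct Node {
    bool can_remove;
};

// Hypothetical stand-in for the symbol slot holding the representative node.
struct Slot {
    Node* rep_node;
};

// If the current representative could be removed but the candidate could not,
// switch to the candidate so the representative outlives the other copies.
// Returns true when the representative was replaced.
bool prefer_non_removable(Slot& slot, Node& candidate) {
    if (slot.rep_node->can_remove && !candidate.can_remove) {
        slot.rep_node = &candidate;
        return true;
    }
    return false;
}
```

With an extern (removable) copy as the representative and a weak (non-removable) candidate, the slot switches to the weak copy; in the opposite arrangement it is left alone.
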


Re: [PATCH, Pointer Bounds Checker 35/x] Fix object size emitted for structures with flexible arrays

2014-06-12 Thread Ilya Enkovich
2014-06-12 11:55 GMT+04:00 Richard Biener:
> On Wed, Jun 11, 2014 at 6:08 PM, Ilya Enkovich wrote:
>> Hi,
>>
>> This patch fixes a problem with the size emitted for static structures with a 
>> flexible array.  I found a couple of trackers in Bugzilla for this problem, 
>> but all of them are marked as fixed and the problem still exists.
>>
>> For a simple testcase
>>
>> struct S { int a; int b[0]; } s = { 1, { 0, 0} };
>>
>> current trunk produces (no flags):
>>
>> .globl  s
>> .data
>> .align 4
>> .type   s, @object
>> .size   s, 4
>> s:
>> .long   1
>> .long   0
>> .long   0
>>
>> which has the wrong size for object s: .size reports 4 bytes, but 12 bytes are emitted.
>>
>> This problem is important for the checker because a wrong size leads to wrong 
>> bounds and false bounds violations.  The following patch uses DECL_SIZE_UNIT 
>> instead of the type size and works well for me.  Does it look OK?
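
The arithmetic behind the mismatch, as a small sketch. The helper `object_size` is hypothetical and not part of GCC; the 4-byte `int` is an assumption matching the `.long` directives in the assembly above.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not a GCC function): bytes occupied by an object whose
// type has size type_size and whose initializer supplies n trailing array
// elements of elem_size bytes each.
constexpr std::size_t object_size(std::size_t type_size,
                                  std::size_t n,
                                  std::size_t elem_size) {
    return type_size + n * elem_size;
}

// struct S { int a; int b[0]; } with 4-byte int: the type alone is 4 bytes,
// which is what the size of the type yields and what .size reported.
static_assert(object_size(4, 0, 4) == 4, "size of the type");

// s = { 1, { 0, 0 } } emits three .long values, so the object is 12 bytes,
// which is what a decl-based size is expected to yield.
static_assert(object_size(4, 2, 4) == 12, "size of the initialized object");
```

The 8-byte gap between the two values is exactly the region a bounds checker would wrongly treat as out of bounds.
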
>
> There is a bug about this in bugzilla somewhere.

I looked through bugzilla and found two trackers with a similar problem.

The first one is 57180, which is still open but carries a comment that the
problem is fixed on trunk. I checked, and the testcase really does pass on
trunk (should the tracker be closed then?).

The other one is 28865 (and a set of its duplicates). The interesting thing
here is that the original testcase uses an array of integers, while the
testcases added with the commit use arrays of chars. The original test is
still compiled wrongly. I also see that a patch very similar to the one I
posted was proposed as a solution, but it was reported to cause a
problem with glibc/nss/nss_files/files-init.c. There is a
corresponding testcase in the tracker that results in wrong padding when
the patch is applied, but that seems to be a different problem because I do
not see any problem when I use the mpx compiler branch for this testcase.

>
> It looks ok to me - did you test with all languages?  In particular did
> you test Ada?

I configured the compiler without disabling any language and then ran bootstrap
and make check. Does that mean all languages are covered? I will do more
testing if required.

Thanks,
Ilya

>
> Thanks,
> Richard.
>
>> Bootstrapped and tested on linux-x86_64.
>>
>> Thanks,
>> Ilya
>> --
>> gcc/
>>
>> 2014-06-11  Ilya Enkovich  
>>
>> * config/elfos.h (ASM_DECLARE_OBJECT_NAME): Use decl size
>> instead of type size.
>> (ASM_FINISH_DECLARE_OBJECT): Likewise.
>>
>>
>> diff --git a/gcc/config/elfos.h b/gcc/config/elfos.h
>> index c1d5553..7929708 100644
>> --- a/gcc/config/elfos.h
>> +++ b/gcc/config/elfos.h
>> @@ -313,7 +313,7 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
>>   && (DECL) && DECL_SIZE (DECL))\
>> {   \
>>   size_directive_output = 1;\
>> - size = int_size_in_bytes (TREE_TYPE (DECL));  \
>> + size = tree_to_uhwi (DECL_SIZE_UNIT (DECL));  \
>>   ASM_OUTPUT_SIZE_DIRECTIVE (FILE, NAME, size); \
>> }   \
>> \
>> @@ -341,7 +341,7 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
>>   && !size_directive_output)\
>> {   \
>>   size_directive_output = 1;\
>> - size = int_size_in_bytes (TREE_TYPE (DECL));  \
>> + size = tree_to_uhwi (DECL_SIZE_UNIT (DECL));  \
>>   ASM_OUTPUT_SIZE_DIRECTIVE (FILE, name, size); \
>> }   \
>>  }  \


Re: ipa-visibility TLC 2/n

2014-06-12 Thread Jan Hubicka
> >
> > Comdat locals are now used by ipa-comdats, for thunks and for decloned ctors.
> > We probably need to figure out a more precise description of the Solaris
> > limitation and either fix it or add a way for the target to say which kinds
> > of comdat locals are not supported.
> 
> Right.  I'll start reghunting for the patch that caused additional
> breakage even without comdat, as on Solaris 10.

Good, at least one bug is off my radar.
I was thinking about the ipa-comdats issue and remembered an older problem: I
wanted to place thunks before the function (to avoid the need to jump back to
the body), and that caused problems for you too, since the Solaris assembler
apparently refused to have anything other than the main comdat group symbol
defined first.

Perhaps we are running into similar issues?  Do you know precisely what the
restrictions are here?  (We do, for example, produce comdat groups that do not
contain the symbol the group is named after, so I do not see how the main
symbol name is significant.)

IPA-comdat brings extra symbols into the comdat group and pays no attention to
the order, so perhaps this is causing the issue.  We may add some logic to
assemble_functions to fix the order, or work out why this breaks.
> 
> > Can I reproduce your setup on the compile farm?
> 
> According to https://gcc.gnu.org/wiki/CompileFarm, there are no Solaris
> machines or VMs in the compile farm.  If a VM could be set up (no idea
> if they allow non-free OSes beyond AIX there), I'd suggest starting with
> Solaris 11.2 Beta
> (http://www.oracle.com/technetwork/server-storage/solaris11/downloads/beta-2182939.html),
> which has the latest in /bin/ld support.  I can certainly help with
> setting something up.

It would be nice to have a non-free OS for testing.  Comdats and aliases seem to be
riddled with implementation bugs, and it would be nice to have a way to test for those.

Honza
> 
>   Rainer
> 
> -- 
> -
> Rainer Orth, Center for Biotechnology, Bielefeld University


Re: [PATCH] Trust TREE_ADDRESSABLE

2014-06-12 Thread Jan Hubicka
> 
> When you extract the address and use it.  For example when you
> do auto-parallelization and outline a part of your function it
> passes arrays as addresses.
> 
> Or if you start to introduce address induction variables like
> the vectorizer or IVOPTs does.
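
A source-level analogy of the outlining case, as a sketch only. The real transformation happens on GIMPLE inside the compiler; `outlined_body` and `sum_doubled` are hypothetical names used to show how an array that was only indexed directly ends up being passed by address.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical outlined loop body: the array now arrives as an address.
static void outlined_body(int* data, std::size_t begin, std::size_t end) {
    for (std::size_t i = begin; i < end; ++i)
        data[i] *= 2;
}

// Before outlining, the loop would index `a` directly and its address would
// never be taken. After outlining, the address of `a` is passed to a call.
int sum_doubled() {
    int a[4] = {1, 2, 3, 4};
    outlined_body(a, 0, 2);  // first chunk of the iteration space
    outlined_body(a, 2, 4);  // second chunk
    return a[0] + a[1] + a[2] + a[3];
}
```
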

I see, so this is nothing really done by the current early/IPA optimizers, and in
those cases we also want to set the TREE_ADDRESSABLE bit, I suppose.
Do you think I should make a patch for setting the NOVOPS bits in the IPA code?

Honza
> 
> Richard.


Re: [Patch] Change URL in commit emails to https

2014-06-12 Thread Gerald Pfeifer

On Mon, 12 May 2014, Tobias Burnus wrote:

The patch changes the URL shown in the release message to HTTPS. (Cf.
https://gcc.gnu.org/viewcvs/gcc/hooks/svnmailer.conf and gcc-cvs mailing
list.)


Yes, please.  Thanks!

Gerald