Re: Turn DECL_SECTION_NAME into string
On Thu, Jun 12, 2014 at 6:33 AM, Jan Hubicka wrote:
> Hi,
> this lengthy patch does the legwork to move section names out of the tree
> representation.
> Originally they were STRING_CST.  I ended up implementing an on-side
> reference counted string vocabulary that is done in a bit baroque way to be
> GGC and PCH safe (uff).
> The memory savings on Firefox are about 60MB, because while reading the
> symbol table we now unify the many duplicated comdat group strings and also
> we free them after we bring those local.
>
> The old representation probably made sense when most of the strings came via
> the __section__ attribute, where they were readily parsed as string
> constants.

I wonder why you didn't use IDENTIFIER_NODEs?  (ok, still trees ...)
At least those are already GGC and PCH safe.

Richard.

> Bootstrapped/regtested x86_64-linux, committed.
>
> Honza
>
> * symtab.c (section_hash): New hash.
> (symtab_unregister_node): Clear section before freeing.
> (hash_section_hash_entry): New hasher.
> (eq_sections): New function.
> (symtab_node::set_section_for_node): New method.
> (set_section_1): Update.
> (symtab_node::set_section): Take string instead of tree as parameter.
> (symtab_resolve_alias): Update.
> * cgraph.h (section_hash_entry_d): New structure.
> (section_hash_entry): New typedef.
> (cgraph_node): Change comdat_group_ to x_comdat_group,
> change section_ to x_section and turn into section_hash_entry;
> update accessors; put set_section_for_node offline.
> * tree.c (decl_section_name): Turn into string.
> (set_decl_section_name): Change parameter to be string.
> * tree.h (decl_section_name, set_decl_section_name): Update
> prototypes.
> * sdbout.c (sdbout_one_type): Update.
> * tree-vect-data-refs.c (vect_can_force_dr_alignment_p): Update.
> * varasm.c (IN_NAMED_SECTION, get_named_section, > resolve_unique_section, > hot_function_section, get_named_text_section, > USE_SELECT_SECTION_FOR_FUNCTIONS, > default_function_rodata_section, make_decl_rtl, > default_unique_section): > Update. > * config/c6x/c6x.c (c6x_in_small_data_p): Update. > (c6x_elf_unique_section): Update. > * config/nios2/nios2.c (nios2_in_small_data_p): Update. > * config/pa/pa.c (pa_function_section): Update. > * config/pa/pa.h (IN_NAMED_SECTION_P): Update. > * config/ia64/ia64.c (ia64_in_small_data_p): Update. > * config/arc/arc.c (arc_in_small_data_p): Update. > * config/arm/unknown-elf.h (IN_NAMED_SECTION_P): Update. > * config/mcore/mcore.c (mcore_unique_section): Update. > * config/mips/mips.c (mips16_build_function_stub): Update. > (mips16_build_call_stub): Update. > (mips_function_rodata_section): Update. > (mips_in_small_data_p): Update. > * config/score/score.c (score_in_small_data_p): Update. > * config/rx/rx.c (rx_in_small_data): Update. > * config/rs6000/rs6000.c (rs6000_elf_in_small_data_p): Update. > (rs6000_xcoff_asm_named_section): Update. > (rs6000_xcoff_unique_section): Update. > * config/frv/frv.c (frv_string_begins_with): Update. > (frv_in_small_data_p): Update. > * config/v850/v850.c (v850_encode_data_area): Update. > * config/bfin/bfin.c (DECL_SECTION_NAME): Update. > (bfin_handle_l1_data_attribute): Update. > (bfin_handle_l2_attribute): Update. > * config/mep/mep.c (mep_unique_section): Update. > * config/microblaze/microblaze.c (microblaze_elf_in_small_data_p): > Update. > * config/h8300/h8300.c (h8300_handle_eightbit_data_attribute): Update. > (h8300_handle_tiny_data_attribute): Update. > * config/m32r/m32r.c (m32r_in_small_data_p): Update. > (m32r_in_small_data_p): Update. > * config/alpha/alpha.c (alpha_in_small_data_p): Update. > * config/i386/i386.c (ix86_in_large_data_p): Update. > * config/i386/winnt.c (i386_pe_unique_section): Update. > * config/darwin.c (darwin_function_section): Update. 
> * config/lm32/lm32.c (lm32_in_small_data_p): Update. > * tree-emutls.c (get_emutls_init_templ_addr): Update. > (new_emutls_decl): Update. > * lto-cgraph.c (lto_output_node, input_node, input_varpool_node, > input_varpool_node): Update. > (ead_string_cst): Turn to ... > (read_string): ... this one. > * dwarf2out.c (secname_for_decl): Update. > * asan.c (asan_protect_global): Update. > > * c-family/c-common.c (handle_section_attribute): Update handling for > section names that are no longer trees. > > * java/class.c (build_utf8_ref): Update handling for section names > that are no longer trees. > (emit_register_classes_in_jcr_section): Update. > > * vtable-class
Re: [PATCH, Pointer Bounds Checker 35/x] Fix object size emitted for structures with flexible arrays
On Wed, Jun 11, 2014 at 6:08 PM, Ilya Enkovich wrote:
> Hi,
>
> This patch fixes a problem with the size emitted for static structures with
> a flexible array.  I found a couple of trackers in bugzilla for this problem
> but all of them are marked as fixed and the problem still exists.
>
> For a simple testcase
>
> struct S { int a; int b[0]; } s = { 1, { 0, 0} };
>
> current trunk produces (no flags):
>
> .globl s
> .data
> .align 4
> .type s, @object
> .size s, 4
> s:
> .long 1
> .long 0
> .long 0
>
> which has the wrong size for object s.
>
> This problem is important for the checker because a wrong size leads to
> wrong bounds and false bounds violations.  The following patch uses
> DECL_SIZE_UNIT instead of the type size and works well for me.  Does it
> look OK?

There is a bug about this in bugzilla somewhere.  It looks ok to me - did
you test with all languages?  In particular did you test Ada?

Thanks,
Richard.

> Bootstrapped and tested on linux-x86_64.
>
> Thanks,
> Ilya
> --
> gcc/
>
> 2014-06-11 Ilya Enkovich
>
> * config/elfos.h (ASM_DECLARE_OBJECT_NAME): Use decl size
> instead of type size.
> (ASM_FINISH_DECLARE_OBJECT): Likewise.
>
>
> diff --git a/gcc/config/elfos.h b/gcc/config/elfos.h
> index c1d5553..7929708 100644
> --- a/gcc/config/elfos.h
> +++ b/gcc/config/elfos.h
> @@ -313,7 +313,7 @@ see the files COPYING3 and COPYING.RUNTIME respectively. If not, see
>  && (DECL) && DECL_SIZE (DECL))\
>  { \
>  size_directive_output = 1;\
> - size = int_size_in_bytes (TREE_TYPE (DECL)); \
> + size = tree_to_uhwi (DECL_SIZE_UNIT (DECL)); \
>  ASM_OUTPUT_SIZE_DIRECTIVE (FILE, NAME, size); \
>  } \
>  \
> @@ -341,7 +341,7 @@ see the files COPYING3 and COPYING.RUNTIME respectively. If not, see
>  && !size_directive_output)\
>  { \
>  size_directive_output = 1;\
> - size = int_size_in_bytes (TREE_TYPE (DECL)); \
> + size = tree_to_uhwi (DECL_SIZE_UNIT (DECL)); \
>  ASM_OUTPUT_SIZE_DIRECTIVE (FILE, name, size); \
>  } \
>  } \
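
The gap between type size and object size in the testcase can be sketched in plain C. This is only an illustration, not part of the patch: it uses a C99 flexible array member instead of the `int b[0]` GNU extension, and `object_size` is a hypothetical helper name.

```c
#include <stddef.h>

/* Same shape as the testcase, written with a C99 flexible array member.  */
struct S { int a; int b[]; };

/* Bytes occupied by an object of type struct S that carries N trailing
   elements.  This is the value the .size directive should report for a
   statically initialized object; sizeof (struct S) ignores the tail.  */
size_t
object_size (size_t n)
{
  return offsetof (struct S, b) + n * sizeof (int);
}
```

For the object in the testcase (two trailing elements) this gives three ints, whereas `sizeof (struct S)` covers only the leading `int a`, which is where the `.size s, 4` above comes from on a target with 4-byte int.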
Re: fix math wrt volatile-bitfields vs C++ model
On Wed, Jun 11, 2014 at 11:35 PM, DJ Delorie wrote:
>
> If the combined bitfields are exactly the size of the mode, the logic
> for detecting range overflow is flawed - it calculates an ending
> "position" that's the position of the first bit in the next field.
>
> In the case of "short" for example, you get "16 > 15" without this
> patch (comparing size to position), and "15 > 15" with (comparing
> position to position).
>
> Ok to apply?

Looks ok to me, but can you add a testcase please?  Also check if 4.9 is
affected.

Thanks,
Richard.

> * expmed.c (strict_volatile_bitfield_p): Fix off-by-one error.
>
> Index: expmed.c
> ===
> --- expmed.c (revision 211479)
> +++ expmed.c (working copy)
> @@ -472,13 +472,13 @@ strict_volatile_bitfield_p (rtx op0, uns
>        && bitnum % GET_MODE_ALIGNMENT (fieldmode) + bitsize > modesize))
>      return false;
>
>    /* Check for cases where the C++ memory model applies.  */
>    if (bitregion_end != 0
>        && (bitnum - bitnum % modesize < bitregion_start
> -          || bitnum - bitnum % modesize + modesize > bitregion_end))
> +          || bitnum - bitnum % modesize + modesize - 1 > bitregion_end))
>      return false;
>
>    return true;
>  }
>
>  /* Return true if OP is a memory and if a bitfield of size BITSIZE at
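
The "16 > 15" versus "15 > 15" comparison can be reproduced in isolation. The sketch below is a simplified model of the range check only: both function names are made up for illustration, and the `bitregion_end != 0` guard from the real code is left out.

```c
#include <stdbool.h>

/* Patched logic: compare the position of the LAST bit of the
   MODESIZE-bit access containing BITNUM against REGION_END, which is
   itself the position of the last bit of the region (inclusive).  */
bool
access_fits_region (unsigned bitnum, unsigned modesize,
                    unsigned region_start, unsigned region_end)
{
  unsigned first = bitnum - bitnum % modesize;  /* first bit accessed */
  unsigned last = first + modesize - 1;         /* last bit accessed */
  return first >= region_start && last <= region_end;
}

/* Pre-patch logic: compares the position one past the end of the
   access, so an access that exactly fills the region is rejected.  */
bool
access_fits_region_old (unsigned bitnum, unsigned modesize,
                        unsigned region_start, unsigned region_end)
{
  unsigned first = bitnum - bitnum % modesize;
  return first >= region_start && first + modesize <= region_end;
}
```

With a 16-bit mode and a region covering bits 0 to 15, the old check evaluates 16 > 15 and rejects the access, while the patched check evaluates 15 > 15 and accepts it.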
Re: [PATCH] PR rtl-optimization/61047
> This patch tries to get safe lower and upper bounds where accesses
> are always guaranteed to work.  The goal is not to penalize
> reasonably written code: when bootstrapping the whole of GCC,
> only a few places were found where this new check triggers.
>
> Bootstrapped and regression-tested on x86_64-linux-gnu.
> Additionally built a cross compiler for a stack-grows-upward target
> (xstormy16-elf).
>
> Ok for trunk?

No, that's far too complicated a change for such a dumb artificial testcase.

I have suspended the PR.  I'd suggest concentrating on bug reports for
real-life software and/or new features, this would IMO be a better use of the
time you devote to GCC.

--
Eric Botcazou
Re: [PATCH] PR rtl-optimization/61047
On Thu, Jun 12, 2014 at 10:03 AM, Eric Botcazou wrote:
>> This patch tries to get safe lower and upper bounds where accesses
>> are always guaranteed to work.  The goal is not to penalize
>> reasonably written code: when bootstrapping the whole of GCC,
>> only a few places were found where this new check triggers.
>>
>> Bootstrapped and regression-tested on x86_64-linux-gnu.
>> Additionally built a cross compiler for a stack-grows-upward target
>> (xstormy16-elf).
>>
>> Ok for trunk?
>
> No, that's far too complicated a change for such a dumb artificial testcase.
>
> I have suspended the PR.  I'd suggest concentrating on bug reports for
> real-life software and/or new features, this would IMO be a better use of
> the time you devote to GCC.

Btw, I wonder if we can simply mark the MEMs generated from spill code
with MEM_NOTRAP_P so we can remove the special casing of
frame-pointer-based addresses from add while properly initializing
MEM_NOTRAP_P from rtx_addr_can_trap_p?

I suppose it was added exactly to cover spill code?

Otherwise I agree with Eric.

Richard.

> --
> Eric Botcazou
Re: Turn DECL_SECTION_NAME into string
> On Thu, Jun 12, 2014 at 6:33 AM, Jan Hubicka wrote:
> > Hi,
> > this lengthy patch does the legwork to move section names out of the tree
> > representation.
> > Originally they were STRING_CST.  I ended up implementing an on-side
> > reference counted string vocabulary that is done in a bit baroque way to
> > be GGC and PCH safe (uff).
> > The memory savings on Firefox are about 60MB, because while reading the
> > symbol table we now unify the many duplicated comdat group strings and
> > also we free them after we bring those local.
> >
> > The old representation probably made sense when most of the strings came
> > via the __section__ attribute, where they were readily parsed as string
> > constants.
>
> I wonder why you didn't use IDENTIFIER_NODEs?  (ok, still trees ...)
> At least those are already GGC and PCH safe.

To be able to discard it effectively during LTO by ref counting.
IDENTIFIER_NODEs make sense for assembler names (sorta) since they may match
an identifier, and thus also for COMDAT_GROUPS that are taken from assembler
names.  Section names do not match those, so having a separate pool for them
seemed to work best.

What happens at LTO is that we read all the sections for comdat groups and
then ipa-visibility dismantles them.

Anyway, it is now hidden by the API, so we can change it easily.

Honza

>
> Richard.
>
> > Bootstrapped/regtested x86_64-linux, committed.
> >
> > Honza
> >
> > * symtab.c (section_hash): New hash.
> > (symtab_unregister_node): Clear section before freeing.
> > (hash_section_hash_entry): New hasher.
> > (eq_sections): New function.
> > (symtab_node::set_section_for_node): New method.
> > (set_section_1): Update.
> > (symtab_node::set_section): Take string instead of tree as
> > parameter.
> > (symtab_resolve_alias): Update.
> > * cgraph.h (section_hash_entry_d): New structure.
> > (section_hash_entry): New typedef.
> > (cgraph_node): Change comdat_group_ to x_comdat_group, > > change section_ to x_section and turn into section_hash_entry; > > update accestors; put set_section_for_node offline. > > * tree.c (decl_section_name): Turn into string. > > (set_decl_section_name): Change parameter to be string. > > * tree.h (decl_section_name, set_decl_section_name): Update > > prototypes. > > * sdbout.c (sdbout_one_type): Update. > > * tree-vect-data-refs.c (vect_can_force_dr_alignment_p): Update. > > * varasm.c (IN_NAMED_SECTION, get_named_section, > > resolve_unique_section, > > hot_function_section, get_named_text_section, > > USE_SELECT_SECTION_FOR_FUNCTIONS, > > default_function_rodata_section, make_decl_rtl, > > default_unique_section): > > Update. > > * config/c6x/c6x.c (c6x_in_small_data_p): Update. > > (c6x_elf_unique_section): Update. > > * config/nios2/nios2.c (nios2_in_small_data_p): Update. > > * config/pa/pa.c (pa_function_section): Update. > > * config/pa/pa.h (IN_NAMED_SECTION_P): Update. > > * config/ia64/ia64.c (ia64_in_small_data_p): Update. > > * config/arc/arc.c (arc_in_small_data_p): Update. > > * config/arm/unknown-elf.h (IN_NAMED_SECTION_P): Update. > > * config/mcore/mcore.c (mcore_unique_section): Update. > > * config/mips/mips.c (mips16_build_function_stub): Update. > > (mips16_build_call_stub): Update. > > (mips_function_rodata_section): Update. > > (mips_in_small_data_p): Update. > > * config/score/score.c (score_in_small_data_p): Update. > > * config/rx/rx.c (rx_in_small_data): Update. > > * config/rs6000/rs6000.c (rs6000_elf_in_small_data_p): Update. > > (rs6000_xcoff_asm_named_section): Update. > > (rs6000_xcoff_unique_section): Update. > > * config/frv/frv.c (frv_string_begins_with): Update. > > (frv_in_small_data_p): Update. > > * config/v850/v850.c (v850_encode_data_area): Update. > > * config/bfin/bfin.c (DECL_SECTION_NAME): Update. > > (bfin_handle_l1_data_attribute): Update. > > (bfin_handle_l2_attribute): Update. 
> > * config/mep/mep.c (mep_unique_section): Update. > > * config/microblaze/microblaze.c (microblaze_elf_in_small_data_p): > > Update. > > * config/h8300/h8300.c (h8300_handle_eightbit_data_attribute): > > Update. > > (h8300_handle_tiny_data_attribute): Update. > > * config/m32r/m32r.c (m32r_in_small_data_p): Update. > > (m32r_in_small_data_p): Update. > > * config/alpha/alpha.c (alpha_in_small_data_p): Update. > > * config/i386/i386.c (ix86_in_large_data_p): Update. > > * config/i386/winnt.c (i386_pe_unique_section): Update. > > * config/darwin.c (darwin_function_section): Update. > > * config/lm32/lm32.c (lm32_in_small_data_p): Update. > > * tree-emutls.c (get_em
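
The idea of a separate reference-counted string pool, as opposed to IDENTIFIER_NODEs that live until garbage collection, can be sketched as follows. This is a minimal stand-alone model under stated assumptions and not the code from symtab.c: it uses plain malloc rather than GGC, ignores PCH, and every name in it (`pool_entry`, `pool_ref`, `pool_unref`, `pool_count`) is invented for the example.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define POOL_BUCKETS 64

/* One interned string together with its reference count.  */
struct pool_entry
{
  struct pool_entry *next;
  unsigned refs;
  char *name;
};

static struct pool_entry *pool[POOL_BUCKETS];

static unsigned
hash_name (const char *s)
{
  unsigned h = 5381;
  while (*s)
    h = h * 33 + (unsigned char) *s++;
  return h % POOL_BUCKETS;
}

/* Return the shared entry for NAME, creating it on first use.
   Duplicate names are unified into a single entry.  */
struct pool_entry *
pool_ref (const char *name)
{
  unsigned b = hash_name (name);
  struct pool_entry *e;

  for (e = pool[b]; e; e = e->next)
    if (strcmp (e->name, name) == 0)
      {
        e->refs++;
        return e;
      }

  e = malloc (sizeof *e);
  if (!e)
    return NULL;
  e->name = malloc (strlen (name) + 1);
  if (!e->name)
    {
      free (e);
      return NULL;
    }
  strcpy (e->name, name);
  e->refs = 1;
  e->next = pool[b];
  pool[b] = e;
  return e;
}

/* Drop one reference; the string is freed as soon as the last user
   goes away.  */
void
pool_unref (struct pool_entry *e)
{
  struct pool_entry **p;

  assert (e && e->refs > 0);
  if (--e->refs > 0)
    return;
  for (p = &pool[hash_name (e->name)]; *p != e; p = &(*p)->next)
    ;
  *p = e->next;
  free (e->name);
  free (e);
}

/* Number of distinct strings currently alive in the pool.  */
unsigned
pool_count (void)
{
  unsigned n = 0;
  for (unsigned b = 0; b < POOL_BUCKETS; b++)
    for (struct pool_entry *e = pool[b]; e; e = e->next)
      n++;
  return n;
}
```

In this model, many symbols naming the same comdat group share one entry, and once the groups are dismantled and every user has called `pool_unref`, the memory is returned immediately.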
Re: [PATCH, PR52252] Alternative way of vectorization for load groups of size 2 and 3.
On Thu, Jun 12, 2014 at 6:04 AM, Evgeny Stupachenko wrote: > Testing finished. No new regressions. > Is the following patch ok? + if (targetm.sched.reassociation_width (VEC_PERM_EXPR, mode) > 1 || + !vect_shift_permute_load_chain (dr_chain, size, stmt, gsi, &result_chain)) ||s and &&s go to the next line. I miss testcases that make sure the vectorizer/backend code-paths are both exercised. Put them in gcc.target/i386 and provide an appropriate -march. The vectorizer changes are ok with the above fixed, I defer to backend maintainers for the i386 changes. Richard. > 2014-06-11 Evgeny Stupachenko > > * config/i386/i386.c (ix86_reassociation_width): Add alternative for > vector case. > * config/i386/i386.h (TARGET_VECTOR_PARALLEL_EXECUTION): New. > * config/i386/x86-tune.def (X86_TUNE_VECTOR_PARALLEL_EXECUTION): New. > * tree-vect-data-refs.c (vect_shift_permute_load_chain): New. > Introduces alternative way of loads group permutaions. > (vect_transform_grouped_load): Try alternative way of permutations. > > Thanks, > Evgeny > > On Tue, Jun 10, 2014 at 4:43 PM, Evgeny Stupachenko > wrote: >> ix86_reassociation_width checks INTEGRAL_MODE_P and FLOAT_MODE_P which >> include vector mode. >> I'll try to separate this into scalar and vector part, but it will >> require more testing (under the testing now). >> What about the rest of the patch? >> >> Thanks, >> Evgeny >> >> On Thu, Jun 5, 2014 at 3:54 PM, Ramana Radhakrishnan >> wrote: >>> On 06/05/14 12:43, Evgeny Stupachenko wrote: New hook is related to vector instructions only. Vector instructions could be sequential in pipeline, but scalar - parallel. For x86 architectures TARGET_SCHED_REASSOC_WIDTH does not give required differentiation. General hooks could be potentially reused in other algorithms/by other architectures. >>> >>> >>> It already takes a "mode" argument. Couldn't you use a vector mode to work >>> this out ? 
>>> >>> If it is not enough then please be more specific about the documentation of >>> this hook about where it is useful so that it's easy for people reading the >>> documentation to understand at a glance what purpose it serves. >>> >>> >>> Ramana >>> >>> Thanks, Evgeny On Thu, Jun 5, 2014 at 2:04 PM, Ramana Radhakrishnan wrote: > > On Wed, May 28, 2014 at 2:09 PM, Evgeny Stupachenko > wrote: >> >> Hi, >> >> The patch introduces alternative way of permutations for load groups >> of size 2 and 3 which should be faster on architectures with low >> parallelism. >> The patch gives 2 times gain on Silvermont to the test from PR52252 >> (in addition to already committed 3 times gain). >> >> Patch passes bootstrap on x86. Make check is in progress. > > > Why do we need a new hook ? Can't you derive this information from > something which is equally badly named TARGET_SCHED_REASSOC_WIDTH > though used in the reassociation logic but also serves a similar > purpose ? > > Also the documentation of this hook is incomplete at best and wrong at > worst as this is not applied everywhere in the vectorizer but just for > this special case for load store permuting. Implying this is useful > everywhere in the vectorizer does not appear to be correct. > > regards > Ramana > > > > >> >> ChangeLog: >> >> 2014-05-28 Evgeny Stupachenko >> >> * config/i386/i386.c (ix86_have_vector_parallel_execution): >> New. >> (TARGET_VECTORIZE_HAVE_VECTOR_PARALLEL_EXECUTION): New. >> * config/i386/i386.h (TARGET_VECTOR_PARALLEL_EXECUTION): New. >> * config/i386/x86-tune.def >> (X86_TUNE_VECTOR_PARALLEL_EXECUTION): New. >> * target.def (have_vector_parallel_execution): New. >> * doc/tm.texi.in (have_vector_parallel_execution)): New. >> * doc/tm.texi: Regenerate. >> * targhooks.c (default_have_vector_parallel_execution): New. >> * tree-vect-data-refs.c (vect_shift_permute_load_chain): New. >> Introduces alternative way of loads group permutaions. 
>> (vect_transform_grouped_load): Try alternative way of >> permutaions. >> >> Evgeny >>>
Re: [PATCH] Trust TREE_ADDRESSABLE
> If we want to give frontends a way to pass information that address of a
> given global object is not taken (apparently useful for Ada and its alias
> attribute), then I do not think we are looking for middle-end only
> solution.

I don't feel very comfortable with doing that in Ada, since everybody seems to
be thinking that TREE_PUBLIC/DECL_EXTERNAL objects are implicitly addressable
(see for example Steven's reasoning in an earlier message).

> If we really do not want to revisit TREE_ADDRESSABLE in frontends, we can do
> the following:
> 1) change semantics of addressable flag on global variables in a way
> Richard did, document it is initialized only after symbol table is built
> 2) add code to cgraph construction to set TREE_ADDRESSABLE on every global
> variable it sees.
> IPA visibility is run before early optimizations.  I suppose we can set
> it there.  I.e. in function_and_variable_visibility whenever we set
> externally_visible and we have !in_lto_p
> It is bit of hack.
> 3) perhaps add some way to avoid 2) on objects we want - apparently we now
> have DECL_NONALIASED that may be useful for this.

Then how about using DECL_NONALIASED instead of TREE_ADDRESSABLE to achieve
the initial goal here?  That is to say, may_be_aliased tests DECL_NONALIASED
for TREE_PUBLIC/DECL_EXTERNAL DECLs and the LTO front-end sets it properly.

--
Eric Botcazou
Re: [PATCH] Trust TREE_ADDRESSABLE
On Thu, 12 Jun 2014, Eric Botcazou wrote:
> > If we want to give frontends a way to pass information that address of a
> > given global object is not taken (apparently useful for Ada and its alias
> > attribute), then I do not think we are looking for middle-end only
> > solution.
>
> I don't feel very comfortable with doing that in Ada, since everybody seems
> to be thinking that TREE_PUBLIC/DECL_EXTERNAL objects are implicitly
> addressable (see for example Steven's reasoning in an earlier message).
>
> > If we really do not want to revisit TREE_ADDRESSABLE in frontends, we can
> > do the following:
> > 1) change semantics of addressable flag on global variables in a way
> > Richard did, document it is initialized only after symbol table is built
> > 2) add code to cgraph construction to set TREE_ADDRESSABLE on every global
> > variable it sees.
> > IPA visibility is run before early optimizations.  I suppose we can set
> > it there.  I.e. in function_and_variable_visibility whenever we set
> > externally_visible and we have !in_lto_p
> > It is bit of hack.
> > 3) perhaps add some way to avoid 2) on objects we want - apparently we now
> > have DECL_NONALIASED that may be useful for this.
>
> Then how about using DECL_NONALIASED instead of TREE_ADDRESSABLE to achieve
> the initial goal here?  That is to say, may_be_aliased tests DECL_NONALIASED
> for TREE_PUBLIC/DECL_EXTERNAL DECLs and the LTO front-end sets it properly.

Btw, may_be_aliased already does that.  So yes, when LTO promotes sth
from non-public to public but hidden visibility and TREE_ADDRESSABLE
was not set it could set DECL_NONALIASED.  That would at least preserve
the aliasing behavior we have without using LTO.  If the resolution info
from the linker allows us to make initial public variables hidden
_and_ some LTO IPA pass proves that the variable's address is not taken
then that pass can set DECL_NONALIASED as well.

Of course one issue is that it's impossible to write a verifier that
checks whether DECL_NONALIASED and TREE_ADDRESSABLE are "out-of-sync"
(because by design they can be).  So it's a bit more fragile
(we could make the operand scanner that "updates" TREE_ADDRESSABLE
also unset DECL_NONALIASED of course).

Richard.
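
How the two flags interact can be modelled with a toy predicate. This is a deliberately simplified sketch and not GCC's may_be_aliased: it only covers the flags named in this thread, and `toy_decl` and `toy_may_be_aliased` are made-up names.

```c
#include <stdbool.h>

/* Toy stand-in for a variable declaration.  The fields mirror the flag
   names used in the thread; this is not GCC's tree representation.  */
struct toy_decl
{
  bool is_public;    /* models TREE_PUBLIC */
  bool is_external;  /* models DECL_EXTERNAL */
  bool addressable;  /* models TREE_ADDRESSABLE */
  bool nonaliased;   /* models DECL_NONALIASED */
};

/* A variable may be aliased if it is visible outside the unit or has
   its address taken, unless it is a global that was explicitly marked
   as non-aliased.  */
bool
toy_may_be_aliased (const struct toy_decl *d)
{
  bool global = d->is_public || d->is_external;

  if (!global && !d->addressable)
    return false;
  if (global && d->nonaliased)
    return false;
  return true;
}
```

Note that in this model a global with both `nonaliased` and `addressable` set is still reported as not aliased, which is the "out-of-sync" situation a verifier cannot rule out.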
Re: [PATCH] PR rtl-optimization/61047
> Btw, I wonder if we can simply mark the MEMs generated from spill code
> with MEM_NOTRAP_P so we can remove the special casing of
> frame-pointer-based addresses from add while properly initializing
> MEM_NOTRAP_P from rtx_addr_can_trap_p?

Spill code generated by the compiler itself?  That's quite restrictive.

> I suppose it was added exactly to cover spill code?

Nope, it was added for jump tables:

2003-04-22  Richard Henderson

        PR 8866
        * rtl.h (MEM_NOTRAP_P): New.
        (MEM_COPY_ATTRIBUTES): Copy it.
        * rtlanal.c (may_trap_p): Check it.
        * expr.c (do_tablejump): Set it.
        * doc/rtl.texi (Flags): Document it.

        * cfgrtl.c (try_redirect_by_replacing_jump): Revert last three changes.

that is to say, for memory accesses that can nominally trap but for which we
know that they actually don't.

--
Eric Botcazou
RE: [PATCH] PR rtl-optimization/61047
On Thu, 12 Jun 2014 10:36:25, Eric Botcazou wrote:
>
>> Btw, I wonder if we can simply mark the MEMs generated from spill code
>> with MEM_NOTRAP_P so we can remove the special casing of
>> frame-pointer-based addresses from add while properly initializing
>> MEM_NOTRAP_P from rtx_addr_can_trap_p?
>
> Spill code generated by the compiler itself?  That's quite restrictive.
>
>> I suppose it was added exactly to cover spill code?
>
> Nope, it was added for jump tables:
>
> 2003-04-22  Richard Henderson
>
>         PR 8866
>         * rtl.h (MEM_NOTRAP_P): New.
>         (MEM_COPY_ATTRIBUTES): Copy it.
>         * rtlanal.c (may_trap_p): Check it.
>         * expr.c (do_tablejump): Set it.
>         * doc/rtl.texi (Flags): Document it.
>
>         * cfgrtl.c (try_redirect_by_replacing_jump): Revert last three changes.
>
> that is to say, for memory accesses that can nominally trap but for which we
> know that they actually don't.
>
> --
> Eric Botcazou

Btw, I am not at all sure why argp-references can never be dangerous.
For instance, in a struct with an array inside, passed as a function argument?

Bernd.
[RFC] Teaching SCC merging about unit local trees
Richard,
as briefly discussed before, I would like to teach LTO type merging to not
merge types that were declared in anonymous namespaces and use C++ ODR type
names (stored in DECL_ASSEMBLER_NAME of the TYPE_DECL) to break down canonical
types by their names.

The first thing I need to arrange IMO is to not merge two anonymous types from
two different units.  While looking into it I noticed that the current code in
unify_scc that refuses to merge local decls produces conflicts and seems a
useless exercise to do.

This patch introduces a special hash code 1 that specifies that a given SCC is
known to be local and should bypass the merging logic.  This is propagated
down and seems to quite noticeably reduce the size of the SCC hash:

[WPA] read 10190717 SCCs of average size 1.980409
[WPA] 20181785 tree bodies read in total
[WPA] tree SCC table: size 4194301, 1882700 elements, collision ratio: 0.815497
[WPA] tree SCC max chain length 140 (size 1)
[WPA] Compared 3392363 SCCs, 2718822 collisions (0.801454)
[WPA] Merged 3314075 SCCs
[WPA] Merged 9693632 tree bodies
[WPA] Merged 2467704 types
[WPA] 1783262 types prevailed (4491218 associated trees)
[WPA] GIMPLE canonical type table: size 131071, 94867 elements, 1783347 searches, 737056 collisions (ratio: 0.413299)
[WPA] GIMPLE canonical type pointer-map: 94867 elements, 3973875 searches
[WPA] Compression: 282828785 input bytes, 831186147 uncompressed bytes (ratio: 2.938832)
[WPA] Size of mmap'd section decls: 282828785 bytes

to:

[WPA] read 10172291 SCCs of average size 1.982162
[WPA] 20163124 tree bodies read in total
[WPA] tree SCC table: size 2097143, 988764 elements, collision ratio: 0.684967
[WPA] tree SCC max chain length 140 (size 1)
[WPA] Compared 3060932 SCCs, 2405009 collisions (0.785711)
[WPA] Merged 3040565 SCCs
[WPA] Merged 9246482 tree bodies
[WPA] Merged 2382312 types
[WPA] 1868611 types prevailed (4728465 associated trees)
[WPA] GIMPLE canonical type table: size 131071, 94910 elements, 1868696 searches, 790939 collisions (ratio: 0.423257)
[WPA] GIMPLE canonical type pointer-map: 94910 elements, 4216423 searches
[WPA] Compression: 273322455 input bytes, 824178095 uncompressed bytes (ratio: 3.015406)

We merge less, but not by much, and I think we were not right to merge in
those cases.  Would something like this make sense?  (I am not saying my
definition of unit_local_tree_p is the most polished one :)

I think the next step could be to make anonymous types bypass the canonical
type merging (i.e. simply save the chains as they come from the frontends for
those) and then look into computing the type names in free lang data, using
the odr name hash instead of the canonical type hash for those named types +
link them to canonical type hash entries, and if we end up with an unnamed
type in the canonical type hash, then make its alias class conflict with all
the named types.

Honza

Index: lto-streamer-out.c
===
--- lto-streamer-out.c (revision 211488)
+++ lto-streamer-out.c (working copy)
@@ -54,6 +54,47 @@ along with GCC; see the file COPYING3.
 #include "cfgloop.h"
 #include "builtins.h"
 
+/* Return if T can never be shared across units.  */
+static bool
+unit_local_tree_p (tree t)
+{
+  switch (TREE_CODE (t))
+    {
+      case VAR_DECL:
+        /* Automatic variables are always unit local.  */
+        if (!TREE_STATIC (t) && !DECL_EXTERNAL (t)
+            && !DECL_HARD_REGISTER (t))
+          return true;
+        /* ... fall through ... */
+
+      case FUNCTION_DECL:
+        /* Non-public declarations are always local.  */
+        if (!TREE_PUBLIC (t))
+          return true;
+
+        /* Public definitions that would cause linker error if
+           appeared in other unit.  */
+        if (TREE_PUBLIC (t)
+            && !DECL_EXTERNAL (t)
+            && !DECL_WEAK (t))
+          return true;
+        return false;
+      case NAMESPACE_DECL:
+        return !TREE_PUBLIC (t);
+      case TRANSLATION_UNIT_DECL:
+        return true;
+      case PARM_DECL:
+      case RESULT_DECL:
+      case LABEL_DECL:
+      case SSA_NAME:
+        return true;
+      default:
+        if (TYPE_P (t)
+            && type_in_anonymous_namespace_p (t))
+          return true;
+        return false;
+    }
+}
 
 static void lto_write_tree (struct output_block*, tree, bool);
 
@@ -686,7 +727,9 @@ DFS_write_tree_body (struct output_block
 #undef DFS_follow_tree_edge
 }
 
-/* Return a hash value for the tree T.  */
+/* Return a hash value for the tree T.
+   If T is local to unit or refers anything local to unit, return 1.
+   Otherwise return non-1.  */
 
 static hashval_t
 hash_tree (struct streamer_tree_cache_d *cache, tree t)
@@ -694,10 +737,19 @@ hash_tree (struct streamer_tree_cache_d
 #define visit(SIBLING) \
   do { \
     unsigned ix; \
+    hashval_t h; \
     if (SIBLING && streamer_tree_cache_lookup (cache, SIBLING, &ix)) \
-      v = iterative_hash_hashval_t (streamer_tree_cache_get_hash (cache, ix), v); \
+      { \
+        h = st
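
Since the patch text breaks off inside the visit macro, the following is only a guess at the general technique the comment on hash_tree describes: a reserved hash value that, once seen, "poisons" every hash computed from it. The mixing step and the names `LOCAL_HASH` and `combine_hash` are assumptions for illustration, not the code from the patch.

```c
#include <stdint.h>

/* Reserved value meaning "this tree is, or refers to, something that is
   local to the unit".  */
#define LOCAL_HASH 1u

/* Fold the hash CHILD of a referenced tree into the running hash V.
   If either side is already marked local, the result is local too, so
   the marker propagates to everything that refers to a local tree.
   Otherwise the result is guaranteed to differ from LOCAL_HASH.  */
uint32_t
combine_hash (uint32_t child, uint32_t v)
{
  uint32_t h;

  if (child == LOCAL_HASH || v == LOCAL_HASH)
    return LOCAL_HASH;
  h = (v ^ child) * 16777619u;  /* FNV-style mixing step (assumption) */
  return h == LOCAL_HASH ? LOCAL_HASH + 1 : h;
}
```

A consumer of such hashes can then skip the merging table entirely for any SCC whose hash equals `LOCAL_HASH`, which would be consistent with the smaller SCC table in the statistics above.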
Re: [PATCH] Trust TREE_ADDRESSABLE
> On Thu, 12 Jun 2014, Eric Botcazou wrote:
>
> > > If we want to give frontends a way to pass information that address of a
> > > given global object is not taken (apparently useful for Ada and its alias
> > > attribute), then I do not think we are looking for middle-end only
> > > solution.
> >
> > I don't feel very comfortable with doing that in Ada, since everybody
> > seems to be thinking that TREE_PUBLIC/DECL_EXTERNAL objects are implicitly
> > addressable (see for example Steven's reasoning in an earlier message).
> >
> > > If we really do not want to revisit TREE_ADDRESSABLE in frontends, we can
> > > do the following:
> > > 1) change semantics of addressable flag on global variables in a way
> > > Richard did, document it is initialized only after symbol table is built
> > > 2) add code to cgraph construction to set TREE_ADDRESSABLE on every
> > > global variable it sees.
> > > IPA visibility is run before early optimizations.  I suppose we can set
> > > it there.  I.e. in function_and_variable_visibility whenever we set
> > > externally_visible and we have !in_lto_p
> > > It is bit of hack.
> > > 3) perhaps add some way to avoid 2) on objects we want - apparently we
> > > now have DECL_NONALIASED that may be useful for this.
> >
> > Then how about using DECL_NONALIASED instead of TREE_ADDRESSABLE to achieve
> > the initial goal here?  That is to say, may_be_aliased tests
> > DECL_NONALIASED for TREE_PUBLIC/DECL_EXTERNAL DECLs and the LTO front-end
> > sets it properly.
>
> Btw, may_be_aliased already does that.  So yes, when LTO promotes sth
> from non-public to public but hidden visibility and TREE_ADDRESSABLE
> was not set it could set DECL_NONALIASED.  That would at least preserve
> the aliasing behavior we have without using LTO.  If the resolution info
> from the linker allows us to make initial public variables hidden
> _and_ some LTO IPA pass proves that the variable's address is not taken
> then that pass can set DECL_NONALIASED as well.

Yep, I suppose each time I clear the TREE_ADDRESSABLE flag, I can also set
DECL_NONALIASED.

>
> Of course one issue is that it's impossible to write a verifier that
> checks whether DECL_NONALIASED and TREE_ADDRESSABLE are "out-of-sync"
> (because by design they can be).  So it's a bit more fragile
> (we could make the operand scanner that "updates" TREE_ADDRESSABLE
> also unset DECL_NONALIASED of course).

Hmm, when would one unset it?

Honza

>
> Richard.
Re: [PATCH] PR rtl-optimization/61047
> Btw I am not sure at all, why argp-references can never be dangerous?
> For instance in a struct with an array inside, passed as function argument?

IMO there cannot be any definitive solution to this issue until after we move
all the affected optimizations from RTL to GIMPLE.  In the meantime, the
failure mode is not catastrophic (100% reproducible segfault) and there is
always an easy workaround (generally a -fno-* switch).

--
Eric Botcazou
Re: [PATCH] Trust TREE_ADDRESSABLE
> Btw, may_be_aliased already does that.

Indeed, and we could make use of that in Ada, at least in some cases.

> Of course one issue is that it's impossible to write a verifier that
> checks whether DECL_NONALIASED and TREE_ADDRESSABLE are "out-of-sync"
> (because by design they can be).  So it's a bit more fragile
> (we could make the operand scanner that "updates" TREE_ADDRESSABLE
> also unset DECL_NONALIASED of course).

IMO it's also more robust because the default (no DECL_NONALIASED) is safe.

--
Eric Botcazou
Re: [PATCH] Trust TREE_ADDRESSABLE
On Thu, 12 Jun 2014, Jan Hubicka wrote:

> > On Thu, 12 Jun 2014, Eric Botcazou wrote:
> >
> > > > If we want to give frontends a way to pass information that address of a
> > > > given global object is not taken (apparently useful for Ada and its
> > > > alias attribute), then I do not think we are looking for middle-end only
> > > > solution.
> > >
> > > I don't feel very comfortable with doing that in Ada, since everybody
> > > seems to be thinking that TREE_PUBLIC/DECL_EXTERNAL objects are implicitly
> > > addressable (see for example Steven's reasoning in an earlier message).
> > >
> > > > If we really do not want to revisit TREE_ADDRESSABLE in frontends, we
> > > > can do the following:
> > > > 1) change semantics of addressable flag on global variables in a way
> > > > Richard did, document it is initialized only after symbol table is
> > > > built
> > > > 2) add code to cgraph construction to set TREE_ADDRESSABLE on every
> > > > global variable it sees.
> > > > IPA visibility is run before early optimizations. I suppose we can
> > > > set it there. I.e. in function_and_variable_visibility whenever we set
> > > > externally_visible and we have !in_lto_p
> > > > It is a bit of a hack.
> > > > 3) perhaps add some way to avoid 2) on objects we want - apparently we
> > > > now have DECL_NONALIASED that may be useful for this.
> > >
> > > Then how about using DECL_NONALIASED instead of TREE_ADDRESSABLE to
> > > achieve the initial goal here? That is to say, may_be_aliased tests
> > > DECL_NONALIASED for TREE_PUBLIC/DECL_EXTERNAL DECLs and the LTO front-end
> > > sets it properly.
> >
> > Btw, may_be_aliased already does that. So yes, when LTO promotes sth
> > from non-public to public but hidden visibility and TREE_ADDRESSABLE
> > was not set it could set DECL_NONALIASED. That would at least preserve
> > the aliasing behavior from without using LTO. If the resolution info
> > from the linker allows us to make initial public variables hidden
> > _and_ some LTO IPA pass proves that the variable's address is not taken
> > then that pass can set DECL_NONALIASED as well.
>
> Yep, I suppose each time I clear the TREE_ADDRESSABLE flag, I can also set
> DECL_NONALIASED.
>
> > Of course one issue is that it's impossible to write a verifier that
> > checks whether DECL_NONALIASED and TREE_ADDRESSABLE are "out-of-sync"
> > (because by design they can be). So it's a bit more fragile
> > (we could make the operand scanner that "updates" TREE_ADDRESSABLE
> > also unset DECL_NONALIASED of course).
>
> Hmm, when would one unset it?

When you extract the address and use it. For example, when you do
auto-parallelization and outline a part of your function, it passes
arrays as addresses. Or if you start to introduce address induction
variables like the vectorizer or IVOPTs do.

Richard.
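To make the flag interplay discussed in this thread concrete, here is a minimal toy model in C. It is only a sketch, not GCC code: `struct toy_decl` and all the `toy_*` functions are hypothetical stand-ins for a VAR_DECL carrying TREE_PUBLIC/DECL_EXTERNAL, TREE_ADDRESSABLE and DECL_NONALIASED, and the aliasing test is a deliberate simplification of what may_be_aliased checks.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a global variable declaration; NOT GCC's tree type.  */
struct toy_decl {
  bool is_public;    /* models TREE_PUBLIC or DECL_EXTERNAL */
  bool addressable;  /* models TREE_ADDRESSABLE */
  bool nonaliased;   /* models DECL_NONALIASED */
};

/* Simplified aliasing test: a public or addressable variable may be
   aliased unless it was explicitly marked non-aliased.  */
static bool toy_may_be_aliased(const struct toy_decl *d)
{
  return (d->is_public || d->addressable) && !d->nonaliased;
}

/* What an IPA pass could do once it has proven that the address of the
   variable is never taken.  */
static void toy_mark_address_not_taken(struct toy_decl *d)
{
  d->addressable = false;
  d->nonaliased = true;
}

/* What a later transform (outlining, address induction variables) has to
   do when it starts using the address: both flags are updated together.  */
static void toy_take_address(struct toy_decl *d)
{
  d->addressable = true;
  d->nonaliased = false;
}

/* Walks through the three states discussed in the thread.  */
static void toy_demo(void)
{
  struct toy_decl v = { .is_public = true };

  assert(toy_may_be_aliased(&v));   /* default (flag unset) is conservative */
  toy_mark_address_not_taken(&v);
  assert(!toy_may_be_aliased(&v));  /* the proof allows the stronger answer */
  toy_take_address(&v);
  assert(toy_may_be_aliased(&v));   /* taking the address undoes it */
}
```

The point of the model is only that the unset state is the safe one, and that any code which introduces a new address-taking use is responsible for clearing the non-aliased mark.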
Minor cleanup
There was apparently a last-minute name change for DECL_NONALIASED.

Tested on x86_64-suse-linux, applied on mainline and 4.9 branch as obvious.

2014-06-12 Eric Botcazou

* tree-core.h (DECL_NONALIASED): Use proper spelling in comment.

--
Eric Botcazou

Index: tree-core.h
===
--- tree-core.h (revision 211435)
+++ tree-core.h (working copy)
@@ -1012,7 +1012,7 @@ struct GTY(()) tree_base {
 SSA_NAME_IN_FREELIST in
 SSA_NAME
- VAR_DECL_NONALIASED in
+ DECL_NONALIASED in
 VAR_DECL
 deprecated_flag:
RE: [PATCH] PR rtl-optimization/61047
On Thu, 12 Jun 2014 10:50:29, Eric Botcazou wrote:
>
>> Btw I am not sure at all, why argp-references can never be dangerous?
>> For instance in a struct with an array inside, passed as function argument?
>
> IMO there cannot be any definitive solution to this issue until after we move
> all the affected optimizations from RTL to GIMPLE. In the meantime, the
> failure mode is not catastrophic (100% reproducible segfault) and there is
> always an easy workaround (generally a -fno-* switch).
>
> --
> Eric Botcazou

Not really 100% reproducible.

To my surprise, the test case from the tracker did _not_ crash when I
initially put it in the testsuite. The reason is probably that the stack
layout under the test suite is a little different: the LD_LIBRARY_PATH
environment variable is very long there, and all environment variables and
arguments sit at the top of the stack.

The test only produced the expected crash when I changed this

  if (b == 2837) a = e[b];

into that:

  if (b == 28378) a = e[b];

Bernd.
[PATCH][RFC] Fix PR61473, inline small memcpy/memmove during tree opts
This implements the requested inlining of memmove for possibly overlapping arguments by doing first all loads and then all stores. The easiest place is to do this in memory op folding where we already perform inlining of some memcpy cases (but fail to do the equivalent memcpy optimization - though RTL expansion later does it). The following patch restricts us to max. word-mode size. Ideally we'd have a way to check for the number of real instructions needed to load an (aligned) value of size N. But maybe we don't care and are fine with doing multiple loads / stores? Anyway, the following is conservative (but maybe not enough). Bootstrap / regtest running on x86_64-unknown-linux-gnu. These transforms don't really belong to GENERIC folding (they also run at -O0 ...), similar to most builtin foldings. But this patch is not to change that. Any comments on the size/cost issue? Thanks, Richard. 2014-06-12 Richard Biener PR middle-end/61473 * builtins.c (fold_builtin_memory_op): Inline memory moves that can be implemented with a single load followed by a single store. * gcc.dg/memmove-4.c: New testcase. Index: gcc/builtins.c === --- gcc/builtins.c (revision 211449) +++ gcc/builtins.c (working copy) @@ -8637,11 +8637,53 @@ fold_builtin_memory_op (location_t loc, unsigned int src_align, dest_align; tree off0; - if (endp == 3) + /* Build accesses at offset zero with a ref-all character type. */ + off0 = build_int_cst (build_pointer_type_for_mode (char_type_node, +ptr_mode, true), 0); + + /* If we can perform the copy efficiently with first doing all loads + and then all stores inline it that way. Currently efficiently +means that we can load all the memory into a single integer +register and thus limited to word_mode size. Ideally we'd have +a way to query the largest mode that we can load/store with +a signle instruction. 
*/ + src_align = get_pointer_alignment (src); + dest_align = get_pointer_alignment (dest); + if (tree_fits_uhwi_p (len) + && compare_tree_int (len, BITS_PER_WORD / 8) <= 0) { - src_align = get_pointer_alignment (src); - dest_align = get_pointer_alignment (dest); + unsigned ilen = tree_to_uhwi (len); + if (exact_log2 (ilen) != -1) + { + tree type = lang_hooks.types.type_for_size (ilen * 8, 1); + if (type + && TYPE_MODE (type) != BLKmode + && (GET_MODE_SIZE (TYPE_MODE (type)) * BITS_PER_UNIT + == ilen * 8) + /* If the pointers are not aligned we must be able to +emit an unaligned load. */ + && ((src_align >= GET_MODE_ALIGNMENT (TYPE_MODE (type)) + && dest_align >= GET_MODE_ALIGNMENT (TYPE_MODE (type))) + || !SLOW_UNALIGNED_ACCESS (TYPE_MODE (type), +MIN (src_align, dest_align + { + tree srctype = type; + tree desttype = type; + if (src_align < GET_MODE_ALIGNMENT (TYPE_MODE (type))) + srctype = build_aligned_type (type, src_align); + if (dest_align < GET_MODE_ALIGNMENT (TYPE_MODE (type))) + desttype = build_aligned_type (type, dest_align); + destvar = fold_build2 (MEM_REF, desttype, dest, off0); + expr = build2 (MODIFY_EXPR, type, +fold_build2 (MEM_REF, desttype, dest, off0), +fold_build2 (MEM_REF, srctype, src, off0)); + goto done; + } + } + } + if (endp == 3) + { /* Both DEST and SRC must be pointer types. ??? This is what old code did. Is the testing for pointer types really mandatory? @@ -8818,10 +8860,6 @@ fold_builtin_memory_op (location_t loc, if (!ignore) dest = builtin_save_expr (dest); - /* Build accesses at offset zero with a ref-all character type. 
*/ - off0 = build_int_cst (build_pointer_type_for_mode (char_type_node, -ptr_mode, true), 0); - destvar = dest; STRIP_NOPS (destvar); if (TREE_CODE (destvar) == ADDR_EXPR @@ -,6 +8926,7 @@ fold_builtin_memory_op (location_t loc, expr = build2 (MODIFY_EXPR, TREE_TYPE (destvar), destvar, srcvar); } +done: if (ignore) return expr; Index: gcc/testsuite/gcc.dg/memmove-4.c === --- gcc/testsuite/gcc.dg/memmove-4.c(revision 0) +++ gcc/testsuite/gcc.dg/memmove-4.c(working copy) @@ -0,0 +1,12 @@ +/* {
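For readers who want to see the "all loads first, then all stores" idea from the description above at source level, here is a small C sketch. It is not what the patch emits (the patch builds a MEM_REF assignment during folding); `move8_load_then_store` is a hypothetical helper, and the fixed 8-byte size is an assumption chosen to match a 64-bit word.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: move 8 possibly overlapping bytes.  Because every
   source byte is read into TMP before any destination byte is written,
   overlap between SRC and DEST cannot corrupt the result.  */
static void move8_load_then_store(void *dest, const void *src)
{
  uint64_t tmp;
  memcpy(&tmp, src, sizeof tmp);   /* the single (possibly unaligned) load */
  memcpy(dest, &tmp, sizeof tmp);  /* the single (possibly unaligned) store */
}

/* Compares the helper against memmove on an overlapping range.  */
static void move8_demo(void)
{
  char buf[] = "abcdefghijkl";
  char ref[] = "abcdefghijkl";

  move8_load_then_store(buf + 2, buf);  /* source and destination overlap */
  memmove(ref + 2, ref, 8);
  assert(memcmp(buf, ref, sizeof buf) == 0);
  assert(strcmp(buf, "ababcdefghkl") == 0);
}
```

The two fixed-size memcpy calls into and out of a local are the usual portable way to spell an unaligned load and store in C; compilers typically turn each into one move instruction where the target allows it.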
RE: [PATCH][RX] Patch to correct the functionality of compiler option -falign-labels=n
Hi DJ,

> Have you checked the other alignment macros to see if they need to be
> fixed too?

Thank you for reviewing this patch. Yes, I have checked the other alignment
macros and they seem fine.

> This should be :

I have addressed this review comment. Is this patch now OK to commit?

Best Regards,
Sandeep Kumar Singh

2014-06-12 Sandeep Kumar Singh

* config/rx/rx.h (LABEL_ALIGN): Corrected macro LABEL_ALIGN.

rx_align_labels.patch Description: rx_align_labels.patch
Re: [RFC] Teaching SCC merging about unit local trees
On Thu, Jun 12, 2014 at 10:47 AM, Jan Hubicka wrote: > Richard, > as briefly discussed before, I would like to teach LTO type merging to not > merge > types that was declared in anonymous namespaces and use C++ ODR type names > (stored in DECL_ASSEMBLER_NAME of the TYPE_DECL) to break down canonical types > by their names. > > First thing I need to arrange IMO is to not merge two anonymous types from > two different units. While looking into it I noticed that the current code > in unify_scc that refuses to merge local decls produces conflicts and seems > useless excercise to do. > > This patch introduces special hash code 1 that specify that given SCC is known > to be local and should bypass the merging logic. This is propagated down and > seems to quite noticeably reduce size of SCC hash: > > [WPA] read 10190717 SCCs of average size 1.980409 > [WPA] 20181785 tree bodies read in total > [WPA] tree SCC table: size 4194301, 1882700 elements, collision ratio: > 0.815497 > [WPA] tree SCC max chain length 140 (size 1) > [WPA] Compared 3392363 SCCs, 2718822 collisions (0.801454) > [WPA] Merged 3314075 SCCs > [WPA] Merged 9693632 tree bodies > [WPA] Merged 2467704 types > [WPA] 1783262 types prevailed (4491218 associated trees) > [WPA] GIMPLE canonical type table: size 131071, 94867 elements, 1783347 > searches, 737056 collisions (ratio: 0.413299) > [WPA] GIMPLE canonical type pointer-map: 94867 elements, 3973875 searches > [WPA] Compression: 282828785 input bytes, 831186147 uncompressed bytes > (ratio: 2.938832) > [WPA] Size of mmap'd section decls: 282828785 bytes > > to: > > [WPA] read 10172291 SCCs of average size 1.982162 > [WPA] 20163124 tree bodies read in total > [WPA] tree SCC table: size 2097143, 988764 elements, collision ratio: 0.684967 > [WPA] tree SCC max chain length 140 (size 1) > [WPA] Compared 3060932 SCCs, 2405009 collisions (0.785711) > [WPA] Merged 3040565 SCCs > [WPA] Merged 9246482 tree bodies > [WPA] Merged 2382312 types > [WPA] 1868611 types 
prevailed (4728465 associated trees)
> [WPA] GIMPLE canonical type table: size 131071, 94910 elements, 1868696
> searches, 790939 collisions (ratio: 0.423257)
> [WPA] GIMPLE canonical type pointer-map: 94910 elements, 4216423 searches
> [WPA] Compression: 273322455 input bytes, 824178095 uncompressed bytes
> (ratio: 3.015406)
>
> We merge less, but not by much and I think we were not right to merge in
> those cases.

If we merge things we must not merge, then the fix belongs in
compare_tree_sccs_1, not in introducing special cases like you propose.

That is, if we are not allowed to merge anonymous namespaces then
make sure we don't. We already should not merge types with
TYPE_CONTEXT == such namespace by means of

  /* ??? Global types from different TUs have non-matching
     TRANSLATION_UNIT_DECLs. Still merge them if they are otherwise
     equal. */
  if (TYPE_FILE_SCOPE_P (t1) && TYPE_FILE_SCOPE_P (t2))
    ;
  else
    compare_tree_edges (TYPE_CONTEXT (t1), TYPE_CONTEXT (t2));

but we possibly merge a subset of decl kinds from "different" namespaces:

  /* ??? Global decls from different TUs have non-matching
     TRANSLATION_UNIT_DECLs. Only consider a small set of
     decls equivalent, we should not end up merging others. */
  if ((code == TYPE_DECL
       || code == NAMESPACE_DECL
       || code == IMPORTED_DECL
       || code == CONST_DECL
       || (VAR_OR_FUNCTION_DECL_P (t1)
           && (TREE_PUBLIC (t1) || DECL_EXTERNAL (t1))))
      && DECL_FILE_SCOPE_P (t1) && DECL_FILE_SCOPE_P (t2))
    ;
  else
    compare_tree_edges (DECL_CONTEXT (t1), DECL_CONTEXT (t2));

Not sure what we end up doing for NAMESPACE_DECL itself (and what
fields we stream for it). It would be interesting to check that.

Thus, make sure we don't merge namespace {} and namespace {} from
two different units.

But effectively you say we have two classes of "global" trees, first
those that are mergeable across TUs and second those that are not.
This IMHO means we want to separate those to two different LTO
sections and simply skip all the merging code for the second (instead
of adding hacks to the merging code).

Richard.

>
> Would something like this make sense? (I am not saying my definition of
> unit_local_tree_p is the most polished one :)
>
> I think the next step could be to make anonymous types bypass the canonical
> type merging (i.e. simply save the chains as they come from frontends for
> those) and then look into computing the type names in free lang data, using
> odr name hash instead of canonical type hash for those named types + link
> them to canonical type hash entries and if we end up with unnamed type in
> canonical type hash, then make its alias class to conflict with all the
> named types.
>
> Honza
>
> Index: lto-streamer-out.c
> ===
> --- lto-streamer-out.c (revision 211488)
> +++ lto-
Re: [RFC] Teaching SCC merging about unit local trees
On Thu, Jun 12, 2014 at 12:29 PM, Richard Biener wrote: > On Thu, Jun 12, 2014 at 10:47 AM, Jan Hubicka wrote: >> Richard, >> as briefly discussed before, I would like to teach LTO type merging to not >> merge >> types that was declared in anonymous namespaces and use C++ ODR type names >> (stored in DECL_ASSEMBLER_NAME of the TYPE_DECL) to break down canonical >> types >> by their names. >> >> First thing I need to arrange IMO is to not merge two anonymous types from >> two different units. While looking into it I noticed that the current code >> in unify_scc that refuses to merge local decls produces conflicts and seems >> useless excercise to do. >> >> This patch introduces special hash code 1 that specify that given SCC is >> known >> to be local and should bypass the merging logic. This is propagated down and >> seems to quite noticeably reduce size of SCC hash: >> >> [WPA] read 10190717 SCCs of average size 1.980409 >> [WPA] 20181785 tree bodies read in total >> [WPA] tree SCC table: size 4194301, 1882700 elements, collision ratio: >> 0.815497 >> [WPA] tree SCC max chain length 140 (size 1) >> [WPA] Compared 3392363 SCCs, 2718822 collisions (0.801454) >> [WPA] Merged 3314075 SCCs >> [WPA] Merged 9693632 tree bodies >> [WPA] Merged 2467704 types >> [WPA] 1783262 types prevailed (4491218 associated trees) >> [WPA] GIMPLE canonical type table: size 131071, 94867 elements, 1783347 >> searches, 737056 collisions (ratio: 0.413299) >> [WPA] GIMPLE canonical type pointer-map: 94867 elements, 3973875 searches >> [WPA] Compression: 282828785 input bytes, 831186147 uncompressed bytes >> (ratio: 2.938832) >> [WPA] Size of mmap'd section decls: 282828785 bytes >> >> to: >> >> [WPA] read 10172291 SCCs of average size 1.982162 >> [WPA] 20163124 tree bodies read in total >> [WPA] tree SCC table: size 2097143, 988764 elements, collision ratio: >> 0.684967 >> [WPA] tree SCC max chain length 140 (size 1) >> [WPA] Compared 3060932 SCCs, 2405009 collisions (0.785711) >> [WPA] 
Merged 3040565 SCCs >> [WPA] Merged 9246482 tree bodies >> [WPA] Merged 2382312 types >> [WPA] 1868611 types prevailed (4728465 associated trees) >> [WPA] GIMPLE canonical type table: size 131071, 94910 elements, 1868696 >> searches, 790939 collisions (ratio: 0.423257) >> [WPA] GIMPLE canonical type pointer-map: 94910 elements, 4216423 searches >> [WPA] Compression: 273322455 input bytes, 824178095 uncompressed bytes >> (ratio: 3.015406) >> >> We merge less, but not by much and I think we was not right not merge in >> that cases. > > If we merge things we may not merge then the fix is to compare_tree_sccs_1, > not introducing special cases like you propose. > > That is, if we are not allowed to merge anonymous namespaces then > make sure we don't. We already should not merge types with > TYPE_CONTEXT == such namespace by means of > > /* ??? Global types from different TUs have non-matching > TRANSLATION_UNIT_DECLs. Still merge them if they are otherwise > equal. */ > if (TYPE_FILE_SCOPE_P (t1) && TYPE_FILE_SCOPE_P (t2)) > ; > else > compare_tree_edges (TYPE_CONTEXT (t1), TYPE_CONTEXT (t2)); > > but we possibly merge a subset of decl kinds from "different" namespaces : > > /* ??? Global decls from different TUs have non-matching > TRANSLATION_UNIT_DECLs. Only consider a small set of > decls equivalent, we should not end up merging others. */ > if ((code == TYPE_DECL >|| code == NAMESPACE_DECL >|| code == IMPORTED_DECL >|| code == CONST_DECL >|| (VAR_OR_FUNCTION_DECL_P (t1) >&& (TREE_PUBLIC (t1) || DECL_EXTERNAL (t1 > && DECL_FILE_SCOPE_P (t1) && DECL_FILE_SCOPE_P (t2)) > ; > else > compare_tree_edges (DECL_CONTEXT (t1), DECL_CONTEXT (t2)); > > Not sure what we end up doing for NAMESPACE_DECL itself (and what > fields we stream for it). It would be interesting to check that. > > Thus, make sure we don't merge namespace {} and namespace {} from > two different units. 
>
> But effectively you say we have two classes of "global" trees, first
> those that are mergeable across TUs and second those that are not.
> This IMHO means we want to separate those to two different LTO
> sections and simply skip all the merging code for the second (instead
> of adding hacks to the merging code).

As that also restricts the "pointers" we can have. Mergeable stuff
may not refer to non-mergeable stuff. Breaks down for initializers:

  static int x;
  int *p = &x;

though you could say that as p is initialized (thus not DECL_COMMON)
this instance cannot be merged with anything else - other entities
are 'extern int *p' (tree merging is different from symtab merging).

Thus int *p = &x; is also non-mergeable (everything that has tree
pointers referring to something not mergeable is not mergeable).

We have similar "issues" with tree_is_indexable and pointers violating
constraints (like
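
The closure rule stated above (anything holding a pointer to something non-mergeable is itself non-mergeable) can be sketched as a fixed-point propagation over a toy reference graph. This is only an illustration in C: `struct toy_node`, `toy_compute_mergeable` and the fixed fan-out are hypothetical and unrelated to the actual LTO streamer data structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum { TOY_MAX_REFS = 4 };

/* Toy stand-in for a streamed tree; NOT the real LTO representation.  */
struct toy_node {
  const char *name;
  bool unit_local;                      /* seed, e.g. a static variable */
  bool mergeable;                       /* computed below */
  struct toy_node *refs[TOY_MAX_REFS];  /* outgoing pointers, NULL-padded */
};

/* Mark every node that is unit-local, or that transitively refers to a
   unit-local node, as non-mergeable.  Iterates until nothing changes.  */
static void toy_compute_mergeable(struct toy_node **nodes, size_t n)
{
  for (size_t i = 0; i < n; i++)
    nodes[i]->mergeable = !nodes[i]->unit_local;

  bool changed = true;
  while (changed) {
    changed = false;
    for (size_t i = 0; i < n; i++) {
      struct toy_node *node = nodes[i];
      if (!node->mergeable)
        continue;
      for (size_t j = 0; j < TOY_MAX_REFS && node->refs[j]; j++)
        if (!node->refs[j]->mergeable) {
          node->mergeable = false;
          changed = true;
          break;
        }
    }
  }
}

/* Models "static int x; int *p = &x;" from the message above.  */
static void toy_merge_demo(void)
{
  struct toy_node int_type = { .name = "int" };
  struct toy_node x = { .name = "x", .unit_local = true,
                        .refs = { &int_type } };
  struct toy_node p = { .name = "p", .refs = { &x } };
  struct toy_node *all[] = { &p, &x, &int_type };

  toy_compute_mergeable(all, 3);
  assert(int_type.mergeable);  /* refers to nothing local */
  assert(!x.mergeable);        /* unit-local by definition */
  assert(!p.mergeable);        /* its initializer points at x */
}
```

In this model the mergeable set is closed under "refers to", which is the property a separate non-mergeable section would need: nothing left in the mergeable set points into the other one.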
Re: ipa-visibility TLC 2/n
Hi Honza,

>> Unfortunately, AIX isn't the only target massively affected by your
>> recent patches. This all started with r210597
>>
>> 2014-05-17 Jan Hubicka
>>
>> * tree-pass.h (make_pass_ipa_comdats): New pass.
>> * timevar.def (TV_IPA_COMDATS): New timevar.
>> * passes.def (pass_ipa_comdats): Add.
>> * Makefile.in (OBJS): Add ipa-comdats.o
>> * ipa-comdats.c: New file.
>>
>> At that time, only Solaris 11 with gas/Solaris ld was affected: many Go
>> tests started failing like this:
>>
>> runtime.SetFinalizer: cannot pass * os os.file to finalizer func(* os os.file) error
>> fatal error: runtime.SetFinalizer
>
> Thanks for letting me know. This is a different transformation than the one
> causing trouble on AIX (AIX has no comdats, so this pass does nothing). Go
> seems to be quite a heavy user of comdat locals produced by that patch, so I
> suppose they somehow break with Solaris.
>
> Comdat locals are now used by ipa-comdats, for thunks and for decloned ctors.
> We probably need to figure out the Solaris limitation a bit more precisely
> and either fix it or add a way for the target to say what kind of comdat
> locals are not supported.

Right. I'll start reghunting for the patch that caused additional
breakage even without comdat, as on Solaris 10.

> Can I reproduce your setup on the compile farm?

According to https://gcc.gnu.org/wiki/CompileFarm, there are no Solaris
machines or VMs in the compile farm. If a VM could be set up (no idea
if they allow non-free OSes beyond AIX there), I'd suggest starting with
Solaris 11.2 Beta
(http://www.oracle.com/technetwork/server-storage/solaris11/downloads/beta-2182939.html),
which has the latest in /bin/ld support.

I can certainly help with setting something up.

Rainer

--
Rainer Orth, Center for Biotechnology, Bielefeld University
Re: [RFC] Teaching SCC merging about unit local trees
On Thu, Jun 12, 2014 at 12:34 PM, Richard Biener wrote: > On Thu, Jun 12, 2014 at 12:29 PM, Richard Biener > wrote: >> On Thu, Jun 12, 2014 at 10:47 AM, Jan Hubicka wrote: >>> Richard, >>> as briefly discussed before, I would like to teach LTO type merging to not >>> merge >>> types that was declared in anonymous namespaces and use C++ ODR type names >>> (stored in DECL_ASSEMBLER_NAME of the TYPE_DECL) to break down canonical >>> types >>> by their names. >>> >>> First thing I need to arrange IMO is to not merge two anonymous types from >>> two different units. While looking into it I noticed that the current code >>> in unify_scc that refuses to merge local decls produces conflicts and seems >>> useless excercise to do. >>> >>> This patch introduces special hash code 1 that specify that given SCC is >>> known >>> to be local and should bypass the merging logic. This is propagated down and >>> seems to quite noticeably reduce size of SCC hash: >>> >>> [WPA] read 10190717 SCCs of average size 1.980409 >>> [WPA] 20181785 tree bodies read in total >>> [WPA] tree SCC table: size 4194301, 1882700 elements, collision ratio: >>> 0.815497 >>> [WPA] tree SCC max chain length 140 (size 1) >>> [WPA] Compared 3392363 SCCs, 2718822 collisions (0.801454) >>> [WPA] Merged 3314075 SCCs >>> [WPA] Merged 9693632 tree bodies >>> [WPA] Merged 2467704 types >>> [WPA] 1783262 types prevailed (4491218 associated trees) >>> [WPA] GIMPLE canonical type table: size 131071, 94867 elements, 1783347 >>> searches, 737056 collisions (ratio: 0.413299) >>> [WPA] GIMPLE canonical type pointer-map: 94867 elements, 3973875 searches >>> [WPA] Compression: 282828785 input bytes, 831186147 uncompressed bytes >>> (ratio: 2.938832) >>> [WPA] Size of mmap'd section decls: 282828785 bytes >>> >>> to: >>> >>> [WPA] read 10172291 SCCs of average size 1.982162 >>> [WPA] 20163124 tree bodies read in total >>> [WPA] tree SCC table: size 2097143, 988764 elements, collision ratio: >>> 0.684967 >>> [WPA] tree SCC 
max chain length 140 (size 1) >>> [WPA] Compared 3060932 SCCs, 2405009 collisions (0.785711) >>> [WPA] Merged 3040565 SCCs >>> [WPA] Merged 9246482 tree bodies >>> [WPA] Merged 2382312 types >>> [WPA] 1868611 types prevailed (4728465 associated trees) >>> [WPA] GIMPLE canonical type table: size 131071, 94910 elements, 1868696 >>> searches, 790939 collisions (ratio: 0.423257) >>> [WPA] GIMPLE canonical type pointer-map: 94910 elements, 4216423 searches >>> [WPA] Compression: 273322455 input bytes, 824178095 uncompressed bytes >>> (ratio: 3.015406) >>> >>> We merge less, but not by much and I think we was not right not merge in >>> that cases. >> >> If we merge things we may not merge then the fix is to compare_tree_sccs_1, >> not introducing special cases like you propose. >> >> That is, if we are not allowed to merge anonymous namespaces then >> make sure we don't. We already should not merge types with >> TYPE_CONTEXT == such namespace by means of >> >> /* ??? Global types from different TUs have non-matching >> TRANSLATION_UNIT_DECLs. Still merge them if they are otherwise >> equal. */ >> if (TYPE_FILE_SCOPE_P (t1) && TYPE_FILE_SCOPE_P (t2)) >> ; >> else >> compare_tree_edges (TYPE_CONTEXT (t1), TYPE_CONTEXT (t2)); >> >> but we possibly merge a subset of decl kinds from "different" namespaces : >> >> /* ??? Global decls from different TUs have non-matching >> TRANSLATION_UNIT_DECLs. Only consider a small set of >> decls equivalent, we should not end up merging others. */ >> if ((code == TYPE_DECL >>|| code == NAMESPACE_DECL >>|| code == IMPORTED_DECL >>|| code == CONST_DECL >>|| (VAR_OR_FUNCTION_DECL_P (t1) >>&& (TREE_PUBLIC (t1) || DECL_EXTERNAL (t1 >> && DECL_FILE_SCOPE_P (t1) && DECL_FILE_SCOPE_P (t2)) >> ; >> else >> compare_tree_edges (DECL_CONTEXT (t1), DECL_CONTEXT (t2)); >> >> Not sure what we end up doing for NAMESPACE_DECL itself (and what >> fields we stream for it). It would be interesting to check that. 
>> >> Thus, make sure we don't merge namespace {} and namespace {} from >> two different units. >> >> But effectively you say we have two classes of "global" trees, first >> those that are mergeable across TUs and second those that are not. >> This IMHO means we want to separate those to two different LTO >> sections and simply skip all the merging code for the second (instead >> of adding hacks to the merging code). > > As that also restricts the "pointers" we can have. Mergeable stuff > may not refer to non-mergeable stuff. Breaks down for initializers: > > static int x; > int *p = &x; > > though you could say that as p is initialized (thus not DECL_COMMON) > this instance cannot be merged with anything else - other entities > are 'extern int *p' (tree merging is different from symtab merging). > > Thus int *p = &x; is also non-m
Re: [PATCH] PR rtl-optimization/61047
On Thu, Jun 12, 2014 at 10:36 AM, Eric Botcazou wrote: >> Btw, I wonder if we can simply mark the MEMs generated from spill code >> with MEM_NOTRAP_P so we can remove the special casing of >> frame-pointer-based addresses from add while properly initializing >> MEM_NOTRAP_p from rtx_addr_can_trap_p? > > Spill code generated by the compiler itself? That's quite restrictive. > >> I suppose it was added exactly to cover spill code? > > Nope, it was added for jump tables: > > 2003-04-22 Richard Henderson > > PR 8866 > * rtl.h (MEM_NOTRAP_P): New. > (MEM_COPY_ATTRIBUTES): Copy it. > * rtlanal.c (may_trap_p): Check it. > * expr.c (do_tablejump): Set it. > * doc/rtl.texi (Flags): Document it. > > * cfgrtl.c (try_redirect_by_replacing_jump): Revert last three > changes. > > that is to say, for memory accesses that can nominally trap but for which we > know that they actually don't. I was asking for the special-casing of frame-pointer-based accesses in rtx_addr_can_trap_p, not MEM_NOTRAP_P. (MEM_NOTRAP_P of course has the issue that it may not be trusted when you try to move the MEM ...) Richard. > -- > Eric Botcazou
[GOMP4, COMMITTED] OpenACC if clause.
From: tschwinge gcc/c/ * c-parser.c (c_parser_oacc_all_clauses): Handle PRAGMA_OMP_CLAUSE_IF. (OACC_DATA_CLAUSE_MASK, OACC_KERNELS_CLAUSE_MASK) (OACC_PARALLEL_CLAUSE_MASK, OACC_UPDATE_CLAUSE_MASK): Add it. gcc/ * omp-low.c (scan_sharing_clauses): Allow OMP_CLAUSE_IF. (expand_oacc_offload, expand_omp_target): Handle it. gcc/testsuite/ * c-c++-common/goacc/if-clause-1.c: New file. * c-c++-common/goacc/if-clause-2.c: Likewise. git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gomp-4_0-branch@211510 138bc75d-0d04-0410-961f-82ee72b054a4 --- gcc/ChangeLog.gomp | 6 ++ gcc/c/ChangeLog.gomp | 8 +++ gcc/c/c-parser.c | 10 +++- gcc/omp-low.c | 81 +- gcc/testsuite/ChangeLog.gomp | 5 ++ gcc/testsuite/c-c++-common/goacc/if-clause-1.c | 8 +++ gcc/testsuite/c-c++-common/goacc/if-clause-2.c | 11 7 files changed, 112 insertions(+), 17 deletions(-) create mode 100644 gcc/testsuite/c-c++-common/goacc/if-clause-1.c create mode 100644 gcc/testsuite/c-c++-common/goacc/if-clause-2.c diff --git gcc/ChangeLog.gomp gcc/ChangeLog.gomp index be1aa16..2abe179 100644 --- gcc/ChangeLog.gomp +++ gcc/ChangeLog.gomp @@ -1,3 +1,9 @@ +2014-06-12 Thomas Schwinge + James Norris + + * omp-low.c (scan_sharing_clauses): Allow OMP_CLAUSE_IF. + (expand_oacc_offload, expand_omp_target): Handle it. + 2014-06-06 Thomas Schwinge James Norris diff --git gcc/c/ChangeLog.gomp gcc/c/ChangeLog.gomp index f1e45f3..108ce3e 100644 --- gcc/c/ChangeLog.gomp +++ gcc/c/ChangeLog.gomp @@ -1,3 +1,11 @@ +2014-06-12 Thomas Schwinge + James Norris + + * c-parser.c (c_parser_oacc_all_clauses): Handle + PRAGMA_OMP_CLAUSE_IF. + (OACC_DATA_CLAUSE_MASK, OACC_KERNELS_CLAUSE_MASK) + (OACC_PARALLEL_CLAUSE_MASK, OACC_UPDATE_CLAUSE_MASK): Add it. 
+ 2014-06-06 Thomas Schwinge James Norris diff --git gcc/c/c-parser.c gcc/c/c-parser.c index bf4bad62..6269923 100644 --- gcc/c/c-parser.c +++ gcc/c/c-parser.c @@ -10203,7 +10203,7 @@ c_parser_omp_clause_final (c_parser *parser, tree list) return list; } -/* OpenMP 2.5: +/* OpenACC, OpenMP 2.5: if ( expression ) */ static tree @@ -11295,6 +11295,10 @@ c_parser_oacc_all_clauses (c_parser *parser, omp_clause_mask mask, clauses = c_parser_oacc_data_clause (parser, c_kind, clauses); c_name = "host"; break; + case PRAGMA_OMP_CLAUSE_IF: + clauses = c_parser_omp_clause_if (parser, clauses); + c_name = "if"; + break; case PRAGMA_OMP_CLAUSE_NUM_GANGS: clauses = c_parser_omp_clause_num_gangs (parser, clauses); c_name = "num_gangs"; @@ -11614,6 +11618,7 @@ c_parser_omp_structured_block (c_parser *parser) | (OMP_CLAUSE_MASK_1 << PRAGMA_OMP_CLAUSE_COPYOUT) \ | (OMP_CLAUSE_MASK_1 << PRAGMA_OMP_CLAUSE_CREATE) \ | (OMP_CLAUSE_MASK_1 << PRAGMA_OMP_CLAUSE_DEVICEPTR)\ + | (OMP_CLAUSE_MASK_1 << PRAGMA_OMP_CLAUSE_IF) \ | (OMP_CLAUSE_MASK_1 << PRAGMA_OMP_CLAUSE_PRESENT) \ | (OMP_CLAUSE_MASK_1 << PRAGMA_OMP_CLAUSE_PRESENT_OR_COPY) \ | (OMP_CLAUSE_MASK_1 << PRAGMA_OMP_CLAUSE_PRESENT_OR_COPYIN)\ @@ -11649,6 +11654,7 @@ c_parser_oacc_data (location_t loc, c_parser *parser) | (OMP_CLAUSE_MASK_1 << PRAGMA_OMP_CLAUSE_COPYOUT) \ | (OMP_CLAUSE_MASK_1 << PRAGMA_OMP_CLAUSE_CREATE) \ | (OMP_CLAUSE_MASK_1 << PRAGMA_OMP_CLAUSE_DEVICEPTR)\ + | (OMP_CLAUSE_MASK_1 << PRAGMA_OMP_CLAUSE_IF) \ | (OMP_CLAUSE_MASK_1 << PRAGMA_OMP_CLAUSE_PRESENT) \ | (OMP_CLAUSE_MASK_1 << PRAGMA_OMP_CLAUSE_PRESENT_OR_COPY) \ | (OMP_CLAUSE_MASK_1 << PRAGMA_OMP_CLAUSE_PRESENT_OR_COPYIN)\ @@ -11727,6 +11733,7 @@ c_parser_oacc_loop (location_t loc, c_parser *parser, char *p_name) | (OMP_CLAUSE_MASK_1 << PRAGMA_OMP_CLAUSE_COPYOUT) \ | (OMP_CLAUSE_MASK_1 << PRAGMA_OMP_CLAUSE_CREATE) \ | (OMP_CLAUSE_MASK_1 << PRAGMA_OMP_CLAUSE_DEVICEPTR)\ + | (OMP_CLAUSE_MASK_1 << PRAGMA_OMP_CLAUSE_IF) \ | (OMP_CLAUSE_MASK_1 << 
PRAGMA_OMP_CLAUSE_NUM_GANGS)\ | (OMP_CLAUSE_MASK_1 << PRAGMA_OMP_CLAUSE_NUM_WORKERS) \ | (OMP_CLAUSE_MASK_1 << PRAGMA_OMP_CLAUSE_PRESENT) \ @@ -11775,6 +11782,7 @@ c_parser_oacc_parallel (location_t loc, c_parser *parser, char *p_name) #define OACC_UPDATE_CLAUSE_MASK \ ( (OMP_CLAUSE_MASK_1 << PRAGMA_OMP_CLAUSE_DEVICE) \ | (OMP_CLAUSE_MASK_1 << PR
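
As a user-level illustration of the clause this commit enables, the following C sketch puts an if clause on an OpenACC parallel construct. It is not taken from the new if-clause-1.c or if-clause-2.c tests: the function `scale`, the threshold 1024 and the copy clause are assumptions made for the example. Without OpenACC support enabled the pragmas are simply ignored and the loop runs on the host.

```c
#include <assert.h>

/* Scale N floats in place.  The region is offloaded only when the if
   condition is true; otherwise it executes on the host.  The threshold
   is arbitrary and purely illustrative.  */
static void scale(float *a, int n, float factor)
{
#pragma acc parallel if(n > 1024) copy(a[0:n])
  {
#pragma acc loop
    for (int i = 0; i < n; i++)
      a[i] *= factor;
  }
}

/* Small input, so the if condition is false and the host path is used.  */
static void scale_demo(void)
{
  float v[4] = { 1.0f, 2.0f, 3.0f, 4.0f };

  scale(v, 4, 2.0f);
  assert(v[0] == 2.0f && v[1] == 4.0f && v[2] == 6.0f && v[3] == 8.0f);
}
```

Either way the result is the same; the clause only selects where the region executes.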
[GOMP4, COMMITTED] Different configure and make flags for target vs. accelerator GCC.
From: tschwinge

--enable-target-gcc-configure-flags, EXTRA_TARGET_GCC_FLAGS vs. --enable-accelerator-gcc-configure-flags, EXTRA_ACCELERATOR_GCC_FLAGS.

	* configure.ac (--enable-target-gcc-configure-flags)
	(--enable-accelerator-gcc-configure-flags): New configure options.
	* Makefile.def (gcc, accel-gcc): Handle these as well as new
	EXTRA_TARGET_GCC_FLAGS and EXTRA_ACCELERATOR_GCC_FLAGS make flags.
	* configure: Regenerate.
	* Makefile.in: Regenerate.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gomp-4_0-branch@211513 138bc75d-0d04-0410-961f-82ee72b054a4
---
 ChangeLog.gomp |   9 +
 Makefile.def   |   6 ++-
 Makefile.in    | 114 ++---
 configure      |  31
 configure.ac   |  17 +
 5 files changed, 121 insertions(+), 56 deletions(-)

diff --git ChangeLog.gomp ChangeLog.gomp
index c264057..46892a8 100644
--- ChangeLog.gomp
+++ ChangeLog.gomp
@@ -1,3 +1,12 @@
+2014-06-12  Thomas Schwinge
+
+	* configure.ac (--enable-target-gcc-configure-flags)
+	(--enable-accelerator-gcc-configure-flags): New configure options.
+	* Makefile.def (gcc, accel-gcc): Handle these as well as new
+	EXTRA_TARGET_GCC_FLAGS and EXTRA_ACCELERATOR_GCC_FLAGS make flags.
+	* configure: Regenerate.
+	* Makefile.in: Regenerate.
+
 2014-03-20  Bernd Schmidt

 	* Makefile.def (host_modules, dependencies): Add accel-gcc entries.
diff --git Makefile.def Makefile.def
index 89bfc07..e5fbd5c 100644
--- Makefile.def
+++ Makefile.def
@@ -44,10 +44,12 @@ host_modules= { module= fixincludes; bootstrap=true;
 host_modules= { module= flex; no_check_cross= true; };
 host_modules= { module= gas; bootstrap=true; };
 host_modules= { module= gcc; bootstrap=true;
-		extra_make_flags="$(EXTRA_GCC_FLAGS)"; };
+		extra_configure_flags='@extra_target_gcc_configure_flags@';
+		extra_make_flags="$(EXTRA_GCC_FLAGS) $(EXTRA_TARGET_GCC_FLAGS)"; };
 host_modules= { module= accel-gcc;
 		actual_module=gcc;
-		extra_configure_flags='--enable-as-accelerator-for=$(target_alias)'; };
+		extra_configure_flags='--enable-as-accelerator-for=$(target_alias) @extra_accelerator_gcc_configure_flags@';
+		extra_make_flags="$(EXTRA_ACCELERATOR_GCC_FLAGS)"; };
 host_modules= { module= gmp; lib_path=.libs; bootstrap=true;
 		extra_configure_flags='--disable-shared';
 		no_install= true;
diff --git Makefile.in Makefile.in
index 85ec2c2..9ad7a51 100644
--- Makefile.in
+++ Makefile.in
@@ -10075,7 +10075,7 @@ configure-gcc:
 	libsrcdir="$$s/gcc"; \
 	$(SHELL) $${libsrcdir}/configure \
 	  $(HOST_CONFIGARGS) --build=${build_alias} --host=${host_alias} \
-	  --target=$${this_target} $${srcdiroption} \
+	  --target=$${this_target} $${srcdiroption} @extra_target_gcc_configure_flags@ \
 	  || exit 1
 @endif gcc

@@ -10109,7 +10109,8 @@ configure-stage1-gcc:
 	$(SHELL) $${libsrcdir}/configure \
 	  $(HOST_CONFIGARGS) --build=${build_alias} --host=${host_alias} \
 	  --target=${target_alias} $${srcdiroption} \
-	  $(STAGE1_CONFIGURE_FLAGS)
+	  $(STAGE1_CONFIGURE_FLAGS) \
+	  @extra_target_gcc_configure_flags@
 @endif gcc-bootstrap

 .PHONY: configure-stage2-gcc maybe-configure-stage2-gcc
@@ -10142,7 +10143,8 @@ configure-stage2-gcc:
 	  $(HOST_CONFIGARGS) --build=${build_alias} --host=${host_alias} \
 	  --target=${target_alias} $${srcdiroption} \
 	  --with-build-libsubdir=$(HOST_SUBDIR) \
-	  $(STAGE2_CONFIGURE_FLAGS)
+	  $(STAGE2_CONFIGURE_FLAGS) \
+	  @extra_target_gcc_configure_flags@
 @endif gcc-bootstrap

 .PHONY: configure-stage3-gcc maybe-configure-stage3-gcc
@@ -10175,7 +10177,8 @@ configure-stage3-gcc:
 	  $(HOST_CONFIGARGS) --build=${build_alias} --host=${host_alias} \
 	  --target=${target_alias} $${srcdiroption} \
 	  --with-build-libsubdir=$(HOST_SUBDIR) \
-	  $(STAGE3_CONFIGURE_FLAGS)
+	  $(STAGE3_CONFIGURE_FLAGS) \
+	  @extra_target_gcc_configure_flags@
 @endif gcc-bootstrap

 .PHONY: configure-stage4-gcc maybe-configure-stage4-gcc
@@ -10208,7 +10211,8 @@ configure-stage4-gcc:
 	  $(HOST_CONFIGARGS) --build=${build_alias} --host=${host_alias} \
 	  --target=${target_alias} $${srcdiroption} \
 	  --with-build-libsubdir=$(HOST_SUBDIR) \
-	  $(STAGE4_CONFIGURE_FLAGS)
+	  $(STAGE4_CONFIGURE_FLAGS) \
+	  @extra_target_gcc_configure_flags@
 @endif gcc-bootstrap

 .PHONY: configure-stageprofile-gcc maybe-configure-stageprofile-gcc
@@ -10241,7 +10245,8 @@ configure-stageprofile-gcc:
 	  $(HOST_CONFIGARGS) --build=${build_alias} --host=${host_alias} \
 	  --target=${target_alias} $${srcdiroption} \
 	  --with-build-libsubdir=$(HOST_SUBDIR) \
-	  $(STAGEprofile_CONFIGURE_FLAGS)
+
[PATCH] Fix gennews
It seems the https transition broke referring to the permanently moved URL
gcc-3.0/gcc-3.0.html (I get a certificate error or some such), breaking
gennews and thus gcc_release.

Fixed as below, which makes gennews succeed.

Committed to the 4.7 branch.

Richard.

2014-06-12  Richard Biener

	* gennews: Use gcc-3.0/index.html.

Index: contrib/gennews
===
--- contrib/gennews (revision 211221)
+++ contrib/gennews (working copy)
@@ -36,7 +36,7 @@ files="
 gcc-3.3/index.html gcc-3.3/changes.html
 gcc-3.2/index.html gcc-3.2/changes.html
 gcc-3.1/index.html gcc-3.1/changes.html
-gcc-3.0/gcc-3.0.html gcc-3.0/features.html gcc-3.0/caveats.html
+gcc-3.0/index.html gcc-3.0/features.html gcc-3.0/caveats.html
 gcc-2.95/index.html gcc-2.95/features.html gcc-2.95/caveats.html
 egcs-1.1/index.html egcs-1.1/features.html egcs-1.1/caveats.html
 egcs-1.0/index.html egcs-1.0/features.html egcs-1.0/caveats.html"
GCC 4.7 branch is now closed
The GCC 4.7 branch is now closed, please refrain from committing anything there now. Richard.
[linaro/gcc-4_9-branch] Merge from gcc-4_9-branch and backports
Hi all,

we have merged the gcc-4_9-branch into linaro/gcc-4_9-branch up to
revision 211054 as r211495.

We have also backported this set of revisions:

r209419 as r211497 : PR rtl-optimization/60663
r209457 as r211496 : TRY_EMPTY_VM_SPACE Change aarch64 ilp32
r209559 as r211498 : [AArch64] vrnd<*>_f64 patch
r209561 as r211505 : Suppress Redundant Flag Setting for Cortex-A15.
r209613 as r211506 : AArch32 Support ORN for DIMode
r209614 as r211507 : Optimise NotDI AND/OR ZeroExtendSI for ARMv7A
r209615 as r211508 : [ARM] Allow any register for DImode values in Thumb2
r209617 as r211509 : [AArch64] Fix possible wrong code generation when comparing DImode values.
r209618 as r211511 : [AArch64] Add a space to memory asm code between base register and offset.
r209627 as r211512 : [AArch64] Fix indentation.
r209636 as r211512 : [AArch64] Fix aarch64_initial_elimination_offset calculation.
r209640 as r211514 : [AArch64] vqneg and vqabs intrinsics implementation.
r209641 as r211515 : [AArch64] Vreinterpret re-implemention.
r209642 as r211515 : [AArch64] 64-bit float vreinterpret implemention
r209643 as r211516 : [AArch64] Define TARGET_FLAGS_REGNUM
r209645 as r211517 : [AArch64] Fix TLS for ILP32.
r209649 as r211518 : Merge longlong.h from glibc tree.
r209659 as r211519 : AArch64 add, sub, mul in TImode
r209701 as r211520 : [ARM] Handle FMA code in rtx costs.
r209702 as r211520 : [ARM] Cortex-A8 rtx cost table
r209703 as r211520 : [ARM][1/3] Add rev field to rtx cost tables
r209704 as r211520 : [AArch64][2/3] Recognise rev16 operations on SImode and DImode data
r209705 as r211520 : [ARM][3/3] Recognise bitwise operations leading to SImode rev16
r209706 as r211521 : [AArch64] Add handling of bswap operations in rtx costs
r209710 as r211523 : [ARM] Initialize new tune_params values
r209711 as r211524 : [AArch64] Fully support rotate on logical operations.
r209712 as r211530 : [AARCH64] Use standard patterns for stack protection.
r209713 as r211560 : [AArch64] VDUP Testcases
r209736 as r211573 : [AArch64] Vectorise bswap[16,32,64]
r209742 as r211574 : [AArch64] Reverse TBL indices for big-endian.
r209747 as r211575 : Fix warning in libgfortran configure script
r209749 as r211574 : [AArch64] Enable TBL for big-endian.
r209806 as r211576 : [ARM] Initialise T16-related fields in Cortex-A8 tuning struct.
r209808 as r211577 : [ARM] Enable tail call optimization for long call
r209878 as r211578 : [AArch64] Relax modes_tieable_p and cannot_change_mode_class
r209880 as r211579 : [AArch64] Improve vst4_lane intrinsics
r209893 as r211580 : Add execution + assembler tests of the AArch64 ZIP Intrinsics.
r209897 as r211581 : Remove PUSH_ARGS_REVERSED from the RTL expander.
r209906 as r211582 : [AArch64/ARM 2/3] Rewrite AArch64 ZIP Intrinsics using __builtin_shuffle
r209908 as r211582 : Add execution tests of ARM ZIP Intrinsics.
r210615 as r211583 : libitm: Enable aarch64
r211211 as r211584 : [AARCH64]Support full addressing modes for ldr/str in vectorization scenarios

This will be part of our 2014.06 release.

Thanks,
Yvan
Re: ipa-visibility TLC 2/n
Rainer Orth writes:

> Hi Honza,
>
>>> Unfortunately, AIX isn't the only target massively affected by your
>>> recent patches.  This all started with r210597
>>>
>>> 2014-05-17  Jan Hubicka
>>>
>>>	* tree-pass.h (make_pass_ipa_comdats): New pass.
>>>	* timevar.def (TV_IPA_COMDATS): New timevar.
>>>	* passes.def (pass_ipa_comdats): Add.
>>>	* Makefile.in (OBJS): Add ipa-comdats.o
>>>	* ipa-comdats.c: New file.
>>>
>>> At that time, only Solaris 11 with gas/Solaris ld was affected: many Go
>>> tests started failing like this:
>>>
>>> runtime.SetFinalizer: cannot pass * os os.file to finalizer func(* os os.file) error
>>> fatal error: runtime.SetFinalizer
>>
>> Thanks for letting me know.  This is a different transformation than the one
>> causing trouble on AIX (AIX has no comdats, so this pass does nothing).  Go
>> seems to be quite a heavy user of comdat locals produced by that patch, so I
>> suppose they somehow break with Solaris.
>>
>> Comdat locals are now used by ipa-comdats, for thunks and for decloned ctors.
>> We probably need to figure out a bit more precisely the limitation of Solaris
>> and either fix it or add a way for the target to say what kind of comdat
>> locals are not supported.
>
> Right.  I'll start reghunting for the patch that caused additional
> breakage even without comdat, as on Solaris 10.

It turned out that those failures have been caused by the last libgo
merge, rev 211328: many 64-bit tests FAIL like this:

FAIL: go.go-torture/execute/chan-1.go execution, -O0
fatal error: all goroutines are asleep - deadlock!

goroutine 16 [chan send]:
created by main
	/vol/gcc/src/hg/trunk/local/libgo/runtime/go-main.c:42

There have been massive changes to libgo/runtime/chan.c, perhaps one of
them is the culprit.

It's hard to keep track with so much breakage these days ;-(

	Rainer

--
-
Rainer Orth, Center for Biotechnology, Bielefeld University
terse notation diagnostics
Adds additional checks and tests for ill-formed programs.

2014-06-12  Andrew Sutton

	* gcc/cp/parser.c (cp_check_type_concept): New.
	(cp_check_concept_name): Remove redundant condition from check.
	Diagnose misuse of non-type concepts in constrained type specifiers.
	* gcc/testuite/g++.dg/concepts/generic-fn.C: Add tests for
	non-simple constrained-type-specifiers and nested-name-specifiers
	in concept names.
	* gcc/testuite/g++.dg/concepts/generic-fn-err.C: New tests for
	diagnosing ill-formed programs.

Committed in r211585.

Andrew Sutton

Index: parser.c
===
--- parser.c (revision 211476)
+++ parser.c (working copy)
@@ -15132,11 +15132,22 @@ cp_parser_type_name (cp_parser* parser)
   return type_decl;
 }

-
+/// Returns true if proto is a type parameter, but not a template template
+/// parameter.
+static bool
+cp_check_type_concept (tree proto, tree fn)
+{
+  if (TREE_CODE (proto) != TYPE_DECL)
+    {
+      error ("invalid use of non-type concept %qD", fn);
+      return false;
+    }
+  return true;
+}

 // If DECL refers to a concept, return a TYPE_DECL representing the result
 // of using the constrained type specifier in the current context.
-
+//
 // DECL refers to a concept if
 // - it is an overload set containing a function concept taking a single
 //   type argument, or
@@ -15173,9 +15184,13 @@ cp_check_concept_name (cp_parser* parser

   // In template paramteer scope, this results in a constrained parameter.
   // Return a descriptor of that parm.
-  if (template_parm_scope_p () && processing_template_parmlist)
+  if (processing_template_parmlist)
     return build_constrained_parameter (proto, fn);

+  // In any other context, a concept must be a type concept.
+  if (!cp_check_type_concept (proto, fn))
+    return error_mark_node;
+
   // In a parameter-declaration-clause, constrained-type specifiers
   // result in invented template parameters.
   if (parser->auto_is_implicit_function_template_parm_p)
Index: generic-fn.C
===
--- generic-fn.C (revision 211476)
+++ generic-fn.C (working copy)
@@ -1,11 +1,16 @@
+// { dg-do run }
 // { dg-options "-std=c++1y" }

 #include
+#include

 template
   concept bool C() { return __is_class(T); }

-struct S { } s;
+template
+  concept bool Type() { return true; }
+
+struct S { };

 int called;

@@ -50,7 +55,43 @@ template
 };

+void ptr(C*) { called = 1; }
+void ptr(const C*) { called = 2; }
+
+void ref(C&) { called = 1; }
+void ref(const C&) { called = 2; }
+
+void
+fwd_lvalue_ref(Type&& x) {
+  using T = decltype(x);
+  static_assert(std::is_lvalue_reference::value, "not an lvlaue reference");
+}
+
+void
+fwd_const_lvalue_ref(Type&& x) {
+  using T = decltype(x);
+  static_assert(std::is_lvalue_reference::value, "not an lvalue reference");
+  using U = typename std::remove_reference::type;
+  static_assert(std::is_const::value, "not const-qualified");
+}
+
+void fwd_rvalue_ref(Type&& x) {
+  using T = decltype(x);
+  static_assert(std::is_rvalue_reference::value, "not an rvalue reference");
+}
+
+// Make sure we can use nested names speicifers for concept names.
+namespace N {
+  template
+    concept bool C() { return true; }
+} // namesspace N
+
+void foo(N::C x) { }
+
 int main() {
+  S s;
+  const S cs;
+
   f(0); assert(called == 1);
   g(s); assert(called == 2);
@@ -60,7 +101,6 @@ int main() {
   S1 s1;
   s1.f1(0); assert(called == 1);
   s1.f2(s); assert(called == 2);
-  // s1.f2(0); // Error
   s1.f3(0); assert(called == 1);
   s1.f3(s); assert(called == 2);

@@ -68,26 +108,35 @@ int main() {
   S2 s2;
   s2.f1(0); assert(called == 1);
   s2.f2(s); assert(called == 2);
-  // s2.f2(0); // Error
   s2.f3(0); assert(called == 1);
   s2.f3(s); assert(called == 2);

   s2.h1(0); assert(called == 1);
   s2.h2(s); assert(called == 2);
-  // s2.h2(0); // Error
   s2.h3(0); assert(called == 1);
   s2.h3(s); assert(called == 2);

   s2.g1(s, s); assert(called == 1);
-  // s2.g(s, 0); // Error
-  // s2.g(0, s); // Error
-
   s2.g2(s, s); assert(called == 2);
-  // s2.g(s, 0); // Error
+
+  ptr(&s); assert(called == 1);
+  ptr(&cs); assert(called == 2);
+
+  ref(s); assert(called == 1);
+  ref(cs); assert(called == 2);
+
+  // Check forwarding problems
+  fwd_lvalue_ref(s);
+  fwd_const_lvalue_ref(cs);
+  fwd_rvalue_ref(S());
+
+  foo(0);
 }

+// Test that decl/def matching works.
+
 void p(auto x) { called = 1; }
 void p(C x) { called = 2; }

Index: generic-fn-err.C
===
--- generic-fn-err.C (revision 0)
+++ generic-fn-err.C (revision 0)
@@ -0,0 +1,51 @@
+// { dg-options "-std=c++1y" }
+
+#include
+
+template
+  concept bool C() { return __is_class(T); }
+
+template
+  concept bool Int() { return true; }
+
+template class X>
+  concept bool Template() { return true; }
[PATCH][AArch64] Add predicate for storewb_pair/loadwb_pair
This patch adds a predicate for storewb_pair/loadwb_pair, because the
AArch64 register-pair push and pop instructions only accept a constant
offset within a certain range.

OK for trunk?

Thanks.

gcc/ChangeLog:

2014-06-12  Renlin Li

	* config/aarch64/aarch64.c (offset_7bit_signed_scaled_p): Rename to
	aarch64_offset_7bit_signed_scaled_p, remove static and use it.
	* config/aarch64/aarch64-protos.h
	(aarch64_offset_7bit_signed_scaled_p): New declaration.
	* config/aarch64/predicates.md (aarch64_mem_pair_offset): New predicate.
	* config/aarch64/aarch64.md (loadwb_pair): Use aarch64_mem_pair_offset.
	(storewb_pair): Likewise.

diff --git a/gcc/config/aarch64/aarch64-protos.h b/gcc/config/aarch64/aarch64-protos.h
index 68d488d..d39ecc5 100644
--- a/gcc/config/aarch64/aarch64-protos.h
+++ b/gcc/config/aarch64/aarch64-protos.h
@@ -193,6 +193,7 @@ bool aarch64_modes_tieable_p (enum machine_mode mode1,
 bool aarch64_move_imm (HOST_WIDE_INT, enum machine_mode);
 bool aarch64_mov_operand_p (rtx, enum aarch64_symbol_context,
                             enum machine_mode);
+bool aarch64_offset_7bit_signed_scaled_p (enum machine_mode, HOST_WIDE_INT);
 char *aarch64_output_scalar_simd_mov_immediate (rtx, enum machine_mode);
 char *aarch64_output_simd_mov_immediate (rtx, enum machine_mode, unsigned);
 bool aarch64_pad_arg_upward (enum machine_mode, const_tree);
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index f69457a..192caf4 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -3122,8 +3122,9 @@ aarch64_classify_index (struct aarch64_address_info *info, rtx x,
   return false;
 }

-static inline bool
-offset_7bit_signed_scaled_p (enum machine_mode mode, HOST_WIDE_INT offset)
+bool
+aarch64_offset_7bit_signed_scaled_p (enum machine_mode mode,
+                                     HOST_WIDE_INT offset)
 {
   return (offset >= -64 * GET_MODE_SIZE (mode)
           && offset < 64 * GET_MODE_SIZE (mode)
@@ -3195,12 +3196,12 @@ aarch64_classify_address (struct aarch64_address_info *info,
              We conservatively require an offset representable in either mode.
            */
           if (mode == TImode || mode == TFmode)
-            return (offset_7bit_signed_scaled_p (mode, offset)
+            return (aarch64_offset_7bit_signed_scaled_p (mode, offset)
                     && offset_9bit_signed_unscaled_p (mode, offset));

           if (outer_code == PARALLEL)
             return ((GET_MODE_SIZE (mode) == 4 || GET_MODE_SIZE (mode) == 8)
-                    && offset_7bit_signed_scaled_p (mode, offset));
+                    && aarch64_offset_7bit_signed_scaled_p (mode, offset));
           else
             return (offset_9bit_signed_unscaled_p (mode, offset)
                     || offset_12bit_unsigned_scaled_p (mode, offset));
@@ -3255,12 +3256,12 @@ aarch64_classify_address (struct aarch64_address_info *info,
              We conservatively require an offset representable in either mode.
            */
           if (mode == TImode || mode == TFmode)
-            return (offset_7bit_signed_scaled_p (mode, offset)
+            return (aarch64_offset_7bit_signed_scaled_p (mode, offset)
                     && offset_9bit_signed_unscaled_p (mode, offset));

           if (outer_code == PARALLEL)
             return ((GET_MODE_SIZE (mode) == 4 || GET_MODE_SIZE (mode) == 8)
-                    && offset_7bit_signed_scaled_p (mode, offset));
+                    && aarch64_offset_7bit_signed_scaled_p (mode, offset));
           else
             return offset_9bit_signed_unscaled_p (mode, offset);
         }
diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index fec2ea8..e15747f 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -949,7 +949,7 @@
   [(parallel
     [(set (match_operand:P 0 "register_operand" "=k")
           (plus:P (match_operand:P 1 "register_operand" "0")
-                  (match_operand:P 4 "const_int_operand" "n")))
+                  (match_operand:P 4 "aarch64_mem_pair_offset" "n")))
      (set (match_operand:GPI 2 "register_operand" "=r")
           (mem:GPI (plus:P (match_dup 1)
                    (match_dup 4
@@ -967,7 +967,7 @@
   [(parallel
     [(set (match_operand:P 0 "register_operand" "=&k")
           (plus:P (match_operand:P 1 "register_operand" "0")
-                  (match_operand:P 4 "const_int_operand" "n")))
+                  (match_operand:P 4 "aarch64_mem_pair_offset" "n")))
      (set (mem:GPI (plus:P (match_dup 0)
                    (match_dup 4)))
           (match_operand:GPI 2 "register_operand" "r"))
diff --git a/gcc/config/aarch64/predicates.md b/gcc/config/aarch64/predicates.md
index 2702a3c..478de11 100644
--- a/gcc/config/aarch64/predicates.md
+++ b/gcc/config/aarch64/predicates.md
@@ -123,6 +123,10 @@
   (match_test "INTVAL (op) != 0
                && (unsigned) exact_log2 (INTVAL (op)) < 64")))

+(define_predicate "aarch64_mem_pair_offset"
+  (and (match_code "const_int")
+       (match_test "aarch64_offset_7bit_signed_scaled_p (mode, INTVAL (op))")))
+
 (define_predicate "aarch64_mem_pair_operand"
   (and (match_code "mem")
        (match_test "aarch64_legitimate_address_p (mode, XEXP (op, 0), PARALLEL,
Re: [Patch ARM/testsuite 00/22] Neon intrinsics executable tests
On 12 June 2014 04:31, Mike Stump wrote:
> On Jun 10, 2014, at 3:03 PM, Ramana Radhakrishnan wrote:
>> I am a bit ambivalent between getting folks to add scan-assembler
>> tests here and worrying between this and getting the behaviour
>> correct.  Additionally if you add the complexity of scanning for
>> aarch64 as well this starts getting messy.
>>
>> At this point I'm going to wait to see if any of the testsuite
>> maintainers step in and comment and if not I'll start looking at this
>> properly early next week.
>
> [ ducks ] So, I wasn’t going to comment… If you guys do something really
> stupid, I’ll scream, as hopefully will others.  Doing something a little
> misguided I don’t think hurts much.  The worst case if you figure out in a
> year or two why it was a bad idea and then fix it, not the end of the world.

If the execution part is OK and the scan-assembler is questionable, I
can just remove that part (or leave it commented until we decide
otherwise).
Re: [PING*2][PATCH] Extend mode-switching to support toggle (1/2)
On 06/11/2014 02:00 PM, Christian Bruel wrote:
> On 06/11/2014 06:17 AM, Joern Rennecke wrote:
>>> Joern, is this new target macro interface OK with you ?
>> Yes, this interface should allow me to do switches between rounding
>> and truncating floating-point modes with an add/subtract immediate.
>>
>> However, the implementation, as posted, doesn't work - it causes memory
>> corruption.
>>
>> It appears to work with the attached amendment patch.
>>
> Indeed, thanks for pointing out the bad reusing of the aux field
> between multiple entities.
>
> In fact rereading this part of the implementation, I find the allocation
> of aux*n_entities awkward.  A simpler setting in the entity loop to carry
> the mode directly into eg->aux is possible without array allocation
> (which also fixes a memory leak by the way).
>

Here is the revised version fixing the aforementioned issue found by
Joern on Epiphany.  It also simplifies the allocation of the aux edges
field to carry the modes.

Now that everyone agrees on the interface, is this OK for trunk ?

Bootstrapped/regtested for X86 and SH4a.

thanks,

Christian

2014-06-12  Christian Bruel

	* mode-switching.c (struct bb_info): Add mode_out, mode_in caches.
	(make_preds_opaque): Delete.
	(clear_mode_bit, mode_bit_p, set_mode_bit): New macros.
	(commit_mode_sets): New function.
	(optimize_mode_switching): Handle current_mode to mode_switching_emit.
	Process all modes at once.
	* basic-block.h (pre_edge_lcm_avs): Declare.
	* lcm.c (pre_edge_lcm_avs): Renamed from pre_edge_lcm.
	Call clear_aux_for_edges.  Fix comments.
	(pre_edge_lcm): New wrapper function to call pre_edge_lcm_avs.
	(pre_edge_rev_lcm): Idem.
	* config/epiphany/epiphany.c (emit_set_fp_mode): Add prev_mode parameter.
	* config/epiphany/epiphany-protos.h (emit_set_fp_mode): Idem.
	* config/epiphany/resolve-sw-modes.c (pass_resolve_sw_modes::execute): Idem.
	* config/i386/i386.c (x96_emit_mode_set): Idem.
	* config/sh/sh.c (sh_emit_mode_set): Likewise.  Handle PR toggle.
	* config/sh/sh.md (toggle_pr): Defined if TARGET_FPU_SINGLE.
	(fpscr_toggle) Disallow from delay slot.
	* target.def (emit_mode_set): Add prev_mode parameter.
	* doc/tm.texi: Regenerate.

2014-06-12  Christian Bruel

	* gcc.target/sh/fpchg.c: New test.

Index: gcc/basic-block.h
===
--- gcc/basic-block.h (revision 211436)
+++ gcc/basic-block.h (working copy)
@@ -711,6 +711,9 @@ extern void bitmap_union_of_preds (sbitmap, sbitma
 extern struct edge_list *pre_edge_lcm (int, sbitmap *, sbitmap *,
                                        sbitmap *, sbitmap *, sbitmap **,
                                        sbitmap **);
+extern struct edge_list *pre_edge_lcm_avs (int, sbitmap *, sbitmap *,
+                                           sbitmap *, sbitmap *, sbitmap *,
+                                           sbitmap *, sbitmap **, sbitmap **);
 extern struct edge_list *pre_edge_rev_lcm (int, sbitmap *, sbitmap *,
                                            sbitmap *, sbitmap *,
                                            sbitmap **,
Index: gcc/config/epiphany/epiphany-protos.h
===
--- gcc/config/epiphany/epiphany-protos.h (revision 211436)
+++ gcc/config/epiphany/epiphany-protos.h (working copy)
@@ -40,7 +40,8 @@ extern int epiphany_initial_elimination_offset (in
 extern void epiphany_init_expanders (void);
 extern int hard_regno_mode_ok (int regno, enum machine_mode mode);
 #ifdef HARD_CONST
-extern void emit_set_fp_mode (int entity, int mode, HARD_REG_SET regs_live);
+extern void emit_set_fp_mode (int entity, int mode, int prev_mode,
+                              HARD_REG_SET regs_live);
 #endif
 extern void epiphany_insert_mode_switch_use (rtx insn, int, int);
 extern void epiphany_expand_set_fp_mode (rtx *operands);
Index: gcc/config/epiphany/epiphany.c
===
--- gcc/config/epiphany/epiphany.c (revision 211436)
+++ gcc/config/epiphany/epiphany.c (working copy)
@@ -2543,7 +2543,8 @@ epiphany_mode_exit (int entity)
 }

 void
-emit_set_fp_mode (int entity, int mode, HARD_REG_SET regs_live ATTRIBUTE_UNUSED)
+emit_set_fp_mode (int entity, int mode, int prev_mode ATTRIBUTE_UNUSED,
+                  HARD_REG_SET regs_live ATTRIBUTE_UNUSED)
 {
   rtx save_cc, cc_reg, mask, src, src2;
   enum attr_fp_mode fp_mode;
Index: gcc/config/epiphany/resolve-sw-modes.c
===
--- gcc/config/epiphany/resolve-sw-modes.c (revision 211436)
+++ gcc/config/epiphany/resolve-sw-modes.c (working copy)
@@ -170,7 +170,7 @@ pass_resolve_sw_modes::execute (function *fun)
             }
           start_sequence ();
           emit_set_fp_mode (EPIPHANY_MSW_ENTITY_ROUND_UNKNOWN,
-                            jilted_mode, NULL);
+                            jilted_mode, FP_MODE_NONE, NULL);
           seq = get_insns ();
           end_sequence ();
           need_commit = true;
Index: gcc/config/i386/i386.c
===
--- gcc/config/i386/i386.c (revision 211436)
+++ gcc/config/i386/i386.c (working copy)
@@ -16447,7 +16447,8 @@ ix86_avx_emit_vzeroupper (HARD_REG_SET re
Re: [PING][PATCH, trunk, 4.9, 4.8] Fix PR57653, filename information discarded when using -imacros
On 12 June 2014 01:23, Peter Bergner wrote:
> On Wed, 2014-06-11 at 23:07 +, Joseph S. Myers wrote:
>> On Wed, 11 Jun 2014, Peter Bergner wrote:
>>
>> > I'd like to ping the following patch that fixes PR57653.  This did
>> > bootstrap and regtest with no regressions on powerpc64-linux.
>> >
>> > https://gcc.gnu.org/ml/gcc-patches/2014-04/msg01571.html
>> >
>> > Is this ok for trunk, 4.9 and 4.8?
>>
>> I think the code change is correct, but the comment added needs expanding
>> to explain better what's going on (i.e. the circumstances in which the
>> condition include_cursor > deferred_count may hold, and why, in those
>> circumstances, returning early is the correct thing to do).
>
> Manuel, can you offer an updated comment?  Being just the patch
> tester and not knowing this code at all, I'm not going to be of
> much use at expanding Manuel's original comment.

It has been a long time, and it seems that even at the time I proposed
the patch, I had no idea why it worked:

"For some reason unknown to me, push_commandline_include should not be
called while processing -imacros. -imacros tries to achieve this by
playing tricks with include_cursor, but that doesn't stop the
pre-included files. Calling cpp_push_include (or
cpp_push_default_include) seems to mess up everything (again, no idea
why!)."

The long explanation is that -imacros triggers:

  /* Handle -imacros after -D and -U.  */
  for (i = 0; i < deferred_count; i++)
    {
      struct deferred_opt *opt = &deferred_opts[i];

      if (opt->code == OPT_imacros
          && cpp_push_include (parse_in, opt->arg))
        {
          /* Disable push_command_line_include callback for now.  */
          include_cursor = deferred_count + 1;
          cpp_scan_nooutput (parse_in);
        }
    }

Then push_command_line_include is roughly as follows:

/* Give CPP the next file given by -include, if any.  */
static void
push_command_line_include (void)
{
  if (!done_preinclude)
    {
      done_preinclude = true;
      if (flag_hosted && std_inc && !cpp_opts->preprocessed)
        {
          const char *preinc = targetcm.c_preinclude ();
          if (preinc && cpp_push_default_include (parse_in, preinc))
            return;
        }
    }

  pch_cpp_save_state ();

  while (include_cursor < deferred_count)
    {
      [...]
    }

  if (include_cursor == deferred_count)
    {
      [...]
    }
}

That is, when -imacros is given, push_command_line_include still calls
cpp_push_default_include, which messes up everything.  Why or how, I
have no idea.  Someone else will need to investigate more.

I don't think the patch is the best possible approach.  I think it
would be better to push the default includes as soon as it is
reasonably possible (before/after processing -imacros?), instead of
relying on push_command_line_include.  Also, if
push_command_line_include needs to be disabled for -imacros, it would
be clearer to have a file-local boolean for this purpose.

The way push_command_line_include is called is a mystery to me:

  if (new_map == 0 || (new_map->reason == LC_LEAVE && MAIN_FILE_P (new_map)))
    {
      pch_cpp_save_state ();
      push_command_line_include ();
    }

Why is it called when leaving the main file at cb_file_change?

Also, c_finish_options has at the end:

  include_cursor = 0;
  push_command_line_include ();

but shouldn't this happen instead when (or before) the main file is entered?

Cheers,

Manuel.
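To make the control flow discussed above easier to follow, here is a small toy model in C. It is only a sketch and not GCC's code: the struct, the `pushed_preinclude` counter and the `early_return` switch are hypothetical, and only the pre-include step of push_command_line_include is modelled. The guard `include_cursor > deferred_count` is the condition Joseph mentions for the proposed early return.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: the field names mirror the variables in the mail, but
   the struct and the pushed_preinclude counter are hypothetical.  */
struct include_state
{
  int include_cursor;
  int deferred_count;
  bool done_preinclude;
  int pushed_preinclude;   /* Times the default pre-include was pushed.  */
};

/* What the -imacros loop does to silence the callback.  */
static void
disable_for_imacros (struct include_state *s)
{
  s->include_cursor = s->deferred_count + 1;
}

/* Reduced push_command_line_include.  EARLY_RETURN selects the variant
   that bails out while -imacros processing has the callback disabled.  */
static void
push_command_line_include (struct include_state *s, bool early_return)
{
  if (early_return && s->include_cursor > s->deferred_count)
    return;

  if (!s->done_preinclude)
    {
      s->done_preinclude = true;
      s->pushed_preinclude++;   /* Stands in for cpp_push_default_include.  */
    }
}
```

In this model, the unguarded variant still pushes the pre-include after `disable_for_imacros`, which matches the behaviour described in the mail, while the guarded variant defers it until `include_cursor` is reset to 0, as c_finish_options does.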
Re: RFA: speeding up dg-extract-results.sh
On 05/25/2014 11:35 AM, Richard Sandiford wrote:
> Bernd Schmidt writes:
>> On 02/13/2014 10:18 AM, Richard Sandiford wrote:
>>> contrib/
>>>	* dg-extract-results.py: New file.
>>>	* dg-extract-results.sh: Use it if the environment seems suitable.
>>
>> I'm now seeing the following:
>>
>> Traceback (most recent call last):
>>   File "../../git/gcc/../contrib/dg-extract-results.py", line 581, in
>>     Prog().main()
>>   File "../../git/gcc/../contrib/dg-extract-results.py", line 569, in main
>>     self.output_tool (self.runs[name])
>>   File "../../git/gcc/../contrib/dg-extract-results.py", line 534, in output_tool
>>     self.output_variation (tool, variation)
>>   File "../../git/gcc/../contrib/dg-extract-results.py", line 483, in output_variation
>>     for harness in sorted (variation.harnesses.values()):
>> TypeError: unorderable types: HarnessRun() < HarnessRun()
>>
>> $ /usr/bin/python --version
>> Python 3.3.3
>
> Sorry, thought I'd tested it with python3, but obviously not.
> I've applied the fix below after testing that it didn't change the
> output for python 2.6 and python 2.7.

I've recently been trying to add ada to my set of tested languages, and
I now encounter the following:

Traceback (most recent call last):
  File "../../git/gcc/../contrib/dg-extract-results.py", line 580, in
    Prog().main()
  File "../../git/gcc/../contrib/dg-extract-results.py", line 544, in main
    self.parse_file (filename, file)
  File "../../git/gcc/../contrib/dg-extract-results.py", line 427, in parse_file
    self.parse_acats_run (filename, file)
  File "../../git/gcc/../contrib/dg-extract-results.py", line 342, in parse_acats_run
    self.parse_run (filename, file, tool, variation, 1)
  File "../../git/gcc/../contrib/dg-extract-results.py", line 242, in parse_run
    line = file.readline()
  File "/usr/lib64/python3.3/codecs.py", line 301, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe1 in position 5227: invalid continuation byte

Bernd
Re: [RFC] Teaching SCC merging about unit local trees
>
> That is, have a tree_may_be_mergeable_p (), call it during the DFS
> walk storing it alongside the visited edges and thus obtain a result
> for each SCC, stream that as a flag (a special hash value is ugly,
> but well ... I guess it works).  The important part is to make an SCC
> !tree_may_be_mergeable_p () if any of the outgoing edges from an SCC
> are !tree_may_be_mergeable_p ().  You seem to miss this.

This is what I am trying to do by the hashing.  scc_hash is now 1 for any
SCC that refers to an SCC with hash 1, so non-mergeability propagates.

Honza
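The propagation described here can be sketched with a toy model in C. This is not the LTO streamer's code: the `struct scc` layout, the `is_local` flag and the function name are hypothetical, and it assumes referenced SCCs are hashed before the SCCs that refer to them. Bumping an ordinary hash that happens to equal the reserved value is also an assumption added for illustration, not something stated in the thread.

```c
#include <assert.h>
#include <stddef.h>

#define LOCAL_HASH 1u   /* Reserved value: "this SCC is unit-local".  */

/* Hypothetical, simplified view of an SCC in the stream.  */
struct scc
{
  unsigned hash;            /* Final hash, filled in once computed.  */
  int is_local;             /* Nonzero if it contains a unit-local tree.  */
  size_t n_succs;
  const struct scc **succs; /* Already-hashed SCCs this one refers to.  */
};

/* Return the hash to store for S, given the hash computed from its
   contents.  Locality wins over the structural hash and is inherited
   from any referenced SCC.  */
static unsigned
scc_hash_with_locality (const struct scc *s, unsigned structural_hash)
{
  if (s->is_local)
    return LOCAL_HASH;

  for (size_t i = 0; i < s->n_succs; i++)
    if (s->succs[i]->hash == LOCAL_HASH)
      return LOCAL_HASH;

  /* Assumed: keep ordinary hashes away from the reserved value.  */
  return structural_hash == LOCAL_HASH ? structural_hash + 1 : structural_hash;
}
```

With this shape, an SCC that is mergeable on its own still ends up with the reserved hash as soon as it refers to a local SCC, which is the transitive behaviour Richard asks for.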
[C++ Patch] PR 33101
Hi,

in this old bug Ian complained that the diagnostic we provide for:

typedef void v;
typedef v (*pf)(v);

is rather unfriendly, especially for people coming from C:

33101.C:2:17: error: ‘’ has incomplete type
33101.C:2:18: error: invalid use of ‘v’

thus Gaby (and Ian) suggested something along the lines of what I
propose below.  Today I also noticed that some front ends deal
specially with cv-qualified void, so I added that case too; for
everything else, a generic 'void' message is good enough, I think.

Thanks,
Paolo.

//

/cp
2014-06-12  Paolo Carlini

	PR c++/33101
	* decl.c (grokparms): Improve error message about void parameters.

/testsuite
2014-06-12  Paolo Carlini

	PR c++/33101
	* g++.dg/other/void3.C: New.
	* g++.dg/conversion/err-recover1.C: Update.

Index: cp/decl.c
===
--- cp/decl.c (revision 211574)
+++ cp/decl.c (working copy)
@@ -11161,10 +11161,25 @@ grokparms (tree parmlist, tree *parms)
         {
           if (same_type_p (type, void_type_node)
               && DECL_SELF_REFERENCE_P (type)
-              && !DECL_NAME (decl) && !result && TREE_CHAIN (parm) == void_list_node)
+              && !DECL_NAME (decl) && !result
+              && TREE_CHAIN (parm) == void_list_node)
             /* this is a parmlist of `(void)', which is ok. */
             break;
-          cxx_incomplete_type_error (decl, type);
+          else if (!cv_qualified_p (type)
+                   && !DECL_SELF_REFERENCE_P (type)
+                   && !DECL_NAME (decl) && !result
+                   && TREE_CHAIN (parm) == void_list_node)
+            error_at (DECL_SOURCE_LOCATION (decl),
+                      "invalid use of typedef-name for type "
+                      "% in parameter declaration");
+          else if (cv_qualified_p (type))
+            error_at (DECL_SOURCE_LOCATION (decl),
+                      "invalid use of cv-qualified type % "
+                      "in parameter declaration");
+          else
+            error_at (DECL_SOURCE_LOCATION (decl),
+                      "invalid use of type % in parameter "
+                      "declaration");
           /* It's not a good idea to actually create parameters of
              type `void'; other parts of the compiler assume that a
              void type terminates the parameter list. */
Index: testsuite/g++.dg/conversion/err-recover1.C
===
--- testsuite/g++.dg/conversion/err-recover1.C (revision 211574)
+++ testsuite/g++.dg/conversion/err-recover1.C (working copy)
@@ -1,6 +1,6 @@
 // PR c++/42219

-void foo(const void); // { dg-error "incomplete|const" }
+void foo(const void); // { dg-error "invalid use of cv-qualified" }

 void bar()
 {
Index: testsuite/g++.dg/other/void3.C
===
--- testsuite/g++.dg/other/void3.C (revision 0)
+++ testsuite/g++.dg/other/void3.C (working copy)
@@ -0,0 +1,4 @@
+// PR c++/33101
+
+typedef void v;
+typedef v (*pf)(v); // { dg-error "invalid use of typedef-name" }
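As background for the "people coming from C" remark, the following sketch shows the same two declarations used from C, where (to my understanding of the C rules for a single unnamed parameter of type void) the typedef-name is accepted and `pf` is simply a pointer to a function taking no parameters. The helper names `bump` and `call_through` are hypothetical, and the snippet is meant to be built with a C compiler, not with g++.

```c
#include <assert.h>

typedef void v;
typedef v (*pf)(v);   /* In C: pointer to function with no parameters.  */

static int calls;

/* A function whose type matches pf.  */
static void
bump (void)
{
  calls++;
}

/* Call F once and report how many calls have been made so far.  */
static int
call_through (pf f)
{
  f ();
  return calls;
}
```

The C++ front end rejects the second typedef, which is the case the new "invalid use of typedef-name" wording in the patch is aimed at.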
Re: [patch i386]: Combine memory and indirect jump
Hello,

I updated the i386.md part of the patch.  The initial patch included
handling of blockage, which is obviously superfluous.  Additionally I
merged the 32-bit and 64-bit peephole2 versions by using the
mode-specifier W.

ChangeLog

2014-06-12  Kai Tietz

	* config/i386/i386.md (peephole2): To combine
	indirect jump with memory.

2014-06-12  Kai Tietz

	* gcc.target/i386/indjmp-1.c: New test.

Tested for i686-pc-cygwin, and x86_64-unknown-linux-gnu.  Ok for apply?

With the addition of a second peephole2 pass after the sched2 pass, I
was able to get some improvement for PR target/39284.  I think with
this addition we can close the bug as fixed.
Additionally, the additional peephole2 pass shows better results for the
PR target/51840 testcase with disabled ASM_GOTO, too.

2014-06-12  Kai Tietz

	PR target/39284
	* passes.def (pass_peephole2): Add second peephole2
	run after sched2 pass.

Tested for i686-pc-cygwin, and x86_64-unknown-linux-gnu.  Ok for apply?

Regards,
Kai

Index: testsuite/gcc.target/i386/indjmp-1.c
===
--- testsuite/gcc.target/i386/indjmp-1.c(Revision 0)
+++ testsuite/gcc.target/i386/indjmp-1.c(Arbeitskopie)
@@ -0,0 +1,23 @@
+/* { dg-do compile { target ia32 } } */
+/* { dg-options "-O2" } */
+
+#define ADVANCE_AND_DISPATCH() goto *addresses[*pc++]
+
+void
+Interpret(const unsigned char *pc)
+{
+    static const void *const addresses[] = {
+      &&l0, &&l1, &&l2
+    };
+
+l0:
+    ADVANCE_AND_DISPATCH();
+
+l1:
+    ADVANCE_AND_DISPATCH();
+
+l2:
+    return;
+}
+
+/* { dg-final { scan-assembler-not "jmp\[ \t\]*.%eax" } } */
Index: config/i386/i386.md
===
--- config/i386/i386.md(Revision 211489)
+++ config/i386/i386.md(Arbeitskopie)
@@ -11471,6 +11471,15 @@
               (match_dup 3))
          (set (reg:SI SP_REG) (match_dup 4))])])

+;; Combining simple memory jump instruction
+
+(define_peephole2
+  [(set (match_operand:W 0 "register_operand")
+        (match_operand:W 1 "memory_nox32_operand"))
+   (set (pc) (match_dup 0))]
+  "peep2_reg_dead_p (2, operands[0])"
+  [(set (pc) (match_dup 1))])
+
 ;; Call subroutine, returning value in operand 0

 (define_expand "call_value"
Index: passes.def
===
--- passes.def(Revision 211489)
+++ passes.def(Arbeitskopie)
@@ -396,6 +396,7 @@ along with GCC; see the file COPYING3. If not see
           NEXT_PASS (pass_leaf_regs);
           NEXT_PASS (pass_split_before_sched2);
           NEXT_PASS (pass_sched2);
+          NEXT_PASS (pass_peephole2);
           NEXT_PASS (pass_stack_regs);
           PUSH_INSERT_PASSES_WITHIN (pass_stack_regs)
               NEXT_PASS (pass_split_before_regstack);
Re: [RFC] Teaching SCC merging about unit local trees
> On Thu, Jun 12, 2014 at 10:47 AM, Jan Hubicka wrote:
> > Richard,
> > as briefly discussed before, I would like to teach LTO type merging to not
> > merge types that were declared in anonymous namespaces and use C++ ODR type
> > names (stored in DECL_ASSEMBLER_NAME of the TYPE_DECL) to break down
> > canonical types by their names.
> >
> > First thing I need to arrange IMO is to not merge two anonymous types from
> > two different units. While looking into it I noticed that the current code
> > in unify_scc that refuses to merge local decls produces conflicts and seems
> > a useless exercise to do.
> >
> > This patch introduces special hash code 1 that specifies that a given SCC is
> > known to be local and should bypass the merging logic. This is propagated
> > down and seems to quite noticeably reduce the size of the SCC hash:
> >
> > [WPA] read 10190717 SCCs of average size 1.980409
> > [WPA] 20181785 tree bodies read in total
> > [WPA] tree SCC table: size 4194301, 1882700 elements, collision ratio: 0.815497
> > [WPA] tree SCC max chain length 140 (size 1)
> > [WPA] Compared 3392363 SCCs, 2718822 collisions (0.801454)
> > [WPA] Merged 3314075 SCCs
> > [WPA] Merged 9693632 tree bodies
> > [WPA] Merged 2467704 types
> > [WPA] 1783262 types prevailed (4491218 associated trees)
> > [WPA] GIMPLE canonical type table: size 131071, 94867 elements, 1783347 searches, 737056 collisions (ratio: 0.413299)
> > [WPA] GIMPLE canonical type pointer-map: 94867 elements, 3973875 searches
> > [WPA] Compression: 282828785 input bytes, 831186147 uncompressed bytes (ratio: 2.938832)
> > [WPA] Size of mmap'd section decls: 282828785 bytes
> >
> > to:
> >
> > [WPA] read 10172291 SCCs of average size 1.982162
> > [WPA] 20163124 tree bodies read in total
> > [WPA] tree SCC table: size 2097143, 988764 elements, collision ratio: 0.684967
> > [WPA] tree SCC max chain length 140 (size 1)
> > [WPA] Compared 3060932 SCCs, 2405009 collisions (0.785711)
> > [WPA] Merged 3040565 SCCs
> > [WPA] Merged 9246482 tree bodies
> > [WPA] Merged 2382312 types
> > [WPA] 1868611 types prevailed (4728465 associated trees)
> > [WPA] GIMPLE canonical type table: size 131071, 94910 elements, 1868696 searches, 790939 collisions (ratio: 0.423257)
> > [WPA] GIMPLE canonical type pointer-map: 94910 elements, 4216423 searches
> > [WPA] Compression: 273322455 input bytes, 824178095 uncompressed bytes (ratio: 3.015406)
> >
> > We merge less, but not by much, and I think we were not right to merge in
> > those cases.
>
> If we merge things we may not merge then the fix is to compare_tree_sccs_1,
> not introducing special cases like you propose.

What I was looking for was to decide at streaming time what can be merged, instead of doing it at merging time, which is more expensive and causes SCC hash conflicts (because the hashes are the same).

>
> That is, if we are not allowed to merge anonymous namespaces then
> make sure we don't. We already should not merge types with
> TYPE_CONTEXT == such namespace by means of
>
>       /* ???  Global types from different TUs have non-matching
>          TRANSLATION_UNIT_DECLs.  Still merge them if they are otherwise
>          equal.  */
>       if (TYPE_FILE_SCOPE_P (t1) && TYPE_FILE_SCOPE_P (t2))
>         ;
>       else
>         compare_tree_edges (TYPE_CONTEXT (t1), TYPE_CONTEXT (t2));
>
> but we possibly merge a subset of decl kinds from "different" namespaces:
>
>       /* ???  Global decls from different TUs have non-matching
>          TRANSLATION_UNIT_DECLs.  Only consider a small set of
>          decls equivalent, we should not end up merging others.  */
>       if ((code == TYPE_DECL
>            || code == NAMESPACE_DECL
>            || code == IMPORTED_DECL
>            || code == CONST_DECL
>            || (VAR_OR_FUNCTION_DECL_P (t1)
>                && (TREE_PUBLIC (t1) || DECL_EXTERNAL (t1))))
>           && DECL_FILE_SCOPE_P (t1) && DECL_FILE_SCOPE_P (t2))
>         ;
>       else
>         compare_tree_edges (DECL_CONTEXT (t1), DECL_CONTEXT (t2));
>
> Not sure what we end up doing for NAMESPACE_DECL itself (and what
> fields we stream for it). It would be interesting to check that.
>
> Thus, make sure we don't merge namespace {} and namespace {} from
> two different units.
>
> But effectively you say we have two classes of "global" trees, first
> those that are mergeable across TUs and second those that are not.

Yes, we have global trees that are not mergeable (because they are local to a TU by nature). We also have trees in function sections that IMO get hash information just to be ignored at ltrans streaming time (we also stream at WPA, but I do not see how merging in these helps either).

> This IMHO means we want to separate those to two different LTO
> sections and simply skip all the merging code for the second (instead
> of adding hacks to the merging code).

We could move them to a different section, though indicating in hash whether the scc is mergea
Re: RFA: speeding up dg-extract-results.sh
On Jun 12, 2014, at 8:53 AM, Bernd Schmidt wrote:
> I've recently been trying to add ada to my set of tested languages, and I now
> encounter the following:
>
>   File "../../git/gcc/../contrib/dg-extract-results.py", line 242, in parse_run
>     line = file.readline()
>   File "/usr/lib64/python3.3/codecs.py", line 301, in decode
>     (result, consumed) = self._buffer_decode(data, self.errors, final)
> UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe1 in position 5227: invalid continuation byte

In the old skool world, these are byte sequences that end in ‘\n’… no decoding errors are possible… well, maybe one, if you tried to put ‘\0’ in the stream. :-(

Maybe a LANG/LC type person can suggest an environment variable to set that would make things happier, else we’re down to a python person to solve from that side. My knee jerk would be LANG=C for the entire test suite run...
Re: [Patch ARM/testsuite 00/22] Neon intrinsics executable tests
On Jun 12, 2014, at 7:26 AM, Christophe Lyon wrote:
> On 12 June 2014 04:31, Mike Stump wrote:
>> On Jun 10, 2014, at 3:03 PM, Ramana Radhakrishnan wrote:
>>> At this point I'm going to wait to see if any of the testsuite
>>> maintainers step in and comment
>>
>> [ ducks ] So, I wasn’t going to comment… If you guys do something really
>> stupid, I’ll scream, as hopefully will others. Doing something a little
>> misguided I don’t think hurts much. The worst case is you figure out in a
>> year or two why it was a bad idea and then fix it, not the end of the world.
>
> If the execution part is OK and the scan-assembler is questionable, I
> can just remove that part (or leave it commented until we decide
> otherwise).

Don’t read my comment as stating that scanning is questionable. In fact, scanning is slightly better, as one can see the results on a cross easier and faster… for example, when someone wants to study a regression they caused and they don’t have the target, they can build to cc1 and then run the test case by hand and see what the scan issues are. If it were an executable test case, they would have to puzzle out why the test case is different and understand what they are reading (they might not be familiar with the target).
[PATCH, AArch64, PR 61483] builtin va_start incorrectly initializes the field of va_list for incoming unnamed arguments on the stack
Hi,

The patch fixes a bug in the AArch64 backend in calculating the beginning address of the unnamed incoming arguments on the stack, i.e. the initial value of __va_list->__stack.

aarch64_layout_arg incorrectly calculates the size of named arguments on the stack using the number of registers needed, as if there were enough registers available. This is wrong: for instance, when passed in registers an HFA/HVA* argument takes as many SIMD registers as the number of its fields; when passed on the stack, however, it should be passed according to its storage layout (rounded up to a multiple of 8 bytes).

The bug only affects builtin va_start, because it is other routines, such as aarch64_pad_arg_upward, rather than aarch64_layout_arg that take care of positioning outgoing arguments on the stack and fetching incoming named arguments from the stack.

The patch has passed bootstrapping. OK for the trunk and 4.9.1 branch once the regtest passes as well?

Thanks,
Yufeng

* HFA: Homogeneous Floating-point Aggregate
  HVA: Homogeneous Short-Vector Aggregate

gcc/

    PR target/61483
    * config/aarch64/aarch64.c (aarch64_layout_arg): Add new local
    variable 'size'; calculate 'size' right in the front; use
    'size' to compute 'nregs' (when 'allocate_ncrn != 0') and
    pcum->aapcs_stack_words.

gcc/testsuite/

    PR target/61483
    * gcc.target/aarch64/aapcs64/type-def.h (struct hfa_fx3_t): New type.
    * gcc.target/aarch64/aapcs64/va_arg-13.c: New test.
    * gcc.target/aarch64/aapcs64/va_arg-14.c: Ditto.
    * gcc.target/aarch64/aapcs64/va_arg-15.c: Ditto.

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index fabd6a9..56a5a5d 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -1459,6 +1459,7 @@ aarch64_layout_arg (cumulative_args_t pcum_v, enum machine_mode mode,
   CUMULATIVE_ARGS *pcum = get_cumulative_args (pcum_v);
   int ncrn, nvrn, nregs;
   bool allocate_ncrn, allocate_nvrn;
+  HOST_WIDE_INT size;
 
   /* We need to do this once per argument.  */
   if (pcum->aapcs_arg_processed)
@@ -1466,6 +1467,11 @@ aarch64_layout_arg (cumulative_args_t pcum_v, enum machine_mode mode,
 
   pcum->aapcs_arg_processed = true;
 
+  /* Size in bytes, rounded to the nearest multiple of 8 bytes.  */
+  size
+    = AARCH64_ROUND_UP (type ? int_size_in_bytes (type) : GET_MODE_SIZE (mode),
+                        UNITS_PER_WORD);
+
   allocate_ncrn = (type) ? !(FLOAT_TYPE_P (type)) : !FLOAT_MODE_P (mode);
   allocate_nvrn = aarch64_vfp_is_call_candidate (pcum_v,
                                                  mode,
@@ -1516,9 +1522,7 @@ aarch64_layout_arg (cumulative_args_t pcum_v, enum machine_mode mode,
     }
 
   ncrn = pcum->aapcs_ncrn;
-  nregs = ((type ? int_size_in_bytes (type) : GET_MODE_SIZE (mode))
-           + UNITS_PER_WORD - 1) / UNITS_PER_WORD;
-
+  nregs = size / UNITS_PER_WORD;
 
   /* C6 - C9.  though the sign and zero extension semantics are
      handled elsewhere.  This is the case where the argument fits
@@ -1567,13 +1571,12 @@ aarch64_layout_arg (cumulative_args_t pcum_v, enum machine_mode mode,
   pcum->aapcs_nextncrn = NUM_ARG_REGS;
 
   /* The argument is passed on stack; record the needed number of words for
-     this argument (we can re-use NREGS) and align the total size if
-     necessary.  */
+     this argument and align the total size if necessary.  */
 on_stack:
-  pcum->aapcs_stack_words = nregs;
+  pcum->aapcs_stack_words = size / UNITS_PER_WORD;
   if (aarch64_function_arg_alignment (mode, type) == 16 * BITS_PER_UNIT)
     pcum->aapcs_stack_size = AARCH64_ROUND_UP (pcum->aapcs_stack_size,
-                                               16 / UNITS_PER_WORD) + 1;
+                                               16 / UNITS_PER_WORD);
   return;
 }
 
diff --git a/gcc/testsuite/gcc.target/aarch64/aapcs64/type-def.h b/gcc/testsuite/gcc.target/aarch64/aapcs64/type-def.h
index a95d06a..07e56ff 100644
--- a/gcc/testsuite/gcc.target/aarch64/aapcs64/type-def.h
+++ b/gcc/testsuite/gcc.target/aarch64/aapcs64/type-def.h
@@ -34,6 +34,13 @@ struct hfa_fx2_t
   float b;
 };
 
+struct hfa_fx3_t
+{
+  float a;
+  float b;
+  float c;
+};
+
 struct hfa_dx2_t
 {
   double a;
diff --git a/gcc/testsuite/gcc.target/aarch64/aapcs64/va_arg-13.c b/gcc/testsuite/gcc.target/aarch64/aapcs64/va_arg-13.c
new file mode 100644
index 000..27c4099
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/aapcs64/va_arg-13.c
@@ -0,0 +1,53 @@
+/* Test AAPCS64 layout and __builtin_va_start.
+
+   Pass named HFA/HVA argument on stack.  */
+
+/* { dg-do run { target aarch64*-*-* } } */
+
+#ifndef IN_FRAMEWORK
+#define AAPCS64_TEST_STDARG
+#define TESTFILE "va_arg-13.c"
+
+struct float_float_t
+{
+  float a;
+  float b;
+} float_float;
+
+union float_int_t
+{
+  float b8;
+  int b5;
+} float_int;
+
+#define HAS_DATA_INIT_FUNC
+void
+init_data ()
+{
+  float_float.a = 1.2f;
+  float_float.b = 2.2f;
+
+  float_int.
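
To make the register-count versus storage-size difference concrete, here is a minimal sketch in plain C, not GCC internals. It assumes 8-byte stack words as on AArch64; the names round_up, stack_words and struct hfa_fx3 are made up for illustration and only mirror what AARCH64_ROUND_UP and hfa_fx3_t do in the patch. An HFA of three floats would occupy three SIMD registers, one per field, but its storage is 12 bytes, which rounds up to 16 bytes, i.e. two stack words rather than three.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed word size in bytes; stands in for UNITS_PER_WORD on AArch64. */
#define WORD_BYTES 8

/* Round x up to the next multiple of align (align must be a power of two). */
static size_t round_up(size_t x, size_t align)
{
    assert((align & (align - 1)) == 0);
    return (x + align - 1) & ~(align - 1);
}

/* Hypothetical HFA with three float fields, like hfa_fx3_t in the test. */
struct hfa_fx3 {
    float a;
    float b;
    float c;
};

/* Stack words needed for an argument, derived from its storage size
   rather than from the number of registers it would have used. */
static size_t stack_words(size_t size_in_bytes)
{
    return round_up(size_in_bytes, WORD_BYTES) / WORD_BYTES;
}
```

With this model, stack_words(sizeof(struct hfa_fx3)) is two, whereas counting one slot per field would give three; that mismatch is the kind of error that ends up in the initial __stack value.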
[PR tree-optimization/61009] Follow-up to fix incorrect return value
It was reported that mysql was failing its testsuite due to a regex routine being mis-compiled on the ppc and s390 platforms. Upon investigation it was found that the fix for PR61009 was incomplete. The fix for 61009 changed thread_through_normal_block to return a tri-state with negative values indicating the block was not threadable, even for a joiner. That situation occurs when we do not process all the statements in the block (for example, the block is too big for threading). When we fail to process all the statements, then we will fail to properly invalidate entries in the equivalence tables which can result in incorrect transformations when threading across a loop backedge. 61009 detected the "block too big case", but missed the case when problematical PHIs are detected. This patch fixes that oversight. Bootstrapped and regression tested on x86_64-unknown-linux-gnu. It fixes the short-circuited loop in mysql for s390 (by inspection) and the mysql testsuite passes on ppc using 4.9 with this addition to the original 61009 patch backported. Installed on the trunk. Will install onto 4.9 branch shortly. commit 5de2d4cf14b882066026745fb1b1019561daac12 Author: Jeff Law Date: Thu Jun 12 10:13:16 2014 -0600 PR tree-optimization/61009 * tree-ssa-threadedge.c (thread_through_normal_block): Correct return value when we stop processing a block due to problematic PHIs. diff --git a/gcc/ChangeLog b/gcc/ChangeLog index a42b94d..d68262f 100644 --- a/gcc/ChangeLog +++ b/gcc/ChangeLog @@ -1,3 +1,9 @@ +2014-06-12 Jeff Law + +PR tree-optimization/61009 + * tree-ssa-threadedge.c (thread_through_normal_block): Correct return + value when we stop processing a block due to problematic PHIs. 
+ 2014-06-12 Alan Lawrence * config/aarch64/arm_neon.h (vmlaq_n_f64, vmlsq_n_f64, vrsrtsq_f64, diff --git a/gcc/tree-ssa-threadedge.c b/gcc/tree-ssa-threadedge.c index ba9e1fe..a76a7ce 100644 --- a/gcc/tree-ssa-threadedge.c +++ b/gcc/tree-ssa-threadedge.c @@ -948,9 +948,12 @@ thread_through_normal_block (edge e, if (*backedge_seen_p) simplify = dummy_simplify; - /* PHIs create temporary equivalences. */ + /* PHIs create temporary equivalences. + Note that if we found a PHI that made the block non-threadable, then + we need to bubble that up to our caller in the same manner we do + when we prematurely stop processing statements below. */ if (!record_temporary_equivalences_from_phis (e, stack)) -return 0; +return -1; /* Now walk each statement recording any context sensitive temporary equivalences we can detect. */
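
As a rough illustration of the tri-state convention described above, the following sketch models only the return-value contract, not the real thread_through_normal_block. The enum, the function names and the boolean inputs are all hypothetical; the point is that a block whose PHIs or statements could not be fully processed must report a negative value so that the caller does not fall back to threading it as a joiner.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical result codes mirroring the tri-state return value. */
enum thread_result {
    THREAD_UNSAFE = -1, /* block not fully analysed: no threading, not even as a joiner */
    THREAD_NONE = 0,    /* no path found, but a joiner may still be attempted */
    THREAD_FOUND = 1    /* a jump threading path was found */
};

/* Simplified stand-in for the analysis of one block. */
static enum thread_result
analyze_block(bool phis_ok, bool all_stmts_processed, bool target_known)
{
    if (!phis_ok)
        return THREAD_UNSAFE;   /* the case the follow-up patch corrects */
    if (!all_stmts_processed)
        return THREAD_UNSAFE;   /* the "block too big" case */
    return target_known ? THREAD_FOUND : THREAD_NONE;
}

/* The caller may only try the block as a joiner on a plain "not found". */
static bool may_try_joiner(enum thread_result r)
{
    assert(r >= THREAD_UNSAFE && r <= THREAD_FOUND);
    return r == THREAD_NONE;
}
```

Returning 0 for the problematic-PHI case, as the code did before this patch, would correspond to THREAD_NONE here and would wrongly allow the joiner attempt.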
config/vxworks-dummy.h on arm
Hi!

Seems http://gcc.gnu.org/r197156 effectively reverted the PR45078 fix for arm*-linux* (where unfortunately tm_file is always overridden). Was the removal of vxworks-dummy.h from that line intentional or just some mistake? Seems one can't build gcc plugins on arm because of this, because arm.h includes vxworks-dummy.h.

Jakub
[Google] Fix AFDO early inline ICEs due to DFE
These two patches fix multiple ICE that occurred due to DFE being recently enabled after AutoFDO LIPO linking. Passes regression and internal testing. Ok for Google/4_8? Teresa 2014-06-12 Teresa Johnson Dehao Chen Google ref b/15521327. * cgraphclones.c (cgraph_clone_edge): Use resolved node. * l-ipo.c (resolve_cgraph_node): Resolve to non-removable node. Index: cgraphclones.c === --- cgraphclones.c (revision 211386) +++ cgraphclones.c (working copy) @@ -94,6 +94,7 @@ along with GCC; see the file COPYING3. If not see #include "ipa-utils.h" #include "lto-streamer.h" #include "except.h" +#include "l-ipo.h" /* Create clone of E in the node N represented by CALL_EXPR the callgraph. */ struct cgraph_edge * @@ -118,7 +119,11 @@ cgraph_clone_edge (struct cgraph_edge *e, struct c if (call_stmt && (decl = gimple_call_fndecl (call_stmt))) { - struct cgraph_node *callee = cgraph_get_node (decl); + struct cgraph_node *callee; + if (L_IPO_COMP_MODE && cgraph_pre_profiling_inlining_done) +callee = cgraph_lipo_get_resolved_node (decl); + else +callee = cgraph_get_node (decl); gcc_checking_assert (callee); new_edge = cgraph_create_edge (n, callee, call_stmt, count, freq); } Index: l-ipo.c === --- l-ipo.c (revision 211386) +++ l-ipo.c (working copy) @@ -1542,6 +1542,18 @@ resolve_cgraph_node (struct cgraph_sym **slot, str gcc_assert (decl1_defined); add_define_module (*slot, decl2); + /* Pick the node that cannot be removed, to avoid a situation + where we remove the resolved node and later try to access + it for the remaining non-removable copy. E.g. one may be + extern and the other weak, only the extern copy can be removed. 
*/ + if (cgraph_can_remove_if_no_direct_calls_and_refs_p ((*slot)->rep_node) + && !cgraph_can_remove_if_no_direct_calls_and_refs_p (node)) +{ + (*slot)->rep_node = node; + (*slot)->rep_decl = decl2; + return; +} + has_prof1 = has_profile_info (decl1); bool is_aux1 = cgraph_is_auxiliary (decl1); bool is_aux2 = cgraph_is_auxiliary (decl2); -- Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413
Re: Fix a function decl in gfortran
A bit belated, I have now committed the patch as Rev. 211587. Thanks for confirming that it now works!

Tobias

Bernd Schmidt wrote:

On 06/04/2014 10:36 PM, Tobias Burnus wrote:

Bernd Schmidt wrote:

Even with this applied, I'm still seeing similar failures.

I didn't claim that the patch would fix everything – nor that it was well tested.

Just wanted to report back since the problem doesn't really show up on normal targets.

Can you try the attached version? The change is that I now properly use "se->ignore_optional" to test whether absent optional arguments should be skipped - rather than using this morning's ad-hoc solution of doing so unconditionally. Additionally, the patch has now survived stage2 building – which is more testing than I could do this morning.

This seems to work. Thanks!

Bernd
Re: [patch i386]: Combine memory and indirect jump
On Thu, Jun 12, 2014 at 06:21:32PM +0200, Kai Tietz wrote:
> with addition of adding a second peephole2 pass after sched2 pass, I
> was able to get some improvement for PR target/39284. I think by this
> addition we can close bug as fixed.
> Additionally additional peephole2 pass shows better results for PR
> target/51840 testcase with disabled ASM_GOTO, too.

Will that work on other targets? Also, it needs a doc fix (md.texi says peephole2 runs before scheduling).

Segher
Re: [PATCH, PR61446] Fix mode for register copy in REE pass
On 06/10/14 01:42, Ilya Enkovich wrote:
> Hi,
>
> This patch fixes PR61446. The problem appears when we insert value copies
> after transformations. We use the widest extension mode met in a chain, but
> it may be wider than the original destination register size. This patch
> checks for that and uses a smaller mode if required.
>
> Bootstrapped and tested on linux-x86_64.
>
> Does it look OK?
>
> Thanks,
> Ilya
> --
> 2014-06-09  Ilya Enkovich
>
>     PR 61446
>     * ree.c (find_and_remove_re): Narrow mode for register copy
>     if required.

That seems wrong. Something should have rejected this earlier. Let me take a looksie.

jeff
Re: RFC: C++ PATCH to remove -fabi-version=1 support
On 06/09/2014 04:46 PM, Jason Merrill wrote: I'm updating -Wabi to allow for warnings about changes between a previous ABI version and the currently selected one, and rather than adjust all the warnings for -fabi-version=1 I'd like to tear it out. Here's a revised patch that I'm checking in. commit 1df8225f7661bcd683bb7dd5924cc09668473bad Author: Jason Merrill Date: Fri Jun 6 14:01:58 2014 -0400 gcc/ * toplev.c (process_options): Reject -fabi-version=1. gcc/cp/ * call.c (build_operator_new_call): Remove -fabi-version=1 support. * class.c (walk_subobject_offsets, include_empty_classes): Likewise. (layout_nonempty_base_or_field, end_of_class): Likewise. (layout_empty_base, build_base_field, layout_class_type): Likewise. (is_empty_class, add_vcall_offset_vtbl_entries_1): Likewise. (layout_virtual_bases): Likewise. * decl.c (compute_array_index_type): Likewise. * mangle.c (write_mangled_name, write_prefix): Likewise. (write_template_prefix, write_integer_cst, write_expression): Likewise. (write_template_arg, write_array_type): Likewise. * method.c (lazily_declare_fn): Likewise. * rtti.c (get_pseudo_ti_index): Likewise. * typeck.c (comp_array_types): Likewise. diff --git a/gcc/common.opt b/gcc/common.opt index 5c3f834..f61fab5 100644 --- a/gcc/common.opt +++ b/gcc/common.opt @@ -776,7 +776,7 @@ Driver Undocumented ;Therefore, 0 will not necessarily indicate the same ABI in different ;versions of G++. ; -; 1: The version of the ABI first used in G++ 3.2. +; 1: The version of the ABI first used in G++ 3.2. No longer selectable. ; ; 2: The version of the ABI first used in G++ 3.4 (and current default). 
; diff --git a/gcc/cp/call.c b/gcc/cp/call.c index 75a6a4a..ac14ce2 100644 --- a/gcc/cp/call.c +++ b/gcc/cp/call.c @@ -4130,29 +4130,17 @@ build_operator_new_call (tree fnname, vec **args, if (*cookie_size) { bool use_cookie = true; - if (!abi_version_at_least (2)) - { - /* In G++ 3.2, the check was implemented incorrectly; it - looked at the placement expression, rather than the - type of the function. */ - if ((*args)->length () == 2 - && same_type_p (TREE_TYPE ((**args)[1]), ptr_type_node)) - use_cookie = false; - } - else - { - tree arg_types; + tree arg_types; - arg_types = TYPE_ARG_TYPES (TREE_TYPE (cand->fn)); - /* Skip the size_t parameter. */ - arg_types = TREE_CHAIN (arg_types); - /* Check the remaining parameters (if any). */ - if (arg_types - && TREE_CHAIN (arg_types) == void_list_node - && same_type_p (TREE_VALUE (arg_types), - ptr_type_node)) - use_cookie = false; - } + arg_types = TYPE_ARG_TYPES (TREE_TYPE (cand->fn)); + /* Skip the size_t parameter. */ + arg_types = TREE_CHAIN (arg_types); + /* Check the remaining parameters (if any). */ + if (arg_types + && TREE_CHAIN (arg_types) == void_list_node + && same_type_p (TREE_VALUE (arg_types), + ptr_type_node)) + use_cookie = false; /* If we need a cookie, adjust the number of bytes allocated. 
*/ if (use_cookie) { diff --git a/gcc/cp/class.c b/gcc/cp/class.c index 25fc89b..a96b360 100644 --- a/gcc/cp/class.c +++ b/gcc/cp/class.c @@ -3820,8 +3820,7 @@ walk_subobject_offsets (tree type, if (!TYPE_P (type)) { - if (abi_version_at_least (2)) - type_binfo = type; + type_binfo = type; type = BINFO_TYPE (type); } @@ -3847,43 +3846,29 @@ walk_subobject_offsets (tree type, { tree binfo_offset; - if (abi_version_at_least (2) - && BINFO_VIRTUAL_P (binfo)) + if (BINFO_VIRTUAL_P (binfo)) continue; - if (!vbases_p - && BINFO_VIRTUAL_P (binfo) - && !BINFO_PRIMARY_P (binfo)) - continue; - - if (!abi_version_at_least (2)) - binfo_offset = size_binop (PLUS_EXPR, - offset, - BINFO_OFFSET (binfo)); - else - { - tree orig_binfo; - /* We cannot rely on BINFO_OFFSET being set for the base - class yet, but the offsets for direct non-virtual - bases can be calculated by going back to the TYPE. */ - orig_binfo = BINFO_BASE_BINFO (TYPE_BINFO (type), i); - binfo_offset = size_binop (PLUS_EXPR, - offset, - BINFO_OFFSET (orig_binfo)); - } + tree orig_binfo; + /* We cannot rely on BINFO_OFFSET being set for the base + class yet, but the offsets for direct non-virtual + bases can be calculated by going back to the TYPE. */ + orig_binfo = BINFO_BASE_BINFO (TYPE_BINFO (type), i); + binfo_offset = size_binop (PLUS_EXPR, + offset, + BINFO_OFFSET (orig_binfo)); r = walk_subobject_offsets (binfo, f, binfo_offset, offsets, max_offset, - (abi_version_at_least (2) - ? /*vbases_p=*/0 : vbases_p)); + /*vbases_p=*/0); if (r) return r; } -
Re: [C++ Patch] PR 33101
... in terms of code proper, the below is much better, IMHO. Assuming, as I understand, we have no reason to call the rather heavy same_type_p when we already know that VOID_TYPE_P (type) is true...

Thanks,
Paolo.

//

Index: cp/decl.c
===
--- cp/decl.c (revision 211574)
+++ cp/decl.c (working copy)
@@ -11159,12 +11159,25 @@ grokparms (tree parmlist, tree *parms)
       type = TREE_TYPE (decl);
       if (VOID_TYPE_P (type))
         {
-          if (same_type_p (type, void_type_node)
-              && DECL_SELF_REFERENCE_P (type)
-              && !DECL_NAME (decl) && !result && TREE_CHAIN (parm) == void_list_node)
+          bool cond = (!cv_qualified_p (type)
+                       && !DECL_NAME (decl) && !result
+                       && TREE_CHAIN (parm) == void_list_node);
+          if (cond
+              && DECL_SELF_REFERENCE_P (type))
             /* this is a parmlist of `(void)', which is ok.  */
             break;
-          cxx_incomplete_type_error (decl, type);
+          else if (cond)
+            error_at (DECL_SOURCE_LOCATION (decl),
+                      "invalid use of typedef-name for type "
+                      "%<void%> in parameter declaration");
+          else if (cv_qualified_p (type))
+            error_at (DECL_SOURCE_LOCATION (decl),
+                      "invalid use of cv-qualified type %<void%> "
+                      "in parameter declaration");
+          else
+            error_at (DECL_SOURCE_LOCATION (decl),
+                      "invalid use of type %<void%> in parameter "
+                      "declaration");
           /* It's not a good idea to actually create parameters of
              type `void'; other parts of the compiler assume that a
              void type terminates the parameter list.  */
Index: testsuite/g++.dg/conversion/err-recover1.C
===
--- testsuite/g++.dg/conversion/err-recover1.C (revision 211574)
+++ testsuite/g++.dg/conversion/err-recover1.C (working copy)
@@ -1,6 +1,6 @@
 // PR c++/42219
 
-void foo(const void); // { dg-error "incomplete|const" }
+void foo(const void); // { dg-error "invalid use of cv-qualified" }
 
 void bar()
 {
Index: testsuite/g++.dg/other/void3.C
===
--- testsuite/g++.dg/other/void3.C (revision 0)
+++ testsuite/g++.dg/other/void3.C (working copy)
@@ -0,0 +1,4 @@
+// PR c++/33101
+
+typedef void v;
+typedef v (*pf)(v); // { dg-error "invalid use of typedef-name" }
PATCH to change -fabi-version default to 0
I talked about doing this in 4.9 (https://gcc.gnu.org/ml/gcc/2013-03/msg8.html), but decided to put it off along with the libstdc++ ABI transition. I think it's time now.

Tested x86_64-pc-linux-gnu, applying to trunk.

commit a2aa0efcd1f27e85a4c652f5177c66686f530a96
Author: Jason Merrill
Date:   Mon Jun 9 16:37:43 2014 -0400

    * common.opt (fabi-version): Change default to 0.

diff --git a/gcc/common.opt b/gcc/common.opt
index f61fab5..7f05092 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -808,7 +808,8 @@ Driver Undocumented
 ; Additional positive integers will be assigned as new versions of
 ; the ABI become the default version of the ABI.
 fabi-version=
-Common Joined RejectNegative UInteger Var(flag_abi_version) Init(2)
+Common Joined RejectNegative UInteger Var(flag_abi_version) Init(0)
+The version of the C++ ABI in use
 
 faggressive-loop-optimizations
 Common Report Var(flag_aggressive_loop_optimizations) Optimization Init(1)
Re: [PATCH 8/8] Add a common .md file and define standard constraints there
On Thu, Jun 05, 2014 at 10:43:25PM +0100, Richard Sandiford wrote: > This final patch uses a common .md file to define all standard > constraints except 'g'. I had a look at what targets still use "g". Note: there can be errors in this, it's all based on \
Re: PATCH to change -fabi-version default to 0
On Jun 12, 2014, at 12:17 PM, Jason Merrill wrote:
> I talked about doing this in 4.9
> (https://gcc.gnu.org/ml/gcc/2013-03/msg8.html), but decided to put it off
> along with the libstdc++ ABI transition. I think it's time now.

Is a doc change needed?

> @opindex fabi-version
> Use version @var{n} of the C++ ABI@. The default is version 2.
Re: [PATCH 8/8] Add a common .md file and define standard constraints there
On Jun 12, 2014, at 3:24 PM, Segher Boessenkool wrote:
> On Thu, Jun 05, 2014 at 10:43:25PM +0100, Richard Sandiford wrote:
>> This final patch uses a common .md file to define all standard
>> constraints except 'g'.
>
> I had a look at what targets still use "g". Note: there can be
> errors in this, it's all based on \
> * frv and mcore use "g" in commented-out patterns;
> * cr16, mcore, picochip, rl78, and sh use "g" where they mean "rm"
>   or "m";
> * m68k uses it (in a dbne pattern) where the C template splits
>   the "r", "m", "i" cases again;
> * bfin, fr30, h8300, m68k, rs6000, and v850 use it as the second
>   operand (# bytes pushed) of the call patterns; that operand is
>   unused in all these cases, could just be "";
> * cris, m68k, pdp11, and vax actually use "g".
>
> So it won't be all that much work to completely get rid of "g".
> Do we want that?

Is it simply a matter of replacing “g” by “mri”? That’s what the doc suggests. Or is there more to the story than that?

paul
Re: PATCH to change -fabi-version default to 0
How does this affect pr60732? Dominique
Re: [PATCH, PR61446] Fix mode for register copy in REE pass
On 06/10/14 01:42, Ilya Enkovich wrote:
> Hi,
>
> This patch fixes PR61446. The problem appears when we insert value copies
> after transformations. We use the widest extension mode met in a chain, but
> it may be wider than the original destination register size. This patch
> checks for that and uses a smaller mode if required.
>
> Bootstrapped and tested on linux-x86_64.
>
> Does it look OK?
>
> Thanks,
> Ilya
> --
> 2014-06-09  Ilya Enkovich
>
>     PR 61446
>     * ree.c (find_and_remove_re): Narrow mode for register copy
>     if required.

The whole point behind the 61094 change was to avoid this kind of issue, i.e., before eliminating an extension which requires a copy, make sure the copy is going to be valid (a single insn that is recognizable and satisfies its constraints). If the copy is not going to be valid, then suppress the extension elimination.

It's not working as desired because of a relatively simple goof.

When I wrote the changes for 61094, I copied the code which created the new insns from find_and_remove_re into combine_reaching_defs -- the idea being we want to generate the same insn in combine_reaching_defs that will be generated in find_and_remove_re. In combine_reaching_defs we generate, validate & throw it away. In find_and_remove_re we generate it and insert it into the insn stream.

The subtle issue I missed was that in find_and_remove_re we have already transformed the defining insn, i.e., the destination of the defining insn is in the widened mode. That is _not_ the case in combine_reaching_defs.

So combine_reaching_defs is not testing the same insn that will be created by find_and_remove_re. The insns have the same structure, but the modes of the operands are different.

For 61094, that little difference was not important. It *is* important for 61446.

Thankfully the fix is trivial, and I've confirmed that 61094 stays fixed and that it fixes 61446. Going through the bootstrap & regression process now.

Jeff
Re: [PATCH][RFC] Fix PR61473, inline small memcpy/memmove during tree opts
On 06/12/14 04:12, Richard Biener wrote:
> This implements the requested inlining of memmove for possibly
> overlapping arguments by doing first all loads and then all stores.
> The easiest place is to do this in memory op folding where we already
> perform inlining of some memcpy cases (but fail to do the equivalent
> memcpy optimization - though RTL expansion later does it).
>
> The following patch restricts us to max. word-mode size. Ideally
> we'd have a way to check for the number of real instructions needed
> to load an (aligned) value of size N. But maybe we don't care
> and are fine with doing multiple loads / stores?
>
> Anyway, the following is conservative (but maybe not enough).
>
> Bootstrap / regtest running on x86_64-unknown-linux-gnu.
>
> These transforms don't really belong to GENERIC folding (they
> also run at -O0 ...), similar to most builtin foldings. But this
> patch is not to change that.
>
> Any comments on the size/cost issue?

I recall seeing something in one of the BZ databases that asked for double-word to be expanded inline. Presumably the reporter's code did lots of double-word things of this nature. Obviously someone else might want quad-word and so on. However, double words seem like a very reasonable request.

jeff
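
For readers unfamiliar with the trick, the "first all loads, then all stores" idea can be sketched in ordinary C. This is only an illustration of why the approach is safe for overlapping buffers, not the GIMPLE folding from the patch; move8 and small_move are hypothetical names, and the 16-byte limit is an arbitrary stand-in for the double-word case discussed above.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Overlap-safe move of exactly 8 bytes: the whole source is read into a
   temporary before any byte of the destination is written. */
static void move8(void *dst, const void *src)
{
    uint64_t tmp;
    memcpy(&tmp, src, sizeof tmp);  /* all loads */
    memcpy(dst, &tmp, sizeof tmp);  /* then all stores */
}

/* Same idea for any small size up to 16 bytes (an assumed limit). */
static void small_move(void *dst, const void *src, size_t n)
{
    unsigned char tmp[16];
    assert(n <= sizeof tmp);
    memcpy(tmp, src, n);            /* all loads */
    memcpy(dst, tmp, n);            /* then all stores */
}
```

Because nothing is stored until everything has been loaded, the copy direction no longer matters, which is what lets a small memmove be expanded the same way as a small memcpy. A compiler would typically turn the fixed-size memcpy calls into plain register loads and stores.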
Re: [patch i386]: Combine memory and indirect jump
2014-06-12 20:52 GMT+02:00 Segher Boessenkool:
> On Thu, Jun 12, 2014 at 06:21:32PM +0200, Kai Tietz wrote:
>> with addition of adding a second peephole2 pass after sched2 pass, I
>> was able to get some improvement for PR target/39284. I think by this
>> addition we can close bug as fixed.
>> Additionally additional peephole2 pass shows better results for PR
>> target/51840 testcase with disabled ASM_GOTO, too.

Well, this is the only point I am a bit concerned about, too. In general I wouldn't expect any issues from running peephole after scheduling, as peephole doesn't do anything that would require a new run of ira/lra. Anyway, it would be good if a global maintainer could comment on that.

> Will that work on other targets? Also, it needs a doc fix (md.texi
> says peephole2 runs before scheduling).

Thanks for pointing that out. When I send the patch for this additional peephole pass with a testcase, I will adjust md.texi.

>
> Segher
partial-concept-ids
Add support for partial concept ids. Mostly this just refactors the basic support for concept names to also allow a template and extra arguments. Also added the missing .exp file for the test suite. 2014-06-12 Andrew Sutton * gcc/cp/constraint.cc (deduce_constrained_parameter): Refactor common deduction framework into separate function. (build_call_check): New. (build_concept_check): Take additional arguments to support the creation of constrained-type-specifiers from partial-concept-ids. (build_constrained_parameter): Take arguments from a partial-concept-id. * gcc/cp/cp-tree.h (build_concept_check, biuld_constrained_parameter): Take a template argument list, defaulting to NULL_TREE. * gcc/cp/parser.c (cp_parser_template_id): Check to see if a template-id is a concept check. (cp_check_type_concept): Reorder arguments (cp_parser_allows_constrained_type_specifier): New. Check contexts where a constrained-type-specifier is allowed. (cp_maybe_constrained_type_specifier): New. Refactored common rules for concept name checks. (cp_maybe_partial_concept_id): New. Check for constrained-type-specifiers. * gcc/testuite/g++.dg/concepts/partial.C: New tests. * gcc/testuite/g++.dg/concepts/partial-err.C: New tests. * gcc/testuite/g++.dg/concepts/concepts.exp: Add missing test driver. Andrew Sutton Index: parser.c === --- parser.c (revision 211585) +++ parser.c (working copy) @@ -2523,7 +2523,10 @@ static tree cp_parser_make_typename_type static cp_declarator * cp_parser_make_indirect_declarator (enum tree_code, tree, cp_cv_quals, cp_declarator *, tree); +/* Concept-related syntactic transformations */ +static tree cp_maybe_concept_name (cp_parser *, tree); +static tree cp_maybe_partial_concept_id (cp_parser *, tree, tree); // -- // // Unevaluated Operand Guard @@ -13775,6 +13778,11 @@ cp_parser_template_id (cp_parser *parser || TREE_CODE (templ) == OVERLOAD || BASELINK_P (templ))); + // If the template + args designate a concept, then return + // something else. 
+ if (tree id = cp_maybe_partial_concept_id (parser, templ, arguments)) +return id; + template_id = lookup_template_function (templ, arguments); } @@ -14995,7 +15003,8 @@ cp_parser_simple_type_specifier (cp_pars } /* Otherwise, look for a type-name. */ else - type = cp_parser_type_name (parser); +type = cp_parser_type_name (parser); + /* Keep track of all name-lookups performed in class scopes. */ if (type && !global_p @@ -15071,6 +15080,7 @@ cp_parser_simple_type_specifier (cp_pars type-name: concept-name + partial-concept-id concept-name: identifier @@ -15092,6 +15102,7 @@ cp_parser_type_name (cp_parser* parser) /*check_dependency_p=*/true, /*class_head_p=*/false, /*is_declaration=*/false); + /* If it's not a class-name, keep looking. */ if (!cp_parser_parse_definitely (parser)) { @@ -15107,6 +15118,7 @@ cp_parser_type_name (cp_parser* parser) /*check_dependency_p=*/true, none_type, /*is_declaration=*/false); + /* Note that this must be an instantiation of an alias template because [temp.names]/6 says: @@ -15135,7 +15147,7 @@ cp_parser_type_name (cp_parser* parser) /// Returns true if proto is a type parameter, but not a template template /// parameter. static bool -cp_check_type_concept (tree proto, tree fn) +cp_check_type_concept (tree fn, tree proto) { if (TREE_CODE (proto) != TYPE_DECL) { @@ -15145,57 +15157,58 @@ cp_check_type_concept (tree proto, tree return true; } -// If DECL refers to a concept, return a TYPE_DECL representing the result -// of using the constrained type specifier in the current context. -// -// DECL refers to a concept if -// - it is an overload set containing a function concept taking a single -// type argument, or -// - it is a variable concept taking a single type argument -// -// -// TODO: DECL could be a variable concept. +/// Returns true if the parser is in a context that allows the +/// use of a constrained type specifier. 
+static inline bool +cp_parser_allows_constrained_type_specifier (cp_parser *parser) +{ + return flag_concepts +&& (processing_template_parmlist +|| parser->auto_is_implicit_function_template_parm_p +|| parser->in_result_type_constraint_p); +} + +// Check if DECL and ARGS can form a constrained-type-specifier. If ARGS +// is non-null, we try to form a concept check of the form DECL +// where ? is a placeholder for any kind of template argument. If ARGS +// is NULL, then we try to form a concept check of the form DEC. static tre
C++ PATCH to add -Wabi=n
Now that -fabi-version defaults to 0, -Wabi isn't very useful. But for people interested in compatibility with earlier versions, this patch allows you to say -Wabi=2 to get any relevant warnings. This patch also adjusts the compatibility aliases to default to backward compatibility with -fabi-version=2. Tested x86_64-pc-linux-gnu, applying to trunk. commit 969f9f501a5a8b7a9498464bf3bef59e685b3895 Author: Jason Merrill Date: Mon Jun 9 16:41:07 2014 -0400 Support -Wabi warning about backward compatibility. gcc/c-family/ * c.opt (Wabi=, fabi-compat-version): New. * c-opts.c (c_common_handle_option): Handle -Wabi=. (c_common_post_options): Handle flag_abi_compat_version default. Disallow -fabi-compat-version=1. * c-common.h (abi_version_crosses): New. gcc/cp/ * call.c (convert_arg_to_ellipsis): Use abi_version_crosses. * cvt.c (type_promotes_to): Likewise. * mangle.c (write_type, write_expression): Likewise. (write_name, write_template_arg): Likewise. (mangle_decl): Make alias based on flag_abi_compat_version. Emit -Wabi warning here. (finish_mangling_internal): Not here. Drop warn parm. (finish_mangling_get_identifier, finish_mangling): Adjust. (mangle_type_string, mangle_special_for_type): Adjust. (mangle_ctor_vtbl_for_type, mangle_thunk): Adjust. (mangle_guard_variable, mangle_tls_init_fn): Adjust. (mangle_tls_wrapper_fn, mangle_ref_init_variable): Adjust. diff --git a/gcc/c-family/c-common.h b/gcc/c-family/c-common.h index 83d5dee..6bf4051 100644 --- a/gcc/c-family/c-common.h +++ b/gcc/c-family/c-common.h @@ -619,6 +619,13 @@ extern const char *constant_string_class_name; /* C++ language option variables. */ +/* Return TRUE if one of {flag_abi_version,flag_abi_compat_version} is + less than N and the other is at least N, for use by -Wabi. 
*/ +#define abi_version_crosses(N) \ + (abi_version_at_least(N) \ + != (flag_abi_compat_version == 0 \ + || flag_abi_compat_version >= (N))) + /* Nonzero means generate separate instantiation control files and juggle them at link time. */ diff --git a/gcc/c-family/c-opts.c b/gcc/c-family/c-opts.c index 29e9a35..fbbc80e 100644 --- a/gcc/c-family/c-opts.c +++ b/gcc/c-family/c-opts.c @@ -456,6 +456,16 @@ c_common_handle_option (size_t scode, const char *arg, int value, handle_OPT_d (arg); break; +case OPT_Wabi_: + warn_abi = true; + if (value == 1) + { + warning (0, "%<-Wabi=1%> is not supported, using =2"); + value = 2; + } + flag_abi_compat_version = value; + break; + case OPT_fcanonical_system_headers: cpp_opts->canonical_system_headers = value; break; @@ -910,6 +920,22 @@ c_common_post_options (const char **pfilename) if (flag_declone_ctor_dtor == -1) flag_declone_ctor_dtor = optimize_size; + if (flag_abi_compat_version == 1) +{ + warning (0, "%<-fabi-compat-version=1%> is not supported, using =2"); + flag_abi_compat_version = 2; +} + else if (flag_abi_compat_version == -1) +{ + /* Generate compatibility aliases for ABI v2 (3.4-4.9) by default. */ + flag_abi_compat_version = (flag_abi_version == 0 ? 2 : 0); + + /* But don't warn about backward compatibility unless explicitly + requested with -Wabi=n. 
*/ + if (flag_abi_version == 0) + warn_abi = false; +} + if (cxx_dialect >= cxx11) { /* If we're allowing C++0x constructs, don't warn about C++98 diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt index 76e67d7..d2e047f 100644 --- a/gcc/c-family/c.opt +++ b/gcc/c-family/c.opt @@ -256,6 +256,10 @@ Wabi C ObjC C++ ObjC++ LTO Var(warn_abi) Warning Warn about things that will change when compiling with an ABI-compliant compiler +Wabi= +C ObjC C++ ObjC++ LTO Joined RejectNegative UInteger Warning +Warn about things that change between the current -fabi-version and the specified version + Wabi-tag C++ ObjC++ Var(warn_abi_tag) Warning Warn if a subobject has an abi_tag attribute that the complete object type does not have @@ -845,6 +849,10 @@ d C ObjC C++ ObjC++ Joined ; Documented in common.opt. FIXME - what about -dI, -dD, -dN and -dD? +fabi-compat-version= +C++ ObjC++ Joined RejectNegative UInteger Var(flag_abi_compat_version) Init(-1) +The version of the C++ ABI used for -Wabi warnings and link compatibility aliases + faccess-control C++ ObjC++ Var(flag_access_control) Init(1) Enforce class member access control semantics diff --git a/gcc/cp/call.c b/gcc/cp/call.c index ac14ce2..44e92fc 100644 --- a/gcc/cp/call.c +++ b/gcc/cp/call.c @@ -6508,14 +6508,22 @@ convert_arg_to_ellipsis (tree arg, tsubst_flags_t complain) arg = null_pointer_node; else if (INTEGRAL_OR_ENUMERATION_TYPE_P (arg_type)) { - if (SCOPED_ENUM_P (arg_type) && !abi_version_at_least (6)) + if (SCOPED_ENUM_P (arg_type)) { - if (complain & tf_warning) - warning_at (loc, OPT_Wabi, "scoped enu
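The `abi_version_crosses` macro above is easier to follow with the two flags passed in explicitly. The sketch below restates it as ordinary C++ functions. It assumes that `abi_version_at_least (N)` means "`flag_abi_version` is 0 or at least N", mirroring the way the macro treats `flag_abi_compat_version`; the function names and parameters here are illustrative and are not part of the patch.

```cpp
// Version 0 stands for "latest", so it counts as at least any N.
constexpr bool version_at_least(int version, int n)
{
    return version == 0 || version >= n;
}

// True when exactly one of the two versions is at least N, i.e. the
// ABI change introduced in version N lies between the ABI being
// generated and the ABI we want to stay compatible with.
constexpr bool abi_version_crosses(int abi_version, int compat_version, int n)
{
    return version_at_least(abi_version, n)
           != version_at_least(compat_version, n);
}
```

With the defaults described in the message (`-fabi-version=0`, compatibility version 2), a change introduced in ABI version 6 crosses, while a change introduced in version 2 does not.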
Re: PATCH to change -fabi-version default to 0
On 06/12/2014 03:36 PM, Mike Stump wrote:
> On Jun 12, 2014, at 12:17 PM, Jason Merrill wrote:
>> I talked about doing this in 4.9
>> (https://gcc.gnu.org/ml/gcc/2013-03/msg8.html), but decided to put it
>> off along with the libstdc++ ABI transition. I think it's time now.
>
> Is a doc change needed?

Yep, I updated the docs in the -Wabi=n patch.

Jason
Re: PATCH to change -fabi-version default to 0
On 06/12/2014 03:44 PM, Dominique Dhumieres wrote:
> How does this affect pr60732?

It should fix that failure.

Jason
Re: [patch i386]: Combine memory and indirect jump
> > Will that work on other targets?
> Well, this is the only point I am a bit concerned too. In general I
> wouldn't expect here any issues to run peephole after scheduling, as
> peephole doesn't do anything a new run of ira/lra would require.

My concern is that peepholes are rather fragile, so imho it is not inconceivable that some target will generate wrong code when you add an extra (later) peephole pass. Of course, we are in stage1.

My other concern is that running peepholes again after scheduling could easily generate worse code. So I think the effect of this change on other targets needs to be evaluated.

> Anyway it would be good if a global maintainer could comment on that.

Yes :-)

Segher
[committed] Fix some combined OpenMP 4 clauses issues (PR middle-end/61486)
Hi! This patch fixes 3 issues: 1) distribute doesn't support lastprivate clause, so gimplification shouldn't add it, it causes ICEs 2) for shared clauses on teams construct we need to at least record something in decl_map, otherwise lookup_decl ICEs 3) c_omp_split_clauses ICEd on a couple of combined constructs with firstprivate clause Bootstrapped/regtested on x86_64-linux and i686-linux, committed to trunk and 4.9 branch. 2014-06-12 Jakub Jelinek PR middle-end/61486 * gimplify.c (struct gimplify_omp_ctx): Add distribute field. (gimplify_adjust_omp_clauses): Don't or in GOVD_LASTPRIVATE if outer combined construct is distribute. (gimplify_omp_for): For OMP_DISTRIBUTE set gimplify_omp_ctxp->distribute. * omp-low.c (scan_sharing_clauses) : For GIMPLE_OMP_TEAMS, if decl isn't global in outer context, record mapping into decl map. c-family/ * c-omp.c (c_omp_split_clauses): Don't crash on firstprivate in #pragma omp target teams or #pragma omp {,target }teams distribute simd. testsuite/ * c-c++-common/gomp/pr61486-1.c: New test. * c-c++-common/gomp/pr61486-2.c: New test. --- gcc/gimplify.c.jj 2014-06-06 09:19:23.0 +0200 +++ gcc/gimplify.c 2014-06-12 16:06:07.992997628 +0200 @@ -139,6 +139,7 @@ struct gimplify_omp_ctx enum omp_clause_default_kind default_kind; enum omp_region_type region_type; bool combined_loop; + bool distribute; }; static struct gimplify_ctx *gimplify_ctxp; @@ -6359,7 +6360,11 @@ gimplify_adjust_omp_clauses (tree *list_ if (n == NULL || (n->value & GOVD_DATA_SHARE_CLASS) == 0) { - int flags = GOVD_FIRSTPRIVATE | GOVD_LASTPRIVATE; + int flags = GOVD_FIRSTPRIVATE; + /* #pragma omp distribute does not allow +lastprivate clause. */ + if (!ctx->outer_context->distribute) + flags |= GOVD_LASTPRIVATE; if (n == NULL) omp_add_variable (ctx->outer_context, decl, flags | GOVD_SEEN); @@ -6640,6 +6645,8 @@ gimplify_omp_for (tree *expr_p, gimple_s || TREE_CODE (for_stmt) == CILK_SIMD); gimplify_scan_omp_clauses (&OMP_FOR_CLAUSES (for_stmt), pre_p, simd ? 
ORT_SIMD : ORT_WORKSHARE); + if (TREE_CODE (for_stmt) == OMP_DISTRIBUTE) +gimplify_omp_ctxp->distribute = true; /* Handle OMP_FOR_INIT. */ for_pre_body = NULL; --- gcc/omp-low.c.jj2014-06-10 08:02:49.0 +0200 +++ gcc/omp-low.c 2014-06-12 16:41:09.438849948 +0200 @@ -1509,11 +1509,19 @@ scan_sharing_clauses (tree clauses, omp_ break; case OMP_CLAUSE_SHARED: + decl = OMP_CLAUSE_DECL (c); /* Ignore shared directives in teams construct. */ if (gimple_code (ctx->stmt) == GIMPLE_OMP_TEAMS) - break; + { + /* Global variables don't need to be copied, +the receiver side will use them directly. */ + tree odecl = maybe_lookup_decl_in_outer_ctx (decl, ctx); + if (is_global_var (odecl)) + break; + insert_decl_map (&ctx->cb, decl, odecl); + break; + } gcc_assert (is_taskreg_ctx (ctx)); - decl = OMP_CLAUSE_DECL (c); gcc_assert (!COMPLETE_TYPE_P (TREE_TYPE (decl)) || !is_variable_sized (decl)); /* Global variables don't need to be copied, --- gcc/c-family/c-omp.c.jj 2014-05-11 22:20:26.0 +0200 +++ gcc/c-family/c-omp.c2014-06-12 17:11:49.507948417 +0200 @@ -789,8 +789,13 @@ c_omp_split_clauses (location_t loc, enu else if ((mask & (OMP_CLAUSE_MASK_1 << PRAGMA_OMP_CLAUSE_NUM_TEAMS)) != 0) { - /* This must be #pragma omp {,target }teams distribute. */ - gcc_assert (code == OMP_DISTRIBUTE); + /* This must be one of +#pragma omp {,target }teams distribute +#pragma omp target teams +#pragma omp {,target }teams distribute simd. */ + gcc_assert (code == OMP_DISTRIBUTE + || code == OMP_TEAMS + || code == OMP_SIMD); s = C_OMP_CLAUSE_SPLIT_TEAMS; } else if ((mask & (OMP_CLAUSE_MASK_1 --- gcc/testsuite/c-c++-common/gomp/pr61486-1.c.jj 2014-06-12 19:11:52.029213158 +0200 +++ gcc/testsuite/c-c++-common/gomp/pr61486-1.c 2014-06-12 19:12:22.427069749 +0200 @@ -0,0 +1,13 @@ +/* PR middle-end/61486 */ +/* { dg-do compile } */ +/* { dg-options "-fopenmp" } */ + +int +foo (int *a) +{ + int i, j = 0; + #pragma omp target teams distribute simd linear(i, j) map(a[:10]) + for (
Re: [PATCH 8/8] Add a common .md file and define standard constraints there
> > * cris, m68k, pdp11, and vax actually use "g".
> >
> > So it won't be all that much work to completely get rid of "g".
> > Do we want that?
>
> Is it simply a matter of replacing “g” by “mri”? That’s what the doc
> suggests. Or is there more to the story than that?

As far as I know "g" and "rmi" are equivalent, yes. "g" is easier to type and read if you use it a lot (only ancient targets really); the compiler will probably become somewhat slower for those targets, and perhaps somewhat faster for all others. Hard to say without doing the work and measuring the result :-)

Segher
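The same constraint letters are also accepted in GNU extended asm, which gives a small way to see the two spellings side by side outside a machine description. This is a sketch that assumes a GCC-compatible compiler; the function names are made up for the example, and it says nothing about how .md files themselves would be converted.

```cpp
// Both functions hand their operand to an empty asm statement.  The
// only difference is how the operand constraint is spelled: "g" versus
// the explicit register / memory / immediate alternatives "rmi".
inline int pass_g(int x)
{
    __asm__ volatile ("" : : "g" (x));
    return x;
}

inline int pass_rmi(int x)
{
    __asm__ volatile ("" : : "rmi" (x));
    return x;
}
```

Both forms should be accepted for a plain integer operand; whether they behave identically in every target's constraint processing is exactly the open question in the exchange above.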
Re: [C++ Patch] PR 33101
On 06/12/2014 03:14 PM, Paolo Carlini wrote:
> ... in terms of code proper, the below is much better, IMHO. Assuming,
> as I understand, we have no reason to call the rather heavy same_type_p
> when we already know that VOID_TYPE_P (type) is true...

same_type_p is not so heavy since it just compares TYPE_CANONICAL, but I wonder why we don't use == for the normal case, and then typedef_variant_p to diagnose a typedef.

Jason
Re: [patch i386]: Combine memory and indirect jump
On 6/12/2014 9:21 AM, Kai Tietz wrote:
> with addition of adding a second peephole2 pass after sched2 pass, I
> was able to get some improvement for PR target/39284. I think by this
> addition we can close bug as fixed.
> Additionally additional peephole2 pass shows better results for PR
> target/51840 testcase with disabled ASM_GOTO, too.

Any chance this also fixes PR 58670 (see comment #5)?

dw
Re: [Google] Fix AFDO early inline ICEs due to DFE
I think the patch looks good. David and Rong, any comments? Dehao On Thu, Jun 12, 2014 at 11:23 AM, Teresa Johnson wrote: > These two patches fix multiple ICE that occurred due to DFE being > recently enabled after AutoFDO LIPO linking. > > Passes regression and internal testing. Ok for Google/4_8? > > Teresa > > 2014-06-12 Teresa Johnson > Dehao Chen > > Google ref b/15521327. > > * cgraphclones.c (cgraph_clone_edge): Use resolved node. > * l-ipo.c (resolve_cgraph_node): Resolve to non-removable node. > > Index: cgraphclones.c > === > --- cgraphclones.c (revision 211386) > +++ cgraphclones.c (working copy) > @@ -94,6 +94,7 @@ along with GCC; see the file COPYING3. If not see > #include "ipa-utils.h" > #include "lto-streamer.h" > #include "except.h" > +#include "l-ipo.h" > > /* Create clone of E in the node N represented by CALL_EXPR the callgraph. > */ > struct cgraph_edge * > @@ -118,7 +119,11 @@ cgraph_clone_edge (struct cgraph_edge *e, struct c > >if (call_stmt && (decl = gimple_call_fndecl (call_stmt))) > { > - struct cgraph_node *callee = cgraph_get_node (decl); > + struct cgraph_node *callee; > + if (L_IPO_COMP_MODE && cgraph_pre_profiling_inlining_done) > +callee = cgraph_lipo_get_resolved_node (decl); > + else > +callee = cgraph_get_node (decl); > gcc_checking_assert (callee); > new_edge = cgraph_create_edge (n, callee, call_stmt, count, freq); > } > Index: l-ipo.c > === > --- l-ipo.c (revision 211386) > +++ l-ipo.c (working copy) > @@ -1542,6 +1542,18 @@ resolve_cgraph_node (struct cgraph_sym **slot, str >gcc_assert (decl1_defined); >add_define_module (*slot, decl2); > > + /* Pick the node that cannot be removed, to avoid a situation > + where we remove the resolved node and later try to access > + it for the remaining non-removable copy. E.g. one may be > + extern and the other weak, only the extern copy can be removed. 
*/ > + if (cgraph_can_remove_if_no_direct_calls_and_refs_p ((*slot)->rep_node) > + && !cgraph_can_remove_if_no_direct_calls_and_refs_p (node)) > +{ > + (*slot)->rep_node = node; > + (*slot)->rep_decl = decl2; > + return; > +} > + >has_prof1 = has_profile_info (decl1); >bool is_aux1 = cgraph_is_auxiliary (decl1); >bool is_aux2 = cgraph_is_auxiliary (decl2); > > > -- > Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413
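The new early-out in `resolve_cgraph_node` boils down to one rule: never keep a removable node as the representative when a non-removable copy is available. A minimal model of that rule follows. `Node` and `prefer_non_removable` are hypothetical names, and the single `removable` flag stands in for `cgraph_can_remove_if_no_direct_calls_and_refs_p`.

```cpp
// Toy stand-in for a call graph node; only the property that matters
// for this decision is modelled.
struct Node {
    bool removable;
};

// Returns the node that should represent the pair.  The current
// representative is replaced only when it could be removed later
// while the candidate could not.  In every other case this sketch
// keeps the current one, which stands in for the profile-based
// heuristics that follow in the real function.
inline const Node *prefer_non_removable(const Node *rep, const Node *candidate)
{
    if (rep->removable && !candidate->removable)
        return candidate;
    return rep;
}
```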
Fix vectorizer conditions on updating alignment
Hi,
while updating vect_can_force_dr_alignment_p for the section API I noticed the predicate is a bit confused about when it can update the alignment. We need to check decl_binds_to_current_def_p and, in case we compile a partition, also that the symbol is not homed in another partition. The previous code was wrong e.g. for COMDATs, weaks or -fpic. Also, when we have an alias, the only way to promote the alignment is to bump up the alignment of the target.

On the other hand the comment about DECL_IN_CONSTANT_POOL seems confused - we have no sharing across partitions. I assume it was an old hack and removed it. I also see no reason for disregarding DECL_PRESERVE - we only update alignment, which should not disturb whatever magic the user does. But I kept it.

We probably should separate the logic into a symtab predicate - it just checks if we can change the definition of a variable to meet our needs. I can do that incrementally.

Bootstrapped/regtested x86_64-linux, committed.

Honza

* tree-vect-data-refs.c (vect_can_force_dr_alignment_p): Reorg to use symtab and decl_binds_to_current_def_p
* tree-vectorizer.c (increase_alignment): Increase alignment of alias target, too.

Index: tree-vect-data-refs.c === --- tree-vect-data-refs.c (revision 211489) +++ tree-vect-data-refs.c (working copy) @@ -58,6 +58,7 @@ along with GCC; see the file COPYING3. #include "expr.h" #include "optabs.h" #include "builtins.h" +#include "varasm.h" /* Return true if load- or store-lanes optab OPTAB is implemented for COUNT vectors of type VECTYPE. NAME is the name of OPTAB. */ @@ -5316,19 +5317,26 @@ vect_can_force_dr_alignment_p (const_tre if (TREE_CODE (decl) != VAR_DECL) return false; - /* We cannot change alignment of common or external symbols as another - translation unit may contain a definition with lower alignment. - The rules of common symbol linking mean that the definition - will override the common symbol. The same is true for constant - pool entries which may be shared and are not properly merged - by LTO.
*/ - if (DECL_EXTERNAL (decl) - || DECL_COMMON (decl) - || DECL_IN_CONSTANT_POOL (decl)) -return false; + gcc_assert (!TREE_ASM_WRITTEN (decl)); - if (TREE_ASM_WRITTEN (decl)) -return false; + if (TREE_PUBLIC (decl) || DECL_EXTERNAL (decl)) +{ + symtab_node *snode; + + /* We cannot change alignment of symbols that may bind to symbols +in other translation unit that may contain a definition with lower +alignment. */ + if (!decl_binds_to_current_def_p (decl)) + return false; + + /* When compiling partition, be sure the symbol is not output by other +partition. */ + snode = symtab_get_node (decl); + if (flag_ltrans + && (snode->in_other_partition + || symtab_get_symbol_partitioning_class (snode) == SYMBOL_DUPLICATE)) + return false; +} /* Do not override the alignment as specified by the ABI when the used attribute is set. */ @@ -5343,6 +5351,18 @@ vect_can_force_dr_alignment_p (const_tre && !symtab_get_node (decl)->implicit_section) return false; + /* If symbol is an alias, we need to check that target is OK. */ + if (TREE_STATIC (decl)) +{ + tree target = symtab_alias_ultimate_target (symtab_get_node (decl))->decl; + if (target != decl) + { + if (DECL_PRESERVE_P (target)) + return false; + decl = target; + } +} + if (TREE_STATIC (decl)) return (alignment <= MAX_OFILE_ALIGNMENT); else Index: tree-vectorizer.c === --- tree-vectorizer.c (revision 211488) +++ tree-vectorizer.c (working copy) @@ -686,6 +686,12 @@ increase_alignment (void) { DECL_ALIGN (decl) = TYPE_ALIGN (vectype); DECL_USER_ALIGN (decl) = 1; + if (TREE_STATIC (decl)) + { + tree target = symtab_alias_ultimate_target (symtab_get_node (decl))->decl; + DECL_ALIGN (target) = TYPE_ALIGN (vectype); + DECL_USER_ALIGN (target) = 1; + } dump_printf (MSG_NOTE, "Increasing alignment of decl: "); dump_generic_expr (MSG_NOTE, TDF_SLIM, decl); dump_printf (MSG_NOTE, "\n");
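The order of checks in the reworked predicate can be modelled without any GCC internals. The sketch below is only a reading aid: `SymbolModel`, its fields and the two limit constants are invented for the example (the real code queries the decl and its symtab node, and the limits are target macros), and the `user_section` field is an assumption about the section check that is only partly visible in the diff context.

```cpp
// Hypothetical, simplified view of the properties the predicate reads.
struct SymbolModel {
    bool is_variable;                  // TREE_CODE (decl) == VAR_DECL
    bool is_public_or_external;        // TREE_PUBLIC || DECL_EXTERNAL
    bool binds_to_current_def;         // decl_binds_to_current_def_p
    bool in_other_partition;           // output by another LTO partition
    bool duplicated_across_partitions; // SYMBOL_DUPLICATE class
    bool preserve;                     // DECL_PRESERVE_P ("used" attribute)
    bool user_section;                 // assumed: explicit, non-implicit section
    bool is_static;                    // TREE_STATIC
    const SymbolModel *alias_target;   // nullptr when the symbol is not an alias
};

// Placeholder limits; the real values are target-dependent.
constexpr unsigned kMaxObjectFileAlign = 1u << 28;
constexpr unsigned kMaxStackAlign = 128;

inline bool can_force_alignment(const SymbolModel &sym, unsigned alignment,
                                bool compiling_partition)
{
    if (!sym.is_variable)
        return false;
    if (sym.is_public_or_external) {
        // Another definition might win at link time.
        if (!sym.binds_to_current_def)
            return false;
        // Another partition may already have emitted the symbol.
        if (compiling_partition
            && (sym.in_other_partition || sym.duplicated_across_partitions))
            return false;
    }
    if (sym.preserve || sym.user_section)
        return false;
    // For an alias, the decision is made on the ultimate target.
    const SymbolModel *effective = &sym;
    if (sym.is_static && sym.alias_target && sym.alias_target != &sym) {
        if (sym.alias_target->preserve)
            return false;
        effective = sym.alias_target;
    }
    return alignment <= (effective->is_static ? kMaxObjectFileAlign
                                              : kMaxStackAlign);
}
```

A weak or interposable symbol fails at the `binds_to_current_def` test, which is the case the message says the old code got wrong.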
[C++ PATCH, RFC] PR c++/61491
DR1206 allows explicit specializations of member enumerations of class templates, so just remove the pedwarn about it. Tested on Linux-x64. Not bootstrapped. diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c index d267a5c..97eadeb 100644 --- a/gcc/cp/pt.c +++ b/gcc/cp/pt.c @@ -967,11 +967,8 @@ maybe_process_partial_specialization (tree type) else if (processing_specialization) { /* Someday C++0x may allow for enum template specialization. */ - if (cxx_dialect > cxx98 && TREE_CODE (type) == ENUMERAL_TYPE - && CLASS_TYPE_P (context) && CLASSTYPE_USE_TEMPLATE (context)) - pedwarn (input_location, OPT_Wpedantic, "template specialization " -"of %qD not allowed by ISO C++", type); - else + if (!(cxx_dialect > cxx98 && TREE_CODE (type) == ENUMERAL_TYPE + && CLASS_TYPE_P (context) && CLASSTYPE_USE_TEMPLATE (context))) { error ("explicit specialization of non-template %qT", type); return error_mark_node; diff --git a/gcc/testsuite/g++.dg/cpp0x/pr61491.C b/gcc/testsuite/g++.dg/cpp0x/pr61491.C new file mode 100644 index 000..c105782 --- /dev/null +++ b/gcc/testsuite/g++.dg/cpp0x/pr61491.C @@ -0,0 +1,12 @@ +// { dg-do compile { target c++11 } } +// { dg-options "-pedantic" } +// DR 1206 (explicit specialization of a member enumeration of a class template) + +template struct Base +{ +enum class E : unsigned; +}; + +struct X; + +template<> enum class Base::E : unsigned { a, b }; pr61491.changelog Description: Binary data
Re: [C++ PATCH, RFC] PR c++/61491
On 13 June 2014 01:37, Ville Voutilainen wrote: > DR1206 allows explicit specializations of member enumerations > of class templates, so just remove the pedwarn about it. > > Tested on Linux-x64. Not bootstrapped. Argh, also remove the old comment, new patch attached. diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c index d267a5c..507585f 100644 --- a/gcc/cp/pt.c +++ b/gcc/cp/pt.c @@ -966,12 +966,8 @@ maybe_process_partial_specialization (tree type) } else if (processing_specialization) { - /* Someday C++0x may allow for enum template specialization. */ - if (cxx_dialect > cxx98 && TREE_CODE (type) == ENUMERAL_TYPE - && CLASS_TYPE_P (context) && CLASSTYPE_USE_TEMPLATE (context)) - pedwarn (input_location, OPT_Wpedantic, "template specialization " -"of %qD not allowed by ISO C++", type); - else + if (!(cxx_dialect > cxx98 && TREE_CODE (type) == ENUMERAL_TYPE + && CLASS_TYPE_P (context) && CLASSTYPE_USE_TEMPLATE (context))) { error ("explicit specialization of non-template %qT", type); return error_mark_node; diff --git a/gcc/testsuite/g++.dg/cpp0x/pr61491.C b/gcc/testsuite/g++.dg/cpp0x/pr61491.C new file mode 100644 index 000..c105782 --- /dev/null +++ b/gcc/testsuite/g++.dg/cpp0x/pr61491.C @@ -0,0 +1,12 @@ +// { dg-do compile { target c++11 } } +// { dg-options "-pedantic" } +// DR 1206 (explicit specialization of a member enumeration of a class template) + +template struct Base +{ +enum class E : unsigned; +}; + +struct X; + +template<> enum class Base::E : unsigned { a, b };
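The angle-bracket contents of the test case appear to have been lost in the archive rendering above (`template struct Base`, `Base::E`). A self-contained sketch of what DR 1206 permits is given below; the template parameter name `T` is an assumption, since the original spelling is not recoverable from this copy.

```cpp
template <typename T>
struct Base
{
    enum class E : unsigned;   // opaque declaration of a member enumeration
};

struct X;

// Explicit specialization of the member enumeration for Base<X>.
// This is the construct the removed pedwarn used to complain about.
template <> enum class Base<X>::E : unsigned { a, b };

static_assert(static_cast<unsigned>(Base<X>::E::b) == 1,
              "b is the second enumerator of the specialization");
```

With the patch applied, this is meant to compile cleanly in C++11 mode even with `-pedantic`, which is what the new test checks.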
Re: [Google] Fix AFDO early inline ICEs due to DFE
This looks fine to me. -Rong On Thu, Jun 12, 2014 at 11:23 AM, Teresa Johnson wrote: > These two patches fix multiple ICE that occurred due to DFE being > recently enabled after AutoFDO LIPO linking. > > Passes regression and internal testing. Ok for Google/4_8? > > Teresa > > 2014-06-12 Teresa Johnson > Dehao Chen > > Google ref b/15521327. > > * cgraphclones.c (cgraph_clone_edge): Use resolved node. > * l-ipo.c (resolve_cgraph_node): Resolve to non-removable node. > > Index: cgraphclones.c > === > --- cgraphclones.c (revision 211386) > +++ cgraphclones.c (working copy) > @@ -94,6 +94,7 @@ along with GCC; see the file COPYING3. If not see > #include "ipa-utils.h" > #include "lto-streamer.h" > #include "except.h" > +#include "l-ipo.h" > > /* Create clone of E in the node N represented by CALL_EXPR the callgraph. > */ > struct cgraph_edge * > @@ -118,7 +119,11 @@ cgraph_clone_edge (struct cgraph_edge *e, struct c > >if (call_stmt && (decl = gimple_call_fndecl (call_stmt))) > { > - struct cgraph_node *callee = cgraph_get_node (decl); > + struct cgraph_node *callee; > + if (L_IPO_COMP_MODE && cgraph_pre_profiling_inlining_done) > +callee = cgraph_lipo_get_resolved_node (decl); > + else > +callee = cgraph_get_node (decl); > gcc_checking_assert (callee); > new_edge = cgraph_create_edge (n, callee, call_stmt, count, freq); > } > Index: l-ipo.c > === > --- l-ipo.c (revision 211386) > +++ l-ipo.c (working copy) > @@ -1542,6 +1542,18 @@ resolve_cgraph_node (struct cgraph_sym **slot, str >gcc_assert (decl1_defined); >add_define_module (*slot, decl2); > > + /* Pick the node that cannot be removed, to avoid a situation > + where we remove the resolved node and later try to access > + it for the remaining non-removable copy. E.g. one may be > + extern and the other weak, only the extern copy can be removed. 
*/ > + if (cgraph_can_remove_if_no_direct_calls_and_refs_p ((*slot)->rep_node) > + && !cgraph_can_remove_if_no_direct_calls_and_refs_p (node)) > +{ > + (*slot)->rep_node = node; > + (*slot)->rep_decl = decl2; > + return; > +} > + >has_prof1 = has_profile_info (decl1); >bool is_aux1 = cgraph_is_auxiliary (decl1); >bool is_aux2 = cgraph_is_auxiliary (decl2); > > > -- > Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413
Re: [PATCH, Pointer Bounds Checker 35/x] Fix object size emitted for structures with flexible arrays
2014-06-12 11:55 GMT+04:00 Richard Biener:
> On Wed, Jun 11, 2014 at 6:08 PM, Ilya Enkovich wrote:
>> Hi,
>>
>> This patch fixes problem with size emitted for static structures with
>> flexible array. I found a couple of trackers in bugzilla for this problem
>> but all of them are marked as fixed and problem still exists.
>>
>> For a simple testcase
>>
>> struct S { int a; int b[0]; } s = { 1, { 0, 0} };
>>
>> current trunk produces (no flags):
>>
>> .globl s
>> .data
>> .align 4
>> .type s, @object
>> .size s, 4
>> s:
>> .long 1
>> .long 0
>> .long 0
>>
>> which has wrong size for object s.
>>
>> This problem is important for checker because wrong size leads to wrong
>> bounds and false bounds violations. Following patch uses DECL_SIZE_UNIT
>> instead of type size and works well for me. Does it look OK?
>
> There is a bug about this in bugzilla somewhere.

I looked through bugzilla and found two trackers with a similar problem. The first one is 57180, which is still open but has a comment saying the problem is fixed on the trunk. I checked it and it really passes on the trunk (should the tracker be closed then?). The other one is 28865 (and a set of its duplicates). The interesting thing here is that the original testcase uses an array of integers, but the testcases which were added with the commit use arrays of chars. The original test is still compiled wrongly. I also see that a patch very similar to what I posted was proposed as a solution, but it was reported to cause a problem with glibc/nss/nss_files/files-init.c. There is a corresponding testcase in the tracker which results in wrong padding when the patch is applied, but it seems to be a different problem because I do not see any problem when I use the mpx compiler branch for this testcase.

> It looks ok to me - did you test with all languages? In particular did
> you test Ada?

I configure the compiler without disabling any language and then run bootstrap and make check. Does that mean all languages are covered? I will do more testing if required.
Thanks, Ilya > > Thanks, > Richard. > >> Bootstrapped and tested on linux-x86_64. >> >> Thanks, >> Ilya >> -- >> gcc/ >> >> 2014-06-11 Ilya Enkovich >> >> * config/elfos.h (ASM_DECLARE_OBJECT_NAME): Use decl size >> instead of type size. >> (ASM_FINISH_DECLARE_OBJECT): Likewise. >> >> >> diff --git a/gcc/config/elfos.h b/gcc/config/elfos.h >> index c1d5553..7929708 100644 >> --- a/gcc/config/elfos.h >> +++ b/gcc/config/elfos.h >> @@ -313,7 +313,7 @@ see the files COPYING3 and COPYING.RUNTIME respectively. >> If not, see >> && (DECL) && DECL_SIZE (DECL))\ >> { \ >> size_directive_output = 1;\ >> - size = int_size_in_bytes (TREE_TYPE (DECL)); \ >> + size = tree_to_uhwi (DECL_SIZE_UNIT (DECL)); \ >> ASM_OUTPUT_SIZE_DIRECTIVE (FILE, NAME, size); \ >> } \ >> \ >> @@ -341,7 +341,7 @@ see the files COPYING3 and COPYING.RUNTIME respectively. >> If not, see >> && !size_directive_output)\ >> { \ >> size_directive_output = 1;\ >> - size = int_size_in_bytes (TREE_TYPE (DECL)); \ >> + size = tree_to_uhwi (DECL_SIZE_UNIT (DECL)); \ >> ASM_OUTPUT_SIZE_DIRECTIVE (FILE, name, size); \ >> } \ >> } \
Re: ipa-visibility TLC 2/n
> > Comdat locals are now used by ipa-comdats, for thunks and for decloned
> > ctors.
> > We probably need to figure out bit more precise limitation of Solaris and
> > either fix or add way for target to say what kind of comdat locals are not
> > supported.
>
> Right. I'll start reghunting for the patch that caused additional
> breakage even without comdat, as on Solaris 10.

Good, at least one bug is off my radar. I was thinking about the ipa-comdats issue and I remembered an older problem where I wanted to place thunks before the function (to avoid the need to jump back to the body), and that caused problems for you, too, since the Solaris assembler apparently refused anything other than the main comdat group symbol being defined first. Perhaps we run into similar issues? Do you know what precisely the restrictions are here? (We do, for example, produce comdat groups that do not contain a symbol the group is called by, so I do not see how the main symbol name is significant.)

IPA-comdat brings extra symbols into the comdat group and pays no attention to the order, so perhaps this is causing the issue. We may add some logic into assemble_functions to fix the order or work out why this breaks.

> > Can I reproduce your setup on the compile farm?
>
> According to https://gcc.gnu.org/wiki/CompileFarm, there are no Solaris
> machines or VMs in the compile farm. If a VM could be set up (no idea
> if they allow non-free OSes beyond AIX there), I'd suggest starting with
> Solaris 11.2 Beta
> (http://www.oracle.com/technetwork/server-storage/solaris11/downloads/beta-2182939.html),
> which has the latest in /bin/ld support. I can certainly help with
> setting something up.

It would be nice to have a non-free OS for testing. Comdats and aliases seem to be riddled with implementation bugs and it would be nice to have a way to test for those.

Honza

> Rainer
>
> --
> -
> Rainer Orth, Center for Biotechnology, Bielefeld University
Re: [PATCH] Trust TREE_ADDRESSABLE
> When you extract the address and use it. For example when you
> do auto-parallelization and outline a part of your function it
> passes arrays as addresses.
>
> Or if you start to introduce address induction variables like
> the vectorizer or IVOPTs does.

I see, so nothing that is really done by the current early/IPA optimizers, and in those cases we also want to set the TREE_ADDRESSABLE bit, I suppose. Do you think I should make a patch for setting the NOVOPS bits in ipa code?

Honza

> Richard.
Re: [Patch] Change URL in commit emails to https
On Mon, 12 May 2014, Tobias Burnus wrote:
> The patch changes the URL shown in the release message to HTTPS. (Cf.
> https://gcc.gnu.org/viewcvs/gcc/hooks/svnmailer.conf and gcc-cvs
> mailing list.)

Yes, please. Thanks!

Gerald