Re: [PATCH 3/4] Introduce NEXT_PASS_NUM macro
On Wed, Jul 17, 2013 at 6:18 PM, David Malcolm wrote: > gcc/ > > Explicitly number the instances of passes within passes.def. > > This is needed by a subsequent patch so that we can create > fields within the pipeline class for each pass instance (to help > locate pass instances when debugging). > > * passes.c (NEXT_PASS_NUM): Define. > > * passes.def (NEXT_PASS, NEXT_PASS_NUM): Replace uses of > NEXT_PASS on passes that have multiple instances with uses of > NEXT_PASS_NUM. I don't like this patch at all. Mainly because the numbers can get out of sync very quickly especially when it comes to internal versions of the compiler where it is normal to reorder passes and add another pass a few times. Thanks, Andrew > --- > gcc/passes.c | 3 + > gcc/passes.def | 173 > + > 2 files changed, 90 insertions(+), 86 deletions(-) > > diff --git a/gcc/passes.c b/gcc/passes.c > index 94fb586..f140330 100644 > --- a/gcc/passes.c > +++ b/gcc/passes.c > @@ -1294,6 +1294,8 @@ init_optimization_passes (void) > > #define NEXT_PASS(PASS) (p = next_pass_1 (p, &((PASS).pass))) > > +#define NEXT_PASS_NUM(PASS, NUM) (p = next_pass_1 (p, &((PASS).pass))) > + > #define TERMINATE_PASS_LIST() \ >*p = NULL; > > @@ -1303,6 +1305,7 @@ init_optimization_passes (void) > #undef PUSH_INSERT_PASSES_WITHIN > #undef POP_INSERT_PASSES > #undef NEXT_PASS > +#undef NEXT_PASS_NUM > #undef TERMINATE_PASS_LIST > >/* Register the passes with the tree dump code. */ > diff --git a/gcc/passes.def b/gcc/passes.def > index fa03d16..f142d31 100644 > --- a/gcc/passes.def > +++ b/gcc/passes.def > @@ -23,6 +23,7 @@ along with GCC; see the file COPYING3. If not see > PUSH_INSERT_PASSES_WITHIN (PASS) > POP_INSERT_PASSES () > NEXT_PASS (PASS) > + NEXT_PASS_NUM (PASS, NUM) > TERMINATE_PASS_LIST () > */ > > @@ -52,44 +53,44 @@ along with GCC; see the file COPYING3. 
If not see >NEXT_PASS (pass_ipa_function_and_variable_visibility); >NEXT_PASS (pass_early_local_passes); >PUSH_INSERT_PASSES_WITHIN (pass_early_local_passes) > - NEXT_PASS (pass_fixup_cfg); > + NEXT_PASS_NUM (pass_fixup_cfg, 1); >NEXT_PASS (pass_init_datastructures); > >NEXT_PASS (pass_build_ssa); >NEXT_PASS (pass_early_warn_uninitialized); > - NEXT_PASS (pass_rebuild_cgraph_edges); > - NEXT_PASS (pass_inline_parameters); > + NEXT_PASS_NUM (pass_rebuild_cgraph_edges, 1); > + NEXT_PASS_NUM (pass_inline_parameters, 1); >NEXT_PASS (pass_early_inline); >NEXT_PASS (pass_all_early_optimizations); >PUSH_INSERT_PASSES_WITHIN (pass_all_early_optimizations) > - NEXT_PASS (pass_remove_cgraph_callee_edges); > - NEXT_PASS (pass_rename_ssa_copies); > - NEXT_PASS (pass_ccp); > + NEXT_PASS_NUM (pass_remove_cgraph_callee_edges, 1); > + NEXT_PASS_NUM (pass_rename_ssa_copies, 1); > + NEXT_PASS_NUM (pass_ccp, 1); > /* After CCP we rewrite no longer addressed locals into SSA > form if possible. */ > - NEXT_PASS (pass_forwprop); > + NEXT_PASS_NUM (pass_forwprop, 1); > /* pass_build_ealias is a dummy pass that ensures that we > execute TODO_rebuild_alias at this point. */ > NEXT_PASS (pass_build_ealias); > NEXT_PASS (pass_sra_early); > - NEXT_PASS (pass_fre); > - NEXT_PASS (pass_copy_prop); > - NEXT_PASS (pass_merge_phi); > - NEXT_PASS (pass_cd_dce); > + NEXT_PASS_NUM (pass_fre, 1); > + NEXT_PASS_NUM (pass_copy_prop, 1); > + NEXT_PASS_NUM (pass_merge_phi, 1); > + NEXT_PASS_NUM (pass_cd_dce, 1); > NEXT_PASS (pass_early_ipa_sra); > - NEXT_PASS (pass_tail_recursion); > + NEXT_PASS_NUM (pass_tail_recursion, 1); > NEXT_PASS (pass_convert_switch); > - NEXT_PASS (pass_cleanup_eh); > + NEXT_PASS_NUM (pass_cleanup_eh, 1); >NEXT_PASS (pass_profile); > - NEXT_PASS (pass_local_pure_const); > + NEXT_PASS_NUM (pass_local_pure_const, 1); > /* Split functions creates parts that are not run through > early optimizations again. It is thus good idea to do this > late. 
*/ >NEXT_PASS (pass_split_functions); >POP_INSERT_PASSES () >NEXT_PASS (pass_release_ssa_names); > - NEXT_PASS (pass_rebuild_cgraph_edges); > - NEXT_PASS (pass_inline_parameters); > + NEXT_PASS_NUM (pass_rebuild_cgraph_edges, 2); > + NEXT_PASS_NUM (pass_inline_parameters, 2); >POP_INSERT_PASSES () >NEXT_PASS (pass_ipa_free_inline_summary); >NEXT_PASS (pass_ipa_tree_profile); > @@ -126,118 +127,118 @@ along with GCC; see the file COPYING3. If not see >/* These passes are run after IPA passes on every function that is being > output to the assemble
Re: [PATCH 0/4] Move pass-creation logic into a passes.def file
On Wed, Jul 17, 2013 at 6:18 PM, David Malcolm wrote:
> The following patch series moves the logic for creating the
> pipeline of optimization passes out from passes.c and into a new
> passes.def file (patches 1 and 2).
>
> It then explicitly numbers those passes that have multiple instances, by
> using a NEXT_PASS_NUM macro in place of NEXT_PASS (patch 3)

This is not useful in itself and is one of the reasons why we added
the ability to have a pass multiple times without much code.

>
> The motivation for this is subsequent work towards removing global
> variables from GCC's internals: by numbering the instances it becomes
> possible to create a "class pipeline" and have the fields be declared
> via suitable use of passes.def. See:
> http://dmalcolm.fedorapeople.org/gcc/global-state/new-classes.html#pass-classes
>
> The final patch in the sequence adds a script which sanity-checks
> passes.def, and prints some stats about the passes. You can see output
> from the script at:
>
> http://dmalcolm.fedorapeople.org/gcc/2013-07-17/pass-stats.txt
>
> Specifically, it lists single-instanced passes, then all multi-instance
> passes, giving the number of instances of each (alphabetically within
> each list).
>
> I've successfully bootstrapped and tested the sequence of patches on
> x86_64-unknown-linux-gnu: all testcases show the same results as an
> unpatched build (relative to r201011).

Apart from patch 3, for the reasons given above and in my direct reply
to that patch, I like this set of patches.

Thanks,
Andrew Pinski

>
> OK to commit these to trunk?
>
> David Malcolm (4):
>   Introduce macros when constructing the tree of passes
>   Move the construction of the pass hierarchy into a new passes.def
>     file.
>   Introduce NEXT_PASS_NUM macro
>   Add contrib/check_passes.py script
>
>  contrib/check_passes.py |  58 +++
>  gcc/Makefile.in         |   2 +-
>  gcc/passes.c            | 401 ++-
>  gcc/passes.def          | 406
>  4 files changed, 481 insertions(+), 386 deletions(-)
>  create mode 100644 contrib/check_passes.py
>  create mode 100644 gcc/passes.def
>
> --
> 1.7.11.7
>
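For readers unfamiliar with the idea behind "have the fields be declared via suitable use of passes.def", here is a minimal sketch of the X-macro technique it seems to rely on. It is not taken from the patch series: PASS_LIST stands in for the contents of passes.def, and struct pipeline, DECL_FIELD, INIT_FIELD and pipeline_init are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for passes.def: one entry per pass instance.
   Passes with several instances carry an explicit instance number.  */
#define PASS_LIST(NEXT_PASS, NEXT_PASS_NUM) \
  NEXT_PASS (pass_build_ssa)                \
  NEXT_PASS_NUM (pass_ccp, 1)               \
  NEXT_PASS (pass_sra_early)                \
  NEXT_PASS_NUM (pass_ccp, 2)

struct opt_pass
{
  const char *name;
  int instance;
};

/* First expansion: declare one field per pass instance.  The number
   is what keeps the two pass_ccp fields from colliding.  */
#define DECL_FIELD(PASS) struct opt_pass PASS;
#define DECL_FIELD_NUM(PASS, NUM) struct opt_pass PASS##_##NUM;
struct pipeline
{
  PASS_LIST (DECL_FIELD, DECL_FIELD_NUM)
};
#undef DECL_FIELD
#undef DECL_FIELD_NUM

/* Second expansion of the same list: fill in every field.  */
static void
pipeline_init (struct pipeline *p)
{
  assert (p != NULL);
#define INIT_FIELD(PASS) \
  p->PASS.name = #PASS; p->PASS.instance = 1;
#define INIT_FIELD_NUM(PASS, NUM) \
  p->PASS##_##NUM.name = #PASS; p->PASS##_##NUM.instance = NUM;
  PASS_LIST (INIT_FIELD, INIT_FIELD_NUM)
#undef INIT_FIELD
#undef INIT_FIELD_NUM
}
```

With fields such as pass_ccp_1 and pass_ccp_2, each instance can be named directly in a debugger. The sketch also shows the maintenance cost raised in the review: the numbers are written by hand and have to be kept consistent when passes are reordered.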
Re: [Patch x86/darwin] fix PR51784 'PIC register not correctly preserved...'
Hello!

> The PR is logged against Darwin, and (as Jakub points out in the PR
> thread) indeed Darwin is missing a nonlocal_goto_receiver to restore
> the PIC reg in code that uses it (most of the patch).
>
> However, there is a second issue, and (if I've understood things
> correctly) this affects GOT targets too - thus there is a single
> non-darwin-specific hunk for which I need approval for X86 as a whole.
>
> consider (x86 -fPIC -m32)
>
> ==
>
> int g42 = 42;
>
> int foo (void) <=== doesn't use EBX, so doesn't save it.
> {
>   __label__ x;
>   int z;
>   int bar (int *zz) <== does use EBX, and saves it
>   {
>     *zz = g42;
>     goto x; <== however, EBX is not restored here.
>   }
>
>   bar(&z);
>
> x:
>   return z;
> }
>
> ==
>
> ... this all works OK when the caller of foo and foo are in one object
> (and thus share the same GOT)
>
> .. however, suppose we build the code above as a shared lib and call
> it from a pie executable (or another so).
>
> Now, when the caller (with a different GOT value from the lib) calls
> foo() - EBX gets trashed (probably *boom*).
>
> The solution proposed here (for this aspect) is that, if a function
> contains a nonlocal label, then the PICbase register should be
> preserved. This is the only non-darwin-specific hunk in the patch.

I don't think this is the correct solution. Function bar() should
restore %ebx before jumping to the label.

The problem can be seen if you change "return z" to "return z + g42"
at the end of your testcase.
The test will be compiled to:

bar.1372:
        pushl   %ebp
        movl    %esp, %ebp
        pushl   %ebx
        call    __x86.get_pc_thunk.bx
        addl    $_GLOBAL_OFFSET_TABLE_, %ebx
        movl    g42@GOT(%ebx), %edx
        movl    (%edx), %edx
        movl    %edx, (%eax)
        leal    .L2@GOTOFF(%ebx), %eax
        movl    (%ecx), %ebp
        movl    4(%ecx), %esp
        jmp     *%eax

and

foo:
        pushl   %ebp
        pushl   %edi
        pushl   %esi
        pushl   %ebx
        call    __x86.get_pc_thunk.bx
        addl    $_GLOBAL_OFFSET_TABLE_, %ebx
        subl    $16, %esp
        leal    16(%esp), %eax
        movl    %eax, 4(%esp)
        leal    4(%esp), %ecx
        movl    %esp, %eax
        movl    %esp, 8(%esp)
        call    bar.1372
.L2:
        movl    g42@GOT(%ebx), %edx
        movl    (%esp), %eax
        addl    (%edx), %eax
        addl    $16, %esp
        popl    %ebx
        popl    %esi
        popl    %edi
        popl    %ebp
        ret

Under the assumption that foo and bar don't share the same GOT, you
will see that g42 after the label is accessed with "clobbered" %ebx.

Uros.
Re: [Patch x86/darwin] fix PR51784 'PIC register not correctly preserved...'
Hi Uros,

(working on a re-vamp with an expander for the nonlocal goto MD).

On 18 Jul 2013, at 08:26, Uros Bizjak wrote:

> Under the assumption that foo and bar don't share the same GOT, you
> will see that g42 after the label is accessed with "clobbered" %ebx.

My understanding is that foo and bar *have* to share the same GOT
(since they must be in the same object). It is only foo and its caller
that would have different GOTs.

---

Note, however, that IFF EBX might be used in foo() for some other
purpose (i.e. such that its value is not the GOT and needs to be
preserved across the call to bar()) then we have a more general
problem - i.e. that x86 needs some general way to restore ebx at the
site of a non-local-goto-receiver.

I don't think it works to restore ebx inside bar() since the goto might
not be to bar()'s caller, e.g. in Jakub's example in the PR.

thanks
Iain
Re: [PATCH] Add a new option "-ftree-bitfield-merge" (patch / doc inside)
Zoran Jovanovic wrote:
>Hello,
>This patch adds new optimization pass that combines several adjacent
>bit field accesses that copy values from one memory location to another
>into single bit field access.
>
>Example:
>
>Original code:
>  D.1351;
>  D.1350;
>  D.1349;
>  D.1349_2 = p1_1(D)->f1;
>  p2_3(D)->f1 = D.1349_2;
>  D.1350_4 = p1_1(D)->f2;
>  p2_3(D)->f2 = D.1350_4;
>  D.1351_5 = p1_1(D)->f3;
>  p2_3(D)->f3 = D.1351_5;
>
>Optimized code:
>  D.1358;
>  D.1358_10 = BIT_FIELD_REF <*p1_1(D), 19, 13>;
>  BIT_FIELD_REF <*p2_3(D), 19, 13> = D.1358_10;
>
>Algorithm works on basic block level and consists of following 3 major
>steps:
>1. Go through basic block statements list. If there are statement pairs
>that implement copy of bit field content from one memory location to
>another record statements pointers and other necessary data in
>corresponding data structure.
>2. Identify records that represent adjacent bit field accesses and mark
>them as merged.
>3. Modify trees accordingly.

All this should use BITFIELD_REPRESENTATIVE both to decide what
accesses are related and for the lowering. This makes sure to honor the
appropriate memory models.

In theory only lowering is necessary and FRE and DSE will do the job of
optimizing - also properly accounting for alias issues that Joseph
mentions. The lowering and analysis is strongly related to SRA, so I
don't believe we want a new pass for this.

Richard.

>New command line option "-ftree-bitfield-merge" is introduced.
>
>Tested - passed gcc regression tests.
>
>Changelog -
>
>gcc/ChangeLog:
>2013-07-17 Zoran Jovanovic (zoran.jovano...@imgtec.com)
> * Makefile.in : Added tree-ssa-bitfield-merge.o to OBJS.
> * common.opt (ftree-bitfield-merge): New option.
> * doc/invoke.texi: Added reference to "-ftree-bitfield-merge".
> * dwarf2out.c (field_type): static removed from declaration.
> (simple_type_size_in_bits): static removed from declaration.
> (field_byte_offset): static removed from declaration.
> (field_type): static inline removed from declaration. > * passes.c (init_optimization_passes): pass_bitfield_merge pass >added. > * testsuite/gcc.dg/tree-ssa/bitfldmrg.c: New test. > * timevar.def : Added TV_TREE_BITFIELD_MERGE. > * tree-pass.h : Added pass_bitfield_merge declaration. > * tree-ssa-bitfield-merge.c : New file. > >Patch - > >diff --git a/gcc/Makefile.in b/gcc/Makefile.in >index d5121f3..5cdd6eb 100644 >--- a/gcc/Makefile.in >+++ b/gcc/Makefile.in >@@ -1417,6 +1417,7 @@ OBJS = \ > tree-ssa-dom.o \ > tree-ssa-dse.o \ > tree-ssa-forwprop.o \ >+ tree-ssa-bitfield-merge.o \ > tree-ssa-ifcombine.o \ > tree-ssa-live.o \ > tree-ssa-loop-ch.o \ >@@ -2312,6 +2313,11 @@ tree-ssa-forwprop.o : tree-ssa-forwprop.c >$(CONFIG_H) $(SYSTEM_H) coretypes.h \ > $(TREE_FLOW_H) $(TREE_PASS_H) $(DIAGNOSTIC_H) \ > langhooks.h $(FLAGS_H) $(GIMPLE_H) $(GIMPLE_PRETTY_PRINT_H) >$(EXPR_H) \ > $(OPTABS_H) tree-ssa-propagate.h >+tree-ssa-bitfield-merge.o : tree-ssa-forwprop.c $(CONFIG_H) >$(SYSTEM_H) \ >+ coretypes.h $(TM_H) $(TREE_H) $(TM_P_H) $(BASIC_BLOCK_H) \ >+ $(TREE_FLOW_H) $(TREE_PASS_H) $(TREE_DUMP_H) $(DIAGNOSTIC_H) >$(TIMEVAR_H) \ >+ langhooks.h $(FLAGS_H) $(GIMPLE_H) tree-pretty-print.h \ >+ gimple-pretty-print.h $(EXPR_H) > tree-ssa-phiprop.o : tree-ssa-phiprop.c $(CONFIG_H) $(SYSTEM_H) >coretypes.h \ > $(TM_H) $(TREE_H) $(TM_P_H) $(BASIC_BLOCK_H) \ > $(TREE_FLOW_H) $(TREE_PASS_H) $(DIAGNOSTIC_H) \ >@@ -3803,6 +3809,7 @@ GTFILES = $(CPP_ID_DATA_H) $(srcdir)/input.h >$(srcdir)/coretypes.h \ > $(srcdir)/ipa-inline.h \ > $(srcdir)/asan.c \ > $(srcdir)/tsan.c \ >+ $(srcdir)/tree-ssa-bitfield-merge.c \ > @all_gtfiles@ > > # Compute the list of GT header files from the corresponding C >sources, >diff --git a/gcc/common.opt b/gcc/common.opt >index 4c7933e..e0dbc37 100644 >--- a/gcc/common.opt >+++ b/gcc/common.opt >@@ -2088,6 +2088,10 @@ ftree-forwprop > Common Report Var(flag_tree_forwprop) Init(1) Optimization > Enable forward propagation on trees > >+ftree-bitfield-merge 
>+Common Report Var(flag_tree_bitfield_merge) Init(0) Optimization >+Enable bit field merge on trees >+ > ftree-fre > Common Report Var(flag_tree_fre) Optimization > Enable Full Redundancy Elimination (FRE) on trees >diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi >index dd82880..7b671aa 100644 >--- a/gcc/doc/invoke.texi >+++ b/gcc/doc/invoke.texi >@@ -409,7 +409,7 @@ Objective-C and Objective-C++ Dialects}. > -fsplit-ivs-in-unroller -fsplit-wide-types -fstack-protector @gol > -fstack-protector-all -fstack-protector-strong -fstrict-aliasing @gol > -fstrict-overflow -fthread-jumps -ftracer -ftree-bit-ccp @gol >--ftree-builtin-call-dce -ftree-ccp -ftree-ch @gol >+-ftree-bitfield-merge -ftree-builtin-call-dce -ftree-ccp -ftree-ch >@gol > -ftree-coalesce-inline-vars -ftree-coalesce-vars -ftree-copy-prop @gol > -ftree-copyrename -ftree-dce -ftree-dominator-opts -ftree-dse @gol > -ftree-forwprop -ftree-fre -ftre
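
As an aside on the transformation discussed in this thread, the following is a rough source-level sketch of what merging three adjacent bit-field copies into one access amounts to. It is not code from the patch: struct fields, copy_fieldwise, copy_merged and group_mask are hypothetical names, and the 13/7/5/7 field widths are only chosen to echo the "19 bits at offset 13" of the BIT_FIELD_REF in the example. It assumes the struct occupies exactly one 32-bit word, which holds for the usual GCC ABIs but is implementation-defined in general.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* 13 bits that must be left alone, followed by 7 + 5 + 7 = 19 bits
   that are copied as a group.  */
struct fields
{
  unsigned int pad : 13;
  unsigned int f1 : 7;
  unsigned int f2 : 5;
  unsigned int f3 : 7;
};

_Static_assert (sizeof (struct fields) == sizeof (uint32_t),
                "sketch assumes the bit-fields share one 32-bit word");

/* The original form: three separate bit-field loads and stores.  */
static void
copy_fieldwise (struct fields *dst, const struct fields *src)
{
  dst->f1 = src->f1;
  dst->f2 = src->f2;
  dst->f3 = src->f3;
}

/* Compute which bits of the containing word belong to f1..f3, without
   assuming a particular bit-field layout or endianness.  */
static uint32_t
group_mask (void)
{
  struct fields ones;
  uint32_t mask;

  memset (&ones, 0, sizeof ones);
  ones.f1 = 0x7f;
  ones.f2 = 0x1f;
  ones.f3 = 0x7f;
  memcpy (&mask, &ones, sizeof mask);
  return mask;
}

/* The merged form: one masked load and one masked store.  */
static void
copy_merged (struct fields *dst, const struct fields *src)
{
  uint32_t mask = group_mask ();
  uint32_t d, s;

  assert (dst != NULL && src != NULL);
  memcpy (&d, dst, sizeof d);
  memcpy (&s, src, sizeof s);
  d = (d & ~mask) | (s & mask);
  memcpy (dst, &d, sizeof d);
}
```

Note that this sketch rewrites the whole word, including the bits of pad. Whether a wider access like that is permitted is the memory-model question that the review comment about BITFIELD_REPRESENTATIVE is concerned with, so treat the sketch as an illustration of the idea rather than of a valid lowering.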
Re: [Patch] Partially implement regex_search
On Wed, Jul 17, 2013 at 7:06 PM, Jonathan Wakely wrote:
> The changelog has a typo, _M__search_from_first has two underscores.
>
> The testcase dg-options should use -std=gnu++11 not -std=c++0x. Is
> the testcase based on an existing file? If not the copyright year
> should just be 2013.

These will be fixed when I commit to SVN.

By the way, this implementation is not necessarily the ultimate one.
It's simple and doesn't require NFA modification. For efficiency, a
"some_re".search() => ".*(some_re).*".match() algorithm should be used
in the future (at least for the Thompson NFA).

--
Tim Shen
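
The search-to-match reduction mentioned above can be illustrated outside libstdc++. The sketch below uses the POSIX <regex.h> interface rather than std::regex, purely to show that searching for some_re gives the same answer as a whole-string match of ".*(some_re).*". The helper names full_match and search_via_match are made up for this example, and the fixed buffer sizes are an arbitrary simplification.

```c
#include <assert.h>
#include <regex.h>
#include <stdio.h>

/* Return 1 if PATTERN matches all of TEXT, 0 if it does not, and -1
   if the pattern is too long or fails to compile.  */
static int
full_match (const char *pattern, const char *text)
{
  char anchored[256];
  regex_t re;
  int n, rc;

  assert (pattern != NULL && text != NULL);
  n = snprintf (anchored, sizeof anchored, "^(%s)$", pattern);
  if (n < 0 || (size_t) n >= sizeof anchored)
    return -1;
  if (regcomp (&re, anchored, REG_EXTENDED | REG_NOSUB) != 0)
    return -1;
  rc = regexec (&re, text, 0, NULL, 0);
  regfree (&re);
  return rc == 0;
}

/* Search expressed through match: "re".search() is treated as
   ".*(re).*".match().  */
static int
search_via_match (const char *pattern, const char *text)
{
  char wrapped[200];
  int n;

  assert (pattern != NULL && text != NULL);
  n = snprintf (wrapped, sizeof wrapped, ".*(%s).*", pattern);
  if (n < 0 || (size_t) n >= sizeof wrapped)
    return -1;
  return full_match (wrapped, text);
}
```

For instance, full_match ("b+c", "aabbbcdd") fails because the pattern does not cover the whole string, while search_via_match ("b+c", "aabbbcdd") succeeds.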
Re: [Patch x86/darwin] fix PR51784 'PIC register not correctly preserved...'
Hi Uros,

On 18 Jul 2013, at 07:31, Uros Bizjak wrote:

> This should be implemented as an expander. You also won't need
> UNSPEC_NLGR that way.

Now that I have reloaded the state from this PR, I recall why this did
not work.

in the case:

foo ()
{
  do stuff that doesn't use the pic reg

  call nested function

nonlocal_label:

  do stuff that uses the pic register.
}

+{
+  if (crtl->uses_pic_offset_table)
+    {
+      rtx xops[3];

crtl->uses_pic_offset_table is not set at the point that
"nonlocal_label:" is evaluated.

So, I think we have to use the define_insn_and_split, or am I still
missing something?

Thanks
Iain
Re: [Ping] [Patch, AArch64, ILP32] 1/5 Initial support - configury changes
Ping~

Thanks,
Yufeng

On 07/02/13 19:53, Yufeng Zhang wrote:

Hi Andrew,

Please find the updated patch in the attachment that addresses your
comments.

It now builds both ilp32 and lp64 multilibs by default, with the
--with-multilib-list support remaining to provide options to turn off
one of them.

-mabi=ilp32 and -mabi=lp64 are now the command line options to use.
The SPECs have been updated as well.

Thanks,
Yufeng

gcc/
	* config.gcc (aarch64*-*-*): Support --with-abi.
	(aarch64*-*-elf): Support --with-multilib-list.
	(aarch64*-*-linux*): Likewise.
	(supported_defaults): Add abi to aarch64*-*-*.
	* configure.ac: Mention AArch64 for --with-multilib-list.
	* configure: Re-generated.
	* config/aarch64/biarchilp32.h: New file.
	* config/aarch64/biarchlp64.h: New file.
	* config/aarch64/aarch64-elf.h (ENDIAN_SPEC): New define.
	(ABI_SPEC): Ditto.
	(MULTILIB_DEFAULTS): Ditto.
	(DRIVER_SELF_SPECS): Ditto.
	(ASM_SPEC): Update to also substitute -mabi.
	* config/aarch64/aarch64-elf-raw.h (LINK_SPEC): Add linker script
	file whose name depends on -mabi= and -mbig-endian.
	* config/aarch64/aarch64.h (LONG_TYPE_SIZE): Change to depend on
	TARGET_ILP32.
	(POINTER_SIZE): New define.
	(POINTERS_EXTEND_UNSIGNED): Ditto.
	(enum aarch64_abi_type): New enumeration tag.
	(AARCH64_ABI_LP64, AARCH64_ABI_ILP32): New enumerators.
	(AARCH64_ABI_DEFAULT): Define to AARCH64_ABI_LP64 if undefined.
	(TARGET_ILP32): New define.
	* config/aarch64/aarch64.opt (mabi): New.
	(aarch64_abi): New.
	(ilp32, lp64): New values for -mabi.
	* config/aarch64/t-aarch64 (comma): New define.
	(MULTILIB_OPTIONS): Ditto.
	(MULTILIB_DIRNAMES): Ditto.
	* config/aarch64/t-aarch64-linux (MULTIARCH_DIRNAME): New define.
	* doc/invoke.texi: Document -mabi for AArch64.

On 06/26/13 23:59, Andrew Pinski wrote:

On Wed, Jun 26, 2013 at 3:33 PM, Yufeng Zhang wrote:

This patch adds the configuration changes to the AArch64 GCC to
support:

* -milp32 and -mlp64 options in the compiler and the driver
* multilib of ilp32 and/or lp64 libraries
* differentiation of basic types in the compiler backend

The patch enables --with-multilib-list configuration option for
specifying the list of library flavors to enable; the default value is
"mlp64" and can be overridden by --with-abi to "milp32".

It also enables --with-abi for setting the default model in the
compiler. Its default value is "mlp64" unless --with-multilib-list is
explicitly specified with "milp32", in which case it defaults to
"milp32".

In the backend, two target flags are introduced: TARGET_ILP32 and
TARGET_LP64. They are set by -milp32 and -mlp64 respectively,
exclusive to each other. The default setting is via the option
variable aarch64_pmodel_flags, which defaults to TARGET_DEFAULT_PMODEL,
which is further defined in biarchlp64.h or biarchilp32.h depending
which header file is included.

                         biarchlp64.h       biarchilp32.h
TARGET_DEFAULT_PMODEL    OPTION_MASK_LP64   OPTION_MASK_ILP32
TARGET_PMODEL            1                  2

TARGET_ILP32 and TARGET_LP64 are implicitly defined as:

#define TARGET_ILP32 ((aarch64_pmodel_flags & OPTION_MASK_ILP32) != 0)
#define TARGET_LP64 ((aarch64_pmodel_flags & OPTION_MASK_LP64) != 0)

Note that the multilib support in the Linux toolchain is suppressed
deliberately.

OK for the trunk?

I think you should not support --with-multilib-list at all. It should
just include ilp32 multilib no matter what. Note the linux multilib
has to wait until the glibc/kernel side is done.

Also:

+#if TARGET_BIG_ENDIAN_DEFAULT == 1
+#define EMUL_SUFFIX "b"
+#else
+#define EMUL_SUFFIX ""
+#endif

is broken when you supply the opposite endian option.

Also you really should just use -mabi=ilp32 and -mabi=lp64 which
reduces the number of changes needed to be done to config.gcc.

You should use DRIVER_SELF_SPECS to simplify your LINK_SPEC.
Something like:

#ifdef TARGET_BIG_ENDIAN_DEFAULT
#define ENDIAN_SPEC "-mbig-endian"
#else
#define ENDIAN_SPEC "-mlittle-endian"
#endif

/* Force the default endianness and ABI flags onto the command line
   in order to make the other specs easier to write.  */
#undef DRIVER_SELF_SPECS
#define DRIVER_SELF_SPECS \
  " %{!mbig-endian:%{!mlittle-endian:" ENDIAN_SPEC "}}" \
  " %{!milp32:%{!mlp64:-mlp64}}"

or rather:

  " %{!mabi=*: -mabi=lp64}"

And then in aarch64-elf-raw.h:

#ifndef LINK_SPEC
#define LINK_SPEC "%{mbig-endian:-EB} %{mlittle-endian:-EL} -X \
  -maarch64elf%{milp32:32}%{mbig-endian:b}"
#endif

Or using the -mabi=* way:

#ifndef LINK_SPEC
#define LINK_SPEC "%{mbig-endian:-EB} %{mlittle-endian:-EL} -X \
  -maarch64elf%{mabi=ilp32:32}%{mbig-endian:b}"
#endif

Thanks,
Andrew Pinski
Re: [Ping] [Patch, AArch64, ILP32] 2/5 More backend changes and support for small absolute and small PIC addressing models
Ping~

Thanks,
Yufeng

On 06/26/13 23:35, Yufeng Zhang wrote:

This patch updates the AArch64 backend to support the small absolute
and small PIC addressing models for ILP32; it also updates a number of
other backend macros and hooks in order to support ILP32.

OK for the trunk?

Thanks,
Yufeng

gcc/
	* config/aarch64/aarch64.c (POINTER_BYTES): New define.
	(aarch64_load_symref_appropriately): In the case of
	SYMBOL_SMALL_ABSOLUTE, use the mode of 'dest' instead of Pmode
	to generate new rtx; likewise to the case of SYMBOL_SMALL_GOT.
	(aarch64_expand_mov_immediate): In the case of SYMBOL_FORCE_TO_MEM,
	change to pass 'ptr_mode' to force_const_mem and zero-extend 'mem'
	if 'mode' doesn't equal to 'ptr_mode'.
	(aarch64_output_mi_thunk): Add an assertion on the alignment of
	'vcall_offset'; change to call aarch64_emit_move differently
	depending on whether 'Pmode' equals to 'ptr_mode' or not; use
	'POINTER_BYTES' to calculate the upper bound of 'vcall_offset'.
	(aarch64_cannot_force_const_mem): Change to also return true if
	mode != ptr_mode.
	(aarch64_legitimize_reload_address): In the case of large
	displacements, add new local variable 'xmode' and an assertion
	based on it; change to use 'xmode' to generate the new rtx and
	reload.
	(aarch64_asm_trampoline_template): Change to generate the template
	differently depending on TARGET_ILP32 or not; change to use
	'POINTER_BYTES' in the argument passed to assemble_aligned_integer.
	(aarch64_trampoline_size): Removed.
	(aarch64_trampoline_init): Add new local constant 'tramp_code_sz'
	and replace immediate literals with it.  Change to use 'ptr_mode'
	instead of 'DImode' and call convert_memory_address if the mode of
	'fnaddr' doesn't equal to 'ptr_mode'.
	(aarch64_elf_asm_constructor): Change to use
	assemble_aligned_integer to output symbol.
	(aarch64_elf_asm_destructor): Likewise.
	* config/aarch64/aarch64.h (TRAMPOLINE_SIZE): Change to be
	dependent on TARGET_ILP32 instead of aarch64_trampoline_size.
	* config/aarch64/aarch64.md (movsi_aarch64): Add new alternatives
	of 'mov' between WSP and W registers as well as 'adr' and 'adrp'.
	(loadwb_pair_): Rename to ...
	(loadwb_pair_): ... this.  Replace PTR with P.
	(storewb_pair_): Likewise; rename to ...
	(storewb_pair_): ... this.
	(add_losym): Change to 'define_expand' and call gen_add_losym_
	depending on the value of 'mode'.
	(add_losym_): New.
	(ldr_got_small_): New, based on ldr_got_small.
	(ldr_got_small): Remove.
	(ldr_got_small_sidi): New.
	* config/aarch64/iterators.md (P): New.
	(PTR): Change to 'ptr_mode' in the condition.
Re: [Ping^3] [Patch, AArch64, ILP32] 3/5 Minor change in function.c:assign_parm_find_data_types()
Ping^3~

Thanks,
Yufeng

On 07/08/13 11:11, Yufeng Zhang wrote:

Ping^2~

Thanks,
Yufeng

On 07/02/13 23:44, Yufeng Zhang wrote:

Ping~

Can I get an OK please if there is no objection?

Regards,
Yufeng

On 06/26/13 23:39, Yufeng Zhang wrote:

This patch updates assign_parm_find_data_types to assign passed_mode
and nominal_mode with the mode of the built pointer type instead of the
hard-coded Pmode in the case of pass-by-reference. This is in line
with the assignment to passed_mode and nominal_mode in other cases
inside the function.

assign_parm_find_data_types generally uses TYPE_MODE to calculate
passed_mode and nominal_mode:

  /* Find mode of arg as it is passed, and mode of arg as it should be
     during execution of this function.  */
  passed_mode = TYPE_MODE (passed_type);
  nominal_mode = TYPE_MODE (nominal_type);

this includes the case when the passed argument is a pointer by itself.

However there is a discrepancy when it deals with argument passed by
invisible reference; it builds the argument's corresponding pointer
type, but sets passed_mode and nominal_mode with Pmode directly.

This is OK for targets where Pmode == ptr_mode, but on AArch64 with
ILP32 they are different with Pmode as DImode and ptr_mode as SImode.
When such a reference is passed on stack, the reference is prepared by
the caller in the lower 4 bytes of an 8-byte slot but is fetched by the
callee as an 8-byte datum, of which the higher 4 bytes may contain
junk. It is probably the combination of Pmode != ptr_mode and the
particular ABI specification that make the AArch64 ILP32 the first
target on which the issue manifests itself.

Bootstrapped on x86_64-none-linux-gnu.

OK for the trunk?

Thanks,
Yufeng

gcc/
	* function.c (assign_parm_find_data_types): Set passed_mode and
	nominal_mode to the TYPE_MODE of nominal_type for the built
	pointer type in case of the struct-pass-by-reference.
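
The failure mode described above can be modelled in a few lines of ordinary C. This is only an analogy written for illustration, not GCC code: stack_slot, caller_store_ref, callee_load_as_pmode and callee_load_as_ptr_mode are hypothetical names, with the 4-byte access standing in for ptr_mode (SImode) and the 8-byte access for Pmode (DImode).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* An 8-byte argument slot on the stack; it may hold stale data.  */
typedef struct
{
  unsigned char bytes[8];
} stack_slot;

/* The caller prepares the reference as a 4-byte value at the start of
   the slot and leaves the other 4 bytes untouched.  */
static void
caller_store_ref (stack_slot *slot, uint32_t ref)
{
  assert (slot != NULL);
  memcpy (slot->bytes, &ref, sizeof ref);
}

/* The callee fetching the slot as an 8-byte datum (the Pmode view).  */
static uint64_t
callee_load_as_pmode (const stack_slot *slot)
{
  uint64_t value;

  memcpy (&value, slot->bytes, sizeof value);
  return value;
}

/* The callee fetching only the 4 bytes that were written (the
   ptr_mode view).  */
static uint32_t
callee_load_as_ptr_mode (const stack_slot *slot)
{
  uint32_t value;

  memcpy (&value, slot->bytes, sizeof value);
  return value;
}
```

If the slot is first filled with a junk byte pattern and then written with caller_store_ref, the 4-byte load returns the reference while the 8-byte load returns a value that also contains the junk, which is the mismatch the patch description points at.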
Re: [Ping] [Patch, AArch64, ILP32] 4/5 Change tests to be ILP32-friendly
Ping~

Thanks,
Yufeng

On 06/26/13 23:41, Yufeng Zhang wrote:

The attached patch fixes a few gcc test cases.

Thanks,
Yufeng

gcc/testsuite/
	* gcc.dg/20020219-1.c: Skip the test on aarch64*-*-* in ilp32.
	* gcc.target/aarch64/aapcs64/test_18.c (struct y): Change the field
	type from long to long long.
	* gcc.target/aarch64/atomic-op-long.c: Update dg-final directives
	to have effective-target keywords of lp64 and ilp32.
	* gcc.target/aarch64/fcvt_double_int.c: Likewise.
	* gcc.target/aarch64/fcvt_double_long.c: Likewise.
	* gcc.target/aarch64/fcvt_double_uint.c: Likewise.
	* gcc.target/aarch64/fcvt_double_ulong.c: Likewise.
	* gcc.target/aarch64/fcvt_float_int.c: Likewise.
	* gcc.target/aarch64/fcvt_float_long.c: Likewise.
	* gcc.target/aarch64/fcvt_float_uint.c: Likewise.
	* gcc.target/aarch64/fcvt_float_ulong.c: Likewise.
	* gcc.target/aarch64/vect_smlal_1.c: Replace 'long' with 'long long'.
Re: [Ping] [Patch, AArch64, ILP32] 5/5 Define _ILP32 and __ILP32__
Ping~

Thanks,
Yufeng

On 06/26/13 23:42, Yufeng Zhang wrote:

This patch defines _ILP32 and __ILP32__ for the AArch64 port when the
ILP32 ABI is in use.

This helps libraries, e.g. libgloss and glibc, recognize which model
is being compiled.

OK for the trunk?

Thanks,
Yufeng

gcc/
	* config/aarch64/aarch64.h (TARGET_CPU_CPP_BUILTINS): Define _ILP32
	and __ILP32__ when the ILP32 model is in use.
Re: [Ping] [Patch, AArch64, ILP32] Pad pointer-typed stack argument downward in ILP32
Ping~

Thanks,
Yufeng

On 06/27/13 17:00, Yufeng Zhang wrote:

This patch fixes the bug that pointer-typed argument passed on stack
is not padded properly in ILP32.

OK for the trunk?

Thanks,
Yufeng

gcc/
	* config/aarch64/aarch64.c (aarch64_pad_arg_upward): In big-endian,
	pad pointer-typed argument downward.

gcc/testsuite/
	* gcc.target/aarch64/test-ptr-arg-on-stack-1.c: New test.
[patch,avr] Fix PR57516 fixed-point rounding in the overflow case
Currently, the fixed-point rounding does not work correctly in the
overflow case. This is because of misreading section 2.1.7.2 of
TR 18037.

Rounding builtins expand to saturated addition and AND so that the
instruction sequence is

    add value1
    if not overflow goto 0
    load max value
0:
    and value2

where the correct sequence reads

    add value1
    if not overflow goto 0
    load max value
    goto 1
0:
    and value2
1:

This change is performed by the patch. The round expander is
transformed to an insn that uses avr_out_plus and avr_out_bitop to
print most of the instructions.

Okay to apply?

Johann

gcc/
	PR target/57516
	* config/avr/avr-fixed.md (round3_const): Turn expander to insn.
	* config/avr/avr.md (adjust_len): Add `round'.
	* config/avr/avr-protos.h (avr_out_round): New prototype.
	(avr_out_plus): Add `out_label' argument.
	* config/avr/avr.c (avr_out_plus_1): Add `out_label' argument.
	(avr_out_plus): Pass down `out_label' to avr_out_plus_1.
	Handle the case where `insn' is just a pattern.
	(avr_out_bitop): Handle the case where `insn' is just a pattern.
	(avr_out_round): New function.
	(avr_adjust_insn_length): Handle ADJUST_LEN_ROUND.

libgcc/
	PR target/57516
	* config/avr/lib1funcs-fixed.S (__roundqq3, __rounduqq3)
	(__round_s2_const, __round_u2_const)
	(__round_s4_const, __round_u4_const, __round_x8):
	Saturate result if addition result cannot be represented.

gcc/testsuite/
	PR target/57516
	* gcc.target/avr/torture/builtins-4-roundfx.c (test2hr, test2k):
	Adjust to corrected rounding.

Index: gcc/config/avr/avr-fixed.md === --- gcc/config/avr/avr-fixed.md (revision 200903) +++ gcc/config/avr/avr-fixed.md (working copy) @@ -447,49 +447,18 @@ (define_expand "round3" ;; "roundqq3_const" "rounduqq3_const" ;; "roundhq3_const" "rounduhq3_const" "roundha3_const" "rounduha3_const" ;; "roundsq3_const" "roundusq3_const" "roundsa3_const" "roundusa3_const" -(define_expand "round3_const" - [(parallel [(match_operand:ALL124QA 0 "register_operand" "") - (match_operand:ALL124QA 1 "register_operand" "") - (match_operand:HI 2 "const_int_operand" "")])] +(define_insn "round3_const" + [(set (match_operand:ALL124QA 0 "register_operand" "=d") +(unspec:ALL124QA [(match_operand:ALL124QA 1 "register_operand" "0") + (match_operand:HI 2 "const_int_operand" "n") + (const_int 0)] + UNSPEC_ROUND))] "" { -// The rounding point RP is $2. The smallest fractional -// bit that is not cleared by the rounding is 2^(-RP). - -enum machine_mode imode = int_mode_for_mode (mode); -int fbit = (int) GET_MODE_FBIT (mode); - -// Add-Saturate 1/2 * 2^(-RP) - -double_int i_add = double_int_zero.set_bit (fbit-1 - INTVAL (operands[2])); -rtx x_add = const_fixed_from_double_int (i_add, mode); - -if (SIGNED_FIXED_POINT_MODE_P (mode)) - emit_move_insn (operands[0], - gen_rtx_SS_PLUS (mode, operands[1], x_add)); -else - emit_move_insn (operands[0], - gen_rtx_US_PLUS (mode, operands[1], x_add)); - -// Keep all bits from RP and higher: ... 2^(-RP) -// Clear all bits from RP+1 and lower: 2^(-RP-1) ... 
-// Rounding point ^^^ -// Added above ^ - -rtx xreg = simplify_gen_subreg (imode, operands[0], mode, 0); -rtx xmask = immed_double_int_const (-i_add - i_add, imode); - -if (SImode == imode) - emit_insn (gen_andsi3 (xreg, xreg, xmask)); -else if (HImode == imode) - emit_insn (gen_andhi3 (xreg, xreg, xmask)); -else if (QImode == imode) - emit_insn (gen_andqi3 (xreg, xreg, xmask)); -else - gcc_unreachable(); - -DONE; - }) +return avr_out_round (insn, operands); + } + [(set_attr "cc" "clobber") + (set_attr "adjust_len" "round")]) ;; "*roundqq3.libgcc" "*rounduqq3.libgcc" Index: gcc/config/avr/avr.md === --- gcc/config/avr/avr.md (revision 200903) +++ gcc/config/avr/avr.md (working copy) @@ -140,7 +140,7 @@ (define_attr "adjust_len" "out_bitop, plus, addto_sp, tsthi, tstpsi, tstsi, compare, compare64, call, mov8, mov16, mov24, mov32, reload_in16, reload_in24, reload_in32, - ufract, sfract, + ufract, sfract, round, xload, lpm, movmem, ashlqi, ashrqi, lshrqi, ashlhi, ashrhi, lshrhi, Index: gcc/config/avr/avr-protos.h === --- gcc/config/avr/avr-protos.h (revision 200903) +++ gcc/config/avr/avr-protos.h (working copy) @@ -86,7 +86,8 @@ extern int avr_starting_frame_offset (vo extern void avr_output_addr_vec_elt (FILE *s
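
To make the difference between the two instruction sequences in the description concrete, here is a small C model of them for an 8-bit unsigned fract value with 8 fractional bits. It is a sketch for illustration only and not part of the patch: round_old and round_fixed are hypothetical names, and the value is simply held in a uint8_t.

```c
#include <assert.h>
#include <stdint.h>

/* RP is the rounding point: bits below 2^(-RP) are cleared after
   adding one half of 2^(-RP).  Valid range here is 0..7.  */

/* Old sequence: saturating add, then always AND with the mask.  */
static uint8_t
round_old (uint8_t x, int rp)
{
  unsigned half, sum;
  uint8_t mask, r;

  assert (rp >= 0 && rp < 8);
  half = 1u << (7 - rp);
  mask = (uint8_t) (0u - 2u * half);
  sum = (unsigned) x + half;
  r = sum > 0xffu ? 0xffu : (uint8_t) sum;   /* saturate on overflow */
  return r & mask;                           /* mask hits the max value too */
}

/* Corrected sequence: on overflow load the max value and skip the AND.  */
static uint8_t
round_fixed (uint8_t x, int rp)
{
  unsigned half, sum;
  uint8_t mask;

  assert (rp >= 0 && rp < 8);
  half = 1u << (7 - rp);
  mask = (uint8_t) (0u - 2u * half);
  sum = (unsigned) x + half;
  if (sum > 0xffu)
    return 0xffu;
  return (uint8_t) sum & mask;
}
```

The two functions agree whenever the addition does not overflow. In the overflow case the old sequence masks the saturated value, so for example an input of 0xff with rounding point 4 gives 0xf0 instead of the maximum value 0xff.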
Re: [PATCH 3/4] Introduce NEXT_PASS_NUM macro
On Thu, 2013-07-18 at 00:08 -0700, Andrew Pinski wrote:
> On Wed, Jul 17, 2013 at 6:18 PM, David Malcolm wrote:
> > gcc/
> >
> > Explicitly number the instances of passes within passes.def.
> >
> > This is needed by a subsequent patch so that we can create
> > fields within the pipeline class for each pass instance (to help
> > locate pass instances when debugging).
> >
> > * passes.c (NEXT_PASS_NUM): Define.
> >
> > * passes.def (NEXT_PASS, NEXT_PASS_NUM): Replace uses of
> > NEXT_PASS on passes that have multiple instances with uses of
> > NEXT_PASS_NUM.
>
>
> I don't like this patch at all. Mainly because the numbers can get
> out of sync very quickly especially when it comes to internal versions
> of the compiler where it is normal to reorder passes and add another
> pass a few times.

How would you feel about a "passes.def.in" and having that be what's in
svn, with some kind of preprocessing step that builds a passes.def from
it?

That way we get the flexibility of before, but gain the ability I'm
looking for to make a class holding the passes, and have easy access in
gdb to the various instances of the class (rather than just the last
instance of each that was created).

If so, what tools are blessed for usage at build time? (I'd prefer
Python, but I don't think that's a build-time dep yet).
Re: [Patch x86/darwin] fix PR51784 'PIC register not correctly preserved...'
On Thu, Jul 18, 2013 at 12:12 PM, Iain Sandoe wrote: >> This should be implemented as an expander. You also won't need >> UNSPEC_NLGR that way. > > Now I reload the state from this PR, I recall why this did not work. > > in the case: > > foo () > { > > do stuff that doesn't use the pic reg > > call nested function > > nonlocal_label: > > do stuff that uses the pic register. > > } > > +{ > + if (crtl->uses_pic_offset_table) > +{ > + rtx xops[3]; > > crtl->uses_pic_offset_table is not set at the point that "nonlocal_label:" is > evaluated. > > So, I think we have to use the define_insn_and_split, or am I still missing > something? Just a wild guess, do you also need "&& reload_completed" in the split condition? Uros.
[Patch RX] Add assembler option "-mcu" for generating assembler
Hi,

Please find the patch to add the assembler option "-mcpu", so that the
assembler can generate error messages when it sees FPU code for targets
that do not support a hardware FPU, namely RX100 and RX200.

KPIT has recently submitted a patch to add warnings for RX variants that
do not have hardware FPU support:
http://www.sourceware.org/ml/binutils/2013-07/msg00085.html

Index: gcc/config/rx/rx.h
===
--- gcc/config/rx/rx.h.orig 2013-07-18 18:03:11.0 +0530
+++ gcc/config/rx/rx.h 2013-07-11 14:57:17.0 +0530
@@ -101,6 +101,7 @@
 %{mpid} \
 %{mint-register=*} \
 %{mgcc-abi:-mgcc-abi} %{!mgcc-abi:-mrx-abi} \
+%{mcpu=*} \
 "

No regression found with this patch.
Please let me know if this is OK.

Regards,
Sandeep Kumar Singh,
KPIT Cummins InfoSystems Ltd.
Pune, India

gas/config:
2013-07-18 Sandeep Kumar Singh

* rx.h: Add option -mcpu for target variants RX100 and RX200.
[patch,cilk-plus,testsuite] Skip int16 and size16 targets (too much FAILs)
Hi,

running the cilk-plus.exp tests I get ~200 FAILs because the tests are
not written for platforms with a 16-bit int or size_t.

As a quick tentative fix, the cilk-plus tests are skipped on such
platforms. Common problems are:

- internal compiler error: in build_int_cst_wide, at tree.c:1214
- warning: overflow in implicit constant conversion
- error: total size of local objects too large
- error: size of array 'array4' is too large
- Many execution fails because int32 is assumed.

See attached cilk-fail.txt.

The proposed patch accommodates the current state of the cilk-plus
implementation. Moreover, I don't think any 16-bit or 8-bit platform is
multicore, or that it makes sense to run cilk-plus on them.

I don't know enough about cilk-plus to fix all these FAILs. Therefore it
would be very much appreciated if the cilk-plus maintainers fixed these
FAILs, or applied the skipping of cilk-plus tests on 16-bit platforms
until proper tests are worked out and the ICEs are fixed.

Thanks.

* lib/target-supports.exp (check_effective_target_cilkplus): New proc.
* gcc.dg/cilk-plus/cilk-plus.exp: Only run if
check_effective_target_cilkplus.
* g++.dg/cilk-plus/cilk-plus.exp: Same.

Index: lib/target-supports.exp
===
--- lib/target-supports.exp (revision 200903)
+++ lib/target-supports.exp (working copy)
@@ -1132,6 +1132,24 @@ proc check_effective_target_static_libgf
 } "-static"]
 }
 
+# Return 1 if cilk-plus is supported by the target, 0 otherwise.
+
+proc check_effective_target_cilkplus { } {
+# Skip cilk-plus tests on int16 and size16 targets for now.
+# The cilk-plus tests are not generic enough to cover these
+# cases and would throw hundreds of FAILs.
+if { [check_effective_target_int16]
+ || ![check_effective_target_size32plus] } {
+ return 0;
+}
+
+# Skip AVR, its RAM is too small and too many tests would fail.
+if { [istarget avr-*-*] } { + return 0; +} +return 1 +} + proc check_linker_plugin_available { } { return [check_no_compiler_messages_nocache linker_plugin executable { int main() { return 0; } Index: gcc.dg/cilk-plus/cilk-plus.exp === --- gcc.dg/cilk-plus/cilk-plus.exp (revision 200903) +++ gcc.dg/cilk-plus/cilk-plus.exp (working copy) @@ -19,6 +19,10 @@ load_lib gcc-dg.exp +if { ![check_effective_target_cilkplus] } { +return; +} + dg-init dg-runtest [lsort [glob -nocomplain $srcdir/c-c++-common/cilk-plus/AN/*.c]] " -fcilkplus" " " dg-runtest [lsort [glob -nocomplain $srcdir/c-c++-common/cilk-plus/AN/*.c]] " -O0 -fcilkplus" " " Index: g++.dg/cilk-plus/cilk-plus.exp === --- g++.dg/cilk-plus/cilk-plus.exp (revision 200903) +++ g++.dg/cilk-plus/cilk-plus.exp (working copy) @@ -19,6 +19,10 @@ load_lib g++-dg.exp +if { ![check_effective_target_cilkplus] } { +return; +} + dg-init dg-runtest [lsort [glob -nocomplain $srcdir/c-c++-common/cilk-plus/AN/*.c]] " -fcilkplus" " " dg-runtest [lsort [glob -nocomplain $srcdir/c-c++-common/cilk-plus/AN/*.c]] " -O0 -fcilkplus" " " FAIL: c-c++-common/cilk-plus/AN/an-if.c -fcilkplus (internal compiler error) FAIL: c-c++-common/cilk-plus/AN/an-if.c -fcilkplus (test for excess errors) FAIL: c-c++-common/cilk-plus/AN/array_test2.c -fcilkplus (test for excess errors) FAIL: c-c++-common/cilk-plus/AN/conditional.c -fcilkplus execution test FAIL: c-c++-common/cilk-plus/AN/gather_scatter.c -fcilkplus (test for excess errors) FAIL: c-c++-common/cilk-plus/AN/if_test.c -fcilkplus (test for excess errors) FAIL: c-c++-common/cilk-plus/AN/sec_implicit_ex.c -fcilkplus execution test FAIL: c-c++-common/cilk-plus/AN/sec_reduce_return.c -fcilkplus execution test FAIL: c-c++-common/cilk-plus/AN/an-if.c -O0 -fcilkplus (internal compiler error) FAIL: c-c++-common/cilk-plus/AN/an-if.c -O0 -fcilkplus (test for excess errors) FAIL: c-c++-common/cilk-plus/AN/array_test2.c -O0 -fcilkplus (test for excess errors) FAIL: c-c++-common/cilk-plus/AN/conditional.c 
-O0 -fcilkplus execution test FAIL: c-c++-common/cilk-plus/AN/gather_scatter.c -O0 -fcilkplus (test for excess errors) FAIL: c-c++-common/cilk-plus/AN/if_test.c -O0 -fcilkplus (test for excess errors) FAIL: c-c++-common/cilk-plus/AN/sec_implicit_ex.c -O0 -fcilkplus execution test FAIL: c-c++-common/cilk-plus/AN/sec_reduce_return.c -O0 -fcilkplus execution test FAIL: c-c++-common/cilk-plus/AN/an-if.c -O1 -fcilkplus (internal compiler error) FAIL: c-c++-common/cilk-plus/AN/an-if.c -O1 -fcilkplus (test for excess errors) FAIL: c-c++-common/cilk-plus/AN/array_test2.c -O1 -fcilkplus (test for excess errors) FAIL: c-c++-common/cilk-plus/AN/conditional.c -O1 -fcilkplus execution test FAIL: c-c++-common/cilk-plus/AN/exec-once2.c -O1 -fcilkplus execution test FAIL: c-c++-common/cilk-plus/AN/gather_scatter.c -O1 -fcilkplus (test for excess errors) FA
Re: Go patch committed: Update libgo to 1.1.1
Ian Lance Taylor writes:

> I have committed a large patch to update libgo to the library that was
> part of the Go 1.1.1 release. As usual, I'm not including the entire
> patch in this e-mail message, because it is too large. I'm only
> including the changes to the files that are partially gccgo-specific.
> Bootstrapped and ran Go testsuite on x86_64-unknown-linux-gnu.
> Committed to mainline and 4.8 branch.

This broke the Solaris build:

/vol/gcc/src/hg/trunk/local/libgo/go/net/sock_solaris.go:20:1: error: redefinition of 'listenerSockaddr'
 func listenerSockaddr(s, f int, la syscall.Sockaddr, toAddr func(syscall.Sockaddr) Addr) (syscall.Sockaddr, error) {
 ^
/vol/gcc/src/hg/trunk/local/libgo/go/net/sock_unix.go:11:1: note: previous definition of 'listenerSockaddr' was here
 func listenerSockaddr(s, f int, la syscall.Sockaddr, toAddr func(syscall.Sockaddr) Addr) (syscall.Sockaddr, error) {
 ^
make[2]: *** [net.lo] Error 1

It seems enough to just remove the sock_solaris.go definition.

/vol/gcc/src/hg/trunk/local/libgo/go/log/syslog/syslog_libc.go:18:25: error: use of undefined type 'serverConn'
 func unixSyslog() (conn serverConn, err error) {
 ^
make[6]: *** [log/syslog.lo] Error 1

Didn't make much progress on this one.

Rainer

-- 
Rainer Orth, Center for Biotechnology, Bielefeld University
Re: [patch] [python libstdc++ printers] Fix gdb/15195
> "Phil" == Phil Muldoon writes:

Phil> 2013-07-03 Phil Muldoon
Phil> PR gcc/53477
Phil> http://sourceware.org/bugzilla/show_bug.cgi?id=15195
Phil> * python/libstdcxx/v6/printers.py (Printer.__call__): If a value
Phil> is a reference, fetch referenced value.
Phil> (RxPrinter.invoke): Ditto.
Phil> * testsuite/libstdc++-prettyprinters/cxx11.cc (main): Add -O0
Phil> flag. Add referenced value tests.

Thanks Phil. Remember to CC on these notes.

Phil> +if value.type.code == gdb.TYPE_CODE_REF:
Phil> +value = value.referenced_value()
Phil> +

I think this code should test for the existence of referenced_value
using hasattr. Maybe somebody is still on gdb 7.4.

Tom
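A rough sketch of the hasattr guard suggested here, written so that it can run outside of gdb. The helper name strip_reference and the stand-in classes are made up for illustration; in the real printers the type code would be gdb.TYPE_CODE_REF and the value a gdb.Value.

```python
def strip_reference(value, type_code_ref):
    """Return the referenced value if VALUE is a reference, else VALUE.

    VALUE is expected to behave like a gdb.Value.  TYPE_CODE_REF stands
    in for gdb.TYPE_CODE_REF so that this sketch runs without gdb.
    """
    if value.type.code == type_code_ref and hasattr(value, "referenced_value"):
        return value.referenced_value()
    # Either not a reference, or a gdb too old to provide
    # referenced_value(): hand the value back unchanged.
    return value


# Stand-ins for demonstration only; none of these exist in gdb.
FAKE_TYPE_CODE_REF = 1


class FakeType(object):
    def __init__(self, code):
        self.code = code


class FakeOldRef(object):
    """A reference value as an older gdb without referenced_value sees it."""
    type = FakeType(FAKE_TYPE_CODE_REF)


class FakeNewRef(object):
    """A reference value that does provide referenced_value."""
    type = FakeType(FAKE_TYPE_CODE_REF)

    def referenced_value(self):
        return "target"
```

With the guard in place, an older gdb simply keeps the reference value instead of raising AttributeError inside the printer.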
Re: [PATCH 3/4] Introduce NEXT_PASS_NUM macro
On Thu, Jul 18, 2013 at 4:33 AM, David Malcolm wrote: > On Thu, 2013-07-18 at 00:08 -0700, Andrew Pinski wrote: >> On Wed, Jul 17, 2013 at 6:18 PM, David Malcolm wrote: >> > gcc/ >> > >> > Explicitly number the instances of passes within passes.def. >> > >> > This is needed by a subsequent patch so that we can create >> > fields within the pipeline class for each pass instance (to help >> > locate pass instances when debugging). >> > >> > * passes.c (NEXT_PASS_NUM): Define. >> > >> > * passes.def (NEXT_PASS, NEXT_PASS_NUM): Replace uses of >> > NEXT_PASS on passes that have multiple instances with uses of >> > NEXT_PASS_NUM. >> >> >> I don't like this patch at all. Mainly because the numbers can get >> out of sync very quickly especially when it comes to internal versions >> of the compiler where it is normal to reorder passes and add another >> pass a few times. > > How would you feel about a "passes.def.in" and having that be what's in > svn, with some kind of preprocessing step that builds a passes.def from > it? That way we get the flexibility of before, but gain the ability I'm > looking for to make a class holding the passes, and have easy access in > gdb to the various instances of the class (rather than just the last > instance of each that was created). That would work and would be ok with me. > If so, what tools are blessed for usage at build time? (I'd prefer > Python, but I don't think that's a build-time dep yet). So far awk and shell and C programming are the blessed processing tools. Awk in this case seems like the best for this though. I don't think Python is a good solution only because it does add another build dependency that is not there already though I would not complain about it if it gets added. Thanks, Andrew Pinski
[PATCH, rs6000] Fix flag interaction of new Power8 flags
The following patch fixes a testsuite failure due to the fact that -mcpu=power8 was turning on the new flags such as power8-vector, which would then result in the VSX flag being turned back on after it was previously turned off due to a conflicting option such as -msoft-float. Bootstrap/regtest with no new regressions, ok for trunk? -Pat 2013-07-18 Pat Haugen * config/rs6000/rs6000.c (rs6000_option_override_internal): Adjust flag interaction for new Power8 flags and VSX. Index: gcc/config/rs6000/rs6000.c === --- gcc/config/rs6000/rs6000.c (revision 200903) +++ gcc/config/rs6000/rs6000.c (working copy) @@ -2952,7 +2952,10 @@ rs6000_option_override_internal (bool gl if (rs6000_isa_flags_explicit & OPTION_MASK_VSX) msg = N_("-mvsx requires hardware floating point"); else - rs6000_isa_flags &= ~ OPTION_MASK_VSX; + { + rs6000_isa_flags &= ~ OPTION_MASK_VSX; + rs6000_isa_flags_explicit |= OPTION_MASK_VSX; + } } else if (TARGET_PAIRED_FLOAT) msg = N_("-mvsx and -mpaired are incompatible"); @@ -2980,6 +2983,16 @@ rs6000_option_override_internal (bool gl } } + /* If hard-float/altivec/vsx were explicitly turned off then don't allow + the -mcpu setting to enable options that conflict. */ + if ((!TARGET_HARD_FLOAT || !TARGET_ALTIVEC || !TARGET_VSX) + && (rs6000_isa_flags_explicit & (OPTION_MASK_SOFT_FLOAT + | OPTION_MASK_ALTIVEC + | OPTION_MASK_VSX)) != 0) +rs6000_isa_flags &= ~((OPTION_MASK_P8_VECTOR | OPTION_MASK_CRYPTO + | OPTION_MASK_DIRECT_MOVE) +& ~rs6000_isa_flags_explicit); + if (TARGET_DEBUG_REG || TARGET_DEBUG_TARGET) rs6000_print_isa_options (stderr, 0, "before defaults", rs6000_isa_flags);
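To make the intent of the second hunk easier to follow, here is a small stand-alone model of the mask arithmetic. The bit values and the function name are invented for this sketch; the real OPTION_MASK_* constants come from GCC's option machinery and will differ.

```python
# Hypothetical bit values, for illustration only.
MASK_VSX = 1 << 0
MASK_ALTIVEC = 1 << 1
MASK_SOFT_FLOAT = 1 << 2
MASK_P8_VECTOR = 1 << 3
MASK_CRYPTO = 1 << 4
MASK_DIRECT_MOVE = 1 << 5

# Flags that -mcpu=power8 turns on and that rely on the base features.
P8_DEPENDENT = MASK_P8_VECTOR | MASK_CRYPTO | MASK_DIRECT_MOVE
BASE = MASK_SOFT_FLOAT | MASK_ALTIVEC | MASK_VSX


def drop_conflicting_p8_flags(isa_flags, explicit, hard_float, altivec, vsx):
    """Model of the new check in rs6000_option_override_internal.

    If a base feature is off and the user said something explicit about
    one of the base features, clear the Power8 flags, except for those
    the user asked for explicitly.
    """
    if (not hard_float or not altivec or not vsx) and (explicit & BASE) != 0:
        isa_flags &= ~(P8_DEPENDENT & ~explicit)
    return isa_flags
```

In this model, a soft-float request combined with an explicit crypto request keeps only the crypto bit out of the three Power8 flags, while a build with no explicit base options is left untouched.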
Re: [PATCH 3/4] Introduce NEXT_PASS_NUM macro
> "David" == David Malcolm writes: David> If so, what tools are blessed for usage at build time? (I'd prefer David> Python, but I don't think that's a build-time dep yet). http://www.gnu.org/prep/standards/html_node/Utilities-in-Makefiles.html#Utilities-in-Makefiles Tom
[PATCH] Remove redundant decl of pass_ipa_lto_wpa_fixup
pass_ipa_lto_wpa_fixup was removed in r158622: 2010-04-21 Jan Hubicka [...snip...] * passes.c (init_optimization_passes): Remove pass_ipa_lto_wpa_fixup. but that commit left the declaration still present in tree-pass.h This patch removes the redundant decl. Successfully bootstrapped on x86_64-unknown-linux-gnu OK for trunk? [this one seems obvious to me, but doesn't quite match the letter of the rules in "Free for all" in http://gcc.gnu.org/svnwrite.html , and I'm new here, hence I'm asking out of an abundance of caution :) ] Thanks Dave >From da20870a2220873df67067fcae9a00bace75d376 Mon Sep 17 00:00:00 2001 From: David Malcolm Date: Wed, 17 Jul 2013 23:27:36 -0400 Subject: [PATCH] Remove redundant decl of pass_ipa_lto_wpa_fixup --- gcc/tree-pass.h | 1 - 1 file changed, 1 deletion(-) diff --git a/gcc/tree-pass.h b/gcc/tree-pass.h index b8c59a7..547f355 100644 --- a/gcc/tree-pass.h +++ b/gcc/tree-pass.h @@ -383,7 +383,6 @@ extern struct ipa_opt_pass_d pass_ipa_cp; extern struct ipa_opt_pass_d pass_ipa_reference; extern struct ipa_opt_pass_d pass_ipa_pure_const; extern struct simple_ipa_opt_pass pass_ipa_pta; -extern struct ipa_opt_pass_d pass_ipa_lto_wpa_fixup; extern struct ipa_opt_pass_d pass_ipa_lto_finish_out; extern struct simple_ipa_opt_pass pass_ipa_tm; extern struct ipa_opt_pass_d pass_ipa_profile; -- 1.7.11.7
Re: [Patch, PR 57810] Wasted work in validate_const_int()
On 07/17/2013 10:41 AM, pcha...@cs.wisc.edu wrote: Hi, The problem appears in revision 200945 in version 4.9. I attached a one-line patch that fixes it. I also reported this problem at http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57810 . In method "validate_const_int()" in "gcc/read-rtl.c", the loop on line 804 should break immediately after "valid" is set to "0". All the iterations after "valid" set to "0" do not perform any useful work, at best they just set "valid" again to "0". Bootstrapped and regression tested on x86_64-unknown-linux-gnu. Installed onto the trunk. jeff
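The pattern being fixed can be sketched as follows. This is a simplified stand-in written for illustration, not the code from gcc/read-rtl.c; the function name and the exact validation rules are assumptions.

```python
def is_valid_const_int(text):
    """Return True if TEXT is an optionally signed decimal integer."""
    digits = text.lstrip()
    if digits[:1] in ("-", "+"):
        digits = digits[1:]
    if not digits:
        return False

    valid = True
    for ch in digits:
        if ch not in "0123456789":
            valid = False
            # Nothing later in the loop can set VALID back to True,
            # so the remaining iterations would be wasted work.
            break
    return valid
```

The result is the same with or without the break; the break only stops the scan at the first offending character.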
Re: [Patch, PR 57805] Wasted work in write_roots()
On 07/17/2013 10:38 AM, pcha...@cs.wisc.edu wrote: Hi, The problem appears in revision 200945 in version 4.9. I attached a one-line patch that fixes it. I also reported this problem at http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57805. In method "write_roots()" in gcc/gengtype.c, the loop on line 4682 should break immediately after "skip_p" is set to "1". All the iterations after "skip_p" set to "1" do not perform any useful work, at best they just set "skip_p" again to "1". Bootstrapped and regression tested on x86_64-unknown-linux-gnu. Installed onto the trunk. jeff
Re: [PATCH] Remove redundant decl of pass_ipa_lto_wpa_fixup
On 07/18/2013 10:03 AM, David Malcolm wrote: pass_ipa_lto_wpa_fixup was removed in r158622: 2010-04-21 Jan Hubicka [...snip...] * passes.c (init_optimization_passes): Remove pass_ipa_lto_wpa_fixup. but that commit left the declaration still present in tree-pass.h This patch removes the redundant decl. Successfully bootstrapped on x86_64-unknown-linux-gnu This is fine. Thanks. OK for trunk? [this one seems obvious to me, but doesn't quite match the letter of the rules in "Free for all" in http://gcc.gnu.org/svnwrite.html , and I'm new here, hence I'm asking out of an abundance of caution :) ] Yea. The steering committee is likely to revamp the wording to make this kind of obvious fix OK in the future. jeff
Re: [PATCH 1/4] Introduce macros when constructing the tree of passes
On 07/17/2013 07:18 PM, David Malcolm wrote: gcc/ * passes.c (init_optimization_passes): Introduce macros for constructing the tree of passes (INSERT_PASSES_AFTER, PUSH_INSERT_PASSES_WITHIN, POP_INSERT_PASSES, TERMINATE_PASS_LIST). --- gcc/passes.c | 108 +++ 1 file changed, 56 insertions(+), 52 deletions(-) diff --git a/gcc/passes.c b/gcc/passes.c index 761f030..6ca4134 100644 --- a/gcc/passes.c +++ b/gcc/passes.c @@ -1282,13 +1282,26 @@ init_optimization_passes (void) { struct opt_pass **p; +#define INSERT_PASSES_AFTER(PASS) \ + p = &(PASS); + +#define PUSH_INSERT_PASSES_WITHIN(PASS) \ + { \ +struct opt_pass **p = &(PASS).pass.sub; + +#define POP_INSERT_PASSES() \ + } + I've never been a fan of having unmatched braces inside macros; though I guess I can live with it particularly since it'll help catch an unmatched push/pop. OK for the trunk. jeff
Re: [PATCH 2/4] Move the construction of the pass hierarchy into a new passes.def file.
On 07/17/2013 07:18 PM, David Malcolm wrote: gcc/ * passes.def: New. * passes.c (init_optimization_passes): Move the construction of the pass hierarchy into a new passes.def file. * Makefile.in (passes.o): Add dependency on passes.def. OK for the trunk. I'm assuming you just cut-n-pasted the bits from passes.c into passes.def without changing the ordering/nesting. If that's not the case, speak up. Jeff
Re: [PATCH 3/4] Introduce NEXT_PASS_NUM macro
On 07/17/2013 07:18 PM, David Malcolm wrote: gcc/ Explicitly number the instances of passes within passes.def. This is needed by a subsequent patch so that we can create fields within the pipeline class for each pass instance (to help locate pass instances when debugging). * passes.c (NEXT_PASS_NUM): Define. * passes.def (NEXT_PASS, NEXT_PASS_NUM): Replace uses of NEXT_PASS on passes that have multiple instances with uses of NEXT_PASS_NUM. So this means we have to track down the instance number if we add a duplicate pass in the pipeline. I can see positives and negatives of doing that. I'll go along as I'm a proponent of the goal of having the passes be a first class object. Ok for the trunk. Jeff
Re: [PATCH 2/4] Move the construction of the pass hierarchy into a new passes.def file.
On Wed, Jul 17, 2013 at 09:18:21PM -0400, David Malcolm wrote: > --- /dev/null > +++ b/gcc/passes.def > @@ -0,0 +1,405 @@ > +/* Description of pass structure > + Copyright (C) 2013 Free Software Foundation, Inc. Shouldn't this be 1987-2013 instead? I mean, it isn't really a new file, the content comes from passes.c whose stuff dates up to 1987. Jakub
Re: [PATCH 4/4] Add contrib/check_passes.py script
On 07/17/2013 07:18 PM, David Malcolm wrote: contrib/ * check_passes.py: New. OK for the trunk. jeff
Re: [PATCH] Fix raw-string handling (PR preprocessor/57620)
Hmm, that logic is difficult to follow. It needs comments at least explaining last_seen_* and why the loop in the suffix handling keeps going after we change the phase to RAW_STR. Maybe instead of tracking last_seen_* BUFF_APPEND could copy into a short local char array as well as the string buffer? Jason
Re: [Patch, microblaze]: Add -fstack-usage support
On 03/18/13 05:48, David Holsgrove wrote: Changelog 2013-03-18 David Holsgrove * gcc/config/microblaze/microblaze.c (microblaze_expand_prologue): Add check for flag_stack_usage to handle -fstack-usage support Signed-off-by: David Holsgrove Applied revision 201035. -- Michael Eagerea...@eagercon.com 1960 Park Blvd., Palo Alto, CA 94306 650-325-8077
Re: Fix GCC bug causing bootstrap failure with vectorizer turned on
On 07/12/2013 04:13 PM, Cong Hou wrote: GCC bootstrap failed with loop vectorizer turned on by default at O2. The symptom is that the comparison between stage2&3 compilers fails. The root cause is a bug in the file "tree-vect-data-refs.c", where a qsort() function call is used to sort a group of data references using a comparison function called "dr_group_sort_cmp()". In this function, the iterative hash values of tree nodes are used for comparisons. For a declaration tree node, its UID participates in the calculation of the hash value. However, a specific declaration may have different UIDs whether the debug information is switched on/off (-gtoggle). In consequence, the results of comparisons may vary in stage 2&3 during bootstrapping. The following patch fixed the bug. Compiler bootstraps and there is no regressions in regression test. Compiler also bootstraps fine when turning on vectorizer by default. Since this patch may produce difference result (but still correct) than before due to the modification to the comparison function, four test cases are adjusted accordingly. OK for trunk? If I understand you correctly, you're claiming that the DECL_UID can vary based on whether or not we're emitting debug info. This can be seen in iterative_hash_expr: default: tclass = TREE_CODE_CLASS (code); if (tclass == tcc_declaration) { /* DECL's have a unique ID */ val = iterative_hash_host_wide_int (DECL_UID (t), val); } What doesn't make sense to me is your compare_tree looks at the UIDs just as well: + +default: + tclass = TREE_CODE_CLASS (code); + + /* For var-decl, we could compare their UIDs. */ + if (tclass == tcc_declaration) + { + if (DECL_UID (t1) != DECL_UID (t2)) +return DECL_UID (t1) < DECL_UID (t2) ? -1 : 1; + break; + } + Why does this work while using iterative_hash_expr fail? Clearly I'm mis-understanding something. jeff
Re: Fix GCC bug causing bootstrap failure with vectorizer turned on
The difference is that the relative order of DECL_UIDs does not change
whether debug info is on or not, but there is no such guarantee when
hashing is involved.

David

On Thu, Jul 18, 2013 at 9:45 AM, Jeff Law wrote:
> On 07/12/2013 04:13 PM, Cong Hou wrote:
>>
>> GCC bootstrap failed with loop vectorizer turned on by default at O2.
>> The symptom is that the comparison between stage2&3 compilers fails.
>> The root cause is a bug in the file "tree-vect-data-refs.c", where a
>> qsort() function call is used to sort a group of data references using
>> a comparison function called "dr_group_sort_cmp()". In this function,
>> the iterative hash values of tree nodes are used for comparisons. For
>> a declaration tree node, its UID participates in the calculation of
>> the hash value. However, a specific declaration may have different
>> UIDs whether the debug information is switched on/off (-gtoggle). In
>> consequence, the results of comparisons may vary in stage 2&3 during
>> bootstrapping.
>>
>> The following patch fixed the bug. Compiler bootstraps and there is no
>> regressions in regression test. Compiler also bootstraps fine when
>> turning on vectorizer by default. Since this patch may produce
>> difference result (but still correct) than before due to the
>> modification to the comparison function, four test cases are adjusted
>> accordingly. OK for trunk?
>
> If I understand you correctly, you're claiming that the DECL_UID can vary
> based on whether or not we're emitting debug info. This can be seen in
> iterative_hash_expr:
>
>
>    default:
>      tclass = TREE_CODE_CLASS (code);
>
>      if (tclass == tcc_declaration)
>        {
>          /* DECL's have a unique ID */
>          val = iterative_hash_host_wide_int (DECL_UID (t), val);
>        }
>
>
> What doesn't make sense to me is your compare_tree looks at the UIDs just as
> well:
>
>> +
>> +default:
>> + tclass = TREE_CODE_CLASS (code);
>> +
>> + /* For var-decl, we could compare their UIDs. */
>> + if (tclass == tcc_declaration)
>> + {
>> + if (DECL_UID (t1) != DECL_UID (t2))
>> +return DECL_UID (t1) < DECL_UID (t2) ? -1 : 1;
>> + break;
>> + }
>> +
>
> Why does this work while using iterative_hash_expr fail?
>
> Clearly I'm mis-understanding something.
>
> jeff
Re: [PATCH] Remove redundant decl of pass_ipa_lto_wpa_fixup
On Thu, 2013-07-18 at 10:11 -0600, Jeff Law wrote: > On 07/18/2013 10:03 AM, David Malcolm wrote: > > pass_ipa_lto_wpa_fixup was removed in r158622: > > > > 2010-04-21 Jan Hubicka > > [...snip...] > > * passes.c (init_optimization_passes): Remove pass_ipa_lto_wpa_fixup. > > > > but that commit left the declaration still present in tree-pass.h > > > > This patch removes the redundant decl. > > > > Successfully bootstrapped on x86_64-unknown-linux-gnu > This is fine. Thanks. Thanks; committed to svn trunk as r201035. > > OK for trunk? [this one seems obvious to me, but doesn't quite match > > the letter of the rules in "Free for all" in > > http://gcc.gnu.org/svnwrite.html , and I'm new here, hence I'm asking > > out of an abundance of caution :) ] > Yea. The steering committee is likely to revamp the wording to make > this kind of obvious fix OK in the future. One other thing that's unclear on that page: are there any recommendations on what the commit message should be? In the example ("Commit the changes to the central repository") you appear to have trimmed the top line containing date and name from the ChangeLog entry, and I've (mostly) emulated that in my commits, but looking at "svn log" there seems to be some variety in what people do. Presumably the log message should contain the ChangeLog fragment(s), with multiple ChangeLogs indicated by path. Is it OK to have extra info? When I've been using git in my own local branches I've preferred to also put a one-line summary at the top of the commit above the ChangeLog fragment(s), since it makes for more readable entries in the git log. For more complicated changes, I also like to place some higher-level information about the change near the top of the logs (though I wouldn't want to impose that requirement on other devs). Perhaps also a URL to relevant discussions on the gcc-patches archive would also be appropriate? (and again optional, to avoid adding to the red tape). 
[to repeat my rant from Cauldron, you wouldn't write code comments like
this:

  /* Double x. */
  x *= 2;

but if it warrants a comment, you'd have something like:

  /* Increase the buffer size, whilst avoiding O(n^2) copying costs
     on repeated growth. */
  x *= 2;

or somesuch - comments should describe the *intent* of a change, rather
than merely an English description. Why do GNU ChangeLogs seem to favor
the latter approach?]

(Sorry if the above turned into a rant again)

Dave
Re: [PATCH] Remove redundant decl of pass_ipa_lto_wpa_fixup
On 07/18/2013 10:53 AM, David Malcolm wrote:
> On Thu, 2013-07-18 at 10:11 -0600, Jeff Law wrote:
>
> In the example ("Commit the changes to the central repository") you
> appear to have trimmed the top line containing date and name from the
> ChangeLog entry, and I've (mostly) emulated that in my commits, but
> looking at "svn log" there seems to be some variety in what people do.
> Presumably the log message should contain the ChangeLog fragment(s),
> with multiple ChangeLogs indicated by path.

I think most folks just use their ChangeLog entries as-is. I suspect
that if we move forward with some kind of "extract ChangeLogs from the
repository" that we'll need to formalize this a bit better. That's
certainly been the case for other projects that have dropped manual
ChangeLogs in favor of extracting them from the repository.

> or somesuch - comments should describe the *intent* of change, rather
> than merely an English description. Why do GNU ChangeLogs seems to
> favor the latter approach?]

It's always been the policy that code comments should carry the intent
of the change while the ChangeLog just notes what changed.

As Jim W. mentioned in the steering committee bof, there was a
requirement that a log of changes be kept. In the early days, GCC didn't
use any version control system and the ChangeLog was the only way to
track what had changed.

Jeff
Re: [PATCH] PR57878, Incorrect code: live register clobbered in split2
On 07/15/2013 02:26 PM, Wei Mi wrote:
> Hi,
>
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57878
>
> The bug occurs because tfreq is given higher priority than bigger mode
> in reload_pseudo_compare_func. When there are multiple reload pseudos
> in the same insn, and the pseudo with bigger mode has lower thread
> frequency than other reload pseudos, it is possible the bigger mode
> pseudo cannot find available hardregs.
>
> The proposed fix is to switch the priority of bigger mode and tfreq in
> reload_pseudo_compare_func. Besides I promoted lra_assert to
> gcc_assert at the end of to make adding testcase easier since
> lra_assert will not fire on 4.8.1.
>
> bootstrap and regression test are ok on x86_64-linux-gnu. Is it ok for
> trunk and 4.8 branch?
>
> Thanks,
> Wei.
>
> 2013-07-15 Wei Mi
>
> PR rtl-optimization/57878
> * lra-assigns.c (reload_pseudo_compare_func): Switch the priority of
> bigger mode and tfreq.
>
> 2013-07-15 Wei Mi
>
> PR rtl-optimization/57878
> * g++.dg/pr57518.C: New test.

Overall, the PR analysis and the patch are ok. The only problem is that
the patch can affect generated code performance, especially for 32-bit
targets. I see regressions on 4 SPECInt2000 tests.

Here is the patch I've committed into the trunk. It decreases the
patch's effect on performance.

The patch was successfully bootstrapped and tested on x86/x86-64.

Committed to the trunk as rev. 201036.

Wei Mi, could you commit it to the gcc4.8 branch? Thanks.

2013-07-18 Vladimir Makarov
Wei Mi

PR rtl-optimization/57878
* lra-assigns.c (assign_by_spills): Move non_reload_pseudos to the
top.
(reload_pseudo_compare_func): Check nregs first for reload
pseudos.

2013-07-18 Wei Mi

PR rtl-optimization/57878
* g++.dg/pr57518.C: New test.

Index: lra-assigns.c
===
--- lra-assigns.c (revision 200959)
+++ lra-assigns.c (working copy)
@@ -116,6 +116,11 @@ struct regno_assign_info
 /* Map regno to the corresponding regno assignment info.
*/ static struct regno_assign_info *regno_assign_info; +/* All inherited, subreg or optional pseudos created before last spill + sub-pass. Such pseudos are permitted to get memory instead of hard + regs. */ +static bitmap_head non_reload_pseudos; + /* Process a pseudo copy with execution frequency COPY_FREQ connecting REGNO1 and REGNO2 to form threads. */ static void @@ -194,6 +199,15 @@ reload_pseudo_compare_func (const void * if ((diff = (ira_class_hard_regs_num[cl1] - ira_class_hard_regs_num[cl2])) != 0) return diff; + if ((diff + = (ira_reg_class_max_nregs[cl2][lra_reg_info[r2].biggest_mode] + - ira_reg_class_max_nregs[cl1][lra_reg_info[r1].biggest_mode])) != 0 + /* The code below executes rarely as nregs == 1 in most cases. + So we should not worry about using faster data structures to + check reload pseudos. */ + && ! bitmap_bit_p (&non_reload_pseudos, r1) + && ! bitmap_bit_p (&non_reload_pseudos, r2)) +return diff; if ((diff = (regno_assign_info[regno_assign_info[r2].first].freq - regno_assign_info[regno_assign_info[r1].first].freq)) != 0) return diff; @@ -1155,7 +1169,6 @@ assign_by_spills (void) rtx insn; basic_block bb; bitmap_head changed_insns, do_not_assign_nonreload_pseudos; - bitmap_head non_reload_pseudos; unsigned int u; bitmap_iterator bi; bool reload_p; Index: testsuite/g++.dg/pr57878.C === --- testsuite/g++.dg/pr57878.C (revision 0) +++ testsuite/g++.dg/pr57878.C (working copy) @@ -0,0 +1,226 @@ +/* { dg-do compile { target i?86-*-* x86_64-*-* } } */ +/* { dg-options "-m32 -O2 -fno-omit-frame-pointer -fPIC -std=gnu++11" } */ + +typedef int int32; +typedef long long int64; +typedef unsigned int uint32; +typedef unsigned long long uint64; +namespace std { + typedef unsigned int size_t; + template + struct char_traits; + template + inline _Tp* __addressof(_Tp& __r) noexcept { +return reinterpret_cast<_Tp*> (&const_cast(reinterpret_cast(__r))); + } + template + struct remove_reference { +typedef _Tp type; + }; + template + constexpr _Tp&& 
forward(typename std::remove_reference<_Tp>::type& __t) noexcept { +return static_cast<_Tp&&>(__t); + } +} +typedef unsigned int size_t; +extern "C++" { + inline void* operator new(std::size_t, void* __p) noexcept { +return __p; + } +} +namespace __gnu_cxx __attribute__ ((__visibility__ ("default"))) { + template +class new_allocator { + public: +typedef size_t size_type; +typedef _Tp* pointer; + }; +} +namespace std { + template + using __allocator_base = __gnu_cxx::new_allocator<_Tp>; + template + class allocator +: public __allocator_base<_Tp> { + public: +typedef size_t size_type; +template +struct rebind { + typedef allocator<_Tp1> oth
Re: Fix GCC bug causing bootstrap failure with vectorizer turned on
On Thu, Jul 18, 2013 at 10:45:19AM -0600, Jeff Law wrote:
> If I understand you correctly, you're claiming that the DECL_UID can
> vary based on whether or not we're emitting debug info. This can be
> seen in iterative_hash_expr:

Yes. For e.g. SSA_NAME_VERSION we require that it is the same between -g -fvar-tracking-assignments and -g0; for DECL_UIDs only the relative ordering must be preserved. That is, with VTA the gaps between the DECL_UIDs of decls seen in both -g0 and -g code can be bigger. So for decls a, c, b, -g0 might use UIDs 12, 13, 14, while -g might use 12, 27, 39. Thus hashing the DECL_UID could lead to -fcompare-debug failures if, for example, the hash table is later traversed.

Jakub
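A rough illustration of the point above, as a sketch rather than GCC code: the toy table below uses the UID itself as the hash, the way hashing DECL_UID would. The table size, the open-addressing scheme and the name traversal_order are assumptions made for this example; only the UID values 12/13/14 and 12/27/39 for decls a, c, b come from the message.

```c
#include <string.h>

#define NBUCKETS 8   /* assumed table size, chosen for the example */
#define NDECLS 3

/* Insert NDECLS decls into a tiny open-addressing table keyed on their
   UIDs, then walk the table in bucket order and record the decl names
   in the order a traversal would visit them.  */
static void
traversal_order (const unsigned uids[NDECLS], const char names[NDECLS],
                 char out[NDECLS + 1])
{
  char table[NBUCKETS];
  memset (table, 0, sizeof table);

  for (int i = 0; i < NDECLS; i++)
    {
      unsigned slot = uids[i] % NBUCKETS;   /* the hash is the UID itself */
      while (table[slot] != 0)              /* linear probing on collision */
        slot = (slot + 1) % NBUCKETS;
      table[slot] = names[i];
    }

  int n = 0;
  for (int slot = 0; slot < NBUCKETS; slot++)
    if (table[slot] != 0)
      out[n++] = table[slot];
  out[n] = '\0';
}
```

Calling traversal_order with UIDs {12, 13, 14} for decls a, c, b visits them as a, c, b; with {12, 27, 39} the walk visits c, a, b, although sorting by UID gives the same order in both cases. Anything whose output depends on the walk order would then differ between the two compilations.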
Re: [PATCH] Add a new option "-ftree-bitfield-merge" (patch / doc inside)
> * dwarf2out.c (field_type): static removed from declaration.
> (simple_type_size_in_bits): static removed from declaration.
> (field_byte_offset): static removed from declaration.
> (field_type): static inline removed from declaration.

If you're going to use these declarations from tree-ssa-bitfield-merge.c, it would be better to move the declarations into dwarf2out.h, and include that file from tree-ssa-bitfield-merge.c.

Even better would be to move these routines (which today are in dwarf2out.c simply because that was the only file that needed them) to a more appropriate location. I'd suggest tree.h/tree.c, but tree.c is already way too big -- does someone have a better suggestion?

-cary
Re: [patch,cilk-plus,testsuite] Skip int16 and size16 targets (too much FAILs)
On Jul 18, 2013, at 6:48 AM, Georg-Johann Lay wrote:
> running the cilk-plus.exp tests I get ~200 FAILs because the tests are not
> written for 16-bit int or size_t platforms.
>
> As a quick tentative fix, the cilk-plus tests are skipped on such platforms.

I think this patch is fine. Until such time as cilk can do better, I think this is a fine way to address the issue. The other possible way is to not spin up any support for cilk on such a platform in the first place. noconfigdir is one common way to do this.
Re: [PATCH 1/4] Introduce macros when constructing the tree of passes
On Thu, 2013-07-18 at 10:18 -0600, Jeff Law wrote:
> On 07/17/2013 07:18 PM, David Malcolm wrote:
> > gcc/
> > * passes.c (init_optimization_passes): Introduce macros for
> > constructing the tree of passes (INSERT_PASSES_AFTER,
> > PUSH_INSERT_PASSES_WITHIN, POP_INSERT_PASSES,
> > TERMINATE_PASS_LIST).
> > ---
> > gcc/passes.c | 108 +++
> > 1 file changed, 56 insertions(+), 52 deletions(-)
> >
> > diff --git a/gcc/passes.c b/gcc/passes.c
> > index 761f030..6ca4134 100644
> > --- a/gcc/passes.c
> > +++ b/gcc/passes.c
> > @@ -1282,13 +1282,26 @@ init_optimization_passes (void)
> > {
> > struct opt_pass **p;
> >
> > +#define INSERT_PASSES_AFTER(PASS) \
> > + p = &(PASS);
> > +
> > +#define PUSH_INSERT_PASSES_WITHIN(PASS) \
> > + { \
> > +struct opt_pass **p = &(PASS).pass.sub;
> > +
> > +#define POP_INSERT_PASSES() \
> > + }
> > +
> I've never been a fan of having unmatched braces inside macros; though I
> guess I can live with it particularly since it'll help catch an
> unmatched push/pop.
>
>
> OK for the trunk.

Thanks; committed to svn trunk as r201037.
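To make the brace trick discussed above concrete, here is a minimal standalone sketch. It is not the GCC code: struct pass, append_pass, the pass objects and build_demo_tree are hypothetical stand-ins for struct opt_pass, next_pass_1 and the real pass list, and the macros are simplified accordingly. The part taken from the patch is the idea that the PUSH macro opens a block declaring a new `p` that shadows the outer one, and the POP macro closes it.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for struct opt_pass.  */
struct pass
{
  const char *name;
  struct pass *next;   /* next sibling */
  struct pass *sub;    /* first child */
};

/* Store PASS at *LINK and return the link where the next sibling goes.  */
static struct pass **
append_pass (struct pass **link, struct pass *pass)
{
  *link = pass;
  return &pass->next;
}

#define INSERT_PASSES_AFTER(LIST) \
  p = &(LIST);

/* Opens a block in which a new `p' shadows the outer one...  */
#define PUSH_INSERT_PASSES_WITHIN(PASS) \
  { \
    struct pass **p = &(PASS).sub;

/* ...and this closes it, restoring the outer `p'.  */
#define POP_INSERT_PASSES() \
  }

#define NEXT_PASS(PASS) (p = append_pass (p, &(PASS)))

#define TERMINATE_PASS_LIST() \
  *p = NULL;

static struct pass pass_outer = { "outer", NULL, NULL };
static struct pass pass_inner1 = { "inner1", NULL, NULL };
static struct pass pass_inner2 = { "inner2", NULL, NULL };
static struct pass pass_last = { "last", NULL, NULL };

/* Build: outer { inner1, inner2 }, last.  */
static struct pass *
build_demo_tree (void)
{
  struct pass *all = NULL;
  struct pass **p;

  INSERT_PASSES_AFTER (all)
  NEXT_PASS (pass_outer);
  PUSH_INSERT_PASSES_WITHIN (pass_outer)
    NEXT_PASS (pass_inner1);
    NEXT_PASS (pass_inner2);
    TERMINATE_PASS_LIST ()
  POP_INSERT_PASSES ()
  NEXT_PASS (pass_last);
  TERMINATE_PASS_LIST ()

  return all;
}
```

Because the braces only balance when every push has a pop, dropping POP_INSERT_PASSES () or adding a stray one leaves the function body unbalanced and the compiler rejects it, which is the "catch an unmatched push/pop" benefit mentioned in the review.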
Re: [PATCH, rs6000] Fix flag interaction of new Power8 flags
On Thu, Jul 18, 2013 at 10:58 AM, Pat Haugen wrote:
> The following patch fixes a testsuite failure due to the fact that
> -mcpu=power8 was turning on the new flags such as power8-vector, which would
> then result in the VSX flag being turned back on after it was previously
> turned off due to a conflicting option such as -msoft-float.
>
> Bootstrap/regtest with no new regressions, ok for trunk?
>
> -Pat
>
>
> 2013-07-18 Pat Haugen
>
> * config/rs6000/rs6000.c (rs6000_option_override_internal): Adjust flag
> interaction for new Power8 flags and VSX.

Okay.

Thanks.
- David
RE: [patch,cilk-plus,testsuite] Skip int16 and size16 targets (too much FAILs)
> -----Original Message-----
> From: Mike Stump [mailto:mikest...@comcast.net]
> Sent: Thursday, July 18, 2013 2:14 PM
> To: Georg-Johann Lay
> Cc: gcc-patches@gcc.gnu.org; Iyer, Balaji V
> Subject: Re: [patch,cilk-plus,testsuite] Skip int16 and size16 targets (too much FAILs)
>
> On Jul 18, 2013, at 6:48 AM, Georg-Johann Lay wrote:
> > running the cilk-plus.exp tests I get ~200 FAILs because the tests are
> > not written for 16-bit int or size_t platforms.
> >
> > As a quick tentative fix, the cilk-plus tests are skipped on such platforms.
>
> I think this patch is fine. Until such time as cilk can do better, I think this is a
> fine way to address the issue. The other possible way is to not spin up any
> support for cilk on such a platform in the first place. noconfigdir is one common
> way to do this.

Hi Mike,
The changes we have committed thus far do not require the cilk runtime.

Thanks,

Balaji V. Iyer.
Re: [PATCH 2/4] Move the construction of the pass hierarchy into a new passes.def file.
On Thu, 2013-07-18 at 10:21 -0600, Jeff Law wrote:
> On 07/17/2013 07:18 PM, David Malcolm wrote:
> > gcc/
> > * passes.def: New.
> >
> > * passes.c (init_optimization_passes): Move the construction of
> > the pass hierarchy into a new passes.def file.
> >
> > * Makefile.in (passes.o): Add dependency on passes.def.
> OK for the trunk.
>
> I'm assuming you just cut-n-pasted the bits from passes.c into
> passes.def without changing the ordering/nesting. If that's not the
> case, speak up.

Yes; sorry for not spelling that out more clearly.

I double-checked that passes.def was indeed a direct copy of the appropriate fragment of passes.c, fixed up the copyright header as per Jakub's email, and smoke-tested that it still compiles after the copyright header edit.

I've committed the result to svn trunk as r201038.

(I'll hold off on the rest for now, as I only have a few hours of connectivity before going offline for a few days.)

Thanks
Dave
[PATCH, testsuite committed] Fix gcc.target/powerpc/pr57744.c
Committed the following as obvious.

2013-07-18 Pat Haugen

* gcc.target/powerpc/pr57744.c: Fix typo.

Index: gcc/testsuite/gcc.target/powerpc/pr57744.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/pr57744.c (revision 200903)
+++ gcc/testsuite/gcc.target/powerpc/pr57744.c (working copy)
@@ -31,7 +31,7 @@ volatile int do_test = 0;
 int main (void)
 {
 if (do_test && !libat_compare_exchange_16 (&a, &b, c, 0, 0))
-aborrt ();
+abort ();
 return 0;
 }
Re: [PATCH] Add a new option "-ftree-bitfield-merge" (patch / doc inside)
On Wed, 17 Jul 2013, Zoran Jovanovic wrote:
> Hello,
> This patch adds new optimization pass that combines several adjacent bit
> field accesses that copy values from one memory location to another into
> single bit field access.
>
> Example:
>
> Original code:
> D.1351;
> D.1350;
> D.1349;
> D.1349_2 = p1_1(D)->f1;
> p2_3(D)->f1 = D.1349_2;
> D.1350_4 = p1_1(D)->f2;
> p2_3(D)->f2 = D.1350_4;
> D.1351_5 = p1_1(D)->f3;
> p2_3(D)->f3 = D.1351_5;
>
> Optimized code:
> D.1358;
> D.1358_10 = BIT_FIELD_REF <*p1_1(D), 19, 13>;
> BIT_FIELD_REF <*p2_3(D), 19, 13> = D.1358_10;
>
> Algorithm works on basic block level and consists of following 3 major steps:
> 1. Go through basic block statements list. If there are statement pairs that
> implement copy of bit field content from one memory location to another
> record statements pointers and other necessary data in corresponding data
> structure.
> 2. Identify records that represent adjacent bit field accesses and mark them
> as merged.

I see no one else asked, so: how are volatile bitfields handled, or accesses to bitfields in structures declared volatile? (A quick grep found no match for "volatil" in the patch; maybe they're rejected by other means, but better double-check. This pass must not widen, narrow or combine such accesses, and this may or may not depend on the language standard.)

> 3. Modify trees accordingly.

brgds, H-P
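The transformation quoted above can be modelled outside the compiler. The following is only a sketch under stated assumptions: the storage unit is modelled as a plain uint32_t, and the individual field widths (5, 6 and 8 bits) are invented so that three adjacent fields cover the 19 bits starting at bit 13 that appear in the BIT_FIELD_REF example. The function names are hypothetical as well.

```c
#include <stdint.h>

/* Assumed layout: three adjacent fields covering bits 13..31 of a
   32-bit word (5 + 6 + 8 = 19 bits, starting at bit 13).  */
#define F1_SHIFT 13
#define F1_WIDTH 5
#define F2_SHIFT 18
#define F2_WIDTH 6
#define F3_SHIFT 24
#define F3_WIDTH 8

/* Mask selecting WIDTH bits starting at bit SHIFT (WIDTH < 32).  */
static uint32_t
field_mask (unsigned shift, unsigned width)
{
  return ((UINT32_C (1) << width) - 1) << shift;
}

/* Copy one field from SRC into DST, leaving all other bits of DST alone.  */
static uint32_t
copy_field (uint32_t dst, uint32_t src, unsigned shift, unsigned width)
{
  uint32_t m = field_mask (shift, width);
  return (dst & ~m) | (src & m);
}

/* What the source code asks for: three separate field copies.  */
static uint32_t
copy_fieldwise (uint32_t dst, uint32_t src)
{
  dst = copy_field (dst, src, F1_SHIFT, F1_WIDTH);
  dst = copy_field (dst, src, F2_SHIFT, F2_WIDTH);
  dst = copy_field (dst, src, F3_SHIFT, F3_WIDTH);
  return dst;
}

/* What the merged form does: one copy of the combined 19-bit range.  */
static uint32_t
copy_merged (uint32_t dst, uint32_t src)
{
  return copy_field (dst, src, F1_SHIFT, F1_WIDTH + F2_WIDTH + F3_WIDTH);
}
```

For ordinary memory the two functions give the same result, which is what makes the merge valid. The volatile question is about the number and width of the accesses rather than the final value: the field-wise form corresponds to three reads and three writes, the merged form to one of each, so for volatile objects the two are observably different.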
Re: [ubsan] Add libcall arguments
On 07/05/2013 10:04 AM, Marek Polacek wrote:
> +/* This type represents an entry in the hash table. */

Please describe the hash table more up here. What are you tracking?

> + hashval_t h = iterative_hash_object (data->type, 0);
> + h = iterative_hash_object (data->decl, h);

If you hash the decl as well as the type, the find_slot in ubsan_type_descriptor will almost never find an existing entry.

> +uptr_type (void)
> +{
> + return build_nonstandard_integer_type (POINTER_SIZE, 1);

Why not use uintptr_type_node?

> I have yet to handle freeing the hash table, but I think I'll need the GTY machinery for this (ubsan is not a pass, so I can't just call it at the end of the pass). Or maybe just create a destructor and use append_to_statement_list.

That won't work; append_to_statement_list is for things that happen at runtime, but freeing the hash table is something that needs to happen in the compiler.

> +/* This routine returns a magic number for TYPE.
> + ??? This is probably too ugly. Tweak it. */
> +
> +static unsigned short
> +get_tinfo_for_type (tree type)

Why map from size to some magic number rather than use the size directly? Also, "tinfo" sounds to me like something to do with C++ type_info.

Jason
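The comment about hashing the decl can be illustrated with a small model. This is a sketch, not the ubsan code: the table, the integer "type" and "decl" ids, the multiplicative hash and every name below are assumptions made for the example. It only shows why a key that includes the decl defeats reuse of a per-type descriptor.

```c
#include <stdbool.h>

#define NSLOTS 32   /* assumed capacity, ample for the example */

struct desc_entry
{
  bool used;
  unsigned type_id;   /* stands in for the type */
  unsigned decl_id;   /* stands in for the decl */
};

struct desc_table
{
  struct desc_entry slots[NSLOTS];
  bool key_includes_decl;   /* true: hash and compare the decl as well */
  unsigned entries;         /* number of descriptors created */
};

static unsigned
entry_hash (const struct desc_table *t, unsigned type_id, unsigned decl_id)
{
  unsigned h = type_id * 2654435761u;
  if (t->key_includes_decl)
    h = (h ^ decl_id) * 2654435761u;
  return h;
}

static bool
entry_equal (const struct desc_table *t, const struct desc_entry *e,
             unsigned type_id, unsigned decl_id)
{
  if (e->type_id != type_id)
    return false;
  return !t->key_includes_decl || e->decl_id == decl_id;
}

/* Return true if an existing descriptor was found and reused, false if
   a new one had to be created.  */
static bool
lookup_or_insert (struct desc_table *t, unsigned type_id, unsigned decl_id)
{
  unsigned slot = entry_hash (t, type_id, decl_id) % NSLOTS;

  for (unsigned probes = 0; probes < NSLOTS; probes++)
    {
      struct desc_entry *e = &t->slots[slot];
      if (!e->used)
        {
          e->used = true;
          e->type_id = type_id;
          e->decl_id = decl_id;
          t->entries++;
          return false;
        }
      if (entry_equal (t, e, type_id, decl_id))
        return true;
      slot = (slot + 1) % NSLOTS;   /* linear probing */
    }
  return false;   /* table full; not expected in this sketch */
}
```

Looking up the same type three times with three different decls reuses one entry when the key is the type alone, and creates three entries when the decl is part of the key.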
Re: [ubsan] Add testsuite
On Tue, Jul 16, 2013 at 08:04:14AM +0200, Bernhard Reutner-Fischer wrote:
> +# ubsan_finish -- called at the start of each subdir of tests
>
> s/the start/the end/

You're attentive, will fix. Thanks,

Marek
Re: [PATCH] MIPS: IEEE 754-2008 features support
On Wed, 17 Jul 2013, Richard Sandiford wrote:
> "Maciej W. Rozycki" writes:
> >> The patch mostly looks good apart from that, but please use a single
> >> enum for the 2008/legacy thing, both in mips.h and mips.opt.
> >
> > Also mips.c and mips.md (and last but not least mips-opts.h). Done.
>
> I wasn't trying to list all the places that use the C enum.
> I said mips.opt because I thought we should be able to use a single
> mips.opt Enum too, a bit like we share mips_arch_opt_value for both
> -march= and -mtune=.

Doh, I *knew* it was doable somehow. Fixed now. Thanks for persistence. Please check if the adjusted wording of the help messages is fine with you.

> > Please also note that the writability of the individual new (HAS2008)
> > FCSR bits is optional e.g. a conforming processor may have NAN2008
> > hardwired to 1 and ABS2008 hardwired to 0 (or likewise with NAN2008
> > writable).
>
> OK, I'd missed that this was allowed, sorry. It just seems really
> unfortunate...

Well, I'm not really sure what the Big Plan here is. It looks to me like the non-arithmetic ABS/NEG feature is a really good thing, while the 2008 NaN encoding has its shortcomings, e.g. unlike with the legacy encoding there's no single sNaN bit pattern to preset FPRs or variables with to catch uninitialised use that would work across all the floating-point formats (S, D and PS). So it seems to me like there's no single superior setting we could make the default for a group option.

> >> > +# Return 1 if this is a MIPS target supporting -mnan=.
> >> > +# Old versions of binutils may not support this option.
> >> > +
> >> > +proc check_effective_target_mips_nan { } {
> >> > +if { ![istarget mips*-*-*] } {
> >> > +return 0
> >> > +}
> >> > +return [check_no_compiler_messages mips_nan object {
> >> > +int dummy;
> >> > +} "-mnan=2008"]
> >> > +}
> >>
> >> The tests you added are dg-do compile tests, which stop after assembly
> >> generation, so this guard shouldn't be needed.
> > It is needed in case the compiler was built without support for this
> > option i.e. configured against old binutils. I verified that it indeed
> > triggered in this case:
> >
> > Executing on host: mips-linux-gnu-gcc -fno-diagnostics-show-caret -fdiagnostics-color=never -mnan=2008 -c -o mips_nan21474.o mips_nan21474.c (timeout = 300)
> > mips-linux-gnu-gcc: error: unrecognized command line option '-mnan=2008'
> > compiler exited with status 1
> > output is:
> > mips-linux-gnu-gcc: error: unrecognized command line option '-mnan=2008'
>
> That seems like bad practice though. In other cases we leave the
> assembler to report options that it doesn't understand. I think that's
> better because it's then clearer that the assembler needs to be upgraded.
>
> Within config/mips, the configure test should just control whether it's
> safe to use .nan when no -mnan option has been given. (This is what we
> did for -mmicromips vs ".set nomicromips" FWIW.)

I can see your point and acknowledge the preexisting practice, but I don't feel particularly convinced, especially in this case where we have a feature that's never going to raise the user's attention when misconfigured, but it also applies to the microMIPS case. The two issues I see with it are:

1. Conceptually I see the toolchain as a whole and I don't see a value in GCC producing known-unsupported assembly and relying on the assembler (or the linker if applicable) to complain. I agree pointing at the other tool being incapable or obsolete is a useful practice, but I also think a clear message from GCC itself would be more appropriate (e.g. "`-mfoo' unsupported, please reconfigure against current binutils").

2. Technically I think we have an actual problem here, e.g. in the example you referred to we have a situation where GCC supports microMIPS compilation in all cases, however non-microMIPS code is different depending on whether the compiler has been configured against modern or obsolete binutils.
Now the latter case may prompt someone to upgrade binutils, but there is nothing to prompt that person to reconfigure GCC afterwards. As a result we have two cases of a toolchain comprised of the same versions of both GCC and binutils, but depending on the "history" of the GCC binaries code produced will be different. I think this is subtler and riskier than just rejecting the relevant compiler option outright. Applying this to the 2008-NaN case the compiler would have to refrain from producing the .nan directive in the legacy case if built against old binutils but would produce the directive regardless in the 2008 case. I don't feel safe with such an arrangement. 2013-07-18 Maciej W. Rozycki gcc/ * config/mips/linux.h (GLIBC_DYNAMIC_LINKER): Handle `-mnan=2008'. (UCLIBC_DYNAMIC_LINKER): New macro. * config/mips/linux64.h (GLIBC_DYNAMIC
Re: [Patch, microblaze]: Add -fstack-usage support
On 19 July 2013 02:42, Michael Eager wrote:
> On 03/18/13 05:48, David Holsgrove wrote:
>>
>> Changelog
>>
>> 2013-03-18 David Holsgrove
>>
>> * gcc/config/microblaze/microblaze.c (microblaze_expand_prologue):
>> Add check for flag_stack_usage to handle -fstack-usage support
>>
>> Signed-off-by: David Holsgrove
>
>
> Applied revision 201035.

Thanks Michael - did this get applied to trunk? I can't see the commit upstream.

regards,
David

>
> --
> Michael Eager ea...@eagercon.com
> 1960 Park Blvd., Palo Alto, CA 94306 650-325-8077
Re: [Patch, microblaze]: Add -fstack-usage support
On 07/18/13 16:25, David Holsgrove wrote:
> On 19 July 2013 02:42, Michael Eager wrote:
>> On 03/18/13 05:48, David Holsgrove wrote:
>>> Changelog
>>>
>>> 2013-03-18 David Holsgrove
>>>
>>> * gcc/config/microblaze/microblaze.c (microblaze_expand_prologue):
>>> Add check for flag_stack_usage to handle -fstack-usage support
>>>
>>> Signed-off-by: David Holsgrove
>>
>> Applied revision 201035.
>
> Thanks Michael - did this get applied to trunk? I can't see the commit upstream.

Not sure what happened before, but it did not get committed.

Committed revision 201042.

--
Michael Eager ea...@eagercon.com
1960 Park Blvd., Palo Alto, CA 94306 650-325-8077
Re: [PATCH] [libitm] Add --enable-werror.
On Mon, 8 Jul 2013 20:20:01 -0600 Ryan Hill wrote:

Ping.

> On Mon, 1 Jul 2013 14:56:01 -0600
> Ryan Hill wrote:
>
> Ping.
> http://gcc.gnu.org/ml/gcc-patches/2013-07/msg00033.html
>
>
> > libitm is currently unconditionally built with -Werror. This patch adds
> > --enable-werror to control it (enabled by default). Bootstrapped and tested
> > on x86_64, and inspected build logs to ensure it was doing what it should.
> >
> > I'm assuming copyright assignment isn't necessary for a small change like
> > this. I will also need someone to check this in for me please.
> >
> >
> > gcc/libitm/
> > 2013-06-30 Ryan Hill
> >
> > * configure.ac: Add --enable-werror.
> > (XCFLAGS): Use it.
> > * configure: Regenerate.
> >
> > ---
> > libitm/configure.ac | 10 --
> > 1 file changed, 8 insertions(+), 2 deletions(-)
> >
> > diff --git a/libitm/configure.ac b/libitm/configure.ac
> > index ff41266..5a9400d 100644
> > --- a/libitm/configure.ac
> > +++ b/libitm/configure.ac
> > @@ -252,9 +252,15 @@ GCC_CHECK_ELF_STYLE_WEAKREF
> > CFLAGS="$save_CFLAGS"
> > AC_CACHE_SAVE
> >
> > -# Add -Wall -Werror if we are using GCC.
> > +AC_ARG_ENABLE(werror, [AS_HELP_STRING([--enable-werror],
> > + [turns on -Werror @<:@default=yes@:>@])])
> > +# Add -Wall if we are using GCC.
> > if test "x$GCC" = "xyes"; then
> > - XCFLAGS="$XCFLAGS -Wall -Werror"
> > + XCFLAGS="$XCFLAGS -Wall"
> > + # Add -Werror if requested.
> > + if test "x$enable_werror" != "xno"; then
> > +XCFLAGS="$XCFLAGS -Werror"
> > + fi
> > fi
> >
> > XCFLAGS="$XCFLAGS $XPCFLAGS"
> >
>

--
Ryan Hill psn: dirtyepic_sk
gcc-porting/toolchain/wxwidgets @ gentoo.org

47C3 6D62 4864 0E49 8E9E 7F92 ED38 BD49 957A 8463
Re: [PATCH] [libatomic] Add --enable-werror.
On Mon, 8 Jul 2013 20:19:24 -0600 Ryan Hill wrote:

Ping.

> On Mon, 1 Jul 2013 14:55:35 -0600
> Ryan Hill wrote:
>
> Ping.
> http://gcc.gnu.org/ml/gcc-patches/2013-07/msg00032.html
>
> > libatomic is currently unconditionally built with -Werror. This patch adds
> > --enable-werror to control it (enabled by default). Bootstrapped and tested
> > on x86_64, and inspected build logs to ensure it was doing what it should.
> >
> > I'm assuming copyright assignment isn't necessary for a small change like
> > this. I will also need someone to check this in for me please.
> >
> > gcc/libatomic/
> > 2013-06-30 Ryan Hill
> >
> > * configure.ac: Add --enable-werror.
> > (XCFLAGS): Use it.
> > * configure: Regenerate.
> >
> > ---
> > libatomic/configure.ac | 10 --
> > 1 file changed, 8 insertions(+), 2 deletions(-)
> >
> > diff --git a/libatomic/configure.ac b/libatomic/configure.ac
> > index 0dc4a98..4020d23 100644
> > --- a/libatomic/configure.ac
> > +++ b/libatomic/configure.ac
> > @@ -226,9 +226,15 @@ LIBAT_ENABLE_SYMVERS
> > CFLAGS="$save_CFLAGS"
> > AC_CACHE_SAVE
> >
> > -# Add -Wall -Werror if we are using GCC.
> > +AC_ARG_ENABLE(werror, [AS_HELP_STRING([--enable-werror],
> > + [turns on -Werror @<:@default=yes@:>@])])
> > +# Add -Wall if we are using GCC.
> > if test "x$GCC" = "xyes"; then
> > - XCFLAGS="$XCFLAGS -Wall -Werror"
> > + XCFLAGS="$XCFLAGS -Wall"
> > + # Add -Werror if requested.
> > + if test "x$enable_werror" != "xno"; then
> > +XCFLAGS="$XCFLAGS -Werror"
> > + fi
> > fi
> >
> > XCFLAGS="$XCFLAGS $XPCFLAGS"

--
Ryan Hill psn: dirtyepic_sk
gcc-porting/toolchain/wxwidgets @ gentoo.org

47C3 6D62 4864 0E49 8E9E 7F92 ED38 BD49 957A 8463
Re: [RFC] Parallel build broken on trunk.
On Wed, 21 Nov 2012 13:15:34 + Marcus Shawcroft wrote:
> Thanks for looking at this Laurynas.
>
> I've committed the attached to trunk.
>
>
> /Marcus
>
> 2012-11-21 Marcus Shawcroft
>
> * Makefile.in (gengtype-lex.o): Add dependency on $(BCONFIG_H).

This also affects 4.7. Can we get a backport please?

--
Ryan Hill psn: dirtyepic_sk
gcc-porting/toolchain/wxwidgets @ gentoo.org

47C3 6D62 4864 0E49 8E9E 7F92 ED38 BD49 957A 8463