Re: [PATCH] Fix for PR64353
On January 14, 2015 5:23:21 PM CET, Ilya Enkovich wrote:
>On 14 Jan 15:35, Richard Biener wrote:
>> On Wed, Jan 14, 2015 at 3:28 PM, Ilya Enkovich wrote:
>> > Hi,
>> >
>> > SRA gimple passes may add loads to functions with no SSA update.
>> > Later it causes ICE when function with not updated SSA is processed by
>> > gimple passes.  This patch fixes it by calling update_ssa.
>> >
>> > Bootstrapped and checked on x86_64-unknown-linux-gnu.  OK for trunk?
>>
>> No.  I have removed this quadratic update-ssa call previously.  It should
>> simply keep SSA form up-to-date manually (see how it does gimple_set_vuse
>> in some cases, probably not all required ones?).
>>
>
>Would it be OK to call update_ssa only in case we don't have a proper
>VUSE for call?

No, and most definitely not here.

>Are we allowed to just emit error due to incorrect attribute?

No, I don't think so either.  But we may drop it.

Richard.

>
>diff --git a/gcc/ipa-prop.c b/gcc/ipa-prop.c
>index 01f4111..4ce7822 100644
>--- a/gcc/ipa-prop.c
>+++ b/gcc/ipa-prop.c
>@@ -4054,6 +4054,11 @@ ipa_modify_call_arguments (struct cgraph_edge *cs, gcall *stmt,
>               expr = create_tmp_reg (TREE_TYPE (expr));
>               gimple_assign_set_lhs (tem, expr);
>               gsi_insert_before (&gsi, tem, GSI_SAME_STMT);
>+              /* In case callee has a wrong __attribute__((const))
>+                 we may have no VUSE for the call and thus require
>+                 SSA update for the inserted load.  See PR64353.  */
>+              if (gimple_in_ssa_p (cfun) && !gimple_vuse (stmt))
>+                update_ssa (TODO_update_ssa_only_virtuals);
>             }
>         }
>        else
>
>Thanks,
>Ilya
>
>> Richard.
>>
>> > Thanks,
>> > Ilya
>> > --
>> > gcc/
>> >
>> > 2015-01-14  Ilya Enkovich
>> >
>> >         PR middle-end/64353
>> >         * ipa-prop.c (ipa_modify_call_arguments): Update SSA for
>> >         vops after adding a load.
>> >
>> >
>> > gcc/testsuite/
>> >
>> > 2015-01-14  Ilya Enkovich
>> >
>> >         PR middle-end/64353
>> >         * g++.dg/pr64353.C: New.
>> >
>> >
>> > diff --git a/gcc/ipa-prop.c b/gcc/ipa-prop.c
>> > index 01f4111..533dcfe 100644
>> > --- a/gcc/ipa-prop.c
>> > +++ b/gcc/ipa-prop.c
>> > @@ -4054,6 +4054,8 @@ ipa_modify_call_arguments (struct cgraph_edge *cs, gcall *stmt,
>> >               expr = create_tmp_reg (TREE_TYPE (expr));
>> >               gimple_assign_set_lhs (tem, expr);
>> >               gsi_insert_before (&gsi, tem, GSI_SAME_STMT);
>> > +             if (gimple_in_ssa_p (cfun))
>> > +               update_ssa (TODO_update_ssa_only_virtuals);
>> >             }
>> >         }
>> >        else
>> > diff --git a/gcc/testsuite/g++.dg/pr64353.C b/gcc/testsuite/g++.dg/pr64353.C
>> > new file mode 100644
>> > index 000..7859918
>> > --- /dev/null
>> > +++ b/gcc/testsuite/g++.dg/pr64353.C
>> > @@ -0,0 +1,15 @@
>> > +/* { dg-do compile } */
>> > +/* { dg-options "-O2" } */
>> > +
>> > +class C
>> > +{
>> > +  int y, x;
>> > +  void i ();
>> > +  bool __attribute__((const)) xx () { return x; }
>> > +};
>> > +
>> > +void C::i ()
>> > +{
>> > +  if (xx ())
>> > +    x = 1;
>> > +}
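
For readers following the thread: the test case depends on `xx ()` being marked `__attribute__((const))` although it reads the member `x`. As a rough illustration (not part of the patch), the C sketch below contrasts that with `pure`, which GCC documents as allowing reads of memory. The names `counter`, `counter_is_set`, `twice` and `counter_normalize` are made up for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical C analogue of class C from the test case.  */
struct counter { int y, x; };

/* Reads memory through P, so `pure` is the strongest attribute that
   matches its behaviour.  Marking it `const` would repeat the mistake
   in the test case.  */
static bool __attribute__ ((pure))
counter_is_set (const struct counter *p)
{
  return p->x != 0;
}

/* Depends only on the value of its argument, so `const` is accurate.  */
static int __attribute__ ((const))
twice (int v)
{
  return 2 * v;
}

/* Mirrors C::i: store 1 into x when the predicate holds.  */
static void
counter_normalize (struct counter *p)
{
  assert (p != NULL);
  if (counter_is_set (p))
    p->x = 1;
}
```

With `pure`, the compiler still has to assume the call reads memory, which is what gives the call a virtual use to begin with.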
Re: [Ping] Port of VTV for Cygwin and MinGW
On Thu, Jan 8, 2015 at 12:33 PM, Patrick Wollgast wrote:
> A short recap again:
>
> Latest patch, changelog and a test program (further information about
> the program in the mail):
> https://gcc.gnu.org/ml/gcc-patches/2014-11/msg03368.html

In that patch, the change to varasm.c looks wrong if neither
OBJECT_FORMAT_ELF nor TARGET_PECOFF is defined.  It looks like you've
dropped the switch_to_section call in that case.

Ian
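
The concern above is the usual one with conditional compilation: once two target-specific branches are added, the generic path still needs its own branch. The sketch below is only a guess at the general shape, not the actual varasm.c change; the `DEMO_*` macros, `switch_to_section_stub`, `emit_in_section` and `in_section` are all hypothetical stand-ins.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for the compiler's notion of the current section.  */
static const char *current_section = "none";

static void
switch_to_section_stub (const char *name)
{
  assert (name != NULL);
  current_section = name;
}

/* True if the current section is NAME.  */
static bool
in_section (const char *name)
{
  return strcmp (current_section, name) == 0;
}

static void
emit_in_section (const char *name)
{
#if defined (DEMO_OBJECT_FORMAT_ELF)
  /* ELF-specific setup would go here.  */
  switch_to_section_stub (name);
#elif defined (DEMO_TARGET_PECOFF)
  /* PE/COFF-specific setup would go here.  */
  switch_to_section_stub (name);
#else
  /* Generic fallback.  If this branch is missing, a target defining
     neither macro never switches sections at all.  */
  switch_to_section_stub (name);
#endif
}
```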
Re: Patch RFA: Support for building Go tools
Hello!

> This patch adds support to the GCC tree for building tools that are
> used with Go.  There are two external used tools (go, gofmt) and one
> tool used internally by go (cgo).  This patch is pure machinery, with
> no source code.  The tools are not built by default, only when go is
> enabled using --enable-languages.  For the moment the tools are also
> not built when building a cross-compiler, although I hope to change
> that when I figure out what is needed.

Attached is the patch that enables gotools on alpha/linux.

Tested on alpha-linux-gnu, cgo, go and gofmt executables run without
problems.

Uros.

--cut here--
Index: go/go/build/build.go
===
--- go/go/build/build.go (revision 219515)
+++ go/go/build/build.go (working copy)
@@ -266,6 +266,7 @@ var cgoEnabled = map[string]bool{
        "freebsd/amd64": true,
        "freebsd/arm": true,
        "linux/386": true,
+       "linux/alpha": true,
        "linux/amd64": true,
        "linux/arm": true,
        "linux/ppc64": true,
--cut here--
[PATCH, RFC] LRA subreg handling
Hi Vladimir,

An issue has been identified with LRA when running the CPU2006 h264ref
benchmark.  I'll try to describe the issue and the fix applied, as it is
very difficult to reproduce and next to impossible to create a narrowed
testcase, on top of the source code restrictions.

The concerned LRA code in lra-constraints.c is the following:

      if (GET_CODE (*loc) == SUBREG)
        {
          reg = SUBREG_REG (*loc);
          byte = SUBREG_BYTE (*loc);
          if (REG_P (reg)
              /* Strict_low_part requires reload the register not
                 the sub-register.  */
              && (curr_static_id->operand[i].strict_low
                  || (GET_MODE_SIZE (mode)
                      <= GET_MODE_SIZE (GET_MODE (reg))
                      && (hard_regno
                          = get_try_hard_regno (REGNO (reg))) >= 0
                      && (simplify_subreg_regno
                          (hard_regno,
                           GET_MODE (reg), byte, mode) < 0)
                      && (goal_alt[i] == NO_REGS
                          || (simplify_subreg_regno
                              (ira_class_hard_regs[goal_alt[i]][0],
                               GET_MODE (reg), byte, mode) >= 0)))))
            {
              loc = &SUBREG_REG (*loc);
              mode = GET_MODE (*loc);
            }
        }

The above works just fine when we deal with strict_low_part or a subreg
smaller than a word.  However, multi-word operations that were emitted as
a sequence of operations on word-sized parts of the DImode register appear
to expose a problem with LRA, e.g. '(set (subreg:SI (reg:DI)) ...)'.
LRA does not realize that it actually uses the other half of the DImode
register, leading to a situation where it modifies one half of the result
and spills the whole register with the other half undefined.

In the dump I can see the following:

      Creating newreg=1552 from oldreg=521, assigning class GR_REGS to r1552
 1487: r1552:DI#4=r1404:SI+r1509:SI
      REG_DEAD r1509:SI
      REG_DEAD r1404:SI
    Inserting insn reload after:
 1735: r521:DI=r1552:DI

Nothing in the dump sets r1552:DI#0, and no reload is inserted to load
the value before modifying it, yet the register is spilled.  As it is a
multi-word register, the split pass emits an additional instruction to
load the whole 64-bit value, but since only one half was modified, only
register $20 appears in the live-in set.  In contrast to $20, $21 is
being used but not added to the live-in set.

...
;; live in       4 [$4] 6 [$6] 7 [$7] 10 [$10] 11 [$11] 12 [$12] 13 [$13]
14 [$14] 15 [$15] 16 [$16] 17 [$17] 20 [$20] 22 [$22] 23 [$23] 24 [$24]
25 [$25] 29 [$sp] 30 [$fp] 31 [$31] 52 [$f20] 79 [$fakec]
...
(insn 1788 1077 1789 80 (set (reg:SI 20 $20 [orig:521 distortion ] [521])
        (mem/c:SI (plus:SI (reg/f:SI 29 $sp)
                (const_int 40 [0x28])) [16 %sfp+40 S4 A64])) rdopt.c:257 288 {*movsi_internal}
     (nil))
(insn 1789 1788 1743 80 (set (reg:SI 21 $21 [ distortion+4 ])
        (mem/c:SI (plus:SI (reg/f:SI 29 $sp)
                (const_int 44 [0x2c])) [16 %sfp+44 S4 A32])) rdopt.c:257 288 {*movsi_internal}
     (nil))
...

The potential fix for this is to promote the type of a subreg OP_OUT to
OP_INOUT so that the pseudo register (r1552 in this case) is treated as
an input, and LRA is forced to insert a reload before modifying its
contents.

Handling of the strict_low_part case is fine, as the operand is described
in the MD pattern as IN_OUT through modifiers.

With the above change in place, we get a reload before the assignment:

      Creating newreg=1552 from oldreg=521, assigning class GR_REGS to r1552
 1487: r1552:DI#4=r1404:SI+r1509:SI
      REG_DEAD r1509:SI
      REG_DEAD r1404:SI
    Inserting insn reload before:
 1735: r1552:DI=r521:DI
    Inserting insn reload after:
 1736: r521:DI=r1552:DI

and the benchmark happily passes the runtime check.

The question is whether changing the type to OP_INOUT is the correct and
valid fix?

Regards,
Robert

2015-01-14  Robert Suchanek

gcc/
        * lra-constraints.c (curr_insn_transform): Change the type of a reload
        pseudo to OP_INOUT.
---
 gcc/lra-constraints.c |    1 +
 1 file changed, 1 insertion(+)

diff --git a/gcc/lra-constraints.c b/gcc/lra-constraints.c
index ec28b7f..018968b 100644
--- a/gcc/lra-constraints.c
+++ b/gcc/lra-constraints.c
@@ -3798,6 +3798,7 @@ curr_insn_transform (void)
                               (ira_class_hard_regs[goal_alt[i]][0],
                                GET_MODE (reg), byte, mode) >= 0)))))
             {
+              type = OP_INOUT;
               loc = &SUBREG_REG (*loc);
               mode = GET_MODE (*loc);
             }
--
1.7.9.5
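
To make the OP_OUT versus OP_INOUT difference concrete, here is a small stand-alone C model of the two reload shapes from the dumps. It does not use any LRA interfaces; `di_reg`, `UNDEFINED_WORD`, `write_word_as_output` and `write_word_as_inout` are invented for the illustration, and the "undefined" word is modelled with a marker value.

```c
#include <assert.h>
#include <stdint.h>

/* A two-word register, standing in for a DImode pseudo.  */
typedef struct { uint32_t word[2]; } di_reg;

/* Marker for "never written"; real hardware would hold arbitrary bits.  */
#define UNDEFINED_WORD 0xdeadbeefu

/* OP_OUT model: the reload pseudo starts out undefined, one word is
   written, and the whole pseudo is copied back ("reload after" only).  */
static di_reg
write_word_as_output (di_reg old, int index, uint32_t value)
{
  di_reg tmp = { { UNDEFINED_WORD, UNDEFINED_WORD } };
  (void) old;   /* The old value is never loaded in this model.  */
  assert (index == 0 || index == 1);
  tmp.word[index] = value;
  return tmp;
}

/* OP_INOUT model: the old value is loaded first ("reload before"),
   so the untouched word survives the copy back.  */
static di_reg
write_word_as_inout (di_reg old, int index, uint32_t value)
{
  di_reg tmp = old;
  assert (index == 0 || index == 1);
  tmp.word[index] = value;
  return tmp;
}
```

In the first function the word that was not written comes back as the marker value, which corresponds to the half of r521 that ends up undefined after the spill.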
Re: [RFC, PATCH][LRA, MIPS] ICE: in decompose_normal_address, at rtlanal.c:5817
Robert Suchanek writes:
> Here is the revised patch that would handle the other cases as per Richard's
> comments.
>
> I slightly modified Matthew's proposed patch and used split_const
> instead of get_related_value.  AFAICS, the canonical form would always have
> the 'plus' expression.
>
> The offset on the high part is most likely not important as the code
> generation has to guarantee that the low part represents the true address
> in the case where the high and lo_sum are directly related.

This looks good to me FWIW.

Thanks,
Richard
RE: [MIPS] Update the ZC constraint for MIPSR6 and use it
Hi Matthew,

> -Original Message-
> From: Matthew Fortune [mailto:matthew.fort...@imgtec.com]
> Sent: Tuesday, January 06, 2015 7:43 AM
> To: Moore, Catherine
> Cc: 'gcc-patches@gcc.gnu.org' (gcc-patches@gcc.gnu.org)
> Subject: [MIPS] Update the ZC constraint for MIPSR6 and use it
>
> Update the ZC constraint for MIPSR6 to allow it to be used as the memory
> operand for implementations of atomic operations.  Also switch the internal
> implementation of atomic operations to use ZC instead of ZR.
>
> This fix accurately describes the memory constraints for the LL and SC
> instructions.  An offset can therefore be used to access a data item
> (ie. %lo ()) rather than always having to load the address into a
> register.  Tested for mips32r2, mips32r6 and micromips.
>
> gcc/
>
>         * config/mips/constraints.md (ZC): Add support for R6 LL/SC
>         offsets.
>         (ZD): Update to use ISA_HAS_PREF_LL_SC_9BIT.
>         * config/mips/mips.h (ISA_HAS_PREFETCH_9BIT): Rename to...
>         (ISA_HAS_PREF_LL_SC_9BIT): ... this.  New macro.
>         * config/mips/sync.md (sync_compare_and_swap): Use ZC
>         instead of ZR for the memory operand of LL/SC.
>         (compare_and_swap_12, sync_add): Likewise.
>         (sync__12, sync_old__12): Likewise.
>         (sync_new__12, sync_nand_12): Likewise.
>         (sync_old_nand_12, sync_new_nand_12): Likewise.
>         (sync_sub, sync_old_add): Likewise.
>         (sync_old_sub, sync_new_add): Likewise.
>         (sync_new_sub, sync_): Likewise.
>         (sync_old_, sync_new_"):
>         Likewise.
>         (sync_nand, sync_old_nand): Likewise.
>         (sync_new_nand, sync_lock_test_and_set):
>         Likewise.
>         (test_and_set_12, atomic_compare_and_swap): Likewise.
>         (atomic_exchange_llsc, atomic_fetch_add_llsc):
>         Likewise.
>         * doc/md.texi (ZC): Update description.
>
> OK to commit?
>
> diff --git a/gcc/config/mips/mips.h b/gcc/config/mips/mips.h
> index 9dad480..b608b17 100644
> --- a/gcc/config/mips/mips.h
> +++ b/gcc/config/mips/mips.h
> @@ -1089,8 +1089,8 @@ struct mips_cpu_info {
>                                       || mips_isa_rev >= 1)            \
>                                      && !TARGET_MIPS16)
>
> -/* ISA has data prefetch with limited 9-bit displacement.  */
> -#define ISA_HAS_PREFETCH_9BIT (mips_isa_rev >= 6)
> +/* ISA has data prefetch, LL and SC with limited 9-bit displacement.  */
> +#define ISA_HAS_PREF_LL_SC_9BIT (mips_isa_rev >= 6)
>

I'd like to see this described as something more general.  Say:
ISA_HAS_9BIT_DISPLACEMENT.  This patch is okay with that fixup.

Thanks,
Catherine
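
As a rough aid for the macro discussion, the following stand-alone C sketch shows what a "9-bit displacement" check amounts to. It assumes a signed 9-bit offset field for LL/SC on R6 and a signed 16-bit field on earlier revisions; `fits_signed_field` and `llsc_offset_ok` are hypothetical helpers, not functions from the MIPS backend.

```c
#include <assert.h>
#include <stdbool.h>

/* Return true if OFFSET fits in a signed field that is BITS wide.  */
static bool
fits_signed_field (long offset, int bits)
{
  assert (bits > 0 && bits < 32);
  long limit = 1L << (bits - 1);
  return offset >= -limit && offset < limit;
}

/* Hypothetical check for an LL/SC memory operand offset.  ISA_REV
   plays the role of mips_isa_rev: 9 bits from R6 on, 16 bits before.  */
static bool
llsc_offset_ok (long offset, int isa_rev)
{
  return fits_signed_field (offset, isa_rev >= 6 ? 9 : 16);
}
```

Under these assumptions an offset of 256 is acceptable before R6 but has to be moved into a register on R6.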
Re: [PATCH] Allow MIPS call-saved-{4-6}.c tests to correctly run for micromips
"Maciej W. Rozycki" writes:
> On Wed, 14 Jan 2015, Richard Sandiford wrote:
>
>> > Taking care that the default compilation mode does not conflict (e.g.
>> > MIPS16, incompatible) and taking any exceptions into account (e.g. n64,
>> > unsupported) I presume, right?
>>
>> mips.exp sorts that out for you.  Adding "-mmicromips" or "(-micromips)"
>> to dg-options forces (or at least is supposed to force) the overall flags
>> to be compatible with microMIPS.
>>
>> The aim of mips.exp is avoid skipping tests whereever possible.  If
>> someone runs the testsuite with -mips16 and we have a -micromips test,
>> it's better to remove -mips16 for that test than to skip the test entirely.
>
>  OK, good to know, thanks; that works for compilation tests only though.
> For execution tests however what if target hardware used is incompatible
> or there is no suitable C library (multilib configuration) available for
> the option requested?  E.g. any hardware supporting MIPS16 won't run
> microMIPS code, n64 tests won't link if there's no n64 multilib, etc.

In those cases it just does a compile-only test, again on the basis that
it's better than skipping the test entirely.  See the big comment at
the beginning of mips.exp if you're interested in the specific details
of how this works and what is supported.

>> >  Please always try to test changes reasonably, i.e. at least o32,
>> > o32/MIPS16, o32/microMIPS, n32, n64, and then Linux and ELF if applicable,
>> > plus any options that may be relevant, unless it is absolutely clear
>> > ABI/ISA variations do not matter for a change proposed.
>>
>> TBH this seems a bit much.  On the one hand it's more testing than you'd
>> get for almost any other target, but on the other it leaves out important
>> differences like MIPS I vs MIPS II vs MIPS 32, MIPS III vs MIPS IV vs MIPS64,
>> r1 vs. r2 vs. r6, Octeon vs. Loongson vs. vanilla, DSP vs. no DSP, etc.
>
>  I disagree, I listed what I consider the base set of configurations for
> the MIPS target, spanning the major target variations:
>
> - MIPS/MIPS16/microMIPS can be treated almost as distinct processor
>   architectures, the instruction sets have much common in spirit, but
>   there are enough pitfalls and traps,
>
> - n32 covers a substantially different calling convention plus (for Linux)
>   SVR4 PIC code that normally isn't used for executables with o32 these
>   days,
>
> - n64 covers all that n32 does plus a 64-bit target.
>
>  I realise ELF testing may be difficult for some people due to the hassle
> with setting up the runtime, so to skip an ELF target can be justified;
> otherwise I think it makes sense to run such testing for at least one
> configuration from the list above for a good measure.  As is running some
> of them with the big and some of them with the little endianness.
>
>  You've got a point with architecture levels or processor models.  I think
> r6 should be treated as a distinct architecture and tested as the fourth
> variant along MIPS/MIPS16/microMIPS, but such a test environment may not
> yet be available to many.  The rest I'm afraid will mostly matter for
> changes made to the middle end rather than the MIPS backend, in which case
> chances are MIPS testing will not be run at all.  A test bot (similar to
> JBG's build bot, but extended to run testing too) can help in this case; I
> don't know if anyone runs one.
>
>  As to DSP, MSA, hard-float, soft-float, 2008-NaN, etc., I'd only expect
> to run appropriate testing (i.e. with `-mdsp', etc.) across the
> configurations named above whenever relevant code is changed; some of this
> stuff is irrelevant or unavailable for some of the configurations above
> (e.g. n64 DSP, IIRC), or may have no influence (e.g. the NaN encoding), in
> which case it may be justified to skip them.

But soft vs. hard float in particular is a significant difference in
terms of the ABI.  Especially when it comes to MIPS16 interworking
(but even apart from that).

>> I think we just have to accept that there are so many possible
>> combinations that we can't test everything that's potentially relevant.
>> I think it's more useful to be flexible than prescribe a particular list.
>
>  Of course flexibility is needed, I absolutely agree.  I consider the list
> I quoted the base set, I've used it for all recent submissions.  Then for
> each individual change I've asked myself: does it make sense to run all
> this testing?  If for example a change touched `if (TARGET_MICROMIPS)'
> code only, then clearly running any non-microMIPS testing adds no value.
> And then: will this testing provide enough coverage?  If not, then what
> else needs to be covered?
>
>  As I say, testing is cheap, you can fire a bunch of test suites in
> parallel under Linux started on QEMU run in the system emulation mode.
> From my experience on decent x86 hardware whole GCC/G++ testing across the
> 5 configurations named will complete in just a few hours, that you can
RE: [MIPS] Update the ZC constraint for MIPSR6 and use it
Moore, Catherine writes > Hi Matthew, > > > -Original Message- > > From: Matthew Fortune [mailto:matthew.fort...@imgtec.com] > > Sent: Tuesday, January 06, 2015 7:43 AM > > To: Moore, Catherine > > Cc: 'gcc-patches@gcc.gnu.org' (gcc-patches@gcc.gnu.org) > > Subject: [MIPS] Update the ZC constraint for MIPSR6 and use it > > > > Update the ZC constraint for MIPSR6 to allow it to be used as the > > memory operand for implementations of atomic operations. Also switch > > the internal implementation of atomic operations to use ZC instead of > ZR. > > > > This fix accurately describes the memory constraints for the LL and SC > > instructions. An offset can therefore be used to access a data item > > (ie. %lo ()) rather than always having to load the address into a > > register. Tested for mips32r2, mips32r6 and micromips. > > > > gcc/ > > > > * config/mips/constraints.md (ZC): Add support for R6 LL/SC > > offsets. > > (ZD): Update to use ISA_HAS_PREF_LL_SC_9BIT. > > * config/mips/mips.h (ISA_HAS_PREFETCH_9BIT): Rename to... > > (ISA_HAS_PREF_LL_SC_9BIT): ... this. New macro. > > * config/mips/sync.md (sync_compare_and_swap): Use ZC > > instead of ZR for the memory operand of LL/SC. > > (compare_and_swap_12, sync_add): Likewise. > > (sync__12, sync_old__12): Likewise. > > (sync_new__12, sync_nand_12): Likewise. > > (sync_old_nand_12, sync_new_nand_12): Likewise. > > (sync_sub, sync_old_add): Likewise. > > (sync_old_sub, sync_new_add): Likewise. > > (sync_new_sub, sync_): Likewise. > > (sync_old_, sync_new_"): > > Likewise. > > (sync_nand, sync_old_nand): Likewise. > > (sync_new_nand, sync_lock_test_and_set): > > Likewise. > > (test_and_set_12, atomic_compare_and_swap): Likewise. > > (atomic_exchange_llsc, atomic_fetch_add_llsc): > > Likewise. > > * doc/md.texi (ZC): Update description. > > > > OK to commit? 
> >
> > diff --git a/gcc/config/mips/mips.h b/gcc/config/mips/mips.h
> > index 9dad480..b608b17 100644
> > --- a/gcc/config/mips/mips.h
> > +++ b/gcc/config/mips/mips.h
> > @@ -1089,8 +1089,8 @@ struct mips_cpu_info {
> >                                       || mips_isa_rev >= 1)            \
> >                                      && !TARGET_MIPS16)
> >
> > -/* ISA has data prefetch with limited 9-bit displacement.  */
> > -#define ISA_HAS_PREFETCH_9BIT (mips_isa_rev >= 6)
> > +/* ISA has data prefetch, LL and SC with limited 9-bit displacement.  */
> > +#define ISA_HAS_PREF_LL_SC_9BIT (mips_isa_rev >= 6)
> >
> I'd like to see this described as something more general.  Say:
> ISA_HAS_9BIT_DISPLACEMENT.  This patch is okay with that fixup.

I think I'm OK with changing that, but it does leave us with a different
issue of knowing which subset of instructions should check for a 9-bit
displacement, i.e. not every instruction is limited to a 9-bit
displacement.

A GCC 6 thing would be to look over all the ISA_HAS macros and perhaps do
some general improvement in the framework we have there.  I don't know
exactly what I'd do but something a bit more table based seems sensible.

Matthew
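
Purely as a thought experiment on the "more table based" idea, one shape it could take is a per-instruction table of displacement widths keyed by ISA revision. Nothing below exists in GCC; `disp_rule`, `disp_rules` and `displacement_bits` are hypothetical, and the widths assume 9-bit LL/SC/PREF offsets on R6 against 16-bit offsets elsewhere.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One row per instruction: offset field width before and from R6.  */
struct disp_rule
{
  const char *insn;
  int bits_pre_r6;
  int bits_r6;
};

/* Hypothetical table; only a few instructions are listed.  */
static const struct disp_rule disp_rules[] = {
  { "ll",   16, 9 },
  { "sc",   16, 9 },
  { "pref", 16, 9 },
  { "lw",   16, 16 },
  { "sw",   16, 16 },
};

/* Return the displacement width for INSN at ISA_REV, or -1 if the
   instruction is not in the table.  */
static int
displacement_bits (const char *insn, int isa_rev)
{
  assert (insn != NULL);
  for (size_t i = 0; i < sizeof disp_rules / sizeof disp_rules[0]; i++)
    if (strcmp (disp_rules[i].insn, insn) == 0)
      return isa_rev >= 6 ? disp_rules[i].bits_r6
                          : disp_rules[i].bits_pre_r6;
  return -1;
}
```

A table like this would answer the "which subset of instructions" question by lookup rather than by the macro name.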
[debug-early] fix C++ mangling issues with deferred_asm_name removal
There were some -fcompare-debug regressions when I removed the
deferred_asm_name auxiliary data structure from dwarf2out.  Jason was kind
enough to tackle them (or at least the main one).  Apparently there was
some bug in the mangling code that necessitated the entire
deferred_asm_name vector.  I'll let Jason explain further.

With this patch I see fewer guality failures than mainline, so perhaps
removing deferred_asm_name can be pushed to mainline as soon as stage1
opens, or earlier if we feel lucky.

I will be committing this patch, as well as my deferred_asm_name removal
patch, to the branch.

Thanks.

commit 0c817fd43f95029153045b9523c1c8b49291e4a3
Author: Aldy Hernandez
Date:   Wed Jan 14 10:12:40 2015 -0800

    cp/
        * decl2.c (mangling_aliases): New variable.
        (note_mangling_alias, generate_mangling_aliases): New.
        (cp_write_global_declarations): Call generate_mangling_aliases.
        (generate_mangling_alias): Split out from...
        * mangle.c (mangle_decl): ...here.
        * cp-tree.h: Declare note_mangling_alias.

diff --git a/gcc/ChangeLog.debug-early b/gcc/ChangeLog.debug-early
index 48d1913..f4fb4ba 100644
--- a/gcc/ChangeLog.debug-early
+++ b/gcc/ChangeLog.debug-early
@@ -1,3 +1,13 @@
+2015-01-14  Jason Merrill
+
+       cp/
+       * decl2.c (mangling_aliases): New variable.
+       (note_mangling_alias, generate_mangling_aliases): New.
+       (cp_write_global_declarations): Call generate_mangling_aliases.
+       (generate_mangling_alias): Split out from...
+       * mangle.c (mangle_decl): ...here.
+       * cp-tree.h: Declare note_mangling_alias.
+
 2015-01-06  Aldy Hernandez

        * dwarf2out.c: Remove deferred_asm_name.
diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index 98f2e20..5fa96cb 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
@@ -5431,6 +5431,7 @@ extern tree finish_case_label (location_t, tree, tree);
 extern tree cxx_maybe_build_cleanup (tree, tsubst_flags_t);

 /* in decl2.c */
+extern void note_mangling_alias (tree, tree);
 extern bool check_java_method (tree);
 extern tree build_memfn_type (tree, tree, cp_cv_quals, cp_ref_qualifier);
 extern tree build_pointer_ptrmemfn_type (tree);
diff --git a/gcc/cp/decl2.c b/gcc/cp/decl2.c
index abcaeac..691688b 100644
--- a/gcc/cp/decl2.c
+++ b/gcc/cp/decl2.c
@@ -114,6 +114,10 @@ static GTY(()) vec<tree, va_gc> *deferred_fns;
    sure are defined.  */
 static GTY(()) vec<tree, va_gc> *no_linkage_decls;

+/* A vector of alternating decls and identifiers, where the latter
+   is to be an alias for the former if the former is defined.  */
+static GTY(()) vec<tree, va_gc> *mangling_aliases;
+
 /* Nonzero if we're done parsing and into end-of-file activities.  */

 int at_eof;
@@ -4253,6 +4257,66 @@ handle_tls_init (void)
   expand_or_defer_fn (finish_function (0));
 }

+/* We're at the end of compilation, so generate any mangling aliases that
+   we've been saving up, if DECL is going to be output and ID2 isn't
+   already taken by another declaration.  */
+
+static void
+generate_mangling_alias (tree decl, tree id2)
+{
+  /* If there's a declaration already using this mangled name,
+     don't create a compatibility alias that conflicts.  */
+  if (IDENTIFIER_GLOBAL_VALUE (id2))
+    return;
+
+  struct cgraph_node *n = NULL;
+  if (TREE_CODE (decl) == FUNCTION_DECL
+      && !(n = cgraph_node::get (decl)))
+    /* Don't create an alias to an unreferenced function.  */
+    return;
+
+  tree alias = make_alias_for (decl, id2);
+  SET_IDENTIFIER_GLOBAL_VALUE (id2, alias);
+  DECL_IGNORED_P (alias) = 1;
+  TREE_PUBLIC (alias) = TREE_PUBLIC (decl);
+  DECL_VISIBILITY (alias) = DECL_VISIBILITY (decl);
+  if (vague_linkage_p (decl))
+    DECL_WEAK (alias) = 1;
+  if (TREE_CODE (decl) == FUNCTION_DECL)
+    n->create_same_body_alias (alias, decl);
+  else
+    varpool_node::create_extra_name_alias (alias, decl);
+}
+
+/* Note that we might want to emit an alias with the symbol ID2 for DECL at
+   the end of translation, for compatibility across bugs in the mangling
+   implementation.  */
+
+void
+note_mangling_alias (tree decl, tree id2)
+{
+#ifdef ASM_OUTPUT_DEF
+  if (at_eof)
+    generate_mangling_alias (decl, id2);
+  else
+    {
+      vec_safe_push (mangling_aliases, decl);
+      vec_safe_push (mangling_aliases, id2);
+    }
+#endif
+}
+
+static void
+generate_mangling_aliases ()
+{
+  while (!vec_safe_is_empty (mangling_aliases))
+    {
+      tree id2 = mangling_aliases->pop();
+      tree decl = mangling_aliases->pop();
+      generate_mangling_alias (decl, id2);
+    }
+}
+
 /* The entire file is now complete.  If requested, dump everything
    to a file.  */

@@ -4593,6 +4657,8 @@ c_parse_final_cleanups (void)
     }
   while (reconsider);

+  generate_mangling_aliases ();
+
   /* All used inline functions must have a definition at this point.  */
   FOR_EACH_VEC_SAFE_ELT (deferred_fns, i, decl)
     {
diff --git
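
The bookkeeping in note_mangling_alias and generate_mangling_aliases is a deferred work list stored as alternating entries, popped in reverse push order. A minimal stand-alone C model of that pattern might look as follows; it uses plain strings instead of trees, and `note_alias`, `flush_aliases`, `emit_alias` and `alias_emitted` are hypothetical names, not GCC functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_PENDING 64

/* Alternating entries: declaration name, then alias name.  */
static const char *pending[MAX_PENDING];
static size_t pending_len;

/* Record of what was emitted, as (decl, alias) pairs.  */
static const char *emitted[MAX_PENDING][2];
static size_t emitted_len;

static void
emit_alias (const char *decl, const char *alias)
{
  assert (emitted_len < MAX_PENDING);
  emitted[emitted_len][0] = decl;
  emitted[emitted_len][1] = alias;
  emitted_len++;
}

/* Emit at once when AT_EOF is set, otherwise queue the pair.  */
static void
note_alias (const char *decl, const char *alias, bool at_eof)
{
  if (at_eof)
    {
      emit_alias (decl, alias);
      return;
    }
  assert (pending_len + 2 <= MAX_PENDING);
  pending[pending_len++] = decl;
  pending[pending_len++] = alias;
}

/* Drain the queue.  The alias was pushed last, so it is popped first.  */
static void
flush_aliases (void)
{
  while (pending_len > 0)
    {
      const char *alias = pending[--pending_len];
      const char *decl = pending[--pending_len];
      emit_alias (decl, alias);
    }
}

/* True if the pair (DECL, ALIAS) has been emitted.  */
static bool
alias_emitted (const char *decl, const char *alias)
{
  for (size_t i = 0; i < emitted_len; i++)
    if (strcmp (emitted[i][0], decl) == 0
        && strcmp (emitted[i][1], alias) == 0)
      return true;
  return false;
}
```

Because the list is drained from the back, pairs are processed in the reverse of the order in which they were noted; the pairing itself is preserved as long as the two pops mirror the two pushes.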
RE: [MIPS] Update the ZC constraint for MIPSR6 and use it
> -Original Message- > From: Matthew Fortune [mailto:matthew.fort...@imgtec.com] > Sent: Wednesday, January 14, 2015 2:54 PM > To: Moore, Catherine > Cc: 'gcc-patches@gcc.gnu.org' (gcc-patches@gcc.gnu.org) > Subject: RE: [MIPS] Update the ZC constraint for MIPSR6 and use it > > Moore, Catherine writes > > Hi Matthew, > > > > > -Original Message- > > > From: Matthew Fortune [mailto:matthew.fort...@imgtec.com] > > > Sent: Tuesday, January 06, 2015 7:43 AM > > > To: Moore, Catherine > > > Cc: 'gcc-patches@gcc.gnu.org' (gcc-patches@gcc.gnu.org) > > > Subject: [MIPS] Update the ZC constraint for MIPSR6 and use it > > > > > > Update the ZC constraint for MIPSR6 to allow it to be used as the > > > memory operand for implementations of atomic operations. Also > > > switch the internal implementation of atomic operations to use ZC > > > instead of > > ZR. > > > > > > This fix accurately describes the memory constraints for the LL and > > > SC instructions. An offset can therefore be used to access a data > > > item (ie. %lo ()) rather than always having to load the address > > > into a register. Tested for mips32r2, mips32r6 and micromips. > > > > > > gcc/ > > > > > > * config/mips/constraints.md (ZC): Add support for R6 LL/SC > > > offsets. > > > (ZD): Update to use ISA_HAS_PREF_LL_SC_9BIT. > > > * config/mips/mips.h (ISA_HAS_PREFETCH_9BIT): Rename to... > > > (ISA_HAS_PREF_LL_SC_9BIT): ... this. New macro. > > > * config/mips/sync.md (sync_compare_and_swap): Use ZC > > > instead of ZR for the memory operand of LL/SC. > > > (compare_and_swap_12, sync_add): Likewise. > > > (sync__12, sync_old__12): Likewise. > > > (sync_new__12, sync_nand_12): Likewise. > > > (sync_old_nand_12, sync_new_nand_12): Likewise. > > > (sync_sub, sync_old_add): Likewise. > > > (sync_old_sub, sync_new_add): Likewise. > > > (sync_new_sub, sync_): Likewise. > > > (sync_old_, sync_new_"): > > > Likewise. > > > (sync_nand, sync_old_nand): Likewise. 
> > >         (sync_new_nand, sync_lock_test_and_set):
> > >         Likewise.
> > >         (test_and_set_12, atomic_compare_and_swap): Likewise.
> > >         (atomic_exchange_llsc, atomic_fetch_add_llsc):
> > >         Likewise.
> > >         * doc/md.texi (ZC): Update description.
> > >
> > > OK to commit?
> > >
> > > diff --git a/gcc/config/mips/mips.h b/gcc/config/mips/mips.h
> > > index 9dad480..b608b17 100644
> > > --- a/gcc/config/mips/mips.h
> > > +++ b/gcc/config/mips/mips.h
> > > @@ -1089,8 +1089,8 @@ struct mips_cpu_info {
> > >                                       || mips_isa_rev >= 1)            \
> > >                                      && !TARGET_MIPS16)
> > >
> > > -/* ISA has data prefetch with limited 9-bit displacement.  */
> > > -#define ISA_HAS_PREFETCH_9BIT (mips_isa_rev >= 6)
> > > +/* ISA has data prefetch, LL and SC with limited 9-bit displacement.  */
> > > +#define ISA_HAS_PREF_LL_SC_9BIT (mips_isa_rev >= 6)
> > >
> > I'd like to see this described as something more general.  Say:
> > ISA_HAS_9BIT_DISPLACEMENT.  This patch is okay with that fixup.
>
> I think I'm OK with changing that but it does leave us with a different issue
> of knowing which subset of instructions should check for 9-bit displacement.
> I.e. not all instructions only have a 9-bit displacement.

I'm open to a different name.  Do you have any other suggestions?  Can we
just say >= R6?

> A GCC 6 thing would be to look over all the ISA_HAS macros and perhaps do
> some general improvement in the framework we have there.  I don't know
> exactly what I'd do but something a bit more table based seems sensible.
>

Sounds like a good idea.
RE: [PATCH] Allow MIPS call-saved-{4-6}.c tests to correctly run for micromips
Richard Sandiford writes: > "Maciej W. Rozycki" writes: > > On Wed, 14 Jan 2015, Richard Sandiford wrote: > >> I think we just have to accept that there are so many possible > >> combinations that we can't test everything that's potentially > relevant. > >> I think it's more useful to be flexible than prescribe a particular > list. > > > > Of course flexibility is needed, I absolutely agree. I consider the > > list I quoted the base set, I've used it for all recent submissions. > > Then for each individual change I've asked myself: does it make sense > > to run all this testing? If for example a change touched `if > (TARGET_MICROMIPS)' > > code only, then clearly running any non-microMIPS testing adds no > value. > > And then: will this testing provide enough coverage? If not, then > > what else needs to be covered? > > > > As I say, testing is cheap, you can fire a bunch of test suites in > > parallel under Linux started on QEMU run in the system emulation mode. > > From my experience on decent x86 hardware whole GCC/G++ testing across > > the > > 5 configurations named will complete in just a few hours, that you can > > spend doing something else. And if any issues are found then the > > patch submitter, who's often the actual author and knows his code the > > best, is in the best position to understand what happened. > > > > OTOH chasing down a problem later on is expensive (difficult), first > > it has to be narrowed down, often based on a user bug report rather > > than the discovery of a test-suite regression. Even making a > > reproducible test case from such a report may be tough. 
And then you > > have the choice of either debugging the problem from scratch, or (if > > you have an easy way to figure out it is a regression, such as by > > passing the test case through an older version of the compiler whose > > binary you already have handy) bisecting the tree to find the > > offending commit (not readily available with SVN AFAIK, but I had > > cases I did it manually in the past) and starting from there. Both > ways are tedious and time consuming. > > > >> Having everyone test the same multilib combinations on the same > >> target isn't necessarily a good thing anyway. Diversity in testing > >> (between > >> developers) is useful too. > > > > Sure, people will undoubtedly use different default options, I'm sure > > folks at Cavium will compile for Octeon rather than the base > > architecture for example. Other people may have DSP enabled. Etc., > > etc... That IMHO does not preclude testing across more than just a > single configuration. > > Yeah, but that's just the way it goes. By trying to get everyone to > test with the options that matter to you, you're reducing the amount of > work you have to do when tracking regressions on those targets, but > you're saying that people who care about Octeon or the opposite > floatness have to go through the process you describe as "tedious and > time consuming". > > And you don't avoid that process anyway, since people making changes to > target-independent parts of GCC are just as likely to introduce a > regression as those making changes to MIPS-only code. If testing is > cheap and takes only a small number of hours, and if you want to make it > less tedious to track down a regression, continuous testing would give > you a narrow window for each regression. > > Submitters should be free to test on what matters to them rather than > have to test a canned set of multilibs on specific configurations. 
One of my main concerns is in enabling contribution from less experienced
developers and those that don't have the infrastructure available to
perform wide regression testing.  I would not want anyone to fear that,
because they didn't test a specific ISA/revision, they shouldn't bother
submitting their patch.  The review process is fairly intense in GNU
projects and the retest of code can easily stack up with just a few
configurations.  Frankly, I dread having to do anything remotely like FPXX
ever again as the testing drove me bonkers.

I believe there is a point where we have to accept that some issues may
have to be fixed after the initial patch is committed.  There have been
several configuration related issues addressed after FPXX was committed,
but having the code in tree and getting feedback from other people's
favourite configuration testing can actually help speed up development as
well.

The majority of test failures for different MIPS configurations tend to
come from the tests with expected output.  Trying to ensure a test builds
correctly for any set of test options and has the correct output is
exceptionally hard, and there is a general theme of not over-specifying
test options so that the test does take on the personality of the test
options if possible.

Personally I am happy to go through at regular intervals and look at the
results for a wide range of configurations and fix them up.  It takes
significantly less tim
RE: [MIPS] Update the ZC constraint for MIPSR6 and use it
Moore, Catherine writes: > > -Original Message- > > From: Matthew Fortune [mailto:matthew.fort...@imgtec.com] > > Sent: Wednesday, January 14, 2015 2:54 PM > > To: Moore, Catherine > > Cc: 'gcc-patches@gcc.gnu.org' (gcc-patches@gcc.gnu.org) > > Subject: RE: [MIPS] Update the ZC constraint for MIPSR6 and use it > > > > Moore, Catherine writes > > > Hi Matthew, > > > > > > > -Original Message- > > > > From: Matthew Fortune [mailto:matthew.fort...@imgtec.com] > > > > Sent: Tuesday, January 06, 2015 7:43 AM > > > > To: Moore, Catherine > > > > Cc: 'gcc-patches@gcc.gnu.org' (gcc-patches@gcc.gnu.org) > > > > Subject: [MIPS] Update the ZC constraint for MIPSR6 and use it > > > > > > > > Update the ZC constraint for MIPSR6 to allow it to be used as the > > > > memory operand for implementations of atomic operations. Also > > > > switch the internal implementation of atomic operations to use ZC > > > > instead of > > > ZR. > > > > > > > > This fix accurately describes the memory constraints for the LL > > > > and SC instructions. An offset can therefore be used to access a > > > > data item (ie. %lo ()) rather than always having to load the > > > > address into a register. Tested for mips32r2, mips32r6 and > micromips. > > > > > > > > gcc/ > > > > > > > > * config/mips/constraints.md (ZC): Add support for R6 LL/SC > > > > offsets. > > > > (ZD): Update to use ISA_HAS_PREF_LL_SC_9BIT. > > > > * config/mips/mips.h (ISA_HAS_PREFETCH_9BIT): Rename to... > > > > (ISA_HAS_PREF_LL_SC_9BIT): ... this. New macro. > > > > * config/mips/sync.md (sync_compare_and_swap): Use ZC > > > > instead of ZR for the memory operand of LL/SC. > > > > (compare_and_swap_12, sync_add): Likewise. > > > > (sync__12, sync_old__12): Likewise. > > > > (sync_new__12, sync_nand_12): Likewise. > > > > (sync_old_nand_12, sync_new_nand_12): Likewise. > > > > (sync_sub, sync_old_add): Likewise. > > > > (sync_old_sub, sync_new_add): Likewise. > > > > (sync_new_sub, sync_): Likewise. 
> > > > (sync_old_, sync_new_"): > > > > Likewise. > > > > (sync_nand, sync_old_nand): Likewise. > > > > (sync_new_nand, sync_lock_test_and_set): > > > > Likewise. > > > > (test_and_set_12, atomic_compare_and_swap): Likewise. > > > > (atomic_exchange_llsc, atomic_fetch_add_llsc): > > > > Likewise. > > > > * doc/md.texi (ZC): Update description. > > > > > > > > OK to commit? > > > > > > > > diff --git a/gcc/config/mips/mips.h b/gcc/config/mips/mips.h index > > > > 9dad480..b608b17 100644 > > > > --- a/gcc/config/mips/mips.h > > > > +++ b/gcc/config/mips/mips.h > > > > @@ -1089,8 +1089,8 @@ struct mips_cpu_info { > > > > || mips_isa_rev >= 1) > > > > \ > > > > && !TARGET_MIPS16) > > > > > > > > -/* ISA has data prefetch with limited 9-bit displacement. */ > > > > -#define ISA_HAS_PREFETCH_9BIT (mips_isa_rev >= 6) > > > > +/* ISA has data prefetch, LL and SC with limited 9-bit > displacement. > > > */ > > > > +#define ISA_HAS_PREF_LL_SC_9BIT(mips_isa_rev >= 6) > > > > > > > I'd like to see this described as something more general. Say: > > > ISA_HAS_9BIT_DISPLACEMENT. This patch is okay with that fixup. > > > > I think I'm OK with changing that but it does leave us with a > > different issue of knowing which subset of instructions should check > for 9-bit displacement. > > I.e. not all instructions only have a 9-bit displacement. > > I'm open to a different name. Do you have any other suggestions? Can > we just say >= R6? That is pretty much what it boils down to but I do like keeping all the isa level checks in one place and giving names to things. I'll go with your suggestion and leave the rest to a later general improvement. Thanks, Matthew > > A GCC 6 thing would be to look over all the ISA_HAS macros and perhaps > > do some general improvement in the framework we have there. I don't > > know exactly what I'd do but something a bit more table based seems > sensible. > > > Sounds like a good idea.
Re: [Ping] Port of VTV for Cygwin and MinGW
On 14.01.2015 20:00, Ian Lance Taylor wrote: > On Thu, Jan 8, 2015 at 12:33 PM, Patrick Wollgast > wrote: >> A short recap again: >> >> Latest patch, changelog and a test program (further information about >> the program in the mail): >> https://gcc.gnu.org/ml/gcc-patches/2014-11/msg03368.html > > In that patch, the change to varasm.c looks wrong if neither > OBJECT_FORMAT_ELF nor TARGET_PECOFF are defined. It looks like you've > dropped the switch_to_section call in that case. > > Ian > You're right. It should have been '#else' again, instead of 'else' before the switch_to_section call. Regards, Patrick Index: gcc/config/i386/cygwin.h === --- gcc/config/i386/cygwin.h (Revision 214408) +++ gcc/config/i386/cygwin.h (Arbeitskopie) @@ -41,12 +41,18 @@ along with GCC; see the file COPYING3. #define STARTFILE_SPEC "\ %{!shared: %{!mdll: crt0%O%s \ %{pg:gcrt0%O%s}}}\ - %{shared:crtbeginS.o%s;:crtbegin.o%s}" + %{shared:crtbeginS.o%s;:crtbegin.o%s} \ + %{fvtable-verify=none:%s; \ +fvtable-verify=preinit:vtv_start.o%s; \ +fvtable-verify=std:vtv_start.o%s}" #undef ENDFILE_SPEC #define ENDFILE_SPEC \ "%{Ofast|ffast-math|funsafe-math-optimizations:crtfastmath.o%s}\ %{!shared:%:if-exists(default-manifest.o%s)}\ + %{fvtable-verify=none:%s; \ +fvtable-verify=preinit:vtv_end.o%s; \ +fvtable-verify=std:vtv_end.o%s} \ crtend.o%s" /* Normally, -lgcc is not needed since everything in it is in the DLL, but we @@ -81,6 +87,8 @@ along with GCC; see the file COPYING3. %{pthread: } \ -lcygwin \ %{mwindows:-lgdi32 -lcomdlg32} \ + %{fvtable-verify=preinit:-lvtv -lpsapi; \ +fvtable-verify=std:-lvtv -lpsapi} \ -ladvapi32 -lshell32 -luser32 -lkernel32" /* To implement C++ function replacement we always wrap the cxx Index: gcc/config/i386/mingw-w64.h === --- gcc/config/i386/mingw-w64.h (Revision 214408) +++ gcc/config/i386/mingw-w64.h (Arbeitskopie) @@ -32,7 +32,10 @@ along with GCC; see the file COPYING3. 
%{!shared:%{!mdll:%{!municode:crt2%O%s}}} \ %{!shared:%{!mdll:%{municode:crt2u%O%s}}} \ %{pg:gcrt2%O%s} \ - crtbegin.o%s" + crtbegin.o%s \ + %{fvtable-verify=none:%s; \ +fvtable-verify=preinit:vtv_start.o%s; \ +fvtable-verify=std:vtv_start.o%s}" /* Enable multilib. */ @@ -43,6 +46,8 @@ along with GCC; see the file COPYING3. #define LIB_SPEC "%{pg:-lgmon} %{" SPEC_PTHREAD1 ":-lpthread} " \ "%{" SPEC_PTHREAD2 ": } " \ "%{mwindows:-lgdi32 -lcomdlg32} " \ + "%{fvtable-verify=preinit:-lvtv -lpsapi; \ +fvtable-verify=std:-lvtv -lpsapi} " \ "-ladvapi32 -lshell32 -luser32 -lkernel32" #undef SPEC_32 Index: gcc/config/i386/mingw32.h === --- gcc/config/i386/mingw32.h (Revision 214408) +++ gcc/config/i386/mingw32.h (Arbeitskopie) @@ -91,6 +91,8 @@ along with GCC; see the file COPYING3. #define LIB_SPEC "%{pg:-lgmon} %{" SPEC_PTHREAD1 ":-lpthread} " \ "%{" SPEC_PTHREAD2 ": } " \ "%{mwindows:-lgdi32 -lcomdlg32} " \ + "%{fvtable-verify=preinit:-lvtv -lpsapi; \ +fvtable-verify=std:-lvtv -lpsapi} " \ "-ladvapi32 -lshell32 -luser32 -lkernel32" /* Weak symbols do not get resolved if using a Windows dll import lib. @@ -143,12 +145,18 @@ along with GCC; see the file COPYING3. #undef STARTFILE_SPEC #define STARTFILE_SPEC "%{shared|mdll:dllcrt2%O%s} \ %{!shared:%{!mdll:crt2%O%s}} %{pg:gcrt2%O%s} \ - crtbegin.o%s" + crtbegin.o%s \ + %{fvtable-verify=none:%s; \ +fvtable-verify=preinit:vtv_start.o%s; \ +fvtable-verify=std:vtv_start.o%s}" #undef ENDFILE_SPEC #define ENDFILE_SPEC \ "%{Ofast|ffast-math|funsafe-math-optimizations:crtfastmath.o%s} \ %{!shared:%:if-exists(default-manifest.o%s)}\ + %{fvtable-verify=none:%s; \ +fvtable-verify=preinit:vtv_end.o%s; \ +fvtable-verify=std:vtv_end.o%s} \ crtend.o%s" /* Override startfile prefix defaults. 
*/ Index: gcc/cp/vtable-class-hierarchy.c === --- gcc/cp/vtable-class-hierarchy.c (Revision 214408) +++ gcc/cp/vtable-class-hierarchy.c (Arbeitskopie) @@ -1182,7 +1182,11 @@ vtv_generate_init_routine (void) TREE_STATIC (vtv_fndecl) = 1; TREE_USED (vtv_fndecl) = 1; DECL_PRESERVE_P (vtv_fndecl) = 1; +#if defined (TARGET_PECOFF) + if (flag_vtable_verify == VTV_PREINIT_PRIORITY && !TARGET_PECOFF) +#else if (flag_vtable_verify == VTV_PREINIT_PRIORITY) +#endif DECL_STATIC_CONSTRUCTOR (vtv_fndecl) = 0; gimplify_function_tree (vtv_fndecl); @@ -1190,7 +1194,11 @@ vtv_generate_init_routine (void) cgraph_process_new_functions (); +#if defined (TARGET_PECOFF) + if (flag_vtable_verify == VTV_PREINIT_PRIORITY && !TARGET_PECOFF) +#else if (flag_vtable_verify == VTV_PREINIT_PRIORITY) +#endif assem
[RFC] POWER8 default for PPC64LE
The PPC64LE ABI specifies POWER8 ISA as the minimum hardware requirement. Currently, Linux distributions are building the toolchain using --with-cpu=power7 or power8, as they wish. GCC defaults to essentially the POWER4 ISA.

The appended patch changes the default for PPC64LE to POWER8 (ISA 2.7). 32-bit PPC SVR4 is not really defined, but it is left unchanged with no minimum ISA.

The default ISA can be overridden using --with-cpu= and we presume that Linux distributions and users will continue to configure as they require for their deployment.

* config/rs6000/default64.h (TARGET_DEFAULT) [LITTLE_ENDIAN]: Use ISA 2.7 (POWER8).

Thanks, David

Index: default64.h
===
--- default64.h (revision 219607)
+++ default64.h (working copy)
@@ -20,7 +20,7 @@
 #if (TARGET_DEFAULT & MASK_LITTLE_ENDIAN)
 #undef TARGET_DEFAULT
-#define TARGET_DEFAULT (MASK_PPC_GFXOPT | MASK_POWERPC64 | MASK_64BIT | MASK_LITTLE_ENDIAN)
+#define TARGET_DEFAULT (ISA_2_7_MASKS_SERVER | MASK_POWERPC64 | MASK_64BIT | MASK_LITTLE_ENDIAN)
 #else
 #undef TARGET_DEFAULT
 #define TARGET_DEFAULT (MASK_PPC_GFXOPT | MASK_POWERPC64 | MASK_64BIT)
Re: [RFC] POWER8 default for PPC64LE
On 01/14/15 13:32, David Edelsohn wrote:
> The PPC64LE ABI specifies POWER8 ISA as the minimum hardware requirement.
> Currently, Linux distributions are building the toolchain using
> --with-cpu=power7 or power8, as they wish. GCC defaults to essentially
> the POWER4 ISA.
>
> The appended patch changes the default for PPC64LE to POWER8 (ISA 2.7).
> 32-bit PPC SVR4 is not really defined, but it is left unchanged with no
> minimum ISA.
>
> The default ISA can be overridden using --with-cpu= and we presume that
> Linux distributions and users will continue to configure as they require
> for their deployment.
>
> * config/rs6000/default64.h (TARGET_DEFAULT) [LITTLE_ENDIAN]: Use ISA 2.7 (POWER8).

Given you've got a new ABI in play here, seems like the perfect time to bump the default ISA to something reasonable.

jeff
Re: [PATCH] add option to emit more array bounds warnings
On 01/14/15 00:48, Martin Uecker wrote: If you plan to contribute regularly, you should go ahead and apply for write access to the repository so that you'll be able to commit your own patches once they're approved. I put a request in with you as sponsor (hope this is ok). Of course. You'll also need to make sure you have an assignment on file with the FSF. That patch was pretty small (the testcase was larger than the patch itself, which I always like :-) so I didn't request an assignment. Further submissions likely will require an assignment. I already have an assignment on file. Excellent. Jeff
Re: [C++ Patch/RFC] PR 58671
On 01/14/2015 10:08 AM, Paolo Carlini wrote: I can look again, but as far as I remember nothing is clearing it, it just stays false because the ICE happens while we process the 'i' on the right hand side and the DECL_INITIALIZED_P becomes true only in cp_finish_decl. Ah, please say that in the comment. OK with that change. Jason
Re: [PATCH] Fix PR c++/16160
On 01/14/2015 11:28 AM, Patrick Palka wrote: Second, since the user probably intended to have written an explicit template instantiation (as in the PR), the FE should suggest adding "template" before such a declaration, that is the declaration struct X<5>; // error + suggest adding "template" Actually, I think in pre-standard days this declared a specialization, before template<> was required. So I think we want to treat it as a specialization in this case as well. Jason
Re: [PATCH] PR59448 - Promote consume to acquire
On 01/14/2015 01:28 PM, Joseph Myers wrote: On Wed, 14 Jan 2015, Andrew MacLeod wrote: - There is a warning for invalid memory models already, so I just continued using that. - I remove the check for CONSUME in exchange since the current standard makes no mention of that being illegal. - I also reversed the current check in compare_exchange to check for failure > success first, allowing us to still catch both errors if present. I think this brings us to where we ought to be... at least almost :-) The latest version I have is n3337, which still specifies that atomic_clear can't be memory_order_acquire or memory_order_acq_rel. Has that been updated to specify that memory_order_consume is not allowed either? I think there was a request in at some point... I can add that if so. Bootstraps on x86_64-unknown-linux-gnu, and no regressions in the testsuite. OK for trunk? OK. checked in after disallowing memory_order_consume on atomic_clear() and an additional test in the testcase for that... Andrew
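As a rough illustration of the memory-model constraints discussed in this thread, here is a minimal C11 sketch. It is not taken from the patch or its testcase, and the names counter, busy, try_increment and take_and_release_flag are made up for the example. It shows a compare-exchange whose failure order is no stronger than its success order, and a flag clear that uses a release order rather than acquire or acq_rel.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical shared state, for illustration only.  */
static atomic_int counter = 0;
static atomic_flag busy = ATOMIC_FLAG_INIT;

/* Try to bump COUNTER from EXPECTED to EXPECTED + 1.
   Success order: acq_rel.  Failure order: acquire, which is not
   stronger than the success order and is not a release order.  */
bool
try_increment (int expected)
{
  int desired = expected + 1;
  return atomic_compare_exchange_strong_explicit (&counter, &expected,
                                                  desired,
                                                  memory_order_acq_rel,
                                                  memory_order_acquire);
}

/* Set the flag, then clear it again.  The clear uses release;
   acquire and acq_rel are not valid orders for a clear.
   Returns the value the flag had before it was set.  */
bool
take_and_release_flag (void)
{
  bool was_set = atomic_flag_test_and_set_explicit (&busy,
                                                    memory_order_acquire);
  atomic_flag_clear_explicit (&busy, memory_order_release);
  return was_set;
}
```

Swapping the two orders in try_increment, or passing an acquire order to the clear, is the kind of misuse the invalid-memory-model warning mentioned above is meant to catch.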
Re: Patch ping...
On 01/14/2015 12:30 AM, Jan Hubicka wrote: I would like to ping the patch to fix divergence between a type and its main variant introduced by C++ FE. OK. Jason
Re: [RFC, PATCH][LRA, MIPS] ICE: in decompose_normal_address, at rtlanal.c:5817
On 01/14/15 10:10, Robert Suchanek wrote: Here is the revised patch that would handle the other cases as per Richard's comments. I slightly modified Matthew's proposed patch and used split_const instead of get_related_value. AFAICS, the canonical form would always have the 'plus' expression. The offset on the high part is most likely not important as the code generation has to guarantee that the low part represents the true address in the case where the high and lo_sum are directly related. Regards, Robert gcc/ * simplify-rtx.c (simplify_replace_fn_rtx): Simplify (lo_sum (high x) y) to y if x and y have the same base. gcc/testsuite/ * gcc.c-torture/compile/20150108.c: New test. OK. The MIPS and Sparc ports are probably going to hit this the hardest. So you've got a vested interest in dealing with any fallout :-) jeff
Re: [PATCH] Correct target selector in -mfentry tests
On 01/14/15 04:38, H.J. Lu wrote: On Tue, Jan 13, 2015 at 11:15 PM, Jeff Law wrote: On 01/13/15 14:27, H.J. Lu wrote: -fprofile -mfentry works with PIE if gcrt1.o is compiled with -fPIC. A glibc bug has been filed, PR 17836, and a glibc patch has been submitted. OK for trunk? Thanks. H.J. -- * gcc.target/i386/fentry-override.c: Properly place {} in target selector. Remove nonpic. * gcc.target/i386/fentry.c: Likewise. Does this change the pass/fail result of the test on a system without an updated glibc? Yes, they pass with the current glibc since they are compile tests. OK for trunk. Thanks, Jeff
Re: [RFC] Tighten memory type assumption in RTL combiner pass.
On 01/14/15 04:27, Venkataramanan Kumar wrote: Hi all, When trying to debug GCC combiner pass with the test case in PR63949 ref https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63949 I came across this code. This code in "make_compound_operation" assumes that all PLUS and MINUS RTX are "MEM" type for scalar int modes and tries to optimize based on that assumption. /* Select the code to be used in recursive calls. Once we are inside an address, we stay there. If we have a comparison, set to COMPARE, but once inside, go back to our default of SET. */ next_code = (code == MEM ? MEM : ((code == PLUS || code == MINUS) && SCALAR_INT_MODE_P (mode)) ? MEM : ((code == COMPARE || COMPARISON_P (x)) && XEXP (x, 1) == const0_rtx) ? COMPARE : in_code == COMPARE ? SET : in_code); next_code is passed as in_code via recursive calls to "make_compound_operation". Based on that we are converting shift pattern to MULT pattern. case ASHIFT: /* Convert shifts by constants into multiplications if inside an address. */ if (in_code == MEM && CONST_INT_P (XEXP (x, 1)) && INTVAL (XEXP (x, 1)) < HOST_BITS_PER_WIDE_INT && INTVAL (XEXP (x, 1)) >= 0 && SCALAR_INT_MODE_P (mode)) { Now I tried to tighten it further by adding check to see in_code is also MEM type. Not sure if this right thing to do. But this assumption about MEM seems to be very relaxed before. diff --git a/gcc/combine.c b/gcc/combine.c index 101cf35..1353f54 100644 --- a/gcc/combine.c +++ b/gcc/combine.c @@ -7696,7 +7696,8 @@ make_compound_operation (rtx x, enum rtx_code in_code) next_code = (code == MEM ? MEM : ((code == PLUS || code == MINUS) - && SCALAR_INT_MODE_P (mode)) ? MEM + && SCALAR_INT_MODE_P (mode) + && (in_code == MEM)) ? MEM : ((code == COMPARE || COMPARISON_P (x)) && XEXP (x, 1) == const0_rtx) ? COMPARE : in_code == COMPARE ? SET : in_code); This passed bootstrap on x86_64 and GCC tests are not regressing. 
On AArch64 it passed bootstrap and the test case in the PR passed, but a few tests failed (failed to generate adds and subs), because there are patterns (extended adds and subs) based on multiplication only in the AArch64 backend. If this change is correct then I may need to add patterns in AArch64 based on shifts. Not sure about other targets either. Requesting further comments/help about this. I am looking to get it fixed in stage 1. So the first question I would ask here is what precisely are you trying to accomplish? Is there some code where making this change is important or is it strictly a theoretical problem? If the latter, can we make it concrete with a well crafted testcase? jeff
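For readers unfamiliar with the shift-to-multiplication conversion quoted above, the following sketch shows the arithmetic identity it relies on, using plain unsigned integers rather than RTL. The function names are hypothetical and nothing here is combine.c code; it only illustrates why a shift by a constant and a multiplication by the matching power of two are interchangeable forms.

```c
#include <stdint.h>

/* Index scaled with a shift, as the ASHIFT form would express it.
   SHIFT must be less than 64.  */
uint64_t
scaled_by_shift (uint64_t index, unsigned shift)
{
  return index << shift;
}

/* The same scaling written as a multiplication by 2**SHIFT, the form
   the quoted code produces once it is inside an address.
   SHIFT must be less than 64.  */
uint64_t
scaled_by_mult (uint64_t index, unsigned shift)
{
  return index * ((uint64_t) 1 << shift);
}
```

Both functions wrap modulo 2**64 in the same way, so they agree for every index and every shift below 64. Which form a backend's patterns expect is a separate question, and that is what the AArch64 failures above are about.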
Re: [PATCH][testsuite] Fix oversized bitfield warning.
On 01/14/15 03:54, Matthew Wahab wrote: Hello, Test case g++.dg/torture/20141013.C (added https://gcc.gnu.org/ml/gcc-patches/2014-10/msg01190.html) triggers the warning -- 20141013.C:45:23: warning: width of 'tree_base::code' exceeds its type -- on arm-none-eabi. The code specifies a bitfield of size 16 with an enum as the underlying type. On arm-none-eabi, enums are packed by default (-fshort-enums) so the bitfield is oversized and the warning is correct. This patch adds -fno-short-enums to the compiler options for the test case. Testing: Ran g++.dg/torture/dg-torture.exp for arm-none-eabi and arm-none-linux-gnueabihf. Matthew 2015-01-13 Matthew Wahab * testsuite/g++.dg/torture/20141013.C: Set -fno-short-enums. OK for the trunk. Please install. Thanks, Jeff
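A small sketch of the situation described above, written in C for brevity (the real test is C++). The enum and struct names are invented for the example and are not the ones in 20141013.C. It assumes the usual GCC behaviour that, without -fshort-enums, an enumeration like this one is as wide as int.

```c
#include <limits.h>
#include <stddef.h>

/* A small enumeration: every value fits in one byte.  */
enum demo_tree_code { DEMO_FIRST, DEMO_LAST = 200 };

/* A 16-bit bit-field whose declared type is the enumeration
   (enum-typed bit-fields are implementation-defined in C).
   If the enum is packed into a single byte, as -fshort-enums does
   for this range of values, the width of 16 exceeds the type.  */
struct demo_tree_base
{
  enum demo_tree_code code : 16;
};

/* Width in bits of the enumeration type on this target and with
   the options in effect.  */
size_t
demo_enum_bits (void)
{
  return sizeof (enum demo_tree_code) * CHAR_BIT;
}
```

When demo_enum_bits returns at least 16 the bit-field fits, which is the state -fno-short-enums restores for the test case.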
[Patch] Missing plugin header files
I tried compiling an empty plugin that just included gcc-plugin.h and plugin-version.h and found that these header files were included from gcc-plugin.h but not in the list of header files to be copied to the plugin include directory. OK to checkin? Steve Ellcey sell...@imgtec.com 2015-01-14 Steve Ellcey * Makefile.in (PLUGIN_HEADERS): Add dominance.h, cfg.h, cfgrtl.h, cfganal.h, cfgbuild.h, cfgcleanup.h, lcm.h, builtins.def, chkp-builtins.def, and pass-instances.def diff --git a/gcc/Makefile.in b/gcc/Makefile.in index 44a4214..abe2d0d 100644 --- a/gcc/Makefile.in +++ b/gcc/Makefile.in @@ -3228,7 +3228,8 @@ PLUGIN_HEADERS = $(TREE_H) $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) \ tree-ssa-loop.h tree-ssa-loop-ivopts.h tree-ssa-loop-manip.h \ tree-ssa-loop-niter.h tree-ssa-ter.h tree-ssa-threadedge.h \ tree-ssa-threadupdate.h inchash.h wide-int.h signop.h hash-map.h \ - hash-set.h pass-instances.def + hash-set.h dominance.h cfg.h cfgrtl.h cfganal.h cfgbuild.h cfgcleanup.h \ + lcm.h builtins.def chkp-builtins.def pass-instances.def # generate the 'build fragment' b-header-vars s-header-vars: Makefile
Re: [PATCH] Fix PR 61225
On 12/10/14 06:47, Segher Boessenkool wrote: On Tue, Dec 09, 2014 at 12:15:30PM -0700, Jeff Law wrote: @@ -3323,7 +3396,11 @@ try_combine (rtx_insn *i3, rtx_insn *i2, rtx_insn *i1, rtx_insn *i0, rtx old = newpat; total_sets = 1 + extra_sets; newpat = gen_rtx_PARALLEL (VOIDmode, rtvec_alloc (total_sets)); - XVECEXP (newpat, 0, 0) = old; + + if (to_combined_insn) + XVECEXP (newpat, 0, --total_sets) = old; + else + XVECEXP (newpat, 0, 0) = old; } Is this correct? If so, it needs a big fat comment, because it is not exactly obvious :-) Also, it doesn't handle at all the case where the new pattern already is a PARALLEL; can that never happen? I'd convinced myself it was. But yes, a comment here would be good. Presumably you're thinking about a PARALLEL that satisfies single_set_p? I wasn't thinking about anything in particular; this code does not handle a PARALLEL newpat with to_combined_insn correctly, and it doesn't say it cannot happen. It's situations like this where I really need to just put the damn patch into my tree and fire up the debugger and poke at it for a while. Regardless, I got mail from Zhenqiang that he left ARM at the start of the year for other opportunities and won't be doing GCC work. My initial thought is to attach his work to date to the BZ, we can use it as a starting point if we want to pursue this missed optimization further (it's a regression and thus suitable for stage4 if we're so inclined). Thoughts? jeff
[PATCH committed] Sync include/libiberty.h with Binutils
Hi! This pulls libiberty.h's copyright update from Binutils, so that the file is synced again. 2015-12-14 Jan-Benedict Glaw * libiberty.h: Merge Copyright year update from Binutils. diff --git a/include/ChangeLog b/include/ChangeLog index dbf2554..c1011b9 100644 --- a/include/ChangeLog +++ b/include/ChangeLog @@ -1,3 +1,7 @@ +2015-12-14 Jan-Benedict Glaw + + * libiberty.h: Merge Copyright year update from Binutils. + 2014-12-24 Uros Bizjak Ben Elliston Manuel Lopez-Ibanez diff --git a/include/libiberty.h b/include/libiberty.h index aa0d92c..b33dd65 100644 --- a/include/libiberty.h +++ b/include/libiberty.h @@ -1,7 +1,6 @@ /* Function declarations for libiberty. - Copyright 1997, 1998, 1999, 2000, 2001, 2002, 2003, 2004, 2005, - 2006, 2007, 2008, 2009, 2010, 2011, 2013 Free Software Foundation, Inc. + Copyright (C) 1997-2015 Free Software Foundation, Inc. Note - certain prototypes declared in this header file are for functions whoes implementation copyright does not belong to the -- Jan-Benedict Glaw jbg...@lug-owl.de +49-172-7608481 Signature of: Alles sollte so einfach wie möglich gemacht sein. the second : Aber nicht einfacher. (Einstein)
Re: Patch: Some potential warnings for C++ bootstrap
On 11/14/14 13:41, Kai Tietz wrote: Hello, this patch fixes some potential warnings for C++ bootstrap. I noticed them while working on the delayed folding for C++-FE on bootstrap. ChangeLog 2014-11-14 Kai Tietz * dwarf2out.c (output_loc_operands): Make sure that comparison is done on same sign. * varasm.c (default_assemble_integer): Likewise. * optabs.c (expand_subword_shift): Likewise. (expand_doubleword_shift): Likewise. (expand_binop): Likewise. * tree-complex.c (create_one_component_var): Fix potential sequence-point issue. Regression tested on x86_64-unknown-linux-gnu. OK to apply? Yes, these are OK. Sorry for the delay, Jeff
Re: [PATCH] Fix PR c++/16160
On Wed, Jan 14, 2015 at 4:28 PM, Jason Merrill wrote: > On 01/14/2015 11:28 AM, Patrick Palka wrote: >> >> Second, since the user probably intended to >> have written an explicit template instantiation (as in the PR), the FE >> should suggest adding "template" before such a declaration, that is the >> declaration >> >> struct X<5>; // error + suggest adding "template" > > > Actually, I think in pre-standard days this declared a specialization, > before template<> was required. So I think we want to treat it as a > specialization in this case as well. Did this define a specialization too: struct X<5> { }; or was template<> always required here? > > Jason >
[wwwdocs] Add porting_to.html for GCC 5 (again)
A few months ago I posted the "porting to" document for GCC 5. But I never got around to commit it, so here it is again, this time with feeling. I fixed description of GNU89 "extern inline", added some more snippets, and mentioned -Wc9?-c??-compat warnings. I plan to commit this tomorrow. --- porting_to.html.mp 2014-10-22 17:25:42.122367884 +0200 +++ porting_to.html 2015-01-14 22:38:24.939867012 +0100 @@ -0,0 +1,255 @@ + + + +Porting to GCC 5 + + + +Porting to GCC 5 + + +The GCC 5 release series differs from previous GCC releases in +a number of ways. Some of +these are a result of bug fixing, and some old behaviors have been +intentionally changed in order to support new standards, or relaxed +in standards-conforming ways to facilitate compilation or run-time +performance. Some of these changes are not visible to the naked eye +and will not cause problems when updating from older versions. + + + +However, some of these changes are visible, and can cause grief to +users porting to GCC 5. This document is an effort to identify major +issues and provide clear solutions in a quick and easily searched +manner. Additions and suggestions for improvement are welcome. + + +C language issues + +Default standard is now GNU11 + +GCC defaults to -std=gnu11 instead of -std=gnu89. +This brings several changes that the users should be aware of. The following +paragraphs describe some of these changes and suggest how to deal with them. + +Some users might prefer to stay with gnu89, in which case we suggest to use +the -std=gnu89 command-line option, perhaps by putting it in +override CFLAGS or similarly in the Makefile. + +To ease the migration process, GCC offers two new warning options, +-Wc90-c99-compat and -Wc99-c11-compat. The +former warns about features not present in ISO C90, but present in ISO C99 +and the latter warns about features not present in ISO C99, but present in +ISO C11. See the GCC manual for more info.
+ +Different semantics for inline functions +While -std=gnu89 employs the GNU89 inline semantics, +-std=gnu11 uses the C99 inline semantics. The C99 inline semantics +requires that if a function with external linkage is declared with +inline function specifier, it also has to be defined in the same +translation unit. Consequently, GCC now warns if it sees a TU such as the +following: + + + inline int foo (void); + + +This example now gives the following diagnostic: + + +f.c:1:12: warning: inline function 'foo' declared but never defined + inline int foo (void); + ^ + + +Furthermore, there is a difference between extern inline and +inline: + + C99 inline: no externally visible function is generated; + if the function is referenced in this TU, external definition has to + exist in another TU; same as GNU89 extern inline with no + redefinition; + C99 extern inline: externally visible function is generated; + same as GNU89 inline; + GNU89 inline: same as C99 extern inline; + GNU89 extern inline: no externally visible function is generated; + no equivalent in C99, because redefinition is not permitted. + + +(Fortunately static inline is the same in both C99 and GNU89.) +In other words, ISO C99 requires that exactly one C source file has the +callable copy of the inline function. Consider the following program: + + + inline int + foo (void) + { +return 42; + } + + int + main (void) + { +return foo (); + } + + +The program above will not link with the C99 inline semantics, because there +is not an out-of-line function foo generated. To fix this, either +mark the function foo as extern, or add the following +declaration: + + + extern inline int foo (void); + + +This ensures that an externally visible function be emitted. +To enforce the GNU89 inline semantics, you can either use the +-fgnu89-inline command-line option, or mark a function with the +gnu_inline attribute. 
For example: + + + __attribute__ ((gnu_inline)) inline int + foo (void) + { +return 42; + } + + +A program which used GNU89 extern inline may fail in the new +standard due to multiple definition errors: + + + extern inline int + foo (void) + { +return 42; + } + + int + foo (void) + { +return 23; + } + + int + main (void) + { +return foo (); + } + + +Some warnings are enabled by default + +The C99 mode enables some warnings by default. For instance, GCC warns +about missing declarations of functions: + + + int + foo (void) + { +return bar (); + } + + +This example now gives the following diagnostic: + + +w.c:4:10: warning: implicit declaration of function 'bar' [-Wimplicit-function-declaration] + return bar (); + ^ + + +To suppress this warning add the proper declaration: + + + int bar (void); + + +or use -Wno-implicit-function-declaration. + +Another warning that is now turned on by default is the warning about +implicit int, as in the following snippet: + + + foo (u) + { +return u; + } + + +This example
Re: [PATCH] testsuite/lib/target-supports.exp: Fix check_effective_target_lto
On 12 Jan 14:30, Jeff Law wrote: > On 01/11/15 12:26, Ilya Verbin wrote: > >On 09 Jan 10:29, Thomas Schwinge wrote: > >>As this was the only use of ENABLE_LTO in the testsuite, I suggest to > >>also remove it from the gcc/Makefile.in:site.exp rule. > > > >Done. > >Here is an updated and retested patch. OK for trunk? > > > > > >gcc/ > > * Makefile.in (site.exp): Do not set ENABLE_LTO. > >gcc/testsuite/ > > * lib/target-supports.exp (check_effective_target_lto): Check for -flto > > option support instead of ENABLE_LTO from Makefile. > OK. > Jeff This patch caused: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64605 -- Ilya
Re: [RFC] Tighten memory type assumption in RTL combiner pass.
On 01/14/2015 03:27 AM, Venkataramanan Kumar wrote: >next_code = (code == MEM ? MEM >: ((code == PLUS || code == MINUS) > - && SCALAR_INT_MODE_P (mode)) ? MEM > + && SCALAR_INT_MODE_P (mode) > + && (in_code == MEM)) ? MEM >: ((code == COMPARE || COMPARISON_P (x)) > && XEXP (x, 1) == const0_rtx) ? COMPARE >: in_code == COMPARE ? SET : in_code); Isn't this change the same as simply deleting the condition? If we're testing in_code == MEM, isn't that the same as just returning in_code, as the last condition does? Seconded Law's request for more information... r~
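The equivalence claimed above can be checked on a small model of the conditional chain. This is only a sketch: the enum values are hypothetical stand-ins for the rtx codes, SCALAR_INT_MODE_P is reduced to a boolean, and the COMPARE arm is simplified to "code is COMPARE and operand 1 is zero" (comparison codes are distinct from PLUS and MINUS, so folding them into one value does not affect the argument).

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the rtx codes involved.  */
enum rcode { R_MEM, R_PLUS, R_MINUS, R_COMPARE, R_SET, R_OTHER };

/* The chain as it stands before the proposed change.  */
enum rcode
next_code_original (enum rcode code, enum rcode in_code,
                    bool scalar_int, bool op1_zero)
{
  return (code == R_MEM ? R_MEM
          : ((code == R_PLUS || code == R_MINUS) && scalar_int) ? R_MEM
          : (code == R_COMPARE && op1_zero) ? R_COMPARE
          : in_code == R_COMPARE ? R_SET : in_code);
}

/* The chain with the proposed extra "in_code == MEM" test.  */
enum rcode
next_code_patched (enum rcode code, enum rcode in_code,
                   bool scalar_int, bool op1_zero)
{
  return (code == R_MEM ? R_MEM
          : ((code == R_PLUS || code == R_MINUS) && scalar_int
             && in_code == R_MEM) ? R_MEM
          : (code == R_COMPARE && op1_zero) ? R_COMPARE
          : in_code == R_COMPARE ? R_SET : in_code);
}

/* The chain with the PLUS/MINUS arm deleted altogether.  */
enum rcode
next_code_deleted (enum rcode code, enum rcode in_code, bool op1_zero)
{
  return (code == R_MEM ? R_MEM
          : (code == R_COMPARE && op1_zero) ? R_COMPARE
          : in_code == R_COMPARE ? R_SET : in_code);
}

/* True if the patched and deleted variants agree on every input
   of the model.  */
bool
arms_agree (void)
{
  for (int code = R_MEM; code <= R_OTHER; code++)
    for (int in_code = R_MEM; in_code <= R_OTHER; in_code++)
      for (int scalar = 0; scalar <= 1; scalar++)
        for (int zero = 0; zero <= 1; zero++)
          if (next_code_patched ((enum rcode) code, (enum rcode) in_code,
                                 scalar, zero)
              != next_code_deleted ((enum rcode) code, (enum rcode) in_code,
                                    zero))
            return false;
  return true;
}
```

In this model the patched arm only fires when in_code is already MEM, and in that case the final arm would have returned in_code, that is MEM, anyway. The original chain differs from both, for example for a PLUS seen while in_code is SET.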
Re: [PATCH/expand] PR64011 Adjust bitsize when partial overflow happen for big-endian
On 13/01/15 21:45, Jeff Law wrote: On 01/09/15 06:39, Jiong Wang wrote: the bug testcase is === typedef short U __attribute__((may_alias, aligned (1))); struct S { _Complex float d __attribute__((aligned (8))); }; void bar(struct S); void f5 (int x) { struct S s; ((U *)((char *) &s.d + 1))[3] = x; bar (s); } So I'm going to focus on that assignment statement. Doesn't that write outside the bounds of "s"? If I'm reading everything correctly we have an object that is 8 bytes (a complex float). The assignment is a 2 byte write starting at byte 7 in the object. ISTM that writes one byte beyond the underlying object, at which point we're in undefined behaviour territory. In many ways having the compiler or assembler spitting out an error here is preferable to silently compiling the code. That would also help explain why we haven't seen this on other big endian targets with rich bitfield support (PA and H8 come to mind) -- it only arises in cases of undefined behaviour AFAICT. What I do not like is the ICE or unrecognizable insn error. Currently, if a backend defines "insv" with strict operand constraints that reject a negative immediate, ARM for example, it will ICE with an unrecognizable insn error. I'm really tempted here to use the conditional you want to add to store_bit_field_using_insv and when it triggers issue an error/warning instead of or in addition to truncating the size of the assignment. Agreed, and I think the truncation is needed, otherwise there may be an ICE on some targets. I also found that the current GCC LOCATION info is very good. I did an experimental hack at "expand_assignment": 4931, where the tree is expanded, and GCC could give a quite useful and accurate warning based on the tree LOCATION info. ./cc1 -O2 -mbig-endian pr48335-2.c pr48335-2.c: In function ‘f5’: pr48335-2.c:19:29: warning: overflow here ! ((U *)((char *) &s.d + 1))[3] = x; ^ However, we need to add the warning at store_bit_field_using_insv, where there is no accurate LOCATION info, but it looks acceptable?
pr48335-2.c:19:33: warning: overflow here ! ((U *)((char *) &s.d + 1))[3] = x; ^ Thoughts? jeff
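The byte arithmetic behind the "2 byte write starting at byte 7" observation can be spelled out as follows. This is a sketch that only computes offsets and never performs the store; the helper names are invented, and it assumes a 2-byte short and an 8-byte _Complex float, as on the targets discussed in this thread.

```c
#include <stddef.h>

/* The test case also marks this type may_alias, aligned (1).  */
typedef short U;

struct S
{
  _Complex float d __attribute__ ((aligned (8)));
};

/* Byte offset, relative to the start of s.d, of the first byte
   written by:  ((U *) ((char *) &s.d + 1))[3] = x;  */
size_t
store_start (void)
{
  return 1 + 3 * sizeof (U);
}

/* Offset one past the last byte written by that store.  */
size_t
store_end (void)
{
  return store_start () + sizeof (U);
}

/* Nonzero if the store runs past the end of the object.  */
int
store_overflows (void)
{
  return store_end () > sizeof (struct S);
}
```

Under those assumptions the store covers bytes 7 and 8 of an object whose last valid byte is 7, so it reaches one byte past the end.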
RE: [PATCH,MIPS] Add support for the R6 LSA and DLSA instructions
> -Original Message- > From: Matthew Fortune [mailto:matthew.fort...@imgtec.com] > Sent: Monday, January 12, 2015 10:35 AM > To: Moore, Catherine > Cc: 'gcc-patches@gcc.gnu.org' (gcc-patches@gcc.gnu.org) > Subject: [PATCH,MIPS] Add support for the R6 LSA and DLSA instructions > > This patch adds support for the R6 [D]LSA instructions. The support has been > structured to allow MSA (when implemented) to turn on the same > instructions as they are also added by the MSA ASE. > > I have continued to use the idea of 'ghost' options in the testsuite to > indicate > what features are required rather than arch revisions. > > Thanks, > Matthew > > gcc/ > > * config/mips/mips.c (mips_rtx_costs): Set costs for LSA/DLSA. > (mips_print_operand): Support 'y' to print exact log2 in decimal > of a const_int. > * config/mips/mips.h (ISA_HAS_LSA): New define. > (ISA_HAS_DLSA): Likewise. > * config/mips/mips.md (lsa): New define_insn. > * config/mips/predicates.md (const_immlsa_operand): New > predicate. > This patch is fine.
Re: [PATCH][AArch64] Improve bit-test-branch pattern to avoid unnecessary register clobber
On 12/15/2014 07:36 AM, Jiong Wang wrote:
> +  char buf[64];
> +  uint64_t val = ((uint64_t) 1) << UINTVAL (operands[1]);
> +  sprintf (buf, "tst\t%%0, %"PRId64, val);
> +  output_asm_insn (buf, operands);
> +  return "\t%l2";

Better to simply modify the operand, as in

  operands[1] = GEN_INT (HOST_WIDE_INT_1U << UINTVAL (operands[1]));
  return "tst\t%0, %1\;\t%l2";

r~
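Both variants rest on the same arithmetic: the bit position is turned into a single-bit mask, and the test instruction ANDs the register with that mask. A rough stand-alone model in plain C follows; bit_test_mask and bit_is_set are illustrative names, not GCC or AArch64 backend functions.

```c
#include <assert.h>
#include <stdint.h>

/* Plain C model, not GCC internals: turn a bit position into the
   immediate mask that a "tst"-style instruction would test.  */
uint64_t
bit_test_mask (unsigned bitpos)
{
  assert (bitpos < 64);   /* shifting by 64 or more is undefined */
  return UINT64_C (1) << bitpos;
}

/* Nonzero if bit BITPOS of VALUE is set, i.e. the AND of VALUE with
   the single-bit mask is nonzero.  */
int
bit_is_set (uint64_t value, unsigned bitpos)
{
  return (value & bit_test_mask (bitpos)) != 0;
}
```

Keeping the mask unsigned throughout avoids the signed/unsigned mismatch that the quoted sprintf with PRId64 would have for bit 63.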
[doc, committed] reclassify -fuse-ld= as a linker option
I noticed that -fuse-ld= was incorrectly classified as an optimization option in the manual. It's a driver-only option so it seems better to list it with the linker options. I've checked in this patch to move it (there's no change to the actual content of the documentation, just its location). -Sandra 2015-01-14 Sandra Loosemore gcc/ * doc/invoke.texi (Option Summary): Reclassify -fuse-ld as a linker option. (Optimization Options): Move -fuse-ld documentation to... (Link Options): ...here. Index: gcc/doc/invoke.texi === --- gcc/doc/invoke.texi (revision 219497) +++ gcc/doc/invoke.texi (working copy) @@ -446,7 +446,7 @@ Objective-C and Objective-C++ Dialects}. -funit-at-a-time -funroll-all-loops -funroll-loops @gol -funsafe-loop-optimizations -funsafe-math-optimizations -funswitch-loops @gol -fipa-ra -fvariable-expansion-in-unroller -fvect-cost-model -fvpt @gol --fweb -fwhole-program -fwpa -fuse-ld=@var{linker} -fuse-linker-plugin @gol +-fweb -fwhole-program -fwpa -fuse-linker-plugin @gol --param @var{name}=@var{value} -O -O0 -O1 -O2 -O3 -Os -Ofast -Og} @@ -472,7 +472,7 @@ Objective-C and Objective-C++ Dialects}. @item Linker Options @xref{Link Options,,Options for Linking}. -@gccoptlist{@var{object-file-name} -l@var{library} @gol +@gccoptlist{@var{object-file-name} -fuse-ld=@var{linker} -l@var{library} @gol -nostartfiles -nodefaultlibs -nostdlib -pie -rdynamic @gol -s -static -static-libgcc -static-libstdc++ @gol -static-libasan -static-libtsan -static-liblsan -static-libubsan @gol @@ -9324,14 +9324,6 @@ the comparison operation before register Enabled at levels @option{-O}, @option{-O2}, @option{-O3}, @option{-Os}. -@item -fuse-ld=bfd -@opindex fuse-ld=bfd -Use the @command{bfd} linker instead of the default linker. - -@item -fuse-ld=gold -@opindex fuse-ld=gold -Use the @command{gold} linker instead of the default linker. 
- @item -fcprop-registers @opindex fcprop-registers After register allocation and post-register allocation instruction splitting, @@ -10875,6 +10867,14 @@ If any of these options is used, then th object file names should not be used as arguments. @xref{Overall Options}. +@item -fuse-ld=bfd +@opindex fuse-ld=bfd +Use the @command{bfd} linker instead of the default linker. + +@item -fuse-ld=gold +@opindex fuse-ld=gold +Use the @command{gold} linker instead of the default linker. + @cindex Libraries @item -l@var{library} @itemx -l @var{library}
Fix spelling error in top level configure.ac
Hi

Not a huge patch. Just changes "developement" to "development" in a comment and error message. The larger issue is that it is in the top level configure.ac. :)

OK to commit and where?

2015-01-14  Joel Sherrill

	* configure.ac: Fix spelling error.
	* configure: Regenerate.

-- 
Joel Sherrill, Ph.D.             Director of Research & Development
joel.sherr...@oarcorp.com        On-Line Applications Research
Ask me about RTEMS: a free RTOS  Huntsville AL 35805
Support Available                (256) 722-9985

From 2245d177fb2d5b396168c56d5dfdacc48a1bf94e Mon Sep 17 00:00:00 2001
From: Joel Sherrill
Date: Wed, 14 Jan 2015 16:59:07 -0600
Subject: [PATCH] configure.ac: Fix spelling error

2015-01-14  Joel Sherrill

	* configure.ac: Fix spelling error.
	* configure: Regenerate.
---
 configure    | 4 ++--
 configure.ac | 4 ++--
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/configure b/configure
index 64d287d..1536c11 100755
--- a/configure
+++ b/configure
@@ -7553,7 +7553,7 @@ fi
 # multilib is not explicitly enabled.
 case "$target:$have_compiler:$host:$target:$enable_multilib" in
   x86_64-*linux*:yes:$build:$build:)
-    # Make sure we have a developement environment that handles 32-bit
+    # Make sure we have a development environment that handles 32-bit
     dev64=no
     echo "int main () { return 0; }" > conftest.c
     ${CC} -m32 -o conftest ${CFLAGS} ${CPPFLAGS} ${LDFLAGS} conftest.c
@@ -7564,7 +7564,7 @@ case "$target:$have_compiler:$host:$target:$enable_multilib" in
     fi
     rm -f conftest*
     if test x${dev64} != xyes ; then
-      as_fn_error "I suspect your system does not have 32-bit developement libraries (libc and headers). If you have them, rerun configure with --enable-multilib. If you do not have them, and want to build a 64-bit-only compiler, rerun configure with --disable-multilib." "$LINENO" 5
+      as_fn_error "I suspect your system does not have 32-bit development libraries (libc and headers). If you have them, rerun configure with --enable-multilib. If you do not have them, and want to build a 64-bit-only compiler, rerun configure with --disable-multilib." "$LINENO" 5
     fi
     ;;
 esac
diff --git a/configure.ac b/configure.ac
index 5badc7f..7c7ee18 100644
--- a/configure.ac
+++ b/configure.ac
@@ -2921,7 +2921,7 @@ fi
 # multilib is not explicitly enabled.
 case "$target:$have_compiler:$host:$target:$enable_multilib" in
   x86_64-*linux*:yes:$build:$build:)
-    # Make sure we have a developement environment that handles 32-bit
+    # Make sure we have a development environment that handles 32-bit
     dev64=no
     echo "int main () { return 0; }" > conftest.c
     ${CC} -m32 -o conftest ${CFLAGS} ${CPPFLAGS} ${LDFLAGS} conftest.c
@@ -2932,7 +2932,7 @@ case "$target:$have_compiler:$host:$target:$enable_multilib" in
     fi
     rm -f conftest*
     if test x${dev64} != xyes ; then
-      AC_MSG_ERROR([I suspect your system does not have 32-bit developement libraries (libc and headers). If you have them, rerun configure with --enable-multilib. If you do not have them, and want to build a 64-bit-only compiler, rerun configure with --disable-multilib.])
+      AC_MSG_ERROR([I suspect your system does not have 32-bit development libraries (libc and headers). If you have them, rerun configure with --enable-multilib. If you do not have them, and want to build a 64-bit-only compiler, rerun configure with --disable-multilib.])
     fi
     ;;
 esac
-- 
1.9.3
Re: [Ping] Port of VTV for Cygwin and MinGW
On Wed, Jan 14, 2015 at 12:28 PM, Patrick Wollgast wrote:
> On 14.01.2015 20:00, Ian Lance Taylor wrote:
>> On Thu, Jan 8, 2015 at 12:33 PM, Patrick Wollgast wrote:
>>> A short recap again:
>>>
>>> Latest patch, changelog and a test program (further information about
>>> the program in the mail):
>>> https://gcc.gnu.org/ml/gcc-patches/2014-11/msg03368.html
>>
>> In that patch, the change to varasm.c looks wrong if neither
>> OBJECT_FORMAT_ELF nor TARGET_PECOFF are defined. It looks like you've
>> dropped the switch_to_section call in that case.
>>
>> Ian
>>
>
> You're right. It should have been '#else' again, instead of 'else'
> before the switch_to_section call.

OK, the patches to varasm.c and cp/vtable-class-hierarchy.c are OK.

Thanks.

Ian
RFA: patch to fix a bad code generation for PR64110 -- new constraints addition
The problem of unexpected code generation is discussed on https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64110 The following patch introduces 2 new constraints '^' and '$' which are analogous to '?' and '!' but disfavor given alternative when *the operand with the new constraint* needs a reload ('?' and '!' disfavor the alternative if *any* operand needs a reload). I hope the new constraints will be useful for other insns and targets. The patch was successfully bootstrapped and tested on x86-64. I just need an approval for changes in sse.md, stmt.c, and genoutput.c Thanks. 2015-01-14 Vladimir Makarov PR rtl-optimization/64110 * stmt.c (parse_output_constraint): Process '^' and '$'. (parse_input_constraint): Ditto. * lra-constraints.c (process_alt_operands): Process the new constraints. * ira-costs.c (record_reg_classes): Process the new constraint '^'. * genoutput.c (indep_constraints): Add '^' and '$'. * config/i386/sse.md (*vec_dup): Use '$' instead of '!'. * doc/md.texi: Add description of the new constraints. 2015-01-14 Vladimir Makarov PR rtl-optimization/64110 * gcc.target/i386/pr64110.c: Add scan-assembler. Index: config/i386/sse.md === --- config/i386/sse.md (revision 219262) +++ config/i386/sse.md (working copy) @@ -16713,7 +16713,7 @@ (define_insn "*vec_dup" [(set (match_operand:AVX2_VEC_DUP_MODE 0 "register_operand" "=x,x,x") (vec_duplicate:AVX2_VEC_DUP_MODE - (match_operand: 1 "nonimmediate_operand" "m,x,!r")))] + (match_operand: 1 "nonimmediate_operand" "m,x,$r")))] "TARGET_AVX2" "@ vbroadcast\t{%1, %0|%0, %1} Index: doc/md.texi === --- doc/md.texi (revision 219262) +++ doc/md.texi (working copy) @@ -1503,6 +1503,18 @@ in it. Disparage severely the alternative that the @samp{!} appears in. This alternative can still be used if it fits without reloading, but if reloading is needed, some other alternative will be used. 
+ +@cindex @samp{^} in constraint +@cindex caret +@item ^ +This constraint is analogous to @samp{?} but it disparages slightly +the alternative only unless the corresponding operand applies exactly. + +@cindex @samp{$} in constraint +@cindex dollar sign +@item $ +This constraint is analogous to @samp{!} but it disparages severely +the alternative only unless the corresponding operand applies exactly. @end table @ifset INTERNALS Index: genoutput.c === --- genoutput.c (revision 219262) +++ genoutput.c (working copy) @@ -209,7 +209,7 @@ struct constraint_data /* All machine-independent constraint characters (except digits) that are handled outside the define*_constraint mechanism. */ -static const char indep_constraints[] = ",=+%*?!#&g"; +static const char indep_constraints[] = ",=+%*?!^$#&g"; static struct constraint_data * constraints_by_letter_table[1 << CHAR_BIT]; Index: ira-costs.c === --- ira-costs.c (revision 219262) +++ ira-costs.c (working copy) @@ -762,6 +762,10 @@ record_reg_classes (int n_alts, int n_op c = *++p; break; + case '^': + alt_cost += 2; + break; + case '?': alt_cost += 2; break; Index: lra-constraints.c === --- lra-constraints.c (revision 219262) +++ lra-constraints.c (working copy) @@ -1640,6 +1640,7 @@ process_alt_operands (int only_alternati then REJECT is ignored, but otherwise it gets this much counted against it in addition to the reloading needed. */ int reject; + int op_reject; /* The number of elements in the following array. */ int early_clobbered_regs_num; /* Numbers of operands which are early clobber registers. */ @@ -1789,6 +1790,7 @@ process_alt_operands (int only_alternati track. */ lra_assert (*p != 0 && *p != ','); + op_reject = 0; /* Scan this alternative's specs for this operand; set WIN if the operand fits any letter in this alternative. 
Otherwise, clear BADOP if this operand could fit some @@ -1811,6 +1813,13 @@ process_alt_operands (int only_alternati early_clobber_p = true; break; + case '$': + op_reject += LRA_MAX_REJECT; + break; + case '^': + op_reject += LRA_LOSER_COST_FACTOR; + break; + case '#': /* Ignore rest of this alternative. */ c = '\0'; @@ -2097,6 +2106,7 @@ process_alt_operands (int only_alternati int
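The difference between the existing and the proposed disparagement constraints, as described at the top of this message, can be modelled roughly as follows. This is a toy model in plain C, not the LRA code: the struct, the function name and the two penalty values are assumptions chosen for illustration only.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative penalties only; the real values are LRA internals.  */
enum { SLIGHT_PENALTY = 6, SEVERE_PENALTY = 600 };

/* Hypothetical per-operand summary for one alternative.  */
struct operand_model
{
  char disparage;     /* '?', '!', '^', '$', or 0 for none */
  bool needs_reload;  /* does this operand need a reload? */
};

/* Penalty charged to an alternative.  '?' and '!' count as soon as
   ANY operand needs a reload; '^' and '$' count only when the operand
   carrying them needs a reload.  */
int
alternative_penalty (const struct operand_model *ops, int n_ops)
{
  bool any_reload = false;
  int penalty = 0;

  for (int i = 0; i < n_ops; i++)
    if (ops[i].needs_reload)
      any_reload = true;

  for (int i = 0; i < n_ops; i++)
    switch (ops[i].disparage)
      {
      case '?':
        if (any_reload)
          penalty += SLIGHT_PENALTY;
        break;
      case '!':
        if (any_reload)
          penalty += SEVERE_PENALTY;
        break;
      case '^':
        if (ops[i].needs_reload)
          penalty += SLIGHT_PENALTY;
        break;
      case '$':
        if (ops[i].needs_reload)
          penalty += SEVERE_PENALTY;
        break;
      default:
        break;
      }

  return penalty;
}
```

In this model, an alternative whose '$' operand already fits pays nothing even if some other operand needs a reload, whereas the same alternative written with '!' would be severely penalized.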
Re: [COMMITTED] Merge libffi with upstream
On Mon, Jan 12, 2015 at 8:34 AM, Richard Henderson wrote:
> Upstream libffi has added support for Go closures (using the static chain),
> and support for complex numbers. Perhaps less relevant is new support for
> arc, microblaze, moxie, nios, and or1k targets.
>
> Without additional changes for Go, this merge has little effect. Within the
> gcc tree libffi is primarily used by libjava.
>
> Tested with no regressions on {i686,x86_64,ppc64,s390x,aarch64,alpha}-linux.
>
> Due to upstream breakage, and difficulty debugging on Darwin,
> {i686,x86_64}-darwin retains copies of the existing sources and thus remains
> 100% unchanged. Since libgo doesn't support darwin, this should cause no
> immediate problems.
>

It caused:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64607
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64581

-- 
H.J.
[SH][committed] PR 53988 - Fix wrong code
Hi, The attached patch fixes a wrong-code issue which was the result of the initial fix for PR 53988. Tested with make -k check RUNTESTFLAGS="--target_board=sh-sim \{-m2/-ml,-m2/-mb,-m2a/-mb,-m4/-ml,-m4/-mb,-m4a/-ml,-m4a/-mb}" and no new failures. Committed as 219623. Backports to 4.8 and 4.9 will follow later. Cheers, Oleg gcc/ChangeLog: PR target/53988 * config/sh/sh-protos.h (sh_find_set_of_reg): Add option to ignore reg-reg copies. (sh_extending_set_of_reg): New struct. (sh_find_extending_set_of_reg, sh_split_tst_subregs, sh_remove_reg_dead_or_unused_notes): New Declarations. * config/sh/sh.c (sh_remove_reg_dead_or_unused_notes, sh_find_extending_set_of_reg, sh_split_tst_subregs, sh_extending_set_of_reg::use_as_extended_reg): New functions. * config/sh/sh.md (*tst_t_zero): Rename to *tst_t_subregs, convert to insn_and_split and use new function sh_split_tst_subregs. gcc/testsuite/ChangeLog: PR target/53988 * gcc.target/sh/pr53988-1.c: New. Index: gcc/testsuite/gcc.target/sh/pr53988-1.c === --- gcc/testsuite/gcc.target/sh/pr53988-1.c (revision 0) +++ gcc/testsuite/gcc.target/sh/pr53988-1.c (revision 0) @@ -0,0 +1,66 @@ +/* Check that sign/zero extensions are emitted where needed when the + tst Rm,Rn instruction is used. */ +/* { dg-do compile } */ +/* { dg-options "-O1" } */ +/* { dg-skip-if "" { "sh*-*-*" } { "-m5*"} { "" } } */ +/* { dg-final { scan-assembler-times "tst\tr" 8 } } */ +/* { dg-final { scan-assembler-times "mov.b" 4 } } */ +/* { dg-final { scan-assembler-times "mov.w" 4 } } */ +/* { dg-final { scan-assembler-times "extu.b" 4 } } */ +/* { dg-final { scan-assembler-times "extu.w" 2 } } */ + +int +test_00 (char* x, char* y) +{ + /* 2x mov.b (sign extending) */ + return *x & *y ? -40 : 60; +} + +int +test_01 (short* x, short* y) +{ + /* 2x mov.w (sign extending) */ + return *x & *y ? -40 : 60; +} + +int +test_02 (char x, char y) +{ + /* 1x extu.b */ + return x & y ? -40 : 60; +} + +int +test_03 (short x, short y) +{ + /* 1x extu.w */ + return x & y ? 
-40 : 60; +} + +int +test_04 (char* x, unsigned char y) +{ + /* 1x mov.b, 1x extu.b */ + return *x & y ? -40 : 60; +} + +int +test_05 (short* x, unsigned char y) +{ + /* 1x mov.w, 1x extu.b */ + return *x & y ? -40 : 60; +} + +int +test_06 (short x, short* y, int z, int w) +{ + /* 1x mov.w, 1x extu.w */ + return x & y[0] ? z : w; +} + +int +test_07 (char x, char* y, int z, int w) +{ + /* 1x mov.b, 1x extu.b */ + return x & y[0] ? z : w; +} Index: gcc/config/sh/sh.md === --- gcc/config/sh/sh.md (revision 219622) +++ gcc/config/sh/sh.md (working copy) @@ -666,30 +666,40 @@ [(set_attr "type" "mt_group")]) ;; This pattern might be risky because it also tests the upper bits and not -;; only the subreg. However, it seems that combine will get to this only -;; when testing sign/zero extended values. In this case the extended upper -;; bits do not matter. -(define_insn "*tst_t_zero" +;; only the subreg. We have to check whether the operands have been sign +;; or zero extended. In the worst case, a zero extension has to be inserted +;; to mask out the unwanted bits. 
+(define_insn_and_split "*tst_t_subregs" [(set (reg:SI T_REG) (eq:SI (subreg:QIHI - (and:SI (match_operand:SI 0 "arith_reg_operand" "%r") - (match_operand:SI 1 "arith_reg_operand" "r")) ) + (and:SI (match_operand:SI 0 "arith_reg_operand") + (match_operand:SI 1 "arith_reg_operand")) ) (const_int 0)))] - "TARGET_SH1 && TARGET_LITTLE_ENDIAN" - "tst %0,%1" - [(set_attr "type" "mt_group")]) + "TARGET_SH1 && TARGET_LITTLE_ENDIAN && can_create_pseudo_p ()" + "#" + "&& 1" + [(const_int 0)] +{ + sh_split_tst_subregs (curr_insn, mode, , operands); + DONE; +}) -(define_insn "*tst_t_zero" +(define_insn_and_split "*tst_t_subregs" [(set (reg:SI T_REG) (eq:SI (subreg:QIHI - (and:SI (match_operand:SI 0 "arith_reg_operand" "%r") - (match_operand:SI 1 "arith_reg_operand" "r")) ) + (and:SI (match_operand:SI 0 "arith_reg_operand") + (match_operand:SI 1 "arith_reg_operand")) ) (const_int 0)))] - "TARGET_SH1 && TARGET_BIG_ENDIAN" - "tst %0,%1" - [(set_attr "type" "mt_group")]) + "TARGET_SH1 && TARGET_BIG_ENDIAN && can_create_pseudo_p ()" + "#" + "&& 1" + [(const_int 0)] +{ + sh_split_tst_subregs (curr_insn, mode, , operands); + DONE; +}) ;; Extract LSB, negate and store in T bit. (define_insn "tstsi_t_and_not" Index: gcc/config/sh/sh-protos.h === --- gcc/config/sh/sh-protos.h (revision 219622) +++ gcc/config/sh/sh-protos.h (working copy) @@ -181,7 +181,8 @@ 'prev_nonnote_insn_bb'. When the insn is found, try to extract the rtx of the reg set. */ template inline set_of_reg -sh_f
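The risk described in the rewritten sh.md comment is that tst looks at all 32 bits of both registers, while the source expression only cares about the low byte or halfword. A small plain C model of that mismatch is sketched below; the function names are made up for this illustration and have nothing to do with the actual sh.c helpers.

```c
#include <assert.h>
#include <stdint.h>

/* Model of "tst Rm,Rn": the T bit is set when the AND of the full
   32-bit registers is zero.  */
int
tst_t_bit (uint32_t rm, uint32_t rn)
{
  return (rm & rn) == 0;
}

/* What the source expression asks for when only the low byte of the
   AND result matters.  */
int
low_byte_and_is_zero (uint32_t rm, uint32_t rn)
{
  return (uint8_t) (rm & rn) == 0;
}

/* Zero-extend the low byte, as an "extu.b" would.  */
uint32_t
zero_extend_byte (uint32_t r)
{
  return r & 0xffu;
}

/* Garbage in bits 8..31 fools the plain tst; after zero-extending one
   operand the two answers agree again.  */
void
example (void)
{
  uint32_t a = 0x100, b = 0x100;

  assert (low_byte_and_is_zero (a, b));
  assert (!tst_t_bit (a, b));
  assert (tst_t_bit (zero_extend_byte (a), b));
}
```

Extending only one operand is enough in this model, because the AND clears every bit that is zero in either input.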
libgo patch committed: Update to Go 1.4
I've committed a patch to libgo to update it to the Go 1.4 release, except for the runtime package. Much of the runtime package was rewritten in Go, and it does not really affect users of the library, so I've postponed that complex merge. All the other packages are updated. A few minor compiler changes were required, as well as a few changes to the runtime packages required for other changes. The testsuite script was changed to add support for the new TestMain function, which is used by one or two of the standard packages. As usual with libgo updates the entire patch is too large to attach here. I've attached the changes to configuration/build files and the runtime package. Note that the type descriptor format has changed very very slightly to include an additional flag. This means that all existing Go files must be recompiled in order to work with this updated libgo. I will bump the libgo version number shortly. Bootstrapped and ran Go testsuite on x86_64-unknown-linux-gnu. Committed to mainline. Ian gotools/ChangeLog: 2015-01-14 Ian Lance Taylor * Makefile.am (go_cmd_go_files): Sort entries. Add generate.go. * Makefile.in: Rebuild. 
diff -r 0fde0b6a7eb2 go/expressions.cc --- a/go/expressions.cc Fri Jan 09 17:00:26 2015 -0800 +++ b/go/expressions.cc Wed Jan 14 15:56:44 2015 -0800 @@ -15559,7 +15559,7 @@ Numeric_constant::set_type(Type* type, bool issue_error, Location loc) { bool ret; - if (type == NULL) + if (type == NULL || type->is_error()) ret = true; else if (type->integer_type() != NULL) ret = this->check_int_type(type->integer_type(), issue_error, loc); diff -r 0fde0b6a7eb2 go/types.cc --- a/go/types.cc Fri Jan 09 17:00:26 2015 -0800 +++ b/go/types.cc Wed Jan 14 15:56:44 2015 -0800 @@ -1966,6 +1966,8 @@ if (!this->has_pointer()) runtime_type_kind |= RUNTIME_TYPE_KIND_NO_POINTERS; + if (this->points_to() != NULL) +runtime_type_kind |= RUNTIME_TYPE_KIND_DIRECT_IFACE; Struct_field_list::const_iterator p = fields->begin(); go_assert(p->is_field_name("kind")); vals->push_back(Expression::make_integer_ul(runtime_type_kind, p->type(), diff -r 0fde0b6a7eb2 go/types.h --- a/go/types.hFri Jan 09 17:00:26 2015 -0800 +++ b/go/types.hWed Jan 14 15:56:44 2015 -0800 @@ -81,6 +81,8 @@ static const int RUNTIME_TYPE_KIND_STRUCT = 25; static const int RUNTIME_TYPE_KIND_UNSAFE_POINTER = 26; +static const int RUNTIME_TYPE_KIND_DIRECT_IFACE = (1 << 5); +static const int RUNTIME_TYPE_KIND_GC_PROG = (1 << 6); static const int RUNTIME_TYPE_KIND_NO_POINTERS = (1 << 7); // GC instruction opcodes. These must match the values in libgo/runtime/mgc0.h. diff -r 0fde0b6a7eb2 libgo/MERGE --- a/libgo/MERGE Fri Jan 09 17:00:26 2015 -0800 +++ b/libgo/MERGE Wed Jan 14 15:56:44 2015 -0800 @@ -1,4 +1,4 @@ -f44017549ff9 +14854533dcc7 The first line of this file holds the Mercurial revision number of the last merge done from the master library sources. 
diff -r 0fde0b6a7eb2 libgo/Makefile.am --- a/libgo/Makefile.am Fri Jan 09 17:00:26 2015 -0800 +++ b/libgo/Makefile.am Wed Jan 14 15:56:44 2015 -0800 @@ -495,6 +495,7 @@ runtime/go-unsafe-new.c \ runtime/go-unsafe-newarray.c \ runtime/go-unsafe-pointer.c \ + runtime/go-unsetenv.c \ runtime/go-unwind.c \ runtime/go-varargs.c \ runtime/env_posix.c \ @@ -695,7 +696,7 @@ else if LIBGO_IS_SOLARIS go_net_cgo_file = go/net/cgo_linux.go -go_net_sock_file = go/net/sock_solaris.go +go_net_sock_file = go/net/sock_stub.go go_net_sockopt_file = go/net/sockopt_solaris.go go_net_sockoptip_file = go/net/sockoptip_stub.go else @@ -761,9 +762,6 @@ if LIBGO_IS_DARWIN go_net_tcpsockopt_file = go/net/tcpsockopt_darwin.go else -if LIBGO_IS_SOLARIS -go_net_tcpsockopt_file = go/net/tcpsockopt_solaris.go -else if LIBGO_IS_DRAGONFLY go_net_tcpsockopt_file = go/net/tcpsockopt_dragonfly.go else @@ -771,7 +769,6 @@ endif endif endif -endif go_net_files = \ go/net/cgo_unix.go \ @@ -997,7 +994,6 @@ go/runtime/extern.go \ go/runtime/mem.go \ go/runtime/softfloat64.go \ - go/runtime/type.go \ version.go version.go: s-version; @true @@ -1187,10 +1183,19 @@ go/crypto/md5/md5.go \ go/crypto/md5/md5block.go \ go/crypto/md5/md5block_generic.go + +if LIBGO_IS_LINUX +crypto_rand_file = go/crypto/rand/rand_linux.go +else +crypto_rand_file = +endif + go_crypto_rand_files = \ go/crypto/rand/rand.go \ go/crypto/rand/rand_unix.go \ + $(crypto_rand_file) \ go/crypto/rand/util.go + go_crypto_rc4_files = \ go/crypto/rc4/rc4.go \ go/crypto/rc4/rc4_ref.go @@ -1289,9 +1294,11 @@ go_encoding_gob_files = \ go/encoding/gob/decode.go \ go/encoding/gob/decoder.go \ + go/encoding/gob/dec_helpers.go \ go/encoding/gob/doc.go \ go/encoding/gob/encode.go \ go/encoding/gob/encoder.go \ + go/encoding
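As a side note on the type descriptor change mentioned in the message: the go/types.h hunk ORs flag bits into the same integer that carries the type kind. A hedged sketch of how such a combined value can be taken apart is given below. The constant values are copied from that hunk; KIND_MASK, kind_of and has_flag are assumptions of this sketch and are not taken from the patch.

```c
#include <assert.h>

/* Values as they appear in the go/types.h hunk of this message.  */
enum
{
  RUNTIME_TYPE_KIND_STRUCT = 25,
  RUNTIME_TYPE_KIND_DIRECT_IFACE = 1 << 5,
  RUNTIME_TYPE_KIND_GC_PROG = 1 << 6,
  RUNTIME_TYPE_KIND_NO_POINTERS = 1 << 7
};

/* Assumption of this sketch: the low five bits hold the kind proper,
   since the flags start at bit 5.  */
enum { KIND_MASK = (1 << 5) - 1 };

/* Strip the flag bits and return the bare kind.  */
int
kind_of (int kind_byte)
{
  return kind_byte & KIND_MASK;
}

/* Nonzero if FLAG is set in KIND_BYTE.  */
int
has_flag (int kind_byte, int flag)
{
  return (kind_byte & flag) != 0;
}

/* A struct type without pointers: kind 25 plus the NO_POINTERS flag.  */
void
example (void)
{
  int k = RUNTIME_TYPE_KIND_STRUCT | RUNTIME_TYPE_KIND_NO_POINTERS;

  assert (kind_of (k) == RUNTIME_TYPE_KIND_STRUCT);
  assert (has_flag (k, RUNTIME_TYPE_KIND_NO_POINTERS));
  assert (!has_flag (k, RUNTIME_TYPE_KIND_DIRECT_IFACE));
}
```

Code compiled against the old layout would not set or expect the new bit, which is consistent with the note that existing Go files must be recompiled.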
RE: [PATCH,MIPS] Remove all excess parallel constructs
> -Original Message- > From: Matthew Fortune [mailto:matthew.fort...@imgtec.com] > Sent: Monday, January 12, 2015 11:12 AM > To: Moore, Catherine > Cc: 'gcc-patches@gcc.gnu.org' (gcc-patches@gcc.gnu.org) > Subject: [PATCH,MIPS] Remove all excess parallel constructs > > * config/mips/micromips.md (*swp): Remove explicit parallel. > (jraddiusp, *movep): Likewise. > * config/mips/mips-dsp.md (add3): Likewise. > (mips_add_s_, sub3): > Likewise. > (mips_sub_s_, mips_addsc): > Likewise. > (mips_addwc, mips_absq_s_): Likewise. > (mips_precrq_rs_ph_w, mips_precrqu_s_qb_ph): Likewise. > (mips_shll_, mips_shll_s_): > Likewise. > (mips_muleu_s_ph_qbl, mips_muleu_s_ph_qbr): Likewise. > (mips_mulq_rs_ph, mips_muleq_s_w_phl, mips_muleq_s_w_phr): > Likewise. > (mips_dpaq_s_w_ph, mips_dpsq_s_w_ph, mips_mulsaq_s_w_ph): > Likewise. > (mips_dpaq_sa_l_w, mips_dpsq_sa_l_w, mips_maq_s_w_phl): > Likewise. > (mips_maq_s_w_phr, mips_maq_sa_w_phl, mips_maq_sa_w_phr): > Likewise. > (mips_extr_w, mips_extr_r_w, mips_extr_rs_w): Likewise. > (mips_extr_s_h, mips_extp, mips_extpdp, mips_mthlip): Likewise. > (mips_wrdsp): Likewise. > * config/mips/mips-dspr2.md (mips_absq_s_qb): Remove explicit > parallel. > (mips_addu_ph, mips_addu_s_ph, mips_cmpgdu_eq_qb): Likewise. > (mips_cmpgdu_lt_qb, mips_cmpgdu_le_qb, mulv2hi3): Likewise. > (mips_mul_s_ph, mips_mulq_rs_w, mips_mulq_s_ph): Likewise. > (mips_mulq_s_w, mips_subu_ph, mips_subu_s_ph): Likewise. > (mips_dpaqx_s_w_ph, mips_dpaqx_sa_w_ph): Likewise. > (mips_dpsqx_s_w_ph, mips_dpsqx_sa_w_ph): Likewise. > * config/mips/mips-fixed.md (usadd3): Remove explicit > parallel. > (ssadd3, ussub3, sssub3, ssmul3): > Likewise. > (ssmaddsqdq4, ssmsubsqdq4): Likewise. This one is OK, too.
Re: [PATCH] Fix PR c++/16160
On 01/14/2015 05:04 PM, Patrick Palka wrote:
> Did this define a specialization too: struct X<5> { };

Yes. There's an example in the ARM that says

  A class can be defined as the definition of a template class. For example,

    template<class T> class stream { /* ... */ };
    class stream<char> { /* ... */ };

  Here, the class declaration will be used as the definition of streams of characters (stream<char>). Other streams will be handled by template functions generated from the class template.

Jason
libgo patch committed: Add missing files
Somehow two files were omitted from the last commit upgrading to 1.4. This adds them. Committed to mainline. Ian Index: libgo/go/crypto/tls/testdata/Server-TLSv12-IssueTicketPreDisable === --- libgo/go/crypto/tls/testdata/Server-TLSv12-IssueTicketPreDisable (revision 0) +++ libgo/go/crypto/tls/testdata/Server-TLSv12-IssueTicketPreDisable (working copy) @@ -0,0 +1,87 @@ +>>> Flow 1 (client to server) + 16 03 01 00 60 01 00 00 5c 03 03 54 23 54 02 17 |`...\..T#T..| +0010 f3 53 13 3d 48 88 c3 19 b9 d1 3d 33 7f f5 99 56 |.S.=H.=3...V| +0020 04 71 1b d9 d5 64 8a 0d 4a 54 00 00 00 04 00 05 |.q...d..JT..| +0030 00 ff 01 00 00 2f 00 23 00 00 00 0d 00 22 00 20 |./.#.". | +0040 06 01 06 02 06 03 05 01 05 02 05 03 04 01 04 02 || +0050 04 03 03 01 03 02 03 03 02 01 02 02 02 03 01 01 || +0060 00 0f 00 01 01|.| +>>> Flow 2 (server to client) + 16 03 03 00 35 02 00 00 31 03 03 00 00 00 00 00 |5...1...| +0010 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 || +0020 00 00 00 00 00 00 00 00 00 00 00 00 00 05 00 00 || +0030 09 00 23 00 00 ff 01 00 01 00 16 03 03 02 be 0b |..#.| +0040 00 02 ba 00 02 b7 00 02 b4 30 82 02 b0 30 82 02 |.0...0..| +0050 19 a0 03 02 01 02 02 09 00 85 b0 bb a4 8a 7f b8 || +0060 ca 30 0d 06 09 2a 86 48 86 f7 0d 01 01 05 05 00 |.0...*.H| +0070 30 45 31 0b 30 09 06 03 55 04 06 13 02 41 55 31 |0E1.0...UAU1| +0080 13 30 11 06 03 55 04 08 13 0a 53 6f 6d 65 2d 53 |.0...USome-S| +0090 74 61 74 65 31 21 30 1f 06 03 55 04 0a 13 18 49 |tate1!0...UI| +00a0 6e 74 65 72 6e 65 74 20 57 69 64 67 69 74 73 20 |nternet Widgits | +00b0 50 74 79 20 4c 74 64 30 1e 17 0d 31 30 30 34 32 |Pty Ltd0...10042| +00c0 34 30 39 30 39 33 38 5a 17 0d 31 31 30 34 32 34 |4090938Z..110424| +00d0 30 39 30 39 33 38 5a 30 45 31 0b 30 09 06 03 55 |090938Z0E1.0...U| +00e0 04 06 13 02 41 55 31 13 30 11 06 03 55 04 08 13 |AU1.0...U...| +00f0 0a 53 6f 6d 65 2d 53 74 61 74 65 31 21 30 1f 06 |.Some-State1!0..| +0100 03 55 04 0a 13 18 49 6e 74 65 72 6e 65 74 20 57 |.UInternet W| +0110 69 64 67 69 74 73 20 
50 74 79 20 4c 74 64 30 81 |idgits Pty Ltd0.| +0120 9f 30 0d 06 09 2a 86 48 86 f7 0d 01 01 01 05 00 |.0...*.H| +0130 03 81 8d 00 30 81 89 02 81 81 00 bb 79 d6 f5 17 |0...y...| +0140 b5 e5 bf 46 10 d0 dc 69 be e6 2b 07 43 5a d0 03 |...F...i..+.CZ..| +0150 2d 8a 7a 43 85 b7 14 52 e7 a5 65 4c 2c 78 b8 23 |-.zC...R..eL,x.#| +0160 8c b5 b4 82 e5 de 1f 95 3b 7e 62 a5 2c a5 33 d6 |;~b.,.3.| +0170 fe 12 5c 7a 56 fc f5 06 bf fa 58 7b 26 3f b5 cd |..\zV.X{&?..| +0180 04 d3 d0 c9 21 96 4a c7 f4 54 9f 5a bf ef 42 71 |!.J..T.Z..Bq| +0190 00 fe 18 99 07 7f 7e 88 7d 7d f1 04 39 c4 a2 2e |..~.}}..9...| +01a0 db 51 c9 7c e3 c0 4c 3b 32 66 01 cf af b1 1d b8 |.Q.|..L;2f..| +01b0 71 9a 1d db db 89 6b ae da 2d 79 02 03 01 00 01 |q.k..-y.| +01c0 a3 81 a7 30 81 a4 30 1d 06 03 55 1d 0e 04 16 04 |...0..0...U.| +01d0 14 b1 ad e2 85 5a cf cb 28 db 69 ce 23 69 de d3 |.Z..(.i.#i..| +01e0 26 8e 18 88 39 30 75 06 03 55 1d 23 04 6e 30 6c |&...90u..U.#.n0l| +01f0 80 14 b1 ad e2 85 5a cf cb 28 db 69 ce 23 69 de |..Z..(.i.#i.| +0200 d3 26 8e 18 88 39 a1 49 a4 47 30 45 31 0b 30 09 |.&...9.I.G0E1.0.| +0210 06 03 55 04 06 13 02 41 55 31 13 30 11 06 03 55 |..UAU1.0...U| +0220 04 08 13 0a 53 6f 6d 65 2d 53 74 61 74 65 31 21 |Some-State1!| +0230 30 1f 06 03 55 04 0a 13 18 49 6e 74 65 72 6e 65 |0...UInterne| +0240 74 20 57 69 64 67 69 74 73 20 50 74 79 20 4c 74 |t Widgits Pty Lt| +0250 64 82 09 00 85 b0 bb a4 8a 7f b8 ca 30 0c 06 03 |d...0...| +0260 55 1d 13 04 05 30 03 01 01 ff 30 0d 06 09 2a 86 |U00...*.| +0270 48 86 f7 0d 01 01 05 05 00 03 81 81 00 08 6c 45 |H.lE| +0280 24 c7 6b b1 59 ab 0c 52 cc f2 b0 14 d7 87 9d 7a |$.k.Y..R...z| +0290 64 75 b5 5a 95 66 e4 c5 2b 8e ae 12 66 1f eb 4f |du.Z.f..+...f..O| +02a0 38 b3 6e 60 d3 92 fd f7 41 08 b5 25 13 b1 18 7a |8.n`A..%...z| +02b0 24 fb 30 1d ba ed 98 b9 17 ec e7 d7 31 59 db 95 |$.0.1Y..| +02c0 d3 1d 78 ea 50 56 5c d5 82 5a 2d 5a 5f 33 c4 b6 |..x.PV\..Z-Z_3..| +02d0 d8 c9 75 90 96 8c 0f 52 98 b5 cd 98 1f 89 20 5f |..uR.. 
_| +02e0 f2 a0 1c a3 1b 96 94 dd a9 fd 57 e9 70 e8 26 6d |..W.p.&m| +02f0 71 99 9b 26 6e 38 50 29 6c 90 a7 bd d9 16 03 03 |q..&n8P)l...| +0300 00 04 0e 00 00 00 |..| +>>> Flow 3 (client to server) +
libgo patch committed: Bump version number
This patch bumps the version number of libgo, giving it a new soname. Bootstrapped on x86_64-unknown-linux-gnu. Committed to mainline.

Ian

diff -r 530a9509c45c -r bdf59f9cfda8 libgo/configure.ac
--- a/libgo/configure.ac	Wed Jan 14 16:11:24 2015 -0800
+++ b/libgo/configure.ac	Wed Jan 14 16:41:32 2015 -0800
@@ -11,7 +11,7 @@
 AC_CONFIG_SRCDIR(Makefile.am)
 AC_CONFIG_HEADER(config.h)
 
-libtool_VERSION=6:0:0
+libtool_VERSION=7:0:0
 AC_SUBST(libtool_VERSION)
 
 AM_ENABLE_MULTILIB(, ..)
[PATCH 2/5] rs6000: Fix TARGET_PROMOTE_FUNCTION_MODE
As the existing comment explains, we should always promote function arguments and return values. However, notwithstanding its name, default_promote_function_mode_always_promote does not always promote. Importantly, it does not for libcalls. This makes ftrapv-[12].c fail with 64-bit ABIs.

This patch introduces an rs6000_promote_function_mode that _does_ always promote, fixing this.

Tested as usual (c,c++,fortran,ada; -m32,-m32/-mpowerpc64,-m64,-m64/-mlra). Is this okay for mainline?

Segher

2015-01-14  Segher Boessenkool

gcc/
	* config/rs6000/rs6000.c (TARGET_PROMOTE_FUNCTION_MODE): Implement
	as rs6000_promote_function_mode. Move comment to there.
	(rs6000_promote_function_mode): New function.
---
 gcc/config/rs6000/rs6000.c | 18 +++---
 1 file changed, 15 insertions(+), 3 deletions(-)

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 4f8803d..ca5ce28 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -1504,10 +1504,8 @@ static const struct attribute_spec rs6000_attribute_table[] =
 #undef TARGET_MEMBER_TYPE_FORCES_BLK
 #define TARGET_MEMBER_TYPE_FORCES_BLK rs6000_member_type_forces_blk
 
-/* On rs6000, function arguments are promoted, as are function return
-   values. */
 #undef TARGET_PROMOTE_FUNCTION_MODE
-#define TARGET_PROMOTE_FUNCTION_MODE default_promote_function_mode_always_promote
+#define TARGET_PROMOTE_FUNCTION_MODE rs6000_promote_function_mode
 
 #undef TARGET_RETURN_IN_MEMORY
 #define TARGET_RETURN_IN_MEMORY rs6000_return_in_memory
@@ -9301,6 +9299,20 @@ init_cumulative_args (CUMULATIVE_ARGS *cum, tree fntype,
     }
 }
 
+/* On rs6000, function arguments are promoted, as are function return
+   values. */
+
+static machine_mode
+rs6000_promote_function_mode (const_tree type ATTRIBUTE_UNUSED,
+                              machine_mode mode,
+                              int *punsignedp ATTRIBUTE_UNUSED,
+                              const_tree, int)
+{
+  PROMOTE_MODE (mode, *punsignedp, type);
+
+  return mode;
+}
+
 /* Return true if TYPE must be passed on the stack and not in registers. */
 
 static bool
-- 
1.8.1.4
[PATCH 1/5] rs6000: Fix PROMOTE_MODE for -m32 -mpowerpc64
UNITS_PER_WORD is 8 with -m32 -mpowerpc64. Promoting items smaller than 8 bytes to 4 bytes doesn't make sense.

I tried to fix it the other way around first, promoting everything smaller than UNITS_PER_WORD to word_mode; this fails all over the place, because word_mode is bigger than Pmode. So let's not do that ;-)

Okay for mainline?

Segher

2015-01-14  Segher Boessenkool

gcc/
	* config/rs6000/rs6000.h (PROMOTE_MODE): Correct test for when -m32
	-mpowerpc64 is active.
---
 gcc/config/rs6000/rs6000.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/config/rs6000/rs6000.h b/gcc/config/rs6000/rs6000.h
index c55d7ed..ef6bb2f 100644
--- a/gcc/config/rs6000/rs6000.h
+++ b/gcc/config/rs6000/rs6000.h
@@ -733,7 +733,7 @@ extern unsigned char rs6000_recip_bits[];
 
 #define PROMOTE_MODE(MODE,UNSIGNEDP,TYPE) \
   if (GET_MODE_CLASS (MODE) == MODE_INT \
-      && GET_MODE_SIZE (MODE) < UNITS_PER_WORD) \
+      && GET_MODE_SIZE (MODE) < (TARGET_32BIT ? 4 : 8)) \
     (MODE) = TARGET_32BIT ? SImode : DImode;
 
 /* Define this if most significant bit is lowest numbered
-- 
1.8.1.4
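For readers who want the patched rule without the macro plumbing, here is a toy model in plain C. It works on sizes in bytes instead of machine modes and takes TARGET_32BIT as a flag; promoted_int_size is a made-up name, and the sketch only covers integer modes.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of the patched PROMOTE_MODE for integer modes.  SIZE is the
   mode size in bytes; the result is the size after promotion.  */
unsigned
promoted_int_size (unsigned size, bool target_32bit)
{
  /* 4 stands for SImode, 8 for DImode.  */
  unsigned abi_word = target_32bit ? 4 : 8;

  assert (size > 0);
  return size < abi_word ? abi_word : size;
}
```

Because the threshold and the promoted size both depend only on TARGET_32BIT, the result no longer changes when -mpowerpc64 raises UNITS_PER_WORD to 8.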
[PATCH 4/5] rs6000: Introducing rs6000_abi_word_mode
Some hooks return word_mode by default, which is incorrect for -m32 -mpowerpc64. This patch creates a new function rs6000_abi_word_mode to implement these hooks, and does so. This fixes 163 testsuite FAILs.

[ David already OKed this fixed version, but sending it again is easier for me. Lazy, etc. ]

2015-01-14  Segher Boessenkool

gcc/
	* config/rs6000/rs6000.c (TARGET_LIBGCC_CMP_RETURN_MODE,
	TARGET_LIBGCC_SHIFT_COUNT_MODE, TARGET_UNWIND_WORD_MODE): Implement
	as ...
	(rs6000_abi_word_mode): New function.
---
 gcc/config/rs6000/rs6000.c | 16 
 1 file changed, 16 insertions(+)

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index afced72..6c91f3c 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -1669,6 +1669,13 @@ static const struct attribute_spec rs6000_attribute_table[] =
 
 #undef TARGET_ATOMIC_ASSIGN_EXPAND_FENV
 #define TARGET_ATOMIC_ASSIGN_EXPAND_FENV rs6000_atomic_assign_expand_fenv
+
+#undef TARGET_LIBGCC_CMP_RETURN_MODE
+#define TARGET_LIBGCC_CMP_RETURN_MODE rs6000_abi_word_mode
+#undef TARGET_LIBGCC_SHIFT_COUNT_MODE
+#define TARGET_LIBGCC_SHIFT_COUNT_MODE rs6000_abi_word_mode
+#undef TARGET_UNWIND_WORD_MODE
+#define TARGET_UNWIND_WORD_MODE rs6000_abi_word_mode
 
 
 /* Processor table. */
@@ -9299,6 +9306,15 @@ init_cumulative_args (CUMULATIVE_ARGS *cum, tree fntype,
     }
 }
 
+/* The mode the ABI uses for a word. This is not the same as word_mode
+   for -m32 -mpowerpc64. This is used to implement various target hooks. */
+
+static machine_mode
+rs6000_abi_word_mode (void)
+{
+  return TARGET_32BIT ? SImode : DImode;
+}
+
 /* On rs6000, function arguments are promoted, as are function return
    values. */
 
-- 
1.8.1.4
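The distinction this patch relies on can be put into a small table-like model. The sketch below is plain C with sizes in bytes; units_per_word and abi_word_size are illustrative names, and the first function encodes the assumption, taken from the descriptions in this series, that UNITS_PER_WORD is 8 whenever -mpowerpc64 is in effect.

```c
#include <assert.h>
#include <stdbool.h>

/* Model of UNITS_PER_WORD: follows the register width, so it is 8 for
   64-bit targets and also for -m32 -mpowerpc64 (assumption based on
   the descriptions in this series).  */
unsigned
units_per_word (bool target_32bit, bool powerpc64)
{
  return (!target_32bit || powerpc64) ? 8 : 4;
}

/* Model of the size of rs6000_abi_word_mode: depends on the ABI only.  */
unsigned
abi_word_size (bool target_32bit)
{
  return target_32bit ? 4 : 8;
}

/* The two notions differ only for -m32 -mpowerpc64.  */
void
example (void)
{
  assert (units_per_word (true, false) == abi_word_size (true));
  assert (units_per_word (false, true) == abi_word_size (false));
  assert (units_per_word (true, true) != abi_word_size (true));
}
```

Hooks that describe what libgcc or the unwinder expect belong to the ABI side of this split, which is why they are switched away from the word_mode default.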
[PATCH 3/5] rs6000: Fix va_start handling for -m32 -mpowerpc64 ABI_V4
This fixes 88 testsuite FAILs. -mpowerpc64 does not change the ABI, but it does change the value of UNITS_PER_WORD. This code is for 32-bit only so we can use MIN_UNITS_PER_WORD instead. Bootstrapped and tested as usual. Okay for mainline? Segher 2015-01-14 Segher Boessenkool gcc/ * config/rs6000/rs6000.c (rs6000_va_start): Use MIN_UNITS_PER_WORD instead of UNITS_PER_WORD to describe the size of stack slots. --- gcc/config/rs6000/rs6000.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c index ca5ce28..afced72 100644 --- a/gcc/config/rs6000/rs6000.c +++ b/gcc/config/rs6000/rs6000.c @@ -11232,7 +11232,7 @@ rs6000_va_start (tree valist, rtx nextarg) /* Find the overflow area. */ t = make_tree (TREE_TYPE (ovf), virtual_incoming_args_rtx); if (words != 0) -t = fold_build_pointer_plus_hwi (t, words * UNITS_PER_WORD); +t = fold_build_pointer_plus_hwi (t, words * MIN_UNITS_PER_WORD); t = build2 (MODIFY_EXPR, TREE_TYPE (ovf), ovf, t); TREE_SIDE_EFFECTS (t) = 1; expand_expr (t, const0_rtx, VOIDmode, EXPAND_NORMAL); -- 1.8.1.4
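The arithmetic behind this one-line change can be shown with a small sketch. The constants below are assumptions standing in for the two macros under -m32 -mpowerpc64 (8 for UNITS_PER_WORD as stated above, 4 for the 32-bit ABI's stack slot size), and overflow_offset() is a made-up helper, not a GCC function.

```cpp
#include <cassert>

// Assumed values modelling -m32 -mpowerpc64; not taken from GCC headers.
constexpr int kUnitsPerWord = 8;     // UNITS_PER_WORD once -mpowerpc64 is on
constexpr int kMinUnitsPerWord = 4;  // MIN_UNITS_PER_WORD: 32-bit ABI slot size

// Byte offset of the overflow area after skipping `words` stack slots.
constexpr long overflow_offset(int words, int slot_size) {
  return static_cast<long>(words) * slot_size;
}
```

For three words already consumed, the old computation would skip 24 bytes, while the ABI lays the slots out 4 bytes apart, so the overflow area really starts 12 bytes in.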
[PATCH 5/5] rs6000: Do not allow TImode with -m32 -mpowerpc64
This fixes 141 FAILs. -mpowerpc64 does not change the ABI, but default_scalar_mode_supported_p does not know that, and allows TImode for -m32 -mpowerpc64. This fixes it. Okay for mainline? 2015-01-14 Segher Boessenkool gcc/ * config/rs6000/rs6000.c (rs6000_scalar_mode_supported_p): Disallow TImode for TARGET_32BIT. --- gcc/config/rs6000/rs6000.c | 4 1 file changed, 4 insertions(+) diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c index 6c91f3c..8fa9a22 100644 --- a/gcc/config/rs6000/rs6000.c +++ b/gcc/config/rs6000/rs6000.c @@ -31960,6 +31960,10 @@ rs6000_eh_return_filter_mode (void) static bool rs6000_scalar_mode_supported_p (machine_mode mode) { + /* For -m32 -mpowerpc64 we want the same ABI as for -m32. */ + if (TARGET_32BIT && mode == TImode) +return false; + if (DECIMAL_FLOAT_MODE_P (mode)) return default_decimal_float_supported_p (); else -- 1.8.1.4
Re: [PATCH/expand] PR64011 Adjust bitsize when partial overflow happen for big-endian
On 01/14/15 15:31, Jiong Wang wrote: Agreed, and I think the truncation is needed; otherwise there may be an ICE on some targets. I also found that the current gcc LOCATION info is very good! I have done an experimental hack at "expand_assignment": 4931 where the tree is expanded, and gcc could give quite a useful & accurate warning based on the tree LOCATION info. ./cc1 -O2 -mbig-endian pr48335-2.c pr48335-2.c: In function ‘f5’: pr48335-2.c:19:29: warning: overflow here ! ((U *)((char *) &s.d + 1))[3] = x; ^ However, we need to add the warning at store_bit_field_using_insv, where there is no accurate LOCATION info. But it looks like it's acceptable? pr48335-2.c:19:33: warning: overflow here ! ((U *)((char *) &s.d + 1))[3] = x; ^ Yes, I think we're on the right track now -- warn and truncate the insertion. I just scanned our set of warning flags to see if this would fit nicely under any of the existing flags, and it doesn't. I guess putting it under -Wextra is probably best for now. I think the warning text should indicate that the statement will write outside the bounds of the destination object or something along those lines. Jeff
Re: [PATCH 2/5] rs6000: Fix TARGET_PROMOTE_FUNCTION_MODE
On Wed, Jan 14, 2015 at 8:14 PM, Segher Boessenkool wrote: > As the existing comment explains, we should always promote function > arguments and return values. However, notwithstanding its name, > default_promote_function_mode_always_promote does not always promote. > Importantly, it does not for libcalls. This makes ftrapv-[12].c fail > with 64-bit ABIs. > > This patch introduces an rs6000_promote_function_mode that _does_ > always promote, fixing this. > > Tested as usual (c,c++,fortran,ada; -m32,-m32/-mpowerpc64,-m64,-m64/-mlra). > Is this okay for mainline? > > > Segher > > > 2015-01-14 Segher Boessenkool > > gcc/ > * config/rs6000/rs6000.c (TARGET_PROMOTE_FUNCTION_MODE): Implement > as rs6000_promote_function_mode. Move comment to there. > (rs6000_promote_function_mode): New function. Okay. Thanks, David
Re: [PATCH 1/5] rs6000: Fix PROMOTE_MODE for -m32 -mpowerpc64
On Wed, Jan 14, 2015 at 8:14 PM, Segher Boessenkool wrote: > UNITS_PER_WORD is 8 with -m32 -mpowerpc64. Promoting items smaller > than 8 bytes to 4 bytes doesn't make sense. > > I tried to fix it the other way around first, promoting everything > smaller than UNITS_PER_WORD to word_mode; this fails all over the > place, because word_mode is bigger than Pmode. So let's not do that ;-) > > Okay for mainline? > > > Segher > > > 2015-01-14 Segher Boessenkool > > gcc/ > * config/rs6000/rs6000.h (PROMOTE_MODE): Correct test for when -m32 > -mpowerpc64 is active. Okay. Thanks, David
Re: [PATCH 3/5] rs6000: Fix va_start handling for -m32 -mpowerpc64 ABI_V4
On Wed, Jan 14, 2015 at 8:14 PM, Segher Boessenkool wrote: > This fixes 88 testsuite FAILs. > > -mpowerpc64 does not change the ABI, but it does change the value of > UNITS_PER_WORD. This code is for 32-bit only so we can use > MIN_UNITS_PER_WORD instead. > > Bootstrapped and tested as usual. Okay for mainline? > > > Segher > > > 2015-01-14 Segher Boessenkool > > gcc/ > * config/rs6000/rs6000.c (rs6000_va_start): Use MIN_UNITS_PER_WORD > instead of UNITS_PER_WORD to describe the size of stack slots. Okay. Thanks, David
Re: [PATCH 4/5] rs6000: Introducing rs6000_abi_word_mode
On Wed, Jan 14, 2015 at 8:14 PM, Segher Boessenkool wrote: > Some hooks return word_mode by default, which is incorrect for -m32 > -mpowerpc64. This patch creates a new function rs6000_abi_word_mode > to implement these hooks, and does so. > > This fixes 163 testsuite FAILs. > > [ David already OKed this fixed version, but sending it again is easier > for me. Lazy, etc. ] > > > 2015-01-14 Segher Boessenkool > > gcc/ > * config/rs6000/rs6000.c (TARGET_LIBGCC_CMP_RETURN_MODE, > TARGET_LIBGCC_SHIFT_COUNT_MODE, TARGET_UNWIND_WORD_MODE): Implement > as ... > (rs6000_abi_word_mode): New function. Okay. Thanks, David
Re: RFA: patch to fix a bad code generation for PR64110 -- new constraints addition
On 01/14/15 16:52, Vladimir Makarov wrote: The problem of unexpected code generation is discussed on https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64110 The following patch introduces 2 new constraints '^' and '$' which are analogous to '?' and '!' but disfavor given alternative when *the operand with the new constraint* needs a reload ('?' and '!' disfavor the alternative if *any* operand needs a reload). I hope the new constraints will be useful for other insns and targets. Right. This gives us finer grained control over when to disparage an alternative. Reloading some of the operands in an alternative may not be a big deal, but there may be other operands in the alternative that if a reload was needed for that operand would be so bad that we'd want to reject the entire alternative. The example I had in mind when I read Vlad's analysis in the BZ were the old movb and addb patterns on the PA. Basically we have some side effect like addition/subtraction/register copy along with a conditional jump. (define_insn "" [(set (pc) (if_then_else (match_operator 2 "movb_comparison_operator" [(match_operand:SI 1 "register_operand" "r,r,r,r") (const_int 0)]) (label_ref (match_operand 3 "" "")) (pc))) (set (match_operand:SI 0 "reg_before_reload_operand" "=!r,!*f,*Q,!*q") (match_dup 1))] Needing a reload for operand 1 really isn't a big deal here, but reloading operand 0 is a disaster. This would be a good place to use the new constraint modifiers. I can distinctly recall running into similar issues on other ports through the years. I wouldn't be at all surprised if a notable percentage of the "!" and "?"s that appear in our machine descriptions would be better off as "^" and "$". 2015-01-14 Vladimir Makarov PR rtl-optimization/64110 * stmt.c (parse_output_constraint): Process '^' and '$'. (parse_input_constraint): Ditto. * lra-constraints.c (process_alt_operands): Process the new constraints. * ira-costs.c (record_reg_classes): Process the new constraint '^'. 
* genoutput.c (indep_constraints): Add '^' and '$'. * config/i386/sse.md (*vec_dup): Use '$' instead of '!'. * doc/md.texi: Add description of the new constraints. 2015-01-14 Vladimir Makarov PR rtl-optimization/64110 * gcc.target/i386/pr64110.c: Add scan-assembler. pr64110-3.patch Index: config/i386/sse.md === --- config/i386/sse.md (revision 219262) +++ config/i386/sse.md (working copy) @@ -16713,7 +16713,7 @@ (define_insn "*vec_dup" [(set (match_operand:AVX2_VEC_DUP_MODE 0 "register_operand" "=x,x,x") (vec_duplicate:AVX2_VEC_DUP_MODE - (match_operand: 1 "nonimmediate_operand" "m,x,!r")))] + (match_operand: 1 "nonimmediate_operand" "m,x,$r")))] "TARGET_AVX2" "@ vbroadcast\t{%1, %0|%0, %1} Index: doc/md.texi === --- doc/md.texi (revision 219262) +++ doc/md.texi (working copy) @@ -1503,6 +1503,18 @@ in it. Disparage severely the alternative that the @samp{!} appears in. This alternative can still be used if it fits without reloading, but if reloading is needed, some other alternative will be used. + +@cindex @samp{^} in constraint +@cindex caret +@item ^ +This constraint is analogous to @samp{?} but it disparages slightly +the alternative only unless the corresponding operand applies exactly. + +@cindex @samp{$} in constraint +@cindex dollar sign +@item $ +This constraint is analogous to @samp{!} but it disparages severely +the alternative only unless the corresponding operand applies exactly. @end table I found these hard to parse. This disparages severely the alternative if the operand with the @samp{$} needs a reload. Seems clearer to me. With the doc update this is good for the trunk. Jeff
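The difference between the old and new modifiers, as described in this thread, can be put into a toy cost model. This is a sketch only: the penalty values and the Operand struct are invented for illustration and do not come from LRA, and the rule encoded is just the one stated above ('?' and '!' count when any operand needs a reload, '^' and '$' count only when the operand carrying them does).

```cpp
#include <cassert>
#include <vector>

// Toy penalties; the real LRA values are not given in the thread.
constexpr int kSlight = 1;
constexpr int kSevere = 100;

struct Operand {
  char modifier;      // '?', '!', '^', '$', or ' ' for none
  bool needs_reload;  // whether this operand needs a reload in this alternative
};

// Penalty for one alternative under the rule described in the mail.
int alternative_penalty(const std::vector<Operand>& ops) {
  bool any_reload = false;
  for (const Operand& op : ops) any_reload = any_reload || op.needs_reload;

  int penalty = 0;
  for (const Operand& op : ops) {
    switch (op.modifier) {
      case '?': if (any_reload) penalty += kSlight; break;
      case '!': if (any_reload) penalty += kSevere; break;
      case '^': if (op.needs_reload) penalty += kSlight; break;
      case '$': if (op.needs_reload) penalty += kSevere; break;
      default: break;
    }
  }
  return penalty;
}
```

In the PA example, operand 0 would carry '$': reloading operand 1 alone then costs nothing extra, whereas with '!' on operand 0 the same reload would severely disparage the whole alternative.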
[Patch, libstdc++/64584, libstdc++/64585] Clear basic_regex after imbue and make assign exception tolerant
It's ironic that I spent non-trivial effort to make sure basic_regex still works after imbuing. :) But now with this specification, the implementation can be cleaner (e.g. _M_original_str is removed). Bootstrapped and tested. I'll make a 4.9 patch later. Thanks! -- Regards, Tim Shen commit b22b13c50bce5999bbec3e3438e49950f932f60d Author: timshen Date: Wed Jan 14 00:01:40 2015 -0800 PR libstdc++/64584 PR libstdc++/64585 * include/bits/regex.h (basic_regex<>::basic_regex, basic_regex<>::assign, basic_regex<>::imbue, basic_regex<>::swap, basic_regex<>::mark_count): Drop NFA after imbuing basic_regex; Make assign() transactional against exception. * include/bits/regex_compiler.h (__compile_nfa<>): Add back __compile_nfa SFINAE. * include/bits/regex_compiler.tcc (_Compiler<>::_M_get_nfa): Use unique_ptr instead of shared_ptr for NFA in _Compiler. * include/std/regex: Adjust include order to avoid __compile_nfa forward declaration. * testsuite/28_regex/basic_regex/assign/char/string.cc: New testcase. * testsuite/28_regex/basic_regex/imbue/string.cc: New testcase. diff --git a/libstdc++-v3/include/bits/regex.h b/libstdc++-v3/include/bits/regex.h index 52c2384..6de883a 100644 --- a/libstdc++-v3/include/bits/regex.h +++ b/libstdc++-v3/include/bits/regex.h @@ -62,13 +62,6 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION template class _Executor; - template -inline std::shared_ptr<_NFA<_TraitsT>> -__compile_nfa(const typename _TraitsT::char_type* __first, - const typename _TraitsT::char_type* __last, - const typename _TraitsT::locale_type& __loc, - regex_constants::syntax_option_type __flags); - _GLIBCXX_END_NAMESPACE_VERSION } @@ -433,7 +426,7 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11 * character sequence. 
*/ basic_regex() - : _M_flags(ECMAScript), _M_loc(), _M_original_str(), _M_automaton(nullptr) + : _M_flags(ECMAScript), _M_loc(), _M_automaton(nullptr) { } /** @@ -497,7 +490,7 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11 basic_regex(const std::basic_string<_Ch_type, _Ch_traits, _Ch_alloc>& __s, flag_type __f = ECMAScript) - : basic_regex(__s.begin(), __s.end(), __f) + : basic_regex(__s.data(), __s.data() + __s.size(), __f) { } /** @@ -516,14 +509,7 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11 template basic_regex(_FwdIter __first, _FwdIter __last, flag_type __f = ECMAScript) - : _M_flags(__f), - _M_loc(), - _M_original_str(__first, __last), - _M_automaton(__detail::__compile_nfa<_Rx_traits>( - _M_original_str.c_str(), - _M_original_str.c_str() + _M_original_str.size(), - _M_loc, - _M_flags)) + : basic_regex(std::move(__first), std::move(__last), locale_type(), __f) { } /** @@ -657,15 +643,8 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11 assign(const basic_string<_Ch_type, _Ch_traits, _Alloc>& __s, flag_type __flags = ECMAScript) { - _M_flags = __flags; - _M_original_str.assign(__s.begin(), __s.end()); - auto __p = _M_original_str.c_str(); - _M_automaton = __detail::__compile_nfa<_Rx_traits>( - __p, - __p + _M_original_str.size(), - _M_loc, - _M_flags); - return *this; + return this->assign(basic_regex(__s.data(), __s.data() + __s.size(), + _M_loc, _M_flags)); } /** @@ -709,7 +688,11 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11 */ unsigned int mark_count() const - { return _M_automaton->_M_sub_count() - 1; } + { + if (_M_automaton) + return _M_automaton->_M_sub_count() - 1; + return 0; + } /** * @brief Gets the flags used to construct the regular expression @@ -729,8 +712,7 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11 imbue(locale_type __loc) { std::swap(__loc, _M_loc); - if (_M_automaton != nullptr) - this->assign(_M_original_str, _M_flags); + _M_automaton = nullptr; return __loc; } @@ -753,7 +735,6 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11 { std::swap(_M_flags, __rhs._M_flags); std::swap(_M_loc, __rhs._M_loc); - 
std::swap(_M_original_str, __rhs._M_original_str); std::swap(_M_automaton, __rhs._M_automaton); } @@ -764,7 +745,15 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11 #endif private: - typedef std::shared_ptr<__detail::_NFA<_Rx_traits>> _AutomatonPtr; + typedef std::shared_ptr> _AutomatonPtr; + + template + basic_regex(_FwdIter __first, _FwdIter __last, locale_type __loc, + flag_type __f) + : _M_flags(__f), _M_loc(std::move(__loc)), +
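The user-visible behaviour this patch aims for can be sketched with the public std::regex interface only. The two helper functions below are hypothetical names made up for this illustration; they are not the new testcases from the patch, and the sketch assumes a standard library that already behaves as the ChangeLog describes (regex cleared by imbue(), assign() leaving the object unchanged on exception).

```cpp
#include <cassert>
#include <locale>
#include <regex>
#include <string>

// Returns whether `re` still matches `text` once a locale has been imbued.
// With the NFA dropped on imbue(), the regex matches nothing afterwards.
bool matches_after_imbue(std::regex re, const std::string& text) {
  re.imbue(std::locale::classic());
  return std::regex_match(text, re);
}

// Assigning an invalid pattern throws std::regex_error; with a transactional
// assign() the previously compiled pattern "a*b" remains in effect.
bool survives_bad_assign(const std::string& text) {
  std::regex re("a*b");
  try {
    re.assign("(");  // unbalanced parenthesis, expected to throw
  } catch (const std::regex_error&) {
    // Swallow the error; the point is the state of `re` afterwards.
  }
  return std::regex_match(text, re);
}
```

A caller that needs matching to work after imbue() has to assign the pattern again, which is exactly what lets the implementation stop keeping _M_original_str around.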
Re: [PATCH, RFC] LRA subreg handling
On 01/14/15 12:20, Robert Suchanek wrote: Hi Vladimir, An issue has been identified with LRA when running the CPU2006 h264ref benchmark. I'll try to describe what the issue is and a fix applied as it is very difficult to reproduce it and it is next to impossible to create a narrowed testcase on top of the source code restrictions. The concerned LRA code in lra-constraints.c is the following: if (GET_CODE (*loc) == SUBREG) { reg = SUBREG_REG (*loc); byte = SUBREG_BYTE (*loc); if (REG_P (reg) /* Strict_low_part requires reload the register not the sub-register. */ && (curr_static_id->operand[i].strict_low || (GET_MODE_SIZE (mode) <= GET_MODE_SIZE (GET_MODE (reg)) && (hard_regno = get_try_hard_regno (REGNO (reg))) >= 0 && (simplify_subreg_regno (hard_regno, GET_MODE (reg), byte, mode) < 0) && (goal_alt[i] == NO_REGS || (simplify_subreg_regno (ira_class_hard_regs[goal_alt[i]][0], GET_MODE (reg), byte, mode) >= 0))))) { loc = &SUBREG_REG (*loc); mode = GET_MODE (*loc); } } The above works just fine when we deal with strict_low_part or a subreg smaller than a word. However, multi-word operations that were emitted as a sequence of operations on word sized parts of the DImode register appear to expose a problem with LRA e.g. '(set (subreg: SI (reg: DI)) ...)'. LRA does not realize that it actually uses the other half of the DI-mode register leading to a situation where it modifies one half of the result and spills the whole register with the other half undefined. In the dump I can see the following: Creating newreg=1552 from oldreg=521, assigning class GR_REGS to r1552 1487: r1552:DI#4=r1404:SI+r1509:SI REG_DEAD r1509:SI REG_DEAD r1404:SI Inserting insn reload after: 1735: r521:DI=r1552:DI There is nothing in the dump that sets r1552:DI#0, nor is a reload inserted to load the value before modifying it, yet it is spilled. 
As it is a multi-word register, the split pass emits an additional instruction to load the whole 64-bit value but since one half was modified, only register $20 appears in the live-in set. In contrast to $20, $21 is being used but not added to the live-in set. ... ;; live in 4 [$4] 6 [$6] 7 [$7] 10 [$10] 11 [$11] 12 [$12] 13 [$13] [$14] 15 [$15] 16 [$16] 17 [$17] 20 [$20] 22 [$22] 23 [$23] 24 [$24] 25 [$25] 29 [$sp] 30 [$fp] 31 [$31] 52 [$f20] 79 [$fakec] ... (insn 1788 1077 1789 80 (set (reg:SI 20 $20 [orig:521 distortion ] [521]) (mem/c:SI (plus:SI (reg/f:SI 29 $sp) (const_int 40 [0x28])) [16 %sfp+40 S4 A64])) rdopt.c:257 288 {*movsi_internal} (nil)) (insn 1789 1788 1743 80 (set (reg:SI 21 $21 [ distortion+4 ]) (mem/c:SI (plus:SI (reg/f:SI 29 $sp) (const_int 44 [0x2c])) [16 %sfp+44 S4 A32])) rdopt.c:257 288 {*movsi_internal} (nil)) ... The potential fix for this is to promote the type of a subreg OP_OUT to OP_INOUT to treat the pseudo register (r1552 in this case) as input and LRA will be forced to insert a reload before modifying its contents. Handling of strict_low_part case is fine as the operand is described in the MD pattern as IN_OUT through modifiers. With the above change in place, we get a reload before assignment: Creating newreg=1552 from oldreg=521, assigning class GR_REGS to r1552 1487: r1552:DI#4=r1404:SI+r1509:SI REG_DEAD r1509:SI REG_DEAD r1404:SI Inserting insn reload before: 1735: r1552:DI=r521:DI Inserting insn reload after: 1736: r521:DI=r1552:DI and the benchmark happily passes the runtime check. The question is whether changing the type to OP_INOUT is the correct and valid fix? Regards, Robert 2015-01-14 Robert Suchanek gcc/ * lra-constraints.c (curr_insn_transform): Change the type of a reload pseudo to OP_INOUT. Robert, can you look at reload.c::reload_inner_reg_of_subreg and verify that the comment just before its return statement is effectively the situation you're in. 
There are certainly cases where a SUBREG needs to be treated as an in-out operand. We walked through them eons ago when we were poking at SSA for RTL. But the details have long since faded from memory. jeff
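The failure mode Robert describes can be modelled in a few lines, independent of LRA. This is a sketch under assumptions: Pseudo64, kGarbage and reload_subreg_store() are invented for the illustration, and which half the "#4" subreg names depends on endianness, so the choice of the `hi` member here is arbitrary.

```cpp
#include <cassert>
#include <cstdint>

// A 64-bit pseudo modelled as two 32-bit halves.
struct Pseudo64 { std::uint32_t lo, hi; };

constexpr std::uint32_t kGarbage = 0xDEADBEEF;  // stands for an undefined half

// Models a store to one word of a fresh reload pseudo, followed by the
// output reload that copies the whole pseudo back to the original register.
// in_out == false: the reload pseudo is treated as output only (OP_OUT).
// in_out == true:  an input reload copies the old value in first (OP_INOUT).
Pseudo64 reload_subreg_store(Pseudo64 old_reg, std::uint32_t value, bool in_out) {
  Pseudo64 new_reg{kGarbage, kGarbage};  // freshly created, contents undefined
  if (in_out) new_reg = old_reg;         // "Inserting insn reload before"
  new_reg.hi = value;                    // the insn writes only one half
  return new_reg;                        // "Inserting insn reload after"
}
```

With in_out false the untouched half comes back as garbage, which corresponds to the half that is spilled undefined in the dump; with in_out true it keeps its old value.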
Re: [PATCH] Reenable CSE of non-volatile inline asm (PR rtl-optimization/63637)
On 01/14/15 08:19, Segher Boessenkool wrote: " @findex clobber @item (clobber @var{x}) Represents the storing or possible storing of an unpredictable, undescribed value into @var{x}, which must be a @code{reg}, @code{scratch}, @code{parallel} or @code{mem} expression. [...] If @var{x} is @code{(mem:BLK (const_int 0))} or @code{(mem:BLK (scratch))}, it means that all memory locations must be presumed clobbered. " Note it doesn't mention reading memory. The documentation is incomplete. The right thing to do is fix the documentation and treat the "memory" tag appearing in the "clobber" section as a read as well as a write. It's lame, but the historical decision by RMS to put that tag into the clobbers section is what it is. Don't get too hung up on it. RMS just botched it. Now if we go back to my earlier quote: " If your assembler instructions access memory in an unpredictable fashion, add @samp{memory} to the list of clobbered registers. Note "access" not "write". This causes GCC to not keep memory values cached in registers across the assembler instruction and not optimize stores or loads to that memory. You also should add the @code{volatile} keyword if the memory affected is not listed in the inputs or outputs of the @code{asm}, as the @samp{memory} clobber does not count as a side-effect of the @code{asm}. " That last line means the compiler is free to delete a non-volatile asm with a memory clobber if that asm is not needed for dataflow. Or that is how I read it; it is trying to indicate that if you want to prevent the memory clobber from being deleted (together with the rest of the asm), you need to make the asm volatile. So as far as I can see the compiler can CSE two identical non-volatile asms with memory clobber just fine. Older GCC (I tried 4.7.2) does do this; current mainline doesn't. I think it should. No, it should not CSE those two cases. That's simply wrong and if an older version did that optimization, that's a bug. jeff
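For concreteness, the kind of source under discussion looks roughly like the sketch below. It uses GNU extended asm (a GCC/Clang extension, not ISO C++), the function and variable names are made up, and the empty asm template is only there so the example builds on any target. Whether the compiler may merge the two asm statements is precisely the question debated here; the sketch takes no position on it.

```cpp
#include <cassert>

int shared_counter = 0;  // ordinary memory the "memory" clobber refers to

// Non-volatile asm with a "memory" clobber and an output that is used.
// The template is empty; "0"(in) ties the input to the output register,
// so the asm simply passes the value through.
int pass_through(int in) {
  int out;
  __asm__("" : "=r"(out) : "0"(in) : "memory");
  return out;
}

// Two textually identical asm statements with a store in between them.
int twice_with_store(int in) {
  int a = pass_through(in);
  shared_counter = a + 1;  // store the second asm might "access"
  int b = pass_through(in);
  return a + b;
}
```

If the "memory" clobber is treated as a possible read, the second asm may observe the store to shared_counter and so cannot be replaced by the result of the first.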
Re: [patch] gcc fstack-protector-explicit
On 07/01/14 15:34, Daniel Gutson wrote: On Tue, Jul 1, 2014 at 2:25 PM, Jeff Law wrote: On 03/19/14 08:06, Marcos Díaz wrote: Well, finally I have the assignment, could you please review this patch? Thanks. My first thought was that if we've marked the function with an explicit static protector attribute, then it ought to be protected regardless of any flags. Is there some reason to require the -fstack-protect-explicit? They can work separately, since the logic is: if NOT stack-protect-explicit a function can be protected by the current logic OR it has the attribute (a function may be not automatically protected with the current logic) ELSE // stack-protect-explicit only functions marked with the attribute will be protected. IOW, when no stack-protect-explicit, the functions may not be protected due to current logic, so the attribute acts as an override to request protection. Sorry this took so long. I fixed a variety of whitespace errors, wrote a better ChangeLog, re-bootstrapped and regression tested the patch (given the long delay, I felt it was the least I could do). Approved and installed. Sorry for the terribly long delay. jeff
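The decision logic Daniel spells out can be restated as a single predicate. This is a sketch of the rule as worded above, not the code that was installed; the function and parameter names are made up.

```cpp
#include <cassert>

// explicit_only: the new "explicit" stack-protector option is in effect.
// heuristic:     the existing stack-protector logic would protect the function.
// has_attr:      the function carries the explicit stack-protect attribute.
constexpr bool should_protect(bool explicit_only, bool heuristic, bool has_attr) {
  // Without the option the attribute acts as an override on top of the
  // existing logic; with it, only attributed functions are protected.
  return explicit_only ? has_attr : (heuristic || has_attr);
}
```

So the attribute is useful on its own (it forces protection where the heuristic would skip a function), and the option narrows protection to attributed functions only.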
Re: [Patch, libstdc++/64584, libstdc++/64585] Clear basic_regex after imbue and make assign exception tolerant
On Wed, Jan 14, 2015 at 8:53 PM, Tim Shen wrote: > I'll make a 4.9 patch later. Here it is. :) Bootstrapped and tested. -- Regards, Tim Shen commit 978246f3fb976d466d5735b43e416b77b750cd0d Author: timshen Date: Wed Jan 14 01:35:22 2015 -0800 PR libstdc++/64584 PR libstdc++/64585 * include/bits/regex.h (basic_regex<>::basic_regex, basic_regex<>::assign, basic_regex<>::imbue, basic_regex<>::swap, basic_regex<>::mark_count): Drop NFA after imbuing basic_regex; Make assign() transactional against exception. * testsuite/28_regex/basic_regex/assign/char/string.cc: New testcase. * testsuite/28_regex/basic_regex/imbue/string.cc: New testcase. diff --git a/libstdc++-v3/include/bits/regex.h b/libstdc++-v3/include/bits/regex.h index 3cbec3c..efb38e5 100644 --- a/libstdc++-v3/include/bits/regex.h +++ b/libstdc++-v3/include/bits/regex.h @@ -476,7 +476,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION */ basic_regex(const basic_regex& __rhs) : _M_flags(__rhs._M_flags), _M_original_str(__rhs._M_original_str) - { this->imbue(__rhs.getloc()); } + { + _M_traits.imbue(__rhs.getloc()); + this->assign(_M_original_str, _M_flags); + } /** * @brief Move-constructs a basic regular expression. 
@@ -490,7 +493,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION : _M_flags(__rhs._M_flags), _M_original_str(std::move(__rhs._M_original_str)) { - this->imbue(__rhs.getloc()); + _M_traits.imbue(__rhs.getloc()); + this->assign(_M_original_str, _M_flags); __rhs._M_automaton.reset(); } @@ -604,7 +608,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION { _M_flags = __rhs._M_flags; _M_original_str = __rhs._M_original_str; - this->imbue(__rhs.getloc()); + _M_traits.imbue(__rhs.getloc()); + this->assign(_M_original_str, _M_flags); return *this; } @@ -622,7 +627,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION _M_flags = __rhs._M_flags; _M_original_str = std::move(__rhs._M_original_str); __rhs._M_automaton.reset(); - this->imbue(__rhs.getloc()); + _M_traits.imbue(__rhs.getloc()); + this->assign(_M_original_str, _M_flags); } /** @@ -675,12 +681,22 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION assign(const basic_string<_Ch_type, _Ch_typeraits, _Alloc>& __s, flag_type __flags = ECMAScript) { + auto __traits = _M_traits; + auto __f = _M_flags; _M_flags = __flags; - _M_original_str.assign(__s.begin(), __s.end()); - auto __p = _M_original_str.c_str(); - _M_automaton = __detail::__compile_nfa(__p, -__p + _M_original_str.size(), -_M_traits, _M_flags); + _M_traits = __traits; + __try + { + _M_automaton = __detail::__compile_nfa( + __s.data(), __s.data() + __s.size(), _M_traits, _M_flags); + _M_original_str = __s; + } + __catch (...) 
+ { + _M_traits = __traits; + _M_flags = __f; + __throw_exception_again; + } return *this; } @@ -725,7 +741,11 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION */ unsigned int mark_count() const - { return _M_automaton->_M_sub_count() - 1; } + { + if (_M_automaton) + return _M_automaton->_M_sub_count() - 1; + return 0; + } /** * @brief Gets the flags used to construct the regular expression @@ -744,9 +764,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION locale_type imbue(locale_type __loc) { - auto __ret = _M_traits.imbue(__loc); - this->assign(_M_original_str, _M_flags); - return __ret; + _M_automaton = nullptr; + return _M_traits.imbue(__loc); } /** @@ -767,8 +786,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION swap(basic_regex& __rhs) { std::swap(_M_flags, __rhs._M_flags); - std::swap(_M_original_str, __rhs._M_original_str); - this->imbue(__rhs.imbue(this->getloc())); + std::swap(_M_traits, __rhs._M_traits); + auto tmp = std::move(_M_original_str); + this->assign(__rhs._M_original_str, _M_flags); + __rhs.assign(tmp, __rhs._M_flags); } #ifdef _GLIBCXX_DEBUG @@ -777,7 +798,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION { _M_automaton->_M_dot(__ostr); } #endif -protected: +private: typedef std::shared_ptr<__detail::_NFA<_Rx_traits>> _AutomatonPtr; template @@ -29,6 +28,7 @@ // Tests C++ string assignment of the basic_regex class. void test01() { + bool test __attribute__((unused)) = true; typedef std::basic_regex test_type; std::string s("a*b"); @@ -36,9 +36,27 @@ void test01() re.assign(s); } +// libstdc++/64584 +void test02() +{ +
Re: [PATCH] Reenable CSE of non-volatile inline asm (PR rtl-optimization/63637)
On January 15, 2015 6:06:33 AM CET, Jeff Law wrote: >On 01/14/15 08:19, Segher Boessenkool wrote: > >> " >> @findex clobber >> @item (clobber @var{x}) >> Represents the storing or possible storing of an unpredictable, >> undescribed value into @var{x}, which must be a @code{reg}, >> @code{scratch}, @code{parallel} or @code{mem} expression. >> >> [...] >> >> If @var{x} is @code{(mem:BLK (const_int 0))} or >> @code{(mem:BLK (scratch))}, it means that all memory >> locations must be presumed clobbered. >> " >> >> Note it doesn't mention reading memory. >The documentation is incomplete. The right thing to do is fix the >documentation and treat the "memory" tag appearing in the "clobber" >section as a read as well as a write. > >It's lame, but the historical decision by RMS to put that tag into the >clobbers section is what it is. Don't get too hung up on it. RMS just > >botched it. > >> >> Now if we go back to my earlier quote: >> >> " >> If your assembler instructions access memory in an unpredictable >> fashion, add @samp{memory} to the list of clobbered registers. >Note "access" not "write". > > > This >> causes GCC to not keep memory values cached in registers across the >> assembler instruction and not optimize stores or loads to that >memory. >> You also should add the @code{volatile} keyword if the memory >> affected is not listed in the inputs or outputs of the @code{asm}, as >> the @samp{memory} clobber does not count as a side-effect of the >> @code{asm}. >> " >> >> That last line means the compiler is free to delete a non-volatile >> asm with a memory clobber if that asm is not needed for dataflow. Or >> that is how I read it; it is trying to indicate that if you want to >> prevent the memory clobber from being deleted (together with the rest >> of the asm), you need to make the asm volatile. >> >> So as far as I can see the compiler can CSE two identical >non-volatile >> asms with memory clobber just fine. 
Older GCC (I tried 4.7.2) does >do >> this; current mainline doesn't. I think it should. >No, it should not CSE those two cases. That's simply wrong and if an >older version did that optimization, that's a bug. I think segher has a point here. If the asm with memory clobber would store to random memory and the point would be to preserve that then the whole distinction with volatile doesn't make much sense (after all without volatile we happily DCE such asm if the regular outputs are not needed). This doesn't mean 'memory' is a well-designed thing, of course. Just its effects are effectively limited to reads without volatile(?) Richard. > >jeff
Re: [PATCH] Add missing requirement to crossmodule-indircall-1a.c
On 11/05/14 13:30, jb...@gmx.de wrote: "Jeff Law" : On 10/23/14 08:30, jb...@gmx.de wrote: "Jeff Law" : On 10/21/14 12:21, jb...@gmx.de wrote: "Jeff Law" : On 10/21/14 16:13, Haswell wrote: The additional source must have the same requirement crossmodule-indircall-1.c has. * crossmodule-indircall-1a.c: Add missing requirement. Why? When used by crossmodule-indircall-1.c we'll have already tested the marker and when used by itself, it does nothing. So I don't see why you think a marker is needed for this source file. When configuring --disable-lto it gets compiled twice: FAIL: gcc.dg/tree-prof/crossmodule-indircall-1a.c compilation, -fprofile-generate -D_PROFILE_GENERATE UNRESOLVED: gcc.dg/tree-prof/crossmodule-indircall-1a.c execution, -fprofile-generate -D_PROFILE_GENERATE UNRESOLVED: gcc.dg/tree-prof/crossmodule-indircall-1a.c compilation, -fprofile-use -D_PROFILE_USE UNRESOLVED: gcc.dg/tree-prof/crossmodule-indircall-1a.c execution, -fprofile-use -D_PROFILE_USE I'd recommend looking deeper. I believe that file should be collapsing down to main () { return 0; } when LTO is not enabled. I'm not a dejagnu expert, but this is what happens: /tmp/build/gcc/xgcc -B/tmp/build/gcc/ /tmp/gcc-4.9.1/gcc/testsuite/gcc.dg/tree-prof/crossmodule-indircall-1a.c -fno-diagnostics-show-caret -fdiagnostics-color=never /tmp/gcc-4.9.1/gcc/testsuite/gcc.dg/tree-prof/crossmodule-indircall-1a.c -fprofile-generate -D_PROFILE_GENERATE -lm -o /tmp/build/gcc/testsuite/gcc/crossmodule-indircall-1a.x01 /tmp/cc4rrWCn.o: In function `main': crossmodule-indircall-1a.c:(.text+0x0): multiple definition of `main' /tmp/ccgMlXGi.o:crossmodule-indircall-1a.c:(.text+0x0): first defined here collect2: error: ld returned 1 exit status compiler exited with status 1 Thanks. What's weird here is the source file is listed twice on the command line! No wonder it's failing. I can't typically decipher tcl code without trace info and some send_user commands to see what the values of various things are. [...] 
Though I have no idea how that's expected to work in an LTO enabled compile. With LTO enabled it runs just fine (which is the reason for the patch I suggested): It's definitely some wacky dejagnu nonsense going on. So if I run both crossmodule-indircall "tests" (yes I know one is an auxiliary file, but what I'm doing emulates what happens inside all the dejagnu/tcl/expect insanity): Running /home/gcc/GIT-2/gcc/gcc/testsuite/gcc.dg/tree-prof/tree-prof.exp ... FAIL: gcc.dg/tree-prof/crossmodule-indircall-1a.c compilation, -fprofile-generate -D_PROFILE_GENERATE So, yea, in a --disable-lto toolchain I can reproduce your problem. Now it gets interesting. Let's run the two tests independently. make check-gcc RUNTESTFLAGS="tree-prof.exp=crossmodule-indircall-1.c" [ ...] Running target unix Using /usr/share/dejagnu/baseboards/unix.exp as board description file for target. Using /usr/share/dejagnu/config/unix.exp as generic interface file for target. Using /home/gcc/GIT-2/gcc/gcc/testsuite/config/default.exp as tool-and-target-specific interface file. Running /home/gcc/GIT-2/gcc/gcc/testsuite/gcc.dg/tree-prof/tree-prof.exp ... === gcc Summary === # of unsupported tests 1 make check-gcc RUNTESTFLAGS="tree-prof.exp=crossmodule-indircall-1a.c" [ ... ] Running target unix Using /usr/share/dejagnu/baseboards/unix.exp as board description file for target. Using /usr/share/dejagnu/config/unix.exp as generic interface file for target. Using /home/gcc/GIT-2/gcc/gcc/testsuite/config/default.exp as tool-and-target-specific interface file. Running /home/gcc/GIT-2/gcc/gcc/testsuite/gcc.dg/tree-prof/tree-prof.exp ... === gcc Summary === # of expected passes 4 /home/tmp/gcc3/gcc/xgcc version 5.0.0 20150115 (experimental) (GCC) Umm, WTF. If I run them independently, everything works as expected. Clearly state from running crossmodule-indircall-1.c is affecting how we "test" crossmodule-indircall-1a.c. 
I'm pretty sure we don't want to "fix" crossmodule-indircall-1a.c, but that the bug is in the dejagnu/tcl/expect code. Jeff
Re: [PATCH] Add missing requirement to crossmodule-indircall-1a.c
On 11/05/14 13:30, jb...@gmx.de wrote: With LTO enabled it runs just fine (which is the reason for the patch I suggested): This is definitely a testing framework problem. The profopt framework isn't clearing the "additional_whatever" variables. Presumably failure to clear is specific to a test failing or being unsupported as I'd expect all kinds of failures if we never cleared those variables. My head hurts, I haven't had to read this much expect/tcl in nearly 20 years. But I'm totally certain we shouldn't be hacking up the source code to deal with the defect in the testing harness. jeff
Re: [Ping] Port of VTV for Cygwin and MinGW
On 15.01.2015 00:52, Ian Lance Taylor wrote: > On Wed, Jan 14, 2015 at 12:28 PM, Patrick Wollgast > wrote: >> On 14.01.2015 20:00, Ian Lance Taylor wrote: >>> On Thu, Jan 8, 2015 at 12:33 PM, Patrick Wollgast >>> wrote: A short recap again: Latest patch, changelog and a test program (further information about the program in the mail): https://gcc.gnu.org/ml/gcc-patches/2014-11/msg03368.html >>> >>> In that patch, the change to varasm.c looks wrong if neither >>> OBJECT_FORMAT_ELF nor TARGET_PECOFF are defined. It looks like you've >>> dropped the switch_to_section call in that case. >>> >>> Ian >>> >> >> You're right. It should have been '#else' again, instead of 'else' >> before the switch_to_section call. > > OK, the patches to varasm.c and cp/vtable-class-hierarchy.c are OK. > > Thanks. > > Ian > Thanks to all the reviewers! Is there something I'm still supposed to do, since I don't have write access and this was the last part missing an "OK"? Regards, Patrick
Re: [arm][patch] fix arm_neon_ok check on !arm_arch7
On 13/01/15 21:01, Andrew Stubbs wrote: On 12/01/15 13:50, Ramana Radhakrishnan wrote: In principle ok, but I'd like a comment in there explaining why we've done this. Can you also post under what configurations these have been tested ? Is this better? I tested it by running the vect.exp tests with a variety of -mcpu flags. Andrew Ok, that should be enough. Please watch out for any testing fallout this week. Ramana
RE: [MIPS] Re-enable ABI->ISA inference
Moore, Catherine writes: > > gcc/ > > > > * config/mips/mips.h (MIPS_ISA_LEVEL_SPEC): Only infer an ISA > > level from an ARCH; do not inject the default. > > (MIPS_DEFAULT_ISA_LEVEL_SPEC): New macro split out from > > MIPS_ISA_LEVEL_SPEC. > > (MIPS_ISA_NAN2008_SPEC): Update comment. > > (BASE_DRIVER_SELF_SPECS): Likewise. > > * config/mips/elfoabi.h (DRIVER_SELF_SPECS): Add > > MIPS_DEFAULT_ISA_LEVEL_SPEC. > > * config/mips/mti-elf.h (DRIVER_SELF_SPECS): Likewise. > > * config/mips/mti-linux.h (DRIVER_SELF_SPECS): Likewise. > > * config/mips/sde.h (DRIVER_SELF_SPECS): Likewise. > > --- > > This looks OK. Thanks, committed as r219580. Matthew
Re: [Ada] Fix bootstrapping on darwin9/10 (PR ada/64349).
> On 09 Jan 2015, at 00:42, Iain Sandoe wrote: > > > On 8 Jan 2015, at 13:52, Tristan Gingold wrote: > >> >>> On 08 Jan 2015, at 13:49, Iain Sandoe wrote: >>> >>> Hi Tristan, >>> >>> On 7 Jan 2015, at 10:15, Arnaud Charlet wrote: >>> Use _NSGetEnviron to get environment. Tested on x86_64-pc-linux-gnu, committed on trunk 2015-01-07 Tristan Gingold PR ada/64349 * env.c (__gnat_environ): Adjust for darwin9/darwin10. >>> >>> So my original patch assumed that, while it was not legal to use environ >>> from a shlib, it is legal to use _NSGetEnviron () from an application ... >>> >>> .. and, OK fine, I see the point about ! defined (__arm__) .. but a few >>> other comments. >>> >>> ISTM that there's a partial implementation to distinguish between IN_RTS >>> and application? >> >> Yes you're right. The added code should have been added after the #endif >> for IN_RTS. > > How about this? > It uses the interface where needed, avoids it for main exes and gets rid of > the negative conditional (which IMO makes the code a little more readable). > > Iain > > P.S. this is not Darwin9/10 - specific the only reason it doesn't fail on > Darwin >= 11 is because they default to -undefined dynamic_lookup .. and so > find the symbol from the exe. Sorry for the late answer. We did something slightly different: always #include crt_externs.h on no-arm Darwin. Tristan.
Re: [patch] update function comments for lto_symtab_encoder_encode_*
On Tue, Jan 13, 2015 at 5:35 PM, Aldy Hernandez wrote: > Hi Richard. > > I'm chasing my tail here looking at an LTO + debug problem, and for the life > of me I can't figure out how all this partition business affects a symbol's > `analyzed' bit. Anyways... the documentation for all these functions is > wrong. > > Can you look at this patch and tell me if it makes sense? I feel a bit > uneasy committing under the obvious rule, since I don't entirely understand > the partitioning thing. > > Would anyone mind me fixing this on mainline? It's just a comment fix. Yeah, it's ok for trunk. > Also, since you seem to understand all this best, can you suggest some > better wording for the lto_encoder_entry comments? > > /* Entry of LTO symtab encoder. */ > struct lto_encoder_entry > { > symtab_node *node; > /* Is the node in this partition (i.e. ltrans of this partition will > be responsible for outputting it)? */ > unsigned int in_partition:1; > /* Do we encode body in this partition? */ > unsigned int body:1; > /* Do we encode initializer in this partition? > For example the readonly variable initializers are encoded to aid > constant folding even if they are not in the partition. */ > unsigned int initializer:1; > }; > > Whenever I get to the LTO part of this project, I promise to start > documenting things better. This whole thing is a mystery. Well - mostly to me as well ;) I'll let Honza answer this... Thanks, Richard. > Thanks. > Aldy
Re: shift/extract SHIFT_COUNT_TRUNCATED combine bug
On Tue, Jan 13, 2015 at 6:38 PM, Segher Boessenkool wrote: > On Tue, Jan 13, 2015 at 10:51:27AM +0100, Richard Biener wrote: >> IMHO SHIFT_COUNT_TRUNCATED should be removed and instead >> backends should provide shift patterns with a (and:QI ...) for the >> shift amount which simply will omit that operation if suitable. > > Note that that catches less though, e.g. in > > int f(int x, int n) { return x << ((2*n) & 31); } > > without SHIFT_COUNT_TRUNCATED it will try to match an AND with 30, > not with 31. But even with SHIFT_COUNT_TRUNCATED you cannot omit the and as it clears the LSB. Only at a higher level we might be tempted to drop the & 31 while it still persists in its original form (not sure if fold does that - I don't see SHIFT_COUNT_TRUNCATED mentioned there). Richard. > > Segher
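The arithmetic behind Segher's remark can be checked in isolation. The following is only a sketch of that one fact, not of anything in combine: because 2*n always has a zero low bit, masking it with 31 and masking it with 30 give the same shift count, which is why a simplifier may canonicalize the mask to 30 and a pattern written against 31 would then no longer match. The helper names are made up for illustration.

```c
#include <assert.h>

/* Shift count exactly as written in the example: (2*n) & 31.  */
static unsigned shift_amount_mask31(unsigned n)
{
    return (2u * n) & 31u;
}

/* Same count with the narrower mask.  Bit 0 of 2*n is always zero,
   so clearing it in the mask cannot change the result.  */
static unsigned shift_amount_mask30(unsigned n)
{
    return (2u * n) & 30u;
}

/* Returns 1 if both masks give the same count for every n below LIMIT.  */
static int masks_agree(unsigned limit)
{
    for (unsigned n = 0; n < limit; n++)
        if (shift_amount_mask31(n) != shift_amount_mask30(n))
            return 0;
    return 1;
}
```

Whether the AND itself may be dropped on a given target is a separate question that this sketch does not address.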
Re: flatten expr.h (version 2)
On Wed, 14 Jan 2015, Prathamesh Kulkarni wrote: > On 13 January 2015 at 22:02, Prathamesh Kulkarni > wrote: > > On 13 January 2015 at 16:06, Prathamesh Kulkarni > > wrote: > >> On 13 January 2015 at 15:34, Richard Biener wrote: > >>> On Sun, 11 Jan 2015, Prathamesh Kulkarni wrote: > >>> > Hi, > This is a revamped expr.h flattening flattening patch rebased on > tree.h and tree-core.h flattening patch (r219402). > It depends upon the following patch to get committed. > https://gcc.gnu.org/ml/gcc-patches/2015-01/msg00565.html > > Changes: > * Removed all includes except tree-core.h. Put includes required by > expr.h in a comment. > * Moved stmt.c, expmed.c prototypes to stmt.h, expmed.h respectively. > * Adjusted generator programs: genemit.c, gengtype.c, genopinit.c, > genoutput.c. > * Did not put includes in gcc-plugin.h since expr.h cannot be included > by plugins > (putting them broke building a file in c-family/ since expr.h is not > allowed in front-ends) > * Affects java front-end (expr.h is allowed in java front-end). > > Bootstrapped and tested on x86_64-unknown-linux-gnu with languages: > all,go,ada,jit > Built on all targets in config-list.mk with languages: all, go. > OK to commit ? > >>> > >>> diff --git a/gcc/expr.c b/gcc/expr.c > >>> index fc22862..824541e 100644 > >>> --- a/gcc/expr.c > >>> +++ b/gcc/expr.c > >>> @@ -41,11 +41,17 @@ along with GCC; see the file COPYING3. If not see > >>> #include "regs.h" > >>> #include "hard-reg-set.h" > >>> #include "except.h" > >>> -#include "input.h" > >>> #include "function.h" > >>> #include "insn-config.h" > >>> #include "insn-attr.h" > >>> /* Include expr.h after insn-config.h so we get HAVE_conditional_move. 
> >>> */ > >>> +#include "hashtab.h" > >>> +#include "emit-rtl.h" > >>> +#include "expmed.h" > >>> +#include "stmt.h" > >>> +#include "statistics.h" > >>> +#include "real.h" > >>> +#include "fixed-value.h" > >>> #include "expr.h" > >>> > >>> Please move the comment to the proper place > >> ah, my flattening tool doesn't look at comments. I will move the > >> comment before expr.h include, thanks. > >>> > >>> diff --git a/gcc/expr.h b/gcc/expr.h > >>> index a7638b8..f1be8dc 100644 > >>> --- a/gcc/expr.h > >>> +++ b/gcc/expr.h > >>> @@ -20,7 +20,8 @@ along with GCC; see the file COPYING3. If not see > >>> #ifndef GCC_EXPR_H > >>> #define GCC_EXPR_H > >>> > >>> -/* For inhibit_defer_pop */ > >>> +/* expr.h required includes */ > >>> +#if 0 > >>> #include "hashtab.h" > >>> #include "hash-set.h" > >>> #include "vec.h" > >>> @@ -29,15 +30,17 @@ along with GCC; see the file COPYING3. If not see > >>> #include "hard-reg-set.h" > >>> #include "input.h" > >>> #include "function.h" > >>> -/* For XEXP, GEN_INT, rtx_code */ > >>> #include "rtl.h" > >>> -/* For optimize_size */ > >>> #include "flags.h" > >>> -/* For tree_fits_[su]hwi_p, tree_to_[su]hwi, fold_convert, size_binop, > >>> - ssize_int, TREE_CODE, TYPE_SIZE, int_size_in_bytes,*/ > >>> #include "tree-core.h" > >>> -/* For GET_MODE_BITSIZE, word_mode */ > >>> #include "insn-config.h" > >>> +#include "alias.h" > >>> +#include "emit-rtl.h" > >>> +#include "expmed.h" > >>> +#include "stmt.h" > >>> +#endif > >>> > >>> Err, please remove the #if 0 section > >> I kept it because if something breaks later (hopefully not!), it will > >> be easier to fix. > >> I will remove it. > >>> > >>> + > >>> +#include "tree-core.h" > >>> > >>> Why? The original comment says > >>> > >>> -/* For tree_fits_[su]hwi_p, tree_to_[su]hwi, fold_convert, size_binop, > >>> - ssize_int, TREE_CODE, TYPE_SIZE, int_size_in_bytes,*/ > >>> > >>> but all those are declared in tree.h. Which means the files including > >>> expr.h must already include tree.h. 
> >>> > >>> If that's not the reason we need to include tree-core.h from expr.c > >>> please add a comment explaining why. > >> bt-load.c fails to compile because it includes expr.h but does not > >> include tree.h > >> I will place tree.h include in all files that include expr.h and rebuild. > > This is not going to work, since tree.h is now flattened. Shall also > > require including all headers required by > > tree.h in all files that include expr.h. Could we retain tree-core.h > > in expr.h for now ? > > Or should I insert tree.h (along with tree.h required includes) in all > > files that include expr.h ? > I am including tree.h along with required includes in all files that > include expr.h. > This removes all includes from expr.h. Good. Richard.
Re: Housekeeping work in backends.html
> I think I got this right > >| Characteristics > > Target | HMSLQNFICBD lqrcpfgmbdiates > ---+ > moxie | F g ds Thanks, applied, but I think that 't' could be added because AFAICS every insn (not expander) generates exactly 1 assembly instruction. Not very important though since only iq2000 and mep also have this nice property. -- Eric Botcazou
Re: RFC: Two minor optimization patterns
On Tue, Jan 13, 2015 at 11:47 PM, Andrew Pinski wrote:
> On Tue, Jan 13, 2015 at 2:41 PM, Rasmus Villemoes wrote:
>> [My first attempt at submitting a patch for gcc, so please forgive me
>> if I'm not following the right protocol.]
>
> There are a few things missing. For one, a testcase or two for the
> added optimizations.
>
>> Sometimes rounding a variable to the next even integer is written x += x
>> & 1. This usually means using an extra register (and hence at least an
>> extra mov instruction) compared to the equivalent x = (x + 1) & ~1. The
>> first pattern below tries to do this transformation.
>>
>> While playing with various ways of rounding down, I noticed that gcc
>> already optimizes all of x-(x&3), x^(x&3) and x&~(x&3) to simply
>> x&~3.

Does it also handle x+(x&3)? Where does it handle x - (x&3)? That is, doesn't the pattern also work for constants other than 1?

Please put it before the abs simplifications, after the last one handling bit_and/bit_ior.

Thanks,
Richard.

>> In fact, x&~(x&y) is rewritten as x&~y. However, the dual of this
>> is not handled, so I included the second pattern below.
>>
>> I've tested the below in the sense that gcc compiles and that trivial
>> test cases get compiled as expected.
>
> The other thing you missed is a changelog entry for the change you did.
> Also you mentioned you tested the patch below but did not mention
> which target you tested it on and you should run the full GCC
> testsuite.
> https://gcc.gnu.org/contribute.html is a good page to start with how
> to handle most of the items above.
> https://gcc.gnu.org/wiki/HowToPrepareATestcase is a good page on how
> to write the testcase for testing the added optimization.
>
> Thanks,
> Andrew Pinski
>
>> Rasmus
>>
>> diff --git gcc/match.pd gcc/match.pd
>> index 81c4ee6..04a0bc4 100644
>> --- gcc/match.pd
>> +++ gcc/match.pd
>> @@ -262,6 +262,16 @@ along with GCC; see the file COPYING3. If not see
>>  (abs tree_expr_nonnegative_p@0)
>>  @0)
>>
>> +/* x + (x & 1) -> (x + 1) & ~1 */
>> +(simplify
>> + (plus @0 (bit_and @0 integer_onep@1))
>> + (bit_and (plus @0 @1) (bit_not @1)))
>> +
>> +/* x | ~(x | y) -> x | ~y */
>> +(simplify
>> + (bit_ior @0 (bit_not (bit_ior @0 @1)))
>> + (bit_ior @0 (bit_not @1)))
>> +
>>
>> /* Try to fold (type) X op CST -> (type) (X op ((type-x) CST))
>> when profitable.
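As a quick sanity check of the two identities proposed in the patch above, here is a small standalone C sketch. It is not part of the patch and does not exercise match.pd; it only compares both sides of each rewrite on unsigned 32-bit values, which is an assumption made here to sidestep any signed-overflow question. The function names are invented for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Left and right side of:  x + (x & 1)  ->  (x + 1) & ~1  */
static uint32_t round_up_even_add(uint32_t x)  { return x + (x & 1u); }
static uint32_t round_up_even_mask(uint32_t x) { return (x + 1u) & ~1u; }

/* Left and right side of:  x | ~(x | y)  ->  x | ~y  */
static uint32_t ior_not_ior(uint32_t x, uint32_t y) { return x | ~(x | y); }
static uint32_t ior_not(uint32_t x, uint32_t y)     { return x | ~y; }

/* Compares both identities over all pairs of small values plus a few
   boundary values.  Returns 1 if every comparison agrees.  */
static int identities_hold(void)
{
    static const uint32_t edge[] = { 0u, 1u, 0x7fffffffu, 0x80000000u,
                                     0xfffffffeu, 0xffffffffu };
    const unsigned n_edge = sizeof edge / sizeof edge[0];

    for (uint32_t x = 0; x < 256; x++) {
        if (round_up_even_add(x) != round_up_even_mask(x))
            return 0;
        for (uint32_t y = 0; y < 256; y++)
            if (ior_not_ior(x, y) != ior_not(x, y))
                return 0;
    }
    for (unsigned i = 0; i < n_edge; i++) {
        if (round_up_even_add(edge[i]) != round_up_even_mask(edge[i]))
            return 0;
        for (unsigned j = 0; j < n_edge; j++)
            if (ior_not_ior(edge[i], edge[j]) != ior_not(edge[i], edge[j]))
                return 0;
    }
    return 1;
}
```

On the question of other constants: the same argument that makes the first identity work for 1 works for any single-bit constant, but not for a multi-bit one such as 3 (for x = 1, x + (x & 3) is 2 while (x + 3) & ~3 is 4).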
Re: [PATCH][ARM] FreeBSD ARM support, EABI, v3
On 13/01/15 21:08, Andreas Tobler wrote:
> On 13.01.15 11:25, Ramana Radhakrishnan wrote:
>> On Thu, Jan 8, 2015 at 8:51 PM, Andreas Tobler wrote:
>>> On 08.01.15 17:27, Richard Earnshaw wrote:
On 29/12/14 18:44, Andreas Tobler wrote:
>
> All,
>
> here is the third attempt to support ARM with FreeBSD.
>
> In the meantime we found another issue in the unwinder where I had to
> adapt some stuff.
>
> The unwind_phase2_forced function in libgcc calls a stop_fn function.
> This stop_fn is in FreeBSD's libthr implementation and is called
> thread_unwind_stop. This thread_unwind_stop is a generic function used
> on all FreeBSD archs.
>
> The issue is now that this thread_unwind_stop expects a double int for
> the exception_class, like on every other arch. For ARM EABI this
> exception_class is an array of char which is passed in one register as
> pointer vs. two registers for a double int.
>
> To solve this issue we defined the exception_class as double integer for
> FreeBSD.
>>
>> My apologies for the slow response, some other work and then holidays
>> intervened.
>
> Np, the only issue which made me hurry was the stage 4 entering this week.
>
>> From my understanding of the ABI document the implementation is
>> currently as mandated by the ABI. Also this isn't a part of the ABI
>> that's available for the platform (here FreeBSD) to manipulate and
>> change as per its wishes. ARM EHABI is special for software, making
>> FreeBSD more "special" for ARM appears to be counter intuitive from my
>> point of view. A number of exception unwinding libraries, e.g.
>> libobjc and libstdc++, all use this implementation of exception_class.
>> Therefore this creates a divergence for the FreeBSD port which is
>> different from everything else. I expect that a number of language run
>> time support libraries that supported the ARM EHABI would be using
>> such an implementation, therefore you need to fix every single
>> implementation of this in every unwinder that supports the ARM EHABI
>> which I expect to have been ported to in a number of libraries
>> already. (I already see this in libobjc and libstdc++ in the GCC tree)
>
> Grr ;) I didn't want to hear this answer, but I expected it somehow.
> My proposal was the least effort for me.
> The other way round is going to be very hard. Maybe impossible.
>
> It is not only FreeBSD which is affected but also llvm and friends. They
> use for exception_class uint64_t.
>
> I have to take a picture how the effort is and if it would be possible
> to do such a change in FreeBSD and more important in llvm etc.
>
>> I would rather fix the thread_unwind_stop implementation in libthr for
>> ARM EHABI rather than make this change.
>
> It wouldn't be a 'fix' but more a wrapper I think.
>
> This adaptation reduced the failure count in libstdc++ by about 40 fails.
>
> I build and test this port on a regular basis and I post the results to
> the usual place.
>>
>> Thanks for doing this. I'm really glad that FreeBSD is finally moving to
>> EABI.

I agree with Ramana on this one, I feel that FreeBSD trying to plough a slightly different furrow is just going to cause major problems for everyone. I don't believe a claim that LLVM can't be made to work with the standard EHABI data structures. It can do this on Linux, so I can't conceive of any reason why it cannot also do so on FreeBSD.

R.

> Thanks for the review and the feedback.
>
> Gruss,
> Andreas
Re: [PATCH][rtlanal.c][BE][1/2] Fix vector load/stores to not use ld1/st1
On 14 January 2015 at 07:35, Jeff Law wrote:
> On 01/13/15 11:55, Eric Botcazou wrote:
>>
>>> (1) we have a non-paradoxical subreg;
>>> (2) both (reg:ymode xregno) and (reg:xmode xregno) occupy full
>>> hard registers (no padding or unused upper bits);
>>> (3) (reg:ymode xregno) and (reg:xmode xregno) store the same number
>>> of bytes (X) in each constituent hard register;
>>> (4) the offset is a multiple of X, i.e. the data we're accessing
>>> is aligned to a register boundary; and
>>> (5) endianness is regular (no differences between words and bytes,
>>> or between registers and memory)
>>
>> OK, that's a nice translation of the new code. :-)
>>
>> It seems to me that the patch wants to extend the support of generic
>> subregs to modes whose sizes are not multiple of each other, which is a
>> requirement of the existing code, but does that in a very specific case
>> for the sake of the ARM port without saying where all the above
>> restrictions come from.
>
> Basically we're lifting the restriction that the sizes are multiples of
> each other. The requirements above are the set where we know it will work.
> They are target independent, but happen to match what the ARM needs.
>
> They certainly do short circuit the meat of the function, that's the whole
> point, there's this set of conditions under which we know this will work and
> when they hold, we bypass.
>
> Now one could argue that instead of bypassing we should put the code to
> handle this situation further down. I'd be leery of doing that just from a
> complexity standpoint. But one could also argue that short circuiting like
> the patch does adds complexity as well and may be a bit kludgy.
>
> Maybe the way forward here is for someone to try and integrate this support
> in the main part of the code and see how it looks. Then we can pick one.
>
> The downside is since this probably isn't a regression that work would need
> to happen quickly to make it into gcc-5.
>
> Which leads to another option, get the release managers to sign off on the
> kludge after gcc-5 branches and only install the kludge on the gcc-5 branch
> and insist that the other solution go in for gcc-6 and beyond. Not sure if
> they'd do that, but it's a discussion that could happen.

This issue is currently gating a number of patches that get big endian working on aarch64 (all of which are on the list), it would be good if we could get this addressed in some form in gcc-5 rather than put out a second release of gcc with borked BE aarch64 support.

Cheers
/Marcus
Re: [PATCH, AArch64] Fix abitest for ilp32
On 14/01/15 04:59, Hurugalawadi, Naveen wrote: > Hi, > > Please find attached the patch that fixes abitest for ilp32. > > "testfunc_ptr" is a 32bit pointer in ILP32 but is being loaded as 64bit. > > Hence some of the func-ret testcases FAIL's for ILP32. > > Please review the patch and let us know if its okay? > > Regression tested on aarch64-elf. > > Thanks, > Naveen > > gcc/testsuite > > 2015-01-15 Andrew Pinski > Naveen H.S > > * gcc.target/aarch64/aapcs64/abitest.S (LABEL_TEST_FUNC_RETURN): Load > testfunc_ptr as 32bit for ILP32 and 64bit for LP64. > > OK. R.
Re: [PATCH][ARM] Fix PR target/64460: Set 'shift' attr properly on some patterns
On Mon, Jan 12, 2015 at 2:29 PM, Kyrill Tkachov wrote: > Now with patch attached > > Kyrill > > > On 12/01/15 14:27, Kyrill Tkachov wrote: >> >> Hi all, >> >> In this PR we ICE when compiling with -mtune=xscale. The ICE is a >> segfault in xscale_sched_adjust_cost. >> The root cause is that xscale_sched_adjust_cost uses the value of the >> 'shift' insn attribute to index >> the recog operands. In GCC 5 the form and number of operands in those >> patterns were updated but the >> shift value was not: >> >> Author: rearnsha >> Date: Thu May 29 09:39:07 2014 + >> >> * arm/iterators.md (shiftable_ops): New code iterator. >> (t2_binop0, arith_shift_insn): New code attributes. >> * arm/predicates.md (shift_nomul_operator): New predicate. >> * arm/arm.md (insn_enabled): Delete. >> (enabled): Remove insn_enabled test. >> (*arith_shiftsi): Delete. Replace with ... >> (*_multsi): ... new pattern. >> (*_shiftsi): ... new pattern. >> * config/arm/arm.c (arm_print_operand): Handle operand format >> 'b'. >> >> This led to an out-of-bounds array access. Only xscale_sched_adjust_cost >> uses the shift >> attribute, so the segfault only happens for xscale tuning. In the future >> we might want >> to use a more general pattern-matching approach to find the shifted >> operand in an rtx... >> >> In any case, this patch fixes the value of 'shift' for the offending >> pattern and also >> updates 'shift' for the *_shiftsi pattern to point to >> the correct >> operand that is being shifted. >> >> Tested arm-none-eabi and bootstrapped with -mtune=xscale in BOOT_CFLAGS. >> >> Ok for trunk? >> >> Thanks, >> Kyrill >> >> 2014-01-12 Kyrylo Tkachov >> >> PR target/64460 >> * config/arm/arm.md (*_multsi): Set 'shift' attr >> to 2. >> (*_shiftsi): Set 'shift' attr to 3. >> >> 2014-01-12 Kyrylo Tkachov >> >> PR target/64460 >> * gcc.target/arm/pr64460_1.c: New test. >> >> > OK. Thanks, Ramana
[Patch, ARM]Update GCC to generate Tag_ABI_HardFP_use per the latest EABI doc
Hi there,

According to the latest EABI at
http://infocenter.arm.com/help/topic/com.arm.doc.ihi0045d/IHI0045D_ABI_addenda.pdf,
the new definition of Tag_ABI_HardFP_use is as below:

Tag_ABI_HardFP_use, (=27), uleb128
0 The user intended that FP use should be implied by Tag_FP_arch
1 The user intended this code to execute on the single-precision variant derived from Tag_FP_arch
2 Reserved
3 The user intended that FP use should be implied by Tag_FP_arch (Note: This is a deprecated duplicate of the default encoded by 0)

The attached patch intends to update gcc to conform to this definition. Tested with GCC regression test, no regressions. Is it OK?

BR,
Terry

2015-01-14 Terry Guo

* config/arm/arm.c (arm_file_start): Update the assignment of Tag_ABI_HardFP_use.

diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 0ec526b..378bed9 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -25576,7 +25576,13 @@ arm_file_start (void)
       if (arm_fpu_desc->model == ARM_FP_MODEL_VFP)
         {
           if (TARGET_HARD_FLOAT)
-            arm_emit_eabi_attribute ("Tag_ABI_HardFP_use", 27, 3);
+            {
+              if (TARGET_VFP_SINGLE)
+                arm_emit_eabi_attribute ("Tag_ABI_HardFP_use", 27, 1);
+              else
+                arm_emit_eabi_attribute ("Tag_ABI_HardFP_use", 27, 0);
+            }
+
           if (TARGET_HARD_FLOAT_ABI)
             arm_emit_eabi_attribute ("Tag_ABI_VFP_args", 28, 1);
         }
[PATCH, doc] NDS32: Describe -mcmodel= option instead of -mgp-direct in the documentation.
Hi, all,

In this patch for the nds32 port:
https://gcc.gnu.org/ml/gcc-patches/2015-01/msg00799.html
we introduced a new option, -mcmodel=, as a replacement for -mgp-direct, so the documentation needs to be updated accordingly.

The patch is attached and the plaintext ChangeLog is as follows.

gcc/ChangeLog

2015-01-14 Chung-Ju Wu

* doc/invoke.texi (NDS32 Options): Add -mcmodel= option and remove -mgp-direct option.

Although these changes are target-specific, I think it would be better to give others a chance to comment, if they wish, on the format and layout. If there are no other comments on this patch, I will commit it to trunk after 24 hours.

Best regards,
jasonwucj

Attachment: 0011-Describe-mcmodel-X-option-instead-of-mgp-direct-in-t.patch
[PATCH, doc] NDS32: Remove -mforce-fp-as-gp, -mforbid-fp-as-gp, and -mex9 options from documentation.
Hi, all,

In this patch for the nds32 port:
https://gcc.gnu.org/ml/gcc-patches/2015-01/msg00969.html
we removed the implementation of the -mforce-fp-as-gp, -mforbid-fp-as-gp, and -mex9 options, so the documentation needs to be updated as well.

The patch is attached and the plaintext ChangeLog is as follows.

gcc/ChangeLog

2015-01-14 Chung-Ju Wu

* doc/invoke.texi (NDS32 Options): Remove -mforce-fp-as-gp, -mforbid-fp-as-gp, and -mex9 options.

Although these changes are target-specific, I think it would be better to give others a chance to comment, if they wish, on the format and layout. If there are no other comments on this patch, I will commit it to trunk after 24 hours.

Best regards,
jasonwucj

Attachment: 0012-Remove-mforce-fp-as-gp-mforbid-fp-as-gp-and-mex9-opt.patch
[PATCH PR64434]
Hi All,

Here is an updated patch which was redesigned according to Richard's review. It swaps the operands of commutative operations so that the expensive one is expanded first.

Bootstrap and regression testing did not show any new failures.

Is it OK for trunk?

gcc/ChangeLog

2015-01-14 Yuri Rumyantsev

PR tree-optimization/64434
* cfgexpand.c (reorder_operands): New function.
(expand_gimple_basic_block): Insert call of reorder_operands.

gcc/testsuite/ChangeLog

* gcc.dg/torture/pr64434.c: New test.

Attachment: patch
[PATCH] [ARM] Tune the max_cond_insns/branch_cost for Cortex-M7
Hi, This patch is tuned particularly for benchmark performance on cortex-m7. Tested with GCC regression test, no regressions. Is it ok for trunk? BR, Hale Wang gcc/ChangeLog 2014-12-24 Hale Wang * config/arm/arm.c: Tune the max_cond_insns/branch_cost for Cortex-M7. diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c index 8193bf1..d52fcbd 100644 --- a/gcc/config/arm/arm.c +++ b/gcc/config/arm/arm.c @@ -287,6 +287,7 @@ static unsigned int arm_autovectorize_vector_sizes (void); static int arm_default_branch_cost (bool, bool); static int arm_cortex_a5_branch_cost (bool, bool); static int arm_cortex_m_branch_cost (bool, bool); +static int arm_cortex_m7_branch_cost (bool, bool); static bool arm_vectorize_vec_perm_const_ok (machine_mode vmode, const unsigned char *sel); @@ -1967,10 +1968,10 @@ const struct tune_params arm_cortex_m7_tune = &v7m_extra_costs, NULL,/* Sched adj cost. */ 0, /* Constant limit. */ - 0, /* Max cond insns. */ + 1, /* Max cond insns. */ ARM_PREFETCH_NOT_BENEFICIAL, true,/* Prefer constant pool. */ - arm_cortex_m_branch_cost, + arm_cortex_m7_branch_cost, false, /* Prefer LDRD/STRD. */ {true, true},/* Prefer non short circuit. */ &arm_default_vec_cost,/* Vectorizer costs. */ @@ -12015,6 +12016,12 @@ arm_cortex_m_branch_cost (bool speed_p, bool predictable_p) : arm_default_branch_cost (speed_p, predictable_p); } +static int +arm_cortex_m7_branch_cost (bool speed_p, bool predictable_p) +{ + return speed_p ? 0 : arm_default_branch_cost (speed_p, predictable_p); +} + static bool fp_consts_inited = false; static REAL_VALUE_TYPE value_fp0; cortex-m7-branch-cost.patch-3 Description: Binary data
Re: [Patch, ARM]Update GCC to generate Tag_ABI_HardFP_use per the latest EABI doc
On 14/01/15 09:54, Terry Guo wrote: > Hi there, > > According to the latest EABI at > http://infocenter.arm.com/help/topic/com.arm.doc.ihi0045d/IHI0045D_ABI_adden > da.pdf, the new definition of Tag_ABI_HardFP_use is as below: > > Tag_ABI_HardFP_use, (=27), uleb128 > 0 The user intended that FP use should be implied by Tag_FP_arch > 1 The user intended this code to execute on the single-precision variant > derived from Tag_FP_arch > 2 Reserved > 3 The user intended that FP use should be implied by Tag_FP_arch > (Note: This is a deprecated duplicate of the default encoded by 0) > You don't need to emit tags that have the value 0. That's the default for missing tags. So you only need if (TARGET_HARD_FLOAT && TARGET_VFP_SINGLE) => Tag_ABI_HardFP_use = 1 OK with that change. R. > The attached patch intends to update gcc to conform this definition. Tested > with GCC regression test, no regressions. Is it OK? > > BR, > Terry > > 2015-01-14 Terry Guo > >* config/arm/arm.c (arm_file_start): Update the assignment of > Tag_ABI_HardFP_use. > > > gcc-update-Tag_ABI_HardFP_use-v2.txt > > > diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c > index 0ec526b..378bed9 100644 > --- a/gcc/config/arm/arm.c > +++ b/gcc/config/arm/arm.c > @@ -25576,7 +25576,13 @@ arm_file_start (void) > if (arm_fpu_desc->model == ARM_FP_MODEL_VFP) > { > if (TARGET_HARD_FLOAT) > - arm_emit_eabi_attribute ("Tag_ABI_HardFP_use", 27, 3); > + { > + if (TARGET_VFP_SINGLE) > + arm_emit_eabi_attribute ("Tag_ABI_HardFP_use", 27, 1); > + else > + arm_emit_eabi_attribute ("Tag_ABI_HardFP_use", 27, 0); > + } > + > if (TARGET_HARD_FLOAT_ABI) > arm_emit_eabi_attribute ("Tag_ABI_VFP_args", 28, 1); > } >
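The selection logic Richard suggests can be modelled outside GCC as a tiny decision function. This is only a sketch under stated assumptions: the two boolean parameters stand in for the TARGET_HARD_FLOAT and TARGET_VFP_SINGLE macros, the function name is invented, and the -1 return is this sketch's way of saying "emit nothing", relying on 0 being the default for a missing tag as stated in the reply above.

```c
#include <assert.h>
#include <stdbool.h>

/* Tag number of Tag_ABI_HardFP_use, as quoted from the EABI addenda.  */
enum { TAG_ABI_HARDFP_USE = 27 };

/* Returns the value to emit for Tag_ABI_HardFP_use, or -1 when no
   attribute needs to be emitted.  HARD_FLOAT and VFP_SINGLE are
   hypothetical stand-ins for TARGET_HARD_FLOAT and TARGET_VFP_SINGLE.  */
static int hardfp_use_tag_value(bool hard_float, bool vfp_single)
{
    if (hard_float && vfp_single)
        return 1;   /* single-precision variant derived from Tag_FP_arch */
    return -1;      /* value 0 is implied by the absence of the tag */
}
```

Compared with the posted diff, this drops the explicit emission of value 0 and never produces the deprecated value 3.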
Re: [PATCH] [ARM] Tune the max_cond_insns/branch_cost for Cortex-M7
On 14/01/15 10:14, Hale Wang wrote: Hi, This patch is tuned particularly for benchmark performance on cortex-m7. Tested with GCC regression test, no regressions. Is it ok for trunk? BR, Hale Wang gcc/ChangeLog 2014-12-24 Hale Wang * config/arm/arm.c: Tune the max_cond_insns/branch_cost for Cortex-M7. diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c index 8193bf1..d52fcbd 100644 --- a/gcc/config/arm/arm.c +++ b/gcc/config/arm/arm.c @@ -287,6 +287,7 @@ static unsigned int arm_autovectorize_vector_sizes (void); static int arm_default_branch_cost (bool, bool); static int arm_cortex_a5_branch_cost (bool, bool); static int arm_cortex_m_branch_cost (bool, bool); +static int arm_cortex_m7_branch_cost (bool, bool); static bool arm_vectorize_vec_perm_const_ok (machine_mode vmode, const unsigned char *sel); @@ -1967,10 +1968,10 @@ const struct tune_params arm_cortex_m7_tune = &v7m_extra_costs, NULL, /* Sched adj cost. */ 0, /* Constant limit. */ - 0, /* Max cond insns. */ + 1, /* Max cond insns. */ ARM_PREFETCH_NOT_BENEFICIAL, true, /* Prefer constant pool. */ - arm_cortex_m_branch_cost, + arm_cortex_m7_branch_cost, false, /* Prefer LDRD/STRD. */ {true, true}, /* Prefer non short circuit. */ &arm_default_vec_cost,/* Vectorizer costs. */ @@ -12015,6 +12016,12 @@ arm_cortex_m_branch_cost (bool speed_p, bool predictable_p) : arm_default_branch_cost (speed_p, predictable_p); } +static int +arm_cortex_m7_branch_cost (bool speed_p, bool predictable_p) +{ + return speed_p ? 0 : arm_default_branch_cost (speed_p, predictable_p); +} + static bool fp_consts_inited = false; static REAL_VALUE_TYPE value_fp0; OK. Ramana
[PATCH, ARM] Fix PR64453: live high register not saved in function prolog with -Os
When compiling for size, live high registers are not saved in the function
prolog by the ARM backend in Thumb mode. The problem comes from
arm_conditional_register_usage setting call_used_regs for all high registers
to avoid them being allocated. However, this causes the prolog not to save
these registers even if they are used. This patch marks high registers as
really needing to be saved in the prolog if live, whatever the content of
call_used_regs is.

ChangeLog entries are as follows:

gcc/ChangeLog

2015-01-12  Thomas Preud'homme  thomas.preudho...@arm.com

    PR target/64453
    * config/arm/arm.c (callee_saved_reg_p): Define.
    (arm_compute_save_reg0_reg12_mask): Use callee_saved_reg_p to check if
    register is callee saved instead of !call_used_regs[reg].
    (thumb1_compute_save_reg_mask): Likewise.

gcc/testsuite/ChangeLog

2014-12-31  Thomas Preud'homme  thomas.preudho...@arm.com

    * gcc.target/arm/pr64453.c: New.

diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 0ec526b..fcc14c2 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -18989,6 +18989,14 @@ output_ascii_pseudo_op (FILE *stream, const unsigned char *p, int len)
   fputs ("\"\n", stream);
 }

+/* Whether a register is callee saved or not. This is necessary because high
+   registers are marked as caller saved when optimizing for size on Thumb-1
+   targets despite being callee saved in order to avoid using them. */
+#define callee_saved_reg_p(reg) \
+  (!call_used_regs[reg] \
+   || (TARGET_THUMB1 && optimize_size \
+       && reg >= FIRST_HI_REGNUM && reg <= LAST_HI_REGNUM))
+
 /* Compute the register save mask for registers 0 through 12
    inclusive. This code is used by arm_compute_save_reg_mask. */

@@ -19049,7 +19057,7 @@ arm_compute_save_reg0_reg12_mask (void)
       /* In the normal case we only need to save those registers
          which are call saved and which are used by this function. */
       for (reg = 0; reg <= 11; reg++)
-        if (df_regs_ever_live_p (reg) && ! call_used_regs[reg])
+        if (df_regs_ever_live_p (reg) && callee_saved_reg_p (reg))
           save_reg_mask |= (1 << reg);

       /* Handle the frame pointer as a special case. */
@@ -19212,7 +19220,7 @@ thumb1_compute_save_reg_mask (void)
   mask = 0;
   for (reg = 0; reg < 12; reg ++)
-    if (df_regs_ever_live_p (reg) && !call_used_regs[reg])
+    if (df_regs_ever_live_p (reg) && callee_saved_reg_p (reg))
       mask |= 1 << reg;

   if (flag_pic
@@ -19245,7 +19253,7 @@ thumb1_compute_save_reg_mask (void)
       if (reg * UNITS_PER_WORD <= (unsigned) arm_size_return_regs ())
         reg = LAST_LO_REGNUM;

-      if (! call_used_regs[reg])
+      if (callee_saved_reg_p (reg))
         mask |= 1 << reg;
     }

@@ -27185,8 +27193,7 @@ arm_conditional_register_usage (void)
       /* When optimizing for size on Thumb-1, it's better not
          to use the HI regs, because of the overhead of
          stacking them. */
-      for (regno = FIRST_HI_REGNUM;
-           regno <= LAST_HI_REGNUM; ++regno)
+      for (regno = FIRST_HI_REGNUM; regno <= LAST_HI_REGNUM; ++regno)
         fixed_regs[regno] = call_used_regs[regno] = 1;
     }

diff --git a/gcc/testsuite/gcc.target/arm/pr64453.c b/gcc/testsuite/gcc.target/arm/pr64453.c
new file mode 100644
index 000..17155af
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/pr64453.c
@@ -0,0 +1,9 @@
+/* { dg-do compile } */
+/* { dg-options "-mthumb -Os " } */
+/* { dg-require-effective-target arm_thumb1_ok } */
+
+void save_regs () {
+  __asm volatile ("" ::: "r8");
+}
+
+/* { dg-final { scan-assembler "\tmov\tr., r8" } } */

Tested by compiling an arm-none-eabi GCC cross-compiler and running the
testsuite on QEMU emulating Cortex-M0 without any regression.

Is this ok for trunk?

Best regards,

Thomas
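As an illustration of the predicate this patch introduces, here is a minimal standalone sketch in C. It is not the GCC implementation: the struct, the function names and the register numbers (8 to 11 for the high registers) are assumptions made for the example, and only the shape of the test follows callee_saved_reg_p from the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative constants, assumed for this sketch only.  */
enum { FIRST_HI_REGNUM = 8, LAST_HI_REGNUM = 11, NUM_REGS = 16 };

/* Hypothetical stand-in for the target state the backend consults.  */
struct target_state {
    bool thumb1;               /* models TARGET_THUMB1 */
    bool optimize_size;        /* models optimize_size (-Os) */
    bool call_used[NUM_REGS];  /* models call_used_regs[] */
};

/* A register must be saved by the callee if it is not call-used, or if it
   is a high register that was only marked call-used to keep the allocator
   away from it when optimizing for size on Thumb-1.  */
static bool callee_saved(const struct target_state *t, int reg)
{
    assert(reg >= 0 && reg < NUM_REGS);
    return !t->call_used[reg]
           || (t->thumb1 && t->optimize_size
               && reg >= FIRST_HI_REGNUM && reg <= LAST_HI_REGNUM);
}

/* Build a save mask for registers 0..11 that are both live and callee
   saved, in the spirit of thumb1_compute_save_reg_mask.  */
static unsigned save_mask(const struct target_state *t,
                          const bool live[NUM_REGS])
{
    unsigned mask = 0;
    for (int reg = 0; reg < 12; reg++)
        if (live[reg] && callee_saved(t, reg))
            mask |= 1u << reg;
    return mask;
}
```

With r8 live and the high registers marked call-used, the mask contains bit 8 only while both the Thumb-1 and size-optimization flags are set; a plain !call_used test would leave the mask empty in that case.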
Re: [PATCH PR64434]
On Wed, Jan 14, 2015 at 01:12:20PM +0300, Yuri Rumyantsev wrote:
> Hi All,
>
> Here is updated patch which was redesigned accordingly to Richard review.
> It performs swapping operands of commutative operations to expand the
> expensive one first.
>
> Bootstrap and regression testing did not show any new failures.
>
> Is it OK for trunk?

Haven't you just reposted the patch from December? I don't see any changes
from then...

> gcc/ChangeLog
> 2015-01-14 Yuri Rumyantsev
>
> PR tree-optimization/64434
> * cfgexpand.c (reorder_operands): New function.
> (expand_gimple_basic_block): Insert call of reorder_operands.
>
> gcc/testsuite/ChangeLog
> * gcc.dg/torture/pr64434.c: New test.

Jakub
Re: [PATCH PR64434]
Sorry, I am resending the correct patch.

Yuri.

2015-01-14 13:23 GMT+03:00 Jakub Jelinek:
> On Wed, Jan 14, 2015 at 01:12:20PM +0300, Yuri Rumyantsev wrote:
>> Hi All,
>>
>> Here is updated patch which was redesigned accordingly to Richard review.
>> It performs swapping operands of commutative operations to expand the
>> expensive one first.
>>
>> Bootstrap and regression testing did not show any new failures.
>>
>> Is it OK for trunk?
>
> Haven't you just reposted the patch from December? I don't see any changes
> from then...
>
>> gcc/ChangeLog
>> 2015-01-14 Yuri Rumyantsev
>>
>> PR tree-optimization/64434
>> * cfgexpand.c (reorder_operands): New function.
>> (expand_gimple_basic_block): Insert call of reorder_operands.
>>
>> gcc/testsuite/ChangeLog
>> * gcc.dg/torture/pr64434.c: New test.
>
> Jakub

[attachment: patch (binary data)]
[PATCH][doc][ARM] Deprecate -mapcs and -mapcs-frame
Hi all,

-mapcs-frame (and its alias -mapcs) are somewhat bitrotten and the ABI they
represent is deprecated anyway, so this is a patch to deprecate the option.
It's not being removed here, just documented as deprecated.

Kyrill

2015-01-14  Kyrylo Tkachov

    * doc/invoke.texi (mapcs): Mention deprecation.
    (mapcs-frame): Likewise.

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index d2f3c79..7a72120 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -12683,10 +12683,11 @@ Standard for all functions, even if this is not strictly necessary for
 correct execution of the code. Specifying @option{-fomit-frame-pointer}
 with this option causes the stack frames not to be generated for
 leaf functions. The default is @option{-mno-apcs-frame}.
+This option is deprecated.

 @item -mapcs
 @opindex mapcs
-This is a synonym for @option{-mapcs-frame}.
+This is a synonym for @option{-mapcs-frame} and is deprecated.

 @ignore
 @c not currently implemented
[PATCH][ARM][tests] Add -march=armv6 to tests that have -mapcs
Hi all,

I'm proposing a gas patch to add warnings for deprecated forms of ldm in
ARMv7-A (https://sourceware.org/ml/binutils/2015-01/msg00158.html).
Unfortunately, GCC generates these deprecated forms with -mapcs-frame.
Since the deprecation happens only from armv7-a onwards, this patch adds
-march=armv6 to the tests that use -mapcs so that the assembler doesn't
complain.

-mapcs is a pretty old option that was not intended to be used with newer
architecture levels anyway, as far as I can tell.

Ok for trunk?

Thanks,
Kyrill

2015-01-14  Kyrylo Tkachov

    * gcc.target/arm/neon-nested-apcs.c: Add -march=armv6 to options.
    * gcc.target/arm/nested-apcs.c: Likewise.
    * gcc.target/arm/pr60264.c: Likewise.

diff --git a/gcc/testsuite/gcc.target/arm/neon-nested-apcs.c b/gcc/testsuite/gcc.target/arm/neon-nested-apcs.c
index cd92d7d..453096f 100644
--- a/gcc/testsuite/gcc.target/arm/neon-nested-apcs.c
+++ b/gcc/testsuite/gcc.target/arm/neon-nested-apcs.c
@@ -1,6 +1,6 @@
 /* { dg-do run } */
 /* { dg-require-effective-target arm_neon_hw } */
-/* { dg-options "-fno-omit-frame-pointer -mapcs-frame -O" }
+/* { dg-options "-fno-omit-frame-pointer -mapcs-frame -O -march=armv6" } */
 /* { dg-add-options arm_neon } */

 extern void abort (void);
diff --git a/gcc/testsuite/gcc.target/arm/nested-apcs.c b/gcc/testsuite/gcc.target/arm/nested-apcs.c
index 9dac304..ee89b3c 100644
--- a/gcc/testsuite/gcc.target/arm/nested-apcs.c
+++ b/gcc/testsuite/gcc.target/arm/nested-apcs.c
@@ -1,5 +1,5 @@
 /* { dg-do run } */
-/* { dg-options "-fno-omit-frame-pointer -mapcs-frame -O" } */
+/* { dg-options "-fno-omit-frame-pointer -mapcs-frame -O -march=armv6" } */

 extern void abort (void);

diff --git a/gcc/testsuite/gcc.target/arm/pr60264.c b/gcc/testsuite/gcc.target/arm/pr60264.c
index 4fe6aed..b25ef2a 100644
--- a/gcc/testsuite/gcc.target/arm/pr60264.c
+++ b/gcc/testsuite/gcc.target/arm/pr60264.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-mapcs -g" } */
+/* { dg-options "-mapcs -g -march=armv6" } */

 double bar(void);
[PATCH][wwwdocs] Mention deprecation of -mapcs and -mapcs-frame on arm
Hi all,

This is a wwwdocs patch to changes.html to announce the deprecation of the
-mapcs and -mapcs-frame options.

Ok to apply if we decide to deprecate it?

Thanks,
Kyrill

Index: htdocs/gcc-5/changes.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-5/changes.html,v
retrieving revision 1.50
diff -U 3 -r1.50 changes.html
--- htdocs/gcc-5/changes.html 10 Dec 2014 00:28:18 - 1.50
+++ htdocs/gcc-5/changes.html 10 Dec 2014 12:20:01 -
@@ -421,6 +421,9 @@
 The deprecated option -mwords-little-endian has been removed.

+ The options relating to the old ABI -mapcs and
+ -mapcs-frame are deprecated.
+
[PATCH][AArch64] Error out of arm_neon.h if nofp/nosimd
Hi all,

In the arm version of arm_neon.h we error out early if the user tries to use
that header without NEON. In aarch64, AdvancedSIMD is available by default
unless the user explicitly disables it. Still, it would be more helpful if
we could just error out gracefully instead of dumping a long stream of type
errors and other black magic in case the user disables AdvancedSIMD
explicitly.

So, similar to arm_neon.h in the arm port, we error out in aarch64 as well.

Checked that all arm_neon.h-related tests work as before.

Ok for trunk?

Thanks,
Kyrill

2015-01-14  Kyrylo Tkachov

    * config/aarch64/arm_neon.h: Error out if NEON is not available.

2015-01-14  Kyrylo Tkachov

    * gcc.target/aarch64/arm_neon-nosimd-error.c: New test.

commit 3c3d56e01ec55f8387bf447e57cdc7f94b0e119b
Author: Kyrylo Tkachov
Date:   Thu Dec 11 15:09:56 2014 +

    [AArch64] Error out of arm_neon.h if nofp/nosimd

diff --git a/gcc/config/aarch64/arm_neon.h b/gcc/config/aarch64/arm_neon.h
index 319cd8c..22dfb0b 100644
--- a/gcc/config/aarch64/arm_neon.h
+++ b/gcc/config/aarch64/arm_neon.h
@@ -27,6 +27,10 @@
 #ifndef _AARCH64_NEON_H_
 #define _AARCH64_NEON_H_

+#ifndef __ARM_NEON
+#error You must enable AdvancedSIMD instructions to use arm_neon.h
+#else
+
 #include

 #define __AARCH64_UINT64_C(__C) ((uint64_t) __C)
@@ -25209,3 +25213,5 @@ __INTERLEAVE_LIST (zip)
 #undef __aarch64_vdupq_laneq_u64

 #endif
+
+#endif
diff --git a/gcc/testsuite/gcc.target/aarch64/arm_neon-nosimd-error.c b/gcc/testsuite/gcc.target/aarch64/arm_neon-nosimd-error.c
new file mode 100644
index 000..6c508ec
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/arm_neon-nosimd-error.c
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-options "-mgeneral-regs-only" } */
+/* { dg-excess-errors "You must enable" } */
+
+#include "arm_neon.h"
+
+int
+foo ()
+{
+  return 0;
+}
Re: [PATCH PR64434]
On Wed, Jan 14, 2015 at 01:32:13PM +0300, Yuri Rumyantsev wrote:
> Sorry, I resend correct patch.

> +reorder_operands (basic_block bb)
> +{
> +  unsigned int *lattice; /* Hold cost of each statement. */
> +  unsigned int i = 0, n = 0;
> +  gimple_stmt_iterator gsi;
> +  gimple_seq stmts;
> +  gimple stmt;
> +  bool swap;
> +  tree op0, op1;
> +  ssa_op_iter iter;
> +  use_operand_p use_p;
> +  enum tree_code code;
> +  gimple def0, def1;
> +
> +  if (!optimize)
> +    return;

Wouldn't it be better to move the !optimize guard to the caller?

> +  /* Compute cost of each statement using estimate_num_insns. */
> +  stmts = bb_seq (bb);
> +  for (gsi = gsi_start (stmts); !gsi_end_p (gsi); gsi_next (&gsi))
> +    {
> +      stmt = gsi_stmt (gsi);
> +      gimple_set_uid (stmt, n++);
> +    }
> +  lattice = XALLOCAVEC (unsigned, n);

I'd be afraid that for very large functions you'd ICE here, stack is far
more limited than heap on many hosts. Either give up if n is say .5 million
or above, or allocate from heap in that case?

> +  for (gsi = gsi_start (stmts); !gsi_end_p (gsi); gsi_next (&gsi))
> +    {
> +      unsigned cost;
> +      stmt = gsi_stmt (gsi);
> +      cost = estimate_num_insns (stmt, &eni_size_weights);
> +      lattice[i] = cost;
> +      FOR_EACH_SSA_USE_OPERAND (use_p, stmt, iter, SSA_OP_USE | SSA_OP_VUSE)

Why the SSA_OP_VUSE?

> +      if (gimple_code (stmt) != GIMPLE_ASSIGN

!is_gimple_assign (stmt)
instead

> +      if (op0 ==NULL_TREE || op1 == NULL_TREE

Missing space after ==.

> +      /* Swap operands if the second one is more expensive. */
> +      def0 = get_gimple_for_ssa_name (op0);
> +      if (!def0)
> +        continue;
> +      def1 = get_gimple_for_ssa_name (op1);
> +      if (!def1)
> +        continue;
> +      swap = false;

You don't check here if def0/def1 are from the same bb, is that guaranteed?

> +      if (swap)
> +        {
> +          if (dump_file && (dump_flags & TDF_DETAILS))
> +            {
> +              fprintf (dump_file, "Swap operands in stmt:\n");
> +              print_gimple_stmt (dump_file, stmt, 0, TDF_SLIM);
> +            }
> +          gimple_set_op (stmt, 1, op1);
> +          gimple_set_op (stmt, 2, op0);

update_stmt (stmt); ?

> Index: testsuite/gcc.dg/torture/pr64434.c
> ===
> --- testsuite/gcc.dg/torture/pr64434.c(revision 0)
> +++ testsuite/gcc.dg/torture/pr64434.c(working copy)
>
> Property changes on: testsuite/gcc.dg/torture/pr64434.c
> ___
> Added: svn:executable

Please don't make testcases executable.

> ## -0,0 +1 ##
> +*
> \ No newline at end of property

Please avoid these, terminate with newline.

Jakub
[PATCH][testsuite] Fix oversized bitfield warning.
Hello,

Test case g++.dg/torture/20141013.C (added
https://gcc.gnu.org/ml/gcc-patches/2014-10/msg01190.html) triggers the
warning

  20141013.C:45:23: warning: width of 'tree_base::code' exceeds its type

on arm-none-eabi. The code specifies a bitfield of size 16 with an enum as
the underlying type. On arm-none-eabi, enums are packed by default
(-fshort-enums) so the bitfield is oversized and the warning is correct.
This patch adds -fno-short-enums to the compiler options for the test case.

Testing: Ran g++.dg/torture/dg-torture.exp for arm-none-eabi and
arm-none-linux-gnueabihf.

Matthew

2015-01-13  Matthew Wahab

    * testsuite/g++.dg/torture/20141013.C: Set -fno-short-enums.

diff --git a/gcc/testsuite/g++.dg/torture/20141013.C b/gcc/testsuite/g++.dg/torture/20141013.C
index 529ef0965e4472e5038a9e6d09f13ca4f4a05954..82aacd6317eb3cd66b2839347a708ae0f4787efc 100644
--- a/gcc/testsuite/g++.dg/torture/20141013.C
+++ b/gcc/testsuite/g++.dg/torture/20141013.C
@@ -1,3 +1,4 @@
+/* { dg-options "-fno-short-enums" } */
 enum
 {
   _sch_isdigit = 0x0004,
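The arithmetic behind the warning can be sketched in a few lines of C. The helper below is hypothetical and written only for this note; it assumes the enum ends up one byte wide under -fshort-enums, which the message implies but does not state.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Returns true when a bit-field of WIDTH_BITS bits fits in a type that
   occupies TYPE_SIZE bytes.  Illustrative helper, not part of GCC.  */
static bool bitfield_fits(size_t type_size, unsigned width_bits)
{
    assert(type_size > 0);
    return width_bits <= type_size * CHAR_BIT;
}
```

For a one-byte enum, bitfield_fits (1, 16) is false on the usual 8-bit-byte targets, which matches the "exceeds its type" diagnostic. With -fno-short-enums the enum is at least as wide as int, so bitfield_fits (sizeof (int), 16) holds.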
[PATCH] Fix PR59354
The following fixes PR59354 by treating loads from a group larger than the
SLP size as having gaps if the loop is unrolled. This usually means we
combine parts of the group of different unroll iterations, which effectively
introduces a gap.

Of course in the end this check should be postponed and we should split
groups according to uses in SLP instances after SLP discovery... (but not
for GCC 5)

Bootstrap and regtest running on x86_64-unknown-linux-gnu.

Richard.

2015-01-14  Richard Biener

    PR tree-optimization/59354
    * tree-vect-slp.c (vect_build_slp_tree_1): Treat loads from groups
    larger than the slp group size as having gaps.

    * gcc.dg/vect/pr59354.c: New testcase.

Index: gcc/tree-vect-slp.c
===
--- gcc/tree-vect-slp.c (revision 219581)
+++ gcc/tree-vect-slp.c (working copy)
@@ -729,8 +729,11 @@ vect_build_slp_tree_1 (loop_vec_info loo
                  ??? We should enhance this to only disallow gaps
                  inside vectors. */
               if ((unrolling_factor > 1
-                   && GROUP_FIRST_ELEMENT (vinfo_for_stmt (stmt)) == stmt
-                   && GROUP_GAP (vinfo_for_stmt (stmt)) != 0)
+                   && ((GROUP_FIRST_ELEMENT (vinfo_for_stmt (stmt)) == stmt
+                        && GROUP_GAP (vinfo_for_stmt (stmt)) != 0)
+                       /* If the group is split up then GROUP_GAP
+                          isn't correct here, nor is GROUP_FIRST_ELEMENT. */
+                       || GROUP_SIZE (vinfo_for_stmt (stmt)) > group_size))
                   || (GROUP_FIRST_ELEMENT (vinfo_for_stmt (stmt)) != stmt
                       && GROUP_GAP (vinfo_for_stmt (stmt)) != 1))
                 {
Index: gcc/testsuite/gcc.dg/vect/pr59354.c
===
--- gcc/testsuite/gcc.dg/vect/pr59354.c (revision 0)
+++ gcc/testsuite/gcc.dg/vect/pr59354.c (working copy)
@@ -0,0 +1,34 @@
+/* { dg-do run } */
+
+#include "tree-vect.h"
+
+void abort (void);
+
+unsigned int a[256];
+unsigned char b[256];
+
+int main()
+{
+  int i, z, x, y;
+
+  check_vect ();
+
+  for(i = 0; i < 256; i++)
+    {
+      a[i] = i % 5;
+      __asm__ volatile ("");
+    }
+
+  for (z = 0; z < 16; z++)
+    for (y = 0; y < 4; y++)
+      for (x = 0; x < 4; x++)
+        b[y*64 + z*4 + x] = a[z*16 + y*4 + x];
+
+  if (b[4] != 1)
+    abort ();
+
+  return 0;
+}
+
+/* { dg-final { scan-tree-dump "vect" "vectorized 1 loop" { target { vect_pack_trunc } } } } */
+/* { dg-final { cleanup-tree-dump "vect" } } */
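The boolean structure of the patched condition can be read more easily as a standalone predicate. The following C sketch is only a model of it: the struct and its field names are hypothetical stand-ins for the GROUP_* accessors, and nothing here is taken from the vectorizer beyond the shape of the condition in the diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified, hypothetical view of one load in an interleaving group.  */
struct load_info {
    bool is_first;        /* models GROUP_FIRST_ELEMENT (...) == stmt */
    unsigned gap;         /* models GROUP_GAP */
    unsigned group_size;  /* models GROUP_SIZE of the whole group */
};

/* Mirrors the patched test: when the loop is unrolled, a load is treated
   as having a gap either because the first element records one, or
   because its group is larger than the SLP group size.  */
static bool load_has_gap(const struct load_info *ld,
                         unsigned unrolling_factor,
                         unsigned slp_group_size)
{
    assert(ld != NULL);
    return (unrolling_factor > 1
            && ((ld->is_first && ld->gap != 0)
                || ld->group_size > slp_group_size))
           || (!ld->is_first && ld->gap != 1);
}
```

In this model a first element with no recorded gap in a 16-element group is flagged once the loop is unrolled and the SLP group size is 4, which is the case the patch adds.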
[PATCH] PR64377
Hello.

The following patch introduces target option support for array types. As
discussed in [1], the patch is tested on a nios2 target machine.

Ready for trunk?

Thanks,
Martin

[1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64377

From 8f32501a6ea3989a44b76fc3ead2f70d1b636b7a Mon Sep 17 00:00:00 2001
From: mliska
Date: Fri, 9 Jan 2015 10:31:00 +0100
Subject: [PATCH] Target optimization nodes: add support for arrays.

gcc/ChangeLog:

2015-01-14  Martin Liska

    PR target/64377
    * optc-save-gen.awk: Add support for array types.
---
 gcc/optc-save-gen.awk | 44 ++--
 1 file changed, 42 insertions(+), 2 deletions(-)

diff --git a/gcc/optc-save-gen.awk b/gcc/optc-save-gen.awk
index ebeb509..04db24c 100644
--- a/gcc/optc-save-gen.awk
+++ b/gcc/optc-save-gen.awk
@@ -437,6 +437,7 @@ print " struct cl_target_option const *ptr2 ATTRIBUTE_UNUSED
 print "{";
 n_target_val = 0;
 n_target_str = 0;
+n_target_array = 0;
 for (i = 0; i < n_target_save; i++) {
     var = target_save_decl[i];
@@ -448,8 +449,20 @@ for (i = 0; i < n_target_save; i++) {
     if (target_save_decl[i] ~ "^const char \\*+[_" alnum "]+$")
         var_target_str[n_target_str++] = name;
     else {
-        var_target_val_type[n_target_val] = type;
-        var_target_val[n_target_val++] = name;
+        if (target_save_decl[i] ~ " .*\\[.+\\]+$") {
+            size = name;
+            sub("[^\\[]+\\[", "", size);
+            sub("\\]$", "", size);
+            sub("\\[.+", "", name)
+            sub(" [^ ]+$", "", type)
+            var_target_array[n_target_array] = name
+            var_target_array_type[n_target_array] = type
+            var_target_array_size[n_target_array++] = size
+        }
+        else {
+            var_target_val_type[n_target_val] = type;
+            var_target_val[n_target_val++] = name;
+        }
     }
 }
 if (have_save) {
@@ -484,6 +497,14 @@ for (i = 0; i < n_target_str; i++) {
     print " || strcmp (ptr1->" name", ptr2->" name ")))";
     print "return false;";
 }
+for (i = 0; i < n_target_array; i++) {
+    name = var_target_array[i]
+    size = var_target_array_size[i]
+    type = var_target_array_type[i]
+    print " if (ptr1->" name" != ptr2->" name "";
+    print " || memcmp (ptr1->" name ", ptr2->" name ", " size " * sizeof(" type ")))"
+    print "return false;";
+}
 for (i = 0; i < n_target_val; i++) {
     name = var_target_val[i]
     print " if (ptr1->" name" != ptr2->" name ")";
@@ -507,6 +528,13 @@ for (i = 0; i < n_target_str; i++) {
     print " else";
     print "hstate.add_int (0);";
 }
+for (i = 0; i < n_target_array; i++) {
+    name= var_target_array[i]
+    size = var_target_array_size[i]
+    type = var_target_array_type[i]
+    print " hstate.add_int (" size ");";
+    print " hstate.add (ptr->" name ", sizeof (" type ") * " size ");";
+}
 for (i = 0; i < n_target_val; i++) {
     name = var_target_val[i]
     print " hstate.add_wide_int (ptr->" name");";
@@ -525,6 +553,12 @@ for (i = 0; i < n_target_str; i++) {
     name = var_target_str[i]
     print " bp_pack_string (ob, bp, ptr->" name", true);";
 }
+for (i = 0; i < n_target_array; i++) {
+    name = var_target_array[i]
+    size = var_target_array_size[i]
+    print " for (unsigned i = 0; i < " size "; i++)"
+    print "bp_pack_value (bp, ptr->" name "[i], 64);";
+}
 for (i = 0; i < n_target_val; i++) {
     name = var_target_val[i]
     print " bp_pack_value (bp, ptr->" name", 64);";
@@ -544,6 +578,12 @@ for (i = 0; i < n_target_str; i++) {
     print " if (ptr->" name")";
     print "ptr->" name" = xstrdup (ptr->" name");";
 }
+for (i = 0; i < n_target_array; i++) {
+    name = var_target_array[i]
+    size = var_target_array_size[i]
+    print " for (int i = " size " - 1; i >= 0; i--)"
+    print "ptr->" name "[i] = (" var_target_array_type[i] ") bp_unpack_value (bp, 64);";
+}
 for (i = 0; i < n_target_val; i++) {
     name = var_target_val[i]
     print " ptr->" name" = (" var_target_val_type[i] ") bp_unpack_value (bp, 64);";
--
2.1.2
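To make the array comparison concrete, here is a minimal sketch in C of an equality check over a struct that has an array member. The struct, its field names and the array length are hypothetical and stand in for the generated cl_target_option code; this is not the output of optc-save-gen.awk. The sketch compares the array by content with memcmp alone, since in C applying != to two array members compares their addresses rather than their elements.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

enum { N_COSTS = 4 };  /* hypothetical array length */

/* Hypothetical stand-in for a saved target option record.  */
struct target_opts {
    int scalar;
    int costs[N_COSTS];
};

/* Scalars are compared directly; the array member is compared element by
   element through memcmp over its full size.  */
static bool target_opts_eq(const struct target_opts *a,
                           const struct target_opts *b)
{
    assert(a && b);
    if (a->scalar != b->scalar)
        return false;
    return memcmp(a->costs, b->costs, N_COSTS * sizeof(int)) == 0;
}
```

Two separately stored records with equal contents compare equal here, and changing a single array element makes them differ.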
Re: [PATCH PR64434]
On Wed, Jan 14, 2015 at 11:45 AM, Jakub Jelinek wrote:
> On Wed, Jan 14, 2015 at 01:32:13PM +0300, Yuri Rumyantsev wrote:
>> Sorry, I resend correct patch.
>
>> +reorder_operands (basic_block bb)
>> +{
>> +  unsigned int *lattice; /* Hold cost of each statement. */
>> +  unsigned int i = 0, n = 0;
>> +  gimple_stmt_iterator gsi;
>> +  gimple_seq stmts;
>> +  gimple stmt;
>> +  bool swap;
>> +  tree op0, op1;
>> +  ssa_op_iter iter;
>> +  use_operand_p use_p;
>> +  enum tree_code code;
>> +  gimple def0, def1;
>> +
>> +  if (!optimize)
>> +    return;
>
> Wouldn't it be better to move the !optimize guard to the caller?
>
>> +  /* Compute cost of each statement using estimate_num_insns. */
>> +  stmts = bb_seq (bb);
>> +  for (gsi = gsi_start (stmts); !gsi_end_p (gsi); gsi_next (&gsi))
>> +    {
>> +      stmt = gsi_stmt (gsi);
>> +      gimple_set_uid (stmt, n++);
>> +    }
>> +  lattice = XALLOCAVEC (unsigned, n);
>
> I'd be afraid that for very large functions you'd ICE here, stack is far
> more limited than heap on many hosts. Either give up if n is say .5 million
> or above, or allocate from heap in that case?
>
>> +  for (gsi = gsi_start (stmts); !gsi_end_p (gsi); gsi_next (&gsi))
>> +    {
>> +      unsigned cost;
>> +      stmt = gsi_stmt (gsi);
>> +      cost = estimate_num_insns (stmt, &eni_size_weights);
>> +      lattice[i] = cost;
>> +      FOR_EACH_SSA_USE_OPERAND (use_p, stmt, iter, SSA_OP_USE | SSA_OP_VUSE)
>
> Why the SSA_OP_VUSE?

Shouldn't be needed.

>> +      if (gimple_code (stmt) != GIMPLE_ASSIGN
>
> !is_gimple_assign (stmt)
> instead
>
>> +      if (op0 ==NULL_TREE || op1 == NULL_TREE
>
> Missing space after ==.
>
>> +      /* Swap operands if the second one is more expensive. */
>> +      def0 = get_gimple_for_ssa_name (op0);
>> +      if (!def0)
>> +        continue;
>> +      def1 = get_gimple_for_ssa_name (op1);
>> +      if (!def1)
>> +        continue;
>> +      swap = false;
>
> You don't check here if def0/def1 are from the same bb, is that guaranteed?

I think so - we only TER inside BBs.

>> +      if (swap)
>> +        {
>> +          if (dump_file && (dump_flags & TDF_DETAILS))
>> +            {
>> +              fprintf (dump_file, "Swap operands in stmt:\n");
>> +              print_gimple_stmt (dump_file, stmt, 0, TDF_SLIM);
>> +            }
>> +          gimple_set_op (stmt, 1, op1);
>> +          gimple_set_op (stmt, 2, op0);
>
> update_stmt (stmt); ?

Or rather swap_ssa_operands (stmt, gimple_assign_rhs1_ptr (stmt),
gimple_assign_rhs2_ptr (stmt))

>> Index: testsuite/gcc.dg/torture/pr64434.c
>> ===
>> --- testsuite/gcc.dg/torture/pr64434.c(revision 0)
>> +++ testsuite/gcc.dg/torture/pr64434.c(working copy)
>>
>> Property changes on: testsuite/gcc.dg/torture/pr64434.c
>> ___
>> Added: svn:executable
>
> Please don't make testcases executable.
>
>> ## -0,0 +1 ##
>> +*
>> \ No newline at end of property
>
> Please avoid these, terminate with newline.
>
> Jakub
Re: [PATCH PR64434]
On Wed, Jan 14, 2015 at 11:58:50AM +0100, Richard Biener wrote:
> >> +      /* Swap operands if the second one is more expensive. */
> >> +      def0 = get_gimple_for_ssa_name (op0);
> >> +      if (!def0)
> >> +        continue;
> >> +      def1 = get_gimple_for_ssa_name (op1);
> >> +      if (!def1)
> >> +        continue;
> >> +      swap = false;
> >
> > You don't check here if def0/def1 are from the same bb, is that guaranteed?
>
> I think so - we only TER inside BBs.

But then why check for it a few lines above:

+      def_stmt = get_gimple_for_ssa_name (use);
+      if (!def_stmt || gimple_bb (def_stmt) != bb)

If get_gimple_for_ssa_name != NULL guarantees that gimple_bb of the result
== bb, then even the || gimple_bb (def_stmt) != bb shouldn't be needed.

Jakub
Re: [PATCH][doc][ARM] Deprecate -mapcs and -mapcs-frame
On 14/01/15 10:38, Kyrill Tkachov wrote:
> Hi all,
>
> -mapcs-frame (and its' alias -mapcs) are somewhat bitrotten and the ABI
> they represent is deprecated anyway so this is a patch to deprecate the
> option. It's not being removed here, just documented as deprecated.
>
> Kyrill
>
> 2015-01-14 Kyrylo Tkachov
>
> * doc/invoke.texi (mapcs): Mention deprecation.
> (mapcs-frame): Likewise.
>
>
> arm-mapcs-deprecated.patch
>
>
> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index d2f3c79..7a72120 100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -12683,10 +12683,11 @@ Standard for all functions, even if this is not strictly necessary for
>  correct execution of the code. Specifying @option{-fomit-frame-pointer}
>  with this option causes the stack frames not to be generated for
>  leaf functions. The default is @option{-mno-apcs-frame}.
> +This option is deprecated.
>
>  @item -mapcs
>  @opindex mapcs
> -This is a synonym for @option{-mapcs-frame}.
> +This is a synonym for @option{-mapcs-frame} and is deprecated.
>
>  @ignore
>  @c not currently implemented
>

OK.

I think this needs a mention in the release notes as well.

R.
Re: [PATCH][doc][ARM] Deprecate -mapcs and -mapcs-frame
On 14/01/15 11:13, Richard Earnshaw wrote:
> On 14/01/15 10:38, Kyrill Tkachov wrote:
>> Hi all,
>>
>> -mapcs-frame (and its' alias -mapcs) are somewhat bitrotten and the ABI
>> they represent is deprecated anyway so this is a patch to deprecate the
>> option. It's not being removed here, just documented as deprecated.
>>
>> Kyrill
>>
>> 2015-01-14 Kyrylo Tkachov
>>
>> * doc/invoke.texi (mapcs): Mention deprecation.
>> (mapcs-frame): Likewise.
>>
>>
>> arm-mapcs-deprecated.patch
>>
>>
>> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
>> index d2f3c79..7a72120 100644
>> --- a/gcc/doc/invoke.texi
>> +++ b/gcc/doc/invoke.texi
>> @@ -12683,10 +12683,11 @@ Standard for all functions, even if this is not strictly necessary for
>>  correct execution of the code. Specifying @option{-fomit-frame-pointer}
>>  with this option causes the stack frames not to be generated for
>>  leaf functions. The default is @option{-mno-apcs-frame}.
>> +This option is deprecated.
>>
>>  @item -mapcs
>>  @opindex mapcs
>> -This is a synonym for @option{-mapcs-frame}.
>> +This is a synonym for @option{-mapcs-frame} and is deprecated.
>>
>>  @ignore
>>  @c not currently implemented
>>
> OK.
>
> I think this needs a mention in the release notes as well.

Thanks, the wwwdocs patch is
https://gcc.gnu.org/ml/gcc-patches/2015-01/msg01003.html

Kyrill

> R.