[PATCH][4.8 backport] S/390: Transactional Execution support
Hi, I would like to apply backports of the TX support patches to 4.8 branch. We released 4.8 with EC12 support already but the support is somewhat incomplete without having TX. With these patches we will have full EC12 support in 4.8. The patches apply without changes but I had to add some configure additions for auxval from Peter Bergners patch for Power. 22/04/13 [PATCH] S/390: Initial libitm support http://gcc.gnu.org/ml/gcc-patches/2013-04/msg01214.html 21/06/13 [PATCH] S/390: Hardware transactional memory support http://gcc.gnu.org/ml/gcc-patches/2013-06/msg01267.html 17/07/13 Re: [PATCH] S/390: Hardware transactional memory support http://gcc.gnu.org/ml/gcc-patches/2013-07/msg00654.html 24/07/13 S/390: libitm: Use .machine to enable the HTM instructions in the assembler http://gcc.gnu.org/ml/gcc-patches/2013-07/msg01074.html Any veto? Torvald, could you please consider adding backports of the following patches to the 4.8 branch? 19/06/13 [patch] libitm: Fix handling of reentrancy in the HTM fastpath http://gcc.gnu.org/ml/gcc-patches/2013-06/msg01132.html 20/06/13 Re: [patch] libitm: Fix handling of reentrancy in the HTM fastpath http://gcc.gnu.org/ml/gcc-patches/2013-06/msg01188.html Thanks! Bye, -Andreas-
Re: [AARCH64][Insn classification unification 1/N] Define "type" attribute for all patterns
On 31 July 2013 15:55, Sofiane Naci wrote: > Hi, > > This patch is the first of a series of patches that aim to unify instruction > classification between the ARM and AARCH64 backends. > > This patch updates the definition of the "type" attribute, used in the ARM > backend, in the AARCH64 backend and defines a "type" for each AARCH64 > instruction. This will allow pipeline descriptions from the ARM backend to > be also used in AARCH64. > > OK for trunk? OK /Marcus
Re: [AARCH64][Insn classification unification 2/N] Add cortex-a53 pipeline description
On 31 July 2013 15:56, Sofiane Naci wrote: > Hi, > > This patch is part of the ongoing work to unify instruction classification > between the ARM and AARCH64 backends. > > This patch wires up the cortex-a53 pipeline description defined in the ARM > backend to be used in the AARCH46 backend. > > OK for trunk? OK /Marcus
Re: [PATCH] PR32219, weak hidden reference segfault [PING^2]
On 14 July 2013 19:43, Diego Novillo wrote: > On Sun, Jul 14, 2013 at 2:08 AM, Chung-Lin Tang > wrote: >> Ping. > > Could you please repost the patch with its description? This thread > is sufficiently old and noisy that I'm not even sure what the patch > does nor why. Chung-Lin Tang, can you regtest and repost the patch please? TIA, > > > Thanks. Diego.
Re: [PATCH] PR32219, weak hidden reference segfault [PING^2]
On 13/8/1 5:16 PM, Bernhard Reutner-Fischer wrote: > On 14 July 2013 19:43, Diego Novillo wrote: >> On Sun, Jul 14, 2013 at 2:08 AM, Chung-Lin Tang >> wrote: >>> Ping. >> >> Could you please repost the patch with its description? This thread >> is sufficiently old and noisy that I'm not even sure what the patch >> does nor why. > > Chung-Lin Tang, can you regtest and repost the patch please? > > TIA, I'll re-explain the patch later as Diego has requested, maybe this weekend. Thanks, Chung-Lin >> >> >> Thanks. Diego.
Re: [PING] Re: [C++ Patch] for c++/54537
Hi, On 07/31/2013 10:01 PM, Peter Bergner wrote: Hi Fabien, Can you tell me what the status of the following patch that removes the pow() overload from tr1 is? Specifically: http://gcc.gnu.org/ml/gcc-patches/2012-11/msg01166.html It seemed to have been approved and you were going to do some additional testing before committing the patch last November, but the patch still hasn't been committed yet. Peter, please build & test the patch once more and commit it and close the bug. Thanks, Paolo.
[PING] [C++ Patch] Remove finish_stmt
Hi, gently pinging this small clean-up: http://gcc.gnu.org/ml/gcc-patches/2013-06/msg00905.html Thanks! Paolo.
Re: Fix ICE when profiles are mismatched
> Honza, > > After this patch, I see new testsuite failures on PowerPC related to > profiling. > > FAIL: gcc.dg/tree-prof/ic-misattribution-1.c scan-ipa-dump profile > "hist->count 1 hist->all 1" > FAIL: gcc.dg/tree-prof/indir-call-prof.c scan-ipa-dump profile > "Indirect call -> direct call.* a1 transformation on insn" Oops, thanks! This silly bug took us quite a some time with Martin. Depending on memory allocator it depends if aact_count[8] is non-NULL. Index: ChangeLog === --- ChangeLog (revision 201400) +++ ChangeLog (working copy) @@ -1,3 +1,7 @@ +2013-08-01 Jan Hubicka + + * profile.c (compute_value_histograms): Fix thinko. + 2013-08-01 Sofiane Naci * config.gcc (aarch64*-*-*): Add aarch-common.o to extra_objs. Add Index: profile.c === --- profile.c (revision 201367) +++ profile.c (working copy) @@ -891,7 +891,7 @@ compute_value_histograms (histogram_valu gimple_add_histogram_value (cfun, stmt, hist); hist->hvalue.counters = XNEWVEC (gcov_type, hist->n_counters); for (j = 0; j < hist->n_counters; j++) -if (aact_count[t]) +if (aact_count) hist->hvalue.counters[j] = aact_count[j]; else hist->hvalue.counters[j] = 0;
Re: [Patch ARM] Fix PR19599 tail
On 07/29/13 11:05, Mikael Pettersson wrote: Ramana Radhakrishnan writes: > Hi, > > This fixes up the issues with PR target/19599 and the issues we've had > around it. ... > 2013-07-25 Ramana Radhakrishnan > > PR target/19599 > PR target/57731 > PR target/57748 Are you sure about the 57748 reference? Looks to me like it should reference 57837 instead. Thanks and smashes head on the wall. Fixed in the Changelog and log messages. regards Ramana /Mikael > * config/arm/arm.md ("*sibcall_value_insn): Replace use of > Ss with US. Adjust output for v5 and v4t. > (*sibcall_value_insn): Likewise and loosen predicate on > operand0. > * config/arm/constraints.md ("Ss"): Rename to US.
Re: msp430 port
On 07/20/2013 01:32 AM, DJ Delorie wrote: >> Every pattern that is using (subreg:SI (thing:PSI)) needs to be >> explained on this list and given an explicit clearance. It really looks >> like you're just papering over problems elsewhere. > > Most of them are just optimizations, but the problem with reload is > that an SImode register *can't* hold a PSImode value. Registers are > 20 bits; a PSImode value is 20 bits. SImode values require two > registers (16 bits each). Converting between SImode and PSImode is > both nontrivial and very common (pointer math). > > What I really need is an int20_t type in the core of gcc, so I can set > Pmode to *that*, to avoid the SImode stuff completely. But that's a > core change, not a target change. Sometimes you have to make core changes for a new port. This sounds like something that really should be fixed before this port can go in. Bernd
Re: [Patch] Fix selector for vect-iv-5.c
On 08/01/13 11:46, Vidya Praveen wrote: Ping! On Tue, Jul 23, 2013 at 10:21:52AM +0100, Vidya Praveen wrote: Hello gcc.dg/vect/vect-iv-5.c XPASSes for arm-*-* since gcc.dg/vect/*.c tests are always run with -ffast-math for arm-*-*. This patch makes xfail conditional for this test by adding effective target keyword !arm_neon_ok. OK for trunk? Ok - thanks . Ramana Regards VP -- gcc/testsuite/ChangeLog: 2013-07-22 Vidya Praveen * gcc.dg/vect/vect-iv-5.c: Make xfail conditional with !arm_neon_ok. diff --git a/gcc/testsuite/gcc.dg/vect/vect-iv-5.c b/gcc/testsuite/gcc.dg/vect/vect-iv-5.c index 1766ae6..8861095 100644 --- a/gcc/testsuite/gcc.dg/vect/vect-iv-5.c +++ b/gcc/testsuite/gcc.dg/vect/vect-iv-5.c @@ -36,5 +36,5 @@ int main (void) return main1 (); } -/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { xfail *-*-* } } } */ +/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { xfail {! arm_neon_ok } } } } */ /* { dg-final { cleanup-tree-dump "vect" } } */
RE: [PATCH,i386] Default alignment for AMD BD and BT
Thanks Jakub! Committed revision 201402. -Original Message- From: Jakub Jelinek [mailto:ja...@redhat.com] Sent: Thursday, July 04, 2013 4:46 PM To: Gopalasubramanian, Ganesh Cc: Uros Bizjak (ubiz...@gmail.com); gcc-patches@gcc.gnu.org Subject: Re: [PATCH,i386] Default alignment for AMD BD and BT On Thu, Jul 04, 2013 at 11:14:24AM +, Gopalasubramanian, Ganesh wrote: > Can this be backported now! Yes. Jakub
Re: [Patch] Fix selector for vect-iv-5.c
Ping! On Tue, Jul 23, 2013 at 10:21:52AM +0100, Vidya Praveen wrote: > Hello > > gcc.dg/vect/vect-iv-5.c XPASSes for arm-*-* since gcc.dg/vect/*.c tests are > always run with -ffast-math for arm-*-*. This patch makes xfail conditional > for this test by adding effective target keyword !arm_neon_ok. > > OK for trunk? > > Regards > VP > > -- > > gcc/testsuite/ChangeLog: > > 2013-07-22 Vidya Praveen > > * gcc.dg/vect/vect-iv-5.c: Make xfail conditional with !arm_neon_ok. > > diff --git a/gcc/testsuite/gcc.dg/vect/vect-iv-5.c > b/gcc/testsuite/gcc.dg/vect/vect-iv-5.c > index 1766ae6..8861095 100644 > --- a/gcc/testsuite/gcc.dg/vect/vect-iv-5.c > +++ b/gcc/testsuite/gcc.dg/vect/vect-iv-5.c > @@ -36,5 +36,5 @@ int main (void) >return main1 (); > } > > -/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { xfail > *-*-* } } } */ > +/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { xfail > {! arm_neon_ok } } } } */ > /* { dg-final { cleanup-tree-dump "vect" } } */
[libgo PATCH 0/1] mksyscalls.awk: make split ERE more portable
Hi, When using busybox' awk to bootstrap, libgo's syscalls are generated incorrectly. I'm attaching the split() used by busybox' awk including output before and after the patch for reference. Please install / ok to install? Bernhard Reutner-Fischer (1): mksyscalls.awk: make split ERE more portable libgo/go/syscall/mksyscall.awk |4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) -- 1.7.10.4 #include #include #include int main(void) { int l, n = 0; char *inp = "//sysnb pipe(p *[2]_C_int) (err error)"; char *out = __builtin_alloca(strlen(inp)*2+3); regex_t re; regmatch_t m[1]; strcpy(out, inp); #ifndef ERE # define ERE "[ (]*" #endif if (regcomp(&re, ERE, REG_EXTENDED) != 0) __builtin_abort(); do { l = strcspn(inp, "\n\0"); if (regexec(&re, inp, 1, m, 0) == 0 && m[0].rm_so <= l) { l = m[0].rm_so; if (m[0].rm_eo == 0) m[0].rm_eo++, l++; n++; } else { m[0].rm_eo = l; if (inp[l]) m[0].rm_eo++; } memcpy(out, inp, l); do { out[l] = '\0'; } while (++l < m[0].rm_eo); printf("[%2d] '%s'\n", n, out); while (*(out++) != '\0') continue; inp += m[0].rm_eo; } while (*inp); regfree(&re); return 0; } #if 0 + gcc '-DERE="[ (]*"' -o x regexec.c + ./x [ 1] '/' [ 2] '/' [ 3] 's' [ 4] 'y' [ 5] 's' [ 6] 'n' [ 7] 'b' [ 8] '' [ 9] 'p' [10] 'i' [11] 'p' [12] 'e' [13] '' [14] 'p' [15] '' [16] '*' [17] '[' [18] '2' [19] ']' [20] '_' [21] 'C' [22] '_' [23] 'i' [24] 'n' [25] 't' [26] ')' [27] '' [28] 'e' [29] 'r' [30] 'r' [31] '' [32] 'e' [33] 'r' [34] 'r' [35] 'o' [36] 'r' [37] ')' + gcc '-DERE="[ (]"' -o x regexec.c + ./x [ 1] '//sysnb' [ 2] 'pipe' [ 3] 'p' [ 4] '*[2]_C_int)' [ 5] '' [ 6] 'err' [ 6] 'error)' #endif
[libgo PATCH 1/1] mksyscalls.awk: make split ERE more portable
awk's split() ERE was splitting on (essentially) . Double checked with mawk 1.3.3, GNU Awk 4.0.1, busybox awk that they still produce identical output. libgo/ChangeLog (???) 2013-08-01 Bernhard Reutner-Fischer * go/syscall/mksyscall.awk (split): Fix ere argument. Signed-off-by: Bernhard Reutner-Fischer --- libgo/go/syscall/mksyscall.awk |4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/libgo/go/syscall/mksyscall.awk b/libgo/go/syscall/mksyscall.awk index 74f0e28..829b6fe 100644 --- a/libgo/go/syscall/mksyscall.awk +++ b/libgo/go/syscall/mksyscall.awk @@ -53,7 +53,7 @@ BEGIN { } # Sets a[1] = //sysnb, a[2] == function name. -split(line, a, "[ (]*") +split(line, a, "[ (]") gofnname = a[2] off = match(line, "\\([^()]*\\)") @@ -78,7 +78,7 @@ BEGIN { next } -split(line, a, "[ (]*") +split(line, a, "[ (]") cfnname = substr(a[1], 3, length(a[1]) - 2) off = match(line, "\\([^()]*\\)") -- 1.7.10.4
RE: [PATCH][ARM] Fix FAIL pr46975
> >> +;; Rd = (eq (reg1) (reg2/imm))// optimize for size on Thumb2 > >> +;;subs T1, Reg1, reg2 > >> +;;negs Rd, T1 > >> +;;adcs Rd, Rd, T1 > >> > >> Only the second operation has to be flag setting. A later pass will > >> convert the first and third instructions to flag clobbering if that's > >> the right thing to do on the target (which it currently always is, but > >> may not always remain so). > > > > So I tried removing the flag setting from the peephole for the sub and > adc > > instructions, and nothing ends up converting them to flag-setting > instructions > > and we end up with: > > sub > > negs > > adc > > which is suboptimal for size. Is combine supposed to do decide whether > to use > > flag-setting variants? > > > > thumb2_reorg should be picking both up. It already has code for sub, > but currently doesn't handle adc. You need to investigate the former > and perhaps write some code for the latter. Thanks, that's where the culprit turned out to be. Turns out the peephole generated the non-canonical subtract-immediate form that was not getting matched in thumb2_reorg. So fixing the peephole got the 235.mach pass to emit the flag-setting variant correctly. Also, I've added some matching for the add-with-carry case in thumb2_reorg so that the flag-setting variant of ADC is used. Now for the testcase, when optimising for size we get only 3 16-bit instructions. Tested arm-none-eabi on qemu with and without -mthumb. Ok? Thanks, Kyrill 2013-08-01 Kyrylo Tkachov * config/arm/arm.md (peepholes for eq (reg1) (reg2/imm)): Generate canonical plus rtx with negated immediate instead of minus where appropriate. * config/arm/arm.c (thumb2_reorg): Handle ADCS , case. 2013-08-01 Kyrylo Tkachov * gcc.target/arm/pr46972-2.c: New test. pr46975.patch Description: Binary data
[PATCH][ARM] Emit canonical form for constant subtraction in minmax_arithsi_non_canon
Hi all, I noticed that the minmax_arithsi_non_canon splitter had a case where it would generate a MINUS rtx with an immediate. The canonical form is PLUS with negated immediate. This patch fixes that. Tested arm-none-eabi on qemu with ARMv7-a, ARMv8-a, with and without -mthumb. Ok for trunk? Thanks, Kyrill 2013-08-01 Kyrylo Tkachov * config/arm/arm.md (minmax_arithsi_non_canon): Emit canonical RTL form when subtracting a constant. minmax_canon_minus.patch Description: Binary data
Do not use PARM_DECLs in ipa-cp and ipa-prop
Hi, this is preparation work to move DECL_ARGUMENTS and DECL_RESULT into function sections during WPA. Even with some work to release unused ones, there are 4M of PARM_DECLs and 2M of RESULT_DECLs streamed during LTO (for 6M of function_decls) making them one of the most common nodes. This patch makes ipa-cp and ipa-prop to not use DECL_ARGUMENTS during WPA stage. this only needed to tamn debug info, move logic doing casts from get_replacement_map to tree_function_versioning and stream move_cost that is computed form parm type. Martin, does this patch look OK? Honza * ipa-cp.c (gather_context_independent_values): Use ipa_get_param_move_cost. (get_replacement_map): Remove PARAM; move parameter folding into tree-inline.c (create_specialized_node): Update. * ipa-prop.c (ipa_populate_param_decls): Do not look for origins; assert that we have gimple body; update move_cost. (count_formal_params): Assert that we have gimple body. (ipa_alloc_node_params): Break out from ... (ipa_initialize_node_params): ... here. (ipa_get_vector_of_formal_parms): ICE when used in WPA. (ipa_write_node_info): Stream move costs. (ipa_read_node_info): Read move costs. (ipa_update_after_lto_read): Do not recompute node params. * ipa-prop.h (ipa_param_descriptor): Add move_cost. (ipa_get_param): Check we are not in WPA. (ipa_get_param_move_cost): New. * tree-inline.c (tree_function_versioning): Fold replacement as needed. Index: ipa-cp.c === *** ipa-cp.c(revision 201291) --- ipa-cp.c(working copy) *** gather_context_independent_values (struc *** 1758,1770 } else if (removable_params_cost && !ipa_is_param_used (info, i)) ! *removable_params_cost ! += estimate_move_cost (TREE_TYPE (ipa_get_param (info, i))); } else if (removable_params_cost && !ipa_is_param_used (info, i)) *removable_params_cost ! += estimate_move_cost (TREE_TYPE (ipa_get_param (info, i))); if (known_aggs) { --- 1758,1769 } else if (removable_params_cost && !ipa_is_param_used (info, i)) ! *removable_params_cost += ipa_get_param_move_cost (info, i); } else if (removable_params_cost && !ipa_is_param_used (info, i)) *removable_params_cost ! += ipa_get_param_move_cost (info, i); if (known_aggs) { *** gather_edges_for_value (struct ipcp_valu *** 2480,2515 Return it or NULL if for some reason it cannot be created. */ static struct ipa_replace_map * ! get_replacement_map (tree value, tree parm, int parm_num) { - tree req_type = TREE_TYPE (parm); struct ipa_replace_map *replace_map; - if (!useless_type_conversion_p (req_type, TREE_TYPE (value))) - { - if (fold_convertible_p (req_type, value)) - value = fold_build1 (NOP_EXPR, req_type, value); - else if (TYPE_SIZE (req_type) == TYPE_SIZE (TREE_TYPE (value))) - value = fold_build1 (VIEW_CONVERT_EXPR, req_type, value); - else - { - if (dump_file) - { - fprintf (dump_file, "const "); - print_generic_expr (dump_file, value, 0); - fprintf (dump_file, " can't be converted to param "); - print_generic_expr (dump_file, parm, 0); - fprintf (dump_file, "\n"); - } - return NULL; - } - } replace_map = ggc_alloc_ipa_replace_map (); if (dump_file) { ! fprintf (dump_file, "replacing param "); ! print_generic_expr (dump_file, parm, 0); fprintf (dump_file, " with const "); print_generic_expr (dump_file, value, 0); fprintf (dump_file, "\n"); --- 2479,2494 Return it or NULL if for some reason it cannot be created. */ static struct ipa_replace_map * ! get_replacement_map (tree value, int parm_num) { struct ipa_replace_map *replace_map; replace_map = ggc_alloc_ipa_replace_map (); if (dump_file) { ! fprintf (dump_file, "replacing param %i", parm_num); ! fprintf (dump_file, " with const "); print_generic_expr (dump_file, value, 0); fprintf (dump_file, "\n"); *** create_specialized_node (struct cgraph_n *** 2697,2703 { struct ipa_replace_map *replace_map; ! replace_map = get_replacement_map (t, ipa_get_param (info, i), i); if (replace_map) vec_safe_push (replace_trees, replace_map); } --- 2676,2682 { struct ipa_replace_map *replace_map; ! replace_map = get_replacement_map (t, i); if (replace_map) vec_safe_push (replace_trees, replace_ma
Symtab cleanup 9/17 better tracking of functions used as origin
Hi, current functions that appear as abstract origin of other function gets abstract_and_needed flag set. This flag is then not maintained in any way, but it is used by cgraph_remove_node to avoid removing block tree dneeded later by dwarf2out. This patch makes the tracking more explicit. Unreachable function removal now maintains the info about what is used as abstract origin. I also renamed the flag to used_as_abstract_origin, since abstract_and_needed suggest that function is DECL_ABSTRACT that is not the rule. This ensures that all abstract origin functions do have nodes attached and this info can be used in free_lang_data and LTO streaming/partitioning (currently we do not bother to ship the trees into LTO partitions that needs them) Bootstrapped/regtested ppc64-linux and in earlier version x86_64-linux, comitted. Honza * cgraph.c (cgraph_release_function_body): Use used_as_abstract_origin. (cgraph_release_function_body): Likewise. (cgraph_can_remove_if_no_direct_calls_p): Likewise. * cgraph.h (cgrpah_node): Rename abstract_and_needed to used_as_abstract_origin. * tree-inline-transfrom.c (can_remove_node_now_p_1): Do not remove symbols used as abstract origins. * cgraphunit.c (analyze_functions): Update. * ipa.c (symtab_remove_unreachable_nodes): Recompute used_as_abstract_origin. * tree-inline.c (tree_function_versioning): Update used_as_abstract_origin; be ready for DECL_RESULT and DECL_ARGUMENTS to be NULL. Index: cgraph.c === *** cgraph.c(revision 201291) --- cgraph.c(working copy) *** void *** 1316,1322 cgraph_release_function_body (struct cgraph_node *node) { node->ipa_transforms_to_apply.release (); ! if (!node->abstract_and_needed && cgraph_state != CGRAPH_STATE_PARSING) { DECL_RESULT (node->symbol.decl) = NULL; DECL_ARGUMENTS (node->symbol.decl) = NULL; --- 1317,1323 cgraph_release_function_body (struct cgraph_node *node) { node->ipa_transforms_to_apply.release (); ! if (!node->used_as_abstract_origin && cgraph_state != CGRAPH_STATE_PARSING) { DECL_RESULT (node->symbol.decl) = NULL; DECL_ARGUMENTS (node->symbol.decl) = NULL; *** cgraph_release_function_body (struct cgr *** 1324,1330 /* If the node is abstract and needed, then do not clear DECL_INITIAL of its associated function function declaration because it's needed to emit debug info later. */ ! if (!node->abstract_and_needed && DECL_INITIAL (node->symbol.decl)) DECL_INITIAL (node->symbol.decl) = error_mark_node; release_function_body (node->symbol.decl); } --- 1325,1331 /* If the node is abstract and needed, then do not clear DECL_INITIAL of its associated function function declaration because it's needed to emit debug info later. */ ! if (!node->used_as_abstract_origin && DECL_INITIAL (node->symbol.decl)) DECL_INITIAL (node->symbol.decl) = error_mark_node; release_function_body (node->symbol.decl); } Index: cgraph.h === *** cgraph.h(revision 201291) --- cgraph.h(working copy) *** struct GTY(()) cgraph_node { *** 303,309 /* Set when decl is an abstract function pointed to by the ABSTRACT_DECL_ORIGIN of a reachable function. */ ! unsigned abstract_and_needed : 1; /* Set once the function is lowered (i.e. its CFG is built). */ unsigned lowered : 1; /* Set once the function has been instantiated and its callee --- 303,309 /* Set when decl is an abstract function pointed to by the ABSTRACT_DECL_ORIGIN of a reachable function. */ ! unsigned used_as_abstract_origin : 1; /* Set once the function is lowered (i.e. its CFG is built). */ unsigned lowered : 1; /* Set once the function has been instantiated and its callee Index: ipa-inline-transform.c === *** ipa-inline-transform.c (revision 201291) --- ipa-inline-transform.c (working copy) *** can_remove_node_now_p_1 (struct cgraph_n *** 87,92 --- 87,93 the callgraph so references can point to it. */ return (!node->symbol.address_taken && !ipa_ref_has_aliases_p (&node->symbol.ref_list) + && !node->used_as_abstract_origin && cgraph_can_remove_if_no_direct_calls_p (node) /* Inlining might enable more devirtualizing, so we want to remove those only after all devirtualizable virtual calls are processed. Index: cgraphunit.c === *** cgraphunit.c(revision 201291) --- cgraphunit.c(working copy) *** analyze_functions (void) *** 925,931 {
Re: Do not use PARM_DECLs in ipa-cp and ipa-prop
Jan Hubicka wrote: >Hi, >this is preparation work to move DECL_ARGUMENTS and DECL_RESULT into >function >sections during WPA. Even with some work to release unused ones, there >are 4M >of PARM_DECLs and 2M of RESULT_DECLs streamed during LTO (for 6M of >function_decls) making them one of the most common nodes. > >This patch makes ipa-cp and ipa-prop to not use DECL_ARGUMENTS during >WPA >stage. this only needed to tamn debug info, move logic doing casts >from >get_replacement_map to tree_function_versioning and stream move_cost >that is >computed form parm type. > >Martin, does this patch look OK? What about uses in jump functions, like &parm? Are those sufficiently non-treeish already? Richard. >Honza > > * ipa-cp.c (gather_context_independent_values): Use >ipa_get_param_move_cost. > (get_replacement_map): Remove PARAM; move parameter folding into >tree-inline.c > (create_specialized_node): Update. > * ipa-prop.c (ipa_populate_param_decls): Do not look for origins; > assert that we have gimple body; update move_cost. > (count_formal_params): Assert that we have gimple body. > (ipa_alloc_node_params): Break out from ... > (ipa_initialize_node_params): ... here. > (ipa_get_vector_of_formal_parms): ICE when used in WPA. > (ipa_write_node_info): Stream move costs. > (ipa_read_node_info): Read move costs. > (ipa_update_after_lto_read): Do not recompute node params. > * ipa-prop.h (ipa_param_descriptor): Add move_cost. > (ipa_get_param): Check we are not in WPA. > (ipa_get_param_move_cost): New. > * tree-inline.c (tree_function_versioning): Fold replacement as >needed. >Index: ipa-cp.c >=== >*** ipa-cp.c (revision 201291) >--- ipa-cp.c (working copy) >*** gather_context_independent_values (struc >*** 1758,1770 > } > else if (removable_params_cost > && !ipa_is_param_used (info, i)) >! *removable_params_cost >!+= estimate_move_cost (TREE_TYPE (ipa_get_param (info, i))); > } >else if (removable_params_cost > && !ipa_is_param_used (info, i)) > *removable_params_cost >!+= estimate_move_cost (TREE_TYPE (ipa_get_param (info, i))); > >if (known_aggs) > { >--- 1758,1769 > } > else if (removable_params_cost > && !ipa_is_param_used (info, i)) >! *removable_params_cost += ipa_get_param_move_cost (info, i); > } >else if (removable_params_cost > && !ipa_is_param_used (info, i)) > *removable_params_cost >!+= ipa_get_param_move_cost (info, i); > >if (known_aggs) > { >*** gather_edges_for_value (struct ipcp_valu >*** 2480,2515 > Return it or NULL if for some reason it cannot be created. */ > > static struct ipa_replace_map * >! get_replacement_map (tree value, tree parm, int parm_num) > { >- tree req_type = TREE_TYPE (parm); >struct ipa_replace_map *replace_map; > >- if (!useless_type_conversion_p (req_type, TREE_TYPE (value))) >- { >- if (fold_convertible_p (req_type, value)) >- value = fold_build1 (NOP_EXPR, req_type, value); >- else if (TYPE_SIZE (req_type) == TYPE_SIZE (TREE_TYPE (value))) >- value = fold_build1 (VIEW_CONVERT_EXPR, req_type, value); >- else >- { >-if (dump_file) >- { >-fprintf (dump_file, "const "); >-print_generic_expr (dump_file, value, 0); >-fprintf (dump_file, " can't be converted to param "); >-print_generic_expr (dump_file, parm, 0); >-fprintf (dump_file, "\n"); >- } >-return NULL; >- } >- } > >replace_map = ggc_alloc_ipa_replace_map (); >if (dump_file) > { >! fprintf (dump_file, "replacing param "); >! print_generic_expr (dump_file, parm, 0); >fprintf (dump_file, " with const "); >print_generic_expr (dump_file, value, 0); >fprintf (dump_file, "\n"); >--- 2479,2494 > Return it or NULL if for some reason it cannot be created. */ > > static struct ipa_replace_map * >! get_replacement_map (tree value, int parm_num) > { >struct ipa_replace_map *replace_map; > > >replace_map = ggc_alloc_ipa_replace_map (); >if (dump_file) > { >! fprintf (dump_file, "replacing param %i", parm_num); >! >fprintf (dump_file, " with const "); >print_generic_expr (dump_file, value, 0); >fprintf (dump_file, "\n"); >*** create_specialized_node (struct cgraph_n >*** 2697,2703 > { > struct ipa_replace_map *replace_map; > >!replace_map = get_replacement_map (t, ipa_get_param (info, i), i); > if (replace_map) > vec_safe_push (replace_trees, replace_map); > } >--- 2676,
Re: Do not use PARM_DECLs in ipa-cp and ipa-prop
Hi, On 08/01/2013 03:11 PM, Jan Hubicka wrote: + replace_info->new_tree = fold_build1 (NOP_EXPR, req_type, replace_info->new_tree); the tree-inline.c changes don't seem formatted to 80 columns. Paolo.
Re: [PATCH][ARM] Fix FAIL pr46975
On 01/08/13 14:02, Kyrylo Tkachov wrote: +;; Rd = (eq (reg1) (reg2/imm)) // optimize for size on Thumb2 +;; subs T1, Reg1, reg2 +;; negs Rd, T1 +;; adcs Rd, Rd, T1 Only the second operation has to be flag setting. A later pass will convert the first and third instructions to flag clobbering if that's the right thing to do on the target (which it currently always is, but may not always remain so). So I tried removing the flag setting from the peephole for the sub and adc instructions, and nothing ends up converting them to flag-setting instructions and we end up with: sub negs adc which is suboptimal for size. Is combine supposed to do decide whether to use flag-setting variants? thumb2_reorg should be picking both up. It already has code for sub, but currently doesn't handle adc. You need to investigate the former and perhaps write some code for the latter. Thanks, that's where the culprit turned out to be. Turns out the peephole generated the non-canonical subtract-immediate form that was not getting matched in thumb2_reorg. So fixing the peephole got the 235.mach pass to emit the flag-setting variant correctly. Also, I've added some matching for the add-with-carry case in thumb2_reorg so that the flag-setting variant of ADC is used. Now for the testcase, when optimising for size we get only 3 16-bit instructions. Tested arm-none-eabi on qemu with and without -mthumb. Ok? Thanks, Kyrill 2013-08-01 Kyrylo Tkachov * config/arm/arm.md (peepholes for eq (reg1) (reg2/imm)): Generate canonical plus rtx with negated immediate instead of minus where appropriate. * config/arm/arm.c (thumb2_reorg): Handle ADCS , case. 2013-08-01 Kyrylo Tkachov * gcc.target/arm/pr46972-2.c: New test. OK. R.
Re: [PATCH][ARM] Emit canonical form for constant subtraction in minmax_arithsi_non_canon
On 01/08/13 14:05, Kyrylo Tkachov wrote: Hi all, I noticed that the minmax_arithsi_non_canon splitter had a case where it would generate a MINUS rtx with an immediate. The canonical form is PLUS with negated immediate. This patch fixes that. Tested arm-none-eabi on qemu with ARMv7-a, ARMv8-a, with and without -mthumb. Ok for trunk? Thanks, Kyrill 2013-08-01 Kyrylo Tkachov * config/arm/arm.md (minmax_arithsi_non_canon): Emit canonical RTL form when subtracting a constant. OK. R.
[Patch] regex bracket expression implementaion
Fully tested under x86_64. (make bootstrap && make -k check). Next, I'll try to refactor _Grep_matcher using templates instead of virtual functions, to make implementing back-reference easier. Thanks! -- Tim Shen bracket.patch Description: Binary data changelog Description: Binary data
Symtab cleanup 10/17 remove unnecesary DECL_ARGUMENTS and DECL_RESULT
Hi, Now when we have abstract origins tracked, this patch makes DECL_ARGUMENTS and DECL_RESULT to be removed from FUNCTION_DECLs that are never passed to symbol table. This reduces LTO streaming effort (by about 1/3rd of PARM_DECls) Bootstrapped/regtested ppc64-linux, will commit it after further testing on x86_64-linux. Honza * cgraph.h (release_function_body): Declare. * tree.c (free_lang_data_in_decl): Free, parameters and return values of unused delcarations. Index: cgraph.h === --- cgraph.h(revision 201408) +++ cgraph.h(working copy) @@ -606,6 +606,7 @@ void debug_cgraph_node (struct cgraph_no void cgraph_remove_edge (struct cgraph_edge *); void cgraph_remove_node (struct cgraph_node *); void cgraph_release_function_body (struct cgraph_node *); +void release_function_body (tree); void cgraph_node_remove_callees (struct cgraph_node *node); struct cgraph_edge *cgraph_create_edge (struct cgraph_node *, struct cgraph_node *, Index: tree.c === --- tree.c (revision 201367) +++ tree.c (working copy) @@ -4886,6 +4886,20 @@ free_lang_data_in_decl (tree decl) if (TREE_CODE (decl) == FUNCTION_DECL) { + struct cgraph_node *node; + if (!(node = cgraph_get_node (decl)) + || (!node->symbol.definition && !node->clones)) + { + if (node) + cgraph_release_function_body (node); + else + { + release_function_body (decl); + DECL_ARGUMENTS (decl) = NULL; + DECL_RESULT (decl) = NULL; + DECL_INITIAL (decl) = error_mark_node; + } + } if (gimple_has_body_p (decl)) { tree t;
Re: Do not use PARM_DECLs in ipa-cp and ipa-prop
> Jan Hubicka wrote: > >Hi, > >this is preparation work to move DECL_ARGUMENTS and DECL_RESULT into > >function > >sections during WPA. Even with some work to release unused ones, there > >are 4M > >of PARM_DECLs and 2M of RESULT_DECLs streamed during LTO (for 6M of > >function_decls) making them one of the most common nodes. > > > >This patch makes ipa-cp and ipa-prop to not use DECL_ARGUMENTS during > >WPA > >stage. this only needed to tamn debug info, move logic doing casts > >from > >get_replacement_map to tree_function_versioning and stream move_cost > >that is > >computed form parm type. > > > >Martin, does this patch look OK? > > What about uses in jump functions, like &parm? Are those sufficiently > non-treeish already? I do not think those can appear in jump functions - it is not IP invariant. passthrough and friends are non-treeish. Honza
Re: Do not use PARM_DECLs in ipa-cp and ipa-prop
Hi, On Thu, Aug 01, 2013 at 03:11:36PM +0200, Jan Hubicka wrote: > Hi, > this is preparation work to move DECL_ARGUMENTS and DECL_RESULT into function > sections during WPA. Even with some work to release unused ones, there are 4M > of PARM_DECLs and 2M of RESULT_DECLs streamed during LTO (for 6M of > function_decls) making them one of the most common nodes. > > This patch makes ipa-cp and ipa-prop to not use DECL_ARGUMENTS during WPA > stage. this only needed to tamn debug info, move logic doing casts from > get_replacement_map to tree_function_versioning and stream move_cost that is > computed form parm type. > > Martin, does this patch look OK? > Generally yes, except that I think you did not convert some dumping in ipa-cp.c, at least in functions estimate_local_effects, find_more_scalar_values_for_callers_subset and decide_about_value. I believe that after your change there should not be a single call to ipa_get_param in file ipa-cp.c. Thanks, Martin
Re: Do not use PARM_DECLs in ipa-cp and ipa-prop
Hi, On Thu, Aug 01, 2013 at 03:59:01PM +0200, Richard Biener wrote: > Jan Hubicka wrote: > >Hi, > >this is preparation work to move DECL_ARGUMENTS and DECL_RESULT into > >function > >sections during WPA. Even with some work to release unused ones, there > >are 4M > >of PARM_DECLs and 2M of RESULT_DECLs streamed during LTO (for 6M of > >function_decls) making them one of the most common nodes. > > > >This patch makes ipa-cp and ipa-prop to not use DECL_ARGUMENTS during > >WPA > >stage. this only needed to tamn debug info, move logic doing casts > >from > >get_replacement_map to tree_function_versioning and stream move_cost > >that is > >computed form parm type. > > > >Martin, does this patch look OK? > > What about uses in jump functions, like &parm? Are those > sufficiently non-treeish already? Yes, they use indices to refer to other parameters and are looked up by indices themselves. Martin
Re: Do not use PARM_DECLs in ipa-cp and ipa-prop
> Hi, > > On Thu, Aug 01, 2013 at 03:11:36PM +0200, Jan Hubicka wrote: > > Hi, > > this is preparation work to move DECL_ARGUMENTS and DECL_RESULT into > > function > > sections during WPA. Even with some work to release unused ones, there are > > 4M > > of PARM_DECLs and 2M of RESULT_DECLs streamed during LTO (for 6M of > > function_decls) making them one of the most common nodes. > > > > This patch makes ipa-cp and ipa-prop to not use DECL_ARGUMENTS during WPA > > stage. this only needed to tamn debug info, move logic doing casts from > > get_replacement_map to tree_function_versioning and stream move_cost that is > > computed form parm type. > > > > Martin, does this patch look OK? > > > > Generally yes, except that I think you did not convert some dumping in > ipa-cp.c, at least in functions estimate_local_effects, > find_more_scalar_values_for_callers_subset and decide_about_value. I > believe that after your change there should not be a single call to > ipa_get_param in file ipa-cp.c. Hmm, I was aware of that while making the patch but then forgot. OK, I will convert these too (and the testsuite) and commit. (alternative I was thining of having ipa_dump_param helper that will dump "parm 4" in WPA mode and "parm 4 (parmname)" in non-WPA). Do you think it would be useful? We probably should comonnize syntax of dumps in between ipa-prop and ipa-inline-analysis predicates (that currently use op4 syntax and call the parameters operands). Something to track incrementally anyway. Honza > > Thanks, > > Martin
[PATCH] Sanitize block partitioning under -freorder-blocks-and-partition
Patch 3 of 3 split out from the patch I sent in May that fixes problems with -freorder-blocks-and-partition, with changes/fixes discussed in that thread. See http://gcc.gnu.org/ml/gcc-patches/2013-05/threads.html#00388 for context. This patch sanitizes the partitioning to address issues such as edge weight insanities that sometimes occur due to upstream optimizations, and ensures that hot blocks are not dominated by cold blocks. This needs to be resanitized after certain cfg optimizations that may cause hot blocks previously reached via both hot and cold paths to only be reached by cold paths. The verification code in sanitize_dominator_hotness was contributed by Steven Bosscher. Bootstrapped and tested on x86-64-unknown-linux-gnu. Also ensured that a profiledbootstrap passed with -freorder-blocks-and-partition enabled (and with the dwarf version changed to 2 to work around PR57451). Ok for trunk? (I also included the patch as an attachment since my mailer invariably messes up the formatting in the pasted version.) Thanks, Teresa 2013-08-01 Teresa Johnson Steven Bosscher * cfgrtl.c (fixup_bb_partition): New routine. (commit_edge_insertions): Invoke fixup_partitions. (find_partition_fixes): New routine. (fixup_partitions): Ditto. (verify_hot_cold_block_grouping): Update comments. (rtl_verify_edges): Invoke find_partition_fixes. (rtl_verify_bb_pointers): Update comments. (rtl_verify_bb_layout): Ditto. * basic-block.h (fixup_partitions): Declare. * cfgcleanup.c (try_optimize_cfg): Invoke fixup_partitions. * bb-reorder.c (sanitize_dominator_hotness): New function. (find_rarely_executed_basic_blocks_and_crossing_edges): Invoke sanitize_dominator_hotness. Index: cfgrtl.c === --- cfgrtl.c (revision 201281) +++ cfgrtl.c (working copy) @@ -1341,6 +1341,34 @@ fixup_partition_crossing (edge e) } } +/* Called when block BB has been reassigned to a different partition, + to ensure that the region crossing attributes are updated. */ + +static void +fixup_bb_partition (basic_block bb) +{ + edge e; + edge_iterator ei; + + /* Now need to make bb's pred edges non-region crossing. */ + FOR_EACH_EDGE (e, ei, bb->preds) +{ + fixup_partition_crossing (e); +} + + /* Possibly need to make bb's successor edges region crossing, + or remove stale region crossing. */ + FOR_EACH_EDGE (e, ei, bb->succs) +{ + if ((e->flags & EDGE_FALLTHRU) + && BB_PARTITION (bb) != BB_PARTITION (e->dest) + && e->dest != EXIT_BLOCK_PTR) +force_nonfallthru (e); + else +fixup_partition_crossing (e); +} +} + /* Attempt to change code to redirect edge E to TARGET. Don't do that on expense of adding new instructions or reordering basic blocks. @@ -1979,6 +2007,14 @@ commit_edge_insertions (void) { basic_block bb; + /* Optimization passes that invoke this routine can cause hot blocks + previously reached by both hot and cold blocks to become dominated only + by cold blocks. This will cause the verification below to fail, + and lead to now cold code in the hot section. In some cases this + may only be visible after newly unreachable blocks are deleted, + which will be done by fixup_partitions. */ + fixup_partitions (); + #ifdef ENABLE_CHECKING verify_flow_info (); #endif @@ -2173,6 +2209,101 @@ get_last_bb_insn (basic_block bb) return end; } +/* Sanity check partition hotness to ensure that basic blocks in + the cold partition don't dominate basic blocks in the hot partition. + If FLAG_ONLY is true, report violations as errors. Otherwise + re-mark the dominated blocks as cold, since this is run after + cfg optimizations that may make hot blocks previously reached + by both hot and cold blocks now only reachable along cold paths. */ + +vec +find_partition_fixes (bool flag_only) +{ + basic_block bb; + vec bbs_in_cold_partition = vNULL; + vec bbs_to_fix = vNULL; + + if (!crtl->has_bb_partition) +return vNULL; + + FOR_EACH_BB (bb) +if ((BB_PARTITION (bb) == BB_COLD_PARTITION)) + bbs_in_cold_partition.safe_push (bb); + + if (bbs_in_cold_partition.is_empty ()) +return vNULL; + + bool dom_calculated_here = !dom_info_available_p (CDI_DOMINATORS); + + if (dom_calculated_here) +calculate_dominance_info (CDI_DOMINATORS); + + while (! bbs_in_cold_partition.is_empty ()) +{ + bb = bbs_in_cold_partition.pop (); + /* Any blocks dominated by a block in the cold section + must also be cold. */ + basic_block son; + for (son = first_dom_son (CDI_DOMINATORS, bb); + son; + son = next_dom_son (CDI_DOMINATORS, son)) +{ + /* If son is not yet cold, then mark it cold here and + enqueue it for further processing. */ + if ((BB_PARTITION
[PATCH] MIPS/libgcc: Avoid the PLT in MIPS16 stub calls
Hi, As originally signalled here: http://gcc.gnu.org/ml/gcc-patches/2012-02/msg00140.html here is a change to prevent MIPS16 stub calls from being made through the PLT by making them hidden. This is needed to avoid $2 and possibly $3 from being clobbered by MIPS16 code in the PLT. Originally I thought only return stubs were affected, but I have since realised all the libgcc stubs rely on $2, so none can be called via the PLT. This breakage can be triggered by linking with the `-shared-libgcc' option; otherwise libgcc.a code is used avoiding the problem. Since these stubs are a preexisting interface I have decided there is no need to go through the pain of link-once functions as you did for __mips16_rdhwr. I think we'd only need them if we changed the API, at which point we'd have to keep current libgcc bits for preexisting object code anyway. For the same reason this change retains libgcc_s.so bits for preexisting executables and shared libraries, but stops them being exported for new static links so that libgcc.a bits are always used instead. I have tested this change for the mips-linux-gnu target and o32/MIPS32r2 multilib with `-mips16' and `-shared-libgcc' options. There have been no regressions to a plain `-mips16' run whereas as it stands we have numerous. OK to apply? BTW, what's the "Check for MicroMIPS support." note seen in the config.host piece of the patch referring to? 2013-08-01 Maciej W. Rozycki Catherine Moore libgcc/ * config/mips/mips16.S (RET_FUNCTION): Rename to... (_RET_FUNCTION): ... this. (RET_FUNCTION): New macro. (CALL_STUB_NO_RET): Rename to... (_CALL_STUB_NO_RET): ... this. (CALL_STUB_NO_RET): New macro. (CALL_STUB_RET): Rename to... (_CALL_STUB_RET): ... this. (CALL_STUB_RET): New macro. * config.host : For non-R5900 add t-slibgcc-libgcc to tmake_file. Maciej gcc-mips16-hidden-stub.diff Index: gcc-fsf-trunk-quilt/libgcc/config.host === --- gcc-fsf-trunk-quilt.orig/libgcc/config.host 2013-07-31 17:48:35.0 +0100 +++ gcc-fsf-trunk-quilt/libgcc/config.host 2013-07-31 17:50:02.438694583 +0100 @@ -742,13 +742,13 @@ mips*-*-linux*) # Linux MIPS, either tmake_file="${tmake_file} t-crtfm" # Check for MicroMIPS support. case ${host} in - mips64r5900* | mipsr5900*) - # The MIPS16 support code uses floating point - # instructions that are not supported on r5900. - ;; - *) - tmake_file="${tmake_file} mips/t-mips16" - ;; + mips64r5900* | mipsr5900*) + # The MIPS16 support code uses floating point + # instructions that are not supported on r5900. + ;; + *) + tmake_file="${tmake_file} mips/t-mips16 t-slibgcc-libgcc" + ;; esac md_unwind_header=mips/linux-unwind.h if test "${ac_cv_sizeof_long_double}" = 16; then Index: gcc-fsf-trunk-quilt/libgcc/config/mips/mips16.S === --- gcc-fsf-trunk-quilt.orig/libgcc/config/mips/mips16.S2013-07-31 17:48:35.0 +0100 +++ gcc-fsf-trunk-quilt/libgcc/config/mips/mips16.S 2013-07-31 17:58:30.058731920 +0100 @@ -479,14 +479,34 @@ STARTFN (__mips16_fix_truncdfsi) #endif #endif /* !__mips_single_float */ +/* We don't export stubs from libgcc_s.so and always require static + versions to be pulled from libgcc.a as needed because they use $2 + and possibly $3 as arguments, diverging from the standard SysV ABI, + and as such would require severe pessimisation of MIPS16 PLT entries + just for this single special case. + + For compatibility with old binaries that used safe standard MIPS PLT + entries and referred to these functions we still export them at + version GCC_4.4.0 for run-time loading only. /* + /* Define a function NAME that moves a return value of mode MODE from FPRs to GPRs. */ -#define RET_FUNCTION(NAME, MODE) \ +#define _RET_FUNCTION(NAME, MODE) \ STARTFN (NAME);\ MOVE_##MODE##_RET (t, $31); \ ENDFN (NAME) +#ifdef SHARED +#define RET_FUNCTION(NAME, MODE) \ + _RET_FUNCTION (NAME##_compat, MODE);\ + .symver NAME##_compat, NAME##@GCC_4.4.0 +#else +#define RET_FUNCTION(NAME, MODE) \ + _RET_FUNCTION (NAME, MODE); \ + .hidden NAME +#endif + #ifdef L_m16retsf RET_FUNCTION (__mips16_ret_sf, SF) #endif @@ -525,7 +545,7 @@ RET_FUNCTION (__mips16_ret_dc, DC) pointer. They must copy the floating point arguments from the GPRs to FPRs and then call function $2. */ -#define CALL_STUB_NO_RET(NAME, CODE) \ +#define
[PATCH 3.1/11] Explicitly initialize the macro-generated pass fields (was Re: [PATCH 03/11] Handwritten part of conversion of passes to C++ classes)
On Mon, 2013-07-29 at 15:41 -0600, Jeff Law wrote: > On 07/26/2013 09:04 AM, David Malcolm wrote: > > This patch is the hand-written part of the conversion of passes from > > C structs to C++ classes. It does not work without the subsequent > > autogenerated part, which is huge. > [ ... ] > With the changes from pipeline -> pass_manager this is fine. As is the > follow-up autogenerated patch. Thanks. The current status is that Jeff has approved patches 1-5 for the trunk, and patches 6-11 haven't yet been reviewed by a global reviewer. I have committed patches 1 and 2, having bootstrapped+tested each in turn, but I can't commit any more of these without further review - details follow. I had only bootstrapped and tested the combination of all 11 patches together, so I've been attempting to test the approved patches. For reference: * Patch 3 is the handwritten part of the conversion of passes to C++ classes * Patch 4 is the autogenerated part * Patch 5 adds -fno-rtti to the testsuite when building plugins * Patch 6 is the not-yet-approved patch currently named "Rewrite how instances of passes are cloned" (but that's not all it does, see below). I had thought that the combination of patch 3 + 4 + 5 would work as a unit, and hoped to commit each of these after testing the combination of (3, 4, 5), but upon attempting a bootstrap I found two bugs: (a) patch 3 adds an gcc_assert to the pass_manager constructor to ensure that pass instance fields are NULL when created, to ensure that the macro-driven logic is correct. However, the instance fields haven't been explicitly initialized at that point, leading to sporadic assertion failures. This wasn't an issue for the full patch series since patch 11 adds an operator new for pass_manager, allocating it from the GC heap, using the *cleared* allocator: void* pass_manager::operator new (size_t sz) { return ggc_internal_cleared_alloc_stat (sz MEM_STAT_INFO); } hence ensuring that the fields are zeroed. So patch 3 works with patch 11, but not without. In the meantime, I'm attaching a new patch "3.1", which explicitly zeroes these fields early on in the pass_manager ctor. (b) with just these patches, although static_pass_number for every pass instance is correct, the dumpfile *switch* numbering is wrong, leading to numerous testsuite failures (e.g. "unrecognized command line option '-fdump-tree-dce2'") . Patch 6 is needed: it does the necessary fixups at the right time to ensure that the per-pass-instance dump files get the correct switch names (I'll add some more notes on this to that email). With the attached patch, the combination of patches (03, 03.1, 04, 05 *and* 06) successfully bootstraps on x86_64-unknown-linux-gnu, and all testcases show the same results as an unpatched build (relative to r201397). So I'm looking for review of the attached patch, and of at least patch 6 before I can proceed with this. AIUI, Jeff is away at the moment, so another global reviewer would need to do it. Thanks Dave >From 49d7df3c645459a0f90179dfb1a24cef459b186e Mon Sep 17 00:00:00 2001 From: David Malcolm Date: Wed, 31 Jul 2013 14:20:07 -0400 Subject: [PATCH 03.1/15] Explicitly initialize the macro-generated pass fields to NULL gcc/ * passes.c (pass_manager::pass_manager): Explicitly initialize the macro-generated pass fields to NULL. --- gcc/passes.c | 22 ++ 1 file changed, 22 insertions(+) diff --git a/gcc/passes.c b/gcc/passes.c index fcbd630..c4f4f39 100644 --- a/gcc/passes.c +++ b/gcc/passes.c @@ -1352,6 +1352,28 @@ pass_manager::pass_manager (context *ctxt) GCC_PASS_LISTS #undef DEF_PASS_LIST + /* For safety, NULL-initialize the various fields for individual pass + instances (mainly so that the: + gcc_assert (NULL == PASS ## _ ## NUM); + sees an initial NULL value). + + We cannot use member initializers for this since pass-instances.def + contains semicolons. */ + +#define INSERT_PASSES_AFTER(PASS) +#define PUSH_INSERT_PASSES_WITHIN(PASS) +#define POP_INSERT_PASSES() +#define NEXT_PASS(PASS, NUM) PASS ## _ ## NUM = NULL; +#define TERMINATE_PASS_LIST() + +#include "pass-instances.def" + +#undef INSERT_PASSES_AFTER +#undef PUSH_INSERT_PASSES_WITHIN +#undef POP_INSERT_PASSES +#undef NEXT_PASS +#undef TERMINATE_PASS_LIST + /* Build the tree of passes. */ #define INSERT_PASSES_AFTER(PASS) \ -- 1.7.11.7
Re: [PATCH, libgcc] Fix licenses on several files
On 07/28/2013 12:03 PM, Maxim Kuvyrkov wrote: > Richard, did you and Red Hat intend to license config/ia64/unwind-ia64.h > under GPL-3.0-with-GCC-exception? > > DJ, did you and Red Hat intend to license config/mips/vr4120-div.S under > GPL-3.0-with-GCC-exception? Yes, Red Hat intended to license with the exception. r~
Re: [PATCH 06/11] Rewrite how instances of passes are cloned
This patch does more than just remove the hardcoded assumptions about pass sizes - as noted in http://gcc.gnu.org/ml/gcc-patches/2013-08/msg00041.html it also is needed to ensure that per-pass dumpfiles get the correct switch names. So the short version is that patch 6 is needed for patches 3-5 to work, ensuring that the command-line switches for the per-pass dumpfiles don't change. Here's the long version: AIUI, the "static_pass_number" field is used in two different ways: during pass creation for tracking instance counts within a pass family, and after that, for tracking the index within the entire pipeline of passes (the pass id). In the current implementation, during pass creation, the first time a pass is seen, the static_pass_number is set to -1. On subsequent instances of the pass, the pass is copied, and that number decremented, so that if you have a pass with 4 instances, their static_pass_number fields will be -4, 2, 3, 4 (note how the initial one is negative and gives the count). There is this logic in register_one_dump_file to determine the suffix used for per-pass-instance dumpfiles: if (pass->static_pass_number != -1) sprintf (num, "%d", ((int) pass->static_pass_number < 0 ? 1 : pass->static_pass_number)); This is then used by dump_register, and affects the "swtch" field of the relevant entry within extra_dump_files, so that unique passes have no num in their dumpfile switch, and multiinstance passes have dumpfile switches with numeric suffices. This works because the code is always referring to the same pass struct, and thus the changes to say pass_copy_prop.static_pass_number are seen and affect later uses. (Later on in pass initialization, the static_pass_number is overridden and becomes an id for the pass within the entire pipeline, rather than just within its category - see the notes in the comment in the patch below). In my patch series, with just patches 3 through 5, we're instead making separate calls to e.g. make_pass_copy_prop, and each instance gets an initial static_pass_number of 0, which is the changed to -1 by make_pass_instance (as if it were always the first instance) and hence the logic above in register_one_dump_file fails, and thus the dump file of each pass erroneously has no numeric suffix in its dumpfile switch, as if every pass were a unique instance of its kind. With patch 6, the new function add_pass_instance sets up static_pass_number on the new pass instances, mimicking the behavior of the old code, and so the logic above in register_one_dump_file gets the dump file switch names correct. I reviewed this in gdb using this Python one-liner to see the ->swtch field of each element of the extra_dump_files array: (gdb) python print \ [gdb.parse_and_eval('extra_dump_files')[i]['swtch'].string() for i in range(gdb.parse_and_eval('extra_dump_files_in_use'))] and without patch 6, all of the extra dumpfiles for the 8 copy_prop instances erroneously share the switch "tree-copyprop". With patch 6, they correctly have switches "tree-copyprop1" through "tree-copyprop8". Much of this is covered by the comment that the patch adds, but I should have spelled things out more in the initial posting of the patch; sorry. OK for trunk? (on top of the other patches, of course; see notes in http://gcc.gnu.org/ml/gcc-patches/2013-08/msg00041.html in how I've tested this). On Fri, 2013-07-26 at 11:04 -0400, David Malcolm wrote: > gcc/ > > Rewrite how instances of passes are cloned to remove assumptions > about their sizes (thus allowing pass subclasses to have > additional data fields, albeit non-GC-managed ones at this point). > > * passes.c (make_pass_instance): Now that passes have clone > methods, rewrite this function to eliminate XNEW and memcpy > calls that used hardcoded sizes. Since this function no longer > creates pass instances, rename it to... > (add_pass_instance): ...this. Document the old way that passes > were numbered and flagged, and rework this function to continue > using it. > (next_pass_1): Add an initial_pass argument for use by > add_pass_instance. > (position_pass): When adding multiple instances of a pass, use > the pass's clone method, rather than relying on the XNEW/memcpy > within the former make_pass_instance (now add_pass_instance). > (pipeline::pipeline): When invoking next_pass_1, also supply the > initial instance of the current pass within the pipeline. > --- > gcc/passes.c | 92 > > 1 file changed, 55 insertions(+), 37 deletions(-) > > diff --git a/gcc/passes.c b/gcc/passes.c > index ead41a8..ce5cdeb 100644 > --- a/gcc/passes.c > +++ b/gcc/passes.c > @@ -1167,68 +1167,77 @@ is_pass_explicitly_enabled_or_disabled (struct > opt_pass *pass, >return false; > } > > -/* Look at the static_pass_number and duplicate the pass > - if it i
Re: Request to merge Undefined Behavior Sanitizer in (take 2)
On Wed, Jul 31, 2013 at 02:52:39PM -0400, Jason Merrill wrote: > On 07/31/2013 01:33 PM, Marek Polacek wrote: > >There are still at least two issues though, which is why > >bootstrap with -fsanitize=undefined fails: > > > >http://gcc.gnu.org/ml/gcc-patches/2013-07/msg01480.html > > This looks like a serious bug, properly caught by -Wuninitialized. > > >When sanitizing, > >in .uninit1 we have > > int x.3; > > int x.2; > > > > : > > x.3_3 = x.2_1(D) >> 1; > > x = x.3_3; > > Note that x.2 is not initialized. Uh, dunno what I was thinking. Oh, right, the fact that x.2_1 is SSA_NAME_IS_DEFAULT_DEF confused me -- I though it's actually initialized to 0, as x is a global var... It really is a genuine bug. The problem can be seen when the if-condition in the COMPOUND_EXPR doesn't contain x, e.g. in C89 mode we instrument shift with the simple check: int y = if ((unsigned int) SAVE_EXPR > 31) { __builtin___ubsan_handle_shift_out_of_bounds (&*.Lubsan_data0, (unsigned long) SAVE_EXPR , (unsigned long) SAVE_EXPR ); } else { 0 }, SAVE_EXPR << SAVE_EXPR ;; Now, we gimplify this. SAVE_EXPRs are evaluated only once; in that COMPOUND_EXPR, when we first encounter SAVE_EXPR and call get_initialized_tmp_var, we get temporary value for it, x.2. Then, in the second part of the COMPOUND_EXPR, we meet SAVE_EXPR again, but it already has been resolved, so we don't create any initialized var for it. But then we end up with if (o.1 > 31) goto ; else goto ; : D.1464 = (unsigned long) o.0; x.2 = x; D.1466 = (unsigned long) x.2; __builtin___ubsan_handle_shift_out_of_bounds (&*.Lubsan_data0, D.1466, D.1464); goto ; : : y = x.2 << o.0; which means that when o.1 > 31 is false, we jump to D.1463, but the x.2 is not initialized! What we could perhaps do is to move the x.2 = x; initialization before the condition, so that it's always initialized. It's not readily obvious to me how to implement that nicely, but I could try something, if this is the way to go. Does anyone have a better idea? > >and when no sanitizing > > int x.1; > > int x.0; > > > > : > > x.0_2 = x; > > x.1_3 = x.0_2 >> 1; > > x = x.1_3; > > But here x.0 is initialized. > > >http://gcc.gnu.org/ml/gcc-patches/2013-07/msg01536.html > > Here, the C++ compiler is wrong to fold away the division by zero, > but given that bug the folding ought to also eliminate the call to > the sanitize function. Seems like you should attach the call to the > questionable expression itself. Thanks, I'll look at this later. Marek
Re: [PATCH 06/11] Rewrite how instances of passes are cloned
On Thu, 2013-08-01 at 13:55 -0400, David Malcolm wrote: [...snip...] > OK for trunk? (on top of the other patches, of course; see notes in > http://gcc.gnu.org/ml/gcc-patches/2013-08/msg00041.html in how I've > tested this). ...with "pipeline" renamed to "pass_manager", of course.
[PATCH, rs6000] Add builtin support for power8 32-bit Altivec multiply insns
This patch adds builtin support for the new 32-bit Altivec multiply instructions that were added in ISA 2.07 (ie, POWER8). This passed bootstrap and regtesting with no errors. Ok for mainline? P.S. I will be working on a followup patch sometime later that will attempt to generate these new instructions automatically. Peter gcc/ * config/rs6000/rs6000-c.c (altivec_overloaded_builtins): Add support for new poweer8 P8V_BUILTIN_VEC_VMULESW builtin. (P8V_BUILTIN_VEC_VMULEUW): Likewise. (P8V_BUILTIN_VEC_VMULOSW): Likewise. (P8V_BUILTIN_VEC_VMULOUW): Likewise. (P8V_BUILTIN_VEC_VMULUWM): Likewise. * config/rs6000/rs6000-builtin.def (vmulesw): Add support for new power8 builtin. (vmuleuw): Likewise. (vmulosw): Likewise. (vmulouw): Likewise. (vmuluwm): Likewise. * config/rs6000/altivec.md (UNSPEC_VMULEUW): New unspec enum value. (UNSPEC_VMULESW): Likewise. (UNSPEC_VMULOUW): Likewise. (UNSPEC_VMULOSW): Likewise. (UNSPEC_VMULUWM): Likewise. (vec_widen_umult_even_v4si): New define_insn. (vec_widen_smult_even_v4si): Likewise. (vec_widen_umult_odd_v4si): Likewise. (vec_widen_smult_odd_v4si): Likewise. (altivec_vmuluwm): Likewise. * config/rs6000/altivec.h (vec_vmulesw): Add define to export power8 builtin function. (vec_vmuleuw): Likewise. (vec_vmulosw): Likewise. (vec_vmulouw): Likewise. (vec_vmuluwm): Likewise. gcc/testsuite/ * gcc.target/powerpc/p8vector-builtin-8.c: New. Index: gcc/config/rs6000/rs6000-c.c === --- gcc/config/rs6000/rs6000-c.c (revision 201409) +++ gcc/config/rs6000/rs6000-c.c (working copy) @@ -1752,6 +1752,10 @@ const struct altivec_builtin_types altiv RS6000_BTI_unsigned_V4SI, RS6000_BTI_unsigned_V8HI, RS6000_BTI_unsigned_V8HI, 0 }, { ALTIVEC_BUILTIN_VEC_VMULESH, ALTIVEC_BUILTIN_VMULESH, RS6000_BTI_V4SI, RS6000_BTI_V8HI, RS6000_BTI_V8HI, 0 }, + { P8V_BUILTIN_VEC_VMULESW, P8V_BUILTIN_VMULESW, +RS6000_BTI_V2DI, RS6000_BTI_V4SI, RS6000_BTI_V4SI, 0 }, + { P8V_BUILTIN_VEC_VMULEUW, P8V_BUILTIN_VMULEUW, +RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V4SI, RS6000_BTI_unsigned_V4SI, 0 }, { ALTIVEC_BUILTIN_VEC_MULO, ALTIVEC_BUILTIN_VMULOUB, RS6000_BTI_unsigned_V8HI, RS6000_BTI_unsigned_V16QI, RS6000_BTI_unsigned_V16QI, 0 }, { ALTIVEC_BUILTIN_VEC_MULO, ALTIVEC_BUILTIN_VMULOSB, @@ -1760,6 +1764,14 @@ const struct altivec_builtin_types altiv RS6000_BTI_unsigned_V4SI, RS6000_BTI_unsigned_V8HI, RS6000_BTI_unsigned_V8HI, 0 }, { ALTIVEC_BUILTIN_VEC_MULO, ALTIVEC_BUILTIN_VMULOSH, RS6000_BTI_V4SI, RS6000_BTI_V8HI, RS6000_BTI_V8HI, 0 }, + { P8V_BUILTIN_VEC_VMULOSW, P8V_BUILTIN_VMULOSW, +RS6000_BTI_V2DI, RS6000_BTI_V4SI, RS6000_BTI_V4SI, 0 }, + { P8V_BUILTIN_VEC_VMULOUW, P8V_BUILTIN_VMULOUW, +RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V4SI, RS6000_BTI_unsigned_V4SI, 0 }, + { P8V_BUILTIN_VEC_VMULUWM, P8V_BUILTIN_VMULUWM, +RS6000_BTI_V4SI, RS6000_BTI_V4SI, RS6000_BTI_V4SI, 0 }, + { P8V_BUILTIN_VEC_VMULUWM, P8V_BUILTIN_VMULUWM, +RS6000_BTI_unsigned_V4SI, RS6000_BTI_unsigned_V4SI, RS6000_BTI_unsigned_V4SI, 0 }, { ALTIVEC_BUILTIN_VEC_VMULOSH, ALTIVEC_BUILTIN_VMULOSH, RS6000_BTI_V4SI, RS6000_BTI_V8HI, RS6000_BTI_V8HI, 0 }, { ALTIVEC_BUILTIN_VEC_VMULOUH, ALTIVEC_BUILTIN_VMULOUH, Index: gcc/config/rs6000/rs6000-builtin.def === --- gcc/config/rs6000/rs6000-builtin.def (revision 201409) +++ gcc/config/rs6000/rs6000-builtin.def (working copy) @@ -1315,6 +1315,11 @@ BU_P8V_AV_2 (VMINUD, "vminud", CONST, u BU_P8V_AV_2 (VMAXUD, "vmaxud", CONST, umaxv2di3) BU_P8V_AV_2 (VMRGEW, "vmrgew", CONST, p8_vmrgew) BU_P8V_AV_2 (VMRGOW, "vmrgow", CONST, p8_vmrgow) +BU_P8V_AV_2 (VMULESW, "vmulesw", CONST, vec_widen_smult_even_v4si) +BU_P8V_AV_2 (VMULEUW, "vmuleuw", CONST, vec_widen_umult_even_v4si) +BU_P8V_AV_2 (VMULOSW, "vmulosw", CONST, vec_widen_smult_odd_v4si) +BU_P8V_AV_2 (VMULOUW, "vmulouw", CONST, vec_widen_umult_odd_v4si) +BU_P8V_AV_2 (VMULUWM, "vmuluwm", CONST, altivec_vmuluwm) BU_P8V_AV_2 (VPKUDUM, "vpkudum", CONST, altivec_vpkudum) BU_P8V_AV_2 (VPKSDSS, "vpksdss", CONST, altivec_vpksdss) BU_P8V_AV_2 (VPKUDUS, "vpkudus", CONST, altivec_vpkudus) @@ -1382,6 +1387,11 @@ BU_P8V_OVERLOAD_2 (VMINSD, "vminsd") BU_P8V_OVERLOAD_2 (VMINUD, "vminud") BU_P8V_OVERLOAD_2 (VMRGEW, "vmrgew") BU_P8V_OVERLOAD_2 (VMRGOW, "vmrgow") +BU_P8V_OVERLOAD_2 (VMULESW, "vmulesw") +BU_P8V_OVERLOAD_2 (VMULEUW, "vmuleuw") +BU_P8V_OVERLOAD_2 (VMULOSW, "vmulosw") +BU_P8V_OVERLOAD_2 (VMULOUW, "vmulouw") +BU_P8V_OVERLOAD_2 (VMULUWM, "vmuluwm") BU_P8V_OVERLOAD_2 (VPKSDSS, "vpksdss") BU_P8V_OVERLOAD_2 (VPKSDUS, "vpksdus") BU_P8V_OVERLOAD_2 (VPKUDUM, "vpkudum") Index: gcc/config/rs6000/altivec.md =
Re: PING: Re: [patch] implement simd loops in trunk (OMP_SIMD)
> + if (simd > + /* > + || (fd->sched_kind == OMP_CLAUSE_SCHEDULE_STATIC > + && !fd->have_ordered)*/) Debugging leftovers or what? > + /* Enforce simdlen 1 in simd loops with data sharing clauses referencing > + variable sized vars. That is unnecessarily hard to support and very > + unlikely to result in vectorized code anyway. */ > + if (is_simd) > +for (c = clauses; c ; c = OMP_CLAUSE_CHAIN (c)) > + switch (OMP_CLAUSE_CODE (c)) > + { > + case OMP_CLAUSE_REDUCTION: > + if (OMP_CLAUSE_REDUCTION_PLACEHOLDER (c)) > + max_vf = 1; > + /* FALLTHRU */ > + case OMP_CLAUSE_PRIVATE: > + case OMP_CLAUSE_FIRSTPRIVATE: > + case OMP_CLAUSE_LASTPRIVATE: > + case OMP_CLAUSE_LINEAR: > + if (is_variable_sized (OMP_CLAUSE_DECL (c))) > + max_vf = 1; > + break; > + default: > + continue; > + } Comment is wrong, code is right, adjusting max_vf not simdlen. > + /* If max_vf is non-NULL, then we can use only vectorization factor > + up to the max_vf we chose. So stick it into safelen clause. */ > + if (max_vf) > +{ > + tree c = find_omp_clause (gimple_omp_for_clauses (ctx->stmt), > + OMP_CLAUSE_SAFELEN); First, non-zero, not non-null -- max_vf is not a pointer. Second, I don't understand why we're adjusting safelen. The VF we choose for optimizing the loop (even 1), does not change the fact that there are no dependencies between loop iterations larger than that. Did you want to be adding a _max_vf_ attribute to record stuff that we learned while examining the omp loop? E.g. the max_vf=1 reduction above? Given the only adjustment we make to max_vf is to disable vectorization, did we in fact want to discard the simd attribute instead, making this a "regular" openmp loop? > + if (__builtin_expect (N32 cond3 N31, 0)) goto ZERO_ITER_BB; > + if (cond3 is <) > + adj = STEP3 - 1; > + else > + adj = STEP3 + 1; > + count3 = (adj + N32 - N31) / STEP3; > + if (__builtin_expect (N22 cond2 N21, 0)) goto ZERO_ITER_BB; > + if (cond2 is <) > + adj = STEP2 - 1; > + else > + adj = STEP2 + 1; > + count2 = (adj + N22 - N21) / STEP2; > + if (__builtin_expect (N12 cond1 N11, 0)) goto ZERO_ITER_BB; CEIL_DIV_EXPR, instead of TRUNC_DIV_EXPR and adjusting by hand? Unless we can't generate the same code in the end because generically we don't know that the values involved must be negative for GT? I wonder if we do better mooshing all of the BBs together, creating one larger BB with all the computation and the unexpected jump at the end. I.e. bool zero3, zero2, zero1, zero; zero3 = N32 c3 N31; count3 = (N32 - N31) /[cl] STEP3; zero2 = N22 c2 N21; count2 = (N22 - N21) /[cl] STEP2; zero1 = N12 c1 N11; count1 = (N12 - N11) /[cl] STEP1; zero = zero3 || zero2 || zero1; count = count1 * count2 * count3; if (__builtin_expect(zero, false)) goto zero_iter_bb; After all, we expect the zero=false, and thus we expect to have to evaluate all of the comparison expressions, so short-circuiting oughtn't be a win. Since the condition isn't protecting a denominator, we're not concerned about divide-by-zero, so we can fully evaluate count even if a numerator turned out to be wrong. It seems like putting this all together would create much better scheduling opportunities, and less pressure on the chip's branch predictor. > @@ -291,6 +300,15 @@ vect_analyze_data_ref_dependence (struct > data_dependence_re You've two adjustments in this function. First hunk for "dont_know", which is fine; second hunk for "data is bad", which I suppose is really a form of dont_know and thus also fine. But it seems like there should be a third hunk in the "know" section, in which we perform some MAX(dist, loop->safelen) computation inside that loop. Alternately, why are we patching this function at all, rather than whatever bit of code creates the data_dependence_relation data? > + hash_table simduid_to_vf_htab; > + hash_table decl_to_simduid_htab; Why two hashes? Seems like you can map from decl to vf directly. At what point do you have a simduid integer, but not the decl from whence it came? r~
Re: [PING] Re: [C++ Patch] for c++/54537
On Thu, 2013-08-01 at 11:33 +0200, Paolo Carlini wrote: > On 07/31/2013 10:01 PM, Peter Bergner wrote: > > Can you tell me what the status of the following patch that > > removes the pow() overload from tr1 is? Specifically: > > > > http://gcc.gnu.org/ml/gcc-patches/2012-11/msg01166.html > > > > It seemed to have been approved and you were going to do some > > additional testing before committing the patch last November, > > but the patch still hasn't been committed yet. > Peter, please build & test the patch once more and commit it and close > the bug. The patch needed a small change to the expected error message in using9.C to get it to pass, otherwise, it passed bootstrap and regtesting with no errors. Patch committed to mainline as revision 201414 and the bug closed. Thanks. Jakub & richi, This bug also affects GCC 4.8. Is it appropriate to backport this patch to the FSF 4.8 branch? I can bootstrap and regtest it there too before committing if you think it's ok. Peter
[PATCH, i386]: Use INTEGER_CLASS_P and MAYBE_NON_Q_REGS_P in ix86_secondary_reload
Hello! This patch generalizes register class check, no other functional changes. 2013-08-01 Uros Bizjak * config/i386/i386.h (MAYBE_NON_Q_CLASS_P): New. * config/i386/i386.c (ix86_secondary_reload): Use INTEGER_CLASS_P and MAYBE_NON_Q_CLASS_P where appropriate. Tested on x86_64-pc-linux-gnu {,-m32} and committed to mainline SVN. Uros. Index: config/i386/i386.c === --- config/i386/i386.c (revision 201413) +++ config/i386/i386.c (working copy) @@ -33822,7 +33822,7 @@ if (TARGET_64BIT && MEM_P (x) && GET_MODE_SIZE (mode) > UNITS_PER_WORD - && rclass == GENERAL_REGS + && INTEGER_CLASS_P (rclass) && !offsettable_memref_p (x)) { sri->icode = (in_p @@ -33838,12 +33838,8 @@ intermediate register on 32bit targets. */ if (!TARGET_64BIT && !in_p && mode == QImode - && (rclass == GENERAL_REGS - || rclass == LEGACY_REGS - || rclass == NON_Q_REGS - || rclass == SIREG - || rclass == DIREG - || rclass == INDEX_REGS)) + && INTEGER_CLASS_P (rclass) + && MAYBE_NON_Q_CLASS_P (rclass)) { int regno; Index: config/i386/i386.h === --- config/i386/i386.h (revision 201413) +++ config/i386/i386.h (working copy) @@ -1288,13 +1288,16 @@ #define MAYBE_FLOAT_CLASS_P(CLASS) \ reg_classes_intersect_p ((CLASS), FLOAT_REGS) #define MAYBE_SSE_CLASS_P(CLASS) \ - reg_classes_intersect_p (SSE_REGS, (CLASS)) + reg_classes_intersect_p ((CLASS), SSE_REGS) #define MAYBE_MMX_CLASS_P(CLASS) \ - reg_classes_intersect_p (MMX_REGS, (CLASS)) + reg_classes_intersect_p ((CLASS), MMX_REGS) #define Q_CLASS_P(CLASS) \ reg_class_subset_p ((CLASS), Q_REGS) +#define MAYBE_NON_Q_CLASS_P(CLASS) \ + reg_classes_intersect_p ((CLASS), NON_Q_REGS) + /* Give names of register classes as strings for dump file. */ #define REG_CLASS_NAMES \
[PATCH/Merge Request] Vtable Verification feature.
Quick Reminder; The vtable verification feature (controlled by a flag) is designed to detect, at run time, if/when the vtable pointer in a C++ object has been corrupted, before allowing virtual calls through that pointer. If pointer corruption is detected, execution of the program is halted. I have created (with some help) a git branch on gcc.gnu.org to contain the vtable verification feature work. This work is now well integrated with GCC trunk, and the sources are in a good state for future work. I believe all previous review comments have been addressed. The feature is disabled by default, and disabled on non-linux OSes. Regression testing on linux is showing great results (no regressions). There is a wiki page at http://gcc.gnu.org/wiki/vtv that contains information about how to use the feature, as well as links to the Feature Proposal, User's Guide, and the presentation on this feature at last year's GNU Tools Cauldron. This work has been ongoing since GCC 4.7 and is now ready to merge into trunk for the GCC 4.9.0 release. I would like permission to merge this in (and or information on the best way to proceed from here). Thanks. -- Caroline Tice cmt...@google.com
Re: PR 57779 New debug check
Attached patch applied. Compare to the proposed one I had to: - Disable the new check for __gnu_debug::basic_string<>, it is supported following Standard words. To do so I had to slithly review how _GLIBCXX_DEBUG_PEDANTIC was managed. - Add check on forward_list::insert_after 2013-08-01 François Dumont PR libstdc++/57779 * include/debug/formatter.h (_Debug_msg_id): Add __msg_insert_itself_range entry. * include/debug/functions.h (_Insert_range_from_self_is_safe<>): New, indicate container types supporting self range insertion in GNU implementation. (__foreign_iterator): New, check if an iterator points to a given sequence. * include/debug/macros.h (__glibcxx_check_insert_range): Add check using __foreign_iterator. (__gibcxx_check_insert_range_after): Likewise. * include/debug/string (_Insert_range_from_self_is_safe<>): Partially specialized to mark __gnu_debug::basic_string<> as supporting self range insert. * include/debug/list (_Insert_range_from_self_is_safe<>): Partially specialized to mark std::list as supporting self range insert if _GLIBCXX_DEBUG_PEDANTIC is not defined. * include/debug/forward_list (_Insert_range_from_self_is_safe<>): Likewise. * src/c++11/debug.cc (_S_debug_messages): Add __msg_insert_itself_range_entry message. (_Error_formatter::_Parameter::_M_print_description): Display iterator sequence address rather than sequence address when the parameter type is an iterator. (_Error_formatter::_M_print_word): Enhance behavior when displaying a word with an appended '\n'. * testsuite/util/debug/checks.h (check_insert4<>): New. * testsuite/23_containers/deque/debug/insert5_neg.cc: New. * testsuite/23_containers/vector/debug/insert5_neg.cc: Likewise. * testsuite/23_containers/vector/debug/insert6_neg.cc: Likewise. * testsuite/23_containers/vector/debug/57779_neg.cc: Likewise. * testsuite/23_containers/list/debug/insert5_neg.cc: Likewise. * testsuite/23_containers/forward_list/debug/insert_after4_neg.cc: Likewise. François On 07/31/2013 11:50 PM, Paolo Carlini wrote: On 07/31/2013 09:58 PM, François Dumont wrote: Here is another proposal using std::common_type to find out how to instantiate std::less and std::greater_equal. Much easier with C++11 features indeed. Tested under linux x86_64 debug mode. Ok to commit ? Ok, thanks! Paolo. Index: include/debug/formatter.h === --- include/debug/formatter.h (revision 201413) +++ include/debug/formatter.h (working copy) @@ -114,7 +114,9 @@ // unordered container buckets __msg_bucket_index_oob, __msg_valid_load_factor, -__msg_equal_allocs +// others +__msg_equal_allocs, +__msg_insert_range_from_self }; class _Error_formatter Index: include/debug/functions.h === --- include/debug/functions.h (revision 201413) +++ include/debug/functions.h (working copy) @@ -32,7 +32,11 @@ #include #include // for iterator_traits, categories and // _Iter_base -#include // for __is_integer +#include // for __is_integer +#if __cplusplus >= 201103L +# include // for less and greater_equal +# include // for common_type +#endif #include namespace __gnu_debug @@ -40,6 +44,10 @@ template class _Safe_iterator; + template +struct _Insert_range_from_self_is_safe +{ enum { __value = 0 }; }; + // An arbitrary iterator pointer is not singular. inline bool __check_singular_aux(const void*) { return false; } @@ -162,6 +170,123 @@ return __first; } +#if __cplusplus >= 201103L + template +inline bool +__foreign_iterator_aux4(const _Safe_iterator<_Iterator, _Sequence>& __it, + _InputIterator __other, + _PointerType1, _PointerType2) +{ + typedef typename std::common_type<_PointerType1, + _PointerType2>::type _PointerType; + std::less<_PointerType> __l; + std::greater_equal<_PointerType> __ge; + + return + __l(std::addressof(*__other), + std::addressof(*(__it._M_get_sequence()->_M_base().begin( + || __ge(std::addressof(*__other), + std::addressof(*(__it._M_get_sequence()->_M_base().end() - 1)) + 1); + +} + + template +inline bool +__foreign_iterator_aux3(const _Safe_iterator<_Iterator, _Sequence>& __it, + _InputIterator __other, + std::true_type) +{ + // Only containers with all elements in contiguous memory can have their + // elements passed through pointers. + // Arithmetics is here just to make sure we are not dereferencing + // past-the-end iterator. + if (__it._M_get_sequence()->_M_base().begin() + != __it._M_get_sequence()->_M_base().end()) + if (std::__addressof(*(__it._M_get_sequence()->_M_base().end() - 1)) + - std::__addressof(*(__it._M_get_sequence()->_M_
Re: [PATCH/Merge Request] Vtable Verification feature.
Nice to see! > I have created (with some help) a git branch on gcc.gnu.org to contain > the vtable verification feature work. This work is now well > integrated with GCC trunk, and the sources are in a good state for > future work. I believe all previous review comments have been > addressed. The runtime bits are OK to me. Thanks for all your work on this. > The feature is disabled by default, and disabled on non-linux OSes. > Regression testing on linux is showing great results (no regressions). I can confirm these results on x86_64/linux. > > There is a wiki page at http://gcc.gnu.org/wiki/vtv that contains > information about how to use the feature, as well as links to the > Feature Proposal, User's Guide, and the presentation on this feature > at last year's GNU Tools Cauldron. This work has been ongoing since > GCC 4.7 and is now ready to merge into trunk for the GCC 4.9.0 > release. > > I would like permission to merge this in (and or information on the > best way to proceed from here). Thanks. You'll need a GWP to do the merge (Maybe Diego, Jason, Richard Henderson?) and then add yourself to MAINTAINERS for libvtv. -benjamin You'll need a GWP to do the merge, and then add yourself to MAINTAINERS for libvtv.
Re: [PATCH 07/11] Introduce virtual functions in testsuite/gcc.dg/plugin/one_time_plugin.c
On 07/26/2013 05:04 AM, David Malcolm wrote: > This is an example of converting the "gate" and "execute" functions of > a pass into C++ virtual functions, so that in the next patch we can move > a variable into member data of the opt_pass subclass. > > gcc/testsuite/ > > * gcc.dg/plugin/one_time_plugin.c: (one_pass_gate): convert > to member function... > (one_pass::gate): ...this > (one_pass_exec): convert to member function... > (one_pass::impl_execute): ...this Ok. r~
Re: [PATCH 06/11] Rewrite how instances of passes are cloned
On 08/01/2013 07:55 AM, David Malcolm wrote: > On Fri, 2013-07-26 at 11:04 -0400, David Malcolm wrote: >> > gcc/ >> > >> >Rewrite how instances of passes are cloned to remove assumptions >> >about their sizes (thus allowing pass subclasses to have >> >additional data fields, albeit non-GC-managed ones at this point). >> > >> >* passes.c (make_pass_instance): Now that passes have clone >> >methods, rewrite this function to eliminate XNEW and memcpy >> >calls that used hardcoded sizes. Since this function no longer >> >creates pass instances, rename it to... >> >(add_pass_instance): ...this. Document the old way that passes >> >were numbered and flagged, and rework this function to continue >> >using it. >> >(next_pass_1): Add an initial_pass argument for use by >> >add_pass_instance. >> >(position_pass): When adding multiple instances of a pass, use >> >the pass's clone method, rather than relying on the XNEW/memcpy >> >within the former make_pass_instance (now add_pass_instance). >> >(pipeline::pipeline): When invoking next_pass_1, also supply the >> >initial instance of the current pass within the pipeline. Ok (with the pass_manager change). r~
Re: [PATCH 08/11] Example of converting global state to per-pass state
On 07/26/2013 05:04 AM, David Malcolm wrote: > gcc/testsuite/ > > Example of converting global state to per-pass state. > > * gcc.dg/plugin/one_time_plugin.c (one_pass::execute): Convert > global state "static int counter" to... > (one_pass::counter): ...this instance data. Ok. r~
Re: [PATCH 09/11] Support "gcc" namespace in gengtype
On 07/26/2013 05:04 AM, David Malcolm wrote: > + "/* Types with a \"gcc::\" prefix have the prefix stripped\n" > + " during gengtype parsing. Provide a \"using\" directive\n" > + " to ensure that the fully-qualified types are found. */\n" I'd rather not use the word "prefix"; the term "namespace" is both more exact and less confusing in this context. Otherwise ok. r~
Re: [PATCH 10/11] Make gcc::context be GC-managed
On 07/26/2013 05:04 AM, David Malcolm wrote: > +/* Functions relating to the garbage collector. */ > +void > +gcc::context::gt_ggc_mx () > +{ > + /* Currently a no-op. */ > +} > + > +void > +gcc::context::gt_pch_nx () > +{ > + /* Currently a no-op. */ > +} > + > +void > +gcc::context::gt_pch_nx (gt_pointer_operator op ATTRIBUTE_UNUSED, > + void *cookie ATTRIBUTE_UNUSED) > +{ > + /* Currently a no-op. */ > +} I suppose these are members because you'll want to access private members later, and that's easier than playing with "friend"? > +void gt_ggc_mx (gcc::context *ctxt) > +{ > + ctxt->gt_ggc_mx (); > +} > + > +void gt_pch_nx (gcc::context *ctxt) > +{ > + ctxt->gt_pch_nx (); > +} > + > +void gt_pch_nx (gcc::context *ctxt, gt_pointer_operator op, void *cookie) > +{ > + ctxt->gt_pch_nx (op, cookie); > +} Should these be inline functions in context.h, so that the call does direct? Or do we take their address, making that sorta pointless? It seems like there's one level of indirection here that is avoidable... r~
Re: [PATCH 11/11] Make opt_pass and gcc::pipeline be GC-managed
On 07/26/2013 05:04 AM, David Malcolm wrote: > (opt_pass::gt_ggc_mx): New. > (opt_pass::gt_pch_nx): New. > (opt_pass::gt_pch_nx_with_op): New. > (gt_ggc_mx (opt_pass *)): New. > (gt_pch_nx (opt_pass *)): New. > (gt_pch_nx_opt_pass): New. > (pipeline::operator new): New. > (pipeline::gt_ggc_mx): New. > (pipeline::gt_pch_nx): New. > (pipeline::gt_pch_nx_with_op): New. > (gt_ggc_mx (pipeline *)): New. > (gt_pch_nx (pipeline *)): New. > (gt_pch_nx_pipeline): New. I guess my previous comments about ::gt_ggc_mx vs class::gt_ggc_mx wrt the context structure apply as well to this patch. r~
Re: [PATCH] Add atomic type qualifier
On 07/26/2013 07:21 PM, Joseph S. Myers wrote: On Fri, 26 Jul 2013, Andrew MacLeod wrote: This patch adds an atomic type qualifier to GCC. It can be accessed via __attribute__((atomic)) or in C11 mode via the _Atomic keyword. Why the attribute - why not just the keyword? * When C11 refers to qualified types, by default it does not include atomic types (6.2.5#27). What's your rationale for including "atomic" in TYPE_QUALS rather than making it separate? With either approach, a review of every reference to qualifiers in the front end is needed to determine what's correct for "atomic"; did you find that including "atomic" in qualifiers in the implementation made for fewer changes? Ive gotten the expressions mostly working (= and op= work great), but am having some issues with plain loads and identifying the correct time to emit the load... I've also hacked up a rough approximation of what stdatomic.h would need. I also think I'll revisit how atomic values are marked in the front end... now that Im doing all the real work in the front end, I don't think I want that attribute all over the place like I did before. I may well decide to remove it from the qualifiers and make it a separate type now... So don't worry about reviewing those files... I likely wont get to this until I get back from vacation in a couple of weeks however. I will post the state of things before I go, both for my own recollection when I return, and in case anyone just happens to feel really interested and wants to have a look at some of the remaining required changes. ha. I wish.Its starting to get close tho. Andrew
Re: [PING] [C++ Patch] Remove finish_stmt
2013/8/1 Paolo Carlini : > Hi, > > gently pinging this small clean-up: > > http://gcc.gnu.org/ml/gcc-patches/2013-06/msg00905.html > > Thanks! > Paolo. It was supposed to provide symmetry and a good place to put any "cleanup" code we might want to run, but that never materialized, and now that we are using C++, we are better abstraction tools. So, it can go. Patch OK. -- Gaby
[Ping] [Google] Fix profiledbootstrap failure
Ping? Hi, Here is the patch, Tested by profiledbootstrap. Ok for google gcc-4.8? thanks, Dinar. profiledbootstrap-fix1.patch Description: Binary data
Re: [Ping] [Google] Fix profiledbootstrap failure
Sorry for the delay. The patch is ok and I have committed it to the google branch. thanks, David On Thu, Aug 1, 2013 at 4:51 PM, Dinar Temirbulatov wrote: > Ping? > Hi, > Here is the patch, Tested by profiledbootstrap. Ok for google gcc-4.8? > thanks, Dinar.
Re: [PING] Re: [C++ Patch] for c++/54537
On Thu, 2013-08-01 at 14:42 -0500, Peter Bergner wrote: > Jakub & richi, > > This bug also affects GCC 4.8. Is it appropriate to backport this patch > to the FSF 4.8 branch? I can bootstrap and regtest it there too before > committing if you think it's ok. FYI, I tested the patch on the FSF 4.8 branch and it bootstraps and regtests with no regressions. Peter
Re: Re: PR 57779 New debug check
This patch broke bootstrap on AIX and probably many other targets. In file included from /tmp/20130801/powerpc-ibm-aix7.1.0.0/libstdc++-v3/include/ debug/safe_sequence.h:34:0, from /nasfarm/edelsohn/src/src/libstdc++-v3/src/c++11/debug.cc: 26: /tmp/20130801/powerpc-ibm-aix7.1.0.0/libstdc++-v3/include/debug/functions.h: In function 'bool __gnu_debug::__foreign_iterator_aux4(const __gnu_debug::_Safe_ite rator<_Iterator, _Sequence>&, _InputIterator, _PointerType1, _PointerType2)': /tmp/20130801/powerpc-ibm-aix7.1.0.0/libstdc++-v3/include/debug/functions.h:189: 6: error: 'addressof' is not a member of 'std' __l(std::addressof(*__other), ^ /tmp/20130801/powerpc-ibm-aix7.1.0.0/libstdc++-v3/include/debug/functions.h:190: 6: error: 'addressof' is not a member of 'std' std::addressof(*(__it._M_get_sequence()->_M_base().begin( ^ /tmp/20130801/powerpc-ibm-aix7.1.0.0/libstdc++-v3/include/debug/functions.h:191: 10: error: 'addressof' is not a member of 'std' || __ge(std::addressof(*__other), ^ /tmp/20130801/powerpc-ibm-aix7.1.0.0/libstdc++-v3/include/debug/functions.h:192: 3: error: 'addressof' is not a member of 'std' std::addressof(*(__it._M_get_sequence()->_M_base().end() - 1)) + 1); ^ /tmp/20130801/powerpc-ibm-aix7.1.0.0/libstdc++-v3/include/debug/functions.h: In function 'bool __gnu_debug::__foreign_iterator_aux3(const __gnu_debug::_Safe_ite rator<_Iterator, _Sequence>&, _InputIterator, std::true_type)': /tmp/20130801/powerpc-ibm-aix7.1.0.0/libstdc++-v3/include/debug/functions.h:208: 6: error: '__addressof' is not a member of 'std' if (std::__addressof(*(__it._M_get_sequence()->_M_base().end() - 1)) ^ /tmp/20130801/powerpc-ibm-aix7.1.0.0/libstdc++-v3/include/debug/functions.h:209: 8: error: '__addressof' is not a member of 'std' - std::__addressof(*(__it._M_get_sequence()->_M_base().begin())) ^ /tmp/20130801/powerpc-ibm-aix7.1.0.0/libstdc++-v3/include/debug/functions.h:212: 4: error: 'addressof' is not a member of 'std' std::addressof(*(__it._M_get_sequence()->_M_base().begin())), ^ /tmp/20130801/powerpc-ibm-aix7.1.0.0/libstdc++-v3/include/debug/functions.h:213: 4: error: 'addressof' is not a member of 'std' std::addressof(*__other));
Re: Re: PR 57779 New debug check
On Thu, Aug 1, 2013 at 9:42 PM, David Edelsohn wrote: > This patch broke bootstrap on AIX and probably many other targets. > > In file included from > /tmp/20130801/powerpc-ibm-aix7.1.0.0/libstdc++-v3/include/ > debug/safe_sequence.h:34:0, > from > /nasfarm/edelsohn/src/src/libstdc++-v3/src/c++11/debug.cc: > 26: > /tmp/20130801/powerpc-ibm-aix7.1.0.0/libstdc++-v3/include/debug/functions.h: > In > function 'bool __gnu_debug::__foreign_iterator_aux4(const > __gnu_debug::_Safe_ite > rator<_Iterator, _Sequence>&, _InputIterator, _PointerType1, _PointerType2)': > /tmp/20130801/powerpc-ibm-aix7.1.0.0/libstdc++-v3/include/debug/functions.h:189: > 6: error: 'addressof' is not a member of 'std' > __l(std::addressof(*__other), > ^ This is a libstdc++ bug. The file debug/functions.h is missing the inclusion of By the way, looking at the code, I think the local objects __l and __ge should be declared constexpr. That is not required, but it is good to annotate places where the code is conceptually "pure". -- Gaby
Re: msp430 port
> > What I really need is an int20_t type in the core of gcc, so I can set > > Pmode to *that*, to avoid the SImode stuff completely. But that's a > > core change, not a target change. > > Sometimes you have to make core changes for a new port. This sounds > like something that really should be fixed before this port can go in. That would be wonderful, IIRC I've asked about it before, and I plan on proposing it again, but (1) the port works as-is, and (2) I suspect such a proposal will start a political flame-fest, so I'd rather keep it separate.
Re: PR 57779 New debug check
On 08/02/2013 05:02 AM, Gabriel Dos Reis wrote: On Thu, Aug 1, 2013 at 9:42 PM, David Edelsohn wrote: This patch broke bootstrap on AIX and probably many other targets. In file included from /tmp/20130801/powerpc-ibm-aix7.1.0.0/libstdc++-v3/include/ debug/safe_sequence.h:34:0, from /nasfarm/edelsohn/src/src/libstdc++-v3/src/c++11/debug.cc: 26: /tmp/20130801/powerpc-ibm-aix7.1.0.0/libstdc++-v3/include/debug/functions.h: In function 'bool __gnu_debug::__foreign_iterator_aux4(const __gnu_debug::_Safe_ite rator<_Iterator, _Sequence>&, _InputIterator, _PointerType1, _PointerType2)': /tmp/20130801/powerpc-ibm-aix7.1.0.0/libstdc++-v3/include/debug/functions.h:189: 6: error: 'addressof' is not a member of 'std' __l(std::addressof(*__other), ^ This is a libstdc++ bug. The file debug/functions.h is missing the inclusion of By the way, looking at the code, I think the local objects __l and __ge should be declared constexpr. That is not required, but it is good to annotate places where the code is conceptually "pure". Thanks. I'll fix these issues momentarily. Paolo.