Re: libgo patch committed: Include TLS size in stack size
On Mon, Jun 04, 2012 at 11:19:46PM -0700, Ian Lance Taylor wrote:
> This patch to libgo includes the TLS size in the requested stack size of
> a new thread, if possible. This relies on the glibc-specific (and
> undocumented) _dl_get_tls_static_info call. This is particularly
> necessary when using glibc, because glibc removes the static TLS size
> from the requested stack space, and gives an error if there is not
> enough space. That means that a program that has a lot of TLS variables
> will fail bizarrely, or worse may simply get a stack overflow
> segmentation violation at runtime. This patch is far from perfect, but
> at least works around that problem for such programs. Bootstrapped and
> ran Go testsuite on x86_64-unknown-linux-gnu. Committed to mainline and
> 4.7 branch.

That is a very bad idea. _dl_get_tls_static_info is a @@GLIBC_PRIVATE
symbol, so it must not be used by anything but glibc itself, and it may
change or be removed at any time. The ban on using GLIBC_PRIVATE symbols
is even enforced by rpm, so if you build gcc as an rpm, it won't install.
Committing that to the 4.7 branch in particular is fatal.

Talk to libc-al...@sourceware.org about what should be done instead.

	Jakub
Re: [line-map] simple oneliner that speeds up track-macro-expansion
Dimitrios Apostolou writes:

> Hi Dodji,
>
> On Mon, 4 Jun 2012, Dodji Seketeli wrote:
>
>> Hello Dimitrios,
>>
>> I cannot approve or deny your patch, but I have one question.
>>
>
> Who should I CC then?

Do not worry, I am CC-ing the maintainers. I just forgot to CC them when
I replied. Sorry for that.

> I saw that you have commits in that file.

Yeah, nobody is perfect. :-)

>> I am wondering why this change implies better performance.
>>
>> Is this because when we later want to encode a new line/column, and
>> hit the spot below, (in linemap_line_start via
>> linemap_position_for_column), we call less linemap_add (and thus
>> allocate less maps):
>
> Almost. To be exact we don't even enter linemap_line_start() because
> of the check in linemap_position_for_column().

Great. So, for what it is worth, the patch looks OK to me. Let's see
what the maintainers say.

Thank you.

-- 
		Dodji
Re: [patch][PCH] Do not write/read asm_out_file
On Mon, Jun 4, 2012 at 8:23 PM, Steven Bosscher wrote:
> Hello,
>
> The attached patch removes one more #include output.h, this time from
> c-family/c-pch.c.
>
> Anything written out to asm_out_file between pch_init and
> c_common_write_pch is read back in by c_common_write_pch and dumped to
> the PCH that's being written out. In c_common_read_pch this data is
> written out verbatim to asm_out_file again.
>
> But nothing should write to asm_out_file between pch_init and
> c_common_write_pch. I suppose this happened before unit-at-a-time
> became the only supported compilation mode, but these days there's
> nothing, AFAICT, that should be written to asm_out_file by a front end
> during PCH generation.
>
> This patch was bootstrapped&tested on powerpc64-unknown-linux-gnu. OK for
> trunk?

I think the patch is reasonable but I'll defer to Joseph for approval.

Out of curiosity - what about that #ident thing? I suppose we'd ICE until
you have fixed that part, no?

Thanks,
Richard.

> Ciao!
> Steven
Re: _FORTIFY_SOURCE for std::vector
On Mon, Jun 4, 2012 at 9:07 PM, Marc Glisse wrote:
> On Mon, 4 Jun 2012, Florian Weimer wrote:
>
>> On 06/01/2012 01:34 PM, Jakub Jelinek wrote:
>>>
>>> Have you looked at the assembly differences with this in?
>>
>>
>> It's not great.
>>
>> Here's an example:
>>
>> void
>> write(std::vector<float>& blob, unsigned n, float v1, float v2, float v3,
>> float v4)
>> {
>>   blob[n] = v1;
>>   blob[n + 1] = v2;
>>   blob[n + 2] = v3;
>>   blob[n + 3] = v4;
>> }
>
>
> Would be great if it ended up testing only n and n+3.
> __attribute__((__noreturn__)) is not quite strong enough to allow this
> optimization, it would require something like __attribute__((__crashing__))
> to let the compiler know that if the function is called, you don't care what
> happens to blob. And possibly the use of a signed n.
>
> Note that even when the optimization would be legal, gcc seems to have a few
> difficulties:
>
> extern "C" void fail() __attribute((noreturn));
> void write(signed m, signed n)
> {
>   if((n+3)>m) fail();
>   if((n+2)>m) fail();
>   if((n+1)>m) fail();
>   if(n>m) fail();
> }
>
> keeps 3 tests.

Well, the issue is that we'd first need to commonize the fail () calls
which we do now, but even then VRP fails to simplify the comparisons
against the symbolic ranges (it's not very good at that). And that would
only be at -O1.

Note that such range-checks will defeat most, if not all, loop
optimizations, too. So C++ code using std::vector in compute-intensive
parts would be severely pessimized.

So, I don't think fortifying libstdc++ is a good idea at all.

Richard.

> --
> Marc Glisse
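
For illustration only, here is roughly what "testing only n and n+3"
would look like if written by hand. This is a minimal C sketch, not
anything from libstdc++: write4_checked and its error-code convention
are hypothetical, and an unsigned index is assumed so that one
comparison covers all four stores.

```c
#include <stddef.h>

/* Hypothetical helper: store four consecutive floats starting at index n,
   with one hoisted range check instead of one check per element.
   Returns 0 on success and -1 if any of the four stores would be out of
   range.  */
int write4_checked(float *blob, size_t len, size_t n,
                   float v1, float v2, float v3, float v4)
{
    /* n + 3 < len, written so that the unsigned arithmetic cannot wrap.  */
    if (len < 4 || n > len - 4)
        return -1;

    blob[n] = v1;
    blob[n + 1] = v2;
    blob[n + 2] = v3;
    blob[n + 3] = v4;
    return 0;
}
```

A per-element check that calls a noreturn failure function cannot be
merged this way by the compiler unless it is allowed to skip the stores
that precede the failing one, which is the point Marc makes above.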
[ARM Patch 4/n]PR53447: optimizations of 64bit ALU operation with constant
Hi This is the fourth part of the patches that deals with 64bit ior. It directly extends the patterns iordi3, iordi3_insn and iordi3_neon to handle 64bit constant operands. Tested on arm qemu without regression. OK for trunk? thanks Carrot 2012-06-05 Wei Guozhi PR target/53447 * gcc.target/arm/pr53447-4.c: New testcase. 2012-06-05 Wei Guozhi PR target/53447 * config/arm/arm-protos.h (const_ok_for_iordi): New prototype. * config/arm/arm.c (const_ok_for_iordi): New function. * config/arm/constraints.md (Df): New constraint. * config/arm/predicates.md (arm_iordi_operand): New predicate. (arm_immediate_iordi_operand): Likewise. (iordi_operand): Likewise. * config/arm/arm.md (iordi3): Extend it to handle 64bit constants. (iordi3_insn): Likewise. * config/arm/neon.md (iordi3_neon): Likewise. Index: testsuite/gcc.target/arm/pr53447-4.c === --- testsuite/gcc.target/arm/pr53447-4.c(revision 0) +++ testsuite/gcc.target/arm/pr53447-4.c(revision 0) @@ -0,0 +1,8 @@ +/* { dg-options "-O2" } */ +/* { dg-require-effective-target arm32 } */ +/* { dg-final { scan-assembler-not "mov" } } */ + +void t0p(long long * p) +{ + *p |= 0x10008; +} Index: config/arm/arm.c === --- config/arm/arm.c(revision 188048) +++ config/arm/arm.c(working copy) @@ -2496,6 +2496,24 @@ } } +/* Return TRUE if int I is a valid immediate constant used by pattern + iordi3_insn. */ +int +const_ok_for_iordi (HOST_WIDE_INT i) +{ + HOST_WIDE_INT high = ARM_SIGN_EXTEND ((i >> 32) & 0x); + HOST_WIDE_INT low = ARM_SIGN_EXTEND (i & 0x); + + if (TARGET_32BIT && const_ok_for_arm (low) && const_ok_for_arm (high)) +return 1; + + if (TARGET_THUMB2 && (const_ok_for_arm (low) || const_ok_for_arm (~low)) + && (const_ok_for_arm (high) || const_ok_for_arm (~high))) +return 1; + + return 0; +} + /* Emit a sequence of insns to handle a large constant. 
CODE is the code of the operation required, it can be any of SET, PLUS, IOR, AND, XOR, MINUS; Index: config/arm/arm-protos.h === --- config/arm/arm-protos.h (revision 188048) +++ config/arm/arm-protos.h (working copy) @@ -47,6 +47,7 @@ extern bool arm_small_register_classes_for_mode_p (enum machine_mode); extern int arm_hard_regno_mode_ok (unsigned int, enum machine_mode); extern bool arm_modes_tieable_p (enum machine_mode, enum machine_mode); +extern int const_ok_for_iordi (HOST_WIDE_INT); extern int const_ok_for_arm (HOST_WIDE_INT); extern int const_ok_for_op (HOST_WIDE_INT, enum rtx_code); extern int arm_split_constant (RTX_CODE, enum machine_mode, rtx, Index: config/arm/neon.md === --- config/arm/neon.md (revision 188048) +++ config/arm/neon.md (working copy) @@ -729,9 +729,9 @@ ) (define_insn "iordi3_neon" - [(set (match_operand:DI 0 "s_register_operand" "=w,w,?&r,?&r,?w,?w") -(ior:DI (match_operand:DI 1 "s_register_operand" "%w,0,0,r,w,0") - (match_operand:DI 2 "neon_logic_op2" "w,Dl,r,r,w,Dl")))] + [(set (match_operand:DI 0 "s_register_operand" "=w,w,?&r,?&r,?w,?w,?&r,?&r") +(ior:DI (match_operand:DI 1 "s_register_operand" "%w,0,0,r,w,0,0,r") + (match_operand:DI 2 "iordi_operand" "w,Dl,r,r,w,Dl,Df,Df")))] "TARGET_NEON" { switch (which_alternative) @@ -743,12 +743,14 @@ DImode, 0, VALID_NEON_QREG_MODE (DImode)); case 2: return "#"; case 3: return "#"; +case 6: return "#"; +case 7: return "#"; default: gcc_unreachable (); } } - [(set_attr "neon_type" "neon_int_1,neon_int_1,*,*,neon_int_1,neon_int_1") - (set_attr "length" "*,*,8,8,*,*") - (set_attr "arch" "nota8,nota8,*,*,onlya8,onlya8")] + [(set_attr "neon_type" "neon_int_1,neon_int_1,*,*,neon_int_1,neon_int_1,*,*") + (set_attr "length" "*,*,8,8,*,*,8,8") + (set_attr "arch" "nota8,nota8,*,*,onlya8,onlya8,*,*")] ) ;; The concrete forms of the Neon immediate-logic instructions are vbic and Index: config/arm/constraints.md === --- config/arm/constraints.md (revision 188048) +++ config/arm/constraints.md (working 
copy) @@ -29,7 +29,7 @@ ;; in Thumb-1 state: I, J, K, L, M, N, O ;; The following multi-letter normal constraints have been used: -;; in ARM/Thumb-2 state: Da, Db, Dc, Dn, Dl, DL, Dv, Dy, Di, Dt, Dz +;; in ARM/Thumb-2 state: Da, Db, Dc, Df, Dn, Dl, DL, Dv, Dy, Di, Dt, Dz ;; in Thumb-1 state: Pa, Pb, Pc, Pd, Pe ;; in Thumb-2 state: Pj, PJ, Ps, Pt, Pu, Pv, Pw, Px, Py @@ -251,6 +251,12 @@ (match_test "TARGET_32BIT && arm_const_double_inline_cost (op) == 4 && !(optimize_size || arm_ld_sched)"))
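
To make the idea behind const_ok_for_iordi concrete, here is a minimal
standalone C sketch of the ARM-mode case only. It assumes the 64-bit
constant is split into two 32-bit halves (the mask constants in the
posted diff are truncated, so that split is an assumption), and it
models only the classic ARM data-processing immediate, an 8-bit value
rotated right by an even amount. The names arm_imm_ok and
iordi_const_ok_sketch are hypothetical, and the Thumb-2 branch of the
patch is not modelled.

```c
#include <stdint.h>

/* Return 1 if V can be encoded as an ARM data-processing immediate,
   i.e. an 8-bit value rotated right by an even number of bits.  */
int arm_imm_ok(uint32_t v)
{
    for (unsigned rot = 0; rot < 32; rot += 2) {
        /* Undo a rotate-right by ROT by rotating left by ROT.  The
           rot == 0 case is handled separately to avoid a shift by 32.  */
        uint32_t r = rot ? ((v << rot) | (v >> (32 - rot))) : v;
        if (r <= 0xffu)
            return 1;
    }
    return 0;
}

/* Sketch of the ARM-mode test: a 64-bit constant is acceptable for a
   two-instruction ORR sequence if both halves are valid immediates.  */
int iordi_const_ok_sketch(uint64_t c)
{
    uint32_t low = (uint32_t) (c & 0xffffffffu);
    uint32_t high = (uint32_t) (c >> 32);

    return arm_imm_ok(low) && arm_imm_ok(high);
}
```

With this check, a constant such as 0x0000000100000008 needs no scratch
register because each half fits an ORR immediate, while a half such as
0x101 does not.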
Re: libgo patch committed: Include TLS size in stack size
On Tue, Jun 5, 2012 at 9:20 AM, Jakub Jelinek wrote:
> On Mon, Jun 04, 2012 at 11:19:46PM -0700, Ian Lance Taylor wrote:
>> This patch to libgo includes the TLS size in the requested stack size of
>> a new thread, if possible. This relies on the glibc-specific (and
>> undocumented) _dl_get_tls_static_info call. This is particularly
>> necessary when using glibc, because glibc removes the static TLS size
>> from the requested stack space, and gives an error if there is not
>> enough space. That means that a program that has a lot of TLS variables
>> will fail bizarrely, or worse may simply get a stack overflow
>> segmentation violation at runtime. This patch is far from perfect, but
>> at least works around that problem for such programs. Bootstrapped and
>> ran Go testsuite on x86_64-unknown-linux-gnu. Committed to mainline and
>> 4.7 branch.
>
> That is a very bad idea. _dl_get_tls_static_info is @@GLIBC_PRIVATE symbol,
> therefore it must not be used by anything but glibc itself, may change or
> may be removed at any time. Not using of GLIBC_PRIVATE symbols are even
> enforced
> by rpm, so if you build gcc as rpm, it won't install. So especially
> committing that to 4.7 branch is fatal.
>
> Talk to libc-al...@sourceware.org for what should be done instead.

Ian, can you please revert the patch ASAP as I want to get 4.7.1 out
of the door (well, a release candidate)? Otherwise we'll ship 4.7.1
with broken Go (not that we technically care - Go is not a primary
language).

Thanks,
Richard.

> Jakub
Re: [PATCH][RFC] Extend memset recognition
On Thu, 31 May 2012, Richard Guenther wrote: > On Wed, 30 May 2012, Richard Guenther wrote: > > > > > The patch below extents memset recognition to cover a few more > > non-byte-size store loops and all byte-size store loops. This exposes > > issues with our builtins.exp testsuite which has custom memset > > routines like > > > > void * > > my_memset (void *d, int c, size_t n) > > { > > char *dst = (char *) d; > > while (n--) > > *dst++ = c; > > return (char *) d; > > } > > > > Now, for LTO we have papered over similar issues by attaching > > the used attribute to the functions. But the general question is - when > > can we be sure the function we are dealing with are not the actual > > implementation for the builtin call we want to generate? A few > > things come to my mind: > > > > 1) the function already calls the function we want to generate (well, > > it might be a tail-recursive memset implementation ...) > > > > 2) the function availability is AVAIL_LOCAL > > > > 3) ... ? > > > > For sure 2) would work, but it would severely restrict the transform > > (do we care?). > > > > We have a similar issue with sin/cos -> sincos transform and a > > trivial sincos implementation. > > > > Any ideas? > > > > Bootstrapped (with memset recognition enabled by default) and tested > > on x86_64-unknown-linux-gnu with the aforementioned issues. > > The following fixes it by simply always adding > -fno-tree-loop-distribute-patterns to builtins.exp. > > Bootstrapped and tested on x86_64-unknown-linux-gnu. > > If there are no further comments I'll go with the local advise from > Micha who says "who cares". Now done with the much simpler patch below (after all the loop distribution TLC). Bootstrapped and tested on x86_64-unknown-linux-gnu, applied. Richard. 2012-06-05 Richard Guenther PR tree-optimization/53081 * tree-loop-distribution.c (generate_memset_builtin): Handle all kinds of byte-sized stores. (classify_partition): Likewise. 
(tree_loop_distribution): Adjust seed statements used for !flag_tree_loop_distribution. * gcc.dg/tree-ssa/ldist-19.c: New testcase. * gcc.c-torture/execute/builtins/builtins.exp: Always pass -fno-tree-loop-distribute-patterns. Index: gcc/tree-loop-distribution.c === *** gcc/tree-loop-distribution.c.orig 2012-06-04 17:05:14.0 +0200 --- gcc/tree-loop-distribution.c2012-06-04 17:32:38.829355831 +0200 *** generate_memset_builtin (struct loop *lo *** 332,337 --- 332,338 gimple_seq stmt_list = NULL, stmts; struct data_reference *dr = XCNEW (struct data_reference); location_t loc; + tree val; stmt = partition->main_stmt; loc = gimple_location (stmt); *** generate_memset_builtin (struct loop *lo *** 364,376 mem = force_gimple_operand (addr_base, &stmts, true, NULL); gimple_seq_add_seq (&stmt_list, stmts); fn = build_fold_addr_expr (builtin_decl_implicit (BUILT_IN_MEMSET)); ! fn_call = gimple_build_call (fn, 3, mem, integer_zero_node, nb_bytes); gimple_seq_add_stmt (&stmt_list, fn_call); gsi_insert_seq_after (&gsi, stmt_list, GSI_CONTINUE_LINKING); if (dump_file && (dump_flags & TDF_DETAILS)) ! fprintf (dump_file, "generated memset zero\n"); } /* Remove and destroy the loop LOOP. */ --- 365,408 mem = force_gimple_operand (addr_base, &stmts, true, NULL); gimple_seq_add_seq (&stmt_list, stmts); + /* This exactly matches the pattern recognition in classify_partition. 
*/ + val = gimple_assign_rhs1 (stmt); + if (integer_zerop (val) + || real_zerop (val) + || TREE_CODE (val) == CONSTRUCTOR) + val = integer_zero_node; + else if (integer_all_onesp (val)) + val = build_int_cst (integer_type_node, -1); + else + { + if (TREE_CODE (val) == INTEGER_CST) + val = fold_convert (integer_type_node, val); + else if (!useless_type_conversion_p (integer_type_node, TREE_TYPE (val))) + { + gimple cstmt; + tree tem = create_tmp_reg (integer_type_node, NULL); + tem = make_ssa_name (tem, NULL); + cstmt = gimple_build_assign_with_ops (NOP_EXPR, tem, val, NULL_TREE); + gimple_seq_add_stmt (&stmt_list, cstmt); + val = tem; + } + } + fn = build_fold_addr_expr (builtin_decl_implicit (BUILT_IN_MEMSET)); ! fn_call = gimple_build_call (fn, 3, mem, val, nb_bytes); gimple_seq_add_stmt (&stmt_list, fn_call); gsi_insert_seq_after (&gsi, stmt_list, GSI_CONTINUE_LINKING); if (dump_file && (dump_flags & TDF_DETAILS)) ! { ! fprintf (dump_file, "generated memset"); ! if (integer_zerop (val)) ! fprintf (dump_file, " zero\n"); ! else if (integer_all_onesp (val)) ! fprintf (dump_file, " minus one\n"); ! else ! fprintf (dump_file, "\n"); ! }
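
As a rough illustration of what the extended recognition relies on,
here is a small C sketch of store loops that are equivalent to a memset
call. The function names are hypothetical, and the all-ones case
assumes a two's complement int, so that every byte of -1 is 0xff.

```c
#include <stddef.h>
#include <string.h>

/* A non-byte-sized store loop whose stored value is all ones.  */
void fill_minus_one_loop(int *a, size_t n)
{
    for (size_t i = 0; i < n; i++)
        a[i] = -1;
}

/* The equivalent memset: each byte of the region becomes 0xff.  */
void fill_minus_one_memset(int *a, size_t n)
{
    memset(a, -1, n * sizeof *a);
}

/* A byte-sized store loop with an arbitrary value, in the style of the
   my_memset routine quoted above; equivalent to memset (d, c, n).  */
void fill_bytes_loop(unsigned char *d, int c, size_t n)
{
    while (n--)
        *d++ = (unsigned char) c;
}
```

The last loop is also the reason builtins.exp needs
-fno-tree-loop-distribute-patterns: turning the body of a memset
implementation into a call to memset would make it call itself.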
Re: [Fortran, DRAFT patch] PR 46321 - [OOP] Polymorphic deallocation
Hi Alessandro,

I am glad to see that Janus is giving you a helping hand, in addition to
Tobias. I am so tied up with every aspect of life that gfortran is not
figuring much at all.

When you clean up the patch, you might consider making this into a
separate function:

+  if (free_proc)
+    {
+      ppc = gfc_copy_expr(free_proc->initializer);
+      ppc_code = gfc_get_code ();
+      ppc_code->resolved_sym = ppc->symtree->n.sym;
+      ppc_code->resolved_sym->attr.elemental = 1;
+      ppc_code->ext.actual = actual;
+      ppc_code->expr1 = ppc;
+      ppc_code->op = EXEC_CALL;
+      tmp = gfc_trans_call (ppc_code, true, NULL, NULL, false);
+      gfc_free_statements (ppc_code);
+      gfc_add_expr_to_block (&block, tmp);
+    }

... and using the function call to replace the corresponding call to
_copy in trans_allocate. I suspect that we are going to do this some
more :-)

Once we have the separate function, we could at a later stage replace it
by a TREE_SSA version.

Cheers

Paul

On 3 June 2012 12:15, Alessandro Fanfarillo wrote:
>> Right, the problem is that the _free component is missing. Just as the
>> _copy component, _free should be present for *every* vtype, no matter
>> if there are allocatable components or not. If the _free component is
>> not needed, it should be initialized to EXPR_NULL.
>
> With an "empty" _free function for every type which does not have
> allocatable components the problem with dynamic_dispatch_4.f03
> disappears :), thank you very much. In the afternoon I'll reorganize
> the code.
>
> Bye.
>
> Alessandro

-- 
The knack of flying is learning how to throw yourself at the ground and miss.
       --Hitchhikers Guide to the Galaxy
GCC 4.7.1 Status Report (2012-06-05), branch frozen
The GCC 4.7 branch is now frozen for creating a first release candidate
of the GCC 4.7.1 release.

All changes need explicit release manager approval until the final
release of GCC 4.7.1, which should happen roughly one week after the
release candidate if no issues show up with it.

Previous Report
===============

http://gcc.gnu.org/ml/gcc/2012-05/msg00394.html
[PATCH][8/7] loop distribution TLC
Well - this replaces passing down a flag pointer with a new member in the partition struct. Bootstrapped and tested on x86_64-unknown-linux-gnu, applied. Richard. 2012-06-05 Richard Guenther * tree-loop-distribution.c (struct partition_s): Add has_writes member. (partition_alloc): Initialize it. (partition_has_writes): New function. (rdg_flag_uses): Adjust. (rdg_flag_vertex): Likewise. (rdg_flag_vertex_and_dependent): Likewise. (rdg_flag_loop_exits): Likewise. (build_rdg_partition_for_component): Likewise. (rdg_build_partitions): Likewise. Index: trunk/gcc/tree-loop-distribution.c === *** trunk.orig/gcc/tree-loop-distribution.c 2012-06-05 11:45:32.0 +0200 --- trunk/gcc/tree-loop-distribution.c 2012-06-05 11:50:38.656074822 +0200 *** enum partition_kind { PKIND_NORMAL, PKIN *** 57,62 --- 57,63 typedef struct partition_s { bitmap stmts; + bool has_writes; enum partition_kind kind; /* Main statement a kind != PKIND_NORMAL partition is about. */ gimple main_stmt; *** partition_alloc (bitmap stmts) *** 72,77 --- 73,79 { partition_t partition = XCNEW (struct partition_s); partition->stmts = stmts ? stmts : BITMAP_ALLOC (NULL); + partition->has_writes = false; partition->kind = PKIND_NORMAL; return partition; } *** partition_builtin_p (partition_t partiti *** 93,98 --- 95,108 return partition->kind != PKIND_NORMAL; } + /* Returns true if the partition has an writes. */ + + static bool + partition_has_writes (partition_t partition) + { + return partition->has_writes; + } + /* If bit I is not set, it means that this node represents an operation that has already been performed, and that should not be performed again. This is the subgraph of remaining important *** has_upstream_mem_writes (int u) *** 583,596 } static void rdg_flag_vertex_and_dependent (struct graph *, int, partition_t, ! bitmap, bitmap, bool *); /* Flag the uses of U stopping following the information from upstream_mem_writes. 
*/ static void rdg_flag_uses (struct graph *rdg, int u, partition_t partition, bitmap loops, ! bitmap processed, bool *part_has_writes) { use_operand_p use_p; struct vertex *x = &(rdg->vertices[u]); --- 593,606 } static void rdg_flag_vertex_and_dependent (struct graph *, int, partition_t, ! bitmap, bitmap); /* Flag the uses of U stopping following the information from upstream_mem_writes. */ static void rdg_flag_uses (struct graph *rdg, int u, partition_t partition, bitmap loops, ! bitmap processed) { use_operand_p use_p; struct vertex *x = &(rdg->vertices[u]); *** rdg_flag_uses (struct graph *rdg, int u, *** 606,612 if (!already_processed_vertex_p (processed, v)) rdg_flag_vertex_and_dependent (rdg, v, partition, loops, ! processed, part_has_writes); } if (gimple_code (stmt) != GIMPLE_PHI) --- 616,622 if (!already_processed_vertex_p (processed, v)) rdg_flag_vertex_and_dependent (rdg, v, partition, loops, ! processed); } if (gimple_code (stmt) != GIMPLE_PHI) *** rdg_flag_uses (struct graph *rdg, int u, *** 623,629 if (v >= 0 && !already_processed_vertex_p (processed, v)) rdg_flag_vertex_and_dependent (rdg, v, partition, loops, ! processed, part_has_writes); } } } --- 633,639 if (v >= 0 && !already_processed_vertex_p (processed, v)) rdg_flag_vertex_and_dependent (rdg, v, partition, loops, ! processed); } } } *** rdg_flag_uses (struct graph *rdg, int u, *** 645,651 if (!already_processed_vertex_p (processed, v)) rdg_flag_vertex_and_dependent (rdg, v, partition, loops, ! processed, part_has_writes); } } } --- 655,661 if (!already_processed_vertex_p (processed, v)) rdg_flag_vertex_and_dependent (rdg, v, partition, loops, ! processed); } } } *** rdg_flag_uses (struct graph *rdg, int u, *** 655,662 in LOOPS. */ static void ! rdg_flag_vertex (struct graph *rdg, int v, partition_t partition, bitmap loops, !bool *part_has_writes) { struct loop *loop; -
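
The shape of this cleanup can be shown with a small standalone C
sketch. The types and names below are hypothetical stand-ins, not the
real tree-loop-distribution.c code: the "old" function threads a flag
pointer through every call, and the "new" one records the same fact in
the partition itself.

```c
#include <stdbool.h>

/* Hypothetical stand-in for struct partition_s.  */
struct partition_sketch {
    int nstmts;
    bool has_writes;
};

/* Before: the caller has to pass a flag pointer down every call chain.  */
void flag_vertex_old(struct partition_sketch *p, bool stmt_writes,
                     bool *part_has_writes)
{
    p->nstmts++;
    if (stmt_writes)
        *part_has_writes = true;
}

/* After: the partition carries the flag, so the extra parameter goes away.  */
void flag_vertex_new(struct partition_sketch *p, bool stmt_writes)
{
    p->nstmts++;
    if (stmt_writes)
        p->has_writes = true;
}

/* Accessor in the style of partition_has_writes.  */
bool partition_sketch_has_writes(const struct partition_sketch *p)
{
    return p->has_writes;
}
```

Keeping the flag in the struct also means the information stays
available for as long as the partition lives, rather than only while
the flagging functions are running.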
Re: PATCH: --with-abi=x32 without --with-multilib-list doesn't work
On Mon, Jun 4, 2012 at 8:09 PM, H.J. Lu wrote:

> We should enable x32 run-time library if --with-abi={x32|mx32} is used
> to configure GCC i[34567]86-*-* and x86_64-*-*. Tested on Linux/x86-64.
> OK for trunk?
>
> 2012-06-04  H.J. Lu
>
>         PR target/53575
>         * config.gcc: Enable x32 run-time library if --with-abi={x32|mx32}
>         is used for i[34567]86-*-* and x86_64-*-*.
>
> diff --git a/gcc/config.gcc b/gcc/config.gcc
> index 61adc89..3f66bd2 100644
> --- a/gcc/config.gcc
> +++ b/gcc/config.gcc
> @@ -1233,7 +1233,14 @@ i[34567]86-*-linux* | i[34567]86-*-kfreebsd*-gnu |
> i[34567]86-*-knetbsd*-gnu | i
>         tmake_file="${tmake_file} i386/t-linux64"
>         x86_multilibs="${with_multilib_list}"
>         if test "$x86_multilibs" = "default"; then
> -               x86_multilibs="m64,m32"
> +               case ${with_abi} in
> +               x32 | mx32)
> +                       x86_multilibs="m64,m32,mx32"

Why all three ABIs here? Didn't user specify -with-abi=mx32 only, so
x86_multilibs="mx32" only here.

Uros.
Merge from gcc-4_7-branch to gccgo branch
I've merged gcc-4_7-branch revision 188231 to the gccgo branch. Ian
[PATCH] Fix part of PR30442
PR30442 shows that we do not vectorize basic-blocks if the to-be vectorized data-references are followed by something that find_data_references_in_stmt does not know how to analyze (any call or asm for example). The following re-organizes how we create data-references in vect_analyze_data_refs for basic-blocks employing the same trick as used later when analyzing them - stop analysis at the stmt that we fail to analyze. Bootstrapped and tested on x86_64-unknown-linux-gnu, applied. Richard. 2012-06-05 Richard Guenther PR tree-optimization/30442 * tree-vect-data-refs.c (vect_analyze_data_refs): For basic-block vectorization stop analysis at the first stmt we cannot compute a data-reference for instead of giving up completely. * gcc.dg/vect/bb-slp-30.c: New testcase. Index: gcc/tree-vect-data-refs.c === *** gcc/tree-vect-data-refs.c (revision 188232) --- gcc/tree-vect-data-refs.c (working copy) *** vect_analyze_data_refs (loop_vec_info lo *** 2844,2854 } else { bb = BB_VINFO_BB (bb_vinfo); ! res = compute_data_dependences_for_bb (bb, true, !&BB_VINFO_DATAREFS (bb_vinfo), !&BB_VINFO_DDRS (bb_vinfo)); ! if (!res) { if (vect_print_dump_info (REPORT_UNVECTORIZED_LOCATIONS)) fprintf (vect_dump, "not vectorized: basic block contains function" --- 2844,2866 } else { + gimple_stmt_iterator gsi; + bb = BB_VINFO_BB (bb_vinfo); ! for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi)) ! { ! gimple stmt = gsi_stmt (gsi); ! if (!find_data_references_in_stmt (NULL, stmt, !&BB_VINFO_DATAREFS (bb_vinfo))) ! { ! /* Mark the rest of the basic-block as unvectorizable. */ ! for (; !gsi_end_p (gsi); gsi_next (&gsi)) ! STMT_VINFO_VECTORIZABLE (vinfo_for_stmt (stmt)) = false; ! break; ! } ! } ! if (!compute_all_dependences (BB_VINFO_DATAREFS (bb_vinfo), ! 
&BB_VINFO_DDRS (bb_vinfo), NULL, true)) { if (vect_print_dump_info (REPORT_UNVECTORIZED_LOCATIONS)) fprintf (vect_dump, "not vectorized: basic block contains function" Index: gcc/testsuite/gcc.dg/vect/bb-slp-30.c === *** gcc/testsuite/gcc.dg/vect/bb-slp-30.c (revision 0) --- gcc/testsuite/gcc.dg/vect/bb-slp-30.c (revision 0) *** *** 0 --- 1,47 + /* { dg-require-effective-target vect_int } */ + + int a[32]; + + void __attribute__((noinline)) + test1(void) + { + a[0] = 1; + a[1] = 1; + a[2] = 1; + a[3] = 1; + a[4] = 1; + a[5] = 1; + a[6] = 1; + a[7] = 1; + a[8] = 1; + a[9] = 1; + a[10] = 1; + a[11] = 1; + a[12] = 1; + a[13] = 1; + a[14] = 1; + a[15] = 1; + a[16] = 1; + a[17] = 1; + a[18] = 1; + a[19] = 1; + a[20] = 1; + a[21] = 1; + a[22] = 1; + a[23] = 1; + a[24] = 1; + a[25] = 1; + a[26] = 1; + a[27] = 1; + a[28] = 1; + a[29] = 1; + a[30] = 1; + a[31] = 1; + asm ("" : : : "memory"); + a[21] = 0; + } + + int main() { test1(); return a[21]; } + + /* { dg-final { scan-tree-dump-times "basic block vectorized using SLP" 1 "slp" } } */ + /* { dg-final { cleanup-tree-dump "slp" } } */
Re: PATCH: --with-abi=x32 without --with-multilib-list doesn't work
On Tue, Jun 5, 2012 at 5:24 AM, Uros Bizjak wrote:
> On Mon, Jun 4, 2012 at 8:09 PM, H.J. Lu wrote:
>
>> We should enable x32 run-time library if --with-abi={x32|mx32} is used
>> to configure GCC i[34567]86-*-* and x86_64-*-*. Tested on Linux/x86-64.
>> OK for trunk?
>>
>> 2012-06-04  H.J. Lu
>>
>>         PR target/53575
>>         * config.gcc: Enable x32 run-time library if --with-abi={x32|mx32}
>>         is used for i[34567]86-*-* and x86_64-*-*.
>>
>> diff --git a/gcc/config.gcc b/gcc/config.gcc
>> index 61adc89..3f66bd2 100644
>> --- a/gcc/config.gcc
>> +++ b/gcc/config.gcc
>> @@ -1233,7 +1233,14 @@ i[34567]86-*-linux* | i[34567]86-*-kfreebsd*-gnu |
>> i[34567]86-*-knetbsd*-gnu | i
>>         tmake_file="${tmake_file} i386/t-linux64"
>>         x86_multilibs="${with_multilib_list}"
>>         if test "$x86_multilibs" = "default"; then
>> -               x86_multilibs="m64,m32"
>> +               case ${with_abi} in
>> +               x32 | mx32)
>> +                       x86_multilibs="m64,m32,mx32"
>
> Why all three ABIs here? Didn't user specify -with-abi=mx32 only, so
> x86_multilibs="mx32" only here.
>

Is this patch OK? Since --with-abi is only used for x86_64-*-*,
we don't need to change i[34567]86-*-*.

Thanks.

-- 
H.J.

2012-06-05  H.J. Lu

        PR target/53575
        * config.gcc: Select x32 run-time library if --with-abi={x32|mx32}
        is used for x86_64-*-*.

diff --git a/gcc/config.gcc b/gcc/config.gcc
index 61adc89..f0ea9c7 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -1299,7 +1299,14 @@ x86_64-*-linux* | x86_64-*-kfreebsd*-gnu | x86_64-*-knetbsd*-gnu)
        tmake_file="${tmake_file} i386/t-linux64"
        x86_multilibs="${with_multilib_list}"
        if test "$x86_multilibs" = "default"; then
-               x86_multilibs="m64,m32"
+               case ${with_abi} in
+               x32 | mx32)
+                       x86_multilibs="mx32"
+                       ;;
+               *)
+                       x86_multilibs="m64,m32"
+                       ;;
+               esac
        fi
        x86_multilibs=`echo $x86_multilibs | sed -e 's/,/ /g'`
        for x86_multilib in ${x86_multilibs}; do
Re: PATCH: --with-abi=x32 without --with-multilib-list doesn't work
On Tue, Jun 5, 2012 at 2:47 PM, H.J. Lu wrote:

>>> We should enable x32 run-time library if --with-abi={x32|mx32} is used
>>> to configure GCC i[34567]86-*-* and x86_64-*-*. Tested on Linux/x86-64.

>> Why all three ABIs here? Didn't user specify -with-abi=mx32 only, so
>> x86_multilibs="mx32" only here.
>>
>
> Is this patch OK? Since --with-abi is only used for x86_64-*-*,
> we don't need to change i[34567]86-*-*.
>
> 2012-06-05  H.J. Lu
>
>         PR target/53575
>         * config.gcc: Select x32 run-time library if --with-abi={x32|mx32}
>         is used for x86_64-*-*.

This looks OK to me.

Thanks,
Uros.
Re: libgo patch committed: Include TLS size in stack size
Jakub Jelinek writes:

> On Mon, Jun 04, 2012 at 11:19:46PM -0700, Ian Lance Taylor wrote:
>> This patch to libgo includes the TLS size in the requested stack size of
>> a new thread, if possible. This relies on the glibc-specific (and
>> undocumented) _dl_get_tls_static_info call. This is particularly
>> necessary when using glibc, because glibc removes the static TLS size
>> from the requested stack space, and gives an error if there is not
>> enough space. That means that a program that has a lot of TLS variables
>> will fail bizarrely, or worse may simply get a stack overflow
>> segmentation violation at runtime. This patch is far from perfect, but
>> at least works around that problem for such programs. Bootstrapped and
>> ran Go testsuite on x86_64-unknown-linux-gnu. Committed to mainline and
>> 4.7 branch.
>
> That is a very bad idea. _dl_get_tls_static_info is @@GLIBC_PRIVATE symbol,
> therefore it must not be used by anything but glibc itself, may change or
> may be removed at any time. Not using of GLIBC_PRIVATE symbols are even
> enforced
> by rpm, so if you build gcc as rpm, it won't install. So especially
> committing that to 4.7 branch is fatal.
>
> Talk to libc-al...@sourceware.org for what should be done instead.

I knew it was nonportable, but I didn't realize that it would break rpm.

Disabled per richi's request, like so. Patch committed to mainline and
4.7 branch.

Ian

diff -r 76c6ab4f8cdd libgo/runtime/proc.c
--- a/libgo/runtime/proc.c	Mon Jun 04 23:18:14 2012 -0700
+++ b/libgo/runtime/proc.c	Tue Jun 05 06:09:41 2012 -0700
@@ -1122,6 +1122,7 @@
 
        stacksize = PTHREAD_STACK_MIN;
 
+#if 0
 #ifdef HAVE__DL_GET_TLS_STATIC_INFO
        {
          /* On GNU/Linux the static TLS size is taken out of
@@ -1142,6 +1143,7 @@
            stacksize += tlssize;
          }
 #endif
+#endif
 
        if(pthread_attr_setstacksize(&attr, stacksize) != 0)
          runtime_throw("pthread_attr_setstacksize");
Re: [trunk] Copy TREE_STATIC() property from id in dwarf2asm.c (issue 6133061)
On Thu, May 31, 2012 at 8:10 PM, wrote:
> Reviewers: xur, davidxl, iant2, Diego Novillo,
>
> Message:
> The relevant bug is this:
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53546.

This bug does not affect trunk. Only the google branches where LIPO is
implemented.

Diego.
Re: [PATCH] Fix part of PR30442
On Tue, Jun 05, 2012 at 02:35:30PM +0200, Richard Guenther wrote:
> Index: gcc/tree-vect-data-refs.c
> !         gimple stmt = gsi_stmt (gsi);
> !         if (!find_data_references_in_stmt (NULL, stmt,
> !                                            &BB_VINFO_DATAREFS (bb_vinfo)))
> !           {
> !             /* Mark the rest of the basic-block as unvectorizable.  */
> !             for (; !gsi_end_p (gsi); gsi_next (&gsi))

I see iteration through the rest of the basic block...

> !               STMT_VINFO_VECTORIZABLE (vinfo_for_stmt (stmt)) = false;

...but I don't see corresponding updates to stmt.

-Nathan
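
The pattern Nathan is pointing at can be shown with a standalone C
sketch. The struct and function here are hypothetical stand-ins for the
GIMPLE iterator code: the point is only that the inner loop has to look
at the current element on every step, not at the one captured before
the loop started.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a statement and its vectorizer info.  */
struct stmt_sketch {
    bool analyzable;    /* could a data reference be computed?  */
    bool vectorizable;  /* may the vectorizer touch this stmt?  */
};

/* Stop at the first statement that cannot be analyzed and mark it and
   everything after it as unvectorizable.  Returns the index of that
   statement, or N if all statements were analyzable.  */
size_t mark_tail_unvectorizable(struct stmt_sketch *stmts, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (!stmts[i].analyzable) {
            /* Index with j here; reusing stmts[i] would mark the same
               statement repeatedly and leave the tail untouched.  */
            for (size_t j = i; j < n; j++)
                stmts[j].vectorizable = false;
            return i;
        }
    }
    return n;
}
```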
Re: [gimplefe] Adding more preliminary tests
On 12-06-04 23:08 , Sandeep Soni wrote:
> Hi,
>
> I have added some more preliminary tests for the gimple front end..
> These are really simple individual statement tests which have helped
> me get a hang of the testing mechanism in gcc. I will be adding a few
> more tests tonight.

Sounds good.

> At this point, make check-gimple passes 7 cases and fails one. I will
> fix up the bug in a later patch tonight.

For now, you can commit the tests that are working. Hold on to the
failing test until you have a patch fixing the failure (this way, your
fix is committed together with the test case).

Diego.
Re: [PATCH] vrp: fold ffs to ctz
On 04/06/2012 11:31, Richard Guenther wrote:
> +      val = compare_range_with_value (NE_EXPR, vr, integer_zero_node,
> &sop);
> +      if (!val || !integer_onep (val))
> +        return false;
>
> please add a value_range_nonzero_p helper alongside value_range_nonnegative_p.
>
> +      fndecl = builtin_decl_implicit (target_builtin_code);
> +      lhs = gimple_call_lhs (stmt);
> +      gcc_assert (TREE_TYPE (TREE_TYPE (fndecl)) == TREE_TYPE (lhs));
>
> eh, if you care to check this please fail instead of ICEing ...

The verifier would fail very soon anyway. I might as well remove the
assertion altogether.

> Do we always have CTZ if we have FFS? Can't there be a target that
> implements FFS as opcode but not CTZ, so you'd slow down things?
> Thus, should the transform be conditional on target support for CTZ
> or no target support for FFS?

Hmm, SH and (some semi-obscure variant of) SPARC. But actually SPARC
should define a clz pattern instead; SH should have a popcount pattern +
a generic trick to expand ctz/ffs in terms of popcount. I'll submit
those before applying this patch.

> Please add a comment to the code as to what transform you are doing
> here.
>
> +      /* Convert argument type.  */
> +      argtype = TREE_VALUE (TYPE_ARG_TYPES (TREE_TYPE (fndecl)));
> +      tem = create_tmp_reg (argtype, NULL);
> +      newop = gimple_build_assign_with_ops (NOP_EXPR, tem, op0, NULL_TREE);
> +      tem = make_ssa_name (tem, newop);
> +      gimple_assign_set_lhs (newop, tem);
> +      gsi_insert_before (gsi, newop, GSI_SAME_STMT);
>
> why is that necessary?

Argument checking for GIMPLE_CALL is almost nonexistent, but I would
like to be nice and create my calls with good arguments.

> Can you at least wrap it inside a
>
>   if (!useless_type_conversion_p (argtype, TREE_TYPE (op0)))
>
> ?

Yes. Thanks for the review!

Paolo
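
The identity that makes this fold possible can be checked with a few
lines of C. This is a sketch using the GCC builtins __builtin_ffs and
__builtin_ctz; the wrapper names are hypothetical, and it is an
assumption that the patch emits exactly ctz plus one. What is certain
is that the identity only holds for a nonzero argument, which is why
the patch consults the value range first.

```c
/* For nonzero X, ffs (X) equals ctz (X) + 1.  The caller must
   guarantee X != 0, because __builtin_ctz (0) is undefined.  */
int ffs_via_ctz(unsigned x)
{
    return __builtin_ctz(x) + 1;
}

/* Reference: ffs returns the 1-based index of the least significant
   set bit, or 0 when no bit is set.  */
int ffs_reference(unsigned x)
{
    return __builtin_ffs((int) x);
}
```

The zero case is the only place where the two differ in definedness:
ffs (0) is 0, while ctz (0) has no defined result.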
Re: [PATCH] Fix part of PR30442
On Tue, 5 Jun 2012, Nathan Froyd wrote: > On Tue, Jun 05, 2012 at 02:35:30PM +0200, Richard Guenther wrote: > > Index: gcc/tree-vect-data-refs.c > > ! gimple stmt = gsi_stmt (gsi); > > ! if (!find_data_references_in_stmt (NULL, stmt, > > !&BB_VINFO_DATAREFS (bb_vinfo))) > > ! { > > ! /* Mark the rest of the basic-block as unvectorizable. */ > > ! for (; !gsi_end_p (gsi); gsi_next (&gsi)) > > I see iteration through the rest of the basic block... > > > ! STMT_VINFO_VECTORIZABLE (vinfo_for_stmt (stmt)) = false; > > ...but I don't see corresponding updates to stmt. Eh ... fix in testing. Richard.
Re: User directed Function Multiversioning via Function Overloading (issue5752064)
On Mon, Jun 4, 2012 at 3:29 PM, Sriraman Tallam wrote: > Bug fixed and new patch attached. > > Patch also available for review at http://codereview.appspot.com/5752064 > I think you should also export __cpu_indicator_init in libgcc_s.so. Also, is this feature C++ only? Can you make it work for C? -- H.J.
Re: [PATCH 2/2] Better system header location detection for built-in macro tokens
On 06/05/2012 12:14 AM, Mike Stump wrote:
> On Jun 4, 2012, at 7:46 PM, Mike Stump wrote:
>> g++.dg/other/warning1.C -std=c++11 (test for warnings, line 10)
>> g++.dg/other/warning1.C -std=c++11 (test for warnings, line 11)
>> g++.dg/other/warning1.C -std=c++11 (test for excess errors)
>> g++.dg/other/warning1.C -std=c++98 (test for warnings, line 10)
>> g++.dg/other/warning1.C -std=c++98 (test for warnings, line 11)
>> g++.dg/other/warning1.C -std=c++98 (test for excess errors)
> So, this one is not obvious. The testcase checks for warning, and the compiler generates an error. Could a C++ maintainer weigh in on it?
The errors are correct; indeed, we have always given an error for this testcase with -pedantic-errors. It does seem like a diagnostic quality regression that we no longer complain specifically about the division by 0. That regression seems to have happened in 4.3. Since this PR was specifically about the formatting of 1.0f, that should have had its own dg-warning line instead of being lumped in with the initialization error.
Jason
Re: [C++ Patch] PR 53567
On 06/05/2012 07:00 AM, Paolo Carlini wrote:
> (construct_virtual_base): Adjust LOOKUP_COMPLAIN -> LOOKUP_NORMAL.
This and the similar changes elsewhere seem dangerous; they're adding LOOKUP_PROTECT that wasn't there before. Instead, let's replace LOOKUP_COMPLAIN with 0 or some macro defined to 0. I would also keep the LOOKUP_PROTECT macro rather than replace its uses with LOOKUP_NORMAL.
Jason
Re: [PATCH] Fix part of PR30442
On Tue, 5 Jun 2012, Richard Guenther wrote:
> On Tue, 5 Jun 2012, Nathan Froyd wrote:
>
> > On Tue, Jun 05, 2012 at 02:35:30PM +0200, Richard Guenther wrote:
> > > Index: gcc/tree-vect-data-refs.c
> > > ! gimple stmt = gsi_stmt (gsi);
> > > ! if (!find_data_references_in_stmt (NULL, stmt,
> > > ! &BB_VINFO_DATAREFS
> > > (bb_vinfo)))
> > > ! {
> > > ! /* Mark the rest of the basic-block as unvectorizable. */
> > > ! for (; !gsi_end_p (gsi); gsi_next (&gsi))
> >
> > I see iteration through the rest of the basic block...
> >
> > > ! STMT_VINFO_VECTORIZABLE (vinfo_for_stmt (stmt)) = false;
> >
> > ...but I don't see corresponding updates to stmt.
>
> Eh ...
>
> fix in testing.

Tested on x86_64-unknown-linux-gnu, applied.

Richard.

2012-06-05 Richard Guenther

  * tree-vect-data-refs.c (vect_analyze_data_refs): Fix last change.

Index: gcc/tree-vect-data-refs.c
===
--- gcc/tree-vect-data-refs.c (revision 188235)
+++ gcc/tree-vect-data-refs.c (working copy)
@@ -2855,7 +2855,10 @@ vect_analyze_data_refs (loop_vec_info lo
       {
         /* Mark the rest of the basic-block as unvectorizable. */
         for (; !gsi_end_p (gsi); gsi_next (&gsi))
-          STMT_VINFO_VECTORIZABLE (vinfo_for_stmt (stmt)) = false;
+          {
+            stmt = gsi_stmt (gsi);
+            STMT_VINFO_VECTORIZABLE (vinfo_for_stmt (stmt)) = false;
+          }
         break;
       }
     }
Re: [line-map] simple oneliner that speeds up track-macro-expansion
Applied, thanks. Jason
[PATCH][RFC] Recognize memcpy/memmove - fix PR53081
This adds memcpy/memmove recognition to loop distribution (and cleans it up some more). Issues are similar to memset and not handled (and I just noticed we generate memset/memcpy even with -fno-builtin ...). Bootstrapped on x86_64-unknown-linux-gnu, testing in progress. Richard. 2012-06-05 Richard Guenther PR tree-optimization/53081 * tree-data-ref.h (adjacent_store_dr_p): Rename to ... (adjacent_dr_p): ... this and make it work for reads, too. * tree-loop-distribution.c (enum partition_kind): Add PKIND_MEMCPY. (struct partition_s): Change main_stmt to main_dr, add secondary_dr member. (build_size_arg_loc): Change to date data-reference and not gimplify here. (build_addr_arg_loc): New function split out from ... (generate_memset_builtin): ... here. Use it and simplify. (generate_memcpy_builtin): New function. (generate_code_for_partition): Adjust. (classify_partition): Streamline pattern detection. Detect memcpy. (ldist_gen): Adjust. (tree_loop_distribution): Adjust seed statements for memcpy recognition. * gcc.dg/tree-ssa/ldist-20.c: New testcase. Index: gcc/tree-data-ref.h === *** gcc/tree-data-ref.h (revision 188233) --- gcc/tree-data-ref.h (working copy) *** bool rdg_defs_used_in_other_loops_p (str *** 615,625 with a stride equal to its unit type size. */ static inline bool ! adjacent_store_dr_p (struct data_reference *dr) { - if (!DR_IS_WRITE (dr)) - return false; - /* If this is a bitfield store bail out. */ if (TREE_CODE (DR_REF (dr)) == COMPONENT_REF && DECL_BIT_FIELD (TREE_OPERAND (DR_REF (dr), 1))) --- 615,622 with a stride equal to its unit type size. */ static inline bool ! adjacent_dr_p (struct data_reference *dr) { /* If this is a bitfield store bail out. */ if (TREE_CODE (DR_REF (dr)) == COMPONENT_REF && DECL_BIT_FIELD (TREE_OPERAND (DR_REF (dr), 1))) Index: gcc/tree-loop-distribution.c === *** gcc/tree-loop-distribution.c(revision 188233) --- gcc/tree-loop-distribution.c(working copy) *** along with GCC; see the file COPYING3. 
*** 52,66 #include "tree-scalar-evolution.h" #include "tree-pass.h" ! enum partition_kind { PKIND_NORMAL, PKIND_MEMSET }; typedef struct partition_s { bitmap stmts; bool has_writes; enum partition_kind kind; ! /* Main statement a kind != PKIND_NORMAL partition is about. */ ! gimple main_stmt; } *partition_t; DEF_VEC_P (partition_t); --- 52,67 #include "tree-scalar-evolution.h" #include "tree-pass.h" ! enum partition_kind { PKIND_NORMAL, PKIND_MEMSET, PKIND_MEMCPY }; typedef struct partition_s { bitmap stmts; bool has_writes; enum partition_kind kind; ! /* data-references a kind != PKIND_NORMAL partition is about. */ ! data_reference_p main_dr; ! data_reference_p secondary_dr; } *partition_t; DEF_VEC_P (partition_t); *** generate_loops_for_partition (struct loo *** 313,352 free (bbs); } ! /* Build the size argument for a memset call. */ ! static inline tree ! build_size_arg_loc (location_t loc, tree nb_iter, tree op, ! gimple_seq *stmt_list) ! { ! gimple_seq stmts; ! tree x = fold_build2_loc (loc, MULT_EXPR, size_type_node, ! fold_convert_loc (loc, size_type_node, nb_iter), ! fold_convert_loc (loc, size_type_node, ! TYPE_SIZE_UNIT (TREE_TYPE (op; ! x = force_gimple_operand (x, &stmts, true, NULL); ! gimple_seq_add_seq (stmt_list, stmts); ! return x; } /* Generate a call to memset for PARTITION in LOOP. */ static void ! generate_memset_builtin (struct loop *loop, struct graph *rdg, !partition_t partition) { gimple_stmt_iterator gsi; gimple stmt, fn_call; ! tree op0, nb_iter, mem, fn, addr_base, nb_bytes; ! gimple_seq stmt_list = NULL, stmts; ! struct data_reference *dr = XCNEW (struct data_reference); location_t loc; tree val; ! stmt = partition->main_stmt; loc = gimple_location (stmt); - op0 = gimple_assign_lhs (stmt); if (gimple_bb (stmt) == loop->latch) nb_iter = number_of_latch_executions (loop); else --- 314,366 free (bbs); } ! /* Build the size argument for a memory operation call. */ ! static tree ! 
build_size_arg_loc (location_t loc, data_reference_p dr, tree nb_iter) ! { ! tree size; ! size = fold_build2_loc (loc, MULT_EXPR, sizetype, ! fold_convert_loc (loc, sizetype, nb_iter), ! TYPE_SIZE_UNIT (TREE_TYPE (DR_REF (dr; ! return fold_convert_loc (loc, size_type_nod
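As an aside for readers of the archive, the kind of loop this patch is meant to recognize can be sketched in plain C. The function names copy_loop and copy_call below are hypothetical and only illustrate the equivalence; the byte count mirrors what build_size_arg_loc computes in the patch, namely the iteration count times the unit size of the referenced type. The restrict qualifiers stand in for the non-overlap condition a memcpy call needs.

```c
#include <stddef.h>
#include <string.h>

/* Element-by-element copy with unit stride: the loop shape that
   a memcpy pattern matcher would look for.  */
void
copy_loop (int *restrict dst, const int *restrict src, size_t n)
{
  size_t i;

  for (i = 0; i < n; i++)
    dst[i] = src[i];
}

/* The same effect as one library call.  The size argument is
   (number of iterations) * (size of one element).  */
void
copy_call (int *restrict dst, const int *restrict src, size_t n)
{
  memcpy (dst, src, n * sizeof *dst);
}
```

When source and destination may overlap, memmove is the safe replacement, which is presumably why the subject line mentions both.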
Re: No documentation of -fsched-pressure-algorithm
Richard Sandiford wrote: > Ian Lance Taylor writes: > > Richard Sandiford writes: > >> gcc/ > >>* doc/invoke.texi (sched-pressure-algorithm): Document new --param. > >>* common.opt (fsched-pressure-algorithm=): Remove. > >>* flag-types.h (sched_pressure_algorithm): Move to... > >>* sched-int.h (sched_pressure_algorithm): ...here. > >>* params.def (sched-pressure-algorithm): New param. > >>* haifa-sched.c (sched_init): Use it to initialize sched_pressure. > > > > This is OK. > > Thanks. It's taken me too long to update the s390 bits too, but finally > got round to it today. Tested by building s390x-linux-gnu still builds, > uses the new -fsched-pressure algorithm by default, but can be told to > use the old one using --param. Hmmm. Specifying the algorithm by number instead of symbolic name doesn't look like much of an improvement to me :-) But if the consensus is to go that way, the s390 bits are certainly OK with me. Thanks, Ulrich -- Dr. Ulrich Weigand GNU Toolchain for Linux on System z and Cell BE ulrich.weig...@de.ibm.com
Re: arm-rtems switch default target to EABI
Hi, what is the blocking point for the integration of these patches? -- Sebastian Huber, embedded brains GmbH Address : Obere Lagerstr. 30, D-82178 Puchheim, Germany Phone : +49 89 18 90 80 79-6 Fax : +49 89 18 90 80 79-9 E-Mail : sebastian.hu...@embedded-brains.de PGP : Public key available on request. This message is not a business communication within the meaning of the EHUG.
Re: [C++] Return error_mark_node from cp_parser_constant_expression
On 06/04/2012 10:46 PM, Jason Merrill wrote: On 06/04/2012 04:12 PM, Florian Weimer wrote: This doesn't make sense to me. parser->integral_constant_expression_p should always be true at this point if you're moving the restore later (which also seems unnecessary). I think parser->integral_constant_expression_p reflects the result of the cp_parser_assignment_expression() call earlier in this function. parser->integral_constant_expression_p is set to indicate that the current context expects an integral constant expression; the call to cp_parser_assignment_expression will not affect that. If the expression turns out not to be a valid constant expression, then we set parser->non_integral_constant_expression_p to true, but we don't touch parser->integral_constant_expression_p. Okay, now all this makes sense. This is a bit difficult to figure out just reading the comments in cp/parser.h. I think the condition you want is if (parser->non_integral_constant_expression_p && !allow_non_constant_p) True. But if we want cascading errors, cp_parser_constant_expression really cannot return error_mark_node, so this approach is a dead end. (For example, build_enumerator replaces error_mark_node in the enumeration value with nothing, i.e., the next possible enumeration value.) So this approach is a dead end. On the other hand, if cascading errors are acceptable, I probably should not worry too much about them in operator new, either. 8-) -- Florian Weimer / Red Hat Product Security Team
Re: arm-rtems switch default target to EABI
On 05/14/2012 08:51 PM, Joseph S. Myers wrote:
> On Mon, 14 May 2012, Joel Sherrill wrote:
>> There is a long explanation in the PR but the short version is that although we fully intended to switch the arm-rtems target from ELF to EABI we never intended the target name "arm-*-rtemseabi*" to become the preferred arm RTEMS target name. We want all RTEMS target names to be of the form -rtems. Unfortunately, we screwed up and arm-rtems is marked as deprecated in 4.7.
> Note that various testcases test for target arm*-*-*eabi* (or some similar form that would catch arm-*-rtemseabi* but not arm-rtems). It would be a good idea to move them to using the arm_eabi effective-target. (Some of those tests also explicitly list arm*-*-symbianelf*, an existing EABI target not matching the arm*-*-*eabi* pattern; some do not.)
Thanks for the pointer. Do you suggest to change:
/* { dg-do run { target arm*-*-symbianelf* arm*-*-eabi* } } */
/* { dg-do run { target arm*-*-*eabi* } } */
into:
/* { dg-do run { target arm_eabi } } */
? Or is it:
/* { dg-require-effective-target arm_eabi } */
What is the difference between the last two? Who sets "arm_eabi"?
-- Sebastian Huber, embedded brains GmbH Address : Obere Lagerstr. 30, D-82178 Puchheim, Germany Phone : +49 89 18 90 80 79-6 Fax : +49 89 18 90 80 79-9 E-Mail : sebastian.hu...@embedded-brains.de PGP : Public key available on request. This message is not a business communication within the meaning of the EHUG.
Re: [C++ Patch] PR 53567
Hi, On 05/Jun/2012, at 16:16, Jason Merrill wrote: > On 06/05/2012 07:00 AM, Paolo Carlini wrote: >>(construct_virtual_base): Adjust LOOKUP_COMPLAIN -> LOOKUP_NORMAL. > > This and the similar changes elsewhere seem dangerous; they're adding > LOOKUP_PROTECT that wasn't there before. Instead, let's replace > LOOKUP_COMPLAIN with 0 or some macro defined to 0. Indeed, sorry about that, somehow I got confused. > I would also keep the LOOKUP_PROTECT macro rather than replace its uses with > LOOKUP_NORMAL. To be sure: NORMAL used to be just PROTECT | COMPLAIN, thus it's just about names, right? You mean, we do away with the NORMAL name, you mean? (Indeed, I had a moment of hesitation about this, when I noticed that the comment now preceding NORMAL is the one which used to precede PROTECT.) I'll send an updated patch later today. Thanks! Paolo
[PATCH COMMITTED] Fix date in ChangeLog
I committed the following patch to fix the dates on my own ChangeLog entries Thanks Edmar Index: gcc/ChangeLog === --- gcc/ChangeLog (revision 188244) +++ gcc/ChangeLog (working copy) @@ -1,4 +1,4 @@ -2012-06-01 Edmar Wienskoski +2012-06-05 Edmar Wienskoski * config/rs6000/e5500.md: New file. * config/rs6000/e6500.md: New file. Index: gcc/testsuite/ChangeLog === --- gcc/testsuite/ChangeLog (revision 188244) +++ gcc/testsuite/ChangeLog (working copy) @@ -1,4 +1,4 @@ -2012-06-01 Edmar Wienskoski +2012-06-05 Edmar Wienskoski * gcc.dg/tree-ssa/vector-3.c: Adjust regular expression.
[PATCH][Cilkplus] Patch to fix array notation bug
Hello Everyone, This patch is for the Cilkplus branch affecting the C++ compiler. This patch will fix some cases where array notation is not decomposed correctly in the parser. Thanking You, Yours Sincerely, Balaji V. Iyer. Index: gcc/cp/parser.c === --- gcc/cp/parser.c (revision 188243) +++ gcc/cp/parser.c (working copy) @@ -18045,6 +18045,11 @@ cp_parser_function_body (parser, in_function_try_block); if (check_body_p) check_constexpr_ctor_body (last, list); + + if (flag_enable_cilk) +if (contains_array_notation_expr (body)) + body = fix_array_notation_exprs (body); + /* Finish the function body. */ finish_function_body (body); Index: gcc/cp/ChangeLog.cilk === --- gcc/cp/ChangeLog.cilk (revision 188243) +++ gcc/cp/ChangeLog.cilk (working copy) @@ -1,3 +1,8 @@ +2012-06-05 Balaji V. Iyer + + * parser.c (cp_parser_ctor_initializer_opt_and_function_body): Added a + check for array notation expressions. If so, then decompose them. + 2012-06-04 Balaji V. Iyer * cilk.c (cp_make_cilk_frame): Removed adding body to the orig body's
Re: [C++ Patch] PR 53567
On 06/05/2012 11:29 AM, Paolo Carlini wrote:
> To be sure: NORMAL used to be just PROTECT | COMPLAIN, thus it's just about names, right?
Yes.
> You mean, we do away with the NORMAL name, you mean?
We could, but I think it's fine to have it as an alias for LOOKUP_PROTECT; the LOOKUP_NORMAL name implies that we're doing a normal name lookup, whereas LOOKUP_PROTECT is what that implies.
Jason
[RFC, ivopts] fix bugs in ivopts address cost computation
My colleagues and I have been working on the GCC port for the Qualcomm Hexagon. Along the way I noticed that we were getting poor results from the ivopts pass no matter how we adjusted the target-specific RTX costs. In many cases ivopts was coming up with candidate use costs that seemed completely inconsistent with the target cost model. On further inspection, I found what appears to be a whole bunch of bugs in the way ivopts is computing address costs: (1) While the address cost computation is assuming in some situations that pre/post increment/decrement addressing will be used if supported by the target, it isn't actually using the target's address cost for such forms -- instead, just the cost of the form that would be used if autoinc weren't available/applicable. (2) The computation to determine which multiplier values are supported by target addressing modes is constructing an address rtx of the form (reg * ratio) to do the tests. This isn't a valid address RTX on Hexagon, although both (reg + reg * ratio) and (sym + reg * ratio) are. Because it's choosing the wrong address form to probe with, it thinks that the target doesn't support multipliers at all and is incorrectly tacking on an extra cost for them. I also note that it's assuming that the same set of ratios are supported by all three address forms that can potentially include them, and that all valid ratios have the same cost. (3) The computation to determine the range of valid constant offsets for address forms that can include them is probing the upper end of the range using constants of the form ((1< gcc/ * tree-ssa-loop-ivopts.c (comp_cost): Make complexity field signed. Update comments to indicate this is for addressing mode complexity. (new_cost): Make signedness of parameters match comp_cost fields. (compare_costs): Prefer higher complexity, not lower, per documentation of TARGET_ADDRESS_COST. 
(multiplier_allowed_in_address_p): Use (+ (* reg1 ratio) reg2) to probe for valid ratios, rather than just (* reg1 ratio). (get_address_cost): Rewrite to eliminate precomputation and caching. Use target's address cost for autoinc forms if possible. Only attempt sym_present -> var_present cost conversion if the sym_present form is not legitimate; amortize setup cost over loop iterations. Adjust complexity computation. (get_computation_cost_at): Adjust call to get_address_cost. Do not mess with complexity for non-address expressions. (determine_use_iv_cost_address): Initialize can_autoinc. (autoinc_possible_for_pair): Likewise. Index: gcc/tree-ssa-loop-ivopts.c === --- gcc/tree-ssa-loop-ivopts.c (revision 188110) +++ gcc/tree-ssa-loop-ivopts.c (working copy) @@ -157,10 +157,10 @@ enum use_type typedef struct { int cost; /* The runtime cost. */ - unsigned complexity; /* The estimate of the complexity of the code for - the computation (in no concrete units -- - complexity field should be larger for more - complex expressions and addressing modes). */ + int complexity; /* The estimate of the complexity of the code for + addressing modes in the computation (in no + concrete units -- complexity field should be + larger for more complex addressing modes). */ } comp_cost; static const comp_cost zero_cost = {0, 0}; @@ -2621,7 +2621,7 @@ alloc_use_cost_map (struct ivopts_data * cost is RUNTIME and complexity corresponds to COMPLEXITY. 
*/ static comp_cost -new_cost (unsigned runtime, unsigned complexity) +new_cost (int runtime, int complexity) { comp_cost cost; @@ -2659,7 +2659,7 @@ static int compare_costs (comp_cost cost1, comp_cost cost2) { if (cost1.cost == cost2.cost) -return cost1.complexity - cost2.complexity; +return cost2.complexity - cost1.complexity; return cost1.cost - cost2.cost; } @@ -3182,15 +3182,23 @@ multiplier_allowed_in_address_p (HOST_WI { enum machine_mode address_mode = targetm.addr_space.address_mode (as); rtx reg1 = gen_raw_REG (address_mode, LAST_VIRTUAL_REGISTER + 1); - rtx addr; + rtx reg2 = gen_raw_REG (address_mode, LAST_VIRTUAL_REGISTER + 2); + rtx addr, mult; HOST_WIDE_INT i; valid_mult = sbitmap_alloc (2 * MAX_RATIO + 1); sbitmap_zero (valid_mult); - addr = gen_rtx_fmt_ee (MULT, address_mode, reg1, NULL_RTX); + /* Use an expression of the form (PLUS (MULT reg1 constant) reg2) + to test the validity of various constants. Plain (MULT reg1 constant) + is less likely to be a valid address RTX on many targets, and + probably (PLUS (MULT reg1 constant) symbol) likewise. Note that + we're assuming the set of valid multipliers is the same for any/all + of the three address RTX forms that allow them. */ + mult = gen_rtx_fmt_ee (MULT, address_mode, reg1, NULL_RTX); + ad
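For readers of the archive, the proposed tie-break in compare_costs can be shown in isolation. The sketch below is a standalone approximation, not the GCC source: it reuses the comp_cost field names from the patch, and assumes (as the ChangeLog entry states) that among equally cheap candidates the more complex addressing mode should win. A negative return value means the first argument is preferred.

```c
/* Standalone stand-in for the ivopts cost pair.  Both fields are
   signed here, as in the patch, so the subtractions below behave.  */
typedef struct
{
  int cost;        /* The runtime cost.  */
  int complexity;  /* Addressing mode complexity, no concrete units.  */
} comp_cost;

/* Return < 0 if COST1 is preferred, > 0 if COST2 is preferred and 0
   if they are equivalent.  Lower runtime cost wins; on a tie the
   higher complexity wins.  */
static int
compare_costs (comp_cost cost1, comp_cost cost2)
{
  if (cost1.cost == cost2.cost)
    return cost2.complexity - cost1.complexity;

  return cost1.cost - cost2.cost;
}
```

With this ordering, two address forms of equal cost no longer resolve to the simpler one, which is the behaviour change the message argues for.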
Re: [C++ Patch] PR 53567
Hi, > We could, but I think it's fine to have it as an alias for LOOKUP_PROTECT; > the LOOKUP_NORMAL name implies that we're doing a normal name lookup, whereas > LOOKUP_PROTECT is what that implies. Believe it or not, yesterday for a few minutes I had it exactly as an alias. Ok, I'll do that. Paolo
Re: [patch] Fix PR ada/52362
On Fri, May 18, 2012 at 5:05 PM, Eric Botcazou wrote: > * configure.ac (HAVE_GNU_LD): Move to after config.gcc inclusion. > (HAVE_GNU_AS): Likewise. > * config.in: Regenerate. > * configure: Likewise. Eric, This change is breaking bootstraps in one of the google branches because it does not allow us to control the presence of GNU as with the --with-gnu-as= configure flag. When doing native bootstraps, we need to set --with-gnu-as=no because binutils 'as' does not handle a flag that we pass to our own version of 'as'. This is a combination of problems, actually: 1- The local patch we have that enables the new flag is keyed on HAVE_GNU_AS. Easwaran, I think this is a patch of yours from Apr 2011: 2011-04-20 Easwaran Raman * gcc.c (asm_options): Pass --save-temps to assembler. The change should not assume that HAVE_GNU_AS means that we can use --save-temps. You should use a new configuration flag for it. 2- Eric, your patch essentially disables the --with-gnu-as= flag. When doing a native bootstrap on the 4.7 branch, HAVE_GNU_AS is set to 1, regardless of the value of --with-gnu-as. The problem is that your new test to decide whether to use gas just uses the value set in config.gcc. Easwaran, I will revert Eric's patch from our 4.7 branch. Eric, shouldn't the whole section testing for GNU as move after the inclusion of config.gcc? Thanks. Diego.
Re: [C++] Return error_mark_node from cp_parser_constant_expression
On 06/05/2012 11:19 AM, Florian Weimer wrote:
> True. But if we want cascading errors, cp_parser_constant_expression really cannot return error_mark_node, so this approach is a dead end. (For example, build_enumerator replaces error_mark_node in the enumeration value with nothing, i.e., the next possible enumeration value.) So this approach is a dead end. On the other hand, if cascading errors are acceptable, I probably should not worry too much about them in operator new, either. 8-)
It really depends. The error messages I was talking about before add more context information to the previous error, which is good. Better would be to have set a constant expression context somewhere so that we can cover both the context and the violation in one error message. In the operator new case, the non-constant expression error followed by the VLA warning is not as helpful, as the latter ignores the former. Perhaps the right way to deal with this is to allow non-constant expressions in the new-type-id, since we allow them in a regular type-id.
Jason
Re: [C++] Return error_mark_node from cp_parser_constant_expression
On Tue, Jun 5, 2012 at 12:05 PM, Jason Merrill wrote: > In the operator new case, the non-constant expression error followed by the > VLA warning is not as helpful, as the latter ignores the former. Perhaps the > right way to deal with this is to allow non-constant expressions in the > new-type-id, since we allow them in a regular type-id. and afterward issue the diagnostics. That sounds good to me :-) -- Gaby
Re: [PATCH][C++] Fix PR52841
OK. Jason
Re: [google] New fdo summary-based icache sensitive unrolling (issue 6282045)
http://codereview.appspot.com/6282045/diff/1/gcc/gcov-io.h File gcc/gcov-io.h (right): http://codereview.appspot.com/6282045/diff/1/gcc/gcov-io.h#newcode544 gcc/gcov-io.h:544: gcov_unsigned_t sum_cutoff_percent;/* sum_all cutoff percentage computed Is there a need to record this? http://codereview.appspot.com/6282045/diff/1/gcc/gcov-io.h#newcode546 gcc/gcov-io.h:546: gcov_unsigned_t num_to_cutoff;/* number of counters to reach above cutoff. */ This should have a better name -- e.g., hot_arc_count. http://codereview.appspot.com/6282045/diff/1/gcc/loop-unroll.c File gcc/loop-unroll.c (right): http://codereview.appspot.com/6282045/diff/1/gcc/loop-unroll.c#newcode196 gcc/loop-unroll.c:196: or peeled loop. */ Add more documentation here: 1) for program size < threshold, do not limit 2) for threshold < psize < 2 * threshold, tame the max allowed peels/unrolls according to hotness; 3) for huge footprint programs, disable it (by ...). http://codereview.appspot.com/6282045/diff/1/gcc/loop-unroll.c#newcode197 gcc/loop-unroll.c:197: static limit_type Blank line. http://codereview.appspot.com/6282045/diff/1/gcc/loop-unroll.c#newcode233 gcc/loop-unroll.c:233: gcc_assert(profile_info->sum_all > 0); Do not assert on profile data -- either bail or emit info. http://codereview.appspot.com/6282045/diff/1/gcc/loop-unroll.c#newcode237 gcc/loop-unroll.c:237: if (profile_info->num_to_cutoff < size_threshold*2) { space http://codereview.appspot.com/6282045/diff/1/gcc/loop-unroll.c#newcode238 gcc/loop-unroll.c:238: /* For appliations that are less than twice the codesize limit, allow applications http://codereview.appspot.com/6282045/diff/1/gcc/loop-unroll.c#newcode532 gcc/loop-unroll.c:532: limit = limit_code_size(loop, &codesize_divisor); space. http://codereview.appspot.com/6282045/diff/1/gcc/loop-unroll.c#newcode1095 gcc/loop-unroll.c:1095: int codesize_divisor) This parameter is not documented. The name of the parameter is also not ideal. Is it possible to not change the interfaces? 
-- i.e., split limit_code_size into two helper functions, one to get the suppress flag as before, and the other to get the limit_factor, which is called inside each function 'decide_unroll_runtime...' -- it is cleaner and easier to understand that way. http://codereview.appspot.com/6282045/diff/1/libgcc/libgcov.c File libgcc/libgcov.c (right): http://codereview.appspot.com/6282045/diff/1/libgcc/libgcov.c#newcode832 libgcc/libgcov.c:832: #define CUM_CUTOFF_PERCENT_TIMES_10 999 Ok now but it should be controllable at compile time with a --param -- recorded in the binary; http://codereview.appspot.com/6282045/diff/1/libgcc/libgcov.c#newcode839 libgcc/libgcov.c:839: for (t_ix = 0; t_ix < GCOV_COUNTERS_SUMMABLE; t_ix++) There does not seem to be a need for a loop -- only t_ix == GCOV_COUNTER_ARCS is summable. http://codereview.appspot.com/6282045/diff/1/libgcc/libgcov.c#newcode848 libgcc/libgcov.c:848: cum_cutoff = (cs_ptr->sum_all * cutoff_perc)/1000; Overflow possibility? http://codereview.appspot.com/6282045/diff/1/libgcc/libgcov.c#newcode854 libgcc/libgcov.c:854: value_array = (gcov_type *) malloc (sizeof(gcov_type)*cs_ptr->num); space http://codereview.appspot.com/6282045/diff/1/libgcc/libgcov.c#newcode860 libgcc/libgcov.c:860: for (i = 0, ctr_info_ix = 0; i < t_ix; i++) No need for this -- the index for ARC counter is always 0 (add assert) -- so either skip (use merge function for 47 and mask for 46) or use 0. http://codereview.appspot.com/6282045/diff/1/libgcc/libgcov.c#newcode864 libgcc/libgcov.c:864: } in gcc_46 and before, the counters may not be allocated, and they should not be directly accessed using t_ix. 
It needs to be guarded with if ((1 << t_ix) & gi_ptr->ctr_mask) http://codereview.appspot.com/6282045/diff/1/libgcc/libgcov.c#newcode875 libgcc/libgcov.c:875: gcc_assert (index + ci_ptr->num <= cs_ptr->num); Need to relax this a little by skipping -- profiling dumping is known to be 'flaky' http://codereview.appspot.com/6282045/diff/1/libgcc/libgcov.c#newcode1103 libgcc/libgcov.c:1103: cs_prg->sum_max += cs_tprg->run_max; For multiple runs, how should num_to_cutoff be merged? Pick the larger value? http://codereview.appspot.com/6282045/
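On the overflow question raised above for the cum_cutoff computation: sum_all * cutoff_perc can exceed the range of a 64-bit counter when sum_all is large, even though the final result fits. One possible way around it, shown here only as a sketch, is to split the sum into its quotient and remainder by 1000 before multiplying. The helper name permille_of is hypothetical, int64_t stands in for gcov_type, and the code assumes a non-negative sum and a cutoff of at most 1000.

```c
#include <assert.h>
#include <stdint.h>

/* Return floor (SUM * PERMILLE / 1000) without forming the product
   SUM * PERMILLE.  Writing SUM = q * 1000 + r with 0 <= r < 1000 gives
   SUM * PERMILLE / 1000 = q * PERMILLE + r * PERMILLE / 1000, and
   neither term can exceed SUM when PERMILLE <= 1000.  */
static int64_t
permille_of (int64_t sum, unsigned int permille)
{
  int64_t q, r;

  assert (sum >= 0);
  assert (permille <= 1000);

  q = sum / 1000;
  r = sum % 1000;
  return q * permille + (r * permille) / 1000;
}
```

For small inputs this agrees with the direct formula, for example 1999 at 999 per mille gives 1997 either way.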
[PATCH, testsuite] Fix gcc.target/powerpc/lhs-1.c for 32-bit
The following fixes a problem with my recently added testcase that resulted in failure for 32-bit since instructions to stack a frame reduced the number of nop's that were needed to force the load into a separate dispatch group. Tested on powerpc64-linux, committed as obvious. -Pat testsuite/ChangeLog: 2012-06-05 Pat Haugen * gcc.target/powerpc/lhs-1.c: Use parm instead of stack space. Index: gcc/testsuite/gcc.target/powerpc/lhs-1.c === --- gcc/testsuite/gcc.target/powerpc/lhs-1.c(revision 188208) +++ gcc/testsuite/gcc.target/powerpc/lhs-1.c(working copy) @@ -13,10 +13,9 @@ typedef union { }; } words; -unsigned int f (double d) +unsigned int f (double d, words *u) { - words u; - u.val = d; - return u.w2; + u->val = d; + return u->w2; }
Re: [wwwdocs] Make codingconventions.html pass W3 validator.
Hi Lawrence, On Mon, 4 Jun 2012, Lawrence Crowl wrote: > The following source change enables codingconventions.html to > pass the HTML validator at validator.w3.org. The web pages will be preprocessed before they are put on the server (this transparently happens upon checkin) and as part of that http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd";> > are going to be prepended to any page, plus the standard CSS reference and footer are added, for example. Since you ran into this, I would like to document this better. Would http://gcc.gnu.org/projects/web.html be a good place, or do you have a different suggestion? Gerald
[patch][cris] Clean up some cris-aout remnants
Hello, This patch just cleans up some remaining code for the removed cris-aout subtarget by folding away code that was conditional on TARGET_ELF. Tested with an x86_64-linux X cris-elf cross-compiler. OK for trunk? Ciao! Steven cris_small_cleanups.diff
Re: [PR49888, VTA] don't keep VALUEs bound to modified MEMs
On May 23, 2012, Jakub Jelinek wrote: > On Wed, May 23, 2012 at 06:27:21AM -0300, Alexandre Oliva wrote: >> + for (loc = var->var_part[0].loc_chain; loc; loc = loc->next) >> +if (GET_CODE (loc->loc) == MEM >> +&& !nonoverlapping_memrefs_p (loc->loc, mloc, false)) > Isn't nonoverlapping_memrefs_p predicate too conservative? > cselib.c uses canon_true_dependence to decide what should be invalidated. Yeah, I guess that should do. I've finally managed to analyze the effects of using canon_true_dependence on debug info, and it does seem to produce reasonable results: before after = entry value count = 33005 35700 i686/cc1plus 99116 i686/libgcc_s.so 216239 i686/libstdc++.so 99902 104390 amd64/cc1plus 308319 amd64/libgcc_s.so 8819 8864 amd64/libstdc++.so = call site value count = 519877 514591 i686/cc1plus 406403 i686/libgcc_s.so 10284 10121 i686/libstdc++.so 518401 518220 amd64/cc1plus 341341 amd64/libgcc_s.so 11309 11261 amd64/libstdc++.so loc info coverage (before | after) i686/cc1plus: cov%samples cumul cov%samples cumul 0.0 155345/28% 155345/28%| 0.0 155561/29% 155561/29% 0..56606/1% 161951/30%| 0..57969/1% 163530/30% 6..10 5984/1% 167935/31%| 6..10 6336/1% 169866/31% 11..15 4823/0% 172758/32%| 11..15 5117/0% 174983/32% 16..20 6262/1% 179020/33%| 16..20 6677/1% 181660/33% 21..25 5259/0% 184279/34%| 21..25 5688/1% 187348/34% 26..30 5149/0% 189428/35%| 26..30 5409/1% 192757/35% 31..35 4547/0% 193975/36%| 31..35 4868/0% 197625/36% 36..40 7204/1% 201179/37%| 36..40 7524/1% 205149/38% 41..45 8343/1% 209522/39%| 41..45 8656/1% 213805/39% 46..50 8744/1% 218266/40%| 46..50 9252/1% 223057/41% 51..55 5806/1% 224072/41%| 51..55 6023/1% 229080/42% 56..60 6834/1% 230906/43%| 56..60 7072/1% 236152/44% 61..65 5391/1% 236297/44%| 61..65 5554/1% 241706/45% 66..70 9269/1% 245566/45%| 66..70 9327/1% 251033/46% 71..75 6249/1% 251815/46%| 71..75 6309/1% 257342/47% 76..80 8871/1% 260686/48%| 76..80 8905/1% 266247/49% 81..85 8775/1% 269461/50%| 81..85 8786/1% 275033/51% 86..90 
13323/2%282784/52%| 86..90 13286/2%288319/53% 91..95 21279/3%304063/56%| 91..95 20947/3%309266/57% 96..99 21323/3%325386/60%| 96..99 20247/3%329513/61% 100 211312/39% 536698/100% | 100 206704/38% 536217/100% i686/libgcc_s.so: cov%samples cumul cov%samples cumul 0.0 511/21% 511/21% | 0.0 512/21% 512/21% 0..544/1% 555/23% | 0..567/2% 579/24% 6..10 45/1% 600/25% | 6..10 41/1% 620/26% 11..15 35/1% 635/26% | 11..15 37/1% 657/27% 16..20 28/1% 663/27% | 16..20 29/1% 686/29% 21..25 32/1% 695/29% | 21..25 32/1% 718/30% 26..30 40/1% 735/31% | 26..30 41/1% 759/32% 31..35 32/1% 767/32% | 31..35 34/1% 793/33% 36..40 29/1% 796/33% | 36..40 33/1% 826/34% 41..45 42/1% 838/35% | 41..45 44/1% 870/36% 46..50 47/1% 885/37% | 46..50 56/2% 926/39% 51..55 33/1% 918/38% | 51..55 35/1% 961/40% 56..60 45/1% 963/40% | 56..60 48/2% 1009/42% 61..65 44/1% 1007/42% | 61..65 43/1% 1052/44% 66..70 64/2% 1071/45% | 66..70 68/2% 1120/47% 71..75 45/1% 1116/47% | 71..75 45/1% 1165/49% 76..80 67/2% 1183/49% | 76..80 69/2% 1234/52% 81..85 76/3% 1259/53% | 81..85 76/3% 1310/55% 86..90 131/5% 1390/58% | 86..90 119/5% 1429/60% 91..95 83/3% 1473/62% | 91..95 81/3% 1510/63% 96..99 54/2% 1527/64% | 96..99 49/2% 1559/66% 100 842/35% 2369/100% | 100 802/33% 2361/100% i686/libstdc++.so cov%samples cumul cov%samples cumul 0.0 12708/37% 12708/37% | 0.0 12737/37% 12737/37% 0..5125/0% 12833/38% | 0..5263/0% 13000/38% 6..10 167/0% 13000/38% | 6..10 201/0% 13201/39% 11..15 125/0% 13125/39% | 11..15 157/0% 13358/39% 16..20 197/0% 13322/39% | 16..20 216/0% 13574/40% 21..25 169/0% 13491/40% | 21..25 194/0% 13768/40% 26..30 120/0% 13611/40% | 26..30 155/0% 13923/41% 31..35 179/0% 13790/41% | 31..35 188/0% 14111/41% 36..40 238/0% 14028/41% | 36..40 257/0% 14368/42% 41..45 226/0% 14254/42% | 41..45 266/0% 14634/43% 46..50 258/0% 14512/43% | 46..50 270/0% 14904/44% 51..55 176/0%
Re: [RFA] PowerPC e5500 and e6500 cores support
The patch I submitted had an omission. I failed to regenerate rs6000-tables.opt (Sorry, I misunderstood gcc_update --touch instructions) OK to commit the update ? 2012-06-05 Edmar Wienskoski * config/rs6000/rs6000-tables.opt: Regenerated. On 06/04/2012 08:45 PM, David Edelsohn wrote: This patch is okay to commit. Thanks, David . Index: gcc/gcc/config/rs6000/rs6000-tables.opt === --- gcc/gcc/config/rs6000/rs6000-tables.opt (revision 188248) +++ gcc/gcc/config/rs6000/rs6000-tables.opt (working copy) @@ -126,80 +126,86 @@ Enum(rs6000_cpu_opt_value) String(e500mc64) Value(32) EnumValue -Enum(rs6000_cpu_opt_value) String(860) Value(33) +Enum(rs6000_cpu_opt_value) String(e5500) Value(33) EnumValue -Enum(rs6000_cpu_opt_value) String(970) Value(34) +Enum(rs6000_cpu_opt_value) String(e6500) Value(34) EnumValue -Enum(rs6000_cpu_opt_value) String(cell) Value(35) +Enum(rs6000_cpu_opt_value) String(860) Value(35) EnumValue -Enum(rs6000_cpu_opt_value) String(common) Value(36) +Enum(rs6000_cpu_opt_value) String(970) Value(36) EnumValue -Enum(rs6000_cpu_opt_value) String(ec603e) Value(37) +Enum(rs6000_cpu_opt_value) String(cell) Value(37) EnumValue -Enum(rs6000_cpu_opt_value) String(G3) Value(38) +Enum(rs6000_cpu_opt_value) String(common) Value(38) EnumValue -Enum(rs6000_cpu_opt_value) String(G4) Value(39) +Enum(rs6000_cpu_opt_value) String(ec603e) Value(39) EnumValue -Enum(rs6000_cpu_opt_value) String(G5) Value(40) +Enum(rs6000_cpu_opt_value) String(G3) Value(40) EnumValue -Enum(rs6000_cpu_opt_value) String(titan) Value(41) +Enum(rs6000_cpu_opt_value) String(G4) Value(41) EnumValue -Enum(rs6000_cpu_opt_value) String(power) Value(42) +Enum(rs6000_cpu_opt_value) String(G5) Value(42) EnumValue -Enum(rs6000_cpu_opt_value) String(power2) Value(43) +Enum(rs6000_cpu_opt_value) String(titan) Value(43) EnumValue -Enum(rs6000_cpu_opt_value) String(power3) Value(44) +Enum(rs6000_cpu_opt_value) String(power) Value(44) EnumValue -Enum(rs6000_cpu_opt_value) String(power4) Value(45) 
+Enum(rs6000_cpu_opt_value) String(power2) Value(45) EnumValue -Enum(rs6000_cpu_opt_value) String(power5) Value(46) +Enum(rs6000_cpu_opt_value) String(power3) Value(46) EnumValue -Enum(rs6000_cpu_opt_value) String(power5+) Value(47) +Enum(rs6000_cpu_opt_value) String(power4) Value(47) EnumValue -Enum(rs6000_cpu_opt_value) String(power6) Value(48) +Enum(rs6000_cpu_opt_value) String(power5) Value(48) EnumValue -Enum(rs6000_cpu_opt_value) String(power6x) Value(49) +Enum(rs6000_cpu_opt_value) String(power5+) Value(49) EnumValue -Enum(rs6000_cpu_opt_value) String(power7) Value(50) +Enum(rs6000_cpu_opt_value) String(power6) Value(50) EnumValue -Enum(rs6000_cpu_opt_value) String(powerpc) Value(51) +Enum(rs6000_cpu_opt_value) String(power6x) Value(51) EnumValue -Enum(rs6000_cpu_opt_value) String(powerpc64) Value(52) +Enum(rs6000_cpu_opt_value) String(power7) Value(52) EnumValue -Enum(rs6000_cpu_opt_value) String(rios) Value(53) +Enum(rs6000_cpu_opt_value) String(powerpc) Value(53) EnumValue -Enum(rs6000_cpu_opt_value) String(rios1) Value(54) +Enum(rs6000_cpu_opt_value) String(powerpc64) Value(54) EnumValue -Enum(rs6000_cpu_opt_value) String(rios2) Value(55) +Enum(rs6000_cpu_opt_value) String(rios) Value(55) EnumValue -Enum(rs6000_cpu_opt_value) String(rsc) Value(56) +Enum(rs6000_cpu_opt_value) String(rios1) Value(56) EnumValue -Enum(rs6000_cpu_opt_value) String(rsc1) Value(57) +Enum(rs6000_cpu_opt_value) String(rios2) Value(57) EnumValue -Enum(rs6000_cpu_opt_value) String(rs64) Value(58) +Enum(rs6000_cpu_opt_value) String(rsc) Value(58) +EnumValue +Enum(rs6000_cpu_opt_value) String(rsc1) Value(59) + +EnumValue +Enum(rs6000_cpu_opt_value) String(rs64) Value(60) + 2012-06-05 Edmar Wienskoski * config/rs6000/rs6000-tables.opt: Regenerated.
Re: [wwwdocs] Make codingconventions.html pass W3 validator.
On 6/5/12, Gerald Pfeifer wrote:
> On Mon, 4 Jun 2012, Lawrence Crowl wrote:
> > The following source change enables codingconventions.html to
> > pass the HTML validator at validator.w3.org.
>
> the web pages will be preprocessed before they are put on the server
> (this transparently happens upon checkin) and as part of that
>
>   <!DOCTYPE html
>     PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
>     "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
>
> are going to be prepended to any page, plus the standard CSS
> reference and footer are added, for example.

Okay, but now I have questions! Where do these prepended pages come from? How do I test the page as it will appear? I guess maybe I'm asking for the makefile that produces what one would see. I want to validate that.

BTW, part of the problem is that the pages, as they are, look complete enough to be taken for whole documents. I.e. they are not obviously fragments. Would it be better to make them clearly fragments? Doesn't the prepending prevent incremental migration to new standards?

> Since you ran into this, I would like to document this better.
> Would http://gcc.gnu.org/projects/web.html be a good place,
> or do you have a different suggestion?

My entry point was http://gcc.gnu.org/cvs.html, so at a minimum it needs to be cross-linked with http://gcc.gnu.org/projects/web.html.

-- Lawrence Crowl
[RFC] [PowerPC] Patch to create new attribute type: popcnt
David, Michael,

Here is the new type "popcnt" patch that I separated from the previous E5500/E6500 submission. It also includes the changes suggested by Michael Meissner (detailed below). I am missing some details for power6 (I could not find any documentation).

Bootstrapped with no regressions, all languages enabled, configured for target powerpc64, using "--with-cpu=<>" for each of power6, power7, and 970. All work was performed on svn revision 188200.

NOTES:
- The 403 and 440 manuals do not list popcnt* instructions. Skipped.
- The 750 and 74xx Freescale parts do not have popcnt* instructions. Skipped.
- The 476 manual lists popcnt as requiring the i-pipe. Added to the corresponding insn reservation.
- power4 (IBM 970) pre-dates ISA-2.02. It does not have popcnt* instructions. Skipped.
- power5 and power7 group simple integer and complex integer together. Appended popcnt to the insn reservation.
- power6.md has a different style. Created a separate reservation. I used an instruction latency of 1. Please confirm. I did not add a store bypass either. Let me know if I should.

Thanks,
Edmar

2012-06-05  Edmar Wienskoski

	* config/rs6000/rs6000.md (define_attr "type"): New type popcnt.
	(popcntb2): Add attribute type popcnt.
	(popcntd2): Ditto.
	* config/rs6000/power4.md (define_insn_reservation): Add type popcnt.
	* config/rs6000/power5.md (define_insn_reservation): Ditto.
	* config/rs6000/power7.md (define_insn_reservation): Ditto.
	* config/rs6000/476.md (define_insn_reservation): Ditto.
	* config/rs6000/power6.md (define_insn_reservation): New reservation
	for popcnt instructions.
Index: gcc-20120604/gcc/config/rs6000/476.md
===================================================================
--- gcc-20120604/gcc/config/rs6000/476.md	(revision 188200)
+++ gcc-20120604/gcc/config/rs6000/476.md	(working copy)
@@ -71,7 +71,7 @@
    ppc476_i_pipe|ppc476_lj_pipe")
 
 (define_insn_reservation "ppc476-complex-integer" 1
-  (and (eq_attr "type" "cmp,cr_logical,delayed_cr,cntlz,isel,isync,sync,trap")
+  (and (eq_attr "type" "cmp,cr_logical,delayed_cr,cntlz,isel,isync,sync,trap,popcnt")
        (eq_attr "cpu" "ppc476"))
   "ppc476_issue,\
    ppc476_i_pipe")
Index: gcc-20120604/gcc/config/rs6000/power7.md
===================================================================
--- gcc-20120604/gcc/config/rs6000/power7.md	(revision 188200)
+++ gcc-20120604/gcc/config/rs6000/power7.md	(working copy)
@@ -150,7 +150,7 @@
 ; FX Unit
 (define_insn_reservation "power7-integer" 1
   (and (eq_attr "type" "integer,insert_word,insert_dword,shift,trap,\
-var_shift_rotate,exts,isel")
+var_shift_rotate,exts,isel,popcnt")
        (eq_attr "cpu" "power7"))
   "DU_power7,FXU_power7")
 
Index: gcc-20120604/gcc/config/rs6000/power6.md
===================================================================
--- gcc-20120604/gcc/config/rs6000/power6.md	(revision 188200)
+++ gcc-20120604/gcc/config/rs6000/power6.md	(working copy)
@@ -216,6 +216,11 @@
        (eq_attr "cpu" "power6"))
   "FXU_power6")
 
+(define_insn_reservation "power6-popcnt" 1
+  (and (eq_attr "type" "popcnt")
+       (eq_attr "cpu" "power6"))
+  "FXU_power6")
+
 (define_insn_reservation "power6-insert" 1
   (and (eq_attr "type" "insert_word")
        (eq_attr "cpu" "power6"))
Index: gcc-20120604/gcc/config/rs6000/power5.md
===================================================================
--- gcc-20120604/gcc/config/rs6000/power5.md	(revision 188200)
+++ gcc-20120604/gcc/config/rs6000/power5.md	(working copy)
@@ -142,7 +142,7 @@
 ; Integer latency is 2 cycles
 (define_insn_reservation "power5-integer" 2
   (and (eq_attr "type" "integer,insert_dword,shift,trap,\
-var_shift_rotate,cntlz,exts,isel")
+var_shift_rotate,cntlz,exts,isel,popcnt")
        (eq_attr "cpu" "power5"))
   "iq_power5")
 
Index: gcc-20120604/gcc/config/rs6000/rs6000.md
===================================================================
--- gcc-20120604/gcc/config/rs6000/rs6000.md	(revision 188200)
+++ gcc-20120604/gcc/config/rs6000/rs6000.md	(working copy)
@@ -145,7 +145,7 @@
 ;; Define an insn type attribute.  This is used in function unit delay
 ;; computations.
-(define_attr "type" "integer,two,three,load,load_ext,load_ext_u,load_ext_ux,load_ux,load_u,store,store_ux,store_u,fpload,fpload_ux,fpload_u,fpstore,fpstore_ux,fpstore_u,vecload,vecstore,imul,imul2,imul3,lmul,idiv,ldiv,insert_word,branch,cmp,fast_compare,compare,var_delayed_compare,delayed_compare,imul_compare,lmul_compare,fpcompare,cr_logical,delayed_cr,mfcr,mfcrf,mtcr,mfjmpr,mtjmpr,fp,fpsimple,dmul,sdiv,ddiv,ssqrt,dsqrt,jmpreg,brinc,vecsimple,veccomplex,vecdiv,veccmp,veccmpsimple,vecperm,vecfloat,vecfdiv,vecdouble,isync,sync,load_l,store_c,shift,trap,insert_dword,var_shift_rotate,cntlz,exts,mffgpr,mftgpr,isel"
+(define_attr "type" "integer,two,three,load,load_ext,load_ext_u,load_ext_ux,load_ux,load_u,store,store_ux,store_u,fp
[PATCH, 4.6, committed] Backport fix for g++ -E -C issue in gthr-posix.h
I committed the following patch to the FSF 4.6 branch which Jakub approved on IRC.

Peter

	Backport from mainline
	2011-08-29  Jakub Jelinek

	* gthr-posix.h (__gthread_active_p): Do not use preprocessor
	conditionals and comments inside macro arguments.

Index: gcc/gthr-posix.h
===================================================================
--- gcc/gthr-posix.h	(revision 188247)
+++ gcc/gthr-posix.h	(working copy)
@@ -239,16 +239,15 @@ __gthread_active_p (void)
 static inline int
 __gthread_active_p (void)
 {
-  static void *const __gthread_active_ptr
-    = __extension__ (void *) &__gthrw_(
 /* Android's C library does not provide pthread_cancel, check for
    `pthread_create' instead.  */
 #ifndef __BIONIC__
-				       pthread_cancel
+  static void *const __gthread_active_ptr
+    = __extension__ (void *) &__gthrw_(pthread_cancel);
 #else
-				       pthread_create
+  static void *const __gthread_active_ptr
+    = __extension__ (void *) &__gthrw_(pthread_create);
 #endif
-				       );
   return __gthread_active_ptr != 0;
 }
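For readers who have not run into this before: the C standard leaves the behavior undefined when something that looks like a preprocessing directive appears inside the argument list of a function-like macro, and with -E -C the comments are kept in the preprocessed output as well. The fix above moves the #ifndef outside the macro invocation. A minimal sketch of the same pattern follows; WRAP, USE_FALLBACK, wrapped_primary and wrapped_fallback are hypothetical names made up for illustration and are not part of gthr-posix.h.

```c
#include <stddef.h>

/* Hypothetical stand-in for a wrapper macro such as __gthrw_:
   it maps a plain name to the symbol that is actually referenced.  */
#define WRAP(name) wrapped_##name

int wrapped_primary (void)  { return 1; }
int wrapped_fallback (void) { return 2; }

typedef int (*probe_fn) (void);

/* The conditional selects between two complete declarations, so no
   directive or comment ever appears inside the arguments of WRAP.  */
probe_fn
select_probe (void)
{
#ifndef USE_FALLBACK
  static const probe_fn probe = &WRAP (primary);
#else
  static const probe_fn probe = &WRAP (fallback);
#endif
  return probe;
}
```

The cost is that the declaration is written twice, but each macro invocation is then a single well-formed token sequence regardless of preprocessor options.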
Re: Support for Runtime CPU type detection via builtins (issue5754058)
Hi H.J., I am attaching a patch to add __cpu_indicator_init to the list of symbols to be versioned and exported in libgcc_s.so. Also, updating builtin_target.c test to explicitly do a CPUID and check if the features are identified correctly like you had suggested earlier. Patch ok? * config/i386/libgcc-bsd.ver: Version symbol __cpu_indicator_init. * config/i386/libgcc-sol2.ver: Ditto. * config/i386/libgcc-glibc.ver: Ditto. * gcc.target/i386/builtin_target.c (vendor_signatures): New enum. (check_intel_cpu_model): New function. (check_amd_cpu_model): New function. (check_features): New function. (__get_cpuid_output): New function. (check_detailed): New function. (fn1): Rename to quick_check. (main): Update to call quick_check and call check_detailed. Thanks, -Sri. On Wed, Apr 25, 2012 at 5:52 PM, Sriraman Tallam wrote: > Patch committed. > > Thanks, > -Sri. > > On Wed, Apr 25, 2012 at 4:52 PM, H.J. Lu wrote: >> On Wed, Apr 25, 2012 at 4:38 PM, Sriraman Tallam wrote: >>> Hi H.J, >>> >>> Could you please review this patch for AVX2 check? >>> >>> * config/i386/i386-cpuinfo.c (FEATURE_AVX2): New enum value. >>> (get_available_features): New argument. Check for AVX2. >>> (__cpu_indicator_init): Modify call to get_available_features >>> . >>> * doc/extend.texi: Document avx2 support. >>> * testsuite/gcc.target/i386/builtin_target.c: Check avx2. >>> * config/i386/i386.c (fold_builtin_cpu): Add avx2. >>> >> >> It looks good to me. >> >> Thanks. >> >> -- >> H.J. 
Index: libgcc/config/i386/libgcc-bsd.ver === --- libgcc/config/i386/libgcc-bsd.ver (revision 188246) +++ libgcc/config/i386/libgcc-bsd.ver (working copy) @@ -109,4 +109,5 @@ GCC_4.6.0 { GCC_4.8.0 { __cpu_model + __cpu_indicator_init } Index: libgcc/config/i386/libgcc-sol2.ver === --- libgcc/config/i386/libgcc-sol2.ver (revision 188246) +++ libgcc/config/i386/libgcc-sol2.ver (working copy) @@ -109,4 +109,5 @@ GCC_4.5.0 { GCC_4.8.0 { __cpu_model + __cpu_indicator_init } Index: libgcc/config/i386/libgcc-glibc.ver === --- libgcc/config/i386/libgcc-glibc.ver (revision 188246) +++ libgcc/config/i386/libgcc-glibc.ver (working copy) @@ -150,6 +150,7 @@ GCC_4.3.0 { GCC_4.8.0 { __cpu_model + __cpu_indicator_init } %else GCC_4.4.0 { @@ -190,5 +191,6 @@ GCC_4.5.0 { GCC_4.8.0 { __cpu_model + __cpu_indicator_init } %endif Index: gcc/testsuite/gcc.target/i386/builtin_target.c === --- gcc/testsuite/gcc.target/i386/builtin_target.c (revision 188246) +++ gcc/testsuite/gcc.target/i386/builtin_target.c (working copy) @@ -1,13 +1,229 @@ /* This test checks if the __builtin_cpu_is and __builtin_cpu_supports calls - are recognized. */ + are recognized. It also independently uses CPUID to get cpu type and + features supported and checks if the builtins correctly identify the + platform. The code to do the identification is adapted from + libgcc/config/i386/cpuinfo.c. */ /* { dg-do run } */ #include +#include "cpuid.h" -int -fn1 () +enum vendor_signatures { + SIG_INTEL = 0x756e6547 /* Genu */, + SIG_AMD =0x68747541 /* Auth */ +}; + +/* Check if the Intel CPU model and sub-model are identified. */ +static void +check_intel_cpu_model (unsigned int family, unsigned int model, + unsigned int brand_id) +{ + /* Parse family and model only if brand ID is 0. */ + if (brand_id == 0) +{ + switch (family) + { + case 0x5: + /* Pentium. */ + break; + case 0x6: + switch (model) + { + case 0x1c: + case 0x26: + /* Atom. 
*/ + assert (__builtin_cpu_is ("atom")); + break; + case 0x1a: + case 0x1e: + case 0x1f: + case 0x2e: + /* Nehalem. */ + assert (__builtin_cpu_is ("corei7")); + assert (__builtin_cpu_is ("nehalem")); + break; + case 0x25: + case 0x2c: + case 0x2f: + /* Westmere. */ + assert (__builtin_cpu_is ("corei7")); + assert (__builtin_cpu_is ("westmere")); + break; + case 0x2a: + /* Sandy Bridge. */ + assert (__builtin_cpu_is ("corei7")); + assert (__builtin_cpu_is ("sandybridge")); + break; + case 0x17: + case 0x1d: + /* Penryn. */ + case 0x0f: + /* Merom. */ + assert (__builtin_cpu_is ("core2")); + break; + default: + break; + } + break; +
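The idea behind the extended test, querying CPUID directly and checking that the builtins agree, can be shown on a single feature bit. The sketch below is not taken from the patch: sse2_check_agrees and CPUID_EDX_SSE2 are names invented here, and it assumes a GCC-compatible compiler that provides <cpuid.h>, __builtin_cpu_init and __builtin_cpu_supports. On non-x86 targets it simply reports agreement.

```c
#if defined(__i386__) || defined(__x86_64__)
#include <cpuid.h>
#endif

/* CPUID leaf 1 reports SSE2 support in bit 26 of EDX.  */
#define CPUID_EDX_SSE2 (1u << 26)

/* Return 1 if the raw CPUID bit and __builtin_cpu_supports ("sse2")
   agree, 0 if they disagree.  Returns 1 when there is nothing to
   compare (non-x86 target, or CPUID leaf 1 unavailable).  */
int
sse2_check_agrees (void)
{
#if defined(__i386__) || defined(__x86_64__)
  unsigned int eax, ebx, ecx, edx;

  if (!__get_cpuid (1, &eax, &ebx, &ecx, &edx))
    return 1;

  /* Make sure the CPU model data is initialized before querying it.  */
  __builtin_cpu_init ();

  int raw = (edx & CPUID_EDX_SSE2) != 0;
  int builtin = __builtin_cpu_supports ("sse2") != 0;
  return raw == builtin;
#else
  return 1;
#endif
}
```

SSE2 is used here because it depends only on the CPUID bit; features such as AVX also depend on operating system support, so a raw bit comparison alone would not be a fair check for them.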
Re: [PATCH] gcc/config/freebsd-spec.h: Fix building PIE executables. Link them with crt{begin,end}S.o and Scrt1.o which are PIC instead of crt{begin,end}.o and crt1.o which are not. Spec synced from g
Ping? Is there any problem with that patch?
[PATCH][Cilkplus] Did an induction variable optimization
Hello Everyone, This patch is for the Cilkplus branch affecting the C compiler. This patch will remove an unwanted conversion of the induction variable in the cilk_for function. Thanking You, Yours Sincerely, Balaji V. Iyer.Index: gcc/cilk-spawn.c === --- gcc/cilk-spawn.c(revision 188251) +++ gcc/cilk-spawn.c(working copy) @@ -2290,9 +2290,6 @@ int incr_sign = cfd->incr_sign; enum tree_code add_op = incr_sign >= 0 ? PLUS_EXPR : MINUS_EXPR; - gcc_assert (TYPE_MAIN_VARIANT (TREE_TYPE (loop_var)) == - TYPE_MAIN_VARIANT (count_type)); - /* Compute an expression to be added or subtracted. We want to add or subtract LOOP_VAR * INCR. INCR may be negative. @@ -2374,8 +2371,8 @@ tree body, block; tree lower_bound; tree loop_var; - tree count_type; tree tempx,tempy; + declare_cilk_for_parms (cfd); cfd->wd.fntype = build_function_type (void_type_node, cfd->wd.argtypes); @@ -2420,14 +2417,12 @@ lower_bound = hack; } loop_var = build_decl (UNKNOWN_LOCATION, VAR_DECL, NULL_TREE, -TREE_TYPE (cfd->min_parm)); +cfd->var_type); DECL_CONTEXT (loop_var) = fndecl; - add_stmt (build2 (INIT_EXPR, void_type_node, loop_var, cfd->min_parm)); + add_stmt (build_modify_expr (UNKNOWN_LOCATION, loop_var, TREE_TYPE (loop_var), + NOP_EXPR, UNKNOWN_LOCATION, + cfd->min_parm, TREE_TYPE (cfd->min_parm))); - count_type = cfd->count_type; - gcc_assert (TYPE_MAIN_VARIANT (TREE_TYPE (loop_var)) == - TYPE_MAIN_VARIANT (count_type)); - /* The new loop body is var2 = (T)((control variable) * INCR + (lower bound)); @@ -2464,14 +2459,15 @@ add_stmt (loop_body); tempx = build2 (MODIFY_EXPR, void_type_node, loop_var, - build2 (PLUS_EXPR, count_type, + build2 (PLUS_EXPR, TREE_TYPE (loop_var), loop_var, - build_int_cst (count_type, 1))); + build_int_cst (TREE_TYPE (loop_var), 1))); add_stmt(tempx); tempy = build3 (COND_EXPR, void_type_node, build2 (LT_EXPR, boolean_type_node, loop_var, - cfd->max_parm), + build_c_cast (UNKNOWN_LOCATION, + TREE_TYPE (loop_var), cfd->max_parm)), build1 (GOTO_EXPR, void_type_node, 
lab), build_empty_stmt (UNKNOWN_LOCATION)); Index: gcc/ChangeLog.cilk === --- gcc/ChangeLog.cilk (revision 188251) +++ gcc/ChangeLog.cilk (working copy) @@ -1,3 +1,10 @@ +2012-06-05 Balaji V. Iyer + + * cilk-spawn.c (compute_loop_var): Removed an unwanted assert. + (build_cilk_for_body): Changed var type from min_parms's to the original + var_type. This change is propagated in several places with the + appropriate type conversions. + 2012-06-02 Balaji V. Iyer * tree-inline.c (remap_gimple_op_r): Added a check for NON-NULL
Re: [PATCH] vrp: fold ffs to ctz
On Jun 5, 2012, at 6:46 AM, Paolo Bonzini wrote:
>> Do we always have CTZ if we have FFS? Can't there be a target that
>> implements FFS as opcode but not CTZ, so you'd slow down things?
>> Thus, should the transform be conditional on target support for CTZ
>> or no target support for FFS?
>
> Hmm, SH and (some semi-obscure variant of) SPARC. But actually SPARC
> should define a clz pattern instead; SH should have a popcount pattern +
> a generic trick to expand ctz/ffs in terms of popcount. I'll submit
> those before applying this patch.

VAX has both FFS/FFC instructions but only an ffs pattern. It does not have CTZ or CTO patterns, but those could be added trivially.
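As background for the thread: for a nonzero argument, ffs (x) equals ctz (x) + 1, which is presumably what the fold relies on once VRP has shown the operand cannot be zero. The ctz value can in turn be computed from a population count. The sketch below illustrates both identities using GCC builtins; the function names are invented here, and whether this is the exact "generic trick" Paolo has in mind is an assumption.

```c
/* ffs for an argument known to be nonzero: one more than the number
   of trailing zero bits.  __builtin_ctz is undefined for 0, hence
   the precondition.  */
int
ffs_nonzero (unsigned int x)
{
  return __builtin_ctz (x) + 1;
}

/* ctz expressed through popcount: ~x & (x - 1) keeps exactly the
   bits below the lowest set bit of x, so counting them gives the
   number of trailing zeros.  For x == 0 this yields the bit width
   of unsigned int.  */
int
ctz_via_popcount (unsigned int x)
{
  return __builtin_popcount (~x & (x - 1));
}
```

For example, ffs_nonzero (8) is 4 and ctz_via_popcount (8) is 3, matching __builtin_ffs (8) and __builtin_ctz (8).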
Re: [C++ Patch] PR 53567
On 06/05/2012 08:23 PM, Paolo Carlini wrote:
> @@ -1695,6 +1695,8 @@ implicit_conversion (tree to, tree from, tree expr
>  	    |LOOKUP_NO_TEMP_BIND|LOOKUP_NO_RVAL_BIND|LOOKUP_PREFER_RVALUE
>  	    |LOOKUP_NO_NARROWING|LOOKUP_PROTECT);
>
> +  complain &= ~tf_error;

I don't think we want warnings from implicit_conversion, either.

> -      if (flags & LOOKUP_COMPLAIN)
> -        permerror (loc, "conversion from %q#T to %q#T", intype, type);
> -      if (!flag_permissive)
> +      if (complain & tf_error)
> +        {
> +          permerror (loc, "conversion from %q#T to %q#T",
> +                     intype, type);
> +          if (!flag_permissive)
> +            return error_mark_node;

I don't think we need the last two lines anymore; we can use the usual pattern of return error if sfinae, permerror and continue otherwise.

BTW, I'm somewhat surprised that dropping LOOKUP_COMPLAIN from all the lookup_* functions works fine, but looking through the code myself I don't see anything that was using the flag.

Jason
Re: [google] Add options to pattern match function name for hotness attributes
Patch updated: using regex to match the function name: http://codereview.appspot.com/6281047 Thanks, Dehao 2012-06-01 Dehao Chen * gcc/cgraph.c (cgraph_node): Add attribute to function decl. * gcc/opts-global.c (add_attribute_pattern): New function. (pattern_match_function_attributes): New function. (handle_common_deferred_options): Handle new options. * gcc/opts.c (common_handle_option): Handle new options. * gcc/opts.h (handle_common_deferred_options): New function. * gcc/common.opt (ffunction_attribute_list): New option. Index: gcc/doc/invoke.texi === --- gcc/doc/invoke.texi (revision 188050) +++ gcc/doc/invoke.texi (working copy) @@ -362,7 +362,8 @@ -fdelete-null-pointer-checks -fdse -fdevirtualize -fdse @gol -fearly-inlining -fipa-sra -fexpensive-optimizations -ffast-math @gol -ffinite-math-only -ffloat-store -fexcess-precision=@var{style} @gol --fforward-propagate -ffp-contract=@var{style} -ffunction-sections @gol +-fforward-propagate -ffp-contract=@var{style} @gol +-ffunction-attribute-list -ffunction-sections @gol -fgcse -fgcse-after-reload -fgcse-las -fgcse-lm -fgraphite-identity @gol -fgcse-sm -fif-conversion -fif-conversion2 -findirect-inlining @gol -finline-functions -finline-functions-called-once -finline-limit=@var{n} @gol @@ -8585,6 +8586,10 @@ specify this option and you may have problems with debugging if you specify both this option and @option{-g}. +@item -ffunction-attribute-list +@opindex ffunction-attribute-list +List of function name patterns that will be applied specified attribute. 
+ @item -fbranch-target-load-optimize @opindex fbranch-target-load-optimize Perform branch target register load optimization before prologue / epilogue Index: gcc/cgraph.c === --- gcc/cgraph.c(revision 188050) +++ gcc/cgraph.c(working copy) @@ -99,6 +99,7 @@ #include "ipa-utils.h" #include "lto-streamer.h" #include "l-ipo.h" +#include "opts.h" const char * const ld_plugin_symbol_resolution_names[]= { @@ -554,6 +555,7 @@ node->origin->nested = node; } cgraph_add_assembler_hash_node (node); + pattern_match_function_attributes (decl); return node; } Index: gcc/opts.c === --- gcc/opts.c (revision 188050) +++ gcc/opts.c (working copy) @@ -1647,6 +1647,10 @@ /* Deferred. */ break; +case OPT_ffunction_attribute_list_: + /* Deferred. */ + break; + case OPT_fsched_verbose_: #ifdef INSN_SCHEDULING /* Handled with Var in common.opt. */ Index: gcc/opts.h === --- gcc/opts.h (revision 188050) +++ gcc/opts.h (working copy) @@ -382,4 +382,5 @@ location_t loc, const char *value); extern void write_opts_to_asm (void); +extern void pattern_match_function_attributes (tree); #endif Index: gcc/common.opt === --- gcc/common.opt (revision 188050) +++ gcc/common.opt (working copy) @@ -1242,6 +1242,10 @@ Common Report Var(flag_function_sections) Place each function into its own section +ffunction-attribute-list= +Common Joined RejectNegative Var(common_deferred_options) Defer +-ffunction-attribute-list=attribute:name,... Add attribute to named functions + fgcda= Common Joined RejectNegative Var(gcov_da_name) Set the gcov data file name. Index: gcc/opts-global.c === --- gcc/opts-global.c (revision 188050) +++ gcc/opts-global.c (working copy) @@ -39,6 +39,7 @@ #include "tree-pass.h" #include "params.h" #include "l-ipo.h" +#include "xregex.h" typedef const char *const_char_p; /* For DEF_VEC_P. 
*/ DEF_VEC_P(const_char_p); @@ -50,6 +51,13 @@ const char **in_fnames; unsigned num_in_fnames; +static struct reg_func_attr_patterns +{ + regex_t r; + const char *attribute; + struct reg_func_attr_patterns *next; +} *reg_func_attr_patterns; + /* Return a malloced slash-separated list of languages in MASK. */ static char * @@ -79,6 +87,62 @@ return result; } +/* Add strings like attribute_str:pattern... to attribute pattern list. */ + +static void +add_attribute_pattern (const char *arg) +{ + char *tmp; + char *pattern_str; + struct reg_func_attr_patterns *one_pat; + int ec; + + /* We never free this string. */ + tmp = xstrdup (arg); + + pattern_str = strchr (tmp, ':'); + if (!pattern_str) +error ("invalid pattern in -ffunction-attribute-list option: %qs", tmp); + + *pattern_str = '\0'; + pattern_str ++; + + one_pat = XCNEW (struct reg_func_attr_patterns); + one_pat->next = reg_func_attr_patterns; + one_pat->attribute = tmp; + reg_func_attr_patterns = one_pat; + if
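The quoted patch is cut off above, so the following is only a rough standalone sketch of the mechanism it describes: split an "attribute:pattern" string at the colon, compile the pattern as a regular expression, and later look up which attribute applies to a given function name. It uses POSIX <regex.h> rather than libiberty's xregex.h, handles a single attribute:pattern pair per call, and assumes extended regex syntax. All names (attr_pattern, add_attr_pattern, match_attr) are hypothetical and do not come from the patch.

```c
#include <regex.h>
#include <stdlib.h>
#include <string.h>

/* One compiled pattern together with the attribute it selects.  */
struct attr_pattern
{
  regex_t re;
  char *attribute;
  struct attr_pattern *next;
};

static struct attr_pattern *attr_patterns;

/* Parse "attribute:regex" and push it on the list.
   Returns 0 on success, -1 on a malformed argument or bad regex.  */
int
add_attr_pattern (const char *arg)
{
  const char *colon = strchr (arg, ':');
  if (colon == NULL || colon == arg)
    return -1;			/* Bail out instead of using a null pointer.  */

  struct attr_pattern *p = calloc (1, sizeof *p);
  if (p == NULL)
    return -1;

  size_t len = (size_t) (colon - arg);
  p->attribute = malloc (len + 1);
  if (p->attribute == NULL)
    {
      free (p);
      return -1;
    }
  memcpy (p->attribute, arg, len);
  p->attribute[len] = '\0';

  if (regcomp (&p->re, colon + 1, REG_EXTENDED | REG_NOSUB) != 0)
    {
      free (p->attribute);
      free (p);
      return -1;
    }

  p->next = attr_patterns;
  attr_patterns = p;
  return 0;
}

/* Return the attribute of the most recently added pattern that
   matches FN_NAME, or NULL if none matches.  */
const char *
match_attr (const char *fn_name)
{
  for (struct attr_pattern *p = attr_patterns; p != NULL; p = p->next)
    if (regexec (&p->re, fn_name, 0, NULL, 0) == 0)
      return p->attribute;
  return NULL;
}
```

With this sketch, add_attr_pattern ("hot:^fast_") followed by match_attr ("fast_path") yields "hot". Returning early on a missing colon matters: reporting the error and then continuing would write through a null pointer.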
Re: [google] Add options to pattern match function name for hotness attributes
Please document it in doc/invoke.texi with examples. thanks, David On Tue, Jun 5, 2012 at 7:28 PM, Dehao Chen wrote: > Patch updated: using regex to match the function name: > > http://codereview.appspot.com/6281047 > > Thanks, > Dehao > > 2012-06-01 Dehao Chen > > * gcc/cgraph.c (cgraph_node): Add attribute to function decl. > * gcc/opts-global.c (add_attribute_pattern): New function. > (pattern_match_function_attributes): New function. > (handle_common_deferred_options): Handle new options. > * gcc/opts.c (common_handle_option): Handle new options. > * gcc/opts.h (handle_common_deferred_options): New function. > * gcc/common.opt (ffunction_attribute_list): New option. > > Index: gcc/doc/invoke.texi > === > --- gcc/doc/invoke.texi (revision 188050) > +++ gcc/doc/invoke.texi (working copy) > @@ -362,7 +362,8 @@ > -fdelete-null-pointer-checks -fdse -fdevirtualize -fdse @gol > -fearly-inlining -fipa-sra -fexpensive-optimizations -ffast-math @gol > -ffinite-math-only -ffloat-store -fexcess-precision=@var{style} @gol > --fforward-propagate -ffp-contract=@var{style} -ffunction-sections @gol > +-fforward-propagate -ffp-contract=@var{style} @gol > +-ffunction-attribute-list -ffunction-sections @gol > -fgcse -fgcse-after-reload -fgcse-las -fgcse-lm -fgraphite-identity @gol > -fgcse-sm -fif-conversion -fif-conversion2 -findirect-inlining @gol > -finline-functions -finline-functions-called-once -finline-limit=@var{n} @gol > @@ -8585,6 +8586,10 @@ > specify this option and you may have problems with debugging if > you specify both this option and @option{-g}. > > +@item -ffunction-attribute-list > +@opindex ffunction-attribute-list > +List of function name patterns that will be applied specified attribute. 
> + > @item -fbranch-target-load-optimize > @opindex fbranch-target-load-optimize > Perform branch target register load optimization before prologue / epilogue > Index: gcc/cgraph.c > === > --- gcc/cgraph.c (revision 188050) > +++ gcc/cgraph.c (working copy) > @@ -99,6 +99,7 @@ > #include "ipa-utils.h" > #include "lto-streamer.h" > #include "l-ipo.h" > +#include "opts.h" > > const char * const ld_plugin_symbol_resolution_names[]= > { > @@ -554,6 +555,7 @@ > node->origin->nested = node; > } > cgraph_add_assembler_hash_node (node); > + pattern_match_function_attributes (decl); > return node; > } > > Index: gcc/opts.c > === > --- gcc/opts.c (revision 188050) > +++ gcc/opts.c (working copy) > @@ -1647,6 +1647,10 @@ > /* Deferred. */ > break; > > + case OPT_ffunction_attribute_list_: > + /* Deferred. */ > + break; > + > case OPT_fsched_verbose_: > #ifdef INSN_SCHEDULING > /* Handled with Var in common.opt. */ > Index: gcc/opts.h > === > --- gcc/opts.h (revision 188050) > +++ gcc/opts.h (working copy) > @@ -382,4 +382,5 @@ > location_t loc, > const char *value); > extern void write_opts_to_asm (void); > +extern void pattern_match_function_attributes (tree); > #endif > Index: gcc/common.opt > === > --- gcc/common.opt (revision 188050) > +++ gcc/common.opt (working copy) > @@ -1242,6 +1242,10 @@ > Common Report Var(flag_function_sections) > Place each function into its own section > > +ffunction-attribute-list= > +Common Joined RejectNegative Var(common_deferred_options) Defer > +-ffunction-attribute-list=attribute:name,... Add attribute to named > functions > + > fgcda= > Common Joined RejectNegative Var(gcov_da_name) > Set the gcov data file name. > Index: gcc/opts-global.c > === > --- gcc/opts-global.c (revision 188050) > +++ gcc/opts-global.c (working copy) > @@ -39,6 +39,7 @@ > #include "tree-pass.h" > #include "params.h" > #include "l-ipo.h" > +#include "xregex.h" > > typedef const char *const_char_p; /* For DEF_VEC_P. 
*/ > DEF_VEC_P(const_char_p); > @@ -50,6 +51,13 @@ > const char **in_fnames; > unsigned num_in_fnames; > > +static struct reg_func_attr_patterns > +{ > + regex_t r; > + const char *attribute; > + struct reg_func_attr_patterns *next; > +} *reg_func_attr_patterns; > + > /* Return a malloced slash-separated list of languages in MASK. */ > > static char * > @@ -79,6 +87,62 @@ > return result; > } > > +/* Add strings like attribute_str:pattern... to attribute pattern list. */ > + > +static void > +add_attribute_pattern (const char *arg) > +{ > + char *tmp; > + char *pattern_str; > + struct reg_func_attr_patterns *one_pat; > + int ec; > + > + /* We never free this string.
Re: [google] Add options to pattern match function name for hotness attributes
Also needs to get the attribute spec and call the attribute handler .. David On Tue, Jun 5, 2012 at 9:28 PM, Xinliang David Li wrote: > Please document it in doc/invoke.texi with examples. > > thanks, > > David > > On Tue, Jun 5, 2012 at 7:28 PM, Dehao Chen wrote: >> Patch updated: using regex to match the function name: >> >> http://codereview.appspot.com/6281047 >> >> Thanks, >> Dehao >> >> 2012-06-01 Dehao Chen >> >> * gcc/cgraph.c (cgraph_node): Add attribute to function decl. >> * gcc/opts-global.c (add_attribute_pattern): New function. >> (pattern_match_function_attributes): New function. >> (handle_common_deferred_options): Handle new options. >> * gcc/opts.c (common_handle_option): Handle new options. >> * gcc/opts.h (handle_common_deferred_options): New function. >> * gcc/common.opt (ffunction_attribute_list): New option. >> >> Index: gcc/doc/invoke.texi >> === >> --- gcc/doc/invoke.texi (revision 188050) >> +++ gcc/doc/invoke.texi (working copy) >> @@ -362,7 +362,8 @@ >> -fdelete-null-pointer-checks -fdse -fdevirtualize -fdse @gol >> -fearly-inlining -fipa-sra -fexpensive-optimizations -ffast-math @gol >> -ffinite-math-only -ffloat-store -fexcess-precision=@var{style} @gol >> --fforward-propagate -ffp-contract=@var{style} -ffunction-sections @gol >> +-fforward-propagate -ffp-contract=@var{style} @gol >> +-ffunction-attribute-list -ffunction-sections @gol >> -fgcse -fgcse-after-reload -fgcse-las -fgcse-lm -fgraphite-identity @gol >> -fgcse-sm -fif-conversion -fif-conversion2 -findirect-inlining @gol >> -finline-functions -finline-functions-called-once -finline-limit=@var{n} >> @gol >> @@ -8585,6 +8586,10 @@ >> specify this option and you may have problems with debugging if >> you specify both this option and @option{-g}. >> >> +@item -ffunction-attribute-list >> +@opindex ffunction-attribute-list >> +List of function name patterns that will be applied specified attribute. 
>> + >> @item -fbranch-target-load-optimize >> @opindex fbranch-target-load-optimize >> Perform branch target register load optimization before prologue / epilogue >> Index: gcc/cgraph.c >> === >> --- gcc/cgraph.c (revision 188050) >> +++ gcc/cgraph.c (working copy) >> @@ -99,6 +99,7 @@ >> #include "ipa-utils.h" >> #include "lto-streamer.h" >> #include "l-ipo.h" >> +#include "opts.h" >> >> const char * const ld_plugin_symbol_resolution_names[]= >> { >> @@ -554,6 +555,7 @@ >> node->origin->nested = node; >> } >> cgraph_add_assembler_hash_node (node); >> + pattern_match_function_attributes (decl); >> return node; >> } >> >> Index: gcc/opts.c >> === >> --- gcc/opts.c (revision 188050) >> +++ gcc/opts.c (working copy) >> @@ -1647,6 +1647,10 @@ >> /* Deferred. */ >> break; >> >> + case OPT_ffunction_attribute_list_: >> + /* Deferred. */ >> + break; >> + >> case OPT_fsched_verbose_: >> #ifdef INSN_SCHEDULING >> /* Handled with Var in common.opt. */ >> Index: gcc/opts.h >> === >> --- gcc/opts.h (revision 188050) >> +++ gcc/opts.h (working copy) >> @@ -382,4 +382,5 @@ >> location_t loc, >> const char *value); >> extern void write_opts_to_asm (void); >> +extern void pattern_match_function_attributes (tree); >> #endif >> Index: gcc/common.opt >> === >> --- gcc/common.opt (revision 188050) >> +++ gcc/common.opt (working copy) >> @@ -1242,6 +1242,10 @@ >> Common Report Var(flag_function_sections) >> Place each function into its own section >> >> +ffunction-attribute-list= >> +Common Joined RejectNegative Var(common_deferred_options) Defer >> +-ffunction-attribute-list=attribute:name,... Add attribute to named >> functions >> + >> fgcda= >> Common Joined RejectNegative Var(gcov_da_name) >> Set the gcov data file name. 
>> Index: gcc/opts-global.c >> === >> --- gcc/opts-global.c (revision 188050) >> +++ gcc/opts-global.c (working copy) >> @@ -39,6 +39,7 @@ >> #include "tree-pass.h" >> #include "params.h" >> #include "l-ipo.h" >> +#include "xregex.h" >> >> typedef const char *const_char_p; /* For DEF_VEC_P. */ >> DEF_VEC_P(const_char_p); >> @@ -50,6 +51,13 @@ >> const char **in_fnames; >> unsigned num_in_fnames; >> >> +static struct reg_func_attr_patterns >> +{ >> + regex_t r; >> + const char *attribute; >> + struct reg_func_attr_patterns *next; >> +} *reg_func_attr_patterns; >> + >> /* Return a malloced slash-separated list of languages in MASK. */ >> >> static char * >> @@ -79,6 +87,62 @@ >> return result; >> } >