Steven Bosscher <[email protected]> writes:
> On Sat, Jun 14, 2014 at 9:36 PM, Richard Sandiford wrote:
>> Using a linked list gives a consistent 2% compile-time improvement for
>> fold-const.ii -O0 and ~1% for various -O2 compiles I tried. The df
>> routines do still show up high on the profile though.
>
> Can you explain a bit more about what shows up high?
For cc1plus -O0 on an oldish fold-const.ii I get:
3.19% cc1plus cc1plus [.] record_reg_classes(int, int, rtx_def**, machine_mode*, char const**, rtx_def*, reg_class*) [clone .constprop.5]
2.91% cc1plus cc1plus [.] cp_parser_skip_to_closing_parenthesis(cp_parser*, bool, bool, bool)
1.42% cc1plus cc1plus [.] cp_lexer_consume_token(cp_lexer*)
1.42% cc1plus cc1plus [.] df_ref_create_structure(df_ref_class, df_collection_rec*, rtx_def*, rtx_def**, basic_block_def*, df_insn_info*, df_ref_type, int)
1.31% cc1plus cc1plus [.] ggc_internal_alloc(unsigned long, void (*)(void*), unsigned long, unsigned long)
1.10% cc1plus cc1plus [.] df_ref_record(df_ref_class, df_collection_rec*, rtx_def*, rtx_def**, basic_block_def*, df_insn_info*, df_ref_type, int)
0.89% cc1plus cc1plus [.] find_costs_and_classes(_IO_FILE*)
0.89% cc1plus cc1plus [.] process_bb_node_lives(ira_loop_tree_node*)
0.86% cc1plus cc1plus [.] bitmap_set_bit(bitmap_head*, int)
0.82% cc1plus cc1plus [.] df_note_compute(bitmap_head*)
0.77% cc1plus libc-2.18.so [.] _int_malloc
0.76% cc1plus cc1plus [.] ix86_decompose_address(rtx_def*, ix86_address*)
0.76% cc1plus cc1plus [.] process_alt_operands(int)
0.75% cc1plus cc1plus [.] general_operand(rtx_def*, machine_mode)
0.72% cc1plus cc1plus [.] df_uses_record(df_collection_rec*, rtx_def**, df_ref_type, basic_block_def*, df_insn_info*, int)
0.72% cc1plus cc1plus [.] pool_alloc(alloc_pool_def*)
0.72% cc1plus cc1plus [.] lookup_name_real(tree_node*, int, int, bool, int, int)
0.67% cc1plus cc1plus [.] df_insn_refs_collect(df_collection_rec*, basic_block_def*, df_insn_info*)
0.67% cc1plus cc1plus [.] constrain_operands(int)
0.66% cc1plus cc1plus [.] for_each_rtx_1(rtx_def*, int, int (*)(rtx_def**, void*), void*)
0.63% cc1plus cc1plus [.] gimplify_expr(tree_node**, gimple_statement_base**, gimple_statement_base**, bool (*)(tree_node*), int)
0.62% cc1plus cc1plus [.] grokdeclarator(cp_declarator const*, cp_decl_specifier_seq*, decl_context, int, tree_node**)
0.62% cc1plus cc1plus [.] walk_tree_1(tree_node**, tree_node* (*)(tree_node**, int*, void*), void*, pointer_set_t*, tree_node* (*)(tree_node**, int*, tree_node* (*)(tree_node**, int*, void*), void*, pointer_set_t*))
0.59% cc1plus cc1plus [.] expand_expr_real_1(tree_node*, rtx_def*, machine_mode, expand_modifier, rtx_def**, bool)
0.58% cc1plus cc1plus [.] df_ref_equal_p(df_ref_d*, df_ref_d*)
0.57% cc1plus cc1plus [.] lra_create_live_ranges(bool)
0.51% cc1plus cc1plus [.] lra_eliminate(bool, bool)
0.51% cc1plus cc1plus [.] extract_insn(rtx_def*)
0.51% cc1plus cc1plus [.] ix86_legitimate_address_p(machine_mode, rtx_def*, bool)
0.50% cc1plus libc-2.18.so [.] _IO_putc
0.49% cc1plus libc-2.18.so [.] memset
0.49% cc1plus cc1plus [.] df_lr_bb_local_compute(unsigned int)
0.47% cc1plus cc1plus [.] regstat_compute_ri()
0.47% cc1plus cc1plus [.] _cpp_lex_direct
0.46% cc1plus cc1plus [.] cp_parser_postfix_expression(cp_parser*, bool, bool, bool, bool, cp_id_kind*)
0.44% cc1plus cc1plus [.] htab_find_slot_with_hash
0.42% cc1plus libc-2.18.so [.] malloc_consolidate
0.42% cc1plus [kernel.kallsyms] [k] clear_page_c_e
0.41% cc1plus cc1plus [.] copy_rtx_if_shared_1(rtx_def**)
0.41% cc1plus cc1plus [.] cleanup_cfg(int)
where the df routines show up a fair bit (3 of the top 10:
df_ref_create_structure, df_ref_record and df_note_compute).
I realise that ranking can be misleading, since it might just be that
the df work is concentrated in a small number of functions.
This profile was taken after the patches; malloc was in the top 5 before.
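
One way to check whether the ranking is misleading is to add up the
percentages per subsystem rather than look at individual functions.
Below is a rough sketch of how that could be done for a pasted listing
like the one above. It is not part of any patch: the helper names
(parse_perf_report, total_for_prefix) are made up for illustration, and
it assumes each entry starts with a percentage and that the symbol is
the first token after the "[.]" or "[k]" marker, possibly wrapped onto
the next line.

```python
import re

# An entry starts with a percentage such as "1.42% "; any other non-empty
# line is treated as the wrapped continuation of the previous entry.
ENTRY_START = re.compile(r"^(\d+(?:\.\d+)?)%\s")
# The symbol name is the first token after the "[.]" (user) or "[k]" (kernel) marker.
SYMBOL = re.compile(r"\[[.k]\]\s*([^\s(]+)")


def parse_perf_report(text):
    """Return (percent, symbol) pairs from possibly line-wrapped perf output."""
    entries = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if ENTRY_START.match(line):
            entries.append(line)
        elif entries:
            entries[-1] += " " + line  # rejoin a wrapped entry
    pairs = []
    for entry in entries:
        percent = float(ENTRY_START.match(entry).group(1))
        symbol = SYMBOL.search(entry)
        if symbol:
            pairs.append((percent, symbol.group(1)))
    return pairs


def total_for_prefix(pairs, prefix):
    """Sum the percentages of all symbols whose name starts with prefix."""
    return round(sum(p for p, name in pairs if name.startswith(prefix)), 2)
```

Calling total_for_prefix(parse_perf_report(listing), "df_") then gives
the combined share of the df_* functions that appear in the listing.
Since the listing is cut off, that is only a lower bound on the real
df cost, and a prefix match will miss helpers that do not start with
"df_".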
Thanks,
Richard