http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45375
--- Comment #96 from Jan Hubicka <hubicka at gcc dot gnu.org> 2011-05-27 21:57:27 UTC ---
The stream-in profile from oprofile has now changed quite a bit:

33258     9.6313  lto1              htab_find_slot_with_hash
29679     8.5949  lto1              lto_input_tree
18338     5.3106  lto1              gt_ggc_mx_lang_tree_node
15723     4.5533  lto1              ggc_set_mark
15109     4.3755  lto1              inflate_fast
13883     4.0204  lto1              ht_lookup_with_hash
12957     3.7523  lto1              pointer_map_insert
12433     3.6005  libc-2.11.1.so    memset
8661      2.5082  lto1              lto_input_uleb128
8584      2.4859  libc-2.11.1.so    _int_malloc
6832      1.9785  lto1              ggc_internal_alloc_stat
6722      1.9467  lto1              ht_lookup

We do have nice improvements in merging and streaming efficiency. Still, burning
over 10% in hashing doesn't seem quite reasonable. I am not sure whether most of
the htab overhead is still the type merging, given that the rest of it is off
the profile. It may be something stupid, like the file name hash, which is
queried every time the file changes in a location. I should probably redo the
callgraph profile later next week.

I do have some extra patches to reduce the uleb streaming overhead and to make
lto_input_tree a bit more streamlined, which might help a little. I am not sure
how much real room is left for simple optimizations in this direction, and how
much we really need to look into streaming fewer trees.

 garbage collection    :  16.29 ( 6%) usr   0.02 ( 0%) sys  16.33 ( 6%) wall       0 kB ( 0%) ggc
 ipa lto decl in       :  76.15 (28%) usr   2.96 (21%) sys  79.33 (28%) wall  722892 kB (44%) ggc
 ipa lto decl out      :  83.36 (31%) usr   4.58 (32%) sys  88.37 (31%) wall       0 kB ( 0%) ggc
 ipa lto decl merge    :  14.59 ( 5%) usr   0.00 ( 0%) sys  14.64 ( 5%) wall     801 kB ( 0%) ggc
 inline heuristics     :  40.95 (15%) usr   0.19 ( 1%) sys  41.40 (14%) wall  241725 kB (15%) ggc

Memory needed is down, too, at about 4.3GB (in a 64-bit compilation).
GIMPLE type table: size 1048573, 570402 elements, 5098430 searches, 3158421 collisions (ratio: 0.619489)
GIMPLE type hash table: size 4194301, 1441169 elements, 44401918 searches, 37071081 collisions (ratio: 0.834898)
GIMPLE canonical type table: size 65521, 49079 elements, 896788 searches, 575628 collisions (ratio: 0.641877)
GIMPLE canonical type hash table: size 1048573, 524811 elements, 2845518 searches, 2279153 collisions (ratio: 0.800962)
[WPA] Compression: 424774798 input bytes, 1619588170 uncompressed bytes (ratio: 3.812816)
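
For reading the statistics above, here is a small sketch that recomputes the printed ratios from the raw counts. It assumes the hash table "ratio" is collisions divided by searches and the compression "ratio" is uncompressed bytes divided by input bytes; that reading reproduces the figures in the dump. The load-factor column (elements divided by size) is not in the original output and is added only for illustration. All Python names are made up for this example.

```python
# Raw counts copied from the statistics dump:
# name -> (size, elements, searches, collisions)
TABLES = {
    "GIMPLE type table": (1048573, 570402, 5098430, 3158421),
    "GIMPLE type hash table": (4194301, 1441169, 44401918, 37071081),
    "GIMPLE canonical type table": (65521, 49079, 896788, 575628),
    "GIMPLE canonical type hash table": (1048573, 524811, 2845518, 2279153),
}

# [WPA] Compression line: compressed input bytes and uncompressed bytes.
INPUT_BYTES = 424774798
UNCOMPRESSED_BYTES = 1619588170


def collision_ratio(searches, collisions):
    """Collisions per search (assumed meaning of the printed ratio)."""
    return collisions / searches if searches else 0.0


def load_factor(size, elements):
    """Fraction of slots in use; not part of the original dump."""
    return elements / size if size else 0.0


def compression_ratio(input_bytes, uncompressed_bytes):
    """Uncompressed size relative to the compressed input."""
    return uncompressed_bytes / input_bytes if input_bytes else 0.0


for name, (size, elements, searches, collisions) in TABLES.items():
    print(
        f"{name}: collisions/search {collision_ratio(searches, collisions):.6f}, "
        f"load factor {load_factor(size, elements):.3f}"
    )

print(f"compression: {compression_ratio(INPUT_BYTES, UNCOMPRESSED_BYTES):.6f}")
```

Under this reading, the type hash table accounts for by far the most lookups (about 44 million searches with roughly 0.83 collisions each), which fits the observation that hashing is still prominent in the profile.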