http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44563
--- Comment #11 from Jan Hubicka <hubicka at gcc dot gnu.org> 2010-12-13 13:22:28 UTC ---
Patched compiler at -O2 now shows:

 integration           : 166.20 (16%) usr   0.19 ( 1%) sys 166.86 (15%) wall   92691 kB ( 4%) ggc
 tree CCP              : 792.75 (74%) usr   0.15 ( 1%) sys 794.63 (73%) wall   66560 kB ( 3%) ggc

integration is probably the overhead of splitting BBs. I wonder what makes
tree CCP so slow; it is probably worth investigating. The profile shows:

2009425  73.3643  cc1             cc1             gsi_for_stmt
398058   14.5331  cc1             cc1             gimple_set_bb
22230     0.8116  libc-2.11.1.so  libc-2.11.1.so  _int_malloc
14339     0.5235  cc1             cc1             gimple_split_block
11727     0.4282  libc-2.11.1.so  libc-2.11.1.so  memset
10411     0.3801  libc-2.11.1.so  libc-2.11.1.so  _IO_vfscanf
9095      0.3321  cc1             cc1             htab_delete
6061      0.2213  cc1             cc1             bitmap_set_bit
5990      0.2187  no-vmlinux      no-vmlinux      /no-vmlinux
5912      0.2158  libc-2.11.1.so  libc-2.11.1.so  _int_free
5077      0.1854  libc-2.11.1.so  libc-2.11.1.so  malloc_consolidate
4516      0.1649  cc1             cc1             htab_find_slot_with_hash
4515      0.1648  opreport        opreport        /usr/bin/opreport
4284      0.1564  libc-2.11.1.so  libc-2.11.1.so  free
4234      0.1546  libc-2.11.1.so  libc-2.11.1.so  malloc
4106      0.1499  cc1             cc1             htab_traverse_noresize
3737      0.1364  libc-2.11.1.so  libc-2.11.1.so  calloc
3197      0.1167  cc1             cc1             eq_node
2996      0.1094  cc1             cc1             df_note_compute
2632      0.0961  cc1             cc1             ggc_internal_alloc_stat
2476      0.0904  cc1             cc1             bitmap_bit_p

Other passes are under 10s each. Maybe tree-ccp just gets the overhead of
merging the large BBs into a single one?

Honza