On Wed, Jan 29, 2014 at 4:24 PM, Richard Biener <richard.guent...@gmail.com> wrote: > On Wed, Jan 29, 2014 at 11:02 AM, Jakub Jelinek <ja...@redhat.com> wrote: >> On Tue, Jan 28, 2014 at 07:26:47AM +0100, Jakub Jelinek wrote: >>> > I wonder if this is just some of --enable-checking tests in dwarf2out >>> > going wild >>> > or if it is just expensive sanity checking code? >>> > I used to have chroot environment for 32bit builds, I will need to >>> > re-install it now. >>> >>> variable tracking :2914.85 (83%) usr 1.88 ( 7%) sys2931.22 (82%) >>> wall 80844 kB ( 3%) ggc >>> var-tracking dataflow : 18.19 ( 1%) usr 0.19 ( 1%) sys 18.49 ( 1%) >>> wall 10899 kB ( 0%) ggc >>> var-tracking emit : 29.41 ( 1%) usr 0.11 ( 0%) sys 29.65 ( 1%) >>> wall 148128 kB ( 6%) ggc >>> TOTAL :3525.97 25.73 3570.33 >>> 2321043 kB >>> >>> So, strangely both vt_find_locations and vt_emit_notes, typically the most >>> expensive ones, >>> are quite unexpensive and most of the time is spent elsewhere, in >>> vt_initialize? >> >> So, most of the time seems to be spent in cselib.c remove_useless_values >> (both from Ctrl-C in gdb profiling, and callgrind). For callgrind I've >> actually built 64-bit cc1plus with --enable-checking=release, and still >> compiled >> the same --enable-checking=yes,rtl -m32 -O2 -g insn-recog.c, the build then >> took just 14 minutes instead of 60 minutes, and in that case only about 30% >> of compile time has been spent in var-tracking and 20% of compile time >> in remove_useless_values in particular. >> >> The problem with remove_useless_values is that we have quickly very big >> cselib hash table (over 200000 entries) and very large number of very small >> basic blocks (over 34000) and at the end of every basic block we call >> cselib_preserve_only_values which walks the whole hash table, and e.g. >> references_value_p is called 845114869x from discard_useless_locs. > > It looks like remove_useless_values () is an "optimization" (shrinks the > hashtable) and thus can be omitted? After all it's already limited: > > if (n_useless_values > MAX_USELESS_VALUES > /* remove_useless_values is linear in the hash table size. Avoid > quadratic behavior for very large hashtables with very few > useless elements. */ > && ((unsigned int)n_useless_values > > (cselib_hash_table.elements () - n_debug_values) / 4)) > remove_useless_values (); > > why don't we remove them at the point we discover them useless? > Thus when n_useless_values increases? Can't we do that in O(1) > there (with a hashtable lookup?)
Actually best would be to avoid generating so many useless values in the first place ... still, quadraticness is a complete no-go. >> A micro-optimization could be e.g. to turn references_value_p into a >> template where only_useless would be a template parameter rather than actual >> parameter (due to recursion inlining doesn't help here) or just two >> functions. >> >> Also, for RTL checking, I wonder if the functions like reference_values_p >> and similar ones that use GET_RTX_FORMAT/GET_RTX_LENGTH to walk the elements >> couldn't use special non-checking macros in doing so if the compiler can't >> figure out checking is redundant there because it is being performed by hand >> by the function (haven't verified). And, perhaps also an approach similar >> to http://gcc.gnu.org/ml/gcc-patches/2014-01/msg00140.html for >> GET_RTX_FORMAT and/or GET_RTX_LENGTH (so that at least for the cases where >> the compiler knows which rtx code it is (if some code is guarded with >> specific GET_CODE () == test), it could avoid loading from the const >> arrays). >> >> Anyway, I guess more important is whether all the values in the >> cselib_hash_table will be ever useful for some lookup, or if there are >> e.g. values that only reference preserved values where none of those >> referenced values have any locations other than preserved values. >> >> Or if we can somehow quickly find out what VALUEs have changed during >> processing of the last bb and only process that subset instead of all the >> hash table entries all the time. >> >> Alex? Your thoughts? >> >> Jakub