Walks of things like DF_REF_INSN_USES were showing up high in the profile
of a fold-const.ii compilation.  These reference lists are represented
as pointers to null-terminated lists of pointers, and since there's
little locality when walking all insns, each loop over the uses or defs
generally has two major cache misses before it can do anything
(or one major cache miss before doing nothing), on top of accessing
the underlying df_insn_info.  Also, for -O0, the overhead of mallocing
lots of small arrays is itself noticeable.

I don't think there's any real need for this representation.  Each
df_ref belongs to exactly one of these null-terminated pointer arrays,
so using a normal linked list would be more efficient memory-wise
(because we'd save on the null terminator and separate malloced memory).

The idea might have been to allow the array to be sorted easily.
That doesn't really apply now though.  We collect the references in a
df_collection_rec and sort them there before populating the df_insn_info.
After that we just insert single elements or merge two sorted lists.
(Both of these are currently done as full qsorts, but don't need to be.)

Using a linked list gives a consistent 2% compile-time improvement for
fold-const.ii -O0 and ~1% for various -O2 compiles I tried.  The df
routines do still show up high on the profile though.

Tested on x86_64-linux-gnu.  OK to install?

Thanks,
Richard

Reply via email to