> > Walks of things like DF_REF_INSN_USES were showing up high in the profile > > of a fold-const.ii compilation. These reference lists are represented > > as pointers to null-terminated lists of pointers, and since there's > > little locality when walking all insns, each loop over the uses or defs > > generally has two major cache misses before it can do anything > > (or one major cache miss before doing nothing), on top of accessing > > the underlying df_insn_info. Also, for -O0, the overhead of mallocing > > lots of small arrays is itself noticeable. > > > > I don't think there's any real need for this representation. Each > > df_ref belongs to exactly one of these null-terminated pointer arrays, > > so using a normal linked list would be more efficient memory-wise > > (because we'd save on the null terminator and separate malloced memory). > > I think situation here is simliar to ipa-ref. Here I have two arrays > in ipa_ref_list: > /* Store actual references in references vector. */ > vec<ipa_ref_t, va_gc> *references; > /* Referring is vector of pointers to references. It must not live in GGC > space > or GGC will try to mark middle of references vectors. */ > vec<ipa_ref_ptr> GTY((skip)) referring; > > So all references are stored in an array, while all referring are stored > as array of pointers to references within the arrays of refering nodes. > > On resize of references array I need to check if all references has moved & > update referring arrays accordingly, but that is not so big deal. Also users > needs to be aware of fact that refernces do not have stable addresses. > > Perhaps this representation would make sense for df? Especially when > references > are collected for given function and thus we know the size of references array ^^^^^^^^ instruction
Basically I think that most of time we can allocate the array only after we know the size. Honza > in advance? > > Honza > > > > The idea might have been to allow the array to be sorted easily. > > That doesn't really apply now though. We collect the references in a > > df_collection_rec and sort them there before populating the df_insn_info. > > After that we just insert single elements or merge two sorted lists. > > (Both of these are currently done as full qsorts, but don't need to be.) > > > > Using a linked list gives a consistent 2% compile-time improvement for > > fold-const.ii -O0 and ~1% for various -O2 compiles I tried. The df > > routines do still show up high on the profile though. > > > > Tested on x86_64-linux-gnu. OK to install? > > > > Thanks, > > Richard