On 27/04/19(Sat) 21:55, Nathanael Rensen wrote:
> The diff below speeds up ld.so library initialisation where the dependency
> tree is broad and deep, such as samba's smbd which links over 100 libraries.
>
> See for example https://marc.info/?l=openbsd-misc&m=155007285712913&w=2
>
> See https://marc.info/?l=openbsd-tech&m=155637285221396&w=2 for part 1
> that speeds up library loading.
>
> The timings below are for /usr/local/sbin/smbd --version:
>
> Timing without either diff  : 6m45.67s real  6m45.65s user  0m00.02s system
> Timing with part 1 diff only: 4m42.88s real  4m42.85s user  0m00.02s system
> Timing with part 2 diff only: 2m02.61s real  2m02.60s user  0m00.01s system
> Timing with both diffs      : 0m00.03s real  0m00.03s user  0m00.00s system
>
> Note that these timings are for a build of a recent samba master tree
> (linked with kerberos) which is probably slower than the OpenBSD port.

Nice numbers.  Could you explain in words what your diff is doing?  Why
does splitting the flag help?  Is it because some ctors/initarray are
being initialized multiple times currently?  Or is it just to prevent
some traversal?  In that case does that mean the `STAT_VISITED' flag is
removed too early?

> Index: libexec/ld.so/loader.c
> ===================================================================
> RCS file: /cvs/src/libexec/ld.so/loader.c,v
> retrieving revision 1.177
> diff -u -p -p -u -r1.177 loader.c
> --- libexec/ld.so/loader.c	3 Dec 2018 05:29:56 -0000	1.177
> +++ libexec/ld.so/loader.c	27 Apr 2019 13:24:02 -0000
> @@ -749,15 +749,15 @@ _dl_call_init_recurse(elf_object_t *obje
>  {
>  	struct dep_node *n;
>  
> -	object->status |= STAT_VISITED;
> +	int visited_flag = initfirst ? STAT_VISITED_1 : STAT_VISITED_2;
> +
> +	object->status |= visited_flag;
>  
>  	TAILQ_FOREACH(n, &object->child_list, next_sib) {
> -		if (n->data->status & STAT_VISITED)
> +		if (n->data->status & visited_flag)
>  			continue;
>  		_dl_call_init_recurse(n->data, initfirst);
>  	}
> -
> -	object->status &= ~STAT_VISITED;
>  
>  	if (object->status & STAT_INIT_DONE)
>  		return;
> Index: libexec/ld.so/resolve.h
> ===================================================================
> RCS file: /cvs/src/libexec/ld.so/resolve.h,v
> retrieving revision 1.90
> diff -u -p -p -u -r1.90 resolve.h
> --- libexec/ld.so/resolve.h	21 Apr 2019 04:11:42 -0000	1.90
> +++ libexec/ld.so/resolve.h	27 Apr 2019 13:24:02 -0000
> @@ -125,8 +125,9 @@ struct elf_object {
>  #define STAT_FINI_READY	0x10
>  #define STAT_UNLOADED	0x20
>  #define STAT_NODELETE	0x40
> -#define STAT_VISITED	0x80
> +#define STAT_VISITED_1	0x80
>  #define STAT_GNU_HASH	0x100
> +#define STAT_VISITED_2	0x200
>  
>  	Elf_Phdr	*phdrp;
>  	int		phdrc;
>
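
To make the "is the flag removed too early?" question concrete, here is a
toy model.  It is not ld.so code: struct node, walk() and count_calls() are
made-up names, and the graph (16 layers of 2 nodes, each node depending on
both nodes of the next layer) is an assumption standing in for a broad and
deep dependency tree.  It only counts how often the recursion is entered
when the visited bit is cleared after the children have been walked, as the
current code does, versus left set, as the diff appears to do.

```c
#define DEPTH	16	/* number of layers in the toy graph */
#define WIDTH	2	/* nodes per layer */
#define VISITED	0x1	/* stand-in for a STAT_VISITED style bit */

struct node {
	int		 status;
	int		 nchild;
	struct node	*child[WIDTH];
};

static struct node layers[DEPTH][WIDTH];
static unsigned long calls;

/* Reset the graph: every node depends on all nodes of the next layer. */
static void
build(void)
{
	int d, w, c;

	for (d = 0; d < DEPTH; d++) {
		for (w = 0; w < WIDTH; w++) {
			struct node *n = &layers[d][w];

			n->status = 0;
			n->nchild = 0;
			if (d + 1 == DEPTH)
				continue;	/* last layer has no children */
			for (c = 0; c < WIDTH; c++)
				n->child[n->nchild++] = &layers[d + 1][c];
		}
	}
}

/*
 * Depth-first walk.  With clear_after != 0 the bit only guards against
 * cycles on the current path; with clear_after == 0 it stays set, so
 * every node is entered at most once.
 */
static void
walk(struct node *n, int clear_after)
{
	int i;

	calls++;
	n->status |= VISITED;

	for (i = 0; i < n->nchild; i++) {
		if (n->child[i]->status & VISITED)
			continue;
		walk(n->child[i], clear_after);
	}

	if (clear_after)
		n->status &= ~VISITED;
}

/* Rebuild the graph and return the number of walk() calls from the root. */
static unsigned long
count_calls(int clear_after)
{
	build();
	calls = 0;
	walk(&layers[0][0], clear_after);
	return calls;
}
```

On this graph count_calls(1) returns 65535 (2^16 - 1, since every shared
node is re-entered once per path that reaches it) and count_calls(0)
returns 31 (one entry per reachable node).  If that is what happens in
ld.so, a bit that is never cleared cannot be shared between the
initfirst pass and the regular pass, which would explain why the diff
needs two bits.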