On Fri, Jun 17, 2011 at 05:06:29PM +0200, Loïc Minier wrote:
> On Fri, Jun 17, 2011, Colin Watson wrote:
> > I really don't want to do this.  I'd rather optimise mandb.
>
> Ok; just so that I understand, is this about avoiding confusion for
> the users, or complexity, or...?
Splitting packages is for life, not just for Christmas.  Once I do it,
I'm pretty much stuck with it, or at least some vestige of it, forever.
Thus, I'm reluctant to do it solely for performance reasons which I feel
can be addressed in other ways.  If I exhaust the possibilities for
optimising mandb without reaching acceptable performance, then I'm
willing to revisit splitting some tools out into a separate package.

> I guess we can repurpose this bug to "man-db is too slow on armel/ppc"
> or something, which are arches where I've witnessed this.

Actually, I think I could do a lot better generally.  For example,
compare these two operations, which have identical output, with a hot
cache on a reasonably decent i386 laptop with a fast SSD:

<cjwatson@sarantium /usr/share/man>$ time find -type f | xargs cat | zcat >/dev/null

real    0m2.494s
user    0m2.440s
sys     0m0.324s

<cjwatson@sarantium /usr/share/man>$ time find -type f | xargs -n1 zcat >/dev/null

real    1m27.988s
user    0m7.940s
sys     0m16.373s

mandb is currently acting more like the latter than the former (and, for
that matter, has a similar runtime).  OK, so it isn't actually execing
zcat every time: instead it forks and has one of the child processes run
an in-process function which uses zlib, thus saving an execve per process
and all the associated process startup costs, and I seem to remember that
that made a noticeable performance difference.  But even so, simply
forking 20000-odd processes (as in my example, which is in a fairly
complete environment with lots of manual pages installed; probably very
much less in a build chroot) isn't cheap.  In fact, strace indicates that
mandb is forking on the order of four processes per page.  Just the cost
of forking, exiting, and waiting for that number of processes comes to 23
seconds on my system, out of mandb's total runtime of around 100 seconds,
and I strongly suspect that doing any non-trivial multi-process work like
this gives the scheduler trouble and slows everything down further due to
the sheer number of context switches involved (trashing CPU caches, doing
TLB flushes, and so on).

My plan here is to beef up libpipeline so that I can do all of mandb's
work in a single process.  In fact, I've had a to-do entry in the code
for some time: "ideally, could there be a facility to execute
non-blocking functions without needing to fork?"  These would be
something like coroutines or generators.  If I do this in libpipeline,
then the changes in man-db can be very small and wouldn't make the code
much harder to maintain: it would still look like running a pipeline of
processes, except that some of the stages would happen to be non-forking
function calls, much as some of them can currently be function calls
executed in a child process.  The called functions would just need to be
written such that they can yield control and be re-entered later rather
than blocking.

If that doesn't speed things up enough, then I can look at having more
things done by passing buffers around rather than reading and writing
over pipes.  That breaks some useful abstraction layers, though (less
common compression methods are implemented by calling programs like
bzcat, and I'd rather not have to link directly against lots of
decompression libraries), and I'm not sure that it will be necessary.
My instinct is that I can make a very serious dent in mandb's runtime
without resorting to that.
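To make that last point concrete, here is a rough sketch of the shape
such a non-forking stage might take.  The gz_stage_* names and the
driver loop are purely illustrative, not a proposed libpipeline
interface, but zlib's inflate() already has exactly this kind of
incremental, resumable calling convention, so a decompression stage
written this way never has to block or fork:

/* Rough sketch only: a re-entrant, non-forking decompression stage.
 * The gz_stage_* names are illustrative and not part of any real
 * libpipeline interface.  Build with: cc -o gzstage gzstage.c -lz
 */
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <zlib.h>

struct gz_stage {
    z_stream strm;
    int finished;
};

/* 32 + MAX_WBITS tells zlib to auto-detect gzip or zlib headers. */
static int gz_stage_init (struct gz_stage *st)
{
    memset (st, 0, sizeof *st);
    return inflateInit2 (&st->strm, 32 + MAX_WBITS) == Z_OK ? 0 : -1;
}

/* One step: consume some of the input on offer, fill as much of the
 * output buffer as it can, and return to the caller instead of
 * blocking.  Returns the number of output bytes produced, or -1 on
 * error; st->strm.avail_in says how much input was left unconsumed.
 */
static ssize_t gz_stage_step (struct gz_stage *st,
                              const unsigned char *in, size_t in_len,
                              unsigned char *out, size_t out_len)
{
    int ret;

    st->strm.next_in = (unsigned char *) in;
    st->strm.avail_in = (uInt) in_len;
    st->strm.next_out = out;
    st->strm.avail_out = (uInt) out_len;

    ret = inflate (&st->strm, Z_NO_FLUSH);
    if (ret == Z_STREAM_END)
        st->finished = 1;
    else if (ret != Z_OK && ret != Z_BUF_ERROR)
        return -1;

    return (ssize_t) (out_len - st->strm.avail_out);
}

static void gz_stage_free (struct gz_stage *st)
{
    inflateEnd (&st->strm);
}

/* Trivial driver: behaves like zcat for a single stream on stdin. */
int main (void)
{
    struct gz_stage st;
    unsigned char in[16384], out[16384];
    size_t n;

    if (gz_stage_init (&st) != 0)
        return 1;
    while (!st.finished && (n = fread (in, 1, sizeof in, stdin)) > 0) {
        const unsigned char *p = in;
        size_t left = n;
        ssize_t got;

        do {
            got = gz_stage_step (&st, p, left, out, sizeof out);
            if (got < 0) {
                gz_stage_free (&st);
                return 1;
            }
            fwrite (out, 1, (size_t) got, stdout);
            p += left - st.strm.avail_in;
            left = st.strm.avail_in;
        } while (!st.finished &&
                 (left > 0 || (size_t) got == sizeof out));
    }
    gz_stage_free (&st);
    return st.finished ? 0 : 1;
}

The main() loop is only standing in for whatever scheduling the library
would end up doing; the point is that the stage keeps its state in a
structure, does a bounded amount of work per call, and hands control
straight back to its caller.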
Cheers,

-- 
Colin Watson                                       [cjwat...@debian.org]