Hi Paul,

On Wed, Feb 19, 2014 at 09:20:06AM -0500, Paul Smith wrote:
> Also it's probably worthwhile to hold this discussion on the
> bug-make@gnu.org mailing list; most of the people interested in GNU make
> development hang out there.

Ok, here we go. ;)

> On Wed, 2014-02-19 at 00:11 +0100, Bjoern Michaelsen wrote:
> > So I tried if putting an cachefile with an index of filenames at the
> > beginning and then just referencing those filenames helps.
>
> Hi Bjoern; thanks for your work on improving GNU make! It would be
> helpful to me if you could send along a quick description of exactly
> what this is doing that gives the speedup; I could read the code but it
> will be faster to just read an explanation. It's not exactly clear from
> the above.

It introduces a new "includedepcache" keyword. The first time GNU make runs
past it, it works just like a normal include, except that it tracks the
dependency relations described in the included file and writes them, in a
simplified and easier-to-parse format, into the file ${includefile}.cache.
The second time GNU make comes past the includedepcache statement, it checks
for the ${includefile}.cache file, and if that file is newer than
${includefile}, it reads the cache file instead of ${includefile}.

The format of the cache file is:

 <number of filenames>
 filename1
 filename2
 ...
 <number of dependency relations>
 <index of target1><index of dependency1><index of target2><index of dependency2>...

with the indices written in binary.

A usual LibreOffice build generates 1.3GB of dependency files for >8200
object files. As we don't want to open 8200 files on each make run, we
concatenate these files into one per library (and do some deduplication),
which brings us down to ~300MB[1] in standard make syntax. However, parsing
that still takes a lot of time, and more than it needs to: as the dependency
file for one library has 800 objects, you have:

 long_path_to_object1: long_path_to_header_with_string_types <more dependencies>
 long_path_to_object2: long_path_to_header_with_string_types <more dependencies>
 ...
 long_path_to_object800: long_path_to_header_with_string_types <more dependencies>

which is a lot of duplication.

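To make the format above concrete, here is a minimal Python sketch of writing
and reading such a cache. It is not the code from the depcache branch. The
function names are hypothetical, and the byte-level details are assumptions:
counts and indices are taken to be 32-bit little-endian unsigned integers,
and filenames are taken to be newline-terminated. Targets without any
dependencies are not stored in this sketch.

```python
import struct


def encode_depcache(deps):
    """Serialize {target: [dependency, ...]} into the indexed layout.

    Assumed layout (not confirmed by the actual implementation):
    uint32 name count, newline-terminated names, uint32 pair count,
    then (target index, dependency index) pairs as uint32 values.
    """
    names = []   # index -> filename
    index = {}   # filename -> index, so each name is stored only once

    def intern(name):
        if name not in index:
            index[name] = len(names)
            names.append(name)
        return index[name]

    pairs = []
    for target, prereqs in deps.items():
        t = intern(target)
        for prereq in prereqs:
            pairs.append((t, intern(prereq)))

    out = bytearray()
    out += struct.pack("<I", len(names))
    for name in names:
        out += name.encode("utf-8") + b"\n"
    out += struct.pack("<I", len(pairs))
    for t, p in pairs:
        out += struct.pack("<II", t, p)
    return bytes(out)


def decode_depcache(data):
    """Parse the layout written by encode_depcache back into a dict."""
    (n_names,) = struct.unpack_from("<I", data, 0)
    pos = 4
    names = []
    for _ in range(n_names):
        end = data.index(b"\n", pos)
        names.append(data[pos:end].decode("utf-8"))
        pos = end + 1
    (n_pairs,) = struct.unpack_from("<I", data, pos)
    pos += 4
    deps = {}
    for _ in range(n_pairs):
        t, p = struct.unpack_from("<II", data, pos)
        pos += 8
        # Names are resolved by index: no string parsing or lookup per pair.
        deps.setdefault(names[t], []).append(names[p])
    return deps
```

In this sketch a header shared by many objects is stored as text exactly
once, and every further reference to it costs one fixed-size index.
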
Instead of parsing "long_path_to_header_with_string_types" 800 times and
looking it up 800 times in the strcache, when parsing the cachefile this is
done once at the beginning.

For LibreOffice, parsing the cachefile is ~10 times faster than parsing the
standard make syntax, and parsing dependencies is then reduced to a
negligible part of a no-op incremental build. On my machine, using the
depcache it takes:

- 5.8 seconds to parse the ~134KLOC build description, which is very heavy on
  $(eval $(call))
- 0.6 seconds to parse the cachefiles for >8200 targets
- 1.3 seconds to stat all the targets

for a total of 7.7 seconds, while with include instead of includedepcache,
parsing the whole 300MB of generated dependencies instead of the cachefiles
yields:

- 5.8 seconds to parse the ~134KLOC build description, which is very heavy on
  $(eval $(call))
- 5.9 seconds to parse the generated dependencies in standard make syntax
- 1.3 seconds to stat all the targets

for a total of 13.0 seconds.

The current implementation for this can be found here:

 https://gerrit.libreoffice.org/gitweb?p=gnu-make-lo.git;a=shortlog;h=refs/heads/feature/depcache

It has tests, but no further documentation (apart from this mail) yet.

Best,

Bjoern

[1] see https://gerrit.libreoffice.org/gitweb?p=core.git;a=blob;f=solenv/bin/concat-deps.c;h=a64723f476d77f88c147545dc8844ac47c44dfb2;hb=22b709e84a7b6d38cab2dd37f2f2b28e0fc9d062
    if you really want to know all the gory details. I doubt that. ;)