i've just done some simple tests and found the following:

1. on one of my systems (a laptop with 128MB RAM), dlocatedb takes up 696KB of disk space. a plain text dump of it takes up 3.0MB.

text dump generated with 'dlocate / > dlocate.txt'

i then made sure that both dlocatedb and dlocate.txt were not in the disk cache by catting approx 150MB of files to /dev/null.

# ls -lh dlocate.txt dlocatedb
-rw-r--r-- 1 root root 3.0M 2009-05-30 11:14 dlocate.txt
-rw-r--r-- 1 root root 696K 2009-05-30 06:29 dlocatedb

# wc -l dlocate.txt
62090 dlocate.txt

2. searching the dlocatedb with locate for a single file takes 1.091 seconds. grepping for the same file in the text dump takes 0.584 seconds.

the filename "usr/share/doc/apache2.2-bin/changelog.gz" was chosen because it is the very last line in dlocate.txt.

# time dlocate usr/share/doc/apache2.2-bin/changelog.gz
apache2.2-bin: /usr/share/doc/apache2.2-bin/changelog.gz

real    0m1.091s
user    0m0.484s
sys     0m0.044s

# time grep usr/share/doc/apache2.2-bin/changelog.gz dlocate.txt
apache2.2-bin: /usr/share/doc/apache2.2-bin/changelog.gz

real    0m0.584s
user    0m0.008s
sys     0m0.020s

3. repeating the test immediately with both files cached in RAM gives 0.512 seconds (dlocate) and 0.034 seconds (grep).

# time dlocate usr/share/doc/apache2.2-bin/changelog.gz
apache2.2-bin: /usr/share/doc/apache2.2-bin/changelog.gz

real    0m0.512s
user    0m0.476s
sys     0m0.032s

# time grep usr/share/doc/apache2.2-bin/changelog.gz dlocate.txt
apache2.2-bin: /usr/share/doc/apache2.2-bin/changelog.gz

real    0m0.034s
user    0m0.012s
sys     0m0.024s

on the first run, grep is twice as fast as dlocate. on subsequent runs, it is about 15 times faster.

there appears to be no advantage whatsoever to using frcode any more (in fact, locate is much slower than plain grep), and disk space is so cheap that the difference between 700KB and 3MB is irrelevant.

accordingly the solution to this on-going dlocate/locate/mlocate confusion will be the release of a new version of dlocate that doesn't use or depend on frcode or locate, but instead just uses a plain text file and grep.

i have a few other things on my TODO list for dlocate. I'll get them done and release a new version.
hopefully this weekend if real life doesn't intrude.

i think i'll also add a few more options to dlocate to take advantage of GNU grep's ability to use different matchers. from grep(1):

   Matcher Selection
       -E, --extended-regexp
              Interpret PATTERN as an extended regular expression
              (ERE, see below). (-E is specified by POSIX.)

       -F, --fixed-strings
              Interpret PATTERN as a list of fixed strings, separated
              by newlines, any of which is to be matched. (-F is
              specified by POSIX.)

       -G, --basic-regexp
              Interpret PATTERN as a basic regular expression (BRE,
              see below). This is the default.

       -P, --perl-regexp
              Interpret PATTERN as a Perl regular expression. This is
              highly experimental and grep -P may warn of
              unimplemented features.

and i'll support -w too:

       -w, --word-regexp
              Select only those lines containing matches that form
              whole words. The test is that the matching substring
              must either be at the beginning of the line, or preceded
              by a non-word constituent character. Similarly, it must
              be either at the end of the line or followed by a
              non-word constituent character. Word-constituent
              characters are letters, digits, and the underscore.

this will change the way that dlocate works (in that it does a regexp search rather than a plain text search) but, IMO, that's far more useful.

GNU locate has an option to do a regexp search but the timing comparison gets even more in favour of grep:

# time locate.findutils -d /var/lib/dlocate/dlocatedb -r usr/share/doc/apache2.2-bin/changelog.gz
apache2.2-bin: /usr/share/doc/apache2.2-bin/changelog.gz

real    0m1.796s
user    0m1.640s
sys     0m0.012s

1.796 seconds for the first run after flushing disk cache, compared to 0.584 seconds for grep. grep is over 3 times faster.

on subsequent runs, the regexp locate still takes over 1.6 seconds, while grep takes 0.034 seconds. over 47 times faster. obviously, and not at all surprisingly, grepping an frcode database is not a very efficient operation.

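to make the plan concrete, here is a rough sketch of what a grep-backed lookup with matcher selection might look like. this is not the actual dlocate code: the function name lookup, the DLOCATE_TXT variable and the default file name are all made up for illustration, and only the grep options quoted above (-E, -F, -G, -P, -w) are assumed.

```shell
#!/bin/sh
# Hypothetical sketch only, not the real dlocate implementation.
# DB points at a plain text dump such as the dlocate.txt used in the
# tests above; DLOCATE_TXT is an invented override for illustration.
DB="${DLOCATE_TXT:-dlocate.txt}"

# usage: lookup [-E|-F|-G|-P] [-w] [--] PATTERN
lookup() {
    matcher="-G"    # grep's default matcher (basic regexp)
    word=""
    while [ $# -gt 0 ]; do
        case "$1" in
            -E|-F|-G|-P) matcher="$1" ;;   # pass the matcher through to grep
            -w) word="-w" ;;               # whole-word matching
            --) shift; break ;;            # end of options
            *) break ;;                    # first non-option is the pattern
        esac
        shift
    done
    if [ $# -ne 1 ]; then
        echo "usage: lookup [-E|-F|-G|-P] [-w] [--] PATTERN" >&2
        return 2
    fi
    # $word is left unquoted on purpose so it disappears when empty
    grep "$matcher" $word -e "$1" -- "$DB"
}
```

with that in place, 'lookup -F usr/share/doc/apache2.2-bin/changelog.gz' would do a fixed-string search of the dump, and 'lookup -E' or 'lookup -w' would select the other behaviours. passing the pattern via -e keeps a pattern that starts with a dash from being read as an option.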
# time locate.findutils -d /var/lib/dlocate/dlocatedb -r usr/share/doc/apache2.2-bin/changelog.gz
apache2.2-bin: /usr/share/doc/apache2.2-bin/changelog.gz

real    0m1.640s
user    0m1.628s
sys     0m0.008s

craig

--
craig sanders <c...@taz.net.au>

BOFH excuse #319:
Your computer hasn't been returning all the bits it gets from the Internet.