On Sun, Feb 14, 2010 at 03:02:14PM +0100, Jim Meyering wrote:
> Colin Watson wrote:
> > I looked at converting man-db to use Gnulib's hash implementation rather
> > than its own.  One obstacle seems to be that there is one place where
> > man-db cares about the difference between a key not being in the hash at
> > all, and a key being in the hash but with a NULL value (in Perl syntax,
> > the difference between 'not exists $hash{$key}' and 'not defined
> > $hash{$key}').  I haven't spent much time looking at the relevant code
> > and so it may well be that it can be rewritten to avoid relying on this
> > distinction, but in general it does seem like a useful distinction to
> > draw; most high-level languages' built-in hash implementations allow you
> > to ask whether a key exists without testing the truth of its value.
>
> Hi Colin,
>
> It's good to hear that man-db might use gnulib's hash function.
>
> If the hash module were really unable to distinguish those cases, we
> would have found it too limiting (and fixed it) long ago ;-)

I confess I did not check how new the module was; written in 1992, I can
see your point. ;-)

> The API for hash_insert requires that you pass it a non-NULL-pointer
> (usually to a key/value pair).

Oops.  My bad for only code-inspecting the lookup functions ... the
hash_insert documentation is of course perfectly clear.

> Note that if you want to use the hash table as a simple set, there is
> no need to have a separate key,value struct, since the value would be
> irrelevant.  In that case, you can save storage and indirection costs
> by passing the key string directly to hash_insert (always assuming the
> hash_compare and hash_hash functions are aware).

Indeed, in the case at hand it looks as though I just need a set, so
you're right, setting key == value will be quite adequate.

> Perhaps this can be attributed to the lack of documentation and/or
> examples?  For reference, there are many uses in gnulib itself
> and in the tar and coreutils packages.

I think the biggest thing I found missing in this regard was an overview
of the generic data structure modules provided by Gnulib.  For instance,
I wasted some time fiddling around with oset before realising that I
didn't really need an *ordered* set in most cases and that a hash would
do just fine.  Some kind of index of this category of modules would be
wonderful, for orientation purposes.

Thanks for the cluebat,

--
Colin Watson                                       [cjwat...@debian.org]
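
For illustration only, here is a rough sketch of the "key == value" set idea discussed in this message. It is a toy, self-contained table and does not call Gnulib; the names toy_set, toy_insert, toy_lookup and the fixed bucket count are all hypothetical. The point it tries to show is that once insertion refuses NULL entries, a NULL result from lookup can only mean "key not present", so no separate key/value struct is needed for set membership.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy fixed-size string set.  NOT Gnulib's hash module: every name
   here (toy_set, toy_insert, toy_lookup) is made up for illustration.  */
#define TOY_BUCKETS 64

struct toy_set
{
  const char *slot[TOY_BUCKETS];   /* the stored entry is the key itself */
};

/* Simple multiplicative string hash, reduced to a bucket index.  */
static size_t
toy_hash (const char *key)
{
  size_t h = 5381;
  for (; *key; key++)
    h = h * 33 + (unsigned char) *key;
  return h % TOY_BUCKETS;
}

/* Insert KEY (the pointer is stored, not a copy).  Return the stored
   pointer, which is the earlier one if an equal key is already present,
   or NULL if the table is full.  A NULL key is rejected, so NULL never
   stands for a stored entry.  */
static const char *
toy_insert (struct toy_set *set, const char *key)
{
  assert (key != NULL);
  size_t start = toy_hash (key);
  for (size_t i = 0; i < TOY_BUCKETS; i++)
    {
      size_t idx = (start + i) % TOY_BUCKETS;   /* linear probing */
      if (set->slot[idx] == NULL)
        return set->slot[idx] = key;
      if (strcmp (set->slot[idx], key) == 0)
        return set->slot[idx];
    }
  return NULL;
}

/* Return the stored pointer equal to KEY, or NULL if KEY is absent.  */
static const char *
toy_lookup (const struct toy_set *set, const char *key)
{
  size_t start = toy_hash (key);
  for (size_t i = 0; i < TOY_BUCKETS; i++)
    {
      size_t idx = (start + i) % TOY_BUCKETS;
      if (set->slot[idx] == NULL)
        return NULL;               /* empty slot ends the probe chain */
      if (strcmp (set->slot[idx], key) == 0)
        return set->slot[idx];
    }
  return NULL;
}
```

Starting from a zero-initialised struct toy_set, toy_lookup (&set, "man") returns NULL before toy_insert (&set, "man") and a non-NULL pointer afterwards, which is the "exists" test without any value attached.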