Am So., 11. Okt. 2020 um 03:28 Uhr schrieb Bruno Haible <br...@clisp.org>: > > Hi Marc, > > Some comments: > > * GL_HAMT_THREAD_SAFE can be defined to 1 not only if > __STDC_VERSION__ >= 201112 && ! defined __STD_NO_ATOMICS__ > but also if > __GNUC__ + (__GNUC_MINOR >= 9) > 4
Fixed. > (see the other mail). > > * Still '#ifdef GL_HAMT_THREAD_SAFE'. Fixed. > * Typo s/comparision/comparison/ Fixed. > * Hamt_processor: The comment does not say what the return value means. Maybe > it > would be clearer if this type was moved down to the "Iteration" section. Moved. > * hamt_lookup: If the caller is allowed to modify the payload stored in the > returned entry, this function should return a 'Hamt_entry *', not a > 'const Hamt_entry *'. Just like 'strchr' and 'strstr' return a 'char *', > not a 'const char *'. Unless the caller knows what it does, modifying the payload is not a good idea because the entry is shared between different hamts. If the caller really wants to modify the payload, it has to do an explicit type cast (which is safe). > * In trienode_count, you don't need count_one_bits_l, only count_one_bits. > It's OK to assume that 'uint32_t' and 'unsigned int' are the same type. Okay. > * If subtrie_lookup was inlined in entry_lookup, you could get rid of the > forward declaration of entry_lookup, and compilers would be more likely > to turn the recursion of entry_lookup into a loop (tail-recursion > elimination). I would prefer to have smaller functions. Are there any relevant compilers who do not inline the function at higher recursion levels? > * If subtrie_do_while was inlined in entry_do_while, you could get rid of the > forward declaration of entry_do_while. > * It would be useful to be able to construct modules 'hamt-set' and > 'hamt-map', > analogous to 'hash-set' and 'hash-map', but the hamt API currently has two > issues that block it: > > 1) The iterator API is such that you cannot write the iteration using a > loop; > it involves a function call that is active during the entire iteration. > This is also problematic for two other reasons: > > - Creating a closure in C (unlike Scheme or Lisp) is tedious hand-work. > (The GNU C extension that allows local functions inside functions [1] > is not a solution, because > - it is unportable, > - it forces an executable stack on many platforms, which is > considered bad for security.) > > - In Common Lisp, iteration through a hash table is available not only > through a function-call interface (MAPHASH [2]) but also through an > iteration-loop interface (WITH-HASH-TABLE-ITERATOR [3], LOOP [4]). > > For example, in GNU clisp, LOOP expands to > > [1]> (macroexpand-1 '(loop for x being each hash-key of h do (foo > x))) > (MACROLET ((LOOP-FINISH NIL (SYSTEM::LOOP-FINISH-ERROR))) > (BLOCK NIL > (LET ((#:WHTI-3214 (SYSTEM::HASH-TABLE-ITERATOR H))) > (MULTIPLE-VALUE-BIND (#:MORE?3215 #:HASH-KEY-3216 > #:HASH-VALUE-3217) > (SYSTEM::HASH-TABLE-ITERATE #:WHTI-3214) > (DECLARE (IGNORE #:HASH-VALUE-3217)) > (LET ((X NIL)) > (LET NIL > (MACROLET ((LOOP-FINISH NIL '(GO SYSTEM::END-LOOP))) > (TAGBODY SYSTEM::BEGIN-LOOP (UNLESS #:MORE?3215 (LOOP-FINISH)) > (SETQ X #:HASH-KEY-3216) (PROGN (PROGN (FOO X))) > (MULTIPLE-VALUE-SETQ > (#:MORE?3215 #:HASH-KEY-3216 #:HASH-VALUE-3217) > (SYSTEM::HASH-TABLE-ITERATE #:WHTI-3214)) > (GO SYSTEM::BEGIN-LOOP) SYSTEM::END-LOOP > (MACROLET > ((LOOP-FINISH NIL (SYSTEM::LOOP-FINISH-WARN) > '(GO SYSTEM::END-LOOP)))))))))))) ; > > You can see that there is an initial function call > HASH-TABLE-ITERATOR > upfront, and then a function call to HASH-TABLE-ITERATE in each round > of the loop. > > This principle makes it possible to iterate e.g. through a hash table > and a list in parallel. In Scheme, it is even more relevant because you have call/cc so any iteration can be suspended and restored any number of times. I thought this could be solved by storing a "next" pointer in the payload but this doesn't work with hamts sharing entries... So I will add some iterator interface as well. Preferably one that can also be used to implement Hamts in Scheme. > 2) We have a dilemma regarding use of malloc vs. xmalloc. While modules > meant for use in programs can readily use xmalloc, modules meant for use > (also) in libraries cannot use xmalloc, since a library is not supposed > to terminate the program, even when memory is tight. GMP more or less has this behavior. > The solution we found for this dilemma is that the *-set and *-map > modules > use just malloc, and we have thin wrapper modules (xset, xmap) that do > call xalloc_die() in case of out-of-memory. > > Unfortunately, the API of many of the functions need to be adjusted to > cope with the possibility of an ENOMEM failure. That is tedious work, and > the code does not look so pretty afterwards... But I see no other way to > make it fit for use in libraries. The bigger problem is that it mustn't make the code slower for the 99,99999% of the use cases where there is either enough memory or calling xalloc_die is the only reasonable action.