Bruno Haible <br...@clisp.org> writes: > - The functions u#_grapheme_len and u#_grapheme_next are > redundant with each other: > u8_grapheme_len (s, n) ::= > (n == 0 ? 0 : (u8_grapheme_next (s, s + n) ? : end) - s). > I would not offer two APIs that are _that_ similar for > doing the same thing. (The functions in unistr.h do > have similar redundancies, but there the rationale is > similarity with the ISO C wchar_t API.) And the functions > returning a pointer are more symmetric when you consider > forward and backward iteration together. > > So my slight preference would be to remove the > u#_grapheme_len functions and keep u#_grapheme_next instead.
OK, that's fine. I'll push a change that removes the *_len functions soon. > - Naming of functions: Now some functions have > _grapheme_cluster_ in their name, some have _grapheme_ only. > How about using either _grapheme_ consistently or > _grapheme_cluster_ consistently? > > It's not necessary to rename the files and modules, though. I prefer _grapheme_ over _grapheme_cluster_ for use in function names, because it is shorter but as far as I can tell it is not ambiguous with _grapheme_cluster_. Only one function has grapheme_cluster in its name. I'll push a change that renames it soon. > - Code duplication: In lib/ (not in tests/) it is often possible to > use the same algorithm for the 8-bit unit, 16-bit unit, and 32-bit unit > cases. It helps future maintenance if the same algorithm is present > in 1 file and not 3 files. Would you disagree if I use the de-duplication > technique (with a parametrized .h file) on your new files? > > In the tests, unfortunately, this is often not possible. This sounds fine with me, if you want to do it. > - Typo in unigbrk.h: 's/ ben / been /' I pushed a fix. I'll email it out with the others, too, in a few minutes. Thank you for all the feedback! -- Ben Pfaff http://benpfaff.org