> From: Gavin Smith <gavinsmith0...@gmail.com>
> Date: Sat, 14 Oct 2023 19:57:22 +0100
>
> It's all in the future, but I am slightly concerned about is duplicating
> in Texinfo existing system facilities. For example, for avoiding use of
> wcwidth, our use of which depends on setting a UTF-8 locale, and using
> the wchar_t type. Is every program that uses wcwidth supposed to supply
> their own implementation instead, and isn't this wasteful?

What other locale-specific functions do we need in addition to wcwidth?
If the list of those functions is short enough, we could replace them
all by the corresponding Gnulib/libunistring functions, and then we
could stop setting specific locales and relying on locale-specific libc
functions. That will give us locale-independent code which will work on
all systems.

> I don't know if libunistring aspires to become a standard system library
> for handling UTF-8 data but if we use it for other UTF-8 processing it
> would make sense to use it for collation.
>
> I suggest writing to Bruno Haible to ask if he has plans to include
> collation functionality in libunistring in the future. I am currently
> reading through "Unicode Technical Standard #10" and although I don't
> understand a lot of it yet, it seems feasible that we could implement it
> in C.

It is feasible, but implementing it from scratch is a lot of work, and
needs a large database (which we could take from the CLDR). But note
that CLDR is AFAIK locale-dependent; the only part of it that doesn't
depend on the locale is collation by Unicode codepoints.
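
To illustrate that last point, here is a minimal sketch of codepoint
collation in C. It assumes both strings are valid, NUL-terminated UTF-8,
and relies on the fact that byte-wise comparison of valid UTF-8 gives the
same order as comparing the code points themselves. The name
codepoint_compare is made up for this example; it is not an existing
Texinfo, Gnulib, or libunistring function, and this is not UTS #10
collation.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: compare two valid, NUL-terminated UTF-8 strings
   by Unicode code point.  Returns <0, 0 or >0, like strcmp.
   No locale is consulted and wchar_t is not used.  */
int
codepoint_compare (const char *a, const char *b)
{
  assert (a != NULL && b != NULL);

  size_t la = strlen (a);
  size_t lb = strlen (b);
  size_t n = la < lb ? la : lb;

  /* memcmp compares bytes as unsigned char; for valid UTF-8 this
     matches code point order.  */
  int r = memcmp (a, b, n);
  if (r != 0)
    return r;

  /* Common prefix is equal: the shorter string sorts first.  */
  return (la > lb) - (la < lb);
}
```

Note that this order is not a linguistic one: "Z" sorts before "a", and
accented letters sort after all ASCII letters. Anything better than that
needs the collation data discussed above.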