On 21/05/18 19:26, Pádraig Brady wrote:
> Basically, the two problems that the profiling found were:
>
>   * It is pointless to call locale_charset repeatedly, because the
>     locale won't change while 'wc' is running.
>
>   * glibc has a slow mbrtowc() implementation for UTF-8 locales.
>
> Both problems can be addressed with the "abstract factory" design patterns.
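
As a rough sketch of the first point (this is not the gnulib code): the
charset lookup can be done once and the result cached as a function
pointer, which is roughly what the "abstract factory" idea amounts to here.
The names decode_fn, decode_utf8, decode_system and get_decoder are made up
for illustration. nl_langinfo (CODESET) stands in for locale_charset, and
the sketch assumes a 32-bit wchar_t, a single thread, and a locale that is
not changed after the first call.

```c
#include <langinfo.h>
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical decoder signature: decode one character from the N bytes
   at S into *PWC.  Return the number of bytes consumed, or (size_t) -1
   on invalid or incomplete input.  */
typedef size_t (*decode_fn) (wchar_t *pwc, const char *s, size_t n);

/* Illustrative UTF-8 decoder; assumes wchar_t can hold U+10FFFF.  */
static size_t
decode_utf8 (wchar_t *pwc, const char *s, size_t n)
{
  /* Smallest code point that legitimately needs LEN bytes.  */
  static const unsigned long min_value[] = { 0, 0, 0x80, 0x800, 0x10000 };
  const unsigned char *p = (const unsigned char *) s;
  unsigned long c;
  size_t len, i;

  if (n == 0)
    return (size_t) -1;
  if (p[0] < 0x80)
    {
      *pwc = p[0];
      return 1;
    }
  else if ((p[0] & 0xE0) == 0xC0)
    { len = 2; c = p[0] & 0x1F; }
  else if ((p[0] & 0xF0) == 0xE0)
    { len = 3; c = p[0] & 0x0F; }
  else if ((p[0] & 0xF8) == 0xF0)
    { len = 4; c = p[0] & 0x07; }
  else
    return (size_t) -1;         /* stray continuation or invalid lead byte */

  if (n < len)
    return (size_t) -1;         /* incomplete sequence */
  for (i = 1; i < len; i++)
    {
      if ((p[i] & 0xC0) != 0x80)
        return (size_t) -1;
      c = (c << 6) | (p[i] & 0x3F);
    }
  /* Reject overlong forms, surrogates, and values beyond U+10FFFF.  */
  if (c < min_value[len] || (c >= 0xD800 && c <= 0xDFFF) || c > 0x10FFFF)
    return (size_t) -1;
  *pwc = (wchar_t) c;
  return len;
}

/* Fallback: defer to the system mbrtowc() with a fresh shift state.  */
static size_t
decode_system (wchar_t *pwc, const char *s, size_t n)
{
  mbstate_t state;
  size_t r;

  memset (&state, 0, sizeof state);
  r = mbrtowc (pwc, s, n, &state);
  if (r == (size_t) -1 || r == (size_t) -2)
    return (size_t) -1;
  return r == 0 ? 1 : r;        /* mbrtowc returns 0 for the NUL character */
}

/* Query the locale's charset on the first call only, then reuse the
   cached choice.  Some systems spell the codeset differently (for
   example "utf8"); this sketch only recognizes "UTF-8".  */
static decode_fn
get_decoder (void)
{
  static decode_fn cached;      /* NULL until the first call */

  if (cached == NULL)
    cached = (strcmp (nl_langinfo (CODESET), "UTF-8") == 0
              ? decode_utf8 : decode_system);
  return cached;
}
```

A caller such as wc would fetch the decoder once before its read loop,
e.g. "decode_fn decode = get_decoder ();", and then call
"decode (&wc, p, end - p)" per character, so the charset is never
looked up inside the loop.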

I'm going to apply my wchar-single module to gnulib, to tweak it so that
the main bottleneck of calling locale_charset repeatedly is removed from
wcwidth() and mbrtowc(), in a simple manner, without the need for another API.

It's very interesting that the system mbrtowc() implementations don't
look to be optimized for UTF-8, being 4.5 and 2.3 times slower than
utf8_mbrtowc() on glibc and macOS respectively. It would be useful to
follow this up with the glibc folks at least, so that everyone could
benefit, without any code changes.

cheers,
Pádraig