Re: [PATCH] mbrtowc: work around glibc bug#19932

Paul Eggert Sat, 09 Apr 2016 10:30:02 -0700

Bruno Haible wrote:

The function hard_locale is quite slow, as it calls setlocale(),
strdup(), and similar functions. rpl_mbrtowc is supposed to be fast,
as it's called once on every character in a string. Can't you get
away without the call to hard_locale?

I am concerned about performance there too. On glibc it is not so bad, since hard_locale calls only setlocale. On my platform (x86-64, GCC 5.3.1, Fedora 23) calling hard_locale takes about 40 instructions total, including the setlocale. This cost is paid only for encoding errors; still, it'd be nicer to get it down.

On platforms like Solaris the cost is zero, since Solaris already conforms to future POSIX and this is tested at compile-time.

I don't know about other C libraries, such as FreeBSD's. It's possible that this implementation could be quite slow there, as you say.

If this turns into a problem with GNU grep, I plan to fix it by having grep cache the results of mbrtowc in unibyte locales. GNU grep is already doing that for other reasons in its DFA engine, and I would merely need to have it do that in all places where performance is important. So this Gnulib performance problem need not be addressed for 'grep'; only for other programs that use Gnulib mbrtowc.

Perhaps we could add to the mbrtowc and/or hard-locale module a way to do the hard-locale test once after calling setlocale, so that the mbrtowc workaround can simply reference a boolean variable (either a global variable with setlocale, or a thread-local variable with uselocale). That would bring the 40 instructions down to 1 on glibc. Sounds like a bit of a hassle, but it should be doable.
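
A minimal sketch of the caching idea, for the global-variable (setlocale) case only. The names set_locale_and_cache, cached_hard_locale and ctype_locale_is_hard are hypothetical and are not part of Gnulib; the sketch also assumes the program changes its locale only through the wrapper, and it is not thread-safe.

```c
#include <locale.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical cached flag: true if the current LC_CTYPE locale is
   neither "C" nor "POSIX".  A program starts in the "C" locale, so
   the initial value is false.  */
static bool ctype_locale_is_hard = false;

/* Hypothetical wrapper: call setlocale, then redo the hard-locale
   test once and remember the answer.  Return true if setlocale
   succeeded.  Assumes all locale changes go through this wrapper.  */
bool
set_locale_and_cache (int category, char const *locale)
{
  if (setlocale (category, locale) == NULL)
    return false;

  /* Query the name of the LC_CTYPE locale now in effect.  */
  char const *name = setlocale (LC_CTYPE, NULL);
  ctype_locale_is_hard = (name != NULL
                          && strcmp (name, "C") != 0
                          && strcmp (name, "POSIX") != 0);
  return true;
}

/* What the mbrtowc workaround would consult on an encoding error:
   a single load of a boolean instead of a setlocale call.  */
bool
cached_hard_locale (void)
{
  return ctype_locale_is_hard;
}
```

With uselocale the flag would instead have to be thread-local and refreshed whenever the thread's locale changes, which is the part that makes this a hassle.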

Other performance hacks that I considered were to make hard_locale an inline function, and to have it test MB_CUR_MAX > 1 before going to the bother of calling setlocale and strcmp. Undoubtedly there could be benchmarks where this sort of thing would be a win, though the converse might also be true.
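
A rough sketch of that second hack. The function name hard_locale_fast is hypothetical, and this is not Gnulib's hard_locale. It assumes that the "C" and "POSIX" locales are always single-byte, so that MB_CUR_MAX > 1 by itself implies a hard LC_CTYPE locale; treating a failed setlocale query as "not hard" is likewise just a choice made for this sketch.

```c
#include <locale.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical inline variant of a hard-locale test.  */
static inline bool
hard_locale_fast (int category)
{
  /* Cheap test first: a multibyte LC_CTYPE locale cannot be "C" or
     "POSIX" under this sketch's assumption, so skip setlocale.  */
  if (category == LC_CTYPE && MB_CUR_MAX > 1)
    return true;

  /* Slow path: ask for the locale name and compare it.  */
  char const *name = setlocale (category, NULL);
  return (name != NULL
          && strcmp (name, "C") != 0
          && strcmp (name, "POSIX") != 0);
}
```

The shortcut helps only in multibyte locales; in unibyte locales it adds a test in front of the same setlocale and strcmp calls, which is one way the converse could turn out to be true.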
