To accommodate new characters added since Unicode 5.1.0, this changes the internal representation of codepoint ranges. Previously, we grouped codepoint ranges by a manually assigned 4-bit tag, which allowed only 16 groups. This removes that limitation by switching to binary search on a table of ranges.

For the detailed rationale and the benchmark results, see:
https://lists.gnu.org/archive/html/bug-libunistring/2014-06/msg00001.html
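
As a rough illustration of the binary search on a table mentioned above, here is a minimal sketch in C. The struct name `cp_range`, its fields, the function `codepoint_to_index`, and the three table entries are all hypothetical and chosen for readability. They are not the layout that gen-uninames.lisp actually emits into uninames.h.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical range record: a run of consecutive codepoints that is
   mapped onto a run of consecutive indices.  */
struct cp_range
{
  uint32_t start;        /* first codepoint of the range */
  uint32_t length;       /* number of codepoints in the range */
  uint32_t first_index;  /* index assigned to 'start' */
};

/* Illustrative table, sorted by 'start', with non-overlapping ranges.  */
static const struct cp_range ranges[] =
  {
    { 0x0020,  0x5F, 0x00 },  /* U+0020..U+007E */
    { 0x00A0,  0x60, 0x5F },  /* U+00A0..U+00FF */
    { 0x1F600, 0x50, 0xBF }   /* U+1F600..U+1F64F */
  };

/* Return the index for codepoint C, or -1 if C lies in no range.  */
static long
codepoint_to_index (uint32_t c)
{
  size_t lo = 0;
  size_t hi = sizeof ranges / sizeof ranges[0];

  while (lo < hi)
    {
      size_t mid = lo + (hi - lo) / 2;
      const struct cp_range *r = &ranges[mid];

      if (c < r->start)
        hi = mid;                 /* C is below this range */
      else if (c - r->start >= r->length)
        lo = mid + 1;             /* C is above this range */
      else
        return (long) (r->first_index + (c - r->start));
    }
  return -1;
}
```

With a table like this, the number of ranges is bounded only by the table size rather than by the width of a tag field, and a lookup costs a logarithmic number of comparisons.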

* lib/uniname/gen-uninames.lisp (unicode-char): Rename CODE member to
INDEX, as it no longer represents a codepoint.
(range): New struct.
(main): Switch to intervals list from a bit-pattern based
classification.
* lib/uniname/uninames.h: Regenerate.
* tests/uniname/UnicodeDataNames.txt: Update to Unicode 7.0.0.
* modules/uniname/base (configure.ac): Bump minimum version to 0.9.5.
* modules/uniname/uniname (configure.ac): Bump minimum version to
0.9.5.
--
As I really need the updated Unicode character names, and the uniname module is isolated from the other libunistring modules, I've pushed this patch ahead of the other patches. I haven't included the patch in this mail because it is too large, but the code is the same as what I posted last October:
https://lists.gnu.org/archive/html/bug-gnulib/2014-10/msg00040.html

Regards,
--
Daiki Ueno