[PATCH] uniname/uniname: update to Unicode 7.0.0

Daiki Ueno Tue, 06 Jan 2015 02:08:56 -0800

To accommodate new characters added since Unicode 5.1.0, this
changes the internal representation of codepoint ranges.
Previously, we grouped codepoint ranges by a manually assigned 4-bit
tag, which only allowed 16 groups.  This removes the limitation by
switching to binary search on a table.  For the detailed rationale and
the benchmark results, see:
https://lists.gnu.org/archive/html/bug-libunistring/2014-06/msg00001.html
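
As a rough illustration of the table-based lookup described above, here
is a minimal sketch in C.  It is not the generated code in uninames.h:
the struct layout, the name codepoint_to_index, and the three table
entries are all made up for this example.  It only assumes that each
range maps a contiguous run of codepoints onto a compact index.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: each entry maps the contiguous codepoints
   START..END (inclusive) onto compact indices beginning at INDEX.  */
struct range
{
  uint32_t start;
  uint32_t end;
  uint32_t index;
};

/* Illustrative data only, not the real generated table.
   Entries must be sorted by START and must not overlap.  */
static const struct range ranges[] =
{
  { 0x0000,  0x007F,  0 },
  { 0x0400,  0x04FF,  128 },
  { 0x1F600, 0x1F64F, 384 }
};

/* Return the compact index for CP, or -1 if CP lies in no range.
   Binary search over the sorted table, so the number of ranges is
   not capped the way a 4-bit tag caps it at 16 groups.  */
static long
codepoint_to_index (uint32_t cp)
{
  size_t lo = 0;
  size_t hi = sizeof ranges / sizeof ranges[0];

  while (lo < hi)
    {
      size_t mid = lo + (hi - lo) / 2;

      if (cp < ranges[mid].start)
        hi = mid;
      else if (cp > ranges[mid].end)
        lo = mid + 1;
      else
        return (long) (ranges[mid].index + (cp - ranges[mid].start));
    }
  return -1;
}
```

With a table like this, adding a range for a new block of characters
means adding one entry, rather than finding room in a fixed set of tags.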


* lib/uniname/gen-uninames.lisp (unicode-char): Rename CODE member
to INDEX, as it no longer represents a codepoint.
(range): New struct.
(main): Switch from a bit-pattern based classification to a list
of intervals.
* lib/uniname/uninames.h: Regenerate.
* tests/uniname/UnicodeDataNames.txt: Update to Unicode 7.0.0.
* modules/uniname/base (configure.ac): Bump minimum version to
0.9.5.
* modules/uniname/uniname (configure.ac): Bump minimum version to
0.9.5.
--
As I really need the updated Unicode character names, and the uniname
module is isolated from other libunistring modules, I've pushed this
patch ahead of the other patches.  I haven't included the patch in this
mail because it is too large, but the code is the same as what I
posted last October:
https://lists.gnu.org/archive/html/bug-gnulib/2014-10/msg00040.html

Regards,
--
Daiki Ueno
