Brian Inglis wrote:
> It seems to me that most application developers needing to support
> non-Western-European languages might want a non-POSIX interpretation of
> digits.

Sure. GNU libunistring has a dedicated API for this:
  - https://www.gnu.org/software/libunistring/manual/html_node/Object-oriented-API.html
    UC_DECIMAL_DIGIT_NUMBER
  - https://www.gnu.org/software/libunistring/manual/html_node/Decimal-digit-value.html
  - https://www.gnu.org/software/libunistring/manual/html_node/Digit-value.html
  - https://www.gnu.org/software/libunistring/manual/html_node/Properties-as-objects.html
    UC_PROPERTY_DECIMAL_DIGIT
  - https://www.gnu.org/software/libunistring/manual/html_node/Properties-as-functions.html
    uc_is_property_decimal_digit

I'm sure ICU4C has similar APIs too.

> Are the Unicode character attribute classes supported for those application
> use cases that need more than POSIX limitations allow?

POSIX allows the libc to define additional character classes. But these
will be platform and locale dependent, and I don't know of any application
that makes use of such additional character classes via wctype() and
iswctype().

> I know that I sometimes want to see some alternative numeric digit forms and
> expect to be able to find those with an appropriate grep expression.

I think you can do so with GNU 'grep', when it was built with PCRE support.
PCRE includes support for Unicode character classes.
<https://www.pcre.org/current/doc/html/pcre2pattern.html>

Bruno
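
For reference, the wctype()/iswctype() route mentioned above could look
roughly like the following in C. This is only a minimal sketch: the helper
name count_in_class is hypothetical, and the only class name it is exercised
with here is the standard "digit" class. Which additional class names exist,
if any, depends on the platform and on the current LC_CTYPE locale.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>
#include <wctype.h>

/* Hypothetical helper: count the characters of s that belong to the
   character class named class_name in the current LC_CTYPE locale.
   Returns -1 if the locale does not define a class with that name
   (wctype() returns 0 in that case). */
long count_in_class(const wchar_t *s, const char *class_name)
{
    assert(s != NULL && class_name != NULL);

    /* Look up the class by name; this is the only portable way to reach
       classes beyond the twelve standard ones. */
    wctype_t cls = wctype(class_name);
    if (cls == (wctype_t) 0)
        return -1;

    long n = 0;
    for (; *s != L'\0'; s++) {
        if (iswctype((wint_t) *s, cls))
            n++;
    }
    return n;
}
```

A caller would normally invoke setlocale(LC_ALL, "") first, so that the
lookup uses the user's locale instead of the "C" locale, and would treat a
return value of -1 as "this class is not available here". That fallback is
needed precisely because the set of extra classes is not portable.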