I saw your question and was curious, so I looked into it a bit:

>> To your knowledge, is there any objection to defining alpha-char-p as
>> including code-char's >= 128?

I see that SBCL 1.2.2 is OK with that, for example:

  * (code-char 232)
  #\LATIN_SMALL_LETTER_E_WITH_GRAVE
  * (alpha-char-p (code-char 232))
  T
  *

In fact, that alpha-char-p call also returns T in (versions of)
Allegro CL, CCL, CLISP, CMU CL, LispWorks, and SBCL.

Next, I checked the CL HyperSpec

http://www.lispworks.com/documentation/HyperSpec/Body/f_alpha_.htm#alpha-char-p

and found this for alpha-char-p:

  Returns true if character is an alphabetic[1] character; otherwise,
  returns false.

I followed the link to "alphabetic"

http://www.lispworks.com/documentation/HyperSpec/Body/26_glo_a.htm#alphabetic

and found this as the first definition, which seems to justify the
above return value of T.

  adj. (of a character) being one of the standard characters A through
  Z or a through z, or being any implementation-defined character that
  has case, or being some other graphic character defined by the
  implementation to be alphabetic[1].

[By the way, ACL2 has this wrong! So I'm glad you asked. I'll fix
that....]

-- Matt

From: Camm Maguire <[email protected]>
Date: Sat, 01 Nov 2014 10:50:48 -0400
Cc: Raymond Toy <[email protected]>, [email protected]

Greetings!

Carl Shapiro <[email protected]> writes:

> On Fri, Oct 31, 2014 at 11:20 AM, Camm Maguire <[email protected]> wrote:
>
>     It really appears that unicode refers more to a glyph than anything
>     else. If we follow your suggestions, and leave characters 8-bit, aref
>     random O(1) access, is there any utility to providing unicode functions
>     #'glyph-length or some such in a common lisp implementation?
>
> Yes, a Common Lisp character is a UTF-8 code unit. As such, (length "א")
> would return 2 in GCL whereas it returns 1 in CMUCL.
>
> For iterating across strings in ways other than by UTF-8 code unit, you
> will want to provide iterators for iterating by code point, by glyph,
> and so forth.
>
> In theory, something like CL-UNICODE would provide that but I think it's
> really lacking in a number of important ways.
> GCL being what it is, you could link against ICU and use their functions
> to start with.
>

Thanks so much for these tips. They certainly seem to illuminate the
path forward. Can't see how we could do better than icu.

To your knowledge, is there any objection to defining alpha-char-p as
including code-char's >= 128?

Take care,
--
Camm Maguire [email protected]
==========================================================================
"The earth is but one country, and mankind its citizens." -- Baha'u'llah

_______________________________________________
Gcl-devel mailing list
[email protected]
https://lists.gnu.org/mailman/listinfo/gcl-devel
