samppi wrote:
> I see. Does this mean that, if I expect to handle 32-bit characters,
> then I need to consider changing my character-handling functions to
> accept sequences of vectors instead?
>
> Also, how does (seq "\ud800\udc00") work? Does it split the character
> into two 16-bit characters? In the REPL, it seems to return (\? \?).
>
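seq on a String returns a sequence of Java chars (16-bit values), so yes,
a character outside the BMP is split into its two surrogates. The function
below walks the string by code point instead.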
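As a quick check at the REPL, here is a small sketch that uses only `int`, `count` and the standard `String.codePointCount` method. The name `s` is only for illustration. It shows the two surrogate code units that `seq` hands back for a single supplementary character:

```clojure
;; "\ud800\udc00" is one code point (U+10000) stored as two UTF-16 code units.
(def s "\ud800\udc00")

(map int (seq s))                 ; the high and low surrogates, as ints
;; => (55296 56320)

(count s)                         ; counts 16-bit chars
;; => 2

(.codePointCount s 0 (count s))   ; counts code points
;; => 1
```

The `\?` output in the REPL is presumably the terminal failing to print a lone surrogate; the values themselves are intact.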
(defn codepoints-seq [s] ; returns a seq of ints
  (let [s (str s)
        n (count s)
        f (fn this [i]
            (lazy-seq
              (when (< i n)
                (cons (.codePointAt s i)
                      (this (.offsetByCodePoints s i 1))))))]
    (f 0)))
;; (codepoints-seq "\ud800\udc00a\ud800\udd00")
;; => (65536 97 65792)
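If your character-handling functions then work on ints, you will probably want to get back to a String at the end. A minimal sketch of the reverse direction, assuming a helper I am calling `codepoints->str` (a hypothetical name, not part of Clojure) built on the standard `StringBuilder.appendCodePoint` method:

```clojure
(defn codepoints->str
  "Builds a String from a seq of int code points (hypothetical helper)."
  [cps]
  (let [sb (StringBuilder.)]
    (doseq [cp cps]
      ;; appendCodePoint writes one char for BMP code points
      ;; and a surrogate pair for supplementary ones.
      (.appendCodePoint sb (int cp)))
    (str sb)))

(= "\ud800\udc00a\ud800\udd00"
   (codepoints->str [65536 97 65792]))
;; => true
```

Composed with codepoints-seq this should round-trip any well-formed string, so the rest of your code can stay with plain seqs of ints rather than vectors of chars.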
--
Professional: http://cgrand.net/ (fr)
On Clojure: http://clj-me.blogspot.com/ (en)