Greetings! Don't worry -- I'm not committed to this idea yet, just exploring!
Do these other lisps allocate a fresh character on each aref?  Do they
maintain some ~2^21 sized table in core?  (And isn't emacs a "lisp" :-)).

Take care,

Raymond Toy <[email protected]> writes:

>>>>>> "Camm" == Camm Maguire <[email protected]> writes:
>
> Camm> Greetings!  I've recently been considering supporting unicode in
> Camm> gcl by representing strings internally in utf8.  It appears that
> Camm> emacs does the same or similar.  Apart from the obvious memory
> Camm> footprint benefits, I'd like to ask what other
> Camm> advantages/disadvantages have been discovered.
>
> Camm> Much of the utf8 literature emphasizes that most algorithms can
> Camm> proceed conventionally in byte-wise fashion, including
> Camm> lexicographical ordering comparisons, given that almost all jobs
> Camm> are sequential, at least initially.  A cached internal pointer
> Camm> storing the last referenced codepoint offset makes access
> Camm> essentially O(1).  Yet setting string elements can trigger
> Camm> reallocations/memmove operations.  While these can be aggregated
> Camm> over the setting of multiple elements, operations like nreverse
> Camm> look ridiculous if left in terms of calls to aref and aset.
>
> Camm> Thoughts, advice and experiences most appreciated.
>
> Have you looked at what other Lisp implementations do?  AFAIK, none use
> utf-8.  CCL and clisp use utf-32, cmucl and allegro use utf-16, sbcl
> and ecl(?) have two string types: 8-bit base-string and 32-bit
> strings.
>
> As a one-man operation (unfortunately), I'd go with the easiest one to
> get right and follow either ccl or cmucl.  The rest of the support for
> unicode can be added with libraries like cl-unicode and/or babel, if
> need be.
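
To make the cached-offset idea quoted above a bit more concrete, here is
a rough sketch.  It is only an illustration in Python, not gcl code: the
class Utf8String and its methods ref and set are hypothetical names made
up for this example, and the buffer is assumed to hold valid utf8.  It
shows sequential reads resuming from the cached position, and a set that
changes the byte width of an element (the reallocation/memmove case).

```python
class Utf8String:
    """Hypothetical UTF-8 backed string with a cached cursor.

    The cursor remembers the byte offset of the last referenced
    codepoint, so sequential access does not rescan from the start.
    """

    def __init__(self, text: str):
        self.buf = bytearray(text.encode("utf-8"))
        self._idx = 0  # codepoint index of the cached position
        self._off = 0  # byte offset of that codepoint in buf

    @staticmethod
    def _width(lead: int) -> int:
        """Byte length of a sequence, judged from its lead byte."""
        if lead < 0x80:
            return 1
        if lead >= 0xF0:
            return 4
        if lead >= 0xE0:
            return 3
        return 2

    def _seek(self, i: int) -> int:
        """Move the cursor to codepoint i and return its byte offset."""
        if i < 0:
            raise IndexError(i)
        if i < self._idx:
            # Backward access: this sketch simply restarts from the front.
            self._idx = self._off = 0
        while self._idx < i:
            if self._off >= len(self.buf):
                raise IndexError(i)
            self._off += self._width(self.buf[self._off])
            self._idx += 1
        if self._off >= len(self.buf):
            raise IndexError(i)
        return self._off

    def ref(self, i: int) -> str:
        """Return codepoint i as a one-character string (like aref)."""
        off = self._seek(i)
        return self.buf[off:off + self._width(self.buf[off])].decode("utf-8")

    def set(self, i: int, ch: str) -> None:
        """Replace codepoint i (like aset); may shift the tail bytes."""
        if len(ch) != 1:
            raise ValueError("expected exactly one character")
        off = self._seek(i)
        old = self._width(self.buf[off])
        # Slice assignment grows or shrinks the buffer when widths differ.
        self.buf[off:off + old] = ch.encode("utf-8")
```

A forward loop over ref(0), ref(1), ... advances the cursor one
codepoint per call.  A backward loop restarts from the front every time
in this sketch, which is one reason something like nreverse would want a
dedicated implementation rather than calls to aref and aset.
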
>
> --
> Ray

--
Camm Maguire                                        [email protected]
==========================================================================
"The earth is but one country, and mankind its citizens."  --  Baha'u'llah

_______________________________________________
Gcl-devel mailing list
[email protected]
https://lists.gnu.org/mailman/listinfo/gcl-devel
