"Eli Zaretskii" <[EMAIL PROTECTED]> writes:

>> Date: Sat, 8 Jan 2005 22:29:21 +0900
>> From: Miles Bader <[EMAIL PROTECTED]>
>> Cc: Geoff Kuenning <[EMAIL PROTECTED]>, [EMAIL PROTECTED],
>>     [EMAIL PROTECTED], [EMAIL PROTECTED],
>>     Kenichi Handa <[EMAIL PROTECTED]>, [EMAIL PROTECTED],
>>     [EMAIL PROTECTED], Ken Stevens <[EMAIL PROTECTED]>,
>>     Stefan Monnier <[EMAIL PROTECTED]>
>>
>> If ispell wants utf-8, it's easy enough to convert each input line to
>> utf-8 and deal with offsets into that in the event of a misspelling;
>
> Or account for byte offsets by (variable) multibyte length of each
> character, which Emacs knows.  I don't remember for the moment whether
> the multibyte length of the UTF-8 encoding can be gotten at by a Lisp
> program, but if not, we could add some primitive to do that.
Just encode the line to utf-8, find the correct point in the byte
string, cut off the line there, convert back, and check the length of
the resulting string.  This works unless the byte offset falls in the
middle of a character.

But it would be much saner if our conversion facilities preserved
markers (which they don't do right now): encode to utf-8, place a
marker at the right byte offset, undo the conversion.

-- 
David Kastrup, Kriemhildstr. 15, 44793 Bochum
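The encode, cut, convert-back procedure described above can be sketched roughly as follows. This is only an illustration in Python, not Emacs Lisp and not anything taken from ispell.el; the function name byte_offset_to_char_offset and the sample line are made up for the example.

```python
def byte_offset_to_char_offset(line: str, byte_offset: int) -> int:
    """Map an offset into the UTF-8 encoding of `line` to a character offset.

    Hypothetical helper for illustration only.
    """
    encoded = line.encode("utf-8")
    if not 0 <= byte_offset <= len(encoded):
        raise ValueError("byte offset outside the encoded line")
    # Cut the byte string at the reported offset and convert it back.
    # If the cut lands in the middle of a multibyte character, decoding
    # raises UnicodeDecodeError: the case where the approach does not work.
    prefix = encoded[:byte_offset].decode("utf-8")
    # The length of the decoded prefix is the character offset.
    return len(prefix)


# "tset" starts at byte 13 of the UTF-8 form but at character 11,
# because the two accented letters each occupy two bytes.
print(byte_offset_to_char_offset("naïve café tset", 13))
```

A marker-preserving conversion, as wished for above, would make the explicit cut and re-decode unnecessary; the sketch only covers the workaround.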