> From: Larry Denenberg <[email protected]>
> cc: Larry Denenberg <[email protected]>, [email protected]
> Date: Sun, 27 Jun 2010 22:14:18 -0400
>
> bidi-paragraph-direction is purely an Emacs thing, right?  It's not
> in the Unicode bidi standard.

bidi-paragraph-direction is one of the Emacs-specific aspects of what
UAX#9 calls ``higher protocols'' for determining the base direction of
paragraphs.

> Is it absolute?  That is, can it be overridden by LRM or LRO
> characters?

No, not currently.  The code that determines base paragraph direction
looks at the value of bidi-paragraph-direction, and if that's non-nil,
it doesn't bother looking for the first strong directional character
in the paragraph.

But this is Emacs: Lisp code that wants to override the default value
of bidi-paragraph-direction can always let-bind it to any value it
wants, including nil.  Then LRM etc. will have their normal effect.

> Can you give me an example of any message in an English Emacs that
> should be RTL?

I would need to wade through the many uses of `message' to see if
there are any.  In general, any echo-area message that shows just
portions of buffer text (as opposed to a message generated by Emacs to
convey some information to the user) might need RTL paragraphs if the
text comes from a buffer written in some bidirectional script.  I
don't know off the top of my head which features do that, although
I'm pretty sure such features exist.

> >Btw, there's something I overlooked before: why exactly is ^ב
> >considered a strong R2L character?  Could you please go to it in the "
> >*Echo Area 0/1*" buffer, type "C-u C-x =", and show what Emacs tells
> >about that character?
>
> First of all, I don't think your procedure works.  You can make the
> message appear, and with care you can get a cursor on top of it, but
> typing C-u (or most anything else) changes the buffer contents---it's
> not called the Echo Area for nothing!  To get your hands on the
> character you'd have to write a function that grabs the contents of the
> buffer and bind it to a key, or in some other way avoid echoing.

See my other message for how I would do that.

> But there's no point in trying.  The buffer can't possibly contain an
> actual ^ב.  No buffer can.  Buffers and strings can contain only those
> characters encodable in 22 bits.  If your input facilities permit, you
> can prove this by typing ^Q ^ב; Emacs refuses to insert such a character
> (Wrong type argument: char-or-string-p, 67110353).

^ב is just another Emacs display feature, like ^C.  Emacs has special
code in its display engine to produce such two-character combinations,
so that an otherwise unprintable character is displayed as a string
that any terminal will show without any problem.  But Emacs still
knows that these two characters stand for a single character, and
"C-u C-x =" will tell you which one.

> Here's what I think is happening: The code that complains about
> undefined characters handles uninsertable characters (things like ^ב and
> meta-control-mouse-down) by translating them to visible representation.
> So the message contains a real caret followed by ב.  That is, the first
> character has no strong directionality, and the directionality is set by
> the second character, a non-control ב.

That'd be my guess as well, but I'd like to be sure.  One thing that
puzzles me is where that caret comes from: the function which displays
the "X is undefined" message is supposed to use the C- notation for
control-modified characters, not the ^ notation.

_______________________________________________
emacs-bidi mailing list
[email protected]
http://lists.gnu.org/mailman/listinfo/emacs-bidi
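
The error code quoted above can be checked by hand.  What follows is a
minimal sketch in Emacs Lisp, not code from this thread; the names
my-describe-event-code and my-call-with-auto-direction are
hypothetical.  It assumes the usual Emacs convention that the control
modifier of a keyboard event is bit 26, under which 67110353
decomposes into Control plus U+05D1 (ב).  The second function only
illustrates the let-binding mentioned earlier; whether it changes what
is actually displayed depends on when redisplay runs, so treat that
part as an assumption.

```elisp
;;; -*- lexical-binding: t -*-

(defun my-describe-event-code (code)
  "Return a plist describing the integer event CODE.
Hypothetical helper, for illustration only."
  (let ((control-bit (ash 1 26)))       ; assumed: control modifier = 2^26
    (list
     ;; The character part is whatever fits in the low 22 bits,
     ;; i.e. everything up to (max-char) = #x3FFFFF.
     :base (logand code (max-char))
     ;; Non-nil if the control modifier bit is set.
     :control (/= 0 (logand code control-bit))
     ;; Only values up to (max-char) can live in buffers and strings.
     :insertable (characterp code))))

(defun my-call-with-auto-direction (fn)
  "Call FN with `bidi-paragraph-direction' let-bound to nil.
While the binding is in effect, code that consults the variable sees
nil, so the base direction is left to the paragraph's own text."
  (let ((bidi-paragraph-direction nil))
    (funcall fn)))

;; (my-describe-event-code 67110353)
;; => (:base 1489 :control t :insertable nil)   ; 1489 = #x5D1 = ב
```

So the value in the error message is a keyboard event (Control + ב),
not a character, which is consistent with both the refusal to insert
it and the two-character rendering discussed above.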
