--- Karl Ove Hufthammer <[EMAIL PROTECTED]> wrote:
> Paul Rohr <[EMAIL PROTECTED]> wrote in
> news:[EMAIL PROTECTED]:
>
> > How should undo work for combining characters?
>
> Well, combining characters may be input in several ways. On my
> Norwegian keyboard, I write é by pressing Alt Gr + 'the ´
> deadkey', followed by an e. (BTW, note that the decomposed form of
> é in Unicode is e´, not ´e.) On French keyboards, I believe there
> is a separate é key. But exactly how the keypress --> character
> sequence is generated should be done by the OS.

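To make the ordering in the quoted remark concrete, here is a minimal Python sketch. It is only an illustration using the standard unicodedata module, not anything AbiWord does. It assumes the combining accent meant is U+0301 (COMBINING ACUTE ACCENT); the spacing ´ typed in the quote is just a stand-in for it.

```python
import unicodedata

# Precomposed e-acute, the single code point a dead key normally produces.
precomposed = "\u00e9"

# Canonical decomposition (NFD): base letter first, combining mark second.
decomposed = unicodedata.normalize("NFD", precomposed)
codepoints = [f"U+{ord(ch):04X}" for ch in decomposed]
print(codepoints)  # ['U+0065', 'U+0301']

# Composing again (NFC) gives back the single precomposed code point.
recomposed = unicodedata.normalize("NFC", decomposed)
print(recomposed == precomposed)  # True
```

So the keystroke order (accent, then letter) is the reverse of the stored order (letter, then accent).
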
The concepts of input and internal representation are really quite distinct. It just happens that "dead keys" and "combining characters" are similar, but reversed. In reality they are not related. Both the Norwegian and French keymaps on all OSes return only a single, precomposed character on entering an é: U+00E9.

Combining characters are now possible with Unicode for Western languages, but nobody is using them yet. Combining characters are currently used by Vietnamese to add "tone marks" to roman characters.

> As for undoing a decomposed character (e.g. e´), I think it's safe
> to undo all characters back to (and including) the last
> non-combining character. For example if you write e´

Don't confuse your precomposed é above with combining characters. On Vietnamese Windows, you would type "e" and "e" would be displayed; you would then type a "tone mark", which would be displayed above the "e", and the cursor would not move to the right. This is a combining character.

> (where ´ is not actually ´, but the combining ´) and press undo,
> both characters (which are probably displayed as one glyph) should
> be deleted. (In practice é would/should be written as the
> pre-composed é character, as per Normalization Form C <URL:
> http://www.unicode.org/unicode/reports/tr15/ >. I only use it here
> as an example.)

Normalization is a different subject which mostly comes into play with searching and sorting. It's probably only going to be confusing to mention it here. Though maybe we do need to discuss whether AbiWord should normalize all characters in its internal representation...

> > What would a native speaker want to happen when you "undo" the
> > entry of a single "on-screen" character?[1] I suspect that
> > creating such an entity may take more than one step (in the
> > input method editor), but should they always be undone
> > individually?
>
> In cases similar to my example above, yes. But not always.
> See for example the romaji input example at <URL:
> http://www.w3.org/TR/charmod/#sec-CharExamples >. How this should
> be handled is dependent on the actual input method used.

See my earlier post.

Andrew Dunbar.

> --
> Karl Ove Hufthammer

=====
http://linguaphile.sourceforge.net http://www.abisource.com
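
For reference, the undo rule quoted above ("undo all characters back to, and including, the last non-combining character") can be sketched in a few lines of Python. This is a rough illustration under stated assumptions, not AbiWord code: the function name undo_last_cluster is made up, the buffer is treated as a plain string, and a character counts as "combining" when unicodedata.combining() reports a non-zero canonical combining class. That test is a simplification; a real editor would more likely use proper grapheme cluster boundaries, and an input method (as in the romaji case) may want different grouping again.

```python
import unicodedata


def undo_last_cluster(text: str) -> str:
    """Remove trailing combining marks and the base character before them."""
    i = len(text)
    # Walk backwards over any trailing combining marks.
    while i > 0 and unicodedata.combining(text[i - 1]) != 0:
        i -= 1
    # Also drop the base (non-combining) character, if there is one.
    if i > 0:
        i -= 1
    return text[:i]


# "cafe" followed by a combining acute accent: one undo removes both.
print(undo_last_cluster("cafe\u0301") == "caf")  # True

# Vietnamese-style stacking: e + combining circumflex + combining acute.
print(undo_last_cluster("e\u0302\u0301") == "")  # True
```

A precomposed é (U+00E9) is a single non-combining character, so the same function removes it in one step as well.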
