Hi, this very simple diff provides partial, naive UTF-8 support for word handling in ksh(1) emacs mode.
It improves all functions involving words (forward-word, backward-word, delete-word-forward, delete-word-backward, downcase-word, upcase-word, capitalize-word) by allowing non-ASCII characters to be part of words.

This is not perfect: all non-ASCII characters become part of the adjacent words, and the case of non-ASCII characters cannot be changed. But it improves things a bit in a very non-intrusive way.

This is the final patch I'd like to commit to ksh/emacs.c for 5.9. It is too early for adding support for double-width and zero-width characters, and we are too close to release for that, anyway.

OK?
  Ingo


Index: emacs.c
===================================================================
RCS file: /cvs/src/bin/ksh/emacs.c,v
retrieving revision 1.61
diff -u -p -r1.61 emacs.c
--- emacs.c	10 Dec 2015 10:00:14 -0000	1.61
+++ emacs.c	5 Jan 2016 18:35:54 -0000
@@ -49,7 +49,8 @@ struct x_ftab {
 #define	is_cfs(c)	(c == ' ' || c == '\t' || c == '"' || c == '\'')
 
 /* Separator for motion */
-#define	is_mfs(c)	(!(isalnum((unsigned char)c) || c == '_' || c == '$'))
+#define	is_mfs(c)	(!(isalnum((unsigned char)c) || \
+			c == '_' || c == '$' || c & 0x80))
 
 /* Arguments for do_complete()
  * 0 = enumerate M-= complete as much as possible and then list
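
For illustration only, here is a small standalone sketch of what the added `c & 0x80` test changes. It is not code from emacs.c: the two predicates are function versions of the old and new is_mfs() macro, and word_end() is a hypothetical helper that skips separators and then walks to the end of the next word, roughly like forward-word. It assumes the default "C" locale, where isalnum() is false for bytes with the high bit set. Every byte of a multi-byte UTF-8 sequence has the high bit set, so the new test keeps such sequences inside the word.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Old behaviour: only ASCII alphanumerics, '_' and '$' are word bytes. */
static int
is_mfs_old(int c)
{
	unsigned char uc = (unsigned char)c;

	return !(isalnum(uc) || uc == '_' || uc == '$');
}

/* New behaviour: any byte with the high bit set also counts as a word byte. */
static int
is_mfs_new(int c)
{
	unsigned char uc = (unsigned char)c;

	return !(isalnum(uc) || uc == '_' || uc == '$' || (uc & 0x80));
}

/*
 * Hypothetical helper, not part of emacs.c: starting at byte offset
 * 'start', skip separators, then skip word bytes, and return the byte
 * offset just past the word.
 */
static size_t
word_end(const char *s, size_t start, int (*is_sep)(int))
{
	size_t i = start;

	assert(s != NULL);
	while (s[i] != '\0' && is_sep(s[i]))
		i++;
	while (s[i] != '\0' && !is_sep(s[i]))
		i++;
	return i;
}
```

With the input "naïve word" encoded as UTF-8 (the "ï" is the two bytes 0xc3 0xaf), word_end() with is_mfs_new stops after all six bytes of "naïve", while is_mfs_old stops after "na". The same sketch also shows the limitation mentioned above: a non-ASCII punctuation character would be swallowed into the adjacent word just like a letter.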