reverse-i-search, multibyte backspace problem
Hello,

I've noticed a bug with terminal usage of bash.

Steps to reproduce:
1. Press control-r to enter reverse-i-search mode.
2. Enter a character outside of the ASCII character set, like the French é or the German ä.
3. Press backspace.

What to expect: The character gets removed.

What happens: Instead of the whole character being removed, a weird character (like � or Ã) appears.

The most likely theory: Instead of adhering to the UTF-8 multibyte specification and removing the whole multibyte code point encoding sequence (or perhaps the whole sequence representing the "abstract character"? [1]), it just removes the last byte.

Note that the bug depends on the terminal. I originally discovered it on Konsole, and other users on the freenode #bash channel have confirmed it on xterm, st and rxvt, although one user couldn't reproduce it with st.

Affected versions: I've tested 4.3.30(1)-release (my distro's packaged one) and 4.3.39(2)-release, the latter coming straight from the development git repository's master branch, compiled with ./configure && make -j 4.

The operating system I use is Kubuntu, but the bug has been confirmed on Gentoo and Arch Linux too.

Thanks for answers.

Greetings
Est31

[1]: Quoting the Unicode standard, version 7, Section 3.4, Characters and Encoding: "A single abstract character may also be represented by a sequence of code points—for example, "latin capital letter g with acute" may be represented by the sequence <latin capital letter g, combining acute accent>, rather than being mapped to a single code point."
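To illustrate the theory above, here is a minimal C sketch of what a byte-wise backspace does to a UTF-8 search string. The function names chop_last_byte and ends_with_lead_byte are made up for this example and are not taken from bash or readline; the sketch only assumes that the string is UTF-8 encoded.

```c
#include <stddef.h>

/* Hypothetical helper: a naive backspace that drops exactly one byte,
   which is the behaviour the report suspects. Returns the new length. */
static size_t chop_last_byte(char *s, size_t len)
{
    if (len == 0)
        return 0;
    s[--len] = '\0';
    return len;
}

/* Hypothetical helper: true when the last byte of s[0..len) is a UTF-8
   lead byte (bit pattern 11xxxxxx). A lead byte at the very end means
   its continuation bytes are missing, so the sequence is truncated. */
static int ends_with_lead_byte(const char *s, size_t len)
{
    if (len == 0)
        return 0;
    return ((unsigned char)s[len - 1] & 0xC0) == 0xC0;
}
```

For "é", encoded in UTF-8 as the two bytes 0xC3 0xA9, chop_last_byte leaves the lone lead byte 0xC3 behind. A terminal that decodes it as Latin-1 would show Ã, and one that treats it as invalid UTF-8 would typically show �, which matches the two symptoms described in the report.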
Re: reverse-i-search, multibyte backspace problem
Hello,

Thanks for pointing out the fix. I've tried the "devel" branch, and couldn't reproduce the bug there. Sorry for the disturbance, I should have checked whether the master branch really represents the bleeding edge of development.

19.07.2015, 03:53, "Eduardo A. Bustamante López":
> Hello,
>
> Can you please try the 'devel' branch?
>
> There's a fix for this issue already in it:
>
> | commit 947f04912e4715e7a9df526cd99412bffa729368
> | Author: Chet Ramey
> | Date: Tue Jan 27 11:10:49 2015 -0500
> |
> | commit bash-20150116 snapshot
>
> Here's the description of the fix:
>
> | lib/readline/isearch.c
> | - _rl_isearch_dispatch: if we are in a multibyte locale, make sure to use
> |   _rl_find_prev_mbchar when trying to delete characters from the search
> |   string, instead of just chopping off the previous byte. Fixes bug
> |   reported by Kyrylo Shpytsya
>
> This was reported earlier this year:
>
> http://lists.gnu.org/archive/html/bug-readline/2015-01/msg00017.html
>
> Or use this to patch:
>
> | dualbus@yaqui ...src/gnu/bash % git diff origin/master 947f04912e4715e7a9df526cd99412bffa729368 -- lib/readline/isearch.c
> | diff --git a/lib/readline/isearch.c b/lib/readline/isearch.c
> | index 6f6a7a6..d768560 100644
> | --- a/lib/readline/isearch.c
> | +++ b/lib/readline/isearch.c
> | @@ -553,8 +553,16 @@ add_character:
> |          do until we have a real isearch-undo. */
> |        if (cxt->search_string_index == 0)
> |          rl_ding ();
> | -      else
> | +      else if (MB_CUR_MAX == 1 || rl_byte_oriented)
> |          cxt->search_string[--cxt->search_string_index] = '\0';
> | +      else
> | +        {
> | +          wstart = _rl_find_prev_mbchar (cxt->search_string, cxt->search_string_index, MB_FIND_NONZERO);
> | +          if (wstart >= 0)
> | +            cxt->search_string[cxt->search_string_index = wstart] = '\0';
> | +          else
> | +            rl_ding ();
> | +        }
> |        break;
> |
> |      case -4: /* C-G, abort */
>
> Greetings!
>
> --
> Eduardo Bustamante
> https://dualbus.me/
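For readers who want to see the idea behind the quoted patch in isolation, the following is a rough, UTF-8-only sketch of "find the start of the previous character, then truncate there". It is not readline's _rl_find_prev_mbchar, which works with the current locale; utf8_prev_char_start and utf8_backspace are hypothetical names invented for this example, and the sketch assumes well-formed UTF-8 input.

```c
#include <stddef.h>

/* Hypothetical helper: return the byte index at which the character
   ending just before `pos` starts. UTF-8 only: it steps back over
   continuation bytes (bit pattern 10xxxxxx) until it reaches a byte
   that starts a sequence. */
static size_t utf8_prev_char_start(const char *s, size_t pos)
{
    while (pos > 0) {
        --pos;
        if (((unsigned char)s[pos] & 0xC0) != 0x80)
            break;
    }
    return pos;
}

/* Hypothetical helper: a backspace that removes one whole encoded
   code point and returns the new length of the string. */
static size_t utf8_backspace(char *s, size_t len)
{
    len = utf8_prev_char_start(s, len);
    s[len] = '\0';
    return len;
}
```

Applied to "café" (five bytes in UTF-8), one call to utf8_backspace removes both bytes of "é" and leaves "caf". Note that this sketch removes a single code point only; a base letter followed by a combining accent, as in the Unicode quote from the original report, would still take two backspaces here.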