It's been a long time since I looked into Unicode, but this is what I remember.
Depending on the Unicode normalisation level, backspace is *supposed* to remove a letter and all its associated combining marks. The root problem seems to be that some Arabic letters change from "non-combining" to "combining" depending on the language in which they're used. Unicode also has a problem distinguishing a combining letter (vowel points in Arabic or Hebrew) from a combining diacritic (accents in Latin script).

If you think that's a bug in Unicode, you're not alone; the Unicode Consortium has been struggling with this for at least ten years (see https://unicode.org/L2/L2014/14109-inline-chars.pdf). There's been some progress: Unicode version 12 has at least admitted there's a problem (https://www.unicode.org/versions/Unicode12.1.0/ch07.pdf, section 7.9, page 327). I'll leave it to others to survey the current state of play with Unicode, but historically it's been a mess.

-Martin

On Tue, 20 Feb 2024 at 12:26, Avid Seeker <avidseek...@protonmail.com> wrote:
> When pressing backspace on Arabic ligatures (including characters with
> diacritics), they are removed as if they are one character.
>
> Example:
>
> السَّلامُ
>
> Pressing 3 backspaces leaves the word at ال. It removed لا which is a ligature
> combining "ل" and "ا", and removed "م" with diacritics. Compare this with the
> behavior of zsh.
>
> For non-Arabic speakers, this is like typing: fi (U+0046 U+0049), but when
> pressing backspace it removed it as the character: fi (U+FB01).
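
To make the "letter plus its combining marks" idea concrete, here is a rough Python sketch of two backspace policies. It is only an illustration, not what any particular shell or line editor does: the function names are made up, the word is spelled out with escapes in an order I am assuming (seen, fatha, shadda), and "combining mark" is approximated as "non-zero canonical combining class". It also says nothing about the لا ligature, which is a rendering matter rather than a combining sequence.

```python
import unicodedata


def backspace_code_point(text: str) -> str:
    """Hypothetical policy 1: delete only the last code point."""
    return text[:-1]


def backspace_with_marks(text: str) -> str:
    """Hypothetical policy 2: delete the last base character and its marks.

    A "mark" here is any code point with a non-zero canonical combining
    class, which is a simplification of real grapheme cluster rules.
    """
    i = len(text)
    # Walk backwards over trailing combining marks.
    while i > 0 and unicodedata.combining(text[i - 1]) != 0:
        i -= 1
    # Then drop the base character the marks were attached to, if any.
    return text[:max(i - 1, 0)]


# Assumed spelling of the example word: alef, lam, seen + fatha + shadda,
# lam, alef, meem + damma.
word = "\u0627\u0644\u0633\u064E\u0651\u0644\u0627\u0645\u064F"

after_code_point = backspace_code_point(word)  # meem stays, damma is gone
after_with_marks = backspace_with_marks(word)  # meem and damma both go
```

With the second policy, one backspace takes the meem together with its damma, but the following lam and alef still need one backspace each, so it does not reproduce the ligature behaviour in the report.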