Hi,

In the aftermath of D5301, Martin asked me to compile a document on the requirements for complex text input in Plasma, especially with the opportunities provided by the Wayland transition. It makes sense to share this document with all of you, to
= I. What Input Methods do =

Basically, with simple text input (and skipping a few stack steps in this explanation), the user presses a key on the keyboard, it's interpreted according to the active layout, and results in a symbol on screen (modulo complicated ideas like dead keys and the Compose key, which also enter into input method territory as we'll see).

With some writing systems, this isn't enough. They require multiple key presses or pick-and-choose selection steps to produce a symbol. The general idea is to disconnect the input from the output and introduce a conversion in between that's more complex than a mere keyboard layout. The conversion itself may have complex UI like temporary feedback in the text field, popups positioned relative to the text field, or state visualization in shell chrome.

There are other ways to access input methods than just the keyboard these days, too. I'll briefly mention one of them later.

= II. Practical examples of Input Method use =

a) Korean

Korean is written using an alphabet similar to German. It has about the same number of letters, with each letter representing a vowel, a consonant, or a diphthong. On computers, each letter has its own key on the keyboard. As with German, there are multiple keyboard layouts for the Korean alphabet, although one is dominant and considered standard by far.

Unlike the German alphabet however, letters in the Korean alphabet are grouped together into (morpho-)syllabic blocks when written. Each block must start with a consonant and contain a vowel or diphthong. It may optionally end in one or two consonants.

Here are some letters and their corresponding Latin letters (the sounds they most closely correspond to - their locations in the keyboard layout do not match QWERTY/QWERTZ):

ㅎ h
ㅏ a
ㄴ n

In linear order these form the syllable "han".
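To make the block rule concrete, here is a minimal Python sketch of how the three letters above combine into a single block. It leans on Unicode's own composition rules, not on anything ibus-hangul or libhangul actually do, and the helper names (to_positional, compose_block) are made up for illustration. For brevity it assumes at most one final letter is passed in.

```python
import unicodedata


def to_positional(letter: str, role: str) -> str:
    """Map a standalone letter (e.g. ㅎ) to its in-block form.

    role is "CHOSEONG" (leading consonant), "JUNGSEONG" (vowel) or
    "JONGSEONG" (trailing consonant), as used in Unicode character names.
    """
    name = unicodedata.name(letter)  # e.g. "HANGUL LETTER HIEUH"
    return unicodedata.lookup(name.replace("LETTER", role, 1))


def compose_block(initial: str, vowel: str, final: str = "") -> str:
    """Group a consonant, a vowel and an optional final consonant."""
    parts = to_positional(initial, "CHOSEONG") + to_positional(vowel, "JUNGSEONG")
    if final:
        parts += to_positional(final, "JONGSEONG")
    # NFC normalization merges the positional letters into one code point.
    return unicodedata.normalize("NFC", parts)


print(compose_block("ㅎ", "ㅏ", "ㄴ"))  # prints 한
```

A real input method engine does considerably more than this (it has to decide on the fly whether a consonant closes the current block or opens the next one), so treat this only as an illustration of the grouping itself.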
When written properly, these are grouped together:

한 han

In the UI, when pressing the keyboard keys for each letter, the text field contents cycle through these stages:

ㅎ
하
한

As you can see, the existing text is replaced two times, making the operation stateful. The Input Method Engine generates complex events with text payload, state hints, even formatting hints (some IMEs use text color, e.g. color inversion, to communicate state such as "this input is not finalized yet") that are delivered to the application. In Qt, a plugin corresponding to the Input Method (e.g. ibus or fcitx) translates these events into QInputMethodEvent objects that are delivered to widgets and processed there.

The rules of the Korean alphabet additionally have some implications for cursor movement/behavior. E.g. because it's not allowed to start a block with two consonants, and because the number of vowels and ending consonants is limited, as keys are pressed a block might implicitly finish composing and the cursor moves on.

b) Chinese

There are many different strategies for inputting Chinese characters. The most common actually makes use of the Latin alphabet, and keyboard layouts for the Latin alphabet. Chinese characters have assigned sound values (i.e., how a human actually pronounces a character), and there are rule systems to transcribe these using the Latin alphabet, e.g. Pinyin.

As users write some Pinyin using Latin characters, a selection popup will offer a list of Chinese characters matching the input. The user picks one and it's inserted into the text field. Chinese input methods try to be very smart in what characters they offer, taking preceding input, common phrasings, etc. into account, making them highly stateful things.

c) Modes, Input Method overlap

Korean used to be written using Chinese characters (this simple statement is the tip of a large iceberg of complicated history and rulesets :), and especially Korean academic writing still makes use of them.
Korean input methods therefore tend to also offer the ability to type Chinese characters, with a mode toggle between them. Korean input methods also usually have a Hangul (the Korean alphabet) vs. Latin mode toggle.

Chinese input methods tend to have mode toggles for things like choosing between half-width and full-width characters, i.e. they also offer control over typography.

Japanese is a mashup of all of these things, with Japanese users typing in two Japanese-specific syllabaries, Chinese characters and Latin during a typical session.

= Other uses for input methods =

The ever more popular Emoji character set may turn writers of languages which typically do not require an input method into input method users. On Fedora/GNOME systems, pressing Ctrl+Shift+e inserts a @ character into the text field, and typing a string such as "heart" will select among suitable emoji (shown in a popup akin to Chinese character input). The text is underlined during composition and the underline disappears when composition is complete. Under the hood, this is implemented using an ibus input method plugin. Emoji input can also be made available via context menu actions and similar.

Another input method engine that's language-agnostic in basic conception is the increasingly popular "typing booster", which provides word completion and spell check suggestions on the fly. This is closely related to similar features in virtual keyboards.

As this indicates, multiple input methods may also be chained, coexist modally, or just be cycled through.

= The players on the field =

Time for a brief overview of the input method components currently in common use on free systems. Afterwards I'll talk about how we currently interact with and expose these things in Plasma.

a) ibus

ibus is a framework and daemon for input method engines.
The ibus project provides the central daemon, as well as default UIs for managing input method engine plugins and a default frontend for showing a tray icon and input method popups. It also develops a set of input method engine plugins (with a plugin providing e.g. Korean support, or something like the aforementioned typing-booster), although third parties can develop and deploy their own using the public API. The config UI and the tray/popup UI (called a "panel" in ibus parlance) can be replaced by third-party components as well.

b) fcitx

fcitx is a competitor to ibus that covers much of the same ground. As in ibus, there is a central component, engine plugins, a config frontend, and so on. It's worth noting that there is contributor overlap between Plasma and fcitx. I personally know less about fcitx because my distro supports ibus better, but I don't mean to present ibus as the default choice.

c) Others

There's a handful of other entries in this space - solutions focussing on a single language but standalone instead of using the ibus or fcitx frameworks (e.g. the Navi/Nabi input method for Korean), or legacy systems like scim.

Mobile input stacks duplicate much of the work found in ibus and fcitx as well, e.g. Maliit and the Qt Virtual Keyboard both have their own language-specific engine implementations. This is unfortunate, as config and state are not shared between physical and virtual keyboards, and feature set and behavior may differ.

Input method engine plugins for ibus, fcitx and others sometimes rely on the same library stack, e.g. libhangul for Korean.

Input method systems interact with applications and toolkits via protocols like XIM or the Wayland text-input protocol, along with toolkit plugins. Qt offers a public API for input method plugins. Qt 5 bundles plugins for compose key support, ibus and non-X11 platforms. For Qt 4, the ibus plugin is an independent install provided by ibus. fcitx' plugins are an independent install as well (iirc).
= The situation in Plasma 5 right now =

Text input in Plasma 5 is currently handled by the following components:

* A System Settings module offering keyboard layout management
* A dynamic panel indicator for keyboard layout state and management
* A panel applet for key state (Caps Lock, etc.)
* An "Input Method Panel" (kimpanel) widget that provides a state/popup UI frontend to ibus (i.e. a "panel", replacing ibus' default GTK+ panel UI), fcitx and scim
* KWin is smart enough to do cool things like switch keyboard layouts automatically per virtual desktop or even window
* The wide character set support of our recommended default typeface (Noto Sans)

And now the problems start: Once an input method is used, many of these components become useless, unsupported or show ugly integration seams. Additionally, setup often requires heavy distro tooling or expert knowledge.

Here's an example playbook of outfitting an existing, English-input Plasma 5 system to handle Korean input using ibus:

- Install ibus and ibus-hangul.
- Manually add ibus-daemon to the system autostart.
- Manually add the Input Method Panel widget to the panel.
- Use the (GTK+) ibus config UI to manage English and Korean input method engines. The config UI is accessed via the Input Method Panel widget; it's not available in System Settings.
- Use the (GTK+) ibus config UI to manage your keyboard layouts for your input modes. The System Settings keyboard layout module becomes useless when using ibus. While ibus in theory has integration with the xkb "system keyboard layout", in practice things end up fighting each other and the System Settings layout handling has to be disabled.
- The dynamic keyboard layout panel indicator becomes useless.
- KWin-assisted layout switching becomes useless.
- Up until commits on master yesterday that made it a little easier, the user may need expert knowledge to get the Input Method Panel widget to work.
By default ibus will start its default bundled GTK+ panel frontend and Input Method Panel will not work, unless the on-disk configuration is changed or the ibus-daemon is started with the right CLI args. On master Input Method Panel will work, but (due to an ibus bug) the GTK+ panel will co-exist and clutter up the tray.
- The Input Method Panel widget itself is pretty great, but somewhat poorly integrated into the overall Plasma panel UX. It looks and behaves like a second system tray, showing icon buttons for the various features the active input method engine provides (these may change dynamically during input or as the engine is switched), with a distinct system for hiding individual buttons.

Using fcitx, the situation is slightly better but similar. Unlike ibus, fcitx provides a third-party config module that integrates into System Settings. But our bundled modules likewise become mostly useless, as does KWin assist, etc.

The playbook looks better assuming a fresh first-time installation. Here's the best case:

- Select the Korean language while installing $distro.
- $distro takes care of pulling in the packages and setting up autostart.
- $distro takes care of arbitrating between Input Method Panel and upstream UI frontends.
- Plasma on first log-in auto-adds Input Method Panel to the default panel in locales it knows need it.

Even better, on some distros language management tools outside of System Settings (e.g. YaST) can be used to add languages later, and will take care of many of these steps, except for adding Input Method Panel, which the user needs to do manually.

Unfortunately this best case scenario isn't the reality in many distros that ship Plasma 5 currently.

= The situation on Wayland =

Currently, only ibus works on Wayland. Input method popups are not positioned correctly, but otherwise things work. The situation is just as good or bad as on X11.

= Where we need to go =

* Input method use needs to be a primary usage scenario, not an afterthought.
We should take care of initializing the input method system.
* Our keyboard layout management UI needs to become an input language management UI, configuring both input methods and layouts.
* The dynamic keyboard layout indicator and Input Method Panel need to be merged to eliminate redundancy and make the latter dynamic, not requiring the user to know they need to manually add a widget.
* Input Method Panel needs to be integrated with the System Tray widget, reusing its show/hide infra and eliminating inconsistent layout and behavior seams.
* KWin-assist (dynamic layout switching) needs to think in input languages, not keyboard layouts.
* We should surface input method-assisted functionality like Emoji input and typing-booster.
* Physical and virtual keyboards should share a common feature set, common behavior and state.

This is catch-up work; other systems are already there. On the flip side, we have some nice bits to start with (the good Input Method Panel, the powerful tray implementation, generally the power of the Plasma platform).

There's a common position that holds "but computers are used in English, duh". But even a hardcore developer audience that does use a non-localized system has a need to socialize with people in their native writing system. And unfortunately it's especially on those systems (not initially installed in the native language) that the biggest pain points lie.

This work is incredibly vital for making Plasma accessible to hundreds of millions of additional potential users. It's incredibly vital for our mission to make free software alternatives available to more people, and strongly aligned with the Inclusivity goal in the KDE manifesto.

= Screenshots =

Here are some complementary screenshots of Input Method Panel on X11, with the Korean input method engine active:

http://imgur.com/a/F2d9Y

As you can see, the poorly positioned context menus are another entry in the "integration seams" list.
And some screenshots of ibus configuration:

http://imgur.com/a/EIW3u

Cheers,
Eike