Hi Jan Arne!, Chiming in, and kinda late at that... hopefully we'll get this moving :).
First of all, I'm aware that some of my comments are directed towards stuff that's unchanged between v1 and v2, so please bear with me, I hope the feedback is useful. On Mon, May 30, 2016 at 11:41 AM, Jan Arne Petersen <[email protected]> wrote: > There were some shortcomings in the first version of the protocol which > makes it not really useful in real world applications. It is not really > possible to fix them in a compatible way so introduce a new v2 of the > protocol. > > Fixes some shortcomings of the first version: > > * Use only one wp_text_input per wl_seat (client side should be > handled by client toolkit) > * Allow focus tracking without wl_keyboard present > * Improve update state handling and better define state handling > --- > Changes to v6: > * Fix some typos > > Makefile.am | 1 + > unstable/text-input/text-input-unstable-v2.xml | 480 > +++++++++++++++++++++++++ > 2 files changed, 481 insertions(+) > create mode 100644 unstable/text-input/text-input-unstable-v2.xml > > diff --git a/Makefile.am b/Makefile.am > index 71d2632..cc8d531 100644 > --- a/Makefile.am > +++ b/Makefile.am > @@ -3,6 +3,7 @@ unstable_protocols = > \ > unstable/fullscreen-shell/fullscreen-shell-unstable-v1.xml > \ > unstable/linux-dmabuf/linux-dmabuf-unstable-v1.xml > \ > unstable/text-input/text-input-unstable-v1.xml > \ > + unstable/text-input/text-input-unstable-v2.xml > \ > unstable/input-method/input-method-unstable-v1.xml > \ > unstable/xdg-shell/xdg-shell-unstable-v5.xml > \ > unstable/relative-pointer/relative-pointer-unstable-v1.xml > \ > diff --git a/unstable/text-input/text-input-unstable-v2.xml > b/unstable/text-input/text-input-unstable-v2.xml > new file mode 100644 > index 0000000..ea35d9b > --- /dev/null > +++ b/unstable/text-input/text-input-unstable-v2.xml > @@ -0,0 +1,480 @@ > +<?xml version="1.0" encoding="UTF-8"?> > + > +<protocol name="text_input_unstable_v2"> > + <copyright> > + Copyright © 2012, 2013 Intel Corporation > + Copyright © 2015, 2016 Jan Arne Petersen > + > + Permission to use, copy, modify, distribute, and sell this > + software and its documentation for any purpose is hereby granted > + without fee, provided that the above copyright notice appear in > + all copies and that both that copyright notice and this permission > + notice appear in supporting documentation, and that the name of > + the copyright holders not be used in advertising or publicity > + pertaining to distribution of the software without specific, > + written prior permission. The copyright holders make no > + representations about the suitability of this software for any > + purpose. It is provided "as is" without express or implied > + warranty. > + > + THE COPYRIGHT HOLDERS DISCLAIM ALL WARRANTIES WITH REGARD TO THIS > + SOFTWARE, INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND > + FITNESS, IN NO EVENT SHALL THE COPYRIGHT HOLDERS BE LIABLE FOR ANY > + SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES > + WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN > + AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, > + ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF > + THIS SOFTWARE. > + </copyright> > + > + <interface name="zwp_text_input_v2" version="1"> > + <description summary="text input"> > + The zwp_text_input_v2 interface represents text input and input methods > + associated with a seat. It provides enter/leave events to follow the > + text input focus for a seat. > + > + Requests are used to enable/disable the text-input object and set > + state information like surrounding and selected text or the content > type. > + The information about the entered text is sent to the text-input object > + via the pre-edit and commit events. Using this interface removes the > need > + for applications to directly process hardware key events and compose > text > + out of them. > + > + Text is valid UTF-8 encoded, indices and lengths are in bytes. Indices > + have to always point to the first byte of an UTF-8 encoded code point. > + Lengths are not allowed to contain just a part of an UTF-8 encoded code > + point. > + > + State is sent by the state requests (set_surrounding_text, > + set_content_type, set_cursor_rectangle and set_preferred_language) and > + an update_state request. After an enter or an input_method_change event > + all state information is invalidated and needs to be resent from the > + client. A reset or entering a new widget on client side also > + invalidates all current state information. > + </description> > + > + <request name="destroy" type="destructor"> > + <description summary="Destroy the wp_text_input"> > + Destroy the wp_text_input object. Also disables all surfaces enabled > + through this wp_text_input object > + </description> > + </request> > + > + <request name="enable"> > + <description summary="enable text input for surface"> > + Enable text input in a surface (usually when a text entry inside of it > + has focus). > + > + This can be called before or after a surface gets text (or keyboard) > + focus via the enter event. Text input to a surface is only active > + when it has the current text (or keyboard) focus and is enabled. > + </description> > + <arg name="surface" type="object" interface="wl_surface"/> > + </request> The "can be called before a surface gets text focus" bit concerns me a bit. Can this be called on several surfaces? Is the compositor expected to implement persistence? if that's the case, what's the benefit if only one surface can be active+focused at a given time? > + > + <request name="disable"> > + <description summary="disable text input for surface"> > + Disable text input in a surface (typically when there is no focus on > any > + text entry inside the surface). > + </description> > + <arg name="surface" type="object" interface="wl_surface"/> > + </request> > + > + <request name="show_input_panel"> > + <description summary="show input panels"> > + Requests input panels (virtual keyboard) to show. > + > + This should be used for example to show a virtual keyboard again > + (with a tap) after it was closed by pressing on a close button on the > + keyboard. > + </description> > + </request> If I'm reading the spec correctly, it looks like the input panel can end up visible on two situations: 1) after the .enter event, and after a .enable request happened on the surface 2) after the user hid the keyboard (or the app requested .hide_input_panel) and .show_input_panel happens So, I wonder if this (and .hide_input_panel) can't be folded with the previous two calls :). In order to show the input panel as soon as a surface gets focus it'd require an explicit .enable/.show_input_panel call, but that IMHO fits well with the protocol in other places. > + > + <request name="hide_input_panel"> > + <description summary="hide input panels"> > + Requests input panels (virtual keyboard) to hide. > + </description> > + </request> > + > + <request name="set_surrounding_text"> > + <description summary="sets the surrounding text"> > + Sets the plain surrounding text around the input position. Text is > + UTF-8 encoded. Cursor is the byte offset within the surrounding text. > + Anchor is the byte offset of the selection anchor within the > + surrounding text. If there is no selected text, anchor is the same as > + cursor. > + > + Make sure to always send some text before and after the cursor > + except when the cursor is at the beginning or end of text. > + > + When there was a configure_surrounding_text event take the > + before_cursor and after_cursor arguments into account for picking how > + much surrounding text to send. > + > + There is a maximum length of wayland messages so text can not be > + longer than 4000 bytes. > + </description> > + <arg name="text" type="string"/> > + <arg name="cursor" type="int"/> > + <arg name="anchor" type="int"/> Minor nitpick, would be great to have short summaries for arguments, esp. the ones that aren't self-explaining. > + </request> > + > + <enum name="content_hint" bitfield="true"> > + <description summary="content hint"> > + Content hint is a bitmask to allow to modify the behavior of the text > + input. > + </description> > + <entry name="none" value="0x0" summary="no special behaviour"/> > + <entry name="auto_completion" value="0x1" summary="suggest word > completions"/> > + <entry name="auto_correction" value="0x2" summary="suggest word > corrections"/> > + <entry name="auto_capitalization" value="0x4" summary="switch to > uppercase letters at the start of a sentence"/> > + <entry name="lowercase" value="0x8" summary="prefer lowercase > letters"/> > + <entry name="uppercase" value="0x10" summary="prefer uppercase > letters"/> > + <entry name="titlecase" value="0x20" summary="prefer casing for titles > and headings (can be language dependent)"/> > + <entry name="hidden_text" value="0x40" summary="characters should be > hidden"/> > + <entry name="sensitive_data" value="0x80" summary="typed text should > not be stored"/> > + <entry name="latin" value="0x100" summary="just latin characters > should be entered"/> > + <entry name="multiline" value="0x200" summary="the text input is > multiline"/> > + </enum> > + > + <enum name="content_purpose"> > + <description summary="content purpose"> > + The content purpose allows to specify the primary purpose of a text > + input. > + > + This allows an input method to show special purpose input panels with > + extra characters or to disallow some characters. > + </description> > + <entry name="normal" value="0" summary="default input, allowing all > characters"/> > + <entry name="alpha" value="1" summary="allow only alphabetic > characters"/> > + <entry name="digits" value="2" summary="allow only digits"/> > + <entry name="number" value="3" summary="input a number (including > decimal separator and sign)"/> > + <entry name="phone" value="4" summary="input a phone number"/> > + <entry name="url" value="5" summary="input an URL"/> > + <entry name="email" value="6" summary="input an email address"/> > + <entry name="name" value="7" summary="input a name of a person"/> > + <entry name="password" value="8" summary="input a password (combine > with password or sensitive_data hint)"/> > + <entry name="date" value="9" summary="input a date"/> > + <entry name="time" value="10" summary="input a time"/> > + <entry name="datetime" value="11" summary="input a date and time"/> > + <entry name="terminal" value="12" summary="input for a terminal"/> > + </enum> > + > + <request name="set_content_type"> > + <description summary="set content purpose and hint"> > + Sets the content purpose and content hint. While the purpose is the > + basic purpose of an input field, the hint flags allow to modify some > + of the behavior. > + > + When no content type is explicitly set, a normal content purpose with > + none hint should be assumed. > + </description> > + <arg name="hint" type="uint" enum="content_hint"/> > + <arg name="purpose" type="uint" enum="content_purpose"/> > + </request> It should be probably mentioned in this request that compositors are free to ignore hint, purpose, or invalid combinations of those. This probably means that it should be stated somewhere that clients should handle/ignore unexpected input too... > + > + <request name="set_cursor_rectangle"> > + <description summary="set cursor position"> > + Sets the cursor outline as a x, y, width, height rectangle in surface > + local coordinates. > + > + Allows the compositor to put a window with word suggestions near the > + cursor. > + </description> > + <arg name="x" type="int"/> > + <arg name="y" type="int"/> > + <arg name="width" type="int"/> > + <arg name="height" type="int"/> > + </request> These coordinates are relative to which surface? the one we last got the .enter event on? I have the feeling the surface should be an argument to this request, in order to ease checks in the compositors and remove ambiguity. > + > + <request name="set_preferred_language"> > + <description summary="sets preferred language"> > + Sets a specific language. This allows for example a virtual keyboard > to > + show a language specific layout. The "language" argument is a RFC-3066 > + format language tag. > + > + It could be used for example in a word processor to indicate language > of > + currently edited document or in an instant message application which > + tracks languages of contacts. > + </description> > + <arg name="language" type="string"/> > + </request> I wonder... can this just be inferred from the desktop locale/current IM? Does it make a practical difference to the casual app launched with another LANG= to just use the .language request in order to force it? > + > + <enum name="update_state"> > + <description summary="update_state flags"> > + Defines the reason for sending an updated state. I actually miss some further description of the fields here... > + </description> > + <entry name="change" value="0" summary="updated state because it > changed"/> > + <entry name="full" value="1" summary="full state after enter or > input_method_changed event"/> > + <entry name="reset" value="2" summary="full state after reset"/> > + <entry name="enter" value="3" summary="full state after switching > focus to a different widget on client side"/> > + </enum> > + > + <request name="update_state"> > + <description summary="update state"> > + Allows to atomically send state updates from client. > + > + This request should follow after a batch of state updating requests > + like set_surrounding_text, set_content_type, set_cursor_rectangle and > + set_preferred_language. > + > + The flags field indicates why an updated state is sent to the input > + method. The field is called "reason" below, and doesn't seem to be a flag set :). > + > + Reset should be used by an editor widget after the text was changed > + outside of the normal input method flow. > + > + For "change" it is enough to send the changed state, else the full > + state should be send. Oh, you explain the enum values here, somewhat :). > + > + Serial should be set to the serial from the last enter or > + input_method_changed event. > + > + To make sure to not receive outdated input method events after a > + reset or switching to a new widget wl_display_sync() should be used > + after update_state in these cases. > + </description> > + <arg name="serial" type="uint" summary="serial of the enter or > input_method_changed event"/> > + <arg name="reason" type="uint" enum="update_state"/> > + </request> Ok... what is unclear to me (and should be at least documented) is the usefulness of the reason argument on both sides of the pipe. You sort of mention the situations when they should be expected to be emitted on the client, but there's no mentions as to what should the compositor do with these flags. Is it presumably expected to coalesce stuff or ignore unnecessary changes with these? if it's the latter case, should it ignore data received from a client request even if the flags don't match? Also, what's the minimal set of requests a client should perform prior to each of those reasons? It also seems to me like some of those reason values are there to reaffirm the compositor, it already knows whether all state was dropped because of a focus switch, or if the input method changed. However, it will send the respective events and the client must respond with the expected requests, plus this one with the corresponding expected flag. It sounds like extra room for misunderstandings. Otoh, there's basically one single reason to trigger this without compositor notice: focus changes inside the client. All in all, if there's a set of requests that should be taking effect at once, IMHO it actually sounds best to have that data double-buffered. I realize that there previously was a "commit" request in v1 that was replaced by this one, however there's no mention in any of both versions to requests being honored in a delayed manner. > + > + <event name="enter"> > + <description summary="enter event"> > + Notification that this seat's text-input focus is on a certain > surface. > + > + When the seat has the keyboard capability the text-input focus follows > + the keyboard focus. > + </description> > + <arg name="serial" type="uint" summary="serial to be used by > update_state"/> > + <arg name="surface" type="object" interface="wl_surface"/> > + </event> > + > + <event name="leave"> > + <description summary="leave event"> > + Notification that this seat's text-input focus is no longer on > + a certain surface. > + > + The leave notification is sent before the enter notification > + for the new focus. > + > + When the seat has the keyboard capability the text-input focus follows > + the keyboard focus. > + </description> > + <arg name="serial" type="uint"/> > + <arg name="surface" type="object" interface="wl_surface"/> > + </event> > + > + <enum name="input_panel_visibility"> > + <entry name="hidden" value="0" > + summary="the input panel (virtual keyboard) is hidden"/> > + <entry name="visible" value="1" > + summary="the input panel (virtual keyboard) is visible"/> > + </enum> > + > + <event name="input_panel_state"> > + <description summary="state of the input panel"> > + Notification that the visibility of the input panel (virtual keyboard) > + changed. > + > + The rectangle x, y, width, height defines the area overlapped by the > + input panel (virtual keyboard) on the surface having the text > + focus in surface local coordinates. > + > + That can be used to make sure widgets are visible and not covered by > + a virtual keyboard. > + </description> > + <arg name="state" type="uint" enum="input_panel_visibility"/> > + <arg name="x" type="int"/> > + <arg name="y" type="int"/> > + <arg name="width" type="int"/> > + <arg name="height" type="int"/> > + </event> I think I recognize the inspiration for this change :). However, do we really need passing the overlap coordinates in wayland? The compositor might well reconfigure the surface as it sees fit. Furthermore, the compositor already knows the current cursor position from the .set_cursor_rectangle request, so might do some smart placement as long as the surface size doesn't change (and thus does the cursor location) For fullscreen surfaces, I think it's better just have the compositor resize them in order to accommodate the input panel. This way things look fine even with clients that didn't commit full-on with text input v2. A compositor-side concern I have with this request is that it embeds UI state in the protocol. I'm aware that's partly the point, but some cases I can think of where this becomes ambiguous are: - input panes sliding from an edge. Presumably you can emit this event once per frame, but feels finicky... - split input pane (eg. osk in wide touchscreen in landscape position) As for client/toolkit-side concerns, tbh I've got quite a few :(... This model of keeping focus visible while typing only makes sense on a very certain set of UIs, mostly document editors, browsers, and others that still have plenty of room to relocate the focus on a visible area. It also pushes clients to handle all sorts of corner cases, are toolkits expected to handle overscrolling when the cursor is near the end of the document? what happens if the input panel covers entirely the focused text area? what happens if the input panel covers the entire client surface? IMHO, all those questions are better answered by having the compositor move/resize surfaces when the input panel is shown, we'd at least be using well tested mechanisms to keep the relevant bits of the UI in sight :), so I'd argue to remove this event altogether. > + > + <event name="preedit_string"> > + <description summary="pre-edit"> > + Notify when a new composing text (pre-edit) should be set around the > + current cursor position. Any previously set composing text should > + be removed. > + > + The commit text can be used to replace the composing text in some > cases > + (for example when losing focus). > + > + The text input should also handle all preedit_style and preedit_cursor > + events occurring directly before preedit_string. > + </description> > + <arg name="text" type="string"/> > + <arg name="commit" type="string"/> > + </event> > + > + <enum name="preedit_style"> > + <entry name="default" value="0" summary="default style for composing > text"/> > + <entry name="none" value="1" summary="composing text should be shown > the same as non-composing text"/> > + <entry name="active" value="2" summary="composing text might be bold"/> > + <entry name="inactive" value="3" summary="composing text might be > cursive"/> > + <entry name="highlight" value="4" summary="composing text might have a > different background color"/> > + <entry name="underline" value="5" summary="composing text might be > underlined"/> > + <entry name="selection" value="6" summary="composing text should be > shown the same as selected text"/> > + <entry name="incorrect" value="7" summary="composing text might be > underlined with a red wavy line"/> > + </enum> > + > + <event name="preedit_styling"> > + <description summary="pre-edit styling"> > + Sets styling information on composing text. The style is applied for > + length bytes from index relative to the beginning of the composing > + text (as byte offset). Multiple styles can be applied to a composing > + text by sending multiple preedit_styling events. > + > + This event is handled as part of a following preedit_string event. > + </description> > + <arg name="index" type="uint"/> > + <arg name="length" type="uint"/> > + <arg name="style" type="uint" enum="preedit_style"/> > + </event> Is there any event that can be used to enclose the possible burst of .preedit_styling events? How does the client know when/whether no more of these events will be sent? In other protocols this is usually handled by a .done event, having a .preedit_done event or somesuch would be nice to have here. > + > + <event name="preedit_cursor"> > + <description summary="pre-edit cursor"> > + Sets the cursor position inside the composing text (as byte > + offset) relative to the start of the composing text. When index is a > + negative number no cursor is shown. > + > + When no preedit_cursor event is sent the cursor will be at the end of > + the composing text by default. > + > + This event is handled as part of a following preedit_string event. > + </description> > + <arg name="index" type="int"/> > + </event> > + > + <event name="commit_string"> > + <description summary="commit"> > + Notify when text should be inserted into the editor widget. The text > to > + commit could be either just a single character after a key press or > the > + result of some composing (pre-edit). It could be also an empty text > + when some text should be removed (see delete_surrounding_text) or when > + the input cursor should be moved (see cursor_position). > + > + Any previously set composing text should be removed. > + </description> > + <arg name="text" type="string"/> > + </event> > + > + <event name="cursor_position"> > + <description summary="set cursor to new position"> > + Notify when the cursor or anchor position should be modified. Cursor > + and anchor are set relative to the position at end of commit string > + (in bytes). Default is 0 for index and anchor. > + > + This event should be handled as part of a following commit_string > + event. > + > + The text between anchor and index should be selected. > + </description> > + <arg name="index" type="int" summary="position of cursor relative to > end of inserted commit string"/> > + <arg name="anchor" type="int" summary="position of selection anchor > relative to end of inserted commit string"/> > + </event> Is there no provision for cursor position changes without a string being committed? A case I can think of are terminals with an osk, you can't just tap anywhere, so an osk with the "terminal" hint might want to add cursor keys. > + > + <event name="delete_surrounding_text"> > + <description summary="delete surrounding text"> > + Notify when the text around the current cursor position should be > + deleted. Before_length and after_length is the length (in bytes) of > text > + before and after the current cursor position (excluding the selection) > + to delete. > + > + This event should be handled as part of a following commit_string > + or preedit_string event. > + </description> > + <arg name="before_length" type="uint" summary="length of text before > current cursor positon"/> > + <arg name="after_length" type="uint" summary="length of text after > current cursor positon"/> Typo, missing 'i' in "position" on both arguments. > + </event> > + > + <event name="modifiers_map"> > + <description summary="modifiers map"> > + Transfer an array of 0-terminated modifiers names. The position in > + the array is the index of the modifier as used in the modifiers > + bitmask in the keysym event. > + </description> > + <arg name="map" type="array"/> > + </event> It should probably state that you'll receive this after the .enter event. > + > + <event name="keysym"> > + <description summary="keysym"> > + Notify when a key event was sent. Key events should not be used > + for normal text input operations, which should be done with > + commit_string, delete_surrounding_text, etc. The key event follows The wording here seems a bit off... what's the "key event" it mentions, this same event? in that case the first phrase reads a bit like "notifies about itself". It's also somewhat unclear to me what "normal text input operations" are, multichar input? any printable char? Should an OSK implementation use this event or the other ones? should it use both if it has both printable and non-printable (eg. tab) keys? IMHO there should be some examples about when to use which. > + the wl_keyboard key event convention. Sym is a XKB keysym, state a I wonder if it wouldn't be more consistent here if we used evcodes like wl_keyboard.key, instead of XKB keysyms. Having different pieces of the protocol talk slightly different conventions strikes me as odd. > + wl_keyboard key_state. Modifiers are a mask for effective modifiers > + (where the modifier indices are set by the modifiers_map event) > + </description> > + <arg name="time" type="uint"/> > + <arg name="sym" type="uint"/> > + <arg name="state" type="uint"/> > + <arg name="modifiers" type="uint"/> > + </event> > + > + <event name="language"> > + <description summary="language"> > + Sets the language of the input text. The "language" argument is a > RFC-3066 > + format language tag. > + </description> > + <arg name="language" type="string"/> > + </event> Hmm, we've got a request and an event for this. Presumably the compositor sends this after .enter and after every .language request from the client? Are there situations where the compositor might want to have the last say? should the client shut up and follow? the bidirectionality here makes it a bit unclear on who takes precedence :). > + > + <enum name="text_direction"> > + <entry name="auto" value="0" summary="automatic text direction based > on text and language"/> > + <entry name="ltr" value="1" summary="left-to-right"/> > + <entry name="rtl" value="2" summary="right-to-left"/> > + </enum> > + > + <event name="text_direction"> > + <description summary="text direction"> > + Sets the text direction of input text. > + > + It is mainly needed for showing input cursor on correct side of the > + editor when there is no input yet done and making sure neutral > + direction text is laid out properly. > + </description> > + <arg name="direction" type="uint" enum="text_direction"/> > + </event> The description doesn't sound very convincing :)... Can't this be eg. inferred from the .language event or the locale? > + > + <event name="configure_surrounding_text"> > + <description summary="configure amount of surrounding text to be sent"> > + Configure what amount of surrounding text is expected by the > + input method. The surrounding text will be sent in the > + set_surrounding_text request on the following state information > updates. > + </description> > + <arg name="before_cursor" type="int"/> > + <arg name="after_cursor" type="int"/> > + </event> Hmm, maybe "configure" is not the best term? this is a "reply me back with data" event, so perhaps "request" is better? > + > + <event name="input_method_changed"> > + <description summary="Notifies about a changed input method"> > + The input method changed on compositor side, which invalidates all > + current state information. New state information should be sent from > + the client via state requests (set_surrounding_text, > + set_content_hint, ...) and update_state. IMO the expected requests perfom should be clearly stated here, i.e. no ellipsis. > + </description> > + <arg name="serial" type="uint" summary="serial to be used by > update_state"/> > + <arg name="flags" type="uint" summary="currently unused"/> Won't argue hard against this, but after all this is an unstable protocol, this argument can be always added in future versions if it turns out needed :). Cheers, Carlos _______________________________________________ wayland-devel mailing list [email protected] https://lists.freedesktop.org/mailman/listinfo/wayland-devel
