Re: why not flow control in wl_connection_flush?
On Wed, 21 Feb 2024 11:08:02 -0500
jleivent wrote:

> Not completely blocking makes sense for the compositor, but why not
> block the client?

Blocking in clients is indeed less of a problem, but:

- Clients do not usually have requests they *have to* send to the
  compositor even if the compositor is not responding in a timely
  manner, unlike the input events that compositors have; a client can
  spam surfaces all it wants, but it is just throwing work away if it
  does so faster than the screen can update. So there is some built-in
  expectation that clients control their sending.

- I think the Wayland design wants to give full asynchronicity to
  clients as well, never blocking them unless they explicitly choose
  to wait for an event. A client might have semi-real-time
  responsibilities as well.

- A client's send buffer could be infinite. If a client chooses to
  send requests so fast it hits OOM, it is just DoS'ing itself.

> For the compositor, wouldn't a timeout in the sendmsg make sense?

That would create both problems: slight blocking multiplied by the
number of (stalled) clients, and overflows. That could lead to a
jittery user experience while not eliminating the overflow problem.


Thanks,
pq

> On Wed, 21 Feb 2024 16:39:08 +0100
> Olivier Fourdan wrote:
>
> > Hi,
> >
> > On Wed, Feb 21, 2024 at 4:21 PM jleivent wrote:
> >
> > > I've been looking into some of the issues about allowing the
> > > socket's kernel buffer to run out of space, and was wondering why
> > > not simply remove MSG_DONTWAIT from the sendmsg call in
> > > wl_connection_flush? That should implement flow control by having
> > > the sender thread wait until the receiver has emptied the socket's
> > > buffer sufficiently.
> > >
> > > It seems to me that using an unbounded buffer could cause memory
> > > resource problems on whichever end was using that buffer.
> > >
> > > Was removing MSG_DONTWAIT from the sendmsg call considered and
> > > abandoned for some reason?
> > >
> >
> > See this thread [1] from 2012, it might give some hint on why
> > MSG_DONTWAIT was added with commit b26774da [2].
> >
> > HTH
> > Olivier
> >
> > [1]
> > https://lists.freedesktop.org/archives/wayland-devel/2012-February/002394.html
> > [2] https://gitlab.freedesktop.org/wayland/wayland/-/commit/b26774da
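To make the MSG_DONTWAIT behaviour discussed in this thread concrete, here is a minimal sketch in C. It is not libwayland code: fill_socket_nonblocking() is a hypothetical helper, and it only assumes a connected AF_UNIX stream socket whose peer is not reading. With MSG_DONTWAIT, sendmsg() fails with EAGAIN/EWOULDBLOCK once the kernel socket buffer is full instead of putting the caller to sleep, which is the point at which a sender has to buffer the data itself or give up.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/*
 * Hypothetical helper (not part of libwayland): push fixed-size chunks
 * into `fd` with MSG_DONTWAIT until the kernel refuses to queue more.
 * Returns the number of bytes queued, or -1 on an unexpected error.
 * *would_block is set to 1 if the loop stopped on EAGAIN/EWOULDBLOCK.
 */
long fill_socket_nonblocking(int fd, int *would_block)
{
    char chunk[4096];
    long total = 0;

    assert(would_block != NULL);
    memset(chunk, 0, sizeof chunk);
    *would_block = 0;

    for (;;) {
        struct iovec iov = { .iov_base = chunk, .iov_len = sizeof chunk };
        struct msghdr msg = { .msg_iov = &iov, .msg_iovlen = 1 };
        ssize_t n = sendmsg(fd, &msg, MSG_DONTWAIT | MSG_NOSIGNAL);

        if (n < 0) {
            if (errno == EINTR)
                continue;               /* interrupted, just retry */
            if (errno == EAGAIN || errno == EWOULDBLOCK) {
                *would_block = 1;       /* kernel buffer is full */
                return total;
            }
            return -1;                  /* e.g. EPIPE: peer went away */
        }
        total += n;
    }
}
```

Called on one end of a socketpair() whose other end is never read, the helper should return a positive byte count with *would_block set to 1. On a blocking socket, removing MSG_DONTWAIT from the flags would instead make the same call sleep until the peer drains the socket, which is the flow-control behaviour asked about above.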
Re: why not flow control in wl_connection_flush?
Thanks for this response. I am considering adding unbounded buffering
to my Wayland middleware project, and wanted to consider the flow
control options first. Walking through the reasoning here is very
helpful. I didn't know that there was a built-in expectation that
clients would do some of their own flow control. I was also operating
under the assumption that blocking flushes from the compositor to one
client would not have an impact on other clients (I was assuming an
appropriate threading model in compositors).

The client OOM issue, though: a malicious client can do all kinds of
things to try to cause a DoS, and moving towards OOM would accomplish
that as well on systems where thrashing carries a large enough speed
penalty. A buggy client that isn't trying to do anything malicious,
but is trapped in a send loop, would be a case where making it wait
might be better than allowing it to move towards OOM (and thrash).

On Thu, 22 Feb 2024 11:52:28 +0200
Pekka Paalanen wrote:

> On Wed, 21 Feb 2024 11:08:02 -0500
> jleivent wrote:
>
> > Not completely blocking makes sense for the compositor, but why not
> > block the client?
>
> Blocking in clients is indeed less of a problem, but:
>
> - Clients do not usually have requests they *have to* send to the
>   compositor even if the compositor is not responding in a timely
>   manner, unlike the input events that compositors have; a client can
>   spam surfaces all it wants, but it is just throwing work away if it
>   does so faster than the screen can update. So there is some built-in
>   expectation that clients control their sending.
>
> - I think the Wayland design wants to give full asynchronicity to
>   clients as well, never blocking them unless they explicitly choose
>   to wait for an event. A client might have semi-real-time
>   responsibilities as well.
>
> - A client's send buffer could be infinite. If a client chooses to
>   send requests so fast it hits OOM, it is just DoS'ing itself.
>
> > For the compositor, wouldn't a timeout in the sendmsg make sense?
>
> That would create both problems: slight blocking multiplied by the
> number of (stalled) clients, and overflows. That could lead to a
> jittery user experience while not eliminating the overflow problem.
>
>
> Thanks,
> pq
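The "timeout in the sendmsg" idea quoted above can be sketched as follows. sendmsg() itself takes no timeout argument, so this sketch assumes one way of approximating it: wait for the socket to become writable with poll() for at most timeout_ms, then send without blocking. send_with_timeout() is a hypothetical name, not a libwayland function. It also shows why the objection holds: every stalled peer costs the caller up to timeout_ms, and after the timeout the unsent data still has to go somewhere.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/*
 * Hypothetical helper: try to send `len` bytes, waiting at most
 * `timeout_ms` for the socket to become writable.
 * Returns the number of bytes sent, 0 if the socket was still full
 * when the timeout expired, or -1 on error (errno is set).
 */
ssize_t send_with_timeout(int fd, const void *data, size_t len, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLOUT };
    struct iovec iov = { .iov_base = (void *) data, .iov_len = len };
    struct msghdr msg = { .msg_iov = &iov, .msg_iovlen = 1 };
    int ready;

    assert(data != NULL && len > 0);

    /* Note: a retry after EINTR restarts the full timeout. */
    do {
        ready = poll(&pfd, 1, timeout_ms);
    } while (ready < 0 && errno == EINTR);

    if (ready < 0)
        return -1;
    if (ready == 0)
        return 0;       /* peer is stalled: caller must buffer or drop */

    /* Still non-blocking: the buffer may have filled up again. */
    return sendmsg(fd, &msg, MSG_DONTWAIT | MSG_NOSIGNAL);
}
```

A single-threaded compositor that flushed N stalled clients this way would spend up to N times timeout_ms in one pass without delivering anything, which is the "slight blocking multiplied by the number of (stalled) clients" described in the reply.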
Wayland debugging with Qtwayland, gstreamer waylandsink, wayland-lib and Weston
Hi,

We are developing a video processing system that runs on an NXP imx8
processor using a Yocto embedded Linux system that has Qt6, GStreamer,
Wayland and Weston.

We are having a problem displaying the video stream from GStreamer on a
QWidget. In the past we had this working with Qt5 and older GStreamer,
Wayland and Weston.

A simple test program also shows the issue on Fedora 37 with Qt6 and
KDE/Plasma/Wayland.

The technique we are using is to get the Wayland surface that the
QWidget is using (it has been configured to use Qt::WA_NativeWindow)
and pass this to GStreamer's waylandsink, which should then update this
surface with video frames (via hardware). This works when the QWidget
is a top-level window widget (QWidget(0)), but if this QWidget is below
others in the hierarchy no video is seen and the GStreamer pipeline is
stalled.

It appears that waylandsink does:

Creates a surface callback:

    callback = wl_surface_frame (surface);
    wl_callback_add_listener (callback, &frame_callback_listener, self);

Then adds a buffer to a surface:

    gst_wl_buffer_attach (buffer, priv->video_surface_wrapper);
    wl_surface_set_buffer_scale (priv->video_surface_wrapper, priv->scale);
    wl_surface_damage_buffer (priv->video_surface_wrapper, 0, 0,
        G_MAXINT32, G_MAXINT32);
    wl_surface_commit (priv->video_surface_wrapper);

But it never gets a callback and just sits in a loop awaiting that
callback.

I assume that the surface waylandsink is using, which is created using
the original QWidget surface (sub-surface? with window?), is not
"active" for some reason.

I am trying to debug this, but this graphics stack is quite complicated
with waylandsink, qtwayland, wayland-lib and Weston, not to mention the
NXP hardware levels. My thoughts are that it is something qtwayland is
doing with the surface stack, or thread locking issues (GStreamer uses
separate threads). I also don't understand Wayland or Weston in detail.
So some questions:

1. Has anyone seen something like this?

2. Does anyone have any idea where to look?

3. Given the wl_surface in the Qt app or in waylandsink, is there a way
   I can print out its state and the surface hierarchy easily?

4. Any ideas on debug methods to use?

Cheers

Terry
Re: Wayland debugging with Qtwayland, gstreamer waylandsink, wayland-lib and Weston
Hi,

On Thu, Feb 22, 2024 at 03:21:01PM +, Terry Barnaby wrote:
> Hi,
>
> We are developing a video processing system that runs on an NXP imx8
> processor using a Yocto embedded Linux system that has Qt6, GStreamer,
> Wayland and Weston.
>
> We are having a problem displaying the video stream from GStreamer on a
> QWidget. In the past we had this working with Qt5 and older GStreamer,
> Wayland and Weston.
>
> A simple test program also shows the issue on Fedora 37 with Qt6 and
> KDE/Plasma/Wayland.

If this also happens on a desktop with the same Qt version and other
compositors, I'm tempted to say it is an issue with Qt rather than with
waylandsink or the compositor. Note that NXP ships its own modified
Weston version.

>
> The technique we are using is to get the Wayland surface that the
> QWidget is using (it has been configured to use Qt::WA_NativeWindow)
> and pass this to GStreamer's waylandsink, which should then update this
> surface with video frames (via hardware). This works when the QWidget
> is a top-level window widget (QWidget(0)), but if this QWidget is below
> others in the hierarchy no video is seen and the GStreamer pipeline is
> stalled.

So the assumption is that there aren't other widgets obscuring this
one when you move it below others?

>
> It appears that waylandsink does:
>
> Creates a surface callback:
>
>     callback = wl_surface_frame (surface);
>     wl_callback_add_listener (callback, &frame_callback_listener, self);
>
> Then adds a buffer to a surface:
>
>     gst_wl_buffer_attach (buffer, priv->video_surface_wrapper);
>     wl_surface_set_buffer_scale (priv->video_surface_wrapper, priv->scale);
>     wl_surface_damage_buffer (priv->video_surface_wrapper, 0, 0,
>         G_MAXINT32, G_MAXINT32);
>     wl_surface_commit (priv->video_surface_wrapper);
>
> But it never gets a callback and just sits in a loop awaiting that
> callback.
>
> I assume that the surface waylandsink is using, which is created using
> the original QWidget surface (sub-surface? with window?), is not
> "active" for some reason.

Possibly, when the QWidget is lower in the hierarchy, it becomes a
child of a parent as described in
https://wayland.app/protocols/xdg-shell#xdg_toplevel:request:set_parent,
so I assume it has a different surface than the parent one. This would
be easy to determine with WAYLAND_DEBUG. It seems unlikely to be itself
a sub-surface of a surface.

>
> I am trying to debug this, but this graphics stack is quite complicated
> with waylandsink, qtwayland, wayland-lib and Weston, not to mention the
> NXP hardware levels. My thoughts are that it is something qtwayland is
> doing with the surface stack, or thread locking issues (GStreamer uses
> separate threads). I also don't understand Wayland or Weston in detail.
> So some questions:
>
> 1. Has anyone seen something like this?

Someone else reported something similar, except that causing damage or
moving the pointer made the video sub-surface show up:
https://gitlab.freedesktop.org/wayland/weston/-/issues/843.

>
> 2. Does anyone have any idea where to look?
>
> 3. Given the wl_surface in the Qt app or in waylandsink, is there a way
>    I can print out its state and the surface hierarchy easily?

In Weston there's something called the scene-graph. You can grab it by
starting Weston with the --debug argument; you can then print it with
the `weston-debug scene-graph` command. A more recent Weston version
indents sub-surfaces under their (main) parent surface.

>
> 4. Any ideas on debug methods to use?

Set WAYLAND_DEBUG=1 as an environment variable.

>
> Cheers
>
> Terry
Re: Wayland debugging with Qtwayland, gstreamer waylandsink, wayland-lib and Weston
Hi Marius,

Many thanks for the info.

Some notes/questions below:

Terry

On 22/02/2024 17:49, Marius Vlad wrote:
> Hi,
>
> On Thu, Feb 22, 2024 at 03:21:01PM +, Terry Barnaby wrote:
> > Hi,
> >
> > We are developing a video processing system that runs on an NXP imx8
> > processor using a Yocto embedded Linux system that has Qt6,
> > GStreamer, Wayland and Weston.
> >
> > We are having a problem displaying the video stream from GStreamer
> > on a QWidget. In the past we had this working with Qt5 and older
> > GStreamer, Wayland and Weston.
> >
> > A simple test program also shows the issue on Fedora 37 with Qt6 and
> > KDE/Plasma/Wayland.
>
> If this also happens on a desktop with the same Qt version and other
> compositors, I'm tempted to say it is an issue with Qt rather than with
> waylandsink or the compositor. Note that NXP ships its own modified
> Weston version.

That is my current feeling and is one reason why I tried it on Fedora
with whatever Wayland compositor KDE/Plasma is using.

> > The technique we are using is to get the Wayland surface that the
> > QWidget is using (it has been configured to use
> > Qt::WA_NativeWindow) and pass this to GStreamer's waylandsink, which
> > should then update this surface with video frames (via hardware).
> > This works when the QWidget is a top-level window widget
> > (QWidget(0)), but if this QWidget is below others in the hierarchy
> > no video is seen and the GStreamer pipeline is stalled.
>
> So the assumption is that there aren't other widgets obscuring this
> one when you move it below others?

My simple test example has two QWidgets, with the one for video being
created as a child of the first, so it should be above all others. I
have even tried drawing in it to make sure, and it displays its
Qt-drawn contents fine, just not the video stream.

> > It appears that waylandsink does:
> >
> > Creates a surface callback:
> >
> >     callback = wl_surface_frame (surface);
> >     wl_callback_add_listener (callback, &frame_callback_listener, self);
> >
> > Then adds a buffer to a surface:
> >
> >     gst_wl_buffer_attach (buffer, priv->video_surface_wrapper);
> >     wl_surface_set_buffer_scale (priv->video_surface_wrapper, priv->scale);
> >     wl_surface_damage_buffer (priv->video_surface_wrapper, 0, 0,
> >         G_MAXINT32, G_MAXINT32);
> >     wl_surface_commit (priv->video_surface_wrapper);
> >
> > But it never gets a callback and just sits in a loop awaiting that
> > callback.
> >
> > I assume that the surface waylandsink is using, which is created
> > using the original QWidget surface (sub-surface? with window?), is
> > not "active" for some reason.
>
> Possibly, when the QWidget is lower in the hierarchy, it becomes a
> child of a parent as described in
> https://wayland.app/protocols/xdg-shell#xdg_toplevel:request:set_parent,
> so I assume it has a different surface than the parent one. This would
> be easy to determine with WAYLAND_DEBUG. It seems unlikely to be itself
> a sub-surface of a surface.

I haven't really got the gist of what's going on, but waylandsink
certainly creates a subsurface from the QWidget surface; in fact it
seems to create a few things. I assume a subsurface is used so the
video can be displayed in that subsurface separately from the parent
(desynced from it).

> > I am trying to debug this, but this graphics stack is quite
> > complicated with waylandsink, qtwayland, wayland-lib and Weston, not
> > to mention the NXP hardware levels. My thoughts are that it is
> > something qtwayland is doing with the surface stack, or thread
> > locking issues (GStreamer uses separate threads). I also don't
> > understand Wayland or Weston in detail. So some questions:
> >
> > 1. Has anyone seen something like this?
>
> Someone else reported something similar, except that causing damage or
> moving the pointer made the video sub-surface show up:
> https://gitlab.freedesktop.org/wayland/weston/-/issues/843.

Thanks, I will have a look. Moving the mouse cursor in my case (at
least with Weston) does not affect things.

> > 2. Does anyone have any idea where to look?
> >
> > 3. Given the wl_surface in the Qt app or in waylandsink, is there a
> >    way I can print out its state and the surface hierarchy easily?
>
> In Weston there's something called the scene-graph. You can grab it by
> starting Weston with the --debug argument; you can then print it with
> the `weston-debug scene-graph` command. A more recent Weston version
> indents sub-surfaces under their (main) parent surface.

Thanks, that could be useful.

> > 4. Any ideas on debug methods to use?
>
> Set WAYLAND_DEBUG=1 as an environment variable.

Any idea on how to get a surface's ID from a C pointer so I can match
up the QWidget/waylandsink surface with the Wayland debug output?

Cheers

Terry
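On the question of mapping a C pointer to the IDs seen in the debug output: libwayland-client provides wl_proxy_get_id(), which returns the protocol object ID of any client-side proxy, and a wl_surface pointer can be cast to struct wl_proxy * for that purpose. The following is only a minimal sketch; print_surface_id() and its label argument are hypothetical names, and it assumes the pointer really is a live wl_surface proxy (for example the one returned by Qt's native interface). WAYLAND_DEBUG=1 output typically names objects as wl_surface@<id>, though the exact separator may differ between libwayland versions.

```c
#include <stdint.h>
#include <stdio.h>
#include <wayland-client.h>

/*
 * Hypothetical debugging helper: print the protocol object ID behind a
 * wl_surface pointer so it can be matched against the object IDs shown
 * in WAYLAND_DEBUG=1 output. `label` is a free-form tag chosen by the
 * caller, e.g. "video widget" or "waylandsink".
 */
void print_surface_id(const char *label, struct wl_surface *surface)
{
    /* Every client-side protocol object is a wl_proxy underneath. */
    uint32_t id = wl_proxy_get_id((struct wl_proxy *) surface);

    fprintf(stderr, "%s: wl_surface id %u (pointer %p)\n",
            label, id, (void *) surface);
}
```

Calling this once for each QWidget's surface and once for the surface handed to waylandsink should show whether the pointers refer to distinct protocol objects, and which ID to search for in the log. Building it requires linking against wayland-client.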
Re: Wayland debugging with Qtwayland, gstreamer waylandsink, wayland-lib and Weston
I have tried using "weston-debug scene-graph" and I am coming to the
conclusion that qtwayland 6.5.0 is not really using native Wayland
surfaces when Qt::WA_NativeWindow is used. From what I can see (and I
could easily be wrong), the Wayland protocol shows wl_surfaces being
created, and the QPlatformNativeInterface
nativeResourceForWindow("surface", windowHandle()) function does return
different wl_surface pointers for the two QWidgets. But even at the
QWidget level (ignoring GStreamer), a QPainter paint into each of these
QWidgets actually uses Wayland to draw into just the one top-level
surface, and "weston-debug scene-graph" shows only one application
xdg_toplevel surface and no subsurfaces. Unfortunately I don't know how
to determine the Wayland surface ID from a wl_surface pointer to really
check this.

If my video QWidget(0) is a top-level QWidget, then video is shown and
"weston-debug scene-graph" shows the application xdg_toplevel and two
wl_subsurfaces as children.

Unfortunately I think "weston-debug scene-graph" only shows surfaces
that are actually "active", so I can't see all of the surfaces that
Weston actually knows about (is there a method of doing this?).

My feeling is that although qtwayland is creating native surfaces, it
actually only uses the one top-level one and presumably doesn't
"activate" (set a role, do something?) the other surfaces.

Does anyone know a good list/place where I can ask such detailed
qtwayland questions?

I guess I can work around this by manually creating a Wayland
subsurface from the Qt top-level surface, handing that to waylandsink,
and then managing this subsurface (hiding, showing and resizing it)
when the QWidget is hidden/shown/resized.

Or could there be a way of "activating" the child QWidget's Wayland
surface?


On 22/02/2024 18:44, Terry Barnaby wrote:
> Hi Marius,
>
> Many thanks for the info.
>
> Some notes/questions below:
>
> Terry
>
> On 22/02/2024 17:49, Marius Vlad wrote:
> > Hi,
> >
> > On Thu, Feb 22, 2024 at 03:21:01PM +, Terry Barnaby wrote:
> > > Hi,
> > >
> > > We are developing a video processing system that runs on an NXP
> > > imx8 processor using a Yocto embedded Linux system that has Qt6,
> > > GStreamer, Wayland and Weston.
> > >
> > > We are having a problem displaying the video stream from GStreamer
> > > on a QWidget. In the past we had this working with Qt5 and older
> > > GStreamer, Wayland and Weston.
> > >
> > > A simple test program also shows the issue on Fedora 37 with Qt6
> > > and KDE/Plasma/Wayland.
> >
> > If this also happens on a desktop with the same Qt version and other
> > compositors, I'm tempted to say it is an issue with Qt rather than
> > with waylandsink or the compositor. Note that NXP ships its own
> > modified Weston version.
>
> That is my current feeling and is one reason why I tried it on Fedora
> with whatever Wayland compositor KDE/Plasma is using.
>
> > > The technique we are using is to get the Wayland surface that the
> > > QWidget is using (it has been configured to use
> > > Qt::WA_NativeWindow) and pass this to GStreamer's waylandsink,
> > > which should then update this surface with video frames (via
> > > hardware). This works when the QWidget is a top-level window
> > > widget (QWidget(0)), but if this QWidget is below others in the
> > > hierarchy no video is seen and the GStreamer pipeline is stalled.
> >
> > So the assumption is that there aren't other widgets obscuring this
> > one when you move it below others?
>
> My simple test example has two QWidgets, with the one for video being
> created as a child of the first, so it should be above all others. I
> have even tried drawing in it to make sure, and it displays its
> Qt-drawn contents fine, just not the video stream.
>
> > > It appears that waylandsink does:
> > >
> > > Creates a surface callback:
> > >
> > >     callback = wl_surface_frame (surface);
> > >     wl_callback_add_listener (callback, &frame_callback_listener, self);
> > >
> > > Then adds a buffer to a surface:
> > >
> > >     gst_wl_buffer_attach (buffer, priv->video_surface_wrapper);
> > >     wl_surface_set_buffer_scale (priv->video_surface_wrapper, priv->scale);
> > >     wl_surface_damage_buffer (priv->video_surface_wrapper, 0, 0,
> > >         G_MAXINT32, G_MAXINT32);
> > >     wl_surface_commit (priv->video_surface_wrapper);
> > >
> > > But it never gets a callback and just sits in a loop awaiting that
> > > callback.
> > >
> > > I assume that the surface waylandsink is using, which is created
> > > using the original QWidget surface (sub-surface? with window?), is
> > > not "active" for some reason.
> >
> > Possibly, when the QWidget is lower in the hierarchy, it becomes a
> > child of a parent as described in
> > https://wayland.app/protocols/xdg-shell#xdg_toplevel:request:set_parent,
> > so I assume it has a different surface than the parent one. This
> > would be easy to determine with WAYLAND_DEBUG. It seems unlikely to
> > be itself a sub-surface of a surface.
>
> I haven't really got the gist of what's going on, but waylandsink
> certainly creates a subsurface from the QWidget surface; in fact it
> seems to create a few things. I assume a subsurface is used so the
> video can be displayed in that subsurface separately from the parent
> (de sy