https://bugs.kde.org/show_bug.cgi?id=450914

--- Comment #25 from Erik Kurzinger <ekurzin...@nvidia.com> ---
So I did look into this a bit further, and can provide some hopefully relevant
information...

Firstly, I noticed that, unfortunately, direct scan-out for Xwayland
applications doesn't work at all with our driver right now. This is an issue on
our end: we're allocating compressed buffers, which aren't eligible for
scan-out. It should be pretty easy to fix, but the fix will have to wait for an
upcoming driver release.

As far as the issue at hand goes, though, that implies KWin's direct scan-out
behavior is not relevant: with our driver, Xwayland applications will currently
always be composited.

To follow up on Zamundaaa's earlier comment, the difference between Mesa and
NVIDIA does indeed appear to be due to the fact that Mesa will use 4 buffers in
its swapchain if vsync is disabled and the Present extension reports it's using
the flipping path. Our driver, on the other hand, will always use only 2.
Note that's only for X11 applications with Xwayland; for native Wayland
applications we do actually use 3, and, at least in my testing, those aren't
capped to the display refresh rate.

I experimented with increasing the number of swapchain buffers to 3, and using
mailbox-style logic similar to Mesa, and this does resolve the issue. So that
is an option.

However, I still question why KWin is holding on to both buffers for the entire
frame if it's not doing direct scan-out. If we do two
wl_surface_attach/damage/commits in a single vblank period, shouldn't the
second one cause the first buffer to be released? That's what happens with
Weston, Mutter, wlroots, etc., but apparently not with KWin.
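
To make the buffer accounting concrete, here is a toy simulation of a client
swapchain talking to a compositor that repaints once per refresh cycle. It is
only a sketch under stated assumptions: the function name, the tick-based
timing, and both release policies are hypothetical simplifications, not taken
from KWin, Weston, Mesa, or the NVIDIA driver. With hold_latched=True the
compositor keeps the buffer it last composited until its next repaint (one
guess at the kind of policy that would produce the cap described above); with
hold_latched=False any new commit releases the previously committed buffer
immediately.

```python
def frames_per_refresh(num_buffers, hold_latched,
                       refresh_ticks=10, render_ticks=2,
                       warmup=5, measured=50):
    """Toy model: average client commits per refresh cycle in steady state.

    All names and timings here are illustrative assumptions.
    """
    free = num_buffers   # buffers the client may render into
    latched = False      # compositor holds the buffer it last composited
    pending = False      # compositor holds a newer, not yet composited commit
    render_left = 0      # ticks left on the frame currently being rendered
    commits = 0

    for tick in range((warmup + measured) * refresh_ticks):
        # The compositor repaints once per refresh cycle and latches the
        # newest commit, releasing the buffer it composited before.
        if tick % refresh_ticks == 0 and pending:
            if latched:
                free += 1
            latched, pending = True, False

        # The client starts a frame as soon as it is idle and has a buffer.
        if render_left == 0 and free > 0:
            free -= 1
            render_left = render_ticks

        if render_left > 0:
            render_left -= 1
            if render_left == 0:          # frame done: attach/damage/commit
                if tick >= warmup * refresh_ticks:
                    commits += 1          # only count steady-state frames
                if pending:
                    free += 1             # superseded commit is released
                elif latched and not hold_latched:
                    free += 1             # release-on-replace policy
                    latched = False
                pending = True

    return commits / measured


print(frames_per_refresh(2, hold_latched=True))   # 1.0: capped to refresh
print(frames_per_refresh(3, hold_latched=True))   # 5.0: limited by render time
print(frames_per_refresh(2, hold_latched=False))  # 5.0: limited by render time
```

In this model, two buffers are enough when the compositor releases on replace,
while a compositor that holds its last composited buffer needs a third buffer
on the client side before the client stops being throttled to one frame per
refresh.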

Now, once we do fix the compressed buffer issue and thereby enable direct
scan-out, maybe we will need to add a third swapchain image anyway. That's
assuming that KWin's page flips will still be synchronized to the display
refresh rate, which I believe is the case (correct me if I'm wrong).

But is that really the solution users want? Sure, it would technically let the
game run at an uncapped frame rate, but as I understand it the main reason
people want to run games that way is to minimize input latency, and they're
willing to accept possible tearing as a trade-off (e.g. competitive gamers and
whatnot). So would it maybe make more sense to have KWin do tearing flips
instead for direct scan-out applications? In *that* case I think 2 swapchain
buffers on the driver side would still be fine, right?
