Re: [PATCH weston v2 10/11] [RFC] Account for very large repaint window misses

Mario Kleiner Mon, 27 Mar 2017 15:59:59 -0700

Hi Daniel & Pekka,

finally managed to go through the whole patch series, updated my ownapplication to current Wayland/Weston and test it a bit. I like it! Itwould have gotten my Reviewed-by's if i had actually managed to reviewit in some more reasonable time before the merge ;).

The only thing i would suggest is to make the time window for the outputupdate coalescing somewhat more tight and mostly to prevent repaintdeadlines from shifting.


In weston_output_maybe_repaint() you have this...

        msec_to_repaint = timespec_sub_to_msec(&output->next_repaint, now);
        if (msec_to_repaint > 1)
                return ret;

...for skipping an output for coalescing. Given timespec_sub_to_msecfloor()'s to full msecs, that means you'd accept a next_repaint almost 2msecs (~1.999999 msecs) into the future, so for the candidate outputthat would be like moving the repaint window deadline ~ 2 msecs closerto the start of a refresh cycle, cutting off more clients earlier fromgetting their surface updates out for the next vblank.


After return from weston_output_repaint() you call ...

weston_compositor_read_presentation_clock(...&now)

... again to update "now", so if weston_output_repaint() for the currentoutput involves some serious compositing work, you shift the "now" pointfor the following outputs in the compositors output list further intothe future, so depending on where an output is in thecompositor->output_list you could get such delays to add up andessentially move the repaint window deadline for later outputs by morethan 2 msecs closer. I think that's not so good for predictability ifthe position of an output in the output_list and the potentially varyingcomposition workload on preceding outputs can shift the repaintdeadlines for later outputs by a large amount and defer an actualcoalesced update for all outputs which are caught in this further.

So i'd probably drop thatweston_compositor_read_presentation_clock(...&now) to prevent this kindof drift? And make the msec_to_repaint deadline more like >= 1 insteadof > 1 to limit the time window to at most 1 msec?

Ideally we'd probably have timers with better than 1 msec granularity todeal with high refresh rate displays. The underlying timerfd api forwl_event_source_timer_update() seems to support nsecs resolution.

Gamer or Stereo panels with 144 Hz or 165 Hz refresh are now becomingmore common. One of my users already uses a commercially available 240Hz BenQ Zowie panel for reliable fullscreen high-speed animations underX11 with the FOSS graphics stack, so refresh durations of only ~4 msecsare now a thing and shifting repaint deadlines by 2 msecs or more wouldhave significant impact at only 4 msecs refresh.

I assume a very important intended use case for this output coalescingis to make sure that outputs which are tightly genlocked/synchronized intheir video refresh cycles will really update/page-flip somewhatreliably together and do so efficiently, e.g., if this is implemented ontop of some solid atomic flip kms-driver support? Stuff likestereoscopic 3D output to two separate genlocked outputs for the twoeyes (3D cinema, medical/science applications, advanced VR/AR HMD's). Oreven multi-display walls or VR CAVE environments with > 2 outputs? Forsuch apps one would assume the outputs are tightly synchronized, so evena < 1 msec window for coalescing should be fine.

If you think about single-display VR apps like 1 output driving aregular desktop GUI display, the other driving something like a cosumerVR HMD like the HTC Vive or Oculus Rift, we'd also would want to makesure the repaint behavior of the output driving the HMD is verypredictable and stable, even in presence of some activity on the regularnon-synchronized desktop screen, so apps can minimize motion-to-photonlatency. Output coalescing which would too liberally coalesce outputswhich are unrelated in their refresh cycles could hurt such applicationsquite a bit. Or create funny beat patterns when the outputs refreshcycles drift against each other (60 Hz vs. 75/90/144 Hz) and thecompositor alternates between coalescing updates together and treatingthem separately in a way that could cause hard to understand or avoidframe drops?

I've read that the latest Vulkan spec now includes aVK_GOOGLE_display_timing extension, intended for VR apps, which ispretty close to what was proposed for Waylands presentation_queueextension or VDPAU's frame scheduling for video playback.


Comments more to the point of patch 10/11 below...

On 03/13/2017 01:48 PM, Pekka Paalanen wrote:

On Fri, 10 Mar 2017 14:52:58 +0000
Daniel Stone <[email protected]> wrote:

Hi Pekka,

On 10 March 2017 at 13:41, Pekka Paalanen
<[email protected]> wrote:

On Wed,  1 Mar 2017 11:34:09 +0000 Daniel Stone <[email protected]> wrote:

       * the deadline given by repaint_msec? In that case we delay until
       * the deadline of the next frame, to give clients a more predictable
       * timing of the repaint cycle to lock on. */
-     if (presented_flags == WP_PRESENTATION_FEEDBACK_INVALID && msec_rel < 0)
-             timespec_add_nsec(&output->next_repaint, &output->next_repaint,
-                               refresh_nsec);
+     if (presented_flags == WP_PRESENTATION_FEEDBACK_INVALID &&
+         msec_rel < 0) {
+             while (timespec_sub_to_nsec(&output->next_repaint, &now) < 0) {
+                     timespec_add_nsec(&output->next_repaint,
+                                       &output->next_repaint,
+                                       refresh_nsec);
+             }
+     }

 out:
      output->repaint_status = REPAINT_SCHEDULED;


the very point is to postpone the repaint to a future time so that
clients would not be racing each other any more than they do when the
repaint loop is continuously active. In that sense this is a correct
change. It just seems a bit awkward to use a loop instead computing the
rounded number of frame periods by division.


Sure, not that difficult, just a little ugly:
int n_periods = !!(msec_rel % (refresh_nsec * NSEC_TO_MSEC));
n_periods += msec_rel / (refresh_nsec * NSEC_TO_MSEC);


Maybe I'll send my own proposal for this, if I can come up with
something nicer.

However, what strikes me as odd here is that this is done only when
coming from start_repaint_loop(). How could start_repaint_loop() cause
weston_output_finish_frame() to be called with a timestamp that is
older than the most recent vblank?


No idea: if you look at the diff, the original code did the exact same
thing, but only adjusted by a maximum of one period. I agree with you
that it would probably seem better to postpone in all cases, giving
sheer predictability (always respecting the repaint window exactly)
for speed. But I'd like some more input before changing this, and I'd
like to change it in a separate patch to this.

Ok, so i think i can clarify this. The point of this routine wasn't tocope with massive scheduling delays/system overload etc. or other milddisasters, but only to make sure that clients always see a predictablerepaint timing, regardless if the repaint loop was/is running when theymake their surface.commit, or if the output repaint was idle and is justrestarting the repaint loop, or if the specific client is thereason/trigger for the restart of the repaint loop. This to make surethat timing sensitive clients can "latch" onto some stable compositortiming. See commit b7df04ebc048e18417a13d577d977d6e36c8a8ca

Iff the repaint loop is already running when a client requests an updatethen the repaint window will define the cutoff point / deadline fordeciding if a clients update makes it for the next vblank, or if it getsdeferred to the next frame.

Iff the repaint loop was idle, the clients should see the same cutoffdeadline as if it was running. So two cases for that:

- If the first client in that refresh cycle triggeredweston_output_schedule_repaint() -> output->start_repaint_loop ->weston_output_finish_frame while the output in its current refresh cycleis before the repaint window deadline then you'll see a msec_rel >= 0and the output should perform a repaint for that client (and possibleothers which commit before the repaint window deadline) targetting thenext vblank.

- If the first client in that refresh cycle triggered a restart of therepaint loop after the repaint_window cutoff deadline then you will geta msecs_rel < 0 in the old code and current code, and the clients updateshould get deferred to the vblank after the next vblank, just as wouldhave been the case if the repaint loop was running all the time. This iswhat that if-statement => next_repaint += refresh_nsec is supposed to do.

What we want to prevent by shifting the repaint by 1 frame specificallyduring restart of repaint loop if msecs_rel < 0 is that one clienttriggers restart too late in the frame (= after repaint_windowdeadline), then its repaint misses the intended next vblank (n) due tothe lateness, and a kms page-flip therefore skips to the vblank (n+1)after this next one, but now all other potential latecomers from thisframe (targetting vblank n) and all the clients who wanted to commit anupdate properly in time for the next refresh cycle (n+1) will getdelayed by a full frame while a kms page-flip waits for the (n+1) vblankand only then triggers weston_output_finsh_frame for an update at vblank(n+2) earliest. You'd punish clients which are in time by an extra 1frame delay (earliest processing/completion at n+2 instead of n+1), andall other latecomers by an extra 2 frame delay (n+2 vs. n) while at thesame time not achieving an improvement for the first latecomer, whichwill likely still skip to n+1 instead of n.

This would be a common scenario, e.g., for 16 msecs refresh rate and ourdefault repaint window of 7 msecs = 7/16 ~ 43% probability of a randomclient triggering this.


Sure!

Did you see a case where it happened?


No, it's just something that had been bugging me ever since I started
looking into repaint. Found purely by inspection.

I think it would only happen with a) broken driver reporting bad
timestamps with pageflip events (we already verify timestamps from the
vblank query and fall back to pageflip otherwise), or b) something
stalled weston for too long.

a) would be a bug we want to log. b) might be a bug we want to log, or
just high system load in general which we wouldn't want to log, really.
Unfortunately there is no way to tell.

There is the original reason we want to ensure the target time is in
the future, but there are no reasons when doing so would be bad.
Therefore I think this is a good idea.

I'm not sure if generally ensuring that the target time is in the futurehelps or hurts or doesn't matter in practice. That code wasn't meant todeal with overload/scheduling delay. I assume you can only get largerdelays during repaint loop restart or during already running repaintloop on system overload, and in that case all predictability is probablyout of the window and the whole desktop will glitch anyway. One couldargue that it would be better to just accept defeat and try to get anoutput repaint out as fast as possible, so not too many clientschedule_repaint requests can back up during the extra time given tothem by shifting the repaint deadline further into the future, possiblymaking the glitch longer. Not sure if that would matter in practice though.


best and sorry for the late reply,
-mario

I'd like to get rid of the loop if it doesn't make things too
messy, even when the "insanity check" already ensures we don't
loop too long.


As long as the above looks reasonable, then sure; happy to expand the
!! to a ternary or something as well.

Failed insanity check makes the target time 'now'. Should that also be
subject to postponing for a period? I think it would be more
consistent. That is, make the comparison <= 0.


Er, unsure. Probably?

What were the arguments you thought to make this an undesireable patch?
I couldn't come up with any.


None per se. I flagged it as such because a) it isn't required to
achieve the actual goal of the patchset, b) given the somewhat
intricate nature of the code in question, I wasn't sure if it was
deliberate or not, and c) I only noticed it by inspection and have
never observed it myself. But it does indeed seem to be correct, so
I'll take that as a positive comment, and maybe wait to see what Mario
says before we think about merging it.


Yeah, let's leave this brewing for a bit.


Thanks,
pq

_______________________________________________
wayland-devel mailing list
[email protected]
https://lists.freedesktop.org/mailman/listinfo/wayland-devel

Re: [PATCH weston v2 10/11] [RFC] Account for very large repaint window misses

Reply via email to