Re: Introduction and updates from NVIDIA

James Jones Tue, 03 May 2016 11:51:31 -0700

On 05/03/2016 09:53 AM, Daniel Stone wrote:

Hi James,


On 3 May 2016 at 17:07, James Jones <[email protected]> wrote:

On 04/29/2016 03:07 PM, Daniel Stone wrote:

With new Wayland protocol, patches to all Wayland compositors to send
proper hints to clients using this protocol, improvements to GBM, and
updates to both of these when new GPU architectures introduced new
requirements, what you describe could do anything streams can do.
However, then the problem will have been solved only in the context of
top-of-tree Wayland and Weston.


This doesn't require explicit/new compositor interaction at all.
Extensions can be done within the gbm/EGL bundle itself (via
EGL_WL_bind_wayland_display), so you're only changing one DSO (or DSO
bundle), and the API usage there today does seem to stand up. Given
that the protocol is private - I'm certainly not advocating for a
DRI2-style all-things-to-all-hardware standard protocol to communicate
this - and that it's localised in a vendor bundle, it seems completely
widely applicable to me. As someone who's writing this from
Mutter/Wayland/GBM, I'm certainly not interested in Weston-only
solutions.


No, the necessary extensions cannot be contained within the binding. There
is not enough information within the driver layer alone. Something needs to
tell the driver when the configuration changes (e.g., the consumer of a
wayland surface switches from a texture to a plane) and what the new
configuration is. This would trigger the protocol notifications &
subsequent optimization within the driver.  By the nature of their API,
streams would require the compositor to take action on such configuration
changes, and streams can discover the new configuration.  Something
equivalent would be required to make this work in the GBM+wl_drm/EGL case.


I don't think this is the case. As I went through with Andy, we
_already_ have intent expressed in the GBM case, in the exact same way
that EGLStreams does: consider gbm_bo_import as equivalent for
attaching to an EGLOutput(Layer) consumer, and EGLImage import +
TargetTexture2D as equivalent for attaching a gltexture consumer.

"Will be used for display on device X" is not sufficient information, as
Daniel Vetter outlined.

This
is the exact same proxy for intent to display, and in fact the GBM
approach is slightly more flexible, because it allows you to both do
direct scanout as well as GPU composition (e.g. if you're
capturing/streaming at the same time as display).
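
As an illustration of the mapping described above, here is a toy model (not real GBM or EGL code). The class and method names (`ClientBuffer`, `import_for_scanout`, `bind_as_texture`) are hypothetical stand-ins for `gbm_bo_import` and for the EGLImage import plus TargetTexture2D path. The only point is that the import call itself records the intended consumer, and that one buffer can carry both intents at once.

```python
from dataclasses import dataclass, field
from enum import Enum, auto


class Consumer(Enum):
    """How the compositor intends to consume a client buffer."""
    SCANOUT = auto()     # direct scanout on a display plane
    GL_TEXTURE = auto()  # sampled during GPU composition


@dataclass
class ClientBuffer:
    """Hypothetical stand-in for a client buffer known to the driver."""
    name: str
    consumers: set = field(default_factory=set)

    def import_for_scanout(self):
        # Stands in for gbm_bo_import: the import signals display intent.
        self.consumers.add(Consumer.SCANOUT)
        return self

    def bind_as_texture(self):
        # Stands in for EGLImage import + TargetTexture2D.
        self.consumers.add(Consumer.GL_TEXTURE)
        return self


def needs_gpu_sampling(buf):
    """True when at least one consumer samples the buffer on the GPU."""
    return Consumer.GL_TEXTURE in buf.consumers


# Displayed directly while also being composited for capture/streaming.
video = ClientBuffer("video").import_for_scanout().bind_as_texture()
```

In this sketch the driver would learn about both consumers purely from the calls the compositor already makes; whether that is enough information in practice is the point under debate in this thread.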

Again though, without stream-retargeting, this is not something which
exists in Streams today, and doing so is going to require more
extensions: more code in your driver, more code in every
implementation. GBM today, for all its faults, does not require
further API extension to make this work.

Agreed. We're working on similar flexibility for streams via an
EGLSwitch muxing extension. As mentioned above, GBM would require API
extensions and driver updates to reach the expressiveness of streams as
well though.

Further, as a driver vendor, the idea of requiring even in-driver
platform-specific modifications for this sounds undesirable.  If it was
something that could be contained entirely within GBM, that would be
interesting.  However, distributing the architecture-specific code
throughout the window-system specific code in the driver means a lot more
maintenance burden in a world with X, Chrome OS, Wayland, and several
others.


This would hold true if Streams was a perfect encapsulation, but I
don't really see how doing so adds any burden over layering the
winsys/platform layer over Streams in the first place. I mean, you've
written Wayland bindings for Streams in the first place ... how would
this be too much different? Even if the protocol is designed to be the
perfect transport for Streams, you _still_ need transport bindings to
your target protocol.

We wrote the wayland protocol as an example of what is possible using
streams, and we intend to open-source it. Presumably window-system
authors would write the protocol for other windowing systems. Further,
since streams would encapsulate all the device-specific stuff, the
protocol library wouldn't require as much maintenance as a
driver-specific protocol library.

In a world with only Wayland, yes, we'd be doing slightly more work to
bootstrap streams support than we would to support GBM+wayland.
However, other windowing systems and stream use cases exist.

What streams exposes is intended to lower the amount of stuff hidden in
drivers, not increase it. Streams is a generic swapchain mechanism
exposed to any user, whereas we would need to write something
proprietary (maybe open source, maybe closed source, but NVIDIA-specific
nonetheless) for each window system to get equivalent performance if
we pushed the abstraction to a lower level.

Certainly there are, but then again, there are far more usecases than
EGL. Looking at media playback, Vulkan, etc, where you don't have EGL
yet need to solve the same problems.



EGLStreams, Vulkan swapchains, and (for example) VDPAU presentation queues
are all varying levels of abstraction on top of the same thing within the
driver: a presentation engine or buffer queue, depending on whether the
target is a physical output or a compositor.  These API-level components can
be hooked up to each other as long as the lower-level details are fully
contained within the driver abstraction. A Vulkan swapchain can be
internally implemented as an EGLStream producer, for example.  In fact,
Vulkan swapchains borrow many ideas directly and indirectly from EGLStream.
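
The shared "buffer queue" idea above can be sketched as a toy model. Everything here is hypothetical: the `BufferQueue` class, its method names, the FIFO policy and the fixed capacity are assumptions for illustration, not the EGLStream, Vulkan or VDPAU API. It only shows a producer and a consumer attached to the two ends of one queue.

```python
from collections import deque


class BufferQueue:
    """Toy fixed-capacity FIFO between one producer and one consumer."""

    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self._frames = deque()

    def post(self, frame):
        """Producer side: queue a frame; return False if the queue is full."""
        if len(self._frames) >= self.capacity:
            return False
        self._frames.append(frame)
        return True

    def acquire(self):
        """Consumer side: take the oldest frame, or None if the queue is empty."""
        return self._frames.popleft() if self._frames else None


queue = BufferQueue(capacity=2)
posted = [queue.post(f) for f in ("frame0", "frame1", "frame2")]
first = queue.acquire()
```

Any API-level producer (a swapchain, a presentation queue) could in principle sit on the `post` side of such a queue, which is the sense in which the components above can be hooked up to each other.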


Indeed, I noted the similarity, but primarily for the device_swapchain
extension.

I agree, and I'm not arguing this to be on the application or
compositor side either. I believe the GBM and HWC suggestions are
entirely doable, and further that these problems will need to be
solved outside EGL anyway, for the other usecases. My worry - quite
aside from how vendors who struggle to produce a conformant EGL 1.4
implementation today will ever implement the complexity of Streams,
though this isn't your problem - is that EGL is really the wrong place
to be solving this.


Could you elaborate on what the other usecases are?  If you mean the
Vulkan/media playback cases mentioned above, then I don't see what is
fundamentally wrong about using EGL as a backend within the window system
for those.  If a Vulkan application needs to display on an EGL+GLES-based
Wayland compositor, there will be some point where a transition is made from
Vulkan -> EGL+GLES regardless.


Media falls down because currently there is no zerocopy binding from
either hardware or software media decode engines. Perhaps not the case
on your hardware, unusually blessed with a great deal of memory
bandwidth, but a great many devices physically cannot cope with a
single copy in the pipeline, given the ratio of content size to memory
bandwidth. Doing this in EGL would require a 'draw' step which simply
presented an existing buffer - a step which would unnecessarily
involve the GPU if the pipeline is direct from decode to scanout - or
it would involve having every media engine write their own bindings to
the Streams protocol.

Right. Streams are meant to support lots of different producers and
consumers.

There are also incredibly exacting timing requirements for media
display, which the Streams model of 'single permanently fixed latency'
does not even come close to achieving. So for that you'd need another
extension, to report actual achieved timings back. Wayland today
fulfills these requirements with the zlinux_dmabuf and
presentation_timing protocols, with the original hardware timings fed
back through KMS.

Would it be reasonable to support such existing extensions while using
streams?

I think it's large enough that it warrants a split of gl-renderer and
compositor-drm, rather than trying to shoehorn them into the same
file. There's going to be quite some complexity hiding between the
synchronise-with-client-event-stream and direct-scanout boxes, that
will push it over the limit of what's tractable. Those files are
already pretty huge and complex.


Would it be better to wait until such complexities arise in future patches
and split the files at that point, or would you prefer we split the backends
now?  Perhaps I'm just more optimistic about the complexity, but it seems
like it would be easier to evaluate once that currently-hypothetical portion
of the code exists.


Well, there were quite a few issues with the previous set of patches,
and honestly I'm expecting just resolving those to bring enough
complexity to require a three-way split (common, Streams, and
EGLImage/GBM), let alone the features you're talking about solving
with Streams: direct scanout via retargeting of Streams, etc.

I share the hope, and maybe with the WSI and Streams available, we can
design future window systems and display control APIs towards
something like that. But at the moment, the impedance mismatch between
Streams and the (deliberately very different) Wayland and KMS APIs is
already fairly glaring. The winsys support is absolutely trivial to
write, and with winsys interactions only getting more featureful and
complex, so will the common stream protocol have to be.

If I was starting from the position of the EGL ideal: that everything
is EGL, and the only external interactions are creating native types
for it, then I would surely arrive at the same position as you. But
everything we've seen so far - and again, ChromeOS have taken this to
a much further extent - has been chipping away at EGL, rather than
putting more into it, and this has been for the better.


The direction ChromeOS is taking is even more problematic, and I'd hate to
see it being held up as an example of proper design direction.  We spent a
good deal of time working with Google to support ChromeOS and ended up
essentially allowing them to punch through the driver abstraction via very
opaque EGL extensions that no engineer besides the extension authors could
be expected to use correctly, and embed HW-specific knowledge within some
component of ChromeOS, such that it will likely only run optimally on a
single generation of our hardware and will need to be revisited.  That's the
type of problem we're trying to avoid here.  ChromeOS has made other design
compromises that cost us (and I suspect other vendors) 10-20% performance
across the board to optimize for a very specific use case (i.e., a browser)
and within very constrained schedules.  It is not the right direction for
OS<->graphics driver interactions to evolve.


Direction and extent are two very different things: I largely agree
with their direction (less encapsulation inside vendor drivers), and
disagree on the extent to which they've taken it.


That's a very good point.  I agree minimal encapsulation is a good goal.

I don't think that's a difference we'll ever resolve though.


I believe thus far we've all tried to focus objectively on specific issues,
proposed solutions for them, and the merits of those solutions.  Weston and
the other Wayland compositors I'm aware of are based on EGL at the moment,
so regardless of its merits as an API it doesn't seem problematic purely
from a dependency standpoint to add EGLStream as an option next to the
existing EGLImage and EGLDisplay+GBM paths.  I'm certainly willing to
continue discussing the merits of EGL on a broader scale, but does that
discussion need to block the patches proposed here?


Every additional codepath has its cost. Even if you just look at
Mutter and Weston in a vacuum, it seems like it'll be quite the large
patchset(s) by the time it's done, let alone extending it out to all
the other compositors. This is a patchset which will need constant
care and feeding: if it's not tested, it's broken. Right now, there is
only one Streams implementation available, which is in a driver whose
legal status is seen to be sufficiently problematic that it is not
generally distributed by downstreams, which requires a whole set of
external kernel patches to run. So even getting it to run is
non-trivial.

But then we'd have to do that in such a way that it was generally
available, else any refactoring or changes we wanted to do internally
would have to be blocked on testing/review from someone who knew that
backend well enough. Either that, or it would just get broken.
Introducing these codepaths has a very, very, real cost to the
projects you're talking about.

If there were an open source implementation of streams, would that
affect your view?

Agreed, all new code, and especially new significant branches in code,
has costs. However, a balance always needs to be struck.

You could quite rightly point to the Raspberry Pi DispManX backend as
an example of the same, and you'd be right. And that's why I'm
extremely enthused about how their new KMS/GBM driver allows us to
nuke the entire backend from orbit, and reduce our testing load by
shifting them to the generic driver.

I hope we can avoid an entirely forked compositor-drm/eglstream (and
especially gl-renderer) for these reasons. The majority of the code is
still common and would be exercised using either path.


Thanks,
-James

Cheers,
Daniel

_______________________________________________
wayland-devel mailing list
[email protected]
https://lists.freedesktop.org/mailman/listinfo/wayland-devel
