Re: Future direction of the Mesa Vulkan runtime (or "should we build a new gallium?")
> So far, we've been trying to build those components in terms of the Vulkan API itself with calls jumping back into the dispatch table to try and get inside the driver. This is working but it's getting more and more fragile the more tools we add to that box. A lot of what I want to do with gallium2 or whatever we're calling it is to fix our layering problems so that calls go in one direction and we can untangle the jumble. I'm still not sure what I want that to look like but I think I want it to look a lot like Vulkan, just with a handier interface.

That resonates with my experience. For example, the Gallium draw module does some of this too -- it provides its own internal interfaces for drivers, but it also loops back into the Gallium top interface to set FS and rasterizer state -- and that has *always* been a source of grief. Having control flow proceed through layers in one direction only seems an important principle to observe. It's fine if the lower interface is the same interface (e.g., Gallium to Gallium, or Vulkan to Vulkan as you allude), but they shouldn't be the exact same entry points/modules (i.e., no reentrancy/recursion).

It's also worth considering that Vulkan extensibility could come in handy too in what you want to achieve. For example, Mesa Vulkan drivers could have their own VK_MESA_internal_ extensions that could be used by the shared Vulkan code to do lower-level things.

Jose

On Wed, Jan 24, 2024 at 3:26 PM Faith Ekstrand wrote:

> Jose,
>
> Thanks for your thoughts!
>
> On Wed, Jan 24, 2024 at 4:30 AM Jose Fonseca wrote:
>
> > I don't know much about the current Vulkan driver internals to have or provide an informed opinion on the path forward, but I'd like to share my backwards-looking perspective.
>
> > Looking back, Gallium was effectively two things:
> > (1) an abstraction layer that's watertight (as in upper layers shouldn't reach through to lower layers)
> > (2) an ecosystem of reusable components (draw, util, tgsi, etc.)
>
> > (1) was of course important -- and the discipline it imposed is what enabled great simplifications -- but it also became a straitjacket, as GPUs didn't stand still, and sooner or later the see-every-hardware-as-the-same lens stops reflecting reality.
>
> > If I had to pick one, I'd say that (2) is far more useful and practical. Take components like Gallium's draw and other util modules. A driver can choose to use them or not. One could fork them within the Mesa source tree, and only the drivers that opt in to the fork would need to be tested/adapted/etc.
>
> > On the flip side, the Vulkan API is already a pretty low-level HW abstraction. It's also very flexible and extensible, so it's hard to provide a watertight abstraction underneath it without either taking the lowest common denominator, or having lots of optional bits of functionality governed by a myriad of caps like you alluded to.
>
> There is a third thing that isn't really recognized in your description:
>
> (3) A common "language" to talk about GPUs and data structures that represent that language
>
> This is precisely what the Vulkan runtime today doesn't have. Classic meta sucked because we were trying to implement GL in GL. u_blitter, on the other hand, is pretty fantastic because Gallium provides a much more sane interface to write those common components in terms of.
> So far, we've been trying to build those components in terms of the Vulkan API itself with calls jumping back into the dispatch table to try and get inside the driver. This is working but it's getting more and more fragile the more tools we add to that box. A lot of what I want to do with gallium2 or whatever we're calling it is to fix our layering problems so that calls go in one direction and we can untangle the jumble. I'm still not sure what I want that to look like but I think I want it to look a lot like Vulkan, just with a handier interface.
>
> ~Faith
>
> > Not sure how useful this is in practice to you, but the lesson from my POV is that opt-in reusable and shared libraries are always time well spent, as they can bend and adapt with the times, whereas no-opt-out watertight abstractions inherently have a shelf life.
>
> > Jose
>
> > On Fri, Jan 19, 2024 at 5:30 PM Faith Ekstrand wrote:
>
> >> Yeah, this one's gonna hit Phoronix...
>
> >> When we started writing Vulkan drivers back in the day, there was this notion that Vulkan was a low-level API that directly targets hardware. Vulkan drivers were these super thin things that just blasted packets straight into the hardware. What little code was common was small and pretty easy to just copy+paste around. It was a nice thought...
>
> >> What's happened in the intervening 8 years is that Vulkan has grown. A lot.
>
> >> We already have several places where we're doing significant layering. It started with sharing the WSI code and some Python
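To make the "calls go in one direction" principle concrete, here is a minimal C sketch of what a downward-only interface could look like. All of the names (rt_driver_hooks, rt_meta_blit, and so on) are hypothetical and do not exist in the Mesa tree; the point is only that shared runtime code calls driver-provided hooks instead of re-entering the public vkCmd* entry points through the dispatch table.

```c
/* Hypothetical sketch only: none of these names exist in the Mesa tree.
 * The shared runtime calls *down* into a driver-provided hook table and
 * never loops back up through the public vkCmd* dispatch table, so the
 * driver never sees runtime-internal work disguised as application calls. */
#include <vulkan/vulkan.h>

/* Downward-facing interface the driver fills in at device creation. */
struct rt_driver_hooks {
   /* Bind an internal pipeline without touching application-visible state. */
   void (*bind_internal_pipeline)(VkCommandBuffer cmd, VkPipeline pipeline);
   /* Emit a draw for runtime-internal work (meta blits, clears, ...). */
   void (*draw_internal)(VkCommandBuffer cmd, uint32_t vertex_count);
};

/* Shared runtime helper: control flow goes runtime -> driver hooks, only. */
static void
rt_meta_blit(VkCommandBuffer cmd, const struct rt_driver_hooks *hooks,
             VkPipeline blit_pipeline)
{
   hooks->bind_internal_pipeline(cmd, blit_pipeline);
   hooks->draw_internal(cmd, 3); /* full-screen triangle */
}
```

The contrast with the current situation is that a helper like this never has to pretend to be the application, so the driver can treat the hooks as a private contract rather than guarding its vkCmd* entry points against its own runtime.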
Tesla gaming (and more)
What is the proper way to re-route output from a rendering card (which can have its output disconnected, or not have one at all) to a displaying card (a weak one, an iGPU, etc.)? For example, a laptop with an external card in an ExpressCard riser (no external display connected to the card), or a desktop PC with embedded video plus an Nvidia Tesla? How should I configure Mesa in order to get «auto-screen-grabbing» from the rendering card to the displaying one?
Re: Future direction of the Mesa Vulkan runtime (or "should we build a new gallium?")
On Thu, Jan 25, 2024 at 8:57 AM Jose Fonseca wrote:

> > So far, we've been trying to build those components in terms of the Vulkan API itself with calls jumping back into the dispatch table to try and get inside the driver. This is working but it's getting more and more fragile the more tools we add to that box. A lot of what I want to do with gallium2 or whatever we're calling it is to fix our layering problems so that calls go in one direction and we can untangle the jumble. I'm still not sure what I want that to look like but I think I want it to look a lot like Vulkan, just with a handier interface.
>
> That resonates with my experience. For example, the Gallium draw module does some of this too -- it provides its own internal interfaces for drivers, but it also loops back into the Gallium top interface to set FS and rasterizer state -- and that has *always* been a source of grief. Having control flow proceed through layers in one direction only seems an important principle to observe. It's fine if the lower interface is the same interface (e.g., Gallium to Gallium, or Vulkan to Vulkan as you allude), but they shouldn't be the exact same entry points/modules (i.e., no reentrancy/recursion).
>
> It's also worth considering that Vulkan extensibility could come in handy too in what you want to achieve. For example, Mesa Vulkan drivers could have their own VK_MESA_internal_ extensions that could be used by the shared Vulkan code to do lower-level things.

We already do that for a handful of things. The fact that Vulkan doesn't ever check the stuff in the pNext chain is really useful for that. 😅

~Faith

> Jose
>
> On Wed, Jan 24, 2024 at 3:26 PM Faith Ekstrand wrote:
>
>> Jose,
>>
>> Thanks for your thoughts!
>>
>> On Wed, Jan 24, 2024 at 4:30 AM Jose Fonseca wrote:
>>
>> > I don't know much about the current Vulkan driver internals to have or provide an informed opinion on the path forward, but I'd like to share my backwards-looking perspective.
>>
>> > Looking back, Gallium was effectively two things:
>> > (1) an abstraction layer that's watertight (as in upper layers shouldn't reach through to lower layers)
>> > (2) an ecosystem of reusable components (draw, util, tgsi, etc.)
>>
>> > (1) was of course important -- and the discipline it imposed is what enabled great simplifications -- but it also became a straitjacket, as GPUs didn't stand still, and sooner or later the see-every-hardware-as-the-same lens stops reflecting reality.
>>
>> > If I had to pick one, I'd say that (2) is far more useful and practical. Take components like Gallium's draw and other util modules. A driver can choose to use them or not. One could fork them within the Mesa source tree, and only the drivers that opt in to the fork would need to be tested/adapted/etc.
>>
>> > On the flip side, the Vulkan API is already a pretty low-level HW abstraction. It's also very flexible and extensible, so it's hard to provide a watertight abstraction underneath it without either taking the lowest common denominator, or having lots of optional bits of functionality governed by a myriad of caps like you alluded to.
>>
>> There is a third thing that isn't really recognized in your description:
>>
>> (3) A common "language" to talk about GPUs and data structures that represent that language
>>
>> This is precisely what the Vulkan runtime today doesn't have. Classic meta sucked because we were trying to implement GL in GL.
>> u_blitter, on the other hand, is pretty fantastic because Gallium provides a much more sane interface to write those common components in terms of.
>>
>> So far, we've been trying to build those components in terms of the Vulkan API itself with calls jumping back into the dispatch table to try and get inside the driver. This is working but it's getting more and more fragile the more tools we add to that box. A lot of what I want to do with gallium2 or whatever we're calling it is to fix our layering problems so that calls go in one direction and we can untangle the jumble. I'm still not sure what I want that to look like but I think I want it to look a lot like Vulkan, just with a handier interface.
>>
>> ~Faith
>>
>> > Not sure how useful this is in practice to you, but the lesson from my POV is that opt-in reusable and shared libraries are always time well spent, as they can bend and adapt with the times, whereas no-opt-out watertight abstractions inherently have a shelf life.
>>
>> > Jose
>>
>> > On Fri, Jan 19, 2024 at 5:30 PM Faith Ekstrand wrote:
>>
>> >> Yeah, this one's gonna hit Phoronix...
>>
>> >> When we started writing Vulkan drivers back in the day, there was this notion that Vulkan was a low-level API that directly targets hardware. Vulkan drivers were these super thin things that just blasted packets straight into the hardware. What l
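As an illustration of the VK_MESA_internal_ idea and the pNext behavior Faith mentions above, here is a hedged sketch in C. The structure, the sType value, and the helper functions are all invented for illustration; a real internal extension would reserve its own enum values and live alongside the driver's other shared headers.

```c
/* Hypothetical illustration of the VK_MESA_internal_* idea: neither this
 * struct nor the sType value exists anywhere; a real extension would reserve
 * a proper enum range. As noted in the thread, nothing in the stack rejects
 * pNext structs it doesn't recognize, so shared runtime code can pass extra,
 * Mesa-internal create info to the driver through an ordinary Vulkan call. */
#include <stdbool.h>
#include <vulkan/vulkan.h>

#define VK_STRUCTURE_TYPE_IMAGE_INTERNAL_CREATE_INFO_MESA \
   ((VkStructureType)1000999000) /* placeholder value */

typedef struct VkImageInternalCreateInfoMESA {
   VkStructureType sType;
   const void     *pNext;
   bool            bypassCompression; /* example internal knob */
} VkImageInternalCreateInfoMESA;

/* Shared code chains the internal struct into a normal create call. */
static VkResult
create_meta_image(VkDevice device, const VkImageCreateInfo *base,
                  VkImage *image)
{
   VkImageInternalCreateInfoMESA internal = {
      .sType = VK_STRUCTURE_TYPE_IMAGE_INTERNAL_CREATE_INFO_MESA,
      .pNext = base->pNext,
      .bypassCompression = true,
   };
   VkImageCreateInfo info = *base;
   info.pNext = &internal;
   return vkCreateImage(device, &info, NULL, image);
}

/* Driver side: walk the pNext chain and honor the internal struct if found. */
static const VkImageInternalCreateInfoMESA *
find_internal_info(const VkImageCreateInfo *info)
{
   for (const VkBaseInStructure *s = info->pNext; s != NULL; s = s->pNext) {
      if (s->sType == VK_STRUCTURE_TYPE_IMAGE_INTERNAL_CREATE_INFO_MESA)
         return (const VkImageInternalCreateInfoMESA *)s;
   }
   return NULL;
}
```

The application-visible API surface stays plain Vulkan; only a driver that knows about the internal struct will ever act on it.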
Re: Future direction of the Mesa Vulkan runtime (or "should we build a new gallium?")
On 24/01/2024 18:26, Faith Ekstrand wrote:
> So far, we've been trying to build those components in terms of the Vulkan API itself with calls jumping back into the dispatch table to try and get inside the driver.

To me, it looks like the "opt-in" approach would still apply well to the goal of cleaning up "implementing Vulkan in Vulkan", and gradual changes diverging from the usual Vulkan specification behavior can be implemented and maintained in existing and new drivers more efficiently than a whole new programming model. I think it's important that the scale of our solution be appropriate to the scale of the problem, otherwise we risk creating large issues in other areas.

Currently there are pretty few places where Mesa implements Vulkan on top of Vulkan:
• WSI,
• Emulated render passes,
• Emulated secondary command buffers,
• Meta.

For WSI, render passes and secondary command buffers, I don't think there's anything that needs to be done, as those already have little to no driver backend involvement or interference with the application's calls -- render pass and secondary command buffer emulation interacts with the hardware driver entirely within the framework of the Vulkan specification, only storing a few fields in vk_command_buffer, which are already handled fully in common code.

Common meta, on the other hand, yes, is extremely intrusive -- overriding the application's pipeline state and bindings, and passing shaders directly as NIR, bypassing SPIR-V. But with meta being such a different beast, I think we shouldn't even be trying to tame it with the same interfaces as everything else. If we're going to handle meta's special cases throughout our common "Gallium2" framework, it feels like we'll simply be turning our "Vulkan on Vulkan" issue into the problem of "implementing Gallium2 on Gallium2".

Instead, I think the cleanest solution for common meta would be sending commands to the driver through a separate callback interface specifically for meta, instead of trying to make meta mimic application code. That would allow drivers to clearly negotiate the details of applying/reverting state changes and shader compilation, while letting their developers assume that everything else is written, for the most part, purely against the Vulkan specification.

It would still be okay for meta to make calls to vkGetPhysicalDevice* and vkCreate*/vkDestroy*, as long as they're done within the rules of the Vulkan specification, to require certain extensions, and to do some less intrusive, non-hot-path interaction with the driver's internals directly -- such as requiring that every VkImage is a vk_image and pulling the needed create info fields from there. However, everything interacting with state/bindings, as well as things going beyond the specification like creating image views with incompatible formats, would go through those new callbacks.

NVK-style drivers would be able to share a common implementation of those callbacks. Drivers that want to take advantage of more direct-to-hardware paths would need to provide what's friendly to them (maybe even with lighter handling of compute-based meta operations compared to graphics ones). That'd probably not be a single flat list of callbacks, but a bunch of them -- for example, it'd be possible for a driver to use the common command buffer callbacks but to specialize some view/descriptor-related ones (it may not be possible to make those common at all, by the way).
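A rough C sketch of what such a driver-facing meta callback table could look like, purely to illustrate the proposal above: none of these types or entry points exist in the Mesa Vulkan runtime today, and the exact split between graphics and compute hooks is just one possible shape.

```c
/* Hypothetical sketch of a separate meta callback interface; nothing here
 * exists in the Mesa tree. The idea is that common meta drives the hardware
 * through these hooks instead of replaying application-style vkCmd* calls. */
#include <vulkan/vulkan.h>

struct nir_shader; /* meta shaders would be handed over as NIR, not SPIR-V */

struct vk_meta_backend {
   /* Save whatever state the meta operation is about to clobber and
    * restore it afterwards; the driver decides what that means. */
   void (*save_state)(VkCommandBuffer cmd);
   void (*restore_state)(VkCommandBuffer cmd);

   /* Bind a meta shader passed directly as NIR, bypassing pipelines. */
   void (*bind_shader)(VkCommandBuffer cmd, VkShaderStageFlagBits stage,
                       struct nir_shader *nir);

   /* Create a view with a format the spec would call incompatible;
    * only meta is allowed to ask for this. */
   VkResult (*create_incompatible_view)(VkDevice device, VkImage image,
                                        VkFormat view_format,
                                        VkImageView *view_out);

   /* Optional compute-based path; NULL means "use the graphics hooks". */
   void (*dispatch_meta_compute)(VkCommandBuffer cmd,
                                 uint32_t x, uint32_t y, uint32_t z);
};
```

A driver could fill in only the hooks it cares about and fall back to a shared implementation for the rest, which matches the "common command buffer callbacks, specialized view/descriptor callbacks" split described above.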
And if a driver doesn't need the common meta at all, none of that would bother it.

The other advantages I see in this separate meta API approach are:
• In the rest of the code, driver developers will in most cases need to refer to only a single authority -- the massively detailed Vulkan specification -- and there are risks in rolling our own interface for everything:
  • Driver developers will have to spend more time carefully looking up what they need to do in two places rather than largely just one.
  • We're much more prone to leaving gaps in our interface and to writing lacking documentation. I can't see this effort not being rushed, with us having to catch up to 10 years of XGL/Vulkan development, while moving many drivers alongside working on other tasks, and with varying levels of enthusiasm among driver developers for this. Unless zmike's 10 years estimate is our actual target 🤷
• Having to deal with a new large-scale API may raise the barrier for new contributors and discourage them. Unlike with OpenGL, with all its resource renaming stuff, the experience I got from developing applications on Vulkan was enough for me to start comfortably implementing it, except for shader compilation. When zmike showed me an R600g issue about some relation between vertex buffer bindings and CSOs, I just didn't have anything useful to say.
• Faster iteration inside the common meta code, with the meta interfac
Re: Future direction of the Mesa Vulkan runtime (or "should we build a new gallium?")
Hi,

thanks, Faith, for bringing this discussion up.

I think with Venus we are more interested in using utility libraries on an as-needed basis. Here, most of the time the Vulkan commands are just serialized according to the Venus protocol and then passed to the host, because usually it wouldn't make sense to let the guest translate the Vulkan commands to something different (e.g. something that is commonly used in a runtime), only to then re-encode this in the Venus driver to satisfy the host Vulkan driver -- just think SPIR-V: why would we want to have NIR only to then re-encode it to SPIR-V?

I'd also like to give a +1 to the points raised by Triang3l and others about the potential of breaking other drivers. I've certainly been bitten by this on the Gallium side with r600, and unfortunately I can't set up a CI in my home office (and after watching the XDC talk about setting up your own CI I was even more discouraged from doing this).

In summary, I certainly see the advantage of using common code, but with these two points above in mind I think opt-in is better.

Gert
Re: Future direction of the Mesa Vulkan runtime (or "should we build a new gallium?")
On Thu, Jan 25, 2024 at 5:06 PM Gert Wollny wrote:

> Hi,
>
> thanks, Faith, for bringing this discussion up.
>
> I think with Venus we are more interested in using utility libraries on an as-needed basis. Here, most of the time the Vulkan commands are just serialized according to the Venus protocol and then passed to the host, because usually it wouldn't make sense to let the guest translate the Vulkan commands to something different (e.g. something that is commonly used in a runtime), only to then re-encode this in the Venus driver to satisfy the host Vulkan driver -- just think SPIR-V: why would we want to have NIR only to then re-encode it to SPIR-V?

I think Venus is an entirely different class of driver. It's not even really a driver. It's more of a Vulkan layer that has a VM boundary in the middle. It's attempting to be as thin of a Vulkan -> Vulkan pass-through as possible. As such, it doesn't use most of the shared stuff anyway. It uses the dispatch framework and that's really about it. As long as that code stays in-tree roughly as-is, I think Venus will be fine.

> I'd also like to give a +1 to the points raised by Triang3l and others about the potential of breaking other drivers. I've certainly been bitten by this on the Gallium side with r600, and unfortunately I can't set up a CI in my home office (and after watching the XDC talk about setting up your own CI I was even more discouraged from doing this).

That's a risk with all common code. You could raise the same risk with NIR or basically anything else. Sure, if someone wants to go write all the code themselves in an attempt to avoid bugs, I guess they're free to do that. I don't really see that as a compelling argument, though.

Also, while you experienced gallium breakage with r600, having worked on i965, I can guarantee you that that's still better than maintaining a classic (non-gallium) GL driver. 🙃

At the moment, given the responses I've seen and the scope of the project as things are starting to congeal in my head, I don't think this will be an incremental thing where drivers get converted as we go anymore. If we really do want to flip the flow, I think it'll be invasive enough that we'll build gallium2 and then people can port to it if they want. I may port a driver or two myself, but those will be things I own or am at least willing to deal with the bug fallout for. Others can port or not at will. This is what I meant when I said elsewhere that we're probably heading towards a gallium/classic situation again. I don't expect anyone to port until the benefits outweigh the costs, but I do expect the benefits will be there eventually.

~Faith