On 10/21/24 07:33, Jose Fonseca wrote:
I see a few downsides with the proposed callback:
- feels like a solution too tailored for WINE
- there's a layering violation: the application suddenly takes the
driving seat for a thing deep down in the GL driver
so I fear the Mesa community might regret doing it, and once WINE
supports it there would be an outcry if we tried to go back.
IIUC the problem at hand, another way to go about this would be an
extension that allows applications to get a malloc'ed/valloc'ed memory
exposed to the GPU as a GL buffer object.
I feel this would be potentially useful to applications other than just
WINE, especially on systems with unified memory. And there have been
extensions along these lines before, for example,
https://registry.khronos.org/OpenGL/extensions/AMD/AMD_pinned_memory.txt
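For reference, using that extension looks roughly like this (a sketch only;
the target enum and usage pattern are taken from the extension spec, with
error handling and extension checks omitted):

/* Sketch of AMD_pinned_memory usage: the driver uses client memory
 * directly as the buffer's backing store (no copy). */
#include <stdlib.h>
#include <unistd.h>
#include <GL/gl.h>
#include <GL/glext.h>

#ifndef GL_EXTERNAL_VIRTUAL_MEMORY_BUFFER_AMD
#define GL_EXTERNAL_VIRTUAL_MEMORY_BUFFER_AMD 0x9160
#endif

GLuint create_pinned_buffer(size_t size, void **out_mem)
{
    /* Client memory must be page-aligned and stay valid for the buffer's lifetime. */
    long page = sysconf(_SC_PAGESIZE);
    void *mem = NULL;
    posix_memalign(&mem, (size_t)page, size);

    GLuint buf;
    glGenBuffers(1, &buf);
    glBindBuffer(GL_EXTERNAL_VIRTUAL_MEMORY_BUFFER_AMD, buf);
    /* glBufferData on this target pins 'mem' as the buffer's storage. */
    glBufferData(GL_EXTERNAL_VIRTUAL_MEMORY_BUFFER_AMD, size, mem, GL_STREAM_COPY);
    glBindBuffer(GL_EXTERNAL_VIRTUAL_MEMORY_BUFFER_AMD, 0);

    *out_mem = mem;
    return buf;
}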
These ideas by themselves make it very difficult to get, e.g., a mapping
of a device memory region as opposed to a system memory buffer, and...
There's also NVIDIA's Heterogeneous Memory Management, which takes this
idea to a whole different level:
- https://developer.nvidia.com/blog/simplifying-gpu-application-development-with-heterogeneous-memory-management/
- https://www.kernel.org/doc/html/v5.0/vm/hmm.html
- https://lpc.events/event/2/contributions/70/attachments/14/6/hmm-lpc18.pdf
- https://lwn.net/Articles/752964/
- https://lwn.net/Articles/684916/
These are great, but seem like overkill for this problem space.
The Vulkan solution was very minimally invasive from a driver point of
view. If only WINE ends up using it, it's not that big of a deal. WINE
is a common use case, and there are plenty of things in graphics APIs
that cater to one or a small set of very impactful use cases. If the
OpenGL extension had a similarly small footprint, it also wouldn't be
that big of a deal if it were tailored to WINE. Note Vulkan already has
the equivalent of the above AMD extension, and chose to add additional
functionality for this particular use case anyway.
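For concreteness, the Vulkan analogue of the AMD extension referred to here
is presumably VK_EXT_external_memory_host; importing a host allocation looks
roughly like this (a sketch; alignment must satisfy
minImportedHostPointerAlignment, and error handling is omitted):

/* Sketch: wrap an existing host allocation as VkDeviceMemory. */
#include <vulkan/vulkan.h>

VkDeviceMemory import_host_memory(VkDevice dev, void *host_ptr,
                                  VkDeviceSize size, uint32_t memory_type_index)
{
    VkImportMemoryHostPointerInfoEXT import_info = {
        .sType = VK_STRUCTURE_TYPE_IMPORT_MEMORY_HOST_POINTER_INFO_EXT,
        .handleType = VK_EXTERNAL_MEMORY_HANDLE_TYPE_HOST_ALLOCATION_BIT_EXT,
        .pHostPointer = host_ptr,            /* app-owned, suitably aligned memory */
    };
    VkMemoryAllocateInfo alloc_info = {
        .sType = VK_STRUCTURE_TYPE_MEMORY_ALLOCATE_INFO,
        .pNext = &import_info,
        .allocationSize = size,
        .memoryTypeIndex = memory_type_index,
    };
    VkDeviceMemory memory = VK_NULL_HANDLE;
    vkAllocateMemory(dev, &alloc_info, NULL, &memory);
    return memory;
}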
All that said, I don't love the idea of callbacks either. Callbacks in
general are tough to specify and use robustly, and hence should be a
last resort. E.g., this particular callback might sometimes come from
the application thread, and sometimes come from some separate
driver-managed thread. It's hard to validate that all applications can
handle that properly and wouldn't do something crazy like rely on their
own TLS data in the callback or try to call back into OpenGL from the
callback and deadlock themselves, even if these are clearly specified as
unsupported actions.
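To illustrate the kind of misuse I mean, with a purely hypothetical callback
signature (these names are invented for illustration, not part of any draft):

#include <stdlib.h>

/* Hypothetical client-provided mapping callback. It may be invoked on a
 * driver-internal thread, so anything relying on the app thread's TLS, or
 * any re-entry into GL, is a hazard even if the spec forbids it. */
static void *my_map_callback(size_t size, size_t alignment, void *user_data)
{
    /* BAD: thread-local state set up on the app's main thread may not exist here. */
    /* struct app_tls *tls = get_current_thread_state(); */

    /* BAD: calling back into GL can deadlock if the driver holds internal
     * locks while invoking the callback. */
    /* glFinish(); */

    (void)user_data;
    return aligned_alloc(alignment, size);   /* the only thing it should do */
}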
Thanks,
-James
But I remember that Thomas Hellstrom (while at VMware, now Intel) once
prototyped this without HMM, just plain DRM. I think HMM provides the
ability to do this transparently for the application, which is above and
beyond what's strictly needed for WINE.
Metal API also provides this --
https://developer.apple.com/documentation/metal/mtldevice/1433382-newbufferwithbytesnocopy?language=objc
Jose
On Fri, Oct 18, 2024 at 11:10 PM Derek Lesho <dle...@codeweavers.com> wrote:
Hey everyone 👋,
I'm Derek from the Wine project, and wanted to start a discussion with
y'all about potentially extending the Mesa OGL drivers to help us
with a
functionality gap we're facing.
Problem Space:
In the last few years, Wine's support for running 32-bit Windows apps in
a 64-bit host environment (wow64) has almost reached feature completion,
but there remains a pain point with OpenGL applications: namely, that
Wine can't return a 64-bit GL implementation's buffer mappings to a
32-bit application when the address is outside of the 32-bit range.
Currently, we have a workaround that copies any changes to the mapping
back to the host upon glUnmapBuffer, but this is of course slow when the
implementation returns directly mapped memory, and it doesn't work for
GL_MAP_PERSISTENT_BIT, where directly mapped memory is required.
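Roughly, that workaround has the following shape (a simplified, hypothetical
sketch rather than Wine's actual code):

#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* The 64-bit host mapping may lie above 4 GiB, so the wow64 thunk returns a
 * 32-bit-addressable shadow copy and writes it back on unmap. */
struct wrapped_mapping {
    void  *host_ptr;    /* pointer returned by the 64-bit GL implementation */
    void  *shadow_ptr;  /* copy placed below 4 GiB, handed to the 32-bit app */
    size_t size;
};

void *wow64_map_buffer(struct wrapped_mapping *m)
{
    if ((uintptr_t)m->host_ptr + m->size <= 0xffffffffu)
        return m->host_ptr;                 /* already addressable, no copy needed */
    memcpy(m->shadow_ptr, m->host_ptr, m->size);
    return m->shadow_ptr;                   /* works, but slow, and cannot work for
                                               persistent (directly mapped) mappings */
}

void wow64_unmap_buffer(struct wrapped_mapping *m)
{
    if (m->shadow_ptr && m->shadow_ptr != m->host_ptr)
        memcpy(m->host_ptr, m->shadow_ptr, m->size);   /* copy changes back on unmap */
}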
A few years ago we also faced this problem with Vulkan, which was
solved through the VK_EXT_map_memory_placed extension Faith drafted,
allowing us to use our Wine-internal allocator to provide the pages the
driver maps to. I'm now wondering if a GL equivalent would also be seen
as feasible amongst the devs here.
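For reference, the Vulkan path looks roughly like this (a sketch assuming
VK_KHR_map_memory2; feature queries, alignment handling, and error checking
are omitted, and the extension entry point would be loaded via
vkGetDeviceProcAddr):

#include <vulkan/vulkan.h>

/* Sketch: Wine reserves pages in 32-bit-addressable space and asks the
 * driver to map the memory exactly there. */
void *map_placed(PFN_vkMapMemory2KHR pfnMapMemory2, VkDevice dev,
                 VkDeviceMemory memory, void *placed_addr)
{
    VkMemoryMapPlacedInfoEXT placed_info = {
        .sType = VK_STRUCTURE_TYPE_MEMORY_MAP_PLACED_INFO_EXT,
        .pPlacedAddress = placed_addr,      /* page range reserved by Wine's allocator */
    };
    VkMemoryMapInfoKHR map_info = {
        .sType = VK_STRUCTURE_TYPE_MEMORY_MAP_INFO_KHR,
        .pNext = &placed_info,
        .flags = VK_MEMORY_MAP_PLACED_BIT_EXT,
        .memory = memory,
        .offset = 0,
        .size = VK_WHOLE_SIZE,
    };
    void *data = NULL;
    pfnMapMemory2(dev, &map_info, &data);   /* data == placed_addr on success */
    return data;
}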
Proposed solution:
As the GL backend handles host mapping in its own code, only giving
suballocations from its mappings back to the app, the problem is a
little less straightforward than in our Vulkan solution: if we just
allowed the application to set its own placed mapping when calling
glMapBuffer, the driver might then have to handle moving buffers out of
already-mapped ranges, and would lose control over its own memory
management schemes.
Therefore, I propose a GL extension that allows the GL client to
provide
a mapping and unmapping callback to the implementation, to be used
whenever the driver needs to perform such operations. This way the
driver remains in full control of its memory management affairs, and the
amount of work for an implementation, as well as the potential for bugs,
is kept minimal. I've written a draft implementation in Zink using
map_memory_placed [1] and a corresponding Wine MR utilizing it [2], and
would be curious to hear your thoughts. I don't have experience in the
Mesa codebase, so I apologize if the branch is a tad messy.
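To make the shape of the proposal concrete, the client-facing side could look
something like this (entirely hypothetical names for illustration; these are
not the draft implementation's actual entry points):

#include <stddef.h>

/* Called by the driver whenever it needs new pages for buffer mappings. */
typedef void *(*GLmapMemoryCallback)(size_t size, size_t alignment, void *user_data);
/* Called by the driver when it releases a range obtained from the callback above. */
typedef void (*GLunmapMemoryCallback)(void *addr, size_t size, void *user_data);

/* Registered once per context; the driver then routes mapping-related page
 * allocations through these callbacks, but keeps full control over when and
 * how it suballocates from them. */
void glMapMemoryCallbacksEXT(GLmapMemoryCallback map_cb,
                             GLunmapMemoryCallback unmap_cb,
                             void *user_data);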
In theory, the only requirement the extension would place on drivers
would be that glMapBuffer always return a pointer from within a page
allocated through the provided callbacks, so that it can be guaranteed
to be positioned within the required address space. Wine would then use
its existing workaround for other types of buffers, but as Mesa seems to
often return directly mapped buffers in those cases as well, Wine could
also avoid the slowdown that comes with copying there.
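On the Wine side, the callbacks from the hypothetical sketch above could then
hand back pages that are guaranteed to be 32-bit addressable, for example
(simplified; on Linux/x86-64, MAP_32BIT is one way to get low addresses, and
Wine's real wow64 allocator is considerably more involved):

#define _GNU_SOURCE
#include <stddef.h>
#include <sys/mman.h>

/* Return pages within the low 2 GiB so a 32-bit app can use the mapping. */
static void *wine_map_callback(size_t size, size_t alignment, void *user_data)
{
    (void)alignment; (void)user_data;       /* page alignment assumed sufficient here */
    void *mem = mmap(NULL, size, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS | MAP_32BIT, -1, 0);
    return mem == MAP_FAILED ? NULL : mem;
}

static void wine_unmap_callback(void *addr, size_t size, void *user_data)
{
    (void)user_data;
    munmap(addr, size);
}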
Why not use Zink?:
There's also a proposal to use a 32-bit PE build of Zink in Wine,
bypassing the need for an extension; I brought this up for discussion in
this wine-devel thread last week [3], which contains some arguments
against that approach.
If any of you have thoughts, concerns, or questions about this
potential
approach, please let me know, thanks!
1: https://gitlab.freedesktop.org/Guy1524/mesa/-/commits/placed_allocation
2: https://gitlab.winehq.org/wine/wine/-/merge_requests/6663
3: https://marc.info/?t=172883260300002&r=1&w=2