On 10/21/24 07:33, Jose Fonseca wrote:
I see a few downsides with the proposed callback:
- feels like a solution too tailored for WINE
- there's a layering violation: the application suddenly takes the driving seat for something deep down in the GL driver, so I fear the Mesa community might come to regret it, and once WINE depends on it there would be an outcry over going back.


IIUC the problem at hand, another way to go about this would be an extension that allows applications to get malloc'ed/valloc'ed memory exposed to the GPU as a GL buffer object.

I feel this would be potentially useful to applications other than just WINE, especially on systems with unified memory.  And there have been extensions along these lines before, for example, https://registry.khronos.org/OpenGL/extensions/AMD/AMD_pinned_memory.txt
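
For concreteness, a minimal sketch of how that extension exposes client memory as a buffer object (assuming GL_AMD_pinned_memory is present; error handling omitted):

    #include <stdlib.h>
    #include <GL/gl.h>

    /* Enum value from the GL_AMD_pinned_memory spec. */
    #define GL_EXTERNAL_VIRTUAL_MEMORY_BUFFER_AMD 0x9160

    static GLuint pin_client_memory(size_t size, void **out_mem)
    {
        void *mem = NULL;
        GLuint buf = 0;

        /* The extension requires a page-aligned client pointer. */
        if (posix_memalign(&mem, 4096, size) != 0)
            return 0;

        glGenBuffers(1, &buf);
        glBindBuffer(GL_EXTERNAL_VIRTUAL_MEMORY_BUFFER_AMD, buf);
        /* The driver pins 'mem' and uses it directly as the buffer's
           backing store; it must stay valid for the buffer's lifetime. */
        glBufferData(GL_EXTERNAL_VIRTUAL_MEMORY_BUFFER_AMD,
                     (GLsizeiptr)size, mem, GL_STREAM_COPY);
        glBindBuffer(GL_EXTERNAL_VIRTUAL_MEMORY_BUFFER_AMD, 0);

        *out_mem = mem;
        return buf;
    }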

These ideas by themselves make it very difficult to get, e.g., a mapping of a device memory region as opposed to a system memory buffer, and...

There's also NVIDIA's Heterogeneous Memory Management, which takes this idea to a whole different level:
- https://developer.nvidia.com/blog/simplifying-gpu-application-development-with-heterogeneous-memory-management/
- https://www.kernel.org/doc/html/v5.0/vm/hmm.html
- https://lpc.events/event/2/contributions/70/attachments/14/6/hmm-lpc18.pdf
- https://lwn.net/Articles/752964/
- https://lwn.net/Articles/684916/

These are great, but seem like overkill for this problem space.

The Vulkan solution was very minimally invasive from a driver point of view. If only WINE ends up using it, it's not that big of a deal. WINE is a common use case, and there are plenty of things in graphics APIs that cater to one or a small set of very impactful use cases. If the OpenGL extension had a similarly small footprint, it also wouldn't be that big of a deal if it were tailored to WINE. Note that Vulkan already has the equivalent of the above AMD extension (VK_EXT_external_memory_host), and chose to add additional functionality for this particular use case anyway.

All that said, I don't love the idea of callbacks either. Callbacks in general are tough to specify and use robustly, and hence should be a last resort. E.g., this particular callback might sometimes come from the application thread, and sometimes from a separate driver-managed thread. It's hard to validate that all applications handle that properly and won't do something crazy like rely on their own TLS data in the callback, or call back into OpenGL from the callback and deadlock themselves, even if these are clearly specified as unsupported actions.

Thanks,
-James

But I remember that Thomas Hellstrom (while at VMware, now Intel) once prototyped this without HMM, with just plain DRM.  I think HMM provides the ability to do this transparently for the application, which is above and beyond what's strictly needed for WINE.

The Metal API also provides this -- https://developer.apple.com/documentation/metal/mtldevice/1433382-newbufferwithbytesnocopy?language=objc

Jose

On Fri, Oct 18, 2024 at 11:10 PM Derek Lesho <dle...@codeweavers.com> wrote:

    Hey everyone 👋,

    I'm Derek from the Wine project, and I wanted to start a discussion
    with y'all about potentially extending the Mesa OGL drivers to help
    us with a functionality gap we're facing.

    Problem Space:

    In the last few years Wine's support for running 32-bit Windows apps
    in a 64-bit host environment (wow64) has almost reached feature
    completion, but there remains a pain point with OpenGL applications:
    namely, that Wine can't return a 64-bit GL implementation's buffer
    mappings to a 32-bit application when the address is outside the
    32-bit range.

    Currently, we have a workaround that copies any changes to the
    mapping back to the host upon glUnmapBuffer, but this is of course
    slow when the implementation directly returns mapped memory, and it
    doesn't work for GL_MAP_PERSISTENT_BIT, where directly mapped memory
    is required.
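
    A rough sketch of that workaround, heavily simplified; the helper
    names below are invented for illustration, not the actual Wine code:

        struct wow64_map
        {
            void  *shadow;    /* copy below 4 GiB, handed to the app */
            void  *host_ptr;  /* the real, possibly >4 GiB, mapping */
            size_t size;
        };

        /* On glUnmapBuffer, replay the 32-bit app's writes into the
           real mapping before the driver unmaps it. */
        static void wow64_unmap_buffer(GLenum target, struct wow64_map *map)
        {
            memcpy(map->host_ptr, map->shadow, map->size);
            host_funcs->glUnmapBuffer(target); /* 64-bit host GL */
            free_low(map->shadow);             /* 32-bit-range allocator */
        }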

    A few years ago we also faced this problem with Vulkan, where it was
    solved through the VK_EXT_map_memory_placed extension Faith drafted,
    allowing us to use our Wine-internal allocator to provide the pages
    the driver maps to. I'm now wondering if a GL equivalent would also
    be seen as feasible amongst the devs here.
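
    For reference, the Vulkan extension lets the caller pick the virtual
    address a mapping lands at, roughly like this (a fragment sketch;
    'device', 'memory', and 'addr' are assumed in scope, with 'addr'
    coming from our 32-bit-range allocator, and error handling omitted):

        /* Requires VK_KHR_map_memory2 + VK_EXT_map_memory_placed. */
        VkMemoryMapPlacedInfoEXT placed = {
            .sType = VK_STRUCTURE_TYPE_MEMORY_MAP_PLACED_INFO_EXT,
            .pPlacedAddress = addr,  /* page-aligned, reserved by caller */
        };
        VkMemoryMapInfoKHR map_info = {
            .sType  = VK_STRUCTURE_TYPE_MEMORY_MAP_INFO_KHR,
            .pNext  = &placed,
            .flags  = VK_MEMORY_MAP_PLACED_BIT_EXT,
            .memory = memory,
            .offset = 0,
            .size   = VK_WHOLE_SIZE, /* placed maps cover the whole range */
        };
        void *mapped = NULL;
        VkResult res = vkMapMemory2KHR(device, &map_info, &mapped);
        /* On VK_SUCCESS, mapped == addr, inside the 32-bit range. */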

    Proposed solution:

    As the GL backend handles host mapping in its own code, only giving
    suballocations from its mappings back to the app, the problem is a
    little less straightforward than in our Vulkan solution: if we just
    allowed the application to set its own placed mapping when calling
    glMapBuffer, the driver might then have to handle moving buffers out
    of already-mapped ranges, and would lose control over its own memory
    management schemes.

    Therefore, I propose a GL extension that allows the GL client to
    provide a mapping and unmapping callback to the implementation, to
    be used whenever the driver needs to perform such operations. This
    way the driver remains in full control of its memory management
    affairs, and the amount of work for an implementation, as well as
    the potential for bugs, is kept minimal. I've written a draft
    implementation in Zink using map_memory_placed [1] and a
    corresponding Wine MR utilizing it [2], and would be curious to hear
    your thoughts. I don't have experience in the Mesa codebase, so I
    apologize if the branch is a tad messy.
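
    To give a rough idea, the client-facing side could look something
    like the following; these names are entirely hypothetical, and the
    real interface is in the branch linked below:

        /* Client-provided allocator: reserve 'size' bytes of address
           space with the given alignment, or return NULL to let the
           driver fall back to its normal mmap path. */
        typedef void *(*GLmapcallbackEXT)(size_t size, size_t alignment,
                                          void *user_data);
        typedef void (*GLunmapcallbackEXT)(void *addr, size_t size,
                                           void *user_data);

        /* Registered once per context; afterwards the driver routes the
           mappings backing its buffer objects through these hooks, so
           any pointer later returned by glMapBuffer lies inside
           client-provided pages. */
        void glSetMemoryCallbacksEXT(GLmapcallbackEXT map_cb,
                                     GLunmapcallbackEXT unmap_cb,
                                     void *user_data);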

    In theory, the only requirement the extension would place on drivers
    is that glMapBuffer always return a pointer from within a page
    allocated through the provided callbacks, so that it is guaranteed
    to be positioned within the required address space. Wine would then
    use its existing workaround for other types of buffers, but as Mesa
    seems to often return directly mapped buffers in those cases too,
    Wine could avoid the slowdown that comes with copying there as well.

    Why not use Zink?:

    There's also a proposal to use a 32-bit PE build of Zink in Wine,
    bypassing the need for an extension; I brought this up for
    discussion in a Wine-Devel thread last week [3], which contains some
    arguments against that approach.


    If any of you have thoughts, concerns, or questions about this
    potential approach, please let me know. Thanks!

    1: https://gitlab.freedesktop.org/Guy1524/mesa/-/commits/placed_allocation

    2: https://gitlab.winehq.org/wine/wine/-/merge_requests/6663

    3: https://marc.info/?t=172883260300002&r=1&w=2

