Hi
Am 16.10.25 um 12:32 schrieb Boris Brezillon:
On Thu, 16 Oct 2025 11:58:21 +0200
Thomas Zimmermann <[email protected]> wrote:
Hi
Am 16.10.25 um 11:40 schrieb Boris Brezillon:
Hi Thomas,
On Thu, 16 Oct 2025 10:32:46 +0200
Thomas Zimmermann <[email protected]> wrote:
Hi,
on patches 2 to 4: sync is really begin/end access wrapped into one
interface, which I find questionable. I also don't like that these
patches add generic infrastructure for a single driver.
It's actually two drivers (panfrost and panthor), and the interface is
here so other drivers relying on drm_gem_shmem don't have to hand-roll
these things in the future.
Ok.
But what about the unrelated drivers? Most GEM SHMEM drivers don't need
this. Looking at patch 4, there's a for loop over the SG table, which
appears to affect all drivers.
It doesn't. First off, this ::sync hook is optional, and if you look at
patch 4's commit message, it says:
"
We provide a drm_gem_shmem_object_sync() for drivers that wants to
have this default implementation as their drm_gem_object_funcs::sync,
but we don't set drm_gem_shmem_funcs::sync to
drm_gem_shmem_object_sync() to avoid changing the behavior of drivers
currently relying on the default gem_shmem vtable.
"
You know, there are two types of GEM SHMEM users. The ones like panthor
that build upon it with additional features. And the other type that
only use it as simple buffer storage. The former type what's building
blocks to combine as needed; that latter (actually the majority) wants a
simple solution. And TBH, I've considered spitting GEM SHMEM into two
for this reason.
That's the very reason I don't force
::sync = drm_gem_shmem_sync
but leave it to each driver to decide whether they want it or not.
Apologies for misunderstanding. My impression was that the sync helper
is now the default. But that brings me to another question: where do you
set the gem callback? I applied your series and grepped for it. It's not
set in the panthor/panfrost gem funcs nor the default funcs for shmem. I
can only find the sync calls in the ioctl code.
My proposal is to make your own dma_buf exporter in panthor and set the
begin/end_cpu_access functions as you need. Panthor already contains its
own GEM export helper at [1]. If you inline drm_gem_prime_export() [2]
you can set the cpu_access callbacks to panthor-specific code. The other
dma-buf helpers' symbols should be exported and can be re-used. That is
a lot less intrusive and should provide what you need.
I can of course do that in panthor, but then I'll have to duplicate
the same logic in panfrost. Also, the whole point of making it generic
is so that people don't forget that begin/end_cpu_access() is a thing
they should consider (like happened to me if you look at v2 of this
patchset), otherwise importers of their buffers might have unpleasant
side effects because of missing flush/invalidates. This, IMHO, is a good
reason to have it as a drm_gem_funcs::sync() callback. That, or we
decide that the default dma_buf_ops is not a thing, and we force
developers to think twice when they select the default hooks to pick
for their dma_buf implementation.
See my comment above. Panthor and panfrost should treat GEM SHMEM like a
set of building blocks rather and a full solution.
That's exactly what it does. Panfrost/Panthor have their own
drm_gem_object_funcs, they just use the default drm_gem_shmem_sync
implementation provided by this patch, that's all. And yes,
hand-rolling your own drm_prime implem is not only annoying, it's also
error prone because some of the gem_shmem helpers [1] might rely on
drm_gem_is_prime_exported_dma_buf(), which checks the dma_buf ops
against the default drm_prime implementation.
I know, but that's a shortcoming of the current implementation. IIRC all
driver's prime importers have code like [1] at the top, so we probably
should do that in common code.
But that's not the point here. I really do think that drivers either use
a simple turn-key solution OR hand-pick the building blocks for the
complex scenarios. There's no middle ground IMHO. IIRC the original TTM
was supposed to provide a full solution for the complex scenarios and
failed at that.
I found that a hand full of other DRM drivers implement dma-buf's
begin/end access callbacks. If you want a generic begin/end interface
for GEM, you certainly want to get them on board. If there's something
common to share, this should be done in a separate series.
Fair enough. I'll try to convert freedreno and imagination to this new
interface, and gather some feedback.
I did 'git grep \.begin_cpu_access' in the drm directory. This returned,
amdgpu, i915, tegra, omap, and xe. These where the ones I had in mind.
Hm, so tegra/omap should be relatively easy, but I'm reluctant to touch
the big drivers here (amdgpu, i915 and xe). Because it's an opt-in, and
because these drivers already have hand-rolled these dma_buf_ops, I'd
rather let their owners deal with the transition if they want to. IOW,
if you ask me to find new users, I'd like to focus on relatively small
drivers, and primarily those relying on drm_gem_shmem already. Note
that there might be drivers lacking a {begin,end}_cpu_access()
implementation, but don't know it, because there's very few use cases
where GPU is the exporter and the importer is not the same GPU device.
The other reason would be that most drivers relying on gem_shmem set
::map_wc=true unconditionally (CPU mappings are not cached, and thus
don't require synchronization), or just that those devices are IO
coherent.
I'm not asking you to rewrite all these drivers, but to take them into
consideration. Feedback from their devs would be valuable. Semantics of
similar operations in dma-buf and GEM sometimes differ in subtle ways.
Things like pinning, locking and eviction might be of importance to make
sync work for all drivers.
Honestly, I'm not too sure why this is a problem if this hook is
optional. If it turns out to be too simple for more complex use cases
others have, it can still be extended when those drivers transition to
this ::sync() approach, as no in-kernel API is immutable. And in the
meantime, we have a solution for two drivers that doesn't imply
duplicating a bunch of drm_prime boiler-plate that's otherwise rather
generic.
The prime code you'd have to duplicate is just 10 lines, plus some small
per-driver code. Most of that being data-structure inits.
I want to point out that I'm not opposing the general idea of GEM sync,
but I think it should get more feedback from others. It's supposed to be
a generic interface after all. Hence I was asking to put all this into a
separate series.
Best regards
Thomas
[1]https://elixir.bootlin.com/linux/v6.17.1/source/drivers/gpu/drm/drm_gem_shmem_helper.c#L825
--
--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Frankenstrasse 146, 90461 Nuernberg, Germany
GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman
HRB 36809 (AG Nuernberg)