On Mon, Dec 15, 2025 at 11:53 PM Christian König <[email protected]> wrote:
>
> On 12/15/25 14:59, Maxime Ripard wrote:
> > On Mon, Dec 15, 2025 at 02:30:47PM +0100, Christian König wrote:
> >> On 12/15/25 11:51, Maxime Ripard wrote:
> >>> Hi TJ,
> >>>
> >>> On Fri, Dec 12, 2025 at 08:25:19AM +0900, T.J. Mercier wrote:
> >>>> On Fri, Dec 12, 2025 at 4:31 AM Eric Chanudet <[email protected]>
> >>>> wrote:
> >>>>>
> >>>>> The system dma-buf heap lets userspace allocate buffers from the page
> >>>>> allocator. However, these allocations are not accounted for in memcg,
> >>>>> allowing processes to escape limits that may be configured.
> >>>>>
> >>>>> Pass the __GFP_ACCOUNT for our allocations to account them into memcg.
> >>>>
> >>>> We had a discussion just last night in the MM track at LPC about how
> >>>> shared memory accounted in memcg is pretty broken. Without a way to
> >>>> identify (and possibly transfer) ownership of a shared buffer, this
> >>>> makes the accounting of shared memory, and zombie memcg problems
> >>>> worse. :\
> >>>
> >>> Are there notes or a report from that discussion anywhere?
> >>>
> >>> The way I see it, the dma-buf heaps *trivial* case is non-existent at
> >>> the moment and that's definitely broken. Any application can bypass its
> >>> cgroups limits trivially, and that's a pretty big hole in the system.
> >>
> >> Well, that is just the tip of the iceberg.
> >>
> >> Pretty much all driver interfaces doesn't account to memcg at the
> >> moment, all the way from alsa, over GPUs (both TTM and SHM-GEM) to
> >> V4L2.
> >
> > Yes, I know, and step 1 of the plan we discussed earlier this year is to
> > fix the heaps.
> >
> >>> The shared ownership is indeed broken, but it's not more or less broken
> >>> than, say, memfd + udmabuf, and I'm sure plenty of others.
> >>>
> >>> So we really improve the common case, but only make the "advanced"
> >>> slightly more broken than it already is.
> >>>
> >>> Would you disagree?
> >>
> >> I strongly disagree. As far as I can see there is a huge chance we
> >> break existing use cases with that.
> >
> > Which ones? And what about the ones that are already broken?
>
> Well everybody that expects that driver resources are *not* accounted to
> memcg.
>
> >> There has been some work on TTM by Dave but I still haven't found time
> >> to wrap my head around all possible side effects such a change can
> >> have.
> >>
> >> The fundamental problem is that neither memcg nor the classic resource
> >> tracking (e.g. the OOM killer) has a good understanding of shared
> >> resources.
> >
> > And yet heap allocations don't necessarily have to be shared. But they
> > all have to be allocated.
> >
> >> For example you can use memfd to basically kill any process in the
> >> system because the OOM killer can't identify the process which holds
> >> the reference to the memory in question. And that is a *MUCH* bigger
> >> problem than just inaccurate memcg accounting.
> >
> > When you frame it like that, sure. Also, you can use the system heap to
> > DoS any process in the system. I'm not saying that what you're concerned
> > about isn't an issue, but let's not brush off other people legitimate
> > issues as well.
>
> Completely agree, but we should prioritize.
>
> That driver allocated memory is not memcg accounted is actually uAPI, e.g.
> that is not something which can easily change.
>
> While fixing the OOM killer looks perfectly doable and will then most likely
> also show a better path how to fix the memcg accounting.

You think so? I can see how the OOM killer could identify that a
process is using a dmabuf and include that memory use for its decision
making, but the memory for it won't be reclaimed unless *all* users
get killed, which isn't easily known right now.

> Christian.
>
> >
> > Maxime
>
