On 29/01/2026 04:41, Akihiko Odaki wrote:
On 2026/01/29 0:46, BALATON Zoltan wrote:
On Wed, 28 Jan 2026, Peter Maydell wrote:
On Wed, 28 Jan 2026 at 11:40, BALATON Zoltan <[email protected]> wrote:
OK I try to summarise the motivation again:
1. Documentation in docs/devel/memory.rst says that memory regions'
lifecycle is managed by QOM and they are freed with their owner or when
nothing else uses them. This is also already implemented for a long time
as described but cannot be used because the only constructors available
kill this feature when calling object_initialize that clears the free
function added by object_new. (The life time management is implemented
through adding memory regions as children to the owner and unparenting
them on freeing the owner which decreases ref count of the memory region
and will free it when nothing else references it as far as I can tell.)
If we have leaks because of our very common pattern of "embed
a MemoryRegion struct in the device state struct" then we must
fix those, because there's no way we're going to convert all
that existing code to a new set of APIs. But I was under the
impression we had already dealt with those, because MRs track
their owner's refcount, and don't have their own independent one ?
I'm not sure if all those leaks are resolved as there were some patches and
discussion about this recently but I think that problem or the need to use the
owner's ref count to circumwent it instead of using the memory region's own ref
count may also come from that there's currently no way to allocate memory regions
that are ref counted and automatically freed as it should work with QOM and the
documentation implies. (Only the constuctor is missing that is all this series
adds, the mechanism is already there and implemented.) There may still be a problem
with circular references if the memory region needs the owner so the owner can't be
freed until the memory region is also freed but the memory region is not freed
until the owner is freed but if both the owner and memory region used their own ref
count things may become a bit less confusing and could be easier to find a way to
break circular reference (e.g. by owner unparent child regions on unrealize but
isn't freed until memory regions unref owner in their free method).
These are my motivation for this change. What is the motivation for using
embedded memory regions instead and against this change?
Simply that it's a consistent pattern we use in a lot of the codebase:
the device embeds a lot of the structs it uses, rather than allocating
memory for them and keeping pointers to that allocated memory. We
You mix in the issue of SoCs and complex devices using other devices in which case
the recommendation was to embed those in the parent device so they don't have to be
freed or kept track of by a pointer but won't be leaked. This series does not mean
to change that, it's only limited to memory regions. (Although that problem may
also stem from similar issue with object_initialize_child not allowing creating
reference counted objects only initializing preallocated instances but that's not
something this series touches.)
We can say that memory regions are like other embedded objects but they are often
used for sysbus and PCI devices only to be registered in the parent device that
already has pointers in their state to track these so there's no need to keep track
of them in the subclass if we can rely on QOM freeing them when not needed any more
and this is already implemented and documented that way. So even if we keep
embedding other child devices into complex parent devices that I think does not
directly apply to memory regions and we could use what the documentation and
implementation already allows and says for memory regions at least.
still have also various older device models that use the previous
pattern of "allocate memory and have pointers" too, but most new
code doesn't do that. I think we should for preference write code
in one pattern, not two, and "embed structs" seems to be what
we have mostly settled on for new code.
There is an argument to be made that the pointer model would
fit better with a possible future world of "the user can wire
configurably wire up their own board model from devices", and
that it works better in a part-Rust-part-C world where the two
different languages don't have convenient access to the exact
size of structs defined in the other language. But that future
model is not something anybody has yet really fleshed out in any
detail, so it's still a bit speculative.
You keep mentioning pointers but the point of ref counts and regisrering memory
region as child of an owner is to avoid needing a pointer or embedding it in the
subclass state as the relationship and lifecycle management are then handled by
QOM. If we don't use that we could remove this from QOM and memory regions to
simplify it but if it's already there and makes the device state simpler I think we
better use it.
I'm not actually opposed to the idea of making a design decision
that this struct-embedding is no longer what we want to do, and defining
that something else is our new best practice for how to write devices.
But I think we would need to start by reaching a consensus that that
*is* what we want to do, and documenting that "best practice" somewhere
in docs/devel/. Then we can examine proposed new APIs and all be
on the same page about the design patterns we want and it will
be clearer to reviewers whether the new APIs fit into those
patterns or not.
I think we're in that discussion now in this thread. I don't propose to change the
struct-embedding for sub devices used in SoC or south bridge or other complex
devices but only propose to not embed memory regions that are already documented as
and handled by QOM and simply allocate them and let QOM handle them so we only need
to reference them in the devices state unless they are needed for some reason by
the device methods which is rarely the case. So this is limited to memory regions
and the series only seems to add a lot of lines because of the extensive
documentation comments. The actual change is just factoring out actual memory
region init from memory_region_init functions then add a memory_region_new variant
that does object_new; do_init and keep the memory_region_init do object_initializel
do_init. Nothing else is changed, the way to manage and free regions based on ref
counting is already there this series just enables them to be actually used becase
currently despite what the docs say memory regions are either leaked or must be
embedded.
I actually think deprecating struct-embedding for all QOM objects is a good
idea.
The problem initially stated in this thread is that embedding requires having extra
field, but people see the benefit is too small. There is no real logic involved in
having such fields so it does not reduce code complexity much; it saves some lines
and that's it.
However, I see another problem in struct embedding; it breaks object_ref(). When
embedding, the child object effectively takes the reference to the storage of the
parent object, but this reference is not counted, so use-after-free can happen if
someone takes a reference to the child object with object_ref(). That is why the
wrapper of object_ref() in rust/qom/src/qom.rs needs to be marked unsafe. Memory
regions workaround this with memory_region_ref(), but it's not perfect since it
relies on object_ref() in the end.
For this reason I think object_initialize(), object_initialize_child(), and the like
are better to be noted as deprecated in
include/qom/object.h. Then memory_region_init() can be deprecated referring to
them.
FWIW this is something I've been thinking about for some time: possibly I had a chat
with Phil about it at some point? Once example could be if you want to have a
reference to a parent type like PCIDevice that can change at runtime e.g. it could be
PCINE2000State or PCIPCNetState then you can never embed it, because you don't know
the size at compile time. So then why not use object_property_add_child() everywhere
so there is just a single way of doing things?
For memory regions it's a bit trickier because as per the virtio-gpu issues you've
been looking at, it is possible for the memory region to exist outside of its parent
device until it is destroyed later by the RCU thread. Is this something that can be
solved by manipulating the refcount?
I do agree with Peter too that we don't want to add yet another way of doing things:
if we decide to change the way memory regions work, we should go all-in and update
all callers to reflect the new API and remove the old one. Having one easily
understood way of modelling things makes life much easier for contributors and
reviewers alike.
ATB,
Mark.