Peter Xu <pet...@redhat.com> writes:

> On Thu, Nov 07, 2024 at 12:12:10PM +0100, Markus Armbruster wrote:
>> Peter Xu <pet...@redhat.com> writes:
>> 
>> > On Fri, Oct 25, 2024 at 05:55:59PM -0400, Peter Xu wrote:
>> >> On Fri, Oct 25, 2024 at 11:25:23AM +0200, Markus Armbruster wrote:
>> >> > Peter Xu <pet...@redhat.com> writes:
>> >> > 
>> >> > > More than one x86 IOMMU cannot be created on a system yet.  Make it a
>> >> > > singleton so it guards the system from accidentally creating yet another
>> >> > > IOMMU object when one is already present.
>> >> > >
>> >> > > Now if someone tries to create more than one, e.g., via:
>> >> > >
>> >> > >   ./qemu -M q35 -device intel-iommu -device intel-iommu
>> >> > >
>> >> > > The error will change from:
>> >> > >
>> >> > >   qemu-system-x86_64: -device intel-iommu: QEMU does not support multiple vIOMMUs for x86 yet.
>> >> > >
>> >> > > To:
>> >> > >
>> >> > >   qemu-system-x86_64: -device intel-iommu: Class 'intel-iommu' only supports one instance
>> >> > >
>> >> > > Unfortunately, we can't yet remove the singleton check in the machine
>> >> > > hook (pc_machine_device_pre_plug_cb), because there can also be
>> >> > > virtio-iommu involved, which doesn't share a common parent class yet.
>> >> > >
>> >> > > But with this, it should be closer to reach that goal to check singleton by
>> >> > > QOM one day.
>> >> > >
>> >> > > Signed-off-by: Peter Xu <pet...@redhat.com>
>> >> > 
>> >> > $ qemu-system-x86_64 -device amd-iommu,help
>> >> > /work/armbru/qemu/include/hw/boards.h:24:MACHINE: Object 0x56473906f960 is not an instance of type machine
>> >> > Aborted (core dumped)
>> 
>> [...]
>> 
>> >> Thanks for the report!
>> >> 
>> >> It turns out that qdev_get_machine() cannot be invoked too early, and the
>> >> singleton code can make it earlier..
>> >> 
>> >> We may want a pre-requisite patch to allow qdev_get_machine() to be invoked
>> >> anytime, like:
>> >> 
>> >> ===8<===
>> >> diff --git a/hw/core/qdev.c b/hw/core/qdev.c
>> >> index db36f54d91..7ceae47139 100644
>> >> --- a/hw/core/qdev.c
>> >> +++ b/hw/core/qdev.c
>> >> @@ -831,6 +831,16 @@ Object *qdev_get_machine(void)
>> >>  {
>> >>      static Object *dev;
>> >>  
>> >> +    if (!phase_check(PHASE_MACHINE_CREATED)) {
>> >> +        /*
>> >> +         * When the machine is not created, below can wrongly create
>> >> +         * /machine to be a container.. this enables qdev_get_machine() to
>> >> +         * be used at any time and return NULL properly when machine is not
>> >> +         * created.
>> >> +         */
>> >> +        return NULL;
>> >> +    }
>> >> +
>> >>      if (dev == NULL) {
>> >>          dev = container_get(object_get_root(), "/machine");
>> >>      }
>> >> ===8<===
>> >> 
>> >> I hope it makes sense on its own.
>> >
>> > My apologies, spoke too soon here.  This helper is also used after the
>> > machine is created, but right before switching to PHASE_MACHINE_CREATED.
>> 
>> container_get() is a trap.
>
> I had the same feeling..  Though I'd confess I'm not familiar enough with
> this part of code.
>
>> 
>> When the object to be gotten is always "container", it merely
>> complicates container creation: it's implicitly created on first get.
>> Which of the calls creates may be less than obvious.
>> 
>> When the object to be gotten is something else, such as a machine,
>> container_get() before creation is *wrong*, and will lead to trouble
>> later.
>> 
>> In my opinion:
>> 
>> * Hiding creation in getters is a bad idea unless creation has no
>>   material side effects.
>> 
>> * Getting anything but a container with container_get() is in bad taste.
>
> Agreed.
>
> IMHO container_get() interface might still be ok to implicitly create
> containers,

Creation on demand is fine when we want to create the thing only when
there is demand.

I guess it can also be okay when we want to create it always, but don't
want to decide when exactly (must be before first use), although I
suspect that's just lazy more often than not.

>             but only if it will: (1) always make sure what it walks is a
> container along the way, and (2) never return any non-container.

Yes.  Anything else invites abuse.
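
To make the two conditions concrete, here is a minimal sketch on a toy
object tree.  It is not QEMU code: Node, node_new(), node_add_child()
and container_get_strict() are hypothetical names invented for
illustration, and returning NULL (rather than asserting) on a
non-container is just one possible choice.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for a QOM object: a named node that is a container or not. */
typedef struct Node {
    char name[32];
    int is_container;
    struct Node *child, *sibling;
} Node;

static Node *node_new(const char *name, int is_container)
{
    Node *n = calloc(1, sizeof(*n));

    assert(n);
    strncpy(n->name, name, sizeof(n->name) - 1);
    n->is_container = is_container;
    return n;
}

static Node *node_add_child(Node *parent, Node *child)
{
    child->sibling = parent->child;
    parent->child = child;
    return child;
}

static Node *node_find_child(Node *parent, const char *name, size_t len)
{
    for (Node *c = parent->child; c; c = c->sibling) {
        if (strlen(c->name) == len && memcmp(c->name, name, len) == 0) {
            return c;
        }
    }
    return NULL;
}

/*
 * Walk "/a/b/c" from root, creating missing components as containers.
 * (1) Every node walked must be a container, and (2) the result is
 * always a container.  Anything else yields NULL, never a wrong object.
 */
static Node *container_get_strict(Node *root, const char *path)
{
    Node *cur = root;

    if (!cur->is_container) {
        return NULL;
    }
    while (*path) {
        size_t len;
        Node *next;

        while (*path == '/') {
            path++;                     /* skip separators */
        }
        len = strcspn(path, "/");
        if (len == 0) {
            break;
        }
        next = node_find_child(cur, path, len);
        if (!next) {
            char buf[32] = { 0 };

            if (len >= sizeof(buf)) {
                return NULL;            /* name too long for the toy node */
            }
            memcpy(buf, path, len);
            next = node_add_child(cur, node_new(buf, 1));
        } else if (!next->is_container) {
            return NULL;                /* refuse to walk or return it */
        }
        cur = next;
        path += len;
    }
    return cur;
}
```

With a getter like this, asking for "/machine" when a non-container
machine node sits there fails loudly at the call site instead of
handing back something the caller did not expect.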

>> > So we need another way, like:
>> >
>> > ===8<===
>> >
>> > diff --git a/hw/core/qdev.c b/hw/core/qdev.c
>> > index db36f54d91..36a9fdb428 100644
>> > --- a/hw/core/qdev.c
>> > +++ b/hw/core/qdev.c
>> > @@ -832,7 +832,13 @@ Object *qdev_get_machine(void)
>> >      static Object *dev;
>> >  
>> >      if (dev == NULL) {
>> > -        dev = container_get(object_get_root(), "/machine");
>> > +        /*
>> > +         * NOTE: dev can keep being NULL if machine is not yet created!
>> > +         * In which case the function will properly return NULL.
>> > +         *
>> > +         * Whenever machine object is created and found once, we cache it.
>> > +         */
>> > +        dev = object_resolve_path_component(object_get_root(), "machine");
>> >      }
>> >  
>> >      return dev;
>> 
>> Now returns null instead of a bogus container when called before machine
>> creation.  Improvement of sorts.  But none of the callers expect null...
>> shouldn't we assert(dev) here?
>> 
>> Hmm, below you add a caller that checks for null.
>> 
>> Another nice mess.
>
> I plan to put aside the application of singletons to x86-iommu as of now,
> due to the fact that qdev complexity may better be done separately.
>
> IOW, before that, I wonder whether we should clean up the container_get()
> as you discussed: it doesn't sound like a good interface to return
> non-container objects.
>
> I had a quick look, I only see two outliers of such, and besides the
> "abuse" in qdev_get_machine(), the only other one is
> e500_pcihost_bridge_realize():
>
> *** hw/core/qdev.c:
> qdev_get_machine[820]          dev = container_get(object_get_root(), "/machine");
>
> *** hw/pci-host/ppce500.c:
> e500_pcihost_bridge_realize[422] PPCE500CCSRState *ccsr = CCSR(container_get(qdev_get_machine(),
>                                                                 "/e500-ccsr"));

Yes, this abuses container_get() to get an "e500-ccsr", which is a
device, not a container.

By the way, indentation is confusing here.

> If any of us thinks this is the right way to go, I can try to clean it up
> (for 10.0).  qdev_get_machine() may still need to be able to return NULL
> when singleton applies to IOMMUs, but that can be for later.  Before that,
> we can still assert(qdev), I think.

I think it's worthwhile.
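
For illustration only, the "lookup that may return NULL" versus
"getter that asserts" split could look roughly like the sketch below.
This is a toy model, not a proposed patch: Machine, machine_slot,
get_machine_or_null() and get_machine() are hypothetical names, with
machine_slot standing in for whatever resolves "machine" under the QOM
root.

```c
#include <assert.h>
#include <stddef.h>

typedef struct Machine {
    const char *name;
} Machine;

/* Hypothetical stand-in for the "machine" child of the root object. */
static Machine *machine_slot;

/*
 * Pure lookup: never creates anything.  Returns NULL until a machine
 * exists; once found, the result is cached.
 */
static Machine *get_machine_or_null(void)
{
    static Machine *cached;

    if (cached == NULL) {
        cached = machine_slot;
    }
    return cached;
}

/* For the callers that can only run after machine creation. */
static Machine *get_machine(void)
{
    Machine *m = get_machine_or_null();

    assert(m != NULL);
    return m;
}
```

Callers that can legitimately run early would use the NULL-returning
variant and check; everyone else keeps the asserting one.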

> Just to mention I've posted rfcv2 for this series, again feel free to
> ignore patch 3-5 as of now:
>
> [PATCH RFC v2 0/7] QOM: Singleton interface
> https://lore.kernel.org/r/20241029211607.2114845-1-pet...@redhat.com
>
> I think the plan is Dan may keep collecting feedbacks on his other rfc:
>
> [RFC 0/5] RFC: require error handling for dynamically created objects
> https://lore.kernel.org/r/20241031155350.3240361-1-berra...@redhat.com
>
> Then after Dan's lands, I'll rebase my rfcv2 on top of his, dropping
> iommu/qdev changes.
>
> Thanks,

Makes sense.  Thanks!

