On 17/10/2023 6:24 am, Juergen Gross wrote:
> On 16.10.23 18:24, Andrew Cooper wrote:
>> +command to ``xenstored``.  This instructs ``xenstored`` to connect to the
>> +guest's xenstore ring, and fire the ``@introduceDomain`` watch.  The firing of
>> +this watch is the signal to all other components which care that a new VM has
>> +appeared and is about to start running.
> 
> A note should be added that the control domain is introduced implicitly by
> xenstored, so no XS_INTRODUCE command is needed and no @introduceDomain
> watch is being sent for the control domain.

How does this work for a stub xenstored?  It can't know that dom0 is
alive, and is the control domain, and mustn't assume that this is true.

I admit that I've been a bit vague in the areas where I think there are
pre-existing bugs.  This is one area.

I'm planning a separate document on "how to connect to xenstore" seeing
as it is buggy in multiple ways in Linux (causing a deadlock on boot
with a stub xenstored), and made worse by dom0less creating memory
corruption from a 3rd entity into the xenstored<->kernel comms channel.

(And as I've said multiple times already, shuffling code in one of the
two xenstored's doesn't fix the root of the dom0less bug.  It simply
shuffles it around for someone else to trip over.)

> All components interested in the @introduceDomain watch have to find out for
> themselves which new domain has appeared, as the watch event doesn't contain
> the domid of the new domain.

Yes, but we're intending to change that, and it is diverting focus from
the domain's lifecycle.

I suppose I could put in a footnote discussing the single-bit-ness of
the three signals.

>> +ceased to exist.  It fires the ``@releaseDomain`` watch a second time to
>> +signal to any components which care that the domain has gone away.
>> +
>> +E.g. The second ``@releaseDomain`` is commonly used by paravirtual driver
>> +backends to shut themselves down.
> 
> There is no guarantee that @releaseDomain will always be fired twice for a
> domain ceasing to exist,

Are you sure?

Because the toolstack needs to listen to @releaseDomain in order to
start cleanup, there will be two distinct @releaseDomain's for an
individual domain.

But an individual @releaseDomain can be relevant for a state change in
more than one domain, so there are not necessarily 2*nr_doms worth of
@releaseDomain's fired.

> and multiple domains disappearing might result in
> only one @releaseDomain watch being fired. This means that any component
> receiving this watch event have not only to find out the domid(s) of the
> domains changing state, but whether they have been shutting down only, or
> are completely gone, too.

All entities holding a reference on the domain will block the second
notification until they have performed their own unmap action.

But for entities which don't hold a reference on the domain, there is a
race condition where its @releaseDomain notification is delivered
sufficiently late that the domid has already disappeared.

It's certainly good coding practice to cope with the domain disappearing
entirely underfoot, but entities without held references don't watch
@releaseDomain in the first place, so I don't think this case occurs in
practice.
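
For illustration only, here is a rough sketch of how a watcher might cope
with @releaseDomain carrying no domid: keep a snapshot of the domains it
knows about, and diff it against a fresh listing each time the watch
fires.  Everything named here (classify_release, the "running" and
"shutdown" state strings, the dict-based snapshots) is hypothetical and
not part of any Xen or xenstore API; a real component would get the
current listing from the hypervisor or toolstack.

```python
def classify_release(known, current):
    """Diff two {domid: state} snapshots taken around a @releaseDomain event.

    Both arguments are hypothetical snapshots, not a real Xen API.
    Returns (shut_down, gone) as sorted lists of domids.
    """
    shut_down = []
    gone = []
    for domid, old_state in known.items():
        if domid not in current:
            # The domid has vanished entirely, possibly without this
            # watcher ever having observed the intermediate shutdown.
            gone.append(domid)
        elif old_state != "shutdown" and current[domid] == "shutdown":
            # Still present, but newly shut down.
            shut_down.append(domid)
    return sorted(shut_down), sorted(gone)


# A single watch event may cover state changes in several domains.
known = {1: "running", 2: "running", 3: "shutdown"}
current = {1: "shutdown", 2: "running"}
shut_down, gone = classify_release(known, current)
print(shut_down, gone)
```

The point of the diff is that the watcher never assumes one event maps to
one domain, or that it will see a domain in the shutdown state before it
disappears.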

~Andrew
