On 12/19/20 3:05 AM, David Woodhouse wrote:
> On Fri, 2020-12-18 at 17:20 -0500, [email protected] wrote:
>> Are there other cases where this would be useful? If it's just to
>> test a hypervisor under development I would think that people who are
>> doing that kind of work are capable of building their own kernel. My
>> concern is mostly about having yet another boot option that is of
>> interest to very few people who can easily work around not having it.
> For hypervisor testing we can just set the Xen major version number in
> CPUID to 3, and that stops xs_reset_watches() from doing anything.
>
> cf. https://lkml.org/lkml/2017/4/10/266
>
> Karim ripped out all this INTX code in 2017 because it was broken, and
> subsequently put it back because it *was* working for older versions of
> Xen, due to that "coincidence". The conclusion back then was that if it
> was put back it should at least *work* consistently, and he was going
> to send a patch "shortly". This is a that patch :)
Right, I am not arguing about usefulness of the fix, only of the new boot
option.
>
>>> Add a 'no_vector_callback' command line argument to test it.
>> This last one should be a separate patch I think.
> Could do.
>
>>> + /*
>>> + * It doesn't strictly *have* to run on CPU0 but it sure
>>> + * as hell better process the event channel ports delivered
>>> + * to CPU0.
>>> + */
>>> + irq_set_affinity(pdev->irq, cpumask_of(0));
>>> +
>>
>> Is the concern here that it won't be handled at all?
> Indeed, the events don't get handled at all if the PCI interrupt lands
> on a CPU other than zero. When the handler calls
> xen_hvm_evtchn_do_upcall() that processes pending events for whichever
> CPU it happens to be running on, and *not* the events pending for CPU0.
> And the boot stops in xs_reset_watches() waiting (without a timeout)
> for an interrupt that never gets processed, as before.
Yes, I see. Then please do it in a separate patch.
>
>> And is this related to the issue this patch is addressing?
> It is required to fix the event channel callback via INTX/GSI, yes.
> Although it could reasonably be lifted out into a separate patch too.
>
>>> static int __init xenbus_probe_initcall(void)
>>> {
>>> - if (!xen_domain())
>>> - return -ENODEV;
>>> -
>>> - if (xen_initial_domain() || xen_hvm_domain())
>>> - return 0;
>>> + /*
>>> + * Probe XenBus here in the XS_PV case, and also XS_HVM unless we
>>> + * need to wait for the platform PCI device to come up, which is
>>> + * the (XEN_PVPVM && !xen_have_vector_callback) case.
>>> + */
>>> + if (xen_store_domain_type == XS_PV ||
>>> + (xen_store_domain_type == XS_HVM &&
>>> + (!IS_ENABLED(CONFIG_XEN_PVHVM) || xen_have_vector_callback)))
>>> + xenbus_probe();
>>>
>>> - xenbus_probe(NULL);
>>> return 0;
>>> }
>>> -
>>> device_initcall(xenbus_probe_initcall);
>>>
>>> +int xen_set_callback_via(uint64_t via)
>>> +{
>>> + struct xen_hvm_param a;
>>> + int ret;
>>> +
>>> + a.domid = DOMID_SELF;
>>> + a.index = HVM_PARAM_CALLBACK_IRQ;
>>> + a.value = via;
>>> +
>>> + ret = HYPERVISOR_hvm_op(HVMOP_set_param, &a);
>>> + if (ret)
>>> + return ret;
>>> +
>>> + /*
>>> + * If xenbus_probe_initcall() deferred the xenbus_probe()
>>> + * due to the callback not functioning yet, we can do it now.
>>> + */
>>> + if (!xenstored_ready && xen_store_domain_type == XS_HVM &&
>>> + IS_ENABLED(CONFIG_XEN_PVHVM) && !xen_have_vector_callback)
>>
>> I'd create an is_callback_ready() (or something with a better name)
>> helper.
> I pondered that, and indeed dropping the XVM/vector conditions and
> doing it literally based on whether xen_set_callback_via() had been
> called at all (and not too early). But it looks like there are cases
> where Arm doesn't call xen_set_callback_via() at all, and it made more
> sense to me to live xen_set_callback_via() to sit right here and have
> those two conditions within a page of each other in juxtaposition, with
> suitable comments. I think that's probably easier to understand and
> work with than a "helper".
OK.
>
>>> + xenbus_probe();
>>> +
>>> + return ret;
>>> +}
>>> +EXPORT_SYMBOL_GPL(xen_set_callback_via);
>>> +
>>> /* Set up event channel for xenstored which is run as a local process
>>> * (this is normally used only in dom0)
>>> */
>>> @@ -817,11 +851,17 @@ static int __init xenbus_init(void)
>>> break;
>>> }
>>>
>>> - /* Initialize the interface to xenstore. */
>>> - err = xs_init();
>>> - if (err) {
>>> - pr_warn("Error initializing xenstore comms: %i\n", err);
>>> - goto out_error;
>>> + /*
>>> + * HVM domains may not have a functional callback yet. In that
>>> + * case let xs_init() be called from xenbus_probe(), which will
>>> + * get invoked at an appropriate time.
>>> + */
>>> + if (xen_store_domain_type != XS_HVM) {
>>
>> Can we delay xs_init() for !XS_HVM as well? In other words wait until
>> after PCI platform device has been probed (on HVM) and then call
>> xs_init() for everyone.
> We're half-way there already, because xenbus_probe() *does* happen
> later as a device_initcall, and I've just made it call xs_init().
>
> We could make it avoid calling xs_init() from xenbus_init() in the
> XS_HVM *and* XS_PV cases fairly easily, and let xenbus_probe() do it.
Yes, that's along the lines of what I was thinking.
>
> But right now xenbus_probe() doesn't run for the other cases, so
> there'd have to be a mode where it *only* calls xs_init() and doesn't
> do the notifier chain. That seems like more churn that was needed so I
> didn't do it.
You think so? Yes, there would be a couple more places where you'd need to call
xenbus_probe() but then you won't have to explain (with comments) why you call
xs_init() here and not there and vice versa. It just looks to me a bit more
complicated the way you do this but I suppose it's a matter of personal
preference.
-boris