On 12/10/2022 11:01, Julien Grall wrote:
> (+ Bertrand & Stefano)
>
> Hi Henry,
>
> On 12/10/2022 07:39, Henry Wang wrote:
>>> -----Original Message-----
>>> Subject: Re: [xen-unstable-smoke test] 173492: regressions - FAIL
>>>
>>> On 11.10.2022 18:29, osstest service owner wrote:
>>>> flight 173492 xen-unstable-smoke real [real]
>>>> http://logs.test-lab.xenproject.org/osstest/logs/173492/
>>>>
>>>> Regressions :-(
>>>>
>>>> Tests which did not succeed and are blocking,
>>>> including tests which could not be run:
>>>>   test-arm64-arm64-xl-xsm      14 guest-start              fail REGR. vs. 173457
>>>
>>> Parsing config from /etc/xen/debian.guest.osstest.cfg
>>> libxl: debug: libxl_create.c:2079:do_domain_create: ao 0xaaaacaccf680: create: how=(nil) callback=(nil) poller=0xaaaacaccefd0
>>> libxl: detail: libxl_create.c:661:libxl__domain_make: passthrough: disabled
>>> libxl: debug: libxl_arm.c:148:libxl__arch_domain_prepare_config: Configure the domain
>>> libxl: debug: libxl_arm.c:151:libxl__arch_domain_prepare_config:  - Allocate 0 SPIs
>>> libxl: error: libxl_create.c:709:libxl__domain_make: domain creation fail: No such file or directory
>
> So this is -ENOENT, which could be returned by the P2M if it can't
> allocate a page table (see p2m_set_entry()).
>
>>> libxl: error: libxl_create.c:1294:initiate_domain_create: cannot make domain: -3
>>>
>>> Later flights don't fail here anymore, though.
>>>
>>>>   test-armhf-armhf-xl          14 guest-start              fail REGR. vs. 173457
>>>
>>> Similar log contents here, but later flights continue to fail the
>>> same way.
>>>
>>> I'm afraid I can't draw conclusions from this; I haven't been able to
>>> spot anything helpful in the hypervisor logs. My best guess right now
>>> is the use of some uninitialized memory, which just happened to go
>>> fine in the later flights for 64-bit.
>
> It looks like the smoke flight failed on laxton0 but passed on
> rochester{0, 1}. The former is using GICv2 whilst the latter are using
> GICv3.
>
> In the case of GICv2, we will create a P2M mapping when the domain is
> created. This is not necessary for GICv3.
>
> IIRC the P2M pool is only populated later on (we don't add a few pages
> like on x86). So I am guessing this is why we are seeing the failure.
>
> If that's correct, then this is a complete oversight from me (I
> haven't done any GICv2 testing) while reviewing the series.
>
> The easy way to solve it would be to add a few pages in the pool when
> the domain is created. I don't like it, but I think the other
> possible solutions would require more work as we would need to delay
> the mappings.

Honestly, I've considered doing this on x86 too.

There are several things which want allocating in domain_create(), but
are deferred to max_vcpus() because they require the P2M to have a
non-zero allocation.  This in turn means we've got a load of checks in
paths where we'd ideally not have them.

We already have a calculation of the absolute minimum we will ever
permit the p2m pool to be.  IMO we ought to allocate this minimum size
in domain_create().
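
To make the idea concrete, here is a toy model in plain C. It is only a
sketch and not Xen code: struct toy_domain, toy_p2m_set_entry(),
toy_domain_create() and TOY_P2M_MIN_PAGES are all made-up names, and the
value 4 is an arbitrary placeholder rather than the real minimum. It
models a creation-time mapping (the GICv2 case described above) failing
with -ENOENT when the pool is empty, and succeeding once the pool has
been pre-filled during creation.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Toy model only: none of these names are real Xen structures or functions. */
#define TOY_P2M_MIN_PAGES 4u /* hypothetical floor, not Xen's actual value */

struct toy_domain {
    unsigned int pool_free; /* pages available for P2M page tables */
    unsigned int mapped;    /* number of mappings installed so far */
};

/*
 * Install one mapping. In this model every mapping consumes one page
 * from the pool for a page table; an empty pool yields -ENOENT, which
 * is the error reported in the libxl log above.
 */
static int toy_p2m_set_entry(struct toy_domain *d)
{
    assert(d != NULL);

    if (d->pool_free == 0)
        return -ENOENT;

    d->pool_free--;
    d->mapped++;
    return 0;
}

/*
 * Create a domain.
 *   prefill            - pages placed in the pool before any mapping
 *   need_early_mapping - non-zero models a configuration (GICv2 in the
 *                        discussion above) that maps at creation time
 */
static int toy_domain_create(struct toy_domain *d, unsigned int prefill,
                             int need_early_mapping)
{
    assert(d != NULL);

    d->pool_free = prefill;
    d->mapped = 0;

    if (need_early_mapping)
        return toy_p2m_set_entry(d);

    return 0;
}
```

In this model, toy_domain_create(&d, 0, 1) fails with -ENOENT, while
toy_domain_create(&d, TOY_P2M_MIN_PAGES, 1) succeeds. A domain that
needs no early mapping succeeds either way, which would match the
GICv2 vs GICv3 split seen in the flights.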

~Andrew
