On 12/10/2022 11:01, Julien Grall wrote:
> (+ Bertrand & Stefano)
>
> Hi Henry,
>
> On 12/10/2022 07:39, Henry Wang wrote:
>>> -----Original Message-----
>>> Subject: Re: [xen-unstable-smoke test] 173492: regressions - FAIL
>>>
>>> On 11.10.2022 18:29, osstest service owner wrote:
>>>> flight 173492 xen-unstable-smoke real [real]
>>>> http://logs.test-lab.xenproject.org/osstest/logs/173492/
>>>>
>>>> Regressions :-(
>>>>
>>>> Tests which did not succeed and are blocking,
>>>> including tests which could not be run:
>>>>  test-arm64-arm64-xl-xsm  14 guest-start  fail REGR. vs. 173457
>>>
>>> Parsing config from /etc/xen/debian.guest.osstest.cfg
>>> libxl: debug: libxl_create.c:2079:do_domain_create: ao 0xaaaacaccf680: create: how=(nil) callback=(nil) poller=0xaaaacaccefd0
>>> libxl: detail: libxl_create.c:661:libxl__domain_make: passthrough: disabled
>>> libxl: debug: libxl_arm.c:148:libxl__arch_domain_prepare_config: Configure the domain
>>> libxl: debug: libxl_arm.c:151:libxl__arch_domain_prepare_config: - Allocate 0 SPIs
>>> libxl: error: libxl_create.c:709:libxl__domain_make: domain creation fail: No such file or directory
>
> So this is -ENOENT, which could be returned by the P2M if it can't
> allocate a page table (see p2m_set_entry()).
>
>>> libxl: error: libxl_create.c:1294:initiate_domain_create: cannot make domain: -3
>>>
>>> Later flights don't fail here anymore, though.
>>>
>>>>  test-armhf-armhf-xl  14 guest-start  fail REGR. vs. 173457
>>>
>>> Similar log contents here, but later flights continue to fail the
>>> same way.
>>>
>>> I'm afraid I can't draw conclusions from this; I haven't been able
>>> to spot anything helpful in the hypervisor logs. My best guess right
>>> now is the use of some uninitialized memory, which just happened to
>>> go fine in the later flights for 64-bit.
>
> It looks like the smoke flight failed on laxton0 but passed on
> rochester{0, 1}. The former is using GICv2 whilst the latter are using
> GICv3.
>
> In the case of GICv2, we will create a P2M mapping when the domain is
> created. This is not necessary in the GICv3.
>
> IIRC the P2M pool is only populated later on (we don't add a few pages
> like on x86). So I am guessing this is why we are seeing failures.
>
> If that's correct, then this is a complete oversight from me (I
> haven't done any GICv2 testing) while reviewing the series.
>
> The easy way to solve it would be to add a few pages in the pool when
> the domain is created. I don't like it, but I think the other
> possible solutions would require more work as we would need to delay
> the mappings.

Honestly, I've considered doing this on x86 too.

There are several things which want allocating in domain_create(), but
are deferred to max_vcpus() because they require the P2M having a
non-zero allocation. This in turn means we've got a load of checks in
paths where we'd ideally not have them.

We already have a calculation of the absolute minimum we will ever
permit the p2m pool to be. IMO we ought to allocate this minimum size
in domain_create().

~Andrew
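
For illustration only, the failure mode and the proposed fix can be modelled with a toy sketch. This is not Xen code: the class `P2MPool`, the constant `MIN_POOL_PAGES`, and the simplified `domain_create()` and `p2m_set_entry()` signatures are all hypothetical stand-ins, and the only behaviour taken from the thread is that a mapping made at domain creation fails with -ENOENT when the pool is still empty.

```python
import errno

# Hypothetical minimum pool size; Xen's real calculation is not shown here.
MIN_POOL_PAGES = 4


class P2MPool:
    """Toy stand-in for a per-domain pool of page-table pages."""

    def __init__(self, pages=0):
        self.free_pages = pages

    def alloc_table(self):
        # Take one page-table page from the pool, if any is left.
        if self.free_pages == 0:
            return False
        self.free_pages -= 1
        return True


def p2m_set_entry(pool):
    """Toy mapping call: needs one page-table page, else returns -ENOENT."""
    return 0 if pool.alloc_table() else -errno.ENOENT


def domain_create(needs_early_mapping, prepopulate):
    """Create a toy domain and return (rc, pool).

    needs_early_mapping models the GICv2 case, where a mapping is
    created at domain creation; prepopulate models allocating the
    minimum pool size up front.
    """
    pool = P2MPool(MIN_POOL_PAGES if prepopulate else 0)
    if needs_early_mapping:
        rc = p2m_set_entry(pool)
        if rc:
            return rc, pool
    return 0, pool


if __name__ == "__main__":
    for early, pre in [(True, False), (True, True), (False, False)]:
        rc, _ = domain_create(early, pre)
        print(f"early_mapping={early} prepopulate={pre} rc={rc}")
```

In this model, only the combination of an early mapping and an empty pool fails, which matches the GICv2-only failures described above; pre-populating the pool makes the early mapping succeed without having to delay it.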
