On 2024-11-06 19:46, Daniel P. Smith wrote:
On 11/6/24 17:16, Jason Andryuk wrote:
On 2024-11-02 13:25, Daniel P. Smith wrote:
A precarious approach was used to release the pages used to hold a
boot module.
The precariousness stemmed from the fact that in the case of PV dom0,
the
initrd module pages may be either mapped or explicitly copied into
the dom0
address space. So to handle this situation, the PV dom0 construction
code will
set the size of the module to zero, relying on
discard_initial_images() to skip
any modules with a size of zero.
A function is introduced to release a module when it is no longer
needed that
accepts a boolean parameter, free_mem, to indicate if the
corresponding pages
can be freed. To track that a module has been released, the boot
module flag
`released` is introduced.
The previous release model was a free all at once except those of
size zeros,
which would handle any unused modules passed. The new model is one
of, free
consumed module after usage is complete, thus unconsumed modules do
not have a
consumer to free them.
Slightly confusing. Maybe just "The new model is to free modules
after they are consumed. Thus unconsumed modules are not freed."
okay.
To address this, the discard_uknown_boot_modules() is
"unknown"
Ack
introduced and called after the last module identification occurs,
initrd, to
free the pages of any boot modules that are identified as not being
released.
After domain construction completes, all modules should be freed. A
walk of the
boot modules is added after domain construction to validate and
notify if a
module is found not to have been released.
Signed-off-by: Daniel P. Smith <[email protected]>
---
+ {
+ struct boot_module *bm = &bi->mods[i];
+
+ if ( bm == NULL || bm->released )
continue;
- init_domheap_pages(start, start + PAGE_ALIGN(size));
+ release_boot_module(bm, true);
+ count++;
}
- bi->nr_modules = 0;
+ if ( count )
+ printk(XENLOG_DEBUG "Releasing pages for uknown boot module
%d\n",
"unknown". Since the operation already happened, maybe "Released
pages for %d unknown boot modules"? I'm not sure of the value of that
message. It would be more informative to provide the module index and
maybe a page count. The index would at least point to which module is
unused.
Ack to unknown.
Can adjust the phrasing, the question is there a desire to have a
message for every boot module freed. Guess I could do a single log line
split across multiple printks, Thinking about the case where someone
tried to abuse the interface by loading a bunch of unused modules.
It's 63 modules, so not too many. And this is the boot path, so it's
administrator controlled and shouldn't really be subject to abuse.
I think printing the index of unused modules makes the message useful.
If a module is provided to Xen, but Xen doesn't use it, it is worth
flagging which one. As an example, if I thought module 3 was ucode, but
it isn't getting used, that is useful information for further investigation.
+ count);
}
static void __init init_idle_domain(void)
@@ -2111,6 +2139,8 @@ void asmlinkage __init noreturn __start_xen(void)
initrdidx);
}
+ discard_unknown_boot_modules();
+
/*
* We're going to setup domain0 using the module(s) that we
stashed safely
* above our heap. The second module, if present, is an initrd
ramdisk.
@@ -2122,6 +2152,11 @@ void asmlinkage __init noreturn __start_xen(void)
if ( !dom0 )
panic("Could not set up DOM0 guest OS\n");
+ /* Check and warn if any modules did not get released */
+ for ( i = 0; i < bi->nr_modules; i++ )
+ if ( !bi->mods[i].released )
+ printk(XENLOG_ERR "Boot module %d not released, memory
leaked", i);
+
Why not release the memory here instead of leaking it?
Because you don't know if it was mapped or consumed.
So this is more of a sanity check since it should not trigger? i.e.
it's a bug to see this message.
I feel like the corner case of mapping the dom0 initrd is leading to
this manual approach or releasing modules individually. That logic
then has to be spread around. discard_initial_images() kept the logic
centralized. Maybe just replace the mod_end == 0 special case with a
"don't release me" flag that is checked in discard_initial_images()?
That is what started me at the options to deal with it. The two I came
up with was a flag and this approach. Weighing the pros/cons of the two,
the deciding factor is when multi-domain construction is introduced.
With multi-domain with a large number of domains, a balance has to be
struck between holding all the kernels and ramdisks in memory plus being
able to allocate the desired amount of memory for each domain.
So you're saying that by piece-wise free-ing memory, you can have more
domUs loaded by the boot loader? If free-ing is delayed to the end,
memory is tied up that could otherwise be assigned to domUs?
Is this the real motivation? If so, it belongs in the commit message.
As it is, the commit message is lacking in a specify reason why this
change is needed.
Perhaps a
balance can be struck, with a discard_consumed_boot_modules() that walks
the boot module list, and discard the ones consumed. While only dom0 can
be constructed, it would be called once after create_dom0 call returns
(per Andy's suggestion in response to this comment). An expansion on
this could be that instead of using a free flag, use a ref count that is
incremented as it is claimed and the decremented when it has been consumed.
Are you expecting 1 kernel image gets used to create N domUs? Or are
you only handling 1:1?
Is mapping the initrd only expected for a single PV dom0, or do you
envision mapping multiple initrds for multiple PV domUs?
It seems to me that more mapping might help alleviate keeping memory
tied up. Freeing XSM and ucode early helps, but those aren't
particularly big. An initrd in the 100s of MBs is significant though.
If you hyperlaunch Xen and 2 domUs, you'd want to be able to assign ~all
the memory. Losing 2x ~100MB as unassigned for copies of the initrds is
wasteful.
Regards,
Jason