Some x86 CPUs have errata regarding microcode updates. The most notorious
is Broadwell's BDX90: "Loading Microcode ... May Result in a System Hang".
(URL:
https://www.intel.com/content/dam/www/public/us/en/documents/specification-updates/xeon-e7-v4-spec-update.pdf)
CPUs are supposed to be idle during the initial microcode update. Idle-scrub
changes this: a CPU starts scrubbing (memset) memory right after it has been
brought up. This can get in the way of the microcode update on other CPUs,
which results in a system hang:
[ 0.000000] CPU Vendor: Intel, Family 6 (0x6), Model 71 (0x47), Stepping 1 (raw 00040671)
...
[ 2.598813] HVM: Hardware Assisted Paging (HAP) detected
[ 2.600211] HVM: HAP page sizes: 4kB, 2MB, 1GB
[ 0.000000] microcode: CPU2 updated from revision 0x11 to 0x1e, date = 2018-04-03
[ 0.000000] microcode: CPU4 updated from revision 0x11 to 0x1e, d
Prevent this situation by disabling idle scrubbing until
SYS_STATE_smp_booted is reached.
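The gating idea can be illustrated with a small standalone model. This is
only a sketch, not the Xen code itself: the model_* names, the simplified
state enum and the dirty-page counter are hypothetical stand-ins, and the
only behaviour borrowed from the patch is the early "return false" while the
system state is below the SMP-booted phase.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical, simplified stand-in for the hypervisor's boot phases.
 * Only the relative ordering of the values matters for this sketch.
 */
enum model_state {
    MODEL_STATE_early_boot,
    MODEL_STATE_boot,
    MODEL_STATE_smp_booted,
    MODEL_STATE_active,
};

static enum model_state model_system_state = MODEL_STATE_early_boot;

/* Hypothetical count of free pages that still need scrubbing. */
static unsigned int model_dirty_pages = 4;

/* Scrub at most one page; returns true if any work was done. */
static bool model_scrub_free_pages(void)
{
    /* Same shape as the check in the patch: no scrubbing before SMP boot. */
    if ( model_system_state < MODEL_STATE_smp_booted )
        return false;

    if ( model_dirty_pages == 0 )
        return false;

    model_dirty_pages--;
    return true;
}

/* Model of an idle CPU polling for scrub work; returns pages scrubbed. */
static unsigned int model_idle_passes(unsigned int iterations)
{
    unsigned int scrubbed = 0;

    for ( unsigned int i = 0; i < iterations; i++ )
        if ( model_scrub_free_pages() )
            scrubbed++;

    return scrubbed;
}

/* Walk through the boot sequence and check the gate at each step. */
static void model_demo(void)
{
    /* Secondary CPUs idle during bring-up: nothing is scrubbed. */
    assert(model_idle_passes(3) == 0);
    assert(model_dirty_pages == 4);

    /* Boot CPU announces that all CPUs are up; scrubbing may begin. */
    model_system_state = MODEL_STATE_smp_booted;
    assert(model_idle_passes(3) == 3);
    assert(model_idle_passes(3) == 1);
}
```

In this model, idle passes made before the state change do no work, which
is why the real patch also sends an event to the online CPUs once
SYS_STATE_smp_booted is set: CPUs that already went idle need a wake-up
to re-check for scrub work.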
Signed-off-by: Sergey Dyasli <[email protected]>
---
xen/arch/arm/setup.c | 2 ++
xen/arch/x86/setup.c | 2 ++
xen/common/page_alloc.c | 7 +++++++
3 files changed, 11 insertions(+)
diff --git a/xen/arch/arm/setup.c b/xen/arch/arm/setup.c
index 21baee534e..9120c5092d 100644
--- a/xen/arch/arm/setup.c
+++ b/xen/arch/arm/setup.c
@@ -933,6 +933,8 @@ void __init start_xen(unsigned long boot_phys_offset,
/* TODO: smp_cpus_done(); */
system_state = SYS_STATE_smp_booted;
+ /* Wake up secondary CPUs to start idle memory scrubbing */
+ smp_send_event_check_mask(&cpu_online_map);
setup_virt_paging();
diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c
index 189481608d..fea83aee5b 100644
--- a/xen/arch/x86/setup.c
+++ b/xen/arch/x86/setup.c
@@ -1684,6 +1684,8 @@ void __init noreturn __start_xen(unsigned long mbi_p)
}
system_state = SYS_STATE_smp_booted;
+ /* Wake up secondary CPUs to start idle memory scrubbing */
+ smp_send_event_check_mask(&cpu_online_map);
printk("Brought up %ld CPUs\n", (long)num_online_cpus());
if ( num_parked )
diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c
index 4a2cbda1db..a82d70464e 100644
--- a/xen/common/page_alloc.c
+++ b/xen/common/page_alloc.c
@@ -1261,6 +1261,13 @@ bool scrub_free_pages(void)
nodeid_t node;
unsigned int cnt = 0;
+ /*
+ * Don't start scrubbing until all secondary CPUs have booted and
+ * updated their microcode.
+ */
+ if ( system_state < SYS_STATE_smp_booted )
+ return false;
+
node = node_to_scrub(true);
if ( node == NUMA_NO_NODE )
return false;
--
2.17.1
_______________________________________________
Xen-devel mailing list
[email protected]
https://lists.xenproject.org/mailman/listinfo/xen-devel