On Mon, Apr 14, 2014 at 06:44:42PM +0200, Igor Mammedov wrote:
> On Mon, 14 Apr 2014 15:25:01 +0800
> Hu Tao <[email protected]> wrote:
>
> > On Fri, Apr 04, 2014 at 03:36:58PM +0200, Igor Mammedov wrote:
> > > Needed for Windows to use hotplugged memory device, otherwise
> > > it complains that server is not configured for memory hotplug.
> > > Tests show that afterwards it uses the dynamically provided
> > > proximity value from the _PXM() method, if available.
> > >
> > > Signed-off-by: Igor Mammedov <[email protected]>
> > > ---
> > > hw/i386/acpi-build.c | 14 ++++++++++++++
> > > 1 file changed, 14 insertions(+)
> > >
> > > diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
> > > index ef89e99..012b100 100644
> > > --- a/hw/i386/acpi-build.c
> > > +++ b/hw/i386/acpi-build.c
> > > @@ -1197,6 +1197,8 @@ build_srat(GArray *table_data, GArray *linker,
> > > uint64_t curnode;
> > > int srat_start, numa_start, slots;
> > > uint64_t mem_len, mem_base, next_base;
> > > + PCMachineState *pcms = PC_MACHINE(qdev_get_machine());
> > > + ram_addr_t hotplug_as_size = memory_region_size(&pcms->hotplug_memory);
> > >
> > > srat_start = table_data->len;
> > >
> > > @@ -1261,6 +1263,18 @@ build_srat(GArray *table_data, GArray *linker,
> > > acpi_build_srat_memory(numamem, 0, 0, 0, MEM_AFFINITY_NOFLAGS);
> > > }
> > >
> > > + /*
> > > + * Fake entry required by Windows to enable memory hotplug in OS.
> > > + * Individual DIMM devices override proximity set here via _PXM method,
> > > + * which returns the NUMA node id associated with the device.
> > > + */
> > > + if (hotplug_as_size) {
> > > + numamem = acpi_data_push(table_data, sizeof *numamem);
> > > + acpi_build_srat_memory(numamem, pcms->hotplug_memory_base,
> > > + hotplug_as_size, 0, MEM_AFFINITY_HOTPLUGGABLE |
> > > + MEM_AFFINITY_ENABLED);
> > > + }
> > > +
> >
> > Hi Igor,
> >
> > With the faked entry, memory unplug doesn't work. Entries should be set
> > up for each node with the correct flags (enabled, hot-pluggable) to make
> > memory unplug work.
> Could you be more specific about what doesn't work and how, and why there
> is a need for SRAT entries per DIMM?
> I've briefly tested with your unplug patches, and Linux seemed to be OK
> with unplug, i.e. the device node was removed from /sys after receiving
> the remove notification.
The following are the failing cases:
------------------------------------------------------------------------+-----------------------------+-------------
guest commands                                                          | this patch                  | hacked SRAT
------------------------------------------------------------------------+-----------------------------+-------------
echo 'online' > /sys/devices/system/memory/memory32/state && \          |                             |
echo 'offline' > /sys/devices/system/memory/memory32/state              | fail                        | success
------------------------------------------------------------------------+-----------------------------+-------------
echo 'online' > /sys/devices/system/memory/memory32/state && \          |                             |
echo 1 > /sys/devices/LNXSYSTM\:00/device\:00/PNP0C80\:00/eject         | fail                        | success
------------------------------------------------------------------------+-----------------------------+-------------
echo 'online_movable' > /sys/devices/system/memory/memory32/state       | fail [first memory block]   | fail
------------------------------------------------------------------------+-----------------------------+-------------
echo 'online_movable' > /sys/devices/system/memory/memory35/state && \  |                             |
echo 'offline' > /sys/devices/system/memory/memory35/state              | success [last memory block] | success
------------------------------------------------------------------------+-----------------------------+-------------
echo 'online_movable' > /sys/devices/system/memory/memory32/state && \  |                             |
echo 1 > /sys/devices/LNXSYSTM\:00/device\:00/PNP0C80\:00/eject         | success [last memory block] | success
------------------------------------------------------------------------+-----------------------------+-------------
Hacked SRAT memory entries:
PXM: 0
range: 4G ~ 4G + 512M
flags: Enabled Hot-Pluggable
PXM: 1
range: 4G + 512M ~ 5G
flags: Enabled Hot-Pluggable
So I think we should add maxmem to -numa and build the SRAT accordingly.
But there is something I'm not sure about: I added a dimm in node 1, but
its memory range fell in node 0. Users can always cause such a mismatch
with dimm,start,node.
This is the relevant part of the command line:
qemu command line: -m 512M,slots=4,maxmem=2G \
-object memory-ram,id=foo,size=512M \
-numa node,id=n0,mem=256M -numa node,id=n1,mem=256M
(qemu monitor) device_add dimm,id=d0,memdev=foo,node=1
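
To make the node/range mismatch concrete, here is a minimal sketch in
plain C (not QEMU code). It assumes the two hacked SRAT entries listed
above are the only hot-pluggable ranges. struct hp_range, range_pxm() and
dimm_node_matches() are names made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MiB (1024ULL * 1024ULL)
#define GiB (1024ULL * MiB)

/* Hypothetical description of one hot-pluggable SRAT range (not a QEMU type). */
struct hp_range {
    uint32_t pxm;   /* proximity domain advertised in the SRAT */
    uint64_t base;  /* first byte of the range */
    uint64_t len;   /* size of the range in bytes */
};

/* The two hacked entries: 4G ~ 4G+512M (PXM 0) and 4G+512M ~ 5G (PXM 1). */
static const struct hp_range hacked_srat[] = {
    { 0, 4 * GiB,             512 * MiB },
    { 1, 4 * GiB + 512 * MiB, 512 * MiB },
};

/* Return the PXM of the range that fully contains [start, start + size),
 * or -1 if no single range contains it. */
static int range_pxm(const struct hp_range *r, size_t n,
                     uint64_t start, uint64_t size)
{
    assert(r != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (start >= r[i].base && size <= r[i].len &&
            start - r[i].base <= r[i].len - size) {
            return (int)r[i].pxm;
        }
    }
    return -1;
}

/* Nonzero when the node given on device_add agrees with the static SRAT
 * range that the dimm's address falls into. */
static int dimm_node_matches(const struct hp_range *r, size_t n,
                             uint64_t start, uint64_t size, uint32_t node)
{
    int pxm = range_pxm(r, n, start, size);
    return pxm >= 0 && (uint32_t)pxm == node;
}
```

With these ranges, a 512M dimm placed at 4G with node=1 is reported as a
mismatch, because 4G falls in the PXM 0 range.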
>
> >
> > Windows has not been tested yet. I ran into a problem: there is no
> > SRAT in Windows, so even memory hotplug doesn't work (but there is one
> > in Linux with the same configuration).
> For Windows to work, one needs to add the "-numa node" CLI option so that
> the SRAT is exposed to the guest.
Thanks. I need to double-check.
> Paolo suggested to enable -numa node by default, I guess we can do it
> once NUMA re-factoring is merged.
>
> That said, I haven't found any information that Windows supports
> memory hot-remove. Google says that only hot-add is supported
> for up to WS2008R2. I've tested WS2012R2 and it doesn't work either,
> i.e. it sees but ignores the Notify request.
>
> >
> > Regards,
> > Hu Tao
> >
/*
* Intel ACPI Component Architecture
* AML Disassembler version 20100528
*
* Disassembly of SRAT, Tue Apr 15 02:18:57 2014
*
* ACPI Data Table [SRAT]
*
* Format: [HexOffset DecimalOffset ByteLength] FieldName : FieldValue
*/
[000h 0000 4] Signature : "SRAT" /* System Resource Affinity Table */
[004h 0004 4] Table Length : 00000118
[008h 0008 1] Revision : 01
[009h 0009 1] Checksum : F4
[00Ah 0010 6] Oem ID : "BOCHS "
[010h 0016 8] Oem Table ID : "BXPCSRAT"
[018h 0024 4] Oem Revision : 00000001
[01Ch 0028 4] Asl Compiler ID : "BXPC"
[020h 0032 4] Asl Compiler Revision : 00000001
[024h 0036 4] Table Revision : 00000001
[028h 0040 8] Reserved : 0000000000000000
[030h 0048 1] Subtable Type : 00 <Processor Local APIC/SAPIC Affinity>
[031h 0049 1] Length : 10
[032h 0050 1] Proximity Domain Low(8) : 00
[033h 0051 1] Apic ID : 00
[034h 0052 4] Flags (decoded below) : 00000001
Enabled : 1
[038h 0056 1] Local Sapic EID : 00
[039h 0057 3] Proximity Domain High(24) : 000000
[03Ch 0060 4] Reserved : 00000000
[040h 0064 1] Subtable Type : 00 <Processor Local APIC/SAPIC Affinity>
[041h 0065 1] Length : 10
[042h 0066 1] Proximity Domain Low(8) : 01
[043h 0067 1] Apic ID : 01
[044h 0068 4] Flags (decoded below) : 00000001
Enabled : 1
[048h 0072 1] Local Sapic EID : 00
[049h 0073 3] Proximity Domain High(24) : 000000
[04Ch 0076 4] Reserved : 00000000
[050h 0080 1] Subtable Type : 01 <Memory Affinity>
[051h 0081 1] Length : 28
[052h 0082 4] Proximity Domain : 00000000
[056h 0086 2] Reserved : 0000
[058h 0088 8] Base Address : 0000000000000000
[060h 0096 8] Address Length : 00000000000A0000
[068h 0104 4] Reserved : 00000000
[06Ch 0108 4] Flags (decoded below) : 00000001
Enabled : 1
Hot Pluggable : 0
Non-Volatile : 0
[070h 0112 8] Reserved : 0000000000000000
[078h 0120 1] Subtable Type : 01 <Memory Affinity>
[079h 0121 1] Length : 28
[07Ah 0122 4] Proximity Domain : 00000000
[07Eh 0126 2] Reserved : 0000
[080h 0128 8] Base Address : 0000000000100000
[088h 0136 8] Address Length : 000000000FF00000
[090h 0144 4] Reserved : 00000000
[094h 0148 4] Flags (decoded below) : 00000001
Enabled : 1
Hot Pluggable : 0
Non-Volatile : 0
[098h 0152 8] Reserved : 0000000000000000
[0A0h 0160 1] Subtable Type : 01 <Memory Affinity>
[0A1h 0161 1] Length : 28
[0A2h 0162 4] Proximity Domain : 00000001
[0A6h 0166 2] Reserved : 0000
[0A8h 0168 8] Base Address : 0000000010000000
[0B0h 0176 8] Address Length : 0000000010000000
[0B8h 0184 4] Reserved : 00000000
[0BCh 0188 4] Flags (decoded below) : 00000001
Enabled : 1
Hot Pluggable : 0
Non-Volatile : 0
[0C0h 0192 8] Reserved : 0000000000000000
[0C8h 0200 1] Subtable Type : 01 <Memory Affinity>
[0C9h 0201 1] Length : 28
[0CAh 0202 4] Proximity Domain : 00000000
[0CEh 0206 2] Reserved : 0000
[0D0h 0208 8] Base Address : 0000000000000000
[0D8h 0216 8] Address Length : 0000000000000000
[0E0h 0224 4] Reserved : 00000000
[0E4h 0228 4] Flags (decoded below) : 00000000
Enabled : 0
Hot Pluggable : 0
Non-Volatile : 0
[0E8h 0232 8] Reserved : 0000000000000000
[0F0h 0240 1] Subtable Type : 01 <Memory Affinity>
[0F1h 0241 1] Length : 28
[0F2h 0242 4] Proximity Domain : 00000000
[0F6h 0246 2] Reserved : 0000
[0F8h 0248 8] Base Address : 0000000100000000
[100h 0256 8] Address Length : 0000000060000000
[108h 0264 4] Reserved : 00000000
[10Ch 0268 4] Flags (decoded below) : 00000003
Enabled : 1
Hot Pluggable : 1
Non-Volatile : 0
[110h 0272 8] Reserved : 0000000000000000
Raw Table Data
0000: 53 52 41 54 18 01 00 00 01 F4 42 4F 43 48 53 20 SRAT......BOCHS
0010: 42 58 50 43 53 52 41 54 01 00 00 00 42 58 50 43 BXPCSRAT....BXPC
0020: 01 00 00 00 01 00 00 00 00 00 00 00 00 00 00 00 ................
0030: 00 10 00 00 01 00 00 00 00 00 00 00 00 00 00 00 ................
0040: 00 10 01 01 01 00 00 00 00 00 00 00 00 00 00 00 ................
0050: 01 28 00 00 00 00 00 00 00 00 00 00 00 00 00 00 .(..............
0060: 00 00 0A 00 00 00 00 00 00 00 00 00 01 00 00 00 ................
0070: 00 00 00 00 00 00 00 00 01 28 00 00 00 00 00 00 .........(......
0080: 00 00 10 00 00 00 00 00 00 00 F0 0F 00 00 00 00 ................
0090: 00 00 00 00 01 00 00 00 00 00 00 00 00 00 00 00 ................
00A0: 01 28 01 00 00 00 00 00 00 00 00 10 00 00 00 00 .(..............
00B0: 00 00 00 10 00 00 00 00 00 00 00 00 01 00 00 00 ................
00C0: 00 00 00 00 00 00 00 00 01 28 00 00 00 00 00 00 .........(......
00D0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
00E0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
00F0: 01 28 00 00 00 00 00 00 00 00 00 00 01 00 00 00 .(..............
0100: 00 00 00 60 00 00 00 00 00 00 00 00 03 00 00 00 ...`............
0110: 00 00 00 00 00 00 00 00 ........
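
The Memory Affinity entries in the dump above can also be decoded by hand.
A minimal sketch, assuming the 40-byte layout printed by the disassembler
(proximity domain at offset 2, base address at 8, length at 16, flags at
28, all little-endian). The struct, macro and function names are
hypothetical; the byte array is copied from offsets 00F0h-0117h of the raw
table data above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SRAT_TYPE_MEMORY  1
#define SRAT_MEMORY_LEN   40          /* 0x28, the Length field above */
#define SRAT_MEM_ENABLED  (1u << 0)
#define SRAT_MEM_HOTPLUG  (1u << 1)

/* Decoded view of one Memory Affinity entry (illustrative type). */
struct srat_mem {
    uint32_t pxm;
    uint64_t base;
    uint64_t len;
    uint32_t flags;
};

/* Read an n-byte little-endian integer, n <= 8. */
static uint64_t le_read(const uint8_t *p, size_t n)
{
    uint64_t v = 0;
    assert(n <= 8);
    while (n--) {
        v = (v << 8) | p[n];
    }
    return v;
}

/* Decode one Memory Affinity entry; returns 0 on success, -1 otherwise. */
static int srat_decode_mem(const uint8_t *p, size_t avail, struct srat_mem *out)
{
    if (avail < SRAT_MEMORY_LEN ||
        p[0] != SRAT_TYPE_MEMORY || p[1] != SRAT_MEMORY_LEN) {
        return -1;
    }
    out->pxm   = (uint32_t)le_read(p + 2, 4);
    out->base  = le_read(p + 8, 8);
    out->len   = le_read(p + 16, 8);
    out->flags = (uint32_t)le_read(p + 28, 4);
    return 0;
}

/* Bytes 00F0h-0117h of the raw table data: the hot-pluggable entry. */
static const uint8_t hotplug_entry[SRAT_MEMORY_LEN] = {
    0x01, 0x28, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
    0x00, 0x00, 0x00, 0x00, 0x01, 0x00, 0x00, 0x00,
    0x00, 0x00, 0x00, 0x60, 0x00, 0x00, 0x00, 0x00,
    0x00, 0x00, 0x00, 0x00, 0x03, 0x00, 0x00, 0x00,
    0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
};
```

Decoding hotplug_entry gives PXM 0, base 0x100000000, length 0x60000000
and flags 3 (Enabled | Hot Pluggable), which agrees with the disassembly.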
/*
* Intel ACPI Component Architecture
* AML Disassembler version 20100528
*
* Disassembly of SRAT, Tue Apr 15 02:00:35 2014
*
* ACPI Data Table [SRAT]
*
* Format: [HexOffset DecimalOffset ByteLength] FieldName : FieldValue
*/
[000h 0000 4] Signature : "SRAT" /* System Resource Affinity Table */
[004h 0004 4] Table Length : 000000F0
[008h 0008 1] Revision : 01
[009h 0009 1] Checksum : 33
[00Ah 0010 6] Oem ID : "BOCHS "
[010h 0016 8] Oem Table ID : "BXPCSRAT"
[018h 0024 4] Oem Revision : 00000001
[01Ch 0028 4] Asl Compiler ID : "BXPC"
[020h 0032 4] Asl Compiler Revision : 00000001
[024h 0036 4] Table Revision : 00000001
[028h 0040 8] Reserved : 0000000000000000
[030h 0048 1] Subtable Type : 00 <Processor Local APIC/SAPIC Affinity>
[031h 0049 1] Length : 10
[032h 0050 1] Proximity Domain Low(8) : 00
[033h 0051 1] Apic ID : 00
[034h 0052 4] Flags (decoded below) : 00000001
Enabled : 1
[038h 0056 1] Local Sapic EID : 00
[039h 0057 3] Proximity Domain High(24) : 000000
[03Ch 0060 4] Reserved : 00000000
[040h 0064 1] Subtable Type : 00 <Processor Local APIC/SAPIC Affinity>
[041h 0065 1] Length : 10
[042h 0066 1] Proximity Domain Low(8) : 01
[043h 0067 1] Apic ID : 01
[044h 0068 4] Flags (decoded below) : 00000001
Enabled : 1
[048h 0072 1] Local Sapic EID : 00
[049h 0073 3] Proximity Domain High(24) : 000000
[04Ch 0076 4] Reserved : 00000000
[050h 0080 1] Subtable Type : 01 <Memory Affinity>
[051h 0081 1] Length : 28
[052h 0082 4] Proximity Domain : 00000000
[056h 0086 2] Reserved : 0000
[058h 0088 8] Base Address : 0000000000000000
[060h 0096 8] Address Length : 00000000000A0000
[068h 0104 4] Reserved : 00000000
[06Ch 0108 4] Flags (decoded below) : 00000001
Enabled : 1
Hot Pluggable : 0
Non-Volatile : 0
[070h 0112 8] Reserved : 0000000000000000
[078h 0120 1] Subtable Type : 01 <Memory Affinity>
[079h 0121 1] Length : 28
[07Ah 0122 4] Proximity Domain : 00000000
[07Eh 0126 2] Reserved : 0000
[080h 0128 8] Base Address : 0000000000100000
[088h 0136 8] Address Length : 000000001FF00000
[090h 0144 4] Reserved : 00000000
[094h 0148 4] Flags (decoded below) : 00000001
Enabled : 1
Hot Pluggable : 0
Non-Volatile : 0
[098h 0152 8] Reserved : 0000000000000000
[0A0h 0160 1] Subtable Type : 01 <Memory Affinity>
[0A1h 0161 1] Length : 28
[0A2h 0162 4] Proximity Domain : 00000000
[0A6h 0166 2] Reserved : 0000
[0A8h 0168 8] Base Address : 0000000100000000
[0B0h 0176 8] Address Length : 0000000020000000
[0B8h 0184 4] Reserved : 00000000
[0BCh 0188 4] Flags (decoded below) : 00000003
Enabled : 1
Hot Pluggable : 1
Non-Volatile : 0
[0C0h 0192 8] Reserved : 0000000000000000
[0C8h 0200 1] Subtable Type : 01 <Memory Affinity>
[0C9h 0201 1] Length : 28
[0CAh 0202 4] Proximity Domain : 00000001
[0CEh 0206 2] Reserved : 0000
[0D0h 0208 8] Base Address : 0000000120000000
[0D8h 0216 8] Address Length : 0000000040000000
[0E0h 0224 4] Reserved : 00000000
[0E4h 0228 4] Flags (decoded below) : 00000003
Enabled : 1
Hot Pluggable : 1
Non-Volatile : 0
[0E8h 0232 8] Reserved : 0000000000000000
Raw Table Data
0000: 53 52 41 54 F0 00 00 00 01 33 42 4F 43 48 53 20 SRAT.....3BOCHS
0010: 42 58 50 43 53 52 41 54 01 00 00 00 42 58 50 43 BXPCSRAT....BXPC
0020: 01 00 00 00 01 00 00 00 00 00 00 00 00 00 00 00 ................
0030: 00 10 00 00 01 00 00 00 00 00 00 00 00 00 00 00 ................
0040: 00 10 01 01 01 00 00 00 00 00 00 00 00 00 00 00 ................
0050: 01 28 00 00 00 00 00 00 00 00 00 00 00 00 00 00 .(..............
0060: 00 00 0A 00 00 00 00 00 00 00 00 00 01 00 00 00 ................
0070: 00 00 00 00 00 00 00 00 01 28 00 00 00 00 00 00 .........(......
0080: 00 00 10 00 00 00 00 00 00 00 F0 1F 00 00 00 00 ................
0090: 00 00 00 00 01 00 00 00 00 00 00 00 00 00 00 00 ................
00A0: 01 28 00 00 00 00 00 00 00 00 00 00 01 00 00 00 .(..............
00B0: 00 00 00 20 00 00 00 00 00 00 00 00 03 00 00 00 ... ............
00C0: 00 00 00 00 00 00 00 00 01 28 01 00 00 00 00 00 .........(......
00D0: 00 00 00 20 01 00 00 00 00 00 00 40 00 00 00 00 ... .......@....
00E0: 00 00 00 00 03 00 00 00 00 00 00 00 00 00 00 00 ................
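
The raw bytes directly above can be checked mechanically. A sketch in plain
C, assuming a 48-byte SRAT header (the 36-byte common ACPI header plus 12
reserved bytes, so the first subtable starts at 030h as in the disassembly)
and the usual ACPI rule that all bytes of a table sum to zero modulo 256.
acpi_sum() and srat_count_mem() are hypothetical helper names; srat_dump is
copied from the raw table data directly above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SRAT_HDR_LEN 48   /* first subtable starts at offset 030h */

/* Sum of all bytes modulo 256; 0 for a table with a valid checksum. */
static uint8_t acpi_sum(const uint8_t *t, size_t len)
{
    uint8_t s = 0;
    for (size_t i = 0; i < len; i++) {
        s = (uint8_t)(s + t[i]);
    }
    return s;
}

/* Count Memory Affinity entries (type 1) whose flags contain every bit in
 * mask. Returns -1 if the subtables do not tile the table exactly. */
static int srat_count_mem(const uint8_t *t, size_t len, uint32_t mask)
{
    int count = 0;
    size_t off = SRAT_HDR_LEN;

    while (off + 2 <= len) {
        uint8_t type = t[off];
        uint8_t sub_len = t[off + 1];

        if (sub_len < 2 || off + sub_len > len) {
            return -1;
        }
        if (type == 1 && sub_len >= 32) {
            uint32_t flags = (uint32_t)t[off + 28] |
                             ((uint32_t)t[off + 29] << 8) |
                             ((uint32_t)t[off + 30] << 16) |
                             ((uint32_t)t[off + 31] << 24);
            if ((flags & mask) == mask) {
                count++;
            }
        }
        off += sub_len;
    }
    return off == len ? count : -1;
}

/* Raw table data copied from the dump directly above (0xF0 bytes). */
static const uint8_t srat_dump[] = {
    0x53,0x52,0x41,0x54,0xF0,0x00,0x00,0x00,0x01,0x33,0x42,0x4F,0x43,0x48,0x53,0x20,
    0x42,0x58,0x50,0x43,0x53,0x52,0x41,0x54,0x01,0x00,0x00,0x00,0x42,0x58,0x50,0x43,
    0x01,0x00,0x00,0x00,0x01,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
    0x00,0x10,0x00,0x00,0x01,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
    0x00,0x10,0x01,0x01,0x01,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
    0x01,0x28,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
    0x00,0x00,0x0A,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x01,0x00,0x00,0x00,
    0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x01,0x28,0x00,0x00,0x00,0x00,0x00,0x00,
    0x00,0x00,0x10,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0xF0,0x1F,0x00,0x00,0x00,0x00,
    0x00,0x00,0x00,0x00,0x01,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
    0x01,0x28,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x01,0x00,0x00,0x00,
    0x00,0x00,0x00,0x20,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x03,0x00,0x00,0x00,
    0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x01,0x28,0x01,0x00,0x00,0x00,0x00,0x00,
    0x00,0x00,0x00,0x20,0x01,0x00,0x00,0x00,0x00,0x00,0x00,0x40,0x00,0x00,0x00,0x00,
    0x00,0x00,0x00,0x00,0x03,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
};
```

For this dump the byte sum is 0 modulo 256, and two of the four Memory
Affinity entries carry the Hot Pluggable flag, matching the disassembly.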