On Wed, 7 Aug 2019 09:54:27 +0200 David Hildenbrand <da...@redhat.com> wrote:
> On 06.08.19 11:48, Igor Mammedov wrote: > > Max memslot size supported by kvm on s390 is 8Tb, > > move logic of splitting RAM in chunks upto 8T to KVM code. > > > > This way it will hide KVM specific restrictions in KVM code > > and won't affect baord level design decisions. Which would allow > > us to avoid misusing memory_region_allocate_system_memory() API > > and eventually use a single hostmem backend for guest RAM. > > > > Signed-off-by: Igor Mammedov <imamm...@redhat.com> > > --- > > v4: > > * fix compilation issue > > (Christian Borntraeger <borntrae...@de.ibm.com>) > > * advance HVA along with GPA in kvm_set_phys_mem() > > (Christian Borntraeger <borntrae...@de.ibm.com>) > > > > patch prepares only KVM side for switching to single RAM memory region > > another patch will take care of dropping manual RAM partitioning in > > s390 code. > > --- > > include/sysemu/kvm_int.h | 1 + > > accel/kvm/kvm-all.c | 80 +++++++++++++++++++++++--------------- > > hw/s390x/s390-virtio-ccw.c | 9 ----- > > target/s390x/kvm.c | 12 ++++++ > > 4 files changed, 62 insertions(+), 40 deletions(-) > > > > diff --git a/include/sysemu/kvm_int.h b/include/sysemu/kvm_int.h > > index 31df465fdc..7f7520bce2 100644 > > --- a/include/sysemu/kvm_int.h > > +++ b/include/sysemu/kvm_int.h > > @@ -41,4 +41,5 @@ typedef struct KVMMemoryListener { > > void kvm_memory_listener_register(KVMState *s, KVMMemoryListener *kml, > > AddressSpace *as, int as_id); > > > > +void kvm_set_max_memslot_size(hwaddr max_slot_size); > > #endif > > diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c > > index f450f25295..d87f855ea4 100644 > > --- a/accel/kvm/kvm-all.c > > +++ b/accel/kvm/kvm-all.c > > @@ -138,6 +138,7 @@ bool kvm_direct_msi_allowed; > > bool kvm_ioeventfd_any_length_allowed; > > bool kvm_msi_use_devid; > > static bool kvm_immediate_exit; > > +static hwaddr kvm_max_slot_size = ~0; > > > > static const KVMCapabilityInfo kvm_required_capabilites[] = { > > KVM_CAP_INFO(USER_MEMORY), > > @@ -951,6 +952,14 @@ kvm_check_extension_list(KVMState *s, const > > KVMCapabilityInfo *list) > > return NULL; > > } > > > > +void kvm_set_max_memslot_size(hwaddr max_slot_size) > > +{ > > + g_assert( > > + ROUND_UP(max_slot_size, qemu_real_host_page_size) == max_slot_size > > + ); > > + kvm_max_slot_size = max_slot_size; > > +} > > + > > static void kvm_set_phys_mem(KVMMemoryListener *kml, > > MemoryRegionSection *section, bool add) > > { > > @@ -958,7 +967,7 @@ static void kvm_set_phys_mem(KVMMemoryListener *kml, > > int err; > > MemoryRegion *mr = section->mr; > > bool writeable = !mr->readonly && !mr->rom_device; > > - hwaddr start_addr, size; > > + hwaddr start_addr, size, slot_size; > > void *ram; > > > > if (!memory_region_is_ram(mr)) { > > @@ -983,41 +992,50 @@ static void kvm_set_phys_mem(KVMMemoryListener *kml, > > kvm_slots_lock(kml); > > > > if (!add) { > > - mem = kvm_lookup_matching_slot(kml, start_addr, size); > > - if (!mem) { > > - goto out; > > - } > > - if (mem->flags & KVM_MEM_LOG_DIRTY_PAGES) { > > - kvm_physical_sync_dirty_bitmap(kml, section); > > - } > > + do { > > + slot_size = kvm_max_slot_size < size ? kvm_max_slot_size : > > size; > > + mem = kvm_lookup_matching_slot(kml, start_addr, slot_size); > > + if (!mem) { > > + goto out; > > I wonder if this can trigger for the first, but not the second slot (or > the other way around). In that case you would want to continue the loop > (incrementing counters). But most probably there would something be > wrong in the caller if that would happen. I couldn't come up with scenario where it would be possible (unless flatview rendering is broken) > > > + } > > + if (mem->flags & KVM_MEM_LOG_DIRTY_PAGES) { > > + kvm_physical_sync_dirty_bitmap(kml, section); > > + } > > > > - /* unregister the slot */ > > - g_free(mem->dirty_bmap); > > - mem->dirty_bmap = NULL; > > - mem->memory_size = 0; > > - mem->flags = 0; > > - err = kvm_set_user_memory_region(kml, mem, false); > > - if (err) { > > - fprintf(stderr, "%s: error unregistering slot: %s\n", > > - __func__, strerror(-err)); > > - abort(); > > - } > > + /* unregister the slot */ > > + g_free(mem->dirty_bmap); > > + mem->dirty_bmap = NULL; > > + mem->memory_size = 0; > > + mem->flags = 0; > > + err = kvm_set_user_memory_region(kml, mem, false); > > + if (err) { > > + fprintf(stderr, "%s: error unregistering slot: %s\n", > > + __func__, strerror(-err)); > > + abort(); > > + } > > + start_addr += slot_size; > > + } while ((size -= slot_size)); > > NIT: I think you can drop parentheses - but I would really prefer to not > perform computations in the condition. sure, I'll move computation within the loop > > goto out; > > } > > > > /* register the new slot */ > > - mem = kvm_alloc_slot(kml); > > - mem->memory_size = size; > > - mem->start_addr = start_addr; > > - mem->ram = ram; > > - mem->flags = kvm_mem_flags(mr); > > - > > - err = kvm_set_user_memory_region(kml, mem, true); > > - if (err) { > > - fprintf(stderr, "%s: error registering slot: %s\n", __func__, > > - strerror(-err)); > > - abort(); > > - } > > + do { > > + slot_size = kvm_max_slot_size < size ? kvm_max_slot_size : size; > > + mem = kvm_alloc_slot(kml); > > + mem->memory_size = slot_size; > > + mem->start_addr = start_addr; > > + mem->ram = ram; > > + mem->flags = kvm_mem_flags(mr); > > + > > + err = kvm_set_user_memory_region(kml, mem, true); > > + if (err) { > > + fprintf(stderr, "%s: error registering slot: %s\n", __func__, > > + strerror(-err)); > > + abort(); > > + } > > + start_addr += slot_size; > > + ram += slot_size; > > + } while ((size -= slot_size)); > > dito > > One note: > > KVMState stores the number of slots in "nr_slots". We export that via > kvm_get_max_memslots(). > > E.g., spapr uses that to compare it against "machine->ram_slots". this patch shouldn't affect spapr/arm or x86 machines as they do not have limits on memslot size. > Later (esp. for s390x), kvm_get_max_memslots() can no longer be compared to > ram_slots directly. Could be that a ram slot would map to multiple KVM > memory slots. There would be no easy way to detect if KVM is able to > deal with "machine->ram_slots" as defined by the user, until the sizes > of the slots are known. If I recall correctly about kvm_foo_slots() APIs, assumption 1 memory region == 1 memslot didn't/doesn't hold true in all cases (ex: x86) and it's only best effort to let us compare the number of apples to oranges on a tree and works for current usecases. From hotplug point of view kvm_has_free_slot() would be more important, to allow for graceful abort. If s390 would ever need to hotplug RAM MemoryRegion, anyway APIs should be changed to account for 1:N split when actual dependency arises.