bernardodemarco opened a new issue, #12908:
URL: https://github.com/apache/cloudstack/issues/12908
# Live scaling for VMs with fixed service offerings on KVM
This specification introduces a new feature enabling users to live scale VMs
created with fixed service offerings on the KVM hypervisor.
## Table of contents
- [Live scaling for VMs with fixed service offerings on
KVM](#live-scaling-for-vms-with-fixed-service-offerings-on-kvm)
- [Table of contents](#table-of-contents)
- [1. Problem description](#1-problem-description)
- [2. Proposed changes](#2-proposed-changes)
- [2.1. Apache CloudStack definition of domain XMLs with
KVM](#21-apache-cloudstack-definition-of-domain-xmls-with-kvm)
- [2.2. New cluster-wide settings to control the memory and vCPUs
maximum capacity for live
scaling](#22-new-cluster-wide-settings-to-control-the-memory-and-vcpus-maximum-capacity-for-live-scaling)
- [2.3. `scaleVirtualMachine` API](#23-scalevirtualmachine-api)
- [3.0. Conclusion and Limitations](#30-conclusion-and-limitations)
## 1. Problem description
Currently, Apache CloudStack supports three types of compute offerings:
fixed, custom constrained, and custom unconstrained. Fixed offerings have a
fixed number of vCPUs, a fixed vCPU speed, and a fixed amount of RAM. Custom
constrained offerings have a fixed vCPU speed and ranges of vCPUs and RAM from
which users can select values. Lastly, custom unconstrained offerings accept
arbitrary amounts of vCPUs, vCPU speed, and RAM.
When using KVM as hypervisor, Apache CloudStack supports scaling `Stopped`
instances by executing the `scaleVirtualMachine` API. During this process,
since the VMs are stopped, their metadata is updated in the database and, thus,
when the VMs are later started again, their respective domain XMLs are created
with the updated attributes.
For running VMs, Apache CloudStack only supports live scaling VMs with
custom constrained and custom unconstrained compute
offerings[^enable-scale-vm-setting]. For VMs with fixed compute offerings,
users must stop the VMs, change their service offerings through the
`scaleVirtualMachine` API, and start them again. However, depending on the
criticality of the applications running on the VMs, the downtime caused by this
scaling process is highly undesirable.
## 2. Proposed changes
To address the described problem, this specification introduces the feature
of live scaling VMs with fixed service offerings when using the KVM hypervisor
on Apache CloudStack cloud environments. A high-level design of the feature is
presented, briefly describing the proposed changes to the generation of guest
VM domain XMLs, the `scaleVirtualMachine` API workflows and the global settings
controlling the scaling process.
### 2.1. Apache CloudStack definition of domain XMLs with KVM
During the process of deploying virtual machines with KVM, a VM transfer
object (`VirtualMachineTO`) is created from a `VirtualMachineProfile`, which is
an object that stores the attributes of the VM. The creation of VM transfer
objects is illustrated in the following activity diagram:
<img width="361" height="921" alt="Image"
src="https://github.com/user-attachments/assets/f6dce360-5f80-4c6f-ab8c-ab8bc634a2e9"
/>
When configuring the memory and vCPU attributes of the VMs, Apache
CloudStack executes the following workflow:
<img width="849" height="657" alt="Image"
src="https://github.com/user-attachments/assets/aad3adc2-758a-417a-aa73-34e97d8c2827"
/>
Regarding the definition of domain XMLs for guest VMs, the following
boilerplate is used by the Apache CloudStack Agent:
```xml
<domain type='kvm' id='4'>
  <!--(...)-->
  <maxMemory slots='16' unit='KiB'>1930240</maxMemory>
  <memory unit='KiB'>1048576</memory>
  <currentMemory unit='KiB'>1048576</currentMemory>
  <vcpu placement='static' current='1'>2</vcpu>
  <!--(...)-->
  <cpu mode='custom' match='exact' check='full'>
    <!--(...)-->
    <numa>
      <cell id='0' cpus='0-1' memory='1048576' unit='KiB'/>
    </numa>
    <!--(...)-->
  </cpu>
</domain>
```
Regarding vCPU configuration, the content of the `vcpu` element defines the
maximum number of vCPUs that can be allocated for the guest VM. Its `current`
attribute defines the number of vCPUs that are effectively
active[^cpu-configuration-libvirt-docs]. As shown in the above activity
diagrams, for custom constrained compute offerings, the content of the `vcpu`
element is defined as the offering's maximum number of vCPUs. For custom
unconstrained offerings, it is defined as the value of the
`vm.serviceoffering.cpu.cores.max` global setting, defaulting to the host's
maximum CPU capacity when the setting is equal to zero.
As for guest VM memory configuration, the `maxMemory` element defines the
maximum memory allocation for the VM. The `memory` element represents the
maximum amount of memory available to the VM at boot time. The `currentMemory`
element represents the actual memory allocation for the
VM[^memory-configuration-libvirt-docs]. Lastly, Libvirt currently requires the
specification of NUMA nodes to enable memory hotplug. Each `cell` element
represents a NUMA node; its `cpus` attribute defines the range of CPUs that are
part of the node, and its `memory` attribute represents the amount of memory in
use by the VM[^numa-configuration-libvirt-docs].
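To make the relationship between these elements concrete, the following is an illustrative Python sketch (CloudStack's actual implementation is in Java) that builds the memory and vCPU portion of a domain XML matching the boilerplate above:

```python
import xml.etree.ElementTree as ET

def memory_cpu_fragment(max_mem_kib, cur_mem_kib, cur_vcpus, max_vcpus):
    """Build the memory/vCPU portion of a KVM domain XML, mirroring the
    boilerplate above. Illustrative sketch only, not CloudStack's code."""
    domain = ET.Element("domain", type="kvm")
    # <maxMemory>: ceiling for memory hotplug
    max_memory = ET.SubElement(domain, "maxMemory", slots="16", unit="KiB")
    max_memory.text = str(max_mem_kib)
    # <memory> and <currentMemory>: boot-time and actual allocation
    memory = ET.SubElement(domain, "memory", unit="KiB")
    memory.text = str(cur_mem_kib)
    current = ET.SubElement(domain, "currentMemory", unit="KiB")
    current.text = str(cur_mem_kib)
    # <vcpu>: content is the hotplug ceiling, 'current' the active count
    vcpu = ET.SubElement(domain, "vcpu", placement="static",
                         current=str(cur_vcpus))
    vcpu.text = str(max_vcpus)
    # Libvirt requires a NUMA cell to enable memory hotplug; declare a
    # single cell spanning all vCPUs and the current memory.
    cpu = ET.SubElement(domain, "cpu", mode="custom", match="exact",
                        check="full")
    numa = ET.SubElement(cpu, "numa")
    ET.SubElement(numa, "cell", id="0", cpus=f"0-{max_vcpus - 1}",
                  memory=str(cur_mem_kib), unit="KiB")
    return ET.tostring(domain, encoding="unicode")

xml = memory_cpu_fragment(1930240, 1048576, 1, 2)
```

Running the sketch with the same values as the boilerplate above reproduces its memory and vCPU elements.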
As can be noticed from the VM configuration workflow, for custom constrained
compute offerings, the `maxMemory` element content is defined as the offering's
maximum amount of memory. For custom unconstrained offerings, it is defined as
the value of the `vm.serviceoffering.ram.size.max` global setting, defaulting
to the host's maximum memory capacity when the setting is equal to zero.
Hence, to address the live scaling of VMs with fixed offerings, the first
validation in the memory and vCPU configuration of a guest VM, which checks
whether the VM is dynamically scalable, will be modified. Currently, it
considers a VM to be dynamically scalable if its offering is dynamic (that is,
the amount of vCPUs, the vCPU speed, or the RAM is not specified); the VM's
`dynamically_scalable` property is `true`; and the `enable.dynamic.scale.vm`
global setting is `true`.
Therefore, the check for dynamic offerings will be removed from the
above-mentioned validation. As a consequence, when the `dynamically_scalable`
property of a VM is `true` and the `enable.dynamic.scale.vm` global setting is
`true`, the domain XML of the guest VM will always have a range of memory and
vCPUs to scale up to, even when the VM was created from a fixed compute
offering.
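The modified validation can be sketched as follows (a Python illustration; the identifier names are chosen for readability and are not the actual CloudStack ones):

```python
def is_vm_dynamically_scalable(vm_dynamically_scalable: bool,
                               enable_dynamic_scale_vm: bool) -> bool:
    """Proposed check: a VM is considered live-scalable when its
    dynamically_scalable property and the enable.dynamic.scale.vm setting
    are both true. The previous third condition, requiring the offering
    itself to be dynamic, is dropped."""
    return vm_dynamically_scalable and enable_dynamic_scale_vm
```

With this change, a VM created from a fixed offering passes the check as long as its property and the setting are both enabled.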
To define the upper limit of the memory and vCPU live scaling range, the
`kvm.cpu.dynamic.scaling.capacity` and `kvm.memory.dynamic.scaling.capacity`
cluster-wide settings will be considered (see [new cluster-wide
settings](#22-new-cluster-wide-settings-to-control-the-memory-and-vcpus-maximum-capacity-for-live-scaling)).
The maximum number of vCPUs will be retrieved from the
`kvm.cpu.dynamic.scaling.capacity` setting value, and the maximum amount of
memory will be retrieved from the `kvm.memory.dynamic.scaling.capacity` value.
When these settings are equal to zero, the host's maximum CPU and memory
capacity will be considered.
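The upper-limit resolution described above can be sketched in a few lines (illustrative Python; the function name is not a real CloudStack identifier):

```python
def scaling_upper_limit(setting_value: int, host_capacity: int) -> int:
    """Resolve the live-scaling ceiling: the cluster-wide setting wins
    unless it is <= 0, in which case the host's capacity is used."""
    return setting_value if setting_value > 0 else host_capacity

# kvm.cpu.dynamic.scaling.capacity = 8 on a 32-core host -> ceiling of 8
max_vcpus = scaling_upper_limit(8, 32)
# kvm.memory.dynamic.scaling.capacity unset (0) -> host memory capacity
max_mem_mib = scaling_upper_limit(0, 262144)
```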
This procedure will also be applied for custom constrained offerings to
enable seamless live scaling between all types of compute offerings existing in
Apache CloudStack. Thus, at the KVM level, all types of VMs hosted in a given
host will be homogeneous regarding the CPU and memory upper limits. At the
Apache CloudStack level, on the other hand, the Management Server will be
responsible for validating the computing resources ranges defined in the
constrained offerings.
### 2.2. New cluster-wide settings to control the memory and vCPUs maximum
capacity for live scaling
As mentioned in the [domain XMLs definition
section](#21-apache-cloudstack-definition-of-domain-xmls-with-kvm), the
`vm.serviceoffering.cpu.cores.max` and `vm.serviceoffering.ram.size.max` global
settings are used to control the maximum amount of vCPUs and memory to which
VMs with custom unconstrained offerings can be live scaled. However, both
settings are also currently used to limit the maximum amount of CPU and RAM
that can be defined for compute offerings and allocated to VMs.
To segregate goals and responsibilities, two new cluster-wide settings will
be introduced:
| Name | Type | Scope | Description |
| ------ | ------ | ------ | ----------- |
| `kvm.memory.dynamic.scaling.capacity` | Integer | Cluster | Defines the maximum memory capacity, in MiB, to which VMs can be dynamically scaled with KVM. Its value will be used to define the value of the `<maxMemory />` element of domain XMLs. If it is set to a value less than or equal to `0`, the host's memory capacity will be considered. |
| `kvm.cpu.dynamic.scaling.capacity` | Integer | Cluster | Defines the maximum vCPU capacity to which VMs can be dynamically scaled with KVM. Its value will be used to define the value of the `<vcpu />` element of domain XMLs. If it is set to a value less than or equal to `0`, the host's CPU core capacity will be considered. |
Therefore, both settings will be used exclusively to control the maximum
live scaling capacity for memory and vCPUs. To maintain compatibility with the
current behavior, the values of the `vm.serviceoffering.cpu.cores.max` and
`vm.serviceoffering.ram.size.max` global settings will be used to populate the
initial values of the new cluster-wide settings.
### 2.3. `scaleVirtualMachine` API
The following activity diagram illustrates the current implementation of the
VM scaling process:
<img width="951" height="982" alt="Image"
src="https://github.com/user-attachments/assets/13c8e18c-9ec3-4f42-b334-d4f9cf3331a1"
/>
As can be noticed, when upgrading running VMs, after performing some general
validations, the Management Server checks whether the VM is running on KVM and
whether its offering is not dynamic (that is, vCPU, vCPU speed, and RAM are all
specified in the offering). If the VM meets both conditions, an exception is
thrown, informing the end user that KVM does not support live scaling VMs with
fixed compute offerings.
Therefore, the above-mentioned check for dynamic offerings will be removed
from the scaling workflow. This is possible because the domain XMLs of guest
VMs will be prepared to support live scaling when the global setting
`enable.dynamic.scale.vm` and the VM's `dynamically_scalable` property are set
to `true`.
Additionally, it is relevant to note that the current workflow does not
update the VM's CPU quota percentage based on its new CPU frequency. To address
this, the current implementation will be extended so that, before building the
`ScaleVmCommand`, the new CPU quota percentage is calculated and included in
the command payload, along with a flag indicating whether the CPU cap has
changed between the old and new service offerings.
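The quota recomputation can be sketched as below. This is a hypothetical Python illustration: the exact formula CloudStack will use is not fixed by this specification; the sketch only conveys the idea of deriving the quota percentage from the new offering's CPU frequency relative to the host's, and of carrying a cap-changed flag in the command payload.

```python
def new_cpu_quota_percentage(offering_speed_mhz: int,
                             host_speed_mhz: int) -> float:
    """Hypothetical formula: quota percentage as the ratio of the new
    offering's vCPU speed to the host's CPU speed, capped at 100%."""
    return min(offering_speed_mhz / host_speed_mhz, 1.0)

# Illustrative payload fields added to the ScaleVmCommand (names are
# placeholders, not the actual command properties):
payload = {
    "cpuQuotaPercentage": new_cpu_quota_percentage(1000, 2000),
    "cpuCapChanged": True,  # CPU cap differs between old and new offering
}
```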
At the Agent side, the following workflow will be executed:
<img width="1059" height="651" alt="Image"
src="https://github.com/user-attachments/assets/1b3c67c1-002b-4edc-9d32-a1ec38478a22"
/>
Thus, if both the old and the new service offerings have CPU limitation
enabled, the domain's CPU quota will be updated on the fly. If the old service
offering does not have CPU limitation enabled but the new one does, the
domain's CPU quota and period parameters will be set. Conversely, if the old
offering has CPU limitation enabled and the new one does not, these parameters
will be removed from the domain.
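The Agent-side decision described above can be condensed into a small dispatch (an illustrative Python sketch; the action names are placeholders):

```python
def cpu_quota_action(old_limited: bool, new_limited: bool) -> str:
    """Pick the scheduler change to apply during live scaling, based on
    whether the old and new offerings enable CPU limitation."""
    if old_limited and new_limited:
        return "update-quota"          # adjust quota to the new offering
    if not old_limited and new_limited:
        return "set-quota-and-period"  # introduce CPU limitation
    if old_limited and not new_limited:
        return "remove-limitation"     # lift CPU limitation
    return "no-op"                     # neither offering limits CPU
```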
To update the CPU scheduling parameters of a domain on the fly, the
`virDomainSetSchedulerParameters` Libvirt API will be used. To remove CPU
limitation from a domain, the API will be called with the `vcpu_quota`
parameter set to `17,592,186,044,415`. Due to a Libvirt regression (see the
[Libvirt regression
description](https://gitlab.com/libvirt/libvirt/-/work_items/324)), it is not
possible to set this value to `-1`. However, setting it to `17,592,186,044,415`
has the same effect, ensuring that CPU limitations are effectively removed from
running domains despite Libvirt's constraint.
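Note that `17,592,186,044,415` is `2^44 - 1`, the largest quota value Libvirt accepts, which behaves as "unlimited". The sketch below builds the parameter map that would be handed to `virDomainSetSchedulerParameters` (via `setSchedulerParameters` in the libvirt Python binding, for instance); no connection to a live hypervisor is made here:

```python
# Sentinel used in place of -1 because of the Libvirt regression: the
# largest accepted vcpu_quota value, which effectively means "unlimited".
UNLIMITED_VCPU_QUOTA = 2**44 - 1  # 17_592_186_044_415

def remove_cpu_limitation_params() -> dict:
    """Parameter map for virDomainSetSchedulerParameters that lifts the
    CPU cap from a running domain (sketch only; a real call would be,
    e.g., dom.setSchedulerParameters(remove_cpu_limitation_params()))."""
    return {"vcpu_quota": UNLIMITED_VCPU_QUOTA}
```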
## 3.0. Conclusion and Limitations
This proposal introduces support for live scaling of VMs with fixed service
offerings when using the KVM hypervisor in Apache CloudStack environments. As a
limitation, it is important to highlight that live scaling will not be
supported for existing domains created prior to upgrading to the Apache
CloudStack release in which this patch is included. This is because their
domain XMLs are not prepared to support live CPU and memory scaling.
To enable live scaling for such VMs, the following steps are required:
1. Ensure that the `enable.dynamic.scale.vm` global setting is enabled.
2. Ensure that the VM's template is marked as dynamically scalable.
3. Ensure that the VM itself is marked as dynamically scalable.
[^enable-scale-vm-setting]: The global setting `enable.dynamic.scale.vm`
controls whether the live scaling feature is enabled or not.
[^cpu-configuration-libvirt-docs]: More information about the configuration
of guest VMs CPU attributes can be found at [the Libvirt
documentation](https://libvirt.org/formatdomain.html#cpu-allocation).
[^memory-configuration-libvirt-docs]: More information about the
configuration of guest VMs memory attributes can be found at [the Libvirt
documentation](https://libvirt.org/formatdomain.html#memory-allocation).
[^numa-configuration-libvirt-docs]: More information about the configuration
of guest NUMA topology can be found at [the Libvirt
documentation](https://www.libvirt.org/formatdomain.html#cpu-model-and-topology).