bernardodemarco opened a new issue, #12908:
URL: https://github.com/apache/cloudstack/issues/12908

   # Live scaling for VMs with fixed service offerings on KVM
   
   This specification introduces a new feature enabling users to live scale VMs 
created with fixed service offerings on the KVM hypervisor.
   
   ## Table of contents
   
   - [Live scaling for VMs with fixed service offerings on 
KVM](#live-scaling-for-vms-with-fixed-service-offerings-on-kvm)
     - [Table of contents](#table-of-contents)
     - [1. Problem description](#1-problem-description)
     - [2. Proposed changes](#2-proposed-changes)
       - [2.1. Apache CloudStack definition of domain XMLs with 
KVM](#21-apache-cloudstack-definition-of-domain-xmls-with-kvm)
       - [2.2. New cluster-wide settings to control the memory and vCPUs 
maximum capacity for live 
scaling](#22-new-cluster-wide-settings-to-control-the-memory-and-vcpus-maximum-capacity-for-live-scaling)
       - [2.3. `scaleVirtualMachine` API](#23-scalevirtualmachine-api)
     - [3.0. Conclusion and Limitations](#30-conclusion-and-limitations)
   
   ## 1. Problem description
   
   Currently, Apache CloudStack supports three types of compute offerings: fixed, custom constrained, and custom unconstrained. Fixed offerings have a fixed number of vCPUs, a fixed vCPU speed, and a fixed amount of RAM. Custom constrained offerings have a fixed vCPU speed and define ranges of vCPUs and RAM from which users can select values. Lastly, custom unconstrained offerings accept arbitrary amounts of vCPUs, vCPU speed, and RAM.
   
   When using KVM as the hypervisor, Apache CloudStack supports scaling `Stopped` instances through the `scaleVirtualMachine` API. During this process, since the VMs are stopped, only their metadata is updated in the database; thus, when the VMs are later started again, their respective domain XMLs are created with the updated attributes.
   
   For running VMs, Apache CloudStack only supports live scaling VMs with custom constrained and custom unconstrained compute offerings[^enable-scale-vm-setting]. For VMs with fixed compute offerings, users must stop the VMs, change their service offerings through the `scaleVirtualMachine` API, and start them again. However, depending on the criticality of the applications running on the VMs, the downtime caused by this scaling process is highly undesirable.
   
   ## 2. Proposed changes
   
   To address the described problem, this specification introduces the feature 
of live scaling VMs with fixed service offerings when using the KVM hypervisor 
on Apache CloudStack cloud environments. A high-level design of the feature is 
presented, briefly describing the proposed changes to the generation of guest 
VM domain XMLs, the `scaleVirtualMachine` API workflows and the global settings 
controlling the scaling process.
   
   ### 2.1. Apache CloudStack definition of domain XMLs with KVM
   
   During the process of deploying virtual machines with KVM, a VM transfer object (`VirtualMachineTO`) is built from a `VirtualMachineProfile`, which is an object that stores the attributes of the VM. The creation of VM transfer objects is illustrated in the following activity diagram:
   
   <img width="361" height="921" alt="Image" src="https://github.com/user-attachments/assets/f6dce360-5f80-4c6f-ab8c-ab8bc634a2e9" />
   
   When configuring the memory and vCPU attributes for the VMs, Apache 
CloudStack implements the following workflow:
   
   <img width="849" height="657" alt="Image" src="https://github.com/user-attachments/assets/aad3adc2-758a-417a-aa73-34e97d8c2827" />
   
   Regarding the definition of domain XMLs for guest VMs, the following 
boilerplate is used by the Apache CloudStack Agent:
   
   ```xml
   <domain type='kvm' id='4'>
       <!--(...)-->
       <maxMemory slots='16' unit='KiB'>1930240</maxMemory>
       <memory unit='KiB'>1048576</memory>
       <currentMemory unit='KiB'>1048576</currentMemory>
       <vcpu placement='static' current='1'>2</vcpu>
       <!--(...)-->
       <cpu mode='custom' match='exact' check='full'>
           <!--(...)-->
           <numa>
               <cell id='0' cpus='0-1' memory='1048576' unit='KiB'/>
           </numa>
           <!--(...)-->
       </cpu>
   </domain>
   ```
   
   Regarding vCPU configuration, the content of the `vcpu` element defines the maximum number of vCPUs that can be allocated to the guest VM, while its `current` attribute defines the number of vCPUs that are effectively active[^cpu-configuration-libvirt-docs]. As shown in the above activity diagrams, for custom constrained compute offerings, the content of the `vcpu` element is defined as the offering's maximum number of vCPUs. For custom unconstrained offerings, it is defined as the `vm.serviceoffering.cpu.cores.max` global setting value, defaulting to the host's maximum CPU capacity when the setting is zero.
   
   As for guest VM memory configuration, the `maxMemory` element defines the maximum memory allocation for the VM, the `memory` element represents the maximum amount of memory available to the VM at boot time, and `currentMemory` represents the actual memory allocation for the VM[^memory-configuration-libvirt-docs]. Lastly, Libvirt currently requires the specification of NUMA nodes to enable memory hotplug. Each `cell` element represents a NUMA node: the `cpus` attribute defines the range of CPUs that are part of the node, and the `memory` attribute represents the amount of memory in use by the VM[^numa-configuration-libvirt-docs].
   
   As can be noticed from the VM configuration workflow, for custom constrained compute offerings, the content of the `maxMemory` element is defined as the offering's maximum amount of memory. For custom unconstrained offerings, it is defined as the `vm.serviceoffering.ram.size.max` global setting value, defaulting to the host's maximum memory capacity when the setting is zero.
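   The upper-bound resolution described above, for both the `vcpu` and `maxMemory` elements, can be sketched as follows. This is an illustrative sketch, not CloudStack's actual code; the `Offering` type and helper names are assumptions:

```python
from dataclasses import dataclass

@dataclass
class Offering:
    """Illustrative stand-in for a compute offering."""
    is_custom_constrained: bool
    max_vcpus: int = 0
    max_memory_mib: int = 0

def resolve_vcpu_max(offering: Offering, cores_max_setting: int, host_max_cpus: int) -> int:
    """Value written to the <vcpu> element (the upper scaling bound)."""
    if offering.is_custom_constrained:
        return offering.max_vcpus
    # Custom unconstrained: vm.serviceoffering.cpu.cores.max, falling back
    # to the host CPU capacity when the setting is zero.
    return cores_max_setting if cores_max_setting > 0 else host_max_cpus

def resolve_max_memory_kib(offering: Offering, ram_size_max_mib: int, host_memory_kib: int) -> int:
    """Value written to the <maxMemory> element, in KiB."""
    if offering.is_custom_constrained:
        return offering.max_memory_mib * 1024
    # Custom unconstrained: vm.serviceoffering.ram.size.max, falling back
    # to the host memory capacity when the setting is zero.
    return ram_size_max_mib * 1024 if ram_size_max_mib > 0 else host_memory_kib
```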
   
   Hence, to address the live scaling of VMs with fixed offerings, the first validation in the memory and vCPU configuration of a guest VM, which checks whether the VM is dynamically scalable, will be modified. Currently, it considers a VM to be dynamically scalable if its offering is dynamic (that is, the amount of vCPUs, the vCPU speed, or the amount of RAM is not specified), the VM's `dynamically_scalable` property is `true`, and the `enable.dynamic.scale.vm` global setting is `true`.
   
   Therefore, the check for dynamic offerings will be removed from the above-mentioned validation. As a consequence, when the `dynamically_scalable` property of a VM is `true` and the `enable.dynamic.scale.vm` global setting is `true`, the domain XML of the guest VM will always have a range of memory and vCPUs to scale up to, even when the VM is created from a fixed compute offering.
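   A minimal sketch of the validation change, reduced to its boolean logic (function and parameter names are illustrative):

```python
def is_dynamically_scalable_current(offering_is_dynamic: bool,
                                    vm_dynamically_scalable: bool,
                                    enable_dynamic_scale_vm: bool) -> bool:
    # Current behavior: VMs with fixed offerings (offering_is_dynamic == False)
    # are never considered dynamically scalable.
    return offering_is_dynamic and vm_dynamically_scalable and enable_dynamic_scale_vm

def is_dynamically_scalable_proposed(vm_dynamically_scalable: bool,
                                     enable_dynamic_scale_vm: bool) -> bool:
    # Proposed behavior: the dynamic-offering check is removed, so VMs with
    # fixed offerings also get a scaling range in their domain XMLs.
    return vm_dynamically_scalable and enable_dynamic_scale_vm
```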
   
   To define the upper limit of the memory and vCPU live scaling range, the `kvm.cpu.dynamic.scaling.capacity` and `kvm.memory.dynamic.scaling.capacity` cluster-wide settings will be considered (see [new cluster-wide settings](#22-new-cluster-wide-settings-to-control-the-memory-and-vcpus-maximum-capacity-for-live-scaling)). The maximum number of vCPUs will be retrieved from the `kvm.cpu.dynamic.scaling.capacity` setting value, and the maximum amount of memory from the `kvm.memory.dynamic.scaling.capacity` value. When these settings are less than or equal to zero, the host's maximum CPU and memory capacity will be considered.
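   Resolving the effective upper limits from the new settings could look like the sketch below (names are illustrative; note that the `<maxMemory>` element takes KiB while the memory setting is expressed in MiB):

```python
def effective_capacity(setting_value: int, host_capacity: int) -> int:
    # A value less than or equal to zero means "fall back to the host capacity".
    return host_capacity if setting_value <= 0 else setting_value

# kvm.cpu.dynamic.scaling.capacity -> content of the <vcpu> element
max_vcpus = effective_capacity(0, host_capacity=32)       # setting unset: use 32

# kvm.memory.dynamic.scaling.capacity (MiB) -> <maxMemory unit='KiB'>
max_memory_kib = effective_capacity(8192, 131072) * 1024  # 8192 MiB in KiB
```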
   
   This procedure will also be applied to custom constrained offerings to enable seamless live scaling across all types of compute offerings in Apache CloudStack. Thus, at the KVM level, all VMs on a given host will be homogeneous regarding their CPU and memory upper limits. At the Apache CloudStack level, on the other hand, the Management Server will be responsible for validating the compute resource ranges defined in the constrained offerings.
   
   ### 2.2. New cluster-wide settings to control the memory and vCPUs maximum 
capacity for live scaling  
   
   As mentioned in the [domain XMLs definition section](#21-apache-cloudstack-definition-of-domain-xmls-with-kvm), the `vm.serviceoffering.cpu.cores.max` and `vm.serviceoffering.ram.size.max` global settings are used to control the maximum amount of vCPUs and memory to which VMs with custom unconstrained offerings can be live scaled. However, both settings are also currently used to limit the maximum amount of CPU and RAM that can be defined for compute offerings and allocated to VMs.
   
   To segregate goals and responsibilities, two new cluster-wide settings will 
be introduced:
   
   | Name   | Type   | Scope  | Description |
   | ------ | ------ | ------ | ----------- |
   | `kvm.memory.dynamic.scaling.capacity` | Integer | Cluster | Defines the maximum memory capacity, in MiB, to which VMs can be dynamically scaled with KVM. The setting's value will be used to define the value of the `<maxMemory />` element of domain XMLs. If it is set to a value less than or equal to `0`, the host's memory capacity will be considered. |
   | `kvm.cpu.dynamic.scaling.capacity` | Integer | Cluster | Defines the maximum vCPU capacity to which VMs can be dynamically scaled with KVM. The setting's value will be used to define the value of the `<vcpu />` element of domain XMLs. If it is set to a value less than or equal to `0`, the host's CPU cores capacity will be considered. |
   
   Therefore, both settings will be used exclusively to control the maximum live scaling capacity for memory and vCPUs. To maintain compatibility with the current behavior, the values of the `vm.serviceoffering.cpu.cores.max` and `vm.serviceoffering.ram.size.max` global settings will be used to populate the initial values of the new cluster-wide settings.
   
   ### 2.3. `scaleVirtualMachine` API
   
   The following activity diagram illustrates the current implementation of the 
VM scaling process:
   
   <img width="951" height="982" alt="Image" src="https://github.com/user-attachments/assets/13c8e18c-9ec3-4f42-b334-d4f9cf3331a1" />
   
   As can be noticed, when upgrading running VMs, after performing some general validations, the Management Server checks whether the VM is running on KVM and whether its offering is not dynamic (that is, vCPU, vCPU speed, and RAM are all specified for the offering). If the VM meets both conditions, an exception is thrown, informing the end user that KVM does not support live scaling VMs with fixed compute offerings.
   
   Therefore, the above-mentioned check for dynamic offerings will be removed 
from the scaling workflow. This is possible because the domain XMLs of guest 
VMs will be prepared to support live scaling when the global setting 
`enable.dynamic.scale.vm` and the VM's `dynamically_scalable` property are set 
to `true`.
   
   Additionally, it is relevant to note that the current workflow does not 
update the VM's CPU quota percentage based on its new CPU frequency. To address 
this, the current implementation will be extended so that, before building the 
`ScaleVmCommand`, the new CPU quota percentage is calculated and included in 
the command payload, along with a flag indicating whether the CPU cap has 
changed between the old and new service offerings.
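   As a rough illustration of the extended `ScaleVmCommand` payload, the sketch below computes a quota percentage as the offering's share of the host's total CPU capacity. The formula, field names, and dictionary shapes are assumptions for this sketch, not the final implementation:

```python
def cpu_quota_percentage(vcpus: int, speed_mhz: int,
                         host_cpus: int, host_speed_mhz: int) -> float:
    # Share of the host's total CPU capacity granted to the offering.
    return 100.0 * (vcpus * speed_mhz) / (host_cpus * host_speed_mhz)

def build_scale_vm_payload(old_offering: dict, new_offering: dict, host: dict) -> dict:
    return {
        "cpuQuotaPercentage": cpu_quota_percentage(
            new_offering["vcpus"], new_offering["speed_mhz"],
            host["cpus"], host["speed_mhz"]),
        # Flag telling the Agent whether the CPU cap changed between offerings.
        "cpuCapChanged": old_offering["limit_cpu_use"] != new_offering["limit_cpu_use"],
    }
```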
   
   At the Agent side, the following workflow will be executed:
   
   <img width="1059" height="651" alt="Image" src="https://github.com/user-attachments/assets/1b3c67c1-002b-4edc-9d32-a1ec38478a22" />
   
   Thus, if both the old and the new service offerings have CPU limitation enabled, the domain's CPU quota will be updated on the fly. If the old service offering does not have CPU limitation enabled but the new one does, the domain's CPU quota and period parameters will be set. Conversely, if the old offering has CPU limitation enabled and the new one does not, these parameters will be removed from the domain.
   
   To update the CPU scheduling parameters of a domain on the fly, the `virDomainSetSchedulerParameters` Libvirt API will be used. To remove CPU limitation from a domain, the API will be called with the `vcpu_quota` parameter set to `17592186044415`. Due to a Libvirt regression (see [Libvirt regression description](https://gitlab.com/libvirt/libvirt/-/work_items/324)), it is not possible to set this value to `-1`. However, setting it to `17592186044415` has the same effect, ensuring that CPU limitations are effectively removed from running domains despite Libvirt's constraint.
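   The Agent-side decision described above can be sketched as follows. `vcpu_quota` and `vcpu_period` are the actual Libvirt scheduler parameter names; the function, the default period, and the overall structure are illustrative assumptions:

```python
# Value used by this specification to emulate an unlimited quota (2**44 - 1),
# since Libvirt currently rejects -1 due to the regression mentioned above.
UNLIMITED_VCPU_QUOTA = 17_592_186_044_415

def scheduler_params(old_cap_enabled: bool, new_cap_enabled: bool,
                     new_quota: int, period: int = 10_000) -> dict:
    if new_cap_enabled:
        params = {"vcpu_quota": new_quota}
        if not old_cap_enabled:
            # The cap is newly enabled: also set the scheduling period.
            params["vcpu_period"] = period
        return params
    if old_cap_enabled:
        # The cap is being removed: the maximum quota behaves as unlimited.
        return {"vcpu_quota": UNLIMITED_VCPU_QUOTA}
    return {}  # no CPU limitation before or after: nothing to update
```

With the libvirt-python bindings, the resulting dictionary could then be passed to `virDomain.setSchedulerParameters()`.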
   
   ## 3.0. Conclusion and Limitations
   
   This proposal introduces support for live scaling of VMs with fixed service 
offerings when using the KVM hypervisor in Apache CloudStack environments. As a 
limitation, it is important to highlight that live scaling will not be 
supported for existing domains created prior to upgrading to the Apache 
CloudStack release in which this patch is included. This is because their 
domain XMLs are not prepared to support live CPU and memory scaling.
   
   To enable live scaling for such VMs, the following steps are required:
   
   1. Ensure that the `enable.dynamic.scale.vm` global setting is enabled.
   2. Ensure that the VM's template is marked as dynamically scalable.
   3. Ensure that the VM itself is marked as dynamically scalable.
   
   [^enable-scale-vm-setting]: The global setting `enable.dynamic.scale.vm` 
controls whether the live scaling feature is enabled or not.
   
   [^cpu-configuration-libvirt-docs]: More information about the configuration 
of guest VMs CPU attributes can be found at [the Libvirt 
documentation](https://libvirt.org/formatdomain.html#cpu-allocation).
   
   [^memory-configuration-libvirt-docs]: More information about the 
configuration of guest VMs memory attributes can be found at [the Libvirt 
documentation](https://libvirt.org/formatdomain.html#memory-allocation).
   
   [^numa-configuration-libvirt-docs]: More information about the configuration 
of guest NUMA topology can be found at [the Libvirt 
documentation](https://www.libvirt.org/formatdomain.html#cpu-model-and-topology).
   

