On 6/12/2025 4:27 PM, Chenyi Qiang wrote:
> Commit 852f0048f3 ("RAMBlock: make guest_memfd require uncoordinated
> discard") highlighted that subsystems like VFIO may disable RAM block
> discard. However, guest_memfd relies on discard operations for page
> conversion between private and shared memory, potentially leading to
> the stale IOMMU mapping issue when assigning hardware devices to
> confidential VMs via shared memory. To address this and allow shared
> device assignment, it is crucial to ensure the VFIO system refreshes
> its IOMMU mappings.
>
> RamDiscardManager is an existing interface (used by virtio-mem) to
> adjust VFIO mappings in relation to VM page assignment. Effectively page
> conversion is similar to hot-removing a page in one mode and adding it
> back in the other. Therefore, similar actions are required for page
> conversion events. Introduce the RamDiscardManager to guest_memfd to
> facilitate this process.
>
> Since guest_memfd is not an object, it cannot directly implement the
> RamDiscardManager interface. Implementing it in HostMemoryBackend is
> not appropriate because guest_memfd is per RAMBlock, and some RAMBlocks
> have a memory backend while others do not. Notably, virtual BIOS
> RAMBlocks using memory_region_init_ram_guest_memfd() do not have a
> backend.
>
> To manage RAMBlocks with guest_memfd, define a new object named
> RamBlockAttributes to implement the RamDiscardManager interface. This
> object can store the guest_memfd information such as the bitmap for
> shared memory and the registered listeners for event notifications. A
> new state_change() helper function is provided to notify listeners, such
> as VFIO, allowing VFIO to dynamically DMA map and unmap the shared
> memory according to conversion events. Note that in the current context
> of RamDiscardManager for guest_memfd, the shared state is analogous to
> being populated, while the private state can be considered discarded for
> simplicity. Handling more states, such as private/shared/discarded at
> the same time, would be more complicated and is left for future work.
>
> In the current implementation, memory state tracking is performed at
> the host page size granularity, as the minimum conversion size can be
> one page per request. Additionally, VFIO expects the DMA mapping for a
> specific IOVA to be mapped and unmapped with the same granularity.
> Confidential VMs may perform partial conversions, such as conversions on
> small regions within a larger one. To prevent such invalid cases and
> until support for DMA mapping cut operations is available, all
> operations are performed with 4K granularity.
>
> In addition, memory conversion failures currently cause QEMU to quit
> rather than resume the guest or retry the operation. It would be
> future work to add more error handling or rollback mechanisms once
> conversion failures are allowed. For example, in-place conversion of
> guest_memfd could retry the unmap operation during the conversion from
> shared to private. For now, keep the complex error handling out of the
> picture as it is not required.
>
> Tested-by: Alexey Kardashevskiy <a...@amd.com>
> Reviewed-by: Alexey Kardashevskiy <a...@amd.com>
> Reviewed-by: Pankaj Gupta <pankaj.gu...@amd.com>
> Signed-off-by: Chenyi Qiang <chenyi.qi...@intel.com>
> ---
Fix a build failure with cross-platform compilation. Opportunistically
resolve a "line over 80 characters" warning from checkpatch.
===
From 66d6edfb78a6059362a1de3d5028c4159782554b Mon Sep 17 00:00:00 2001
From: Chenyi Qiang <chenyi.qi...@intel.com>
Date: Fri, 20 Jun 2025 10:29:14 +0800
Subject: [PATCH] fixup! ram-block-attributes: Introduce RamBlockAttributes to
manage RAMBlock with guest_memfd
Signed-off-by: Chenyi Qiang <chenyi.qi...@intel.com>
---
system/ram-block-attributes.c | 10 ++++++----
1 file changed, 6 insertions(+), 4 deletions(-)
diff --git a/system/ram-block-attributes.c b/system/ram-block-attributes.c
index dbb8c9675b..68e8a02703 100644
--- a/system/ram-block-attributes.c
+++ b/system/ram-block-attributes.c
@@ -42,7 +42,8 @@ ram_block_attributes_rdm_is_populated(const RamDiscardManager *rdm,
const RamBlockAttributes *attr = RAM_BLOCK_ATTRIBUTES(rdm);
const size_t block_size = ram_block_attributes_get_block_size(attr);
const uint64_t first_bit = section->offset_within_region / block_size;
-    const uint64_t last_bit = first_bit + int128_get64(section->size) / block_size - 1;
+    const uint64_t last_bit =
+        first_bit + int128_get64(section->size) / block_size - 1;
unsigned long first_discarded_bit;
first_discarded_bit = find_next_zero_bit(attr->bitmap, last_bit + 1,
@@ -333,8 +334,8 @@ int ram_block_attributes_state_change(RamBlockAttributes *attr,
int ret = 0;
if (!ram_block_attributes_is_valid_range(attr, offset, size)) {
-        error_report("%s, invalid range: offset 0x%lx, size 0x%lx",
-                     __func__, offset, size);
+        error_report("%s, invalid range: offset 0x%" PRIx64 ", size "
+                     "0x%" PRIx64, __func__, offset, size);
return -EINVAL;
}
@@ -402,7 +403,8 @@ RamBlockAttributes *ram_block_attributes_create(RAMBlock *ram_block)
object_unref(OBJECT(attr));
return NULL;
}
-    attr->bitmap_size = ROUND_UP(mr->size, block_size) / block_size;
+    attr->bitmap_size =
+        ROUND_UP(int128_get64(mr->size), block_size) / block_size;
attr->bitmap = bitmap_new(attr->bitmap_size);
return attr;
--
2.43.5