On 06.03.26 12:12, Kevin Wolf wrote:
Am 06.03.2026 um 11:20 hat Hanna Czenczek geschrieben:
On 04.03.26 17:13, Kevin Wolf wrote:
Am 04.03.2026 um 15:20 hat Hanna Czenczek geschrieben:
On 02.03.26 15:30, Kevin Wolf wrote:
Am 05.02.2026 um 15:47 hat Hanna Czenczek geschrieben:
Add BDS flags that prevent taking WRITE and/or RESIZE permissions on
pure data (no metadata) children.  These are going to be used by qcow2
during formatting, when we need write access to format the metadata
file, but no write access to an external data file.  This will allow
creating a qcow2 image for a raw image while the latter is currently in
use by the VM.

Signed-off-by: Hanna Czenczek <[email protected]>
---
    include/block/block-common.h | 11 +++++++++++
    block.c                      | 15 +++++++++++++++
    2 files changed, 26 insertions(+)

diff --git a/include/block/block-common.h b/include/block/block-common.h
index c8c626daea..504f6aa113 100644
--- a/include/block/block-common.h
+++ b/include/block/block-common.h
@@ -245,6 +245,17 @@ typedef enum {
    #define BDRV_O_CBW_DISCARD_SOURCE 0x80000 /* for copy-before-write filter */
+/*
+ * Promise not to write any data to pure (non-metadata-bearing) data storage
+ * children, so we don't need the WRITE permission for them.
+ * For image creation, formatting requires write access to the image, but not
+ * necessarily to its pure storage children.  This allows creating an image on
+ * top of an existing raw storage image that is already attached to the VM.
+ */
+#define BDRV_O_NO_DATA_WRITE  0x100000
Can't we just use BDRV_O_NO_IO for this one? It is stricter because it
doesn't allow reading either, but I don't think image creation ever
requires reading from the image?
How would qcow2 set it?  It opens the qcow2 image, so it can only set the
flag on the qcow2 BDS (via a BlockBackend), but BDRV_O_NO_IO needs to go on
the data-file child.  Maybe we can construct the graph manually…? It would
be quite painful, I imagine, but I haven’t tried yet.
Why should it go on the data-file child? The child permissions are
defined by the qcow2 node. If the caller promises not to do any I/O (and
I'm fairly sure that apart from preallocation, creating the image
doesn't involve any I/O on the qcow2 node, just on the primary child),
then qcow2 doesn't need any permissions on data-file.

Or am I missing a reason why BDRV_O_NO_IO can't be set for the qcow2
node?
First, bdrv_co_write_req_prepare() requires !BDRV_O_NO_IO and the RESIZE
permission.
Ah, I wasn't aware that truncate requires !BDRV_O_NO_IO. It's not what I
would intuitively call I/O, but it's also justifiable because it changes
the result of reading some blocks.

The part where things start to feel questionable is with the way
qcow2_co_create() uses truncate. It doesn't actually want to truncate
the image (especially with an existing data file), but just allocate the
metadata for the full image size.

If the problem is just bdrv_co_write_req_prepare(), would a request flag
for the truncate solve it without having to introduce two global BDS
flags that can never be set by the user? BDRV_REQ_NO_DATA_IO or
something?

Oh, only one new flag.  We can drop the RESIZE flag, as you said, and make it !BDRV_O_RESIZE.  (If we call qcow2_co_truncate() directly, that is.)

Dropping that flag changes the error messages sometimes (because without this flag, WRITE will always imply RESIZE, so with a guest device that prevents concurrent resize, you will then always get RESIZE conflicts alongside WRITE), but that’s OK (because it’s only when you get a WRITE conflict anyway).

We could bypass this by calling qcow2_co_truncate() instead of
blk_co_truncate().  Feels wrong to me to pass BDRV_O_NO_IO and
!BDRV_O_RESIZE to blk_co_new_open() when we actually kind of want those
things and just bypass the BB to get them, but only morally wrong. Not
technically wrong.

Bigger problem: BDRV_O_NO_IO makes qcow2 skip opening the data-file
altogether.  So we would need to distinguish between qemu-img info and
this case somehow.
Do we need to have it opened, except for preallocation, which can't set
BDRV_O_NO_IO anyway?

Yes, for resizing it without preallocation (growing to fit).

We could have qcow2_do_open() distinguish by checking whether the "data-file" option in the QDict is set and a string (a node-name), because if it is, it’s safe to take that existing BDS despite BDRV_O_NO_IO.

I’m still not sure how I morally feel about passing BDRV_O_NO_IO when we do want to do I/O.  Yes, only metadata.  I know.  But I understand BDRV_O_NO_IO was introduced specifically for nothing at all, just querying image information.  And it works, yes – you just have to make sure you keep BDRV_O_RDWR in there, too, because otherwise the whole thing will be read-only and not work.  Feels really wrong to me, but I guess we already do the same thing at the end of qcow2_co_create() to flush the image, so…  Too late to protest now, I suppose.

Hanna

(I also thought maybe the bdrv_co_truncate() in qcow2_co_truncate() might be
a problem with data-file inheriting BDRV_O_NO_IO, but because
qcow2_co_create() puts in a pre-opened BDS (without the flag set), that part
works.)
We should probably skip truncating the data file for BDRV_REQ_NO_DATA_IO
and we want to make the image the same size as the data file anyway, no?
I don't think we have the intention to actually resize the data file, do
we?

Kevin



Reply via email to