Tested using:
$ cat test-unmap.sh
#!/bin/sh
qemu=${1:?Usage: $0 qemu-executable}
img=/tmp/test.raw
echo
echo "defaults - write zeroes"
fallocate -l 1m $img
echo -e 'qemu-io none0 "write -z 0 1m"\nquit' | $qemu -monitor stdio \
-drive if=none,file=$img,format=raw >/dev/null
du -sh $img
echo
echo "defaults - write zeroes unmap"
fallocate -l 1m $img
echo -e 'qemu-io none0 "write -zu 0 1m"\nquit' | $qemu -monitor stdio \
-drive if=none,file=$img,format=raw >/dev/null
du -sh $img
echo
echo "defaults - write actual zeros"
fallocate -l 1m $img
echo -e 'qemu-io none0 "write -P 0 0 1m"\nquit' | $qemu -monitor stdio \
-drive if=none,file=$img,format=raw >/dev/null
du -sh $img
echo
echo "discard=off - write zeroes unmap"
fallocate -l 1m $img
echo -e 'qemu-io none0 "write -zu 0 1m"\nquit' | $qemu -monitor stdio \
-drive if=none,file=$img,format=raw,discard=off >/dev/null
du -sh $img
echo
echo "detect-zeros=on - write actual zeros"
fallocate -l 1m $img
echo -e 'qemu-io none0 "write -P 0 0 1m"\nquit' | $qemu -monitor stdio \
-drive if=none,file=$img,format=raw,detect-zeroes=on >/dev/null
du -sh $img
echo
echo "detect-zeros=unmap,discard=unmap - write actual zeros"
fallocate -l 1m $img
echo -e 'qemu-io none0 "write -P 0 0 1m"\nquit' | $qemu -monitor stdio \
-drive if=none,file=$img,format=raw,detect-zeroes=unmap,discard=unmap
>/dev/null
du -sh $img
echo
echo "discard=unmap - write zeroes"
fallocate -l 1m $img
echo -e 'qemu-io none0 "write -z 0 1m"\nquit' | $qemu -monitor stdio \
-drive if=none,file=$img,format=raw,discard=unmap >/dev/null
du -sh $img
echo
echo "discard=unmap - write zeroes unmap"
fallocate -l 1m $img
echo -e 'qemu-io none0 "write -zu 0 1m"\nquit' | $qemu -monitor stdio \
-drive if=none,file=$img,format=raw,discard=unmap >/dev/null
du -sh $img
rm $img
Before this change:
$ cat before.out
defaults - write zeroes
1.0M /tmp/test.raw
defaults - write zeroes unmap
0 /tmp/test.raw
defaults - write actual zeros
1.0M /tmp/test.raw
discard=off - write zeroes unmap
0 /tmp/test.raw
detect-zeros=on - write actual zeros
1.0M /tmp/test.raw
detect-zeros=unmap,discard=unmap - write actual zeros
0 /tmp/test.raw
discard=unmap - write zeroes
1.0M /tmp/test.raw
discard=unmap - write zeroes unmap
0 /tmp/test.raw
[nsoffer build (consider-discard-option)]$
After this change:
$ cat after.out
defaults - write zeroes
1.0M /tmp/test.raw
defaults - write zeroes unmap
1.0M /tmp/test.raw
defaults - write actual zeros
1.0M /tmp/test.raw
discard=off - write zeroes unmap
1.0M /tmp/test.raw
detect-zeros=on - write actual zeros
1.0M /tmp/test.raw
detect-zeros=unmap,discard=unmap - write actual zeros
0 /tmp/test.raw
discard=unmap - write zeroes
1.0M /tmp/test.raw
discard=unmap - write zeroes unmap
0 /tmp/test.raw
Differences:
$ diff -u before.out after.out
--- before.out 2024-06-19 20:24:09.234083713 +0300
+++ after.out 2024-06-19 20:24:20.526165573 +0300
@@ -3,13 +3,13 @@
1.0M /tmp/test.raw
defaults - write zeroes unmap
-0 /tmp/test.raw
+1.0M /tmp/test.raw
defaults - write actual zeros
1.0M /tmp/test.raw
discard=off - write zeroes unmap
-0 /tmp/test.raw
+1.0M /tmp/test.raw
On Wed, Jun 19, 2024 at 8:40 PM Nir Soffer <[email protected]> wrote:
> When opening an image with discard=off, we punch hole in the image when
> writing zeroes, making the image sparse. This breaks users that want to
> ensure that writes cannot fail with ENOSPACE by using fully allocated
> images.
>
> bdrv_co_pwrite_zeroes() correctly disable BDRV_REQ_MAY_UNMAP if we
> opened the child without discard=unmap or discard=on. But we don't go
> through this function when accessing the top node. Move the check down
> to bdrv_co_do_pwrite_zeroes() which seems to be used in all code paths.
>
> Issues:
> - We don't punch hole by default, so images are kept allocated. Before
> this change we punched holes by default. I'm not sure this is a good
> change in behavior.
> - Need to run all block tests
> - Not sure that we have tests covering unmapping, we may need new tests
> - We may need new tests to cover this change
>
> Signed-off-by: Nir Soffer <[email protected]>
> ---
>
> Changes since v1:
> - Replace the incorrect has_discard change with the right fix
>
> v1 was here:
> https://lists.nongnu.org/archive/html/qemu-block/2024-06/msg00198.html
>
> block/io.c | 9 +++++----
> 1 file changed, 5 insertions(+), 4 deletions(-)
>
> diff --git a/block/io.c b/block/io.c
> index 7217cf811b..301514c880 100644
> --- a/block/io.c
> +++ b/block/io.c
> @@ -1860,10 +1860,15 @@ bdrv_co_do_pwrite_zeroes(BlockDriverState *bs,
> int64_t offset, int64_t bytes,
> /* By definition there is no user buffer so this flag doesn't make
> sense */
> if (flags & BDRV_REQ_REGISTERED_BUF) {
> return -EINVAL;
> }
>
> + /* If opened with discard=off we should never unmap. */
> + if (!(bs->open_flags & BDRV_O_UNMAP)) {
> + flags &= ~BDRV_REQ_MAY_UNMAP;
> + }
> +
> /* Invalidate the cached block-status data range if this write
> overlaps */
> bdrv_bsc_invalidate_range(bs, offset, bytes);
>
> assert(alignment % bs->bl.request_alignment == 0);
> head = offset % alignment;
> @@ -2313,14 +2318,10 @@ int coroutine_fn bdrv_co_pwrite_zeroes(BdrvChild
> *child, int64_t offset,
> {
> IO_CODE();
> trace_bdrv_co_pwrite_zeroes(child->bs, offset, bytes, flags);
> assert_bdrv_graph_readable();
>
> - if (!(child->bs->open_flags & BDRV_O_UNMAP)) {
> - flags &= ~BDRV_REQ_MAY_UNMAP;
> - }
> -
> return bdrv_co_pwritev(child, offset, bytes, NULL,
> BDRV_REQ_ZERO_WRITE | flags);
> }
>
> /*
> --
> 2.45.1
>
>