Re: Why is my VM image so large?!

Michael Paoli Mon, 19 May 2025 19:23:26 -0700

This isn't a Debian-specific question.  You may want to ask on, or
check, a relevant qemu list or the like.

Though qcow2 is quite flexible, and can be quite efficient, depending on
what data is written there, what snapshots are or may have been there,
etc., it may also be rather inefficient, to the point of being less
efficient than raw.  So, e.g., if compression is used and the data can't
be compressed, it will take at least slightly more space than the data
itself.  Likewise, if there are no unallocated blocks, not only are
there no space savings, but there's the additional overhead of tracking
where the blocks are, as they may be added in almost any order.
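
As a rough illustration of the compression point, here is a minimal
sketch that runs gzip (deflate, the same algorithm family as zlib) over
1 MiB of random data and 1 MiB of zeros.  Using gzip as a stand-in is my
assumption; it shows only the general behaviour of deflate and says
nothing about how qcow2 itself decides to store a given cluster.

```shell
#!/bin/sh
# Compare deflate output size for incompressible vs. highly compressible input.
n=1048576                                   # 1 MiB of input in each case
random_gz=$(head -c "$n" /dev/urandom | gzip -c | wc -c | tr -d ' ')
zero_gz=$(head -c "$n" /dev/zero | gzip -c | wc -c | tr -d ' ')
echo "input bytes:        $n"
echo "random, compressed: $random_gz"
echo "zeros, compressed:  $zero_gz"
```

The random input should come out slightly larger than it went in, while
the zeros shrink to a tiny fraction of the input.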

So, let's see how grossly inefficient I can be, and if I can recover
some of that.
# qemu-img create -f qcow2 \
    -o compression_type=zlib,preallocation=off,size=2G \
    /var/local/vtest/2GiB.qcow2
Formatting '/var/local/vtest/2GiB.qcow2', fmt=qcow2 cluster_size=65536 extended_l2=off preallocation=off compression_type=zlib size=2147483648 lazy_refcounts=off refcount_bits=16
# stat -c '%s' /var/local/vtest/2GiB.qcow2
196640
#
$ virsh attach-disk balug /var/local/vtest/2GiB.qcow2 vdc --live \
    --subdriver qcow2
Disk attached successfully

$
// from the guest VM:
# bs=65536
# (seek=0; while dd if=/dev/random of=/dev/vdc bs="$bs" count=1 \
    seek="$seek" status=none; do seek=$(expr "$seek" + 2); done)
dd: error writing '/dev/vdc': No space left on device
# (seek=1; while dd if=/dev/random of=/dev/vdc bs="$bs" count=1 \
    seek="$seek" status=none; do seek=$(expr "$seek" + 2); done)
dd: /dev/vdc: cannot seek: Invalid argument
#
// I used random data so it (generally) wouldn't compress,
// and filled clusters in alternating order to avoid possible
// contiguous mapping efficiencies.
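
To make the alternating write order concrete, here is a small sketch
that applies the same even-then-odd pattern to a scratch file rather
than a block device.  The 8-cluster size, the temporary file, and the
explicit loop bound are my assumptions for the demo; the loops in the
transcript instead run until dd fails at the end of the device.

```shell
#!/bin/sh
# Write even-numbered 64 KiB clusters first, then the odd-numbered ones.
bs=65536
clusters=8                                  # hypothetical small size for the demo
img=$(mktemp)
for start in 0 1; do
    seek=$start
    while [ "$seek" -lt "$clusters" ]; do
        # conv=notrunc keeps dd from truncating the regular file at the seek
        # point; iflag=fullblock guards against short reads from urandom.
        dd if=/dev/urandom of="$img" bs="$bs" count=1 seek="$seek" \
            conv=notrunc iflag=fullblock status=none
        seek=$((seek + 2))
    done
done
size=$(stat -c '%s' "$img")
echo "final size: $size bytes"
rm -f "$img"
```

After both passes every cluster has been written once, but never in
ascending order.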
// Back on the physical host:
# stat -c '%s' /var/local/vtest/2GiB.qcow2
2148073472
# expr 512 \* 2 \* 1024 \* 1024 \* 2
2147483648
#
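
The two figures above differ only by the qcow2 metadata, and the
arithmetic can be spelled out as below.  All numbers are taken from the
transcript.  How the overhead divides among header, L1/L2 tables, and
refcount structures is not something I checked against this image, so
the sketch reports only the totals.

```shell
#!/bin/sh
# Figures taken from the transcript above.
cluster=65536
virtual=2147483648        # virtual disk size
file=2148073472           # qcow2 file size after filling every cluster
initial=196640            # file size right after qemu-img create
overhead=$((file - virtual))
overhead_clusters=$((overhead / cluster))
data_clusters=$((virtual / cluster))
initial_clusters=$((initial / cluster))
initial_extra=$((initial % cluster))
echo "data clusters:     $data_clusters"
echo "metadata overhead: $overhead bytes ($overhead_clusters clusters)"
echo "initial file:      $initial_clusters clusters plus $initial_extra bytes"
```

So a fully written 2 GiB image carries 9 extra clusters, roughly 0.03%
on top of the data: small, but still more than raw.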
// Let's detach it, add a snapshot, reattach, and likewise fill again
$ virsh detach-disk balug vdc
Disk detached successfully

$
# qemu-img snapshot -l /var/local/vtest/2GiB.qcow2
# qemu-img snapshot -c snap01 /var/local/vtest/2GiB.qcow2
# qemu-img snapshot -l /var/local/vtest/2GiB.qcow2
Snapshot list:
ID        TAG               VM SIZE                DATE     VM CLOCK     ICOUNT
1         snap01                0 B 2025-05-20 01:42:06 00:00:00.000          0
#
$ virsh attach-disk balug /var/local/vtest/2GiB.qcow2 vdc --live \
    --subdriver qcow2
Disk attached successfully

$
// back on the VM guest:
# (seek=0; while dd if=/dev/random of=/dev/vdc bs="$bs" count=1 \
    seek="$seek" status=none; do seek=$(expr "$seek" + 2); done; seek=1;
    while dd if=/dev/random of=/dev/vdc bs="$bs" count=1 seek="$seek" \
    status=none; do seek=$(expr "$seek" + 2); done)
dd: error writing '/dev/vdc': No space left on device
dd: /dev/vdc: cannot seek: Invalid argument
#
// So, back to host, let's detach and see what we can free up.
$ virsh detach-disk balug vdc
Disk detached successfully

$
# ls -ons /var/local/vtest/2GiB.qcow2
4199252 -rw------- 1 0 4296015872 May 20 02:08 /var/local/vtest/2GiB.qcow2
# qemu-img snapshot -d snap01 /var/local/vtest/2GiB.qcow2
# ls -ons /var/local/vtest/2GiB.qcow2
2099792 -rw------- 1 0 4296015872 May 20 02:09 /var/local/vtest/2GiB.qcow2
#
// It's a sparse file; we got almost all of that spare space back.
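
For anyone unfamiliar with the two numbers ls -s prints, here is a
minimal sketch of apparent size versus allocated space, using a scratch
file instead of the image.  The 16 MiB size and the temporary file are
assumptions for illustration, and the stat options are those of GNU
coreutils.

```shell
#!/bin/sh
# Create a 16 MiB sparse file and compare apparent size with allocated space.
f=$(mktemp)
truncate -s 16M "$f"
apparent=$(stat -c '%s' "$f")               # logical length in bytes
# %b is the number of allocated blocks, %B the size of each such block
allocated=$(( $(stat -c '%b' "$f") * $(stat -c '%B' "$f") ))
echo "apparent:  $apparent"
echo "allocated: $allocated"
rm -f "$f"
```

Deleting the snapshot changed only the second kind of number: the file
length stayed at 4296015872 while the allocated blocks roughly halved.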
// Now let's see if we can do likewise for data in the image, if we
// replace it with something that compresses very well.
$ virsh attach-disk balug /var/local/vtest/2GiB.qcow2 vdc --live \
    --subdriver qcow2
Disk attached successfully

$
// and back to the VM guest:
# dd if=/dev/zero of=/dev/vdc bs="$bs" status=none; unset bs
dd: error writing '/dev/vdc': No space left on device
#
// And back to host:
$ virsh detach-disk balug vdc
Disk detached successfully

$
# ls -ons /var/local/vtest/2GiB.qcow2
2099792 -rw------- 1 0 4296015872 May 20 02:13 /var/local/vtest/2GiB.qcow2
# fallocate -d /var/local/vtest/2GiB.qcow2; ls -ons /var/local/vtest/2GiB.qcow2
368 -rw------- 1 0 4296015872 May 20 02:16 /var/local/vtest/2GiB.qcow2
#
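
The same effect can be reproduced on a scratch file, which is a safe way
to see what fallocate -d does before pointing it at a real image.  This
is only a sketch: the 8 MiB size and the temporary file are assumptions,
and since not every filesystem supports punching holes, a failure is
tolerated rather than treated as an error.

```shell
#!/bin/sh
# Write 8 MiB of real zero bytes, then let fallocate -d turn them into holes.
f=$(mktemp)
dd if=/dev/zero of="$f" bs=1M count=8 status=none
sync "$f"                                   # flush so the block count is settled
size_before=$(stat -c '%s' "$f"); blocks_before=$(stat -c '%b' "$f")
# Not every filesystem supports hole punching, so tolerate failure here.
fallocate -d "$f" 2>/dev/null || echo "fallocate -d not supported here"
size_after=$(stat -c '%s' "$f"); blocks_after=$(stat -c '%b' "$f")
echo "apparent size: $size_before -> $size_after"
echo "blocks in use: $blocks_before -> $blocks_after"
rm -f "$f"
```

Where hole punching is supported, the apparent size stays the same while
the blocks in use drop to (nearly) zero, just as with the image above.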
Well, that nicely and radically shrunk it: not the logical size, but it
freed huge numbers of null blocks to make the file very sparse.
So, you might try something like that on the filesystem in the VM,
e.g. fill the unallocated space with large file(s) containing nothing
but ASCII NUL characters, then remove those files from the VM's
filesystem.  Then, with the qcow2 file inactive, see what you can do
with fallocate -d (don't do that to the file while it's in use by the
VM).  I not uncommonly do similar on VMs to save space on their
filesystem images: basically fill most or all of the spare space with
large file(s) of just nulls, then remove those files, and then, with the
backing file not in use by the VM, use fallocate -d.
There may be more efficient ways if discard/trim is in use all the way
down and through, but often that's not the case (one may even
specifically not want to do that, for certain reasons).
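
For a Linux guest, the fill-and-remove step might be wrapped up roughly
as follows.  This is a sketch under assumptions, not a tested procedure
for any particular guest: the function name zero_fill, the filler file
name, and the optional size cap are all hypothetical, and a Windows
guest would need an equivalent tool of its own.

```shell
#!/bin/sh
# zero_fill DIR [MAX_MIB]: fill free space under DIR with zeros, then delete
# the filler.  With MAX_MIB given, stop after that many MiB instead of
# running until the filesystem is full.
zero_fill() {
    dir=$1; max_mib=${2:-}
    filler="$dir/zero.fill.$$"              # hypothetical temporary file name
    if [ -n "$max_mib" ]; then
        dd if=/dev/zero of="$filler" bs=1M count="$max_mib" status=none
    else
        # Expected to end with "No space left on device"
        dd if=/dev/zero of="$filler" bs=1M status=none || true
    fi
    sync                                    # make sure the zeros reach the image
    rm -f "$filler"
    sync
}

# Example: write and remove 4 MiB of zeros in a scratch directory.
scratch=$(mktemp -d)
zero_fill "$scratch" 4
leftover=$(ls -A "$scratch" | wc -l | tr -d ' ')
echo "files left behind: $leftover"
rmdir "$scratch"
```

In real use one would call it without the cap on each guest filesystem,
shut down or detach so the image is no longer in use, and only then run
fallocate -d on the image file from the host.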

On Fri, May 16, 2025 at 5:21 AM Celejar <cele...@gmail.com> wrote:
>
> Hi,
>
> I have a QEMU / KVM VM running Windows that has been running as a guest
> on various Debian hosts for about a decade. The Windows OS has
> undergone various repairs and reinstalls over the years. I was recently
> quite surprised to discover that the VM image size (actual size on
> disk, not apparent size) has somehow grown to about 4x the allocated
> size of the disk:
>
> ~# ls -alsh /var/lib/libvirt/images/win10.qcow2
> 314G -rw------- 1 root root 352G May 15 12:40 /var/lib/libvirt/images/win10.qcow2
>
> ~# qemu-img  info /var/lib/libvirt/images/win10.qcow2
> image: /var/lib/libvirt/images/win10.qcow2
> file format: qcow2
> virtual size: 80 GiB (85899345920 bytes)
> disk size: 314 GiB
> cluster_size: 65536
> Format specific information:
>     compat: 1.1
>     compression type: zlib
>     lazy refcounts: true
>     refcount bits: 16
>     corrupt: false
>     extended l2: false
> Child node '/file':
>     filename: /var/lib/libvirt/images/win10.qcow2
>     protocol type: file
>     file length: 352 GiB (377549750272 bytes)
>     disk size: 314 GiB
>
> I've found all kinds of discussions of this type of thing online, but
> no explanation / solution that seems applicable to my situation.
>
> The image contains no snapshots:
>
> ~# qemu-img snapshot -l /var/lib/libvirt/images/win10.qcow2
> ~#
>
> I think TRIM / DISCARD is properly configured. From the VM XML:
>
> <disk type='file' device='disk'>
>       <driver name='qemu' type='qcow2' discard='unmap'/>
>       <source file='/var/lib/libvirt/images/win10.qcow2'/>
>       <target dev='vda' bus='virtio'/>
>       <address type='pci' domain='0x0000' bus='0x04' slot='0x00' function='0x0'/>
> </disk>
>
> TRIM / DISCARD is enabled in the Windows guest, and I've issued manual
> TRIM commands in the guest several times, like so:
>
> https://winaero.com/trim-ssd-windows-10/
>
> I think this did claw back some space, but only on the order of tens of GB.
>
> Can anyone explain what's going on here, and how I can fix this?
>
> --
> Celejar
>
