buz.hr...@seznam.cz wrote:
> Hi Debian people ;-),
>
> After having some issues with Fedora last year I decided to reinstall all my
> servers to Debian 10. I'm supper happy with Debian except one repeating issue
> I have with QEMU-KVM hosts that is very difficult to reproduce so I would
> like to discuss it first before I open a new bug. Could you please discuss it
> with me? ;-)
>
> I noticed that when I run VMs for a long period of time (a couple of days)
> one or multiple VMs quite often stuck. It is not possible to connect the
> stuck VMs using virt-manager and their serial consoles don't respond.

First question: when they are just a few minutes old, does the serial console
work?

> It is not possible to shut them down ("virsh shutdown vm"). Sometimes the
> stuck VMs can be powered down ("virsh destroy vm") but in most cases "virsh
> destroy" doesn't work. In that case the only thing to do is to shut down rest
> of running VMs (that do respond) and reboot the host.

Second question: when the VMs are a few minutes old, does "virsh shutdown"
work?

> When I reboot/shutdown the host the reboot/shutdown takes approx. 30min.
>
> This is how it looks like during the reboot / shutdown:
> ~~~
> [ ***] (1 of 4) A stop job is running for /dev/dm-1 (18min 6s / no limit)

You probably want to change that "no limit" timeout to 1 minute or so.

> As I mentioned it is very difficult to reproduce it since it takes days to
> get into that situation. VMs that are more likely to get stuck are VMs that:
>
> a) have larger virtual disks
> b) more intensive storage use (use more IOPs)
> c) have more vCPUs
>
> The problem is that VMs with larger disks usually use more IOPs and also have
> more vCPUs so it is difficult to say what exactly is the issue. Based on my
> testing I thing that less vCPUs makes it less likely to get stuck but it's
> difficult to say...
>
> The only thing I'm confident is that the problem is not HW related - it
> happened both on my SuperMicro with XEON E5 v2 and on other hardware with
> Intel i7 7th gen.

Are the VMs set up to match the local hardware definition, or to be fully
emulated?

And, especially: if they are not using virtio for their disk and network
devices, try that ASAP.

> Btw. this has never happened on my laptop that has same configuration as the
> server (+Desktop Env.) but I reboot it multiple time a week so that might be
> an answer...

Not so much an answer as an explanation why you haven't seen it, but, sure,
that's plausible.

-dsr-
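
P.S. On the virtio question: one rough way to check is to save the output of
"virsh dumpxml vm" to a file and look at the disk bus and the NIC model. The
sketch below is only an illustration under my own assumptions. The function
name non_virtio_devices, the SAMPLE definition and the script layout are made
up for this example and are not part of libvirt. It relies on the usual
libvirt domain XML layout, where a disk carries <target bus='...'/> and an
interface carries <model type='...'/>. Note that a disk attached to a
virtio-scsi controller shows bus='scsi' and would be flagged here even though
it is paravirtualized, so treat the output as a hint, not a verdict.

```python
import sys
import xml.etree.ElementTree as ET

# Hypothetical sample domain definition, used only when no file is given.
SAMPLE = """
<domain type='kvm'>
  <name>vm</name>
  <devices>
    <disk type='file' device='disk'>
      <target dev='sda' bus='sata'/>
    </disk>
    <disk type='file' device='disk'>
      <target dev='vda' bus='virtio'/>
    </disk>
    <disk type='file' device='cdrom'>
      <target dev='hda' bus='ide'/>
    </disk>
    <interface type='network'>
      <model type='e1000'/>
    </interface>
  </devices>
</domain>
"""


def non_virtio_devices(domain_xml):
    """Return a list describing hard disks and NICs that are not virtio."""
    root = ET.fromstring(domain_xml)
    findings = []

    # Hard disks: the bus is an attribute of the <target> element.
    for disk in root.findall("./devices/disk"):
        if disk.get("device") != "disk":
            continue  # ignore cdrom and floppy devices
        target = disk.find("target")
        bus = target.get("bus") if target is not None else None
        dev = target.get("dev") if target is not None else "?"
        if bus != "virtio":
            findings.append(f"disk {dev}: bus={bus}")

    # Network interfaces: the model is given by <model type='...'/>.
    for nic in root.findall("./devices/interface"):
        model = nic.find("model")
        model_type = model.get("type") if model is not None else None
        if model_type != "virtio":
            findings.append(f"interface: model={model_type}")

    return findings


if __name__ == "__main__":
    # Usage: python3 check_virtio.py vm.xml   (vm.xml from "virsh dumpxml vm")
    if len(sys.argv) > 1:
        with open(sys.argv[1], encoding="utf-8") as handle:
            xml_text = handle.read()
    else:
        xml_text = SAMPLE
    for line in non_virtio_devices(xml_text):
        print(line)
```

Anything it prints is a device worth switching to virtio before the next
multi-day test run.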