On Mon, Mar 22, 2021 at 03:06:40PM +0100, Mischa wrote:

> > On 22 Mar 2021, at 15:05, Dave Voutila <d...@sisu.io> wrote:
> > Otto Moerbeek writes:
> >> On Mon, Mar 22, 2021 at 09:51:19AM -0400, Dave Voutila wrote:
> >>> Otto Moerbeek writes:
> >>>> On Mon, Mar 22, 2021 at 01:47:18PM +0100, Mischa wrote:
> >>>>>> On 22 Mar 2021, at 13:43, Stuart Henderson <s...@spacehopper.org> 
> >>>>>> wrote:
> >>>>>> 
> >>>>>>>> Created a fresh install qcow2 image and derived 35 new VMs from it.
> >>>>>>>> Then I started all the VMs in four cycles, 10 VMs per cycle,
> >>>>>>>> waiting 240 seconds after each cycle.
> >>>>>>>> Similar to the staggered start based on the number of CPUs.
> >>>>>> 
> >>>>>>> For me this is not enough info to even try to reproduce; I know little
> >>>>>>> of vmm or vmd and have no idea what "derive" means in this context.
> >>>>>> 
> >>>>>> This is a big bit of information that was missing from the original
> >>>>> 
> >>>>> Well.. could have been better described indeed. :))
> >>>>> " I created 41 additional VMs based on a single qcow2 base image.”
> >>>>> 
> >>>>>> report ;) qcow has a concept of a read-only base image (or 'backing
> >>>>>> file') which can be shared between VMs, with writes diverted to a
> >>>>>> separate image ('derived image').
> >>>>>> 
> >>>>>> So e.g. you can create a base image, do a simple OS install for a
> >>>>>> particular OS version to that base image, then you stop using that
> >>>>>> for a VM and just use it as a base to create derived images from.
> >>>>>> You then run VMs using the derived image and make whatever config
> >>>>>> changes. If you have a bunch of VMs using the same OS release then
> >>>>>> you save some disk space for the common files.
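> >>>>>> 
> >>>>>> Roughly, with vmctl(8) (a sketch; the names and size here are
> >>>>>> made up):
> >>>>>> 
> >>>>>>   # create the base image and do a one-off OS install into it
> >>>>>>   vmctl create -s 20G base.qcow2
> >>>>>>   # derive per-VM images; writes land in these and the base
> >>>>>>   # stays untouched
> >>>>>>   vmctl create -b base.qcow2 vm1.qcow2
> >>>>>>   vmctl create -b base.qcow2 vm2.qcow2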
> >>>>>> 
> >>>>>> Mischa, did you leave a VM running which is working on the base
> >>>>>> image directly? That would certainly cause problems.
> >>>>> 
> >>>>> I did indeed. Let me try that again without keeping the base image 
> >>>>> running.
> >>>> 
> >>>> Right. As a safeguard, I would change the base image to be r/o.
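> >>>> 
> >>>> For example (assuming the image is base.qcow2):
> >>>> 
> >>>>   chmod a-w base.qcow2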
> >>> 
> >>> vmd(8) should be treating it r/o... the config process is responsible
> >>> for opening the disk files and passing the fds to the vm process. In
> >>> config.c, the call to open(2) for the base images should be using the
> >>> flags O_RDONLY | O_NONBLOCK.
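> >>> 
> >>> (If you want to verify this yourself, something along these lines
> >>> should work; the pid placeholder and VM name are just for
> >>> illustration:
> >>> 
> >>>   doas ktrace -di -p <pid of vmd>   # attach before starting the VM
> >>>   doas vmctl start <vm>
> >>>   doas kdump | less
> >>> )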
> >>> 
> >>> A ktrace on my system shows that's the case. Below, "new.qcow2" is a new
> >>> disk image I based off the "alpine.qcow2" image:
> >>> 
> >>> 20862 vmd      CALL  open(0x7f7ffffd4370,0x26<O_RDWR|O_NONBLOCK|O_EXLOCK>)
> >>> 20862 vmd      NAMI  "/home/dave/vm/new.qcow2"
> >>> 20862 vmd      RET   open 10/0xa
> >>> 20862 vmd      CALL  fstat(10,0x7f7ffffd42b8)
> >>> 20862 vmd      STRU  struct stat { dev=1051, ino=19531847, 
> >>> mode=-rw------- , nlink=1, uid=1000<"dave">, gid=1000<"dave">, 
> >>> rdev=78096304, atime=1616420730<"Mar 22 09:45:30 2021">.509011764, 
> >>> mtime=1616420697<"Mar 22 09:44:57 2021">.189185158, ctime=1616420697<"Mar 
> >>> 22 09:44:57 2021">.189185158, size=262144, blocks=256, blksize=32768, 
> >>> flags=0x0, gen=0xb64d5d98 }
> >>> 20862 vmd      RET   fstat 0
> >>> 20862 vmd      CALL  kbind(0x7f7ffffd39d8,24,0x2a9349e63ae9950c)
> >>> 20862 vmd      RET   kbind 0
> >>> 20862 vmd      CALL  pread(10,0x7f7ffffd42a8,0x68,0)
> >>> 20862 vmd      GIO   fd 10 read 104 bytes
> >>>        "QFI\M-{\0\0\0\^C\0\0\0\0\0\0\0h\0\0\0\f\0\0\0\^P\0\0\0\^E\0\0\0\0\0\0\
> >>>         \0\0\0\0\0(\0\0\0\0\0\^A\0\0\0\0\0\0\0\^B\0\0\0\0\0\^A\0\0\0\0\0\0\0\
> >>>         \0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\^D\0\
> >>>         \0\0h"
> >>> 20862 vmd      RET   pread 104/0x68
> >>> 20862 vmd      CALL  pread(10,0x7f7ffffd4770,0xc,0x68)
> >>> 20862 vmd      GIO   fd 10 read 12 bytes
> >>>       "alpine.qcow2"
> >>> 20862 vmd      RET   pread 12/0xc
> >>> 20862 vmd      CALL  kbind(0x7f7ffffd39d8,24,0x2a9349e63ae9950c)
> >>> 20862 vmd      RET   kbind 0
> >>> 20862 vmd      CALL  kbind(0x7f7ffffd39d8,24,0x2a9349e63ae9950c)
> >>> 20862 vmd      RET   kbind 0
> >>> 20862 vmd      CALL  __realpath(0x7f7ffffd3ea0,0x7f7ffffd3680)
> >>> 20862 vmd      NAMI  "/home/dave/vm/alpine.qcow2"
> >>> 20862 vmd      NAMI  "/home/dave/vm/alpine.qcow2"
> >>> 20862 vmd      RET   __realpath 0
> >>> 20862 vmd      CALL  open(0x7f7ffffd4370,0x4<O_RDONLY|O_NONBLOCK>)
> >>> 20862 vmd      NAMI  "/home/dave/vm/alpine.qcow2"
> >>> 20862 vmd      RET   open 11/0xb
> >>> 20862 vmd      CALL  fstat(11,0x7f7ffffd42b8)
> >>> 
> >>> 
> >>> I'm more familiar with the vmd(8) codebase than any ffs stuff, but I
> >>> don't think the issue is the base image being r/w.
> >>> 
> >>> -Dave
> >> 
> >> AFAICS, the issue is that if you start a vm that modifies the base
> >> because it uses it as a regular image, the r/o open done for the
> >> other vms does not matter much.
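> >> 
> >> (One way to check: point fstat(1) at the base image, e.g.
> >> 
> >>   fstat /home/dave/vm/alpine.qcow2
> >> 
> >> and see whether any vmd process has it open for writing.)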
> >> 
> >>    -Otto
> > 
> > Good point. I'm going to look into the feasibility of having the
> > control[1] process track what disks it has opened and in what mode, to
> > see if there's a way to build in some protection against this
> > happening.
> > 
> > [1] I mistakenly called it the "config" process earlier.
> 
> I guess that would help a lot of poor souls like myself to not make that 
> mistake again. :)
> 
> Mischa
> 

BTW, I was testing 40 1G VMs on a host with 24G, but some of the VMs
died on me when the machine started hitting swap. Is this known?

        -Otto
