First, many thanks for the review!

> It is customary to send a 0/6 cover letter for details like this, rather than
> slamming it into the first patch (git send-email --cover-letter).
> Remember, once it is in git, it is no longer as easy to identify where a
> series starts and ends, so the contents of the cover letter is not essential
> to git history, just to reviewers.
>
> >
> > The file docs/backup-rfc.txt contains more details.
>
> While naming the file *-rfc is fine for an RFC patch series, it better not be
> the final name that you actually want committed.
>
> >
> > Changes since v1:
> >
> > * fix spelling errors
> > * move BackupInfo from BDS to BackupBlockJob
> > * introduce BackupDriver to allow more than one backup format
> > * vma: add suport to store vmstate (size is not known in advance)
> > * add ability to store VM state
> >
> > Changes since v2:
> >
> > * BackupDriver: remove cancel_cb
> > * use enum for BackupFormat
> > * vma: use bdrv_open instead of bdrv_file_open
> > * vma: fix aio, use O_DIRECT
> > * backup one drive after another (try to avoid high load)
>
> Also, it is customary to list series revision history after the --- separator;
> again, something useful for reviewers, but pointless in the actual git
> history.
OK, I will send a cover-letter next time.

> > Signed-off-by: Dietmar Maurer <[email protected]>
> > ---
> >  docs/backup-rfc.txt | 119 +++++++++++++++++++++++++++++++++++++++++++++++++++
> >  1 files changed, 119 insertions(+), 0 deletions(-)
> >  create mode 100644 docs/backup-rfc.txt
> >
> > diff --git a/docs/backup-rfc.txt b/docs/backup-rfc.txt
> > new file mode 100644
> > index 0000000..5b4b3df
> > --- /dev/null
> > +++ b/docs/backup-rfc.txt
> > @@ -0,0 +1,119 @@
> > +RFC: Efficient VM backup for qemu
>
> You already have RFC in the subject line; you don't need it here in your
> proposed contents.

OK

> > +
> > +That basically means that any data written during backup involve
> > +considerable overhead. For LVM we get the following steps:
> > +
> > +1.) read original data (VM write)
>
> Shouldn't that be '(VM read)'?

No, that 'read' is triggered by the VM write.

> > +2.) write original data into snapshot (VM write)
> > +3.) write new data (VM write)
> > +4.) read data from snapshot (backup)
> > +5.) write data from snapshot into tar file (backup)
> > +
> > +Another approach to backup VM images is to create a new qcow2 image
> > +which use the old image as base. During backup, writes are redirected
> > +to the new image, so the old image represents a 'snapshot'. After
> > +backup, data need to be copied back from new image into the old one
> > +(commit). So a simple write during backup triggers the following
> > +steps:
> > +
> > +1.) write new data to new image (VM write)
> > +2.) read data from old image (backup)
> > +3.) write data from old image into tar file (backup)
> > +
> > +4.) read data from new image (commit)
> > +5.) write data to old image (commit)
> > +
> > +This is in fact the same overhead as before. Other tools like qemu
> > +livebackup produces similar overhead (2 reads, 3 writes).
> > +
> > +Some storage types/formats supports internal snapshots using some
> > +kind of reference counting (rados, sheepdog, dm-thin, qcow2). It
> > +would be possible to use that for backups, but for now we want to be
> > +storage-independent.
> > +
> > +Note: It turned out that taking a qcow2 snapshot can take a very long
> > +time on larger files.
>
> That's an independent issue, and there have been patches proposed to try
> and reduce that time.

Will remove that comment.

> > +
> > +=Make it more efficient=
> > +
> > +The be more efficient, we simply need to avoid unnecessary steps. The
> > +following steps are always required:
> > +
> > +1.) read old data before it gets overwritten
> > +2.) write that data into the backup archive
> > +3.) write new data (VM write)
> > +
> > +As you can see, this involves only one read, an two writes.
>
> s/an/and/
>
> > +
> > +To make that work, our backup archive need to be able to store image
> > +data 'out of order'. It is important to notice that this will not
> > +work with traditional archive formats like tar.
>
> Are you also requiring that the output file descriptor be seekable?

No, it works with pipes (like tar).
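To make the copy-before-write idea above concrete, here is a small Python sketch. This is illustrative only, not qemu code; the class and method names (`CopyBeforeWriteBackup`, `guest_write`, and so on) are invented for this example, and a `bytearray` stands in for the disk image:

```python
# Sketch of the copy-before-write backup idea: old data is read and
# archived once per cluster *before* the new guest data overwrites it.
# Illustrative Python only; all names here are invented, not qemu's.

CLUSTER_SIZE = 4096

class CopyBeforeWriteBackup:
    def __init__(self, disk, archive):
        self.disk = disk          # mutable bytearray standing in for the image
        self.archive = archive    # list of (cluster_index, bytes) records
        self.saved = set()        # clusters already written to the archive

    def backup_cluster(self, idx):
        if idx in self.saved:     # each cluster is archived at most once
            return
        start = idx * CLUSTER_SIZE
        old = bytes(self.disk[start:start + CLUSTER_SIZE])  # 1.) read old data
        self.archive.append((idx, old))                     # 2.) write to archive
        self.saved.add(idx)

    def guest_write(self, offset, data):
        # Intercept the guest write: archive every touched cluster first.
        first = offset // CLUSTER_SIZE
        last = (offset + len(data) - 1) // CLUSTER_SIZE
        for idx in range(first, last + 1):
            self.backup_cluster(idx)
        self.disk[offset:offset + len(data)] = data         # 3.) write new data

    def run_background_copy(self):
        # The backup job walks the whole image; clusters already saved by
        # intercepted guest writes are skipped, so each is read only once.
        for idx in range(len(self.disk) // CLUSTER_SIZE):
            self.backup_cluster(idx)
```

Note how a cluster hit by a guest write ends up in the archive ahead of its neighbours, which is exactly why the archive format must accept data 'out of order', and why the total cost stays at one read and two writes per cluster.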
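To illustrate why 'out of order' storage also answers the seekability question: if every record carries its own cluster index, the archive stream is written strictly sequentially and can go down a pipe, just like tar. This is a hedged sketch of the idea only; the record layout here is made up and the real vma format differs:

```python
# Sketch of a pipe-friendly, out-of-order archive record stream.
# The on-wire layout here is invented for illustration; the actual
# vma format is different in detail.
import io
import struct

HEADER = struct.Struct("<QI")   # cluster index (u64), payload length (u32)

def write_record(stream, cluster_idx, data):
    # Records are appended strictly sequentially, so the output stream
    # never needs to seek -- a pipe works fine as the destination.
    stream.write(HEADER.pack(cluster_idx, len(data)))
    stream.write(data)

def restore(stream, image_size, cluster_size=4096):
    # Restore needs a seekable (or at least random-access) target image,
    # but the archive itself may deliver clusters in any order.
    image = bytearray(image_size)
    while True:
        hdr = stream.read(HEADER.size)
        if len(hdr) < HEADER.size:
            break
        idx, length = HEADER.unpack(hdr)
        data = stream.read(length)
        image[idx * cluster_size:idx * cluster_size + length] = data
    return image
```

The point is that ordering lives in the records, not in the stream position, so the writer side stays purely sequential.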
