On Mon, Feb 27, 2017 at 12:03:14PM +0100, Peter Lieven wrote:
> the convert process is currently completely implemented with sync operations.
> That means it reads one buffer and then writes it. No parallelism and each sync
> request takes as long as it takes until it is completed.
>
> This can be a big performance hit when the convert process reads and writes
> to devices which do not benefit from kernel readahead or pagecache.
> In our environment we heavily have the following two use cases when using
> qemu-img convert.
>
> a) reading from NFS and writing to iSCSI for deploying templates
> b) reading from iSCSI and writing to NFS for backups
>
> In both processes we use libiscsi and libnfs so we have no kernel cache.
>
> This patch changes the convert process to work with parallel running coroutines
> which can significantly improve performance for network storage devices:
>
> qemu-img (master)
>  nfs -> iscsi 22.8 secs
>  nfs -> ram   11.7 secs
>  ram -> iscsi 12.3 secs
>
> qemu-img-async (8 coroutines, in-order write disabled)
>  nfs -> iscsi 11.0 secs
>  nfs -> ram   10.4 secs
>  ram -> iscsi  9.0 secs
>
> This patch introduces 2 new cmdline parameters. The -m parameter to specify
> the number of coroutines running in parallel (defaults to 8). And the -W
> parameter to allow qemu-img to write to the target out of order rather than
> sequential. This improves performance as the writes do not have to wait for
> each other to complete.
>
> Signed-off-by: Peter Lieven <[email protected]>
> ---
> V1->V2: - do not calculate source partition globally [Kevin]
>         - don't use s->status outside the global lock [Kevin]
>         - remove accidentally left bracket in qemu-img.texi [Kevin]
>         - reworked -W paragraph in documentation [Stefan]
>
> RFC->V1: - add documentation
>          - add missing coroutine_fn annotation [Stefan]
>          - add a comment why it is safe to call coroutine_enter [Stefan]
>          - check -m parameter for values < 1 [Stefan]
>          - disallow -W parameter with compression [Stefan]
>
> RFC V3->V4: - avoid to prepare a request queue upfront [Kevin]
>             - do not ignore the BLK_BACKING_FILE status [Kevin]
>             - redesign the interface to the read and write routines [Kevin]
>
> RFC V2->V3: - updated stats in the commit msg from a host with a better
>               network card
>             - only wake up the coroutine that is actually waiting for a write
>               to complete. this was not only overhead, but also breaking at
>               least linux AIO.
>             - fix coding style complaints
>             - rename some variables and structs
>
> RFC V1->V2: - using coroutine as worker "threads". [Max]
>             - keeping the request queue as otherwise it happens
>               that we wait on BLK_ZERO chunks while keeping the write order.
>               it also avoids redundant calls to get_block_status and helps
>               to skip some conditions for fully allocated images
>               (!s->min_sparse)
>
> ---
>  qemu-img-cmds.hx |   4 +-
>  qemu-img.c       | 322 ++++++++++++++++++++++++++++++++++++++-----------
>  qemu-img.texi    |  16 ++-
>  3 files changed, 243 insertions(+), 99 deletions(-)
I haven't checked the locking issues that Kevin pointed out, but I'm
happy with the other aspects:

Reviewed-by: Stefan Hajnoczi <[email protected]>
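
For illustration, the difference between in-order and out-of-order writes
described in the quoted commit message can be sketched as below. This is a
minimal Python asyncio model, not the patch's C implementation: `convert`,
`num_coroutines` and `out_of_order` are hypothetical names that only loosely
mirror the -m and -W options, and each read is simulated with a sleep.

```python
import asyncio


async def convert(read_delays, num_coroutines=8, out_of_order=False):
    """Simulate copying chunks with parallel workers.

    read_delays[i] is the simulated read time of chunk i (an assumption
    made for this sketch). Returns the order in which chunks were written.
    """
    next_read = 0        # index of the next chunk a worker will claim
    next_write = 0       # index of the next chunk allowed to be written in order
    written = []         # records the actual write order
    cond = asyncio.Condition()

    async def worker():
        nonlocal next_read, next_write
        while next_read < len(read_delays):
            idx = next_read                        # claim one chunk
            next_read += 1
            await asyncio.sleep(read_delays[idx])  # simulated read

            if out_of_order:
                written.append(idx)                # write as soon as data is ready
            else:
                async with cond:
                    # wait until every earlier chunk has been written
                    await cond.wait_for(lambda: next_write == idx)
                    written.append(idx)
                    next_write += 1
                    cond.notify_all()

    await asyncio.gather(*(worker() for _ in range(num_coroutines)))
    return written


if __name__ == "__main__":
    delays = [0.05, 0.0, 0.02, 0.01]   # chunk 0 is the slowest to read
    print(asyncio.run(convert(delays, num_coroutines=4, out_of_order=False)))
    print(asyncio.run(convert(delays, num_coroutines=4, out_of_order=True)))
```

In the in-order case every worker that finishes early has to wait for the slow
first chunk, whereas with out-of-order writes it can write immediately, which
is the effect the commit message attributes to -W.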
