On 07/04/2016 08:38 AM, Denis V. Lunev wrote: > From: Evgeny Yakovlev <[email protected]> > > Some guests (win2008 server for example) do a lot of unnecessary > flushing when underlying media has not changed. This adds additional > overhead on host when calling fsync/fdatasync. > > This change introduces a write generation scheme in BlockDriverState. > Current write generation is checked against last flushed generation to > avoid unnessesary flushes. > > The problem with excessive flushing was found by a performance test > which does parallel directory tree creation (from 2 processes). > Results improved from 0.424 loops/sec to 0.432 loops/sec. > Each loop creates 10^3 directories with 10 files in each. >
> +++ b/block/io.c
> @@ -1294,6 +1294,7 @@ static int coroutine_fn
> bdrv_aligned_pwritev(BlockDriverState *bs,
> }
> bdrv_debug_event(bs, BLKDBG_PWRITEV_DONE);
>
> + ++bs->write_gen;
Why pre-increment? Most code uses post-increment when done as a
statement in isolation.
> bdrv_set_dirty(bs, start_sector, end_sector - start_sector);
>
> if (bs->wr_highest_offset < offset + bytes) {
> @@ -2211,6 +2212,7 @@ int coroutine_fn bdrv_co_flush(BlockDriverState *bs)
> {
> int ret;
> BdrvTrackedRequest req;
> + int current_gen = bs->write_gen;
>
> if (!bs || !bdrv_is_inserted(bs) || bdrv_is_read_only(bs) ||
> bdrv_is_sg(bs)) {
> @@ -2219,6 +2221,12 @@ int coroutine_fn bdrv_co_flush(BlockDriverState *bs)
>
> tracked_request_begin(&req, bs, 0, 0, BDRV_TRACKED_FLUSH);
>
> + /* Wait until any previous flushes are completed */
> + while (bs->flush_started_gen != bs->flushed_gen) {
Should this be an inequality, as in s/!=/</, in case several flushes can
be started in parallel and where the later flush ends up finishing
before the earlier flush?
> + qemu_co_queue_wait(&bs->flush_queue);
> + }
> + bs->flush_started_gen = current_gen;
> +
> /* Write back all layers by calling one driver function */
> if (bs->drv->bdrv_co_flush) {
> ret = bs->drv->bdrv_co_flush(bs);
> @@ -2239,6 +2247,11 @@ int coroutine_fn bdrv_co_flush(BlockDriverState *bs)
> goto flush_parent;
> }
>
> + /* Check if we really need to flush anything */
> + if (bs->flushed_gen == current_gen) {
Likewise, if you are tracking generations, should this be s/==/<=/ (am I
getting the direction correct)?
> +++ b/include/block/block_int.h
> @@ -420,6 +420,11 @@ struct BlockDriverState {
> note this is a reference count */
> bool probed;
>
> + CoQueue flush_queue; /* Serializing flush queue */
> + unsigned int write_gen; /* Current data generation */
> + unsigned int flush_started_gen; /* Generation for which flush has
> started */
> + unsigned int flushed_gen; /* Flushed write generation */
Should these be 64-bit integers to avoid risk of overflow after just
2^32 flush attempts?
--
Eric Blake eblake redhat com +1-919-301-3266
Libvirt virtualization library http://libvirt.org
signature.asc
Description: OpenPGP digital signature
