Re: [Qemu-devel] [PATCH v2 0/3] linux-aio: reduce completion latency

Stefan Hajnoczi Wed, 20 Jul 2016 04:36:58 -0700

On Tue, Jul 19, 2016 at 02:27:40PM +0200, Roman Pen wrote:
> v2:
>  o For the third patch do not introduce extra member for LinuxAioState
>    structure, reuse ret == -EINPROGRESS.
> 
>  o Add an explicit comment which explains why we do not hang if requests
>    are still pending.
> 
> 
> This series is intended to reduce completion latency through two changes:
> 
> 1. QEMU does not use any timeout value for harvesting completed AIO
>    requests from the ring buffer, thus io_getevents() can be implemented
>    in userspace (first patch).
> 
> 2. In order to reduce completion latency it makes sense to harvest completed
>    requests ASAP.  A very fast backend device can complete requests just after
>    submission, so it is worth checking the ring buffer and picking up completed
>    requests directly after io_submit() has been called (third patch).
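
For readers following along, here is a minimal sketch of the idea in point 1:
when no timeout is needed, completed events can be copied out of a shared ring
without entering the kernel. This is not the code from the patch. The struct
and function names (sketch_event, sketch_ring, sketch_harvest) are
hypothetical, the layout is simplified, and the __atomic builtins assume GCC or
Clang. The real kernel ring also carries header fields, such as a magic number,
that a consumer should validate first; those are omitted here.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified stand-in for a completion event (hypothetical layout). */
struct sketch_event {
    uint64_t data;  /* caller's cookie for the request */
    int64_t  res;   /* result of the request */
};

/* Simplified stand-in for the shared completion ring (hypothetical layout). */
struct sketch_ring {
    unsigned nr;    /* number of slots in events[] */
    unsigned head;  /* next slot the consumer reads */
    unsigned tail;  /* next slot the producer writes */
    struct sketch_event *events;
};

/*
 * Copy up to 'max' completed events into 'out' without a system call.
 * Returns the number of events harvested, or 0 if the ring is empty.
 */
static unsigned sketch_harvest(struct sketch_ring *ring,
                               struct sketch_event *out, unsigned max)
{
    assert(ring->nr > 0);

    unsigned head = ring->head;
    /* Acquire: event contents written before 'tail' must be visible. */
    unsigned tail = __atomic_load_n(&ring->tail, __ATOMIC_ACQUIRE);
    unsigned n = 0;

    while (head != tail && n < max) {
        out[n++] = ring->events[head];
        head = (head + 1) % ring->nr;   /* wrap around at the end */
    }

    /* Release: hand the consumed slots back to the producer. */
    __atomic_store_n(&ring->head, head, __ATOMIC_RELEASE);
    return n;
}
```

Under the same assumptions, point 2 amounts to calling such a harvest function
right after submission instead of waiting for the completion notification.
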
> 
> Indeed, the series reduces completion latencies and increases the
> overall throughput, e.g. the following are the percentiles of the number of
> requests completed at once:
> 
>         1st 10th  20th  30th  40th  50th  60th  70th  80th  90th  99.99th
> Before    2    4    42   112   128   128   128   128   128   128    128
>  After    1    1     4    14    33    45    47    48    50    51    108
> 
> This means that, before the third patch is applied, the ring buffer is
> observed as full (128 requests were consumed at once) in 60% of calls.
> 
> After the third patch is applied the distribution of number of completed
> requests is "smoother" and the queue (requests in-flight) is almost never
> full.
> 
> The fio read results are the following (write results are almost the
> same and are not shown here):
> 
>   Before
>   ------
> job: (groupid=0, jobs=8): err= 0: pid=2227: Tue Jul 19 11:29:50 2016
>   Description  : [Emulation of Storage Server Access Pattern]
>   read : io=54681MB, bw=1822.7MB/s, iops=179779, runt= 30001msec
>     slat (usec): min=172, max=16883, avg=338.35, stdev=109.66
>     clat (usec): min=1, max=21977, avg=1051.45, stdev=299.29
>      lat (usec): min=317, max=22521, avg=1389.83, stdev=300.73
>     clat percentiles (usec):
>      |  1.00th=[  346],  5.00th=[  596], 10.00th=[  708], 20.00th=[  852],
>      | 30.00th=[  932], 40.00th=[  996], 50.00th=[ 1048], 60.00th=[ 1112],
>      | 70.00th=[ 1176], 80.00th=[ 1256], 90.00th=[ 1384], 95.00th=[ 1496],
>      | 99.00th=[ 1800], 99.50th=[ 1928], 99.90th=[ 2320], 99.95th=[ 2672],
>      | 99.99th=[ 4704]
>     bw (KB  /s): min=205229, max=553181, per=12.50%, avg=233278.26, stdev=18383.51
> 
>   After
>   ------
> job: (groupid=0, jobs=8): err= 0: pid=2220: Tue Jul 19 11:31:51 2016
>   Description  : [Emulation of Storage Server Access Pattern]
>   read : io=57637MB, bw=1921.2MB/s, iops=189529, runt= 30002msec
>     slat (usec): min=169, max=20636, avg=329.61, stdev=124.18
>     clat (usec): min=2, max=19592, avg=988.78, stdev=251.04
>      lat (usec): min=381, max=21067, avg=1318.42, stdev=243.58
>     clat percentiles (usec):
>      |  1.00th=[  310],  5.00th=[  580], 10.00th=[  748], 20.00th=[  876],
>      | 30.00th=[  908], 40.00th=[  948], 50.00th=[ 1012], 60.00th=[ 1064],
>      | 70.00th=[ 1080], 80.00th=[ 1128], 90.00th=[ 1224], 95.00th=[ 1288],
>      | 99.00th=[ 1496], 99.50th=[ 1608], 99.90th=[ 1960], 99.95th=[ 2256],
>      | 99.99th=[ 5408]
>     bw (KB  /s): min=212149, max=390160, per=12.49%, avg=245746.04, stdev=11606.75
> 
> Throughput increased from 1822MB/s to 1921MB/s, average completion latencies
> decreased from 1051us to 988us.
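
To put the fio numbers above in relative terms, a small helper like the
following can be used. It is only an illustrative sketch; the name pct_change
is hypothetical and not part of the series.

```c
#include <assert.h>

/* Relative change from 'before' to 'after', in percent. */
static double pct_change(double before, double after)
{
    assert(before != 0.0);
    return (after - before) / before * 100.0;
}
```

With the values reported by fio, pct_change(1822.7, 1921.2) is about +5.4
(bandwidth in MB/s), pct_change(179779, 189529) is about +5.4 (IOPS), and
pct_change(1051.45, 988.78) is about -6.0 (average clat in usec).
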
> 
> Roman Pen (3):
>   linux-aio: consume events in userspace instead of calling io_getevents
>   linux-aio: split processing events function
>   linux-aio: process completions from ioq_submit()
> 
>  block/linux-aio.c | 178 ++++++++++++++++++++++++++++++++++++++++++------------
>  1 file changed, 141 insertions(+), 37 deletions(-)
> 
> Signed-off-by: Roman Pen <[email protected]>
> Cc: Stefan Hajnoczi <[email protected]>
> Cc: Paolo Bonzini <[email protected]>
> Cc: [email protected]


Thanks, applied to my block-next tree for QEMU 2.8:
https://github.com/stefanha/qemu/commits/block-next

Stefan
