On 12/14/2017 08:52 PM, Paolo Bonzini wrote:
> On 14/12/2017 15:09, Denis V. Lunev wrote:
>> Linux guests submit IO requests no longer than PAGE_SIZE * max_seg
>> field reported by SCSI controller. Thus typical sequential read with
>> 1 MB size results in the following pattern of the IO from the guest:
>>   8,16 1 15754 2.766095122 2071 D R 2095104 + 1008 [dd]
>>   8,16 1 15755 2.766108785 2071 D R 2096112 + 1008 [dd]
>>   8,16 1 15756 2.766113486 2071 D R 2097120 + 32 [dd]
>>   8,16 1 15757 2.767668961 0 C R 2095104 + 1008 [0]
>>   8,16 1 15758 2.768534315 0 C R 2096112 + 1008 [0]
>>   8,16 1 15759 2.768539782 0 C R 2097120 + 32 [0]
>> The IO was generated by
>>   dd if=/dev/sda of=/dev/null bs=1024 iflag=direct
>>
>> This effectively means that on rotational disks we will observe 3 IOPS
>> for each 2 MBs processed. This definitely negatively affects both
>> guest and host IO performance.
>>
>> The cure is relatively simple - we should report lengthy scatter-gather
>> ability of the SCSI controller. Fortunately the situation here is very
>> good. VirtIO transport layer can accommodate 1024 items in one request
>> while we are using only 128. This situation is present since almost
>> very beginning. 2 items are dedicated for request metadata thus we
>> should publish VIRTQUEUE_MAX_SIZE - 2 as max_seg.
>>
>> The following pattern is observed after the patch:
>>   8,16 1 9921 2.662721340 2063 D R 2095104 + 1024 [dd]
>>   8,16 1 9922 2.662737585 2063 D R 2096128 + 1024 [dd]
>>   8,16 1 9923 2.665188167 0 C R 2095104 + 1024 [0]
>>   8,16 1 9924 2.665198777 0 C R 2096128 + 1024 [0]
>> which is much better.
>>
>> The dark side of this patch is that we are tweaking guest visible
>> parameter, though this should be relatively safe as above transport
>> layer support is present in QEMU/host Linux for a very long time.
>> The patch adds configurable property for VirtIO SCSI with a new default
>> and hardcode option for VirtBlock which does not provide good
>> configurable framework.
> The patch is still missing compat properties (and 2.12 machine types).
> If you don't want to add the machine types, feel free to wait until
> someone else does it. :)
>
> Paolo
Sorry :( My fault. I will re-spin tomorrow.
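
For reference, the request sizes in the quoted traces follow from simple arithmetic on the segment limit. Below is a minimal Python sketch, not code from the patch. It assumes 4 KiB guest pages and 512-byte sectors, and split_request is a hypothetical helper written only for this illustration. The 1024-sector cap used for the "after" case is likewise an assumption chosen to match the trace; the thread does not say which limit produces it.

```python
SECTOR_SIZE = 512   # bytes per sector, as used in the blktrace output
PAGE_SIZE = 4096    # assumption: 4 KiB guest pages


def split_request(start_sector, nr_sectors, max_segments, max_sectors=None):
    """Split one read into chunks no larger than the segment limit allows.

    Hypothetical helper: models "PAGE_SIZE * max_seg" as the per-request
    size limit, optionally capped by a separate max_sectors limit.
    """
    per_request = max_segments * PAGE_SIZE // SECTOR_SIZE
    if max_sectors is not None:
        per_request = min(per_request, max_sectors)

    chunks = []
    sector = start_sector
    remaining = nr_sectors
    while remaining > 0:
        n = min(per_request, remaining)
        chunks.append((sector, n))
        sector += n
        remaining -= n
    return chunks


# Before the patch: 128 descriptors, 2 of them reserved for request metadata.
before = split_request(2095104, 2048, max_segments=128 - 2)

# After the patch: 1024 - 2 segments; the 1024-sector cap is an assumption.
after = split_request(2095104, 2048, max_segments=1024 - 2, max_sectors=1024)

print(before)
print(after)
```

With 126 usable segments the limit is 126 * 4096 / 512 = 1008 sectors, so a 2048-sector read splits into 1008 + 1008 + 32, which is the pattern in the first trace.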
