On Mon, Dec 13, 2010 at 10:24:51AM +0000, Stefan Hajnoczi wrote:
> On Sun, Dec 12, 2010 at 9:09 PM, Michael S. Tsirkin <m...@redhat.com> wrote:
> > On Sun, Dec 12, 2010 at 10:56:34PM +0200, Michael S. Tsirkin wrote:
> >> On Sun, Dec 12, 2010 at 10:42:28PM +0200, Michael S. Tsirkin wrote:
> >> > On Sun, Dec 12, 2010 at 10:41:28PM +0200, Michael S. Tsirkin wrote:
> >> > > On Sun, Dec 12, 2010 at 03:02:04PM +0000, Stefan Hajnoczi wrote:
> >> > > > See below for the v5 changelog.
> >> > > >
> >> > > > Due to lack of connectivity I am sending from GMail.  Git should retain
> >> > > > my stefa...@linux.vnet.ibm.com From address.
> >> > > >
> >> > > > Virtqueue notify is currently handled synchronously in userspace virtio.
> >> > > > This prevents the vcpu from executing guest code while hardware emulation
> >> > > > code handles the notify.
> >> > > >
> >> > > > On systems that support KVM, the ioeventfd mechanism can be used to make
> >> > > > virtqueue notify a lightweight exit by deferring hardware emulation to
> >> > > > the iothread and allowing the VM to continue execution.  This model is
> >> > > > similar to how vhost receives virtqueue notifies.
> >> > > >
> >> > > > The result of this change is improved performance for userspace virtio
> >> > > > devices.  Virtio-blk throughput increases especially for multithreaded
> >> > > > scenarios and virtio-net transmit throughput increases substantially.
> >> > >
> >> > > Interestingly, I see decreased throughput for small message
> >> > > host to guest netperf runs.
> >> > >
> >> > > The command that I used was:
> >> > > netperf -H $vguest -- -m 200
> >> > >
> >> > > And the results are:
> >> > > - with ioeventfd=off
> >> > > TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 11.0.0.104 (11.0.0.104) port 0 AF_INET : demo
> >> > > Recv   Send    Send                          Utilization       Service Demand
> >> > > Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
> >> > > Size   Size    Size     Time     Throughput  local    remote   local   remote
> >> > > bytes  bytes   bytes    secs.    10^6bits/s  % S      % S      us/KB   us/KB
> >> > >
> >> > >  87380  16384    200    10.00      3035.48   15.50    99.30    6.695   2.680
> >> > >
> >> > > - with ioeventfd=on
> >> > > TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 11.0.0.104 (11.0.0.104) port 0 AF_INET : demo
> >> > > Recv   Send    Send                          Utilization       Service Demand
> >> > > Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
> >> > > Size   Size    Size     Time     Throughput  local    remote   local   remote
> >> > > bytes  bytes   bytes    secs.    10^6bits/s  % S      % S      us/KB   us/KB
> >> > >
> >> > >  87380  16384    200    10.00      1770.95   18.16    51.65    13.442  2.389
> >> > >
> >> > > Do you see this behaviour too?
> >> >
> >> > Just a note: this is with the patchset ported to qemu-kvm.
> >>
> >> And just another note: the trend is reversed for large messages,
> >> e.g. with 1.5k messages ioeventfd=on outperforms ioeventfd=off.
> >
> > Another datapoint where I see a regression is with 4000 byte messages
> > for guest to host traffic.
> >
> > ioeventfd=off
> > set_up_server could not establish a listen endpoint for port 12865 with family AF_UNSPEC
> > TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 11.0.0.4 (11.0.0.4) port 0 AF_INET : demo
> > Recv   Send    Send                          Utilization       Service Demand
> > Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
> > Size   Size    Size     Time     Throughput  local    remote   local   remote
> > bytes  bytes   bytes    secs.    10^6bits/s  % S      % S      us/KB   us/KB
> >
> >  87380  16384   4000    10.00      7717.56   98.80    15.11    1.049   2.566
> >
> > ioeventfd=on
> > TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 11.0.0.4 (11.0.0.4) port 0 AF_INET : demo
> > Recv   Send    Send                          Utilization       Service Demand
> > Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
> > Size   Size    Size     Time     Throughput  local    remote   local   remote
> > bytes  bytes   bytes    secs.    10^6bits/s  % S      % S      us/KB   us/KB
> >
> >  87380  16384   4000    10.00      3965.86   87.69    15.29    1.811   5.055
>
> Interesting.  I posted the following results in an earlier version of
> this patch:
>
> "Sridhar Samudrala <s...@us.ibm.com> collected the following data for
> virtio-net with 2.6.36-rc1 on the host and 2.6.34 on the guest.
>
> Guest to Host TCP_STREAM throughput(Mb/sec)
> -------------------------------------------
> Msg Size   vhost-net   virtio-net   virtio-net/ioeventfd
> 65536      12755       6430         7590
> 16384       8499       3084         5764
>  4096       4723       1578         3659"
>
> Here we got a throughput improvement where you got a regression.  Your
> virtio-net ioeventfd=off throughput is much higher than what we got
> (different hardware and configuration, but still I didn't know that
> virtio-net reaches 7 Gbit/s!).
Which qemu are you running?  Mine is upstream qemu-kvm + your patches v4
+ my patch to port to qemu-kvm.  Are you testing qemu.git?

My cpu is Intel(R) Xeon(R) CPU X5560 @ 2.80GHz; I am running without any
special flags, so IIRC the kvm64 cpu type is emulated.  Should really try
+x2apic.

> I have focussed on the block side of things.  Any thoughts about the
> virtio-net performance we're seeing?
>
> "  1024       1827        981         2060

I tried 1.5k, I am getting about 3000 guest to host, but in my testing
I get about 2000 without ioeventfd as well.

> Host to Guest TCP_STREAM throughput(Mb/sec)
> -------------------------------------------
> Msg Size   vhost-net   virtio-net   virtio-net/ioeventfd
> 65536      11156       5790         5853
> 16384      10787       5575         5691
>  4096      10452       5556         4277
>  1024       4437       3671         5277
>
> Guest to Host TCP_RR latency(transactions/sec)
> ----------------------------------------------
>
> Msg Size   vhost-net   virtio-net   virtio-net/ioeventfd
>     1       9903       3459         3425
>  4096       7185       1931         1899
> 16384       6108       2102         1923
> 65536       3161       1610         1744"
>
> I'll also run the netperf tests you posted to check what I get.
>
> Stefan
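
For reference, a minimal sketch of what the ioeventfd mechanism described in
the cover letter above comes down to at the KVM API level.  This is not code
from the patchset; it only illustrates the KVM_IOEVENTFD ioctl that backs
virtio-ioeventfd.  The VM file descriptor, the guest I/O port of the device's
VIRTIO_PCI_QUEUE_NOTIFY register and the queue index are assumed to be
supplied by the caller, and error handling is trimmed.

/* Sketch: ask KVM to signal an eventfd when the guest writes a given
 * 16-bit queue index to the virtio notify port, instead of taking a
 * heavyweight exit to userspace. */
#include <stdint.h>
#include <string.h>
#include <unistd.h>
#include <sys/eventfd.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

static int assign_virtqueue_ioeventfd(int vmfd, uint64_t notify_addr,
                                      uint16_t queue_index)
{
    int efd = eventfd(0, 0);          /* signalled by KVM on the guest's write */
    if (efd < 0)
        return -1;

    struct kvm_ioeventfd ioev;
    memset(&ioev, 0, sizeof(ioev));
    ioev.addr      = notify_addr;     /* VIRTIO_PCI_QUEUE_NOTIFY I/O port */
    ioev.len       = 2;               /* guest writes a 16-bit queue index */
    ioev.datamatch = queue_index;     /* fire only for this virtqueue */
    ioev.fd        = efd;
    ioev.flags     = KVM_IOEVENTFD_FLAG_DATAMATCH | KVM_IOEVENTFD_FLAG_PIO;

    if (ioctl(vmfd, KVM_IOEVENTFD, &ioev) < 0) {
        close(efd);
        return -1;
    }
    return efd;
}

The iothread then polls the returned eventfd and services the virtqueue when
it fires, so the vcpu keeps running guest code while hardware emulation is
deferred; that deferral is what the ioeventfd=on numbers above are measuring.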