On Tue, 21 Mar 2006, John Hearns wrote:
> On Tue, 2006-03-21 at 13:06 -0500, Joe Landman wrote:
> > Not sure of the performance impact of this, but you could look at
> > OpenVZ or Xen as well (when it is ready).
>
> Xen has very little impact on performance. I saw some very good figures
> at a recent presentation at FOSDEM.
Have you tried Xen yourself? With real-life applications? I think you'll
find that the earlier published numbers selected operations that
benchmarked favorably.

> I guess the biggest drawback for MPI type work would be the emulated NIC
> is pretty outdated. I think I remember Ian Pratt saying this will be
> changed.

Para-virtualization can be pretty efficient for computational work, but
not for communication or I/O. If it's emulating the NIC registers,
sending a bunch of small, latency-sensitive packets can be pretty
painful. What you need is more efficient communication with the
underlying OS or hardware. There are several approaches, but all have
obvious drawbacks:

  1. Enable direct access to the physical NIC hardware. This can be done
     with little overhead, but now you can only have one virtual machine
     on the physical machine, and you cannot migrate the VM. (This same
     limitation applies to local disks as well.)

  2. Emulate a real-life NIC, much as VMware emulates an AMD LANCE. This
     involves CPU overhead to mimic the hardware registers and bus
     transactions, as well as the quirks of the actual device.

  3. Emulate an ideal virtual NIC instead of a real-life one. You have
     to write a device driver for each OS, but you can make the host OS
     emulation simpler. (VMware emulates an old LANCE design to minimize
     complexity, but adds back some modern features in an
     easier-to-emulate way.)

A better approach is to recognize that para-virtualization involves
hacking the OS anyway. You can create a new NIC interface model that
allows, e.g., page flipping with the host OS to enable lower-overhead
communication. But now you are touching more than the device driver:
you are reaching into the buffer and memory management of the OS.

The storage interface has some of the same issues. It does have the
advantages of dealing with whole blocks, not being as latency-sensitive,
and allowing buffering/read-ahead/write-behind. But the guest OS
inconveniently expects that it has exclusive access to blocks that will
still be there later ;->.

Many of these same issues remain even when we have VT or Pacifica.
Unless the underlying devices are designed with virtualization in mind,
and both the host OS and guest OS know how to handle that specific
hardware, there will be run-time overhead for virtual machines.

Hmmm, I almost drifted into the topic of "perhaps there is a better
layer to virtualize at". Some of the list readers know where that one
ends up. Instead I'll keep the bottom line on-subject: virtualizing at
the machine level inherently has overhead, and it's still pretty
noticeable with the current implementations.
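To put a rough shape on the register-trap point above, here is a minimal
sketch of what full register-level NIC emulation costs per packet. The
device model, register layout, and names (vnic_state, emulate_reg_write)
are invented for illustration; the point is only that every trapped
register access is a full guest-to-host world switch, and even one small
packet needs several of them:

    /* Minimal sketch of register-level NIC emulation overhead.
     * All names and the register layout are hypothetical. */
    #include <stdint.h>
    #include <stdio.h>

    #define REG_TX_ADDR_LO  0x00   /* guest programs DMA address low  */
    #define REG_TX_ADDR_HI  0x04   /* ... and high word               */
    #define REG_TX_LEN      0x08   /* packet length in bytes          */
    #define REG_TX_START    0x0c   /* writing 1 kicks the transmit    */

    struct vnic_state {
        uint64_t tx_addr;
        uint32_t tx_len;
        unsigned long traps;       /* guest->host transitions         */
    };

    /* Called by the host on every trapped guest register write.  Each
     * call stands for a full exit: save guest state, decode the access,
     * emulate it, resume the guest. */
    static void emulate_reg_write(struct vnic_state *s, uint32_t reg,
                                  uint32_t val)
    {
        s->traps++;
        switch (reg) {
        case REG_TX_ADDR_LO:
            s->tx_addr = (s->tx_addr & ~0xffffffffULL) | val;
            break;
        case REG_TX_ADDR_HI:
            s->tx_addr = (s->tx_addr & 0xffffffffULL) |
                         ((uint64_t)val << 32);
            break;
        case REG_TX_LEN:
            s->tx_len = val;
            break;
        case REG_TX_START:
            /* here the host would DMA s->tx_len bytes and send them */
            break;
        }
    }

    int main(void)
    {
        struct vnic_state s = { 0 };
        /* 1000 small MPI-style messages: four trapped register writes
         * per packet, before counting interrupt delivery. */
        for (int i = 0; i < 1000; i++) {
            emulate_reg_write(&s, REG_TX_ADDR_LO, 0x1000);
            emulate_reg_write(&s, REG_TX_ADDR_HI, 0);
            emulate_reg_write(&s, REG_TX_LEN, 64);
            emulate_reg_write(&s, REG_TX_START, 1);
        }
        printf("%lu guest->host traps for 1000 packets\n", s.traps);
        return 0;
    }

Interrupt delivery on completion adds yet another transition per packet,
which is why small, latency-sensitive traffic suffers the most.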
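By contrast, here is a sketch of the para-virtual direction mentioned
above: a descriptor ring shared between guest and host, with page
references in the descriptors and a single "kick" only when the host is
idle. This illustrates the idea, not Xen's actual netfront/netback
interface; the ring layout and hypercall_notify are made up, and the
stub stands in for a real hypercall:

    #include <stdint.h>
    #include <stdio.h>

    #define RING_SIZE 256u             /* power of two */

    struct tx_desc {
        uint64_t guest_page;           /* page granted/flipped to host */
        uint32_t offset;
        uint32_t len;
    };

    struct shared_ring {
        volatile uint32_t prod;        /* written by the guest */
        volatile uint32_t cons;        /* written by the host  */
        struct tx_desc desc[RING_SIZE];
    };

    /* Stub standing in for a real hypercall/event-channel kick. */
    static void hypercall_notify(int channel)
    {
        printf("kick host on channel %d\n", channel);
    }

    /* Guest side: queue a packet, and notify the host only when it was
     * idle, so a burst of small messages costs one transition. */
    static int pv_nic_send(struct shared_ring *ring, int channel,
                           uint64_t page, uint32_t off, uint32_t len)
    {
        uint32_t prod = ring->prod;

        if (prod - ring->cons == RING_SIZE)
            return -1;                 /* ring full; caller retries    */

        ring->desc[prod % RING_SIZE] = (struct tx_desc){ page, off, len };
        __sync_synchronize();          /* publish descriptor, then index */
        ring->prod = prod + 1;

        if (prod == ring->cons)        /* host had drained the ring    */
            hypercall_notify(channel);
        return 0;
    }

    int main(void)
    {
        static struct shared_ring ring;
        /* Four small packets, one kick: compare with four register
         * traps per packet in the previous sketch. */
        for (int i = 0; i < 4; i++)
            pv_nic_send(&ring, 1, 0x1000 + i, 0, 64);
        printf("prod=%u cons=%u\n", ring.prod, ring.cons);
        return 0;
    }

The design point is that batching amortizes the guest-to-host transition
cost, which is exactly what per-register trapping cannot do. The price,
as noted above, is that the guest's buffer and memory management must
cooperate with the host, so you are modifying more than a driver.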
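And for the storage side, a toy read-ahead cache shows why whole-block,
latency-tolerant requests virtualize more gracefully: one guest-to-host
round trip can service many sequential reads. All names here (vblk,
vblk_read) are hypothetical, and an in-memory array stands in for the
host's disk:

    #include <stdint.h>
    #include <string.h>
    #include <stdio.h>

    #define BLOCK_SIZE 512
    #define READAHEAD  8           /* extra blocks fetched per miss */
    #define NBLOCKS    1024

    struct vblk {
        uint8_t backing[NBLOCKS][BLOCK_SIZE]; /* stand-in host disk  */
        uint8_t cache[READAHEAD + 1][BLOCK_SIZE];
        uint64_t cache_start;
        int cache_valid;           /* number of cached blocks         */
        unsigned long host_ios;    /* guest->host round trips         */
    };

    static void vblk_read(struct vblk *d, uint64_t block, void *buf)
    {
        if (!d->cache_valid || block < d->cache_start ||
            block >= d->cache_start + d->cache_valid) {
            /* miss: one host I/O fetches the block plus read-ahead */
            uint64_t n = READAHEAD + 1;
            if (block + n > NBLOCKS)
                n = NBLOCKS - block;
            memcpy(d->cache, d->backing[block], n * BLOCK_SIZE);
            d->cache_start = block;
            d->cache_valid = (int)n;
            d->host_ios++;
        }
        memcpy(buf, d->cache[block - d->cache_start], BLOCK_SIZE);
    }

    int main(void)
    {
        static struct vblk d;
        uint8_t buf[BLOCK_SIZE];
        for (uint64_t b = 0; b < 64; b++)  /* sequential scan */
            vblk_read(&d, b, buf);
        printf("64 blocks read with %lu host I/Os\n", d.host_ios);
        return 0;
    }

Of course this only works while the guest's assumption of exclusive,
stable block ownership holds, which is exactly the caveat above.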
-- 
Donald Becker                           [EMAIL PROTECTED]
Scyld Software                          Scyld Beowulf cluster systems
914 Bay Ridge Road, Suite 220           www.scyld.com
Annapolis MD 21403                      410-990-9993