On 10/11/2011 11:49 AM, Toke Eskildsen wrote:
Inline or top-posting? Long discussion, but for mailing lists I clearly prefer the former.
Ditto. ;)
I have little experience with VM servers for search. Although we use a lot of virtual machines, we use dedicated machines for our searchers, primarily to ensure low latency for I/O. They might be fine for that too, but we haven't tried it yet. Glad to be of help, Toke
We've been running a production Solr installation for over a year (first 1.4.1 and now 3.2) on virtual machines using Xen, with CentOS 5 guests on CentOS 5 hosts. Each shard lives in its own virtual machine. We have a pair of virtual machines (on separate hardware) to act as search brokers. Another pair of VMs acts as a heartbeat/haproxy load balancer. Each physical machine hosts three of the six large shards that make up our index, and we have two copies of the index, requiring four physical hosts.
Now I'm doing a migration where we use the same physical hardware with multiple production cores rather than virtualization, upgrading Solr to 3.4 and CentOS to 6.0 with ext4 at the same time. The hosts that have been migrated have 32GB of RAM; the hosts that are still using Xen have 64GB. In neither case is there enough RAM for the index to fit fully in memory. Despite having less memory, the cores on the upgraded hosts are showing average query times 20-25% lower than the others. Before the migration, the hosts with less memory had higher average query times. I expect that when I get the larger hosts migrated, query times will drop yet again.
My opinion: Virtualization can be very effective, but you'll get better results without it. Going without it does require a more complex build system, because with several cores per host you can't assume every machine has cores with the same names. I also had to change Jetty's port number, because when your load balancer is running on the same OS as Solr, the two can't bind to the same port.
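To illustrate the core-naming point, here is a rough sketch of how a build script might assemble the value of Solr's `shards` request parameter when each host carries differently named cores. The host names, core names, port, and the `build_shards_param` helper are all made up for the example; this is not our actual build system.

```python
# Hypothetical layout: host names and core names are invented for illustration.
# Each physical host carries three differently named cores (one per shard).
LAYOUT = {
    "idx1.example.com": ["s0live", "s1live", "s2live"],
    "idx2.example.com": ["s3live", "s4live", "s5live"],
}


def build_shards_param(layout, port=8983, context="solr"):
    """Build a comma-separated value for Solr's `shards` request parameter.

    Each entry has the form host:port/context/core. The port and context
    path are assumptions; adjust them to match the servlet container setup.
    """
    entries = []
    for host in sorted(layout):
        for core in layout[host]:
            entries.append(f"{host}:{port}/{context}/{core}")
    return ",".join(entries)


if __name__ == "__main__":
    # A non-default port is passed here because, in this sketch, the default
    # is assumed to be taken by something else on the same OS.
    print(build_shards_param(LAYOUT, port=8981))
```

With one shard per VM, every entry could use the same core name and only the host would vary; with multiple cores per host, the mapping has to be kept somewhere explicit like the dictionary above.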
Thanks, Shawn