On 2/19/2016 3:40 PM, Brian Wright wrote:
> Without going into excessive detail on our design, I won't be able to
> sufficiently justify an answer to your question as to the why of it.
> Suffice it to say we plan to deploy this indexing for our entire
> customer base. Because of size these document collections and the way
> that they will grow over time, doubling up in machines is not feasible
> in our current infrastructure at this time. It may be justified later,
> but not today. It's less expensive to add more CPUs and RAM than
> doubling up on physical machines. Additionally, there are further
> budgetary constraints going into our international datacenters which
> prevents us from having identical clusters across the board, thus
> requiring doubling up. We're not talking about 2 or 3 machines here.
> We're talking 128 running instances of Solr with 64 clusters and many
> shards.

You will use fewer resources if you only run one Solr instance on each machine. You can still account for differences in hardware with one instance -- on the servers with more resources, configure a larger Java heap and run more indexes.

> However, that doesn't preclude the use of something like Docker or KVM
> to allow encapsulation of each Solr environment on a virtual machine
> which is hooked to a fast storage subsystem.

I started out with a Solr install using virtual machines on the free VMware ESXi. Because it was impossible to remotely monitor that system, I switched it to a Xen environment, where the hypervisor was controlled by a full Linux installation, but the cost was still zero. It also seemed to perform better than VMware.

Later I did an experiment where I set up the exact same hardware without virtualization. Before the change, each host was running several virtual machines, each with an install of Solr. After the change, the machine was running one install of Solr, handling all of the same indexes that were originally handled by the VMs. Performance was noticeably better, and administration got a LOT better. One OS install, one IP address, one TCP port, one hostname, one JVM, instead of four or five of each.

If the plan is to run SolrCloud, having only one Solr instance per physical machine will ensure that SolrCloud never places more than one replica for a shard on the same physical host, and it will do this without special configuration.

> I would also suggest that if the recommendation is not to run two
> instance side-by-side, then the documentation regarding how to set
> this up should be removed and a strong statement put in its place that
> running multiple Solr instances is not a supported configuration.
> Right now, the documentation does not state this and, in fact, implies
> that it is perfectly fine to run multiple instances side by side as
> long as independent disks are used to hold the instances.

The documentation is driven by what users ask for. A lot of users ask how to run multiple instances on one machine.

Your idea above would be my preference on how to handle the documentation. Or perhaps leave the instructions in there, but include a strong warning indicating that one instance will usually work better.

Each time I see somebody ask how to run multiple instances, I give them the same advice I gave you. It is often ignored.

Thanks,
Shawn