Bernd: You rarely have to worry about who the leader is unless and until you get to many hundreds of shards. The extra work a leader does is usually minimal, and spending time trying to control where the leaders live is usually time wasted. Leaders will shift from replica to replica anyway. Say your leader for shard1 is on instance1, shard1_replica1. Then you shut instance1 down. The leader will shift to some other replica in the shard, say shard1_replica4.
If you insist, you can use the Collections API commands BALANCESHARDUNIQUE and REBALANCELEADERS. The former assigns a "preferredLeader" role to one replica of each shard, and the latter tries to make those replicas the actual leaders. If you really want to go all-out, you can use ADDREPLICAPROP to make the replica of your choice the preferredLeader. But this is generally a waste of time and energy. Those capabilities were added for a case where hundreds of leaders wound up in the same JVM and the performance impact was noticeable. And even if you do assign the preferredLeader role, that is just a hint, not a requirement: the collection will tend to have the specified replicas be the leaders, but only "tend".

Best,
Erick

On Tue, May 9, 2017 at 5:35 AM, Shawn Heisey <apa...@elyograg.org> wrote:
> On 5/9/2017 1:44 AM, Bernd Fehling wrote:
>> From my point of view it is a good solution to have 5 virtual 64GB
>> servers on 5 different huge physical machines and start 2 instances on
>> each virtual server.
>
> If the total amount of memory in the virtual machine is 64GB, then I
> would run one Solr node on it with a heap size between 8 and 16GB. The
> rest of the memory in the virtual machine would then be available to
> cache whatever index data exists. That caching is extremely important
> for good performance.
>
> If the *heap* size is what would be 64GB (and you actually do need that
> much heap), then it *does* make sense to split that into two instances,
> each with a 31GB heap. I would argue that it's better to have those two
> instances on separate machines.
>
> Assuming that you have a bare-metal server with 256GB of RAM, you would
> *not* want to divide that up into five virtual machines each with 64GB.
> The physical host would not have enough memory for all five virtual
> machines. It would have the option of using its disk space as extra
> memory, but as soon as you start swapping memory to disk, the
> performance of ANY software becomes unacceptable.
> Solr in particular requires actual real memory. Oversubscribing memory
> on VMs might work for some workloads, but it won't work for Solr.
>
> If all your virtual machines are running on the same physical host, then
> you have no redundancy. Modern servers have redundant power supplies,
> redundant hard drives, and other kinds of fault tolerance. Even so,
> there are many components in a server that have no redundancy, like the
> motherboard or the backplane. If one of those components were to die,
> all of the virtual machines would go down.
>
> Thanks,
> Shawn
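[Editor's note: the preferredLeader workflow Erick describes maps to the v1 Collections API over HTTP. A minimal sketch follows; the host, port, collection name (mycollection), and replica name (core_node4) are placeholders to adjust for your cluster.]

```shell
# Base URL of the Collections API on one node of the cluster (placeholder host/port).
SOLR="http://localhost:8983/solr/admin/collections"

# 1) BALANCESHARDUNIQUE: assign the "preferredLeader" property to exactly
#    one replica of each shard, spread across the cluster.
curl "$SOLR?action=BALANCESHARDUNIQUE&collection=mycollection&property=preferredLeader"

# 2) REBALANCELEADERS: try to make the preferredLeader replicas the
#    actual leaders. This is a best effort, not a guarantee.
curl "$SOLR?action=REBALANCELEADERS&collection=mycollection"

# Alternative to step 1: pin a specific replica of your choice with
# ADDREPLICAPROP instead of letting Solr spread the property.
curl "$SOLR?action=ADDREPLICAPROP&collection=mycollection&shard=shard1&replica=core_node4&property=preferredLeader&property.value=true"
```

These are administrative commands that require a live SolrCloud cluster; run step 2 again after topology changes if you want leaders pulled back to the preferred replicas.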
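[Editor's note: Shawn's sizing advice corresponds to the `-m` (heap size) flag of the `bin/solr` start script; the ports and heap sizes below are illustrative. The "just under 32GB" figure is not arbitrary: below roughly 32GB the JVM can use compressed ordinary object pointers, which a 64GB heap would forfeit.]

```shell
# One Solr node per 64GB VM with a modest heap, leaving the remaining
# RAM for the OS page cache over the index files:
bin/solr start -cloud -p 8983 -m 12g

# If you genuinely need ~64GB of total heap, run two instances with
# heaps just under 32GB each so both JVMs keep compressed object
# pointers -- ideally on separate machines, as Shawn suggests:
bin/solr start -cloud -p 8983 -m 31g   # on host A
bin/solr start -cloud -p 8983 -m 31g   # on host B
```

Oversubscribing the physical host defeats this layout: once the hypervisor pushes guest memory to disk, the page cache that Solr depends on is gone.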