On 8/13/2013 4:47 PM, Torsten Albrecht wrote:
I have a Solr 3.6 infrastructure with 4 servers, each with 24 cores and
128GB (~15 shards on every server), and 70 million documents.
Now I am setting up a new Solr 4 infrastructure on the same hardware. I
reduced the number of shards and now have only 6.
But I don't understand the difference between SolrCloud and multi-server
load balancing. Is SolrCloud the better way (more performance)?
LoadBalancer -> solrcloud (4 Nodes)
LoadBalancer -> 4 solr server with the same shards
With SolrCloud, Solr automates a LOT of things and takes care of
redundancy for you. You can index to any core/shard in the entire cloud
and Solr takes care of routing the updates to the correct shard and all
of its replicas. You can also send queries to any core/shard in the
cloud and they will be automatically balanced across the cloud. If part
of your cloud goes down and you've designed it right, everything keeps
working, and the down machine will automatically be synchronized with
the cloud when it comes back up.
With traditional sharding, redundancy requires designating masters and
slaves and setting up replication. You can only index to masters, and
you have to figure out which shard to index to.
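To make "figure out which shard to index to" concrete, here is a minimal
sketch of the bookkeeping the client ends up doing itself without
SolrCloud. The class name, the host names, and the hash-modulo scheme are
all assumptions for illustration; the only Solr-specific piece is the
"shards" request parameter used for distributed queries.

```java
import java.util.Arrays;
import java.util.List;

// Sketch of the bookkeeping a client does itself with traditional sharding.
// All host names below are hypothetical.
final class ManualSharding {

    // Choose the shard (and so the master) that owns a document id.
    // Hash-modulo is an assumed scheme; any stable mapping would do.
    static int shardIndex(String docId, int numShards) {
        return Math.floorMod(docId.hashCode(), numShards);
    }

    // Build the value of the "shards" request parameter for a distributed
    // query: a comma-separated list of host:port/path entries.
    static String shardsParam(List<String> shardHosts) {
        return String.join(",", shardHosts);
    }

    public static void main(String[] args) {
        List<String> masters = Arrays.asList(
                "master1:8983/solr", "master2:8983/solr", "master3:8983/solr");
        List<String> slaves = Arrays.asList(
                "slave1:8983/solr", "slave2:8983/solr", "slave3:8983/solr");

        // Updates must go to the master of the owning shard.
        String docId = "doc-12345";
        String target = masters.get(shardIndex(docId, masters.size()));
        System.out.println("index " + docId + " via http://" + target + "/update");

        // Queries can go to the slaves, listing every shard explicitly.
        System.out.println("query with shards=" + shardsParam(slaves));
    }
}
```

Note that with a scheme like this, changing the number of shards changes
where existing ids map, so the shard list has to be kept in sync in every
client by hand.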
If all your client code is Java, you don't need a load balancer - the
CloudSolrServer object talks to ZooKeeper and figures out which nodes are
available in real time. You can continue to use a load balancer if you wish.
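As a rough picture of why the load balancer becomes optional, the sketch
below shows a client that keeps its own list of live nodes and rotates
requests across them. It is a conceptual stand-in only, not SolrJ code:
LiveNodeBalancer and its methods are hypothetical, and the nodeUp/nodeDown
calls stand in for the notifications a real client would get from
ZooKeeper. In actual client code you would construct a CloudSolrServer
with your ZooKeeper address and let it do this for you.

```java
import java.util.ArrayList;
import java.util.List;

// Conceptual sketch: a client-side view of which nodes are up, with
// round-robin selection. Hypothetical class, not part of SolrJ.
final class LiveNodeBalancer {
    private final List<String> liveNodes = new ArrayList<>();
    private int cursor = 0;

    // Stand-in for "ZooKeeper reports that this node is available".
    synchronized void nodeUp(String url) {
        if (!liveNodes.contains(url)) {
            liveNodes.add(url);
        }
    }

    // Stand-in for "ZooKeeper reports that this node went away".
    synchronized void nodeDown(String url) {
        liveNodes.remove(url);
    }

    // Return the next live node in rotation.
    synchronized String next() {
        if (liveNodes.isEmpty()) {
            throw new IllegalStateException("no live nodes");
        }
        cursor = cursor % liveNodes.size();
        String node = liveNodes.get(cursor);
        cursor = (cursor + 1) % liveNodes.size();
        return node;
    }

    public static void main(String[] args) {
        LiveNodeBalancer lb = new LiveNodeBalancer();
        lb.nodeUp("http://solr1:8983/solr"); // hypothetical node URLs
        lb.nodeUp("http://solr2:8983/solr");
        System.out.println(lb.next());
        System.out.println(lb.next());
        lb.nodeDown("http://solr1:8983/solr");
        System.out.println(lb.next()); // only solr2 is left
    }
}
```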
Thanks,
Shawn