On 12/19/2013 2:35 AM, hariprasadh89 wrote:
> We have set up SolrCloud as follows.
> On one machine:
> 1. CentOS 6.3
> 2. Apache Solr 4.1
> 3. JBoss AS 7.1.1.Final
> 4. ZooKeeper
> Let's set up the ZooKeeper ensemble on 2 machines.
> 
> Download and untar ZooKeeper into the /opt/zookeeper directory on both
> servers, solr1 and solr2. On both servers, do the following:
> 
> root@solr1$ mkdir /opt/zookeeper/data
> root@solr1$ cp /opt/zookeeper/conf/zoo_sample.cfg /opt/zookeeper/conf/zoo.cfg
> root@solr1$ vim /opt/zookeeper/conf/zoo.cfg
> Make the following changes in the zoo.cfg file:
> 
> dataDir=/opt/zookeeper/data
> server.1=solr1:2888:3888
> server.2=solr2:2888:3888
> Save the zoo.cfg file.
> Assign different ids to the ZooKeeper servers.
> 
> On solr1:
> 
> root@solr1$ echo 1 > /opt/zookeeper/data/myid
> 
> On solr2:
> 
> root@solr2$ echo 2 > /opt/zookeeper/data/myid
> 
> Start ZooKeeper on both servers:
> 
> root@solr1$ cd /opt/zookeeper
> root@solr1$ ./bin/zkServer.sh start
> Note: in the future, when you need to reset the cluster/shard information,
> do the following
> 
> 4. RAM: 2GB
> 5. Heap size set to 1GB
> Extracted solr.war and changed the Solr home in Solr's web.xml.
> In the bin folder of JBoss, the JAVA_OPTS parameter has been set in
> standalone.conf:
> java -DzkHost=solr1:2181,solr2:2181 -Dbootstrap_confdir=solr/corename/conf/
> -DnumShards=2
> 
> Restart JBoss.
> 
> On the other machine:
> 
> 1. CentOS 6.3
> 2. Apache Solr 4.1
> 3. JBoss AS 7.1.1.Final
> 4. RAM: 2GB
> 5. Heap size set to 1GB
> Extracted solr.war and changed the Solr home in Solr's web.xml.
> In the bin folder of JBoss, the JAVA_OPTS parameter has been set in
> standalone.conf.
> 
> Restart JBoss.
> 
> Everything has been done properly, but uploading data into Solr takes
> too long. It takes longer than uploading data to a single Solr instance
> without sharding.
> We can see the two shards in the SolrCloud option of the UI.
> Please explain how a larger index is split and allocated across the two
> shards.
> 
> Please suggest some optimization techniques.

First problem, but not likely the cause of your complaint -- a replicated
ZooKeeper ensemble requires three hosts minimum.  If you only have two,
they both have to be up, or quorum is lost.  ZooKeeper requires a majority
of hosts, (n/2)+1 with integer division, to be active to maintain quorum.
With three or four hosts, one may be down.  With five or six hosts, two
may be down.  You need to add another host for ZooKeeper.  It does not
need to be a powerful host, because SolrCloud typically will NOT put much
load on ZooKeeper.
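
As a rough sketch of that majority rule, here is the arithmetic in shell.
The function names are just mine for illustration; nothing here is
provided by ZooKeeper itself.

```shell
#!/bin/sh
# quorum_size N: how many of N ZooKeeper hosts must be up to keep quorum.
# Shell arithmetic is integer-only, so N / 2 rounds down.
quorum_size() {
    echo $(( $1 / 2 + 1 ))
}

# tolerated_failures N: how many of N hosts may be down at once.
tolerated_failures() {
    echo $(( $1 - ($1 / 2 + 1) ))
}

for n in 2 3 4 5 6; do
    echo "hosts=$n quorum=$(quorum_size "$n") may_be_down=$(tolerated_failures "$n")"
done
```

With two hosts the last column is 0, which is why a two-host ensemble
gives you no fault tolerance at all.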

A 1GB heap is pretty small in the Solr world.  With 2GB total RAM and a
1GB heap, less than 1GB is left for the OS disk cache, so if the index
size on each server is bigger than about 1.5-2GB, performance will be
terrible.  My production Solr servers have over 45GB of index each, so I
have 64GB of RAM on each server.  6GB of that goes to Solr's java heap.
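
To put numbers on that, here is a small shell sketch that works out how
much RAM is left over once the java heap is subtracted.  The function name
is made up for this example, and the result is only an upper bound on what
the OS can use for disk cache, since the OS and the JVM's own overhead
take some of it too.

```shell
#!/bin/sh
# cache_headroom_mb TOTAL_MB HEAP_MB
# RAM (in MB) remaining after the Java heap. Upper bound only: the OS
# itself and non-heap JVM memory also come out of this number.
cache_headroom_mb() {
    echo $(( $1 - $2 ))
}

# Your servers: 2GB RAM, 1GB heap.
echo "2GB box:  $(cache_headroom_mb 2048 1024) MB left for disk cache"

# My servers: 64GB RAM, 6GB heap, a bit over 45GB (46080 MB) of index.
echo "64GB box: $(cache_headroom_mb 65536 6144) MB left for disk cache"
```

On the 64GB machine the leftover is larger than the whole index.  On the
2GB machine it is smaller than even a modest one.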

http://wiki.apache.org/solr/SolrPerformanceProblems

SolrCloud amplifies existing performance problems because each indexing
request must be forwarded to the leader of a shard, which then forwards
it to all replicas of that shard.  If there are no underlying
performance issues, SolrCloud will usually index almost as fast as
standard unsharded Solr.

Also, Solr 4.1.0 is very old at this point, released back in January of
this year.  SolrCloud was new in 4.0, and has been evolving very quickly
through new releases.  Version 4.6.0 is the latest, released on December
2nd.  You can see the release history back to 4.0 here:

http://lucene.apache.org/solr/solrnews.html

One final thing -- Solr works best in the jetty that's included
with the download in the example directory.  With a more complex
container like Tomcat or JBoss, memory requirements will be elevated a
little bit, which when working in a tight 2GB memory space, can make a
significant difference.
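
If you want to try the included jetty, something like the sketch below
would build the startup command.  It assumes you run the result from the
example directory of the Solr download, reuses the zkHost value and 1GB
heap from your message, and only prints the command so you can review it
before running it.  The function and variable names are my own.

```shell
#!/bin/sh
# Assumed values, copied from the setup described in the original message.
ZK_HOSTS="solr1:2181,solr2:2181"
HEAP="1g"

# solr_jetty_cmd: print (do not run) the java command line that starts
# Solr 4.x in its bundled Jetty with the given heap and ZooKeeper hosts.
solr_jetty_cmd() {
    printf 'java -Xms%s -Xmx%s -DzkHost=%s -jar start.jar\n' \
        "$HEAP" "$HEAP" "$ZK_HOSTS"
}

solr_jetty_cmd
```

Once a third ZooKeeper host exists, it would need to be added to the
zkHost list as well.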

Thanks,
Shawn
