I actually started out using 3.4.9, but a recent tutorial recommended 
using 3.3.x instead since 3.4 "wasn't ready for production".  I'm fine with 
either, really!  I did read that in production ZK should use an odd number of 
servers (1, 3, 5, and so on) for redundancy, so that a chunk of your cloud 
doesn't go dead along with a single ZK server.  Three servers provide better 
coverage: if one dies, 2 of the 3 (66%) are still up, which is still a 
majority.  I'm doing this setup in Azure more as a proof of concept, and to 
figure out how in the world to get SolrCloud up and running reliably so we can 
talk about migrating over.
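
For what it's worth, the odd-number advice comes down to majority 
arithmetic.  A rough sketch, assuming ZooKeeper's usual majority quorum (the 
function name quorum_table is made up for illustration):

```shell
# Majority quorum arithmetic for an ensemble of n ZooKeeper servers.
quorum_table() {
  for n in 1 2 3 4 5; do
    majority=$(( n / 2 + 1 ))      # servers that must be up to keep a quorum
    tolerated=$(( n - majority ))  # servers that can fail without losing it
    echo "servers=$n quorum=$majority tolerated_failures=$tolerated"
  done
}

quorum_table
```

Going from 3 to 4 servers does not raise the number of failures the ensemble 
can survive, which is why even counts are usually skipped.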

I've definitely read over the 2 links you shared, and while I understand 
them, the lightbulb still hasn’t lit up in my head for that "ah ha!" 
moment.  ;-)

I plan to try and spin up some new VMs this weekend and start the process over 
again.  It's gotta work one of these times!

Thanks for the info!

-----Original Message-----
From: Shawn Heisey [mailto:apa...@elyograg.org] 
Sent: Friday, February 24, 2017 11:34 AM
To: solr-user@lucene.apache.org
Subject: Re: SOLRCloud on 6.4 on Ubuntu

On 2/23/2017 2:12 PM, Pouliot, Scott wrote:
> I'm trying to find a good beginner level guide to setting up SolrCloud NOT 
> using the example configs that are provided with SOLR.
>
> Here are my goals (and the steps I have done so far!):
>
> 1.       Use an external Zookeeper server
> a.       wget 
> http://apache.claz.org/zookeeper/zookeeper-3.3.6/zookeeper-3.3.6.tar.gz

Solr includes the 3.4.6 version of the Zookeeper client.  I would strongly 
recommend that the servers be running the latest 3.4.x version, currently 
3.4.9.  Although I cannot say for sure, it's entirely possible that Solr uses 
ZK client features that are not supported by an earlier server version.

I've omitted the rest of the zookeeper steps you mentioned.  They look fine, as 
long as the configuration is OK and the version is new enough.

Another bit of info:  You do know that Zookeeper requires three separate 
physical servers for a redundant install, I hope?  One or two servers is not 
enough.
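
As a rough illustration of a three-server ensemble, the sketch below writes a 
minimal zoo.cfg.  The hostnames zk1, zk2 and zk3, the output directory, and 
the data directory are all assumptions made for the example; the Zookeeper 
documentation is the authority on the actual settings.

```shell
# Sketch: generate a minimal three-server zoo.cfg.
# Hostnames (zk1..zk3) and both directories are placeholders, not real values.
ZK_CONF_DIR="${ZK_CONF_DIR:-./zk-conf-sketch}"
ZK_DATA_DIR="${ZK_DATA_DIR:-/var/lib/zookeeper}"
mkdir -p "$ZK_CONF_DIR"

cat > "$ZK_CONF_DIR/zoo.cfg" <<EOF
tickTime=2000
initLimit=10
syncLimit=5
dataDir=$ZK_DATA_DIR
clientPort=2181
server.1=zk1:2888:3888
server.2=zk2:2888:3888
server.3=zk3:2888:3888
EOF

# Each server also needs a "myid" file in its dataDir holding only its own
# number from the server.N lines above, e.g. on zk1:
#   echo 1 > "$ZK_DATA_DIR/myid"
```

The same zoo.cfg goes on all three machines; only the myid file differs.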

> 2.       Install SOLR on both nodes
> a.       wget http://www.us.apache.org/dist/lucene/solr/6.4.1/solr-6.4.1.tgz
> b.       tar xzf solr-6.4.1.tgz solr-6.4.1/bin/install_solr_service.sh 
> --strip-components=2
> c.       ./install_solr_service.sh solr-6.4.1.tgz
> d.       Update solr.in.sh to include the ZKHome variable set to my ZK 
> server's ip on port 2181
>
> Now it seems if I start SOLR manually with bin/solr start -c -p 8080 -z <ZK 
> IP>:2181 then it will actually load, but if I let it auto start, I get an 
> HTTP 500 error on the Admin UI for SOLR.

Again ... you need three ZK servers for redundancy, so the setting for -z needs 
to reference all three, and probably should have a chroot.  You can set all of 
those startup parameters by configuring variables in 
/etc/default/solr.in.sh instead of starting it manually.  The copy of solr.in.sh 
that's in the bin directory is NOT used when running as a service.
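
A minimal sketch of what those variables might look like.  The variable the 
include script reads for Zookeeper is ZK_HOST; the hostnames and the /solr 
chroot below are assumptions for the example, and port 8080 is taken from the 
manual start command quoted above.

```shell
# Fragment for /etc/default/solr.in.sh (hostnames and chroot are placeholders).
# Setting ZK_HOST makes the service start Solr in cloud mode.
ZK_HOST="zk1:2181,zk2:2181,zk3:2181/solr"
SOLR_PORT="8080"

# The chroot znode generally has to exist before Solr starts.  One way to
# create it, assuming the default /opt/solr install location:
#   /opt/solr/server/scripts/cloud-scripts/zkcli.sh \
#     -zkhost zk1:2181,zk2:2181,zk3:2181 -cmd makepath /solr
```

After editing the file, restarting the service (for example with "sudo 
service solr restart") should pick up the new values.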

> I also can't seem to figure out what I need to upload into Zookeeper as far 
> as configuration files go.  I created a test collection on the instance when 
> I got it up one time...but it has yet to start properly again for me.

Use the upconfig command with zkcli or the zk command on the solr script.  The 
directory you are uploading should contain everything in a core config that's 
normally in the "conf" directory -- solrconfig.xml, the schema, and any files 
referenced by either of those.
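
A hedged sketch of the upload using the zk command on the solr script.  The 
config name "myconf", the directory path, the Zookeeper hosts, and the 
/opt/solr location are all assumptions, and check_confdir is a made-up helper 
that only looks for the two files mentioned above.  Depending on the Solr 
version, -d may expect the parent of the "conf" directory rather than "conf" 
itself.

```shell
# Sketch: sanity-check a config directory, then upload it to Zookeeper.
# All paths, the name "myconf", and the ZK hosts are placeholders.
CONF_DIR="${CONF_DIR:-/path/to/myconf/conf}"
ZK_HOST="${ZK_HOST:-zk1:2181,zk2:2181,zk3:2181/solr}"

check_confdir() {
  # solrconfig.xml is required; the schema is managed-schema or schema.xml.
  [ -f "$1/solrconfig.xml" ] || { echo "missing solrconfig.xml in $1" >&2; return 1; }
  [ -f "$1/managed-schema" ] || [ -f "$1/schema.xml" ] || { echo "missing schema in $1" >&2; return 1; }
}

# Only attempt the upload when the directory looks like a core config.
if check_confdir "$CONF_DIR"; then
  /opt/solr/bin/solr zk upconfig -n myconf -d "$CONF_DIR" -z "$ZK_HOST"
fi
```

A collection created afterwards can then refer to the uploaded config by the 
name given with -n.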

> Are there any GOOD tutorials out there?  I have read most of the 
> documentation I can get my hands on thus far from Apache, and blogs 
> and such, but the light bulb still has not lit up for me yet and I 
> feel like a n00b  ;-)

There's a quick start.  This URL shows how to start a SolrCloud example where 
Zookeeper is embedded within one of the Solr nodes, and everything's on one 
machine.  This setup is not suitable for production.

http://lucene.apache.org/solr/quickstart.html

This is some more detailed info about migrating to production:

https://cwiki.apache.org/confluence/display/solr/Taking+Solr+to+Production

Information about setting up a redundant external Zookeeper is best obtained 
from the Zookeeper project.  They understand their software best.

> My company is currently running SOLR in the old master/slave config and I'm 
> trying to setup a SOLRCloud so that we can toy with it in a Dev/QA 
> Environment and see what it's capable of.  We're currently running 4 separate 
> master/slave SOLR server pairs in production to spread out the load a bit, 
> but I'd rather see us migrate towards a cluster/cloud scenario to gain some 
> computing power here!

What SolrCloud offers is much easier management and a true cluster with no 
masters and no slaves.  Depending on how the master-slave architecture is used, 
SolrCloud can actually be a step down in performance, but it is generally 
easier to get a redundant and sharded collection operational.  The possible 
performance disadvantage is not usually extreme, and exists because all 
replicas handle their own indexing, rather than having slaves that copy the 
completed index from the master.

Thanks,
Shawn
