Hello,

Our development team is currently looking into migrating our search system
to Apache Solr, and we would greatly appreciate some advice on setup. We are
indexing approximately two hundred million database rows. We add about a
hundred thousand new rows throughout the day. These new database rows must
be searchable within two minutes of their receipt.

We don't want the indexing to bog down the searcher, so our thought is to
have two Solr servers running on different machines in a replication setup.
The first Solr instance (the master) will be the indexer. It will use the
DataImportHandler to index the delta and have autocommit enabled to prevent
overzealous commit rates. Index optimization will take place during
scheduled periods. The second Solr instance (the slave) will be the primary
searcher and will have its indexes stored on RAIDed solid state drives.
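
As a rough sanity check on the two-minute requirement, the worst-case delay
in this layout is the sum of the waits at each stage a row passes through.
The sketch below only adds them up. The function names and all interval
values are placeholders for illustration, not settings we have chosen or
numbers we have measured.

```python
# Worst-case time from a row arriving in the database to it being
# searchable on the slave. All values below are hypothetical placeholders.

def worst_case_latency_s(import_interval_s, autocommit_max_time_s,
                         poll_interval_s, transfer_s):
    """Sum the worst-case wait at each stage of the pipeline."""
    return (import_interval_s        # row waits for the next delta-import
            + autocommit_max_time_s  # document waits for the autocommit
            + poll_interval_s        # slave waits for its next poll
            + transfer_s)            # time to copy the changed index files


def meets_deadline(latency_s, deadline_s=120):
    """True if the worst case fits inside the two-minute window."""
    return latency_s <= deadline_s


budget = worst_case_latency_s(import_interval_s=30,
                              autocommit_max_time_s=30,
                              poll_interval_s=20,
                              transfer_s=10)
print(budget, meets_deadline(budget))
```

This ignores the running time of the delta-import itself and any searcher
warm-up on the slave, so the real figure would be somewhat higher.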

What we are concerned about is failover. Our searches are mission-critical.
If the primary searcher goes down for whatever reason, our search service
will automatically shunt queries over to the indexer node instead. Indexing
is equally critical, though. If the indexer dies, we need to have a warm
failover standing by. Is there a recommended way to automate master node
failover in Solr replication? I've begun looking into ZooKeeper, but I
wasn't sure if this was the best approach.
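
The query-side half of this, where the search service falls back to the
indexer when the searcher is down, seems simple enough to handle in the
client. Here is a minimal sketch of what we have in mind. The host names,
the timeout, and the function names are made up for illustration, and it
does not address the master failover question above.

```python
import urllib.request

# Hypothetical host names, in order of preference (searcher first).
SOLR_NODES = [
    "http://searcher.example.com:8983/solr",
    "http://indexer.example.com:8983/solr",
]


def http_get(url, timeout_s=2.0):
    """Fetch a URL and return the response body as bytes."""
    with urllib.request.urlopen(url, timeout=timeout_s) as resp:
        return resp.read()


def query_with_failover(path_and_query, nodes=SOLR_NODES, fetch=http_get):
    """Try each node in order and return the first successful response."""
    last_error = None
    for base in nodes:
        try:
            return fetch(base + path_and_query)
        except OSError as err:
            # URLError, refused connections and socket timeouts are all
            # OSError subclasses; remember the error and try the next node.
            last_error = err
    raise RuntimeError("all Solr nodes failed") from last_error
```

A call would look like query_with_failover("/select?q=*:*"). The fetch
argument exists only so the fallback logic can be exercised without a
live Solr instance.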
