Query regarding incremental index replication

Silent Surfer Wed, 09 Sep 2009 18:39:05 -0700

Hi,

Currently we are using Solr 1.3 and we have the following requirement.

As we need to process very high volumes of documents (on the order of 400 GB
per day), we are planning to separate the indexer(s) and searcher(s), so that
indexing does not cause a performance hit on queries.

Our idea is to have a set of servers used only as indexers for index
creation, and then every 5 minutes or so the index will be copied to the
searchers (a set of Solr servers used only for querying). For this we tried
using snapshooter, rsync, etc.

But the problem with this approach is that the same index is present on both
the indexer and the searcher, and hence takes up a large amount of disk space.

What we need is a mechanism in which the indexer contains only the index for
the past 5 minutes (the last indexing cycle before snapshooter is run) and the
searcher holds the accumulated (total) index, i.e. every 5 minutes we should
be able to move the entire index from the indexer to the searcher, and so on.
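
To make the intended cycle concrete, here is a minimal file-level sketch in
Python. It is only an illustration of the idea, not Solr code: the function
name ship_batch, the directory layout and the batch-NNNNNN naming are all
hypothetical, and merging the shipped batches into the searcher's live index
is deliberately not shown.

```python
import shutil
import tempfile
from pathlib import Path


def ship_batch(indexer_dir: Path, searcher_dir: Path, cycle: int) -> Path:
    """Move everything produced in the last cycle to the searcher.

    After this call the indexer directory is empty, so the indexer only
    ever holds one cycle's worth of data. (Hypothetical helper, not a
    Solr or Lucene API.)
    """
    dest = searcher_dir / f"batch-{cycle:06d}"
    dest.mkdir(parents=True, exist_ok=False)
    for item in sorted(indexer_dir.iterdir()):
        # Move rather than copy, so nothing is stored twice.
        shutil.move(str(item), str(dest / item.name))
    return dest


def demo():
    """Simulate three indexing cycles in a temporary directory."""
    root = Path(tempfile.mkdtemp())
    indexer = root / "indexer"
    searcher = root / "searcher"
    indexer.mkdir()
    searcher.mkdir()
    for cycle in range(3):
        # Stand-in for 5 minutes of indexing work.
        (indexer / f"segment_{cycle}.dat").write_text(f"docs from cycle {cycle}")
        ship_batch(indexer, searcher, cycle)
    remaining = len(list(indexer.iterdir()))   # files left on the indexer
    batches = len(list(searcher.iterdir()))    # batches accumulated on the searcher
    shutil.rmtree(root)
    return remaining, batches
```

Calling demo() runs three cycles and returns how many files remain on the
indexer and how many batches have accumulated on the searcher.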

The above scenario is slightly different from the master/slave implementation,
as on the master we want only the latest (work-in-progress) index, while the
slave should contain the entire index.

I would appreciate it if anyone could shed some light on how to achieve this.

Thanks,
sS
