There is only one index. It is made up of "segments"; the newer segments
hold new records and deletes against old records (sort of). Incremental
replication copies only the new segments; the new segments together with
the previous index make up the new index.
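
As a rough illustration of why copying only the new segments is enough,
here is a minimal Python sketch. It is not how snappuller or rsync is
implemented. It assumes index files are write-once, so a file name already
present on the slave never needs to be copied again (real tools such as
rsync also compare sizes and timestamps). The names plan_sync and
sync_index, and the file names in the usage note, are made up for the
example.

```python
import shutil
from pathlib import Path


def plan_sync(master_files, slave_files):
    """Return (to_copy, to_delete) so the slave ends up with the master's files.

    Assumption for this sketch: files are write-once, so comparing names
    alone is enough to decide what has changed.
    """
    master_files, slave_files = set(master_files), set(slave_files)
    to_copy = sorted(master_files - slave_files)    # new segments on the master
    to_delete = sorted(slave_files - master_files)  # files merged away or superseded
    return to_copy, to_delete


def sync_index(master_dir, slave_dir):
    """Copy only the new files from master_dir into slave_dir, then drop stale ones."""
    master_dir, slave_dir = Path(master_dir), Path(slave_dir)
    slave_dir.mkdir(parents=True, exist_ok=True)
    to_copy, to_delete = plan_sync(
        (p.name for p in master_dir.iterdir() if p.is_file()),
        (p.name for p in slave_dir.iterdir() if p.is_file()),
    )
    for name in to_copy:
        shutil.copy2(master_dir / name, slave_dir / name)
    for name in to_delete:
        (slave_dir / name).unlink()
    return to_copy, to_delete
```

For example, if the master holds _0.cfs, _1.cfs and segments_3 while the
slave holds _0.cfs and segments_2, plan_sync says to copy _1.cfs and
segments_3 and to delete segments_2. The large, unchanged _0.cfs is never
transferred again.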

Incremental replication under rsync does work; perhaps it did not work for
you.

If you do not want to store the full index on the indexer, that is a
problem. Optimizing merges all of the segments, so it needs the whole index
in one place; you will not be able to optimize the index on the indexer and
ship the new index to the slaves.

This article has more on designing large-volume Solr installations:

http://www.lucidimagination.com/Community/Hear-from-the-Experts/Articles/Scaling-Lucene-and-Solr

On 9/9/09, Silent Surfer <silentsurfe...@yahoo.com> wrote:
>
> Hi ,
>
> Currently we are using Solr 1.3 and we have the following requirement.
>
> As we need to process very high volumes of documents (on the order of 400
> GB per day), we are planning to separate indexer(s) and searcher(s), so that
> there won't be a performance hit.
>
> Our idea is to have a set of servers which are used only as indexers
> for index creation, and then every 5 mins or so the index will be copied to
> the searchers (a set of Solr servers used only for querying). For this we
> tried to use snapshooter, rsync, etc.
>
> But the problem with this approach is that the same index is present on both
> the indexer and the searcher, and hence occupies a lot of file system space.
>
> What we need is a mechanism wherein the indexer contains only the index
> for the past 5 mins (the last indexing cycle before snapshooter is run) and
> the searcher has the accumulated (total) index, i.e., every 5 mins we
> should be able to move the entire index from the indexer to the searcher,
> and so on.
>
> The above scenario is slightly different from a master/slave implementation,
> as on the master we want only the latest (WIP) index, while the slave should
> contain the entire index.
>
> I would appreciate it if anyone could shed some light on how to achieve this.
>
> Thanks,
> sS
>


-- 
Lance Norskog
goks...@gmail.com
