There is only one index. New records, and deletes of old records, are written to it as newer "segments" (roughly speaking). Incremental replication copies only the new segments; putting the new segments together with the previous index on the slave makes the new index.
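As a rough illustration of why only the new segments need to travel, here is a minimal Python sketch of the file-level comparison that an rsync-style sync amounts to. It is not Solr's or rsync's actual code: the function name `plan_incremental_copy`, the file names, and the sizes are all made up for the example, and it assumes segment files are never modified once written, so a name plus a size is enough to recognize a file the slave already has.

```python
def plan_incremental_copy(master, slave):
    """Decide which index files to ship and which to drop.

    master, slave: dict mapping file name -> size in bytes
    (hypothetical listings of the two index directories).
    """
    # Copy anything the slave lacks, or holds with a different size.
    to_copy = sorted(n for n, size in master.items() if slave.get(n) != size)
    # Drop files the master no longer references (e.g. merged-away segments).
    to_delete = sorted(n for n in slave if n not in master)
    return to_copy, to_delete


# Hypothetical directory listings: the master has one new segment
# and a new commit point since the last sync.
master = {"_0.cfs": 500, "_1.cfs": 900, "_2.cfs": 120, "segments_3": 40}
slave = {"_0.cfs": 500, "_1.cfs": 900, "segments_2": 40}

to_copy, to_delete = plan_incremental_copy(master, slave)
print("copy:", to_copy)
print("delete:", to_delete)
```

The two large, unchanged segments are never re-sent, which is what makes a 5-minute replication cycle cheap. An optimize on the master rewrites everything into one new segment, so the next sync after an optimize ships the whole index.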
Incremental replication under rsync does work; perhaps it did not work for you. If you do not want to store the full index on the indexer, that is a problem: you will not be able to optimize the index on the indexer and ship the optimized index to the slaves.

This article has more on large-volume Solr installation design:

http://www.lucidimagination.com/Community/Hear-from-the-Experts/Articles/Scaling-Lucene-and-Solr

On 9/9/09, Silent Surfer <silentsurfe...@yahoo.com> wrote:
>
> Hi,
>
> Currently we are using Solr 1.3 and we have the following requirement.
>
> As we need to process very high volumes of documents (of the order of 400
> GB per day), we are planning to separate indexer(s) and searcher(s), so that
> there won't be a performance hit.
>
> Our idea is to have a set of servers which is used only for indexers
> for index creation and then every 5 mins or so, the index will be copied to
> the searchers (set of Solr servers only for querying). For this we tried to
> use the snapshooter, rsync, etc.
>
> But the problem with this approach is, the same index is present on both
> the indexer and searcher, and hence occupying large FS.
>
> What we need is a mechanism wherein the indexer contains only the index
> for the past 5 mins (last indexing cycle before the snapshooter is run) and
> the searcher should have the accumulated (total) index, i.e. every 5 mins, we
> should be able to move the entire index from indexer to searcher and so on.
>
> The above scenario is slightly different from master/slave implementation,
> as on master we want only the latest (WIP) index and the slave should contain
> the entire index.
>
> Appreciate if anyone can throw some light on how to achieve this.
>
> Thanks,
> sS
>

-- 
Lance Norskog
goks...@gmail.com