Hi,

We are currently using Solr 1.3 and have the following requirement.
Because we need to process very high volumes of documents (on the order of 400 GB per day), we are planning to separate the indexer(s) from the searcher(s) so that indexing does not hurt query performance. Our idea is to have one set of servers used only for index creation and then, every 5 minutes or so, copy the index to the searchers (a set of Solr servers used only for querying).

For this we tried snapshooter, rsync, etc. The problem with this approach is that the same full index is present on both the indexer and the searcher, and hence occupies a large amount of disk space on both.

What we need is a mechanism in which the indexer holds only the index for the past 5 minutes (the last indexing cycle before snapshooter is run) and the searcher holds the accumulated (total) index. That is, every 5 minutes we should be able to move the indexer's entire index to the searcher, and so on for each cycle.

This scenario is slightly different from the master/slave implementation: on the master we want only the latest (work-in-progress) index, while the slave should contain the entire index.

We would appreciate it if anyone could shed some light on how to achieve this.

Thanks,
sS