On Thu, Sep 10, 2009 at 7:08 AM, Silent Surfer <silentsurfe...@yahoo.com> wrote:
> Hi,
>
> Currently we are using Solr 1.3 and we have the following requirement.
>
> As we need to process very high volumes of documents (of the order of
> 400 GB per day), we are planning to separate indexer(s) and searcher(s),
> so that there won't be a performance hit.
>
> Our idea is to have a set of servers which is used only for indexers,
> for index creation, and then every 5 mins or so the index will be copied
> to the searchers (a set of Solr servers used only for querying). For this
> we tried to use the snapshooter, rsync etc.
>
> But the problem with this approach is that the same index is present on
> both the indexer and the searcher, and hence occupies a large amount of
> FS space.
>

Set of servers used only for indexers? Solr replication currently supports only a single master. If you have a dedicated master, then why do you care about the index occupying too much disk space?

> What we need is a mechanism wherein the indexer contains only the index
> for the past 5 mins (the last indexing cycle before the snapshooter is
> run) and the searcher has the accumulated (total) index, i.e. every
> 5 mins we should be able to move the entire index from the indexer to
> the searcher, and so on.
>
> The above scenario is slightly different from the master/slave
> implementation, as on the master we want only the latest (WIP) index and
> the slave should contain the entire index.
>

If you commit but do not optimize, then rsync will transfer only the new segment files, which should be possible within 5 minutes. So I'd suggest optimizing less frequently (once or twice a day).

However, if for some reason you still want to go with your design, there is a new MergeIndexes feature in Solr 1.4 which can help (assuming that you have only additions or replacements and no deletes). However, that is not used by the Solr 1.4 Java replication. You may be able to modify the snappuller and snapinstaller scripts to use the merge indexes command though.
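
To make the merge indexes command a bit more concrete, here is a minimal Python sketch of how a modified script might build the CoreAdmin request described on the wiki page referenced in this mail. The host name, core name and index paths are hypothetical placeholders, and `build_merge_url` is a helper name made up for this illustration. It assumes a multi-core Solr 1.4 setup on the searcher and that each pulled 5-minute index sits in its own directory.

```python
from urllib.parse import urlencode


def build_merge_url(solr_base, target_core, index_dirs):
    """Build a CoreAdmin 'mergeindexes' URL (sketch for Solr 1.4+).

    solr_base   -- base URL of the searcher, e.g. http://searcher:8983/solr (hypothetical)
    target_core -- name of the core holding the accumulated index (hypothetical)
    index_dirs  -- directories of the freshly copied indexes to merge in
    """
    if not index_dirs:
        raise ValueError("at least one source index directory is required")
    # The indexDir parameter is repeated once per source index.
    params = [("action", "mergeindexes"), ("core", target_core)]
    params += [("indexDir", d) for d in index_dirs]
    return solr_base.rstrip("/") + "/admin/cores?" + urlencode(params)


if __name__ == "__main__":
    # Hypothetical host, core and paths, for illustration only.
    print(build_merge_url(
        "http://searcher:8983/solr",
        "core0",
        ["/data/snap1/index", "/data/snap2/index"],
    ))
```

The script would then issue an HTTP GET on that URL. As far as I recall, a commit on the target core is needed afterwards before the merged documents become searchable, so please check that against the wiki page.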
Something like that can also work with multiple servers creating indexes (again assuming no deletes are needed).

http://wiki.apache.org/solr/MergingSolrIndexes

-- 
Regards,
Shalin Shekhar Mangar.