I would make one *collection* per date range, then create one or more collection aliases that span the collections you want to query.
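As a rough sketch of that idea, the snippet below builds Collections API request URLs for one collection per date range plus an alias spanning them. The host, the collection names, and the alias name are assumptions for illustration only, and a real CREATE call will likely need further parameters for your setup (config name, replication factor, and so on). It only builds and prints the URLs; sending them is left to you.

```python
from urllib.parse import urlencode

# Assumed base URL of any node in the SolrCloud cluster (hypothetical).
SOLR_BASE = "http://localhost:8983/solr"


def collections_api_url(action, **params):
    """Build a Collections API URL for the given action and parameters."""
    query = {"action": action, **params}
    # Keep commas literal so a collection list reads as "a,b,c".
    return f"{SOLR_BASE}/admin/collections?" + urlencode(query, safe=",")


def create_collection_url(name, num_shards=1):
    """URL that creates one collection, e.g. one per date range."""
    return collections_api_url("CREATE", name=name, numShards=num_shards)


def create_alias_url(alias, collections):
    """URL that creates an alias spanning several collections."""
    return collections_api_url(
        "CREATEALIAS", name=alias, collections=",".join(collections)
    )


if __name__ == "__main__":
    # Hypothetical naming scheme: one collection per month.
    monthly = ["events_2013_10", "events_2013_11", "events_2013_12"]
    for name in monthly:
        print(create_collection_url(name))
    # Queries sent to the alias are distributed across all listed collections.
    print(create_alias_url("events_q4_2013", monthly))
```

Queries would then go to the alias name instead of an individual collection. As far as I know, issuing CREATEALIAS again with the same alias name replaces the existing alias, so pointing it at a different set of collections is a single call.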
http://wiki.apache.org/solr/SolrCloud#Collection_Aliases

I don't have a good idea for you for how to handle indexing off-cluster, however.

Michael Della Bitta
Applications Developer
o: +1 646 532 3062 | c: +1 917 477 7906

appinions inc.
“The Science of Influence Marketing”
18 East 41st Street
New York, NY 10017
t: @appinions <https://twitter.com/Appinions> | g+: plus.google.com/appinions <https://plus.google.com/u/0/b/112002776285509593336/112002776285509593336/posts>
w: appinions.com <http://www.appinions.com/>

On Wed, Dec 18, 2013 at 4:45 PM, Max Hansmire <hansm...@gmail.com> wrote:

> I am considering using SolrCloud, but I have a use case that I am not sure
> if it covers.
>
> I would like to keep an index up to date in realtime, but also I would like
> to sometimes restate the past. The way that I would restate the past is to
> do batch processing over historical data.
>
> My idea is that I would have the Solr collection sharded by date range. As
> I move forward in time I would add more shards.
>
> For restating historical data I would have a separate process that actually
> indexes a shards worth of data. (This keeps the servers that are meant for
> production search from having to handle the load of indexing historically.)
> I would then move the index files to the solr servers and register the
> newly created index with the server replacing the existing shards.
>
> I used to be able to do something similar pre-SolrCloud by using the core
> admin. But this did not have the benefit of having one search for the
> entire "collection". I had to manually query each of the cores to get the
> full search index.
>
> Essentially the question is:
> 1- is it possible to shard by date range in this way?
> 2- is it possible to swap out the index used by a shard?
> 3- is there a different way I should be thinking of this?
>
> Max
>