bq: does offline....

No. I'm talking about "collection aliasing". You can create an entirely new collection, index to it however you want, then switch to using that new collection.
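
To make the alias switch concrete, here is a minimal sketch of the request involved. It assumes the Collections API CREATEALIAS action, which repoints an alias when it is issued again for an existing alias name. The base URL, the alias name "live", and the collection name "live_2015_07_13" are made-up placeholders, not anything from this thread.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def collections_api_url(base_url, action, **params):
    """Build a Solr Collections API URL such as .../admin/collections?action=..."""
    query = urlencode({"action": action, **params, "wt": "json"})
    return f"{base_url.rstrip('/')}/admin/collections?{query}"


def switch_alias_url(base_url, alias, new_collection):
    """URL for CREATEALIAS; issuing it for an existing alias repoints that alias."""
    return collections_api_url(
        base_url, "CREATEALIAS", name=alias, collections=new_collection
    )


def switch_alias(base_url, alias, new_collection):
    """Send the request to a running Solr node; True if it answered HTTP 200."""
    with urlopen(switch_alias_url(base_url, alias, new_collection)) as resp:
        return resp.status == 200


# Hypothetical usage, once the new collection has been fully indexed:
#   switch_alias("http://localhost:8983/solr", "live", "live_2015_07_13")
```

Clients keep querying the alias name, so nothing on the query side changes when the alias is moved to the freshly indexed collection.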
bq: Any updates to EXISTING document in the LIVE collection should NOT be replicated to the previous week(s) snapshot(s)

then give it a new ID maybe?

Best,
Erick

On Mon, Jul 13, 2015 at 3:21 PM, Raja Pothuganti <rpothuga...@competitrack.com> wrote:
> Thank you Erick
>>Actually, my question is why do it this way at all? Why not index
>>directly to your "live" nodes? This is what SolrCloud is built for.
>>You can use "implicit" routing to create shards say, for each week and
>>age out the ones that are "too old" as well.
>
> Any updates to EXISTING documents in the LIVE collection should NOT be
> replicated to the previous week(s) snapshot(s). Think of the snapshot(s)
> as an archive of sorts, searchable independently of LIVE. We're aiming to
> support at most 2 archives of data in the past.
>
>>Another option would be to use "collection aliasing" to keep an
>>offline index up to date then switch over when necessary.
>
> Does offline indexing refer to this link?
> https://github.com/cloudera/search/tree/0d47ff79d6ccc0129ffadcb50f9fe0b271f102aa/search-mr
>
> Thanks
> Raja
>
> On 7/13/15, 3:14 PM, "Erick Erickson" <erickerick...@gmail.com> wrote:
>
>>Actually, my question is why do it this way at all? Why not index
>>directly to your "live" nodes? This is what SolrCloud is built for.
>>
>>There's the new backup/restore functionality that's still a work in
>>progress, see: https://issues.apache.org/jira/browse/SOLR-5750
>>
>>You can use "implicit" routing to create shards say, for each week and
>>age out the ones that are "too old" as well.
>>
>>Another option would be to use "collection aliasing" to keep an
>>offline index up to date then switch over when necessary.
>>
>>I'd really like to know this isn't an XY problem though, what's the
>>high-level problem you're trying to solve?
>>
>>Best,
>>Erick
>>
>>On Mon, Jul 13, 2015 at 12:49 PM, Raja Pothuganti
>><rpothuga...@competitrack.com> wrote:
>>>
>>> Hi,
>>> We are setting up a new SolrCloud environment with 5.2.1 on Ubuntu
>>> boxes. We currently ingest data into a large collection, call it LIVE.
>>> After the full ingest is done we then trigger a delta ingestion
>>> every 15 minutes to get the documents & data that have changed into this
>>> LIVE instance.
>>>
>>> In Solr 4.X using a Master / Slave setup we had slaves that would
>>> periodically (weekly, or monthly) refresh their data from the Master
>>> rather than every 15 minutes. We're now trying to figure out how to get
>>> this same type of setup using SolrCloud.
>>>
>>> Question(s):
>>> - Is there a way to copy data from one SolrCloud collection into
>>> another quickly and easily?
>>> - Is there a way to programmatically control when a replica receives
>>> its data or possibly move it to another collection (without losing
>>> data) that updates on a different interval? It ideally would be another
>>> collection name, call it Week1 ... Week52 ... to avoid a replica in the
>>> same collection serving old data.
>>>
>>> One option we thought of was to create a backup and then restore that
>>> into a new clean cloud. This has a lot of moving parts and isn't nearly
>>> as neat as the Master / Slave controlled replication setup. It also has
>>> the side effect of potentially taking a very long time to back up and
>>> restore instead of just copying the indexes like the old M/S setup.
>>>
>>> Any ideas or thoughts? Thanks in advance for your help.
>>> Raja