Re: replicate indexing to second site

2016-02-10 Thread Shawn Heisey
On 2/10/2016 8:02 AM, tedsolr wrote:
> I have my head wrapped around sending index requests in parallel, but in a later post you mentioned how you separately track the most recent update and are able to sync from that point if needed. That I don't get. Is it an index version you are tracking?

Re: replicate indexing to second site

2016-02-10 Thread tedsolr
Arcadius, Thanks for sharing your multi data center design. My requirements are different (hot site - warm site) but nevertheless your posts are very interesting. It helps to know that in many cases someone else has already cut their teeth on the problem you're trying to solve. Ted

Re: replicate indexing to second site

2016-02-10 Thread tedsolr
Cross data center replication sounds like a great feature. I read Yonik's post on it. I'll keep my ear to the ground. In the meantime it's good to know there's nothing built in to handle this, so it will involve some design effort. I have my head wrapped around sending index requests in parallel,

Re: replicate indexing to second site

2016-02-09 Thread Arcadius Ahouansou
Hello Ted. We have a similar requirement to deploy Solr across 2 DCs. In our case, the DCs are connected via fibre optic. We managed to deploy a single SolrCloud cluster across multiple DCs without any major issue (see links below). The whole set-up is described in the following articles: - htt

Re: replicate indexing to second site

2016-02-09 Thread Walter Underwood
I agree. If the system updates synchronously, then you are in two-phase commit land. If you have a persistent store that each index can track, then things are good. wunder Walter Underwood wun...@wunderwood.org http://observer.wunderwood.org/ (my blog) > On Feb 9, 2016, at 7:37 PM, Shawn Heis

Re: replicate indexing to second site

2016-02-09 Thread Shawn Heisey
On 2/9/2016 5:48 PM, Walter Underwood wrote:
> Updating two systems in parallel gets into two-phase commit, instantly. So you need a persistent pool of updates that both clusters pull from.
My indexing system does exactly what I have suggested for tedsolr -- it updates multiple copies of my ind

Re: replicate indexing to second site

2016-02-09 Thread Alexandre Rafalovitch
This issue might be similar to what Apple presented at the closing keynote at Solr Revolution 2014. I believe they used a queue at each of the sites, feeding into Solr. The presentation should be online. Regards, Alex. Newsletter and resources for Solr beginners and intermediates: http://www

Re: replicate indexing to second site

2016-02-09 Thread Walter Underwood
Updating two systems in parallel gets into two-phase commit, instantly. So you need a persistent pool of updates that both clusters pull from. wunder Walter Underwood wun...@wunderwood.org http://observer.wunderwood.org/ (my blog) > On Feb 9, 2016, at 4:15 PM, Shawn Heisey wrote: > > On 2/9/2

Re: replicate indexing to second site

2016-02-09 Thread Shawn Heisey
On 2/9/2016 1:43 PM, tedsolr wrote:
> I expect that rsync can be used initially to copy the collection data folders and the zookeeper data and transaction log folders. So after verifying Solr/ZK is functional after the install, shut it down and perform the copy. This may sound slow but my pro

Re: replicate indexing to second site

2016-02-09 Thread Walter Underwood
Making two indexing calls, one to each, works until one system is not available. Then they are out of sync. You might want to put the updates into a persistent message queue, then have both systems indexed from that queue. wunder Walter Underwood wun...@wunderwood.org http://observer.wunderwood
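A rough sketch of the queue idea above, in Python, might look like the following. It is only an illustration under stated assumptions: `UpdateLog`, `drain`, and the `send` callables are hypothetical names, not Solr APIs, and the log is held in memory here, whereas a real setup would need a durable store (a message queue or database) so that updates survive restarts. The point it shows is that each cluster keeps its own read position, so a site that was down can catch up from where it left off.

```python
from typing import Callable, Dict, List


class UpdateLog:
    """Append-only log of index updates with one read offset per cluster.

    Hypothetical helper for illustration; in-memory only, so not actually
    persistent. A real deployment would back this with a durable queue.
    """

    def __init__(self, clusters: List[str]) -> None:
        self.entries: List[dict] = []
        self.offsets: Dict[str, int] = {name: 0 for name in clusters}

    def append(self, doc: dict) -> None:
        """Record an update once; every cluster will read it later."""
        self.entries.append(doc)

    def drain(self, cluster: str, send: Callable[[dict], bool]) -> int:
        """Send pending updates to one cluster, stopping at the first failure.

        `send` stands in for whatever call indexes a document into that
        cluster; it returns True on success. Returns the number sent.
        """
        sent = 0
        while self.offsets[cluster] < len(self.entries):
            doc = self.entries[self.offsets[cluster]]
            if not send(doc):
                break  # cluster unavailable: keep the offset, retry later
            self.offsets[cluster] += 1
            sent += 1
        return sent


def demo() -> Dict[str, List[str]]:
    """Simulate the warm site being down, then catching up."""
    log = UpdateLog(["hot", "warm"])
    indexed: Dict[str, List[str]] = {"hot": [], "warm": []}
    state = {"warm_up": False}

    def send_hot(doc: dict) -> bool:
        indexed["hot"].append(doc["id"])
        return True

    def send_warm(doc: dict) -> bool:
        if not state["warm_up"]:
            return False  # simulated outage
        indexed["warm"].append(doc["id"])
        return True

    log.append({"id": "1"})
    log.append({"id": "2"})
    log.drain("hot", send_hot)
    log.drain("warm", send_warm)  # nothing delivered, offset unchanged

    log.append({"id": "3"})
    state["warm_up"] = True
    log.drain("hot", send_hot)
    log.drain("warm", send_warm)  # replays everything the warm site missed
    return indexed
```

Because each update is written to the log once and each cluster advances its own offset, the two indexes do not have to be updated in the same step, which sidesteps the out-of-sync case described above.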

Re: replicate indexing to second site

2016-02-09 Thread Upayavira
There is a Cross Datacenter replication feature in the works - not sure of its status. In lieu of that, I'd simply have two copies of your indexing code - index everything simultaneously into both clusters. There is, of course, a risk that the two get out of sync, so you might want to find some ways t