Tom,

> (and take care not to restart the leader node otherwise it will replicate
> from one of the replicas which is missing the index).

How is this possible? OK, I will look into it further. I would appreciate it if someone else chimed in if they have seen a similar issue.

Amrit Sarkar
Search Engineer
Lucidworks, Inc.
415-589-9269
www.lucidworks.com
Twitter http://twitter.com/lucidworks
LinkedIn: https://www.linkedin.com/in/sarkaramrit2
Medium: https://medium.com/@sarkaramrit2

On Fri, Dec 1, 2017 at 4:49 AM, Tom Peters <tpet...@synacor.com> wrote:

> Hi Amrit, I tried issuing hard commits to the various nodes in the target
> cluster and it does not appear to cause the follower replicas to receive
> the initial index. The only way I can get the replicas to see the original
> index is by restarting those nodes (taking care not to restart the leader
> node, otherwise it will replicate from one of the replicas which is missing
> the index).
>
> > On Nov 30, 2017, at 12:16 PM, Amrit Sarkar <sarkaramr...@gmail.com> wrote:
> >
> > Tom,
> >
> > This is very useful:
> >
> >> I found a way to get the follower replicas to receive the documents from
> >> the leader in the target data center, I have to restart the solr instance
> >> running on that server. Not sure if this information helps at all.
> >
> > You have to issue a hard commit on the target after the bootstrapping is
> > done. Reloading makes the core open a new searcher. While an explicit
> > commit is issued at the target leader after the bootstrap is done, the
> > followers are left unattended even though the docs are copied over.
> >
> > Amrit Sarkar
> > Search Engineer
> > Lucidworks, Inc.
> > 415-589-9269
> > www.lucidworks.com
> > Twitter http://twitter.com/lucidworks
> > LinkedIn: https://www.linkedin.com/in/sarkaramrit2
> > Medium: https://medium.com/@sarkaramrit2
> >
> > On Thu, Nov 30, 2017 at 10:06 PM, Tom Peters <tpet...@synacor.com> wrote:
> >
> >> Hi Amrit,
> >>
> >> Starting with more documents doesn't appear to have made a difference.
> >> This time I tried with >1000 docs. Here are the steps I took:
> >>
> >> 1. Deleted the collection on both the source and target DCs.
> >>
> >> 2. Recreated the collections.
> >>
> >> 3. Indexed >1000 documents on the source data center, hard commit
> >>
> >> $ for i in solr0{1,2}-{a,b,c}; do echo -n "$i: "; curl -s $i:8080/solr/mycollection/select'?q=*:*' | jq '.response.numFound'; done
> >> solr01-a: 1368
> >> solr01-b: 1368
> >> solr01-c: 1368
> >> solr02-a: 0
> >> solr02-b: 0
> >> solr02-c: 0
> >>
> >> 4. Enabled CDCR and checked docs
> >>
> >> $ curl 'solr01-a:8080/solr/synacor/cdcr?action=START'
> >>
> >> $ for i in solr0{1,2}-{a,b,c}; do echo -n "$i: "; curl -s $i:8080/solr/mycollection/select'?q=*:*' | jq '.response.numFound'; done
> >> solr01-a: 1368
> >> solr01-b: 1368
> >> solr01-c: 1368
> >> solr02-a: 0
> >> solr02-b: 0
> >> solr02-c: 1368
> >>
> >> Some additional notes:
> >>
> >> * I do not have numRecordsToKeep defined in my solrconfig.xml, so I assume
> >> it will use the default of 100
> >>
> >> * I found a way to get the follower replicas to receive the documents from
> >> the leader in the target data center, I have to restart the solr instance
> >> running on that server. Not sure if this information helps at all.
> >>
> >>> On Nov 30, 2017, at 11:22 AM, Amrit Sarkar <sarkaramr...@gmail.com> wrote:
> >>>
> >>> Hi Tom,
> >>>
> >>> I see what you are saying and I too think this is a bug, but I will
> >>> confirm by checking the code. Bootstrapping should happen on all the
> >>> nodes of the target.
> >>>
> >>> Meanwhile, can you index more than 100 documents in the source and do the
> >>> exact same experiment again? Followers will not copy the entire index of
> >>> the leader unless the difference in document versions is more than
> >>> "numRecordsToKeep", which defaults to 100 unless you have modified it in
> >>> solrconfig.xml.
> >>>
> >>> Looking forward to your analysis.
> >>>
> >>> Amrit Sarkar
> >>> Search Engineer
> >>> Lucidworks, Inc.
> >>> 415-589-9269
> >>> www.lucidworks.com
> >>> Twitter http://twitter.com/lucidworks
> >>> LinkedIn: https://www.linkedin.com/in/sarkaramrit2
> >>> Medium: https://medium.com/@sarkaramrit2
> >>>
> >>> On Thu, Nov 30, 2017 at 9:03 PM, Tom Peters <tpet...@synacor.com> wrote:
> >>>
> >>>> I'm running into an issue with the initial CDCR bootstrapping of an
> >>>> existing index. In short, after turning on CDCR only the leader replica
> >>>> in the target data center will have the documents replicated, and they
> >>>> will not exist in any of the follower replicas in the target data
> >>>> center. All subsequent incremental updates made to the source data
> >>>> center will appear in all replicas in the target data center.
> >>>>
> >>>> A few more details:
> >>>>
> >>>> I have two clusters set up, a source cluster and a target cluster. Each
> >>>> cluster has only one shard and three replicas. I used the configuration
> >>>> detailed in the Source and Target sections of the reference guide as-is
> >>>> with the exception of updating the zkHost
> >>>> (https://lucene.apache.org/solr/guide/7_1/cross-data-center-replication-cdcr.html#cdcr-configuration-2).
> >>>>
> >>>> The source data center has the following nodes:
> >>>> solr01-a, solr01-b, and solr01-c
> >>>>
> >>>> The target data center has the following nodes:
> >>>> solr02-a, solr02-b, and solr02-c
> >>>>
> >>>> Here are the steps that I've done:
> >>>>
> >>>> 1. Create collection in source and target data centers
> >>>>
> >>>> 2. Add a number of documents to the source data center
> >>>>
> >>>> 3. Verify:
> >>>>
> >>>> $ for i in solr0{1,2}-{a,b,c}; do echo -n "$i: "; curl -s $i:8080/solr/mycollection/select'?q=*:*' | jq '.response.numFound'; done
> >>>> solr01-a: 81
> >>>> solr01-b: 81
> >>>> solr01-c: 81
> >>>> solr02-a: 0
> >>>> solr02-b: 0
> >>>> solr02-c: 0
> >>>>
> >>>> 4. Start CDCR:
> >>>>
> >>>> $ curl 'solr01-a:8080/solr/mycollection/cdcr?action=START'
> >>>>
> >>>> 5. See if target data center has received the initial index
> >>>>
> >>>> $ for i in solr0{1,2}-{a,b,c}; do echo -n "$i: "; curl -s $i:8080/solr/mycollection/select'?q=*:*' | jq '.response.numFound'; done
> >>>> solr01-a: 81
> >>>> solr01-b: 81
> >>>> solr01-c: 81
> >>>> solr02-a: 0
> >>>> solr02-b: 0
> >>>> solr02-c: 81
> >>>>
> >>>> note: only -c has received the index
> >>>>
> >>>> 6. Add another document to the source cluster
> >>>>
> >>>> 7. See how many documents are in each node:
> >>>>
> >>>> $ for i in solr0{1,2}-{a,b,c}; do echo -n "$i: "; curl -s $i:8080/solr/mycollection/select'?q=*:*' | jq '.response.numFound'; done
> >>>> solr01-a: 82
> >>>> solr01-b: 82
> >>>> solr01-c: 82
> >>>> solr02-a: 1
> >>>> solr02-b: 1
> >>>> solr02-c: 82
> >>>>
> >>>> As you can see, the initial index only made it to one of the replicas in
> >>>> the target data center, but subsequent incremental updates have appeared
> >>>> everywhere I would expect. Any help would be greatly appreciated,
> >>>> thanks.
> >>>>
> >>>> This message and any attachment may contain information that is
> >>>> confidential and/or proprietary. Any use, disclosure, copying, storing,
> >>>> or distribution of this e-mail or any attached file by anyone other than
> >>>> the intended recipient is strictly prohibited. If you have received this
> >>>> message in error, please notify the sender by reply email and delete the
> >>>> message and any attachments. Thank you.
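For anyone reproducing this, the per-node check used throughout the thread can be wrapped in two small shell functions so that lagging replicas are listed directly. This is only a sketch: the function names count_docs and find_lagging are made up for illustration and are not part of Solr, and the port 8080 and collection name mycollection are simply the ones from the commands in the thread. It assumes bash, curl, jq, and awk are available.

```shell
#!/usr/bin/env bash

# Print "host: numFound" for each host given as an argument.
# Assumption: port 8080 and collection "mycollection", as in the thread.
count_docs() {
  local host
  for host in "$@"; do
    printf '%s: ' "$host"
    curl -s "$host:8080/solr/mycollection/select?q=*:*" | jq '.response.numFound'
  done
}

# Read "host: count" lines on stdin and print every host whose count is
# below the highest count seen, one host per line.
find_lagging() {
  awk -F': *' '
    BEGIN { max = 0 }
    NF >= 2 {
      n++
      host[n] = $1
      count[n] = $2 + 0
      if (count[n] > max) max = count[n]
    }
    END {
      for (i = 1; i <= n; i++) { if (count[i] < max) print host[i] }
    }
  '
}
```

Usage would look like `count_docs solr02-a solr02-b solr02-c | find_lagging`. With the target counts reported after enabling CDCR in the >1000-document run (0, 0, 1368), it would list solr02-a and solr02-b as the replicas that still need attention.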