I wouldn't rely on the "current" flag in the admin UI as an indicator. As long as your numDocs and the like match, I'd say it's a UI issue.
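Comparing "numDocs and the like" per replica means querying each replica directly rather than letting SolrCloud fan the request out. A minimal sketch of building such queries (the hosts and core names below are hypothetical; substitute your own nodes):

```python
from urllib.parse import urlencode

def replica_query_url(base, core, params=None):
    """Build a query URL that targets one replica only (distrib=false)."""
    q = {"q": "*:*", "rows": 0, "wt": "json", "distrib": "false"}
    q.update(params or {})
    return f"{base}/solr/{core}/select?{urlencode(q)}"

# Hypothetical replica locations -- substitute your own nodes/cores.
replicas = [
    ("http://solr1:8983", "bb-catalog-material_shard1_replica1"),
    ("http://solr2:8983", "bb-catalog-material_shard1_replica2"),
]

urls = [replica_query_url(base, core) for base, core in replicas]

# Fetch each URL (e.g. with urllib.request) and compare
# response["response"]["numFound"] across the replicas; a mismatch means
# the replicas really are out of sync, not just a stale UI flag.
for u in urls:
    print(u)
```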
Best,
Erick

On Wed, May 24, 2017 at 8:15 AM, Webster Homer <webster.ho...@sial.com> wrote:
> We see data in the target clusters. CDCR replication is working. We first
> noticed the current=false flag on the target replicas, but since I started
> looking I see it on the source too.
>
> I have removed the IgnoreCommitOptimizeUpdateProcessorFactory from our
> update processor chain, and I did two data loads to different collections.
> These collections are part of our development system; they are not
> configured to use CDCR and are loaded directly by our data load. The ETL
> to our Solr instances uses the /update/json request handler and does not
> send commits. These collections mirror our production collections and have
> 2 shards with 2 replicas each. I see the situation where the replicas are
> marked current=false, which should not happen if autoCommit were working
> correctly. The last load was yesterday at 5pm, and I didn't check until
> this morning, when I found bb-catalog-material_shard1_replica1 (the
> leader) was not current, but the other replica was. The last modified date
> on the leader was 2017-05-23T22:44:54.618Z.
>
> My modified autoCommit:
> <autoCommit>
>   <maxTime>${solr.autoCommit.maxTime:600000}</maxTime>
>   <openSearcher>false</openSearcher>
> </autoCommit>
>
> <autoSoftCommit>
>   <maxTime>${solr.autoSoftCommit.maxTime:60000}</maxTime>
> </autoSoftCommit>
>
> The last indexed record from a search matches up with the above time. For
> this test, the numDocs are the same between the two replicas. I think the
> soft commit is working. Why wouldn't both replicas be current after so
> many hours?
> We are using Solr 6.2, FYI. I expect to upgrade to Solr 6.6 when it
> becomes available.
>
> Thanks,
> Webster
>
> On Tue, May 23, 2017 at 12:52 PM, Erick Erickson <erickerick...@gmail.com> wrote:
>
>> This is all quite strange. Optimize (BTW, it's rarely
>> necessary/desirable on an index that changes, despite its name)
>> shouldn't matter here.
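Since neither system property is set, the `${...}` placeholders in the autoCommit settings above fall back to their defaults (600000 ms hard, 60000 ms soft). A minimal re-implementation of that placeholder lookup, for illustration only (this is not Solr's actual code):

```python
import re

def resolve(placeholder, props):
    """Resolve a ${name:default}-style placeholder the way solrconfig.xml
    values behave: use the system property if set, else the default."""
    m = re.fullmatch(r"\$\{([^:}]+)(?::([^}]*))?\}", placeholder)
    name, default = m.group(1), m.group(2)
    return props.get(name, default)

# No properties set, so the defaults in the config apply:
props = {}
hard = int(resolve("${solr.autoCommit.maxTime:600000}", props))     # 10 minutes
soft = int(resolve("${solr.autoSoftCommit.maxTime:60000}", props))  # 1 minute

# A property passed at startup (e.g. -Dsolr.autoCommit.maxTime=300000) wins:
override = resolve("${solr.autoCommit.maxTime:600000}",
                   {"solr.autoCommit.maxTime": "300000"})
```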
>> CDCR forwards the raw documents to the target cluster.
>>
>> Ample time indeed. With a soft commit of 15 seconds, that's your window
>> (with some slop for how long CDCR takes).
>>
>> If you do a search and sort by your timestamp descending, what do you
>> see on the target cluster? And when you are indexing and CDCR is
>> running, your target cluster's Solr logs should show updates coming in.
>> Mostly checking whether the data is even getting to the target cluster
>> here.
>>
>> Also check the tlogs on the source cluster. By "check" here I just mean
>> "are they a reasonable size", and "reasonable" should be very small. The
>> tlogs are the "queue" that CDCR uses to store docs before forwarding
>> them to the target cluster, so this is just a sanity check. If they're
>> huge, then CDCR is not forwarding anything to the target cluster.
>>
>> It's also vaguely possible that
>> IgnoreCommitOptimizeUpdateProcessorFactory is interfering; if so, it's
>> a bug and should be reported as a JIRA. If you remove that on the
>> target cluster, does the behavior change?
>>
>> I'm mystified here, as you can tell.
>>
>> Best,
>> Erick
>>
>> On Tue, May 23, 2017 at 10:12 AM, Webster Homer <webster.ho...@sial.com> wrote:
>> > We see a pretty consistent issue where the replicas show in the admin
>> > console as not current, indicating that our auto commit isn't
>> > committing. In one case we loaded the data to the source, CDCR
>> > replicated it to the targets, and we see both the source and the
>> > target as having current=false. It is searchable, so the soft commits
>> > are happening. We turned off data loading to investigate this issue,
>> > and the replicas are still not current after 3 days, so there should
>> > have been ample time to catch up.
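The tlog sanity check above is easy to script: sum the transaction-log file sizes for a core and flag a total that is clearly not "very small". A sketch, assuming the usual layout where tlogs live under the core's data/tlog directory (the path shown is illustrative):

```python
import os

def tlog_bytes(tlog_dir):
    """Total size of transaction-log files in a core's tlog directory.
    With CDCR healthy this should stay small; a huge total suggests the
    source is queueing docs it cannot forward to the target cluster."""
    total = 0
    for entry in os.scandir(tlog_dir):
        if entry.is_file():
            total += entry.stat().st_size
    return total

# Illustrative path -- adjust to your install layout:
# size = tlog_bytes("/var/solr/data/bb-catalog-material_shard1_replica1/data/tlog")
# print(f"tlog total: {size / 1024 / 1024:.1f} MB")
```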
>> > This is our autoCommit:
>> > <autoCommit>
>> >   <maxDocs>25000</maxDocs>
>> >   <maxTime>${solr.autoCommit.maxTime:300000}</maxTime>
>> >   <openSearcher>false</openSearcher>
>> > </autoCommit>
>> >
>> > This is our autoSoftCommit:
>> > <autoSoftCommit>
>> >   <maxTime>${solr.autoSoftCommit.maxTime:15000}</maxTime>
>> > </autoSoftCommit>
>> >
>> > Neither property, solr.autoCommit.maxTime nor
>> > solr.autoSoftCommit.maxTime, is set.
>> >
>> > We also have an updateChain that calls
>> > solr.IgnoreCommitOptimizeUpdateProcessorFactory to ignore client
>> > commits. Could that be the cause of our problem?
>> > <updateRequestProcessorChain name="cleanup">
>> >   <!-- Ignore commits from clients, telling them all's OK -->
>> >   <processor class="solr.IgnoreCommitOptimizeUpdateProcessorFactory">
>> >     <int name="statusCode">200</int>
>> >   </processor>
>> >
>> >   <processor class="TrimFieldUpdateProcessorFactory" />
>> >   <processor class="RemoveBlankFieldUpdateProcessorFactory" />
>> >
>> >   <processor class="solr.LogUpdateProcessorFactory" />
>> >   <processor class="solr.RunUpdateProcessorFactory" />
>> > </updateRequestProcessorChain>
>> >
>> > We did add a date field to all our collections that defaults to NOW,
>> > so I can see that no new data was added, but the replicas don't seem
>> > to get the commit. I assume this is something in our configuration
>> > (see above).
>> >
>> > Is there a way to determine when the last commit occurred?
>> >
>> > I believe the one replica got out of sync due to an admin running an
>> > optimize while CDCR was still running.
>> > That was one collection, but it looks like we are missing commits on
>> > most of our collections.
>> >
>> > Any help would be greatly appreciated!
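On "when did the last commit occur": one way to approximate this is the Luke request handler, which reports per-core index details including a last-modified time for the index. A sketch (the host and core name are hypothetical, and the response shape below is abbreviated; verify the exact fields against your Solr version's output):

```python
import json

# Per-replica Luke request -- host and core name are illustrative:
url = ("http://solr1:8983/solr/bb-catalog-material_shard1_replica1"
       "/admin/luke?show=index&wt=json")

# Abbreviated example of the JSON shape such a request returns; the
# "lastModified" value reflects the last time a (hard) commit updated
# the index on disk:
sample = json.loads("""
{
  "index": {
    "numDocs": 123456,
    "version": 42,
    "lastModified": "2017-05-23T22:44:54.618Z"
  }
}
""")
last_commit = sample["index"]["lastModified"]
```

Comparing that value across the two replicas of a shard would show directly whether one of them is missing commits.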
>> >
>> > Thanks,
>> > Webster Homer
>> >
>> > On Mon, May 22, 2017 at 4:12 PM, Erick Erickson <erickerick...@gmail.com> wrote:
>> >
>> >> You can ping individual replicas by addressing a specific replica
>> >> and setting distrib=false, something like:
>> >>
>> >> http://SOLR_NODE:port/solr/collection1_shard1_replica1/query?distrib=false&q=......
>> >>
>> >> But one thing to check first is that you've committed. I'd:
>> >> 1> turn off indexing on the source cluster.
>> >> 2> wait until CDCR has caught up (if necessary).
>> >> 3> issue a hard commit on the target.
>> >> 4> _then_ see if the counts are what is expected.
>> >>
>> >> Because autocommit can fire at different wall-clock times even for
>> >> replicas of the same shard, this makes it easier to tell whether an
>> >> inconsistency is just transient. The other thing I've seen people do
>> >> is set a timestamp on the docs to NOW (there's an update processor
>> >> that can do this). Then when you check for consistency you can use
>> >> fq=timestamp:[* TO NOW-(some interval significantly longer than your
>> >> autocommit interval)].
>> >>
>> >> bq: Is there a way to recover when a shard has inconsistent replicas?
>> >> If I use the delete replica API call to delete one of them and then
>> >> use add replica to create it from scratch, will it auto-populate from
>> >> the other replica in the shard?
>> >>
>> >> Yes. Whenever you ADDREPLICA, the new replica catches itself up from
>> >> the leader before becoming active. It has to copy the _entire_ index
>> >> from the leader, so you'll see network traffic spike.
>> >>
>> >> Best,
>> >> Erick
>> >>
>> >> On Mon, May 22, 2017 at 1:41 PM, Webster Homer <webster.ho...@sial.com> wrote:
>> >> > I have a SolrCloud collection with 2 shards and 4 replicas. The
>> >> > replicas for shard 1 have different numbers of records, so
>> >> > different queries will return different numbers of records.
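The timestamp-window check and the replica rebuild above can both be sketched as URLs. The `timestamp` field name, collection, shard, and replica names are all hypothetical here; DELETEREPLICA takes the replica's core_node name from the cluster state:

```python
from urllib.parse import urlencode

def consistency_params(interval="1HOUR"):
    """Query params to count only docs older than the autocommit window
    on a single replica. Excluding recently indexed docs (NOW-<interval>)
    avoids flagging differences caused merely by commits firing at
    different wall-clock times on different replicas."""
    return urlencode({
        "q": "*:*",
        "fq": f"timestamp:[* TO NOW-{interval}]",  # 'timestamp' field assumed
        "rows": 0,
        "distrib": "false",
    })

params = consistency_params()

# Rebuilding an inconsistent replica via the Collections API (names
# illustrative). The new replica copies the full index from the shard
# leader before it becomes active:
base = "http://solr1:8983/solr/admin/collections"
delete_url = (f"{base}?action=DELETEREPLICA"
              "&collection=mycoll&shard=shard1&replica=core_node3")
add_url = f"{base}?action=ADDREPLICA&collection=mycoll&shard=shard1"
```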
>> >> >
>> >> > I am not certain how this occurred; it happened in a collection
>> >> > that was a CDCR target.
>> >> >
>> >> > Is there a way to limit a search to a specific replica of a shard?
>> >> > We want to understand the differences.
>> >> >
>> >> > Is there a way to recover when a shard has inconsistent replicas?
>> >> > If I use the delete replica API call to delete one of them and then
>> >> > use add replica to create it from scratch, will it auto-populate
>> >> > from the other replica in the shard?
>> >> >
>> >> > Thanks,
>> >> > Webster
>> >> >
>> >> > --
>> >> >
>> >> > This message and any attachment are confidential and may be
>> >> > privileged or otherwise protected from disclosure. If you are not
>> >> > the intended recipient, you must not copy this message or
>> >> > attachment or disclose the contents to any other person. If you
>> >> > have received this transmission in error, please notify the sender
>> >> > immediately and delete the message and any attachment from your
>> >> > system. Merck KGaA, Darmstadt, Germany and any of its subsidiaries
>> >> > do not accept liability for any omissions or errors in this message
>> >> > which may arise as a result of E-Mail-transmission or for damages
>> >> > resulting from any unauthorized changes of the content of this
>> >> > message and any attachment thereto. Merck KGaA, Darmstadt, Germany
>> >> > and any of its subsidiaries do not guarantee that this message is
>> >> > free of viruses and does not accept liability for any damages
>> >> > caused by any virus transmitted therewith.
>> >> >
>> >> > Click http://www.emdgroup.com/disclaimer to access the German,
>> >> > French, Spanish and Portuguese versions of this disclaimer.