That case related to consistency after a ZK outage or network connectivity issue. Your case is standard operation, so I'm not sure it's really the same thing. I'm aware of a few issues that can happen if ZK connectivity goes wonky, which I hope are fixed in SOLR-8697.
This one might be a closer match to your problem though: https://mail-archives.apache.org/mod_mbox/lucene-solr-user/201604.mbox/%3CCAOWq+=iePCJjnQiSqxgDVEPv42Pi7RUtw0X0=9f67mpcm99...@mail.gmail.com%3E

On 5/19/16, 9:10 AM, "Aleksey Mezhva" <aleksey.mez...@wgsn.com> wrote:

>Bump.
>
>This thread is from someone having a similar issue:
>
>https://mail-archives.apache.org/mod_mbox/lucene-solr-user/201601.mbox/%3c09fdab82-7600-49e0-b639-9cb9db937...@yahoo.com%3E
>
>It seems like this is not really fixed in 5.4/6.0?
>
>Aleksey
>
>From: Steve Weiss <steve.we...@wgsn.com>
>Date: Tuesday, May 17, 2016 at 7:25 PM
>To: "solr-user@lucene.apache.org" <solr-user@lucene.apache.org>
>Cc: Aleksey Mezhva <aleksey.mez...@wgsn.com>, Hans Zhou <hans.z...@wgsn.com>
>Subject: Re: SolrCloud replicas consistently out of sync
>
>Gotcha - well, that's nice. Still, we seem to be permanently out of sync.
>
>I see this thread from someone having a similar issue:
>
>https://mail-archives.apache.org/mod_mbox/lucene-solr-user/201601.mbox/%3c09fdab82-7600-49e0-b639-9cb9db937...@yahoo.com%3E
>
>It seems like this is not really fixed in 5.4/6.0? Is there any version of SolrCloud where this wasn't yet a problem that we could downgrade to?
>
>--
>Steve
>
>On Tue, May 17, 2016 at 6:23 PM, Markus Jelsma <markus.jel...@openindex.io> wrote:
>Hi, that's a known issue and unrelated: https://issues.apache.org/jira/browse/SOLR-9120
>
>M.
>
>-----Original message-----
>> From: Stephen Weiss <steve.we...@wgsn.com>
>> Sent: Tuesday 17th May 2016 23:10
>> To: solr-user@lucene.apache.org; Aleksey Mezhva <aleksey.mez...@wgsn.com>; Hans Zhou <hans.z...@wgsn.com>
>> Subject: Re: SolrCloud replicas consistently out of sync
>>
>> I should add - looking back through the logs, we're seeing frequent errors like this now:
>>
>> 78819692 WARN (qtp110456297-1145) [ ] o.a.s.h.a.LukeRequestHandler Error getting file length for [segments_4o]
>> java.nio.file.NoSuchFileException: /var/solr/data/instock_shard5_replica1/data/index.20160516230059221/segments_4o
>>
>> --
>> Steve
>>
>> On Tue, May 17, 2016 at 5:07 PM, Stephen Weiss <steve.we...@wgsn.com> wrote:
>> OK, so we did as you suggested, read through that article, and reconfigured the autocommit to:
>>
>> <autoCommit>
>>   <maxTime>${solr.autoCommit.maxTime:30000}</maxTime>
>>   <openSearcher>false</openSearcher>
>> </autoCommit>
>>
>> <autoSoftCommit>
>>   <maxTime>${solr.autoSoftCommit.maxTime:600000}</maxTime>
>> </autoSoftCommit>
>>
>> However, we see no change, aside from the fact that it's clearly committing more frequently. I will say on our end, we clearly misunderstood the difference between soft and hard commit, but even with it configured this way, we are still totally out of sync, long after all indexing has completed (it's been about 30 minutes now). We manually pushed through a commit on the whole collection as suggested; however, all we get back for that is "o.a.s.u.DirectUpdateHandler2 No uncommitted changes. Skipping IW.commit.", which makes sense, because it was all committed already anyway.
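For reference, the "manual commit on the whole collection" mentioned above is just a request against the collection's update handler. A minimal sketch, assuming a node at localhost:8983 and using this thread's collection name:

import requests

SOLR = "http://localhost:8983/solr"   # assumption: any node in the cluster
COLLECTION = "instock"

# Issue an explicit, collection-wide hard commit and print what comes back.
resp = requests.get(f"{SOLR}/{COLLECTION}/update",
                    params={"commit": "true", "wt": "json"},
                    timeout=120)
resp.raise_for_status()
print(resp.json()["responseHeader"])  # status 0 means the commit was accepted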
>>
>> We still currently have all shards mismatched:
>>
>> instock_shard1 replica 1: 30788491 replica 2: 30778865
>> instock_shard10 replica 1: 30973059 replica 2: 30971874
>> instock_shard11 replica 2: 31036815 replica 1: 31034715
>> instock_shard12 replica 2: 30177084 replica 1: 30170511
>> instock_shard13 replica 2: 30608225 replica 1: 30603923
>> instock_shard14 replica 2: 30755739 replica 1: 30753191
>> instock_shard15 replica 2: 30891713 replica 1: 30891528
>> instock_shard16 replica 1: 30818567 replica 2: 30817152
>> instock_shard17 replica 1: 30423877 replica 2: 30422742
>> instock_shard18 replica 2: 30874979 replica 1: 30872223
>> instock_shard19 replica 2: 30917208 replica 1: 30909999
>> instock_shard2 replica 1: 31062339 replica 2: 31060575
>> instock_shard20 replica 1: 30192046 replica 2: 30190893
>> instock_shard21 replica 2: 30793817 replica 1: 30791135
>> instock_shard22 replica 2: 30821521 replica 1: 30818836
>> instock_shard23 replica 2: 30553773 replica 1: 30547336
>> instock_shard24 replica 1: 30975564 replica 2: 30971170
>> instock_shard25 replica 1: 30734696 replica 2: 30731682
>> instock_shard26 replica 1: 31465696 replica 2: 31464738
>> instock_shard27 replica 1: 30844884 replica 2: 30842445
>> instock_shard28 replica 2: 30549826 replica 1: 30547405
>> instock_shard29 replica 2: 30637777 replica 1: 30634091
>> instock_shard3 replica 1: 30930723 replica 2: 30926483
>> instock_shard30 replica 2: 30904528 replica 1: 30902649
>> instock_shard31 replica 2: 31175813 replica 1: 31174921
>> instock_shard32 replica 2: 30932837 replica 1: 30926456
>> instock_shard4 replica 2: 30758100 replica 1: 30754129
>> instock_shard5 replica 2: 31008893 replica 1: 31002581
>> instock_shard6 replica 2: 31008679 replica 1: 31005380
>> instock_shard7 replica 2: 30738468 replica 1: 30737795
>> instock_shard8 replica 2: 30620929 replica 1: 30616715
>> instock_shard9 replica 1: 31071386 replica 2: 31066956
>>
>> The fact that the min_rf numbers aren't coming back as 2 seems to indicate to me that documents simply aren't making it to both replicas - why would that have anything to do with committing anyway?
>>
>> Something else is amiss here. Too bad, committing sounded like an easy answer!
>>
>> --
>> Steve
>>
>> On Tue, May 17, 2016 at 11:39 AM, Erick Erickson <erickerick...@gmail.com> wrote:
>> OK, these autocommit settings need revisiting.
>>
>> First off, I'd remove the maxDocs entirely, although with the setting you're using it probably doesn't matter.
>>
>> The maxTime of 1,200,000 is 20 minutes. Which means if you ever un-gracefully kill your shards you'll have up to 20 minutes' worth of data to replay from the tlog... or resync from the leader. Make this much shorter (60000 or less) and be sure to gracefully kill your Solrs - no "kill -9", for instance.
>>
>> To be sure, before you bounce servers, try either waiting 20 minutes after the indexing stops or issuing a manual commit before shutting down your servers with
>> http://..../solr/collection/update?commit=true
>>
>> I have a personal annoyance with the bin/solr script where it forcefully (ungracefully) kills Solr after 5 seconds. I think this is much too short, so you might consider making it longer in prod; it's a shell script, so it's easy.
>>
>> <autoCommit>
>>   <maxTime>${solr.autoCommit.maxTime:1200000}</maxTime>
>>   <maxDocs>${solr.autoCommit.maxDocs:1000000000}</maxDocs>
>>   <openSearcher>false</openSearcher>
>> </autoCommit>
>>
>> This is probably the crux of "shards being out of sync". They're _not_ out of sync; it's just that some of them have docs visible to searches and some do not, since the wall-clock times these soft commits trigger at are _not_ the same. So you have a 10-minute window where two or more replicas for a single shard are out-of-sync.
>>
>> <autoSoftCommit>
>>   <maxTime>${solr.autoSoftCommit.maxTime:600000}</maxTime>
>> </autoSoftCommit>
>>
>> You can test all this one of two ways:
>> 1> if you have a timestamp for when the docs were indexed, do all the shards match if you do a query like q=*:*&timestamp:[* TO NOW-15MINUTES]?
>> or, if indexing is _not_ occurring, issue a manual commit like .../solr/collection/update?commit=true and see if all the replicas match for each shard.
>>
>> Here's a long blog on commits:
>> https://lucidworks.com/blog/2013/08/23/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/
>>
>> Best,
>> Erick
>>
>> On Tue, May 17, 2016 at 8:18 AM, Stephen Weiss <steve.we...@wgsn.com> wrote:
>> > Yes, after startup there was a recovery process, you are right. It's just that this process doesn't seem to happen unless we do a full restart.
>> >
>> > These are our autocommit settings - to be honest, we did not really use autocommit until we switched up to SolrCloud, so it's totally possible they are not very good settings. We wanted to minimize the frequency of commits because the commits seem to create a performance drag during indexing. Perhaps it's gone overboard?
>> >
>> > <autoCommit>
>> >   <maxTime>${solr.autoCommit.maxTime:1200000}</maxTime>
>> >   <maxDocs>${solr.autoCommit.maxDocs:1000000000}</maxDocs>
>> >   <openSearcher>false</openSearcher>
>> > </autoCommit>
>> > <autoSoftCommit>
>> >   <maxTime>${solr.autoSoftCommit.maxTime:600000}</maxTime>
>> > </autoSoftCommit>
>> >
>> > By nodes, I am indeed referring to machines. There are 8 shards per machine (2 replicas of each), all in one JVM apiece. We haven't specified any specific timestamps for the logs - they are just whatever happens by default.
>> >
>> > --
>> > Steve
>> >
>> > On Mon, May 16, 2016 at 11:50 PM, Erick Erickson <erickerick...@gmail.com> wrote:
>> > OK, this is very strange. There's no _good_ reason that restarting the servers should make a difference. The fact that it took 1/2 hour leads me to believe, though, that your shards are somehow "incomplete" - especially since you are indexing to the system and may not have, say, your autocommit settings tuned very well. The long startup implies (guessing) that you have pretty big tlogs that are replayed upon startup. While these were coming up, did you see any of the shards in the "recovering" state? That's the only way I can imagine that Solr "healed" itself.
>> >
>> > I've got to point back to the Solr logs. Are they showing any anomalies? Are any nodes in recovery when you restart?
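Erick's first test above can be run per shard by querying each replica core directly. A minimal sketch, assuming an indexed "timestamp" field exists and using hypothetical core URLs; distrib=false restricts the query to the single core being asked:

import requests

# Hypothetical core URLs for the two replicas of one shard.
REPLICAS = [
    "http://172.20.140.172:8983/solr/instock_shard15_replica2",
    "http://172.20.140.173:8983/solr/instock_shard15_replica1",
]

for core in REPLICAS:
    r = requests.get(f"{core}/select", params={
        "q": "*:*",
        "fq": "timestamp:[* TO NOW-15MINUTES]",  # assumes an indexed timestamp field
        "rows": 0,
        "distrib": "false",  # ask only this core, not the whole collection
        "wt": "json",
    }, timeout=60)
    print(core, r.json()["response"]["numFound"])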
>> >
>> > Best,
>> > Erick
>> >
>> > On Mon, May 16, 2016 at 4:14 PM, Stephen Weiss <steve.we...@wgsn.com> wrote:
>> >> Just one more note - while experimenting, I found that if I stopped all nodes (full cluster shutdown) and then started all nodes back up, they do in fact seem to repair themselves then. We have a script to monitor the differences between replicas (just looking at numDocs), and before the full shutdown/restart, we had:
>> >>
>> >> wks53104:Downloads sweiss$ php testReplication.php
>> >> Found 32 mismatched shard counts.
>> >> instock_shard1 replica 1: 30785553 replica 2: 30777568
>> >> instock_shard10 replica 1: 30972662 replica 2: 30966215
>> >> instock_shard11 replica 2: 31036718 replica 1: 31033547
>> >> instock_shard12 replica 1: 30179823 replica 2: 30176067
>> >> instock_shard13 replica 2: 30604638 replica 1: 30599219
>> >> instock_shard14 replica 2: 30755117 replica 1: 30753469
>> >> instock_shard15 replica 2: 30891325 replica 1: 30888771
>> >> instock_shard16 replica 1: 30818260 replica 2: 30811728
>> >> instock_shard17 replica 1: 30422080 replica 2: 30414666
>> >> instock_shard18 replica 2: 30874530 replica 1: 30869977
>> >> instock_shard19 replica 2: 30917008 replica 1: 30913715
>> >> instock_shard2 replica 1: 31062073 replica 2: 31057583
>> >> instock_shard20 replica 1: 30188774 replica 2: 30186565
>> >> instock_shard21 replica 2: 30789012 replica 1: 30784160
>> >> instock_shard22 replica 2: 30820473 replica 1: 30814822
>> >> instock_shard23 replica 2: 30552105 replica 1: 30545802
>> >> instock_shard24 replica 1: 30973906 replica 2: 30971314
>> >> instock_shard25 replica 1: 30732287 replica 2: 30724988
>> >> instock_shard26 replica 1: 31465543 replica 2: 31463414
>> >> instock_shard27 replica 2: 30845514 replica 1: 30842665
>> >> instock_shard28 replica 2: 30549151 replica 1: 30543070
>> >> instock_shard29 replica 2: 30635711 replica 1: 30629240
>> >> instock_shard3 replica 1: 30930400 replica 2: 30928438
>> >> instock_shard30 replica 2: 30902221 replica 1: 30895176
>> >> instock_shard31 replica 2: 31174246 replica 1: 31169998
>> >> instock_shard32 replica 2: 30931550 replica 1: 30926256
>> >> instock_shard4 replica 2: 30755525 replica 1: 30748922
>> >> instock_shard5 replica 2: 31006601 replica 1: 30994316
>> >> instock_shard6 replica 2: 31006531 replica 1: 31003444
>> >> instock_shard7 replica 2: 30737098 replica 1: 30727509
>> >> instock_shard8 replica 2: 30619869 replica 1: 30609084
>> >> instock_shard9 replica 1: 31067833 replica 2: 31061238
>> >>
>> >> This stayed consistent for several hours.
>> >>
>> >> After restart:
>> >>
>> >> wks53104:Downloads sweiss$ php testReplication.php
>> >> Found 3 mismatched shard counts.
>> >> instock_shard19 replica 2: 30917008 replica 1: 30913715
>> >> instock_shard22 replica 2: 30820473 replica 1: 30814822
>> >> instock_shard26 replica 1: 31465543 replica 2: 31463414
>> >> wks53104:Downloads sweiss$ php testReplication.php
>> >> Found 2 mismatched shard counts.
>> >> instock_shard19 replica 2: 30917008 replica 1: 30913715
>> >> instock_shard26 replica 1: 31465543 replica 2: 31463414
>> >> wks53104:Downloads sweiss$ php testReplication.php
>> >> Everything looks peachy
>> >>
>> >> Took about a half hour to get there.
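testReplication.php itself isn't shown in the thread. A hypothetical Python equivalent, which walks CLUSTERSTATUS to find every replica and compares each core's own count (again via distrib=false), might look like this:

import requests

SOLR = "http://localhost:8983/solr"  # assumption: any node in the cluster
COLLECTION = "instock"

# Ask the Collections API for the cluster layout.
status = requests.get(f"{SOLR}/admin/collections", params={
    "action": "CLUSTERSTATUS", "collection": COLLECTION, "wt": "json",
}, timeout=60).json()

shards = status["cluster"]["collections"][COLLECTION]["shards"]
mismatches = 0
for shard, info in sorted(shards.items()):
    counts = {}
    for name, replica in info["replicas"].items():
        core_url = replica["base_url"] + "/" + replica["core"]
        r = requests.get(f"{core_url}/select", params={
            "q": "*:*", "rows": 0, "distrib": "false", "wt": "json",
        }, timeout=60)
        counts[name] = r.json()["response"]["numFound"]
    if len(set(counts.values())) > 1:  # replicas of this shard disagree
        mismatches += 1
        print(shard, counts)

print(f"Found {mismatches} mismatched shard counts." if mismatches
      else "Everything looks peachy")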
>> >>
>> >> Maybe the question should be - is there any way to get SolrCloud to trigger this *without* having to shut down / restart all nodes? Even if we had to trigger that manually after indexing, it would be fine. It's a very controlled indexing workflow that only happens once a day.
>> >>
>> >> --
>> >> Steve
>> >>
>> >> On Mon, May 16, 2016 at 6:52 PM, Stephen Weiss <steve.we...@wgsn.com> wrote:
>> >> Each node has one JVM with 16GB of RAM. Are you suggesting we put each shard into a separate JVM (something like 32 nodes)?
>> >>
>> >> We aren't encountering any OOMs. We are testing this in a separate cloud which no one is even using; the only activity is this very small amount of indexing, and still we see this problem. In the logs, there are no errors at all. It's almost like none of the recovery features that people say are in Solr are actually there at all. I can't find any evidence that Solr is even attempting to keep the shards together.
>> >>
>> >> There are no real errors in the solr log. I do see some warnings at system startup:
>> >>
>> >> http://pastie.org/private/thz0fbzcxgdreeeune8w
>> >>
>> >> These lines in particular look interesting:
>> >>
>> >> 16925 INFO (recoveryExecutor-3-thread-4-processing-n:172.20.140.173:8983_solr x:instock_shard15_replica1 s:shard15 c:instock r:core_node31) [c:instock s:shard15 r:core_node31 x:instock_shard15_replica1] o.a.s.u.PeerSync PeerSync: core=instock_shard15_replica1 url=http://172.20.140.173:8983/solr Received 0 versions from http://172.20.140.172:8983/solr/instock_shard15_replica2/ fingerprint:{maxVersionSpecified=9223372036854775807, maxVersionEncountered=1534492620385943552, maxInHash=1534492620385943552, versionsHash=-6845461210912808581, numVersions=30888332, numDocs=30888332, maxDoc=37699007}
>> >> 16925 INFO (recoveryExecutor-3-thread-4-processing-n:172.20.140.173:8983_solr x:instock_shard15_replica1 s:shard15 c:instock r:core_node31) [c:instock s:shard15 r:core_node31 x:instock_shard15_replica1] o.a.s.u.PeerSync PeerSync: core=instock_shard15_replica1 url=http://172.20.140.173:8983/solr DONE. sync failed
>> >> 16925 INFO (recoveryExecutor-3-thread-4-processing-n:172.20.140.173:8983_solr x:instock_shard15_replica1 s:shard15 c:instock r:core_node31) [c:instock s:shard15 r:core_node31 x:instock_shard15_replica1] o.a.s.c.RecoveryStrategy PeerSync Recovery was not successful - trying replication.
>> >>
>> >> This is the first node to start up, so most of the other shards are not there yet.
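On the "trigger this without a restart" question above: the CoreAdmin API has a REQUESTRECOVERY action that asks a single core to re-run recovery against its leader in place. The thread doesn't confirm whether it helps in this situation, so this is only a sketch; the node URL and core name are placeholders:

import requests

NODE = "http://172.20.140.173:8983/solr"  # node hosting the lagging replica
CORE = "instock_shard15_replica1"         # core to push back into recovery

# Ask this one core to re-enter recovery, without bouncing the JVM.
r = requests.get(f"{NODE}/admin/cores", params={
    "action": "REQUESTRECOVERY", "core": CORE, "wt": "json",
}, timeout=60)
print(r.status_code, r.json().get("responseHeader"))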
>> >>
>> >> On another node (the last node to start up), it looks similar but a little different:
>> >>
>> >> http://pastie.org/private/xjw0ruljcurdt4xpzqk6da
>> >>
>> >> 74090 INFO (recoveryExecutor-3-thread-1-processing-n:172.20.140.177:8983_solr x:instock_shard25_replica2 s:shard25 c:instock r:core_node60) [c:instock s:shard25 r:core_node60 x:instock_shard25_replica2] o.a.s.c.RecoveryStrategy Attempting to PeerSync from [http://172.20.140.170:8983/solr/instock_shard25_replica1/] - recoveringAfterStartup=[true]
>> >> 74091 INFO (recoveryExecutor-3-thread-1-processing-n:172.20.140.177:8983_solr x:instock_shard25_replica2 s:shard25 c:instock r:core_node60) [c:instock s:shard25 r:core_node60 x:instock_shard25_replica2] o.a.s.u.PeerSync PeerSync: core=instock_shard25_replica2 url=http://172.20.140.177:8983/solr START replicas=[http://172.20.140.170:8983/solr/instock_shard25_replica1/] nUpdates=100
>> >> 74091 WARN (recoveryExecutor-3-thread-1-processing-n:172.20.140.177:8983_solr x:instock_shard25_replica2 s:shard25 c:instock r:core_node60) [c:instock s:shard25 r:core_node60 x:instock_shard25_replica2] o.a.s.u.PeerSync no frame of reference to tell if we've missed updates
>> >> 74091 INFO (recoveryExecutor-3-thread-1-processing-n:172.20.140.177:8983_solr x:instock_shard25_replica2 s:shard25 c:instock r:core_node60) [c:instock s:shard25 r:core_node60 x:instock_shard25_replica2] o.a.s.c.RecoveryStrategy PeerSync Recovery was not successful - trying replication.
>> >>
>> >> Every single replica shows errors like this (either one or the other).
>> >>
>> >> I should add, beyond the block joins / nested children & grandchildren, there's really nothing unusual about this cloud at all. It's a very basic collection (simple enough that it can be created in the GUI) and a dist installation of Solr 6. There are 3 independent ZooKeeper servers (again, vanilla from dist), and there don't appear to be any ZooKeeper issues.
>> >>
>> >> --
>> >> Steve
>> >>
>> >> On Mon, May 16, 2016 at 12:02 PM, Erick Erickson <erickerick...@gmail.com> wrote:
>> >> 8 nodes, 4 shards apiece? All in the same JVM? People have gotten past the GC pain by running in separate JVMs with less Java memory each on big beefy machines... That's not a recommendation as much as an observation.
>> >>
>> >> That aside, unless you have some very strange stuff going on, this is totally weird. Are you hitting OOM errors at any time you have this problem? Once you hit an OOM error, all bets are off about how Java behaves. If you are hitting those, you can't hope for stability until you fix that issue. In your writeup there's some evidence for this when you say that if you index multiple docs at a time you get failures.
>> >>
>> >> Do your Solr logs show any anomalies?
>> >> My guess is that you'll see exceptions in your Solr logs that will shed light on the issue.
>> >>
>> >> Best,
>> >> Erick
>> >>
>> >> On Mon, May 16, 2016 at 8:03 AM, Stephen Weiss <steve.we...@wgsn.com> wrote:
>> >>> Hi everyone,
>> >>>
>> >>> I'm running into a problem with SolrCloud replicas and thought I would ask the list to see if anyone else has seen this / gotten past it.
>> >>>
>> >>> Right now, we are running with only one replica per shard. This is obviously a problem because if one node goes down anywhere, the whole collection goes offline, and due to garbage collection issues, this happens about once or twice a week, causing a great deal of instability. If we try to increase to 2 replicas per shard, once we index new documents and the shards autocommit, the shards all get out of sync with each other, with different numbers of documents, different numbers of documents deleted, different facet counts - pretty much totally divergent indexes. Shards always show green and available, and never go into recovery or any other state to indicate there's a mismatch. There are also no errors in the logs to indicate anything is going wrong. Even long after indexing has finished, the replicas never come back into sync. The only way to get consistency again is to delete one set of replicas and then add them back in. Unfortunately, when we do this, we invariably discover that many documents (2-3%) are missing from the index.
>> >>>
>> >>> We have tried setting the min_rf parameter, and have found that when setting min_rf=2, we almost never get back rf=2. We almost always get rf=1, resend the request, and it basically just goes into an infinite loop. The only way to get rf=2 to come back is to index only one document at a time. Unfortunately, we have to update millions of documents a day, and it isn't really feasible to index this way; even when indexing one document at a time, we still occasionally find ourselves in an infinite loop. This doesn't appear to be related to the documents we are indexing - if we stop the index process and bounce Solr, the exact same document will go through fine the next time, until indexing stops up on another random document.
>> >>>
>> >>> We have 8 nodes with 4 shards apiece, all running one collection with about 900M documents. An important note is that we have a block join system with 3 tiers of documents (products -> skus -> sku_history). During indexing, we are forced to delete all documents for a product prior to adding the product back into the index, in order to avoid orphaned children / grandchildren. All documents are consistently indexed with the top-level product ID so that we can delete all child/grandchild documents prior to updating the document. So, for each updated document, we are sending through a delete call followed by an add call.
>> >>> We have tried putting both the delete and add in the same update request, with the same results.
>> >>>
>> >>> All we see out there on Google is that none of what we're seeing should be happening.
>> >>>
>> >>> We are currently running Solr 6.0 with Zookeeper 3.4.6. We experienced the same behavior on 5.4 as well.
>> >>>
>> >>> --
>> >>> Steve
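A hedged sketch pulling the pieces above together: one JSON update request that deletes by product ID and re-adds the product with its nested children, sent with min_rf=2 so the achieved replication factor ("rf") can be read back from the response header. The field names, the _childDocuments_ layout, and the document values are illustrative, not taken from the thread:

import requests

SOLR = "http://localhost:8983/solr"   # placeholder node
COLLECTION = "instock"

# One update body: delete everything for the product, then re-add it
# with nested children, mirroring the indexing workflow described above.
update = {
    "delete": {"query": "product_id:product-123"},  # hypothetical field
    "add": {
        "doc": {
            "id": "product-123",
            "product_id": "product-123",
            "type": "product",
            "_childDocuments_": [
                {"id": "product-123-sku1", "product_id": "product-123", "type": "sku"},
            ],
        },
    },
}

r = requests.post(f"{SOLR}/{COLLECTION}/update",
                  params={"min_rf": 2, "wt": "json"},
                  json=update,
                  timeout=120)
rf = r.json()["responseHeader"].get("rf")
if rf is None or rf < 2:
    print(f"achieved rf={rf} - the update may not have reached both replicas")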