bq. To answer your question about index size on disk, it is 3 TB on every node. As mentioned, it's a 32 GB machine and I allocated 24 GB to the Java heap.
This is massively undersized in terms of RAM, in my experience. You're trying to cram 3 TB of index into 32 GB of memory. Frankly, I don't think there's much you can do to increase stability in this situation; too many things are going on. In particular, you're indexing during node restart. That means that:

1> you'll almost inevitably get a full sync on start, given your update rate.
2> while the full sync is running, all new updates are sent to the recovering replica and put in its tlog.
3> when the initial replication is done, the documents accumulated in the tlog while recovering are indexed. That's 7 hours of accumulated updates.
4> if anything goes wrong at that point, you're looking at another full sync.
5> rinse, repeat.

There are no magic tweaks here. You really have to rethink your architecture.

I'm actually surprised that your queries are performant. I expect you're getting a _lot_ of I/O; that is, the relevant parts of your index are constantly being paged in and out of the OS memory space. A _lot_. Or you're only using a _very_ small bit of your index.

Sorry to be so negative, but this is not a situation that's amenable to a quick fix.

Best,
Erick

On Mon, Feb 11, 2019 at 4:10 PM Rahul Goswami <rahul196...@gmail.com> wrote:
>
> Thanks for the response, Erick. To answer your question about index size on
> disk, it is 3 TB on every node. As mentioned, it's a 32 GB machine and I
> allocated 24 GB to the Java heap.
>
> Monitoring the recovery further, I see that while the follower node is
> recovering, the leader node (which is NOT recovering) almost freezes, with
> 100% CPU usage and 80%+ memory usage. The follower node's memory usage is
> also 80%+, but its CPU is very healthy. The follower node's log is also
> filled with updates forwarded from the leader ("...PRE_UPDATE FINISH
> {update.distrib=FROMLEADER&distrib.from=...") and replication starts much
> later.
> There have been instances when complete recovery took 10+ hours. We have
> upgraded to a 4 Gbps NIC between the nodes to see if it helps.
>
> Also, a few follow-up questions:
>
> 1) Is there a configuration which would start throttling update requests
> if the replica falls behind by a certain number of updates, so as to not
> trigger an index replication later? If not, would it be a worthy
> enhancement?
> 2) What would be a recommended hard commit interval for this kind of
> setup?
> 3) What are some of the improvements in 7.5 with respect to recovery as
> compared to 7.2.1?
> 4) What do the peersync failure log lines below mean? This would help me
> better understand the reasons for peersync failure, and maybe devise an
> alert mechanism to start throttling update requests from the application
> program if feasible.
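Regarding question 2) above: the hard commit interval is set on the updateHandler in solrconfig.xml, alongside a soft commit interval that controls when new documents become searchable. A minimal sketch for reference only; the intervals shown are illustrative values, not a recommendation from this thread:

    <!-- solrconfig.xml: a hard commit writes segments to disk and starts a new
         transaction log; with openSearcher=false it does not open a new
         searcher. Soft commits control document visibility. -->
    <autoCommit>
      <maxTime>60000</maxTime>            <!-- illustrative: hard commit every 60 s -->
      <openSearcher>false</openSearcher>
    </autoCommit>
    <autoSoftCommit>
      <maxTime>300000</maxTime>           <!-- illustrative: new searcher every 5 min -->
    </autoSoftCommit>

The trade-off relevant to this thread is that shorter hard commit intervals keep the current transaction log small, which limits how much has to be replayed when a node restarts.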
>
> *PeerSync Failure type 1*:
> ----------------------------------
> 2019-02-04 20:43:50.018 INFO
> (recoveryExecutor-4-thread-2-processing-n:indexnode1:20000_solr
> x:DataIndex_1C6F947C-6673-4778-847D-2DE0FDE56C66_shard11_replica_n42
> s:shard11 c:DataIndex_1C6F947C-6673-4778-847D-2DE0FDE56C66 r:core_node45)
> [c:DataIndex_1C6F947C-6673-4778-847D-2DE0FDE56C66 s:shard11 r:core_node45
> x:DataIndex_1C6F947C-6673-4778-847D-2DE0FDE56C66_shard11_replica_n42]
> org.apache.solr.update.PeerSync Fingerprint comparison: 1
>
> 2019-02-04 20:43:50.018 INFO
> (recoveryExecutor-4-thread-2-processing-n:indexnode1:20000_solr
> x:DataIndex_1C6F947C-6673-4778-847D-2DE0FDE56C66_shard11_replica_n42
> s:shard11 c:DataIndex_1C6F947C-6673-4778-847D-2DE0FDE56C66 r:core_node45)
> [c:DataIndex_1C6F947C-6673-4778-847D-2DE0FDE56C66 s:shard11 r:core_node45
> x:DataIndex_1C6F947C-6673-4778-847D-2DE0FDE56C66_shard11_replica_n42]
> org.apache.solr.update.PeerSync Other fingerprint:
> {maxVersionSpecified=1624579878580912128,
> maxVersionEncountered=1624579893816721408, maxInHash=1624579878580912128,
> versionsHash=-8308981502886241345, numVersions=32966082, numDocs=32966165,
> maxDoc=1828452}, Our fingerprint: {maxVersionSpecified=1624579878580912128,
> maxVersionEncountered=1624579975760838656, maxInHash=1624579878580912128,
> versionsHash=4017509388564167234, numVersions=32966066, numDocs=32966165,
> maxDoc=1828452}
>
> 2019-02-04 20:43:50.018 INFO
> (recoveryExecutor-4-thread-2-processing-n:indexnode1:20000_solr
> x:DataIndex_1C6F947C-6673-4778-847D-2DE0FDE56C66_shard11_replica_n42
> s:shard11 c:DataIndex_1C6F947C-6673-4778-847D-2DE0FDE56C66 r:core_node45)
> [c:DataIndex_1C6F947C-6673-4778-847D-2DE0FDE56C66 s:shard11 r:core_node45
> x:DataIndex_1C6F947C-6673-4778-847D-2DE0FDE56C66_shard11_replica_n42]
> org.apache.solr.update.PeerSync PeerSync:
> core=DataIndex_1C6F947C-6673-4778-847D-2DE0FDE56C66_shard11_replica_n42 url=
> http://indexnode1:8983/solr DONE. sync failed
>
> 2019-02-04 20:43:50.018 INFO
> (recoveryExecutor-4-thread-2-processing-n:indexnode1:8983_solr
> x:DataIndex_1C6F947C-6673-4778-847D-2DE0FDE56C66_shard11_replica_n42
> s:shard11 c:DataIndex_1C6F947C-6673-4778-847D-2DE0FDE56C66 r:core_node45)
> [c:DataIndex_1C6F947C-6673-4778-847D-2DE0FDE56C66 s:shard11 r:core_node45
> x:DataIndex_1C6F947C-6673-4778-847D-2DE0FDE56C66_shard11_replica_n42]
> org.apache.solr.cloud.RecoveryStrategy PeerSync Recovery was not successful
> - trying replication.
>
>
> *PeerSync Failure type 2*:
> ---------------------------------
> 2019-02-02 20:26:56.256 WARN
> (recoveryExecutor-4-thread-11-processing-n:indexnode1:20000_solr
> x:DataIndex_1C6F947C-6673-4778-847D-2DE0FDE56C66_shard12_replica_n46
> s:shard12 c:DataIndex_1C6F947C-6673-4778-847D-2DE0FDE56C66 r:core_node49)
> [c:DataIndex_1C6F947C-6673-4778-847D-2DE0FDE56C66 s:shard12 r:core_node49
> x:DataIndex_1C6F947C-6673-4778-847D-2DE0FDE56C66_shard12_replica_n46]
> org.apache.solr.update.PeerSync PeerSync:
> core=DataIndex_1C6F947C-6673-4778-847D-2DE0FDE56C66_shard12_replica_n46 url=
> http://indexnode1:20000/solr too many updates received since start -
> startingUpdates no longer overlaps with our currentUpdates
>
>
> Regards,
> Rahul
>
> On Thu, Feb 7, 2019 at 12:59 PM Erick Erickson <erickerick...@gmail.com>
> wrote:
>
> > bq. We have a heavy indexing load of about 10,000 documents every 150
> > seconds. Not so heavy query load.
> >
> > It's unlikely that changing numRecordsToKeep will help all that much if
> > your maintenance window is very large.
> > Rather, that number would have to be _very_ high.
> >
> > 7 hours is huge. How big are your indexes on disk? You're essentially
> > going to get a full copy from the leader for each replica, so network
> > bandwidth may be the bottleneck. Plus, every doc that gets indexed to the
> > leader during sync will be stored away in the replica's tlog (not limited
> > by numRecordsToKeep) and replayed after the full index replication is
> > accomplished.
> >
> > Much of the retry logic for replication has been improved starting with
> > Solr 7.3 and, in particular, Solr 7.5. That might address the replicas
> > that just fail to replicate at all, but it won't help with the fact that
> > replicas need a full sync in the first place.
> >
> > That said, by far the simplest thing would be to stop indexing during
> > your maintenance window, if at all possible.
> >
> > Best,
> > Erick
> >
> > On Tue, Feb 5, 2019 at 9:11 PM Rahul Goswami <rahul196...@gmail.com>
> > wrote:
> > >
> > > Hello Solr gurus,
> > >
> > > I have a scenario where, on Solr cluster restart, the replica node goes
> > > into full index replication for about 7 hours. Both replica nodes are
> > > restarted around the same time for maintenance. Also, during usual
> > > times, if one node goes down for whatever reason, upon restart it again
> > > does index replication. In certain instances, some replicas just fail
> > > to recover.
> > >
> > > *SolrCloud 7.2.1 cluster configuration:*
> > > ============================
> > > 16 shards - replication factor=2
> > >
> > > Per-server configuration:
> > > ======================
> > > 32 GB machine - 16 GB heap space for Solr
> > > Index size: 3 TB per server
> > >
> > > autoCommit (openSearcher=false) of 3 minutes
> > >
> > > We have a heavy indexing load of about 10,000 documents every 150
> > > seconds. Not so heavy query load.
> > >
> > > Reading through some of the threads on similar topics, I suspect it is
> > > the disparity in the number of updates (>100) between the replicas that
> > > is causing this (courtesy of our indexing load). One of the suggestions
> > > I saw was using numRecordsToKeep.
> > > However, as Erick mentioned in one of those threads, that's a band-aid
> > > measure, and I am trying to eliminate some of the fundamental issues
> > > that might exist.
> > >
> > > 1) Is the heap too small for that index size? If yes, what would be a
> > > recommended max heap size?
> > > 2) Is there a general guideline to estimate the required max heap based
> > > on index size on disk?
> > > 3) What would be a recommended autoCommit and autoSoftCommit interval?
> > > 4) Are there any configurations that would help improve the restart
> > > time and avoid full replication?
> > > 5) Does Solr retain "numRecordsToKeep" number of documents in the tlog
> > > *per replica*?
> > > 6) The reasons for peersync failure in the logs below are not completely
> > > clear to me. Can someone please elaborate?
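On question 5) just above and on Erick's numRecordsToKeep remarks: numRecordsToKeep is configured per core on the <updateLog> in solrconfig.xml, so each replica keeps that many recent updates in its own tlog for peer sync; a replica that has missed more updates than that falls back to full index replication. A minimal sketch, with an illustrative (not recommended) value:

    <!-- solrconfig.xml: each replica's transaction log, used for peer sync.
         numRecordsToKeep defaults to 100; a replica behind by more updates
         than this cannot peer-sync and must do a full replication. -->
    <updateLog>
      <str name="dir">${solr.ulog.dir:}</str>
      <int name="numRecordsToKeep">10000</int>  <!-- illustrative value only -->
    </updateLog>

As Erick says, raising this only widens the window in which peer sync can succeed: at 10,000 documents every 150 seconds, even 10,000 records covers only about two and a half minutes of indexing.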
> > >
> > > *PeerSync fails with*:
> > >
> > > Failure type 1:
> > > -----------------
> > > 2019-02-04 20:43:50.018 INFO
> > > (recoveryExecutor-4-thread-2-processing-n:indexnode1:20000_solr
> > > x:DataIndex_1C6F947C-6673-4778-847D-2DE0FDE56C66_shard11_replica_n42
> > > s:shard11 c:DataIndex_1C6F947C-6673-4778-847D-2DE0FDE56C66 r:core_node45)
> > > [c:DataIndex_1C6F947C-6673-4778-847D-2DE0FDE56C66 s:shard11 r:core_node45
> > > x:DataIndex_1C6F947C-6673-4778-847D-2DE0FDE56C66_shard11_replica_n42]
> > > org.apache.solr.update.PeerSync Fingerprint comparison: 1
> > >
> > > 2019-02-04 20:43:50.018 INFO
> > > (recoveryExecutor-4-thread-2-processing-n:indexnode1:20000_solr
> > > x:DataIndex_1C6F947C-6673-4778-847D-2DE0FDE56C66_shard11_replica_n42
> > > s:shard11 c:DataIndex_1C6F947C-6673-4778-847D-2DE0FDE56C66 r:core_node45)
> > > [c:DataIndex_1C6F947C-6673-4778-847D-2DE0FDE56C66 s:shard11 r:core_node45
> > > x:DataIndex_1C6F947C-6673-4778-847D-2DE0FDE56C66_shard11_replica_n42]
> > > org.apache.solr.update.PeerSync Other fingerprint:
> > > {maxVersionSpecified=1624579878580912128,
> > > maxVersionEncountered=1624579893816721408, maxInHash=1624579878580912128,
> > > versionsHash=-8308981502886241345, numVersions=32966082, numDocs=32966165,
> > > maxDoc=1828452}, Our fingerprint: {maxVersionSpecified=1624579878580912128,
> > > maxVersionEncountered=1624579975760838656, maxInHash=1624579878580912128,
> > > versionsHash=4017509388564167234, numVersions=32966066, numDocs=32966165,
> > > maxDoc=1828452}
> > >
> > > 2019-02-04 20:43:50.018 INFO
> > > (recoveryExecutor-4-thread-2-processing-n:indexnode1:20000_solr
> > > x:DataIndex_1C6F947C-6673-4778-847D-2DE0FDE56C66_shard11_replica_n42
> > > s:shard11 c:DataIndex_1C6F947C-6673-4778-847D-2DE0FDE56C66 r:core_node45)
> > > [c:DataIndex_1C6F947C-6673-4778-847D-2DE0FDE56C66 s:shard11 r:core_node45
> > > x:DataIndex_1C6F947C-6673-4778-847D-2DE0FDE56C66_shard11_replica_n42]
> > > org.apache.solr.update.PeerSync PeerSync:
> > > core=DataIndex_1C6F947C-6673-4778-847D-2DE0FDE56C66_shard11_replica_n42
> > > url= http://indexnode1:8983/solr DONE. sync failed
> > >
> > > 2019-02-04 20:43:50.018 INFO
> > > (recoveryExecutor-4-thread-2-processing-n:indexnode1:8983_solr
> > > x:DataIndex_1C6F947C-6673-4778-847D-2DE0FDE56C66_shard11_replica_n42
> > > s:shard11 c:DataIndex_1C6F947C-6673-4778-847D-2DE0FDE56C66 r:core_node45)
> > > [c:DataIndex_1C6F947C-6673-4778-847D-2DE0FDE56C66 s:shard11 r:core_node45
> > > x:DataIndex_1C6F947C-6673-4778-847D-2DE0FDE56C66_shard11_replica_n42]
> > > org.apache.solr.cloud.RecoveryStrategy PeerSync Recovery was not
> > > successful - trying replication.
> > >
> > >
> > > Failure type 2:
> > > ------------------
> > > 2019-02-02 20:26:56.256 WARN
> > > (recoveryExecutor-4-thread-11-processing-n:indexnode1:20000_solr
> > > x:DataIndex_1C6F947C-6673-4778-847D-2DE0FDE56C66_shard12_replica_n46
> > > s:shard12 c:DataIndex_1C6F947C-6673-4778-847D-2DE0FDE56C66 r:core_node49)
> > > [c:DataIndex_1C6F947C-6673-4778-847D-2DE0FDE56C66 s:shard12 r:core_node49
> > > x:DataIndex_1C6F947C-6673-4778-847D-2DE0FDE56C66_shard12_replica_n46]
> > > org.apache.solr.update.PeerSync PeerSync:
> > > core=DataIndex_1C6F947C-6673-4778-847D-2DE0FDE56C66_shard12_replica_n46
> > > url= http://indexnode1:20000/solr too many updates received since start -
> > > startingUpdates no longer overlaps with our currentUpdates
> > >
> > >
> > > Thanks,
> > > Rahul
> >
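On the alert/throttling idea from the Feb 11 mail (question 4): Solr has no built-in back-pressure for lagging replicas, but replica state is exposed through the Collections API, so an external monitor could poll it and pause the indexing application while any replica is recovering rather than piling more updates into its tlog. A rough sketch against the hostname and collection name that appear in the logs above (assumed reachable from the monitoring host):

    curl "http://indexnode1:8983/solr/admin/collections?action=CLUSTERSTATUS&collection=DataIndex_1C6F947C-6673-4778-847D-2DE0FDE56C66&wt=json"

In the JSON response each replica reports a state of active, recovering, down, or recovery_failed; holding updates while anything is not active at least keeps the tlog that must be replayed from growing during a 7+ hour recovery.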