Re: Full index replication upon service restart

2019-02-21 Thread Erick Erickson
There really is no such thing as the replica falling “too far behind”. The process is: leader gets an update -> leader indexes locally and forwards the documents to the follower -> follower acks back that it’s received the raw docs and is indexing them -> the leader acks back to the client that the …
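That ack chain is visible from the client side. Below is a minimal SolrJ sketch in the Solr 7.x style that matches this thread; the ZooKeeper addresses and collection name are hypothetical, and the min_rf request parameter plus the "rf" value in the response header are my recollection of how the achieved replication factor was reported in that era, so treat the exact response shape as an assumption:

    import java.util.Arrays;
    import java.util.Optional;
    import org.apache.solr.client.solrj.impl.CloudSolrClient;
    import org.apache.solr.client.solrj.request.UpdateRequest;
    import org.apache.solr.client.solrj.response.UpdateResponse;
    import org.apache.solr.common.SolrInputDocument;

    public class AckFlowDemo {
        public static void main(String[] args) throws Exception {
            try (CloudSolrClient client = new CloudSolrClient.Builder(
                    Arrays.asList("zk1:2181", "zk2:2181"), Optional.empty()).build()) {
                client.setDefaultCollection("mycollection"); // hypothetical collection

                SolrInputDocument doc = new SolrInputDocument();
                doc.addField("id", "doc-1");

                UpdateRequest req = new UpdateRequest();
                req.add(doc);
                // Ask Solr to report the achieved replication factor. In the 7.x
                // era this was requested with min_rf and echoed back as "rf".
                req.setParam("min_rf", "2");

                UpdateResponse rsp = req.process(client);
                // process() returns only after the leader has indexed locally and
                // the follower has acked receipt of the raw documents -- the same
                // ack chain described above.
                System.out.println("achieved rf: " + rsp.getResponseHeader().get("rf"));
            }
        }
    }

By the time process() returns, the follower has already acknowledged the raw documents, which is why a healthy follower never falls “behind” in the transaction-log sense.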

Re: Full index replication upon service restart

2019-02-21 Thread Rahul Goswami
Erick, thanks for the insight. We are looking at tuning the architecture. We are also stopping the indexing application before we bring down the Solr nodes for maintenance. However, when both nodes are up and one replica is falling too far behind, we want to throttle the requests. Is there an API in …
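As far as I know, Solr of this era has no server-side API to throttle incoming updates, so the pacing has to live in the indexing client. A minimal sketch of such a client-side throttle (all names hypothetical); detecting that a replica is falling behind would additionally require polling cluster state, e.g. via the Collections API CLUSTERSTATUS action:

    import java.util.concurrent.TimeUnit;

    /** Paces indexing batches so a struggling replica can keep up. */
    public class ThrottledIndexer {
        private final long minNanosBetweenBatches;
        private long lastBatchNanos;

        public ThrottledIndexer(double batchesPerSecond) {
            this.minNanosBetweenBatches = (long) (1_000_000_000L / batchesPerSecond);
            this.lastBatchNanos = System.nanoTime() - this.minNanosBetweenBatches;
        }

        /** Blocks until enough time has elapsed since the previous batch. */
        public synchronized void acquire() throws InterruptedException {
            long earliest = lastBatchNanos + minNanosBetweenBatches;
            long now = System.nanoTime();
            if (now < earliest) {
                TimeUnit.NANOSECONDS.sleep(earliest - now);
            }
            lastBatchNanos = System.nanoTime();
        }
    }

    // Usage: pace at one batch every 150 seconds, and lower the rate when
    // cluster state shows a replica in recovery.
    //   ThrottledIndexer throttle = new ThrottledIndexer(1.0 / 150.0);
    //   throttle.acquire();
    //   // ... send the UpdateRequest ...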

Re: Full index replication upon service restart

2019-02-11 Thread Erick Erickson
bq. To answer your question about index size on disk, it is 3 TB on every node. As mentioned it's a 32 GB machine and I allocated 24 GB to the Java heap. This is massively undersized in terms of RAM, in my experience. You're trying to cram a 3 TB index into 32 GB of memory. Frankly, I don't think there's …
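The undersizing is easy to quantify from the figures in the thread: Lucene depends on the OS page cache to keep hot parts of the index in memory, and only the RAM not claimed by the Java heap is available for that cache. A rough back-of-the-envelope sketch:

    public class SizingMath {
        public static void main(String[] args) {
            double indexGb = 3 * 1024.0; // 3 TB index on disk, per node
            double ramGb = 32.0;         // physical RAM per node
            double heapGb = 24.0;        // allocated to the JVM heap
            double pageCacheGb = ramGb - heapGb; // ~8 GB left for the OS page cache
            System.out.printf("Page cache can hold ~%.2f%% of the index%n",
                    100.0 * pageCacheGb / indexGb); // prints ~0.26%
        }
    }

Under 0.3% of the index can ever be resident in memory at once, so nearly every read during recovery or querying is a disk seek.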

Re: Full index replication upon service restart

2019-02-11 Thread Rahul Goswami
Thanks for the response, Erick. To answer your question about index size on disk: it is 3 TB on every node. As mentioned, it's a 32 GB machine and I allocated 24 GB to the Java heap. Monitoring the recovery further, I see that when the follower node is recovering, the leader node (which is NOT recovering) …

Re: Full index replication upon service restart

2019-02-07 Thread Erick Erickson
bq. We have a heavy indexing load of about 10,000 documents every 150 seconds. Not so heavy query load. It's unlikely that changing numRecordsToKeep will help all that much if your maintenance window is very large; rather, that number would have to be _very_ high. Seven hours is huge. How big are you …
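numRecordsToKeep is an updateLog setting in solrconfig.xml (default 100); it bounds how many recent updates a replica can replay from the transaction log instead of falling back to a full index copy. The thread's own figures show why it would have to be _very_ high to cover a 7-hour outage; a quick sketch of the arithmetic:

    public class TlogWindow {
        public static void main(String[] args) {
            long docsPerBatch = 10_000;   // indexing load from the thread
            long batchIntervalSec = 150;  // one batch every 150 seconds
            long outageSec = 7L * 3600;   // 7-hour maintenance window
            long batches = outageSec / batchIntervalSec;  // 168 batches
            long records = batches * docsPerBatch;        // 1,680,000 updates
            System.out.println("numRecordsToKeep would need to be >= " + records);
        }
    }

Keeping roughly 1.7 million records in the transaction log is possible in principle, but replaying a log that large is itself slow, which fits the advice here that raising numRecordsToKeep is unlikely to help much.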