We had a rogue query take out several replicas in a large 4.2.0 cluster today due to OOMs (we use JVM args to kill the process on OOM).
After recovering, when I execute the match-all-docs query (q=*:*) several times in a row, I get a different document count (numFound) back each time. This was not the case prior to the failure; the count is one of the things we monitor.

I think I should be worried. Any ideas on how to troubleshoot this? One thing to mention: indexing was happening when the replicas failed, and several of them had to do full recoveries from the leader when they came back online.

Thanks.
Tim
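
For anyone wanting to reproduce the check, here is a minimal sketch of how the repeated count could be scripted. It is only an illustration under assumptions: the core URL is hypothetical, and the helper names `num_found` and `distinct_counts` are made up for this example. It relies on the standard `distrib=false` request parameter to ask a single core for its local count, so the leader and each replica of a shard can be compared one at a time.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def num_found(core_url, distrib=True, opener=urlopen):
    """Return numFound for q=*:* against one core URL.

    core_url is a hypothetical base such as http://host:8983/solr/collection1.
    With distrib=False the core answers from its local index only.
    """
    params = urlencode({
        "q": "*:*",
        "rows": 0,            # only the count is needed, not the documents
        "wt": "json",
        "distrib": str(distrib).lower(),
    })
    with opener(f"{core_url}/select?{params}") as resp:
        return json.load(resp)["response"]["numFound"]


def distinct_counts(core_url, attempts=5, distrib=True, opener=urlopen):
    """Run the count query several times and return the sorted distinct values."""
    return sorted({num_found(core_url, distrib, opener) for _ in range(attempts)})
```

Calling `distinct_counts(url)` with the default distributed query should return a single value on a healthy cluster. If it returns several, calling `distinct_counts(core_url, distrib=False)` against each core of a shard in turn may show which replica disagrees with its leader.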