We had a rogue query take out several replicas in a large 4.2.0 cluster
today, due to OOMs (we use the JVM args to kill the process on OOM).

After recovering, when I execute the match all docs query (*:*), I get a
different count each time.

In other words, if I execute q=*:* several times in a row, then I get a
different count back for numDocs.

This was not the case prior to the failure; a stable count is one of the
things we monitor for.

I think I should be worried ... any ideas on how to troubleshoot this? One
thing to mention is that several of my replicas had to do full recoveries
from the leader when they came back online. Indexing was happening when the
replicas failed.
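
One check I can think of is to query each core directly with distrib=false
and compare numFound across the replicas of each shard, to see whether some
replicas disagree with their leader. Below is a rough sketch of that, not
something I have run against this cluster. The host names, core names, and
the two function names are made up for illustration, and the counts are
probably only comparable once indexing is paused and a commit has gone
through.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def core_count(core_url, timeout=10):
    """Return numFound for *:* from a single core, without fanning out."""
    params = urlencode(
        {"q": "*:*", "rows": 0, "wt": "json", "distrib": "false"}
    )
    with urlopen(f"{core_url}/select?{params}", timeout=timeout) as resp:
        return json.load(resp)["response"]["numFound"]


def mismatched_shards(counts_by_shard):
    """Given {shard: {core_url: count}}, keep shards whose replicas disagree."""
    return {
        shard: counts
        for shard, counts in counts_by_shard.items()
        if len(set(counts.values())) > 1
    }


if __name__ == "__main__":
    # Hypothetical layout: replace with the real core URLs from the cloud state.
    replicas = {
        "shard1": [
            "http://host1:8983/solr/collection1_shard1_replica1",
            "http://host2:8983/solr/collection1_shard1_replica2",
        ],
    }
    counts = {
        shard: {url: core_count(url) for url in urls}
        for shard, urls in replicas.items()
    }
    for shard, per_core in mismatched_shards(counts).items():
        print(shard, per_core)
```

If a shard shows up in that output, the distributed count would presumably
change depending on which replica happens to serve each request.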

Thanks.
Tim
