Hi Jeff,
we ran into that a few times already. We have lots of collections, and
when nodes get started too fast the overseer queue grows faster than
Solr can process it. At some point Solr tries to redo things like
leader votes and adds new tasks to the queue, which then gets longer and
longer. Once it is too long you cannot read out the data anymore, but
Solr is still adding tasks. If you have already reached that point, you
have to start ZooKeeper and the ZooKeeper client with an increased
"jute.maxbuffer" value. I usually double it until I can read out the
queue again. After that I delete all entries in the queue and then start
the Solr nodes one by one, roughly one every 5 minutes.
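
To make the doubling step concrete, here is a rough sizing sketch in Python. It is only an estimate under stated assumptions: that queue entries have 13-character names of the form "qn-0000000123", that each name costs a 4-byte length prefix when the child list is sent back, and that you start from ZooKeeper's default "jute.maxbuffer" of 0xfffff bytes. The function names are made up for illustration. Your real numbers may differ, so treat the result as a starting point rather than a guarantee.

```python
# Rough sizing sketch; the constants below are assumptions, not measured values.
DEFAULT_JUTE_MAXBUFFER = 0xFFFFF  # ZooKeeper default, 1048575 bytes
ASSUMED_NAME_LEN = 13             # assumed entry name, e.g. "qn-0000000123"
ASSUMED_LENGTH_PREFIX = 4         # assumed 4-byte length prefix per name


def estimate_listing_bytes(entries: int) -> int:
    """Estimate the size of the child listing for a queue with `entries` nodes."""
    return entries * (ASSUMED_NAME_LEN + ASSUMED_LENGTH_PREFIX)


def doubled_maxbuffer(required: int, start: int = DEFAULT_JUTE_MAXBUFFER) -> int:
    """Double `start` until it is at least `required` bytes."""
    size = start
    while size < required:
        size *= 2
    return size


if __name__ == "__main__":
    needed = estimate_listing_bytes(700_000)  # queue size from the original mail
    value = doubled_maxbuffer(needed)
    print(f"estimated listing size: {needed} bytes")
    # Same value has to be passed to both the ZooKeeper server and the client JVM.
    print(f"-Djute.maxbuffer={value}")
```

With the 700k entries mentioned below, this estimate lands at four doublings of the default. If the queue still cannot be read with that value, keep doubling as described above.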
regards,
Hendrik
On 22.08.2017 13:42, Jeff Courtade wrote:
Hi,
I have an issue with what seems to be a blocked-up /overseer/queue.
There are 700k+ entries.
SolrCloud 6.x
You cannot addreplica or deletereplica; the commands time out.
Full stop and start of solr and zookeeper does not clear it.
Is it safe to use the ZooKeeper-supplied zkCli.sh to simply rmr the
/overseer/queue ?
Jeff Courtade
M: 240.507.6116