> Note that for my previous e-mail you’d have to wait 15 minutes after you 
> started indexing to see a new tlog and also wait until at least 1,000 new 
> documents after _that_ before the large tlog went away. I don't think that’s 
> your issue though.
I did wait 15 minutes, but I'm not sure 1,000 documents were indexed in the 
meantime. Still, I should have seen a new tlog even if the large one was still 
there, right?
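
As a rough sketch, one way to check whether a new tlog has appeared is to list the files in the core's tlog directory along with their sizes. The directory location and the `tlog.` file-name prefix are assumptions here, as is the helper name `list_tlogs`; adjust them to the actual installation.

```python
import os


def list_tlogs(tlog_dir):
    """Return (name, size_bytes) for each tlog file, sorted by name.

    Assumes transaction log files are named with a "tlog." prefix.
    """
    entries = []
    for name in sorted(os.listdir(tlog_dir)):
        path = os.path.join(tlog_dir, name)
        # Skip sub-directories and unrelated files.
        if name.startswith("tlog.") and os.path.isfile(path):
            entries.append((name, os.path.getsize(path)))
    return entries
```

Running this before and after the 15-minute window should show whether a second file was created next to the large one.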

> So I think that’s the place to focus. Did the node recover completely and go 
> active? Just checking the admin UI and seeing it be green is sometimes not 
> enough. Check the state.json znode and see if the state is also “active” 
> there.
In ZooKeeper (through the Solr UI or by connecting to ZK directly) I can see 
"state":"active" in the state.json, so this seems fine.
Stranger still, this is the leader node. Can a leader even be in recovery?
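
A minimal sketch of that check, assuming the content of state.json has already been copied out of ZooKeeper as text: it lists each replica's state and whether it is marked as leader. The sample document below is illustrative only (made-up node names, trimmed fields), and `replica_states` is a hypothetical helper.

```python
import json

# Illustrative sample only; a real state.json carries more fields.
SAMPLE_STATE_JSON = """
{"collection1": {"shards": {"shard1": {"state": "active", "replicas": {
  "core_node2": {"core": "collection1_shard1_replica_n1",
                 "state": "active", "leader": "true"},
  "core_node4": {"core": "collection1_shard1_replica_n3",
                 "state": "recovering"}
}}}}}
"""


def replica_states(state_json_text, collection):
    """Return (shard, replica, state, is_leader) for every replica."""
    state = json.loads(state_json_text)
    rows = []
    for shard_name, shard in state[collection]["shards"].items():
        for replica_name, replica in shard["replicas"].items():
            # In this sample the leader flag is the string "true".
            is_leader = replica.get("leader") == "true"
            rows.append((shard_name, replica_name, replica["state"], is_leader))
    return rows
```

A leader whose state is anything other than "active" would stand out immediately in the returned rows.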

> Next, try sending a request directly to that replica. Frankly I’m not sure 
> what to expect, but if you get something weird that’d be a “smoking gun” that 
> no matter what the admin UI says, the replica isn’t really active. Something 
> like “http://blah blah 
> blah/solr/collection1_shard1_replica_n1?q=some_query&distrib=false”. The 
> “distrib=false” is important, otherwise the request will be forwarded to a 
> truly active node.
The request works fine, and I don't see anything weird in the logs at that time.
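
For reference, a small sketch of how such a non-distributed request URL can be built. The base URL is a placeholder, and the `/select` handler is an assumption (the quoted example omits the handler); `direct_replica_url` is a hypothetical helper.

```python
from urllib.parse import urlencode


def direct_replica_url(base_url, core, query):
    """Build a query URL that targets a single core.

    distrib=false keeps the request on this core instead of letting
    it be forwarded to another replica.
    """
    params = urlencode({"q": query, "distrib": "false"})
    # "/select" is assumed to be the search handler in use.
    return f"{base_url.rstrip('/')}/{core}/select?{params}"


url = direct_replica_url(
    "http://localhost:8983/solr",  # placeholder host and port
    "collection1_shard1_replica_n1",
    "some_query",
)
```

The resulting URL can then be fetched with curl or a browser against the replica in question.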

I will investigate further and look into everything you mentioned.

Kind regards,
Gaël
