> Note that for my previous e-mail you’d have to wait 15 minutes after you
> started indexing to see a new tlog and also wait until at least 1,000 new
> documents after _that_ before the large tlog went away. I don't think that’s
> your issue though.

I did wait 15 minutes, but I'm not sure 1,000 documents were indexed in the meantime. Still, I should have seen a new tlog even if the large one was still there, right?

> So I think that’s the place to focus. Did the node recover completely and go
> active? Just checking the admin UI and seeing it be green is sometimes not
> enough. Check the state.json znode and see if the state is also “active”
> there.

In ZooKeeper (through the Solr UI or by connecting to ZK directly) I can see "state":"active" in the state.json, so this seems fine. Stranger still, this is the leader node. Can a leader be in recovery?

> Next, try sending a request directly to that replica. Frankly I’m not sure
> what to expect, but if you get something weird that’d be a “smoking gun” that
> no matter what the admin UI says, the replica isn’t really active. Something
> like “http://blah blah
> blah/solr/collection1_shard1_replica_n1?q=some_query&distrib=false. The
> “distrib=false” is important, otherwise the request will be forwarded to a
> truly active node.

The request works fine, and I don't see anything weird in the logs at that time.

I will investigate further and take a look at everything you mentioned.

Kind regards,
Gaël
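
For reference, the two checks discussed in this thread (reading replica states out of state.json and querying one replica with distrib=false) could be scripted roughly as below. This is only a sketch under assumptions: the function names are hypothetical, the state.json layout (collection, then "shards", then "replicas", each with "core", "state" and an optional "leader" field) is assumed from what a typical SolrCloud state.json looks like, and the "/select" handler is assumed since the URL quoted above leaves the handler out. Nothing here talks to Solr or ZooKeeper; it only builds the URL and parses text you have already fetched.

```python
import json
from urllib.parse import urlencode


def direct_query_url(base_url, core, query):
    """Build a query URL that only the named core answers (distrib=false)."""
    # rows=0 keeps the response small; we only care whether the core responds.
    params = urlencode({"q": query, "distrib": "false", "rows": 0})
    return f"{base_url.rstrip('/')}/{core}/select?{params}"


def replica_states(state_json_text, collection):
    """Return {core_name: (state, is_leader)} for each replica in state.json."""
    shards = json.loads(state_json_text)[collection]["shards"]
    result = {}
    for shard in shards.values():
        for replica in shard["replicas"].values():
            # "leader" is assumed to be present only on the leader replica.
            is_leader = str(replica.get("leader", "false")).lower() == "true"
            result[replica["core"]] = (replica["state"], is_leader)
    return result
```

The URL from direct_query_url can be opened in a browser or passed to curl, and any core that replica_states reports with a state other than "active" is worth a closer look, whatever the admin UI shows.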