Looks like after 900 seconds, it times out and starts up. I think the issue is that I'm using the bin/solr start/stop script, and it waits only 5 seconds before sending a kill -9. In my experience with Solr 4.10.x on HDFS, that is not enough time for a large shard to stop; I've seen it take well over a minute. I'm not sure whether the index is going to be missing data, or whether it will be corrupt at this point.
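As a minimal sketch of a more patient stop sequence, here is the general "ask nicely, wait, then force" pattern in Python. This is not what bin/solr does internally; the function names, the use of SIGTERM as the polite request, and the 120-second default are all assumptions for illustration, and the right timeout depends on how long your shards take to close.

```python
import os
import signal
import time


def is_running(pid):
    """Return True if a process with this pid still exists."""
    try:
        # If pid is our own child, reap it so it does not linger as a zombie.
        reaped, _ = os.waitpid(pid, os.WNOHANG)
        if reaped == pid:
            return False
    except ChildProcessError:
        pass  # not our child; fall through to the signal-0 probe
    try:
        os.kill(pid, 0)  # signal 0 only checks for existence
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # exists, but is owned by another user
    return True


def stop_gracefully(pid, timeout_s=120.0, poll_s=0.5):
    """Send SIGTERM, wait up to timeout_s, then fall back to SIGKILL.

    Returns True if the process exited on its own, False if it was killed.
    The 120 s default is an assumed value, not a Solr setting.
    """
    os.kill(pid, signal.SIGTERM)
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if not is_running(pid):
            return True
        time.sleep(poll_s)
    os.kill(pid, signal.SIGKILL)  # last resort, equivalent to kill -9
    return False
```

The pid would come from wherever your setup records it; a False return means the process had to be force-killed, which is the case where leftover lock files or unrecovered leases seem most likely.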

-Joe

On 4/6/2015 1:35 PM, Joseph Obernberger wrote:
Having a couple issues with restarts of a 27 shard cluster using SolrCloud 5.0.0 and HDFS. I'm getting errors that a lock file exists and the shard will not start. When I delete the file, that shard starts OK.

On another shard, I'm getting the following message:
538220 [coreLoadExecutor-5-thread-1] INFO org.apache.solr.util.FSHDFSUtils – recoverLease=false, attempt=466 on file=hdfs://nameservice1:8020/solr5/MAINCOLL/core_node8/data/tlog/tlog.0000000000000002971 after 526067ms

It has been doing this for 526 seconds and doesn't seem to be coming up. I've tried restarting it several times, but it seems to be stuck in an infinite retry loop. Help!
Thank you.

-Joe

