Hi Doug,

Sounds fishy, especially increasing/decreasing mergeFactor to "funny values" (try changing your OS setting instead).
My guess is this is happening only with the 2 indices that are being modified, and I'll guess that the FNFE is due to a bad/incomplete rsync from the master. Do snappuller logs mention any errors?

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch

----- Original Message ----
From: Doug Steigerwald <[EMAIL PROTECTED]>
To: solr-user@lucene.apache.org
Sent: Tuesday, April 1, 2008 4:12:25 PM
Subject: java.io.FileNotFoundException?

We just started hitting a FileNotFoundException for no real apparent reason for both our regular index and our spellchecker index, and only a few minutes after we restarted Solr. I did some searching and didn't find much that helped.

We started to do some load testing, and after about 10 minutes we started getting these errors. We hit the spellchecker every request through a SpellcheckComponent that we created (i.e., code ripped out of SpellCheckRequestHandler for now). It runs essentially the same code as the spellcheck request handler when we specify a parameter (spellcheck=true).

We have 34 cores. All but two cores are fully optimized (haven't been updated in 2 months). Only two cores are actively updated. We started Solr around 11:45am, and not much happened until 12:27, when we started load testing (just a few queries, maybe 100 updates).

find /home/dsteiger/local/solr/cores/*/data/index|wc -l => 414
find /home/dsteiger/local/solr/cores/*/data/spell|wc -l => 6 (only the two 'active' cores use the spell checker).

So, not many files are open.

Anyone have any idea what might cause the two errors below to happen?

When I restarted Solr around 11:45am it was to test a new patch that set the mergeFactor in the lucene spellchecker to 2 instead of 300, because we kept running into 'too many files open' errors when rebuilding more than one spell index at a time. The spell indexes were rebuilt manually using the mergeFactor of 300, Solr was restarted, and any subsequent rebuild of the spell index would use a mergeFactor of 2.
After we hit this error, I rebuilt the spell indexes with the new code, replicated them to the slave, restarted Solr, and all has been well. We ran the load testing for more than an hour and the issue hasn't returned.

Could the old spell indexes that were created using the high mergeFactor cause an issue like this somehow? Could the opening and closing of searchers so fast cause this? I don't have the slightest idea. All of our search queries hit the slave, and the master just handles updates. The master had no issues through all of this.

Caused by: java.io.IOException: cannot read directory org.apache.lucene.store.FSDirectory@/home/dsteiger/local/solr/cores/qaa/data/spell: list() returned null
    at org.apache.lucene.index.SegmentInfos.getCurrentSegmentGeneration(SegmentInfos.java:115)
    at org.apache.lucene.index.IndexReader.indexExists(IndexReader.java:506)
    at org.apache.lucene.search.spell.SpellChecker.setSpellIndex(SpellChecker.java:102)
    at org.apache.lucene.search.spell.SpellChecker.<init>(SpellChecker.java:89)

And this happened, I believe, when running the snapinstaller (done through cron)...

Caused by: java.io.FileNotFoundException: no segments* file found in org.apache.lucene.store.FSDirectory@/home/dsteiger/local/solr/cores/qab/data/index: files: null
    at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:587)
    at org.apache.lucene.index.DirectoryIndexReader.open(DirectoryIndexReader.java:63)
    at org.apache.lucene.index.IndexReader.open(IndexReader.java:209)
    at org.apache.lucene.index.IndexReader.open(IndexReader.java:173)
    at org.apache.solr.search.SolrIndexSearcher.<init>(SolrIndexSearcher.java:93)
    at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:706)

We're running r614955.

Thanks.
Doug
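
Both traces report a null directory listing ("list() returned null", "files: null"). Java's File.list() returns null when the path is not a directory or an I/O error occurs, so one thing worth ruling out is whether the index directories on the slave exist and hold a segments* file at the moment a searcher opens. A minimal sketch of such a check follows; check_index_dir is a hypothetical helper, not part of Solr or its distribution scripts, and it only inspects the filesystem rather than diagnosing the rsync itself.

```shell
#!/bin/sh
# Hypothetical helper (not part of Solr): report whether a Lucene index
# directory looks openable, mirroring the two failures in the traces above.
check_index_dir() {
    dir="$1"
    if [ ! -d "$dir" ]; then
        # A path that is not a directory is one way a listing comes back null.
        echo "missing"
        return 1
    fi
    # Lucene needs at least one segments* file to open an index.
    for f in "$dir"/segments*; do
        if [ -e "$f" ]; then
            echo "ok"
            return 0
        fi
    done
    echo "no-segments"
    return 1
}

# Example use against the layout quoted in the message (paths are Doug's):
# for d in /home/dsteiger/local/solr/cores/*/data/index; do
#     printf '%s: %s\n' "$d" "$(check_index_dir "$d")"
# done
```

Running this in a loop while snapinstaller fires from cron could show whether a directory briefly disappears or lacks a segments* file during the swap, which would be consistent with the incomplete-rsync guess above.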