Hi Doug,

Sounds fishy, especially increasing/decreasing mergeFactor to "funny values"
(try raising the OS open-file limit instead, e.g. with ulimit -n).

My guess is this is happening only with the 2 indices that are being modified 
and I'll guess that the FNFE is due to a bad/incomplete rsync from the master.  
Do snappuller logs mention any errors?
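
Both traces below show the directory listing coming back null ("list()
returned null" and "files: null").  In plain Java, File.list() returns null
when the path is not a directory or an I/O error occurs, so it may be worth
checking whether the index directory is missing or unreadable at that moment.
Here is a minimal sketch of such a check, not Solr or Lucene code: the class
name IndexDirCheck and the three status strings are made up for illustration,
and the "segments" name test is only my understanding of how Lucene looks for
a commit point.

```java
import java.io.File;

public class IndexDirCheck {

    /**
     * Returns a rough status for a directory that should hold a Lucene index.
     * The status strings are hypothetical labels for this sketch only.
     */
    public static String describe(File dir) {
        // File.list() returns null if dir is not a directory or an I/O error occurs.
        String[] names = dir.list();
        if (names == null) {
            return "unreadable";
        }
        for (String name : names) {
            // Assumption: a usable index has a file named "segments" or
            // "segments_N"; "segments.gen" alone does not count.
            if (name.startsWith("segments") && !name.equals("segments.gen")) {
                return "ok";
            }
        }
        return "no-segments";
    }

    public static void main(String[] args) {
        // Print one status line per directory given on the command line.
        for (String path : args) {
            System.out.println(path + ": " + describe(new File(path)));
        }
    }
}
```

Running it against the directories named in the traces (for example
/home/dsteiger/local/solr/cores/qab/data/index) around the time the cron job
fires might show whether the directory briefly disappears or is left without
a segments file.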

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch

----- Original Message ----
From: Doug Steigerwald <[EMAIL PROTECTED]>
To: solr-user@lucene.apache.org
Sent: Tuesday, April 1, 2008 4:12:25 PM
Subject: java.io.FileNotFoundException?

We just started hitting a FileNotFoundException for no apparent reason for
both our regular index and our spellchecker index, and only a few minutes
after we restarted Solr.  I did some searching and didn't find much that
helped.

We started to do some load testing, and after about 10 minutes we started 
getting these errors.

We hit the spellchecker every request through a SpellcheckComponent that we
created (i.e., code ripped out of SpellCheckRequestHandler for now).  It runs
essentially the same code as the spellcheck request handler when we specify
a parameter (spellcheck=true).

We have 34 cores.  All but two cores are fully optimized (haven't been
updated in 2 months).  Only two cores are actively updated.  We started Solr
around 11:45am, and not much happened until 12:27, when we started load
testing (just a few queries, maybe 100 updates).

find /home/dsteiger/local/solr/cores/*/data/index|wc -l  => 414
find /home/dsteiger/local/solr/cores/*/data/spell|wc -l  => 6

(Only the two 'active' cores use the spell checker.)  So, not many files
are open.

Does anyone have any idea what might cause the two errors below?  When I
restarted Solr around 11:45am, it was to test a new patch that set the
mergeFactor in the Lucene spellchecker to 2 instead of 300, because we kept
running into 'too many files open' errors when rebuilding more than one spell
index at a time.  The spell indexes were rebuilt manually using the
mergeFactor of 300, Solr was restarted, and any subsequent rebuild of the
spell index would use a mergeFactor of 2.

After we hit this error, I rebuilt the spell indexes with the new code,
replicated them to the slave, and restarted Solr, and all has been well.  We
ran the load testing for more than an hour and the issue hasn't returned.

Could the old spell indexes that were created using the high mergeFactor
cause an issue like this somehow?  Could the opening and closing of searchers
so fast cause this?  I don't have the slightest idea.  All of our search
queries hit the slave, and the master just handles updates.  The master had
no issues through all of this.

Caused by: java.io.IOException: cannot read directory org.apache.lucene.store.FSDirectory@/home/dsteiger/local/solr/cores/qaa/data/spell: list() returned null
    at org.apache.lucene.index.SegmentInfos.getCurrentSegmentGeneration(SegmentInfos.java:115)
    at org.apache.lucene.index.IndexReader.indexExists(IndexReader.java:506)
    at org.apache.lucene.search.spell.SpellChecker.setSpellIndex(SpellChecker.java:102)
    at org.apache.lucene.search.spell.SpellChecker.<init>(SpellChecker.java:89)


And this happened, I believe, when running the snapinstaller (done through
cron)...

Caused by: java.io.FileNotFoundException: no segments* file found in org.apache.lucene.store.FSDirectory@/home/dsteiger/local/solr/cores/qab/data/index: files: null
    at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:587)
    at org.apache.lucene.index.DirectoryIndexReader.open(DirectoryIndexReader.java:63)
    at org.apache.lucene.index.IndexReader.open(IndexReader.java:209)
    at org.apache.lucene.index.IndexReader.open(IndexReader.java:173)
    at org.apache.solr.search.SolrIndexSearcher.<init>(SolrIndexSearcher.java:93)
    at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:706)

We're running r614955.

Thanks.
Doug



