I seem to have fixed the problem below with the following patch against lucene-2.2. I'm going to try the lucene 2.3 branch next.

Index: contrib/spellchecker/src/java/org/apache/lucene/search/spell/SpellChecker.java
===================================================================
--- contrib/spellchecker/src/java/org/apache/lucene/search/spell/SpellChecker.java (revision 612882)
+++ contrib/spellchecker/src/java/org/apache/lucene/search/spell/SpellChecker.java (working copy)
@@ -285,7 +285,7 @@
    */
   public void clearIndex() throws IOException {
     IndexReader.unlock(spellIndex);
-    IndexWriter writer = new IndexWriter(spellIndex, null, true);
+    IndexWriter writer = new IndexWriter(spellIndex, null, false);
     writer.close();
   }


Now the IndexWriter won't create a new index every time you rebuild the spellchecker index. Didn't seem to have any issues with the small index I have.

The only issue I have now is with a larger index (not that large, 49k documents): I keep getting errors like the one below when initially building the index, and on every rebuild after that. This happens with and without the patch above.

SEVERE: java.io.FileNotFoundException: /home/dsteiger/local/solr/cores/dsteiger/data/spell/_66.fnm (No such file or directory)
        at java.io.RandomAccessFile.open(Native Method)
        at java.io.RandomAccessFile.<init>(RandomAccessFile.java:212)
        at org.apache.lucene.store.FSDirectory$FSIndexInput$Descriptor.<init>(FSDirectory.java:506)
        at org.apache.lucene.store.FSDirectory$FSIndexInput.<init>(FSDirectory.java:536)
        at org.apache.lucene.store.FSDirectory$FSIndexInput.<init>(FSDirectory.java:531)
        at org.apache.lucene.store.FSDirectory.openInput(FSDirectory.java:440)
        at org.apache.lucene.index.CompoundFileWriter.copyFile(CompoundFileWriter.java:204)
        at org.apache.lucene.index.CompoundFileWriter.close(CompoundFileWriter.java:169)
        at org.apache.lucene.index.SegmentMerger.createCompoundFile(SegmentMerger.java:155)
        at org.apache.lucene.index.IndexWriter.mergeSegments(IndexWriter.java:1970)
        at org.apache.lucene.index.IndexWriter.flushRamSegments(IndexWriter.java:1741)
        at org.apache.lucene.index.IndexWriter.flushRamSegments(IndexWriter.java:1733)
        at org.apache.lucene.index.IndexWriter.maybeFlushRamSegments(IndexWriter.java:1727)
        at org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:1004)
        at org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:983)

Any ideas?

doug

Doug Steigerwald wrote:
It's in the index.  Can see it with a query: q=word:blackjack

And in luke:
<lst name="topTerms">
    <int name="blackjack">29</int>

The actual index data seems to disappear.

First rebuild:
$ ls  spell/
_2.cfs  segments.gen  segments_i

Second rebuild:
$ ls spell
segments_2z  segments.gen
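
The same check as the two ls listings above can be done from code. This is only a rough sketch using the plain JDK, not anything from Solr or Lucene: the class name SpellIndexCheck and its methods are made up for illustration, and it assumes (based on the listings above) that files whose names start with "segments" are bookkeeping only and that anything else in the directory is actual index data.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of Solr or Lucene.
public class SpellIndexCheck {

    // Returns the sorted names of files in dir that do not start with "segments".
    // Assumption: "segments*" files are metadata, everything else is index data.
    public static List<String> dataFiles(Path dir) throws IOException {
        List<String> names = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir)) {
            for (Path p : stream) {
                String name = p.getFileName().toString();
                if (!name.startsWith("segments")) {
                    names.add(name);
                }
            }
        }
        Collections.sort(names);
        return names;
    }

    // True when the directory holds nothing but "segments*" files (or is empty),
    // which is what the second listing above looks like.
    public static boolean looksEmpty(Path dir) throws IOException {
        return dataFiles(dir).isEmpty();
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java SpellIndexCheck <spell-index-dir>");
            return;
        }
        Path dir = Paths.get(args[0]);
        if (looksEmpty(dir)) {
            System.out.println("no data files found in " + dir);
        } else {
            System.out.println("data files: " + dataFiles(dir));
        }
    }
}
```

Running it against the spell directory after each rebuild would show whether the data files are still there without having to query the spellchecker.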

doug

Otis Gospodnetic wrote:
Do you trust the spellchecker 100% (not looking at its source now)? I'd peek at the index with Luke (Luke I trust :)) and see if that term is really there first.

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch

----- Original Message ----
From: Doug Steigerwald <[EMAIL PROTECTED]>
To: solr-user@lucene.apache.org
Sent: Wednesday, January 16, 2008 2:56:35 PM
Subject: Spell checker index rebuild

Having another weird spell checker index issue.  Starting off from a
clean index and spell check index, I'll index everything in example/exampledocs. After the first rebuild of the spellchecker index, the query below says the word 'blackjack' exists in the
spellchecker index.  Great, no problems.

Rebuild it again and the word 'blackjack' does not exist any more.

http://localhost:8983/solr/core0/select?q=blackjack&qt=spellchecker&cmd=rebuild

Any ideas?  This is with a Solr trunk build from yesterday.

doug

