Hi, I'm trying to integrate the Lucene-based spellchecker (http://wiki.apache.org/jakarta-lucene/SpellChecker, plus contrib/spellchecker under Lucene) with Solr (http://issues.apache.org/jira/browse/SOLR-81) in order to provide a query spellchecking service (you enter Speers and it suggests pant^H^H ... Spears). I've created a generic NGramTokenizer (+ NGramTokenizerFactory + unit test) that I'll attach to SOLR-81 shortly.
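To make "n-gramming" concrete, here is a minimal sketch in plain Java with no Lucene dependency. The class name NGramSketch and the method ngrams are hypothetical, made up for illustration; this is not the NGramTokenizer mentioned above, just the basic idea of splitting a word into overlapping character n-grams.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not the NGramTokenizer attached to SOLR-81.
public class NGramSketch {

    // Returns every contiguous substring of length n, in order of position.
    // A word shorter than n yields an empty list.
    public static List<String> ngrams(String word, int n) {
        if (n <= 0) {
            throw new IllegalArgumentException("n must be positive");
        }
        List<String> grams = new ArrayList<>();
        for (int i = 0; i + n <= word.length(); i++) {
            grams.add(word.substring(i, i + n));
        }
        return grams;
    }

    public static void main(String[] args) {
        System.out.println(ngrams("spears", 3)); // prints [spe, pea, ear, ars]
        System.out.println(ngrams("speers", 3)); // prints [spe, pee, eer, ers]
    }
}
```

The misspelled and the correct word share the 3-gram "spe", which is the kind of overlap an n-gram index can match on before a string-distance measure ranks the candidates.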
What I'm not yet sure about is:

1) How to integrate this generic n-grammer with the Lucene SpellChecker code, the SpellChecker & TRStringDistance classes in particular.
2) How to map the n-gram Tokens that come out of my NGramTokenizer to specific field names, like 3start, 4start, gram1, gram2, gram3... Is there a schema.xml trick one can use to accomplish this?
3) Once 2) is done, how to get the... request handler(?) to n-gram the query appropriately and hit the SpellChecker index to try and find alternative spelling suggestions.

Damn, that's a lot of unknowns... On top of that, my computer started freezing every half an hour. Hi Murphy.

Any pointers will be greatly appreciated.

Thanks,
Otis