Alexander,

You could use a higher value for spellcheck.count, maybe 20 or so, then in your 
application pick out the suggestions that differ from the query only toward 
the right side.
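
As a rough sketch of this first option, the count could be raised in the 
defaults of your /spell_part handler (it can also be passed per request as 
spellcheck.count=20). The value 20 is only a starting point, and the filtering 
by left-side match would still have to happen in your application:

```xml
<lst name="defaults">
  <str name="df">did_you_mean_part</str>
  <str name="spellcheck">on</str>
  <!-- Assumed value: ask for more candidates than needed so the
       application can keep those that preserve the leading characters. -->
  <str name="spellcheck.count">20</str>
</lst>
```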

Another option is to use DirectSolrSpellChecker (usually a better choice 
anyhow) and set the "minPrefix" parameter.  This requires the first n 
characters on the left side to match exactly before it will make suggestions.  
From a quick look at the code, it also seems that it won't try to correct 
anything in this prefix region.  So perhaps you can set this to 2-4 
(default=1).  See 
http://lucene.apache.org/core/4_10_0/suggest/org/apache/lucene/search/spell/DirectSpellChecker.html#setMinPrefix%28int%29
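
A minimal sketch of what that could look like in solrconfig.xml, assuming you 
keep the component name and field from your setup. The minPrefix value of 3 is 
just an example from the 2-4 range, and I have not tested this against your 
index:

```xml
<searchComponent name="spellcheck_part" class="solr.SpellCheckComponent">
  <lst name="spellchecker">
    <str name="classname">solr.DirectSolrSpellChecker</str>
    <str name="field">did_you_mean_part</str>
    <!-- Assumed value: the first 3 characters must match exactly. -->
    <int name="minPrefix">3</int>
  </lst>
</searchComponent>
```

DirectSolrSpellChecker reads terms straight from the main index, so the 
spellcheckIndexDir setting is not needed here.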

James Dyer
Ingram Content Group
(615) 213-4311


-----Original Message-----
From: Lochschmied, Alexander [mailto:alexander.lochschm...@vishay.com] 
Sent: Wednesday, September 24, 2014 9:06 AM
To: solr-user@lucene.apache.org
Subject: Spellchecking and suggesting part numbers

Hello Solr Users,

We are trying to get suggestions for part numbers using the spellchecker.

Problem scenario:

ABCD1234 // This is the search term
ABCE1234 // This is what we get from spellchecker
ABCD1244 // This is what we would like to get from spellchecker

Characters towards the left of our part numbers are more relevant.


The setup is:

        <searchComponent name="spellcheck_part" class="solr.SpellCheckComponent">
                <lst name="spellchecker">
                        <str name="classname">solr.IndexBasedSpellChecker</str>
                        <str name="spellcheckIndexDir">./spellchecker</str>
                        <str name="field">did_you_mean_part</str>
                </lst>
        </searchComponent>
        <requestHandler name="/spell_part" class="solr.SearchHandler" startup="lazy">
                <lst name="defaults">
                        <str name="df">did_you_mean_part</str>
                        <str name="spellcheck">on</str>
                </lst>
                <arr name="last-components">
                        <str>spellcheck_part</str>
                </arr>
        </requestHandler>


        <fieldType name="did_you_mean_part" class="solr.TextField" positionIncrementGap="100">
                <analyzer type="index">
                        <charFilter class="solr.PatternReplaceCharFilterFactory" pattern="[\s]+" replacement=""/>
                        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
                        <filter class="solr.LowerCaseFilterFactory"/>
                        <filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="20" side="front"/>
                        <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
                </analyzer>
                <analyzer type="query">
                        <charFilter class="solr.PatternReplaceCharFilterFactory" pattern="[\s]+" replacement=""/>
                        <tokenizer class="solr.KeywordTokenizerFactory"/>
                        <filter class="solr.LowerCaseFilterFactory"/>
                        <filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="20" side="front"/>
                </analyzer>
        </fieldType>

Can we tweak the setup so that we get more relevant part numbers?

Thanks,
Alexander

