Actually the documentation is not clear enough. Let's try to understand this suggester.
*Building*

This suggester builds an FST and uses it to provide the autocomplete feature by running prefix searches on it. The terms used to generate the FST are the tokens produced by the "suggestFreeTextAnalyzerFieldType". This part should be correct.

To keep it simple, suppose our analysis has a shingle token filter [1-3] (so we produce unigrams as well). From these original field values:

"mp3 ipod"
"mp3 player"
"mp3 player ipod"
"player of Real"

we produce this list of possible suggestions in our FST:

<mp3> <player> <ipod> <real> <of>
<mp3 ipod> <mp3 player> <player ipod> <player of> <of real>
<mp3 player ipod> <player of real>

From the documentation I read:

> "ngrams: The max number of tokens out of which singles will be make the
> dictionary. The default value is 2. Increasing this would mean you want
> more than the previous 2 tokens to be taken into consideration when making
> the suggestions."

This confuses me, as I was not expecting this param to affect the suggestion dictionary. So I would like a clarification here from our masters :)

At this point let's see what happens at query time.

*Query Time*

As I understand it, the ngrams param makes the suggester consider only the last N-1 tokens the user typed, split on the space separator.

> "Builds an ngram model from the text sent to {@link #build} and predicts
> based on the last grams-1 tokens in the request sent to {@link #lookup}.
> This tries to handle the "long tail" of suggestions for when the incoming
> query is a never before seen query string."

Example: with grams=3, only the last 2 tokens should be considered:

special mp3 p -> mp3 p

Then this query is analysed using the "suggestFreeTextAnalyzerFieldType". We produce 3 tokens:

<mp3> <p> <mp3 p>

and we run the prefix matching on the FST.

*Conclusion*

My understanding is surely wrong at some point, as the behaviour I get is different. Can we discuss this, clarify it, and eventually put it in the official documentation?

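To make the model above concrete, here is a minimal Python sketch of what I am describing. It only illustrates my understanding, which, as said, does not match the behaviour I actually get, so it is not how the suggester is really implemented. The helper names (shingles, build_dictionary, query_terms, prefix_matches) are made up for this example, a plain set stands in for the FST, and lowercasing plus whitespace splitting stand in for the real analyzer.

```python
def shingles(tokens, max_size=3):
    """Return every run of 1..max_size consecutive tokens, joined by a space."""
    out = []
    for size in range(1, max_size + 1):
        for start in range(len(tokens) - size + 1):
            out.append(" ".join(tokens[start:start + size]))
    return out


def build_dictionary(field_values, max_size=3):
    """Collect the shingles of all field values (a set stands in for the FST)."""
    terms = set()
    for value in field_values:
        # Assumption: lowercase + whitespace split stands in for the real analyzer.
        terms.update(shingles(value.lower().split(), max_size))
    return terms


def query_terms(query, grams=3, max_size=3):
    """Keep only the last grams-1 tokens of the query, then shingle them."""
    if grams < 2:
        raise ValueError("this sketch assumes grams >= 2")
    last_tokens = query.lower().split()[-(grams - 1):]
    return shingles(last_tokens, max_size)


def prefix_matches(dictionary, prefix):
    """Return the dictionary entries starting with prefix, in sorted order."""
    return sorted(term for term in dictionary if term.startswith(prefix))


dictionary = build_dictionary(
    ["mp3 ipod", "mp3 player", "mp3 player ipod", "player of Real"]
)
for term in query_terms("special mp3 p", grams=3):
    print(term, "->", prefix_matches(dictionary, term))
```

Under this model, "special mp3 p" with grams=3 is reduced to "mp3 p", and the shingle <mp3 p> prefix-matches <mp3 player> and <mp3 player ipod>.
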
Cheers

2015-06-19 6:40 GMT+01:00 Zheng Lin Edwin Yeo <edwinye...@gmail.com>:

> I'm implementing an auto-suggest feature in Solr, and I'd like to achieve
> the following:
>
> For example, if the user enters "mp3", Solr might suggest "mp3 player",
> "mp3 nano" and "mp3 music".
> When the user enters "mp3 p", the suggestion should narrow down to "mp3
> player".
>
> Currently, when I type "mp3 p", the suggester is returning words that
> start with the letter "p" only, and I'm getting results like "plan",
> "production", etc, and it does not take the "mp3" token into consideration.
>
> I'm using Solr 5.1 and below is my configuration:
>
> In solrconfig.xml:
>
> <searchComponent name="suggest" class="solr.SuggestComponent">
>   <lst name="suggester">
>
>     <str name="lookupImpl">FreeTextLookupFactory</str>
>     <str name="indexPath">suggester_freetext_dir</str>
>
>     <str name="dictionaryImpl">DocumentDictionaryFactory</str>
>     <str name="field">Suggestion</str>
>     <str name="weightField">Project</str>
>     <str name="suggestFreeTextAnalyzerFieldType">suggestType</str>
>     <int name="ngrams">5</int>
>     <str name="buildOnStartup">false</str>
>     <str name="buildOnCommit">false</str>
>   </lst>
> </searchComponent>
>
>
> In schema.xml
>
> <fieldType name="suggestType" class="solr.TextField"
>            positionIncrementGap="100">
>   <analyzer type="index">
>     <charFilter class="solr.PatternReplaceCharFilterFactory"
>                 pattern="[^a-zA-Z0-9]" replacement=" " />
>     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>     <filter class="solr.ShingleFilterFactory" minShingleSize="2"
>             maxShingleSize="6" outputUnigrams="false"/>
>   </analyzer>
>   <analyzer type="query">
>     <charFilter class="solr.PatternReplaceCharFilterFactory"
>                 pattern="[^a-zA-Z0-9]" replacement=" " />
>     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>     <filter class="solr.ShingleFilterFactory" minShingleSize="2"
>             maxShingleSize="6" outputUnigrams="true"/>
>   </analyzer>
> </fieldType>
>
>
> Is there anything that I configured wrongly?
>
>
> Regards,
> Edwin
>

--
--------------------------

Benedetti Alessandro
Visiting card : http://about.me/alessandro_benedetti

"Tyger, tyger burning bright
In the forests of the night,
What immortal hand or eye
Could frame thy fearful symmetry?"

William Blake - Songs of Experience -1794 England