Thank you so much. I'll read up on that and try that out.

Regards,
Edwin

On 12 July 2015 at 00:41, Erick Erickson <erickerick...@gmail.com> wrote:

> Cool! I've bookmarked it, much more thorough....
>
> Erick
>
> On Sat, Jul 11, 2015 at 8:13 AM, Walter Underwood <wun...@wunderwood.org> wrote:
> > Thanks, this is very helpful.
> >
> > Suggester config is quite under-documented. It took me longer than I
> > expected to get it working.
> >
> > wunder
> > Walter Underwood
> > wun...@wunderwood.org
> > http://observer.wunderwood.org/ (my blog)
> >
> > On Jul 10, 2015, at 6:30 PM, Alessandro Benedetti <benedetti.ale...@gmail.com> wrote:
> >
> >> Hi guys,
> >> I just wrote a blog post to integrate Erick's post and to explain in detail,
> >> with practical examples, all the main Lookup implementations:
> >>
> >> http://alexbenedetti.blogspot.co.uk/2015/07/solr-you-complete-me.html
> >>
> >> I think this can be useful for Edwin to finally fix the config for the
> >> FreeTextSuggester (which I finally clarified, Erick, thanks to Mike's answer
> >> in dev, and deep code analysis and testing :) )
> >>
> >> Cheers
> >>
> >> 2015-06-27 23:51 GMT+01:00 Alessandro Benedetti <benedetti.ale...@gmail.com>:
> >>
> >>> Thanks, Erick, I didn't have time to go through the code again.
> >>> But I will forward this to the Dev list.
> >>> Thank you for your time!
> >>>
> >>> Cheers
> >>>
> >>> 2015-06-27 16:19 GMT+01:00 Erick Erickson <erickerick...@gmail.com>:
> >>>
> >>>> Alessandro:
> >>>>
> >>>> Going to have to defer to Mike McCandless et al., they're the
> >>>> authorities here. Don't quite know whether they monitor this list,
> >>>> consider the dev list?
> >>>>
> >>>> Best,
> >>>> Erick
> >>>>
> >>>> On Fri, Jun 26, 2015 at 4:53 AM, Alessandro Benedetti
> >>>> <benedetti.ale...@gmail.com> wrote:
> >>>>> Up. Can anyone kindly take a look at my considerations regarding the
> >>>>> FreeText Suggester?
> >>>>> I am curious to have more insight.
> >>>>> Eventually I will deeply analyse the code to understand my errors.
> >>>>>
> >>>>> Cheers
> >>>>>
> >>>>> 2015-06-19 11:53 GMT+01:00 Alessandro Benedetti <benedetti.ale...@gmail.com>:
> >>>>>
> >>>>>> Actually the documentation is not clear enough.
> >>>>>> Let's try to understand this suggester.
> >>>>>>
> >>>>>> *Building*
> >>>>>> This suggester builds an FST that it will use to provide the autocomplete
> >>>>>> feature by running prefix searches on it.
> >>>>>> The terms it uses to generate the FST are the tokens produced by the
> >>>>>> "suggestFreeTextAnalyzerFieldType".
> >>>>>>
> >>>>>> And this should be correct.
> >>>>>> So, to keep it simple, if we have a shingle token filter [1-3] (we produce
> >>>>>> unigrams as well) in our analysis, from these original field values:
> >>>>>> "mp3 ipod"
> >>>>>> "mp3 player"
> >>>>>> "mp3 player ipod"
> >>>>>> "player of Real"
> >>>>>>
> >>>>>> -> we produce this list of possible suggestions in our FST:
> >>>>>>
> >>>>>> <mp3>
> >>>>>> <player>
> >>>>>> <ipod>
> >>>>>> <real>
> >>>>>> <of>
> >>>>>>
> >>>>>> <mp3 ipod>
> >>>>>> <mp3 player>
> >>>>>> <player ipod>
> >>>>>> <player of>
> >>>>>> <of real>
> >>>>>>
> >>>>>> <mp3 player ipod>
> >>>>>> <player of real>
> >>>>>>
> >>>>>> From the documentation I read:
> >>>>>>
> >>>>>>> " ngrams: The max number of tokens out of which singles will be make the
> >>>>>>> dictionary. The default value is 2. Increasing this would mean you want
> >>>>>>> more than the previous 2 tokens to be taken into consideration when making
> >>>>>>> the suggestions. "
> >>>>>>
> >>>>>> This confuses me, as I was not expecting this param to affect the
> >>>>>> suggestion dictionary.
> >>>>>> So I would like a clarification here from our masters :)
> >>>>>> At this point let's see what happens at query time.
> >>>>>>
> >>>>>> *Query Time*
> >>>>>> As I understand it, with the ngrams param set to N, only the last N-1
> >>>>>> tokens the user typed, separated by the space separator, are considered.
> >>>>>>
> >>>>>>> "Builds an ngram model from the text sent to {@link #build} and
> >>>>>>> predicts based on the last grams-1 tokens in the request sent to
> >>>>>>> {@link #lookup}. This tries to handle the "long tail" of suggestions
> >>>>>>> for when the incoming query is a never before seen query string."
> >>>>>>
> >>>>>> Example: grams=3 should consider only the last 2 tokens
> >>>>>>
> >>>>>> special mp3 p -> mp3 p
> >>>>>>
> >>>>>> Then this query is analysed using the "suggestFreeTextAnalyzerFieldType".
> >>>>>> We produce 3 tokens:
> >>>>>> <mp3>
> >>>>>> <p>
> >>>>>> <mp3 p>
> >>>>>>
> >>>>>> And we run the prefix matching on the FST.
> >>>>>>
> >>>>>> *Conclusion*
> >>>>>> My understanding is surely wrong at some point, as the behaviour I get
> >>>>>> is different.
> >>>>>> Can we discuss this, clarify it and eventually put it in the official
> >>>>>> documentation?
> >>>>>>
> >>>>>> Cheers
> >>>>>>
> >>>>>> 2015-06-19 6:40 GMT+01:00 Zheng Lin Edwin Yeo <edwinye...@gmail.com>:
> >>>>>>
> >>>>>>> I'm implementing an auto-suggest feature in Solr, and I'd like to achieve
> >>>>>>> the following:
> >>>>>>>
> >>>>>>> For example, if the user enters "mp3", Solr might suggest "mp3 player",
> >>>>>>> "mp3 nano" and "mp3 music".
> >>>>>>> When the user enters "mp3 p", the suggestions should narrow down to
> >>>>>>> "mp3 player".
> >>>>>>>
> >>>>>>> Currently, when I type "mp3 p", the suggester returns only words that
> >>>>>>> start with the letter "p", and I'm getting results like "plan",
> >>>>>>> "production", etc. It does not take the "mp3" token into
> >>>>>>> consideration.
> >>>>>>>
> >>>>>>> I'm using Solr 5.1 and below is my configuration:
> >>>>>>>
> >>>>>>> In solrconfig.xml:
> >>>>>>>
> >>>>>>> <searchComponent name="suggest" class="solr.SuggestComponent">
> >>>>>>>   <lst name="suggester">
> >>>>>>>     <str name="lookupImpl">FreeTextLookupFactory</str>
> >>>>>>>     <str name="indexPath">suggester_freetext_dir</str>
> >>>>>>>     <str name="dictionaryImpl">DocumentDictionaryFactory</str>
> >>>>>>>     <str name="field">Suggestion</str>
> >>>>>>>     <str name="weightField">Project</str>
> >>>>>>>     <str name="suggestFreeTextAnalyzerFieldType">suggestType</str>
> >>>>>>>     <int name="ngrams">5</int>
> >>>>>>>     <str name="buildOnStartup">false</str>
> >>>>>>>     <str name="buildOnCommit">false</str>
> >>>>>>>   </lst>
> >>>>>>> </searchComponent>
> >>>>>>>
> >>>>>>> In schema.xml:
> >>>>>>>
> >>>>>>> <fieldType name="suggestType" class="solr.TextField" positionIncrementGap="100">
> >>>>>>>   <analyzer type="index">
> >>>>>>>     <charFilter class="solr.PatternReplaceCharFilterFactory"
> >>>>>>>                 pattern="[^a-zA-Z0-9]" replacement=" " />
> >>>>>>>     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
> >>>>>>>     <filter class="solr.ShingleFilterFactory" minShingleSize="2"
> >>>>>>>             maxShingleSize="6" outputUnigrams="false"/>
> >>>>>>>   </analyzer>
> >>>>>>>   <analyzer type="query">
> >>>>>>>     <charFilter class="solr.PatternReplaceCharFilterFactory"
> >>>>>>>                 pattern="[^a-zA-Z0-9]" replacement=" " />
> >>>>>>>     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
> >>>>>>>     <filter class="solr.ShingleFilterFactory" minShingleSize="2"
> >>>>>>>             maxShingleSize="6" outputUnigrams="true"/>
> >>>>>>>   </analyzer>
> >>>>>>> </fieldType>
> >>>>>>>
> >>>>>>> Is there anything that I configured wrongly?
> >>>>>>>
> >>>>>>> Regards,
> >>>>>>> Edwin
> >>>>>>
> >>>>>> --
> >>>>>> --------------------------
> >>>>>>
> >>>>>> Benedetti Alessandro
> >>>>>> Visiting card : http://about.me/alessandro_benedetti
> >>>>>>
> >>>>>> "Tyger, tyger burning bright
> >>>>>> In the forests of the night,
> >>>>>> What immortal hand or eye
> >>>>>> Could frame thy fearful symmetry?"
> >>>>>>
> >>>>>> William Blake - Songs of Experience -1794 England
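
For anyone reading this thread later, here is a minimal Python sketch of the model Alessandro describes in his "Building" and "Query Time" notes: shingles of 1 to 3 tokens form the dictionary, the query is cut to its last grams-1 tokens, and a prefix match runs over the dictionary. This is only an illustration of that description, which Alessandro himself says does not match the behaviour he observed. It is not the Lucene FreeTextSuggester code; it uses a plain set instead of an FST, ignores weights and scoring, and all function names are hypothetical.

```python
def shingles(text, min_size=1, max_size=3, sep=" "):
    """Return every run of min_size..max_size adjacent tokens (lowercased)."""
    tokens = text.lower().split()
    out = []
    for size in range(min_size, max_size + 1):
        for start in range(len(tokens) - size + 1):
            out.append(sep.join(tokens[start:start + size]))
    return out


def build_dictionary(values, max_size=3):
    """Collect the shingles of all field values (a set stands in for the FST)."""
    entries = set()
    for value in values:
        entries.update(shingles(value, 1, max_size))
    return entries


def last_tokens(query, grams):
    """Keep only the last grams-1 tokens of the query, as the Javadoc quote says."""
    tokens = query.lower().split()
    keep = grams - 1
    return tokens[-keep:] if keep > 0 else []


def prefix_matches(dictionary, prefix):
    """Return dictionary entries that start with the given prefix, sorted."""
    return sorted(entry for entry in dictionary if entry.startswith(prefix))


# The four field values from the thread.
values = ["mp3 ipod", "mp3 player", "mp3 player ipod", "player of Real"]
dictionary = build_dictionary(values)

# "special mp3 p" with grams=3 is reduced to "mp3 p" before matching.
context = " ".join(last_tokens("special mp3 p", grams=3))
print(context)
print(prefix_matches(dictionary, context))
```

With the four values above, the dictionary holds the same 12 entries listed in the thread, and the reduced query "mp3 p" matches "mp3 player" and "mp3 player ipod" in this toy model.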