Cool! I've bookmarked it, much more thorough...

Erick

On Sat, Jul 11, 2015 at 8:13 AM, Walter Underwood <wun...@wunderwood.org> wrote:
> Thanks, this is very helpful.
>
> Suggester config is quite under-documented. It took me longer than I expected
> to get it working.
>
> wunder
> Walter Underwood
> wun...@wunderwood.org
> http://observer.wunderwood.org/ (my blog)
>
> On Jul 10, 2015, at 6:30 PM, Alessandro Benedetti
> <benedetti.ale...@gmail.com> wrote:
>
>> Hi guys,
>> I just wrote a blog post to complement Erick's post and to explain in detail,
>> with practical examples, all the main Lookup implementations:
>>
>> http://alexbenedetti.blogspot.co.uk/2015/07/solr-you-complete-me.html
>>
>> I think this can be useful for Edwin to finally fix the config for the
>> FreeTextSuggester (which I finally clarified, Erick, thanks to Mike's answer
>> on dev, and deep code analysis and testing :) )
>>
>> Cheers
>>
>> 2015-06-27 23:51 GMT+01:00 Alessandro Benedetti <benedetti.ale...@gmail.com>:
>>
>>> Thanks, Erick, I didn't have time to go through the code again.
>>> But I will forward this to the dev list.
>>> Thank you for your time!
>>>
>>> Cheers
>>>
>>> 2015-06-27 16:19 GMT+01:00 Erick Erickson <erickerick...@gmail.com>:
>>>
>>>> Alessandro:
>>>>
>>>> Going to have to defer to Mike McCandless et al., they're the
>>>> authorities here. Don't quite know whether they monitor this list,
>>>> consider the dev list?
>>>>
>>>> Best,
>>>> Erick
>>>>
>>>> On Fri, Jun 26, 2015 at 4:53 AM, Alessandro Benedetti
>>>> <benedetti.ale...@gmail.com> wrote:
>>>>> Up. Can anyone kindly take a look at my considerations regarding the
>>>>> FreeText Suggester?
>>>>> I am curious to have more insight.
>>>>> Eventually I will deeply analyse the code to understand my errors.
>>>>>
>>>>> Cheers
>>>>>
>>>>> 2015-06-19 11:53 GMT+01:00 Alessandro Benedetti <benedetti.ale...@gmail.com>:
>>>>>
>>>>>> Actually the documentation is not clear enough.
>>>>>> Let's try to understand this suggester.
>>>>>>
>>>>>> *Building*
>>>>>> This suggester builds an FST that it uses to provide the autocomplete
>>>>>> feature by running prefix searches on it.
>>>>>> The terms it uses to generate the FST are the tokens produced by the
>>>>>> "suggestFreeTextAnalyzerFieldType".
>>>>>>
>>>>>> And this should be correct.
>>>>>> So, to keep it simple, if we have a shingle token filter [1-3] (we
>>>>>> produce unigrams as well) in our analysis, from these original field values:
>>>>>> "mp3 ipod"
>>>>>> "mp3 player"
>>>>>> "mp3 player ipod"
>>>>>> "player of Real"
>>>>>>
>>>>>> -> we produce this list of possible suggestions in our FST:
>>>>>>
>>>>>> <mp3>
>>>>>> <player>
>>>>>> <ipod>
>>>>>> <real>
>>>>>> <of>
>>>>>>
>>>>>> <mp3 ipod>
>>>>>> <mp3 player>
>>>>>> <player ipod>
>>>>>> <player of>
>>>>>> <of real>
>>>>>>
>>>>>> <mp3 player ipod>
>>>>>> <player of real>
>>>>>>
>>>>>> From the documentation I read:
>>>>>>
>>>>>>> "ngrams: The max number of tokens out of which singles will be make the
>>>>>>> dictionary. The default value is 2. Increasing this would mean you want
>>>>>>> more than the previous 2 tokens to be taken into consideration when making
>>>>>>> the suggestions."
>>>>>>
>>>>>> This confuses me, as I was not expecting this param to affect the
>>>>>> suggestion dictionary.
>>>>>> So I would like a clarification here from our masters :)
>>>>>> At this point let's see what happens at query time.
>>>>>>
>>>>>> *Query Time*
>>>>>> As I understand it, the ngrams param makes the suggester consider the
>>>>>> last N-1 tokens the user typed, separated by spaces.
>>>>>>
>>>>>>> "Builds an ngram model from the text sent to {@link
>>>>>>> #build} and predicts based on the last grams-1 tokens in
>>>>>>> the request sent to {@link #lookup}. This tries to
>>>>>>> handle the "long tail" of suggestions for when the
>>>>>>> incoming query is a never before seen query string."
>>>>>>
>>>>>> Example: grams=3 should consider only the last 2 tokens
>>>>>>
>>>>>> special mp3 p -> mp3 p
>>>>>>
>>>>>> Then this query is analysed using the "suggestFreeTextAnalyzerFieldType".
>>>>>> We produce 3 tokens:
>>>>>> <mp3>
>>>>>> <p>
>>>>>> <mp3 p>
>>>>>>
>>>>>> And we run the prefix matching on the FST.
>>>>>>
>>>>>> *Conclusion*
>>>>>> My understanding is surely wrong at some point, as the behaviour I get
>>>>>> is different.
>>>>>> Can we discuss this, clarify it, and eventually put it in the official
>>>>>> documentation?
>>>>>>
>>>>>> Cheers
>>>>>>
>>>>>> 2015-06-19 6:40 GMT+01:00 Zheng Lin Edwin Yeo <edwinye...@gmail.com>:
>>>>>>
>>>>>>> I'm implementing an auto-suggest feature in Solr, and I'd like to achieve
>>>>>>> the following:
>>>>>>>
>>>>>>> For example, if the user enters "mp3", Solr might suggest "mp3 player",
>>>>>>> "mp3 nano" and "mp3 music".
>>>>>>> When the user enters "mp3 p", the suggestions should narrow down to
>>>>>>> "mp3 player".
>>>>>>>
>>>>>>> Currently, when I type "mp3 p", the suggester returns only words that
>>>>>>> start with the letter "p", such as "plan", "production", etc.; it does
>>>>>>> not take the "mp3" token into consideration.
>>>>>>>
>>>>>>> I'm using Solr 5.1 and below is my configuration:
>>>>>>>
>>>>>>> In solrconfig.xml:
>>>>>>>
>>>>>>> <searchComponent name="suggest" class="solr.SuggestComponent">
>>>>>>>   <lst name="suggester">
>>>>>>>     <str name="lookupImpl">FreeTextLookupFactory</str>
>>>>>>>     <str name="indexPath">suggester_freetext_dir</str>
>>>>>>>     <str name="dictionaryImpl">DocumentDictionaryFactory</str>
>>>>>>>     <str name="field">Suggestion</str>
>>>>>>>     <str name="weightField">Project</str>
>>>>>>>     <str name="suggestFreeTextAnalyzerFieldType">suggestType</str>
>>>>>>>     <int name="ngrams">5</int>
>>>>>>>     <str name="buildOnStartup">false</str>
>>>>>>>     <str name="buildOnCommit">false</str>
>>>>>>>   </lst>
>>>>>>> </searchComponent>
>>>>>>>
>>>>>>> In schema.xml:
>>>>>>>
>>>>>>> <fieldType name="suggestType" class="solr.TextField"
>>>>>>>            positionIncrementGap="100">
>>>>>>>   <analyzer type="index">
>>>>>>>     <charFilter class="solr.PatternReplaceCharFilterFactory"
>>>>>>>                 pattern="[^a-zA-Z0-9]" replacement=" " />
>>>>>>>     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>>>>>>>     <filter class="solr.ShingleFilterFactory" minShingleSize="2"
>>>>>>>             maxShingleSize="6" outputUnigrams="false"/>
>>>>>>>   </analyzer>
>>>>>>>   <analyzer type="query">
>>>>>>>     <charFilter class="solr.PatternReplaceCharFilterFactory"
>>>>>>>                 pattern="[^a-zA-Z0-9]" replacement=" " />
>>>>>>>     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>>>>>>>     <filter class="solr.ShingleFilterFactory" minShingleSize="2"
>>>>>>>             maxShingleSize="6" outputUnigrams="true"/>
>>>>>>>   </analyzer>
>>>>>>> </fieldType>
>>>>>>>
>>>>>>> Is there anything that I configured wrongly?
>>>>>>>
>>>>>>> Regards,
>>>>>>> Edwin
>>>>>>
>>>>>> --
>>>>>> --------------------------
>>>>>>
>>>>>> Benedetti Alessandro
>>>>>> Visiting card : http://about.me/alessandro_benedetti
>>>>>>
>>>>>> "Tyger, tyger burning bright
>>>>>> In the forests of the night,
>>>>>> What immortal hand or eye
>>>>>> Could frame thy fearful symmetry?"
>>>>>>
>>>>>> William Blake - Songs of Experience -1794 England
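
For anyone following the Building and Query Time reasoning in the quoted message, here is a minimal Python sketch of that mental model only. It does not call Solr or Lucene and is not how FreeTextSuggester is implemented; the quoted message itself says the observed behaviour differed from this model. The helper names shingles and last_tokens are hypothetical, and lowercasing plus whitespace splitting stand in for the real analyzer chain.

```python
def shingles(text, min_size=1, max_size=3):
    """Return word n-grams (shingles) of min_size..max_size tokens.

    Assumption: lowercasing and whitespace splitting approximate the
    analyzer; the real field type may tokenize differently.
    """
    tokens = text.lower().split()
    out = []
    for size in range(min_size, max_size + 1):
        for start in range(len(tokens) - size + 1):
            out.append(" ".join(tokens[start:start + size]))
    return out


def last_tokens(query, grams):
    """Keep only the last grams-1 whitespace-separated tokens of the query."""
    keep = grams - 1
    if keep <= 0:
        return ""
    return " ".join(query.split()[-keep:])


# Field values from the example in the thread.
values = ["mp3 ipod", "mp3 player", "mp3 player ipod", "player of Real"]

# Candidate dictionary under the described model: every distinct shingle.
dictionary = sorted({s for v in values for s in shingles(v)})

if __name__ == "__main__":
    for entry in dictionary:
        print(entry)
    # With grams=3, only the last two tokens of the query are considered.
    print(last_tokens("special mp3 p", grams=3))
```

Under these assumptions the dictionary holds the twelve entries listed in the message (five unigrams, five bigrams, two trigrams), and "special mp3 p" is cut down to "mp3 p" before analysis.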