Cool! I've bookmarked it, much more thorough....

Erick

On Sat, Jul 11, 2015 at 8:13 AM, Walter Underwood <wun...@wunderwood.org> wrote:
> Thanks, this is very helpful.
>
> Suggester config is quite underdocumented. It took me longer than I expected
> to get it working.
>
> wunder
> Walter Underwood
> wun...@wunderwood.org
> http://observer.wunderwood.org/  (my blog)
>
>
> On Jul 10, 2015, at 6:30 PM, Alessandro Benedetti 
> <benedetti.ale...@gmail.com> wrote:
>
>> Hi guys,
>> I just wrote a blog post that builds on Erick's post and explains all the
>> main Lookup implementations in detail, with practical examples:
>>
>> http://alexbenedetti.blogspot.co.uk/2015/07/solr-you-complete-me.html
>>
>> I think this can help Edwin finally fix the config for the
>> FreeTextSuggester (which I have finally clarified, Erick, thanks to Mike's
>> answer on the dev list, plus deep code analysis and testing :) )
>>
>> Cheers
>>
>> 2015-06-27 23:51 GMT+01:00 Alessandro Benedetti <benedetti.ale...@gmail.com>:
>>
>>> Thanks, Erick, I didn't have time to go through the code again,
>>> but I will forward this to the dev list.
>>> Thank you for your time!
>>>
>>> Cheers
>>>
>>> 2015-06-27 16:19 GMT+01:00 Erick Erickson <erickerick...@gmail.com>:
>>>
>>>> Alessandro:
>>>>
>>>> I'm going to have to defer to Mike McCandless et al.; they're the
>>>> authorities here. I don't know whether they monitor this list, so
>>>> consider the dev list?
>>>>
>>>> Best,
>>>> Erick
>>>>
>>>> On Fri, Jun 26, 2015 at 4:53 AM, Alessandro Benedetti
>>>> <benedetti.ale...@gmail.com> wrote:
>>>>> Up. Could anyone kindly take a look at my considerations regarding the
>>>>> FreeText Suggester?
>>>>> I am curious to gain more insight.
>>>>> Eventually I will analyse the code in depth to understand my errors.
>>>>>
>>>>> Cheers
>>>>>
>>>>> 2015-06-19 11:53 GMT+01:00 Alessandro Benedetti <benedetti.ale...@gmail.com>:
>>>>>
>>>>>> Actually, the documentation is not clear enough.
>>>>>> Let's try to understand this suggester.
>>>>>>
>>>>>> *Building*
>>>>>> This suggester builds an FST, which it uses to provide the autocomplete
>>>>>> feature by running prefix searches on it.
>>>>>> The terms it uses to generate the FST are the tokens produced by the
>>>>>> "suggestFreeTextAnalyzerFieldType".
>>>>>>
>>>>>> And this should be correct.
>>>>>> So, to keep it simple, if we have a shingle token filter [1-3] in our
>>>>>> analysis (so unigrams are produced as well), then from these original
>>>>>> field values:
>>>>>> "mp3 ipod"
>>>>>> "mp3 player"
>>>>>> "mp3 player ipod"
>>>>>> "player of Real"
>>>>>>
>>>>>> -> we produce this list of possible suggestions in our FST:
>>>>>>
>>>>>> <mp3>
>>>>>> <player>
>>>>>> <ipod>
>>>>>> <real>
>>>>>> <of>
>>>>>>
>>>>>> <mp3 ipod>
>>>>>> <mp3 player>
>>>>>> <player ipod>
>>>>>> <player of>
>>>>>> <of real>
>>>>>>
>>>>>> <mp3 player ipod>
>>>>>> <player of real>
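
The list above can be reproduced with a small sketch. This is a minimal Python illustration, not Solr's ShingleFilter: it assumes plain whitespace tokenization and lowercasing (inferred from `<real>` in the list), and `shingles` is a hypothetical helper name.

```python
def shingles(text, min_size=1, max_size=3):
    """Return all word n-grams of `text` with min_size..max_size tokens."""
    # Assumption: whitespace tokenization plus lowercasing.
    tokens = text.lower().split()
    result = []
    for size in range(min_size, max_size + 1):
        for start in range(len(tokens) - size + 1):
            result.append(" ".join(tokens[start:start + size]))
    return result


field_values = ["mp3 ipod", "mp3 player", "mp3 player ipod", "player of Real"]

# Collect the distinct shingles across all field values.
dictionary = set()
for value in field_values:
    dictionary.update(shingles(value))

# Print unigrams first, then bigrams, then trigrams.
for entry in sorted(dictionary, key=lambda s: (len(s.split()), s)):
    print(f"<{entry}>")
```

With the four field values from this message, the set holds the same 12 entries listed above: 5 unigrams, 5 bigrams and 2 trigrams.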
>>>>>>
>>>>>> From the documentation I read :
>>>>>>
>>>>>>> " ngrams: The max number of tokens out of which singles will be make
>>>>>>> the dictionary. The default value is 2. Increasing this would mean you
>>>>>>> want more than the previous 2 tokens to be taken into consideration
>>>>>>> when making the suggestions. "
>>>>>>
>>>>>>
>>>>>> This confuses me, as I was not expecting this param to affect the
>>>>>> suggestion dictionary.
>>>>>> So I would like a clarification here from our masters :)
>>>>>> At this point, let's see what happens at query time.
>>>>>>
>>>>>> *Query Time*
>>>>>> As I understand it, the ngrams param makes the suggester consider only
>>>>>> the last N-1 space-separated tokens the user typed.
>>>>>>
>>>>>>> "Builds an ngram model from the text sent to {@link #build} and
>>>>>>> predicts based on the last grams-1 tokens in the request sent to
>>>>>>> {@link #lookup}. This tries to handle the "long tail" of suggestions
>>>>>>> for when the incoming query is a never before seen query string."
>>>>>>
>>>>>>
>>>>>> Example: grams=3 should consider only the last 2 tokens
>>>>>>
>>>>>> special mp3 p -> mp3 p
>>>>>>
>>>>>> Then this query is analysed using the
>>>>>> "suggestFreeTextAnalyzerFieldType".
>>>>>> We produce 3 tokens:
>>>>>> <mp3>
>>>>>> <p>
>>>>>> <mp3 p>
>>>>>>
>>>>>> And we run the prefix matching on the FST .
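
The query-time steps described above can be sketched as follows. This only illustrates the understanding stated in this message (which its author says may be wrong somewhere), not what Lucene's FreeTextSuggester actually does. The helpers `last_tokens`, `query_shingles` and `prefix_matches` are hypothetical, whitespace tokenization is assumed, and a plain list stands in for the FST.

```python
def last_tokens(query, grams):
    """Keep only the last grams-1 whitespace-separated tokens of the query."""
    tokens = query.split()
    keep = max(grams - 1, 0)
    return tokens[-keep:] if keep else []


def query_shingles(tokens, max_size):
    """Return all n-grams of the token list with 1..max_size tokens."""
    result = []
    for size in range(1, max_size + 1):
        for start in range(len(tokens) - size + 1):
            result.append(" ".join(tokens[start:start + size]))
    return result


def prefix_matches(prefix, dictionary):
    """Naive stand-in for an FST prefix search: scan a plain list."""
    return [entry for entry in dictionary if entry.startswith(prefix)]


# A few entries taken from the suggestion list built earlier in this message.
dictionary = ["mp3", "player", "mp3 ipod", "mp3 player", "mp3 player ipod"]

context = last_tokens("special mp3 p", grams=3)
query_tokens = query_shingles(context, max_size=3)

print(" ".join(context))
print(query_tokens)
print(prefix_matches("mp3 p", dictionary))
```

In this sketch "special" is dropped, the remaining text yields the three tokens listed above, and the prefix "mp3 p" matches only the entries beginning with "mp3 player".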
>>>>>>
>>>>>> *Conclusion*
>>>>>> My understanding must be wrong at some point, as the behaviour I get
>>>>>> is different.
>>>>>> Can we discuss and clarify this, and eventually put it in the official
>>>>>> documentation?
>>>>>>
>>>>>> Cheers
>>>>>>
>>>>>> 2015-06-19 6:40 GMT+01:00 Zheng Lin Edwin Yeo <edwinye...@gmail.com>:
>>>>>>
>>>>>>> I'm implementing an auto-suggest feature in Solr, and I'd like to
>>>>>>> achieve the following:
>>>>>>>
>>>>>>> For example, if the user enters "mp3", Solr might suggest "mp3 player",
>>>>>>> "mp3 nano" and "mp3 music".
>>>>>>> When the user enters "mp3 p", the suggestions should narrow down to
>>>>>>> "mp3 player".
>>>>>>>
>>>>>>> Currently, when I type "mp3 p", the suggester returns words that
>>>>>>> start with the letter "p" only, so I'm getting results like "plan",
>>>>>>> "production", etc.; it does not take the "mp3" token into
>>>>>>> consideration.
>>>>>>>
>>>>>>> I'm using Solr 5.1 and below is my configuration:
>>>>>>>
>>>>>>> In solrconfig.xml:
>>>>>>>
>>>>>>> <searchComponent name="suggest" class="solr.SuggestComponent">
>>>>>>>   <lst name="suggester">
>>>>>>>     <str name="lookupImpl">FreeTextLookupFactory</str>
>>>>>>>     <str name="indexPath">suggester_freetext_dir</str>
>>>>>>>     <str name="dictionaryImpl">DocumentDictionaryFactory</str>
>>>>>>>     <str name="field">Suggestion</str>
>>>>>>>     <str name="weightField">Project</str>
>>>>>>>     <str name="suggestFreeTextAnalyzerFieldType">suggestType</str>
>>>>>>>     <int name="ngrams">5</int>
>>>>>>>     <str name="buildOnStartup">false</str>
>>>>>>>     <str name="buildOnCommit">false</str>
>>>>>>>   </lst>
>>>>>>> </searchComponent>
>>>>>>>
>>>>>>>
>>>>>>> In schema.xml
>>>>>>>
>>>>>>> <fieldType name="suggestType" class="solr.TextField"
>>>>>>>            positionIncrementGap="100">
>>>>>>>   <analyzer type="index">
>>>>>>>     <charFilter class="solr.PatternReplaceCharFilterFactory"
>>>>>>>                 pattern="[^a-zA-Z0-9]" replacement=" " />
>>>>>>>     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>>>>>>>     <filter class="solr.ShingleFilterFactory" minShingleSize="2"
>>>>>>>             maxShingleSize="6" outputUnigrams="false"/>
>>>>>>>   </analyzer>
>>>>>>>   <analyzer type="query">
>>>>>>>     <charFilter class="solr.PatternReplaceCharFilterFactory"
>>>>>>>                 pattern="[^a-zA-Z0-9]" replacement=" " />
>>>>>>>     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>>>>>>>     <filter class="solr.ShingleFilterFactory" minShingleSize="2"
>>>>>>>             maxShingleSize="6" outputUnigrams="true"/>
>>>>>>>   </analyzer>
>>>>>>> </fieldType>
>>>>>>>
>>>>>>>
>>>>>>> Is there anything that I configured wrongly?
>>>>>>>
>>>>>>>
>>>>>>> Regards,
>>>>>>> Edwin
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> --------------------------
>>>>>>
>>>>>> Benedetti Alessandro
>>>>>> Visiting card : http://about.me/alessandro_benedetti
>>>>>>
>>>>>> "Tyger, tyger burning bright
>>>>>> In the forests of the night,
>>>>>> What immortal hand or eye
>>>>>> Could frame thy fearful symmetry?"
>>>>>>
>>>>>> William Blake - Songs of Experience -1794 England
>>>>>>
>>>>
>>>
>>
>
