Re: Solr suggest is related to second letter, not to initial letter

Michael Sokolov Sun, 15 Feb 2015 17:55:23 -0800

StandardTokenizer splits your text into tokens, and the suggester suggests tokens independently. It sounds as if you want the suggestions to be based on the entire text (not just the current word), and that only adjacent words in the original should appear as suggestions. Assuming that's what you are after (it's a little hard to tell from your e-mail -- you might want to clarify by providing a few examples of how you *do* want it to work instead of just examples of how you *don't* want it to work), you have a couple of choices:

1) Don't use StandardTokenizer; use KeywordTokenizer instead. This will preserve the entire original text and suggest complete texts, rather than words.

2) Maybe consider using a shingle filter along with the standard tokenizer, so that your tokens include multi-word shingles.

3) Use a suggester with better support for a statistical language model, like the one described at http://blog.mikemccandless.com/2014/01/finding-long-tail-suggestions-using.html. To do this you will probably need to do some Java programming, since it isn't well integrated into Solr.
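
Options 1 and 2 could look roughly like the field types below. This is only a sketch, not a tested configuration: the type names suggest_whole_text and suggest_shingles, the lowercase filter, and the shingle sizes are assumptions chosen for illustration. The tokenizer and filter factories themselves are standard Solr classes.

```xml
<!-- Option 1 (sketch): KeywordTokenizer keeps each field value as a single token,
     so the suggester can only offer complete texts. The type name is hypothetical. -->
<fieldType name="suggest_whole_text" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

<!-- Option 2 (sketch): StandardTokenizer plus a shingle filter, which emits
     single words and adjacent two-word shingles. Sizes are illustrative assumptions. -->
<fieldType name="suggest_shingles" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.ShingleFilterFactory" minShingleSize="2" maxShingleSize="2"
            outputUnigrams="true"/>
  </analyzer>
</fieldType>
```

With the shingle type, a made-up value such as "adidas superstar" would be indexed as the terms adidas, superstar, and adidas superstar, so word pairs that never occur together in a document do not exist as terms. Whether the suggester then treats a two-word input as a single phrase depends on how the query side is analyzed, so that part would need testing.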


-Mike

On 2/14/2015 3:44 AM, Volkan Altan wrote:

Any idea?

On 12 Feb 2015, at 11:12, Volkan Altan <volkanal...@gmail.com> wrote:

Hello Everyone,

What I want from the Solr suggester is for the suggestions offered for the 
second word, typed after the first word, to be related to that first word. 
Instead, just like the first word, the second word is completed independently.


Example:
http://localhost:8983/solr/solr/suggest?q=facet_suggest_data:"adidas+s"

adidas s

<response>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">4</int>
</lst>
<lst name="spellcheck">
<lst name="suggestions">
<lst name="s">
<int name="numFound">1</int>
<int name="startOffset">27</int>
<int name="endOffset">28</int>
<arr name="suggestion">
<str>samsung</str>
</arr>
</lst>
<lst name="collation">
<str name="collationQuery">facet_suggest_data:"adidas samsung"</str>
<int name="hits">0</int>
<lst name="misspellingsAndCorrections">
<str name="adidas">adidas</str>
<str name="s">samsung</str>
</lst>
</lst>
</lst>
</lst>
</response>


The terms "Adidas" and "Samsung" occur in separate documents. There is no 
document in which both of them appear.

How can I solve that problem?



schema.xml

<fieldType name="suggestions_type" class="solr.TextField" 
positionIncrementGap="100">
             <analyzer type="index">
                 <charFilter class="solr.HTMLStripCharFilterFactory"/>
                 <tokenizer class="solr.StandardTokenizerFactory"/>
                 <filter class="solr.ApostropheFilterFactory"/>
                 <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" 
ignoreCase="true" expand="false"/>
                 <filter class="solr.StopFilterFactory" ignoreCase="true" 
words="stopwords.txt" enablePositionIncrements="true" />
                 <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
             </analyzer>
             <analyzer type="query">
                 <charFilter class="solr.HTMLStripCharFilterFactory"/>
                 <tokenizer class="solr.StandardTokenizerFactory"/>
                 <filter class="solr.ApostropheFilterFactory"/>
                 <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
             </analyzer>
         </fieldType>

<field name="facet_suggest_data" type="suggestions_type" indexed="true" multiValued="true" 
stored="false" omitNorms="true"/>


Best
