Re: solr synonyms behaviour

swarag Tue, 15 Jul 2008 11:31:21 -0700


matt connolly wrote:
> 
> You won't have the multiple word problem if you use synonyms at index time
> instead of query time.
> 
> 
> swarag wrote:
>> 
>> Here is a basic example of some synonyms in my synonyms.txt:
>> club=>club,bar,night cabaret
>> bar=>bar,club
>> 
>> As you can see, a search for 'bar' will return any documents with 'bar'
>> or 'club' in the name. This works fine. However, a search for 'club'
>> SHOULD return any documents with 'club', 'bar' or 'night cabaret' in the
>> name, but it does not. It only returns 'bar' and 'club'.  
>> 
>> Interestingly, a search for 'night cabaret' gives me all 'night
>> cabaret's, 'bar's and 'club's...which is quite unexpected since I'm using
>> uni-directional synonym config (using the => symbol)
>> 
>> Does your config give you my desired behavior?
>> 
> 
>


Is there something I am missing here? This is an excerpt from my schema.xml:
<fieldType name="text" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
            ignoreCase="true" expand="false"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true"
            words="stopwords.txt" enablePositionIncrements="true"/>
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="1" generateNumberParts="1" catenateWords="1"
            catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EnglishPorterFilterFactory"
            protected="protwords.txt"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true"
            words="stopwords.txt"/>
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="1" generateNumberParts="1" catenateWords="0"
            catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EnglishPorterFilterFactory"
            protected="protwords.txt"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
</fieldType>

To my understanding, this means I am using synonyms at index time and NOT
query time. And yet, I am still having these problems with synonyms.
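
A minimal sketch of how the two rules quoted above might play out at index time. It assumes a deliberately simplified model in which an explicit `=>` rule replaces a matching left-hand token with every entry on the right-hand side, and tokens with no matching rule pass through unchanged. This is a toy model, not Solr's SynonymFilter: it ignores token positions, stemming, stop words and phrase queries, and the document names, `expand`, `build_index` and `search` are all made up for illustration.

```python
# Toy model only: NOT Solr's SynonymFilter, just the replace-on-match idea.
# The two rules from synonyms.txt, with multi-word entries split into tokens.
RULES = {
    ("club",): [("club",), ("bar",), ("night", "cabaret")],
    ("bar",): [("bar",), ("club",)],
}


def expand(tokens, rules=RULES):
    """Replace each token sequence that matches a left-hand side with all
    right-hand-side tokens; tokens with no matching rule pass through."""
    out = []
    i = 0
    max_len = max(len(lhs) for lhs in rules)
    while i < len(tokens):
        for n in range(max_len, 0, -1):  # prefer the longest match
            key = tuple(tokens[i:i + n])
            if len(key) == n and key in rules:
                for alt in rules[key]:
                    out.extend(alt)
                i += n
                break
        else:
            # No rule matched at this position: keep the token as-is.
            out.append(tokens[i])
            i += 1
    return out


def build_index(docs):
    """Map each indexed term to the set of document names containing it."""
    index = {}
    for name in docs:
        for term in expand(name.lower().split()):
            index.setdefault(term, set()).add(name)
    return index


# Hypothetical document names, one per kind of venue.
DOCS = ["Blue Club", "Corner Bar", "Night Cabaret Paris"]
INDEX = build_index(DOCS)


def search(term):
    """Look up a single, already-lowercased term in the toy index."""
    return sorted(INDEX.get(term, set()))


print(search("club"))
print(search("night"))
```

In this model a lookup for 'club' reaches the club and bar documents but not the night cabaret one, because no rule has 'night cabaret' on its left-hand side, so that document is indexed without 'club'. A lookup for 'night', on the other hand, also reaches the club document, since 'club' was expanded to include 'night cabaret' when it was indexed. If the real filter behaves like this, a rule with 'night cabaret' on the left-hand side would presumably be needed for the 'club' search to find those documents.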

