Re: Synonyms problem

Steve Rowe Fri, 29 Mar 2013 09:50:55 -0700

The XPath expressions used to collect the charFilter sequence, the tokenizer, 
and the token filter sequence are evaluated independently of each other - see 
lines #244 through #251:


<http://svn.apache.org/viewvc/lucene/dev/tags/lucene_solr_4_2_0/solr/core/src/java/org/apache/solr/schema/FieldTypePluginLoader.java?view=markup#l232>
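
To illustrate what "evaluated independently" means here, a minimal Python
sketch (not the Solr code itself): each component type is selected with its
own query, so where a <filter> sits relative to the <tokenizer> in the XML has
no effect on what gets collected. The element paths and the collect_chain
helper are my own illustration and an assumption about the general approach,
not copied from FieldTypePluginLoader.

```python
import xml.etree.ElementTree as ET

# Analyzer with a filter declared before the tokenizer, as in the quoted schema.xml.
ANALYZER_XML = """
<analyzer type="index">
  <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
          ignoreCase="true" expand="true"/>
  <tokenizer class="solr.WhitespaceTokenizerFactory"/>
  <filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
"""


def collect_chain(analyzer_xml):
    """Hypothetical helper: collect each component type with its own query.

    The three lookups never see each other's results, so the position of a
    <filter> relative to the <tokenizer> is not taken into account.
    """
    analyzer = ET.fromstring(analyzer_xml)
    char_filters = [e.get("class") for e in analyzer.findall("./charFilter")]
    tokenizers = [e.get("class") for e in analyzer.findall("./tokenizer")]
    filters = [e.get("class") for e in analyzer.findall("./filter")]
    return char_filters, tokenizers, filters


char_filters, tokenizers, filters = collect_chain(ANALYZER_XML)
print("charFilters:", char_filters)
print("tokenizers: ", tokenizers)
print("filters:    ", filters)
```

If the loader works along these lines, a filter declared before the tokenizer
is simply collected together with the other filters, which would explain why
the schema loads without an error.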

Steve

On Mar 29, 2013, at 12:37 PM, Walter Underwood <wun...@wunderwood.org> wrote:

> Also, all the filters need to be after the tokenizer. There are two synonym 
> filters specified, one before the tokenizer and one after.
> 
> I'm surprised that works at all. Shouldn't that be a fatal error when loading 
> the config?
> 
> wunder
> 
> On Mar 29, 2013, at 9:33 AM, Thomas Krämer | ontopica wrote:
> 
>> Hi Plamen
>> 
>> You should set expand to true during
>> 
>> <analyzer type="index">
>> ....
>> <filter class="solr.SynonymFilterFactory" synonyms="index_synonyms.txt"
>>             ignoreCase="true" expand="true"/>
>> 
>> 
>> ...
>> 
>> Greetings,
>> 
>> Thomas
>> 
>> On 29.03.2013 17:16, Plamen Mihaylov wrote:
>>> Hey guys,
>>> 
>>> I have the following problem: I have a website with sports players, and I am
>>> using Solr to index their data. I have defined synonyms like: NY, New York.
>>> When I search for New York, 145 results are found, but when I search
>>> for NY, 142 results are found. Why is there a difference, and how can I fix
>>> it?
>>> 
>>> Configuration snippets:
>>> 
>>> synonyms.txt
>>> 
>>> ...
>>> NY, New York
>>> ...
>>> 
>>> ------
>>> schema.xml
>>> 
>>> ...
>>>        <fieldType name="text" class="solr.TextField"
>>> positionIncrementGap="100">
>>>           <analyzer type="index">
>>>               <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
>>>                   ignoreCase="true" expand="true"/>
>>>               <tokenizer class="solr.WhitespaceTokenizerFactory" />
>>>               <!-- we will only use synonyms at query time <filter
>>> class="solr.SynonymFilterFactory" synonyms="index_synonyms.txt"
>>>                   ignoreCase="true" expand="false"/> -->
>>> 
>>>               <filter class="solr.StopFilterFactory" ignoreCase="true"
>>> words="stopwords.txt" enablePositionIncrements="true" />
>>>               <filter class="solr.WordDelimiterFilterFactory"
>>> generateWordParts="1" generateNumberParts="1" catenateWords="1"
>>>                   catenateNumbers="1" catenateAll="0"
>>> splitOnCaseChange="1" />
>>>               <filter class="solr.LowerCaseFilterFactory" />
>>>               <filter class="solr.PhoneticFilterFactory"
>>> encoder="DoubleMetaphone" inject="true" />
>>>               <filter class="solr.RemoveDuplicatesTokenFilterFactory" />
>>>               <filter class="solr.LengthFilterFactory" min="2" max="100"
>>> />
>>>               <!-- <filter class="solr.SnowballPorterFilterFactory"
>>> language="English" /> -->
>>>           </analyzer>
>>>           <analyzer type="query">
>>>               <filter class="solr.SynonymFilterFactory"
>>> synonyms="synonyms.txt" ignoreCase="true" expand="true" />
>>>               <tokenizer class="solr.WhitespaceTokenizerFactory" />
>>> 
>>>               <filter class="solr.StopFilterFactory" ignoreCase="true"
>>> words="stopwords.txt" />
>>>               <filter class="solr.WordDelimiterFilterFactory"
>>> generateWordParts="1" generateNumberParts="1" catenateWords="0"
>>>                   catenateNumbers="0" catenateAll="0" />
>>>               <filter class="solr.LowerCaseFilterFactory" />
>>>               <!-- <filter class="solr.EnglishPorterFilterFactory"
>>> protected="protwords.txt"/> -->
>>>               <filter class="solr.RemoveDuplicatesTokenFilterFactory" />
>>>               <filter class="solr.StopFilterFactory" ignoreCase="true"
>>> words="letterstops.txt" enablePositionIncrements="true" />
>>>           </analyzer>
>>>       </fieldType>
>>> 
>>> 
>>> Thanks in advance.
>>> Plamen
>>> 
>> 
>> 
>> -- 
>> 
>> ontopica GmbH
>> Prinz-Albert-Str. 2b
>> 53113 Bonn
>> Germany
>> fon: +49-228-227229-22
>> fax: +49-228-227229-77
>> web: http://www.ontopica.de
>> ontopica GmbH
>> Registered office: Bonn
>> 
>> Managing directors: Thomas Krämer, Christoph Okpue
>> Commercial register: Amtsgericht Bonn, HRB 17852
>> 
>> 
> 
> --
> Walter Underwood
> wun...@wunderwood.org
> 
> 
>
