The XPath expressions used to collect the charFilter sequence, the tokenizer, and the token filter sequence are evaluated independently of each other, which is why a filter declared before the tokenizer in the XML still loads without error. See lines 244 through 251:
<http://svn.apache.org/viewvc/lucene/dev/tags/lucene_solr_4_2_0/solr/core/src/java/org/apache/solr/schema/FieldTypePluginLoader.java?view=markup#l232>

Steve

On Mar 29, 2013, at 12:37 PM, Walter Underwood <wun...@wunderwood.org> wrote:

> Also, all the filters need to be after the tokenizer. There are two synonym
> filters specified, one before the tokenizer and one after.
>
> I'm surprised that works at all. Shouldn't that be a fatal error when loading
> the config?
>
> wunder
>
> On Mar 29, 2013, at 9:33 AM, Thomas Krämer | ontopica wrote:
>
>> Hi Plamen
>>
>> You should set expand to true during
>>
>> <analyzer type="index">
>>   ....
>>   <filter class="solr.SynonymFilterFactory" synonyms="index_synonyms.txt"
>>           ignoreCase="true" expand="true"/>
>>
>>
>> ...
>>
>> Greetings,
>>
>> Thomas
>>
>> On 29.03.2013 at 17:16, Plamen Mihaylov wrote:
>>> Hey guys,
>>>
>>> I have the following problem - I have a website with sport players, where
>>> I am using Solr to index their data. I have defined synonyms like: NY, New York.
>>> When I search for New York - there are 145 results found, but when I search
>>> for NY - there are 142 results found. Why is there a difference and how can I
>>> fix this?
>>>
>>> Configuration snippets:
>>>
>>> synonyms.txt
>>>
>>> ...
>>> NY, New York
>>> ...
>>>
>>> ------
>>> schema.xml
>>>
>>> ...
>>> <fieldType name="text" class="solr.TextField"
>>>            positionIncrementGap="100">
>>>   <analyzer type="index">
>>>     <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
>>>             ignoreCase="true" expand="true"/>
>>>     <tokenizer class="solr.WhitespaceTokenizerFactory" />
>>>     <!-- we will only use synonyms at query time <filter
>>>     class="solr.SynonymFilterFactory" synonyms="index_synonyms.txt"
>>>     ignoreCase="true" expand="false"/> -->
>>>
>>>     <filter class="solr.StopFilterFactory" ignoreCase="true"
>>>             words="stopwords.txt" enablePositionIncrements="true" />
>>>     <filter class="solr.WordDelimiterFilterFactory"
>>>             generateWordParts="1" generateNumberParts="1" catenateWords="1"
>>>             catenateNumbers="1" catenateAll="0"
>>>             splitOnCaseChange="1" />
>>>     <filter class="solr.LowerCaseFilterFactory" />
>>>     <filter class="solr.PhoneticFilterFactory"
>>>             encoder="DoubleMetaphone" inject="true" />
>>>     <filter class="solr.RemoveDuplicatesTokenFilterFactory" />
>>>     <filter class="solr.LengthFilterFactory" min="2" max="100"
>>>             />
>>>     <!-- <filter class="solr.SnowballPorterFilterFactory"
>>>     language="English" /> -->
>>>   </analyzer>
>>>   <analyzer type="query">
>>>     <filter class="solr.SynonymFilterFactory"
>>>             synonyms="synonyms.txt" ignoreCase="true" expand="true" />
>>>     <tokenizer class="solr.WhitespaceTokenizerFactory" />
>>>
>>>     <filter class="solr.StopFilterFactory" ignoreCase="true"
>>>             words="stopwords.txt" />
>>>     <filter class="solr.WordDelimiterFilterFactory"
>>>             generateWordParts="1" generateNumberParts="1" catenateWords="0"
>>>             catenateNumbers="0" catenateAll="0" />
>>>     <filter class="solr.LowerCaseFilterFactory" />
>>>     <!-- <filter class="solr.EnglishPorterFilterFactory"
>>>     protected="protwords.txt"/> -->
>>>     <filter class="solr.RemoveDuplicatesTokenFilterFactory" />
>>>     <filter class="solr.StopFilterFactory" ignoreCase="true"
>>>             words="letterstops.txt" enablePositionIncrements="true" />
>>>   </analyzer>
>>> </fieldType>
>>>
>>>
>>> Thanks in advance.
>>> Plamen
>>>
>>
>>
>> --
>>
>> ontopica GmbH
>> Prinz-Albert-Str. 2b
>> 53113 Bonn
>> Germany
>> phone: +49-228-227229-22
>> fax: +49-228-227229-77
>> web: http://www.ontopica.de
>> ontopica GmbH
>> Registered office: Bonn
>>
>> Managing directors: Thomas Krämer, Christoph Okpue
>> Commercial register: Amtsgericht Bonn, HRB 17852
>>
>
> --
> Walter Underwood
> wun...@wunderwood.org
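
For anyone who wants to see the effect of the independent evaluation mentioned at the top of this message, here is a rough sketch in Python rather than the actual Solr code. It assumes the three expressions are simple child selectors of the form ./charFilter, ./tokenizer and ./filter relative to the analyzer element; the function name collect_chain and the shortened analyzer snippet are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Shortened version of the index analyzer from the thread: the synonym
# filter is declared BEFORE the tokenizer.
ANALYZER_XML = """
<analyzer type="index">
  <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
          ignoreCase="true" expand="true"/>
  <tokenizer class="solr.WhitespaceTokenizerFactory"/>
  <filter class="solr.StopFilterFactory" ignoreCase="true"
          words="stopwords.txt"/>
  <filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
""".strip()


def collect_chain(analyzer):
    """Collect each kind of child with its own query (hypothetical helper).

    The three queries are assumptions modelled on the description above,
    not a copy of what FieldTypePluginLoader does.
    """
    char_filters = [e.get("class") for e in analyzer.findall("./charFilter")]
    tokenizers = [e.get("class") for e in analyzer.findall("./tokenizer")]
    filters = [e.get("class") for e in analyzer.findall("./filter")]
    return char_filters, tokenizers, filters


analyzer = ET.fromstring(ANALYZER_XML)
char_filters, tokenizers, filters = collect_chain(analyzer)

print("charFilters:", char_filters)
print("tokenizer:  ", tokenizers)
print("filters:    ", filters)
```

Because each query only looks at one element name, the tokenizer is found wherever it sits, and the filters come back in document order among themselves. Under that assumption, the synonym filter in the config above simply ends up first in the filter sequence, after the tokenizer.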