Hi,

I've run into another problem using solr.DictionaryCompoundWordTokenFilterFactory :-( If I search for the German word "Spargelcremesuppe", which is made up of "Spargel", "Creme" and "Suppe", SOLR returns way too many results. That's because SOLR finds EVERY entry that contains any one of the three words :-(
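
To show what I think is going on, here is a rough Python sketch. It is only a simplified illustration, not Lucene's actual algorithm: the dictionary contents, the sample entries and the helper names (decompose, matches_any) are made up for this example, and the synonym, stop word and stemming filters are left out.

```python
# Simplified illustration of dictionary-based compound splitting.
# NOT Lucene's real implementation; names and data are assumptions.

DICTIONARY = {"spargel", "creme", "suppe"}  # assumed contents of dictionary.txt


def decompose(token, dictionary, min_word_size=5,
              min_subword_size=2, max_subword_size=15):
    """Return the original token followed by the dictionary subwords found in it."""
    tokens = [token]
    if len(token) < min_word_size:
        return tokens
    lowered = token.lower()
    for start in range(len(lowered) - min_subword_size + 1):
        longest = None
        max_len = min(max_subword_size, len(lowered) - start)
        for length in range(min_subword_size, max_len + 1):
            candidate = lowered[start:start + length]
            if candidate in dictionary:
                longest = candidate  # keep only the longest match at this offset
        if longest is not None:
            tokens.append(longest)
    return tokens


def matches_any(query_tokens, doc_tokens):
    """True if the two token lists share at least one token (OR semantics)."""
    query_set = {t.lower() for t in query_tokens}
    doc_set = {t.lower() for t in doc_tokens}
    return bool(query_set & doc_set)


query_tokens = decompose("Spargelcremesuppe", DICTIONARY)

# Made-up index entries, each analyzed the same way as the query.
entries = ["Tomatensuppe", "Spargel", "Schokoladencreme", "Kartoffelsalat"]
hits = [e for e in entries
        if matches_any(query_tokens, decompose(e, DICTIONARY))]

print(query_tokens)
print(hits)
```

In this sketch the query turns into the original word plus "spargel", "creme" and "suppe", so every entry that shares just one of those parts counts as a hit. Only "Kartoffelsalat" is left out.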

Here is the relevant field type from my schema.xml:

<fieldType name="text_text" class="solr.TextField" positionIncrementGap="100">
    <analyzer>
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <filter class="solr.DictionaryCompoundWordTokenFilterFactory"
                dictionary="dictionary.txt"
                minWordSize="5"
                minSubwordSize="2"
                maxSubwordSize="15"
                onlyLongestMatch="true"/>
        <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
        <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
        <filter class="solr.SnowballPorterFilterFactory" language="German"/>
    </analyzer>
</fieldType>

Any help?

Greets,

Ralf Kraus
