On 17 Dec 2009 at 12:42, Shalin Shekhar Mangar wrote:

> 2009/12/17 Steinar Asbjørnsen <steinar...@gmail.com>
>
>> Hi all.
>>
>> I have a delicate problem with two words that are rather similar in
>> the way they are typed, but completely different in meaning.
>> The actual words are restaurant (as in restaurant) and restaurering (as in
>> restoration).
>>
>> Solr seems to think these words are similar enough to present hits on both
>> of them in the same search result.
>> Obviously this is not desirable.
>>
>> Is there a way to take care of such specific cases without disabling Solr
>> functionality for stemming and/or plurals?
>> Or would I need to disable stemming to make this special case disappear?
>>
>
> For specific cases like this, you can add the word to a file and specify it
> in the schema, for example:
>
> <filter class="solr.SnowballPorterFilterFactory" language="English"
>         protected="protwords.txt"/>

Thank you, Shalin.

This is my schema.xml file:

    <fieldType name="text" class="solr.TextField" positionIncrementGap="100">
      <analyzer type="index">
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <filter class="solr.StopFilterFactory" ignoreCase="true"
                words="stopwords.txt" enablePositionIncrements="true"/>
        <filter class="solr.WordDelimiterFilterFactory"
                generateWordParts="1" generateNumberParts="1"
                catenateWords="1" catenateNumbers="1" catenateAll="0"
                splitOnCaseChange="1"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.EnglishPorterFilterFactory" protected="protwords.txt"/>
        <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
      </analyzer>
      <analyzer type="query">
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
                ignoreCase="true" expand="true"/>
        <filter class="solr.StopFilterFactory" ignoreCase="true"
                words="stopwords.txt"/>
        <filter class="solr.WordDelimiterFilterFactory"
                generateWordParts="1" generateNumberParts="1"
                catenateWords="0" catenateNumbers="0" catenateAll="0"
                splitOnCaseChange="1"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.EnglishPorterFilterFactory" protected="protwords.txt"/>
        <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
      </analyzer>
    </fieldType>

I added restaurant and restaurering to protwords.txt and restarted Tomcat, but no dice.

Do I need to use the SnowballPorterFilterFactory?
And do I need to reindex the documents?

Steinar
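
For reference, here is a minimal sketch of what the suggested change might look like in this schema. It is an illustration under assumptions, not a confirmed fix: language="Norwegian" is a guess based on the word "restaurering" (Shalin's example used "English"), and the protwords.txt contents shown in the comment are only the two words mentioned above. The same filter line would replace the stemmer in both the index and the query analyzer.

```xml
<!-- Hypothetical replacement for the stemming filter in both analyzers.
     language="Norwegian" is an assumption and does not come from the thread. -->
<filter class="solr.SnowballPorterFilterFactory" language="Norwegian"
        protected="protwords.txt"/>

<!-- Assumed contents of protwords.txt, one term per line. The terms are
     lowercase because LowerCaseFilterFactory runs before the stemmer:
       restaurant
       restaurering
-->
```

Because the stemmer also runs in the index analyzer, documents indexed before such a change would still hold the previously stemmed tokens, so the effect would presumably only show up after those documents are reindexed.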