Hi,

Note that RemoveDuplicatesTokenFilterFactory "filters out any tokens which are at the same logical position in the tokenstream as a previous token with the same text."
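To make the "same logical position" condition concrete, here is a minimal Python sketch of that rule. It is only a simulation, not the Lucene/Solr API: the `(text, position_increment)` token tuples and the function name `remove_duplicates_same_position` are assumptions made up for illustration. A position increment of 0 means the token is stacked on the same position as the previous one (as a stemmer or synonym filter might produce).

```python
from typing import List, Set, Tuple

# Hypothetical token model for illustration: (text, position_increment).
# An increment of 0 means "same logical position as the previous token".
Token = Tuple[str, int]


def remove_duplicates_same_position(tokens: List[Token]) -> List[Token]:
    """Drop a token only if the same text was already seen at the current position."""
    kept: List[Token] = []
    seen_at_position: Set[str] = set()
    for text, pos_inc in tokens:
        if pos_inc > 0:
            # The stream moved to a new position, so forget earlier texts.
            seen_at_position = set()
        if text in seen_at_position:
            # Same text, same position: this is the only case that is removed.
            continue
        seen_at_position.add(text)
        kept.append((text, pos_inc))
    return kept


# "men in black are real men": every token advances the position,
# so the second "men" is at a different position and is kept.
sentence = [("men", 1), ("in", 1), ("black", 1), ("are", 1), ("real", 1), ("men", 1)]

# Tokens stacked on one position: the repeated "men" is dropped.
stacked = [("men", 1), ("men", 0), ("man", 0)]

print(remove_duplicates_same_position(sentence))
print(remove_duplicates_same_position(stacked))
```

In this sketch the first stream comes back unchanged, while the second loses only the repeated "men" that shares a position with the first one.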
So if you have "men in black are real men" then RemoveDuplicatesTokenFilterFactory will not remove the duplicate "men", because the two occurrences are at different positions in the token stream. This may or may not be your problem.

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch

----- Original Message ----
> From: KLessou <[EMAIL PROTECTED]>
> To: solr-user@lucene.apache.org
> Sent: Wednesday, October 1, 2008 9:48:28 AM
> Subject: termFreq always = 1 ?
>
> Hi,
>
> I want to index a list of keywords.
>
> When I search "k1_en:men", I find a lot of documents like that :
>
> DocA :
> (k1_en = man;men;Men;business... termFreq=2)
> DocB :
> (k1_en = man;Men;business... termFreq=1)
> DocC :
> ...
> DocD :
> ...
> DocE :
> ...
>
> But I don't want to have a different termFreq for DocA & DocB.
>
> I try RemoveDuplicatesTokenFilterFactory but it doesn't seem to help me :-/
>
> ignoreCase="true"/>
> protected="protwords.txt" />
> generateWordParts="0"
> generateNumberParts="0"
> catenateWords="0"
> catenateNumbers="0"
> catenateAll="0"
> />
> />
>
> ignoreCase="true"/>
> protected="protwords.txt" />
> generateWordParts="0"
> generateNumberParts="0"
> catenateWords="0"
> catenateNumbers="0"
> catenateAll="0"
> />
>
> ...
>
> required="false" />
>
> If you have any idea, thx in advance.
>
> --
> ~~~~~
> | klessou |
> ~~~~~