In each of your examples (is each one a document?) I see only 1 "men" instance, so the "men" term frequency should be 1 for that document.
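To make the point about the filter concrete, here is a minimal Python sketch. It is not Solr code, only a simplified model of the behaviour quoted from the RemoveDuplicatesTokenFilterFactory description further down in this thread. The helper names (`tokenize`, `remove_duplicates_same_position`, `term_freq`, `dedupe_keywords`) are hypothetical, the tokenizer just splits on `;` or `,` and lowercases, and no stemming is modelled. The last helper shows one possible workaround, under the assumption that you can clean the keyword list on the client before sending it to Solr.

```python
import re
from collections import Counter


def tokenize(value):
    """Split a keyword field on ';' or ',' and lowercase each part.

    Returns (position, text) pairs, one position per keyword.
    """
    parts = [p.strip().lower() for p in re.split(r"[;,]", value) if p.strip()]
    return list(enumerate(parts))


def remove_duplicates_same_position(tokens):
    """Drop a token only if an earlier token has the same text AND position.

    This mimics the quoted description of the filter: duplicates at
    different positions are left alone.
    """
    seen = set()
    kept = []
    for pos, text in tokens:
        if (pos, text) in seen:
            continue
        seen.add((pos, text))
        kept.append((pos, text))
    return kept


def term_freq(tokens, term):
    """Count how many tokens carry the given text."""
    return Counter(text for _, text in tokens)[term]


def dedupe_keywords(value):
    """Client-side clean-up: keep the first occurrence of each keyword,
    comparing case-insensitively, and re-join with ';'."""
    seen = set()
    kept = []
    for part in re.split(r"[;,]", value):
        key = part.strip().lower()
        if key and key not in seen:
            seen.add(key)
            kept.append(part.strip())
    return ";".join(kept)


raw = "Man,men,business,Men,man"
filtered = remove_duplicates_same_position(tokenize(raw))
print(term_freq(filtered, "men"))                         # duplicates survive
print(dedupe_keywords(raw))                               # cleaned keyword list
print(term_freq(tokenize(dedupe_keywords(raw)), "men"))   # one "men" left
```

In this model the two "men" tokens sit at different positions, so the same-position filter keeps both, while removing repeats before indexing leaves a single "men" per document.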
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch

----- Original Message ----
> From: KLessou <[EMAIL PROTECTED]>
> To: solr-user@lucene.apache.org
> Sent: Wednesday, October 1, 2008 11:43:59 AM
> Subject: Re: termFreq always = 1 ?
>
> Yes this may be my problem,
>
> But is there any solution to have only one "men" keyword indexed when i've
> got something like this :
>
> 1 - k1_en = men;business;Men
> or :
> 2 - k1_en = man,business,men
> or :
> 3 - k1_en = Man,men,business,Men,man
> ...
>
> Thx in advance,
>
> On Wed, Oct 1, 2008 at 5:12 PM, Otis Gospodnetic wrote:
>
> > Hi,
> >
> > Note that RemoveDuplicatesTokenFilterFactory "filters out any tokens which
> > are at the same logical position in the tokenstream as a previous token with
> > the same text."
> >
> > So if you have "men in black are real men" then
> > RemoveDuplicatesTokenFilterFactory will not remove duplicate "men".
> >
> > This may or may not be your problem.
> >
> > Otis
> > --
> > Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
> >
> > ----- Original Message ----
> > > From: KLessou
> > > To: solr-user@lucene.apache.org
> > > Sent: Wednesday, October 1, 2008 9:48:28 AM
> > > Subject: termFreq always = 1 ?
> > >
> > > Hi,
> > >
> > > I want to index a list of keywords.
> > >
> > > When I search "k1_en:men", I find a lot of documents like that :
> > >
> > > DocA :
> > > (k1_en = man;men;Men;business... termFreq=2)
> > > DocB :
> > > (k1_en = man;Men;business... termFreq=1)
> > > DocC :
> > > ...
> > > DocD :
> > > ...
> > > DocE :
> > > ...
> > >
> > > But I don't want to have a different termFreq for DocA & DocB.
> > >
> > > I try RemoveDuplicatesTokenFilterFactory but it doesn't seem to help me
> > > :-/
> > >
> > > ignoreCase="true"/>
> > > protected="protwords.txt" />
> > > generateWordParts="0"
> > > generateNumberParts="0"
> > > catenateWords="0"
> > > catenateNumbers="0"
> > > catenateAll="0"
> > > />
> > > />
> > >
> > > ignoreCase="true"/>
> > > protected="protwords.txt" />
> > > generateWordParts="0"
> > > generateNumberParts="0"
> > > catenateWords="0"
> > > catenateNumbers="0"
> > > catenateAll="0"
> > > />
> > >
> > > ...
> > >
> > > required="false" />
> > >
> > > If you have any idea, thx in advance.
> > >
> > > --
> > > ~~~~~
> > > | klessou |
> > > ~~~~~
>
> --
> ~~~~~
> | klessou |
> ~~~~~