Re: termFreq always = 1 ?

Otis Gospodnetic Wed, 01 Oct 2008 11:46:19 -0700

In each of your examples (is each one a document?) I see only 1 "men" instance, 
so "men" term frequency should be 1 for that document.
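
If the goal is to have each keyword counted at most once per document, one option is to deduplicate the keyword list on the client before sending it to Solr. The sketch below is only an illustration in Python. The helper names are hypothetical and not part of Solr. The first function mimics the position-based rule quoted from the RemoveDuplicatesTokenFilterFactory documentation; the second assumes keywords are separated by ";" or "," and that case-insensitive matching is wanted. It does no stemming, so "man" and "men" stay distinct.

```python
import re


def remove_duplicates_same_position(tokens):
    """Mimic the documented rule: drop a token only when an earlier token
    with the same text sits at the same position (illustrative only)."""
    seen = set()
    kept = []
    for text, position in tokens:
        if (text, position) in seen:
            continue
        seen.add((text, position))
        kept.append((text, position))
    return kept


def dedupe_keywords(raw):
    """Hypothetical client-side helper: split on ';' or ',', lowercase,
    and keep only the first occurrence of each keyword."""
    seen = set()
    unique = []
    for keyword in re.split(r"[;,]", raw):
        normalized = keyword.strip().lower()
        if normalized and normalized not in seen:
            seen.add(normalized)
            unique.append(normalized)
    return unique


# "men in black are real men": the two "men" tokens are at different
# positions, so a position-based filter keeps both of them.
sentence = [("men", 0), ("in", 1), ("black", 2),
            ("are", 3), ("real", 4), ("men", 5)]
print(remove_duplicates_same_position(sentence))

# Deduplicating before indexing leaves each keyword once.
print(dedupe_keywords("Man,men,business,Men,man"))
```

With the list reduced this way before indexing, each keyword occurs once in the field value, so its term frequency should be 1, provided the field's analyzer does not map other keywords onto the same term.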


Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch



----- Original Message ----
> From: KLessou <[EMAIL PROTECTED]>
> To: solr-user@lucene.apache.org
> Sent: Wednesday, October 1, 2008 11:43:59 AM
> Subject: Re: termFreq always = 1 ?
> 
> Yes this may be my problem,
> 
> But is there any solution to have only one "men" keyword indexed when I've
> got something like this:
> 
> 1 - k1_en = men;business;Men
> or :
> 2 - k1_en = man,business,men
> or :
> 3 - k1_en = Man,men,business,Men,man
> ...
> 
> Thx in advance,
> 
> On Wed, Oct 1, 2008 at 5:12 PM, Otis Gospodnetic wrote:
> 
> > Hi,
> >
> > Note that RemoveDuplicatesTokenFilterFactory "filters out any tokens which
> > are at the same logical position in the tokenstream as a previous token with
> > the same text."
> >
> > So if you have "men in black are real men" then
> > RemoveDuplicatesTokenFilterFactory will not remove duplicate "men".
> >
> > This may or may not be your problem.
> >
> > Otis
> > --
> > Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
> >
> >
> >
> > ----- Original Message ----
> > > From: KLessou 
> > > To: solr-user@lucene.apache.org
> > > Sent: Wednesday, October 1, 2008 9:48:28 AM
> > > Subject: termFreq always = 1 ?
> > >
> > > Hi,
> > >
> > > I want to index a list of keywords.
> > >
> > > When I search "k1_en:men", I find a lot of documents like that :
> > >
> > > DocA :
> > > (k1_en = man;men;Men;business... termFreq=2)
> > > DocB :
> > > (k1_en = man;Men;business... termFreq=1)
> > > DocC :
> > > ...
> > > DocD :
> > > ...
> > > DocE :
> > > ...
> > >
> > > But I don't want to have a different termFreq for DocA & DocB.
> > >
> > > > I tried RemoveDuplicatesTokenFilterFactory but it doesn't seem to help me :-/
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > > ignoreCase="true"/>
> > >
> > > protected="protwords.txt" />
> > >
> > >
> > >
> > >
> > >                     generateWordParts="0"
> > >                     generateNumberParts="0"
> > >                     catenateWords="0"
> > >                     catenateNumbers="0"
> > >                     catenateAll="0"
> > >                     />
> > >
> > >
> > >
> > >
> > > />
> > >
> > >
> > >
> > >
> > > ignoreCase="true"/>
> > >
> > > protected="protwords.txt" />
> > >
> > >
> > >
> > >                     generateWordParts="0"
> > >                     generateNumberParts="0"
> > >                     catenateWords="0"
> > >                     catenateNumbers="0"
> > >                     catenateAll="0"
> > >                     />
> > >
> > >
> > >
> > >
> > >
> > > ...
> > >
> > >
> > >
> > > required="false" />
> > >
> > >
> > > If you have any idea, thx in advance.
> > >
> > > --
> > > ~~~~~
> > > | klessou |
> > > ~~~~~
> >
> >
> 
> 
> -- 
> ~~~~~
> | klessou |
> ~~~~~
