Hi Grant,
Thanks for your reply. I have one more question: if I use the Luke request
handler in Solr for this issue, are the top terms I get ranked by term
frequency or by document frequency?
I would like to get the terms that occur most often in a document, where those documents
form a good percentage in the t
esting patterns emerges.
Cheers,
Daniel
-----Original Message-----
From: Walter Underwood [mailto:wunderw...@netflix.com]
Sent: 16 July 2009 17:15
To: solr-user@lucene.apache.org
Subject: Re: Word frequency count in the index
I haven't researched old versions of Lucene, but I think it has
> Regards,
> Daniel
>
> -----Original Message-----
> From: Walter Underwood [mailto:wunderw...@netflix.com]
> Sent: 16 July 2009 15:04
> To: solr-user@lucene.apache.org
> Subject: Re: Word frequency count in the index
>
> Lucene uses a tf.idf relevance formula, so it auto
Hi Walter,
Has it always been there? Which version of Lucene are we talking about?
Regards,
Daniel
-----Original Message-----
From: Walter Underwood [mailto:wunderw...@netflix.com]
Sent: 16 July 2009 15:04
To: solr-user@lucene.apache.org
Subject: Re: Word frequency count in the index
Lucene
> To: solr-user@lucene.apache.org
> Sent: Thursday, July 16, 2009 6:35:28 AM
> Subject: Re: Word frequency count in the index
>
> In the trunk version, the TermsComponent should give you this:
> http://wiki.apache.org/solr/TermsComponent. Also, you can use the
> LukeRequestHa
Lucene uses a tf.idf relevance formula, so it automatically finds common
words (stop words) in your documents and gives them lower weight. I
recommend not removing stop words at all and letting Lucene handle
the weighting.
wunder
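To illustrate the weighting described above, here is a minimal sketch in plain Python, not Lucene itself. It assumes the idf form 1 + ln(N / (df + 1)) and tf = sqrt(freq) as used by Lucene's classic default similarity, and it ignores the other scoring factors (norms, boosts, coord). The toy documents and the function names are made up for the example.

```python
import math
from collections import Counter

# Toy corpus (hypothetical); each string stands in for one indexed document.
docs = [
    "the cat sat on the mat",
    "the dog chased the cat",
    "the bird sang",
]

tokenized = [d.split() for d in docs]
num_docs = len(tokenized)

# Document frequency: the number of documents that contain each term.
doc_freq = Counter(term for tokens in tokenized for term in set(tokens))


def idf(term):
    """Inverse document frequency: lower for terms found in many documents."""
    return 1.0 + math.log(num_docs / (doc_freq[term] + 1))


def tf(term, tokens):
    """Dampened within-document term frequency."""
    return math.sqrt(tokens.count(term))


def weight(term, tokens):
    """Simplified tf.idf weight of a term in one document."""
    return tf(term, tokens) * idf(term)


if __name__ == "__main__":
    for term in ("the", "cat", "mat"):
        print(term, doc_freq[term], round(weight(term, tokenized[0]), 3))
```

In this sketch "the" appears twice in the first document but is in every document, so it ends up with a lower weight than "mat", which appears once in a single document. That is the effect that makes a separate stop word list less necessary.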
On 7/16/09 3:29 AM, "Pooja Verlani" wrote:
> Hi,
>
> Is there an
In the trunk version, the TermsComponent should give you this:
http://wiki.apache.org/solr/TermsComponent. Also, you can use the
LukeRequestHandler to get the top words in each field.
Alternatively, you may just want to point Luke at your index.
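As a rough sketch of the two options above, the following Python builds the request URLs. The host, port and handler paths (`/terms`, `/admin/luke`) are assumptions based on the stock example configuration, and the helper names are made up for illustration; adjust them to match your solrconfig.xml. As far as I know, the counts both handlers report are document frequencies (number of documents containing the term), not within-document term frequencies, but check that against your Solr version.

```python
from urllib.parse import urlencode

# Assumption: a local Solr instance at the default example address.
SOLR_BASE = "http://localhost:8983/solr"


def terms_component_url(field, limit=10):
    """URL asking the TermsComponent for the most frequent terms in a field."""
    params = {
        "terms": "true",
        "terms.fl": field,      # field to read terms from
        "terms.limit": limit,   # how many terms to return
        "terms.sort": "count",  # highest count first rather than index order
    }
    return f"{SOLR_BASE}/terms?{urlencode(params)}"


def luke_url(field, num_terms=10):
    """URL asking the LukeRequestHandler for the top terms of a field."""
    params = {"fl": field, "numTerms": num_terms}
    return f"{SOLR_BASE}/admin/luke?{urlencode(params)}"


if __name__ == "__main__":
    # "text" is a hypothetical field name.
    print(terms_component_url("text"))
    print(luke_url("text"))
```

Either URL can be opened in a browser or fetched with any HTTP client.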
On Jul 16, 2009, at 6:29 AM, Pooja Verlani w