I could be wrong about MLT (MoreLikeThis) - maybe it really does pick terms by TF-IDF and not by raw term frequency.
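For what it's worth, here is a toy Python sketch of the difference Walter describes between picking terms by raw tf and by tf.idf. It is not Lucene's MoreLikeThis code and makes no claim about what MLT does internally. The function name `top_terms`, the three-document corpus, and the plain log(N/df) idf are all assumptions made up for illustration.

```python
import math
from collections import Counter


def top_terms(doc_tokens, corpus, k=2, use_idf=True):
    """Return the k highest-scoring terms of doc_tokens (hypothetical helper).

    corpus is a list of token lists. With use_idf=False the score is the raw
    term frequency; with use_idf=True it is tf * log(N / df).
    """
    n_docs = len(corpus)

    # Document frequency: in how many documents does each term occur?
    df = Counter()
    for tokens in corpus:
        df.update(set(tokens))

    tf = Counter(doc_tokens)

    def score(term):
        if not use_idf:
            return tf[term]
        # max(..., 1) guards against terms that occur in no corpus document.
        return tf[term] * math.log(n_docs / max(df[term], 1))

    # Highest score first; ties are broken alphabetically so the result is stable.
    return sorted(tf, key=lambda t: (-score(t), t))[:k]


# Toy corpus (made up): "the" occurs in every document, as a stop word would.
corpus = [
    ["the", "cat", "sat", "on", "the", "mat"],
    ["the", "dog", "ate", "the", "bone"],
    ["the", "solr", "index", "the", "lucene", "index", "the"],
]
doc = corpus[2]

print(top_terms(doc, corpus, use_idf=False))  # ['the', 'index']
print(top_terms(doc, corpus, use_idf=True))   # ['index', 'lucene']
```

With raw tf the stop word "the" comes out on top; with tf.idf it scores zero because it appears in every document, which is the "low discrimination power" problem in miniature.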
Otis --
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch

----- Original Message ----
> From: Walter Underwood <wunderw...@netflix.com>
> To: solr-user@lucene.apache.org
> Sent: Thursday, July 2, 2009 10:26:33 AM
> Subject: Re: Implementing PhraseQuery and MoreLikeThis Query in one app
>
> I think it works better to use the highest tf.idf terms, not the highest tf.
> That is what I implemented for Ultraseek ten years ago. With tf, you get
> lots of terms with low discrimination power.
>
> wunder
>
> On 7/2/09 4:48 AM, "Otis Gospodnetic" wrote:
>
> > Michael - because they are the most frequent, which is how MLT selects terms
> > to use for querying, IIRC.
> >
> > Otis --
> > Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
> >
> > ----- Original Message ----
> >> From: Michael Ludwig
> >> To: solr-user@lucene.apache.org
> >> Sent: Thursday, July 2, 2009 6:20:05 AM
> >> Subject: Re: Implementing PhraseQuery and MoreLikeThis Query in one app
> >>
> >> SergeyG wrote:
> >>
> >>> Can both queries - PhraseQuery and MoreLikeThis Query - be implemented
> >>> in the same app taking into account the fact that for the former to
> >>> work the stop words list needs to be included and this results in the
> >>> latter putting stop words among the most important words?
> >>
> >> Why would the inclusion of a stopword list result in stopwords being of
> >> top importance in the MoreLikeThis query?
> >>
> >> Michael Ludwig