I think it works better to select the terms with the highest tf.idf, not the highest tf. That is what I implemented for Ultraseek ten years ago. With raw tf, you get lots of terms with low discriminating power, stop words among them.
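
A minimal sketch of the difference, in Python. This is not Ultraseek's or Lucene's actual code: the toy corpus, the `top_terms` helper, and the smoothed idf formula log((N + 1) / (df + 1)) are all assumptions made up for illustration.

```python
import math
from collections import Counter


def top_terms(doc_tokens, corpus, k=3, use_idf=True):
    """Return the k highest-scoring terms of doc_tokens.

    Hypothetical helper: scores by raw tf, or by tf * idf when use_idf is True.
    Ties are broken alphabetically so the result is deterministic.
    """
    n_docs = len(corpus)

    # Document frequency: in how many documents does each term occur?
    df = Counter()
    for tokens in corpus:
        df.update(set(tokens))

    tf = Counter(doc_tokens)

    def score(term):
        if not use_idf:
            return tf[term]
        # Smoothed idf (an assumption, not Lucene's formula): a term that
        # occurs in every document gets idf 0 and drops out of the ranking.
        return tf[term] * math.log((n_docs + 1) / (df[term] + 1))

    return sorted(tf, key=lambda t: (-score(t), t))[:k]


corpus = [
    "the solr index and the lucene index share the solr core".split(),
    "the cat sat on the mat and the dog".split(),
    "the user asked the list and the list answered".split(),
]

by_tf = top_terms(corpus[0], corpus, use_idf=False)
by_tfidf = top_terms(corpus[0], corpus, use_idf=True)

print(by_tf)     # ['the', 'index', 'solr']
print(by_tfidf)  # ['index', 'solr', 'core']
```

With raw tf the stop word "the" comes out on top; with tf.idf it scores zero because it occurs in every document, so only the discriminating terms are left.
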

wunder

On 7/2/09 4:48 AM, "Otis Gospodnetic" <otis_gospodne...@yahoo.com> wrote:

> Michael - because they are the most frequent, which is how MLT selects terms
> to use for querying, IIRC.
>
> Otis --
> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
>
> ----- Original Message ----
>> From: Michael Ludwig <m...@as-guides.com>
>> To: solr-user@lucene.apache.org
>> Sent: Thursday, July 2, 2009 6:20:05 AM
>> Subject: Re: Implementing PhraseQuery and MoreLikeThis Query in one app
>>
>> SergeyG wrote:
>>
>>> Can both queries - PhraseQuery and MoreLikeThis Query - be implemented
>>> in the same app taking into account the fact that for the former to
>>> work the stop words list needs to be included and this results in the
>>> latter putting stop words among the most important words?
>>
>> Why would the inclusion of a stopword list result in stopwords being of
>> top importance in the MoreLikeThis query?
>>
>> Michael Ludwig