Re: MoreLikeThis class in Lucene within Solr?

Erik Hatcher Tue, 12 Sep 2006 13:24:59 -0700


On Sep 12, 2006, at 3:41 PM, Michael Imbeault wrote:

I haven't looked at the specifics of how MoreLikeThis determines which items are similar; I'm mainly wondering about performance here. Yesterday I tried to code myself a poor man's similarity class (which was nothing more than doing a search with OR between words and sorting by score), and the performance was abysmal (well, I kind of expected it: with 1000+ word queries on a 15 million document collection, you don't expect miracles). At first glance I think it searches for the most 'relevant' words, am I right? What kind of performance are you getting with it?

Performance with MoreLikeThis is not an issue. It has many parameters to tune how many terms are used in the query it builds, and it pulls these terms in an extremely efficient manner from the Lucene index.
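
To make the idea concrete, here is a minimal, self-contained sketch of that kind of term selection: keep only the highest-scoring terms of a document and build a short OR query from them, rather than OR-ing every word. This is not Lucene's actual implementation. The class, the helper methods, and the tf-idf formula are hypothetical, and the parameter names (minTermFreq, minDocFreq, maxQueryTerms) are only loosely modeled on the tuning knobs MoreLikeThis exposes. The term and document frequencies are made-up illustration data, standing in for statistics that would come from the index.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch, NOT Lucene code: selects a few "interesting" terms
// from one document and turns them into a short OR query.
class TermSelectionSketch {

    // Assumed scoring: a simple tf-idf variant chosen for illustration only.
    static double score(int termFreq, int docFreq, int numDocs) {
        double idf = Math.log((double) numDocs / (docFreq + 1)) + 1.0;
        return termFreq * idf;
    }

    // Drops terms that are too rare in the document or in the collection,
    // then keeps at most maxQueryTerms terms with the highest score.
    static List<String> topTerms(Map<String, Integer> termFreqs,
                                 Map<String, Integer> docFreqs,
                                 int numDocs,
                                 int minTermFreq,
                                 int minDocFreq,
                                 int maxQueryTerms) {
        List<String> candidates = new ArrayList<>();
        for (Map.Entry<String, Integer> e : termFreqs.entrySet()) {
            int df = docFreqs.getOrDefault(e.getKey(), 0);
            if (e.getValue() >= minTermFreq && df >= minDocFreq) {
                candidates.add(e.getKey());
            }
        }
        // Sort by descending score.
        candidates.sort(Comparator.comparingDouble(
                (String t) -> score(termFreqs.get(t),
                                    docFreqs.getOrDefault(t, 0),
                                    numDocs)).reversed());
        int n = Math.min(maxQueryTerms, candidates.size());
        return new ArrayList<>(candidates.subList(0, n));
    }

    // Joins the selected terms into a query string such as "f:a OR f:b".
    static String toOrQuery(String field, List<String> terms) {
        List<String> clauses = new ArrayList<>();
        for (String t : terms) {
            clauses.add(field + ":" + t);
        }
        return String.join(" OR ", clauses);
    }

    public static void main(String[] args) {
        // Made-up statistics for a single document in a 1000-document index.
        Map<String, Integer> termFreqs = new HashMap<>();
        termFreqs.put("the", 10);
        termFreqs.put("lucene", 4);
        termFreqs.put("index", 3);
        termFreqs.put("rare", 1);

        Map<String, Integer> docFreqs = new HashMap<>();
        docFreqs.put("the", 1000);
        docFreqs.put("lucene", 20);
        docFreqs.put("index", 50);
        docFreqs.put("rare", 1);

        List<String> top = topTerms(termFreqs, docFreqs, 1000, 2, 2, 2);
        System.out.println(toOrQuery("body", top)); // prints "body:lucene OR body:index"
    }
}
```

With these numbers the very common word "the" scores below "lucene" and "index" despite its high term frequency, and "rare" is filtered out by the minimum term frequency, so the resulting query has two clauses instead of four. Capping the number of clauses in this way is what keeps the query cheap compared with a 1000+ word OR query.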

I'm doing some traveling soon, which is always a good time to hack on something tractable like adding MoreLikeThis to Solr. So your wish may be granted in a week :)


        Erik
