Thanks for that, Erik; it looks like a very good implementation of the class. If you ever find time to add it to the query handlers in Solr, I'm sure it would be wonderful for tons of users (Solr has tons of users, right? It definitely should!).

I haven't looked at the specifics of how MoreLikeThis determines which items are similar; I'm mainly wondering about performance here. Yesterday I tried to code myself a poor man's similarity class (which was nothing more than doing a search with OR between words and sorting by score), and the performance was abysmal (well, I kind of expected it: with 1000+ word queries on a collection of 15 million docs, you don't expect miracles). At first glance I think MoreLikeThis searches for only the most 'relevant' words; am I right? What kind of performance are you getting with it?
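
To make the question concrete, here is a rough Java sketch of what I mean by searching only for the most 'relevant' words instead of OR-ing every word in the document. It is not MoreLikeThis's actual code: the class name InterestingTerms, the methods topTerms and toOrQuery, and the plain tf * ln(N / df) score are all my own assumptions for illustration, and the term and document frequencies are passed in as simple maps rather than read from an index.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Lucene or Solr.
class InterestingTerms {

    // Textbook tf-idf (an assumption, not Lucene's exact formula):
    // term frequency in the document times ln(numDocs / docFreq).
    static double score(int tf, int df, int numDocs) {
        return tf * Math.log((double) numDocs / df);
    }

    // Keep only the maxTerms highest-scoring terms of one document.
    // Terms with no document frequency entry are skipped.
    static List<String> topTerms(Map<String, Integer> termFreq,
                                 Map<String, Integer> docFreq,
                                 int numDocs, int maxTerms) {
        List<String> terms = new ArrayList<>();
        for (String t : termFreq.keySet()) {
            if (docFreq.getOrDefault(t, 0) > 0) {
                terms.add(t);
            }
        }
        // Highest score first; ties broken alphabetically for stable output.
        terms.sort((a, b) -> {
            int c = Double.compare(
                score(termFreq.get(b), docFreq.get(b), numDocs),
                score(termFreq.get(a), docFreq.get(a), numDocs));
            return c != 0 ? c : a.compareTo(b);
        });
        return new ArrayList<>(terms.subList(0, Math.min(maxTerms, terms.size())));
    }

    // Join the selected terms into a small OR query string.
    static String toOrQuery(List<String> terms) {
        return String.join(" OR ", terms);
    }
}
```

With something like this, a 1000+ word document would turn into a query of a few dozen terms at most, so common words that match nearly every doc never reach the search at all.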

Thanks a lot,

Michael Imbeault
CHUL Research Center (CHUQ)
2705 boul. Laurier
Ste-Foy, QC, Canada, G1V 4G2
Tel: (418) 654-2705, Fax: (418) 654-2212



Erik Hatcher wrote:
I use MoreLikeThis in a custom request handler for Collex, for example the three items shown at the bottom left here:

<http://svn.sourceforge.net/viewvc/patacriticism/collex/trunk/src/solr/org/nines/TermQueryRequestHandler.java?revision=391&view=markup>

I would like to get MoreLikeThis hooked into the StandardRequestHandler just like highlighting and facets are now. One of these days I'll carve out time to do that if no one beats me to it. It would not be difficult to do; it would just take some time to iron out how to parameterize it cleanly for general-purpose use.

    Erik
