Re: Result based sorting for KWIC?

Christian Wittern Mon, 17 Mar 2008 18:27:50 -0700

Chris Hostetter wrote:

That still doesn't really answer a fairly fundamental question I've been trying to understand: *why* would having the results in that order be much more useful for the users?

Well, there are several reasons. One is that it allows users to easily spot related entries, for example quotations of a text appearing within another document. Another is that it makes linguistic patterns easy to detect.

Of course, this is not the only sort order to be offered, but it is the one I am currently struggling with, and I am trying to evaluate whether Solr would be of help here.

What are you going to do if the term occurs more than once in a single document?

The KWIC representation is generated for every hit, so if there are five matches in a doc, you get five hits.
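To illustrate what "one hit per match" means, here is a minimal sketch in Python. It is not my actual client code: the names `KwicHit` and `kwic_hits`, the plain case-insensitive substring matching, and the fixed context width are all assumptions made for the example.

```python
import re
from typing import NamedTuple


class KwicHit(NamedTuple):
    doc_id: str
    left: str     # context before the keyword
    keyword: str  # the matched term as it appears in the text
    right: str    # context after the keyword


def kwic_hits(doc_id: str, text: str, term: str, width: int = 20) -> list[KwicHit]:
    """Return one KwicHit per occurrence of `term` in `text` (hypothetical helper)."""
    hits = []
    # Assumption: literal, case-insensitive matching; a real index would tokenize.
    for m in re.finditer(re.escape(term), text, flags=re.IGNORECASE):
        left = text[max(0, m.start() - width):m.start()]
        right = text[m.end():m.end() + width]
        hits.append(KwicHit(doc_id, left, m.group(0), right))
    return hits


text = "the cat sat; the cat ran; a cat slept; my cat ate; one cat left"
hits = kwic_hits("doc1", text, "cat")
for h in hits:
    # Right-align the left context so the keywords line up in one column.
    print(f"{h.left:>20} [{h.keyword}] {h.right}")
```

A single document with five matches yields five separate lines, each of which then takes part in the sort on its own.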

Solr can sort your results on any indexed, single-valued field, but for something like this you'd need to write your own plugin to do the sorting. Note that your plugin would basically need to do the same thing you currently do on the client; the only real performance gain would be in reducing the amount of data sent over the wire.

Indeed. Except that Solr might be able to use mature, efficient and well-debugged code to do that, which I can't say about my client code. Well, not knowing anything about the internals of Solr (or Lucene, for that matter), I just assumed that this in some sense parallels the way a ranking value is calculated for a search term and the results are then sorted by relevance.
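The parallel I had in mind can be sketched roughly as follows: a sort key is derived for each hit from its context, much as a score is derived for each document, and the hits are then ordered by that key. This is only an illustration in Python and says nothing about how Solr or Lucene work internally; `Hit`, `right_context_key` and `left_context_key` are hypothetical names, and the sample hits are made up.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    doc_id: str
    left: str
    keyword: str
    right: str


def right_context_key(hit: Hit):
    # Order by the text following the keyword; doc_id breaks ties.
    return (hit.right.strip().casefold(), hit.doc_id)


def left_context_key(hit: Hit):
    # Order by the words preceding the keyword, nearest word first.
    words = [w.casefold() for w in reversed(hit.left.split())]
    return (words, hit.doc_id)


# Made-up sample hits for the keyword "time".
hits = [
    Hit("doc2", "he said that ", "time", " flies like an arrow"),
    Hit("doc1", "once upon a ", "time", " there was a king"),
    Hit("doc3", "as the saying goes, ", "time", " flies like an arrow"),
]

by_right = sorted(hits, key=right_context_key)
by_left = sorted(hits, key=left_context_key)

for h in by_right:
    print(f"{h.doc_id}: {h.left:>20}[{h.keyword}]{h.right}")
```

Sorting on the right context puts the two hits that share the same following text next to each other, which is what makes quotations across documents easy to spot.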


But I think I have enough information now to decide how to proceed.

Christian

--

Christian Wittern

Institute for Research in Humanities, Kyoto University

47 Higashiogura-cho, Kitashirakawa, Sakyo-ku, Kyoto 606-8265, JAPAN
