Chris Hostetter wrote:
That still doesn't really answer a fairly fundamental question I've been
trying to understand: *why* would having the results in that order be much
more useful for the users?
Well, there are several reasons. One is that it lets users easily
spot related entries, for example quotations of a text appearing within
another document. Another is that it makes linguistic patterns easy to
detect.
Of course, this is not the only sort order to be offered, but it is the
one I am currently struggling with, and I am trying to evaluate whether
Solr would be of help here.
What are you going to do if the term appears more than once in a single document?
The KWIC representation is generated for every hit, so if there are 5
matches in a doc, you get five hits.
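
To make the "one hit per match" behaviour concrete, here is a minimal
sketch in Python. It is not my actual client code: the names KwicHit and
kwic_hits and the fixed character window `width` are assumptions made
up for this illustration.

```python
import re
from typing import List, NamedTuple


class KwicHit(NamedTuple):
    """One keyword-in-context entry (hypothetical structure)."""
    doc_id: str
    left: str     # text immediately before the match
    keyword: str  # the matched term itself
    right: str    # text immediately after the match


def kwic_hits(doc_id: str, text: str, term: str, width: int = 20) -> List[KwicHit]:
    """Return one KWIC entry for every occurrence of `term` in `text`.

    `width` is the number of context characters kept on each side
    (an assumed parameter, not something taken from Solr).
    """
    hits = []
    for m in re.finditer(re.escape(term), text):
        left = text[max(0, m.start() - width):m.start()]
        right = text[m.end():m.end() + width]
        hits.append(KwicHit(doc_id, left, m.group(0), right))
    return hits
```

A document containing the term five times therefore yields five entries,
each carrying its own left and right context.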
Solr can sort your results on any indexed, single-valued field, but for
something like this you'd need to write your own plugin to do the sorting.
Note that your plugin would basically need to do the same thing you
currently do on the client; the only real performance gain would be
in reducing the amount of data sent over the wire.
Indeed. Except that Solr might be able to use mature, efficient and
well-debugged code to do that, which I can't say about my client code.
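
For reference, the client-side sorting under discussion amounts to
something like the sketch below. It assumes each hit is a plain
(left, keyword, right) tuple and that the desired order is by the text
following the keyword, with the preceding text as a tie-breaker; the
function name and the tuple layout are hypothetical, not taken from
Solr or from my real code.

```python
from typing import List, Tuple

# (left context, keyword, right context); an assumed layout for illustration
Hit = Tuple[str, str, str]


def sort_by_right_context(hits: List[Hit]) -> List[Hit]:
    """Sort KWIC hits by the text after the keyword, then by the text before it.

    Comparison is case-insensitive via str.casefold().
    """
    return sorted(hits, key=lambda h: (h[2].casefold(), h[0].casefold()))
```

Sorted this way, hits with identical continuations end up next to each
other, which is what makes quotations easy to spot.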
Well, not knowing anything about the internals used in Solr (or Lucene
for that matter), I just assumed that this in some sense parallels the
way a ranking value is calculated for a search term and then the results
are sorted by relevance.
But I think I have enough information now to decide how to proceed.
Christian
--
Christian Wittern
Institute for Research in Humanities, Kyoto University
47 Higashiogura-cho, Kitashirakawa, Sakyo-ku, Kyoto 606-8265, JAPAN