>> On Sat, Feb 14, 2009 at 6:45 AM, karl wettin <karl.wet...@gmail.com>
>> wrote:
>>> Also, as my threshold is based on the distance in score between the
>>> first result it sounds like using a result start position greater than
>>> 0 is something I have to look out for. Or?
>>
>> Hmmm - this isn't that easy in general as it requires knowledge of the
>> max score, right?
>
> Hmmm indeed. Does Solr not collect 0-20 even though the request is for
> 10-20? Wouldn't it then be possible to inject some code that limits the
> DocSet at that layer?

Yes, Solr would actually collect 0-20, but the entire set of matching
documents must still be scored to find the maximum score. So if the
threshold will be a function of maxScore, it still requires two passes, no?

> There is more. Not important but a nice thing to get: I create multiple
> documents per entity from my primary data source (e.g. each entity a book
> and each document a paragraph from the book) but I only want to present the
> top scoring document per entity.

This sounds like field collapsing. There is a patch that's still in the
works: http://wiki.apache.org/solr/FieldCollapsing

-Yonik
http://www.lucidimagination.com
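
For illustration only, here is a minimal Python sketch of the two ideas discussed above. It is not Solr code and does not use the FieldCollapsing patch. `Hit`, `threshold_page`, `collapse_by_entity`, and the `ratio` parameter are hypothetical names, and the sketch assumes the threshold is a fraction of the maximum score, which is only one possible reading of "distance in score".

```python
from typing import Dict, List, NamedTuple


class Hit(NamedTuple):
    """A scored match (hypothetical structure, not a Solr type)."""
    doc_id: str
    entity_id: str
    score: float


def threshold_page(hits: List[Hit], ratio: float, start: int, rows: int) -> List[Hit]:
    """Keep hits scoring at least ratio * max score, then return one page."""
    if not hits:
        return []
    # Pass 1: every match has to be scored before the maximum is known.
    max_score = max(h.score for h in hits)
    # Pass 2: filter against the global maximum, then sort and page.
    kept = sorted(
        (h for h in hits if h.score >= ratio * max_score),
        key=lambda h: h.score,
        reverse=True,
    )
    return kept[start:start + rows]


def collapse_by_entity(hits: List[Hit]) -> List[Hit]:
    """Keep only the top-scoring hit per entity (e.g. one paragraph per book)."""
    best: Dict[str, Hit] = {}
    for h in hits:
        current = best.get(h.entity_id)
        if current is None or h.score > current.score:
            best[h.entity_id] = h
    return sorted(best.values(), key=lambda h: h.score, reverse=True)
```

Because the cutoff is computed from the maximum over all matches, a request with start > 0 is filtered against the same score as the first page. That is the reason the second pass cannot be avoided under this assumption.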