On 15 Feb 2009 at 20:15, Yonik Seeley wrote:

On Sat, Feb 14, 2009 at 6:45 AM, karl wettin <karl.wet...@gmail.com> wrote:
Also, as my threshold is based on the distance in score from the
first result, it sounds like using a result start position greater than
0 is something I have to look out for. Or?

Hmmm - this isn't that easy in general as it requires knowledge of the
max score, right?

Hmmm indeed. Does Solr not collect 0-20 even though the request is for 10-20? Wouldn't it then be possible to inject some code that limits the DocSet at that layer?
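To make the idea concrete, here is a minimal sketch in plain Python (not Solr code) of why the window has to be cut out of a list that starts at offset 0. The function name, the tuple layout, and the `max_distance` parameter are all hypothetical; the only assumption about the input is that the hits are sorted by descending score.

```python
from typing import List, Tuple

# Hypothetical hit representation: (document id, score).
Hit = Tuple[str, float]


def window_within_threshold(hits: List[Hit], start: int, rows: int,
                            max_distance: float) -> List[Hit]:
    """Apply a score threshold relative to the top hit, then page.

    `hits` is assumed to begin at offset 0 and to be sorted by
    descending score, so hits[0] carries the max score.
    """
    if not hits:
        return []
    top_score = hits[0][1]  # only known because the list starts at 0
    # Keep every hit whose score is within max_distance of the top score.
    kept = [hit for hit in hits if top_score - hit[1] <= max_distance]
    # Cut the requested page out of the already limited list.
    return kept[start:start + rows]
```

If only the hits for 10-20 were available, `top_score` would be the score of hit 10 rather than hit 0, and the threshold would be measured from the wrong place.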

There is more. Not important, but a nice thing to get: I create multiple documents per entity from my primary data source (e.g. each entity is a book and each document is a paragraph from the book), but I only want to present the top scoring document per entity. I handle this with client side post processing of the results. This means that I potentially get facet counts from documents that I don't actually present to the user. It would be nice to handle this in the same layer as my score threshold restriction, but it would require loading the primary key from the document rather early. It would also mean that even though I might get 2000 results within the threshold, the actual number of results I want to pass on to the client is a lot less than that. I.e. I'll have to request more results than I want in order to ensure I get enough even after filtering out documents that point at an entity that is already a member of the result list with a greater score.
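For reference, the client side post processing amounts to roughly the following. This is a sketch in plain Python, not Solr code; the function name and the (document id, entity id, score) tuple layout are hypothetical, and the hits are assumed to arrive sorted by descending score.

```python
from typing import List, Set, Tuple

# Hypothetical hit representation: (document id, entity id, score).
Hit = Tuple[str, str, float]


def top_document_per_entity(hits: List[Hit], limit: int) -> List[Hit]:
    """Keep only the best scoring document of each entity.

    `hits` is assumed to be sorted by descending score, so the first
    document seen for an entity is also its top scoring one.
    """
    seen: Set[str] = set()
    result: List[Hit] = []
    for doc_id, entity_id, score in hits:
        if len(result) >= limit:
            break  # enough distinct entities collected
        if entity_id in seen:
            continue  # a better scoring document of this entity is already in
        seen.add(entity_id)
        result.append((doc_id, entity_id, score))
    return result
```

Because duplicates are dropped, the input has to hold more hits than `limit` to fill the page, which is the over-requesting described above.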

The question is whether I can fit all of this into the same layer as the score threshold result set limiter.


I'm rather lost in the Solr code. Pointers to class and method names are most welcome.



         karl
