On 15 Feb 2009, at 20:15, Yonik Seeley wrote:
On Sat, Feb 14, 2009 at 6:45 AM, karl wettin <karl.wet...@gmail.com> wrote:
Also, as my threshold is based on the distance in score from the
first result, it sounds like using a result start position greater
than 0 is something I have to look out for. Or?
Hmmm - this isn't that easy in general as it requires knowledge of the
max score, right?
Hmmm indeed. Does Solr not collect 0-20 even though the request is for
10-20? Wouldn't it then be possible to inject some code that limits
the DocSet at that layer?
There is more. Not important but a nice thing to get: I create
multiple documents per entity from my primary data source (e.g. each
entity a book and each document a paragraph from the book) but I only
want to present the top scoring document per entity. I handle this
with client side post processing of the results. This means that I
potentially get facet counts from documents that I actually don't
present to the user. It would be nice to handle this in the same layer
as my score threshold restriction, but it would require loading the
primary key from the document rather early. And it would also mean
that even though I might get 2000 results within the threshold the
actual number of results I want to pass on to the client is a lot less
than that. I.e. I'll have to request more results than I want in order
to ensure I get enough even after filtering out documents that point
at an entity that is already a member of the result list with a
greater score.
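
To make the client-side post-processing described above concrete, here
is a minimal sketch in Java. It is not Solr code: the class
ScoreThresholdCollapser, the Hit type and the collapse method are
hypothetical names made up for illustration. It assumes the hits arrive
sorted by descending score starting at offset 0 (so the first hit
carries the max score), and that the threshold is an absolute score
difference from that first hit.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical client-side post-processor; none of these names exist in Solr. */
class ScoreThresholdCollapser {

    /** One hit: the entity it belongs to (e.g. a book), the document id and its score. */
    static final class Hit {
        final String entityId;
        final String docId;
        final float score;

        Hit(String entityId, String docId, float score) {
            this.entityId = entityId;
            this.docId = docId;
            this.score = score;
        }
    }

    /**
     * Keeps at most {@code limit} hits, one per entity, whose score lies within
     * {@code maxDistance} of the top score.
     *
     * Assumption: {@code hits} is sorted by descending score and starts at
     * offset 0, so the first hit of an entity is also its best one.
     */
    static List<Hit> collapse(List<Hit> hits, float maxDistance, int limit) {
        List<Hit> kept = new ArrayList<>();
        if (hits.isEmpty() || limit <= 0) {
            return kept;
        }
        float topScore = hits.get(0).score;
        Set<String> seenEntities = new HashSet<>();
        for (Hit hit : hits) {
            // Everything after this point scores even lower, so stop here.
            if (topScore - hit.score > maxDistance) {
                break;
            }
            // add() returns false if a better hit for this entity was already kept.
            if (seenEntities.add(hit.entityId)) {
                kept.add(hit);
                if (kept.size() == limit) {
                    break;
                }
            }
        }
        return kept;
    }
}
```

Since the number of hits that survive is not known in advance, the
caller has to request more rows than it intends to show and ask again
with a larger row count if collapse returns fewer than limit hits while
the last fetched hit was still within the threshold.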
The question is whether I can fit all this in the same layer as the
score-threshold result set limiter.
I'm rather lost in the Solr code. Pointers to class and method names
are most welcome.
karl