Looking at this line in the code:

// This enusres that reRankDocs <= docs needed to satisfy the result set.
reRankDocs = Math.max(start+rows, reRankDocs);

This looks like it would cause skips and duplicates while paging through
the results. Once start+rows exceeds the reRankDocs parameter, each new
page re-ranks a larger window than the previous one. Any newly included
documents that match the re-ranking query get boosted onto earlier pages
(so they're skipped), which pushes items you already saw down onto the
current page (causing duplicates).
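
To make the concern concrete, here is a rough simulation of what I think
happens. It is only a toy model, not Solr's actual code: the class and
method names are made up, doc ids stand in for the original rank order, and
"id is a multiple of 7" is an invented stand-in for "matches the re-ranking
query". The clamp flag switches the Math.max line on or off.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model (hypothetical, not Solr code): docs 0..total-1 are already in
// original rank order, and docs whose id is a multiple of 7 are assumed to
// match the re-rank query and rise to the top of the re-rank window.
class ReRankPagingSketch {

    // Returns the doc ids shown for one page request.
    static List<Integer> page(int total, int start, int rows,
                              int reRankDocs, boolean clamp) {
        // With clamp=true this mirrors the quoted line:
        // reRankDocs = Math.max(start+rows, reRankDocs);
        int window = clamp ? Math.max(start + rows, reRankDocs) : reRankDocs;
        window = Math.min(window, total);

        List<Integer> ranked = new ArrayList<>();
        // Boosted docs inside the window come first...
        for (int i = 0; i < window; i++) {
            if (i % 7 == 0) ranked.add(i);
        }
        // ...then the rest of the window in original order...
        for (int i = 0; i < window; i++) {
            if (i % 7 != 0) ranked.add(i);
        }
        // ...then everything outside the window, untouched.
        for (int i = window; i < total; i++) {
            ranked.add(i);
        }

        int from = Math.min(start, total);
        int to = Math.min(start + rows, total);
        return new ArrayList<>(ranked.subList(from, to));
    }

    // Concatenates the first `pages` pages, as a client paging through would see them.
    static List<Integer> pageThrough(int total, int rows, int reRankDocs,
                                     boolean clamp, int pages) {
        List<Integer> seen = new ArrayList<>();
        for (int p = 0; p < pages; p++) {
            seen.addAll(page(total, p * rows, rows, reRankDocs, clamp));
        }
        return seen;
    }
}
```

In this toy model, pageThrough(50, 10, 10, true, 3) shows doc 9 on both the
first and second pages and never shows doc 14, while the same call with
clamp=false returns three disjoint pages.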

The behavior is obviously intentional, but I can't find any documentation
explaining why reRankDocs is ignored when you ask for fewer documents to be
re-ranked than you're asking to view. What if I only want the top 10 out of
50 rows to be re-ranked? Wouldn't it be better to let the client choose
whether to increase reRankDocs or leave it the same?

If no one replies and I have time, I might check out 4.9 and see if I can
confirm or disprove the bug, but figured I'd bring it up now in case I
don't end up having time. It would be good to document the reason for this
behavior if it turns out it's necessary.

Thanks. I'm excited about this feature btw.

--Adair
