On 14-May-07, at 6:49 PM, James liu wrote:
2007/5/15, Mike Klaas <[EMAIL PROTECTED]>:
On 14-May-07, at 1:35 AM, James liu wrote:
When you get up to 60 partitions, you should make it a multi-stage
process. Assuming your partitions are disjoint and evenly
distributed, estimate the number of documents that will appear in the
final result from each.
Yes, the partitions are distributed.
Double or triple that (and put a minimum
threshold), try to assemble the number of documents you require, and
if one partition "runs out" of docs before it is done, request a new
round.
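A rough sketch of that first-round estimate, in Python. The function name, the factor of 3 and the minimum of 5 are assumptions picked for illustration, not values anyone in this thread has tuned:

```python
import math

def initial_fetch_size(wanted, num_partitions, factor=3, minimum=5):
    """Docs to request from each partition in the first round.

    Assumes partitions are disjoint and evenly distributed, so each is
    expected to contribute about wanted / num_partitions docs.
    factor and minimum are illustrative defaults only.
    """
    expected_per_partition = wanted / num_partitions
    # Double or triple the estimate, but never go below the minimum threshold.
    return max(minimum, math.ceil(expected_per_partition * factor))
```

With 60 partitions and 10 wanted docs the estimate per partition is well under one doc, so the minimum threshold is what decides the request size.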
I don't know what you mean by "runs out".
Say you request 5 docs from each of 60 partitions, and are interested
in docs 1-10. If, sorted by score, the docs come from:
p1, p2, p1, p1, p3, p4, p1, p1
Then p1 has "run out" at n=8, and there is no way to be sure if the
remaining two needed docs come from p1 or somewhere else. So you
have to now request at least two additional documents from p1.
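To make the "run out" check concrete, here is a minimal sketch in Python. It assumes every partition returned exactly the number of docs requested (so it may still hold more) and that the merged list is already sorted by score; the function name is made up for this example:

```python
from collections import Counter

def find_run_out(merged, fetched, wanted):
    """Return {partition: n} for each partition whose last fetched doc
    sits at rank n < wanted in the merged, score-sorted list.

    merged  -- partition name of each doc, best score first
    fetched -- docs requested (and returned) per partition
    wanted  -- number of top docs needed overall
    """
    seen = Counter()
    run_out = {}
    for n, partition in enumerate(merged[:wanted], start=1):
        seen[partition] += 1
        # All of this partition's docs are used up before rank `wanted`,
        # so the remaining ranks could still belong to it.
        if seen[partition] == fetched and n < wanted:
            run_out[partition] = n
    return run_out

merged = ["p1", "p2", "p1", "p1", "p3", "p4", "p1", "p1"]
result = find_run_out(merged, fetched=5, wanted=10)
print(result)  # {'p1': 8}
```

The number of extra docs to ask p1 for is then at least wanted - n, which is 10 - 8 = 2 here.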
One user request will generate 60 partition requests.
They run in parallel,
so I don't know each partition's status before they are done.
Normally, you would wait for them to finish, and execute a subsequent
request if more docs are needed.
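A sketch of that wait-then-re-request loop in Python, under some loud assumptions: fetch() is a hypothetical stand-in for the real remote query, each partition is simulated as an in-memory list of (score, doc_id) sorted by descending score, and scores are distinct. It is only meant to show the control flow, not how any particular search server does it:

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

def fetch(partition_docs, start, count):
    """Hypothetical stand-in for a remote partition query."""
    return partition_docs[start:start + count]

def top_k(partitions, wanted, first_round=5):
    got = {name: [] for name in partitions}
    drained = set()  # partitions that returned fewer docs than asked for
    need = {name: first_round for name in partitions}
    top = []
    with ThreadPoolExecutor(max_workers=8) as pool:
        while need:
            # Fan out this round's requests in parallel.
            futures = {
                name: pool.submit(fetch, partitions[name], len(got[name]), count)
                for name, count in need.items()
            }
            # Wait for every partition to finish before deciding anything.
            for name, future in futures.items():
                docs = future.result()
                got[name].extend(docs)
                if len(docs) < need[name]:
                    drained.add(name)
            merged = sorted(
                ((score, doc, name) for name, docs in got.items() for score, doc in docs),
                key=lambda item: -item[0],
            )
            top = merged[:wanted]
            counts = Counter(name for _, _, name in top)
            last_rank = {name: rank for rank, (_, _, name) in enumerate(top, start=1)}
            # A partition has "run out" if every doc it gave us is in the top
            # list and its last one ranks above position `wanted`.
            need = {
                name: wanted - last_rank[name]
                for name in counts
                if name not in drained
                and counts[name] == len(got[name])
                and last_rank[name] < wanted
            }
    return [(score, doc) for score, doc, _ in top]

partitions = {
    "p1": [(0.9, "a"), (0.8, "b"), (0.7, "c")],
    "p2": [(0.85, "d"), (0.1, "e")],
    "p3": [(0.5, "f")],
}
result = top_k(partitions, wanted=3, first_round=1)
```

In this toy run the first round fetches one doc per partition, p1 and p2 run out, and a second round is issued only to those two.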
-Mike