2007/5/15, Mike Klaas <[EMAIL PROTECTED]>:

On 14-May-07, at 8:55 PM, James liu wrote:

> Thanks for your detailed answer.
>
> But you ignored "sorted by score":
>
> p1, p2, p1, p1, p3, p4, p1, p1
>
> Maybe their max scores are lower than those from p19, p20.
>

I'm not ignoring it: I'm implying that the above is the correct
descending score-sorted order.  You have to perform that sort manually.


I mean merging the results (from 60 partitions) and sorting them myself, not Solr's sort.
The results from each box are already sorted by score.


> So it will not be sorted by score correctly.
>
> And if the user clicks page 2, how do I show the data?
>
> Does p1 start from 10, or do I query other partitions?

Assemble results 1 through 20, then display 11-20 to the user.
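
A minimal sketch of that merge-then-slice step, in Python. It assumes each partition returns its hits already sorted by descending score as (score, doc_id) tuples, and that scores are comparable across partitions. The function name `merge_page` and the input format are made up for illustration; they are not part of Solr.

```python
import heapq
from itertools import islice


def merge_page(partition_results, page, page_size=10):
    """Return one page of the globally score-sorted results.

    partition_results: list of lists of (score, doc_id) tuples, each list
        already sorted by descending score (hypothetical input format).
    page: 1-based page number.
    """
    # k-way merge of the pre-sorted lists, highest score first.
    merged = heapq.merge(*partition_results,
                         key=lambda hit: hit[0], reverse=True)
    start = (page - 1) * page_size
    # Skip the first `start` merged hits and keep the next `page_size`.
    return list(islice(merged, start, start + page_size))
```

For page 2 with 10 hits per page, the merged list has to cover results 1 through 20, so this only gives the right answer if no partition was cut off before its share of those 20 was fetched.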


For example, say I query "solr".

p1 has 100 results with scores above 80.

p2 has 100 results with scores below 20.

So if I use rows=10, the score order will not be correct.

If I want to guarantee 10 pages that are sorted by score correctly,

I have to get 100 results (rows=100) from every box,

then merge the results, sort them, and finally take the top 100.

But that will be very slow.


I don't know how other search engines solve this. Maybe they do not sort by score
entirely correctly.
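
One way to avoid asking every box for rows=100 is the multi-round approach from the quoted message below: fetch a small batch from each partition, merge, and go back only to partitions that have "run out" (every hit fetched from them so far landed in the top n). A rough Python sketch follows. The `fetch(pid, start, rows)` callable is hypothetical and stands in for whatever issues the per-partition query; scores are assumed to be comparable across partitions.

```python
import heapq
from collections import Counter
from itertools import islice


def top_n(partitions, fetch, n, per_round):
    """Collect the global top-n hits in rounds.

    fetch(pid, start, rows): hypothetical callable returning up to `rows`
        (score, doc_id) tuples from partition `pid`, sorted by descending
        score, starting at offset `start`.
    """
    hits = {pid: [] for pid in partitions}
    exhausted = set()        # partitions known to have no further matches
    need_more = set(hits)    # partitions to query in the next round
    while True:
        for pid in need_more:
            batch = fetch(pid, len(hits[pid]), per_round)
            hits[pid].extend(batch)
            if len(batch) < per_round:
                exhausted.add(pid)
        # Tag each hit with its partition, merge, and keep the best n.
        tagged = [[(score, doc, pid) for score, doc in hits[pid]]
                  for pid in hits]
        merged = heapq.merge(*tagged, key=lambda h: h[0], reverse=True)
        top = list(islice(merged, n))
        used = Counter(pid for _, _, pid in top)
        # A partition has "run out" if all of its fetched hits are in the
        # top n and it may still hold more matches.
        need_more = {pid for pid in hits
                     if pid not in exhausted and used[pid] == len(hits[pid])}
        if not need_more:
            return [(score, doc) for score, doc, _ in top]
```

In the p1/p2 example above, only p1 would be queried again in later rounds, since p2's low-scoring hits never enter the top of the merged list. How many rounds this takes depends on how unevenly the good hits are spread across partitions.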




-Mike

>
> 2007/5/15, Mike Klaas <[EMAIL PROTECTED]>:
>>
>> On 14-May-07, at 6:49 PM, James liu wrote:
>>
>> > 2007/5/15, Mike Klaas <[EMAIL PROTECTED]>:
>> >>
>> >> On 14-May-07, at 1:35 AM, James liu wrote:
>> >>
>> >> When you get up to 60 partitions, you should make it a multi-stage
>> >> process.  Assuming your partitions are disjoint and evenly
>> >> distributed, estimate the number of documents that will appear in the
>> >> final result from each.
>> >
>> >
>> > Yes, the partitions are distributed.
>> >
>> >
>> >> Double or triple that (and put a minimum
>> >> threshold), try to assemble the number of documents you require, and
>> >> if one partition "runs out" of docs before it is done, request a new
>> >> round.
>> >
>> >
>> > I don't know what you mean by "runs out".
>>
>> Say you request 5 docs from each of 60 partitions, and are interested
>> in docs 1-10.  If, sorted by score, the docs come from:
>>
>> p1, p2, p1, p1, p3, p4, p1, p1
>>
>> Then p1 has "run out" at n=8, and there is no way to be sure if the
>> remaining two needed docs come from p1 or somewhere else.  So you
>> have to now request at least two additional documents from p1.
>>
>> > One user request will generate 60 partition requests.
>> >
>> > They work in parallel.
>> >
>> > So I don't know every partition's status before they are done.
>>
>> Normally, you would wait for them to finish, and execute a subsequent
>> request if more docs are needed.
>>
>> -Mike
>
>
>
>
> --
> regards
> jl




--
regards
jl
