James,

It looks like people have already answered your questions. Split your big index and put the pieces on multiple servers, with Solr running on each. Then write an application that searches the Solr instances in parallel, gets N results from each, combines them, and orders them by score.
As far as I know, this is the best you can do with what is available from Solr today. For anything else, you'll have to roll up your sleeves and dig into the code. Good luck!

Otis
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Simpy -- http://www.simpy.com/ - Tag - Search - Share

----- Original Message ----
From: James liu <[EMAIL PROTECTED]>
To: solr-user@lucene.apache.org
Sent: Thursday, April 5, 2007 1:18:30 AM
Subject: Re: Does solr support Multi index and return by score and datetime

Has anyone had a problem like this, and how did you solve it?

2007/4/5, James liu <[EMAIL PROTECTED]>:
>
> 2007/4/5, Mike Klaas <[EMAIL PROTECTED]>:
> >
> > On 4/4/07, James liu <[EMAIL PROTECTED]> wrote:
> >
> > > > > I think it is part of full-text search.
> > >
> > > I think querying slaves and combining results by score should be
> > > part of Solr.
> > >
> > > I found http://dev.lucene-ws.net/wiki/MultiIndexOperations
> > > but I want to use Solr, and I like it.
> > >
> > > Now I want to find a good method to solve it using Solr and less
> > > coding. (More code will cost more time to write and test.)
> >
> > I agree that it would be an excellent addition to Solr, but it is a
> > major undertaking, and so I wouldn't wait around for it if it is
> > important to you. Solr devs have code to write and test too :).
> >
> > > > > > If your document distribution is uniform random, then the
> > > > > > norms converge to approximately equal values anyway.
> > > > >
> > > > > I don't know it.
> > >
> > > I don't know why you say "document distribution". Does it mean
> > > that if I write the code independently, I will have to consider it?
> >
> > One of the complexities of querying multiple remote Solr/Lucene
> > instances is that the scores are not directly comparable, as the term
> > idf scores will be different. However, in practical situations, this
> > can be glossed over.
> >
> > This is the basic algorithm for single-pass querying of multiple Solr
> > slaves. Say you want results N to N + M (e.g. 10 to 20).
> >
> > 1. Query each Solr instance independently for N+M documents for the
> >    given query. This should be done asynchronously (or you could
> >    spawn a thread per server).
> > 2. Wait for all responses (or for a certain timeout).
> > 3. Put all returned documents into an array, and reverse sort by
> >    score.
> > 4. Select documents [N, N+M) from this array.
> >
> > This is a relatively simple task. It gets more complicated once
> > multiple passes, idf compensation, deduplication, etc. are added.
> >
> > -Mike
>
> Thanks, Mike.
>
> I find it more complicated than I thought.
>
> Is it the only way to solve my problem?
>
> I have a project with 100 GB of data, and I now have 3-4 servers for
> Solr.
>
> --
> regards
> jl

--
regards
jl
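
For reference, the four steps Mike lists in the quoted message can be sketched roughly as follows. This is only an illustration under stated assumptions, not code from Solr: the function name `search_shards` and the per-server `fetch(query, limit)` callables are hypothetical stand-ins for whatever client code talks to each Solr instance, and each is assumed to return `(score, doc_id)` pairs. As noted in the thread, scores from different instances are not strictly comparable; the sketch glosses over that.

```python
from concurrent.futures import ThreadPoolExecutor, wait


def search_shards(shards, query, start, rows, timeout=5.0):
    """Return merged hits [start, start + rows) across all shards.

    `shards` is a list of hypothetical callables fetch(query, limit)
    that each return a list of (score, doc_id) tuples.
    """
    limit = start + rows  # step 1: each shard must supply N+M documents
    pool = ThreadPoolExecutor(max_workers=max(1, len(shards)))
    futures = [pool.submit(fetch, query, limit) for fetch in shards]

    # Step 2: wait for all responses, or give up after `timeout` seconds.
    done, _not_done = wait(futures, timeout=timeout)
    pool.shutdown(wait=False)

    # Step 3: collect every hit from the shards that answered in time,
    # skipping shards that raised, then sort by score, highest first.
    merged = []
    for future in done:
        if future.exception() is None:
            merged.extend(future.result())
    merged.sort(key=lambda hit: hit[0], reverse=True)

    # Step 4: select documents [N, N+M) from the merged list.
    return merged[start:start + rows]
```

A shard that times out or fails is simply left out of the merge here, which is one possible policy; multiple passes, idf compensation and deduplication are not attempted.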