On Thu, 20 Sep 2007 09:37:51 +0800 "Jarvis" <[EMAIL PROTECTED]> wrote:
> If we use the RPC call in nutch .

Hi,
I wasn't suggesting using nutch in solr... I'm only a young grasshopper in this league to be suggesting architecture stuff :) but I imagine there's nothing wrong with using what they've built if it addresses solr's needs.

> Manually separate the index is required .

Hmm, I imagine this really depends on the application. In my case, the decision of which docs go where happens @ a completely different layer.

> We will receive reduplicate result if there is reduplicate index document on
> different servers.

Maybe I got this wrong... but isn't this what mapreduce is meant to deal with? e.g.:

1) get the job (a query)
2) map it to workers (servers that provide search results from their own indexes)
3) wait for the results from all workers that reply within an acceptable timeframe
4) comb through the results from all workers and reduce them according to your own biz rules (e.g. remove dupes, sort them by quality / priority... here possibly relying on the original parameters of the query in 1)
5) return the reduced results to the frontend.

> And also the data updating and single server's error is
> hard to deal with.

This really depends on your infrastructure + design. Having the indexing, searching and providing of results in different layers should make for some interesting design options... If each searcher (or wherever the index resides) is really a small cluster of servers, the issue of data safety / server error is addressed @ that point.

You can also have repeated data across indexes (again, independent indexes) and that's a more ... randomised :) way of keeping the docs safe... For example, IIRC, GoogleFS keeps copies of each file on 3 servers or more...

cheers,
B
_________________________
{Beto|Norberto|Numard} Meijome

"He uses statistics as a drunken man uses lamp-posts ... for support rather than illumination."
   Andrew Lang (1844-1912)

I speak for myself, not my employer. Contents may be hot. Slippery when wet.
Reading disclaimers makes you go blind. Writing them is worse. You have been Warned.
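
PS: the five steps in the reply above could be sketched roughly like this in Python. It is only an illustration of the fan-out / timeout / merge idea, not how solr or nutch actually do it. Everything here is an assumption made up for the example: the name scatter_gather, the idea that each worker is a plain callable returning (doc_id, score) pairs, and the "keep the best score per doc" rule standing in for whatever biz rules you really have.

```python
from concurrent.futures import ThreadPoolExecutor, wait


def scatter_gather(query, workers, timeout_s=0.5, limit=10):
    """Hypothetical helper. `workers` are callables: query -> [(doc_id, score), ...]."""
    pool = ThreadPoolExecutor(max_workers=max(1, len(workers)))

    # Step 2: map the query to every worker.
    futures = [pool.submit(worker, query) for worker in workers]

    # Step 3: keep only the workers that reply within the acceptable timeframe.
    done, _late = wait(futures, timeout=timeout_s)
    pool.shutdown(wait=False)  # do not block on the stragglers

    # Step 4: reduce. Here "biz rules" = drop dupes (keep the best score), sort by score.
    best = {}
    for future in done:
        if future.exception() is not None:
            continue  # a worker that errored simply contributes nothing
        for doc_id, score in future.result():
            if doc_id not in best or score > best[doc_id]:
                best[doc_id] = score
    ranked = sorted(best.items(), key=lambda item: (-item[1], item[0]))

    # Step 5: hand the reduced results back to the frontend.
    return ranked[:limit]
```

Note that a doc sitting in two indexes shows up only once in the output, and a slow or dead searcher just drops out of that particular answer instead of failing the whole query.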