First, remember that SolrCloud is relatively new, so operational issues like this will doubtless accrue "folk wisdom" as we all gain experience.
But my current thinking is that the remote installations are essentially completely separate installations with no knowledge of each other. Your indexing process needs to be smart enough to send _one_ update request to each cluster. Or perhaps each cluster has its own indexing process. The point is that the nodes in the remote installations don't know about each other, so they don't incur the communications lag.

I say that because I'm pretty sure you're exactly right: the supposition is that the pipe connecting the remote installations is slow/expensive/whatever, and the chatter that'll go back and forth if they are aware of each other will be too expensive to really be practical.

YMMV, of course. I can imagine low-frequency updates where at least the indexing process wouldn't care, but then you'd still have the querying crossing the expensive pipe.

As you can tell, I don't have a tried-and-tested answer.

Best
Erick

On Sun, Oct 14, 2012 at 5:41 AM, AlexeyK <lex.kudi...@gmail.com> wrote:
> Hi,
> As far as I understand, SolrCloud eliminates the master-slave specifics, and
> automates both update and search seamlessly.
> What should I take into account configuring SolrCloud for a large customer
> with multiple physical locations?
> I mean, for older Solr I would define master 'close to the data' with batch
> replication to the search server (slave). I would have several such slaves
> for different geographical locations as well.
> How can I ensure (if at all) that search queries do not cross geographical
> boundaries? As far as I understand, SolrCloud routes to any arbitrary active
> replica.
> How can I control the indexing process so that the update request is routed
> to the closest server? If SolrCloud accidentally elects some remote replica
> as a current leader, the indexing process will deteriorate due to networking
> issues; moreover, the update requests will be also bounced back across the
> network as a part of the online replication process.
>
> Do I miss something fundamental in my assumptions/understanding of SolrCloud
> features?
>
> Thanks a lot,
> Alexey
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/SolrCloud-distributed-architecture-considerations-tp4013594.html
> Sent from the Solr - User mailing list archive at Nabble.com.
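
For what it's worth, the "one update request to each cluster" idea from the reply could look roughly like the Python sketch below. It is only an illustration under stated assumptions, not a tested recipe: the cluster names, host names, and collection name are hypothetical, and it assumes each installation accepts JSON documents on its /update handler. Each cluster is treated as a fully separate installation, and a failure at one site does not stop the others from being updated.

```python
import json
import urllib.request
from typing import Callable, Dict, List

# Hypothetical entry points, one per independent SolrCloud installation.
# The clusters know nothing about each other; only this indexer knows both.
CLUSTERS = {
    "us-east": "http://solr-us.example.com:8983/solr/collection1",
    "eu-west": "http://solr-eu.example.com:8983/solr/collection1",
}


def http_post(url: str, body: bytes) -> int:
    """POST a JSON body and return the HTTP status code."""
    req = urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return resp.status


def send_to_each_cluster(
    docs: List[dict],
    clusters: Dict[str, str],
    post: Callable[[str, bytes], int] = http_post,
) -> Dict[str, bool]:
    """Send the same batch once to every cluster; report success per cluster."""
    body = json.dumps(docs).encode("utf-8")
    results = {}
    for name, base_url in clusters.items():
        try:
            # Assumption: the JSON update endpoint lives at <collection>/update.
            status = post(base_url + "/update?commit=true", body)
            results[name] = status == 200
        except OSError:
            # URLError and HTTPError are OSError subclasses; record the
            # failure and carry on so one slow site does not block the rest.
            results[name] = False
    return results
```

Calling send_to_each_cluster([{"id": "1", "title": "hello"}], CLUSTERS) returns a dict of cluster name to success flag, so the indexer can retry only the installations that failed. Queries would then be pointed at the local installation only, which keeps search traffic off the expensive pipe.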