Re: LuceneRevolution - NoSQL: A comparison

Shawn Heisey Tue, 12 Oct 2010 08:12:39 -0700

 On 10/11/2010 6:32 PM, Peter Keegan wrote:

When Solr does a distributed search across shards, it does this in 2 phases
(correct me if I'm wrong):


1. 1st query to get the docIds and facet counts
2. 2nd query to retrieve the stored fields of the top hits

The problem here is that the index could change between (1) and (2), so it's
not an atomic transaction. If the stored fields were kept outside of Lucene,
only the first query would be necessary. However, this would mean that the
external NoSQL data store would have to be synchronized with the Lucene
index, which might present its own problems. (I'm just throwing this out for
discussion)

I've got a related issue that I have run into because of my use of a load balancer.

I have a total of seven shards, each of which has a replica. I've got one set of machines set up as brokers that have the shards parameter in the standard request handler. Queries are sent to the load balancer, which sends them to one of the brokers. The shards parameter sends requests back to the load balancer to be ultimately sent to an actual server.

I have a monitoring script that retrieves the latest document and alarms if it's older than ten minutes. Something that happens on occasion:


1) An update is made to the master (happens every two minutes).
2) Monitoring script requests newest document.
3) Initial request is sent to master, finds ID.
4) Second request is sent to the slave, document not found.
5) Up to 15 seconds later, the slave replicates.

I solved this problem by having the monitoring script try several times on failure, waiting a few seconds on each loop. Do I need to be terribly concerned about this impacting real queries?
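
For anyone wanting to try the same workaround, the retry idea could look roughly like the Python sketch below. This is not the actual monitoring script: fetch_newest is a hypothetical callable that runs the distributed query and returns the newest document's timestamp (or None when the second phase comes back empty), and the attempt count and delay are assumed values.

```python
import time
from datetime import datetime, timedelta, timezone

MAX_AGE = timedelta(minutes=10)  # alarm threshold described above


def newest_doc_time(fetch_newest, attempts=5, delay=3.0, sleep=time.sleep):
    """Call fetch_newest() until it returns a timestamp or attempts run out.

    fetch_newest is a hypothetical callable (an assumption of this sketch).
    It should return the newest document's datetime, or None when the
    document could not be retrieved, for example because the slave has
    not replicated yet.
    """
    for attempt in range(attempts):
        result = fetch_newest()
        if result is not None:
            return result
        if attempt < attempts - 1:
            sleep(delay)  # give the slave time to replicate before retrying
    return None


def should_alarm(doc_time, now=None):
    """Return True if no document was found or the newest one is too old."""
    if doc_time is None:
        return True
    now = now or datetime.now(timezone.utc)
    return now - doc_time > MAX_AGE
```

With a replication lag of up to 15 seconds, five attempts three seconds apart leave twelve seconds of waiting in total, so the numbers would need tuning to cover the worst case.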

I do not actually need to load balance, I have slave servers purely for failover. Currently the load balancer has a 3 to 1 weight ratio favoring the slaves, which I plan to increase. At one time I had the master set up as a backup rather than a lower weight target, but haproxy seemed to take longer to recover from failures in that mode. I will have to do some more comprehensive testing. If there's a better solution than haproxy that works with heartbeat, I can change that.


Thanks,
Shawn
