On Mar 4, 2012, at 5:43 PM, Markus Jelsma wrote:

> everything stalls after it lists all segment files and that a ZK state change 
> has occurred.

Can you get a stack trace here? I'll try to respond to more tomorrow. What 
version of trunk are you using? We have been making fixes and improvements all 
the time, so I need a frame of reference.

When a client node cannot talk to ZooKeeper, it may not know certain things it 
should (what if a leader changes?), so it must reject updates (searches will 
still work). Why can't the node talk to ZooKeeper? Perhaps the load on the 
server is so high that it cannot respond to ZK within the session timeout? I 
really don't know yet. When this happens, though, it forces a recovery when/if 
the node can reconnect to ZooKeeper.
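
A rough illustration of the behaviour described above, as a toy model rather 
than actual Solr code: the Node class, its method names, and the 
needs_recovery flag are all hypothetical and exist only to show updates being 
rejected while disconnected, searches still being served, and a recovery being 
flagged on reconnect.

```python
class Node:
    """Toy model of a node; NOT a real Solr or ZooKeeper class."""

    def __init__(self):
        self.zk_connected = True
        self.needs_recovery = False
        self.index = []  # local documents (plain strings in this sketch)

    def on_zk_disconnected(self):
        # Hypothetical callback: session lost or timed out.
        self.zk_connected = False

    def on_zk_reconnected(self):
        # The node may have missed a leader change or updates while it was
        # away, so it is marked for recovery once it is connected again.
        if not self.zk_connected:
            self.zk_connected = True
            self.needs_recovery = True

    def update(self, doc):
        # Without ZooKeeper the node cannot trust its view of the cluster,
        # so it refuses to accept the update.
        if not self.zk_connected:
            raise RuntimeError("cannot talk to ZooKeeper; update rejected")
        self.index.append(doc)

    def search(self, term):
        # Searches are still answered from the local index.
        return [doc for doc in self.index if term in doc]
```

In this sketch the caller sees a rejected update as an exception and would 
have to retry it later; how a real client should react is not covered here.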

We have not yet started on optimizing bulk indexing - currently an update is 
added locally *before* sending updates in parallel to each replica. Then we 
wait for each response before responding to the client. We plan to offer more 
optimizations and options around this.
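
The update flow described above can be sketched roughly as follows. This is 
only an illustration under stated assumptions, not Solr's implementation: the 
Replica class and distribute_update function are hypothetical names, and a 
thread pool stands in for the parallel requests to replicas.

```python
from concurrent.futures import ThreadPoolExecutor


class Replica:
    """Hypothetical stand-in for an index copy; NOT a Solr class."""

    def __init__(self, name):
        self.name = name
        self.docs = []

    def add(self, doc):
        self.docs.append(doc)
        return True  # acknowledgement


def distribute_update(local, replicas, doc):
    # 1. Add the update locally *before* forwarding it.
    local.add(doc)
    if not replicas:
        return True
    # 2. Send the update to each replica in parallel.
    with ThreadPoolExecutor(max_workers=len(replicas)) as pool:
        futures = [pool.submit(replica.add, doc) for replica in replicas]
        # 3. Wait for each response.
        acks = [future.result() for future in futures]
    # 4. Only now respond to the client.
    return all(acks)
```

Because the response waits on every replica, the latency of a single update is 
bounded by the slowest replica, which is one reason bulk indexing has room for 
optimization.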

Feedback will be useful in making some of these improvements.


- Mark Miller
lucidimagination.com










