We actually do batch updates currently; we are being somewhat loose when we
say "a document at a time". There is a buffer of updates per replica that gets
flushed depending on the requests coming through and the buffer size.
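
For illustration only, here is a minimal Python sketch of that kind of
per-replica buffering. It is not the SolrCloud implementation: the class name,
the send callback, and the max_buffered threshold are all hypothetical, and the
real flush conditions are only described loosely above.

```python
from collections import defaultdict
from typing import Callable, Dict, List


class ReplicaUpdateBuffer:
    """Toy per-replica update buffer (illustrative only, not Solr code)."""

    def __init__(self, send: Callable[[str, List[dict]], None],
                 max_buffered: int = 10):
        self._send = send                  # hypothetical transport: (replica, docs)
        self._max_buffered = max_buffered  # hypothetical flush threshold
        self._pending: Dict[str, List[dict]] = defaultdict(list)

    def add(self, replica: str, doc: dict) -> None:
        """Queue one update; flush that replica's buffer once it is full."""
        self._pending[replica].append(doc)
        if len(self._pending[replica]) >= self._max_buffered:
            self.flush(replica)

    def flush(self, replica: str) -> None:
        """Send everything queued for one replica as a single batch."""
        docs = self._pending.pop(replica, [])
        if docs:
            self._send(replica, docs)

    def flush_all(self) -> None:
        """Flush every replica, e.g. when the incoming request finishes."""
        for replica in list(self._pending):
            self.flush(replica)


# Demo: 7 documents forwarded to 2 replicas with a threshold of 3.
sent = []
buf = ReplicaUpdateBuffer(lambda r, docs: sent.append((r, len(docs))),
                          max_buffered=3)
for i in range(7):
    for replica in ("replica1", "replica2"):
        buf.add(replica, {"id": i})
buf.flush_all()  # push out the partially filled buffers
```

In this toy run the 14 forwarded updates go out as 6 batches rather than 14
separate sends, which is the throughput effect being discussed.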

- Mark Miller
lucidimagination.com

On Feb 28, 2012, at 3:38 AM, eks dev wrote:

> SolrCloud is going to be great. The NRT feature is a really huge step
> forward, as well as central configuration, elasticity ...
> 
> The only thing I do not yet understand is the treatment of cases that were
> traditionally covered by a Master/Slave setup: batch updates.
> 
> If I get it right (?), updates to replicas are sent one by one,
> meaning that when one server receives an update, it gets forwarded to all
> replicas. This is great for the low-update-latency case, but I do not
> know how it is implemented if you hit it with a "batch" update. This
> would cause a huge number of update commands going to replicas. Not so
> good for throughput.
> 
> - Master/Slave does distribution at the segment level (no need to
> replicate analysis, far less network traffic). Good for batch updates.
> - SolrCloud distributes per update command (low latency, but chatty, and
> the analysis step is done N_Servers times). Good for incremental updates.
> 
> Ideally, some sort of "batching" is going to be available in
> SolrCloud, and some control over it, e.g. forward batches of 1000
> documents (basically keep the update log slightly longer and forward it as
> a batch update command). This would still cause duplicate analysis,
> but would reduce network traffic.
> 
> Please bear in mind, this is more of a question than a statement; I
> didn't look at the cloud code. It might be that I am completely wrong here!
> 
> On Tue, Feb 28, 2012 at 4:01 AM, Erick Erickson <erickerick...@gmail.com> wrote:
>> As I understand it (and I'm just getting into SolrCloud myself), you can
>> essentially forget about master/slave stuff. If you're using NRT,
>> the soft commit will make the docs visible; you don't need to do a hard
>> commit (unlike the master/slave days). Essentially, the update is sent
>> to each shard leader and then fanned out to the replicas for that
>> leader. All automatically. Leaders are elected automatically. ZooKeeper
>> is used to keep the cluster information.
>> 
>> Additionally, SolrCloud keeps a transaction log of the updates, and replays
>> them if the indexing is interrupted, so you don't risk data loss the way
>> you used to.
>> 
>> There aren't really masters/slaves in the old sense any more, so
>> you have to get out of that thought-mode (it's hard, I know).
>> 
>> The code is under pretty active development, so any feedback is
>> valuable....
>> 
>> Best
>> Erick
>> 
>> On Mon, Feb 27, 2012 at 3:26 AM, roz dev <rozde...@gmail.com> wrote:
>>> Hi All,
>>> 
>>> I am trying to understand features of Solr Cloud, regarding commits and
>>> scaling.
>>> 
>>>   - If I am using Solr Cloud, do I need to explicitly call commit
>>>   (hard commit)? Or is a soft commit okay, and Solr Cloud will do the job of
>>>   writing to disk?
>>>   - Do we still need to use a Master/Slave setup to scale searching? If we
>>>   have to use a Master/Slave setup, do I need to issue a hard commit to make
>>>   my changes visible to slaves?
>>>   - If I were to use NRT with a Master/Slave setup and soft commits,
>>>   will the slaves be able to see changes made on the master with a soft commit?
>>> 
>>> Any inputs are welcome.
>>> 
>>> Thanks
>>> 
>>> -Saroj











