Yeah, there's been talk of making this configurable, but there are more pressing priorities so far.
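For illustration only, a minimal sketch of what such a knob might look like inside SolrCmdDistributor - this is not committed Solr code, and the solr.maxBufferedAddsPerServer system property is hypothetical:

    // Hypothetical sketch: replace the hard-coded buffer size in
    // SolrCmdDistributor with one read from a system property, keeping
    // the current 10 as the default. The property name is an assumption,
    // not a real Solr setting.
    static final int maxBufferedAddsPerServer =
        Integer.getInteger("solr.maxBufferedAddsPerServer", 10);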
So just to be clear, is this theoretical or practical? I know of several
very high-performance situations where 1,000 updates/sec (and I'm assuming
that's 1,000 docs/sec, not 1,000 batches of 1,000 docs) hasn't caused
problems. So unless you're actually seeing performance problems, as opposed
to fearing that there _might_ be some, I'd just go on to the next urgent
problem.

Best
Erick

On Fri, Jun 21, 2013 at 8:34 PM, Asif <talla...@gmail.com> wrote:
> Erick,
>
> Thanks for your reply.
>
> You are right about 10 updates being batched up - it was hard to figure
> out due to the large number of updates/logging that happens in our system.
>
> We are batching 1000 updates every time.
>
> Here is my observation from the leader and replica -
>
> 1. The leader logs clearly indicate that 1000 updates arrived -
> [ (1000 adds)],commit=]
> 2. On the replica - for each batch of 1000 document adds on the leader -
> I see a lot of requests, with no indication of how many updates are in
> each request.
>
> Digging a little into the Solr code, I found the variable I am interested
> in - maxBufferedAddsPerServer, which is set to 10 -
>
> http://svn.apache.org/viewvc/lucene/dev/trunk/solr/core/src/java/org/apache/solr/update/SolrCmdDistributor.java?view=markup
>
> This means that for a batch update of 1000 documents we will see 100
> requests to each replica - which translates into 100 writes per
> collection per second in our system.
>
> Should this variable be made configurable via solrconfig.xml (or any
> other appropriate place)?
>
> A little background about the system we are trying to build - a real-time
> analytics solution using Solr Cloud + atomic updates. We have a very high
> volume of writes - going as high as 1000 updates a second (possibly more
> in the long run).
>
> - Asif
>
>
> On Sat, Jun 22, 2013 at 4:21 AM, Erick Erickson
> <erickerick...@gmail.com> wrote:
>
>> Updates are batched, but on a per-request basis. So if
>> you're sending one document at a time, you won't get any
>> batching. If you send 10 docs at a time and they happen to
>> go to 10 different shards, you'll get 10 different update
>> requests.
>>
>> If you're sending 1,000 docs per update, you should be seeing
>> some batching going on.
>>
>> bq: but why not batch them up or give an option to batch N
>> updates in either of the above cases
>>
>> I suspect what you're seeing is that you're not sending very
>> many docs per update request and so are being misled.
>>
>> But that's a guess, since you haven't provided much in the
>> way of data on _how_ you're updating.
>>
>> bq: the cloud eventually starts to fail
>> How? Details matter.
>>
>> Best
>> Erick
>>
>> On Wed, Jun 19, 2013 at 4:23 AM, Asif <talla...@gmail.com> wrote:
>> > Hi,
>> >
>> > I had questions on the implementation of the sharding and replication
>> > features of Solr Cloud.
>> >
>> > 1. I noticed that when sharding is enabled for a collection,
>> > individual requests are sent to each node serving as a shard.
>> >
>> > 2. Replication too follows the above strategy of sending individual
>> > documents to the nodes serving as replicas.
>> >
>> > I am working with a system that requires a massive number of writes -
>> > I have noticed that, due to the above, the cloud eventually starts to
>> > fail (even though I am using an ensemble).
>> >
>> > I do understand the reason behind individual updates - but why not
>> > batch them up, or give an option to batch N updates, in either of the
>> > above cases? I did come across a presentation that talked about
>> > batching 10 updates for replication at least, but I do not think this
>> > is the case.
>> >
>> > - Asif
>>
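For reference, a minimal SolrJ (4.x-era) sketch of the client-side batching
discussed above - a single request carrying 1,000 documents, so the
client-to-cluster traffic is one HTTP call rather than 1,000. The ZooKeeper
hosts, collection name, and field names here are placeholders:

    import java.util.ArrayList;
    import java.util.List;

    import org.apache.solr.client.solrj.impl.CloudSolrServer;
    import org.apache.solr.common.SolrInputDocument;

    public class BatchedUpdates {
        public static void main(String[] args) throws Exception {
            // ZooKeeper ensemble and collection name are placeholders.
            CloudSolrServer server =
                new CloudSolrServer("zk1:2181,zk2:2181,zk3:2181");
            server.setDefaultCollection("analytics");

            // Build one batch of 1,000 documents.
            List<SolrInputDocument> batch =
                new ArrayList<SolrInputDocument>(1000);
            for (int i = 0; i < 1000; i++) {
                SolrInputDocument doc = new SolrInputDocument();
                doc.addField("id", "doc-" + i);
                doc.addField("count_i", i);
                batch.add(doc);
            }

            // One add() call sends the whole batch in a single request.
            server.add(batch);
            server.commit();
            server.shutdown();
        }
    }

Note that, per the thread, the leader still splits each batch into chunks of
maxBufferedAddsPerServer (10) when forwarding to replicas; client-side
batching only reduces the number of client-to-leader requests.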