No, we don't have a patch for it yet. You might file a JIRA issue for it?
I think the big win is a fairly easy one. Basically, right now when we update the cloud state, we look at the children of the 'shards' node, and then we read the data at each node individually. I imagine this is the part that breaks down :) We likely already have most of that info, though. Really, you should just have to compare the children of the 'shards' node with the list we already have from the last time we got the cloud state: remove any that are no longer in the list, read the data only for those not in the old list, and you get your new state efficiently.

- Mark Miller
lucidimagination.com
2011.lucene-eurocon.org | Oct 17-20 | Barcelona

On Sep 28, 2011, at 10:35 PM, Jamie Johnson wrote:

> Thanks Mark, found the TODO in ZkStateReader.java
>
> // TODO: - possibly: incremental update rather than reread everything
>
> Was there a patch they provided back to address this?
>
> On Tue, Sep 27, 2011 at 9:20 PM, Mark Miller <markrmil...@gmail.com> wrote:
>>
>> On Sep 26, 2011, at 11:42 AM, Jamie Johnson wrote:
>>
>>> Is there any limitation, be it technical or for sanity reasons, on the
>>> number of shards that can be part of a solr cloud implementation?
>>
>> The loggly guys ended up hitting a limit somewhere. Essentially, whenever
>> the cloud state is updated, info is read about each shard to update the
>> state (from ZooKeeper). There is a TODO that I put in there that says
>> something like, "consider updating this incrementally". Usually the data on
>> most shards has not changed, so there is no reason to read it all. They
>> implemented that today in their own code, but we have not yet done this in
>> trunk. What that places the upper limit at, I don't know; I imagine it
>> takes quite a few shards before it ends up being too much of a problem.
>> They shard by user, I believe, so lots of shards.
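
For what it's worth, the diff step described above could be sketched roughly like this. This is just an illustration with plain Java collections, not the actual Solr/ZkStateReader code: the `readShardData` callback stands in for the ZooKeeper read of a single shard's data, and the class name and method names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;

// Hypothetical sketch of an incremental cloud-state update: diff the
// current children of the 'shards' node against the previously cached
// set, so shard data is only read for shards we have not seen before,
// instead of re-reading every shard's data on each update.
public class IncrementalShardState {

    // Cached shard name -> shard data from the last cloud-state read.
    private final Map<String, String> cachedState = new HashMap<>();

    /**
     * Update the cache given the current children of the 'shards' node.
     * 'readShardData' stands in for a ZooKeeper data read of one shard
     * (an assumption for this sketch, not a real Solr API).
     * Returns the number of per-shard data reads performed.
     */
    public int update(Set<String> currentChildren,
                      Function<String, String> readShardData) {
        // Drop shards that no longer appear under the 'shards' node.
        cachedState.keySet().retainAll(currentChildren);

        // Read data only for shards that are new since the last update.
        int reads = 0;
        for (String shard : currentChildren) {
            if (!cachedState.containsKey(shard)) {
                cachedState.put(shard, readShardData.apply(shard));
                reads++;
            }
        }
        return reads;
    }

    public Map<String, String> snapshot() {
        return new HashMap<>(cachedState);
    }
}
```

With, say, 1000 shards and one shard added, this does 1 data read instead of 1001, which is the kind of win the TODO is pointing at. Note this sketch only detects added/removed shards; handling data changes on an existing shard would additionally need something like ZooKeeper watches or znode version checks.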