So I tested what I wrote, and man, was that wrong. I have updated it and created a JIRA for this issue. I also attached a patch to CloudState that addresses it. Feedback is appreciated.
https://issues.apache.org/jira/browse/SOLR-2799

On Wed, Sep 28, 2011 at 11:46 PM, Jamie Johnson <jej2...@gmail.com> wrote:
> I'll definitely create a JIRA for this. Looking at the code in
> CloudState I think we could do the following
>
> as we iterate over shardIdNames we check to see if the oldCloudState
> had the slice already, if so get the state from there, otherwise do
> what is already happening. Something like the following:
>
>     for (String shardIdZkPath : shardIdNames) {
>       Slice slice = null;
>       if(oldCloudState.liveNodesContain(shardIdZkPath)) {
>         slice =
>             oldCloudState.getCollectionStates().get(collection).get(shardIdZkPath);
>       }
>
>       if(slice == null){
>         Map<String,ZkNodeProps> shardsMap =
>             readShards(zkClient,
>                 shardIdPaths + "/" + shardIdZkPath);
>         slice = new Slice(shardIdZkPath, shardsMap);
>       }
>
>       slices.put(shardIdZkPath, slice);
>     }
>
> I don't see a need to remove the old states since we only keep the
> states that are already in oldCloudState and read new ones. Does that
> make sense?
>
> On Wed, Sep 28, 2011 at 11:01 PM, Mark Miller <markrmil...@gmail.com> wrote:
>> No, we don't have any patches for it yet. You might make a JIRA issue for it?
>>
>> I think the big win is a fairly easy one - basically, right now when we
>> update the cloud state, we look at the children of the 'shards' node, and
>> then we read the data at each node individually. I imagine this is the part
>> that breaks down :)
>>
>> We already likely have most of that info though - really, you should
>> just have to compare the children of the 'shards' node with the list we
>> already have from the last time we got the cloud state - remove any that are
>> no longer in the list, read the data for those not in the list, and get your
>> new state efficiently.
>>
>> - Mark Miller
>> lucidimagination.com
>> 2011.lucene-eurocon.org | Oct 17-20 | Barcelona
>>
>> On Sep 28, 2011, at 10:35 PM, Jamie Johnson wrote:
>>
>>> Thanks Mark, found the TODO in ZkStateReader.java
>>>
>>> // TODO: - possibly: incremental update rather than reread everything
>>>
>>> Was there a patch they provided back to address this?
>>>
>>> On Tue, Sep 27, 2011 at 9:20 PM, Mark Miller <markrmil...@gmail.com> wrote:
>>>>
>>>> On Sep 26, 2011, at 11:42 AM, Jamie Johnson wrote:
>>>>
>>>>> Is there any limitation, be it technical or for sanity reasons, on the
>>>>> number of shards that can be part of a solr cloud implementation?
>>>>
>>>>
>>>> The loggly guys ended up hitting a limit somewhere. Essentially, whenever
>>>> the cloud state is updated, info is read about each shard to update the
>>>> state (from zookeeper). There is a TODO that I put in there that says
>>>> something like, "consider updating this incrementally" - usually the data
>>>> on most shards has not changed, so no reason to read it all. They
>>>> implemented that today in their own code, but we have not yet done this in
>>>> trunk. What that places the upper limit at, I don't know - I imagine it
>>>> takes quite a few shards before it ends up being too much of a problem -
>>>> they shard by user I believe, so lots of shards.
>>>>
>>>>
>>>> - Mark Miller
>>>> lucidimagination.com
>>>> 2011.lucene-eurocon.org | Oct 17-20 | Barcelona
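
For anyone following the thread, here is a rough, standalone sketch of the incremental update Mark describes above: compare the current children of the 'shards' node with what we had last time, reuse what is already known, and only read the shards that are new. This is not the patch attached to SOLR-2799 and it does not use Solr's classes. The class name IncrementalSliceUpdate, the update method, and the readShard callback are all made up for illustration, and it is written in modern Java (lambdas) for brevity. It also assumes the data of a shard we already know about has not changed, which a real implementation would need some way to detect.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical sketch of an incremental state update; not Solr's actual code. */
public class IncrementalSliceUpdate {

    /**
     * Builds the new slice map for one collection.
     *
     * @param oldSlices slices from the previous cloud state (shard id to slice data)
     * @param shardIds  current children of the 'shards' node
     * @param readShard reads one shard's data (the expensive call, e.g. to ZooKeeper)
     */
    public static <S> Map<String, S> update(Map<String, S> oldSlices,
                                            List<String> shardIds,
                                            Function<String, S> readShard) {
        Map<String, S> slices = new LinkedHashMap<>();
        for (String shardId : shardIds) {
            // Reuse the slice from the previous state if we already have it.
            S slice = oldSlices.get(shardId);
            if (slice == null) {
                // Unknown shard: this is the only case that costs a read.
                slice = readShard.apply(shardId);
            }
            slices.put(shardId, slice);
        }
        // Shards no longer listed are dropped simply by never being copied over.
        return slices;
    }

    public static void main(String[] args) {
        Map<String, String> old = new LinkedHashMap<>();
        old.put("shard1", "cached-1");
        old.put("shard2", "cached-2");

        // Record which shards actually trigger a read.
        List<String> reads = new ArrayList<>();
        Map<String, String> updated = update(old, Arrays.asList("shard2", "shard3"), id -> {
            reads.add(id);
            return "fresh-" + id;
        });

        System.out.println(updated); // {shard2=cached-2, shard3=fresh-shard3}
        System.out.println(reads);   // [shard3]
    }
}
```

Looking the slice up by shard id in the old map, rather than going through the live nodes, is a deliberate choice in this sketch: the question being asked is "did we already read this shard", which is about the old slice map only.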