Hey *Shawn*, *Erik*,

I was wondering if there is a JIRA story I can track for splitting the current clusterstate.json into collection-specific clusterstate configs. I looked around a bit but couldn't get my hands on anything useful on that.

On Mon, Apr 28, 2014 at 7:43 AM, Shawn Heisey <s...@elyograg.org> wrote:

> On 4/28/2014 5:05 AM, Mukesh Jha wrote:
> > Thanks Erik,
> >
> > Sounds about right.
> >
> > BTW how long can I keep adding collections i.e. can I keep 5/10 years data
> > like this?
> >
> > Also what do you think of bullet 2) of having collection specific
> > configurations in zookeeper?
>
> Regarding bullet 2, there is work underway right now to create a
> separate clusterstate within zookeeper for each collection. I do not
> know how far along that work is.
>
> There are no hard limits in SolrCloud at all. The things that will
> cause issues with scalability are resource-related problems. You'll
> exceed the 1MB default limit on a zookeeper database pretty quickly. If
> you're not using the example jetty included with Solr, you'll exceed the
> default maxThreads on most servlet containers very quickly. You may run
> into problems with the default limits on Solr's HttpShardHandler.
>
> Running hundreds or thousands of cores efficiently will require lots of
> RAM, both for the OS disk cache and the java heap. A large java heap
> will require significant tuning of Java garbage collection parameters.
>
> Most operating systems limit a user to 1024 open files and 1024 running
> processes (which includes threads). These limits will need to be
> increased.
>
> There may be other limits imposed by the Solr config, Java, and/or the
> operating system that I have not thought of or stated here.
>
> Thanks,
> Shawn
>

--
Thanks & Regards,

*Mukesh Jha <me.mukesh....@gmail.com>*
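
For anyone following the thread, here is a minimal sketch for checking the per-user open-file and process limits Shawn mentions, run as the user that starts Solr. It is only an illustration, not anything from Solr itself: the helper names `current_limits` and `needs_raising` are hypothetical, the 1024 threshold is simply the figure quoted above, and Python's `resource` module is available on Unix-like systems only.

```python
import resource

# Threshold taken from the 1024 figure quoted in the thread (an assumption,
# not a Solr requirement); adjust it for your own deployment.
TYPICAL_DEFAULT = 1024


def current_limits():
    """Return the (soft, hard) limits for open files and processes/threads."""
    return {
        "open_files": resource.getrlimit(resource.RLIMIT_NOFILE),
        "processes": resource.getrlimit(resource.RLIMIT_NPROC),
    }


def needs_raising(soft, threshold=TYPICAL_DEFAULT):
    """True when a soft limit is finite and at or below the threshold."""
    return soft != resource.RLIM_INFINITY and soft <= threshold


if __name__ == "__main__":
    for name, (soft, hard) in current_limits().items():
        flag = "consider raising" if needs_raising(soft) else "ok"
        print(f"{name}: soft={soft} hard={hard} ({flag})")
```

The script only reports the limits; raising them is done outside Python, for example with `ulimit` in the startup shell or in the system's limits configuration.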