For what it's worth, in SPM we keep track of nodes/server stats, of course, and that metric has been going up for those using SPM to monitor Solr clusters, which is a nice sign.

Otis
--
Solr & ElasticSearch Support -- http://sematext.com/
Solr Performance Monitoring -- http://sematext.com/spm

On Wed, Jul 10, 2013 at 9:29 AM, Jack Krupansky <j...@basetechnology.com> wrote:

> Again, no hard limits, mostly performance-based limits and environmental
> factors of your own environment, as well as the fact that most people on
> this list will have deeper experience with smaller clusters, so if you
> decide to "go big", you will be in uncharted and untested territory.
>
> I would relax my number a little (actually, double it) to 64 nodes, to
> handle the 8-shard, 8-replica case, since just yesterday somebody on the
> list mentioned that they were using such a configuration.
>
> In other words, with configurations up to 16 or 32 or even 64 nodes, you
> will readily find people here who might be able to help support you, but if
> you are thinking of a 16-shard, 16-replica cluster with 256 nodes or
> 32-shard, 32-replica cluster with 1,024 nodes, it's not that that will hit
> any hard limit in Solr, but simply that not as many people will be able to
> provide support, answer questions, or simply confirm that yes, a cluster
> that big is a... "slam-dunk." And if you do want to try a 1,024-node
> cluster, you absolutely should do a Proof of Concept implementation first.
>
> I actually don't have any hard, empirical evidence to back up my 32/64-node
> guidance, but it seems reasonable and consistent with configurations people
> commonly talk about. Generally, people talk about smaller clusters, so I'm
> stretching a little to get up to my 32/64 guidance. And, to be clear, that's
> just a rough guide and not intended to guarantee that a 64-node cluster will
> perform really well, nor to imply that a 96-node or 128-node cluster won't
> perform well.
>
> -- Jack Krupansky
>
> -----Original Message----- From: Ramkumar R. Aiyengar
> Sent: Wednesday, July 10, 2013 4:03 AM
> To: solr-user@lucene.apache.org
> Subject: Re: Solr limitations
>
> I understand, thanks. I just wanted to check in case there were scalability
> limitations with how SolrCloud operates.
>
> On 9 Jul 2013 12:45, "Erick Erickson" <erickerick...@gmail.com> wrote:
>
>> I think Jack was mostly thinking in "slam dunk" terms. I know of
>> SolrCloud demo clusters with 500+ nodes, and at that point
>> people said "it's going to work for our situation, we don't need
>> to push more".
>>
>> As you start getting into that kind of scale, though, you really
>> have a bunch of ops considerations etc. Mostly when I get into
>> larger scales I pretty much want to examine my assumptions
>> and see if they're correct, perhaps start to trim my requirements
>> etc.
>>
>> FWIW,
>> Erick
>>
>> On Tue, Jul 9, 2013 at 4:07 AM, Ramkumar R. Aiyengar
>> <andyetitmo...@gmail.com> wrote:
>>
>> >> 5. No more than 32 nodes in your SolrCloud cluster.
>> >
>> > I hope this isn't too OT, but what tradeoffs is this based on? Would have
>> > thought it easy to hit this number for a big index and high load (hence
>> > with the view of both the number of shards and replicas horizontally
>> > scaling..)
>> >
>> >> 6. Don't return more than 250 results on a query.
>> >>
>> >> None of those is a hard limit, but don't go beyond them unless your
>> >> Proof of Concept testing proves that performance is acceptable for
>> >> your situation.
>> >>
>> >> Start with a simple 4-node, 2-shard, 2-replica cluster for preliminary
>> >> tests and then scale as needed.
>> >>
>> >> Dynamic and multivalued fields? Try to stay away from them - except
>> >> for the simplest cases, they are usually an indicator of a weak data
>> >> model. Sure, it's fine to store a relatively small number of values in
>> >> a multivalued field (say, dozens of values), but be aware that you
>> >> can't directly access individual values, you can't tell which was
>> >> matched on a query, and you can't coordinate values between multiple
>> >> multivalued fields. Except for very simple cases, multivalued fields
>> >> should be flattened into multiple documents with a parent ID.
>> >>
>> >> Since you brought up the topic of dynamic fields, I am curious how you
>> >> got the impression that they were a good technique to use as a
>> >> starting point. They're fine for prototyping and hacking, and fine
>> >> when used in moderation, but not when used to excess. The whole point
>> >> of Solr is searching and searching is optimized within fields, not
>> >> across fields, so having lots of dynamic fields is counter to the
>> >> primary strengths of Lucene and Solr. And... schemas with lots of
>> >> dynamic fields tend to be difficult to maintain. For example, if you
>> >> wanted to ask a support question here, one of the first things we want
>> >> to know is what your schema looks like, but with lots of dynamic
>> >> fields it is not possible to have a simple discussion of what your
>> >> schema looks like.
>> >>
>> >> Sure, there is something called "schemaless design" (and Solr supports
>> >> that in 4.4), but that's very different from heavy reliance on dynamic
>> >> fields in the traditional sense. Schemaless design is A-OK, but using
>> >> dynamic fields for "arrays" of data in a single document is a poor
>> >> match for the search features of Solr (e.g., Edismax searching across
>> >> multiple fields.)
>> >>
>> >> One other tidbit: Although Solr does not enforce naming conventions
>> >> for field names, and you can put special characters in them, there are
>> >> plenty of features in Solr, such as the common "fl" parameter, where
>> >> field names are expected to adhere to Java naming rules. When people
>> >> start "going wild" with dynamic fields, it is common that they start
>> >> "going wild" with their names as well, using spaces, colons, slashes,
>> >> etc. that cannot be parsed in the "fl" and "qf" parameters, for
>> >> example. Please don't go there!
>> >>
>> >> In short, put up a small cluster and start doing a Proof of Concept
>> >> cluster. Stay within my suggested guidelines and you should do okay.
>> >>
>> >> -- Jack Krupansky
>> >>
>> >> -----Original Message----- From: Marcelo Elias Del Valle
>> >> Sent: Monday, July 08, 2013 9:46 AM
>> >> To: solr-user@lucene.apache.org
>> >> Subject: Solr limitations
>> >>
>> >> Hello everyone,
>> >>
>> >> I am trying to search information about possible Solr limitations I
>> >> should consider in my architecture. Things like max number of dynamic
>> >> fields, max number of documents in SolrCloud, etc.
>> >> Does anyone know where I can find this info?
>> >>
>> >> Best regards,
>> >> --
>> >> Marcelo Elias Del Valle
>> >> http://mvalle.com - @mvallebr
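
For anyone who wants to sanity-check a design against the rules of thumb quoted in this thread, here is a minimal Python sketch, not anything from Solr itself. It assumes one node per shard replica (so nodes = shards x replicas, as in the 2-shard, 2-replica, 4-node example), approximates "Java naming rules" with a conservative ASCII pattern, and shows one possible way to flatten a multivalued field into child documents with a parent ID. The function names, the "parent_id" key, and the child id scheme are all made up for illustration.

```python
import re

# Conservative ASCII subset of Java identifier rules (an assumption for this
# sketch): a letter or underscore first, then letters, digits, or underscores.
SAFE_FIELD_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def node_count(shards, replicas):
    """Nodes needed when every replica of every shard gets its own node."""
    return shards * replicas


def is_safe_field_name(name):
    """True if the name avoids spaces, colons, slashes, and similar characters."""
    return SAFE_FIELD_NAME.fullmatch(name) is not None


def flatten_multivalued(parent_id, field, values):
    """Turn one multivalued field into one child document per value.

    The "parent_id" key and the "<parent>_<n>" id scheme are hypothetical,
    not Solr conventions.
    """
    return [
        {"id": f"{parent_id}_{i}", "parent_id": parent_id, field: value}
        for i, value in enumerate(values)
    ]


if __name__ == "__main__":
    # Cluster sizes mentioned in the thread.
    for shards, replicas in [(2, 2), (8, 8), (16, 16), (32, 32)]:
        print(shards, "shards x", replicas, "replicas =",
              node_count(shards, replicas), "nodes")
    # Field names of the kind the thread warns about.
    for name in ["price_usd", "price usd", "attr:color", "a/b"]:
        print(repr(name), is_safe_field_name(name))
    print(flatten_multivalued("doc1", "tag", ["red", "blue"]))
```

Whether flattening is worth it depends on your queries; as noted above, a few dozen values in a multivalued field is fine for simple cases.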