Let me give a bit of background. Our Solr cluster is multi-tenant, and we use one collection for each of our customers. In many cases these customers are very small, so their collection consists of just a single shard on a single Solr node. In fact, a non-trivial number of them are totally empty (e.g. trial customers that never did anything with their trial account). However, there are also some larger customers whose collections need to be sharded. Our strategy is to try to keep the total documents in any one shard under 20 million (honestly, I'm not sure where my coworker got that number from; I am open to alternatives, but I realize this is heavily app-specific).
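As a rough illustration of that sizing rule, here is a minimal sketch. The function and constant names are made up for this example, and it assumes documents spread evenly across shards, which hash-based routing only approximates. The 20 million figure is just our current threshold, not a Solr limit.

```python
# Hypothetical helper: estimates how many shards a collection needs
# under the "keep each shard under ~20 million documents" rule of thumb.
MAX_DOCS_PER_SHARD = 20_000_000  # app-specific threshold, not a Solr limit


def shards_needed(total_docs: int, max_docs_per_shard: int = MAX_DOCS_PER_SHARD) -> int:
    """Return the smallest shard count that keeps each shard at or below the threshold.

    Assumes documents are distributed evenly across shards.
    """
    if total_docs < 0:
        raise ValueError("total_docs must be non-negative")
    if max_docs_per_shard <= 0:
        raise ValueError("max_docs_per_shard must be positive")
    # Integer ceiling division; an empty collection still occupies one shard (one core).
    return max(1, -(-total_docs // max_docs_per_shard))


if __name__ == "__main__":
    for docs in (0, 5_000_000, 45_000_000):
        print(docs, "docs ->", shards_needed(docs), "shard(s)")
```

Note that even an empty trial collection counts as one core under this rule, which is why the number of cores per machine grows with the number of customers rather than with the amount of data.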
So my original question is not related to indexing or query traffic, but just the sheer number of cores. For example, if I have 10 active cores on a machine and everything is working fine, should I expect that everything will still work fine if I add 10 nearly-idle cores to that machine? What about 100? 1000? I figure the overhead of each core is probably fairly low but at some point starts to matter.

Does that make sense?

- Ian

On Tue, Mar 24, 2015 at 11:12 AM, Jack Krupansky <jack.krupan...@gmail.com> wrote:

> Shards per collection, or across all collections on the node?
>
> It will all depend on:
>
> 1. Your ingestion/indexing rate. High, medium or low?
> 2. Your query access pattern. Note that a typical query fans out to all
> shards, so having more shards than CPU cores means less parallelism.
> 3. How many collections you will have per node.
>
> In short, it depends on what you want to achieve, not some limit of Solr
> per se.
>
> Why are you even sharding the node anyway? Why not just run with a single
> shard per node, and do sharding by having separate nodes, to maximize
> parallel processing and availability?
>
> Also be careful to be clear about using the Solr term "shard" (a slice,
> across all replica nodes) as distinct from the Elasticsearch term "shard"
> (a single slice of an index for a single replica, analogous to a Solr
> "core".)
>
> -- Jack Krupansky
>
> On Tue, Mar 24, 2015 at 9:02 AM, Ian Rose <ianr...@fullstory.com> wrote:
>
> > Hi all -
> >
> > I'm sure this topic has been covered before but I was unable to find any
> > clear references online or in the mailing list.
> >
> > Are there any rules of thumb for how many cores (aka shards, since I am
> > using SolrCloud) is "too many" for one machine? I realize there is no one
> > answer (depends on size of the machine, etc.) so I'm just looking for a
> > rough idea. Something like the following would be very useful:
> >
> > * People commonly run up to X cores/shards on a mid-sized (4 or 8 core)
> > server without any problems.
> > * I have never heard of anyone successfully running X cores/shards on a
> > single machine, even if you throw a lot of hardware at it.
> >
> > Thanks!
> > - Ian