Multi-tenancy is a bad idea for a single Solr cluster. Better to give each tenant a separate Solr instance that you spin up and spin down based on demand.
Think about it: If there are a small number of tenants, just giving each their own machine will be cheaper than the effort spent managing a multi-tenant cluster, and if there are a large number of tenants or even a moderate number of large tenants, you can't expect them to all run reasonably on a relatively small cluster. Think about scalability.

-- Jack Krupansky

On Tue, Mar 24, 2015 at 1:22 PM, Ian Rose <ianr...@fullstory.com> wrote:

> Let me give a bit of background. Our Solr cluster is multi-tenant, where
> we use one collection for each of our customers. In many cases, these
> customers are very tiny, so their collection consists of just a single
> shard on a single Solr node. In fact, a non-trivial number of them are
> totally empty (e.g. trial customers that never did anything with their
> trial account). However there are also some customers that are larger,
> requiring their collection to be sharded. Our strategy is to try to keep
> the total documents in any one shard under 20 million (honestly not sure
> where my coworker got that number from - I am open to alternatives but I
> realize this is heavily app-specific).
>
> So my original question is not related to indexing or query traffic, but
> just the sheer number of cores. For example, if I have 10 active cores on
> a machine and everything is working fine, should I expect that everything
> will still work fine if I add 10 nearly-idle cores to that machine? What
> about 100? 1000? I figure the overhead of each core is probably fairly
> low but at some point starts to matter.
>
> Does that make sense?
> - Ian
>
>
> On Tue, Mar 24, 2015 at 11:12 AM, Jack Krupansky <jack.krupan...@gmail.com> wrote:
>
> > Shards per collection, or across all collections on the node?
> >
> > It will all depend on:
> >
> > 1. Your ingestion/indexing rate. High, medium or low?
> > 2. Your query access pattern. Note that a typical query fans out to all
> > shards, so having more shards than CPU cores means less parallelism.
> > 3. How many collections you will have per node.
> >
> > In short, it depends on what you want to achieve, not some limit of Solr
> > per se.
> >
> > Why are you even sharding the node anyway? Why not just run with a single
> > shard per node, and do sharding by having separate nodes, to maximize
> > parallel processing and availability?
> >
> > Also be careful to be clear about using the Solr term "shard" (a slice,
> > across all replica nodes) as distinct from the Elasticsearch term "shard"
> > (a single slice of an index for a single replica, analogous to a Solr
> > "core".)
> >
> >
> > -- Jack Krupansky
> >
> > On Tue, Mar 24, 2015 at 9:02 AM, Ian Rose <ianr...@fullstory.com> wrote:
> >
> > > Hi all -
> > >
> > > I'm sure this topic has been covered before but I was unable to find any
> > > clear references online or in the mailing list.
> > >
> > > Are there any rules of thumb for how many cores (aka shards, since I am
> > > using SolrCloud) is "too many" for one machine? I realize there is no one
> > > answer (depends on size of the machine, etc.) so I'm just looking for a
> > > rough idea. Something like the following would be very useful:
> > >
> > > * People commonly run up to X cores/shards on a mid-sized (4 or 8 core)
> > > server without any problems.
> > > * I have never heard of anyone successfully running X cores/shards on a
> > > single machine, even if you throw a lot of hardware at it.
> > >
> > > Thanks!
> > > - Ian
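
For readers who want to put numbers on the two rules of thumb mentioned in this thread, here is a minimal Python sketch. It assumes the 20 million documents-per-shard ceiling that Ian describes (which he himself calls app-specific) and Jack's observation that a query fans out to all shards of a collection. The function names, the tenant names, and the document counts are all hypothetical; nothing here calls Solr or reflects a Solr-imposed limit.

```python
# Assumption taken from the thread: keep each shard at or below ~20 million docs.
MAX_DOCS_PER_SHARD = 20_000_000


def shards_needed(doc_count, max_docs_per_shard=MAX_DOCS_PER_SHARD):
    """Return the shard count for one collection.

    Uses integer ceiling division; an empty collection still gets one shard.
    """
    return max(1, (doc_count + max_docs_per_shard - 1) // max_docs_per_shard)


def exceeds_cpu_cores(shards_on_node, cpu_cores):
    """True when a fanned-out query would hit more shards on a node than it has CPU cores."""
    return shards_on_node > cpu_cores


# Hypothetical tenants: an empty trial account, a tiny customer, a large one.
tenants = {"trial": 0, "small": 150_000, "large": 45_000_000}
plan = {name: shards_needed(docs) for name, docs in tenants.items()}
print(plan)
```

This only estimates shard counts per collection. It says nothing about the per-core overhead of many idle cores, which is the open question in the thread.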