Multi-tenancy is a bad idea for a single Solr cluster. Better to give each
tenant a separate Solr instance that you spin up and spin down based on
demand.

Think about it: If there are a small number of tenants, just giving each
their own machine will be cheaper than the effort spent managing a
multi-tenant cluster, and if there are a large number of tenants or even a
moderate number of large tenants, you can't expect them to all run
reasonably on a relatively small cluster. Think about scalability.
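
As a rough illustration of the sizing arithmetic behind the 20-million-documents-per-shard ceiling mentioned in the quoted message below, here is a minimal Python sketch. The function name and the example tenant sizes are hypothetical, the ceiling is app-specific, and the sketch assumes documents spread evenly across shards and treats the ceiling as "at or under".

```python
import math

# Ceiling quoted in the thread; treat it as an app-specific assumption.
MAX_DOCS_PER_SHARD = 20_000_000


def shards_needed(doc_count: int, max_docs_per_shard: int = MAX_DOCS_PER_SHARD) -> int:
    """Return the smallest shard count that keeps each shard at or under the ceiling.

    Assumes documents are distributed evenly across shards. An empty
    collection still needs one shard (one core) to exist at all.
    """
    if doc_count < 0:
        raise ValueError("doc_count must be non-negative")
    if max_docs_per_shard <= 0:
        raise ValueError("max_docs_per_shard must be positive")
    return max(1, math.ceil(doc_count / max_docs_per_shard))


# Hypothetical tenant sizes: an empty trial account, a small customer,
# and a large customer that has to be sharded.
for docs in (0, 5_000_000, 130_000_000):
    print(docs, shards_needed(docs))
```

Summing this over all tenants gives the number of cores the cluster has to host, which is the figure the original question is about.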


-- Jack Krupansky

On Tue, Mar 24, 2015 at 1:22 PM, Ian Rose <ianr...@fullstory.com> wrote:

> Let me give a bit of background.  Our Solr cluster is multi-tenant, where
> we use one collection for each of our customers.  In many cases, these
> customers are very tiny, so their collection consists of just a single
> shard on a single Solr node.  In fact, a non-trivial number of them are
> totally empty (e.g. trial customers that never did anything with their
> trial account).  However there are also some customers that are larger,
> requiring their collection to be sharded.  Our strategy is to try to keep
> the total documents in any one shard under 20 million (honestly not sure
> where my coworker got that number from - I am open to alternatives but I
> realize this is heavily app-specific).
>
> So my original question is not related to indexing or query traffic, but
> just the sheer number of cores.  For example, if I have 10 active cores on
> a machine and everything is working fine, should I expect that everything
> will still work fine if I add 10 nearly-idle cores to that machine?  What
> about 100?  1000?  I figure the overhead of each core is probably fairly
> low but at some point starts to matter.
>
> Does that make sense?
> - Ian
>
>
> On Tue, Mar 24, 2015 at 11:12 AM, Jack Krupansky <jack.krupan...@gmail.com> wrote:
>
> > Shards per collection, or across all collections on the node?
> >
> > It will all depend on:
> >
> > 1. Your ingestion/indexing rate. High, medium or low?
> > 2. Your query access pattern. Note that a typical query fans out to all
> > shards, so having more shards than CPU cores means less parallelism.
> > 3. How many collections you will have per node.
> >
> > In short, it depends on what you want to achieve, not some limit of Solr
> > per se.
> >
> > Why are you even sharding the node anyway? Why not just run with a single
> > shard per node, and do sharding by having separate nodes, to maximize
> > parallel processing and availability?
> >
> > Also be careful to be clear about using the Solr term "shard" (a slice,
> > across all replica nodes) as distinct from the Elasticsearch term "shard"
> > (a single slice of an index for a single replica, analogous to a Solr
> > "core".)
> >
> >
> > -- Jack Krupansky
> >
> > On Tue, Mar 24, 2015 at 9:02 AM, Ian Rose <ianr...@fullstory.com> wrote:
> >
> > > Hi all -
> > >
> > > I'm sure this topic has been covered before but I was unable to find any
> > > clear references online or in the mailing list.
> > >
> > > Are there any rules of thumb for how many cores (aka shards, since I am
> > > using SolrCloud) is "too many" for one machine?  I realize there is no one
> > > answer (depends on size of the machine, etc.) so I'm just looking for a
> > > rough idea.  Something like the following would be very useful:
> > >
> > > * People commonly run up to X cores/shards on a mid-sized (4 or 8 core)
> > > server without any problems.
> > > * I have never heard of anyone successfully running X cores/shards on a
> > > single machine, even if you throw a lot of hardware at it.
> > >
> > > Thanks!
> > > - Ian
> > >
> >
>
