Does it need to be SolrCloud? If it's just for replication, maybe the data can simply be double-indexed from the client, or you could use old-style (master/slave) replication, and then use LotsOfCores autoloading.
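The "double indexed from the client" idea above could be sketched like this. This is a hypothetical helper, not a Solr API: the `double_index` function, the endpoint URLs, and the injectable `post` callable are all illustrative assumptions about how a client might send the same document to two standalone Solr instances.

```python
import json

def double_index(doc, endpoints, post):
    """Index the same document into every listed Solr endpoint, so two
    standalone (non-cloud) instances stay in sync from the client side.

    `post` is a callable (url, json_payload) -> status code, so the HTTP
    transport can be swapped out; in real use it would wrap urllib or
    requests. The update URL shape follows Solr's JSON update handler,
    but everything here is a sketch, not a tested integration.
    """
    payload = json.dumps([doc])  # Solr's JSON update format takes a list of docs
    statuses = []
    for base in endpoints:
        url = f"{base}/update?commit=true"
        statuses.append(post(url, payload))
    return statuses
```

In a test or dry run, `post` can be a stub that records the calls instead of talking to a live server; the obvious caveat of this approach is that the client becomes responsible for retrying an index that was down, which SolrCloud would otherwise handle.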
Regards,
   Alex

On Wed, Jun 27, 2018, 8:46 AM Shawn Heisey, <elyog...@elyograg.org> wrote:

> On 6/27/2018 5:10 AM, Sharif Shahrair wrote:
> > Now the problem is, when we create about 1400 collections (all of them
> > empty, i.e. no document has been added yet), the Solr service goes down
> > with an out-of-memory exception. We have a few questions here:
> >
> > 1. When we are creating collections, each collection takes about 8 MB
> > to 12 MB of memory when there are no documents yet. Is there any way to
> > configure SolrCloud so that it uses less memory per collection
> > initially (like 1 MB each)? Then we would be able to create 1500
> > collections using about 3 GB of the machine's RAM.
>
> Solr doesn't dictate how much memory it allocates for a collection. It
> allocates what it needs, and if the heap size is too small for that,
> then you get OOME.
>
> You're going to need a lot more than two Solr servers to handle that
> many collections, and they're going to need more than 12GB of memory.
> You should already have at least three servers in your setup, because
> ZooKeeper requires three servers for redundancy.
>
> http://zookeeper.apache.org/doc/r3.4.12/zookeeperAdmin.html#sc_zkMulitServerSetup
>
> Handling a large number of collections is one area where SolrCloud needs
> improvement. Work is constantly happening towards this goal, but it's a
> very complex piece of software, so making design changes is not trivial.
>
> > 2. Is there any way to clear/flush the cache of SolrCloud, especially
> > for collections we haven't accessed for a while? (Maybe we could take
> > those inactive collections out of memory and load them back when they
> > are needed again.)
>
> Unfortunately the functionality that allows index cores to be unloaded
> (which we have colloquially called "LotsOfCores") does not work when
> Solr is running in SolrCloud mode. SolrCloud functionality would break
> if its cores got unloaded.
> It would take a fair amount of development effort to allow the two
> features to work together.
>
> > 3. Is there any way to collect garbage memory in SolrCloud (for
> > example, memory left over from deleting documents and collections)?
>
> Java handles garbage collection automatically. It's possible to
> explicitly ask the system to collect garbage, but any good programming
> guide for Java will recommend that programmers should NOT explicitly
> trigger GC. While it might be possible for Solr's memory usage to
> become more efficient through development effort, it's already pretty
> good. To our knowledge, Solr does not currently have any memory leak
> bugs, and if any are found, they are taken seriously and fixed as fast
> as we can fix them.
>
> > Our target is, without increasing the hardware resources, to create the
> > maximum number of collections while keeping the highly accessed
> > collections and documents in memory. We'll appreciate your help.
>
> That goal will require a fair amount of hardware. You may have no
> choice but to increase your hardware resources.
>
> Thanks,
> Shawn
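Shawn's point about needing more memory follows directly from the numbers in the question. A quick back-of-the-envelope calculation, using the 8-12 MB per empty collection that Sharif observed (the per-collection figures are taken from the question, not measured here):

```python
# Rough heap estimate for many empty collections, using the 8-12 MB
# per-collection overhead reported in the original question.
def heap_needed_gb(num_collections, per_collection_mb):
    """Total heap in GB if every collection costs per_collection_mb."""
    return num_collections * per_collection_mb / 1024

low = heap_needed_gb(1400, 8)    # observed lower bound per collection
high = heap_needed_gb(1400, 12)  # observed upper bound per collection
hoped = heap_needed_gb(1400, 1)  # the 1 MB-per-collection wish

print(f"1400 empty collections: roughly {low:.1f}-{high:.1f} GB of heap")
print(f"at the hoped-for 1 MB each: about {hoped:.1f} GB")
```

At the observed overhead, 1400 empty collections alone need on the order of 11-16 GB of heap, which is why the 12 GB machines hit OOME before any documents were indexed, and why the 3 GB target is only reachable if per-collection overhead were an order of magnitude smaller.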