Hm, temporarily using even more threads will be hard for us. We have already
reduced the thread stack size to -Xss256k. Wouldn't it be better to use
Callable and Executor, as proposed in:

http://stackoverflow.com/questions/16789288/java-lang-outofmemoryerror-unable-to-create-new-native-thread

and limit the number of threads to the number of CPUs, or two or three times
that? Native threads are restricted by the total virtual memory of the system
(at least on Linux, as far as I know). So the 10,000 threads we use are already
close to the limit of our hardware.
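
A minimal sketch of the bounded-pool idea, assuming plain java.util.concurrent
and nothing Solr-specific. The class name BoundedPoolSketch, the method
runTasks and the placeholder per-core task are hypothetical and only serve as
illustration; this is not how Solr registers cores today.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical illustration: many small tasks share a fixed number of threads
// instead of each task owning its own native thread.
class BoundedPoolSketch {

    // Runs taskCount placeholder tasks on at most poolSize threads and
    // returns how many of them completed.
    static int runTasks(int taskCount, int poolSize) {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        try {
            List<Callable<Integer>> tasks = new ArrayList<>();
            for (int i = 0; i < taskCount; i++) {
                // Placeholder for per-core work (e.g. registration); assumption only.
                tasks.add(() -> 1);
            }
            int completed = 0;
            // invokeAll blocks until every task has finished.
            for (Future<Integer> result : pool.invokeAll(tasks)) {
                completed += result.get();
            }
            return completed;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for tasks", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a task failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // Pool size of twice the CPU count, as suggested above.
        int poolSize = 2 * Runtime.getRuntime().availableProcessors();
        int completed = runTasks(25_000, poolSize);
        System.out.println(completed + " tasks finished on " + poolSize + " threads");
    }
}
```

With this shape the number of native threads stays at poolSize no matter how
many cores there are; the price is that tasks queue up instead of running all
at once.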

Christoph

-----Original Message-----
From: Ramkumar R. Aiyengar [mailto:andyetitmo...@gmail.com]
Sent: Sunday, 31 August 2014 17:53
To: solr-user@lucene.apache.org
Subject: Re: Scaling to large Number of Collections

On 31 Aug 2014 13:24, "Mark Miller" <markrmil...@gmail.com> wrote:
>
>
> > On Aug 31, 2014, at 4:04 AM, Christoph Schmidt <christoph.schm...@moresophy.de> wrote:
> >
> > We see at least two problems when scaling to a large number of
> > collections. I would like to ask the community whether they are known
> > and maybe already addressed in development.
> > We have a SolrCloud running with the following numbers:
> > -          5 servers (each 24 CPUs, 128 RAM)
> > -          13,000 collections with 25,000 SolrCores in the cloud
> > The cloud is working fine, but we see two problems if we want to scale
> > further:
> > 1.       Resource consumption of native system threads
> > We see that each collection opens at least two threads: one for
> > ZooKeeper (coreZkRegister-1-thread-5154) and one for the searcher
> > (searcherExecutor-28357-thread-1).
> > We will run into "OutOfMemoryError: unable to create new native thread".
> > Maybe the architecture could be changed here to use thread pools?
> > 2.       The shutdown and the startup of one server in the SolrCloud
> > takes 2 hours, so a rolling restart takes about 10 hours. To me the
> > problem seems to be that leader election is "linear": the Overseer
> > handles one core after another. The organisation of the cloud is not
> > done in parallel or distributed. Is this already addressed by
> > https://issues.apache.org/jira/browse/SOLR-5473, or is more needed?
>
> 2. No, but it should have been fixed by another issue that will be in
> 4.10.

Note, however, that this fix will temporarily use even more threads, as all
leadership elections will happen in parallel, so you might still run into
this out-of-threads issue again.

Quite possibly the out-of-threads issue is just some system soft limit kicking
in. Linux certainly has a limit you can configure through sysctl; your OS,
whatever that might be, probably has something similar. It may be worth
exploring whether you can bump that up.
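
As a rough way to see where such limits currently stand, the sketch below
reads a few Linux kernel settings from /proc/sys. It assumes a Linux host;
kernel.threads-max, kernel.pid_max and vm.max_map_count are the sysctl keys
that commonly bound thread creation, and which one actually bites on a given
machine is not something this thread establishes. The class name
ThreadLimitProbe and the method readLimit are hypothetical. Per-user limits
(ulimit -u) are not covered here.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.OptionalLong;

// Hypothetical helper: prints kernel limits that can cap native threads.
class ThreadLimitProbe {

    // Reads a single numeric value from the given file; returns empty if the
    // file is missing, unreadable, or not a plain number (e.g. on non-Linux).
    static OptionalLong readLimit(String file) {
        try {
            byte[] raw = Files.readAllBytes(Paths.get(file));
            String text = new String(raw, StandardCharsets.US_ASCII).trim();
            return OptionalLong.of(Long.parseLong(text));
        } catch (IOException | NumberFormatException e) {
            return OptionalLong.empty();
        }
    }

    public static void main(String[] args) {
        String[] files = {
            "/proc/sys/kernel/threads-max", // system-wide thread limit
            "/proc/sys/kernel/pid_max",     // every thread consumes one id
            "/proc/sys/vm/max_map_count"    // memory mappings per process
        };
        for (String file : files) {
            OptionalLong value = readLimit(file);
            System.out.println(file + " = "
                + (value.isPresent() ? String.valueOf(value.getAsLong()) : "n/a"));
        }
    }
}
```

If one of these values turns out to be close to the thread count you see,
raising it with sysctl would be the first thing to try.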

>
> - Mark
> http://about.me/markrmiller
