Yes, that is quite normal for a busy search engine, especially in cloud 
environments. We always start by increasing it to a minimum of 64k when 
provisioning machines.
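
For example, on CentOS 7 this is usually a two-part change - the PAM limit for 
login sessions and, if Solr was installed as a service, the limit on the 
service itself. A rough sketch (assuming the user and service are both called 
solr; adjust names to your setup):

    # per-user limit, picked up by PAM for login shells
    echo "solr  soft  nofile  65536" >> /etc/security/limits.conf
    echo "solr  hard  nofile  65536" >> /etc/security/limits.conf

    # if Solr runs as a systemd service, the unit has its own limit:
    # add "LimitNOFILE=65536" under [Service] in the unit (or a drop-in),
    # then reload systemd and restart Solr
    systemctl daemon-reload
    systemctl restart solr

Note that an already-running Solr process keeps the limit it started with, so 
a restart is needed either way.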
Markus
 
-----Original message-----
> From: Mads Tomasgård Bjørgan <m...@dips.no>
> Sent: Thursday 30th June 2016 13:05
> To: solr-user@lucene.apache.org
> Subject: RE: Solr node crashes while indexing - Too many open files
> 
> That's true, but I was hoping there would be another way to solve this issue, 
> as that approach isn't preferable in our situation.
> 
> Is it normal behavior for Solr to open over 4000 files without closing them 
> properly? Is it, for example, possible to adjust the autoCommit settings in 
> solrconfig.xml to force Solr to close the files?
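> 
> (For context, the kind of autoCommit block in question in solrconfig.xml 
> looks roughly like this; the values below are only illustrative, not what 
> we actually run:)
> 
>     <updateHandler class="solr.DirectUpdateHandler2">
>       <autoCommit>
>         <!-- hard commit: flush pending updates to disk regularly;
>              openSearcher=false keeps these commits cheap -->
>         <maxDocs>10000</maxDocs>
>         <maxTime>15000</maxTime>
>         <openSearcher>false</openSearcher>
>       </autoCommit>
>       <autoSoftCommit>
>         <!-- soft commit: make new documents searchable without a full flush -->
>         <maxTime>30000</maxTime>
>       </autoSoftCommit>
>     </updateHandler>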
> 
> Any help is appreciated :-)
> 
> -----Original Message-----
> From: Markus Jelsma [mailto:markus.jel...@openindex.io] 
> Sent: Thursday 30 June 2016 11:41
> To: solr-user@lucene.apache.org
> Subject: RE: Solr node crashes while indexing - Too many open files
> 
> Mads, some distributions require different steps for increasing 
> max_open_files. Check how it works for CentOS specifically.
> 
> Markus
> 
>  
>  
> -----Original message-----
> > From: Mads Tomasgård Bjørgan <m...@dips.no>
> > Sent: Thursday 30th June 2016 10:52
> > To: solr-user@lucene.apache.org
> > Subject: Solr node crashes while indexing - Too many open files
> > 
> > Hello,
> > We're indexing a large set of files using Solr 6.1.0, running as a SolrCloud 
> > cluster coordinated by ZooKeeper 3.4.8.
> > 
> > We have two clusters (source and target), each running on its own three VMs 
> > (CentOS 7). We first thought the error was caused by CDCR, as we were 
> > indexing a large number of documents that had to be replicated to the target 
> > cluster. However, we got the same error even after turning off CDCR - which 
> > indicates CDCR wasn't the problem after all.
> > 
> > After indexing between 20 000 and 35 000 documents to the source cluster, the 
> > file descriptor count for one of the Solr nodes reaches 4096 and that node 
> > crashes. The count grows roughly linearly over time. The remaining two nodes 
> > in the cluster are not affected at all, and their logs contain nothing 
> > relevant. We found the following errors in the crashing node's log:
> > 
> > 2016-06-30 08:23:12.459 ERROR 
> > (updateExecutor-2-thread-22-processing-https:////10.0.106.168:443//solr//DIPS_shard3_replica1
> >  x:DIPS_shard1_replica1 r:core_node1 n:10.0.106.115:443_solr s:shard1 
> > c:DIPS) [c:DIPS s:shard1 r:core_node1 x:DIPS_shard1_replica1] 
> > o.a.s.u.StreamingSolrClients error
> > java.net.SocketException: Too many open files
> >                 (...)
> > 2016-06-30 08:23:12.460 ERROR 
> > (updateExecutor-2-thread-22-processing-https:////10.0.106.168:443//solr//DIPS_shard3_replica1
> >  x:DIPS_shard1_replica1 r:core_node1 n:10.0.106.115:443_solr s:shard1 
> > c:DIPS) [c:DIPS s:shard1 r:core_node1 x:DIPS_shard1_replica1] 
> > o.a.s.u.StreamingSolrClients error
> > java.net.SocketException: Too many open files
> >                 (...)
> > 2016-06-30 08:23:12.461 ERROR (qtp314337396-18) [c:DIPS s:shard1 
> > r:core_node1 x:DIPS_shard1_replica1] o.a.s.h.RequestHandlerBase 
> > org.apache.solr.update.processor.DistributedUpdateProcessor$DistributedUpdatesAsyncException:
> >  2 Async exceptions during distributed update:
> > Too many open files
> > Too many open files
> >                 (...)
> > 2016-06-30 08:23:12.461 INFO  (qtp314337396-18) [c:DIPS s:shard1 
> > r:core_node1 x:DIPS_shard1_replica1] o.a.s.c.S.Request 
> > [DIPS_shard1_replica1]  webapp=/solr path=/update params={version=2.2} 
> > status=-1 QTime=5
> > 2016-06-30 08:23:12.461 ERROR (qtp314337396-18) [c:DIPS s:shard1 
> > r:core_node1 x:DIPS_shard1_replica1] o.a.s.s.HttpSolrCall 
> > null:org.apache.solr.update.processor.DistributedUpdateProcessor$DistributedUpdatesAsyncException:
> >  2 Async exceptions during distributed update:
> > Too many open files
> > Too many open files
> >                 (....)
> > 
> > 2016-06-30 08:23:12.461 WARN  (qtp314337396-18) [c:DIPS s:shard1 
> > r:core_node1 x:DIPS_shard1_replica1] o.a.s.s.HttpSolrCall invalid return 
> > code: -1
> > 2016-06-30 08:23:38.108 INFO  (qtp314337396-20) [c:DIPS s:shard1 
> > r:core_node1 x:DIPS_shard1_replica1] o.a.s.c.S.Request 
> > [DIPS_shard1_replica1]  webapp=/solr path=/select 
> > params={df=_text_&distrib=false&fl=id&fl=score&shards.purpose=4&start=0&fsv=true&shard.url=https://10.0.106.115:443/solr/DIPS_shard1_replica1/&rows=10&version=2&q=*:*&NOW=1467275018057&isShard=true&wt=javabin&_=1467275017220}
> >  hits=30218 status=0 QTime=1
> > 
> > Running netstat -n -p on the VM that throws the exceptions reveals at least 
> > 1 800 TCP connections waiting to be closed (we did not count exactly how 
> > many - the netstat output filled the entire PuTTY window with some 2 000 
> > lines):
> > tcp6      70      0 10.0.106.115:34531      10.0.106.114:443        
> > CLOSE_WAIT  21658/java
> > We're running the SolrCloud on port 443, and the IPs belong to the VMs. We 
> > also tried raising the ulimit for the machine to 100 000 - without any 
> > result.
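> > 
> > (For reference, the checks look roughly like this - the pgrep pattern is 
> > only a guess at how the Solr JVM shows up in the process list:)
> > 
> >     # count sockets the Solr JVM is holding in CLOSE_WAIT
> >     netstat -n -p | grep java | grep -c CLOSE_WAIT
> > 
> >     # verify the limit the running Solr process actually got; raising
> >     # ulimit in a shell does not change an already-running service
> >     SOLR_PID=$(pgrep -f start.jar | head -1)
> >     grep "open files" /proc/$SOLR_PID/limits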
> > 
> > Greetings,
> > Mads
> > 
> 
