Hello,
We're running Solr 4.2.0 and recently converted to SolrCloud. We've got
16 cores, each with 1 shard, 3 ZooKeeper instances, and 4 replicas of each
core. We're suddenly having trouble with very slow Tomcat restarts
(15-45 minutes), and even when we can get a few replicas up, we aren't
seeing a leader for many of our cores. I tried issuing a reload command
through the core admin, but it fails because there is no leader. Is
there any way to force a leader election? Restarting Tomcat on individual
servers in the cluster doesn't seem to help. We do have some cores that
are serving requests properly and would prefer not to shut down the whole
cluster if possible -- this is a production system.
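One way to list exactly which cores lack a leader is to read the cluster state that SolrCloud keeps in ZooKeeper. The sketch below is only an illustration under stated assumptions: it assumes the state has already been fetched as JSON (in Solr 4.x it lives in the /clusterstate.json znode), that each replica entry carries a "leader" flag with the string value "true" when it is the leader, and the names core_a, core_b and the host names are made up for the sample.

```python
import json

def shards_without_leader(clusterstate):
    """Return sorted (collection, shard) pairs that have no replica flagged as leader.

    Assumes the Solr 4.x clusterstate.json layout:
    collection -> "shards" -> shard -> "replicas" -> replica -> {"leader": "true", ...}
    """
    missing = []
    for collection, cdata in clusterstate.items():
        for shard, sdata in cdata.get("shards", {}).items():
            replicas = sdata.get("replicas", {})
            has_leader = any(
                str(r.get("leader", "")).lower() == "true"
                for r in replicas.values()
            )
            if not has_leader:
                missing.append((collection, shard))
    return sorted(missing)

# Hypothetical sample data; real state would come from ZooKeeper.
SAMPLE = json.loads("""
{
  "core_a": {"shards": {"shard1": {"replicas": {
      "node1": {"state": "active", "base_url": "http://host1:8080/solr", "leader": "true"},
      "node2": {"state": "active", "base_url": "http://host2:8080/solr"}
  }}}},
  "core_b": {"shards": {"shard1": {"replicas": {
      "node1": {"state": "down", "base_url": "http://host1:8080/solr"},
      "node2": {"state": "recovering", "base_url": "http://host2:8080/solr"}
  }}}}
}
""")

if __name__ == "__main__":
    for collection, shard in shards_without_leader(SAMPLE):
        print(collection, shard)
```

Run against the real cluster state, a check like this would at least show whether the leaderless cores are the same ones throwing the error below.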
In addition, some cores are reporting a peculiar error, stack trace
below. The cores that report this problem seem to be completely down
across all replicas.
ERROR org.apache.solr.servlet.SolrDispatchFilter - null:org.apache.solr.common.SolrException: Error opening new searcher
    at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:1415)
    at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:1527)
    at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:1304)
    at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:1239)
    at org.apache.solr.request.SolrQueryRequestBase.getSearcher(SolrQueryRequestBase.java:94)
    at org.apache.solr.servlet.cache.HttpCacheHeaderUtil.calcLastModified(HttpCacheHeaderUtil.java:145)
    at org.apache.solr.servlet.cache.HttpCacheHeaderUtil.doCacheHeaderValidation(HttpCacheHeaderUtil.java:218)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:334)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:141)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:215)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:188)
    at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:213)
    at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:172)
    at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
    at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:117)
    at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:581)
    at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:108)
    at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:174)
    at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:879)
    at org.apache.coyote.http11.Http11BaseProtocol$Http11ConnectionHandler.processConnection(Http11BaseProtocol.java:665)
    at org.apache.tomcat.util.net.PoolTcpEndpoint.processSocket(PoolTcpEndpoint.java:528)
    at org.apache.tomcat.util.net.LeaderFollowerWorkerThread.runIt(LeaderFollowerWorkerThread.java:81)
    at org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run(ThreadPool.java:689)
    at java.lang.Thread.run(Thread.java:662)
Caused by: java.lang.RuntimeException: Already closed
    at org.apache.solr.core.CachingDirectoryFactory.get(CachingDirectoryFactory.java:237)
    at org.apache.solr.core.CachingDirectoryFactory.get(CachingDirectoryFactory.java:222)
    at org.apache.solr.core.SolrCore.getNewIndexDir(SolrCore.java:244)
    at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:1326)
Has anyone seen either of these issues before? I'm having trouble finding
any information on either situation.
Thanks,
-Cat