[ https://issues.apache.org/jira/browse/SOLR-9830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17044048#comment-17044048 ]
Vinh Le commented on SOLR-9830:
-------------------------------

I've seen this error when requesting the /metrics APIs in 7.3 as well; it only disappears after a restart.

> Once IndexWriter is closed due to some RunTimeException like FileSystemException, It never return to normal unless restart the Solr JVM
> ---------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-9830
>                 URL: https://issues.apache.org/jira/browse/SOLR-9830
>             Project: Solr
>          Issue Type: Bug
>          Components: update
>    Affects Versions: 6.2
>         Environment: Red Hat 4.4.7-3, SolrCloud
>            Reporter: Daisy.Yuan
>            Priority: Major
>
> 1. The collection col_test has 9 shards, each with two replicas on different Solr instances.
> 2. While updating documents in the collection with SolrJ, inject a file-handle exhaustion fault into one Solr instance, e.g. solr1.
> 3. The update to col_test_shard3_replica1 (the leader) fails with a FileSystemException, and the IndexWriter is closed.
> 4. After the fault is cleared, col_test_shard3_replica1 (still the leader) can no longer accept document updates, and its numDocs stays lower than that of the standby replica.
> 5. After the Solr instance is restarted, it can update documents again and numDocs is consistent between the two replicas.
> I think that in this case, in SolrCloud mode, the core should recover by itself instead of requiring a restart to restore its update function.
> 2016-12-01 14:13:00,932 | INFO | http-nio-21101-exec-20 | [DWPT][http-nio-21101-exec-20]: now abort | org.apache.solr.update.LoggingInfoStream.message(LoggingInfoStream.java:34)
> 2016-12-01 14:13:00,932 | INFO | http-nio-21101-exec-20 | [DWPT][http-nio-21101-exec-20]: done abort | org.apache.solr.update.LoggingInfoStream.message(LoggingInfoStream.java:34)
> 2016-12-01 14:13:00,932 | INFO | http-nio-21101-exec-20 | [IW][http-nio-21101-exec-20]: hit exception updating document | org.apache.solr.update.LoggingInfoStream.message(LoggingInfoStream.java:34)
> 2016-12-01 14:13:00,933 | INFO | http-nio-21101-exec-20 | [IW][http-nio-21101-exec-20]: hit tragic FileSystemException inside updateDocument | org.apache.solr.update.LoggingInfoStream.message(LoggingInfoStream.java:34)
> 2016-12-01 14:13:00,933 | INFO | http-nio-21101-exec-20 | [IW][http-nio-21101-exec-20]: rollback | org.apache.solr.update.LoggingInfoStream.message(LoggingInfoStream.java:34)
> 2016-12-01 14:13:00,933 | INFO | http-nio-21101-exec-20 | [IW][http-nio-21101-exec-20]: all running merges have aborted | org.apache.solr.update.LoggingInfoStream.message(LoggingInfoStream.java:34)
> 2016-12-01 14:13:00,934 | INFO | http-nio-21101-exec-20 | [IW][http-nio-21101-exec-20]: rollback: done finish merges | org.apache.solr.update.LoggingInfoStream.message(LoggingInfoStream.java:34)
> 2016-12-01 14:13:00,934 | INFO | http-nio-21101-exec-20 | [DW][http-nio-21101-exec-20]: abort | org.apache.solr.update.LoggingInfoStream.message(LoggingInfoStream.java:34)
> 2016-12-01 14:13:00,939 | INFO | commitScheduler-46-thread-1 | [DWPT][commitScheduler-46-thread-1]: flush postings as segment _4h9 numDocs=3798 | org.apache.solr.update.LoggingInfoStream.message(LoggingInfoStream.java:34)
> 2016-12-01 14:13:00,940 | INFO | commitScheduler-46-thread-1 | [DWPT][commitScheduler-46-thread-1]: now abort | org.apache.solr.update.LoggingInfoStream.message(LoggingInfoStream.java:34)
> 2016-12-01 14:13:00,940 | INFO | commitScheduler-46-thread-1 | [DWPT][commitScheduler-46-thread-1]: done abort | org.apache.solr.update.LoggingInfoStream.message(LoggingInfoStream.java:34)
> 2016-12-01 14:13:00,940 | INFO | http-nio-21101-exec-20 | [DW][http-nio-21101-exec-20]: done abort success=true | org.apache.solr.update.LoggingInfoStream.message(LoggingInfoStream.java:34)
> 2016-12-01 14:13:00,940 | INFO | commitScheduler-46-thread-1 | [DW][commitScheduler-46-thread-1]: commitScheduler-46-thread-1 finishFullFlush success=false | org.apache.solr.update.LoggingInfoStream.message(LoggingInfoStream.java:34)
> 2016-12-01 14:13:00,940 | INFO | http-nio-21101-exec-20 | [IW][http-nio-21101-exec-20]: rollback: infos=_4g7(6.2.0):C59169/23684:delGen=4 _4gq(6.2.0):C67474/11636:delGen=1 _4gg(6.2.0):C64067/15664:delGen=2 _4gr(6.2.0):C13131 _4gs(6.2.0):C966 _4gt(6.2.0):C4543 _4gu(6.2.0):C6960 _4gv(6.2.0):C2544 | org.apache.solr.update.LoggingInfoStream.message(LoggingInfoStream.java:34)
> 2016-12-01 14:13:00,940 | INFO | commitScheduler-46-thread-1 | [IW][commitScheduler-46-thread-1]: hit exception during NRT reader | org.apache.solr.update.LoggingInfoStream.message(LoggingInfoStream.java:34)
> 2016-12-01 14:13:00,967 | INFO | http-nio-21101-exec-20 | [col_test_shard3_replica1] webapp=/solr path=/update params={wt=javabin&version=2}{add=[5____5 (1552493084330164224), 24____5 (1552493084330164225), 28____5 (1552493084331212800), 32____5 (1552493084331212801), 44____5 (1552493084331212802), 46____5 (1552493084331212803), 64____5 (1552493084331212804), 94____5 (1552493084331212805), 100____5 (1552493084331212806), 119____5 (1552493084331212807), ... (74 adds)]} 0 43 | org.apache.solr.update.processor.LogUpdateProcessorFactory$LogUpdateProcessor.finish(LogUpdateProcessorFactory.java:187)
>     at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:156)
>     at org.apache.solr.core.SolrCore.execute(SolrCore.java:2143)
>     at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:695)
>     at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:471)
>     at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:450)
>     at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:400)
>     at org.apache.solr.servlet.SolrAuthorizationFilter.doFilter(SolrAuthorizationFilter.java:195)
>     at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241)
>     at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
>     at com.huawei.solr.security.check.SolrParaCheckFilter.doFilter(SolrParaCheckFilter.java:201)
>     at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241)
>     at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
>     at com.huawei.solr.security.audit.AuditFilter.doFilter(AuditFilter.java:145)
>     at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241)
>     at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
>     at com.huawei.solr.security.auth.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:611)
>     at com.huawei.solr.security.auth.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:578)
>     at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241)
>     at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
>     at com.huawei.solr.security.auth.cas.HttpServletRequestWrapperFilterWrapper.doFilter(HttpServletRequestWrapperFilterWrapper.java:37)
>     at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241)
>     at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
>     at com.huawei.solr.security.auth.cas.Cas20ProxyReceivingTicketValidationFilterWrapper.doFilter(Cas20ProxyReceivingTicketValidationFilterWrapper.java:71)
>     at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241)
>     at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
>     at com.huawei.solr.security.auth.cas.Cas20AuthenticationFilterWrapper.doFilter(Cas20AuthenticationFilterWrapper.java:60)
>     at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241)
>     at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
>     at com.huawei.solr.security.auth.cas.LogoutFilter.doFilter(LogoutFilter.java:84)
>     at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241)
>     at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
>     at com.huawei.solr.monitor.MemMonitorFilter.doFilter(MemMonitorFilter.java:81)
>     at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241)
>     at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
>     at com.huawei.solr.security.auth.ServerRealmFilter.doFilter(ServerRealmFilter.java:55)
>     at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241)
>     at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
>     at com.huawei.solr.security.auth.RerouteRequestFilter.doFilter(RerouteRequestFilter.java:58)
>     at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241)
>     at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
>     at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:218)
>     at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:122)
>     at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:505)
>     at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:169)
>     at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:103)
>     at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:116)
>     at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:442)
>     at org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1083)
>     at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:640)
>     at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun(NioEndpoint.java:1756)
>     at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.run(NioEndpoint.java:1715)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>     at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
>     at java.lang.Thread.run(Thread.java:745)
> Caused by: org.apache.lucene.store.AlreadyClosedException: this IndexWriter is closed
>     at org.apache.lucene.index.IndexWriter.ensureOpen(IndexWriter.java:740)
>     at org.apache.lucene.index.IndexWriter.ensureOpen(IndexWriter.java:754)
>     at org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:1558)
>     at org.apache.solr.update.DirectUpdateHandler2.doNormalUpdate(DirectUpdateHandler2.java:279)
>     at org.apache.solr.update.DirectUpdateHandler2.addDoc0(DirectUpdateHandler2.java:211)
>     at org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:166)
>     ... 73 more
> Caused by: java.nio.file.FileSystemException: /srv/BigData/solr/solrserveradmin/col_test_shard3_replica1/data/index/_4ha.fdx: Too many open files in system
>     at sun.nio.fs.UnixException.translateToIOException(UnixException.java:91)
>     at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
>     at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
>     at sun.nio.fs.UnixFileSystemProvider.newByteChannel(UnixFileSystemProvider.java:214)
>     at java.nio.file.spi.FileSystemProvider.newOutputStream(FileSystemProvider.java:434)
>     at java.nio.file.Files.newOutputStream(Files.java:216)
>     at org.apache.lucene.store.FSDirectory$FSIndexOutput.<init>(FSDirectory.java:413)
>     at org.apache.lucene.store.FSDirectory$FSIndexOutput.<init>(FSDirectory.java:409)
>     at org.apache.lucene.store.FSDirectory.createOutput(FSDirectory.java:253)
>     at org.apache.lucene.store.LockValidatingDirectoryWrapper.createOutput(LockValidatingDirectoryWrapper.java:44)
>     at org.apache.lucene.store.TrackingDirectoryWrapper.createOutput(TrackingDirectoryWrapper.java:43)
>     at org.apache.lucene.codecs.compressing.CompressingStoredFieldsWriter.<init>(CompressingStoredFieldsWriter.java:108)
>     at org.apache.lucene.codecs.compressing.CompressingStoredFieldsFormat.fieldsWriter(CompressingStoredFieldsFormat.java:128)
>     at org.apache.lucene.codecs.lucene50.Lucene50StoredFieldsFormat.fieldsWriter(Lucene50StoredFieldsFormat.java:183)
>     at org.apache.lucene.index.DefaultIndexingChain.initStoredFieldsWriter(DefaultIndexingChain.java:83)
>     at org.apache.lucene.index.DefaultIndexingChain.startStoredFields(DefaultIndexingChain.java:331)
>     at org.apache.lucene.index.DefaultIndexingChain.processDocument(DefaultIndexingChain.java:368)
>     at org.apache.lucene.index.DocumentsWriterPerThread.updateDocument(DocumentsWriterPerThread.java:231)
>     at org.apache.lucene.index.DocumentsWriter.updateDocument(DocumentsWriter.java:478)
>     at org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:1562)
>     ... 76 more

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
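
The recovery behaviour requested in the issue description (updates resume once the fault is cleared, without a JVM restart) can be illustrated with a small plain-Java simulation. This is only a sketch under stated assumptions, not Solr's or Lucene's code and not a proposed patch: `FragileWriter`, `RecoveringWriter`, and the `Supplier`-based reopen are hypothetical names invented here. `FragileWriter` mimics a writer that closes itself permanently after a tragic I/O error; `RecoveringWriter` shows one possible policy, opening a fresh writer when the current one is found closed.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Supplier;

/** Illustrative simulation only; none of these classes exist in Solr or Lucene. */
public class TragicRecoverySketch {

    /** Stand-in for a writer that closes itself for good after a tragic I/O error. */
    static class FragileWriter {
        private final List<String> index;   // shared store that outlives the writer
        private final AtomicBoolean fault;  // true while the injected fault is active
        private boolean closed = false;

        FragileWriter(List<String> index, AtomicBoolean fault) {
            this.index = index;
            this.fault = fault;
        }

        void add(String doc) {
            if (closed) {
                throw new IllegalStateException("this writer is closed");
            }
            if (fault.get()) {
                closed = true; // tragic event: this instance never accepts updates again
                throw new UncheckedIOException(new IOException("Too many open files in system"));
            }
            index.add(doc);
        }

        boolean isClosed() {
            return closed;
        }
    }

    /** Hypothetical wrapper that swaps in a fresh writer once the current one is closed. */
    static class RecoveringWriter {
        private final Supplier<FragileWriter> factory;
        private FragileWriter current;
        private int reopenCount = 0;

        RecoveringWriter(Supplier<FragileWriter> factory) {
            this.factory = factory;
            this.current = factory.get();
        }

        void add(String doc) {
            if (current.isClosed()) {
                current = factory.get(); // reopen instead of failing forever
                reopenCount++;
            }
            current.add(doc);
        }

        int reopenCount() {
            return reopenCount;
        }
    }

    public static void main(String[] args) {
        List<String> index = new ArrayList<>();
        AtomicBoolean fault = new AtomicBoolean(false);
        RecoveringWriter writer = new RecoveringWriter(() -> new FragileWriter(index, fault));

        writer.add("doc1");
        fault.set(true);            // inject the fault
        try {
            writer.add("doc2");     // fails and closes the underlying writer
        } catch (UncheckedIOException e) {
            System.out.println("update failed: " + e.getCause().getMessage());
        }
        fault.set(false);           // clear the fault
        writer.add("doc3");         // succeeds because the wrapper opened a new writer

        System.out.println("docs=" + index.size() + " reopens=" + writer.reopenCount());
    }
}
```

A bare `FragileWriter` used the same way would keep throwing after the fault is cleared, which mirrors step 4 of the report. The sketch deliberately ignores what a real reopen would have to handle, such as updates lost in the rollback and resynchronisation with the other replica.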