About a week and a half into simultaneously growing and querying a new Solr index, the index has gotten corrupted, as reflected by the following IOExceptions:
* java.io.IOException: Cannot overwrite: E:\solr-10009\solr\filingcore\data\index\_1kir.tis
* java.io.FileNotFoundException: E:\solr-10009\solr\filingcore\data\index\_1dri.fnm (The system cannot find the file specified)

(More detailed stack traces and stuff below.)

I know that I may be able to repair the index somewhat by truncating it with the Lucene CheckIndex tool. I'm totally mystified by what might have gone wrong, though, or how I could prevent this from happening in the future. Any suggestions?

Here is info on my setup:

***
Solr version:
- based on solr trunk r758161 (whose tests all pass)
- modifications:
  * SOLR-744 (for bigram stuff)
  * LUCENE-1370 (for bigram stuff)
    - this implies a custom Lucene build, but I used as source the same Lucene revision that this solr revision is based off of, i.e. Lucene r752164
  * Qsol query parser patch
  * Tika handler (aka Solr cell)
  * SOLR-236: Field collapsing

Windows Server 2008, 64-bit

C:\Users\myuser>"C:\Program Files\Java\jre6\bin\java.exe" -version
java version "1.6.0_11"
Java(TM) SE Runtime Environment (build 1.6.0_11-b03)
Java HotSpot(TM) 64-Bit Server VM (build 11.0-b16, mixed mode)

7GB allocated to Solr
Maybe 15M records in the index?
***

Below are a few relevant (and potentially relevant) records from my Solr log. I think these include the very first of these IOExceptions that my Solr ran into. There were other SEVERE errors in the log, but they all look minor. First there are some errors where the qsol query parser failed to correctly parse a user query, and then there are some errors where the Tika handler failed to correctly extract text from a PDF file. But in neither case, I think, should that result in corrupting the Lucene index.
***
<record>
  <date>2009-05-01T03:48:26</date>
  <millis>1241174906926</millis>
  <sequence>1459</sequence>
  <logger>org.apache.solr.servlet.SolrDispatchFilter</logger>
  <level>SEVERE</level>
  <class>org.apache.solr.common.SolrException</class>
  <method>log</method>
  <thread>65</thread>
  <message>java.lang.OutOfMemoryError: Java heap space
    at org.apache.lucene.search.FieldCacheImpl$10.createValue(FieldCacheImpl.java:367)
    at org.apache.lucene.search.FieldCacheImpl$Cache.get(FieldCacheImpl.java:71)
    at org.apache.lucene.search.FieldCacheImpl.getStringIndex(FieldCacheImpl.java:359)
    at org.apache.lucene.search.FieldSortedHitQueue.comparatorString(FieldSortedHitQueue.java:433)
    at org.apache.lucene.search.FieldSortedHitQueue$1.createValue(FieldSortedHitQueue.java:210)
    at org.apache.lucene.search.FieldCacheImpl$Cache.get(FieldCacheImpl.java:71)
    at org.apache.lucene.search.FieldSortedHitQueue.getCachedComparator(FieldSortedHitQueue.java:168)
    at org.apache.lucene.search.FieldSortedHitQueue.<init>(FieldSortedHitQueue.java:58)
    at org.apache.solr.search.SolrIndexSearcher.getDocListNC(SolrIndexSearcher.java:997)
    at org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:928)
    at org.apache.solr.search.SolrIndexSearcher.search(SolrIndexSearcher.java:345)
    at org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:171)
    at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:195)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:1333)
    at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:303)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:232)
    at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1089)
    at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:365)
    at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
    at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181)
    at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:712)
    at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:405)
    at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:211)
    at org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114)
    at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:139)
    at org.mortbay.jetty.Server.handle(Server.java:285)
    at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:502)
    at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:821)
    at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:513)
    at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:208)
    at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:378)
  </message>
</record>

// Note: the same error with same stack trace also appears
// at 2009-05-01T03:49:02

<record>
  <date>2009-05-01T08:56:35</date>
  <millis>1241193395244</millis>
  <sequence>1516</sequence>
  <logger>org.apache.solr.core.SolrCore</logger>
  <level>SEVERE</level>
  <class>org.apache.solr.common.SolrException</class>
  <method>log</method>
  <thread>77</thread>
  <message>org.apache.solr.common.SolrException: java.io.IOException: Cannot overwrite: E:\solr-10009\solr\filingcore\data\index\_1kir.tis
    at org.apache.solr.handler.extraction.ExtractingDocumentLoader.load(ExtractingDocumentLoader.java:169)
    at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:54)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:1333)
    at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:303)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:232)
    at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1089)
    at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:365)
    at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
    at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181)
    at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:712)
    at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:405)
    at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:211)
    at org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114)
    at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:139)
    at org.mortbay.jetty.Server.handle(Server.java:285)
    at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:502)
    at org.mortbay.jetty.HttpConnection$RequestHandler.content(HttpConnection.java:835)
    at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:641)
    at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:202)
    at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:378)
    at org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:226)
    at org.mortbay.thread.BoundedThreadPool$PoolThread.run(BoundedThreadPool.java:442)
Caused by: java.io.IOException: Cannot overwrite: E:\solr-10009\solr\filingcore\data\index\_1kir.tis
    at org.apache.lucene.store.FSDirectory.createOutput(FSDirectory.java:511)
    at org.apache.lucene.index.TermInfosWriter.initialize(TermInfosWriter.java:98)
    at org.apache.lucene.index.TermInfosWriter.<init>(TermInfosWriter.java:83)
    at org.apache.lucene.index.FormatPostingsFieldsWriter.<init>(FormatPostingsFieldsWriter.java:41)
    at org.apache.lucene.index.FreqProxTermsWriter.flush(FreqProxTermsWriter.java:96)
    at org.apache.lucene.index.TermsHash.flush(TermsHash.java:145)
    at org.apache.lucene.index.DocInverter.flush(DocInverter.java:76)
    at org.apache.lucene.index.DocFieldConsumers.flush(DocFieldConsumers.java:75)
    at org.apache.lucene.index.DocFieldProcessor.flush(DocFieldProcessor.java:60)
    at org.apache.lucene.index.DocumentsWriter.flush(DocumentsWriter.java:571)
    at org.apache.lucene.index.IndexWriter.doFlush(IndexWriter.java:3798)
    at org.apache.lucene.index.IndexWriter.flush(IndexWriter.java:3708)
    at org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:2233)
    at org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:2187)
    at org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:238)
    at org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:60)
    at org.apache.solr.handler.extraction.ExtractingDocumentLoader.doAdd(ExtractingDocumentLoader.java:90)
    at org.apache.solr.handler.extraction.ExtractingDocumentLoader.addDoc(ExtractingDocumentLoader.java:95)
    at org.apache.solr.handler.extraction.ExtractingDocumentLoader.load(ExtractingDocumentLoader.java:157)
    ... 22 more
  </message>
</record>

<record>
  <date>2009-05-01T08:57:19</date>
  <millis>1241193439908</millis>
  <sequence>1517</sequence>
  <logger>org.apache.solr.update.UpdateHandler</logger>
  <level>SEVERE</level>
  <class>org.apache.solr.update.DirectUpdateHandler2$CommitTracker</class>
  <method>run</method>
  <thread>78</thread>
  <message>auto commit error...</message>
</record>

<record>
  <date>2009-05-01T08:58:45</date>
  <millis>1241193525132</millis>
  <sequence>1518</sequence>
  <logger>org.apache.solr.update.UpdateHandler</logger>
  <level>SEVERE</level>
  <class>org.apache.solr.update.DirectUpdateHandler2$CommitTracker</class>
  <method>run</method>
  <thread>78</thread>
  <message>auto commit error...</message>
</record>

<record>
  <date>2009-05-01T09:03:36</date>
  <millis>1241193816122</millis>
  <sequence>1519</sequence>
  <logger>org.apache.solr.core.SolrCore</logger>
  <level>SEVERE</level>
  <class>org.apache.solr.common.SolrException</class>
  <method>log</method>
  <thread>77</thread>
  <message>org.apache.solr.common.SolrException: java.io.FileNotFoundException: E:\solr-10009\solr\filingcore\data\index\_1dri.fnm (The system cannot find the file specified)
    at org.apache.solr.handler.extraction.ExtractingDocumentLoader.load(ExtractingDocumentLoader.java:169)
    at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:54)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:1333)
    at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:303)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:232)
    at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1089)
    at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:365)
    at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
    at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181)
    at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:712)
    at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:405)
    at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:211)
    at org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114)
    at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:139)
    at org.mortbay.jetty.Server.handle(Server.java:285)
    at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:502)
    at org.mortbay.jetty.HttpConnection$RequestHandler.content(HttpConnection.java:835)
    at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:641)
    at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:202)
    at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:378)
    at org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:226)
    at org.mortbay.thread.BoundedThreadPool$PoolThread.run(BoundedThreadPool.java:442)
Caused by: java.io.FileNotFoundException: E:\solr-10009\solr\filingcore\data\index\_1dri.fnm (The system cannot find the file specified)
    at java.io.RandomAccessFile.open(Native Method)
    at java.io.RandomAccessFile.<init>(Unknown Source)
    at org.apache.lucene.store.FSDirectory$FSIndexInput$Descriptor.<init>(FSDirectory.java:623)
    at org.apache.lucene.store.FSDirectory$FSIndexInput.<init>(FSDirectory.java:653)
    at org.apache.lucene.store.FSDirectory.openInput(FSDirectory.java:559)
    at org.apache.lucene.store.FSDirectory.openInput(FSDirectory.java:553)
    at org.apache.lucene.index.FieldInfos.<init>(FieldInfos.java:58)
    at org.apache.lucene.index.SegmentReader.initialize(SegmentReader.java:503)
    at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:468)
    at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:398)
    at org.apache.lucene.index.DocumentsWriter.applyDeletes(DocumentsWriter.java:912)
    at org.apache.lucene.index.IndexWriter.applyDeletes(IndexWriter.java:4585)
    at org.apache.lucene.index.IndexWriter._mergeInit(IndexWriter.java:4248)
    at org.apache.lucene.index.IndexWriter.mergeInit(IndexWriter.java:4224)
    at org.apache.lucene.index.ConcurrentMergeScheduler.merge(ConcurrentMergeScheduler.java:186)
    at org.apache.lucene.index.IndexWriter.maybeMerge(IndexWriter.java:2617)
    at org.apache.lucene.index.IndexWriter.maybeMerge(IndexWriter.java:2612)
    at org.apache.lucene.index.IndexWriter.maybeMerge(IndexWriter.java:2608)
    at org.apache.lucene.index.IndexWriter.flush(IndexWriter.java:3709)
    at org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:2233)
    at org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:2187)
    at org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:238)
    at org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:60)
    at org.apache.solr.handler.extraction.ExtractingDocumentLoader.doAdd(ExtractingDocumentLoader.java:90)
    at org.apache.solr.handler.extraction.ExtractingDocumentLoader.addDoc(ExtractingDocumentLoader.java:95)
    at org.apache.solr.handler.extraction.ExtractingDocumentLoader.load(ExtractingDocumentLoader.java:157)
    ... 22 more
  </message>
</record>
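
In the records above, the OutOfMemoryError at 03:48:26 comes about five hours before the first "Cannot overwrite" at 08:56:35; whether the two are related is not something the log establishes. To make that ordering easy to recheck against the full log, here is a minimal sketch of a scanner that lists every SEVERE record in file order. It is not part of Solr or Lucene: the class name SevereLogScan and its methods are hypothetical, and it assumes the log uses the java.util.logging XML layout shown above, with stack frames starting on the line after the first line of each message.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for triaging a Solr log; not part of Solr or Lucene.
class SevereLogScan {

    // One <record>...</record> element; DOTALL because messages span several lines.
    private static final Pattern RECORD =
            Pattern.compile("<record>(.*?)</record>", Pattern.DOTALL);

    // Returns the trimmed text of the first <tag>...</tag> in a record, or "" if absent.
    static String field(String record, String tag) {
        Matcher m = Pattern.compile("<" + tag + ">(.*?)</" + tag + ">", Pattern.DOTALL)
                .matcher(record);
        return m.find() ? m.group(1).trim() : "";
    }

    // Returns "date first-line-of-message" for every SEVERE record, in file order.
    static List<String> severeSummaries(String logText) {
        List<String> summaries = new ArrayList<>();
        Matcher m = RECORD.matcher(logText);
        while (m.find()) {
            String record = m.group(1);
            if (!"SEVERE".equals(field(record, "level"))) {
                continue;
            }
            // Assumption: stack frames start on the line after the message header,
            // so the first line alone identifies the error.
            String firstLine = field(record, "message").split("\\R", 2)[0];
            summaries.add(field(record, "date") + " " + firstLine);
        }
        return summaries;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: SevereLogScan <logfile>");
            return;
        }
        String logText =
                new String(Files.readAllBytes(Paths.get(args[0])), StandardCharsets.UTF_8);
        for (String line : severeSummaries(logText)) {
            System.out.println(line);
        }
    }
}
```

It uses a regular expression rather than an XML parser on purpose, so that it does not have to resolve the log's DTD or depend on how characters inside the message are escaped.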