Well shoot, upon further examination it appears that this time around there weren't actually any segment deletion problems. The "only" problem was a "doc counts differ" error. Interestingly, the count is only off by one.
>From the CheckIndex tool: **** Opening index @ /ssd/solr-9999/solr/exhibitcore/data/index Segments file=segments_qc3 numSegments=31 version=FORMAT_SHARED_DOC_STORE [Lucene 2.3] [...] 4 of 31: name=_3r docCount=23673 compound=false numFiles=8 size (MB)=203.217 no deletions test: open reader.........FAILED WARNING: would remove reference to this segment (-fix was not specified); full exception: org.apache.lucene.index.CorruptIndexException: doc counts differ for segment _3r: fieldsReader shows 23672 but segmentInfo shows 23673 at org.apache.lucene.index.SegmentReader.initialize(SegmentReader.java:315) at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:264) at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:199) at org.apache.lucene.index.CheckIndex.check(CheckIndex.java:178) at org.apache.lucene.index.CheckIndex.main(CheckIndex.java:433) [...] WARNING: 1 broken segments detected WARNING: 23673 documents would be lost if -fix were specified *** I don't know if they would still help in light of this, but I've got 12MB of infoStream data from this round here: http://www.knowledgemosaic.com/public/infoStream.tar.gz As for the normal Solr log (I now have a log for the whole round), the first relevant SEVERE error was this: <record> <date>2008-08-20T16:42:06</date> <millis>1219275726747</millis> <sequence>587322</sequence> <logger>org.apache.solr.core.SolrCore</logger> <level>SEVERE</level> <class>org.apache.solr.common.SolrException</class> <method>log</method> <thread>13</thread> <message>java.lang.NullPointerException </message> </record> Curiously there is no stack trace. The next relevant error: <record> <date>2008-08-20T16:42:52</date> <millis>1219275772073</millis> <sequence>590946</sequence> <logger>org.apache.solr.update.UpdateHandler</logger> <level>SEVERE</level> <class>org.apache.solr.update.DirectUpdateHandler2$CommitTracker</class> <method>run</method> <thread>16</thread> <message>auto commit error...</message> </record> Then there's another NullPointerException, now from a different logger <record> <date>2008-08-20T16:53:46</date> <millis>1219276426304</millis> <sequence>640371</sequence> <logger>org.apache.solr.servlet.SolrDispatchFilter</logger> <level>SEVERE</level> <class>org.apache.solr.common.SolrException</class> <method>log</method> <thread>17</thread> <message>java.lang.NullPointerException</message> </record> And later on the "corrupt index" fun begins: <record> <date>2008-08-20T17:02:18</date> <millis>1219276938663</millis> <sequence>731205</sequence> <logger>org.apache.solr.core.SolrCore</logger> <level>SEVERE</level> <class>org.apache.solr.common.SolrException</class> <method>log</method> <thread>13</thread> <message>org.apache.lucene.index.CorruptIndexException: doc counts differ for segment _3r: fieldsReader shows 23672 but segmentInfo shows 23673 at org.apache.lucene.index.SegmentReader.initialize(SegmentReader.java:313) at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:262) at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:197) at org.apache.lucene.index.MultiSegmentReader.<init>(MultiSegmentReader.java:55) at org.apache.lucene.index.DirectoryIndexReader$1.doBody(DirectoryIndexReader.java:75) at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:636) at org.apache.lucene.index.DirectoryIndexReader.open(DirectoryIndexReader.java:63) at org.apache.lucene.index.IndexReader.open(IndexReader.java:209) at org.apache.lucene.index.IndexReader.open(IndexReader.java:173) at org.apache.solr.search.SolrIndexSearcher.<init>(SolrIndexSearcher.java:93) at org.apache.solr.core.SolrCore.newSearcher(SolrCore.java:213) at org.apache.solr.update.DirectUpdateHandler2.openSearcher(DirectUpdateHandler2.java:207) at org.apache.solr.update.DirectUpdateHandler2.doDeletions(DirectUpdateHandler2.java:466) at org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:295) at org.apache.solr.handler.RichDocumentLoader.doAdd(RichDocumentRequestHandler.java:231) at org.apache.solr.handler.RichDocumentLoader.addDoc(RichDocumentRequestHandler.java:236) at org.apache.solr.handler.RichDocumentLoader.load(RichDocumentRequestHandler.java:278) at org.apache.solr.handler.RichDocumentRequestHandler.handleRequestBody(RichDocumentRequestHandler.java:80) at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:125) at org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.handleRequest(RequestHandlers.java:228) at org.apache.solr.core.SolrCore.execute(SolrCore.java:965) at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:339) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:274) at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1089) at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:365) at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216) at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181) at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:712) at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:405) at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:211) at org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114) at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:139) at org.mortbay.jetty.Server.handle(Server.java:285) at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:502) at org.mortbay.jetty.HttpConnection$RequestHandler.content(HttpConnection.java:835) at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:641) at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:202) at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:378) at org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:226) at org.mortbay.thread.BoundedThreadPool$PoolThread.run(BoundedThreadPool.java:442) </message> </record> I should probably reiterate that this round was not with the latest Solr or the latest Lucene. Testing that is in the works, but I kind of wanted to figure out why this older stuff was failing too. Chris On Wed, Aug 20, 2008 at 6:03 PM, Michael McCandless <[EMAIL PROTECTED]> wrote: > > Did the same FileNotFoundException / massive deletion of files occur? > > Actually if you could zip them all up and post them, I'll dig through them > to see if they give any clues... > > Mike > > Chris Harris wrote: > >> Ok, I did what you suggested, giving each SolrIndexWriter its own >> "infoStream" log file, created in the init() method. The thing is, I >> now have like 3400 infostream log files, I guess reflecting how solr >> created like 3400 SolrIndexWriters over the course of the run. >> (Hopefully this is plausible.) Could you explain what I should be >> looking for in these files? (Posting the whole bunch of it doesn't >> sound very useful.) >> >> Thanks, >> Chris >> >> On Mon, Aug 18, 2008 at 10:12 AM, Michael McCandless >> <[EMAIL PROTECTED]> wrote: >>> >>> Alas, I think this won't actually turn on IndexWriter's infoStream. >>> >>> I think you may need to modify the SolrIndexWriter.java sources, in the >>> init >>> method, to add a call to setInfoStream(...). >>> >>> Can any Solr developers confirm this? >>> >>> Mike >>> >>> Chris Harris wrote: >>> >>>> I'm assuming that one way to do this would be to set the logging level >>>> to "FINEST" in the "logging" page in the solr admin tool, and then to >>>> make sure my logging.properties file is also set to record the FINEST >>>> logging level. Let me know if that won't enable to sort of debugging >>>> info you are talking about. (I do understand that the logging page in >>>> the admin tool makes temporary changes that will get reverted when you >>>> restart Solr.) >>>> >>>> On Mon, Aug 18, 2008 at 3:05 AM, Michael McCandless >>>> <[EMAIL PROTECTED]> wrote: >>>>> >>>>> Since it seems reproducible, could you turn on debugging output >>>>> (IndexWriter.setInfoStream(...)), get the FileNotFoundException to >>>>> happen >>>>> again, and post the resulting output? >>>>> >>>>> Mike >>> >>> > >