On Sat, Aug 16, 2008 at 4:33 AM, Grant Ingersoll <[EMAIL PROTECTED]> wrote:
> What version of Java do you have on Linux?

The Java version on *Linux* (where I'm seeing the trouble):

java version "1.6.0"
OpenJDK Runtime Environment (build 1.6.0-b09)
OpenJDK 64-Bit Server VM (build 1.6.0-b09, mixed mode)

I'm pretty sure this is the latest one from the Ubuntu repository.
Maybe I should try the official Sun HotSpot build instead. I'm not
finding any complaints about OpenJDK on the Lucene list, though.

The Java version on *Windows* (where I created the initial compound
format index) is an official Sun build:

java version "1.6.0_06"
Java(TM) SE Runtime Environment (build 1.6.0_06-b02)
Java HotSpot(TM) Client VM (build 10.0-b22, mixed mode, sharing)

> Also, is this easily reproducible? How many threads are you adding
> documents with? What is your Auto Commit setting?

I think it takes 12-24hr to get the index to screw up, so while I did
reproduce it once, I haven't yet tried again. Intuition says that if I
repeat the same procedure the same problem would arise. Of course,
what would be nice is if I could figure out how to reproduce it more
quickly, with a smaller index, and a simpler schema.

I'm adding documents with 5-10 threads. Since I'm using the rich
document update handler
(https://issues.apache.org/jira/browse/SOLR-284), there's going to be
PDF and HTML conversion going on within Solr alongside the normal
analysis and indexing.

Autocommit is:

<autoCommit>
  <maxDocs>100000</maxDocs>
  <maxTime>1800000</maxTime> <!-- 30 min -->
</autoCommit>

> Can you try Lucene's CheckIndex tool on it and report what it says?

Working on that now. It should take some time, though, due to the
index size.

>
> On Aug 15, 2008, at 1:35 PM, Chris Harris wrote:
>
>> I have an index (different from the ones mentioned yesterday) that was
>> working fine with 3M docs or so, but when I added a bunch more docs,
>> bringing it closer to 4M docs, the index seemed to get corrupted.
>> In particular, now when I start Solr up, or when my indexing process
>> tries to add a document, I get a complaint about missing index files.
>>
>> The error on startup looks like this:
>>
>> <record>
>>   <date>2008-08-15T10:18:54</date>
>>   <millis>1218820734592</millis>
>>   <sequence>92</sequence>
>>   <logger>org.apache.solr.core.MultiCore</logger>
>>   <level>SEVERE</level>
>>   <class>org.apache.solr.common.SolrException</class>
>>   <method>log</method>
>>   <thread>10</thread>
>>   <message>java.lang.RuntimeException: java.io.FileNotFoundException: /ssd/solr-9999/solr/exhibitcore/data/index/_p7.fdt (No such file or directory)
>>     at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:733)
>>     at org.apache.solr.core.SolrCore.<init>(SolrCore.java:387)
>>     at org.apache.solr.core.MultiCore.create(MultiCore.java:255)
>>     at org.apache.solr.core.MultiCore.load(MultiCore.java:139)
>>     at org.apache.solr.servlet.SolrDispatchFilter.initMultiCore(SolrDispatchFilter.java:147)
>>     at org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:75)
>>     at org.mortbay.jetty.servlet.FilterHolder.doStart(FilterHolder.java:99)
>>     at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:40)
>>     at org.mortbay.jetty.servlet.ServletHandler.initialize(ServletHandler.java:594)
>>     at org.mortbay.jetty.servlet.Context.startContext(Context.java:139)
>>     at org.mortbay.jetty.webapp.WebAppContext.startContext(WebAppContext.java:1218)
>>     at org.mortbay.jetty.handler.ContextHandler.doStart(ContextHandler.java:500)
>>     at org.mortbay.jetty.webapp.WebAppContext.doStart(WebAppContext.java:448)
>>     at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:40)
>>     at org.mortbay.jetty.handler.HandlerCollection.doStart(HandlerCollection.java:147)
>>     at org.mortbay.jetty.handler.ContextHandlerCollection.doStart(ContextHandlerCollection.java:161)
>>     at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:40)
>>     at
>> org.mortbay.jetty.handler.HandlerCollection.doStart(HandlerCollection.java:147)
>>     at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:40)
>>     at org.mortbay.jetty.handler.HandlerWrapper.doStart(HandlerWrapper.java:117)
>>     at org.mortbay.jetty.Server.doStart(Server.java:210)
>>     at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:40)
>>     at org.mortbay.xml.XmlConfiguration.main(XmlConfiguration.java:929)
>>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>     at java.lang.reflect.Method.invoke(Method.java:616)
>>     at org.mortbay.start.Main.invokeMain(Main.java:183)
>>     at org.mortbay.start.Main.start(Main.java:497)
>>     at org.mortbay.start.Main.main(Main.java:115)
>> Caused by: java.io.FileNotFoundException: /ssd/solr-9999/solr/exhibitcore/data/index/_p7.fdt (No such file or directory)
>>     at java.io.RandomAccessFile.open(Native Method)
>>     at java.io.RandomAccessFile.<init>(RandomAccessFile.java:233)
>>     at org.apache.lucene.store.FSDirectory$FSIndexInput$Descriptor.<init>(FSDirectory.java:506)
>>     at org.apache.lucene.store.FSDirectory$FSIndexInput.<init>(FSDirectory.java:536)
>>     at org.apache.lucene.store.FSDirectory.openInput(FSDirectory.java:445)
>>     at org.apache.lucene.index.FieldsReader.<init>(FieldsReader.java:75)
>>     at org.apache.lucene.index.SegmentReader.initialize(SegmentReader.java:308)
>>     at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:262)
>>     at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:197)
>>     at org.apache.lucene.index.MultiSegmentReader.<init>(MultiSegmentReader.java:55)
>>     at org.apache.lucene.index.DirectoryIndexReader$1.doBody(DirectoryIndexReader.java:75)
>>     at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:636)
>>     at org.apache.lucene.index.DirectoryIndexReader.open(DirectoryIndexReader.java:63)
>>     at org.apache.lucene.index.IndexReader.open(IndexReader.java:209)
>>     at org.apache.lucene.index.IndexReader.open(IndexReader.java:173)
>>     at org.apache.solr.search.SolrIndexSearcher.<init>(SolrIndexSearcher.java:93)
>>     at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:724)
>>     ... 29 more
>>   </message>
>> </record>
>>
>> And the error on doc add looks like this:
>>
>> <record>
>>   <date>2008-08-15T09:51:30</date>
>>   <millis>1218819090142</millis>
>>   <sequence>6571937</sequence>
>>   <logger>org.apache.solr.core.SolrCore</logger>
>>   <level>SEVERE</level>
>>   <class>org.apache.solr.common.SolrException</class>
>>   <method>log</method>
>>   <thread>14</thread>
>>   <message>java.io.FileNotFoundException: /ssd/solr-9999/solr/exhibitcore/data/index/_p7.fdt (No such file or directory)
>>     at java.io.RandomAccessFile.open(Native Method)
>>     at java.io.RandomAccessFile.<init>(RandomAccessFile.java:233)
>>     at org.apache.lucene.store.FSDirectory$FSIndexInput$Descriptor.<init>(FSDirectory.java:506)
>>     at org.apache.lucene.store.FSDirectory$FSIndexInput.<init>(FSDirectory.java:536)
>>     at org.apache.lucene.store.FSDirectory.openInput(FSDirectory.java:445)
>>     at org.apache.lucene.index.FieldsReader.<init>(FieldsReader.java:75)
>>     at org.apache.lucene.index.SegmentReader.initialize(SegmentReader.java:308)
>>     at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:262)
>>     at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:197)
>>     at org.apache.lucene.index.MultiSegmentReader.<init>(MultiSegmentReader.java:55)
>>     at org.apache.lucene.index.DirectoryIndexReader$1.doBody(DirectoryIndexReader.java:75)
>>     at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:636)
>>     at org.apache.lucene.index.DirectoryIndexReader.open(DirectoryIndexReader.java:63)
>>     at
>> org.apache.lucene.index.IndexReader.open(IndexReader.java:209)
>>     at org.apache.lucene.index.IndexReader.open(IndexReader.java:173)
>>     at org.apache.solr.search.SolrIndexSearcher.<init>(SolrIndexSearcher.java:93)
>>     at org.apache.solr.core.SolrCore.newSearcher(SolrCore.java:213)
>>     at org.apache.solr.update.DirectUpdateHandler2.openSearcher(DirectUpdateHandler2.java:207)
>>     at org.apache.solr.update.DirectUpdateHandler2.doDeletions(DirectUpdateHandler2.java:466)
>>     at org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:295)
>>     at org.apache.solr.handler.RichDocumentLoader.doAdd(RichDocumentRequestHandler.java:231)
>>     at org.apache.solr.handler.RichDocumentLoader.addDoc(RichDocumentRequestHandler.java:236)
>>     at org.apache.solr.handler.RichDocumentLoader.load(RichDocumentRequestHandler.java:278)
>>     at org.apache.solr.handler.RichDocumentRequestHandler.handleRequestBody(RichDocumentRequestHandler.java:80)
>>     at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:125)
>>     at org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.handleRequest(RequestHandlers.java:228)
>>     at org.apache.solr.core.SolrCore.execute(SolrCore.java:965)
>>     at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:339)
>>     at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:274)
>>     at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1089)
>>     at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:365)
>>     at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
>>     at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181)
>>     at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:712)
>>     at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:405)
>>     at
>> org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:211)
>>     at org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114)
>>     at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:139)
>>     at org.mortbay.jetty.Server.handle(Server.java:285)
>>     at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:502)
>>     at org.mortbay.jetty.HttpConnection$RequestHandler.content(HttpConnection.java:835)
>>     at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:641)
>>     at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:202)
>>     at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:378)
>>     at org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:226)
>>     at org.mortbay.thread.BoundedThreadPool$PoolThread.run(BoundedThreadPool.java:442)
>>   </message>
>> </record>
>>
>> I just checked, and the files that Solr is complaining about are
>> indeed not in the index directory.
>>
>> The earliest indication of trouble I found in my log was an error like
>> this:
>>
>> <record>
>>   <date>2008-08-15T09:47:48</date>
>>   <millis>1218818868528</millis>
>>   <sequence>6525387</sequence>
>>   <logger>org.apache.solr.update.UpdateHandler</logger>
>>   <level>SEVERE</level>
>>   <class>org.apache.solr.update.DirectUpdateHandler2$CommitTracker</class>
>>   <method>run</method>
>>   <thread>15</thread>
>>   <message>auto commit error...</message>
>> </record>
>>
>> There may have been SEVERE errors before this, but my log doesn't go
>> back to the very beginning.
>>
>> It's interesting that while adding documents seems to be usually
>> failing now (yielding the "file not found" exception), I could add
>> documents successfully for some time before things started to go
>> wrong. What's more, some documents do seem to *still* get added
>> successfully.
>> I'm using the rich document update handler, so the
>> successful log entries look like this:
>>
>> <record>
>>   <date>2008-08-15T09:50:54</date>
>>   <millis>1218819054600</millis>
>>   <sequence>6561534</sequence>
>>   <logger>org.apache.solr.core.SolrCore</logger>
>>   <level>INFO</level>
>>   <class>org.apache.solr.core.SolrCore</class>
>>   <method>execute</method>
>>   <thread>14</thread>
>>   <message>[exhibitcore] webapp=/solr path=/update/rich
>> params={filenumber=333-112076-85&formtype=S-4/A&stream.fieldname=body&exhibittype=EX-3.99&date=2004-02-09T00:00:00Z&companyname=PROGRESSIVE+VENTURE+CAPITAL+CORP&exhibitdescription=EXHIBIT+3.99&id=37684831&cik=1275089&stream.type=html&filingkey=0001193125-04-017196/1275089/FILER&stateofincorporation=WV&fieldnames=key,filingkey,companyname,accessionnumber,cik,date,exhibitdescription,exhibittype,exhibittypeint,filenumber,filename,formtype,stateofheadquarters,stateofincorporation&filename=dex399.htm&exhibittypeint=3&accessionnumber=0001193125-04-017196&stateofheadquarters=~&key=0001193125-04-017196/1275089/FILER/dex399.htm}
>> status=0 QTime=9 </message>
>> </record>
>>
>> The deletes I'm seeing in my log also seem to be working fine; I get
>> log entries like
>>
>> <record>
>>   <date>2008-08-15T09:50:54</date>
>>   <millis>1218819054602</millis>
>>   <sequence>6561535</sequence>
>>   <logger>org.apache.solr.update.processor.UpdateRequestProcessor</logger>
>>   <level>INFO</level>
>>   <class>org.apache.solr.update.processor.LogUpdateProcessor</class>
>>   <method>finish</method>
>>   <thread>14</thread>
>>   <message>{delete=[0001193125-04-017196/1275096/FILER/dex231.htm]} 0 1</message>
>> </record>
>>
>> and
>>
>> <record>
>>   <date>2008-08-15T09:51:30</date>
>>   <millis>1218819090153</millis>
>>   <sequence>6571944</sequence>
>>   <logger>org.apache.solr.update.UpdateHandler</logger>
>>   <level>INFO</level>
>>   <class>org.apache.solr.update.DirectUpdateHandler2</class>
>>   <method>doDeletions</method>
>>   <thread>13</thread>
>>   <message>DirectUpdateHandler2 deleting and removing dups for 100788 ids</message>
>> </record>
>>
>> After I noticed this corruption thing, I thought I'd see if I could
>> get it to happen again, so I went back to the original 3M-ish doc
>> index, and tried adding the new documents again. (If it matters, the
>> new docs would have come into the index in a different permutation on
>> this retry.) This too resulted in an index with "file not found"
>> problems.
>>
>> The following may or may not be relevant: I built the base 3M-ish doc
>> index on a Windows machine, and it's a compound (.cfs) format index.
>> (I actually created it not with Solr, but by using the index merging
>> tool that comes with Lucene in order to merge three different
>> non-compound format indexes that I'd previously made with Solr into a
>> single index.) Before I started adding documents, I moved the index to
>> a Linux machine running a newer version of Solr/Lucene than was on the
>> Windows machine. The stuff described above all happened on Linux.
>>
>> Any thoughts?
>>
>> Thanks a bunch,
>> Chris
>
>
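
For anyone wanting to repeat the "are the files really gone?" check from the quoted message, here is a rough sketch in plain Java (no Lucene or Solr classes). It pulls every path out of the FileNotFoundException messages in a log and reports whether each one exists on disk. The class name MissingIndexFiles, the method reportedPaths, and the log-file argument are made-up names for illustration, and the pattern assumes the message wording seen in the traces above; treat it as a starting point rather than a diagnostic tool.

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Solr or Lucene.
class MissingIndexFiles {
    // Captures the path in "java.io.FileNotFoundException: <path> (No such file or directory)".
    // \s+ tolerates a line wrap between the exception name, the path, and the suffix.
    private static final Pattern MISSING = Pattern.compile(
            "java\\.io\\.FileNotFoundException:\\s+(\\S+)\\s+\\(No such file or\\s+directory\\)");

    /** Returns the distinct paths named in such messages, in order of first appearance. */
    static Set<String> reportedPaths(String logText) {
        Set<String> paths = new LinkedHashSet<>();
        Matcher m = MISSING.matcher(logText);
        while (m.find()) {
            paths.add(m.group(1));
        }
        return paths;
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.err.println("usage: java MissingIndexFiles <logfile>");
            return;
        }
        // Read the whole log; assumes it fits in memory and is UTF-8 encoded.
        String log = new String(Files.readAllBytes(Paths.get(args[0])), StandardCharsets.UTF_8);
        for (String p : reportedPaths(log)) {
            boolean present = Files.exists(Paths.get(p));
            System.out.println(p + (present ? " present" : " MISSING"));
        }
    }
}
```

Run against the Solr log, this would list each file named in an exception once, which makes it easy to see whether only one segment's files are affected or several.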