I hate to blame the JDK, but we tried 1.6 for our production
webapp and it was crashing too often. Unless you need 1.6,
you might try 1.5. --wunder

On 8/16/08 1:54 PM, "Chris Harris" <[EMAIL PROTECTED]> wrote:

> On Sat, Aug 16, 2008 at 4:33 AM, Grant Ingersoll <[EMAIL PROTECTED]> wrote:
>> What version of Java do you have on Linux?
> 
> The Java version on *Linux* (where I'm seeing the trouble):
> 
>     java version "1.6.0"
>     OpenJDK Runtime Environment (build 1.6.0-b09)
>     OpenJDK 64-Bit Server VM (build 1.6.0-b09, mixed mode)
> 
> I'm pretty sure this is the latest one from the Ubuntu repository.
> 
> Maybe I should try the official Sun HotSpot build instead. I'm not
> finding any complaints about OpenJDK on the Lucene list, though.
> 
> The Java version on *Windows* (where I created the initial compound
> format index) is an official Sun build:
> 
>     java version "1.6.0_06"
>     Java(TM) SE Runtime Environment (build 1.6.0_06-b02)
>     Java HotSpot(TM) Client VM (build 10.0-b22, mixed mode, sharing)
> 
>> Also, is this easily reproducible?  How many threads are you adding
>> documents with?  What is your Auto Commit setting?
> 
> I think it takes 12-24hr to get the index to screw up, so while I did
> reproduce it once, I haven't yet tried again. Intuition says that if I
> repeat the same procedure the same problem would arise. Of course,
> what would be nice is if I could figure out how to reproduce it more
> quickly, with a smaller index, and a simpler schema.
> 
> I'm adding documents with 5-10 threads. Since I'm using the rich
> document update handler
> (https://issues.apache.org/jira/browse/SOLR-284), there's going to be
> PDF and HTML conversion going on within Solr alongside the normal
> analysis and indexing.
> 
> Autocommit is:
> 
>     <autoCommit>
>       <maxDocs>100000</maxDocs>
>       <maxTime>1800000</maxTime>  <!-- 30 min -->
>     </autoCommit>
> 
>> Can you try Lucene's CheckIndex tool on it and report what it says?
> 
> Working on that now. It should take some time, though, due to the index size.
> 
>> 
>> On Aug 15, 2008, at 1:35 PM, Chris Harris wrote:
>> 
>>> I have an index (different from the ones mentioned yesterday) that was
>>> working fine with 3M docs or so, but when I added a bunch more docs,
>>> bringing it closer to 4M docs, the index seemed to get corrupted. In
>>> particular, now when I start Solr up, or when my indexing process
>>> tries to add a document, I get a complaint about missing index files.
>>> 
>>> The error on startup looks like this:
>>> 
>>> <record>
>>>  <date>2008-08-15T10:18:54</date>
>>>  <millis>1218820734592</millis>
>>>  <sequence>92</sequence>
>>>  <logger>org.apache.solr.core.MultiCore</logger>
>>>  <level>SEVERE</level>
>>>  <class>org.apache.solr.common.SolrException</class>
>>>  <method>log</method>
>>>  <thread>10</thread>
>>>  <message>java.lang.RuntimeException: java.io.FileNotFoundException: /ssd/solr-9999/solr/exhibitcore/data/index/_p7.fdt (No such file or directory)
>>>        at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:733)
>>>        at org.apache.solr.core.SolrCore.&lt;init&gt;(SolrCore.java:387)
>>>        at org.apache.solr.core.MultiCore.create(MultiCore.java:255)
>>>        at org.apache.solr.core.MultiCore.load(MultiCore.java:139)
>>>        at org.apache.solr.servlet.SolrDispatchFilter.initMultiCore(SolrDispatchFilter.java:147)
>>>        at org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:75)
>>>        at org.mortbay.jetty.servlet.FilterHolder.doStart(FilterHolder.java:99)
>>>        at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:40)
>>>        at org.mortbay.jetty.servlet.ServletHandler.initialize(ServletHandler.java:594)
>>>        at org.mortbay.jetty.servlet.Context.startContext(Context.java:139)
>>>        at org.mortbay.jetty.webapp.WebAppContext.startContext(WebAppContext.java:1218)
>>>        at org.mortbay.jetty.handler.ContextHandler.doStart(ContextHandler.java:500)
>>>        at org.mortbay.jetty.webapp.WebAppContext.doStart(WebAppContext.java:448)
>>>        at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:40)
>>>        at org.mortbay.jetty.handler.HandlerCollection.doStart(HandlerCollection.java:147)
>>>        at org.mortbay.jetty.handler.ContextHandlerCollection.doStart(ContextHandlerCollection.java:161)
>>>        at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:40)
>>>        at org.mortbay.jetty.handler.HandlerCollection.doStart(HandlerCollection.java:147)
>>>        at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:40)
>>>        at org.mortbay.jetty.handler.HandlerWrapper.doStart(HandlerWrapper.java:117)
>>>        at org.mortbay.jetty.Server.doStart(Server.java:210)
>>>        at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:40)
>>>        at org.mortbay.xml.XmlConfiguration.main(XmlConfiguration.java:929)
>>>        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>>        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>>        at java.lang.reflect.Method.invoke(Method.java:616)
>>>        at org.mortbay.start.Main.invokeMain(Main.java:183)
>>>        at org.mortbay.start.Main.start(Main.java:497)
>>>        at org.mortbay.start.Main.main(Main.java:115)
>>> Caused by: java.io.FileNotFoundException: /ssd/solr-9999/solr/exhibitcore/data/index/_p7.fdt (No such file or directory)
>>>        at java.io.RandomAccessFile.open(Native Method)
>>>        at java.io.RandomAccessFile.&lt;init&gt;(RandomAccessFile.java:233)
>>>        at org.apache.lucene.store.FSDirectory$FSIndexInput$Descriptor.&lt;init&gt;(FSDirectory.java:506)
>>>        at org.apache.lucene.store.FSDirectory$FSIndexInput.&lt;init&gt;(FSDirectory.java:536)
>>>        at org.apache.lucene.store.FSDirectory.openInput(FSDirectory.java:445)
>>>        at org.apache.lucene.index.FieldsReader.&lt;init&gt;(FieldsReader.java:75)
>>>        at org.apache.lucene.index.SegmentReader.initialize(SegmentReader.java:308)
>>>        at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:262)
>>>        at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:197)
>>>        at org.apache.lucene.index.MultiSegmentReader.&lt;init&gt;(MultiSegmentReader.java:55)
>>>        at org.apache.lucene.index.DirectoryIndexReader$1.doBody(DirectoryIndexReader.java:75)
>>>        at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:636)
>>>        at org.apache.lucene.index.DirectoryIndexReader.open(DirectoryIndexReader.java:63)
>>>        at org.apache.lucene.index.IndexReader.open(IndexReader.java:209)
>>>        at org.apache.lucene.index.IndexReader.open(IndexReader.java:173)
>>>        at org.apache.solr.search.SolrIndexSearcher.&lt;init&gt;(SolrIndexSearcher.java:93)
>>>        at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:724)
>>>        ... 29 more
>>> </message>
>>> </record>
>>> 
>>> And the error on doc add looks like this:
>>> 
>>> <record>
>>>  <date>2008-08-15T09:51:30</date>
>>>  <millis>1218819090142</millis>
>>>  <sequence>6571937</sequence>
>>>  <logger>org.apache.solr.core.SolrCore</logger>
>>>  <level>SEVERE</level>
>>>  <class>org.apache.solr.common.SolrException</class>
>>>  <method>log</method>
>>>  <thread>14</thread>
>>>  <message>java.io.FileNotFoundException: /ssd/solr-9999/solr/exhibitcore/data/index/_p7.fdt (No such file or directory)
>>>        at java.io.RandomAccessFile.open(Native Method)
>>>        at java.io.RandomAccessFile.&lt;init&gt;(RandomAccessFile.java:233)
>>>        at org.apache.lucene.store.FSDirectory$FSIndexInput$Descriptor.&lt;init&gt;(FSDirectory.java:506)
>>>        at org.apache.lucene.store.FSDirectory$FSIndexInput.&lt;init&gt;(FSDirectory.java:536)
>>>        at org.apache.lucene.store.FSDirectory.openInput(FSDirectory.java:445)
>>>        at org.apache.lucene.index.FieldsReader.&lt;init&gt;(FieldsReader.java:75)
>>>        at org.apache.lucene.index.SegmentReader.initialize(SegmentReader.java:308)
>>>        at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:262)
>>>        at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:197)
>>>        at org.apache.lucene.index.MultiSegmentReader.&lt;init&gt;(MultiSegmentReader.java:55)
>>>        at org.apache.lucene.index.DirectoryIndexReader$1.doBody(DirectoryIndexReader.java:75)
>>>        at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:636)
>>>        at org.apache.lucene.index.DirectoryIndexReader.open(DirectoryIndexReader.java:63)
>>>        at org.apache.lucene.index.IndexReader.open(IndexReader.java:209)
>>>        at org.apache.lucene.index.IndexReader.open(IndexReader.java:173)
>>>        at org.apache.solr.search.SolrIndexSearcher.&lt;init&gt;(SolrIndexSearcher.java:93)
>>>        at org.apache.solr.core.SolrCore.newSearcher(SolrCore.java:213)
>>>        at org.apache.solr.update.DirectUpdateHandler2.openSearcher(DirectUpdateHandler2.java:207)
>>>        at org.apache.solr.update.DirectUpdateHandler2.doDeletions(DirectUpdateHandler2.java:466)
>>>        at org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:295)
>>>        at org.apache.solr.handler.RichDocumentLoader.doAdd(RichDocumentRequestHandler.java:231)
>>>        at org.apache.solr.handler.RichDocumentLoader.addDoc(RichDocumentRequestHandler.java:236)
>>>        at org.apache.solr.handler.RichDocumentLoader.load(RichDocumentRequestHandler.java:278)
>>>        at org.apache.solr.handler.RichDocumentRequestHandler.handleRequestBody(RichDocumentRequestHandler.java:80)
>>>        at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:125)
>>>        at org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.handleRequest(RequestHandlers.java:228)
>>>        at org.apache.solr.core.SolrCore.execute(SolrCore.java:965)
>>>        at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:339)
>>>        at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:274)
>>>        at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1089)
>>>        at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:365)
>>>        at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
>>>        at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181)
>>>        at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:712)
>>>        at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:405)
>>>        at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:211)
>>>        at org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114)
>>>        at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:139)
>>>        at org.mortbay.jetty.Server.handle(Server.java:285)
>>>        at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:502)
>>>        at org.mortbay.jetty.HttpConnection$RequestHandler.content(HttpConnection.java:835)
>>>        at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:641)
>>>        at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:202)
>>>        at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:378)
>>>        at org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:226)
>>>        at org.mortbay.thread.BoundedThreadPool$PoolThread.run(BoundedThreadPool.java:442)
>>> </message>
>>> </record>
>>> 
>>> I just checked, and the files that Solr is complaining about are
>>> indeed not in the index directory.
>>> 
>>> The earliest indication of trouble I found in my log was an error like
>>> this:
>>> 
>>> <record>
>>>  <date>2008-08-15T09:47:48</date>
>>>  <millis>1218818868528</millis>
>>>  <sequence>6525387</sequence>
>>>  <logger>org.apache.solr.update.UpdateHandler</logger>
>>>  <level>SEVERE</level>
>>>  <class>org.apache.solr.update.DirectUpdateHandler2$CommitTracker</class>
>>>  <method>run</method>
>>>  <thread>15</thread>
>>>  <message>auto commit error...</message>
>>> </record>
>>> 
>>> There may have been SEVERE errors before this, but my log doesn't go
>>> back to the very beginning.
>>> 
>>> It's interesting that while adding documents seems to be usually
>>> failing now (yielding the "file not found" exception), I could add
>>> documents successfully for some time before things started to go
>>> wrong. What's more, some documents do seem to *still* get added
>>> successfully. I'm using the rich document update handler, so the
>>> successful log entries look like this:
>>> 
>>> <record>
>>>  <date>2008-08-15T09:50:54</date>
>>>  <millis>1218819054600</millis>
>>>  <sequence>6561534</sequence>
>>>  <logger>org.apache.solr.core.SolrCore</logger>
>>>  <level>INFO</level>
>>>  <class>org.apache.solr.core.SolrCore</class>
>>>  <method>execute</method>
>>>  <thread>14</thread>
>>>  <message>[exhibitcore] webapp=/solr path=/update/rich params={filenumber=333-112076-85&amp;formtype=S-4/A&amp;stream.fieldname=body&amp;exhibittype=EX-3.99&amp;date=2004-02-09T00:00:00Z&amp;companyname=PROGRESSIVE+VENTURE+CAPITAL+CORP&amp;exhibitdescription=EXHIBIT+3.99&amp;id=37684831&amp;cik=1275089&amp;stream.type=html&amp;filingkey=0001193125-04-017196/1275089/FILER&amp;stateofincorporation=WV&amp;fieldnames=key,filingkey,companyname,accessionnumber,cik,date,exhibitdescription,exhibittype,exhibittypeint,filenumber,filename,formtype,stateofheadquarters,stateofincorporation&amp;filename=dex399.htm&amp;exhibittypeint=3&amp;accessionnumber=0001193125-04-017196&amp;stateofheadquarters=~&amp;key=0001193125-04-017196/1275089/FILER/dex399.htm} status=0 QTime=9 </message>
>>> </record>
>>> 
>>> The deletes I'm seeing in my log also seem to be working fine; I get
>>> log entries like
>>> 
>>> <record>
>>>  <date>2008-08-15T09:50:54</date>
>>>  <millis>1218819054602</millis>
>>>  <sequence>6561535</sequence>
>>>  <logger>org.apache.solr.update.processor.UpdateRequestProcessor</logger>
>>>  <level>INFO</level>
>>>  <class>org.apache.solr.update.processor.LogUpdateProcessor</class>
>>>  <method>finish</method>
>>>  <thread>14</thread>
>>>  <message>{delete=[0001193125-04-017196/1275096/FILER/dex231.htm]} 0 1</message>
>>> </record>
>>> 
>>> and
>>> 
>>> <record>
>>>  <date>2008-08-15T09:51:30</date>
>>>  <millis>1218819090153</millis>
>>>  <sequence>6571944</sequence>
>>>  <logger>org.apache.solr.update.UpdateHandler</logger>
>>>  <level>INFO</level>
>>>  <class>org.apache.solr.update.DirectUpdateHandler2</class>
>>>  <method>doDeletions</method>
>>>  <thread>13</thread>
>>>  <message>DirectUpdateHandler2 deleting and removing dups for 100788 ids</message>
>>> </record>
>>> 
>>> After I noticed this corruption thing, I thought I'd see if I could
>>> get it to happen again, so I went back to the original 3M-ish doc
>>> index, and tried adding the new documents again. (If it matters, the
>>> new docs would have come into the index in a different permutation on
>>> this retry.) This too resulted in an index with "file not found"
>>> problems.
>>> 
>>> The following may or may not be relevant: I built the base 3M-ish doc
>>> index on a Windows machine, and it's a compound (.cfs) format index.
>>> (I actually created it not with Solr, but by using the index merging
>>> tool that comes with Lucene in order to merge three different
>>> non-compound format indexes that I'd previously made with Solr into a
>>> single index.) Before I started adding documents, I moved the index to
>>> a Linux machine running a newer version of Solr/Lucene than was on the
>>> Windows machine. The stuff described above all happened on Linux.
>>> 
>>> Any thoughts?
>>> 
>>> Thanks a bunch,
>>> Chris
>> 
>> 
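The manual check described in the original message (confirming that _p7.fdt really is absent from the index directory) could be scripted roughly as below. This is only a sketch, not part of Lucene or Solr: the class name SegmentFileScan and its methods are made up for illustration, and it assumes per-segment files are named _<segment>.<ext>, with .cfs for compound segments and .fdt for stored fields. Lucene may let several segments share one set of stored-field files, so a segment flagged here is a hint worth a closer look, not proof of corruption; CheckIndex remains the authoritative test.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical helper: groups index files by segment name and flags
// segments that have neither a compound (.cfs) nor a stored-fields (.fdt) file.
class SegmentFileScan {

    // "_p7.fdt" -> "_p7", "_p7_1.del" -> "_p7"; returns null for
    // non-segment files such as segments_N or lock files.
    static String segmentOf(String fileName) {
        if (!fileName.startsWith("_") || fileName.indexOf('.') < 0) {
            return null;
        }
        int end = 1;
        while (end < fileName.length()
                && fileName.charAt(end) != '.'
                && fileName.charAt(end) != '_') {
            end++;
        }
        return fileName.substring(0, end);
    }

    // Returns the names of segments with no .cfs and no .fdt file, sorted.
    static List<String> suspectSegments(Path indexDir) throws IOException {
        Map<String, Set<String>> extsBySegment = new TreeMap<>();
        try (DirectoryStream<Path> files = Files.newDirectoryStream(indexDir)) {
            for (Path f : files) {
                String name = f.getFileName().toString();
                String segment = segmentOf(name);
                if (segment == null) {
                    continue;
                }
                String ext = name.substring(name.lastIndexOf('.') + 1);
                extsBySegment.computeIfAbsent(segment, k -> new TreeSet<>()).add(ext);
            }
        }
        List<String> suspects = new ArrayList<>();
        for (Map.Entry<String, Set<String>> e : extsBySegment.entrySet()) {
            Set<String> exts = e.getValue();
            if (!exts.contains("cfs") && !exts.contains("fdt")) {
                suspects.add(e.getKey());
            }
        }
        return suspects;
    }

    public static void main(String[] args) throws IOException {
        // Usage: java SegmentFileScan /path/to/data/index
        for (String segment : suspectSegments(Paths.get(args[0]))) {
            System.out.println(segment + ": no .cfs or .fdt file found");
        }
    }
}
```

Run it against a copy of the index directory while Solr is stopped, so that files are not being created or deleted mid-scan.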
