If you're able to run a patched version of Lucene, can you apply the
attached patch, run it, get the issue to happen again, and post back
the resulting exception?
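(In case it helps: applying it is roughly `patch -p0 < the-attachment.patch` from the root of your Lucene source checkout, then rebuilding the lucene-core jar with ant. A tiny self-contained sketch of the `patch -p0` step — every file name below is a made-up placeholder, not the actual attachment:)

```shell
# Placeholder demo of "patch -p0": build a trivial unified diff and apply it.
workdir=$(mktemp -d); cd "$workdir"
printf 'throw new RuntimeException(msg);\n' > Before.txt
printf 'throw new RuntimeException(msg + diagnostics);\n' > After.txt
diff -u Before.txt After.txt > diag.patch || true   # diff exits 1 when files differ
patch -p0 Before.txt < diag.patch                   # -p0: strip no path components
cat Before.txt   # -> throw new RuntimeException(msg + diagnostics);
```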

It only adds further diagnostics to that RuntimeException you're hitting.

Another thing to try is turning on assertions, which may very well
catch the issue sooner.
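(For reference — and assuming a Tomcat-style container that picks up JAVA_OPTS; adjust for your setup — assertions can be enabled just for the Lucene packages so you don't pay the cost everywhere:)

```shell
# Enable Java assertions for org.apache.lucene and its subpackages only.
# JAVA_OPTS is a Tomcat/catalina.sh convention; other containers differ.
export JAVA_OPTS="$JAVA_OPTS -ea:org.apache.lucene..."
# or globally, at some extra runtime cost:
# export JAVA_OPTS="$JAVA_OPTS -ea"
```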

Mike

On Wed, May 20, 2009 at 11:18 AM, James X
<hello.nigerian.spamm...@gmail.com> wrote:
> Hi Mike, thanks for the quick response:
>
> $ java -version
> java version "1.6.0_11"
> Java(TM) SE Runtime Environment (build 1.6.0_11-b03)
> Java HotSpot(TM) 64-Bit Server VM (build 11.0-b16, mixed mode)
>
> I hadn't noticed the 268m trigger for LUCENE-1521 - I'm definitely not
> hitting that yet!
>
> The exception always reports 0 length, but the number of docs varies,
> heavily weighted towards one or two docs. Of the last 130 or so exceptions:
>     89 1 docs vs 0 length
>     20 2 docs vs 0 length
>      9 3 docs vs 0 length
>      1 4 docs vs 0 length
>      3 5 docs vs 0 length
>      2 6 docs vs 0 length
>      1 7 docs vs 0 length
>      1 9 docs vs 0 length
>      1 10 docs vs 0 length
>
> The only unusual thing I can think of that we're doing with Solr is
> aggressively CREATE-ing and UNLOAD-ing cores. I've not been able to spot a
> pattern between core admin operations and these exceptions, however...
>
> James
>
> On Wed, May 20, 2009 at 2:37 AM, Michael McCandless
> <luc...@mikemccandless.com> wrote:
>
>> Hmm... somehow Lucene is flushing a new segment on closing the
>> IndexWriter, and thinks 1 doc had been added to the stored fields
>> file, yet the fdx file is the wrong size (0 bytes).  This check (and
>> its exception) is designed to prevent corruption from entering the
>> index, so it's at least good to see CheckIndex passes after this.
>>
>> I don't think you're hitting LUCENE-1521: that issue only happens if a
>> single segment has more than ~268 million docs.
>>
>> Which exact JRE version are you using?
>>
>> When you hit this exception, is it always "1 docs vs 0 length in bytes"?
>>
>> Mike
>>
>> On Wed, May 20, 2009 at 3:19 AM, James X
>> <hello.nigerian.spamm...@gmail.com> wrote:
>> > Hello all, I'm running Solr 1.3 in a multi-core environment. There are
>> > up to 2000 active cores in each Solr webapp instance at any given time.
>> >
>> > I've noticed occasional errors such as:
>> > SEVERE: java.lang.RuntimeException: after flush: fdx size mismatch: 1
>> > docs vs 0 length in bytes of _h.fdx
>> >        at org.apache.lucene.index.StoredFieldsWriter.closeDocStore(StoredFieldsWriter.java:94)
>> >        at org.apache.lucene.index.DocFieldConsumers.closeDocStore(DocFieldConsumers.java:83)
>> >        at org.apache.lucene.index.DocFieldProcessor.closeDocStore(DocFieldProcessor.java:47)
>> >        at org.apache.lucene.index.DocumentsWriter.closeDocStore(DocumentsWriter.java:367)
>> >        at org.apache.lucene.index.DocumentsWriter.flush(DocumentsWriter.java:567)
>> >        at org.apache.lucene.index.IndexWriter.doFlush(IndexWriter.java:3540)
>> >        at org.apache.lucene.index.IndexWriter.flush(IndexWriter.java:3450)
>> >        at org.apache.lucene.index.IndexWriter.closeInternal(IndexWriter.java:1638)
>> >        at org.apache.lucene.index.IndexWriter.close(IndexWriter.java:1602)
>> >        at org.apache.lucene.index.IndexWriter.close(IndexWriter.java:1578)
>> >        at org.apache.solr.update.SolrIndexWriter.close(SolrIndexWriter.java:153)
>> >
>> > during commit / optimise operations.
>> >
>> > These errors then cause cascading errors during updates on the offending
>> > cores:
>> > SEVERE: org.apache.lucene.store.LockObtainFailedException: Lock obtain
>> > timed out: SingleInstanceLock: write.lock
>> >        at org.apache.lucene.store.Lock.obtain(Lock.java:85)
>> >        at org.apache.lucene.index.IndexWriter.init(IndexWriter.java:1070)
>> >        at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:924)
>> >        at org.apache.solr.update.SolrIndexWriter.<init>(SolrIndexWriter.java:116)
>> >        at org.apache.solr.update.UpdateHandler.createMainIndexWriter(UpdateHandler.java:122)
>> >
>> > This looks like http://issues.apache.org/jira/browse/LUCENE-1521, but
>> > when I upgraded Lucene to 2.4.1 under Solr 1.3, the issue still remains.
>> >
>> > CheckIndex doesn't find any problems with the index, and problems
>> > disappear after an (inconvenient, for me) restart of Solr.
>> >
>> > Firstly, as the symptoms are so close to those in LUCENE-1521, can I
>> > check that my Lucene upgrade method should work:
>> > - unzip the Solr 1.3 war
>> > - remove the Lucene 2.4dev jars
>> > (lucene-core, lucene-spellchecker, lucene-snowball, lucene-queries,
>> > lucene-memory, lucene-highlighter, lucene-analyzers)
>> > - move in the Lucene 2.4.1 jars
>> > - rezip the directory structure as solr.war.
>> >
>> > I think this has worked, as solr/default/admin/registry.jsp shows:
>> >  <lucene-spec-version>2.4.1</lucene-spec-version>
>> >  <lucene-impl-version>2.4.1 750176 - 2009-03-04 21:56:52</lucene-impl-version>
>> >
>> > Secondly, if this Lucene fix isn't the right solution to this problem,
>> > can anyone suggest an alternative approach? The only problem I've had up
>> > to now was with the number of allowed file handles, which was fixed by
>> > changing limits.conf (RHEL machine).
>> >
>> > Many thanks!
>> > James
>> >
>>
>
