This is spooky!
First off, why are you hitting so much index corruption? Many classes
of failure (unhandled exception exits JVM, JVM killed or SEGVs, OS
crashes, power cord is pulled, etc.) should never result in index
corruption. Other failures (bad RAM, bad hard drives) can easily
cause corruption. So I'd really like to understand what kind of
corruption you're seeing and how/why. Why does Solr need to be
killed, and how do you kill it? When CheckIndex does catch the
failure, what failures is it seeing? Is there any pattern to which
indexes become corrupt?
Hmm -- you seem to be using Lucene 2.3.1, so in fact OS crashes and
power cord pulling could lead to corruption. But JVM crashing or
being killed should not. Upgrading to Solr 1.3 (Lucene 2.4) would be
a good idea, though I'd still like to understand what's causing your
corruption.
Second off, you're right: CheckIndex fails to detect the
docs-out-of-order form of corruption. I will open a Jira issue and fix it.
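To make the missing check concrete, here is a minimal sketch of the invariant involved: within one term's postings, each doc ID must be strictly greater than the previous one, which is what the "(1124 <= 1138 )" in the exception message is complaining about. The class and method names are hypothetical and this is not Lucene code or the eventual CheckIndex fix; it only illustrates the ordering test on a plain array of doc IDs.

```java
// Hypothetical helper, not part of Lucene: it only illustrates the
// ordering invariant that a "docs out of order" error says was violated.
final class PostingsOrderCheck {
    private PostingsOrderCheck() {}

    /**
     * Returns the position of the first doc ID that is not strictly greater
     * than the one before it, or -1 if the whole list is well ordered.
     * Starting lastDoc at -1 means a negative doc ID is rejected as well.
     */
    static int firstOutOfOrder(int[] docIds) {
        int lastDoc = -1; // no document seen yet
        for (int i = 0; i < docIds.length; i++) {
            int doc = docIds[i];
            if (doc <= lastDoc) {
                return i; // same shape as "docs out of order (doc <= lastDoc)"
            }
            lastDoc = doc;
        }
        return -1;
    }
}
```

For example, a list containing 1138 followed by 1124 would be flagged at the position of 1124, while a strictly increasing list returns -1.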
Mike
James Brady wrote:
Hi,
My indices sometimes become corrupted, normally when Solr has to be
KILLed. This is not usually too much of a problem, as Lucene's
CheckIndex tool can normally detect missing / broken segments and
fix them.
However, I now have a few indices throwing errors like this:
INFO: [core4] webapp=/solr path=/update params={} status=0 QTime=2
Exception in thread "Thread-75" org.apache.lucene.index.MergePolicy$MergeException: org.apache.lucene.index.CorruptIndexException: docs out of order (1124 <= 1138 )
    at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:271)
Caused by: org.apache.lucene.index.CorruptIndexException: docs out of order (1124 <= 1138 )
    at org.apache.lucene.index.SegmentMerger.appendPostings(SegmentMerger.java:502)
    at org.apache.lucene.index.SegmentMerger.mergeTermInfo(SegmentMerger.java:456)
    at org.apache.lucene.index.SegmentMerger.mergeTermInfos(SegmentMerger.java:425)
    at org.apache.lucene.index.SegmentMerger.mergeTerms(SegmentMerger.java:389)
    at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:134)
    at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:3109)
    at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:2834)
    at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:240)
and
INFO: [core7] webapp=/solr path=/update params={} status=500 QTime=5457
Feb 22, 2009 12:14:07 PM org.apache.solr.common.SolrException log
SEVERE: org.apache.lucene.index.CorruptIndexException: docs out of order (242 <= 248 )
    at org.apache.lucene.index.SegmentMerger.appendPostings(SegmentMerger.java:502)
    at org.apache.lucene.index.SegmentMerger.mergeTermInfo(SegmentMerger.java:456)
    at org.apache.lucene.index.SegmentMerger.mergeTermInfos(SegmentMerger.java:425)
    at org.apache.lucene.index.SegmentMerger.mergeTerms(SegmentMerger.java:389)
    at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:134)
    at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:3109)
    at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:2834)
    at org.apache.lucene.index.ConcurrentMergeScheduler.merge(ConcurrentMergeScheduler.java:193)
    at org.apache.lucene.index.IndexWriter.maybeMerge(IndexWriter.java:1800)
    at org.apache.lucene.index.IndexWriter.maybeMerge(IndexWriter.java:1795)
    at org.apache.lucene.index.IndexWriter.maybeMerge(IndexWriter.java:1791)
    at org.apache.lucene.index.IndexWriter.flush(IndexWriter.java:2398)
    at org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:1465)
    at org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:1424)
    at org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:278)
CheckIndex reports these cores as being completely healthy, and yet
I can't commit new documents into them.
Rebuilding indices isn't an option for me: is there any other way to
fix this? If not, any ideas on what I can do to prevent it in the future?
Many thanks,
James