This is spooky!

First off, why are you hitting so much index corruption? Many classes of failure (unhandled exception exits JVM, JVM killed or SEGVs, OS crashes, power cord is pulled, etc.) should never result in index corruption. Other failures (bad RAM, bad hard drives) can easily cause corruption. So I'd really like to understand what kind of corruption you're seeing and how/why. Why does Solr need to be killed, and how do you kill it? When CheckIndex does catch the failure, what failures is it seeing? Is there any pattern to which indexes become corrupt?

Hmm -- you seem to be using Lucene 2.3.1, so in fact OS crashes and power cord pulling could lead to corruption. But JVM crashing or being killed should not. Upgrading to Solr 1.3 (Lucene 2.4) would be a good idea, though I'd still like to understand what's causing your corruption.

Second off, you're right: CheckIndex fails to detect the docs-out-of-order form of corruption. I will open a Jira issue and fix it.
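
To make the invariant concrete: the doc IDs in a term's postings have to be strictly increasing, and the merge throws as soon as it sees one that is not. Here is a minimal sketch of that kind of check. The class and method names are made up for illustration and this is not Lucene's actual code; only the message format is copied from the exception quoted below.

```java
public class DocsOrderCheck {

    // Returns the position of the first doc ID that is negative or not
    // strictly greater than the one before it, or -1 if the order is fine.
    // (Hypothetical helper, not part of Lucene.)
    static int firstOutOfOrder(int[] docIds) {
        int lastDoc = -1;
        for (int i = 0; i < docIds.length; i++) {
            int doc = docIds[i];
            if (doc < 0 || doc <= lastDoc) {
                return i;
            }
            lastDoc = doc;
        }
        return -1;
    }

    public static void main(String[] args) {
        // Made-up postings list with one doc ID out of order.
        int[] postings = {1100, 1138, 1124, 1200};
        int bad = firstOutOfOrder(postings);
        if (bad >= 0) {
            int previous = bad > 0 ? postings[bad - 1] : -1;
            // Same message shape as the CorruptIndexException in the report.
            System.out.println("docs out of order (" + postings[bad]
                    + " <= " + previous + " )");
        }
    }
}
```

A check along these lines would have to walk the postings of every term in every segment, so it costs more than only verifying that the segment files exist and open.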

Mike

James Brady wrote:

Hi,

My indices sometimes become corrupted, normally when Solr has to be
KILLed. This is not usually much of a problem, as
Lucene's CheckIndex tool can normally detect missing / broken segments and
fix them.

However, I now have a few indices throwing errors like this:

INFO: [core4] webapp=/solr path=/update params={} status=0 QTime=2
Exception in thread "Thread-75" org.apache.lucene.index.MergePolicy$MergeException: org.apache.lucene.index.CorruptIndexException: docs out of order (1124 <= 1138 )
    at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:271)
Caused by: org.apache.lucene.index.CorruptIndexException: docs out of order (1124 <= 1138 )
    at org.apache.lucene.index.SegmentMerger.appendPostings(SegmentMerger.java:502)
    at org.apache.lucene.index.SegmentMerger.mergeTermInfo(SegmentMerger.java:456)
    at org.apache.lucene.index.SegmentMerger.mergeTermInfos(SegmentMerger.java:425)
    at org.apache.lucene.index.SegmentMerger.mergeTerms(SegmentMerger.java:389)
    at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:134)
    at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:3109)
    at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:2834)
    at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:240)

and

INFO: [core7] webapp=/solr path=/update params={} status=500 QTime=5457
Feb 22, 2009 12:14:07 PM org.apache.solr.common.SolrException log
SEVERE: org.apache.lucene.index.CorruptIndexException: docs out of order (242 <= 248 )
    at org.apache.lucene.index.SegmentMerger.appendPostings(SegmentMerger.java:502)
    at org.apache.lucene.index.SegmentMerger.mergeTermInfo(SegmentMerger.java:456)
    at org.apache.lucene.index.SegmentMerger.mergeTermInfos(SegmentMerger.java:425)
    at org.apache.lucene.index.SegmentMerger.mergeTerms(SegmentMerger.java:389)
    at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:134)
    at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:3109)
    at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:2834)
    at org.apache.lucene.index.ConcurrentMergeScheduler.merge(ConcurrentMergeScheduler.java:193)
    at org.apache.lucene.index.IndexWriter.maybeMerge(IndexWriter.java:1800)
    at org.apache.lucene.index.IndexWriter.maybeMerge(IndexWriter.java:1795)
    at org.apache.lucene.index.IndexWriter.maybeMerge(IndexWriter.java:1791)
    at org.apache.lucene.index.IndexWriter.flush(IndexWriter.java:2398)
    at org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:1465)
    at org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:1424)
    at org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:278)


CheckIndex reports these cores as being completely healthy, and yet I can't
commit new documents into them.

Rebuilding indices isn't an option for me: is there any other way to fix
this? If not, any ideas on what I can do to prevent it in the future?

Many thanks,
James
