I can reproduce this very easily, and the data is Medline (so I am sure I can share it if needed, with a quick email to check).
- I am using Fedora:

    % uname -a
    Linux ghetto5.projectlounge.com 2.6.23.1-42.fc8 #1 SMP Tue Oct 30 13:18:33 EDT 2007 x86_64 x86_64 x86_64 GNU/Linux
    % java -version
    java version "1.7.0"
    IcedTea Runtime Environment (build 1.7.0-b21)
    IcedTea 64-Bit Server VM (build 1.7.0-b21, mixed mode)

- single core (I will use shards, but each machine has just one HDD so I didn't see how cores would help - I am new at this)
- next run I will keep the output to check for earlier errors
- very reproducible, and I can share code + data if that will help

On Thu, Aug 14, 2008 at 4:23 PM, Yonik Seeley <[EMAIL PROTECTED]> wrote:
> Yikes... not good. This shouldn't be due to anything you did wrong,
> Ian... it looks like a Lucene bug.
>
> Some questions:
> - what platform are you running on, and what JVM?
> - are you using multicore? (I fixed some index locking bugs recently)
> - are there any exceptions in the log before this?
> - how reproducible is this?
>
> -Yonik
>
> On Thu, Aug 14, 2008 at 2:47 PM, Ian Connor <[EMAIL PROTECTED]> wrote:
>> Hi,
>>
>> I have rebuilt my index a few times (it should get up to about 4
>> million, but around 1 million it starts to fall apart).
>>
>> Exception in thread "Lucene Merge Thread #0"
>> org.apache.lucene.index.MergePolicy$MergeException:
>> java.lang.IndexOutOfBoundsException: Index: 105, Size: 33
>>         at org.apache.lucene.index.ConcurrentMergeScheduler.handleMergeException(ConcurrentMergeScheduler.java:323)
>>         at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:300)
>> Caused by: java.lang.IndexOutOfBoundsException: Index: 105, Size: 33
>>         at java.util.ArrayList.rangeCheck(ArrayList.java:572)
>>         at java.util.ArrayList.get(ArrayList.java:350)
>>         at org.apache.lucene.index.FieldInfos.fieldInfo(FieldInfos.java:260)
>>         at org.apache.lucene.index.FieldsReader.doc(FieldsReader.java:188)
>>         at org.apache.lucene.index.SegmentReader.document(SegmentReader.java:670)
>>         at org.apache.lucene.index.SegmentMerger.mergeFields(SegmentMerger.java:349)
>>         at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:134)
>>         at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:3998)
>>         at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:3650)
>>         at org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:214)
>>         at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:269)
>>
>> When this happens, the disk usage goes right up and the indexing
>> really starts to slow down. I am using a Solr build from about a week
>> ago - so my Lucene is at 2.4 according to the war files.
>>
>> Has anyone seen this error before? Is it possible to tell which array
>> is too large? Would it be an array I am sending in or another internal
>> one?
>>
>> Regards,
>> Ian Connor
>>
>

--
Regards,
Ian Connor
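P.S. One way to see which segment the merge is choking on is Lucene's CheckIndex tool, which walks every segment and reports the stored-fields/FieldInfos state per segment. A rough sketch - the index path and jar name here are assumptions, adjust for your install, and stop Solr first so nothing is writing to the index:

```shell
# Inspect the (stopped) index and report which segment is broken.
# lucene-core-2.4.jar is the Lucene jar unpacked from the Solr war;
# ./data/index is the usual Solr data directory - both are assumptions.
java -cp lucene-core-2.4.jar org.apache.lucene.index.CheckIndex ./data/index

# Only if you can afford to lose the documents in the broken segment,
# CheckIndex can also drop it with -fix:
# java -cp lucene-core-2.4.jar org.apache.lucene.index.CheckIndex ./data/index -fix
```

That at least tells you whether the out-of-range field number (105 vs. a FieldInfos of size 33) is confined to one segment or spread across the index.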