Since this looks like more of a Lucene issue, I've replied on [EMAIL PROTECTED]
-Yonik

On Thu, Aug 14, 2008 at 10:18 PM, Ian Connor <[EMAIL PROTECTED]> wrote:
> I seem to be able to reproduce this very easily, and the data is
> medline (so I am sure I can share it if needed, with a quick email to
> check).
>
> - I am using Fedora:
>   %uname -a
>   Linux ghetto5.projectlounge.com 2.6.23.1-42.fc8 #1 SMP Tue Oct 30
>   13:18:33 EDT 2007 x86_64 x86_64 x86_64 GNU/Linux
>   %java -version
>   java version "1.7.0"
>   IcedTea Runtime Environment (build 1.7.0-b21)
>   IcedTea 64-Bit Server VM (build 1.7.0-b21, mixed mode)
> - single core (I will use shards, but each machine has just one HDD,
>   so I didn't see how cores would help - I am new at this)
> - next run I will keep the output to check for earlier errors
> - very reproducible, and I can share code + data if that will help
>
> On Thu, Aug 14, 2008 at 4:23 PM, Yonik Seeley <[EMAIL PROTECTED]> wrote:
>> Yikes... not good. This shouldn't be due to anything you did wrong,
>> Ian... it looks like a Lucene bug.
>>
>> Some questions:
>> - what platform are you running on, and what JVM?
>> - are you using multicore? (I fixed some index locking bugs recently)
>> - are there any exceptions in the log before this?
>> - how reproducible is this?
>>
>> -Yonik
>>
>> On Thu, Aug 14, 2008 at 2:47 PM, Ian Connor <[EMAIL PROTECTED]> wrote:
>>> Hi,
>>>
>>> I have rebuilt my index a few times (it should get up to about 4
>>> million, but around 1 million it starts to fall apart).
>>>
>>> Exception in thread "Lucene Merge Thread #0"
>>> org.apache.lucene.index.MergePolicy$MergeException:
>>> java.lang.IndexOutOfBoundsException: Index: 105, Size: 33
>>>         at org.apache.lucene.index.ConcurrentMergeScheduler.handleMergeException(ConcurrentMergeScheduler.java:323)
>>>         at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:300)
>>> Caused by: java.lang.IndexOutOfBoundsException: Index: 105, Size: 33
>>>         at java.util.ArrayList.rangeCheck(ArrayList.java:572)
>>>         at java.util.ArrayList.get(ArrayList.java:350)
>>>         at org.apache.lucene.index.FieldInfos.fieldInfo(FieldInfos.java:260)
>>>         at org.apache.lucene.index.FieldsReader.doc(FieldsReader.java:188)
>>>         at org.apache.lucene.index.SegmentReader.document(SegmentReader.java:670)
>>>         at org.apache.lucene.index.SegmentMerger.mergeFields(SegmentMerger.java:349)
>>>         at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:134)
>>>         at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:3998)
>>>         at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:3650)
>>>         at org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:214)
>>>         at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:269)
>>>
>>> When this happens, the disk usage goes right up and the indexing
>>> really starts to slow down. I am using a Solr build from about a week
>>> ago, so my Lucene is at 2.4 according to the war files.
>>>
>>> Has anyone seen this error before? Is it possible to tell which array
>>> is too large? Would it be an array I am sending in, or an internal
>>> one?
>>>
>>> Regards,
>>> Ian Connor
>>>
>>
>
>
> --
> Regards,
>
> Ian Connor
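
A note on the "which array" question: the failing ArrayList.get in the
trace sits inside Lucene's own FieldInfos, so it is an internal
per-segment structure, not data the client sends in. One way to capture
more detail when a background merge dies is to override
handleMergeException, the method visible at the top of the trace. A
minimal sketch, assuming direct access to the IndexWriter (Solr's stock
configuration does not expose this) and using a hypothetical class name:

    import org.apache.lucene.index.ConcurrentMergeScheduler;

    // Hypothetical helper: log the full cause chain before the scheduler
    // rethrows the failure as MergePolicy.MergeException.
    public class LoggingMergeScheduler extends ConcurrentMergeScheduler {
        @Override
        protected void handleMergeException(Throwable exc) {
            // Capture the nested IndexOutOfBoundsException and its frames.
            exc.printStackTrace(System.err);
            super.handleMergeException(exc);
        }
    }

Installing it is a one-liner on the writer:
writer.setMergeScheduler(new LoggingMergeScheduler());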
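
Separately, Lucene's bundled CheckIndex tool walks every segment and
reports which ones are unreadable, which helps establish whether the
corruption is confined to a single segment. A sketch of the invocation;
the jar name and index path here are placeholders for your own:

    java -cp lucene-core-2.4.jar org.apache.lucene.index.CheckIndex /path/to/index

The tool prints a per-segment summary and flags any segment whose
stored fields or postings fail to read.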