Thanks Lance and Michael,

We are running Solr 1.3.0.2009.09.03.11.14.39 (complete version info from the
Solr admin panel appended below).

I tried running CheckIndex (with the -ea: switch) on one of the shards.
CheckIndex also produced an ArrayIndexOutOfBoundsException on the larger
segment containing 500K+ documents. (Complete CheckIndex output appended
below)

Is it likely that all 10 shards are corrupted? Is it possible that we have
simply exceeded some Lucene limit?

I'm wondering if we could have exceeded the Lucene limit of 2.1 billion unique
terms mentioned towards the end of the Lucene Index File Formats document. If
the small 731-document segment has nine million unique terms as reported by
CheckIndex, then even though many terms are repeated across documents, it is
conceivable that the 500,000-document segment could have more than 2.1 billion
unique terms.
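For what it's worth, here is the back-of-envelope version of that argument as a
quick script. Only the 731-document figures and the large segment's docCount
come from the CheckIndex output below; the growth exponents are guesses on my
part, not measurements for this collection:

```python
# Back-of-envelope estimate: could the big segment's vocabulary
# exceed the 2^31 (~2.1 billion) term limit?
small_docs, small_terms = 731, 9_504_552   # from segment _29im (CheckIndex)
big_docs = 554_799                         # docCount of segment _29dn
limit = 2**31                              # ~2.1 billion

# Vocabulary usually grows sublinearly with collection size
# (Heaps' law: V ~ k * N**beta). The beta values below are assumed,
# not measured; OCR noise and CommonGrams would push beta higher.
for beta in (0.5, 0.7, 0.9, 1.0):
    k = small_terms / small_docs**beta
    estimate = k * big_docs**beta
    verdict = "OVER" if estimate > limit else "under"
    print(f"beta={beta}: ~{estimate / 1e9:.2f} billion unique terms ({verdict} the limit)")
```

With purely linear scaling (beta=1.0) the estimate lands around 7 billion,
well over the limit, while square-root scaling stays comfortably under it, so
the answer really hinges on how fast the vocabulary grows.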

Do you know if the number of terms reported by CheckIndex is the number of
unique terms?

On the other hand, we previously optimized a 1 million document index down to
1 segment and had no problems. That was with an earlier version of Solr, and
it did not include CommonGrams, which could conceivably increase the number of
terms in the index by a factor of 2 or 3.
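As a toy illustration of why CommonGrams inflates the vocabulary (this is just
a sketch of the idea, not the actual Solr CommonGramsFilter, and the common-word
list here is made up): for each pair of adjacent tokens where either member is
a common word, an extra bigram token is emitted alongside the original unigrams:

```python
# Toy simulation of CommonGrams-style token emission.
COMMON = {"the", "of", "in"}  # illustrative common-words list

def common_grams(tokens):
    out = list(tokens)  # the original unigrams are kept
    for a, b in zip(tokens, tokens[1:]):
        if a in COMMON or b in COMMON:
            out.append(f"{a}_{b}")  # extra bigram token, e.g. "rights_of"
    return out

toks = common_grams("declaration of the rights of man".split())
print(toks)  # 6 unigrams plus 5 bigram tokens
```

Since nearly every bigram involving a stopword is a new unique term, it is easy
to see how this roughly doubles or triples the vocabulary on real text.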


Tom
-----------------------------------------------------------------------------------

        Solr Specification Version: 1.3.0.2009.09.03.11.14.39
        Solr Implementation Version: 1.4-dev 793569 - root - 2009-09-03 11:14:39
        Lucene Specification Version: 2.9-dev
        Lucene Implementation Version: 2.9-dev 779312 - 2009-05-27 17:19:55


[tburt...@slurm-4 ~]$ java -Xmx4096m -Xms4096m -cp /l/local/apache-tomcat-serve/webapps/solr-sdr-search/serve-10/WEB-INF/lib/lucene-core-2.9-dev.jar:/l/local/apache-tomcat-serve/webapps/solr-sdr-search/serve-10/WEB-INF/lib -ea:org.apache.lucene... org.apache.lucene.index.CheckIndex /l/solrs/1/.snapshot/serve-2010-02-07/data/index

Opening index @ /l/solrs/1/.snapshot/serve-2010-02-07/data/index

Segments file=segments_zo numSegments=2 version=FORMAT_DIAGNOSTICS [Lucene 2.9]
  1 of 2: name=_29dn docCount=554799
    compound=false
    hasProx=true
    numFiles=9
    size (MB)=267,131.261
    diagnostics = {optimize=true, mergeFactor=2, os.version=2.6.18-164.6.1.el5, os=Linux, mergeDocStores=true, lucene.version=2.9-dev 779312 - 2009-05-27 17:19:55, source=merge, os.arch=amd64, java.version=1.6.0_16, java.vendor=Sun Microsystems Inc.}
    has deletions [delFileName=_29dn_7.del]
    test: open reader.........OK [184 deleted docs]
    test: fields, norms.......OK [6 fields]
    test: terms, freq, prox...FAILED
    WARNING: fixIndex() would remove reference to this segment; full exception:
java.lang.ArrayIndexOutOfBoundsException: -16777214
        at org.apache.lucene.index.TermInfosReader.get(TermInfosReader.java:246)
        at org.apache.lucene.index.TermInfosReader.get(TermInfosReader.java:218)
        at org.apache.lucene.index.SegmentTermDocs.seek(SegmentTermDocs.java:57)
        at org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:474)
        at org.apache.lucene.index.CheckIndex.main(CheckIndex.java:715)

  2 of 2: name=_29im docCount=731
    compound=false
    hasProx=true
    numFiles=8
    size (MB)=421.261
    diagnostics = {optimize=true, mergeFactor=3, os.version=2.6.18-164.6.1.el5, os=Linux, mergeDocStores=true, lucene.version=2.9-dev 779312 - 2009-05-27 17:19:55, source=merge, os.arch=amd64, java.version=1.6.0_16, java.vendor=Sun Microsystems Inc.}
    no deletions
    test: open reader.........OK
    test: fields, norms.......OK [6 fields]
    test: terms, freq, prox...OK [9504552 terms; 34864047 terms/docs pairs; 144869629 tokens]
    test: stored fields.......OK [3550 total field count; avg 4.856 fields per doc]
    test: term vectors........OK [0 total vector count; avg 0 term/freq vector fields per doc]

WARNING: 1 broken segments (containing 554615 documents) detected
WARNING: would write new segments file, and 554615 documents would be lost, if -fix were specified


[tburt...@slurm-4 ~]$ 


The index is corrupted. In some places an ArrayIndexOutOfBoundsException or
NullPointerException is not wrapped as a CorruptIndexException.

Try running your code with the Lucene assertions on. Add this to the
JVM arguments:  -ea:org.apache.lucene...


-- 
View this message in context: 
http://old.nabble.com/TermInfosReader.get-ArrayIndexOutOfBoundsException-tp27506243p27518800.html
Sent from the Solr - User mailing list archive at Nabble.com.