Re: TermInfosReader.get ArrayIndexOutOfBoundsException

2010-02-09 Thread Michael McCandless
On Tue, Feb 9, 2010 at 2:56 PM, Tom Burton-West wrote: > I'm not sure I understand. CheckIndex reported a negative number: > -16777214. Right, we are overflowing the positive ints, which wraps around to the smallest int (-2.1 billion), and then dividing by 128 = ~ -16777214. Lucene has an array

Re: TermInfosReader.get ArrayIndexOutOfBoundsException

2010-02-09 Thread Michael McCandless
I attached a patch to the issue that may fix it. Maybe start by running CheckIndex first? Mike On Tue, Feb 9, 2010 at 2:56 PM, Tom Burton-West wrote: > > Thanks Michael, > > I'm not sure I understand.  CheckIndex reported a negative number: > -16777214. > > But in any case we can certainly try

Re: TermInfosReader.get ArrayIndexOutOfBoundsException

2010-02-09 Thread Tom Burton-West
Thanks Michael, I'm not sure I understand. CheckIndex reported a negative number: -16777214. But in any case we can certainly try running CheckIndex from a patched lucene. We could also run a patched lucene on our dev server. Tom > Yes, the term count reported by CheckIndex is the total

Re: TermInfosReader.get ArrayIndexOutOfBoundsException

2010-02-09 Thread Michael McCandless
I opened a Lucene issue w/ patch to try: https://issues.apache.org/jira/browse/LUCENE-2257 Tom let me know if you're able to test this... thanks! Mike On Tue, Feb 9, 2010 at 2:09 PM, Michael McCandless wrote: > Yes, the term count reported by CheckIndex is the total number of unique > term

Re: TermInfosReader.get ArrayIndexOutOfBoundsException

2010-02-09 Thread Michael McCandless
Yes, the term count reported by CheckIndex is the total number of unique terms. It indeed looks like you are exceeding the unique term count limit -- 16777214 * 128 (= the default term index interval) is 2147483392 which is mighty close to max/min 32 bit int value. This makes sense, because Check
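The arithmetic in this message can be checked with a short, self-contained Java sketch. It is not Lucene code: the class and method names are hypothetical, and the only values taken from the thread are the index-entry count (16777214) and the default term index interval (128). It shows how close the product is to the 32-bit limit, and what a 32-bit counter looks like once it wraps and is divided by the interval.

```java
public class TermIndexOverflow {
    // Default term index interval quoted in the thread.
    static final int TERM_INDEX_INTERVAL = 128;

    // Number of terms covered by the given count of index entries,
    // computed in 64-bit arithmetic so the product itself cannot overflow.
    static long termsCovered(int indexEntries) {
        return (long) indexEntries * TERM_INDEX_INTERVAL;
    }

    // Simulates a 32-bit counter incremented one step past Integer.MAX_VALUE.
    static int wrappedCounter() {
        int count = Integer.MAX_VALUE;
        count++; // silently wraps to Integer.MIN_VALUE
        return count;
    }

    public static void main(String[] args) {
        long covered = termsCovered(16777214);
        System.out.println(covered);                       // 2147483392
        System.out.println(Integer.MAX_VALUE - covered);   // 255
        int wrapped = wrappedCounter();
        System.out.println(wrapped);                       // -2147483648
        System.out.println(wrapped / TERM_INDEX_INTERVAL); // -16777216
    }
}
```

The last value is within a couple of units of the -16777214 that CheckIndex reported, which is consistent with a wrapped 32-bit term count being divided by the index interval.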

Re: TermInfosReader.get ArrayIndexOutOfBoundsException

2010-02-09 Thread Tom Burton-West
Thanks Lance and Michael, We are running Solr 1.3.0.2009.09.03.11.14.39 (Complete version info from Solr admin panel appended below) I tried running CheckIndex (with the -ea: switch) on one of the shards. CheckIndex also produced an ArrayIndexOutOfBoundsException on the larger segment contai

Re: TermInfosReader.get ArrayIndexOutOfBoundsException

2010-02-09 Thread Michael McCandless
Which version of Solr/Lucene are you using? Can you run Lucene's CheckIndex tool (java -ea:org.apache.lucene... org.apache.lucene.index.CheckIndex /path/to/index) and then post the output? Have you altered any of IndexWriter's defaults (via solrconfig.xml)? Eg the termIndexInterval? Mike On Mon, F

Re: TermInfosReader.get ArrayIndexOutOfBoundsException

2010-02-08 Thread Lance Norskog
The index is corrupted. In some places ArrayIndex and NPE are not wrapped as CorruptIndexException. Try running your code with the Lucene assertions on. Add this to the JVM arguments: -ea:org.apache.lucene... On Mon, Feb 8, 2010 at 1:02 PM, Burton-West, Tom wrote: > Hello all, > > After optimiz