Right, it's the total number of terms across all fields... unfortunately. This class is used to enroll a term into the terms cache that wraps the terms dictionary, so in theory you could also hit this issue during normal searching when a term is looked up once, and then looked up again (the 2nd time will pull from the cache).
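To make the failure mode concrete, here is a minimal Java sketch of the narrowing problem. It is not Lucene's code: the class name, the method names, the interval of 128 and the position of 3,000,000,000 are all hypothetical values picked for illustration. It only shows why casting a long term position to an int before dividing can give a negative array index, while dividing first and casting afterwards does not.

```java
public class OrdNarrowingDemo {
    // Hypothetical index interval; the real value depends on index settings.
    static final int TOTAL_INDEX_INTERVAL = 128;

    // Narrow-first pattern: the long position is truncated to an int,
    // so any position above Integer.MAX_VALUE wraps around.
    public static int indexPosNarrowedFirst(long position) {
        int termOrd = (int) position;
        return termOrd / TOTAL_INDEX_INTERVAL;
    }

    // Widen-first pattern: divide as a long, then cast the much smaller quotient.
    public static int indexPosWidened(long position) {
        return (int) (position / TOTAL_INDEX_INTERVAL);
    }

    public static void main(String[] args) {
        // Hypothetical term ordinal beyond the int range (more than ~2.1 billion terms).
        long position = 3_000_000_000L;
        System.out.println("narrowed first: " + indexPosNarrowedFirst(position));
        System.out.println("widened first:  " + indexPosWidened(position));
    }
}
```

With these assumed values the narrow-first variant returns a negative number, which is the kind of index that would trigger an ArrayIndexOutOfBoundsException when used to address an array; for positions that fit in an int the two variants agree.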
I've mod'd Test2BTerms and am running it now...

Mike

http://blog.mikemccandless.com

On Mon, Apr 11, 2011 at 12:51 PM, Burton-West, Tom <tburt...@umich.edu> wrote:
> Thanks Mike,
>
> At first I thought this couldn't be related to the 2.1 Billion terms issue
> since the only place we have tons of terms is in the OCR field and this is
> not the OCR field. But then I remembered that the total number of terms in
> all fields is what matters. We've had no problems with regular searches
> against the index or with other facet queries. Only with this facet. Is
> TermInfoAndOrd only used for faceting?
>
> I'll go ahead and build the patch and let you know.
>
>
> Tom
>
> p.s. Here is the field definition:
> <field name="topicStr" type="string" indexed="true" stored="false" multiValued="true"/>
> <fieldType name="string" class="solr.StrField" sortMissingLast="true" omitNorms="true"/>
>
>
> -----Original Message-----
> From: Michael McCandless [mailto:luc...@mikemccandless.com]
> Sent: Monday, April 11, 2011 8:40 AM
> To: solr-user@lucene.apache.org
> Cc: Burton-West, Tom
> Subject: Re: ArrayIndexOutOfBoundsException with facet query
>
> Tom,
>
> I think I see where this may be -- it looks like another > 2B terms
> bug in Lucene (we are using an int instead of a long in the
> TermInfoAndOrd class inside TermInfosReader.java), only present in
> 3.1.
>
> I'm also mad that Test2BTerms fails to catch this!! I will go fix
> that test and confirm it sees this bug.
>
> Can you build from source? If so, try this patch:
>
> Index: lucene/src/java/org/apache/lucene/index/TermInfosReader.java
> ===================================================================
> --- lucene/src/java/org/apache/lucene/index/TermInfosReader.java (revision 1089906)
> +++ lucene/src/java/org/apache/lucene/index/TermInfosReader.java (working copy)
> @@ -46,8 +46,8 @@
>
>    // Just adds term's ord to TermInfo
>    private final static class TermInfoAndOrd extends TermInfo {
> -    final int termOrd;
> -    public TermInfoAndOrd(TermInfo ti, int termOrd) {
> +    final long termOrd;
> +    public TermInfoAndOrd(TermInfo ti, long termOrd) {
>        super(ti);
>        this.termOrd = termOrd;
>      }
> @@ -245,7 +245,7 @@
>              // wipe out the cache when they iterate over a large numbers
>              // of terms in order
>              if (tiOrd == null) {
> -              termsCache.put(cacheKey, new TermInfoAndOrd(ti, (int) enumerator.position));
> +              termsCache.put(cacheKey, new TermInfoAndOrd(ti, enumerator.position));
>              } else {
>                assert sameTermInfo(ti, tiOrd, enumerator);
>                assert (int) enumerator.position == tiOrd.termOrd;
> @@ -262,7 +262,7 @@
>      // random-access: must seek
>      final int indexPos;
>      if (tiOrd != null) {
> -      indexPos = tiOrd.termOrd / totalIndexInterval;
> +      indexPos = (int) (tiOrd.termOrd / totalIndexInterval);
>      } else {
>        // Must do binary search:
>        indexPos = getIndexOffset(term);
> @@ -274,7 +274,7 @@
>      if (enumerator.term() != null && term.compareTo(enumerator.term()) == 0) {
>        ti = enumerator.termInfo();
>        if (tiOrd == null) {
> -        termsCache.put(cacheKey, new TermInfoAndOrd(ti, (int) enumerator.position));
> +        termsCache.put(cacheKey, new TermInfoAndOrd(ti, enumerator.position));
>        } else {
>          assert sameTermInfo(ti, tiOrd, enumerator);
>          assert (int) enumerator.position == tiOrd.termOrd;
>
> Mike
>
> http://blog.mikemccandless.com
>
> On Fri, Apr 8, 2011 at 4:53 PM, Burton-West, Tom <tburt...@umich.edu> wrote:
>> The query below results in an array out of bounds exception:
>> select/?q=solr&version=2.2&start=0&rows=0&facet=true&facet.field=topicStr
>>
>> Here is the exception:
>> Exception during facet.field of topicStr:java.lang.ArrayIndexOutOfBoundsException: -1931149
>> at org.apache.lucene.index.TermInfosReader.seekEnum(TermInfosReader.java:201)
>>
>> We are using a dev version of Solr/Lucene:
>>
>> Solr Specification Version: 3.0.0.2010.11.19.16.00.54
>> Solr Implementation Version: 3.1-SNAPSHOT 1036094 - root - 2010-11-19 16:00:54
>> Lucene Specification Version: 3.1-SNAPSHOT
>> Lucene Implementation Version: 3.1-SNAPSHOT 1036094 - 2010-11-19 16:01:10
>>
>> Just before the exception we see this entry in our tomcat logs:
>>
>> Apr 8, 2011 2:01:58 PM org.apache.solr.request.UnInvertedField uninvert
>> INFO: UnInverted multi-valued field {field=topicStr,memSize=7675174,tindexSize=289102,time=2577,phase1=2537,nTerms=498975,bigTerms=0,termInstances=1368694,uses=0}
>> Apr 8, 2011 2:01:58 PM org.apache.solr.core.SolrCore execute
>>
>> Is this a known bug? Can anyone provide a clue as to how we can determine
>> what the problem is?
>>
>> Tom Burton-West
>>
>>
>> Appended Below is the exception stack trace:
>>
>> SEVERE: Exception during facet.field of topicStr:java.lang.ArrayIndexOutOfBoundsException: -1931149
>> at org.apache.lucene.index.TermInfosReader.seekEnum(TermInfosReader.java:201)
>> at org.apache.lucene.index.TermInfosReader.get(TermInfosReader.java:271)
>> at org.apache.lucene.index.TermInfosReader.terms(TermInfosReader.java:338)
>> at org.apache.lucene.index.SegmentReader.terms(SegmentReader.java:928)
>> at org.apache.lucene.index.DirectoryReader$MultiTermEnum.<init>(DirectoryReader.java:1055)
>> at org.apache.lucene.index.DirectoryReader.terms(DirectoryReader.java:659)
>> at org.apache.solr.search.SolrIndexReader.terms(SolrIndexReader.java:302)
>> at org.apache.solr.request.NumberedTermEnum.skipTo(UnInvertedField.java:1018)
>> at org.apache.solr.request.UnInvertedField.getTermText(UnInvertedField.java:838)
>> at org.apache.solr.request.UnInvertedField.getCounts(UnInvertedField.java:617)
>> at org.apache.solr.request.SimpleFacets.getTermCounts(SimpleFacets.java:279)
>> at org.apache.solr.request.SimpleFacets.getFacetFieldCounts(SimpleFacets.java:312)
>> at org.apache.solr.request.SimpleFacets.getFacetCounts(SimpleFacets.java:174)
>> at org.apache.solr.handler.component.FacetComponent.process(FacetComponent.java:72)
>> at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:195)
>> at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
>> at org.apache.solr.core.SolrCore.execute(SolrCore.java:1354)