Right, it's the total number of terms across all fields... unfortunately.

This class is used to enroll a term into the terms cache that wraps
the terms dictionary, so in theory you could also hit this issue
during normal searching when a term is looked up once, and then
looked up again (the 2nd time will pull from the cache).
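
To make the failure mode concrete, here is a minimal standalone sketch
(not the actual Lucene code) of what happens when a term ordinal above
Integer.MAX_VALUE is narrowed to an int before being divided by the
index interval. The class name, the method names, the interval of 128
and the sample position are all assumptions chosen for illustration.

```java
public class TermOrdOverflow {
    // Assumed index interval, for illustration only.
    static final int INDEX_INTERVAL = 128;

    // Mirrors the pre-patch pattern: the long position is cast to int
    // first, so any value above Integer.MAX_VALUE wraps negative.
    static int indexPosBuggy(long position) {
        int termOrd = (int) position;
        return termOrd / INDEX_INTERVAL;
    }

    // Mirrors the patched pattern: divide as a long, then narrow the
    // (much smaller) quotient to int.
    static int indexPosFixed(long position) {
        return (int) (position / INDEX_INTERVAL);
    }

    public static void main(String[] args) {
        long position = 2_500_000_000L; // hypothetical term ordinal > 2^31 - 1
        System.out.println(indexPosBuggy(position)); // prints -14023182
        System.out.println(indexPosFixed(position)); // prints 19531250
    }
}
```

A negative result like the first one, used as an array index, is the
kind of value that produces an ArrayIndexOutOfBoundsException.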

I've mod'd Test2BTerms and am running it now...

Mike

http://blog.mikemccandless.com

On Mon, Apr 11, 2011 at 12:51 PM, Burton-West, Tom <tburt...@umich.edu> wrote:
> Thanks Mike,
>
> At first I thought this couldn't be related to the 2.1 Billion terms issue 
> since the only place we have tons of terms is in the OCR field and this is 
> not the OCR field. But then I remembered that the total number of terms in 
> all fields is what matters. We've had no problems with regular searches 
> against the index or with other facet queries.  Only with this facet.   Is 
> TermInfoAndOrd only used for faceting?
>
> I'll go ahead and build the patch and let you know.
>
>
> Tom
>
> p.s. Here is the field definition:
> <field name="topicStr" type="string" indexed="true" stored="false" multiValued="true"/>
> <fieldType name="string" class="solr.StrField" sortMissingLast="true" omitNorms="true"/>
>
>
> -----Original Message-----
> From: Michael McCandless [mailto:luc...@mikemccandless.com]
> Sent: Monday, April 11, 2011 8:40 AM
> To: solr-user@lucene.apache.org
> Cc: Burton-West, Tom
> Subject: Re: ArrayIndexOutOfBoundsException with facet query
>
> Tom,
>
> I think I see where this may be -- it looks like another > 2B terms
> bug in Lucene (we are using an int instead of a long in the
> TermInfoAndOrd class inside TermInfosReader.java), only present in
> 3.1.
>
> I'm also mad that Test2BTerms fails to catch this!!  I will go fix
> that test and confirm it sees this bug.
>
> Can you build from source?  If so, try this patch:
>
> Index: lucene/src/java/org/apache/lucene/index/TermInfosReader.java
> ===================================================================
> --- lucene/src/java/org/apache/lucene/index/TermInfosReader.java        (revision 1089906)
> +++ lucene/src/java/org/apache/lucene/index/TermInfosReader.java        (working copy)
> @@ -46,8 +46,8 @@
>
>   // Just adds term's ord to TermInfo
>   private final static class TermInfoAndOrd extends TermInfo {
> -    final int termOrd;
> -    public TermInfoAndOrd(TermInfo ti, int termOrd) {
> +    final long termOrd;
> +    public TermInfoAndOrd(TermInfo ti, long termOrd) {
>       super(ti);
>       this.termOrd = termOrd;
>     }
> @@ -245,7 +245,7 @@
>             // wipe out the cache when they iterate over a large numbers
>             // of terms in order
>             if (tiOrd == null) {
> -              termsCache.put(cacheKey, new TermInfoAndOrd(ti, (int) enumerator.position));
> +              termsCache.put(cacheKey, new TermInfoAndOrd(ti, enumerator.position));
>             } else {
>               assert sameTermInfo(ti, tiOrd, enumerator);
>               assert (int) enumerator.position == tiOrd.termOrd;
> @@ -262,7 +262,7 @@
>     // random-access: must seek
>     final int indexPos;
>     if (tiOrd != null) {
> -      indexPos = tiOrd.termOrd / totalIndexInterval;
> +      indexPos = (int) (tiOrd.termOrd / totalIndexInterval);
>     } else {
>       // Must do binary search:
>       indexPos = getIndexOffset(term);
> @@ -274,7 +274,7 @@
>     if (enumerator.term() != null && term.compareTo(enumerator.term()) == 0) {
>       ti = enumerator.termInfo();
>       if (tiOrd == null) {
> -        termsCache.put(cacheKey, new TermInfoAndOrd(ti, (int) enumerator.position));
> +        termsCache.put(cacheKey, new TermInfoAndOrd(ti, enumerator.position));
>       } else {
>         assert sameTermInfo(ti, tiOrd, enumerator);
>         assert (int) enumerator.position == tiOrd.termOrd;
>
> Mike
>
> http://blog.mikemccandless.com
>
> On Fri, Apr 8, 2011 at 4:53 PM, Burton-West, Tom <tburt...@umich.edu> wrote:
>> The query below results in an array out of bounds exception:
>> select/?q=solr&version=2.2&start=0&rows=0&facet=true&facet.field=topicStr
>>
>> Here is the exception:
>>  Exception during facet.field of topicStr:java.lang.ArrayIndexOutOfBoundsException: -1931149
>>        at org.apache.lucene.index.TermInfosReader.seekEnum(TermInfosReader.java:201)
>>
>> We are using a dev version of Solr/Lucene:
>>
>> Solr Specification Version: 3.0.0.2010.11.19.16.00.54
>> Solr Implementation Version: 3.1-SNAPSHOT 1036094 - root - 2010-11-19 16:00:54
>> Lucene Specification Version: 3.1-SNAPSHOT
>> Lucene Implementation Version: 3.1-SNAPSHOT 1036094 - 2010-11-19 16:01:10
>>
>> Just before the exception we see this entry in our tomcat logs:
>>
>> Apr 8, 2011 2:01:58 PM org.apache.solr.request.UnInvertedField uninvert
>> INFO: UnInverted multi-valued field {field=topicStr,memSize=7675174,tindexSize=289102,time=2577,phase1=2537,nTerms=498975,bigTerms=0,termInstances=1368694,uses=0}
>> Apr 8, 2011 2:01:58 PM org.apache.solr.core.SolrCore execute
>>
>> Is this a known bug?  Can anyone provide a clue as to how we can determine 
>> what the problem is?
>>
>> Tom Burton-West
>>
>>
>> Appended Below is the exception stack trace:
>>
>> SEVERE: Exception during facet.field of topicStr:java.lang.ArrayIndexOutOfBoundsException: -1931149
>>        at org.apache.lucene.index.TermInfosReader.seekEnum(TermInfosReader.java:201)
>>        at org.apache.lucene.index.TermInfosReader.get(TermInfosReader.java:271)
>>        at org.apache.lucene.index.TermInfosReader.terms(TermInfosReader.java:338)
>>        at org.apache.lucene.index.SegmentReader.terms(SegmentReader.java:928)
>>        at org.apache.lucene.index.DirectoryReader$MultiTermEnum.<init>(DirectoryReader.java:1055)
>>        at org.apache.lucene.index.DirectoryReader.terms(DirectoryReader.java:659)
>>        at org.apache.solr.search.SolrIndexReader.terms(SolrIndexReader.java:302)
>>        at org.apache.solr.request.NumberedTermEnum.skipTo(UnInvertedField.java:1018)
>>        at org.apache.solr.request.UnInvertedField.getTermText(UnInvertedField.java:838)
>>        at org.apache.solr.request.UnInvertedField.getCounts(UnInvertedField.java:617)
>>        at org.apache.solr.request.SimpleFacets.getTermCounts(SimpleFacets.java:279)
>>        at org.apache.solr.request.SimpleFacets.getFacetFieldCounts(SimpleFacets.java:312)
>>        at org.apache.solr.request.SimpleFacets.getFacetCounts(SimpleFacets.java:174)
>>        at org.apache.solr.handler.component.FacetComponent.process(FacetComponent.java:72)
>>        at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:195)
>>        at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
>>        at org.apache.solr.core.SolrCore.execute(SolrCore.java:1354)
>>
>>
>
