The pointer to the term list in a byte[] is actually limited to 24 bits, but there are 256 different arrays, so the maximum capacity is about 4 billion bytes of un-inverted terms. Each array is still limited to 4B/256 (16M) bytes, though, so the real limit can come in a little lower, depending on how the documents' term lists happen to spread across the arrays.
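To make the arithmetic concrete, here is a rough sketch. It is not Solr's code: the class and method names are made up for illustration, and only the 24-bit pointer width and the 256 arrays come from this thread.

```java
public class UninvertedCapacity {
    // Values taken from the discussion above; everything else is hypothetical.
    static final int POINTER_BITS = 24;
    static final int NUM_ARRAYS = 256;

    // Largest number of bytes a single byte[] can address with a 24-bit pointer.
    static long bytesPerArray() {
        return 1L << POINTER_BITS;
    }

    // Upper bound across all arrays, reached only if every array fills evenly.
    static long totalBytes() {
        return NUM_ARRAYS * bytesPerArray();
    }

    // True only if no single array exceeds its own 24-bit limit.
    static boolean fits(long[] usedBytesPerArray) {
        for (long used : usedBytesPerArray) {
            if (used > bytesPerArray()) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(bytesPerArray()); // 16777216
        System.out.println(totalBytes());    // 4294967296
    }
}
```

The `fits` check is the "luck" part: the total can be well under 4B bytes and still fail if one array alone goes past 16M.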
From the comments:

 * There is a single int[maxDoc()] which either contains a pointer into a byte[] for
 * the termNumber lists, or directly contains the termNumber list if it fits in the 4
 * bytes of an integer.  If the first byte in the integer is 1, the next 3 bytes
 * are a pointer into a byte[] where the termNumber list starts.
 *
 * There are actually 256 byte arrays, to compensate for the fact that the pointers
 * into the byte arrays are only 3 bytes long.  The correct byte array for a document
 * is a function of its id.

-Yonik
http://lucidworks.com

On Thu, Sep 6, 2012 at 6:33 PM, Fuad Efendi <f...@efendi.ca> wrote:
> Hi Jack,
>
> 24bit => 16M possibilities, it's clear; just to confirm... the rest is
> unclear, why 4-byte can have 4 million cardinality? I thought it is 4
> billions...
>
> And, just to confirm: UnInvertedField allows 16M cardinality, correct?
>
> On 12-08-20 6:51 PM, "Jack Krupansky" <j...@basetechnology.com> wrote:
>
>>It appears that there is a hard limit of 24-bits or 16M for the number of
>>bytes to reference the terms in a single field of a single document. It
>>takes 1, 2, 3, 4, or 5 bytes to reference a term. If it took 4 bytes, that
>>would allow 16/4 or 4 million unique terms - per document. Do you have such
>>large documents? This appears to be a hard limit based on 24 bits in a
>>Java int.
>>
>>You can try facet.method=enum, but that may be too slow.
>>
>>What release of Solr are you running?
>>
>>-- Jack Krupansky
>>
>>-----Original Message-----
>>From: Fuad Efendi
>>Sent: Monday, August 20, 2012 4:34 PM
>>To: Solr-User@lucene.apache.org
>>Subject: UnInvertedField limitations
>>
>>Hi All,
>>
>>I have a problem... (Yonik, please!) help me, what is Term count limits? I
>>possibly have 256,000,000 different terms in a field... or 16,000,000?
>>
>>Thanks!
>>
>>2012-08-20 16:20:19,262 ERROR [solr.core.SolrCore] - [pool-1-thread-1] - :
>>org.apache.solr.common.SolrException: Too many values for UnInvertedField faceting on field enrich_keywords_string_mv
>>        at org.apache.solr.request.UnInvertedField.<init>(UnInvertedField.java:179)
>>        at org.apache.solr.request.UnInvertedField.getUnInvertedField(UnInvertedField.java:668)
>>        at org.apache.solr.request.SimpleFacets.getTermCounts(SimpleFacets.java:326)
>>        at org.apache.solr.request.SimpleFacets.getFacetFieldCounts(SimpleFacets.java:423)
>>        at org.apache.solr.request.SimpleFacets.getFacetCounts(SimpleFacets.java:206)
>>        at org.apache.solr.handler.component.FacetComponent.process(FacetComponent.java:85)
>>        at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:204)
>>        at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
>>        at org.apache.solr.core.SolrCore.execute(SolrCore.java:1561)
>>
>>--
>>Fuad Efendi
>>http://www.tokenizer.ca