Hi Otis,

Thanks for the hint. It turns out I have 17.8 million unique terms. By
now I'm fairly sure that the problem lies with the sorting. The Lucene
Javadocs
(http://lucene.zones.apache.org:8080/hudson/job/Lucene-Nightly/javadoc/org/apache/lucene/search/Sort.html)
state that

> Sorting uses of caches of term values maintained by the internal
> HitQueue(s)

Are these caches separate from the Solr caches? Is there a good way of
reducing their max size when using Solr?
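
To put a rough number on it, here is a back-of-the-envelope sketch of how
much heap one sort field might need. It assumes, as I read the Sort Javadoc,
that a numeric sort keeps one 4-byte entry per document (an array of length
maxDoc), and that a string sort additionally keeps every unique term of the
field in memory. The class and method names, the 4-byte reference size, the
per-String overhead and the average term length are all my own assumptions,
not anything from Lucene or Solr, and the maxDoc value is taken from Otis's
sample index rather than from mine.

```java
public class SortCacheEstimate {

    // Numeric sort: assumed to be one int or float (4 bytes) per document.
    static long estimateNumericSortBytes(long maxDoc) {
        return maxDoc * 4L;
    }

    // String sort: assumed to be one int ordinal per document, plus one
    // reference and one String object per unique term in the sort field.
    static long estimateStringSortBytes(long maxDoc, long uniqueTerms,
                                        int avgTermChars,
                                        int perStringOverheadBytes) {
        long ordinals = maxDoc * 4L;                 // int per document
        long references = uniqueTerms * 4L;          // assumed 4-byte references
        long termData = uniqueTerms
                * (perStringOverheadBytes + 2L * avgTermChars); // 2 bytes per char
        return ordinals + references + termData;
    }

    public static void main(String[] args) {
        long maxDoc = 20437118L;      // from Otis's sample index, not mine
        long uniqueTerms = 17800000L; // my index; assumes all terms are in the sort field
        int avgTermChars = 10;        // hypothetical average term length
        int stringOverhead = 40;      // hypothetical per-String JVM overhead

        long numeric = estimateNumericSortBytes(maxDoc);
        long string = estimateStringSortBytes(maxDoc, uniqueTerms,
                avgTermChars, stringOverhead);

        System.out.println("numeric sort field: " + numeric / (1024 * 1024) + " MB");
        System.out.println("string sort field:  " + string / (1024 * 1024) + " MB");
    }
}
```

If these assumptions are anywhere near right, sorting on a string field with
many unique values would cost far more than sorting on a numeric field, since
the term data dominates the per-document array.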

Cheers,

Chris


Otis Gospodnetic wrote:
> Chris, I checked Luke handler for you on a sample index.  Indeed, it does 
> provide the number of terms and a bunch of other nice information, for 
> example:
> 
> <int name="numDocs">19295605</int>
> <int name="maxDoc">20437118</int>
> <int name="numTerms">49209736</int>               <------- here
> <long name="version">1195333103547</long>
> <bool name="optimized">false</bool>
> <bool name="current">true</bool>
> <bool name="hasDeletions">true</bool>
> 
> 
> Otis
> --
> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
> 
