Chris, I checked the Luke handler for you on a sample index. Indeed, it does provide the number of terms, along with a bunch of other nice information, for example:
<int name="numDocs">19295605</int> <int name="maxDoc">20437118</int> <int name="numTerms">49209736</int> <------- here <long name="version">1195333103547</long> <bool name="optimized">false</bool> <bool name="current">true</bool> <bool name="hasDeletions">true</bool> Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch ----- Original Message ---- From: Otis Gospodnetic <[EMAIL PROTECTED]> To: solr-user@lucene.apache.org Sent: Thursday, November 22, 2007 10:15:23 PM Subject: Re: Memory use with sorting problem I'd have to check, but Luke handler might spit that out. If not, Lucene's TermEnum & co. are your friends. :) Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch ----- Original Message ---- From: Chris Laux <[EMAIL PROTECTED]> To: solr-user@lucene.apache.org Sent: Thursday, November 22, 2007 7:22:56 AM Subject: Re: Memory use with sorting problem Thanks for your reply. I made some memory saving changes, as per your advice, but the problem remains. > Set the max warming searchers to 1 to ensure that you never have more > than one warming at the same time. Done. > How many documents are in your index? Currently about 8 million. > If you don't need range queries on these numeric fields, you might try > switching from "sfloat" to "float" and from "sint" to "int". The > fieldCache representation will be smaller. As far as I can see "slong" etc. is also needed for sorting queries (which I do, as mentioned). Anyway, I got an error message when I tried sorting on a "long" field. >> Is it normal to need that much Memory for such a small index? > > Some things are more related to the number of unique terms or the > numer of documents more than the "size" of the index. Is there a manageable way to find out / limit the number of unique terms in Solr? Cheers, Chris