On Mon, Aug 3, 2009 at 4:45 AM, Nicolae Mihalache <xproma...@gmail.com> wrote:
> Hello,
>
> I'm using faceted search (perhaps in a dumb way) to collect some statistics
> for my index. I have documents in various languages, one of the fields is
> "language", and I simply want to see how many documents I have for each
> language. I have noticed that the search builds an int[maxDoc] array and then
> traverses the array to count. If facet.method=enum (which I discovered later)
> is used, the documents are still counted, just in a different way. But in this
> case, where all the documents are retrieved, the information is already
> available in the Lucene index.

> So, I think it would be a good optimization to detect these cases (i.e. no
> filtering) and just return the number from the index instead of counting the
> docs again.

That would require
 - a base query that matched the entire index
 - no filters
 - no deletions in the index

If you want the raw per-term counts straight from the index, see the
terms component (TermsComponent).
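
A minimal sketch of such a request, assuming a Solr instance at the
hypothetical address http://localhost:8983/solr with a /terms request
handler wired to the terms component (as in the example solrconfig.xml).
The terms, terms.fl and terms.limit parameters belong to the terms
component; the host, handler path and helper name are assumptions for
illustration only. The code only builds the URL, it does not contact a
server.

```python
from urllib.parse import urlencode

# Hypothetical base URL; adjust host, port and core for your setup.
SOLR_BASE = "http://localhost:8983/solr"


def terms_url(field, base=SOLR_BASE):
    """Build a terms-component request listing every term of `field`
    together with its document frequency."""
    params = [
        ("terms", "true"),       # enable the terms component
        ("terms.fl", field),     # field whose terms are listed
        ("terms.limit", "-1"),   # negative limit: return all terms
        ("wt", "json"),          # JSON response writer
    ]
    return base + "/terms?" + urlencode(params)


if __name__ == "__main__":
    print(terms_url("language"))
```

Keep in mind that these counts come from the term dictionary, so they
can still include deleted documents until the affected segments are
merged. That is why "no deletions" is on the list above.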

> Another issue: there is no way currently to disable the caching of the
> int[maxDoc], is there?

Use facet.method=enum. With that method, the number of filters cached is
controlled by the filterCache settings.
You can also prevent the filterCache from being used via the
facet.enum.cache.minDf param.
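
As a rough illustration, the request below combines both parameters.
It assumes the same hypothetical Solr address and the standard /select
handler; the helper name and the minDf value in the usage note are made
up for the example. Again, the code only assembles the URL.

```python
from urllib.parse import urlencode

# Hypothetical base URL; adjust host, port and core for your setup.
SOLR_BASE = "http://localhost:8983/solr"


def enum_facet_url(field, min_df=None, base=SOLR_BASE):
    """Build a facet request on `field` that uses facet.method=enum.

    If `min_df` is given, it is passed as facet.enum.cache.minDf, the
    minimum document frequency a term needs before the filterCache is
    used for it.
    """
    params = [
        ("q", "*:*"),               # match all documents
        ("rows", "0"),              # only facet counts are wanted
        ("facet", "true"),
        ("facet.field", field),
        ("facet.method", "enum"),   # enumerate terms, no int[maxDoc] array
    ]
    if min_df is not None:
        params.append(("facet.enum.cache.minDf", str(min_df)))
    return base + "/select?" + urlencode(params)


if __name__ == "__main__":
    print(enum_facet_url("language", min_df=1000000))
```

Choosing a min_df larger than any term's document frequency should keep
the filterCache out of the picture entirely, at the cost of recomputing
each term's count on every request.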

-Yonik
