On Mon, Aug 3, 2009 at 4:45 AM, Nicolae Mihalache<xproma...@gmail.com> wrote:
> Hello,
>
> I'm using faceted search (perhaps in a dumb way) to collect some statistics
> for my index. I have documents in various languages, one of the field is
> "language" and I simply want to see how many documents I have for each
> language. I have noticed that the search builds a int[maxDoc] array and then
> traverses the array to count. If facet.method=enum (discovered later) is
> used, the things are still counted in a different way. But for this case
> where all the documents are retrieved, the information is already available
> in the lucene index.
>
> So, I think it would be a good optimization to detect these cases (i.e. no
> filtering) and just return the number from the index instead of counting the
> docs again.

That would require
 - a base query that matched the entire index
 - no filters
 - no deletions in the index

If you want those numbers, see the terms component.

> Another issue: there is no way currently to disable the caching of the
> int[maxDoc], is there?

use facet.method=enum... the number of filters cached can be controlled
by the filterCache. You can also prevent the filterCache from being used
via the facet.enum.cache.minDf param.

-Yonik
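
For reference, the two requests suggested above could be built roughly as
follows. This is only a sketch: the base URL, the /select and /terms handler
paths, and the minDf value of 1000 are assumptions for illustration and depend
on how your solrconfig.xml is set up. The parameter names (facet.method,
facet.enum.cache.minDf, terms, terms.fl, terms.limit) are the standard Solr
ones. The script only prints the URLs, it does not contact a server.

```python
from urllib.parse import urlencode

# Assumption: hypothetical Solr location; adjust host, port, and core path.
SOLR_BASE = "http://localhost:8983/solr"


def facet_enum_url(field, min_df=None):
    """Build a facet request over all documents using facet.method=enum."""
    params = [
        ("q", "*:*"),            # base query matching the entire index
        ("rows", "0"),           # only the counts are wanted, not the documents
        ("facet", "true"),
        ("facet.field", field),
        ("facet.method", "enum"),
    ]
    if min_df is not None:
        # Terms with a document frequency below this value bypass the filterCache.
        params.append(("facet.enum.cache.minDf", str(min_df)))
    # Assumption: the default /select handler is available.
    return SOLR_BASE + "/select?" + urlencode(params)


def terms_url(field):
    """Build a terms component request that reads counts straight from the index."""
    params = [
        ("terms", "true"),
        ("terms.fl", field),
        ("terms.limit", "-1"),   # return every term in the field
    ]
    # Assumption: a /terms request handler is configured in solrconfig.xml.
    return SOLR_BASE + "/terms?" + urlencode(params)


if __name__ == "__main__":
    print(facet_enum_url("language", min_df=1000))
    print(terms_url("language"))
```

Note that the terms component reports the document frequency stored in the
index, so its numbers only match the facet counts under the three conditions
listed above (match-all query, no filters, no deletions).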