: This works, but i'm concerned about how many terms we could end up
: with as the size grows.
:
: Another possibility could be a Filter that iterates though FieldCache
: and checks if each value is in the Set<String>
:
: Any thoughts/directions on things to look at?
It really all depends on what kind of orders of magnitude you're talking about, both in terms of the number of filters, the cardinality of those filters, and the likelihood of reuse (ie: will the same Set<String> be used many times? will the strings in that Set typically be reused, but in various permutations?)

You might want to consider ways you could apply the concepts from Field Faceting (particularly the tradeoffs between the fc and enum methods, good values for enum.cache.minDf, fieldValueCache's use of "bigTerms" etc...) since you're facing roughly the same questions -- except instead of computing a bunch of distinct facet counts, you want to compute the intersection of a bunch of filters ... but you need to decide when to cache those filters independently, when to not bother caching them at all, when to cache them as a reusable unit, etc...

-Hoss
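To make the tradeoff concrete, here is a rough toy sketch of the two strategies side by side. It uses plain Java collections as a stand-in for the index and is not Lucene or Solr API: the class SetFilterSketch, its methods, and the minDf parameter are all hypothetical names, with minDf only loosely modeled on the idea behind enum.cache.minDf. byTerms unions one doc set per term (enum-like) and caches a term's doc set only when enough docs match it; byDocValues makes a single pass over every doc's value (fc-like) and does a Set lookup.

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

/** Toy model only: in-memory maps stand in for the index; nothing here is Lucene/Solr API. */
public class SetFilterSketch {
    // Hypothetical inverted index: term -> ids of the docs containing it.
    private final Map<String, BitSet> postings = new HashMap<>();
    // Hypothetical per-doc value array, i.e. what a FieldCache-style lookup would expose.
    private final String[] valueByDoc;
    // Per-term cache; only terms matching at least minDf docs are kept.
    private final Map<String, BitSet> termCache = new HashMap<>();
    private final int minDf;

    public SetFilterSketch(String[] valueByDoc, int minDf) {
        this.valueByDoc = valueByDoc;
        this.minDf = minDf;
        for (int doc = 0; doc < valueByDoc.length; doc++) {
            postings.computeIfAbsent(valueByDoc[doc], k -> new BitSet()).set(doc);
        }
    }

    /** enum-like: cost grows with the number of terms in the Set. */
    public BitSet byTerms(Set<String> wanted) {
        BitSet result = new BitSet(valueByDoc.length);
        for (String term : wanted) {
            BitSet docs = termCache.get(term);
            if (docs == null) {
                // In a real index this step would mean walking the term's postings.
                docs = postings.get(term);
                if (docs == null) {
                    continue; // term not in the index at all
                }
                if (docs.cardinality() >= minDf) {
                    termCache.put(term, docs); // worth keeping for reuse
                }
            }
            result.or(docs);
        }
        return result;
    }

    /** fc-like: cost grows with the number of docs, regardless of Set size. */
    public BitSet byDocValues(Set<String> wanted) {
        BitSet result = new BitSet(valueByDoc.length);
        for (int doc = 0; doc < valueByDoc.length; doc++) {
            if (wanted.contains(valueByDoc[doc])) {
                result.set(doc);
            }
        }
        return result;
    }

    public int cachedTermCount() {
        return termCache.size();
    }
}
```

Both methods return the same doc set; which one is cheaper depends on the number of terms in the Set versus the number of docs in the index, and whether the per-term cache ever pays off depends on how often the same strings come back in later Sets.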