Re: Best approach to Intersect results with big Set?

Chris Hostetter Sun, 04 Sep 2011 16:36:37 -0700

: This works, but I'm concerned about how many terms we could end up
: with as the size grows.
: 
: Another possibility could be a Filter that iterates through FieldCache
: and checks if each value is in the Set<String>
: 
: Any thoughts/directions on things to look at?


It really all depends on what orders of magnitude you're talking 
about: the number of filters, the cardinality of those filters, and the 
likelihood of reuse (ie: will the same Set<String> be used many times? 
will the strings in that Set typically be reused, but in various 
permutations?)


You might want to consider ways you could apply the concepts 
from Field Faceting (particularly the tradeoffs between the fc and enum 
methods, good values for facet.enum.cache.minDf, fieldValueCache's use of 
"bigTerms" etc...) since you're facing roughly the same questions -- 
except instead of computing a bunch of distinct facet counts, you want to 
compute the intersection of a bunch of filters ... but you need to 
decide when to cache those filters independently, when to not bother 
caching them at all, when to cache them as a reusable unit, etc...
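
To make that tradeoff concrete, here is a rough sketch in plain Java of a 
minDf-style rule applied to this problem. It is not Solr or Lucene code: 
the postings map stands in for the index, java.util.BitSet stands in for 
a doc set, and the class name, the minDf threshold and the cache are all 
made up for illustration. Terms matching at least minDf docs get a cached 
bitset that later requests can reuse; rarer terms are walked directly and 
never cached.

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

public class SetIntersectSketch {
    // Hypothetical stand-in for the index: term -> ids of docs containing it.
    private final Map<String, int[]> postings;
    private final int maxDoc;
    // Assumed threshold, in the spirit of facet.enum.cache.minDf.
    private final int minDf;
    // Per-term bitsets, kept only for terms with docFreq >= minDf.
    private final Map<String, BitSet> cache = new HashMap<>();

    public SetIntersectSketch(Map<String, int[]> postings, int maxDoc, int minDf) {
        this.postings = postings;
        this.maxDoc = maxDoc;
        this.minDf = minDf;
    }

    private BitSet toBits(int[] docs) {
        BitSet bits = new BitSet(maxDoc);
        for (int d : docs) {
            bits.set(d);
        }
        return bits;
    }

    /** Returns the docs in results that contain at least one wanted term. */
    public BitSet intersect(BitSet results, Set<String> wanted) {
        BitSet allowed = new BitSet(maxDoc);
        for (String term : wanted) {
            int[] docs = postings.get(term);
            if (docs == null) {
                continue; // term is not in the index at all
            }
            if (docs.length >= minDf) {
                // Common term: build its bitset once, reuse it afterwards.
                allowed.or(cache.computeIfAbsent(term, t -> toBits(docs)));
            } else {
                // Rare term: cheaper to walk the postings than to cache them.
                for (int d : docs) {
                    allowed.set(d);
                }
            }
        }
        allowed.and(results);
        return allowed;
    }

    public int cachedTermCount() {
        return cache.size();
    }

    public static void main(String[] args) {
        Map<String, int[]> postings = new HashMap<>();
        postings.put("common", new int[] {0, 1, 2, 3});
        postings.put("rare", new int[] {4});
        SetIntersectSketch sketch = new SetIntersectSketch(postings, 8, 2);

        BitSet results = new BitSet(8);
        results.set(1);
        results.set(4);
        results.set(5);

        System.out.println(sketch.intersect(results, Set.of("common", "rare")));
        System.out.println("cached terms: " + sketch.cachedTermCount());
    }
}
```

If the exact same Set<String> comes back often, caching the combined 
"allowed" bitset as a single unit (keyed on the Set) is the other end of 
the spectrum. Which one wins depends on the numbers asked about above.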


-Hoss
