Jonathan Rochkind wrote:
> What matters isn't how many documents have a value, so much
> as how many unique values there are in the field total. If
> there aren't that many, faceting can be done fairly quickly and fairly
> efficiently.
Really? Don't these two log file lines:

INFO: UnInverted multi-valued field {field=vehicle_vin_facet,memSize=39513151,tindexSize=208256,time=138382,phase1=138356,nTerms=638642,bigTerms=0,termInstances=739169,uses=0}
INFO: UnInverted multi-valued field {field=specialassignyn_facet,memSize=36336696,tindexSize=44,time=1458,phase1=1438,nTerms=5,bigTerms=0,termInstances=138046,uses=0}

suggest that whether I have a facet with half a million unique values or half a dozen, it uses roughly the same amount of memory (about 38 MB vs. 35 MB here)? At first glance they both seem similarly efficient to filter on. Certainly the one with many unique values takes longer to uninvert -- but that's just computer time hidden from users, no?

> ... 50 megs times 80 is still only around 4 gigs, not entirely out of the question
> to simply supply enough RAM for all those caches.

Yup - that's what I'm doing for now (I just moved to a machine with 24 GB of RAM); but I expect we'll have 10x as many documents, and maybe 2x as many facets, by spring. Still not undoable, but I may need to start forecasting RAM budgets.
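For what it's worth, here's a rough sketch of how I'd do that forecasting: a small Python script (nothing Solr ships, and the log path and growth factors are just placeholders) that totals the memSize figures from those "UnInverted multi-valued field" log lines and scales the sum by expected growth. The scaling is crude -- the two log lines above suggest memSize tracks document count much more than unique term count, so doc_growth * facet_growth is only a ballpark multiplier.

```python
import re

# Matches the key=value body of Solr's "UnInverted multi-valued field {...}" log lines.
LINE_RE = re.compile(r"UnInverted multi-valued field \{(?P<body>[^}]*)\}")

def parse_uninverted(log_lines):
    """Yield a dict of the key=value pairs from each UnInverted log entry."""
    for line in log_lines:
        m = LINE_RE.search(line)
        if not m:
            continue
        entry = {}
        for pair in m.group("body").split(","):
            key, _, value = pair.partition("=")
            entry[key] = int(value) if value.isdigit() else value
        yield entry

def forecast(log_lines, doc_growth=10, facet_growth=2):
    """Sum the latest memSize per facet field and scale by expected growth.

    Returns (current_bytes, projected_bytes). The projection just multiplies
    the current total by doc_growth * facet_growth -- a rough upper bound,
    not a model of UnInvertedField internals.
    """
    per_field = {}
    for entry in parse_uninverted(log_lines):
        per_field[entry["field"]] = entry["memSize"]
    current = sum(per_field.values())
    return current, current * doc_growth * facet_growth

if __name__ == "__main__":
    with open("solr.log") as f:  # hypothetical log path
        current, projected = forecast(f)
    print(f"current facet caches: {current / 2**20:.1f} MB, "
          f"projected: {projected / 2**30:.1f} GB")
```

On the two lines above that comes out to roughly 72 MB today, so the 10x-documents, 2x-facets scenario would put a full set of these caches somewhere in the tens of gigabytes -- which is why I'd rather budget for it now than discover it in the spring.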