The other thing I'd point out is that if your hit ratio is low, you
might as well disable the filterCache entirely.
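
One way to do that (a minimal sketch, assuming the cache is only
defined in solrconfig.xml; the class and sizes below are just the
stock example values) is to comment the entry out, in which case Solr
never creates the cache:

  <!--
  <filterCache class="solr.FastLRUCache"
               size="512"
               initialSize="512"
               autowarmCount="0"/>
  -->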

Finally, if you have any a-priori knowledge that certain fq clauses
are very unlikely to be re-used, add {!cache=false} to them. If you
also add cost=101 (cost >= 100 only triggers post filtering when
caching is off), the fq clause is run as a post filter, i.e. it is
evaluated only for the documents that still need it after the query
and the cheaper filters have matched.
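
As a rough sketch (the field names and values are made up):

  fq={!cache=false}acl_id:12345
  fq={!frange cache=false cost=150 l=10}log(popularity)

The first form just bypasses the filterCache for a one-off filter. The
second additionally runs as a post filter because its cost is over 100;
note that post filtering only applies to query types that support it,
frange being the usual example.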

See: http://yonik.com/advanced-filter-caching-in-solr/

Best,
Erick

On Thu, Oct 5, 2017 at 12:20 AM, Toke Eskildsen <t...@kb.dk> wrote:
> On Wed, 2017-10-04 at 21:42 -0700, S G wrote:
>> The bit-vectors in filterCache are as long as the maximum number of
>> documents in a core. If there are a billion docs per core, every bit
>> vector will have a billion bits, making its size 10^9 / 8 bytes ≈ 128 MB.
>
> The tricky part here is that there are sparse (aka few hits) entries
> that take up less space. The full bitmap, at 1 bit per document, is
> the worst case.
>
> This is both good and bad. The good part is of course that it saves
> memory. The bad part is that it often means people set the filterCache
> size to a high number and everything works well, right up until a
> series of filters with many hits comes along.
>
> It seems that the memory limit option maxSizeMB was added in Solr 5.2:
> https://issues.apache.org/jira/browse/SOLR-7372
> I am not sure if it works with all caches in Solr, but in my world it
> is way better to define the caches by memory instead of count.
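>
> A rough sketch of the solrconfig.xml entry (going by SOLR-7372 the
> attribute seems to be spelled maxRamMB and it targets solr.LRUCache;
> the numbers below are just placeholders):
>
>   <filterCache class="solr.LRUCache"
>                size="512"
>                initialSize="512"
>                autowarmCount="0"
>                maxRamMB="300"/>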
>
>> With such a big cache value per entry, the default size of 128
>> entries will become 128 x 128 MB = 16 GB, which would not be very
>> good for a system running below 32 GB of memory.
>
> Sure. The default values are just that. For an index with 1M documents
> and a lot of different filters, 128 would probably be too low.
>
> If someone were to create a well-researched set of config files for
> different scenarios, it would be a welcome addition to our shared
> knowledge pool.
>
>> If such a use-case is anticipated, either the JVM's max memory
>> should be increased to beyond 40 GB or the filterCache size should
>> be reduced to 32.
>
> Best solution: Use maxSizeMB (if it works)
> Second best solution: Reduce to 32 or less
> Third best, but often used, solution: Hope that most of the entries are
> sparse and will remain so
>
> - Toke Eskildsen, Royal Danish Library
>
