I was reading about Solr's caches and searcher warming, and I'm left with a bunch of questions.
I don't know if I should ask them here, but I also don't know of another good place to take them. So here's what I'll do: I'll ask them all, and if anybody wants to answer only one of them, or a subset, feel free. No need to address the entire set.

1/ I see mention of an "Index Searcher", which can have instances that have a lifetime, and these instances then have caches. Am I to understand that the process of building new ones and invalidating old ones means that "instances" refers to the fact that there might still be an old instance serving old data while the new one is being constructed? And not to there being multiple Index Searchers for a core at once?

2/ How does all this work in terms of cache invalidation? Is the cache basically tied to the searcher, and if one cache gets invalidated, does the searcher have to be re-created along with all its other caches?

3/ I'm reading that autowarming basically takes a bunch of entries from an old Index Searcher instance and adds them to the new instance. Is there any sort of assurance that these copied entries are still valid? That is to say: if a cache entry is no longer valid, due to some commit that includes changes to certain documents or query results... is there a mechanism that avoids copying old entries containing documents/results based on outdated material?

4/ I see examples of caches sized at numbers like 512 entries. This seems low. What are the considerations here? Do caches get rebuilt so frequently, due to new Index Searchers having to be built, that it's a waste to create big caches only for them to be frequently abandoned and then re-constructed? Or is it something else?

5/ Suppose you have an application that creates documents and performs queries based on application-generated user ids, and there is one core/collection "user_documents" that everything goes into, with the user id being a field. In this scenario, it seems like the action of one user might invalidate the cache for all users. How does one avoid that?
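To make question 4 concrete, this is the sort of cache definition I'm referring to, copied roughly from the sample solrconfig.xml (the exact class names may differ per Solr version):

```xml
<!-- Cache definitions inside the <query> section of solrconfig.xml.
     Each new Index Searcher gets fresh instances of these caches;
     autowarmCount controls how many entries are copied over from
     the previous searcher when a new one is opened. -->
<filterCache      class="solr.FastLRUCache" size="512" initialSize="512" autowarmCount="0"/>
<queryResultCache class="solr.LRUCache"     size="512" initialSize="512" autowarmCount="0"/>
<documentCache    class="solr.LRUCache"     size="512" initialSize="512" autowarmCount="0"/>
```

And for question 5, the queries I mean would carry the user id as a filter query, e.g. fq=user_id:12345 (the field name is from my own hypothetical schema).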
6/ Regarding filter caches, I'm reading that the oldest entry of an LRU cache gets replaced by new entries. Since the filterCache stores an entry for each "fq" of a query, can it happen that only some of a query's fq entries get pushed out of the cache, and not others? Is this a bad thing?

7/ For the document cache, I see the line:

> The size for the documentCache should always be greater than max_results
> times the max_concurrent_queries, to ensure that Solr does not need to
> refetch a document during a request.

What are the implications of that? I can't find documentation on max_results and max_concurrent_queries as names of settings.

Thanks to any who dedicate their time to this.

Regards,
Koen De Groote
