Thanks for the pointer, David! While browsing through that issue, I found this comment that you left for SOLR-14166:
https://github.com/apache/solr/blob/main/solr/core/src/java/org/apache/solr/search/SolrIndexSearcher.java#L1185

"note: can't use computeIfAbsent because can be recursive"

I don't quite understand what the recursive case means here, can you elaborate on that? I didn't see any discussion of it in the JIRA either.

Thanks,
Mike

On Wed, Jul 14, 2021 at 11:03 AM David Smiley <dsmi...@apache.org> wrote:
>
> Ideally, we could use SolrCache.computeIfAbsent [1] for the filter cache, as
> is used for some of the other caches. The best SolrCache is CaffeineCache
> which works atomically for the same key (just as does ConcurrentHashMap).
> The problem is that this method on CaffeineCache does not support computing a
> cache entry that is reentrant, i.e. that which can produce another cache
> entry when it is computed. Really, that limitation ought to be elevated to
> the docs on SolrCache.computeIfAbsent. Andrzej discovered [1] that some
> queries could do that, and so he did not update Solr's use of the filter
> cache to call it. Please read the thread there and maybe comment further to
> get the attention of pertinent people.
>
> [1]: https://issues.apache.org/jira/browse/SOLR-13898
>
> ~ David Smiley
> Apache Lucene/Solr Search Developer
> http://www.linkedin.com/in/davidwsmiley
>
>
> On Tue, Jul 13, 2021 at 4:31 PM Mike Drob <md...@apache.org> wrote:
>>
>> Hi folks,
>>
>> This is an idea based on a recent prod issue, and while we found
>> another workaround I think there is some merit to discuss here.
>>
>> Currently our filter cache is a mapping from queries to docs, and the
>> result cache is similar although slightly more abstract. When we have
>> a lot of similar queries come in at the same time, if a particular
>> filter hasn't been cached yet then it will be computed a bunch of
>> times in parallel as each query tries to be the one to insert into the
>> cache.
>>
>> One option that I've thought about is if instead of inserting results
>> into the cache directly, we pre-register a future in the cache, and
>> then use that as a reference to the results. Multiple queries coming
>> in parallel would all wait for the same result calculation instead of
>> allocating large arrays each.
>>
>> The benefits are pretty straightforward - we reduce the amount of
>> computation done when there are lots of queries coming in, and reduce
>> the memory allocation pressure.
>>
>> The complexity might be around handling errors or query timeouts or
>> cancellations. Or evictions, but I think that would all be manageable.
>>
>> What do other folks think? Should I write up a SIP for this, since I
>> think it will be fairly complex, or are there existing solutions that
>> I should look into first?
>>
>> Mike
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscr...@solr.apache.org
>> For additional commands, e-mail: dev-h...@solr.apache.org
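For illustration only, here is a minimal Java sketch of the "pre-register a future in the cache" idea from the original message. FutureCache and its get method are hypothetical names, not part of Solr or Caffeine, and the sketch assumes an unbounded map with no eviction, timeouts, or cancellation. It only shows the core mechanism: the first caller for a key registers a CompletableFuture and computes the value, and later callers for the same key wait on that future instead of recomputing. Because the loader runs outside of any map operation, it may look up other keys while it computes, which is the reentrant case that computeIfAbsent cannot handle.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Function;

/** Hypothetical illustration only; this is not a Solr or Caffeine class. */
final class FutureCache<K, V> {
  private final ConcurrentMap<K, CompletableFuture<V>> map = new ConcurrentHashMap<>();

  V get(K key, Function<K, V> loader) {
    CompletableFuture<V> existing = map.get(key);
    if (existing == null) {
      CompletableFuture<V> mine = new CompletableFuture<>();
      // putIfAbsent only registers the placeholder; nothing is computed
      // while the map is being updated.
      existing = map.putIfAbsent(key, mine);
      if (existing == null) {
        // This caller won the race and owns the computation.
        try {
          V value = loader.apply(key); // may call get() for a different key
          mine.complete(value);
          return value;
        } catch (RuntimeException | Error e) {
          map.remove(key, mine); // do not keep failed entries around
          mine.completeExceptionally(e);
          throw e;
        }
      }
    }
    // Another caller owns the computation; wait for its result.
    // Caveat: a loader that asks for its own key would block here forever.
    return existing.join();
  }

  public static void main(String[] args) {
    FutureCache<String, Integer> cache = new FutureCache<>();
    // The loader for "outer" looks up "inner" while it is being computed.
    int result = cache.get("outer", k -> cache.get("inner", k2 -> 1) + 1);
    System.out.println(result);
  }
}
```

Waiters that join a failed future see a CompletionException wrapping the original error, so error handling is one of the places where a real implementation would need more care, along with the timeouts, cancellations, and evictions mentioned above.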