Thanks for the pointer, David!

While browsing through that issue, I found this comment left by you
from SOLR-14166

https://github.com/apache/solr/blob/main/solr/core/src/java/org/apache/solr/search/SolrIndexSearcher.java#L1185

"note: can't use computeIfAbsent because can be recursive"

I don't quite understand what the recursive case means here. Can you
elaborate on that? I didn't see any discussion of it in the JIRA
either.

Thanks,
Mike

On Wed, Jul 14, 2021 at 11:03 AM David Smiley <dsmi...@apache.org> wrote:
>
> Ideally, we could use SolrCache.computeIfAbsent [1] for the filter cache, as 
> is used for some of the other caches.  The best SolrCache is CaffeineCache 
> which works atomically for the same key (just as does ConcurrentHashMap).  
> The problem is that this method on CaffeineCache does not support reentrant 
> computation, i.e. computing a cache entry that itself produces another cache 
> entry.  Really, that limitation ought to be elevated to 
> the docs on SolrCache.computeIfAbsent.  Andrzej discovered [1] that some 
> queries could do that, and so he did not update Solr's use of the filter 
> cache to call it.  Please read the thread there and maybe comment further to 
> get the attention of pertinent people.
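
One possible reading of the reentrancy limitation described above,
sketched with a plain java.util.concurrent.ConcurrentHashMap as a
stand-in (an assumption; this is not CaffeineCache or SolrCache, and
ReentrantComputeDemo is a hypothetical name). The ConcurrentHashMap
javadoc says the mapping function must not update other mappings of the
same map. The two keys below are chosen to share a hash bin so the
nested update is detectable; on Java 9 and later that should surface as
an IllegalStateException, while older runtimes may hang instead.

```java
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical illustration only; uses ConcurrentHashMap, not CaffeineCache. */
public class ReentrantComputeDemo {

  /** Returns true if a nested computeIfAbsent for another key was rejected. */
  public static boolean nestedComputeFails() {
    ConcurrentHashMap<String, String> cache = new ConcurrentHashMap<>();
    try {
      // "Aa" and "BB" have the same hashCode, so both keys map to the same bin.
      cache.computeIfAbsent("Aa", outer ->
          // Computing the outer entry tries to produce a second cache entry.
          "outer+" + cache.computeIfAbsent("BB", inner -> "inner"));
      return false;
    } catch (IllegalStateException recursiveUpdate) {
      // The map detected an update made from inside a mapping function.
      return true;
    }
  }

  public static void main(String[] args) {
    System.out.println("nested compute rejected: " + nestedComputeFails());
  }
}
```

Whether the filter cache hits exactly this path is not established in
this thread; the sketch only shows why a compute function that touches
the same cache can be a problem.
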
>
> [1]: https://issues.apache.org/jira/browse/SOLR-13898
>
> ~ David Smiley
> Apache Lucene/Solr Search Developer
> http://www.linkedin.com/in/davidwsmiley
>
>
> On Tue, Jul 13, 2021 at 4:31 PM Mike Drob <md...@apache.org> wrote:
>>
>> Hi folks,
>>
>> This is an idea based on a recent prod issue, and while we found
>> another workaround, I think the idea merits discussion here.
>>
>> Currently our filter cache is a mapping from queries to docs, and the
>> result cache is similar although slightly more abstract. When we have
>> a lot of similar queries come in at the same time, if a particular
>> filter hasn't been cached yet then it will be computed a bunch of
>> times in parallel as each query tries to be the one to insert into the
>> cache.
>>
>> One option that I've thought about is if instead of inserting results
>> into the cache directly, we pre-register a future in the cache, and
>> then use that as a reference to the results. Multiple queries coming
>> in parallel would all wait for the same result calculation instead of
>> allocating large arrays each.
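
A minimal sketch of the pre-registered future idea above, assuming a
plain ConcurrentHashMap stands in for the filter cache. FutureCache and
getOrCompute are hypothetical names, not Solr APIs. Error handling is
limited to evicting a failed future so later callers can retry;
timeouts, cancellation and size-based eviction are left out.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Hypothetical illustration only; not a Solr class. */
public class FutureCache<K, V> {
  private final ConcurrentHashMap<K, CompletableFuture<V>> map = new ConcurrentHashMap<>();

  public V getOrCompute(K key, Function<K, V> loader) {
    CompletableFuture<V> mine = new CompletableFuture<>();
    // Register the future first; only one caller per key wins this race.
    CompletableFuture<V> existing = map.putIfAbsent(key, mine);
    if (existing != null) {
      // Another caller is computing (or has computed) the value: wait for it.
      return existing.join();
    }
    try {
      // The loader runs outside any map lock, so it may use the cache
      // for other keys. Recursing on the same key would block forever.
      V value = loader.apply(key);
      mine.complete(value);
      return value;
    } catch (RuntimeException | Error e) {
      // Evict the failed future so a later request can retry, then
      // propagate the failure to any callers already waiting on it.
      map.remove(key, mine);
      mine.completeExceptionally(e);
      throw e;
    }
  }
}
```

With this shape, callers that lose the race wait on the winner's future
instead of each computing and allocating their own result. Waiters see a
loader failure as a CompletionException from join().
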
>>
>> The benefits are pretty straightforward - we reduce the amount of
>> computation done when there are lots of queries coming in, and reduce
>> the memory allocation pressure.
>>
>> The complexity might be around handling errors, query timeouts,
>> cancellations, or evictions, but I think that would all be manageable.
>>
>> What do other folks think? Should I write up a SIP for this, since I
>> think it will be fairly complex, or are there existing solutions that
>> I should look into first?
>>
>> Mike
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscr...@solr.apache.org
>> For additional commands, e-mail: dev-h...@solr.apache.org
>>
