Re: [DISCUSS] Solr Cache with Futures as values

Mike Drob Thu, 15 Jul 2021 16:14:18 -0700

> It also has a bulk load API

We currently use this when we need to resize a cache, because the
resized cache holds the same results, though possibly fewer of them. I
don't think we can use it when starting a new cache that needs warming,
since the bulk putAll takes both keys and values rather than keys and a
unit of computation. If we move to futures as values then something
like this becomes more feasible, I think.
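
To make the futures-as-values idea concrete, here is a minimal sketch
using only JDK classes. FutureCache, get, warm and fill are
hypothetical names for illustration; this is not SolrCache or
CaffeineCache code, and it ignores eviction, timeouts and cancellation.
It assumes the loader never asks for the key it is currently computing.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executor;
import java.util.function.Function;

/** Hypothetical illustration only; not Solr's or Caffeine's API. */
class FutureCache<K, V> {
  private final ConcurrentHashMap<K, CompletableFuture<V>> map = new ConcurrentHashMap<>();

  /** Returns the value for key, running the loader at most once per key. */
  V get(K key, Function<K, V> loader) {
    CompletableFuture<V> mine = new CompletableFuture<>();
    CompletableFuture<V> winner = map.putIfAbsent(key, mine);
    if (winner != null) {
      // Another caller registered first: wait for its result instead of recomputing.
      return winner.join();
    }
    // The loader runs outside of any map-internal lock, so it may look up
    // OTHER keys in this cache. Asking for the SAME key would wait forever.
    fill(key, mine, loader);
    return mine.join(); // throws CompletionException if the loader failed
  }

  /** Warming takes keys plus a unit of computation, not keys plus values. */
  void warm(Iterable<K> keys, Function<K, V> loader, Executor executor) {
    for (K key : keys) {
      CompletableFuture<V> mine = new CompletableFuture<>();
      if (map.putIfAbsent(key, mine) == null) {
        executor.execute(() -> fill(key, mine, loader));
      }
    }
  }

  private void fill(K key, CompletableFuture<V> future, Function<K, V> loader) {
    try {
      future.complete(loader.apply(key));
    } catch (Throwable t) {
      map.remove(key, future); // do not keep a failed entry in the cache
      future.completeExceptionally(t);
    }
  }
}
```

With something of this shape, parallel requests for the same filter
would wait on one computation, and warming only has to register keys
together with the loader.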


On an unrelated note, I'm struggling to figure out how to test this in
a reproducible fashion. We'd need a filter query that takes a long time
to execute, or an injectable latch that stalls all of the queries until
the test code releases it. I will fiddle with this some more.
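
A rough sketch of what such an injectable latch could look like, using
only java.util.concurrent. StallingLoader and its methods are
hypothetical names, not an existing Solr test utility, and wiring it
into a real filter query is left open.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

/** Hypothetical test helper: a loader that stalls until the test releases it. */
class StallingLoader<K, V> implements Function<K, V> {
  private final Function<K, V> delegate;
  private final CountDownLatch entered = new CountDownLatch(1);
  private final CountDownLatch released = new CountDownLatch(1);
  private final AtomicInteger invocations = new AtomicInteger();

  StallingLoader(Function<K, V> delegate) {
    this.delegate = delegate;
  }

  @Override
  public V apply(K key) {
    invocations.incrementAndGet();
    entered.countDown(); // tell the test that a computation has started
    try {
      // Bounded wait so a broken test fails instead of hanging forever.
      if (!released.await(30, TimeUnit.SECONDS)) {
        throw new IllegalStateException("loader was never released");
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    }
    return delegate.apply(key);
  }

  /** Returns true once at least one computation has started. */
  boolean awaitEntered(long timeout, TimeUnit unit) {
    try {
      return entered.await(timeout, unit);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    }
  }

  /** Lets every stalled computation proceed. */
  void release() {
    released.countDown();
  }

  int invocations() {
    return invocations.get();
  }
}
```

A test could start several threads asking the cache for the same key,
wait on awaitEntered, call release, and then assert on invocations()
to see how many times the value was actually computed.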

On Wed, Jul 14, 2021 at 5:46 PM Mark Miller <markrmil...@gmail.com> wrote:
>
> If Caffeine is being used, it might be worthwhile to look into using its
> feature set to do this.
>
> It has the ability to do either async or sync loading - if using sync, 
> modifications will block while an entry is loading.
>
> It also has a bulk load API, which might be interesting for things like
> auto warming.
>
> - MRM
>
> On Wed, Jul 14, 2021 at 6:18 PM Mike Drob <md...@apache.org> wrote:
>>
>> Thanks for the pointer, David!
>>
>> While browsing through that issue, I found this comment left by you
>> from SOLR-14166
>>
>> https://github.com/apache/solr/blob/main/solr/core/src/java/org/apache/solr/search/SolrIndexSearcher.java#L1185
>>
>> "note: can't use computeIfAbsent because can be recursive"
>>
>> I don't quite understand what the recursive case means here, can you
>> elaborate on that? I didn't see any discussion of it in the JIRA
>> either.
>>
>> Thanks,
>> Mike
>>
>> On Wed, Jul 14, 2021 at 11:03 AM David Smiley <dsmi...@apache.org> wrote:
>> >
>> > Ideally, we could use SolrCache.computeIfAbsent [1] for the filter cache, 
>> > as is used for some of the other caches.  The best SolrCache is 
>> > CaffeineCache which works atomically for the same key (just as does 
>> > ConcurrentHashMap).  The problem is that this method on CaffeineCache does 
>> > not support computing a cache entry that is reentrant, i.e. that which can 
>> > produce another cache entry when it is computed.  Really, that limitation 
>> > ought to be elevated to the docs on SolrCache.computeIfAbsent.  Andrzej 
>> > discovered [1] that some queries could do that, and so he did not update 
>> > Solr's use of the filter cache to call it.  Please read the thread there 
>> > and maybe comment further to get the attention of pertinent people.
>> >
>> > [1]: https://issues.apache.org/jira/browse/SOLR-13898
>> >
>> > ~ David Smiley
>> > Apache Lucene/Solr Search Developer
>> > http://www.linkedin.com/in/davidwsmiley
>> >
>> >
>> > On Tue, Jul 13, 2021 at 4:31 PM Mike Drob <md...@apache.org> wrote:
>> >>
>> >> Hi folks,
>> >>
>> >> This is an idea based on a recent prod issue, and while we found
>> >> another workaround I think there is some merit to discuss here.
>> >>
>> >> Currently our filter cache is a mapping from queries to docs, and the
>> >> result cache is similar although slightly more abstract. When we have
>> >> a lot of similar queries come in at the same time, if a particular
>> >> filter hasn't been cached yet then it will be computed a bunch of
>> >> times in parallel as each query tries to be the one to insert into the
>> >> cache.
>> >>
>> >> One option that I've thought about is if instead of inserting results
>> >> into the cache directly, we pre-register a future in the cache, and
>> >> then use that as a reference to the results. Multiple queries coming
>> >> in parallel would all wait for the same result calculation instead of
>> >> allocating large arrays each.
>> >>
>> >> The benefits are pretty straightforward - we reduce the amount of
>> >> computation done when there are lots of queries coming in, and reduce
>> >> the memory allocation pressure.
>> >>
>> >> The complexity might be around handling errors or query timeouts or
>> >> cancellations. Or evictions, but I think that would all be manageable.
>> >>
>> >> What do other folks think? Should I write up a SIP for this, since I
>> >> think it will be fairly complex, or are there existing solutions that
>> >> I should look into first?
>> >>
>> >> Mike
>> >>
>> >> ---------------------------------------------------------------------
>> >> To unsubscribe, e-mail: dev-unsubscr...@solr.apache.org
>> >> For additional commands, e-mail: dev-h...@solr.apache.org
>> >>
>>
>>
> --
> - Mark
>
> http://about.me/markrmiller
