Hi folks,

This is an idea that came out of a recent prod issue. We found another
workaround for that issue, but I think the idea itself is worth discussing here.

Currently our filter cache maps queries to doc sets, and the result
cache is similar though slightly more abstract. When a lot of similar
queries arrive at the same time and a particular filter hasn't been
cached yet, that filter gets computed many times in parallel, as each
query races to be the one that inserts it into the cache.

One option I've been thinking about: instead of inserting results into
the cache directly, we pre-register a future in the cache and use that
as a handle to the result. Queries arriving in parallel would then all
wait on the same computation rather than each allocating its own large
arrays. Roughly along the lines of the sketch below.
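
To make it concrete, here is a toy sketch of the pattern. This is just
an illustration of the compute-once-and-share idea, not a proposal for
the actual cache API; FutureCache and the method names are made up.

    import java.util.concurrent.CompletableFuture;
    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.ConcurrentMap;
    import java.util.function.Supplier;

    /**
     * Toy sketch: the cache stores futures rather than finished results,
     * so concurrent lookups for the same key share one computation.
     * Names and types are illustrative only, not Solr's cache API.
     */
    class FutureCache<K, V> {
      private final ConcurrentMap<K, CompletableFuture<V>> cache = new ConcurrentHashMap<>();

      V get(K key, Supplier<V> compute) {
        CompletableFuture<V> placeholder = new CompletableFuture<>();
        // Atomically register our placeholder; whoever wins does the work.
        CompletableFuture<V> existing = cache.putIfAbsent(key, placeholder);
        if (existing != null) {
          return existing.join();          // someone else is (or was) computing it
        }
        try {
          V value = compute.get();         // the expensive bit, done exactly once
          placeholder.complete(value);
          return value;
        } catch (RuntimeException e) {
          placeholder.completeExceptionally(e);
          cache.remove(key, placeholder);  // don't leave a poisoned entry behind
          throw e;
        }
      }
    }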

The benefits are pretty straightforward - we cut out the duplicate
computation when lots of similar queries come in at once, and we reduce
memory allocation pressure.

The complexity is mostly around handling errors, query timeouts,
cancellations, and evictions, but I think all of that would be
manageable (one possible approach to timeouts is sketched below).
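
For example, building on the sketch above, a waiting query could bound
how long it blocks without cancelling the shared computation for
everyone else. Again just an assumption about how we'd wire it up,
timeoutMs is illustrative:

    // A waiter bounds its own wait; the shared computation keeps running
    // for the other queries that are also blocked on this future.
    try {
      return existing.get(timeoutMs, java.util.concurrent.TimeUnit.MILLISECONDS);
    } catch (java.util.concurrent.TimeoutException e) {
      throw new RuntimeException("filter computation timed out for this query", e);
    }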

What do other folks think? Should I write up a SIP for this, since I
think it will be fairly complex, or are there existing solutions that
I should look into first?

Mike
