[ https://issues.apache.org/jira/browse/SOLR-8241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16936170#comment-16936170 ]

Ben Manes commented on SOLR-8241:
---------------------------------

I'm sorry you ran into issues running JMH. This seems to be a bug in their 
plugin and I added a workaround to the issue.

{{FastLRUCache}} is backed by an unbounded {{ConcurrentHashMap}} and uses a 
background thread to prune the map once it grows past a size threshold. This 
has the following tradeoffs:
 * The read/write throughput will match {{ConcurrentHashMap}}, making it close 
to the ideal performance.
 * The cache may have runaway memory growth under high load when the cleaner 
thread cannot keep up. 
 * The cleanup takes {{O(n lg n)}} time, which could be expensive when the 
system is already under load. 
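
As a concrete sketch of that pattern (hypothetical names, not Solr's actual implementation): an unbounded {{ConcurrentHashMap}} whose entries carry an access stamp, pruned back to a lower watermark by sorting on recency whenever an upper threshold is crossed. The O(n lg n) sort over all entries is the cleanup cost mentioned above.

```java
import java.util.Comparator;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicLong;

// Illustrative sketch only: an unbounded map pruned back to lowerLimit
// once it exceeds upperLimit, by sorting entries on recency (O(n lg n)).
class PrunedCache<K, V> {
  private static final class Entry<V> {
    final V value;
    final long lastAccessed;
    Entry(V value, long lastAccessed) { this.value = value; this.lastAccessed = lastAccessed; }
  }

  private final ConcurrentHashMap<K, Entry<V>> map = new ConcurrentHashMap<>();
  private final AtomicLong clock = new AtomicLong();
  private final AtomicBoolean pruning = new AtomicBoolean();
  private final int upperLimit;
  private final int lowerLimit;

  PrunedCache(int upperLimit, int lowerLimit) {
    this.upperLimit = upperLimit;
    this.lowerLimit = lowerLimit;
  }

  V get(K key) {
    Entry<V> e = map.get(key);
    if (e == null) return null;
    // Refresh the access stamp (approximate LRU ordering).
    map.put(key, new Entry<>(e.value, clock.incrementAndGet()));
    return e.value;
  }

  void put(K key, V value) {
    map.put(key, new Entry<>(value, clock.incrementAndGet()));
    // Trigger a prune once past the threshold; Solr runs this on a
    // background thread, here it runs inline for simplicity.
    if (map.size() > upperLimit && pruning.compareAndSet(false, true)) {
      try {
        prune();
      } finally {
        pruning.set(false);
      }
    }
  }

  // O(n lg n): sort every entry by recency and evict the oldest
  // until the map is back down to lowerLimit.
  private void prune() {
    map.entrySet().stream()
        .sorted(Comparator.comparingLong(e -> e.getValue().lastAccessed))
        .limit(Math.max(0, map.size() - lowerLimit))
        .forEach(e -> map.remove(e.getKey()));
  }

  int size() { return map.size(); }
}
```

Between prunes the map can grow arbitrarily, which is the runaway-growth risk: under sustained write load the cleanup may never catch up.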

{{Caffeine}} is designed to optimize overall system performance, rather than 
just get/put throughput. If we can exceed the performance requirements for 
throughput, we can sacrifice a little to improve other characteristics without 
impacting real-world performance. This includes a best-in-class hit rate, no 
runaway growth, {{O(1)}} costs, and many more features. 

{{FastLRUCache}} may beat {{Caffeine}} by a nanosecond or two per operation on 
a cache hit. However, the miss penalty (I/O, deserialization, increased GC 
pressure) means that it has lower overall system performance. We can run 
simulations of trace files to show the hit rate differences. Assuming a strict 
LRU, a [search 
trace|https://github.com/ben-manes/caffeine/wiki/Efficiency#search] shows that 
Caffeine's hit rate is significantly higher.
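
A back-of-the-envelope calculation shows why hit rate dominates per-operation speed. All numbers below are made up for illustration (they are not taken from the linked trace): a ~100 ns hit, a ~10 ms miss, and a 10-point hit-rate difference between the two policies.

```java
// Expected access cost = hitRate * hitCost + (1 - hitRate) * missCost.
// Every number here is hypothetical, chosen only to show the shape of
// the tradeoff: a cheap hit versus an expensive miss (I/O, deserialization).
public class ExpectedCost {
  static double expectedNs(double hitRate, double hitCostNs, double missCostNs) {
    return hitRate * hitCostNs + (1 - hitRate) * missCostNs;
  }

  public static void main(String[] args) {
    double hitNs = 100;          // assumed cost of a cache hit
    double missNs = 10_000_000;  // assumed cost of a miss (10 ms)

    double lru = expectedNs(0.60, hitNs, missNs);      // illustrative LRU hit rate
    double tinyLfu = expectedNs(0.70, hitNs, missNs);  // illustrative W-TinyLfu hit rate

    // With these numbers a 10-point hit-rate gain saves ~1 ms per access
    // on average, dwarfing a few nanoseconds of per-hit overhead.
    System.out.printf("LRU: %.0f ns, W-TinyLfu: %.0f ns%n", lru, tinyLfu);
  }
}
```

Even a large per-hit speed advantage for the faster map cannot recover a miss penalty that is five orders of magnitude more expensive.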

Let me know how I can help with the evaluation. I'd gladly write some 
integrations into my tooling with your guidance.

> Evaluate W-TinyLfu cache
> ------------------------
>
>                 Key: SOLR-8241
>                 URL: https://issues.apache.org/jira/browse/SOLR-8241
>             Project: Solr
>          Issue Type: Improvement
>          Components: search
>            Reporter: Ben Manes
>            Assignee: Andrzej Bialecki 
>            Priority: Major
>             Fix For: master (9.0)
>
>         Attachments: SOLR-8241.patch, SOLR-8241.patch, SOLR-8241.patch, 
> SOLR-8241.patch, SOLR-8241.patch, caffeine-benchmark.txt, proposal.patch
>
>
> SOLR-2906 introduced an LFU cache and in-progress SOLR-3393 makes it O(1). 
> The discussions seem to indicate that the higher hit rate (vs LRU) is offset 
> by the slower performance of the implementation. An original goal appeared to 
> be to introduce ARC, a patented algorithm that uses ghost entries to retain 
> history information.
> My analysis of Window TinyLfu indicates that it may be a better option. It 
> uses a frequency sketch to compactly estimate an entry's popularity, uses 
> LRU to capture recency, and operates in O(1) time. When run against available 
> academic traces, the policy provides a near-optimal hit rate regardless of 
> the workload.
> I'm getting ready to release the policy in Caffeine, which Solr already has a 
> dependency on. But, the code is fairly straightforward and a port into Solr's 
> caches instead is a pragmatic alternative. More interesting is what the 
> impact would be in Solr's workloads and feedback on the policy's design.
> https://github.com/ben-manes/caffeine/wiki/Efficiency



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org
