[jira] [Commented] (LUCENE-9038) Evaluate Caffeine for LruQueryCache

Ben Manes (Jira) Sun, 24 Nov 2019 17:38:06 -0800


    [ 
https://issues.apache.org/jira/browse/LUCENE-9038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16981241#comment-16981241
 ]


Ben Manes commented on LUCENE-9038:
-----------------------------------

I ran the [luceneutil|https://github.com/mikemccand/luceneutil] 
benchmark against this change rebased on master. The benchmark is pretty noisy, 
and I'm not sure how much it exercises the cache, but these were the results.

{code}
                     Task  QPS baseline    StdDev  QPS my_modified_version    StdDev  Pct diff
                  Respell        184.34   (32.4%)                   167.09   (34.9%)     -9.4% ( -57% -   85%)
                   Fuzzy1        213.08   (15.1%)                   202.41   (15.4%)     -5.0% ( -30% -   30%)
    BrowseMonthSSDVFacets       1789.91   (10.4%)                  1759.04   (11.6%)     -1.7% ( -21% -   22%)
                  LowTerm       3172.83   (11.3%)                  3149.06   (11.1%)     -0.7% ( -20% -   24%)
          LowSloppyPhrase        510.21   (12.6%)                   505.35    (5.4%)     -1.0% ( -16% -   19%)
                OrHighLow        911.22   (11.4%)                   907.11    (8.7%)     -0.5% ( -18% -   22%)
              MedSpanNear        639.59   (14.3%)                   637.37   (11.8%)     -0.3% ( -23% -   29%)
        HighTermMonthSort       1410.18   (14.8%)                  1414.44   (17.8%)      0.3% ( -28% -   38%)
               OrHighHigh        282.72   (18.9%)                   283.90   (27.8%)      0.4% ( -38% -   58%)
               AndHighLow       1811.44   (16.5%)                  1826.13    (8.5%)      0.8% ( -20% -   30%)
                LowPhrase        830.24   (12.8%)                   837.28    (9.7%)      0.8% ( -19% -   26%)
BrowseDayOfYearSSDVFacets       1538.60    (9.5%)                  1552.58   (11.5%)      0.9% ( -18% -   24%)
                 HighTerm       1010.87   (11.3%)                  1020.64    (9.9%)      1.0% ( -18% -   24%)
                MedPhrase        571.41   (11.5%)                   579.31    (7.3%)      1.4% ( -15% -   22%)
          MedSloppyPhrase        417.12   (21.1%)                   423.51   (21.7%)      1.5% ( -34% -   56%)
              LowSpanNear        746.19   (18.1%)                   758.25   (12.9%)      1.6% ( -24% -   39%)
                 Wildcard        184.23   (29.0%)                   187.63   (29.3%)      1.8% ( -43% -   84%)
     BrowseDateTaxoFacets       2747.64   (15.6%)                  2804.34   (14.5%)      2.1% ( -24% -   38%)
BrowseDayOfYearTaxoFacets       6748.62    (7.1%)                  6900.47    (6.1%)      2.3% ( -10% -   16%)
              AndHighHigh        608.66   (11.9%)                   622.76   (16.0%)      2.3% ( -22% -   34%)
               AndHighMed       1974.49   (14.2%)                  2031.35   (10.3%)      2.9% ( -18% -   31%)
                   Fuzzy2         19.26   (69.9%)                    19.84   (54.5%)      3.0% ( -71% -  423%)
                  MedTerm       2809.96    (9.1%)                  2900.82   (10.7%)      3.2% ( -15% -   25%)
     HighIntervalsOrdered        253.46   (37.7%)                   261.87   (43.4%)      3.3% ( -56% -  135%)
    BrowseMonthTaxoFacets       6838.39    (8.4%)                  7109.56    (8.4%)      4.0% ( -11% -   22%)
         HighSloppyPhrase        379.10   (20.3%)                   395.81   (20.4%)      4.4% ( -30% -   56%)
    HighTermDayOfYearSort        498.94   (15.7%)                   527.78   (13.5%)      5.8% ( -20% -   41%)
                 PKLookup        158.51   (27.8%)                   169.54   (12.9%)      7.0% ( -26% -   65%)
                  Prefix3        168.46   (38.7%)                   180.95   (36.4%)      7.4% ( -48% -  134%)
               HighPhrase        260.05   (34.0%)                   279.62   (20.5%)      7.5% ( -35% -   94%)
                   IntNRQ        598.33   (33.7%)                   651.97   (33.9%)      9.0% ( -43% -  115%)
                OrHighMed        378.56   (32.7%)                   427.55   (16.9%)     12.9% ( -27% -   93%)
             HighSpanNear        217.85   (37.3%)                   249.79   (36.3%)     14.7% ( -42% -  140%)
{code}
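
For reading the table: the leading figure in the Pct diff column appears to be the relative change in mean QPS from the baseline to the modified version. The following is a minimal sketch under that assumption; the class and method names are made up for illustration and are not part of luceneutil. It reproduces the reported figure for the first and last rows. The bracketed range in that column is not recomputed here.

```java
import java.util.Locale;

// Hypothetical helper for checking the table; not luceneutil code.
final class PctDiffCheck {

  /** Relative change of the modified mean QPS versus the baseline, in percent. */
  static double pctDiff(double baselineQps, double modifiedQps) {
    return 100.0 * (modifiedQps - baselineQps) / baselineQps;
  }

  public static void main(String[] args) {
    // Mean QPS values copied from the Respell and HighSpanNear rows.
    System.out.println(String.format(Locale.ROOT, "Respell      %.1f%%", pctDiff(184.34, 167.09)));
    System.out.println(String.format(Locale.ROOT, "HighSpanNear %.1f%%", pctDiff(217.85, 249.79)));
    // prints -9.4% and 14.7%, matching the table
  }
}
```

Both rows report standard deviations above 30%, so changes of this size are hard to separate from run-to-run variation, which is consistent with the remark that the benchmark is noisy.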

> Evaluate Caffeine for LruQueryCache
> -----------------------------------
>
>                 Key: LUCENE-9038
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9038
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Ben Manes
>            Priority: Major
>         Attachments: CaffeineQueryCache.java, cache.patch
>
>
> [LRUQueryCache|https://github.com/apache/lucene-solr/blob/master/lucene/core/src/java/org/apache/lucene/search/LRUQueryCache.java]
>  appears to play a central role in Lucene's performance. There are many 
> issues discussing its performance, such as LUCENE-7235, LUCENE-7237, 
> LUCENE-8027, LUCENE-8213, and LUCENE-9002. It appears that the cache's 
> overhead can make it as much of a liability as a benefit, leading to various 
> workarounds and added complexity.
> When reviewing the discussions and code, the following issues are concerning:
> # The cache is guarded by a single lock for all reads and writes.
> # All computations are performed outside of any locking to avoid 
> penalizing other callers. This doesn't handle cache stampedes, meaning 
> that multiple threads may cache miss, compute the value, and try to store it. 
> That redundant work becomes expensive under load and can be mitigated with 
> roughly per-key locks.
> # The cache queries the entry to see if it's even worth caching. At first 
> glance one assumes that is so that inexpensive entries don't bang on the lock 
> or thrash the LRU. However, this is also used to indicate data dependencies 
> for uncachable items (per JIRA), which perhaps shouldn't be invoking the 
> cache.
> # The cache lookup is skipped if the global lock is held and the value is 
> computed, but not stored. This means a busy lock reduces performance across 
> all usages and the cache's effectiveness degrades. This is not counted in the 
> miss rate, giving a false impression.
> # An attempt was made to perform computations asynchronously, due to their 
> heavy cost on tail latencies. That work was reverted due to test failures and 
> is being worked on.
> # An [in-progress change|https://github.com/apache/lucene-solr/pull/940] 
> tries to avoid LRU thrashing due to large, infrequently used items being 
> cached.
> # The cache is tightly intertwined with business logic, making it hard to 
> tease apart core algorithms and data structures from the usage scenarios.
> It seems that more and more items skip being cached because of concurrency 
> and hit rate performance, causing special case fixes based on knowledge of 
> the external code flows. Since the developers are experts on search, not 
> caching, it seems justified to evaluate if an off-the-shelf library would be 
> more helpful in terms of developer time, code complexity, and performance. 
> Solr has already introduced [Caffeine|https://github.com/ben-manes/caffeine] 
> in SOLR-8241 and SOLR-13817.
> The proposal is to replace the internals of {{LRUQueryCache}} so that external 
> usages are not affected in terms of the API. However, like in {{SolrCache}}, 
> a difference is that Caffeine only bounds by either the number of entries or 
> an accumulated size (e.g. bytes), but not both constraints. This likely is an 
> acceptable divergence in how the configuration is honored.
> cc [~ab], [~dsmiley]

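Item 2 of the quoted description (cache stampedes and per-key locks) can be illustrated with JDK classes alone. The sketch below is hypothetical and is neither Lucene's nor Caffeine's code; `RacyCache`, `PerKeyCache`, and `countLoads` are names invented for the example. It contrasts a check-then-act lookup, where every thread that misses runs the computation, with `ConcurrentHashMap.computeIfAbsent`, which runs the mapping function at most once while the key is absent (the map locks per hash bin, which is roughly per key).

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical sketch; not Lucene or Caffeine code.
final class StampedeSketch {

  /** Check-then-act lookup: threads that miss together all run the loader. */
  static final class RacyCache<K, V> {
    private final ConcurrentHashMap<K, V> map = new ConcurrentHashMap<>();

    V get(K key, Function<K, V> loader) {
      V value = map.get(key);
      if (value == null) {
        value = loader.apply(key); // computed outside any lock
        V raced = map.putIfAbsent(key, value);
        if (raced != null) {
          value = raced; // another thread stored first; this computation was wasted
        }
      }
      return value;
    }
  }

  /** computeIfAbsent runs the loader at most once while the key is absent. */
  static final class PerKeyCache<K, V> {
    private final ConcurrentHashMap<K, V> map = new ConcurrentHashMap<>();

    V get(K key, Function<K, V> loader) {
      return map.computeIfAbsent(key, loader);
    }
  }

  /** Starts concurrent lookups of one key and returns how often the loader ran. */
  static int countLoads(boolean perKey, int threads) {
    AtomicInteger loads = new AtomicInteger();
    Function<String, String> loader = key -> {
      loads.incrementAndGet();
      try {
        Thread.sleep(20); // stand-in for an expensive computation
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
      return key + "-value";
    };
    RacyCache<String, String> racy = new RacyCache<>();
    PerKeyCache<String, String> guarded = new PerKeyCache<>();
    CountDownLatch start = new CountDownLatch(1);
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    for (int i = 0; i < threads; i++) {
      pool.execute(() -> {
        try {
          start.await(); // release all threads at once
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
          return;
        }
        if (perKey) {
          guarded.get("query", loader);
        } else {
          racy.get("query", loader);
        }
      });
    }
    start.countDown();
    pool.shutdown();
    try {
      pool.awaitTermination(10, TimeUnit.SECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
    return loads.get();
  }

  public static void main(String[] args) {
    System.out.println("racy loads:    " + countLoads(false, 8)); // usually more than 1
    System.out.println("per-key loads: " + countLoads(true, 8));  // always 1
  }
}
```

The trade-off is that callers of the same key wait for the first computation instead of duplicating it. Caffeine's `Cache.get(key, mappingFunction)` is documented as computing the value atomically in a similar way, although how it would be wired into the query cache is outside the scope of this sketch.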

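Item 4 of the quoted description (lookups skipped while the global lock is busy, and not counted as misses) can be sketched as follows. This is an assumption-laden illustration of the described behaviour, not the actual LRUQueryCache code; the class, its counters, and `demo` are invented names. It shows how the reported hit rate can look better than the fraction of lookups that were really served from the cache.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.LongAdder;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Function;

// Hypothetical sketch of the behaviour described in item 4; not the Lucene implementation.
final class TryLockCacheSketch<K, V> {
  private final ReentrantLock lock = new ReentrantLock();
  private final Map<K, V> map = new HashMap<>(); // guarded by lock
  private long hits;                             // guarded by lock
  private long misses;                           // guarded by lock
  private final LongAdder skipped = new LongAdder(); // lookups that never reached the map

  V get(K key, Function<K, V> compute) {
    if (!lock.tryLock()) {
      // Lock is busy: compute without consulting or populating the cache.
      skipped.increment();
      return compute.apply(key);
    }
    try {
      V cached = map.get(key);
      if (cached != null) {
        hits++;
        return cached;
      }
      misses++;
    } finally {
      lock.unlock();
    }
    V value = compute.apply(key); // computed outside the lock
    lock.lock();
    try {
      map.putIfAbsent(key, value);
    } finally {
      lock.unlock();
    }
    return value;
  }

  /** Hit rate as seen by the hit and miss counters only. */
  double reportedHitRate() {
    lock.lock();
    try {
      long total = hits + misses;
      return total == 0 ? 0.0 : (double) hits / total;
    } finally {
      lock.unlock();
    }
  }

  /** Hit rate over all lookups, including those skipped because the lock was busy. */
  double effectiveHitRate() {
    lock.lock();
    try {
      long total = hits + misses + skipped.sum();
      return total == 0 ? 0.0 : (double) hits / total;
    } finally {
      lock.unlock();
    }
  }

  /** One miss, one hit, then one lookup while another thread holds the lock. */
  static double[] demo() {
    TryLockCacheSketch<String, String> cache = new TryLockCacheSketch<>();
    cache.get("q", k -> "v"); // miss: computed and stored
    cache.get("q", k -> "v"); // hit
    CountDownLatch held = new CountDownLatch(1);
    CountDownLatch release = new CountDownLatch(1);
    Thread holder = new Thread(() -> {
      cache.lock.lock();
      try {
        held.countDown();
        release.await();
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      } finally {
        cache.lock.unlock();
      }
    });
    holder.start();
    try {
      held.await();
      cache.get("q", k -> "v"); // lock busy: recomputed, neither a hit nor a miss
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    } finally {
      release.countDown();
    }
    try {
      holder.join();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
    return new double[] {cache.reportedHitRate(), cache.effectiveHitRate()};
  }

  public static void main(String[] args) {
    double[] rates = demo();
    System.out.println(String.format(Locale.ROOT, "reported  %.2f", rates[0])); // 0.50
    System.out.println(String.format(Locale.ROOT, "effective %.2f", rates[1])); // 0.33
  }
}
```

In this toy run the counters report a 50% hit rate while only one of three lookups was served from the cache. Under real contention the gap would depend on how often the lock is busy.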

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org
