[ 
https://issues.apache.org/jira/browse/SOLR-14428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17093772#comment-17093772
 ] 

Colvin Cowie commented on SOLR-14428:
-------------------------------------

That's fair.

I think that having a distinction between the (Query) key used for cache 
lookups and the Query object used for actual querying is definitely a _good 
thing_. But I don't know enough about the internals of it all, so happy to hear 
what other people think.

 

The one thing I have tried doing just now is using a {{SoftReference}} and a 
{{WeakReference}} on the {{automata}} so that it can be GCed and rebuilt if 
it's gone. With SoftReferences the OOMEs stop, but the heap usage stays higher, 
and with WeakReferences the heap usage looks pretty much the same as in 8.3.1.

I'm not sure though whether it provides any real improvement on how things were 
before the changes made in LUCENE-9068, when the automata was just built on 
demand. Plus I have never actually used SoftReference or WeakReference before, 
so can't claim to truely know the ins and outs of them.

[^SOLR-14428-WeakReferences.patch]

> FuzzyQuery has severe memory usage in 8.5
> -----------------------------------------
>
>                 Key: SOLR-14428
>                 URL: https://issues.apache.org/jira/browse/SOLR-14428
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public) 
>    Affects Versions: 8.5, 8.5.1
>            Reporter: Colvin Cowie
>            Assignee: Andrzej Bialecki
>            Priority: Major
>         Attachments: FuzzyHammer.java, SOLR-14428-WeakReferences.patch, 
> image-2020-04-23-09-18-06-070.png, image-2020-04-24-20-09-31-179.png, 
> screenshot-2.png, screenshot-3.png, screenshot-4.png
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> I sent this to the mailing list
> I'm moving from 8.3.1 to 8.5.1, and started getting Out Of Memory Errors 
> while running our normal tests. After profiling it was clear that the 
> majority of the heap was allocated through FuzzyQuery.
> LUCENE-9068 moved construction of the automata from the FuzzyTermsEnum to the 
> FuzzyQuery's constructor.
> I created a little test ( [^FuzzyHammer.java] ) that fires off fuzzy queries 
> from random UUID strings for 5 minutes
> {code}
> FIELD_NAME + ":" + UUID.randomUUID().toString().replace("-", "") + "~2"
> {code}
> When running against a vanilla Solr 8.31 and 8.4.1 there is no problem, while 
> the memory usage has increased drastically on 8.5.0 and 8.5.1.
> Comparison of heap usage while running the attached test against Solr 8.3.1 
> and 8.5.1 with a single (empty) shard and 4GB heap:
> !image-2020-04-23-09-18-06-070.png! 
> And with 4 shards on 8.4.1 and 8.5.0:
>  !screenshot-2.png! 
> I'm guessing that the memory might be being leaked if the FuzzyQuery objects 
> are referenced from the cache, while the FuzzyTermsEnum would not have been.
> Query Result Cache on 8.5.1:
>  !screenshot-3.png! 
> ~316mb in the cache
> QRC on 8.3.1
>  !screenshot-4.png! 
> <1mb
> With an empty cache, running this query 
> _field_s:e41848af85d24ac197c71db6888e17bc~2_ results in the following memory 
> allocation
> {noformat}
> 8.3.1: CACHE.searcher.queryResultCache.ramBytesUsed:      1520
> 8.5.1: CACHE.searcher.queryResultCache.ramBytesUsed:    648855
> {noformat}
> ~1 gives 98253 and ~0 gives 6339 on 8.5.1. 8.3.1 is constant at 1520



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org

Reply via email to