Thanks Shawn,

Yeah think we have identified root cause thanks to some of the suggestions
here.

Originally we stopped using deleteByQuery as we saw it caused some large
CPU spikes (see https://issues.apache.org/jira/browse/LUCENE-7049) and
Solr pauses
And switched to using a search and then deleteById. It worked fine on our
(small) test collections.

But with 200M documents it appears that deleteById causes the heap to
increase dramatically (we guess fieldCache gets populated with a large
number of object ids?)
To confirm our suspicion we put ³docValues²=³true² on the schema and began
to reindex and the heap memory usage dropped significantly - in fact heap
memory usage on the Solr VMs dropped by a half.

Can someone confirm (or deny) our suspicion that deleteById results in
some on-heap caching of the unique key (id?)?


Cheers!

-Frank

P.s. Interesting when I searched the Wiki for docs on deleteById I did not
find any
https://cwiki.apache.org/confluence/dosearchsite.action?where=solr&spaceSea
rch=true&queryString=deleteById


P.p.s Separately we are also turning off FilterCache but we know from
usage and plugin stats that it is not in use but best to turn it off
entirely for risk reduction

 
Frank Kelly
Principal Software Engineer
 
HERE 
5 Wayside Rd, Burlington, MA 01803, USA
42° 29' 7" N 71° 11' 32" W
 
 <http://360.here.com/>     <https://www.twitter.com/here>
<https://www.facebook.com/here>
<https://www.linkedin.com/company/heremaps>
<https://www.instagram.com/here/>



On 2/9/17, 11:00 AM, "Shawn Heisey" <apa...@elyograg.org> wrote:

>On 2/9/2017 6:19 AM, Kelly, Frank wrote:
>> Got a heap dump on an Out of Memory error.
>> Analyzing the dump now in Visual VM
>>
>> Seeing a lot of byte[] arrays (77% of our 8GB Heap) in
>>
>>   * TreeMap$Entry
>>   * FieldCacheImpl$SortedDocValues
>>
>> We¹re considering switch over to DocValues but would rather be
>> definitive about the root cause before we experiment with DocValues
>> and require a reindex of our 200M document index
>> In each of our 4 data centers.
>>
>> Any suggestions on what I should look for in this heap dump to get a
>> definitive root cause?
>>
>
>Analyzing the cause of large memory allocations when the large
>allocations are byte[] arrays might mean that it's a low-level class,
>probably in Lucene.  Solr will likely have almost no influence on these
>memory allocations, except by changing the schema to enable docValues,
>which changes the particular Lucene code that is called.  Note that
>wiping the index and rebuilding it from scratch is necessary when you
>enable docValues.
>
>Another possible source of problems like this is the filterCache.  A 200
>million document index (assuming it's all on the same machine) results
>in filterCache entries that are 25 million bytes each.  In Solr
>examples, the filterCache defaults to a size of 512.  If a cache that
>size on a 200 million document index fills up, it will require nearly 13
>gigabytes of heap memory.
>
>Thanks,
>Shawn
>

Reply via email to