To clarify: where I said "we put docValues="true" on the schema" I should
have said "we put docValues="true" on the id field only".

-Frank

On 2/10/17, 10:27 AM, "Kelly, Frank" <frank.ke...@here.com> wrote:

>Thanks Shawn,
>
>Yes, I think we have identified the root cause thanks to some of the
>suggestions here.
>
>Originally we stopped using deleteByQuery because we saw that it caused
>large CPU spikes (see https://issues.apache.org/jira/browse/LUCENE-7049)
>and Solr pauses, so we switched to a search followed by deleteById. That
>worked fine on our (small) test collections.
>
>But with 200M documents it appears that deleteById causes the heap to
>grow dramatically (we guess the FieldCache gets populated with a large
>number of object ids?). To confirm our suspicion we put docValues="true"
>on the id field and began to reindex, and heap memory usage dropped
>significantly - in fact heap memory usage on the Solr VMs dropped by
>half.
>
>Can someone confirm (or deny) our suspicion that deleteById results in
>some on-heap caching of the unique key (id)?
>
>Cheers!
>
>-Frank
>
>P.S. Interestingly, when I searched the wiki for docs on deleteById I
>did not find any:
>https://cwiki.apache.org/confluence/dosearchsite.action?where=solr&spaceSearch=true&queryString=deleteById
>
>P.P.S. Separately, we are also turning off the filterCache. We know from
>usage and plugin stats that it is not in use, but it is best to turn it
>off entirely for risk reduction.
>
>Frank Kelly
>Principal Software Engineer
>
>HERE
>5 Wayside Rd, Burlington, MA 01803, USA
>42° 29' 7" N 71° 11' 32" W
>
>On 2/9/17, 11:00 AM, "Shawn Heisey" <apa...@elyograg.org> wrote:
>
>>On 2/9/2017 6:19 AM, Kelly, Frank wrote:
>>> Got a heap dump on an out-of-memory error. Analyzing the dump now in
>>> VisualVM.
>>>
>>> Seeing a lot of byte[] arrays (77% of our 8GB heap) in:
>>>
>>> * TreeMap$Entry
>>> * FieldCacheImpl$SortedDocValues
>>>
>>> We're considering switching over to DocValues, but we would rather be
>>> definitive about the root cause before we experiment with DocValues
>>> and require a reindex of our 200M-document index in each of our 4
>>> data centers.
>>>
>>> Any suggestions on what I should look for in this heap dump to get a
>>> definitive root cause?
>>
>>When the large allocations are byte[] arrays, the cause is probably a
>>low-level class, most likely in Lucene. Solr has almost no influence on
>>these memory allocations, except by changing the schema to enable
>>docValues, which changes the particular Lucene code that is called.
>>Note that wiping the index and rebuilding it from scratch is necessary
>>when you enable docValues.
>>
>>Another possible source of problems like this is the filterCache. A 200
>>million document index (assuming it's all on the same machine) results
>>in filterCache entries that are 25 million bytes each. In the Solr
>>examples, the filterCache defaults to a size of 512. If a cache that
>>size fills up on a 200 million document index, it will require nearly
>>13 gigabytes of heap memory.
>>
>>Thanks,
>>Shawn
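To make Shawn's arithmetic concrete: each filterCache entry is a bitset
with one bit per document, so a 200 million document index gives
200,000,000 / 8 = 25,000,000 bytes (about 25 MB) per entry, and 512
entries x 25 MB is roughly 12.8 GB. A sketch of how the cache could be
shrunk in solrconfig.xml, in line with the risk reduction Frank mentions
in his P.P.S. (the size chosen here is illustrative; the right value
depends on the workload):

    <!-- solrconfig.xml: shrink the filterCache. On a 200M-document
         index each entry costs ~25 MB, so size=16 caps the cache at
         roughly 400 MB instead of ~12.8 GB at the example default of
         size=512. -->
    <filterCache class="solr.FastLRUCache"
                 size="16"
                 initialSize="16"
                 autowarmCount="0"/>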