Randy, that is one issue; I don't know whether it fixes everything for you or not. However, Lucene doesn't put a limit on the number of incoming requests, and after https://issues.apache.org/jira/browse/LUCENE-6659, Solr has no way (none that I know of, at least) to limit threads. So if you have a ton of parallel updates reaching the Solr web server, it can cause a performance problem.
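To make the arithmetic and the throttling idea concrete, here is a minimal client-side sketch. It assumes Randy's guess at the upper bound (ramBufferSizeMB * concurrent update requests * cores per node) is right, which nobody in this thread has confirmed. The names `worst_case_blocked_flush_mb`, `BoundedSender`, and `fake_send` are hypothetical; `fake_send` stands in for whatever actually posts a document to Solr.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def worst_case_blocked_flush_mb(ram_buffer_mb, concurrent_requests, cores_per_node):
    """Upper bound guessed in this thread (an assumption, not confirmed):
    ramBufferSizeMB * concurrent update requests * cores on the node."""
    return ram_buffer_mb * concurrent_requests * cores_per_node


class BoundedSender:
    """Caps how many update requests are in flight at once (client side)."""

    def __init__(self, send, max_in_flight):
        self._send = send  # hypothetical callable that posts one document
        self._slots = threading.BoundedSemaphore(max_in_flight)
        self._lock = threading.Lock()
        self._in_flight = 0
        self.peak_in_flight = 0

    def submit(self, doc):
        # Block here until one of the max_in_flight slots is free.
        with self._slots:
            with self._lock:
                self._in_flight += 1
                self.peak_in_flight = max(self.peak_in_flight, self._in_flight)
            try:
                return self._send(doc)
            finally:
                with self._lock:
                    self._in_flight -= 1


def fake_send(doc):
    """Stand-in for an HTTP update request; sleeps instead of talking to Solr."""
    time.sleep(0.001)
    return doc["id"]


# 100 MB buffer, 240 concurrent requests, 3 cores on the node.
estimate_mb = worst_case_blocked_flush_mb(100, 240, 3)
print("worst-case blocked flush heap (MB):", estimate_mb)

# 32 worker threads, but never more than 8 requests in flight.
sender = BoundedSender(fake_send, max_in_flight=8)
with ThreadPoolExecutor(max_workers=32) as pool:
    results = list(pool.map(sender.submit, ({"id": i} for i in range(200))))
print("peak in flight:", sender.peak_in_flight)
```

If the guess holds, 240 concurrent requests against a node with 3 cores could in the worst case pin far more than a 24G heap, so capping in-flight requests in the indexing client is the knob to try first. The cap of 8 is only a placeholder for the sketch, not a recommendation.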

On Tue, Oct 17, 2017 at 10:52 AM, Randy Fradin <randy.fra...@gmail.com> wrote:

> I've been trying to understand DocumentsWriterFlushControl.java to figure
> this one out. I don't really have a firm grasp of it but I'm starting to
> suspect that blocked flushes in aggregate can take up to (ramBufferSizeMB *
> maximum # of concurrent update requests * # of cores) of heap space and
> that I need to limit how many concurrent update requests are sent to the
> same Solr node at the same time to something much lower than my current
> 240. I don't know this for sure.. it is mostly a guess based on the fact
> that one of the DocumentsWriter instances in my heap dump has just under
> 240 items in the blockedFlushes list and each of those is retaining up to
> 57MB of heap space (which is less than ramBufferSizeMB=100 but in the
> ballpark).
>
> Can anyone shed light on whether I'm going down the right path here?
>
> On Mon, Oct 16, 2017 at 5:34 PM David M Giannone <david.giann...@gm.com>
> wrote:
>
> > -------- Original message --------
> > From: Randy Fradin <randy.fra...@gmail.com>
> > Date: 10/16/17 7:38 PM (GMT-05:00)
> > To: solr-user@lucene.apache.org
> > Subject: [EXTERNAL] Re: OOM during indexing with 24G heap - Solr 6.5.1
> >
> > Each shard has around 4.2 million documents which are around 40GB on disk.
> > Two nodes have 3 shard replicas each and the third has 2 shard replicas.
> >
> > The text of the exception is: java.lang.OutOfMemoryError: Java heap space
> > And the heap dump is a full 24GB indicating the full heap space was being
> > used.
> >
> > Here is the solrconfig as output by the config request handler:
> >
> > {
> >   "responseHeader":{
> >     "status":0,
> >     "QTime":0},
> >   "config":{
> >     "znodeVersion":0,
> >     "luceneMatchVersion":"org.apache.lucene.util.Version:6.5.1",
> >     "updateHandler":{
> >       "indexWriter":{"closeWaitsForMerges":true},
> >       "commitWithin":{"softCommit":true},
> >       "autoCommit":{
> >         "maxDocs":50000,
> >         "maxTime":300000,
> >         "openSearcher":false},
> >       "autoSoftCommit":{
> >         "maxDocs":-1,
> >         "maxTime":30000}},
> >     "query":{
> >       "useFilterForSortedQuery":false,
> >       "queryResultWindowSize":1,
> >       "queryResultMaxDocsCached":2147483647,
> >       "enableLazyFieldLoading":false,
> >       "maxBooleanClauses":1024,
> >       "":{
> >         "size":"10000",
> >         "showItems":"-1",
> >         "initialSize":"10",
> >         "name":"fieldValueCache"}},
> >     "jmx":{
> >       "agentId":null,
> >       "serviceUrl":null,
> >       "rootName":null},
> >     "requestHandler":{
> >       "/select":{
> >         "name":"/select",
> >         "defaults":{
> >           "rows":10,
> >           "echoParams":"explicit"},
> >         "class":"solr.SearchHandler"},
> >       "/update":{
> >         "useParams":"_UPDATE",
> >         "class":"solr.UpdateRequestHandler",
> >         "name":"/update"},
> >       "/update/json":{
> >         "useParams":"_UPDATE_JSON",
> >         "class":"solr.UpdateRequestHandler",
> >         "invariants":{"update.contentType":"application/json"},
> >         "name":"/update/json"},
> >       "/update/csv":{
> >         "useParams":"_UPDATE_CSV",
> >         "class":"solr.UpdateRequestHandler",
> >         "invariants":{"update.contentType":"application/csv"},
> >         "name":"/update/csv"},
> >       "/update/json/docs":{
> >         "useParams":"_UPDATE_JSON_DOCS",
> >         "class":"solr.UpdateRequestHandler",
> >         "invariants":{
> >           "update.contentType":"application/json",
> >           "json.command":"false"},
> >         "name":"/update/json/docs"},
> >       "update":{
> >         "class":"solr.UpdateRequestHandlerApi",
> >         "useParams":"_UPDATE_JSON_DOCS",
> >         "name":"update"},
> >       "/config":{
> >         "useParams":"_CONFIG",
> >         "class":"solr.SolrConfigHandler",
> >         "name":"/config"},
> >       "/schema":{
> >         "class":"solr.SchemaHandler",
> >         "useParams":"_SCHEMA",
> >         "name":"/schema"},
> >       "/replication":{
> >         "class":"solr.ReplicationHandler",
> >         "useParams":"_REPLICATION",
> >         "name":"/replication"},
> >       "/get":{
> >         "class":"solr.RealTimeGetHandler",
> >         "useParams":"_GET",
> >         "defaults":{
> >           "omitHeader":true,
> >           "wt":"json",
> >           "indent":true},
> >         "name":"/get"},
> >       "/admin/ping":{
> >         "class":"solr.PingRequestHandler",
> >         "useParams":"_ADMIN_PING",
> >         "invariants":{
> >           "echoParams":"all",
> >           "q":"{!lucene}*:*"},
> >         "name":"/admin/ping"},
> >       "/admin/segments":{
> >         "class":"solr.SegmentsInfoRequestHandler",
> >         "useParams":"_ADMIN_SEGMENTS",
> >         "name":"/admin/segments"},
> >       "/admin/luke":{
> >         "class":"solr.LukeRequestHandler",
> >         "useParams":"_ADMIN_LUKE",
> >         "name":"/admin/luke"},
> >       "/admin/system":{
> >         "class":"solr.SystemInfoHandler",
> >         "useParams":"_ADMIN_SYSTEM",
> >         "name":"/admin/system"},
> >       "/admin/mbeans":{
> >         "class":"solr.SolrInfoMBeanHandler",
> >         "useParams":"_ADMIN_MBEANS",
> >         "name":"/admin/mbeans"},
> >       "/admin/plugins":{
> >         "class":"solr.PluginInfoHandler",
> >         "name":"/admin/plugins"},
> >       "/admin/threads":{
> >         "class":"solr.ThreadDumpHandler",
> >         "useParams":"_ADMIN_THREADS",
> >         "name":"/admin/threads"},
> >       "/admin/properties":{
> >         "class":"solr.PropertiesRequestHandler",
> >         "useParams":"_ADMIN_PROPERTIES",
> >         "name":"/admin/properties"},
> >       "/admin/logging":{
> >         "class":"solr.LoggingHandler",
> >         "useParams":"_ADMIN_LOGGING",
> >         "name":"/admin/logging"},
> >       "/admin/file":{
> >         "class":"solr.ShowFileRequestHandler",
> >         "useParams":"_ADMIN_FILE",
> >         "name":"/admin/file"},
> >       "/export":{
> >         "class":"solr.ExportHandler",
> >         "useParams":"_EXPORT",
> >         "components":["query"],
> >         "defaults":{"wt":"json"},
> >         "invariants":{
> >           "rq":"{!xport}",
> >           "distrib":false},
> >         "name":"/export"},
> >       "/graph":{
> >         "class":"solr.GraphHandler",
> >         "useParams":"_ADMIN_GRAPH",
> >         "invariants":{
> >           "wt":"graphml",
> >           "distrib":false},
> >         "name":"/graph"},
> >       "/stream":{
> >         "class":"solr.StreamHandler",
> >         "useParams":"_STREAM",
> >         "defaults":{"wt":"json"},
> >         "invariants":{"distrib":false},
> >         "name":"/stream"},
> >       "/sql":{
> >         "class":"solr.SQLHandler",
> >         "useParams":"_SQL",
> >         "defaults":{"wt":"json"},
> >         "invariants":{"distrib":false},
> >         "name":"/sql"},
> >       "/terms":{
> >         "class":"solr.SearchHandler",
> >         "useParams":"_TERMS",
> >         "components":["terms"],
> >         "name":"/terms"},
> >       "/analysis/document":{
> >         "class":"solr.DocumentAnalysisRequestHandler",
> >         "startup":"lazy",
> >         "useParams":"_ANALYSIS_DOCUMENT",
> >         "name":"/analysis/document"},
> >       "/analysis/field":{
> >         "class":"solr.FieldAnalysisRequestHandler",
> >         "startup":"lazy",
> >         "useParams":"_ANALYSIS_FIELD",
> >         "name":"/analysis/field"},
> >       "/debug/dump":{
> >         "class":"solr.DumpRequestHandler",
> >         "useParams":"_DEBUG_DUMP",
> >         "defaults":{
> >           "echoParams":"explicit",
> >           "echoHandler":true},
> >         "name":"/debug/dump"}},
> >     "updateRequestProcessorChain":[{
> >         "default":"true",
> >         "name":"customupdatechain",
> >         "":[{"class":"org.apache.solr.update.processor.CustomDedupProcessorFactory"},
> >           {"class":"solr.LogUpdateProcessorFactory"},
> >           {"class":"solr.RunUpdateProcessorFactory"}]}],
> >     "updateHandlerupdateLog":{
> >       "dir":"",
> >       "numVersionBuckets":65536},
> >     "requestDispatcher":{
> >       "handleSelect":true,
> >       "httpCaching":{
> >         "never304":false,
> >         "etagSeed":"Solr",
> >         "lastModFrom":"opentime",
> >         "cacheControl":null},
> >       "requestParsers":{
> >         "multipartUploadLimitKB":2048,
> >         "formUploadLimitKB":2048,
> >         "addHttpRequestToContext":false}},
> >     "indexConfig":{
> >       "useCompoundFile":false,
> >       "maxBufferedDocs":-1,
> >       "maxMergeDocs":-1,
> >       "mergeFactor":-1,
> >       "ramBufferSizeMB":100.0,
> >       "writeLockTimeout":-1,
> >       "lockType":"native",
> >       "infoStreamEnabled":false,
> >       "metrics":{}},
> >     "peerSync":{"useRangeVersions":true}}}
> >
> > On Mon, Oct 16, 2017 at 3:38 PM Shawn Heisey <apa...@elyograg.org> wrote:
> >
> > > On 10/16/2017 3:19 PM, Randy Fradin wrote:
> > > > We are seeing a lot of full GC events and eventual OOM errors in Solr
> > > > during indexing. This is Solr 6.5.1 running in cloud mode with a 24G heap.
> > > > At these times indexing is the only activity taking place. The collection
> > > > has 4 shards and 2 replicas across 3 nodes. Each document is ~10KB (a few
> > > > hundred fields each), and indexing is using the normal update handler, 1
> > > > document per request, up to 240 request at a time.
> > > >
> > > > The heap dump taken automatically on OOM shows 18.3GB of heap taken by 3
> > > > instances of DocumentsWriter. Within those instances, all of the heap is
> > > > retained by the blockedFlushes LinkedList inside the flushControl object.
> > > > Each node in the LinkedList appears to be retaining around 55MB.
> > > >
> > > > Clearly something to do with flushing is at play here but I'm at a loss
> > > > what tuning parameters I should be looking at.
> > > > I would expect things to start blocking if I fall too far behind on
> > > > flushing but apparently that's not happening. The ramBufferSizeMB is set
> > > > to the default 100. My heap size is already absurdly more than I thought
> > > > we would need for this volume.
> > >
> > > One of the first things we need to find out is about your index size.
> > >
> > > In each of your shards, how many documents are there? How much disk
> > > space does one shard replica take up? How many shard replica cores does
> > > each node have on it in total?
> > >
> > > I would also like to get a look at your full solrconfig.xml file. The
> > > schema may be helpful at a later date, along with an example of a
> > > document that you're indexing. With ramBufferSizeMB at the default,
> > > having a ton of memory used up by a class used for indexing seems very odd.
> > >
> > > Do you have the text of the OOM exception? Is it saying out of heap
> > > space, or some other problem?
> > >
> > > Thanks,
> > > Shawn