On 5/27/2017 7:14 AM, Daniel Angelov wrote:
> I would like to ask, what could be the memory/cpu impact, if the fq
> parameter in many of the queries is a long string (fq={!terms
> f=...}...,.... ) around 2000000 chars. Most of the queries are like:
> "q={!frange l=Timestamp1 u=Timestamp2}... + some others criteria". This is
> with SolrCloud 4.1, on 10 hosts, 3 collections, summary in all collections
> are around 10000000 docs. The queries are over all 3 collections.
>
> I have sometimes OOM exceptions. And I can see GC times are pretty long.
> The heap size is 64 GB on each host. The cache settings are the default.
>
> Is it possible the long fq parameter in the requests to cause OOM
> exceptions?
A two million character string in Java will take just over four million
bytes of memory, because Java stores strings internally as UTF-16 (two
bytes per character) and the overhead on a String object is approximately
56 bytes.

With multiple shards, that string is going to get copied for each shard,
and there may be other places in the Solr and Lucene code where the string
gets copied again. At four megabytes per copy, that's going to eat up
memory quickly, and each copy will also take a non-trivial amount of time.

OOM exceptions on a 64GB heap? Even accounting for several copies of the
two million character string floating around, it sounds like you are
running some massively complex queries, or your index size is beyond
gargantuan. I cannot imagine needing a 64GB heap for 30 million documents
unless the system is handling some very unusual queries, and/or an
enormous index, and/or some *extremely* large Solr caches.

I suspect there are many details that we haven't heard yet. I'm not even
sure exactly what to ask for, so I'll ask for the moon. On a per-server
basis, can we see the following info?

 * Total memory installed in the server.
 * How many Solr instances are running on the server.
 * The total amount of max heap memory allocated to Solr.
 * A list of other things running on the server besides Solr.
 * Total size of the solr home directory.
 * How many documents does that solr home size represent? If there are
   multiple shards/replicas, all of them must be counted.
 * solrconfig.xml and the schema would be useful.

More general questions: What does a typical query involve? If there are
facets, describe each field used in a facet -- term cardinality, typical
contents, analysis, etc.

If the system is running an OS with the "top" utility available, run top
(not htop or any other variety), press shift-M to sort by memory, grab a
screenshot, and put the information somewhere on the Internet where we can
access it with a URL. If it's on Windows, similar information can be
obtained from Resource Monitor, sorted by "Working Set" on the Memory tab.

Thanks,
Shawn
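
P.S. For anyone who wants to sanity-check the arithmetic, here is a
minimal Java sketch of the estimate above. It assumes a 64-bit JVM that
stores strings as UTF-16 (true for the Java versions Solr 4.x runs on)
and reuses the approximate 56-byte per-String overhead figure from this
message; the class name and the ten-copy count are purely illustrative,
not something Solr reports.

public class StringFootprint {
    // UTF-16 stores each char in 2 bytes; the 56-byte overhead is the
    // approximate String-object + backing-array cost mentioned above.
    static long estimateStringBytes(int charCount) {
        return 2L * charCount + 56L;
    }

    public static void main(String[] args) {
        int fqLength = 2_000_000; // length of the fq parameter in question
        long oneCopy = estimateStringBytes(fqLength);
        System.out.printf("one copy:  ~%,d bytes (~%.1f MB)%n",
                oneCopy, oneCopy / 1_000_000.0);

        // Each shard (and possibly other layers of Solr/Lucene code) may
        // hold its own copy, so even a handful of copies adds up fast.
        int copies = 10; // illustrative only
        long total = copies * oneCopy;
        System.out.printf("%d copies: ~%,d bytes (~%.1f MB)%n",
                copies, total, total / 1_000_000.0);
    }
}

That prints roughly 4.0 MB for one copy and 40 MB for ten, which is real
memory pressure but nowhere near enough on its own to exhaust a 64GB heap,
hence the questions above.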