On 5/27/2017 9:05 AM, Shawn Heisey wrote:
> On 5/27/2017 7:14 AM, Daniel Angelov wrote:
>> I would like to ask: what could be the memory/CPU impact if the fq
>> parameter in many of the queries is a long string (fq={!terms
>> f=...}...,.... ) of around 2,000,000 chars? Most of the queries are
>> like: "q={!frange l=Timestamp1 u=Timestamp2}... + some other
>> criteria". This is with SolrCloud 4.1, on 10 hosts, with 3
>> collections; in total the collections hold around 10,000,000 docs.
>> The queries go across all 3 collections.
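(For anyone joining the thread: reconstructed from the description above, the requests presumably look something like

    q={!frange l=<Timestamp1> u=<Timestamp2>}<function>
    &fq={!terms f=<field>}term1,term2,term3,...

with the fq value running to around 2,000,000 characters. The field names, terms, and frange function here are placeholders, since they weren't given.)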
Followup after a little more thought:

If we assume that the terms in your filter query are a generous 15 characters each (plus a comma as a separator), then there are in the ballpark of 125 thousand of them in a two-million-character filter query. If the terms are smaller, there will be more of them.

At roughly 56 bytes of object overhead for each one, that's at least another 7 million bytes of memory for 125,000 terms when the terms parser divides that filter into separate String objects, plus the memory for the character data in each of those small strings. Because Java stores string data as two-byte characters, the original two-million-character string occupies about four million bytes, and the split copies will need just a little bit less than that, since they exclude the commas.

A fair amount of garbage will probably also be generated while parsing the filter ... and once the query completes, the roughly 15 megabytes for the strings (7 MB of per-String overhead, nearly 4 MB of split term data, and the 4 MB original string) become garbage as well. This is going to repeat for every shard.

I haven't even discussed the memory requirements of the Lucene frange parser, because I have no idea what those are, and you didn't describe the function you're using. I also don't know how much memory Lucene will require to execute a terms filter with at least 125K terms. I don't imagine it will be small.

Thanks,
Shawn
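P.S. For anyone who wants to check the arithmetic, here is a rough back-of-the-envelope sketch in Java. The 56-byte per-String overhead and the two-bytes-per-character figure are estimates for a typical 64-bit JVM; the exact numbers depend on the JVM version and settings.

    // Back-of-the-envelope estimate of the per-shard String memory cost
    // of splitting a huge terms filter. All constants are assumptions.
    public class FilterMemoryEstimate {
        public static void main(String[] args) {
            long filterChars = 2_000_000L; // total characters in the fq value
            long charsPerTerm = 15;        // assumed (generous) term length
            long separatorChars = 1;       // the comma between terms

            long termCount = filterChars / (charsPerTerm + separatorChars); // ~125,000

            long perStringOverhead = 56;   // rough per-String overhead, 64-bit JVM
            long overheadBytes = termCount * perStringOverhead;             // ~7 MB

            // Java stores string data as 2-byte characters, so the split
            // term data is just under 4 MB (the commas are excluded) ...
            long termDataBytes = termCount * charsPerTerm * 2;
            // ... and the original 2,000,000-char string is another ~4 MB.
            long originalBytes = filterChars * 2;

            long totalBytes = overheadBytes + termDataBytes + originalBytes; // ~15 MB
            System.out.printf("terms=%d overhead=%d termData=%d original=%d total=%d%n",
                    termCount, overheadBytes, termDataBytes, originalBytes, totalBytes);
        }
    }

Running it prints a total of 14,750,000 bytes, which is where the "15 megabytes (or more)" per shard comes from.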