On top of Shawn's analysis, I am also wondering how often those fq queries are reused. Both the filter queries and the sets of matching documents get cached, so there may be quite a bit of space taken up by that as well.
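As a very rough illustration (my own back-of-envelope numbers, not anything measured): a cached filter that matches a large slice of the index is typically kept as a bitset of about maxDoc/8 bytes per core, and the parsed filter query that serves as the cache key hangs on to all of those terms as well. Something like this sketch, with the sizes from this thread plugged in:

    // Back-of-envelope sketch only, not Solr internals. Assumes the cached
    // DocSet is a bitset (~1 bit per document in the core) and that the
    // parsed filter query used as the cache key retains roughly as much
    // term data as the raw 2,000,000-char fq string (~2 bytes per char).
    public class FilterCacheEstimate {
        public static void main(String[] args) {
            long maxDoc  = 10_000_000L;  // docs across the collections; per-core counts are smaller
            long fqChars = 2_000_000L;   // length of the fq value

            long bitsetBytes = maxDoc / 8;   // ~1.25 MB for the cached DocSet
            long keyBytes    = fqChars * 2;  // ~4 MB retained by the cache key

            System.out.printf("~%.1f MB per filterCache entry%n",
                    (bitsetBytes + keyBytes) / (1024.0 * 1024.0));
        }
    }

If those filters are rarely reused, that memory buys you nothing and just sits in the cache until it is evicted.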
Regards,
   Alex.
----
http://www.solr-start.com/ - Resources for Solr users, new and experienced


On 27 May 2017 at 11:32, Shawn Heisey <apa...@elyograg.org> wrote:
> On 5/27/2017 9:05 AM, Shawn Heisey wrote:
>> On 5/27/2017 7:14 AM, Daniel Angelov wrote:
>>> I would like to ask, what could be the memory/cpu impact, if the fq
>>> parameter in many of the queries is a long string (fq={!terms
>>> f=...}...,.... ) around 2000000 chars. Most of the queries are like:
>>> "q={!frange l=Timestamp1 u=Timestamp2}... + some others criteria".
>>> This is with SolrCloud 4.1, on 10 hosts, 3 collections, summary in
>>> all collections are around 10000000 docs. The queries are over all 3
>>> collections.
>
> Followup after a little more thought:
>
> If we assume that the terms in your filter query are a generous 15
> characters each (plus a comma), that means there are in the ballpark of
> 125 thousand of them in a two million byte filter query. If they're
> smaller, then there would be more. Considering 56 bytes of overhead for
> each one, there's at least another 7 million bytes of memory for 125000
> terms when the terms parser divides that filter into multiple String
> objects, plus memory required for the data in each of those small
> strings, which will be just a little bit less than the original four
> million bytes, because it will exclude the commas. A fair amount of
> garbage will probably also be generated in order to parse the filter ...
> and then once the query is done, the 15 megabytes (or more) of memory
> for the strings will also be garbage. This is going to repeat for every
> shard.
>
> I haven't even discussed what happens for memory requirements on the
> Lucene frange parser, because I don't have any idea what those are, and
> you didn't describe the function you're using. I also don't know how
> much memory Lucene is going to require in order to execute a terms
> filter with at least 125K terms. I don't imagine it's going to be small.
>
> Thanks,
> Shawn
>
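P.S. To put Shawn's per-term estimate into code form, here is the same arithmetic as a quick sketch (his assumptions, not measurements: roughly 15-character terms plus a comma, and about 56 bytes of JVM overhead per String object):

    // Restates the back-of-envelope arithmetic from Shawn's message above;
    // all constants are assumptions taken from that message.
    public class TermsFilterOverhead {
        public static void main(String[] args) {
            long fqChars           = 2_000_000L; // length of the fq value
            int  charsPerTerm      = 16;         // ~15-char term plus the comma separator
            int  perStringOverhead = 56;         // rough JVM overhead per String object

            long terms         = fqChars / charsPerTerm;     // ~125,000 terms
            long overheadBytes = terms * perStringOverhead;  // ~7 MB of object overhead
            long charDataBytes = (fqChars - terms) * 2;      // term chars at ~2 bytes each, commas excluded
            long originalBytes = fqChars * 2;                // ~4 MB for the original fq string itself

            System.out.printf("terms=%d, total=~%.1f MB of strings per shard%n",
                    terms, (overheadBytes + charDataBytes + originalBytes) / 1e6);
        }
    }

That works out to roughly 15 MB of transient strings per shard per query, all of which becomes garbage once the request finishes.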