On 5/27/2017 9:05 AM, Shawn Heisey wrote:
> On 5/27/2017 7:14 AM, Daniel Angelov wrote:
>> I would like to ask: what could be the memory/CPU impact if the fq
>> parameter in many of the queries is a long string (fq={!terms
>> f=...}...,.... ) of around 2,000,000 chars? Most of the queries are
>> like: "q={!frange l=Timestamp1 u=Timestamp2}... + some other
>> criteria". This is with SolrCloud 4.1, on 10 hosts, with 3
>> collections; in total the collections hold around 10,000,000 docs.
>> The queries go across all 3 collections.
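(For anyone joining the thread: reconstructed from the description above, the requests presumably look something like

    q={!frange l=<Timestamp1> u=<Timestamp2>}<function>
    &fq={!terms f=<field>}term1,term2,term3,...

with the fq value running to around 2,000,000 characters. The field names, terms, and frange function here are placeholders, since they weren't given.)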
Followup after a little more thought:

If we assume that the terms in your filter query are a generous 15 characters each (plus a comma as a separator), then there are in the ballpark of 125 thousand of them in a two-million-character filter query. If the terms are smaller, there will be more of them.

At roughly 56 bytes of object overhead for each one, that's at least another 7 million bytes of memory for 125,000 terms when the terms parser divides that filter into separate String objects, plus the memory for the character data in each of those small strings. Because Java stores string data as two-byte characters, the original two-million-character string occupies about four million bytes, and the split copies will need just a little bit less than that, since they exclude the commas.

A fair amount of garbage will probably also be generated while parsing the filter ... and once the query completes, the roughly 15 megabytes for the strings (7 MB of per-String overhead, nearly 4 MB of split term data, and the 4 MB original string) become garbage as well. This is going to repeat for every shard.

I haven't even discussed the memory requirements of the Lucene frange parser, because I have no idea what those are, and you didn't describe the function you're using. I also don't know how much memory Lucene will require to execute a terms filter with at least 125K terms. I don't imagine it will be small.

Thanks,
Shawn
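P.S. For anyone who wants to check the arithmetic, here is a rough back-of-the-envelope sketch in Java. The 56-byte per-String overhead and the two-bytes-per-character figure are estimates for a typical 64-bit JVM; the exact numbers depend on the JVM version and settings.

    // Back-of-the-envelope estimate of the per-shard String memory cost
    // of splitting a huge terms filter. All constants are assumptions.
    public class FilterMemoryEstimate {
        public static void main(String[] args) {
            long filterChars = 2_000_000L; // total characters in the fq value
            long charsPerTerm = 15;        // assumed (generous) term length
            long separatorChars = 1;       // the comma between terms

            long termCount = filterChars / (charsPerTerm + separatorChars); // ~125,000

            long perStringOverhead = 56;   // rough per-String overhead, 64-bit JVM
            long overheadBytes = termCount * perStringOverhead;             // ~7 MB

            // Java stores string data as 2-byte characters, so the split
            // term data is just under 4 MB (the commas are excluded) ...
            long termDataBytes = termCount * charsPerTerm * 2;
            // ... and the original 2,000,000-char string is another ~4 MB.
            long originalBytes = filterChars * 2;

            long totalBytes = overheadBytes + termDataBytes + originalBytes; // ~15 MB
            System.out.printf("terms=%d overhead=%d termData=%d original=%d total=%d%n",
                    termCount, overheadBytes, termDataBytes, originalBytes, totalBytes);
        }
    }

Running it prints a total of 14,750,000 bytes, which is where the "15 megabytes (or more)" per shard comes from.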