If you can chart the GC logs in GCViewer from around the time the OOM happens, that can give you an idea of whether it was a sudden OOM or the heap filled up over a period of time. This may help nail down whether a particular query is causing the problem, or whether it is something else...
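If GC logging isn't already on, flags along these lines should produce a log GCViewer can read on the Java 7-era HotSpot JVMs that Solr 4.x usually runs on (a sketch only; the paths are placeholders to adjust for your install):

    java -Xloggc:/var/log/solr/gc.log \
         -XX:+PrintGCDetails \
         -XX:+PrintGCDateStamps \
         -XX:+PrintGCApplicationStoppedTime \
         -XX:+HeapDumpOnOutOfMemoryError \
         -XX:HeapDumpPath=/var/log/solr/heapdump.hprof \
         -jar start.jar

The two heap-dump flags are optional, but a dump taken at the moment of the OOM shows exactly what was filling the heap, not just when it filled.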
Thanks,
Susheel

On Sat, May 27, 2017 at 5:36 PM, Daniel Angelov <dani.b.ange...@gmail.com> wrote:

> Thanks for the support so far.
> I am going to analyze the logs in order to check the frequency of such
> queries. BTW, I forgot to mention: the soft and the hard commits are
> each 60 sec.
>
> BR
> Daniel
>
> On 27.05.2017 at 22:57, "Erik Hatcher" <erik.hatc...@gmail.com> wrote:
>
> > Another technique to consider is {!join}. Index the cross-reference id
> > "sets" to another core and use a short and sweet join, if there are
> > stable sets of ids.
> >
> >    Erik
> >
> > > On May 27, 2017, at 11:39, Alexandre Rafalovitch <arafa...@gmail.com> wrote:
> > >
> > > On top of Shawn's analysis, I am also wondering how often those fq
> > > queries are reused, because they and the matching documents are
> > > getting cached, so there might be quite a bit of space taken up by
> > > that too.
> > >
> > > Regards,
> > >    Alex.
> > > ----
> > > http://www.solr-start.com/ - Resources for Solr users, new and experienced
> > >
> > >
> > >> On 27 May 2017 at 11:32, Shawn Heisey <apa...@elyograg.org> wrote:
> > >>> On 5/27/2017 9:05 AM, Shawn Heisey wrote:
> > >>>> On 5/27/2017 7:14 AM, Daniel Angelov wrote:
> > >>>> I would like to ask what the memory/CPU impact could be if the fq
> > >>>> parameter in many of the queries is a long string (fq={!terms
> > >>>> f=...}...,.... ) of around 2,000,000 chars. Most of the queries are
> > >>>> like: "q={!frange l=Timestamp1 u=Timestamp2}... + some other
> > >>>> criteria". This is with SolrCloud 4.1, on 10 hosts, with 3
> > >>>> collections holding around 10,000,000 docs in total. The queries
> > >>>> run over all 3 collections.
> > >>
> > >> Followup after a little more thought:
> > >>
> > >> If we assume that the terms in your filter query are a generous 15
> > >> characters each (plus a comma), that puts them in the ballpark of
> > >> 125 thousand terms in a two-million-byte filter query. If they're
> > >> smaller, there would be more of them. Considering 56 bytes of
> > >> overhead for each one, that's at least another 7 million bytes of
> > >> memory for 125000 terms when the terms parser divides that filter
> > >> into multiple String objects, plus the memory required for the data
> > >> in each of those small strings, which will be just a little bit less
> > >> than the original four million bytes, because it excludes the
> > >> commas. A fair amount of garbage will probably also be generated in
> > >> order to parse the filter ... and then once the query is done, the
> > >> 15 megabytes (or more) of memory for the strings will also be
> > >> garbage. This is going to repeat for every shard.
> > >>
> > >> I haven't even discussed the memory requirements of the Lucene
> > >> frange parser, because I have no idea what those are, and you didn't
> > >> describe the function you're using. I also don't know how much
> > >> memory Lucene will need to execute a terms filter with at least 125K
> > >> terms. I don't imagine it will be small.
> > >>
> > >> Thanks,
> > >> Shawn
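Regarding Erik's {!join} suggestion in the thread above: assuming the id sets are stable and indexed into a separate core (a hypothetical "idsets" core with fields "set_name" and "ref_id", joining to the main collection's "id" field; all of these names are made up for illustration), the two-million-character terms filter could collapse to something like:

    fq={!join fromIndex=idsets from=ref_id to=id}set_name:set42

One caveat: in Solr 4.x the fromIndex core must be present on the same node as the core being queried and cannot itself be sharded, so this works best when the id-set index is small enough to replicate to every node.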
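On Alexandre's caching point: each distinct fq becomes an entry in the filterCache, whose value can be a bitset as large as maxDoc/8 bytes per core, and whose key is the parsed query, which for a 125K-term filter is itself multi-megabyte. If these giant filters are rarely reused, a smaller cache in solrconfig.xml at least keeps them from piling up; the stock 4.x entry looks roughly like this (exact defaults vary by version, so check your own config):

    <filterCache class="solr.FastLRUCache"
                 size="512"
                 initialSize="512"
                 autowarmCount="0"/>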
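Shawn's back-of-envelope estimate, restated as a runnable sketch (the 56 bytes of per-String overhead, 16 chars per term, and 2 bytes per char for Java's UTF-16 strings are his stated or implied assumptions, not measurements):

    // Rough restatement of Shawn's estimate; all constants are
    // assumptions taken from his message, not measured values.
    public class TermsFilterFootprint {
        public static void main(String[] args) {
            long filterChars  = 2000000L; // length of the fq parameter value
            long charsPerTerm = 16L;      // ~15-char term plus a comma
            long terms = filterChars / charsPerTerm;      // 125,000 terms
            long overhead  = terms * 56L;                 // ~7 MB of per-String overhead
            long splitData = (filterChars - terms) * 2L;  // ~4 MB of char data, commas dropped
            long original  = filterChars * 2L;            // ~4 MB for the raw parameter string
            long total = overhead + splitData + original; // ~15 MB
            System.out.printf("%d terms, roughly %d bytes per shard per query%n",
                    terms, total);
        }
    }

That ~15 MB is allocated and thrown away on every shard for every query carrying the filter, which on its own is a plausible source of the kind of GC pressure that ends in an OOM.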