Hi,

Yeah, a large heap can be problematic like that. :)

But if there is some sort of leak - and if I had to bet, I'd put my
money on your custom QP, knowing what I know about this situation - you
could also start Solr with a much smaller heap and grab a heap snapshot
as soon as you see some number of those objects appearing towards the
top of jmap's output. That should be enough to trace them to their
roots.
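Something along these lines should do it (commands from memory, so
double-check the flags; <solr-pid> is whatever jps reports for your
Solr process):

  # live-object histogram - note that :live forces a full GC first
  jmap -histo:live <solr-pid> | head -30

  # once the BooleanQuery counts start climbing, dump the live heap
  # for jhat/YourKit
  jmap -dump:live,format=b,file=/tmp/solr-heap.bin <solr-pid>

With a small heap the dump stays small enough for jhat to load, and
following the reference chains back to the GC roots should show who is
holding on to those queries.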
Otis
--
Solr Performance Monitoring - http://sematext.com/spm/index.html


On Tue, Nov 13, 2012 at 5:18 PM, Prasanna R <plistma...@gmail.com> wrote:

> We do have a custom query parser that is responsible for expanding the
> user input query into a bunch of prefix, phrase and regular boolean
> queries, in a manner similar to what DisMax does.
>
> Analyzing the heap with jhat/YourKit is on my list of things to do,
> but I haven't gotten around to it yet. Our big heap size (13G) makes
> it a little difficult to do a full-blown heap dump analysis.
>
> Thanks a ton for the reply, Otis!
>
> Prasanna
>
> On Mon, Nov 12, 2012 at 5:42 PM, Otis Gospodnetic <
> otis.gospodne...@gmail.com> wrote:
>
> > Hi,
> >
> > I've never seen this. You don't have a custom query parser or
> > anything else custom, do you?
> > Have you tried dumping and analyzing the heap? YourKit has a 7-day
> > eval, or you can use tools like jhat, which may already be included
> > on your machine (see
> > http://docs.oracle.com/javase/6/docs/technotes/tools/share/jhat.html).
> >
> > Otis
> > --
> > Performance Monitoring - http://sematext.com/spm/index.html
> >
> >
> > On Mon, Nov 12, 2012 at 8:35 PM, Prasanna R <plistma...@gmail.com>
> > wrote:
> >
> > > We have been using Solr in a custom setup where we generate
> > > results for user queries by expanding each query into a large
> > > boolean query consisting of multiple prefix queries. There have
> > > been some GC issues recently, with the old/tenured generation
> > > becoming nearly 100% full, leading to near-constant full GC
> > > cycles.
> > >
> > > We are running Solr 3.1 on servers with 13G of heap. The jmap
> > > live-object histogram is as follows:
> > >
> > >  num   #instances      #bytes  class name
> > > ----------------------------------------------
> > >    1:   27441222  1550723760  [Ljava.lang.Object;
> > >    2:   23546318   879258496  [C
> > >    3:   23813405   762028960  java.lang.String
> > >    4:   22700095   726403040  org.apache.lucene.search.BooleanQuery
> > >    5:   27431515   658356360  java.util.ArrayList
> > >    6:   22911883   549885192  org.apache.lucene.search.BooleanClause
> > >    7:   21651039   519624936  org.apache.lucene.index.Term
> > >    8:    6876651   495118872  org.apache.lucene.index.FieldsReader$LazyField
> > >    9:   11354214   363334848  org.apache.lucene.search.PrefixQuery
> > >   10:    4281624   137011968  java.util.HashMap$Entry
> > >   11:    3466680    83200320  org.apache.lucene.search.TermQuery
> > >   12:    1987450    79498000  org.apache.lucene.search.PhraseQuery
> > >   13:     631994    70148624  [Ljava.util.HashMap$Entry;
> > > .....
> > >
> > > I have looked at the Solr cache settings multiple times but am not
> > > able to figure out how/why such a high number of BooleanQuery and
> > > BooleanClause instances stays alive. These objects are live and do
> > > not get collected even when traffic is disabled and a manual GC is
> > > triggered, which indicates that something is holding onto
> > > references to them.
> > >
> > > Can anyone provide more details on the circumstances under which
> > > these objects stay alive and/or cached? If they are cached, is the
> > > caching configurable?
> > >
> > > Any and all tips/suggestions/pointers will be much appreciated.
> > >
> > > Thanks,
> > >
> > > Prasanna
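P.S. For anyone following along: by "custom QP expanding the query" I'm
picturing something roughly like the sketch below - purely hypothetical
(class and method names are made up), not Prasanna's actual parser -
but it shows how a single user query balloons into a lot of
BooleanQuery/BooleanClause/PrefixQuery objects on the Lucene 3.x API.
The objects themselves are cheap; they only pile up in the old gen if
something keeps references to them after the request is done.

  import org.apache.lucene.index.Term;
  import org.apache.lucene.search.BooleanClause.Occur;
  import org.apache.lucene.search.BooleanQuery;
  import org.apache.lucene.search.PhraseQuery;
  import org.apache.lucene.search.PrefixQuery;
  import org.apache.lucene.search.Query;

  // Hypothetical DisMax-style expansion: every user term becomes one
  // PrefixQuery per field, plus a phrase query over the whole input.
  public class ExpandingParserSketch {
    public static Query expand(String[] fields, String[] terms) {
      BooleanQuery top = new BooleanQuery();
      for (String t : terms) {
        // One SHOULD prefix clause per field, wrapped in a MUST group
        BooleanQuery perTerm = new BooleanQuery();
        for (String f : fields) {
          perTerm.add(new PrefixQuery(new Term(f, t)), Occur.SHOULD);
        }
        top.add(perTerm, Occur.MUST);
      }
      // Phrase match over the first field as a boost clause
      PhraseQuery phrase = new PhraseQuery();
      for (String t : terms) {
        phrase.add(new Term(fields[0], t));
      }
      top.add(phrase, Occur.SHOULD);
      // Each call allocates O(fields x terms) query objects. They are
      // only a leak if something holds on to them, e.g. a never-evicting
      // static Map<String, Query> keyed on the raw user input - which
      // would produce exactly the histogram quoted above.
      return top;
    }
  }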