okay. huge rows value is no.1 way to kill Lucene. It's not possible, absolutely. You need to rethink logic of your component. Check Solr's FieldCollapsing code, IIRC it makes second search to achieve similar goal. Also check PostFilter and DelegatingCollector classes, their approach can also be handy for your task.
On Fri, Oct 5, 2012 at 2:38 PM, Aaron Daubman <daub...@gmail.com> wrote: > On Fri, Oct 5, 2012 at 4:33 AM, Mikhail Khludnev > <mkhlud...@griddynamics.com> wrote: > > what's the value of rows param > > http://wiki.apache.org/solr/CommonQueryParameters#rows ? > > Very interesting question - so, for historic reasons lost to me, we > pass in a huge (10000000?) number for rows and this hits our custom > component, which has its own internal maximum for real rows returned. > (This is a custom grouping component, so I am guessing the large > number of rows had to do with trying not to limit what got grouped?). > > Is the value of rows what is used for that heap allocation? > absolutely. it's classic priority queue algorithm upon binary heap. > > Thanks, > Aaron > > > > > On Fri, Oct 5, 2012 at 6:56 AM, Aaron Daubman <daub...@gmail.com> wrote: > > > >> Greetings, > >> > >> I've been seeing this call chain come up fairly frequently when > >> debugging longer-QTime queries under Solr 3.6.1 but have not been able > >> to understand from the code what is really going on - the call graph > >> and code follow below. > >> > >> Would somebody please explain to me: > >> 1) Why this would show up frequently as a hotspot > >> 2) If it is expected to do so > >> 3) If there is anything I should look in to that may help performance > >> where this frequently shows up as the long pole in the QTime tent > >> 4) What the code is doing and why heap is being allocated as an > >> apparently giant object (which also is apparently not unheard of due > >> to MAX_VALUE wrapping check) > >> > >> ---call-graph--- > >> Filter - SolrDispatchFilter:doFilter (method time = 12 ms, total time = > >> 487 ms) > >> Filter - SolrDispatchFilter:execute:365 (method time = 0 ms, total > >> time = 109 ms) > >> org.apache.solr.core.SolrCore:execute:1376 (method time = 0 ms, > >> total time = 109 ms) > >> org.apache.solr.handler.RequestHandlerBase:handleRequest:129 > >> (method time = 0 ms, total time = 109 ms) > >> > org.apache.solr.handler.component.SearchHandler:handleRequestBody:186 > >> (method time = 0 ms, total time = 109 ms) > >> com.echonest.solr.component.EchoArtistGroupingComponent:process:188 > >> (method time = 0 ms, total time = 109 ms) > >> org.apache.solr.search.SolrIndexSearcher:search:375 (method time > >> = 0 ms, total time = 96 ms) > >> org.apache.solr.search.SolrIndexSearcher:getDocListC:1176 > >> (method time = 0 ms, total time = 96 ms) > >> org.apache.solr.search.SolrIndexSearcher:getDocListNC:1209 > >> (method time = 0 ms, total time = 96 ms) > >> org.apache.solr.search.SolrIndexSearcher:getProcessedFilter:796 > >> (method time = 0 ms, total time = 26 ms) > >> org.apache.solr.search.BitDocSet:andNot:185 (method time = 0 > >> ms, total time = 13 ms) > >> org.apache.lucene.util.OpenBitSet:clone:732 (method time = > >> 13 ms, total time = 13 ms) > >> org.apache.solr.search.BitDocSet:intersection:31 (method > >> time = 0 ms, total time = 13 ms) > >> org.apache.solr.search.DocSetBase:intersection:90 (method > >> time = 0 ms, total time = 13 ms) > >> org.apache.lucene.util.OpenBitSet:and:808 (method time = > >> 13 ms, total time = 13 ms) > >> org.apache.lucene.search.TopFieldCollector:create:916 (method > >> time = 0 ms, total time = 46 ms) > >> org.apache.lucene.search.FieldValueHitQueue:create:175 > >> (method time = 0 ms, total time = 46 ms) > >> > >> > > org.apache.lucene.search.FieldValueHitQueue$MultiComparatorsFieldValueHitQueue:<init>:111 > >> (method time = 0 ms, total time = 46 ms) > >> org.apache.lucene.search.SortField:getComparator:409 > >> (method time = 0 ms, total time = 13 ms) > >> > >> org.apache.lucene.search.FieldComparator$FloatComparator:<init>:400 > >> (method time = 13 ms, total time = 13 ms) > >> org.apache.lucene.util.PriorityQueue:initialize:108 > >> (method time = 33 ms, total time = 33 ms) > >> ---snip--- > >> > >> > >> org.apache.lucene.util.PriorityQueue:initialize - hotspot is line 108: > >> heap = (T[]) new Object[heapSize]; // T is unbounded type, so this > >> unchecked cast works always > >> > >> ---PriorityQueue.java--- > >> /** Subclass constructors must call this. */ > >> @SuppressWarnings("unchecked") > >> protected final void initialize(int maxSize) { > >> size = 0; > >> int heapSize; > >> if (0 == maxSize) > >> // We allocate 1 extra to avoid if statement in top() > >> heapSize = 2; > >> else { > >> if (maxSize == Integer.MAX_VALUE) { > >> // Don't wrap heapSize to -1, in this case, which > >> // causes a confusing NegativeArraySizeException. > >> // Note that very likely this will simply then hit > >> // an OOME, but at least that's more indicative to > >> // caller that this values is too big. We don't +1 > >> // in this case, but it's very unlikely in practice > >> // one will actually insert this many objects into > >> // the PQ: > >> heapSize = Integer.MAX_VALUE; > >> } else { > >> // NOTE: we add +1 because all access to heap is > >> // 1-based not 0-based. heap[0] is unused. > >> heapSize = maxSize + 1; > >> } > >> } > >> heap = (T[]) new Object[heapSize]; // T is unbounded type, so this > >> unchecked cast works always > >> this.maxSize = maxSize; > >> > >> // If sentinel objects are supported, populate the queue with them > >> T sentinel = getSentinelObject(); > >> if (sentinel != null) { > >> heap[1] = sentinel; > >> for (int i = 2; i < heap.length; i++) { > >> heap[i] = getSentinelObject(); > >> } > >> size = maxSize; > >> } > >> } > >> ---snip--- > >> > >> > >> Thanks, as always! > >> Aaron > >> > > > > > > > > -- > > Sincerely yours > > Mikhail Khludnev > > Tech Lead > > Grid Dynamics > > > > <http://www.griddynamics.com> > > <mkhlud...@griddynamics.com> > -- Sincerely yours Mikhail Khludnev Tech Lead Grid Dynamics <http://www.griddynamics.com> <mkhlud...@griddynamics.com>