Mohsin Beg Beg [mohsin....@oracle.com] wrote:
> I am getting OOM when faceting on numFound=28. The receiving
> solr node throws the OutOfMemoryError even though there is 7gb
> available heap before the faceting request was submitted.
fc and fcs faceting memory overhead is (nearly) independent of the number of hits in the search result.

> If a different solr node is selected that one fails too. Any suggestions ?
> &facet.field=field1....field15
> &f.field1...field15.facet.method=fc/fcs
> &collection=Collection1...Collection100

You seem to be issuing a facet request for 15 fields in 100 collections concurrently. The memory overhead will be linear in the number of documents, the number of references from documents to field values and the number of unique values in your facets, for each facet independently.

That was confusing. Let me try an example instead:

For each field, the static memory requirement will be a structure that maps from documents to term ordinals. Depending on circumstances, this can be small (DocValues and a numeric field) or big (multi-value, non-DocValue String).

Each concurrent call will temporarily allocate a structure for counting. If the field is numeric, this will be a hashmap. If it is String, it will be an integer-array with as many entries as there are unique values: If there are 1M unique String values in the field, the overhead will be 4 bytes * 1M = 4MB.

So, if each field has 250K unique String values, the temporary overhead will be 1MB per field, or 15MB for all 15 fields. I don't know if the request for multiple collections is threaded, but if so, the 15MB should be multiplied by 100, totalling 1.5GB memory overhead for each call. Add the static structures and it does not seem unreasonable that you run out of memory.

All this is very loose, but the overall message is that documents, unique facet values, facets and collections all multiply memory requirements.

* Do you need to query all collections at once?
* Can you collapse some of the facet fields, to reduce the total number?
* Are some of the fields very small? If so, use enum for them instead of fc/fcs.
* Maybe you can determine your limits by issuing requests first for 1 field, then 2 etc.
This is to see if it is feasible to make a minor tweak to get it to work, or if your setup is so large that something else entirely needs to be done.

- Toke Eskildsen
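
As a rough illustration of the estimate above, here is a minimal Python sketch of the counter arithmetic. The function and variable names are made up for the example. It only covers the temporary integer-array counters for String fields (not the static document-to-ordinal structures or the hashmaps used for numeric fields), and it assumes the worst case where all collections are processed concurrently on the same node.

```python
# Back-of-envelope estimate of the temporary counter overhead for fc/fcs
# faceting on String fields. Hypothetical helper, not Solr code.

BYTES_PER_COUNTER = 4  # one int per unique value in the field
MB = 1_000_000


def counter_bytes(unique_values_per_field, fields, concurrent_collections=1):
    """Temporary counting overhead in bytes for one faceting call."""
    per_field = unique_values_per_field * BYTES_PER_COUNTER
    return per_field * fields * concurrent_collections


# 1M unique values in a single field
one_field = counter_bytes(1_000_000, fields=1)

# 15 fields with 250K unique values each
all_fields = counter_bytes(250_000, fields=15)

# Same request, assuming 100 collections are handled concurrently
all_collections = counter_bytes(250_000, fields=15, concurrent_collections=100)

print(one_field / MB)        # 4.0
print(all_fields / MB)       # 15.0
print(all_collections / MB)  # 1500.0
```

Plugging in the real unique-value counts for each field, instead of a flat 250K, should show which fields dominate and are the best candidates for collapsing or for switching to enum.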