Alice.H.Yang (mis.cnsh04.Newegg) 41493 [[email protected]] wrote:
> 1. I'm sorry, I have made a mistake, the total number of documents is 32
> Million, not 320 Million.
> 2. The system memory is large for solr index, OS total has 256G, I set the
> solr tomcat HEAPSIZE="-Xms25G -Xmx100G"
100G is a very high number. What special requirements dictate such a large
heap size?
> Reply: 9 fields I facet on.
Solr processes each facet field separately, so with facet.method=fc and 10M
hits it iterates 9*10M = 90M document IDs and updates a counter for each of
them.
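To make the arithmetic concrete, here is a rough model of what fc-style
counting does per field. It is a simplification for illustration, not Solr's
actual code: the class and method names are made up, and it assumes
single-valued fields whose values are already mapped to ordinals.
```java
// Simplified model of facet.method=fc counting; NOT Solr's real implementation.
final class FcCountingSketch {

    /** Counts one facet field: exactly one counter update per hit. */
    static int[] countField(int[] ordinalByDoc, int[] hits, int uniqueValues) {
        int[] counts = new int[uniqueValues];
        for (int docId : hits) {
            counts[ordinalByDoc[docId]]++; // look up the doc's value, bump its counter
        }
        return counts;
    }

    /** Total counter updates when every facet field is counted separately. */
    static long counterUpdates(int facetFields, long hits) {
        return facetFields * hits;
    }

    public static void main(String[] args) {
        // 9 facet fields, 10M hits
        System.out.println(counterUpdates(9, 10_000_000L)); // prints 90000000
    }
}
```
The loop cost depends on the number of hits, not on the number of unique
values, which is why the work scales with the number of facet fields.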
> Reply: 3 facet fields have one hundred unique values, other 6 facet fields'
> unique values are between 3 to 15.
So very low cardinality. This is confirmed by your low response time of 6ms for
2925 hits.
> And we test this scenario: If the number of facet fields' unique values is
> less we add facet.method=enum, there is a little to improve performance.
That is a shame: enum is normally the simple answer to a setup like yours. Have
you tried fine-tuning your fc/enum selection, so that the 3 fields with around
a hundred values use fc and the rest use enum? That might halve your response
time.
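Solr accepts per-field overrides of the form f.<fieldname>.facet.method, so
the mix can be set in a single request. A small sketch that assembles such
parameters follows. The field names are placeholders, since I do not know
yours, and it assumes names that need no URL escaping.
```java
import java.util.ArrayList;
import java.util.List;

// Builds the facet part of a Solr query string with a per-field facet.method.
// Field names passed in are hypothetical; substitute your own.
final class FacetMethodParams {

    static String build(List<String> fcFields, List<String> enumFields) {
        List<String> params = new ArrayList<>();
        params.add("facet=true");
        for (String field : fcFields) {
            params.add("facet.field=" + field);
            params.add("f." + field + ".facet.method=fc");
        }
        for (String field : enumFields) {
            params.add("facet.field=" + field);
            params.add("f." + field + ".facet.method=enum");
        }
        return String.join("&", params);
    }
}
```
Calling build with the three larger fields in the first list and the six
small ones in the second gives a parameter string you can append to your
existing query.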
Since the number of unique facet values is so low, I do not think that
DocValues can help you here. Besides the fine-grained fc/enum selection above,
you could try collapsing all 9 facet fields into a single field. The idea is
that for facet.method=fc, faceting on a field with (for example) 300 unique
values takes practically the same amount of time as faceting on a field with
1000 unique values, so faceting on a single slightly larger field is much
faster than faceting on 9 smaller fields. After faceting with facet.limit=-1 on
the single super-facet field, you must match the returned values back to their
original fields:
If you have the facet fields
  field0: 34
  field1: 187
  field2: 78432
  field3: 3
  ...
then collapse them by OR-ing each value with a field-specific mask whose bits
lie above the largest value in any field, and put the results into a single
field:
  fieldAll: 0xA0000000 | 34
  fieldAll: 0xA1000000 | 187
  fieldAll: 0xA2000000 | 78432
  fieldAll: 0xA3000000 | 3
  ...
Perform the facet request on fieldAll with facet.limit=-1 and split the
resulting counts with
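  for (entry: facetResultAll) {
    // strip the field mask to recover the original value
    value = 0x00FFFFFF & entry.value;
    // the top byte identifies the originating field
    switch (0xFF000000 & entry.value) {
      case 0xA0000000:
        field0.add(value, entry.count);
        break;
      case 0xA1000000:
        field1.add(value, entry.count);
        break;
      ...
    }
  }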
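For reference, a self-contained Java sketch of both halves (packing at index
time, splitting after the facet call) could look like the following. The
class and method names are invented for the example, and it assumes the facet
values have already been parsed into ints and that every original value fits
in the lower 24 bits.
```java
import java.util.Map;
import java.util.TreeMap;

// Sketch of the super-facet idea: the top byte tags the field, the low 24 bits hold the value.
final class SuperFacetSketch {
    static final int PREFIX_MASK = 0xFF000000;
    static final int VALUE_MASK = 0x00FFFFFF;

    /** Index time: combine a field number (0..15) and a value into one int. */
    static int pack(int fieldIndex, int value) {
        if (fieldIndex < 0 || fieldIndex > 0x0F) {
            throw new IllegalArgumentException("fieldIndex out of range: " + fieldIndex);
        }
        if ((value & PREFIX_MASK) != 0) {
            throw new IllegalArgumentException("value does not fit in 24 bits: " + value);
        }
        return ((0xA0 + fieldIndex) << 24) | value;
    }

    /** Recovers the field number from a packed value. */
    static int fieldIndex(int packed) {
        return ((packed & PREFIX_MASK) >>> 24) - 0xA0;
    }

    /** Recovers the original value from a packed value. */
    static int value(int packed) {
        return packed & VALUE_MASK;
    }

    /** Query time: split counts for fieldAll into per-field maps of value to count. */
    static Map<Integer, Map<Integer, Long>> split(Map<Integer, Long> combinedCounts) {
        Map<Integer, Map<Integer, Long>> perField = new TreeMap<>();
        for (Map.Entry<Integer, Long> e : combinedCounts.entrySet()) {
            perField.computeIfAbsent(fieldIndex(e.getKey()), k -> new TreeMap<>())
                    .put(value(e.getKey()), e.getValue());
        }
        return perField;
    }
}
```
With this layout, pack(2, 78432) gives the same int as 0xA2000000 | 78432, and
split() hands back one map per original field keyed by field number.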
Regards,
Toke Eskildsen, State and University Library, Denmark