Hi everyone, I am using Solr 4.10 to index 20 million documents without sharding. Each document has a groupId field, and there are about 2 million groups. I have found that searching with collapsing on groupId is significantly slower than searching without collapsing, especially when combined with facet queries.
I am wondering what the general approaches are to speed up field collapsing by 2~4 times. Would sharding the index help? Is it possible to optimize collapsing without sharding?

The filter parameter for collapsing looks like this:

  q=*:*&fq={!collapse field=groupId max=sum(...a long formula...)}

I also put this fq into the warmup queries in solrconfig.xml to warm the caches. Still, when q changes and more fq clauses are added, the collapsing search takes about 3~5 seconds, while the same search without collapsing finishes within 2 seconds.

I am considering manually optimizing CollapsingQParserPlugin through parallelization or extra caching. For example, is it possible to parallelize the collapsing collector across the different Lucene index segments?

Thanks!

-- jichi
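P.S. To make the per-segment idea concrete, here is a toy Java sketch of what I have in mind. This is not Solr/Lucene code and the names (ParallelCollapse, Doc, collapseSegment) are my own: each "segment" is just a list of (docId, groupId, score) tuples, segments are collapsed in parallel, and the per-segment winners are then merged into one winner per group.

```java
import java.util.*;
import java.util.stream.*;

// Toy model of per-segment collapsing (names are hypothetical, not from
// CollapsingQParserPlugin). Collapse keeps, for each group, the doc with
// the max score; segments are processed in parallel and then merged.
public class ParallelCollapse {
    record Doc(int id, int group, double score) {}

    // Collapse a single segment: best doc per group within that segment.
    static Map<Integer, Doc> collapseSegment(List<Doc> segment) {
        Map<Integer, Doc> best = new HashMap<>();
        for (Doc d : segment) {
            best.merge(d.group(), d, (a, b) -> a.score() >= b.score() ? a : b);
        }
        return best;
    }

    // Collapse all segments in parallel, then merge the per-segment winners.
    static Map<Integer, Doc> collapse(List<List<Doc>> segments) {
        return segments.parallelStream()
                .map(ParallelCollapse::collapseSegment)
                .flatMap(m -> m.values().stream())
                .collect(Collectors.toMap(Doc::group, d -> d,
                        (a, b) -> a.score() >= b.score() ? a : b));
    }

    public static void main(String[] args) {
        List<List<Doc>> segments = List.of(
                List.of(new Doc(1, 10, 1.0), new Doc(2, 10, 3.0)),
                List.of(new Doc(3, 10, 2.0), new Doc(4, 20, 5.0)));
        Map<Integer, Doc> result = collapse(segments);
        // Group 10 collapses to doc 2 (score 3.0), group 20 to doc 4.
        System.out.println(result.get(10).id() + " " + result.get(20).id());
    }
}
```

Since the per-segment maps are independent, this is embarrassingly parallel up to the final merge; my question is whether the real collector can be restructured the same way, given that it currently sees docs in a single pass.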