Re: Faceting over limited result set

Mike Klaas Tue, 13 Nov 2007 11:44:45 -0800

On 12-Nov-07, at 8:03 AM, Chris Hostetter wrote:

if what you are interested in is stats on the first N docs according to a
specific sort (score or otherwise) then you could write a custom request
handler that executed a search with a limit of N, got the DocList,
iterated over it to build a DocSet, and then used that DocSet to do
faceting ... but that would probably take even longer than just using the
full DocSet matching the entire query.


An implementation might look like:

        DocList superlist;
        int facetDocLimit = params.getInt(DMP.FACET_DOCLIMIT, -1);
        if(facetDocLimit > 0 && facetDocLimit != req.getLimit()) {
          superlist = s.getDocList(query, restrictions,
                                   SolrPluginUtils.getSort(req),
                                   req.getStart(), facetDocLimit,
                                   flags);

          results.docSet = SearcherUtils.getDocSetFromDocList(superlist, s);

          results.docList = superlist.subset(0, req.getLimit());
        } else {

Where getDocSetFromDocList() uses DocSetHitCollector to build a DocSet.
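
For anyone who wants to see the idea without a Solr checkout, here is a rough standalone sketch of the same two steps: turn the first N ranked doc ids into a set, then count facet values over that set only. It does not use any Solr classes. A java.util.BitSet stands in for the DocSet, a plain int[] stands in for the DocList, and the names LimitedFacetSketch, topDocsAsSet and facetCounts are made up for illustration.

```java
import java.util.BitSet;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only: plain-Java stand-ins, not the Solr API.
class LimitedFacetSketch {

    // Collect the first n doc ids of a ranked list into a bit set
    // (the stand-in for building a DocSet from a DocList).
    static BitSet topDocsAsSet(int[] rankedDocIds, int n) {
        BitSet set = new BitSet();
        int limit = Math.min(n, rankedDocIds.length);
        for (int i = 0; i < limit; i++) {
            set.set(rankedDocIds[i]);
        }
        return set;
    }

    // Count facet values over the docs in the set only.
    // fieldValues[docId] holds the single facet value of that doc.
    static Map<String, Integer> facetCounts(BitSet docs, String[] fieldValues) {
        Map<String, Integer> counts = new TreeMap<>();
        for (int doc = docs.nextSetBit(0); doc >= 0; doc = docs.nextSetBit(doc + 1)) {
            counts.merge(fieldValues[doc], 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        String[] category = {"book", "dvd", "book", "cd", "dvd", "book"};
        int[] ranked = {5, 2, 4, 0, 1, 3};   // doc ids in sort order

        // Facet over the top 3 docs only, then over the whole result.
        System.out.println(facetCounts(topDocsAsSet(ranked, 3), category));
        System.out.println(facetCounts(topDocsAsSet(ranked, ranked.length), category));
    }
}
```

The cost of the counting step depends on N rather than on the number of docs matching the query, which is where any gain would come from.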

To answer the performance question: There is a gain to be had when
doing lots of faceting on huge indices, if N is low (say, 500-1000).
One problem with the implementation above is that it stymies the
query caching in SolrIndexSearcher (since the generated DocList is >
the cache upper bound).


-Mike
