Re: Traverse over response docs in SearchComponent impl.

Chris Hostetter Tue, 13 Dec 2016 14:28:13 -0800

FWIW: Perhaps an XY problem?  Can you explain in more depth what it is you 
plan on doing in this search component?


: I can see that Solr calls the component's process() method, but from 
: within that method, rb.getResponseDocs(); is always null. No matter what 
: i try, i do not seem to be able to get a hold of that list of response 
: docs.

IIRC getResponseDocs() is only non-null when aggregating distributed/cloud 
results from multiple shards (where we already have a fully 
populated SolrDocumentList due to aggregating the remote responses), but in 
a single-node Solr request only a "DocList" is used, and the stored field 
values are read lazily from the IndexReader by the ResponseWriter.

So if you're not writing a distributed component, check 
ResponseBuilder.getResults() ?
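
To make the DocList point concrete, here is a minimal sketch of "ids 
first, stored fields on demand".  Everything in it is a stand-in: IdList, 
materialize() and the loader are hypothetical names, not Solr classes, and 
the real lookup calls depend on your Solr version, so check the javadocs.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.function.IntFunction;

// Stand-in types only: nothing here is a Solr class.
final class LazyDocListSketch {

    /** Stand-in for a single-node result: internal doc ids, no stored fields. */
    static final class IdList {
        final int[] docIds;
        IdList(int... docIds) { this.docIds = docIds; }
    }

    /** Loads stored fields for each id on demand, via the supplied loader. */
    static List<Map<String, Object>> materialize(
            IdList ids, IntFunction<Map<String, Object>> loader) {
        List<Map<String, Object>> docs = new ArrayList<>();
        for (int id : ids.docIds) {
            docs.add(loader.apply(id));
        }
        return docs;
    }

    public static void main(String[] args) {
        IdList hits = new IdList(7, 42);
        // Hypothetical loader standing in for a stored-field lookup.
        IntFunction<Map<String, Object>> loader =
                id -> Collections.singletonMap("id", "doc-" + id);
        for (Map<String, Object> doc : materialize(hits, loader)) {
            System.out.println(doc);
        }
    }
}
```

The point is only that on a single node nothing resembling a populated 
document list exists until somebody asks for the fields.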

Even if you are writing a component for a distributed Solr setup, what 
method you call (and where you call it) depends a lot on when/where you 
expect your code to run...

IIRC: 
* prepare() runs on every node for every request (original aggregation 
request and every sub-request to each shard).  
* distributedProcess runs on the aggregation node, and is called 
repeatedly for each "stage" requested by any components (so at a minimum once, 
usually twice to fetch stored fields, maybe more if there are multiple 
facet refinement phases, etc...).  
* modifyRequest() & handleResponses() are called on the aggregation node 
prior/after every sub-request to every shard.
* process() is called on each shard for each sub request. 
* finishStage is called on the aggregation node at the end of each stage 
(after all the responses from all shards for that sub-request)
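
The list above can be turned into a toy trace of who calls what, and 
where.  This is a sketch under assumptions, not Solr code: it treats each 
stage as a single sub-request fanned out to every shard, and the stage 
labels are just strings passed in by the caller.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of the call order described above. Stand-in code, not Solr classes.
final class LifecycleSketch {

    /** Returns the ordered callback log for one distributed request. */
    static List<String> trace(int shards, String... stages) {
        List<String> log = new ArrayList<>();
        // The original request is prepared on the aggregation node.
        log.add("aggregator: prepare");
        for (String stage : stages) {
            log.add("aggregator: distributedProcess [" + stage + "]");
            // Assumption: one sub-request per stage, sent to every shard.
            log.add("aggregator: modifyRequest [" + stage + "]");
            for (int s = 1; s <= shards; s++) {
                // Each shard treats the sub-request like an ordinary request.
                log.add("shard" + s + ": prepare");
                log.add("shard" + s + ": process");
            }
            log.add("aggregator: handleResponses [" + stage + "]");
            log.add("aggregator: finishStage [" + stage + "]");
        }
        return log;
    }

    public static void main(String[] args) {
        for (String line : trace(2, "EXECUTE_QUERY", "GET_FIELDS")) {
            System.out.println(line);
        }
    }
}
```

Note that in this model process() never shows up on the aggregator line, 
which matches the description above for a distributed request.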


...so something like HighlightComponent does its main work in the 
process() method, because it only needs the data for each doc; other 
(aggregated) docs don't affect the results -- then later 
finishStage combines the results.

If you on the other hand want to look at all of the *final* documents being 
returned to the user, not on a per-shard basis but on an aggregate basis, 
you'd want to put that logic in something like finishStage and check for 
the stage that does a GET_FIELDS -- but if you want your component to 
*also* work in non-cloud mode, you'd need the same logic in your process() 
method (looking at the DocList instead of the SolrDocumentList, with a 
conditional to check for distrib=false so you don't waste a bunch of work 
on per-shard queries when it is in fact being used in cloud-mode)
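
The decision described above boils down to a small truth table.  The 
sketch below is hypothetical throughout: Hook and shouldInspect() are 
made-up names, and how you detect a per-shard sub-request (e.g. the 
distrib=false conditional mentioned above) is left as a boolean input 
rather than guessed at.

```java
// Stand-in decision logic, not Solr code: should the "look at the final
// docs" work run in this callback?
final class FinalDocsDispatchSketch {

    enum Hook { PROCESS, FINISH_STAGE }

    /**
     * @param hook              which callback is currently executing
     * @param isShardSubRequest true when serving a per-shard sub-request
     * @param stage             current stage label (only used for FINISH_STAGE)
     */
    static boolean shouldInspect(Hook hook, boolean isShardSubRequest, String stage) {
        if (hook == Hook.PROCESS) {
            // Non-cloud path: process() sees the final docs, so do the work
            // here, but skip it when we are only one shard of a larger request.
            return !isShardSubRequest;
        }
        // Aggregate path: wait for the stage that fetches the stored fields.
        return "GET_FIELDS".equals(stage);
    }

    public static void main(String[] args) {
        System.out.println(shouldInspect(Hook.PROCESS, false, null));
        System.out.println(shouldInspect(Hook.PROCESS, true, null));
        System.out.println(shouldInspect(Hook.FINISH_STAGE, false, "GET_FIELDS"));
    }
}
```

Keeping the check in one place makes it harder for the process() and 
finishStage code paths to drift apart.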


None of this is very straightforward, but you are admittedly getting into 
very advanced expert territory here.



-Hoss
http://www.lucidworks.com/
