I'm fairly sure I did a custom (Hit)Collector in lucene-java, but all I can
find at the moment are my retro implementations (w/o collectors). I won't
bore (or scare?) you with the details, but I follow some of what you're
suggesting. 

I have been able to get straight SpanQueries to work in my custom
QueryComponent. I think I followed the same path you describe, but correct
me if I misunderstand. I've more-or-less replaced the
QueryComponent.process() code:
        SolrIndexSearcher.QueryResult result = new SolrIndexSearcher.QueryResult();
        searcher.search(result, cmd);
        rb.setResult(result);

with (in my overridden process() method):
        // the subset of fields I am interested in
        String[] selectFields = {"id", "fileName"};
        // custom SpanQuery, and many/all hits
        TopDocs results = searcher.search(cmd.getQuery(), 100000);
        /* save hit info (doc & score) */
        /* maybe process SpanQuery.getSpans() here, but perhaps try a "doc
           oriented results" processing approach(?) for tokenization
           caching/optimization? */

The code above _seems_ to work, but I am still in the initial stages at the
moment. Once I have a better understanding of the challenges you mention,
I will share the thoughts and insights I've gained along the way :-).
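
For the "save hit info (doc & score)" step, here is a minimal sketch of the
kind of holder I have in mind. HitStore and Hit are hypothetical names, not
part of Lucene or Solr; the assumption is that process() would fill it from
the doc and score fields of each entry in results.scoreDocs, and that the
ordered doc ids could later be used to build a DocList for the response.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

/** Hypothetical holder for per-hit info; not part of Lucene or Solr. */
final class HitStore {

    /** One hit: a Lucene doc id plus its score. */
    static final class Hit {
        final int doc;
        final float score;

        Hit(int doc, float score) {
            this.doc = doc;
            this.score = score;
        }
    }

    private final List<Hit> hits = new ArrayList<Hit>();

    /** Assumption: called once per ScoreDoc, with its doc and score values. */
    void add(int doc, float score) {
        hits.add(new Hit(doc, score));
    }

    int size() {
        return hits.size();
    }

    /** Doc ids by descending score; ties are broken by ascending doc id. */
    int[] docsByScore() {
        List<Hit> sorted = new ArrayList<Hit>(hits);
        Collections.sort(sorted, new Comparator<Hit>() {
            public int compare(Hit a, Hit b) {
                int byScore = Float.compare(b.score, a.score);
                return byScore != 0 ? byScore : Integer.compare(a.doc, b.doc);
            }
        });
        int[] docs = new int[sorted.size()];
        for (int i = 0; i < docs.length; i++) {
            docs[i] = sorted.get(i).doc;
        }
        return docs;
    }
}
```

Keeping this separate from the search call means the span processing can
later walk the stored hits document by document, whichever approach I end up
taking for the getSpans() part.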

Thanks for your time and help,

Sean




hossman wrote:
> 
> 
> : (e.g. defType=fooSpanQuery), along with token positions. I have this working
> : in straight lucene, so my challenge is to implement it half-intelligently in
> : solr. At the moment, I can't figure out where and how to customize the
> : 'inner' search process.
> 
> the first step is to really make sense of how you do this with 
> lucene-java, so we can find the best corresponding points in Solr.
> 
> I suspect you are using a custom (Hit)Collector, which is an area in Solr 
> that isn't easily customizable. Some other issues have brought up the need 
> to allow custom code to provide a Collector that Solr would use in 
> addition to its own Collectors (for building up DocList and DocSet 
> structures) but no clear picture has surfaced as to what that API should 
> really look like to be useful to plugin writers, and still be performant 
> in the common case.
> 
> The most straightforward way to add custom logic like this would be to 
> use SolrIndexSearcher just like a regular IndexSearcher, passing your 
> Collector to the same old methods you would outside of Solr -- this 
> bypasses Solr's internal DocList & DocSet caching, but in many cases this 
> may be exactly what you want -- it's the main problem that makes a 
> generalized pluggable Collector implementation hard to implement: if the 
> Collector has side effects, it's not clear that cached results are useful 
> unless they can reproduce the same side effects on cache read.
> 
> For response purposes (if you care) you can always then take the results 
> of your own data structure, and use it to generate a DocList or a DocSet.
> 
> 
> -Hoss
> 
> 
> 

