I'm fairly sure I did a custom (Hit)Collector in lucene-java, but all I can
find at the moment are my retro implementations (w/o collectors). I won't
bore (or scare?) you with the details, but I follow some of what you're
suggesting.
I have been able to get straight SpanQueries to work in my custom
QueryComponent. I think I followed the same path you describe, but correct
me if I misunderstand. I've more-or-less replaced the
QueryComponent.process() code:
    SolrIndexSearcher.QueryResult result = new SolrIndexSearcher.QueryResult();
    searcher.search(result, cmd);
    rb.setResult(result);
with (in my overridden process() method):
    // the subset of fields I am interested in
    String[] selectFields = {"id", "fileName"};
    // custom spanquery, and many/all hits
    TopDocs results = searcher.search(cmd.getQuery(), 100000);
    /* save hit info (doc & score) */
    /* maybe process SpanQuery.getSpans() here, but perhaps try a
       "doc oriented results" processing approach(?) for tokenization
       caching/optimization? */
The code above _seems_ to work, but I am still in the initial stages at the
moment. Once I have a better understanding of the challenges you mention,
I will share the thoughts and insights I've gained along the way :-).
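A minimal sketch of the "save hit info (doc & score)" step, written against
stand-in types so it runs without Lucene or Solr on the classpath. Hit,
HitRecorder, collect(), byScore() and rankedDocIds() are names made up for
illustration; none of them are Lucene or Solr classes or methods. The
assumption is that the real component would feed each (doc, score) pair from
the search results into something like this, and later build its response
from the ranked doc ids.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in sketch only: these are NOT Lucene or Solr types.
final class HitInfoSketch {

    /** One saved hit: internal doc id plus score (hypothetical holder type). */
    static final class Hit {
        final int doc;
        final float score;

        Hit(int doc, float score) {
            this.doc = doc;
            this.score = score;
        }
    }

    /** Collector-style callback that records every hit it is handed. */
    static final class HitRecorder {
        private final List<Hit> hits = new ArrayList<>();

        void collect(int doc, float score) {
            hits.add(new Hit(doc, score));
        }

        /** Saved hits, best score first; ties broken by ascending doc id. */
        List<Hit> byScore() {
            List<Hit> sorted = new ArrayList<>(hits);
            sorted.sort((a, b) -> {
                int c = Float.compare(b.score, a.score);
                return c != 0 ? c : Integer.compare(a.doc, b.doc);
            });
            return sorted;
        }

        /** Doc ids in ranked order, the raw material for a response list. */
        int[] rankedDocIds() {
            return byScore().stream().mapToInt(h -> h.doc).toArray();
        }
    }

    public static void main(String[] args) {
        HitRecorder recorder = new HitRecorder();
        // In the real component these pairs would come from the searcher.
        recorder.collect(7, 0.42f);
        recorder.collect(3, 1.10f);
        recorder.collect(9, 0.42f);
        for (Hit h : recorder.byScore()) {
            System.out.println("doc=" + h.doc + " score=" + h.score);
        }
    }
}
```

Keeping the hits in a private structure like this means the results never
pass through Solr's DocList/DocSet caches, which is the trade-off described
in the quoted message below.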
Thanks for your time and help,
Sean
hossman wrote:
>
>
> : (e.g. defType=fooSpanQuery), along with token positions. I have this
> : working in straight lucene, so my challenge is to implement it
> : half-intelligently in solr. At the moment, I can't figure out where and
> : how to customize the 'inner' search process.
>
> the first step is to really make sense of how you do this with
> lucene-java, so we can find the best corresponding points in Solr.
>
> I suspect you are using a custom (Hit)Collector, which is an area in Solr
> that isn't easily customizable. Some other issues have brought up the need
> to allow custom code to provide a Collector that Solr would use in
> addition to its own Collectors (for building up DocList and DocSet
> structures) but no clear picture has surfaced as to what that API should
> really look like to be useful to plugin writers, and still be performant
> in the common case.
>
> The most straightforward way to add custom logic like this would be to
> use SolrIndexSearcher just like a regular IndexSearcher, passing your
> Collector to the same old methods you would outside of Solr -- this
> bypasses Solr's internal DocList & DocSet caching, but in many cases this
> may be exactly what you want -- it's the main problem that makes a
> generalized pluggable Collector implementation hard to implement: if the
> Collector has side effects, it's not clear that cached results are useful
> unless they can reproduce the same side effects on cache read.
>
> For response purposes (if you care) you can always then take the results
> of your own data structure, and use it to generate a DocList or a DocSet.
>
>
> -Hoss
>
>
>
--
View this message in context:
http://www.nabble.com/Customizing-solr-search%3A-SpanQueries-%28revisited%29-tp25838412p25884771.html
Sent from the Solr - User mailing list archive at Nabble.com.