I'm fairly sure I did a custom (Hit)Collector in lucene-java, but all I can find at the moment are my retro implementations (w/o collectors). I won't bore (or scare?) you with the details, but I follow some of what you're suggesting.
I have been able to get straight SpanQueries to work in my custom QueryComponent. I think I followed the same path you describe, but correct me if I misunderstand. I've more-or-less replaced the QueryComponent.process() code:

    SolrIndexSearcher.QueryResult result = new SolrIndexSearcher.QueryResult();
    searcher.search(result, cmd);
    rb.setResult(result);

with (in my overridden process() method):

    String[] selectFields = {"id", "fileName"}; // the subset of fields I am interested in
    TopDocs results = searcher.search(cmd.getQuery(), 100000); // custom spanquery, and many/all hits
    /* save hit info (doc & score) */
    /* maybe process SpanQuery.getSpans() here, but perhaps try a
       "doc oriented results" processing approach(?) for
       tokenization caching/optimization? */

The code above _seems_ to work, but I am still in the initial stages at the moment. When I get to the point where I have a better understanding of the challenges you mention, I will share thoughts and insights I've gained along the way :-).

Thanks for your time and help,
Sean

hossman wrote:
>
> : (e.g. defType=fooSpanQuery), along with token positions. I have this working
> : in straight lucene, so my challenge is to implement it half-intelligently in
> : solr. At the moment, I can't figure out where and how to customize the
> : 'inner' search process.
>
> the first step is to really make sense of how you do this with
> lucene-java, so we can find the best corresponding points in Solr.
>
> I suspect you are using a custom (Hit)Collector, which is an area in Solr
> that isn't easily customizable. Some other issues have brought up the need
> to allow custom code to provide a Collector that Solr would use in
> addition to its own Collectors (for building up DocList and DocSet
> structures) but no clear picture has surfaced as to what that API should
> really look like to be useful to plugin writers, and still be performant
> in the common case.
>
> The most straightforward way to add custom logic like this would be to
> use SolrIndexSearcher just like a regular IndexSearcher, passing your
> Collector to the same old methods you would outside of Solr -- this
> bypasses Solr's internal DocList & DocSet caching, but in many cases this
> may be exactly what you want -- it's the main problem that makes a
> generalized pluggable Collector implementation hard to implement: if the
> Collector has side effects, it's not clear that cached results are useful
> unless they can reproduce the same side effects on cache read.
>
> For response purposes (if you care) you can always then take the results
> of your own data structure, and use it to generate a DocList or a DocSet.
>
>
> -Hoss
>

--
View this message in context: http://www.nabble.com/Customizing-solr-search%3A-SpanQueries-%28revisited%29-tp25838412p25884771.html
Sent from the Solr - User mailing list archive at Nabble.com.
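
To make the caching point in the quoted reply concrete, here is a rough, library-free sketch of why a collector with side effects and a doc-id cache don't mix well. None of this is the Lucene or Solr API: HitCollector, ScoreRecordingCollector and ToySearcher are hypothetical stand-ins invented for illustration, and the "index" is just a map from query string to hits.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class CollectorCacheSketch {

    /** Hypothetical stand-in for a hit collector (not the Lucene/Solr interface). */
    interface HitCollector {
        void collect(int docId, float score);
    }

    /** A collector with a side effect: it keeps its own doc -> score map. */
    static class ScoreRecordingCollector implements HitCollector {
        final Map<Integer, Float> scores = new LinkedHashMap<>();

        @Override
        public void collect(int docId, float score) {
            scores.put(docId, score);
        }
    }

    /** Toy searcher: a fixed "index" plus a cache that stores only doc ids. */
    static class ToySearcher {
        private final Map<String, Map<Integer, Float>> index = new HashMap<>();
        private final Map<String, List<Integer>> docListCache = new HashMap<>();

        void addHit(String query, int docId, float score) {
            index.computeIfAbsent(query, q -> new LinkedHashMap<>()).put(docId, score);
        }

        /** Uncached path: every matching doc is passed to the collector. */
        void search(String query, HitCollector collector) {
            Map<Integer, Float> hits = index.get(query);
            if (hits == null) {
                return;
            }
            for (Map.Entry<Integer, Float> hit : hits.entrySet()) {
                collector.collect(hit.getKey(), hit.getValue());
            }
        }

        /** Cached path: on a cache hit the collector is never invoked. */
        List<Integer> cachedSearch(String query, HitCollector collector) {
            List<Integer> cached = docListCache.get(query);
            if (cached != null) {
                return cached;
            }
            List<Integer> docs = new ArrayList<>();
            search(query, (docId, score) -> {
                docs.add(docId);
                collector.collect(docId, score);
            });
            docListCache.put(query, docs);
            return docs;
        }
    }

    public static void main(String[] args) {
        ToySearcher searcher = new ToySearcher();
        searcher.addHit("foo", 3, 1.5f);
        searcher.addHit("foo", 7, 0.5f);

        ScoreRecordingCollector first = new ScoreRecordingCollector();
        ScoreRecordingCollector second = new ScoreRecordingCollector();

        // First call runs the search, so the collector sees both hits.
        System.out.println(searcher.cachedSearch("foo", first) + " " + first.scores.size());
        // Second call is served from the cache, so the collector sees nothing.
        System.out.println(searcher.cachedSearch("foo", second) + " " + second.scores.size());
    }
}
```

Both calls return the same doc ids, but only the first collector ends up with any scores recorded. Calling search() directly, which is the analogue of using SolrIndexSearcher like a plain IndexSearcher, always feeds the collector, at the cost of skipping the cache.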