Hi,

Maybe org.apache.lucene.search.spans.TermSpans?



On Sunday, June 5, 2016 7:59 AM, Alexandre Rafalovitch <arafa...@gmail.com> wrote:
It sounds like TermVector component's output:
https://cwiki.apache.org/confluence/display/solr/The+Term+Vector+Component

Perhaps with additional flags enabled (e.g. tv.offsets and/or tv.positions).
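
For example, here is a minimal sketch of building such a request. It
assumes a /tvrh request handler wired to the Term Vector Component (as in
the sample configs), and a field indexed with termVectors, termPositions
and termOffsets enabled. The host, collection name ("mycollection") and
field name ("text") are placeholders, not anything from your setup.

```python
from urllib.parse import urlencode


def term_vector_url(base_url, query, field, rows=10):
    """Build a Term Vector Component request URL (hypothetical helper)."""
    params = {
        "q": query,
        "rows": rows,
        "tv": "true",            # turn the component on
        "tv.fl": field,          # field(s) to return term vectors for
        "tv.offsets": "true",    # start/end character offsets per term
        "tv.positions": "true",  # token positions per term
        "wt": "json",
    }
    # Assumes the handler is registered as /tvrh; adjust to your solrconfig.xml.
    return f"{base_url}/tvrh?{urlencode(params)}"


# Placeholder host, collection and field names.
print(term_vector_url("http://localhost:8983/solr/mycollection", "text:search", "text"))
```

As far as I know, the component reports vectors for every term in the
field of each returned document, not only the terms that matched, so you
would probably still need to filter the output down to your query terms.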

Regards,
   Alex.
----
Newsletter and resources for Solr beginners and intermediates:
http://www.solr-start.com/



On 5 June 2016 at 07:39, Justin Lee <lee.justi...@gmail.com> wrote:
> Is anyone aware of a way of getting a list of each matching token and their
> offsets after executing a search?  The reason I want to do this is because
> I have the physical coordinates of each token in the original document
> stored out of band, and I want to be able to highlight in the original
> document.  I would really like to have Solr return the list of matching
> tokens because then things like stemming and phrase matching will work as
> expected. I'm thinking of something like the highlighter component, except
> instead of returning html, it would return just the matching tokens and
> their offsets.
>
> I have googled high and low and can't seem to find an exact answer to this
> question, so I have spent the last few days examining the internals of the
> various highlighting classes in Solr and Lucene.  I think the bulk of the
> action is in WeightedSpanTermExtractor and its interaction with
> getBestTextFragments in the Highlighter class.  But before I spend any more
> time on this I thought I'd ask (1) whether anyone knows of an easier way of
> doing this, and (2) whether I'm at least barking up the right tree.
>
> Thanks much,
> Justin
