Ahh, this confirms it: the analyzers are pulling things apart properly.
There are two instances of the query keyword with other words between them.
But from your last comment, it sounds like the system isn't attempting any
sort of phrase highlighting and is just hitting a weird edge case? I'm
seeing this behavior fairly often, so I assumed there must be some option
that says: if two highlighted words are sufficiently close together,
highlight them as a single phrase.
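
A minimal sketch of how the behavior could be reproduced and checked from a
script, assuming Python 3. The function names build_highlight_url and
merged_spans are hypothetical helpers, not part of Solr. The parameter names
are copied as-is from the request quoted below. Solr documents its
highlighting parameters with an hl. prefix (hl.mergeContiguous), so the
unprefixed mergeContiguous is kept here only because that is how the request
was posted.

```python
import re
from urllib.parse import urlencode

# Base URL taken from the quoted request; adjust for your own install.
BASE_URL = "http://localhost:8983/solr/select/"


def build_highlight_url(keyword):
    """Build the highlighting request with the parameters from the thread."""
    params = [
        ("hl", "true"),
        ("hl.snippets", "1"),
        ("q", keyword),
        ("hl.fragsize", "0"),
        ("mergeContiguous", "false"),  # as posted; Solr's name is hl.mergeContiguous
        ("indent", "on"),
        ("hl.usePhraseHighlighter", "false"),
        ("debugQuery", "on"),
        ("hl.fragmenter", "gap"),
        ("hl.highlightMultiTerm", "false"),
    ]
    return BASE_URL + "?" + urlencode(params)


def merged_spans(snippet):
    """Return the contents of <em> spans that cover more than one token.

    A span containing whitespace is treated as "merged", which is only a
    heuristic for single-keyword queries like the one in this thread.
    """
    spans = re.findall(r"<em>(.*?)</em>", snippet)
    return [s for s in spans if re.search(r"\s", s)]
```

Running merged_spans over each returned snippet would flag the merged case
from the thread and return an empty list for the desired per-term output,
which makes it easier to compare parameter combinations quickly.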

On Tue, Nov 9, 2010 at 7:11 PM, Lance Norskog <goks...@gmail.com> wrote:

> Have you looked at solr/admin/analysis.jsp? This is the 'Analysis' link
> off the main Solr admin page. It will show you how text is broken up
> for both the indexing and query processes. You might get some insight
> into how these words are torn apart and assigned positions. Trying
> the different Analyzers and options might get you there.
>
> But to be frank, highlighting is a tough problem and has always had a
> lot of edge cases.
>
> On Tue, Nov 9, 2010 at 6:08 PM, Sasank Mudunuri <sas...@gmail.com> wrote:
> > I'm finding that if a keyword appears in a field multiple times very
> > close together, it will get highlighted as a phrase even though there are
> > other terms between the two instances. So this search:
> >
> > http://localhost:8983/solr/select/?
> >
> > hl=true&
> > hl.snippets=1&
> > q=residue&
> > hl.fragsize=0&
> > mergeContiguous=false&
> > indent=on&
> > hl.usePhraseHighlighter=false&
> > debugQuery=on&
> > hl.fragmenter=gap&
> > hl.highlightMultiTerm=false
> >
> > Highlights as:
> > What does "low-<em>residue" mean? Like low-residue</em> diet?
> >
> > Trying to get it to highlight as:
> > What does "low-<em>residue</em>" mean? Like low-<em>residue</em> diet?
> >
> > I've tried playing with various combinations of mergeContiguous,
> > highlightMultiTerm, and usePhraseHighlighter, but they all yield the same
> > output.
> >
> > For reference, the field type uses a StandardTokenizerFactory with
> > SynonymFilterFactory, StopFilterFactory, StandardFilterFactory, and
> > SnowballFilterFactory. I've confirmed that the intermediate words don't
> > appear in either the synonym list or the stop word list. I can post the
> > full definition if helpful.
> >
> > Any pointers as to how to debug this would be greatly appreciated!
> > sasank
> >
>
>
>
> --
> Lance Norskog
> goks...@gmail.com
>
