On 12/11/06, Edward Garrett <[EMAIL PROTECTED]> wrote:
hello,

i'm doing phrasal searches, and am not happy with how highlighting is done
by default.

if i search for something, like "w1 w2 w3", then correctly, only fields that
match perfectly will be found. however, when i specify highlighting with
hl=true&hl.fl=myfield, then two things don't work according to (my)
expectations:

1) "w1 w2 w3" is not highlighted as a whole, but rather the pieces are
highlighted. e.g. <em>w1</em> <em>w2</em> <em>w3</em>. really, the whole
thing should be contained within a single <em> element.

2) relatedly, and presumably for the same reason, all instances of "w1",
"w2" and "w3" in myfield are highlighted, even when they don't occur
together.
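
For reference, the request described above can be sketched as follows. This is only an illustration: the host, port, and `/solr/select` path are assumptions for a local test instance, and `build_highlight_query` is a hypothetical helper; only the `hl=true` and `hl.fl=myfield` parameters come from the message itself.

```python
from urllib.parse import urlencode

def build_highlight_query(phrase, field,
                          base_url="http://localhost:8983/solr/select"):
    """Build a phrase-query URL with highlighting turned on for one field.

    base_url is an assumed local endpoint, not something from the thread.
    """
    params = {
        "q": f'"{phrase}"',   # double quotes make it a phrase query
        "hl": "true",         # enable highlighting
        "hl.fl": field,       # field to highlight
    }
    return f"{base_url}?{urlencode(params)}"

url = build_highlight_query("w1 w2 w3", "myfield")
print(url)
```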

i can't see any possible reason for things working this way, but perhaps
SOLR is just following lucene here.

Solr is using Lucene's built-in highlighter, which has the
deficiencies you mention.  There have been improved highlighting
approaches proposed; see
http://issues.apache.org/jira/browse/LUCENE-663 and
http://issues.apache.org/jira/browse/LUCENE-644.
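
To make the difference concrete, here is a toy sketch of the two behaviours. It is not Lucene's highlighter or the code from either JIRA issue; `highlight_terms` and `highlight_phrase` are hypothetical helpers that use plain regular expressions on whitespace-separated words, purely to contrast per-term highlighting with phrase-aware highlighting.

```python
import re

def highlight_terms(text, terms):
    """Wrap every occurrence of any query term, ignoring phrase context.

    This mimics the behaviour complained about: each term is marked
    separately, wherever it occurs.
    """
    pattern = re.compile(r"\b(" + "|".join(re.escape(t) for t in terms) + r")\b")
    return pattern.sub(r"<em>\1</em>", text)

def highlight_phrase(text, terms):
    """Wrap only runs where the terms appear adjacent and in order."""
    pattern = re.compile(r"\b" + r"\s+".join(re.escape(t) for t in terms) + r"\b")
    return pattern.sub(lambda m: f"<em>{m.group(0)}</em>", text)

text = "w2 alone, then w1 w2 w3 together"
terms = ["w1", "w2", "w3"]

print(highlight_terms(text, terms))
# <em>w2</em> alone, then <em>w1</em> <em>w2</em> <em>w3</em> together
print(highlight_phrase(text, terms))
# w2 alone, then <em>w1 w2 w3</em> together
```

The first function marks the stray "w2" and splits the phrase into three elements; the second produces the single `<em>` element asked for and leaves the stray term alone.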

Improving Solr's highlighting is something I am quite interested in
personally.  Unfortunately, this is an extremely busy time for me at
work, and I doubt that I'll have time to work on this in the near
future.

-Mike
