Erick, Eric and Mike,
Thanks for your help and ideas.
It sounds like we'd need to do a bit of revamping in the highlighter.
Perhaps even PostingsHighlighter should be taken as the baseline, since it
is faster. It uses the same extractTerms() method that Erik has shown.
The user story here is tha
There is also PostingsHighlighter -- I recommend it, if only for the
performance improvement, which is substantial, but I'm not completely
sure how it handles this issue. The one drawback I *am* aware of is
that it is insensitive to positions (so words from phrases get
highlighted even in isol
BooleanQuery’s extractTerms looks like this:
    public void extractTerms(Set<Term> terms) {
      for (BooleanClause clause : clauses) {
        if (clause.isProhibited() == false) {
          clause.getQuery().extractTerms(terms);
        }
      }
    }
that’s generally the method called by the Highlighter for what terms should
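To illustrate why "c" ends up highlighted, here is a minimal, self-contained sketch of the same traversal. It does not use Lucene: SketchQuery, SketchTerm, SketchClause and SketchBoolean are hypothetical stand-ins invented for this example, and terms are plain strings. The point it tries to show is that term extraction only skips prohibited clauses; it never asks whether a clause actually matched the document.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical stand-ins for Lucene's query classes; NOT the real API.
interface SketchQuery {
    void extractTerms(Set<String> terms);
}

final class SketchTerm implements SketchQuery {
    private final String text;

    SketchTerm(String text) { this.text = text; }

    @Override
    public void extractTerms(Set<String> terms) { terms.add(text); }
}

final class SketchClause {
    final SketchQuery query;
    final boolean prohibited;

    SketchClause(SketchQuery query, boolean prohibited) {
        this.query = query;
        this.prohibited = prohibited;
    }
}

final class SketchBoolean implements SketchQuery {
    private final List<SketchClause> clauses = new ArrayList<>();

    // Required vs. optional is deliberately not modelled: the traversal
    // below treats both the same way.
    SketchBoolean add(SketchQuery q) {
        clauses.add(new SketchClause(q, false));
        return this;
    }

    SketchBoolean addProhibited(SketchQuery q) {
        clauses.add(new SketchClause(q, true));
        return this;
    }

    // Same shape as the method quoted above: only prohibited clauses are skipped.
    @Override
    public void extractTerms(Set<String> terms) {
        for (SketchClause clause : clauses) {
            if (!clause.prohibited) {
                clause.query.extractTerms(terms);
            }
        }
    }
}

final class ExtractTermsSketch {
    // Models the query from this thread: a OR (b AND c) OR d
    static Set<String> termsForThreadQuery() {
        SketchQuery parsed = new SketchBoolean()
            .add(new SketchTerm("a"))
            .add(new SketchBoolean().add(new SketchTerm("b")).add(new SketchTerm("c")))
            .add(new SketchTerm("d"));
        Set<String> terms = new TreeSet<>();
        parsed.extractTerms(terms);
        return terms;
    }

    public static void main(String[] args) {
        System.out.println(termsForThreadQuery()); // prints [a, b, c, d]
    }
}
```

All four terms come back, so any highlighter that works purely from this term set would mark "c" wherever it occurs, even when the (b AND c) clause as a whole did not match.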
Hmmm, not quite sure what to say. Offsets and positions help,
particularly with FastVectorHighlighter, but the highlighting is
usually re-analyzed anyway so it _shouldn't_ matter. But what I don't
know about highlighting could fill volumes ;)..
Sorry I can't be more help here.
Erick
On Tue, Feb 2
Erick,
Our default operator is AND.
Both queries below parse the same:
a OR (b c) OR d
a OR (b AND c) OR d
The parsed query:
Contents:a (+Contents:b +Contents:c) Contents:d
So this part is consistent with our expectation.
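To spell out what we expect from the highlighter, here is a rough sketch that walks the parsed query above and keeps only the terms whose enclosing clauses all match the document. It is plain Java rather than Lucene code: MatchNode, MatchTerm, MatchAll, MatchAny and MatchAwareSketch are hypothetical names made up for this example, and the document is assumed to contain just the terms c and d.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical model of a parsed boolean query; NOT the Lucene API.
interface MatchNode {
    boolean matches(Set<String> docTerms);
    void collectMatching(Set<String> docTerms, Set<String> out);
}

final class MatchTerm implements MatchNode {
    private final String text;

    MatchTerm(String text) { this.text = text; }

    public boolean matches(Set<String> docTerms) { return docTerms.contains(text); }

    public void collectMatching(Set<String> docTerms, Set<String> out) {
        if (matches(docTerms)) {
            out.add(text);
        }
    }
}

// All children are required, like (+Contents:b +Contents:c).
final class MatchAll implements MatchNode {
    private final List<MatchNode> children;

    MatchAll(MatchNode... children) { this.children = Arrays.asList(children); }

    public boolean matches(Set<String> docTerms) {
        for (MatchNode child : children) {
            if (!child.matches(docTerms)) {
                return false;
            }
        }
        return true;
    }

    public void collectMatching(Set<String> docTerms, Set<String> out) {
        if (!matches(docTerms)) {
            return; // the clause as a whole failed, so none of its terms count
        }
        for (MatchNode child : children) {
            child.collectMatching(docTerms, out);
        }
    }
}

// Children are optional alternatives, like the top-level OR.
final class MatchAny implements MatchNode {
    private final List<MatchNode> children;

    MatchAny(MatchNode... children) { this.children = Arrays.asList(children); }

    public boolean matches(Set<String> docTerms) {
        for (MatchNode child : children) {
            if (child.matches(docTerms)) {
                return true;
            }
        }
        return false;
    }

    public void collectMatching(Set<String> docTerms, Set<String> out) {
        for (MatchNode child : children) {
            child.collectMatching(docTerms, out);
        }
    }
}

final class MatchAwareSketch {
    // Contents:a (+Contents:b +Contents:c) Contents:d
    static Set<String> matchingTerms(Set<String> docTerms) {
        MatchNode parsed = new MatchAny(
            new MatchTerm("a"),
            new MatchAll(new MatchTerm("b"), new MatchTerm("c")),
            new MatchTerm("d"));
        Set<String> out = new TreeSet<>();
        parsed.collectMatching(docTerms, out);
        return out;
    }

    public static void main(String[] args) {
        Set<String> doc = new TreeSet<>(Arrays.asList("c", "d")); // assumed document terms
        System.out.println(matchingTerms(doc)); // prints [d]
    }
}
```

Under these assumptions only d survives, which is what the score explanation shows; c is dropped because b is missing from the document and the nested clause fails as a whole.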
>> I'm a bit puzzled by your statement that "c" didn't contribute to
Highlighting is such a pain...
What does the parsed query look like? If the default operator is OR,
then this seems correct as both 'd' and 'c' appear in the doc. So
I'm a bit puzzled by your statement that "c" didn't contribute to the score.
If the parsed query is, indeed
a +b +c d
then it does
Erick,
Nope, we are using the standard Lucene query parser with some customizations
that do not affect the boolean query parsing logic.
Should we try some other highlighter?
On Mon, Feb 23, 2015 at 6:57 PM, Erick Erickson
wrote:
> Are you using edismax?
>
> On Mon, Feb 23, 2015 at 3:28 AM, Dmitry Kan wrot
Are you using edismax?
On Mon, Feb 23, 2015 at 3:28 AM, Dmitry Kan wrote:
> Hello!
>
> In Solr 4.3.1 there seems to be some inconsistency with the highlighting of
> the boolean query:
>
> a OR (b c) OR d
>
> This returns a proper hit, which shows that only d was included into the
> document score
Hello!
In Solr 4.3.1 there seems to be some inconsistency with the highlighting of
the boolean query:
a OR (b c) OR d
This returns a proper hit, which shows that only d was included in the
document score calculation.
But the highlighter returns both d and c in tags.
Is this a known issue of