Dawid Weiss created LUCENE-10229:
------------------------------------

             Summary: Match offsets should be consistent for fields with 
positions and fields with offsets
                 Key: LUCENE-10229
                 URL: https://issues.apache.org/jira/browse/LUCENE-10229
             Project: Lucene - Core
          Issue Type: Improvement
            Reporter: Dawid Weiss


This is a follow-up to LUCENE-10223, in which it was discovered that fields with
offsets don't highlight some of the more complex interval queries properly.  Alan
says:
{quote}
It's because it returns the position of the inner match, but the offsets of the 
outer.  And so if you're re-analyzing and retrieving offsets by looking at the 
positions, you get the 'right' thing.  It's not obvious to me what the correct 
response is here, but thinking about it the current behaviour is kind of the 
worst of both worlds, and perhaps we should change it so that you get offsets 
of the inner match as standard, and then the outer match is returned as part of 
the sub matches.
{quote}
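
The mismatch Alan describes can be illustrated with a small toy model. This is 
only a sketch under stated assumptions: it does not use any Lucene API, and the 
class MatchOffsetsSketch, its Match type, and the whitespace tokenizer are 
hypothetical names invented for illustration. It models a source term ("brown") 
restricted by a filter ("quick ... fox") and compares the highlight produced by 
re-analysis (looking up offsets from positions) with the highlight produced from 
the offsets stored in the match.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: none of these types are part of Lucene.
final class MatchOffsetsSketch {

    /** A match: inclusive token positions plus character offsets [startOffset, endOffset). */
    static final class Match {
        final int startPos, endPos, startOffset, endOffset;

        Match(int startPos, int endPos, int startOffset, int endOffset) {
            this.startPos = startPos;
            this.endPos = endPos;
            this.startOffset = startOffset;
            this.endOffset = endOffset;
        }
    }

    /** Whitespace tokenizer: one {startOffset, endOffset} pair per token; the list index is the position. */
    static List<int[]> tokenize(String text) {
        List<int[]> tokens = new ArrayList<>();
        int i = 0;
        while (i < text.length()) {
            while (i < text.length() && text.charAt(i) == ' ') i++;
            int start = i;
            while (i < text.length() && text.charAt(i) != ' ') i++;
            if (i > start) tokens.add(new int[] {start, i});
        }
        return tokens;
    }

    /** Positions-only field: re-analyze the text and derive offsets from the match positions. */
    static String highlightByPositions(String text, Match m) {
        List<int[]> tokens = tokenize(text);
        return text.substring(tokens.get(m.startPos)[0], tokens.get(m.endPos)[1]);
    }

    /** Field with offsets: trust the offsets reported by the match. */
    static String highlightByOffsets(String text, Match m) {
        return text.substring(m.startOffset, m.endOffset);
    }

    public static void main(String[] args) {
        String text = "quick brown fox jumps";

        // Assumed behaviour from the quote: positions of the inner match
        // ("brown", position 1), offsets of the outer match ("quick brown fox", 0..15).
        Match mixed = new Match(1, 1, 0, 15);

        // Suggested behaviour: positions and offsets both taken from the inner match.
        Match consistent = new Match(1, 1, 6, 11);

        System.out.println(highlightByPositions(text, mixed));      // brown
        System.out.println(highlightByOffsets(text, mixed));        // quick brown fox
        System.out.println(highlightByPositions(text, consistent)); // brown
        System.out.println(highlightByOffsets(text, consistent));   // brown
    }
}
```

With the mixed match the two kinds of field highlight different regions of the 
same hit; with the consistent match they agree.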

Intervals are nicely separated into "basic intervals" and "filters", which 
restrict some other source of intervals. Here is the original documentation:

https://github.com/apache/lucene/blob/main/lucene/queries/src/java/org/apache/lucene/queries/intervals/package-info.java#L29-L50

My experience from an extended period of using interval queries in a frontend 
where they're highlighted is that filters are restrictions that should not be 
highlighted - it's the source intervals that people care about. Filters either 
exclude matches or supply the context in which source intervals must occur.

The test code contributed in LUCENE-10223 contains numerous query-highlight 
examples (on fields with positions) that demonstrate this intuition across all 
kinds of interval functions:

https://github.com/apache/lucene/blob/main/lucene/highlighter/src/test/org/apache/lucene/search/matchhighlight/TestMatchHighlighter.java#L335-L542

This issue is about making the internals work consistently for fields with 
positions and fields with offsets.
