Alan Woodward created LUCENE-9099:
-------------------------------------

             Summary: Correctly handle repeats in ordered and unordered intervals
                 Key: LUCENE-9099
                 URL: https://issues.apache.org/jira/browse/LUCENE-9099
             Project: Lucene - Core
          Issue Type: Improvement
            Reporter: Alan Woodward
            Assignee: Alan Woodward

If you have repeating intervals in an ordered or unordered interval source, you currently get somewhat confusing behaviour:

* ORDERED(a, a, b) will return an extra interval over just `a b` if it first matches `a a b`, meaning that you can get incorrect results if it is used in a CONTAINING filter. For example, CONTAINING(ORDERED(x, y), ORDERED(a, a, b)) will match on the document `a x a b y`.
* UNORDERED(a, a) will match on documents that contain just a single `a`.

It is possible to deal with the unordered case when building sources by rewriting duplicates to nested ORDERED clauses, so that UNORDERED(a, b, c, a, b) becomes UNORDERED(ORDERED(a, a), ORDERED(b, b), c), but this then breaks MAXGAPS filtering. We should try to fix this within intervals themselves.
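
To make the build-time rewrite of the unordered case concrete, here is a minimal sketch in Python. It works on plain strings only and is not Lucene code: the function name `rewrite_unordered` and the string representation of clauses are assumptions made for illustration.

```python
from collections import Counter


def rewrite_unordered(terms):
    """Rewrite the terms of an UNORDERED clause so that each repeated term
    becomes a nested ORDERED clause (hypothetical helper, strings only)."""
    # Counter is a dict subclass, so it keeps first-occurrence order
    # (Python 3.7+) while counting how often each term appears.
    counts = Counter(terms)

    clauses = []
    for term, n in counts.items():
        if n > 1:
            # A term repeated n times must match n distinct positions,
            # which ORDERED(term, ..., term) expresses.
            clauses.append("ORDERED(" + ", ".join([term] * n) + ")")
        else:
            clauses.append(term)

    # If everything collapsed into one clause, no UNORDERED wrapper is needed.
    if len(clauses) == 1:
        return clauses[0]
    return "UNORDERED(" + ", ".join(clauses) + ")"


if __name__ == "__main__":
    print(rewrite_unordered(["a", "b", "c", "a", "b"]))
    print(rewrite_unordered(["a", "a"]))
```

With the example from the description, `["a", "b", "c", "a", "b"]` is rewritten to UNORDERED(ORDERED(a, a), ORDERED(b, b), c), and `["a", "a"]` collapses to ORDERED(a, a), which no longer matches a single `a`. As noted above, this rewrite breaks MAXGAPS filtering, so it only illustrates the workaround and is not the proposed fix.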