[ https://issues.apache.org/jira/browse/LUCENE-9568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17212738#comment-17212738 ]
Robert Muir commented on LUCENE-9568:
-------------------------------------

It will break most of the top-N optimizations of the query.

If you use the max term length to compute the boost, then the PQ (priority queue) can never optimize. That is because there could always exist some unseen term with a length of, say, 1MB (just for illustration) which would rank extremely high.

By using the min, the boost is bounded by the length of the query term, and once the PQ fills with ed=2, the query can restrict itself further and only look for ed=1, ed=0, etc.

Practically it is this way because that's how fuzzy was defined, with the min and the tie-breaker after this boost being term sort order (you can check the code of an ancient version like 2.9 to see: https://github.com/apache/lucene-solr/blob/releases/lucene/2.9.2/src/java/org/apache/lucene/search/FuzzyTermEnum.java#L189 ). We just exploited it for all it's worth.

It is especially important for small PQ (top-N) sizes such as spell checking.

Hopefully I have explained it ok, this thing is hairy :) Happy to try again if needed.

I think first we should make a test, ideally one that doesn't use highlighting? I think there should be an alternative, simpler fix that won't break the top-N optimization.

> FuzzyTermEnums sets negative boost for fuzzy search & highlight
> ---------------------------------------------------------------
>
> Key: LUCENE-9568
> URL: https://issues.apache.org/jira/browse/LUCENE-9568
> Project: Lucene - Core
> Issue Type: Bug
> Components: modules/highlighter
> Affects Versions: 8.5.1
> Reporter: Juraj Jurčo
> Priority: Minor
> Labels: highlighting, newbie
> Attachments: FindSqlHighlightTest.java
>
>
> *Description*
> When a user indexes a word with an apostrophe and constructs a fuzzy query for
> the highlighter, an exception is thrown because a negative boost is set on a query.
> *Repro Steps*
> # Index a text with an apostrophe, e.g. doesn't
> # Parse a fuzzy query, e.g.: se~, se~2, se~3
> # Try to highlight a text with an apostrophe
> # The exception is thrown (for details see the attached test with repro
> steps)
> *Actual Result*
> {{java.lang.IllegalArgumentException: boost must be a positive float, got
> -1.0}}
> *Expected Result*
> * No exception.
> * Highlighting marks are inserted into a text.
> *Workaround*
> - not known.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
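
The min-versus-max argument in the comment above can be checked with a little arithmetic. The sketch below is an illustration only, not Lucene's code: the class and method names are hypothetical, and it assumes the similarity has the form 1 - ed / length, with the length taken as either the min or the max of the query and candidate term lengths.

```java
// Illustrative sketch only; names are hypothetical and this is not Lucene code.
final class FuzzyBoundSketch {

    // Assumed "min" variant: normalize the edit distance by the shorter length.
    static float simMin(int ed, int queryLen, int termLen) {
        return 1.0f - (float) ed / Math.min(queryLen, termLen);
    }

    // Assumed "max" variant: normalize the edit distance by the longer length.
    static float simMax(int ed, int queryLen, int termLen) {
        return 1.0f - (float) ed / Math.max(queryLen, termLen);
    }

    // Under the min variant, min(queryLen, termLen) <= queryLen, so no term at
    // edit distance ed can score above this value, however long it is.
    static float upperBoundMin(int ed, int queryLen) {
        return 1.0f - (float) ed / queryLen;
    }

    public static void main(String[] args) {
        int queryLen = 5;
        int ed = 2;
        for (int termLen : new int[] {5, 50, 1_000_000}) {
            System.out.printf("termLen=%d min=%.6f max=%.6f bound=%.6f%n",
                    termLen,
                    simMin(ed, queryLen, termLen),
                    simMax(ed, queryLen, termLen),
                    upperBoundMin(ed, queryLen));
        }
        // With the min variant the value can also drop below zero when ed
        // exceeds the shorter length (ed=2 against a 1-character term).
        System.out.println(simMin(2, 2, 1));
    }
}
```

With the min variant, every ed=2 candidate stays at or below the ed=2 bound, which is what lets a full queue stop looking at ed=2 terms. With the max variant, a very long ed=2 term scores close to 1 and can outrank an ed=1 match, so no such cutoff exists. The last line shows only that this formula can go negative; whether that is the exact path behind the reported exception is not established here.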