dsmiley commented on a change in pull request #1123: LUCENE-9093: Unified highlighter with word separator never gives context to the left
URL: https://github.com/apache/lucene-solr/pull/1123#discussion_r361828673
 
 

 ##########
 File path: lucene/highlighter/src/java/org/apache/lucene/search/uhighlight/LengthGoalBreakIterator.java
 ##########
 @@ -173,8 +205,30 @@ private int moveToBreak(int idx) { // precondition: idx is a known break
 
   // called at start of new Passage given first word start offset
   @Override
-  public int preceding(int offset) {
-    return baseIter.preceding(offset); // no change needed
+  public int preceding(int matchStartIndex) {
+    final int targetIdx = (matchStartIndex - 1) - (int)(lengthGoal * fragmentAlignment);
+    if (targetIdx <= 0) {
+      return 0;
+    }
+    final int beforeIdx = baseIter.preceding(targetIdx + 1);
+    if (beforeIdx == DONE) {
+      return 0;
+    }
+    if (beforeIdx == targetIdx) { // right on the money
+      return beforeIdx;
+    }
+    if (isMinimumLength) { // thus never undershoot
+      return beforeIdx;
+    }
+
+    // note: it is a shame that we invoke following() *one more time*; BI's are sometimes expensive.
+
+    // Find closest break to target
+    final int afterIdx = baseIter.following(targetIdx - 1);
+    if (afterIdx - targetIdx < targetIdx - beforeIdx && afterIdx < matchStartIndex) {
+      return afterIdx;
+    }
+    return beforeIdx;
 
 Review comment:
   This method never calls moveToBreak, so the underlying BreakIterator (baseIter) can be left positioned at a different break than the one returned here.
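
   To illustrate the concern, here is a minimal sketch using the JDK's `java.text.BreakIterator` directly. The class and method names are hypothetical and not part of Lucene; the sketch only mirrors the shape of the patched `preceding()` (look before the target, then after it, then return the earlier break). It also assumes that re-seeking with `following(idx - 1)` is an acceptable way to reposition the iterator, which may differ from what `moveToBreak` actually does.

```java
import java.text.BreakIterator;
import java.util.Locale;

// Illustrative only: hypothetical names, not Lucene code.
class BreakIteratorStateDemo {

  private static BreakIterator wordIterator(String text) {
    BreakIterator bi = BreakIterator.getWordInstance(Locale.ROOT);
    bi.setText(text);
    return bi;
  }

  // Looks before the target, then after it, and returns the earlier break
  // without repositioning. Result: {returned break, iterator position}.
  static int[] unsynced(String text, int targetIdx) {
    BreakIterator bi = wordIterator(text);
    int beforeIdx = bi.preceding(targetIdx + 1);
    bi.following(targetIdx - 1); // side effect: bi now sits on the later break
    return new int[] {beforeIdx, bi.current()};
  }

  // Same lookups, but re-seeks so the iterator position matches the result.
  static int[] synced(String text, int targetIdx) {
    BreakIterator bi = wordIterator(text);
    int beforeIdx = bi.preceding(targetIdx + 1);
    bi.following(targetIdx - 1);
    if (beforeIdx == 0) {
      bi.first();
    } else {
      bi.following(beforeIdx - 1); // lands on beforeIdx, a known break
    }
    return new int[] {beforeIdx, bi.current()};
  }

  public static void main(String[] args) {
    String text = "The quick brown fox jumps";
    int[] a = unsynced(text, 6);
    int[] b = synced(text, 6);
    System.out.println("unsynced: returned=" + a[0] + " current=" + a[1]);
    System.out.println("synced:   returned=" + b[0] + " current=" + b[1]);
  }
}
```

   With a target of 6 (inside "quick"), the unsynced variant returns break 4 while the iterator is positioned at 9, so a later `current()` or `next()` on the same iterator would start from the wrong place. The synced variant returns 4 with the iterator also at 4.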

