hongyuyan97 opened a new issue, #12213:
URL: https://github.com/apache/lucene/issues/12213

   ### Description
   
   Given the input text 'A B A C A B C' and the query ORDERED(A, B, C), the 
search should return the intervals [0,3] and [4,6], but it misses [4,6].
   The cause is similar to 
[LUCENE-9418](https://issues.apache.org/jira/browse/LUCENE-9418).
   After the first interval [0,3] is found, the subintervals are A[0,0], 
B[1,1], C[3,3]. The algorithm then tries to minimize the interval, and the 
subintervals become A[2,2], B[5,5], C[3,3] (minimization stops once it sees 
that 5 > 3).
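
   The expected hits can be checked independently of Lucene. The following is a minimal brute-force sketch, not Lucene's interval algorithm: the class `OrderedIntervalsReference` and its methods are hypothetical names introduced only for illustration. For each occurrence of the first term it greedily takes the earliest later occurrence of each following term, then drops any candidate that contains another candidate.

```java
import java.util.ArrayList;
import java.util.List;

/** Brute-force reference for minimal ordered intervals (illustrative only, not Lucene code). */
final class OrderedIntervalsReference {

    /** Returns every minimal {start, end} span in which the terms occur in the given order. */
    static List<int[]> minimalOrdered(String[] tokens, String[] terms) {
        List<int[]> candidates = new ArrayList<>();
        for (int s = 0; s < tokens.length; s++) {
            if (!tokens[s].equals(terms[0])) {
                continue;
            }
            // From this start, take the earliest later occurrence of each following term.
            int pos = s;
            for (int t = 1; t < terms.length && pos >= 0; t++) {
                pos = indexOfAfter(tokens, terms[t], pos);
            }
            if (pos >= 0) {
                candidates.add(new int[] {s, pos});
            }
        }
        // Keep only candidates that do not contain another candidate.
        List<int[]> minimal = new ArrayList<>();
        for (int[] c : candidates) {
            boolean containsOther = false;
            for (int[] o : candidates) {
                if (o != c && o[0] >= c[0] && o[1] <= c[1]) {
                    containsOther = true;
                    break;
                }
            }
            if (!containsOther) {
                minimal.add(c);
            }
        }
        return minimal;
    }

    /** Index of the first occurrence of term strictly after position `after`, or -1. */
    static int indexOfAfter(String[] tokens, String term, int after) {
        for (int i = after + 1; i < tokens.length; i++) {
            if (tokens[i].equals(term)) {
                return i;
            }
        }
        return -1;
    }
}
```

   Calling `minimalOrdered("A B A C A B C".split(" "), new String[] {"A", "B", "C"})` produces the candidates [0,3], [2,6] and [4,6]; [2,6] is dropped because it contains [4,6], which leaves the two hits expected above.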
   
   When looking for the next interval, it calls advance(B) before checking 
whether B is already after A (the do-while loop), so the subintervals become 
A[2,2], B[inf,inf], C[3,3] and it returns NO_MORE_INTERVALS.
   
   Based on the paper cited by the intervals code, I think we should resume 
the loop from where the last "nextInterval" call stopped, rather than always 
starting from 1.
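
   The difference between the two starting points can be shown on a toy model of the state described above. This is only a sketch under stated assumptions, not the code of Lucene's ordered interval iterator: `PosCursor` and `ResumeSketch` are hypothetical classes, each term is reduced to a list of single positions, and the resume index 2 (the sub-iterator for C) is my reading of where the minimization stopped in the trace.

```java
/** Toy cursor over one term's sorted positions (hypothetical, not a Lucene class). */
final class PosCursor {
    static final int NO_MORE = Integer.MAX_VALUE;
    private final int[] positions;
    private int idx;

    PosCursor(int[] positions, int idx) {
        this.positions = positions;
        this.idx = idx;
    }

    /** Current position, or NO_MORE once the cursor is exhausted. */
    int pos() {
        return idx < positions.length ? positions[idx] : NO_MORE;
    }

    /** Moves to the next position and returns it. */
    int next() {
        idx++;
        return pos();
    }
}

final class ResumeSketch {

    /** State from the issue after the first hit: A at 2, B at 5, C at 3. */
    static PosCursor[] stateAfterFirstHit() {
        return new PosCursor[] {
            new PosCursor(new int[] {0, 2, 4}, 1), // A
            new PosCursor(new int[] {1, 5}, 1),    // B
            new PosCursor(new int[] {3, 6}, 0)     // C
        };
    }

    /**
     * Advances every cursor from index `from` onward at least once, then until it
     * lies after its predecessor. Returns null if a cursor runs out, else {start, end}.
     */
    static int[] advanceFrom(PosCursor[] c, int from) {
        for (int i = from; i < c.length; i++) {
            do {
                if (c[i].next() == PosCursor.NO_MORE) {
                    return null;
                }
            } while (c[i].pos() <= c[i - 1].pos());
        }
        return new int[] {c[0].pos(), c[c.length - 1].pos()};
    }
}
```

   With this model, `advanceFrom(stateAfterFirstHit(), 1)` returns null because B is pushed past its last position, while `advanceFrom(stateAfterFirstHit(), 2)` only moves C and yields the candidate [2,6]. Advancing A to 4, which is still before B at 5, would then tighten that candidate to [4,6].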
   
   I will file a PR for this later.
   
   
   ### Version and environment details
   
   Lucene: 8.10.1


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

