benwtrent opened a new pull request, #11905:
URL: https://github.com/apache/lucene/pull/11905

   This bug has been around since 9.1. It relates directly to the number of 
nodes contained in level 0 of the HNSW graph. Since level 0 contains all the 
nodes, this implies the following:
   
    - In Lucene 9.1, the bug would likely first appear once a single segment 
held `31580641` (Integer.MAX_VALUE / ((maxConn + 1) * Integer.BYTES)) vectors.
    - In Lucene 9.2+, the bug appears when a single segment holds `16268814` 
(Integer.MAX_VALUE / ((M * 2 + 1) * Integer.BYTES)) or more vectors.
   
   The stack trace shows up as an EOF failure, because Lucene attempts to 
`seek` to a negative offset in `ByteBufferIndexInput`.
   
   This commit fixes the type casting and uses the `Math.exact...` methods for 
the multiplications and additions. The overhead is minimal, as these 
calculations are done in constructors and their results are reused repeatedly 
afterwards.
   
   
   I also put the fixes in the older codecs. I don't know if that is typically 
done, but if somebody has a large segment and wants to read its vectors, they 
could build this jar and read them now (the bug is only on read; the data 
layout is unchanged).
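
   To illustrate the failure mode, here is a minimal, self-contained sketch 
(hypothetical names, not Lucene's actual code) of computing a byte offset into 
per-node neighbor data. It assumes `M = 16` and 4-byte neighbor entries, so 
132 bytes per node; the `int` multiplication wraps negative past the threshold 
above, while widening to `long` first (with `Math.multiplyExact`) stays correct:

   ```java
   public class GraphOffsets {
       static final int M = 16;                                // example max connections
       static final int BYTES_PER_NODE = (M * 2 + 1) * Integer.BYTES; // 132 bytes

       // Buggy: int * int overflows once node > Integer.MAX_VALUE / BYTES_PER_NODE,
       // yielding a negative offset that a later seek() rejects with an EOF-style error.
       static long buggyOffset(int node) {
           return node * BYTES_PER_NODE;   // overflows as int, then widens to long
       }

       // Fixed: widen to long before multiplying, and fail fast on overflow.
       static long fixedOffset(int node) {
           return Math.multiplyExact((long) node, (long) BYTES_PER_NODE);
       }

       public static void main(String[] args) {
           int node = 16_268_816;          // just past the 9.2+ threshold
           System.out.println(buggyOffset(node)); // negative: int wrap-around
           System.out.println(fixedOffset(node)); // correct positive offset
       }
   }
   ```

   The same reasoning applies to the 9.1 layout; only the per-node byte count 
differs.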


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
