benwtrent commented on PR #14527:
URL: https://github.com/apache/lucene/pull/14527#issuecomment-2847706746

   @weizijun 
   
   It should be fixed. Having an estimation that is more than 2x off is pretty 
bad.
   
   This estimate is used to determine how often flushes should occur, etc.
   
   There are a couple of ways this can be fixed. 
   
   A simple way could be to provide an optional callback to `NeighborArray` 
that invokes a package-private method on `OnHeapHnswGraph`, so that each 
array's individual estimate can be adjusted as the array grows.
   
   `NeighborArray(OnHeapHnswGraph::updateEstimate...)` or something. Then the 
RAM estimate in `OnHeapHnswGraph` becomes the accumulation of those estimates 
as the inner estimates evolve. 
   
   We need to be cautious there with multi-threading, as many node updates 
could be occurring at the same time. So this inner accumulator likely needs to 
be a `LongAccumulator`, and it should also assert that it is always a positive 
number.
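
   A rough sketch of the callback idea, not the actual Lucene code. 
`SketchNeighborArray` and `SketchGraph` are hypothetical stand-ins for 
`NeighborArray` and `OnHeapHnswGraph`; the doubling growth policy and the 
per-slot byte count are assumptions made only for illustration.

```java
import java.util.Arrays;
import java.util.concurrent.atomic.LongAccumulator;
import java.util.function.LongConsumer;

/** Hypothetical stand-in for NeighborArray: reports byte deltas when its arrays grow. */
class SketchNeighborArray {
  private int[] nodes;
  private float[] scores;
  private int size;
  private final LongConsumer onRamDelta; // optional callback; may be null

  SketchNeighborArray(int initialCapacity, LongConsumer onRamDelta) {
    this.nodes = new int[initialCapacity];
    this.scores = new float[initialCapacity];
    this.onRamDelta = onRamDelta;
    report(bytesFor(initialCapacity));
  }

  // Assumed cost model: one int plus one float per slot, ignoring object headers.
  private static long bytesFor(int capacity) {
    return (long) capacity * (Integer.BYTES + Float.BYTES);
  }

  private void report(long delta) {
    if (onRamDelta != null) {
      onRamDelta.accept(delta);
    }
  }

  void add(int node, float score) {
    if (size == nodes.length) {
      // Assumed growth policy (doubling); only the delta is reported to the owner.
      int newCapacity = Math.max(1, nodes.length * 2);
      long delta = bytesFor(newCapacity) - bytesFor(nodes.length);
      nodes = Arrays.copyOf(nodes, newCapacity);
      scores = Arrays.copyOf(scores, newCapacity);
      report(delta);
    }
    nodes[size] = node;
    scores[size] = score;
    size++;
  }
}

/** Hypothetical stand-in for OnHeapHnswGraph: accumulates the per-array estimates. */
class SketchGraph {
  // LongAccumulator tolerates concurrent updates from many threads.
  private final LongAccumulator neighborBytes = new LongAccumulator(Long::sum, 0L);

  // Package-private hook handed to each neighbor array.
  void updateEstimate(long deltaBytes) {
    neighborBytes.accumulate(deltaBytes);
    assert neighborBytes.get() >= 0 : "RAM estimate must not go negative";
  }

  SketchNeighborArray newNeighborArray(int initialCapacity) {
    return new SketchNeighborArray(initialCapacity, this::updateEstimate);
  }

  long ramBytesUsed() {
    return neighborBytes.get();
  }
}
```

   With this shape, the graph-level estimate is just the running sum of the 
deltas, so it follows the arrays as they actually grow instead of relying on 
an up-front guess.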
   
   Please also adjust the inner arrays to enforce their maximal length. This 
way we never over-allocate.
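
   One way to picture the capped growth, as a sketch only. `growCapped` is a 
hypothetical helper written for this example (it is not a Lucene API), and 
the 1.5x oversizing factor is an assumption.

```java
import java.util.Arrays;

final class CappedGrowth {
  private CappedGrowth() {}

  /** Returns an array with room for at least minLength entries, never longer than maxLength. */
  static int[] growCapped(int[] array, int minLength, int maxLength) {
    if (minLength > maxLength) {
      throw new IllegalArgumentException(
          "requested length " + minLength + " exceeds maximum " + maxLength);
    }
    if (array.length >= minLength) {
      return array; // already large enough, nothing to allocate
    }
    // Oversize by roughly 1.5x to amortize copies, then clamp to the hard maximum.
    long oversized = (long) minLength + (minLength >> 1);
    int newLength = (int) Math.min(oversized, maxLength);
    return Arrays.copyOf(array, newLength);
  }
}
```

   Because the new length is clamped, an array bounded by a known maximum 
number of neighbors can never be allocated past that bound, which also keeps 
the reported estimate from overshooting.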


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

