msokolov commented on issue #13565:
URL: https://github.com/apache/lucene/issues/13565#issuecomment-2301788507

   Hi, thanks for that @jpountz, no worries; this was something we all agreed 
on. I can continue with the "research" part of this by simply increasing the 
heap size, so it's not a blocker.
   
   At the same time I think we might want to reintroduce random-access vector 
readers as a first-class API for other reasons. Even the current case of 
merging multiple large segments containing vectors would be affected by this, 
wouldn't it? Since SortingCodecReader is used by IndexWriter when merging 
sorted indexes, it means that in that case all vector data of segments being 
merged is held in RAM, potentially requiring quite a lot of RAM when instead we 
could read from "disk" at the cost of some random accesses. I guess disk random 
accesses are generally to be avoided, but given that the alternative is to 
"page in" every vector page to the heap, I would think we would prefer to let 
the OS do the paging as usual for our index data.
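
   To make the trade-off concrete, here is a rough sketch of what "read from 
disk at the cost of some random accesses" could look like. It uses only plain 
`java.nio` and is not Lucene code: `MappedVectorReader`, `writeVectors`, and 
the `newToOld` permutation are hypothetical names made up for illustration, 
and the flat little-endian float layout is an assumption. The file is 
memory-mapped, so the OS pages data in on demand, and only the one vector 
being requested is copied onto the heap.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.FloatBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Arrays;

/** Hypothetical illustration only; not a Lucene class. */
final class MappedVectorReader {
    private final FloatBuffer floats; // view over the memory-mapped file
    private final int dimension;

    MappedVectorReader(Path file, int dimension) {
        this.dimension = dimension;
        FloatBuffer mapped;
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            // A single mapping is limited to 2 GB; a real reader would map in chunks.
            // The mapping remains valid after the channel is closed.
            mapped = ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size())
                    .order(ByteOrder.LITTLE_ENDIAN)
                    .asFloatBuffer();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        this.floats = mapped;
    }

    /** Number of vectors stored in the file. */
    int size() {
        return floats.capacity() / dimension;
    }

    /** Random access: copies only the requested vector onto the heap. */
    float[] vectorValue(int ord) {
        float[] v = new float[dimension];
        FloatBuffer view = floats.duplicate(); // independent position, shared content
        view.position(ord * dimension);
        view.get(v);
        return v;
    }

    /** Test helper: writes the vectors to a temp file in the assumed flat layout. */
    static Path writeVectors(float[][] vectors) {
        int dim = vectors[0].length;
        ByteBuffer buf = ByteBuffer.allocate(vectors.length * dim * Float.BYTES)
                .order(ByteOrder.LITTLE_ENDIAN);
        for (float[] v : vectors) {
            for (float f : v) {
                buf.putFloat(f);
            }
        }
        try {
            Path file = Files.createTempFile("vectors", ".bin");
            Files.write(file, buf.array());
            return file;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path file = writeVectors(new float[][] {{1f, 2f}, {3f, 4f}, {5f, 6f}});
        MappedVectorReader reader = new MappedVectorReader(file, 2);
        int[] newToOld = {2, 0, 1}; // hypothetical sort permutation: new ord -> old ord
        for (int oldOrd : newToOld) {
            // Vectors are visited in sorted order, so the reads jump around the file.
            System.out.println(Arrays.toString(reader.vectorValue(oldOrd)));
        }
    }
}
```

   The loop in `main` mimics a sorted merge: the access pattern is random, but 
heap usage stays at one vector at a time rather than growing with the total 
amount of vector data.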


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

