mayya-sharipova edited a comment on pull request #728: URL: https://github.com/apache/lucene/pull/728#issuecomment-1059202174
@msokolov Thanks a lot for your review.

> I'm not sure what unset means? I guess it goes to the default 16MB, but I assume you must be doing the same in the other test condition? Is there some difference in how the IndexWriter is configured between the two conditions? Or maybe I'm misunderstanding and you are allowing the entire set of vectors to buffer in RAM (in the baseline case)?

Sorry for not being clear on the methodology. Indeed, in both the baseline and test conditions `RAMBufferSize` is left at its default of 16 MB, and IndexWriter is configured identically. The reason `baseline` is slower is that, as vectors fill the RAM buffer, we trigger a flush and each flush builds an extra HNSW graph, only for those graphs to be discarded later during force merge. In the `test` case we build the final HNSW graph only once, at the end, because the RAM buffer never fills up.

> Actually I would like to understand the difference between that case, and buffering on disk. Do we pay any penalty for buffering on disk? how much?

That's a good question. I assume we pay some penalty for buffering on disk; I can measure how much.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

For additional commands, e-mail: issues-h...@lucene.apache.org