mayya-sharipova edited a comment on pull request #728:
URL: https://github.com/apache/lucene/pull/728#issuecomment-1059202174
@msokolov Thanks a lot for your review.
> I'm not sure what unset means? I guess it goes to the default 16MB, but I
assume you must be doing the same in the other test condition? Is there some
difference in how the IndexWriter is configured between the two conditions?
Or maybe I'm misunderstanding and you are allowing the entire set
of vectors to buffer in RAM (in the baseline case)?
Sorry for not being clear on the methodology. Indeed, in both the baseline and
test conditions `RAMBufferSize` is left at its default of 16 MB, and the
IndexWriter is configured identically.
The reason why `baseline` is slower is that, as vectors fill the RAM buffer,
we trigger flushes and build an extra HNSW graph on each flush, only to
abandon those graphs later during force merge. In the `test` case we build
the final HNSW graph only once at the end, as the RAM buffer never gets filled.
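
To make the baseline cost concrete, here is a rough back-of-the-envelope sketch of how many intermediate flushes (and so discarded graphs) a run might trigger. `FlushEstimate` and its methods are hypothetical helpers, not Lucene APIs, and the model assumes a buffered float vector costs only 4 bytes per dimension, ignoring graph memory and other per-document overhead, so real flush counts would likely be higher.

```java
// Hypothetical helper, not part of Lucene: a rough estimate of how many
// flushes the baseline triggers while indexing vectors.
class FlushEstimate {

    // Assumption: a buffered float vector costs 4 bytes per dimension,
    // ignoring HNSW graph memory and other per-document overhead.
    static long bytesPerVector(int dim) {
        return (long) dim * Float.BYTES;
    }

    // Number of times the RAM buffer fills completely, i.e. flushes that
    // each build an intermediate HNSW graph in the baseline.
    static long estimatedFlushes(long numVectors, int dim, double ramBufferMB) {
        long bufferBytes = (long) (ramBufferMB * 1024 * 1024);
        long vectorsPerFlush = Math.max(1, bufferBytes / bytesPerVector(dim));
        return numVectors / vectorsPerFlush;
    }

    public static void main(String[] args) {
        // Illustrative inputs only, not the benchmark's actual dataset.
        long flushes = estimatedFlushes(1_000_000L, 100, 16.0);
        System.out.println("estimated intermediate flushes: " + flushes);
    }
}
```

With the illustrative inputs above (1,000,000 vectors of 100 dimensions, 16 MB buffer), the model gives 23 full-buffer flushes in the baseline, versus a single graph build in the `test` case.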
> Actually I would like to understand the difference between that case, and
buffering on disk. Do we pay any penalty for buffering on disk? how much?
That's a good question. I assume we pay some penalty for buffering on disk;
I can measure how much.