jpountz commented on PR #11796: URL: https://github.com/apache/lucene/pull/11796#issuecomment-1259137841
> Interesting point.. Thinking how/when we'd like to track the impact of temp output files. From what I understand, they won't be a part of commit and fsync. So if we're trying to measure increased disk or remote store I/O, we probably want to skip them?

Indeed, temporary files are never part of a commit and are never fsynced, but this may also be the case for a number of flushed segments: if flushed segments get included in a merge before the next commit, then they would never be part of a commit or fsynced either.

> Although we delete the temp files right after, but on a small box, maybe it gives us a sense of increased file writes or page faults.

Some temporary files are also not necessarily short-lived, like the ones we create for stored fields when index sorting is enabled.

I'm considering exposing write amplification separately for flushes (as `flushedBytes / totalIndexSize`), merges (as `(totalIndexSize + mergedBytes) / totalIndexSize`) and temporary files (as `(totalIndexSize + tempBytes) / totalIndexSize`), and pushing the responsibility to users of whether and how they would like to combine these various metrics?
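
For illustration only, here is a minimal sketch of the three ratios as written above. The class name `WriteAmplification`, its method names, and the assumption that the byte counters are already available as `long` values are all hypothetical; this is not an existing Lucene API, and the PR may end up exposing these metrics differently.

```java
// Hypothetical helper, NOT part of Lucene: names and signatures are illustrative only.
// It assumes the caller has already collected the byte counters somehow.
final class WriteAmplification {
  private WriteAmplification() {}

  /** Flushes: flushedBytes / totalIndexSize. */
  static double forFlushes(long flushedBytes, long totalIndexSize) {
    checkIndexSize(totalIndexSize);
    return (double) flushedBytes / totalIndexSize;
  }

  /** Merges: (totalIndexSize + mergedBytes) / totalIndexSize. */
  static double forMerges(long mergedBytes, long totalIndexSize) {
    checkIndexSize(totalIndexSize);
    return (double) (totalIndexSize + mergedBytes) / totalIndexSize;
  }

  /** Temporary files: (totalIndexSize + tempBytes) / totalIndexSize. */
  static double forTempFiles(long tempBytes, long totalIndexSize) {
    checkIndexSize(totalIndexSize);
    return (double) (totalIndexSize + tempBytes) / totalIndexSize;
  }

  // The ratios are undefined for an empty index, so reject it explicitly.
  private static void checkIndexSize(long totalIndexSize) {
    if (totalIndexSize <= 0) {
      throw new IllegalArgumentException(
          "totalIndexSize must be positive, got " + totalIndexSize);
    }
  }
}
```

With a 100-byte index, `forFlushes(50, 100)` gives 0.5, `forMerges(300, 100)` gives 4.0 and `forTempFiles(25, 100)` gives 1.25. Keeping the three values separate, rather than summing them, leaves the choice of how to combine them with the caller.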