jpountz commented on PR #11796:
URL: https://github.com/apache/lucene/pull/11796#issuecomment-1259137841

   > Interesting point.. Thinking how/when we'd like to track the impact of 
temp output files. From what I understand, they won't be a part of commit and 
fsync. So if we're trying to measure increased disk or remote store I/O, we 
probably want to skip them?
   
   Indeed, temporary files are never part of a commit and are never fsynced, 
but this may also be the case for a number of flushed segments: if flushed 
segments get included in a merge before the next commit, then they never 
become part of a commit and are never fsynced either.
   
   > Although we delete the temp files right after, on a small box, maybe 
it gives us a sense of increased file writes or page faults.
   
   Also, some temporary files are not short-lived, like the ones we create 
for stored fields when index sorting is enabled.
   
   I'm considering exposing write amplification separately for flushes (as 
`flushedBytes / totalIndexSize`), merges (as `(totalIndexSize + mergedBytes) / 
totalIndexSize`) and temporary files (as `(totalIndexSize + tempBytes) / 
totalIndexSize`), and leaving it to users to decide whether and how they 
would like to combine these metrics.
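
   To make the three ratios concrete, here is a minimal sketch of how they 
could be computed from raw byte counters. The class `WriteAmplification` and 
its method names are hypothetical and not part of Lucene; only the formulas 
come from the proposal above, and the sample byte counts are made up.

```java
/** Hypothetical helper illustrating the three proposed ratios; not a Lucene API. */
final class WriteAmplification {
    private WriteAmplification() {}

    /** The ratios are undefined for an empty index, so reject non-positive sizes. */
    private static void checkIndexSize(long totalIndexSize) {
        if (totalIndexSize <= 0) {
            throw new IllegalArgumentException("totalIndexSize must be > 0, got " + totalIndexSize);
        }
    }

    /** flushedBytes / totalIndexSize */
    static double forFlushes(long flushedBytes, long totalIndexSize) {
        checkIndexSize(totalIndexSize);
        return (double) flushedBytes / totalIndexSize;
    }

    /** (totalIndexSize + mergedBytes) / totalIndexSize */
    static double forMerges(long mergedBytes, long totalIndexSize) {
        checkIndexSize(totalIndexSize);
        return (double) (totalIndexSize + mergedBytes) / totalIndexSize;
    }

    /** (totalIndexSize + tempBytes) / totalIndexSize */
    static double forTempFiles(long tempBytes, long totalIndexSize) {
        checkIndexSize(totalIndexSize);
        return (double) (totalIndexSize + tempBytes) / totalIndexSize;
    }

    public static void main(String[] args) {
        // Made-up sample counters, in bytes.
        long totalIndexSize = 1_000L;
        long flushedBytes = 800L;
        long mergedBytes = 2_500L;
        long tempBytes = 500L;

        System.out.println("flushes: " + forFlushes(flushedBytes, totalIndexSize));
        System.out.println("merges:  " + forMerges(mergedBytes, totalIndexSize));
        System.out.println("temp:    " + forTempFiles(tempBytes, totalIndexSize));
    }
}
```

   Keeping one method per source of writes mirrors the idea of reporting the 
metrics separately: a caller who wants a single number can combine them in 
whatever way fits their storage setup.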
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

