viliam-durina commented on issue #14334:
URL: https://github.com/apache/lucene/issues/14334#issuecomment-2704000723

   TL;DR: I think this issue is still relevant to Lucene today.
   
   Explanation:
   
   Quoting [from here](https://wiki.postgresql.org/wiki/Fsync_Errors):
> Linux 4.13 and 4.15: [fsync() only reports writeback errors that occurred after you called open()](https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=088737f44bbf6378745f5b57b035e57ee3dc4750) so our schemes for closing and opening files LRU-style and handing fsync() work off to the checkpointer process can hide write-back errors; also buffers are marked clean after errors so even if you opened the file before the failure, retrying fsync() can falsely report success and the modified buffer can be thrown away at any time due to memory pressure.
   
   The [current man page for `fsync`](https://www.man7.org/linux/man-pages/man2/fsync.2.html) says:
   > fsync() transfers ("flushes") all modified in-core data of (i.e., modified buffer cache pages for) the file referred to by the file descriptor fd to the disk device
   
   And in the ERRORS section:
   > EIO    An error occurred during synchronization.  This error may
   >        relate to data written to some other file descriptor on the
   >        same file.  Since Linux 4.13, errors from write-back will
   >        be reported to all file descriptors that might have written
   >        the data which triggered the error.  Some filesystems
   >        (e.g., NFS) keep close track of which data came through
   >        which file descriptor, and give more precise reporting.
   >        Other filesystems (e.g., most local filesystems) will
   >        report errors to all file descriptors that were open on the
   >        file when the error was recorded.
   
   My understanding is this: if `fsync` succeeds, all dirty pages have been durably stored. If there were write-back errors for writes done through other file descriptors, those errors MAY (but are not required to) be reported when fsync-ing another descriptor. Lucene, however, relies on write-back errors being reported when fsync-ing a file descriptor opened *after* the failure, and that guarantee does not exist.
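   To make the unsafe pattern concrete, here is a minimal Java sketch (file name `segment.dat` is hypothetical, chosen only for illustration) of writing through one descriptor, closing it without fsync, and then fsync-ing through a descriptor opened later:

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class FsyncVisibility {
    public static void main(String[] args) throws IOException {
        Path file = Path.of("segment.dat"); // hypothetical file name

        // Descriptor A: dirty some pages, then close WITHOUT fsync.
        try (FileChannel a = FileChannel.open(file,
                StandardOpenOption.CREATE, StandardOpenOption.WRITE)) {
            a.write(ByteBuffer.wrap(new byte[] {1, 2, 3}));
        } // if write-back fails after this close, the error can be lost

        // Descriptor B: opened later. Per the man page quoted above, an
        // earlier write-back failure is only guaranteed to be reported to
        // descriptors that were open when the error was recorded (on most
        // local filesystems), so this force() can succeed even though the
        // data written through A never reached the disk.
        try (FileChannel b = FileChannel.open(file, StandardOpenOption.WRITE)) {
            b.force(true);
        }
    }
}
```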
   
   Intuitively, for how long would the OS need to keep reporting such a failure? Indefinitely? Then once a write-back failed (e.g. due to insufficient space), you would get errors for that file forever, even after space becomes available again. The OS is free to reclaim the failed pages from the cache at any time; it does not have to keep them around for a later open-and-fsync attempt to report them. If you close a file without fsync-ing it first, you are giving up the guarantee of being told about any errors.
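   The safe pattern, then, is to fsync on the same descriptor that wrote the data, before closing it, so any write-back error is reported to us as an exception. A minimal sketch (the helper name and file are my own, for illustration):

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class SafeFsync {
    // Write data and fsync on the SAME descriptor before closing it,
    // so a write-back failure surfaces here as an IOException instead
    // of being silently dropped after close().
    static void writeDurably(Path file, byte[] data) throws IOException {
        try (FileChannel ch = FileChannel.open(file,
                StandardOpenOption.CREATE, StandardOpenOption.WRITE)) {
            ch.write(ByteBuffer.wrap(data));
            ch.force(true); // throws IOException if write-back failed
        }
        // Only now may we treat the data as durable (the new directory
        // entry still needs its own fsync on the parent directory).
    }

    public static void main(String[] args) throws IOException {
        writeDurably(Path.of("safe.dat"), new byte[] {4, 5, 6});
    }
}
```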
   
   All in all, I think there's an issue here.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

