Re: [I] Incorrect use of fsync [lucene]

2025-04-05 Thread via GitHub
rmuir commented on issue #14334: URL: https://github.com/apache/lucene/issues/14334#issuecomment-2772221194 Nobody needs to fsync any temporary files, ever. They are temporary: we don't need them durable. Look at how lucene uses temporary files to understand this. We don't need suc

Re: [I] Incorrect use of fsync [lucene]

2025-04-04 Thread via GitHub
viliam-durina commented on issue #14334: URL: https://github.com/apache/lucene/issues/14334#issuecomment-2771963385 >Please stop arguing here about problems that don't exist. Issue https://github.com/apache/lucene/issues/10906 has nothing to do with temporary files. This issue is not

Re: [I] Incorrect use of fsync [lucene]

2025-04-04 Thread via GitHub
viliam-durina commented on issue #14334: URL: https://github.com/apache/lucene/issues/14334#issuecomment-2771736474 > every application everywhere and every time would need to call fsync on close Yeah, I don't know what the use case would be for not fsyncing before closing. Maybe if you

Re: [I] Incorrect use of fsync [lucene]

2025-04-04 Thread via GitHub
viliam-durina commented on issue #14334: URL: https://github.com/apache/lucene/issues/14334#issuecomment-2770735674 Without fsync, there's no guarantee that anything you wrote was written. The OS delays the writes. If you close, the data is still in memory, the OS later tries to write and f

Re: [I] Incorrect use of fsync [lucene]

2025-04-02 Thread via GitHub
viliam-durina commented on issue #14334: URL: https://github.com/apache/lucene/issues/14334#issuecomment-2772476115 >There has to be a certain trust in what the operating system provides and its consistency guarantees Sure, but the OS doesn't guarantee anything, if you don't fsync. I'

Re: [I] Incorrect use of fsync [lucene]

2025-04-02 Thread via GitHub
uschindler commented on issue #14334: URL: https://github.com/apache/lucene/issues/14334#issuecomment-2771774432 > For temporary files, we should either fsync before closing, or start reading without closing the file. This shows that you have no idea about how Lucene works internally

Re: [I] Incorrect use of fsync [lucene]

2025-04-02 Thread via GitHub
dweiss commented on issue #14334: URL: https://github.com/apache/lucene/issues/14334#issuecomment-2772129998 There has to be a certain trust in what the operating system provides and its consistency guarantees. What you describe seems like a fringe case that - even if possible - falls under

Re: [I] Incorrect use of fsync [lucene]

2025-04-02 Thread via GitHub
uschindler commented on issue #14334: URL: https://github.com/apache/lucene/issues/14334#issuecomment-2771766515 If a file is incomplete, the commit will fail. Lucene's file formats are designed so that corruption can be found early (this includes checksums). So zero-byte temp files or
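
To illustrate the point about checksums catching incomplete files early, here is a minimal sketch of a checksum footer. It is not Lucene's actual file format or API: the class `ChecksumFooter` and its methods are hypothetical, and the layout (payload followed by an 8-byte CRC32 value) is an assumption made for the example.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.CRC32;

// Hypothetical illustration only; not Lucene's on-disk format.
public class ChecksumFooter {

    /** Returns payload followed by an 8-byte footer holding the CRC32 of the payload. */
    static byte[] withFooter(byte[] payload) {
        CRC32 crc = new CRC32();
        crc.update(payload, 0, payload.length);
        return ByteBuffer.allocate(payload.length + Long.BYTES)
                .put(payload)
                .putLong(crc.getValue())
                .array();
    }

    /** Recomputes the CRC32 over the payload part and compares it with the stored footer. */
    static boolean verify(byte[] file) {
        if (file.length < Long.BYTES) {
            return false; // zero-byte or badly truncated file: no room for a footer
        }
        int payloadLength = file.length - Long.BYTES;
        CRC32 crc = new CRC32();
        crc.update(file, 0, payloadLength);
        long stored = ByteBuffer.wrap(file, payloadLength, Long.BYTES).getLong();
        return stored == crc.getValue();
    }

    public static void main(String[] args) {
        byte[] file = withFooter("hello".getBytes(StandardCharsets.UTF_8));
        System.out.println("intact:    " + verify(file));
        System.out.println("truncated: " + verify(Arrays.copyOf(file, file.length - 1)));
        System.out.println("empty:     " + verify(new byte[0]));
    }
}
```

With a footer like this, a reader that reopens a file which was silently truncated or zero-filled fails verification instead of consuming bad data, whether or not the writer called fsync.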

Re: [I] Incorrect use of fsync [lucene]

2025-04-02 Thread via GitHub
viliam-durina commented on issue #14334: URL: https://github.com/apache/lucene/issues/14334#issuecomment-2771546918 There was no crash, that's the problem. You wrote a file, then opened it again for reading and it's corrupted, and there was no IO error reported. As I said, if you clo

Re: [I] Incorrect use of fsync [lucene]

2025-04-02 Thread via GitHub
dweiss commented on issue #14334: URL: https://github.com/apache/lucene/issues/14334#issuecomment-2771582280 Ok, sorry but the scenario you're describing is insane to me. If something like this happens, I don't think it's Lucene's duty to try to correct it - it seems like the entire system

Re: [I] Incorrect use of fsync [lucene]

2025-04-02 Thread via GitHub
uschindler commented on issue #14334: URL: https://github.com/apache/lucene/issues/14334#issuecomment-2771580012 > There was no crash, that's the problem. You wrote a file, then opened it again for reading and it's corrupted, and there was no IO error reported. > > As I said, if you c

Re: [I] Incorrect use of fsync [lucene]

2025-04-01 Thread via GitHub
dweiss commented on issue #14334: URL: https://github.com/apache/lucene/issues/14334#issuecomment-2771397718 But why would you want to read a temporary file after a crash? These are... temporary - if a process crashed, there is no recovery at all (at least concerning temporary files). --

Re: [I] Incorrect use of fsync [lucene]

2025-04-01 Thread via GitHub
viliam-durina commented on issue #14334: URL: https://github.com/apache/lucene/issues/14334#issuecomment-2768802025 I think we must fsync also the temporary files. Without fsyncing, when we read them back, they might be incomplete and no error might be reported. We could perhaps avoid fsync

Re: [I] Incorrect use of fsync [lucene]

2025-03-12 Thread via GitHub
rmuir commented on issue #14334: URL: https://github.com/apache/lucene/issues/14334#issuecomment-2717869141 I also want to point out here that the current usage is not "incorrect". The idea that there is a "correct" way that will always work is 100% broken. Look at what fsync() does on m

Re: [I] Incorrect use of fsync [lucene]

2025-03-12 Thread via GitHub
rmuir commented on issue #14334: URL: https://github.com/apache/lucene/issues/14334#issuecomment-2717565686 > we still need the fsync on the parent directory to persist the file metadata on Linux Blows a giant hole in your argument, that it is ok to write to this file and separately

Re: [I] Incorrect use of fsync [lucene]

2025-03-12 Thread via GitHub
viliam-durina commented on issue #14334: URL: https://github.com/apache/lucene/issues/14334#issuecomment-2716978401 > personally I think we should just simply fsync the files before we close them: nothing more fancy than that. If we rely on the file being ever durably stored, then the

Re: [I] Incorrect use of fsync [lucene]

2025-03-06 Thread via GitHub
msokolov commented on issue #14334: URL: https://github.com/apache/lucene/issues/14334#issuecomment-2703792662 Interesting: I would note that info is from ~8 years ago; I wonder if it is still valid. Also, if you sync() and then close() isn't there a risk there may be intervening writes tha

Re: [I] Incorrect use of fsync [lucene]

2025-03-06 Thread via GitHub
viliam-durina commented on issue #14334: URL: https://github.com/apache/lucene/issues/14334#issuecomment-2704000723 TL;DR: I think this issue is still relevant to Lucene today. Explanation: Quoting [from here](https://wiki.postgresql.org/wiki/Fsync_Errors): > Linux 4.13 and 4

[I] Incorrect use of fsync [lucene]

2025-03-06 Thread via GitHub
viliam-durina opened a new issue, #14334: URL: https://github.com/apache/lucene/issues/14334 ### Description According to [this answer](https://stackoverflow.com/a/50158433/952135), calling `fsync` after the file descriptor is closed gives no guarantees about what's persistent on dis
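
The pattern debated in this thread (fsync on the still-open descriptor, then fsync on the parent directory) can be sketched in plain Java NIO as follows. This is a minimal sketch under stated assumptions, not Lucene's implementation: `DurableWrite`, `writeDurably`, and `syncDirectory` are hypothetical names, and the directory sync relies on the platform allowing a directory to be opened for reading, which is assumed to hold on Linux and macOS but not on Windows.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical illustration only; not Lucene code.
public class DurableWrite {

    /** Writes data and calls force() before the channel is closed. */
    static void writeDurably(Path file, byte[] data) throws IOException {
        try (FileChannel ch = FileChannel.open(file,
                StandardOpenOption.CREATE,
                StandardOpenOption.WRITE,
                StandardOpenOption.TRUNCATE_EXISTING)) {
            ByteBuffer buf = ByteBuffer.wrap(data);
            while (buf.hasRemaining()) {
                ch.write(buf);
            }
            // fsync while the descriptor is still open; if the OS reports a
            // writeback failure, it is thrown here as an IOException.
            ch.force(true);
        }
        syncDirectory(file.toAbsolutePath().getParent());
    }

    /** Best-effort fsync of the directory so the new directory entry is persisted. */
    static void syncDirectory(Path dir) throws IOException {
        try (FileChannel ch = FileChannel.open(dir, StandardOpenOption.READ)) {
            ch.force(true);
        } catch (IOException e) {
            // Assumption: Windows cannot open a directory this way, so the
            // failure is ignored there and rethrown everywhere else.
            if (!System.getProperty("os.name").startsWith("Windows")) {
                throw e;
            }
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("durable-write");
        Path file = dir.resolve("segment.bin");
        writeDurably(file, "hello".getBytes(StandardCharsets.UTF_8));
        System.out.println("wrote " + Files.size(file) + " bytes to " + file);
    }
}
```

Whether temporary files need the same treatment is exactly what the replies above dispute; the sketch only shows the ordering the issue asks for on files that must be durable.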