This patch is relatively aggressive in removing FileReader::readCached() and all associated infrastructure entirely. I'm not yet certain that's the best, but in the absence of enough ongoing development to support something evidently flawed.
Since non-Windows doesn't support readDirect() to avoid polluting the disk cache, this ensures hashing will probably evict disk cache on Linux and other platforms. I'm not sure the benchmarks you've shown address that side effect. The concern isn't slowing down AirDC++ or DC++, it's slowing down everything else, as they have to repopulate disk cache. To ameliorate this on Linux, one can use O_DIRECT (https://linux.die.net/man/2/open says "Since Linux 2.4.10", which seems sufficient). FreeBSD 4.x, released in 2000, introduced this flag, which means DragonFly BSD would have inherited it. https://man.netbsd.org/NetBSD-9.0/open.2 documents that NetBSD supports it (with an alignment restriction) https://stackoverflow.com/questions/55281353/use-of-undeclared- identifier-o-direct suggests that this is trickier on macOS. Apparently there's an F_NOCACHE flag which accomplishes a similar goal. https://man.openbsd.org/amd64/open shows OpenBSD not supporting O_DIRECT. It's not clear to me whether it supports any alternate mechanism. So for all the typical POSIXy systems except OpenBSD, the safer approach looks like removing readMapped() and, in its place, using those per-OS direct I/O hints. The attached patch doesn't do the latter part, but only the former. ** Patch added: "rm_readmapped.patch" https://bugs.launchpad.net/dcplusplus/+bug/1909861/+attachment/5448723/+files/rm_readmapped.patch -- You received this bug notification because you are a member of Dcplusplus-team, which is subscribed to DC++. https://bugs.launchpad.net/bugs/1909861 Title: FileReader is not thread safe on Linux Status in DC++: New Bug description: FileReader::readMapped currently modifies the global SIGBUS handler in order to catch read errors: https://sourceforge.net/p/dcplusplus/code/ci/default/tree/dcpp/FileReader.cpp#l289 Since the function can be called concurrently from different threads (currently hashing/queue recheck/sfv check in DC++) and each of them sets and resets the SIGBUS handler, there's a high risk that the application will crash in case of read errors as they aren't being handler properly. More information about the caveats: https://www.sublimetext.com/blog/articles/use-mmap-with-care These issues are much more likely to happen with AirDC++ as it uses multiple threads for hashing. Read errors caused rather nasty crashes with corrupted stack traces for one user, but luckily he was able to catch the SIGBUS signal with gdb. I didn't even spend time in trying to figure out how to make the mapped reads work properly, as based on my testing the basic FileReader::readCached function is noticeably faster: readMapped: 671 files (21.70 GiB) in 9 directories have been hashed in 4 minutes 21 seconds (84.87 MiB/s) readCached: 671 files (21.70 GiB) in 9 directories have been hashed in 3 minutes 58 seconds (93.08 MiB/s) FileReader::readMapped is now disabled in AirDC++, as I can't see any benefits from using it. The included setjmp.h header is even causing issues when using clang for compiling on Linux: https://bugs.gentoo.org/731676 To manage notifications about this bug go to: https://bugs.launchpad.net/dcplusplus/+bug/1909861/+subscriptions _______________________________________________ Mailing list: https://launchpad.net/~linuxdcpp-team Post to : linuxdcpp-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~linuxdcpp-team More help : https://help.launchpad.net/ListHelp