Control: reopen -1

On 2022-10-14 02:26:20 +0200, Guillem Jover wrote:
> All these fsync()s you see in rapid succession are used as a
> synchronization points, way after the data has been requested to be
> synced to disk asynchronously via sync_file_range().

I don't know why, but what strace shows is fsync(), not
sync_file_range(). See the strace output at

  https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=923423#44

So this is not done "asynchronously".

> What this is trying to achieve is durability, so that dpkg can know
> the data is on the disk, so that it can mark the package as installed.

I agree that there should be a sync at the end (at least). But here,
there seems to be one for *every* one of the 92000 files!

> This is explained on the dpkg FAQ:
>
>   <https://wiki.debian.org/Teams/Dpkg/FAQ#Q:_Why_is_dpkg_so_slow_when_using_new_filesystems_such_as_btrfs_or_ext4.3F>
> [...]
> Most programs do not seem concerned about making sure the data is
> stored safely on disk, I'm afraid.
>
> In any case, I don't think there's anything else for dpkg to do here.
> Please see the FAQ entry. I'm thus closing this now.

According to the discussion at bug 1021750, and in particular

  https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1021750#45

this FAQ is wrong (concerning both performance and data safety).
If you disagree, please comment there.

Since the closure was based on incorrect assumptions (a wrong FAQ, and
syncs that are not asynchronous) and synchronizations could be less
frequent without making data storage less safe[*], I'm reopening the
bug.

[*] On the contrary, I would tend to think that such frequent
synchronizations yield more writes to the disk, thus more stress on
the hardware, which could make data storage less safe.

-- 
Vincent Lefèvre <vinc...@vinc17.net> - Web: <https://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)
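
For illustration only, here is a minimal Python sketch of the two strategies contrasted above: one fsync() per file versus a single sync once everything has been written. It is not dpkg's code. The function names are hypothetical, and os.sync() is only a stand-in for "one sync at the end": it flushes every filesystem rather than just these files, and the sketch does not fsync the containing directory.

```python
import os
import tempfile


def write_with_fsync_each(directory, payloads):
    """Write each file and fsync() it immediately (one sync per file).

    Returns the number of fsync() calls issued.
    """
    calls = 0
    for name, data in payloads.items():
        with open(os.path.join(directory, name), "wb") as f:
            f.write(data)
            f.flush()              # push Python's buffer to the kernel
            os.fsync(f.fileno())   # block until this file is on disk
            calls += 1
    return calls


def write_then_sync_once(directory, payloads):
    """Write all files first, then issue a single sync at the end.

    Assumption: os.sync() stands in for the final synchronization point;
    it is system-wide, not limited to the files written here.
    Returns the number of sync calls issued (always 1).
    """
    for name, data in payloads.items():
        with open(os.path.join(directory, name), "wb") as f:
            f.write(data)
    os.sync()
    return 1


if __name__ == "__main__":
    files = {f"file{i}": b"data" for i in range(100)}
    with tempfile.TemporaryDirectory() as d:
        print("per-file strategy, sync calls:", write_with_fsync_each(d, files))
        print("single-sync strategy, sync calls:", write_then_sync_once(d, files))
```

With 92000 files, the first strategy issues 92000 blocking calls, while the second issues one. Whether the second gives the same guarantees in every failure scenario is exactly what is being discussed in bug 1021750.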