Hi!

On Wed, 2010-10-20 at 16:11:05 -0500, Chanoch (Ken) Bloom wrote:
> > 1. On ext4 with certain mount options, using rename() without first
> >    calling fsync() to get the data on disk has an unfortunate risk of
> >    clearing out a file[1].
> 
> This issue was current at the beginning of 2010, around the time Bug
> #567089 was filed and discussed. It's been fixed in the kernel since
> then. See http://lwn.net/Articles/322823/, and
> http://lwn.net/Articles/326471/
> 
> Does it still affect the shipping Debian kernel?

Some of the problems might have been patched over, but AFAIK the issue
still affects the latest upstream kernels:

  <https://bugzilla.kernel.org/show_bug.cgi?id=15910>

> > 2. On ext4 with certain mount options, using fsync() instead of sync()
> >    to sync a collection of newly installed files is unacceptably
> >    slow[2].
> 
> The problem here was "data=ordered". ext3 also suffered from this
> problem, since its default was "data=ordered".
> In brief, ONE fsync() call cost about as much as ONE sync() call.
> The solution was "don't use data=ordered" (and Linus patched the
> kernel to change the default) then fsync() will be suitably faster.
> 
> The bug you cite here was also around April/May when this problem was
> being sorted out by the Linux kernel community.

AFAIR, the benchmarks run while fixing the fsync() slowdown bug in dpkg
showed that ext3 didn't suffer a significant slowdown, while ext4 did.

The biggest problem with using sync() is that it affects *all* mount
points, not just the one where the file might be stored. So background
I/O might cause way more load than necessary.

> > 3. sync() obviously does way more than we want it to, since it
> >    touches files and filesystems that have nothing to do with
> >    dpkg’s work.
> > 
> > So what should we do?  Dear kernel, we will happily provide a list
> > of files we want to be renamed in place.  Can you make sure they
> > have the right data without _repeatedly_ incurring the penalty of
> > fsync()?
> 
> Is a solution of "mount your hard drive in a way that fsync() doesn't
> hurt" a good solution? I think that was the upstream kernel
> developers' decision on how to handle this.

Well, fsync() is the correct solution for this problem. If the file
system cannot handle it, then I'd say the file system is the problem.
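
For illustration, here is a minimal sketch of the write, fsync(), rename()
pattern being discussed, in Python. This is not dpkg's actual code; the
function name atomic_replace and the ".tmp-" prefix are made up for the
example, and the final directory fsync() is an extra step assumed here
to make the rename itself durable.

```python
import os
import tempfile


def atomic_replace(path, data):
    """Replace `path` with `data` (bytes) via temp file, fsync(), rename().

    Hypothetical helper for illustration only.
    """
    dirname = os.path.dirname(os.path.abspath(path))
    # Create the temporary file in the target directory so that the
    # rename() stays within a single file system.
    fd, tmp_path = tempfile.mkstemp(dir=dirname, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()             # push Python's buffer to the kernel
            os.fsync(tmp.fileno())  # ask the kernel to put the data on disk
        # Only after the data has been synced, replace the old file.
        os.rename(tmp_path, path)
    except BaseException:
        # Best-effort cleanup of the temporary file on failure.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
    # Assumption: also sync the directory so the rename is recorded.
    dir_fd = os.open(dirname, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

Skipping the os.fsync() call on the temporary file is what exposes the
risk described in point 1 above: the rename may reach the disk before
the data does.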

> If not, maybe postponing sync() calls further is the solution.
> I.e. instead of doing it after every package, do it every 10 packages,
> or just do it once at the end of an apt-get dist-upgrade.

Postponing fsync() or sync() calls gives the same guarantees as not
doing them at all in the presence of an abrupt system crash/shutdown.

regards,
guillem


