-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

[adding bug-gnulib]

According to Eric Blake on 12/15/2009 7:48 PM:
> According to John Stanley on 12/15/2009 4:42 PM:
>> Basically, what's happening is that 'touch -a ..' updated ctime in
>> coreutils-7.6,
>> but does not update ctime in coreutils-8.2 (hence misc/ls-time fails).
>
> Ouch.  That's a bug in the kernel; I can reproduce it:
>
> $ uname -a
> Linux fencepost 2.6.26-2-xen-amd64 #1 SMP Thu Nov 5 04:27:12 UTC 2009
> x86_64 GNU/Linux
> $ touch q
> $ stat -c '%x %z' q
> 2009-12-15 21:46:33.186677568 -0500 2009-12-15 21:46:33.186677568 -0500
> $ touch -a q
> $ stat -c '%x %z' q
> 2009-12-15 21:47:15.157175384 -0500 2009-12-15 21:46:33.186677568 -0500
> $

According to strace, coreutils 6.10 used syscall_280 (which I'm assuming
is utimensat, and that strace is just behind the times compared to the
kernel); ltrace says it was via:

futimesat(0, 0, 0x7fff0568c900, 0, 3) = 0

The newer coreutils likewise uses syscall_280, but via:

futimens(0, 0x7fff5b31a450, 0x60ebd0, 0x7fff5b31a450, 3) = 0

By comparing the results of 'touch f' and 'touch -a f', it appears that
the kernel ctime bug is only triggered when UTIME_OMIT is passed as one
of the two timestamps (which is only possible via futimens/utimensat,
not futimesat).  That is consistent with the fact that coreutils didn't
use UTIME_OMIT until coreutils 8.1.  It also means that I can probably
devise a way to work around the bug in gnulib while we wait for the
kernel folks to fix their bug.

However, there's a question of the minimal number of syscalls needed to
work around the problem.  It may be that UTIME_NOW also has an impact.

My current idea:

Keep a cache variable that shows whether UTIME_OMIT works (0=unknown,
1=yes, -1=no).  If the variable is -1, then treat UTIME_OMIT the same
way as we do for futimesat (that is, call stat()/gettime() to populate
the struct timespec prior to making the syscall).  If the variable is 1,
then the kernel has been fixed.
If the variable is 0, then perform [f]stat both before and after the
utimensat call; if the two ctimes differ, set the cache variable to 1
and we're done.  Otherwise, ctime didn't change, so also call gettime().
If gettime is within 10 ms of the second stat, the results are
inconclusive (given that we have proven that some filesystems have a
quantization boundary of 10 ms, where multiple actions within that
window all end up with the same timestamp), so leave the cache at 0, but
re-call utimensat anyway with the times learned by stat/gettime().
Otherwise, the current time and the second ctime differ by more than
10 ms, so utimensat UTIME_OMIT is broken; set the cache to -1, and fix
the problem by re-calling utimensat with the times learned by
stat/gettime().

Sounds quite hairy.  Any ideas for improvements?  And how best to report
this bug to the kernel folks?

- --
Don't work too hard, make some time for fun as well!

Eric Blake             e...@byu.net
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Public key at home.comcast.net/~ericblake/eblake.gpg
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAksoXS0ACgkQ84KuGfSFAYAQzACdGVTRw4Pt/CspbvpJkGUd2Fq1
vxEAnjUrLX3d2UkCi8q1Okq3H/gvGXml
=mmqQ
-----END PGP SIGNATURE-----
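
The decision step of the cache-variable idea in the message above can be
sketched in C as a pure function.  This is only an illustration under
stated assumptions, not gnulib's code: the names omit_state,
classify_probe, needs_retry and QUANTUM_NS are hypothetical, exactly
10 ms is treated as still "within" the quantization window, and the
caller is assumed to have already collected the ctime before and after
the utimensat call plus the current time.

```c
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical cache values, mirroring the message:
   0 = unknown, 1 = UTIME_OMIT works, -1 = UTIME_OMIT is broken. */
enum omit_state { OMIT_BROKEN = -1, OMIT_UNKNOWN = 0, OMIT_WORKS = 1 };

/* Assumed quantization window of 10 ms, expressed in nanoseconds. */
#define QUANTUM_NS 10000000LL

/* Convert a timespec to nanoseconds since the epoch. */
int64_t ts_to_ns(struct timespec t)
{
    assert(t.tv_nsec >= 0 && t.tv_nsec < 1000000000L);
    return (int64_t)t.tv_sec * 1000000000LL + t.tv_nsec;
}

/* Classify one probe made while the cache is still "unknown".
   ctime_before / ctime_after: st_ctim from [f]stat around the
   utimensat call that passed UTIME_OMIT; now: the result of gettime(). */
enum omit_state classify_probe(struct timespec ctime_before,
                               struct timespec ctime_after,
                               struct timespec now)
{
    int64_t before = ts_to_ns(ctime_before);
    int64_t after = ts_to_ns(ctime_after);
    int64_t diff = ts_to_ns(now) - after;

    if (before != after)
        return OMIT_WORKS;      /* ctime moved: the kernel behaves */
    if (diff < 0)
        diff = -diff;
    if (diff <= QUANTUM_NS)
        return OMIT_UNKNOWN;    /* inside the window: inconclusive */
    return OMIT_BROKEN;         /* ctime is stale by more than 10 ms */
}

/* In both the inconclusive and the broken case the message re-calls
   utimensat with explicit times learned from stat/gettime(). */
int needs_retry(enum omit_state result)
{
    return result != OMIT_WORKS;
}
```

A caller would run the probe only while the cache is 0, store the result
only when it is 1 or -1, and re-issue utimensat with explicit timestamps
whenever needs_retry() is nonzero.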