Paul Eggert wrote: > Possibly a bug in the file system or kernel? > An earlier part of the code does this: > > /* lines 57 and 58 of test-utimens.h */ > ASSERT (func (BASE "file", ts) == 0); > ASSERT (stat (BASE "file", &st1) == 0); > > and this relies on st_ctime being marked for update in the first > line, and being actually updated before the second line takes effect. > This is pretty easy to get wrong, and many file systems do get similar > things wrong, particularly under high load. If my guess is right, > it should be easy to verify by instrumenting those lines. > > There is a similar issue here: > > /* lines 98 and 99 (/ > ASSERT (func (BASE "file", ts) == 0); > ASSERT (stat (BASE "file", &st2) == 0); > > Of course this is just a long-distance guess....
I think it comes down to either a bug, or that somehow "nap()" is not sleeping long enough. Details on how this file system is mounted: (it's SSD-backed in a rawhide guest, 3.2.0-0.rc1.git4.1.fc17.x86_64) $ grep vda4 /proc/mounts /dev/vda4 / ext4 rw,seclabel,relatime,user_xattr,barrier=1,data=ordered 0 0 There seem to be at least three points at which I'm seeing failed assertions. I caught this by changing the abort() in macros.h's ASSERT definition to be sleep(9999999) and by running parallel "make check" jobs until one hung. Then attached with gdb: #2 0x0000000000403c28 in test_lutimens (func=0x401e10 <do_lutimens1>, print=false) at test-lutimens.h:66 (gdb) l 61 } 62 ASSERT (stat (BASE "file", &st2) == 0); 63 ASSERT (st2.st_atime == Y2K); 64 ASSERT (st2.st_mtime == Y2K); 65 if (check_ctime) 66 ASSERT (st1.st_ctime < st2.st_ctime 67 || (st1.st_ctime == st2.st_ctime 68 && get_stat_ctime_ns (&st1) < get_stat_ctime_ns (&st2))); 69 70 /* Play with symlink timestamps. */ (gdb) p st1.st_ctim $1 = { tv_sec = 1321473919, tv_nsec = 914482118 } (gdb) p st2.st_ctim $2 = { tv_sec = 1321473919, tv_nsec = 914482118 } Equal structs. Definitely indicates failure, somewhere. ====================================================================== Trying again, I hit a sleep on the 8th trial. Note, it's in a different file and with different var names, but the same condition: unchanged ctime. This time, I was using a 40-millisecond delay, just in case 20 was close but not enough. #2 0x0000000000402ad3 in test_utimens (func=0x401e50 <do_utimens>, print=false) at test-utimens.h:135 135 ASSERT (st2.st_ctime < st3.st_ctime (gdb) p st2.st_ctim $1 = { tv_sec = 1321476466, tv_nsec = 27092765 } (gdb) p st3.st_ctim $2 = { tv_sec = 1321476466, tv_nsec = 27092765 } ====================================================================== I did it one more time: #2 0x0000000000405153 in test_futimens (print=<optimized out>, func=0x402930 <do_futimens>) at test-futimens.h:148 148 ASSERT (st2.st_ctime < st3.st_ctime (gdb) p st2.st_ctim $1 = { tv_sec = 1321478037, tv_nsec = 577808187 } (gdb) p st3.st_ctim $2 = { tv_sec = 1321478037, tv_nsec = 577808187 } Then examined the ctime of the actual file: $ stat --fo=%.Z test-utimens.tfile 1321478037.577808187 Yes, it matches. This was also with the 40-millisecond delay. This time it took 65 iterations. Tomorrow I'll write a better test. I suspect that "make check" is not exercising it thoroughly enough.