On Thu, Nov 23, 2000 at 07:13:51PM -0500, Roland McGrath wrote:
> Please try to get a stack trace from the assertion failure. You could
> attach gdb in noninvasive mode and hit it, or you could just hack the code
> to use glibc's backtrace (execinfo.h) function and print it out rather than
> using assert.
I put in a sleep(3600), which gave me enough time to attach gdb when it hit
the bug.
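
For reference, the two approaches (printing a backtrace and sleeping so that
gdb can be attached) can be combined. What follows is only a minimal sketch,
not the code I actually used: dump_backtrace and CHECK_OR_WAIT are
hypothetical names made up for illustration, and only backtrace and
backtrace_symbols_fd come from glibc's execinfo.h.

```c
#include <execinfo.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper (not part of the Hurd sources): print the current
   call stack to stderr and return the number of frames captured.  */
static int
dump_backtrace (void)
{
  void *frames[64];
  int n = backtrace (frames, 64);

  /* backtrace_symbols_fd writes directly to the descriptor, so it does
     not need to allocate memory for the symbol strings.  */
  backtrace_symbols_fd (frames, n, STDERR_FILENO);
  return n;
}

/* Hypothetical stand-in for assert: when COND is false, report the
   location and pid, dump the stack, then sleep for SECONDS so that a
   debugger can be attached before the process continues.  */
#define CHECK_OR_WAIT(cond, seconds)                                   \
  do                                                                   \
    {                                                                  \
      if (!(cond))                                                     \
        {                                                              \
          fprintf (stderr, "%s:%d: check failed: %s (pid %d)\n",       \
                   __FILE__, __LINE__, #cond, (int) getpid ());        \
          dump_backtrace ();                                           \
          sleep (seconds);                                             \
        }                                                              \
    }                                                                  \
  while (0)
```

Something like CHECK_OR_WAIT (condition, 3600) in place of the assert would
then print the stack and leave an hour to attach gdb to the printed pid.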
The complete stack trace and data contents are attached as a script session
(does gdb have a way to write its output to a file?). I have not gotten very
far interpreting what I see, but there are a couple of things that are easy
to get from it:
* The assertion failure (and thus the write_node) happens during a normal
sync process.
* The inode which has such time flags set is the inode temacs is dumping the
emacs binary file to. The flags set are dn_set_mtime and dn_set_ctime (while
dn_set_atime is cleared).
* The file size is the final file size, so the file is completely written by
that time.
* Maybe important: The file emacs is hardlinked to emacs-20.7 as well!
(The hard link or the creation of it might be relevant). I don't know if it
is first written and then hard linked, or first hard linked and then written,
or if this matters.
Random thoughts:
* The node->lock is held, which should probably prevent syncing?
* Not all parts of the system which set some dn_set_?time flag call
diskfs_node_update afterwards (for example, write_symlink). I don't know
if this is a requirement and a possible point of failure. If nodes can be
written (synced) while they are locked, there is a lot of room for race
conditions (everywhere dn_set_?time is set, even when it is directly
followed by a diskfs_node_update).
This seems to be some race condition between the sync thread and other
code that mangles the dn_set_?time flags. It's only strange that it never
happened before, and that building emacs is such a reproducible test case
(huge file with a hardlink? That can't be the only reason, as it only happens
with a full build, not with an interrupted and restarted build).
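
To make the suspected interleaving concrete, here is a toy, single-threaded
model of it. This is only a sketch under assumptions: it assumes the failing
assertion checks that no dn_set_?time flag is still pending when the node is
written, and that diskfs_node_update is what clears the flags. The struct
and all toy_* and run_* names are hypothetical; only the three flag names
are taken from the real code.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the per-node state; only the flag names
   match what gdb showed.  */
struct toy_node
{
  bool dn_set_atime;
  bool dn_set_mtime;
  bool dn_set_ctime;
};

/* Models a writer marking the node as needing new mtime and ctime.  */
static void
toy_touch (struct toy_node *np)
{
  np->dn_set_mtime = true;
  np->dn_set_ctime = true;
}

/* Models the update step (role assumed for diskfs_node_update):
   pending flags are folded into the stored times and cleared.  */
static void
toy_update (struct toy_node *np)
{
  np->dn_set_atime = false;
  np->dn_set_mtime = false;
  np->dn_set_ctime = false;
}

/* Models the assumed check at write time: true when no flag is pending.  */
static bool
toy_write_ok (const struct toy_node *np)
{
  return !(np->dn_set_atime || np->dn_set_mtime || np->dn_set_ctime);
}

/* Expected order: the sync-side write happens after the update.  */
static bool
run_safe_order (void)
{
  struct toy_node n = { false, false, false };
  toy_touch (&n);
  toy_update (&n);
  return toy_write_ok (&n);
}

/* Suspected order: the sync-side write slips in between the flag
   being set and the update, so the check sees pending flags.  */
static bool
run_racy_order (void)
{
  struct toy_node n = { false, false, false };
  toy_touch (&n);
  bool ok = toy_write_ok (&n);
  toy_update (&n);
  return ok;
}
```

In this model run_safe_order returns true and run_racy_order returns false,
with mtime and ctime pending and atime clear, which is the same flag pattern
as in the attached session. Whether the real sync thread can actually get in
at that point while node->lock is held is exactly the open question.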
I can reproduce this easily, so if more testing and debugging is required, I
am happy to do that.
Thanks,
Marcus
--
`Rhubarb is no Egyptian god.' Debian http://www.debian.org [EMAIL PROTECTED]
Marcus Brinkmann GNU http://www.gnu.org [EMAIL PROTECTED]
[EMAIL PROTECTED]
http://www.marcus-brinkmann.de