On Thu, 2004-01-22 at 14:58, alberto wrote:
>
> -----
> Dec 3 15:55:34 machine sshd(pam_unix)[1791]: session opened for
> user xxxx by (uid=500)
> Dec 3 15:56:08 machine kernel: 0x0: 58 41 47 46 00 00 00 01 00 00 00
> 04 00 10 00 00
> Dec 3 15:56:08 machine kernel: xfs_force_shutdown(md(9,0),0x8)
> called from line 1070 of file xfs_trans.c. Return address =
> 0xf8a23aa8
> Dec 3 15:56:08 machine kernel: Filesystem "md(9,0)": Corruption of
> in-memory data detected. Shutting down filesystem: md(9,0)
> Dec 3 15:56:08 machine kernel: Please umount the filesystem, and
> rectify the problem(s)
> ------
>
> What is the nature of this problem? Kernel drivers? Hardware
> failure? Filesystem inconsistency?
>
The "md(9,0)" is, I believe, the device's major/minor number pair:
major 9 is the Linux md (software RAID) driver and minor 0 would make
it /dev/md0, so it names the array as a whole rather than a specific
disk in it (any experts out there, please correct me). You may want to
compare logs from the various times this happens to see whether the
same device is fingered each time, and whether errors from one of the
member disks show up around the same time.

If the issue is hardware, then a) most likely the same member disk will
show up in the logs in all instances, or b) the disk controller itself
could be acting up. In either case, swapping out the suspected hardware
(only possible without data loss in RAID 1, 5 or similar setups, I
suppose) should stop the problems.

My guess is that the problem is with the kernel driver rather than the
hardware. The kernel message seems to indicate that the buffered
filesystem data (changes to the filesystem that have yet to be written
to the disk) has been corrupted. If the error had come while reading
data from the disk, that could mean buggy kernel drivers too, but could
also point to hardware problems.

The code in the kernel driver that your log refers to (line 1070 of
xfs_trans.c):

    /*
     * See if the caller is relying on us to shut down the
     * filesystem. This happens in paths where we detect
     * corruption and decide to give up.
     */
    if ((tp->t_flags & XFS_TRANS_DIRTY) &&
        !XFS_FORCED_SHUTDOWN(tp->t_mountp))
            xfs_force_shutdown(tp->t_mountp, XFS_CORRUPT_INCORE);

To me, it looks like code that handles the problem rather than the
source of the problem itself, but I'm no kernel hacker. Perhaps some
others can read more into this than I can?

Filesystem inconsistency, I think, would just be a symptom of one of
the other two possibilities mentioned.

> What can be a possible solution?
>

Swap out bad hardware, or try a newer version of the xfs driver.

Dunno if that helps any, but there you have my $0.02.

-davidc
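
P.S. For comparing logs from the various times this happens, something
along these lines might save some grepping. It is only a rough sketch:
the regular expression is written against the one xfs_force_shutdown
line quoted above, so other kernel versions may word the message
differently, and the function name count_shutdowns is just something I
made up for illustration.

```python
import re
from collections import Counter

# Matches the xfs_force_shutdown message format seen in the quoted log.
# Assumption: other kernel versions may print this line differently.
SHUTDOWN_RE = re.compile(
    r"xfs_force_shutdown\((?P<dev>.+),(?P<flags>0x[0-9a-fA-F]+)\)\s+"
    r"called from line (?P<line>\d+) of file (?P<file>\S+?)\.\s"
)


def count_shutdowns(lines):
    """Count shutdown events per (device, flags, source file, line)."""
    counts = Counter()
    for text in lines:
        m = SHUTDOWN_RE.search(text)
        if m:
            key = (m["dev"], m["flags"], m["file"], int(m["line"]))
            counts[key] += 1
    return counts
```

You would feed it the lines of whatever file your syslog writes kernel
messages to (the path varies by distribution), for example
count_shutdowns(open(path)), and see whether the same device and the
same source line come up every time.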