Bug#575343: Precautionary checks slow boot excessively

Elliott Mitchell Thu, 27 May 2010 18:00:18 -0700

>From: ty...@mit.edu
> On Mon, May 24, 2010 at 03:11:23PM -0700, Elliott Mitchell wrote:
> 
> If you hate precautionary checks so much, then turn them off.  The
> tools for doing this are within your hands.  Or do them using a LVM
> snapshot.


I don't hate precautionary checks. The problem is that my personal server is
carefully set up and on fairly reliable hardware, and so typically goes
well over 6 months between reboots; while carefully set up systems might
not be all /that/ common, there are still plenty out there. With such
long uptimes though, the check intervals on *all* filesystems have usually
expired every time it has to restart (typically for kernel patches). As such,
I tend to think the precautionary checks need a bit of work.


> > Two concerns. First, what is the significance of return codes >=4 from a
> > low-level fsck?
> 
> RTFM.   It's in the 3rd paragraph of the fsck man page....

Looks like I goofed, since I meant >=2. I did read it beforehand, and the
e2fsck man page doesn't really say much: "File system errors corrected,
system should be rebooted". "Should"? Why? Will the theme song from
Mission Impossible start playing and the system explode in 5 seconds if I
don't?
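
For reference, the exit status in question is a sum of bit flags, so a
value >=2 can mean several things at once. A minimal sketch of decoding
it, assuming the values listed in the fsck(8) and e2fsck(8) man pages;
the function names here are made up for illustration:

```python
# Bit flags as listed under the exit status in fsck(8) / e2fsck(8).
# The names decode_fsck_status and needs_reboot are hypothetical helpers.
FSCK_FLAGS = {
    1: "errors corrected",
    2: "system should be rebooted",
    4: "errors left uncorrected",
    8: "operational error",
    16: "usage or syntax error",
    32: "canceled by user request",
    128: "shared-library error",
}


def decode_fsck_status(status):
    """Return the conditions encoded in an fsck exit status, lowest bit first."""
    if status == 0:
        return ["no errors"]
    return [text for bit, text in sorted(FSCK_FLAGS.items()) if status & bit]


def needs_reboot(status):
    """True when the 'system should be rebooted' bit (2) is set."""
    return bool(status & 2)
```

So a status of 3 decodes to "errors corrected" plus "system should be
rebooted", while 4 alone means errors were left uncorrected and says
nothing about a reboot.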

> > If these merely mean filesystem was altered in a way that
> > invalidates kernel structures, isn't this non-problematic in the case of
> > a filesystem that is mounted read-only? (a later remount rw would flush
> > those, right?)
> 
> In general it's possible for fsck to potentially modify a filesystem
> such that it will confuse a kernel which is actively accessing the
> filesystem, yes.  It's rare, as discussed above, but it's
> theoretically possible.  What's **really** bad is that if the file
> system is mounted read-only, and then remounted read/write, it's
> possible that there are cached file system data structures (block
> group descriptors, inodes, etc) which were incorrect, and fixed by
> fsck --- but the kernel doesn't know that they were modified, and if
> the file system is remounted r/w, the incorrect cached version could
> get written back to disk, thus undoing the good work of the fsck.

Oy vey! I'm inclined to suggest that if this really does happen, it
sounds like a bug...   I could understand funky occurrences if, after
reloading, the kernel can no longer find the FS data structures (device
was erased), but I would expect the FS data structures to be otherwise
invalidated. I understand the situation and reasoning with `dump` causing
problems, but the mount/remount/unmount really should be flushing
everything.


> In any case, this is not something that *I* plan to do anything with,
> simply because I have much higher priority things to do with my time.
> If this is important, you'll have to do it or pay someone to do it, or
> try to trick someone else into doing it.

I suppose once I'm no longer pinned to the floor, I may go after the
simple approach (only ensuring a single FS is checked during `checkfs.sh`),
but I'll admit the complete solution mentioned before will be complex to
implement.
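
To make the simple approach concrete, here is a rough sketch of just the
selection step: out of all filesystems whose time-based check has
expired, pick only the most overdue one for this boot. This is not what
`checkfs.sh` does today; the FsInfo record, its fields, and the function
names are assumptions for illustration, and how the remaining checks get
skipped is left out entirely.

```python
from dataclasses import dataclass


@dataclass
class FsInfo:
    """Hypothetical per-filesystem record; not an existing structure."""
    device: str      # illustrative device name
    last_check: int  # time of last full check, seconds since the epoch
    interval: int    # check interval in seconds; 0 assumed to mean disabled


def overdue_by(fs, now):
    """Seconds past the check deadline, or 0 if not yet due or disabled."""
    if fs.interval == 0:
        return 0
    return max(0, now - (fs.last_check + fs.interval))


def pick_one_to_check(filesystems, now):
    """Return the single most overdue filesystem, or None if none is due."""
    overdue = [fs for fs in filesystems if overdue_by(fs, now) > 0]
    if not overdue:
        return None
    return max(overdue, key=lambda fs: overdue_by(fs, now))
```

Anything not picked simply stays overdue and becomes a candidate on the
next boot, so each restart pays for at most one precautionary check.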


-- 
(\___(\___(\______          --=> 8-) EHM <=--          ______/)___/)___/)
 \BS (    |         e...@gremlin.m5p.com PGP F6B23DE0         |    )   /
  \_CS\   |  _____  -O #include <stddisclaimer.h> O-   _____  |   /  _/
2477\___\_|_/DC21 03A0 5D61 985B <-PGP-> F2BE 6526 ABD2 F6B2\_|_/___/3DE0




