On Mon, 29 Jan 2001, Dave Ihnat wrote:
> On Mon, Jan 29, 2001 at 12:26:21PM -0500, Chris Colomb wrote:
> > UPS systems can and do fail.
>
> Absolutely--anything fails, eventually. The issue is one of
> _probability_. You can give anecdotal descriptions of UPS failures
> all day; unless you can point to a statistical proof that they fail
> consistently, they're just that: Anecdote. My, and the industry's,
> real-world experience is that properly installed and maintained UPS
> systems have a negligible failure rate.
>
If you're in that percentage, that's small comfort.
> Does this preclude the need to back up systems? Not a whit.
> Would this preclude use of a journaling file system? No; but it's not a
> 'gimme' proposition, either, considering that there are real expenses
> associated with use of such a system, especially the current state of
> journaling in Linux.
Exactly. Which is why we use SGI, AIX, and Solaris for that: they do
journaling right, without a performance impact. This is the problem one
gets into when there's an OS agenda to grind rather than recognizing that
what is ultimately being delivered is a service, regardless of OS.
>
> > At my location we have a room full of batteries and a diesel generator
> > with fuel for 10 days. Both of which get exercised regularly. Still the
> > system that switches to batteries failed last month and we were without
> > power for almost a minute.
>
> Anecdotal. Regularly exercised--regularly maintained? What did
> post-failure analysis show? Almost certainly something is wrong--either
> with installation or maintenance. Perhaps a single point-of-failure in
> the control circuitry, or too many months (years?) without maintenance.
>
As I wrote earlier, it was regularly exercised and maintained. In fact, it
had just been reconditioned and passed with flying colors. There was a
hardware failure in the device that switches to battery. But that's not
really the point.
> > A properly implemented journaling file system has negligible performance
> > overhead.
>
> The current implementation on Linux doesn't meet that requirement. Yet.
>
> > To rely on a UPS to the exclusion of a journaling filesystem is IMHO just
> > as irresponsible and unprofessional in a truly mission critical production
> > environment.
>
> Inaccurate, bordering on specious.
As you put it earlier...gently, gently.
> A UPS is a universal need; a
> journaling filesystem is, in practice, not nearly as universal in
> either implementation or in practice. Its delivered benefit is only
> incremental; if your systems are unstable enough that the benefits of
> a journaling system loom large in your recovery analysis, you have much
> greater problems--either with your architecture, topology, or equipment.
>
> > I have many terabytes of disk. The type of failure mentioned
> > above...I don't even want to think about how long it would take to fsck
> > all of that.
>
> Fine. Figure how many catastrophic failures you expect over a given time
> period--analyze WHY you expect them--and determine the cost of fsck against
> the initial and ongoing cost in terms of CPU cycles, storage, complexity,
> etc. of a journaling solution. It's all about risk assessment and cost-
> benefit analysis.
>
Right. And that's why Linux isn't being used for any of our large-data,
mission-critical applications. It doesn't take a catastrophic failure to
put you in such a situation, and it doesn't have to be about a UPS...you
get the same issues if, say, the kernel panics, or someone kicks a cord
by mistake. Stuff happens, despite one's best efforts to the contrary.
And as you put more and more data online, the cost of *any* such
interruption is greatly multiplied if you have to fsck. You can easily be
looking at hours of downtime when you're fscking terabytes of data.
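As a rough back-of-the-envelope (every figure here is an assumption for
illustration, not a measurement from our site; real rates depend on
spindle count, RAID layout, and how full the filesystems are), a little
Python sketch shows the scale of the problem:

    # Back-of-the-envelope fsck downtime. All inputs are illustrative
    # assumptions; plug in numbers from your own hardware.
    tb_online       = 4.0     # terabytes of ext2 data to check
    gb_per_hour     = 100.0   # assumed effective fsck rate per filesystem
    parallel_checks = 4       # filesystems fsck can check concurrently

    hours = (tb_online * 1024) / (gb_per_hour * parallel_checks)
    print("estimated fsck window: %.1f hours" % hours)
    # With these made-up numbers: roughly 10 hours before the machines
    # are back in service, and that's the case where fsck needs no
    # operator at all.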
>
> First, there is no reason the Linux systems should require a manual fsck;
> they can be configured for auto-recovery as well as a commercial Unix
> system.
They were. However, the automatic fsck halted, as it should, because the
damage was too extensive, which is when manual intervention was required.
The point being, with IBM AIX, SGI, and Sun none of that was necessary, as
it never came to that in the first place. And even if the damage hadn't
been so extensive in the Linux case...even when the automatic fsck works
and you don't have to intervene manually...simply fscking terabytes worth
of disk results in a very non-trivial outage.
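To put the quoted risk-assessment and cost-benefit argument in concrete
terms, here is a toy comparison (again in Python, and again every figure
is a stand-in you would replace with your own numbers):

    # Toy cost-benefit comparison for the argument above. All inputs are
    # illustrative assumptions, not data from our shop.
    incidents_per_year   = 2       # panics, kicked cords, transfer-switch faults
    fsck_hours_per_event = 10.0    # from the rough estimate earlier
    cost_per_down_hour   = 5000.0  # dollars; depends entirely on the service
    journaling_overhead  = 3000.0  # assumed annual cost of journaling
                                   # (extra I/O, log space, admin time)

    expected_fsck_cost = (incidents_per_year * fsck_hours_per_event
                          * cost_per_down_hour)
    print("expected annual fsck downtime cost: $%.0f" % expected_fsck_cost)
    print("assumed annual journaling overhead: $%.0f" % journaling_overhead)
    # With these made-up numbers the journal pays for itself many times
    # over; with different numbers it might not, which is exactly the
    # analysis you're asking for.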
>
> Next, the ext2 filesystem is more robust than commercial Unix filesystems
> were for more than 3/4 of the total life of Unix. Can it be made better?
> Of course. But I don't accept the premise that every commercial system
> needs journaling, nor do I believe that this results in commercially
> unacceptable levels of exposure to risk.
>
> Your anecdote tells me that someone fell down in disaster recovery
> planning and execution; and that the Linux systems weren't configured
> for auto-recovery. Not that there was anything inherently wrong with
> Linux per se.
>
> There are many approaches to system reliability; journaling is one tool
> among many, not a universal solution.
>
Journaling hardly has to be a universal solution to be extremely valuable
and very much worth implementing. The 3/4-of-the-life-of-Unix argument is
interesting in that it leads into what I think is really the issue at
hand: best practices.
Best practice is all about using the tools available to you in the most
advantageous way possible, given the state of the technology you're
working with. What is best practice changes over time...what could have
been considered acceptable a couple of years ago can be an unnecessary
risk today.
What I don't think you're acknowledging in your statements is that, um,
stuff happens, despite all one's best attempts at planning, and especially
as things scale up and get more complex. The UPS that we and other folks
have, the kind that can run a large machine room, is much more complex
than the APC sitting under your desk. But whether it's a UPS issue, or
someone kicks a cord, or leans against a switch, or the kernel panics, or
whatever, stuff happens. Some things you can't really plan for, like a
meteor hitting your machine room (though some folks actually do plan for
things like that), but those are pretty remote possibilities. Power
interruptions and kernel panics and the like, unfortunately, are much more
likely events. Attend a Usenix or LISA conference, where the folks who run
large sites congregate, and you'll see that such events in our imperfect
world are all too common, despite very bright folks implementing very
well-planned data center operations.
Most management I've worked with/for understands that stuff beyond your
control happens, and you won't get fired for that. What *will* get you
fired is if there was a prevalent industry practice you could have been
using to prevent the problem in the first place, and you weren't.
And that's why the 3/4-of-the-total-life-of-Unix argument just doesn't
wash. Best practice is about what's acceptable and available currently.
And that's where I think the problem lies with Linux at this point in
time...I think it's the *only* server OS in commercial use that *doesn't*
have filesystem journaling. Even Win2k has it.
When you say, well, your UPS shouldn't have failed, etc., that's not the
point. The point is that if the UPS fails or power is interrupted for
whatever reason, then with the current state of commercial server-grade
filesystems, and hence best practices, you shouldn't have to fsck; *any*
such event should not result in a significant service outage. And if you
have to fsck data of any significant size, you *will* have a significant
service outage. Best practice dictates that you shouldn't have *any*
exposure like this in a mission-critical environment with large amounts of
data online, when just selecting another of a number of operating systems
would eliminate the exposure.
Regards,
Chris
_______________________________________________
Redhat-list mailing list
[EMAIL PROTECTED]
https://listman.redhat.com/mailman/listinfo/redhat-list