On 10/26/10 21:17, Chuck Swiger wrote:
On Oct 26, 2010, at 11:33 AM, Marc G. Fournier wrote:
Someone recently posted on one of the PostgreSQL Blogs concerning fsync on
Linux/Windows/Mac OS X, but failed to make any comments on any of the BSDs ...
the post has to do with how fsync works on the various OSs, and I am curious as
to whether or not this is something that also afflicts us:
http://rhaas.blogspot.com/2010/10/wal-reliability.html
From reading our man page, I see no warnings similar to what the other OSs
have, specifically:
Mac OS X: For applications that require tighter guarantees about the
integrity of their data, Mac OS X provides the F_FULLFSYNC fcntl
Linux: If the underlying hard disk has write caching enabled, then the
data may not really be on permanent storage when fsync() /
fdatasync() return.
So, do we hide the fact, or are, in fact, not afflicted by this?
Whether the data actually gets written and the on-disk cache itself flushed
seems to depend on a sysctl called hw.ata.wc for FreeBSD or the dkctl setting
in NetBSD; write-caching seems to always default to on because otherwise people
scream bloody murder about the factor of ten reduction in write performance
with it off. Further, by default (i.e., FFSv2 with soft updates), data changes
are synced out when you do an fsync(), but metadata changes are done
asynchronously, which is exactly what Mac OS X does.
In other words, if you have write-caching on, no effort is made to invoke ATA_FLUSHCACHE
or SCSI "SYNCHRONIZE CACHE" to make sure that your disk has actually written
the bits to permanent storage.
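For illustration, here is a minimal sketch of what an application can do from userland, assuming Python on a POSIX system. The helper name write_durably is made up for this example. It calls fsync() and then, only where the platform exposes it (Mac OS X), the F_FULLFSYNC fcntl quoted above. On FreeBSD with write caching on, the plain fsync() path is all you get, and per the above it does not guarantee the drive has emptied its own cache.

```python
import fcntl
import os


def write_durably(path, data):
    """Write data to path and flush it as far as the OS lets us.

    Hypothetical helper for illustration. Returns the name of the
    strongest flush primitive that was actually used.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        # os.write() may write fewer bytes than asked, so loop until done.
        view = memoryview(data)
        while view:
            written = os.write(fd, view)
            view = view[written:]

        # fsync() pushes the data out of the OS buffers to the device.
        # With on-drive write caching enabled, the drive may still be
        # holding it in volatile cache when this returns.
        os.fsync(fd)
        method = "fsync"

        # F_FULLFSYNC only exists where the platform defines it
        # (Mac OS X); elsewhere the attribute is simply missing.
        full = getattr(fcntl, "F_FULLFSYNC", None)
        if full is not None:
            try:
                fcntl.fcntl(fd, full)
                method = "F_FULLFSYNC"
            except OSError:
                # Not supported on this filesystem; fsync() already ran.
                pass
        return method
    finally:
        os.close(fd)
```

The return value is only there so a caller can log which guarantee it actually got; it says nothing about what the drive firmware did with the data afterwards.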
To clarify: all this is in case write-caching happens on disk drives or
on disk controllers.
The common way to deploy servers for a long time now is to have a disk
controller with RAID capabilities and its own RAM cache which is backed
by a battery or a capacitor. This controller in turn switches on-drive
write caches off. All of the RAID controllers I've seen have a toggle
for this last part (on-drive write caches) and it was always turned off
by default (though it doesn't hurt to check).
To emulate this with desktop drives, as cswiger said, hw.ata.wc should
be turned off, with the expected influence on drive performance.
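A rough way to check the current setting from a script, assuming Python and that the sysctl binary is on the PATH. The function names are made up for this example, and the sketch assumes hw.ata.wc reads back as 1 for on and 0 for off. On a system that has no hw.ata.wc (anything that is not FreeBSD with the ata driver) it just returns None.

```python
import shutil
import subprocess


def write_cache_enabled(sysctl_output):
    """Interpret the text printed by `sysctl -n hw.ata.wc`.

    Assumes 1 means write caching is on and 0 means it is off.
    """
    return int(sysctl_output.strip()) != 0


def read_ata_write_cache():
    """Return True/False for hw.ata.wc, or None if it cannot be read."""
    if shutil.which("sysctl") is None:
        return None
    result = subprocess.run(
        ["sysctl", "-n", "hw.ata.wc"],
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        # Unknown OID, e.g. not FreeBSD or no ata driver loaded.
        return None
    try:
        return write_cache_enabled(result.stdout)
    except ValueError:
        # Output was not a plain integer; do not guess.
        return None


if __name__ == "__main__":
    state = read_ata_write_cache()
    if state is None:
        print("hw.ata.wc not available on this system")
    else:
        print("on-drive write cache:", "on" if state else "off")
```

As far as I know hw.ata.wc is a boot-time tunable, so turning it off would mean a line like hw.ata.wc=0 in /boot/loader.conf and a reboot rather than a runtime sysctl write; check ata(4) on your release before relying on that.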
All this is valid for UFS. ZFS, on the other hand, *should* use BIO_FLUSH
where appropriate, so it should be safer with desktop drives. That said, ZFS
is so complex that when an error occurs it is hard to say what caused it.
http://lists.freebsd.org/mailman/listinfo/freebsd-questions