In general none of Cyrus will necessarily work over NFS. If you're only accessing the NFS store from a single client, things have a much better chance of working.
By single client, do you mean a single NFS client hitting the NFS server? If so, this is guaranteed in our configuration.
Yes.
[...]
A lot of problems also result when people try to run the application on more than one computer hitting the same NFS server. But what drives us application writers mad is the idea that rename() can return failure even though it has actually happened; and if you're trying to write a reliable application, you don't want to rely on the chance of this being minimized, since you know it's going to happen and you're going to be sorry.
It's hard to find any hard information amongst the traditional NFS hysteria. I suspect Sleepycat's warning is there simply because the quality of NFS implementations is often poor, and NFS involves so many other variables they can't control.
I would hope it would work with a single server with multiple processes. But I really haven't thought about all the possibilities with NFS. (The "return error and succeed" problem is just one that springs to mind, and I've never audited the code thinking about that.)
skiplist should work over NFS with a single client and map_nommap.
So, do you mean a single process or a single server (potentially with multiple processes hitting the file)?
Great, now I need to do bookkeeping to do this. Plus on most Unix filesystems, rename() is a more expensive operation than 1 fsync() and probably even 2 fsync()s. And how am I supposed to programmatically determine whether or not a given version is valid?
Indeed; however, if you are talking about increasing the frequency of writes to the file, and if you retain a few old versions, you will almost certainly get away with it (so, worst case on restart, you try progressively older files). This wouldn't be an answer for critical data, but it may be acceptable for the \Seen state. Shrug.
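To make the "retain a few old versions" idea concrete, here is a rough Python sketch. It is not anything Cyrus does: the `name.<version>` file naming, the `KEEP` count, and the use of a SHA-256 digest line as the validity check are all assumptions made for illustration. On restart it tries the newest file first and falls back to progressively older ones.

```python
import hashlib
import os

KEEP = 3  # hypothetical: how many versions to retain


def list_versions(directory, name):
    """Return the existing version numbers for name, newest first."""
    prefix = name + "."
    found = []
    for entry in os.listdir(directory):
        suffix = entry[len(prefix):]
        if entry.startswith(prefix) and suffix.isdigit():
            found.append(int(suffix))
    return sorted(found, reverse=True)


def write_version(directory, name, version, payload):
    """Write payload to name.<version>, preceded by a SHA-256 digest line."""
    digest = hashlib.sha256(payload).hexdigest().encode("ascii")
    path = os.path.join(directory, f"{name}.{version}")
    with open(path, "wb") as f:
        f.write(digest + b"\n" + payload)
    # Deliberately no fsync() here: the old versions are the safety net.
    for old in list_versions(directory, name)[KEEP:]:
        os.unlink(os.path.join(directory, f"{name}.{old}"))
    return path


def load_latest_valid(directory, name):
    """Try progressively older versions; return (version, payload) or None."""
    for version in list_versions(directory, name):
        with open(os.path.join(directory, f"{name}.{version}"), "rb") as f:
            header, _, payload = f.read().partition(b"\n")
        # A torn or truncated file will not match its own digest line.
        if hashlib.sha256(payload).hexdigest().encode("ascii") == header:
            return version, payload
    return None
```

The digest is one possible answer to "how do I tell whether a version is valid"; it only detects torn or truncated files, and it does nothing to bound how stale the surviving version might be.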
Linux ext2 has this metadata problem. ext3 and reiserfs are both supposed to force metadata to disk when fsync() is called, similar to how softupdates on BSD, Veritas, and most other modern filesystems behave. I'm willing to bet that I've wasted more time than you have worrying about the semantics of fsync() on various Unix filesystems.
BTW, Linux up until very recently synced way too much data on an fsync() (it behaved more like a sync()). Yet, even with the new improved fsync(), it still doesn't guarantee the file won't be lost (since it doesn't sync the directory entry for the file, only the file data and metadata, whereas the BSDs and Solaris do). This is a massive pain in the arse for MTA authors.
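For what it's worth, the usual workaround for the directory-entry problem is to fsync() the containing directory as well as the file. A minimal Python sketch, assuming a POSIX system where a directory can be opened read-only and passed to fsync(); the `durable_replace` name and the `.tmp` suffix are made up for the example, and whether the directory sync really reaches the disk still depends on the filesystem.

```python
import os


def durable_replace(path, data):
    """Replace path with data, syncing the file and then its directory."""
    directory = os.path.dirname(os.path.abspath(path))
    tmp = path + ".tmp"  # hypothetical temp-file naming
    with open(tmp, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())  # file data and file metadata
    os.rename(tmp, path)      # atomically swap the new file into place
    # Sync the directory so the renamed entry itself is forced out.
    dirfd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dirfd)
    finally:
        os.close(dirfd)
```

That is two fsync() calls and a rename() per update, which is exactly the kind of cost being argued about in this thread.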
You need to do the stat() regardless if you want the latest data. By keeping the file open, you potentially amortize the cost of an open(), another fstat() (to find out the size of the file you just opened) and an mmap(). All of these have different costs depending on your platform and your Unix.
I think my point is that the cost of open() is roughly equivalent to the cost of stat() under Solaris, so rather than keep a file open and stat it periodically to see if it's changed under you, you can close and reopen the file (resulting in simpler code, but similar performance).
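The two approaches being compared look roughly like this in Python. It is only a sketch under assumptions: `WatchedFile` and `read_reopening` are invented names, the writer is assumed to replace the file with rename(), and a plain pread() stands in for the mmap() mentioned above to keep the example short.

```python
import os


class WatchedFile:
    """Keep a file open; reopen only when the path names a different file."""

    def __init__(self, path):
        self.path = path
        self.fd = os.open(path, os.O_RDONLY)
        self.reopens = 0

    def _replaced(self):
        # One stat() per check: compare the path's identity with our open fd.
        on_disk = os.stat(self.path)
        ours = os.fstat(self.fd)
        return (on_disk.st_dev, on_disk.st_ino) != (ours.st_dev, ours.st_ino)

    def read_all(self):
        if self._replaced():
            os.close(self.fd)
            self.fd = os.open(self.path, os.O_RDONLY)
            self.reopens += 1
        size = os.fstat(self.fd).st_size
        return os.pread(self.fd, size, 0)

    def close(self):
        os.close(self.fd)


def read_reopening(path):
    """The simpler alternative: open, read, and close on every access."""
    with open(path, "rb") as f:
        return f.read()
```

Because the old file stays open, its inode cannot be reused while the descriptor is held, so the device and inode comparison is a safe way to notice a rename() over the path. Which version is cheaper depends on the platform, as discussed above.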
Keeping the file open costs almost nothing (the cost of the disk space when and if there is write contention).
[...]
You have one database and weren't fsync()ing the data. Cyrus has thousands of active databases and cares about the reliability of the data.
Actually, it scaled better than initially expected. This map type was used specifically for tables that changed very frequently (the pop-before-smtp pre-auth mechanism being a case in point). The only synchronous operation was the rename(). The lookup read()s would have been pulling the data from the buffer cache, and sequential searches beat more complex schemes every time when the dataset is small (less than 100kB was the figure we found when comparing to things like libdb). The saving in resident set size was critical too: the machine had 4G of RAM, and no more could be fitted.
Larry