Some random rambling on this topic. I'm not sure I'm contributing a whole lot new.
In general, distributed filesystems are nice mechanisms for doing general-purpose stuff. However, when you try to layer very specific tasks on top of a distributed filesystem, you may run into problems where the filesystem doesn't do everything you really want it to do for you, so you have to do it yourself. Similarly, you can't take shortcuts by choosing not to use the parts of the filesystem that you don't need.

The Cyrus aggregator is an attempt to do a form of availability partitioning. This will probably work for us, but maybe not for you. The idea is that we can split our users among multiple machines so that a single failure will not take down the entire system. Of course, this does mean that there are potentially more opportunities for failures, since there are more machines that can fail.

This could be taken to the next step, where Cyrus does mailbox replication (and maybe some of the stuff we are looking to do with "online" backup will help too). However, I don't think we have any immediate plans to do this. Of course, if people are interested enough in fully funding this effort, we could probably work something out.

Walter