Jure Pečar wrote:
Although many on the list claim that this (having 2 boxes with 1 disk-array) is a nice way for redundancy, I now doubt whether this is true. It still takes 30 mins before everything is back again! It seems to me that if there were a "live" version of cyrus available with a synchronised mail-spool, there would be no outage noticeable for users (except maybe a lost connection). Am I right?
Having 2 boxes with one disk array leaves you with a single point of failure that you wouldn't think of immediately: the filesystem. I learned that the hard way.
Yes, I agree.
I'm planning to 'redesign' our storage: instead of one big volume that fscks for hours, I'm going to split it into many mirrors and use them as cyrus partitions. This way they could all fsck in parallel. I'm going to lose the 'single instance store' capability, but that's a tradeoff that I'm willing to take.
Hmm, then your fscks will run faster and with fewer problems, but there is still an outage that you could prevent with failover done another way and with availability/replication at the application level.
If there are replicated spools it doesn't matter if the fsck takes long or not... although there will be a backlog of course.
Is it possible to have an fsck running on one partition and have cyrus started already (so that part of the mail-store, e.g. archives, is not available yet)?
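To make the "fsck in parallel" idea above concrete, here is a minimal sketch of checking several volumes at once. It is only an illustration under assumptions: the function name check_all, the device names, and the default command ("fsck", "-n") are not from this thread, and the right checker and flags depend on your platform.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def check_all(devices, command=("fsck", "-n")):
    """Run `command <device>` for each device at the same time.

    Returns a dict mapping each device to the exit status of its check.
    The default command is an assumption; substitute whatever checker
    and flags your platform needs.
    """
    devices = list(devices)
    if not devices:
        return {}

    def run(device):
        # One checker process per device; the thread only waits on it.
        result = subprocess.run([*command, device])
        return device, result.returncode

    # One worker per device, so no check waits for another to finish.
    with ThreadPoolExecutor(max_workers=len(devices)) as pool:
        return dict(pool.map(run, devices))
```

Called as check_all(["/dev/mirror/spool1", "/dev/mirror/spool2"]) with hypothetical device names, it returns one exit status per volume. If the mirrors sit on independent disks, the wait should be roughly that of the slowest volume rather than the sum of all of them, but it does nothing about the outage itself.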
It happened to me at least once that the machine that crashed corrupted the
filesystem in a way that the machine that took over also crashed within
hours...
Maybe it's time to continue the "High availability ... again" discussion we had a while ago. If the cyrus developers are able to implement this with some funding, there are still some questions left for me: how much time would it take before a "stable" solution is ready? How much funding is expected? I still have to talk to management about this, but I would really support this development and I'm certainly willing to convince some managers.
The only high availability I see here is the google way. Cyrus is offering you that with the 'murder' component.
That's not really availability, but distributed risk.
BTW, you're mentioning FreeBSD ... doesn't it have some sort of background fsck while the filesystem is mounted rw?
It can, but I'm not sure if that's what I prefer. I'm not sure how mature it is with FreeBSD, and I prefer to have mail integrity over a "quick restore".
Paul
---
Cyrus Home Page: http://asg.web.cmu.edu/cyrus
Cyrus Wiki/FAQ: http://cyruswiki.andrew.cmu.edu
List Archives/Info: http://asg.web.cmu.edu/cyrus/mailing-list.html