> I have no real idea what could cause this but I have the following
> sequence in my db conversion script which is used by the init script
> in my rpms. The procedure is the best according to lots of my tests
> using different versions of db3 and db4 with cyrus-imapd. As you can
> see I first try a db_checkpoint, then kill it if it seems to hang,
> then do a db_recover and only after this do a rm -vf
> $imap_prefix/db/log.* $imap_prefix/db/__db.*. I just tried to find out
> the safest procedure after simulated crashes, without really
> understanding BDB and why people like to use it so much. I don't, and
> my servers run fine without any BDB.
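
For reference, the sequence described in the quoted message could be sketched roughly as below. This is not the actual script from the rpms. The function name `recover_bdb_env`, the 30-second limit, the use of coreutils `timeout` to kill a hung checkpoint, and the choice to stop when `db_recover` fails are all assumptions made for illustration.

```shell
#!/bin/sh
# Rough sketch of the checkpoint / recover / cleanup order described above.
# Assumptions (not from the original script): the function name, the
# 30-second limit, and using coreutils `timeout` to kill a hung checkpoint.

recover_bdb_env() {
    imap_prefix="$1"
    db_dir="$imap_prefix/db"

    # 1. Try a single checkpoint (-1 = checkpoint once and exit).
    #    `timeout` kills it if it seems to hang; a failure here is not fatal.
    timeout 30 db_checkpoint -1 -h "$db_dir" \
        || echo "db_checkpoint failed or timed out, continuing" >&2

    # 2. Run recovery on the environment. Stopping on failure is an
    #    assumption; the original script may behave differently.
    db_recover -h "$db_dir" || return 1

    # 3. Only after recovery, remove the log and environment region files.
    rm -vf "$db_dir"/log.* "$db_dir"/__db.*
}
```

It would be called as `recover_bdb_env "$imap_prefix"` from the init script, before the Cyrus master process is started.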
I might look into updating our start process, but the bit I don't get is why the errors are occurring at all. I can only think of two possible reasons:

1. There's some DB state being left around somewhere, so after the restart it's accessing corrupted data, though why a second restart tends to fix it I don't know.
2. There's some bug that manifests itself mostly on small databases, so whether the problem does or doesn't occur is more coincidental.

To be honest, I haven't kept a log of whether it definitely happens after every restart but not after every second one.

The main reason we haven't switched to skiplist for deliver.db is that on active servers like ours, deliver.db can get pretty large (500-1000M) even with daily pruning, and the skiplist DB implementation requires mmap'ing the entire file into memory, which gets problematic at that size.

I notice that the latest 2.3 has a berkeley_hash_nosync option, which might use a sufficiently different code path to avoid this issue. When we upgrade I'll try that out...

Rob