Hello Dovecot developers, hello list,
for more than two years we have seen occasional log messages like this:
Error: Mailbox INBOX: Corrupted transaction log file
/IMAP/mail/mailboxes/USER/mdbox/mailboxes/INBOX/dbox-Mails/dovecot.index.log
seq 45532: Unexpected garbage at EOF (sync_offset=4856)
In most cases I had to repair the affected mailbox manually after such
events. Luckily, I could prevent any data corruption by using the healthy
copy of the mailbox from the replication partner.
At first these errors were rare (about once a month), but they became
more frequent. At the end of last year, they occurred about once a week
for 800 users with up to 200 folders. The reporting process was always
different: the indexer, imapd, lmtpd, replicator or doveadm (on the
command line). So it was evidently some sort of race condition between
the different processes accessing the index. I found no hints on how to
reproduce it. At the beginning of this year I started to debug the
issue by adding debugging log output to our production system, mostly in
lib-index. I was hoping to find the cause and maybe even provide a patch
to fix it. But about 5 weeks ago, I discovered that the error can be
avoided simply by setting mmap_disable=yes. Since this change, the error
has not occurred again, even with specific tests that were previously
likely to trigger the bug. At that point I stopped debugging.
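For anyone who wants to try the same workaround, the change is a single
setting. A minimal sketch, assuming the stock 2.3 example configuration
layout in which mail settings live in conf.d/10-mail.conf (the file
location is an assumption; any file included from dovecot.conf should
work just as well):

```
# conf.d/10-mail.conf (assumed location; adjust to your own layout)
# Do not use mmap() at all. The shipped default is "no".
mmap_disable = yes
```

After reloading Dovecot, "doveconf -n" lists the non-default settings
and should then show mmap_disable = yes.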
We are running Dovecot in master-master mode using the replication plugin
behind a Dovecot director setup. The storage on the backends is provided
by local disks on Linux with the newest ZFS filesystem. We use mdbox
storage and FTS. While testing, I also tried ext4 instead of ZFS and
different FTS backends, but the error still occurred.
I can provide more information for the developers if you are interested!
---------------------------------
About mmap_disable:
The documentation only mentions that you should set this to "yes" for
SHARED filesystems (I don't think local ZFS or ext4 qualify as such).
https://doc.dovecot.org/2.3/settings/core/#core_setting-mmap_disable
On another page
(https://doc.dovecot.org/2.3/admin_manual/mailbox_formats/#memory-mapping),
it says that "If mmap() is supported by your filesystem, it’s
still not certain that it gives better performance. Try benchmarking to
make sure."
I also found an old mail from Timo
(https://dovecot.org/list/dovecot/2011-December/079975.html) which lists
three cases where mmap_disable=yes is required or at least suggested. In
only one case, he wrote "With local filesystems mmap_disable=no _should_
be faster."
I have not run extensive benchmarks, but I have not seen any performance
issues since we disabled mmap on our IMAP backends.
So I wonder whether it would be better to switch the default for
mmap_disable to YES:
1. There are configurations where the current default might cause data
corruption!
2. There are configurations where disabling mmap is suggested.
3. Even on local storage, the performance benefits are uncertain.
4. According to my observations, even on local storage, enabling mmap
can cause data corruption.
Changing the default to YES would protect users from accidental data
corruption. Users who aim for maximum performance can still enable mmap
and test whether it improves performance. IMHO the documentation should
mention that even on local disks, mmap CAN cause data corruption
(until the above-mentioned bug is found and fixed).
Best regards,
--
Patrick Cernko <[email protected]> +49 681 9325 5815
Joint Scientific IT and Technical Service
Max-Planck-Institute für Informatik & Softwaresysteme