Oh sorry Nic, I didn't notice that it was you on that thread. Then you are probably way ahead of me :) I guess only Bron can answer this, then.

We had this issue when upgrading large installations, and the first one was a real nightmare: it cost a night of sleep to bring the system back to operational status, since we didn't know then the details we know now.

The fix you mentioned on the thread is for 3.0.x. From my experience, running reconstruct from 3.0.x only creates further issues. Reconstruct from 2.5.x also crashes sometimes, which is really frustrating, and restarting it for all mailboxes after each crash was not an option.

So we ended up with a Perl script that gets the list of mailboxes, checks which are still v12, and runs reconstruct on those only, one by one. If reconstruct crashes, the script restarts it until it completes successfully or a limit of failures is reached. At the end all mailboxes were v13. In some very rare cases a mailbox was really corrupt and had to be fixed manually by something more drastic, like deleting its cyrus.index file.

I have to check with my boss tomorrow morning just to get an OK, and then I'll share that script here if it helps.
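For illustration only, the retry loop described above might look roughly like the following Python sketch. It is not the Perl script mentioned here. The reconstruct path and the `-G -V max` flags are copied from the commands quoted in this thread; `needs_upgrade` is a hypothetical callable standing in for the "is this mailbox still v12?" check, whose implementation is not described in the thread, and the default of 5 failures is an arbitrary assumption.

```python
import subprocess

# Path and flags as quoted elsewhere in this thread; adjust for the local install.
RECONSTRUCT = "/usr/lib/cyrus/bin/reconstruct"


def run_reconstruct(mailbox):
    """Run reconstruct on a single mailbox; return True if it exited cleanly."""
    result = subprocess.run([RECONSTRUCT, "-G", "-V", "max", mailbox])
    # A crash (for example an assertion failure) gives a non-zero return code.
    return result.returncode == 0


def upgrade_mailboxes(mailboxes, needs_upgrade, run=run_reconstruct, max_failures=5):
    """Reconstruct only the mailboxes that still need it, one by one.

    needs_upgrade is a hypothetical callable (mailbox -> bool); how to detect
    an old index version is an assumption left to the caller.
    Returns the mailboxes that still failed after max_failures attempts.
    """
    gave_up = []
    for mailbox in mailboxes:
        if not needs_upgrade(mailbox):
            continue  # already at the target index version, skip it
        failures = 0
        while failures < max_failures:
            if run(mailbox):
                break  # reconstruct completed successfully
            failures += 1  # crashed or failed: try the same mailbox again
        else:
            # The failure limit was reached without a successful run.
            gave_up.append(mailbox)
    return gave_up
```

Anything returned by `upgrade_mailboxes` would be a candidate for the manual fixes mentioned above. In practice the command would presumably need to run as the cyrus user, as in the `su cyrus -c` invocations quoted in this thread.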
Regards,
Savvas Karagiannidis

On Wed, Aug 1, 2018 at 12:26 AM Nic Bernstein <n...@onlight.com> wrote:

> On 07/31/2018 02:32 PM, Savvas Karagiannidis wrote:
>
> Hi Nic,
> as far as I know there are currently a couple of nasty open issues when
> upgrading old mailboxes, created from earlier versions of cyrus, that have
> not been upgraded to v13 index using reconstruct. These issues are:
>
> https://github.com/cyrusimap/cyrus-imapd/issues/2171
> https://github.com/cyrusimap/cyrus-imapd/issues/2208
>
> The main problem is frequent crashes with error message "assertion failed"
> when accessing these mailboxes.
>
> In issue #2208 a reconstruct command was proposed as a workaround:
> reconstruct -G -V max [mailboxname]
>
> Note that -G parameter seems necessary, otherwise the problem persists...
>
> Note also, that this should be done while still in version 2.5.x! The
> behavior of version 3.x reconstruct is not the same and you will not have
> the same results.
>
> As for the cyrdump you mention, it would be better if you compared the
> results before running reconstruct and after that.
> I guess what you would see is that the uidlist before would probably be in
> the range 6 - 9 (or you'd see a crash). cyrus 3.0.x considers that part of
> the index corrupt so simply cannot access those uids (1-5). Reconstruct
> detects this issue and what it does is that *reappends* those "newly
> found" messages, giving them a new uid, so that's what uids 10-14 probably
> really are. Note that this causes these messages to appear unread, since
> they have no flags (not something you want for your clients' mailboxes).
>
> Bron mentioned something about this bug in the cyrus devel list a few days
> ago:
>
> https://lists.andrew.cmu.edu/pipermail/cyrus-devel/2018-July/thread.html#4313
> but I didn't see any related commit or anything mentioned in the above
> issues, so I'm not sure if there is a fix at the moment...
>
> [Cross posting to cyrus-devel, as am now dumping core]
>
> Savvas,
> Thanks for your response. I think, however, that we've already tackled
> the issue of assert failures in reconstruct. If you follow the entire
> thread of the message you linked to, Bron's message from the other day, he
> was actually writing to me, and his patch, referenced with a commit, within
> that message, worked just fine for me, as addressed in my follow up:
>
> https://lists.andrew.cmu.edu/pipermail/cyrus-devel/2018-July/004315.html
>
> But I can see that you're correct about the cyrdump issue. That's a real
> mess. Here's the top of a cyrdump from the old host, running 2.5.10:
>
> onlight@mail:~$ sudo su cyrus -c "/usr/lib/cyrus/bin/cyrdump user.onlight" | head -12
> Content-Type: multipart/related; boundary="dump-10426-1533070392-23406623"
>
> --dump-10426-1533070392-23406623
> Content-Type: text/xml
> IMAP-Dump-Version: 0
>
> <imapdump uniqueid="710a47ca47ebc676">
> <mailbox-url>imap://mail.ahwi.us/user.onlight</mailbox-url>
> <incremental-uid>0</incremental-uid>
> <nextuid>10</nextuid>
>
> <uidlist>1 2 3 4 5 6 7 9 </uidlist>
>
> which comports with the existing mailbox directory:
>
> onlight@mail:~$ sudo ls /var/spool/cyrus/mail/I/user/onlight/
> 1. 2. 3. 4. 5. 6. 7. 8. 9. cyrus.annotations cyrus.cache
> cyrus.header cyrus.index Drafts Sent Spam Trash
>
> And here's what it looks like after reconstructing on the new server:
>
> root@newmail:~# ls /var/spool/cyrus/mail/I/user/onlight/
> 10. 11. 12. 13. 14. 6. 7. 8. 9. cyrus.annotations cyrus.cache
> cyrus.header cyrus.index Drafts Sent Spam Trash
>
> But I cannot even run cyrdump on the new mailbox now, I keep dumping core
> with a Floating Point error:
>
> root@newmail:~# /usr/lib/cyrus/bin/cyrdump user/onlight
> Content-Type: multipart/related; boundary="dump-27702-1533070740-1079351705"
>
> --dump-27702-1533070740-1079351705
> Content-Type: text/xml
> IMAP-Dump-Version: 0
>
> <imapdump uniqueid="710a47ca47ebc676">
> <mailbox-url>imap://newmail.ahwi.us/user.onlight</mailbox-url>
> <incremental-uid>0</incremental-uid>
> <nextuid>15</nextuid>
>
> <uidlist>6 7 9 10 11 12 13 14 </uidlist>
>
> <flags>
> Floating point exception (core dumped)
>
> Here's a backtrace from gdb:
>
> Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
> Core was generated by `/usr/lib/cyrus/bin/cyrdump user/onlight'.
> Program terminated with signal SIGFPE, Arithmetic exception.
> #0 0x00007fcbc1057955 in hash_lookup (key=key@entry=0x7ffd75ae4610 "systemflags",
>     table=table@entry=0x7fcbc1894e30 <attrs_by_name>) at lib/hash.c:159
> 159 lib/hash.c: No such file or directory.
> (gdb) bt
> #0 0x00007fcbc1057955 in hash_lookup (key=key@entry=0x7ffd75ae4610 "systemflags",
>     table=table@entry=0x7fcbc1894e30 <attrs_by_name>) at lib/hash.c:159
> #1 0x00007fcbc1636822 in search_attr_find (name=<optimized out>) at imap/search_expr.c:2465
> #2 0x000055c087d5a694 in systemflag_match (flag=1) at imap/cyrdump.c:149
> #3 0x000055c087d5a9dd in dump_me (data=<optimized out>, rock=0x7ffd75ae68e0)
>     at imap/cyrdump.c:209
> #4 0x00007fcbc16164c3 in find_cb (rockp=rockp@entry=0x7ffd75ae6820,
>     key=key@entry=0x7fcbc1aa1a28 <error: Cannot access memory at address 0x7fcbc1aa1a28>, keylen=keylen@entry=12,
>     data=data@entry=0x7fcbc1aa1a38 <error: Cannot access memory at address 0x7fcbc1aa1a38>, datalen=<optimized out>)
>     at imap/mboxlist.c:2410
> #5 0x00007fcbc131f553 in myforeach (db=0x55c0894b2610, prefix=0x7ffd75ae5fb0 "user.onlight", prefixlen=0,
>     goodp=0x7fcbc16162c0 <find_p>, cb=0x7fcbc1616390 <find_cb>, rock=0x7ffd75ae6820, tidptr=<optimized out>)
>     at lib/cyrusdb_skiplist.c:1177
> #6 0x00007fcbc16141cd in mboxlist_find_category (rock=rock@entry=0x7ffd75ae6820,
>     prefix=prefix@entry=0x7ffd75ae5fb0 "user.onlight", len=len@entry=0) at imap/mboxlist.c:2685
> #7 0x00007fcbc1614b2c in mboxlist_do_find (rock=rock@entry=0x7ffd75ae6820, patterns=patterns@entry=0x55c0894b26f0)
>     at imap/mboxlist.c:2877
> #8 0x00007fcbc1619fe6 in mboxlist_findallmulti (namespace=<optimized out>, patterns=0x55c0894b26f0, isadmin=<optimized out>,
>     userid=0x0, auth_state=<optimized out>, proc=<optimized out>, rock=0x7ffd75ae68e0) at imap/mboxlist.c:2942
> #9 0x000055c087d5a4bd in main (argc=2, argv=<optimized out>) at imap/cyrdump.c:119
>
> I'll note, too, that the behavior of reconstruct, with regards to those
> low numbered UIDs, is totally inconsistent. Here's a re-run on the same
> mailbox, after re-rsyncing (with --delete option) from the original server:
>
> root@newmail:~# su cyrus -c "/usr/lib/cyrus/bin/reconstruct -V max user/onlight"
> user.onlight: update uniqueid from header (null) => 710a47ca47ebc676
> user.onlight updating quota_mailbox_used: 10715 => 3910
> user.onlight: updating exists 8 => 3
> user.onlight: updating sync_crc 2250269913 => 1705867986
> user/onlight
> Repacked user/onlight to version 13
> root@newmail:~# ls -l /var/spool/cyrus/mail/I/user/onlight/
> total 76
> -rw------- 1 cyrus mail  2924 Mar 27  2008 1.
> -rw------- 1 cyrus mail   748 Mar 27  2008 2.
> -rw------- 1 cyrus mail   804 Jul 15  2008 3.
> -rw------- 1 cyrus mail  1137 Oct  2  2009 4.
> -rw------- 1 cyrus mail  1192 Apr 22  2012 5.
> -rw------- 1 cyrus mail  1423 Sep 19  2016 6.
> -rw------- 1 cyrus mail  1743 Dec 22  2016 7.
> -rw------- 2 cyrus mail  1357 Jun  1  2017 8.
> -rw------- 1 cyrus mail   744 Jun  1  2017 9.
> -rw------- 1 cyrus mail   336 Mar  1  2017 cyrus.annotations
> -rw------- 1 cyrus mail 10276 Jul 31 15:54 cyrus.cache
> -rw------- 1 cyrus mail   165 Feb 28  2017 cyrus.header
> -rw------- 1 cyrus mail  1064 Jul 31 16:16 cyrus.index
> drwx------ 2 cyrus mail  4096 Nov 30  2017 Drafts
> drwx------ 2 cyrus mail  4096 Nov 30  2017 Sent
> drwx------ 2 cyrus mail  4096 Nov 30  2017 Spam
> drwx------ 2 cyrus mail  4096 Nov 30  2017 Trash
>
> So now it's just ignoring UIDs 1-6, and claiming the mailbox has just 3
> messages.
>
> Color me confused.
>
> Honestly, I am not sure how anyone is managing large upgrades from 2.5 to
> 3.0? Not being able to reliably bring the existing message store up to the
> necessary level is quite unsettling.
>
> Cheers,
> -nic
>
>
> Hope this helps,
> Regards,
> Savvas Karagiannidis
>
> On Tue, Jul 31, 2018 at 6:43 PM Nic Bernstein <n...@onlight.com> wrote:
>
>> Friends,
>> I'm preparing for a couple of belated 2.5.X to 3.0.X upgrades, and have a
>> question about how necessary it is to run "reconstruct -V max" on the
>> mailstore. Both systems are currently running 2.5.10, and are already at
>> index version 13. However, when performing "reconstruct -V max" on one, on
>> a new 3.0.7 (with patches) system, I see this:
>>
>> root@newmail:~# /usr/lib/cyrus/bin/reconstruct -V max user/onlight
>> user.onlight uid 1 rediscovered - appending
>> user.onlight uid 2 rediscovered - appending
>> user.onlight uid 3 rediscovered - appending
>> user.onlight uid 4 rediscovered - appending
>> user.onlight uid 5 rediscovered - appending
>> user/onlight
>> Repacked user/onlight to version 13
>>
>> The last line can be ignored, as it's really a noop. The "rediscovered -
>> appending" stuff is what catches my eye. However, once the reconstruct is
>> complete, here's what the mailbox looks like:
>>
>> root@newmail:/var/spool/cyrus/mail/I/user/onlight# /usr/lib/cyrus/bin/cyrdump user/onlight
>> Content-Type: multipart/related; boundary="dump-27466-1533049817-351841533"
>>
>> --dump-27466-1533049817-351841533
>> Content-Type: text/xml
>> IMAP-Dump-Version: 0
>>
>> <imapdump uniqueid="710a47ca47ebc676">
>> <mailbox-url>imap://newmail.example.com/user.onlight</mailbox-url>
>> <incremental-uid>0</incremental-uid>
>> <nextuid>15</nextuid>
>> *<uidlist>6 7 9 10 11 12 13 14 </uidlist>*
>>
>> <flags>
>> ...
>>
>> Note that the <uidlist> doesn't list those low number UIDs which were
>> listed in the reconstruct sequence. In other words, I think this all is
>> harmless, but I'm not sure how much overhead it brings to the whole
>> process.
>>
>> One of the servers has a total of 70GB of mail, so a complete reconstruct
>> run only takes a short while. The other, however, has over 8TB scattered
>> over >30 partitions. If I could avoid running reconstruct across that
>> whole wad, it'd be great.
>>
>> Thoughts please?
>> -nic
>>
>> --
>> Nic Bernstein                             n...@onlight.com
>> Onlight, Inc.                             www.onlight.com
>> 6525 W Bluemound Road, Suite 24           v. 414.272.4477
>> Milwaukee, Wisconsin 53213-4073
>>
>> ----
>> Cyrus Home Page: http://www.cyrusimap.org/
>> List Archives/Info: http://lists.andrew.cmu.edu/pipermail/info-cyrus/
>> To Unsubscribe:
>> https://lists.andrew.cmu.edu/mailman/listinfo/info-cyrus
>
>
> --
> Nic Bernstein                             n...@onlight.com
> Onlight, Inc.                             www.onlight.com
> 6525 W Bluemound Road, Suite 24           v. 414.272.4477
> Milwaukee, Wisconsin 53213-4073
>