This has been discussed in the Fastmail internal mailing lists.  I've already 
started some steps towards it in:

https://github.com/cyrusimap/cyrus-imapd/pull/3658

and the earlier:

https://github.com/cyrusimap/cyrus-imapd/pull/3650

Both of these prepare the mailboxes.db APIs to have all the right locking and 
make `mbentry_t` more useful as a general item to pass around.

INTERMEDIATES suck - confirmed.

Gaps in the tree suck too. (subfolder created, parent doesn't exist)

We could reject mailbox create when it has no parent, and reject mailbox delete 
when it has children.  Fine.  But then RENAME is a problem, because even if we 
made it atomic and deep on the master, the replication protocol still needs to 
rename mailboxes one at a time - and if you rename the parents first then you 
leave source children with no parent temporarily, and if you rename the leaves 
of the tree first, then you create target children with no parents.
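To make the ordering problem concrete, here is a minimal sketch (not Cyrus code) that applies single-mailbox renames in each order and records which mailboxes have no parent after every step. The helper names `orphans` and `rename_one_at_a_time`, and the choice to treat `user.foo` as the top of the tree, are assumptions made for illustration.

```python
def orphans(names, top):
    """Return mailboxes (other than `top`) whose parent name is missing."""
    present = set(names)
    return sorted(n for n in present
                  if n != top and n.rsplit(".", 1)[0] not in present)

def rename_one_at_a_time(tree, old_root, new_root, order, top):
    """Rename mailboxes one by one; return the orphan list after each step."""
    current = set(tree)
    seen = []
    for old in order:
        new = new_root + old[len(old_root):]
        current.remove(old)
        current.add(new)
        seen.append(orphans(current, top))
    return seen

tree = ["user.foo", "user.foo.bar", "user.foo.bar.sub"]

# Parents first: the source child briefly has no parent.
parents_first = rename_one_at_a_time(
    tree, "user.foo.bar", "user.foo.baz",
    ["user.foo.bar", "user.foo.bar.sub"], top="user.foo")

# Leaves first: the target child briefly has no parent.
leaves_first = rename_one_at_a_time(
    tree, "user.foo.bar", "user.foo.baz",
    ["user.foo.bar.sub", "user.foo.bar"], top="user.foo")
```

Either order passes through a state with a gap in the tree, which is why a one-at-a-time protocol cannot enforce "parent must exist" at every step.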

ALSO: we don't want to break the replication protocol to/from older versions, 
so just changing RENAME is a bit bogus.

SO: my proposal is that if the sender uses RENAME we just have to suck it up 
and leave the replica temporarily bogus because the alternatives are broken (I 
can expand on this, but there isn't space in the margin of this long email), 
but if the sender uses RENAMETREE then we can do the same atomic magic that we 
do for IMAP rename (and implicitly for JMAP rename since it's by ID and doesn't 
even know that we're doing a name tree underneath).

(Yes, this is almost RFC length, I put a lot of thought into writing it, so you 
can put some into reading :p)

*RENAME ATOMIC MAGIC:*

Speaking of atomic magic, I want to make renames really really crash safe and 
repairable, for which I propose the following:

*Pre-move setup:*
1) take user locks for the associated user or users (in alphabetical order to 
ensure deadlock safety)
2) validate the entire move - can only move all legacy or all non-legacy 
mailboxes, target parent must exist, target mailbox name must not exist, all 
names must be valid after move, etc.
2a) optional: create parent mailboxes of the destination name with the create 
logic below, since RFC3501 says you SHOULD do that.
3) database record updates in a single transaction as follows for each affected 
mailbox:
  OLD N KEY: update type to origmbtype|MBTYPE_MOVING
  NEW N KEY: update type to origmbtype|MBTYPE_RESERVE
  I KEY: update N field to new name, mbtype to origmbtype|MBTYPE_MOVING, old 
name at the front of the H history record.
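As a rough illustration of step 3, the sketch below models mailboxes.db as a Python dict and applies the three record updates to every mailbox under the renamed root. The dict layout, the `pre_move_stage3` name and the bit values are assumptions for the example only; returning a modified copy stands in for "a single transaction".

```python
# Illustrative bit values only, not necessarily the real MBTYPE_* constants.
MBTYPE_RESERVE = 1 << 1
MBTYPE_MOVING = 1 << 3

def pre_move_stage3(db, old_root, new_root):
    """Return a copy of db with pre-move step 3 applied under old_root.

    db maps "N<name>" -> {"type": int, "id": str}
        and "I<id>"   -> {"type": int, "name": str, "history": [str]}.
    """
    txn = {key: dict(value) for key, value in db.items()}
    for key in sorted(db):
        if not key.startswith("N"):
            continue
        old = key[1:]
        if old != old_root and not old.startswith(old_root + "."):
            continue  # not part of the tree being renamed
        new = new_root + old[len(old_root):]
        orig = db[key]["type"]
        uid = db[key]["id"]
        # OLD N KEY: mark as moving
        txn["N" + old]["type"] = orig | MBTYPE_MOVING
        # NEW N KEY: reserve the target name
        txn["N" + new] = {"type": orig | MBTYPE_RESERVE, "id": uid}
        # I KEY: point at the new name, old name at the front of history
        ikey = txn["I" + uid]
        ikey["history"] = [old] + list(db["I" + uid]["history"])
        ikey["name"] = new
        ikey["type"] = orig | MBTYPE_MOVING
    return txn
```

Because the caller only swaps in the returned dict once the loop has finished, a failure part way through leaves the original records untouched, which matches the "nothing to discover" case described under error recovery.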

*Actual move - if legacy:*
1) rename the mailbox folder of the root of the move (this now works safely, 
since we can restart this process!)  If rename fails, fall back to copy all 
files and clean up all files at source afterwards.
2) rename the user files (if the root of the move is a user)

*Actual move - if non-legacy:*
1) [THIS LINE INTENTIONALLY LEFT BLANK]
... but note, we can't skip the DB multistage above, because we need to update 
all the mailbox headers and we don't want to keep mailboxes.db locked for the 
whole time - just keep the userlocks to stop anything screwing with these users 
in the meantime.

*Post-move cleanup:*
1) open each mailbox at the new path, and update the cyrus.header tracking 
information with the new name, maybe new quotaroot, optionally: name history 
record if we decide to keep that foreverish.
2) update the database records in another single transaction:
  OLD N KEY: update type to origmbtype|MBTYPE_DELETED (tombstone)
  NEW N KEY: update type to origmbtype
3) release the name locks and let the new mailbox names be enjoyed.

*Error recovery:*
If this process bails part way through, then ctl_cyrusdb -r or reconstruct will 
discover the records with these types and can attempt to continue the rename or 
abort it.  We could create codepaths for either, and allow the admin to choose 
- though I would favour continuing where possible.

NOTE: if we bailed before pre-move stage 3 then there's nothing to discover 
since that transaction never committed and we don't have any _MOVING or 
_RESERVE mailboxes.db entries.  Likewise, if post-move stage 2 committed then 
the whole move is already complete and there is nothing left to repair.

Discovery of a set to do in a single transaction: we probably want to just grab 
the full list of _MOVING I keys, then remove any which are children of others 
and use the remaining roots as the roots of tree renames.
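A minimal sketch of that pruning step, assuming internal names with `.` as the hierarchy separator; `rename_roots` is a hypothetical helper name.

```python
def rename_roots(moving_names):
    """Keep only names that have no ancestor in the same list."""
    names = set(moving_names)

    def has_ancestor(name):
        parts = name.split(".")
        # Check every proper prefix: a.b.c -> a, a.b
        return any(".".join(parts[:i]) in names for i in range(1, len(parts)))

    return sorted(n for n in names if not has_ancestor(n))

roots = rename_roots([
    "user.foo.bar.sub",
    "user.foo.bar",
    "user.foo-x",
    "user.zed.a",
])
```

Checking whole path components rather than raw string prefixes matters here: `user.foo-x` is not a child of `user.foo`, and it must not hide the relationship between `user.foo.bar` and `user.foo.bar.sub`.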

*Continue path error recovery:*
* take the name locks in order again.
* validate that we didn't lose the race against another process doing this 
repair (keys are still in moving state)
* lookup I key to find target name for each mailbox (regardless of how we 
discovered this partial move)
* lookup disk paths for old and new if legacy:
  - if both paths exist - may need smarts to merge / repair them (disk full 
part way through?) - for path under source, move / replace into destination 
since source is likely to be more correct (it's untouched except for the header 
updates once we start the rename)
  - if destination path exists and source path doesn't, we're done!
  - if source path exists and destination path doesn't, rename.
  - if neither exist: error!  Sorry :(  Broken.
* for each mailbox, open the header and update with the details from the I key 
if not already correct.
* create OLD and NEW N keys from the I key N and first H fields.
* complete post-move stage 2 changes and stage 3 lock release, move is now 
complete.
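The four disk-path cases on the continue path can be written as a small decision function. This is only a sketch: the action names are made up, and the "merge" case is left as a label because the actual merge / repair smarts are still to be designed.

```python
import os

def continue_action(src_exists, dst_exists):
    """Map the (source exists, destination exists) pair to a recovery action."""
    if src_exists and dst_exists:
        return "merge"    # move / replace files from source into destination
    if dst_exists:
        return "done"     # the rename already happened
    if src_exists:
        return "rename"   # the rename never started
    return "error"        # neither path exists: broken

def continue_legacy_move(src, dst):
    """Inspect both directories and perform the simple rename case."""
    action = continue_action(os.path.isdir(src), os.path.isdir(dst))
    if action == "rename":
        os.rename(src, dst)
    return action
```

The revert path would use the same table with source and destination swapped, differing only in how the "merge" case picks which files win.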

*Revert path error recovery:*
* take the name locks in order again.
* validate that we didn't lose the race against another process doing this 
repair (keys are still in moving state)
* lookup I key to find target name for each mailbox (regardless of how we 
discovered this partial move)
* lookup disk paths for old and new if legacy - same logic as above except 
destination is the old name.
  - if both paths exist - may need smarts to merge / repair them (disk full 
part way through?) - only move files back if there's not already something 
there (old name files are likely more correct still)
  - if destination path exists and source path doesn't, we're done!
  - if source path exists and destination path doesn't, rename.
  - if neither exist: error!  Sorry :(  Broken.
* for each mailbox, open the header and update with the details from the OLD N 
KEY (H field first item)
* post cleanup transaction for all the folders:
  * update I key to remove new N, take N from the start of the H field, shift 
start of H field.
  * delete new N key folder entirely
  * update old N key to remove the _MOVING flag.

And then we're back where we started.
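The I key part of that cleanup transaction is the exact inverse of the pre-move update. A sketch, assuming the I key is modelled as a dict with `type`, `name` and `history` fields (a made-up layout) and that the moving flag is passed in by the caller:

```python
def revert_ikey(ikey, moving_flag):
    """Return the I key as it was before the rename started."""
    if not ikey["type"] & moving_flag:
        raise ValueError("I key is not in moving state")
    if not ikey["history"]:
        raise ValueError("no history entry to revert to")
    return {
        "type": ikey["type"] & ~moving_flag,  # drop the _MOVING flag
        "name": ikey["history"][0],           # old name from the head of H
        "history": ikey["history"][1:],       # shift the start of H
    }
```

Refusing to act when the flag is absent is the "did we lose the race against another repair process" check from the bullet list above.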

*MAILBOX ATOMIC CREATE MAGIC:*

Likewise, creating mailboxes is currently somewhat messy with namelocks.  I 
don't think we need namelocks on the mailbox name any more, because the 
userlock protects creation, but we should still switch back to using _RESERVE 
because it has some nice restartable properties - and can be actually atomic 
even in the face of disk errors part way through.

*Happy path:*
* take the name lock on the username (or a global shared-namespace lock, we can 
look into how we shard the shared namespace later if someone who uses shared 
namespace wants to chime in with a viable plan - the locking is per cyrus run, 
so it's easy to change this stuff later)
* validate the create, including assigning a uniqueid and making sure it's 
unused, making sure the name isn't already claimed etc.
* insert an N key and I key in a single transaction, both with 
mbtype|MBTYPE_RESERVE (this means you can tell the difference between a RENAME 
and a CREATE because the I key has a different flag).
* create the directories and files on disk for the mailbox (copy of I key in 
mailbox header)
* if this is a new user, create the directories and files on disk for the user.
* replace the N and I keys with non-RESERVE versions in a transaction
* release the lock

*Continue path:*
* take the name lock
* validate that we're still in reserve state.
* delete existing files on disk
* create new files just like the happy path (could also read existing files and 
validate/repair them, but that's more complex)
* if this is a new user, ditto with the delete/create or repair.
* replace the N and I keys with non-RESERVE versions in a transaction
* release the lock

*Revert path:*
* take the name lock
* validate that we're still in reserve state.
* delete all paths on disk for the uniqueid / name / user as appropriate.
* delete the keys in a transaction
* release the lock
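Putting the three create paths together, here is a toy model with a dict for mailboxes.db and a set for "files on disk". Locking is left out, the bit value is illustrative, and `create_mailbox` / `recover_create` are hypothetical names rather than the mboxlist.c API.

```python
MBTYPE_RESERVE = 1 << 1  # illustrative value only

def create_mailbox(db, disk, name, uniqueid, mbtype=0, fail_on_disk=False):
    """Happy path; fail_on_disk simulates a crash between the two transactions."""
    if "N" + name in db or "I" + uniqueid in db:
        raise ValueError("name or uniqueid already in use")
    # Transaction 1: reserve both keys.
    db["N" + name] = {"type": mbtype | MBTYPE_RESERVE, "id": uniqueid}
    db["I" + uniqueid] = {"type": mbtype | MBTYPE_RESERVE, "name": name}
    if fail_on_disk:
        raise OSError("simulated crash while creating files")
    disk.add(uniqueid)  # create directories and files
    # Transaction 2: replace with non-RESERVE versions.
    db["N" + name]["type"] = mbtype
    db["I" + uniqueid]["type"] = mbtype

def recover_create(db, disk, uniqueid, revert=False):
    """Continue (default) or revert a create that is still in reserve state."""
    ikey = db.get("I" + uniqueid)
    if ikey is None or not ikey["type"] & MBTYPE_RESERVE:
        return "nothing to do"
    name = ikey["name"]
    disk.discard(uniqueid)  # delete whatever partial files exist
    if revert:
        del db["N" + name]
        del db["I" + uniqueid]
        return "reverted"
    disk.add(uniqueid)      # recreate files as in the happy path
    mbtype = ikey["type"] & ~MBTYPE_RESERVE
    db["N" + name]["type"] = mbtype
    ikey["type"] = mbtype
    return "continued"
```

The point of the model is that after a crash the _RESERVE records are the only thing recovery needs to find: they say which name and uniqueid were in flight, and either path ends in a consistent state.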

and I think that's basically it.  It's rewriting some of the ugliest code in 
mboxlist.c, which is nice!  

Also we can actually change the "RENAME INBOX INBOX.sub" logic to create a new 
mailbox like the above and then just issue a MOVE on the emails in the INBOX 
like the spec says to do.

Both mboxlist_create and mboxlist_renametree should take an optional ptrarray 
which gets filled with a struct that tells the details of what they did, 
uniqueid, oldname, newname, etc for each mailbox - allowing us to create the 
nice unsolicited RENAME replies for imap, and also allowing the syncserver 
RENAMETREE to work.

*SYNC APPLY RENAMETREE:*

sync_apply_rename takes oldmboxname, newmboxname, partition and optional 
uidvalidity.  What a crock, doesn't even check uniqueid.  We should always send 
uniqueid!  So something like this:

 > APPLY RENAMETREE %(OLDMBOXNAME "domain!user.foo.bar"
     NEWMBOXNAME "domain!user.foo.baz"
     UNIQUEID "dcf735a0-5978-4226-a3eb-b289268b698e" FOLDERMODSEQ 123)
 < * RENAME %(OLDMBOXNAME "domain!user.foo.bar"
     NEWMBOXNAME "domain!user.foo.baz"
     UNIQUEID "dcf735a0-5978-4226-a3eb-b289268b698e"
     UIDVALIDITY 1630503873 FOLDERMODSEQ 123)
 < * RENAME %(OLDMBOXNAME "domain!user.foo.bar.subA"
     NEWMBOXNAME "domain!user.foo.baz.subA"
     UNIQUEID "091e90a2-6a92-4376-9fd4-0977667481ad"
     UIDVALIDITY 1630503874 FOLDERMODSEQ 123)
 < * RENAME %(OLDMBOXNAME "domain!user.foo.bar.subB"
     NEWMBOXNAME "domain!user.foo.baz.subB"
     UNIQUEID "091e90a2-6a92-4376-9fd4-0977667481ad"
     UIDVALIDITY 1630503901 FOLDERMODSEQ 123)
 < OK DONE

Which returns details to tell the sync_client that it then needs to actually 
sync each mailbox to make sure all the rest of the modseqs and annotations and 
such are correct.
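For illustration, the per-mailbox details returned by the rename could be carried in a small record and turned into the reply lines shown above. This is a sketch under assumptions: `RenameResult` is a hypothetical stand-in for the struct in the ptrarray, and the line format simply copies the example in this email rather than the real dlist serialiser (no quoting or escaping of unusual names is attempted).

```python
from dataclasses import dataclass

@dataclass
class RenameResult:
    oldname: str
    newname: str
    uniqueid: str
    uidvalidity: int
    foldermodseq: int

def sync_rename_line(r):
    """Format one untagged RENAME line in the style of the example."""
    return ('* RENAME %(OLDMBOXNAME "{}" NEWMBOXNAME "{}" UNIQUEID "{}" '
            'UIDVALIDITY {} FOLDERMODSEQ {})').format(
                r.oldname, r.newname, r.uniqueid,
                r.uidvalidity, r.foldermodseq)

def renametree_reply(results):
    """One line per renamed mailbox, followed by the final OK."""
    return [sync_rename_line(r) for r in results] + ["OK DONE"]

reply = renametree_reply([
    RenameResult("domain!user.foo.bar", "domain!user.foo.baz",
                 "dcf735a0-5978-4226-a3eb-b289268b698e", 1630503873, 123),
])
```

The sync_client side would then walk the same list and sync each mailbox by uniqueid to pick up modseqs and annotations.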

We can detect the new protocol by looking for a version key in the 
capabilities, either in imapd or sync_server, on the replica.

Bron.

--
  Bron Gondwana, CEO, Fastmail Pty Ltd
  br...@fastmailteam.com


------------------------------------------
Cyrus: Devel
Permalink: 
https://cyrus.topicbox.com/groups/devel/Tf0643c8d6e381e30-M6197e56c655115223444ce89